Digital Collection Project Best Practices

File Processing: Raw, Preservation, and Access Files

The following sets of digital files are produced during digitization and digital file processing:

1. Raw Files

Raw files are high-resolution digital reproductions produced by digitization or scanning, as well as born-digital files transferred to the Gerth Archives and Special Collections. Digitization can be completed by departmental staff and student assistants or by outside parties, such as vendors and donors. Original or native formats are retained when digital files are transferred to the Gerth Archives and Special Collections. Raw files may include errors and imperfections that do not meet our standards.

2. Preservation Master Files

Preservation master files are derived from the raw files at high resolution in preservation-quality digital file formats. Processing includes correcting file names, deleting duplicates, re-digitizing physical items, and/or normalizing to archivally preferred formats.

Production (or edited TIFF) files are produced by rotating and cropping when creating access derivatives. These files can be maintained as an additional version of the preservation files. No other processes, such as compression, redaction, or image enhancement, should be applied.

Recommended software: Adobe Acrobat Pro, Photoshop, Audition, and Premiere Pro.

3. Access Derivatives

Access derivatives are produced from the preservation master files by converting them to recommended file formats, such as JPEG, PDF, MP3, and MP4, for public use. Other processes, such as redaction, optimization, and image enhancement, can be applied.
Public use files are produced at lower resolutions and are intended to provide access to born-digital materials and uncataloged digital surrogates. The materials can be made available in the Reading Room and/or via Dropbox under restricted conditions. The public use files can include more comprehensive materials than the digital collections available online.

Recommended software: Adobe Acrobat Pro, Photoshop, Audition, and Premiere Pro.

Preservation Master Files

Still Image 

1. Copy a set of raw files. 

2. Review the copied files, checking against the corresponding physical objects.

3. Delete duplicates.

4. Re-digitize the physical objects if the digital files are imperfect.

5. Correct the file names.

6. Normalize to TIFF. 
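The duplicate check in step 3 can be partly supported by a script. The following is a minimal sketch, assuming the copied raw files sit in one local folder; the function names are illustrative and not part of any departmental tool. It only flags byte-identical files, so two separate scans of the same item will not match and still need the visual review in step 2.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(folder: Path) -> list[list[Path]]:
    """Group files under `folder` that have identical content.

    Only byte-for-byte duplicates are reported; each returned group
    lists two or more paths that share the same checksum.
    """
    by_hash = defaultdict(list)
    for path in sorted(folder.rglob("*")):
        if path.is_file():
            by_hash[sha256_of(path)].append(path)
    return [paths for paths in by_hash.values() if len(paths) > 1]
```

Each group returned by find_duplicates(Path("copied_raw_files")) is a candidate for review; the script itself deletes nothing, so the decision on which copy to keep stays with the person processing the files.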

Text 

1. Copy a set of raw files. 

2. Review the copied files, checking against the corresponding physical objects.

3. Delete duplicates.

4. Re-digitize the physical objects if the digital files are imperfect.

5. Correct the file names. 

6. Normalization

If a digital file is reproduced in TIFF format, save it as a preservation master TIFF file and produce a PDF/A file as a preservation master PDF file.

If a digital file is transferred in native format, produce PDF/A as a preservation master file.

PDF/A: 

(1) Crop and rotate a digital file in Adobe Photoshop, maintaining a small border around the entire image (1/4 inch or more). Do not add a border artificially even if there is not enough space around the image. If the digital file is TIFF, save the edited TIFF file (production file) on Dropbox.

(2) Convert to PDF in Adobe Acrobat.

(3) Run optical character recognition (OCR) in Adobe Acrobat.

(4) Normalize to PDF/A-1b in Adobe Acrobat.
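The 1/4 inch border in step (1) translates to a pixel count that depends on the scan resolution. The sketch below is illustrative only: it assumes the resolution is known in pixels per inch and that the content area has already been measured as a (left, top, right, bottom) box. The crop is clamped to the image edges, so the border is never padded artificially.

```python
import math

MIN_BORDER_INCHES = 0.25  # "1/4 inch or more" from the cropping guideline


def min_border_pixels(ppi: int) -> int:
    """Smallest border, in pixels, that equals at least 1/4 inch."""
    return math.ceil(MIN_BORDER_INCHES * ppi)


def crop_box(content_box, image_size, ppi):
    """Expand the content box by the minimum border on every side.

    content_box is (left, top, right, bottom) in pixels and image_size
    is (width, height). Where the scan has less than 1/4 inch of
    margin, the crop stops at the image edge instead of adding space.
    """
    left, top, right, bottom = content_box
    width, height = image_size
    border = min_border_pixels(ppi)
    return (
        max(0, left - border),
        max(0, top - border),
        min(width, right + border),
        min(height, bottom + border),
    )
```

For example, at 600 ppi the minimum border is 150 pixels, and at 300 ppi it is 75 pixels.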

Audio 

1. Copy a set of raw files. 

2. Normalize to Broadcast Wave (WAV) or uncompressed Audio Interchange File Format (AIFF).

Moving Image 

1. Copy a set of raw files. 

2. Normalize to Material Exchange Format (MXF).

Access Derivatives

Still Image

1. Copy a set of the preservation master files.

2. Crop and rotate an image file in Adobe Photoshop, maintaining a small border around the entire image (1/4 inch or more). Do not add a border artificially even if there is not enough space around the image.

3. Image enhancement in Adobe Photoshop is not recommended but can be applied. 

4. Convert to JPEG or JPEG 2000. This conversion is completed automatically when TIFF files are uploaded to CONTENTdm.

Text

1. Copy a set of the preservation master files (PDF/A).

2. Convert to PDF in Adobe Acrobat.  

3. Redaction can be applied in Adobe Acrobat.

4. Optimization is recommended if the file size is greater than 10 MB.
    Split a file if it is greater than 20 MB because of CONTENTdm software limitations. 

5. Convert to PDF/A-1b. Conversion may not be possible after redaction and/or optimization.
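The size thresholds in step 4 can be checked before upload. This is a minimal sketch under one stated assumption: 1 MB is counted as 1024 × 1024 bytes, which the guideline does not specify. The function name and the returned labels are hypothetical.

```python
from pathlib import Path

BYTES_PER_MB = 1024 * 1024  # assumption: binary megabytes
OPTIMIZE_ABOVE_BYTES = 10 * BYTES_PER_MB  # optimization recommended above this
SPLIT_ABOVE_BYTES = 20 * BYTES_PER_MB  # CONTENTdm limit: split above this


def size_action(size_bytes: int) -> str:
    """Classify an access PDF by the size thresholds in step 4.

    A file over 20 MB is also over 10 MB, so optimizing it first may
    bring it under the limit and make the split unnecessary.
    """
    if size_bytes > SPLIT_ABOVE_BYTES:
        return "split"
    if size_bytes > OPTIMIZE_ABOVE_BYTES:
        return "optimize"
    return "ok"


def size_action_for_file(path: Path) -> str:
    """Apply size_action to a file on disk."""
    return size_action(path.stat().st_size)
```

Because both thresholds are "greater than", a file of exactly 10 MB or exactly 20 MB stays in the lower category.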

Audio

1. Download a preservation master file from Dropbox for conversion to MP3.

2. Cut and trim the audio file in Adobe Premiere Pro or Audition if the file size is greater than 2 GB.

3. Transcribe the audio file in Adobe Premiere Pro or Audition.

4. Export in MP3 in Adobe Premiere Pro or Audition.

Moving Image

1. Download a preservation master file from Dropbox for conversion to MP4.

2. Cut and trim the video file in Adobe Premiere Pro or Audition if the file size is greater than 2 GB.

3. Transcribe the video file in Adobe Premiere Pro or Audition.

4. Export in MP4 in Adobe Premiere Pro or Audition.
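The target formats named in the procedures above can be collected in a single lookup table, which may be convenient when scripting checks on a batch of files. The sketch below only restates the formats listed in this guide; the dictionary layout and the function name are illustrative assumptions.

```python
# Target formats by material type and processing stage, as listed in this guide.
FORMATS = {
    "still image": {
        "preservation": ["TIFF"],
        "access": ["JPEG", "JPEG 2000"],
    },
    "text": {
        # A TIFF master is kept only when the item was reproduced as TIFF.
        "preservation": ["TIFF", "PDF/A"],
        # PDF/A-1b conversion may not apply after redaction or optimization.
        "access": ["PDF", "PDF/A-1b"],
    },
    "audio": {
        "preservation": ["WAV", "AIFF"],
        "access": ["MP3"],
    },
    "moving image": {
        "preservation": ["MXF"],
        "access": ["MP4"],
    },
}


def target_formats(material_type: str, stage: str) -> list[str]:
    """Return the formats for a material type at a given stage.

    `stage` is either "preservation" or "access". Unknown values raise
    ValueError rather than silently returning an empty list.
    """
    type_key = material_type.strip().lower()
    stage_key = stage.strip().lower()
    if type_key not in FORMATS:
        raise ValueError(f"Unknown material type: {material_type!r}")
    if stage_key not in FORMATS[type_key]:
        raise ValueError(f"Unknown stage: {stage!r}")
    return list(FORMATS[type_key][stage_key])
```

For instance, target_formats("Audio", "access") returns ["MP3"], and target_formats("moving image", "preservation") returns ["MXF"].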