Welcome! I’ve been working with current and new Breezers on the topic of electronic discovery. As I go through the tutorials, I am frequently asked to explain the difference between native files and image files. The two overlap in some ways and differ in others, and understanding both matters when you decide how to process these files. So, here is a breakdown of each file type.
Native Files
Native files are simply files created on your computer. So what makes them so special? To view a native file, you must have the native application installed on your system. For example, to open an Excel worksheet, you need to open that file in Microsoft Excel; to view a WordPerfect document, you must open the native file with WordPerfect.
Native files can be grouped into different categories such as word processing files, email, image files, graphic files, and so on. Each of these categories is unique, and the characteristics of the files within each category determine how the files should be processed. Files such as Word or WordPerfect documents are text based, and this text can be extracted from the files. Emails can be housed in email databases called mail stores, such as PST files. These mail stores require the native application to open them so that the emails can be extracted. Image files can either be created by converting one native file type to another (e.g., Word to PDF conversion) or can be files saved from an external device such as a scanner or digital camera.
Though there are many different native file types, all native files are alike in that metadata is created when the files are created. Metadata is a record of attributes belonging to a specific file. The metadata that can be extracted depends on the type of file you are processing. Here are a few examples of how the metadata differs between two file formats:
| Word file | Extracted Email file |
| --- | --- |
| Date Created | Date Sent |
| Date Modified | To/From |
| Author | Subject |
The metadata can be inventoried from the native files and added to your electronic discovery database, such as Summation or Concordance. It can then be searched and referenced for the duration of the case. It also helps reviewers when marking documents for production.
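As a rough illustration of what inventorying metadata can look like, here is a minimal Python sketch that pulls the email fields listed above out of a single extracted message. It assumes the message has already been extracted from its mail store and saved as standard email text. The sample message and the `inventory_email` name are made up for this example, and it does not attempt to reproduce the load formats that Summation or Concordance expect.

```python
from email import message_from_string
from email.utils import parsedate_to_datetime

# Hypothetical sample message, standing in for one email
# that has already been extracted from a mail store.
SAMPLE = """\
From: alice@example.com
To: bob@example.com
Subject: Draft contract
Date: Fri, 25 Sep 2009 10:30:00 -0500

Please see the attached draft.
"""

def inventory_email(raw_text):
    """Return the metadata fields a reviewer might search on."""
    msg = message_from_string(raw_text)
    return {
        "Date Sent": parsedate_to_datetime(msg["Date"]).isoformat(),
        "From": msg["From"],
        "To": msg["To"],
        "Subject": msg["Subject"],
    }

record = inventory_email(SAMPLE)
for field, value in record.items():
    print(f"{field}: {value}")
```

Each record produced this way is a simple set of field/value pairs, which is the general shape a review database needs before the fields can be searched.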
Native File Type – Image Files
As I mentioned before, image files are a type of native file, but they are typically handled differently than most native files. The metadata associated with image files is not always useful in the review process. A great example is a stack of documents scanned to PDF using your multi-function device. The copy center personnel scanned each document to a separate PDF file and saved the PDF files to the network. Now the documents need to be loaded into the database, and you must decide whether to process them as native files or image files. Looking at the metadata, you can see that everything that would be extracted pertains to your own network, because the files were created and saved by your system. That metadata is not worth loading into the database, so the files should be processed as image files. Knowing which type of processing a file needs can cut the time required before document review can begin.
Metadata for image files becomes important when the PDF, JPG, TIF, etc. was created at the custodian’s office. Perhaps the metadata extracted from a contract in PDF form shows a create date later than the date the contract was supposedly executed. This could highlight an issue in your case. Having this information could be critical.
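The decision described here can be written down as a small rule. The Python sketch below is only one way to express it: the `choose_processing` name, the short list of image extensions, and the `created_in_house` flag are assumptions made for illustration, and a real project would weigh other factors as well.

```python
def choose_processing(file_type, created_in_house):
    """Suggest 'image' or 'native' processing for one file.

    file_type: extension such as 'pdf' or 'doc' (no leading dot).
    created_in_house: True if the file was created on your own
    system, for example scanned at the firm's copy center.
    """
    image_types = {"pdf", "jpg", "tif"}  # illustrative list only
    if file_type.lower() in image_types and created_in_house:
        # The metadata only describes your own network, so skip it.
        return "image"
    # The metadata may reflect the original author's activity, so keep it.
    return "native"

print(choose_processing("pdf", created_in_house=True))
print(choose_processing("pdf", created_in_house=False))
print(choose_processing("doc", created_in_house=False))
```

Only the first call, the scan made in house, comes back as image processing; the other two are treated as native files so that their metadata is extracted.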
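The contract example boils down to a date comparison. Here is a minimal Python sketch of that check, assuming the create date has already been extracted from the PDF; the function name and both dates are hypothetical.

```python
from datetime import date

def created_after_execution(metadata_created, stated_execution):
    """True when the file's create date is later than the date
    the contract was supposedly executed."""
    return metadata_created > stated_execution

# Hypothetical dates for illustration only.
pdf_created = date(2009, 6, 15)
contract_executed = date(2009, 3, 1)

if created_after_execution(pdf_created, contract_executed):
    print("Flag for review: PDF created after the stated execution date")
```

A flag like this is only a prompt for a closer look by the review team, not a conclusion about the document.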
What are your experiences with making processing and file handling determinations? Do things like budget, electronic discovery software, and project size determine how these situations are handled in your firm? Other Breezers and I would love to hear your feedback!
Posted on Fri, September 25, 2009 by Brittney Aleman