EDRM Collection Standards

Updated January 16, 2014

In May of 2013, a small group of attendees at the EDRM meeting discussed the maturity of the e-discovery industry and how different phases of the EDRM model have developed as standards over time. These are not official standards, but rather processes that are repeatable and have understandable risks and rewards that can be used to evaluate a strategy in various cases. The group decided that “Collection” had evolved to the point that it made sense to document collection best practices and considerations for developing a collection strategy. A team collaborated over the following months to develop these standards for public comment.

Accompanying the EDRM Collection Standards is the EDRM Collection Standards Glossary.

1. Forensic Image (Physical or Logical Target)

  1. Definition: A forensic image is an exact, sector-by-sector bit stream image that captures all the ones and zeros from a source. This type of image captures both active and inactive data. Inactive data includes files, fragments and artifacts that reside in unallocated and/or slack space, including deleted files that have not yet been overwritten. The source can be a physical target, such as a physical hard drive, or a logical target, such as a logical drive (C:\ drive, file system, etc.) contained on a hard drive. Because all user and OS data is stored on logical partitions, it is typically unnecessary to image the non-partitioned space for e-discovery purposes.
  2. Also known as: bit-by-bit image or bit stream image, E01, Ex01, RAW, dd, etc.
  3. When to use: If date stamps, deleted data, edit history, web browser history, or registry values have any bearing on issues at bar, a forensic image should be considered. Forensic images are commonly acquired in the contexts of internal and criminal investigations. Forensic images can also contribute to e-discovery process design as a means of capturing static, verifiable preservation data sets. Images are also often acquired when employees terminate their employment so their data may be recoverable even after reassigning their computer, should it be deemed necessary in the future.
  4. How to use: In a non-investigative, preservation context, a forensic image may be acquired by a person trained in best practices and competent in the use of the tool of choice. Next-generation software can simplify the acquisition process, enabling a defensible self-collection even by untrained users/custodians, while the use of a hardware solution can reduce drive imaging to a simple, ministerial task. However, in an investigative context requiring forensic analysis, a forensic image should be examined by a thoroughly vetted, certified forensic examiner with a proven track record as an expert in judicial proceedings.
  5. Pros:
    • Creation of static preservation data set in e-discovery context.
    • Improved defensibility since all possible data has been captured and therefore may allow the recovery of deleted files.
    • Provides client with a sense of security.
    • Verifiable at both device and file levels.
  6. Cons:
    • May require use of third party forensic examiner/expert.
    • May incur additional cost to extract native files from forensic image format before processing.
    • Potential for over preservation/collection, especially whenever incremental/delta collections are needed.
    • Increased time of collection and potential down-time or disruption to the business.
    • Bandwidth issues if moving the image(s) over the network.
    • Increased storage needs and associated costs.
    • If using self-collection, the individual may have to testify regarding technology and processes if challenged.
    • If using self-collection, the individual may need to explain any error messages that came up during the copy process.
  7. Glossary
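
The Pros above note that a forensic image is verifiable. One common way to check a raw (RAW/dd) image file is to recompute its hash and compare it with the value recorded at acquisition. The following is a minimal sketch under that assumption; the function names `hash_file` and `verify_image` are illustrative and do not come from any particular forensic tool. Container formats that embed their own hash of the source data would normally be verified with the tool that created them rather than by hashing the container file.

```python
import hashlib


def hash_file(path, algorithm="sha256", chunk_size=1024 * 1024):
    """Return the hex digest of a file, reading it in fixed-size chunks.

    Chunked reading keeps memory use flat even for very large image files.
    """
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_image(image_path, recorded_digest, algorithm="sha256"):
    """Compare a freshly computed digest with the one recorded at acquisition."""
    return hash_file(image_path, algorithm) == recorded_digest.lower()
```

A mismatch does not say what changed, only that the file is no longer bit-for-bit identical to what was hashed at acquisition, so the recorded digest should be kept with the collection documentation.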

2. Custom Content/Targeted Image

  1. Definition: The resulting dataset from a process of collecting documents and folders from a computer’s file system. The dataset is an exact copy of the source and intended to be used for evidentiary purposes. Creating such an image is intended to preserve the integrity of the source.
  2. Also known as: File copy, logical copy, forensic duplicate…
  3. When to use: When collecting from a trusted source and there is no suspicion of data deletion by any user or system process that has access to the system. When only specific files need to be collected. When collection of specific files may be completed and complete metadata is also required. May be used when both parties to a matter agree on specifics of time, date and/or subject matter and there is no allegation that data deletion has already taken place.
  4. How to use: Performed using specific Information Technology (IT) software tools that work in conjunction with the computer’s file system to transfer the files in question to an external container while preserving the metadata. Different software solutions use various technologies to initiate the process; sometimes applications or services (i.e. “agents”) are installed on the target machine, with or without the user/custodian’s knowledge, while others may run only in memory after the application is downloaded via a web browser or started from a preconfigured portable USB drive.
  5. Pros:
    • Fast collection.
    • Targets only necessary files.
    • Can be run on a live system or file server without interrupting service to the users.
    • Decreased storage and processing costs over the use of a forensic image.
    • Potential for decreased storage and processing costs if parties can agree on the appropriate data subset, i.e. a logical evidence file, a more “reasonable” approach for the majority of e-discovery cases.
    • Most IT departments have basic tools on hand; more comprehensive tools that include data analysis are readily available.
    • New technology conveniently enables the custodian to run the collection based on criteria set by an authorized user (e.g., Legal).
  6. Cons:
    • Only collects the specified files, no analysis or recovering of deleted items or slack space.
    • System must be powered on and either available for direct access or available via an Ethernet connection to a Local Area Network.
    • The collector must have appropriate permissions, i.e. read-write access to the computer file system.
    • Some tools must be properly configured to preserve metadata.
    • Collector needs to be trained in how to select appropriate tools and how to execute collection processes.
    • Collector may need to testify in court.
  7. Glossary
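
To illustrate the idea of transferring selected files to an external container while recording their metadata, here is a minimal sketch that is not a substitute for a purpose-built collection tool. It assumes a ZIP file as the container and a CSV manifest as the audit record; the function names, the manifest columns, and the use of filename patterns as selection criteria are all hypothetical choices made for the example. Reading a file can update its last-accessed time on some systems, which is one reason dedicated tools are usually preferred.

```python
import csv
import hashlib
import zipfile
from datetime import datetime, timezone
from pathlib import Path


def sha256_of(path):
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def collect_files(source_dir, patterns, container_path, manifest_path):
    """Copy files matching any pattern into a ZIP and log them in a CSV manifest.

    Returns the number of files collected.
    """
    source = Path(source_dir)
    # A set removes files matched by more than one pattern; sorting makes
    # the manifest order reproducible.
    matches = sorted(
        {p for pattern in patterns for p in source.rglob(pattern) if p.is_file()}
    )
    with zipfile.ZipFile(container_path, "w", zipfile.ZIP_DEFLATED) as container, \
            open(manifest_path, "w", newline="", encoding="utf-8") as manifest:
        writer = csv.writer(manifest)
        writer.writerow(["relative_path", "size_bytes", "modified_utc", "sha256"])
        for path in matches:
            info = path.stat()  # read timestamps before touching the content
            modified = datetime.fromtimestamp(info.st_mtime, tz=timezone.utc)
            relative = path.relative_to(source).as_posix()
            writer.writerow(
                [relative, info.st_size, modified.isoformat(), sha256_of(path)]
            )
            container.write(path, arcname=relative)
    return len(matches)
```

The manifest keeps the original modification time and hash outside the container, so each collected file can later be checked against what was recorded at collection time.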

3. Non-Forensic Copy

  1. Definition: Creating a copy of a file using the operating system, including file copying provisions in the user interface such as the command line commands “cp” in UNIX and “copy” in DOS; operating systems with a graphical user interface, or GUI, usually provide copy/paste or drag and drop methods of file copying.
  2. Also known as: copy, LINUX copy, Windows copy, drag and drop, copy/paste.
  3. When to use: When preservation of metadata is not required, usually by agreement between parties. This method can be helpful when there is a small quantity of relevant documents in a larger volume and/or when tools aren’t available for logical collection or dedicated collection personnel are not available.
  4. How to use: The relevant files are selected by the custodian and then copied using the appropriate command or action to a new location. It is critical to document the process and have specific instructions. Best practices suggest that graphical system users (Windows users) use the Copy/Paste function rather than “drag and drop” to ensure a copy is made, rather than accidentally moving the selected files and thereby deleting them from the source drive.
  5. Pros:
    • This is similar to what was done in the paper world in that it relies on users to identify the location of relevant information and gather that information without supervision.
    • Typically results in a smaller collection set reducing processing and review costs.
    • May eliminate highly sensitive irrelevant information from collection.
    • Will likely include information missed by keyword search terms due to the custodian’s knowledge of local practice and usage.
  6. Cons:
    • May alter metadata and lose original folder structure.
    • Requires rigorous processes and documentation.
    • May not be easy to reproduce.
    • Defensibility may be compromised if mistakes are made during the collection process.
    • May require re-collection of files in a forensically sound manner if authentication of file(s) is likely to be questioned.
    • May require notification to opposing counsel that metadata may be altered during collection.
    • Highly dependent on custodians to locate and copy data.
    • Using filtering (including keywords) during collection may result in missed files.
  7. Glossary
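
The metadata risk described above can be seen with an ordinary scripted copy. The sketch below uses Python's standard `shutil` module purely as an illustration: `shutil.copy` copies content and permission bits but gives the copy a new modification time, while `shutil.copy2` also attempts to preserve timestamps. The helper name `copy_and_report` is hypothetical. Neither call moves or deletes the source file.

```python
import os
import shutil


def copy_and_report(source, destination, preserve_metadata):
    """Copy a file and report the modification times of source and copy."""
    # copy2 attempts to carry timestamps across; copy does not.
    copier = shutil.copy2 if preserve_metadata else shutil.copy
    copier(source, destination)
    return {
        "source_mtime": os.stat(source).st_mtime,
        "copy_mtime": os.stat(destination).st_mtime,
        "source_still_exists": os.path.exists(source),
    }
```

Comparing `source_mtime` with `copy_mtime` in the returned dictionary shows whether the last-modified date survived the copy, which is the kind of detail worth recording when documenting a non-forensic collection.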

4. Exports – Harvesting Email

4.1. Back-End: Server or Archive Solution

  1. Definition: An automated process supported by various email platforms to export content for mailbox accounts from a server.
  2. Also known as: Export, ExMerge, replicate, extract (, , , ).
  3. When to use: Whenever you have access to a server along with a dedicated internal client IT resource to execute the process.
  4. How to use: In collaboration with appropriate internal IT resource, an automated process is staged and executed to send relevant data from the application server to an external location. This process usually leverages internal services within the application from which the data is being retrieved and is therefore application specific.
  5. Pros:
    • Complete – entire mailbox account or identified folder locations.
    • Effective automated method to create a preservation data set when scope is accurately identified by location.
    • Reduced collection cost by leveraging internal client resources.
    • Export utilities provided by the authors of the application software reduce risk of altering metadata.
    • Speeds collection in a defensible manner when a large number of custodians are at issue.
    • More defensible than using in some forms of individual custodian collection.
  6. Cons:
    • Potential for over preservation.
    • Potential increase in culling costs.
  7. Glossary

4.2. Front-End: Local Email Client

  1. Definition: An automated or manual process supported by various email platforms to export content for mailbox accounts from the email client.
  2. Also known as: Export, copy, capture (in , , , HTM formats).
  3. When to use: During desk-side collection that identifies content through an interview process.
  4. How to use: Use e-mail client export utilities, or drag and drop directly from the e-mail client into Windows folders.
  5. Pros:
    • Targeted.
    • Email client export utilities reduce risk of altering metadata.
  6. Cons:
    • Incomplete.
    • Lack of automated audit trail with drag and drop into Windows.
    • Drag and drop of email increases risk of altering metadata.
  7. Glossary

4.3. Web-based: configure local e-mail client

  1. Definition: An automated process to download webmail content by configuring a local email client.
  2. Also known as: Download, synchronize, create local mail store, configure local client (, containers).
  3. When to use: When user credentials are available and access can be gained through an internet protocol.
  4. How to use: Use internet protocol and email client configuration settings.
  5. Pros:
    • Complete – should retrieve entire mailbox account.
    • Effective preservation data set created when scope extends to entire mailbox.
    • Reduced collection cost by leveraging internal client resources.
    • Defensible.
    • Export utilities reduce risk of altering metadata.
  6. Cons:
    • Will require an expert in many cases to ensure a complete and accurate collection.
    • Lack of standards in web mail clients and configurations make this complex.
    • Potential for over preservation.
    • Potential for increased culling costs.
    • Difficulty accessing this information.
    • Ability of provider to work with you.
    • Lack of direct access and administrative control can make these collections much more difficult.
    • Difficult to verify complete collection.
  7. Glossary

5. Exports – Non-Email

  1. Definition: The use of utilities that are included in an application or application suite that enable the export of records to an external location. These utilities are usually provided to enable reuse of data from one application to another or to provide backup and recovery of key data.
  2. Also known as: Exports from Sharepoint 2013, Enterprise Systems, Databases, Social Media, Instant Messaging.
  3. When to use: When collecting .
  4. How to use: Work in conjunction with a dedicated IT representative to use application utilities to export the relevant information. A Database Administrator (DBA) or System Administrator (Sys Admin) is the appropriate person to engage. He or she will understand the export capabilities of the software product in question and will assist in developing a strategy for retrieving the information that has been requested for the matter in question. An export migration plan will be prepared and approved by the legal representatives. That migration plan will result in an automated process being scheduled and run by the IT department to copy the relevant data to an external container. That information then needs to be re-imported into either a standalone tool for analysis or a new “clean” copy of the application on another server to replicate the information that has been requested.
  5. Pros:
    • Export utilities potentially eliminate the risk of accidentally altering metadata.
    • The only way to retrieve data from key systems so that it may be used elsewhere.
    • Avoids having to give opponents access to sensitive corporate operations and systems.
    • Avoids protracted arguments on running reports against databases and the incongruities those reports sometimes seem to expose.
  6. Cons:
    • Requires knowledge of how data is stored and what is included/excluded in export.
    • Can only be executed by DBAs and Sys Admins within the IT department.
    • Exports may need to be scheduled for nights, weekends or holidays when the software system is not in use by the business.
    • Once data has been exported, it may be very expensive to rebuild a copy of the original system or reformat the data so that it may be read and understood.
  7. Glossary
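
As a rough illustration of copying records to an external container and confirming what was included, the sketch below exports one table to a CSV file and checks the exported row count against the database's own count. It uses SQLite from the Python standard library only as a stand-in; a real enterprise system would normally be exported with the vendor's own utilities by a DBA. The function name `export_table` and the CSV output are assumptions made for the example, and the table name is assumed to come from a trusted source.

```python
import csv
import sqlite3


def export_table(connection, table, csv_path):
    """Export every row of a table to CSV and verify the row count.

    Returns the number of rows written (excluding the header row).
    """
    cursor = connection.execute(f'SELECT * FROM "{table}"')
    headers = [column[0] for column in cursor.description]
    exported = 0
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(headers)
        for row in cursor:
            writer.writerow(row)
            exported += 1
    # Cross-check against the database's own count to catch a partial export.
    expected = connection.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]
    if exported != expected:
        raise RuntimeError(f"exported {exported} rows, expected {expected}")
    return exported
```

The row-count check addresses only completeness. A flat export like this loses relationships between tables, which is part of why rebuilding or re-importing the data can be expensive.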

6. Exceptions

As technology continues to evolve so do best practices. At this time, the EDRM collection standards do not address the technologies below. It is recommended that you consult an expert when collecting these data sources. As best practices evolve in the industry the EDRM will update these standards to include these technologies.

6.1. Mobile Devices

Reason why not addressed here: There are different manufacturers and brands, with different carriers, different plans, different operating systems/settings, different applications and different tools. Collection therefore requires different software and involves merging images, cell phone service provider records and SMS data, mixing many different pieces together.

Specific Challenges:

  • Cell phones can be wiped by the cell provider, owner or company (e.g., Find My iPhone by Apple).
  • The device may need to be placed in a Faraday bag or stored in a shielded room.
  • The lock code may be necessary to access the phone.
  • Make sure you seize the power cables.
  • Keeping pace with new technology. New phones are released almost on a weekly basis. Tools cannot keep pace.
  • Data can be stored on the network.
  • Device access can vary from device to device (even the same model) based on carrier, plan, operating system version, etc.

6.2. Instant Messaging

Reason why not addressed here: Each tool is proprietary software that often differs from every other tool.

Specific Challenges:

  • Chat logging is generally not turned on by default, although users can manually save chats.
  • Logs are generally saved at a default location, whereas manually saved chats can be anywhere.
  • There are also tools such as On The Record and Off The Record (OTR) that can affect whether messages are stored or even wiped from the computer.
  • Data, videos and images can be sent via the IM tool.
  • Images and video, if stored, can frequently be stored in a separate location and be very difficult to recombine to the original message.
  • Various tools have no formal registration process, leaving room for anonymity.
  • Chat messages can be forwarded to cell phones.

6.3. Macs (Macintosh or Mac)

Reason why not addressed here: While not as difficult as they once were, many Macs cannot be disassembled and can only be acquired with the hard drive in place. Most of the common industry standard tools still do not handle Mac systems or images effectively.

6.4. International Protocols

Reason not addressed here: This may cross into privacy standards, and much of the rest of the world is stricter than the US. Each country has its own standards and processes, collections are typically handled on a country-by-country basis, and these requirements change continually.

Specific Challenges:

  • Many countries place privacy of the individual before all else. In addition, many European countries require an intermediary such as a Data Protection Officer (Germany) who can institute their own requirements.
  • For many businesses, sending data in or out of the country can implicate national security concerns, requiring compliance with additional domestic regulations.

6.5. Social Media (or Other Cloud Storage)

Reason not addressed here: Different software, different firewalls and ways to protect against hacking and viruses and spam; different settings and protocols across the board.

Specific Challenges:

  • Difficulty accessing this information.
  • Ability or willingness of provider to work with you.
  • Lack of direct access and administrative control can make these collections much more difficult.
  • Contain embedded audio and video content.
  • Various platforms frequently link to and interact with each other (e.g., a Facebook post that links to a YouTube video).
  • Cannot be effectively collected using traditional tools.

Contributors

Julie Brown, Vorys (project lead)
Patrick Chavez
Teri Christensen, Faegre Baker Daniels
Kevin Clark
Justin Coffey
Sean d’Albertis, Faegre Baker Daniels
Kevin Esposito
Faisal Habib, AccessData Group
Valerie Lloyd, Excel Energy
Jeremy Montz, kCura
Rick Nalle, KPMG
Andrea Donovan Napp, Robinson & Cole
John Wilson

Glossary

Electronically Stored Information (ESI):

  • Electronically Stored Information or ESI is information that is stored electronically on enumerable types of media regardless of the original format in which it was created.
  • Electronically Stored Information: this is an all inclusive term referring to conventional electronic documents (e.g. spreadsheets and word processing documents) and in addition the contents of databases, mobile phone messages, digital recordings (e.g. of voicemail) and transcripts of instant messages. All of this material needs to be considered for disclosure.

Active data:

  • Data currently displayed on a computer screen, and/or files on a computer that can be accessed without having to use a restoration process.
  • The information readily available and accessible to users, including word processing files, spreadsheets, databases' data, e-mail messages, electronic calendars and contact managers.
  • Active data is information residing on the direct access storage media of computer systems, which is readily visible to the operating system and/or application software with which it was created and immediately accessible to users without undeletion, modification or reconstruction (i.e., word processing and spreadsheet files, programs and files used by the computer’s operating system).
  • Active data is information residing on the direct access storage media of computer systems, which is readily visible to the operating system and/or application software with which it was created and immediately accessible to users without undeletion, modification or reconstruction.
  • Data existing on the data and file storage media of computer systems. Active data is easily viewed on the operating system and/or application software that was used to create it and is directly available to users without un-deletion, alteration, or restoration.
  • Data currently displayed on a computer screen.
  • Information residing on the computer which is visible and fully available to the user.

Physical target: When the forensic imaging process targets the entire physical drive or data storage media.

EDRM Collection Standards

Logical target: When the forensic imaging process targets a logical portion of the media, such as the C:\ drive or other logical volume or partition.

EDRM Collection Standards

Bit stream image:

  • A sector-by-sector, bit-by-bit copy of a physical hard drive or a logical drive.

    EDRM Collection Standards

  • Bit stream backup (also referred to as mirror image backup) involves the backup of all areas of a computer hard disk drive or another type of storage media. Such a backup exactly replicates all sectors on a given storage device. Thus, all files and ambient data storage areas are copied. Bit stream backups - sometimes also referred to as "evidence grade" backups - differ substantially from traditional computer file backups and network server backups.

    Fenwick & West LLP, FWPS eDiscovery Terminology (11/6/2005). Citing NTI's Computer Forensics Definitions, http://www.forensics-intl.com/def2.html

".E01" is a legacy EnCase evidence file format. An ".E01" file is a byte-for-byte representation of a physical device or a logical volume.

EnCase Forensic Imager, Version 7.06, User's Guide. Guidance Software.

".Ex01" is the current EnCase evidence file format. An ".Ex01" file is a byte-for-byte representation of a physical device or a logical volume. It has LZ compression, AES256 encryption with keypairs or passwords, and options for MD5 hashing, SHA-1 hashing, or both.

EnCase Forensic Imager, Version 7.06, User's Guide. Guidance Software.

A RAW image file is a bit-by-bit copy of data on a disk or volume, without additions, deletions, or metadata. Originally used by dd, the RAW image format is supported by most computer forensic applications.

http://www.forensicswiki.org/wiki/Raw_Image_Format

A "dd" file is a raw image file created using the dd forensic imaging tool, a command line program that uses command line arguments to control the imaging process.

http://www.forensicswiki.org/wiki/Dd

Certified forensic examiner: A person holding one of a number of commonly recognized certifications in the field. Due to a lack of industry-wide certifications, it is critical to research the certifications and any requirements within your state or jurisdiction.

EDRM Collection Standards

With a logical evidence file, you can selectively choose which files or folders you want to preserve, instead of acquiring the entire drive. Unlike copying files from a device and altering critical metadata, logical evidence files preserve the original files as they existed on the media and include additional information such as file name, file extension, last accessed, file created, last written, entry modified, logical size, physical size, MD5 hash value, permissions, starting extent, and original path of the file.

EDRM Collection Standards

The term metadata refers to "data about data". The term is ambiguous, as it is used for two fundamentally different concepts (types). Structural metadata is about the design and specification of data structures and is more properly called "data about the containers of data"; descriptive metadata, on the other hand, is about individual instances of application data, the data content. In this case, a useful description would be "data about data content" or "content about content" thus metacontent.

http://en.wikipedia.org/wiki/Metadata

Live system: A computer that is powered up and actively logged in.

EDRM Collection Standards

UNIX: Pronounced yoo-niks, a popular multi-user, multitasking operating system developed at Bell Labs in the early 1970s. Created by just a handful of programmers, UNIX was designed to be a small, flexible system used exclusively by programmers.

http://www.webopedia.com/TERM/U/UNIX.html

DOS: Acronym for disk operating system. The term DOS can refer to any operating system, but it is most often used as a shorthand for MS-DOS (Microsoft disk operating system). Originally developed by Microsoft for IBM, MS-DOS was the standard operating system for IBM-compatible personal computers.

http://www.webopedia.com/TERM/D/DOS.html

Graphical user interface: Abbreviated GUI (pronounced GOO-ee). A program interface that takes advantage of the computer's graphics capabilities to make the program easier to use. Well-designed graphical user interfaces can free the user from learning complex command languages.

http://www.webopedia.com/TERM/G/Graphical_User_Interface_GUI.html

Copy/paste: To copy a piece of data to a temporary location and then make a new copy of the object in a new location. This is usually done by clicking the right mouse button while holding the mouse cursor over the relevant file and then clicking “copy” from the menu that appears. The mouse pointer is then moved to the destination location, a right mouse click brings up the same function menu, and “paste” is selected to copy the file(s) to the new location.

EDRM Collection Standards

Drag and drop: A common way to move or copy a file or folder is to highlight it and literally “drag” a copied version of it to another location. First the mouse would be used to highlight the file. Then while holding down the left mouse button, the name of the file would be dragged to a new location. In the background, the operating system creates a new copy and places it in the new location. For example, you can drag a file to the Recycle Bin to delete the file, or to a folder to copy or move it to that location.

EDRM Collection Standards

Pronounced yoo-niks, a popular multi-user, multitasking operating system developed at Bell Labs in the early 1970s. Created by just a handful of programmers, UNIX was designed to be a small, flexible system used exclusively by programmers.

http://www.webopedia.com/TERM/U/UNIX.html

The term metadata refers to "data about data". The term is ambiguous, as it is used for two fundamentally different concepts (types). Structural metadata is about the design and specification of data structures and is more properly called "data about the containers of data"; descriptive metadata, on the other hand, is about individual instances of application data, the data content. In this case, a useful description would be "data about data content" or "content about content" thus metacontent.

http://en.wikipedia.org/wiki/Metadata
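
As a small illustration of "data about data", the following Python sketch reads a few file system attributes (size and timestamps) that describe a file without being part of its content. The function name `describe_file` and the sample file are hypothetical, and which timestamps are available varies by operating system and file system.

```python
import os
import tempfile
from datetime import datetime, timezone


def describe_file(path):
    """Return a few file system metadata fields for the file at path."""
    info = os.stat(path)
    return {
        "size_bytes": info.st_size,
        "modified_utc": datetime.fromtimestamp(info.st_mtime, tz=timezone.utc),
        "accessed_utc": datetime.fromtimestamp(info.st_atime, tz=timezone.utc),
    }


if __name__ == "__main__":
    # Create a throwaway file so the example does not depend on real evidence.
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as handle:
        handle.write("hello")
        sample_path = handle.name
    try:
        for field, value in describe_file(sample_path).items():
            print(field, value)
    finally:
        os.remove(sample_path)
```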

Acronym for disk operating system. The term DOS can refer to any operating system, but it is most often used as a shorthand for MS-DOS (Microsoft disk operating system). Originally developed by Microsoft for IBM, MS-DOS was the standard operating system for IBM-compatible personal computers.

http://www.webopedia.com/TERM/D/DOS.html

A process where individual custodians identify and copy potentially relevant files for discovery.

EDRM Collection Standards

There are two types of Outlook Data Files used by Outlook. An Outlook Data File (.pst) is used for most accounts.... Outlook Data Files (.pst) are used for POP3, IMAP, and web-based mail accounts. When you want to create archives or back up your Outlook folders and items on your computer, such as Exchange accounts, you must create and use additional .pst files.... A Personal Folders file (.pst) is an Outlook data file that stores your messages and other items on your computer. This is the most common file in which information in Outlook is saved by home users or in small organizations....

Introduction to Outlook Data Files (.pst and .ost), http://office.microsoft.com/en-us/outlook-help/introduction-to-outlook-data-files-pst-and-ost-HA010354876.aspx.

Databases in IBM Notes, formerly Lotus Notes, are Notes Storage Facility (.nsf) files, containing basic units of storage known as a "note".

http://en.wikipedia.org/wiki/IBM_Notes.

mbox is a common format for storing email messages. An mbox is a single file containing zero or more email messages.

http://www.qmail.org/qmail-manual-html/man5/mbox.html.
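
To make the "single file containing zero or more email messages" description concrete, here is a minimal sketch using the `mailbox` module from the Python standard library. The helper names (`build_message`, `write_mbox`, `list_subjects`) and the addresses are hypothetical, and the sketch assumes a single process is writing the file, so it skips file locking.

```python
import mailbox
import os
import tempfile
from email.message import EmailMessage


def build_message(sender, subject, body):
    """Create a simple plain-text message for the demonstration."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = "recipient@example.com"
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def write_mbox(path, messages):
    """Append messages to the mbox file at path, creating it if needed."""
    box = mailbox.mbox(path)
    try:
        for msg in messages:
            box.add(msg)
    finally:
        box.close()  # close() flushes pending changes to disk


def list_subjects(path):
    """Return the Subject header of every message in the mbox file."""
    box = mailbox.mbox(path)
    try:
        return [msg["Subject"] for msg in box]
    finally:
        box.close()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        mbox_path = os.path.join(workdir, "sample.mbox")
        write_mbox(mbox_path, [
            build_message("alice@example.com", "First", "Hello"),
            build_message("bob@example.com", "Second", "Hello again"),
        ])
        print(list_subjects(mbox_path))
```

An mbox written with an empty message list is still a valid (empty) mailbox, which is the "zero" case in the definition.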

Microsoft Outlook Express stores your messages in a folder that contains several different .dbx files. These files (folders.dbx, inbox.dbx, outbox.dbx) contain all your messages.

Import messages into Windows Mail from Outlook Express, http://windows.microsoft.com/en-us/windows-vista/import-messages-into-windows-mail-from-outlook-express.

The Microsoft Outlook Item (.msg) File Format is used to format a Message object, such as an e-mail message, an appointment, a contact, a task, and so on, for storage in the file system.

[MS-OXMSG]: Outlook Item (.msg) File Format- Introduction, http://msdn.microsoft.com/en-us/library/ee160779(v=exchg.80).aspx.

EML is a file extension for an e-mail message saved to a file in the MIME RFC 822 standard format by Microsoft Outlook Express as well as some other email programs.

EML File Format, http://whatis.techtarget.com/fileformat/EML-Microsoft-Outlook-Express-mail-message-MIME-RFC-822.
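
Because an .eml file is essentially a message in RFC 822 / MIME text form, it can usually be read with a general-purpose mail parser. The sketch below uses the Python standard library `email` package on an in-memory sample; the message content, addresses, and the function name `summarize_eml` are made up for illustration, and real-world files may carry attachments or encodings that this sketch does not handle.

```python
from email import policy
from email.parser import BytesParser

# A made-up message in the same header/blank-line/body layout an .eml file uses.
SAMPLE_EML = (
    b"From: alice@example.com\r\n"
    b"To: bob@example.com\r\n"
    b"Subject: Quarterly report\r\n"
    b"Date: Thu, 16 Jan 2014 09:30:00 +0000\r\n"
    b"MIME-Version: 1.0\r\n"
    b"Content-Type: text/plain; charset=\"utf-8\"\r\n"
    b"\r\n"
    b"Please see the attached figures.\r\n"
)


def summarize_eml(raw_bytes):
    """Parse raw .eml bytes and return selected headers plus the text body."""
    msg = BytesParser(policy=policy.default).parsebytes(raw_bytes)
    body = msg.get_body(preferencelist=("plain",))
    return {
        "from": str(msg["From"]),
        "to": str(msg["To"]),
        "subject": str(msg["Subject"]),
        "date": str(msg["Date"]),
        "body": body.get_content().strip() if body is not None else "",
    }


if __name__ == "__main__":
    for field, value in summarize_eml(SAMPLE_EML).items():
        print(f"{field}: {value}")
```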

Data that resides in a fixed field within a record or file is called structured data. This includes data contained in relational databases and spreadsheets.

http://www.webopedia.com/TERM/S/structured_data.html.

A database administrator (short form DBA) is a person responsible for the installation, configuration, upgrade, administration, monitoring and maintenance of databases in an organization.

http://en.wikipedia.org/wiki/Database_administrator.

A system administrator, or sysadmin, is a person who is responsible for the upkeep, configuration, and reliable operation of computer systems; especially multi-user computers, such as servers.

http://en.wikipedia.org/wiki/System_administrator.

17 comments to EDRM Collection Standards

  • Given that the definition of “forensic” deals with scientific methods and techniques, calling something “non-forensic” immediately tags the process as unacceptable. So why include it? Otherwise, redefine the technique. If done correctly, copying files can be done in a forensic fashion and be acceptable for use in analysis. This process usually requires gathering file metadata in another fashion, but it is not “non-forensic”.

    • I agree that non-forensic techniques do not need a lot of explanation, but they should be mentioned so novice evidence collectors do not mistakenly use a non-forensic method when collecting evidence. But I would say that imaging evidence avoids having to collect the metadata in another fashion, because if done correctly one gets a bit-for-bit copy, which always includes the metadata. In an unusual situation, I could see using a non-forensic tool like Robocopy if that were the only practical way of collecting the evidence at the time. The collector would have to be able to defend his/her position on the stand against another testifying forensic expert while complying with the FRE.

      • I’m not saying that they shouldn’t be mentioned. I’m saying that they shouldn’t be called “non-forensic”. You can make copies of files using any sort of tool (RoboCopy, copy, etc.) in a forensic fashion.

        • I would disagree with “You can make copies of files using any sort of tool”. Forensic tools have certain standards and documentation that qualify them as forensic tools. Robocopy is not a forensic tool; it is a data copying tool for IT administrators, and I don’t think it would be defensible under the Federal Rules of Evidence. As a testifying forensic expert, I believe I could help a good attorney get evidence collected by IT tools like Robocopy suppressed; however, I am not aware of any case law on Robocopy collection yet. If you know of any adjudicated cases where tools like Robocopy have been challenged and ruled admissible, I would really appreciate that information. I have not heard of any certifying training academy advocating or training practitioners in the use of standard IT tools for forensics. I would recommend using IT tools only under exigent circumstances.

          • Tools are not “forensic”. Your methods and processes are. Any tool, whether it is made for IT purposes or specifically for forensics, should be used with caution. There are many training facilities that train students using IT tools for forensic purposes, specifically when it comes to working in the Linux world. Collection of live, volatile data is one area where IT tools are commonly used.

          • In my over 6 years as a computer crime investigator and forensic investigator for a California police department, and as a DoD certified Digital Media Collector (CDMC), certified through government training provided by the OSI & NSA and certified as a police detective by California P.O.S.T., that is all news to me. I have never heard that before, nor was it taught by the DOJ, NSA, OSI or POST. Tools are forensic by virtue of testing and validation against established federal standards in concept and research techniques. An excellent site for this information is http://www.nist.gov. Here is a quote from that site:

            “Computer Forensics Tool Testing
            Many software and hardware packages available today offer forensic capabilities. Before these claims can be trusted by forensic scientists and the courts, however, the tools must be tested to ensure that they can support the collection, examination, analysis, and court testimony of digital evidence. We have conducted functionality tests on specific software and hardware products using a unique testing framework. The results of these tests are available on the Computer Forensics Tool Testing project website.”

          • My apologies for the typos; fat fingers and short on time.

          • I’m just going off of my experience in doing forensics for over a decade and the experience of others, some of whom have been doing it longer than I.

    • tchristensen

      The point of using the term “non-forensic” in these standards is to highlight that this methodology may be acceptable for discovery purposes regardless of forensic best practices. Although these methods would not be appropriate for a forensic investigation, that does not establish that they are indefensible in a discovery context.

    • sdalbertis

      It is important to note that although the Federal Rules of Evidence apply to expert testimony in a forensic context, they do not establish the applicable legal standard for evaluating defensibility in discovery. Rather, this area is controlled by the Federal Rules of Civil Procedure. Whereas forensic methods are generally commendable, they are not always necessary. Parties must employ reasonable processes to comply with their discovery obligations. These processes are not required to be perfect or even the best available. They simply must be reasonable under the totality of the circumstances. Discovery is pervaded with an underlying standard of reasonableness. We must take care not to project a forensic purist point of view beyond the appropriate scope of forensic best practices. To do so actually increases the costs and burdens of discovery in contravention of the proportionality calculus embodied in the Federal Rules of Civil Procedure.

  • wtkjd

    These standards appear a bit naïve and anachronistic.

    First, all use of the term ‘forensic’ should be excluded unless better defined. A ‘forensic’ image requires both technology and process that is not described in Standard 1 (or elsewhere, where the term is used.) The defensibility of any collection process is wholly dependent on the evidentiary standard to be used to evaluate the information collected. The term ‘forensic’ is too often misused and abused.

    Standard 1 also ignores the most important purpose behind traditional computer forensics – capturing attributes that can be used to identify the person interacting with the device when the information was created, modified or deleted.

    To that end, a full physical image of a device is but one step in creating a forensically sound collection for use in a criminal or quasi-criminal proceeding. It is useful but overkill for a civil proceeding where there is a presumption the information collected is authentic and belongs to the producing party from which it was collected.

    I would limit the first five standards to:
    1) Physical Image (bitstream image of physical device)
    2) Logical Image (bitstream image of logical device)
    3) Targeted Logical Copy
    4) Operating System Copy
    5) Export
    with the difference between 3 and 5 limited to using specialized collection software for 3 and using native application software features for 5. Moreover, one can write a UNIX or Linux command (or even a Windows command-line command) that will use the OS to copy and preserve MAC attributes and other file or object level metadata. Conversely, using export features in some systems will affect metadata.
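
As a rough illustration of the point that an ordinary OS-level copy can carry timestamps along with the content, here is a minimal Python sketch. It is not one of the command-line tools named in the comments, and the function names `sha256_of` and `copy_with_verification` are hypothetical. It assumes that `shutil.copy2` is able to preserve modification and access times on the platform in use; creation time and some extended attributes are not preserved everywhere, and reading the source to hash it may itself update the source's access time.

```python
import hashlib
import os
import shutil
import tempfile


def sha256_of(path):
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_with_verification(source, destination):
    """Copy source to destination, keeping timestamps where the OS allows,
    and confirm that the copy's content hash matches the original's."""
    source_hash = sha256_of(source)
    shutil.copy2(source, destination)  # copies content plus stat metadata
    if sha256_of(destination) != source_hash:
        raise ValueError("copy does not match source: " + destination)
    return source_hash


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        original = os.path.join(workdir, "original.txt")
        duplicate = os.path.join(workdir, "duplicate.txt")
        with open(original, "w") as handle:
            handle.write("sample content")
        print("sha256:", copy_with_verification(original, duplicate))
        print("mtime preserved:",
              int(os.stat(original).st_mtime) == int(os.stat(duplicate).st_mtime))
```

Recording the hash alongside the copy gives a simple way to show later that the collected file still matches what was copied.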

  • akass_USLS

    Re 1.f (Forensic Image – Con):
    Physical bit-level collection from a RAID5 or SAN system is highly impractical.

    • Why is bit-level collection from a RAID5 impractical? It is done all the time. There are even times that the individual disks from a RAID5 are acquired in order to recreate the actual drive itself.

      • Nothing is impractical in forensics, just challenging. Imaging a RAID is a forensic practice also known as Large Data Set Acquisitions (LDSA).

        • wtkjd makes a lot of good comments above, especially when it comes to imaging Mac and Linux devices and Big Data. However, you can image Macs and Linux machines with MacQuisition or a Raptor boot disk. To respond directly to wtkjd’s five standards, here is my input:

          I would limit the first five standards to:
          1) Physical Image (Bit for Bit image of physical device)
          2) Logical Image (Bit for Bit image of logical volume)
          3) Targeted Logical Image

          4 & 5 I would strike because you can capture all the OS files in a logical or targeted collection using a tool like FTK Imager or a Linux forensic boot disk, then export from the image file and not the original evidence.

          • wtkjd

            Jim,

            Based on the resume shared here, your focus is criminal or quasi-criminal computer forensics, as opposed to civil litigation, where there is a lower standard, both in terms of the rules of evidence and the rules of procedure (Federal Rules of Civil Procedure vs. Federal Rules of Criminal Procedure.) So while all your comments are valid, and appreciated, for criminal standard computer forensics, they are overkill for civil electronic discovery.

            When you also factor in FRCP Rule 1 proportionality, which may be embodied in amendments to discovery rules, you find 4 & 5 have a practical place in electronic discovery for civil litigation. Especially where systems are designed to defensibly (i.e. accurately, reliably and with data integrity) export information with appropriate attributes intact. Why image a RAID holding an EDB when a PST export of a mailbox is all you need?
