Statistical Sampling Applied to Electronic Discovery

April 27, 2012

The purpose of this document is to provide guidance regarding the use of statistical sampling in e-discovery contexts. Most of the material is definitional and conceptual, and is intended for a broad audience. The later material and the accompanying spreadsheet provide additional information, particularly technical information, to people in e-discovery roles who become responsible for developing further expertise in this area.

1. Introduction

2. Estimating Proportions within a Binary Population

3. Guidelines and Considerations

4. Additional Guidance on Statistical Theory

5. Examples Using the Accompanying Excel Spreadsheet

6. Validation Study

3 comments to Statistical Sampling Applied to Electronic Discovery

  • Steve Gdula

    I wanted to follow up on my previous comment with an updated perspective. Some vendors are now promoting Predictive Coding as a way to accomplish the task I described in detail in my prior posting.

    I stated there that, when documents are targeted for imminent production, randomly auditing those that contain privilege keywords is a diligent way to prevent inadvertent production of privileged material.

    With access to Machine Learning / Predictive Coding technologies, the randomized audit of documents with privilege keywords could be replaced by a more targeted, ranked approach. Predictive Coding can supply a ‘Confidence Score’ for documents that appear to be privileged. Documents that a human reviewer has ‘manually’ tagged as non-privileged can then be ranked by that score, and those whose score exceeds an acceptable confidence threshold can be scrutinized one last time by an authority. This is quite reasonable and, on its surface, a more desirable approach to the final privilege screening.
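The ranked screening described in this comment might be sketched roughly as follows. This is only an illustration, not any vendor's method: the field names `tagged_privileged` and `privilege_score`, the function name, and the threshold value are all hypothetical, and the score is assumed to come from whatever predictive coding tool is in use.

```python
def flag_for_final_privilege_review(documents, threshold):
    """Return documents a reviewer tagged non-privileged whose (hypothetical)
    privilege confidence score meets or exceeds the threshold, highest first."""
    flagged = [
        doc for doc in documents
        if not doc["tagged_privileged"]          # human coded it non-privileged
        and doc["privilege_score"] >= threshold  # model suspects privilege
    ]
    # Rank so the most suspicious documents are reviewed first.
    return sorted(flagged, key=lambda doc: doc["privilege_score"], reverse=True)


# Illustrative data only; scores are made up for the example.
example_docs = [
    {"id": "a", "tagged_privileged": False, "privilege_score": 0.9},
    {"id": "b", "tagged_privileged": False, "privilege_score": 0.2},
    {"id": "c", "tagged_privileged": True, "privilege_score": 0.95},
    {"id": "d", "tagged_privileged": False, "privilege_score": 0.7},
]
review_queue = flag_for_final_privilege_review(example_docs, threshold=0.6)
```

Choosing the threshold is a judgment call: a lower value sends more documents to the final reviewer, a higher one risks missing privileged material.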


    Steve Gdula
    Paralegal – Technical Engineer
    Banner & Witcoff, Ltd.
    Intellectual Property Law

  • Steve Gdula

    I wanted to point out a possible need in Section 1.3, under the heading “Potential e-Disc situations that warrant sampling”, particularly with respect to ‘Review’.

    While you already point out the need to audit review coding in general (responsiveness, issues, etc.), I feel there is a need to run a test, explicitly for privileged content, against the current collection of documents identified for production. Using a term from your own article, I believe it would fall under the label of “Stratified Sampling”. This matters especially because a great deal of contract review is occurring, and the quality of privilege scrutiny should be audited more closely. While nothing as egregious as the inadvertent productions of ‘Victor Stanley v. Creative Pipe’ should occur, there seems to be complacency due to the use of claw-back agreements. The claw-back permits the return of the inadvertently produced document, but the damage is already done because opposing counsel has laid eyes on potentially impactful information; i.e., you cannot unring the bell.

    I propose a random audit of pre-production documents, drawn from the universe of those that contain any of the established privilege keywords. Of the documents that contain a word which may imply privilege and have been coded for production, a percentage can be screened by random audit to confirm their non-privileged status. This would certainly reinforce a “reasonable” approach in the eyes of a judge, although the claw-back agreement should make it less necessary for that purpose. The primary goal here is really to prevent the dissemination of case-damaging information to the other side.
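As a rough sketch of the audit proposed above, the selection could look something like this. The document fields (`text`, `coded_for_production`), the function name, and the audit fraction are assumptions made for illustration; a real review platform would supply its own keyword search and coding fields.

```python
import math
import random


def select_privilege_audit_sample(documents, privilege_keywords, audit_fraction, seed=None):
    """Draw a simple random sample from production-bound documents that
    contain at least one privilege keyword (case-insensitive substring match)."""
    keywords = [keyword.lower() for keyword in privilege_keywords]
    # The audit universe: coded for production AND hits a privilege keyword.
    universe = [
        doc for doc in documents
        if doc["coded_for_production"]
        and any(keyword in doc["text"].lower() for keyword in keywords)
    ]
    if not universe:
        return []
    # Round up so a non-empty universe always yields at least one document.
    sample_size = max(1, math.ceil(audit_fraction * len(universe)))
    rng = random.Random(seed)  # a fixed seed makes the selection reproducible
    return rng.sample(universe, sample_size)
```

Passing a fixed `seed` lets the same sample be regenerated later, which may help if the audit procedure ever has to be documented or defended.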


    Steve Gdula
    Paralegal – Technical Engineer
    Banner & Witcoff, Ltd.
    Intellectual Property Law

  • jbrown

    This guide is outstanding! Something that the industry really needs.
