Dupes

Even before either side does a first-pass review of its collected documents, the parties can easily identify which potentially discoverable documents both sides already have in common. This process would be fast, inexpensive, and easy, and would allow new kinds of cooperation between parties.


In 2009 and 2010, Patrick Oot, Joe Howie, and Anne Kershaw exposed a disturbing lack of custodial and cross-custodial deduplication in ediscovery at the time. They also considered the ethical implications of that lack. See, e.g., Patrick Oot, Joe Howie, and Anne Kershaw, “Ethics and Ediscovery Review,” ACC Docket Vol. 28, Issue 1 (Jan/Feb 2010): Pages 46-57, available at http://www.knowledgestrategysolutions.com/wp-content/uploads/ACC-Docket-Ethics-of-Edisc…pdf (“Ethics Review”) (last retrieved September 4, 2014).

The Ethics Review also observed that counsel should “[c]onsider consolidating duplicates across parties.” Ethics Review at 56 (emphasis added). However, although the last four years have seen great strides in the adoption of cross-custodial deduplication, it seems that deduplication between parties is not yet being done.

In many cases, the parties should identify those common documents immediately. Here are a few thoughts about why and how.

Identifying which documents are already in both parties’ possession would allow the parties to immediately begin discussing the responsiveness of specific documents, categories, and concepts, without exposing any confidential information. It would virtually eliminate the risk that any party’s valid interests could be compromised by those discussions.

For efficiency and objectivity, this protocol should be limited to exact duplicates. Exact duplicates could be easily and cheaply determined by comparing hash lists. To maximize the identification of duplicates, both parties should agree that all emails would be converted to the RFC 2822 format before being hashed. See Ethics Review at 57 (addressing cross-custodian deduplication). More generally, where different collection or ingestion methods would result in non-comparable hash values, counsel should agree on using the same methods.
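To make the hash-list comparison concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not a recommended protocol: the choice of SHA-256, the function names, the document IDs, and the sample messages are all hypothetical, and the only normalization shown is converting line endings to CRLF (the line ending RFC 2822 specifies). A real agreement between counsel would need to cover many other differences between collection and ingestion methods.

```python
import hashlib


def to_crlf(raw: bytes) -> bytes:
    """Normalize line endings to CRLF (one illustrative normalization step only)."""
    # Collapse CRLF and bare CR to LF first, then expand every LF to CRLF.
    unified = raw.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
    return unified.replace(b"\n", b"\r\n")


def email_digest(raw: bytes) -> str:
    """SHA-256 hex digest of the normalized message (the hash choice is an assumption)."""
    return hashlib.sha256(to_crlf(raw)).hexdigest()


def build_hash_list(documents: dict[str, bytes]) -> dict[str, str]:
    """Map digest -> party-internal document ID. Only the digests would be exchanged."""
    return {email_digest(content): doc_id for doc_id, content in documents.items()}


def common_digests(ours: dict[str, str], their_digests: set[str]) -> set[str]:
    """Digests that appear in both parties' hash lists."""
    return set(ours) & their_digests


# Hypothetical collections: the same message stored with different line endings.
plaintiff_docs = {
    "P-001": b"Subject: Q3 forecast\n\nNumbers attached.\n",
    "P-002": b"Subject: Lunch\n\nNoon?\n",
}
defendant_docs = {
    "D-117": b"Subject: Q3 forecast\r\n\r\nNumbers attached.\r\n",
    "D-118": b"Subject: Board minutes\r\n\r\nDraft.\r\n",
}

plaintiff_list = build_hash_list(plaintiff_docs)
defendant_list = build_hash_list(defendant_docs)

# Each side sees only the other side's digests, never the documents themselves.
shared = common_digests(plaintiff_list, set(defendant_list))
plaintiff_common = sorted(plaintiff_list[d] for d in shared)
defendant_common = sorted(defendant_list[d] for d in shared)
print(plaintiff_common, defendant_common)
```

In this sketch, each party maps the shared digests back to its own document IDs, so the common set can be discussed without either side handing over a document the other does not already hold. Without the agreed normalization step, the two copies of the forecast email would hash differently and the match would be missed.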

One way to minimize differences between ingestion methods would be for the parties to agree to share a vendor. Of course, sharing vendors raises potential conflict issues. See, e.g., Gordon v. Kaleida Health, No. 08-CV-378S(F) (W.D.N.Y. May 21, 2013), available at http://scholar.google.com/scholar_case?case=4027097771033406737 (last retrieved September 3, 2014).

Such issues could be minimized or eliminated by the use of ediscovery neutrals. See, e.g., The United States District Court for the District of Kansas Guidelines for Cases Involving Electronically Stored Information, available at http://www.ksd.uscourts.gov/guidelines-for-esi/, at p. 5 ¶13:

13. Creation of a Shared Database and Use of One Search Protocol

In appropriate cases counsel may want to attempt to agree on the construction of a shared database, accessible and searchable by both parties. In such cases, they should consider both hiring a neutral vendor and/or using one search protocol with a goal of minimizing the costs of discovery for both sides.

Using a shared vendor for the common documents would also allow the parties to use exactly the same search and clustering methods on the database of duplicates. Comparing apples to apples could help them to agree on priorities for the larger universe of non-common documents.

One slight objection to this protocol is that a common document might be part of a privileged communication to one of the parties. However, there are many solutions to that problem, such as the use of an ediscovery neutral.

Of course, deduplication between parties could also reduce production volumes and provide a way to split the cost of objectively coding the common documents.

In light of these potential opportunities, agreeing on deduplication between parties may be ethically required in some cases.