Searching…Sorting Through the Tool Box

Posted By Daniel Kaiser, Esq. on August 28, 2009

You may know what you’re looking for, but do you know how to look?

For those engaged in eDiscovery, two cases touching on search methodologies have held our attention over the past year: Magistrate Judge John Facciola’s decision in U.S. v. O’Keefe[1], and Magistrate Judge Paul Grimm’s decision in Victor Stanley, Inc. v. Creative Pipe, Inc.[2] Facciola’s harangue regarding the complex nature of ESI searches may have assured his immortality, and it is too good to resist quoting yet again:

Whether search terms or ‘keywords’ will yield the information sought is a complicated question involving the interplay, at least, of the sciences of computer technology, statistics and linguistics…. Given this complexity, for lawyers and judges to dare opine that a certain search term or terms would be more likely to produce information than the terms that were used is truly to go where angels fear to tread.[3]

In this vein, Facciola noted that searching was best left to the experts.  On the other hand, Grimm, emphasizing cross-party collaboration, sees the creation of search protocols as potentially falling within attorney competency – so long as the attorney has performed quality assurance testing on the methodology selected, can explain the rationale for selecting the methodology, and can show proper implementation.[4]

Facciola and Grimm come by their wariness honestly.  The Text Retrieval Conference (TREC) series is a research body co-sponsored by NIST and IARPA (Logik is a 2009 TREC participant).  The TREC Legal Track “focuses on evaluation of search technology for discovery of electronically stored information in litigation and regulatory settings.”[5]  The Overview of the TREC 2008 Legal Track reports that “the consensus Boolean query found 42% of the highly relevant documents, on average per topic, . . . [and 33%] of all relevant documents.”[6]  Further, negotiated Boolean keyword searches were found to be on par with the newer and more complex search methods tested.[7]  In fact, keyword searches can be notably strengthened when they are performed in an iterative fashion: sampling the search results, and then adjusting the negotiated keywords to improve the results.  Yet it has been observed that although various search methodologies may return comparable rates of recall, the responsive documents each one actually retrieves vary, so that using mixed search technologies on the same data set allows a higher rate of recall.[8]
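The point about mixed search technologies can be made concrete with a little arithmetic. The following is a minimal sketch, not anything drawn from the TREC data: the document IDs, the two result sets, and the recall helper are all hypothetical, invented only to show how two methods with the same recall can combine into a higher one when they find different documents.

```python
def recall(retrieved, relevant):
    """Fraction of the relevant documents that a search actually retrieved."""
    if not relevant:
        return 0.0
    return len(retrieved & relevant) / len(relevant)


# Hypothetical document IDs, invented for illustration only.
relevant = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}   # documents a reviewer would call responsive
keyword_hits = {1, 2, 3, 4, 11, 12}          # assumed results of a Boolean keyword search
concept_hits = {3, 4, 5, 6, 13}              # assumed results of a concept search

keyword_recall = recall(keyword_hits, relevant)
concept_recall = recall(concept_hits, relevant)
# Running both methods on the same data set and pooling the results.
combined_recall = recall(keyword_hits | concept_hits, relevant)

print(f"keyword search recall:  {keyword_recall:.0%}")
print(f"concept search recall:  {concept_recall:.0%}")
print(f"combined search recall: {combined_recall:.0%}")
```

In this made-up example each method alone finds four of the ten responsive documents, but only two of those overlap, so the pooled result finds six. The gain depends entirely on how much the two result sets differ, and in practice the set of responsive documents is unknown and has to be estimated by sampling.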

This emerging data, along with recent judicial enthusiasm for the incorporation of concept searching[9], reinforces the idea that attorneys need to develop a comfortable working knowledge of the array of electronic data search technologies.  The following non-exclusive list of search methodologies and vocabulary is intended as a reference for those who are finding their way through the terminological wrangle and getting to know the eDiscovery landscape:

[1] U.S. v. O’Keefe, 537 F. Supp. 2d 14 (D.D.C. 2008).
[2] Victor Stanley, Inc. v. Creative Pipe, Inc., 250 F.R.D. 251 (D. Md. 2008).
[3] U.S. v. O’Keefe, 537 F. Supp. 2d 14, 24 (D.D.C. 2008).
[4] Victor Stanley, Inc. v. Creative Pipe, Inc., 250 F.R.D. 251 (D. Md. 2008) (citing The Sedona Conference Best Practices Commentary on the Use of Search & Information Retrieval Methods in E-Discovery, 8 Sedona Conf. J. 189 (2007)).
[5] http://trec.nist.gov/pubs/trec17/papers/LEGAL.OVERVIEW08.pdf.
[6] http://trec.nist.gov/pubs/trec17/papers/LEGAL.OVERVIEW08.pdf at 5.
[7] Jason Krause, In Search of the Perfect Search, A.B.A.J. (Apr. 2009), http://www.abajournal.com/magazine/in_search_of_the_perfect_search/.
[8] Id.
[9] See Disability Rights Council of Greater Wash. v. Wash. Metro. Area Transit Auth., 2007 WL 1585452 (D.D.C. June 1, 2007).
