Title: Title Slide
1. E-Discovery Revisited: A Broader Perspective for IR Researchers
Jack G. Conrad, Thomson R&D
ICAIL'07 / DESI Workshop, June 4, 2007
2. EDD Outline
- EDD: The Big Picture
- Motivations
- Background
- EDD interactions: the dance of the litigants
- The complete EDD pipeline
- Alternative view of the enabling technologies
3. EDD: The Big Picture
- Electronic Data Discovery:
- Context: practical and research (TREC)
- Motivations
- (1) Recent characterization of the state of the art in EDD
- (2) Informational materials available for participants in forums like TREC
4. EDD: The Big Picture
- Electronic Data Discovery:
- There are presently 300-500 companies offering some form of EDD software or services.
- Several offer complete services across the E-Discovery spectrum:
- Kroll Ontrack
- Recently acquired Engenium (Symetric), the concept search engine company
- LN
- Acquired Applied Discovery in the recent past, and also offers a full spectrum of EDD services
- The EDD performance bar is constantly being raised
- Essential need to share diverse perspectives in the field with next-generation researchers
- What is the dance of the litigants? The complete EDD pipeline? The possible interactions of the enabling technologies?
5. EDD: Related Areas
- Litigation Support / Management
- Focal point for TLR and L/N
- West's Strategic Acquisition Team
- Compliance & Regulation
- Federal, State and other Regulatory Agencies
- Freedom of Information Act (FOIA) Inquiries
- Federal Government, the National Archives
- The National Archives had a role in initiating the TREC Legal Track
- Homeland Security / Anti-Terrorism
- Federal Government, National Security Agencies
6. E-Discovery Growth Prospects
[Chart: growth prospects, in dollars (millions)]
7. Source of EDD Survey Responses
- The Socha-Gelbmann Report, 2005
- In total, 240 consumers/providers of EDD software / services were contacted
- 139 expressed interest in participating
- 72 of those were surveyed via spreadsheet or phone interview
- 3 of the final spreadsheets did not contain enough info to be used
- Conducted among 69 E-Discovery consumers & providers
- 24 consumers & 45 providers
- Consumers
- A cross-section of Am Law 200 law firms & large U.S. companies
- Providers
- A broad-based collection of software & service providers who market their offerings as E-Discovery tools or services
8. Fastest Growing E-Discovery Services / Software
9. E-Discovery: Areas of Industry Strength
10. E-Discovery: Areas of Industry Weakness
11. EDD Scenarios: the dance of the litigants
- Employment Discrimination: Party A vs. Company B (David vs. Goliath)
- Securities Fraud: Govt vs. Company C
- Intellectual Property: Company D vs. Company E
[Diagram label: EDD resources]
12. The EDD Work Flow Model
Breadth and depth of discoverable materials established
Data transferred from original or intermediate media to uniform media for analysis
Vetting performed to reduce volume of data (incl. filtering, deduping, clustering, etc.)
Primary review stage. Data transferred to dedicated repository
Hard copy media converted (e.g., OCR) or audio records transcribed
Electronically stored info. is preserved from multiple sources
Searching based upon sources, dates, orig. file types, key words, etc.
Advice to clients on strategies & procedures for conducting E-Discovery processing
E-Discovery Pipeline
- Data Gathering: Preservation & Collection
- Media Restoration (data trans. to a std. media)
- Data Processing (filtering, format conversion)
- Online Review: Hosting & Searching
- E-Discovery Consulting (throughout process)
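The vetting and searching steps named in the work flow can be illustrated with a minimal Python sketch. It is not taken from the talk: the `Doc` record, its fields, and the `dedupe` and `search` helpers are hypothetical names chosen for illustration, and exact-match hashing of normalized text is only one simple way to dedupe (it does not handle near-duplicates or clustering).

```python
import hashlib
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Doc:
    # Hypothetical record for one collected item; field names are assumptions.
    source: str
    created: date
    file_type: str
    text: str


def dedupe(docs):
    """Keep the first copy of each document whose normalized text is identical."""
    seen, unique = set(), []
    for d in docs:
        # Normalize case and whitespace before hashing.
        normalized = " ".join(d.text.lower().split())
        digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(d)
    return unique


def search(docs, start, end, file_types, keywords):
    """Filter by date range and original file type, then match any keyword."""
    hits = []
    for d in docs:
        if not (start <= d.created <= end):
            continue
        if d.file_type not in file_types:
            continue
        body = d.text.lower()
        if any(k.lower() in body for k in keywords):
            hits.append(d)
    return hits


docs = [
    Doc("mail-server", date(2005, 3, 1), "email", "Please review the merger terms."),
    Doc("laptop-A", date(2005, 3, 1), "email", "please  review the MERGER terms."),
    Doc("file-share", date(2004, 1, 15), "doc", "Quarterly merger forecast"),
    Doc("file-share", date(2005, 6, 9), "doc", "Lunch menu"),
]

unique = dedupe(docs)
hits = search(unique, date(2005, 1, 1), date(2005, 12, 31), {"email", "doc"}, ["merger"])
```

Deduping before searching mirrors the order in the pipeline, where vetting reduces the volume of data ahead of the review stage.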
13. The EDD Work Flow Model
Proposed extended scope of text retrieval task (i.e., including filtering, organizing & report generation)
Identification (scope, depth of information)
E-Discovery Pipeline
- Data Gathering: Preservation & Collection
- Media Restoration (data trans. to a std. media)
- Data Processing (filtering, format conversion)
- Online Review: Hosting & Searching
- E-Discovery Consulting (throughout process)
14. E-Discovery Technology Pyramid
Reporting
Fourth Tier: analyzing (consolidating, summarizing, production)
Third Tier: organizing (classifying or clustering, tagging, linking)
Indexing
Second Tier: vetting (filtering, deduping, handling similar doc-objects)
Hosting
Foundation: collecting (identification, conversion, migration)
15. Additional E-Discovery Challenges
- Workflow Support
- Process Efficiencies
- Per Step
- Overall
- Tool Integration
- Ease of Use
- For Customers
- For Support
- High Value to Cost Ratio
- Added value through advanced technologies
- A TREC-like forum has much potential to contribute here
- Both within and beyond the context of IR