E-Discovery Collection & Review Summary
Generates a comprehensive internal status report tracking the lifecycle of electronically stored information (ESI) from collection through attorney review in litigation. It summarizes data volumes by custodian, search term hit results, and progress on relevance and privilege determinations. Use it during the discovery phase of commercial litigation to provide visibility for counsel, case managers, and clients on e-discovery progress and resource needs.
E-Discovery Collection & Review Summary Workflow
Purpose and Strategic Context
You are responsible for preparing a comprehensive internal status report that tracks electronically stored information (ESI) throughout its complete lifecycle, from initial collection through final attorney review. This report serves as the authoritative record of e-discovery progress for litigation counsel, case managers, and client stakeholders, providing clear visibility into data volumes across custodians, the effectiveness of search term strategies, and the pace and outcomes of document review for both relevance and privilege determinations. The summary must enable informed decision-making about case strategy, resource allocation, and timeline projections while maintaining strict confidentiality and work product protection.
Information Gathering and Source Material Assembly
Begin by conducting a thorough search of all uploaded case documents to locate and extract the three critical categories of source materials required for this analysis. First, identify all raw collection logs, metadata files, and processing reports from your e-discovery vendor, which typically appear in CSV, JSON, or structured text formats. These files contain foundational data about what ESI was collected, from which custodians, and how it was processed. Search specifically for documents containing terms like "collection log," "processing report," "vendor deliverable," "data load summary," and custodian names to ensure comprehensive retrieval.
Second, locate exported reports from your review platform that capture coding decisions, search term hit counts, and reviewer activity metrics. These documents often include titles containing "review statistics," "coding report," "search term results," or "platform export." Extract all relevant data points including document counts, reviewer names, coding categories, and timestamp information. Third, identify finalized attorney review decisions that document relevance determinations and privilege assertions, searching for documents labeled as "privilege log," "relevance coding," "attorney review summary," or similar designations. Ensure you are working with the most current versions by checking document dates and version indicators.
If any critical source materials cannot be located through document search, systematically identify what information is missing and request clarification from the user about where these materials can be found or whether alternative data sources are available. Document any gaps in source materials as these will need to be disclosed in your final summary with appropriate caveats about completeness.
Data Validation and Normalization Process
Once all source materials are identified and extracted, conduct rigorous validation to ensure data integrity before proceeding with analysis. Cross-reference custodian identifiers across different data sources, as vendors and review platforms frequently use inconsistent naming conventions that must be reconciled to enable accurate aggregation. Create a master custodian mapping table that standardizes all variations of each individual's name, employee ID, or other identifiers into a single canonical reference.
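The master custodian mapping table can be sketched as a simple alias lookup. This is a minimal illustration only: the custodian IDs, the names in `ALIASES`, and the token-sorting normalization rule are hypothetical assumptions, not conventions taken from any vendor or review platform.

```python
import re
from typing import Optional

# Hypothetical alias table: each canonical custodian ID maps to the
# name variants seen in vendor logs and review platform exports.
ALIASES = {
    "CUST-001": ["Smith, Jane", "Jane Smith", "jsmith@example.com"],
    "CUST-002": ["Robert Lee", "Lee, Robert (Bob)"],
}

def normalize(raw: str) -> str:
    """Lowercase, drop punctuation, and sort tokens so 'Smith, Jane' matches 'Jane Smith'."""
    tokens = re.findall(r"[a-z0-9@.]+", raw.lower())
    return " ".join(sorted(tokens))

# Reverse index from every normalized variant to its canonical ID.
LOOKUP = {normalize(alias): cid for cid, aliases in ALIASES.items() for alias in aliases}

def canonical_id(raw: str) -> Optional[str]:
    """Return the canonical ID, or None so unmapped names can be resolved by hand."""
    return LOOKUP.get(normalize(raw))
```

Returning `None` for unknown names, rather than guessing, keeps unresolved identifiers visible so they can be documented as gaps.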
Verify that document counts align logically across the collection-to-review pipeline. When every stage is counted in the same unit (for example, individual items after container files such as mailboxes and archives have been expanded), the number of documents collected should equal or exceed the number processed, which should equal or exceed the number loaded into the review platform, which should equal or exceed the number reviewed. A stage whose count exceeds the stage before it points to duplication, mismatched counting units, or processing errors, while an unexplained drop between stages points to potential data loss; both must be investigated and explained. Calculate these verification metrics with precision, documenting any anomalies discovered and the steps taken to resolve or account for them.
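The cascading count check described above can be expressed as a short comparison of adjacent stages. The sketch below assumes the four counts are already available in one dictionary and measured in the same unit; the stage names are illustrative labels.

```python
# Pipeline stages in order; each count should not exceed the one before it.
STAGES = ["collected", "processed", "loaded", "reviewed"]

def check_pipeline(counts):
    """Return a description of every stage whose count exceeds the stage before it."""
    issues = []
    for earlier, later in zip(STAGES, STAGES[1:]):
        if counts[later] > counts[earlier]:
            issues.append(
                f"{later} ({counts[later]}) exceeds {earlier} ({counts[earlier]})"
            )
    return issues
```

An empty list means the totals cascade as expected; any entry is an anomaly to investigate and explain in the report.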
Validate the completeness of search term data by confirming that every search term or Boolean query string employed in the matter has corresponding hit count data. If search term results appear incomplete or inconsistent, note these limitations explicitly as they will affect the reliability of your effectiveness analysis. Similarly, verify that review coding data includes all required fields such as reviewer identity, coding date, relevance designation, privilege assertion, and any issue tags or other categorical assignments used in the matter.
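Both completeness checks reduce to looking for what is absent. The following sketch assumes search terms and coding records have been parsed into plain Python structures; the field names in `REQUIRED_FIELDS` are hypothetical and should be replaced with the fields actually used in the matter.

```python
# Hypothetical required coding fields; adjust to the matter's review protocol.
REQUIRED_FIELDS = {"reviewer", "coding_date", "relevance", "privilege"}

def missing_hit_counts(terms_used, hit_counts):
    """Return search terms that were run but have no hit count data."""
    return sorted(set(terms_used) - set(hit_counts))

def incomplete_records(records):
    """Return (doc_id, missing_fields) for records with an absent or empty required field."""
    problems = []
    for rec in records:
        missing = sorted(f for f in REQUIRED_FIELDS if rec.get(f) in (None, ""))
        if missing:
            problems.append((rec.get("doc_id"), missing))
    return problems
```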
Collection Metrics Analysis and Custodian Profiling
Construct a comprehensive collection metrics summary organized by custodian that quantifies the volume of data collected from each individual, the number of discrete files or documents, and the current processing status. For each custodian, calculate the total data volume in gigabytes, the number of individual items collected, the file type distribution showing the breakdown between emails, office documents, and other formats, and the processing completion percentage indicating what proportion has been successfully loaded into the review platform versus what remains in processing queues or encountered errors.
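As a rough illustration of the per-custodian rollup, the sketch below aggregates item-level records into the four metrics named above. The record keys (`custodian`, `bytes`, `type`, `status`) and the `"loaded"` status value are assumptions about how a vendor export might be parsed, and gigabytes are taken as decimal (10^9 bytes).

```python
from collections import Counter, defaultdict

def custodian_metrics(items):
    """Summarize volume, item count, file types, and load percentage per custodian."""
    grouped = defaultdict(list)
    for item in items:
        grouped[item["custodian"]].append(item)

    summary = {}
    for custodian, rows in grouped.items():
        loaded = sum(1 for r in rows if r["status"] == "loaded")
        summary[custodian] = {
            "gigabytes": sum(r["bytes"] for r in rows) / 1e9,  # decimal GB (assumption)
            "items": len(rows),
            "file_types": dict(Counter(r["type"] for r in rows)),
            "pct_loaded": 100.0 * loaded / len(rows),
        }
    return summary
```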
Identify custodians who present unusual data profiles that may require special attention or explanation. Flag individuals with exceptionally high data volumes that significantly exceed the matter average, as these may indicate custodians who retain nearly everything, key players with extensive relevant communications, or technical issues such as personal file storage being inadvertently collected. Similarly, highlight custodians with unusually low data volumes that fall well below expectations, as this may suggest incomplete collection, data retention issues, or individuals with limited involvement in the relevant time period.
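One way to flag unusual volumes is to compare each custodian against a typical value for the matter. The sketch below substitutes the median for the average because a single very large custodian can distort a mean; that substitution and the `high_factor` and `low_factor` thresholds are illustrative assumptions, not established standards.

```python
from statistics import median

def flag_volume_outliers(gb_by_custodian, high_factor=3.0, low_factor=0.25):
    """Flag custodians far above or below the median volume for the matter."""
    mid = median(gb_by_custodian.values())
    high = sorted(c for c, gb in gb_by_custodian.items() if gb > high_factor * mid)
    low = sorted(c for c, gb in gb_by_custodian.items() if gb < low_factor * mid)
    return {"median_gb": mid, "high": high, "low": low}
```

A flag is only a prompt for explanation; the narrative should state whether the cause is role, retention practice, or a collection problem.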
Calculate processing success rates and error frequencies to assess the technical quality of the collection. Determine what percentage of collected data successfully processed without errors, what types of errors occurred most frequently, and whether certain custodians or file types experienced disproportionate processing failures. This analysis reveals both the reliability of your dataset and potential areas where additional technical remediation or re-collection may be necessary.
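The success rate and error frequency calculation might look like the following sketch. It assumes each processing record carries an `error` key that is `None` on success and an error label otherwise; that record shape is hypothetical.

```python
from collections import Counter

def processing_quality(records):
    """Return (success rate in percent, error types ordered by frequency)."""
    total = len(records)
    errors = [r["error"] for r in records if r["error"] is not None]
    success_rate = 100.0 * (total - len(errors)) / total if total else 0.0
    return success_rate, Counter(errors).most_common()
```

Running the same function on records filtered by custodian or file type shows whether failures are concentrated in one group.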
Search Term Effectiveness Evaluation
Analyze the results of search term application across the collected dataset to assess the effectiveness of your search strategy and inform decisions about term refinement or additional targeted collection. For each search term or Boolean query string employed, document the number of documents retrieved, the hit rate as a percentage of the total reviewable population, and where available, the precision rate based on sample validation exercises showing what percentage of retrieved documents were actually relevant.
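The two rates defined above use different denominators, which is easy to confuse. A minimal sketch, with illustrative function and parameter names:

```python
def hit_rate(hits, population):
    """Percentage of the total reviewable population retrieved by a term."""
    return 100.0 * hits / population

def precision(relevant_in_sample, sample_size):
    """Percentage of sampled hits that reviewers coded as relevant."""
    return 100.0 * relevant_in_sample / sample_size
```

For example, a term retrieving 500 of 10,000 reviewable documents has a 5% hit rate, and if 50 of 200 sampled hits were relevant, its precision is 25%.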
Identify high-performing search terms that efficiently retrieve relevant documents with minimal false positives, as these represent successful targeting that may inform future search strategies. Conversely, flag underperforming terms that either retrieve too few documents to be useful or generate excessive false positives that waste review resources. Calculate the overlap between different search terms to understand redundancy in your search strategy and identify opportunities for consolidation or refinement.
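Term overlap can be measured pairwise once the document IDs hit by each term are available as sets. The sketch below reports the shared document count and the Jaccard index (shared divided by combined) for each pair; the use of the Jaccard index is an illustrative choice, not something the workflow prescribes.

```python
from itertools import combinations

def term_overlap(hits_by_term):
    """Return {(term_a, term_b): (shared_docs, jaccard)} for every pair of terms."""
    result = {}
    for a, b in combinations(sorted(hits_by_term), 2):
        shared = hits_by_term[a] & hits_by_term[b]
        union = hits_by_term[a] | hits_by_term[b]
        jaccard = len(shared) / len(union) if union else 0.0
        result[(a, b)] = (len(shared), jaccard)
    return result
```

Pairs with a Jaccard index near 1 retrieve almost the same documents and are candidates for consolidation.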
If sample validation data is available showing relevance rates for documents retrieved by specific search terms, use this information to project the likely yield from remaining unreviewed search term hits. This projection enables more accurate forecasting of total relevant document volumes and helps prioritize which search term results should be reviewed first to maximize early case assessment value.
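The projection is a straightforward extrapolation of the sampled relevance rate to the unreviewed hits. This sketch assumes the sample is representative of the remaining hits for that term, which should be stated as an assumption in the report; the dictionary keys are hypothetical.

```python
def project_yield(terms):
    """Project relevant documents per term and order terms by sampled relevance rate."""
    projected = []
    for t in terms:
        rate = t["sample_relevant"] / t["sample_size"]
        projected.append({
            "term": t["term"],
            "relevance_rate": rate,
            "projected_relevant": round(t["unreviewed_hits"] * rate),
        })
    # Highest-yield terms first, as a suggested review priority.
    return sorted(projected, key=lambda p: p["relevance_rate"], reverse=True)
```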
Attorney Review Progress and Productivity Analysis
The core of your summary focuses on review statistics that demonstrate attorney productivity and case assessment progress. Calculate the total number of documents that have undergone human review, then determine the percentage coded as relevant to the litigation, the percentage designated as privileged, and the percentage marked as non-responsive or irrelevant. These metrics provide essential insight into case exposure, the strength of privilege assertions, and the efficiency of your search and collection strategy.
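The headline percentages can be computed directly from the coded documents. The sketch below assumes each reviewed document has a `relevance` value and a boolean `privileged` flag; because a document can be both relevant and privileged, the percentages are not expected to sum to 100.

```python
def coding_summary(docs):
    """Return counts and percentages of reviewed documents by coding outcome."""
    reviewed = len(docs)

    def pct(n):
        return 100.0 * n / reviewed if reviewed else 0.0

    return {
        "reviewed": reviewed,
        "pct_relevant": pct(sum(d["relevance"] == "relevant" for d in docs)),
        "pct_privileged": pct(sum(bool(d["privileged"]) for d in docs)),
        "pct_non_responsive": pct(sum(d["relevance"] == "non_responsive" for d in docs)),
    }
```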
Break down review progress by individual attorney or review team to assess productivity and identify resource allocation issues. For each reviewer, calculate the number of documents reviewed, the average review rate in documents per hour, and the coding distribution showing their relevance and privilege designation patterns. Significant variations in coding patterns between reviewers may indicate inconsistent application of review protocols that requires additional training or quality control measures.
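Reviewer-level metrics can be derived by totaling each reviewer's sessions. The sketch below assumes session records with `reviewer`, `docs_reviewed`, `hours`, and `privileged` keys, which is a hypothetical shape for a review platform export.

```python
from collections import defaultdict

def reviewer_productivity(sessions):
    """Return documents reviewed, documents per hour, and privilege rate per reviewer."""
    totals = defaultdict(lambda: {"docs": 0, "hours": 0.0, "privileged": 0})
    for s in sessions:
        t = totals[s["reviewer"]]
        t["docs"] += s["docs_reviewed"]
        t["hours"] += s["hours"]
        t["privileged"] += s["privileged"]

    return {
        reviewer: {
            "docs": t["docs"],
            "docs_per_hour": t["docs"] / t["hours"] if t["hours"] else 0.0,
            "privilege_rate": 100.0 * t["privileged"] / t["docs"] if t["docs"] else 0.0,
        }
        for reviewer, t in totals.items()
    }
```

Rates are computed from totals rather than by averaging per-session rates, so long and short sessions are weighted correctly.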
Analyze review velocity trends over time to project completion dates and identify potential bottlenecks. Calculate the daily or weekly review rate, determine whether review speed is accelerating or decelerating, and project the estimated completion date based on current productivity levels and remaining document volumes. If review is falling behind schedule, quantify the gap between current progress and target milestones, and calculate the additional resources or accelerated productivity required to meet deadlines.
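A simple completion projection divides the remaining volume by the current rate. The sketch below assumes a constant rate and counts calendar days rather than working days; both are simplifications that should be stated alongside any projected date.

```python
import math
from datetime import date, timedelta

def project_completion(remaining_docs, docs_per_day, as_of):
    """Estimate the completion date at the current daily review rate."""
    days_needed = math.ceil(remaining_docs / docs_per_day)
    return as_of + timedelta(days=days_needed)

def required_daily_rate(remaining_docs, as_of, deadline):
    """Return the daily rate needed to finish by the deadline."""
    days_left = (deadline - as_of).days
    if days_left <= 0:
        raise ValueError("deadline is not after the as-of date")
    return math.ceil(remaining_docs / days_left)
```

For example, 10,000 remaining documents at 500 per day projects to 20 days; meeting a deadline 10 days out would require 1,000 per day, so the gap is 500 documents per day.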
Examine privilege assertion patterns to ensure consistency and identify potential issues requiring senior attorney attention. Calculate the overall privilege rate as a percentage of reviewed documents, compare privilege rates across different custodians to identify individuals with unusually high or low privilege assertions, and analyze privilege rates by document type to verify that patterns align with expectations. Privilege rates that deviate significantly from historical norms or matter-specific expectations may indicate over-designation, under-designation, or the need for additional privilege training.
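The custodian comparison can be sketched as a deviation check against the overall privilege rate. The `tolerance_pts` threshold of 10 percentage points is an arbitrary illustrative value; matter-specific expectations (for instance, in-house counsel as custodians) should drive the real threshold.

```python
def privilege_rates(counts, tolerance_pts=10.0):
    """counts maps custodian -> (reviewed, privileged).

    Returns (overall rate, rate per custodian, custodians deviating beyond tolerance).
    """
    total_reviewed = sum(r for r, _ in counts.values())
    total_priv = sum(p for _, p in counts.values())
    if not total_reviewed:
        return 0.0, {}, []

    overall = 100.0 * total_priv / total_reviewed
    by_custodian = {c: 100.0 * p / r for c, (r, p) in counts.items() if r}
    flagged = sorted(
        c for c, rate in by_custodian.items() if abs(rate - overall) > tolerance_pts
    )
    return overall, by_custodian, flagged
```

Only aggregate counts are used here, which keeps the output free of the substance of any privileged communication.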
Comprehensive Output Generation and Presentation
Transform your analytical findings into a detailed written report that serves as the primary deliverable for this workflow. Structure the report to begin with an executive summary that provides a high-level overview of collection completeness, search term effectiveness, review progress, and projected completion timeline. This opening section should immediately orient readers to the current state of e-discovery and highlight any critical issues requiring immediate attention.
Follow the executive summary with detailed sections addressing each analytical component. Present collection metrics in a clear tabular format organized by custodian, showing data volumes, item counts, file type distributions, and processing status for each individual. Include narrative commentary that explains any anomalies, such as custodians with incomplete processing or unexpectedly high data volumes, and describes the steps being taken to address these issues.
Document search term performance in a comprehensive table that lists each search term, its hit count, hit rate percentage, and where available, its precision rate based on validation sampling. Provide interpretive analysis that identifies the most effective search terms, flags underperforming queries that may need refinement, and recommends any adjustments to the search strategy based on the results observed. If term overlap analysis reveals significant redundancy, include recommendations for consolidating or eliminating duplicative search terms.
Present review statistics through multiple analytical lenses to provide complete visibility into progress and productivity. Create a master review status table showing total documents reviewed, relevance rates, privilege rates, and completion percentages. Supplement this with reviewer-specific productivity metrics that enable case managers to assess individual performance and identify team members who may need additional support or training. Include trend analysis with visual representations showing review velocity over time and projected completion dates based on current productivity levels.
Address privilege designations with particular care given their sensitivity and potential impact on the litigation. Present privilege statistics in aggregate form showing overall privilege rates, privilege distribution by custodian, and privilege patterns by document type. Include narrative analysis that contextualizes these numbers, explaining whether privilege rates align with expectations and identifying any patterns that may require senior attorney review or adjustment to privilege review protocols.
Conclude the report with a forward-looking section that synthesizes all findings into actionable recommendations and timeline projections. Based on the data analyzed, provide specific recommendations regarding resource allocation, search term refinement, quality control measures, or other adjustments needed to optimize the e-discovery process. Project realistic completion dates for remaining collection, processing, and review activities, clearly stating the assumptions underlying these projections and identifying risks that could impact the timeline.
Quality Assurance and Validation Protocol
Before finalizing your report, conduct rigorous quality checks to ensure accuracy and completeness. Verify that every custodian identified in vendor collection data appears in your master status table with corresponding metrics, as omissions could indicate data processing failures or incomplete analysis. Cross-verify total document counts across all data sources (collection logs, search term results, and review platform exports) to ensure consistency and identify any discrepancies that might suggest data loss or duplication during processing.
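The custodian coverage check is a set comparison in both directions. This sketch assumes custodian identifiers have already been normalized to canonical IDs; the function and key names are illustrative.

```python
def coverage_gaps(vendor_custodians, status_table_custodians):
    """Return custodians present in one source but absent from the other."""
    vendor = set(vendor_custodians)
    table = set(status_table_custodians)
    return {
        "missing_from_table": sorted(vendor - table),
        "not_in_vendor_data": sorted(table - vendor),
    }
```

Both lists should be empty before the report is finalized; any remaining entry should be explained in the narrative.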
Implement targeted validation for privilege determinations given their heightened sensitivity and potential for challenge. If sample validation data is available, verify that privilege designations have been reviewed and approved by attorneys with appropriate authority. Document this validation process as it may be relevant to privilege log preparation and potential challenges from opposing counsel.
Review all numerical calculations to confirm accuracy, particularly percentages, projections, and trend analyses that will inform strategic decisions. Verify that all formulas and calculations are mathematically sound, that percentages are calculated against the correct denominators, and that projections are based on reasonable assumptions clearly stated in the report. Check that all data visualizations accurately represent the underlying data without distortion or misleading presentation.
Legal and Professional Compliance Considerations
Throughout this workflow, maintain strict adherence to data security protocols appropriate for confidential litigation materials and work product. Ensure that all analysis occurs within the bounds of any protective orders governing the litigation and that your summary does not inadvertently disclose privileged content or work product beyond authorized recipients. Be particularly attentive to how privilege statistics are presented, ensuring they provide necessary visibility for case management without revealing the substance of privileged communications or attorney mental impressions.
Document your methodology and any assumptions made during data normalization, metric calculation, or projection development, as opposing counsel or courts may later scrutinize your e-discovery process. Maintain an audit trail showing how source data was processed, how discrepancies were resolved, and what validation steps were performed to ensure accuracy. This documentation protects the integrity of your e-discovery process and provides support if your methods are challenged.
Before distributing any version of this summary beyond the immediate litigation team, clearly mark the document as attorney work product and include appropriate confidentiality legends. The report should explicitly state that it is prepared for purposes of litigation, contains attorney mental impressions and case assessment information, and is protected from disclosure under applicable work product and attorney-client privilege doctrines. Ensure that distribution is limited to individuals who need the information for case management or strategic decision-making purposes, and implement appropriate access controls to prevent unauthorized disclosure.
Recognize that this summary may inform critical decisions about case strategy, settlement negotiations, and resource allocation. Present all findings objectively and completely, including information that may be unfavorable or concerning, so that counsel can make fully informed decisions. If your analysis reveals significant problems such as substantial gaps in collection, ineffective search strategies, or review quality issues, present these findings clearly with specific recommendations for remediation rather than minimizing or obscuring the concerns.
Details
- Skill Type: form
- Version: 1
- Last Updated: 1/6/2026