Harvey Leads In Benchmarking Study Against Human Lawyers

Harvey emerged as the top-performing platform in an independent benchmarking study of legal artificial intelligence tools released Thursday, which also showed how the tools' strengths and weaknesses compare with those of a group of human lawyers.

The study, conducted by the AI evaluation platform Vals AI with collaboration from Legaltech Hub, evaluated five leading legal generative AI tools across seven tasks. Vals AI says its auto-evaluation framework produces blind assessments to evaluate the accuracy of AI models.

Vals AI tested the AI tools against a control group of independent lawyers, called the Lawyer Baseline, supplied by Cognia Law, an alternative legal service provider. The results showed that AI does deliver some value in legal work.

"Generative AI has reshaped the legal landscape, but not all tools are created equal," Rayan Krishnan, co-founder of Vals AI, said in a statement. "Our study not only measures performance, but also establishes first-ever standards that legal professionals and developers can rely on to understand the technology's impact, but most importantly, its limitations."

Harvey, the fast-growing legal tech startup that just raised a $300 million Series D round, entered its AI assistant in six of the seven tasks in the study. It received the top score among AI tools on five tasks and outperformed the Lawyer Baseline on four. It also tied the Lawyer Baseline in generating a chronology, but sat out the EDGAR research task.

This marked the first public benchmarking evaluation of Harvey's AI assistant.

CoCounsel, the AI tool from Thomson Reuters, was the only other tool to receive a top score in the study, which it earned for summarizing documents. Thomson Reuters submitted its product in four of the seven task areas, surpassing the Lawyer Baseline in all four and achieving the highest average score across them.

Vincent AI, the AI assistant from vLex, participated in six tasks. It performed better than the Lawyer Baseline in document question-answering, document summarization and transcript analysis.

Vecflow's Oliver, the newest company in this study of AI assistants, opted into five tasks. It outperformed the Lawyer Baseline in document question-answering and document summarization, and was the only AI tool to opt into the EDGAR research category.

Lexis+ AI, the AI platform from LexisNexis, was originally part of the study, but withdrew from all tasks except for legal research. Vals AI plans to release its study on legal research soon.

The Lawyer Baseline topped the AI tools in the tasks of EDGAR research and redlining, which refers to editing contracts.

A consortium of law firms, including Reed Smith LLP, Fisher Phillips, McDermott Will & Emery LLP, Ogletree Deakins Nash Smoak & Stewart PC and four anonymous firms, contributed sample questions and documents for the study.

Vals AI found that AI tools outperformed the human lawyers in easy cases, but fell short in complex and reasoning-intensive tasks.

"These results offer a balanced perspective for the legal community," Langston Nashold, co-founder of Vals AI, said in a statement. "For developers, it's a roadmap to prioritize innovation in underperforming areas. For law firms, it's a guide to making strategic investments in AI that enhance both client service and operational ROI."

--Editing by Adam LoBelia.

Law360 is owned by LexisNexis Legal & Professional, a RELX company.

