Agentic GraphRAG for Legal Contract Analysis
- Methodology: The post details an agentic GraphRAG system for legal contract analysis, leveraging LLMs (specifically Gemini 2.0 Flash) and a knowledge graph (Neo4j) to extract structured information from contracts and enable more precise, context-aware retrieval than naive RAG. The system uses LangGraph to orchestrate an agent that can query the knowledge graph based on user input.
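The orchestration described above can be sketched as a minimal decide-act-answer loop. This is a plain-Python stand-in for what LangGraph wires up in the post; `llm_decide` and `run_contract_search` are hypothetical callables, not the post's actual API:

```python
from typing import Callable

def agent_step(question: str,
               llm_decide: Callable[[str], dict],
               run_contract_search: Callable[[dict], list]) -> str:
    """One decide -> act -> answer cycle of the agent.

    llm_decide returns either a tool call, e.g.
    {"tool": "contract_search", "args": {...}}, or a direct
    {"answer": "..."} when no graph lookup is needed.
    """
    decision = llm_decide(question)
    if decision.get("tool") == "contract_search":
        results = run_contract_search(decision["args"])
        # In the real system the tool results are fed back to the LLM,
        # which composes the final natural-language answer.
        return f"Found {len(results)} matching contracts"
    return decision.get("answer", "")
```

LangGraph adds state management and cyclic tool calls on top of this pattern; the loop here only shows the control flow.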
- Graph Construction: Contracts from the CUAD dataset (CC BY 4.0 license, 500+ contracts) are processed using LLMs and Pydantic schemas to extract entities (parties, locations, clauses) and relationships, which are then loaded into Neo4j using Cypher queries. The use of Pydantic enforces structured output from the LLM, improving reliability.
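A self-contained sketch of the extraction-schema-to-Cypher step might look like the following. It uses stdlib dataclasses in place of Pydantic so it runs without dependencies, and the field names (`contract_type`, `PARTY_TO`, etc.) are illustrative, not the post's actual schema:

```python
from dataclasses import dataclass, field

@dataclass
class Party:
    name: str
    location: str

@dataclass
class Contract:
    title: str
    contract_type: str
    parties: list[Party] = field(default_factory=list)

def contract_to_cypher(c: Contract) -> tuple[str, dict]:
    """Build one parameterized Cypher statement that loads a contract
    and its parties; parameters avoid injection from LLM-extracted text."""
    query = (
        "MERGE (k:Contract {title: $title}) "
        "SET k.type = $type "
        "WITH k UNWIND $parties AS p "
        "MERGE (pt:Party {name: p.name}) "
        "SET pt.location = p.location "
        "MERGE (pt)-[:PARTY_TO]->(k)"
    )
    params = {
        "title": c.title,
        "type": c.contract_type,
        "parties": [{"name": p.name, "location": p.location} for p in c.parties],
    }
    return query, params
```

In the actual pipeline the schema doubles as the LLM's structured-output contract (via Pydantic), so a single class definition constrains extraction and drives the graph load.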
- Tool Design: A `ContractSearchTool` is implemented as a semantic layer, abstracting the underlying Neo4j graph structure from the LLM. The tool uses a `ContractInput` Pydantic model to define search parameters (date ranges, contract type, parties, etc.) and dynamically constructs Cypher queries based on these parameters. The design incorporates inferred property filtering (e.g., determining contract activity based on end date) and custom operator filtering (e.g., using `NumberOperator` enum for monetary value comparisons).
- Dynamic Queries with `cypher_aggregation`: The system experiments with a `cypher_aggregation` attribute, allowing the LLM to generate custom Cypher aggregations for advanced analytics. This provides flexibility but introduces potential instability due to the complexity of LLM-generated queries.
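One way to contain the instability mentioned above is a read-only guard before appending the LLM-generated snippet. This is a hypothetical sketch of that mitigation (a crude keyword check, not the post's implementation, and not a substitute for running against a read-only database role):

```python
# Clauses that would mutate the graph; an LLM-generated aggregation
# should only read and aggregate, never write.
FORBIDDEN = ("CREATE", "MERGE", "DELETE", "SET", "REMOVE", "DROP")

def append_aggregation(base_query: str, cypher_aggregation: str) -> str:
    """Append an LLM-generated aggregation to a vetted base MATCH,
    rejecting anything that looks like a write operation."""
    upper = cypher_aggregation.upper()
    if any(word in upper for word in FORBIDDEN):
        raise ValueError("aggregation must be read-only")
    return f"{base_query} {cypher_aggregation}"
```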
- Agent Evaluation: A benchmark of 22 questions is used to evaluate the system, with a custom metric called `answer_satisfaction` to assess the correctness and completeness of the LLM's responses. Initial results show similar performance across Gemini 1.5 Pro, Gemini 2.0 Flash, and GPT-4o, with GPT-4o slightly outperforming the Gemini models (0.82 vs. 0.77).
- Limitations: The evaluation dataset is small (22 questions) and doesn't fully probe the reasoning capabilities of the LLMs. The post also notes that some LLMs struggle with nested objects as tool inputs, which complicates structured operator-based filtering. According to additional sources, the CUAD dataset consists of 510 contracts with 13,000+ labels covering 41 types of legal clauses relevant to corporate transactions.