GraphRAG — Knowledge Graph-Enhanced Retrieval
Extract entities, build knowledge graphs, and traverse relationships for multi-hop reasoning in agents.
GraphRAG extends vector search with entity extraction and relationship traversal. Pure vector search finds semantically similar chunks. GraphRAG finds connected concepts by walking entity relationships — perfect for multi-hop reasoning ("What does AgentBreeder RBAC affect?").
:::note Prerequisites for local GraphRAG
- Ollama installed (ollama.com)
- `ollama pull qwen2.5:7b` — entity extraction
- `ollama pull nomic-embed-text` — embeddings
- Local stack running: `docker compose up -d`
:::
When to Use GraphRAG
Vector RAG is enough for:
- "What is the refund policy for annual subscriptions?" — factual lookup from a document
- "Summarize the release notes for v2.3" — semantic similarity over a corpus
GraphRAG is the right choice for:
- Multi-hop questions — "Who reported to whom during the 2023 incident, and which systems did they own?" requires traversing `Person → reports_to → Person → owns → System`
- Relationship queries — "Which agents use the same MCP server as the billing-agent?" requires traversing a shared-dependency graph
- Domain ontologies — Legal, medical, and financial domains have rich entity hierarchies (contracts → clauses → obligations → parties) where relationships carry as much meaning as the text
Concrete example: A support agent fielding "Which products were affected by the 2023 supply chain issue?" can answer correctly only if it can traverse Incident → affects → Supplier → supplies → Product. A pure vector search returns chunks that mention "supply chain" and "2023" but cannot reason about the causal chain — GraphRAG can.
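The causal chain in that example can be sketched as a lookup over typed edges. This is an illustrative stand-alone sketch, not AgentBreeder's API; the triple format and the relation names are hypothetical.

```python
def affected_products(edges, incident):
    """Follow Incident -affects-> Supplier -supplies-> Product over (source, relation, target) triples."""
    suppliers = {t for (s, rel, t) in edges if s == incident and rel == "affects"}
    return sorted({t for (s, rel, t) in edges if rel == "supplies" and s in suppliers})

edges = [
    ("incident-2023", "affects", "acme-metals"),
    ("acme-metals", "supplies", "widget-pro"),
    ("acme-metals", "supplies", "widget-lite"),
    ("other-supplier", "supplies", "gadget-x"),  # unrelated to the incident
]
print(affected_products(edges, "incident-2023"))  # ['widget-lite', 'widget-pro']
```

Vector search alone would surface chunks mentioning "supply chain", but only the edge traversal recovers which specific products sit at the end of the chain.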
Vector vs Graph vs Hybrid
| Type | How it works | Best for | Speed | Multi-hop | Exact match |
|---|---|---|---|---|---|
| Vector | Semantic similarity in embedding space | General QA, paraphrasing | Fast | ❌ | ❌ |
| Full-Text | BM25 keyword matching | IDs, codes, exact terms | Fast | ❌ | ✅ |
| Hybrid (Vector + Full-Text) | Weighted blend of both | Most use cases | Fast | ❌ | Partial |
| Graph | Entity and relationship traversal | Linked data, multi-hop reasoning | Slower | ✅ | ✅ (exact entities) |
| GraphRAG (Vector + Graph) | Vector search + entity relationships | Complex questions over structured knowledge | Moderate | ✅ | ✅ |
How GraphRAG Works
```text
Ingestion Pipeline
──────────────────
Documents
→ Chunking (same as vector RAG)
→ Embedding (same as vector RAG)
→ Entity + Relationship Extraction (LLM call per chunk)
→ Store: nodes + edges in GraphStore; chunk vectors in vector store
→ Link: chunk_id → entity_id

Graph Search
────────────
→ Embed query
→ Vector search (top-20 candidates)
→ Identify seed entities from candidates
→ BFS traversal (1–2 hops)
→ Merge graph context + vector chunks
→ Rerank by: score = 0.6 × cosine_sim + 0.4 × hop_decay
→ Return top-k GraphSearchHit
```

During ingestion, AgentBreeder calls an LLM (Claude Haiku or a local Ollama model) to extract named entities and their relationships from each chunk. These are stored as nodes and edges alongside the standard vector embeddings, and each chunk is linked to the entities it mentions.
At query time, the graph search first runs a standard vector search to identify candidate chunks and their seed entities. It then walks the knowledge graph outward (BFS, 1–2 hops) to pull in related entities and their source chunks. The merged results are reranked using a combined score that blends vector similarity with graph distance.
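The query-time walk above can be sketched in a few lines: a breadth-first traversal from the seed entities with a hop limit, then a blended rerank. The helper names and the `hop_decay = 1 / (1 + hops)` form are illustrative assumptions, not AgentBreeder internals.

```python
from collections import deque

def bfs_hops(graph, seeds, max_hops=2):
    """Breadth-first walk outward from seed entities; returns entity -> hop distance."""
    hops = {s: 0 for s in seeds}
    queue = deque(seeds)
    while queue:
        node = queue.popleft()
        if hops[node] == max_hops:
            continue  # don't expand beyond the hop limit
        for neighbor in graph.get(node, []):
            if neighbor not in hops:
                hops[neighbor] = hops[node] + 1
                queue.append(neighbor)
    return hops

def rerank(chunks, hops, w_vec=0.6, w_graph=0.4):
    """Blend vector similarity with graph distance; hop_decay = 1/(1+hops) is an assumed form."""
    def score(chunk):
        chunk_id, entity, cosine_sim = chunk
        hop_decay = 1.0 / (1.0 + hops[entity]) if entity in hops else 0.0
        return w_vec * cosine_sim + w_graph * hop_decay
    return sorted(chunks, key=score, reverse=True)

graph = {"RBAC": ["Team Permissions"], "Team Permissions": ["Access Control"]}
print(bfs_hops(graph, ["RBAC"]))
# {'RBAC': 0, 'Team Permissions': 1, 'Access Control': 2}
```

Note how the hop limit caps traversal depth: with `max_hops=2`, entities two relationships away are included but never expanded further.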
Local Entity Extraction with Ollama
AgentBreeder supports local LLM-based entity extraction via Ollama — no API key required.
Prerequisites:
```bash
ollama pull qwen2.5:7b        # entity extraction model
ollama pull nomic-embed-text  # embedding model
```

Configure your RAG index to use Ollama:
```yaml
knowledge_bases:
  - ref: kb/my-docs
    index_type: graph
    embedding_model: ollama/nomic-embed-text
    entity_model: ollama/qwen2.5:7b
```

AgentBreeder routes any model prefixed with `ollama/` to the local Ollama server at http://localhost:11434. Override with the `OLLAMA_BASE_URL` environment variable.
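The routing rule can be pictured with a small sketch. `resolve_model` is a hypothetical helper for illustration only, not a function AgentBreeder exposes:

```python
import os

def resolve_model(model, env=None):
    """Illustrative routing: 'ollama/<name>' goes to the local Ollama server."""
    env = os.environ if env is None else env
    if model.startswith("ollama/"):
        base_url = env.get("OLLAMA_BASE_URL", "http://localhost:11434")
        return base_url, model.split("/", 1)[1]
    return None, model  # non-Ollama models go through their cloud provider

print(resolve_model("ollama/qwen2.5:7b", env={}))
# ('http://localhost:11434', 'qwen2.5:7b')
```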
See the complete working example in examples/graphrag-ollama-agent/.
:::note Want Neo4j pre-wired and seeded?
Run `agentbreeder quickstart` to start the full local stack with Neo4j running on port 7687 and a sample knowledge graph (~50 nodes, ~60 relationships) already loaded.

To load your own graph: `agentbreeder seed --neo4j --cypher ./my-graph.cypher`

Neo4j browser UI: http://localhost:7474 — login: `neo4j` / `agentbreeder`
:::
Step 1 — Create a Graph Index
Go to Registry → Knowledge Bases → New Index. Under Index Type, select Graph (instead of Vector).
Configure:
| Field | Default | Description |
|---|---|---|
| Name | required | Slug-friendly (e.g., product-docs) |
| Index Type | vector | Choose graph for entity extraction |
| Embedding model | openai/text-embedding-3-small | Model used to embed chunks |
| Entity extraction model | openai/gpt-4o | LLM used to extract entities and relationships |
| Chunk strategy | recursive | fixed_size or recursive (splits on semantic boundaries) |
| Chunk size | 512 tokens | Number of tokens per chunk |
| Chunk overlap | 64 tokens | Overlap between adjacent chunks |
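For intuition on how chunk size and overlap interact, here is a fixed-size windowing sketch; the recursive strategy additionally prefers semantic boundaries, so this helper is illustrative rather than AgentBreeder's actual splitter:

```python
def chunk_tokens(tokens, chunk_size=512, chunk_overlap=64):
    """Fixed-size windows that advance by chunk_size - chunk_overlap tokens."""
    stride = chunk_size - chunk_overlap
    return [tokens[i:i + chunk_size] for i in range(0, max(len(tokens) - chunk_overlap, 1), stride)]

chunks = chunk_tokens(list(range(1000)))
# 3 chunks; each adjacent pair shares 64 tokens of overlap
```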
Click Create Index.
```bash
curl -X POST http://localhost:8000/api/v1/rag/indexes \
  -H "Content-Type: application/json" \
  -d '{
    "name": "product-docs",
    "description": "Product documentation with entity relationships",
    "index_type": "graph",
    "embedding_model": "ollama/nomic-embed-text",
    "entity_model": "ollama/qwen2.5:7b",
    "chunk_strategy": "recursive",
    "chunk_size": 512,
    "chunk_overlap": 64,
    "source": "manual"
  }'
```

Response:
```json
{
  "data": {
    "id": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
    "name": "product-docs",
    "index_type": "graph",
    "embedding_model": "ollama/nomic-embed-text",
    "entity_model": "ollama/qwen2.5:7b",
    "chunk_strategy": "recursive",
    "chunk_size": 512,
    "chunk_overlap": 64,
    "entity_count": 0,
    "relationship_count": 0,
    "document_count": 0,
    "status": "active",
    "created_at": "2026-04-14T00:00:00Z"
  }
}
```

```yaml
knowledge_bases:
  - ref: kb/product-docs
    index_type: graph  # Enable graph mode
    embedding_model: ollama/nomic-embed-text
    entity_model: ollama/qwen2.5:7b
```

:::note Entity extraction cost
Entity extraction runs the LLM on every chunk during ingestion. Local models (Ollama) are free. Cloud models (GPT-4o, Claude, etc.) cost per token.
:::
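To budget that cost, you can estimate the number of extraction calls from corpus size: one call per chunk, with overlapping windows advancing by `chunk_size - chunk_overlap` tokens. A back-of-the-envelope sketch:

```python
import math

def extraction_calls(total_tokens, chunk_size=512, chunk_overlap=64):
    """One LLM call per chunk; windows advance by chunk_size - chunk_overlap tokens."""
    stride = chunk_size - chunk_overlap
    return max(1, math.ceil(total_tokens / stride))

print(extraction_calls(100_000))  # 224 calls for a 100k-token corpus
```

Multiply the call count by your model's per-call token price to estimate total ingestion cost for cloud models.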
Step 2 — Ingest Documents
Upload files to build the knowledge graph. Entity extraction happens automatically.
Open the index → click Upload Documents → drag and drop files.
The dashboard shows extraction progress:
```text
✅ Chunking...           14 chunks from docs/architecture.md
✅ Entity Extraction...  28 entities found, 42 relationships discovered
✅ Vector Embedding...   14 chunks embedded
✅ Graph Construction... graph with 28 nodes, 42 edges built
✅ Stored                graph index ready
```

```bash
# Upload one or more files
curl -X POST http://localhost:8000/api/v1/rag/indexes/{index_id}/ingest \
  -F "files=@docs/architecture.md" \
  -F "files=@docs/quickstart.md"
```

Response (ingestion job):
```json
{
  "data": {
    "id": "yyyyyyyy-yyyy-yyyy-yyyy-yyyyyyyyyyyy",
    "index_id": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
    "status": "processing",
    "total_files": 2,
    "processed_files": 0,
    "entities_extracted": 0,
    "relationships_found": 0,
    "error": null,
    "started_at": "2026-04-14T00:00:00Z",
    "completed_at": null
  }
}
```

Poll for completion:

```text
GET /api/v1/rag/indexes/{index_id}/ingest/{job_id}
# status: "pending" | "processing" | "completed" | "failed"
```

:::note Ingestion takes longer for graph indexes
Entity extraction runs an LLM call per chunk, so ingestion is slower than pure vector indexing. For large corpora, expect roughly 2–4× the ingestion time. Ingestion is asynchronous — poll the job status endpoint as usual.
:::
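A polling loop for the job status endpoint might look like the sketch below. `fetch_status` stands in for whatever HTTP client call you use to GET the job resource; the helper itself is illustrative, not part of an AgentBreeder SDK.

```python
import time

def wait_for_ingestion(fetch_status, poll_interval=2.0, timeout=600.0):
    """Poll until the ingestion job reports 'completed' or 'failed'.

    fetch_status: any callable returning the job dict from
    GET /api/v1/rag/indexes/{index_id}/ingest/{job_id}.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        job = fetch_status()
        if job["status"] in ("completed", "failed"):
            return job
        time.sleep(poll_interval)
    raise TimeoutError("ingestion job did not finish within the timeout")
```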
Step 3 — Search the Graph
Test entity-aware retrieval before wiring the index to an agent.
```bash
curl -X POST http://localhost:8000/api/v1/rag/search \
  -H "Content-Type: application/json" \
  -d '{
    "index_id": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
    "query": "What features does RBAC enable in AgentBreeder?",
    "search_mode": "graph",
    "top_k": 5,
    "max_hops": 2
  }'
```

Query modes:

- `vector` — semantic similarity only
- `graph` — entity traversal up to `max_hops` depth
- `hybrid` — both, combined ranking (default)
Response:

```json
{
  "data": {
    "index_id": "xxxxxxxx-...",
    "query": "What features does RBAC enable in AgentBreeder?",
    "search_mode": "graph",
    "results": [
      {
        "chunk_id": "chunk-001",
        "text": "RBAC validates team permissions before deploying agents...",
        "metadata": { "source": "architecture.md" },
        "entity_path": ["RBAC", "Governance", "Team Permissions"],
        "vector_score": 0.94,
        "graph_score": 0.87,
        "combined_score": 0.91,
        "hops_from_query": 1
      },
      {
        "chunk_id": "chunk-015",
        "text": "Teams can restrict who deploys agents in their org...",
        "metadata": { "source": "quickstart.md" },
        "entity_path": ["Team Management", "Access Control", "RBAC"],
        "vector_score": 0.81,
        "graph_score": 0.92,
        "combined_score": 0.87,
        "hops_from_query": 2
      }
    ],
    "total": 2,
    "entities_mentioned": ["RBAC", "Governance", "Team Permissions", "Access Control"],
    "relationships": [
      { "source": "RBAC", "target": "Team Permissions", "type": "enables" },
      { "source": "Team Permissions", "target": "Access Control", "type": "part_of" }
    ]
  }
}
```

Response fields:

- `entity_path` — the chain of entities from the query entity to this result
- `vector_score` — semantic relevance (0–1)
- `graph_score` — entity relationship strength (0–1)
- `combined_score` — weighted combination (default: 0.5 each)
- `hops_from_query` — how many relationships away this entity is from the query entity
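You can check the example's combined scores by hand. A one-line sketch assuming the default equal weighting:

```python
def combined_score(vector_score, graph_score, vector_weight=0.5):
    """Default blend: equal weight to vector similarity and graph strength."""
    return vector_weight * vector_score + (1 - vector_weight) * graph_score

# chunk-001 from the example: 0.5 × 0.94 + 0.5 × 0.87 = 0.905, shown rounded as 0.91
```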
Step 4 — Use in agent.yaml
```yaml
name: architecture-qa-agent
version: 1.0.0
framework: claude_sdk
knowledge_bases:
  - ref: kb/product-docs
    search_mode: hybrid   # vector | graph | hybrid (default)
    max_graph_hops: 2     # for graph search
    top_k: 5

# At runtime, the agent queries the knowledge base and receives
# both vector-matched chunks and graph-traversed entity paths
# before sending the user's message to the model.
```

:::note Graph search with multiple hops
Setting `max_graph_hops: 3` allows the traversal to go 3 relationships deep. Higher hops = more context but slower retrieval. Start with 1–2 hops for latency-sensitive agents.
:::
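The latency cost of extra hops comes from geometric frontier growth. A quick illustration (the branching factor of 5 is a hypothetical average out-degree, not a measured value):

```python
def max_entities_visited(branching_factor, max_hops):
    """Upper bound on entities touched by BFS: 1 + b + b^2 + ... + b^max_hops."""
    return sum(branching_factor ** h for h in range(max_hops + 1))

for hops in (1, 2, 3):
    print(hops, max_entities_visited(5, hops))
# 1 6
# 2 31
# 3 156
```

With an average of 5 relationships per entity, going from 2 to 3 hops roughly quintuples the worst-case traversal work.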
Explore Graph Data
Three endpoints let you inspect the knowledge graph built from your documents:
| Method | Path | Description |
|---|---|---|
| GET | `/api/v1/rag/indexes/{id}/graph` | Graph metadata — node count, edge count, top entity types |
| GET | `/api/v1/rag/indexes/{id}/entities` | Paginated list of extracted entities with type and frequency |
| GET | `/api/v1/rag/indexes/{id}/relationships` | Paginated list of relationships (subject → predicate → object) |
Example — list top entities:
```bash
curl "http://localhost:8000/api/v1/rag/indexes/{id}/entities?limit=10&sort=frequency"
```

```json
{
  "data": [
    { "id": "rbac", "label": "RBAC", "type": "Feature", "mention_count": 34 },
    { "id": "team-permissions", "label": "Team Permissions", "type": "Concept", "mention_count": 18 },
    { "id": "access-control", "label": "Access Control", "type": "Feature", "mention_count": 12 }
  ]
}
```

Entity Extraction Configuration
Built-in Entity Types
AgentBreeder recognizes these entity types by default:
| Type | Examples |
|---|---|
| PERSON | Alice, Bob, author names |
| ORGANIZATION | Company, team names (e.g., "customer-success") |
| LOCATION | Cities, regions, cloud regions |
| TECHNOLOGY | Framework names, tool names (e.g., "LangGraph", "OpenAI") |
| FEATURE | Product features (e.g., "RBAC", "Cost Tracking") |
| CONCEPT | Abstract ideas (e.g., "Governance", "Orchestration") |
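One common post-extraction step is dropping entities whose type is not in the allowed set, since LLM extractors occasionally invent types. A hedged sketch of that filter; the dict shape and helper name are assumptions, not AgentBreeder's schema:

```python
BUILTIN_TYPES = {"PERSON", "ORGANIZATION", "LOCATION", "TECHNOLOGY", "FEATURE", "CONCEPT"}

def filter_entities(entities, custom_types=()):
    """Keep only extractions whose type is built-in or declared as a custom type."""
    allowed = BUILTIN_TYPES | set(custom_types)
    return [e for e in entities if e.get("type") in allowed]

raw = [
    {"label": "RBAC", "type": "FEATURE"},
    {"label": "LangGraph", "type": "TECHNOLOGY"},
    {"label": "mystery", "type": "WIDGET"},  # dropped: unknown type
]
print(filter_entities(raw))  # keeps RBAC and LangGraph only
```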
Custom Entity Types
Define your own for domain-specific extraction:
```yaml
knowledge_bases:
  - ref: kb/legal-docs
    index_type: graph
    entity_extraction:
      custom_types:
        - name: CONTRACT
          description: "Legal agreements and contracts"
        - name: PARTY
          description: "Signatory organizations or people"
        - name: OBLIGATION
          description: "Duties and responsibilities"
```

Eval Metrics
GraphRAG adds four new eval metrics to agentbreeder eval:
| Metric | Description |
|---|---|
| `entity_recall` | Fraction of ground-truth entities present in retrieved context |
| `relationship_precision` | Fraction of retrieved relationships that are correct |
| `hop_coverage` | Fraction of multi-hop questions answered with graph context |
| `vector_fallback_rate` | % of queries that fell back to vector-only (graph returned nothing) |
```bash
agentbreeder eval run my-agent \
  --dataset kb-qa-v1 \
  --scorer judge \
  --rag-metrics entity_recall,relationship_precision
```

A high `vector_fallback_rate` (> 20%) indicates your documents lack the entity density needed for graph retrieval to activate — consider using a hybrid index type instead of pure graph.
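The metric definitions above reduce to simple set and ratio computations. A reference sketch under assumed field names (`graph_hits` is hypothetical), not AgentBreeder's scorer code:

```python
def entity_recall(ground_truth_entities, retrieved_entities):
    """Fraction of ground-truth entities that appear in the retrieved context."""
    truth = set(ground_truth_entities)
    return len(truth & set(retrieved_entities)) / len(truth) if truth else 1.0

def vector_fallback_rate(query_results):
    """Share of queries where graph retrieval contributed no hits."""
    if not query_results:
        return 0.0
    fallbacks = sum(1 for r in query_results if not r.get("graph_hits"))
    return fallbacks / len(query_results)

print(entity_recall({"RBAC", "Governance"}, {"RBAC", "Teams"}))  # 0.5
```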
Local Development with Neo4j
By default, GraphRAG uses an in-memory graph store (`GRAPH_BACKEND=memory`) — no extra setup required. To spin up a local Neo4j instance for a more production-like environment:

```bash
docker compose --profile graphrag up
```

This starts Neo4j alongside the standard AgentBreeder stack. Configure the connection with:

```bash
NEO4J_URL=bolt://localhost:7687
NEO4J_USER=neo4j
NEO4J_PASSWORD=agentbreeder
```

Add these to your `.env` file. The Neo4j browser UI is available at http://localhost:7474 for visual graph exploration.
:::note Memory backend is the default
`GRAPH_BACKEND=memory` works for development and small corpora. Graphs are rebuilt on each API server restart. Switch to Neo4j when you need persistence across restarts or graphs with > 100k nodes.
:::
Production — Neo4j AuraDB
:::note Recommended for production
For production, use Neo4j AuraDB — a fully managed Neo4j cloud service. Set `NEO4J_URL` to your AuraDB connection string and `NEO4J_USER`/`NEO4J_PASSWORD` to your AuraDB credentials. AuraDB handles backups, upgrades, and scaling automatically.
:::

```bash
NEO4J_URL=neo4j+s://xxxxxxxx.databases.neo4j.io
NEO4J_USER=neo4j
NEO4J_PASSWORD=<your-auradb-password>
GRAPH_BACKEND=neo4j
```

Store these as secrets using `agentbreeder secret set` — do not commit them to version control.
Index Types Comparison
| Type | Storage | Best For |
|---|---|---|
| `vector` | In-memory / pgvector | Factual lookup, semantic similarity |
| `graph` | In-memory / Neo4j | Relationship queries, multi-hop reasoning |
| `hybrid` | Both | General-purpose (recommended for complex domains) |
When in doubt, start with hybrid. It runs both vector and graph retrieval and merges the results — so you get the benefits of both without having to predict which mode will perform better for a given query type.
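One plausible merge policy for hybrid mode is deduplicating by chunk id and keeping the better score from either retriever. This is a sketch of that idea, an assumption about the merge rather than AgentBreeder's exact implementation:

```python
def merge_hybrid(vector_hits, graph_hits, top_k=5):
    """Union (chunk_id, score) results from both retrievers; keep each chunk's best score."""
    best = {}
    for chunk_id, score in list(vector_hits) + list(graph_hits):
        best[chunk_id] = max(score, best.get(chunk_id, 0.0))
    return sorted(best.items(), key=lambda kv: kv[1], reverse=True)[:top_k]

merged = merge_hybrid([("c1", 0.9), ("c2", 0.6)], [("c2", 0.8), ("c3", 0.7)])
# [('c1', 0.9), ('c2', 0.8), ('c3', 0.7)]
```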
Complete Example: GraphRAG with Ollama
See examples/graphrag-ollama-agent/ for a working example:
```bash
cd examples/graphrag-ollama-agent

# Start Ollama (if not running)
ollama serve

# In another terminal, pull models
ollama pull qwen2.5:7b
ollama pull nomic-embed-text

# Start the full AgentBreeder stack
docker compose up -d

# Ingest documents and build the knowledge graph
python ingest.py

# Chat with the agent
agentbreeder chat --agent graphrag-demo-agent \
  "What are the core concepts of GraphRAG?"
```

Troubleshooting
"Ollama connection refused"
Problem: Entity extraction can't reach local Ollama.
Solution:
```bash
# Check Ollama is running
curl http://localhost:11434/api/tags

# If not, start it
ollama serve

# Override the endpoint (if Ollama is on a different port)
export OLLAMA_BASE_URL=http://localhost:11435
```

"Entity extraction taking too long"
Problem: Graph building is slow on large documents.
Solution:
- Use a smaller extraction model: `ollama/qwen2.5:7b` instead of `qwen2.5:72b`
- Reduce `chunk_size` (fewer chunks = fewer extractions)
- Use `search_mode: "hybrid"` to blend graph + vector for faster queries
"Graph traversal returns too many results"
Problem: `max_hops: 3` is causing exponential result growth.
Solution:
- Reduce `max_hops` to 1 or 2
- Use `search_mode: "hybrid"` with lower `graph_weight`
- Filter results client-side by relationship type
API Reference — Graph Search
| Method | Path | Description |
|---|---|---|
| POST | `/api/v1/rag/indexes` | Create graph index (with `index_type: "graph"`) |
| GET | `/api/v1/rag/indexes/{id}` | Get index metadata (includes entity/relationship counts) |
| POST | `/api/v1/rag/indexes/{id}/ingest` | Upload files and build graph |
| GET | `/api/v1/rag/indexes/{id}/entities` | List all entities in the graph |
| GET | `/api/v1/rag/indexes/{id}/relationships` | List all relationships in the graph |
| POST | `/api/v1/rag/search` | Search with `search_mode: "graph"` |
Next Steps
| What | Where |
|---|---|
| Vector and hybrid indexes | Knowledge Bases → |
| Add tools to your agent | Tools → |
| Connect MCP servers | MCP Servers → |
| Run evals on your agent | Evaluations → |
| Full agent.yaml fields | agent.yaml Reference → |