
GraphRAG — Knowledge Graph-Enhanced Retrieval

Extract entities, build knowledge graphs, and traverse relationships for multi-hop reasoning in agents.

GraphRAG extends vector search with entity extraction and relationship traversal. Pure vector search finds semantically similar chunks. GraphRAG finds connected concepts by walking entity relationships — perfect for multi-hop reasoning ("What does AgentBreeder RBAC affect?").

:::note Prerequisites for local GraphRAG

  • Ollama installed (ollama.com)
  • ollama pull qwen2.5:7b — entity extraction
  • ollama pull nomic-embed-text — embeddings
  • Local stack running: docker compose up -d

:::

When to Use GraphRAG

Vector RAG is enough for:

  • "What is the refund policy for annual subscriptions?" — factual lookup from a document
  • "Summarize the release notes for v2.3" — semantic similarity over a corpus

GraphRAG is the right choice for:

  • Multi-hop questions — "Who reported to whom during the 2023 incident, and which systems did they own?" requires traversing Person → reports_to → Person → owns → System
  • Relationship queries — "Which agents use the same MCP server as the billing-agent?" requires traversing a shared-dependency graph
  • Domain ontologies — Legal, medical, and financial domains have rich entity hierarchies (contracts → clauses → obligations → parties) where relationships carry as much meaning as the text

Concrete example: A support agent fielding "Which products were affected by the 2023 supply chain issue?" can answer correctly only if it can traverse Incident → affects → Supplier → supplies → Product. A pure vector search returns chunks that mention "supply chain" and "2023" but cannot reason about the causal chain — GraphRAG can.
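The causal chain in that example can be sketched as a small graph walk. The edge list and helper functions below are illustrative only, not AgentBreeder APIs:

```python
# Minimal multi-hop traversal sketch:
# Incident -> affects -> Supplier -> supplies -> Product.
# The edge list is hypothetical, for illustration only.
edges = [
    ("incident-2023", "affects", "acme-supplier"),
    ("acme-supplier", "supplies", "widget-pro"),
    ("acme-supplier", "supplies", "widget-lite"),
]

def neighbors(node):
    """Entities reachable in one hop from `node`."""
    return [dst for src, _rel, dst in edges if src == node]

def reachable(start, hops):
    """All entities within `hops` relationship steps of `start` (BFS)."""
    frontier, seen = {start}, set()
    for _ in range(hops):
        frontier = {n for node in frontier for n in neighbors(node)} - seen
        seen |= frontier
    return seen

# Two hops from the incident reach the affected products.
print(sorted(reachable("incident-2023", 2)))
# ['acme-supplier', 'widget-lite', 'widget-pro']
```

Pure vector search would have to find a single chunk mentioning both the incident and the products; the traversal recovers the connection even when no such chunk exists.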


Vector vs Graph vs Hybrid

| Type | How it works | Best for | Speed | Multi-hop | Exact match |
|---|---|---|---|---|---|
| Vector | Semantic similarity in embedding space | General QA, paraphrasing | Fast | ❌ | ❌ |
| Full-Text | BM25 keyword matching | IDs, codes, exact terms | Fast | ❌ | ✅ |
| Hybrid (Vector + Full-Text) | Weighted blend of both | Most use cases | Fast | ❌ | Partial |
| Graph | Entity and relationship traversal | Linked data, multi-hop reasoning | Slower | ✅ | ✅ (exact entities) |
| GraphRAG (Vector + Graph) | Vector search + entity relationships | Complex questions over structured knowledge | Moderate | ✅ | ✅ |

How GraphRAG Works

Ingestion Pipeline
──────────────────
Documents
  → Chunking (same as vector RAG)
  → Embedding (same as vector RAG)
  → Entity + Relationship Extraction (LLM call per chunk)
  → Store: nodes + edges in GraphStore; chunk vectors in vector store
  → Link: chunk_id → entity_id

Graph Search
────────────
  → Embed query
  → Vector search (top-20 candidates)
  → Identify seed entities from candidates
  → BFS traversal (1–2 hops)
  → Merge graph context + vector chunks
  → Rerank by: score = 0.6 × cosine_sim + 0.4 × hop_decay
  → Return top-k GraphSearchHit

During ingestion, AgentBreeder calls an LLM (Claude Haiku or a local Ollama model) to extract named entities and their relationships from each chunk. These are stored as nodes and edges alongside the standard vector embeddings, and each chunk is linked to the entities it mentions.

At query time, the graph search first runs a standard vector search to identify candidate chunks and their seed entities. It then walks the knowledge graph outward (BFS, 1–2 hops) to pull in related entities and their source chunks. The merged results are reranked using a combined score that blends vector similarity with graph distance.
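The rerank step can be sketched as follows. The 0.6/0.4 weights come from the pipeline above; the exact form of hop_decay is an assumption made for illustration:

```python
# Sketch of the graph-search rerank: blend vector similarity with graph
# distance. The halving form of hop_decay is an assumption; the 0.6/0.4
# weights are from the pipeline diagram.
def hop_decay(hops, factor=0.5):
    """Decay graph relevance with hop distance: 1.0 at 0 hops, 0.5 at 1, ..."""
    return factor ** hops

def rerank_score(cosine_sim, hops):
    return 0.6 * cosine_sim + 0.4 * hop_decay(hops)

# A strong vector match two hops away vs a weaker match at a seed entity:
print(rerank_score(0.94, 2))  # 0.6*0.94 + 0.4*0.25 = 0.664
print(rerank_score(0.70, 0))  # 0.6*0.70 + 0.4*1.00 = 0.82
```

Note how the seed-entity hit outranks the stronger vector match: graph proximity can outweigh raw similarity, which is the point of the blended score.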


Local Entity Extraction with Ollama

AgentBreeder supports local LLM-based entity extraction via Ollama — no API key required.

Prerequisites:

ollama pull qwen2.5:7b      # entity extraction model
ollama pull nomic-embed-text # embedding model

Configure your RAG index to use Ollama:

knowledge_bases:
  - ref: kb/my-docs
    index_type: graph
    embedding_model: ollama/nomic-embed-text
    entity_model: ollama/qwen2.5:7b

AgentBreeder routes any model prefixed with ollama/ to the local Ollama server at http://localhost:11434. Override with the OLLAMA_BASE_URL environment variable.
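A hedged sketch of how that routing could work; resolve_model is illustrative, not an AgentBreeder internal, and /api/chat is Ollama's standard chat endpoint:

```python
import os

# Sketch: resolve an "ollama/..."-prefixed model name to a local endpoint.
# resolve_model() is illustrative, not an AgentBreeder internal.
def resolve_model(model: str):
    base = os.environ.get("OLLAMA_BASE_URL", "http://localhost:11434")
    if model.startswith("ollama/"):
        return base + "/api/chat", model.removeprefix("ollama/")
    raise ValueError(f"not an Ollama model: {model}")

url, name = resolve_model("ollama/qwen2.5:7b")
print(url, name)  # default: http://localhost:11434/api/chat qwen2.5:7b
```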

See the complete working example in examples/graphrag-ollama-agent/.


Want Neo4j pre-wired and seeded?

Run agentbreeder quickstart to start the full local stack with Neo4j running on port 7687 and a sample knowledge graph (~50 nodes, ~60 relationships) already loaded.

To load your own graph:

agentbreeder seed --neo4j --cypher ./my-graph.cypher

Neo4j browser UI: http://localhost:7474 — login: neo4j / agentbreeder

Step 1 — Create a Graph Index

Go to Registry → Knowledge Bases → New Index. Under Index Type, select Graph (instead of Vector).

Configure:

| Field | Default | Description |
|---|---|---|
| Name | required | Slug-friendly (e.g., product-docs) |
| Index Type | vector | Choose graph for entity extraction |
| Embedding model | openai/text-embedding-3-small | Model used to embed chunks |
| Entity extraction model | openai/gpt-4o | LLM used to extract entities and relationships |
| Chunk strategy | recursive | fixed_size or recursive (splits on semantic boundaries) |
| Chunk size | 512 tokens | Number of tokens per chunk |
| Chunk overlap | 64 tokens | Overlap between adjacent chunks |

Click Create Index.

curl -X POST http://localhost:8000/api/v1/rag/indexes \
  -H "Content-Type: application/json" \
  -d '{
    "name": "product-docs",
    "description": "Product documentation with entity relationships",
    "index_type": "graph",
    "embedding_model": "ollama/nomic-embed-text",
    "entity_model": "ollama/qwen2.5:7b",
    "chunk_strategy": "recursive",
    "chunk_size": 512,
    "chunk_overlap": 64,
    "source": "manual"
  }'

Response:

{
  "data": {
    "id": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
    "name": "product-docs",
    "index_type": "graph",
    "embedding_model": "ollama/nomic-embed-text",
    "entity_model": "ollama/qwen2.5:7b",
    "chunk_strategy": "recursive",
    "chunk_size": 512,
    "chunk_overlap": 64,
    "entity_count": 0,
    "relationship_count": 0,
    "document_count": 0,
    "status": "active",
    "created_at": "2026-04-14T00:00:00Z"
  }
}
Equivalently, declare the index in agent.yaml:

knowledge_bases:
  - ref: kb/product-docs
    index_type: graph              # Enable graph mode
    embedding_model: ollama/nomic-embed-text
    entity_model: ollama/qwen2.5:7b

Entity extraction cost

Entity extraction runs the LLM on every chunk during ingestion. Local models (Ollama) are free. Cloud models (GPT-4o, Claude, etc.) cost per token.


Step 2 — Ingest Documents

Upload files to build the knowledge graph. Entity extraction happens automatically.

Open the index → click Upload Documents → drag and drop files.

The dashboard shows extraction progress:

✅ Chunking...           14 chunks from docs/architecture.md
✅ Entity Extraction...  28 entities found, 42 relationships discovered
✅ Vector Embedding...   14 chunks embedded
✅ Graph Construction... graph with 28 nodes, 42 edges built
✅ Stored               graph index ready

Via the API:
# Upload one or more files
curl -X POST http://localhost:8000/api/v1/rag/indexes/{index_id}/ingest \
  -F "files=@docs/architecture.md" \
  -F "files=@docs/quickstart.md"

Response (ingestion job):

{
  "data": {
    "id": "yyyyyyyy-yyyy-yyyy-yyyy-yyyyyyyyyyyy",
    "index_id": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
    "status": "processing",
    "total_files": 2,
    "processed_files": 0,
    "entities_extracted": 0,
    "relationships_found": 0,
    "error": null,
    "started_at": "2026-04-14T00:00:00Z",
    "completed_at": null
  }
}

Poll for completion:

GET /api/v1/rag/indexes/{index_id}/ingest/{job_id}
# status: "pending" | "processing" | "completed" | "failed"

Ingestion takes longer for graph indexes

Entity extraction runs an LLM call per chunk, so ingestion is slower than pure vector indexing. For large corpora, expect roughly 2–4× the ingestion time. Ingestion is asynchronous — poll the job status endpoint as usual.
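The polling loop can be sketched as below. The fetch_status callable stands in for an HTTP GET against the job endpoint; the function name is illustrative:

```python
import time

# Sketch: poll the ingestion job until it reaches a terminal status.
# `fetch_status` stands in for an HTTP GET on
# /api/v1/rag/indexes/{index_id}/ingest/{job_id}; names are illustrative.
def wait_for_ingestion(fetch_status, interval=2.0, timeout=600.0):
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        job = fetch_status()
        if job["status"] in ("completed", "failed"):
            return job
        time.sleep(interval)
    raise TimeoutError("ingestion job did not finish in time")

# Usage with a stubbed fetcher that completes on the third poll:
statuses = iter(["pending", "processing", "completed"])
job = wait_for_ingestion(lambda: {"status": next(statuses)}, interval=0.0)
print(job["status"])  # completed
```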


Step 3 — Search the Graph

Test entity-aware retrieval before wiring the index to an agent.

curl -X POST http://localhost:8000/api/v1/rag/search \
  -H "Content-Type: application/json" \
  -d '{
    "index_id": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
    "query": "What features does RBAC enable in AgentBreeder?",
    "search_mode": "graph",
    "top_k": 5,
    "max_hops": 2
  }'

Query modes:

  • vector — semantic similarity only
  • graph — entity traversal up to max_hops depth
  • hybrid — both, combined ranking (default)

Response:

{
  "data": {
    "index_id": "xxxxxxxx-...",
    "query": "What features does RBAC enable in AgentBreeder?",
    "search_mode": "graph",
    "results": [
      {
        "chunk_id": "chunk-001",
        "text": "RBAC validates team permissions before deploying agents...",
        "metadata": { "source": "architecture.md" },
        "entity_path": ["RBAC", "Governance", "Team Permissions"],
        "vector_score": 0.94,
        "graph_score": 0.87,
        "combined_score": 0.91,
        "hops_from_query": 1
      },
      {
        "chunk_id": "chunk-015",
        "text": "Teams can restrict who deploys agents in their org...",
        "metadata": { "source": "quickstart.md" },
        "entity_path": ["Team Management", "Access Control", "RBAC"],
        "vector_score": 0.81,
        "graph_score": 0.92,
        "combined_score": 0.87,
        "hops_from_query": 2
      }
    ],
    "total": 2,
    "entities_mentioned": ["RBAC", "Governance", "Team Permissions", "Access Control"],
    "relationships": [
      { "source": "RBAC", "target": "Team Permissions", "type": "enables" },
      { "source": "Team Permissions", "target": "Access Control", "type": "part_of" }
    ]
  }
}

Response fields:

  • entity_path — the chain of entities from the query entity to this result
  • vector_score — semantic relevance (0–1)
  • graph_score — entity relationship strength (0–1)
  • combined_score — weighted combination (default: 0.5 each)
  • hops_from_query — how many relationships away this entity is from the query entity
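With the default equal weights, the combined_score values in the example response can be reproduced directly (the response's two-decimal rounding is assumed):

```python
# Default blend: equal weight on vector and graph scores.
def combined_score(vector_score, graph_score, vector_weight=0.5):
    return vector_weight * vector_score + (1 - vector_weight) * graph_score

# The two hits from the example response:
print(combined_score(0.94, 0.87))  # 0.905 -> reported as 0.91
print(combined_score(0.81, 0.92))  # 0.865 -> reported as 0.87
```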

Step 4 — Use in agent.yaml

name: architecture-qa-agent
version: 1.0.0
framework: claude_sdk

knowledge_bases:
  - ref: kb/product-docs
    search_mode: hybrid          # vector | graph | hybrid (default)
    max_graph_hops: 2            # for graph search
    top_k: 5

# At runtime, the agent queries the knowledge base and receives
# both vector-matched chunks and graph-traversed entity paths
# before sending the user's message to the model.

Graph search with multiple hops

Setting max_graph_hops: 3 allows the traversal to go 3 relationships deep. Higher hops = more context but slower retrieval. Start with 1–2 hops for latency-sensitive agents.
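The slowdown is roughly geometric: on a graph with average out-degree d, the candidate set grows on the order of d per hop. A toy illustration with made-up numbers, not a benchmark:

```python
# Toy illustration: candidate entities grow roughly like out_degree ** hops.
def frontier_size(out_degree, hops):
    """Upper bound on entities visited after `hops` steps of BFS."""
    return sum(out_degree ** h for h in range(1, hops + 1))

for hops in (1, 2, 3):
    print(hops, frontier_size(5, hops))  # 1 -> 5, 2 -> 30, 3 -> 155
```

This is why the third hop is disproportionately expensive relative to the extra context it usually adds.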


Explore Graph Data

Three endpoints let you inspect the knowledge graph built from your documents:

| Method | Path | Description |
|---|---|---|
| GET | /api/v1/rag/indexes/{id}/graph | Graph metadata — node count, edge count, top entity types |
| GET | /api/v1/rag/indexes/{id}/entities | Paginated list of extracted entities with type and frequency |
| GET | /api/v1/rag/indexes/{id}/relationships | Paginated list of relationships (subject → predicate → object) |

Example — list top entities:

curl "http://localhost:8000/api/v1/rag/indexes/{id}/entities?limit=10&sort=frequency"

Response:
{
  "data": [
    { "id": "rbac", "label": "RBAC", "type": "Feature", "mention_count": 34 },
    { "id": "team-permissions", "label": "Team Permissions", "type": "Concept", "mention_count": 18 },
    { "id": "access-control", "label": "Access Control", "type": "Feature", "mention_count": 12 }
  ]
}

Entity Extraction Configuration

Built-in Entity Types

AgentBreeder recognizes these entity types by default:

| Type | Examples |
|---|---|
| PERSON | Alice, Bob, author names |
| ORGANIZATION | Company, team names (e.g., "customer-success") |
| LOCATION | Cities, regions, cloud regions |
| TECHNOLOGY | Framework names, tool names (e.g., "LangGraph", "OpenAI") |
| FEATURE | Product features (e.g., "RBAC", "Cost Tracking") |
| CONCEPT | Abstract ideas (e.g., "Governance", "Orchestration") |

Custom Entity Types

Define your own for domain-specific extraction:

knowledge_bases:
  - ref: kb/legal-docs
    index_type: graph
    entity_extraction:
      custom_types:
        - name: CONTRACT
          description: "Legal agreements and contracts"
        - name: PARTY
          description: "Signatory organizations or people"
        - name: OBLIGATION
          description: "Duties and responsibilities"

Eval Metrics

GraphRAG adds four new eval metrics to agentbreeder eval:

| Metric | Description |
|---|---|
| entity_recall | Fraction of ground-truth entities present in retrieved context |
| relationship_precision | Fraction of retrieved relationships that are correct |
| hop_coverage | Fraction of multi-hop questions answered with graph context |
| vector_fallback_rate | % of queries that fell back to vector-only (graph returned nothing) |

agentbreeder eval run my-agent \
  --dataset kb-qa-v1 \
  --scorer judge \
  --rag-metrics entity_recall,relationship_precision

A high vector_fallback_rate (> 20%) indicates your documents lack the entity density needed for graph retrieval to activate — consider using a hybrid index type instead of pure graph.
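The definitions in the metrics table translate directly into code. A sketch following those definitions (function names are illustrative, not AgentBreeder internals):

```python
# entity_recall: fraction of ground-truth entities present in retrieved context.
def entity_recall(retrieved, ground_truth):
    truth = set(ground_truth)
    return len(truth & set(retrieved)) / len(truth) if truth else 1.0

# vector_fallback_rate: share of queries where graph retrieval returned nothing.
def vector_fallback_rate(graph_hit_counts):
    return sum(1 for n in graph_hit_counts if n == 0) / len(graph_hit_counts)

print(entity_recall(["RBAC", "Governance"], ["RBAC", "Governance", "Teams"]))  # 2/3
print(vector_fallback_rate([3, 0, 2, 0, 1]))  # 0.4, above the 20% warning line
```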


Local Development with Neo4j

By default, GraphRAG uses an in-memory graph store (GRAPH_BACKEND=memory) — no extra setup required. To spin up a local Neo4j instance for a more production-like environment:

docker compose --profile graphrag up

This starts Neo4j alongside the standard AgentBreeder stack. Configure the connection with:

NEO4J_URL=bolt://localhost:7687
NEO4J_USER=neo4j
NEO4J_PASSWORD=agentbreeder

Add these to your .env file. The Neo4j browser UI is available at http://localhost:7474 for visual graph exploration.

Memory backend is the default

GRAPH_BACKEND=memory works for development and small corpora. Graphs are rebuilt on each API server restart. Switch to Neo4j when you need persistence across restarts or graphs with > 100k nodes.


Production — Neo4j AuraDB

Recommended for production

For production, use Neo4j AuraDB — a fully managed Neo4j cloud service. Set NEO4J_URL to your AuraDB connection string and NEO4J_USER/NEO4J_PASSWORD to your AuraDB credentials. AuraDB handles backups, upgrades, and scaling automatically.

NEO4J_URL=neo4j+s://xxxxxxxx.databases.neo4j.io
NEO4J_USER=neo4j
NEO4J_PASSWORD=<your-auradb-password>
GRAPH_BACKEND=neo4j

Store these as secrets using agentbreeder secret set — do not commit them to version control.


Index Types Comparison

| Type | Storage | Best For |
|---|---|---|
| vector | In-memory / pgvector | Factual lookup, semantic similarity |
| graph | In-memory / Neo4j | Relationship queries, multi-hop reasoning |
| hybrid | Both | General-purpose (recommended for complex domains) |

When in doubt, start with hybrid. It runs both vector and graph retrieval and merges the results — so you get the benefits of both without having to predict which mode will perform better for a given query type.
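A sketch of the merge hybrid implies: take the union of both result sets, dedupe on chunk id, and keep the better score per chunk. The hit structure here is assumed for illustration:

```python
# Sketch of hybrid merging: union vector and graph hits, dedupe on chunk_id,
# keep the best score per chunk. The hit dicts are assumed for illustration.
def merge_hits(vector_hits, graph_hits, top_k=5):
    best = {}
    for hit in vector_hits + graph_hits:
        cid = hit["chunk_id"]
        if cid not in best or hit["score"] > best[cid]["score"]:
            best[cid] = hit
    return sorted(best.values(), key=lambda h: h["score"], reverse=True)[:top_k]

merged = merge_hits(
    [{"chunk_id": "c1", "score": 0.94}, {"chunk_id": "c2", "score": 0.70}],
    [{"chunk_id": "c2", "score": 0.92}, {"chunk_id": "c3", "score": 0.80}],
)
print([h["chunk_id"] for h in merged])  # ['c1', 'c2', 'c3']
```

Here c2 surfaces with its higher graph score, so a query that vector search alone would rank poorly still lands near the top.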


Complete Example: GraphRAG with Ollama

See examples/graphrag-ollama-agent/ for a working example:

cd examples/graphrag-ollama-agent

# Start Ollama (if not running)
ollama serve

# In another terminal, pull models
ollama pull qwen2.5:7b
ollama pull nomic-embed-text

# Start the full AgentBreeder stack
docker compose up -d

# Ingest documents and build the knowledge graph
python ingest.py

# Chat with the agent
agentbreeder chat --agent graphrag-demo-agent \
  "What are the core concepts of GraphRAG?"

Troubleshooting

"Ollama connection refused"

Problem: Entity extraction can't reach local Ollama.

Solution:

# Check Ollama is running
curl http://localhost:11434/api/tags

# If not, start it
ollama serve

# Override the endpoint (if Ollama is on a different port)
export OLLAMA_BASE_URL=http://localhost:11435

"Entity extraction taking too long"

Problem: Graph building is slow on large documents.

Solution:

  • Use a smaller extraction model: ollama/qwen2.5:7b instead of qwen2.5:72b
  • Reduce chunk_size (fewer chunks = fewer extractions)
  • Use search_mode: "hybrid" to blend graph + vector for faster queries

"Graph traversal returns too many results"

Problem: max_hops: 3 is causing exponential result growth.

Solution:

  • Reduce max_hops to 1 or 2
  • Use search_mode: "hybrid" with lower graph_weight
  • Filter results client-side by relationship type
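Client-side filtering by relationship type can be a one-liner over the relationships array in the search response:

```python
# Keep only relationships whose predicate type is in the allowed set.
# The dict shape matches the `relationships` array in the search response.
def filter_relationships(relationships, allowed_types):
    return [r for r in relationships if r["type"] in allowed_types]

rels = [
    {"source": "RBAC", "target": "Team Permissions", "type": "enables"},
    {"source": "Team Permissions", "target": "Access Control", "type": "part_of"},
]
print(filter_relationships(rels, {"enables"}))
# [{'source': 'RBAC', 'target': 'Team Permissions', 'type': 'enables'}]
```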

API Reference

| Method | Path | Description |
|---|---|---|
| POST | /api/v1/rag/indexes | Create graph index (with index_type: "graph") |
| GET | /api/v1/rag/indexes/{id} | Get index metadata (includes entity/relationship counts) |
| POST | /api/v1/rag/indexes/{id}/ingest | Upload files and build graph |
| GET | /api/v1/rag/indexes/{id}/entities | List all entities in the graph |
| GET | /api/v1/rag/indexes/{id}/relationships | List all relationships in the graph |
| POST | /api/v1/rag/search | Search with search_mode: "graph" |

Next Steps

| What | Where |
|---|---|
| Vector and hybrid indexes | Knowledge Bases → |
| Add tools to your agent | Tools → |
| Connect MCP servers | MCP Servers → |
| Run evals on your agent | Evaluations → |
| Full agent.yaml fields | agent.yaml Reference → |
