
GraphRAG — Knowledge Graph-Enhanced Retrieval

Extract entities, build knowledge graphs, and traverse relationships for multi-hop reasoning in agents.

GraphRAG extends vector search with entity extraction and relationship traversal. Pure vector search finds semantically similar chunks. GraphRAG finds connected concepts by walking entity relationships — perfect for multi-hop reasoning ("What does AgentBreeder RBAC affect?").

:::note Prerequisites for local GraphRAG

  • Ollama installed (ollama.com)
  • ollama pull qwen2.5:7b — entity extraction
  • ollama pull nomic-embed-text — embeddings
  • Local stack running: docker compose up -d

:::

When to Use GraphRAG

Vector RAG is enough for:

  • "What is the refund policy for annual subscriptions?" — factual lookup from a document
  • "Summarize the release notes for v2.3" — semantic similarity over a corpus

GraphRAG is the right choice for:

  • Multi-hop questions — "Who reported to whom during the 2023 incident, and which systems did they own?" requires traversing Person → reports_to → Person → owns → System
  • Relationship queries — "Which agents use the same MCP server as the billing-agent?" requires traversing a shared-dependency graph
  • Domain ontologies — Legal, medical, and financial domains have rich entity hierarchies (contracts → clauses → obligations → parties) where relationships carry as much meaning as the text

Concrete example: A support agent fielding "Which products were affected by the 2023 supply chain issue?" can answer correctly only if it can traverse Incident → affects → Supplier → supplies → Product. A pure vector search returns chunks that mention "supply chain" and "2023" but cannot reason about the causal chain — GraphRAG can.
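The causal chain in that example can be sketched as a small graph walk. The edge list and helper functions below are illustrative only, not AgentBreeder APIs:

```python
# Minimal multi-hop traversal sketch:
# Incident -> affects -> Supplier -> supplies -> Product.
# The edge list is hypothetical, for illustration only.
edges = [
    ("incident-2023", "affects", "acme-supplier"),
    ("acme-supplier", "supplies", "widget-pro"),
    ("acme-supplier", "supplies", "widget-lite"),
]

def neighbors(node):
    """Entities reachable in one hop from `node`."""
    return [dst for src, _rel, dst in edges if src == node]

def reachable(start, hops):
    """All entities within `hops` relationship steps of `start` (BFS)."""
    frontier, seen = {start}, set()
    for _ in range(hops):
        frontier = {n for node in frontier for n in neighbors(node)} - seen
        seen |= frontier
    return seen

# Two hops from the incident reach the affected products.
print(sorted(reachable("incident-2023", 2)))
# ['acme-supplier', 'widget-lite', 'widget-pro']
```

Pure vector search would have to find a single chunk mentioning both the incident and the products; the traversal recovers the connection even when no such chunk exists.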


Vector vs Graph vs Hybrid

| Type | How it works | Best for | Speed | Multi-hop | Exact match |
|---|---|---|---|---|---|
| Vector | Semantic similarity in embedding space | General QA, paraphrasing | Fast | ❌ | ❌ |
| Full-Text | BM25 keyword matching | IDs, codes, exact terms | Fast | ❌ | ✅ |
| Hybrid (Vector + Full-Text) | Weighted blend of both | Most use cases | Fast | ❌ | Partial |
| Graph | Entity and relationship traversal | Linked data, multi-hop reasoning | Slower | ✅ | ✅ (exact entities) |
| GraphRAG (Vector + Graph) | Vector search + entity relationships | Complex questions over structured knowledge | Moderate | ✅ | ✅ |

How GraphRAG Works

Ingestion Pipeline
──────────────────
Documents
  → Chunking (same as vector RAG)
  → Embedding (same as vector RAG)
  → Entity + Relationship Extraction (LLM call per chunk)
  → Store: nodes + edges in GraphStore; chunk vectors in vector store
  → Link: chunk_id → entity_id

Graph Search
────────────
  → Embed query
  → Vector search (top-20 candidates)
  → Identify seed entities from candidates
  → BFS traversal (1–2 hops)
  → Merge graph context + vector chunks
  → Rerank by: score = 0.6 × cosine_sim + 0.4 × hop_decay
  → Return top-k GraphSearchHit

During ingestion, AgentBreeder calls an LLM (Claude Haiku or a local Ollama model) to extract named entities and their relationships from each chunk. These are stored as nodes and edges alongside the standard vector embeddings, and each chunk is linked to the entities it mentions.

At query time, the graph search first runs a standard vector search to identify candidate chunks and their seed entities. It then walks the knowledge graph outward (BFS, 1–2 hops) to pull in related entities and their source chunks. The merged results are reranked using a combined score that blends vector similarity with graph distance.
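The rerank step can be sketched as follows. The 0.6/0.4 weights come from the pipeline above; the exact form of hop_decay is an assumption made for illustration:

```python
# Sketch of the graph-search rerank: blend vector similarity with graph
# distance. The halving form of hop_decay is an assumption; the 0.6/0.4
# weights are from the pipeline diagram.
def hop_decay(hops, factor=0.5):
    """Decay graph relevance with hop distance: 1.0 at 0 hops, 0.5 at 1, ..."""
    return factor ** hops

def rerank_score(cosine_sim, hops):
    return 0.6 * cosine_sim + 0.4 * hop_decay(hops)

# A strong vector match two hops away vs a weaker match at a seed entity:
print(rerank_score(0.94, 2))  # 0.6*0.94 + 0.4*0.25 = 0.664
print(rerank_score(0.70, 0))  # 0.6*0.70 + 0.4*1.00 = 0.82
```

Note how the seed-entity hit outranks the stronger vector match: graph proximity can outweigh raw similarity, which is the point of the blended score.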


Local Entity Extraction with Ollama

AgentBreeder supports local LLM-based entity extraction via Ollama — no API key required.

Prerequisites:

ollama pull qwen2.5:7b      # entity extraction model
ollama pull nomic-embed-text # embedding model

Configure your RAG index to use Ollama:

knowledge_bases:
  - ref: kb/my-docs
    index_type: graph
    embedding_model: ollama/nomic-embed-text
    entity_model: ollama/qwen2.5:7b

AgentBreeder routes any model prefixed with ollama/ to the local Ollama server at http://localhost:11434. Override with the OLLAMA_BASE_URL environment variable.
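A hedged sketch of how that routing could work; resolve_model is illustrative, not an AgentBreeder internal, and /api/chat is Ollama's standard chat endpoint:

```python
import os

# Sketch: resolve an "ollama/..."-prefixed model name to a local endpoint.
# resolve_model() is illustrative, not an AgentBreeder internal.
def resolve_model(model: str):
    base = os.environ.get("OLLAMA_BASE_URL", "http://localhost:11434")
    if model.startswith("ollama/"):
        return base + "/api/chat", model.removeprefix("ollama/")
    raise ValueError(f"not an Ollama model: {model}")

url, name = resolve_model("ollama/qwen2.5:7b")
print(url, name)  # default: http://localhost:11434/api/chat qwen2.5:7b
```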

See the complete working example in examples/graphrag-ollama-agent/.


Want Neo4j pre-wired and seeded?

Run agentbreeder quickstart to start the full local stack with Neo4j running on port 7687 and a sample knowledge graph (~50 nodes, ~60 relationships) already loaded.

To load your own graph:

agentbreeder seed --neo4j --cypher ./my-graph.cypher

Neo4j browser UI: http://localhost:7474 — login: neo4j / agentbreeder

Step 1 — Create a Graph Index

Go to Registry → Knowledge Bases → New Index. Under Index Type, select Graph (instead of Vector).

Configure:

| Field | Default | Description |
|---|---|---|
| Name | required | Slug-friendly (e.g., product-docs) |
| Index Type | vector | Choose graph for entity extraction |
| Embedding model | openai/text-embedding-3-small | Model used to embed chunks |
| Entity extraction model | openai/gpt-4o | LLM used to extract entities and relationships |
| Chunk strategy | recursive | fixed_size or recursive (splits on semantic boundaries) |
| Chunk size | 512 tokens | Number of tokens per chunk |
| Chunk overlap | 64 tokens | Overlap between adjacent chunks |

Click Create Index.

curl -X POST http://localhost:8000/api/v1/rag/indexes \
  -H "Content-Type: application/json" \
  -d '{
    "name": "product-docs",
    "description": "Product documentation with entity relationships",
    "index_type": "graph",
    "embedding_model": "ollama/nomic-embed-text",
    "entity_model": "ollama/qwen2.5:7b",
    "chunk_strategy": "recursive",
    "chunk_size": 512,
    "chunk_overlap": 64,
    "source": "manual"
  }'

Response:

{
  "data": {
    "id": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
    "name": "product-docs",
    "index_type": "graph",
    "embedding_model": "ollama/nomic-embed-text",
    "entity_model": "ollama/qwen2.5:7b",
    "chunk_strategy": "recursive",
    "chunk_size": 512,
    "chunk_overlap": 64,
    "entity_count": 0,
    "relationship_count": 0,
    "document_count": 0,
    "status": "active",
    "created_at": "2026-04-14T00:00:00Z"
  }
}
Equivalently, declare the index in agent.yaml:

knowledge_bases:
  - ref: kb/product-docs
    index_type: graph              # Enable graph mode
    embedding_model: ollama/nomic-embed-text
    entity_model: ollama/qwen2.5:7b

Entity extraction cost

Entity extraction runs the LLM on every chunk during ingestion. Local models (Ollama) are free. Cloud models (GPT-4o, Claude, etc.) cost per token.


Step 2 — Ingest Documents

Upload files to build the knowledge graph. Entity extraction happens automatically.

Open the index → click Upload Documents → drag and drop files.

The dashboard shows extraction progress:

✅ Chunking...           14 chunks from docs/architecture.md
✅ Entity Extraction...  28 entities found, 42 relationships discovered
✅ Vector Embedding...   14 chunks embedded
✅ Graph Construction... graph with 28 nodes, 42 edges built
✅ Stored               graph index ready

Via the API:
# Upload one or more files
curl -X POST http://localhost:8000/api/v1/rag/indexes/{index_id}/ingest \
  -F "files=@docs/architecture.md" \
  -F "files=@docs/quickstart.md"

Response (ingestion job):

{
  "data": {
    "id": "yyyyyyyy-yyyy-yyyy-yyyy-yyyyyyyyyyyy",
    "index_id": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
    "status": "processing",
    "total_files": 2,
    "processed_files": 0,
    "entities_extracted": 0,
    "relationships_found": 0,
    "error": null,
    "started_at": "2026-04-14T00:00:00Z",
    "completed_at": null
  }
}

Poll for completion:

GET /api/v1/rag/indexes/{index_id}/ingest/{job_id}
# status: "pending" | "processing" | "completed" | "failed"

Ingestion takes longer for graph indexes

Entity extraction runs an LLM call per chunk, so ingestion is slower than pure vector indexing. For large corpora, expect roughly 2–4× the ingestion time. Ingestion is asynchronous — poll the job status endpoint as usual.
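The polling loop can be sketched as below. The fetch_status callable stands in for an HTTP GET against the job endpoint; the function name is illustrative:

```python
import time

# Sketch: poll the ingestion job until it reaches a terminal status.
# `fetch_status` stands in for an HTTP GET on
# /api/v1/rag/indexes/{index_id}/ingest/{job_id}; names are illustrative.
def wait_for_ingestion(fetch_status, interval=2.0, timeout=600.0):
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        job = fetch_status()
        if job["status"] in ("completed", "failed"):
            return job
        time.sleep(interval)
    raise TimeoutError("ingestion job did not finish in time")

# Usage with a stubbed fetcher that completes on the third poll:
statuses = iter(["pending", "processing", "completed"])
job = wait_for_ingestion(lambda: {"status": next(statuses)}, interval=0.0)
print(job["status"])  # completed
```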


Step 3 — Search the Graph

Test entity-aware retrieval before wiring the index to an agent.

curl -X POST http://localhost:8000/api/v1/rag/search \
  -H "Content-Type: application/json" \
  -d '{
    "index_id": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
    "query": "What features does RBAC enable in AgentBreeder?",
    "search_mode": "graph",
    "top_k": 5,
    "max_hops": 2
  }'

Query modes:

  • vector — semantic similarity only
  • graph — entity traversal up to max_hops depth
  • hybrid — both, combined ranking (default)

Response:

{
  "data": {
    "index_id": "xxxxxxxx-...",
    "query": "What features does RBAC enable in AgentBreeder?",
    "search_mode": "graph",
    "results": [
      {
        "chunk_id": "chunk-001",
        "text": "RBAC validates team permissions before deploying agents...",
        "metadata": { "source": "architecture.md" },
        "entity_path": ["RBAC", "Governance", "Team Permissions"],
        "vector_score": 0.94,
        "graph_score": 0.87,
        "combined_score": 0.91,
        "hops_from_query": 1
      },
      {
        "chunk_id": "chunk-015",
        "text": "Teams can restrict who deploys agents in their org...",
        "metadata": { "source": "quickstart.md" },
        "entity_path": ["Team Management", "Access Control", "RBAC"],
        "vector_score": 0.81,
        "graph_score": 0.92,
        "combined_score": 0.87,
        "hops_from_query": 2
      }
    ],
    "total": 2,
    "entities_mentioned": ["RBAC", "Governance", "Team Permissions", "Access Control"],
    "relationships": [
      { "source": "RBAC", "target": "Team Permissions", "type": "enables" },
      { "source": "Team Permissions", "target": "Access Control", "type": "part_of" }
    ]
  }
}

Response fields:

  • entity_path — the chain of entities from the query entity to this result
  • vector_score — semantic relevance (0–1)
  • graph_score — entity relationship strength (0–1)
  • combined_score — weighted combination (default: 0.5 each)
  • hops_from_query — how many relationships away this entity is from the query entity
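With the default equal weights, the combined_score values in the example response can be reproduced directly (the response's two-decimal rounding is assumed):

```python
# Default blend: equal weight on vector and graph scores.
def combined_score(vector_score, graph_score, vector_weight=0.5):
    return vector_weight * vector_score + (1 - vector_weight) * graph_score

# The two hits from the example response:
print(combined_score(0.94, 0.87))  # 0.905 -> reported as 0.91
print(combined_score(0.81, 0.92))  # 0.865 -> reported as 0.87
```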

Step 4 — Use in agent.yaml

name: architecture-qa-agent
version: 1.0.0
framework: claude_sdk

knowledge_bases:
  - ref: kb/product-docs
    search_mode: hybrid          # vector | graph | hybrid (default)
    max_graph_hops: 2            # for graph search
    top_k: 5

# At runtime, the agent queries the knowledge base and receives
# both vector-matched chunks and graph-traversed entity paths
# before sending the user's message to the model.

Graph search with multiple hops

Setting max_graph_hops: 3 allows the traversal to go 3 relationships deep. Higher hops = more context but slower retrieval. Start with 1–2 hops for latency-sensitive agents.
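The slowdown is roughly geometric: on a graph with average out-degree d, the candidate set grows on the order of d per hop. A toy illustration with made-up numbers, not a benchmark:

```python
# Toy illustration: candidate entities grow roughly like out_degree ** hops.
def frontier_size(out_degree, hops):
    """Upper bound on entities visited after `hops` steps of BFS."""
    return sum(out_degree ** h for h in range(1, hops + 1))

for hops in (1, 2, 3):
    print(hops, frontier_size(5, hops))  # 1 -> 5, 2 -> 30, 3 -> 155
```

This is why the third hop is disproportionately expensive relative to the extra context it usually adds.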


Explore Graph Data

Three endpoints let you inspect the knowledge graph built from your documents:

| Method | Path | Description |
|---|---|---|
| GET | /api/v1/rag/indexes/{id}/graph | Graph metadata — node count, edge count, top entity types |
| GET | /api/v1/rag/indexes/{id}/entities | Paginated list of extracted entities with type and frequency |
| GET | /api/v1/rag/indexes/{id}/relationships | Paginated list of relationships (subject → predicate → object) |

Example — list top entities:

curl "http://localhost:8000/api/v1/rag/indexes/{id}/entities?limit=10&sort=frequency"

Response:
{
  "data": [
    { "id": "rbac", "label": "RBAC", "type": "Feature", "mention_count": 34 },
    { "id": "team-permissions", "label": "Team Permissions", "type": "Concept", "mention_count": 18 },
    { "id": "access-control", "label": "Access Control", "type": "Feature", "mention_count": 12 }
  ]
}

Entity Extraction Configuration

Built-in Entity Types

AgentBreeder recognizes these entity types by default:

| Type | Examples |
|---|---|
| PERSON | Alice, Bob, author names |
| ORGANIZATION | Company, team names (e.g., "customer-success") |
| LOCATION | Cities, regions, cloud regions |
| TECHNOLOGY | Framework names, tool names (e.g., "LangGraph", "OpenAI") |
| FEATURE | Product features (e.g., "RBAC", "Cost Tracking") |
| CONCEPT | Abstract ideas (e.g., "Governance", "Orchestration") |

Custom Entity Types

Define your own for domain-specific extraction:

knowledge_bases:
  - ref: kb/legal-docs
    index_type: graph
    entity_extraction:
      custom_types:
        - name: CONTRACT
          description: "Legal agreements and contracts"
        - name: PARTY
          description: "Signatory organizations or people"
        - name: OBLIGATION
          description: "Duties and responsibilities"

Eval Metrics

GraphRAG adds four new eval metrics to agentbreeder eval:

| Metric | Description |
|---|---|
| entity_recall | Fraction of ground-truth entities present in retrieved context |
| relationship_precision | Fraction of retrieved relationships that are correct |
| hop_coverage | Fraction of multi-hop questions answered with graph context |
| vector_fallback_rate | % of queries that fell back to vector-only (graph returned nothing) |

agentbreeder eval run my-agent \
  --dataset kb-qa-v1 \
  --scorer judge \
  --rag-metrics entity_recall,relationship_precision

A high vector_fallback_rate (> 20%) indicates your documents lack the entity density needed for graph retrieval to activate — consider using a hybrid index type instead of pure graph.
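The definitions in the metrics table translate directly into code. A sketch following those definitions (function names are illustrative, not AgentBreeder internals):

```python
# entity_recall: fraction of ground-truth entities present in retrieved context.
def entity_recall(retrieved, ground_truth):
    truth = set(ground_truth)
    return len(truth & set(retrieved)) / len(truth) if truth else 1.0

# vector_fallback_rate: share of queries where graph retrieval returned nothing.
def vector_fallback_rate(graph_hit_counts):
    return sum(1 for n in graph_hit_counts if n == 0) / len(graph_hit_counts)

print(entity_recall(["RBAC", "Governance"], ["RBAC", "Governance", "Teams"]))  # 2/3
print(vector_fallback_rate([3, 0, 2, 0, 1]))  # 0.4, above the 20% warning line
```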


Local Development with Neo4j

By default, GraphRAG uses an in-memory graph store (GRAPH_BACKEND=memory) — no extra setup required. To spin up a local Neo4j instance for a more production-like environment:

docker compose --profile graphrag up

This starts Neo4j alongside the standard AgentBreeder stack. Configure the connection with:

NEO4J_URL=bolt://localhost:7687
NEO4J_USER=neo4j
NEO4J_PASSWORD=agentbreeder

Add these to your .env file. The Neo4j browser UI is available at http://localhost:7474 for visual graph exploration.

Memory backend is the default

GRAPH_BACKEND=memory works for development and small corpora. Graphs are rebuilt on each API server restart. Switch to Neo4j when you need persistence across restarts or graphs with > 100k nodes.


Production — Neo4j AuraDB

Recommended for production

For production, use Neo4j AuraDB — a fully managed Neo4j cloud service. Set NEO4J_URL to your AuraDB connection string and NEO4J_USER/NEO4J_PASSWORD to your AuraDB credentials. AuraDB handles backups, upgrades, and scaling automatically.

NEO4J_URL=neo4j+s://xxxxxxxx.databases.neo4j.io
NEO4J_USER=neo4j
NEO4J_PASSWORD=<your-auradb-password>
GRAPH_BACKEND=neo4j

Store these as secrets using agentbreeder secret set — do not commit them to version control.


Index Types Comparison

| Type | Storage | Best For |
|---|---|---|
| vector | In-memory / pgvector | Factual lookup, semantic similarity |
| graph | In-memory / Neo4j | Relationship queries, multi-hop reasoning |
| hybrid | Both | General-purpose (recommended for complex domains) |

When in doubt, start with hybrid. It runs both vector and graph retrieval and merges the results — so you get the benefits of both without having to predict which mode will perform better for a given query type.
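A sketch of the merge hybrid implies: take the union of both result sets, dedupe on chunk id, and keep the better score per chunk. The hit structure here is assumed for illustration:

```python
# Sketch of hybrid merging: union vector and graph hits, dedupe on chunk_id,
# keep the best score per chunk. The hit dicts are assumed for illustration.
def merge_hits(vector_hits, graph_hits, top_k=5):
    best = {}
    for hit in vector_hits + graph_hits:
        cid = hit["chunk_id"]
        if cid not in best or hit["score"] > best[cid]["score"]:
            best[cid] = hit
    return sorted(best.values(), key=lambda h: h["score"], reverse=True)[:top_k]

merged = merge_hits(
    [{"chunk_id": "c1", "score": 0.94}, {"chunk_id": "c2", "score": 0.70}],
    [{"chunk_id": "c2", "score": 0.92}, {"chunk_id": "c3", "score": 0.80}],
)
print([h["chunk_id"] for h in merged])  # ['c1', 'c2', 'c3']
```

Here c2 surfaces with its higher graph score, so a query that vector search alone would rank poorly still lands near the top.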


Complete Example: GraphRAG with Ollama

See examples/graphrag-ollama-agent/ for a working example:

cd examples/graphrag-ollama-agent

# Start Ollama (if not running)
ollama serve

# In another terminal, pull models
ollama pull qwen2.5:7b
ollama pull nomic-embed-text

# Start the full AgentBreeder stack
docker compose up -d

# Ingest documents and build the knowledge graph
python ingest.py

# Chat with the agent
agentbreeder chat --agent graphrag-demo-agent \
  "What are the core concepts of GraphRAG?"

Troubleshooting

"Ollama connection refused"

Problem: Entity extraction can't reach local Ollama.

Solution:

# Check Ollama is running
curl http://localhost:11434/api/tags

# If not, start it
ollama serve

# Override the endpoint (if Ollama is on a different port)
export OLLAMA_BASE_URL=http://localhost:11435

"Entity extraction taking too long"

Problem: Graph building is slow on large documents.

Solution:

  • Use a smaller extraction model: ollama/qwen2.5:7b instead of qwen2.5:72b
  • Reduce chunk_size (fewer chunks = fewer extractions)
  • Use search_mode: "hybrid" to blend graph + vector for faster queries

"Graph traversal returns too many results"

Problem: max_hops: 3 is causing exponential result growth.

Solution:

  • Reduce max_hops to 1 or 2
  • Use search_mode: "hybrid" with lower graph_weight
  • Filter results client-side by relationship type
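Client-side filtering by relationship type can be a one-liner over the relationships array in the search response:

```python
# Keep only relationships whose predicate type is in the allowed set.
# The dict shape matches the `relationships` array in the search response.
def filter_relationships(relationships, allowed_types):
    return [r for r in relationships if r["type"] in allowed_types]

rels = [
    {"source": "RBAC", "target": "Team Permissions", "type": "enables"},
    {"source": "Team Permissions", "target": "Access Control", "type": "part_of"},
]
print(filter_relationships(rels, {"enables"}))
# [{'source': 'RBAC', 'target': 'Team Permissions', 'type': 'enables'}]
```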

API Reference

| Method | Path | Description |
|---|---|---|
| POST | /api/v1/rag/indexes | Create graph index (with index_type: "graph") |
| GET | /api/v1/rag/indexes/{id} | Get index metadata (includes entity/relationship counts) |
| POST | /api/v1/rag/indexes/{id}/ingest | Upload files and build graph |
| GET | /api/v1/rag/indexes/{id}/entities | List all entities in the graph |
| GET | /api/v1/rag/indexes/{id}/relationships | List all relationships in the graph |
| POST | /api/v1/rag/search | Search with search_mode: "graph" |

Next Steps

| What | Where |
|---|---|
| Vector and hybrid indexes | Knowledge Bases → |
| Add tools to your agent | Tools → |
| Connect MCP servers | MCP Servers → |
| Run evals on your agent | Evaluations → |
| Full agent.yaml fields | agent.yaml Reference → |
