Knowledge Bases — Lifecycle & Registry

Create indexes, ingest documents, run hybrid search, and wire knowledge bases into agents.

Knowledge bases (RAG indexes) let agents retrieve context from your documents before answering. AgentBreeder handles chunking, embedding, and hybrid search — you just upload files and reference the index in agent.yaml.


How It Works

Your documents (PDF, MD, CSV, JSON)
   ↓
Chunking (fixed-size or recursive)
   ↓
Embedding (text-embedding-3-small by default)
   ↓
Vector store (in-memory or pgvector)
   ↓
Hybrid search (vector + full-text, 70/30 default)
   ↓
Agent receives top-k chunks as context

Want ChromaDB pre-wired and seeded?

Run agentbreeder quickstart to start the full local stack with ChromaDB running on port 8001 and sample data already loaded into the agentbreeder_knowledge collection. To load your own docs: agentbreeder seed --chromadb --docs ./my-docs/

Step 1 — Create an Index

Go to Registry → Knowledge Bases → New Index. Configure:

| Field | Default | Description |
| --- | --- | --- |
| Name | required | Slug-friendly (e.g., product-docs) |
| Embedding model | openai/text-embedding-3-small | Model used to embed chunks |
| Chunk strategy | recursive | fixed_size or recursive (splits on semantic boundaries) |
| Chunk size | 512 tokens | Number of tokens per chunk |
| Chunk overlap | 64 tokens | Overlap between adjacent chunks |

Click Create Index. To create the same index through the API instead:

curl -X POST http://localhost:8000/api/v1/rag/indexes \
  -H "Content-Type: application/json" \
  -d '{
    "name": "product-docs",
    "description": "Product documentation and FAQs",
    "embedding_model": "openai/text-embedding-3-small",
    "chunk_strategy": "recursive",
    "chunk_size": 512,
    "chunk_overlap": 64,
    "source": "manual"
  }'

Response:

{
  "data": {
    "id": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
    "name": "product-docs",
    "description": "Product documentation and FAQs",
    "embedding_model": "openai/text-embedding-3-small",
    "chunk_strategy": "recursive",
    "chunk_size": 512,
    "chunk_overlap": 64,
    "document_count": 0,
    "chunk_count": 0,
    "status": "active",
    "created_at": "2026-04-14T00:00:00Z"
  }
}
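
If you prefer to script index creation, the curl call above can be mirrored with the Python standard library. This is a minimal sketch, assuming the API is reachable at the same local address and needs no authentication header; the helper names build_create_index_request and create_index are illustrative and not part of any AgentBreeder SDK.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # local API address used in the curl example


def build_create_index_request(name: str, description: str = "") -> urllib.request.Request:
    """Build (without sending) the POST request that creates an index."""
    payload = {
        "name": name,
        "description": description,
        "embedding_model": "openai/text-embedding-3-small",
        "chunk_strategy": "recursive",
        "chunk_size": 512,
        "chunk_overlap": 64,
        "source": "manual",
    }
    return urllib.request.Request(
        f"{BASE_URL}/api/v1/rag/indexes",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def create_index(name: str, description: str = "") -> dict:
    """Send the request and return the "data" object of the JSON response."""
    request = build_create_index_request(name, description)
    with urllib.request.urlopen(request) as response:
        return json.load(response)["data"]
```

Keep the returned id: the ingest and search calls in the next steps take it as index_id.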

Step 2 — Ingest Documents

Upload files to the index. Supported formats: .pdf, .txt, .md, .csv, .json.

Open the index → click Upload Documents → drag and drop files.

Studio shows a live ingestion progress bar:

✅ Chunking...   14 chunks from docs/product-guide.pdf
✅ Embedding...  14 chunks embedded
✅ Stored        14 chunks indexed

To upload through the API instead:

# Upload one or more files
curl -X POST http://localhost:8000/api/v1/rag/indexes/{index_id}/ingest \
  -F "files=@docs/product-guide.pdf" \
  -F "files=@docs/faq.md" \
  -F "files=@data/pricing.csv"

Response (ingestion job):

{
  "data": {
    "id": "yyyyyyyy-yyyy-yyyy-yyyy-yyyyyyyyyyyy",
    "index_id": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
    "status": "processing",
    "total_files": 3,
    "processed_files": 0,
    "total_chunks": 0,
    "error": null,
    "started_at": "2026-04-14T00:00:00Z",
    "completed_at": null
  }
}

Poll for completion:

GET /api/v1/rag/indexes/{index_id}/ingest/{job_id}
# status: "pending" | "processing" | "completed" | "failed"
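
A polling loop around that endpoint might look like the sketch below. It assumes you supply fetch_job, a hypothetical callable that performs the GET request and returns the "data" object of the job; the interval and timeout values are arbitrary choices, not documented defaults.

```python
import time
from typing import Callable

TERMINAL_STATUSES = {"completed", "failed"}


def wait_for_ingestion(
    fetch_job: Callable[[], dict],
    poll_interval: float = 2.0,
    timeout: float = 300.0,
) -> dict:
    """Call fetch_job() until the job is completed or failed, or the timeout expires.

    fetch_job is a hypothetical callable wrapping
    GET /api/v1/rag/indexes/{index_id}/ingest/{job_id}.
    """
    deadline = time.monotonic() + timeout
    while True:
        job = fetch_job()
        if job["status"] in TERMINAL_STATUSES:
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError(f"ingestion job still {job['status']!r} after {timeout}s")
        time.sleep(poll_interval)
```

Check the error field of the returned job when status is "failed" before moving on to search.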

Step 3 — Search the Index

Test retrieval before wiring the index to an agent.

curl -X POST http://localhost:8000/api/v1/rag/search \
  -H "Content-Type: application/json" \
  -d '{
    "index_id": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
    "query": "What is the refund policy for annual subscriptions?",
    "top_k": 5,
    "vector_weight": 0.7,
    "text_weight": 0.3
  }'

Response:

{
  "data": {
    "index_id": "xxxxxxxx-...",
    "query": "What is the refund policy for annual subscriptions?",
    "top_k": 5,
    "results": [
      {
        "chunk_id": "chunk-001",
        "text": "Annual subscribers are eligible for a full refund within 30 days...",
        "metadata": { "source": "faq.md", "chunk_index": 12 },
        "score": 0.94,
        "similarity": 0.91
      }
    ],
    "total": 5
  }
}

Score is the combined hybrid score (vector similarity × vector_weight + text score × text_weight).
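
To see how the two weights shift ranking, the formula can be sketched in a few lines. The text_score field and the two sample candidates are made up for illustration; the API response shown above only exposes score and similarity.

```python
def hybrid_score(
    similarity: float,
    text_score: float,
    vector_weight: float = 0.7,
    text_weight: float = 0.3,
) -> float:
    """Weighted sum of the vector similarity and the full-text score."""
    return similarity * vector_weight + text_score * text_weight


def rank(candidates: list[dict], vector_weight: float = 0.7, text_weight: float = 0.3) -> list[dict]:
    """Order candidates by hybrid score, best first."""
    return sorted(
        candidates,
        key=lambda c: hybrid_score(c["similarity"], c["text_score"], vector_weight, text_weight),
        reverse=True,
    )


# Hypothetical candidates: one strong semantic match, one strong keyword match.
CANDIDATES = [
    {"chunk_id": "semantic-match", "similarity": 0.9, "text_score": 0.2},
    {"chunk_id": "keyword-match", "similarity": 0.6, "text_score": 1.0},
]
```

With the default 0.7/0.3 split, keyword-match scores about 0.72 against 0.69 for semantic-match and ranks first. At 0.9/0.1 the order flips (about 0.83 against 0.64), so the weights are worth tuning for queries that contain exact IDs or codes.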


Step 4 — Use in agent.yaml

name: support-agent
version: 1.0.0
framework: claude_sdk

knowledge_bases:
  - ref: kb/product-docs         # ← resolves from registry at deploy time
  - ref: kb/return-policy

# At runtime, the agent automatically queries all attached
# knowledge bases and includes the top-k chunks as context
# before sending the user's message to the model.

Multiple knowledge bases

You can attach multiple knowledge bases to one agent. Each base is queried independently and the top-k results from all bases are merged and ranked before being included in the agent's context window.
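
The merge step can be pictured roughly as follows. This is a sketch of the idea rather than AgentBreeder's actual implementation: it assumes each result carries a score that is comparable across bases, and the kb key added to each result is hypothetical.

```python
import heapq


def merge_results(per_kb_results: dict[str, list[dict]], top_k: int = 5) -> list[dict]:
    """Pool the results of every knowledge base and keep the top_k by score."""
    pooled = []
    for kb_ref, results in per_kb_results.items():
        for result in results:
            # Remember which base the chunk came from (illustrative field).
            pooled.append({**result, "kb": kb_ref})
    return heapq.nlargest(top_k, pooled, key=lambda r: r["score"])
```

Because the ranking is global, a base with weak matches may contribute no chunks at all to a given query.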

SDK Usage

Python:

from agenthub import Agent

agent = (
    Agent("support-agent")
        .with_framework("claude_sdk")
        .with_knowledge_bases([
            "kb/product-docs",
            "kb/return-policy",
        ])
        .with_deploy(cloud="aws", runtime="app-runner")
)

TypeScript:

import { Agent } from "@agentbreeder/sdk";

const agent = new Agent("support-agent")
  .withFramework("claude_sdk")
  .withKnowledgeBases(["kb/product-docs", "kb/return-policy"])
  .withDeploy({ cloud: "aws", runtime: "app-runner" });

Search Backends

AgentBreeder supports four retrieval approaches:

| Backend | How it works | When to use |
| --- | --- | --- |
| Vector | Dense embedding similarity (cosine) | Semantic queries, paraphrase matching |
| Full-text | BM25 keyword search | Exact term matching, IDs, codes |
| Hybrid | Weighted combination of vector + full-text | Default; best of both worlds |
| Graph | Entity-relationship traversal over a knowledge graph | Linked entities, structured knowledge bases, multi-hop reasoning |

Hybrid (vector + full-text) is the default and works well for most document-based knowledge bases. Graph search is available when your knowledge base is structured as an entity graph.


Chunking Strategies

| Strategy | How it works | Best for |
| --- | --- | --- |
| fixed_size | Split every N tokens with O-token overlap | Structured data, code, tables |
| recursive | Split on \n\n, then \n, then . (in that order); keeps paragraphs intact | Prose, documentation, FAQs |

Choosing the wrong strategy can hurt retrieval quality — recursive is the better default for most text.
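
The difference between the two strategies is easier to see in code. The sketch below is a simplified illustration, not AgentBreeder's chunker: whitespace-separated words stand in for tokens, ". " stands in for the sentence separator, and the recursive version omits overlap and the merging of small pieces that a production splitter would do.

```python
SEPARATORS = ["\n\n", "\n", ". "]  # paragraph, line, then sentence boundary


def fixed_size_chunks(tokens: list[str], chunk_size: int, overlap: int) -> list[list[str]]:
    """Slide a window of chunk_size tokens, stepping by chunk_size - overlap."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the last window already reached the end
    return chunks


def recursive_chunks(text: str, max_words: int, separators: list[str] = SEPARATORS) -> list[str]:
    """Split on the coarsest separator first; only split further if a piece is too long."""
    stripped = text.strip()
    if not stripped:
        return []
    if len(stripped.split()) <= max_words or not separators:
        return [stripped]
    head, *rest = separators
    chunks = []
    for piece in stripped.split(head):
        chunks.extend(recursive_chunks(piece, max_words, rest))
    return chunks
```

fixed_size cuts wherever the window ends, even mid-sentence, while recursive leaves any paragraph that already fits under the limit untouched.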


Embedding Models

The embedding_model field accepts any provider/model-id string:

| Value | Dimensions | Notes |
| --- | --- | --- |
| openai/text-embedding-3-small | 1536 | Default; best cost/quality balance |
| openai/text-embedding-3-large | 3072 | Higher quality, higher cost |
| ollama/nomic-embed-text | 768 | Local, no API key needed |
| ollama/mxbai-embed-large | 1024 | Larger local model |

Changing the embedding model

If you change embedding_model after ingesting documents, you must re-ingest all files. Vectors from different models are not compatible.
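
A pre-flight check for this rule could look like the sketch below. The dimensions are copied from the table above; parse_embedding_model and requires_reingest are hypothetical helper names, not AgentBreeder functions.

```python
# Dimensions copied from the embedding model table.
EMBEDDING_DIMENSIONS = {
    "openai/text-embedding-3-small": 1536,
    "openai/text-embedding-3-large": 3072,
    "ollama/nomic-embed-text": 768,
    "ollama/mxbai-embed-large": 1024,
}


def parse_embedding_model(value: str) -> tuple[str, str]:
    """Split a provider/model-id string into (provider, model_id)."""
    provider, sep, model_id = value.partition("/")
    if not sep or not provider or not model_id:
        raise ValueError(f"expected provider/model-id, got {value!r}")
    return provider, model_id


def requires_reingest(indexed_with: str, new_model: str) -> bool:
    """Any model change invalidates stored vectors, even if the dimensions match."""
    parse_embedding_model(indexed_with)  # validate both strings first
    parse_embedding_model(new_model)
    return indexed_with != new_model
```

Matching dimensions are not enough: two models with the same vector size still place text in different embedding spaces.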


RAG YAML Schema (standalone rag.yaml)

You can define knowledge bases as standalone YAML files in version control:

spec_version: v1
name: product-docs
version: 1.0.0
description: Product documentation and FAQs
team: customer-success
owner: alice@company.com

backend: in_memory           # in_memory | pgvector — see "pgvector backend" below

embedding_model:
  provider: openai
  name: text-embedding-3-small
  dimensions: 1536

chunking:
  strategy: recursive        # fixed_size | recursive
  chunk_size: 512
  chunk_overlap: 64

sources:
  - type: file
    path: "docs/**/*.md"     # glob pattern
  - type: file
    path: "docs/**/*.pdf"

search:
  hybrid: true
  vector_weight: 0.7
  text_weight: 0.3
  default_top_k: 5

pgvector backend (HR-4 / #406)

The pgvector backend persists chunks + embeddings in PostgreSQL using the pgvector extension. Use it whenever you want durable storage that survives a server restart and is shared across replicas.

Local development

The fastest path is the upstream Docker image:

docker run --name agentbreeder-pgvector \
  -e POSTGRES_PASSWORD=pw \
  -p 5432:5432 \
  -d pgvector/pgvector:pg16

Then point AgentBreeder at it via env var:

export PGVECTOR_DSN="postgresql://postgres:pw@localhost:5432/postgres"

The backend self-installs the vector extension and creates a chunk table on first connect — no manual SQL needed for the dev path. Production deploys should still ship the DDL via alembic.

Configuration shape

backend: pgvector
backend_config:
  dsn: ${PGVECTOR_DSN}        # or set PGVECTOR_DSN env, or DATABASE_URL
  pool_min_size: 1
  pool_max_size: 10
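
The comment on dsn suggests three possible sources for the connection string. One plausible lookup order is sketched below; the precedence (explicit config, then PGVECTOR_DSN, then DATABASE_URL) is an assumption, and resolve_dsn is a hypothetical helper rather than the backend's real code.

```python
import os
from typing import Mapping, Optional


def resolve_dsn(backend_config: Mapping[str, str], env: Optional[Mapping[str, str]] = None) -> str:
    """Return the first DSN found: config value, PGVECTOR_DSN, then DATABASE_URL."""
    env = os.environ if env is None else env
    candidates = (
        backend_config.get("dsn"),
        env.get("PGVECTOR_DSN"),
        env.get("DATABASE_URL"),
    )
    for candidate in candidates:
        if candidate:
            return candidate
    raise RuntimeError("no pgvector DSN configured: set dsn, PGVECTOR_DSN, or DATABASE_URL")
```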

Status (as of 2026-05-19)

The backend itself ships full upsert + cosine-similarity search + delete round-trip against a real Postgres (verified by tests/integration/test_pgvector_testcontainers.py). The wire-through from backend: pgvector in rag.yaml to the existing RAGStore.search() and ingestion pipeline is staged as a follow-up — today the backend is callable directly via api.services.pgvector_rag_backend.PgvectorRAGBackend.


Deploying with a managed pgvector store (cloud)

When you deploy an agent that declares knowledge_bases, retrieval runs inside the agent container — where the local in-process index is empty. To retrieve in the cloud, point the knowledge base at a managed Postgres + pgvector store with backend_url:

knowledge_bases:
  - ref: kb/product-docs
    backend_url: postgresql://user:pass@db.internal:5432/agentbreeder

At deploy time the resolver exposes this to the container as two env vars:

| Env var | Source | Purpose |
| --- | --- | --- |
| KB_PGVECTOR_DSN | knowledge_bases[].backend_url | Connection string the runtime uses to query pgvector |
| KB_EMBEDDING_MODEL | the index's embedding model | Embeds queries the same way the documents were ingested |

At invoke time the runtime embeds the user query with KB_EMBEDDING_MODEL, runs a cosine-similarity search against the pgvector store for each attached index, and prepends the top-k chunks as context — nothing is read from the machine you deployed from.
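
The invoke-time flow can be sketched in pure Python to make the steps concrete. In the real runtime the similarity search runs inside Postgres via pgvector; here an in-memory list stands in for the store, the query vector is assumed to be already embedded, and all function names are illustrative.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must come from the same embedding model")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k_chunks(query_vec: list[float], chunks: list[dict], k: int) -> list[dict]:
    """Return the k chunks whose embeddings are most similar to the query."""
    ranked = sorted(
        chunks,
        key=lambda c: cosine_similarity(query_vec, c["embedding"]),
        reverse=True,
    )
    return ranked[:k]


def prepend_context(user_message: str, chunks: list[dict]) -> str:
    """Put the retrieved chunk texts ahead of the user's message."""
    return "\n\n".join([c["text"] for c in chunks] + [user_message])
```

The length check in cosine_similarity is the reason KB_EMBEDDING_MODEL must match the model used at ingestion time.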

Per-cloud pgvector notes

The vector extension must be available on the managed Postgres:

  • AWS RDS: supported on PostgreSQL ≥ 15.2; the extension is created via rds_superuser (no parameter-group change needed).
  • GCP Cloud SQL: pgvector is built in; connect over the instance's private IP.
  • Azure Database for PostgreSQL Flexible Server: VECTOR must be in the azure.extensions server parameter before CREATE EXTENSION succeeds.

Automatic provisioning (no backend_url)

If you omit backend_url, agentbreeder deploy provisions a managed Postgres for you and injects KB_PGVECTOR_DSN automatically:

knowledge_bases:
  - ref: kb/product-docs        # no backend_url → auto-provisioned

The store is created in the agent's own BYO network (the same VPC/subnets you already deploy into), so the agent reaches it over private networking:

  • AWS — an RDS PostgreSQL instance in your AWS_VPC_SUBNETS, behind a dedicated DB security group that allows 5432 only from the agent's security group (never the internet). The VPC is derived from your subnets.
  • GCP — a Cloud SQL instance with a private IP on your VPC network. The network must have Private Service Access configured.
  • Azure — a PostgreSQL Flexible Server on a subnet delegated to Microsoft.DBforPostgreSQL/flexibleServers (set AZURE_DB_SUBNET_ID).

The DB password is generated at deploy time and written only to the cloud secret manager (AWS Secrets Manager / GCP Secret Manager / Azure Key Vault) — never to disk or the agent image. The provisioned footprint is recorded in .agentbreeder/infra-state.json; agentbreeder teardown <agent> destroys it, removing only what AgentBreeder created (the DB + its security group) and never your VPC, subnets, or resource group.

backend_url always wins

An explicit backend_url short-circuits auto-provisioning — point it at a Postgres you manage and AgentBreeder won't create one. Auto-provisioning applies only to aws / gcp / azure; local, kubernetes, and claude-managed are out of scope. Per-cloud SDK provisioning is validated against live clouds during rollout.
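
The precedence described here boils down to a small decision function. The sketch below only restates the rule from the paragraph above; kb_store_plan and its return labels are hypothetical and do not mirror the deployer's internals.

```python
AUTO_PROVISION_CLOUDS = {"aws", "gcp", "azure"}


def kb_store_plan(kb_entry: dict, cloud: str) -> str:
    """Decide where the pgvector store for one knowledge_bases entry comes from."""
    if kb_entry.get("backend_url"):
        return "use_backend_url"  # an explicit DSN always wins
    if cloud in AUTO_PROVISION_CLOUDS:
        return "auto_provision"   # managed Postgres created at deploy time
    return "out_of_scope"         # local, kubernetes, claude-managed
```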


API Reference

| Method | Path | Description |
| --- | --- | --- |
| POST | /api/v1/rag/indexes | Create a new vector index |
| GET | /api/v1/rag/indexes | List all indexes (paginated) |
| GET | /api/v1/rag/indexes/{id} | Get index metadata |
| DELETE | /api/v1/rag/indexes/{id} | Delete index and all its chunks |
| POST | /api/v1/rag/indexes/{id}/ingest | Upload files and start ingestion |
| GET | /api/v1/rag/indexes/{id}/ingest/{job_id} | Poll ingestion job status |
| POST | /api/v1/rag/search | Hybrid search (vector + full-text) |

agent.yaml knowledge_bases field

knowledge_bases:
  - ref: string          # required — registry reference (kb/name)
    backend_url: string  # optional — explicit pgvector DSN for cloud retrieval

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| ref | string | Yes | Registry reference in format kb/{name} |
| backend_url | string | No | Explicit cloud-reachable pgvector DSN. Exposed to the agent as KB_PGVECTOR_DSN; see Deploying with a managed pgvector store. |
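
If you generate agent.yaml programmatically, a light check of the knowledge_bases entries can catch malformed references before deploy. This is a sketch under assumptions: the slug pattern (lowercase letters, digits, hyphens) is a guess based on the product-docs example, and validate_knowledge_bases is not an AgentBreeder function.

```python
import re

# Assumed slug rule: lowercase alphanumerics separated by single hyphens.
REF_PATTERN = re.compile(r"^kb/[a-z0-9]+(?:-[a-z0-9]+)*$")


def validate_knowledge_bases(entries: list[dict]) -> list[str]:
    """Return human-readable problems; an empty list means the entries look valid."""
    errors = []
    for i, entry in enumerate(entries):
        ref = entry.get("ref")
        if not isinstance(ref, str) or not REF_PATTERN.match(ref):
            errors.append(f"knowledge_bases[{i}].ref must look like kb/<name>, got {ref!r}")
        backend_url = entry.get("backend_url")
        if backend_url is not None and not isinstance(backend_url, str):
            errors.append(f"knowledge_bases[{i}].backend_url must be a string")
    return errors
```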

Next Steps

| What | Where |
| --- | --- |
| Add tools to your agent | Tools |
| Connect MCP servers | MCP Servers |
| Register system prompts | Prompts |
| Full agent.yaml fields | agent.yaml Reference |
