Knowledge Bases — Lifecycle & Registry

Create indexes, ingest documents, run hybrid search, and wire knowledge bases into agents.

Knowledge bases (RAG indexes) let agents retrieve context from your documents before answering. AgentBreeder handles chunking, embedding, and hybrid search — you just upload files and reference the index in agent.yaml.

How It Works

Your documents (PDF, TXT, MD, CSV, JSON)
        ↓
   Chunking (fixed-size or recursive)
        ↓
   Embedding (text-embedding-3-small by default)
        ↓
   Vector store (in-memory or pgvector)
        ↓
   Hybrid search (vector + full-text, 70/30 default)
        ↓
   Agent receives top-k chunks as context

Want ChromaDB pre-wired and seeded?

Run agentbreeder quickstart to start the full local stack with ChromaDB running on port 8001 and sample data already loaded into the agentbreeder_knowledge collection. To load your own docs: agentbreeder seed --chromadb --docs ./my-docs/

Step 1 — Create an Index

Go to Registry → Knowledge Bases → New Index. Configure:

Field	Default	Description
Name	required	Slug-friendly (e.g., `product-docs`)
Embedding model	`openai/text-embedding-3-small`	Model used to embed chunks
Chunk strategy	`recursive`	`fixed_size` or `recursive` (splits on semantic boundaries)
Chunk size	512 tokens	Number of tokens per chunk
Chunk overlap	64 tokens	Overlap between adjacent chunks

Click Create Index.

curl -X POST http://localhost:8000/api/v1/rag/indexes \
  -H "Content-Type: application/json" \
  -d '{
    "name": "product-docs",
    "description": "Product documentation and FAQs",
    "embedding_model": "openai/text-embedding-3-small",
    "chunk_strategy": "recursive",
    "chunk_size": 512,
    "chunk_overlap": 64,
    "source": "manual"
  }'

Response:

{
  "data": {
    "id": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
    "name": "product-docs",
    "description": "Product documentation and FAQs",
    "embedding_model": "openai/text-embedding-3-small",
    "chunk_strategy": "recursive",
    "chunk_size": 512,
    "chunk_overlap": 64,
    "document_count": 0,
    "chunk_count": 0,
    "status": "active",
    "created_at": "2026-04-14T00:00:00Z"
  }
}
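
The same call can be made from Python. The sketch below uses only the standard library and mirrors the curl payload above. The helper names `build_create_index_request` and `create_index` are hypothetical and not part of any AgentBreeder SDK, and the response is assumed to be wrapped in a `data` object as in the example response.

```python
import json
import urllib.request

API_BASE = "http://localhost:8000/api/v1"  # same host as the curl example


def build_create_index_request(name: str, description: str = "") -> urllib.request.Request:
    """Build (but do not send) the POST request that creates an index."""
    payload = {
        "name": name,
        "description": description,
        "embedding_model": "openai/text-embedding-3-small",
        "chunk_strategy": "recursive",
        "chunk_size": 512,
        "chunk_overlap": 64,
        "source": "manual",
    }
    return urllib.request.Request(
        f"{API_BASE}/rag/indexes",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def create_index(name: str, description: str = "") -> dict:
    """Send the request and return the "data" object (assumed response shape)."""
    with urllib.request.urlopen(build_create_index_request(name, description)) as resp:
        return json.load(resp)["data"]
```

Keeping request construction separate from sending makes the payload easy to inspect or unit-test without a running server.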

Step 2 — Ingest Documents

Upload files to the index. Supported formats: .pdf, .txt, .md, .csv, .json.

Open the index → click Upload Documents → drag and drop files.

Studio shows a live ingestion progress bar:

✅ Chunking...   14 chunks from docs/product-guide.pdf
✅ Embedding...  14 chunks embedded
✅ Stored        14 chunks indexed

# Upload one or more files
curl -X POST http://localhost:8000/api/v1/rag/indexes/{index_id}/ingest \
  -F "files=@docs/product-guide.pdf" \
  -F "files=@docs/faq.md" \
  -F "files=@data/pricing.csv"

Response (ingestion job):

{
  "data": {
    "id": "yyyyyyyy-yyyy-yyyy-yyyy-yyyyyyyyyyyy",
    "index_id": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
    "status": "processing",
    "total_files": 3,
    "processed_files": 0,
    "total_chunks": 0,
    "error": null,
    "started_at": "2026-04-14T00:00:00Z",
    "completed_at": null
  }
}

Poll for completion:

curl http://localhost:8000/api/v1/rag/indexes/{index_id}/ingest/{job_id}
# status: "pending" | "processing" | "completed" | "failed"
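
A polling loop over that endpoint might look like the following sketch. It assumes the job-status response is wrapped in `data` like the other responses on this page. The names `fetch_job` and `wait_for_ingest`, the 2-second interval, and the poll limit are illustrative choices, not documented defaults.

```python
import json
import time
import urllib.request
from typing import Callable

TERMINAL_STATUSES = {"completed", "failed"}


def fetch_job(index_id: str, job_id: str) -> dict:
    """Fetch one ingestion job (assumes the response is wrapped in "data")."""
    url = f"http://localhost:8000/api/v1/rag/indexes/{index_id}/ingest/{job_id}"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)["data"]


def wait_for_ingest(fetch: Callable[[], dict], interval: float = 2.0, max_polls: int = 150) -> dict:
    """Call fetch() until the job reaches a terminal status, then return the job."""
    for _ in range(max_polls):
        job = fetch()
        if job["status"] in TERMINAL_STATUSES:
            return job
        time.sleep(interval)
    raise TimeoutError("ingestion job did not reach a terminal status in time")
```

Usage would be along the lines of `wait_for_ingest(lambda: fetch_job(index_id, job_id))`, followed by a check of `job["status"]` and `job["error"]`.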

Step 3 — Search the Index

Test retrieval before wiring the index to an agent.

curl -X POST http://localhost:8000/api/v1/rag/search \
  -H "Content-Type: application/json" \
  -d '{
    "index_id": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
    "query": "What is the refund policy for annual subscriptions?",
    "top_k": 5,
    "vector_weight": 0.7,
    "text_weight": 0.3
  }'

Response:

{
  "data": {
    "index_id": "xxxxxxxx-...",
    "query": "What is the refund policy for annual subscriptions?",
    "top_k": 5,
    "results": [
      {
        "chunk_id": "chunk-001",
        "text": "Annual subscribers are eligible for a full refund within 30 days...",
        "metadata": { "source": "faq.md", "chunk_index": 12 },
        "score": 0.94,
        "similarity": 0.91
      }
    ],
    "total": 5
  }
}

Score is the combined hybrid score (vector similarity × vector_weight + text score × text_weight).
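
As a small worked example of that formula, the function below combines the two components with the default 70/30 weights. The function name and the input values are illustrative; how the text score is normalised is not specified here.

```python
def hybrid_score(
    similarity: float,
    text_score: float,
    vector_weight: float = 0.7,
    text_weight: float = 0.3,
) -> float:
    """Weighted sum of the vector similarity and the full-text score."""
    return similarity * vector_weight + text_score * text_weight


# A chunk that is semantically close but a weaker keyword match:
combined = hybrid_score(similarity=0.9, text_score=0.5)  # 0.9*0.7 + 0.5*0.3
```

Raising `vector_weight` favours paraphrase matches, while raising `text_weight` favours exact terms such as IDs and codes.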

Step 4 — Use in agent.yaml

name: support-agent
version: 1.0.0
framework: claude_sdk

knowledge_bases:
  - ref: kb/product-docs         # ← resolves from registry at deploy time
  - ref: kb/return-policy

# At runtime, the agent automatically queries all attached
# knowledge bases and includes the top-k chunks as context
# before sending the user's message to the model.

Multiple knowledge bases

You can attach multiple knowledge bases to one agent. Each base is queried independently and the top-k results from all bases are merged and ranked before being included in the agent's context window.
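
One way to picture the merge step is the sketch below. It assumes each base returns hits shaped like the search response above, that ranking uses the `score` field, and that the merged list is cut to a single global top-k. Those details and the name `merge_results` are assumptions, not documented behaviour.

```python
import heapq
from typing import Iterable


def merge_results(per_base_results: Iterable[list[dict]], top_k: int = 5) -> list[dict]:
    """Flatten the hits from every knowledge base and keep the top_k by score."""
    all_hits = [hit for hits in per_base_results for hit in hits]
    return heapq.nlargest(top_k, all_hits, key=lambda hit: hit["score"])


product_docs_hits = [{"chunk_id": "a", "score": 0.9}, {"chunk_id": "b", "score": 0.4}]
return_policy_hits = [{"chunk_id": "c", "score": 0.7}]
merged = merge_results([product_docs_hits, return_policy_hits], top_k=2)
```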

SDK Usage

from agenthub import Agent

agent = (
    Agent("support-agent")
        .with_framework("claude_sdk")
        .with_knowledge_bases([
            "kb/product-docs",
            "kb/return-policy",
        ])
        .with_deploy(cloud="aws", runtime="app-runner")
)

import { Agent } from "@agentbreeder/sdk";

const agent = new Agent("support-agent")
  .withFramework("claude_sdk")
  .withKnowledgeBases(["kb/product-docs", "kb/return-policy"])
  .withDeploy({ cloud: "aws", runtime: "app-runner" });

Search Backends

AgentBreeder supports four retrieval approaches:

Backend	How it works	When to use
Vector	Dense embedding similarity (cosine)	Semantic queries, paraphrase matching
Full-text	BM25 keyword search	Exact term matching, IDs, codes
Hybrid	Weighted combination of vector + full-text	Default — best of both worlds
Graph	Entity-relationship traversal over a knowledge graph	Linked entities, structured knowledge bases, multi-hop reasoning

Hybrid (vector + full-text) is the default and works well for most document-based knowledge bases. Graph search is available when your knowledge base is structured as an entity graph.
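
For reference, the cosine similarity used by the vector backend can be written in a few lines of plain Python. This is a generic sketch of the metric, not AgentBreeder's implementation.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two embedding vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    # A zero vector has no direction, so treat its similarity as 0.
    return dot / norm if norm else 0.0
```

The metric depends only on direction, so two vectors pointing the same way score 1.0 regardless of length, and orthogonal vectors score 0.0.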

Chunking Strategies

Strategy	How it works	Best for
`fixed_size`	Split every N tokens (the chunk size), with adjacent chunks sharing O tokens (the chunk overlap)	Structured data, code, tables
`recursive`	Split on `\n\n`, `\n`, `.`, in order — keeps paragraphs intact	Prose, documentation, FAQs

Choosing the wrong strategy can hurt retrieval quality — recursive is the better default for most text.
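
The two strategies can be sketched as follows. This is a simplified illustration, not the shipped chunker: `fixed_size_chunks` works on any pre-tokenised list, and `recursive_chunks` measures length in characters rather than tokens to stay dependency-free. Both function names are hypothetical.

```python
from typing import Sequence


def fixed_size_chunks(tokens: Sequence, chunk_size: int = 512, chunk_overlap: int = 64) -> list:
    """Slide a window of chunk_size tokens, stepping by chunk_size - chunk_overlap."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the last window already reached the end
    return chunks


def recursive_chunks(text: str, max_len: int, separators: tuple = ("\n\n", "\n", ".")) -> list:
    """Split on the coarsest separator first; recurse only into pieces that are too long."""
    if len(text) <= max_len:
        return [text] if text.strip() else []
    if not separators:
        # No separator left: fall back to a hard cut.
        return [text[i:i + max_len] for i in range(0, len(text), max_len)]
    sep, rest = separators[0], separators[1:]
    chunks, buf = [], ""
    for piece in text.split(sep):
        candidate = f"{buf}{sep}{piece}" if buf else piece
        if len(candidate) <= max_len:
            buf = candidate  # still fits: keep growing the current chunk
            continue
        if buf:
            chunks.append(buf)
        if len(piece) <= max_len:
            buf = piece
        else:
            chunks.extend(recursive_chunks(piece, max_len, rest))
            buf = ""
    if buf:
        chunks.append(buf)
    return chunks
```

The recursive variable keeps whole paragraphs together whenever they fit, which is why it tends to suit prose, while the fixed-size window gives predictable chunk lengths for tabular or code-like input.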

Embedding Models

The embedding_model field accepts any provider/model-id string:

Value	Dimensions	Notes
`openai/text-embedding-3-small`	1536	Default — best cost/quality balance
`openai/text-embedding-3-large`	3072	Higher quality, higher cost
`ollama/nomic-embed-text`	768	Local, no API key needed
`ollama/mxbai-embed-large`	1024	Larger local model

Changing the embedding model

If you change embedding_model after ingesting documents, you must re-ingest all files. Vectors from different models are not compatible.
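
The sketch below shows how a `provider/model-id` string might be split and how a model change could be detected before searching. The dimensions are copied from the table above. The helper names and the re-ingest check are illustrative assumptions, not an AgentBreeder API.

```python
# Dimensions as listed in the table above.
KNOWN_DIMENSIONS = {
    "openai/text-embedding-3-small": 1536,
    "openai/text-embedding-3-large": 3072,
    "ollama/nomic-embed-text": 768,
    "ollama/mxbai-embed-large": 1024,
}


def parse_embedding_model(value: str) -> tuple:
    """Split "provider/model-id" into (provider, model_id)."""
    provider, sep, model_id = value.partition("/")
    if not sep or not provider or not model_id:
        raise ValueError(f"expected provider/model-id, got {value!r}")
    return provider, model_id


def needs_reingest(ingested_with: str, configured: str) -> bool:
    """Vectors from different models are not comparable, so any change means re-ingesting."""
    return ingested_with != configured
```

Comparing the model identifiers rather than the dimensions is deliberate in this sketch: two models can share a dimension count and still produce incompatible vectors.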

RAG YAML Schema (standalone `rag.yaml`)

You can define knowledge bases as standalone YAML files in version control:

spec_version: v1
name: product-docs
version: 1.0.0
description: Product documentation and FAQs
team: customer-success
owner: alice@company.com

backend: in_memory           # in_memory | pgvector — see "pgvector backend" below

embedding_model:
  provider: openai
  name: text-embedding-3-small
  dimensions: 1536

chunking:
  strategy: recursive        # fixed_size | recursive
  chunk_size: 512
  chunk_overlap: 64

sources:
  - type: file
    path: "docs/**/*.md"     # glob pattern
  - type: file
    path: "docs/**/*.pdf"

search:
  hybrid: true
  vector_weight: 0.7
  text_weight: 0.3
  default_top_k: 5

pgvector backend (HR-4 / #406)

The pgvector backend persists chunks + embeddings in PostgreSQL using the pgvector extension. Use it whenever you want durable storage that survives a server restart and is shared across replicas.

Local development

The fastest path is the upstream Docker image:

docker run --name agentbreeder-pgvector \
  -e POSTGRES_PASSWORD=pw \
  -p 5432:5432 \
  -d pgvector/pgvector:pg16

Then point AgentBreeder at it via env var:

export PGVECTOR_DSN="postgresql://postgres:pw@localhost:5432/postgres"

The backend self-installs the vector extension and creates a chunk table on first connect — no manual SQL needed for the dev path. Production deploys should still ship the DDL via alembic.

Configuration shape

backend: pgvector
backend_config:
  dsn: ${PGVECTOR_DSN}        # or set PGVECTOR_DSN env, or DATABASE_URL
  pool_min_size: 1
  pool_max_size: 10
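
The comment on `dsn` implies a fallback chain. A possible reading is sketched below, assuming the explicit `dsn` is checked first, then `PGVECTOR_DSN`, then `DATABASE_URL`, and that `${...}` interpolation has already been applied by the config loader. The precedence order and the function name are assumptions.

```python
import os
from typing import Mapping, Optional


def resolve_pgvector_dsn(
    backend_config: Mapping[str, str],
    env: Mapping[str, str] = os.environ,
) -> Optional[str]:
    """Return the first DSN found: explicit config, then PGVECTOR_DSN, then DATABASE_URL."""
    return (
        backend_config.get("dsn")
        or env.get("PGVECTOR_DSN")
        or env.get("DATABASE_URL")
    )
```

Passing `env` explicitly keeps the lookup easy to test without mutating the process environment.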

Status (as of 2026-05-19)

The backend itself supports a full upsert, cosine-similarity search, and delete round-trip against a real Postgres (verified by tests/integration/test_pgvector_testcontainers.py). Wiring backend: pgvector in rag.yaml through to the existing RAGStore.search() and ingestion pipeline is staged as a follow-up; today the backend is callable directly via api.services.pgvector_rag_backend.PgvectorRAGBackend.

Deploying with a managed pgvector store (cloud)

When you deploy an agent that declares knowledge_bases, retrieval runs inside the agent container — where the local in-process index is empty. To retrieve in the cloud, point the knowledge base at a managed Postgres + pgvector store with backend_url:

knowledge_bases:
  - ref: kb/product-docs
    backend_url: postgresql://user:pass@db.internal:5432/agentbreeder

At deploy time the resolver exposes this to the container as two env vars:

Env var	Source	Purpose
`KB_PGVECTOR_DSN`	`knowledge_bases[].backend_url`	Connection string the runtime uses to query pgvector
`KB_EMBEDDING_MODEL`	the index's embedding model	Embeds queries the same way the documents were ingested

At invoke time the runtime embeds the user query with KB_EMBEDDING_MODEL, runs a cosine-similarity search against the pgvector store for each attached index, and prepends the top-k chunks as context — nothing is read from the machine you deployed from.
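
The invoke-time flow can be pictured with the sketch below. The `embed` and `search` callables are hypothetical stand-ins for the runtime's embedding client and pgvector query, and the sketch handles a single index for brevity. Only the two environment variable names come from the table above.

```python
import os
from typing import Callable, Mapping, Sequence


def build_prompt(query: str, chunks: Sequence[str]) -> str:
    """Prepend retrieved chunks to the user query as context."""
    context = "\n\n".join(chunks)
    return f"{context}\n\n{query}" if context else query


def retrieve_and_prepend(
    query: str,
    embed: Callable[[str, str], Sequence[float]],              # (model, text) -> vector
    search: Callable[[str, Sequence[float], int], Sequence[str]],  # (dsn, vector, top_k) -> chunks
    env: Mapping[str, str] = os.environ,
    top_k: int = 5,
) -> str:
    dsn = env["KB_PGVECTOR_DSN"]        # injected by the resolver at deploy time
    model = env["KB_EMBEDDING_MODEL"]   # same model the documents were ingested with
    vector = embed(model, query)
    chunks = search(dsn, vector, top_k)
    return build_prompt(query, chunks)
```

Because both values are read from the container's environment, the flow has no dependency on files or indexes from the machine that ran the deploy.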

Per-cloud pgvector notes

The vector extension must be available on the managed Postgres:

AWS RDS — supported on PostgreSQL ≥ 15.2; the extension is created via rds_superuser (no parameter-group change needed).
GCP Cloud SQL — pgvector is built in; connect over the instance's private IP.
Azure Database for PostgreSQL Flexible Server — VECTOR must be in the azure.extensions server parameter before CREATE EXTENSION succeeds.

Automatic provisioning (no `backend_url`)

If you omit backend_url, agentbreeder deploy provisions a managed Postgres for you and injects KB_PGVECTOR_DSN automatically:

knowledge_bases:
  - ref: kb/product-docs        # no backend_url → auto-provisioned

The store is created in the agent's own BYO network (the same VPC/subnets you already deploy into) so the agent reaches it over private networking:

AWS — an RDS PostgreSQL instance in your AWS_VPC_SUBNETS, behind a dedicated DB security group that allows 5432 only from the agent's security group (never the internet). The VPC is derived from your subnets.
GCP — a Cloud SQL instance with a private IP on your VPC network. The network must have Private Service Access configured.
Azure — a PostgreSQL Flexible Server on a subnet delegated to Microsoft.DBforPostgreSQL/flexibleServers (set AZURE_DB_SUBNET_ID).

The DB password is generated at deploy time and written only to the cloud secret manager (AWS Secrets Manager / GCP Secret Manager / Azure Key Vault) — never to disk or the agent image. The provisioned footprint is recorded in .agentbreeder/infra-state.json; agentbreeder teardown <agent> destroys it, removing only what AgentBreeder created (the DB + its security group) and never your VPC, subnets, or resource group.

backend_url always wins

An explicit backend_url short-circuits auto-provisioning — point it at a Postgres you manage and AgentBreeder won't create one. Auto-provisioning applies only to aws / gcp / azure; local, kubernetes, and claude-managed are out of scope. Per-cloud SDK provisioning is validated against live clouds during rollout.
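
The decision described above reduces to a small rule, sketched here. The function name and the returned labels are illustrative; only the precedence of `backend_url` and the list of supported clouds come from the text.

```python
from typing import Optional

AUTO_PROVISION_CLOUDS = {"aws", "gcp", "azure"}


def storage_plan(cloud: str, backend_url: Optional[str] = None) -> str:
    """Decide where the knowledge base store comes from for a deploy target."""
    if backend_url:
        return "use_backend_url"   # explicit DSN always wins
    if cloud in AUTO_PROVISION_CLOUDS:
        return "auto_provision"    # managed Postgres created in the agent's network
    return "out_of_scope"          # e.g. local, kubernetes, claude-managed
```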

API Reference

Method	Path	Description
`POST`	`/api/v1/rag/indexes`	Create a new vector index
`GET`	`/api/v1/rag/indexes`	List all indexes (paginated)
`GET`	`/api/v1/rag/indexes/{id}`	Get index metadata
`DELETE`	`/api/v1/rag/indexes/{id}`	Delete index and all its chunks
`POST`	`/api/v1/rag/indexes/{id}/ingest`	Upload files and start ingestion
`GET`	`/api/v1/rag/indexes/{id}/ingest/{job_id}`	Poll ingestion job status
`POST`	`/api/v1/rag/search`	Hybrid search (vector + full-text)

agent.yaml `knowledge_bases` field

knowledge_bases:
  - ref: string          # required — registry reference (kb/name)
    backend_url: string  # optional — explicit pgvector DSN for cloud retrieval

Field	Type	Required	Description
`ref`	string	Yes	Registry reference in format `kb/{name}`
`backend_url`	string	No	Explicit cloud-reachable pgvector DSN. Exposed to the agent as `KB_PGVECTOR_DSN`; see Deploying with a managed pgvector store.

Next Steps

What	Where
Add tools to your agent	Tools →
Connect MCP servers	MCP Servers →
Register system prompts	Prompts →
Full agent.yaml fields	agent.yaml Reference →
