Knowledge Bases — Lifecycle & Registry
Create indexes, ingest documents, run hybrid search, and wire knowledge bases into agents.
Knowledge bases (RAG indexes) let agents retrieve context from your documents before answering.
AgentBreeder handles chunking, embedding, and hybrid search — you just upload files and reference the index in agent.yaml.
How It Works
Your documents (PDF, TXT, MD, CSV, JSON)
↓
Chunking (fixed-size or recursive)
↓
Embedding (text-embedding-3-small by default)
↓
Vector store (in-memory or pgvector)
↓
Hybrid search (vector + full-text, 70/30 default)
↓
Agent receives top-k chunks as context
Want ChromaDB pre-wired and seeded?
Run agentbreeder quickstart to start the full local stack with ChromaDB running on port 8001
and sample data already loaded into the agentbreeder_knowledge collection.
To load your own docs: agentbreeder seed --chromadb --docs ./my-docs/
Step 1 — Create an Index
Go to Registry → Knowledge Bases → New Index. Configure:
| Field | Default | Description |
|---|---|---|
| Name | required | Slug-friendly (e.g., product-docs) |
| Embedding model | openai/text-embedding-3-small | Model used to embed chunks |
| Chunk strategy | recursive | fixed_size or recursive (splits on semantic boundaries) |
| Chunk size | 512 tokens | Number of tokens per chunk |
| Chunk overlap | 64 tokens | Overlap between adjacent chunks |
Click Create Index.
curl -X POST http://localhost:8000/api/v1/rag/indexes \
-H "Content-Type: application/json" \
-d '{
"name": "product-docs",
"description": "Product documentation and FAQs",
"embedding_model": "openai/text-embedding-3-small",
"chunk_strategy": "recursive",
"chunk_size": 512,
"chunk_overlap": 64,
"source": "manual"
}'
Response:
{
"data": {
"id": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
"name": "product-docs",
"description": "Product documentation and FAQs",
"embedding_model": "openai/text-embedding-3-small",
"chunk_strategy": "recursive",
"chunk_size": 512,
"chunk_overlap": 64,
"document_count": 0,
"chunk_count": 0,
"status": "active",
"created_at": "2026-04-14T00:00:00Z"
}
}
Step 2 — Ingest Documents
Upload files to the index. Supported formats: .pdf, .txt, .md, .csv, .json.
Open the index → click Upload Documents → drag and drop files.
Studio shows a live ingestion progress bar:
✅ Chunking... 14 chunks from docs/product-guide.pdf
✅ Embedding... 14 chunks embedded
✅ Stored 14 chunks indexed
# Upload one or more files
curl -X POST http://localhost:8000/api/v1/rag/indexes/{index_id}/ingest \
-F "files=@docs/product-guide.pdf" \
-F "files=@docs/faq.md" \
-F "files=@data/pricing.csv"
Response (ingestion job):
{
"data": {
"id": "yyyyyyyy-yyyy-yyyy-yyyy-yyyyyyyyyyyy",
"index_id": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
"status": "processing",
"total_files": 3,
"processed_files": 0,
"total_chunks": 0,
"error": null,
"started_at": "2026-04-14T00:00:00Z",
"completed_at": null
}
}
Poll for completion:
GET /api/v1/rag/indexes/{index_id}/ingest/{job_id}
# status: "pending" | "processing" | "completed" | "failed"
Step 3 — Search the Index
Test retrieval before wiring the index to an agent.
curl -X POST http://localhost:8000/api/v1/rag/search \
-H "Content-Type: application/json" \
-d '{
"index_id": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
"query": "What is the refund policy for annual subscriptions?",
"top_k": 5,
"vector_weight": 0.7,
"text_weight": 0.3
}'
Response:
{
"data": {
"index_id": "xxxxxxxx-...",
"query": "What is the refund policy for annual subscriptions?",
"top_k": 5,
"results": [
{
"chunk_id": "chunk-001",
"text": "Annual subscribers are eligible for a full refund within 30 days...",
"metadata": { "source": "faq.md", "chunk_index": 12 },
"score": 0.94,
"similarity": 0.91
}
],
"total": 5
}
}
Score is the combined hybrid score (vector similarity × vector_weight + text score × text_weight).
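The weighting above can be sketched in a few lines of Python. This is an illustration of the stated formula only, not AgentBreeder's implementation: the function name `hybrid_score` and the `text_score` field are hypothetical (the search response only exposes `score` and `similarity`), and how the full-text score is normalized is not specified here.

```python
def hybrid_score(similarity: float, text_score: float,
                 vector_weight: float = 0.7, text_weight: float = 0.3) -> float:
    """Combine a vector similarity and a full-text score into one hybrid score."""
    return similarity * vector_weight + text_score * text_weight


# Hypothetical candidates: "text_score" is an illustrative field name,
# not something the search response returns.
candidates = [
    {"chunk_id": "chunk-a", "similarity": 0.91, "text_score": 0.80},
    {"chunk_id": "chunk-b", "similarity": 0.95, "text_score": 0.10},
]

# Rank by the combined score, highest first.
ranked = sorted(
    candidates,
    key=lambda c: hybrid_score(c["similarity"], c["text_score"]),
    reverse=True,
)
for c in ranked:
    print(c["chunk_id"], round(hybrid_score(c["similarity"], c["text_score"]), 3))
```

With the default 70/30 weights, chunk-a ranks first even though chunk-b has the higher vector similarity, because its keyword match is much stronger. Raising vector_weight shifts the balance back toward semantic similarity.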
Step 4 — Use in agent.yaml
name: support-agent
version: 1.0.0
framework: claude_sdk
knowledge_bases:
- ref: kb/product-docs # ← resolves from registry at deploy time
- ref: kb/return-policy
# At runtime, the agent automatically queries all attached
# knowledge bases and includes the top-k chunks as context
# before sending the user's message to the model.
Multiple knowledge bases
You can attach multiple knowledge bases to one agent. Each base is queried independently and the top-k results from all bases are merged and ranked before being included in the agent's context window.
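One way to picture the merge step is the sketch below. It assumes each base returns results shaped like the search response (a list of dicts with a `score`), that ranking is by `score` descending, and that the merged list is cut to a single overall `top_k`. Those details, and the name `merge_results`, are assumptions made for illustration rather than documented behavior.

```python
from typing import Dict, List


def merge_results(per_index: Dict[str, List[dict]], top_k: int = 5) -> List[dict]:
    """Merge per-index search results and keep the best top_k overall."""
    merged = []
    for index_name, results in per_index.items():
        for result in results:
            # Tag each chunk with the knowledge base it came from.
            merged.append({**result, "index": index_name})
    # Assumption: results are comparable by their hybrid score.
    merged.sort(key=lambda r: r["score"], reverse=True)
    return merged[:top_k]


per_index = {
    "kb/product-docs": [{"chunk_id": "p1", "score": 0.94}, {"chunk_id": "p2", "score": 0.60}],
    "kb/return-policy": [{"chunk_id": "r1", "score": 0.88}],
}
top = merge_results(per_index, top_k=2)
print([(r["index"], r["chunk_id"]) for r in top])
```

In this toy run the two highest-scoring chunks come from different bases, which is the practical benefit of querying each base independently before ranking.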
SDK Usage
from agenthub import Agent
agent = (
Agent("support-agent")
.with_framework("claude_sdk")
.with_knowledge_bases([
"kb/product-docs",
"kb/return-policy",
])
.with_deploy(cloud="aws", runtime="app-runner")
)
import { Agent } from "@agentbreeder/sdk";
const agent = new Agent("support-agent")
.withFramework("claude_sdk")
.withKnowledgeBases(["kb/product-docs", "kb/return-policy"])
.withDeploy({ cloud: "aws", runtime: "app-runner" });
Search Backends
AgentBreeder supports four retrieval approaches:
| Backend | How it works | When to use |
|---|---|---|
| Vector | Dense embedding similarity (cosine) | Semantic queries, paraphrase matching |
| Full-text | BM25 keyword search | Exact term matching, IDs, codes |
| Hybrid | Weighted combination of vector + full-text | Default — best of both worlds |
| Graph | Entity-relationship traversal over a knowledge graph | Linked entities, structured knowledge bases, multi-hop reasoning |
Hybrid (vector + full-text) is the default and works well for most document-based knowledge bases. Graph search is available when your knowledge base is structured as an entity graph.
Chunking Strategies
| Strategy | How it works | Best for |
|---|---|---|
| fixed_size | Split every N tokens, with a fixed token overlap (chunk_overlap) between adjacent chunks | Structured data, code, tables |
| recursive | Split on \n\n, then \n, then ., in that order; keeps paragraphs intact | Prose, documentation, FAQs |
Choosing the wrong strategy can hurt retrieval quality — recursive is the better default for most text.
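To make the difference concrete, here is a minimal sketch of both strategies. It is not AgentBreeder's chunker: tokens are approximated by whitespace-separated words, separators are dropped when splitting, and small pieces are not merged back together, all simplifications chosen for brevity. The function names are hypothetical.

```python
from typing import List, Sequence

# Coarsest separator first. ". " approximates the "." separator.
SEPARATORS = ("\n\n", "\n", ". ")


def fixed_size_chunks(tokens: List[str], chunk_size: int, overlap: int) -> List[List[str]]:
    """Slide a window of chunk_size tokens, stepping by chunk_size - overlap."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the last window already reached the end
    return chunks


def recursive_chunks(text: str, chunk_size: int,
                     separators: Sequence[str] = SEPARATORS) -> List[str]:
    """Split on the coarsest separator; descend only for pieces still too long."""
    if len(text.split()) <= chunk_size or not separators:
        return [text] if text.strip() else []
    head, rest = separators[0], separators[1:]
    chunks = []
    for piece in text.split(head):
        chunks.extend(recursive_chunks(piece, chunk_size, rest))
    return chunks


print(fixed_size_chunks(list("abcdefghij"), chunk_size=4, overlap=1))
doc = "One two three.\n\nFour five six seven. Eight nine ten eleven"
print(recursive_chunks(doc, chunk_size=4))
```

The fixed-size splitter cuts wherever the window lands, while the recursive one leaves the short first paragraph whole and only breaks the longer paragraph at a sentence boundary.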
Embedding Models
The embedding_model field accepts any provider/model-id string:
| Value | Dimensions | Notes |
|---|---|---|
| openai/text-embedding-3-small | 1536 | Default; best cost/quality balance |
| openai/text-embedding-3-large | 3072 | Higher quality, higher cost |
| ollama/nomic-embed-text | 768 | Local, no API key needed |
| ollama/mxbai-embed-large | 1024 | Larger local model |
Changing the embedding model
If you change embedding_model after ingesting documents, you must re-ingest all files. Vectors from different models are not compatible.
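A small guard illustrates why. The dimension values come from the table above; the helper names are hypothetical. The sketch treats any model change as requiring re-ingestion, because two models can share a dimension count and still place text in unrelated vector spaces.

```python
# Dimensions as listed in the Embedding Models table.
EMBEDDING_DIMENSIONS = {
    "openai/text-embedding-3-small": 1536,
    "openai/text-embedding-3-large": 3072,
    "ollama/nomic-embed-text": 768,
    "ollama/mxbai-embed-large": 1024,
}


def dimensions_match(stored_model: str, new_model: str) -> bool:
    """True when both models produce vectors of the same length."""
    return EMBEDDING_DIMENSIONS[stored_model] == EMBEDDING_DIMENSIONS[new_model]


def requires_reingest(stored_model: str, new_model: str) -> bool:
    """Any model change invalidates stored vectors, even if dimensions match."""
    return stored_model != new_model


old, new = "openai/text-embedding-3-small", "ollama/nomic-embed-text"
print(dimensions_match(old, new), requires_reingest(old, new))
```

Here the dimensions differ (1536 vs 768), so a query embedded with the new model could not even be compared against the stored vectors.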
RAG YAML Schema (standalone rag.yaml)
You can define knowledge bases as standalone YAML files in version control:
spec_version: v1
name: product-docs
version: 1.0.0
description: Product documentation and FAQs
team: customer-success
owner: alice@company.com
backend: in_memory # in_memory | pgvector — see "pgvector backend" below
embedding_model:
  provider: openai
  name: text-embedding-3-small
  dimensions: 1536
chunking:
  strategy: recursive # fixed_size | recursive
  chunk_size: 512
  chunk_overlap: 64
sources:
  - type: file
    path: "docs/**/*.md" # glob pattern
  - type: file
    path: "docs/**/*.pdf"
search:
  hybrid: true
  vector_weight: 0.7
  text_weight: 0.3
  default_top_k: 5
pgvector backend (HR-4 / #406)
The pgvector backend persists chunks + embeddings in PostgreSQL using the
pgvector extension. Use it whenever
you want durable storage that survives a server restart and is shared across
replicas.
Local development
The fastest path is the upstream Docker image:
docker run --name agentbreeder-pgvector \
-e POSTGRES_PASSWORD=pw \
-p 5432:5432 \
-d pgvector/pgvector:pg16
Then point AgentBreeder at it via env var:
export PGVECTOR_DSN="postgresql://postgres:pw@localhost:5432/postgres"
The backend self-installs the vector extension and creates a chunk table
on first connect — no manual SQL needed for the dev path. Production deploys
should still ship the DDL via alembic.
Configuration shape
backend: pgvector
backend_config:
  dsn: ${PGVECTOR_DSN} # or set PGVECTOR_DSN env, or DATABASE_URL
  pool_min_size: 1
  pool_max_size: 10
Status (as of 2026-05-19)
The backend itself ships full upsert + cosine-similarity search + delete
round-trip against a real Postgres (verified by
tests/integration/test_pgvector_testcontainers.py). The wire-through from
backend: pgvector in rag.yaml to the existing RAGStore.search() and
ingestion pipeline is staged as a follow-up — today the backend is callable
directly via api.services.pgvector_rag_backend.PgvectorRAGBackend.
Deploying with a managed pgvector store (cloud)
When you deploy an agent that declares knowledge_bases, retrieval runs inside
the agent container — where the local in-process index is empty. To retrieve in
the cloud, point the knowledge base at a managed Postgres + pgvector store with
backend_url:
knowledge_bases:
  - ref: kb/product-docs
    backend_url: postgresql://user:pass@db.internal:5432/agentbreeder
At deploy time the resolver exposes this to the container as two env vars:
| Env var | Source | Purpose |
|---|---|---|
| KB_PGVECTOR_DSN | knowledge_bases[].backend_url | Connection string the runtime uses to query pgvector |
| KB_EMBEDDING_MODEL | the index's embedding model | Embeds queries the same way the documents were ingested |
At invoke time the runtime embeds the user query with KB_EMBEDDING_MODEL, runs a
cosine-similarity search against the pgvector store for each attached index, and
prepends the top-k chunks as context — nothing is read from the machine you
deployed from.
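The invoke-time flow described above can be sketched as follows. This is a simplified stand-in, not the runtime's code: an in-memory list replaces the pgvector store, the `embed` callable is a hypothetical placeholder for whatever KB_EMBEDDING_MODEL resolves to, and the prompt layout (context, blank line, query) is an assumption.

```python
import math
from typing import Callable, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def build_prompt(query: str,
                 embed: Callable[[str], Sequence[float]],
                 chunks: List[Tuple[str, Sequence[float]]],
                 top_k: int = 5) -> str:
    """Embed the query, rank stored chunks by cosine similarity, prepend the best."""
    query_vector = embed(query)
    ranked = sorted(chunks,
                    key=lambda c: cosine_similarity(query_vector, c[1]),
                    reverse=True)
    context = "\n\n".join(text for text, _ in ranked[:top_k])
    return f"{context}\n\n{query}"


# Toy two-dimensional "embeddings" purely for illustration.
stored = [("Refunds are available within 30 days.", [0.9, 0.1]),
          ("Shipping takes 3 to 5 days.", [0.0, 1.0])]
prompt = build_prompt("What is the refund policy?", lambda q: [1.0, 0.0], stored, top_k=1)
print(prompt)
```

In a real deployment the ranking would happen inside Postgres via the pgvector extension rather than in Python; the sketch only shows the order of operations.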
Per-cloud pgvector notes
The vector extension must be available on the managed Postgres:
- AWS RDS: supported on PostgreSQL ≥ 15.2; the extension is created via rds_superuser (no parameter-group change needed).
- GCP Cloud SQL: pgvector is built in; connect over the instance's private IP.
- Azure Database for PostgreSQL Flexible Server: VECTOR must be in the azure.extensions server parameter before CREATE EXTENSION succeeds.
Automatic provisioning (no backend_url)
If you omit backend_url, agentbreeder deploy provisions a managed Postgres
for you and injects KB_PGVECTOR_DSN automatically:
knowledge_bases:
- ref: kb/product-docs # no backend_url → auto-provisioned
The store is created into the agent's own BYO network (the same VPC/subnets you already deploy into) so the agent reaches it over private networking:
- AWS: an RDS PostgreSQL instance in your AWS_VPC_SUBNETS, behind a dedicated DB security group that allows 5432 only from the agent's security group (never the internet). The VPC is derived from your subnets.
- GCP: a Cloud SQL instance with a private IP on your VPC network. The network must have Private Service Access configured.
- Azure: a PostgreSQL Flexible Server on a subnet delegated to Microsoft.DBforPostgreSQL/flexibleServers (set AZURE_DB_SUBNET_ID).
The DB password is generated at deploy time and written only to the cloud
secret manager (AWS Secrets Manager / GCP Secret Manager / Azure Key Vault) —
never to disk or the agent image. The provisioned footprint is recorded in
.agentbreeder/infra-state.json; agentbreeder teardown <agent> destroys it,
removing only what AgentBreeder created (the DB + its security group) and never
your VPC, subnets, or resource group.
backend_url always wins
An explicit backend_url short-circuits auto-provisioning — point it at a
Postgres you manage and AgentBreeder won't create one. Auto-provisioning applies
only to aws / gcp / azure; local, kubernetes, and claude-managed are
out of scope. Per-cloud SDK provisioning is validated against live clouds during
rollout.
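The precedence rule reads naturally as a small decision function. The sketch below only encodes what this section states; the function name `resolve_kb_store` and the returned labels are hypothetical and do not correspond to a real AgentBreeder API.

```python
from typing import Optional

# Clouds where auto-provisioning applies, per the text above.
AUTO_PROVISION_CLOUDS = {"aws", "gcp", "azure"}


def resolve_kb_store(cloud: str, backend_url: Optional[str]) -> str:
    """Decide where the knowledge base store comes from for a deploy target."""
    if backend_url:
        # An explicit DSN always wins; nothing is provisioned.
        return "explicit"
    if cloud in AUTO_PROVISION_CLOUDS:
        return "auto_provision"
    # local, kubernetes, claude-managed: auto-provisioning is out of scope.
    return "unsupported"


print(resolve_kb_store("aws", None))
print(resolve_kb_store("aws", "postgresql://user:pass@db.internal:5432/agentbreeder"))
print(resolve_kb_store("kubernetes", None))
```

Checking backend_url first is what makes it a short circuit: the target cloud is never consulted when a DSN is supplied.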
API Reference
| Method | Path | Description |
|---|---|---|
| POST | /api/v1/rag/indexes | Create a new vector index |
| GET | /api/v1/rag/indexes | List all indexes (paginated) |
| GET | /api/v1/rag/indexes/{id} | Get index metadata |
| DELETE | /api/v1/rag/indexes/{id} | Delete index and all its chunks |
| POST | /api/v1/rag/indexes/{id}/ingest | Upload files and start ingestion |
| GET | /api/v1/rag/indexes/{id}/ingest/{job_id} | Poll ingestion job status |
| POST | /api/v1/rag/search | Hybrid search (vector + full-text) |
agent.yaml knowledge_bases field
knowledge_bases:
  - ref: string # required; registry reference (kb/name)
    backend_url: string # optional; explicit pgvector DSN for cloud retrieval
| Field | Type | Required | Description |
|---|---|---|---|
| ref | string | Yes | Registry reference in format kb/{name} |
| backend_url | string | No | Explicit cloud-reachable pgvector DSN. Exposed to the agent as KB_PGVECTOR_DSN; see Deploying with a managed pgvector store. |
Next Steps
| What | Where |
|---|---|
| Add tools to your agent | Tools → |
| Connect MCP servers | MCP Servers → |
| Register system prompts | Prompts → |
| Full agent.yaml fields | agent.yaml Reference → |