pgVectorDB - PostgreSQL Vector Database
Welcome to the official developer documentation for pgVectorDB.
pgVectorDB is a production-ready Retrieval-Augmented Generation (RAG) orchestration layer built on PostgreSQL, pgvector, and langchain_postgres. It offers 10 distinct search methods, multi-embedding multimodal search, robust connection pooling, and multi-tenant metadata isolation — all without managing your own vector database infrastructure.
Why pgVectorDB?
| Feature | Description |
|---|---|
| 10 Search Methods | Including semantic, keyword (FTS/BM25), hybrid (RRF), trigram fuzzy, and ensemble |
| Multimodal Spaces | Multiple embeddings per document — text + price + category + recency |
| Infinite Scaling | HNSW (<1M), IVFFlat (10M), DiskANN (10M+) with label partitioning |
| 13 Filter Operators | MongoDB-style JSONB filtering ($eq, $between, $in, $and, $or, etc.) |
| Reranking | Cross-encoder, Cohere, AWS Bedrock, HuggingFace API |
| Statistical RAG Evaluation | Hit Rate, MRR, NDCG — benchmark any search pipeline |
| Production Diagnostics | Query plans, benchmarks, recall measurement, index health |
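
The hybrid search row refers to Reciprocal Rank Fusion (RRF). As a rough illustration of the general technique, and not of pgVectorDB's internal implementation, the sketch below fuses two ranked lists of document IDs. The function name, the sample IDs, and the constant `k=60` (a commonly used default) are assumptions made for this example.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into a single ranking.

    Each document receives 1 / (k + rank) from every list it appears in,
    where rank starts at 1. Hypothetical helper, not a pgVectorDB API.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Illustrative inputs: one list from a semantic search, one from a keyword search.
semantic = ["doc_a", "doc_b", "doc_c"]
keyword = ["doc_b", "doc_d", "doc_a"]
fused = reciprocal_rank_fusion([semantic, keyword])
print(fused)
```

Documents that rank well in both lists rise to the top, which is why RRF works without having to normalize the raw scores of the underlying search methods.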
Documentation Architecture
Getting Started
- Installation — Docker & Python setup
- Quickstart — Build your first RAG in 5 minutes
- Core Concepts — Architecture, mixin system, security design
User Guide
- Vector Store Operations — CRUD, batch ingestion, upsert, DiskANN labels
- Embeddings & Spaces — TextSpace, NumberSpace, CategorySpace, RecencySpace
- Multimodal Search — Multi-embedding RAG with weighted space fusion
- Search & Retrieval — All 10 search methods in depth
- Metadata Filtering — 13 filter operators with SQL translation
- Reranking — `rerank_search()`, cross-encoder, Cohere, Bedrock
- Metrics & Evaluation — Hit Rate, MRR, NDCG, A/B testing
- Analytics & Diagnostics — Stats, benchmarks, query plans, recall
- LangChain Integration — Native LangChain retriever & chains
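
To give a feel for what "filter operators with SQL translation" can mean, here is a minimal sketch that turns a MongoDB-style filter into a parameterized JSONB condition. It is not pgVectorDB's translator: the function `to_sql`, the column name `metadata`, and the handling of only `$eq`, `$in`, `$between`, `$and`, and `$or` are assumptions for illustration. The `%s` placeholders follow the style used by psycopg.

```python
import re

# Field names are interpolated into SQL, so only plain identifiers are allowed.
_FIELD = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def to_sql(filter_, column="metadata"):
    """Translate a small filter dict into (sql_fragment, params).

    Hypothetical helper covering a subset of operators only.
    """
    clauses, params = [], []
    for key, value in filter_.items():
        if key in ("$and", "$or"):
            parts = [to_sql(sub, column) for sub in value]
            joiner = " AND " if key == "$and" else " OR "
            clauses.append("(" + joiner.join(sql for sql, _ in parts) + ")")
            for _, sub_params in parts:
                params.extend(sub_params)
            continue
        if not _FIELD.match(key):
            raise ValueError(f"unsafe field name: {key!r}")
        # A bare value is shorthand for {"$eq": value}.
        condition = value if isinstance(value, dict) else {"$eq": value}
        for op, operand in condition.items():
            if op == "$eq":
                clauses.append(f"{column}->>'{key}' = %s")
                params.append(str(operand))
            elif op == "$in":
                clauses.append(f"{column}->>'{key}' = ANY(%s)")
                params.append([str(v) for v in operand])
            elif op == "$between":
                low, high = operand
                clauses.append(f"({column}->>'{key}')::numeric BETWEEN %s AND %s")
                params.extend([low, high])
            else:
                raise ValueError(f"unsupported operator: {op}")
    return " AND ".join(clauses), params


sql, params = to_sql(
    {"$and": [{"category": "books"}, {"price": {"$between": [10, 50]}}]}
)
print(sql)
print(params)
```

Values always travel as bound parameters, never as part of the SQL string, which keeps user-supplied filter values out of the query text.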
Advanced
- Indexing & Performance — HNSW, IVFFlat, DiskANN tuning + `maintenance_work_mem`
- Configuration — Connection pooling, schema isolation, environment configs
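
As a hedged illustration of what index tuning involves at the SQL level, the sketch below builds the statements that pgvector accepts for HNSW and IVFFlat indexes, preceded by a `maintenance_work_mem` setting for the build. The helper `build_index_statements`, the table name `documents`, the column name `embedding`, and the option values are assumptions for this example. pgVectorDB may expose index creation through its own API instead.

```python
def build_index_statements(table, column, method, options, maintenance_work_mem="2GB"):
    """Return SQL statements that raise build memory and create a pgvector index.

    Hypothetical helper. Identifiers are interpolated directly, so pass only
    trusted, hard-coded table and column names.
    """
    if method not in ("hnsw", "ivfflat"):
        raise ValueError(f"unsupported index method: {method}")
    with_clause = ", ".join(f"{key} = {value}" for key, value in options.items())
    return [
        # More memory for the session generally speeds up index builds.
        f"SET maintenance_work_mem = '{maintenance_work_mem}';",
        f"CREATE INDEX ON {table} USING {method} "
        f"({column} vector_cosine_ops) WITH ({with_clause});",
    ]


# HNSW is tuned with m and ef_construction; IVFFlat with the number of lists.
hnsw_sql = build_index_statements(
    "documents", "embedding", "hnsw", {"m": 16, "ef_construction": 64}
)
ivfflat_sql = build_index_statements(
    "documents", "embedding", "ivfflat", {"lists": 100}
)
for statement in hnsw_sql + ivfflat_sql:
    print(statement)
```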
API Reference
- pgVectorDB Core — Auto-generated class reference
- Spaces — VectorSpace, TextSpace, NumberSpace, CategorySpace, RecencySpace
- Rerankers — CrossEncoderReranker, CohereReranker, AWSBedrockReranker
- Metrics — RAGEvaluator, EvaluationDataset
- Configuration — Config, get_production_config, get_test_config
- Exceptions — RetrievalSystemError, InitializationError, DatabaseError
Examples & Notebooks
See the examples/ folder for full working notebooks:
| Notebook | Description |
|---|---|
| `01_quickstart.ipynb` | Basic RAG pipeline |
| `02_advanced_search.ipynb` | Hybrid search with RRF |
| `03_multimodal_search.ipynb` | Multi-embedding search with Spaces |
| `04_storage_optimization.ipynb` | DiskANN tuning and label filtering |
| `05_rag_evaluation.ipynb` | Metric evaluation with RAGEvaluator |
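
For readers new to the retrieval metrics named in this documentation, the following sketch computes Hit Rate, MRR, and binary-relevance NDCG from scratch. It does not use or reproduce `RAGEvaluator`. The function names, the sample document IDs, and the binary relevance assumption are all chosen for this example.

```python
import math


def hit_rate(results, relevant, k):
    """Fraction of queries with at least one relevant document in the top k."""
    hits = sum(1 for ranked, rel in zip(results, relevant) if set(ranked[:k]) & rel)
    return hits / len(results)


def mrr(results, relevant):
    """Mean reciprocal rank of the first relevant document (0 if none is found)."""
    total = 0.0
    for ranked, rel in zip(results, relevant):
        for rank, doc_id in enumerate(ranked, start=1):
            if doc_id in rel:
                total += 1.0 / rank
                break
    return total / len(results)


def ndcg(ranked, rel, k):
    """Binary-relevance NDCG@k for a single query."""
    dcg = sum(
        1.0 / math.log2(i + 1)
        for i, doc_id in enumerate(ranked[:k], start=1)
        if doc_id in rel
    )
    ideal = sum(1.0 / math.log2(i + 1) for i in range(1, min(len(rel), k) + 1))
    return dcg / ideal if ideal else 0.0


# Two illustrative queries: retrieved IDs in rank order, plus the relevant IDs.
results = [["a", "b", "c"], ["x", "y", "z"]]
relevant = [{"b"}, {"q"}]

print(hit_rate(results, relevant, k=3))
print(mrr(results, relevant))
print(ndcg(results[0], relevant[0], k=3))
```

Running the same functions over the output of two different search configurations gives a simple basis for an A/B comparison.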
Status
- Version: `0.0.5`
- Status: Production-Ready
Report Issues
Found a bug? Open an issue on GitHub