pgVectorDB - PostgreSQL Vector Database

Welcome to the official developer documentation for pgVectorDB.

pgVectorDB is a production-ready Retrieval-Augmented Generation (RAG) orchestration layer built on PostgreSQL, pgvector, and langchain_postgres. It offers 10 distinct search methods, multi-embedding multimodal search, robust connection pooling, and multi-tenant metadata isolation — all without managing your own vector database infrastructure.


Why pgVectorDB?

| Feature | Description |
| --- | --- |
| 10 Search Methods | Semantic, keyword (FTS/BM25), hybrid (RRF), trigram fuzzy, ensemble |
| Multimodal Spaces | Multiple embeddings per document: text + price + category + recency |
| Infinite Scaling | HNSW (<1M), IVFFlat (10M), DiskANN (10M+) with label partitioning |
| 13 Filter Operators | MongoDB-style JSONB filtering ($eq, $between, $in, $and, $or, etc.) |
| Reranking | Cross-encoder, Cohere, AWS Bedrock, HuggingFace API |
| Statistical RAG Evaluation | Hit Rate, MRR, NDCG for benchmarking any search pipeline |
| Production Diagnostics | Query plans, benchmarks, recall measurement, index health |
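
To make the "hybrid (RRF)" entry concrete, here is a minimal sketch of Reciprocal Rank Fusion in plain Python. It is not pgVectorDB's API: the function name `reciprocal_rank_fusion`, the constant `k=60`, and the sample document IDs are assumptions chosen for illustration. The idea is that each result list (for example one from semantic search and one from keyword search) contributes `1 / (k + rank)` to a document's score, and the fused ranking sorts by the summed score.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Hypothetical helper for illustration; k=60 is a commonly used default,
    not a value taken from pgVectorDB.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        # Ranks are 1-based: the top result contributes 1 / (k + 1).
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Made-up result lists standing in for a semantic and a keyword search.
semantic = ["doc_a", "doc_b", "doc_c"]
keyword = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([semantic, keyword])
print(fused)
```

Documents that appear in both lists (`doc_a`, `doc_b`) rise above documents that appear in only one, which is the behavior hybrid search relies on.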

Documentation Architecture

Getting Started

User Guide

Advanced

API Reference

  • pgVectorDB Core — Auto-generated class reference
  • Spaces — VectorSpace, TextSpace, NumberSpace, CategorySpace, RecencySpace
  • Rerankers — CrossEncoderReranker, CohereReranker, AWSBedrockReranker
  • Metrics — RAGEvaluator, EvaluationDataset
  • Configuration — Config, get_production_config, get_test_config
  • Exceptions — RetrievalSystemError, InitializationError, DatabaseError

Examples & Notebooks

See the examples/ folder for full working notebooks:

| Notebook | Description |
| --- | --- |
| 01_quickstart.ipynb | Basic RAG pipeline |
| 02_advanced_search.ipynb | Hybrid search with RRF |
| 03_multimodal_search.ipynb | Multi-embedding search with Spaces |
| 04_storage_optimization.ipynb | DiskANN tuning and label filtering |
| 05_rag_evaluation.ipynb | Metric evaluation with RAGEvaluator |
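
For readers who want to see what the evaluation metrics named on this page measure before opening the notebook, the following is a rough sketch of Hit Rate, MRR, and NDCG in plain Python. It does not use RAGEvaluator or any pgVectorDB API; the function names, the binary-relevance assumption, and the sample data are all hypothetical.

```python
import math


def hit_rate(results, relevant, k):
    """Fraction of queries with at least one relevant document in the top k."""
    hits = sum(
        1
        for ranked, rel in zip(results, relevant)
        if any(doc in rel for doc in ranked[:k])
    )
    return hits / len(results)


def mrr(results, relevant):
    """Mean reciprocal rank of the first relevant document per query."""
    total = 0.0
    for ranked, rel in zip(results, relevant):
        for rank, doc in enumerate(ranked, start=1):
            if doc in rel:
                total += 1.0 / rank
                break  # only the first relevant hit counts
    return total / len(results)


def ndcg(ranked, rel, k):
    """Binary-relevance NDCG@k for a single query."""
    dcg = sum(
        1.0 / math.log2(i + 1)
        for i, doc in enumerate(ranked[:k], start=1)
        if doc in rel
    )
    # Ideal ordering places every relevant document at the top.
    ideal = sum(1.0 / math.log2(i + 1) for i in range(1, min(len(rel), k) + 1))
    return dcg / ideal if ideal else 0.0


# Two made-up queries: ranked results and the set of relevant IDs for each.
results = [["a", "b", "c"], ["d", "e", "f"]]
relevant = [{"b"}, {"x"}]

print(hit_rate(results, relevant, k=3))
print(mrr(results, relevant))
print(ndcg(results[0], relevant[0], k=3))
```

Hit Rate only asks whether anything relevant was retrieved, while MRR and NDCG also reward placing relevant documents near the top of the list.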

Status

  • Version: 0.0.5
  • Status: Production-Ready

Report Issues

Found a bug? Open an issue on GitHub.