Metadata-Version: 2.4
Name: minivecdb
Version: 1.0.0
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Rust
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Database
Requires-Dist: langchain-core>=0.3 ; extra == 'dev'
Requires-Dist: pytest>=8 ; extra == 'dev'
Requires-Dist: freezegun>=1.4 ; extra == 'dev'
Requires-Dist: langchain-core>=0.3 ; extra == 'langchain'
Provides-Extra: dev
Provides-Extra: langchain
Summary: Ultra-fast 1-bit quantised vector database with TTL GC — Rust native extension
Keywords: vector-database,embeddings,hnsw,rag,rust,langchain,llm
Author: Alessandro Saladino
License: MIT
Requires-Python: >=3.9
Description-Content-Type: text/markdown; charset=UTF-8; variant=GFM
Project-URL: Homepage, https://github.com/Alekkk777/MiniVecDb
Project-URL: Issues, https://github.com/Alekkk777/MiniVecDb/issues
Project-URL: Repository, https://github.com/Alekkk777/MiniVecDb

# minivecdb

**Ultra-fast 1-bit quantised vector database with TTL garbage collection — Rust native extension for Python.**

```bash
pip install minivecdb
pip install "minivecdb[langchain]"  # + LangChain adapter
```

Same core as [`@microvecdb/core`](https://github.com/Alekkk777/MiniVecDb) (browser / Node.js) — compiled from the same Rust codebase via PyO3 + Maturin.

## Quick start

```python
import time

import numpy as np
from minivecdb import MiniVecDb

db = MiniVecDb(capacity=10_000)

# Random 384-dim embedding, normalised to unit length
vec = np.random.randn(384).astype(np.float32)
vec /= np.linalg.norm(vec)

# inserted_at is a wall-clock timestamp in milliseconds
db.insert(id=0, vector=vec.tolist(), inserted_at=time.time() * 1000)
db.build_index(m=16, ef_construction=200)

# Querying with the inserted vector returns it as the top hit
results = db.search(vec.tolist(), limit=5)
# → [{"id": 0, "score": 1.0, "distance": 0}]
```

## LangChain adapter

```python
from langchain_openai import OpenAIEmbeddings
from minivecdb.langchain import LangChainMiniVecDb

# Ephemeral agent scratchpad — 10 min TTL
with LangChainMiniVecDb(
    embedding=OpenAIEmbeddings(model="text-embedding-3-small"),
    ttl_minutes=10,
    gc_interval_sec=30,
) as memory:
    memory.add_texts(["User mentioned ticket #42-ABC."])
    docs = memory.similarity_search("what is the ticket number?", k=3)
```

## TTL & GC

When `ttl_minutes` is greater than 0, every added text gets a wall-clock TTL, and a daemon GC thread automatically tombstones vectors once they expire. Setting `ttl_minutes=0` (the default) disables expiry.

```python
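# `store` is an existing store instance, such as `memory` in the LangChain example above
count = store.run_gc()  # manual GC cycle; returns the number of tombstoned vectors
store.destroy()         # stop the GC thread and free native memory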
```
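
To make the tombstoning idea concrete, here is a minimal pure-Python sketch of a TTL sweep. It is a toy model under stated assumptions, not minivecdb's implementation: the `TtlIndex` class, its fields, and the explicit `now_ms` argument are hypothetical and exist only for illustration.

```python
class TtlIndex:
    """Toy TTL bookkeeping (hypothetical; not the library's internals)."""

    def __init__(self, ttl_ms):
        self.ttl_ms = ttl_ms
        self.inserted_at = {}    # id -> insertion time in milliseconds
        self.tombstones = set()  # ids marked as expired

    def insert(self, id, inserted_at_ms):
        self.inserted_at[id] = inserted_at_ms

    def run_gc(self, now_ms):
        """Tombstone live entries older than the TTL; return how many were marked."""
        if self.ttl_ms <= 0:  # a TTL of 0 disables expiry
            return 0
        expired = [
            i for i, t in self.inserted_at.items()
            if i not in self.tombstones and now_ms - t >= self.ttl_ms
        ]
        self.tombstones.update(expired)
        return len(expired)

    def live_ids(self):
        return sorted(i for i in self.inserted_at if i not in self.tombstones)


TEN_MINUTES_MS = 10 * 60 * 1000
index = TtlIndex(ttl_ms=TEN_MINUTES_MS)
index.insert(0, inserted_at_ms=0)              # expires at the 10-minute mark
index.insert(1, inserted_at_ms=9 * 60 * 1000)  # still fresh at the 10-minute mark
swept = index.run_gc(now_ms=TEN_MINUTES_MS)    # only id 0 is tombstoned
```

Tombstoning marks entries as dead rather than removing them immediately, so a sweep stays cheap and the index structure does not need to be rebuilt on every expiry.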

## Benchmarks

| Metric | Result |
|---|---|
| Search latency (10k vectors) | 0.08 ms |
| RAM per vector (384-dim) | 48 B (vs 1,536 B f32) |
| Recall@5 (sentence embeddings) | 100% |
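
The 48 B figure is consistent with storing one bit per dimension: 384 bits / 8 = 48 bytes, against 384 × 4 = 1,536 bytes for float32. The sketch below illustrates that arithmetic with sign-based quantisation and Hamming distance in pure Python. It assumes the common sign-bit scheme and is not taken from minivecdb's source; `quantise_1bit` and `hamming` are hypothetical helper names.

```python
import math
import random


def quantise_1bit(vector):
    """Pack the sign of each component into bytes, one bit per dimension."""
    packed = bytearray(math.ceil(len(vector) / 8))
    for i, x in enumerate(vector):
        if x > 0:
            packed[i // 8] |= 1 << (i % 8)
    return bytes(packed)


def hamming(a, b):
    """Count the differing bits between two equal-length byte strings."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


random.seed(0)
vec = [random.gauss(0, 1) for _ in range(384)]

code = quantise_1bit(vec)
packed_bytes = len(code)             # 48 bytes for 384 dimensions
float32_bytes = len(vec) * 4         # 1536 bytes as float32
self_distance = hamming(code, code)  # 0: identical codes differ in no bits
```

A Hamming distance of 0 for identical codes matches the `"distance": 0` shown in the quick start result, although how the library derives `score` from the distance is not specified here.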

## License

MIT

