Metadata-Version: 2.4
Name: vektra-langchain
Version: 0.1.0
Summary: LangChain VectorStore integration for VectorDB (Vektra)
Project-URL: Homepage, https://gitlab.fswise.com.br/wi/vektra/langchain-connector
Project-URL: Repository, https://gitlab.fswise.com.br/wi/vektra/langchain-connector
Project-URL: Issues, https://gitlab.fswise.com.br/wi/vektra/langchain-connector/-/issues
License: MIT
Keywords: embeddings,langchain,rag,vectordb,vectorstore,vektra
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.10
Requires-Dist: httpx>=0.27.0
Requires-Dist: langchain-core>=0.2.0
Provides-Extra: dev
Requires-Dist: langchain-openai>=0.1.0; extra == 'dev'
Requires-Dist: pytest-cov>=5.0; extra == 'dev'
Requires-Dist: pytest>=8.0; extra == 'dev'
Description-Content-Type: text/markdown

# vektra-langchain

LangChain `VectorStore` integration for **VectorDB**.

## Installation

```bash
pip install vektra-langchain
```

## Quick start

```python
from vektra_langchain import VectorDBVectorStore

store = VectorDBVectorStore(
    api_key="sk_live_your_key",
    base_url="http://localhost:8888",
    collection="my_docs",
    embedding_model="text-embedding-3-small",
)

# Insert texts with metadata
ids = store.add_texts(
    texts=["LangChain is great", "VectorDB is fast"],
    metadatas=[{"source": "docs"}, {"source": "docs"}],
)

# Search
docs = store.similarity_search("What is LangChain?", k=2)
```

## Constructor parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| `api_key` | `str` | **required** | Bearer token (`sk_live_...`) |
| `base_url` | `str` | **required** | VectorDB API URL |
| `collection` | `str` | `"default"` | Collection name |
| `embedding_model` | `str \| None` | `None` | Server-side model name |
| `embeddings` | `Embeddings \| None` | `None` | LangChain client-side embeddings |
| `k` | `int` | `4` | Default result count |
| `timeout` | `float` | `30.0` | HTTP timeout (seconds) |

## Embedding modes

### Server-side (default)
Pass `embedding_model`. The raw text is sent to VectorDB, which computes the embeddings itself.

```python
store = VectorDBVectorStore(
    api_key="sk_live_...",
    base_url="http://localhost:8888",
    embedding_model="text-embedding-3-small",
)
```

### Client-side
Pass a LangChain `Embeddings` object. Vectors are computed locally and sent to VectorDB as lists of floats.

```python
from langchain_openai import OpenAIEmbeddings

store = VectorDBVectorStore(
    api_key="sk_live_...",
    base_url="http://localhost:8888",
    embeddings=OpenAIEmbeddings(),
)
```
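
For offline tests where calling a hosted embedding API is undesirable, a deterministic stand-in can be handy. The sketch below is a hypothetical `HashEmbeddings` class (not part of vektra-langchain) exposing `embed_documents` and `embed_query`, the two methods of LangChain's `Embeddings` interface. It is duck-typed and stdlib-only; to pass it as `embeddings=` you would probably want it to subclass `langchain_core.embeddings.Embeddings`, which is an assumption about how the store validates its argument. The vectors carry no semantic meaning.

```python
import hashlib
import math
from typing import List


class HashEmbeddings:
    """Hypothetical toy embedder: deterministic, unit-length, not semantic."""

    def __init__(self, size: int = 8) -> None:
        # A SHA-256 digest has 32 bytes, so that is the largest size supported.
        if not 1 <= size <= 32:
            raise ValueError("size must be between 1 and 32")
        self.size = size

    def _embed(self, text: str) -> List[float]:
        digest = hashlib.sha256(text.encode("utf-8")).digest()
        raw = [digest[i] / 255.0 for i in range(self.size)]
        # Normalize to unit length; fall back to 1.0 for an all-zero vector.
        norm = math.sqrt(sum(x * x for x in raw)) or 1.0
        return [x / norm for x in raw]

    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        return [self._embed(t) for t in texts]

    def embed_query(self, text: str) -> List[float]:
        return self._embed(text)
```

The same input text always maps to the same vector, which keeps test assertions stable across runs.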

## Methods

### `add_texts`

```python
from typing import List

ids: List[str] = store.add_texts(
    texts=["doc1", "doc2"],
    metadatas=[{"source": "a"}, {"source": "b"}],
)
```

### `similarity_search`

```python
from typing import List

from langchain_core.documents import Document

docs: List[Document] = store.similarity_search(
    query="your question",
    k=4,
    filter={"source": "wiki"},   # optional metadata filter
)
```

### `similarity_search_with_score`

```python
from typing import List, Tuple

from langchain_core.documents import Document

results: List[Tuple[Document, float]] = store.similarity_search_with_score(
    query="your question",
    k=4,
)
for doc, score in results:
    print(score, doc.page_content)
```

### `similarity_search_by_vector`

```python
docs = store.similarity_search_by_vector(
    embedding=[0.1, 0.2, ...],
    k=4,
)
```

### `delete`

```python
from typing import Optional

# Returns True if all IDs were deleted, False if some were not found
ok: Optional[bool] = store.delete(ids=["id1", "id2"])
```

### `from_texts` (class method)

```python
store = VectorDBVectorStore.from_texts(
    texts=["doc1", "doc2"],
    api_key="sk_live_...",
    base_url="http://localhost:8888",
    embedding_model="text-embedding-3-small",
)
```

## Using as a LangChain retriever

```python
# The chain below needs the `langchain` and `langchain-openai` packages,
# which are not runtime dependencies of vektra-langchain.
from langchain.chains import RetrievalQA
from langchain_openai import ChatOpenAI

retriever = store.as_retriever(search_kwargs={"k": 4})

# Inside a RAG chain
qa = RetrievalQA.from_chain_type(llm=ChatOpenAI(), retriever=retriever)
answer = qa.invoke({"query": "What is VectorDB?"})
```

## Service helpers

`vektra_langchain.service` provides factory helpers on top of `VectorDBVectorStore`
for the two main LangChain flows: ingestion and retrieval.

```python
from vektra_langchain import service

store = service.get_vectorstore(
    collection="my_docs",
    api_key="sk_live_your_key",
    embedding_model="all-MiniLM-L6-v2",
)
ids = store.add_texts(["doc1", "doc2"])
docs = store.similarity_search("query", k=4)

# Or get a LangChain retriever for RAG chains:
retriever = service.get_retriever(
    collection="my_docs",
    api_key="sk_live_your_key",
    embedding_model="all-MiniLM-L6-v2",
    k=4,
)
```

`base_url` defaults to the `VECTORDB_BASE_URL` environment variable (falling back to
`http://localhost:8888`) when not passed explicitly.
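
The lookup order described above (explicit argument, then environment variable, then local default) can be sketched as follows. This is an illustration of the documented behavior, not the helper's actual source; `resolve_base_url` and `DEFAULT_BASE_URL` are hypothetical names.

```python
import os
from typing import Optional

# Fallback documented above; the constant name is made up for this sketch.
DEFAULT_BASE_URL = "http://localhost:8888"


def resolve_base_url(base_url: Optional[str] = None) -> str:
    """Return the explicit URL if given, else VECTORDB_BASE_URL, else the default."""
    if base_url:
        return base_url
    return os.environ.get("VECTORDB_BASE_URL", DEFAULT_BASE_URL)
```

Setting `VECTORDB_BASE_URL` once per environment lets the same ingestion and retrieval code run unchanged against local and deployed instances.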

## Error handling

```python
from vektra_langchain import VectorDBError

try:
    store.add_texts(["hello"])
except VectorDBError as e:
    print(e.status_code, e.detail)   # e.g. 401, "Invalid or missing API key."
```
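
Since the exception exposes `status_code`, transient failures can be separated from permanent ones. The following is a minimal sketch of a retry wrapper, assuming that statuses such as 429 and 5xx are worth retrying in your deployment; `call_with_retry` and `RETRYABLE_STATUSES` are hypothetical names, not part of vektra-langchain. It catches `Exception` only to stay self-contained; in real code you would likely catch `VectorDBError` instead.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")

# Assumption: these HTTP statuses indicate a transient problem.
RETRYABLE_STATUSES = {429, 500, 502, 503, 504}


def call_with_retry(
    fn: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying with exponential backoff on transient status codes."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception as exc:  # narrow this to VectorDBError in real code
            status = getattr(exc, "status_code", None)
            last_attempt = attempt == attempts - 1
            if status not in RETRYABLE_STATUSES or last_attempt:
                raise
            # Delays grow as base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** attempt)
    raise RuntimeError("attempts must be at least 1")
```

A call such as `call_with_retry(lambda: store.add_texts(["hello"]))` would then retry a 503 but re-raise a 401 immediately, since retrying an invalid API key cannot succeed.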

## Running tests

```bash
pip install -e ".[dev]"
pytest tests/ -v --cov=vektra_langchain
```

## Releasing / Publishing to PyPI

Releases are published automatically by GitLab CI/CD whenever a tag matching
`vX.Y.Z` is pushed:

```bash
# 1. Bump the version in pyproject.toml and commit it.
# 2. Tag and push:
git tag v0.1.0
git push origin v0.1.0
```

The pipeline builds the package and uploads it to PyPI using the
`PYPI_API_TOKEN` CI/CD variable, which is masked and protected and therefore
only available to pipelines running on protected tags.
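
Because the release is triggered by the tag alone, a tag that does not match the bumped version is easy to push by accident. A small pre-push check might look like the sketch below. It assumes `X`, `Y` and `Z` are plain decimal numbers; the exact pattern used by the CI pipeline is not shown here, and `tag_matches_version` is a hypothetical helper.

```python
import re

# Assumed shape of a release tag: "v" followed by three dot-separated integers.
TAG_PATTERN = re.compile(r"v(\d+)\.(\d+)\.(\d+)")


def tag_matches_version(tag: str, version: str) -> bool:
    """True if tag has the vX.Y.Z shape and equals 'v' + version."""
    return TAG_PATTERN.fullmatch(tag) is not None and tag == f"v{version}"
```

For example, `tag_matches_version("v0.1.0", "0.1.0")` holds, while a tag of `0.1.0` (missing the `v`) or `v0.2.0` against version `0.1.0` does not.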

Manual fallback, if ever needed:

```bash
pip install build twine
python -m build
twine upload dist/*
```

## License

MIT
