Metadata-Version: 2.4
Name: langchain-actian-vectorai
Version: 1.0.0
Summary: An integration package connecting Actian VectorAI DB and LangChain
Project-URL: Documentation, https://docs.vectoraidb.actian.com
License: MIT
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: <4.0.0,>=3.10.0
Requires-Dist: actian-vectorai-client<2.0.0,>=1.0.0
Requires-Dist: langchain-core>=1.2
Requires-Dist: numpy>=1.26.0
Requires-Dist: pydantic<3.0.0,>=2.7.4
Provides-Extra: dev
Requires-Dist: langchain-tests>=0.3.0; extra == 'dev'
Requires-Dist: pytest-asyncio<1.0.0,>=0.21.1; extra == 'dev'
Requires-Dist: pytest-mock<4.0.0,>=3.10.0; extra == 'dev'
Requires-Dist: pytest<8.0.0,>=7.3.0; extra == 'dev'
Requires-Dist: ruff>=0.4.0; extra == 'dev'
Description-Content-Type: text/markdown

# langchain-actian-vectorai

LangChain VectorStore integration for [Actian VectorAI DB](https://docs.vectoraidb.actian.com).

## Installation

```bash
pip install langchain-actian-vectorai
```

## Quick Start

```python
from actian_vectorai import VectorAIClient, VectorParams, Distance
from langchain_actian_vectorai import ActianVectorAIVectorStore
from langchain_openai import OpenAIEmbeddings

client = VectorAIClient("localhost:6574")
client.connect()

collection_name = "my_collection"

# Drop any stale collection from previous runs
if client.collections.exists(collection_name):
    client.collections.delete(collection_name)

client.collections.create(
    collection_name,
    # size must match your embedding model's dimensionality
    # (1536 for the default OpenAIEmbeddings model)
    vectors_config=VectorParams(size=1536, distance=Distance.Cosine),
)

store = ActianVectorAIVectorStore(
    client=client,
    collection_name=collection_name,
    embedding=OpenAIEmbeddings(),
)

ids = store.add_texts(["hello world", "goodbye world"])
results = store.similarity_search("hello", k=1)
```

> **Note:** Document IDs can be provided as UUID strings. If not provided,
> a UUID will be automatically generated for each document.

## Create from Texts

Use `from_texts` to create a vector store, set up a collection, and add texts
in a single call. IDs can be provided as UUID strings; if omitted, UUIDs are
automatically generated:

```python
from uuid import uuid4

store = ActianVectorAIVectorStore.from_texts(
    texts=["the cat sat on the mat", "the dog played in the park"],
    embedding=OpenAIEmbeddings(),
    metadatas=[{"source": "book"}, {"source": "article"}],
    ids=[uuid4().hex, uuid4().hex],  # optional — UUIDs auto-generated if omitted
    collection_name="my_collection",
    url="localhost:6574",
)
```

## Create from Documents

Use `from_documents` to create a vector store from LangChain `Document` objects.
Document IDs can be provided as UUID strings via `Document.id`; if not provided,
UUIDs are automatically generated:

```python
from langchain_core.documents import Document
from uuid import uuid4

docs = [
    Document(page_content="foo", metadata={"baz": "bar"}, id=uuid4().hex),
    Document(page_content="thud", metadata={"bar": "baz"}),  # UUID auto-generated
]
store = ActianVectorAIVectorStore.from_documents(
    documents=docs,
    embedding=OpenAIEmbeddings(),
    collection_name="my_collection",
    url="localhost:6574",
)
```

## Connect to Existing Collection

Use `from_existing_collection` to connect to a collection that already exists
on the server, without adding any data. The collection is verified to exist
before returning:

```python
store = ActianVectorAIVectorStore.from_existing_collection(
    collection_name="my_collection",
    embedding=OpenAIEmbeddings(),
    url="localhost:6574",
)

# The store is ready for search, retrieval, and mutation
results = store.similarity_search("hello", k=3)
ids = store.add_texts(["new document"])
```

## Custom Payload Keys

By default, document content is stored under `"page_content"` and metadata
under `"metadata"` in the VectorAI point payload:

```json
{
    "page_content": "Lorem ipsum dolor sit amet",
    "metadata": {
        "foo": "bar"
    }
}
```

You can override these keys with `content_payload_key` and
`metadata_payload_key` to work with collections that use a different payload
schema — for example, a legacy collection or one shared with another system:

```python
# Connect with non-standard payload keys
store = ActianVectorAIVectorStore.from_existing_collection(
    collection_name="legacy_archive",
    embedding=OpenAIEmbeddings(),
    url="localhost:6574",
    content_payload_key="blog_text",
    metadata_payload_key="extra_info",
)

# All operations honour the custom keys automatically
results = store.similarity_search("hello", k=3)
ids = store.add_texts(
    ["new post"],
    metadatas=[{"author": "Alice"}],
)
docs = store.get_by_ids(ids)
```

The custom keys are supported across all class methods
(`from_texts`, `from_documents`, `afrom_texts`, `afrom_documents`,
`construct_instance`, `from_existing_collection`) and all search / retrieval
operations.

> **Note:** `content_payload_key` and `metadata_payload_key` must be
> different. Passing the same value for both raises `ValueError`.
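
The constraint is easy to demonstrate in isolation. The helper below is hypothetical — it is not part of the package — and simply mirrors the documented validation rule:

```python
def check_payload_keys(content_key: str, metadata_key: str) -> None:
    # Hypothetical helper, not package API: mirrors the documented
    # constraint that the two payload keys must differ.
    if content_key == metadata_key:
        raise ValueError(
            "content_payload_key and metadata_payload_key must be different"
        )

check_payload_keys("blog_text", "extra_info")  # distinct keys — fine
```

Passing the same value for both (e.g. `check_payload_keys("payload", "payload")`) raises `ValueError`, just as the store constructor does.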

## Async: Create from Texts

Use `afrom_texts` to asynchronously create a vector store, set up a collection,
and add texts. This uses `AsyncVectorAIClient` under the hood for non-blocking
operations:

```python
store = await ActianVectorAIVectorStore.afrom_texts(
    texts=["the cat sat on the mat", "the dog played in the park"],
    embedding=OpenAIEmbeddings(),
    metadatas=[{"source": "book"}, {"source": "article"}],
    collection_name="my_collection",
    url="localhost:6574",
)

# The returned store supports all async operations
results = await store.asimilarity_search("cat", k=2)
```

## Async: Create from Documents

Use `afrom_documents` to asynchronously create a vector store from LangChain
`Document` objects. Document IDs are preserved when set:

```python
from langchain_core.documents import Document

docs = [
    Document(page_content="foo", metadata={"baz": "bar"}, id="a1b2c3d4e5f6a1b2c3d4e5f6a1b2c3d4"),
    Document(page_content="thud", metadata={"bar": "baz"}, id="b2c3d4e5f6a1b2c3d4e5f6a1b2c3d4e5"),
]
store = await ActianVectorAIVectorStore.afrom_documents(
    documents=docs,
    embedding=OpenAIEmbeddings(),
    collection_name="my_collection",
    url="localhost:6574",
)

# Search, delete, and retrieve all work asynchronously
results = await store.asimilarity_search("foo", k=1)
await store.adelete(ids=["a1b2c3d4e5f6a1b2c3d4e5f6a1b2c3d4"])
```

## Get by IDs

Retrieve documents by their IDs:

```python
# Retrieve documents by ID
docs = store.get_by_ids(["a1b2c3d4e5f6a1b2c3d4e5f6a1b2c3d4", "b2c3d4e5f6a1b2c3d4e5f6a1b2c3d4e5"])
for doc in docs:
    print(doc.page_content, doc.metadata)

# Returns an empty list if no IDs are provided or none found
docs = store.get_by_ids([])

# Async variant
docs = await store.aget_by_ids(["a1b2c3d4e5f6a1b2c3d4e5f6a1b2c3d4", "b2c3d4e5f6a1b2c3d4e5f6a1b2c3d4e5"])
```

## Delete by Filter

Remove documents matching a metadata filter without knowing their IDs.
Accepts the same dict-style filters used by search methods, or a native
`Filter` object from `actian_vectorai.FilterBuilder`:

```python
# Dict-style filter — delete all documents with category "obsolete"
store.delete_by_filter({"category": "obsolete"})

# Comparison operators work too
store.delete_by_filter({"year": {"$lt": 2020}})

# Native FilterBuilder for advanced conditions
from actian_vectorai import Field, FilterBuilder

f = FilterBuilder().must(Field("metadata.category").eq("C")).build()
store.delete_by_filter(f)

# Async variant
await store.adelete_by_filter({"status": "archived"})
```

> **Note:** Passing `None` as the filter raises `ValueError` to prevent
> accidental deletion of all documents.

## Similarity Search

```python
# Basic search (returns list of Document)
results = store.similarity_search("hello", k=4)

# General search (returns list of Document, supports all search types)
results = store.search("hello", k=4)

# Search with scores
results = store.similarity_search_with_score("hello", k=4)
for doc, score in results:
    print(f"[{score:.3f}] {doc.page_content}")

# Search with relevance scores (normalized to [0, 1])
results = store.similarity_search_with_relevance_scores("hello", k=4)

# Search by embedding vector with scores
vec = store.embeddings.embed_query("hello")
results = store.similarity_search_with_score_by_vector(vec, k=4)
for doc, score in results:
    print(f"[{score:.3f}] {doc.page_content}")

# Async search
results = await store.asimilarity_search("hello", k=4)
results = await store.asearch("hello", k=4)
results = await store.asimilarity_search_with_score("hello", k=4)
results = await store.asimilarity_search_with_relevance_scores("hello", k=4)
results = await store.asimilarity_search_with_score_by_vector(vec, k=4)
```

### Score Threshold

All search methods accept an optional `score_threshold` parameter to filter
out results below a minimum similarity score:

```python
# Only return results with score >= 0.8
results = store.similarity_search("hello", k=10, score_threshold=0.8)

# Works with all search variants
results = store.similarity_search_with_score("hello", k=10, score_threshold=0.8)
results = store.max_marginal_relevance_search("hello", k=4, score_threshold=0.7)

# Async variants too
results = await store.asimilarity_search("hello", k=10, score_threshold=0.8)
```

### Search Parameters (HNSW Tuning & Exact Search)

All search methods accept an optional `search_params` parameter to control
low-level search behaviour. Pass a `SearchParams` object from the
`actian_vectorai` package:

```python
from actian_vectorai import SearchParams

# Search with custom HNSW ef (higher = more accurate, slower)
results = store.similarity_search(
    "hello", k=5,
    search_params=SearchParams(hnsw_ef=256),
)

# Exact (brute-force) search — bypasses HNSW index for 100% recall
results = store.similarity_search(
    "hello", k=5,
    search_params=SearchParams(exact=True),
)

# Combine with filter and score threshold
results = store.similarity_search(
    "hello", k=5,
    filter={"topic": "ml"},
    score_threshold=0.7,
    search_params=SearchParams(hnsw_ef=128),
)

# Works with all search variants
results = store.similarity_search_with_score(
    "hello", k=5, search_params=SearchParams(exact=True),
)
results = store.max_marginal_relevance_search(
    "hello", k=4, fetch_k=20,
    search_params=SearchParams(hnsw_ef=256),
)

# Async variants
results = await store.asimilarity_search(
    "hello", k=5, search_params=SearchParams(exact=True),
)
```

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `hnsw_ef` | `int \| None` | `None` | Search-time ef for HNSW (higher = more accurate) |
| `exact` | `bool \| None` | `None` | Force exact brute-force search (bypasses index) |
| `quantization` | `QuantizationSearchParams \| None` | `None` | Quantization search config |
| `indexed_only` | `bool \| None` | `None` | Only search indexed segments |

## Metadata Filtering

Filter search results by document metadata using a simple dict syntax.
Each key-value pair becomes an AND condition that matches against the
stored metadata fields:

```python
# Add documents with metadata (e.g., topic and year fields)
store.add_texts(
    [
        "Python is a popular programming language",
        "Machine learning transforms data into insights",
        "Vector databases enable semantic search",
        "Neural networks learn hierarchical features",
        "SQL is the language of relational databases",
    ],
    metadatas=[
        {"topic": "programming", "year": 2024},
        {"topic": "ml", "year": 2024},
        {"topic": "databases", "year": 2024},
        {"topic": "ml", "year": 2023},
        {"topic": "databases", "year": 2020},
    ],
)

# Filter by a single metadata field
results = store.similarity_search(
    "data science",
    k=3,
    filter={"topic": "ml"},
)

# Filter by multiple fields (AND semantics)
results = store.similarity_search(
    "data science",
    k=3,
    filter={"topic": "ml", "year": 2024},
)

# Works with all search methods
results = store.similarity_search_with_score(
    "learning", k=3, filter={"topic": "ml"}
)
results = store.max_marginal_relevance_search(
    "learning", k=3, fetch_k=10, filter={"topic": "ml"}
)

# Async variants also support dict filters
results = await store.asimilarity_search("learning", k=3, filter={"topic": "ml"})
```

Supported value types: strings, integers, booleans. Nested dicts and lists
are also supported for complex metadata structures.

### Comparison Operators

Use operator dicts to express richer conditions on a single field:

| Operator | Meaning                  | Example                                |
|----------|--------------------------|----------------------------------------|
| `$eq`    | Equal                    | `{"topic": {"$eq": "ml"}}`             |
| `$ne`    | Not equal                | `{"topic": {"$ne": "databases"}}`      |
| `$gt`    | Greater than             | `{"year": {"$gt": 2020}}`              |
| `$gte`   | Greater than or equal    | `{"year": {"$gte": 2023}}`             |
| `$lt`    | Less than                | `{"year": {"$lt": 2024}}`              |
| `$lte`   | Less than or equal       | `{"year": {"$lte": 2023}}`             |
| `$in`    | In list                  | `{"topic": {"$in": ["ml", "db"]}}`     |
| `$nin`   | Not in list              | `{"topic": {"$nin": ["sql", "html"]}}` |
| `$text`  | Full-text substring      | `{"desc": {"$text": "wireless"}}`      |
| `$between` | Inclusive range        | `{"price": {"$between": [10, 100]}}`   |
| `$range` | Range with keyword bounds | `{"price": {"$range": {"gte": 10, "lt": 100}}}` |
| `$values_count` | Array length bounds | `{"tags": {"$values_count": {"gte": 1, "lte": 5}}}` |
| `$datetime_gt`  | Datetime >           | `{"created": {"$datetime_gt": dt}}`    |
| `$datetime_gte` | Datetime >=          | `{"created": {"$datetime_gte": dt}}`   |
| `$datetime_lt`  | Datetime <           | `{"created": {"$datetime_lt": dt}}`    |
| `$datetime_lte` | Datetime <=          | `{"created": {"$datetime_lte": dt}}`   |
| `$datetime_between` | Datetime range   | `{"created": {"$datetime_between": [start, end]}}` |

Operators can be combined on the same field for range queries:

```python
# Documents with year between 2022 and 2024 (inclusive)
results = store.similarity_search(
    "learning", k=3, filter={"year": {"$gte": 2022, "$lte": 2024}}
)

# Same range, using $between shorthand
results = store.similarity_search(
    "learning", k=3, filter={"year": {"$between": [2022, 2024]}}
)

# Arbitrary range with keyword bounds (half-open interval)
results = store.similarity_search(
    "item", k=3, filter={"price": {"$range": {"gte": 50, "lt": 200}}}
)

# Full-text substring match on a metadata field
results = store.similarity_search(
    "audio", k=3, filter={"description": {"$text": "wireless"}}
)

# Documents where topic is NOT "databases"
results = store.similarity_search(
    "learning", k=3, filter={"topic": {"$ne": "databases"}}
)

# Documents where topic is one of "ml" or "programming"
results = store.similarity_search(
    "learning", k=3, filter={"topic": {"$in": ["ml", "programming"]}}
)
```

### Datetime Operators

Filter by datetime fields using `datetime` objects:

```python
from datetime import datetime, timezone

# Documents created after a specific date
results = store.similarity_search(
    "recent", k=3,
    filter={"created": {"$datetime_gte": datetime(2024, 1, 1, tzinfo=timezone.utc)}},
)

# Documents created within a date range
results = store.similarity_search(
    "recent", k=3,
    filter={"created": {"$datetime_between": [
        datetime(2024, 1, 1, tzinfo=timezone.utc),
        datetime(2025, 1, 1, tzinfo=timezone.utc),
    ]}},
)
```

### Array Operators

```python
# Filter by number of values in an array field
results = store.similarity_search(
    "tagged", k=3,
    filter={"tags": {"$values_count": {"gte": 2, "lte": 5}}},
)
```

### Logical Operators

Combine conditions with `$and`, `$or`, and `$not`:

```python
# OR — match documents where topic is "ml" OR topic is "databases"
results = store.similarity_search(
    "data science", k=5,
    filter={"$or": [{"topic": "ml"}, {"topic": "databases"}]},
)

# AND — explicitly combine conditions (equivalent to multi-key dict)
results = store.similarity_search(
    "learning", k=3,
    filter={"$and": [{"topic": "ml"}, {"year": {"$gte": 2023}}]},
)

# NOT — exclude documents matching a condition
results = store.similarity_search(
    "learning", k=3,
    filter={"$not": {"topic": "databases"}},
)

# Nested logical operators
results = store.similarity_search(
    "data", k=5,
    filter={
        "$and": [
            {"$or": [{"topic": "ml"}, {"topic": "programming"}]},
            {"year": {"$gte": 2023}},
        ]
    },
)

# $min_should — at least N of the listed conditions must match
results = store.similarity_search(
    "data", k=5,
    filter={
        "$min_should": {
            "conditions": [
                {"topic": "ml"},
                {"topic": "databases"},
                {"topic": "programming"},
            ],
            "min_count": 2,
        }
    },
)
```
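
Because these filters are ordinary Python dicts, the operators above also compose programmatically before the search call. A small sketch (the variable names are illustrative, not package API):

```python
# Build a filter from runtime data using the operators shown above
topics = ["ml", "programming"]
recent = {"year": {"$gte": 2023}}

# Combine: topic in `topics` AND year >= 2023
combined = {"$and": [{"topic": {"$in": topics}}, recent]}
# `combined` is a plain dict, ready to pass as
# store.similarity_search("learning", k=3, filter=combined)
```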

### Standalone Conditions

Top-level filter conditions that operate on point-level properties
rather than individual metadata fields:

| Operator       | Meaning                  | Example                                       |
|----------------|--------------------------|-----------------------------------------------|
| `$has_id`      | Point ID in list         | `{"$has_id": ["a1b2c3d4...", "b2c3d4e5..."]}`  |
| `$has_vector`  | Point has named vector   | `{"$has_vector": "text"}`                      |
| `$is_empty`    | Metadata field is empty  | `{"$is_empty": "tags"}`                        |
| `$nested`      | Nested object match      | `{"$nested": {"key": "addr", "filter": {...}}}` |

```python
# Filter by specific point IDs
results = store.similarity_search(
    "doc", k=3,
    filter={"$has_id": ["a1b2c3d4e5f6a1b2c3d4e5f6a1b2c3d4", "b2c3d4e5f6a1b2c3d4e5f6a1b2c3d4e5", "c3d4e5f6a1b2c3d4e5f6a1b2c3d4e5f6"]},
)

# Filter for points that have a specific named vector
results = store.similarity_search(
    "doc", k=3,
    filter={"$has_vector": "text"},
)

# Filter for points where a metadata field is empty or missing
results = store.similarity_search(
    "doc", k=3,
    filter={"$is_empty": "tags"},
)

# Nested object match — filter inside a nested metadata structure
results = store.similarity_search(
    "doc", k=3,
    filter={"$nested": {"key": "address", "filter": {"city": "Paris"}}},
)

# Combine standalone conditions with field operators
results = store.similarity_search(
    "item", k=5,
    filter={
        "$has_id": ["a1b2c3d4e5f6a1b2c3d4e5f6a1b2c3d4", "b2c3d4e5f6a1b2c3d4e5f6a1b2c3d4e5"],
        "price": {"$between": [50, 200]},
    },
)
```

### Native FilterBuilder

You can also pass a pre-built VectorAI `Filter` object (from
`actian_vectorai.FilterBuilder`) directly for full access to the
filter DSL — including `should` (OR), `must_not` (NOT), range, and other advanced operators:

```python
from actian_vectorai import Field, FilterBuilder

# Range filter: topic is "ml" AND year >= 2023
f = (
    FilterBuilder()
    .must(Field("metadata.topic").eq("ml"))
    .must(Field("metadata.year").gte(2023))
    .build()
)
results = store.similarity_search("learning", k=3, filter=f)

# OR via should: match "ml" or "databases"
f = (
    FilterBuilder()
    .should(Field("metadata.topic").eq("ml"))
    .should(Field("metadata.topic").eq("databases"))
    .build()
)
results = store.similarity_search("data science", k=3, filter=f)

# NOT via must_not: exclude "programming"
f = (
    FilterBuilder()
    .must_not(Field("metadata.topic").eq("programming"))
    .build()
)
results = store.similarity_search("learning", k=3, filter=f)

# Combined must + must_not
f = (
    FilterBuilder()
    .must(Field("metadata.topic").eq("ml"))
    .must_not(Field("metadata.year").lt(2023))
    .build()
)
results = store.similarity_search("learning", k=3, filter=f)
```

## HNSW and Optimizer Configuration

Tune the HNSW index and optimizer parameters at collection creation time
or update them on an existing collection.

### At creation time

Pass `hnsw_config` and/or `optimizers_config` to any factory method:

```python
from actian_vectorai import HnswConfigDiff, OptimizersConfigDiff

store = ActianVectorAIVectorStore.from_texts(
    ["hello world", "goodbye world"],
    embedding=my_embeddings,
    collection_name="products",
    url="localhost:6574",
    hnsw_config=HnswConfigDiff(ef_construct=200, m=32),
    optimizers_config=OptimizersConfigDiff(indexing_threshold=10000),
)
```

Also supported in `from_documents`, `afrom_texts`, `afrom_documents`,
`construct_instance`, and `aconstruct_instance`.

### Updating an existing collection

Use `update_collection_config()` to adjust parameters after creation:

```python
from actian_vectorai import HnswConfigDiff, OptimizersConfigDiff

store.update_collection_config(
    hnsw_config=HnswConfigDiff(ef_construct=200, m=32),
    optimizers_config=OptimizersConfigDiff(indexing_threshold=10000),
)
```

| Parameter | Type | Description |
|-----------|------|-------------|
| `hnsw_config` | `HnswConfigDiff` | HNSW index parameters: `ef_construct` (build-time search depth), `m` (graph links per node) |
| `optimizers_config` | `OptimizersConfigDiff` | Optimizer parameters: `indexing_threshold` (min vectors before HNSW index is built) |

## Index Type Selection

Choose the index algorithm at collection creation time with `index_type`.
This is a VDE extension — the default is `INDEX_TYPE_AUTO`.

Available values (`actian_vectorai.IndexType`):

| Value | Description |
|-------|-------------|
| `INDEX_TYPE_AUTO` | Server picks the best algorithm (default) |
| `INDEX_TYPE_FLAT` | Brute-force exact search — best for small collections |
| `INDEX_TYPE_HNSW` | HNSW approximate search — best for large collections |

```python
from actian_vectorai import IndexType

# FLAT index for a small collection
store = ActianVectorAIVectorStore.from_texts(
    ["hello world", "goodbye world"],
    embedding=my_embeddings,
    collection_name="small_lookup",
    url="localhost:6574",
    index_type=IndexType.INDEX_TYPE_FLAT,
)
```

Combine with `hnsw_config` for fine-tuned HNSW indexing:

```python
from actian_vectorai import HnswConfigDiff, IndexType

store = ActianVectorAIVectorStore.from_texts(
    ["hello world", "goodbye world"],
    embedding=my_embeddings,
    collection_name="products",
    url="localhost:6574",
    index_type=IndexType.INDEX_TYPE_HNSW,
    hnsw_config=HnswConfigDiff(ef_construct=200, m=32),
)
```

Also supported in `from_documents`, `afrom_texts`, `afrom_documents`,
`construct_instance`, and `aconstruct_instance`.

## Named Vectors

Named vectors allow a single collection to hold multiple vector spaces
(e.g. *text* embeddings and *image* embeddings) with independent
dimensionality and distance metrics. Each `ActianVectorAIVectorStore`
targets one named vector at a time via the `vector_name` parameter.

### Creating a collection with a named vector

When you pass `vector_name`, the collection is automatically created with
a named-vector configuration:

```python
store = ActianVectorAIVectorStore.from_texts(
    ["hello world", "machine learning"],
    embedding=my_embeddings,
    collection_name="products",
    url="localhost:6574",
    vector_name="text",          # store embeddings under the "text" name
)
```

### Creating a multi-vector collection

Use `vectors_config` to define multiple independent vector spaces upfront:

```python
from actian_vectorai import Distance, VectorAIClient, VectorParams
from langchain_actian_vectorai import ActianVectorAIVectorStore

client = VectorAIClient("localhost:6574")
client.connect()

# Create a collection with two named vector spaces
client.collections.create(
    "multimodal",
    vectors_config={
        "text": VectorParams(size=384, distance=Distance.Cosine),
        "image": VectorParams(size=512, distance=Distance.Euclid),
    },
)

# One store for text embeddings
text_store = ActianVectorAIVectorStore(
    client=client,
    collection_name="multimodal",
    embedding=text_embeddings,
    vector_name="text",
)

# Another store for image embeddings
image_store = ActianVectorAIVectorStore(
    client=client,
    collection_name="multimodal",
    embedding=image_embeddings,
    vector_name="image",
)
```

Or use `from_texts` / `from_documents` with both `vector_name` and
`vectors_config`:

```python
from actian_vectorai import Distance, VectorParams

store = ActianVectorAIVectorStore.from_documents(
    docs,
    embedding=my_embeddings,
    collection_name="multimodal",
    url="localhost:6574",
    vector_name="text",
    vectors_config={
        "text": VectorParams(size=384, distance=Distance.Cosine),
        "image": VectorParams(size=512, distance=Distance.Euclid),
    },
)
```

### Searching a named vector

All search methods automatically target the configured named vector:

```python
# Similarity search on the "text" vector space
results = text_store.similarity_search("hello", k=5)

# Search with scores
results = text_store.similarity_search_with_score("hello", k=5)

# MMR search
results = text_store.max_marginal_relevance_search("hello", k=4, fetch_k=20)

# Metadata filtering works as usual
results = text_store.similarity_search(
    "hello", k=5, filter={"topic": "ml"}
)

# Async search
results = await text_store.asimilarity_search("hello", k=5)
```

### Connecting to an existing named-vector collection

```python
store = ActianVectorAIVectorStore.from_existing_collection(
    collection_name="multimodal",
    embedding=text_embeddings,
    url="localhost:6574",
    vector_name="text",
)
results = store.similarity_search("query", k=3)
```

## Max Marginal Relevance Search

MMR optimizes for both similarity to the query and diversity among results:

```python
# Standard MMR search
results = store.max_marginal_relevance_search(
    "machine learning",
    k=4,
    fetch_k=20,
    lambda_mult=0.5,
)

# MMR by vector (provide embedding directly)
embedding = store.embeddings.embed_query("machine learning")
results = store.max_marginal_relevance_search_by_vector(
    embedding,
    k=4,
    fetch_k=20,
    lambda_mult=0.5,
)

# MMR with scores — returns (Document, score) tuples
results = store.max_marginal_relevance_search_with_score_by_vector(
    embedding,
    k=4,
    fetch_k=20,
    lambda_mult=0.5,
)
for doc, score in results:
    print(f"[{score:.3f}] {doc.page_content}")

# Async MMR
results = await store.amax_marginal_relevance_search(
    "machine learning", k=4, fetch_k=20,
)
# Async MMR by vector
embedding = await store.embeddings.aembed_query("machine learning")
results = await store.amax_marginal_relevance_search_by_vector(
    embedding,
    k=4,
    fetch_k=20,
    lambda_mult=0.5,
)
# Async MMR with scores
results = await store.amax_marginal_relevance_search_with_score_by_vector(
    embedding,
    k=4,
    fetch_k=20,
    lambda_mult=0.5,
)
```
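
For intuition, the greedy selection MMR performs over the `fetch_k` candidates can be sketched in pure NumPy. This is an illustrative re-implementation of the general MMR formula — at each step pick the candidate maximising `lambda * sim(query, d) - (1 - lambda) * max sim(d, selected)` — not the package's internal code:

```python
import numpy as np

def mmr_select(query_vec, candidate_vecs, k=4, lambda_mult=0.5):
    """Greedy MMR over cosine similarity: trade off query relevance
    against redundancy with already-selected results."""
    def cos(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

    query_sims = [cos(query_vec, c) for c in candidate_vecs]
    selected: list[int] = []
    while len(selected) < min(k, len(candidate_vecs)):
        best_idx, best_score = None, -float("inf")
        for i, c in enumerate(candidate_vecs):
            if i in selected:
                continue
            # Redundancy: similarity to the closest already-selected result
            redundancy = max((cos(c, candidate_vecs[j]) for j in selected),
                             default=0.0)
            score = lambda_mult * query_sims[i] - (1 - lambda_mult) * redundancy
            if score > best_score:
                best_idx, best_score = i, score
        selected.append(best_idx)
    return selected

# Toy candidates: two near-duplicates plus one distinct direction
cands = [np.array([1.0, 0.0]), np.array([0.99, 0.1]), np.array([0.0, 1.0])]
# With lambda_mult below 0.5, diversity outweighs raw similarity,
# so the distinct vector beats the near-duplicate.
order = mmr_select(np.array([1.0, 0.0]), cands, k=2, lambda_mult=0.4)
# → [0, 2]
```

Higher `lambda_mult` favours similarity to the query (at `1.0` the result is a plain top-k ranking); lower values favour diversity among the returned documents.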

## Use as Retriever

```python
# Create a retriever from the vector store
retriever = store.as_retriever(
    search_type="mmr",
    search_kwargs={"k": 4, "fetch_k": 20, "lambda_mult": 0.5},
)
docs = retriever.invoke("machine learning")

# Async retriever
docs = await retriever.ainvoke("machine learning")
```

## API Reference

| Method | Async | Description |
|--------|-------|-------------|
| `construct_instance()` | `aconstruct_instance()` | Low-level factory: connect, create collection, return store |
| `from_texts()` | `afrom_texts()` | Create store, collection, and add texts |
| `from_documents()` | `afrom_documents()` | Create store, collection, and add documents |
| `from_existing_collection()` | | Connect to an existing collection (no data added) |
| `add_texts()` | `aadd_texts()` | Add texts to existing store |
| `add_documents()` | `aadd_documents()` | Add documents to existing store |
| `delete()` | `adelete()` | Delete by IDs |
| `delete_by_filter()` | `adelete_by_filter()` | Delete by metadata filter |
| `get_by_ids()` | `aget_by_ids()` | Retrieve documents by IDs |
| `search()` | `asearch()` | General search (all types, returns Document) |
| `similarity_search()` | `asimilarity_search()` | Search by query text |
| `similarity_search_with_score()` | `asimilarity_search_with_score()` | Search with scores |
| `similarity_search_by_vector()` | `asimilarity_search_by_vector()` | Search by embedding |
| `similarity_search_with_score_by_vector()` | `asimilarity_search_with_score_by_vector()` | Search by embedding with scores |
| `similarity_search_with_relevance_scores()` | `asimilarity_search_with_relevance_scores()` | Search with normalized relevance scores |
| `max_marginal_relevance_search()` | `amax_marginal_relevance_search()` | MMR search (by text) |
| `max_marginal_relevance_search_by_vector()` | `amax_marginal_relevance_search_by_vector()` | MMR search by embedding |
| `max_marginal_relevance_search_with_score_by_vector()` | `amax_marginal_relevance_search_with_score_by_vector()` | MMR search by embedding with scores |
| `as_retriever()` | | Convert store to Retriever interface |
| `update_collection_config()` | | Update HNSW / optimizer config on the collection |
| `close()` | | Close client connections and release resources |

## Resource Management

The vector store supports Python's context manager protocol for automatic
resource cleanup:

```python
# Context manager — connections are closed automatically on exit
with ActianVectorAIVectorStore(
    client=client,
    collection_name="my_collection",
    embedding=my_embeddings,
) as store:
    ids = store.add_texts(["hello world"])
    results = store.similarity_search("hello", k=1)
# client connections are closed here

# Manual cleanup
store = ActianVectorAIVectorStore(
    client=client,
    collection_name="my_collection",
    embedding=my_embeddings,
)
try:
    results = store.similarity_search("hello", k=1)
finally:
    store.close()
```

## Low-Level Factory: `construct_instance`

`construct_instance` (and its async counterpart `aconstruct_instance`)
handle client connection and collection setup without adding any data.
`from_texts` and `from_documents` use this internally:

```python
# Sync
store = ActianVectorAIVectorStore.construct_instance(
    embedding=my_embeddings,
    url="localhost:6574",
    collection_name="my_collection",
    distance="COSINE",
    force_recreate=True,
)

# Async
store = await ActianVectorAIVectorStore.aconstruct_instance(
    embedding=my_embeddings,
    url="localhost:6574",
    collection_name="my_collection",
    distance="COSINE",
)
```

## Error Handling

All VectorAI client errors are wrapped in `ActianVectorAIException`, which
can be imported from the package:

```python
from langchain_actian_vectorai import ActianVectorAIException

try:
    results = store.similarity_search("hello", k=4)
except ActianVectorAIException as exc:
    print(f"VectorAI error: {exc}")
    if exc.original:
        print(f"Caused by: {exc.original}")
```

## Configuration

| Parameter | Default | Description |
|-----------|---------|-------------|
| `url` | `"localhost:6574"` | VectorAI server gRPC address |
| `collection_name` | Auto-generated UUID | Collection name |
| `distance` | `"COSINE"` | Distance metric: `COSINE`, `EUCLID` (or `EUCLIDEAN`), `DOT` |
| `content_payload_key` | `"page_content"` | Payload key for document content |
| `metadata_payload_key` | `"metadata"` | Payload key for document metadata |
| `batch_size` | `64` | Batch size for upsert operations |
| `force_recreate` | `False` | Recreate collection if it exists |
| `vector_name` | `None` | Named vector to target for upserts and searches |
| `vectors_config` | `None` | Pre-built named-vectors config dict (`{name: VectorParams}`) |
| `hnsw_config` | `None` | `HnswConfigDiff` for HNSW index tuning at creation time |
| `optimizers_config` | `None` | `OptimizersConfigDiff` for optimizer tuning at creation time |
| `index_type` | `None` | `IndexType` enum for index algorithm selection at creation time (VDE extension) |
