Metadata-Version: 2.4
Name: cognee-community-vector-adapter-azure
Version: 0.0.2
Summary: Azure AI Search vector database adapter for cognee
Project-URL: Homepage, https://www.cognee.ai
Project-URL: Repository, https://github.com/topoteretes/cognee-community
Requires-Python: <=3.13,>=3.10
Requires-Dist: azure-core>=1.29.0
Requires-Dist: azure-search-documents>=11.4.0
Requires-Dist: cognee>=0.2.4
Description-Content-Type: text/markdown

# Azure AI Search Adapter for Cognee

This adapter provides integration between Cognee and Azure AI Search (formerly Azure Cognitive Search) for vector storage and retrieval operations.

## Features

- Full vector search capabilities using Azure AI Search
- Hybrid search (combining text and vector search)
- HNSW algorithm for efficient similarity search
- Async/await support for all operations
- Batch operations for improved performance

## Installation

If the package is published on PyPI, install it with pip:

```bash
pip install cognee-community-vector-adapter-azure
```

If it is not published yet, use Poetry to build and install the adapter locally:

```bash
pip install poetry
poetry install # run this command in the directory containing the pyproject.toml file
```

## Configuration

The adapter requires the following parameters:
- `endpoint`: Your Azure AI Search service endpoint (e.g., `https://your-service.search.windows.net`)
- `api_key`: Your Azure AI Search API key
- `embedding_engine`: An instance of `EmbeddingEngine` for text vectorization
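
To keep the API key out of source code, the two connection values can be read from the environment. The following is a minimal sketch: the variable names `AZURE_SEARCH_ENDPOINT` and `AZURE_SEARCH_API_KEY` and the helper `load_azure_search_settings` are assumptions made for illustration, not names the adapter reads on its own.

```python
import os


def load_azure_search_settings(env=None):
    """Return (endpoint, api_key) read from environment variables.

    The variable names are hypothetical; use whatever fits your deployment.
    """
    env = os.environ if env is None else env
    # Normalize the endpoint so a trailing slash does not matter.
    endpoint = env.get("AZURE_SEARCH_ENDPOINT", "").strip().rstrip("/")
    api_key = env.get("AZURE_SEARCH_API_KEY", "").strip()

    if not endpoint.startswith("https://"):
        raise ValueError("AZURE_SEARCH_ENDPOINT must be an https:// URL")
    if not api_key:
        raise ValueError("AZURE_SEARCH_API_KEY is not set")
    return endpoint, api_key
```

The returned pair can then be passed as `endpoint` and `api_key` when constructing the adapter.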

## Usage

```python
import asyncio

from cognee.infrastructure.databases.vector.embeddings.EmbeddingEngine import EmbeddingEngine
from packages.vector.azureaisearch import AzureAISearchAdapter


async def main():
    # Initialize the adapter
    embedding_engine = EmbeddingEngine(...)  # Your embedding engine
    adapter = AzureAISearchAdapter(
        endpoint="https://your-service.search.windows.net",
        api_key="your-api-key",
        embedding_engine=embedding_engine
    )

    # Create a collection (an Azure AI Search index)
    await adapter.create_collection("my_collection")

    # Add data points
    data_points = [...]  # Placeholder: replace with your data points
    await adapter.create_data_points("my_collection", data_points)

    # Search
    results = await adapter.search(
        collection_name="my_collection",
        query_text="search query",
        limit=10
    )

    # Batch search
    batch_results = await adapter.batch_search(
        collection_name="my_collection",
        query_texts=["query1", "query2"],
        limit=10
    )


# `await` is only valid inside a coroutine, so run the calls through asyncio
asyncio.run(main())
```

## Key Differences from Other Vector Databases

1. **Collections as Indexes**: What other vector databases call "collections", Azure AI Search calls "indexes"
2. **Document Structure**: Documents in Azure AI Search follow a fixed schema whose fields are defined on the index
3. **Batch Operations**: Azure AI Search doesn't have native batch search, so the adapter runs the individual searches of a batch in parallel
4. **Scoring**: Azure AI Search returns `@search.score`, which is scaled differently from the scores of other vector databases, so values are not directly comparable across backends
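
Points 3 and 4 can be illustrated with a short sketch. It is not the adapter's actual code: `parallel_batch_search`, `score_to_cosine_similarity`, `fake_search`, and the `max_concurrency` parameter are hypothetical names introduced here. The score conversion assumes a pure vector query with the cosine metric, for which Azure documents `@search.score = 1 / (1 + cosine_distance)`; it does not apply to hybrid queries, which are ranked differently.

```python
import asyncio


def score_to_cosine_similarity(score):
    """Invert score = 1 / (1 + (1 - cosine_similarity)) for cosine vector queries."""
    return 2.0 - 1.0 / score


async def parallel_batch_search(search_one, query_texts, limit=10, max_concurrency=5):
    """Run one search per query concurrently; results keep the order of the queries."""
    # Cap the number of in-flight requests so a large batch does not flood the service.
    semaphore = asyncio.Semaphore(max_concurrency)

    async def guarded(query_text):
        async with semaphore:
            return await search_one(query_text, limit)

    return await asyncio.gather(*(guarded(q) for q in query_texts))


async def fake_search(query_text, limit):
    """Stand-in for a single search call, used only to make the sketch runnable."""
    await asyncio.sleep(0)
    return [f"{query_text}-hit-{i}" for i in range(limit)]


results = asyncio.run(parallel_batch_search(fake_search, ["query1", "query2"], limit=2))
```

`asyncio.gather` returns the result lists in the same order as `query_texts`, regardless of which request finishes first.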

## Vector Search Configuration

The adapter uses the HNSW (Hierarchical Navigable Small World) algorithm with the following default parameters:
- `m`: 4 (number of bi-directional links created for each new element during index construction)
- `efConstruction`: 400 (size of the dynamic candidate list used while building the index)
- `efSearch`: 500 (size of the dynamic candidate list used at query time)
- `metric`: cosine (similarity metric)

These parameters can be adjusted in the `create_collection` method if needed.
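
For orientation, the sketch below shows how these values map onto the `vectorSearch` section of an Azure AI Search index definition in its REST JSON shape. It does not reproduce how the adapter builds its index: the helper `hnsw_vector_search_config` and the names `hnsw-config` and `hnsw-profile` are hypothetical, and the range checks reflect the limits Azure documents for HNSW at the time of writing, so treat them as assumptions to verify against the current service documentation.

```python
def hnsw_vector_search_config(m=4, ef_construction=400, ef_search=500, metric="cosine"):
    """Build the `vectorSearch` part of an index definition as a plain dict."""
    # Assumed service limits: m in 4..10, efConstruction and efSearch in 100..1000.
    if not 4 <= m <= 10:
        raise ValueError("m must be between 4 and 10")
    if not 100 <= ef_construction <= 1000:
        raise ValueError("efConstruction must be between 100 and 1000")
    if not 100 <= ef_search <= 1000:
        raise ValueError("efSearch must be between 100 and 1000")
    if metric not in {"cosine", "euclidean", "dotProduct"}:
        raise ValueError(f"unsupported metric: {metric}")

    return {
        "algorithms": [
            {
                "name": "hnsw-config",  # hypothetical configuration name
                "kind": "hnsw",
                "hnswParameters": {
                    "m": m,
                    "efConstruction": ef_construction,
                    "efSearch": ef_search,
                    "metric": metric,
                },
            }
        ],
        # A profile links vector fields to the algorithm configuration above.
        "profiles": [{"name": "hnsw-profile", "algorithm": "hnsw-config"}],
    }


default_config = hnsw_vector_search_config()
```

Higher `m` and `efConstruction` generally improve recall at the cost of index size and build time, while `efSearch` trades query latency for recall.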
