Metadata-Version: 2.4
Name: cognee-community-vector-adapter-opensearch
Version: 0.1.0
Summary: OpenSearch vector database adapter for cognee
Project-URL: Homepage, https://www.cognee.ai
Project-URL: Repository, https://github.com/topoteretes/cognee-community
Requires-Python: <=3.13,>=3.10
Requires-Dist: cognee==0.5.2
Requires-Dist: instructor>=1.11
Requires-Dist: opensearch-py==3.0.0
Requires-Dist: starlette>=0.48.0
Description-Content-Type: text/markdown

# OpenSearch Adapter for Cognee

This adapter provides integration between Cognee and [OpenSearch](https://opensearch.org/) for vector storage and retrieval operations.

## Features

- Full vector search capabilities using OpenSearch;
- Hybrid search (combining text and vector search);
- HNSW algorithm for efficient similarity search (note: the algorithm is not yet configurable in the adapter; future versions may allow more flexibility);
- Async/await support for all operations;
- Batch operations for improved performance.
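
To give a sense of what an HNSW-backed index looks like in OpenSearch, here is a minimal sketch of a k-NN index body. It is not taken from the adapter: the helper name `build_hnsw_index_body`, the field names `embedding` and `text`, the `lucene` engine, the `cosinesimil` space type, and the `m` / `ef_construction` values are all assumptions chosen for illustration, and the adapter may create its indices differently.

```python
def build_hnsw_index_body(dimension: int, field_name: str = "embedding") -> dict:
    """Return an index body with a knn_vector field that uses the HNSW method.

    Hypothetical helper: field names and parameter values are illustrative only.
    """
    return {
        # Enable the k-NN plugin for this index
        "settings": {"index": {"knn": True}},
        "mappings": {
            "properties": {
                # Vector field searched with approximate nearest neighbours
                field_name: {
                    "type": "knn_vector",
                    "dimension": dimension,
                    "method": {
                        "name": "hnsw",
                        "engine": "lucene",
                        "space_type": "cosinesimil",
                        "parameters": {"m": 16, "ef_construction": 128},
                    },
                },
                # Plain text field, so the same index can also serve text queries
                "text": {"type": "text"},
            }
        },
    }


body = build_hnsw_index_body(dimension=384)
```

A body like this could be passed to `client.indices.create(index=..., body=body)` from `opensearch-py` when experimenting with OpenSearch directly, outside the adapter.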

## Installation

If the package has been published, install it with pip:

```bash
pip install cognee-community-vector-adapter-opensearch
```

If it is not published yet, build and install the adapter locally with pip or Poetry:

```bash
pip install .
# OR
pip install poetry
poetry install # run this command in the directory containing the pyproject.toml file
```

## Connection Setup

For a quick local setup, you can run the Docker image that OpenSearch provides.
Once the container is running, you can connect to OpenSearch through the ports published by the command.
The command for running the container looks something like the following:

```bash
docker pull opensearchproject/opensearch:2.17.1 && \
docker run -it -p 9200:9200 -p 9600:9600 \
  -e "discovery.type=single-node" \
  -e "DISABLE_SECURITY_PLUGIN=true" \
  opensearchproject/opensearch:2.17.1
```
```
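
Before wiring up the adapter, it can help to confirm that the container answers on port 9200. The sketch below uses only the Python standard library and assumes the security plugin is disabled as in the command above, so plain HTTP works without credentials. The function name `opensearch_is_reachable` and the default URL are assumptions for this example.

```python
import http.client
import json
import urllib.request


def opensearch_is_reachable(url: str = "http://localhost:9200", timeout: float = 3.0) -> bool:
    """Return True if the root endpoint answers with a JSON object containing "version"."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            info = json.loads(response.read().decode("utf-8"))
    except (OSError, ValueError, http.client.HTTPException):
        # Connection refused, timeout, HTTP error status, or a non-JSON body
        return False
    return isinstance(info, dict) and "version" in info


if __name__ == "__main__":
    print(opensearch_is_reachable())
```

If this prints `False` right after starting the container, wait a few seconds and retry, since OpenSearch needs some time to start up.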

## Configuration

The adapter requires the following parameters:
- `url`: The URL of your OpenSearch instance, including the port if necessary (e.g., `https://your-open-search-url:9200`);
- `api_key`: A Base64-encoded string of a JSON object containing the connection parameters:
  - `username`: Your OpenSearch username;
  - `password`: Your OpenSearch password;
  - `use_ssl`: Whether to use SSL (True/False);
  - `verify_certs`: Whether to verify SSL certificates (True/False);
  - `ssl_assert_hostname`: Whether to assert the hostname in SSL (True/False);
  - `ssl_show_warn`: Whether to show SSL warnings (True/False);
  - `index_prefix`: A prefix for the index names used by the adapter.
- `embedding_engine`: An instance of `EmbeddingEngine` used for text vectorization.
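
Because `api_key` is just Base64-encoded JSON, it is easy to check what a key contains by decoding it again. The following is a minimal round-trip sketch using only the standard library. The helper names `encode_api_key` and `decode_api_key` are hypothetical and not part of the adapter, and the dictionary shows only a subset of the fields listed above with placeholder values.

```python
import base64
import json


def encode_api_key(params: dict) -> str:
    """Serialize the connection parameters to JSON and Base64-encode the result."""
    return base64.b64encode(json.dumps(params).encode("utf-8")).decode("ascii")


def decode_api_key(api_key: str) -> dict:
    """Reverse encode_api_key: Base64-decode, then parse the JSON object."""
    return json.loads(base64.b64decode(api_key).decode("utf-8"))


# Placeholder values for illustration only
params = {
    "username": "my-username",
    "password": "my-password",
    "index_prefix": "my-special-app-prefix-",
}

api_key = encode_api_key(params)

# Decoding the key gives back exactly the parameters that went in
assert decode_api_key(api_key) == params
```

Note that Base64 is an encoding, not encryption: anyone who can read the `api_key` can recover the username and password, so treat it like any other secret.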

## Usage

```python
from cognee.infrastructure.databases.vector.embeddings.EmbeddingEngine import EmbeddingEngine
from packages.vector.cognee_community_vector_adapter_opensearch.cognee_community_vector_adapter_opensearch import OpenSearchAdapter
import json
import base64

# Build the api_key by Base64-encoding the JSON string of the connection parameters
connection_parameters = {
    "username": "my-username",
    "password": "my-password",
    "use_ssl": "False",
    "verify_certs": "False",
    "ssl_assert_hostname": "False",
    "ssl_show_warn": "False",
    "index_prefix": "my-special-app-prefix-",
}

api_key = base64.b64encode(json.dumps(connection_parameters).encode()).decode()

# Initialize the adapter
embedding_engine = EmbeddingEngine(...)  # Your embedding engine
adapter = OpenSearchAdapter(
    url="https://your-open-search-url-including-port-if-any",
    api_key=api_key,
    embedding_engine=embedding_engine
)

import asyncio


async def main():
    # Create a collection (index)
    await adapter.create_collection("my_collection")

    # Add data points (`data_points` must be defined beforehand)
    await adapter.create_data_points("my_collection", data_points)

    # Search
    results = await adapter.search(
        collection_name="my_collection",
        query_text="search query",
        limit=10
    )

    # Batch search
    results = await adapter.batch_search(
        collection_name="my_collection",
        query_texts=["query1", "query2"],
        limit=10
    )


# `await` is only valid inside a coroutine, so run the calls through asyncio
asyncio.run(main())
```
