Metadata-Version: 2.4
Name: ai-parrot-embeddings
Version: 0.1.0
Summary: Concrete embedding, vector-store, and reranker backends for AI-Parrot
Author-email: Jesus Lara <jesuslara@phenobarbital.info>
License-Expression: MIT
Project-URL: Homepage, https://github.com/phenobarbital/ai-parrot
Project-URL: Source, https://github.com/phenobarbital/ai-parrot
Keywords: ai,rag,embeddings,vector-store,rerankers
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Operating System :: POSIX :: Linux
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Framework :: AsyncIO
Classifier: Typing :: Typed
Requires-Python: >=3.11
Description-Content-Type: text/markdown
Requires-Dist: ai-parrot
Provides-Extra: huggingface
Requires-Dist: sentence-transformers>=5.0.0; extra == "huggingface"
Requires-Dist: tokenizers<=0.22.2,>=0.20.0; extra == "huggingface"
Requires-Dist: safetensors>=0.4.3; extra == "huggingface"
Requires-Dist: einops>=0.7.0; extra == "huggingface"
Requires-Dist: accelerate>=0.30.0; extra == "huggingface"
Requires-Dist: peft>=0.10.0; extra == "huggingface"
Requires-Dist: xformers>=0.0.27; extra == "huggingface"
Requires-Dist: simsimd>=4.3.1; extra == "huggingface"
Requires-Dist: bm25s[full]==0.2.14; extra == "huggingface"
Requires-Dist: rank_bm25==0.2.2; extra == "huggingface"
Requires-Dist: sentencepiece==0.2.1; extra == "huggingface"
Provides-Extra: google
Requires-Dist: google-genai>=2.6.0; extra == "google"
Requires-Dist: google-cloud-aiplatform==1.133.0; extra == "google"
Provides-Extra: openai
Requires-Dist: openai==2.8.1; extra == "openai"
Requires-Dist: tiktoken>=0.9.0; extra == "openai"
Provides-Extra: pgvector
Requires-Dist: pgvector==0.4.1; extra == "pgvector"
Provides-Extra: milvus
Requires-Dist: pymilvus==2.4.8; extra == "milvus"
Requires-Dist: milvus-lite>=2.4.0; extra == "milvus"
Provides-Extra: arango
Requires-Dist: python-arango-async==1.2.0; extra == "arango"
Provides-Extra: bigquery
Requires-Dist: google-cloud-bigquery>=3.30.0; extra == "bigquery"
Provides-Extra: faiss
Provides-Extra: chroma
Requires-Dist: chromadb==0.6.3; extra == "chroma"
Provides-Extra: reranker-local
Requires-Dist: sentence-transformers>=5.0.0; extra == "reranker-local"
Requires-Dist: tokenizers<=0.22.2,>=0.20.0; extra == "reranker-local"
Requires-Dist: safetensors>=0.4.3; extra == "reranker-local"
Provides-Extra: reranker-llm
Provides-Extra: all
Requires-Dist: ai-parrot-embeddings[arango,bigquery,chroma,faiss,google,huggingface,milvus,openai,pgvector,reranker-llm,reranker-local]; extra == "all"

# ai-parrot-embeddings

Concrete backend implementations for the AI-Parrot retrieval stack:
embedding models, vector stores, and rerankers.

## What's in this package

This satellite contributes modules to three subsystems of the
`parrot.*` namespace:

- `parrot.embeddings.{google, huggingface, openai}` — embedding backends
- `parrot.stores.{postgres, pgvector, milvus, arango, bigquery, faiss_store}` — vector stores
- `parrot.rerankers.{local, llm}` — rerankers

The abstract base classes (`EmbeddingModel`, `AbstractStore`, `AbstractReranker`),
the registries (`EmbeddingRegistry`), the dispatch maps (`supported_embeddings`,
`supported_stores`), and all shared types (`parrot.stores.models.Document`,
`SearchResult`, etc.) remain in the `ai-parrot` core package.

## Import contract

This package uses **PEP 420 implicit namespace packages**. Its modules ship
directly under the existing `parrot.*` namespace, with no separate top-level
package. Existing imports continue to work unchanged once the package is installed:

```python
from parrot.embeddings.huggingface import SentenceTransformerModel  # from satellite
from parrot.stores.pgvector import PgVectorStore                    # from satellite
from parrot.embeddings import EmbeddingRegistry                     # from core
from parrot.stores import AbstractStore, supported_stores           # from core
```

No code changes are needed in user projects after upgrading from
`ai-parrot[embeddings]` to `ai-parrot-embeddings[...]`.

## Install

| Goal | Command |
|------|---------|
| Core framework only (no backends) | `pip install ai-parrot` |
| One backend | `pip install ai-parrot-embeddings[pgvector]` |
| Multiple backends | `pip install ai-parrot-embeddings[pgvector,milvus,huggingface]` |
| Embeddings + vector stores | `pip install ai-parrot-embeddings[huggingface,pgvector]` |
| Rerankers | `pip install ai-parrot-embeddings[reranker-local]` |
| Everything | `pip install ai-parrot-embeddings[all]` |
| Legacy all-in-one (unchanged) | `pip install ai-parrot[all]` |

## Extras

| Extra | Pulls in | Enables |
|-------|----------|---------|
| `huggingface` | `sentence-transformers`, `tokenizers`, `safetensors`, `einops`, `accelerate`, `peft`, `xformers`, `simsimd`, `bm25s`, `rank_bm25`, `sentencepiece` | `parrot.embeddings.huggingface.SentenceTransformerModel` |
| `google` | `google-genai`, `google-cloud-aiplatform` | `parrot.embeddings.google.GoogleEmbeddingModel` |
| `openai` | `openai`, `tiktoken` | `parrot.embeddings.openai.OpenAIEmbeddingModel` |
| `pgvector` | `pgvector` | `parrot.stores.postgres.PgVectorStore`, `parrot.stores.pgvector.PgVectorStore` |
| `milvus` | `pymilvus`, `milvus-lite` | `parrot.stores.milvus.MilvusStore` |
| `arango` | `python-arango-async` | `parrot.stores.arango.ArangoDBStore` |
| `bigquery` | `google-cloud-bigquery` | `parrot.stores.bigquery.BigQueryStore` |
| `faiss` | (no extra deps; `faiss-cpu` ships with `ai-parrot` core) | `parrot.stores.faiss_store.FAISSStore` |
| `chroma` | `chromadb` | (reserved for future `ChromaStore`) |
| `reranker-local` | `sentence-transformers`, `tokenizers`, `safetensors` | `parrot.rerankers.local.LocalCrossEncoderReranker` |
| `reranker-llm` | (no extra deps; uses existing LLM clients) | `parrot.rerankers.llm.LLMReranker` |
| `all` | All of the above | Full retrieval stack |
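
To see which extras' dependencies are present in the current environment, a small probe along the following lines may help. It is a minimal sketch, not part of the package: `EXTRA_PROBES`, `is_installed`, and `extras_status` are hypothetical names, and each extra is represented by a single distribution from the table above, so a positive result is only a heuristic (for example, `huggingface` and `reranker-local` share `sentence-transformers`).

```python
from importlib.metadata import PackageNotFoundError, version

# Hypothetical mapping: one representative distribution per extra, taken from
# the "Pulls in" column. Extras without additional dependencies
# (faiss, reranker-llm) are omitted because there is nothing to probe.
EXTRA_PROBES = {
    "huggingface": "sentence-transformers",
    "google": "google-genai",
    "openai": "openai",
    "pgvector": "pgvector",
    "milvus": "pymilvus",
    "arango": "python-arango-async",
    "bigquery": "google-cloud-bigquery",
    "chroma": "chromadb",
    "reranker-local": "sentence-transformers",
}


def is_installed(distribution: str) -> bool:
    """Return True if the named distribution has installed metadata."""
    try:
        version(distribution)
    except PackageNotFoundError:
        return False
    return True


def extras_status() -> dict[str, bool]:
    """Map each probed extra to whether its representative distribution is installed."""
    return {extra: is_installed(dist) for extra, dist in EXTRA_PROBES.items()}


if __name__ == "__main__":
    for extra, present in extras_status().items():
        print(f"{extra:15} {'installed' if present else 'missing'}")
```

The probe reads installed distribution metadata only; it does not import any backend, so it stays cheap even when heavy dependencies are installed.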

## Development

```bash
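git clone https://github.com/phenobarbital/ai-parrot
cd ai-parrot
# Create the virtual environment first if .venv does not exist yet
uv venv
source .venv/bin/activate
# Install the core package and this satellite in editable mode
uv pip install -e packages/ai-parrot -e packages/ai-parrot-embeddings
# Run the satellite's test suite
uv run pytest packages/ai-parrot-embeddings/tests/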
```

Or, to sync the full workspace (provided there are no pre-existing dependency conflicts):

```bash
uv sync --all-packages
```

## Architecture

This package uses **PEP 420 implicit namespace packages** rather than the
`parrot_<name>.*` + `sys.meta_path` redirector pattern used by
`ai-parrot-tools`, `ai-parrot-loaders`, and `ai-parrot-pipelines`. The
satellite ships no `__init__.py` files at `parrot/`, `parrot/embeddings/`,
`parrot/stores/`, or `parrot/rerankers/`. The host's sub-packages are regular
packages whose `__init__.py` files call `pkgutil.extend_path`, so at import
time each package's `__path__` is extended with the satellite's matching
directory, and modules from both distributions resolve under the same
`parrot.*` names.
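
The merge can be reproduced in isolation. The sketch below is an illustration under stated assumptions, not code from this repository: it builds two throwaway directory trees under a hypothetical `demo_ns` namespace (one "host" with regular packages that call `pkgutil.extend_path`, one "satellite" with no `__init__.py` files) and imports a module from each. The names `EXTEND`, `build_trees`, and `load_merged` are made up for the example.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Boilerplate the hypothetical host places in each regular package.
EXTEND = (
    "from pkgutil import extend_path\n"
    "__path__ = extend_path(__path__, __name__)\n"
)


def build_trees(root: Path) -> list[str]:
    """Create a host tree (regular packages) and a satellite tree (no __init__.py)."""
    host, satellite = root / "host", root / "satellite"
    (host / "demo_ns" / "stores").mkdir(parents=True)
    (host / "demo_ns" / "__init__.py").write_text(EXTEND)
    (host / "demo_ns" / "stores" / "__init__.py").write_text(
        EXTEND + "CORE = 'from host'\n"
    )
    (satellite / "demo_ns" / "stores").mkdir(parents=True)
    (satellite / "demo_ns" / "stores" / "backend.py").write_text(
        "NAME = 'from satellite'\n"
    )
    return [str(host), str(satellite)]


def load_merged() -> tuple[str, str, int]:
    """Import from both trees and report what was found under demo_ns.stores."""
    with tempfile.TemporaryDirectory() as tmp:
        entries = build_trees(Path(tmp))
        sys.path[:0] = entries
        importlib.invalidate_caches()
        try:
            stores = importlib.import_module("demo_ns.stores")
            backend = importlib.import_module("demo_ns.stores.backend")
            return stores.CORE, backend.NAME, len(stores.__path__)
        finally:
            # Leave the interpreter as we found it.
            for entry in entries:
                sys.path.remove(entry)
            stale = [m for m in sys.modules
                     if m == "demo_ns" or m.startswith("demo_ns.")]
            for name in stale:
                del sys.modules[name]


if __name__ == "__main__":
    print(load_merged())
```

`demo_ns.stores.backend` lives only in the satellite tree, yet it imports under the host's package because `extend_path` appended the satellite's `demo_ns/stores` directory to the host package's `__path__`, which then holds two entries.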

## Design rationale

- Spec: [`sdd/specs/ai-parrot-embeddings.spec.md`](../../sdd/specs/ai-parrot-embeddings.spec.md)
- Proposal: [`sdd/proposals/ai-parrot-embeddings.proposal.md`](../../sdd/proposals/ai-parrot-embeddings.proposal.md)
- Feature: FEAT-201
