Metadata-Version: 2.4
Name: crewai-infino
Version: 0.1.0rc2
Summary: CrewAI integration for Infino — vector, BM25, hybrid, and SQL retrieval as agent tools over one engine on object storage.
Project-URL: Homepage, https://github.com/infino-ai/crewai-infino
Project-URL: Repository, https://github.com/infino-ai/crewai-infino
Author: The Infino Authors
License: Apache-2.0
License-File: LICENSE
Keywords: agents,bm25,crewai,hybrid-search,infino,rag,retrieval,tools,vector-search
Requires-Python: >=3.10
Requires-Dist: crewai>=1.0
Requires-Dist: infino>=0.1.0
Requires-Dist: pyarrow>=14
Requires-Dist: pydantic>=2
Provides-Extra: lint
Requires-Dist: mypy; extra == 'lint'
Requires-Dist: ruff; extra == 'lint'
Provides-Extra: test
Requires-Dist: numpy; extra == 'test'
Requires-Dist: pytest; extra == 'test'
Description-Content-Type: text/markdown

# crewai-infino

CrewAI integration for [Infino](https://pypi.org/project/infino/) — give your
agents **semantic (vector), full-text (BM25), hybrid, and SQL** retrieval as
tools over **one** store on object storage. No second vector DB, no separate
metadata store, no client-side fusion: one Infino table answers all four ways,
and the agent picks the right one at runtime.

```
pip install crewai-infino
```

## Agent in minutes

Build and populate a table with `InfinoIndex`, then hand its tools to an agent:

```python
import infino
from crewai import Agent, Task, Crew
from crewai_infino import InfinoIndex

# Your embedding model — any callables that turn text into vectors.
# (e.g. sentence-transformers; dim must match.)
from my_embeddings import embed_documents, embed_query  # -> list[list[float]] / list[float]

conn = infino.connect("./infino-data")
index = InfinoIndex.create(
    conn, "kb",
    embed_documents=embed_documents, embed_query=embed_query, dim=384,
)
index.add_texts(
    ["Reset the X200 by holding power for 10s.", "Error E-507 means a stale cache."],
    metadatas=[{"product": "X200"}, {"product": "X200"}],
)

analyst = Agent(
    role="Support Analyst",
    goal="Answer customer questions from the knowledge base",
    backstory="You retrieve before you answer.",
    tools=index.as_tools(),          # semantic + keyword + hybrid + SQL, over one store
)

crew = Crew(agents=[analyst], tasks=[
    Task(description="How do I fix error E-507 on the X200?",
         expected_output="A concise fix.", agent=analyst),
])
print(crew.kickoff())
```

## The tools

`index.as_tools()` (or `infino_tools(searcher)`) returns four `crewai.tools.BaseTool`s:

| Tool | Backed by | Use for |
| --- | --- | --- |
| `InfinoSemanticSearchTool` | vector kNN | intent / how-to / paraphrased questions |
| `InfinoKeywordSearchTool` | BM25 (`mode="and"` default) | exact codes, SKUs, names, error strings |
| `InfinoHybridSearchTool` | BM25 + vector fused by RRF, one SQL call | strong default — exact *and* semantic |
| `InfinoSQLTool` | `query_sql` | point lookups, `GROUP BY` aggregates, joins |

Attach individual tools to different agents, or the whole set to one. The
package bundles **no** embedding model — you pass embed callables, so you keep
full control of the model (and the examples stay key-free with a local one).

## Knowledge

Use Infino as a CrewAI **knowledge** backend so a crew auto-grounds its answers —
no explicit tool call. `InfinoKnowledgeStorage` stores knowledge chunks in one
Infino table and retrieves them by vector similarity:

```python
from crewai import Crew
from crewai.knowledge.source.string_knowledge_source import StringKnowledgeSource
from crewai_infino import InfinoKnowledgeStorage

storage = InfinoKnowledgeStorage(
    connection=conn, embed_documents=embed_documents,
    embed_query=embed_query, dim=384,
)
source = StringKnowledgeSource(content="Error E-507 means a stale cache.")
crew = Crew(agents=[...], tasks=[...],
            knowledge_sources=[source], knowledge_storage=storage)
```

## Lower-level API

- `InfinoIndex` — create / open a table, `add_texts`, `delete`, `as_tools()`.
- `InfinoSearcher` — the read path: `.semantic()`, `.keyword()`, `.hybrid()`,
  `.sql()` returning plain row dicts. The tools are thin wrappers over it.

The table schema matches `langchain-infino`, so a table built by one adapter is
queryable by the other.

## Status

Tools and the **Knowledge** storage backend (`InfinoKnowledgeStorage`) are the
shipping surface. CrewAI **Memory** (durable agent memory) is planned next,
targeting CrewAI's `RAGStorage` seam. The knowledge backend implements CrewAI's
`BaseKnowledgeStorage`, which is 1.x-only — hence the `crewai>=1.0` floor.

## Development

```sh
make install     # editable install with test + lint extras
make unit        # engine-free unit tests
make integration # full suite against the engine
make lint type   # ruff + mypy
```

Apache-2.0.
