Metadata-Version: 2.4
Name: chunksmith-agent
Version: 0.5.0
Summary: ChunkSmith document Q&A agent over saved multi-indexing outlines.
Author-email: AnshulParate2004 <anshulnparate@gmail.com>
License-Expression: MIT
Project-URL: Homepage, https://github.com/AnshulParate2004/ChunkSmith
Project-URL: Repository, https://github.com/AnshulParate2004/ChunkSmith
Requires-Python: >=3.10
Description-Content-Type: text/markdown
Requires-Dist: python-dotenv>=1.0.0
Requires-Dist: pydantic>=2.10.0
Requires-Dist: chunksmith-core<0.6,>=0.5.0
Provides-Extra: langchain
Requires-Dist: langchain-core>=0.3.28; extra == "langchain"
Requires-Dist: langchain-openai>=0.3.0; extra == "langchain"
Requires-Dist: langchain-litellm>=0.2.0; extra == "langchain"
Requires-Dist: langgraph>=0.2.0; extra == "langchain"
Provides-Extra: llm
Requires-Dist: chunksmith-multimodal[llm]<0.6,>=0.5.0; extra == "llm"

# chunksmith-agent

Standalone document Q&A over **saved ChunkSmith index JSON**. The base install depends on `chunksmith-core` but not on `chunksmith-multimodal` or `chunksmith-pageindex`; `chunksmith-multimodal` is pulled in only by the optional `llm` extra.

## Install

```bash
pip install chunksmith-agent
pip install "chunksmith-agent[langchain]"   # LangChain tool-calling Q&A
pip install "chunksmith-agent[llm]"         # optional extra: adds chunksmith-multimodal[llm]
```

## Usage

```python
from pathlib import Path
from chunksmith_agent import ChunkSmithAgent
from chunksmith_agent.index_builder import build_document_index_from_saved

index = build_document_index_from_saved(
    pageindex_path=Path("runs/my-doc/json/my-doc_pageindex.json"),
)
agent = ChunkSmithAgent(index)
answer = agent.ask("What is this document about?")
print(answer.answer)
```

## JSON artifact contract

The agent reads files produced by `chunksmith-cli` (or compatible tools) under a run folder:

| File | Fields used |
|------|-------------|
| `json/*_pageindex.json` | `doc_name`, `structure`, optional embedded `canonical_bundle` |
| `json/*_canonical_bundle.json` | `elements[]`, `coded_formate`, `path_image` |
| Outline nodes | `node_id`, `title`, `summary`, `start_index`/`end_index`, anchor fields |
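
To make the contract concrete, here is a minimal sketch that walks the outline of a pageindex payload using only the standard library. The sample payload is hypothetical, and the child key `nodes` is an assumption (the table above does not name the field that holds nested sections), so adjust `child_key` to match your files.

```python
import json
from typing import Any, Iterator

# Hypothetical sample shaped like the table above; real files may carry more fields.
SAMPLE_PAGEINDEX = json.loads("""
{
  "doc_name": "my-doc",
  "structure": [
    {"node_id": "0001", "title": "Intro", "summary": "Overview.",
     "start_index": 1, "end_index": 2,
     "nodes": [
       {"node_id": "0002", "title": "Scope", "summary": "What is covered.",
        "start_index": 2, "end_index": 2}
     ]}
  ]
}
""")


def iter_outline(
    nodes: list[dict[str, Any]], depth: int = 0, child_key: str = "nodes"
) -> Iterator[tuple[int, dict[str, Any]]]:
    """Yield (depth, node) pairs in document order (parents before children)."""
    for node in nodes:
        yield depth, node
        # child_key is an assumed field name for nested sections.
        yield from iter_outline(node.get(child_key) or [], depth + 1, child_key)


def outline_lines(payload: dict[str, Any]) -> list[str]:
    """Render one indented line per outline node: id, title, index span."""
    lines = []
    for depth, node in iter_outline(payload.get("structure") or []):
        span = f"[{node.get('start_index')}-{node.get('end_index')}]"
        lines.append(f"{'  ' * depth}{node.get('node_id')} {node.get('title')} {span}")
    return lines


print("\n".join(outline_lines(SAMPLE_PAGEINDEX)))
```

A quick pass like this is a cheap way to check that a file from a compatible tool carries the fields the agent expects before you build an index from it.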

## Environment variables

The agent reads the same LLM environment variables as the ChunkSmith CLI and MVL:

- `OPENAI_API_KEY` (or `CHATGPT_API_KEY`)
- `PAGEINDEX_MODEL`, `CHUNKSMITH_LLM_MODEL`, `LLM_MODEL`
- Azure: `AZURE_API_KEY`, `AZURE_API_BASE`, `AZURE_API_VERSION`
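
If you want to check your environment before starting the agent, a fallback lookup like the one below may help. It is a sketch, not the package's `load_settings()` logic: `first_set` and `resolve_llm_env` are hypothetical helpers, and the precedence (the order the names are listed above) is an assumption.

```python
import os
from typing import Mapping, Optional


def first_set(env: Mapping[str, str], names: tuple[str, ...]) -> Optional[str]:
    """Return the first non-empty value among the given variable names."""
    for name in names:
        value = env.get(name, "").strip()
        if value:
            return value
    return None


def resolve_llm_env(env: Mapping[str, str] = os.environ) -> dict[str, Optional[str]]:
    """Collect the API key and model name; the precedence order is assumed."""
    return {
        "api_key": first_set(env, ("OPENAI_API_KEY", "CHATGPT_API_KEY")),
        "model": first_set(
            env, ("PAGEINDEX_MODEL", "CHUNKSMITH_LLM_MODEL", "LLM_MODEL")
        ),
    }


resolved = resolve_llm_env()
print("API key set:", resolved["api_key"] is not None)
print("Model:", resolved["model"])
```

Passing a plain dict instead of `os.environ` makes the lookup easy to exercise in tests without touching the real environment.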

## Integration patterns (loose coupling)

**Do not** depend on `chunksmith-adapters` inside this package. Pass data in yourself:

| Caller | How to load index | Agent config |
|--------|-------------------|--------------|
| **CLI** | `build_document_index_from_saved(pageindex_path=...)` | `load_settings()` from `.env` |
| **MVL app** | `chunksmith_agent_bridge.load_document_index_from_mvl(repo, ...)` | `load_settings()` or explicit `AgentSettings` |
| **Custom app** | Fetch JSON from your DB/S3 → `build_document_index(dict)` | Your env / settings |

Install separately from CLI or MVL:

```bash
pip install "chunksmith-agent[langchain]"
```
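
For the **Custom app** row in the table above, one way to keep the coupling loose is to decode and sanity-check the JSON yourself, then hand the resulting dict to whatever builder you use. The sketch below takes the builder as a parameter so it runs without this package installed; `index_from_json_bytes` is a hypothetical helper, and the required-key check is based only on the fields listed in the artifact contract.

```python
import json
from typing import Any, Callable


def index_from_json_bytes(
    raw: bytes, build_index: Callable[[dict[str, Any]], Any]
) -> Any:
    """Decode pageindex JSON fetched from a DB or object store and build an index.

    build_index is injected by the caller; in a real app this would be the
    package's build_document_index.
    """
    payload = json.loads(raw.decode("utf-8"))
    if not isinstance(payload, dict):
        raise ValueError("expected a JSON object at the top level")
    # Only the fields named in the artifact contract are checked here.
    missing = [key for key in ("doc_name", "structure") if key not in payload]
    if missing:
        raise ValueError(f"pageindex JSON is missing: {', '.join(missing)}")
    return build_index(payload)


# Stand-in builder for demonstration; it just echoes the document name.
blob = b'{"doc_name": "my-doc", "structure": []}'
print(index_from_json_bytes(blob, lambda payload: payload["doc_name"]))
```

Injecting the builder also keeps storage code (S3, database, HTTP) out of the agent package, which is the point of the table above.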

## Extensibility

- **Outline nodes:** come from saved JSON (`structure`); re-index or edit JSON to add sections.
- **Tools:** extend `_make_tools()` in `tool_agent.py` (e.g. add `get_page_images`).
- **Another agent in your app:** compose `ChunkSmithAgent` alongside your own planners and retrievers. This package is one document Q&A brain, not your whole system.
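
The composition idea in the last bullet can be sketched as a small router that sends document questions to the agent and everything else to your own code. `DocumentQA`, `Router`, and `StubAgent` are hypothetical names for illustration; the only thing assumed about `ChunkSmithAgent` is what the Usage section shows, namely an `ask()` method whose result has an `answer` attribute.

```python
from dataclasses import dataclass
from typing import Any, Callable, Protocol


class DocumentQA(Protocol):
    """Anything with ask(question) returning an object that has .answer."""

    def ask(self, question: str) -> Any: ...


@dataclass
class Router:
    doc_agent: DocumentQA
    fallback: Callable[[str], str]
    is_document_question: Callable[[str], bool]

    def handle(self, question: str) -> str:
        # Route to the document agent only when the predicate says so.
        if self.is_document_question(question):
            return str(self.doc_agent.ask(question).answer)
        return self.fallback(question)


@dataclass
class _Reply:
    answer: str


class StubAgent:
    """Stand-in for ChunkSmithAgent so the sketch runs without an index."""

    def ask(self, question: str) -> _Reply:
        return _Reply(answer=f"doc: {question}")


router = Router(
    doc_agent=StubAgent(),
    fallback=lambda q: f"other: {q}",
    is_document_question=lambda q: "document" in q.lower(),
)
print(router.handle("What is this document about?"))
print(router.handle("What time is it?"))
```

Typing the dependency as a protocol rather than importing the concrete class keeps your orchestration code testable with stubs and free of a hard import on this package.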
