Metadata-Version: 2.4
Name: chunksmith-cli
Version: 0.5.0
Summary: ChunkSmith CLI — multi-indexing (JSON/TOON) and PageIndexer pipelines.
License: MIT
Project-URL: Homepage, https://github.com/AnshulParate2004/ChunkSmith
Project-URL: Repository, https://github.com/AnshulParate2004/ChunkSmith
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE.vectify
Requires-Dist: python-dotenv>=1.0.0
Requires-Dist: rich>=13.0.0
Requires-Dist: questionary>=2.0.0
Requires-Dist: toon-format==0.9.0b1
Requires-Dist: chunksmith-core<0.6,>=0.5.0
Requires-Dist: chunksmith-multimodal[llm,pdf,toon]<0.6,>=0.5.0
Requires-Dist: chunksmith-pageindex[llm,pdf]<0.6,>=0.5.0
Provides-Extra: online
Requires-Dist: chunksmith-adapters[mvl]<0.6,>=0.5.0; extra == "online"
Provides-Extra: agent
Requires-Dist: chunksmith-agent[langchain]<0.6,>=0.5.0; extra == "agent"
Provides-Extra: all
Requires-Dist: chunksmith-cli[agent,online]; extra == "all"
Dynamic: license-file

# chunksmith-cli (Python)

PyPI package **`chunksmith-cli`** v0.5.0 — Rich terminal for ChunkSmith indexing with Gemini-style arrow-key menus and loosely coupled storage.

Part of the **ChunkSmith** monorepo (`packages/chunksmith-cli`).

## What's new in 0.5.0

- **Unified LiteLLM gateway** — `CHUNKSMITH_LLM_MODEL` + `API_KEY` / `API_BASE` / `API_VERSION` (Azure, OpenAI, Anthropic, Gemini, …)
- **Dynamic storage** — `CHUNKSMITH_STORAGE_MODE=auto` (default) detects Mongo+S3 env; falls back to `~/.chunksmith/data`
- **Loose coupling** — core/multimodal/pageindex work local-only; optional `chunksmith-adapters[mvl]` for online DB
- Arrow-key menus, stepped configure wizard, `chunksmith doctor`, pipeline spinners
- Optional extras: `chunksmith-cli[online]`, `[agent]`, `[all]`

## Package layout (loose coupling)

| Package | Role | Required by CLI |
|---------|------|-----------------|
| `chunksmith-core` | Ports, preferences, shared LLM config | yes |
| `chunksmith-multimodal` | Multi-indexing pipeline | yes |
| `chunksmith-pageindex` | PageIndexer pipeline | yes |
| `chunksmith-adapters` | Mongo + S3 + Postgres (MVL) | optional `[online]` |
| `chunksmith-agent` | Document Q&A | optional `[agent]` |

Libraries never import `chunksmith-adapters`. The CLI wires online storage when env + adapters are present.

## Environment variables

### LLM (LiteLLM gateway)

| Variable | Purpose |
|----------|---------|
| `CHUNKSMITH_LLM_MODEL` | e.g. `azure/gpt-4o`, `openai/gpt-4o`, `anthropic/claude-3-5-sonnet` |
| `API_KEY` | Provider API key |
| `API_BASE` | Custom endpoint (Azure / proxy); empty for direct OpenAI |
| `API_VERSION` | Azure API version |

### Storage

| Variable | Purpose |
|----------|---------|
| `CHUNKSMITH_STORAGE_MODE` | `auto` (default), `local`, or `online` |
| `CHUNKSMITH_LOCAL_DATA_DIR` | Override local data root |
| `CHUNKSMITH_USER_ID` / `CHUNKSMITH_PROJECT_ID` | Online scope (defaults: `cli` / `default`) |

**Auto-detect online** when `MONGODB_URI` + `S3_*` are set and `chunksmith-adapters` is installed.

Online stack (from MVL `.env`): `DATABASE_URL`, `S3_LAYOUT`, `S3_ACCESS_KEY_ID`, `S3_SECRET_ACCESS_KEY`, `S3_PUBLIC_BASE_URL`, `S3_ENDPOINT_URL`, `S3_PREFIX_*`, `MONGODB_URI`, `MONGODB_DB_NAME`, `UNSTRUCTURED_API_KEY`, …

## Development

```bash
cd ChunkSmith
uv sync
chunksmith
```

## Tests

```bash
cd ChunkSmith
uv run pytest packages/chunksmith-core/tests packages/chunksmith-cli/tests packages/chunksmith-agent/tests -q
```

## PyPI install

```bash
# Minimal — local disk only
pipx install "chunksmith-cli==0.5.0"

# Online storage + agent Q&A
pipx install "chunksmith-cli[all]==0.5.0"

# uv (toon pre-release pin)
uv tool install "chunksmith-cli[all]==0.5.0" --prerelease=allow
```

```bash
chunksmith configure   # stepped wizard → ~/.chunksmith/.env
chunksmith doctor      # LLM + storage + adapters check
chunksmith --no-ui     # CI / non-interactive
```

## Publish to PyPI (maintainers)

From `ChunkSmith` repo root, in dependency order:

```powershell
$env:UV_PUBLISH_TOKEN = "pypi-...."   # or: uv auth login

$pkgs = @(
  "packages\chunksmith-core",
  "packages\chunksmith-multimodal",
  "packages\chunksmith-pageindex",
  "packages\chunksmith-adapters",
  "packages\chunksmith-agent",
  "packages\chunksmith-cli"
)

foreach ($p in $pkgs) {
  Push-Location $p
  uv build
  uv publish
  Pop-Location
}
```

Verify before publish:

```powershell
uv run pytest packages/chunksmith-core/tests packages/chunksmith-cli/tests packages/chunksmith-agent/tests -q
```
