Metadata-Version: 2.4
Name: agentspend-sdk
Version: 0.1.0
Summary: AgentSpend — runtime cost optimizer for AI agents. Route LLM calls to the cheapest capable model with fallbacks, loop guards, and telemetry.
Requires-Python: <3.14,>=3.12
Requires-Dist: aiosqlite>=0.20
Requires-Dist: alembic>=1.14
Requires-Dist: fastapi>=0.115
Requires-Dist: google-auth>=2.48.0
Requires-Dist: google-cloud-aiplatform>=1.139.0
Requires-Dist: litellm>=1.50
Requires-Dist: pandas>=2.2
Requires-Dist: pydantic-settings>=2.0
Requires-Dist: pydantic>=2.0
Requires-Dist: python-multipart>=0.0.9
Requires-Dist: pyyaml>=6.0
Requires-Dist: rich>=13.0
Requires-Dist: sqlalchemy>=2.0
Requires-Dist: typer>=0.15
Requires-Dist: uvicorn[standard]>=0.32
Provides-Extra: dev
Requires-Dist: pytest>=9.0; extra == 'dev'
Description-Content-Type: text/markdown

# token-aud

AI cost optimization toolkit for LLM workloads.

`token-aud` includes two complementary workflows:

- **Audit mode (CLI/API):** Analyze historical usage logs and estimate savings opportunities with Student-Teacher-Judge sampling.
- **AgentSpend SDK:** Route live agent steps (`plan`, `reason`, `tool`, `verify`, `draft`, `summarize`) to cost/quality-appropriate models with fallbacks, loop guards, and telemetry.

## Installation

```bash
uv sync --no-editable
```

To also install the development tooling (the `dev` extra, which adds pytest):

```bash
uv sync --no-editable --extra dev
```

## Quick Start (AgentSpend)

Run these three commands from the repository root:

```bash
uv sync --no-editable --extra dev
uv run --no-sync python -m pytest tests/agent -q
uv run --no-sync python examples/agent_routing_demo.py
```

Expected results:

- Agent tests pass.
- Demo prints routed step decisions, per-step telemetry, and total run cost.
- `agent_telemetry.jsonl` is generated locally.

## AgentSpend Usage

### 1) Default policy

```python
from token_aud.agent import AgentSpend

agent = AgentSpend.default()
result = agent.route_call(
    step="plan",
    messages=[{"role": "user", "content": "Break this task into a plan"}],
)

print(result.model_used, result.cost_usd, result.content)
```

### 2) Custom policy YAML

```python
from token_aud.agent import AgentSpend

agent = AgentSpend.from_yaml("routing_policy.yaml")
result = agent.route_call(
    step="reason",
    messages=[{"role": "user", "content": "Compare two architectures"}],
)

print(result.model_used, result.fallbacks_tried)
```

The built-in default policy used by `AgentSpend.default()` lives at:

- `src/token_aud/data/default_routing_policy.yaml`
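
Either agent can be driven through several steps in sequence. The helper below is a minimal sketch, not part of the package: `run_steps` is a hypothetical name, and it assumes only what the snippets above show, namely that `route_call(step=..., messages=...)` returns an object with `model_used` and `cost_usd`, and that `cost_usd` is a plain number.

```python
from typing import Any


def run_steps(agent: Any, steps: list[tuple[str, str]]) -> float:
    """Route each (step, prompt) pair through `agent` and return the summed cost.

    `agent` is anything exposing route_call(step=..., messages=...), for
    example AgentSpend.default(). Assumes result.cost_usd is a number.
    """
    total = 0.0
    for step, prompt in steps:
        result = agent.route_call(
            step=step,
            messages=[{"role": "user", "content": prompt}],
        )
        # Show which model handled the step and what it cost.
        print(f"{step}: {result.model_used} (${result.cost_usd:.4f})")
        total += result.cost_usd
    return total
```

A call such as `run_steps(AgentSpend.default(), [("plan", "Outline the task"), ("summarize", "Summarize the outline")])` would then print one line per step and return the run total.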

## AgentSpend Core Components

- `src/token_aud/agent/policy.py`: Pydantic policy schema + YAML loading
- `src/token_aud/agent/router.py`: deterministic model selection
- `src/token_aud/agent/runtime.py`: `route_call()` execution + fallbacks
- `src/token_aud/agent/loop_guard.py`: repeated-turn loop detection
- `src/token_aud/agent/telemetry.py`: JSONL/HTTP telemetry sinks
- `src/token_aud/agent/adaptive.py`: optional adaptive routing layer
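
To make the fallback and loop-guard ideas concrete, here is a self-contained toy version of both. It is an illustration only and does not reproduce the code in `runtime.py` or `loop_guard.py`: the `POLICY` table, the model names, `call_with_fallbacks`, and `LoopGuard` with its `window` and `max_repeats` parameters are all assumptions made for this sketch.

```python
from collections import deque
from typing import Callable

# Hypothetical policy: each step maps to an ordered candidate list, cheapest
# first. The model names are placeholders, not the shipped defaults.
POLICY: dict[str, list[str]] = {
    "plan": ["small-model", "large-model"],
    "summarize": ["small-model"],
}


def call_with_fallbacks(
    step: str,
    prompt: str,
    call_model: Callable[[str, str], str],
) -> tuple[str, str, list[str]]:
    """Try each candidate for `step` in order.

    Returns (model_used, reply, models_that_failed).
    """
    failed: list[str] = []
    for model in POLICY[step]:
        try:
            return model, call_model(model, prompt), failed
        except RuntimeError:
            failed.append(model)  # record the failure, try the next candidate
    raise RuntimeError(f"all candidates failed for step {step!r}: {failed}")


class LoopGuard:
    """Flag a loop when the latest turn appears `max_repeats` times in a sliding window."""

    def __init__(self, window: int = 6, max_repeats: int = 3) -> None:
        self._recent: deque[str] = deque(maxlen=window)
        self._max_repeats = max_repeats

    def is_looping(self, turn_text: str) -> bool:
        # Normalize lightly so trivial whitespace/case changes still match.
        self._recent.append(turn_text.strip().lower())
        return self._recent.count(self._recent[-1]) >= self._max_repeats
```

Selection here is deterministic because the candidate order is fixed by the policy table; a caller could react to `is_looping()` returning `True` by escalating to a stronger model or stopping the run.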

## AgentSpend Examples

- `examples/agent_routing_demo.py`: end-to-end routed run with telemetry
- `examples/custom_policy_demo.py`: loop escalation and hard-stop behavior
- `examples/framework_agnostic_integration.py`: generic agent-loop integration with explicit success feedback
- `scripts/summarize_telemetry.py`: convert `agent_telemetry.jsonl` into cost/fallback/latency summary

```bash
uv run --no-sync python scripts/summarize_telemetry.py agent_telemetry.jsonl
```
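
For a sense of what such a summary involves, the sketch below totals cost per model from a JSONL file. It is not the contents of `scripts/summarize_telemetry.py`; the function name `summarize_costs` is hypothetical, and the record fields `model_used` and `cost_usd` are assumed to mirror the result attributes shown earlier, so check them against a real telemetry file before relying on this.

```python
import json
from collections import defaultdict
from pathlib import Path


def summarize_costs(path: str) -> dict[str, float]:
    """Sum cost per model from a JSONL file (one JSON object per line)."""
    totals: dict[str, float] = defaultdict(float)
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue  # skip blank lines
        record = json.loads(line)
        # ASSUMPTION: field names mirror result.model_used / result.cost_usd.
        model = record.get("model_used", "unknown")
        totals[model] += float(record.get("cost_usd", 0.0))
    return dict(totals)
```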

## Audit CLI (legacy, still supported)

```bash
uv run --no-sync token-aud --help
uv run --no-sync token-aud analyze sample_data.csv --dry-run
```

## Environment Variables

Common provider credentials:

- `OPENAI_API_KEY`
- `ANTHROPIC_API_KEY`
- `GEMINI_API_KEY` or `GOOGLE_API_KEY` (depending on provider path)

For Google Vertex AI flows, ensure Application Default Credentials (ADC) are configured (`gcloud auth application-default login`).
