Metadata-Version: 2.4
Name: pretia
Version: 1.0.4
Summary: Pre-deployment cost intelligence for AI agent workflows
Project-URL: Homepage, https://github.com/pretia-ai/pretia
Project-URL: Issues, https://github.com/pretia-ai/pretia/issues
Author: Pretia contributors
License: BSL-1.1
License-File: LICENSE
Keywords: agents,cost,llm,observability,profiling
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: Other/Proprietary License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.11
Requires-Dist: click>=8.1
Requires-Dist: jinja2>=3.1.0
Requires-Dist: rich>=13.0
Provides-Extra: backtesting
Requires-Dist: langchain-anthropic>=0.3.0; extra == 'backtesting'
Requires-Dist: langchain-google-genai>=2.0.0; extra == 'backtesting'
Requires-Dist: langchain-openai>=0.3.0; extra == 'backtesting'
Requires-Dist: langgraph>=1.0.0; extra == 'backtesting'
Provides-Extra: bt-agents
Requires-Dist: litellm>=1.40; extra == 'bt-agents'
Provides-Extra: dev
Requires-Dist: build>=1.0; extra == 'dev'
Requires-Dist: pyright>=1.1; extra == 'dev'
Requires-Dist: pytest-asyncio>=0.23; extra == 'dev'
Requires-Dist: pytest-cov>=5.0; extra == 'dev'
Requires-Dist: pytest>=8.0; extra == 'dev'
Requires-Dist: ruff>=0.6; extra == 'dev'
Provides-Extra: langgraph
Requires-Dist: langchain-core>=1.0; extra == 'langgraph'
Requires-Dist: langgraph>=1.0; extra == 'langgraph'
Provides-Extra: openai
Requires-Dist: openai-agents>=0.2; extra == 'openai'
Provides-Extra: pdf-generation
Requires-Dist: matplotlib>=3.8; extra == 'pdf-generation'
Requires-Dist: numpy>=1.24; extra == 'pdf-generation'
Requires-Dist: pdfplumber>=0.11; extra == 'pdf-generation'
Requires-Dist: pillow>=10.0; extra == 'pdf-generation'
Requires-Dist: pypdf>=4.0; extra == 'pdf-generation'
Requires-Dist: pypdfium2>=4.0; extra == 'pdf-generation'
Requires-Dist: reportlab>=4.1; extra == 'pdf-generation'
Requires-Dist: tiktoken>=0.7; extra == 'pdf-generation'
Provides-Extra: qwen
Requires-Dist: qwen-agent>=0.0.30; extra == 'qwen'
Provides-Extra: ui
Requires-Dist: fastapi>=0.115.0; extra == 'ui'
Requires-Dist: jinja2>=3.1.0; extra == 'ui'
Requires-Dist: uvicorn[standard]>=0.34.0; extra == 'ui'
Requires-Dist: websockets>=14.0; extra == 'ui'
Provides-Extra: validation
Requires-Dist: scikit-learn>=1.3; extra == 'validation'
Requires-Dist: scipy>=1.11.0; extra == 'validation'
Provides-Extra: visualization
Requires-Dist: matplotlib>=3.8; extra == 'visualization'
Requires-Dist: plotly>=5.18; extra == 'visualization'
Requires-Dist: seaborn>=0.13; extra == 'visualization'
Description-Content-Type: text/markdown

# Pretia

**Know what your agent will cost before you deploy.**

Pre-deployment cost intelligence for AI agent workflows. Two commands, zero config, about $2 for a full profile. Get distributional cost projections (p50-p99), detect cost time-bombs, and receive dollar-denominated optimization recommendations.

## Install

```bash
pip install pretia
```

## Quick Start

**Zero-cost estimate** (static analysis, no execution):

```bash
pretia estimate my_agent.py
```

**Full profile** (runs your workflow, ~$2, ~3 minutes):

```bash
pretia profile run my_agent.py
```

No config files, no JSONL datasets, no setup. Pretia reads your workflow, generates diverse synthetic inputs, executes 20 profiling runs, detects cost patterns, and opens an HTML report with projections and recommendations.

## Features

### Distributional Projections

Cost projections at p50, p75, p90, p95, and p99 — not just averages. For workflows with non-linear behavior (context growth, variable loop counts), Pretia uses Monte Carlo simulation (10K runs) instead of linear scaling.

### 8 Pattern Detectors

Automatically detects cost risks in your workflow:

- **Context growth** — input tokens increasing with each iteration
- **Loop count variance** — unpredictable iteration counts
- **High token variance** — wide spread between typical and worst-case calls
- **Step count variance** — routing variability across runs
- **Bimodality** — two distinct cost clusters (e.g., cache hit vs. miss)
- **Cache utilization opportunity** — missing prompt caching on supported providers
- **Zero-execution steps** — workflow paths never triggered during profiling
- **Output token budget** — wasteful max_tokens settings or truncation risk
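
As an illustration of what one of these detectors might look like, the sketch below flags context growth by fitting a least-squares slope to input tokens per iteration. The function name, the 10% threshold, and the normalization by the first iteration are assumptions, not Pretia's actual detection rules.

```python
def detect_context_growth(input_tokens: list[int], min_growth_per_iter: float = 0.10) -> bool:
    """Flag a loop whose input tokens rise steadily from one iteration to the next.

    Hypothetical rule: the fitted slope must be at least `min_growth_per_iter`
    (10% by default) of the first iteration's token count.
    """
    n = len(input_tokens)
    if n < 3 or input_tokens[0] <= 0:
        return False  # too little data to call a trend
    mean_x = (n - 1) / 2
    mean_y = sum(input_tokens) / n
    # Ordinary least-squares slope of tokens against iteration index
    numerator = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(input_tokens))
    denominator = sum((i - mean_x) ** 2 for i in range(n))
    slope = numerator / denominator
    return slope / input_tokens[0] >= min_growth_per_iter


growing = [1000, 1450, 1900, 2400, 2850]   # each iteration re-sends more history
flat = [1000, 1010, 990, 1005, 1000]       # stable context size

print(detect_context_growth(growing), detect_context_growth(flat))
```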

### 6 Optimization Recommendations

Each recommendation comes with estimated monthly savings in dollars:

- **Model swap** — downshift steps using frontier models for classification tasks
- **Loop iteration cap** — cap iterations where marginal returns diminish
- **Circuit breaker** — hard exit for stuck loops consuming >15% of cost
- **Enable prompt caching** — activate provider caching for repeated system prompts
- **Filter tool definitions** — remove unused tools from step context
- **Cache re-sent context** — eliminate redundant system prompts across consecutive steps
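
The circuit breaker recommendation could be applied in your own workflow code roughly as follows. This is a sketch under stated assumptions: `LoopCircuitBreaker`, `run_loop`, and the per-run budget are hypothetical names and values, and the 15% share is borrowed from the threshold quoted above.

```python
class LoopCircuitBreaker:
    """Stops a loop once it has consumed more than a fixed share of the run budget."""

    def __init__(self, run_budget_usd: float, max_share: float = 0.15) -> None:
        self.limit_usd = run_budget_usd * max_share
        self.spent_usd = 0.0

    def record(self, call_cost_usd: float) -> bool:
        """Add one call's cost; return True when the loop should exit."""
        self.spent_usd += call_cost_usd
        return self.spent_usd > self.limit_usd


def run_loop(call_costs: list[float], breaker: LoopCircuitBreaker) -> int:
    """Iterate over simulated LLM calls and stop as soon as the breaker trips."""
    completed = 0
    for cost in call_costs:
        completed += 1
        if breaker.record(cost):
            break  # hard exit: the loop has used its share of the budget
    return completed


breaker = LoopCircuitBreaker(run_budget_usd=1.00)
iterations = run_loop([0.04] * 10, breaker)
print(f"stopped after {iterations} iterations, spent ${breaker.spent_usd:.2f}")
```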

### Optimization Score

A 0-100 score measuring workflow cost efficiency. Three zones: red (0-40, needs optimization), amber (41-70, room to improve), green (71-100, well optimized).
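
The zone boundaries can be read as a simple lookup. A minimal sketch, where `score_zone` is a hypothetical helper name rather than part of Pretia's documented API:

```python
def score_zone(score: int) -> str:
    """Map a 0-100 optimization score to its zone."""
    if not 0 <= score <= 100:
        raise ValueError(f"score must be between 0 and 100, got {score}")
    if score <= 40:
        return "red"      # needs optimization
    if score <= 70:
        return "amber"    # room to improve
    return "green"        # well optimized


for example in (12, 55, 88):
    print(example, score_zone(example))
```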

### Five Input Modes

A friction ladder from zero-effort to maximum precision:

| Level | Command | What happens | Cost |
|-------|---------|-------------|------|
| 0 | `pretia estimate workflow.py` | Static code analysis only. No execution. | Free |
| 1 | `--input "How do I reset my password?"` | One run + priors for variance estimation. | ~$0.10 |
| 2 | `--auto-generate N` **(default)** | LLM generates diverse inputs from system prompt. | ~$2 |
| 3 | `--from-langfuse --last 100` | Pull real inputs from Langfuse production traces. | Free |
| 4 | `--inputs samples.jsonl` | User-curated test dataset. Maximum precision. | Execution only |
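
For level 4, a curated dataset is a JSONL file: one JSON object per line. The sketch below writes and re-reads such a file with the standard library. The record shape is an assumption: this README does not document the expected schema, so the `"input"` key is hypothetical and should be checked against the project's own documentation.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical record shape: one JSON object per line with a single "input" key.
samples = [
    {"input": "How do I reset my password?"},
    {"input": "Cancel my subscription and refund the last invoice."},
    {"input": "Why was I charged twice this month?"},
]


def write_jsonl(path: Path, rows: list[dict]) -> None:
    """Write one compact JSON object per line."""
    with path.open("w", encoding="utf-8") as handle:
        for row in rows:
            handle.write(json.dumps(row, ensure_ascii=False) + "\n")


def read_jsonl(path: Path) -> list[dict]:
    """Parse every non-empty line as its own JSON document."""
    with path.open(encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]


# Written to a temporary directory here; in practice this would be samples.jsonl
dataset_path = Path(tempfile.mkdtemp()) / "samples.jsonl"
write_jsonl(dataset_path, samples)
round_trip = read_jsonl(dataset_path)
print(f"{len(round_trip)} inputs written to {dataset_path}")
```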

## Add to Your CI in 2 Minutes

Pretia ships a GitHub Action that comments on every PR with cost analysis.

**Diff-only mode** (free, default) — static analysis in seconds:

```yaml
# .github/workflows/pretia.yml
name: Pretia
on: [pull_request]

permissions:
  contents: read
  pull-requests: write  # required for PR comments

jobs:
  cost-check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: pretia-ai/pretia/action@v1
        with:
          workflow_path: src/agent.py
          cost_threshold: "20"  # fail if cost increases >20%
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
```

**Full profile mode** (opt-in, ~$2) — real profiling with recommendations:

```yaml
      - uses: pretia-ai/pretia/action@v1
        with:
          workflow_path: src/agent.py
          mode: profile
          cost_threshold: "20"
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}  # or your provider key
```

The PR comment shows: optimization score, projected monthly cost, cost delta vs. baseline, and recommendations in a collapsible section.
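
One plausible reading of `cost_threshold: "20"` is a percentage check of the new projected cost against the baseline. The sketch below shows that arithmetic; the function names are hypothetical and the action's real comparison logic may differ.

```python
def cost_delta_percent(baseline_usd: float, new_usd: float) -> float:
    """Relative change of the new projected cost against the baseline, in percent."""
    if baseline_usd <= 0:
        raise ValueError("baseline cost must be positive")
    return (new_usd - baseline_usd) / baseline_usd * 100


def exceeds_threshold(baseline_usd: float, new_usd: float, threshold_percent: float = 20.0) -> bool:
    """True when the cost increase is strictly greater than the threshold."""
    return cost_delta_percent(baseline_usd, new_usd) > threshold_percent


print(cost_delta_percent(100.0, 125.0))   # increase relative to a $100 baseline
print(exceeds_threshold(100.0, 125.0))    # would fail the check in this sketch
print(exceeds_threshold(100.0, 90.0))     # cost went down, check passes
```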

## CLI Commands

```bash
pretia estimate workflow.py             # Instant cost estimate (no execution)
pretia profile run workflow.py          # Full profiling (default: --auto-generate 20)
pretia report profile.json              # Generate HTML report from saved profile
pretia recommend profile.json           # Generate optimization recommendations
pretia analyze --from-langfuse          # Analyze Langfuse traces (no execution)
pretia baseline update profile.json     # Save baseline for CI diffing
pretia diff baseline.json new.json      # Compare profiles, show per-step deltas
```
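
To show what a per-step delta means in practice, here is a small sketch that compares two cost breakdowns. It assumes each profile can be reduced to a mapping from step name to cost per run; that shape, the step names, and the figures are illustrative and do not reflect Pretia's profile JSON format.

```python
def per_step_deltas(baseline: dict[str, float], new: dict[str, float]) -> dict[str, float]:
    """Cost change per step; steps missing on one side count as zero cost there."""
    steps = sorted(set(baseline) | set(new))
    return {step: round(new.get(step, 0.0) - baseline.get(step, 0.0), 6) for step in steps}


# Hypothetical per-run costs in USD
baseline_costs = {"classify": 0.002, "retrieve": 0.010}
new_costs = {"classify": 0.002, "retrieve": 0.014, "summarize": 0.006}

deltas = per_step_deltas(baseline_costs, new_costs)
for step, delta in deltas.items():
    print(f"{step:<10} {delta:+.3f} USD")
```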

## Supported Frameworks

| Framework | Collection method | Install |
|-----------|------------------|---------|
| **LangGraph** | Callback handler | `pip install "pretia[langgraph]"` |
| **OpenAI Agents SDK** | RunHooks lifecycle | `pip install "pretia[openai]"` |
| **Qwen-Agent** | LLM proxy | `pip install "pretia[qwen]"` |
| **Generic** | `@collector.step()` decorator | `pip install pretia` |
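
The decorator-based collection method follows a familiar Python pattern. The sketch below is a stand-in that shows the pattern only: `SketchCollector` and its record fields are hypothetical, it does not import Pretia, and the real `@collector.step()` signature should be taken from the project's documentation.

```python
import functools
import time


class SketchCollector:
    """Illustrative stand-in for a step collector; not Pretia's implementation."""

    def __init__(self) -> None:
        self.records: list[dict] = []

    def step(self, name=None):
        """Decorator factory that records one entry per call of the wrapped step."""
        def decorator(func):
            step_name = name or func.__name__

            @functools.wraps(func)
            def wrapper(*args, **kwargs):
                started = time.perf_counter()
                try:
                    return func(*args, **kwargs)
                finally:
                    # Record even when the step raises, so failed calls are visible
                    elapsed = time.perf_counter() - started
                    self.records.append({"step": step_name, "seconds": elapsed})

            return wrapper

        return decorator


collector = SketchCollector()


@collector.step()
def classify(text: str) -> str:
    """Toy workflow step standing in for an LLM call."""
    return "billing" if "invoice" in text else "other"


classify("Where is my invoice?")
print(collector.records)
```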

## How It Works

Data flows through a five-stage pipeline:

1. **Collector** — Framework adapters instrument your workflow and emit unified StepRecords
2. **StepRecord** — Frozen dataclass capturing one LLM call: model, tokens, cost, timing, tool usage
3. **ProfileStore** — Persists profiling sessions as JSON (one workflow x N input runs)
4. **Projection** — Distributional scaling (p50-p99) for stable workflows, Monte Carlo for non-linear cases
5. **Recommendation** — Rule-based generators produce dollar-denominated optimization suggestions
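
Stages 2 and 3 can be pictured with a frozen dataclass serialized to JSON. This is a sketch, not the library's definitions: the class name `StepRecordSketch`, its field names, the model name, and the session layout are all assumptions derived from the description above.

```python
import json
from dataclasses import FrozenInstanceError, asdict, dataclass


@dataclass(frozen=True)
class StepRecordSketch:
    """Hypothetical shape of one recorded LLM call."""
    step: str
    model: str
    input_tokens: int
    output_tokens: int
    cost_usd: float
    duration_s: float
    tools_used: tuple[str, ...] = ()


record = StepRecordSketch(
    step="classify",
    model="example-model",      # placeholder model name
    input_tokens=820,
    output_tokens=12,
    cost_usd=0.0026,
    duration_s=0.41,
    tools_used=("lookup_account",),
)

# Frozen dataclasses reject mutation after construction
try:
    record.cost_usd = 0.0       # type: ignore[misc]
except FrozenInstanceError:
    print("records are immutable")

# Assumed session layout: one workflow, a list of runs, each run a list of steps
session = {"workflow": "support_agent", "runs": [[asdict(record)]]}
payload = json.dumps(session, indent=2)
print(payload)
```

Freezing the record keeps collected measurements from being altered by later pipeline stages, and `asdict` gives a plain structure that round-trips through JSON.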

The projection engine is validated against 13 real-world workflow archetypes (12/13 within 10% projection error).

## Positioning

**Langfuse** tells you what you spent. **Pretia** tells you what you'll spend. Use both.

Pretia sits above the LLM tooling stack. It detects when other tools are needed — it doesn't replace them. No proxy (use LiteLLM), no routing (use Martian), no tracing (use Langfuse), no evals (use Braintrust).

## Development

```bash
uv pip install -e ".[dev]"
pytest tests/unit/ -v
ruff check pretia/ tests/
ruff format pretia/ tests/
pyright pretia/
```

See [CLAUDE.md](CLAUDE.md) for architecture details and coding conventions.

## Contributing

Issues and PRs welcome. Run `pytest tests/unit/` and `ruff check pretia/ tests/` before opening a PR.

## License

[BSL 1.1](LICENSE) (Business Source License). Free for all use except offering Pretia as a commercial hosted service. Converts to Apache 2.0 on 2030-06-13.
