Metadata-Version: 2.4
Name: halton-meter
Version: 0.5.0
Summary: Local LLM API proxy — captures usage, attributes cost to projects.
Project-URL: Homepage, https://haltonmeter.com
Author-email: Halton Labs <operator@haltonlabs.com>
Maintainer-email: Halton Labs <operator@haltonlabs.com>
License-Expression: LicenseRef-Polyform-Perimeter-1.0.1
License-File: LICENSE
Keywords: anthropic,claude,cost-tracking,governance,llm,mitmproxy,observability,openai,proxy
Classifier: Development Status :: 4 - Beta
Classifier: Environment :: Console
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: System Administrators
Classifier: License :: Other/Proprietary License
Classifier: Operating System :: MacOS
Classifier: Operating System :: Microsoft :: Windows
Classifier: Operating System :: POSIX :: Linux
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: 3.14
Classifier: Topic :: Software Development
Classifier: Topic :: System :: Monitoring
Classifier: Topic :: System :: Networking :: Monitoring
Classifier: Typing :: Typed
Requires-Python: <3.15,>=3.11
Requires-Dist: aiosqlite>=0.19
Requires-Dist: certifi>=2024.0
Requires-Dist: click<9,>=8.1
Requires-Dist: fastapi>=0.110
Requires-Dist: greenlet>=3.0
Requires-Dist: httpx<1,>=0.27
Requires-Dist: mitmproxy<13,>=10.0
Requires-Dist: psutil>=6.0
Requires-Dist: pydantic<3,>=2.0
Requires-Dist: rich>=13.0
Requires-Dist: sqlalchemy[asyncio]>=2.0
Requires-Dist: structlog>=24.0
Requires-Dist: tomli>=2.0; python_version < '3.11'
Requires-Dist: uvicorn>=0.27
Description-Content-Type: text/markdown

# Halton Meter

**Local LLM API proxy. Observes outbound traffic, attributes every request to a project, computes exact cost, surfaces it via terminal reports and a loopback HTTP API.**

Designed for solo developers, agencies, and in-house dev teams who use Claude Code, the Anthropic SDK, OpenAI, Gemini, xAI, or other LLM clients heavily and want transparency over what's being spent and on what.

[![License: Polyform Perimeter 1.0.1](https://img.shields.io/badge/License-Polyform_Perimeter_1.0.1-blue.svg)](https://polyformproject.org/licenses/perimeter/1.0.1)
[![PyPI](https://img.shields.io/pypi/v/halton-meter.svg)](https://pypi.org/project/halton-meter/)
[![Python](https://img.shields.io/pypi/pyversions/halton-meter.svg)](https://pypi.org/project/halton-meter/)

---

## Platform support

**macOS is the primary supported platform.** Install + lifecycle + capture + reporting are exercised end-to-end on the maintainer's daily-driver before every release.

**Linux is in public beta as of v0.1.24.** Verified on Ubuntu 24.04 LTS against (a) a Docker + systemd-container soak run on every commit and (b) an AWS Lightsail instance with real `systemd-logind`, real `loginctl enable-linger`, real reboot survival, real Claude Code sessions captured end-to-end (Anthropic + Gemini interception confirmed). Other distros (Debian, Fedora, Arch) should work — same systemd `--user` mechanism — but are not yet covered by the release gate. **Please [open an issue](mailto:operator@haltonlabs.com) with `[linux-beta]` in the subject** if you hit anything; we want to graduate Linux to "primary supported" with broader distro coverage in v0.2.

Linux runs apps-mode only (env-var-only `HTTPS_PROXY` — no system-level proxy via `networksetup`-equivalent or transparent-proxy via iptables / nftables; that scope is on the post-v1.0 roadmap). The daemon + edge ship as systemd `--user` units under `~/.config/systemd/user/`; the watchdog unit is macOS-only (the launchd-based `networksetup` poller has no Linux counterpart). See [`LINUX.md`](LINUX.md) for the full Linux setup guide.

**Windows apps-mode is available as of v0.2.11 (beta).** `uvx halton-meter` works on Windows 10 / 11. No admin rights required for apps-mode. The daemon and edge run as Task Scheduler user-level tasks; the mitmproxy CA cert is installed into the user cert store via `certutil -user -addstore Root`; the system proxy is written to `HKCU\Software\Microsoft\Windows\CurrentVersion\Internet Settings` with a `WM_SETTINGCHANGE` broadcast so Electron apps (Claude Code, VS Code) pick it up without restart. Full-mode (machine-wide proxy, NSSM service, MDM cert deployment) is on the post-v1.0 roadmap. See [`INSTALL.md`](../INSTALL.md) for Windows setup instructions.

---

## Quick install

### Prerequisites

| Requirement | Notes |
|---|---|
| **[`uv`](https://docs.astral.sh/uv/)** | Install with `curl -LsSf https://astral.sh/uv/install.sh \| sh` (macOS / Linux) or `powershell -c "irm https://astral.sh/uv/install.ps1 \| iex"` (Windows). uv ships its own Python interpreter, so you do **not** need to install Python yourself. |
| **macOS or Linux** | macOS (any recent version) — primary support. Linux with systemd `--user` and `loginctl enable-linger` — public beta. Windows: distribution layer works, OS-level proxy install is future work. |
| Admin password (one-time) | Granted during `init` to install the mitmproxy CA cert into the macOS System keychain. |

Everything else (mitmproxy, FastAPI, SQLAlchemy, structlog, Rich, Click, certifi) is bundled in the wheel. No Docker, no Postgres, no separate backend.

### Install

Pick one — they all install from the same PyPI wheel:

```bash
# Option A — uvx (canonical; zero Python prereq, no PATH change).
uvx halton-meter --version

# Option B — uv tool install (persistent; puts `halton-meter` on PATH).
uv tool install halton-meter
halton-meter --version

# Option C — pipx, if you already use it.
pipx install halton-meter
halton-meter --version
```

Then run the setup flow (substitute `uvx halton-meter` for `halton-meter` if you chose Option A):

```bash
# one-time setup: cert trust + launchd supervisors + per-shell env block
halton-meter init --apps

# start metering
halton-meter start

# verify health
halton-meter status
```

After this, every terminal you open and every Spotlight-launched app routes its LLM API traffic through the meter. Existing terminals get the env on next shell open.

To stop metering: `halton-meter stop`. To remove entirely: `halton-meter uninstall`.

> **Graceful uninstall.** `uninstall` stops metering and removes the launchd supervisor plists, but the edge process keeps running in **pure-passthrough mode** until next reboot. This is intentional — apps with `HTTPS_PROXY=http://127.0.0.1:8081` baked into their environment (running terminals, IDEs, browsers) don't lose connectivity mid-flow. On reboot the edge dies with your session and isn't respawned. To kill it immediately: `pkill -f "halton-meter edge"`. To also wipe `~/.halton-meter/db.sqlite` and logs, add `--purge`.

---

## What it does

**Captures and costs every outbound LLM API call across these providers:**

- **Anthropic** — `api.anthropic.com` (`/v1/messages`, streaming + non-streaming, extended thinking, cache read/write)
- **OpenAI** — `api.openai.com` (`/v1/chat/completions`, `/v1/responses`, `/v1/embeddings`; o-series reasoning tokens; prompt cache reads)
- **Gemini** — `generativelanguage.googleapis.com` (`/v1beta/models/<model>:generateContent`, `:streamGenerateContent`, `:embedContent`, `:batchEmbedContents`; Gemini 2.5 thinking tokens; >200k tiered pricing; cached-content reads)
- **xAI** — `api.x.ai` (`/v1/chat/completions` OpenAI-compat shape; `/v1/messages` Anthropic-compat shape)

**Project attribution** via `HALTON_PROJECT` env var, a `.haltonrc` file in the working directory, or the working directory name. macOS GUI apps get attributed via launchctl bundle-ID humanisation (`mac:com.docker.docker` → `Docker`).

**Storage** — local SQLite (`~/.halton-meter/db.sqlite`) with WAL. Token counts, latency, cost in integer millicents (no float drift), and full request/response bodies with credential redaction. Body capture default ON; opt out per project with `halton-meter project <slug> set body-capture off`. Background retention sweep keeps disk bounded (default 90 days).

**Loopback HTTP API** at `127.0.0.1:8765` for the cloud dashboard or any reader. JSON in, JSON out. Audit trail of every request, policy event, and reconciliation run.

**Cost report** — single-command client-billing surface:

```bash
halton-meter report                           # default window
halton-meter report --since 7d                # last 7 days
halton-meter report --from 2026-04-01 --to 2026-05-01
halton-meter report --vs-previous             # Δ% vs prior period per row
halton-meter report --by project              # filter to one breakdown
halton-meter report --csv > spend.csv         # export
halton-meter report --json | jq               # pipe to dashboard
```

Layout adapts to terminal width (60 / 80 / 100 / 120 cols). Includes share-of-spend bars, p95 latency, and per-project 7-day trend sparklines.

**Failure mode** — if the daemon is stopped, upgraded, or crashes, the always-on edge process transparently tunnels traffic direct to upstream. Your apps never see `ConnectionRefused`. Logging is observation infrastructure, never a single point of failure.

---

## Pricing rates

Halton Meter ships with a bundled per-model pricing matrix sourced from each provider's public price page. Every captured request is costed against this matrix at write time and the rate-source is stamped on the row (`requests.rates_source = "bundled-YYYY-MM-DD"` or `"override"`), so every figure in a report or CSV export is traceable to a specific snapshot.

Three layers of freshness defence:

1. **Visible.** `halton-meter report` prints `Rates: bundled YYYY-MM-DD · N override(s)` under the masthead.
2. **Overridable.** `halton-meter pricing set / list / reset` records operator-set rates. Future captures use them immediately; historical `halton-meter recompute-costs` consults overrides first.
3. **Probed.** `halton-meter doctor` does a 2s loopback fetch of `rates-manifest.json` and warns if the bundled date trails the latest published bundle. Fail-open: any network/parse error leaves the check silent. Disable with `HALTON_METER_NO_RATES_PROBE=1`.

Cost math itself never touches the network. The freshness probe is an opt-out signal — it never affects pricing, only the doctor row.

---

## Three install modes

| Mode | Invoke | What's metered | Risk |
|---|---|---|---|
| **env-var-only** (default) | `halton-meter init` | Whatever you explicitly route via `halton-meter run <cmd>` (Claude Code, the Anthropic SDK, `python my_script.py`, an interactive subshell with `--shell`, …). | None — never touches the system proxy. |
| **apps** | `halton-meter init --apps` | Spotlight/Dock-spawned apps that respect `HTTPS_PROXY` env (Claude Desktop, JetBrains IDEs, Cursor, Windsurf), and new shells launched after the install. Not browsers. | None — system proxy still untouched. |
| **full** | `halton-meter init --full` | All of `apps` plus browsers and GUI apps that ignore `HTTPS_PROXY` env (every other client) via the macOS system proxy. | May break HSTS browser sites if the CA isn't trusted by SecTrust. The daemon refuses to enable the proxy when cert verification fails. |

Switching modes is one command — the install path reconciles all mode-specific state in either direction. No uninstall step required.

### Decision tree

- **"I just want to meter my Claude Code / SDK calls."** → `init` (env-only default)
- **"I want every app I open from Spotlight metered, but I don't want my browser to break."** → `init --apps`
- **"I want every HTTPS request my Mac makes, including browser traffic."** → `init --full`

---

## Branded CLI surfaces

Every user-facing command renders with the brand-orange Halton Meter mark and editorial sections:

- `halton-meter status` — aggregate verdict (HEALTHY / DEGRADED / BROKEN), Component Health table with pill state indicators, ground-truth port resolution, sentinel state.
- `halton-meter doctor` — row-by-row diagnostic with pill severity and copy-pasteable next-actions.
- `halton-meter report` — masthead with date range, summary KV block, by-project + by-model tables with share-of-spend bars and 7-day sparklines.
- `halton-meter init / uninstall` — sectioned step rows (`supervisor (launchd)` / `certificate trust` / `shell environment`), idempotency-aware.
- `halton-meter start / stop` — masthead + step panel with PIDs and `/health` latency.

Layouts adapt to terminal width. The aggregate verdict is always honest — never says HEALTHY when truth disagrees.

---

## Useful commands

```bash
halton-meter init [--apps|--full] [--no-shell-rc]   # install (re-runnable; idempotent)
halton-meter start | stop                           # supervisor control
halton-meter status [--json]                        # mode-aware HEALTHY / DEGRADED / BROKEN
halton-meter doctor [--json] [--curl]               # detailed diagnostics
halton-meter report [--since 7d | --from … --to …]  # cost report (see above)
halton-meter run <cmd> [args...]                    # exec wrapper that injects metering env
halton-meter run --shell                            # interactive subshell with metering env
halton-meter project <slug> set body-capture off    # opt out of body capture per project
halton-meter bodies show <request_id>               # inspect a captured body
halton-meter bodies stats                           # body-capture disk usage
halton-meter uninstall [--purge]                    # remove (graceful — see above)
halton-meter reset-proxy                            # emergency: disable system proxy
halton-meter version | --version                    # version
```

`halton-meter doctor` is the diagnostic of last resort: it prints every signal that matters (daemon health, cert trust, system proxy state, launchctl env, shell rc marker, env in current process) and ends with a top-line verdict and concrete next-actions. `--curl` adds an end-to-end TLS smoke test against `https://example.com` (hard-coded — never a provider domain).

---

## Architecture (edge ↔ daemon split)

```
┌────────────────┐         ┌─────────────────────────┐
│ apps           │         │   halton-meter-edge     │
│ (Claude Code,  │ ─CONNECT──►  always-on TCP proxy  │
│  SDKs, ...)    │         │   listens on :8081      │
│ HTTPS_PROXY=...│         └────────┬─────────┬──────┘
└────────────────┘                  │         │
                              healthy?       not healthy
                                    │         │
                                    ▼         ▼
                       ┌────────────────┐  ┌─────────────────┐
                       │  halton-meter  │  │  upstream API   │
                       │  daemon        │  │  (direct tunnel,│
                       │                │  │   no metering)  │
                       │  mitmproxy +   │  └─────────────────┘
                       │  FastAPI +     │
                       │  SQLite        │
                       └────────────────┘
```

Three macOS LaunchAgents:
- `com.haltonlabs.meter.edge` — always-on TCP CONNECT proxy on `:8081`. Apps see this in `HTTPS_PROXY`. When the daemon is healthy, chains CONNECTs to it for full metering. When the daemon is down, tunnels direct.
- `com.haltonlabs.meter` — daemon (mitmproxy + FastAPI + SQLite) on `:8090` internal + `:8765` API.
- `com.haltonlabs.meter.watchdog` — Layer-2 watchdog. Polls `/health` every 1.5s; disables the system proxy after ~7.5s of unresponsiveness so traffic flows direct.

`halton-meter stop` no longer breaks running apps. They keep working in passthrough until you `halton-meter start` again. `halton-meter uninstall` drops the supervisor plists; the edge keeps running in passthrough until reboot or manual `pkill`.

---

## What it doesn't do

- Doesn't wrap SDKs — your code stays exactly as it is
- Doesn't intercept anything you don't ask it to — only configured LLM endpoints
- Doesn't send your data anywhere by default — runs entirely locally
- Doesn't break your workflow — if the proxy fails, traffic falls through to the real provider

## Known limitations

Honest list, not a roadmap. Apps that ignore both the macOS system proxy AND `HTTPS_PROXY` env have no general capture path:

- **Go binaries with `GODEBUG=netdns=go`.** Go's pure-Go resolver path bypasses `HTTPS_PROXY` env in some builds. Calls reach the provider unmetered.
- **libcurl callers using `--insecure` or with `CURLOPT_PROXY` not honoured.** `--insecure` callers skip TLS verification entirely.
- **WSL2 networking on Windows hosts.** Currently doesn't route WSL2 guest traffic through the Windows-side proxy. Linux-native and macOS are the supported platforms.
- **Hardcoded HTTP stacks that open raw TLS sockets to known provider IPs.** Anything that bypasses both system proxy AND `HTTPS_PROXY` env will reach the provider unmetered.
- **Late-coming network interfaces.** macOS network services are enumerated once when the daemon starts. If you plug in a new interface (Thunderbolt dock, USB-tethered iPhone), restart with `halton-meter stop && halton-meter start`.

---

## Project

Halton Meter is source-available under the [Polyform Perimeter License 1.0.1](https://polyformproject.org/licenses/perimeter/1.0.1). Built by [Halton Labs](https://haltonlabs.com).

- Website: <https://haltonmeter.com>
- Contact / support / bug reports: <operator@haltonlabs.com>
- Security disclosures: <operator@haltonlabs.com> (subject prefix `[security]`) — please do not post to public channels
- Source: available on request — email <operator@haltonlabs.com> with your use case

This package is the **daemon component** — the local proxy that intercepts, attributes, and costs LLM API traffic. The cloud dashboard and Phase 3 multi-tenant SaaS are developed separately.
