Metadata-Version: 2.4
Name: llm-observe-proxy
Version: 0.2.0
Summary: OpenAI-compatible LLM proxy with SQLite request capture, observability, and an admin UI.
Project-URL: Homepage, https://github.com/shamitv/llm-observe-proxy
Project-URL: Repository, https://github.com/shamitv/llm-observe-proxy
Project-URL: Issues, https://github.com/shamitv/llm-observe-proxy/issues
Project-URL: Changelog, https://github.com/shamitv/llm-observe-proxy/blob/main/CHANGELOG.md
Author: LLM Observe Proxy Contributors
License-Expression: MIT
License-File: LICENSE
Keywords: admin-ui,fastapi,llm,observability,openai,proxy,sqlite
Classifier: Development Status :: 3 - Alpha
Classifier: Environment :: Web Environment
Classifier: Framework :: FastAPI
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Topic :: Internet :: WWW/HTTP :: HTTP Servers
Classifier: Topic :: Software Development :: Debuggers
Requires-Python: >=3.10
Requires-Dist: fastapi>=0.136.1
Requires-Dist: httpx>=0.28.1
Requires-Dist: jinja2>=3.1.6
Requires-Dist: markdown-it-py>=4.0.0
Requires-Dist: python-multipart>=0.0.27
Requires-Dist: sqlalchemy>=2.0.49
Requires-Dist: uvicorn[standard]>=0.46.0
Provides-Extra: dev
Requires-Dist: build>=1.5.0; extra == 'dev'
Requires-Dist: playwright>=1.59.0; extra == 'dev'
Requires-Dist: pytest-asyncio>=1.3.0; extra == 'dev'
Requires-Dist: pytest>=9.0.3; extra == 'dev'
Requires-Dist: python-dotenv>=1.2.2; extra == 'dev'
Requires-Dist: respx>=0.23.1; extra == 'dev'
Requires-Dist: ruff>=0.15.12; extra == 'dev'
Requires-Dist: twine>=6.2.0; extra == 'dev'
Description-Content-Type: text/markdown

# LLM Observe Proxy

`llm-observe-proxy` is an OpenAI-compatible, record-only proxy for inspecting LLM
traffic. It forwards requests to an upstream `/v1` API, stores requests and responses in
SQLite, and provides a local admin UI for browsing and pretty-printing captured traffic,
grouping requests into task runs, trimming old records, and changing runtime settings.

It is useful when you want LiteLLM-style observability without introducing a full gateway
or external database.

Project repository: https://github.com/shamitv/llm-observe-proxy

## Features

- OpenAI-compatible passthrough route: `ANY /v1/{path:path}`.
- SQLite capture for request/response headers, bodies, status, timing, model, endpoint,
  streaming state, tool-call signals, image assets, and errors.
- Admin UI for searching and browsing captured traffic, including per-request output TPS.
- Runs for grouping all requests made during a task, benchmark, or repro workflow.
- Run detail pages with request counts, LLM wall time, token totals, tokens/sec, model
  and endpoint breakdowns, and signal/error counts.
- Detail pages with response render modes for JSON, plain text, Markdown, tool calls,
  and raw SSE streams.
- Request image gallery for data URL and remote image references.
- Settings UI for upstream URL, model upstream routes, incoming host/port preferences,
  all-IPs exposure, and retention trimming.
- Config-driven model routes for sending selected proxy-facing model names to different
  upstream `/v1` endpoints with optional upstream model rewrites and API key injection.
- No authentication by default; intended for local or trusted development networks.

## Install

From PyPI with `pip`:

```powershell
python -m pip install llm-observe-proxy
llm-observe-proxy
```

From PyPI with `uv`:

```powershell
uv tool install llm-observe-proxy
llm-observe-proxy
```

Run it once without installing:

```powershell
uvx llm-observe-proxy
```

By default, the proxy listens on:

```text
http://localhost:8080
```

and forwards requests to:

```text
http://localhost:8000/v1
```

Open the admin UI:

```text
http://localhost:8080/admin
```
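
To confirm the proxy is up before pointing a client at it, hit the `GET /healthz` route
(listed under Routes below). A minimal sketch using `httpx`, which the proxy already
depends on, assuming the default port:

```python
import httpx

# Assumes the proxy is running on the default http://localhost:8080.
resp = httpx.get("http://localhost:8080/healthz")
print(resp.status_code, resp.text)
```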

## Usage

Point an OpenAI-compatible client at the proxy:

```python
from openai import OpenAI

client = OpenAI(
    api_key="local-dev-key",
    base_url="http://localhost:8080/v1",
)

response = client.chat.completions.create(
    model="gpt-demo",
    messages=[{"role": "user", "content": "Hello through the proxy"}],
)
print(response.choices[0].message.content)
```
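
Streaming requests pass through the proxy as well and are captured as raw SSE. A minimal
sketch reusing the `client` from the example above; `gpt-demo` is only a placeholder
model name:

```python
stream = client.chat.completions.create(
    model="gpt-demo",
    messages=[{"role": "user", "content": "Stream through the proxy"}],
    stream=True,
)
for chunk in stream:
    # Each chunk carries a partial delta; the proxy records the full SSE stream.
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
print()
```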

Run on a different port:

```powershell
llm-observe-proxy --port 8090
```

Expose on all interfaces:

```powershell
llm-observe-proxy --expose-all-ips
```

Set the upstream from the CLI:

```powershell
llm-observe-proxy --upstream-url http://localhost:8000/v1
```

Load model-specific upstream routes from a JSON file:

```powershell
llm-observe-proxy --models-file .\models.json
```

You can also change the upstream URL, model upstream routes, and next-start incoming
host/port settings from `/admin/settings`.

## Model Routes

Model routes let one proxy endpoint send different client-facing models to different
OpenAI-compatible upstreams. Routes match the top-level `model` field of the request
payload exactly. Unknown models, requests whose body has no `model` field, and generic
calls such as `GET /v1/models` fall back to the global upstream.

Example route file:

```json
[
  {
    "model": "local-qwen",
    "upstream_url": "http://localhost:8000/v1",
    "upstream_model": "qwen3-coder-30b"
  },
  {
    "model": "openai-mini",
    "upstream_url": "https://api.openai.com/v1",
    "upstream_model": "gpt-4.1-mini",
    "api_key_env": "OPENAI_API_KEY"
  }
]
```

Run with the file:

```powershell
$env:OPENAI_API_KEY = "sk-..."
llm-observe-proxy --models-file .\models.json
```

You can also set `LLM_OBSERVE_MODELS_JSON` to the same JSON array. If both
`LLM_OBSERVE_MODELS_FILE` and `LLM_OBSERVE_MODELS_JSON` are set, the file wins.
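
As an alternative to a route file, a launcher script can serialize the same routes into
`LLM_OBSERVE_MODELS_JSON` before starting the proxy. A sketch with placeholder route
values:

```python
import json
import os
import subprocess

routes = [
    {
        "model": "local-qwen",
        "upstream_url": "http://localhost:8000/v1",
        "upstream_model": "qwen3-coder-30b",
    },
]

env = os.environ.copy()
# Equivalent to passing --models-file, but supplied inline via the environment.
env["LLM_OBSERVE_MODELS_JSON"] = json.dumps(routes)
subprocess.run(["llm-observe-proxy"], env=env)
```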

You can add, update, and delete UI-managed model routes from `/admin/settings`.
UI-managed routes are stored in SQLite and take effect immediately. Routes loaded from
`--models-file`, `LLM_OBSERVE_MODELS_FILE`, or `LLM_OBSERVE_MODELS_JSON` remain read-only
in the UI, and duplicate model names are rejected.

When a route has an API key, the proxy injects `Authorization: Bearer <key>` into the
upstream request. Captured request headers remain the original client headers; injected
keys are not stored or shown in the admin UI. UI-managed routes store only the
`api_key_env` variable name rather than a key value; prefer `api_key_env` for configs you
share.

## Runs

Use **Runs** when you want to measure or review LLM usage for one bounded task, such as
processing a video, comparing local and cloud models, or reproducing an agent issue.

1. Open `/admin/runs` or use the run control on `/admin`.
2. Enter a required run name and choose **Start run**.
3. Run your application or benchmark through the proxy.
4. Choose **End run** when the task is complete.

Starting a new run automatically ends any existing active run. Requests made while a run
is active are linked to that run; requests outside a run are still captured normally.
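
Runs can also be driven programmatically through the `POST /admin/runs/start` and
`POST /admin/runs/end` routes listed under Routes below. A sketch using `httpx`; the form
field name `name` is an assumption, so check the `/admin/runs` form if it does not match:

```python
import httpx

base = "http://localhost:8080"

# Start a named run; the "name" form field is assumed, not confirmed by this README.
httpx.post(f"{base}/admin/runs/start", data={"name": "video-batch-1"})

# ... drive your workload through the proxy here ...

# End the active run so its metrics cover only this task.
httpx.post(f"{base}/admin/runs/end")
```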

The request browser can filter by run, and request rows link back to their run. The run
detail page reports LLM wall time from the first request start to the last response
completion, plus token totals and tokens/sec metrics. The request table's **TPS** column
shows per-request output tokens per second when token usage and duration are available.

Screenshots and the full developer README are available in the project repository:
https://github.com/shamitv/llm-observe-proxy

## Routes

- `ANY /v1/{path:path}`: OpenAI-compatible pass-through proxy.
- `GET /admin`: request browser.
- `GET /admin/requests/{id}`: request/response detail view.
- `GET /admin/runs`: run browser and active run controls.
- `GET /admin/runs/{id}`: run metrics and associated request list.
- `POST /admin/runs/start`: start a named run, ending any active run first.
- `POST /admin/runs/end`: end the active run.
- `GET /admin/settings`: upstream settings and retention tools.
- `POST /admin/settings/incoming`: update incoming host/port settings for next startup.
- `POST /admin/settings/upstream`: update upstream URL.
- `POST /admin/settings/model-routes`: create or update a UI-managed model route.
- `POST /admin/settings/model-routes/delete`: delete a UI-managed model route.
- `POST /admin/trim`: delete records older than `N` days.
- `GET /healthz`: health check.

## Configuration

Environment variables:

| Variable | Default | Purpose |
| --- | --- | --- |
| `LLM_OBSERVE_DATABASE_URL` | `sqlite:///./llm_observe_proxy.sqlite3` | SQLite SQLAlchemy URL. |
| `LLM_OBSERVE_INCOMING_HOST` | `localhost` | Bind host when not exposing all IPs. |
| `LLM_OBSERVE_INCOMING_PORT` | `8080` | Bind port. |
| `LLM_OBSERVE_EXPOSE_ALL_IPS` | `false` | Bind to `0.0.0.0` when true. |
| `LLM_OBSERVE_UPSTREAM_URL` | `http://localhost:8000/v1` | Upstream OpenAI-compatible `/v1` base URL. |
| `LLM_OBSERVE_MODELS_JSON` | unset | JSON array of model route objects. |
| `LLM_OBSERVE_MODELS_FILE` | unset | Path to a JSON file containing model routes. Wins over `LLM_OBSERVE_MODELS_JSON`. |
| `LLM_OBSERVE_LOG_LEVEL` | `INFO` | Uvicorn log level. |

Incoming host/port settings saved in the UI are used on the next process startup; they do
not rebind a currently running process.
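
Because captures live in a single SQLite file (by default `llm_observe_proxy.sqlite3` in
the working directory), you can also inspect them outside the admin UI. A sketch that
only lists the tables, since the exact schema is not documented here:

```python
import sqlite3

# Default path derived from LLM_OBSERVE_DATABASE_URL.
con = sqlite3.connect("llm_observe_proxy.sqlite3")
for (name,) in con.execute("SELECT name FROM sqlite_master WHERE type='table'"):
    print(name)
con.close()
```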

## Tests

```powershell
.\.venv\Scripts\ruff.exe check src tests
.\.venv\Scripts\python.exe -m compileall -q src tests
.\.venv\Scripts\pytest.exe -q
```

The test suite starts a fake upstream on `localhost:8080/v1`, so stop any local process
using port `8080` before running tests.

## Publishing

See the repository publishing guide for name checks, build commands, and the pre-publish
checklist.

## License

MIT.
