Metadata-Version: 2.1
Name: ooai-llm
Version: 0.5.2
Summary: Typed LLM settings, LangChain-first factories, LiteLLM metadata enrichment, and callback helpers.
Keywords: llm,langchain,pydantic,settings,ai
Author: OpenAI
License: MIT
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Typing :: Typed
Project-URL: Homepage, https://github.com/pr1m8/ooai-llm
Project-URL: Documentation, https://ooai-llm.readthedocs.io/
Project-URL: Repository, https://github.com/pr1m8/ooai-llm
Project-URL: Issues, https://github.com/pr1m8/ooai-llm/issues
Project-URL: Changelog, https://github.com/pr1m8/ooai-llm/releases
Requires-Python: >=3.13
Requires-Dist: pydantic>=2.12.0
Requires-Dist: pydantic-settings>=2.12.0
Requires-Dist: langchain>=1.2.0
Requires-Dist: langchain-community>=0.4.0
Requires-Dist: sqlalchemy>=2.0.0
Provides-Extra: openai
Requires-Dist: langchain-openai>=0.3.0; extra == "openai"
Requires-Dist: openai>=1.0.0; extra == "openai"
Provides-Extra: anthropic
Requires-Dist: langchain-anthropic>=0.3.0; extra == "anthropic"
Requires-Dist: anthropic>=0.39.0; extra == "anthropic"
Provides-Extra: google
Requires-Dist: langchain-google-genai>=2.1.0; extra == "google"
Requires-Dist: google-genai>=1.30.0; extra == "google"
Provides-Extra: xai
Requires-Dist: langchain-xai>=0.2.0; extra == "xai"
Requires-Dist: xai-sdk>=0.3.0; extra == "xai"
Provides-Extra: deepseek
Requires-Dist: langchain-deepseek>=0.1.0; extra == "deepseek"
Requires-Dist: openai>=1.0.0; extra == "deepseek"
Provides-Extra: mistral
Requires-Dist: langchain-mistralai>=0.2.0; extra == "mistral"
Requires-Dist: mistralai>=1.0.0; extra == "mistral"
Provides-Extra: litellm
Requires-Dist: litellm>=1.74.0; extra == "litellm"
Provides-Extra: cli
Requires-Dist: rich>=13.9.0; extra == "cli"
Provides-Extra: benchmarks
Requires-Dist: rich>=13.9.0; extra == "benchmarks"
Provides-Extra: tui
Requires-Dist: textual>=0.89.0; extra == "tui"
Requires-Dist: rich>=13.9.0; extra == "tui"
Requires-Dist: litellm>=1.74.0; extra == "tui"
Provides-Extra: logging
Requires-Dist: ultilog>=0.3.0; extra == "logging"
Provides-Extra: redis
Requires-Dist: redis>=5.0.0; extra == "redis"
Provides-Extra: upstash
Requires-Dist: upstash-redis>=1.2.0; extra == "upstash"
Provides-Extra: caches
Requires-Dist: redis>=5.0.0; extra == "caches"
Requires-Dist: upstash-redis>=1.2.0; extra == "caches"
Provides-Extra: providers
Requires-Dist: langchain-openai>=0.3.0; extra == "providers"
Requires-Dist: openai>=1.0.0; extra == "providers"
Requires-Dist: langchain-anthropic>=0.3.0; extra == "providers"
Requires-Dist: anthropic>=0.39.0; extra == "providers"
Requires-Dist: langchain-google-genai>=2.1.0; extra == "providers"
Requires-Dist: google-genai>=1.30.0; extra == "providers"
Requires-Dist: langchain-xai>=0.2.0; extra == "providers"
Requires-Dist: xai-sdk>=0.3.0; extra == "providers"
Requires-Dist: langchain-deepseek>=0.1.0; extra == "providers"
Requires-Dist: langchain-mistralai>=0.2.0; extra == "providers"
Requires-Dist: mistralai>=1.0.0; extra == "providers"
Provides-Extra: all
Requires-Dist: langchain-openai>=0.3.0; extra == "all"
Requires-Dist: openai>=1.0.0; extra == "all"
Requires-Dist: langchain-anthropic>=0.3.0; extra == "all"
Requires-Dist: anthropic>=0.39.0; extra == "all"
Requires-Dist: langchain-google-genai>=2.1.0; extra == "all"
Requires-Dist: google-genai>=1.30.0; extra == "all"
Requires-Dist: langchain-xai>=0.2.0; extra == "all"
Requires-Dist: xai-sdk>=0.3.0; extra == "all"
Requires-Dist: langchain-deepseek>=0.1.0; extra == "all"
Requires-Dist: langchain-mistralai>=0.2.0; extra == "all"
Requires-Dist: mistralai>=1.0.0; extra == "all"
Requires-Dist: litellm>=1.74.0; extra == "all"
Requires-Dist: rich>=13.9.0; extra == "all"
Requires-Dist: textual>=0.89.0; extra == "all"
Requires-Dist: ultilog>=0.3.0; extra == "all"
Requires-Dist: redis>=5.0.0; extra == "all"
Requires-Dist: upstash-redis>=1.2.0; extra == "all"
Provides-Extra: docs
Requires-Dist: sphinx>=9.1.0; extra == "docs"
Requires-Dist: furo>=2025.12.19; extra == "docs"
Requires-Dist: sphinx-autobuild>=2025.8.25; extra == "docs"
Requires-Dist: myst-parser>=5.0.0; extra == "docs"
Requires-Dist: myst-nb>=1.4.0; extra == "docs"
Requires-Dist: sphinx-copybutton>=0.5.2; extra == "docs"
Requires-Dist: sphinx-design>=0.7.0; extra == "docs"
Requires-Dist: sphinx-togglebutton>=0.4.5; extra == "docs"
Requires-Dist: sphinx-inline-tabs>=2025.12.21.14; extra == "docs"
Requires-Dist: sphinxcontrib-mermaid>=2.0.1; extra == "docs"
Requires-Dist: sphinx-autodoc-typehints>=3.10.2; extra == "docs"
Requires-Dist: autodoc-pydantic>=2.2.0; extra == "docs"
Requires-Dist: sphinx-autoapi>=3.8.0; extra == "docs"
Requires-Dist: sphinxext-opengraph>=0.13.0; extra == "docs"
Requires-Dist: sphinx-sitemap>=2.9.0; extra == "docs"
Requires-Dist: sphinx-notfound-page>=1.1.0; extra == "docs"
Requires-Dist: sphinx-last-updated-by-git>=0.3.8; extra == "docs"
Requires-Dist: sphinx-reredirects>=1.1.0; extra == "docs"
Requires-Dist: sphinxcontrib-spelling>=8.0.2; extra == "docs"
Requires-Dist: doc8>=2.0.0; extra == "docs"
Provides-Extra: test
Requires-Dist: pytest>=9.0.0; extra == "test"
Requires-Dist: pytest-cov>=7.0.0; extra == "test"
Requires-Dist: coverage[toml]>=7.10.0; extra == "test"
Provides-Extra: dev
Requires-Dist: pytest>=9.0.0; extra == "dev"
Requires-Dist: pytest-cov>=7.0.0; extra == "dev"
Requires-Dist: coverage[toml]>=7.10.0; extra == "dev"
Requires-Dist: build>=1.3.0; extra == "dev"
Requires-Dist: twine>=6.2.0; extra == "dev"
Requires-Dist: commitizen>=4.9.0; extra == "dev"
Description-Content-Type: text/markdown

# ooai-llm

[![CI](https://github.com/pr1m8/ooai-llm/actions/workflows/ci.yml/badge.svg)](https://github.com/pr1m8/ooai-llm/actions/workflows/ci.yml)
[![Docs](https://readthedocs.org/projects/ooai-llm/badge/?version=latest)](https://ooai-llm.readthedocs.io/en/latest/)
[![PyPI](https://img.shields.io/pypi/v/ooai-llm.svg)](https://pypi.org/project/ooai-llm/)
[![Python](https://img.shields.io/pypi/pyversions/ooai-llm.svg)](https://pypi.org/project/ooai-llm/)
[![Coverage](https://img.shields.io/codecov/c/github/pr1m8/ooai-llm.svg)](https://codecov.io/gh/pr1m8/ooai-llm)

Typed LLM settings, provider-aware model-string parsing, LangChain-first chat
model creation, live provider model discovery, LiteLLM pricing enrichment, and
usage/cost callback helpers for Python applications.

## What This Is

`ooai-llm` is a small integration layer for application code that already wants
to use LangChain model classes directly, but does not want to repeat the same
provider configuration, model defaults, env-var handling, cache setup, metadata
lookup, and usage accounting in every project.

It is not a router, proxy, agent framework, or hosted model catalog.

## Current Status

Implemented now:

- Model factories: `create_llm(...)`, `create_llm_bundle(...)`, `ChatModelProfile`, and `LLM`.
- Model inventory: `ooai-llm models list` with provider, date, capability, context, and cost filters.
- Raw LiteLLM registry exploration: `ooai-llm models list --all-litellm` and
  `ooai-llm tui --catalog-all` for broad provider-label scans.
- Model comparisons: `ooai-llm models compare` plus Python helpers for
  cheapest-per-provider, coding-only, and calls-per-dollar estimates.
- Model shortcuts: `ooai-llm models cheapest`, `ooai-llm models coding`, and
  `ooai-llm recipes` for the common package/CLI workflows.
- Model suites: named, iterable shortlists for comparing providers, creating
  multiple profiles/runtimes, or wiring LangGraph nodes.
- Model default refresh: `refresh_model_defaults(...)`, `update_model_defaults(...)`, `ooai-llm models update`, and opt-in factory auto-refresh.
- Caches: SQLite, memory, SQLAlchemy, Redis, Upstash Redis, and profile-level namespaced cache keys.
- Accounting and observability: observed LangChain/LiteLLM usage events, cost
  estimates from LiteLLM pricing metadata, run labels, tags, metadata,
  summaries, stable runtime IDs, and optional `ultilog` logging.

Planned next:

- General token estimation for a string or file path.
- Provider preflight counting for exact OpenAI, Anthropic, and Gemini request payloads.
- Local tokenizer estimates for OpenAI-style text and open-model tokenizers.

## Features

- Typed `ModelString` parsing for bare, LangChain-style, and LiteLLM-style model names.
- Provider inference for OpenAI, Anthropic, Google GenAI, xAI, DeepSeek, and Mistral.
- `AppSettings` with provider credentials, default aliases, provider presets, cache settings, catalog settings, and LiteLLM settings.
- Native and app-prefixed credential env vars, such as `OPENAI_API_KEY` and `OOAI_OPENAI_API_KEY`.
- LangChain global cache bootstrap for SQLite, memory, SQLAlchemy, Redis, and Upstash Redis.
- `create_llm(...)`, a thin wrapper around LangChain `init_chat_model(...)`.
- `create_llm_bundle(...)`, which returns the model, resolved metadata, and reasoning resolution together.
- Serializable `ChatModelProfile` configs that can create chat models, bundles, or an `LLM` runtime.
- `LLM` runtime wrapper with lazy model construction and observed usage/cost totals.
- Stable `LLM.id` and `LLM.uuid` values for tying logs, LangChain metadata, and usage events together.
- Namespaced cache-key policy for profile-specific LangChain cache entries.
- Live model listing through provider SDKs or REST fallbacks.
- Cross-provider model catalog filtering by release date, capability, cost, context, and provider.
- Cost-ranked catalog comparisons for a chosen input/output token shape,
  including calls per dollar and baseline equivalence ratios.
- Rich CLI tables for catalog, comparison, and suite views, with summary
  panels, compact cost formatting, and capability badges.
- Optional Textual TUI with model catalog, cheapest, coding, suite, profile,
  and benchmark exploration views.
- Configurable model suites from provider presets or filtered catalog rows.
- Provider-generic default refresh from live model catalogs or LiteLLM metadata.
- Opt-in automatic factory refresh so aliases such as `latest` can update at runtime.
- LangChain profile + LiteLLM pricing metadata in one `ModelInfo` object.
- Provider-aware reasoning kwargs for OpenAI, Anthropic, Gemini, xAI, DeepSeek, and Mistral.
- Usage and cost helpers for LangChain metadata and LiteLLM callbacks.
- Unit, integration, e2e, and live-provider tests, plus coverage reports, docs builds, package builds, and a PyPI release workflow.

## Installation

Base package:

```bash
pip install ooai-llm
```

With PDM:

```bash
pdm add ooai-llm
```

Install only the extras you use. Provider, tooling, and cache extras are all
optional; quote the requirement so shells such as zsh do not expand the brackets:

```bash
pdm add "ooai-llm[openai]"
pdm add "ooai-llm[anthropic]"
pdm add "ooai-llm[deepseek]"
pdm add "ooai-llm[mistral]"
pdm add "ooai-llm[litellm]"
pdm add "ooai-llm[cli]"
pdm add "ooai-llm[tui]"
pdm add "ooai-llm[benchmarks]"
pdm add "ooai-llm[logging]"
pdm add "ooai-llm[redis]"
pdm add "ooai-llm[upstash]"
pdm add "ooai-llm[caches]"
```

Gemini and xAI are available as `ooai-llm[google]` and `ooai-llm[xai]`, but you
can skip those extras entirely if you do not have those keys.

Install `ooai-llm[logging]` when you want `ultilog` Rich/JSON logging. The base
package still works without it and falls back to standard-library logging.
Install `ooai-llm[tui]` when you want the optional Textual explorer.

## Factory Quick Start

```python
from ooai_llm import AppSettings, configure_global_llm_cache, create_llm

settings = AppSettings()
configure_global_llm_cache(settings)

llm = create_llm(
    alias="testing",
    settings=settings,
    temperature=0,
    reasoning="fast",
)

print(type(llm).__name__)
```

Most applications only need the factory plus settings:

```python
from ooai_llm import AppSettings, create_llm

settings = AppSettings()

testing_llm = create_llm(alias="testing", settings=settings, temperature=0)
reasoning_llm = create_llm(provider="anthropic", preset="reasoning", settings=settings)
explicit_llm = create_llm("openai:gpt-5.4-mini", settings=settings)
```

Use `create_llm_bundle(...)` when you want the created model and resolved
metadata together:

```python
from ooai_llm import create_llm_bundle

bundle = create_llm_bundle(
    alias="testing",
    reasoning="fast",
    temperature=0,
)

print(bundle.model.as_langchain())
print(bundle.metadata.identity.litellm_model)
print(bundle.reasoning.constructor_kwargs if bundle.reasoning else None)
```

## Profiles And Runtime

Use `ChatModelProfile` when you want the same model configuration in code, CLI
tools, and checked-in JSON:

```python
from ooai_llm import ChatModelProfile

profile = ChatModelProfile(
    id="support-bot-profile",
    model="openai:gpt-5.4-mini",
    temperature=0,
    top_p=0.9,
    max_retries=2,
    reasoning="fast",
    cache={"namespace": "agents", "key": "support-bot-v1"},
    run_name="support-bot",
    tags=["prod"],
    cost_labels={"team": "support"},
)

print(profile.to_json())
llm = profile.create_llm()
bundle = profile.create_bundle()
```

The runtime wrapper is useful when you want one object that owns the profile,
the private LangChain runnable, and observed usage totals:

```python
from ooai_llm import LLM, UsageRecorder, configure_logging

configure_logging(preset="prod", mode="json")

runtime = LLM(
    id="support-bot-prod",
    profile=profile,
    recorder=UsageRecorder(),
)

# result = runtime.invoke("Summarize this ticket.")
print(runtime.id)
print(runtime.uuid)
print(runtime.usage_summary.model_dump())
```

`ChatModelProfile` exposes typed controls for common constructor knobs:
`temperature`, `max_tokens`, `top_p`, penalties, `seed`, `stop`,
`max_retries`, `parallel_tool_calls`, `timeout`, and `streaming`. Use `model_kwargs` or
`constructor_kwargs` for provider-specific LangChain parameters that are not
modeled directly.

Parallel tool calls are provider/model dependent. `ooai-llm` does not force a
global default; when `parallel_tool_calls` is omitted, the provider wrapper keeps
its native default. Set `parallel_tool_calls=False` to disable it for models
that support the request parameter, or `True` to pass the preference explicitly:

```python
from ooai_llm import ChatModelProfile

profile = ChatModelProfile(
    model="openai:gpt-5.4-mini",
    parallel_tool_calls=False,
)
```

Profile JSON can be validated or resolved from the CLI:

```bash
ooai-llm profiles validate --input profile.json
ooai-llm profiles render --input profile.json
ooai-llm profiles resolve --input profile.json --format json
```

## Environment

The package accepts both native provider variables and `OOAI_` aliases:

```bash
export OPENAI_API_KEY="..."
export ANTHROPIC_API_KEY="..."
export DEEPSEEK_API_KEY="..."
export MISTRAL_API_KEY="..."

export OOAI_OPENAI_API_KEY="..."
export OOAI_ANTHROPIC_API_KEY="..."
export OOAI_DEEPSEEK_API_KEY="..."
export OOAI_MISTRAL_API_KEY="..."
```

Google/Gemini and xAI variables are supported too, but are optional:

```bash
export GOOGLE_API_KEY="..."
export GEMINI_API_KEY="..."
export XAI_API_KEY="..."
```

## Model Strings

```python
from ooai_llm import ModelString

model = ModelString.parse("gpt-5.4-mini")
print(model.provider)       # Provider.OPENAI
print(model.model_name)     # gpt-5.4-mini
print(model.canonical())    # openai:gpt-5.4-mini
print(model.as_litellm())   # openai/gpt-5.4-mini
```
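To make the three accepted spellings concrete, here is a minimal standard-library sketch of the idea: split on the first `:` (LangChain style) or `/` (LiteLLM style), and otherwise infer the provider from a name prefix. `split_model_string` and the `BARE_NAME_PREFIXES` table are hypothetical and deliberately tiny; the real `ModelString` parser may use different rules.

```python
# Assumed, illustrative prefix table for bare names; not the library's own data.
BARE_NAME_PREFIXES = {
    "gpt-": "openai",
    "claude-": "anthropic",
    "mistral-": "mistral",
}


def split_model_string(raw: str) -> tuple[str, str]:
    """Hypothetical parser returning (provider, model_name)."""
    text = raw.strip()
    # Position of the first ":" or "/" that has a provider in front of it.
    cut = min((i for i in (text.find(":"), text.find("/")) if i > 0), default=-1)
    if 0 < cut < len(text) - 1:
        return text[:cut].lower(), text[cut + 1 :]
    # Bare name: fall back to prefix-based provider inference.
    for prefix, provider in BARE_NAME_PREFIXES.items():
        if text.startswith(prefix):
            return provider, text
    raise ValueError(f"cannot infer a provider for {raw!r}")


parsed = [
    split_model_string(name)
    for name in ("gpt-5.4-mini", "openai:gpt-5.4-mini", "openai/gpt-5.4-mini")
]
provider, model_name = parsed[0]
canonical = f"{provider}:{model_name}"      # LangChain-style form
litellm_name = f"{provider}/{model_name}"   # LiteLLM-style form
print(parsed, canonical, litellm_name)
```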

## Settings And Defaults

```python
from ooai_llm import AppSettings

settings = AppSettings()

print(settings.resolve_model(alias="cheap"))
print(settings.resolve_model(provider="anthropic", preset="reasoning"))
print(settings.default_llm_cache_path)
```

Default aliases and provider presets include:

- `default`
- `latest`
- `cheap`
- `testing`
- `fast`
- `balanced`
- `reasoning`
- `coding`
- `vision`

## Live Model Discovery

```python
from ooai_llm import AppSettings, ListModelsConfig, list_available_models

settings = AppSettings()
result = list_available_models(
    "openai",
    settings=settings,
    config=ListModelsConfig(limit=5),
)

for model in result.models:
    print(model.model_string, model.display_name)
```

Provider SDKs are preferred when installed. Supported REST fallbacks are used
when SDK listing is unavailable or when you pass `ListModelsConfig(prefer_sdk=False)`.

Use the model catalog helper or CLI when you want a cross-provider table with
date, cost, context, and capability filters:

```python
from ooai_llm import list_model_catalog

catalog = list_model_catalog(
    providers=["openai", "anthropic", "mistral"],
    source="litellm",
    capabilities=["tool_calling", "structured_output"],
    released_after="2026-01",
    min_context_tokens=128_000,
    min_output_tokens=8_000,
    max_output_cost_per_1m=200,
    sort_by="output_tokens",
)

for model in catalog.models:
    print(
        model.model_string,
        model.release_date,
        model.context_window,
        model.max_output_tokens,
        model.capability_labels,
    )
```

`list_model_catalog(...)` is intentionally scoped to ooai's first-class
providers because those rows can feed factory defaults, suites, and
`ChatModelProfile` workflows. If you want the broad raw LiteLLM registry across
all LiteLLM provider labels, use `list_litellm_registry(...)` or
`ooai-llm models list --all-litellm` instead:

```python
from ooai_llm import list_litellm_registry

registry = list_litellm_registry(
    providers=["openrouter", "fireworks_ai"],
    include_non_chat=True,
    sort_by="provider",
)

print(len(registry.models))
print(registry.models[0].provider, registry.models[0].model_string)
```

```bash
ooai-llm models list \
  --source litellm \
  --providers openai,anthropic,mistral \
  --tool-calling-only \
  --structured-output-only \
  --released-after 2026-01 \
  --min-input-tokens 128000 \
  --min-output-tokens 8000 \
  --max-output-cost-per-1m 200 \
  --sort output_tokens

ooai-llm models list --all-litellm --limit 0 --format json
ooai-llm models list --all-litellm --providers openrouter,fireworks_ai --limit 0
```

Install `ooai-llm[cli]` when you want the prettier terminal tables:

```bash
pip install "ooai-llm[cli]"
```

When Rich is installed, `--format table` is the default and output is optimized
for scanning in a terminal: summary panel first, compact price and
input/output-token columns, and capability badges for chat, reasoning, coding,
vision, tool calling, tool choice, parallel tool calls, structured output, and
cheap models. Use `--format json` or `--format csv` for automation, or
`--no-rich` for a plain fixed-width table:

```bash
ooai-llm models list --source litellm --providers openai,mistral --limit 5
ooai-llm models list --source litellm --providers openai,mistral --limit 5 --format json
ooai-llm models list --source litellm --structured-output-only --sort output_tokens --format csv
ooai-llm models list --source litellm --providers openai,mistral --limit 5 --no-rich
ooai-llm models list --all-litellm --format table
```

Use catalog comparisons when you want planning numbers such as cheapest
matching models, coding-oriented shortlists, calls per dollar, or "how many
calls of this model equal one call of that model." The estimates use the
selected catalog source and explicit input/output token assumptions:

```python
from ooai_llm import compare_model_catalog, get_coding_model_comparison

comparison = compare_model_catalog(
    providers=["openai", "anthropic", "google", "deepseek", "mistral"],
    source="litellm",
    input_tokens=10_000,
    output_tokens=2_000,
    per_provider=True,
)

for estimate in comparison.estimates:
    print(
        estimate.model,
        estimate.call_cost_usd,
        estimate.calls_per_usd,
        estimate.max_input_tokens,
        estimate.max_output_tokens,
    )

coding = get_coding_model_comparison(
    providers=["openai", "anthropic", "deepseek", "mistral"],
    source="litellm",
    input_tokens=10_000,
    output_tokens=2_000,
)
```

```bash
ooai-llm models cheapest \
  --source litellm \
  --providers mistral \
  --input-tokens 10000 \
  --output-tokens 2000 \
  --budget-usd 10

ooai-llm models coding \
  --source litellm \
  --providers openai,anthropic,deepseek,mistral \
  --tool-calling-only \
  --structured-output-only \
  --per-provider

ooai-llm models compare \
  --source litellm \
  --providers openai,anthropic,google,deepseek,mistral \
  --input-tokens 10000 \
  --output-tokens 2000 \
  --per-provider

ooai-llm models compare \
  --source litellm \
  --providers openai,anthropic,deepseek,mistral \
  --coding-only \
  --tool-calling-only \
  --baseline openai:gpt-5-mini

ooai-llm models compare \
  --source litellm \
  --providers openai,mistral \
  --structured-output-only \
  --min-output-tokens 8000 \
  --sort output_tokens \
  --input-tokens 10000 \
  --output-tokens 2000 \
  --baseline openai:gpt-5-mini \
  --limit 10 \
  --format csv
```

Catalog prices are planning estimates. Use provider response usage metadata for
post-call accounting and billing reconciliation.
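The planning numbers reduce to simple arithmetic on per-million-token prices. The sketch below works through it with made-up prices; `call_cost_usd` is a hypothetical helper and the price values are placeholders, not catalog data.

```python
import math


def call_cost_usd(
    input_tokens: int,
    output_tokens: int,
    input_cost_per_1m: float,
    output_cost_per_1m: float,
) -> float:
    """Estimated cost of one call from per-million-token prices."""
    return (
        input_tokens * input_cost_per_1m + output_tokens * output_cost_per_1m
    ) / 1_000_000


# Hypothetical prices in USD per 1M tokens, for a 10,000-in / 2,000-out call.
small_cost = call_cost_usd(10_000, 2_000, input_cost_per_1m=0.25, output_cost_per_1m=2.00)
large_cost = call_cost_usd(10_000, 2_000, input_cost_per_1m=3.00, output_cost_per_1m=15.00)

calls_per_usd = 1 / small_cost
calls_in_budget = math.floor(10 / small_cost)  # whole calls that fit in 10 USD
baseline_ratio = large_cost / small_cost       # small-model calls per large-model call

print(f"{small_cost:.4f} {large_cost:.4f}")
print(f"{calls_per_usd:.1f} {calls_in_budget} {baseline_ratio:.2f}")
```

With these placeholder prices the small model costs (10,000 × 0.25 + 2,000 × 2.00) / 1,000,000 = 0.0065 USD per call, so roughly 154 calls fit in one dollar and about 9.2 small-model calls cost the same as one large-model call.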

For guided command/package examples:

```bash
ooai-llm recipes --topic cheapest
ooai-llm recipes --topic coding
ooai-llm recipes --topic runtime --format markdown
```

For interactive exploration:

```bash
pdm run ooai-llm tui \
  --providers openai,anthropic,mistral \
  --source litellm \
  --input-tokens 10000 \
  --output-tokens 2000 \
  --budget-usd 10 \
  --theme paper \
  --views cheapest,coding,catalog

pdm run ooai-llm tui --catalog-all
```

From a repo checkout, use `pdm run ooai-llm tui` so the command resolves to the
current working tree instead of a stale global install.

The TUI includes searchable tables, provider filters, row detail panes, visible
view-navigation buttons, `Tab`/`Shift+Tab` and `n`/`p` view navigation, a
`Ctrl+P` command palette, and a refresh cooldown so accidental key repeat does
not hammer provider catalogs. The CLI and TUI internals are package modules
(`ooai_llm.cli.*`, `ooai_llm.tui.*`) so new commands, views, and exporters can
be added without expanding a single monolithic file. See the CLI and TUI docs
for the full terminal workflows.

TUI themes are explicit: `paper` is the default, `slate` is a dark neutral
theme, `mono` minimizes color, and `forest` preserves the earlier earthy look.
Use `--views cheapest,catalog` or repeat `--view catalog` to load only a subset
of table-backed model surfaces instead of fetching every comparison, catalog,
suite, and benchmark view. Use `--catalog-all` when the catalog looks too small
and you want the raw LiteLLM registry instead of the supported-provider catalog;
it is shorthand for raw LiteLLM registry scope, catalog-only loading,
non-chat rows included, and no row limit.

## Optional Benchmark Exploration

The core package does not depend on external benchmark services. Exploratory
benchmark helpers live under `ooai_llm.benchmarks` and are intentionally kept
out of the top-level public API.

Install the optional benchmark/CLI rendering extra when you want these terminal
views:

```bash
pdm add "ooai-llm[benchmarks]"
```

LiveCodeBench Pro exploration currently exposes what the public website
leaderboard backend returns: overall model ratings, easy/medium/hard pass
rates, per-problem verdicts for a selected model, and individual submission
details. The backend is public but undocumented, so treat it as best-effort
research data rather than a stable API contract.

```python
from ooai_llm.benchmarks.livecodebench_pro import LiveCodeBenchProClient

client = LiveCodeBenchProClient()

for model in client.list_models(status="active", limit=5):
    print(model.rating, model.label, model.provider)

hard = client.get_difficulty("hard", providers=["openai"], limit=5)
for row in hard.llms:
    print(row.label, row.passrate_percent)
```

CLI examples:

```bash
ooai-llm benchmarks lcb-pro summary
ooai-llm benchmarks lcb-pro models --status active --limit 10
ooai-llm benchmarks lcb-pro difficulty --difficulty hard --provider openai
ooai-llm benchmarks lcb-pro submissions \
  --model-name gpt-5.2-2025-12-11 \
  --model-provider openai \
  --difficulty hard \
  --format json
```

## Model Suites

Use model suites when you want a repeatable shortlist for comparisons,
LangGraph nodes, or per-request experiments:

```python
from ooai_llm import get_model_suite

suite = get_model_suite(
    "comparison",
    providers=["openai", "anthropic", "deepseek", "mistral"],
    temperature=0,
    parallel_tool_calls=False,
)

print(suite.model_list())
print(suite.model_dict())

profiles = suite.filter(roles=["cheap", "balanced"]).to_profiles()
runtimes = suite.create_runtimes()
```

Suites are configurable. `get_model_suite(...)` builds from current
`AppSettings` provider presets, so refreshed aliases and provider defaults flow
into the suite. Use `model_suite_from_catalog(...)` when you want a dynamic
shortlist from the same date, capability, context, and cost filters used by
`ooai-llm models list`:

```python
from ooai_llm import model_suite_from_catalog

reasoning_suite = model_suite_from_catalog(
    providers=["openai", "anthropic", "mistral"],
    source="litellm",
    capabilities=["tool_calling", "structured_output"],
    released_after="2026-01",
    min_context_tokens=128_000,
    min_output_tokens=8_000,
    max_output_cost_per_1m=200,
    sort_by="output_tokens",
    limit=5,
)
```

CLI equivalents:

```bash
ooai-llm models suite --suite comparison --providers openai,anthropic,mistral
ooai-llm models suite --from-catalog --source litellm --tool-calling-only --structured-output-only --sort output_tokens --limit 5
ooai-llm models suite --suite comparison --providers openai,mistral --no-rich
```

Refresh convenience factory defaults once at startup when you want aliases and
provider presets to follow newer models:

```python
from ooai_llm import AppSettings, refresh_model_defaults

settings = AppSettings()
refresh = refresh_model_defaults(
    settings,
    providers=["openai", "anthropic", "mistral"],
    source="auto",
)

settings = refresh.settings
print(settings.resolve_model(alias="latest"))
print(settings.resolve_model(provider="anthropic", preset="reasoning"))
```

Use `update_model_defaults(...)` when you want the refreshed settings plus a
reusable override payload:

```python
from ooai_llm import AppSettings, create_llm, update_model_defaults

update = update_model_defaults(
    AppSettings(),
    providers=["openai", "anthropic", "mistral"],
    source="litellm",
    output_format="env",
)

settings = update.settings
llm = create_llm(alias="latest", settings=settings)
print(update.output_text)
```

Or use the CLI to print or write those overrides:

```bash
ooai-llm models update --source litellm --providers openai,anthropic,mistral --format json
ooai-llm models update --source auto --provider openai --format env --output .env.models
```

Use `source="litellm"` to derive defaults from LiteLLM's local model registry
without provider-listing credentials. Use `source="provider"` for live provider
catalogs only.

Automatic refresh is opt-in for factory calls. Use this when you want
`create_llm(...)` to refresh convenience defaults before resolving aliases:

```python
from ooai_llm import AppSettings, create_llm

settings = AppSettings(
    llm={
        "auto_refresh_models": {
            "enabled": True,
            "source": "litellm",
            "providers": ["openai", "anthropic", "mistral"],
        }
    }
)

llm = create_llm(alias="latest", settings=settings)
```

The same setting can be enabled from `.env`:

```bash
OOAI_LLM__AUTO_REFRESH_MODELS__ENABLED=true
OOAI_LLM__AUTO_REFRESH_MODELS__SOURCE=litellm
OOAI_LLM__AUTO_REFRESH_MODELS__PROVIDERS='["openai","anthropic","mistral"]'
```

Automatic refresh uses a one-hour process-local cache by default. Set
`OOAI_LLM__AUTO_REFRESH_MODELS__CACHE_SECONDS=0` to refresh on every factory
call, or pass `force_model_refresh=True` for one call.
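A process-local cache with a time-to-live, a zero-second "always refresh" setting, and a per-call force flag can be sketched as follows. `RefreshCache` and `fake_refresh` are hypothetical names for illustration; the package's actual cache may be structured differently.

```python
import time
from collections.abc import Callable


class RefreshCache:
    """Hypothetical process-local cache around an expensive refresh call."""

    def __init__(
        self,
        loader: Callable[[], dict[str, str]],
        cache_seconds: float = 3600.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._loader = loader
        self._cache_seconds = cache_seconds
        self._clock = clock
        self._value: dict[str, str] | None = None
        self._loaded_at = 0.0

    def get(self, force: bool = False) -> dict[str, str]:
        now = self._clock()
        # With cache_seconds=0 the age check is always true, so every call reloads.
        expired = self._value is None or now - self._loaded_at >= self._cache_seconds
        if force or expired:
            self._value = self._loader()
            self._loaded_at = now
        return self._value


load_calls: list[int] = []


def fake_refresh() -> dict[str, str]:
    """Stand-in for a real catalog refresh; records how often it runs."""
    load_calls.append(1)
    return {"latest": f"placeholder-model-{len(load_calls)}"}


fake_now = [0.0]  # injectable clock so the example needs no sleeping
cache = RefreshCache(fake_refresh, cache_seconds=3600, clock=lambda: fake_now[0])

first = cache.get()             # cold cache: loads
fake_now[0] = 10.0
second = cache.get()            # 10 seconds old: served from cache
forced = cache.get(force=True)  # explicit one-call refresh
fake_now[0] = 4000.0
later = cache.get()             # older than one hour: reloads
print(first, second, forced, later, len(load_calls))
```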

## Reasoning

```python
from ooai_llm import ReasoningConfig, build_reasoning_resolution, create_llm

resolution = build_reasoning_resolution(
    model="openai:gpt-5.4-mini",
    reasoning="deep",
)
print(resolution.constructor_kwargs)

llm = create_llm(
    "anthropic:claude-sonnet-4-20250514",
    reasoning=ReasoningConfig(effort="medium", summary="auto"),
)
```

## Metadata And Usage

```python
from ooai_llm import BudgetPolicy, UsageRecorder, create_llm_bundle, make_litellm_cost_callback

bundle = create_llm_bundle(
    "openai:gpt-5.4-mini",
    reasoning="fast",
)

print(bundle.metadata.identity.litellm_model)
print(bundle.metadata.capabilities.raw_profile)
print(bundle.metadata.pricing.input_cost_per_token)

recorder = UsageRecorder()
callback = make_litellm_cost_callback(
    recorder,
    budget=BudgetPolicy(warn_total_tokens=5000),
)
```

For LangChain/LangGraph flows, use `LangChainUsageCallbackHandler` or the
`LLM` runtime to record observed response usage metadata. Usage events include
`count_source`, run labels, tags, metadata, and cost labels so dashboards can
distinguish framework callbacks from provider usage metadata.

General pre-call token estimation for arbitrary strings and files is the next
planned API. The intended shape is:

```python
# Planned API shape only: `estimate_tokens` is not implemented in this release.
from ooai_llm import estimate_tokens

text_estimate = estimate_tokens("hello world", model="openai:gpt-5.4-mini")
file_estimate = estimate_tokens.from_file("prompt.md", model="openai:gpt-5.4-mini")
```

That will be separate from observed usage because local tokenizers are only an
estimate for many real provider payloads.

## Cache Bootstrap

```python
from ooai_llm import AppSettings, configure_global_llm_cache

settings = AppSettings()
cache = configure_global_llm_cache(settings)
print(cache)
```

By default the SQLite cache is placed under:

```text
{app_root}/.ooai/cache/llm/langchain_llm_cache.sqlite3
```

Override it with `OOAI_LLM__CACHE__PATH` or `AppSettings(llm={"cache": {"path": ...}})`.
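The variable name and the nested keyword form line up because a double underscore separates nesting levels after the `OOAI_` prefix. The sketch below reproduces that mapping with plain dictionaries; `nest_env` is a hypothetical helper for illustration and does no type coercion, whereas the real settings loader presumably validates and converts values.

```python
from collections.abc import Mapping
from typing import Any


def nest_env(
    env: Mapping[str, str],
    prefix: str = "OOAI_",
    delimiter: str = "__",
) -> dict[str, Any]:
    """Hypothetical mapper from prefixed env names to a nested dict."""
    nested: dict[str, Any] = {}
    for name, value in env.items():
        if not name.startswith(prefix):
            continue  # unrelated variables are ignored
        parts = name[len(prefix) :].lower().split(delimiter)
        node = nested
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        node[parts[-1]] = value  # values stay strings in this sketch
    return nested


fake_env = {
    "OOAI_LLM__CACHE__PATH": "/tmp/llm_cache.sqlite3",
    "OOAI_LLM__AUTO_REFRESH_MODELS__ENABLED": "true",
    "UNRELATED_SETTING": "ignored",
}
overrides = nest_env(fake_env)
print(overrides)
```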

Use Redis or Upstash Redis for a shared application cache:

```python
from ooai_llm import AppSettings, configure_global_llm_cache

settings = AppSettings(
    llm={
        "cache": {
            "backend": "redis",
            "redis_url": "redis://localhost:6379/0",
            "ttl": 3600,
        }
    }
)
configure_global_llm_cache(settings)
```

The Upstash Redis backend is configured the same way, with a URL and token:

```python
from ooai_llm import AppSettings, configure_global_llm_cache

settings = AppSettings(
    llm={
        "cache": {
            "backend": "upstash_redis",
            "upstash_url": "...",
            "upstash_token": "...",
            "ttl": 3600,
        }
    }
)
configure_global_llm_cache(settings)
```

Profiles can also namespace cache entries without changing prompt
serialization:

```python
from ooai_llm import ChatModelProfile

profile = ChatModelProfile(
    model="openai:gpt-5.4-mini",
    cache={"namespace": "tenant-a", "key": "assistant-v1"},
)
llm = profile.create_llm()
```
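One way to picture profile-level cache namespacing is to leave the prompt untouched and scope the model-configuration half of the cache key instead. The sketch below uses a plain dictionary keyed by `(prompt, scoped_config)`; `scoped_llm_string` and the key layout are assumptions for illustration and may not match how `ooai-llm` builds its keys.

```python
def scoped_llm_string(llm_string: str, namespace: str, key: str) -> str:
    """Hypothetical helper: prefix the model-config string with a namespace."""
    return f"ns={namespace};key={key};{llm_string}"


# A dict stands in for the cache backend; keys are (prompt, config) pairs.
cache: dict[tuple[str, str], str] = {}

prompt = "Summarize this ticket."
base_config = "model=openai:gpt-5.4-mini,temperature=0"

tenant_a = scoped_llm_string(base_config, "tenant-a", "assistant-v1")
tenant_b = scoped_llm_string(base_config, "tenant-b", "assistant-v1")

cache[(prompt, tenant_a)] = "cached answer for tenant A"

hit = cache.get((prompt, tenant_a))   # same namespace: cache hit
miss = cache.get((prompt, tenant_b))  # other namespace: no entry
print(hit, miss)
```

Because only the configuration string changes, identical prompts from different namespaces never share an entry, while the prompt text itself is stored as-is.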

## Development

Install the development dependencies:

```bash
pdm install -G test -G docs -G dev
```

Run the default test suite with coverage:

```bash
pdm run pytest
```

Live provider tests are skipped by the default suite so local keys in `.env`
do not trigger network calls accidentally.

Run tiers directly without the global coverage gate:

```bash
pdm run pytest -m unit --no-cov
pdm run pytest -m integration --no-cov
pdm run pytest -m "e2e and not live" --no-cov
```

Run live provider tests for the providers you have configured. This example leaves out Gemini and xAI:

```bash
OOAI_LIVE_PROVIDERS=openai,anthropic,deepseek,mistral pdm run pytest -m live --no-cov
```

To make live e2e tests fail instead of being skipped when a selected provider is
missing a key or SDK package:

```bash
OOAI_REQUIRE_LIVE=true OOAI_LIVE_PROVIDERS=openai,anthropic,deepseek,mistral pdm run pytest -m live --no-cov
```

`AppSettings` loads a local `.env` file. Keep real keys in `.env`, not
`.env.example`.

Build docs and distributions:

```bash
pdm run sphinx-build -E -W --keep-going -b html docs docs/_build/html
pdm build
pdm run twine check dist/*
```

## Project Layout

```text
src/ooai_llm/
  cache.py       LangChain cache setup
  callbacks.py   usage and cost events
  catalog.py     live provider model listing
  factory.py     LangChain chat-model creation
  messages.py    message normalization
  metadata.py    LangChain + LiteLLM metadata
  profiles.py    serializable profiles and LLM runtime
  providers.py   provider normalization
  reasoning.py   provider reasoning kwargs
  settings.py    Pydantic settings
  types.py       ModelString

docs/             Sphinx + MyST docs
examples/         runnable examples
tests/            unit, integration, e2e, and live tests
```

## Publishing

The repository includes:

- `.github/workflows/ci.yml` for tests, coverage, docs build, and package build
- `.github/workflows/docs.yml` for standalone docs validation
- `.github/workflows/version-bump.yml` for manual Commitizen version bumps,
  changelog updates, release commits, and tags
- `.github/workflows/release.yml` for tagged PyPI releases with trusted publishing
- `.readthedocs.yaml` for Read the Docs builds

Before publishing, configure the PyPI trusted publisher for `release.yml` and
environment `pypi`, import the repo in Read the Docs, and update
`docs/changelog.md`. The normal release path is:

```bash
gh workflow run version-bump.yml -f increment=patch -f dry-run=false
```

The version-bump workflow runs `cz bump`, commits the version/changelog change,
pushes the `v*` tag, and that tag triggers the PyPI release workflow.

## License

MIT
