Metadata-Version: 2.4
Name: justllm
Version: 0.7.0
Summary: Production LLM calls. Just the three lines. Reliability, native caching, and reversible context compression on by default.
Project-URL: Homepage, https://github.com/robbiebusinessacc/justllm
Project-URL: Repository, https://github.com/robbiebusinessacc/justllm
Project-URL: Issues, https://github.com/robbiebusinessacc/justllm/issues
Project-URL: Changelog, https://github.com/robbiebusinessacc/justllm/blob/main/CHANGELOG.md
Author: Robert Walmsley
License: MIT
License-File: LICENSE
Keywords: agents,ai,anthropic,context-compression,fallback,litellm-alternative,llm,llm-gateway,openai,orchestration,prompt-caching,routing,structured-output
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.10
Provides-Extra: all
Requires-Dist: fastembed>=0.3; extra == 'all'
Requires-Dist: headroom-ai>=0.25; extra == 'all'
Requires-Dist: instructor>=1.5; extra == 'all'
Requires-Dist: litellm>=1.70; extra == 'all'
Requires-Dist: opentelemetry-sdk>=1.20; extra == 'all'
Requires-Dist: tiktoken>=0.7; extra == 'all'
Provides-Extra: benchmarks
Requires-Dist: headroom-ai>=0.25; extra == 'benchmarks'
Requires-Dist: tiktoken>=0.7; extra == 'benchmarks'
Provides-Extra: compression
Requires-Dist: headroom-ai>=0.25; extra == 'compression'
Provides-Extra: dev
Requires-Dist: build; extra == 'dev'
Requires-Dist: mypy; extra == 'dev'
Requires-Dist: pytest; extra == 'dev'
Requires-Dist: twine; extra == 'dev'
Provides-Extra: embeddings
Requires-Dist: fastembed>=0.3; extra == 'embeddings'
Provides-Extra: langfuse
Requires-Dist: langfuse>=2; extra == 'langfuse'
Provides-Extra: litellm
Requires-Dist: litellm>=1.70; extra == 'litellm'
Provides-Extra: otel
Requires-Dist: opentelemetry-sdk>=1.20; extra == 'otel'
Provides-Extra: structured
Requires-Dist: instructor>=1.5; extra == 'structured'
Requires-Dist: litellm>=1.70; extra == 'structured'
Description-Content-Type: text/markdown

# justllm

[![PyPI](https://img.shields.io/pypi/v/justllm)](https://pypi.org/project/justllm/)
[![CI](https://github.com/robbiebusinessacc/justllm/actions/workflows/ci.yml/badge.svg)](https://github.com/robbiebusinessacc/justllm/actions/workflows/ci.yml)
[![Python](https://img.shields.io/pypi/pyversions/justllm)](https://pypi.org/project/justllm/)
[![License: MIT](https://img.shields.io/badge/license-MIT-green)](LICENSE)

**Production LLM calls. Just the three lines.**

![justllm demo](assets/demo.gif)

```python
from justllm import LLM

llm = LLM("anthropic/claude-opus-4-8")
llm("Summarize this contract.")
```

That call already does work you would normally wire up yourself. All of it is on by default:

- **Context compression.** [Headroom](https://github.com/chopratejas/headroom) shrinks tool output by 50–95% before it reaches the model.
- **Prompt-cache optimization.** Cache breakpoints go where each provider wants them (Anthropic, OpenAI, Google).
- **Reliability.** Calls retry with backoff, then fail over to the next provider.

```bash
pip install 'justllm[all]'
```
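For readers who want a mental model of the reliability bullet (retry with backoff, then fail over to the next provider), here is a minimal standalone sketch. It is not justllm's implementation: `call_with_fallback`, its parameters, and the idea of a provider as a plain callable are all assumptions made for illustration.

```python
import time
from typing import Callable, Optional, Sequence


def call_with_fallback(
    providers: Sequence[Callable[[str], str]],
    prompt: str,
    retries: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Try each provider in order, retrying each with exponential backoff.

    Hypothetical helper: a provider here is any callable that takes a prompt
    and returns text, raising an exception on failure.
    """
    last_error: Optional[Exception] = None
    for provider in providers:
        for attempt in range(retries):
            try:
                return provider(prompt)
            except Exception as exc:  # a real client would catch narrower errors
                last_error = exc
                # Back off before the next attempt on the same provider,
                # but do not wait before moving on to the next provider.
                if attempt < retries - 1:
                    sleep(base_delay * 2 ** attempt)
    raise RuntimeError("all providers failed") from last_error
```

The `sleep` parameter is injectable so the backoff schedule can be checked in tests without actually waiting.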

## A little more

Same three lines. Each of these is one call or one kwarg:

```python
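# Placeholders: Invoice is a Pydantic model; text, prompts, texts, output,
# cases, cheap, and big are values you define.
llm.extract(Invoice, text)                    # structured output (validated Pydantic)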
llm.stream("...")                             # token streaming
await llm.acall("...")                        # async
llm.map(prompts, concurrency=8)               # many prompts at once, in order
llm.embed(texts)                              # embeddings
chat = llm.chat(); chat.send("..."); chat.send("...")   # multi-turn, remembers history
llm.agent(system="...").run("...")            # tool-calling loop
llm.judge(output, criteria="...")             # LLM-as-judge score
llm.evaluate(cases)                           # run + grade a test set
LLM(router=Cascade(small=cheap, large=big))   # cheap first, escalate when needed
```
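The "many prompts at once, in order" behavior of `llm.map` can be pictured with the standard library alone. The sketch below is an assumption about the general technique, not justllm's code: `ordered_map` and `fake_model` are hypothetical names, and a thread pool is only one way to get bounded concurrency.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable


def ordered_map(
    fn: Callable[[str], str],
    prompts: Iterable[str],
    concurrency: int = 8,
) -> list[str]:
    """Run fn over prompts on a thread pool; results follow input order."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        # Executor.map yields results in submission order, not completion
        # order, so a slow first prompt never swaps places with a fast one.
        return list(pool.map(fn, prompts))


def fake_model(prompt: str) -> str:
    """Stand-in for a model call: shorter prompts take longer to answer."""
    time.sleep(0.01 * max(0, 3 - len(prompt)))
    return prompt.upper()
```

Calling `ordered_map(fake_model, ["a", "bb", "ccc"], concurrency=3)` returns the answers in the same order as the prompts, even though the last prompt finishes first.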

A few more things sit behind opt-in extras: OpenTelemetry traces that include the
per-call dollar cost (most setups leave that out), Langfuse-backed prompts,
semantic cascade escalation, and exact-match caching. The hard parts are already
wired; you just call them.
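
As an illustration of what exact-match caching means in general, here is a small standalone sketch: identical (model, prompt) pairs are served from memory, and anything else goes to the model. `ExactMatchCache` and its in-memory dict are assumptions for this example and say nothing about how justllm stores or keys its cache.

```python
import hashlib
import json
from typing import Callable


class ExactMatchCache:
    """Hypothetical wrapper that memoizes calls on the exact (model, prompt) pair."""

    def __init__(self, call: Callable[[str, str], str]) -> None:
        self._call = call
        self._store: dict[str, str] = {}
        self.hits = 0

    @staticmethod
    def _key(model: str, prompt: str) -> str:
        # Serialize deterministically, then hash, so keys have a fixed size.
        payload = json.dumps({"model": model, "prompt": prompt}, sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def __call__(self, model: str, prompt: str) -> str:
        key = self._key(model, prompt)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        result = self._call(model, prompt)
        self._store[key] = result
        return result
```

Because the key is built from the exact text, a prompt that differs by a single character is a miss; that is the trade-off against semantic caching.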

Runnable recipes: **[cookbook](examples/)**

## Why

The ecosystem splits two ways. You can have powerful but heavy (LiteLLM,
LangChain), or simple but thin (aisuite, any-llm). justllm sits in the middle:
every optimization is on, and the surface stays at three lines. Keeping it that
small was most of the work.

| | justllm | LiteLLM | aisuite |
|---|---|---|---|
| three-line call | yes | yes | yes |
| cross-provider fallback | on by default | config | no |
| context compression | on by default (Headroom) | manual trim | no |
| prompt-cache optimization | on by default | passthrough | no |
| structured output | yes (instructor) | passthrough | no |
| tool-calling agent | yes (minimal) | no | no |
| surface area | tiny | large | tiny |

It runs on LiteLLM underneath, so think of it as the opinionated layer on top
rather than a replacement.

---

*Status: alpha. The wiring is tested in CI (Python 3.10–3.13), and the call paths are
checked against live models.*

[Cookbook](examples/) · [Roadmap](ROADMAP.md) · [Changelog](CHANGELOG.md) · [Contributing](CONTRIBUTING.md) · [MIT](LICENSE)
