Metadata-Version: 2.4
Name: anyllm
Version: 0.2.3
Summary: A thin, unified LLM abstraction layer. Call any LLM with a single API.
Project-URL: Homepage, https://github.com/vietanhdev/anyllm
Project-URL: Documentation, https://github.com/vietanhdev/anyllm#readme
Project-URL: Repository, https://github.com/vietanhdev/anyllm
Project-URL: Issues, https://github.com/vietanhdev/anyllm/issues
Author-email: Viet-Anh Nguyen <vietanh.dev@gmail.com>
License-Expression: MIT
License-File: LICENSE
Keywords: ai,anthropic,chatbot,llama-cpp,llm,ollama,openai
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.8
Requires-Dist: httpx>=0.24.0
Provides-Extra: all
Requires-Dist: anthropic>=0.20.0; extra == 'all'
Requires-Dist: openai>=1.0.0; extra == 'all'
Provides-Extra: anthropic
Requires-Dist: anthropic>=0.20.0; extra == 'anthropic'
Provides-Extra: benchmark
Requires-Dist: numpy; extra == 'benchmark'
Provides-Extra: dev
Requires-Dist: pytest-asyncio>=0.21; extra == 'dev'
Requires-Dist: pytest>=7.0; extra == 'dev'
Requires-Dist: respx>=0.20; extra == 'dev'
Provides-Extra: openai
Requires-Dist: openai>=1.0.0; extra == 'openai'
Description-Content-Type: text/markdown

<h1 align="center">anyllm</h1>
<p align="center"><em>Local-first LLM abstraction — one API for Ollama, llama.cpp, OpenAI, Anthropic, and HuggingFace.</em></p>

<p align="center">
<img src="https://img.shields.io/pypi/v/anyllm.svg" alt="PyPI">
<img src="https://img.shields.io/pypi/pyversions/anyllm.svg" alt="Python">
<img src="https://img.shields.io/pypi/l/anyllm.svg" alt="License">
</p>

**anyllm** is a lightweight abstraction layer over the most popular LLM providers. Unlike heavier alternatives, it is local-first: if Ollama is running on your machine, `anyllm.chat("hello")` just works — no API keys, no cloud. It also supports llama.cpp, OpenAI, Anthropic, and HuggingFace Transformers behind the same tiny API, with first-class support for tool/function calling, streaming, structured JSON outputs, multi-modal inputs, embeddings, and conversation memory.

Built by [Viet-Anh Nguyen](https://github.com/vietanhdev) at [NRL.ai](https://www.nrl.ai).

## Why anyllm?

- **One-liner API** — `anyllm.chat("Hello")` auto-detects your best local provider
- **Plugin architecture** — Add custom providers via `@register_provider`
- **Local-first** — Defaults to Ollama if available, no API key required
- **Minimal core deps** — Only `httpx` and `pydantic`; every provider is optional
- **Production-ready** — Streaming, async, tool-calling, retries, structured outputs

## Installation

```bash
pip install anyllm
```

For optional providers:

```bash
pip install anyllm[openai]          # OpenAI GPT-4, GPT-3.5
pip install anyllm[anthropic]       # Claude 3.5 Sonnet / Opus / Haiku
pip install anyllm[llamacpp]        # llama.cpp local quantized models
pip install anyllm[transformers]    # HuggingFace Transformers (local)
pip install anyllm[all]             # everything
```

Ollama needs no Python package — just have it running at `http://localhost:11434`.

**Python 3.8+ supported** (tested on 3.8, 3.9, 3.10, 3.11, 3.12, 3.13)

## Quick Start

```python
import anyllm

# 1. Simple chat (auto-selects Ollama if running, else first configured provider)
reply = anyllm.chat("Explain RAG in one sentence.")
print(reply)

# 2. Specify a provider + model explicitly
reply = anyllm.chat(
    "What is the capital of France?",
    provider="ollama",
    model="llama3.1:8b",
)

# 3. Streaming (yields tokens as they are generated)
for chunk in anyllm.stream("Write a haiku about Python"):
    print(chunk, end="", flush=True)

# 4. Structured output (JSON mode — validates against a Pydantic model)
from pydantic import BaseModel
class Recipe(BaseModel):
    name: str
    ingredients: list[str]
    steps: list[str]

recipe = anyllm.chat("Give me a pasta recipe", response_model=Recipe)
print(recipe.name, recipe.ingredients)
```

## Models & Methods

### Providers (local-first priority)

| Priority | Provider | How it works | Install |
|---|---|---|---|
| 1 | **Ollama** | HTTP client to `http://localhost:11434` (default if reachable) | built-in |
| 2 | **llama.cpp** | Loads GGUF models via `llama-cpp-python` | `anyllm[llamacpp]` |
| 3 | **OpenAI** | REST API (`gpt-4o`, `gpt-4o-mini`, `gpt-3.5-turbo`) | `anyllm[openai]` |
| 4 | **Anthropic** | REST API (`claude-3-5-sonnet`, `claude-3-5-haiku`, `claude-3-opus`) | `anyllm[anthropic]` |
| 5 | **HuggingFace Transformers** | Loads any HF causal-LM model locally | `anyllm[transformers]` |

Provider priority can be overridden via `anyllm.set_priority([...])` or per-call with `provider="..."`.

### Features

- **Tool / function calling** — Pass Python functions; parameter schemas are auto-extracted from type hints and docstrings. Dispatches to Ollama tools, OpenAI tools, or Anthropic tool use automatically.
- **Streaming** — Unified token streaming for every provider (yields strings).
- **Async** — `anyllm.achat(...)`, `anyllm.astream(...)`.
- **Structured outputs** — `response_model=MyPydanticModel` uses native JSON mode on OpenAI/Anthropic/Ollama, falls back to regex extraction + retries elsewhere.
- **Multi-modal** — Pass images via `anyllm.chat([..., {"image": "cat.jpg"}], model="gpt-4o")`.
- **Embeddings** — `anyllm.embed("text", model="nomic-embed-text")` with Ollama / OpenAI / sentence-transformers.
- **Conversation memory** — `Conversation()` with sliding-window history and optional disk persistence.
- **Retries + timeouts** — Configurable exponential backoff on transient errors.

## API Reference

| Function | Purpose |
|---|---|
| `anyllm.chat(messages, **opts)` | Chat completion -> `str` or `Pydantic` model |
| `anyllm.stream(messages, **opts)` | Generator yielding token chunks |
| `anyllm.achat / astream` | Async variants |
| `anyllm.embed(text, model=...)` | Returns `list[float]` embedding |
| `anyllm.tools(fns, prompt)` | Tool-calling loop with auto-dispatch |
| `anyllm.Conversation(system=...)` | Multi-turn memory |
| `anyllm.list_models(provider=...)` | Enumerate available models |
| `anyllm.register_provider(name, cls)` | Add a custom provider |

## CLI Usage

```bash
anyllm chat "Summarize this file" --file notes.txt
anyllm chat "Hi" --provider ollama --model llama3.1:8b
anyllm stream "Write a poem"
anyllm embed "hello world" --model nomic-embed-text
anyllm list-models --provider ollama
```

## Examples

### Tool calling with auto-extracted schemas

```python
import anyllm

def get_weather(city: str, units: str = "celsius") -> dict:
    """Get the current weather for a city."""
    # ... call a weather API ...
    return {"city": city, "temp": 22, "units": units}

# anyllm inspects the signature + docstring, builds the JSON schema,
# runs the LLM, dispatches the tool call, and returns the final reply.
reply = anyllm.tools([get_weather], "What's the weather in Hanoi?")
print(reply)
```

### Multi-turn conversation with memory

```python
from anyllm import Conversation

conv = Conversation(system="You are a helpful Python tutor.", model="llama3.1:8b")
conv.send("What is a decorator?")
conv.send("Show me an example")          # remembers previous context
conv.save("chat.json")                   # persist to disk
```

### Vision input with a multi-modal model

```python
import anyllm

reply = anyllm.chat(
    [{"text": "What's in this image?"}, {"image": "cat.jpg"}],
    provider="openai",
    model="gpt-4o",
)
```

## License

MIT (c) Viet-Anh Nguyen
