Metadata-Version: 2.4
Name: llmbridge-sdk
Version: 0.1.0
Summary: A lightweight Python SDK for using local and OpenAI-compatible LLMs.
Author-email: Rahul Kumar <iamrahul.rk4@gmail.com>
Maintainer-email: Rahul Kumar <iamrahul.rk4@gmail.com>
License-Expression: MIT
Project-URL: Homepage, https://github.com/iwasbugged/llmbridge
Project-URL: Repository, https://github.com/iwasbugged/llmbridge
Project-URL: Issues, https://github.com/iwasbugged/llmbridge/issues
Project-URL: Changelog, https://github.com/iwasbugged/llmbridge/blob/main/CHANGELOG.md
Keywords: llm,ai,ollama,openai-compatible,local-llm,python-sdk,structured-output,cli
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: httpx>=0.25
Requires-Dist: pydantic>=2
Requires-Dist: tomli>=2; python_version < "3.11"
Requires-Dist: typer>=0.9
Provides-Extra: dev
Requires-Dist: pytest>=8.0.0; extra == "dev"
Requires-Dist: ruff>=0.8.0; extra == "dev"
Requires-Dist: build>=1.2.0; extra == "dev"
Requires-Dist: twine>=5.0.0; extra == "dev"
Dynamic: license-file

# llmbridge

`llmbridge` is a lightweight Python SDK and CLI for using local and
OpenAI-compatible LLMs. It connects to runtimes you already run, such as Ollama,
LM Studio, vLLM, llama.cpp server, LocalAI, or another OpenAI-compatible API.

llmbridge does not ship model files. You install and run the model runtime
yourself, then use llmbridge as a small developer-friendly bridge.

## Features

- Local Ollama provider
- Generic OpenAI-compatible provider
- CLI commands for setup checks, model listing, chat, ask, pull, and config
- Streaming responses
- Local config at `~/.llmbridge/config.toml`
- Prompt templates
- Structured JSON output with Pydantic validation and retry
- Typed response models

## Installation

```bash
pip install llmbridge-sdk
```

The PyPI distribution is named `llmbridge-sdk`; the Python import name and the
CLI command are both `llmbridge`.

## Requirements

- Python 3.10+
- Ollama for the Ollama provider, or an already-running OpenAI-compatible server
- No bundled LLM model files

## Ollama Quickstart

Install Ollama from https://ollama.com, start it locally, then pull a model:

```bash
ollama pull llama3.1:latest
```

Check your setup:

```bash
llmbridge doctor
llmbridge serve-check
llmbridge models
```

Ask a question:

```bash
llmbridge ask "Explain FastAPI in simple words"
```

Set your default model:

```bash
llmbridge config set model llama3.1:latest
```

## OpenAI-Compatible Quickstart

Use an OpenAI-compatible server such as LM Studio, vLLM, llama.cpp server, or
LocalAI. The `base_url` should point to the API root, usually ending in `/v1`.

```bash
llmbridge ask "Explain FastAPI" \
  --provider openai_compatible \
  --model local-model \
  --base-url http://localhost:1234/v1
```
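
Why the `/v1` suffix matters: OpenAI-compatible servers conventionally expose routes such as `/chat/completions` and `/models` beneath the API root. The sketch below shows how a client might join the root and a route. The `endpoint` helper is hypothetical and is not part of llmbridge; how llmbridge builds its URLs internally is not documented here.

```python
def endpoint(base_url: str, path: str) -> str:
    """Join an API root and a route without doubling or dropping slashes."""
    return base_url.rstrip("/") + "/" + path.lstrip("/")


base_url = "http://localhost:1234/v1"
print(endpoint(base_url, "chat/completions"))
print(endpoint(base_url, "/models"))
```

If `base_url` omitted `/v1`, the same routes would resolve to paths that most OpenAI-compatible servers do not serve.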

List models:

```bash
llmbridge models \
  --provider openai_compatible \
  --base-url http://localhost:1234/v1
```

The OpenAI-compatible provider does not download or manage models. Start your
server with the model you want before calling llmbridge.

## CLI Usage

Use the configured default model:

```bash
llmbridge ask "Explain FastAPI"
```

Override the model:

```bash
llmbridge ask "Explain FastAPI" --model gemma4:e4b
```

Adjust temperature:

```bash
llmbridge ask "Explain FastAPI" --temperature 0.2
```

Run chat with an explicit model:

```bash
llmbridge chat llama3.1:latest "Explain PostgreSQL in simple words"
```

Run chat against an OpenAI-compatible server:

```bash
llmbridge chat local-model "Hello" \
  --provider openai_compatible \
  --base-url http://localhost:1234/v1
```

Pull an Ollama model:

```bash
llmbridge pull llama3.1:latest
```

## Streaming Usage

```bash
llmbridge chat llama3.1:latest "Explain Docker" --stream
llmbridge ask "Explain Docker" --stream
```

Streaming chunks are printed as they arrive. Non-streaming CLI output is trimmed
before printing.

Python streaming:

```python
from llmbridge import LLM

llm = LLM(model="llama3.1:latest")

for chunk in llm.stream("Explain Docker Compose"):
    print(chunk.text, end="")
```
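
If you also need the complete text after streaming finishes, one option is to collect the chunks while printing them. This is a minimal sketch that assumes only what the example above shows, namely that each chunk has a `text` attribute. `print_and_collect` is a hypothetical helper, and the stand-in chunks replace a real `llm.stream(...)` call so the snippet runs without a model server.

```python
from types import SimpleNamespace
from typing import Iterable


def print_and_collect(chunks: Iterable) -> str:
    """Print each chunk's text as it arrives and return the full text."""
    parts = []
    for chunk in chunks:
        # flush=True pushes partial output to the terminal immediately.
        print(chunk.text, end="", flush=True)
        parts.append(chunk.text)
    print()
    return "".join(parts)


# Stand-in chunks for demonstration; with llmbridge, pass llm.stream("...") instead.
fake_stream = [SimpleNamespace(text="Docker "), SimpleNamespace(text="Compose")]
full_text = print_and_collect(fake_stream)
```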

## Config Usage

llmbridge stores local CLI defaults in:

```text
~/.llmbridge/config.toml
```

Supported config keys:

- `provider`
- `model`
- `base_url`
- `api_key`
- `temperature`
- `timeout`

Commands:

```bash
llmbridge config show
llmbridge config set provider ollama
llmbridge config set model llama3.1:latest
llmbridge config set base_url http://localhost:11434
llmbridge config set api_key local-secret
llmbridge config set temperature 0.2
llmbridge config set timeout 120
llmbridge config reset
```

For OpenAI-compatible servers:

```bash
llmbridge config set provider openai_compatible
llmbridge config set base_url http://localhost:1234/v1
llmbridge config set model local-model
llmbridge config set api_key local-secret
```

`llmbridge config show` masks stored API keys.

For `llmbridge ask`, model resolution order is:

1. `--model`
2. `model` in `~/.llmbridge/config.toml`
3. `LLMBRIDGE_DEFAULT_MODEL`
4. `llama3.1:latest`
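
The resolution order above can be sketched as a chain of fallbacks. This is an illustration of the documented order, not llmbridge's actual implementation. `resolve_model` and `FALLBACK_MODEL` are hypothetical names, and the config is modelled as a plain mapping.

```python
import os
from typing import Mapping, Optional

FALLBACK_MODEL = "llama3.1:latest"


def resolve_model(
    cli_model: Optional[str],
    config: Mapping[str, str],
    env: Mapping[str, str] = os.environ,
) -> str:
    """Return the first model name that is set, in the documented order."""
    return (
        cli_model  # 1. --model
        or config.get("model")  # 2. config.toml
        or env.get("LLMBRIDGE_DEFAULT_MODEL")  # 3. environment variable
        or FALLBACK_MODEL  # 4. built-in default
    )


env = {"LLMBRIDGE_DEFAULT_MODEL": "from-env"}
print(resolve_model("from-cli", {"model": "from-config"}, env))  # from-cli
print(resolve_model(None, {"model": "from-config"}, env))  # from-config
print(resolve_model(None, {}, env))  # from-env
print(resolve_model(None, {}, {}))  # llama3.1:latest
```

Note that the config file takes precedence over the environment variable in this order.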

## Python Usage

Ollama:

```python
from llmbridge import LLM

llm = LLM(
    provider="ollama",
    model="llama3.1:latest",
)

response = llm.chat("Explain FastAPI in simple words")
print(response.text)
```

OpenAI-compatible:

```python
from llmbridge import LLM

llm = LLM(
    provider="openai_compatible",
    model="local-model",
    base_url="http://localhost:1234/v1",
)

response = llm.chat("Explain FastAPI in simple words")
print(response.text)
```

Message format:

```python
response = llm.chat(
    [
        {"role": "system", "content": "You are a helpful backend architect."},
        {"role": "user", "content": "Explain PostgreSQL indexes."},
    ]
)
```

## PromptTemplate Usage

Use `PromptTemplate` for small reusable prompts with named variables:

```python
from llmbridge import LLM, PromptTemplate

template = PromptTemplate("Explain {topic} for a {audience}.")
prompt = template.format(topic="FastAPI", audience="backend developer")

llm = LLM(model="llama3.1:latest")
response = llm.chat(prompt)
print(response.text)
```

If a required variable is missing, llmbridge raises `PromptTemplateError`.
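
To illustrate the missing-variable check, here is a minimal standard-library sketch of a template with the same behaviour. `SimpleTemplate` and `TemplateError` are hypothetical stand-ins for `PromptTemplate` and `PromptTemplateError`; the real classes may be implemented differently and may support more placeholder syntax.

```python
from string import Formatter


class TemplateError(ValueError):
    """Hypothetical stand-in for llmbridge's PromptTemplateError."""


class SimpleTemplate:
    """Hypothetical stand-in for PromptTemplate, built on str.format fields."""

    def __init__(self, text: str) -> None:
        self.text = text
        # Formatter.parse yields (literal, field_name, format_spec, conversion).
        self.variables = {name for _, name, _, _ in Formatter().parse(text) if name}

    def format(self, **values: str) -> str:
        # Report every missing variable up front instead of a bare KeyError.
        missing = sorted(self.variables - set(values))
        if missing:
            raise TemplateError(f"Missing template variables: {', '.join(missing)}")
        return self.text.format(**values)


template = SimpleTemplate("Explain {topic} for a {audience}.")
print(template.format(topic="FastAPI", audience="backend developer"))

try:
    template.format(topic="FastAPI")
except TemplateError as exc:
    print(exc)
```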

## Structured Output Usage

`LLM.structured()` asks the model for JSON, validates it with a Pydantic schema,
and returns a typed object:

```python
from pydantic import BaseModel

from llmbridge import LLM


class TaskResult(BaseModel):
    title: str
    priority: str


llm = LLM(model="llama3.1:latest")
result = llm.structured(
    "Create a task for fixing a login bug",
    schema=TaskResult,
)

print(result.title)
print(result.priority)
```

Structured output depends on the model following instructions. llmbridge asks for
JSON matching your schema, extracts JSON from the response, validates it with
Pydantic, and retries when the output is invalid. If the final response still
cannot be parsed or validated, llmbridge raises `StructuredOutputError`.
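
The extract, validate, and retry steps described above can be sketched with the standard library. This is an assumption-laden illustration, not llmbridge's implementation: `extract_json`, `structured_with_retry`, and `check_task` are hypothetical names, the model is replaced by canned replies, and the retry simply re-asks the same prompt (llmbridge's actual retry prompt and attempt count are not documented here).

```python
import json
from typing import Any, Callable, Optional


def extract_json(text: str) -> Any:
    """Return the first JSON object embedded in text, ignoring surrounding prose."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            value, _ = decoder.raw_decode(text, start)
            return value
        except json.JSONDecodeError:
            start = text.find("{", start + 1)
    raise ValueError("no JSON object found in model output")


def structured_with_retry(
    ask: Callable[[str], str],
    prompt: str,
    validate: Callable[[Any], Any],
    max_attempts: int = 3,
) -> Any:
    """Call ask() until its output parses and validates, or attempts run out."""
    last_error: Optional[Exception] = None
    for _ in range(max_attempts):
        try:
            return validate(extract_json(ask(prompt)))
        except ValueError as exc:  # JSON decoding errors subclass ValueError
            last_error = exc
    raise RuntimeError(f"no valid output after {max_attempts} attempts") from last_error


def check_task(data: Any) -> dict:
    """Hypothetical validator standing in for a Pydantic schema."""
    if not isinstance(data, dict) or not {"title", "priority"} <= set(data):
        raise ValueError("expected an object with 'title' and 'priority'")
    return data


# Canned replies stand in for a model: the first has no JSON, the second does.
replies = iter([
    "Sorry, let me try again.",
    'Sure: {"title": "Fix login bug", "priority": "high"} Done.',
])
result = structured_with_retry(lambda prompt: next(replies), "Create a task", check_task)
print(result)
```

With Pydantic v2 you could pass `TaskResult.model_validate` as `validate`; its `ValidationError` is a `ValueError` subclass, so the same `except` clause would catch it.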

SQL plan example:

```python
from pydantic import BaseModel

from llmbridge import LLM


class SQLPlan(BaseModel):
    sql: str
    explanation: str
    tables_used: list[str]


llm = LLM(model="llama3.1:latest")
plan = llm.structured(
    "Create a SQL plan to list the latest 10 paid invoices. Do not execute SQL.",
    schema=SQLPlan,
)

print(plan.sql)
```

This returns a structured SQL plan only. llmbridge does not execute SQL.

## Examples

Runnable examples live in the `examples/` folder:

```bash
python examples/basic_chat.py
python examples/streaming_chat.py
python examples/list_models.py
python examples/custom_options.py
python examples/ask_style_usage.py
python examples/prompt_template.py
python examples/structured_output.py
python examples/structured_sql_plan.py
```

## Troubleshooting

If Ollama is not running, you may see:

```text
Ollama is not running at http://localhost:11434. Start Ollama and run: ollama pull llama3.1
```

Start Ollama and pull the selected model:

```bash
ollama pull llama3.1:latest
```
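
To check reachability from Python without llmbridge, you can query Ollama's `GET /api/tags` endpoint, which lists locally installed models. The sketch below uses only the standard library; `list_ollama_models` is a hypothetical helper, and it makes no claim about how `llmbridge doctor` or `llmbridge serve-check` perform their checks.

```python
import json
import urllib.request
from typing import List, Optional


def list_ollama_models(
    base_url: str = "http://localhost:11434", timeout: float = 2.0
) -> Optional[List[str]]:
    """Return installed model names, or None if the server cannot be reached."""
    url = base_url.rstrip("/") + "/api/tags"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            payload = json.load(response)
    except (OSError, ValueError):
        # OSError covers refused connections, timeouts, and HTTP errors;
        # ValueError covers a response body that is not valid JSON.
        return None
    return [model["name"] for model in payload.get("models", [])]


if __name__ == "__main__":
    models = list_ollama_models()
    if models is None:
        print("Ollama is not reachable; start it and try again.")
    else:
        print("Installed models:", ", ".join(models) or "(none)")
```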

If the CLI says a model is missing:

```text
Model 'llama3.1:latest' is not installed.
Run:
  llmbridge pull llama3.1:latest
```

Pull it:

```bash
llmbridge pull llama3.1:latest
```

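If your Ollama server does not listen on the default `http://localhost:11434`,
store its URL in the config, replacing the address below with your own: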

```bash
llmbridge config set base_url http://localhost:11434
```

Or pass it for one command:

```bash
llmbridge ask "Explain FastAPI" --base-url http://localhost:11434
```

## Roadmap

- More provider integrations
- Better structured-output controls
- Tool calling
- Embeddings and RAG support
- Higher-level application workflows

## Local Development

```bash
git clone https://github.com/iwasbugged/llmbridge.git
cd llmbridge
python3 -m pip install -e ".[dev]"
```

Run tests:

```bash
python3 -m pytest
```

Run linting:

```bash
python3 -m ruff check .
python3 -m ruff format --check .
```

## Author

Rahul Kumar <iamrahul.rk4@gmail.com>

## License

MIT License. See [LICENSE](LICENSE).
