Metadata-Version: 2.3
Name: lmdk
Version: 1.8.0
Summary: Language Model Development Kit.
Author: Ignacio Llorca
Author-email: Ignacio Llorca <nllorca@proton.me>
License: MIT
Requires-Dist: jinja2>=3.1.6
Requires-Dist: pydantic>=2.12.5
Requires-Dist: requests>=2.32.5
Requires-Dist: opentelemetry-api>=1.38.0 ; extra == 'telemetry'
Requires-Dist: opentelemetry-sdk>=1.38.0 ; extra == 'telemetry'
Requires-Python: >=3.12
Project-URL: Homepage, https://github.com/nachollorca/lmdk
Provides-Extra: telemetry
Description-Content-Type: text/markdown

# Language Model Development Kit

What it offers:
- **Simplest interface to call different Language Model APIs**
- Minimal dependencies: plain HTTP requests to each provider, no provider SDKs
- Streaming
- Comfy structured outputs via Pydantic models, **only if the provider / model supports it natively**
- Parallel completions
- Unified HTTP error handling
- Easy location config (for providers with multiple datacenters like AWS Bedrock, GCP Vertex and Azure)
- Model fallbacks
- Bring Your Own Key (for each provider)
- Optional Telemetry following OpenTelemetry GenAI Semantic Conventions

What it does **NOT** offer:
- Tools / function calling / MCP
- Agents
- Multimodality (only text-in, text-out)
- Shady under-the-hood prompt modification (e.g. to force structured output)
- API gateways

If you are looking for a more constrained but out-of-the-box agent interface, I'd recommend [pydantic-ai](https://ai.pydantic.dev) or [haystack-ai](https://docs.haystack.deepset.ai/docs/generators).
If you are looking to keep granular control but add tools or multimodality, I'd recommend [litellm](https://docs.litellm.ai/docs/) or leveraging the OpenAI-compatible endpoints that providers normally set up.
If you want a single token for all providers and are willing to give away telemetry data, check out gateways like [openrouter](https://openrouter.ai).

## Installation
```bash
uv add lmdk
```

Optional OpenTelemetry support:

```bash
uv add 'lmdk[telemetry]'
```

## Usage
```python
from lmdk import complete

model = "mistral:mistral-small-2603"
# supports locations as in "vertex:gemini-2.5-flash@europe-west4"
```

<details>
<summary>Single prompt</summary>

```python
response = complete(model=model, prompt="Tell me a joke")
```
</details>

<details>
<summary>Multi-turn conversation</summary>

```python
from lmdk import AssistantMessage, UserMessage  # assuming the message classes are exported at the package root

messages = [
    UserMessage("My name is Alice."),
    AssistantMessage("Nice to meet you, Alice!"),
    UserMessage("What is my name?"),
]
response = complete(model=model, prompt=messages)
```
</details>

<details>
<summary>System prompt and generation kwargs</summary>

```python
response = complete(
    model=model,
    prompt="Hi!",
    system_instruction="Talk like a pirate",
    generation_kwargs={"temperature": 0.9, "max_tokens": 10}
)
```
</details>

<details>
<summary>Streaming</summary>

```python
token_iter = complete(model=model, prompt="Count from 1 to 5.", stream=True)
```
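
A minimal way to consume the stream, assuming `token_iter` yields plain text chunks:

```python
for token in token_iter:
    print(token, end="", flush=True)
```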
</details>

<details>
<summary>Model fallbacks</summary>

```python
response = complete(model=["mistral:nonexistent-model", model], prompt="Hi")
# the first request raises NotFoundError because the model does not exist; the second succeeds
```
</details>

<details>
<summary>Structured output</summary>

```python
from pydantic import BaseModel

class Ingredient(BaseModel):
    name: str
    quantity: int
    unit: str = ""

class Recipe(BaseModel):
    ingredients: list[Ingredient]

response = complete(model=model, prompt="How do I make cheescake?", output_schema=Recipe)
# response.parsed will have a Recipe instance
```
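
Since `response.parsed` is a `Recipe` instance, it can be used like any other Pydantic model:

```python
for ingredient in response.parsed.ingredients:
    print(ingredient.quantity, ingredient.unit, ingredient.name)
```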
</details>

<details>
<summary>Parallel calls</summary>

```python
from lmdk import complete_batch

results = complete_batch(model=model, prompt_list=["Greet in English", "Saluda en espanyol."])
# results is a list of CompletionResult objects
```
</details>

<details>
<summary>Template Rendering</summary>

```python
from lmdk import render_template

# Render a template string with variables
result = render_template(
    template="Hello, {{ name }}!",
    name="World"
)
# Output: "Hello, World!"

# Render a template from a jinja file
result = render_template(
    path="path/to/template.jinja2",
    name="World"
)
```
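
Since the rendered result is just a string, it can be passed straight to `complete` as the prompt. A small sketch combining the two (the template text here is illustrative):

```python
prompt = render_template(
    template="Summarize in one sentence: {{ text }}",
    text="lmdk is a thin wrapper over provider HTTP APIs.",
)
response = complete(model=model, prompt=prompt)
```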
</details>

## Telemetry

Telemetry is off by default and adds no required dependencies to the default install.
To enable **OpenTelemetry**-based spans and metrics, install the optional extra and set `LMDK_TELEMETRY`:

```bash
uv add 'lmdk[telemetry]'
export LMDK_TELEMETRY=metadata  # spans/metrics without prompt text
# export LMDK_TELEMETRY=content  # also records prompt, system-instruction, and response text
```

We follow the experimental [**Gen AI semantic conventions**](https://opentelemetry.io/docs/specs/semconv/gen-ai/) v1.41.0. For now, only non-streaming responses are instrumented.

`lmdk` only emits telemetry through the OpenTelemetry SDK. Your application owns the exporter, processor, reader, and collector endpoint; in other words, you decide how and where to send the emitted traces.

Below are some minimal exporter setups. Call them once at process start before invoking `complete` / `complete_batch`.

<details>
<summary>Console (debugging)</summary>

Prints spans to stdout. Useful to verify instrumentation locally without any backend.

```python
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor, ConsoleSpanExporter


def configure_console_traces() -> None:
    provider = TracerProvider()
    provider.add_span_processor(BatchSpanProcessor(ConsoleSpanExporter()))
    trace.set_tracer_provider(provider)
```
</details>

<details>
<summary>Pydantic Logfire</summary>

Logfire installs itself as the global `TracerProvider`, so spans emitted by `lmdk` are forwarded automatically. Requires `uv add logfire` and a `LOGFIRE_TOKEN`.

```python
import os
import logfire


def configure_logfire_traces() -> None:
    logfire.configure(
        token=os.environ["LOGFIRE_TOKEN"],
        service_name="my-app",
        # lmdk already controls prompt/response redaction via LMDK_TELEMETRY;
        # don't let Logfire second-guess scrubbing of content.
        scrubbing=False,
        send_to_logfire=True,
    )
```
</details>

<details>
<summary>Grafana (OTLP / Tempo)</summary>

Ship spans over OTLP to Grafana Cloud (or a self-hosted Tempo + OTel Collector). Requires `uv add opentelemetry-exporter-otlp`.

```python
import os

from opentelemetry import trace
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor


def configure_grafana_traces() -> None:
    # For Grafana Cloud OTLP, set:
    #   OTEL_EXPORTER_OTLP_ENDPOINT=https://otlp-gateway-<region>.grafana.net/otlp
    #   OTEL_EXPORTER_OTLP_HEADERS=Authorization=Basic%20<base64(instanceID:token)>
    exporter = OTLPSpanExporter(
        endpoint=os.environ["OTEL_EXPORTER_OTLP_ENDPOINT"] + "/v1/traces",
    )
    provider = TracerProvider(resource=Resource.create({"service.name": "my-app"}))
    provider.add_span_processor(BatchSpanProcessor(exporter))
    trace.set_tracer_provider(provider)
```
</details>
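
Putting it together, a minimal end-to-end sketch (assuming `lmdk[telemetry]` is installed and `LMDK_TELEMETRY` is exported as shown above) that prints the completion span to stdout:

```python
from lmdk import complete
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor, ConsoleSpanExporter

# one-time setup at process start: send spans to stdout
provider = TracerProvider()
provider.add_span_processor(BatchSpanProcessor(ConsoleSpanExporter()))
trace.set_tracer_provider(provider)

response = complete(model="mistral:mistral-small-2603", prompt="Hi")
```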


## Development

### Structure
```text
src/lmdk/
├── core.py         # Entry points: complete, complete_batch
├── datatypes.py    # Common message and response schemas
├── provider.py     # Base Provider class and registry
├── providers/      # Concrete implementations (Mistral, Vertex, etc.)
├── errors.py       # Unified HTTP and API error handling
└── utils.py        # Shared helper functions
```

### Tooling
We use `just` for development tasks:
- `just sync`: Updates lockfile and syncs environment.
- `just format`: Lints and formats with `ruff`.
- `just check-types`: Static analysis with `ty`.
- `just check-complexity`: Cyclomatic complexity checks with `complexipy`.
- `just test`: Runs pytest with 90% coverage threshold.

See [`justfile`](justfile) for a complete list of dev commands.

### Contribute
1. **Hooks**: Install pre-commit hooks via `just install-hooks`. PRs will fail CI if linting/formatting is not applied.
2. **Issues**: Open an issue first using the default template.
3. **PRs**: Link your PR to the relevant issue using the PR template.

You can use `just validate <model>` (which runs `example.py`) to verify which features work and which do not for a new provider / model.
**Not all of them have to pass to open a PR:** some providers do not even support native structured output. At a minimum, the plain non-structured, non-streamed completion must work; the rest can raise `NotImplementedError`.

## License
MIT

_Made with [`mold`](https://github.com/nachollorca/mold) template_
