Metadata-Version: 2.4
Name: articuno
Version: 0.4.9
Summary: Convert Polars DataFrames to lists of Pydantic models with schema inference
Author-email: Odos Matthews <odosmatthews@gmail.com>
Project-URL: Homepage, https://github.com/eddiethedean/articuno
Requires-Python: >=3.8
Description-Content-Type: text/markdown
Requires-Dist: polars>=0.19.0
Requires-Dist: pydantic>=2.0
Requires-Dist: datamodel-code-generator>=0.24.0
Requires-Dist: typer
Provides-Extra: dev
Requires-Dist: pytest>=7.0; extra == "dev"
Requires-Dist: build; extra == "dev"
Requires-Dist: twine; extra == "dev"
Provides-Extra: docs
Requires-Dist: sphinx; extra == "docs"
Requires-Dist: sphinx-autodoc-typehints; extra == "docs"
Requires-Dist: furo; extra == "docs"
Provides-Extra: patito
Requires-Dist: patito; extra == "patito"

# ❄️ Articuno ❄️

Convert Polars DataFrames to Pydantic models — and optionally generate clean Python code from them.

> A blazing-fast tool for schema inference, data validation, and model generation powered by [Polars](https://pola.rs/) and [Pydantic](https://docs.pydantic.dev/).

---

## 📋 Table of Contents

* [🚀 Features](#-features)
* [📦 Installation](#-installation)
* [🛠 Usage](#-usage)
* [🧬 Example: Nested Structs](#-example-nested-structs)
* [🦜 Patito Integration (Optional)](#-patito-integration-optional)
* [⏰ When to Use Articuno](#-when-to-use-articuno)
* [⚙️ Supported Type Mappings](#️-supported-type-mappings)
* [🧩 Integration Ideas](#-integration-ideas)
* [🧪 Development & Testing](#-development--testing)
* [🧙‍♂️ FastAPI Integration (Decorator + CLI Bootstrap)](#️-fastapi-integration-decorator--cli-bootstrap)
* [🛠 CLI Options](#-cli-options)
* [📜 Patito vs Articuno](#-patito-vs-articuno)
* [License](#license)

---

## 🚀 Features

* 🔍 **Infer Pydantic models** directly from `polars.DataFrame` schemas
* 🧪 **Validate data** by converting DataFrame rows to Pydantic instances
* 🧱 **Supports nested Structs**, Lists, Nullable fields, and advanced types
* 🧬 **Generate Python model code** from dynamic models using [datamodel-code-generator](https://pypi.org/project/datamodel-code-generator/)
* 🦜 **Optional Patito integration** for declarative, constraint-rich models and advanced validation
* 🎨 **Generate Patito model code** alongside Pydantic for flexible schema workflows

---

## 📦 Installation

```bash
pip install articuno
```

Optional extras:

```bash
pip install datamodel-code-generator  # for code generation
pip install patito                    # for Patito support
```

---

## 🛠 Usage

### 1. Convert a DataFrame to Pydantic Models

```python
import polars as pl
from articuno import df_to_pydantic

df = pl.DataFrame({
    "name": ["Alice", "Bob"],
    "age": [30, 25],
    "is_active": [True, False],
})

models = df_to_pydantic(df)

print(models[0])
print(models[0].dict())
```

**Output:**

```
name='Alice' age=30 is_active=True
{'name': 'Alice', 'age': 30, 'is_active': True}
```

---

### 2. Infer a Model Only

```python
from articuno import infer_pydantic_model

model = infer_pydantic_model(df, model_name="UserModel")
print(model.model_json_schema(indent=2))
```

---

### 3. Generate Python Source Code from a Model

```python
from articuno import generate_class_code

code = generate_class_code(model, model_name="UserModel")
print(code)
```

**Output:**

```python
from pydantic import BaseModel

class UserModel(BaseModel):
    name: str
    age: int
    is_active: bool
```

Or write it to a file:

```python
generate_class_code(model, output_path="user_model.py")
```

---

## 🧬 Example: Nested Structs

```python
nested_df = pl.DataFrame({
    "user": pl.Series([
        {"name": "Alice", "age": 30},
        {"name": "Bob", "age": 25},
    ], dtype=pl.Struct([
        ("name", pl.Utf8),
        ("age", pl.Int64),
    ]))
})

models = df_to_pydantic(nested_df)
print(models[0])
print(models[0].user)
print(models[0].user.name)
```

**Output:**

```
AutoModel(user=AutoModel_0_Struct(name='Alice', age=30))
AutoModel_0_Struct(name='Alice', age=30)
Alice
```

---

## 🦜 Patito Integration (Optional)

Articuno can optionally generate and validate models using [Patito](https://pypi.org/project/patito/), a declarative schema validation library with advanced constraints.

### How it works:

* Use `df_to_patito(df)` to convert rows into Patito instances
* Use `infer_patito_model(df)` to generate a reusable Patito model class

### Example:

```python
from articuno import infer_patito_model

patito_model = infer_patito_model(df, model_name="UserPatitoModel")
print(patito_model.schema_json(indent=2))
```

Patito integration is optional and requires installing Patito:

```bash
pip install patito
```

---

## ⏰ When to Use Articuno

* ✅ You use **Polars** and want **type-safe modeling**
* ✅ You dynamically load or transform tabular data
* ✅ You want to **generate sharable Python classes**
* ✅ You want to **validate Polars DataFrames** using Pydantic rules
* ✅ You want **optional advanced validation** with Patito

---

## ⚙️ Supported Type Mappings

| Polars Type           | Pydantic Type         |
| --------------------- | --------------------- |
| `pl.Int*`, `pl.UInt*` | `int`                 |
| `pl.Float*`           | `float`               |
| `pl.Utf8`             | `str`                 |
| `pl.Boolean`          | `bool`                |
| `pl.Date`             | `datetime.date`       |
| `pl.Datetime`         | `datetime.datetime`   |
| `pl.Duration`         | `datetime.timedelta`  |
| `pl.List`             | `List[...]`           |
| `pl.Struct`           | Nested Pydantic model |
| `pl.Null`             | `Optional[...]`       |

---

## 🧩 Integration Ideas

* 🔐 Use for **FastAPI** or **Litestar** API schemas
* 🧼 Use in **ETL pipelines** to enforce schema contracts
* 📄 Use to **generate Pydantic models** from data exports
* 🔀 Use with `polars.read_json` / `read_parquet` to auto-model nested data
* 🦜 Use **Patito models** for advanced schema validation where needed

---

## 🧪 Development & Testing

```bash
git clone https://github.com/your-username/articuno
cd articuno
pip install -e ".[dev]"
pytest
```

---

## 🧙‍♂️ FastAPI Integration (Decorator + CLI Bootstrap)

Articuno makes it easy to generate response\_models for your FastAPI endpoints that return polars.DataFrames — no need to manually define Pydantic models.

### 🧩 Step 1: Add the Decorator

Use the `@infer_response_model` decorator on your FastAPI endpoint. Provide:

* a name for the generated Pydantic model,
* an example input dict to simulate a call to your endpoint,
* an optional path to your models.py file (defaults to `models.py` next to the FastAPI app file).

```python
from fastapi import FastAPI
from articuno.decorator import infer_response_model
import polars as pl

app = FastAPI()

@infer_response_model(
    name="UserModel",
    example_input={"limit": 2},
    models_path="models.py"  # Optional, relative to this file by default
)
@app.get("/users")
def get_users(limit: int):
    return pl.DataFrame({
        "name": ["Alice", "Bob"],
        "age": [30, 25],
    }).head(limit)
```

The decorator registers the endpoint for the CLI to analyze later without changing runtime behavior.

### ⚙️ Step 2: Run the CLI Bootstrap

After writing or modifying your endpoints, run:

```bash
articuno bootstrap app/main.py
```

This will:

1. Import and call all decorated endpoints with the example input
2. Infer a Pydantic model from the returned DataFrame
3. Write the model to the specified models.py file
4. Update your FastAPI app to use the generated response models

---

## 🛠 CLI Options

```bash
Usage: articuno bootstrap [OPTIONS] APP_PATH

Arguments:
  APP_PATH                Path to your FastAPI app file (e.g., app/main.py)

Options:
  --models-path PATH      Optional output path for models.py (defaults to same folder as app)
  --dry-run               Preview changes without writing files
  --help                  Show this message and exit
```

---

## 📜 Patito vs Articuno

| Feature                | **Patito**           | **Articuno**              |
| ---------------------- | -------------------- | ------------------------- |
| Polars–Pydantic bridge | ✅ Declarative schema | ✅ Dynamic inference       |
| Validation constraints | ✅ Unique, bounds     | ⚠️ Basic types, nullables |
| Nested Structs         | ❌ Not supported      | ✅ Fully recursive         |
| Code generation        | ❌                    | ✅ via datamodel-code-gen  |
| Example/mock data      | ✅ `.examples`        | ❌                         |

**[Patito](https://pypi.org/project/patito/)** is ideal for static schema validation with custom constraints and ETL pipelines.

**Articuno** excels at dynamic schema inference, nested model generation, and code export for API use cases.

---

## License

MIT © 2025 Odos Matthews
