Metadata-Version: 2.4
Name: airow
Version: 0.1.0
Summary: AI-powered DataFrame processing made simple
Author-email: Dmitrii K <dmitriik@protonmail.com>
Maintainer-email: Dmitrii K <dmitriik@protonmail.com>
License: MIT
Project-URL: Homepage, https://github.com/dmitriiweb/airow
Project-URL: Repository, https://github.com/dmitriiweb/airow
Project-URL: Documentation, https://github.com/dmitriiweb/airow
Project-URL: Bug Tracker, https://github.com/dmitriiweb/airow/issues
Keywords: ai,ai-agent,dataframe,pandas,pydantic-ai,async,data-processing
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Scientific/Engineering :: Information Analysis
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Topic :: Text Processing
Classifier: Topic :: Database
Classifier: Typing :: Typed
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: loguru>=0.7.3
Requires-Dist: pandas>=2.3.2
Requires-Dist: pydantic>=2.11.7
Requires-Dist: pydantic-ai>=0.8.1
Requires-Dist: tqdm>=4.67.1
Provides-Extra: dev
Requires-Dist: mypy>=1.17.1; extra == "dev"
Requires-Dist: pytest>=8.4.2; extra == "dev"
Requires-Dist: pytest-asyncio>=1.1.0; extra == "dev"
Requires-Dist: pytest-cov>=6.3.0; extra == "dev"
Requires-Dist: ruff>=0.12.12; extra == "dev"
Dynamic: license-file

# Airow

**AI-powered DataFrame processing made simple**

Airow is a Python library that combines pandas DataFrames with AI models to process structured data at scale. Built on `pydantic-ai`, it provides type-safe, asynchronous DataFrame processing with any model that `pydantic-ai` supports.

## Features

- 🚀 **Async processing** with batch support for high performance
- 🔒 **Type-safe outputs** using Pydantic models
- 📊 **Progress tracking** with built-in progress bars
- 🔄 **Automatic retries** with configurable retry logic
- 🤖 **Flexible AI models** - works with OpenAI, Ollama, Anthropic, and more
- ⚡ **Parallel processing** within batches for maximum throughput
- 📝 **Structured outputs** with defined schemas and validation
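
The batching, in-batch parallelism, and retry behaviour listed above can be pictured with a small standard-library sketch. This is not Airow's implementation: `call_with_retries`, `process_in_batches`, `make_flaky_model`, and the `batch_size` and `max_retries` parameters are hypothetical names chosen for illustration, and the "model" is a stand-in that fails once per row.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, List

Row = Dict[str, Any]
RowFn = Callable[[Row], Awaitable[Row]]


async def call_with_retries(process_row: RowFn, row: Row, max_retries: int = 3) -> Row:
    """Await process_row(row), retrying on any exception up to max_retries times."""
    for attempt in range(max_retries + 1):
        try:
            return await process_row(row)
        except Exception:
            if attempt == max_retries:
                raise  # retries exhausted: surface the last error
            # Short exponential backoff before the next attempt.
            await asyncio.sleep(0.01 * 2**attempt)
    raise AssertionError("unreachable")


async def process_in_batches(
    rows: List[Row], process_row: RowFn, batch_size: int = 2, max_retries: int = 3
) -> List[Row]:
    """Process rows batch by batch; rows within a batch run concurrently."""
    results: List[Row] = []
    for start in range(0, len(rows), batch_size):
        batch = rows[start : start + batch_size]
        # gather() returns results in the same order as its inputs.
        results.extend(
            await asyncio.gather(
                *(call_with_retries(process_row, row, max_retries) for row in batch)
            )
        )
    return results


def make_flaky_model() -> RowFn:
    """Hypothetical stand-in for a model call: fails once per row, then succeeds."""
    seen = set()

    async def flaky_model(row: Row) -> Row:
        key = row["description"]
        if key not in seen:
            seen.add(key)
            raise RuntimeError("simulated transient failure")
        return {**row, "length": len(key)}

    return flaky_model


if __name__ == "__main__":
    demo_rows = [{"description": d} for d in ["crisp", "oaky", "fruity"]]
    print(asyncio.run(process_in_batches(demo_rows, make_flaky_model())))
```

Running batches sequentially while fanning out inside each batch bounds the number of in-flight requests to `batch_size`, which is one common way to trade throughput against provider rate limits.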

## Installation

```bash
# Using pip
pip install airow

# Using uv (recommended)
uv add airow

# Using conda
conda install -c conda-forge airow
```

## Quick Start

```python
import asyncio

import pandas as pd
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.ollama import OllamaProvider

from airow import Airow, OutputColumn

async def main():
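    # Set up your AI model (here, a local Ollama model behind its OpenAI-compatible API)
    model = OpenAIChatModel(
        model_name="llama3.2:latest",
        provider=OllamaProvider(base_url="http://localhost:11434/v1"),
    )
    # Alternatively, pass a model string instead (uncomment one):
    # model = "openai:gpt-5"
    # model = "anthropic:claude-sonnet-4-0"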
    
    # Create Airow instance
    airow = Airow(
        model=model,
        system_prompt="You are an expert in wine tasting and selection.",
    )
    
    # Load your data
    df = pd.read_csv("wine_data.csv")

    output_columns = [
        OutputColumn(name="sentiment", type=str, description="Positive, negative, or neutral sentiment"),
        OutputColumn(name="confidence", type=float, description="Confidence score between 0 and 1"),
        OutputColumn(name="keywords", type=list, description="List of key terms extracted"),
    ]
    
    # Process with AI
    result_df = await airow.run(
        df,
        prompt="Analyze the wine description and provide sentiment analysis, confidence score, and extract key terms.",
        input_columns=["description"],
        output_columns=output_columns,
        show_progress=True,
    )
    
    print(result_df.head())

if __name__ == "__main__":
    asyncio.run(main())
```
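
The Quick Start reads `wine_data.csv` and uses its `description` column as input. To try it without a real dataset, a small placeholder file can be generated with the standard library. The helper name `write_sample_csv` and the three descriptions are made up for illustration; they are not part of Airow or of any real dataset.

```python
import csv
from pathlib import Path

# Placeholder rows, invented purely so the Quick Start has something to read.
SAMPLE_DESCRIPTIONS = [
    "Bright citrus notes with a crisp, refreshing finish.",
    "Flat and overly sweet, with a harsh aftertaste.",
    "Medium body, mild tannins, unremarkable but drinkable.",
]


def write_sample_csv(path="wine_data.csv"):
    """Write a one-column CSV with a `description` header and return its path."""
    path = Path(path)
    with path.open("w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=["description"])
        writer.writeheader()
        writer.writerows({"description": d} for d in SAMPLE_DESCRIPTIONS)
    return path
```

Calling `write_sample_csv()` once in the working directory creates the file that `pd.read_csv("wine_data.csv")` expects.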
