Metadata-Version: 2.3
Name: sintezi
Version: 0.2.0
Summary: Synthetic data generation that actually doesn't hurt.
Keywords: ai,llm,synthetic-data,openai,structured-output,pydantic,data-generation
Author: Maksim Afanasyev
Author-email: Maksim Afanasyev <mr.applexz@gmail.com>
License: MIT License
         
         Copyright (c) 2026 Maksim Afanasyev
         
         Permission is hereby granted, free of charge, to any person obtaining a copy
         of this software and associated documentation files (the "Software"), to deal
         in the Software without restriction, including without limitation the rights
         to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
         copies of the Software, and to permit persons to whom the Software is
         furnished to do so, subject to the following conditions:
         
         The above copyright notice and this permission notice shall be included in all
         copies or substantial portions of the Software.
         
         THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
         IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
         FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
         AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
         LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
         OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
         SOFTWARE.
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Typing :: Typed
Requires-Dist: openai>=2.0.0
Requires-Dist: pydantic>=2.0.0
Requires-Dist: pydantic-xml>=2.18.0
Requires-Dist: tenacity>=9.0.0
Requires-Python: >=3.11
Project-URL: Repository, https://github.com/mrapplexz/sintezi
Project-URL: Documentation, https://mrapplexz.github.io/sintezi/
Project-URL: Changelog, https://github.com/mrapplexz/sintezi/blob/master/CHANGELOG.md
Project-URL: Bug Tracker, https://github.com/mrapplexz/sintezi/issues
Description-Content-Type: text/markdown

# sintezi

Synthetic data generation that actually doesn't hurt.

[![PyPI](https://img.shields.io/pypi/v/sintezi?color=blue)](https://pypi.org/project/sintezi/)
[![Python](https://img.shields.io/pypi/pyversions/sintezi)](https://pypi.org/project/sintezi/)
[![License](https://img.shields.io/github/license/mrapplexz/sintezi)](LICENSE)
[![CI](https://github.com/mrapplexz/sintezi/actions/workflows/release.yml/badge.svg)](https://github.com/mrapplexz/sintezi/actions/workflows/release.yml)
[![Docs](https://img.shields.io/badge/docs-online-blue)](https://mrapplexz.github.io/sintezi/)

A type-safe Python library for generating synthetic data using LLMs. It provides structured outputs, automatic retry policies, and support for multiple response formats (JSON, XML, plain text, or custom).

**Why sintezi?** Unlike general-purpose LLM frameworks (LangChain, LlamaIndex), sintezi is focused on bulk synthetic data generation with explicit developer control:

- **Bulk generation first** — optimized for creating large synthetic datasets, not building chatbots or agents
- **Explicit control** — you define formats, parsers, and retry logic; no hidden prompt engineering or magic
- **Simple by design** — no memory systems, RAG pipelines, or high-level abstractions; just clean, predictable data generation

If you need agentic workflows, memory, or RAG, use LangChain. If you need to generate 10,000 structured examples with full control, use sintezi.

## Features

- **Type-safe** — Pydantic models for requests and responses with full type hints
- **Multiple formats** — JSON, XML, plain text, or custom formatters
- **Smart retry** — Separate retry policies for network errors and validation failures
- **Auto-parsing** — Automatic format selection based on Pydantic models
- **LLM-agnostic** — Works with any OpenAI-compatible API
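
To picture what "separate retry policies" means in practice, here is a standalone, stdlib-only sketch of two nested retry loops. It is not sintezi's API: `retry_async`, `NetworkError`, `make_fake_llm`, and `generate` are hypothetical names used only for illustration, and the real configuration lives in the retry guide linked under Documentation.

```python
import asyncio
import json


class NetworkError(Exception):
    """Stand-in for a transport failure (timeout, connection reset)."""


async def retry_async(fn, *, retry_on, attempts, base_delay=0.0):
    """Await fn() until it succeeds or `attempts` calls have failed.

    Only exceptions listed in `retry_on` are retried; anything else propagates.
    """
    for attempt in range(1, attempts + 1):
        try:
            return await fn()
        except retry_on:
            if attempt == attempts:
                raise
            # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
            await asyncio.sleep(base_delay * 2 ** (attempt - 1))


def make_fake_llm(responses):
    """Return a coroutine function that replays scripted responses (test stub)."""
    scripted = iter(responses)

    async def call():
        item = next(scripted)
        if isinstance(item, Exception):
            raise item
        return item

    return call


async def generate(call_llm):
    async def fetch():
        # Inner policy: retry transport failures only.
        return await retry_async(call_llm, retry_on=(NetworkError,), attempts=3)

    async def fetch_and_parse():
        raw = await fetch()
        data = json.loads(raw)  # json.JSONDecodeError is a ValueError subclass
        if "description" not in data:
            raise ValueError("missing 'description' field")
        return data

    # Outer policy: on a validation failure, ask the model again.
    return await retry_async(fetch_and_parse, retry_on=(ValueError,), attempts=2)


if __name__ == "__main__":
    llm = make_fake_llm([NetworkError("timeout"), "not json", '{"description": "ok"}'])
    print(asyncio.run(generate(llm)))
```

In this sketch each outer (validation) attempt gets a fresh inner (network) budget, so a flaky connection does not use up the attempts reserved for malformed model output.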

## Installation

```bash
pip install sintezi
```

**Requirements:** Python 3.11+

## Quick start

```python
import asyncio

from pydantic import BaseModel
from openai import AsyncOpenAI
from sintezi.ai.context import ai_context_from_openai
from sintezi.ai.executor import StructuredAiCall, StructuredAiCallConfig, AiCallParameters
from sintezi.ai.formatter import auto_formatter_for_type
from sintezi.ai.parser import auto_parser_for_type

class ProductInfo(BaseModel):
    """Input passed to the model."""
    name: str
    category: str

class ProductDescription(BaseModel):
    """Structured output parsed from the model's response."""
    description: str

async def main() -> None:
    # Setup
    client = AsyncOpenAI(api_key="your-api-key")
    ctx = ai_context_from_openai(client)

    config = StructuredAiCallConfig(
        system_message="Generate product descriptions.",
        parameters=AiCallParameters(model="gpt-4o-mini"),
    )

    ai_call = StructuredAiCall(
        ctx=ctx,
        config=config,
        formatter=auto_formatter_for_type(ProductInfo),
        parser=auto_parser_for_type(ProductDescription),
    )

    # Generate
    product = ProductInfo(name="Laptop", category="Electronics")
    result = await ai_call(product)
    print(result.description)

# `await` is only valid inside a coroutine, so a script needs an entry point.
# In a notebook (which already runs an event loop), use `await main()` instead.
asyncio.run(main())
```

See the [quick start guide](https://mrapplexz.github.io/sintezi/guide/quickstart/) for a complete walkthrough.
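
For bulk generation, one common pattern is to run many calls concurrently with a cap on how many are in flight at once. The sketch below uses only `asyncio` and is not part of sintezi: `generate_many` is a hypothetical helper, and `fake_ai_call` stands in for the `ai_call` object from the snippet above. It assumes the real `ai_call` can safely be awaited concurrently, which this README does not confirm.

```python
import asyncio


async def generate_many(call, items, max_concurrency=8):
    """Await `call(item)` for every item, at most `max_concurrency` at a time.

    Results come back in input order. A failed item yields its exception
    object instead of aborting the whole batch.
    """
    semaphore = asyncio.Semaphore(max_concurrency)

    async def run_one(item):
        async with semaphore:
            return await call(item)

    return await asyncio.gather(
        *(run_one(item) for item in items),
        return_exceptions=True,
    )


async def fake_ai_call(name):
    """Stub standing in for `ai_call`; rejects empty names to show error handling."""
    await asyncio.sleep(0)
    if not name:
        raise ValueError("empty product name")
    return f"Description of {name}"


if __name__ == "__main__":
    results = asyncio.run(generate_many(fake_ai_call, ["Laptop", "", "Phone"]))
    for result in results:
        print(repr(result))
```

Returning exceptions in place of results keeps one bad input from discarding thousands of good generations; the caller can filter with `isinstance(result, Exception)` and re-queue the failures.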

## Documentation

Full documentation: [https://mrapplexz.github.io/sintezi/](https://mrapplexz.github.io/sintezi/)

- [Quick start guide](https://mrapplexz.github.io/sintezi/guide/quickstart/) — complete walkthrough with examples
- [Executors](https://mrapplexz.github.io/sintezi/guide/executors/) — available AI call executors
- [Formatters](https://mrapplexz.github.io/sintezi/guide/formatters/) — JSON, XML, custom formats
- [Parsers](https://mrapplexz.github.io/sintezi/guide/parsers/) — response parsing and validation
- [Retry policies](https://mrapplexz.github.io/sintezi/guide/retry/) — network and validation retry configuration
- [API Reference](https://mrapplexz.github.io/sintezi/reference/) — complete API documentation

## License

[MIT](LICENSE)
