Metadata-Version: 2.4
Name: openextract
Version: 0.3.2
Summary: Extract structured data from documents, images, audio, and video using LLMs
Project-URL: Homepage, https://github.com/Mellow-Artificial-Intelligence/openextract
Project-URL: Documentation, https://mellow-artificial-intelligence.github.io/openextract/
Project-URL: Repository, https://github.com/Mellow-Artificial-Intelligence/openextract
Project-URL: Issues, https://github.com/Mellow-Artificial-Intelligence/openextract/issues
Project-URL: Changelog, https://github.com/Mellow-Artificial-Intelligence/openextract/blob/main/CHANGELOG.md
Author: Cole McIntosh
License-Expression: MIT
License-File: LICENSE
Keywords: ai,document,extraction,llm,pydantic,structured-data
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.12
Requires-Dist: pydantic-ai-slim[google,logfire,openai]>=1.37.0
Requires-Dist: pydantic>=2.12.5
Requires-Dist: python-dotenv>=1.2.2
Description-Content-Type: text/markdown

# openextract

Extract structured data from documents, images, audio, and video using LLMs.

## Installation

```bash
uv add openextract
```

Or, using pip:

```bash
pip install openextract
```

## Usage

```python
from pydantic import BaseModel
from openextract import extract

class PdfInfo(BaseModel):
    summary: str
    language: str

result = extract(
    schema=PdfInfo,
    model="openai:gpt-5.4",
    input_file="https://example.com/document.pdf",
    instructions="Return a two-sentence summary and the primary language of the document.",
)
print(result)
```
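
The `schema` argument above is an ordinary Pydantic model, so it can be inspected and tested without calling an LLM. The sketch below uses only Pydantic itself. The `Field` descriptions are an illustrative addition, and whether openextract forwards them to the model is an assumption, not something this README states. The hand-written dictionaries stand in for model output.

```python
from pydantic import BaseModel, Field, ValidationError


class PdfInfo(BaseModel):
    # Descriptions are optional; they are included here for illustration only.
    summary: str = Field(description="Two-sentence summary of the document")
    language: str = Field(description="Primary language of the document")


# JSON Schema that Pydantic derives from the model definition.
schema = PdfInfo.model_json_schema()
print(schema["required"])

# A hand-written payload (a stand-in for LLM output) that satisfies the schema.
info = PdfInfo.model_validate({"summary": "A short report.", "language": "English"})
print(info.language)

# A payload missing a required field is rejected with a ValidationError.
try:
    PdfInfo.model_validate({"summary": "Missing the language field."})
    missing_errors = 0
except ValidationError as exc:
    missing_errors = exc.error_count()
print(missing_errors)
```

Checking a schema this way before passing it to `extract` can catch field-name or type mistakes early, independent of any model provider.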

## Changelog

See [CHANGELOG.md](CHANGELOG.md) for release history.

## Contributing

See [CONTRIBUTING.md](CONTRIBUTING.md) for development setup and guidelines.
