Metadata-Version: 2.4
Name: pydantic-ai-crawlers
Version: 0.1.1
Summary: Add web crawling capabilities to Pydantic AI agents
Project-URL: Homepage, https://gitlab.com/rzk.ssr/pydantic-ai-crawling
Project-URL: Repository, https://gitlab.com/rzk.ssr/pydantic-ai-crawling
Project-URL: Issues, https://gitlab.com/rzk.ssr/pydantic-ai-crawling
Author: Pydantic AI Channels Contributors
License-File: LICENSE
Keywords: ai,bot,crawling,pydantic,scraper
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.12
Requires-Dist: crawl4ai>=0.8.6
Requires-Dist: pydantic-ai>=1.103.0
Description-Content-Type: text/markdown

# pydantic-ai-crawling

Integrate [Pydantic AI](https://github.com/pydantic/pydantic-ai) with [Crawl4AI](https://github.com/unclecode/crawl4ai) to give your AI agents web crawling and scraping capabilities.

## Features

- **🔎 Crawling & Scraping**: High-performance web content extraction tailored for AI agents.
- **🖼️ Media Support**: Extract images, audio, videos, and responsive formats (srcset, picture).
- **🚀 Dynamic Crawling**: Execute JavaScript and handle async/sync content extraction.
- **📸 Screenshots**: Capture page screenshots for debugging or visual analysis.
- **📂 Raw Data Crawling**: Process raw HTML (`raw:`) or local files (`file://`) directly.
- **🔗 Link Extraction**: Comprehensive extraction of internal, external, and iframe links.
- **🛠️ Customizable Hooks**: Define hooks at every step to customize crawling behavior.
- **💾 Caching**: Built-in caching for improved speed and efficiency.
- **📄 Metadata Extraction**: Retrieve structured metadata from any web page.
- **📡 IFrame Support**: Seamless extraction from embedded iframe content.
- **🕵️ Lazy Load Handling**: Automatically waits for images and content to load.
- **🔄 Full-Page Scanning**: Simulates scrolling for infinite-scroll and dynamic pages.
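
The `raw:` and `file://` prefixes listed under Raw Data Crawling decide how a target is loaded. As a minimal sketch of that idea (not this package's implementation), a target string could be classified by its prefix as below. `SourceKind` and `classify_source` are hypothetical names introduced only for illustration.

```python
from enum import Enum


class SourceKind(Enum):
    """Hypothetical labels for the three kinds of crawl target."""

    RAW_HTML = "raw"
    LOCAL_FILE = "file"
    WEB = "web"


def classify_source(target: str) -> SourceKind:
    """Classify a crawl target by its prefix only (illustrative helper)."""
    if target.startswith("raw:"):
        # Inline HTML passed directly, e.g. "raw:<p>hello</p>"
        return SourceKind.RAW_HTML
    if target.startswith("file://"):
        # A file on the local disk, e.g. "file:///tmp/page.html"
        return SourceKind.LOCAL_FILE
    if target.startswith(("http://", "https://")):
        # A regular page fetched over the network
        return SourceKind.WEB
    raise ValueError(f"Unsupported crawl target: {target!r}")
```

Raising on unknown prefixes, rather than defaulting to a web fetch, keeps a mistyped target from silently triggering a network request.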

## Installation

```bash
pip install pydantic-ai-crawlers
```

## Usage

```python
import pydantic_ai_crawling

# Example usage
pydantic_ai_crawling.greet()
```
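
The snippet above does not show a crawl. As a rough sketch of the kind of wiring this package is meant to provide, the example below calls Crawl4AI and Pydantic AI directly. It is not this package's API: `fetch_markdown`, `build_agent`, `main`, and the model name are assumptions made for illustration, and running `main()` needs both dependencies installed, a browser set up for Crawl4AI, and a model API key.

```python
async def fetch_markdown(url: str) -> str:
    """Crawl a single page and return its content as Markdown."""
    # Imported lazily so this module loads before the browser stack is set up.
    from crawl4ai import AsyncWebCrawler

    async with AsyncWebCrawler() as crawler:
        result = await crawler.arun(url=url)
    if not result.success:
        # Return the error as text so the agent can react to it.
        return f"Crawl failed: {result.error_message}"
    return str(result.markdown)


def build_agent(model: str):
    """Create an agent that can call fetch_markdown as a tool (hypothetical helper)."""
    from pydantic_ai import Agent

    return Agent(model, tools=[fetch_markdown])


def main() -> None:
    # "openai:gpt-4o" is an assumed model name; substitute your own.
    agent = build_agent("openai:gpt-4o")
    result = agent.run_sync("Summarize https://example.com in two sentences.")
    print(result.output)
```

Registering the crawl function as a plain tool lets the model decide when to fetch a page, while the function itself stays a normal coroutine that can be tested on its own.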

## CLI

After installation, you can use the built-in CLI:

```bash
crawler
```

## Development

To set up the development environment:

```bash
uv sync
```

To run tests:

```bash
uv run pytest
```

## License

MIT
