Metadata-Version: 2.4
Name: crawlerforge
Version: 1.0.13
Summary: Advanced Scrapy framework with multi-engine support and intelligent proxy management
Home-page: https://github.com/fabiocantone/crawlerforge
Author: Fabio Cantone
Author-email: fabio@cantone.me
Project-URL: Bug Tracker, https://github.com/fabiocantone/crawlerforge/issues
Project-URL: Documentation, https://github.com/fabiocantone/crawlerforge/wiki
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Operating System :: OS Independent
Requires-Python: >=3.8
Description-Content-Type: text/markdown
Requires-Dist: scrapy>=2.11.0
Requires-Dist: requests>=2.28.0
Requires-Dist: aiohttp>=3.8.0
Requires-Dist: demjson3>=3.0.6
Requires-Dist: python-dateutil>=2.8.0
Requires-Dist: twisted>=22.10.0
Provides-Extra: browser
Requires-Dist: camoufox>=0.2.0; extra == "browser"
Requires-Dist: undetected-chromedriver>=3.5.0; extra == "browser"
Provides-Extra: http
Requires-Dist: curl-cffi>=0.5.0; extra == "http"
Requires-Dist: httpx>=0.24.0; extra == "http"
Provides-Extra: storage
Requires-Dist: pymongo>=4.0.0; extra == "storage"
Requires-Dist: mysql-connector-python>=8.0.0; extra == "storage"
Provides-Extra: all
Requires-Dist: camoufox>=0.2.0; extra == "all"
Requires-Dist: undetected-chromedriver>=3.5.0; extra == "all"
Requires-Dist: curl-cffi>=0.5.0; extra == "all"
Requires-Dist: httpx>=0.24.0; extra == "all"
Requires-Dist: pymongo>=4.0.0; extra == "all"
Requires-Dist: mysql-connector-python>=8.0.0; extra == "all"
Dynamic: author
Dynamic: author-email
Dynamic: classifier
Dynamic: description
Dynamic: description-content-type
Dynamic: home-page
Dynamic: project-url
Dynamic: provides-extra
Dynamic: requires-dist
Dynamic: requires-python
Dynamic: summary

Advanced Scrapy framework with multi-engine support and intelligent proxy management.

## Features

- **Multi-Engine Support**: HTTP (curl_cffi), Camoufox, Undetected Chrome
- **Intelligent Proxy Management**: API, file, and database providers with auto-rotation
- **JSON Configuration**: Zero-code spider setup
- **Advanced Anti-Detection**: Human-like behaviors and stealth features
- **Flexible Data Extraction**: CSS, XPath, JSON, derived fields
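
The auto-rotation idea behind the proxy manager can be illustrated with a small self-contained sketch. This is illustrative only and does not use CrawlerForge's actual API; the `ProxyRotator` class and `next_proxy` method are hypothetical names:

```python
from itertools import cycle


class ProxyRotator:
    """Minimal round-robin proxy rotator (illustrative; not CrawlerForge's API)."""

    def __init__(self, proxies):
        if not proxies:
            raise ValueError("at least one proxy is required")
        # cycle() yields the proxies in order, wrapping around indefinitely.
        self._pool = cycle(proxies)

    def next_proxy(self):
        # Return the next proxy in round-robin order.
        return next(self._pool)


rotator = ProxyRotator(["http://p1:8080", "http://p2:8080"])
print(rotator.next_proxy())  # http://p1:8080
print(rotator.next_proxy())  # http://p2:8080
print(rotator.next_proxy())  # wraps back to http://p1:8080
```

A real provider would additionally health-check proxies and drop dead ones from the pool; the round-robin core stays the same.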

## Installation

```bash
# Basic installation
pip install crawlerforge

# With browser support (quotes keep shells like zsh from expanding the brackets)
pip install "crawlerforge[browser]"

# With all features
pip install "crawlerforge[all]"
```

## Quick Start

```bash
# Generate configuration
crawlerforge genconfig --template ecommerce --output config.json

# Run spider
crawlerforge crawl myspider -c config.json -o products.json
```

## Example Configuration

```json
{
  "engine": "camoufox",
  "start_url": ["https://example.com/sitemap.xml"],
  "products_list_selector": ".product",
  "fields": {
    "name": {"type": "text", "tags": [".title::text"], "required": true},
    "price": {"type": "price", "tags": [".price::text"], "required": true}
  }
}
```
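
A configuration like the one above can be sanity-checked with plain Python before running the spider. The sketch below uses only the standard library; the required keys it checks are assumptions inferred from the example, not the framework's documented schema:

```python
import json

# Assumed required top-level keys, inferred from the example config above.
REQUIRED_KEYS = {"engine", "start_url", "fields"}


def validate_config(raw: str) -> dict:
    """Parse a CrawlerForge-style JSON config and check the assumed required keys."""
    config = json.loads(raw)
    missing = REQUIRED_KEYS - config.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    # Each field must at least declare an extraction type and its selector tags.
    for name, spec in config.get("fields", {}).items():
        if "type" not in spec or "tags" not in spec:
            raise ValueError(f"field {name!r} needs 'type' and 'tags'")
    return config


cfg = validate_config("""
{
  "engine": "camoufox",
  "start_url": ["https://example.com/sitemap.xml"],
  "fields": {
    "name": {"type": "text", "tags": [".title::text"], "required": true}
  }
}
""")
print(cfg["engine"])  # camoufox
```

Catching a malformed config this way fails fast with a clear error instead of surfacing mid-crawl.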

## Documentation

Visit [GitHub](https://github.com/fabiocantone/crawlerforge) for full documentation and examples.

## License

MIT License

## Publishing

```bash
# 1. Install build tools
pip install build twine

# 2. Build package
python -m build

# 3. Upload to TestPyPI (testing)
python -m twine upload --repository testpypi dist/*

# 4. Upload to PyPI (production)
python -m twine upload dist/*

# 5. Install from PyPI
pip install crawlerforge

# 6. Install from GitHub (development)
pip install git+https://github.com/fabiocantone/crawlerforge.git
```
