Metadata-Version: 2.4
Name: browsefn
Version: 0.0.1
Summary: Comprehensive self-hosted web browsing and data extraction platform for developers
Author: 21n
License-Expression: Apache-2.0
Keywords: crawling,geolocation,images,sdk,web-scraping
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Requires-Python: >=3.8
Requires-Dist: beautifulsoup4>=4.12.0
Requires-Dist: html2text>=2020.1.16
Requires-Dist: httpx>=0.27.0
Requires-Dist: pydantic>=2.0.0
Requires-Dist: sfns>=0.1.0
Requires-Dist: tenacity>=8.2.0
Requires-Dist: typing-extensions>=4.0.0
Provides-Extra: fastapi
Requires-Dist: fastapi>=0.100.0; extra == 'fastapi'
Requires-Dist: superfunctions-fastapi>=0.1.0; extra == 'fastapi'
Requires-Dist: uvicorn>=0.20.0; extra == 'fastapi'
Provides-Extra: playwright
Requires-Dist: playwright>=1.40.0; extra == 'playwright'
Provides-Extra: test
Requires-Dist: mypy>=1.0.0; extra == 'test'
Requires-Dist: pytest-asyncio>=0.23.0; extra == 'test'
Requires-Dist: pytest-cov>=4.1.0; extra == 'test'
Requires-Dist: pytest>=7.0.0; extra == 'test'
Description-Content-Type: text/markdown

# BrowseFn Python SDK

> Comprehensive self-hosted web browsing and data extraction platform for developers.

The Python SDK for BrowseFn provides a unified, provider-agnostic interface (providers can be swapped easily) for:

- Web scraping and crawling (HTML, Markdown, Text)
- Image search and download
- Geolocation services (Geocoding, Reverse Geocoding)

## Status

🚧 **Alpha**: version 0.0.1, so the API may still change.

## Features

- **Type-safe**: Built with Pydantic for robust data validation.
- **Async**: Fully asynchronous API using `httpx`.
- **Extensible**: Easy to add custom providers.
- **Batteries included**: Comes with basic providers (BeautifulSoup).

## Installation

```bash
pip install browsefn
```

## Usage

### Web Scraping

```python
import asyncio
from browsefn import browse_fn
from browsefn.web.providers.bs4 import BeautifulSoupProvider

async def main():
    # Initialize
    browse = browse_fn()

    # Register a provider (e.g., BeautifulSoup)
    bs4_provider = BeautifulSoupProvider()
    browse.web.register_provider("beautifulsoup", bs4_provider)
    browse.web.config.default_provider = "beautifulsoup"

    # Get a page
    page = await browse.web.get_page("https://example.com")

    print(f"Title: {page.metadata.title}")
    print(f"Content length: {len(page.content)}")

if __name__ == "__main__":
    asyncio.run(main())
```
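
Because the API is asynchronous, several pages can be fetched concurrently. The helper below is a minimal sketch, not part of BrowseFn: `fetch_all` and `max_concurrency` are hypothetical names, and it assumes `browse.web.get_page` can safely be awaited from several tasks at once. It accepts any async callable that takes a URL, so it does not depend on BrowseFn itself.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, Iterable


async def fetch_all(
    get_page: Callable[[str], Awaitable[Any]],
    urls: Iterable[str],
    max_concurrency: int = 5,
) -> Dict[str, Any]:
    """Fetch every URL with at most `max_concurrency` requests in flight.

    Returns a dict mapping each URL to its result, or to the exception
    that was raised while fetching it.
    """
    semaphore = asyncio.Semaphore(max_concurrency)

    async def fetch_one(url: str) -> Any:
        # The semaphore caps how many fetches run at the same time.
        async with semaphore:
            return await get_page(url)

    url_list = list(urls)
    # return_exceptions=True keeps one failing URL from cancelling the rest.
    results = await asyncio.gather(
        *(fetch_one(url) for url in url_list),
        return_exceptions=True,
    )
    return dict(zip(url_list, results))


# Hypothetical usage inside main() from the example above:
#     pages = await fetch_all(browse.web.get_page, ["https://example.com"])
```

Each value in the returned dict should be checked with `isinstance(value, Exception)` before it is used as a page.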

### Configuration

Pass a `BrowseFnConfig` object to `browse_fn()` to configure BrowseFn.

```python
from browsefn import browse_fn, BrowseFnConfig, WebConfig

config = BrowseFnConfig(
    web=WebConfig(
        default_provider="firecrawl",
        # ...
    )
)
browse = browse_fn(config)
```
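
The `default_provider` value is a name, so it presumably has to match a provider registered under that name. BrowseFn's internals are not shown here; the following is a stand-alone sketch of the general registry pattern, not the library's implementation. `ProviderRegistry`, `resolve`, and `EchoProvider` are hypothetical names introduced for illustration.

```python
import asyncio
from typing import Dict, Optional, Protocol


class Provider(Protocol):
    """Hypothetical minimal provider interface: fetch a URL, return text."""

    async def get_page(self, url: str) -> str: ...


class EchoProvider:
    """Toy provider that returns the URL instead of doing network I/O."""

    async def get_page(self, url: str) -> str:
        return f"fetched:{url}"


class ProviderRegistry:
    """Maps provider names to instances and resolves a default by name."""

    def __init__(self, default_provider: Optional[str] = None) -> None:
        self.default_provider = default_provider
        self._providers: Dict[str, Provider] = {}

    def register_provider(self, name: str, provider: Provider) -> None:
        self._providers[name] = provider

    def resolve(self, name: Optional[str] = None) -> Provider:
        # An explicit name wins; otherwise fall back to the configured default.
        key = name or self.default_provider
        if key is None:
            raise LookupError("no provider name given and no default configured")
        if key not in self._providers:
            raise LookupError(f"provider {key!r} is not registered")
        return self._providers[key]


registry = ProviderRegistry(default_provider="echo")
registry.register_provider("echo", EchoProvider())
page_text = asyncio.run(registry.resolve().get_page("https://example.com"))
```

In this sketch, a default that names an unregistered provider fails at lookup time with a `LookupError`. Whether BrowseFn reports the mismatch the same way is not specified here.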

## Development

1.  **Install dependencies**:
    ```bash
    pip install -e ".[test]"
    ```

2.  **Run tests**:
    ```bash
    pytest
    ```