Metadata-Version: 2.4
Name: owl-browser
Version: 2.1.2
Summary: Python SDK for Owl Browser automation - async-first with dynamic OpenAPI method generation
Project-URL: Homepage, https://www.owlbrowser.net
Project-URL: Documentation, https://www.owlbrowser.net/docs
Project-URL: Repository, https://github.com/Olib-AI/olib-browser
Author-email: Olib AI <support@olib.ai>
License-Expression: MIT
Keywords: antidetect,async,automation,browser,owl,testing,web-scraping
Classifier: Development Status :: 4 - Beta
Classifier: Framework :: AsyncIO
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Internet :: WWW/HTTP :: Browsers
Classifier: Topic :: Software Development :: Testing
Classifier: Typing :: Typed
Requires-Python: >=3.12
Requires-Dist: aiohttp>=3.9.0
Requires-Dist: beautifulsoup4>=4.12.0
Requires-Dist: cryptography>=42.0.0
Requires-Dist: pyjwt[crypto]>=2.8.0
Provides-Extra: dev
Requires-Dist: mypy>=1.8.0; extra == 'dev'
Requires-Dist: pytest-asyncio>=0.23.0; extra == 'dev'
Requires-Dist: pytest>=8.0.0; extra == 'dev'
Requires-Dist: ruff>=0.2.0; extra == 'dev'
Description-Content-Type: text/markdown

# Owl Browser Python SDK v2

Async-first Python SDK for [Owl Browser](https://www.owlbrowser.net) automation with dynamic OpenAPI method generation and flow execution support.

## Features

- **Dynamic Method Generation**: Methods are automatically generated from the OpenAPI schema
- **Async-First Design**: Built on asyncio, so browser calls are awaitable and can run concurrently
- **Sync Wrappers**: Convenience methods for non-async code
- **Flow Execution**: Execute test flows with variable resolution and expectations
- **Type Safety**: Full type hints with Python 3.12+ features
- **Connection Pooling**: Efficient HTTP connection management
- **Retry Logic**: Automatic retries with exponential backoff

## Installation

```bash
pip install owl-browser
```

For development:

```bash
pip install owl-browser[dev]
```

## Quick Start

### Connection Modes

The SDK supports two connection modes depending on your deployment:

```python
from owl_browser import OwlBrowser, RemoteConfig

# Production (via nginx proxy) - this is the default
# Uses /api prefix: https://your-domain.com/api/execute/...
config = RemoteConfig(
    url="https://your-domain.com",
    token="your-token"
)

# Development (direct to http-server on port 8080)
# No prefix: http://localhost:8080/execute/...
config = RemoteConfig(
    url="http://localhost:8080",
    token="test-token",
    api_prefix=""  # Empty string for direct connection
)
```
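
How the two modes map onto request URLs can be sketched with a small helper. `build_endpoint` is a hypothetical name used only for illustration and is not part of the SDK. It assumes the prefix is simply inserted between the base URL and the `/execute` path shown in the comments above.

```python
def build_endpoint(base_url: str, api_prefix: str, path: str) -> str:
    """Join base URL, optional API prefix, and path with single slashes."""
    parts = [base_url.rstrip("/")]
    prefix = api_prefix.strip("/")
    if prefix:  # an empty prefix means a direct connection
        parts.append(prefix)
    parts.append(path.lstrip("/"))
    return "/".join(parts)


# Production: default "/api" prefix behind the nginx proxy
print(build_endpoint("https://your-domain.com", "/api", "/execute"))
# Development: empty prefix, direct to the http-server
print(build_endpoint("http://localhost:8080", "", "/execute"))
```

If requests fail with 404 responses against a local server, a mismatched prefix is a likely first thing to check.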

### Async Usage (Recommended)

```python
import asyncio
from owl_browser import OwlBrowser, RemoteConfig

async def main():
    config = RemoteConfig(
        url="https://your-domain.com",
        token="your-secret-token"
    )

    async with OwlBrowser(config) as browser:
        # Create a browser context
        ctx = await browser.create_context()
        context_id = ctx["context_id"]

        # Navigate to a page
        await browser.navigate(context_id=context_id, url="https://example.com")

        # Click an element
        await browser.click(context_id=context_id, selector="button#submit")

        # Take a screenshot
        screenshot = await browser.screenshot(context_id=context_id)

        # Extract the text of the page's main heading
        text = await browser.extract_text(context_id=context_id, selector="h1")
        print(f"Heading: {text}")

        # Close the context
        await browser.close_context(context_id=context_id)

asyncio.run(main())
```
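
In the example above, an exception raised between `create_context()` and `close_context()` would leave the context open. One possible way to guard against that is a small async context manager. `managed_context` and `FakeBrowser` below are hypothetical helpers written for this sketch, not SDK classes. The fake stands in for `OwlBrowser` so the snippet runs without a server.

```python
import asyncio
from contextlib import asynccontextmanager


@asynccontextmanager
async def managed_context(browser):
    """Yield a context_id and always close the context afterwards."""
    ctx = await browser.create_context()
    context_id = ctx["context_id"]
    try:
        yield context_id
    finally:
        await browser.close_context(context_id=context_id)


class FakeBrowser:
    """Stand-in for OwlBrowser so the sketch runs without a server."""

    def __init__(self):
        self.open_contexts = set()

    async def create_context(self):
        self.open_contexts.add("ctx-1")
        return {"context_id": "ctx-1"}

    async def close_context(self, context_id):
        self.open_contexts.discard(context_id)


async def demo():
    browser = FakeBrowser()
    try:
        async with managed_context(browser) as context_id:
            raise RuntimeError(f"simulated failure in {context_id}")
    except RuntimeError:
        pass
    return browser.open_contexts


print(asyncio.run(demo()))  # prints set(): the context was closed despite the error
```

With a real `OwlBrowser` instance, the same helper would wrap the `navigate`, `click`, and `screenshot` calls inside the `async with managed_context(browser)` block.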

### Sync Usage

```python
from owl_browser import OwlBrowser, RemoteConfig

config = RemoteConfig(
    url="http://localhost:8080",
    token="your-secret-token",
    api_prefix=""  # Direct connection to http-server, no nginx proxy
)

browser = OwlBrowser(config)
browser.connect_sync()

try:
    # Execute tools synchronously
    ctx = browser.execute_sync("browser_create_context")
    context_id = ctx["context_id"]
    browser.execute_sync("browser_navigate", context_id=context_id, url="https://example.com")
    browser.execute_sync("browser_close_context", context_id=context_id)
finally:
    # Always release the connection, even if a tool call fails
    browser.close_sync()
```

## Authentication

### Bearer Token

```python
config = RemoteConfig(
    url="http://localhost:8080",
    token="your-secret-token",
    api_prefix=""  # Direct connection; omit when going through the nginx proxy
)
```

### JWT Authentication

```python
from owl_browser import RemoteConfig, AuthMode, JWTConfig

config = RemoteConfig(
    url="http://localhost:8080",
    api_prefix="",  # Direct connection; omit when going through the nginx proxy
    auth_mode=AuthMode.JWT,
    jwt=JWTConfig(
        private_key_path="/path/to/private.pem",
        expires_in=3600,  # 1 hour
        refresh_threshold=300,  # Refresh 5 minutes before expiry
        issuer="my-app",
        subject="user-123"
    )
)
```

## Flow Execution

Execute JSON-described automation flows with declarative assertions,
variable resolution, conditional branching, loops, retries, and
notifications.

```python
from owl_browser import OwlBrowser, RemoteConfig
from owl_browser.flow import FlowExecutor

async def run_flow():
    async with OwlBrowser(RemoteConfig(...)) as browser:
        ctx = await browser.create_context()
        executor = FlowExecutor(browser, ctx["context_id"])

        flow = FlowExecutor.load_flow("test-flows/navigation.json")
        result = await executor.execute(flow)

        if result.success:
            print(f"Flow completed in {result.total_duration_ms:.0f}ms")
        else:
            print(f"Flow failed: {result.error}")

        await browser.close_context(context_id=ctx["context_id"])
```

For the full reference — flow file schema, every step-level field
(`expected`, `condition`, `for_each`, `capture`, `optional`, `timeoutMs`,
`retry`), variable scopes (`${prev}`, `${vars}`, `${params}`), the
`FlowNotifier` API, troubleshooting, and ~30 worked examples drawn from
`enterprise/test-flows/` — see
[**docs/FLOW_EXECUTOR.md**](docs/FLOW_EXECUTOR.md).
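
As a rough illustration of what variable resolution means, the sketch below substitutes `${scope.NAME}` placeholders from a dictionary of scopes. It is not the `FlowExecutor` implementation. The dotted `${vars.NAME}` form, the `resolve` helper, and the keys `BASE_URL` and `page` are assumptions made for this example, and the authoritative syntax is in the linked reference.

```python
import re

# Matches ${scope.name}, e.g. ${params.BASE_URL}
PLACEHOLDER = re.compile(r"\$\{(\w+)\.(\w+)\}")


def resolve(template: str, scopes: dict[str, dict[str, object]]) -> str:
    """Replace each ${scope.name} with its value; leave unknown ones untouched."""

    def substitute(match: re.Match) -> str:
        scope, name = match.group(1), match.group(2)
        values = scopes.get(scope, {})
        return str(values[name]) if name in values else match.group(0)

    return PLACEHOLDER.sub(substitute, template)


scopes = {
    "params": {"BASE_URL": "https://example.com"},
    "vars": {"page": 2},
}
print(resolve("${params.BASE_URL}/items?page=${vars.page}", scopes))
print(resolve("${params.MISSING}", scopes))  # unknown names are left as-is
```

Leaving unknown placeholders untouched is a design choice for this sketch. A real executor might instead raise an error so that typos surface early.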

## Playwright-Compatible API

A drop-in, Playwright-compatible API that translates Playwright calls into Owl Browser tools, so you can use your existing Playwright code with Owl Browser's antidetect capabilities.

```python
from owl_browser.playwright import chromium, devices

async def main():
    browser = await chromium.connect("http://localhost:8080", token="your-token")
    context = await browser.new_context(**devices["iPhone 15 Pro"])
    page = await context.new_page()

    await page.goto("https://example.com")
    await page.click("button#submit")
    await page.fill("#search", "query")

    text = await page.text_content("h1")
    await page.screenshot(path="page.png")

    # Locators
    button = page.locator("button.primary")
    await button.click()

    # Playwright-style selectors
    login = page.get_by_role("button", name="Log in")
    email = page.get_by_placeholder("Enter email")
    heading = page.get_by_text("Welcome")

    await context.close()
    await browser.close()
```

**Supported features:** Page navigation, click/fill/type/press, locators (CSS, text, role, test-id, xpath), frames, keyboard & mouse input, screenshots, network interception (`route`/`unroute`), dialogs, downloads, viewport emulation, and 20+ device descriptors (iPhone, Pixel, Galaxy, iPad, Desktop).

## Data Extraction

Universal structured data extraction from any website: CSS selectors, auto-detection, tables, metadata, and multi-page scraping with pagination. It has no AI dependencies and works deterministically with BeautifulSoup.

```python
from owl_browser import OwlBrowser, RemoteConfig
from owl_browser.extraction import Extractor

async def main():
    async with OwlBrowser(RemoteConfig(url="...", token="...")) as browser:
        ctx = await browser.create_context()
        ex = Extractor(browser, ctx["context_id"])
        await ex.goto("https://example.com/products")

        # CSS selector extraction
        products = await ex.select(".product-card", {
            "name": "h3",
            "price": ".price",
            "image": "img@src",
            "link": "a@href",
        })

        # Auto-detect repeating patterns (zero-config)
        patterns = await ex.detect()

        # Multi-page scraping with automatic pagination
        result = await ex.scrape(".product-card", {
            "fields": {"name": "h3", "price": ".price", "sku": "@data-sku"},
            "max_pages": 10,
            "deduplicate_by": "sku",
        })
        print(f"{result['total_items']} items from {result['pages_scraped']} pages")
```
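
The effect of `deduplicate_by` can be pictured as keeping the first item seen for each key value across pages. The function below is a sketch under that assumption, not the SDK's code. `deduplicate` is a hypothetical name, and keeping items that lack the key is a choice made for this example.

```python
def deduplicate(items: list[dict], key: str) -> list[dict]:
    """Keep the first item seen for each value of `key`, preserving order."""
    seen = set()
    unique = []
    for item in items:
        value = item.get(key)
        if value is None:
            unique.append(item)  # items without the key cannot be compared
        elif value not in seen:
            seen.add(value)
            unique.append(item)
    return unique


page_1 = [{"sku": "A1", "name": "Lamp"}, {"sku": "B2", "name": "Desk"}]
page_2 = [{"sku": "B2", "name": "Desk"}, {"sku": "C3", "name": "Chair"}]
print([item["sku"] for item in deduplicate(page_1 + page_2, "sku")])
```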

**Capabilities:**

| Method | Description |
|--------|-------------|
| `select()` / `select_first()` | Extract with CSS selectors and field specs (`"selector"`, `"selector@attr"`, object specs with transforms) |
| `table()` / `grid()` / `definition_list()` | Parse `<table>`, CSS grid/flexbox, and `<dl>` structures |
| `meta()` / `json_ld()` | Extract OpenGraph, Twitter Card, JSON-LD, microdata, feeds |
| `detect()` / `detect_and_extract()` | Auto-discover repeating DOM patterns |
| `lists()` | Extract list/card containers with auto-field inference |
| `scrape()` | Multi-page with pagination detection (click-next, URL patterns, buttons, load-more, infinite scroll) |
| `clean()` | Remove cookie banners, modals, fixed elements, ads |
| `html()` / `markdown()` / `text()` | Raw content with cleaning levels |

All extraction functions are also available as standalone pure functions for use without a browser connection.

## Available Tools

Methods are dynamically generated from the server's OpenAPI schema. Common tools include:

### Context Management
- `create_context()` - Create a new browser context
- `close_context(context_id)` - Close a context

### Navigation
- `navigate(context_id, url)` - Navigate to URL
- `reload(context_id)` - Reload page
- `go_back(context_id)` - Navigate back
- `go_forward(context_id)` - Navigate forward

### Interaction
- `click(context_id, selector)` - Click element
- `type(context_id, selector, text)` - Type text
- `press_key(context_id, key)` - Press keyboard key

### Content Extraction
- `extract_text(context_id, selector)` - Extract text
- `get_html(context_id)` - Get page HTML
- `screenshot(context_id)` - Take screenshot

### AI Features
- `summarize_page(context_id)` - Summarize page content
- `query_page(context_id, query)` - Ask questions about page
- `solve_captcha(context_id)` - Solve CAPTCHA challenges

Use `browser.list_tools()` to see all available tools.
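
The examples in this README suggest that a method such as `navigate()` corresponds to the tool name `browser_navigate`. The toy class below illustrates that kind of name-based dispatch with `__getattr__`. It is only a sketch: the SDK generates its methods from the OpenAPI schema, and `ToolProxy` is a hypothetical class that returns the request it would send instead of contacting a server.

```python
import asyncio


class ToolProxy:
    """Toy client that maps attribute access to tool names."""

    def __init__(self, tool_names):
        self._tools = set(tool_names)

    async def execute(self, tool_name, **params):
        # A real client would send an HTTP request here.
        return {"tool": tool_name, "params": params}

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        if name.startswith("_"):
            raise AttributeError(name)
        tool_name = f"browser_{name}"
        if tool_name not in self._tools:
            raise AttributeError(f"unknown tool: {name}")

        async def method(**params):
            return await self.execute(tool_name, **params)

        return method


proxy = ToolProxy(["browser_navigate", "browser_click"])
result = asyncio.run(proxy.navigate(context_id="ctx-1", url="https://example.com"))
print(result["tool"])  # browser_navigate
```

Requesting a name that is not in the tool list raises `AttributeError`, which mirrors how a missing method behaves on an ordinary object.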

## Error Handling

```python
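import asyncio

from owl_browser import (
    OwlBrowser,
    RemoteConfig,
    OwlBrowserError,
    ConnectionError,   # NOTE: shadows the built-in ConnectionError in this module
    AuthenticationError,
    ToolExecutionError,
    TimeoutError,      # NOTE: shadows the built-in TimeoutError in this module
)

config = RemoteConfig(url="https://your-domain.com", token="your-token")

async def main():
    try:
        async with OwlBrowser(config) as browser:
            await browser.navigate(context_id="invalid", url="https://example.com")
    except AuthenticationError as e:
        print(f"Authentication failed: {e}")
    except ToolExecutionError as e:
        print(f"Tool {e.tool_name} failed: {e.message}")
    except TimeoutError as e:
        print(f"Operation timed out: {e}")
    except ConnectionError as e:
        print(f"Connection failed: {e}")
    except OwlBrowserError as e:
        # Assumed to be the SDK's common base class; catches anything not handled above
        print(f"Owl Browser error: {e}")

asyncio.run(main())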

## Configuration Options

```python
from owl_browser import RemoteConfig, RetryConfig

config = RemoteConfig(
    url="https://your-domain.com",
    token="secret",

    # Timeout settings
    timeout=30.0,  # seconds

    # Concurrency
    max_concurrent=10,

    # Retry configuration
    retry=RetryConfig(
        max_retries=3,
        initial_delay_ms=100,
        max_delay_ms=10000,
        backoff_multiplier=2.0,
        jitter_factor=0.1
    ),

    # API prefix - determines URL structure for API calls
    # Default: "/api" (production via nginx proxy)
    # Set to "" for direct connection to http-server (development)
    api_prefix="/api",

    # SSL verification
    verify_ssl=True
)
```
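
To give a feel for what the retry settings imply, the sketch below computes a delay schedule using a common exponential-backoff formula with symmetric jitter. The SDK's exact formula is not documented here, so treat this as an assumption. `backoff_delays_ms` is a hypothetical helper, and its parameter names simply mirror the `RetryConfig` fields.

```python
import random


def backoff_delays_ms(max_retries=3, initial_delay_ms=100, max_delay_ms=10000,
                      backoff_multiplier=2.0, jitter_factor=0.1, rng=random.random):
    """Return the delay before each retry attempt, in milliseconds."""
    delays = []
    for attempt in range(max_retries):
        # Grow the delay geometrically, capped at max_delay_ms.
        base = min(initial_delay_ms * backoff_multiplier ** attempt, max_delay_ms)
        # Spread the delay by up to +/- jitter_factor of its base value.
        jitter = base * jitter_factor * (2 * rng() - 1)
        delays.append(base + jitter)
    return delays


print(backoff_delays_ms(jitter_factor=0.0))  # [100.0, 200.0, 400.0]
print(backoff_delays_ms())                   # same schedule, each value within 10%
```

Under these assumptions, the default settings would add well under a second of total waiting across three retries. The cap only starts to matter from the eighth retry onward, where 100 ms × 2^7 exceeds 10000 ms.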

## API Reference

### OwlBrowser

- `connect()` / `connect_sync()` - Connect to server
- `close()` / `close_sync()` - Close connection
- `execute(tool_name, **params)` / `execute_sync(...)` - Execute any tool
- `health_check()` - Check server health
- `list_tools()` - List all tool names
- `list_methods()` - List all method names
- `get_tool(name)` - Get tool definition

### FlowExecutor

- `execute(flow)` - Execute a flow
- `abort()` - Abort current execution
- `reset()` - Reset abort flag and clear vars/params
- `set_params(params)` - Override flow `${params.NAME}` defaults
- `set_event_metadata(metadata)` - Attach metadata to every emitted FlowEvent
- `load_flow(path)` - Load flow from JSON file (static)

Full reference: [docs/FLOW_EXECUTOR.md](docs/FLOW_EXECUTOR.md).

### Extractor

- `goto(url, wait_for_idle=True)` - Navigate to URL
- `select(selector, fields)` - Extract from all matches
- `select_first(selector, fields)` - Extract first match
- `count(selector)` - Count matching elements
- `table(selector, options)` - Parse HTML tables
- `grid(container, item)` - Parse CSS grids
- `definition_list(selector)` - Parse `<dl>` lists
- `detect_tables()` - Auto-detect tables
- `meta()` - Extract page metadata
- `json_ld()` - Extract JSON-LD
- `detect(options)` - Detect repeating patterns
- `detect_and_extract(options)` - Detect + extract
- `lists(selector, options)` - Extract lists/cards
- `scrape(selector, options)` - Multi-page scrape
- `abort_scrape()` - Abort running scrape
- `clean(options)` - Remove obstructions
- `html(clean_level)` - Get page HTML
- `markdown()` - Get page markdown
- `text(selector, regex)` - Get filtered text
- `detect_site()` - Detect site type
- `site_data(template)` - Site-specific extraction

## Requirements

- Python 3.12+
- aiohttp >= 3.9.0
- pyjwt[crypto] >= 2.8.0
- cryptography >= 42.0.0
- beautifulsoup4 >= 4.12.0

## License

MIT License - see LICENSE file for details.

## Links

- Website: https://www.owlbrowser.net
- Documentation: https://www.owlbrowser.net/docs
- GitHub: https://github.com/Olib-AI/olib-browser
