Metadata-Version: 2.4
Name: aibrowser
Version: 1.1.1
Summary: AI-powered browser automation
License: MIT License
        
        Copyright (c) 2026 YuBing
        
        Permission is hereby granted, free of charge, to any person obtaining a copy
        of this software and associated documentation files (the "Software"), to deal
        in the Software without restriction, including without limitation the rights
        to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
        copies of the Software, and to permit persons to whom the Software is
        furnished to do so, subject to the following conditions:
        
        The above copyright notice and this permission notice shall be included in all
        copies or substantial portions of the Software.
        
        THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
        IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
        FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
        AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
        LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
        OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
        SOFTWARE.
        
Requires-Python: >=3.11
Description-Content-Type: text/markdown
License-File: LICENSE
License-File: NOTICE
Requires-Dist: browser-use==0.12.6
Requires-Dist: aiohttp<4.0,>=3.13
Requires-Dist: pydantic<3.0,>=2.12
Requires-Dist: requests<3.0,>=2.32
Requires-Dist: chromadb<2.0,>=1.5
Requires-Dist: python-dotenv<2.0,>=1.2
Requires-Dist: pydoll-python<3.0,>=2.22
Requires-Dist: socksio<2.0,>=1.0
Requires-Dist: httpx<1.0,>=0.28
Requires-Dist: beautifulsoup4<5.0,>=4.14
Requires-Dist: rich<15.0,>=13.0
Requires-Dist: html2text<2026.0,>=2024.2
Provides-Extra: dev
Requires-Dist: pytest<10.0,>=9.0; extra == "dev"
Requires-Dist: pytest-asyncio<2.0,>=1.3; extra == "dev"
Requires-Dist: pytest-cov<8.0,>=7.0; extra == "dev"
Requires-Dist: ruff>=0.9.0; extra == "dev"
Requires-Dist: mypy>=1.14.0; extra == "dev"
Requires-Dist: pre-commit>=4.0.0; extra == "dev"
Dynamic: license-file

# AI Browser

> Let Claude Code / Cursor / Windsurf AI Agents **operate the browser autonomously** — stable, cheap, and automation-ready.
> Not a browser bot. This is the **bridge layer** between AI Agents and the browser.

---

## Demo

### Natural Language Driven

```
> Search GitHub for the most starred Python repos
⠋ Navigating to GitHub...
✓ Navigate to https://github.com
✓ Fill search: Python
✓ Click search button
✓ Extract top 10 repo names and star counts

Task complete:
  1. public-apis/public-apis        ⭐ 382k
  2. donnemartin/system-design-primer ⭐ 310k
  3. AUTOMATIC1111/stable-diffusion  ⭐ 280k
  ...
```

### Benchmark Results

4 tasks × 3 runs each, measured on real pages with Chrome.
Full data in `benchmark_results.json`.

**Accuracy**: 100% (12/12 runs successful)

**Token efficiency**:

| Task | DOM tokens sent to LLM |
|------|----------------------|
| GitHub | 656 |
| DuckDuckGo | 720 |
| Wikipedia | 3,408 |
| Hacker News | 4,998 |

**Cost per 100 tasks** (3 LLM calls each):

| Model | Cost |
|-------|------|
| deepseek-chat | ¥1.24 |
| gpt-4o | $3.11 |
| claude-sonnet-4-6 | $4.00 |
| **Playbook replay** | **¥0** (zero LLM calls) |
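
The arithmetic behind a per-100-task figure can be sketched as below. This is an illustration only: it counts DOM input tokens from the table above and nothing else (no system prompt, no output tokens), and `price_per_million` is a placeholder rather than any provider's real rate, so it will not reproduce the costs listed.

```python
# DOM token counts taken from the token-efficiency table above.
DOM_TOKENS = {"GitHub": 656, "DuckDuckGo": 720, "Wikipedia": 3408, "Hacker News": 4998}


def cost_per_100_tasks(
    avg_input_tokens: float,
    price_per_million: float,
    calls_per_task: int = 3,
    tasks: int = 100,
) -> float:
    """Estimate input-token cost: tokens per call x calls per task x tasks x unit price."""
    total_tokens = avg_input_tokens * calls_per_task * tasks
    return total_tokens * price_per_million / 1_000_000


avg_tokens = sum(DOM_TOKENS.values()) / len(DOM_TOKENS)
# 1.0 per million tokens is a placeholder price, not a quoted rate.
estimate = cost_per_100_tasks(avg_tokens, price_per_million=1.0)
print(f"average DOM tokens per call: {avg_tokens}")
print(f"estimated cost per 100 tasks at the placeholder price: {estimate:.5f}")
```

Substituting a provider's actual input price, and adding prompt and output tokens, would bring the estimate closer to a real bill.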

### vs Browser Use

| | Browser Use | AI Browser |
|---|---|---|
| Sent to LLM | Raw accessibility tree (~5,000–77,000 tokens) | DomRouteMap (~200–5,000 tokens) |
| Token reduction | — | **17x lower** |
| Repeated tasks | LLM re-interprets every time | Playbook replay, zero LLM cost |
| Search tasks | Opens browser → may trigger CAPTCHA | HTTP search (DuckDuckGo/Bing), no browser |

---

## Quick Start

### 1. Install

```bash
pip install aibrowser
```

### 2. Start

```bash
./scripts/start.sh --install
```

### 3. Connect to Claude Code / Cursor

Add to your MCP config:

```json
{
  "mcpServers": {
    "aibrowser": {
      "command": "bash",
      "args": ["-c", "exec python3 python_bridge/mcp_server.py"]
    }
  }
}
```

### 4. Use

In Claude Code, just say:

```
> What's the top story on Hacker News today?
> Compare iPhone 16 prices on Amazon vs eBay
> Log into my company dashboard and pull this week's report
```

Or control manually:

```python
goto("https://example.com")
snapshot()       # structured page view
click("@3")      # stable element index
fill("@7", "hello")
screenshot()
```

---

## Core Features

### DomRouteMap — Stable Selectors, Every Time

Instead of fragile CSS selectors or XPaths, pages are compressed into semantic route keys such as `button[Login]` or `textbox[Username]`. This keeps LLM interaction stable and cuts the tokens sent to the model by roughly 17x to 20x.
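
To make the idea of semantic route keys concrete, here is a minimal sketch of how a list of interactive elements could be turned into keys of the form `role[name]`. The `Element` class, the `build_route_map` function, and the `#n` suffix for duplicates are all assumptions made for illustration; they are not AI Browser's actual data model or API.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    """A simplified interactive element (hypothetical shape)."""

    role: str  # e.g. "button", "textbox"
    name: str  # accessible name or visible label


def build_route_map(elements: list[Element]) -> dict[str, Element]:
    """Map keys such as 'button[Login]' to elements.

    Repeated role/name pairs get a '#n' suffix so every key stays unique.
    """
    seen: Counter = Counter()
    route_map: dict[str, Element] = {}
    for el in elements:
        base = f"{el.role}[{el.name.strip()}]"
        seen[base] += 1
        key = base if seen[base] == 1 else f"{base}#{seen[base]}"
        route_map[key] = el
    return route_map


page = [
    Element("textbox", "Username"),
    Element("textbox", "Password"),
    Element("button", "Login"),
    Element("button", "Login"),  # same label twice, e.g. header and form
]
routes = build_route_map(page)
print(list(routes))
```

Because the keys derive from role and label rather than DOM position, they tend to survive layout changes that would break a positional selector.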

### Playbook — Learn Once, Replay Forever

Every successful action chain is auto-saved as a Playbook. Replay on the same site next time — **zero LLM calls, zero cost**.

```bash
# First run: LLM explores the path
aib agent "Log into Wikipedia and search for AI"
# → 5 steps, 5 LLM calls

# Second run: plays from Playbook
aib agent "Log into Wikipedia and search for AI"
# → 0 LLM calls, instant
```

### MoE — Mixture of Experts

Complex requests are decomposed into sub-tasks assigned to specialized experts:

- **Information Expert**: Research without opening a browser (HTTP search + crawl + extract)
- **Browser Expert**: Stable DOM interaction
- **Site Planner**: Plan workflows using saved site profiles
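
The routing described above can be sketched as a dispatcher that splits a request into sub-tasks and hands each to an expert. This is a toy under stated assumptions: the keyword rules in `decompose` stand in for whatever planner the project actually uses, and `SubTask`, `EXPERTS`, and the three expert functions are hypothetical names that only return descriptive strings.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class SubTask:
    kind: str  # "research", "interact", or "plan" (hypothetical labels)
    description: str


def information_expert(task: SubTask) -> str:
    return f"[information] HTTP search, no browser: {task.description}"


def browser_expert(task: SubTask) -> str:
    return f"[browser] DOM interaction: {task.description}"


def site_planner(task: SubTask) -> str:
    return f"[planner] workflow from site profile: {task.description}"


EXPERTS: dict[str, Callable[[SubTask], str]] = {
    "research": information_expert,
    "interact": browser_expert,
    "plan": site_planner,
}


def decompose(request: str) -> list[SubTask]:
    """Toy decomposition: split on ' then ' and classify by leading verb."""
    subtasks = []
    for part in (p.strip() for p in request.split(" then ")):
        if not part:
            continue
        lowered = part.lower()
        if lowered.startswith(("find", "search", "compare")):
            kind = "research"
        elif lowered.startswith(("log in", "click", "fill", "open")):
            kind = "interact"
        else:
            kind = "plan"
        subtasks.append(SubTask(kind, part))
    return subtasks


def run(request: str) -> list[str]:
    """Send each sub-task to the expert registered for its kind."""
    return [EXPERTS[task.kind](task) for task in decompose(request)]


for line in run("Search for the dashboard URL then log in and open the weekly report"):
    print(line)
```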

### Human-in-the-Loop Assistance

AI Browser provides behavioral simulation and interaction prompts for common human-verification scenarios, so that users can complete legitimate web operations. Users are responsible for ensuring their actions comply with target websites' terms of service.

### Responsible Use

AI Browser is a general-purpose automation tool designed for legitimate use cases such as:
- **Testing your own websites and applications** (multi-tab concurrency simulates multi-user scenarios)
- **Automating your own workflows** (form filling, data entry, report generation)
- **Accessibility and usability research**
- **Compliance monitoring of publicly available information**

You are solely responsible for ensuring your use complies with applicable laws and the terms of service of any websites you interact with. This tool must not be used to bypass security measures, conduct unauthorized access, or scrape data in violation of applicable laws.

### Privacy Protection Mode

This mode applies a browser compatibility configuration that reduces environment differences in automated testing scenarios and supports stable long-running sessions.

---

## Commands

```
Navigation:     goto, back, forward, reload
Reading:        text, html, snapshot, url, title, links, cookies, storage
Interaction:    click, fill, type, press, scroll, hover, select, upload
Inspection:     screenshot, inspect, get_attrs, is_element
Tabs:           tabs, tab, newtab, closetab, switch_tab
Sessions:       session-list, session-create, session-destroy, save_session, load_session
Agent:          agent, agent-stop, agent-status, agent-team, interrupt, history
Playbook:       playbook-list, playbook-save, playbook-replay, playbook-delete
Advanced:       js, chain, frame, state, cdp, download, privacy, handoff, shell
```

---

## Features

- 90+ browser commands
- DomRouteMap (stable selectors)
- Playbook (saved workflows)
- Multi-tab concurrency
- MoE task decomposition
- Knowledge base (semantic search)
- Task scheduler (Cron)
- Multi-tenant RBAC
- Audit log
- Unlimited agent runs
- HTTP search (no browser needed)

Activate:

```bash
aib start
```

---

## Configuration

| Variable | Description |
|---|---|
| `AIBRIDGE_PORT` | Server port (default: 18936) |
| `AIBRIDGE_AUTH_TOKEN` | API auth token |
| `LLM_PROVIDER` | LLM provider (`openai`, `ollama`) |
| `LLM_MODEL` | LLM model name |
| `LLM_API_KEY` | LLM API key |

> **Security tip**: Store sensitive credentials like `LLM_API_KEY` in your system keychain (macOS Keychain / Windows Credential Manager / Linux Secret Service) rather than plaintext config files or env scripts.

---

## Tech Stack

| Component | Technology |
|---|---|
| Language | Python 3.11+ |
| Browser engine | browser-use 0.12.6 (pinned) |
| Browser | System Chrome/Chromium |
| HTTP server | aiohttp |
| MCP SDK | FastMCP (stdio/SSE) |
| LLM providers | OpenAI (e.g. GPT-4o), Ollama, DeepSeek |
| Vector store | ChromaDB |
| CDP | pydoll-python, native CDP |

---

## License

AI Browser is open-source under the [MIT License](LICENSE).

This product includes third-party open-source components. See [NOTICE](NOTICE) for details.

---

<details>
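<summary>🇨🇳 Brief introduction</summary>

AI Browser lets the AI Agents in Claude Code / Cursor / Windsurf operate the browser autonomously: stable, cheap, and automation-ready.

- **DomRouteMap**: stable selectors, tokens reduced 20x
- **Playbook**: learn once, then replay with zero LLM cost
- **Multi-tab concurrency**: operate several tabs in the same browser at once
- **Free to use**: core features are free of charge, upgrade as needed

Install: `pip install aibrowser`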

</details>
