Metadata-Version: 2.4
Name: anyweb
Version: 0.5.2
Summary: CLI-first browser automation with platform intelligence
Project-URL: Homepage, https://github.com/mixiaomi/anyweb
Project-URL: Repository, https://github.com/mixiaomi/anyweb
Author: mixiaomi
License-Expression: MIT
License-File: LICENSE
Keywords: automation,browser,cli,playwright,scraping
Classifier: Development Status :: 3 - Alpha
Classifier: Environment :: Console
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: MacOS
Classifier: Operating System :: POSIX :: Linux
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Internet :: WWW/HTTP
Classifier: Topic :: Software Development :: Testing
Requires-Python: >=3.11
Requires-Dist: aiohttp>=3.9.0
Requires-Dist: playwright-stealth>=1.0.6
Requires-Dist: playwright~=1.59.0
Requires-Dist: websockets>=13.0
Provides-Extra: dev
Requires-Dist: pytest-asyncio>=0.23.0; extra == 'dev'
Requires-Dist: pytest>=8.0.0; extra == 'dev'
Requires-Dist: ruff>=0.4.0; extra == 'dev'
Provides-Extra: mcp
Requires-Dist: mcp[cli]>=1.0.0; extra == 'mcp'
Requires-Dist: pydantic>=2.0.0; extra == 'mcp'
Description-Content-Type: text/markdown

# anyweb

CLI-first browser automation for AI agents. One command reads a webpage, one command clicks a button — no boilerplate, no browser lifecycle to manage.

Built on Playwright with anti-detection, a persistent daemon, and platform-specific extractors for X, Zhihu, XHS (Xiaohongshu), and more.

## Why anyweb

AI coding agents (Claude Code, Codex, etc.) need to interact with the web but can't drive a browser natively. anyweb fills that gap:

- **Atomic commands** — `open`, `click`, `input`, `keys`, `read` — each one maps to a single action
- **Persistent daemon** — browser stays alive between commands, ~50ms latency vs ~3s cold start
- **Multi-page parallelism** — `--page` flag runs named tabs concurrently, perfect for batch queries
- **Accessibility tree** — `state --ax` returns a compact, structured page representation with text, URLs, and element indices — ideal for LLM context
- **Content-stable waiting** — `wait content-stable` detects when dynamic content (Grok responses, streaming results) finishes loading
- **Platform intelligence** — `read` auto-detects X/Zhihu/XHS and applies platform-specific extraction

## Install

```bash
pip install anyweb        # or: uv tool install anyweb
anyweb install            # install Chromium
```

## Quick start

```bash
# Read any webpage — auto-detects platform
anyweb read "https://x.com/karpathy"

# JSON output for programmatic use
anyweb --json read "https://x.com/karpathy"

# Search within a platform
anyweb search x "embodied AI news" --limit 10

# System health check
anyweb doctor
```
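
For agents or scripts that call anyweb from Python, the `--json` flag makes the output straightforward to consume. The following is a minimal sketch, assuming `anyweb` is on `PATH` and that `--json` writes a single JSON document to stdout. The helper names `build_read_command` and `read_page` are hypothetical, and the shape of the returned JSON is not specified here, so the sketch returns it unchanged.

```python
import json
import subprocess


def build_read_command(url, platform=None):
    """Build the argv list for `anyweb --json read URL`, optionally forcing a platform."""
    command = ["anyweb", "--json", "read"]
    if platform is not None:
        # `-p` forces a platform, as in `anyweb read -p x URL`.
        command += ["-p", platform]
    command.append(url)
    return command


def read_page(url, platform=None, timeout=120):
    """Run anyweb and parse its stdout as JSON.

    Assumption: `--json` prints exactly one JSON document to stdout.
    Raises subprocess.CalledProcessError if anyweb exits with a non-zero status.
    """
    result = subprocess.run(
        build_read_command(url, platform),
        capture_output=True,
        text=True,
        check=True,
        timeout=timeout,  # seconds; a hypothetical safety limit for this wrapper
    )
    return json.loads(result.stdout)
```

Calling `read_page("https://x.com/karpathy")` would then return whatever structure anyweb emits. Passing the arguments as a list avoids shell quoting problems with URLs that contain `?` or `&`.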

## Commands

### High-level

| Command | What it does |
|---------|-------------|
| `read URL` | Smart extract: auto-detect platform, render JS, return clean content |
| `search PLATFORM QUERY` | Search within X / Zhihu / XHS |
| `login PLATFORM` | Save session cookies (opens browser window) |
| `status` | Daemon, browsers, pages, sessions, recommended actions |
| `doctor [--fix]` | Check dependencies and credentials; `--fix` attempts auto-repair |

### Atomic browser

| Command | What it does |
|---------|-------------|
| `open URL` | Navigate (starts daemon if needed) |
| `state` | List interactive elements with indices |
| `state --ax` | Accessibility tree — compact text + URLs for LLMs |
| `click IDX` | Click element by index or CSS selector |
| `input IDX "text"` | Click element (by index or CSS selector), then type |
| `type "text"` | Type via keyboard events (no click) |
| `keys Enter` | Send keyboard events |
| `hover IDX` | Hover over element |
| `scroll down` | Scroll the page |
| `eval "js"` | Execute JavaScript (supports Promises) |
| `screenshot` | Capture screenshot |
| `get text\|title\|url` | Get page info |
| `wait selector "h1"` | Wait for CSS selector |
| `wait content-stable` | Wait for dynamic content to settle |
| `back` | Navigate back |
| `close` | Close browser; daemon stays alive |
| `shutdown` | Stop everything |

### Diagnostics

| Command | What it does |
|---------|-------------|
| `log [-f] [--errors]` | View, follow, or filter daemon command log |
| `sessions` | List saved sessions with expiry info |
| `cookies` | Cookie management |

## Multi-page parallelism

`--page NAME` runs multiple named tabs concurrently, which is essential for batch operations such as sending four queries to Grok in parallel:

```bash
# Open 4 pages
anyweb open --page g1 "https://x.com/i/grok?new=true"
anyweb open --page g2 "https://x.com/i/grok?new=true" &
anyweb open --page g3 "https://x.com/i/grok?new=true" &
anyweb open --page g4 "https://x.com/i/grok?new=true" &
wait

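# Submit queries in parallel
anyweb input --page g1 textarea "Query 1" && anyweb keys --page g1 Enter &
anyweb input --page g2 textarea "Query 2" && anyweb keys --page g2 Enter &
anyweb input --page g3 textarea "Query 3" && anyweb keys --page g3 Enter &
anyweb input --page g4 textarea "Query 4" && anyweb keys --page g4 Enter &
wait

# Wait for all responses to finish
anyweb wait --page g1 content-stable --duration 5 --timeout 120000 &
anyweb wait --page g2 content-stable --duration 5 --timeout 120000 &
anyweb wait --page g3 content-stable --duration 5 --timeout 120000 &
anyweb wait --page g4 content-stable --duration 5 --timeout 120000 &
wait

# Read results
anyweb state --page g1 --ax > /tmp/result1.txt
anyweb state --page g2 --ax > /tmp/result2.txt
anyweb state --page g3 --ax > /tmp/result3.txt
anyweb state --page g4 --ax > /tmp/result4.txt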
```
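
The idea behind `wait content-stable` can be illustrated with a generic polling loop: sample the page content repeatedly and return once it has stopped changing for a quiet period. This is a sketch of the general technique, not anyweb's actual implementation. The names `get_content`, `stable_for`, `timeout`, and `poll_interval` are hypothetical, and all durations in the sketch are in seconds, whatever units the CLI flags use.

```python
import time


def wait_content_stable(get_content, stable_for=5.0, timeout=120.0,
                        poll_interval=0.5, clock=time.monotonic, sleep=time.sleep):
    """Poll get_content() until its value is unchanged for `stable_for` seconds.

    Returns the stable content, or raises TimeoutError after `timeout` seconds.
    `clock` and `sleep` are injectable so the loop can be tested without waiting.
    """
    start = clock()
    last = get_content()
    last_change = start
    while True:
        now = clock()
        if now - last_change >= stable_for:
            return last  # no change observed for the whole quiet period
        if now - start >= timeout:
            raise TimeoutError("content did not stabilize in time")
        sleep(poll_interval)
        current = get_content()
        if current != last:
            # Content changed: remember it and restart the quiet period.
            last = current
            last_change = clock()
```

A streaming response keeps resetting the quiet period while tokens arrive, so the loop returns only after the output has been idle for `stable_for` seconds.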

`read` manages pages internally — no `--page` needed for batch reads:

```bash
anyweb --json read "https://x.com/AnthropicAI" > /tmp/a.json &
anyweb --json read "https://x.com/OpenAI" > /tmp/b.json &
anyweb --json read "https://x.com/karpathy" > /tmp/c.json &
wait
```
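
The same fan-out can be driven from Python instead of shell job control. This is a minimal sketch, assuming `anyweb` is on `PATH` and that `--json` prints one JSON document to stdout. The names `run_read`, `read_many`, and the `runner` parameter are hypothetical and exist only for illustration.

```python
import json
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_read(url):
    """Run `anyweb --json read URL` and return the parsed JSON (assumed output format)."""
    result = subprocess.run(
        ["anyweb", "--json", "read", url],
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout)


def read_many(urls, runner=run_read, max_workers=4):
    """Read several URLs concurrently.

    Returns a dict mapping each URL to its result, or to the exception
    raised for that URL, so one failed read does not discard the others.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {url: pool.submit(runner, url) for url in urls}
        for url, future in futures.items():
            try:
                results[url] = future.result()
            except Exception as exc:
                results[url] = exc
    return results
```

Threads are sufficient here because each worker spends its time waiting on a subprocess. The `runner` parameter lets the function be exercised with a stub when anyweb is not installed.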

## Accessibility tree

`state --ax` returns the page structure optimized for LLM consumption — text, links, and interactive elements in a compact format:

```
URL: https://x.com/karpathy
Title: Andrej Karpathy (@karpathy) / X
Refs: 42 interactive elements
---
RootWebArea "Andrej Karpathy (@karpathy) / X"
  e0:link "Home" -> https://x.com/home
  e1:link "Search" -> https://x.com/explore
  ...
  StaticText "The hottest new programming language is English"
  e12:link "View tweet" -> https://x.com/karpathy/status/1617979122625712128
```

Options: `--interactive-only` (clickable/typable elements only), `--depth N` (limit tree depth).
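
Because each interactive element appears on its own line with an `eN:` prefix, the output is easy to post-process. The sketch below extracts those references with a regular expression. It assumes the line format matches the sample above exactly and that elements without a target simply omit the `-> URL` part. `Ref` and `parse_refs` are hypothetical names, not part of anyweb.

```python
import re
from typing import NamedTuple, Optional

# Matches lines such as:   e12:link "View tweet" -> https://x.com/...
REF_LINE = re.compile(
    r'^\s*e(?P<index>\d+):(?P<role>\S+)\s+"(?P<name>.*)"'
    r'(?:\s+->\s+(?P<url>\S+))?\s*$'
)


class Ref(NamedTuple):
    index: int
    role: str
    name: str
    url: Optional[str]


def parse_refs(ax_text):
    """Return the interactive elements found in `state --ax` style output."""
    refs = []
    for line in ax_text.splitlines():
        match = REF_LINE.match(line)
        if match:
            refs.append(
                Ref(int(match["index"]), match["role"], match["name"], match["url"])
            )
    return refs
```

The `index` of a parsed `Ref` is the number that commands such as `click` expect, so an agent can pick a link by name and then act on it.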

## Platform intelligence

`read` auto-detects the platform from the URL and applies the right extraction:

| Platform | Detection | Extraction |
|----------|-----------|------------|
| X/Twitter | `x.com`, `twitter.com` | Tweets, threads, profiles, engagement stats, media URLs |
| Zhihu | `zhihu.com` | Auto-expand folded content, answers |
| XHS | `xiaohongshu.com` | Notes, comments, images |
| Generic | everything else | Article body, metadata |

Force a platform: `anyweb read -p x "https://example.com"`
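
The detection column of the table amounts to a hostname lookup. The sketch below illustrates that mapping and is not anyweb's actual detection code. The `generic` label, the choice to match subdomains, and the names `PLATFORM_DOMAINS` and `detect_platform` are assumptions made for the example.

```python
from urllib.parse import urlparse

# Domains taken from the table above; the keys reuse the platform names
# that appear elsewhere in this README (`x`, `zhihu`, `xhs`).
PLATFORM_DOMAINS = {
    "x": ("x.com", "twitter.com"),
    "zhihu": ("zhihu.com",),
    "xhs": ("xiaohongshu.com",),
}


def detect_platform(url):
    """Map a URL to a platform name, falling back to "generic"."""
    host = (urlparse(url).hostname or "").lower()
    for platform, domains in PLATFORM_DOMAINS.items():
        for domain in domains:
            # Match the bare domain or any subdomain, but not lookalikes
            # such as "notx.com".
            if host == domain or host.endswith("." + domain):
                return platform
    return "generic"
```

Comparing on the parsed hostname rather than searching the whole URL string avoids false positives from paths or query strings that happen to contain a platform domain.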

## Session management

Log in once and reuse the session until its cookies expire. Cookies persist across daemon restarts:

```bash
anyweb login x                    # opens browser, you log in, cookies saved
anyweb login x --method import    # import from system Chrome (no manual login)
anyweb login x --method cookie    # paste cookies directly
```

`doctor` and `status` show session health:

```
=== Sessions ===
  x         : saved 2026-05-18 | cookies: 465 | expires 2027-05-25
  zhihu     : saved 2026-05-08 | cookies: 36  | expires 2026-11-03
  xhs       : no auth cookie ❌

=== Recommended Actions ===
  anyweb login xhs
```
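
A script can use this report to decide which platforms need a fresh login. The sketch below parses text shaped like the sample above and lists platforms with no auth cookie or with an expiry date that has passed. It assumes the layout shown here is stable, which may not hold across versions. `platforms_needing_login` is a hypothetical helper.

```python
import re
from datetime import date

SESSION_LINE = re.compile(r"^\s*(?P<platform>\w+)\s*:\s*(?P<detail>.+)$")
EXPIRES = re.compile(r"expires (\d{4}-\d{2}-\d{2})")


def platforms_needing_login(status_text, today):
    """Return platforms in the Sessions block that lack a valid, unexpired session."""
    needing = []
    in_sessions = False
    for line in status_text.splitlines():
        if line.startswith("==="):
            # Section headers look like "=== Sessions ===".
            in_sessions = line.strip() == "=== Sessions ==="
            continue
        if not in_sessions:
            continue
        match = SESSION_LINE.match(line)
        if not match:
            continue
        expires = EXPIRES.search(match["detail"])
        # No expiry date means no usable session; a past date means it expired.
        if expires is None or date.fromisoformat(expires.group(1)) <= today:
            needing.append(match["platform"])
    return needing
```

Each returned name can be passed to `anyweb login`, which mirrors what the Recommended Actions block suggests.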

## Output formats

```bash
anyweb read URL              # plain text (default, pipe-friendly)
anyweb --json read URL       # structured JSON with metadata
anyweb --markdown read URL   # markdown with frontmatter
```

## Architecture

```
CLI ──▶ Unix socket ──▶ Daemon (persistent) ──▶ Playwright browser
                         │
                         ├── Headless engine (default, fast)
                         └── CDP engine (system Chrome, for login/headed)
```

- **Daemon** auto-starts on first command, stays alive across invocations
- **Dual engine**: headless Playwright for speed, CDP to system Chrome for `--headed` / login
- **Anti-detection**: playwright-stealth applied automatically
- **Session isolation**: per-platform cookie storage in `~/.anyweb/sessions/`
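
The latency benefit of the daemon comes from keeping state in one long-lived process while each CLI invocation is a short-lived socket client. The sketch below shows that pattern with Python's standard library. It is a generic illustration, not anyweb's code: the JSON-lines message format, the `seq` counter standing in for browser state, and every function name are assumptions.

```python
import json
import os
import socket
import socketserver
import tempfile
import threading


class Handler(socketserver.StreamRequestHandler):
    def handle(self):
        # One request per connection: a single JSON object on one line.
        request = json.loads(self.rfile.readline())
        self.server.commands_seen += 1  # state that survives between clients
        reply = {"ok": True, "command": request["command"],
                 "seq": self.server.commands_seen}
        self.wfile.write((json.dumps(reply) + "\n").encode())


def start_daemon(socket_path):
    """Bind a Unix socket and serve requests on a background thread."""
    server = socketserver.UnixStreamServer(socket_path, Handler)
    server.commands_seen = 0
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def send_command(socket_path, command, *args):
    """Act as a short-lived CLI process: connect, send one command, read the reply."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
        client.connect(socket_path)
        message = json.dumps({"command": command, "args": args}) + "\n"
        client.sendall(message.encode())
        with client.makefile("r") as reader:
            return json.loads(reader.readline())


def demo():
    """Start a daemon on a temporary socket and send two commands to it."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.sock")
        server = start_daemon(path)
        try:
            return [send_command(path, "open", "https://example.com"),
                    send_command(path, "state")]
        finally:
            server.shutdown()
            server.server_close()
```

In `demo()`, the second reply carries a higher `seq` than the first, which shows that the server process kept its state between two independent client connections. A real daemon would hold a browser instance in that place.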

## Requirements

- Python 3.11+
- macOS or Linux

## License

MIT
