Metadata-Version: 2.4
Name: spagents
Version: 0.1.4
Summary: SPA-aware browsing library for AI agents
Project-URL: Homepage, https://github.com/moseswynn/spagents
Project-URL: Repository, https://github.com/moseswynn/spagents
Project-URL: Issues, https://github.com/moseswynn/spagents/issues
Project-URL: License, https://github.com/moseswynn/spagents/blob/main/LICENSE.md
Author-email: Moses Wynn <accounts@moseswynn.com>
License: # spagents License
        
        ## MIT License (Amended)
        
        Copyright (c) 2026 Moses Wynn
        
        Permission is hereby granted, free of charge, to any person obtaining a copy
        of this software and associated documentation files (the "Software"), to deal
        in the Software without restriction, including without limitation the rights
        to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
        copies of the Software, and to permit persons to whom the Software is
        furnished to do so, subject to the following conditions:
        
        The above copyright notice and this permission notice shall be included in all
        copies or substantial portions of the Software.
        
        ### Additional Condition: Community Contribution Requirement
        
        Organizations with **more than 100 employees** that use this Software in
        any capacity (internal tools, products, services, or development workflows)
        must make a one-time donation of **$1.00 USD per employee** to one or more
        of the following organizations:
        
        - **ACLU (American Civil Liberties Union)** — [donate.aclu.org](https://donate.aclu.org/)
        - **SPLC (Southern Poverty Law Center)** — [donate.splcenter.org](https://donate.splcenter.org/sslpage.aspx?pid=463)
        - **DSA (Democratic Socialists of America)** — [dsausa.org/donate](https://www.dsausa.org/donate/)
        - **Appalachian OUTReach** — [appalachianoutreach.org/donate](https://www.appalachianoutreach.org/donate)
        
        The donation may be split across multiple organizations.
        
        Upon making the donation, the organization must submit a **pull request** to
        this repository adding their name to the [Approved Organizations](#approved-organizations)
        section below, including proof of donation (receipt, confirmation number, or
        screenshot with sensitive information redacted).
        
        Organizations that believe they should be **exempt** from this requirement
        (e.g., nonprofits, educational institutions, organizations with fewer
        resources than their headcount suggests) may submit a pull request adding
        their name to the Approved Organizations section with a written explanation
        of why they believe an exemption is warranted. Exemptions are granted at the
        sole discretion of the copyright holder.
        
        **Use of this Software by qualifying organizations without compliance
        constitutes a violation of this license.**
        
        ### Approved Organizations
        
        <!-- Add your organization below via pull request -->
        <!-- Format: | Organization Name | Date | Donation or Exemption | -->
        
        | Organization | Date | Status |
        |---|---|---|
        | | | |
        
        ---
        
        THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
        IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
        FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
        AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
        LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
        OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
        SOFTWARE.
License-File: LICENSE.md
Requires-Python: >=3.10
Requires-Dist: beautifulsoup4>=4.12
Requires-Dist: fastmcp>=2.0
Requires-Dist: lxml>=5.0
Requires-Dist: playwright>=1.40
Requires-Dist: pydantic>=2.0
Requires-Dist: typer>=0.9
Provides-Extra: dev
Requires-Dist: mypy>=1.10; extra == 'dev'
Requires-Dist: pytest-asyncio>=0.23; extra == 'dev'
Requires-Dist: pytest>=8.0; extra == 'dev'
Requires-Dist: ruff>=0.4; extra == 'dev'
Description-Content-Type: text/markdown

<p align="center">
  <h1 align="center">spagents</h1>
  <p align="center">
    <strong>Give your AI agents eyes for the modern web.</strong>
  </p>
  <p align="center">
    <a href="#installation">Installation</a> &middot;
    <a href="#quick-start">Quick Start</a> &middot;
    <a href="#mcp-server">MCP Server</a> &middot;
    <a href="#python-sdk">Python SDK</a> &middot;
    <a href="#how-it-works">How It Works</a>
  </p>
</p>

---

**AI agents are blind to Single Page Applications.** Tools like `fetch` and `requests` return empty HTML shells — no articles, no content, no interactive elements. SPAs render everything via JavaScript *after* the page loads.

**spagents** fixes this. It launches a real browser, intelligently waits for SPAs to finish rendering, and returns structured, agent-friendly data — articles, actions, inputs, navigation — ready for your agent to use.

### How Claude handles dynamic content before spagents

![Claude reports an error because the dynamic content can't be loaded.](img/before-spagents.png)

### How Claude handles dynamic content after spagents

![Claude successfully retrieves the dynamic content.](img/after-spagents.png)

### Key features

- **Smart content detection** — Knows when a SPA is done rendering (not just `sleep(5)`)
- **Structured extraction** — Returns articles, links, metadata as typed Pydantic models
- **Full interaction** — Click, type, scroll, press keys, navigate — like a real user
- **Action discovery** — Finds every interactive element: buttons, inputs, ARIA roles, custom components
- **Session persistence** — Cookies, localStorage, and auth state preserved across navigations
- **Three interfaces** — Python SDK, CLI, and MCP server for Claude Desktop / AI agents

---

## Installation

```bash
pip install spagents
playwright install chromium
```

### From source

```bash
git clone https://github.com/moseswynn/spagents.git
cd spagents
uv sync
uv run playwright install chromium
```

## Quick start

### CLI

```bash
# Structured JSON output (default)
spagents browse "https://news.kagi.com"

# Human-readable text
spagents browse "https://news.kagi.com" --format text

# Interactive REPL session
spagents interactive "https://news.kagi.com"
```

### Python SDK

```python
import asyncio
from spagents import BrowserManager

async def main():
    async with BrowserManager() as browser:
        session = await browser.new_session()

        # Browse a SPA — content is fully rendered
        state = await session.navigate("https://news.kagi.com")
        for article in state.content.articles:
            print(f"{article.category}: {article.headline}")

        # Interact with the page
        for action in state.actions:
            if "Technology" in action.description:
                state = await session.click(action.selector)
                break

        # Type into inputs, press keys
        state = await session.type_text("#search", "climate change")
        state = await session.press_key("Enter")

        await session.close()

asyncio.run(main())
```

### MCP Server

Connect spagents to Claude Desktop or any MCP-compatible AI agent:

```bash
spagents mcp
```

Add to your Claude Desktop config (`claude_desktop_config.json`):

```json
{
  "mcpServers": {
    "spagents": {
      "command": "spagents",
      "args": ["mcp"]
    }
  }
}
```

#### Claude Code

Add the MCP server to your project or user settings:

```bash
claude mcp add spagents -- spagents mcp
```

Or add it directly to your `.claude/settings.json`:

```json
{
  "mcpServers": {
    "spagents": {
      "command": "spagents",
      "args": ["mcp"]
    }
  }
}
```

Now Claude can browse any SPA:

> **You:** "What's the top story on Kagi News today?"
>
> **Claude:** *uses `browse` tool* → *reads structured articles* → gives you the answer

#### MCP tools

| Tool | Description |
|---|---|
| `browse` | Navigate to a URL, return rendered content + interactive actions |
| `click` | Click an element by CSS selector |
| `type_text` | Type into an input field |
| `press_key` | Press a keyboard key (Enter, Tab, Escape, etc.) |
| `list_actions` | Discover all interactive elements on the page |
| `navigate` | Go to a new URL within an existing session |
| `scroll` | Scroll up or down, trigger infinite scroll |
| `extract_content` | Re-extract content from the current page |
| `close_session` | Close a browser session and free resources |

## Interactive REPL commands

```
click <n>                  Click action by number
click <selector>           Click by CSS selector
type <n> <text>            Type text into an input by action number
type "<selector>" <text>   Type text into an input by CSS selector
press <key>                Press a key (Enter, Escape, Tab, ArrowDown, etc.)
select <n> <value>         Select dropdown option by action number
scroll [down|up]           Scroll the page
actions                    List all interactive elements
extract                    Re-extract page content
navigate <url>             Navigate to a new URL
json                       Dump current state as JSON
quit                       Exit the session
```

## How it works

spagents wraps [Playwright](https://playwright.dev/python/) and adds three intelligent layers:

### 1. Content ready detection

A multi-signal detector that knows when a SPA has *actually* finished rendering:

| Signal | What it checks |
|---|---|
| **Network quiescence** | No pending XHR/fetch requests for 500ms (ignoring analytics noise) |
| **DOM stabilization** | MutationObserver sees no changes for 300ms after initial render |
| **Content heuristic** | Meaningful text exists, no loading spinners, real links present |

This replaces naive approaches like `sleep(5)` or Playwright's `networkidle` (which breaks on long-polling and WebSocket connections).

### 2. Content extraction

Extracts structured data from the rendered DOM:

- **Articles** with headlines, summaries, sources, highlights, quotes, and sections
- **Links** with surrounding context (which heading or section they're under)
- **Metadata** from OG tags, meta descriptions, and page title
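The metadata layer can be approximated with the standard library alone. The real extractor works on the rendered DOM via BeautifulSoup/lxml; this stdlib parser and its field names are purely illustrative.

```python
from html.parser import HTMLParser

# Illustrative sketch of metadata extraction (OG tags, meta description,
# page title). The class name and dict layout are assumptions, not the
# spagents API.
class MetadataExtractor(HTMLParser):
    def __init__(self):
        super().__init__()
        self.meta: dict[str, str] = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "meta":
            # og:* tags use `property`; plain meta tags use `name`
            key = a.get("property") or a.get("name")
            if key and "content" in a:
                self.meta[key] = a["content"]
        elif tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.meta["title"] = self.meta.get("title", "") + data.strip()
```

Feeding it a page head yields a flat dict of `title`, `description`, and `og:*` values, which is the shape of signal the metadata extractor works from.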

### 3. Action discovery

Finds *every* interactive element on the page through four phases:

1. **Semantic HTML** — `<a>`, `<button>`, `<input>`, `<select>`
2. **ARIA roles** — `role="button"`, `role="tab"`, `role="listitem"`, etc.
3. **Custom components** — `tabindex`, `onclick`, `cursor: pointer`
4. **Disambiguation** — Duplicate labels get context from parent containers
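The four phases above can be sketched as an ordered selector list plus a label-disambiguation step. The selectors and helper below are assumptions for illustration; the selectors spagents actually runs are internal.

```python
# Illustrative phase ordering for action discovery. The selectors are
# assumptions, not the library's actual queries; the cursor:pointer check
# would require computed styles, so it is noted only as a comment.
DISCOVERY_PHASES = [
    ("semantic", "a[href], button, input, select, textarea"),
    ("aria", "[role='button'], [role='tab'], [role='listitem']"),
    ("custom", "[tabindex], [onclick]"),  # plus computed cursor: pointer
]

def dedupe_label(label: str, container: str, seen: set[str]) -> str:
    """Phase 4 sketch: qualify a duplicate label with its parent container."""
    if label in seen:
        return f"{label} ({container})"
    seen.add(label)
    return label
```

Running the phases in order means a custom `onclick` div is only reported once even if it also carries an ARIA role, and duplicate labels like repeated "Read more" links come back qualified by their section.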

## CLI reference

```
spagents browse <url> [OPTIONS]
  --format, -f    Output format: json (default) or text
  --timeout, -t   Content detection timeout in ms (default: 15000)
  --no-headless   Run browser with visible window

spagents interactive <url> [OPTIONS]
  --timeout, -t   Content detection timeout in ms (default: 15000)
  --no-headless   Run browser with visible window

spagents mcp [OPTIONS]
  --transport     Transport: stdio (default) or sse
  --port, -p      Port for SSE transport (default: 8000)
```

## License

MIT with an amended community contribution requirement for organizations with
more than 100 employees. See [LICENSE.md](LICENSE.md) for details.
