Metadata-Version: 2.4
Name: finedata-mcp
Version: 0.1.1
Summary: MCP Server for FineData web scraping API - enables AI agents to scrape any website
Project-URL: Homepage, https://finedata.ai
Project-URL: Documentation, https://docs.finedata.ai
Project-URL: Repository, https://github.com/quality-network/finedata-mcp
Project-URL: Issues, https://github.com/quality-network/finedata-mcp/issues
Project-URL: Changelog, https://github.com/quality-network/finedata-mcp/releases
Author-email: FineData <support@finedata.ai>
Maintainer-email: FineData <support@finedata.ai>
License-Expression: MIT
License-File: LICENSE
Keywords: ai-agents,antibot,captcha-solving,claude,cursor,finedata,mcp,model-context-protocol,playwright,web-scraping
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Internet :: WWW/HTTP
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.10
Requires-Dist: httpx>=0.26.0
Requires-Dist: mcp>=1.0.0
Description-Content-Type: text/markdown

# FineData MCP Server

MCP (Model Context Protocol) server for the [FineData](https://finedata.ai) web scraping API.

Enables AI agents like Claude, Cursor, and GPT to scrape any website with:

- **Antibot Bypass** - Cloudflare, DataDome, PerimeterX, and more
- **JavaScript Rendering** - Full browser rendering with Playwright
- **Captcha Solving** - reCAPTCHA, hCaptcha, Cloudflare Turnstile, Yandex
- **Proxy Rotation** - 87K+ datacenter, residential, and mobile proxies
- **Smart Retry** - Automatic retries with block detection

## Installation

### Using uvx (Recommended)

```bash
# Install uv if you haven't already
curl -LsSf https://astral.sh/uv/install.sh | sh

# Run directly with uvx
FINEDATA_API_KEY=fd_xxx uvx finedata-mcp
```

### Using pip

```bash
pip install finedata-mcp

# Run
FINEDATA_API_KEY=fd_xxx finedata-mcp
```

### Using npx

```bash
FINEDATA_API_KEY=fd_xxx npx -y @finedata/mcp-server
```

## Configuration

### Claude Desktop

Add to your `claude_desktop_config.json`:

- **macOS**: `~/Library/Application Support/Claude/claude_desktop_config.json`
- **Windows**: `%APPDATA%\Claude\claude_desktop_config.json`

```json
{
  "mcpServers": {
    "finedata": {
      "command": "uvx",
      "args": ["finedata-mcp"],
      "env": {
        "FINEDATA_API_KEY": "fd_your_api_key_here"
      }
    }
  }
}
```
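
If `claude_desktop_config.json` already lists other MCP servers, the `finedata` entry should be merged in rather than pasted over the file. The following is a minimal sketch of that merge; the helper names `add_finedata_server` and `update_config_file` are hypothetical and not part of this package.

```python
import json
from pathlib import Path


def add_finedata_server(config: dict, api_key: str) -> dict:
    """Return a copy of `config` with a 'finedata' entry under 'mcpServers'.

    Existing servers are preserved; an existing 'finedata' entry is replaced.
    """
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers["finedata"] = {
        "command": "uvx",
        "args": ["finedata-mcp"],
        "env": {"FINEDATA_API_KEY": api_key},
    }
    merged["mcpServers"] = servers
    return merged


def update_config_file(path: Path, api_key: str) -> None:
    """Read the JSON config at `path` (if it exists), merge, and write it back."""
    existing = json.loads(path.read_text()) if path.exists() else {}
    path.write_text(json.dumps(add_finedata_server(existing, api_key), indent=2))
```

Call `update_config_file` with the config path for your platform, then restart Claude Desktop so it picks up the change.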

### Cursor IDE

Add to your MCP settings in Cursor:

```json
{
  "mcpServers": {
    "finedata": {
      "command": "uvx",
      "args": ["finedata-mcp"],
      "env": {
        "FINEDATA_API_KEY": "fd_your_api_key_here"
      }
    }
  }
}
```

### Alternative: Using npx

```json
{
  "mcpServers": {
    "finedata": {
      "command": "npx",
      "args": ["-y", "@finedata/mcp-server"],
      "env": {
        "FINEDATA_API_KEY": "fd_your_api_key_here"
      }
    }
  }
}
```

## Environment Variables

| Variable | Required | Description |
|----------|----------|-------------|
| `FINEDATA_API_KEY` | Yes | Your FineData API key |
| `FINEDATA_API_URL` | No | API URL (default: `https://api.finedata.ai`) |
| `FINEDATA_TIMEOUT` | No | Default timeout in seconds (default: 60) |
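
To illustrate how the table above translates into configuration, here is a small sketch that reads the three variables with the documented defaults. It is not the server's actual implementation; the `Settings` class and `load_settings` function are hypothetical names.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    api_key: str
    api_url: str
    timeout: float


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from environment variables, applying the documented defaults."""
    env = os.environ if env is None else env
    api_key = env.get("FINEDATA_API_KEY", "")
    if not api_key:
        # FINEDATA_API_KEY is the only required variable.
        raise ValueError("FINEDATA_API_KEY is required")
    return Settings(
        api_key=api_key,
        api_url=env.get("FINEDATA_API_URL", "https://api.finedata.ai"),
        timeout=float(env.get("FINEDATA_TIMEOUT", "60")),
    )
```

Passing a plain dict instead of relying on `os.environ` makes the function easy to exercise in tests.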

## Available Tools

### scrape_url

Scrape content from any web page with antibot bypass.

```
scrape_url(
  url: "https://example.com",
  use_js_render: false,      # Enable Playwright for SPAs
  use_residential: false,    # Use residential proxy
  use_undetected: false,     # Use Undetected Chrome
  solve_captcha: false,      # Auto-solve captchas
  timeout: 60                # Timeout in seconds
)
```

**Token costs:**

- Base request: 1 token
- Antibot bypass: +2 tokens
- JS rendering: +5 tokens
- Undetected Chrome: +5 tokens
- Residential proxy: +3 tokens
- Captcha solving: +10 to +15 tokens, depending on the captcha type (see [Pricing](#pricing))

### scrape_async

Submit an async scraping job for long-running requests.

```
scrape_async(
  url: "https://heavy-site.com",
  use_js_render: true,
  timeout: 120,
  callback_url: "https://your-webhook.com/callback"
)
```

Returns a `job_id` that you can pass to `get_job_status` to poll for the result.

### get_job_status

Get the status of an async scraping job.

```
get_job_status(job_id: "550e8400-e29b-41d4-a716-446655440000")
```

Statuses: `pending`, `processing`, `completed`, `failed`, `cancelled`
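
A polling loop over these statuses could look like the sketch below. `fetch_status` is a hypothetical callable standing in for however your client invokes `get_job_status`, and the assumption that it returns a dict with a `status` key is ours; the actual response shape is not documented here.

```python
import time
from typing import Callable

# Statuses after which the job will not change any further.
TERMINAL_STATUSES = {"completed", "failed", "cancelled"}


def wait_for_job(
    fetch_status: Callable[[str], dict],
    job_id: str,
    interval: float = 2.0,
    max_wait: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll `fetch_status(job_id)` until the job reaches a terminal status.

    Raises TimeoutError if the job is still pending or processing
    after roughly `max_wait` seconds of waiting.
    """
    waited = 0.0
    while True:
        result = fetch_status(job_id)
        if result.get("status") in TERMINAL_STATUSES:
            return result
        if waited >= max_wait:
            raise TimeoutError(f"job {job_id} did not finish within {max_wait}s")
        sleep(interval)
        waited += interval
```

For jobs submitted with a `callback_url`, polling may be unnecessary, since the webhook is presumably notified when the job finishes.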

### batch_scrape

Scrape multiple URLs in a single batch (up to 100 URLs).

```
batch_scrape(
  urls: ["https://example.com/1", "https://example.com/2"],
  use_js_render: false,
  callback_url: "https://your-webhook.com/batch-done"
)
```
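
Because a batch accepts at most 100 URLs, longer lists need to be split before submission. A minimal sketch of that split follows; `chunk_urls` is a hypothetical helper, not a function provided by this package.

```python
from typing import Iterator, List, Sequence

MAX_BATCH_SIZE = 100  # batch_scrape accepts up to 100 URLs per call


def chunk_urls(urls: Sequence[str], size: int = MAX_BATCH_SIZE) -> Iterator[List[str]]:
    """Yield consecutive slices of `urls`, each containing at most `size` items."""
    if not 1 <= size <= MAX_BATCH_SIZE:
        raise ValueError(f"size must be between 1 and {MAX_BATCH_SIZE}")
    for start in range(0, len(urls), size):
        yield list(urls[start:start + size])
```

Each yielded list can then be passed as the `urls` argument of one `batch_scrape` call.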

### get_usage

Get current API token usage.

```
get_usage()
```

## Examples

### Basic Scraping

Ask Claude or your AI agent:

> "Scrape https://example.com and show me the content"

### JavaScript Rendered Page

> "Scrape https://spa-website.com with JavaScript rendering enabled"

### Protected Site with Captcha

> "Scrape https://protected-site.com using residential proxy and captcha solving"

### Batch Scraping

> "Scrape these URLs: https://example.com/1, https://example.com/2, https://example.com/3"

## Pricing

FineData uses token-based pricing. Each enabled feature adds tokens on top of the base request:

| Feature | Tokens |
|---------|--------|
| Base request | 1 |
| Antibot (TLS fingerprinting) | +2 |
| JS Rendering (Playwright) | +5 |
| Undetected Chrome | +5 |
| Residential Proxy | +3 |
| Mobile Proxy | +4 |
| reCAPTCHA / hCaptcha | +10 |
| Cloudflare Turnstile | +12 |
| Yandex SmartCaptcha | +15 |
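
As a worked example of the table, the sketch below estimates the cost of a single request. It assumes that costs are purely additive and that each feature is charged once per request; the feature keys and the `estimate_tokens` function are hypothetical names chosen for illustration, and actual billing may differ.

```python
# Token costs copied from the pricing table above.
TOKEN_COSTS = {
    "base": 1,
    "antibot": 2,       # TLS fingerprinting
    "js_render": 5,     # Playwright
    "undetected": 5,    # Undetected Chrome
    "residential": 3,
    "mobile": 4,
    "recaptcha": 10,    # reCAPTCHA / hCaptcha
    "turnstile": 12,    # Cloudflare Turnstile
    "yandex": 15,       # Yandex SmartCaptcha
}


def estimate_tokens(*features: str) -> int:
    """Return the base cost plus the cost of each distinct requested feature."""
    requested = set(features) - {"base"}
    unknown = requested - set(TOKEN_COSTS)
    if unknown:
        raise ValueError(f"unknown features: {sorted(unknown)}")
    return TOKEN_COSTS["base"] + sum(TOKEN_COSTS[name] for name in requested)
```

For instance, `estimate_tokens("js_render", "residential", "recaptcha")` adds 1 + 5 + 3 + 10 and returns 19.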

Get your API key and free trial tokens at [finedata.ai](https://finedata.ai).

## Support

- Documentation: https://docs.finedata.ai
- Email: support@finedata.ai
- Issues: https://github.com/quality-network/finedata-mcp/issues

## License

MIT
