Metadata-Version: 2.4
Name: sliprequests
Version: 0.2.4
Summary: A fully requests-compatible anti-detection scraping library built on the Camoufox anti-detection browser
Home-page: https://github.com/violettoolssite/sliprequests
Author: violet
Author-email: violet <violettools.site@gmail.com>
License: MIT
Project-URL: Homepage, https://github.com/violettoolssite/sliprequests
Project-URL: Repository, https://github.com/violettoolssite/sliprequests
Project-URL: Issues, https://github.com/violettoolssite/sliprequests/issues
Keywords: scraping,anti-detection,camoufox,cloudflare,akamai,perimeterx,bot-detection,requests
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Topic :: Internet :: WWW/HTTP :: Indexing/Search
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: camoufox[geoip]>=0.4.0
Provides-Extra: dev
Requires-Dist: pytest>=7.0; extra == "dev"
Requires-Dist: pytest-cov>=4.0; extra == "dev"
Dynamic: author
Dynamic: home-page
Dynamic: license-file
Dynamic: requires-python

# sliprequests

**A drop-in replacement for `requests` that bypasses Cloudflare and other anti-bot detection systems.**

[![PyPI version](https://badge.fury.io/py/sliprequests.svg)](https://pypi.org/project/sliprequests/)
[![Python](https://img.shields.io/badge/python-3.8+-blue.svg)](https://www.python.org/)
[![License](https://img.shields.io/badge/license-MIT-green.svg)](LICENSE)

## Table of Contents

- [What is sliprequests?](#what-is-sliprequests)
- [Installation](#installation)
- [Quick Start](#quick-start)
- [API Reference](#api-reference)
- [Architecture](#architecture)
- [Comparison with requests](#comparison-with-requests)
- [Troubleshooting](#troubleshooting)
- [License](#license)

---

## What is sliprequests?

`sliprequests` is a Python HTTP library that uses the [Camoufox](https://github.com/nicoreed/camoufox) anti-detection browser under the hood. It provides an API that is **100% compatible** with the popular [`requests`](https://requests.readthedocs.io/) library: change your import statement and you're done.

### The Problem

Web scraping with traditional HTTP clients hits walls:

```python
# ❌ Gets blocked by Cloudflare
import requests
r = requests.get("https://example.com")
# 403 Forbidden / 503 Challenge
```

### The Solution

Use `sliprequests` — it launches a real browser that renders JavaScript and passes bot detection:

```python
# ✅ Bypasses Cloudflare automatically
import sliprequests as requests
r = requests.get("https://example.com")
# 200 OK — full rendered page
```

### What's the Difference?

| Feature | requests | sliprequests |
|---------|----------|--------------|
| Simple API | ✅ | ✅ |
| Cloudflare bypass | ❌ | ✅ |
| JavaScript rendering | ❌ | ✅ |
| Anti-fingerprint | ❌ | ✅ |
| Browser-like headers | ❌ | ✅ |
| Drop-in replacement | — | ✅ |

---

## Installation

```bash
pip install sliprequests
```

That's it. The `camoufox` browser and all dependencies are installed automatically. No extra configuration needed.

### Requirements

- Python 3.8+
- Supported OS: Windows, macOS, Linux
- No need to install Playwright or Camoufox manually — `pip install sliprequests` handles everything.

---

## Quick Start

### Basic Usage

```python
import sliprequests as requests

# GET request — renders JavaScript like a real browser
r = requests.get("https://www.cloudflare.com/")
print(r.status_code)  # 200
print(r.text)         # Visible text of the rendered page
print(r.html)         # Rendered HTML for parsing (XPath, BeautifulSoup, etc.)
```

### POST Request

```python
import sliprequests as requests

# POST with JSON body
r = requests.post("https://httpbin.org/post", json={"key": "value"})
print(r.json())

# POST with form data
r = requests.post("https://httpbin.org/post", data={"key": "value"})
print(r.json())
```

### Using Session

```python
import sliprequests as requests

# Session maintains cookies across requests
with requests.Session() as s:
    s.get("https://httpbin.org/cookies/set/token/abc123")
    r = s.get("https://httpbin.org/cookies")
    print(r.json())  # {'cookies': {'token': 'abc123'}}
```

### Parsing with lxml xpath

```python
import sliprequests as requests
from lxml import html

r = requests.get("https://github.com/trending")

doc = html.fromstring(r.html)
# Extract all h1 text
titles = doc.xpath("//h1/text()")

# Extract links
links = doc.xpath("//a/@href")

# Extract with conditions
python_repos = doc.xpath("//a[contains(@href, 'python')]/text()")
```

### Parsing with BeautifulSoup

```python
import sliprequests as requests
from bs4 import BeautifulSoup

r = requests.get("https://github.com/trending")

soup = BeautifulSoup(r.html, "lxml")

# Find elements
h1 = soup.find("h1").text
links = soup.find_all("a")

# CSS selectors
repos = soup.select("h2 a")
prices = soup.select(".price")
```

### Using CSS Selectors (lxml)

```python
import sliprequests as requests
from lxml import html

r = requests.get("https://example.com")
doc = html.fromstring(r.html)

# CSS selectors
headings = doc.cssselect("h1, h2, h3")
nav_links = doc.cssselect("nav a")
```

---

## API Reference

`sliprequests` mirrors the [`requests`](https://requests.readthedocs.io/en/latest/api/) API exactly. Every method, parameter, and return type is compatible.

### HTTP Methods

```python
requests.get(url, **kwargs)
requests.post(url, **kwargs)
requests.put(url, **kwargs)
requests.delete(url, **kwargs)
requests.patch(url, **kwargs)
requests.head(url, **kwargs)
requests.options(url, **kwargs)
```

### Parameters

| Parameter | Description | Example |
|-----------|-------------|---------|
| `params` | URL query parameters | `params={"q": "python"}` |
| `data` | Request body (form data or raw) | `data={"key": "value"}` |
| `json` | JSON request body | `json={"key": "value"}` |
| `headers` | Custom HTTP headers | `headers={"Accept": "text/html"}` |
| `cookies` | Request cookies | `cookies={"session": "abc"}` |
| `timeout` | Request timeout in seconds | `timeout=30` |
| `allow_redirects` | Follow redirects | `allow_redirects=False` |
| `proxies` | Proxy configuration | `proxies={"https": "socks5://..."}` |
| `auth` | HTTP authentication | `auth=("user", "pass")` |
| `stream` | Stream response body | `stream=True` |
| `verify` | Verify SSL certificates | `verify=False` |
| `cert` | Client certificate | `cert=("cert.pem", "key.pem")` |
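
Because GET requests are performed as browser page navigations, `params` has to end up in the URL itself. The following is a minimal sketch of that encoding using only the standard library. The helper name `add_params` is hypothetical and is not part of `sliprequests`; the library's own handling may differ.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_params(url, params):
    """Append `params` to the query string of `url`, keeping existing pairs."""
    parts = urlsplit(url)
    # Keep whatever query pairs the URL already carries, then add the new ones.
    pairs = parse_qsl(parts.query, keep_blank_values=True) + list(params.items())
    return urlunsplit(parts._replace(query=urlencode(pairs)))


print(add_params("https://example.com/search?page=1", {"q": "python"}))
```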

### Response Object

The `Response` object is fully compatible with `requests.Response`:

```python
r = requests.get("https://httpbin.org/get")

# Status & metadata
r.status_code          # 200
r.ok                   # True (status_code < 400)
r.reason               # "OK"
r.url                  # Final URL after redirects
r.headers              # Response headers
r.cookies              # Response cookies
r.elapsed              # Time taken (timedelta)

# Content
r.text                 # Decoded text (str)
r.content              # Raw bytes
r.html                 # Full rendered HTML — sliprequests exclusive!
r.json()               # Parsed JSON
r.encoding             # Detected encoding

# History (redirects)
r.history              # List of previous Response objects
```

### The `.html` Property

`sliprequests` adds an `.html` property to the Response object that returns the **full rendered HTML** of the page. This is the HTML after JavaScript execution, making it ideal for parsing with `lxml`, `BeautifulSoup`, or CSS selectors.

```python
import sliprequests as requests
from lxml import html
from bs4 import BeautifulSoup

r = requests.get("https://github.com/trending")

# Using lxml xpath
doc = html.fromstring(r.html)
repos = doc.xpath("//h2/a/text()")

# Using BeautifulSoup
soup = BeautifulSoup(r.html, "lxml")
titles = soup.select("h2 a")
```

### Session Object

The `Session` object persists settings and cookies across requests:

```python
s = requests.Session()

# Persistent headers
s.headers.update({"Authorization": "Bearer token123"})

# Persistent cookies
s.cookies.set("session", "abc123")

# Persistent proxy
s.proxies = {"https": "socks5://user:pass@host:port"}

# All requests through this session use these settings
r1 = s.get("https://example.com/login")
r2 = s.get("https://example.com/dashboard")  # cookies carried over
```

### Session Attributes

| Attribute | Description | Default |
|-----------|-------------|---------|
| `headers` | Default headers | Browser UA |
| `cookies` | Default cookies | `{}` |
| `auth` | Default auth | `None` |
| `proxies` | Default proxy | `{}` |
| `params` | Default URL params | `{}` |
| `verify` | SSL verification | `True` |
| `cert` | Client certificate | `None` |
| `timeout` | Default timeout | `30` |
| `allow_redirects` | Follow redirects | `True` |
| `stream` | Stream responses | `False` |
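
To make the relationship between session defaults and per-request arguments concrete, here is a small sketch of one plausible merge rule: per-request values win, and dictionary settings such as `headers` are merged key by key. This mirrors how `requests` treats session headers, but `merge_settings` is a hypothetical helper and the exact behavior of `sliprequests` is an assumption.

```python
def merge_settings(session_defaults, request_kwargs):
    """Combine session defaults with per-request kwargs; the request wins."""
    merged = dict(session_defaults)
    for key, value in request_kwargs.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Dict settings (headers, cookies, params, proxies) merge per key.
            merged[key] = {**merged[key], **value}
        else:
            # Scalar settings (timeout, verify, stream) are simply replaced.
            merged[key] = value
    return merged


defaults = {"headers": {"User-Agent": "UA", "Accept": "*/*"}, "timeout": 30}
print(merge_settings(defaults, {"headers": {"Accept": "text/html"}, "timeout": 10}))
```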

### Proxies

`sliprequests` supports SOCKS5 and HTTP proxies:

```python
import sliprequests as requests

# SOCKS5 proxy
proxies = {
    "https": "socks5://user:password@host:port"
}
r = requests.get("https://httpbin.org/ip", proxies=proxies)
print(r.json())  # Shows proxy IP

# HTTP proxy
proxies = {
    "http": "http://user:password@host:port",
    "https": "http://user:password@host:port"
}
r = requests.get("https://httpbin.org/ip", proxies=proxies)
```

### Authentication

```python
import sliprequests as requests

# Basic Auth
r = requests.get("https://httpbin.org/basic-auth/user/pass",
                  auth=("user", "pass"))

# Bearer Token
headers = {"Authorization": "Bearer your-token-here"}
r = requests.get("https://api.example.com/data", headers=headers)
```

### Timeouts

```python
import sliprequests as requests

# Timeout in seconds
r = requests.get("https://slow-api.example.com", timeout=10)

# No timeout (not recommended)
r = requests.get("https://example.com", timeout=None)
```

### SSL Verification

```python
import sliprequests as requests

# Disable SSL verification (not recommended for production)
r = requests.get("https://self-signed.example.com", verify=False)
```

---

## Architecture

`sliprequests` uses a dual-mode architecture:

1. **Server Mode** (preferred): Connects to a local `camofox-browser` REST API service (port 9377). Zero extra memory overhead — the browser runs as a system service.

2. **Subprocess Mode** (fallback): If no server is available, automatically launches a Camoufox browser in a subprocess. This is the default for most users.

```
sliprequests (your code)
        │
   ┌────┴────┐
   │         │
   ▼         ▼
Server     Subprocess
Mode       Mode
(port 9377) (auto-launch)
   │         │
   └────┬────┘
        ▼
   Camoufox Browser
   (anti-detect)
```
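
The mode selection can be pictured as a simple port probe. The sketch below is an illustration under stated assumptions, not the library's actual code: `pick_mode` is a hypothetical helper, and it only checks whether something accepts TCP connections on the port, not whether it is really the `camofox-browser` service.

```python
import socket

SERVER_PORT = 9377  # port this README names for the camofox-browser service


def pick_mode(host="127.0.0.1", port=SERVER_PORT, timeout=0.5):
    """Return "server" if host:port accepts a TCP connection, else "subprocess"."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "server"
    except OSError:
        # Connection refused, timed out, or unreachable: fall back.
        return "subprocess"
```

Calling `pick_mode()` on a machine without the service running would return `"subprocess"`, which corresponds to the fallback path in the diagram.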

### How GET Requests Work

When you call `requests.get()`:

1. The URL is opened in a Camoufox browser tab (like a real user visiting the page)
2. The browser renders the page, executes JavaScript, and loads all resources
3. The fully rendered HTML is returned in `response.html`
4. The visible text is returned in `response.text`
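
The difference between steps 3 and 4 is easiest to see on a small document. The sketch below approximates "visible text" with the standard library's `html.parser`; it is only an illustration. The real browser presumably derives the text from the rendered page, and `VisibleText` and `visible_text` are hypothetical names.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collect text nodes, skipping the contents of <script> and <style>."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep non-empty text that is not inside a skipped element.
        if not self._skip_depth and data.strip():
            self.parts.append(data.strip())


def visible_text(markup):
    parser = VisibleText()
    parser.feed(markup)
    parser.close()
    return "\n".join(parser.parts)


rendered = (
    "<html><head><style>h1{color:red}</style></head>"
    "<body><h1>Hi</h1><script>var x=1;</script><p>There</p></body></html>"
)
print(visible_text(rendered))  # prints "Hi" and "There" on separate lines
```

Here `rendered` plays the role of `response.html`, and the printed result plays the role of `response.text`.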

### How POST/PUT/PATCH/DELETE Work

Non-GET requests use the browser's `fetch()` API:

1. The request is sent via JavaScript `fetch()` inside the browser
2. Supports all HTTP methods and request bodies
3. Returns the response status, headers, and body
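
As a rough sketch of the steps above, the Python side could build a JavaScript snippet and hand it to the browser for evaluation. `build_fetch_script` is a hypothetical helper written for this illustration; how `sliprequests` actually constructs and evaluates the call is not documented here.

```python
import json


def build_fetch_script(method, url, headers=None, json_body=None):
    """Return JavaScript that runs fetch() and resolves to status, headers, body."""
    options = {"method": method.upper(), "headers": dict(headers or {})}
    if json_body is not None:
        options["headers"].setdefault("Content-Type", "application/json")
        options["body"] = json.dumps(json_body)
    # json.dumps produces valid JavaScript literals for the URL and options.
    return (
        "(async () => {"
        f" const r = await fetch({json.dumps(url)}, {json.dumps(options)});"
        " return {status: r.status,"
        " headers: Object.fromEntries(r.headers.entries()),"
        " body: await r.text()};"
        " })()"
    )


print(build_fetch_script("post", "https://httpbin.org/post", json_body={"key": "value"}))
```

Serializing with `json.dumps` rather than string concatenation keeps quotes and special characters in the URL or body from breaking the generated script.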

---

## Comparison with requests

### What's the same

Everything that matters for day-to-day usage:

```python
import sliprequests as requests

# All these work exactly like requests
r = requests.get(url)
r = requests.post(url, json=data)
r = requests.get(url, params=params)
r = requests.get(url, headers=headers)
r = requests.get(url, cookies=cookies)
r = requests.get(url, proxies=proxies)
r = requests.get(url, auth=auth)
r = requests.get(url, timeout=30)

# Response object
r.status_code
r.text
r.content
r.json()
r.headers
r.cookies
r.url
r.ok
r.reason

# Session
s = requests.Session()
s.headers.update(...)
s.cookies.set(...)
s.get(url)
s.post(url, data=data)
s.close()

# Context manager
with requests.Session() as s:
    s.get(url)
```

### What's different

| Feature | requests | sliprequests |
|---------|----------|--------------|
| Response.html | ❌ Not available | ✅ Full rendered HTML |
| GET requests | Direct HTTP | Browser page navigation |
| POST/PUT/PATCH | Direct HTTP | Browser fetch() API |
| Memory usage | Low (~1MB) | Higher (~300MB, browser) |
| Speed | Fast (~100ms) | Slower (~5-10s) |
| JavaScript | Not executed | Fully executed |
| Anti-bot | ❌ | ✅ Cloudflare, DataDome, etc. |

### When to use sliprequests

- ✅ Scraping sites with Cloudflare, DataDome, or other anti-bot protection
- ✅ Sites that require JavaScript rendering
- ✅ Need browser-like behavior (fingerprints, cookies, etc.)
- ✅ API reverse engineering (when you need to understand JS-rendered responses)

### When to use requests

- ✅ Simple API calls without anti-bot protection
- ✅ High-throughput scraping (thousands of requests)
- ✅ Low memory environments
- ✅ Speed-critical applications
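
The two lists can be combined: try the fast client first and only pay the browser cost when the response looks blocked. The sketch below is one way to do that. `fetch_with_fallback` is a hypothetical helper, and treating 403 and 503 as "blocked" is an assumption based on the status codes shown earlier in this README.

```python
BLOCKED_STATUSES = {403, 503}  # assumed signs of an anti-bot block


def fetch_with_fallback(url, fast_get, browser_get, **kwargs):
    """Call fast_get first; retry with browser_get only if the result looks blocked."""
    response = fast_get(url, **kwargs)
    if response.status_code in BLOCKED_STATUSES:
        return browser_get(url, **kwargs)
    return response
```

With both libraries installed, this would be called as `fetch_with_fallback(url, requests.get, sliprequests.get)`. Passing the two callables in keeps the helper independent of either import.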

---

## Troubleshooting

**Q: "ModuleNotFoundError: No module named 'camoufox'"**

```bash
pip install sliprequests
# camoufox is installed automatically as a dependency
```

**Q: "Browser failed to start"**

Make sure your system has enough memory (at least 512MB free). Camoufox uses ~300MB per browser instance.

**Q: "Connection refused" on port 9377**

This is normal — the server mode requires a separate `camofox-browser` Node.js service. If you don't have it, `sliprequests` automatically falls back to subprocess mode.

**Q: Slow performance**

`sliprequests` is slower than `requests` because it launches a real browser and renders JavaScript. This is the tradeoff for bypassing anti-bot detection. For better performance:

- Use `Session` objects to reuse browser instances
- Use server mode (port 9377) to avoid browser startup overhead

**Q: Memory usage is high**

Each browser instance uses ~300MB RAM. Use `Session` objects to reuse instances:

```python
# ❌ Creates new browser each time
for url in urls:
    r = requests.get(url)

# ✅ Reuses same browser
with requests.Session() as s:
    for url in urls:
        r = s.get(url)
```

---

## License

MIT License

## Credits

- [Camoufox](https://github.com/nicoreed/camoufox) — Anti-detection Firefox-based browser
- [requests](https://github.com/psf/requests) — The library that inspired this project
- [Playwright](https://playwright.dev/) — Browser automation
