Metadata-Version: 2.4
Name: sliprequests
Version: 0.2.2
Summary: A requests-compatible anti-detection scraping library built on the Camoufox anti-detection browser
Home-page: https://github.com/violettoolssite/sliprequests
Author: violet
Author-email: violet <violettools.site@gmail.com>
License: MIT
Project-URL: Homepage, https://github.com/violettoolssite/sliprequests
Project-URL: Repository, https://github.com/violettoolssite/sliprequests
Project-URL: Issues, https://github.com/violettoolssite/sliprequests/issues
Keywords: scraping,anti-detection,camoufox,cloudflare,akamai,perimeterx,bot-detection,requests
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Topic :: Internet :: WWW/HTTP :: Indexing/Search
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: camoufox[geoip]>=0.4.0
Provides-Extra: dev
Requires-Dist: pytest>=7.0; extra == "dev"
Requires-Dist: pytest-cov>=4.0; extra == "dev"
Dynamic: author
Dynamic: home-page
Dynamic: license-file
Dynamic: requires-python

# sliprequests

**A drop-in replacement for `requests` that bypasses Cloudflare and other anti-bot detection systems.**

[![PyPI version](https://badge.fury.io/py/sliprequests.svg)](https://pypi.org/project/sliprequests/)
[![Python](https://img.shields.io/badge/python-3.8+-blue.svg)](https://www.python.org/)
[![License](https://img.shields.io/badge/license-MIT-green.svg)](LICENSE)

---

## Table of Contents

- [What is sliprequests?](#what-is-sliprequests)
- [Installation](#installation)
- [Quick Start](#quick-start)
- [API Reference](#api-reference)
- [Architecture](#architecture)
- [Comparison with requests](#comparison-with-requests)
- [Troubleshooting](#troubleshooting)
- [License](#license)

---

## What is sliprequests?

`sliprequests` is a Python HTTP library that uses the [Camoufox](https://github.com/nicoreed/camoufox) anti-detection browser under the hood. It exposes the same API as the popular [`requests`](https://requests.readthedocs.io/) library, so switching is usually just a matter of changing your import statement.

### The Problem

Traditional HTTP clients are often blocked by anti-bot systems before they ever see the page:

```python
# ❌ Gets blocked by Cloudflare
import requests
r = requests.get("https://example.com")
# 403 Forbidden / 503 Challenge
```

### The Solution

Use `sliprequests`, which launches a real browser that renders JavaScript and passes bot detection:

```python
# ✅ Bypasses Cloudflare automatically
import sliprequests as requests
r = requests.get("https://example.com")
# 200 OK — full rendered page
```

### What's the Difference?

| Feature | requests | sliprequests |
|---------|----------|--------------|
| Simple API | ✅ | ✅ |
| Cloudflare bypass | ❌ | ✅ |
| JavaScript rendering | ❌ | ✅ |
| Anti-fingerprinting | ❌ | ✅ |
| Browser-like headers | ❌ | ✅ |
| Drop-in replacement | n/a | ✅ |

---

## Installation

```bash
pip install sliprequests
```

That's it. The `camoufox` browser and all dependencies are installed automatically. No extra configuration is needed.

### Requirements

- Python 3.8+
- Supported OS: Windows, macOS, Linux
- No need to install Playwright or Camoufox manually: `pip install sliprequests` handles everything.

---

## Quick Start

### Basic Usage

```python
import sliprequests as requests

# GET request: the page is rendered in a real browser, JavaScript included
r = requests.get("https://www.cloudflare.com/")
print(r.status_code)  # 200
print(r.text)         # Visible text of the rendered page
print(r.html)         # Rendered HTML, suitable for XPath/BeautifulSoup parsing
```

### POST Request

```python
import sliprequests as requests

# POST with a JSON body
r = requests.post("https://httpbin.org/post", json={"key": "value"})
print(r.json())

# POST with form data
r = requests.post("https://httpbin.org/post", data={"key": "value"})
print(r.json())
```

### Using Session

```python
import sliprequests as requests

# A Session keeps cookies across requests
with requests.Session() as s:
    s.get("https://httpbin.org/cookies/set/token/abc123")
    r = s.get("https://httpbin.org/cookies")
    print(r.json())  # {'cookies': {'token': 'abc123'}}
```

### Parsing with lxml XPath

```python
import sliprequests as requests
from lxml import html

r = requests.get("https://github.com/trending")

doc = html.fromstring(r.html)
# Extract the text of every h1
titles = doc.xpath("//h1/text()")

# Extract link targets
links = doc.xpath("//a/@href")

# Extract with a condition: link text where the href contains 'python'
python_repos = doc.xpath("//a[contains(@href, 'python')]/text()")
```

### Parsing with BeautifulSoup

```python
import sliprequests as requests
from bs4 import BeautifulSoup

r = requests.get("https://github.com/trending")

soup = BeautifulSoup(r.html, "lxml")

# Find elements
h1 = soup.find("h1").text
links = soup.find_all("a")

# CSS selectors
repos = soup.select("h2 a")
prices = soup.select(".price")
```

### Using CSS Selectors (lxml)

```python
import sliprequests as requests
from lxml import html

r = requests.get("https://example.com")
doc = html.fromstring(r.html)

# CSS selectors (lxml needs the separate `cssselect` package for this)
headings = doc.cssselect("h1, h2, h3")
nav_links = doc.cssselect("nav a")
```

---

## API Reference

`sliprequests` mirrors the [`requests`](https://requests.readthedocs.io/en/latest/api/) API. Methods, parameters, and return types are designed to be compatible; the behavioral differences are listed under "Comparison with requests".

### HTTP Methods

```python
requests.get(url, **kwargs)
requests.post(url, **kwargs)
requests.put(url, **kwargs)
requests.delete(url, **kwargs)
requests.patch(url, **kwargs)
requests.head(url, **kwargs)
requests.options(url, **kwargs)
```

### Parameters

| Parameter | Description | Example |
|-----------|-------------|---------|
| `params` | URL query parameters | `params={"q": "python"}` |
| `data` | Request body (form data or raw) | `data={"key": "value"}` |
| `json` | JSON request body | `json={"key": "value"}` |
| `headers` | Custom HTTP headers | `headers={"Accept": "text/html"}` |
| `cookies` | Request cookies | `cookies={"session": "abc"}` |
| `timeout` | Request timeout in seconds | `timeout=30` |
| `allow_redirects` | Follow redirects | `allow_redirects=False` |
| `proxies` | Proxy configuration | `proxies={"https": "socks5://..."}` |
| `auth` | HTTP authentication | `auth=("user", "pass")` |
| `stream` | Stream response body | `stream=True` |
| `verify` | Verify SSL certificates | `verify=False` |
| `cert` | Client certificate | `cert=("cert.pem", "key.pem")` |
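
As an illustration of what `params` does, the sketch below folds a dictionary of query parameters into a URL using only the standard library. `add_params` is a hypothetical helper written for this example, not part of sliprequests, and the assumption is that sliprequests encodes parameters the same way `requests` does.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_params(url: str, params: dict) -> str:
    """Append params to the query string of url, keeping existing values."""
    parts = urlsplit(url)
    # Keep whatever query string the URL already carries
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.extend(params.items())
    # urlencode percent-encodes values and turns spaces into '+'
    return urlunsplit(parts._replace(query=urlencode(query)))


final_url = add_params("https://example.com/search?lang=en", {"q": "python scraping"})
print(final_url)  # https://example.com/search?lang=en&q=python+scraping
```

The resulting URL is what you would expect to see in `r.url` after a call such as `requests.get(url, params=...)`.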

### Response Object

The `Response` object is fully compatible with `requests.Response`:

```python
r = requests.get("https://httpbin.org/get")

# Status and metadata
r.status_code          # 200
r.ok                   # True (status_code < 400)
r.reason               # "OK"
r.url                  # Final URL after redirects
r.headers              # Response headers
r.cookies              # Response cookies
r.elapsed              # Time taken (timedelta)

# Content
r.text                 # Decoded text (str)
r.content              # Raw bytes
r.html                 # Full rendered HTML (sliprequests only)
r.json()               # Parsed JSON
r.encoding             # Detected encoding

# History (redirects)
r.history              # List of previous Response objects
```
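
The relationship between `status_code`, `ok`, and `reason` can be sketched with the standard library. This only illustrates the semantics shown in the comments above; `describe_status` is a hypothetical helper, and how sliprequests actually fills in `reason` is not documented here.

```python
from http import HTTPStatus


def describe_status(status_code: int):
    """Return (ok, reason) for a status code, following requests' conventions."""
    ok = status_code < 400
    try:
        reason = HTTPStatus(status_code).phrase  # standard reason phrase
    except ValueError:
        reason = ""  # non-standard code: no known phrase
    return ok, reason


print(describe_status(200))  # (True, 'OK')
print(describe_status(403))  # (False, 'Forbidden')
```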

### The `.html` Property

`sliprequests` adds an `.html` property to the Response object that returns the **full rendered HTML** of the page. This is the HTML after JavaScript execution, making it ideal for parsing with `lxml`, `BeautifulSoup`, or CSS selectors.

```python
import sliprequests as requests
from lxml import html
from bs4 import BeautifulSoup

r = requests.get("https://github.com/trending")

# Using lxml XPath
doc = html.fromstring(r.html)
repos = doc.xpath("//h2/a/text()")

# Using BeautifulSoup
soup = BeautifulSoup(r.html, "lxml")
titles = soup.select("h2 a")
```

### Session Object

The `Session` object persists settings and cookies across requests:

```python
s = requests.Session()

# Persistent headers
s.headers.update({"Authorization": "Bearer token123"})

# Persistent cookies
s.cookies.set("session", "abc123")

# Persistent proxy
s.proxies = {"https": "socks5://user:pass@host:port"}

# All requests made through this session use these settings
r1 = s.get("https://example.com/login")
r2 = s.get("https://example.com/dashboard")  # cookies are carried over
```

### Session Attributes

| Attribute | Description | Default |
|-----------|-------------|---------|
| `headers` | Default headers | Browser User-Agent |
| `cookies` | Default cookies | `{}` |
| `auth` | Default auth | `None` |
| `proxies` | Default proxy | `{}` |
| `params` | Default URL params | `{}` |
| `verify` | SSL verification | `True` |
| `cert` | Client certificate | `None` |
| `timeout` | Default timeout (seconds) | `30` |
| `allow_redirects` | Follow redirects | `True` |
| `stream` | Stream responses | `False` |
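
How session defaults combine with per-request arguments can be sketched as follows. This is modeled on the merging rule used by `requests` (the per-request value wins, dictionaries are merged key by key, and a `None` value removes a default); that sliprequests follows the same rule is an assumption, and `merge_setting` here is a standalone helper written for this example.

```python
def merge_setting(request_value, session_value):
    """Combine a per-request value with the session default."""
    if request_value is None:
        return session_value  # nothing passed: use the session default
    if isinstance(request_value, dict) and isinstance(session_value, dict):
        merged = {**session_value, **request_value}  # request keys override
        # A None value in the request removes that key entirely
        return {k: v for k, v in merged.items() if v is not None}
    return request_value  # scalars: the request value simply wins


session_headers = {"Authorization": "Bearer token123", "Accept": "text/html"}
headers = merge_setting({"Accept": "application/json"}, session_headers)
timeout = merge_setting(None, 30)

print(headers)  # {'Authorization': 'Bearer token123', 'Accept': 'application/json'}
print(timeout)  # 30
```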

### Proxies

`sliprequests` supports SOCKS5 and HTTP proxies:

```python
import sliprequests as requests

# SOCKS5 proxy
proxies = {
    "https": "socks5://user:password@host:port"
}
r = requests.get("https://httpbin.org/ip", proxies=proxies)
print(r.json())  # Shows the proxy IP

# HTTP proxy
proxies = {
    "http": "http://user:password@host:port",
    "https": "http://user:password@host:port"
}
r = requests.get("https://httpbin.org/ip", proxies=proxies)
```

### Authentication

```python
import sliprequests as requests

# Basic Auth
r = requests.get("https://httpbin.org/basic-auth/user/pass",
                  auth=("user", "pass"))

# Bearer token
headers = {"Authorization": "Bearer your-token-here"}
r = requests.get("https://api.example.com/data", headers=headers)
```

### Timeouts

```python
import sliprequests as requests

# Timeout in seconds
r = requests.get("https://slow-api.example.com", timeout=10)

# No timeout (not recommended)
r = requests.get("https://example.com", timeout=None)
```

### SSL Verification

```python
import sliprequests as requests

# Disable SSL verification (not recommended in production)
r = requests.get("https://self-signed.example.com", verify=False)
```

---

## Architecture

`sliprequests` uses a dual-mode architecture:

1. **Server Mode** (preferred): Connects to a local `camofox-browser` REST API service (port 9377). Your script starts no browser of its own, because the browser already runs as a separate system service.

2. **Subprocess Mode** (fallback): If no server is available, `sliprequests` automatically launches a Camoufox browser in a subprocess. Since the server is optional, this is the mode most users end up running in.

```
sliprequests (your code)
        │
   ┌────┴────┐
   │         │
   ▼         ▼
Server     Subprocess
Mode       Mode
(port 9377) (auto-launch)
   │         │
   └────┬────┘
        ▼
   Camoufox Browser
   (anti-detect)
```
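
The choice between the two modes could be made with a simple TCP probe of the server port, as in the sketch below. This is an assumption about how such a fallback might be implemented, not the library's actual code; `server_available` and `choose_mode` are hypothetical names.

```python
import socket


def server_available(host: str = "127.0.0.1", port: int = 9377,
                     timeout: float = 0.5) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers connection refused and timeouts
        return False


def choose_mode(host: str = "127.0.0.1", port: int = 9377) -> str:
    """Prefer server mode; fall back to launching a subprocess."""
    return "server" if server_available(host, port) else "subprocess"


print(choose_mode())
```

A refused connection on port 9377 is therefore an expected outcome rather than an error: it simply selects subprocess mode.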

### How GET Requests Work

When you call `requests.get()`:

1. The URL is opened in a Camoufox browser tab (like a real user visiting the page)
2. The browser renders the page, executes JavaScript, and loads all resources
3. The fully rendered HTML is returned in `response.html`
4. The visible text is returned in `response.text`
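
To make the difference between steps 3 and 4 concrete, the sketch below derives visible text from a piece of rendered HTML with the standard-library parser. It is only an approximation for illustration: in sliprequests the browser produces the text, and the `VisibleText` class and sample markup are made up for this example.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collect text nodes, skipping the contents of <script> and <style>."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style elements
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def visible_text(rendered_html: str) -> str:
    parser = VisibleText()
    parser.feed(rendered_html)
    parser.close()
    return "\n".join(parser.chunks)


rendered_html = (
    "<html><head><style>h1{color:red}</style></head>"
    "<body><h1>Hello</h1><script>var x = 1;</script><p>World</p></body></html>"
)
print(visible_text(rendered_html))
```

Here `rendered_html` plays the role of `response.html`, and the printed result (the lines `Hello` and `World`) plays the role of `response.text`.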

### How POST/PUT/PATCH/DELETE Work

Non-GET requests use the browser's `fetch()` API:

1. The request is sent via JavaScript `fetch()` inside the browser
2. All HTTP methods and request bodies are supported
3. The response status, headers, and body are returned
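
The steps above can be sketched as a small function that turns a request into a JavaScript `fetch()` expression for the browser to evaluate. This is a hypothetical illustration of the idea, not the library's implementation; `build_fetch_script` and the shape of the returned object are assumptions.

```python
import json


def build_fetch_script(method, url, headers=None, body=None):
    """Return a JavaScript expression that performs the request with fetch()."""
    options = {"method": method.upper(), "headers": headers or {}}
    if body is not None:
        options["body"] = body  # body must already be serialised to a string
    # json.dumps output is also valid JavaScript literal syntax
    return (
        f"fetch({json.dumps(url)}, {json.dumps(options)})"
        ".then(async (r) => ({"
        "status: r.status, "
        "headers: Object.fromEntries(r.headers.entries()), "
        "body: await r.text()}))"
    )


payload = json.dumps({"key": "value"})
script = build_fetch_script(
    "post",
    "https://httpbin.org/post",
    headers={"Content-Type": "application/json"},
    body=payload,
)
print(script)
```

Evaluating such an expression inside the page means the request carries the browser's own cookies, TLS fingerprint, and headers, which is the point of routing non-GET requests through `fetch()`.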

---

## Comparison with requests

### What's the same

Everything that matters for day-to-day usage:

```python
import sliprequests as requests

# All of these work exactly as they do in requests
r = requests.get(url)
r = requests.post(url, json=data)
r = requests.get(url, params=params)
r = requests.get(url, headers=headers)
r = requests.get(url, cookies=cookies)
r = requests.get(url, proxies=proxies)
r = requests.get(url, auth=auth)
r = requests.get(url, timeout=30)

# Response object
r.status_code
r.text
r.content
r.json()
r.headers
r.cookies
r.url
r.ok
r.reason

# Session
s = requests.Session()
s.headers.update(...)
s.cookies.set(...)
s.get(url)
s.post(url, data=data)
s.close()

# Context manager
with requests.Session() as s:
    s.get(url)
```

### What's different

| Feature | requests | sliprequests |
|---------|----------|--------------|
| `Response.html` | ❌ Not available | ✅ Full rendered HTML |
| GET requests | Direct HTTP | Browser page navigation |
| POST/PUT/PATCH/DELETE | Direct HTTP | Browser `fetch()` API |
| Memory usage | Low (~1MB) | Higher (~300MB per browser instance) |
| Speed | Fast (~100ms) | Slower (~5-10s) |
| JavaScript | Not executed | Fully executed |
| Anti-bot | ❌ | ✅ Cloudflare, DataDome, etc. |

### When to use sliprequests

- ✅ Scraping sites with Cloudflare, DataDome, or other anti-bot protection
- ✅ Sites that require JavaScript rendering
- ✅ Cases that need browser-like behavior (fingerprints, cookies, etc.)
- ✅ API reverse engineering (when you need to understand JS-rendered responses)

### When to use requests

- ✅ Simple API calls without anti-bot protection
- ✅ High-throughput scraping (thousands of requests)
- ✅ Low-memory environments
- ✅ Speed-critical applications

---

## Troubleshooting

### Q: "ModuleNotFoundError: No module named 'camoufox'"

```bash
pip install sliprequests
# camoufox is installed automatically as a dependency
```

### Q: "Browser failed to start"

Make sure your system has enough memory (at least 512MB free). Camoufox uses ~300MB per browser instance.

### Q: "Connection refused" on port 9377

This is normal. Server mode requires a separate `camofox-browser` Node.js service; if it is not running, `sliprequests` automatically falls back to subprocess mode.

### Q: Slow performance

`sliprequests` is slower than `requests` because it launches a real browser and renders JavaScript. This is the tradeoff for bypassing anti-bot detection. For better performance:

- Use `Session` objects to reuse browser instances
- Use server mode (port 9377) to avoid browser startup overhead

### Q: Memory usage is high

Each browser instance uses ~300MB RAM. Use `Session` objects to reuse instances:

```python
# ❌ Creates a new browser each time
for url in urls:
    r = requests.get(url)

# ✅ Reuses the same browser
with requests.Session() as s:
    for url in urls:
        r = s.get(url)
```
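
Why the second pattern is cheaper can be shown with a toy model that counts browser launches. `FakeBrowser`, `get`, and `Session` below are stand-ins written for this illustration and do not reflect the real sliprequests internals.

```python
class FakeBrowser:
    """Stand-in for a real browser; counts how often one is launched."""

    launches = 0

    def __init__(self):
        FakeBrowser.launches += 1  # the expensive step in real life

    def fetch(self, url):
        return f"<html>{url}</html>"

    def close(self):
        pass


def get(url):
    """Module-level helper: one throwaway browser per call."""
    browser = FakeBrowser()
    try:
        return browser.fetch(url)
    finally:
        browser.close()


class Session:
    """Launches a browser lazily and reuses it until the session closes."""

    def __init__(self):
        self._browser = None

    def get(self, url):
        if self._browser is None:
            self._browser = FakeBrowser()
        return self._browser.fetch(url)

    def close(self):
        if self._browser is not None:
            self._browser.close()
            self._browser = None

    def __enter__(self):
        return self

    def __exit__(self, *exc_info):
        self.close()


urls = ["https://example.com/a", "https://example.com/b", "https://example.com/c"]

for url in urls:
    get(url)
launches_without_session = FakeBrowser.launches  # one launch per URL

FakeBrowser.launches = 0
with Session() as s:
    for url in urls:
        s.get(url)
launches_with_session = FakeBrowser.launches  # a single launch in total

print(launches_without_session, launches_with_session)  # 3 1
```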

---

## License

MIT License

## Credits

- [Camoufox](https://github.com/nicoreed/camoufox): Anti-detection Firefox-based browser
- [requests](https://github.com/psf/requests): The library that inspired this project
- [Playwright](https://playwright.dev/): Browser automation
