Metadata-Version: 2.4
Name: serpcheap
Version: 0.2.2
Summary: Official Python client for the serp.cheap Google SERP API.
Project-URL: Homepage, https://serp.cheap
Project-URL: Documentation, https://app.serp.cheap
Project-URL: Repository, https://github.com/SerpCheap/serpcheap-py
Author: serp.cheap
License: MIT
License-File: LICENSE
Keywords: api,google,scraping,search,serp
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Topic :: Internet :: WWW/HTTP
Requires-Python: >=3.8
Description-Content-Type: text/markdown

# serpcheap

[![PyPI](https://img.shields.io/pypi/v/serpcheap)](https://pypi.org/project/serpcheap/)
[![Python versions](https://img.shields.io/pypi/pyversions/serpcheap)](https://pypi.org/project/serpcheap/)
[![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](https://opensource.org/licenses/MIT)

Official Python client for the [serp.cheap](https://serp.cheap) **Google Search API** — real-time Google SERP data (organic results, ads, knowledge graph, page scraping, rank tracking).

The **cheapest Google Search API** we know of: $0.0003 per cached search, $0.0006 fresh, no monthly minimum (~10× cheaper than SerpApi).
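As a rough illustration of the pricing above, the sketch below estimates the cost of a batch of searches. The helper name `estimate_search_cost` is hypothetical and not part of the client; only the two per-search prices come from this README, and scraping or rank-tracking charges are not modeled.

```python
from decimal import Decimal

# Per-search prices quoted above, in USD.
CACHED_PRICE = Decimal("0.0003")
FRESH_PRICE = Decimal("0.0006")


def estimate_search_cost(cached: int, fresh: int) -> Decimal:
    """Estimate the USD cost of a batch of searches (hypothetical helper)."""
    if cached < 0 or fresh < 0:
        raise ValueError("search counts must be non-negative")
    # Decimal avoids the rounding noise that floats add to small prices.
    return cached * CACHED_PRICE + fresh * FRESH_PRICE


print(estimate_search_cost(cached=7_000, fresh=3_000))  # 3.9000
```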

A thin, zero-dependency, synchronous client built on the standard library.

## Install

```bash
pip install serpcheap
```

## Quickstart

```python
from serpcheap import SerpCheap

client = SerpCheap("KEY")
r = client.search(q="best running shoes", gl="us")
print(r.organic[0].title)
```

Get an API key at [app.serp.cheap](https://app.serp.cheap).

## Search parameters

```python
client.search(
    q="best running shoes",  # required
    gl="us",                 # country, default "us"
    hl="en",                 # UI language (optional)
    tbs="qdr:d",             # time filter: qdr:h / qdr:d / qdr:w (optional)
    page=1,                  # 1-indexed page, default 1
)
```

The response is a `SearchResponse` dataclass. JSON camelCase fields are exposed
as snake_case attributes:

```python
r.search            # the query
r.page              # page number
r.organic           # list[OrganicResult] (always a list)
                    #   each: .content / .screenshot_url / .scrape_error when scraped
r.ads               # list[Ad] | None
r.knowledge_graph   # KnowledgeGraph | None
r.people_also_ask   # list[str] | None
r.related_searches  # list[RelatedSearch] | None
r.stats             # SearchStats(balance, cost, cached) | None
```
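Because every section except `organic` may be `None`, code that iterates over them needs a fallback. The sketch below shows one way to do that. The `summarize` helper is hypothetical, and a `SimpleNamespace` stands in for a real `SearchResponse` so the example runs without an API key.

```python
from types import SimpleNamespace


def summarize(response) -> dict:
    """Count result sections, treating None (section absent) as empty."""
    return {
        "organic": len(response.organic),  # always a list
        "ads": len(response.ads or []),
        "people_also_ask": len(response.people_also_ask or []),
        "related_searches": len(response.related_searches or []),
        "cached": response.stats.cached if response.stats else None,
    }


# Stand-in with the attribute names listed above; in real use, pass the
# object returned by client.search(...).
fake = SimpleNamespace(
    organic=[SimpleNamespace(title="Example")],
    ads=None,
    people_also_ask=["What are the best running shoes?"],
    related_searches=None,
    stats=None,
)
print(summarize(fake))
```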

## Scraping page content

Pass `scrape` to fetch the page content of the top organic results alongside the
SERP. Omit it (the default) for a search-only request. Each scraped page is billed
on top of the search.

```python
from serpcheap import SerpCheap, ScrapeOptions

client = SerpCheap("KEY")
r = client.search(
    q="best running shoes",
    scrape=ScrapeOptions(
        render_js=False,   # render with a headless browser first (JS-heavy sites)
        screenshot=False,  # capture a full-page screenshot (48h presigned URL)
        top_n=5,           # how many top organic results to scrape (1..20)
        wait_for=None,     # CSS selector to wait for (render_js only)
        wait_ms=None,      # extra settle time in ms, 0..5000 (render_js only)
        screenshot_width=None,   # screenshot viewport width in px (default 1920, max 1920)
        screenshot_height=None,  # screenshot viewport height in px (default 1080, max 1920)
    ),
)

for result in r.organic:
    if result.scrape_error:
        print(result.position, "failed:", result.scrape_error)
    else:
        print(result.position, result.content)  # markdown
        print(result.screenshot_url)             # set when screenshot=True
```
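The limits in the comments above can be checked locally before spending credits. This is a minimal sketch of such a pre-flight check. `check_scrape_options` is a hypothetical helper, not part of the client, and whether the API rejects or silently ignores `wait_for` / `wait_ms` without `render_js` is an assumption here.

```python
def check_scrape_options(render_js=False, top_n=5, wait_for=None, wait_ms=None,
                         screenshot_width=None, screenshot_height=None):
    """Return a list of problems found in the options (empty when none)."""
    problems = []
    if not 1 <= top_n <= 20:
        problems.append("top_n must be in 1..20")
    if wait_ms is not None and not 0 <= wait_ms <= 5000:
        problems.append("wait_ms must be in 0..5000")
    # Assumption: the wait options have no effect unless render_js is set.
    if (wait_for is not None or wait_ms is not None) and not render_js:
        problems.append("wait_for and wait_ms only apply with render_js=True")
    for name, value in (("screenshot_width", screenshot_width),
                        ("screenshot_height", screenshot_height)):
        if value is not None and value > 1920:
            problems.append(f"{name} must be at most 1920")
    return problems


print(check_scrape_options(top_n=25, wait_ms=100))
```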

## Scraping a single page

`client.scrape()` fetches one URL and extracts it as markdown, independent of any search:

```python
from serpcheap import SerpCheap

client = SerpCheap("KEY")
r = client.scrape(
    "https://example.com",
    render_js=False,          # render with a headless browser first (JS-heavy sites)
    screenshot=False,         # capture a screenshot (1920x1080 by default, 48h presigned URL)
    wait_for=None,            # CSS selector to wait for (render_js only)
    wait_ms=None,             # extra settle time in ms, 0..5000 (render_js only)
    screenshot_width=None,    # screenshot width in px (default 1920, max 1920)
    screenshot_height=None,   # screenshot height in px (default 1080, max 1920)
)

print(r.url)             # the requested URL
print(r.status)          # upstream HTTP status | None
print(r.title)           # page title | None
print(r.content)         # markdown | None
print(r.content_text)    # plain text | None
print(r.screenshot_url)  # 48h presigned URL, set when screenshot=True
r.stats                  # ScrapeStats(balance, cost) | None
```

## Rank tracking

`rank` scans Google result pages for a keyword and reports where a domain or URL ranks:

```python
from serpcheap import SerpCheap

client = SerpCheap("KEY")
r = client.rank(
    "example.com",            # domain or full URL to locate
    "best running shoes",     # the keyword
    gl="us",                  # country, default "us"
    hl=None,                  # UI language (optional)
    tbs=None,                 # time filter: qdr:h / qdr:d / qdr:w (optional)
    pages=1,                  # how many result pages to scan (1..10), billed per page
    match_type="domain",      # "domain" (any result on the domain) or "exact" (identical URL)
)

print(r.found)            # bool
print(r.rank)             # absolute 1-based rank | None
for m in r.matches:       # list[RankMatch]
    print(m.rank, m.page, m.position_on_page, m.link)
r.organic                 # list[OrganicResult] across scanned pages
r.pages_scanned           # int
r.partial                 # True when some pages failed
r.pages_failed            # list[int]
r.stats                   # RankStats(balance, cost, pages_cached, pages_fresh) | None
```
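A "not found" result is only conclusive when every page was scanned, so `found` is best read together with `partial`. The sketch below shows one way to combine them. `describe_rank` is a hypothetical helper, and a `SimpleNamespace` stands in for the real rank response.

```python
from types import SimpleNamespace


def describe_rank(r) -> str:
    """Turn a rank response into a one-line summary."""
    if r.found:
        text = f"ranked #{r.rank}"
    else:
        text = f"not found in {r.pages_scanned} scanned page(s)"
    if r.partial:
        # Some pages failed, so a miss may simply mean the page was not seen.
        failed = ", ".join(str(p) for p in r.pages_failed)
        text += f" (incomplete: page(s) {failed} failed)"
    return text


# Stand-in using the attribute names listed above.
fake = SimpleNamespace(found=False, rank=None, pages_scanned=3,
                       partial=True, pages_failed=[2])
print(describe_rank(fake))
```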

## Paginating

`search_pages` lazily yields pages and stops on the first empty page:

```python
for page in client.search_pages(q="best running shoes", from_=1, to=5):
    for result in page.organic:
        print(result.position, result.title)
```
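The lazy, stop-on-empty behavior described above can be sketched with a plain generator. This is an illustration of the pattern, not the client's actual implementation: `iter_pages` and `fetch` are hypothetical names, and treating `to` as inclusive is an assumption.

```python
from typing import Callable, Iterator, List


def iter_pages(fetch: Callable[[int], List[str]],
               from_: int = 1, to: int = 5) -> Iterator[List[str]]:
    """Yield result pages one at a time, stopping at the first empty page."""
    for page in range(from_, to + 1):  # assumption: `to` is inclusive
        results = fetch(page)          # one request per page, made on demand
        if not results:
            return                     # empty page: later pages are not fetched
        yield results


calls = []


def fake_fetch(page: int) -> List[str]:
    """Stand-in for a network call: pages 1 and 2 have results, the rest are empty."""
    calls.append(page)
    return [f"result {page}.1"] if page <= 2 else []


pages = list(iter_pages(fake_fetch, from_=1, to=5))
print(pages, calls)  # pages 4 and 5 are never requested
```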

## Error handling

Every failure raises a typed `SerpCheapError`:

```python
from serpcheap import SerpCheap, SerpCheapError

client = SerpCheap("KEY")
try:
    r = client.search(q="best running shoes")
except SerpCheapError as e:
    print(e.code)            # e.g. "insufficient_credits", "rate_limited"
    print(e.status)          # HTTP status, if any
    print(e.retry_after_ms)  # set on rate_limited
    print(e.retryable)       # True for transient errors
```
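The attributes above are enough to decide what to do once an error reaches your code. A minimal sketch follows. `next_step` and `FakeError` are hypothetical, with `FakeError` only mimicking the attribute names shown above so the example runs offline.

```python
class FakeError(Exception):
    """Stand-in with the attributes shown above (not the real SerpCheapError)."""

    def __init__(self, code, status=None, retry_after_ms=None, retryable=False):
        super().__init__(code)
        self.code = code
        self.status = status
        self.retry_after_ms = retry_after_ms
        self.retryable = retryable


def next_step(err) -> str:
    """Map an error to a coarse action for the caller."""
    if err.retryable:
        if err.retry_after_ms is not None:
            return f"retry in {err.retry_after_ms} ms"
        return "retry later"
    if err.code == "insufficient_credits":
        return "top up credits"
    return "give up"


print(next_step(FakeError("rate_limited", status=429,
                          retry_after_ms=1500, retryable=True)))
```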

## Retries & timeouts

```python
client = SerpCheap("KEY", timeout_ms=5000, max_retries=2)
```

Transient errors (`rate_limited`, `too_many_concurrent_requests`,
`service_temporarily_unavailable`, `result_timeout`, `client_timeout`,
`network_error`) are retried automatically, up to `max_retries` times. The backoff
honors a server-provided `retry_after_ms`; otherwise it grows exponentially
(capped at 2s). Each request is bounded by `timeout_ms`.
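The backoff rule can be sketched as a small function. This is an illustration under stated assumptions, not the client's code: the 250 ms base delay, the 0-based `attempt` counter, and leaving the server-provided value uncapped are all guesses, and only the 2s cap and the `retry_after_ms` precedence come from the description above.

```python
from typing import Optional


def backoff_ms(attempt: int, retry_after_ms: Optional[int] = None,
               base_ms: int = 250, cap_ms: int = 2000) -> int:
    """Delay before retry number `attempt` (0-based), in milliseconds."""
    if retry_after_ms is not None:
        return retry_after_ms  # the server's hint takes precedence
    # Exponential growth: base, 2x base, 4x base, ... capped at cap_ms.
    return min(cap_ms, base_ms * 2 ** attempt)


print([backoff_ms(a) for a in range(5)])  # [250, 500, 1000, 2000, 2000]
```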

## Development

Run the test suite with enforced coverage (no install needed; tests put `src`
on `sys.path`):

```bash
python3 -m coverage run --source=src/serpcheap -m unittest discover -s tests -v
python3 -m coverage report --fail-under=95
```

The parity suite (`tests/test_parity.py`) cross-checks the client against the
shared `contract/mockserver` (requires `node`); the unit suites use an
in-process `http.server` and have no external dependencies.
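For readers unfamiliar with the in-process server pattern, the sketch below shows its general shape using only the standard library. It is not taken from this project's test suite: `StubHandler`, `fetch_from_stub`, and the JSON payload are made up for illustration.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class StubHandler(BaseHTTPRequestHandler):
    """Answers every GET with a fixed JSON body."""

    def do_GET(self):
        body = json.dumps({"organic": []}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep test output quiet


def fetch_from_stub() -> dict:
    """Start the stub on a free port, make one request, then shut it down."""
    server = HTTPServer(("127.0.0.1", 0), StubHandler)  # port 0: OS picks a port
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        # An empty ProxyHandler bypasses any proxy configured in the environment.
        opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
        url = f"http://127.0.0.1:{server.server_port}/"
        with opener.open(url, timeout=5) as resp:
            return json.loads(resp.read())
    finally:
        server.shutdown()
        server.server_close()


print(fetch_from_stub())
```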
