Metadata-Version: 2.4
Name: openserp
Version: 0.1.2
Summary: Python SDK for the OpenSERP self-hosted server and OpenSERP Cloud.
Project-URL: Homepage, https://openserp.org
Project-URL: Documentation, https://openserp.org/docs
Project-URL: Repository, https://github.com/karust/openserp
Project-URL: Issues, https://github.com/karust/openserp/issues
Project-URL: Changelog, https://github.com/karust/openserp/releases
Project-URL: Chat (Telegram), https://t.me/openserp_cloud
Author: OpenSERP
License-Expression: MIT
Keywords: agent,agent-tools,ai,ai-grounding,anthropic,baidu,bing,duckduckgo,ecosia,google,google-search,google-search-api,keyword-research,llm,llm-tools,mcp,openai,openserp,rag,rank-tracker,rank-tracking,scraper,scraping,search,search-api,seo,serp,serp-api,serpapi,serpapi-alternative,web-scraping,yandex
Classifier: Development Status :: 3 - Alpha
Classifier: Framework :: AsyncIO
Classifier: Framework :: Pydantic :: 2
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Information Technology
Classifier: Intended Audience :: Science/Research
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: Implementation :: CPython
Classifier: Topic :: Internet :: WWW/HTTP :: Indexing/Search
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Typing :: Typed
Requires-Python: >=3.10
Requires-Dist: httpx<1,>=0.27
Requires-Dist: pydantic<3,>=2.7
Provides-Extra: dev
Requires-Dist: build>=1.2; extra == 'dev'
Requires-Dist: mypy>=1.10; extra == 'dev'
Requires-Dist: pytest-asyncio>=0.23; extra == 'dev'
Requires-Dist: pytest>=8.2; extra == 'dev'
Requires-Dist: respx>=0.21; extra == 'dev'
Requires-Dist: ruff>=0.5; extra == 'dev'
Provides-Extra: pandas
Requires-Dist: pandas>=2.0; extra == 'pandas'
Description-Content-Type: text/markdown

# openserp

[![PyPI version](https://img.shields.io/pypi/v/openserp.svg)](https://pypi.org/project/openserp/)
[![Python versions](https://img.shields.io/pypi/pyversions/openserp.svg)](https://pypi.org/project/openserp/)
[![License](https://img.shields.io/pypi/l/openserp.svg)](https://github.com/karust/openserp/blob/master/LICENSE)

```bash
pip install openserp
```

Cloud:

```python
import os
from openserp import OpenSERP

client = OpenSERP(api_key=os.environ["OPENSERP_KEY"])
resp = client.search(engine="google", text="openserp")

print(resp.results[0].title, resp.results[0].url)
```

Self-hosted:

```python
from openserp import OpenSERP

client = OpenSERP(base_url="http://localhost:7000")
resp = client.search(engine="bing", text="openserp")

print(resp.results[0].title, resp.results[0].url)
```

Python SDK for the **OpenSERP** multi-engine SERP API — Google, Bing, Yandex, Baidu, DuckDuckGo, and Ecosia results in a single call. Works against the self-hosted [open-source server](https://github.com/karust/openserp) and against [OpenSERP Cloud](https://openserp.org/cloud) with the same code.

Use it for AI grounding, RAG pipelines, LLM tool use, agent tool use, LangChain / LlamaIndex integrations, SEO rank tracking, competitor analysis, and search-powered automations. It is an open-source alternative to SerpAPI, DataForSEO, ScrapingBee, Bright Data SERP, Oxylabs SERP, and Zenserp.

> Also available for TypeScript / JavaScript: [`@openserp/sdk`](https://www.npmjs.com/package/@openserp/sdk) ([source](https://github.com/karust/openserp/tree/master/integrations/sdk-js)).

> **Alpha — the API may change before `1.0.0`.** Pin a version in production.

## Contents

- [Install](#install)
- [Quickstart — OSS (self-hosted)](#quickstart--oss-self-hosted)
- [Quickstart — Cloud](#quickstart--cloud)
- [Why two backends?](#why-two-backends)
- [Search](#search)
- [Images](#images)
- [Async](#async)
- [Endpoint availability](#endpoint-availability)
- [Telemetry](#telemetry)
- [Error handling](#error-handling)
- [Retry hook](#retry-hook)
- [Use cases](#use-cases)
- [Development](#development)

## Install

```bash
pip install openserp
```

DataFrame export is an optional extra:

```bash
pip install "openserp[pandas]"
```

Requires Python 3.10+.

## Quickstart — OSS (self-hosted)

Run the open-source server locally, no API key required:

```bash
docker run -p 7000:7000 karust/openserp serve
```

```python
from openserp import OpenSERP

client = OpenSERP(base_url="http://localhost:7000")

resp = client.search(
    engine="google",
    text="openserp",
    limit=10,
    region="US",
)

print(resp.results[0].title, resp.results[0].url)
```

With no arguments, `OpenSERP()` targets `http://localhost:7000`.

## Quickstart — Cloud

Get an API key at the [dashboard](https://openserp.org/dashboard/keys). When `api_key` is set, the SDK defaults `base_url` to `https://api.openserp.org/v1` and sends `Authorization: Bearer ...` for you.

```python
import os
from openserp import OpenSERP

client = OpenSERP(api_key=os.environ["OPENSERP_KEY"])

resp = client.search(engine="google", text="openserp")

print(resp.results[0].title)
print(client.last_response.credits)  # CreditInfo(used=..., remaining=...)
```

If both `base_url` and `api_key` are set, `base_url` wins and the key is still sent. Use this for an authenticated self-hosted deployment.
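
The precedence described in the two quickstarts can be summarised as a small sketch. The function names `resolve_base_url` and `auth_headers` are hypothetical and only mirror the documented behaviour; they are not the SDK's internal code.

```python
from typing import Dict, Optional

# Defaults quoted from this README.
OSS_DEFAULT = "http://localhost:7000"
CLOUD_DEFAULT = "https://api.openserp.org/v1"


def resolve_base_url(base_url: Optional[str] = None, api_key: Optional[str] = None) -> str:
    """Hypothetical helper: explicit base_url first, then Cloud, then local OSS."""
    if base_url:
        return base_url
    if api_key:
        return CLOUD_DEFAULT
    return OSS_DEFAULT


def auth_headers(api_key: Optional[str] = None) -> Dict[str, str]:
    """Hypothetical helper: the key is sent whenever it is set, whatever the base URL."""
    return {"Authorization": f"Bearer {api_key}"} if api_key else {}


print(resolve_base_url())
print(resolve_base_url(api_key="example-key"))
print(resolve_base_url(base_url="https://serp.internal.example", api_key="example-key"))
```

The last call corresponds to the authenticated self-hosted case: the custom URL is used and the bearer header is still attached.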

## Why two backends?

OpenSERP Cloud exposes the same HTTP contract as the OSS server, with a `/v1/` prefix and bearer auth. The same SDK call works on both; you only change `base_url` / `api_key`. Start with OSS for free, and move to Cloud when you need managed proxies, captcha handling, and pre-warmed routing without the operational work. See [openserp.org/docs/oss-vs-cloud](https://openserp.org/docs/oss-vs-cloud) for the full comparison.

## Search

```python
single = client.search(engine="bing", text="golang", limit=10, region="US")

mega = client.mega_search(
    text="golang",
    engines=["google", "bing", "yandex"],
    mode="balanced",
    limit=20,
)

fast = client.fast_search(text="golang", engines=["google", "bing"])
any_ = client.any_search(text="golang", engines=["google", "yandex"])
```

`mega_search` aggregates multiple engines. `mode` is `"balanced"` (default, merged and deduplicated), `"any"` (first successful engine wins), or `"fast"` (engines reordered by recent health). `fast_search` and `any_search` are shorthand for `mega_search` with `mode="fast"` and `mode="any"`, respectively.
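
To make "merged and deduplicated" concrete, here is a rough client-side illustration that merges several result lists and drops repeated URLs. This README does not describe the server's actual merge or ranking logic, so treat `merge_dedupe` as a hypothetical helper; the stand-in objects carry only the `title` and `url` attributes used elsewhere in this document.

```python
from types import SimpleNamespace


def merge_dedupe(*result_lists):
    """Hypothetical helper: concatenate result lists, keeping the first hit per URL."""
    seen = set()
    merged = []
    for results in result_lists:
        for item in results:
            key = item.url.rstrip("/")  # treat a trailing slash as the same page
            if key not in seen:
                seen.add(key)
                merged.append(item)
    return merged


# Stand-in data, not real engine output.
google = [SimpleNamespace(title="Go", url="https://go.dev/")]
bing = [
    SimpleNamespace(title="The Go Programming Language", url="https://go.dev"),
    SimpleNamespace(title="Go (Wikipedia)", url="https://en.wikipedia.org/wiki/Go_(programming_language)"),
]

for item in merge_dedupe(google, bing):
    print(item.title, item.url)
```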

## Images

```python
images = client.image(engine="bing", text="golang logo", limit=20)

mega_images = client.mega_image(text="golang logo", engines=["bing", "google"])
```

## Async

```python
import asyncio
import os
from openserp import AsyncOpenSERP


async def main() -> None:
    async with AsyncOpenSERP(api_key=os.environ["OPENSERP_KEY"]) as client:
        resp = await client.search(engine="google", text="openserp")
        print(resp.results[0].title)


asyncio.run(main())
```

Run hundreds of queries concurrently with a semaphore:

```python
import asyncio
from openserp import AsyncOpenSERP


async def main() -> None:
    sem = asyncio.Semaphore(20)
    queries = [f"keyword {i}" for i in range(500)]

    async with AsyncOpenSERP() as client:
        async def run(query: str):
            async with sem:
                return await client.search(engine="google", text=query, limit=10)

        responses = await asyncio.gather(*(run(q) for q in queries))
        print(len(responses))


asyncio.run(main())
```
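
With plain `asyncio.gather`, the first exception propagates and the other results are lost to the caller. A minimal sketch of one way to keep partial results is shown here; `fake_search` is a stand-in for `client.search` so the example runs without a server, and `run_all` is a hypothetical helper name.

```python
import asyncio


async def fake_search(query: str) -> str:
    """Stand-in for `client.search`; fails on one query to show error capture."""
    await asyncio.sleep(0)
    if query == "keyword 3":
        raise RuntimeError("simulated upstream failure")
    return f"results for {query}"


async def run_all(queries, limit: int = 20):
    sem = asyncio.Semaphore(limit)  # caps the number of in-flight requests

    async def run(query: str):
        async with sem:
            return await fake_search(query)

    # return_exceptions=True returns exceptions as values instead of raising.
    outcomes = await asyncio.gather(*(run(q) for q in queries), return_exceptions=True)
    ok = [o for o in outcomes if not isinstance(o, Exception)]
    failed = [(q, o) for q, o in zip(queries, outcomes) if isinstance(o, Exception)]
    return ok, failed


ok, failed = asyncio.run(run_all([f"keyword {i}" for i in range(5)]))
print(len(ok), "succeeded;", len(failed), "failed")
```

`asyncio.gather` preserves input order, which is what lets `zip` pair each failure with its query.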

## Endpoint availability

OSS-only operational methods raise `OssOnlyError` when the client is configured for Cloud:

```python
client.parse_google(html="<html>...</html>")
client.stats()
client.health()
```

Cloud-only account methods raise `CloudOnlyError` when the client is configured for OSS:

```python
client.me()
client.pricing()
client.engines_status()
client.engines_capabilities()
```

The backend is inferred from `base_url` and `api_key`. Pass `backend="oss"` or `backend="cloud"` to the constructor to override.

## Telemetry

`client.last_response` is updated after every HTTP response:

```python
client.last_response.credits          # Cloud — CreditInfo(used, remaining)
client.last_response.engine_used      # both — X-Engine-Used
client.last_response.fallback_engine  # OSS only
client.last_response.cache            # OSS only
client.last_response.headers          # raw response headers (lower-cased)
```

`fallback_engine` and `cache` are deliberately hidden by Cloud, so expect them to be `None` against `api.openserp.org`. `credits` is the inverse: only present when talking to Cloud.
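
Because these fields may be `None` depending on the backend, code that runs against both can read them defensively. The helper below is a sketch: `remaining_credits` is a hypothetical name, and the `SimpleNamespace` objects are stand-ins for a real client so the example runs offline.

```python
from types import SimpleNamespace
from typing import Optional


def remaining_credits(client) -> Optional[int]:
    """Hypothetical helper: remaining Cloud credits, or None when unavailable."""
    last = getattr(client, "last_response", None)  # may be None before the first call
    credits = getattr(last, "credits", None)       # expected to be None against OSS
    return getattr(credits, "remaining", None)


# Stand-ins shaped like the attributes documented above.
cloud = SimpleNamespace(
    last_response=SimpleNamespace(credits=SimpleNamespace(used=1, remaining=99))
)
oss = SimpleNamespace(last_response=SimpleNamespace(credits=None))
fresh = SimpleNamespace(last_response=None)

print(remaining_credits(cloud), remaining_credits(oss), remaining_credits(fresh))
```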

## Error handling

```python
from openserp import OpenSERP, RateLimitError, CaptchaError, SERPError

client = OpenSERP(api_key="...")

try:
    client.search(engine="google", text="openserp")
except RateLimitError:
    # slow down or queue the request
    ...
except CaptchaError:
    # OSS may surface upstream captcha challenges; on Cloud this is handled for you
    ...
except SERPError as err:
    print(err.status, err.code, err.reason, err.request_id)
```

## Retry hook

The SDK applies no retry policy by default. Pass a `retry` hook when you want one. In this example the hook decides whether to retry and sleeps before returning `True`:

```python
import os
import random
import time
from openserp import OpenSERP, SERPError

RETRYABLE = {408, 429, 500, 502, 503}
client: OpenSERP  # assigned below; the hook reads client.last_response at call time


def should_retry(err: Exception, attempt: int) -> bool:
    if attempt >= 3 or not isinstance(err, SERPError) or err.status not in RETRYABLE:
        return False
    headers = client.last_response.headers if client.last_response else {}
    retry_after = float(headers.get("retry-after", 0) or 0)
    wait = retry_after or min(2 ** attempt * 0.25, 8.0)
    time.sleep(wait + random.random() * 0.25)
    return True


client = OpenSERP(api_key=os.environ["OPENSERP_KEY"], retry=should_retry)
client.search(engine="google", text="openserp")
```
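
The wait computation in the hook above can be pulled into pure functions so it is easy to unit-test. This is a sketch with hypothetical helper names. It also tolerates a `Retry-After` header in HTTP-date form, which a bare `float()` call would reject with `ValueError`. Whether `attempt` starts at 0 or 1 is not stated here, so the printed values are only illustrative.

```python
from typing import Optional


def parse_retry_after(value: Optional[str]) -> Optional[float]:
    """Return Retry-After in seconds, or None if absent, zero, or not numeric."""
    try:
        seconds = float(value) if value else 0.0
    except ValueError:
        return None  # HTTP-date form is not handled in this sketch
    return seconds if seconds > 0 else None


def backoff_delay(attempt: int, retry_after: Optional[str] = None) -> float:
    """Prefer the server's hint; otherwise exponential backoff capped at 8 seconds."""
    hinted = parse_retry_after(retry_after)
    if hinted is not None:
        return hinted
    return min(2 ** attempt * 0.25, 8.0)


for attempt in range(4):
    print(attempt, backoff_delay(attempt))
```

Jitter is left out of the helper on purpose so the result stays deterministic; add it at the call site, as the hook above does.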

## Use cases

- **AI grounding / RAG** — feed top-N results into an LLM prompt (OpenAI, Anthropic, Ollama) for up-to-date answers.
- **LLM tool use** — expose `client.search` as a tool to your agent.
- **SEO monitoring** — daily rank tracking across multiple engines and regions, export to a DataFrame or Sheets.
- **Competitor analysis** — weekly diff of top-10 results for a keyword set.
- **Data pipelines** — stream SERPs to ClickHouse, BigQuery, or a DataFrame for NLP on snippets.

Quick SEO rank report with `pandas`:

```python
import pandas as pd
from openserp import OpenSERP

client = OpenSERP()
keywords = ["openserp", "serp api", "google search api"]
frames = []

for keyword in keywords:
    resp = client.search(engine="google", text=keyword, region="US", limit=10)
    frame = resp.to_pandas()
    frame["keyword"] = keyword
    frames.append(frame)

pd.concat(frames, ignore_index=True).to_csv("rank-report.csv", index=False)
```
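
For the AI grounding / RAG use case listed above, one simple approach is to turn the top results into a numbered context block for a prompt. The helper `build_context` is hypothetical and relies only on the `title` and `url` attributes shown in this README; the sample data is a stand-in for `resp.results`.

```python
from types import SimpleNamespace


def build_context(results, top_n: int = 3) -> str:
    """Hypothetical helper: format the top-N results as numbered source lines."""
    lines = [
        f"[{i}] {item.title} ({item.url})"
        for i, item in enumerate(results[:top_n], start=1)
    ]
    return "\n".join(lines)


# Stand-in for resp.results so the example runs without a server.
sample = [
    SimpleNamespace(title="OpenSERP", url="https://openserp.org"),
    SimpleNamespace(title="karust/openserp", url="https://github.com/karust/openserp"),
]

prompt = "Answer using only these sources:\n" + build_context(sample)
print(prompt)
```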

## Development

```bash
python -m pip install -e ".[dev,pandas]"
pytest
ruff check .
mypy src
python -m build
```
