Metadata-Version: 2.4
Name: raysearch
Version: 0.1.2
Summary: Omni meta-search engine for agentic AI.
Author-email: Kotodama <jameswjj0416@gmail.com>
License-Expression: Apache-2.0
Keywords: search,serp,research,rag,crawler,llm
Classifier: Development Status :: 3 - Alpha
Classifier: Framework :: AsyncIO
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Topic :: Internet :: WWW/HTTP :: Indexing/Search
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: <3.14,>=3.11
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: anyio==4.9.0
Requires-Dist: httpx<1,>=0.27
Requires-Dist: jsonschema>=4.23
Requires-Dist: pydantic<3,>=2.7
Requires-Dist: jieba
Requires-Dist: pyyaml
Requires-Dist: typing_extensions<5,>=4.10
Provides-Extra: extract
Requires-Dist: beautifulsoup4; extra == "extract"
Requires-Dist: selectolax>=0.3.26; extra == "extract"
Requires-Dist: trafilatura>=1.10.0; extra == "extract"
Requires-Dist: html-to-markdown>=2.28.0; extra == "extract"
Requires-Dist: chardet>=5.2.0; extra == "extract"
Provides-Extra: extract-pdf
Requires-Dist: pymupdf>=1.24.0; extra == "extract-pdf"
Requires-Dist: pymupdf4llm>=0.3.4; extra == "extract-pdf"
Requires-Dist: pypdf>=5.2.0; extra == "extract-pdf"
Provides-Extra: extract-plus
Requires-Dist: inscriptis>=2.5.0; extra == "extract-plus"
Provides-Extra: crawl
Requires-Dist: curl_cffi; extra == "crawl"
Requires-Dist: playwright>=1.49.1; extra == "crawl"
Requires-Dist: beautifulsoup4; extra == "crawl"
Provides-Extra: rank
Requires-Dist: rank-bm25; extra == "rank"
Requires-Dist: scikit-learn>=1.8.0; extra == "rank"
Requires-Dist: sentence-transformers>=5.2.3; extra == "rank"
Provides-Extra: cache
Requires-Dist: aiosqlite>=0.22.1; extra == "cache"
Requires-Dist: aioredis>=2.0.1; extra == "cache"
Requires-Dist: aiomysql>=0.3.2; extra == "cache"
Requires-Dist: asyncmy>=0.2.11; extra == "cache"
Requires-Dist: sqlalchemy>=2.0.46; extra == "cache"
Provides-Extra: api
Requires-Dist: fastapi<1,>=0.110; extra == "api"
Requires-Dist: uvicorn[standard]<1,>=0.27; extra == "api"
Provides-Extra: overview
Requires-Dist: openai>=2.17.0; extra == "overview"
Requires-Dist: google-genai>=1.63.0; extra == "overview"
Requires-Dist: dashscope>=1.0.0; extra == "overview"
Provides-Extra: to-zh-tw
Requires-Dist: opencc>=1.2.0; extra == "to-zh-tw"
Provides-Extra: tokenize-ja
Requires-Dist: sudachipy>=0.6.8; extra == "tokenize-ja"
Provides-Extra: stopwords
Requires-Dist: marisa-trie>=1.3.1; extra == "stopwords"
Provides-Extra: tracking
Requires-Dist: structlog>=24.1.0; extra == "tracking"
Provides-Extra: service
Requires-Dist: raysearch[api,crawl,extract,extract_pdf,overview,rank,tracking]; extra == "service"
Provides-Extra: full
Requires-Dist: raysearch[api,cache,crawl,extract,extract_pdf,overview,rank,to_zh_tw,tokenize_ja,tracking]; extra == "full"
Dynamic: license-file

![cover-v5-optimized](./images/GitHub_README.png)

<p align="center">
  <a href="./API.md">API Documentation</a> ·
  <a href="./docs/zh-TW/API.md">API 文檔</a> ·
  <a href="./docs/zh-CN/API.md">API 文档</a> ·
  <a href="./docs/ja-JP/API.md">API ドキュメント</a>
</p>

<p align="center">
  <a href="./README.md"><img alt="README in English" src="https://img.shields.io/badge/English-d9d9d9"></a>
  <a href="./docs/zh-TW/README.md"><img alt="Traditional Chinese README" src="https://img.shields.io/badge/Traditional_Chinese-d9d9d9"></a>
  <a href="./docs/zh-CN/README.md"><img alt="Simplified Chinese README" src="https://img.shields.io/badge/Simplified_Chinese-d9d9d9"></a>
  <a href="./docs/ja-JP/README.md"><img alt="Japanese README" src="https://img.shields.io/badge/Japanese-d9d9d9"></a>
</p>

# RaySearch

RaySearch is an async-first search orchestration engine for AI-overview-style workflows. It can be used as:

- a Python package, via the `Engine` class
- a personal HTTP API service
- a Docker or Docker Compose deployment

## Core Capabilities

- `search`: multi-provider retrieval with optional fetch and rerank stages
- `fetch`: page crawling, extraction, abstracts, overview generation, and related links
- `answer`: grounded answer generation with citations
- `research`: multi-round research reports with synthesis and structured output

## Start With Docker Compose

```bash
git clone https://github.com/radiata-labs/raysearch.git
cd raysearch/docker
cp .env.example .env
```

Then edit `.env` with your own values. At minimum:

- set `RAYSEARCH_API_KEY`
- set `OPENAI_API_KEY` if you want `answer`, `research`, or LLM-powered overview features
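A minimal `docker/.env` covering the two variables above might look like this (the values are placeholders, replace them with your own):

```bash
# docker/.env — placeholder values
RAYSEARCH_API_KEY=change-me        # required: clients authenticate with this token
OPENAI_API_KEY=sk-your-key-here    # optional: needed for answer, research, and overview
```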

Start the service:

```bash
docker compose up -d
```

For development with hot reload:

```bash
docker compose -f docker-compose.yml -f docker-compose.override.yml up
```

The compose setup reads:

- environment from `docker/.env`
- service config from `docker/raysearch.example.yaml`

If you want a separate config file, copy `docker/raysearch.example.yaml` and point `RAYSEARCH_CONFIG_FILE` to it in `.env`.
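A separate config might start as small as this sketch (only `api.bearer_token` is documented here; copy the rest of the schema from `docker/raysearch.example.yaml` rather than guessing keys):

```yaml
# my-raysearch.yaml — begin from a copy of docker/raysearch.example.yaml
api:
  bearer_token: "change-me"   # or supply RAYSEARCH_API_KEY instead
```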

## Start With uv

From the repository root:

```bash
uv run --extra service --env-file docker/.env raysearch-api --config docker/raysearch.example.yaml
```

Or from inside the `docker` directory:

```bash
uv run --project .. --extra service --env-file .env raysearch-api --config raysearch.example.yaml
```

## Python Package Usage

```python
import asyncio

from raysearch import Engine, SearchRequest

async def main() -> None:
    async with Engine.from_settings("docker/raysearch.example.yaml") as engine:
        response = await engine.search(
            SearchRequest(
                query="latest multimodal model papers",
                user_location="US",
                mode="deep",
                max_results=8,
                fetchs={"content": True},
            )
        )
        print(response.results)

asyncio.run(main())
```

## Personal API

The personal API exposes:

- `GET /healthz`
- `POST /v1/search`
- `POST /v1/fetch`
- `POST /v1/answer`
- `POST /v1/research`

Authentication uses a single shared static bearer token, read from `api.bearer_token` in the config file or from the `RAYSEARCH_API_KEY` environment variable. The service is intended for a single user or a private, trusted environment.
