Metadata-Version: 2.4
Name: wide-research
Version: 0.2.1
Summary: Spawn thousands of sub-agents in parallel. Pass in strings, files, or folders, get back structured data and generated artifacts.
Project-URL: Homepage, https://github.com/datacurve-ai/wide-research
Project-URL: Repository, https://github.com/datacurve-ai/wide-research
Project-URL: Issues, https://github.com/datacurve-ai/wide-research/issues
Author: Datacurve
License: MIT License
        
        Copyright (c) 2026 Datacurve
        
        Permission is hereby granted, free of charge, to any person obtaining a copy
        of this software and associated documentation files (the "Software"), to deal
        in the Software without restriction, including without limitation the rights
        to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
        copies of the Software, and to permit persons to whom the Software is
        furnished to do so, subject to the following conditions:
        
        The above copyright notice and this permission notice shall be included in all
        copies or substantial portions of the Software.
        
        THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
        IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
        FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
        AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
        LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
        OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
        SOFTWARE.
License-File: LICENSE
Keywords: agent,batch,map,modal,openai-agents,sandbox
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Requires-Python: >=3.11
Requires-Dist: anyio>=4.4
Requires-Dist: jinja2>=3.1
Requires-Dist: modal>=1.3
Requires-Dist: openai-agents>=0.14
Requires-Dist: openai>=1.40
Requires-Dist: pydantic>=2.7
Requires-Dist: python-dotenv>=1.0
Requires-Dist: rich>=13.7
Requires-Dist: tomli-w>=1.0
Requires-Dist: typer>=0.12
Provides-Extra: test
Requires-Dist: pytest>=8.0; extra == 'test'
Description-Content-Type: text/markdown

# wide-research

Spawn thousands of sub-agents in parallel. Pass in a list of strings, files,
or folders and get back structured data and generated artifacts.

Each subtask gets its own Modal sandbox and its own OpenAI Agents SDK agent.
The agent has shell, file-edit, and web-search tools, plus a single
`submit(...)` call that ends the task. Results come back as JSON/CSV rows
plus any files or folders the agent wrote.

## Install

### Add the skill

```bash
npx skills add datacurve-ai/wide-research
```

This is the primary way to use wide-research from an agent. The skill teaches
the agent when to fan out work, how to write job configs, and how to collect
the outputs.

### Install the CLI from PyPI

```bash
uv tool install wide-research
wide-research --help         # or the short alias: wr --help
wide-research doctor         # check credentials + paths
```

### From source (for development)

```bash
git clone https://github.com/datacurve-ai/wide-research ~/code/wide-research
cd ~/code/wide-research
uv sync --extra test
source .venv/bin/activate
```

### Credentials

`wide-research doctor` will tell you what's missing. If you use a custom env
file, check it with `wide-research doctor --env-file PATH`. Short version:

- Modal: `modal token set --token-id … --token-secret …`
- `OPENAI_API_KEY` and optional `OPENAI_BASE_URL`, loaded from the first of
  these sources that provides them:
  1. `--env-file PATH`
  2. `./.env` (current directory)
  3. `~/.wide-research/.env` (recommended for global installs)
  4. shell environment
  5. repo `.env` (source checkouts only)

The OpenAI credential and base URL are used by the host-side Agent runner and
are not injected into sandboxes. Use TOML `secrets = [...]` only for
credentials that the sandbox itself should be able to read.
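
As a hedged sketch of what that looks like (whether the entries name
environment variables or Modal secrets is an assumption here; the CONFIG
reference linked below has the real contract):

```toml
# Hypothetical: expose a credential inside each sandbox.
# The entry name below is an example, not a required value.
secrets = ["GITHUB_TOKEN"]
```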

### Where things live

Everything the tool owns is under `~/.wide-research/`:

- `~/.wide-research/.env` — credentials
- `~/.wide-research/runs/` — one subdirectory per run

Override with `WIDE_RESEARCH_HOME=/somewhere/else` (whole base) or
`WIDE_RESEARCH_RUNS_DIR=/somewhere/else` (runs only).
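
For example, to relocate everything onto a bigger disk (the paths below are
illustrative):

```shell
# Move the whole base directory (credentials, runs, cache):
export WIDE_RESEARCH_HOME=/data/wide-research

# ...or redirect only the run directories:
export WIDE_RESEARCH_RUNS_DIR=/data/wr-runs
```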

## Quickstart

`wide-research run` defaults to **spawn-detached + auto-tail**: it starts a
worker process, then follows the live merged log in your terminal. Ctrl+C only
detaches the tail; the worker keeps running. Use `--detach` / `-d` to spawn and
exit immediately, or `--foreground` / `-f` to run in-process for scripts/CI.

```bash
# 1. Create the smallest smoke-test config.
cat > echo.toml <<'EOF'
brief = "Echo each input."
name = "echo_test"
title = "Echo Test"
target_count = 3
inputs = ["alpha", "beta", "gamma"]

prompt_template = """
Return the input exactly, then call submit(success=true, output={"echo": "{{ input }}"}).
"""

[[output_schema]]
name = "echo"
type = "string"
title = "Echo"
description = "The echoed input."
EOF

# 2. Dry-run — validate a config without spawning sandboxes
wide-research run echo.toml --dry-run

# 3. Render just the first prompt
wide-research plan echo.toml --index 0

# 4. Smoke-test, then full run. Use -f / --foreground for scripts/CI:
wide-research run echo.toml --sample 1 --foreground
wide-research run echo.toml               # detaches, then auto-tails
wide-research run echo.toml --detach      # pure detached; wr wait to sync

# 5. Look at past runs
wide-research list
wide-research inspect ~/.wide-research/runs/<dir>
wide-research inspect ~/.wide-research/runs/<dir> --failed-only
```

If you cloned the repository, you can also run the checked-in examples under
[`examples/`](examples/).

## Config shape

TOML. Six required top-level keys + an `output_schema` array-of-tables:

```toml
brief = "Find the current CEO of each company."
name = "find_ceos"
title = "Find CEOs"
target_count = 3
inputs = ["Apple", "Microsoft", "Alphabet"]

prompt_template = """
Research the current CEO of {{ input }}.
Use web_search to verify, then call submit(success=true, output={"ceo": "..."}).
"""

[[output_schema]]
name = "ceo"
type = "string"
title = "CEO"
description = "Verified name of the current CEO."
```
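
`prompt_template` is rendered once per input, with each input substituted for
`{{ input }}`. wide-research depends on `jinja2`, so you can preview the
expansion with plain Jinja (the exact rendering context used internally is an
assumption here; `wide-research plan` shows the real rendered prompts):

```python
from jinja2 import Template

# One template, rendered once per input, exactly as in the config above.
template = Template("Research the current CEO of {{ input }}.")
for company in ["Apple", "Microsoft", "Alphabet"]:
    print(template.render(input=company))
# → Research the current CEO of Apple.
#   ...one rendered prompt per input
```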

See [`skills/wide-research/references/CONFIG.md`](skills/wide-research/references/CONFIG.md)
for the full schema, and [`skills/wide-research/references/PRESETS.md`](skills/wide-research/references/PRESETS.md)
for ready-to-paste `[resources]` / `[image]` blocks (research / coding / DinD / GPU).

## Design docs

- [`docs/agent-design.md`](docs/agent-design.md) — harness/compute split,
  the SDK-native tool set, and how the loop terminates.
- [`docs/file-io.md`](docs/file-io.md) — how files and directories move in
  and out of sandboxes (`<file>` tags, `mount_files`, `file` / `directory`
  output fields, size limits).
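
The `file` / `directory` output types plug into the same `[[output_schema]]`
shape shown in the config examples; a hedged sketch (the surrounding
semantics, such as size limits, live in `docs/file-io.md`):

```toml
# Hypothetical output field: the agent writes a file and submits its path.
[[output_schema]]
name = "report"
type = "file"
title = "Report"
description = "Markdown report the agent writes and submits."
```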

## Inline / stdin configs

For short configs you'd rather not leave on disk:

```bash
# Inline string
wr run --inline "$(cat <<'EOF'
brief = "tiny inline smoke"
name = "inline_smoke"
title = "Inline Smoke"
prompt_template = "Echo {{ input }}"
target_count = 2
inputs = ["hi", "there"]

[[output_schema]]
name = "echo"
type = "string"
title = "Echo"
description = "echoed"
EOF
)"

# Stdin via '-'
cat my-config.toml | wr run -
wr plan - < my-config.toml
```

## Watching a run

`wr run` prints the run dir + pid and auto-tails by default. For a run started
with `--detach`, or after Ctrl+C detaches the tail, follow along with:

```bash
wr tail ~/.wide-research/runs/<dir>        # watch what each subtask is doing
wr wait ~/.wide-research/runs/<dir>        # block silently; non-zero exit if any fail
tail -f ~/.wide-research/runs/<dir>/worker.log   # low-level worker log

# Stop a run early (tears down every live Modal sandbox first):
wr stop ~/.wide-research/runs/<dir>

# Or, if you'd rather have `run` block the current terminal:
wr run job.toml --foreground
```

## Cost estimates

Run summaries include token usage and best-effort cost estimates. By default,
wide-research fetches the public OpenRouter model catalog from
`https://openrouter.ai/api/v1/models` and caches it under
`~/.wide-research/cache/` for a week. Set `WR_DISABLE_PRICE_FETCH=1` to avoid
that network request, or provide your own prices with `WR_PRICE_TABLE_JSON`.
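
For offline or air-gapped runs, a minimal sketch:

```shell
# Avoid the network request to openrouter.ai for this run:
WR_DISABLE_PRICE_FETCH=1 wr run job.toml
```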

## License

MIT. See [`LICENSE`](LICENSE).

## Publishing

Maintainers can build and validate the package locally with:

```bash
uv build
uvx twine check dist/*
```

Publish with `uv publish` after the package version has been bumped and the
Modal/OpenAI example tests pass with live credentials.

## Layout

See [`AGENTS.md`](AGENTS.md).
