Metadata-Version: 2.4
Name: crowdos-mcp
Version: 0.7.0
Summary: Model Context Protocol server for CrowdOS — synthetic focus groups as agent-callable tools.
Project-URL: Homepage, https://crowdos.ai
Project-URL: Documentation, https://crowdos.ai/developers
Project-URL: Repository, https://github.com/bjnagent/crowd
Project-URL: Issues, https://github.com/bjnagent/crowd/issues
Author-email: CrowdOS <support@crowdos.ai>
License: MIT
Keywords: ai-agents,anthropic,claude,crowdos,focus-group,mcp
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Software Development :: Libraries
Requires-Python: >=3.10
Requires-Dist: httpx>=0.27.0
Requires-Dist: mcp>=1.0.0
Description-Content-Type: text/markdown

# CrowdOS MCP Server

Synthetic focus groups as agent-callable tools. Exposes the
[CrowdOS](https://crowdos.ai) developer API as
[Model Context Protocol](https://modelcontextprotocol.io) tools so AI
agents (Claude Desktop, Cursor, Cline, LangGraph, CrewAI, AutoGPT,
Devin, etc.) can run synthetic public-opinion research with a single
tool call.

## What this gives you

Your agent can now do things like:

```
> Run a focus group on whether companies should mandate 4-day weeks.
   Use 200 agents from the us_general_population preset.

[tool: run_focus_group]
{
  "id": "ad4b3736-...",
  "sentiment_summary": {
    "positive_pct": 71.5, "negative_pct": 18.0, "neutral_pct": 10.5,
    "positive": 143, "negative": 36, "neutral": 21
  },
  "sample_responses": [
    {
      "agent_name": "Maria Chen", "age": 34, "occupation": "Software engineer",
      "sentiment": "positive",
      "reasoning": "It would be great for parents — three full days with the kids ..."
    },
    ...
  ]
}
```

## Installation

```bash
pip install crowdos-mcp
```

Then mint a free sandbox API key at <https://crowdos.ai/developers> —
no credit card required for the free tier (50 agents per study,
5 req/min).

## Configure for Claude Desktop

Edit `~/Library/Application Support/Claude/claude_desktop_config.json`
(macOS) or `%APPDATA%\Claude\claude_desktop_config.json` (Windows):

```json
{
  "mcpServers": {
    "crowdos": {
      "command": "crowdos-mcp",
      "env": {
        "CROWDOS_API_KEY": "crowd_..."
      }
    }
  }
}
```

Restart Claude Desktop. The CrowdOS tools should appear in the
slash-command picker.
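
If you would rather script the edit than change the file by hand, the following is a minimal sketch. It merges a `crowdos` entry into an existing config without disturbing other servers. The helper names `add_crowdos_server` and `update_config_file` are hypothetical and are not shipped with this package.

```python
import json
from pathlib import Path


def add_crowdos_server(config: dict, api_key: str) -> dict:
    """Return a copy of `config` with a `crowdos` entry under `mcpServers`.

    Hypothetical helper for illustration; not part of crowdos-mcp.
    """
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers["crowdos"] = {
        "command": "crowdos-mcp",
        "env": {"CROWDOS_API_KEY": api_key},
    }
    merged["mcpServers"] = servers
    return merged


def update_config_file(path: Path, api_key: str) -> None:
    """Read the JSON config at `path` (if it exists), merge, and write it back."""
    existing = json.loads(path.read_text()) if path.exists() else {}
    path.write_text(json.dumps(add_crowdos_server(existing, api_key), indent=2))
```

Back up the config file first; a malformed JSON file can prevent the host from loading any MCP server.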

## Configure for Cursor

Settings → MCP Servers → Add. Set the command to `crowdos-mcp` and
use the same `env` block as above.

## Configure for Cline (VS Code)

Settings → Cline → MCP Servers → Edit JSON:

```json
{
  "mcpServers": {
    "crowdos": {
      "command": "crowdos-mcp",
      "env": { "CROWDOS_API_KEY": "crowd_..." }
    }
  }
}
```

## Tools exposed

| Tool | What it does | Auth |
|---|---|---|
| `run_focus_group` | Synthetic poll on a topic, returns sentiment + quotes | required |
| `run_debate` | Multi-round synthetic debate, returns convergence + key arguments | required |
| `compare_options` | Synthetic A/B/C/D test on 2-4 text options, returns leaderboard | required |
| `preview_cost` | Estimate cents + seconds before running (free to call) | required |
| `list_demographic_presets` | Discover available audience templates | required |
| `get_simulation` | Fetch full results of a previously-run study | required |
| `crowd_sample` | Browse the public CrowdOS crowd (sanitized) | none |

`run_focus_group` and `run_debate` accept an optional `image_url`
(public HTTPS URL of a JPEG/PNG). One Gemini Flash vision call
extracts a description, which every text-only agent reacts to
alongside the question. ~$0.001 + ~1-3s overhead, regardless of
panel size — useful for testing diagrams, product photos, charts,
screenshots, or political imagery.
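
As a rough illustration, an agent-side helper could check the URL scheme before attaching an image to a call. This is a sketch only: `focus_group_arguments` is a hypothetical name, and using `topic` as an input argument name is an assumption (it appears as a response field in this README; `population_size` and `image_url` are named as inputs).

```python
from typing import Optional
from urllib.parse import urlparse


def focus_group_arguments(topic: str, population_size: int = 200,
                          image_url: Optional[str] = None) -> dict:
    """Build a tool-call argument dict, rejecting non-HTTPS image URLs.

    Hypothetical helper; the `topic` argument name is an assumption.
    """
    args = {"topic": topic, "population_size": population_size}
    if image_url is not None:
        # The README asks for a public HTTPS URL, so only the scheme is checked.
        if urlparse(image_url).scheme != "https":
            raise ValueError("image_url must be a public HTTPS URL")
        args["image_url"] = image_url
    return args
```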

`run_focus_group` and `run_debate` block 5–120s depending on
population size — that's a real synthetic-research call running
behind the scenes, not a cached response. The MCP server returns a
trimmed envelope (sentiment summary + first 5 representative quotes
+ billing breakdown). Use `get_simulation` to pull the full payload
when you need every agent's full reasoning.

### Live progress

`run_focus_group`, `run_debate`, and `compare_options` stream via
SSE under the hood. When your MCP host (Claude Desktop / Cursor /
Cline) sends a `progressToken` with the tool call (the default),
the server emits MCP `notifications/progress` as agents complete
— so the host displays "47/200 agents responded" live instead of
a blank "running tool..." indicator. Debate runs additionally
emit "Round 3/5 starting" / "Round 3 complete" message updates.
No client work needed; progress just shows up.
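
If you are writing your own client rather than using one of the hosts above, you would render these notifications yourself. The sketch below formats the `progress`, `total`, and `message` fields that MCP progress notifications carry. The function name is hypothetical, and treating `progress` as a count of completed agents is an assumption based on the description above.

```python
from typing import Optional


def format_progress(progress: float, total: Optional[float] = None,
                    message: Optional[str] = None) -> str:
    """Turn one progress notification into a single status line.

    Assumes `progress` counts completed agents (an assumption, see lead-in).
    """
    if total:
        line = f"{int(progress)}/{int(total)} agents responded"
    else:
        line = f"{int(progress)} agents responded"
    if message:
        # Debate runs attach round updates as a free-text message.
        line = f"{line} | {message}"
    return line
```

How the notification reaches this function depends on your client library.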

### Recommended flow

For non-trivial studies the agent should:

1. **`list_demographic_presets`** — if the user didn't pick one,
   propose one based on the topic.
2. **`preview_cost`** — gets a cents+seconds estimate before
   committing.
3. **Confirm with the user** — show them the cost estimate and
   the proposed audience/size.
4. **`run_focus_group` / `run_debate`** — only after confirmation.

Defaults match the platform's calibrated quality bars: Standard
Pulse (200 agents) for `run_focus_group`, and the platform-standard
debate (30 agents × 5 rounds) for `run_debate`. Cheaper defaults
would silently weaken the output, and calibrated quality is the
point of the product. Free-tier keys (50 agents per study) hit
`over_plan_cap` on the 200-agent voting default; the agent should
retry with `population_size=50` or tell the user to upgrade.
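
Steps 2 to 4, plus the free-tier fallback, can be sketched as below. This is an illustration under stated assumptions, not the server's implementation. `call_tool` and `confirm` are hypothetical callables supplied by your agent framework, the argument names `topic` and `demographic_preset` are assumed to match the response fields, and it is assumed that `preview_cost` accepts the same arguments as the run. Choosing a preset (step 1) is left to the caller.

```python
from typing import Any, Callable, Dict, Optional

ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]
Confirmer = Callable[[Dict[str, Any], Dict[str, Any]], bool]


def run_confirmed_focus_group(call_tool: ToolCaller, confirm: Confirmer,
                              topic: str, preset: str,
                              population_size: int = 200) -> Optional[Dict[str, Any]]:
    """Preview cost, ask for confirmation, run, and fall back to 50 agents.

    Returns None when the user declines. All names here are hypothetical.
    """
    args = {"topic": topic, "demographic_preset": preset,
            "population_size": population_size}
    estimate = call_tool("preview_cost", args)
    if not confirm(estimate, args):
        return None
    result = call_tool("run_focus_group", args)
    if result.get("error") == "over_plan_cap" and population_size > 50:
        # Free-tier keys are capped at 50 agents per study.
        result = call_tool("run_focus_group", {**args, "population_size": 50})
    return result
```

A stricter agent might ask for confirmation again before the smaller retry, since the sample size the user approved has changed.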

### Response shapes

Tool responses are mode-aware and field names are stable. Internal
QA fields (`consistency_score`, model routing, harness flags,
sampling metadata) are dropped because agents don't need them.

`run_focus_group` (voting mode):
```json
{
  "id": "...", "status": "complete", "mode": "voting",
  "topic": "...", "demographic_preset": "us_general_population",
  "population_size": 50,
  "sentiment_summary": {
    "positive": 30, "neutral": 12, "negative": 8,
    "positive_pct": 60.0, "neutral_pct": 24.0, "negative_pct": 16.0
  },
  "sample_responses": [
    { "agent_name": "Maria Chen", "age": 34, "occupation": "Software engineer",
      "sentiment": "positive", "reasoning": "..." },
    "..."
  ],
  "total_responses": 50,
  "billing": { "actual_cents": 12, "plan": "free" }
}
```

If a `stance_statement` was provided, the envelope additionally
carries `stance_statement` + `sentiment_axis: { positive_label,
neutral_label, negative_label }` so the agent knows what each
bucket means (for example, whether positive means "agrees" or
"supports").
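
A consumer of the voting envelope might render the summary with those labels when they are present. The sketch below uses only the fields shown above; the function name `describe_sentiment` is hypothetical.

```python
def describe_sentiment(envelope: dict) -> str:
    """Format the sentiment percentages, preferring `sentiment_axis` labels."""
    summary = envelope["sentiment_summary"]
    axis = envelope.get("sentiment_axis", {})
    parts = []
    for bucket in ("positive", "neutral", "negative"):
        # Fall back to the bucket name when no stance labels were returned.
        label = axis.get(f"{bucket}_label", bucket)
        parts.append(f"{label}: {summary[bucket + '_pct']}%")
    return ", ".join(parts)
```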

`run_debate`:
```json
{
  "id": "...", "status": "complete", "mode": "debate",
  "topic": "...", "num_rounds": 5, "agent_count": 30,
  "final_consensus_score": 0.72,
  "summary": "Most agents converged toward ...",
  "key_arguments_for": [ "..." ],
  "key_arguments_against": [ "..." ],
  "dissenting_views": [ "..." ],
  "final_round_responses": [
    { "agent_name": "...", "position": "FOR", "reasoning": "...",
      "confidence": 0.8 },
    "..."
  ],
  "position_shifts_count": 8,
  "billing": { "actual_cents": 18, "plan": "free" }
}
```
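
For a quick one-line digest of a debate envelope, something like the sketch below may be enough. It relies only on the fields shown above and makes no assumption about which `position` values exist beyond `"FOR"`. The function name is hypothetical, and the tally covers only the responses actually returned in the envelope.

```python
from collections import Counter


def summarize_debate(envelope: dict) -> str:
    """Summarize consensus, rounds, shifts, and the returned positions."""
    positions = Counter(
        response["position"]
        for response in envelope.get("final_round_responses", [])
    )
    tally = ", ".join(f"{pos}: {n}" for pos, n in sorted(positions.items()))
    return (
        f"consensus {envelope['final_consensus_score']:.2f} after "
        f"{envelope['num_rounds']} rounds; "
        f"{envelope['position_shifts_count']} position shifts; "
        f"returned positions: {tally}"
    )
```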

`get_simulation` returns the same envelope as the originating tool
but with **every** response (no 5-quote cap), still with internal
QA fields stripped. Use this when the trimmed envelope from
`run_focus_group` / `run_debate` isn't enough.

### Errors

API failures come back as a typed envelope so agents can
pattern-match and react. The `error` field is one of:

| `error` | When | Agent action |
|---|---|---|
| `auth_error` | 401, 403 | Get a fresh API key |
| `over_plan_cap` | 400 — population > tier limit | Lower population_size or upgrade |
| `validation_error` | 400 / 422 | Fix the request |
| `insufficient_funds` | 402 — wallet drained | Top up at `top_up_url` |
| `rate_limited` | 429 | Sleep and retry (`retry_after` seconds when available) |
| `quota_exceeded` | 429 — monthly token quota | Wait for next month or upgrade |
| `simulation_failed` | 500 — sim crashed mid-run | Inspect `sim_id`; fall back to a smaller run |
| `server_error` | 500/502/503/504 | Retry with backoff |
| `configuration_error` | local — `CROWDOS_API_KEY` unset | Tell user to fix host config |
| `internal_error` | bug in this MCP server | Report at the GitHub issues link |

Insufficient-funds responses also carry `balance_cents`,
`needed_cents`, and `top_up_url` so the agent can surface a clear
upgrade prompt.
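
One way an agent could act on the table is to retry only the two transient codes and hand everything else back to the caller. The sketch below is an assumption-laden illustration: `call_with_retry` is a hypothetical helper, and it assumes `error` and `retry_after` sit at the top level of the envelope.

```python
import time
from typing import Any, Callable, Dict

# Only these codes are worth retrying automatically. `quota_exceeded` is
# also a 429 but will not clear until the quota resets.
RETRYABLE = {"rate_limited", "server_error"}


def call_with_retry(call: Callable[[], Dict[str, Any]], max_attempts: int = 3,
                    base_delay: float = 1.0,
                    sleep: Callable[[float], None] = time.sleep) -> Dict[str, Any]:
    """Invoke `call` up to `max_attempts` times, backing off on transient errors."""
    result = call()
    for attempt in range(1, max_attempts):
        error = result.get("error")
        if error not in RETRYABLE:
            break
        # Prefer the server's hint; otherwise use exponential backoff.
        delay = result.get("retry_after") if error == "rate_limited" else None
        if delay is None:
            delay = base_delay * 2 ** (attempt - 1)
        sleep(delay)
        result = call()
    return result
```

Errors such as `auth_error`, `over_plan_cap`, and `insufficient_funds` come back unchanged so the agent can surface them to the user.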

## Configuration

| Env var | Default | Required |
|---|---|---|
| `CROWDOS_API_KEY` | — | yes (except `crowd_sample`) |
| `CROWDOS_API_BASE_URL` | `https://api.crowdos.ai` | no |
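
If you wrap the server in your own tooling, resolving these variables could look like the sketch below. It mirrors the table only; `load_settings` is a hypothetical name and this is not the server's actual code.

```python
import os
from typing import Mapping, Optional

DEFAULT_BASE_URL = "https://api.crowdos.ai"


def load_settings(environ: Optional[Mapping[str, str]] = None) -> dict:
    """Resolve the API key and base URL from the environment.

    `api_key` is None when unset, which the server reports as
    `configuration_error` for every tool except `crowd_sample`.
    """
    env = os.environ if environ is None else environ
    return {
        "api_key": env.get("CROWDOS_API_KEY") or None,
        "base_url": env.get("CROWDOS_API_BASE_URL") or DEFAULT_BASE_URL,
    }
```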

## Cost

CrowdOS uses a metered wallet. The MCP server returns the actual
debit on every successful call in `billing.actual_cents`. The free
tier ships with $5 of credit; top up at <https://crowdos.ai/account/billing>
once it runs out.

Free-tier monthly quota is 120k tokens (~3 large studies). Pro tier
removes the cap.
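
Because every successful envelope reports its own debit, an agent can keep a running tally locally. The sketch below sums `billing.actual_cents` across a session and subtracts it from the $5 (500 cent) free-tier credit. The function name is hypothetical, and the result is a local estimate rather than the authoritative wallet balance.

```python
from typing import Iterable


def remaining_credit_cents(envelopes: Iterable[dict],
                           starting_credit_cents: int = 500) -> int:
    """Estimate remaining credit from the envelopes seen so far.

    Error envelopes carry no `billing` block and count as zero spend.
    """
    spent = sum(e.get("billing", {}).get("actual_cents", 0) for e in envelopes)
    return starting_credit_cents - spent
```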

## Programmatic use (without an MCP host)

The server is also a regular Python module:

```bash
python -m crowdos_mcp
# stdio MCP server, waits for messages on stdin
```

Or import and embed:

```python
from crowdos_mcp.server import build_server
server = build_server()
# server is a configured mcp.server.Server instance
```
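
You can also drive the installed `crowdos-mcp` command from your own Python process with the `mcp` SDK's stdio client. The sketch below only lists the tool names. It is a sketch under assumptions: the helper names are hypothetical, and `mcp` is imported lazily so the module loads even where the SDK is missing.

```python
import asyncio
import os


def crowdos_launch_config(api_key: str) -> dict:
    """Launch parameters for the server, inheriting the current environment."""
    return {
        "command": "crowdos-mcp",
        "args": [],
        "env": {**os.environ, "CROWDOS_API_KEY": api_key},
    }


async def list_crowdos_tools(api_key: str) -> list:
    """Start the server over stdio and return the names of its tools."""
    # Lazy import: requires the `mcp` package at call time only.
    from mcp import ClientSession, StdioServerParameters
    from mcp.client.stdio import stdio_client

    params = StdioServerParameters(**crowdos_launch_config(api_key))
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            listing = await session.list_tools()
            return [tool.name for tool in listing.tools]


def main() -> None:
    """Print the tool names; expects CROWDOS_API_KEY in the environment."""
    print(asyncio.run(list_crowdos_tools(os.environ["CROWDOS_API_KEY"])))
```

Calling `main()` spawns the server as a subprocess, so `crowdos-mcp` must be on your `PATH`.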

## Versioning

Follows semver. The MCP tool surface (tool names, input schemas) is
stable; additive changes (new tools, new optional fields) ship as
minor versions. Removing or renaming a tool is a major version.

### Maintainer publish flow

```bash
# 1. Bump version in BOTH pyproject.toml and src/crowdos_mcp/__init__.py
# 2. Commit + push
# 3. Publish (pick one; try the dry run first):
./scripts/publish.sh --test  # → TestPyPI (dry run)
./scripts/publish.sh         # → PyPI
```

The script verifies the two version numbers agree, runs tests,
cleans `dist/`, builds, and uploads via twine using `~/.pypirc`
(needs `username = __token__` + `password = pypi-<token>`).
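
For readers curious what the version check amounts to, here is a minimal Python sketch of the same idea. It is not the contents of `scripts/publish.sh`. It assumes `pyproject.toml` has a static `version = "..."` line and that `__init__.py` defines `__version__`, and both function names are hypothetical.

```python
import re
from typing import Optional, Tuple


def extract_versions(pyproject_text: str, init_text: str) -> Tuple[Optional[str], Optional[str]]:
    """Pull the version strings out of the two files' contents."""
    pyproject = re.search(r'^version\s*=\s*"([^"]+)"', pyproject_text, re.MULTILINE)
    init = re.search(r'^__version__\s*=\s*["\']([^"\']+)["\']', init_text, re.MULTILINE)
    return (
        pyproject.group(1) if pyproject else None,
        init.group(1) if init else None,
    )


def versions_agree(pyproject_text: str, init_text: str) -> bool:
    """True only when both versions were found and are identical."""
    pyproject_version, init_version = extract_versions(pyproject_text, init_text)
    return pyproject_version is not None and pyproject_version == init_version
```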

## License

MIT.

## Issues / questions

<https://github.com/bjnagent/crowd/issues>
