Metadata-Version: 2.4
Name: bernstein
Version: 1.8.13
Summary: Declarative agent orchestration for engineering teams
Project-URL: Homepage, https://github.com/chernistry/bernstein
Project-URL: Documentation, https://github.com/chernistry/bernstein#readme
Project-URL: Repository, https://github.com/chernistry/bernstein
Project-URL: Issues, https://github.com/chernistry/bernstein/issues
Project-URL: Changelog, https://github.com/chernistry/bernstein/releases
Project-URL: Funding, https://github.com/sponsors/chernistry
Author-email: Alex Chernysh <alex@alexchernysh.com>
License-Expression: Apache-2.0
License-File: LICENSE
Keywords: agents,ai,automation,cli,coding,devtools,llm,multi-agent,orchestration
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: System Administrators
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Operating System :: MacOS
Classifier: Operating System :: Microsoft :: Windows
Classifier: Operating System :: OS Independent
Classifier: Operating System :: POSIX :: Linux
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Software Development
Classifier: Topic :: Software Development :: Build Tools
Classifier: Topic :: Software Development :: Code Generators
Classifier: Topic :: Software Development :: Quality Assurance
Classifier: Topic :: System :: Distributed Computing
Classifier: Typing :: Typed
Requires-Python: >=3.12
Requires-Dist: click>=8.1
Requires-Dist: cryptography>=45.0.0
Requires-Dist: defusedxml>=0.7.1
Requires-Dist: fastapi>=0.115
Requires-Dist: httpx>=0.27
Requires-Dist: mcp>=1.0
Requires-Dist: openai>=2.29.0
Requires-Dist: opentelemetry-api>=1.30.0
Requires-Dist: opentelemetry-exporter-otlp>=1.30.0
Requires-Dist: opentelemetry-sdk>=1.30.0
Requires-Dist: pillow>=12.1.1
Requires-Dist: pluggy>=1.5
Requires-Dist: prometheus-client>=0.21
Requires-Dist: pydantic-settings>=2.13.1
Requires-Dist: pyfiglet>=1.0
Requires-Dist: python-dotenv>=1.2.2
Requires-Dist: pyyaml>=6.0
Requires-Dist: rich>=13.0
Requires-Dist: setproctitle>=1.3
Requires-Dist: signxml>=4.0
Requires-Dist: terminaltexteffects>=0.11
Requires-Dist: textual>=1.0
Requires-Dist: uvicorn>=0.30
Requires-Dist: watchdog>=4.0
Requires-Dist: websockets>=14.0
Provides-Extra: azure
Requires-Dist: azure-storage-blob>=12.20; extra == 'azure'
Provides-Extra: dev
Requires-Dist: hypothesis>=6.100; extra == 'dev'
Requires-Dist: import-linter>=2.0; extra == 'dev'
Requires-Dist: pyright>=1.1; extra == 'dev'
Requires-Dist: pytest-asyncio>=1.3.0; extra == 'dev'
Requires-Dist: pytest-benchmark>=4.0; extra == 'dev'
Requires-Dist: pytest-cov>=7.1.0; extra == 'dev'
Requires-Dist: pytest-timeout>=2.3.1; extra == 'dev'
Requires-Dist: pytest>=9.0.2; extra == 'dev'
Requires-Dist: respx>=0.22; extra == 'dev'
Requires-Dist: ruff>=0.9; extra == 'dev'
Provides-Extra: docker
Requires-Dist: docker>=7; extra == 'docker'
Provides-Extra: e2b
Requires-Dist: e2b-code-interpreter>=1.0; extra == 'e2b'
Provides-Extra: gcs
Requires-Dist: google-cloud-storage>=3.0; extra == 'gcs'
Provides-Extra: graphics
Provides-Extra: grpc
Requires-Dist: grpcio-reflection>=1.68; extra == 'grpc'
Requires-Dist: grpcio-tools>=1.68; extra == 'grpc'
Requires-Dist: grpcio>=1.68; extra == 'grpc'
Requires-Dist: protobuf>=5.29; extra == 'grpc'
Provides-Extra: k8s
Requires-Dist: kubernetes>=31.0; extra == 'k8s'
Provides-Extra: ml
Requires-Dist: scikit-learn>=1.5; extra == 'ml'
Provides-Extra: modal
Requires-Dist: modal>=0.65; extra == 'modal'
Provides-Extra: openai
Requires-Dist: openai-agents>=0.4.0; extra == 'openai'
Provides-Extra: r2
Requires-Dist: boto3>=1.40; extra == 'r2'
Provides-Extra: s3
Requires-Dist: boto3>=1.40; extra == 's3'
Description-Content-Type: text/markdown

<div align="center">

<picture>
  <source media="(prefers-color-scheme: dark)" srcset="docs/assets/logo-dark.svg">
  <source media="(prefers-color-scheme: light)" srcset="docs/assets/logo-light.svg">
  <img alt="Bernstein" src="docs/assets/logo-light.svg" width="340">
</picture>

<br>

> *"To achieve great things, two things are needed: a plan and not quite enough time."* — Leonard Bernstein

### Orchestrate any AI coding agent. Any model. One command.

<img alt="Bernstein in action: parallel AI agents orchestrated in real time" src="docs/assets/in-action-small.gif" width="700">

[![CI](https://github.com/chernistry/bernstein/actions/workflows/ci.yml/badge.svg)](https://github.com/chernistry/bernstein/actions/workflows/ci.yml)
[![PyPI](https://img.shields.io/pypi/v/bernstein)](https://pypi.org/project/bernstein/)
[![Python 3.12+](https://img.shields.io/badge/python-3.12+-3776ab?logo=python&logoColor=white)](https://python.org)
[![License](https://img.shields.io/github/license/chernistry/bernstein)](LICENSE)

[Documentation](https://bernstein.readthedocs.io/) &middot; [Getting Started](docs/getting-started/GETTING_STARTED.md) &middot; [Glossary](docs/reference/GLOSSARY.md) &middot; [Limitations](docs/reference/KNOWN_LIMITATIONS.md)

</div>

---

Bernstein takes a goal, breaks it into tasks, assigns them to AI coding agents running in parallel, and verifies the output. When an agent succeeds, the janitor merges its verified work into main; failed tasks retry or route to a different model.

### Why deterministic coordination

LLMs write code well. They schedule work across other LLMs badly. Most agent orchestrators use an LLM as the coordinator and hit the same failure modes: non-reproducible plans, silent coordination drift, and token burn on meta-decisions that a 200-line event loop handles reliably. Bernstein inverts that. One LLM call upfront decomposes the goal; after that, scheduling, worktree isolation, quality gates, and HMAC-chained audit replay are all deterministic Python. Every run is bit-identically replayable.

No framework to learn. No vendor lock-in. Agents are interchangeable workers. Swap any agent, any model, any provider.

```bash
pipx install bernstein
cd your-project && bernstein init
bernstein -g "Add JWT auth with refresh tokens, tests, and API docs"
```

```
$ bernstein -g "Add JWT auth"
[manager] decomposed into 4 tasks
[agent-1] claude-sonnet: src/auth/middleware.py  (done, 2m 14s)
[agent-2] codex:         tests/test_auth.py      (done, 1m 58s)
[verify]  all gates pass. merging to main.
```

Also available via `pip`, `uv tool install`, `brew`, `dnf copr`, and `npx bernstein-orchestrator`. See [install options](#install).

## Supported agents

Bernstein auto-discovers installed CLI agents. Mix them in the same run. Cheap local models for boilerplate, heavier cloud models for architecture.

18 CLI agent adapters: 17 third-party wrappers plus a generic wrapper for anything with `--prompt`.

| Agent | Models | Install |
|-------|--------|---------|
| [Claude Code](https://docs.anthropic.com/en/docs/claude-code) | Opus 4, Sonnet 4.6, Haiku 4.5 | `npm install -g @anthropic-ai/claude-code` |
| [Codex CLI](https://github.com/openai/codex) | GPT-5, GPT-5 mini | `npm install -g @openai/codex` |
| [OpenAI Agents SDK v2](https://openai.github.io/openai-agents-python/) | GPT-5, GPT-5 mini, o4 | `pip install 'bernstein[openai]'` |
| [Gemini CLI](https://github.com/google-gemini/gemini-cli) | Gemini 2.5 Pro, Gemini Flash | `npm install -g @google/gemini-cli` |
| [Cursor](https://www.cursor.com) | Sonnet 4.6, Opus 4, GPT-5 | [Cursor app](https://www.cursor.com) |
| [Aider](https://aider.chat) | Any OpenAI/Anthropic-compatible | `pip install aider-chat` |
| [Amp](https://ampcode.com) | Amp-managed | `npm install -g @sourcegraph/amp` |
| [Cody](https://sourcegraph.com/cody) | Sourcegraph-hosted | `npm install -g @sourcegraph/cody` |
| [Continue](https://continue.dev) | Any OpenAI/Anthropic-compatible | `npm install -g @continuedev/cli` (binary: `cn`) |
| [Goose](https://block.github.io/goose/) | Any provider Goose supports | See [Goose docs](https://block.github.io/goose/) |
| [IaC](https://www.terraform.io/) (Terraform/Pulumi) | Any provider the base agent uses | Built-in |
| [Kilo](https://kilo.dev) | Kilo-hosted | See [Kilo docs](https://kilo.dev) |
| [Kiro](https://kiro.dev) | Kiro-hosted | See [Kiro docs](https://kiro.dev) |
| [Ollama](https://ollama.ai) + Aider | Local models (offline) | `brew install ollama` |
| [OpenCode](https://opencode.ai) | Any provider OpenCode supports | See [OpenCode docs](https://opencode.ai) |
| [Qwen](https://github.com/QwenLM/qwen-code) | Qwen Code models | `npm install -g @qwen-code/qwen-code` |
| [Cloudflare Agents](https://developers.cloudflare.com/agents/) | Workers AI models | `bernstein cloud login` |
| **Generic** | Any CLI with `--prompt` | Built-in |

Any adapter also works as the **internal scheduler LLM**. Run the entire stack without depending on any single provider:

```yaml
internal_llm_provider: gemini            # or qwen, ollama, codex, goose, ...
internal_llm_model: gemini-2.5-pro
```

> [!TIP]
> Run `bernstein --headless` for CI pipelines. No TUI, structured JSON output, non-zero exit on failure.

## Quick start

```bash
cd your-project
bernstein init                    # creates .sdd/ workspace + bernstein.yaml
bernstein -g "Add rate limiting"  # agents spawn, work in parallel, verify, exit
bernstein live                    # watch progress in the TUI dashboard
bernstein stop                    # graceful shutdown with drain
```

For multi-stage projects, define a YAML plan:

```bash
bernstein run plan.yaml           # skips LLM planning, goes straight to execution
bernstein run --dry-run plan.yaml # preview tasks and estimated cost
```
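A minimal plan might look like this. The field names below are illustrative only — consult the plan schema in the documentation for the authoritative format:

```yaml
# plan.yaml — hypothetical shape, not the verified schema
goal: Add rate limiting
tasks:
  - id: middleware
    role: backend
    files: [src/ratelimit.py]
    done_when: "pytest tests/test_ratelimit.py passes"
  - id: tests
    role: qa
    files: [tests/test_ratelimit.py]
```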

## How it works

1. **Decompose**. The manager breaks your goal into tasks with roles, owned files, and completion signals.
2. **Spawn**. Agents start in isolated git worktrees, one per task. Main branch stays clean.
3. **Verify**. The janitor checks concrete signals: tests pass, files exist, lint clean, types correct.
4. **Merge**. Verified work lands in main. Failed tasks get retried or routed to a different model.

The orchestrator is a Python scheduler, not an LLM. Scheduling decisions are deterministic, auditable, and reproducible.
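The loop above can be sketched in plain Python. This is an illustration of the idea, not Bernstein's actual internals: given the same tasks and the same verify results, the schedule is identical every run.

```python
from dataclasses import dataclass

@dataclass
class Task:
    name: str
    attempts: int = 0

def run(tasks, spawn, verify, max_retries=2):
    """Deterministic loop: same inputs, same schedule. No LLM in the hot path."""
    pending = list(tasks)
    merged = []
    while pending:
        task = pending.pop(0)
        result = spawn(task)          # agent works in an isolated worktree
        if verify(result):            # concrete signals: tests, lint, files
            merged.append(task.name)  # janitor merges verified work
        elif task.attempts < max_retries:
            task.attempts += 1
            pending.append(task)      # retry, possibly on a different model
    return merged
```

Retry routing, quality gates, and audit logging all hang off this skeleton as ordinary function calls, which is what makes replay bit-identical.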

## Cloud execution (Cloudflare)

Bernstein can run agents on Cloudflare Workers instead of locally. The `bernstein cloud` CLI handles deployment and lifecycle.

- **Workers**. Agent execution on Cloudflare's edge, with Durable Workflows for multi-step tasks and automatic retry.
- **V8 sandbox isolation**. Each agent runs in its own isolate, no container overhead.
- **R2 workspace sync**. Local worktree state syncs to R2 object storage so cloud agents see the same files.
- **Workers AI** (experimental). Use Cloudflare-hosted models as the LLM provider, no external API keys required.
- **D1 analytics**. Task metrics and cost data stored in D1 for querying.
- **Vectorize**. Semantic cache backed by Cloudflare's vector database.
- **Browser rendering**. Headless Chrome on Workers for agents that need to inspect web output.
- **MCP remote transport**. Expose or consume MCP servers over Cloudflare's network.

```bash
bernstein cloud login      # authenticate with Bernstein Cloud
bernstein cloud deploy     # push agent workers
bernstein cloud run plan.yaml  # execute a plan on Cloudflare
```

A `bernstein cloud init` scaffold for `wrangler.toml` and bindings is planned.

## Capabilities

**Core orchestration**. Parallel execution, git worktree isolation, janitor verification, quality gates (lint, types, PII scan), cross-model code review, circuit breaker for misbehaving agents, token growth monitoring with auto-intervention.

**Intelligence**. Contextual bandit router for model/effort selection. Knowledge graph for codebase impact analysis. Semantic caching saves tokens on repeated patterns. Cost anomaly detection (burn-rate alerts). Behavior anomaly detection with Z-score flagging.

**Sandboxing**. Pluggable [`SandboxBackend`](docs/architecture/sandbox.md) protocol — run agents in local git worktrees (default), Docker containers, [E2B](https://e2b.dev) Firecracker microVMs, or [Modal](https://modal.com) serverless containers (with optional GPU). Plugin authors can register custom backends through the `bernstein.sandbox_backends` entry-point group. Inspect installed backends with `bernstein agents sandbox-backends`.
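Registering a custom backend might look like this in a plugin's `pyproject.toml`. The module and class names are hypothetical; only the entry-point group name comes from the docs:

```toml
[project.entry-points."bernstein.sandbox_backends"]
firejail = "my_plugin.backends:FirejailBackend"  # hypothetical plugin class
```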

**Artifact storage**. `.sdd/` state can stream to pluggable [`ArtifactSink`](docs/architecture/storage.md) backends: local filesystem (default), S3, Google Cloud Storage, Azure Blob, or Cloudflare R2. `BufferedSink` keeps the WAL crash-safety contract by writing locally with fsync first and mirroring to the remote asynchronously.
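That local-first contract can be sketched as follows. This is a simplified illustration of the pattern, not the actual `BufferedSink` code: fsync the local write before acknowledging, then mirror remotely off the hot path.

```python
import os
import threading

def durable_write(path: str, data: bytes, mirror) -> None:
    """Write locally with fsync before returning; mirror remotely in the background."""
    tmp = path + ".tmp"
    with open(tmp, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())   # data is on disk before we acknowledge
    os.replace(tmp, path)      # atomic rename keeps readers consistent
    # remote upload happens asynchronously; a crash here loses nothing locally
    threading.Thread(target=mirror, args=(path, data), daemon=True).start()
```

The key property: a crash after `durable_write` returns can lose the remote mirror but never the local copy, so WAL recovery always has a consistent file to replay.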

**Skill packs**. Progressive-disclosure [skills](docs/architecture/skills.md) (OpenAI Agents SDK pattern): only a compact skill index ships in every spawn's system prompt; agents pull full bodies via the `load_skill` MCP tool on demand. 17 built-in role packs plus third-party `bernstein.skill_sources` entry-points.

**Controls**. HMAC-chained audit logs, policy engine, PII output gating, WAL-backed crash recovery (experimental multi-worker safety), OAuth 2.0 PKCE. SSO/SAML/OIDC support is in progress.
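HMAC chaining generally works like this. The sketch below illustrates the technique, not Bernstein's actual record format: each entry's MAC covers the previous entry's MAC, so altering any record invalidates every subsequent link.

```python
import hashlib
import hmac

GENESIS = b"\x00" * 32  # fixed seed for the first link

def chain(entries, key: bytes):
    """Return (entry, mac) pairs where each MAC covers the previous MAC."""
    prev, out = GENESIS, []
    for entry in entries:
        mac = hmac.new(key, prev + entry, hashlib.sha256).digest()
        out.append((entry, mac))
        prev = mac
    return out

def verify_chain(chained, key: bytes) -> bool:
    """Recompute every link; any edit, insertion, or deletion breaks the chain."""
    prev = GENESIS
    for entry, mac in chained:
        expected = hmac.new(key, prev + entry, hashlib.sha256).digest()
        if not hmac.compare_digest(mac, expected):
            return False
        prev = mac
    return True
```

Because verification only needs the key and the log itself, an auditor can replay a run's decisions offline and detect tampering without trusting the machine that produced it.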

**Observability**. Prometheus `/metrics`, OTel exporter presets, Grafana dashboards. Per-model cost tracking (`bernstein cost`). Terminal TUI and web dashboard. Agent process visibility in `ps`.

**Ecosystem**. MCP server mode, A2A protocol support, GitHub App integration, pluggy-based plugin system, multi-repo workspaces, cluster mode for distributed execution, self-evolution via `--evolve` (experimental).

Full feature matrix: [FEATURE_MATRIX.md](docs/reference/FEATURE_MATRIX.md) &middot; Recent features: [What's New](docs/whats-new.md)

## How it compares

| Feature | Bernstein | CrewAI | AutoGen [^autogen] | LangGraph |
|---------|-----------|--------|---------|-----------|
| Orchestrator | Deterministic code | LLM-driven (+ code Flows) | LLM-driven | Graph + LLM |
| Works with | Any CLI agent (18 adapters) | Python SDK classes | Python agents | LangChain nodes |
| Git isolation | Worktrees per agent | No | No | No |
| Pluggable sandboxes | Worktree, Docker, E2B, Modal | No | No | No |
| Verification | Janitor + quality gates | Guardrails + Pydantic output | Termination conditions | Conditional edges |
| Cost tracking | Built-in | `usage_metrics` | `RequestUsage` | Via LangSmith |
| State model | File-based (.sdd/) | In-memory + SQLite checkpoint | In-memory | Checkpointer |
| Remote artifact sinks | S3, GCS, Azure Blob, R2 | No | No | No |
| Self-evolution | Built-in (experimental) | No | No | No |
| Declarative plans (YAML) | Yes | Yes (`agents.yaml`, `tasks.yaml`) | No | Partial (`langgraph.json`) |
| Model routing per task | Yes | Per-agent LLM | Per-agent `model_client` | Per-node (manual) |
| MCP support | Yes (client + server) | Yes | Yes (client + workbench) | Yes (client + server) |
| Agent-to-agent chat | Bulletin board | Yes (Crew process) | Yes (group chat) | Yes (supervisor, swarm) |
| Web UI | TUI + web dashboard | CrewAI AMP | AutoGen Studio | LangGraph Studio + LangSmith |
| Cloud hosted option | Yes (Cloudflare) | Yes (CrewAI AMP) | No | Yes (LangGraph Cloud) |
| Built-in RAG/retrieval | Yes (codebase FTS5 + BM25) | `crewai_tools` | `autogen_ext` retrievers | Via LangChain |

*Last verified: 2026-04-19. See [full comparison pages](docs/compare/README.md) for detailed feature matrices.*

The table above compares Bernstein against LLM-orchestration frameworks (they orchestrate LLM calls). The table below covers the closer category — other tools that orchestrate **CLI coding agents**:

| Feature | Bernstein | [ComposioHQ/agent-orchestrator](https://github.com/ComposioHQ/agent-orchestrator) | [emdash](https://github.com/generalaction/emdash) |
|---------|-----------|-----------|-----------|
| Shape | Python CLI + library + MCP server | TypeScript CLI + local dashboard | Electron desktop app |
| Primary language | Python | TypeScript | TypeScript |
| Install | `pipx install bernstein` | `npm install -g @aoagents/ao` | `.dmg` / `.msi` / `.AppImage` |
| Agent adapters | 18 | 3 (Claude Code, Codex, Aider) | 23 |
| Git worktree per agent | Yes | Yes | Yes |
| MCP server mode (exposes self as MCP) | Yes (stdio + HTTP/SSE) | No | No |
| Coordinator | Deterministic Python scheduler | LLM-driven | Not documented |
| HMAC-chained audit replay | Yes | No | No |
| Autonomous CI-fix / PR flow | No | Yes | No |
| Visual dashboard | TUI + web | Web | Desktop app |
| Backing | Solo OSS | Funded (Composio.dev) | YC W26 |
| License | Apache 2.0 | MIT | Apache 2.0 |

Bernstein's wedge in this category: **Python-native, MCP-server-first, widest adapter coverage**. If your stack is TypeScript and you want a product with a dashboard, Composio's `@aoagents/ao` is a better fit; if you want a polished desktop ADE, emdash is. If you want a primitive that imports into Python, exposes itself over MCP to any client, and covers the full agent breadth (including Qwen, Goose, Ollama, OpenAI Agents SDK, Cloudflare Agents, and more) — Bernstein.

[^autogen]: AutoGen is in maintenance mode; successor is Microsoft Agent Framework 1.0.

## Monitoring

```bash
bernstein live       # TUI dashboard
bernstein dashboard  # web dashboard
bernstein status     # task summary
bernstein ps         # running agents
bernstein cost       # spend by model/task
bernstein doctor     # pre-flight checks
bernstein recap      # post-run summary
bernstein trace <ID> # agent decision trace
bernstein run-changelog --hours 48  # changelog from agent-produced diffs
bernstein explain <cmd>  # detailed help with examples
bernstein dry-run    # preview tasks without executing
bernstein dep-impact # API breakage + downstream caller impact
bernstein aliases    # show command shortcuts
bernstein config-path    # show config file locations
bernstein init-wizard    # interactive project setup
bernstein debug-bundle   # collect logs, config, and state for bug reports
bernstein skills list    # discoverable skill packs (progressive disclosure)
bernstein skills show <name>  # print a skill body with its references
```

```bash
bernstein fingerprint build --corpus-dir ~/oss-corpus  # build local similarity index
bernstein fingerprint check src/foo.py                 # check generated code against the index
```

## Install

| Method | Command |
|--------|---------|
| **pip** | `pip install bernstein` |
| **pipx** | `pipx install bernstein` |
| **uv** | `uv tool install bernstein` |
| **Homebrew** | `brew tap chernistry/bernstein && brew install bernstein` |
| **Fedora / RHEL** | `sudo dnf copr enable alexchernysh/bernstein && sudo dnf install bernstein` |
| **npm** (wrapper) | `npx bernstein-orchestrator` |

### Optional extras

Provider SDKs are optional so the base install stays lean. Pick what you need:

| Extra | Enables |
|-------|---------|
| `bernstein[openai]` | OpenAI Agents SDK v2 adapter (`openai_agents`) |
| `bernstein[docker]` | Docker sandbox backend |
| `bernstein[e2b]` | [E2B](https://e2b.dev) microVM sandbox backend (needs `E2B_API_KEY`) |
| `bernstein[modal]` | [Modal](https://modal.com) sandbox backend, optional GPU (needs `MODAL_TOKEN_ID` / `MODAL_TOKEN_SECRET`) |
| `bernstein[s3]` | S3 artifact sink (via `boto3`) |
| `bernstein[gcs]` | Google Cloud Storage artifact sink |
| `bernstein[azure]` | Azure Blob artifact sink |
| `bernstein[r2]` | Cloudflare R2 artifact sink (S3-compatible `boto3`) |
| `bernstein[grpc]` | gRPC bridge |
| `bernstein[k8s]` | Kubernetes integrations |

Combine extras with brackets, e.g. `pip install 'bernstein[openai,docker,s3]'`.

Editor extensions: [VS Marketplace](https://marketplace.visualstudio.com/items?itemName=alex-chernysh.bernstein) &middot; [Open VSX](https://open-vsx.org/extension/alex-chernysh/bernstein)

## Contributing

PRs welcome. See [CONTRIBUTING.md](CONTRIBUTING.md) for setup and code style.

## Support

If Bernstein saves you time: [GitHub Sponsors](https://github.com/sponsors/chernistry)

Contact: [forte@bernstein.run](mailto:forte@bernstein.run)

## Star History

[![Star History Chart](https://api.star-history.com/svg?repos=chernistry/bernstein&type=Date)](https://star-history.com/#chernistry/bernstein&Date)

## License

[Apache License 2.0](LICENSE)

---

<!-- mcp-name: io.github.chernistry/bernstein -->
