Metadata-Version: 2.4
Name: operative-ai
Version: 0.1.0
Summary: A safety-gated, streaming, tool-using agent SDK with hooks, subagents, structured output, multi-turn sessions, and MCP.
Author: James Lewis
License: MIT
Project-URL: Homepage, https://operative.my
Project-URL: Documentation, https://operative.onl
Project-URL: Repository, https://github.com/lewis-jamie/operative
Project-URL: Company, https://axetechnologies.ca
Keywords: agent,llm,tool-use,mcp,sdk,ai
Classifier: Programming Language :: Python :: 3
Classifier: Intended Audience :: Developers
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: httpx>=0.24
Requires-Dist: PyYAML>=6.0
Provides-Extra: serve
Requires-Dist: fastapi>=0.110; extra == "serve"
Requires-Dist: uvicorn>=0.29; extra == "serve"
Requires-Dist: pydantic>=2.6; extra == "serve"
Requires-Dist: python-multipart>=0.0.9; extra == "serve"
Provides-Extra: speech
Requires-Dist: numpy>=1.24; extra == "speech"
Requires-Dist: faster-whisper>=1.0; extra == "speech"
Requires-Dist: kokoro-onnx>=0.4; extra == "speech"
Requires-Dist: soundfile>=0.12; extra == "speech"
Requires-Dist: edge-tts>=6.1; extra == "speech"
Provides-Extra: neo
Requires-Dist: numpy>=1.24; extra == "neo"
Provides-Extra: browser
Provides-Extra: dev
Requires-Dist: pytest>=8.0; extra == "dev"
Requires-Dist: pytest-asyncio>=0.23; extra == "dev"
Requires-Dist: pytest-cov>=4.0; extra == "dev"
Requires-Dist: ruff>=0.4; extra == "dev"
Requires-Dist: black>=24.0; extra == "dev"
Requires-Dist: mypy>=1.8; extra == "dev"
Requires-Dist: hypothesis>=6.0; extra == "dev"
Requires-Dist: numpy>=1.24; extra == "dev"
Dynamic: license-file

# operative

**An AI-agnostic agent SDK and governed multi-backend router.**
By [AXE Technologies](https://axetechnologies.ca).

One binary serves any stack, with Tier-0 data isolation, a full audit trail, and cost
governance built in.

The core imports no vendor SDK and hardcodes no model. Backends are pluggable adapters
behind protocols; a deployment YAML wires the stack. The same binary runs self-hosted in
your own perimeter or as a managed service.

## Why operative

- **AI-agnostic.** The core is a clean protocol layer. Vendors (Anthropic, OpenAI,
  Ollama, MLX, vLLM, Qdrant, Postgres, Pinecone, Weaviate) live only inside adapters,
  imported lazily. Add Groq or Mistral with one adapter and a YAML line.
- **Governed by construction.** Tier-0 data never leaves local hardware: a constitution
  enforces it before any backend runs, and fails fast and safe if a request would
  violate it. Every decision is replayable in an audit trail.
- **A spectrum of agency.** One named level dials autonomy from a single deterministic call
  up to a fully autonomous agent. The catastrophic safety floor and the constitution hold
  identically at every level.
- **Cost governance.** Per-tenant budgets, cheap-first cascade execution, an L0 response
  cache, and cost-anomaly detection.
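
As a rough illustration of the "floor holds at every level" claim, the sketch below applies the same catastrophic-command check regardless of the agency level. The level names, `hits_floor`, and `decide` are hypothetical and are not operative's API; the actual floor covers far more than this one pattern.

```python
import shlex
from enum import IntEnum

class AgencyLevel(IntEnum):
    # Hypothetical level names, for illustration only.
    SINGLE_CALL = 0
    SUPERVISED = 1
    AUTONOMOUS = 2

def hits_floor(command: str) -> bool:
    """Toy floor: detect a recursive, forced delete of the filesystem root."""
    argv = shlex.split(command)
    if not argv or argv[0] != "rm":
        return False
    # Collect short flags such as -rf or -r -f into one string.
    flags = "".join(a[1:] for a in argv[1:] if a.startswith("-") and not a.startswith("--"))
    targets = [a for a in argv[1:] if not a.startswith("-")]
    return "r" in flags.lower() and "f" in flags and "/" in targets

def decide(level: AgencyLevel, command: str) -> str:
    # The floor is checked first and does not consult the level at all.
    if hits_floor(command):
        return "deny"
    # Toy policy: only the top level proceeds without a human checkpoint.
    return "allow" if level is AgencyLevel.AUTONOMOUS else "ask"

for level in AgencyLevel:
    print(level.name, decide(level, "rm -rf /"), decide(level, "ls"))
```

Because the floor check runs before the level is inspected, raising the level can remove human checkpoints but cannot unlock a denied command.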

## Install

```bash
pip install operative-ai       # the import name stays `import operative`
pip install -e .               # from source
```

## Quickstart: the SDK

```python
from operative import Agent, tool, HTTPAgentModel

@tool
def add(a: int, b: int) -> int:
    "Add two integers."
    return a + b

agent = Agent(HTTPAgentModel("http://localhost:8095", "qwen"), tools=[add], workspace="./repo")
result = agent.run_sync("add 2 and 3, then write it to sum.txt")
print(result.answer)
```

Tool use with a safety floor, streaming, cost metering, self-correction, lifecycle hooks,
subagent delegation, structured output, and MCP (stdio and HTTP) are wired in by default.
See [docs/SDK.md](docs/SDK.md).

## The router: serve any stack from a YAML

```bash
operative serve --deployment deployment.yaml
```

A deployment names its backends, routing policy, and governance. The same binary serves
an internal MLX-plus-cloud stack, a customer on OpenAI plus Pinecone, or a laptop running
Ollama. The HTTP surface routes and governs every request:

```
POST /v1/route            route and execute one inference (tier-aware, governed)
GET  /v1/explain/{id}     replay the full decision trail for a request
GET  /v1/stats            per-backend latency and cost
GET  /healthz /info       liveness and backends
```

```python
from operative import RoutedAgent, load_deployment, make_resolver

agent = RoutedAgent.from_deployment(load_deployment("deployment.yaml"), make_resolver())

async def handle(request):
    # `await` needs an async context; `request` is the inference request to route.
    return await agent.run(request)      # routed, governed, audited
```

See [docs/ROUTER.md](docs/ROUTER.md) and [docs/OPERATIVEoptimizations.md](docs/OPERATIVEoptimizations.md).

## What's in the box

Each feature, with a short summary and a link to its deep dive.

**AI-agnostic agent SDK.** Tool use with a safety floor, streaming, self-correction,
lifecycle hooks, structured output, multi-turn conversations with token-budget compaction,
subagent delegation, MCP over both stdio and HTTP, and a human-in-the-loop approval channel.
One import (`from operative import Agent, tool`) is the whole adoption surface.
See [docs/SDK.md](docs/SDK.md).

**Governed multi-backend router.** `operative serve --deployment deployment.yaml` turns a
single YAML into a routing HTTP server. The same binary serves an internal MLX-plus-cloud
stack, a customer on OpenAI plus Pinecone, or a laptop on Ollama. Inference backends:
Anthropic, OpenAI (and any OpenAI-compatible server: vLLM, LM Studio, Together, Groq),
Ollama, MLX. Knowledge backends: Qdrant, Postgres, Pinecone, Weaviate, plus two embedders.
All are lazy-imported adapters behind a registry. See [docs/ROUTER.md](docs/ROUTER.md).

**Governance by construction.** A constitution checks every request before any backend runs
and fails fast: Tier-0 data isolation (sensitive data never leaves owned hardware), plus
cost and latency ceilings. Below it sits the IronGate catastrophic floor, which denies the
truly dangerous (`rm -rf /`) in every mode. Permission modes (auto / ask / acceptEdits /
plan) and a spectrum of named agency levels dial autonomy from a single deterministic call
to a fully autonomous loop, with the floor and constitution holding identically at each
level. Every decision is replayable via `explain()`. See [docs/ROUTER.md](docs/ROUTER.md)
and [docs/SDK.md](docs/SDK.md).
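
The fail-fast check described above can be pictured with a small, self-contained sketch. Every name here (`Backend`, `Request`, `ConstitutionViolation`, `check`, `allowed`) is an assumption made for illustration and is not operative's API; the point is only that the rules run before any backend is contacted.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical types, for illustration only.
@dataclass(frozen=True)
class Backend:
    name: str
    local: bool            # True if it runs on hardware you own
    cost_per_call: float   # estimated cost of one call

@dataclass(frozen=True)
class Request:
    prompt: str
    data_tier: int         # 0 is the most sensitive tier
    max_cost: float        # cost ceiling for this request

class ConstitutionViolation(Exception):
    """Raised before any backend is contacted."""

def check(request: Request, backend: Backend) -> None:
    # Tier-0 isolation: sensitive data may only reach local backends.
    if request.data_tier == 0 and not backend.local:
        raise ConstitutionViolation(f"tier-0 request cannot use remote backend {backend.name!r}")
    # Cost ceiling: refuse rather than overspend.
    if backend.cost_per_call > request.max_cost:
        raise ConstitutionViolation(f"{backend.name!r} exceeds the cost ceiling")

def allowed(request: Request, backends: List[Backend]) -> List[Backend]:
    """Return the backends that pass every rule, preserving order."""
    passing = []
    for backend in backends:
        try:
            check(request, backend)
        except ConstitutionViolation:
            continue
        passing.append(backend)
    return passing

backends = [Backend("cloud", local=False, cost_per_call=0.02),
            Backend("laptop", local=True, cost_per_call=0.0)]
tier0 = Request("summarise payroll", data_tier=0, max_cost=0.05)
print([b.name for b in allowed(tier0, backends)])   # ['laptop']
```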

**Capabilities: the third governance axis.** Where agency governs how many human checkpoints
a run requires and data tier governs how data is classified, capabilities govern which
surfaces a deployment may touch at all (shell, network egress, filesystem write, browser eval). A `CapabilityManifest`
and named `TrustTier` presets (`locked` / `standard` / `trusted`) are checked by a
constitutional rule before execution, giving a TrustTier x AgencyLevel x DataTier cube that
makes postures like "fully autonomous but think-only" first-class. Unconfigured deployments
grant everything (backward compatible); naming a tier fails closed.
See [docs/CAPABILITIES.md](docs/CAPABILITIES.md).
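
A minimal sketch of the two defaults described above: an unconfigured deployment grants everything, while a named tier denies anything it does not list. The capability strings and the contents of each preset are assumptions for illustration; the real grants are defined in docs/CAPABILITIES.md.

```python
from typing import FrozenSet, Optional

ALL_CAPABILITIES: FrozenSet[str] = frozenset(
    {"shell", "network_egress", "filesystem_write", "browser_eval"}
)

# Preset contents are assumed here, not taken from operative.
TRUST_TIERS = {
    "locked": frozenset(),
    "standard": frozenset({"network_egress", "filesystem_write"}),
    "trusted": ALL_CAPABILITIES,
}

def granted(tier: Optional[str]) -> FrozenSet[str]:
    if tier is None:
        # No tier configured: grant everything (backward compatible).
        return ALL_CAPABILITIES
    # A tier was named: fail closed, so an unknown name grants nothing.
    return TRUST_TIERS.get(tier, frozenset())

def permits(tier: Optional[str], capability: str) -> bool:
    return capability in granted(tier)

print(permits(None, "shell"))               # True
print(permits("standard", "shell"))         # False
print(permits("locked", "network_egress"))  # False
```

Under these assumptions, a "fully autonomous but think-only" posture is simply the highest agency level combined with the `locked` tier.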

**Neo: learned routing.** An optional learned head ranks the backends the policy already
deemed eligible (the Fugu insight that model selection beats model quality), at the cost of
one cheap scoring pass per request. The head proposes; the constitution still disposes, so Neo can
never widen the allowed set. A reference offline trainer mines operative's own audit log into
ranked examples and fits the head. See [docs/NEO.md](docs/NEO.md).
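
The "proposes, never widens" property can be sketched as a reordering of the eligible list only. The function name and the score dictionary are hypothetical; a real head would compute scores from request features rather than receive them ready-made.

```python
from typing import Dict, List

def rank_eligible(eligible: List[str], head_scores: Dict[str, float]) -> List[str]:
    """Reorder the eligible backends by learned score, highest first.

    Only names already in `eligible` are considered, so a high score for an
    ineligible backend has no effect. Unscored backends default to 0.0, and
    ties keep the policy's original order because the sort is stable.
    """
    return sorted(eligible, key=lambda name: head_scores.get(name, 0.0), reverse=True)

eligible = ["local-mlx", "ollama"]          # what policy and constitution allowed
scores = {"openai": 0.99, "ollama": 0.7, "local-mlx": 0.4}
print(rank_eligible(eligible, scores))      # ['ollama', 'local-mlx']
```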

**Speech.** A backend family for voice: pluggable STT, TTS, and VAD adapters (local
faster-whisper / Kokoro / Silero are Tier-0-safe; OpenAI-compatible engines plug in too), a
realtime duplex `SpeechSession` with barge-in interruption, and an OpenAI-compatible serve
surface (`operative serve --speech`, exposing `/v1/audio/transcriptions` and
`/v1/audio/speech`). See [docs/SPEECH.md](docs/SPEECH.md).

**Browser automation.** A backend family for the web: a curated ~16-primitive
`BrowserBackend` protocol (navigate, scrape, search, screenshot, act, script) with surfboard
and Safari adapters, surfaced to an agent as governed `browser_tools` so navigation and
JavaScript execution pass through the capability and safety gates. See
[docs/BROWSER.md](docs/BROWSER.md).

**Optimization and learning.** Opt-in and inert when unset: an L0 response cache, per-tenant
budgets with cheapest-first cascade, adaptive failure memory, and a bandit explore/exploit
reorderer, all running inside the governance envelope.
See [docs/OPERATIVEoptimizations.md](docs/OPERATIVEoptimizations.md).
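
One way to picture the cache and the cheapest-first cascade working together is the sketch below. It is a simplified model under stated assumptions, not operative's implementation: backends are plain tuples, cost is an integer in arbitrary units, and a backend signals "escalate" by returning `None`.

```python
from typing import Callable, Dict, List, Optional, Tuple

# (name, cost in arbitrary integer units, callable returning an answer or None to decline)
Backend = Tuple[str, int, Callable[[str], Optional[str]]]

def cascade(prompt: str, backends: List[Backend], budget: int,
            cache: Dict[str, str]) -> Tuple[Optional[str], int]:
    """Return (answer, units spent). A cache hit costs nothing."""
    if prompt in cache:
        return cache[prompt], 0
    spent = 0
    # Try the cheapest backend first and escalate only when it declines.
    for _name, cost, call in sorted(backends, key=lambda b: b[1]):
        if spent + cost > budget:
            break                      # the next attempt would exceed the budget
        spent += cost
        answer = call(prompt)
        if answer is not None:
            cache[prompt] = answer
            return answer, spent
    return None, spent

backends: List[Backend] = [
    ("large", 10, lambda p: f"large:{p}"),
    ("small", 1, lambda p: None if len(p) > 8 else f"small:{p}"),
]
cache: Dict[str, str] = {}
print(cascade("hi", backends, budget=20, cache=cache))               # ('small:hi', 1)
print(cascade("a longer prompt", backends, budget=20, cache=cache))  # ('large:a longer prompt', 11)
print(cascade("a longer prompt", backends, budget=20, cache=cache))  # ('large:a longer prompt', 0)
```

The second call pays for both attempts (1 + 10 units) because the small backend declined; the third is served from the cache at no cost.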

**Observability and auth.** OTLP traces, metrics, and logs without the heavy
opentelemetry-sdk dependency; a Prometheus surface; usage and cost metering with anomaly
detection. Auth is a pluggable authenticator with push step-up for sensitive operations
(Tier-0 or over-budget), including an AuthGate adapter.

**Multi-tenancy.** Per-tenant workspaces, knowledge scoping, and budgets, plus a white-label
embeddable app (`build_app`) so one image serves many branded tenants from config and keys.

## Command line

```bash
operative run --model <url> --workspace ./repo "task"   # run a task to completion
operative serve --deployment deployment.yaml            # the governed router HTTP server
operative serve --deployment deployment.yaml --speech   # the OpenAI-compatible voice server
operative serve --config agent.yaml --api-keys-file k   # the white-label tool-use agent
operative capabilities --deployment customer.yaml       # audit what a config permits
operative capabilities --deployment new.yaml --diff old.yaml   # capability/backend delta
```

## Architecture

```
+---------------------------------------------------------------+
|  request                                                      |
|    -> authenticate        credential to a tenant principal    |
|    -> route               policy picks a backend chain        |
|    -> constitution        Tier-0 / cost / latency, fail fast  |
|    -> execute             run the chain with fallback         |
|    -> audit + metrics     every decision logged and measured  |
+---------------------------------------------------------------+
   backends and knowledge stores are pluggable adapters
```
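
The lifecycle in the diagram can be read as a straight-line function. The sketch below is a toy model with hypothetical names (`API_KEYS`, `handle`, `permitted`, `audit_log`), none of which are operative's API, and it collapses the constitution into a simple per-backend predicate; it is meant only to show the order of the stages and how the trail accumulates.

```python
from typing import Callable, List, Optional, Tuple

API_KEYS = {"key-123": "tenant-a"}   # hypothetical credential store
audit_log: List[dict] = []

def handle(api_key: str, prompt: str,
           chain: List[Tuple[str, Callable[[str], str]]],
           permitted: Callable[[str], bool]) -> Optional[str]:
    trail: List[str] = []

    def finish(result: Optional[str]) -> Optional[str]:
        # audit: every request leaves a record, whether it succeeded or not
        audit_log.append({"prompt": prompt, "trail": trail, "result": result})
        return result

    tenant = API_KEYS.get(api_key)                         # authenticate
    if tenant is None:
        trail.append("authenticate: rejected")
        return finish(None)
    trail.append(f"authenticate: {tenant}")

    trail.append(f"route: {[name for name, _ in chain]}")  # route (chain is given)

    allowed = [(n, c) for n, c in chain if permitted(n)]   # constitution
    if not allowed:
        trail.append("constitution: denied")
        return finish(None)                                # fail fast, nothing executed
    trail.append("constitution: ok")

    for name, call in allowed:                             # execute with fallback
        try:
            result = call(prompt)
        except Exception as exc:
            trail.append(f"execute: {name} failed ({exc})")
            continue
        trail.append(f"execute: {name} ok")
        return finish(result)
    return finish(None)

def flaky(prompt: str) -> str:
    raise RuntimeError("timeout")

def steady(prompt: str) -> str:
    return prompt.upper()

print(handle("key-123", "hi", [("flaky", flaky), ("steady", steady)], lambda name: True))
print(audit_log[-1]["trail"])
```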

## Documentation

Full index in [docs/README.md](docs/README.md). The guides:

- [docs/SDK.md](docs/SDK.md) - the agent SDK: tools, MCP, structured output, hooks, agency
- [docs/ROUTER.md](docs/ROUTER.md) - the multi-backend router, backend matrix, and constitution
- [docs/CAPABILITIES.md](docs/CAPABILITIES.md) - capability grants, trust tiers, the governance cube
- [docs/NEO.md](docs/NEO.md) - learned routing and the offline head trainer
- [docs/SPEECH.md](docs/SPEECH.md) - STT/TTS/VAD backends, the realtime SpeechSession, voice serve
- [docs/BROWSER.md](docs/BROWSER.md) - the browser backend family and governed browser tools
- [docs/OPERATIVEoptimizations.md](docs/OPERATIVEoptimizations.md) - composing governance, optimization, and learning
- [docs/ERROR-HANDLING.md](docs/ERROR-HANDLING.md) - error semantics
- [ARCHITECTURE.md](ARCHITECTURE.md) - the request lifecycle and module map
- [examples/](examples/) - runnable examples and deployment configs

## License

MIT. See [LICENSE](LICENSE).
