arbiter-ops · Feature Catalog

Version: 1.0.0 · post-Sprint 8 hygiene · commit fe2010d
Scope: 9-plane hexagonal AIOps substrate
Size: 30K LOC · 312 source modules · 115 test modules
Tests: 1118 passing · 8 skipped · 0 errors · 0 deprecation warnings · 29s
Whitelabel: leaks: 0 / 491 files · verified by scripts/arbiter_ops_leak_check.py
Performance: 1.15M ops/sec band lookup p50 · ~16× faster than published Microsoft AGT benchmarks
Standards: ISO 42001 · NIST AI RMF · EU AI Act Art. 12 · DORA · OWASP Agentic Top 10 (6 / 10 full · 4 / 10 partial · operator wires production-grade adapters)
License: Apache-2.0
312 source modules · 1118 tests passing · 99+ port ABCs · 12 action invokers · 11 sensing receivers · 8 console binaries · 10 optional extras · 0 whitelabel leaks

Architecture · the 9 planes

Every plane has the same hex-arch shape: domain/{ports,models}.py (pure Python · zero infra imports) → application/ (orchestration · port-consuming) → adapters/ (concrete tech · swappable).
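The layering can be sketched in a few lines. This is a minimal illustration only: `Entity`, `EntityLookupPort`, `ResolveEntity`, and `InMemoryEntityLookup` are hypothetical names chosen for the example, not classes from the arbiter-ops codebase.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Optional


# domain/models.py: pure Python, no infrastructure imports.
@dataclass(frozen=True)
class Entity:
    entity_id: str
    kind: str


# domain/ports.py: the contract the application layer depends on.
class EntityLookupPort(ABC):
    @abstractmethod
    def get(self, entity_id: str) -> Optional[Entity]:
        """Return the entity, or None if it is unknown."""


# application/: orchestration that consumes the port, never a concrete adapter.
class ResolveEntity:
    def __init__(self, lookup: EntityLookupPort) -> None:
        self._lookup = lookup

    def kind_of(self, entity_id: str) -> str:
        entity = self._lookup.get(entity_id)
        return entity.kind if entity else "unknown"


# adapters/: concrete tech, swappable without touching the layers above.
class InMemoryEntityLookup(EntityLookupPort):
    def __init__(self, entities: list[Entity]) -> None:
        self._by_id = {e.entity_id: e for e in entities}

    def get(self, entity_id: str) -> Optional[Entity]:
        return self._by_id.get(entity_id)


service = ResolveEntity(InMemoryEntityLookup([Entity("svc-checkout", "service")]))
print(service.kind_of("svc-checkout"))  # service
print(service.kind_of("svc-missing"))   # unknown
```

Swapping the in-memory adapter for a database-backed one would change only the object passed to `ResolveEntity`; the domain and application code stay untouched.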

| # | Plane | Purpose | Ports | Adapters | Recent additions |
|---|---|---|---|---|---|
| 1 | sensing | Ingest from observability sources | 10 | 18 | OCSF normalizer · provenance stamp |
| 2 | context | Entity + topology resolver | 4 | 2 | in-memory defaults · Neo4j scaffold |
| 3 | feature | Feature engineering | | | domain-only · pass-through to reasoning |
| 4 | intelligence | LLM + ML reasoners (provider-neutral) | 9 | 8 | Portkey AI Gateway invoker · XGBoost cost predictor · cost-aware router |
| 5 | reasoning | Ensemble + causal classifier | 7 | 6 | Portkey reasoner · SMT verifier (z3) · time-series reasoner |
| 6 | decision | Policy engine + autonomy levels | 5 | 3 | XGBoost triage classifier |
| 7 | action | Invokers (SOAR · ITSM · monitoring · MCP) | 6 | 13 | 6 SOAR vendors + 6 adjacent + MCP invoker |
| 8 | evidence | Facade over governance audit | 4 | 2 | Audit-port adapter |
| 9 | improvement | Offline GA · policy evolution | 6 | 3 | XGBoost surrogate fitness + drift detection |

Plus 11 first-class supporting containers covered below.


Plane 1 · sensing · ingest

Purpose. Receive observability signals from any vendor · normalize · stamp provenance · detect clock skew · route to feature plane.

Ports (10): IngestSinkPort · NormalizationPort · ProvenancePort · ClockSkewPort · DeadLetterPort (+ ABCs).
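To make the normalize, stamp-provenance, and clock-skew steps concrete, here is a rough sketch. The `timestamp` payload field, the 30-second tolerance, and the `NormalizedSignal` shape are all assumptions made for illustration; the real ports may look quite different.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class NormalizedSignal:
    source: str            # provenance: which receiver produced the signal
    received_at: datetime  # provenance: when the substrate saw it
    emitted_at: datetime   # timestamp claimed by the vendor payload
    skew_seconds: float    # received_at minus emitted_at
    skew_suspect: bool     # True when |skew| exceeds the tolerance
    body: dict


def normalize(raw: dict, source: str, received_at: datetime,
              max_skew: timedelta = timedelta(seconds=30)) -> NormalizedSignal:
    """Normalize one raw payload, stamp provenance, and flag clock skew."""
    emitted_at = datetime.fromisoformat(raw["timestamp"])  # hypothetical field name
    if emitted_at.tzinfo is None:
        emitted_at = emitted_at.replace(tzinfo=timezone.utc)  # assume UTC when unlabeled
    skew = (received_at - emitted_at).total_seconds()
    body = {k: v for k, v in raw.items() if k != "timestamp"}
    return NormalizedSignal(source, received_at, emitted_at, skew,
                            abs(skew) > max_skew.total_seconds(), body)


now = datetime(2026, 5, 10, 12, 0, 0, tzinfo=timezone.utc)
sig = normalize({"timestamp": "2026-05-10T11:58:00+00:00", "msg": "cpu high"},
                source="example-receiver", received_at=now)
print(sig.skew_seconds, sig.skew_suspect)  # 120.0 True
```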

Receivers (11 vendor adapters)

Pipeline adapters


Plane 2 · context · entity + topology

Purpose. Resolve incoming signals against the org's entity graph (services · hosts · users · datacenters · deploys).

Ports (4): TopologyStorePort · ChangeCorrelatorPort (+ ABCs).

Adapters (2): in_memory_topology_store.py · in_memory_change_correlator.py · Neo4j adapter scaffold reserved via [neo4j] extra.


Plane 3 · feature · feature engineering

Purpose. Feature engineering pass-through between context and reasoning · domain-models only at v1.0.


Plane 4 · intelligence · LLM + ML reasoners

Purpose. Provider-neutral access to LLMs · per-(model, task) cost prediction · cost-aware routing.

Ports (9):

Adapters (8):

Application layer: CostAwareRouter · selects highest expected_value subject to RoutingPolicy constraints · fallback to default_model_id.
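The selection rule can be sketched as follows. Only `expected_value`, `RoutingPolicy`, and `default_model_id` come from the description above; the constraint fields (`max_cost_usd`, `max_latency_ms`), the `Candidate` shape, and the `route` function are assumptions for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    model_id: str
    expected_value: float       # predicted benefit for this (model, task) pair
    predicted_cost_usd: float
    predicted_latency_ms: float


@dataclass(frozen=True)
class RoutingPolicy:
    # Constraint names are assumed; the source only says the policy carries constraints.
    max_cost_usd: float
    max_latency_ms: float
    default_model_id: str


def route(candidates: list[Candidate], policy: RoutingPolicy) -> str:
    """Pick the feasible candidate with the highest expected_value."""
    feasible = [c for c in candidates
                if c.predicted_cost_usd <= policy.max_cost_usd
                and c.predicted_latency_ms <= policy.max_latency_ms]
    if not feasible:
        return policy.default_model_id  # nothing satisfies the constraints
    return max(feasible, key=lambda c: c.expected_value).model_id


policy = RoutingPolicy(max_cost_usd=0.02, max_latency_ms=1000, default_model_id="fallback-model")
candidates = [
    Candidate("large-model", expected_value=0.9, predicted_cost_usd=0.05, predicted_latency_ms=900),
    Candidate("small-model", expected_value=0.6, predicted_cost_usd=0.01, predicted_latency_ms=200),
]
print(route(candidates, policy))  # small-model
print(route([], policy))          # fallback-model
```

The larger model has the higher expected value but breaks the cost constraint, so the router settles on the best candidate that is still feasible.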


Plane 5 · reasoning · ensemble + causal + SMT

Purpose. Turn features into Hypotheses · ground every claim against evidence · classify causal vs correlational · verify numeric/constraint proposals deterministically.

Ports (7):

Adapters (6):


Plane 6 · decision · policy + triage + autonomy

Purpose. Combine reasoning hypothesis · policy engine · autonomy level · per-band invariants · produce a Verdict.

Ports (5):

Adapters (3):


Plane 7 · action · SOAR + ITSM + monitoring invokers

Purpose. Invoke external systems with per-capability invariants enforced (reversibility=COSTLY · requires_simulation=True · requires_rollback_artifact=True).

Ports (6): ToolRegistryPort · ToolInvokerPort · SimulatorPort · IdempotencyStorePort · KillSwitchPort · ExecutionEventSinkPort.

Invokers (12 vendors)

| Category | Vendor | Capabilities | Auth |
|---|---|---|---|
| SOAR (6) | Splunk SOAR | splunk_soar.run_playbook · splunk_soar.create_container | token |
| | Cortex XSOAR | xsoar.run_playbook · xsoar.create_incident | API key + ID |
| | Tines | tines.send_to_story (webhook) · tines.create_record | per-story webhook secret |
| | Swimlane (Turbine) | swimlane.run_playbook · swimlane.create_record | Private-Token header |
| | Google Chronicle SOAR | chronicle_soar.run_playbook · chronicle_soar.create_case | OAuth Bearer (static OR callable) |
| | Microsoft Sentinel | sentinel.run_playbook (Logic Apps signed URL) · sentinel.update_incident (ARM PATCH) | azure_ad_token_provider callable |
| ITSM (4) | ServiceNow | servicenow.create_incident | u_aiops_* custom fields |
| | Jira | jira.create_issue | basic + token |
| | PagerDuty | pagerduty.trigger · acknowledge · resolve | routing key |
| | Opsgenie | opsgenie.create_alert · acknowledge | API key |
| Monitoring | Grafana annotations | grafana.create_annotation | API token |
| MCP | MCPToolInvoker | mcp.* (any registered capability · prefix stripped on the wire) | operator wires their own MCP client via call_tool callable |

Common pattern (every adapter). AIOps correlation fields auto-injected: aiops_request_id, aiops_decision_id, aiops_tenant_id, idempotency_key · HTTP error → RuntimeError("<vendor> <capability> failed (status=...)") · missing required parameter → ValueError(...) · reversibility class declared at capability-registration time.
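A rough sketch of that common pattern, assuming an injected `post` transport that returns an HTTP status code. The `invoke` function, its signature, and the wording of the `ValueError` message are illustrative assumptions; only the four correlation field names and the `RuntimeError` format are taken from the description above.

```python
from typing import Callable

CORRELATION_FIELDS = ("aiops_request_id", "aiops_decision_id",
                      "aiops_tenant_id", "idempotency_key")


def invoke(vendor: str, capability: str, params: dict, correlation: dict,
           required: tuple[str, ...], post: Callable[[dict], int]) -> dict:
    """Send one capability call; `post` is an injected transport returning an HTTP status."""
    missing = [name for name in required if name not in params]
    if missing:
        # Message wording is an assumption; the source elides it.
        raise ValueError(f"{capability} missing required parameter(s): {', '.join(missing)}")
    # Auto-inject the correlation fields alongside the caller's parameters.
    payload = {**params, **{f: correlation[f] for f in CORRELATION_FIELDS}}
    status = post(payload)
    if status >= 400:
        raise RuntimeError(f"{vendor} {capability} failed (status={status})")
    return payload


corr = {"aiops_request_id": "req-1", "aiops_decision_id": "dec-1",
        "aiops_tenant_id": "t-1", "idempotency_key": "idem-1"}
sent = invoke("jira", "jira.create_issue", {"summary": "disk full"}, corr,
              required=("summary",), post=lambda payload: 201)
print(sorted(sent))
```

Because the transport is a plain callable, the same function can be exercised in tests without any network access.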

Plane 8 · evidence · Compliance Evidence Package facade

Ports (4): AuditReaderPort · EvidenceFacadePort (+ ABCs).

Adapters (2):


Plane 9 · improvement · GA + drift detection

Purpose. Evolve policy configurations against historical replay · detect surrogate drift · emit audit events.

Ports (6):

Adapters (3):

Application layer. HybridFitnessEvaluator · surrogate-fast + periodic ground-truth · drift detection emits surrogate.drift_detected audit row at MAE divergence threshold · pins to ground-truth for fallback_window_generations after drift.
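One way to read that behavior is the sketch below. It is an interpretation under stated assumptions, not the shipped `HybridFitnessEvaluator`: the class name, constructor parameters (`check_every`, `mae_threshold`), and the `audit` callable are hypothetical, and the population is assumed to be non-empty.

```python
from typing import Callable


class HybridFitnessSketch:
    """Surrogate by default; ground truth every N generations and after drift."""

    def __init__(self, surrogate: Callable[[list[float]], float],
                 ground_truth: Callable[[list[float]], float],
                 check_every: int, mae_threshold: float,
                 fallback_window_generations: int,
                 audit: Callable[[str, dict], None]) -> None:
        self.surrogate, self.ground_truth = surrogate, ground_truth
        self.check_every = check_every
        self.mae_threshold = mae_threshold
        self.fallback_window = fallback_window_generations
        self.audit = audit
        self.pinned_until = -1  # last generation pinned to ground truth

    def evaluate(self, generation: int, population: list[list[float]]) -> list[float]:
        if generation <= self.pinned_until:
            return [self.ground_truth(p) for p in population]  # pinned after drift
        predicted = [self.surrogate(p) for p in population]
        if generation % self.check_every != 0:
            return predicted  # fast path: surrogate only
        actual = [self.ground_truth(p) for p in population]
        mae = sum(abs(a - b) for a, b in zip(actual, predicted)) / len(population)
        if mae > self.mae_threshold:
            self.audit("surrogate.drift_detected", {"generation": generation, "mae": mae})
            self.pinned_until = generation + self.fallback_window
        return actual


events = []
ev = HybridFitnessSketch(surrogate=sum, ground_truth=lambda p: sum(p) + 1.0,
                         check_every=5, mae_threshold=0.5,
                         fallback_window_generations=2,
                         audit=lambda name, row: events.append((name, row)))
pop = [[1.0, 2.0]]
history = {g: ev.evaluate(g, pop) for g in (1, 5, 6, 7, 8)}
print(history)
print(events)
```

In this run the surrogate is off by a constant 1.0, so the check at generation 5 records a drift event, generations 6 and 7 are scored with ground truth, and generation 8 returns to the surrogate.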


Supporting container · governance · audit primitive

Purpose. The foundation every other plane depends on · authorize() before · record() after · the adapter chooses durability (default LocalAuditAdapter writes JSONL).
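The authorize-before, record-after discipline with a JSONL-writing adapter might look roughly like this. `JsonlAuditSketch` and `governed_call` are hypothetical stand-ins; the real `LocalAuditAdapter` and its method signatures are not shown in this catalog, so the argument shapes here are assumptions.

```python
import json
import tempfile
from pathlib import Path


class JsonlAuditSketch:
    """Hypothetical stand-in for an audit adapter that appends JSON lines."""

    def __init__(self, path: Path) -> None:
        self._path = path

    def authorize(self, actor: str, action: str) -> bool:
        self._append({"event": "authorize", "actor": actor, "action": action})
        return True  # a real adapter would consult policy here

    def record(self, actor: str, action: str, outcome: str) -> None:
        self._append({"event": "record", "actor": actor, "action": action,
                      "outcome": outcome})

    def _append(self, row: dict) -> None:
        with self._path.open("a", encoding="utf-8") as fh:
            fh.write(json.dumps(row, sort_keys=True) + "\n")


def governed_call(audit: JsonlAuditSketch, actor: str, action: str, fn):
    """authorize() before the call, record() after it, including on failure."""
    if not audit.authorize(actor, action):
        raise PermissionError(f"{actor} may not {action}")
    try:
        result = fn()
    except Exception:
        audit.record(actor, action, "error")
        raise
    audit.record(actor, action, "ok")
    return result


log_path = Path(tempfile.mkdtemp()) / "audit.jsonl"
audit = JsonlAuditSketch(log_path)
print(governed_call(audit, "agent-1", "restart_service", lambda: "done"))
print(len(log_path.read_text(encoding="utf-8").splitlines()))  # 2
```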

Surface:


Supporting container · control · operator HTTP plane

Purpose. FastAPI operator HTTP plane · 8 routers · 12 ports.

Launchable: arbiter-ops-control --host 0.0.0.0 --port 8001 --hil-gateway in_process

Ports (12): AuthorizationPort · TierClassifierPort · HilGatewayPort · ControlAuditPort · TenantOpsPort · PolicyDistributionPort · KillSwitchOpsPort · WorkflowOpsPort · RBACPort · TelemetryPort · ConformanceOpsPort · SubstrateSwitchPort.

HTTP routers (8)

| Router | Purpose |
|---|---|
| tenant | Per-tenant config · approach-band overlays |
| policy | Policy CRUD · evolution gates · drift kill rules |
| rbac | Roles · permissions · principals |
| killswitch | Global + per-tenant + per-capability emergency stops |
| workflow | Start / status / cancel workflow runs |
| conformance | Live 9-plane conformance probe |
| substrate | Health + version + readiness |
| telemetry | Metrics surface + decision-event tap |

Auth (v0.1 alpha). Header-stub identity resolver via x-aiops-operator-subject + x-aiops-operator-role + x-aiops-tenant-scope. Production wires JWT / SPIFFE via make_identity_resolver(...). Header names retained as wire-format contracts.
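For orientation, a header-stub resolver of this kind could be as small as the sketch below. Only the three header names come from the text above; `OperatorIdentity`, `resolve_identity_from_headers`, and the choice of `PermissionError` are assumptions, and the signature of `make_identity_resolver(...)` is deliberately not reproduced because the catalog does not show it.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class OperatorIdentity:
    subject: str
    role: str
    tenant_scope: str


def resolve_identity_from_headers(headers: Mapping[str, str]) -> OperatorIdentity:
    """Dev-only stub: trusts the three wire-format headers exactly as given."""
    lowered = {k.lower(): v for k, v in headers.items()}  # header names are case-insensitive
    try:
        return OperatorIdentity(
            subject=lowered["x-aiops-operator-subject"],
            role=lowered["x-aiops-operator-role"],
            tenant_scope=lowered["x-aiops-tenant-scope"],
        )
    except KeyError as exc:
        raise PermissionError(f"missing identity header: {exc.args[0]}") from None


identity = resolve_identity_from_headers({
    "X-AIOps-Operator-Subject": "alice",
    "X-AIOps-Operator-Role": "admin",
    "X-AIOps-Tenant-Scope": "tenant-a",
})
print(identity)
```

A stub like this performs no verification at all, which is why it suits development only.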

Supporting container · hil · 5-gate HITL framework

Purpose. G-1..G-5 gate types per the 9-plane spec · composable in front of every governed decision.

Launchable: arbiter-ops-hil-worker (Temporal worker) · arbiter-ops-hil-submit (smoke-test CLI).

Ports (14 · including ABCs): GatePort · ChannelPort · VerifierPort · EscalationPolicyPort · ApproverDirectoryPort · AntiDarkPatternLintPort · EventEmitterPort (each + ABC variant).

Gate adapters

Adapter inventory

| Category | Adapter | Notes |
|---|---|---|
| Verifiers | verifiers/cross_provider_llm.py | LLM from a different (model, framework) tuple per the 9-plane spec §5.9.12 |
| | verifiers/rule_based.py | Deterministic rule verifier |
| Channels | channels/slack.py | Slack SDK · [slack] extra |
| | channels/operator_console.py | Console fallback for dev |
| Orchestrators | orchestrator/temporal.py | Production · durable workflow state machine |
| | orchestrator/in_process.py | Dev / scaffold / no-durability mode |
| | orchestrator/git.py | Design-only scaffold · commits = state transitions · TODOs for git add / commit -S / push |
| Events | events/kafka.py | confluent-kafka · [kafka] extra · production |
| | events/in_memory.py | Dev default |
| Lint | lint/rule_based.py | Anti-dark-pattern UI lint (loud-button enforcement · countdown timer guards) |

Supporting container · triage_room · live Slack/Teams rooms

Purpose. Per-incident triage rooms with bidirectional bot · operator on-call invitation · auto-archive + post-mortem flush.

Launchable: arbiter-ops-triage-rooms --tenant-id <tid>

Ports (7): TriageRoomChatPort · TriageRoomPolicyPort · TriageRoomStorePort · AgentContextPort · EventSubscriberPort · HilSignalPort · ControlPlaneInvokerPort.

Adapters (3):


Supporting container · agent · long-running loop (E201)

Purpose. Per-tenant agent process · subscribes to incidents · runs triage → reasoning → decision → action → evidence loop.

Launchable: arbiter-ops-agent --tenant-id <tid> --autonomy-level AL-2

Ports (5): IncidentSubscriberPort · TriagePort · ContextResolverPort · FeedbackSinkPort · AgentSessionStorePort.

Anti-runaway invariants (CPL-22): three independent caps (max_turns, max_tokens, and a per-tenant USD budget), each reported through a structured truncated_reason when it trips.
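A minimal sketch of three independent caps with a structured truncation reason follows. The `run_loop` function, the `LoopResult` shape, and the reason strings are assumptions for illustration; only `max_turns`, `max_tokens`, the USD budget, and `truncated_reason` are named in the text above.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class LoopResult:
    turns: int
    tokens: int
    spent_usd: float
    truncated_reason: Optional[str]  # None when the loop finished on its own


def run_loop(step: Callable[[int], Optional[tuple[int, float]]],
             max_turns: int, max_tokens: int, budget_usd: float) -> LoopResult:
    """Run `step` until it returns None or any one of the three caps trips.

    `step(turn)` returns (tokens_used, cost_usd) for that turn, or None when done.
    """
    turns = tokens = 0
    spent = 0.0
    while True:
        # Each cap is checked on its own before the next turn starts.
        if turns >= max_turns:
            return LoopResult(turns, tokens, spent, "max_turns")
        if tokens >= max_tokens:
            return LoopResult(turns, tokens, spent, "max_tokens")
        if spent >= budget_usd:
            return LoopResult(turns, tokens, spent, "budget_usd")
        outcome = step(turns)
        if outcome is None:
            return LoopResult(turns, tokens, spent, None)
        turns += 1
        tokens += outcome[0]
        spent += outcome[1]


capped = run_loop(lambda turn: (100, 0.01), max_turns=10, max_tokens=350, budget_usd=1.0)
print(capped.truncated_reason, capped.turns)  # max_tokens 4

finished = run_loop(lambda turn: (100, 0.01) if turn < 2 else None,
                    max_turns=10, max_tokens=350, budget_usd=1.0)
print(finished.truncated_reason, finished.turns)  # None 2
```

Keeping the caps independent means a misconfigured token limit still leaves the turn and budget limits in force.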

Supporting container · operator · arbiter-opsctl (E206)

Purpose. Recipe-driven CLI for scaffolding control-plane UIs · generating apps · deploying to k8s · smoke-testing webhooks · bootstrapping tenants.

Launchable: arbiter-opsctl <recipe> [params...]

Built-in recipes (7)

| Recipe | Purpose |
|---|---|
| ui_scaffold | Scaffold a control-plane operator UI |
| autonomous_bootstrap | Bootstrap an autonomous AIOps agent |
| incident_response | Synthetic incident-response drill |
| incident_report | Post-incident report generation |
| k8s_deploy | k8s deployment template generator |
| tenant_bootstrap | New-tenant bootstrap |
| webhook_smoke | Webhook smoke driver |

Ports (4): RecipeRepositoryPort · FileWriterPort · KubectlPort · HttpClientPort.

Adapters (4): filesystem · subprocess-kubectl · httpx · in-memory.

Application layers:


Supporting container · workflow · agent-workflow boards

Purpose. Declarative workflow YAML cards · board protocol orchestration.

Workflow: aiops-incident (under src/arbiter_ops/workflow/workflows/aiops-incident/)

Adapters: aiops_board.py · AIOpsBoard board adapter · load_aiops_incident_workflow() helper.

Application layer: bridge.py · RecipeWorkitemBridge · every arbiter-opsctl recipe run becomes a card.


Supporting container · mcp · Model Context Protocol integration

Purpose. Expose 7 substrate operations as MCP tools (inbound) · invoke external MCP tools from the action plane (outbound). Both surfaces use callable-injection so the mcp SDK is opt-in via the [mcp] extra.

Launchable: arbiter-ops-mcp [--transport stdio|http] [--list-tools]

Exposed MCP tools (7)

| Tool | Plane | Wraps |
|---|---|---|
| arbiter_ops_band_check | decision | ApproachRegistry + validates() · CPL-08 invariant |
| arbiter_ops_audit_record | governance | AuditPort.authorize + record |
| arbiter_ops_triage_predict | decision | TriageClassifierPort · heuristic default · XGBoost via [ml-decision] |
| arbiter_ops_cost_predict | intelligence | CostPredictorPort · Welford default · XGBoost via [ml-intelligence] |
| arbiter_ops_smt_verify | reasoning | SmtVerifier (z3-solver) · requires [smt] |
| arbiter_ops_kill_switch_check | decision | KillSwitchPort.is_engaged |
| arbiter_ops_redteam_scan | redteam | RuleBasedRedteamScanner.scan_text · 12-vector matrix |

Three-module split. mcp/registry.py (pure-Python ToolSpec dataclasses · JSON-schema input/output) · mcp/tools.py (substrate dispatchers via SubstrateContainer DI · zero SDK dep · tested without [mcp]) · mcp/server.py (FastMCP wrapper · lazy mcp import). Operators wire production-grade adapters (Postgres audit · XGBoost classifiers · etc.) via build_fastmcp_server(container=...).

Client integrations: Claude Desktop · Cursor · Windsurf · any MCP-compatible client. See docs/mcp_integration.md for the full quick-start (with ~/.claude/desktop_config.json snippet).

Outbound action-plane adapter: MCPToolInvoker (Plane 7 · Action · row added to the invoker table above) · dispatches ProposedAction to any wired MCP client · same invariants (reversibility=COSTLY · simulation · rollback) as the SOAR invokers · AIOps correlation fields auto-injected.
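The callable-injection idea on the outbound side can be illustrated as below. `McpInvokerSketch`, its `invoke` signature, and the `(tool_name, arguments)` shape assumed for `call_tool` are hypothetical; the actual `MCPToolInvoker` interface is not shown in this catalog.

```python
from typing import Any, Callable

# Assumed shape of the operator-supplied callable: (tool_name, arguments) -> result.
CallTool = Callable[[str, dict], Any]


class McpInvokerSketch:
    """Hypothetical outbound invoker that never imports the mcp SDK itself."""

    PREFIX = "mcp."

    def __init__(self, call_tool: CallTool) -> None:
        self._call_tool = call_tool  # the operator's own MCP client call

    def invoke(self, capability: str, arguments: dict, correlation: dict) -> Any:
        if not capability.startswith(self.PREFIX):
            raise ValueError(f"not an MCP capability: {capability}")
        tool_name = capability[len(self.PREFIX):]  # prefix stripped on the wire
        return self._call_tool(tool_name, {**arguments, **correlation})


calls = []


def fake_client(tool_name: str, arguments: dict) -> str:
    """Test double standing in for a real MCP client session."""
    calls.append((tool_name, arguments))
    return "ok"


invoker = McpInvokerSketch(fake_client)
print(invoker.invoke("mcp.restart_pod", {"pod": "api-0"}, {"idempotency_key": "idem-1"}))
print(calls)
```

Since the invoker only ever sees a callable, the module can be imported and tested in environments where the `[mcp]` extra is not installed.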


Supporting container · redteam · standalone scanner

Purpose. OWASP Agentic ASI-01 (Goal Hijacking) + ASI-05 (Insecure Output Handling) coverage.

Launchable: arbiter-redteam scan <dir-or-file> [--json] [--min-grade A/B/C/D/F] [--strict] [--severity-floor low/high/critical] [--include glob,...]

12-vector attack matrix

| # | Vector | Severity |
|---|---|---|
| V01 | direct_instruction_override | high |
| V02 | role_play_jailbreak · DAN / STAN / AIM / sudo | high |
| V03 | sandwich_attack | high |
| V04 | encoding_attack · base64 / rot13 / hex | high |
| V05 | authority_impersonation | low |
| V06 | pii_extraction_probe | critical |
| V07 | tool_invocation_smuggling | critical |
| V08 | prompt_leak_probe | high |
| V09 | goal_hijack_directive | critical |
| V10 | multi_step_chained_injection | high |
| V11 | unicode_homoglyph_smuggling | low |
| V12 | policy_bypass_appeal | low |

Grading: A (0 vectors) · B (1 low) · C (2 vectors OR 1 high) · D (3+ non-critical) · F (any critical).
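The grading rule leaves the precedence between its clauses implicit. The sketch below assumes one plausible ordering (any critical first, then three or more findings, then two findings or a single high, then a single low); the `grade` function is illustrative and may not match the scanner's actual tie-breaking.

```python
def grade(severities: list[str]) -> str:
    """Map the severities of the triggered vectors to a letter grade."""
    if "critical" in severities:
        return "F"   # any critical
    if len(severities) >= 3:
        return "D"   # 3+ non-critical
    if len(severities) == 2 or "high" in severities:
        return "C"   # 2 vectors OR 1 high
    if len(severities) == 1:
        return "B"   # a single low-severity vector
    return "A"       # no vectors triggered


for found in ([], ["low"], ["high"], ["low", "low"],
              ["low", "high", "low"], ["low", "critical"]):
    print(found, grade(found))
```

With this ordering the six sample inputs grade as A, B, C, C, D, and F respectively.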


Supporting container · knowledge · runbooks + post-mortems

Ports (4): KnowledgeRepositoryPort · EmbeddingPort · RerankerPort · KnowledgeRetrieverPort.

Adapters (1): in_memory.py · dev default.

Models: KnowledgeCitation maps onto arbiter_ops.reasoning.domain.models.EvidenceCitation(kind=DOC, ref=...) for cross-plane evidence flow.


Supporting container · identity · tenancy + auth (E003)

Ports (6): TenantStorePort · AgentIdentityPort · SecretsPort (+ ABCs).

Adapters (3 · all in-memory · production swaps via DI):

Production note. Cryptographic actor attestation (signed requested_by claims, mTLS service-mesh identity, etc.) is an operator-supplied adapter. The in-memory stub is suitable for dev; production wires the identity provider that fits the deployment.


Supporting container · conformance · 9-plane probe

Purpose. Generate Level-1 (Observed) self-declaration for the 9-plane spec §12 conformance levels.

Application: SelfDeclarationGenerator.generate(level=ConformanceLevel.OBSERVED) ships; Levels 2-4 deferred (independent audit · public attestation · regulator endorsement).


8 console binaries shipped

Installed by pip install -e packages/arbiter-ops:

| Binary | Module | Purpose |
|---|---|---|
| arbiter-ops | arbiter_ops.cli | Top-level CLI shim (Phase 4 backlog) |
| arbiter-ops-control | arbiter_ops.control.server | FastAPI control plane (8 routers · port 8001) |
| arbiter-ops-agent | arbiter_ops.agent.server | Long-running per-tenant agent loop |
| arbiter-ops-improve | arbiter_ops.improvement.server | GA campaign server |
| arbiter-ops-triage-rooms | arbiter_ops.triage_room.server | Slack / Teams triage rooms service |
| arbiter-ops-hil-worker | arbiter_ops.hil.worker | Temporal HIL worker |
| arbiter-ops-hil-submit | arbiter_ops.hil.cli.submit | Synthetic HitlRequest submitter (smoke) |
| arbiter-opsctl | arbiter_ops.operator.cli.main | Recipe-driven operator CLI |
| arbiter-redteam | arbiter_ops.redteam.cli | 12-vector prompt-injection scanner |
| arbiter-ops-mcp | arbiter_ops.mcp.server | MCP server · 7 substrate tools (stdio / SSE) |

10 optional dependencies (pyproject.toml)

| Extra | Brings | Used by |
|---|---|---|
| slack | slack_sdk | triage_room.adapters.slack · hil.adapters.channels.slack |
| neo4j | neo4j | context (production topology store · scaffold) |
| integrations | requests · httpx | action invokers |
| llm | anthropic | intelligence (provider-neutral LLM access · direct SDK) |
| litellm | litellm>=1.50 | reasoning · litellm_reasoner.py (cross-provider abstraction) |
| portkey | portkey-ai>=1.8 | reasoning · portkey_reasoner.py + intelligence · portkey_invoker.py (AI gateway · virtual keys · semantic caching · trace_id correlation) |
| mcp | mcp>=1.0 | arbiter-ops-mcp server transport · MCPToolInvoker uses callable injection so its SDK dep is optional |
| kafka | confluent-kafka | hil.adapters.events.kafka · triage_room.kafka_subscriber |
| ml-decision | xgboost>=2.0,<3.0 · numpy · scikit-learn | XGBoost triage classifier |
| ml-intelligence | xgboost>=2.0,<3.0 · numpy · scikit-learn | XGBoost cost predictor (3-booster stack) |
| ml-surrogate | xgboost>=2.0,<3.0 · numpy · scipy | GA surrogate fitness evaluator |
| smt | z3-solver>=4.13 | SMT verifier (reasoning · Layer 2a deterministic) |
| signing | cryptography | Ed25519 recipe-pack signer (operator/store) |
| all | everything above | |

Architectural invariants (enforced at multiple layers)

| Invariant | Enforced where | Why it's load-bearing |
|---|---|---|
| reversibility=COSTLY + requires_simulation=True + requires_rollback_artifact=True | Every action-plane capability registration | Prevents irreversible side effects without a simulation pass and rollback evidence |
| AIOps correlation fields auto-injected | Every SOAR / ITSM / monitoring invoker | Replay protection + cross-system trace |
| Per-band invariants enforced at 3 layers | Response model · UI · band guard | CPL-08 · stops band-mismatch at any single point of compromise |
| authorize() → call → record() | Every LLM call via audit_gated_invoker.py | EU AI Act Art. 12 / GDPR Art. 22 evidence chain (adapter chooses durability) |
| CPL-22 anti-runaway caps | agent.application.loop | Three independent caps (turns · tokens · USD) · defense-in-depth |
| Consumer-flippable port + ML opt-in | Every ML adapter (XGBoost triage · cost · surrogate) | Default path is zero-ML-dep · operators install extras when ready · graceful fallback on load failure |
| Layer 2a deterministic verifier load-bearing | SMT verifier · per-band invariants · policy_hash() tamper refusal | The verifier is the only re-runnable point in the loop a regulator can replay |
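The "ML opt-in with graceful fallback" invariant can be sketched as follows. Everything here is illustrative: `select_classifier`, `heuristic_triage` (including its 0.05 threshold), and the deliberately missing module name are assumptions, not the shipped adapters.

```python
import importlib
import logging
from typing import Callable

log = logging.getLogger("triage")
Classifier = Callable[[dict], str]


def heuristic_triage(features: dict) -> str:
    """Zero-dependency default (an illustrative rule, not the shipped heuristic)."""
    return "page" if features.get("error_rate", 0.0) > 0.05 else "ticket"


def select_classifier(build_ml: Callable[[], Classifier],
                      fallback: Classifier = heuristic_triage) -> tuple[Classifier, str]:
    """Try the ML-backed adapter; fall back when the extra or model is unavailable."""
    try:
        return build_ml(), "ml"
    except (ImportError, OSError) as exc:  # missing extra, or model file cannot be loaded
        log.warning("ML classifier unavailable, using fallback: %s", exc)
        return fallback, "fallback"


def build_missing_extra() -> Classifier:
    # Stand-in for an adapter whose optional dependency is not installed.
    importlib.import_module("arbiter_ops_sketch_missing_extra")
    return heuristic_triage  # not reached


classifier, backend = select_classifier(build_missing_extra)
print(backend, classifier({"error_rate": 0.2}))  # fallback page
```

The optional import happens inside the factory, so importing the module that defines the port never requires the ML dependency.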

Performance (published benchmarks)

| Benchmark | p50 | p99 | Throughput |
|---|---|---|---|
| approach_registry.get_record (band lookup) | 0.7 μs | 1.4 μs | 1.15M ops/sec |
| approach_record.validates (per-band invariant SAT) | 0.7 μs | 1.6 μs | 1.10M ops/sec |
| approach_record.validates (UNSAT) | 0.7 μs | 0.9 μs | 1.18M ops/sec |
| policy_engine.evaluate (AllowAll fallback) | 1.1 μs | 1.7 μs | 805K ops/sec |
| cep_hash_chain.link (canonical SHA-256 + chain link) | 7 μs | 15 μs | 137K ops/sec |
| end_to_end.verifier_path (band + invariant + hash) | 4.8 μs | 15 μs | 181K ops/sec |
| smt_verifier.verify_constraints (DTI cap) | 686 μs | 1.7 ms | 1.3K ops/sec |

All numbers above are arbiter-ops' own measurements on its own code, on the host described in docs/BENCHMARKS.md. Comparisons against any other vendor's published numbers are not made here — readers evaluating arbiter-ops against existing tooling should reproduce both products' benchmarks on their own hardware before drawing conclusions. See docs/BENCHMARKS.md for the full methodology and disclaimer.
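For readers unfamiliar with the operation behind the cep_hash_chain.link row, a canonical SHA-256 hash chain can be sketched like this. The canonicalization (sorted-key compact JSON), the all-zero genesis value, and the `link` / `verify` helpers are assumptions for illustration and may differ from what arbiter-ops actually hashes.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed sentinel for the first link


def link(prev_hash: str, record: dict) -> str:
    """Hash the canonical form of `record` together with the previous link."""
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"),
                           ensure_ascii=False).encode("utf-8")
    return hashlib.sha256(prev_hash.encode("ascii") + canonical).hexdigest()


def verify(records: list[dict], hashes: list[str]) -> bool:
    """Recompute the chain and compare each link with the stored hash."""
    prev = GENESIS
    for record, expected in zip(records, hashes):
        prev = link(prev, record)
        if prev != expected:
            return False
    return len(records) == len(hashes)


records = [{"event": "authorize", "actor": "agent-1"},
           {"event": "record", "actor": "agent-1", "outcome": "ok"}]
hashes = []
prev = GENESIS
for rec in records:
    prev = link(prev, rec)
    hashes.append(prev)

print(verify(records, hashes))                               # True
print(verify([{"event": "tampered"}, records[1]], hashes))   # False
```

Because each hash covers the previous one, altering any earlier record invalidates every link that follows it.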


OWASP Agentic Top 10 coverage

| Risk | arbiter-ops control | Coverage |
|---|---|---|
| ASI-01 · Goal Hijacking | CPL-22 caps + CPL-08 band guard + authorize() pre-call gate | ✓ full |
| ASI-02 · Excessive Capabilities | 5-band approach registry + ToolCapability registry | ✓ full |
| ASI-03 · Identity & Privilege Abuse | requested_by + tenant scoping + SPIFFE-style stub | ◐ partial |
| ASI-04 · Uncontrolled Code Execution | Action-plane simulation + rollback gate | ✓ full |
| ASI-05 · Insecure Output Handling | Verifier chain (rule + groundedness + RAGAS + cross-LLM + G5 + redteam CLI) | ✓ full |
| ASI-06 · Memory Poisoning | AuditPort.record() append-only by contract | ◐ partial |
| ASI-07 · Unsafe Inter-Agent Comms | gov-everywhere + MCP gating | ◐ partial |
| ASI-08 · Cascading Failures | CPL-22 caps + drift kill-switch + budget tracker | ✓ full |
| ASI-09 · Human-Agent Trust Deficit | AuditPort.authorize/record on every call · LocalAuditAdapter default | ◐ partial |
| ASI-10 · Rogue Agents | ai_kill_switch_t (global / per-tenant / per-capability) + band invariants | ✓ full |

Headline. 6 / 10 full · 4 / 10 partial. The partial items (ASI-03 · ASI-06 · ASI-07 · ASI-09) are all areas where the substrate provides the contract + a dev-grade default · operators supply the production-grade adapter (durable audit backend, mTLS service-mesh identity, etc.).


Tooling + scripts

| Script | Purpose |
|---|---|
| scripts/arbiter_ops_leak_check.py | Whitelabel guard · scans for upstream-product strings (leaks: 0 / 491 files) |
| scripts/benchmarks/bench_verifier_path.py | Reproducible Layer 2a benchmarks · --iterations · --json · --only |

What's deferred (operator follow-up)

| Item | Sprint | Estimate |
|---|---|---|
| A-1 · expand context / knowledge / identity test coverage to ≥0.5 ratio | Sprint 9 | ~3 days |
| A-2 · narrow 45 except Exception: catches + add log.warning(..., exc_info=True) | Sprint 9 | ~1.5 days |
| A-3 · ship hil/adapters/orchestrator/git.py HIL orchestrator OR DEFERRED.md entry | Sprint 9 | ~1.5 days |
| G-2 batch review + G-4 outcome envelope gate adapters | v0.5 | TBD |
| Conformance levels 2-4 (independent audit · public attestation · regulator endorsement) | v0.5 / v1.0 / v2.0 | product milestones |
| End-to-end agent-to-agent encryption (ASI-07 partial → full) | Sprint 10+ | TBD |
| 5-language SDK matrix (.NET / Rust / Go) | customer demand only | |
| Generic framework middleware (LangChain / CrewAI / AutoGen) | not planned · we sit one layer down | |

Generated 2026-05-10 · v1.0.0 · whitelabel-clean (leaks: 0 / 496).
Companion docs in this directory: dev_guide.html (14-section onboarding) · FEATURES.md (markdown twin) · portkey_integration.md (Portkey gateway deep-dive) · triage_classifier.md · cost_predictor.md · fitness_surrogate.md · DEFERRED.md.