jeevesagent

JeevesAgent — model-agnostic, MCP-native agent harness.


Exceptions

AuthenticationError

Invalid, missing, or revoked API credentials.

BudgetExceeded

A run was halted because a budget limit was hit.

CancelledByUser

A user-driven interruption (signal, timeout) ended the run.

ConfigError

Invalid or unresolvable configuration passed to Agent.

ContentFilterError

The provider's safety system blocked the request or response.

FreshnessError

A certified value failed its freshness policy.

InvalidRequestError

The request was malformed or violated the provider's API contract.

IsolationWarning

Emitted when a memory query is likely to silently miss data because the caller forgot to pass user_id.

JeevesAgentError

Base class for all harness errors.

LineageError

A certified value failed its lineage policy.

MCPError

An MCP transport, handshake, or protocol error.

MemoryStoreError

The memory backend failed an operation.

ModelError

A call to the underlying model adapter failed.

OutputValidationError

The model's final answer did not validate against the supplied output_schema.

PathEscapeError

Raised when a tool argument resolves outside its workdir.

PermanentModelError

A model call failed in a way that retrying will not fix.

PermissionDenied

A tool call was denied by the permission layer or a user hook.

RateLimitError

The provider returned a 429 / quota-exhausted response.

RuntimeJournalError

The durable runtime journal is unreadable or inconsistent.

SandboxError

The sandbox refused or failed to execute a tool.

SkillError

Raised on invalid skill construction or frontmatter.

ToolError

A tool invocation failed at the tool's own boundary.

TransientModelError

A model call failed in a way that may succeed on retry.

Classes

ActorCritic

Actor + adversarial critic with optional different models.

Agent

A fully-async, MCP-native, model-agnostic agent harness.

AgentGraph

Renderable graph of an agent's structure.

AgentSession

Mutable per-run state shared between Agent and an Architecture.

AllowAll

Trivial permission policy: every call is allowed.

AnthropicModel

Talks to Claude via anthropic.AsyncAnthropic.

Architecture

Strategy interface for driving the agent loop.

AuditEntry

An immutable, signed entry in the audit log.

AuditLog

The append-only signed log surface.

Blackboard

Public + per-agent private state for the architecture.

BlackboardArchitecture

Coordinator + agents + decider, mediated by a shared

BlackboardEntry

One contribution on the blackboard.

Budget

Resource governance — tokens, calls, cost, wall clock.

BudgetStatus

Result of a budget check before each step.

CertifiedValue

A value carrying provenance metadata for freshness/lineage checks.

ChromaFactStore

Bi-temporal fact store backed by a Chroma collection.

ChromaMemory

Memory backed by chromadb.

ChromaVectorStore

Vector store backed by chromadb.

CohereEmbedder

Embeddings via Cohere's cohere SDK.

ConsolidationWorker

Periodic consolidator for any Memory backend.

Consolidator

Wraps a Model to extract Fact rows from episodes.

Dependencies

Bundled protocol implementations passed to every architecture.

EchoModel

Echo-style model for tests and demos.

Embedder

Text-to-vector embedding model used by the memory subsystem.

Episode

A single (input, decisions, tool calls, output) tuple from history.

Event

A single observable record from a running session.

EventKind

Enum where members are also (and must be) strings

FAISSVectorStore

Vector store backed by faiss-cpu.

Fact

A semantic claim extracted from one or more episodes.

FactStore

Storage surface for bi-temporal facts.

FileAuditLog

JSONL append-only audit log with HMAC signatures.

FilesystemSandbox

Restrict a tool host's path-typed arguments to declared roots.

FreshnessPolicy

Maximum age for certified values from each source.

Handoff

Per-peer handoff configuration.

HashEmbedder

Deterministic SHA256-seeded unit vectors.

HookHost

Aggregator over user-registered lifecycle callbacks.

HookRegistry

Implements HookHost.

InMemoryAuditLog

List-backed signed audit log.

InMemoryFactStore

Dict-backed bi-temporal fact store.

InMemoryJournalStore

Dict-backed journal. Process-local; lost on exit.

InMemoryMemory

Dict-backed implementation of Memory.

InMemoryVectorStore

In-process vector store backed by a Python list.

InProcRuntime

No durability. Each step runs immediately.

InProcessToolHost

A dict-backed ToolHost.

JeevesConfig

Connection details for the Jeeves Gateway.

JeevesGateway

ToolHost-shaped wrapper around the Jeeves Gateway.

JournalStore

Storage surface for the durable runtime.

JournaledRuntime

Runtime that journals every step's result for replay.

LineagePolicy

Allow-list of source prefixes for the entire lineage chain.

LiteLLMModel

Talks to any LiteLLM-supported provider.

MCPClient

One client per MCP server. Holds the live ClientSession.

MCPRegistry

Aggregates many MCPClient instances into a single ToolHost.

MCPServerSpec

How to find and talk to a single MCP server.

Memory

Tiered memory: working blocks, episodic store, semantic graph.

MemoryBlock

An in-context memory block, pinned to every prompt.

Message

A single chat message in the model's conversation.

Mode

Enum where members are also (and must be) strings

Model

LLM provider interface. One adapter per lab (Anthropic, OpenAI, ...).

ModelChunk

A single chunk from a streaming model call.

MultiAgentDebate

N debaters + optional judge orchestration.

NoSandbox

Pass-through wrapper around a ToolHost.

NoTelemetry

No-op telemetry. Very cheap; safe to call on every loop step.

OTelTelemetry

OpenTelemetry-backed Telemetry.

OpenAIEmbedder

Embeddings via OpenAI's embeddings.create API.

OpenAIModel

Talks to OpenAI via openai.AsyncOpenAI.

PermissionDecision

Outcome of a permission check or pre-tool hook.

Permissions

Decides whether a tool call is allowed.

Plan

A list of plan steps in execution order.

PlanAndExecute

Planner → step executor → synthesizer.

PlanStep

One step of a plan.

PostgresFactStore

Postgres-backed bi-temporal fact store.

PostgresJournalStore

Postgres-backed journal. Production-grade durable replay.

PostgresMemory

Postgres-backed Memory.

PostgresRuntime

JournaledRuntime backed by Postgres for cross-host

PostgresVectorStore

Vector store backed by Postgres + pgvector.

ReAct

Observe-think-act in a tight loop.

ReWOO

Plan-then-tool-execute with placeholder substitution.

ReWOOPlan

A list of ReWOO steps (no required ordering — dependencies

ReWOOStep

One step of a ReWOO plan: id + tool + args.

ReWOOStepResult


RedisFactStore

Bi-temporal fact store over plain Redis hashes.

RedisMemory

Redis-backed Memory. Use connect() to construct.

Reflexion

Wrap a base architecture with evaluator + reflector + lesson

RetryPolicy

Exponential-backoff-with-jitter retry schedule.

Role

Enum where members are also (and must be) strings

Router

Classify input → dispatch to ONE specialist Agent.

RouterRoute

One specialist + classification metadata.

RunContext

Typed, immutable context for one agent run.

RunResult

Final outcome of an Agent.run call.

Runtime

Durable execution. Wraps every side effect in a journal entry.

RuntimeSession

Handle to an open durable session held by a Runtime.

Sandbox

Isolation layer for tool execution.

ScriptedModel

Model that emits canned responses, one per call to stream().

ScriptedTurn

SearchResult

One hit from VectorStore.search().

Secrets

Resolution and redaction of named secrets.

SelfRefine

Wrap a base architecture with iterative critique / refine.

Skill

A loadable agent skill.

SkillMetadata

Lightweight skill descriptor — what loads at startup.

SkillRegistry

A keyed collection of Skill instances.

SkillSource

A folder of skills + an optional label.

Span

A trace span handle. Concrete telemetry adapters return their own

SqliteFactStore

Durable bi-temporal fact store rooted at a sqlite file.

SqliteJournalStore

SQLite-backed journal. Durable across process restarts.

SqliteRuntime

JournaledRuntime with a SqliteJournalStore.

StandardPermissions

Mode + allow/deny-list permission policy.

StepResult

The output of executing one step.

SubprocessSandbox

Run each tool call in a fresh child Python process.

Supervisor

Coordinator + workers, glued by a delegate tool.

Swarm

Peer agents passing control through handoff tools.

Team

Namespace for multi-agent team builders.

Telemetry

OpenTelemetry-compatible tracing/metrics surface.

ThoughtNode

One node in the Tree-of-Thoughts search tree.

Tool

A registered tool: definition plus the callable that executes it.

ToolCall

A model-emitted request to invoke a tool.

ToolDef

Schema description of a tool the model can call.

ToolEvent

Tool registry change notification (MCP listChanged etc.).

ToolHost

MCP-aware tool registry. Lazy-loads schemas on demand.

ToolResult

Outcome of a tool invocation.

TreeOfThoughts

Branch + evaluate + prune. BFS beam search over thoughts.

Usage

Token and cost accounting for a model call.

VectorMemory

Pure-Python embedding-backed Memory.

VectorStore

Async protocol for vector stores.

VoyageEmbedder

Embeddings via Voyage AI's voyageai SDK.

set_run_context

Context manager that installs a RunContext for the

Functions

bash_tool(→ jeevesagent.tools.registry.Tool)

Build a Tool that runs a shell command with the

build_graph(→ AgentGraph)

Walk an Agent and return its renderable

classify_model_error(...)

Map an exception from any model SDK to the framework's taxonomy.

default_workdir(→ pathlib.Path)

Return the framework's default workdir for built-in tools,

deterministic_hash(→ str)

Stable hash of arbitrary JSON-serializable parts.

edit_tool(→ jeevesagent.tools.registry.Tool)

Build a Tool that does find-and-replace inside an

filesystem_tools(→ list[jeevesagent.tools.registry.Tool])

Return all three filesystem tools (read + write + edit)

get_run_context(→ RunContext)

Return the RunContext for the currently-running agent.

new_id(→ str)

Return a fresh ULID, optionally prefixed for readability.

read_tool(→ jeevesagent.tools.registry.Tool)

Build a Tool that reads a text file under workdir.

resolve_architecture(...)

Coerce spec to a concrete Architecture.

run_architecture(→ jeevesagent.core.types.RunResult)

Run an Architecture once with a minimal Agent shell.

tool(…)

Promote a callable to a Tool.

write_graph(→ str)

Walk the agent, render to Mermaid, write to path.

write_tool(→ jeevesagent.tools.registry.Tool)

Build a Tool that writes / overwrites a text file

Package Contents

exception jeevesagent.AuthenticationError(message: str, *, cause: BaseException | None = None)[source]

Bases: PermanentModelError

Invalid, missing, or revoked API credentials.


exception jeevesagent.BudgetExceeded(reason: str)[source]

Bases: JeevesAgentError

A run was halted because a budget limit was hit.


reason
exception jeevesagent.CancelledByUser[source]

Bases: JeevesAgentError

A user-driven interruption (signal, timeout) ended the run.


exception jeevesagent.ConfigError[source]

Bases: JeevesAgentError

Invalid or unresolvable configuration passed to Agent.


exception jeevesagent.ContentFilterError(message: str, *, cause: BaseException | None = None)[source]

Bases: PermanentModelError

The provider’s safety system blocked the request or response.

Typically a permanent failure for the same prompt; users may rephrase but the framework should not silently retry.


exception jeevesagent.FreshnessError[source]

Bases: JeevesAgentError

A certified value failed its freshness policy.


exception jeevesagent.InvalidRequestError(message: str, *, cause: BaseException | None = None)[source]

Bases: PermanentModelError

The request was malformed or violated the provider’s API contract — bad parameters, oversized prompt, unknown model name, etc. Fix the request, don’t retry.


exception jeevesagent.IsolationWarning[source]

Bases: UserWarning

Emitted when a memory query is likely to silently miss data because the caller forgot to pass user_id.

Concrete trigger: a backend’s recall / recall_facts runs with user_id=None against a store whose persisted records include at least one non-None user_id. The partition itself is safe (the anonymous bucket and named-user buckets are isolated), but the developer probably wired up multi-tenancy somewhere and forgot to pass user_id here, so recall results will look suspiciously empty.

Subclass of UserWarning so it goes through Python’s standard warnings filter machinery — apps can silence, promote-to-error, or log it however they want, e.g.:

import warnings
from jeevesagent import IsolationWarning
warnings.simplefilter("error", IsolationWarning)  # raise on hit
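
The filter behaviour described above can be tried without the library installed. The following is a minimal sketch that assumes a stand-in IsolationWarning class with the same base (UserWarning) and a hypothetical recall_without_user_id helper that mimics a backend emitting the warning; neither is the library's own code.

```python
import warnings


class IsolationWarning(UserWarning):
    """Stand-in for jeevesagent.IsolationWarning (assumption: same base class)."""


def recall_without_user_id() -> list[str]:
    # Hypothetical helper: mimics a backend that warns on user_id=None.
    warnings.warn(
        "recall ran with user_id=None against a multi-tenant store",
        IsolationWarning,
        stacklevel=2,
    )
    return []


def promoted_to_error() -> bool:
    # With the "error" action the warning is raised as an exception.
    with warnings.catch_warnings():
        warnings.simplefilter("error", IsolationWarning)
        try:
            recall_without_user_id()
        except IsolationWarning:
            return True
    return False


def silenced_count() -> int:
    # With the "ignore" action nothing is recorded or printed.
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("ignore", IsolationWarning)
        recall_without_user_id()
    return len(caught)
```

Promoting the warning to an error is a reasonable choice in test suites, where an empty recall caused by a missing user_id should fail loudly.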


exception jeevesagent.JeevesAgentError[source]

Bases: Exception

Base class for all harness errors.


exception jeevesagent.LineageError[source]

Bases: JeevesAgentError

A certified value failed its lineage policy.


exception jeevesagent.MCPError[source]

Bases: JeevesAgentError

An MCP transport, handshake, or protocol error.


exception jeevesagent.MemoryStoreError[source]

Bases: JeevesAgentError

The memory backend failed an operation.


exception jeevesagent.ModelError(message: str, *, cause: BaseException | None = None)[source]

Bases: JeevesAgentError

A call to the underlying model adapter failed.

Base of the model-error taxonomy: catch this to handle every model failure regardless of whether it is transient or permanent. The SDK exception that triggered the classification is attached via __cause__ (and cause) so debug code can still inspect the raw error.


cause = None
exception jeevesagent.OutputValidationError(message: str, *, raw: str, schema: type, cause: BaseException | None = None)[source]

Bases: JeevesAgentError

The model’s final answer did not validate against the supplied output_schema.

Raised by Agent.run() when the caller passed output_schema= and the model’s final assistant text could not be parsed/validated as the requested Pydantic model — even after the optional one-shot “retry with the validation error” turn.

Carries the raw model output (raw), the underlying pydantic.ValidationError (cause, also exposed via __cause__), and the schema that was being targeted (schema) so callers can build whatever recovery strategy they need (re-prompt with extra examples, fall back to free-text, etc.).


cause = None
raw
schema
exception jeevesagent.PathEscapeError[source]

Bases: ValueError

Raised when a tool argument resolves outside its workdir.


exception jeevesagent.PermanentModelError(message: str, *, cause: BaseException | None = None)[source]

Bases: ModelError

A model call failed in a way that retrying will not fix.

Wrong API key, malformed request, content-filter rejection, deprecated model name, etc. The retry layer raises these immediately without backoff so callers can fail fast and surface the real problem.


exception jeevesagent.PermissionDenied(tool: str, reason: str)[source]

Bases: JeevesAgentError

A tool call was denied by the permission layer or a user hook.


reason
tool
exception jeevesagent.RateLimitError(message: str, *, retry_after: float | None = None, cause: BaseException | None = None)[source]

Bases: TransientModelError

The provider returned a 429 / quota-exhausted response.

Carries retry_after when the provider supplied one. Subclass of TransientModelError so generic transient handlers cover it; catch RateLimitError specifically when you need to surface “slow down” to the caller (e.g. propagate a 429 to your own clients).
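
To illustrate how the ordering of except clauses interacts with this hierarchy, here is a sketch built on stand-in classes that mirror the documented bases and constructor keywords. The classes and the handle function are illustrative assumptions, not the library's implementation.

```python
from __future__ import annotations


class ModelError(Exception):
    """Stand-in for jeevesagent.ModelError."""

    def __init__(self, message: str, *, cause: BaseException | None = None) -> None:
        super().__init__(message)
        self.cause = cause


class TransientModelError(ModelError):
    def __init__(
        self,
        message: str,
        *,
        retry_after: float | None = None,
        cause: BaseException | None = None,
    ) -> None:
        super().__init__(message, cause=cause)
        self.retry_after = retry_after


class RateLimitError(TransientModelError):
    pass


class PermanentModelError(ModelError):
    pass


class AuthenticationError(PermanentModelError):
    pass


def handle(exc: ModelError) -> str:
    """Hypothetical handler: most specific clause first, base class last."""
    try:
        raise exc
    except RateLimitError as err:
        return f"429: retry after {err.retry_after}s"
    except TransientModelError:
        return "retry with backoff"
    except PermanentModelError:
        return "fail fast"
    except ModelError:
        return "unclassified model failure"
```

Because RateLimitError is a TransientModelError, a handler that only catches the transient family still covers it; the dedicated clause is needed only when the caller wants to surface the rate limit itself.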


exception jeevesagent.RuntimeJournalError[source]

Bases: JeevesAgentError

The durable runtime journal is unreadable or inconsistent.


exception jeevesagent.SandboxError[source]

Bases: JeevesAgentError

The sandbox refused or failed to execute a tool.


exception jeevesagent.SkillError[source]

Bases: ValueError

Raised on invalid skill construction or frontmatter.


exception jeevesagent.ToolError[source]

Bases: JeevesAgentError

A tool invocation failed at the tool’s own boundary.


exception jeevesagent.TransientModelError(message: str, *, retry_after: float | None = None, cause: BaseException | None = None)[source]

Bases: ModelError

A model call failed in a way that may succeed on retry.

Covers HTTP 5xx responses, network errors, timeouts, and provider-side rate limits. The retry layer treats this family as retryable and applies backoff.

retry_after (in seconds) carries a provider-supplied hint when one is available, e.g. a Retry-After HTTP header on a 429 response. The retry layer waits for the larger of the policy’s computed backoff and retry_after, so it never waits less than the provider asked for.
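
The "larger of the two" rule can be sketched as follows. The policy_backoff schedule (exponential with full jitter) and its base and cap values are assumptions chosen for illustration; only the max rule in wait_seconds is taken from the description above.

```python
from __future__ import annotations

import random


def policy_backoff(
    attempt: int,
    *,
    base: float = 0.5,
    cap: float = 30.0,
    rng: random.Random | None = None,
) -> float:
    """Hypothetical schedule: exponential ceiling with full jitter."""
    ceiling = min(cap, base * (2 ** attempt))
    rng = rng or random.Random()
    return rng.uniform(0.0, ceiling)


def wait_seconds(backoff: float, retry_after: float | None) -> float:
    """Never wait less than the provider's hint, when one was supplied."""
    if retry_after is None:
        return backoff
    return max(backoff, retry_after)
```

For example, a computed backoff of 2.0 seconds combined with a provider hint of 5.0 seconds yields a 5.0 second wait, while a backoff of 8.0 seconds with the same hint stays at 8.0 seconds.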


retry_after = None
class jeevesagent.ActorCritic(*, actor: jeevesagent.agent.api.Agent, critic: jeevesagent.agent.api.Agent, max_rounds: int = 3, approval_threshold: float = 0.9, critique_template: str | None = None, refine_template: str | None = None)[source]

Actor + adversarial critic with optional different models.

Constructor parameters:

  • actor (required): the generating Agent. Sees the original prompt on round 0 and a refine prompt on subsequent rounds.

  • critic (required): the reviewing Agent. Sees the original prompt + the actor’s current output and produces structured JSON critique.

  • max_rounds: cap on critique-refine cycles after the initial generation. Default 3.

  • approval_threshold: terminate when critique.score is at or above this value. Default 0.9.

  • critique_template / refine_template: override the default prompts. Templates use {prompt}, {output}, {critique}, {issues_bulleted}.
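
The interaction between max_rounds and approval_threshold can be sketched with plain callables in place of Agents. This is an illustrative loop under stated assumptions: the critic returns a dict with score, summary, and issues keys, REFINE_TEMPLATE is a made-up template, and the real class may order its steps differently.

```python
from __future__ import annotations

from collections.abc import Callable

# Hypothetical refine template using the documented placeholders.
REFINE_TEMPLATE = (
    "Task: {prompt}\n"
    "Previous answer: {output}\n"
    "Critique: {critique}\n"
    "Issues:\n{issues_bulleted}\n"
    "Rewrite the answer."
)


def actor_critic(
    prompt: str,
    actor: Callable[[str], str],
    critic: Callable[[str, str], dict],
    *,
    max_rounds: int = 3,
    approval_threshold: float = 0.9,
) -> tuple[str, int]:
    output = actor(prompt)  # round 0: initial generation
    rounds = 0
    for _ in range(max_rounds):
        critique = critic(prompt, output)
        if critique["score"] >= approval_threshold:
            break  # approved: stop refining
        issues_bulleted = "\n".join(f"- {issue}" for issue in critique["issues"])
        output = actor(
            REFINE_TEMPLATE.format(
                prompt=prompt,
                output=output,
                critique=critique["summary"],
                issues_bulleted=issues_bulleted,
            )
        )
        rounds += 1
    return output, rounds


def demo() -> tuple[str, int]:
    # Deterministic fakes: the second draft clears the threshold.
    drafts = iter(["draft 1", "draft 2", "draft 3"])
    scores = iter([0.4, 0.95])

    def actor(_prompt: str) -> str:
        return next(drafts)

    def critic(_prompt: str, _output: str) -> dict:
        return {"score": next(scores), "summary": "too vague", "issues": ["add an example"]}

    return actor_critic("Explain backoff", actor, critic)
```

In the demo the first critique scores 0.4, so one refine round runs; the second critique scores 0.95 and the loop stops with "draft 2" after a single round.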

declared_workers() dict[str, jeevesagent.agent.api.Agent][source]
async run(session: jeevesagent.architecture.base.AgentSession, deps: jeevesagent.architecture.base.Dependencies, prompt: str) collections.abc.AsyncIterator[jeevesagent.core.types.Event][source]
name = 'actor-critic'
class jeevesagent.Agent(instructions: str, *, model: jeevesagent.core.protocols.Model | str | None = None, memory: jeevesagent.core.protocols.Memory | None = None, runtime: jeevesagent.core.protocols.Runtime | None = None, budget: jeevesagent.core.protocols.Budget | None = None, permissions: jeevesagent.core.protocols.Permissions | None = None, hooks: jeevesagent.security.hooks.HookRegistry | None = None, tools: list[jeevesagent.tools.registry.Tool | collections.abc.Callable[Ellipsis, object]] | jeevesagent.core.protocols.ToolHost | jeevesagent.tools.registry.Tool | collections.abc.Callable[Ellipsis, object] | None = None, telemetry: jeevesagent.core.protocols.Telemetry | None = None, audit_log: jeevesagent.security.audit.AuditLog | None = None, max_turns: int = DEFAULT_MAX_TURNS, auto_consolidate: bool = False, architecture: jeevesagent.architecture.Architecture | str | None = None, skills: list[Any] | None = None, retry_policy: jeevesagent.governance.retry.RetryPolicy | None = None)[source]

A fully-async, MCP-native, model-agnostic agent harness.

add_tool(item: jeevesagent.tools.registry.Tool | collections.abc.Callable[Ellipsis, object]) jeevesagent.tools.registry.Tool[source]

Register a tool after construction.

Convenience for plugin-style code that adds tools after the Agent exists. Only works when the underlying tool host is an InProcessToolHost (the default — and the only host that has a writable registry today).

Returns the constructed Tool so callers can introspect the auto-derived schema.

after_tool(fn: jeevesagent.security.hooks.PostToolHook) jeevesagent.security.hooks.PostToolHook[source]

Register a best-effort post-tool callback.

before_tool(fn: jeevesagent.security.hooks.PreToolHook) jeevesagent.security.hooks.PreToolHook[source]

Register a pre-tool hook. First denial wins; allow otherwise.
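
The "first denial wins" rule can be sketched as a short loop. The Decision dataclass, the hook signature, and the two example hooks are hypothetical stand-ins; the library's PreToolHook and PermissionDecision types may look different.

```python
from __future__ import annotations

from collections.abc import Callable
from dataclasses import dataclass


@dataclass(frozen=True)
class Decision:
    """Stand-in for a permission decision (assumed shape)."""

    allowed: bool
    reason: str = ""


# A hook returns a Decision, or None when it has no opinion.
Hook = Callable[[str, dict], "Decision | None"]


def run_pre_tool_hooks(hooks: list[Hook], tool: str, args: dict) -> Decision:
    for hook in hooks:
        decision = hook(tool, args)
        if decision is not None and not decision.allowed:
            return decision  # first denial wins; later hooks are skipped
    return Decision(allowed=True)  # no hook objected


def no_shell(tool: str, args: dict) -> Decision | None:
    return Decision(False, "shell disabled") if tool == "bash" else None


def no_etc(tool: str, args: dict) -> Decision | None:
    path = str(args.get("path", ""))
    return Decision(False, "path outside workdir") if path.startswith("/etc") else None
```

With both hooks registered in that order, a bash call touching /etc/passwd is rejected with the first hook's reason, and a read of notes.txt is allowed.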

async consolidate() int[source]

Manually trigger memory consolidation.

Returns the number of new facts the consolidator extracted, or 0 when the memory backend doesn’t expose a fact store.

Useful when auto_consolidate=False (the default) and you want to batch consolidation at a controlled cadence — e.g. once a day, or before shutdown.

classmethod from_config(path: str | pathlib.Path, *, model: jeevesagent.core.protocols.Model | None = None, memory: jeevesagent.core.protocols.Memory | None = None, runtime: jeevesagent.core.protocols.Runtime | None = None, tools: list[jeevesagent.tools.registry.Tool | collections.abc.Callable[Ellipsis, object]] | jeevesagent.core.protocols.ToolHost | None = None) Agent[source]

Construct an Agent from a TOML config file.

Designed for ops/devops users who want declarative agent config separate from code. Supports the textual / numeric bits — instructions, model spec (string), max_turns, auto_consolidate, budget — and lets callers pass concrete instances for the things TOML can’t reasonably express (real Memory, Runtime, custom Model, tools).

Example agent.toml:

instructions = "You are a research assistant."
model = "claude-opus-4-7"
max_turns = 100
auto_consolidate = true

[budget]
max_tokens = 200_000
max_cost_usd = 5.0
max_wall_clock_minutes = 10
soft_warning_at = 0.8

Then:

agent = Agent.from_config("agent.toml")
classmethod from_dict(cfg: dict[str, Any], *, model: jeevesagent.core.protocols.Model | None = None, memory: jeevesagent.core.protocols.Memory | None = None, runtime: jeevesagent.core.protocols.Runtime | None = None, tools: list[jeevesagent.tools.registry.Tool | collections.abc.Callable[Ellipsis, object]] | jeevesagent.core.protocols.ToolHost | None = None) Agent[source]

Construct an Agent from a parsed config dict.

Same shape as from_config() but skips the file read. Useful when the config comes from somewhere other than a TOML file — environment variables, a Pydantic settings model, a yaml.safe_load result, an HTTP API, etc.

Recognised keys (all optional except instructions and model):

  • instructions: str — required

  • model: str — required (or pass model= kwarg)

  • max_turns: int

  • auto_consolidate: bool

  • budget: dict with any of max_tokens, max_input_tokens, max_output_tokens, max_cost_usd, max_wall_clock_minutes, soft_warning_at

async generate_graph(path: str | pathlib.Path | None = None, *, title: str | None = None) str[source]

Render this agent’s structure as a Mermaid graph.

Walks the agent + its architecture + all sub-agents + every agent’s tools, producing a graph that captures the full team, tool attachments, and architecture-specific relationships (delegate / handoff / classify / etc.).

Returns the Mermaid text. If path is provided, also writes to disk — extension determines the format:

  • .mmd — raw Mermaid source

  • .md — Markdown with the diagram in a mermaid fence (renders on GitHub, IDE markdown previews, Jupyter)

  • .png / .svg — rendered via mermaid.ink; falls back to .mmd next to the path on network failure

Example:

mermaid_text = await agent.generate_graph("graph.md")
print(mermaid_text)

Pass title= to override the diagram title (defaults to the file’s stem, or "Agent" if no path is given).
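
The extension and title rules can be summarised in a small sketch. The function names are hypothetical, the .png / .svg branch is deliberately left unimplemented because it depends on a remote renderer, and raising ValueError for other extensions is an assumption rather than documented behaviour.

```python
from __future__ import annotations

from pathlib import Path

FENCE = "`" * 3  # Markdown code fence, built here to keep this example readable


def default_title(path: str | Path | None) -> str:
    """File stem when a path is given, otherwise "Agent"."""
    return Path(path).stem if path is not None else "Agent"


def text_for_path(mermaid: str, path: str | Path) -> str:
    """Return the text that would be written for the text-based formats."""
    suffix = Path(path).suffix.lower()
    if suffix == ".mmd":
        return mermaid  # raw Mermaid source
    if suffix == ".md":
        return f"{FENCE}mermaid\n{mermaid}\n{FENCE}\n"  # diagram inside a mermaid fence
    if suffix in {".png", ".svg"}:
        raise NotImplementedError("image rendering is not part of this sketch")
    raise ValueError(f"unsupported extension: {suffix!r}")
```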

async recall(query: str, *, kind: str = 'episodic', limit: int = 5) list[Any][source]

Convenience wrapper around self.memory.recall(query, ...).

Returns episodes matching query. For semantic / fact-store recall, use self.memory.facts.recall_text(...) directly.

remove_tool(name: str) bool[source]

Unregister a tool by name. Returns True if a tool was removed, False if no tool with that name was registered.

Same constraint as add_tool(): only works with InProcessToolHost.

async resume(session_id: str, prompt: str, *, user_id: str | None = None, metadata: collections.abc.Mapping[str, Any] | None = None, context: jeevesagent.core.context.RunContext | None = None, extra_tools: list[jeevesagent.tools.registry.Tool] | None = None, emit: Emit | None = None, output_schema: type[pydantic.BaseModel] | None = None, output_validation_retries: int = 1) jeevesagent.core.types.RunResult[source]

Resume a previously-interrupted run from its journal.

Equivalent to agent.run(prompt, session_id=session_id, ...) with the same kwarg surface as run(). Exists as a separate method so the intent is explicit at the call site — when a durable Runtime (e.g. SqliteRuntime) is configured, completed steps replay from the journal instead of re-executing.
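
The replay idea can be sketched with a dict-backed journal: a step that already has a recorded result for the session returns that result instead of running again. DictJournal and its run_step method are hypothetical names for this illustration and do not reflect the Runtime protocol's actual surface.

```python
from __future__ import annotations

from collections.abc import Callable
from typing import Any


class DictJournal:
    """Toy journal keyed by (session_id, step name)."""

    def __init__(self) -> None:
        self._entries: dict[tuple[str, str], Any] = {}

    def run_step(self, session_id: str, step: str, fn: Callable[[], Any]) -> Any:
        key = (session_id, step)
        if key in self._entries:
            return self._entries[key]  # replay: skip the side effect
        result = fn()  # first execution: run and record
        self._entries[key] = result
        return result


calls: list[str] = []


def fetch_page() -> str:
    calls.append("fetch")  # stands in for an expensive or non-idempotent call
    return "page contents"


journal = DictJournal()
first = journal.run_step("sess-1", "step-0", fetch_page)
second = journal.run_step("sess-1", "step-0", fetch_page)  # resumed run
```

After both calls, fetch_page has executed once; a durable journal store provides the same effect across process restarts.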

async run(prompt: str, *, user_id: str | None = None, session_id: str | None = None, metadata: collections.abc.Mapping[str, Any] | None = None, context: jeevesagent.core.context.RunContext | None = None, extra_tools: list[jeevesagent.tools.registry.Tool] | None = None, emit: Emit | None = None, output_schema: type[pydantic.BaseModel] | None = None, output_validation_retries: int = 1) jeevesagent.core.types.RunResult[source]

Run the agent to completion and return its RunResult.

user_id is the namespace partition for memory recall and persistence — episodes and facts stored with one user_id are never visible to a query scoped to a different user_id. None is the “anonymous / single-tenant” bucket. See RunContext for the partitioning contract.

Pass session_id to resume a journaled run — when paired with a durable runtime (e.g. SqliteRuntime), already-completed steps replay from the journal instead of re-executing. Without a durable runtime, session_id just labels the run.

metadata is a free-form bag for application context the framework does not interpret (locale, request id, feature flags). Tools and hooks read it via get_run_context().metadata.

context accepts a fully-formed RunContext instead of the individual kwargs — useful when passing context through multi-agent boundaries that received their parent’s context as a single object. When both context and the individual kwargs are provided, the kwargs override the corresponding fields on context.
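
The override rule can be sketched with a frozen dataclass standing in for RunContext. The field set shown here and the choice to treat None as "not provided" are assumptions made for the example.

```python
from __future__ import annotations

from collections.abc import Mapping
from dataclasses import dataclass, field, replace
from types import MappingProxyType
from typing import Any


@dataclass(frozen=True)
class RunContext:
    """Stand-in with an assumed subset of fields."""

    user_id: str | None = None
    session_id: str | None = None
    metadata: Mapping[str, Any] = field(default_factory=lambda: MappingProxyType({}))


def merge_context(
    context: RunContext | None,
    *,
    user_id: str | None = None,
    session_id: str | None = None,
    metadata: Mapping[str, Any] | None = None,
) -> RunContext:
    base = context if context is not None else RunContext()
    candidates = {"user_id": user_id, "session_id": session_id, "metadata": metadata}
    # Assumption: a kwarg left at None means "keep the value from context".
    overrides = {name: value for name, value in candidates.items() if value is not None}
    return replace(base, **overrides)  # returns a new instance; base is untouched


parent = RunContext(user_id="alice", session_id="s1")
child = merge_context(parent, user_id="bob")
```

Here child carries user_id "bob" and keeps session_id "s1", while parent is unchanged because the dataclass is frozen.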

extra_tools injects additional Tools for this run only — the agent’s configured ToolHost is wrapped so the model sees the extras alongside whatever tools were registered at construction. Used by multi-agent architectures that need to inject coordination tools (e.g. Swarm’s handoff(target, message)) into a peer agent’s loop without permanently mutating that agent’s static configuration.

emit is an awaitable callback invoked once per Event produced during the run (model chunks, tool calls, tool results, architecture progress, errors, …). Default None drops events on the floor (regular run semantics — return only the final RunResult). Multi-agent architectures pass an emit that forwards a sub-Agent’s events into the parent’s stream, so calls like await worker.run(prompt, emit=parent_send) surface the worker’s token-by-token streaming to the outermost agent.stream(...) consumer.
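
A minimal sketch of event forwarding through an emit callback, using plain strings as events. worker_run and parent are hypothetical functions written for this example; real events are Event objects and the worker would be an Agent.

```python
from __future__ import annotations

import asyncio
from collections.abc import Awaitable, Callable

Emit = Callable[[str], Awaitable[None]]


async def worker_run(prompt: str, emit: Emit | None = None) -> str:
    for token in prompt.split():
        if emit is not None:
            await emit(token)  # one call per event
    return prompt.upper()  # the final result is returned either way


async def parent() -> tuple[str, list[str]]:
    seen: list[str] = []

    async def parent_send(event: str) -> None:
        seen.append(event)  # a real parent would push into its own stream

    result = await worker_run("hello from worker", emit=parent_send)
    return result, seen
```

Calling worker_run without emit produces the same return value and simply discards the intermediate events.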

output_schema requests a structured, validated final answer. Pass any Pydantic BaseModel subclass and the framework will (1) append a JSON-schema directive to the system prompt instructing the model to emit a final answer that matches, (2) parse the final assistant text against the schema, and (3) populate RunResult.parsed with the validated instance. RunResult.output keeps the raw text so you can log or display it. Up to output_validation_retries extra turns are spent recovering from a parse failure (the model is given the validation error as feedback and asked to try again); if it still fails after the retry budget, the run raises OutputValidationError. Set retries to 0 to fail fast.
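
The parse, feed back, and retry cycle can be sketched without Pydantic. In this example a dataclass stands in for the schema, json.loads plus type checks stand in for validation, and the model is a callable that receives the previous validation error as feedback; all of these are assumptions made to keep the example dependency-free.

```python
from __future__ import annotations

import json
from collections.abc import Callable
from dataclasses import dataclass


@dataclass
class Answer:
    city: str
    population: int


class OutputValidationError(Exception):
    """Stand-in carrying the documented raw / schema / cause fields."""

    def __init__(self, message: str, *, raw: str, schema: type, cause: BaseException | None = None):
        super().__init__(message)
        self.raw = raw
        self.schema = schema
        self.cause = cause


def parse_answer(raw: str) -> Answer:
    answer = Answer(**json.loads(raw))  # ValueError on bad JSON, TypeError on bad shape
    if not isinstance(answer.city, str) or not isinstance(answer.population, int):
        raise TypeError("city must be str and population must be int")
    return answer


def run_with_schema(model: Callable[[str | None], str], *, retries: int = 1) -> Answer:
    feedback: str | None = None
    for _ in range(max(retries, 0) + 1):  # one attempt plus the retry budget
        raw = model(feedback)
        try:
            return parse_answer(raw)
        except (ValueError, TypeError) as exc:
            feedback = f"Validation failed: {exc}"
            last_exc = exc
    raise OutputValidationError(
        "final answer did not validate", raw=raw, schema=Answer, cause=last_exc
    ) from last_exc


def scripted_model(replies: list[str]) -> Callable[[str | None], str]:
    it = iter(replies)
    return lambda _feedback: next(it)
```

With retries=0 the first malformed reply raises immediately; with the default of 1 the model gets one more turn to correct itself.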

async stream(prompt: str, *, user_id: str | None = None, session_id: str | None = None, metadata: collections.abc.Mapping[str, Any] | None = None, context: jeevesagent.core.context.RunContext | None = None, extra_tools: list[jeevesagent.tools.registry.Tool] | None = None, output_schema: type[pydantic.BaseModel] | None = None, output_validation_retries: int = 1) collections.abc.AsyncIterator[jeevesagent.core.types.Event][source]

Stream Events as the loop produces them.

The loop runs as a background task; events are pushed through a bounded memory stream so a slow consumer applies backpressure. Breaking out of the iteration cancels the producer cleanly. session_id works the same as run()’s — pass an existing one to resume against a durable runtime’s journal. extra_tools works the same as run()’s.

async tools_list() list[str][source]

Return the names of all currently-registered tools.

Convenience that works for any ToolHost. Calls tool_host.list_tools() under the hood and returns just the names; use self.tool_host.list_tools() directly for the full ToolDef records.

with_tool(fn: collections.abc.Callable[Ellipsis, object]) collections.abc.Callable[Ellipsis, object][source]

Decorator-style equivalent of add_tool().

Usage:

@agent.with_tool
async def search(query: str) -> str:
    '''Search a knowledge base.'''
    return f"results for {query}"

Returns the original function unchanged (so it can still be called normally), and registers it as a tool on the agent’s underlying InProcessToolHost. Same constraint as add_tool(): the host must be writable.
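
A registering decorator that hands back the original function can be sketched as follows. ToolRegistry and the fields it records are hypothetical; the library derives a full schema, whereas this example only captures the name, docstring, and parameter names.

```python
from __future__ import annotations

import asyncio
import inspect
from collections.abc import Callable
from typing import Any


class ToolRegistry:
    """Toy writable registry standing in for an in-process tool host."""

    def __init__(self) -> None:
        self.tools: dict[str, dict[str, Any]] = {}

    def with_tool(self, fn: Callable[..., Any]) -> Callable[..., Any]:
        self.tools[fn.__name__] = {
            "description": inspect.getdoc(fn) or "",
            "parameters": list(inspect.signature(fn).parameters),
            "fn": fn,
        }
        return fn  # unchanged, so direct calls keep working


registry = ToolRegistry()


@registry.with_tool
async def search(query: str) -> str:
    """Search a knowledge base."""
    return f"results for {query}"


direct_result = asyncio.run(search("cats"))  # the decorated name is still the coroutine function
```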

property architecture: jeevesagent.architecture.Architecture

The configured Architecture strategy.

Default is ReAct. Pass architecture= to Agent(...) to override.

property budget: jeevesagent.core.protocols.Budget

The configured Budget.

property hooks: jeevesagent.core.protocols.HookHost
property instructions: str

The system prompt the agent runs with.

Surfaced as a public property so multi-agent architectures (e.g. Supervisor) can read each worker’s intended role when composing instructions for the supervising model.

property memory: jeevesagent.core.protocols.Memory

The configured Memory backend.

property model: jeevesagent.core.protocols.Model

The configured Model adapter.

property permissions: jeevesagent.core.protocols.Permissions

The configured Permissions policy.

property runtime: jeevesagent.core.protocols.Runtime

The configured Runtime.

property skills: Any | None

The SkillRegistry of skills registered on this agent (or None if no skills were configured). Useful for inspecting / mutating the skill set after construction.

property tool_host: jeevesagent.core.protocols.ToolHost

The configured ToolHost.

class jeevesagent.AgentGraph[source]

Renderable graph of an agent’s structure.

to_mermaid() str[source]

Render to a Mermaid flowchart TB block.

Output is plain Mermaid (no %%{init}%% directives) so it renders consistently across GitHub, IDE previews, and mermaid.ink.

edges: list[_Edge] = []
nodes: list[_Node] = []
subgraphs: list[_Subgraph] = []
title: str = 'Agent'
class jeevesagent.AgentSession[source]

Mutable per-run state shared between Agent and an Architecture.

The Agent constructs this once per run, the architecture mutates it as iteration progresses, and the Agent reads the final state to build a RunResult.

metadata is a free-form dict architectures use for things that don’t deserve their own field — multi-agent architectures stash worker handoff state, planners stash plans, etc.

cumulative_usage: jeevesagent.core.types.Usage
id: str
instructions: str
interrupted: bool = False
interruption_reason: str | None = None
messages: list[jeevesagent.core.types.Message] = []
metadata: dict[str, Any]
output: str = ''
turns: int = 0
class jeevesagent.AllowAll[source]

Trivial permission policy: every call is allowed.

The default for Agent when no permissions are configured.

async check(call: jeevesagent.core.types.ToolCall, *, context: collections.abc.Mapping[str, Any]) jeevesagent.core.types.PermissionDecision[source]
class jeevesagent.AnthropicModel(model: str = 'claude-opus-4-7', *, client: Any = None, api_key: str | None = None, max_tokens: int = DEFAULT_MAX_TOKENS)[source]

Talks to Claude via anthropic.AsyncAnthropic.

async complete(messages: list[jeevesagent.core.types.Message], *, tools: list[jeevesagent.core.types.ToolDef] | None = None, temperature: float = 1.0, max_tokens: int | None = None) tuple[str, list[jeevesagent.core.types.ToolCall], jeevesagent.core.types.Usage, str][source]

Single-shot non-streaming completion.

Calls client.messages.create(...) (no stream=True, no stream context manager) — Anthropic returns the full Message in one HTTP response. We walk its content blocks once to assemble (text, tool_calls, usage, stop_reason). Used by the non-streaming hot path (agent.run()); agent.stream() keeps using stream().

Falls back to consuming stream() if the underlying client raises (test fakes that only support streaming, or transports that don’t honour single-shot creation).
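
The single-shot-then-stream fallback can be sketched with a stand-in client. StreamOnlyClient and its create/stream methods are hypothetical and only mimic a test fake that supports streaming; they are not the Anthropic SDK surface.

```python
import asyncio
from collections.abc import AsyncIterator


class StreamOnlyClient:
    """Hypothetical fake: single-shot creation raises, streaming works."""

    async def create(self, prompt: str) -> str:
        raise NotImplementedError("single-shot creation unsupported")

    async def stream(self, prompt: str) -> AsyncIterator[str]:
        for word in prompt.split():
            yield word + " "


async def complete(client: StreamOnlyClient, prompt: str) -> str:
    """Try the single-shot path first; on failure, drain the stream."""
    try:
        return await client.create(prompt)
    except Exception:
        chunks = [chunk async for chunk in client.stream(prompt)]
        return "".join(chunks).strip()
```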

async stream(messages: list[jeevesagent.core.types.Message], *, tools: list[jeevesagent.core.types.ToolDef] | None = None, temperature: float = 1.0, max_tokens: int | None = None) collections.abc.AsyncIterator[jeevesagent.core.types.ModelChunk][source]
name = 'claude-opus-4-7'
class jeevesagent.Architecture[source]

Bases: Protocol

Strategy interface for driving the agent loop.

Implementations are async generators: they yield Event values for every milestone they want surfaced (model chunks, tool calls, tool results, budget warnings, errors, architecture-specific progress events).

See Subagent.md for the catalogue of architectures and the design rationale behind the protocol shape.

declared_workers() dict[str, jeevesagent.agent.api.Agent][source]

Sub-Agents this architecture composes, keyed by role name.

Used by multi-agent architectures (Supervisor, Actor-Critic, Debate, Router, Blackboard, Swarm) to expose their workers for introspection (logging, telemetry, eval). Single-agent architectures return {}.

run(session: AgentSession, deps: Dependencies, prompt: str) collections.abc.AsyncIterator[jeevesagent.core.types.Event][source]

Drive iteration; yield events as they happen.

The architecture mutates session (turns, output, cumulative_usage, messages, interrupted, interruption_reason, metadata) as it iterates and yields Events for the caller to forward (or ignore, in non-streaming runs).

Implementations are async generators — declared async def run(...) -> AsyncIterator[Event]: with yield statements in the body.
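
A minimal sketch of the protocol shape, assuming simplified stand-ins for Event and AgentSession (SketchEvent and SketchSession are hypothetical, reduced to the fields used here). It shows an async-generator run() that mutates the session and yields events.

```python
import asyncio
from collections.abc import AsyncIterator
from dataclasses import dataclass, field
from typing import Any


@dataclass
class SketchEvent:
    """Stand-in for Event: a kind plus a free-form payload."""
    kind: str
    payload: dict[str, Any] = field(default_factory=dict)


@dataclass
class SketchSession:
    """Stand-in for AgentSession, reduced to two fields."""
    output: str = ""
    turns: int = 0


class UppercaseArchitecture:
    """Toy single-agent strategy: one turn, no model call."""

    name = "uppercase"

    def declared_workers(self) -> dict[str, Any]:
        return {}  # single-agent architectures expose no workers

    async def run(
        self, session: SketchSession, deps: Any, prompt: str
    ) -> AsyncIterator[SketchEvent]:
        yield SketchEvent("started", {"prompt": prompt})
        session.turns += 1                 # mutate session as iteration progresses
        session.output = prompt.upper()
        yield SketchEvent("completed", {"result": session.output})


async def drive(prompt: str) -> tuple[SketchSession, list[str]]:
    session = SketchSession()
    arch = UppercaseArchitecture()
    kinds = [event.kind async for event in arch.run(session, None, prompt)]
    return session, kinds
```

The caller reads the final state off the session after the generator is exhausted, which mirrors how the Agent is described as building its RunResult.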

name: str
class jeevesagent.AuditEntry(/, **data: Any)[source]

Bases: pydantic.BaseModel

An immutable, signed entry in the audit log.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

action: str
actor: str
model_config

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

payload: dict[str, Any]
seq: int
session_id: str
signature: str
timestamp: datetime.datetime
class jeevesagent.AuditLog[source]

Bases: Protocol

The append-only signed log surface.

async append(*, session_id: str, actor: str, action: str, payload: dict[str, Any]) jeevesagent.core.types.AuditEntry[source]
async query(*, session_id: str | None = None, action: str | None = None) list[jeevesagent.core.types.AuditEntry][source]
class jeevesagent.Blackboard[source]

Public + per-agent private state for the architecture.

post(author: str, content: str, *, kind: str = 'contribution', private_to: str | None = None) BlackboardEntry[source]
render_for(agent_name: str) str[source]

Format the blackboard state as a string for agent_name.

Includes every public entry and the agent’s own private scratchpad if any.

private: dict[str, list[BlackboardEntry]]
public: list[BlackboardEntry] = []
class jeevesagent.BlackboardArchitecture(*, agents: dict[str, jeevesagent.agent.api.Agent], coordinator: jeevesagent.agent.api.Agent | None = None, decider: jeevesagent.agent.api.Agent | None = None, max_rounds: int = 10, coordinator_instructions: str | None = None, decider_instructions: str | None = None)[source]

Coordinator + agents + decider, mediated by a shared blackboard.

declared_workers() dict[str, jeevesagent.agent.api.Agent][source]
async run(session: jeevesagent.architecture.base.AgentSession, deps: jeevesagent.architecture.base.Dependencies, prompt: str) collections.abc.AsyncIterator[jeevesagent.core.types.Event][source]
name = 'blackboard'
class jeevesagent.BlackboardEntry[source]

One contribution on the blackboard.

author: str
content: str
kind: str = 'contribution'
timestamp: datetime.datetime
class jeevesagent.Budget[source]

Bases: Protocol

Resource governance — tokens, calls, cost, wall clock.

async allows_step() jeevesagent.core.types.BudgetStatus[source]
async consume(*, tokens_in: int, tokens_out: int, cost_usd: float) None[source]
class jeevesagent.BudgetStatus(/, **data: Any)[source]

Bases: pydantic.BaseModel

Result of a budget check before each step.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

classmethod blocked_(reason: str) BudgetStatus[source]
classmethod ok_() BudgetStatus[source]
classmethod warn_(reason: str) BudgetStatus[source]
property blocked: bool
model_config

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

reason: str | None = None
state: Literal['ok', 'warn', 'blocked']
property warn: bool
class jeevesagent.CertifiedValue(/, **data: Any)[source]

Bases: pydantic.BaseModel

A value carrying provenance metadata for freshness/lineage checks.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

fetched_at: datetime.datetime
lineage: tuple[str, ...] = ()
model_config

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

schema_version: str = '1'
source: str
valid_until: datetime.datetime | None = None
value: Any
class jeevesagent.ChromaFactStore(client: Any, *, embedder: jeevesagent.core.protocols.Embedder | None = None, collection_name: str = DEFAULT_FACTS_COLLECTION)[source]

Bi-temporal fact store backed by a Chroma collection.

async aclose() None[source]
async all_facts() list[jeevesagent.core.types.Fact][source]
async append(fact: jeevesagent.core.types.Fact) str[source]
classmethod ephemeral(*, embedder: jeevesagent.core.protocols.Embedder | None = None, collection_name: str = DEFAULT_FACTS_COLLECTION) ChromaFactStore[source]
classmethod local(persist_directory: str, *, embedder: jeevesagent.core.protocols.Embedder | None = None, collection_name: str = DEFAULT_FACTS_COLLECTION) ChromaFactStore[source]
async query(*, subject: str | None = None, predicate: str | None = None, object_: str | None = None, valid_at: datetime.datetime | None = None, limit: int = 10, user_id: str | None = None) list[jeevesagent.core.types.Fact][source]
async recall_text(query: str, *, limit: int = 5, valid_at: datetime.datetime | None = None, user_id: str | None = None) list[jeevesagent.core.types.Fact][source]
property embedder: jeevesagent.core.protocols.Embedder
class jeevesagent.ChromaMemory(client: Any, *, embedder: jeevesagent.core.protocols.Embedder | None = None, collection_name: str = DEFAULT_COLLECTION, fact_store: Any | None = None)[source]

Memory backed by chromadb.

Construct via local() for an on-disk persistent client or ephemeral() for a process-local in-memory client.

async append_block(name: str, content: str) None[source]
async consolidate() None[source]
classmethod ephemeral(*, embedder: jeevesagent.core.protocols.Embedder | None = None, collection_name: str = DEFAULT_COLLECTION, with_facts: bool = False, facts_collection_name: str = 'jeeves_facts') ChromaMemory[source]

In-memory client (lost on process exit). Great for tests.

classmethod local(persist_directory: str, *, embedder: jeevesagent.core.protocols.Embedder | None = None, collection_name: str = DEFAULT_COLLECTION, with_facts: bool = False, facts_collection_name: str = 'jeeves_facts') ChromaMemory[source]

Persistent on-disk client at persist_directory.

with_facts=True attaches a ChromaFactStore rooted at the same client so facts persist alongside episodes in the same on-disk store.

async recall(query: str, *, kind: str = 'episodic', limit: int = 5, time_range: tuple[datetime.datetime, datetime.datetime] | None = None, user_id: str | None = None) list[jeevesagent.core.types.Episode][source]
async recall_facts(query: str, *, limit: int = 5, valid_at: datetime.datetime | None = None, user_id: str | None = None) list[jeevesagent.core.types.Fact][source]
async remember(episode: jeevesagent.core.types.Episode) str[source]
async session_messages(session_id: str, *, user_id: str | None = None, limit: int = 20) list[jeevesagent.core.types.Message][source]
async update_block(name: str, content: str) None[source]
async working() list[jeevesagent.core.types.MemoryBlock][source]
facts: Any | None = None
class jeevesagent.ChromaVectorStore(embedder: jeevesagent.core.protocols.Embedder, *, collection_name: str = 'jeeves_vectors', persist_directory: str | None = None, client: Any = None)[source]

Vector store backed by chromadb.

async add(chunks: list[jeevesagent.loader.base.Chunk], ids: list[str] | None = None) list[str][source]
async count() int[source]
async delete(ids: list[str]) None[source]
classmethod from_chunks(chunks: list[jeevesagent.loader.base.Chunk], *, embedder: jeevesagent.core.protocols.Embedder, ids: list[str] | None = None, collection_name: str = 'jeeves_vectors', persist_directory: str | None = None, client: Any = None) ChromaVectorStore[source]

Async one-shot constructor: builds a ChromaVectorStore and adds the given chunks. The call must be awaited.

classmethod from_texts(texts: list[str], *, embedder: jeevesagent.core.protocols.Embedder, metadatas: list[dict[str, Any]] | None = None, ids: list[str] | None = None, collection_name: str = 'jeeves_vectors', persist_directory: str | None = None, client: Any = None) ChromaVectorStore[source]

Async one-shot constructor: builds a ChromaVectorStore from raw text strings. Each string becomes a Chunk with the matching metadata dict, or an empty one if metadatas is None. The call must be awaited.

async get_by_ids(ids: list[str]) list[jeevesagent.loader.base.Chunk][source]
async search(query: str, *, k: int = 4, filter: collections.abc.Mapping[str, Any] | None = None, diversity: float | None = None) list[jeevesagent.vectorstore.base.SearchResult][source]
async search_by_vector(vector: list[float], *, k: int = 4, filter: collections.abc.Mapping[str, Any] | None = None, diversity: float | None = None) list[jeevesagent.vectorstore.base.SearchResult][source]
property embedder: jeevesagent.core.protocols.Embedder
name = 'chroma'
class jeevesagent.CohereEmbedder(model: str = 'embed-english-v3.0', *, client: Any | None = None, api_key: str | None = None, input_type: str = 'search_document')[source]

Embeddings via Cohere’s cohere SDK.

Models and dimensions:

  • embed-english-v3.0 / embed-multilingual-v3.0 -> 1024

  • embed-english-light-v3.0 / embed-multilingual-light-v3.0 -> 384

input_type is required by Cohere v3 models:

  • "search_document" (default) — corpus / fact-store entries

  • "search_query" — retrieval queries

  • "classification" / "clustering" for non-retrieval uses

async embed(text: str) list[float][source]
async embed_batch(texts: list[str]) list[list[float]][source]
dimensions: int
name: str = 'embed-english-v3.0'
class jeevesagent.ConsolidationWorker(memory: jeevesagent.core.protocols.Memory, *, interval_seconds: float = 60.0, on_consolidated: OnConsolidatedCb | None = None, on_error: OnErrorCb | None = None)[source]

Periodic consolidator for any Memory backend.

async run_forever() None[source]

Sleep interval_seconds then consolidate. Repeat until cancelled.

Spawn this in an anyio task group (anyio.create_task_group()). Cancelling the group’s cancel scope terminates the worker cooperatively.

async run_once() int[source]

Run a single consolidation pass. Returns the number of new facts extracted (0 when no fact store / nothing changed).

Errors in memory.consolidate() are routed to on_error and not re-raised, so callers can use this in a polling loop without wrapping it in their own try/except.

property iterations: int

Number of consolidate cycles attempted (test introspection).

property total_extracted: int

Cumulative count of facts extracted across all cycles.

class jeevesagent.Consolidator(*, model: jeevesagent.core.protocols.Model, system_prompt: str = DEFAULT_SYSTEM_PROMPT, max_facts_per_episode: int = 20)[source]

Wraps a Model to extract Fact rows from episodes.

async consolidate(episodes: collections.abc.Iterable[jeevesagent.core.types.Episode], *, store: jeevesagent.memory.facts.FactStore) list[jeevesagent.core.types.Fact][source]

Process episodes; append extracted facts to store; return the new Fact instances in extraction order.

Uses store.append_many when available so the underlying store can batch the embedder calls (one embed_batch API round-trip instead of N individual embed calls). Falls back to per-fact append for stores that haven’t implemented append_many.
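
The batch-or-fallback dispatch can be sketched with duck typing. BatchStore and SimpleStore are hypothetical stores used only to exercise both branches; whether the library detects append_many with getattr is an assumption.

```python
import asyncio
from typing import Any


async def append_all(store: Any, facts: list[Any]) -> list[str]:
    """Prefer one batched call; fall back to one append per fact."""
    append_many = getattr(store, "append_many", None)
    if append_many is not None:
        return list(await append_many(facts))
    return [await store.append(fact) for fact in facts]


class BatchStore:
    def __init__(self) -> None:
        self.batch_calls = 0

    async def append_many(self, facts: list[Any]) -> list[str]:
        self.batch_calls += 1
        return [f"id-{i}" for i, _ in enumerate(facts)]


class SimpleStore:
    def __init__(self) -> None:
        self.append_calls = 0

    async def append(self, fact: Any) -> str:
        self.append_calls += 1
        return f"id-{self.append_calls}"


batch_store, simple_store = BatchStore(), SimpleStore()
batch_ids = asyncio.run(append_all(batch_store, ["a", "b", "c"]))
simple_ids = asyncio.run(append_all(simple_store, ["a", "b"]))
```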

class jeevesagent.Dependencies[source]

Bundled protocol implementations passed to every architecture.

Constructed once per run from the Agent’s configured backends. Architectures treat this as read-only — they call methods on the contained protocols but don’t mutate the struct itself.

Multi-agent architectures (Supervisor, Router, etc.) will grow helper methods on this class — fresh_session, scope_for_worker, with_extra_tools, spawn_child — as they land in v0.5+. v0.3 keeps it as a passive struct.

audit_log: jeevesagent.security.audit.AuditLog | None
budget: jeevesagent.core.protocols.Budget
context: jeevesagent.core.context.RunContext

Typed scope for the run — user_id (memory namespace), session_id (conversation thread), run_id (this specific invocation), and metadata (free-form app context). See RunContext for the per-field semantics.

fast_audit: bool = True

Skip _audit(...) calls when audit_log is None.

fast_budget: bool = True

Skip budget.allows_step() and budget.consume(...) when budget is NoBudget.

fast_hooks: bool = True

Skip hooks.pre_tool / hooks.post_tool dispatch when no hooks have been registered.

fast_permissions: bool = True

Skip per-tool permissions.check(...) when permissions is the no-op AllowAll.

fast_runtime: bool = True

Inline await fn(*args) (skipping runtime.step(...) wrapping + idempotency-key derivation) when runtime is InProcRuntime.

fast_telemetry: bool = True

Skip telemetry.trace(...) contextmanagers + emit_metric calls when telemetry is NoTelemetry.

hooks: jeevesagent.security.hooks.HookRegistry
max_turns: int
memory: jeevesagent.core.protocols.Memory
model: jeevesagent.core.protocols.Model
permissions: jeevesagent.core.protocols.Permissions
runtime: jeevesagent.core.protocols.Runtime
streaming: bool = False

Whether a downstream consumer is reading from agent.stream(). When True, architectures should preserve real-time event-arrival semantics so a consumer that breaks out of the iterator triggers prompt cancellation. When False (the default for agent.run()), architectures may batch events for fewer task-group / channel allocations on the hot path.

telemetry: jeevesagent.core.protocols.Telemetry
tools: jeevesagent.core.protocols.ToolHost
class jeevesagent.EchoModel(*, prefix: str = 'Echo: ', chunk_delay_s: float = 0.0, cost_per_token: float = 0.0)[source]

Echo-style model for tests and demos.

async complete(messages: list[jeevesagent.core.types.Message], *, tools: list[jeevesagent.core.types.ToolDef] | None = None, temperature: float = 1.0, max_tokens: int | None = None) tuple[str, list[jeevesagent.core.types.ToolCall], jeevesagent.core.types.Usage, str][source]

Single-shot echo. Returns the echoed user prompt as one string with synthetic usage. No per-token chunking — used by the non-streaming hot path (agent.run()).

async stream(messages: list[jeevesagent.core.types.Message], *, tools: list[jeevesagent.core.types.ToolDef] | None = None, temperature: float = 1.0, max_tokens: int | None = None) collections.abc.AsyncIterator[jeevesagent.core.types.ModelChunk][source]
name: str = 'echo'
class jeevesagent.Embedder[source]

Bases: Protocol

Text-to-vector embedding model used by the memory subsystem.

async embed(text: str) list[float][source]
async embed_batch(texts: list[str]) list[list[float]][source]
dimensions: int
name: str
class jeevesagent.Episode(/, **data: Any)[source]

Bases: pydantic.BaseModel

A single (input, decisions, tool calls, output) tuple from history.

user_id is the framework-managed namespace partition. Episodes persisted with one user_id value are never visible to memory recall queries scoped to a different user_id. None is its own bucket — the “anonymous / single-tenant” namespace — and does not see episodes belonging to a non-None user_id (and vice versa). Set automatically from RunContext by the agent loop; pass explicitly when constructing episodes outside a run.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

format() str[source]
embedding: list[float] | None = None
id: str = None
input: str
occurred_at: datetime.datetime = None
output: str
session_id: str
tool_calls: list[ToolCall] = None
user_id: str | None = None
class jeevesagent.Event(/, **data: Any)[source]

Bases: pydantic.BaseModel

A single observable record from a running session.

Carries a discriminator (kind) plus a free-form payload. Construct via the class methods to ensure consistent shapes.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

classmethod architecture_event(session_id: str, name: str, **data: Any) Event[source]

Generic architecture-progress event.

name is a namespaced string identifying the source architecture and the kind of progress (e.g. "self_refine.critique", "reflexion.lesson_persisted", "router.classified"). data is merged into the payload alongside name so consumers can pattern-match on name and read structured fields off the rest.

classmethod budget_exceeded(session_id: str, status: BudgetStatus) Event[source]
classmethod budget_warning(session_id: str, status: BudgetStatus) Event[source]
classmethod completed(session_id: str, result: Any) Event[source]
classmethod error(session_id: str, exc: BaseException) Event[source]
classmethod model_chunk(session_id: str, chunk: ModelChunk) Event[source]
classmethod started(session_id: str, prompt: str) Event[source]
classmethod tool_call(session_id: str, call: ToolCall) Event[source]
classmethod tool_result(session_id: str, result: ToolResult) Event[source]
at: datetime.datetime = None
kind: EventKind
payload: dict[str, Any] = None
session_id: str
class jeevesagent.EventKind[source]

Bases: enum.StrEnum

Discriminator values for Event.kind. Members are also (and must be) strings.

ARCHITECTURE_EVENT = 'architecture_event'

Generic architecture-progress event. Carries a namespaced name in the payload (e.g. "self_refine.critique", "reflexion.lesson_persisted", "router.classified") so each architecture can stream its own progress signal without expanding EventKind.

BUDGET_EXCEEDED = 'budget_exceeded'
BUDGET_WARNING = 'budget_warning'
COMPLETED = 'completed'
ERROR = 'error'
MEMORY_RECALL = 'memory_recall'
MEMORY_WRITE = 'memory_write'
MODEL_CHUNK = 'model_chunk'
PERMISSION_ASK = 'permission_ask'
PERMISSION_DECISION = 'permission_decision'
STARTED = 'started'
TOOL_CALL = 'tool_call'
TOOL_RESULT = 'tool_result'
class jeevesagent.FAISSVectorStore(embedder: jeevesagent.core.protocols.Embedder, *, dimension: int | None = None, index_factory_string: str = 'HNSW32', metric: str = 'ip')[source]

Vector store backed by faiss-cpu.

async add(chunks: list[jeevesagent.loader.base.Chunk], ids: list[str] | None = None) list[str][source]
async count() int[source]
async delete(ids: list[str]) None[source]
classmethod from_chunks(chunks: list[jeevesagent.loader.base.Chunk], *, embedder: jeevesagent.core.protocols.Embedder, ids: list[str] | None = None, dimension: int | None = None, index_factory_string: str = 'HNSW32', metric: str = 'ip') FAISSVectorStore[source]

Async one-shot constructor: builds a FAISSVectorStore and adds the given chunks. The call must be awaited.

classmethod from_texts(texts: list[str], *, embedder: jeevesagent.core.protocols.Embedder, metadatas: list[dict[str, Any]] | None = None, ids: list[str] | None = None, dimension: int | None = None, index_factory_string: str = 'HNSW32', metric: str = 'ip') FAISSVectorStore[source]

Async one-shot constructor: builds a FAISSVectorStore from raw text strings. Each string becomes a Chunk with the matching metadata dict, or an empty one if metadatas is None. The call must be awaited.

async get_by_ids(ids: list[str]) list[jeevesagent.loader.base.Chunk][source]
async search(query: str, *, k: int = 4, filter: collections.abc.Mapping[str, Any] | None = None, diversity: float | None = None) list[jeevesagent.vectorstore.base.SearchResult][source]
async search_by_vector(vector: list[float], *, k: int = 4, filter: collections.abc.Mapping[str, Any] | None = None, diversity: float | None = None) list[jeevesagent.vectorstore.base.SearchResult][source]
property embedder: jeevesagent.core.protocols.Embedder
name = 'faiss'
class jeevesagent.Fact(/, **data: Any)[source]

Bases: pydantic.BaseModel

A semantic claim extracted from one or more episodes.

Bi-temporal: valid_from/valid_until tracks when the fact was true in the world; recorded_at tracks when we learned it.

user_id is the framework-managed namespace partition. Facts persisted with one user_id value are never visible to recall queries scoped to a different user_id. Set automatically from RunContext by the agent loop / consolidator; pass explicitly when constructing facts outside a run.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

format() str[source]
confidence: float = 1.0
id: str = None
object: str
predicate: str
recorded_at: datetime.datetime = None
sources: list[str] = None
subject: str
user_id: str | None = None
valid_from: datetime.datetime = None
valid_until: datetime.datetime | None = None
class jeevesagent.FactStore[source]

Bases: Protocol

Storage surface for bi-temporal facts.

async aclose() None[source]
async all_facts() list[jeevesagent.core.types.Fact][source]
async append(fact: jeevesagent.core.types.Fact) str[source]
async query(*, subject: str | None = None, predicate: str | None = None, object_: str | None = None, valid_at: datetime.datetime | None = None, limit: int = 10, user_id: str | None = None) list[jeevesagent.core.types.Fact][source]
async recall_text(query: str, *, limit: int = 5, valid_at: datetime.datetime | None = None, user_id: str | None = None) list[jeevesagent.core.types.Fact][source]
class jeevesagent.FileAuditLog(path: str | pathlib.Path, *, secret: str = '')[source]

JSONL append-only audit log with HMAC signatures.

On construction we read any pre-existing entries to recover the highest seq, so a process restart picks up where the last one left off.
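
The two ideas here, HMAC-signed JSONL entries and seq recovery on restart, can be sketched with the standard library. This is a synchronous illustration under assumptions: SketchFileLog is hypothetical, and the exact bytes the real log signs (here, canonical JSON of every field except the signature) are not specified in this reference.

```python
import hashlib
import hmac
import json
import tempfile
from pathlib import Path


def _sign(secret: str, body: dict) -> str:
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":")).encode()
    return hmac.new(secret.encode(), canonical, hashlib.sha256).hexdigest()


class SketchFileLog:
    def __init__(self, path: Path, *, secret: str = "") -> None:
        self.path, self.secret, self.seq = path, secret, 0
        if path.exists():  # recover the highest seq written so far
            for line in path.read_text().splitlines():
                if line.strip():
                    self.seq = max(self.seq, json.loads(line)["seq"])

    def append(self, *, session_id: str, actor: str, action: str, payload: dict) -> dict:
        self.seq += 1
        body = {"seq": self.seq, "session_id": session_id, "actor": actor,
                "action": action, "payload": payload}
        entry = {**body, "signature": _sign(self.secret, body)}
        with self.path.open("a") as fh:
            fh.write(json.dumps(entry) + "\n")
        return entry

    def verify(self, entry: dict) -> bool:
        body = {k: v for k, v in entry.items() if k != "signature"}
        return hmac.compare_digest(entry["signature"], _sign(self.secret, body))


def demo() -> tuple[int, int, bool, bool]:
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "audit.jsonl"
        first = SketchFileLog(path, secret="k").append(
            session_id="s", actor="agent", action="tool_call", payload={"x": 1})
        restarted = SketchFileLog(path, secret="k")  # simulates a process restart
        second = restarted.append(
            session_id="s", actor="agent", action="tool_result", payload={})
        tampered = {**first, "action": "edited"}
        return first["seq"], second["seq"], restarted.verify(first), restarted.verify(tampered)
```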

async append(*, session_id: str, actor: str, action: str, payload: dict[str, Any]) jeevesagent.core.types.AuditEntry[source]
async query(*, session_id: str | None = None, action: str | None = None) list[jeevesagent.core.types.AuditEntry][source]
property path: pathlib.Path
class jeevesagent.FilesystemSandbox(inner: jeevesagent.core.protocols.ToolHost, *, roots: collections.abc.Iterable[str | pathlib.Path], path_args: collections.abc.Iterable[str] | None = None, auto_detect: bool = True)[source]

Restrict a tool host’s path-typed arguments to declared roots.
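
The containment check at the heart of such a sandbox can be sketched with pathlib. PathEscape, check_path and is_allowed are hypothetical names; how the real class detects path-typed arguments (path_args, auto_detect) is not covered here.

```python
import tempfile
from pathlib import Path


class PathEscape(Exception):
    """Raised when a path resolves outside every declared root."""


def check_path(candidate: str, roots: list[Path]) -> Path:
    # Resolve first so ".." segments and symlinks cannot bypass the check.
    resolved = Path(candidate).resolve()
    for root in roots:
        if resolved.is_relative_to(root.resolve()):
            return resolved
    raise PathEscape(f"{candidate!r} resolves outside the declared roots")


def is_allowed(candidate: str, roots: list[Path]) -> bool:
    try:
        check_path(candidate, roots)
    except PathEscape:
        return False
    return True


workdir = Path(tempfile.mkdtemp())
inside = str(workdir / "notes" / "a.txt")
outside = str(workdir / ".." / "outside.txt")
```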

async call(tool: str, args: collections.abc.Mapping[str, Any], *, call_id: str = '') jeevesagent.core.types.ToolResult[source]
async list_tools(*, query: str | None = None) list[jeevesagent.core.types.ToolDef][source]
async watch() collections.abc.AsyncIterator[jeevesagent.core.types.ToolEvent][source]
property inner: jeevesagent.core.protocols.ToolHost
property roots: tuple[pathlib.Path, ...]
class jeevesagent.FreshnessPolicy[source]

Maximum age for certified values from each source.

per_source maps a source-prefix (matched with startswith) to a timedelta. The first prefix that matches wins. default is used when no prefix matches; if also None, the policy treats all values as fresh.

classmethod from_dict(per_source: dict[str, datetime.timedelta] | None = None, *, default: datetime.timedelta | None = None) FreshnessPolicy[source]
max_age_for(source: str) datetime.timedelta | None[source]
default: datetime.timedelta | None = None
per_source: tuple[tuple[str, datetime.timedelta], ...] = ()
class jeevesagent.Handoff[source]

Per-peer handoff configuration.

  • agent — the peer Agent.

  • input_type — optional Pydantic model. When set, the generated handoff tool’s input schema mirrors this model’s fields, so the calling model gets a typed schema (instead of a string message). The validated payload is exposed to input_filter and surfaces in the swarm.handoff event.

  • input_filter — optional callback (history, payload) -> prompt for selective context forwarding. Default behavior respects the Swarm’s pass_full_history flag.

  • description — override the generated tool’s description. Useful when the agent’s name is opaque (“billing_v2”) but the description should be user-friendly.

  • tool_name — override the auto-generated tool name. Default is "transfer_to_<key>" where <key> is the peer’s key in the swarm’s agents dict.

agent: jeevesagent.agent.api.Agent
description: str | None = None
input_filter: InputFilter | None = None
input_type: type[pydantic.BaseModel] | None = None
tool_name: str | None = None
class jeevesagent.HashEmbedder(dimensions: int = DEFAULT_HASH_DIMENSIONS)[source]

Deterministic SHA256-seeded unit vectors.

For each text, a fresh random.Random is seeded with the SHA256 of the text’s UTF-8 bytes, dimensions Gaussian values are sampled from it, and the result is L2-normalised. The same text always produces the same vector; different texts produce well-distributed vectors, so cosine similarity reflects literal text equality, not semantic similarity.

Use this in tests (fast, no network) and as a default for in-memory backends that need some vector but don’t need real semantic recall.
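
The scheme described above can be sketched in a few lines of standard-library Python. hash_embed is a hypothetical free function; converting the digest to an integer seed is an assumption about how the seeding is done, so the vectors will not necessarily match the library's.

```python
import hashlib
import math
import random


def hash_embed(text: str, dimensions: int = 384) -> list[float]:
    """Deterministic unit vector derived from the SHA256 of the text."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    rng = random.Random(int.from_bytes(digest, "big"))  # fresh RNG per text
    values = [rng.gauss(0.0, 1.0) for _ in range(dimensions)]
    norm = math.sqrt(sum(v * v for v in values))
    return [v / norm for v in values]                   # L2-normalise


vector = hash_embed("hello")
```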

async embed(text: str) list[float][source]
async embed_batch(texts: list[str]) list[list[float]][source]
dimensions: int = 384
name: str = 'hash-embedder-384'
class jeevesagent.HookHost[source]

Bases: Protocol

Aggregator over user-registered lifecycle callbacks.

async on_event(event: jeevesagent.core.types.Event) None[source]
async post_tool(call: jeevesagent.core.types.ToolCall, result: jeevesagent.core.types.ToolResult) None[source]
async pre_tool(call: jeevesagent.core.types.ToolCall) jeevesagent.core.types.PermissionDecision[source]
class jeevesagent.HookRegistry[source]

Implements HookHost.

async on_event(event: jeevesagent.core.types.Event) None[source]
async post_tool(call: jeevesagent.core.types.ToolCall, result: jeevesagent.core.types.ToolResult) None[source]

Best-effort post-tool callbacks. Failures and timeouts are absorbed so they cannot affect the result the loop returns.

async pre_tool(call: jeevesagent.core.types.ToolCall) jeevesagent.core.types.PermissionDecision[source]

Run all pre-tool hooks. First deny wins; otherwise allow.

register_event(hook: EventHook) EventHook[source]
register_post_tool(hook: PostToolHook) PostToolHook[source]
register_pre_tool(hook: PreToolHook) PreToolHook[source]
event_hooks: list[EventHook] = []
hook_timeout_s: float = 5.0
post_tool_hooks: list[PostToolHook] = []
pre_tool_hooks: list[PreToolHook] = []
class jeevesagent.InMemoryAuditLog(*, secret: str = '')[source]

List-backed signed audit log.

async all_entries() list[jeevesagent.core.types.AuditEntry][source]
async append(*, session_id: str, actor: str, action: str, payload: dict[str, Any]) jeevesagent.core.types.AuditEntry[source]
async query(*, session_id: str | None = None, action: str | None = None) list[jeevesagent.core.types.AuditEntry][source]
class jeevesagent.InMemoryFactStore(*, embedder: jeevesagent.core.protocols.Embedder | None = None)[source]

Dict-backed bi-temporal fact store.

All operations are coordinated by an anyio.Lock so concurrent appends from the consolidator and reads from the agent loop don’t tear the index.

When an embedder is supplied, every appended fact’s triple ("subject predicate object") is embedded and stored alongside the fact, and recall_text() ranks by cosine similarity against the query’s embedding. When no embedder is given, recall_text() falls back to token-overlap matching.

async aclose() None[source]
async all_facts() list[jeevesagent.core.types.Fact][source]
async append(fact: jeevesagent.core.types.Fact) str[source]

Append a fact, invalidating any superseded predecessors.

Supersession rule: any existing fact with matching subject + predicate, currently valid (valid_until is None), and a different object gets its valid_until set to the new fact’s valid_from.

async append_many(facts: collections.abc.Iterable[jeevesagent.core.types.Fact]) list[str][source]

Append a batch of facts. Embedder calls are coalesced via Embedder.embed_batch() when an embedder is configured — one network round-trip for the batch instead of N.

async query(*, subject: str | None = None, predicate: str | None = None, object_: str | None = None, valid_at: datetime.datetime | None = None, limit: int = 10, user_id: str | None = None) list[jeevesagent.core.types.Fact][source]
async recall_text(query: str, *, limit: int = 5, valid_at: datetime.datetime | None = None, user_id: str | None = None) list[jeevesagent.core.types.Fact][source]

Rank facts against query.

With an embedder configured: cosine-similarity over the query’s embedding vs each fact triple’s stored embedding. Without one: token-overlap with a small stop-word list (longer overlaps win, ties break by shorter haystack = more specific match).
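
The token-overlap fallback might look roughly like the sketch below. The stop-word list, the tokeniser, and the reading of "longer overlaps win" as "more shared tokens rank higher" are all assumptions for illustration.

```python
import re

# Hypothetical stop-word list; the library's actual list is not documented here.
STOP_WORDS = frozenset({"a", "an", "the", "is", "of", "in", "to", "and"})


def tokens(text: str) -> set[str]:
    return {t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOP_WORDS}


def rank_by_overlap(query: str, triples: list[str], *, limit: int = 5) -> list[str]:
    query_tokens = tokens(query)
    scored = []
    for triple in triples:
        overlap = len(query_tokens & tokens(triple))
        if overlap:
            # Sort key: larger overlap first, then shorter haystack first.
            scored.append((-overlap, len(triple), triple))
    scored.sort()
    return [triple for _, _, triple in scored[:limit]]


ranked = rank_by_overlap(
    "where does alice live in paris",
    ["alice works at acme corp in paris", "alice lives in paris", "bob lives in rome"],
)
```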

user_id partitions the candidate set as a hard namespace boundary — see Fact for semantics.

snapshot() dict[str, jeevesagent.core.types.Fact][source]
property embedder: jeevesagent.core.protocols.Embedder | None
class jeevesagent.InMemoryJournalStore[source]

Dict-backed journal. Process-local; lost on exit.

async aclose() None[source]
async get_step(session_id: str, step_name: str) JournalEntry | None[source]
async get_stream(session_id: str, step_name: str) list[Any] | None[source]
async put_step(session_id: str, step_name: str, value: Any) None[source]
async put_stream(session_id: str, step_name: str, chunks: list[Any]) None[source]
step_keys() list[tuple[str, str]][source]
stream_keys() list[tuple[str, str]][source]
class jeevesagent.InMemoryMemory(*, consolidator: jeevesagent.memory.consolidator.Consolidator | None = None, fact_store: jeevesagent.memory.facts.FactStore | None = None)[source]

Dict-backed implementation of Memory.

async append_block(name: str, content: str) None[source]
async consolidate() None[source]

Process unconsolidated episodes through the configured Consolidator, appending facts to self.facts.

async recall(query: str, *, kind: str = 'episodic', limit: int = 5, time_range: tuple[datetime.datetime, datetime.datetime] | None = None, user_id: str | None = None) list[jeevesagent.core.types.Episode][source]
async recall_facts(query: str, *, limit: int = 5, valid_at: datetime.datetime | None = None, user_id: str | None = None) list[jeevesagent.core.types.Fact][source]
async remember(episode: jeevesagent.core.types.Episode) str[source]
async session_messages(session_id: str, *, user_id: str | None = None, limit: int = 20) list[jeevesagent.core.types.Message][source]

Return user/assistant pairs from prior runs of this session.

Materialises each persisted Episode for the given session_id (within the user_id partition) into a [USER input, ASSISTANT output] pair, ordered oldest-first and capped at limit messages in total, i.e. up to limit / 2 Q/A exchanges. Tool-call traces are not replayed; the final assistant text per turn is sufficient context for follow-ups.
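
A rough sketch of that materialisation, assuming the cap counts individual messages and that episodes are stored oldest-first. SketchEpisode and its field names are hypothetical, and messages are shown as plain (role, text) tuples rather than Message objects.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SketchEpisode:
    """Hypothetical stand-in for Episode; field names are assumptions."""
    session_id: str
    user_id: Optional[str]
    input: str
    output: str


def session_messages(episodes, session_id, *, user_id=None, limit=20):
    """Return (role, text) tuples, oldest first, at most `limit` messages."""
    messages = []
    for ep in episodes:  # assumed to be stored oldest-first
        if ep.session_id == session_id and ep.user_id == user_id:
            messages.append(("user", ep.input))
            messages.append(("assistant", ep.output))
    pairs = limit // 2  # whole Q/A exchanges that fit under the cap
    return messages[-2 * pairs:] if pairs > 0 else []


episodes = [
    SketchEpisode("s1", "alice", "q1", "a1"),
    SketchEpisode("s1", "bob", "other user", "hidden"),
    SketchEpisode("s1", "alice", "q2", "a2"),
    SketchEpisode("s1", "alice", "q3", "a3"),
]
history = session_messages(episodes, "s1", user_id="alice", limit=4)
```

With limit=4 only the two most recent exchanges for alice survive, and bob's episode never appears because it sits in a different user_id partition.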

snapshot() dict[str, Any][source]
async update_block(name: str, content: str) None[source]
async working() list[jeevesagent.core.types.MemoryBlock][source]
facts: jeevesagent.memory.facts.FactStore
class jeevesagent.InMemoryVectorStore(embedder: jeevesagent.core.protocols.Embedder)[source]

In-process vector store backed by a Python list.

async add(chunks: list[jeevesagent.loader.base.Chunk], ids: list[str] | None = None) list[str][source]
async count() int[source]
async delete(ids: list[str]) None[source]
async classmethod from_chunks(chunks: list[jeevesagent.loader.base.Chunk], *, embedder: jeevesagent.core.protocols.Embedder, ids: list[str] | None = None) InMemoryVectorStore[source]

One-shot: construct an InMemoryVectorStore + add chunks.

async classmethod from_texts(texts: list[str], *, embedder: jeevesagent.core.protocols.Embedder, metadatas: list[dict[str, Any]] | None = None, ids: list[str] | None = None) InMemoryVectorStore[source]

One-shot: construct an InMemoryVectorStore from raw text strings (each becomes a Chunk with the matching metadata dict, or empty if metadatas is None).

async get_by_ids(ids: list[str]) list[jeevesagent.loader.base.Chunk][source]
async classmethod load(path: str | pathlib.Path, *, embedder: jeevesagent.core.protocols.Embedder) InMemoryVectorStore[source]

Restore a store previously save()-d. Pass the same embedder kind/dimensions or queries will produce nonsense scores.

async save(path: str | pathlib.Path) None[source]

Write the full store (chunks + vectors + ids) to a JSON file. The embedder is NOT serialized — supply the same embedder when calling load().

async search(query: str, *, k: int = 4, filter: collections.abc.Mapping[str, Any] | None = None, diversity: float | None = None) list[jeevesagent.vectorstore.base.SearchResult][source]
async search_by_vector(vector: list[float], *, k: int = 4, filter: collections.abc.Mapping[str, Any] | None = None, diversity: float | None = None) list[jeevesagent.vectorstore.base.SearchResult][source]
async search_hybrid(query: str, *, k: int = 4, filter: collections.abc.Mapping[str, Any] | None = None, alpha: float = 0.5) list[jeevesagent.vectorstore.base.SearchResult][source]

Hybrid lexical (BM25) + vector search via RRF.

alpha is in [0, 1]: 0 = pure BM25, 1 = pure vector, 0.5 = even weighting (RRF default). Both rankings are computed independently and fused by Reciprocal Rank Fusion, then the top-k survivors are returned.

Embeddings catch semantic similarity (“automobile” ↔ “car”), BM25 catches exact-term hits (model names, error codes, person names) — together they outperform either alone on most retrieval benchmarks.
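
One plausible way to weight Reciprocal Rank Fusion by alpha is sketched below. The exact formula used by search_hybrid() is not stated here, so treat the weighting, the rrf_k=60 constant, and the rrf_fuse name as assumptions.

```python
def rrf_fuse(vector_ranked, bm25_ranked, *, alpha=0.5, k=4, rrf_k=60):
    """Fuse two ranked id lists with alpha-weighted Reciprocal Rank Fusion.

    alpha=1 keeps only the vector ranking's contribution, alpha=0 only BM25's.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    scores = {}
    for rank, doc_id in enumerate(vector_ranked, start=1):
        scores[doc_id] = scores.get(doc_id, 0.0) + alpha / (rrf_k + rank)
    for rank, doc_id in enumerate(bm25_ranked, start=1):
        scores[doc_id] = scores.get(doc_id, 0.0) + (1.0 - alpha) / (rrf_k + rank)
    # Highest fused score first; ties broken by id for determinism.
    ordered = sorted(scores, key=lambda d: (-scores[d], d))
    return ordered[:k]


vector_ranked = ["car-guide", "auto-news", "err-e42"]
bm25_ranked = ["err-e42", "car-guide"]
fused = rrf_fuse(vector_ranked, bm25_ranked, alpha=0.5, k=3)
```

A document that appears near the top of both lists ("car-guide") beats one that tops only a single list, which is the property that makes rank fusion robust to incomparable score scales.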

property embedder: jeevesagent.core.protocols.Embedder
name = 'in-memory'
class jeevesagent.InProcRuntime[source]

No durability. Each step runs immediately.

async session(session_id: str) collections.abc.AsyncIterator[InProcSession][source]
async signal(session_id: str, name: str, payload: Any) None[source]
async step(name: str, fn: collections.abc.Callable[Ellipsis, collections.abc.Awaitable[Any]], *args: Any, idempotency_key: str | None = None, **kwargs: Any) Any[source]
stream_step(name: str, fn: collections.abc.Callable[Ellipsis, collections.abc.AsyncIterator[Any]], *args: Any, **kwargs: Any) collections.abc.AsyncIterator[Any][source]
name = 'inproc'
class jeevesagent.InProcessToolHost(tools: list[Tool | collections.abc.Callable[Ellipsis, Any]] | None = None)[source]

A dict-backed ToolHost.

async call(tool: str, args: collections.abc.Mapping[str, Any], *, call_id: str = '') jeevesagent.core.types.ToolResult[source]
get(name: str) Tool | None[source]
async list_tools(*, query: str | None = None) list[jeevesagent.core.types.ToolDef][source]
register(item: Tool | collections.abc.Callable[Ellipsis, Any]) Tool[source]
unregister(name: str) bool[source]

Remove a tool by name. Returns True if removed.

async watch() collections.abc.AsyncIterator[jeevesagent.core.types.ToolEvent][source]

In-process registry is static; the generator yields nothing.

Iterating over an empty tuple keeps this an async generator (so the return type is AsyncIterator) without ever producing an event at runtime.
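
The empty-tuple trick reads as follows in isolation. This is a standalone sketch with a hypothetical watch() that yields strings instead of ToolEvent objects.

```python
import asyncio
import inspect
from collections.abc import AsyncIterator


async def watch() -> AsyncIterator[str]:
    # The `yield` makes this an async generator function even though the
    # loop body never runs, so callers can still write `async for`.
    for _ in ():
        yield ""


async def collect() -> list:
    return [event async for event in watch()]


events = asyncio.run(collect())
is_async_gen = inspect.isasyncgenfunction(watch)
```

Without the unreachable yield, watch() would be a plain coroutine and `async for` over its result would fail.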

class jeevesagent.JeevesConfig[source]

Connection details for the Jeeves Gateway.

api_key: str
base_url: str = 'https://jeeves.works/mcp'
server_name: str = 'jeeves'
class jeevesagent.JeevesGateway(config: JeevesConfig, *, registry: jeevesagent.mcp.registry.MCPRegistry | None = None)[source]

ToolHost-shaped wrapper around the Jeeves Gateway.

async aclose() None[source]
as_mcp_server() jeevesagent.mcp.spec.MCPServerSpec[source]

Return the MCPServerSpec describing this gateway.

as_registry() jeevesagent.mcp.registry.MCPRegistry[source]

Return a one-server MCPRegistry rooted at this gateway.

async call(tool: str, args: collections.abc.Mapping[str, Any], *, call_id: str = '') jeevesagent.core.types.ToolResult[source]
classmethod from_env(*, env_var: str = JEEVES_API_KEY_ENV, base_url: str | None = None, server_name: str = JEEVES_DEFAULT_SERVER_NAME) JeevesGateway[source]

Build a gateway from the JEEVES_API_KEY environment variable.

async list_tools(*, query: str | None = None) list[jeevesagent.core.types.ToolDef][source]
async watch() collections.abc.AsyncIterator[jeevesagent.core.types.ToolEvent][source]
property config: JeevesConfig
property server_name: str
class jeevesagent.JournalStore[source]

Bases: Protocol

Storage surface for the durable runtime.

async aclose() None[source]
async get_step(session_id: str, step_name: str) JournalEntry | None[source]
async get_stream(session_id: str, step_name: str) list[Any] | None[source]
async put_step(session_id: str, step_name: str, value: Any) None[source]
async put_stream(session_id: str, step_name: str, chunks: list[Any]) None[source]
class jeevesagent.JournaledRuntime(store: jeevesagent.runtime.journal.JournalStore | None = None)[source]

Runtime that journals every step’s result for replay.

Pass any JournalStore (in-memory for tests, sqlite for durable single-process use, future Postgres/DBOS adapters for multi-process / multi-host).
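
The replay idea can be illustrated with a toy journal: a step whose result is already recorded is returned from the journal instead of being re-executed. SketchJournal and journaled_step are hypothetical names, and the real runtime's entry format and idempotency handling may differ.

```python
import asyncio
from typing import Any, Awaitable, Callable


class SketchJournal:
    """Hypothetical dict-backed journal keyed by (session_id, step_name)."""

    def __init__(self) -> None:
        self._steps = {}

    async def get_step(self, session_id: str, step_name: str) -> Any:
        return self._steps.get((session_id, step_name))

    async def put_step(self, session_id: str, step_name: str, value: Any) -> None:
        self._steps[(session_id, step_name)] = value


async def journaled_step(journal, session_id, name, fn: Callable[..., Awaitable[Any]], *args):
    """Run fn once per (session, step); later calls replay the stored result."""
    entry = await journal.get_step(session_id, name)
    if entry is not None:
        return entry["value"]  # replay: fn is not called again
    value = await fn(*args)
    # Wrap the value so a legitimately recorded None is distinguishable from a miss.
    await journal.put_step(session_id, name, {"value": value})
    return value


calls = []


async def expensive(x: int) -> int:
    calls.append(x)
    return x * 2


async def main():
    journal = SketchJournal()
    first = await journaled_step(journal, "s1", "double", expensive, 21)
    replayed = await journaled_step(journal, "s1", "double", expensive, 21)
    return first, replayed


first, replayed = asyncio.run(main())
```

Swapping the dict for a SQLite or Postgres table is what turns this from a test double into durable replay across process restarts.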

async session(session_id: str) collections.abc.AsyncIterator[JournaledSession][source]
async signal(session_id: str, name: str, payload: Any) None[source]
async step(name: str, fn: collections.abc.Callable[Ellipsis, collections.abc.Awaitable[Any]], *args: Any, idempotency_key: str | None = None, **kwargs: Any) Any[source]
stream_step(name: str, fn: collections.abc.Callable[Ellipsis, collections.abc.AsyncIterator[Any]], *args: Any, **kwargs: Any) collections.abc.AsyncIterator[Any][source]
name = 'journaled'
property store: jeevesagent.runtime.journal.JournalStore
class jeevesagent.LineagePolicy[source]

Allow-list of source prefixes for the entire lineage chain.

A CertifiedValue is acceptable if every entry in value.lineage (interpreted as a source prefix) starts with one of the allowed prefixes.
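
The acceptance check can be written as a single prefix test. This is a minimal sketch: lineage_allowed is a hypothetical helper, the source strings are made up, and how the real policy treats an empty lineage is not specified here.

```python
def lineage_allowed(lineage, allowed_sources) -> bool:
    """True when every lineage entry starts with at least one allowed prefix."""
    allowed = tuple(allowed_sources)  # str.startswith accepts a tuple of prefixes
    # Note: an empty lineage passes vacuously in this sketch.
    return all(entry.startswith(allowed) for entry in lineage)


allowed = frozenset({"db://prod/", "api://billing/"})
ok = lineage_allowed(["db://prod/users", "api://billing/invoices"], allowed)
rejected = lineage_allowed(["db://prod/users", "web://scrape/page"], allowed)
```

Because every entry must match, a single untrusted hop anywhere in the chain is enough to reject the value.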

classmethod from_iter(sources: list[str] | tuple[str, Ellipsis]) LineagePolicy[source]
allowed_sources: frozenset[str]
class jeevesagent.LiteLLMModel(model: str, *, api_key: str | None = None, client: Any | None = None, **litellm_kwargs: Any)[source]

Bases: jeevesagent.model.openai.OpenAIModel

Talks to any LiteLLM-supported provider.

Inherits chunk normalisation, tool-call delta aggregation, and message-conversion from OpenAIModel because LiteLLM produces OpenAI-shaped outputs.

class jeevesagent.MCPClient(spec: jeevesagent.mcp.spec.MCPServerSpec, *, session: Any | None = None)[source]

One client per MCP server. Holds the live ClientSession.

async aclose() None[source]

Tear down the session and underlying transport.

async call_tool(name: str, args: dict[str, Any]) Any[source]

Invoke name with args. Returns the SDK’s CallToolResult.

async connect() None[source]

Open the transport and initialise the session.

No-op if already connected (or a fake session was injected at construction time).

async list_tools() list[Any][source]

Return whatever the SDK gave us — a list of tool descriptors.

Each descriptor has name, description, inputSchema. We don’t translate to ToolDef here — the registry does that, since it also assigns names with disambiguation.

property is_connected: bool
property name: str
property spec: jeevesagent.mcp.spec.MCPServerSpec
class jeevesagent.MCPRegistry(items: list[jeevesagent.mcp.spec.MCPServerSpec | jeevesagent.mcp.client.MCPClient] | None = None)[source]

Aggregates many MCPClient instances into a single ToolHost.

async aclose() None[source]
async call(tool: str, args: collections.abc.Mapping[str, Any], *, call_id: str = '') jeevesagent.core.types.ToolResult[source]
async connect() None[source]

Connect every client in parallel and rebuild the index.

async list_tools(*, query: str | None = None) list[jeevesagent.core.types.ToolDef][source]
async refresh() None[source]

Re-pull tool lists from every client and rebuild the index.

async watch() collections.abc.AsyncIterator[jeevesagent.core.types.ToolEvent][source]

Intended to surface MCP listChanged notifications. Not yet implemented; yields nothing.

property server_names: list[str]
class jeevesagent.MCPServerSpec[source]

How to find and talk to a single MCP server.

Construct via the class methods stdio() or http() rather than the bare constructor — they enforce the right combination of fields per transport.

classmethod http(name: str, url: str, headers: dict[str, str] | None = None, *, description: str = '') MCPServerSpec[source]

Connect to url via Streamable HTTP transport.

classmethod stdio(name: str, command: str, args: list[str] | tuple[str, Ellipsis] | None = None, env: dict[str, str] | None = None, *, description: str = '') MCPServerSpec[source]

Spawn command as a subprocess and speak JSON-RPC over its stdio.

args: tuple[str, Ellipsis] = ()
command: str | None = None
description: str = ''
env: tuple[tuple[str, str], Ellipsis] = ()
headers: tuple[tuple[str, str], Ellipsis] = ()
name: str
transport: Literal['stdio', 'http']
url: str | None = None
class jeevesagent.Memory[source]

Bases: Protocol

Tiered memory: working blocks, episodic store, semantic graph.

async append_block(name: str, content: str) None[source]

Append to a named block, creating it if absent.

async consolidate() None[source]

Background: extract semantic facts from recent episodes.

async recall(query: str, *, kind: str = 'episodic', limit: int = 5, time_range: tuple[datetime.datetime, datetime.datetime] | None = None, user_id: str | None = None) list[jeevesagent.core.types.Episode][source]

Retrieve episodes (or facts, when kind='semantic').

When user_id is supplied, results are restricted to episodes stored with that exact user_id value. None is its own bucket (the “anonymous / single-tenant” namespace) — episodes stored with user_id=None are never visible to a query with user_id="alice" and vice versa. Backends MUST honour this filter to preserve the framework’s multi-tenant safety contract.
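
The partition contract boils down to exact equality on user_id, with None treated as an ordinary bucket. The sketch below uses a hypothetical SketchEpisode and visible_to helper to show what a conforming backend filter has to guarantee.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SketchEpisode:
    """Hypothetical stand-in for Episode; only the fields the filter needs."""
    user_id: Optional[str]
    text: str


def visible_to(episodes, user_id):
    # Exact equality only: None matches None and nothing else.
    return [ep for ep in episodes if ep.user_id == user_id]


stored = [
    SketchEpisode(None, "anonymous note"),
    SketchEpisode("alice", "alice's note"),
    SketchEpisode("bob", "bob's note"),
]
alice_view = [ep.text for ep in visible_to(stored, "alice")]
anonymous_view = [ep.text for ep in visible_to(stored, None)]
```

A backend that treated user_id=None as "no filter" would leak every tenant's episodes to anonymous callers, which is exactly what the contract forbids.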

async recall_facts(query: str, *, limit: int = 5, valid_at: datetime.datetime | None = None, user_id: str | None = None) list[jeevesagent.core.types.Fact][source]

Retrieve bi-temporal facts matching query.

Backends that don’t expose a fact store return []. The agent loop calls this directly rather than duck-typing on memory.facts so backends without fact support don’t need any opt-out mechanism.

user_id filters by namespace partition with the same semantics as recall(): None is its own bucket and does not cross-contaminate with non-None values.

async remember(episode: jeevesagent.core.types.Episode) str[source]

Persist an episode. Returns the episode ID.

async session_messages(session_id: str, *, user_id: str | None = None, limit: int = 20) list[jeevesagent.core.types.Message][source]

Return the most-recent limit user/assistant turns from the conversation identified by session_id, in order (oldest first).

This is the conversation-continuity primitive — the agent loop calls it at the top of every run so that reusing a session_id actually continues the chat (the model sees previous turns as real Message history) rather than starting fresh and relying solely on semantic recall.

user_id MUST be respected by backends as a hard namespace partition: messages persisted under one user_id are never visible to a query scoped to a different one. Backends without persisted message logs return [] — the agent loop falls back to the semantic-recall path in that case.

async update_block(name: str, content: str) None[source]

Replace the contents of a named block.

async working() list[jeevesagent.core.types.MemoryBlock][source]

All in-context blocks. Pinned to every prompt.

class jeevesagent.MemoryBlock(/, **data: Any)[source]

Bases: pydantic.BaseModel

An in-context memory block, pinned to every prompt.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

format() str[source]
content: str
name: str
pinned_order: int = 0
updated_at: datetime.datetime = None
class jeevesagent.Message(/, **data: Any)[source]

Bases: pydantic.BaseModel

A single chat message in the model’s conversation.

tool_calls is populated on assistant messages that emitted tool calls in the previous turn — real provider adapters (Anthropic tool_use blocks, OpenAI tool_calls array) need to reconstruct the right wire format from this.

content: str
model_config

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

name: str | None = None
role: Role
tool_call_id: str | None = None
tool_calls: tuple[ToolCall, Ellipsis] = ()
class jeevesagent.Mode[source]

Bases: enum.StrEnum

Enum where members are also (and must be) strings

ACCEPT_EDITS = 'acceptEdits'
BYPASS = 'bypassPermissions'
DEFAULT = 'default'
class jeevesagent.Model[source]

Bases: Protocol

LLM provider interface. One adapter per lab (Anthropic, OpenAI, …).

The required surface is stream(...) — every adapter must implement it. Adapters MAY additionally override complete(...) with a non-streaming (single-shot) call; if not, complete falls back to consuming the stream internally and assembling the full response, which is correct but slower (per-chunk wire + parsing overhead). Architectures use complete on the non-streaming hot path (agent.run()) and stream when a consumer is reading from agent.stream().

stream(messages: list[jeevesagent.core.types.Message], *, tools: list[jeevesagent.core.types.ToolDef] | None = None, temperature: float = 1.0, max_tokens: int | None = None) collections.abc.AsyncIterator[jeevesagent.core.types.ModelChunk][source]

Stream completion chunks. Each chunk is text, tool_call, or finish.
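
The stream-consuming fallback for complete(...) might look roughly like this. It is a sketch under assumptions: SketchChunk mirrors only part of ModelChunk, tool calls are plain dicts, and usage accounting is omitted.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncIterator, Optional


@dataclass
class SketchChunk:
    """Hypothetical stand-in for ModelChunk."""
    kind: str  # "text", "tool_call", or "finish"
    text: Optional[str] = None
    tool_call: Optional[dict] = None
    finish_reason: Optional[str] = None


async def fake_stream() -> AsyncIterator[SketchChunk]:
    yield SketchChunk(kind="text", text="Hello, ")
    yield SketchChunk(kind="text", text="world")
    yield SketchChunk(kind="tool_call", tool_call={"tool": "search", "args": {"q": "x"}})
    yield SketchChunk(kind="finish", finish_reason="stop")


async def complete_from_stream(chunks: AsyncIterator[SketchChunk]):
    """Drain a chunk stream and assemble one (text, tool_calls, finish_reason)."""
    parts, tool_calls, finish_reason = [], [], ""
    async for chunk in chunks:
        if chunk.kind == "text" and chunk.text:
            parts.append(chunk.text)
        elif chunk.kind == "tool_call" and chunk.tool_call is not None:
            tool_calls.append(chunk.tool_call)
        elif chunk.kind == "finish":
            finish_reason = chunk.finish_reason or ""
    return "".join(parts), tool_calls, finish_reason


text, tool_calls, finish_reason = asyncio.run(complete_from_stream(fake_stream()))
```

The caller awaits once and receives the assembled result, which is why an adapter that only implements stream(...) can still serve the non-streaming path.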

name: str
class jeevesagent.ModelChunk(/, **data: Any)[source]

Bases: pydantic.BaseModel

A single chunk from a streaming model call.

Discriminated by kind. Exactly one of the optional fields is set depending on the kind.

finish_reason: str | None = None
kind: Literal['text', 'tool_call', 'finish']
text: str | None = None
tool_call: ToolCall | None = None
usage: Usage | None = None
class jeevesagent.MultiAgentDebate(*, debaters: list[jeevesagent.agent.api.Agent], judge: jeevesagent.agent.api.Agent | None = None, rounds: int = 2, convergence_check: bool = True, convergence_similarity: float = 0.85, debater_instructions: str | None = None, judge_instructions: str | None = None)[source]

N debaters + optional judge orchestration.

declared_workers() dict[str, jeevesagent.agent.api.Agent][source]
async run(session: jeevesagent.architecture.base.AgentSession, deps: jeevesagent.architecture.base.Dependencies, prompt: str) collections.abc.AsyncIterator[jeevesagent.core.types.Event][source]
name = 'debate'
class jeevesagent.NoSandbox(inner: jeevesagent.core.protocols.ToolHost)[source]

Pass-through wrapper around a ToolHost.

async call(tool: str, args: collections.abc.Mapping[str, Any], *, call_id: str = '') jeevesagent.core.types.ToolResult[source]
async list_tools(*, query: str | None = None) list[jeevesagent.core.types.ToolDef][source]
async watch() collections.abc.AsyncIterator[jeevesagent.core.types.ToolEvent][source]
property inner: jeevesagent.core.protocols.ToolHost
class jeevesagent.NoTelemetry[source]

No-op telemetry. Very cheap; safe to call on every loop step.

async emit_metric(name: str, value: float, **attrs: Any) None[source]
async trace(name: str, **attrs: Any) collections.abc.AsyncIterator[jeevesagent.core.types.Span][source]
class jeevesagent.OTelTelemetry(*, tracer_provider: Any | None = None, meter_provider: Any | None = None, instrumentation_name: str = 'jeevesagent')[source]

OpenTelemetry-backed Telemetry.

async emit_metric(name: str, value: float, **attrs: Any) None[source]
async trace(name: str, **attrs: Any) collections.abc.AsyncIterator[jeevesagent.core.types.Span][source]
class jeevesagent.OpenAIEmbedder(model: str = 'text-embedding-3-small', *, dimensions: int | None = None, client: Any | None = None, api_key: str | None = None)[source]

Embeddings via OpenAI’s embeddings.create API.

Default output dimensions are determined by the model:

  • text-embedding-3-small -> 1536
  • text-embedding-3-large -> 3072
  • text-embedding-ada-002 -> 1536

Pass dimensions= only for text-embedding-3-* models, which accept a dimensions parameter to shorten the returned vector.

async embed(text: str) list[float][source]
async embed_batch(texts: list[str]) list[list[float]][source]
dimensions: int
name: str = 'text-embedding-3-small'
class jeevesagent.OpenAIModel(model: str = 'gpt-4o', *, client: Any = None, api_key: str | None = None, base_url: str | None = None)[source]

Talks to OpenAI via openai.AsyncOpenAI.

async complete(messages: list[jeevesagent.core.types.Message], *, tools: list[jeevesagent.core.types.ToolDef] | None = None, temperature: float = 1.0, max_tokens: int | None = None) tuple[str, list[jeevesagent.core.types.ToolCall], jeevesagent.core.types.Usage, str][source]

Single-shot completion (no per-chunk yields).

Tries the OpenAI non-streaming endpoint (stream=False) first. If that fails — e.g. when a test fake client only supports streaming, or a transport doesn’t honor stream=False — falls back to consuming stream() internally and accumulating the result. The fallback still saves the per-chunk yield + Event construction overhead on the architecture side because ReAct calls complete with a single await.

async stream(messages: list[jeevesagent.core.types.Message], *, tools: list[jeevesagent.core.types.ToolDef] | None = None, temperature: float = 1.0, max_tokens: int | None = None) collections.abc.AsyncIterator[jeevesagent.core.types.ModelChunk][source]
name = 'gpt-4o'
class jeevesagent.PermissionDecision(/, **data: Any)[source]

Bases: pydantic.BaseModel

Outcome of a permission check or pre-tool hook.

classmethod allow_(reason: str | None = None) PermissionDecision[source]
classmethod ask_(reason: str | None = None) PermissionDecision[source]
classmethod deny_(reason: str) PermissionDecision[source]
property allow: bool
property ask: bool
decision: Literal['allow', 'deny', 'ask']
property deny: bool
model_config

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

reason: str | None = None
class jeevesagent.Permissions[source]

Bases: Protocol

Decides whether a tool call is allowed.

async check(call: jeevesagent.core.types.ToolCall, *, context: collections.abc.Mapping[str, Any]) jeevesagent.core.types.PermissionDecision[source]
class jeevesagent.Plan(/, **data: Any)[source]

Bases: pydantic.BaseModel

A list of plan steps in execution order.

steps: list[PlanStep] = None
class jeevesagent.PlanAndExecute(*, max_steps: int = 8, planner_prompt: str | None = None, executor_prompt: str | None = None, synthesizer_prompt: str | None = None)[source]

Planner → step executor → synthesizer.

declared_workers() dict[str, jeevesagent.agent.api.Agent][source]
async run(session: jeevesagent.architecture.base.AgentSession, deps: jeevesagent.architecture.base.Dependencies, prompt: str) collections.abc.AsyncIterator[jeevesagent.core.types.Event][source]
name = 'plan-and-execute'
class jeevesagent.PlanStep(/, **data: Any)[source]

Bases: pydantic.BaseModel

One step of a plan.

description: str
id: str
class jeevesagent.PostgresFactStore(pool: Any, *, embedder: jeevesagent.core.protocols.Embedder | None = None)[source]

Postgres-backed bi-temporal fact store.

async aclose() None[source]
async all_facts() list[jeevesagent.core.types.Fact][source]
async append(fact: jeevesagent.core.types.Fact) str[source]
async append_many(facts: collections.abc.Iterable[jeevesagent.core.types.Fact]) list[str][source]
async classmethod connect(dsn: str, *, embedder: jeevesagent.core.protocols.Embedder | None = None, min_size: int = 1, max_size: int = 10) PostgresFactStore[source]
async init_schema() None[source]
async query(*, subject: str | None = None, predicate: str | None = None, object_: str | None = None, valid_at: datetime.datetime | None = None, limit: int = 10, user_id: str | None = None) list[jeevesagent.core.types.Fact][source]
async recall_text(query: str, *, limit: int = 5, valid_at: datetime.datetime | None = None, user_id: str | None = None) list[jeevesagent.core.types.Fact][source]
schema_sql() list[str][source]

Return the DDL for this fact store’s schema.

Exposed so tests can assert on the SQL strings, and so migration scripts can apply the schema in their own transaction.

property embedder: jeevesagent.core.protocols.Embedder | None
class jeevesagent.PostgresJournalStore(pool: Any)[source]

Postgres-backed journal. Production-grade durable replay.

Same shape as SqliteJournalStore but uses asyncpg and a Postgres database. Designed for users who already run a Postgres instance for the rest of their stack (memory, audit, app state) and want their durable-runtime journal to live there too.

Why not a DBOS adapter?

DBOS Python’s workflow model requires @DBOS.workflow() and @DBOS.communicator() decorators at module-load time. Our Runtime.step(name, fn, *args) API takes arbitrary callables at runtime, which doesn’t compose cleanly with DBOS’s static-decoration model. PostgresJournalStore gives the same durability guarantee through our existing JournaledRuntime architecture, with no decorator intrusion on user code.

async aclose() None[source]
async classmethod connect(dsn: str, *, min_size: int = 1, max_size: int = 10) PostgresJournalStore[source]

Open an asyncpg pool and return the store rooted at it.

async get_step(session_id: str, step_name: str) JournalEntry | None[source]
async get_stream(session_id: str, step_name: str) list[Any] | None[source]
async init_schema() None[source]
async put_step(session_id: str, step_name: str, value: Any) None[source]
async put_stream(session_id: str, step_name: str, chunks: list[Any]) None[source]
static schema_sql() list[str][source]

Return the DDL needed to bootstrap this store’s schema.

Idempotent; safe to run on every process start.

class jeevesagent.PostgresMemory(pool: Any, *, embedder: jeevesagent.core.protocols.Embedder | None = None, namespace: str = DEFAULT_NAMESPACE, fact_store: Any | None = None)[source]

Postgres-backed Memory.

pool is an asyncpg.Pool (or anything with the same API). Tests can pass a fake pool whose acquire() returns a fake connection.

async aclose() None[source]
async append_block(name: str, content: str) None[source]
async classmethod connect(dsn: str, *, embedder: jeevesagent.core.protocols.Embedder | None = None, namespace: str = DEFAULT_NAMESPACE, min_size: int = 1, max_size: int = 10, with_facts: bool = False) PostgresMemory[source]

Open an asyncpg pool and register the pgvector codec.

When with_facts=True a PostgresFactStore rooted at the same pool is attached as self.facts so the agent loop’s memory.facts integration just works.

async consolidate() None[source]
async init_schema() None[source]

Apply schema_sql() against the connected pool.

When a PostgresFactStore is attached as self.facts, its schema is initialised in the same call.

async recall(query: str, *, kind: str = 'episodic', limit: int = 5, time_range: tuple[datetime.datetime, datetime.datetime] | None = None, user_id: str | None = None) list[jeevesagent.core.types.Episode][source]
async recall_facts(query: str, *, limit: int = 5, valid_at: datetime.datetime | None = None, user_id: str | None = None) list[jeevesagent.core.types.Fact][source]
async remember(episode: jeevesagent.core.types.Episode) str[source]
schema_sql() list[str][source]

Return the DDL needed to bootstrap this backend’s schema.

Exposed so tests can assert on the SQL without running it; also usable from migration scripts that want to apply the schema in their own transaction.

async session_messages(session_id: str, *, user_id: str | None = None, limit: int = 20) list[jeevesagent.core.types.Message][source]
async update_block(name: str, content: str) None[source]
async working() list[jeevesagent.core.types.MemoryBlock][source]
property embedding_dimensions: int
facts: Any | None = None
property namespace: str
class jeevesagent.PostgresRuntime(pool: Any)[source]

Bases: jeevesagent.runtime.journaled.JournaledRuntime

JournaledRuntime backed by Postgres for cross-host durable replay.

async aclose() None[source]

Close the underlying connection pool.

async classmethod connect(dsn: str, *, min_size: int = 1, max_size: int = 10) PostgresRuntime[source]

Open a fresh asyncpg pool and return the runtime rooted at it.

async init_schema() None[source]

Create the journal tables if they don’t already exist.

name = 'postgres'
class jeevesagent.PostgresVectorStore(embedder: jeevesagent.core.protocols.Embedder, *, dsn: str, table: str = 'jeeves_vectors', dimension: int | None = None)[source]

Vector store backed by Postgres + pgvector.

async add(chunks: list[jeevesagent.loader.base.Chunk], ids: list[str] | None = None) list[str][source]
async count() int[source]
async delete(ids: list[str]) None[source]
async classmethod from_chunks(chunks: list[jeevesagent.loader.base.Chunk], *, embedder: jeevesagent.core.protocols.Embedder, ids: list[str] | None = None, dsn: str, table: str = 'jeeves_vectors', dimension: int | None = None) PostgresVectorStore[source]

One-shot: construct a PostgresVectorStore + add chunks.

async classmethod from_texts(texts: list[str], *, embedder: jeevesagent.core.protocols.Embedder, metadatas: list[dict[str, Any]] | None = None, ids: list[str] | None = None, dsn: str, table: str = 'jeeves_vectors', dimension: int | None = None) PostgresVectorStore[source]

One-shot: construct a PostgresVectorStore from raw text strings (each becomes a Chunk with the matching metadata dict, or empty if metadatas is None).

async get_by_ids(ids: list[str]) list[jeevesagent.loader.base.Chunk][source]
async init_schema(dimension: int) None[source]

Create the table + HNSW index. Idempotent.

async search(query: str, *, k: int = 4, filter: collections.abc.Mapping[str, Any] | None = None, diversity: float | None = None) list[jeevesagent.vectorstore.base.SearchResult][source]
async search_by_vector(vector: list[float], *, k: int = 4, filter: collections.abc.Mapping[str, Any] | None = None, diversity: float | None = None) list[jeevesagent.vectorstore.base.SearchResult][source]
property embedder: jeevesagent.core.protocols.Embedder
name = 'postgres'
class jeevesagent.ReAct(*, max_turns: int | None = None)[source]

Observe-think-act in a tight loop.

The default architecture for every Agent. Other architectures wrap or replace this strategy; see Subagent.md.

max_turns overrides Dependencies.max_turns for this architecture only — useful when wrapping ReAct inside another architecture that sets its own per-leaf cap (Reflexion, Plan-and-Execute, etc.). None means “use whatever the Agent was configured with”.

declared_workers() dict[str, jeevesagent.agent.api.Agent][source]
async run(session: jeevesagent.architecture.base.AgentSession, deps: jeevesagent.architecture.base.Dependencies, prompt: str) collections.abc.AsyncIterator[jeevesagent.core.types.Event][source]
name = 'react'
class jeevesagent.ReWOO(*, max_steps: int = 8, planner_prompt: str | None = None, solver_prompt: str | None = None, parallel_levels: bool = True)[source]

Plan-then-tool-execute with placeholder substitution.

declared_workers() dict[str, jeevesagent.agent.api.Agent][source]
async run(session: jeevesagent.architecture.base.AgentSession, deps: jeevesagent.architecture.base.Dependencies, prompt: str) collections.abc.AsyncIterator[jeevesagent.core.types.Event][source]
name = 'rewoo'
class jeevesagent.ReWOOPlan(/, **data: Any)[source]

Bases: pydantic.BaseModel

A list of ReWOO steps (no required ordering — dependencies are inferred from {{En}} placeholders).

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

steps: list[ReWOOStep] = None
class jeevesagent.ReWOOStep(/, **data: Any)[source]

Bases: pydantic.BaseModel

One step of a ReWOO plan: id + tool + args.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

args: dict[str, Any] = None
property depends_on: list[str]

Extract {{En}} step ids referenced in args.
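
A minimal sketch of how such placeholder extraction could work, assuming step ids look like E1, E2, ... and appear as {{E1}}-style references inside string values of args. The helper name extract_depends_on and the regex are illustrative assumptions, not the library's implementation.

```python
import re
from typing import Any

# Assumed placeholder shape: {{E1}}, {{E2}}, ... (hypothetical; the real pattern may differ).
_PLACEHOLDER = re.compile(r"\{\{(E\d+)\}\}")


def extract_depends_on(args: dict[str, Any]) -> list[str]:
    """Collect referenced step ids from string values, in first-seen order."""
    seen: list[str] = []

    def visit(value: Any) -> None:
        if isinstance(value, str):
            for step_id in _PLACEHOLDER.findall(value):
                if step_id not in seen:
                    seen.append(step_id)
        elif isinstance(value, dict):
            for nested in value.values():
                visit(nested)
        elif isinstance(value, (list, tuple)):
            for nested in value:
                visit(nested)

    visit(args)
    return seen


deps = extract_depends_on(
    {"query": "compare {{E1}} with {{E2}}", "limit": 3, "extra": ["{{E1}}"]}
)
print(deps)
```

A step whose list comes back empty has no upstream dependencies, so a planner could in principle schedule it in the first parallel level.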

id: str
tool: str
class jeevesagent.ReWOOStepResult(/, **data: Any)[source]

Bases: pydantic.BaseModel

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

error: str | None = None
output: str
step_id: str
tool: str
class jeevesagent.RedisFactStore(client: Any, *, embedder: jeevesagent.core.protocols.Embedder | None = None, key_prefix: str = DEFAULT_KEY_PREFIX)[source]

Bi-temporal fact store over plain Redis hashes.

async aclose() None[source]
async all_facts() list[jeevesagent.core.types.Fact][source]
async append(fact: jeevesagent.core.types.Fact) str[source]
async classmethod connect(url: str = 'redis://localhost:6379/0', *, embedder: jeevesagent.core.protocols.Embedder | None = None, key_prefix: str = DEFAULT_KEY_PREFIX) RedisFactStore[source]

async query(*, subject: str | None = None, predicate: str | None = None, object_: str | None = None, valid_at: datetime.datetime | None = None, limit: int = 10, user_id: str | None = None) list[jeevesagent.core.types.Fact][source]
async recall_text(query: str, *, limit: int = 5, valid_at: datetime.datetime | None = None, user_id: str | None = None) list[jeevesagent.core.types.Fact][source]
property embedder: jeevesagent.core.protocols.Embedder
class jeevesagent.RedisMemory(client: Any, *, embedder: jeevesagent.core.protocols.Embedder | None = None, key_prefix: str = DEFAULT_KEY_PREFIX, index_name: str = DEFAULT_INDEX_NAME, use_vector_index: bool = True, fact_store: Any | None = None)[source]

Redis-backed Memory. Use connect() to construct.

async aclose() None[source]
async append_block(name: str, content: str) None[source]
async classmethod connect(url: str = 'redis://localhost:6379/0', *, embedder: jeevesagent.core.protocols.Embedder | None = None, key_prefix: str = DEFAULT_KEY_PREFIX, index_name: str = DEFAULT_INDEX_NAME, use_vector_index: bool = True, with_facts: bool = False, fact_key_prefix: str = 'jeeves:fact:') RedisMemory[source]

Open an async Redis connection.

with_facts=True attaches a RedisFactStore sharing the same client; facts go to {fact_key_prefix}* keys so they don’t collide with episode keys.

async consolidate() None[source]
async ensure_index() None[source]

Create the RediSearch HNSW index, if not already present.

Skipped silently when use_vector_index=False or when RediSearch isn’t available on the server.

async recall(query: str, *, kind: str = 'episodic', limit: int = 5, time_range: tuple[datetime.datetime, datetime.datetime] | None = None, user_id: str | None = None) list[jeevesagent.core.types.Episode][source]
async recall_facts(query: str, *, limit: int = 5, valid_at: datetime.datetime | None = None, user_id: str | None = None) list[jeevesagent.core.types.Fact][source]
async remember(episode: jeevesagent.core.types.Episode) str[source]
async session_messages(session_id: str, *, user_id: str | None = None, limit: int = 20) list[jeevesagent.core.types.Message][source]
async update_block(name: str, content: str) None[source]
async working() list[jeevesagent.core.types.MemoryBlock][source]
facts: Any | None = None
class jeevesagent.Reflexion(*, base: jeevesagent.architecture.base.Architecture | None = None, max_attempts: int = 3, threshold: float = 0.8, evaluator_prompt: str | None = None, reflector_prompt: str | None = None, lessons_block_name: str = 'reflexion_lessons', lesson_store: jeevesagent.vectorstore.base.VectorStore | None = None, top_k_lessons: int = 5)[source]

Wrap a base architecture with evaluator + reflector + lesson memory.

See module docstring for the full mechanism. Constructor parameters:

  • base — architecture to retry. Default ReAct.

  • max_attempts — cap on retries within a single run. Default 3.

  • threshold — minimum evaluator score to terminate as success. Default 0.8.

  • evaluator_prompt / reflector_prompt — override the default system prompts.

  • lessons_block_name — memory working-block name for persisted lessons. Default "reflexion_lessons". Multiple Reflexion-wrapped agents in the same memory should pick distinct names.

  • lesson_store — optional VectorStore enabling selective recall. When set, lessons are stored as embedded chunks and only the top_k_lessons most relevant lessons are surfaced on each attempt (instead of all past lessons). This avoids context bloat as lessons accumulate.

  • top_k_lessons — how many lessons to recall per attempt (selective-recall mode only). Default 5.
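
The interplay of max_attempts and threshold can be sketched as a plain retry loop. This is an illustration under stated assumptions: the three callables (attempt, evaluate, reflect) are hypothetical stand-ins for the base architecture, the evaluator and the reflector, and the real class may differ, for example in how it persists lessons.

```python
from collections.abc import Callable


def reflexion_loop(
    attempt: Callable[[str, list[str]], str],
    evaluate: Callable[[str], float],
    reflect: Callable[[str, float], str],
    prompt: str,
    *,
    max_attempts: int = 3,
    threshold: float = 0.8,
) -> tuple[str, list[str]]:
    """Retry until the evaluator score reaches the threshold or attempts run out."""
    lessons: list[str] = []
    answer = ""
    for _ in range(max_attempts):
        answer = attempt(prompt, lessons)      # base architecture run
        score = evaluate(answer)               # evaluator score
        if score >= threshold:                 # good enough: stop as success
            break
        lessons.append(reflect(answer, score))  # feed a lesson into the next attempt
    return answer, lessons


# Hypothetical stubs standing in for model calls.
def fake_attempt(prompt: str, lessons: list[str]) -> str:
    return f"draft-{len(lessons)}"


def fake_evaluate(answer: str) -> float:
    return 0.9 if answer == "draft-1" else 0.5


def fake_reflect(answer: str, score: float) -> str:
    return f"{answer} scored {score}; be more specific"


answer, lessons = reflexion_loop(fake_attempt, fake_evaluate, fake_reflect, "summarise")
print(answer, len(lessons))
```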

declared_workers() dict[str, jeevesagent.agent.api.Agent][source]
async run(session: jeevesagent.architecture.base.AgentSession, deps: jeevesagent.architecture.base.Dependencies, prompt: str) collections.abc.AsyncIterator[jeevesagent.core.types.Event][source]
name = 'reflexion'
class jeevesagent.RetryPolicy[source]

Exponential-backoff-with-jitter retry schedule.

The default is sensible for production: up to 3 attempts (one initial + two retries), starting at 1 s, doubling each attempt, capped at 30 s, with ±10% jitter so synchronised clients don’t reform a thundering herd.

Examples:

# default — sensible for most apps
RetryPolicy()

# disable retries (fail fast)
RetryPolicy.disabled()

# aggressive — survives long provider blips
RetryPolicy.aggressive()

# tuned to a specific SLO
RetryPolicy(max_attempts=4, initial_delay_s=0.5, max_delay_s=15)

The schedule applies between attempts: the first call has no delay, the second is delayed by initial_delay_s (± jitter), the third by initial_delay_s * multiplier (± jitter), etc., each capped at max_delay_s. Provider-supplied Retry-After hints (carried on retry_after) override the computed delay when they ask for more time — we never sleep less than the provider asked for.

classmethod aggressive() RetryPolicy[source]

Up to 6 attempts, faster initial backoff, longer cap. Use when the underlying provider is known-flaky and the caller prefers slow success over fast failure.

classmethod disabled() RetryPolicy[source]

Single attempt, no retries — fail fast on any error.

is_enabled() bool[source]

True when the policy permits at least one retry.

initial_delay_s: float = 1.0

Backoff before the FIRST retry (i.e. between attempts 1 and 2). Subsequent retries use initial_delay_s * multiplier**n.

jitter: float = 0.1

Fractional ±jitter applied to each computed delay. 0.1 = ±10%. Set to 0 for deterministic backoff (useful in tests).

max_attempts: int = 3

Maximum total attempts including the first call. 1 means no retries; the call either succeeds or raises immediately. The minimum-meaningful retry policy is therefore max_attempts=2.

max_delay_s: float = 30.0

Cap on any single backoff. Prevents runaway sleeps when multiplier is large or max_attempts is high.

multiplier: float = 2.0

Geometric growth between successive retries. 2.0 doubles the delay each time; 1.0 keeps the delay constant (fixed-interval).

class jeevesagent.Role[source]

Bases: enum.StrEnum

Enum where members are also (and must be) strings

ASSISTANT = 'assistant'
SYSTEM = 'system'
TOOL = 'tool'
USER = 'user'
class jeevesagent.Router(*, routes: list[RouterRoute], fallback_route: str | None = None, require_confidence_above: float = 0.0, classifier_prompt: str | None = None)[source]

Classify input → dispatch to ONE specialist Agent.

declared_workers() dict[str, jeevesagent.agent.api.Agent][source]
async run(session: jeevesagent.architecture.base.AgentSession, deps: jeevesagent.architecture.base.Dependencies, prompt: str) collections.abc.AsyncIterator[jeevesagent.core.types.Event][source]
name = 'router'
class jeevesagent.RouterRoute[source]

One specialist + classification metadata.

name is what the classifier emits in its route: line and must be unique within a Router. description is shown to the classifier alongside the name — keep it specific and distinguishing so the classifier picks reliably.

agent: jeevesagent.agent.api.Agent
description: str = ''
name: str
class jeevesagent.RunContext[source]

Typed, immutable context for one agent run.

Set once at the start of Agent.run() and propagated to every architecture, tool, hook, sub-agent, and memory operation via a contextvars.ContextVar. The framework treats user_id and session_id as first-class fields (typed, namespaced); metadata is an opaque bag for app-specific keys the framework does not interpret.

Construct one directly when you need to spawn work outside an active run with explicit scope:

ctx = RunContext(user_id="alice", session_id="conv_42")
async with set_run_context(ctx):
    await my_tool(...)

Inside an agent run, prefer get_run_context() over constructing a new one — that gives you the live context the framework set up.

get(key: str, default: Any = None) Any[source]

Shorthand for self.metadata.get(key, default).

with_overrides(*, user_id: str | None | _Sentinel = _Sentinel.UNSET, session_id: str | None | _Sentinel = _Sentinel.UNSET, run_id: str | _Sentinel = _Sentinel.UNSET, metadata: collections.abc.Mapping[str, Any] | _Sentinel = _Sentinel.UNSET) RunContext[source]

Return a new context with selected fields replaced.

Used by multi-agent architectures when spawning sub-agents that need to inherit most of the parent’s context but with a derived session_id or augmented metadata. The sentinel makes “leave this field unchanged” distinguishable from “explicitly set this field to None”.

metadata: collections.abc.Mapping[str, Any]

Free-form application context. Use this for keys the framework does not need to understand — locale, request id, feature flags, tenant id beyond user_id, etc. Read inside tools / hooks via get_run_context().metadata.

run_id: str = ''

Unique identifier for this single Agent.run() invocation. Distinct from session_id (which identifies a conversation that may span many runs). Auto-set by Agent.run(); an explicit value passed in by the caller is overridden.

session_id: str | None = None

Conversation thread identifier. Reusing the same session_id across calls signals “continue this conversation” — the framework will rehydrate prior session messages so the model sees real chat history, not just memory recall. None means “fresh conversation”; the framework auto-generates one inside Agent.run() if not supplied.

user_id: str | None = None

Namespace for memory recall + persistence. None is the “anonymous / single-tenant” bucket; episodes / facts stored with user_id=None never see episodes / facts stored with a non-None user_id and vice versa. The framework treats this as a hard partition key, not a soft filter.

class jeevesagent.RunResult(/, **data: Any)[source]

Bases: pydantic.BaseModel

Final outcome of an Agent.run call.

output is always the raw assistant text (the JSON itself when a structured-output schema was supplied). parsed is the validated Pydantic instance — populated only when the caller passed output_schema= to Agent.run(). Use whichever fits the call site:

# free-form text run
result = await agent.run("summarise this PDF")
print(result.output)

# structured-output run
result = await agent.run(prompt, output_schema=Invoice)
invoice: Invoice = result.parsed   # typed, validated

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

cost_usd: float = 0.0
property duration: datetime.timedelta

Wall-clock latency between started_at and finished_at.

finished_at: datetime.datetime
interrupted: bool = False
interruption_reason: str | None = None
model_config

Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.

output: str
parsed: Any | None = None

The validated Pydantic instance when output_schema= was supplied to Agent.run(); None otherwise. Typed as Any to keep the runtime type free; the call site has the schema and can cast or annotate as needed.

session_id: str
started_at: datetime.datetime
tokens_in: int = 0
tokens_out: int = 0
property total_tokens: int

Convenience: tokens_in + tokens_out.

turns: int
class jeevesagent.Runtime[source]

Bases: Protocol

Durable execution. Wraps every side effect in a journal entry.

session(session_id: str) contextlib.AbstractAsyncContextManager[RuntimeSession][source]

Open or resume a durable session.

async signal(session_id: str, name: str, payload: Any) None[source]

Send an external signal (e.g., human approval) to a session.

async step(name: str, fn: collections.abc.Callable[Ellipsis, collections.abc.Awaitable[Any]], *args: Any, idempotency_key: str | None = None, **kwargs: Any) Any[source]

Execute fn as a journaled step. On resume, the cached result is replayed instead of running fn again.
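
A toy illustration of journaled replay, assuming an in-memory dict in place of a durable journal. TinyJournal and charge are hypothetical names; a real Runtime would also scope entries by session and persist them.

```python
import asyncio
from collections.abc import Awaitable, Callable
from typing import Any


class TinyJournal:
    """In-memory stand-in for a durable journal (illustration only)."""

    def __init__(self) -> None:
        self._entries: dict[str, Any] = {}

    async def step(
        self, name: str, fn: Callable[..., Awaitable[Any]], *args: Any, **kwargs: Any
    ) -> Any:
        if name in self._entries:          # already journaled: replay, skip the side effect
            return self._entries[name]
        result = await fn(*args, **kwargs)
        self._entries[name] = result       # record before returning
        return result


calls = 0


async def charge(amount: int) -> str:
    """Pretend side effect; counts how often it really runs."""
    global calls
    calls += 1
    return f"charged {amount}"


async def main() -> tuple[str, str]:
    journal = TinyJournal()
    first = await journal.step("charge", charge, 10)
    replayed = await journal.step("charge", charge, 10)  # served from the journal
    return first, replayed


first, replayed = asyncio.run(main())
print(first, replayed, calls)
```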

stream_step(name: str, fn: collections.abc.Callable[Ellipsis, collections.abc.AsyncIterator[Any]], *args: Any, **kwargs: Any) collections.abc.AsyncIterator[Any][source]

Execute a streaming step. Replays the aggregate on resume.

name: str
class jeevesagent.RuntimeSession[source]

Bases: Protocol

Handle to an open durable session held by a Runtime.

async deliver(name: str, payload: Any) None[source]
id: str
class jeevesagent.Sandbox[source]

Bases: Protocol

Isolation layer for tool execution.

async execute(tool: jeevesagent.core.types.ToolDef, args: collections.abc.Mapping[str, Any]) jeevesagent.core.types.ToolResult[source]
with_filesystem(root: str) contextlib.AbstractAsyncContextManager[None][source]

Temporary filesystem sandbox for the duration of the context.

class jeevesagent.ScriptedModel(turns: list[ScriptedTurn])[source]

Model that emits canned responses, one per call to stream().

async complete(messages: list[jeevesagent.core.types.Message], *, tools: list[jeevesagent.core.types.ToolDef] | None = None, temperature: float = 1.0, max_tokens: int | None = None) tuple[str, list[jeevesagent.core.types.ToolCall], jeevesagent.core.types.Usage, str][source]

Single-shot replay of the next scripted turn.

Mirrors stream() but returns the turn’s text + tool_calls + usage in one tuple. Used by the non-streaming hot path (agent.run()); agent.stream() keeps using stream() for per-chunk replay.

async stream(messages: list[jeevesagent.core.types.Message], *, tools: list[jeevesagent.core.types.ToolDef] | None = None, temperature: float = 1.0, max_tokens: int | None = None) collections.abc.AsyncIterator[jeevesagent.core.types.ModelChunk][source]
name: str = 'scripted'
property remaining: int
class jeevesagent.ScriptedTurn[source]
text: str = ''
tool_calls: list[jeevesagent.core.types.ToolCall] = []
usage: jeevesagent.core.types.Usage
class jeevesagent.SearchResult[source]

One hit from VectorStore.search().

  • chunk — the matched chunk (with its full metadata).

  • score — similarity in [-1, 1] for cosine; backend-specific for other distance metrics. Higher = more similar.

  • id — the store-assigned id (so callers can delete() or get_by_ids() later).
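
For reference, the [-1, 1] range for cosine scores follows directly from the definition. The helper below is a plain-Python sketch for illustration, not part of the library; returning 0.0 for a zero-length vector is a convention chosen here.

```python
import math
from collections.abc import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """dot(a, b) / (|a| * |b|); 0.0 when either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))   # same direction
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))   # orthogonal
print(cosine_similarity([1.0, 0.0], [-1.0, 0.0]))  # opposite direction
```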

chunk: jeevesagent.loader.base.Chunk
id: str
score: float
class jeevesagent.Secrets[source]

Bases: Protocol

Resolution and redaction of named secrets.

redact(text: str) str[source]
async resolve(ref: str) str[source]
async store(ref: str, value: str) None[source]
class jeevesagent.SelfRefine(*, base: jeevesagent.architecture.base.Architecture | None = None, max_rounds: int = 3, critic_prompt: str | None = None, refiner_prompt: str | None = None, stop_phrase: str = 'no issues')[source]

Wrap a base architecture with iterative critique / refine.

base defaults to ReAct; the round-0 generator runs the base architecture’s full strategy. Subsequent rounds are text-only model calls — no tools, just critique and rewrite.

declared_workers() dict[str, jeevesagent.agent.api.Agent][source]
async run(session: jeevesagent.architecture.base.AgentSession, deps: jeevesagent.architecture.base.Dependencies, prompt: str) collections.abc.AsyncIterator[jeevesagent.core.types.Event][source]
name = 'self-refine'
class jeevesagent.Skill(path: str | pathlib.Path, *, source_label: str | None = None)[source]

A loadable agent skill.

classmethod from_text(text: str, *, source_label: str | None = None) Skill[source]

Build an inline skill from a SKILL.md-formatted string.

No filesystem path; bundled scripts and tools.py aren’t accessible. Useful for one-off skill definitions in code.

list_files() list[pathlib.Path][source]

Enumerate every file bundled with this skill.

load_body() str[source]

Return the full SKILL.md body (without frontmatter).

property description: str
metadata
property name: str
path
property pending_tools: list[jeevesagent.tools.registry.Tool]

The Tool instances this skill will register on load.

Both Mode B (Python @tool from tools.py) and Mode C (subprocess wrappers from frontmatter tools: manifest) contribute to this list. Empty for pure markdown skills.

class jeevesagent.SkillMetadata[source]

Lightweight skill descriptor — what loads at startup.

The body is NOT in here; it’s read on demand via Skill.load_body(). Keep this small — it lives in the system prompt for the entire agent’s lifetime.

to_catalog_line() str[source]

One-line catalog entry for the system prompt.

allowed_tools: list[str] | None = None
compatibility: str | None = None
declared_tool_count: int = 0
description: str
extra: dict[str, Any]
has_python_tools: bool = False
license: str | None = None
name: str
source_label: str | None = None
class jeevesagent.SkillRegistry(items: collections.abc.Iterable[SkillSpec] | None = None)[source]

A keyed collection of Skill instances.

add(skill: jeevesagent.skills.skill.Skill) None[source]

Append (or override) a single skill after construction.

catalog_section() str[source]

The markdown bullet list that gets appended to the agent’s system prompt.

Empty registry → empty string (so the constructor can unconditionally call this without polluting the system prompt with a blank “Available skills” header).

get(name: str) jeevesagent.skills.skill.Skill | None[source]
is_loaded(name: str) bool[source]

Whether the skill’s pending tools have been registered.

load(name: str) str[source]

Return the full body of a skill (the load_skill tool’s result). Raises SkillError for unknown names so the model gets a clear error in the tool result.

Does NOT register pending Tools. For the full load-and-register flow, see load_with_tools().

load_with_tools(name: str) tuple[str, list[jeevesagent.tools.registry.Tool]][source]

Return (body, newly_pending_tools) — the body of the skill plus the Tool instances the framework should register with the agent’s tool host on this load.

Idempotent: subsequent calls for the same skill return the body and an empty tool list, since registration only needs to happen once.
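
The idempotent behaviour can be pictured with a small stand-in registry. TinyRegistry is hypothetical: tools are plain strings, and unknown names raise KeyError here purely for illustration.

```python
class TinyRegistry:
    """Toy registry: maps a skill name to (body, tool names). Illustration only."""

    def __init__(self, skills: dict[str, tuple[str, list[str]]]) -> None:
        self._skills = skills
        self._loaded: set[str] = set()

    def is_loaded(self, name: str) -> bool:
        return name in self._loaded

    def load_with_tools(self, name: str) -> tuple[str, list[str]]:
        if name not in self._skills:
            raise KeyError(name)
        body, tools = self._skills[name]
        if name in self._loaded:
            return body, []            # tools were already handed out once
        self._loaded.add(name)
        return body, list(tools)       # first load: caller registers these


registry = TinyRegistry({"pdf": ("How to read PDFs...", ["extract_text"])})
first_body, first_tools = registry.load_with_tools("pdf")
second_body, second_tools = registry.load_with_tools("pdf")
print(first_tools, second_tools)
```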

metadata_map() collections.abc.Mapping[str, jeevesagent.skills.skill.SkillMetadata][source]

All currently-registered skills’ metadata, keyed by name. Cheap to compute — used to build the catalog section.

names() list[str][source]
remove(name: str) jeevesagent.skills.skill.Skill | None[source]

Drop a skill by name. Returns the removed instance or None if no such skill was registered.

class jeevesagent.SkillSource[source]

A folder of skills + an optional label.

classmethod coerce(item: SkillSource | str | pathlib.Path | tuple[str | pathlib.Path, str]) SkillSource[source]

Normalize one user-supplied source spec.

Accepts:

  • SkillSource(...): used as-is

  • str / Path: bare path, no label

  • (path, label): path with explicit label

discover() list[jeevesagent.skills.skill.Skill][source]

Find every SKILL.md under this source directory.

Recurses one level (most common layout: skills/<name>/SKILL.md) but also handles deeper nesting. Each SKILL.md becomes one Skill instance with this source’s label attached.

label: str | None = None
path: pathlib.Path
class jeevesagent.Span(/, **data: Any)[source]

Bases: pydantic.BaseModel

A trace span handle. Concrete telemetry adapters return their own representation; this is the value-object contract for in-process use.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

attributes: dict[str, Any] = None
name: str
span_id: str
started_at: datetime.datetime = None
trace_id: str
class jeevesagent.SqliteFactStore(path: str | pathlib.Path, *, embedder: jeevesagent.core.protocols.Embedder | None = None)[source]

Durable bi-temporal fact store rooted at a sqlite file.

async aclose() None[source]
async all_facts() list[jeevesagent.core.types.Fact][source]
async append(fact: jeevesagent.core.types.Fact) str[source]

Append a fact, invalidating any superseded predecessors.

Same supersession rule as InMemoryFactStore: if there’s an existing currently-valid fact with matching subject + predicate but different object, set its valid_until to the new fact’s valid_from.

async query(*, subject: str | None = None, predicate: str | None = None, object_: str | None = None, valid_at: datetime.datetime | None = None, limit: int = 10, user_id: str | None = None) list[jeevesagent.core.types.Fact][source]
async recall_text(query: str, *, limit: int = 5, valid_at: datetime.datetime | None = None, user_id: str | None = None) list[jeevesagent.core.types.Fact][source]
property embedder: jeevesagent.core.protocols.Embedder | None
property path: pathlib.Path
class jeevesagent.SqliteJournalStore(path: str | pathlib.Path)[source]

SQLite-backed journal. Durable across process restarts.

async aclose() None[source]
async get_step(session_id: str, step_name: str) JournalEntry | None[source]
async get_stream(session_id: str, step_name: str) list[Any] | None[source]
async put_step(session_id: str, step_name: str, value: Any) None[source]
async put_stream(session_id: str, step_name: str, chunks: list[Any]) None[source]
property path: pathlib.Path
class jeevesagent.SqliteRuntime(path: str | pathlib.Path)[source]

Bases: jeevesagent.runtime.journaled.JournaledRuntime

JournaledRuntime with a SqliteJournalStore.

name = 'sqlite'
property path: pathlib.Path
class jeevesagent.StandardPermissions(*, mode: Mode = Mode.DEFAULT, allowed_tools: list[str] | None = None, denied_tools: list[str] | None = None)[source]

Mode + allow/deny-list permission policy.

async check(call: jeevesagent.core.types.ToolCall, *, context: collections.abc.Mapping[str, Any]) jeevesagent.core.types.PermissionDecision[source]
classmethod strict() StandardPermissions[source]

Convenience: default-mode permissions with no overrides.

class jeevesagent.StepResult(/, **data: Any)[source]

Bases: pydantic.BaseModel

The output of executing one step.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

description: str
output: str
step_id: str
class jeevesagent.SubprocessSandbox(inner: jeevesagent.core.protocols.ToolHost, *, timeout_seconds: float = 30.0)[source]

Run each tool call in a fresh child Python process.

async call(tool: str, args: collections.abc.Mapping[str, Any], *, call_id: str = '') jeevesagent.core.types.ToolResult[source]
async list_tools(*, query: str | None = None) list[jeevesagent.core.types.ToolDef][source]
async watch() collections.abc.AsyncIterator[jeevesagent.core.types.ToolEvent][source]
property inner: jeevesagent.core.protocols.ToolHost
property timeout_seconds: float
class jeevesagent.Supervisor(*, workers: dict[str, jeevesagent.agent.api.Agent], base: jeevesagent.architecture.base.Architecture | None = None, instructions_template: str | None = None, delegate_tool_name: str = 'delegate', forward_tool_name: str = 'forward_message')[source]

Coordinator + workers, glued by a delegate tool.

The supervisor’s base architecture (default ReAct) sees a fresh delegate(worker, instructions) tool that routes calls to the named worker Agent. Worker outputs come back as tool results just like any other tool call.

Constructor

  • workers: dict mapping role-names to fully-built Agent instances. Names must be valid identifiers (the model emits them as the worker argument).

  • base: the architecture the supervisor itself runs. Default ReAct. Wrap inside Reflexion to learn delegation patterns across runs.

  • instructions_template: format string with {worker_descriptions}. Default teaches the supervisor to delegate effectively. The agent’s own instructions are prepended (so domain context survives).

  • delegate_tool_name: defaults to "delegate". Customize to avoid clashes with user-defined tools that happen to have the same name.

  • forward_tool_name: defaults to "forward_message". The supervisor calls this with a worker name to return that worker’s last output VERBATIM as the supervisor’s final response. This skips a synthesis round-trip; the benchmark at langchain.com/blog/benchmarking-multi-agent-architectures showed +50% quality on tasks where the supervisor would otherwise paraphrase a worker’s output.

add_worker(name: str, agent: jeevesagent.agent.api.Agent) None[source]

Register a worker between runs.

Safe to call between Agent.run() invocations on the agent that owns this supervisor; the new worker becomes available for delegate(name, ...) on the next run. Calling mid-run is undefined — the supervisor’s prompt is composed at run start.

declared_workers() dict[str, jeevesagent.agent.api.Agent][source]
remove_worker(name: str) jeevesagent.agent.api.Agent | None[source]

Unregister a worker by name. Returns the removed Agent if it was registered, None otherwise. Same lifecycle rules as add_worker().

async run(session: jeevesagent.architecture.base.AgentSession, deps: jeevesagent.architecture.base.Dependencies, prompt: str) collections.abc.AsyncIterator[jeevesagent.core.types.Event][source]
name = 'supervisor'
class jeevesagent.Swarm(*, agents: dict[str, jeevesagent.agent.api.Agent | Handoff], entry_agent: str, max_handoffs: int = 8, detect_cycles: bool = True, pass_full_history: bool = True, handoff_tool_name: str = 'handoff')[source]

Peer agents passing control through handoff tools.

declared_workers() dict[str, jeevesagent.agent.api.Agent][source]
async run(session: jeevesagent.architecture.base.AgentSession, deps: jeevesagent.architecture.base.Dependencies, prompt: str) collections.abc.AsyncIterator[jeevesagent.core.types.Event][source]
name = 'swarm'
class jeevesagent.Team[source]

Namespace for multi-agent team builders.

Every classmethod returns a fully-built Agent whose architecture is the corresponding multi-agent strategy. The returned Agent has the standard run / stream / etc. interface — call sites don’t change between single-agent and team agents.

static actor_critic(actor: jeevesagent.agent.api.Agent, critic: jeevesagent.agent.api.Agent, *, instructions: str = '', model: jeevesagent.core.protocols.Model | str | None = None, memory: jeevesagent.core.protocols.Memory | None = None, runtime: jeevesagent.core.protocols.Runtime | None = None, budget: jeevesagent.core.protocols.Budget | None = None, permissions: jeevesagent.core.protocols.Permissions | None = None, hooks: jeevesagent.security.hooks.HookRegistry | None = None, tools: list[jeevesagent.tools.registry.Tool | collections.abc.Callable[Ellipsis, object]] | jeevesagent.core.protocols.ToolHost | jeevesagent.tools.registry.Tool | collections.abc.Callable[Ellipsis, object] | None = None, telemetry: jeevesagent.core.protocols.Telemetry | None = None, audit_log: jeevesagent.security.audit.AuditLog | None = None, max_turns: int = DEFAULT_MAX_TURNS, auto_consolidate: bool = False, skills: list[Any] | None = None, max_rounds: int = 3, approval_threshold: float = 0.9, critique_template: str | None = None, refine_template: str | None = None) jeevesagent.agent.api.Agent[source]

Build an actor-critic pair where the critic reviews the actor’s output (with structured JSON scoring + rubric) and the actor refines below approval_threshold.

static blackboard(agents: dict[str, jeevesagent.agent.api.Agent], *, coordinator: jeevesagent.agent.api.Agent | None = None, decider: jeevesagent.agent.api.Agent | None = None, instructions: str = '', model: jeevesagent.core.protocols.Model | str | None = None, memory: jeevesagent.core.protocols.Memory | None = None, runtime: jeevesagent.core.protocols.Runtime | None = None, budget: jeevesagent.core.protocols.Budget | None = None, permissions: jeevesagent.core.protocols.Permissions | None = None, hooks: jeevesagent.security.hooks.HookRegistry | None = None, tools: list[jeevesagent.tools.registry.Tool | collections.abc.Callable[Ellipsis, object]] | jeevesagent.core.protocols.ToolHost | jeevesagent.tools.registry.Tool | collections.abc.Callable[Ellipsis, object] | None = None, telemetry: jeevesagent.core.protocols.Telemetry | None = None, audit_log: jeevesagent.security.audit.AuditLog | None = None, max_turns: int = DEFAULT_MAX_TURNS, auto_consolidate: bool = False, skills: list[Any] | None = None, max_rounds: int = 10, coordinator_instructions: str | None = None, decider_instructions: str | None = None) jeevesagent.agent.api.Agent[source]

Build a blackboard team where agents collaborate via a shared workspace; an optional coordinator selects who acts each round and an optional decider decides when the work is done.

static debate(debaters: list[jeevesagent.agent.api.Agent], *, judge: jeevesagent.agent.api.Agent | None = None, instructions: str = '', model: jeevesagent.core.protocols.Model | str | None = None, memory: jeevesagent.core.protocols.Memory | None = None, runtime: jeevesagent.core.protocols.Runtime | None = None, budget: jeevesagent.core.protocols.Budget | None = None, permissions: jeevesagent.core.protocols.Permissions | None = None, hooks: jeevesagent.security.hooks.HookRegistry | None = None, tools: list[jeevesagent.tools.registry.Tool | collections.abc.Callable[Ellipsis, object]] | jeevesagent.core.protocols.ToolHost | jeevesagent.tools.registry.Tool | collections.abc.Callable[Ellipsis, object] | None = None, telemetry: jeevesagent.core.protocols.Telemetry | None = None, audit_log: jeevesagent.security.audit.AuditLog | None = None, max_turns: int = DEFAULT_MAX_TURNS, auto_consolidate: bool = False, skills: list[Any] | None = None, rounds: int = 2, convergence_check: bool = True, convergence_similarity: float = 0.85, debater_instructions: str | None = None, judge_instructions: str | None = None) jeevesagent.agent.api.Agent[source]

Build a multi-agent debate where debaters argue for rounds (with optional convergence early-exit). If judge is provided, the judge synthesizes a final answer; otherwise majority vote wins.

static router(routes: list[jeevesagent.architecture.RouterRoute], *, instructions: str = '', model: jeevesagent.core.protocols.Model | str | None = None, memory: jeevesagent.core.protocols.Memory | None = None, runtime: jeevesagent.core.protocols.Runtime | None = None, budget: jeevesagent.core.protocols.Budget | None = None, permissions: jeevesagent.core.protocols.Permissions | None = None, hooks: jeevesagent.security.hooks.HookRegistry | None = None, tools: list[jeevesagent.tools.registry.Tool | collections.abc.Callable[Ellipsis, object]] | jeevesagent.core.protocols.ToolHost | jeevesagent.tools.registry.Tool | collections.abc.Callable[Ellipsis, object] | None = None, telemetry: jeevesagent.core.protocols.Telemetry | None = None, audit_log: jeevesagent.security.audit.AuditLog | None = None, max_turns: int = DEFAULT_MAX_TURNS, auto_consolidate: bool = False, skills: list[Any] | None = None, fallback_route: str | None = None, require_confidence_above: float = 0.0, classifier_prompt: str | None = None) jeevesagent.agent.api.Agent[source]

Build a router that classifies once and dispatches to ONE specialist Agent. Cheaper than Supervisor for tasks with clear specialist boundaries (one classifier call + one specialist run, no synthesis pass).

static supervisor(workers: dict[str, jeevesagent.agent.api.Agent], *, instructions: str = '', model: jeevesagent.core.protocols.Model | str | None = None, memory: jeevesagent.core.protocols.Memory | None = None, runtime: jeevesagent.core.protocols.Runtime | None = None, budget: jeevesagent.core.protocols.Budget | None = None, permissions: jeevesagent.core.protocols.Permissions | None = None, hooks: jeevesagent.security.hooks.HookRegistry | None = None, tools: list[jeevesagent.tools.registry.Tool | collections.abc.Callable[Ellipsis, object]] | jeevesagent.core.protocols.ToolHost | jeevesagent.tools.registry.Tool | collections.abc.Callable[Ellipsis, object] | None = None, telemetry: jeevesagent.core.protocols.Telemetry | None = None, audit_log: jeevesagent.security.audit.AuditLog | None = None, max_turns: int = DEFAULT_MAX_TURNS, auto_consolidate: bool = False, skills: list[Any] | None = None, instructions_template: str | None = None, delegate_tool_name: str = 'delegate', forward_tool_name: str = 'forward_message') jeevesagent.agent.api.Agent[source]

Build a coordinator Agent that delegates to workers.

The coordinator can call delegate(worker, instructions) to dispatch a subtask, or forward_message(worker) to return a worker’s output verbatim. Multiple delegations in one turn run in parallel.

static swarm(agents: dict[str, jeevesagent.agent.api.Agent | jeevesagent.architecture.swarm.Handoff], entry_agent: str, *, instructions: str = '', model: jeevesagent.core.protocols.Model | str | None = None, memory: jeevesagent.core.protocols.Memory | None = None, runtime: jeevesagent.core.protocols.Runtime | None = None, budget: jeevesagent.core.protocols.Budget | None = None, permissions: jeevesagent.core.protocols.Permissions | None = None, hooks: jeevesagent.security.hooks.HookRegistry | None = None, tools: list[jeevesagent.tools.registry.Tool | collections.abc.Callable[Ellipsis, object]] | jeevesagent.core.protocols.ToolHost | jeevesagent.tools.registry.Tool | collections.abc.Callable[Ellipsis, object] | None = None, telemetry: jeevesagent.core.protocols.Telemetry | None = None, audit_log: jeevesagent.security.audit.AuditLog | None = None, max_turns: int = DEFAULT_MAX_TURNS, auto_consolidate: bool = False, skills: list[Any] | None = None, max_handoffs: int = 8, detect_cycles: bool = True, pass_full_history: bool = True, handoff_tool_name: str = 'handoff') jeevesagent.agent.api.Agent[source]

Build a peer-swarm of agents that hand off control via a handoff tool (or per-target transfer_to_<name> tools when peers are wrapped in Handoff with an input_type).

entry_agent is the peer that receives the first message.

class jeevesagent.Telemetry[source]

Bases: Protocol

OpenTelemetry-compatible tracing/metrics surface.

async emit_metric(name: str, value: float, **attrs: Any) None[source]
trace(name: str, **attrs: Any) contextlib.AbstractAsyncContextManager[jeevesagent.core.types.Span][source]
class jeevesagent.ThoughtNode(/, **data: Any)[source]

Bases: pydantic.BaseModel

One node in the Tree-of-Thoughts search tree.

Children are stored implicitly (each node has a parent_id). The full tree is reconstructable from the node list ToT keeps in its session metadata.
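As a minimal sketch of how a tree can be rebuilt from a flat node list with parent_id links: the Node dataclass below is a hypothetical stand-in that only mirrors the ThoughtNode fields, and children_index / path_to_root are illustrative helpers, not part of jeevesagent.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Hypothetical stand-in carrying the same fields as ThoughtNode."""
    id: str
    parent_id: Optional[str]
    content: str
    depth: int
    score: float = 0.0


def children_index(nodes: list[Node]) -> dict[Optional[str], list[Node]]:
    """Group nodes by parent_id; the None key holds the root(s)."""
    index: dict[Optional[str], list[Node]] = defaultdict(list)
    for node in nodes:
        index[node.parent_id].append(node)
    return dict(index)


def path_to_root(nodes: list[Node], leaf_id: str) -> list[str]:
    """Follow parent_id links from a node up to its root, returned root-first."""
    by_id = {n.id: n for n in nodes}
    path: list[str] = []
    current = by_id.get(leaf_id)
    while current is not None:
        path.append(current.id)
        current = by_id.get(current.parent_id) if current.parent_id else None
    return path[::-1]


nodes = [
    Node("a", None, "root", 0),
    Node("b", "a", "first branch", 1),
    Node("c", "a", "second branch", 1),
    Node("d", "b", "refinement", 2),
]
```

Here children_index(nodes)["a"] holds nodes b and c, and path_to_root(nodes, "d") walks d, b, a and returns them root-first.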

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

content: str
depth: int
id: str
parent_id: str | None
score: float = 0.0
class jeevesagent.Tool[source]

A registered tool: definition plus the callable that executes it.

async execute(args: collections.abc.Mapping[str, Any]) Any[source]

Invoke the underlying callable.

Async functions are awaited; sync functions are dispatched to a worker thread via anyio.to_thread.run_sync() so they don’t block the event loop.
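The dispatch rule above can be sketched with the standard library alone. This is an illustration, not the Tool implementation: it uses asyncio.to_thread in place of anyio.to_thread.run_sync, and it assumes args are passed to the callable as keyword arguments.

```python
import asyncio
import inspect
from collections.abc import Callable, Mapping
from typing import Any


async def execute(fn: Callable[..., Any], args: Mapping[str, Any]) -> Any:
    """Await coroutine functions; run plain functions in a worker thread."""
    if inspect.iscoroutinefunction(fn):
        return await fn(**args)
    # asyncio.to_thread is the stdlib counterpart of anyio.to_thread.run_sync:
    # the blocking call runs off the event loop thread.
    return await asyncio.to_thread(fn, **args)


def add(a: int, b: int) -> int:
    """A synchronous callable."""
    return a + b


async def add_async(a: int, b: int) -> int:
    """An asynchronous callable."""
    return a + b
```

Both asyncio.run(execute(add, {"a": 1, "b": 2})) and the add_async equivalent go through the same entry point, so callers do not need to know which kind of function backs a tool.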

to_def() jeevesagent.core.types.ToolDef[source]
description: str
destructive: bool = False
fn: collections.abc.Callable[Ellipsis, Any]
input_schema: dict[str, Any]
name: str
class jeevesagent.ToolCall(/, **data: Any)[source]

Bases: pydantic.BaseModel

A model-emitted request to invoke a tool.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

idempotency_key() str[source]
is_destructive() bool[source]
args: dict[str, Any] = None
destructive: bool = False
id: str = None
tool: str
tool_def: ToolDef | None = None
class jeevesagent.ToolDef(/, **data: Any)[source]

Bases: pydantic.BaseModel

Schema description of a tool the model can call.

Mirrors the JSON-Schema-flavored shape used across MCP and provider APIs.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

description: str
input_schema: dict[str, Any] = None
model_config

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

name: str
server: str | None = None
class jeevesagent.ToolEvent(/, **data: Any)[source]

Bases: pydantic.BaseModel

Tool registry change notification (MCP listChanged etc.).

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

at: datetime.datetime = None
kind: Literal['added', 'removed', 'updated']
server: str | None = None
tool: str
class jeevesagent.ToolHost[source]

Bases: Protocol

MCP-aware tool registry. Lazy-loads schemas on demand.

async call(tool: str, args: collections.abc.Mapping[str, Any], *, call_id: str = '') jeevesagent.core.types.ToolResult[source]

Invoke tool with args. The call_id is propagated into the returned ToolResult so the loop can correlate results with the originating model-emitted call.

async list_tools(*, query: str | None = None) list[jeevesagent.core.types.ToolDef][source]
watch() collections.abc.AsyncIterator[jeevesagent.core.types.ToolEvent][source]

Notifications when the tool list changes (MCP listChanged).

class jeevesagent.ToolResult(/, **data: Any)[source]

Bases: pydantic.BaseModel

Outcome of a tool invocation.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

classmethod denied_(call_id: str, reason: str, **kwargs: Any) ToolResult[source]
classmethod error_(call_id: str, message: str, **kwargs: Any) ToolResult[source]
classmethod success(call_id: str, output: Any, **kwargs: Any) ToolResult[source]
call_id: str
denied: bool = False
duration_ms: float | None = None
error: str | None = None
ok: bool
output: Any = None
reason: str | None = None
started_at: datetime.datetime = None
class jeevesagent.TreeOfThoughts(*, branch_factor: int = 3, max_depth: int = 3, beam_width: int = 2, solved_threshold: float = 1.0, min_score: float = 0.0, parallel: bool = True, proposer_prompt: str | None = None, evaluator_prompt: str | None = None)[source]

Branch + evaluate + prune. BFS beam search over thoughts.
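A toy sketch of branch, evaluate, and prune as a level-by-level beam search. The parameter names follow the constructor, but their exact semantics here (min_score as a pruning floor, solved_threshold as an early exit) are assumptions, and propose / score are hypothetical stand-ins for the model-backed proposer and evaluator.

```python
from collections.abc import Callable

TARGET = "cab"  # toy goal: build this string one letter at a time


def propose(thought: str, branch_factor: int) -> list[str]:
    """Hypothetical proposer: extend a thought by each of the first few letters."""
    return [thought + ch for ch in "abc"[:branch_factor]]


def score(thought: str) -> float:
    """Hypothetical evaluator: fraction of TARGET positions matched."""
    return sum(a == b for a, b in zip(thought, TARGET)) / len(TARGET)


def beam_search(
    root: str,
    propose: Callable[[str, int], list[str]],
    score: Callable[[str], float],
    *,
    branch_factor: int = 3,
    max_depth: int = 3,
    beam_width: int = 2,
    solved_threshold: float = 1.0,
    min_score: float = 0.0,
) -> tuple[str, float]:
    """Return the best (thought, score) pair found."""
    frontier = [root]
    best = (root, score(root))
    for _ in range(max_depth):
        # Branch: expand every thought on the current frontier.
        candidates = [c for t in frontier for c in propose(t, branch_factor)]
        # Evaluate, then prune anything scoring below min_score.
        scored = [(c, score(c)) for c in candidates]
        scored = [pair for pair in scored if pair[1] >= min_score]
        if not scored:
            break
        scored.sort(key=lambda pair: pair[1], reverse=True)
        if scored[0][1] > best[1]:
            best = scored[0]
        if best[1] >= solved_threshold:
            break  # good enough, stop early
        # Keep only the top beam_width thoughts for the next level.
        frontier = [c for c, _ in scored[:beam_width]]
    return best
```

With the defaults, beam_search("", propose, score) keeps "c", then "ca", and stops at "cab" once the score reaches solved_threshold.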

declared_workers() dict[str, jeevesagent.agent.api.Agent][source]
async run(session: jeevesagent.architecture.base.AgentSession, deps: jeevesagent.architecture.base.Dependencies, prompt: str) collections.abc.AsyncIterator[jeevesagent.core.types.Event][source]
name = 'tree-of-thoughts'
class jeevesagent.Usage(/, **data: Any)[source]

Bases: pydantic.BaseModel

Token and cost accounting for a model call.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

cost_usd: float = 0.0
input_tokens: int = 0
model_config

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

output_tokens: int = 0
class jeevesagent.VectorMemory(*, embedder: jeevesagent.core.protocols.Embedder | None = None, max_episodes: int | None = None, consolidator: jeevesagent.memory.consolidator.Consolidator | None = None, fact_store: jeevesagent.memory.facts.FactStore | None = None)[source]

Pure-Python embedding-backed Memory.

async append_block(name: str, content: str) None[source]
async consolidate() None[source]

Process unconsolidated episodes through the configured Consolidator, appending facts to self.facts.

No-op when no consolidator is configured.

async recall(query: str, *, kind: str = 'episodic', limit: int = 5, time_range: tuple[datetime.datetime, datetime.datetime] | None = None, user_id: str | None = None) list[jeevesagent.core.types.Episode][source]
async recall_facts(query: str, *, limit: int = 5, valid_at: datetime.datetime | None = None, user_id: str | None = None) list[jeevesagent.core.types.Fact][source]
async remember(episode: jeevesagent.core.types.Episode) str[source]
async session_messages(session_id: str, *, user_id: str | None = None, limit: int = 20) list[jeevesagent.core.types.Message][source]
snapshot() dict[str, Any][source]
async update_block(name: str, content: str) None[source]
async working() list[jeevesagent.core.types.MemoryBlock][source]
property embedder: jeevesagent.core.protocols.Embedder
facts: jeevesagent.memory.facts.FactStore
class jeevesagent.VectorStore[source]

Bases: Protocol

Async protocol for vector stores.

Six methods cover the lifecycle: add (embed + store), delete, search (by query string), search_by_vector (precomputed), count, get_by_ids.

Backends that aren’t natively async (FAISS, Chroma) wrap their sync calls in anyio.to_thread.run_sync() so they don’t block the event loop.

async add(chunks: list[jeevesagent.loader.base.Chunk], ids: list[str] | None = None) list[str][source]

Embed + store chunks. Returns the assigned ids (caller-provided or generated).

async count() int[source]

Number of chunks currently in the store.

async delete(ids: list[str]) None[source]

Remove the named chunks. Unknown ids are silently skipped (idempotent).

async get_by_ids(ids: list[str]) list[jeevesagent.loader.base.Chunk][source]

Fetch chunks by id, in the same order as ids. Unknown ids are skipped (the result may be shorter than the input).

async search(query: str, *, k: int = 4, filter: collections.abc.Mapping[str, Any] | None = None, diversity: float | None = None) list[SearchResult][source]

Embed query and return the top-k chunks ranked by similarity. filter (optional) restricts candidates by metadata. diversity (optional, 0..1) enables MMR reranking for varied results.
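To illustrate what MMR (maximal marginal relevance) reranking does, here is a small pure-Python sketch. It assumes cosine similarity and assumes that higher diversity means a stronger penalty for redundancy; how a given backend weighs the two terms is not specified above, and cosine / mmr are illustrative names.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; 0.0 when either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr(
    query: list[float],
    candidates: list[list[float]],
    k: int = 4,
    diversity: float = 0.5,
) -> list[int]:
    """Pick up to k candidate indices, trading relevance against redundancy."""
    relevance = [cosine(query, c) for c in candidates]
    selected: list[int] = []
    remaining = list(range(len(candidates)))
    while remaining and len(selected) < k:

        def mmr_score(i: int) -> float:
            # Redundancy is the similarity to the closest already-picked item.
            redundancy = max(
                (cosine(candidates[i], candidates[j]) for j in selected),
                default=0.0,
            )
            return (1.0 - diversity) * relevance[i] - diversity * redundancy

        best = max(remaining, key=mmr_score)
        selected.append(best)
        remaining.remove(best)
    return selected


query = [1.0, 0.0]
vectors = [[1.0, 0.0], [0.99, 0.1], [0.6, 0.8]]  # index 1 nearly duplicates index 0
```

With diversity=0.0 the ranking is plain similarity, so the near-duplicate comes second; with diversity=0.7 the less similar but more distinct third vector is preferred over it.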

async search_by_vector(vector: list[float], *, k: int = 4, filter: collections.abc.Mapping[str, Any] | None = None, diversity: float | None = None) list[SearchResult][source]

Same as search() but with a precomputed query vector.

class jeevesagent.VoyageEmbedder(model: str = 'voyage-3', *, client: Any | None = None, api_key: str | None = None, input_type: str = 'document')[source]

Embeddings via Voyage AI’s voyageai SDK.

Models and dimensions:

  • voyage-3 / voyage-3-large / voyage-code-3 -> 1024

  • voyage-3-lite -> 512

input_type controls how Voyage encodes the text:

  • "document" (default) — for corpus / fact-store entries

  • "query" — for retrieval queries

Pass an explicit input_type= if your embedder is dedicated to one role; for the agent loop’s mixed use (we embed both stored triples and recall queries through the same embedder), the "document" default is the safer choice.

async embed(text: str) list[float][source]
async embed_batch(texts: list[str]) list[list[float]][source]
dimensions: int
name: str = 'voyage-3'
class jeevesagent.set_run_context(context: RunContext)[source]

Context manager that installs a RunContext for the duration of an async with block.

The framework uses this internally inside Agent.run() to expose the live context to tools and hooks. Application code rarely needs it, but it is the supported way to invoke a tool outside an agent loop with explicit scope — for example in background workers that share tool implementations with the agent:

async with set_run_context(RunContext(user_id="alice")):
    await some_tool(...)

Behaves correctly under structured concurrency: nested async with blocks restore the prior context on exit, and anyio task-group spawns inherit the active context automatically.
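The restore-on-exit and task-inheritance behaviour described above is what contextvars provides. The sketch below is an assumption about the mechanism, not the framework's code: Ctx and use_ctx are hypothetical stand-ins for RunContext and set_run_context, and it uses asyncio tasks rather than anyio task groups.

```python
import asyncio
import contextvars
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Ctx:
    """Hypothetical stand-in for RunContext."""
    user_id: Optional[str] = None


_current = contextvars.ContextVar("run_ctx", default=Ctx())


class use_ctx:
    """Async context manager that installs a Ctx and restores the prior one."""

    def __init__(self, ctx: Ctx) -> None:
        self._ctx = ctx
        self._token = None

    async def __aenter__(self) -> Ctx:
        self._token = _current.set(self._ctx)
        return self._ctx

    async def __aexit__(self, *exc_info) -> None:
        # reset() puts back whatever value was active before __aenter__.
        _current.reset(self._token)


async def read_user() -> Optional[str]:
    return _current.get().user_id


async def demo() -> list:
    seen = [await read_user()]                    # outside any block: default
    async with use_ctx(Ctx(user_id="alice")):
        seen.append(await read_user())
        async with use_ctx(Ctx(user_id="bob")):
            seen.append(await read_user())
        seen.append(await read_user())            # inner block restored "alice"
        # A task copies the current context at creation, so it inherits "alice".
        seen.append(await asyncio.create_task(read_user()))
    seen.append(await read_user())                # back to the default
    return seen
```

Running asyncio.run(demo()) yields the default, then alice, bob, alice, alice, and the default again.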

jeevesagent.bash_tool(workdir: pathlib.Path | str | None = None, *, name: str = 'bash', timeout: float = 30.0, allow_pattern: collections.abc.Callable[[str], bool] | None = None, extra_env: dict[str, str] | None = None) jeevesagent.tools.registry.Tool[source]

Build a Tool that runs a shell command with the workdir as the current working directory.

Default safety:

  • Commands matching the built-in destructive patterns (rm -rf /, sudo, mkfs, fork bombs, …) are rejected before being executed.

  • Commands run with a default timeout of 30 seconds; the subprocess is killed on timeout.

  • The shell is invoked via /bin/sh -c <command>, so pipelines + redirections work the way you’d expect.

Knobs:

  • allow_pattern — a callable that takes the command string and returns True if the command should run. When provided, it OVERRIDES the default deny list — you take full responsibility.

  • extra_env — extra environment variables merged into the subprocess env.

  • timeout — seconds before the command is killed.
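Because allow_pattern replaces the default deny list, the callable should be conservative. The following is one possible shape, offered as a sketch: the READ_ONLY allow-list and the set of rejected metacharacters are illustrative choices, not defaults of the framework.

```python
import shlex

# Hypothetical allow-list of read-only programs.
READ_ONLY = {"ls", "cat", "grep", "head", "tail", "wc"}


def allow_read_only(command: str) -> bool:
    """Return True only for a single command whose program is allow-listed."""
    # Refuse shell metacharacters outright so chaining, pipes, redirection,
    # and substitution cannot bring in other programs.
    if any(ch in command for ch in ";|&><`$\n"):
        return False
    try:
        argv = shlex.split(command)
    except ValueError:  # e.g. unbalanced quotes
        return False
    return bool(argv) and argv[0] in READ_ONLY
```

It would be passed as bash_tool(allow_pattern=allow_read_only). Note that this particular policy also disables pipelines, which is a deliberate trade-off of the sketch.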

workdir is optional; None uses the framework’s default tempdir (shared with the other built-in tools).

async jeevesagent.build_graph(agent: jeevesagent.agent.api.Agent, *, title: str = 'Agent') AgentGraph[source]

Walk an Agent and return its renderable AgentGraph.

jeevesagent.classify_model_error(exc: BaseException) jeevesagent.core.errors.ModelError | None[source]

Map an exception from any model SDK to the framework’s taxonomy.

Returns None when the exception is not recognised as a model-call failure — let callers decide whether to wrap it in something else or propagate. Returns an instance of one of TransientModelError / RateLimitError / AuthenticationError / InvalidRequestError / ContentFilterError / PermanentModelError otherwise.

SDK imports are lazy — having e.g. the anthropic package installed is not required for OpenAI classification to work, and vice versa.

jeevesagent.default_workdir() pathlib.Path[source]

Return the framework’s default workdir for built-in tools, creating it lazily on first call.

The directory is a fresh tempdir under $TMPDIR/jeeves_agent_*, created once per process. All built-in tool factories share it when called without an explicit workdir argument, so an Agent that registers read_tool() and write_tool() (no args) sees the same place.

The directory is NOT auto-cleaned at process exit — leave that to the OS’s tempdir cleanup so debug data survives a crash.
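A create-once, never-cleaned tempdir can be sketched as below. This is an assumption about how such a helper could be written, not the framework's code; default_dir is a hypothetical name.

```python
import functools
import tempfile
from pathlib import Path


@functools.lru_cache(maxsize=None)
def default_dir() -> Path:
    """Create the shared tempdir on first call and reuse it afterwards."""
    # mkdtemp honours $TMPDIR and registers no cleanup at process exit,
    # so the directory survives a crash for later inspection.
    return Path(tempfile.mkdtemp(prefix="jeeves_agent_"))
```

Every call in the same process returns the same path, which is what lets separately constructed tools agree on one directory.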

jeevesagent.deterministic_hash(*parts: Any) str[source]

Stable hash of arbitrary JSON-serializable parts.

Used as an idempotency key for journaled steps. The hash is stable across processes and Python versions because the input is canonicalised via json.dumps(..., sort_keys=True).
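The canonicalisation idea can be sketched as follows. The digest algorithm (SHA-256) and the compact separators are assumptions, since only json.dumps(..., sort_keys=True) is stated above; stable_hash is an illustrative name.

```python
import hashlib
import json
from typing import Any


def stable_hash(*parts: Any) -> str:
    """Hash JSON-serializable parts via a canonical JSON encoding."""
    # sort_keys=True makes dict key order irrelevant to the result.
    canonical = json.dumps(parts, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Unlike the built-in hash(), which is salted per process for strings, a digest over canonical JSON gives the same key on every run, so a replayed step can be matched to its journal entry.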

jeevesagent.edit_tool(workdir: pathlib.Path | str | None = None, *, name: str = 'edit') jeevesagent.tools.registry.Tool[source]

Build a Tool that does find-and-replace inside an existing file under workdir.

The tool’s signature seen by the model:

edit(path: str, old_string: str, new_string: str, replace_all: bool = False)

Behaviour matches Claude Code’s Edit tool:

  • old_string must be EXACTLY present in the file. Mismatch (whitespace, indentation, line breaks) → error.

  • old_string must appear EXACTLY once in the file unless replace_all=True is passed — forces the model to give enough surrounding context for unambiguous matches.

  • new_string replaces old_string (or every occurrence if replace_all=True).
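The matching rules above can be sketched on an in-memory string. apply_edit is a hypothetical helper, and raising ValueError is a choice made for the sketch; how the real tool reports a mismatch is not described here.

```python
def apply_edit(
    text: str, old_string: str, new_string: str, replace_all: bool = False
) -> str:
    """Replace old_string in text, enforcing the exact-match rules."""
    if not old_string:
        raise ValueError("old_string must not be empty")
    count = text.count(old_string)
    if count == 0:
        # Any difference in whitespace or indentation lands here.
        raise ValueError("old_string not found in file")
    if count > 1 and not replace_all:
        raise ValueError(
            f"old_string matches {count} times; add context or pass replace_all=True"
        )
    return text.replace(old_string, new_string)


source = "x = 1\ny = 1\n"
```

apply_edit(source, "x = 1", "x = 2") succeeds because the match is unique, while apply_edit(source, "= 1", "= 0") is rejected as ambiguous unless replace_all=True is passed.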

workdir is optional; None uses the framework’s default tempdir (shared with the other built-in tools).

jeevesagent.filesystem_tools(workdir: pathlib.Path | str | None = None) list[jeevesagent.tools.registry.Tool][source]

Return all three filesystem tools (read + write + edit) bound to a single workdir. bash_tool is excluded; add it alongside these only when you want shell access too.

workdir is optional; None uses the framework’s default tempdir (shared with bash_tool() called the same way).

jeevesagent.get_run_context() RunContext[source]

Return the RunContext for the currently-running agent.

Inside an active Agent.run() call this returns the live context with user_id, session_id, run_id, and metadata populated. Outside any active run (test code, direct @tool invocation, REPL exploration) this returns the default empty RunContext — never raises.

Tools that need scope information call this rather than taking extra parameters:

@tool
async def fetch_user_orders() -> str:
    ctx = get_run_context()
    return await db.query("orders", user_id=ctx.user_id)
jeevesagent.new_id(prefix: str = '') str[source]

Return a fresh ULID, optionally prefixed for readability.

>>> new_id("ep").startswith("ep_")
True
jeevesagent.read_tool(workdir: pathlib.Path | str | None = None, *, name: str = 'read', line_limit: int = _DEFAULT_READ_LINE_LIMIT) jeevesagent.tools.registry.Tool[source]

Build a Tool that reads a text file under workdir.

The tool’s signature seen by the model:

read(path: str, offset: int = 0, limit: int | None = None)

Returns the file’s text with line numbers prefixed (one line per output line), in the same format Claude Code’s Read tool uses — that lets the edit tool work without ambiguity later. Long files are truncated to line_limit lines per call; pass offset / limit to read further chunks.

Errors (file-not-found, path-escape) are returned as a string starting with "ERROR: " rather than raising — the model sees them as a tool result and can adjust.
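A rough sketch of line-numbered reads with a path-escape check and "ERROR: " results. The exact number format, the 0-based offset, and the line_limit default of 2000 are assumptions (the real default is _DEFAULT_READ_LINE_LIMIT, whose value is not shown), and read_numbered is a hypothetical helper.

```python
from pathlib import Path
from typing import Optional


def read_numbered(
    workdir: Path,
    path: str,
    offset: int = 0,
    limit: Optional[int] = None,
    line_limit: int = 2000,  # assumed value, for illustration only
) -> str:
    """Return numbered lines from a file under workdir, or an ERROR string."""
    root = workdir.resolve()
    target = (root / path).resolve()
    # Path-escape check: the resolved target must stay inside workdir.
    if not target.is_relative_to(root):
        return f"ERROR: {path} resolves outside the workdir"
    if not target.is_file():
        return f"ERROR: {path} not found"
    lines = target.read_text(encoding="utf-8").splitlines()
    count = line_limit if limit is None else min(limit, line_limit)
    window = lines[offset:offset + count]
    # Number lines from 1, keeping the numbering of the original file.
    return "\n".join(
        f"{offset + i + 1:>6}\t{line}" for i, line in enumerate(window)
    )
```

Returning the error as text keeps the failure inside the tool result, so a caller that passes "../outside.txt" gets a message to react to instead of an exception.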

workdir is optional; None uses the framework’s default tempdir (shared with the other built-in tools called without a workdir).

jeevesagent.resolve_architecture(spec: jeevesagent.architecture.base.Architecture | str | None) jeevesagent.architecture.base.Architecture[source]

Coerce spec to a concrete Architecture.

  • None → ReAct (the default)

  • str → looked up in KNOWN (only "react" in v0.3)

  • Architecture instance → returned as-is

Unknown strings raise ConfigError with a list of known names.

async jeevesagent.run_architecture(architecture: jeevesagent.architecture.Architecture, prompt: str, *, instructions: str = '', model: jeevesagent.core.protocols.Model | str | None = None, memory: jeevesagent.core.protocols.Memory | None = None, runtime: jeevesagent.core.protocols.Runtime | None = None, budget: jeevesagent.core.protocols.Budget | None = None, permissions: jeevesagent.core.protocols.Permissions | None = None, hooks: jeevesagent.security.hooks.HookRegistry | None = None, tools: list[jeevesagent.tools.registry.Tool | collections.abc.Callable[Ellipsis, object]] | jeevesagent.core.protocols.ToolHost | jeevesagent.tools.registry.Tool | collections.abc.Callable[Ellipsis, object] | None = None, telemetry: jeevesagent.core.protocols.Telemetry | None = None, audit_log: jeevesagent.security.audit.AuditLog | None = None, max_turns: int = DEFAULT_MAX_TURNS, auto_consolidate: bool = False) jeevesagent.core.types.RunResult[source]

Run an Architecture once with a minimal Agent shell.

Useful for testing orchestrators in isolation or for one-shot scripts where you don’t want to construct an Agent yourself.

The default model is the framework’s resolver default (set via model= or env / config); pass an explicit model or string id to override.

Example:

sup = Supervisor(workers={"a": agent_a})
result = await run_architecture(
    sup, "do the thing", model="gpt-4.1-mini"
)
jeevesagent.tool(fn: collections.abc.Callable[Ellipsis, Any]) Tool[source]
jeevesagent.tool(*, name: str | None = None, description: str | None = None, destructive: bool = False) collections.abc.Callable[[collections.abc.Callable[Ellipsis, Any]], Tool]

Promote a callable to a Tool.

Use as @tool (bare) or @tool(name=..., description=..., destructive=...). The schema is derived from parameter annotations; primitive types map to their JSON-Schema equivalents, anything else falls back to string.
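Deriving a schema from annotations can be sketched with inspect. The type table and the rule that parameters without defaults are "required" are assumptions for illustration; schema_from_signature is not a jeevesagent function.

```python
import inspect
from typing import Any, Callable

# Primitive annotations and their JSON-Schema type names.
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def schema_from_signature(fn: Callable[..., Any]) -> dict:
    """Build a JSON-Schema-style object description from fn's parameters."""
    properties: dict = {}
    required: list = []
    for name, param in inspect.signature(fn).parameters.items():
        # Anything that is not a known primitive falls back to "string".
        json_type = _JSON_TYPES.get(param.annotation, "string")
        properties[name] = {"type": json_type}
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {"type": "object", "properties": properties, "required": required}


def search(query: str, limit: int = 5, exact: bool = False, tags: list = None) -> str:
    """Example callable with primitive and non-primitive parameters."""
    return query
```

For search, the sketch maps query, limit, and exact to string, integer, and boolean, while the list-typed tags parameter falls back to string.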

async jeevesagent.write_graph(agent: jeevesagent.agent.api.Agent, path: str | pathlib.Path, *, title: str | None = None) str[source]

Walk the agent, render to Mermaid, write to path.

Extension dispatch:

  • .mmd — raw Mermaid source

  • .md — Markdown with the diagram in a mermaid fence

  • .png / .svg — fetched from mermaid.ink; on network failure, writes .mmd next to the requested path and returns the Mermaid text anyway

Returns the Mermaid text in every case.

jeevesagent.write_tool(workdir: pathlib.Path | str | None = None, *, name: str = 'write', create_parents: bool = True) jeevesagent.tools.registry.Tool[source]

Build a Tool that writes / overwrites a text file under workdir.

The tool’s signature seen by the model:

write(path: str, content: str)

Overwrites existing files. With create_parents=True (the default), missing parent directories are created automatically.

Returns a confirmation string with the byte count, or an "ERROR: "-prefixed message on failure.

workdir is optional; None uses the framework’s default tempdir (shared with the other built-in tools).