jeevesagent.model

Model adapters.

The provider adapters import their SDK lazily inside __init__, so that from jeevesagent.model import AnthropicModel works even when the corresponding extra is not installed; the ImportError is raised only when the constructor needs to build a default client.
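The pattern looks roughly like this sketch (hypothetical class and error message, not the actual implementation; only the import-on-construction shape is taken from the description above):

```python
import importlib


class LazyProviderModel:
    """Sketch of the lazy-SDK-import pattern: the provider SDK is only
    imported when __init__ must build a default client."""

    sdk_module = "anthropic"  # name of the optional provider SDK

    def __init__(self, *, client=None):
        if client is None:
            try:
                # Deferred import: importing this class never touches the SDK.
                sdk = importlib.import_module(self.sdk_module)
            except ImportError as exc:
                raise ImportError(
                    f"{self.sdk_module!r} is not installed; install the extra "
                    f"or pass client= explicitly"
                ) from exc
            client = sdk.AsyncAnthropic()
        self.client = client
```

Constructing with an explicit client never imports the SDK; only building a default client does.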

Classes

AnthropicModel

Talks to Claude via anthropic.AsyncAnthropic.

EchoModel

Echo-style model for tests and demos.

LiteLLMModel

Talks to any LiteLLM-supported provider.

OpenAIModel

Talks to OpenAI via openai.AsyncOpenAI.

ScriptedModel

Model that emits canned responses, one per call to stream().

ScriptedTurn

A single scripted turn: the text, tool calls, and usage to replay.

Package Contents

class jeevesagent.model.AnthropicModel(model: str = 'claude-opus-4-7', *, client: Any = None, api_key: str | None = None, max_tokens: int = DEFAULT_MAX_TOKENS)[source]

Talks to Claude via anthropic.AsyncAnthropic.

async complete(messages: list[jeevesagent.core.types.Message], *, tools: list[jeevesagent.core.types.ToolDef] | None = None, temperature: float = 1.0, max_tokens: int | None = None) tuple[str, list[jeevesagent.core.types.ToolCall], jeevesagent.core.types.Usage, str][source]

Single-shot non-streaming completion.

Calls client.messages.create(...) (no stream=True, no stream context manager) — Anthropic returns the full Message in one HTTP response. We walk its content blocks once to assemble (text, tool_calls, usage, stop_reason). Used by the non-streaming hot path (agent.run()); agent.stream() keeps using stream().

Falls back to consuming stream() if the underlying client raises (test fakes that only support streaming, or transports that don’t honour single-shot creation).
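The single pass over content blocks can be sketched like this (dict shapes mirror the Anthropic Messages API response format; the helper name is hypothetical):

```python
def walk_content_blocks(message: dict) -> tuple[str, list[dict]]:
    """One pass over an Anthropic-style Message's content blocks,
    collecting assistant text and tool_use blocks."""
    text_parts: list[str] = []
    tool_calls: list[dict] = []
    for block in message["content"]:
        if block["type"] == "text":
            text_parts.append(block["text"])
        elif block["type"] == "tool_use":
            tool_calls.append(
                {"id": block["id"], "name": block["name"], "args": block["input"]}
            )
    return "".join(text_parts), tool_calls
```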

async stream(messages: list[jeevesagent.core.types.Message], *, tools: list[jeevesagent.core.types.ToolDef] | None = None, temperature: float = 1.0, max_tokens: int | None = None) collections.abc.AsyncIterator[jeevesagent.core.types.ModelChunk][source]
name = 'claude-opus-4-7'
class jeevesagent.model.EchoModel(*, prefix: str = 'Echo: ', chunk_delay_s: float = 0.0, cost_per_token: float = 0.0)[source]

Echo-style model for tests and demos.

async complete(messages: list[jeevesagent.core.types.Message], *, tools: list[jeevesagent.core.types.ToolDef] | None = None, temperature: float = 1.0, max_tokens: int | None = None) tuple[str, list[jeevesagent.core.types.ToolCall], jeevesagent.core.types.Usage, str][source]

Single-shot echo. Returns the echoed user prompt as one string with synthetic usage. No per-token chunking — used by the non-streaming hot path (agent.run()).
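A minimal sketch of the echo behaviour (field names and the synthetic usage counter here are illustrative, not the real EchoModel):

```python
import asyncio


class MiniEcho:
    """Echoes the last user message in one shot, with synthetic usage."""

    def __init__(self, prefix: str = "Echo: "):
        self.prefix = prefix

    async def complete(self, messages: list[dict]) -> tuple[str, list, dict]:
        # Find the most recent user message and echo it with the prefix.
        last_user = next(
            m["content"] for m in reversed(messages) if m["role"] == "user"
        )
        text = self.prefix + last_user
        usage = {"output_tokens": len(text.split())}  # synthetic count
        return text, [], usage
```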

async stream(messages: list[jeevesagent.core.types.Message], *, tools: list[jeevesagent.core.types.ToolDef] | None = None, temperature: float = 1.0, max_tokens: int | None = None) collections.abc.AsyncIterator[jeevesagent.core.types.ModelChunk][source]
name: str = 'echo'
class jeevesagent.model.LiteLLMModel(model: str, *, api_key: str | None = None, client: Any | None = None, **litellm_kwargs: Any)[source]

Bases: jeevesagent.model.openai.OpenAIModel

Talks to any LiteLLM-supported provider.

Inherits chunk normalisation, tool-call delta aggregation, and message conversion from OpenAIModel, because LiteLLM produces OpenAI-shaped outputs.
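The inherited delta aggregation looks roughly like this (dict shapes follow OpenAI-style chat-completion chunks, where an id arrives once and names and JSON argument strings arrive in fragments; the function name is hypothetical):

```python
def aggregate_tool_call_deltas(deltas: list[dict]) -> list[dict]:
    """Merge streaming tool-call deltas, keyed by slot index."""
    calls: dict[int, dict] = {}
    for delta in deltas:
        slot = calls.setdefault(
            delta["index"], {"id": None, "name": "", "arguments": ""}
        )
        if delta.get("id"):
            slot["id"] = delta["id"]
        fn = delta.get("function") or {}
        # Names and argument JSON arrive as string fragments; concatenate them.
        slot["name"] += fn.get("name") or ""
        slot["arguments"] += fn.get("arguments") or ""
    return [calls[i] for i in sorted(calls)]
```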

class jeevesagent.model.OpenAIModel(model: str = 'gpt-4o', *, client: Any = None, api_key: str | None = None, base_url: str | None = None)[source]

Talks to OpenAI via openai.AsyncOpenAI.

async complete(messages: list[jeevesagent.core.types.Message], *, tools: list[jeevesagent.core.types.ToolDef] | None = None, temperature: float = 1.0, max_tokens: int | None = None) tuple[str, list[jeevesagent.core.types.ToolCall], jeevesagent.core.types.Usage, str][source]

Single-shot completion (no per-chunk yields).

Tries the OpenAI non-streaming endpoint (stream=False) first. If that fails (e.g. when a test fake client only supports streaming, or a transport doesn't honour stream=False), falls back to consuming stream() internally and accumulating the result. The fallback still saves the per-chunk yield and Event-construction overhead on the architecture side, because ReAct calls complete with a single await.
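The try-then-fall-back shape is roughly this sketch (the failing single-shot call is simulated to stand in for a stream-only fake client):

```python
import asyncio


class FallbackModel:
    async def _create_once(self, messages) -> str:
        # Stands in for the non-streaming endpoint; a stream-only fake
        # client or transport would raise here.
        raise TypeError("stream=False not supported")

    async def stream(self, messages):
        for piece in ("Hello", ", ", "world"):
            yield piece

    async def complete(self, messages) -> str:
        try:
            return await self._create_once(messages)
        except Exception:
            # Fall back: consume stream() internally and accumulate chunks.
            parts = []
            async for chunk in self.stream(messages):
                parts.append(chunk)
            return "".join(parts)
```

The caller still gets a single awaited result either way, which is what keeps the non-streaming hot path cheap.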

async stream(messages: list[jeevesagent.core.types.Message], *, tools: list[jeevesagent.core.types.ToolDef] | None = None, temperature: float = 1.0, max_tokens: int | None = None) collections.abc.AsyncIterator[jeevesagent.core.types.ModelChunk][source]
name = 'gpt-4o'
class jeevesagent.model.ScriptedModel(turns: list[ScriptedTurn])[source]

Model that emits canned responses, one per call to stream().

async complete(messages: list[jeevesagent.core.types.Message], *, tools: list[jeevesagent.core.types.ToolDef] | None = None, temperature: float = 1.0, max_tokens: int | None = None) tuple[str, list[jeevesagent.core.types.ToolCall], jeevesagent.core.types.Usage, str][source]

Single-shot replay of the next scripted turn.

Mirrors stream() but returns the turn’s text + tool_calls + usage in one tuple. Used by the non-streaming hot path (agent.run()); agent.stream() keeps using stream() for per-chunk replay.

async stream(messages: list[jeevesagent.core.types.Message], *, tools: list[jeevesagent.core.types.ToolDef] | None = None, temperature: float = 1.0, max_tokens: int | None = None) collections.abc.AsyncIterator[jeevesagent.core.types.ModelChunk][source]
name: str = 'scripted'
property remaining: int
class jeevesagent.model.ScriptedTurn[source]

A single scripted turn: text, tool calls, and usage to replay.

text: str = ''
tool_calls: list[jeevesagent.core.types.ToolCall] = []
usage: jeevesagent.core.types.Usage
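Put together, ScriptedTurn and ScriptedModel behave roughly like this sketch (simplified field types, usage omitted; class names here are illustrative stand-ins):

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class MiniTurn:
    text: str = ""
    tool_calls: list = field(default_factory=list)


class MiniScripted:
    """Replays canned turns in order, one per call."""

    def __init__(self, turns: list[MiniTurn]):
        self._turns = list(turns)

    @property
    def remaining(self) -> int:
        return len(self._turns)

    async def complete(self, messages) -> tuple[str, list]:
        # Consume the next scripted turn and return it in one shot.
        turn = self._turns.pop(0)
        return turn.text, turn.tool_calls
```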