jeevesagent.model¶
Model adapters.
- EchoModel — zero-key, echoes the prompt; the default
- ScriptedModel — replays canned turns for tests
- AnthropicModel — Claude via the anthropic SDK
- OpenAIModel — GPT via the openai SDK
The provider adapters import their SDK lazily inside __init__ so
from jeevesagent.model import AnthropicModel works even without
the corresponding extra installed; the ImportError is raised only when
the constructor needs to build a default client.
Submodules¶
Classes¶
AnthropicModel — Talks to Claude via anthropic.AsyncAnthropic.
EchoModel — Echo-style model for tests and demos.
LiteLLMModel — Talks to any LiteLLM-supported provider.
OpenAIModel — Talks to OpenAI via openai.AsyncOpenAI.
ScriptedModel — Model that emits canned responses, one per call to stream().
Package Contents¶
- class jeevesagent.model.AnthropicModel(model: str = 'claude-opus-4-7', *, client: Any = None, api_key: str | None = None, max_tokens: int = DEFAULT_MAX_TOKENS)[source]¶
Talks to Claude via anthropic.AsyncAnthropic.
- async complete(messages: list[jeevesagent.core.types.Message], *, tools: list[jeevesagent.core.types.ToolDef] | None = None, temperature: float = 1.0, max_tokens: int | None = None) → tuple[str, list[jeevesagent.core.types.ToolCall], jeevesagent.core.types.Usage, str][source]¶
Single-shot non-streaming completion.
Calls client.messages.create(...) (no stream=True, no stream context manager) — Anthropic returns the full Message in one HTTP response. We walk its content blocks once to assemble (text, tool_calls, usage, stop_reason). Used by the non-streaming hot path (agent.run()); agent.stream() keeps using stream().
Falls back to consuming stream() if the underlying client raises (test fakes that only support streaming, or transports that don't honour single-shot creation).
- async stream(messages: list[jeevesagent.core.types.Message], *, tools: list[jeevesagent.core.types.ToolDef] | None = None, temperature: float = 1.0, max_tokens: int | None = None) → collections.abc.AsyncIterator[jeevesagent.core.types.ModelChunk][source]¶
- name = 'claude-opus-4-7'¶
- class jeevesagent.model.EchoModel(*, prefix: str = 'Echo: ', chunk_delay_s: float = 0.0, cost_per_token: float = 0.0)[source]¶
Echo-style model for tests and demos.
- async complete(messages: list[jeevesagent.core.types.Message], *, tools: list[jeevesagent.core.types.ToolDef] | None = None, temperature: float = 1.0, max_tokens: int | None = None) → tuple[str, list[jeevesagent.core.types.ToolCall], jeevesagent.core.types.Usage, str][source]¶
Single-shot echo. Returns the echoed user prompt as one string with synthetic usage. No per-token chunking — used by the non-streaming hot path (agent.run()).
- async stream(messages: list[jeevesagent.core.types.Message], *, tools: list[jeevesagent.core.types.ToolDef] | None = None, temperature: float = 1.0, max_tokens: int | None = None) → collections.abc.AsyncIterator[jeevesagent.core.types.ModelChunk][source]¶
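The echo flow can be sketched with a stripped-down stand-in. MiniEchoModel and its plain-dict messages are hypothetical simplifications for illustration, not the library's actual Message or Usage types:

```python
import asyncio


class MiniEchoModel:
    """Sketch of an echo-style model: complete() returns the last
    user message, prefixed, with synthetic usage counts."""

    def __init__(self, prefix: str = "Echo: "):
        self.prefix = prefix

    async def complete(self, messages: list[dict]) -> tuple[str, list, dict, str]:
        # Find the most recent user message and echo it back.
        prompt = next(
            (m["content"] for m in reversed(messages) if m["role"] == "user"), ""
        )
        text = self.prefix + prompt
        # Synthetic token counts stand in for real provider usage.
        usage = {"input_tokens": len(prompt.split()),
                 "output_tokens": len(text.split())}
        return text, [], usage, "end_turn"


text, tool_calls, usage, stop = asyncio.run(
    MiniEchoModel().complete([{"role": "user", "content": "hello"}])
)
# text == "Echo: hello", no tool calls, stop reason "end_turn"
```

Because nothing touches the network, a model shaped like this makes agent-loop tests deterministic and key-free.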
- class jeevesagent.model.LiteLLMModel(model: str, *, api_key: str | None = None, client: Any | None = None, **litellm_kwargs: Any)[source]¶
Bases: jeevesagent.model.openai.OpenAIModel
Talks to any LiteLLM-supported provider.
Inherits chunk normalisation, tool-call delta aggregation, and message conversion from OpenAIModel because LiteLLM produces OpenAI-shaped outputs.
- class jeevesagent.model.OpenAIModel(model: str = 'gpt-4o', *, client: Any = None, api_key: str | None = None, base_url: str | None = None)[source]¶
Talks to OpenAI via openai.AsyncOpenAI.
- async complete(messages: list[jeevesagent.core.types.Message], *, tools: list[jeevesagent.core.types.ToolDef] | None = None, temperature: float = 1.0, max_tokens: int | None = None) → tuple[str, list[jeevesagent.core.types.ToolCall], jeevesagent.core.types.Usage, str][source]¶
Single-shot completion (no per-chunk yields).
Tries the OpenAI non-streaming endpoint (stream=False) first. If that fails — e.g. when a test fake client only supports streaming, or a transport doesn't honour stream=False — falls back to consuming stream() internally and accumulating the result. The fallback still saves the per-chunk yield and Event-construction overhead on the architecture side because ReAct calls complete with a single await.
- async stream(messages: list[jeevesagent.core.types.Message], *, tools: list[jeevesagent.core.types.ToolDef] | None = None, temperature: float = 1.0, max_tokens: int | None = None) → collections.abc.AsyncIterator[jeevesagent.core.types.ModelChunk][source]¶
- name = 'gpt-4o'¶
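The try-non-streaming-then-fall-back pattern described for complete() can be sketched in isolation. FallbackClient and its create() method are hypothetical stand-ins for illustration, not the openai SDK's API:

```python
import asyncio
from collections.abc import AsyncIterator


class FallbackClient:
    """Sketch of a client whose single-shot path may be unavailable
    (e.g. a test fake that only supports streaming)."""

    def __init__(self, supports_single_shot: bool):
        self.supports_single_shot = supports_single_shot

    async def create(self, *, stream: bool):
        if not stream:
            if not self.supports_single_shot:
                raise RuntimeError("single-shot creation not supported")
            return "full response"

        async def chunks() -> AsyncIterator[str]:
            for piece in ("full", " ", "response"):
                yield piece

        return chunks()


async def complete(client: FallbackClient) -> str:
    # Prefer the single-shot endpoint: one await, no per-chunk work.
    try:
        return await client.create(stream=False)
    except RuntimeError:
        # Fallback: consume the stream and accumulate it into one string.
        parts = [chunk async for chunk in await client.create(stream=True)]
        return "".join(parts)
```

Either path returns the same accumulated string, so callers of complete() never need to know which transport behaviour they got.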
- class jeevesagent.model.ScriptedModel(turns: list[ScriptedTurn])[source]¶
Model that emits canned responses, one per call to stream().
- async complete(messages: list[jeevesagent.core.types.Message], *, tools: list[jeevesagent.core.types.ToolDef] | None = None, temperature: float = 1.0, max_tokens: int | None = None) → tuple[str, list[jeevesagent.core.types.ToolCall], jeevesagent.core.types.Usage, str][source]¶
Single-shot replay of the next scripted turn.
Mirrors stream() but returns the turn's text + tool_calls + usage in one tuple. Used by the non-streaming hot path (agent.run()); agent.stream() keeps using stream() for per-chunk replay.
- async stream(messages: list[jeevesagent.core.types.Message], *, tools: list[jeevesagent.core.types.ToolDef] | None = None, temperature: float = 1.0, max_tokens: int | None = None) → collections.abc.AsyncIterator[jeevesagent.core.types.ModelChunk][source]¶
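The scripted-replay idea can be sketched with simplified stand-ins. Turn and MiniScriptedModel below are hypothetical reductions of ScriptedTurn/ScriptedModel, using plain lists instead of the library's ToolCall and Usage types:

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class Turn:
    """Hypothetical simplified scripted turn: text plus tool calls."""
    text: str
    tool_calls: list = field(default_factory=list)


class MiniScriptedModel:
    """Sketch of a scripted model: each complete() call pops the next
    canned turn, regardless of the incoming messages."""

    def __init__(self, turns: list[Turn]):
        self._turns = iter(turns)

    async def complete(self, messages: list) -> tuple[str, list, str]:
        turn = next(self._turns)  # raises StopIteration when the script runs out
        stop = "tool_use" if turn.tool_calls else "end_turn"
        return turn.text, turn.tool_calls, stop


model = MiniScriptedModel([Turn("thinking..."), Turn("done")])
first = asyncio.run(model.complete([]))   # ("thinking...", [], "end_turn")
second = asyncio.run(model.complete([]))  # ("done", [], "end_turn")
```

Scripting turns this way lets a test drive a multi-step agent loop through an exact, repeatable sequence of model outputs.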
- class jeevesagent.model.ScriptedTurn[source]¶
- tool_calls: list[jeevesagent.core.types.ToolCall] = []¶