jeevesagent.model.retrying

Retry wrapper for any Model.

RetryingModel decorates an underlying model adapter (OpenAIModel, AnthropicModel, LiteLLMModel, custom implementations) with the framework’s retry policy. The agent loop sees a Model like any other; the retry mechanics are invisible above this layer.

What it does:

  • On every complete() / stream() call, runs the underlying model up to RetryPolicy.max_attempts times.

  • Catches the underlying SDK exception and runs it through classify_model_error().

  • PermanentModelError (auth, bad request, content filter) is re-raised immediately without backoff.

  • TransientModelError (rate limit, 5xx, network) is retried after a backoff computed by compute_backoff(). Provider-supplied Retry-After hints set a floor on the wait.

  • Unrecognised exceptions (anything classify_model_error() returns None for) propagate unchanged: it is better to let an unknown error bubble up than to retry it silently.
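
The control flow in the list above can be sketched as a small standalone loop. This is an illustration under stated assumptions, not the library's code: the exception classes, classify(), wait_time() and call_with_retry() are hypothetical stand-ins for PermanentModelError, TransientModelError, classify_model_error() and compute_backoff(). The backoff formula (capped exponential with jitter), the retry_after attribute, and the behaviour once attempts are exhausted are guesses made for the example.

```python
import asyncio
import random


class TransientError(Exception):
    """Stand-in for TransientModelError (rate limit, 5xx, network)."""

    def __init__(self, message, retry_after=None):
        super().__init__(message)
        self.retry_after = retry_after  # assumed: optional Retry-After hint, in seconds


class PermanentError(Exception):
    """Stand-in for PermanentModelError (auth, bad request, content filter)."""


class RateLimited(Exception):
    """Hypothetical raw SDK exception that carries a Retry-After hint."""

    def __init__(self, retry_after):
        super().__init__("rate limited")
        self.retry_after = retry_after


def classify(exc):
    """Hypothetical classifier: returns a classified error, or None if unknown."""
    if isinstance(exc, RateLimited):
        return TransientError(str(exc), retry_after=exc.retry_after)
    if isinstance(exc, TimeoutError):
        return TransientError(str(exc))
    if isinstance(exc, PermissionError):
        return PermanentError(str(exc))
    return None


def wait_time(attempt, retry_after=None, base=0.01, cap=1.0):
    """Assumed policy: capped exponential backoff with jitter, floored by the hint."""
    delay = random.uniform(0, min(cap, base * 2 ** attempt))
    if retry_after is not None:
        delay = max(delay, retry_after)  # the provider hint sets a floor on the wait
    return delay


async def call_with_retry(fn, max_attempts=3):
    """Run fn() up to max_attempts times, retrying only transient failures."""
    for attempt in range(max_attempts):
        try:
            return await fn()
        except Exception as exc:
            classified = classify(exc)
            if classified is None:
                raise  # unknown error: propagate unchanged
            if isinstance(classified, PermanentError):
                raise classified from exc  # no backoff, no retry
            if attempt == max_attempts - 1:
                raise classified from exc  # assumed behaviour once attempts run out
            await asyncio.sleep(wait_time(attempt, classified.retry_after))
```

In this sketch a call that times out twice and then succeeds returns normally on the third attempt, a PermissionError surfaces as a permanent error after a single call, and a ValueError is never retried.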

Streaming retries are deliberately limited: once the first chunk has been yielded to the consumer we cannot rewind, so retries only fire while waiting for that first chunk. Errors mid-stream propagate.

Classes

RetryingModel

Wraps any Model with retry semantics.

Module Contents

class jeevesagent.model.retrying.RetryingModel(inner: Any, policy: jeevesagent.governance.retry.RetryPolicy)

Wraps any Model with retry semantics.

Construction does not validate the inner model: anything that quacks like a Model (has name, stream, and optionally complete) works. The wrapper keeps a stable name matching the underlying model so telemetry and audit logs stay consistent.
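
To make the duck-typing point concrete, here is a minimal sketch of a test double that satisfies the shape described above, together with a toy wrapper that copies the inner name. Both EchoModel and PassthroughWrapper are hypothetical names invented for this example. The real RetryingModel is not shown and may differ, and the real stream yields ModelChunk objects, not strings.

```python
import asyncio


class EchoModel:
    """Hypothetical test double: has `name` and `stream`, and no `complete`."""

    name = "echo-test"

    async def stream(self, messages, *, tools=None, temperature=1.0, max_tokens=None):
        # Plain strings stand in for the framework's ModelChunk objects.
        for message in messages:
            yield f"echo: {message}"


class PassthroughWrapper:
    """Toy wrapper that mirrors the inner model's name and exposes `inner`."""

    def __init__(self, inner):
        self._inner = inner
        self.name = inner.name  # same name as the wrapped model

    @property
    def inner(self):
        return self._inner

    def stream(self, messages, **kwargs):
        # Delegate directly; a retrying wrapper would add its policy here.
        return self._inner.stream(messages, **kwargs)


async def collect(model, messages):
    """Drain a model's stream into a list."""
    return [chunk async for chunk in model.stream(messages)]
```

Because the wrapper reports the same name as the object it wraps, code that logs by model name cannot tell the two apart, which is the property the paragraph above describes.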

async complete(messages: list[jeevesagent.core.types.Message], *, tools: list[jeevesagent.core.types.ToolDef] | None = None, temperature: float = 1.0, max_tokens: int | None = None) -> tuple[str, list[jeevesagent.core.types.ToolCall], jeevesagent.core.types.Usage, str]

Single-shot completion with retry on transient failures.

async stream(messages: list[jeevesagent.core.types.Message], *, tools: list[jeevesagent.core.types.ToolDef] | None = None, temperature: float = 1.0, max_tokens: int | None = None) -> collections.abc.AsyncIterator[jeevesagent.core.types.ModelChunk]

Streaming completion with retry-before-first-chunk.

We can’t roll back chunks already yielded to the consumer, so retry behaviour applies to errors that occur before the first ModelChunk is produced. If the underlying stream raises mid-stream the error propagates unchanged.
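
The retry-before-first-chunk rule can be sketched with a plain async generator. This is a minimal illustration, not the library's implementation: TransientError, stream_with_retry() and the fixed delay are hypothetical, and classification and backoff are reduced to a single exception type and a constant sleep.

```python
import asyncio


class TransientError(Exception):
    """Stand-in for an error already classified as transient."""


async def stream_with_retry(open_stream, max_attempts=3, delay=0.01):
    """Yield chunks from open_stream(), retrying only before the first chunk.

    open_stream is a zero-argument callable that returns a fresh async
    iterator for each attempt.
    """
    for attempt in range(max_attempts):
        started = False
        try:
            async for chunk in open_stream():
                started = True  # from here on the consumer has seen output
                yield chunk
            return  # stream finished cleanly
        except TransientError:
            # Once a chunk has been yielded we cannot rewind, so re-raise.
            if started or attempt == max_attempts - 1:
                raise
            await asyncio.sleep(delay)
```

A stream that fails while connecting is reopened and the consumer never sees the first failure. A stream that fails after yielding one chunk is opened once only, and the error reaches the consumer together with the chunk it already received.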

property inner: Any

The wrapped model. Useful for tests and introspection.

name: str

property policy: jeevesagent.governance.retry.RetryPolicy