jeevesagent.governance.retry

Resilience for model calls.

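Two pieces:

RetryPolicy: the exponential-backoff-with-jitter retry schedule.

classify_model_error: maps an exception from any model SDK to the framework's taxonomy.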

The actual retry loop lives in RetryingModel, which wraps any Model and runs every call through this policy + classifier pair. Splitting policy/classification from the retry mechanics keeps each piece testable in isolation and lets callers reuse the classifier for non-Agent code (e.g. cron jobs that hit the same SDK).
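The split between policy, classifier, and loop can be sketched roughly as follows. This is an illustration with stand-ins, not the real RetryingModel: SketchPolicy, is_retryable, and call_with_retries are hypothetical names, jitter is omitted for brevity, and the actual wrapper may well be asynchronous.

```python
import time
from dataclasses import dataclass
from typing import Callable, TypeVar

T = TypeVar("T")


@dataclass(frozen=True)
class SketchPolicy:
    """Hypothetical stand-in for RetryPolicy (no jitter, for brevity)."""
    max_attempts: int = 3
    initial_delay_s: float = 1.0
    multiplier: float = 2.0
    max_delay_s: float = 30.0


def is_retryable(exc: BaseException) -> bool:
    """Hypothetical stand-in classifier: only TimeoutError is transient."""
    return isinstance(exc, TimeoutError)


def call_with_retries(
    fn: Callable[[], T],
    policy: SketchPolicy,
    classify: Callable[[BaseException], bool] = is_retryable,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run fn, retrying failures the classifier marks as transient."""
    attempt = 1
    while True:
        try:
            return fn()
        except Exception as exc:
            # Give up on non-retryable errors or when attempts are exhausted.
            if not classify(exc) or attempt >= policy.max_attempts:
                raise
            delay = min(
                policy.initial_delay_s * policy.multiplier ** (attempt - 1),
                policy.max_delay_s,
            )
            sleep(delay)
            attempt += 1
```

Because the classifier and the sleep function are passed in, each piece can be tested on its own, and the classifier can be reused outside the loop.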

Classes

RetryPolicy

Exponential-backoff-with-jitter retry schedule.

Functions

classify_model_error(...)

Map an exception from any model SDK to the framework's taxonomy.

compute_backoff(...) → float

Backoff (seconds) before retry number attempt (1-indexed).

Module Contents

class jeevesagent.governance.retry.RetryPolicy

Exponential-backoff-with-jitter retry schedule.

The default is sensible for production: up to 3 attempts (one initial + two retries), starting at 1 s, doubling each attempt, capped at 30 s, with ±10% jitter so that synchronised clients do not re-form a thundering herd.

Examples:

# default — sensible for most apps
RetryPolicy()

# disable retries (fail fast)
RetryPolicy.disabled()

# aggressive — survives long provider blips
RetryPolicy.aggressive()

# tuned to a specific SLO
RetryPolicy(max_attempts=4, initial_delay_s=0.5, max_delay_s=15)

The schedule applies between attempts: the first call has no delay, the second is delayed by initial_delay_s (± jitter), the third by initial_delay_s * multiplier (± jitter), etc., each capped at max_delay_s. Provider-supplied Retry-After hints (carried on retry_after) override the computed delay when they ask for more time — we never sleep less than the provider asked for.
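As a worked illustration of the schedule described above, the sketch below lists the delay before each retry while ignoring jitter and Retry-After hints. backoff_schedule is a hypothetical helper written for this example, not part of the module; its defaults mirror the documented RetryPolicy defaults.

```python
def backoff_schedule(
    max_attempts: int = 3,
    initial_delay_s: float = 1.0,
    multiplier: float = 2.0,
    max_delay_s: float = 30.0,
) -> list[float]:
    """Delays (seconds) before each retry, ignoring jitter and Retry-After."""
    return [
        min(initial_delay_s * multiplier ** k, max_delay_s)
        for k in range(max_attempts - 1)  # one delay per retry
    ]


print(backoff_schedule())
# [1.0, 2.0]
print(backoff_schedule(max_attempts=4, initial_delay_s=0.5, max_delay_s=15))
# [0.5, 1.0, 2.0]
print(backoff_schedule(max_attempts=8))
# [1.0, 2.0, 4.0, 8.0, 16.0, 30.0, 30.0]
```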

classmethod aggressive() → RetryPolicy

Up to 6 attempts, faster initial backoff, longer cap. Use when the underlying provider is known-flaky and the caller prefers slow success over fast failure.

classmethod disabled() → RetryPolicy

Single attempt, no retries — fail fast on any error.

is_enabled() → bool

True when the policy permits at least one retry.

initial_delay_s: float = 1.0

Backoff before the first retry (i.e. between attempts 1 and 2). Retry number k (1-indexed) uses initial_delay_s * multiplier**(k - 1), before jitter is applied and subject to the max_delay_s cap.

jitter: float = 0.1

Fractional ±jitter applied to each computed delay. 0.1 = ±10%. Set to 0 for deterministic backoff (useful in tests).
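One plausible way to apply fractional jitter is sketched below. It assumes the jitter is drawn uniformly from the ± range, which the documentation does not state; apply_jitter is a hypothetical helper, not the library's function.

```python
import random


def apply_jitter(delay_s: float, jitter: float, rng: random.Random) -> float:
    """Scale delay_s by a uniform random factor in [1 - jitter, 1 + jitter]."""
    if jitter == 0:
        return delay_s  # deterministic backoff, handy in tests
    # Assumption: uniform distribution over the +/- jitter range.
    return delay_s * (1.0 + rng.uniform(-jitter, jitter))


rng = random.Random(42)  # seeded so repeated runs give the same delays
samples = [apply_jitter(2.0, 0.1, rng) for _ in range(5)]
print(samples)  # five values, each within 10% of 2.0
```

Passing a seeded random.Random keeps tests reproducible without setting jitter to 0.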

max_attempts: int = 3

Maximum total attempts including the first call. 1 means no retries; the call either succeeds or raises immediately. The minimum-meaningful retry policy is therefore max_attempts=2.

max_delay_s: float = 30.0

Cap on any single backoff. Prevents runaway sleeps when multiplier is large or max_attempts is high.

multiplier: float = 2.0

Geometric growth factor between successive retries. 2.0 doubles the delay each time; 1.0 keeps the delay constant (fixed-interval backoff).

jeevesagent.governance.retry.classify_model_error(exc: BaseException) → jeevesagent.core.errors.ModelError | None

Map an exception from any model SDK to the framework’s taxonomy.

Returns None when the exception is not recognised as a model-call failure, leaving callers to decide whether to wrap it in something else or propagate it. Otherwise returns an instance of one of TransientModelError / RateLimitError / AuthenticationError / InvalidRequestError / ContentFilterError / PermanentModelError.
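A caller outside the agent loop might branch on the classifier's result along these lines. Everything here is a stand-in: the exception classes are simplified copies of part of the taxonomy, classify_stub is a toy replacement for classify_model_error, and treating only TransientModelError and RateLimitError as retryable is an assumption, not documented behaviour.

```python
from typing import Optional


# Hypothetical stand-ins for part of the framework taxonomy; the real classes
# live in jeevesagent.core.errors and may carry extra fields.
class ModelError(Exception): ...
class TransientModelError(ModelError): ...
class RateLimitError(ModelError): ...
class PermanentModelError(ModelError): ...


def classify_stub(exc: BaseException) -> Optional[ModelError]:
    """Toy classifier: maps a few built-in exceptions, None otherwise."""
    if isinstance(exc, TimeoutError):
        return TransientModelError(str(exc))
    if isinstance(exc, PermissionError):
        return PermanentModelError(str(exc))
    return None  # not recognised as a model-call failure


def handle(exc: BaseException) -> str:
    """Decide what a caller (e.g. a cron job) does with a failure."""
    classified = classify_stub(exc)
    if classified is None:
        return "propagate"  # unknown error: let it bubble up unchanged
    if isinstance(classified, (TransientModelError, RateLimitError)):
        return "retry"      # assumed retryable categories
    return "fail"           # everything else is treated as final
```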

SDK imports are lazy — having e.g. the anthropic package installed is not required for OpenAI classification to work, and vice versa.

jeevesagent.governance.retry.compute_backoff(policy: RetryPolicy, attempt: int, *, retry_after: float | None = None, rng: random.Random | None = None) → float

Backoff (seconds) before retry number attempt (1-indexed).

attempt=1 is the delay before the first retry (i.e. between attempts 1 and 2 of max_attempts). Returns 0 when policy is disabled.

retry_after (provider hint, e.g. from a 429 Retry-After header) acts as a floor: we never wait less than the provider asked for. The policy.max_delay_s cap applies to the computed backoff only, not to the hint. A provider-supplied 60-second hint paired with a 30-second cap is therefore honoured at 60 seconds, exceeding the cap on purpose, because the provider is more authoritative than our heuristic.
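The interaction between the cap and the provider hint can be sketched as follows, based on the 60-second example above. delay_with_hint is a hypothetical helper, not the body of compute_backoff, and jitter is left out.

```python
from typing import Optional


def delay_with_hint(
    computed_s: float,
    max_delay_s: float,
    retry_after: Optional[float] = None,
) -> float:
    """Cap the computed delay, then apply the provider hint as a floor."""
    delay = min(computed_s, max_delay_s)
    if retry_after is not None:
        delay = max(delay, retry_after)  # the hint may exceed max_delay_s
    return delay


print(delay_with_hint(4.0, 30.0))                    # 4.0  (no hint)
print(delay_with_hint(4.0, 30.0, retry_after=2.0))   # 4.0  (hint asks for less)
print(delay_with_hint(4.0, 30.0, retry_after=60.0))  # 60.0 (hint beats the cap)
```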