jeevesagent.model.openai

Adapter for OpenAI chat completions via the official openai SDK.

Streams via chat.completions.create(stream=True). OpenAI streams text in delta.content and tool calls in delta.tool_calls, where each entry carries an index; the same index across chunks refers to the same tool call, so partial calls are accumulated by index. With stream_options={"include_usage": True}, the final chunk carries token counts.

The SDK is imported lazily inside __init__, so users without the openai extra installed can still run from jeevesagent.model import OpenAIModel; the import fires only when constructing without passing a client.

Classes

OpenAIModel

Talks to OpenAI via openai.AsyncOpenAI.

Module Contents

class jeevesagent.model.openai.OpenAIModel(model: str = 'gpt-4o', *, client: Any = None, api_key: str | None = None, base_url: str | None = None)[source]

Talks to OpenAI via openai.AsyncOpenAI.

async complete(messages: list[jeevesagent.core.types.Message], *, tools: list[jeevesagent.core.types.ToolDef] | None = None, temperature: float = 1.0, max_tokens: int | None = None) → tuple[str, list[jeevesagent.core.types.ToolCall], jeevesagent.core.types.Usage, str][source]

Single-shot completion (no per-chunk yields).

Tries the OpenAI non-streaming endpoint (stream=False) first. If that fails (e.g. when a test fake client only supports streaming, or a transport doesn't honor stream=False), it falls back to consuming stream() internally and accumulating the result. Even the fallback avoids per-chunk yield and Event construction overhead on the architecture side, because ReAct calls complete() with a single await.
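The try-then-fall-back shape can be sketched as below. The chunk type and method bodies are simplified stand-ins (plain strings instead of ModelChunk, a toy client), not the real implementation:

```python
# Try the single-shot path; on any failure, drain the stream and join it.
import asyncio

class FallbackModel:
    def __init__(self, chunks, supports_nonstreaming=False):
        self._chunks = chunks
        self._supports = supports_nonstreaming

    async def _nonstreaming(self):
        # Simulates a transport that may reject stream=False.
        if not self._supports:
            raise RuntimeError("stream=False not supported")
        return "".join(self._chunks)

    async def stream(self):
        for c in self._chunks:
            yield c

    async def complete(self) -> str:
        try:
            return await self._nonstreaming()
        except Exception:
            # Fallback: consume the stream internally, one await for the caller.
            parts = []
            async for c in self.stream():
                parts.append(c)
            return "".join(parts)

print(asyncio.run(FallbackModel(["Hel", "lo"]).complete()))  # Hello
```

Either path, the caller sees one awaited result rather than a chunk iterator.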

async stream(messages: list[jeevesagent.core.types.Message], *, tools: list[jeevesagent.core.types.ToolDef] | None = None, temperature: float = 1.0, max_tokens: int | None = None) → collections.abc.AsyncIterator[jeevesagent.core.types.ModelChunk][source]
name = 'gpt-4o'
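Consuming stream() amounts to async iteration over chunks. The ModelChunk field used here (text) is an assumption; check jeevesagent.core.types for the real shape. A fake generator stands in for a live model:

```python
# Iterate streamed chunks and concatenate their text.
import asyncio
from dataclasses import dataclass

@dataclass
class FakeChunk:  # stand-in for jeevesagent.core.types.ModelChunk
    text: str = ""

async def fake_stream():
    for piece in ("Once ", "upon ", "a time"):
        yield FakeChunk(text=piece)

async def main() -> str:
    out = []
    async for chunk in fake_stream():  # in real code: model.stream(messages)
        out.append(chunk.text)
    return "".join(out)

print(asyncio.run(main()))  # Once upon a time
```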