================================================================================
Provider wire-format structures  (models + completion)
================================================================================

Reference for the provider package. Two endpoints are normalized here:

  * GET  {base}/models                 -> CheckModel  (check.py)        -> ModelInfo
  * POST {base}/chat/completions       -> GetCompletion (completion.py) -> Completion
    POST {base}/messages   (Anthropic)

Two wire formats, selected by Provider.endpoint_type (EndpointType enum):

  OPENAI-compatible (OpenAI, DeepSeek, Groq, OpenRouter, ...):
      auth:     Authorization: Bearer <key>
      models:   GET  {base}/models
      complete: POST {base}/chat/completions   body: {model, messages, ...}

  ANTHROPIC:
      auth:     x-api-key: <key>   +   anthropic-version: 2023-06-01
      models:   GET  {base}/models
      complete: POST {base}/messages          body: {model, max_tokens, system?, messages, ...}
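
The selection above can be sketched in Python as follows. This is a minimal
illustration, not the package's actual code: the EndpointType member values and
the helper names build_headers / completion_url are assumptions; only the header
names and paths come from the table above.

```python
from enum import Enum


class EndpointType(Enum):
    # The enum name comes from this reference; the member values are assumed.
    OPENAI = "openai"
    ANTHROPIC = "anthropic"


def build_headers(endpoint_type: EndpointType, api_key: str) -> dict:
    """Return the auth headers for the given wire format (hypothetical helper)."""
    if endpoint_type is EndpointType.ANTHROPIC:
        return {"x-api-key": api_key, "anthropic-version": "2023-06-01"}
    return {"Authorization": f"Bearer {api_key}"}


def completion_url(endpoint_type: EndpointType, base: str) -> str:
    """Return the completion endpoint for the given wire format (hypothetical helper)."""
    path = "messages" if endpoint_type is EndpointType.ANTHROPIC else "chat/completions"
    return f"{base.rstrip('/')}/{path}"
```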

"Minimal" providers (OpenAI, DeepSeek) only report an `id` on /models, so every
ModelInfo field except model_exists is unknown -> None / False / [] and the
agent fails open (does not block on capabilities it could not determine).


################################################################################
# PART 1 -- /models  (CheckModel -> ModelInfo)
################################################################################

ModelInfo(model_exists, context_length, compactable, max_completion_tokens,
          reasoning, input_modalities, output_modalities)

`compactable` is derived, not read from the response: it is (context_length is
not None) -- i.e. whether the agent can measure context fill to decide when to
auto-summarize. Minimal providers (no context_length) are not compactable.
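
One way to keep `compactable` in sync with context_length is to compute it in a
small factory. The sketch below assumes ModelInfo is a dataclass with the field
order shown above; the field types, the defaults and the make_model_info helper
are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ModelInfo:
    # Field order follows the signature above; the types are assumed.
    model_exists: bool
    context_length: Optional[int]
    compactable: bool
    max_completion_tokens: Optional[int]
    reasoning: bool
    input_modalities: List[str]
    output_modalities: List[str]


def make_model_info(
    model_exists: bool,
    context_length: Optional[int] = None,
    max_completion_tokens: Optional[int] = None,
    reasoning: bool = False,
    input_modalities: Optional[List[str]] = None,
    output_modalities: Optional[List[str]] = None,
) -> ModelInfo:
    """Build a ModelInfo, deriving compactable instead of accepting it."""
    return ModelInfo(
        model_exists=model_exists,
        context_length=context_length,
        compactable=context_length is not None,
        max_completion_tokens=max_completion_tokens,
        reasoning=reasoning,
        input_modalities=list(input_modalities or []),
        output_modalities=list(output_modalities or []),
    )
```

With this shape a minimal provider yields make_model_info(True), which is not
compactable, while any provider that reports a context length is.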

All responses wrap models in a top-level "data" array; the OpenRouter sample
below shows a single entry of that array. Anthropic also adds cursor pagination
(first_id / last_id / has_more).
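
A sketch of walking the cursor fields, assuming the caller supplies a
hypothetical fetch_page(after_id) callable that performs the GET and returns the
decoded JSON body. Passing last_id back as the next cursor (an `after_id` query
parameter) is an assumption about the provider API, not something this
reference states.

```python
from typing import Callable, Iterator, Optional


def iter_models(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield every entry of data[], following has_more / last_id.

    fetch_page is hypothetical: it receives None for the first page and the
    previous page's last_id afterwards.
    """
    after_id: Optional[str] = None
    while True:
        page = fetch_page(after_id)
        yield from page.get("data", [])
        # Stop when the provider reports no further pages or omits the cursor
        # (OpenAI-compatible responses carry neither field).
        if not page.get("has_more") or not page.get("last_id"):
            break
        after_id = page["last_id"]
```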


================================================================================
OpenAI  --  GET https://api.openai.com/v1/models   (Authorization: Bearer)
================================================================================
{
  "object": "list",
  "data": [
    {
      "id": "gpt-4o",
      "object": "model",
      "created": 1715367049,
      "owned_by": "system"
    }
  ]
}

ModelInfo mapping:
  model_exists          <- id present in data[]
  context_length        -> None     (not reported)
  max_completion_tokens -> None     (not reported)
  reasoning             -> False    (not reported)
  input_modalities      -> []       (unknown)
  output_modalities     -> []       (unknown)
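
The minimal mapping can be sketched as below. A plain dict stands in for
ModelInfo, and the function name map_minimal is hypothetical.

```python
def map_minimal(response: dict, model_id: str) -> dict:
    """Map a minimal /models response: only existence can be determined."""
    ids = {entry.get("id") for entry in response.get("data", [])}
    return {
        "model_exists": model_id in ids,
        "context_length": None,         # not reported
        "compactable": False,           # derived: context_length is None
        "max_completion_tokens": None,  # not reported
        "reasoning": False,             # not reported
        "input_modalities": [],         # unknown
        "output_modalities": [],        # unknown
    }
```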


================================================================================
DeepSeek  --  GET https://api.deepseek.com/models   (Authorization: Bearer)
================================================================================
{
  "object": "list",
  "data": [
    { "id": "deepseek-chat",     "object": "model", "owned_by": "deepseek" },
    { "id": "deepseek-reasoner", "object": "model", "owned_by": "deepseek" }
  ]
}

ModelInfo mapping: same as OpenAI -- minimal, existence only.


================================================================================
Groq  --  GET https://api.groq.com/openai/v1/models   (Authorization: Bearer)
================================================================================
{
  "object": "list",
  "data": [
    {
      "id": "llama-3.3-70b-versatile",
      "object": "model",
      "created": 1693721698,
      "owned_by": "Meta",
      "active": true,
      "context_window": 131072,
      "max_completion_tokens": 32768
    }
  ]
}

ModelInfo mapping:
  context_length        <- context_window           (flat, Groq-specific name)
  max_completion_tokens <- max_completion_tokens     (flat)
  reasoning / modalities  not reported -> False / []
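
A sketch of the Groq mapping under the same assumptions as elsewhere in this
file's examples: a plain dict stands in for ModelInfo and map_groq is a
hypothetical name.

```python
def map_groq(response: dict, model_id: str) -> dict:
    """Map a Groq /models response; limits are flat keys on each entry."""
    entry = next(
        (m for m in response.get("data", []) if m.get("id") == model_id), None
    )
    found = entry is not None
    entry = entry or {}
    context_length = entry.get("context_window")  # Groq-specific name
    return {
        "model_exists": found,
        "context_length": context_length,
        "compactable": context_length is not None,
        "max_completion_tokens": entry.get("max_completion_tokens"),
        "reasoning": False,       # not reported
        "input_modalities": [],   # not reported
        "output_modalities": [],  # not reported
    }
```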


================================================================================
OpenRouter  --  GET https://openrouter.ai/api/v1/models   (Authorization: Bearer)
================================================================================
{
  "id": "openai/gpt-4-0314",
  "canonical_slug": "openai/gpt-4-0314",
  "hugging_face_id": null,
  "name": "OpenAI: GPT-4 (older v0314)",
  "created": 1685232000,
  "description": "...",
  "context_length": 8191,
  "architecture": {
    "modality": "text->text",
    "input_modalities": ["text"],
    "output_modalities": ["text"],
    "tokenizer": "GPT",
    "instruct_type": null
  },
  "pricing": {
    "prompt": "0.00003", "completion": "0.00006", "request": "0",
    "image": "0", "web_search": "0", "internal_reasoning": "0"
  },
  "top_provider": {
    "context_length": 8191,
    "max_completion_tokens": 4096,
    "is_moderated": true
  },
  "per_request_limits": null,
  "supported_parameters": ["frequency_penalty", "...", "tools", "top_p"],
  "default_parameters": {}
}

ModelInfo mapping:
  context_length        <- top_provider.context_length, else top-level context_length
  max_completion_tokens <- top_provider.max_completion_tokens
  reasoning             <- "reasoning" in supported_parameters
                           OR pricing.internal_reasoning != "0"
  input_modalities      <- architecture.input_modalities
  output_modalities     <- architecture.output_modalities
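
A sketch of the OpenRouter mapping for a single model entry. The dict return
shape and the name map_openrouter_entry are hypothetical, and treating a missing
or null internal_reasoning price as "0" is an assumption.

```python
def map_openrouter_entry(entry: dict) -> dict:
    """Map one OpenRouter model object to ModelInfo-like fields."""
    top = entry.get("top_provider") or {}
    arch = entry.get("architecture") or {}
    pricing = entry.get("pricing") or {}
    # Prefer the top provider's limit, then fall back to the top-level value.
    context_length = top.get("context_length") or entry.get("context_length")
    reasoning = (
        "reasoning" in (entry.get("supported_parameters") or [])
        or (pricing.get("internal_reasoning") or "0") != "0"
    )
    return {
        "model_exists": True,
        "context_length": context_length,
        "compactable": context_length is not None,
        "max_completion_tokens": top.get("max_completion_tokens"),
        "reasoning": reasoning,
        "input_modalities": list(arch.get("input_modalities") or []),
        "output_modalities": list(arch.get("output_modalities") or []),
    }
```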


================================================================================
Anthropic  --  GET https://api.anthropic.com/v1/models
               (x-api-key + anthropic-version: 2023-06-01)
================================================================================
{
  "data": [
    {
      "id": "claude-opus-4-6",
      "type": "model",
      "display_name": "Claude Opus 4.6",
      "created_at": "2026-02-04T00:00:00Z",
      "max_input_tokens": 0,
      "max_tokens": 0,
      "capabilities": {
        "image_input":        { "supported": true },
        "pdf_input":          { "supported": true },
        "structured_outputs": { "supported": true },
        "citations":          { "supported": true },
        "code_execution":     { "supported": true },
        "batch":              { "supported": true },
        "thinking": {
          "supported": true,
          "types": {
            "adaptive": { "supported": true },
            "enabled":  { "supported": true }
          }
        },
        "effort": {
          "supported": true,
          "low":   { "supported": true },
          "medium":{ "supported": true },
          "high":  { "supported": true },
          "max":   { "supported": true },
          "xhigh": { "supported": true }
        },
        "context_management": {
          "supported": true,
          "clear_thinking_20251015": { "supported": true },
          "clear_tool_uses_20250919": { "supported": true },
          "compact_20260112":         { "supported": true }
        }
      }
    }
  ],
  "first_id": "first_id",
  "has_more": true,
  "last_id": "last_id"
}

ModelInfo mapping:
  context_length        <- max_input_tokens   (0 -> None)
  max_completion_tokens <- max_tokens          (0 -> None)
  reasoning             <- capabilities.thinking.supported
  input_modalities      <- ["text"]  (+ "image" if capabilities.image_input.supported,
                                       + "file"  if capabilities.pdf_input.supported)
  output_modalities     <- ["text"]  (Claude models are text-out only)

  NOTE: capabilities is a nested tree of { "supported": bool } flags, not flat
  lists. If `capabilities` is absent, modalities are left [] (unknown) so the
  agent fails open instead of wrongly asserting text-only.
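
A sketch of the Anthropic mapping, including the fail-open branch for a missing
`capabilities` tree. The dict return shape and the names _supported and
map_anthropic_entry are hypothetical; reporting reasoning as False when
capabilities is absent is an assumption carried over from the "not reported"
convention used for the other providers.

```python
def _supported(capabilities: dict, name: str) -> bool:
    """Read capabilities.<name>.supported, treating a missing node as False."""
    return bool((capabilities.get(name) or {}).get("supported"))


def map_anthropic_entry(entry: dict) -> dict:
    """Map one Anthropic model object to ModelInfo-like fields."""
    context_length = entry.get("max_input_tokens") or None  # 0 -> None
    max_completion = entry.get("max_tokens") or None        # 0 -> None
    caps = entry.get("capabilities")
    if caps is None:
        # Fail open: unknown modalities rather than asserting text-only.
        reasoning, inputs, outputs = False, [], []
    else:
        reasoning = _supported(caps, "thinking")
        inputs = ["text"]
        if _supported(caps, "image_input"):
            inputs.append("image")
        if _supported(caps, "pdf_input"):
            inputs.append("file")
        outputs = ["text"]
    return {
        "model_exists": True,
        "context_length": context_length,
        "compactable": context_length is not None,
        "max_completion_tokens": max_completion,
        "reasoning": reasoning,
        "input_modalities": inputs,
        "output_modalities": outputs,
    }
```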


################################################################################
# PART 2 -- completion  (GetCompletion -> Completion)
################################################################################

Completion(content, reasoning, images, error, total_tokens)


================================================================================
OpenAI-compatible  --  POST {base}/chat/completions   (Authorization: Bearer)
================================================================================
Request body:
{
  "model": "...",
  "messages": [ {"role": "system|user|assistant", "content": "..."} ],
  ... passthrough kwargs (temperature, top_p, tools, response_format, ...)
}

Response:
{
  "id": "gen-123...",
  "provider": "Seed",
  "model": "bytedance-seed/seedream-4.5",
  "object": "chat.completion",
  "created": 1768040219,
  "choices": [
    {
      "logprobs": null,
      "finish_reason": "stop",
      "native_finish_reason": null,
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "",
        "refusal": null,
        "reasoning": null,
        "images": [
          {
            "index": 0,
            "type": "image_url",
            "image_url": { "url": "data:image/jpeg;base64,..." }
          }
        ]
      }
    }
  ],
  "usage": {
    "prompt_tokens": 23,
    "completion_tokens": 4175,
    "total_tokens": 4198,
    "cost": 0.040000675,
    "completion_tokens_details": { "reasoning_tokens": 0, "image_tokens": 4175 }
  }
}

Completion mapping:
  content      <- choices[0].message.content        (stripped)
  reasoning    <- choices[0].message.reasoning       (<think>/<reason> tags stripped)
  images       <- choices[0].message.images[].image_url.url   (type == image_url)
  total_tokens <- usage.total_tokens
  error        <- set when choices is empty / non-200 (see _check_for_errors)
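
A sketch of this mapping for an already-decoded 200 response. It is not the
package's _check_for_errors logic: the dict return shape, the name
map_openai_completion and the error string are hypothetical, and "tags
stripped" is read here as removing the <think>/<reason> markers while keeping
the text between them.

```python
import re

# Matches opening and closing <think> / <reason> markers only.
_TAGS = re.compile(r"</?(?:think|reason)>")


def map_openai_completion(body: dict) -> dict:
    """Map a chat/completions response body to Completion-like fields."""
    choices = body.get("choices") or []
    if not choices:
        return {"content": None, "reasoning": None, "images": None,
                "error": "no choices in response", "total_tokens": None}
    message = choices[0].get("message") or {}
    reasoning = message.get("reasoning")
    if reasoning is not None:
        reasoning = _TAGS.sub("", reasoning).strip()
    images = [
        (img.get("image_url") or {}).get("url")
        for img in message.get("images") or []
        if img.get("type") == "image_url"
    ]
    return {
        "content": (message.get("content") or "").strip(),
        "reasoning": reasoning,
        "images": images,
        "error": None,
        "total_tokens": (body.get("usage") or {}).get("total_tokens"),
    }
```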


================================================================================
Anthropic  --  POST https://api.anthropic.com/v1/messages
               (x-api-key + anthropic-version: 2023-06-01)
================================================================================
Request body:
{
  "model": "claude-opus-4-6",
  "max_tokens": 1024,
  "system": "You are a helpful assistant.",
  "messages": [
    { "role": "user",      "content": "Hello" },
    { "role": "assistant", "content": "Hi! How can I help?" }
  ],
  ... passthrough kwargs (temperature, top_p, top_k, tools, stop_sequences, thinking)
}

Response:
{
  "id": "msg_013Zva2CMHLNnXjNJJKqJ2EF",
  "container": {
    "id": "id",
    "expires_at": "2019-12-27T18:11:19.117Z",
    "skills": [
      { "skill_id": "pdf", "type": "anthropic", "version": "latest" }
    ]
  },
  "content": [
    {
      "citations": [
        {
          "cited_text": "cited_text",
          "document_index": 0,
          "document_title": "document_title",
          "end_char_index": 0,
          "file_id": "file_id",
          "start_char_index": 0,
          "type": "char_location"
        }
      ],
      "text": "Hi! My name is Claude.",
      "type": "text"
    }
  ],
  "context_management": {
    "applied_edits": [
      { "cleared_input_tokens": 0, "cleared_tool_uses": 0, "type": "clear_tool_uses_20250919" }
    ]
  },
  "diagnostics": {
    "cache_miss_reason": { "cache_missed_input_tokens": 0, "type": "model_changed" }
  },
  "model": "claude-opus-4-6",
  "role": "assistant",
  "stop_details": {
    "category": "cyber",
    "explanation": "explanation",
    "type": "refusal"
  },
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "type": "message",
  "usage": {
    "cache_creation": { "ephemeral_1h_input_tokens": 0, "ephemeral_5m_input_tokens": 0 },
    "cache_creation_input_tokens": 2051,
    "cache_read_input_tokens": 2051,
    "inference_geo": "inference_geo",
    "input_tokens": 2095,
    "iterations": [
      {
        "cache_creation": { "ephemeral_1h_input_tokens": 0, "ephemeral_5m_input_tokens": 0 },
        "cache_creation_input_tokens": 0,
        "cache_read_input_tokens": 0,
        "input_tokens": 0,
        "output_tokens": 0,
        "type": "message"
      }
    ],
    "output_tokens": 503,
    "output_tokens_details": { "thinking_tokens": 0 },
    "server_tool_use": { "web_fetch_requests": 2, "web_search_requests": 0 },
    "service_tier": "standard",
    "speed": "standard"
  }
}

Extended thinking adds a block to content[] before the text block:
    { "type": "thinking", "thinking": "Let me reason...", "signature": "..." }

Completion mapping:
  content      <- concat of content[].text     where type == "text"     (stripped)
  reasoning    <- concat of content[].thinking where type == "thinking"
                  (thinking blocks only present when extended thinking is enabled)
  images       -> None    (Claude is text-out only)
  total_tokens <- usage.input_tokens
                  + usage.cache_read_input_tokens
                  + usage.cache_creation_input_tokens
                  + usage.output_tokens        (cache fields are 0 when caching off)
  error        <- set when content missing / non-200
                  (Anthropic errors: {"type": "error", "error": {"type", "message"}})

  Ignored: container, context_management, diagnostics, stop_details,
  stop_reason / stop_sequence, and the usage sub-objects (iterations,
  cache_creation breakdown, server_tool_use, *_details, inference_geo, etc.).
  NOTE: stop_reason can be "refusal" (see stop_details) -- not currently
  surfaced; the refusal text, if any, still comes through as a text block.
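
A sketch of the Anthropic completion mapping for an already-decoded body. The
dict return shape, the name map_anthropic_completion and the fallback error
string are hypothetical; returning None for reasoning when no thinking blocks
are present is an assumption.

```python
_TOKEN_FIELDS = (
    "input_tokens",
    "cache_read_input_tokens",
    "cache_creation_input_tokens",
    "output_tokens",
)


def map_anthropic_completion(body: dict) -> dict:
    """Map a /messages response body to Completion-like fields."""
    if body.get("type") == "error" or "content" not in body:
        err = body.get("error") or {}
        return {"content": None, "reasoning": None, "images": None,
                "error": err.get("message") or "content missing from response",
                "total_tokens": None}
    blocks = body["content"]
    text = "".join(b.get("text", "") for b in blocks if b.get("type") == "text")
    thinking = "".join(
        b.get("thinking", "") for b in blocks if b.get("type") == "thinking"
    )
    usage = body.get("usage") or {}
    # Missing or null cache fields count as 0.
    total = sum(usage.get(name) or 0 for name in _TOKEN_FIELDS)
    return {
        "content": text.strip(),
        "reasoning": thinking or None,
        "images": None,  # Claude is text-out only
        "error": None,
        "total_tokens": total,
    }
```

For the sample response above, the usage fields sum to
2095 + 2051 + 2051 + 503 = 6700 total tokens.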