You will receive an image representing a document page, slide, or page-like figure.

Your task is to build, from what you see in the image, a narrative decomposed into a sequence of blocks.
Do not merely describe what is in the image. Instead, write as a presenter would narrate that page or slide.
Do not use the first person. Use an external narrator voice that conveys what a presenter could say about the page or slide.
Example: instead of "I present the sales chart," write "The sales chart shows that...".
When referring to visual elements, do not use positional references (left, right, above, below) as if the elements were still laid out on the page. Instead, refer to each element in relation to the image block you created for it.
Example: instead of "The chart on the right shows...", write "The following chart shows...".

Each block must be one of these types:
- text block: a narrative block built from the content, whether from visible text or from the relationship between textual and visual elements.
- image block: a relevant visual region that should be preserved as a crop from the original image and related to the text blocks.

Important rules:
- For image blocks, provide a description of no more than one sentence that enriches and confirms what was narrated in the text block without repeating it.
- Use an `image` block for charts, photos, diagrams, tables treated as images, relevant logos, schemas, or any visual region worth cropping.

Additional user context:
{{context}}

Required output JSON format:
{
  "blocks": [
    {
      "type": "text",
      "content": "narrative text block built from the image"
    },
    {
      "type": "image",
      "content": "brief description of the image"
    }, ... etc
  ]
}
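
For anyone consuming responses in this format, the following is a minimal Python sketch of a structural check, not a definitive validator. The function name `validate_blocks` and the requirement that `content` be a non-empty string are assumptions made for illustration; the format above only fixes the `blocks`, `type`, and `content` keys and the two block types.

```python
import json

# The two block types named in the format above.
ALLOWED_TYPES = {"text", "image"}


def validate_blocks(raw: str) -> list:
    """Parse a JSON response and check it against the block format.

    Hypothetical helper: raises ValueError on the first structural problem
    and returns the list of blocks otherwise.
    """
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("top-level value must be a JSON object")

    blocks = data.get("blocks")
    if not isinstance(blocks, list):
        raise ValueError("'blocks' must be a list")

    for i, block in enumerate(blocks):
        if not isinstance(block, dict):
            raise ValueError(f"block {i} must be an object")
        if block.get("type") not in ALLOWED_TYPES:
            raise ValueError(f"block {i} has an unknown type: {block.get('type')!r}")
        content = block.get("content")
        # Assumption: empty or whitespace-only content is treated as invalid.
        if not isinstance(content, str) or not content.strip():
            raise ValueError(f"block {i} must have non-empty string content")

    return blocks
```

A caller would pass the raw response string to `validate_blocks` and handle `ValueError` (or `json.JSONDecodeError`, which is a subclass of it) to reject or retry malformed output.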