Reason2 VLM
Video + image + text understanding with Chain-of-Thought reasoning. Physics, driving, robots — all with thinking tokens.
21 tools · 2 providers · jetson native
Give your agent eyes that understand physics. From VLM reasoning to world-model generation, edge deployment, and evaluation — one toolkit, every stage.
capabilities
World-model video generation — predict future frames from context. Cosmos Policy for robot control actions.
ControlNet video-to-video. Depth, edge, sketch conditioning for sim-to-real domain transfer.
FP8 quantize → ONNX export → TensorRT engine build → serve. One pipeline, Jetson AGX Thor native.
SFT, LoRA, distillation (8B→2B). Fine-tune Reason2 on your domain data with a single tool call.
FID, FVD, CSE, CLIP benchmarks. Automated quality gates for every pipeline stage.
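As a rough illustration of what an automated quality gate could look like, the sketch below compares benchmark scores against thresholds. The `Gate` class, the `failed_gates` helper, the metric keys, and every threshold value are hypothetical and not part of the toolkit. FID and FVD are treated as lower-is-better and CLIP score as higher-is-better; CSE is left out because its direction is not stated here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gate:
    """One pass/fail rule for a single benchmark metric (hypothetical helper)."""
    metric: str
    threshold: float
    higher_is_better: bool

    def passes(self, value: float) -> bool:
        # Lower-is-better metrics must stay at or below the threshold,
        # higher-is-better metrics must reach at least the threshold.
        if self.higher_is_better:
            return value >= self.threshold
        return value <= self.threshold


def failed_gates(scores: dict, gates: list) -> list:
    """Return the names of metrics that are missing or miss their threshold."""
    failed = []
    for gate in gates:
        if gate.metric not in scores or not gate.passes(scores[gate.metric]):
            failed.append(gate.metric)
    return failed


# Illustrative thresholds only; real values depend on the dataset and model.
GATES = [
    Gate("fid", 30.0, higher_is_better=False),
    Gate("fvd", 200.0, higher_is_better=False),
    Gate("clip", 0.28, higher_is_better=True),
]
```

A pipeline stage would be allowed to proceed only when `failed_gates(scores, GATES)` returns an empty list. A missing metric counts as a failure, so a skipped benchmark cannot silently pass.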
pipeline
All orchestrated by a single Strands Agent, or by a single command via the just command runner.
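To make the idea of chained stages concrete, here is a minimal sketch of running steps in order, with each stage's output feeding the next. The `run_pipeline` function and the placeholder stages are assumptions for illustration. They only rename a string and do not call any real quantization, ONNX, or TensorRT API.

```python
from typing import Callable, List, Tuple

# A stage is a name plus a function that maps an input artifact to an output artifact.
Stage = Tuple[str, Callable[[str], str]]


def run_pipeline(artifact: str, stages: List[Stage]) -> List[str]:
    """Run stages in order and return every artifact produced along the way."""
    trail = [artifact]
    for name, step in stages:
        try:
            artifact = step(artifact)
        except Exception as exc:
            # Stop at the first failing stage and report which one it was.
            raise RuntimeError(f"stage {name!r} failed") from exc
        trail.append(artifact)
    return trail


# Placeholder stages (hypothetical): each one just appends a suffix.
STAGES: List[Stage] = [
    ("fp8_quantize", lambda path: path + ".fp8"),
    ("onnx_export", lambda path: path + ".onnx"),
    ("tensorrt_build", lambda path: path + ".engine"),
]
```

Calling `run_pipeline("reason2", STAGES)` walks the three placeholder steps and returns the intermediate artifact names. In a real setup each lambda would be replaced by the corresponding tool call.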
get started
Then give your agent Cosmos superpowers:
from strands_cosmos import cosmos_reason, cosmos_predict, cosmos_transfer
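A minimal sketch of how these imports might be wired into an agent, assuming they are Strands tools that can be passed through the `tools=` argument of `strands.Agent`. The `build_prompt` helper, the video path, and the question are hypothetical. Running `main()` requires both packages to be installed, plus whatever model access they need.

```python
def build_prompt(video_path: str, question: str) -> str:
    """Compose a plain-language request naming the video and the question."""
    return f"Use cosmos_reason on the video at {video_path} and answer: {question}"


def main() -> None:
    # Imported lazily so the helper above works without the packages installed.
    from strands import Agent
    from strands_cosmos import cosmos_reason, cosmos_predict, cosmos_transfer

    # Assumption: the three imports are Strands tools accepted by `tools=`.
    agent = Agent(tools=[cosmos_reason, cosmos_predict, cosmos_transfer])

    # Hypothetical input; replace with a real clip and question.
    result = agent(build_prompt("clips/warehouse.mp4", "Will the stacked boxes fall?"))
    print(result)
```

Call `main()` once the packages are installed. The agent decides which of the registered tools to invoke based on the prompt, so the prompt names the tool only as a hint.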