[How to: pass multimodal data directly to models](https://python.langchain.com/docs/how_to/multimodal_inputs/): LLM should read this page when it needs to pass multimodal data (images, videos, etc.) to models, when working with models that support both multimodal input and tool calling, or when it needs to understand how to encode and pass different types of multimodal data. This page demonstrates how to pass multimodal input such as images directly to LLMs and chat models, covering encoding techniques, passing single or multiple images, and invoking models with image or other multimodal content. It also shows how to use multimodal models for tool calling.

