Skip to content

Latest commit

 

History

History
13 lines (9 loc) · 683 Bytes

VISION_MODELS.md

File metadata and controls

13 lines (9 loc) · 683 Bytes

Vision model support in mistral.rs

Mistral.rs supports various modalities of models, including vision models. Vision models take images and text as input and have the capability to reason over both.

Please see docs for the following model types:

Note for the Python and HTTP APIs: We follow the OpenAI specification for structuring the image messages and allow both base64 encoded images as well as a URL/path to the image. There are many examples of this, see this Python example.