This template uses GROQ LLAVA V1.5 7B API that offers fast inference for multimodal models with vision capabilities for understanding and interpreting visual data from images.
The users send an image and get a description of the image from the model.
Once you set your bot, you can send the image and get the descriptions.