Alibaba Cloud announced on Friday that it is launching two open-source artificial intelligence models that can "understand images and text" in English and Chinese.
The Chinese tech giant said that its pre-trained large vision language model, Qwen-VL, and its "conversationally fine-tuned version," Qwen-VL-Chat, are available for download on its AI model community ModelScope and on the collaborative AI platform Hugging Face.
Alibaba Cloud explained that both models are built on Qwen-7B, the 7-billion-parameter version of its large language model. Both can answer open-ended questions about multiple images and generate image captions, while Qwen-VL-Chat can also perform mathematical calculations and write stories based on images.