Aquila-VL-2B

Description
The Aquila-VL-2B model is a vision-language model (VLM) trained with the LLaVA-OneVision framework. Qwen2.5-1.5B-instruct serves as the LLM, while siglip-so400m-patch14-384 is used as the vision tower.
Version/Parameters
2.18B
Organization
Beijing Academy of Artificial Intelligence (BAAI)
Type
Multimodal model
Status
Approved
Architecture
Transformer (Decoder-only)
Treatment
Instruct fine-tuned
Base model
Qwen2.5-1.5B-instruct
Last updated