Class III - Open Model
Included components
- Model architecture
- Model parameters (Final)
- Model card
- Data card
- Technical report
- Evaluation results
Class II - Open Tooling
Included components
- Model architecture
- Training code
- Inference code
- Evaluation code
- Model parameters (Final)
- Evaluation data
- Model card
- Data card
- Technical report
- Evaluation results
Class I - Open Science
Included components
- Model architecture
- Training code
- Inference code
- Evaluation code
- Model parameters (Final)
- Datasets
- Evaluation data
- Model card
- Data card
- Technical report
- Research paper
- Evaluation results
Missing components
- Data preprocessing code
- Model parameters (Intermediate)
Description
Aquila-VL-2B is a vision-language model (VLM) trained with the LLaVA-OneVision framework. Qwen2.5-1.5B-Instruct serves as the language model, while siglip-so400m-patch14-384 is used as the vision tower.
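As a rough illustration of how such a LLaVA-OneVision-based model is typically used, here is a minimal inference sketch with the Hugging Face `transformers` library. This assumes the model weights are published on the Hub and are compatible with the `LlavaOnevisionForConditionalGeneration` class; the repository id and image path below are illustrative assumptions, not confirmed by this page.

```python
# Hypothetical inference sketch for Aquila-VL-2B.
# Assumptions (not stated on this page): the model is on the Hugging Face Hub
# under the illustrative id below and works with transformers' LLaVA-OneVision
# classes, as LLaVA-OneVision-derived checkpoints commonly do.

def build_conversation(question: str):
    """Build a LLaVA-OneVision-style chat message with one image slot."""
    return [
        {
            "role": "user",
            "content": [
                {"type": "image"},
                {"type": "text", "text": question},
            ],
        }
    ]

def main():
    # Heavy imports and downloads are kept inside main() so the pure helper
    # above can be reused without network access or GPU.
    import torch
    from PIL import Image
    from transformers import AutoProcessor, LlavaOnevisionForConditionalGeneration

    model_id = "BAAI/Aquila-VL-2B-llava-qwen"  # assumed repository id
    processor = AutoProcessor.from_pretrained(model_id)
    model = LlavaOnevisionForConditionalGeneration.from_pretrained(
        model_id, torch_dtype=torch.float16, device_map="auto"
    )

    image = Image.open("example.jpg")  # illustrative local image path
    prompt = processor.apply_chat_template(
        build_conversation("Describe this image."), add_generation_prompt=True
    )
    inputs = processor(images=image, text=prompt, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=64)
    print(processor.decode(out[0], skip_special_tokens=True))

if __name__ == "__main__":
    main()
```

The conversation format (interleaved `image` and `text` content entries) is the part most specific to LLaVA-OneVision-style processors; the rest follows the standard `transformers` generate workflow.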
Version/Parameters
2.18B
Organization
Beijing Academy of Artificial Intelligence (BAAI)
Type
Multimodal model
Status
Approved
Architecture
Transformer (Decoder-only)
Treatment
Instruct fine-tuned
Base model
Qwen2.5-1.5B-instruct
Last updated