Aquila-VL-2B

Disclaimer

This is beta software containing preliminary data, which is incomplete and may be inaccurate. If you experience errors with the tool or discover inaccurate information, please use the “report” feature in the model details; for site issues, please contact us.

Class III - Open Model

Qualified

Included components

  • Model architecture
  • Model parameters (Final)
  • Model card
  • Data card
  • Technical report
  • Evaluation results

Class II - Open Tooling

Qualified

Included components

  • Model architecture
  • Training code
  • Inference code
  • Evaluation code
  • Model parameters (Final)
  • Evaluation data
  • Model card
  • Data card
  • Technical report
  • Evaluation results

Class I - Open Science

Qualified

Included components

  • Model architecture
  • Data preprocessing code
  • Training code
  • Inference code
  • Evaluation code
  • Model parameters (Final)
  • Model parameters (Intermediate)
  • Datasets
  • Evaluation data
  • Model card
  • Data card
  • Technical report
  • Research paper
  • Evaluation results
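
The three component lists above appear to nest: each higher class includes everything in the class below it plus a few more items. As a minimal sketch, assuming the lists are complete as shown on this page, the relationship can be checked with plain Python sets. The names CLASS_III, CLASS_II, CLASS_I, and added_components are illustrative only and are not part of any official tooling.

```python
# Component lists transcribed from the three class sections above.
CLASS_III = {
    "Model architecture", "Model parameters (Final)", "Model card",
    "Data card", "Technical report", "Evaluation results",
}
CLASS_II = CLASS_III | {
    "Training code", "Inference code", "Evaluation code", "Evaluation data",
}
CLASS_I = CLASS_II | {
    "Data preprocessing code", "Model parameters (Intermediate)",
    "Datasets", "Research paper",
}


def added_components(lower: set, higher: set) -> list:
    """Return the components in `higher` that are absent from `lower`, sorted."""
    return sorted(higher - lower)


if __name__ == "__main__":
    # Each class should be a strict superset of the one below it.
    print(CLASS_III < CLASS_II < CLASS_I)
    print(added_components(CLASS_III, CLASS_II))
    print(added_components(CLASS_II, CLASS_I))
```

Read this way, Class II adds code and evaluation data to Class III, and Class I adds data preprocessing code, intermediate parameters, datasets, and a research paper on top of Class II.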
Description
Aquila-VL-2B is a vision-language model (VLM) trained with the LLaVA-OneVision framework. Qwen2.5-1.5B-instruct is used as the LLM, and siglip-so400m-patch14-384 is used as the vision tower.
Version/Parameters
2.18B
Organization
Beijing Academy of Artificial Intelligence (BAAI)
Type
Multimodal model
Status
Approved
Architecture
Transformer (Decoder-only)
Treatment
Instruct fine-tuned
Base model
Qwen2.5-1.5B-instruct
Last updated