Alibaba (09988) open-sources new Qwen3 models that top the text embedding leaderboard

Zhitongcaijing·06/05/2025 23:17:06

The Zhitong Finance App learned that early this morning Alibaba (09988) open-sourced two new models in the Qwen3 series, Qwen3-Embedding and Qwen3-Reranker. Both are designed specifically for text embedding, retrieval, and ranking tasks. Trained on top of the Qwen3 base model, they inherit Qwen3's strengths in multilingual text understanding and support 119 languages. According to test data, Qwen3-Embedding performed strongly on multilingual text embedding benchmarks: the 8B-parameter version ranked first with a score of 70.58, surpassing many commercial API services, including Google's Gemini-Embedding.

Excellent generality: The Qwen3-Embedding series reaches industry-leading levels across multiple downstream task evaluations. The 8B-parameter embedding model ranked first on the MTEB multilingual leaderboard (score of 70.58 as of June 6, 2025), outperforming many commercial API services. In addition, the series' reranking models excel in a range of text retrieval scenarios, significantly improving the relevance of search results.

Flexible model architecture: The Qwen3-Embedding series offers three model sizes, from 0.6B to 8B parameters, to meet performance and efficiency requirements in different scenarios. Developers can flexibly combine the embedding and reranking modules to extend functionality.
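
One common way to combine an embedding module with a reranking module is a two-stage pipeline: the embedding model narrows a corpus to a short candidate list, and the reranker re-scores only those candidates. The Python sketch below illustrates that pattern under stated assumptions. The functions `toy_embed` and `toy_rerank_score` are hypothetical stand-ins written for this example, not the Qwen3 models or their APIs, and `retrieve_then_rerank` is an illustrative name.

```python
import math
from typing import Callable, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve_then_rerank(
    query: str,
    docs: List[str],
    embed: Callable[[str], Sequence[float]],
    rerank_score: Callable[[str, str], float],
    top_k: int = 3,
) -> List[str]:
    """Stage 1: keep the top_k docs by embedding similarity.
    Stage 2: reorder those candidates by the reranker's score."""
    q_vec = embed(query)
    candidates = sorted(
        docs, key=lambda d: cosine(q_vec, embed(d)), reverse=True
    )[:top_k]
    return sorted(candidates, key=lambda d: rerank_score(query, d), reverse=True)


# Hypothetical stand-ins so the sketch runs without any model download.
def toy_embed(text: str) -> List[float]:
    """Count occurrences of a tiny fixed vocabulary (assumption for the demo)."""
    vocab = ["qwen", "embedding", "stock", "price"]
    words = text.lower().split()
    return [float(words.count(w)) for w in vocab]


def toy_rerank_score(query: str, doc: str) -> float:
    """Fraction of query words that also appear in the document."""
    q_words = set(query.lower().split())
    d_words = set(doc.lower().split())
    return len(q_words & d_words) / len(q_words) if q_words else 0.0
```

In a real deployment the two callables would wrap an embedding model and a reranking model respectively. Keeping them as plain function arguments makes it easy to swap model sizes without touching the pipeline logic.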

In addition, the models support the following customization features:

1) Customizable embedding dimensions: users can adjust the embedding dimension to suit their needs, effectively reducing application costs;

2) Instruction adaptation: users can customize instruction templates to improve performance on specific tasks, languages, or scenarios.
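
The two features above can be illustrated with a short Python sketch. It rests on assumptions the article does not confirm: that a smaller embedding is obtained by keeping the leading components of a vector and re-normalizing it, and that an instruction is applied by prefixing it to the query text. The function names and the template string are hypothetical, chosen only for this example.

```python
import math
from typing import List, Sequence


def truncate_embedding(vec: Sequence[float], dim: int) -> List[float]:
    """Keep the first `dim` components and L2-normalize the result.

    Assumption: the leading components carry the most useful information,
    so a shorter vector trades some accuracy for lower storage cost.
    """
    if not 0 < dim <= len(vec):
        raise ValueError("dim must be between 1 and len(vec)")
    head = list(vec[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head  # nothing to normalize
    return [x / norm for x in head]


def build_instructed_query(task: str, query: str) -> str:
    """Prefix a query with a task instruction (hypothetical template)."""
    return f"Instruct: {task}\nQuery: {query}"
```

For example, `truncate_embedding(vec, 256)` would shrink a longer vector to 256 dimensions before it is stored in a vector index, which reduces memory and similarity-search cost roughly in proportion to the dimension.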

Comprehensive multilingual support: The Qwen3-Embedding series supports more than 100 languages, covering mainstream natural languages and multiple programming languages. The models have strong multilingual, cross-lingual, and code retrieval capabilities and can handle data processing requirements in multilingual scenarios.

In several benchmark tests, the Qwen3-Embedding series showed excellent performance on text embedding and ranking tasks.

The models are now open-sourced on Hugging Face, ModelScope, and GitHub, and users can also directly use the latest text embedding model service on Alibaba Cloud's Bailian platform.