review33 - Digital Zone: Qwen 2.5 Max

#1 [iku000], 25-01-30 11:00
The burst of DeepSeek V3 has attracted attention from the whole AI community to large-scale MoE models. Concurrently, we have been building Qwen2.5-Max, a large MoE LLM pretrained on massive data and post-trained with curated SFT and RLHF recipes. It achieves competitive… pic.twitter.com/oHVl16vfje
— Qwen (@Alibaba_Qwen) January 28, 2025
https://x.com/alibaba_qwen/status/1884263157574820053

https://www.youtube.com/watch?v=SgcvHxKayLQ