Overview
Alibaba has released Qwen 3.5, a new multimodal AI series spanning both open-source and proprietary models that accept text and vision inputs. The key innovation is a hybrid architecture that combines linear attention with a sparse mixture-of-experts design, targeting efficient inference without sacrificing capability.
The Breakdown
- The open-source Qwen3.5-397B-A17B uses a mixture-of-experts architecture that activates only 17B of its 397B parameters on each forward pass, dramatically reducing compute per token while retaining the capacity of the full model
- Both models feature native multimodal support for vision tasks, demonstrated through informal drawing tests such as rendering a pelican riding a bicycle
- The architecture pairs linear attention, implemented via Gated Delta Networks, with sparse mixture-of-experts layers, a novel combination for balancing model size against inference cost
- The proprietary Qwen3.5 Plus extends the context window to 1M tokens and ships with integrated search and code-interpreter tools, making it suited to complex multi-step reasoning tasks
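The announcement does not publish Qwen's routing code, but the "17B of 397B parameters" figure follows the standard top-k expert-routing pattern, sketched minimally below. All names, shapes, and the choice of `top_k=2` are illustrative assumptions, not details from the release:

```python
import numpy as np

def moe_forward(x, gate_w, experts, top_k=2):
    """Sparse mixture-of-experts layer (sketch).

    x: (d,) token vector; gate_w: (n_experts, d) router weights;
    experts: list of callables mapping (d,) -> (d,).
    Only the top_k highest-scoring experts run per token, which is
    how a model can hold far more parameters than it activates on
    any single forward pass.
    """
    logits = gate_w @ x
    top = np.argsort(logits)[-top_k:]           # indices of the chosen experts
    weights = np.exp(logits[top] - logits[top].max())
    weights /= weights.sum()                     # softmax over selected experts only
    # Weighted sum of the selected experts' outputs; the rest stay idle.
    return sum(w * experts[i](x) for w, i in zip(weights, top))
```

The per-token cost scales with `top_k`, not with the total expert count, which is the source of the "397B parameters, 17B active" arithmetic.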
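The release names Gated Delta Networks as the linear-attention component but gives no implementation details. The sketch below shows the general gated delta-rule recurrence that family of models is built on: a fixed-size state matrix that is decayed by a gate α, has the old value at key k erased, and has the new value v written in with strength β. Variable names and shapes are assumptions for illustration:

```python
import numpy as np

def gated_delta_step(S, q, k, v, alpha, beta):
    """One recurrent step of a gated delta rule (sketch).

    S: (d_v, d_k) state matrix carried across the sequence.
    alpha: scalar decay gate in (0, 1]; beta: write strength in (0, 1].
    """
    k = k / np.linalg.norm(k)  # unit-norm key
    # Decay the state, erase the old component along k, write v at k:
    # S <- alpha * (S - beta * (S k) k^T) + beta * v k^T
    S = alpha * (S - beta * np.outer(S @ k, k)) + beta * np.outer(v, k)
    o = S @ q                  # linear-attention-style read-out for query q
    return S, o
```

Because the state is a fixed-size matrix rather than a growing key-value cache, per-token compute and memory stay constant with sequence length, which is the efficiency argument for mixing layers like this into a large model.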