Overview

Alibaba released Qwen 3.5, its new multimodal AI series featuring both open-source and proprietary models that can process text and vision inputs. The key innovation is a hybrid architecture that combines linear attention with a sparse mixture-of-experts, aiming for efficient inference without sacrificing capability.

The Breakdown

  • The open-source Qwen3.5-397B-A17B uses a mixture-of-experts architecture that activates only 17B of its 397B parameters per token, dramatically reducing inference cost while retaining access to the capacity of the full model
  • Both models feature native multimodal vision capabilities, probed through informal tests such as asking the models to draw a pelican riding a bicycle
  • The architecture combines linear attention, via Gated DeltaNet layers, with a sparse mixture-of-experts, a novel approach to balancing model size against inference efficiency
  • The proprietary Qwen3.5 Plus extends the context window to 1M tokens and ships with integrated search and code-interpreter tools, making it suited to complex multi-step reasoning tasks
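The "17B of 397B parameters per token" figure comes from top-k expert routing: a small router picks a handful of expert feed-forward networks per token, and only those experts run. Here is a minimal single-token sketch in numpy; the sizes, the plain-softmax gating, and the dense expert matrices are toy assumptions for illustration, not Qwen's actual configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

D, N_EXPERTS, TOP_K = 8, 16, 2  # toy sizes; the real model is vastly larger

# Each "expert" is a small feed-forward weight matrix; the router scores them.
experts = [rng.standard_normal((D, D)) for _ in range(N_EXPERTS)]
router_w = rng.standard_normal((D, N_EXPERTS))

def moe_forward(x):
    """Route one token vector to its top-k experts and mix their outputs."""
    logits = x @ router_w
    top = np.argsort(logits)[-TOP_K:]            # indices of the k best experts
    gates = np.exp(logits[top] - logits[top].max())
    gates /= gates.sum()                         # softmax over the chosen experts only
    # Only TOP_K of the N_EXPERTS weight matrices are touched for this token.
    return sum(g * (x @ experts[i]) for g, i in zip(gates, top))

y = moe_forward(rng.standard_normal(D))
active_fraction = TOP_K / N_EXPERTS              # fraction of expert weights used
```

With these toy numbers only 2 of 16 experts run per token; the same mechanism is what lets the full model keep 397B parameters of capacity while spending roughly 17B parameters' worth of compute per token.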
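The linear-attention side works differently from standard attention: instead of a cache that grows with sequence length, Gated DeltaNet maintains a fixed-size state matrix that is decayed, selectively erased, and rewritten at each token. The following is a heavily simplified sketch of a gated delta-rule update with scalar gates; the dimensions and gate values are illustrative assumptions, not the published parameterisation.

```python
import numpy as np

def gated_delta_step(S, q, k, v, alpha, beta):
    """One recurrent step of a simplified gated delta rule.

    S     : (d_v, d_k) fixed-size state matrix acting as a fast-weight memory
    alpha : scalar decay gate in (0, 1), forgets old content
    beta  : scalar write strength in (0, 1)
    """
    k = k / np.linalg.norm(k)  # normalise the key direction
    # Decay the old memory, erase what it stored along k, write the new (k, v) pair.
    S = alpha * (S - beta * np.outer(S @ k, k)) + beta * np.outer(v, k)
    o = S @ q                  # read-out for the current token
    return S, o

# Toy run: the state stays (d_v, d_k) no matter how many tokens we process.
d_k = d_v = 4
rng = np.random.default_rng(1)
S = np.zeros((d_v, d_k))
for _ in range(5):
    q, k, v = rng.standard_normal((3, d_k))
    S, o = gated_delta_step(S, q, k, v, alpha=0.95, beta=0.5)
```

Because the state never grows, per-token cost is constant in sequence length, which is the "linear attention" property that makes very long contexts cheap; the sparse MoE layers then keep the per-token parameter cost low as well.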