Find the Best Local AI Model for Your Device
Get personalized Ollama model recommendations for your Apple Silicon Mac, iPhone, or NVIDIA GPU.
Browse by Device
Mac
iPhone
NVIDIA GPU
Local vs Cloud: How Close Are We?
Full benchmark →
4%
Gap vs GPT-4
Qwen3.5-122B reaches 84.8% MMLU vs GPT-4's 88.7%
=
Coding Parity
DeepSeek-R1 matches Claude 3.5 at 92% on HumanEval
July 2026
Projected Parity
Local models expected to match GPT-4 in ~5 months
See the full comparison: MMLU scores, speed benchmarks, and projections →
Latest from the Blog
View all →
Mar 4, 2026
Qwen Team Exodus: 3 Key Leaders Leave Alibaba
Lin Junyang, Yu Bowen, and Binyuan Hui leave Alibaba's Qwen team. What it means for open-source AI...
Mar 4, 2026
Qwen 3.5 Small: 4B Beats 20B Models
Qwen3.5-4B scores 88.8 on MMLU-Redux, beating GPT-class 20B models. Runs in 2-14 GB RAM...
Feb 26, 2026
Ollama 0.17: New Engine for Apple Silicon
New inference engine brings 10-15% performance gains on Apple Silicon with 8-bit KV cache...