Qwen3.5 35B-A3B Instruct
Qwen / 35B / Q4_K_M / ~20 GB
Best for: Reasoning, Coding, Agent scenarios · Pop: 90/100
Perf: ~41.4 tok/s · first token ~1.0s
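
As a rough sketch of what these figures mean in practice, the snippet below estimates wall-clock time for a reply of a given length. It assumes decode speed stays constant at the listed ~41.4 tok/s and that the ~1.0 s first-token latency is paid once per reply; real throughput varies with prompt length and context size. The name `estimate_seconds` is illustrative, not part of any tool.

```python
# Figures copied from the listing above; treat both as approximate.
TOKENS_PER_SECOND = 41.4    # listed decode throughput
FIRST_TOKEN_SECONDS = 1.0   # listed time to first token


def estimate_seconds(output_tokens: int) -> float:
    """Estimate total reply time: first-token latency plus steady decoding.

    Assumption: throughput is constant for the whole reply.
    """
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    return FIRST_TOKEN_SECONDS + output_tokens / TOKENS_PER_SECOND


if __name__ == "__main__":
    for n in (100, 500, 2000):
        print(f"{n} tokens: ~{estimate_seconds(n):.1f} s")
```

Under these assumptions a 500-token answer takes on the order of 13 seconds, which is a useful sanity check when sizing agent loops that make many calls.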
At ~20 GB, fits in the RTX 5090's 32 GB of VRAM with roughly 12 GB to spare for context and runtime overhead. Best for reasoning, coding, and agent scenarios.
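
The headroom arithmetic can be sketched as follows. The sizes are the approximate figures from this listing (~20 GB of weights, 32 GB of VRAM); the helper names are hypothetical, and actual usage also depends on context length and runtime overhead, which this sketch does not model.

```python
def vram_headroom_gb(vram_gb: float, model_gb: float) -> float:
    """Return VRAM left over after loading the model weights (may be negative)."""
    return vram_gb - model_gb


def fits(vram_gb: float, model_gb: float, reserve_gb: float = 0.0) -> bool:
    """True if the weights fit while leaving at least reserve_gb free.

    reserve_gb is an assumed budget for KV cache and runtime overhead.
    """
    return vram_headroom_gb(vram_gb, model_gb) >= reserve_gb


if __name__ == "__main__":
    vram, model = 32.0, 20.0  # approximate figures from the listing
    print(f"Headroom: ~{vram_headroom_gb(vram, model):.0f} GB")
    print(f"Fits with a 4 GB reserve: {fits(vram, model, reserve_gb=4.0)}")
```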
ollama run qwen3.5:35b-a3b-instruct-q4_K_M
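
For scripted use instead of the interactive CLI, a minimal sketch against Ollama's local REST API might look like the following. It assumes an Ollama server is running on the default address `http://localhost:11434` and that the model tag from the command above has already been pulled; `build_request` and `generate` are illustrative helper names.

```python
import json
import urllib.request

# Assumptions: default local Ollama address, and the tag from this listing.
OLLAMA_URL = "http://localhost:11434/api/generate"
MODEL_TAG = "qwen3.5:35b-a3b-instruct-q4_K_M"


def build_request(prompt: str, model: str = MODEL_TAG) -> urllib.request.Request:
    """Build a non-streaming POST request for Ollama's /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, timeout: float = 120.0) -> str:
    """Send the prompt and return the generated text (requires a running server)."""
    with urllib.request.urlopen(build_request(prompt), timeout=timeout) as resp:
        body = json.loads(resp.read().decode("utf-8"))
    return body["response"]


if __name__ == "__main__":
    print(generate("Write a one-line Python function that reverses a string."))
```

Setting `"stream": False` returns the whole reply as a single JSON object, which keeps the sketch short; streaming would be the better choice for interactive front ends.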