Qwen3.5 4B Instruct
Qwen / 4B / Q4_K_M / ~3.5 GB
Best for: Coding, Agents, Multimodal · Pop: 88/100
Perf: ~100.9 tok/s · first token ~0.3s
Fits in 12 GB of VRAM with room to spare. Best for coding, agents, and multimodal tasks on an RTX 4070 SUPER.
ollama run qwen3.5:4b-instruct-q4_K_M
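Once the model has been pulled with the command above, it can also be called from a script. The following is a minimal sketch, assuming an Ollama server is listening on its default local address (http://localhost:11434) and that the model tag is exactly the one shown above. The helper names (build_payload, tokens_per_second, generate) are illustrative and not part of any library. The throughput helper relies on the eval_count and eval_duration (nanoseconds) fields that Ollama returns for non-streaming requests, so you can compare your own tok/s against the figure listed on this card.

```python
import json
import urllib.request

# Assumption: Ollama is serving on its default local address.
OLLAMA_URL = "http://localhost:11434/api/generate"
# Model tag copied from the run command on this card.
MODEL = "qwen3.5:4b-instruct-q4_K_M"


def build_payload(prompt: str, model: str = MODEL) -> dict:
    """Build a non-streaming request body for /api/generate."""
    return {"model": model, "prompt": prompt, "stream": False}


def tokens_per_second(eval_count: int, eval_duration_ns: int) -> float:
    """Convert generated-token count and duration (nanoseconds) to tok/s."""
    return eval_count / (eval_duration_ns / 1e9)


def generate(prompt: str) -> dict:
    """Send one prompt to the local server and return the parsed JSON reply."""
    data = json.dumps(build_payload(prompt)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())


if __name__ == "__main__":
    out = generate("Write a one-line Python hello world.")
    print(out["response"])
    rate = tokens_per_second(out["eval_count"], out["eval_duration"])
    print(f"{rate:.1f} tok/s")
```

Measured throughput will vary with prompt length, context size, and what else is using the GPU, so treat the card's number as a rough reference point.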