Qwen3.5 4B Instruct
Qwen / 4B / Q4_K_M / ~3.5 GB
Best for: Coding, Agents, Multimodal · Pop: 88/100
Perf: ~93.7 tok/s · first token ~0.4s
Fits in 12 GB of VRAM with room to spare. Best for coding, agents, and multimodal tasks on an RTX 4070.
ollama run qwen3.5:4b-instruct-q4_K_M
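
As a rough illustration of what the figures above imply, the sketch below estimates response time and leftover VRAM. It assumes total time is the first-token latency plus output tokens divided by the quoted throughput, and that headroom is simply VRAM minus model size; KV cache, context length, and runtime overhead are ignored, so real numbers will differ. The function names are hypothetical and only for illustration.

```python
# Figures copied from the card above; all are approximate ("~").
TOKENS_PER_SECOND = 93.7     # quoted throughput
FIRST_TOKEN_SECONDS = 0.4    # quoted time to first token
MODEL_SIZE_GB = 3.5          # quoted size of the Q4_K_M build
VRAM_GB = 12.0               # quoted VRAM budget


def estimated_seconds(output_tokens: int) -> float:
    """Rough wall-clock estimate: first-token latency plus steady decoding.

    Assumption: throughput stays constant for the whole response.
    """
    return FIRST_TOKEN_SECONDS + output_tokens / TOKENS_PER_SECOND


def vram_headroom_gb() -> float:
    """VRAM left after loading the weights.

    Assumption: ignores KV cache and runtime overhead, which also use VRAM.
    """
    return VRAM_GB - MODEL_SIZE_GB


if __name__ == "__main__":
    for n in (100, 500, 2000):
        print(f"{n} tokens: ~{estimated_seconds(n):.1f} s")
    print(f"headroom before KV cache: ~{vram_headroom_gb():.1f} GB")
```

The headroom figure is an upper bound: long contexts grow the KV cache and can use a meaningful share of it.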