Qwen3.5 4B Instruct
Qwen / 4B / Q4_K_M / ~3.5 GB
Best for: Coding, Agents, Multimodal · Pop: 88/100
Perf: ~121.8 tok/s · first token ~0.5s
Best for coding, agents, multimodal. Strong fit for 16 GB RAM with balanced speed and quality.
A 16GB MacBook Air is a fine drafting partner: 9B-class models brainstorm, outline, and rough out scenes anywhere you can open the lid. Prose at this size is serviceable for drafts; the polish pass is yours.
Qwen / 9B / Q4_K_M / ~7 GB
Best for: Quality, Coding, Reasoning · Pop: 86/100
Perf: ~58.7 tok/s · first token ~0.6s
Best for quality, coding, reasoning. Strong fit for 16 GB RAM with balanced speed and quality.
Qwen / 8B / Q4_K_M / ~6.5 GB
Best for: Chat, Coding · Pop: 88/100
Perf: ~65.3 tok/s · first token ~0.6s
Best for chat, coding. Strong fit for 16 GB RAM with balanced speed and quality.
LFM2 / 8.3B / Q4_K_M / ~5.5 GB
Best for: On-device agents, tool calling, multilingual chat · Pop: 72/100
Perf: ~63.1 tok/s · first token ~0.6s
Best for on-device agents, tool calling, multilingual chat. Strong fit for 16 GB RAM with balanced speed and quality.
Gemma / 4.5B / Q4_K_M / ~4 GB
Best for: On-device, Mobile, Chat · Pop: 82/100
Perf: ~109.5 tok/s · first token ~0.5s
Best for on-device, mobile, chat. Strong fit for 16 GB RAM with balanced speed and quality.
Llama / 8B / Q4_K_M / ~6.5 GB
Best for: Chat, Coding · Pop: 78/100
Perf: ~65.3 tok/s · first token ~0.6s
Best for chat, coding. Strong fit for 16 GB RAM with balanced speed and quality.
Gemma / 4B / Q4_K_M / ~3.5 GB
Best for: Chat, Coding · Pop: 81/100
Perf: ~121.8 tok/s · first token ~0.5s
Best for chat, coding. Strong fit for 16 GB RAM with balanced speed and quality.
Qwen / 7B / Q4_K_M / ~5.5 GB
Best for: Coding · Pop: 72/100
Perf: ~73.6 tok/s · first token ~0.6s
Best for coding. Strong fit for 16 GB RAM with balanced speed and quality.
Generation in writer-sized pieces (a scene, a stanza, three takes on an opening paragraph) is burst work the fanless Air handles without strain. The 9B class is genuinely useful for unblocking: alternatives, continuations, tone experiments. Expect functional prose with occasional repetition, not finished style.
Keep sessions scene-scoped: with a budget of about 11 GB, the practical context covers a chapter, not a manuscript. Summarize earlier chapters into a story bible that you paste in at the start of each session, and the small window stops mattering.
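The story-bible workflow can be sketched in a few lines of Python. This is a minimal illustration, not a feature of any particular app: the function names, the 8192-token context, the 1024-token reply reserve, and the four-characters-per-token estimate are all assumptions to adjust for your own model and runtime.

```python
# Assumption: roughly 4 characters per token for English prose.
# This is a heuristic, not a real tokenizer.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Approximate token count from character length, rounded up."""
    return (len(text) + CHARS_PER_TOKEN - 1) // CHARS_PER_TOKEN


def build_prompt(story_bible: str, scene: str, instruction: str,
                 context_tokens: int = 8192, reply_tokens: int = 1024) -> str:
    """Combine the story bible, current scene, and task within a token budget.

    context_tokens and reply_tokens are hypothetical defaults. If the pieces
    do not fit, the oldest part of the scene is dropped first, because the
    story bible already summarizes earlier material.
    """
    budget = context_tokens - reply_tokens
    fixed = estimate_tokens(story_bible) + estimate_tokens(instruction)
    scene_budget = budget - fixed
    if scene_budget <= 0:
        raise ValueError("Story bible is too long for this context; shorten it.")

    max_scene_chars = scene_budget * CHARS_PER_TOKEN
    if len(scene) > max_scene_chars:
        scene = scene[-max_scene_chars:]  # keep the most recent text

    # The short section labels are not counted in the budget; they add
    # only a handful of tokens.
    return (
        "STORY BIBLE\n" + story_bible.strip() + "\n\n"
        "CURRENT SCENE\n" + scene.strip() + "\n\n"
        "TASK\n" + instruction.strip() + "\n"
    )
```

Paste the returned string into whichever local chat app you use. Trimming the scene from the front rather than the back keeps the text nearest the cursor, which matters most for a continuation.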
Use the ModelFit wizard to test different RAM and chip configurations for your exact MacBook Air setup.