Qwen3.5 4B Instruct
Qwen / 4B / Q4_K_M / ~3.5 GB
Best for: Coding, Agents, Multimodal·Pop: 88/100
Perf: ~121.8 tok/s · first token ~0.5s
Best for coding, agents, multimodal. Strong fit for 16 GB RAM with balanced speed and quality.
A MacBook Air with a local model is a fully self-contained AI: open the lid, work with confidential material, and verify with airplane mode that nothing can leave. For client work under NDA, 16GB covers the 4B-9B class.
Qwen / 4B / Q4_K_M / ~3.5 GB
Best for: Coding, Agents, Multimodal·Pop: 88/100
Perf: ~121.8 tok/s · first token ~0.5s
Best for coding, agents, multimodal. Strong fit for 16 GB RAM with balanced speed and quality.
Qwen / 9B / Q4_K_M / ~7 GB
Best for: Quality, Coding, Reasoning·Pop: 86/100
Perf: ~58.7 tok/s · first token ~0.6s
Best for quality, coding, reasoning. Strong fit for 16 GB RAM with balanced speed and quality.
Qwen / 8B / Q4_K_M / ~6.5 GB
Best for: Chat, Coding·Pop: 88/100
Perf: ~65.3 tok/s · first token ~0.6s
Best for chat, coding. Strong fit for 16 GB RAM with balanced speed and quality.
LFM2 / 8.3B / Q4_K_M / ~5.5 GB
Best for: On-device agents, tool calling, multilingual chat·Pop: 72/100
Perf: ~63.1 tok/s · first token ~0.6s
Best for on-device agents, tool calling, multilingual chat. Strong fit for 16 GB RAM with balanced speed and quality.
Gemma / 4.5B / Q4_K_M / ~4 GB
Best for: On-device, Mobile, Chat·Pop: 82/100
Perf: ~109.5 tok/s · first token ~0.5s
Best for on-device, mobile, chat. Strong fit for 16 GB RAM with balanced speed and quality.
Llama / 8B / Q4_K_M / ~6.5 GB
Best for: Chat, Coding·Pop: 78/100
Perf: ~65.3 tok/s · first token ~0.6s
Best for chat, coding. Strong fit for 16 GB RAM with balanced speed and quality.
Gemma / 4B / Q4_K_M / ~3.5 GB
Best for: Chat, Coding·Pop: 81/100
Perf: ~121.8 tok/s · first token ~0.5s
Best for chat, coding. Strong fit for 16 GB RAM with balanced speed and quality.
Qwen / 7B / Q4_K_M / ~5.5 GB
Best for: Coding·Pop: 72/100
Perf: ~73.6 tok/s · first token ~0.6s
Best for coding. Strong fit for 16 GB RAM with balanced speed and quality.
Turn the network off. A local Ollama model behaves identically with wifi disabled. That simple test is your proof, repeatable any time, no trust required. Contracts, medical letters, financial statements: summarize and analyze them in a coffee shop without the contents existing anywhere but your SSD.
For consultants, the Air is the portable privacy story: client data processed on-site never touches your cloud accounts. The 9B class handles document Q&A and summarization; disk encryption (FileVault) closes the at-rest side of the loop.
Use the ModelFit wizard to test different RAM and chip configurations for your exact MacBook Air setup.
Open ModelFit Wizard