Qwen3.6 35B-A3B
Qwen / 35B / Q4_K_M / ~22 GB
Best for: Reasoning, Coding, Agents·Pop: 88/100
Perf: ~30.3 tok/s · first token ~1.6s
Best for reasoning, coding, agents. Strong fit for 48 GB RAM with balanced speed and quality.
At 32GB, a MacBook Pro crosses the line where AI prose stops needing apologies. The 14B-24B class writes with real sentence variety, holds character voice across a chapter, and takes style direction seriously.
Qwen / 35B / Q4_K_M / ~22 GB
Best for: Reasoning, Coding, Agents·Pop: 88/100
Perf: ~30.3 tok/s · first token ~1.6s
Best for reasoning, coding, agents. Strong fit for 48 GB RAM with balanced speed and quality.
Qwen / 35B / Q4_K_M / ~20 GB
Best for: Reasoning, Coding, Agent scenarios·Pop: 90/100
Perf: ~30.3 tok/s · first token ~1.6s
Best for reasoning, coding, agent scenarios. Strong fit for 48 GB RAM with balanced speed and quality.
Qwen / 27B / Q4_K_M / ~16 GB
Best for: Chat, Coding, Complex reasoning·Pop: 82/100
Perf: ~38.2 tok/s · first token ~0.7s
Best for chat, coding, complex reasoning. Strong fit for 48 GB RAM with balanced speed and quality.
Qwen / 27B / Q4_K_M / ~18 GB
Best for: Coding, Quality, Long context·Pop: 92/100
Perf: ~38.2 tok/s · first token ~0.7s
Best for coding, quality, long context. Strong fit for 48 GB RAM with balanced speed and quality.
Gemma / 26B / Q4_K_M / ~16 GB
Best for: Chat, Coding, Multimodal·Pop: 86/100
Perf: ~39.5 tok/s · first token ~0.7s
Best for chat, coding, multimodal. Strong fit for 48 GB RAM with balanced speed and quality.
LFM2 / 24B / Q4_K_M / ~14 GB
Best for: Local AI agents, privacy-first tool calling, MCP workflows·Pop: 80/100
Perf: ~42.5 tok/s · first token ~0.7s
Best for local ai agents, privacy-first tool calling, mcp workflows. Strong fit for 48 GB RAM with balanced speed and quality.
Qwen / 14B / Q4_K_M / ~11 GB
Best for: Coding, Quality·Pop: 84/100
Perf: ~69.0 tok/s · first token ~0.6s
Best for coding, quality. Strong fit for 48 GB RAM with balanced speed and quality.
Gemma / 31B / Q4_K_M / ~20 GB
Best for: Quality, Coding, Multimodal·Pop: 84/100
Perf: ~33.8 tok/s · first token ~1.5s
Best for quality, coding, multimodal. Strong fit for 48 GB RAM with balanced speed and quality.
Vocabulary and consistency. The jump from 9B to 14B is the most noticeable quality step in local writing models: fresher phrasing, fewer tics, dialogue that stays in character. The ~22GB budget also fits a chapter plus your style notes and a character sheet in context simultaneously.
Use system prompts as a style contract (voice, tense, banned cliches) and the 14B class actually honors it. For long projects, pin a story-bible summary at the top of context and regenerate it as the plot moves.
Use the ModelFit wizard to test different RAM and chip configurations for your exact MacBook Pro setup.
Open ModelFit Wizard