Best Local LLMs for 64GB RAM

Ranked open-weight models that run well on a 64GB machine — with estimated speed and the exact ollama command for each.

Best local models for ~64GB

12 picks

Estimates assume a representative Apple-Silicon machine with 64GB unified memory. Tok/s are ModelFit estimates, not measured benchmarks. Run the wizard for figures tuned to your exact chip.

01

Qwen3.6 35B-A3B

Qwen / 35B / Q4_K_M / ~22 GB

Best for: Reasoning, Coding, Agents·Perf: ~38.6 tok/s (est.) · first token ~1.5s

Runs well

Best for reasoning, coding, agents. Strong fit for 64 GB RAM with balanced speed and quality.

ollama
$ollama run qwen3.6:35b-a3b
02

Qwen3.5 35B-A3B Instruct

Qwen / 35B / Q4_K_M / ~20 GB

Best for: Reasoning, Coding, Agent scenarios·Perf: ~38.6 tok/s (est.) · first token ~1.5s

Runs well

Best for reasoning, coding, agent scenarios. Strong fit for 64 GB RAM with balanced speed and quality.

ollama
$ollama run qwen3.5:35b-a3b
03

Qwen3.5 27B Instruct

Qwen / 27B / Q4_K_M / ~16 GB

Best for: Chat, Coding, Complex reasoning·Perf: ~48.8 tok/s (est.) · first token ~0.7s

Perfect fit

Best for chat, coding, complex reasoning. Strong fit for 64 GB RAM with balanced speed and quality.

ollama
$ollama run qwen3.5:27b
04

Qwen3.6 27B

Qwen / 27B / Q4_K_M / ~18 GB

Best for: Coding, Quality, Long context·Perf: ~48.8 tok/s (est.) · first token ~0.7s

Perfect fit

Best for coding, quality, long context. Strong fit for 64 GB RAM with balanced speed and quality.

ollama
$ollama run qwen3.6:27b
05

Gemma 4 26B-A4B

Gemma / 26B / Q4_K_M / ~16 GB

Best for: Chat, Coding, Multimodal·Perf: ~50.5 tok/s (est.) · first token ~0.6s

Perfect fit

Best for chat, coding, multimodal. Strong fit for 64 GB RAM with balanced speed and quality.

ollama
$ollama run gemma4:26b
06

LFM2 24B-A2B Instruct

LFM2 / 24B / Q4_K_M / ~14 GB

Best for: Local AI agents, privacy-first tool calling, MCP workflows·Perf: ~54.3 tok/s (est.) · first token ~0.6s

Perfect fit

Best for local ai agents, privacy-first tool calling, mcp workflows. Strong fit for 64 GB RAM with balanced speed and quality.

ollama
$ollama run lfm2:24b-a2b
07

Gemma 4 31B

Gemma / 31B / Q4_K_M / ~20 GB

Best for: Quality, Coding, Multimodal·Perf: ~43.1 tok/s (est.) · first token ~1.5s

Runs well

Best for quality, coding, multimodal. Strong fit for 64 GB RAM with balanced speed and quality.

ollama
$ollama run gemma4:31b
08

Qwen3 30B

Qwen / 30B / Q4_K_M / ~22 GB

Best for: Quality, Coding·Perf: ~44.4 tok/s (est.) · first token ~1.5s

Runs well

Best for quality, coding. Strong fit for 64 GB RAM with balanced speed and quality.

ollama
$ollama run qwen3:30b
09

Gemma 3 27B Instruct

Gemma / 27B / Q4_K_M / ~21 GB

Best for: Quality, Coding·Perf: ~48.8 tok/s (est.) · first token ~0.7s

Runs well

Best for quality, coding. Strong fit for 64 GB RAM with balanced speed and quality.

ollama
$ollama run gemma3:27b
10

Mixtral 8x7B Instruct

Mistral / 46.7B / Q4_K_M / ~30 GB

Best for: Coding, Quality·Perf: ~29.8 tok/s (est.) · first token ~1.6s

Runs well

Best for coding, quality. Strong fit for 64 GB RAM with balanced speed and quality.

ollama
$ollama run mixtral:8x7b
11

Mistral Small 3.1

Mistral / 24B / Q4_K_M / ~15 GB

Best for: Chat, Coding·Perf: ~54.3 tok/s (est.) · first token ~0.6s

Perfect fit

Best for chat, coding. Strong fit for 64 GB RAM with balanced speed and quality.

ollama
$ollama run mistral-small3.1:24b
12

Mistral Small 22B

Mistral / 22B / Q4_K_M / ~17 GB

Best for: Coding, Quality·Perf: ~58.7 tok/s (est.) · first token ~0.6s

Perfect fit

Best for coding, quality. Strong fit for 64 GB RAM with balanced speed and quality.

ollama
$ollama run mistral-small:22b

Want an exact recommendation?

The wizard tunes picks and speed estimates to your exact device, chip, and RAM.