Local LLM Hardware Stats

How much memory each model tier needs, and what fits your device. Derived from ModelFit's 59 local models. Updated 2026-06-15.

How much RAM do you need to run a local LLM? At Q4_K_M, an 8GB device runs small 3-4B models comfortably and tops out near ~8.3B parameters, a 16GB device runs models up to ~14B, 32GB reaches ~35B, and 64GB runs up to ~70B. A Q4_K_M model needs roughly 0.6 GB per billion parameters, and ModelFit budgets ~70% of unified memory for the model. ModelFit tracks 59 local models across 17 families.
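The two rules of thumb above can be turned into a quick fit check. This is a minimal sketch, not ModelFit's actual sizing code: the function names are hypothetical, and the only inputs are the 0.6 GB per billion parameters and ~70% budget figures quoted on this page. Real fit also depends on context length and the exact model file, so treat the result as a rough estimate.

```python
GB_PER_BILLION_PARAMS_Q4 = 0.6  # rule of thumb quoted above for Q4_K_M
MODEL_BUDGET_FRACTION = 0.70    # share of memory budgeted for the model


def model_size_gb(params_billion: float) -> float:
    """Approximate memory footprint of a Q4_K_M model, in GB."""
    return params_billion * GB_PER_BILLION_PARAMS_Q4


def model_budget_gb(device_memory_gb: float) -> float:
    """Memory set aside for the model, leaving the rest for OS and context."""
    return device_memory_gb * MODEL_BUDGET_FRACTION


def fits(params_billion: float, device_memory_gb: float) -> bool:
    """True if the estimated model size is within the device's model budget."""
    return model_size_gb(params_billion) <= model_budget_gb(device_memory_gb)


if __name__ == "__main__":
    # (device memory in GB, model size in billions of parameters)
    for memory_gb, params_b in [(8, 7), (16, 14), (16, 70), (64, 70)]:
        verdict = "fits" if fits(params_b, memory_gb) else "does not fit"
        print(
            f"{params_b}B on {memory_gb} GB: "
            f"{model_size_gb(params_b):.1f} GB needed, "
            f"{model_budget_gb(memory_gb):.1f} GB budget, {verdict}"
        )
```

Keeping the two constants at the top makes it easy to try a different quantization or a more conservative budget.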

Local models: 59
Model families: 17
16GB sweet spot: ~14B
64GB+ ceiling: 30-70B

Model size by memory budget

Memory (RAM / VRAM)   Model budget (~70%)   Runs up to       Models that fit   Strong pick
8 GB                  ~5.6 GB               ~8.3B params     23 / 59           DeepSeek-R1 Distill Qwen 7B
16 GB                 ~11.2 GB              ~14B params      36 / 59           Qwen2.5 Coder 14B
24 GB                 ~16.8 GB              ~27B params      41 / 59           Qwen2.5 Coder 14B
32 GB                 ~22.4 GB              ~35B params      49 / 59           Qwen3 30B
48 GB                 ~33.6 GB              ~46.7B params    50 / 59           Mixtral 8x7B Instruct
64 GB                 ~44.8 GB              ~70B params      53 / 59           Llama 3.1 70B Instruct
128 GB                ~89.6 GB              ~122B params     55 / 59           Llama 3.1 70B Instruct

Q4_K_M assumed. tok/s and fit are estimates from ModelFit's dataset, not measured benchmarks.
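
For scripting against these tiers, the table can be encoded as data and queried by device memory. This is a sketch under stated assumptions: the Tier type and tier_for function are hypothetical names, and rounding a device down to the nearest listed tier is a conservative choice made here, not something the page specifies. The values are copied from the table above.

```python
from typing import NamedTuple, Optional


class Tier(NamedTuple):
    memory_gb: int         # device memory (RAM / VRAM)
    budget_gb: float       # ~70% model budget
    max_params_b: float    # "Runs up to", in billions of parameters
    models_fit: int        # how many of the 59 local models fit
    strong_pick: str


# Rows copied from the table above (Q4_K_M assumed, estimates only).
TIERS = [
    Tier(8, 5.6, 8.3, 23, "DeepSeek-R1 Distill Qwen 7B"),
    Tier(16, 11.2, 14.0, 36, "Qwen2.5 Coder 14B"),
    Tier(24, 16.8, 27.0, 41, "Qwen2.5 Coder 14B"),
    Tier(32, 22.4, 35.0, 49, "Qwen3 30B"),
    Tier(48, 33.6, 46.7, 50, "Mixtral 8x7B Instruct"),
    Tier(64, 44.8, 70.0, 53, "Llama 3.1 70B Instruct"),
    Tier(128, 89.6, 122.0, 55, "Llama 3.1 70B Instruct"),
]


def tier_for(device_memory_gb: float) -> Optional[Tier]:
    """Return the largest listed tier that does not exceed the device's memory.

    Returns None when the device has less memory than the smallest tier.
    """
    eligible = [t for t in TIERS if t.memory_gb <= device_memory_gb]
    return max(eligible, key=lambda t: t.memory_gb) if eligible else None


if __name__ == "__main__":
    for memory_gb in (4, 16, 36):
        tier = tier_for(memory_gb)
        if tier is None:
            print(f"{memory_gb} GB: below the smallest listed tier")
        else:
            print(f"{memory_gb} GB: use the {tier.memory_gb} GB row, "
                  f"strong pick {tier.strong_pick}")
```

A 36 GB device is treated as the 32 GB row here, so the suggestion errs on the side of a model that is known to fit.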

Key facts

  • ModelFit tracks 90 AI models across 17 families; 59 run locally via Ollama on Apple Silicon or NVIDIA GPUs (ModelFit, 2026).
  • At Q4_K_M quantization, a local LLM needs roughly 0.6 GB of memory per billion parameters (ModelFit, 2026).
  • ModelFit sizes recommendations to ~70% of a device’s unified memory, leaving headroom for the OS, context, and KV-cache (ModelFit, 2026).
  • An 8GB device comfortably runs local models up to ~8.3B parameters at Q4; 23 of ModelFit’s 59 local models fit (ModelFit, 2026).
  • A 16GB device comfortably runs local models up to ~14B parameters at Q4; 36 of ModelFit’s 59 local models fit (ModelFit, 2026).
  • A 24GB device comfortably runs local models up to ~27B parameters at Q4; 41 of ModelFit’s 59 local models fit (ModelFit, 2026).
  • A 32GB device comfortably runs local models up to ~35B parameters at Q4; 49 of ModelFit’s 59 local models fit (ModelFit, 2026).
  • A 48GB device comfortably runs local models up to ~46.7B parameters at Q4; 50 of ModelFit’s 59 local models fit (ModelFit, 2026).
  • A 64GB device comfortably runs local models up to ~70B parameters at Q4; 53 of ModelFit’s 59 local models fit (ModelFit, 2026).
  • A 128GB device comfortably runs local models up to ~122B parameters at Q4; 55 of ModelFit’s 59 local models fit (ModelFit, 2026).
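
The fit counts in the list above can also be read as shares of the 59-model catalogue and as the number of models each memory step adds. The sketch below only does arithmetic on the counts quoted here; the function name is hypothetical.

```python
TOTAL_LOCAL_MODELS = 59

# Device memory in GB -> number of local models that fit (from the list above).
MODELS_THAT_FIT = {8: 23, 16: 36, 24: 41, 32: 49, 48: 50, 64: 53, 128: 55}


def coverage_report():
    """Return (memory_gb, models_fit, share_percent, gained_vs_previous) rows."""
    rows = []
    previous = 0
    for memory_gb in sorted(MODELS_THAT_FIT):
        count = MODELS_THAT_FIT[memory_gb]
        share = round(100 * count / TOTAL_LOCAL_MODELS, 1)
        rows.append((memory_gb, count, share, count - previous))
        previous = count
    return rows


if __name__ == "__main__":
    for memory_gb, count, share, gained in coverage_report():
        print(f"{memory_gb:>3} GB: {count}/{TOTAL_LOCAL_MODELS} "
              f"({share}%), +{gained} vs previous tier")
```

Read this way, the step from 8GB to 16GB adds 13 models, while the step from 32GB to 48GB adds one.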

Frequently asked questions

How much RAM do I need to run a local LLM?

8GB runs small 3-4B models comfortably and tops out near ~8.3B, 16GB runs models up to ~14B (the sweet spot for most laptops), 32GB reaches ~35B, and 64GB runs up to ~70B, all at Q4. These figures assume a Q4 model needs roughly 0.6 GB per billion parameters and that ~70% of unified memory is budgeted for the model.

What size LLM can I run on 16GB of RAM?

On 16GB you can comfortably run local models up to ~14B parameters at Q4. 36 of ModelFit's 59 local models fit, with Qwen2.5 Coder 14B a strong pick.
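
As a rough cross-check, the page's two rules of thumb (0.6 GB per billion parameters at Q4, ~70% of memory budgeted for the model) can be inverted to give a parameter ceiling for any memory size. This is a sketch, not ModelFit's method, and the function name is hypothetical. For 16GB it gives about 18.7B, which is above the ~14B quoted here; that would be consistent with the page reporting the largest tracked model that fits rather than the formula ceiling, though the page does not say so.

```python
GB_PER_BILLION_PARAMS_Q4 = 0.6  # rule of thumb quoted on this page
MODEL_BUDGET_FRACTION = 0.70    # share of memory budgeted for the model


def max_params_billion(device_memory_gb: float) -> float:
    """Largest parameter count (in billions) the rule of thumb allows."""
    budget_gb = device_memory_gb * MODEL_BUDGET_FRACTION
    return budget_gb / GB_PER_BILLION_PARAMS_Q4


if __name__ == "__main__":
    for memory_gb in (8, 16, 32, 64):
        print(f"{memory_gb} GB -> about "
              f"{max_params_billion(memory_gb):.1f}B parameters at Q4")
```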

Can 8GB of RAM run a local LLM?

Yes. 8GB runs 23 of ModelFit's 59 local models, from small 3-4B models up to ~8.3B at Q4. Headroom is tight, so close other apps for the best speed.

Cite this page

Free to reuse with attribution (CC BY 4.0). Full machine-readable data: the compatibility dataset and a JSON export.

ModelFit — Local LLM Hardware Stats (2026).
https://modelfit.io/stats/ (accessed 2026-06-15).