Google DeepMind11 local models
Gemma Models: Google's Lightweight Local AI
Google DeepMind's Gemma family delivers impressive quality from compact models. Gemma 2 9B is one of the best sub-10B models available, and Gemma 1B runs on iPhones. If you want a well-tuned, safety-conscious model with solid general performance, Gemma is a strong choice.
Developer
Google DeepMind
Models
11
Size Range
1B – 31B
RAM Range
3 – 32 GB
Key Features
✓Excellent quality at small sizes (1B-9B)
✓Strong safety and instruction tuning
✓Efficient on Apple Silicon
✓Good for chat and general tasks
All Gemma Models
| Model | Size | Quant | VRAM | Min RAM | Best For | Quality | Ollama |
|---|---|---|---|---|---|---|---|
| Gemma 3 1B Instruct | 1B | Q4_K_M | 1 GB | 3 GB | Chat, Mobile | 52 | |
| Gemma 2 2B Instruct | 2B | Q4_K_M | 1.8 GB | 5 GB | Chat | 61 | |
| Gemma 4 E2B | 2.3B | Q4_K_M | 2.3 GB | 4 GB | IoT, Mobile, Edge | 66 | |
| Gemma 3 4B Instruct | 4B | Q4_K_M | 3.5 GB | 8 GB | Chat, Coding | 74 | |
| Gemma 4 E4B | 4.5B | Q4_K_M | 4 GB | 8 GB | On-device, Mobile, Chat | 76 | |
| Gemma 2 9B Instruct | 9B | Q4_K_M | 7 GB | 14 GB | Chat, Coding | 84 | |
| Gemma 3 12B Instruct | 12B | Q4_K_M | 9.5 GB | 18 GB | Chat, Quality | 87 | |
| Gemma 4 26B-A4B | 26B | Q4_K_M | 16 GB | 24 GB | Chat, Coding, Multimodal | 88 | |
| Gemma 2 27B Instruct | 27B | Q4_K_M | 21 GB | 28 GB | Quality, Coding | 94 | |
| Gemma 3 27B Instruct | 27B | Q4_K_M | 21 GB | 28 GB | Quality, Coding | 94 | |
| Gemma 4 31B | 31B | Q4_K_M | 20 GB | 32 GB | Quality, Coding, Multimodal | 92 |
Device Compatibility
Which Gemma models can run on each device class, based on minimum RAM requirements.
| Model | iPhone | Air | Pro | Studio | Mini |
|---|---|---|---|---|---|
| Gemma 3 1B Instruct (1B) | Excellent | Excellent | Excellent | Excellent | Excellent |
| Gemma 2 2B Instruct (2B) | Possible | Excellent | Excellent | Excellent | Excellent |
| Gemma 4 E2B (2.3B) | Excellent | Excellent | Excellent | Excellent | Excellent |
| Gemma 3 4B Instruct (4B) | Possible | Possible | Excellent | Excellent | Excellent |
| Gemma 4 E4B (4.5B) | Possible | Possible | Excellent | Excellent | Excellent |
| Gemma 2 9B Instruct (9B) | No | Possible | Possible | Excellent | Possible |
| Gemma 3 12B Instruct (12B) | No | Possible | Possible | Excellent | Possible |
| Gemma 4 26B-A4B (26B) | No | Possible | Possible | Possible | Possible |
| Gemma 2 27B Instruct (27B) | No | Possible | Possible | Possible | Possible |
| Gemma 3 27B Instruct (27B) | No | Possible | Possible | Possible | Possible |
| Gemma 4 31B (31B) | No | Possible | Possible | Possible | Possible |
RAM Requirements
Gemma 3 1B Instructmin 3 GB
1 GB
Gemma 2 2B Instructmin 5 GB
1.8 GB
Gemma 4 E2Bmin 4 GB
2.3 GB
Gemma 3 4B Instructmin 8 GB
3.5 GB
Gemma 4 E4Bmin 8 GB
4 GB
Gemma 2 9B Instructmin 14 GB
7 GB
Gemma 3 12B Instructmin 18 GB
9.5 GB
Gemma 4 26B-A4Bmin 24 GB
16 GB
Gemma 2 27B Instructmin 28 GB
21 GB
Gemma 3 27B Instructmin 28 GB
21 GB
Gemma 4 31Bmin 32 GB
20 GB
Frequently Asked Questions
What is the best Gemma model for a MacBook Air?+
Gemma 2 9B Q4 is the top pick for MacBook Air with 16GB RAM. It uses about 6GB and delivers strong chat performance. For 8GB Air, Gemma 2 2B is a good fallback.
Can Gemma run on an iPhone?+
Yes. Gemma 1B runs on iPhones with 6GB+ RAM. Quality is basic but useful for simple chat and text tasks.
How does Gemma compare to Llama at similar sizes?+
Gemma 2 9B and Llama 3.1 8B are very close. Gemma tends to be more conservative and safety-tuned, while Llama is more flexible. Pick based on whether you want guardrails or freedom.