Gemma Models: Google's Lightweight Local AI

Google DeepMind's Gemma family delivers impressive quality from compact models. Gemma 2 9B is one of the best sub-10B models available, and Gemma 3 1B runs on iPhones. If you want a well-tuned, safety-conscious model with solid general performance, Gemma is a strong choice.

Developer: Google DeepMind
Models: 12 local models
Size range: 1B–31B
RAM range: 3–32 GB

Key Features

Excellent quality at small sizes (1B-9B)

Strong safety and instruction tuning

Efficient on Apple Silicon

Good for chat and general tasks

All Gemma Models

Model	Size	Quant	VRAM	Min RAM	Best For	Quality
Gemma 3 1B Instruct	1B	Q4_K_M	1 GB	3 GB	Chat, Mobile	52
Gemma 2 2B Instruct	2B	Q4_K_M	1.8 GB	5 GB	Chat	61
Gemma 4 E2B	2.3B	Q4_K_M	2.3 GB	4 GB	IoT, Mobile, Edge	66
Gemma 3 4B Instruct	4B	Q4_K_M	3.5 GB	8 GB	Chat, Coding	74
Gemma 4 E4B	4.5B	Q4_K_M	4 GB	8 GB	On-device, Mobile, Chat	76
Gemma 2 9B Instruct	9B	Q4_K_M	7 GB	14 GB	Chat, Coding	84
Gemma 3 12B Instruct	12B	Q4_K_M	9.5 GB	18 GB	Chat, Quality	87
Gemma 4 12B	12B	Q4_K_M	8 GB	16 GB	Chat, Coding, Multimodal	86
Gemma 4 26B-A4B	26B	Q4_K_M	16 GB	24 GB	Chat, Coding, Multimodal	88
Gemma 2 27B Instruct	27B	Q4_K_M	21 GB	28 GB	Quality, Coding	94
Gemma 3 27B Instruct	27B	Q4_K_M	21 GB	28 GB	Quality, Coding	94
Gemma 4 31B	31B	Q4_K_M	20 GB	32 GB	Quality, Coding, Multimodal	92

Device Compatibility

Which Gemma models can run on each device class, based on minimum RAM requirements.

Model	iPhone	MacBook Air	MacBook Pro	Mac Studio	Mac mini
Gemma 3 1B Instruct (1B)	Excellent	Excellent	Excellent	Excellent	Excellent
Gemma 2 2B Instruct (2B)	Possible	Excellent	Excellent	Excellent	Excellent
Gemma 4 E2B (2.3B)	Excellent	Excellent	Excellent	Excellent	Excellent
Gemma 3 4B Instruct (4B)	Possible	Possible	Excellent	Excellent	Excellent
Gemma 4 E4B (4.5B)	Possible	Possible	Excellent	Excellent	Excellent
Gemma 2 9B Instruct (9B)	No	Possible	Possible	Excellent	Possible
Gemma 3 12B Instruct (12B)	No	Possible	Possible	Excellent	Possible
Gemma 4 12B (12B)	No	Possible	Possible	Excellent	Possible
Gemma 4 26B-A4B (26B)	No	Possible	Possible	Possible	Possible
Gemma 2 27B Instruct (27B)	No	Possible	Possible	Possible	Possible
Gemma 3 27B Instruct (27B)	No	Possible	Possible	Possible	Possible
Gemma 4 31B (31B)	No	Possible	Possible	Possible	Possible
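The rule behind this table, comparing a device's RAM with each model's minimum RAM, can be sketched in a few lines. This is only an illustration: the figures are copied from the Min RAM column of the All Gemma Models table, while the function name `models_that_fit` and the RAM sizes in the loop are hypothetical. The page does not say how it separates "Excellent" from "Possible", so the sketch reports only whether a model meets its listed minimum.

```python
# Minimum RAM (GB) per model, copied from the "All Gemma Models" table.
MIN_RAM_GB = {
    "Gemma 3 1B Instruct": 3,
    "Gemma 2 2B Instruct": 5,
    "Gemma 4 E2B": 4,
    "Gemma 3 4B Instruct": 8,
    "Gemma 4 E4B": 8,
    "Gemma 2 9B Instruct": 14,
    "Gemma 3 12B Instruct": 18,
    "Gemma 4 12B": 16,
    "Gemma 4 26B-A4B": 24,
    "Gemma 2 27B Instruct": 28,
    "Gemma 3 27B Instruct": 28,
    "Gemma 4 31B": 32,
}


def models_that_fit(device_ram_gb: float) -> list[str]:
    """Return the models whose listed minimum RAM is at or below the device's RAM.

    Hypothetical helper: it checks the minimum only and does not try to
    reproduce the page's Excellent/Possible ratings.
    """
    return [name for name, min_ram in MIN_RAM_GB.items() if min_ram <= device_ram_gb]


if __name__ == "__main__":
    # Example RAM sizes chosen for illustration, not taken from the page.
    for ram in (8, 16, 32):
        print(f"{ram} GB: {', '.join(models_that_fit(ram))}")
```

Meeting the minimum says nothing about speed or comfortable headroom, so treat the result as a first filter rather than a recommendation.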

RAM Requirements

Model	VRAM	Min RAM
Gemma 3 1B Instruct	1 GB	3 GB
Gemma 2 2B Instruct	1.8 GB	5 GB
Gemma 4 E2B	2.3 GB	4 GB
Gemma 3 4B Instruct	3.5 GB	8 GB
Gemma 4 E4B	4 GB	8 GB
Gemma 2 9B Instruct	7 GB	14 GB
Gemma 3 12B Instruct	9.5 GB	18 GB
Gemma 4 12B	8 GB	16 GB
Gemma 4 26B-A4B	16 GB	24 GB
Gemma 2 27B Instruct	21 GB	28 GB
Gemma 3 27B Instruct	21 GB	28 GB
Gemma 4 31B	20 GB	32 GB
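Each model above carries two figures: the memory the model itself uses and the minimum system RAM. The gap between them is the headroom left for everything else. The page does not explain what that gap covers (the operating system and context cache are plausible guesses, not stated facts), so the sketch below only does the subtraction. The helper name `headroom_gb` is hypothetical; the numbers are the ones listed in this section.

```python
# (model memory in GB, minimum RAM in GB), copied from this section.
FIGURES = {
    "Gemma 3 1B Instruct": (1, 3),
    "Gemma 2 2B Instruct": (1.8, 5),
    "Gemma 4 E2B": (2.3, 4),
    "Gemma 3 4B Instruct": (3.5, 8),
    "Gemma 4 E4B": (4, 8),
    "Gemma 2 9B Instruct": (7, 14),
    "Gemma 3 12B Instruct": (9.5, 18),
    "Gemma 4 12B": (8, 16),
    "Gemma 4 26B-A4B": (16, 24),
    "Gemma 2 27B Instruct": (21, 28),
    "Gemma 3 27B Instruct": (21, 28),
    "Gemma 4 31B": (20, 32),
}


def headroom_gb(model: str) -> float:
    """Return minimum RAM minus model memory, rounded to one decimal place."""
    model_gb, min_ram_gb = FIGURES[model]
    return round(min_ram_gb - model_gb, 1)


if __name__ == "__main__":
    for name in FIGURES:
        print(f"{name}: {headroom_gb(name)} GB of headroom at the minimum RAM")
```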

Frequently Asked Questions

What is the best Gemma model for a MacBook Air?

Gemma 2 9B at Q4 is the top pick for a MacBook Air with 16 GB of RAM. It uses about 7 GB and delivers strong chat performance. On an 8 GB Air, Gemma 2 2B is a good fallback.

Can Gemma run on an iPhone?

Yes. Gemma 3 1B runs on iPhones with 6 GB or more of RAM. Quality is basic but useful for simple chat and text tasks.

How does Gemma compare to Llama at similar sizes?

Gemma 2 9B and Llama 3.1 8B are very close. Gemma tends to be more conservative and safety-tuned, while Llama is more flexible. Pick based on whether you want guardrails or freedom.

Related Model Families

Phi (Microsoft), Llama (Meta), Qwen (Alibaba Cloud)
