Question 1

What is the LFM2 24B-A2B MoE and what Mac can run it?

Accepted Answer

It is a sparse mixture-of-experts (MoE) model with 24B total parameters, of which only 2B are active per token. At Q4_K_M it occupies about 14GB once loaded, so 16GB of unified memory is the minimum. A 16GB MacBook Air, base MacBook Pro, or Mac Mini can run it, but with only about 2GB left over for macOS and other apps.
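The memory arithmetic behind that answer can be sketched in a few lines of Python. The 14GB and 16GB figures come from the answer above; `reserve_gb` is a hypothetical allowance for macOS and other apps, not a measured value.

```python
def headroom_gb(model_gb: float, total_ram_gb: float) -> float:
    """Memory left over once the model is loaded."""
    return total_ram_gb - model_gb


def fits(model_gb: float, total_ram_gb: float, reserve_gb: float = 0.0) -> bool:
    """True if the model plus a reserve for everything else fits in RAM."""
    return model_gb + reserve_gb <= total_ram_gb


# Figures from the answer above: about 14GB loaded on a 16GB Mac.
print(headroom_gb(14, 16))         # 2
print(fits(14, 16))                # True
# Hypothetical: reserving 4GB for the OS and apps no longer fits.
print(fits(14, 16, reserve_gb=4))  # False
```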

Question 2

What makes LFM2 different from other models?

Accepted Answer

LFM2 uses a hybrid architecture rather than a standard dense transformer. Its MoE design activates only about 2B of the 24B parameters for each token, which keeps inference fast while retaining the knowledge capacity of a 24B model. It is also tuned for tool calling and agent reliability.
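To put a number on "a small slice", the following sketch computes the share of parameters active per token from the 24B total and 2B active figures quoted in this FAQ. The function name is illustrative.

```python
def active_fraction(active_params_b: float, total_params_b: float) -> float:
    """Share of the model's parameters used for each token."""
    return active_params_b / total_params_b


# 2B active out of 24B total, as quoted for LFM2 24B-A2B.
print(f"{active_fraction(2, 24):.1%}")  # 8.3%
```

As a rough rule for MoE models, per-token compute tracks the active parameters, while the memory footprint still tracks the total, which is why the model needs about 14GB despite the small active slice.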

Question 3

Is LFM2 good for general chat?

Accepted Answer

It handles chat fine but shines on structured tasks, tool use, and agent workflows. For pure conversational quality at a similar RAM budget, Qwen or Gemma alternatives are the stronger picks.

Question 4

What Ollama command runs LFM2?

Accepted Answer

Run `ollama run lfm2:24b-a2b`. You need a 16GB machine at minimum, since the model itself occupies about 14GB once loaded.
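Besides the interactive command, Ollama serves a local HTTP API, by default on port 11434. The following is a minimal sketch of calling it with Python's standard library. It assumes the Ollama server is running and that the `lfm2:24b-a2b` tag from the answer above is available locally; the function names are illustrative.

```python
import json
import urllib.request

# Ollama's default local endpoint for one-shot text generation.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt: str, model: str = "lfm2:24b-a2b") -> urllib.request.Request:
    """Build a non-streaming generate request for a local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, model: str = "lfm2:24b-a2b") -> str:
    """Send the prompt and return the generated text (needs a running server)."""
    with urllib.request.urlopen(build_request(prompt, model)) as resp:
        return json.loads(resp.read())["response"]
```

With the server running, `generate("Summarize MoE in one sentence.")` returns the model's reply as a string.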

Model	Size	Quant	VRAM	Min RAM	Best For	Quality
LFM2.5 8B-A1B	8.3B	Q4_K_M	5.5 GB	10 GB	On-device agents, tool calling, multilingual chat	84
LFM2 24B-A2B Instruct	24B	Q4_K_M	14 GB	16 GB	Local AI agents, privacy-first tool calling, MCP workflows	85
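As a quick way to read the table, the sketch below picks the models whose minimum RAM fits a given machine. The rows are copied from the table above; the function name and tuple layout are illustrative.

```python
# Rows copied from the table above: (model, loaded size in GB, minimum RAM in GB).
MODELS = [
    ("LFM2.5 8B-A1B", 5.5, 10),
    ("LFM2 24B-A2B Instruct", 14, 16),
]


def models_for_ram(ram_gb: float) -> list[str]:
    """Return the models whose minimum RAM is within the given budget."""
    return [name for name, _loaded_gb, min_ram in MODELS if min_ram <= ram_gb]


print(models_for_ram(8))   # []
print(models_for_ram(10))  # ['LFM2.5 8B-A1B']
print(models_for_ram(16))  # ['LFM2.5 8B-A1B', 'LFM2 24B-A2B Instruct']
```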

Model	iPhone	MacBook Air	MacBook Pro	Mac Studio	Mac Mini
LFM2.5 8B-A1B (8.3B)	Possible	Possible	Excellent	Excellent	Excellent
LFM2 24B-A2B Instruct (24B)	No	Possible	Possible	Excellent	Possible

LFM2 Models: Liquid AI for Agentic Workflows
