AI Model Families for Local Inference
Browse open-weight model families you can run locally with Ollama. Each family page shows all variants, RAM requirements, device compatibility, and performance expectations.
Qwen is Alibaba Cloud's open-weight model family with the widest range of sizes, from 0.5B to 235B parameters. It is known for strong multilingual performance and coding ability.
Llama is Meta's open-weight model family and the most popular choice for local AI. It is known for strong general reasoning and a massive community ecosystem.
DeepSeek specializes in reasoning and coding models. DeepSeek R1 brought chain-of-thought reasoning that rivals proprietary models to open weights, while V3 is a massive mixture-of-experts (MoE) model.
Mistral AI's models are known for efficiency and strong performance relative to their size. Mistral 7B was a breakthrough that proved small models could compete with much larger ones.
Gemma is Google DeepMind's lightweight open model family. It is known for excellent quality at small sizes and strong safety tuning.
Phi is Microsoft's small-but-mighty model family, built on the idea that careful training data beats raw parameter count. Phi-4 Mini packs strong reasoning into just 3.8B parameters, while Phi-4 14B competes with much larger models on quality. Both run locally with Ollama and pair naturally with low-RAM Apple Silicon Macs.
LFM2 is Liquid AI's efficiency-focused model family, built on a hybrid architecture rather than a standard dense transformer. Its flagship, LFM2 24B-A2B, is a sparse mixture-of-experts model that activates only 2B of its 24B parameters per token. That design makes it fast on consumer hardware and well suited to agent workflows, tool calling, and privacy-sensitive local setups.
SmolLM is Hugging Face's ultra-tiny model family for the most constrained devices. SmolLM2 360M loads in about 0.5GB and runs on anything with 1GB of RAM, from old Macs and iPhones to embedded boards. It is the smallest model in our database and the fastest by speed score.
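The parameter counts quoted above can be turned into rough memory figures with simple arithmetic. The sketch below is a back-of-the-envelope estimate, not the method used for the RAM requirements on the family pages: the function names are hypothetical, the bit widths (16, 8, 4) are assumed quantization levels, and the result covers weights only, ignoring context cache and runtime overhead. It also shows why a sparse mixture-of-experts model such as LFM2 24B-A2B is fast per token yet still needs memory for all of its parameters.

```python
def weight_memory_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in GB (10^9 bytes).

    Assumption: every parameter is stored at the same bit width.
    Real quantized files mix widths and add metadata, so treat this
    as a lower bound rather than an exact file size.
    """
    return params_billions * bits_per_weight / 8


def active_fraction(active_billions: float, total_billions: float) -> float:
    """Share of a mixture-of-experts model's parameters used per token."""
    return active_billions / total_billions


if __name__ == "__main__":
    # Parameter counts taken from the family descriptions above.
    models = {
        "SmolLM2 360M": 0.36,
        "Phi-4 Mini": 3.8,
        "Phi-4 14B": 14.0,
        "LFM2 24B-A2B": 24.0,  # all experts must be resident in memory
    }
    for name, billions in models.items():
        estimates = ", ".join(
            f"{bits}-bit: {weight_memory_gb(billions, bits):.2f} GB"
            for bits in (16, 8, 4)  # assumed quantization levels
        )
        print(f"{name}: {estimates}")

    # LFM2 24B-A2B activates 2B of its 24B parameters per token.
    share = active_fraction(2.0, 24.0)
    print(f"LFM2 24B-A2B active share per token: {share:.1%}")
```

The per-token compute of a sparse model scales with its active parameters, while its memory footprint scales with the total, which is why the two figures are computed separately here.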