What is Ollama?
Ollama is a free, open-source tool that makes running large language models (LLMs) locally incredibly simple. Think of it as the “Docker for AI models” — it handles all the complexity of downloading, configuring, and running AI models on your own hardware.
- Private: Your data never leaves your device
- Fast: Native GPU acceleration on Apple Silicon
- Free: Open-source with no usage limits
With Ollama, you can run popular models like Llama, Mistral, Qwen, and many others directly on your Mac — no internet connection required after initial download, no API keys, and no subscription fees.
System Requirements
Minimum Requirements
- macOS 11 Big Sur or later
- Apple Silicon Mac (M1 or newer)
- 8GB RAM (16GB recommended)
- 10GB free storage
Recommended for Best Experience
- macOS 14 Sonoma or later
- M2, M3, or M4 Mac
- 16GB+ RAM
- 50GB+ free storage
Note: Ollama primarily targets Apple Silicon; Intel Macs are supported experimentally, with significantly reduced performance. For the best experience, we strongly recommend an M-series Mac.
Installation Steps
Download Ollama
Visit the official Ollama website and download the macOS version, or use the command line:
curl -fsSL https://ollama.com/install.sh | sh
The install script downloads Ollama and installs it to /usr/local/bin.
Verify Installation
Open a new Terminal window and verify Ollama is installed correctly:
ollama --version
You should see the version number, like ollama version 0.3.0
Start Ollama Service
Ollama runs as a background service. Start it with:
ollama serve
Keep this terminal window open, or run it in the background. On macOS, Ollama typically auto-starts after installation.
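Before moving on, you can confirm the server is listening; a running instance answers plain HTTP on port 11434. A small sketch that reports either outcome (it assumes curl is available, and is safe to run whether or not the server is up):

```shell
# Quick health check: a running Ollama server answers HTTP on port 11434.
if curl -fsS --max-time 2 http://localhost:11434/ >/dev/null 2>&1; then
  status="reachable"
else
  status="not reachable"
fi
echo "Ollama server: $status"
```

If the server reports not reachable, run ollama serve (or launch the Ollama app) and try again.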
Downloading Your First Model
Ollama makes downloading models as simple as running a single command. Let's start with Llama 3.2, a compact, efficient model from Meta that runs well on most Macs.
Download Llama 3.2 (3B)
ollama pull llama3.2:3b
This downloads the 3 billion parameter version (~2GB). Progress will be displayed during download.
Other great starter models to try:
- ollama pull qwen2.5:7b: Excellent 7B model for coding and chat (4.5GB)
- ollama pull mistral:7b: Popular, well-tested model (4.1GB)
- ollama pull gemma2:2b: Google's efficient small model (1.6GB)

You can browse all available models at ollama.com/library.
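If you plan to pull all four starter models, a quick awk sum of the approximate sizes listed above shows the total disk space involved:

```shell
# Approximate download sizes (GB) of the starter models listed above
sizes="llama3.2:3b 2.0
qwen2.5:7b 4.5
mistral:7b 4.1
gemma2:2b 1.6"

# Add them up to see the combined disk footprint
total=$(printf '%s\n' "$sizes" | awk '{ sum += $2 } END { printf "total: %.1f GB", sum }')
echo "$total"
```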
Running the Model
Interactive Chat Mode
Start an interactive chat session with your downloaded model:
ollama run llama3.2:3b
You'll see a prompt where you can type messages. Press Ctrl+D or type /bye to exit.
Single Prompt Mode
Send a single prompt and get a response:
ollama run llama3.2:3b "Explain quantum computing"
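Single-prompt mode composes nicely with shell substitution, for example feeding a file's contents into the prompt. A minimal sketch (notes.txt is a stand-in file created here, and the last line prints the command rather than running it, so no model is needed):

```shell
# Stand-in input file for illustration
printf 'Ollama runs large language models locally.\n' > notes.txt

# Build the prompt from the file's contents
prompt="Summarize the following: $(cat notes.txt)"

# Print the command that would run (drop the leading echo to actually run it)
echo ollama run llama3.2:3b "$prompt"
```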
Using the API
Ollama provides a local API for building applications:
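By default the generate endpoint streams its reply as newline-delimited JSON, one fragment per line with the text in a response field. A hedged sketch of stitching those fragments back together, using a hand-written sample stream rather than real model output:

```shell
# Illustrative sample of the newline-delimited JSON that /api/generate streams
# back (not real model output)
stream='{"model":"llama3.2:3b","response":"Hello","done":false}
{"model":"llama3.2:3b","response":" there!","done":true}'

# Pull out each "response" fragment and concatenate them into the full reply
reply=$(printf '%s\n' "$stream" | sed -n 's/.*"response":"\([^"]*\)".*/\1/p' | tr -d '\n')
echo "$reply"
```

For a quick test, the request itself is a single curl call: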
curl http://localhost:11434/api/generate -d '{"model": "llama3.2:3b", "prompt": "Hello!"}'

Essential Commands
| Command | Description |
|---|---|
| ollama list | Show all downloaded models |
| ollama pull <model> | Download a model |
| ollama run <model> | Run a model (downloads if needed) |
| ollama rm <model> | Remove a model to free space |
| ollama cp <src> <dst> | Copy a model |
| ollama show <model> | Display model information |
| ollama ps | Show running models |
Troubleshooting
Model downloads are slow
Downloads happen directly from model hosts. Try using a VPN if your connection is slow, or download during off-peak hours. You can also resume interrupted downloads by running the pull command again.
Out of memory errors
Your Mac doesn't have enough RAM for the model you're trying to run. Try a smaller model (use 3B instead of 7B) or close other applications to free up memory. Models require roughly 1GB RAM per billion parameters.
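The rule of thumb above can be turned into a one-line estimator before you pull a model (a rough sketch only; quantization and context size shift the real number):

```shell
# Rough sizing: ~1 GB of RAM per billion parameters (rule of thumb from above)
estimate_ram() {
  awk -v b="$1" 'BEGIN { printf "%.1f GB", b * 1.0 }'
}

echo "3B model: $(estimate_ram 3)"
echo "7B model: $(estimate_ram 7)"
```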
Ollama command not found
The installation directory might not be in your PATH. Add /usr/local/bin to your PATH or restart your terminal. You can also try reinstalling using the official installer from ollama.com.
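A quick way to test the PATH fix in your current shell (the check reports either outcome, so it is safe to paste as-is; add the export line to ~/.zshrc to make it permanent):

```shell
# Put /usr/local/bin on PATH for this shell session
export PATH="/usr/local/bin:$PATH"

# Report whether the shell can now locate the ollama binary
if command -v ollama >/dev/null 2>&1; then found="yes"; else found="no"; fi
echo "ollama on PATH: $found"
```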
Connection refused errors
The Ollama service isn't running. Start it with ollama serve in a separate terminal window, or check if it's running with ollama ps.
Next Steps
Congratulations! You now have local AI running on your Mac. Here are some ways to expand your setup: