Best LLM for Mac Mini M4 with 16GB RAM (2026)
2026-03-09
> TL;DR: The Mac Mini M4 with 16GB RAM runs models up to ~13B parameters at Q4 quantization without breaking a sweat. Qwen3 8B is the best daily driver at 28–35 tok/s. Unlike the fanless MacBook Air, ...
LFM2-24B-A2B + LocalCowork: Run a Full AI Agent Locally on Your Mac (2026)
2026-03-09
Liquid AI just shipped a working on-device AI agent — not a demo, a real one. LFM2-24B-A2B paired with LocalCowork runs 75 MCP tools entirely on your Mac: security scanning, file operations, audit log...
Qwen 3.5 4B Beats GPT-4o in Independent Test — Runs on Any Mac
2026-03-09
A Johns Hopkins researcher ran both Qwen 3.5 4B and GPT-4o on 1,000 real-world prompts. Qwen won 499, lost 431, and tied 70 — a statistically significant edge over OpenAI's flagship API (N8Progra...
Run Claude Code for Free with Ollama on Mac (2026)
2026-03-09
Claude Code is Anthropic's AI coding agent — and you can run it locally with Ollama instead of paying $100/month for Claude Max. One environment variable swap points Claude Code at your local mod...
Best LLM for MacBook Pro M4 Pro with 24GB RAM (2026)
2026-03-08
> TL;DR: The MacBook Pro M4 Pro with 24GB RAM is one of the best local AI machines you can buy. Qwen3 14B is the clear all-rounder at 28–38 tok/s, fitting comfortably in ~9.5GB. For reasoning, DeepSee...
DeepSeek V4 Is Coming: What Mac Users Need to Know (2026)
2026-03-08
DeepSeek V4 could drop this week — a 1-trillion-parameter multimodal monster with a 1M-token context window and leaked coding benchmarks that reportedly beat GPT-5.3 and Claude Opus 4.6. Three release...
Mac Mini for Local AI: The Best Value Setup in 2026
2026-03-08
> TL;DR: The Mac Mini M4 Pro with 64GB ($1,999–$2,499) is the best value local AI machine in 2026. It runs 30B-class models at 12–18 tok/s, costs ~$25/year in electricity, and gives every gigabyte of ...
Apple Core AI Framework: Core ML Replacement Coming at WWDC 2026
2026-03-07
Apple is replacing Core ML with a brand-new framework called Core AI, set to debut at WWDC 2026 this June. The rename from "Machine Learning" to "AI" isn't cosmetic — it signals a fundamental shi...
Best LLM for MacBook Air M4 with 16GB RAM (2026)
2026-03-07
> TL;DR: The MacBook Air M4 with 16GB RAM can comfortably run models up to ~14B parameters at Q4 quantization. Qwen3 8B is the best all-rounder — 30–40 tok/s, fits in ~5.5GB, and outperforms models tw...
LLMfit: One Command to Find What Runs on Your Mac (2026)
2026-03-06
> TL;DR: LLMfit is a Rust CLI that detects your RAM, CPU, and GPU, then scores 200+ models across quality, speed, fit, and context. It picks the best quantization that fits your memory and estimates tok...
Apple M5 Pro & M5 Max: The Local LLM Leap (2026)
2026-03-05
Apple just announced the M5 Pro and M5 Max — and the local AI community is paying close attention. With up to 4x faster LLM prompt processing versus M4, 128GB of unified memory, and Neural Accelerator...
Qwen 3.5 Small Models: 4B Beats 20B Models on Any Mac (2026)
2026-03-04
Alibaba just dropped four small Qwen 3.5 models that rewrite what "small" means in local AI. The Qwen3.5-4B scores 88.8 on MMLU-Redux — higher than GPT-class 20B open-source models (HuggingFace, March...
Qwen Team Exodus: Alibaba Loses 3 Key AI Leaders (2026)
2026-03-04
Three senior leaders have left Alibaba's Qwen team in Q1 2026 — including tech lead Lin Junyang, the architect behind the world's most downloaded open-source AI project. The departures came days after...
Ollama 0.17 Update: 15% Faster on Apple Silicon (2026)
2026-02-26
Ollama 0.17 dropped on February 21, 2026, with a major overhaul of its inference engine. Performance gains hit 40% on NVIDIA GPUs and 10-15% on Apple Silicon. Here's what actually changes for Mac user...
How to Run Claude Code on Local LLMs: The Complete Setup
2026-02-25
Want to use Claude Code without Anthropic's API? Here's how @sudoingX hacked it to run on local Qwen models with impressive results.
DeepSeek-V3 vs Qwen 3.5: Which Local LLM Wins on Mac?
2026-02-25
The two giants of open-source local AI are here. DeepSeek-V3 and Qwen 3.5 both promise frontier-level quality on consumer hardware. But which one actually delivers the best experience on your Mac?
Qwen 3.5: New Models That Beat Giants with 7x Less RAM
2026-02-25
Alibaba just released the Qwen 3.5 Medium series — and it's a masterclass in one thing: smart architecture beats raw parameters.
Benchmark: Local LLMs vs Cloud Flagships
2026-02-24
Verdict: local models in the 100B+ class come within 5% of cloud flagships. The remaining gap shows up mainly on very long context.
How to Install Ollama on Mac — Apple Silicon Guide (2026)
2026-02-24
Want to run language models locally on your Mac? This guide shows you how to install Ollama and launch your first model in 5 minutes.
MacBook Air M4 vs MacBook Pro M4: Which Mac for Local LLMs?
2026-02-24
Apple recently updated its entire Mac lineup with M4 chips. But which one should you choose for running language models locally? We compare the two options.