🐾Self-hosted agent· 13 models ranked

Best Local LLMs for OpenClaw

Self-hosted agentic assistant with first-class Ollama support

OpenClaw is a self-hosted, model-agnostic agentic assistant that runs across your messaging apps and executes tools (browser, cron, canvas, skills) on your own hardware. It supports local open-weight models through a native Ollama provider, where the agent loop depends entirely on the model emitting structured tool calls. Pick a model trained for that, and connect through the right endpoint.

Best pick

Qwen3.6 35B-A3B

Newest Qwen3 MoE; the Qwen3 line has the most stable tool calling and rarely drops params.

What OpenClaw needs

Reliable structured tool-calling at 32K+ context, accessed via the native Ollama API. Not the /v1 OpenAI-compatible endpoint, which OpenClaw's docs warn breaks tool calls.

OpenClaw Local LLM Tier List

SS: Best in class

Qwen3.6 35B-A3B35B· 24GB RAM

Newest Qwen3 MoE; the Qwen3 line has the most stable tool calling and rarely drops params.

Qwen3.5 35B-A3B Instruct35B· 24GB RAM

Same proven Qwen3 tool-calling reliability; the family featured in OpenClaw config recipes.

Gemma 4 31B31B· 32GB RAM

Gemma 4 ships native function calling; OpenClaw docs treat gemma4 as the local default.

AA: Strong, reliable

Gemma 4 26B-A4B26B· 24GB RAM

Native function calling in an efficient MoE; a top consumer-hardware agent pick.

Qwen3.5 27B Instruct27B· 20GB RAM

Explicitly named in OpenClaw Ollama recipes; dense Qwen3 tool reliability.

Llama 3.3 70B Instruct70B· 48GB RAM

Solid tool calling and large context; named in OpenClaw recipes.

Qwen3.5 9B Instruct9B· 14GB RAM

Directly featured in OpenClaw docs (qwen3.5:9b, num_ctx 32768).

BB: Usable with caveats

Qwen3 30B30B· 28GB RAM

Qwen3-Coder 30B is RL-trained for multi-step agentic loops; strong but prior-gen.

Mistral Small 22B22B· 26GB RAM

Mistral supports tools; mid reliability with decent context.

Phi-4 14B14B· 22GB RAM

Reasoning-strong starter, but less battle-tested for sustained tool loops.

CC: Works, but not recommended

Qwen2.5 Coder 14B14B· 22GB RAM

Great code completion, but not RL-trained for OpenClaw-style multi-step tool loops.

Llama 3.1 8B Instruct8B· 12GB RAM

Small, older Llama tool-calling; prone to malformed calls.

Gemma 2 9B Instruct9B· 14GB RAM

Predates native function calling; poor agentic fit.

Tiers weigh tool-calling reliability, context window, and coding quality for OpenClaw specifically. A model can rank higher for one tool than another. RAM figures are for Q4 quantization. Sources are listed below.

Local setup notes

Pull a model with Ollama and OpenClaw auto-discovers it. Point it at the native API base URL (http://host:11434, no /v1) and set num_ctx to 32K+. Quality and tool-call reliability scale with model size, so prefer the largest variant your hardware allows.

OpenClaw official site ↗

New open-weight models, real Apple Silicon benchmarks, and the one model worth running on your Mac this week. Free, one email a week, unsubscribe anytime.

By subscribing you agree to our Privacy Policy and to receive the weekly email. Unsubscribe anytime.

Frequently Asked Questions

Can I run OpenClaw fully local with no API costs?+

Yes. OpenClaw has a native Ollama provider: pull a model and it is auto-discovered. The docs note quality and tool-call reliability scale with model size, so a larger model gives a smoother agent loop.

Why does the endpoint URL matter so much for OpenClaw?+

OpenClaw's docs warn against the /v1 OpenAI-compatible URL: it breaks tool calling, and models may emit raw tool JSON as plain text. Use the native API base URL (http://host:11434, no /v1) so tool calls parse correctly.

Why are the Qwen2.5-Coder models only mid-tier here despite being great coders?+

They excel at code completion, but OpenClaw is an agentic tool-loop driver. Qwen3 and Gemma 4 are trained for reliable multi-step tool calling, which OpenClaw weights most. A great autocomplete model that fumbles tool JSON makes a poor agent.