Best Local LLMs for Roo Code
Autonomous VS Code agent with modes (a Cline fork)
Roo Code is an autonomous VS Code coding agent forked from Cline that adds selectable modes (Architect, Code, Ask, Debug). It runs local models through Ollama, LM Studio, or any OpenAI-compatible endpoint. Because it shares Cline's diff/tool-use protocol, it has the same low tolerance for weak tool-callers and small context windows, so the strong agentic Qwen MoE models lead the list.

Best pick
Qwen3.6 35B-A3B
Newest large Qwen3.x MoE; strongest tool-use + repo-scale reasoning in the catalog.
What Roo Code needs
Reliable structured tool-calling sustained across a large multi-turn context window — the model must follow the diff/tool protocol turn after turn without drifting.
Roo Code Local LLM Tier List
Newest large Qwen3.x MoE; strongest tool-use + repo-scale reasoning in the catalog.
Prior-gen of the same MoE class; proven agentic tool-calling, fast active-param inference.
Direct lineage of Cline/Roo's recommended Qwen3 Coder 30B; the safe default local pick.
The most-reported "actually works in Roo/Cline" dense coder; community Modelfiles target it.
Dense 27B with current-gen instruction-following; reliable tool calls, ample context.
Mistral Small (Devstral) lineage built for agentic tool-use; solid mid-size performer.
Handles tool calls but occasionally drifts on long agentic chains.
Decent tool-use; weaker repo-scale memory than 27B+.
Strong reasoning, but Gemma tool-calling is less battle-tested in Roo/Cline than Qwen.
Too small for reliable multi-turn tool-calling; fine only for quick Ask mode.
Verbose chain-of-thought interferes with the strict tool-call protocol.
Large MoE but inconsistent tool-calling in coding agents.
Tiers weigh tool-calling reliability, context window, and coding quality for Roo Code specifically — a model can rank higher for one tool than another. RAM figures are for Q4 quantization. Sources are listed below.
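As a rough illustration of what a Q4 RAM figure implies, the sketch below estimates weight memory from parameter count. The function name is hypothetical, and the default of 4.5 bits per weight is an assumption (4-bit formats usually land a little above 4 bits once scaling data is included). KV cache and runtime overhead are not counted, so treat the result as a floor rather than a measured number.

```python
def q4_weight_gb(params_billion: float, bits_per_weight: float = 4.5) -> float:
    """Approximate size of quantized weights in GB (1 GB = 1e9 bytes).

    bits_per_weight=4.5 is an assumed average for a 4-bit quantization,
    not a figure taken from this page.
    """
    # params (in billions) * bits per weight / 8 bits per byte -> GB
    return params_billion * bits_per_weight / 8


if __name__ == "__main__":
    # A dense 27B model and a 35B-total MoE, weights only.
    for name, size_b in [("dense 27B", 27), ("MoE 35B total", 35)]:
        print(f"{name}: ~{q4_weight_gb(size_b):.1f} GB of weights at Q4")
```

For MoE models the total parameter count is what matters for memory, since all experts stay loaded even though only a few billion parameters are active per token.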
Local setup notes
Select Ollama or LM Studio as the provider. Roo Code defers to your model's Modelfile num_ctx — use 16K for ~8GB VRAM, 32K for ~16GB, and 64K+ for 24GB+. With LM Studio, switch the model to the OpenAI Compatible mode for clean tool calls.
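The context-size guidance above can be sketched as a small helper that picks num_ctx from available VRAM and renders an Ollama Modelfile. This is a minimal sketch: the function names and the model tag are placeholders, and the cutoffs between the three VRAM points named above are an assumption.

```python
def pick_num_ctx(vram_gb: float) -> int:
    """Map available VRAM (GB) to a context length per the notes above.

    The page names ~8GB, ~16GB and 24GB+; where exactly to switch between
    them is an assumption made for this sketch.
    """
    if vram_gb >= 24:
        return 65536  # 64K+ for 24GB+
    if vram_gb >= 16:
        return 32768  # 32K for ~16GB
    return 16384      # 16K for ~8GB


def render_modelfile(base_model: str, vram_gb: float) -> str:
    """Return Modelfile text that sets num_ctx on top of an existing model."""
    return f"FROM {base_model}\nPARAMETER num_ctx {pick_num_ctx(vram_gb)}\n"


if __name__ == "__main__":
    # "my-coder-model:latest" is a placeholder tag, not a real recommendation.
    print(render_modelfile("my-coder-model:latest", vram_gb=16))
```

Saving the output as a file named Modelfile and running `ollama create <new-name> -f Modelfile` should produce a model variant with that context length set; the tag to put after FROM depends on which model you have pulled.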
Roo Code official site
Frequently Asked Questions
What is the best local model for Roo Code?
What context window does Roo Code need for local models?
Why do small local models fail in Roo Code?
Sources
Other AI Coding Tools
Claude Code
(Cloud) Anthropic's terminal coding agent, pointed at a local model
OpenCode
Open-source terminal coding agent (Ollama / OpenAI-compatible)
OpenClaw
Self-hosted agentic assistant with first-class Ollama support
Aider
Terminal AI pair-programmer with its own edit-format leaderboard
Cline
Autonomous coding agent for VS Code (formerly Claude Dev)
Continue.dev
Open-source chat, edit & autocomplete for VS Code / JetBrains
Goose
Block's open-source autonomous developer agent (MCP-driven)
Open Claude Code
Open-source Claude Code CLI reimplementation, run on local models
Codex
(Cloud) OpenAI's coding agent, ranked across the GPT-5 model lineup
Cursor
(Cloud) The AI code editor, with a curated cross-vendor frontier lineup