Best Local LLMs for Continue.dev
Open-source chat, edit & autocomplete for VS Code / JetBrains
Continue.dev is an open-source AI code assistant for VS Code and JetBrains that runs chat, edit, and tab-autocomplete fully locally via Ollama. It is less agentic than Cline or Roo — mostly chat plus edit plus autocomplete — so tool-calling matters little and a wide range of instruct models work well. The catch: autocomplete needs a separate fill-in-the-middle model, not the same one you chat with.
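To make the fill-in-the-middle requirement concrete, here is a minimal sketch of how an FIM prompt is laid out. It assumes the special tokens published for Qwen2.5-Coder (<|fim_prefix|>, <|fim_suffix|>, <|fim_middle|>); other FIM-trained models use different markers, and build_fim_prompt is a hypothetical helper, not part of Continue. In normal use the editor extension assembles this prompt for you, so the sketch only illustrates why a chat-tuned model without FIM training is a poor fit for autocomplete.

```python
# FIM markers as documented for Qwen2.5-Coder (assumption: your autocomplete
# model uses the same tokens; check the model card if you use another family).
FIM_PREFIX = "<|fim_prefix|>"
FIM_SUFFIX = "<|fim_suffix|>"
FIM_MIDDLE = "<|fim_middle|>"


def build_fim_prompt(prefix: str, suffix: str) -> str:
    """Hypothetical helper: wrap the code before and after the cursor.

    The model is expected to generate the text that belongs between
    `prefix` and `suffix`, i.e. whatever follows the final marker.
    """
    return f"{FIM_PREFIX}{prefix}{FIM_SUFFIX}{suffix}{FIM_MIDDLE}"


# Cursor sits right after "return " in the function body.
prompt = build_fim_prompt(
    prefix="def add(a, b):\n    return ",
    suffix="\n\nprint(add(1, 2))\n",
)
print(prompt)
```

A chat model sees these markers as unfamiliar text, while an FIM-trained model treats them as an instruction to fill the gap, which is why the two roles are usually served by different models.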

Best pick
Qwen2.5 Coder 14B
Continue's code-specialist sweet spot; the top locally-runnable chat/edit pick that fits common RAM.
What Continue.dev needs
Strong coding-instruct quality at a size that fits in RAM — Continue's chat/edit roles tolerate non-tool-calling models, so raw code reasoning beats agentic polish.
Continue.dev Local LLM Tier List
Continue's code-specialist sweet spot; the top locally-runnable chat/edit pick that fits common RAM.
Newest large Qwen MoE — desktop-class speed with big-model quality for chat/edit.
Continue's explicitly recommended local chat + code-gen model; light enough for 16GB.
Named in Continue docs as a recommended Ollama chat model.
Dense high-quality Qwen for strong chat/edit when RAM allows.
Recommended by Continue for fast, versatile chat; reliable edit behavior.
Top-tier reasoning/edit quality for high-RAM Macs; strongest Llama for chat.
Solid mid-size dense chat/edit at lower RAM cost.
Strong newer Gemma for chat/edit when RAM allows.
Strong reasoning-for-size; good chat/edit on modest hardware.
Useful for debugging/refactor chat, but thinking models are slower and not for autocomplete.
General (non-coder) Qwen2.5; the coder-14b is the better pick for code roles.
Older Gemma; weak for coding, generic chat only.
Heavy reasoning distill; slow and overkill for Continue's mostly non-agentic roles.
Tiers weigh tool-calling reliability, context window, and coding quality for Continue.dev specifically — a model can rank higher for one tool than another. RAM figures are for Q4 quantization. Sources are listed below.
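As a rough illustration of what a Q4 RAM figure implies, the sketch below estimates memory from parameter count. It is back-of-the-envelope arithmetic under stated assumptions, not how the figures on this page were produced: bits_per_weight defaults to 4.5 (4-bit weights plus per-block scales, as in a Q4_0-style format), and overhead_gb is a hypothetical allowance for context cache and runtime buffers that varies with context length and backend.

```python
def estimate_q4_ram_gb(
    params_billions: float,
    bits_per_weight: float = 4.5,  # assumption: 4-bit weights plus block scales
    overhead_gb: float = 1.5,      # assumption: context cache and runtime buffers
) -> float:
    """Rough RAM estimate in decimal gigabytes for a quantized model."""
    # 1 billion parameters at 8 bits per weight is about 1 GB of weights.
    weights_gb = params_billions * bits_per_weight / 8
    return weights_gb + overhead_gb


# Compare a few common model sizes under the same assumptions.
for size in (1.5, 7, 14, 32):
    print(f"{size}B -> ~{estimate_q4_ram_gb(size):.1f} GB")
```

Under these assumptions a 14B model lands a little under 10 GB, which is consistent with the idea that it fits common RAM sizes while leaving room for the editor and the operating system.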
Local setup notes
Continue assigns models to roles. Use an instruct coder (7B–32B) for chat and edit, and pair it with a small FIM-trained model for tab-autocomplete. Continue's validated local stack is qwen2.5-coder:7b for chat plus qwen2.5-coder:1.5b for autocomplete.
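The role split described above can be sketched as a small script that builds the two-model layout and prints it. The field names (name, version, schema, models, provider, model, roles) are assumed to mirror Continue's config file layout and may differ between releases, so treat the output as a starting point to check against the current Continue configuration reference. The display names and the build_continue_config helper are hypothetical.

```python
import json


def build_continue_config(chat_model: str, autocomplete_model: str) -> dict:
    """Hypothetical helper: one Ollama model per role group."""
    return {
        "name": "Local Assistant",  # illustrative display name
        "version": "1.0.0",
        "schema": "v1",
        "models": [
            {
                "name": "Local chat and edit",
                "provider": "ollama",
                "model": chat_model,
                "roles": ["chat", "edit"],
            },
            {
                "name": "Local autocomplete",
                "provider": "ollama",
                "model": autocomplete_model,
                "roles": ["autocomplete"],  # should be an FIM-trained model
            },
        ],
    }


# The stack named in the text: a 7B coder for chat/edit, a 1.5B for autocomplete.
config = build_continue_config("qwen2.5-coder:7b", "qwen2.5-coder:1.5b")
print(json.dumps(config, indent=2))
```

Keeping the autocomplete model small matters more than keeping the chat model small: completions fire on nearly every keystroke, so latency dominates there, while chat and edit requests are occasional and can afford a larger model.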
Frequently Asked Questions
Which local model is best for Continue.dev?
Can I use one model for both chat and autocomplete?
Do reasoning models like DeepSeek-R1 work in Continue.dev?
Sources
Other AI Coding Tools
Claude Code
Cloud: Anthropic's terminal coding agent, pointed at a local model
OpenCode
Open-source terminal coding agent (Ollama / OpenAI-compatible)
OpenClaw
Self-hosted agentic assistant with first-class Ollama support
Aider
Terminal AI pair-programmer with its own edit-format leaderboard
Cline
Autonomous coding agent for VS Code (formerly Claude Dev)
Roo Code
Autonomous VS Code agent with modes (a Cline fork)
Goose
Block's open-source autonomous developer agent (MCP-driven)
Open Claude Code
Open-source Claude Code CLI reimplementation, run on local models
Codex
Cloud: OpenAI's coding agent, ranked across the GPT-5 model lineup
Cursor
Cloud: the AI code editor, with a curated cross-vendor frontier lineup