Qwen3-Coder-30B-A3B-Instruct

llama.cpp MoE · 3B active / 30B total · Agents · Code

Throughput

70.9 t/s @ 32k

Engine

llama.cpp

Parameters

3B active / 30B total

Released

2025-07-31

Benchmarked

2026-06-27

Throughput at each benchmarked context window (single measurement per window).

Context	KV	Throughput
32k (peak, golden)	—	70.9 t/s

unsloth-qwen3-coder-30b-a3b-instruct-llama

coding agentic moe llamacpp gguf

DGX Spark forum pick for agentic coding. The MoE coder fits in 128GB of memory; Q4 and Q5 GGUF quantizations cover the llama.cpp bake-off path.
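As a rough check on the 128GB claim, the weight footprint can be estimated from parameter count and bits per weight. This is a back-of-envelope sketch, not a measurement: the helper name `gguf_weight_gb` is hypothetical, the 30e9 parameter count is the rounded "30B total" figure from this card, and the 4.0 and 5.0 bits per weight are assumed nominal values (real GGUF quant mixes average somewhat higher, and the KV cache and runtime buffers add to the total).

```python
def gguf_weight_gb(total_params: float, bits_per_weight: float) -> float:
    """Approximate weight size in GB (10^9 bytes) for a quantized model."""
    return total_params * bits_per_weight / 8 / 1e9


TOTAL_PARAMS = 30e9  # rounded "30B total" from the card; exact count not stated
MEMORY_GB = 128      # memory budget quoted on the card

# Assumed nominal bit widths; actual GGUF files will be somewhat larger.
for label, bpw in (("Q4", 4.0), ("Q5", 5.0)):
    size = gguf_weight_gb(TOTAL_PARAMS, bpw)
    headroom = MEMORY_GB - size
    print(f"{label}: ~{size:.1f} GB weights, ~{headroom:.1f} GB left for KV cache and runtime")
```

All 30B parameters must be resident even though only about 3B are active per token, so the estimate uses the total count; the active count mainly affects throughput, not footprint.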

golden ?/? @ 29.2 tok/s — fill~14745 — bench-agent-v2 — tool_ok=True