Qwen3.6 35B Q4 (llama.cpp)

llama.cpp MoE · 3B active / 35B total · Agents · General · Reasoning

Throughput

48.6 t/s @ 32k

Engine

llama.cpp

Parameters

3B active / 35B total

Released

2026-04-16

Benchmarked

2026-06-27

Throughput at each benched context window (single measurement).

Context	KV	Throughput
@ 32k (peak, golden)	—	48.6 t/s

qwen36-q4-llama

reasoning · gguf · llamacpp · golden

Golden fleet target — auto-scaffolded from recipe qwen36-q4-llama.

golden ?/? @ 38.8 tok/s — fill~14745 — bench-agent-v2 — tool_ok=False