yuxinlu1/mellum2-12b-opus-thinking

Mellum2 12B MoE Opus Thinking Q4

llama.cpp · 12B · Code · Reasoning
Throughput: 74.4 t/s @ 32k
Engine: llama.cpp
Parameters: 12B
Benchmarked: 2026-06-27

Context ladder

Throughput at each benchmarked context window, all with a q8_0 KV cache.

Context   KV     Throughput
32k       q8_0   74.4 t/s (peak, golden)
64k       q8_0   27.1 t/s
96k       q8_0   19.3 t/s
128k      q8_0   14.6 t/s
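
To put the ladder in perspective, the sketch below expresses each rung as a fraction of the 32k peak and as wall-clock seconds per 1,000 generated tokens. The figures are copied from the ladder above; the names `LADDER`, `relative_throughput`, and `seconds_per_1k_tokens` are hypothetical helpers for illustration and are not part of SparkBench or llama.cpp.

```python
# Context ladder as published above: context window (k tokens) -> tokens/s.
LADDER = {32: 74.4, 64: 27.1, 96: 19.3, 128: 14.6}


def relative_throughput(ladder):
    """Return each rung's throughput as a fraction of the fastest rung."""
    peak = max(ladder.values())
    return {ctx: tps / peak for ctx, tps in ladder.items()}


def seconds_per_1k_tokens(tps):
    """Wall-clock seconds needed to generate 1,000 tokens at a given rate."""
    return 1000.0 / tps


if __name__ == "__main__":
    fractions = relative_throughput(LADDER)
    for ctx, tps in LADDER.items():
        print(
            f"{ctx}k: {tps:.1f} t/s, "
            f"{fractions[ctx]:.0%} of peak, "
            f"{seconds_per_1k_tokens(tps):.1f} s per 1k tokens"
        )
```

On these numbers the largest drop is between the 32k and 64k rungs, where throughput falls to a little over a third of the peak. The page does not say why, so the sketch only reports the ratio.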

Golden profile

mellum2-12b-opus-q4

Capabilities

testing · coding · moe · llamacpp · golden · gguf

Why we run it

Golden fleet target — auto-scaffolded from recipe mellum2-12b-opus-q4.

Bench notes

golden 32k/q8_0 @ 74.4 tok/s — fill~14745 — bench-agent-v2 — tool_ok=True
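
If bench notes like the one above need to be compared across runs, a small parser can turn the line into structured fields. This is a minimal sketch that assumes the four-part layout seen in this single note (profile, context, KV type and throughput, then a fill value, an agent name, and a tool flag). The field names in the returned dictionary are hypothetical labels, and the meaning of `fill` is not documented on this page, so it is kept as a raw integer.

```python
import re

# Separator between note fields: an em-dash padded with single spaces.
SEP = " \u2014 "

# The note from this page, rebuilt from its four fields.
NOTE = SEP.join(
    ["golden 32k/q8_0 @ 74.4 tok/s", "fill~14745", "bench-agent-v2", "tool_ok=True"]
)

# First field: "<profile> <context>/<kv type> @ <rate> tok/s".
HEAD = re.compile(
    r"(?P<profile>\S+) (?P<context>\d+k)/(?P<kv>\S+) @ (?P<tps>[\d.]+) tok/s"
)


def parse_note(note):
    """Split a bench note into a dict; raise ValueError on an unknown layout."""
    parts = note.split(SEP)
    if len(parts) != 4:
        raise ValueError(f"expected 4 fields, got {len(parts)}")
    head, fill, agent, tool = parts
    match = HEAD.fullmatch(head)
    if match is None:
        raise ValueError(f"unrecognized note head: {head!r}")
    return {
        "profile": match["profile"],
        "context": match["context"],
        "kv": match["kv"],
        "tok_per_s": float(match["tps"]),
        "fill": int(fill.split("~", 1)[1]),  # meaning not stated on the page
        "agent": agent,
        "tool_ok": tool.split("=", 1)[1] == "True",
    }


if __name__ == "__main__":
    print(parse_note(NOTE))
```

Raising on an unexpected layout rather than guessing keeps a changed note format from silently producing wrong numbers.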

SparkBench · GB10 · single node