nvidia/qwen3-30b-a3b

Qwen3-30B-A3B

vLLM MoE · 3B active / 30B total · Agents · General · Reasoning
Throughput: 75.8 t/s @ 40k
Engine: vLLM
Parameters: 3B active / 30B total
Released: 2025-07-08
Benchmarked: 2026-06-27

Context ladder

Throughput at each benchmarked context window (single measurement).

Context KV Throughput
@ 40k (peak, golden): 75.8 t/s

Golden profile

nvidia-qwen3-30b-a3b-eugr

Capabilities

general · moe · fast · vllm · nvfp4 · tool-calling

Why we run it

Faster MoE sibling for interactive agent loops: lower latency and memory pressure when maximum intelligence isn't required.

Bench notes

golden ?/? @ 71.2 tok/s — fill~18432 — bench-agent-v2 — tool_ok=False
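
The bench note above packs several fields into one delimited line. As a rough sketch only, it could be split into a dictionary like this. The function name `parse_bench_note` and the key names (`profile`, `tok_per_s`, `fill`, `labels`) are hypothetical and chosen for illustration. The card does not document the note format, so what each field means (for example, that `fill~18432` is an approximate fill size) is an assumption.

```python
import re


def parse_bench_note(note: str) -> dict:
    """Split a bench note on its em-dash separators and collect the fields.

    Field names are illustrative; the note format itself is undocumented.
    """
    fields = {}
    for part in (p.strip() for p in note.split("—")):
        rate = re.search(r"@\s*([\d.]+)\s*tok/s", part)
        if rate:
            # e.g. "golden ?/? @ 71.2 tok/s": first word is taken as the profile label
            fields["profile"] = part.split()[0]
            fields["tok_per_s"] = float(rate.group(1))
        elif part.startswith("fill~"):
            # Assumed to be an approximate integer count
            fields["fill"] = int(part[len("fill~"):])
        elif "=" in part:
            key, value = part.split("=", 1)
            # Map the literal strings "True"/"False" to booleans, keep anything else as text
            fields[key] = {"True": True, "False": False}.get(value, value)
        else:
            # Anything unrecognised (e.g. "bench-agent-v2") is kept as a free-form label
            fields.setdefault("labels", []).append(part)
    return fields


note = "golden ?/? @ 71.2 tok/s — fill~18432 — bench-agent-v2 — tool_ok=False"
parsed = parse_bench_note(note)
print(parsed)
```

Unrecognised segments are kept as labels rather than dropped, so a note with extra fields still round-trips without losing information.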

SparkBench · GB10 · single node