9.1 Global

qwen3.6:latest

Judge: gemma4:31b · 170/178 tests · 54 min 47 s · 44.2 tok/s

36.0B · Q4_K_M · 22.3 GB · 262K ctx

Vision · Tools · Thinking

Category breakdown

frontend 10.0
long-context 10.0
math 10.0
surprise 10.0
agentic 9.9
roleplay 9.5
vision 9.5
multilingual 9.1
reasoning 9.1
organization 9.0
safety 9.0
code 8.9
instruction 8.7
web 8.1
writing 7.1