Qwen3.5-9B bake-off results, model named Mortdecai
Bake-off: qwen3.5:9b base model, 147 cases:
- 70.1% command match (2x qwen3:8b baseline)
- 15.6% needed syntax fixes
- 29.9% miss (mostly God/prayer; no persona training)
- Avg 7.5s, median 5.7s (thinking tokens)

Model officially named Mortdecai.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@@ -0,0 +1,17 @@
{
  "timestamp": 1773963920,
  "ollama_url": "http://192.168.0.141:11434",
  "note": "Partial run \u2014 147 of 1542 cases before manual kill",
  "summary": [
    {
      "model": "qwen3.5:9b",
      "n": 147,
      "cmd_match_%": 70.1,
      "syntax_fixes_%": 15.6,
      "miss_%": 29.9,
      "avg_latency_ms": 7457,
      "median_latency_ms": 5660,
      "note": "Base model, no fine-tuning. 2x better than qwen3:8b base (70% vs 34%)"
    }
  ]
}
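The percentages in the summary can be sanity-checked against n = 147. The sketch below is not part of the commit; it back-calculates the whole-case counts each percentage implies and confirms they round back to the reported figures. The helper names (`implied_count`, `check_summary`) are hypothetical, and it assumes every percentage is taken over all 147 cases. Whether the syntax-fix cases are counted inside the command matches is not stated in the results file, so the sketch does not assume either way.

```python
import json

# Summary fields copied from the results file added in this commit.
summary_json = """
{
  "model": "qwen3.5:9b",
  "n": 147,
  "cmd_match_%": 70.1,
  "syntax_fixes_%": 15.6,
  "miss_%": 29.9
}
"""

PERCENT_KEYS = ("cmd_match_%", "syntax_fixes_%", "miss_%")


def implied_count(percent: float, n: int) -> int:
    """Nearest whole number of cases that `percent` of `n` corresponds to."""
    return round(percent * n / 100)


def check_summary(entry: dict) -> dict:
    """Return implied case counts, asserting each one rounds back to its percentage."""
    n = entry["n"]
    counts = {key: implied_count(entry[key], n) for key in PERCENT_KEYS}
    for key, count in counts.items():
        # Assumption: percentages were reported to one decimal place over n cases.
        assert round(100 * count / n, 1) == entry[key], key
    return counts


entry = json.loads(summary_json)
counts = check_summary(entry)
print(counts)
# prints {'cmd_match_%': 103, 'syntax_fixes_%': 23, 'miss_%': 44}
```

Under these assumptions the matched and missed counts (103 and 44) add up to the 147 cases run, which is consistent with the two figures partitioning the partial run.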