Switch Ollama to gemma3n:e4b on node-197 GPU

Bake-off results: gemma3n:e4b (80.6% cmd match, 100% safety, 5.9s)
outperforms qwen3-coder:30b (67.7%, 93.5%, 14.7s) on all metrics.
Moved from steel141 CPU inference to node-197 RTX 4000 GPU.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Claude Code
2026-03-18 10:23:55 -04:00
parent 0ed3a512a2
commit 31be504f69
2 changed files with 12 additions and 3 deletions
+9
@@ -113,3 +113,12 @@ For shrink-world use port `25576` and password `REDACTED_RCON`.
- External access requires port forwarding on router: `25565` and `25566` → `192.168.0.244`
- Web panel accessible via Caddy at `mc.sethpc.xyz`
- DNS: Pi-hole at `192.168.0.153`
+
+---
+
+## AI / Ollama
+
+- **Ollama instance:** `192.168.0.179:11434` (CT 105, node-197, Quadro RTX 4000 8GB)
+- **Model (both message + command):** `gemma3n:e4b` (6.9B, Q4_K_M, GPU-accelerated)
+- **Previous:** `192.168.0.141:11434` (steel141), `gemma3:12b` + `qwen3-coder:30b`
+- **Changed:** 2026-03-18 after bake-off showed gemma3n:e4b outperforms qwen3-coder:30b (80.6% vs 67.7% cmd accuracy, 100% vs 93.5% safety, about 2.5x faster at 5.9s vs 14.7s)
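To sanity-check the new endpoint and model, a request against Ollama's `/api/generate` route can be built from the config values changed in this commit. This is a minimal sketch, not the project's actual client code: the helper names `build_generate_request` and `generate` are hypothetical, and mapping the config's `max_tokens` onto Ollama's `num_predict` option is an assumption about how the application uses that setting.

```python
import json
import urllib.request

# Values copied from the updated config in this commit.
CONFIG = {
    "ollama_url": "http://192.168.0.179:11434",
    "model": "gemma3n:e4b",
    "command_model": "gemma3n:e4b",
    "temperature": 0.85,
    "max_tokens": 600,
}


def build_generate_request(config, prompt, for_command=False):
    """Return (url, payload) for a non-streaming Ollama /api/generate call."""
    # Hypothetical helper: picks the command model or the message model.
    model = config["command_model"] if for_command else config["model"]
    payload = {
        "model": model,
        "prompt": prompt,
        "stream": False,
        "options": {
            "temperature": config["temperature"],
            # Assumption: max_tokens corresponds to Ollama's num_predict.
            "num_predict": config["max_tokens"],
        },
    }
    return config["ollama_url"].rstrip("/") + "/api/generate", payload


def generate(config, prompt, for_command=False, timeout=60):
    """Send the request and return the generated text (needs LAN access)."""
    url, payload = build_generate_request(config, prompt, for_command)
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    # Supplying data makes urllib issue a POST.
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read())["response"]
```

Calling `generate(CONFIG, "Say hello")` from a machine on the same LAN should return text from `gemma3n:e4b` if the instance on node-197 is reachable. Because both roles now use the same model, `for_command=True` only changes which config key is read.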
+3 -3
@@ -5,9 +5,9 @@
"rcon_host": "127.0.0.1",
"rcon_port": 25576,
"rcon_password": "REDACTED_RCON",
-"ollama_url": "http://192.168.0.141:11434",
-"model": "gemma3:12b",
-"command_model": "qwen3-coder:30b",
+"ollama_url": "http://192.168.0.179:11434",
+"model": "gemma3n:e4b",
+"command_model": "gemma3n:e4b",
"temperature": 0.85,
"max_tokens": 600,
"cooldown_seconds": 20,