Major changes from this session:
Training:
- 0.6.0 training running: 9B on steel141 3090 Ti, 27B on rented H100 NVL
- 7,256 merged training examples (up from 3,183)
- New training data: failure modes (85), midloop messaging (27),
prompt injection defense (29), personality (32), gold from quarantine
bank (232), new tool examples (30), claude's own experience (10)
- All training data RCON-validated at 100% pass rate
- Bake-off: gemma3:27b 66%, qwen3.5:27b 61%, translategemma:27b 56%
Oracle Bot (Mind's Eye):
- Invisible spectator bot (mineflayer) streams world state via WebSocket
- HTML5 Canvas frontend at mind.mortdec.ai
- Real-time tool trace visualization with expandable entries
- Streaming model tokens during inference
- Gateway integration: fire-and-forget POST /trace on every tool call
Reinforcement Learning:
- Gymnasium environment wrapping mineflayer bot (minecraft_env.py)
- PPO training via Stable Baselines3 (10K param policy network)
- Behavioral cloning pretraining (97.5% accuracy on expert policy)
- Infinite training loop with auto-restart and checkpoint resume
- Bot learns combat, survival, navigation from raw experience
Bot Army:
- 8-soldier marching formation with autonomous combat
- Combat bots using mineflayer-pvp, pathfinder, armor-manager
- Multilingual prayer bots via translategemma:27b (18 languages)
- Frame-based AI architecture: LLM planner + reactive micro-scripts
Infrastructure:
- Fixed mattpc.sethpc.xyz billing gateway (API key + player list parser)
- Billing gateway now tracks all LAN traffic (LAN auto-auth)
- Gateway fallback for empty god-mode responses
- Updated mortdec.ai landing page
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- training/scripts/train_lora.py: Unsloth QLoRA trainer for qwen3:8b
- training/scripts/train_lora.sh: Launch script for steel141 RTX 3090 Ti
- eval/bakeoff.py: Fixed token budget (400->1500) that caused qwen3
models to exhaust tokens on thinking, added --no-think flag
- agent/serve.py: Default model changed to gemma3n:e4b
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Bake-off tested 7 models on 31 seed examples via GPU-accelerated Ollama
on node-197 RTX 4000. gemma3n:e4b leads for serving (80.6% cmd match,
100% safety, 5.9s). qwen3:8b recommended as fine-tuning base (Apache 2.0,
best syntax quality, strong ecosystem). Full research in MODEL_RESEARCH.md.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>