Mortdecai

Author	SHA1	Message	Date
Seth	f39809eaca	Semver rename: v1-v5 → 0.1.0-0.5.0 across all files Versioning scheme: semantic versioning (MAJOR.MINOR.PATCH) - 0.x.0 = pre-release development - 1.0.0 = first public/monetized release Renamed everywhere: PLAN.md, training scripts, self-play, overnight script, status printer, whitelist app, discord bot, all training data references. Ollama models retagged: mortdecai-v4 → mortdecai:0.4.0 Server configs updated on all three servers. Self-play restarted with new model name. Entity targeting + radius-aware kill + distance scale training added. Seed dataset: 2,503 + tool: 1,159 + self-play: 5,059 = 8,721 total examples Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-20 21:37:14 -04:00
Seth	0f043384e5	Self-play: --api-key for authenticated gateway connections Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-20 19:40:01 -04:00
Seth	ead16fd429	Persistent RCON connections — fixes server crash from connection spam Root cause: self-play opened/closed a new TCP socket for every RCON command (hundreds/minute). Paper's RCON listener creates a thread per connection, overwhelming the server until it stopped. Fix: PersistentRCON class maintains a single connection per server with auto-reconnect. Thread-safe via lock. Connection pool keyed by host:port. Applied to: - mc_aigod_paper.py (prod paper-ai + dev) - mc_aigod.py (shrink-world) - self_play.py (training data generation) - persistent_rcon.py (shared module) Before: ~100+ RCON connections/minute → server crash After: 3 persistent connections total → stable Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-20 18:24:44 -04:00
Seth	25918b5b66	Self-play: 50 rounds, 0.1s sleep, max GPU utilization Bumped from 20 rounds/tier to 50. Reduced sleep from 1s to 0.1s. GPUs should run near 100% — Ollama queues requests internally. mortdecai-sites container (CT 650) created on pve112. Landing page live at mortdec.ai. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-20 07:36:01 -04:00
Seth	9abf9238c5	3-tier self-play: command drills, self-critique, adversarial Tier 1 — Command drills: Random seed prompts → generate commands → RCON validates Teaches: accurate command syntax Tier 2 — Single-shot self-critique: Model invents a tricky prompt AND responds in one call RCON validates the self-generated commands Teaches: edge-case awareness, self-evaluation Tier 3 — Adversarial self-play: Session A generates challenging prompts Fresh Session B responds cold (can't cheat) RCON validates, self-corrects on errors Teaches: robustness, generalization Usage: --tier 1|2|3|all --rounds N --focus category Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-19 19:39:33 -04:00
Seth	c947fc3fa9	Self-play loop, Qwen3.5-9B bake-off: 70% base accuracy Self-play (training/scripts/self_play.py): - Model generates edge-case prompts across 9 categories - Attempts commands via RCON, self-corrects on errors - Successful traces → standard training examples - Error correction traces → multi-turn tool-calling examples - Anti-collapse: focuses on categories model is weakest in - Ready for v4 deployment, not yet active Qwen3.5-9B base model bake-off (147/1542 cases): - 70.1% OK (vs 34% Qwen3-8B base) — 2x improvement - 29.9% MISS (mostly God/prayer — no persona training) - 15.6% needed syntax fixes - Avg 7.5s response (thinking tokens) - Strong v4 candidate: better base + tool-calling architecture Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-19 19:35:57 -04:00

6 Commits
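
Commit ead16fd429 describes a PersistentRCON class: one connection per server, auto-reconnect, thread safety via a lock, and a pool keyed by host:port. The contents of persistent_rcon.py are not shown in this log, so the following is only a minimal sketch of how such a class might look. The method names, the get_rcon helper, the timeout default, and the single-retry policy are all assumptions made for illustration. The packet layout follows the Source RCON protocol that Minecraft servers speak. The sketch reads one response packet per command and does not handle responses fragmented across several packets.

```python
import socket
import struct
import threading


class PersistentRCON:
    """Hypothetical sketch: one long-lived RCON connection, reopened on failure."""

    AUTH, COMMAND = 3, 2  # request packet types in the Source RCON protocol

    def __init__(self, host, port, password, timeout=5.0):
        self.host, self.port, self.password = host, port, password
        self.timeout = timeout
        self._sock = None
        self._lock = threading.Lock()  # serialises commands across threads

    def _send(self, req_id, ptype, body):
        # Packet: int32 length, int32 id, int32 type, body, two NUL bytes.
        payload = struct.pack("<ii", req_id, ptype) + body.encode("utf-8") + b"\x00\x00"
        self._sock.sendall(struct.pack("<i", len(payload)) + payload)

    def _read_exact(self, n):
        buf = b""
        while len(buf) < n:
            chunk = self._sock.recv(n - len(buf))
            if not chunk:
                raise ConnectionError("RCON socket closed by peer")
            buf += chunk
        return buf

    def _recv(self):
        (length,) = struct.unpack("<i", self._read_exact(4))
        data = self._read_exact(length)
        req_id, _ptype = struct.unpack("<ii", data[:8])
        return req_id, data[8:-2].decode("utf-8", errors="replace")

    def _connect(self):
        self._sock = socket.create_connection((self.host, self.port), self.timeout)
        self._send(1, self.AUTH, self.password)
        req_id, _ = self._recv()
        if req_id == -1:  # the server answers with id -1 when the password is wrong
            self._close()
            raise RuntimeError("RCON authentication failed")

    def _close(self):
        if self._sock is not None:
            try:
                self._sock.close()
            finally:
                self._sock = None

    def command(self, cmd):
        """Send one command, reconnecting once if the socket has gone stale."""
        with self._lock:
            for attempt in range(2):  # the second attempt runs on a fresh socket
                try:
                    if self._sock is None:
                        self._connect()
                    self._send(2, self.COMMAND, cmd)
                    return self._recv()[1]
                except (OSError, struct.error):
                    self._close()
                    if attempt == 1:
                        raise


_pool = {}
_pool_lock = threading.Lock()


def get_rcon(host, port, password):
    """Return the shared connection for host:port, creating it on first use."""
    key = f"{host}:{port}"
    with _pool_lock:
        if key not in _pool:
            _pool[key] = PersistentRCON(host, port, password)
        return _pool[key]
```

With this shape, every caller that asks for the same host:port shares one socket, so the number of open connections is bounded by the number of servers rather than by the command rate. One trade-off of the retry: if a command reached the server but its response was lost, it is sent a second time, which may matter for commands that are not idempotent.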