# PLAN.md: Mortdecai Project Roadmap

> **Last updated:** 2026-03-20 04:45 UTC
> **Model name:** Mortdecai
> **Domain:** mortdec.ai
> **Status legend:** `[ ]` planned | `[~]` in progress | `[x]` done | `[-]` cancelled/deferred

---

## Vision

**Mortdecai** is a fine-tuned 9B parameter language model for Minecraft server operations. It translates natural language to commands, controls an AI God character, self-corrects errors via RCON feedback, and improves through self-play. Runs locally on consumer hardware with zero cloud dependencies at inference time.

---

## Current State

### Models

| Model | Base | Examples | Loss | Status |
|-------|------|---------|------|--------|
| v1 | Qwen3-8B | 233 | 0.10 | Retired (overfit) |
| v2 | Qwen3-8B | 361 | 2.03 | Retired |
| v3 | Qwen3-8B | 1,308 | 0.55 | Available on steel141 |
| **v4** | **Qwen3.5-9B** | **3,369** | **0.20** | **Deployed on prod** |

### Infrastructure

| Component | Location | Details |
|-----------|----------|---------|
| Training GPU | steel141 RTX 3090 Ti (24GB) | QLoRA via Unsloth 2026.3.8 |
| Prod inference | node-197 RTX 4000 (16GB) | Ollama, mortdecai-v4 |
| MC servers | CT 644 on node-112 | paper-ai:25567, shrink:25566, dev:25568, vanilla:25565 |
| Dev data collection | CT 644 | Gemini 3.1 Flash Lite (preview), 5 bots |
| Whitelist app | CT 644:8099 | minecraft.mortdec.ai |
| Caddy proxy | CT 600 on node-241 | mortdec.ai, minecraft.mortdec.ai |
| GPU monitoring | Grafana CT 300 (node-173) | Prometheus + nvidia exporter on steel141 |
| Greenfield map | paper-ai | Downloaded, world swapped, needs MCSManager start |
| WorldEdit schematics | paper-ai | 77 installed in FAWE/schematics/ |

### API Spend

| Provider | Spent | Budget | Status |
|----------|-------|--------|--------|
| Claude Haiku | $20.01 | $20 | Exhausted |
| Gemini (all) | ~$0.50 | $20 | Active on dev (3.1 Flash Lite) |

### Branding

- **Font:** Rajdhani Bold
- **Color:** #D35400 (Sethian orange)
- **Domain:** mortdec.ai → Gitea repo,
  minecraft.mortdec.ai → whitelist page
- **Public repo:** https://git.sethpc.xyz/Seth/Mortdecai

### Training Data: 2,397 seed + 1,159 tool-calling = 3,556 total

| Category | Count |
|----------|-------|
| Command syntax reference | 107 |
| Crafting recipes & chains | 176 |
| Enchantments (mutual exclusions, max levels) | 60 |
| Entities/mobs (summon, kill, NBT) | 60 |
| Execute chains | 45 |
| Multiplayer (selectors, teams, scoreboards) | 45 |
| Advanced commands (tellraw, clone, data, ride) | 45 |
| WorldEdit | 45 |
| Paper server features | 55 |
| Cosmetics/XP/effects | 42 |
| Gamerules (49) + risk hierarchy (40) | 89 |
| Quantity boundaries | 32 |
| Dangerous effect caps (levitation, wither, etc.) | 12 |
| Fall safety + drops + suffocation | 33 |
| Death/environment (drowning, lava, void, mobs, etc.) | 26 |
| Revert-aware gamerules + drops | 20 |
| Error correction pairs (enchant order, NBT, etc.) | 54 |
| Claude-distilled outputs | 344 |
| Bot audit interactions | 448+ |
| Boundary/safety/prompt injection | 95+ |
| Tool-calling (multi-turn with RCON) | 1,159 |

---

## Completed This Session

### Model & Training

- [x] Mortdecai v4 trained: Qwen3.5-9B, 3,369 examples, loss 0.20
- [x] v4 exported to GGUF Q4_K_M (5.3GB)
- [x] v4 deployed to prod (RTX 4000): paper-ai + shrink-world
- [x] Single-call mode enabled on prod
- [x] `/no_think` in all training data to suppress thinking tokens
- [x] Qwen3.5-9B base bake-off: 70.1% accuracy (2x Qwen3-8B)
- [~] v4 bake-off running on steel141

### Validator & Safety

- [x] Error correction: detects RCON errors, asks model to fix, retries
- [x] Broadened error patterns: `<--[HERE]` universal catch
- [x] `kill @a` blocked (players only)
- [x] `tp minecraft:spawn` → safe coordinates
- [x] Fire fallback won't trigger on "firework"
- [x] Dangerous effect caps: levitation 15s, wither 30s, poison 60s, nausea 30s
- [x] Fall protection: detects lethal tp, adds slow_falling unless intentional
- [x] Gamerule revert timers: auto-revert
  after 5-10 min (configurable)
- [x] Expanded safe_prefixes: gamerule, particle, playsound, title, scoreboard, team, bossbar, locate, etc.
- [x] Validator hit-rate tracking to /var/log/mc_validator_stats.json
- [x] Command format examples (RIGHT vs WRONG) in prompt
- [x] max_tokens bumped to 600 for command calls
- [x] Removed template workflow from sudo prompt

### Infrastructure

- [x] Ollama updated on steel141 + RTX 4000 (Qwen3.5 support)
- [x] GPU monitoring: nvtop + Grafana dashboard on steel141
- [x] Whitelist UUID fix: Mojang API lookup, patches all whitelist.json files
- [x] mortdec.ai + minecraft.mortdec.ai live with SSL
- [x] Public Mortdecai repo on Gitea with README
- [x] `status` command: shows model name, mode, validator stats in-game
- [x] Verbose pipeline logging: token counts, speed, elapsed time, think stripping
- [x] Greenfield world downloaded and installed on paper-ai
- [x] 77 WorldEdit schematics installed

### Training Data Added

- [x] Gamerules (49 examples): all major gamerules with natural language
- [x] Risk hierarchy (40): L0 blocked, L1 permanent, L2 temporary, prompt injection
- [x] Dangerous effects (12): levitation/wither/poison caps
- [x] Fall safety (25): height math, water/slime/hay awareness, intent detection
- [x] Suffocation (8): tp into blocks, sand/gravel crushing
- [x] Death/environment (26): drowning, lava, void, explosions, mobs, starvation, lightning
- [x] Revert-aware gamerules (8): revert_after field for v5
- [x] Drop/height (12): intentional drops, safe tp, slow_falling
- [x] Enchantment error correction (7): count-before-bracket, typos, old NBT

### Data Collection

- [x] API cascade: Haiku ($20) → Gemini ($20) → local
- [x] Switched dev to Gemini 3.1 Flash Lite (preview) with 5 bots
- [x] Dynamic pricing by Gemini model name
- [x] Async prayer/sudo processing (ThreadPoolExecutor, 3 workers)

### Branding

- [x] Model named Mortdecai
- [x] mortdec.ai domain purchased and configured
- [x] Rajdhani Bold as official font
- [x]
  Logo variants generated (6 fonts × 2 text versions)
- [x] Whitelist page branded with Mortdecai logo

---

## Active TODOs

### Immediate

- [~] v4 bake-off running; publish results to Gitea when complete
- [~] Fix v4 Modelfile chat template on RTX 4000 (fix applied, needs verification)
- [ ] Apply the same Modelfile chat template fix on steel141's Ollama instance

### Short-term (v5 prep)

- [ ] Shared memory system: per-server JSON, owner-tagged, location/preference/fact types
  - Player says "remember this is home" → AI writes location memory
  - Other players can reference: "tp me to slingshooter08's home"
  - Memory in context for location lookups, tool call for read/write
- [ ] `memory_write` field in model output schema
- [ ] Setblock training data expansion
- [ ] `world.check_block` tool for terrain queries before tp
- [ ] Self-play loop deployment (3-tier: drills, self-critique, adversarial)
- [ ] Ingest all Gemini 3.1 Flash Lite training data
- [ ] More error correction from production RCON failures

### Model v5 Training

- [ ] Train with tool-calling format (rcon.execute, wiki_lookup, world.player_info)
- [ ] `revert_after` / `revert_commands` in output schema
- [ ] Self-play generated data (200 rounds post-v4)
- [ ] Memory read/write training examples
- [ ] Ground-level terrain detection training
- [ ] Fall damage math in model reasoning (not just validator)
- [ ] Setblock + block state training
- [ ] More death mechanics awareness in reasoning

### Infrastructure

- [ ] GPU monitoring for RTX 4000 (second exporter)
- [ ] Validator hit-rate analysis: remove fixes that fire <1%
- [ ] Automate training pipeline: ingest → dedup → train → export → deploy
- [ ] POS receipt for Gemini milestones
- [ ] Start Greenfield world via MCSManager

### Content & Community

- [ ] Invite more playtesters via minecraft.mortdec.ai
- [ ] Update mortdec.ai README with v4 bake-off results
- [ ] Consider public HuggingFace release
- [ ] WorldEdit schematic library expansion

---

## Risk Hierarchy

Commands classified by permanence:

| Level | Permanence | Examples | Model behavior |
|:-----:|-----------|----------|----------------|
| **0** | Irreversible/admin | ban, kick, stop, op, deop, whitelist | Never execute |
| **1** | Permanent toggle | gamemode @a, permanent gamerules, difficulty | Execute for self only, refuse for @a |
| **2** | Temporary/reversible | gamerules with time limits, brief changes | Allow, schedule auto-revert |
| **3** | Transient | time, weather, tick speed, chat settings | Execute freely |
| **4** | Generous | full enchanted gear, large material stacks | Execute for worthy requests |

**Gamerule revert system:** Changes auto-revert after 5-10 min unless "permanently" specified. Player notified of countdown.

**Dangerous effect caps (hardcoded):** Levitation 15s, Wither 30s, Poison 60s, Nausea 30s.

**Fall protection:** Lethal tp detected → slow_falling added unless intent words present (drop, yeet, throw, kill me).

---

## Key Decisions

| Date | Decision | Rationale |
|------|----------|-----------|
| 03-18 | gemma3n:e4b for initial prod | Bake-off winner at 80.6% accuracy |
| 03-18 | Qwen3-8B for v1-v3 training | Best syntax quality, Apache 2.0 |
| 03-18 | God Soul document | Character framework from Claude's soul |
| 03-19 | API cascade for data collection | Haiku→Gemini→local fallback |
| 03-19 | Single-call mode | One LLM call for commands + message |
| 03-19 | Error correction via RCON | Model tries → error → self-corrects |
| 03-19 | 3-tier self-play | Drills, self-critique, adversarial |
| 03-20 | Qwen3.5-9B for v4 | 2x base accuracy, native tool-calling |
| 03-20 | Gamerule revert timers | Permanence determines risk level |
| 03-20 | Dangerous effect caps | Validator hardcodes max durations |
| 03-20 | Fall protection | Health check + intent detection before tp |
| 03-20 | Shared player memory (planned) | Owner-tagged, cross-player, AI-managed |
| 03-20 | Mortdecai branding | Rajdhani Bold, #D35400, mortdec.ai |

---

## Success Criteria

| Metric | v3 | v4 (target) | v5 (goal) |
|--------|:-:|:-:|:-:|
| Command accuracy | ~70% | 85%+ | 95%+ |
| Safety compliance | ~95% | 99%+ | 99.9%+ |
| Error self-correction | N/A | 50%+ | 80%+ |
| Response latency | 5-15s | <5s | <3s |
| Empty response rate | ~10% | <5% | <2% |
| Think token leakage | Yes | No | No |

---

*Updated as the project evolves. Check git history for previous versions.*
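
The dangerous effect caps and the fall-protection intent check listed under Risk Hierarchy could be sketched roughly as follows. This is a minimal illustration, not the project's validator: the function names `cap_effect_duration` and `has_fall_intent`, the vanilla effect IDs, and the assumed `effect give <target> <effect> <seconds>` command shape are all hypothetical. Only the cap values and the intent words are taken from this plan.

```python
import re

# Cap values in seconds, as listed in this plan. The effect IDs are assumed
# to be the vanilla Minecraft names.
EFFECT_CAPS = {"levitation": 15, "wither": 30, "poison": 60, "nausea": 30}

# Intent words from the fall-protection rule in this plan.
FALL_INTENT_WORDS = ("drop", "yeet", "throw", "kill me")

# Assumed command shape: effect give <target> [minecraft:]<effect> <seconds> [...]
_EFFECT_RE = re.compile(r"^(effect give \S+ (?:minecraft:)?(\w+)) (\d+)(.*)$")

# Whole-word matching, so "raindrops" does not count as "drop".
_INTENT_RE = re.compile(
    r"\b(?:" + "|".join(re.escape(w) for w in FALL_INTENT_WORDS) + r")\b",
    re.IGNORECASE,
)


def cap_effect_duration(command: str) -> str:
    """Clamp the duration of a capped effect; return other commands unchanged.

    Non-numeric durations (for example `infinite`) are not handled here.
    """
    match = _EFFECT_RE.match(command.strip())
    if not match:
        return command
    head, effect, seconds, tail = match.groups()
    cap = EFFECT_CAPS.get(effect)
    if cap is None:
        return command
    return f"{head} {min(int(seconds), cap)}{tail}"


def has_fall_intent(request: str) -> bool:
    """Return True if the player's request contains one of the intent words."""
    return _INTENT_RE.search(request) is not None
```

A caller might run `cap_effect_duration` on every command before it is sent over RCON, and consult `has_fall_intent` on the player's original request before adding slow_falling to a lethal teleport. Whole-word matching is used here for the same reason the plan notes that the fire fallback should not trigger on "firework".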