Files
Mortdecai/PLAN.md
T
Seth f39809eaca Semver rename: v1-v5 → 0.1.0-0.5.0 across all files
Versioning scheme: semantic versioning (MAJOR.MINOR.PATCH)
- 0.x.0 = pre-release development
- 1.0.0 = first public/monetized release

Renamed everywhere: PLAN.md, training scripts, self-play, overnight script,
status printer, whitelist app, discord bot, all training data references.

Ollama models retagged: mortdecai-v4 → mortdecai:0.4.0
Server configs updated on all three servers.
Self-play restarted with new model name.
Entity targeting + radius-aware kill + distance scale training added.

Seed dataset: 2,503 + tool: 1,159 + self-play: 5,059 = 8,721 total examples

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 21:37:14 -04:00

10 KiB
Raw Blame History

PLAN.md — Mortdecai Project Roadmap

Last updated: 2026-03-20 04:45 UTC Model name: Mortdecai Domain: mortdec.ai Status legend: [ ] planned | [~] in progress | [x] done | [-] cancelled/deferred


Vision

Mortdecai is a fine-tuned 9B parameter language model for Minecraft server operations. It translates natural language to commands, controls an AI God character, self-corrects errors via RCON feedback, and improves through self-play. Runs locally on consumer hardware with zero cloud dependencies at inference time.


Current State

Models

Model Base Examples Loss Status
0.1.0 Qwen3-8B 233 0.10 Retired (overfit)
0.2.0 Qwen3-8B 361 2.03 Retired
0.3.0 Qwen3-8B 1,308 0.55 Available on steel141
0.4.0 Qwen3.5-9B 3,369 0.20 Deployed on prod

Infrastructure

Component Location Details
Training GPU steel141 RTX 3090 Ti (24GB) QLoRA via Unsloth 2026.3.8
Prod inference node-197 RTX 4000 (16GB) Ollama, mortdecai:0.4.0
MC servers CT 644 on node-112 paper-ai:25567, shrink:25566, dev:25568, vanilla:25565
Dev data collection CT 644 Gemini 3.1 Flash Lite (preview), 5 bots
Whitelist app CT 644:8099 minecraft.mortdec.ai
Caddy proxy CT 600 on node-241 mortdec.ai, minecraft.mortdec.ai
GPU monitoring Grafana CT 300 (node-173) Prometheus + nvidia exporter on steel141
Greenfield map paper-ai Downloaded, world swapped, needs MCSManager start
WorldEdit schematics paper-ai 77 installed in FAWE/schematics/

API Spend

Provider Spent Budget Status
Claude Haiku $20.01 $20 Exhausted
Gemini (all) ~$0.50 $20 Active on dev (3.1 Flash Lite)

Branding

Training Data: 2,397 seed + 1,159 tool-calling = 3,556 total

Category Count
Command syntax reference 107
Crafting recipes & chains 176
Enchantments (mutual exclusions, max levels) 60
Entities/mobs (summon, kill, NBT) 60
Execute chains 45
Multiplayer (selectors, teams, scoreboards) 45
Advanced commands (tellraw, clone, data, ride) 45
WorldEdit 45
Paper server features 55
Cosmetics/XP/effects 42
Gamerules (49) + risk hierarchy (40) 89
Quantity boundaries 32
Dangerous effect caps (levitation, wither, etc.) 12
Fall safety + drops + suffocation 33
Death/environment (drowning, lava, void, mobs, etc.) 26
Revert-aware gamerules + drops 20
Error correction pairs (enchant order, NBT, etc.) 54
Claude-distilled outputs 344
Bot audit interactions 448+
Boundary/safety/prompt injection 95+
Tool-calling (multi-turn with RCON) 1,159

Completed This Session

Model & Training

  • Mortdecai 0.4.0 trained: Qwen3.5-9B, 3,369 examples, loss 0.20
  • 0.4.0 exported to GGUF Q4_K_M (5.3GB)
  • 0.4.0 deployed to prod (RTX 4000) — paper-ai + shrink-world
  • Single-call mode enabled on prod
  • /no_think in all training data to suppress thinking tokens
  • Qwen3.5-9B base bake-off: 70.1% accuracy (2x Qwen3-8B)
  • [~] 0.4.0 bake-off running on steel141

Validator & Safety

  • Error correction: detects RCON errors, asks model to fix, retries
  • Broadened error patterns: <--[HERE] universal catch
  • kill @a blocked (players only)
  • tp minecraft:spawn → safe coordinates
  • Fire fallback won't trigger on "firework"
  • Dangerous effect caps: levitation 15s, wither 30s, poison 60s, nausea 30s
  • Fall protection: detects lethal tp, adds slow_falling unless intentional
  • Gamerule revert timers: auto-revert after 5-10 min (configurable)
  • Expanded safe_prefixes: gamerule, particle, playsound, title, scoreboard, team, bossbar, locate, etc.
  • Validator hit-rate tracking to /var/log/mc_validator_stats.json
  • Command format examples (RIGHT vs WRONG) in prompt
  • max_tokens bumped to 600 for command calls
  • Removed template workflow from sudo prompt

Infrastructure

  • Ollama updated on steel141 + RTX 4000 (Qwen3.5 support)
  • GPU monitoring: nvtop + Grafana dashboard on steel141
  • Whitelist UUID fix: Mojang API lookup, patches all whitelist.json files
  • mortdec.ai + minecraft.mortdec.ai live with SSL
  • Public Mortdecai repo on Gitea with README
  • status command: shows model name, mode, validator stats in-game
  • Verbose pipeline logging: token counts, speed, elapsed time, think stripping
  • Greenfield world downloaded and installed on paper-ai
  • 77 WorldEdit schematics installed

Training Data Added

  • Gamerules (49 examples): all major gamerules with natural language
  • Risk hierarchy (40): L0 blocked, L1 permanent, L2 temporary, prompt injection
  • Dangerous effects (12): levitation/wither/poison caps
  • Fall safety (25): height math, water/slime/hay awareness, intent detection
  • Suffocation (8): tp into blocks, sand/gravel crushing
  • Death/environment (26): drowning, lava, void, explosions, mobs, starvation, lightning
  • Revert-aware gamerules (8): revert_after field for 0.5.0
  • Drop/height (12): intentional drops, safe tp, slow_falling
  • Enchantment error correction (7): count-before-bracket, typos, old NBT

Data Collection

  • API cascade: Haiku ($20) → Gemini ($20) → local
  • Switched dev to Gemini 3.1 Flash Lite (preview) with 5 bots
  • Dynamic pricing by Gemini model name
  • Async prayer/sudo processing (ThreadPoolExecutor, 3 workers)

Branding

  • Model named Mortdecai
  • mortdec.ai domain purchased and configured
  • Rajdhani Bold as official font
  • Logo variants generated (6 fonts × 2 text versions)
  • Whitelist page branded with Mortdecai logo

Active TODOs

Immediate

  • [~] 0.4.0 bake-off running — publish results to Gitea when complete
  • Fix 0.4.0 Modelfile chat template on RTX 4000 (done, needs verification)
  • Also fix on steel141's Ollama instance

Short-term (0.5.0 prep)

  • Shared memory system: per-server JSON, owner-tagged, location/preference/fact types
    • Player says "remember this is home" → AI writes location memory
    • Other players can reference: "tp me to slingshooter08's home"
    • Memory in context for location lookups, tool call for read/write
  • memory_write field in model output schema
  • Setblock training data expansion
  • world.check_block tool for terrain queries before tp
  • Self-play loop deployment (3-tier: drills, self-critique, adversarial)
  • Ingest all Gemini 3.1 Flash Lite training data
  • More error correction from production RCON failures

Model 0.5.0 Training

  • Train with tool-calling format (rcon.execute, wiki_lookup, world.player_info)
  • revert_after / revert_commands in output schema
  • Self-play generated data (200 rounds post-0.4.0)
  • Memory read/write training examples
  • Ground-level terrain detection training
  • Fall damage math in model reasoning (not just validator)
  • Setblock + block state training
  • More death mechanics awareness in reasoning

Infrastructure

  • GPU monitoring for RTX 4000 (second exporter)
  • Validator hit-rate analysis — remove fixes that fire <1%
  • Automate training pipeline: ingest → dedup → train → export → deploy
  • POS receipt for Gemini milestones
  • Start Greenfield world via MCSManager

Content & Community

  • Invite more playtesters via minecraft.mortdec.ai
  • Update mortdec.ai README with 0.4.0 bake-off results
  • Consider public HuggingFace release
  • WorldEdit schematic library expansion

Risk Hierarchy

Commands classified by permanence:

Level Permanence Examples Model behavior
0 Irreversible/admin ban, kick, stop, op, deop, whitelist Never execute
1 Permanent toggle gamemode @a, permanent gamerules, difficulty Execute for self only, refuse for @a
2 Temporary/reversible gamerules with time limits, brief changes Allow, schedule auto-revert
3 Transient time, weather, tick speed, chat settings Execute freely
4 Generous full enchanted gear, large material stacks Execute for worthy requests

Gamerule revert system: Changes auto-revert after 5-10 min unless "permanently" specified. Player notified of countdown.

Dangerous effect caps (hardcoded): Levitation 15s, Wither 30s, Poison 60s, Nausea 30s.

Fall protection: Lethal tp detected → slow_falling added unless intent words present (drop, yeet, throw, kill me).


Key Decisions

Date Decision Rationale
03-18 gemma3n:e4b for initial prod Bake-off winner at 80.6% accuracy
03-18 Qwen3-8B for 0.1.0-0.3.0 training Best syntax quality, Apache 2.0
03-18 God Soul document Character framework from Claude's soul
03-19 API cascade for data collection Haiku→Gemini→local fallback
03-19 Single-call mode One LLM call for commands + message
03-19 Error correction via RCON Model tries → error → self-corrects
03-19 3-tier self-play Drills, self-critique, adversarial
03-20 Qwen3.5-9B for 0.4.0 2x base accuracy, native tool-calling
03-20 Gamerule revert timers Permanence determines risk level
03-20 Dangerous effect caps Validator hardcodes max durations
03-20 Fall protection Health check + intent detection before tp
03-20 Shared player memory (planned) Owner-tagged, cross-player, AI-managed
03-20 Mortdecai branding Rajdhani Bold, #D35400, mortdec.ai

Success Criteria

Metric 0.3.0 0.4.0 (target) 0.5.0 (goal)
Command accuracy ~70% 85%+ 95%+
Safety compliance ~95% 99%+ 99.9%+
Error self-correction N/A 50%+ 80%+
Response latency 5-15s <5s <3s
Empty response rate ~10% <5% <2%
Think token leakage Yes No No

Updated as the project evolves. Check git history for previous versions.