Files
Seth 5b28002001 0.6.0 training session: Oracle Bot, RL combat, Mind's Eye, multilingual pipeline
Major changes from this session:

Training:
- 0.6.0 training running: 9B on steel141 3090 Ti, 27B on rented H100 NVL
- 7,256 merged training examples (up from 3,183)
- New training data: failure modes (85), midloop messaging (27),
  prompt injection defense (29), personality (32), gold from quarantine
  bank (232), new tool examples (30), claude's own experience (10)
- All training data RCON-validated at 100% pass rate
- Bake-off: gemma3:27b 66%, qwen3.5:27b 61%, translategemma:27b 56%

Oracle Bot (Mind's Eye):
- Invisible spectator bot (mineflayer) streams world state via WebSocket
- HTML5 Canvas frontend at mind.mortdec.ai
- Real-time tool trace visualization with expandable entries
- Streaming model tokens during inference
- Gateway integration: fire-and-forget POST /trace on every tool call

Reinforcement Learning:
- Gymnasium environment wrapping mineflayer bot (minecraft_env.py)
- PPO training via Stable Baselines3 (10K param policy network)
- Behavioral cloning pretraining (97.5% accuracy on expert policy)
- Infinite training loop with auto-restart and checkpoint resume
- Bot learns combat, survival, navigation from raw experience

Bot Army:
- 8-soldier marching formation with autonomous combat
- Combat bots using mineflayer-pvp, pathfinder, armor-manager
- Multilingual prayer bots via translategemma:27b (18 languages)
- Frame-based AI architecture: LLM planner + reactive micro-scripts

Infrastructure:
- Fixed mattpc.sethpc.xyz billing gateway (API key + player list parser)
- Billing gateway now tracks all LAN traffic (LAN auto-auth)
- Gateway fallback for empty god-mode responses
- Updated mortdec.ai landing page

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-22 20:22:50 -04:00

7.0 KiB

Mortdecai — Agent Context

Single source of truth for AI agents working on this project. When docs disagree, trust this file > implementation > historical docs.

Project Identity

Mortdecai is a fine-tuned Qwen3.5-9B (and upcoming 14B) language model for Minecraft server operations. It runs inside a Paper 1.21 server via a LangGraph-style gateway, responding to player prayers (god mode) and commands (sudo mode) using 24 tools. The model is trained via QLoRA on curated + Claude-distilled + RCON-validated data.

This is NOT a chatbot, NOT a general assistant. It is a domain-specific Minecraft operations agent.

Current State (2026-03-22)

  • Current model: mortdecai:0.5.0 (Qwen3.5-9B, QLoRA fine-tune)
  • Base model: Qwen/Qwen3.5-9B (HuggingFace)
  • Known issue: 30% empty god-mode responses (think-block token drain, no freeform text training)
  • Next version: 0.6.0 (9B + 14B, training on rented H100)
  • Tool count: 24 (all deployed and wired into gateway)

Canonical Files (trust these)

File Domain Trust
agent/tools/tool_schemas.py Tool capability inventory Canonical — 24 tools
langgraph_gateway.py (in Sethpc-Minecraft-PaperFork repo) Runtime orchestrator Canonical — production code
mc_aigod_paper.py (in Sethpc-Minecraft-PaperFork repo) Server-side AI handler Canonical
training/scripts/validate_all_training.py Data quality validator Current
training/scripts/merge_datasets.py Training data merge Current
training/scripts/train_lora.py Training script Current

Historical Files (useful but potentially stale)

File Note
PLAN.md Reflects 0.4.0→0.5.0 era planning. Not current deployment truth.
SESSION.md Historical session log. May reference old architecture.
README.md Public-facing. Tool count may lag behind tool_schemas.py.
MODEL_CARD.md Reflects 0.5.0 release. Update on each version bump.
data/schema.json LEGACY — does not validate current data. Use validate_all_training.py instead.
data/validate_dataset.py LEGACY — replaced by training/scripts/validate_all_training.py.
agent/serve.py Reference only — not the production runtime. Production runs in PaperFork repo.

Runtime Architecture

Player chat → Paper server (mc_aigod_paper.py) → LangGraph Gateway (langgraph_gateway.py)
                                                      ↓
                                               Mortdecai model (Ollama)
                                                      ↓
                                               Tool loop (24 tools, max 50 steps)
                                                      ↓
                                               RCON execution → Minecraft server
  • Production entrypoint: mc_aigod_paper.py watches server log, dispatches to gateway
  • Gateway: langgraph_gateway.py on port 8091 (internal to CT 644)
  • Model inference: Ollama on steel141 (3090 Ti F16 for dev, RTX 4000 Q4 for prod)
  • This repo contains: tools, schemas, training data, training scripts, eval harness
  • The PaperFork repo contains: runtime (gateway + server handler)

24 Tools (all deployed, all wired into gateway)

Tool Status In Training Data
rcon.execute Production Yes (heavy)
minecraft.lookup Production Partial (was wiki_lookup)
plugin.docs_lookup Production Yes
world.player_info (+ inventory) Production Yes
world.server_state Production Yes
world.nearby_entities Production Yes
world.scan_area Production Needs 0.6.0 examples
world.redstone_trace Production Needs 0.6.0 examples
world.render Production Needs 0.6.0 examples
server.config Production Needs 0.6.0 examples
memory.read Production Yes
memory.write Production Yes
journal.read Production 120 multitool examples
journal.write Production 120 multitool examples
log.query Production Needs more examples
user.ask Production Needs more examples
script.write Production Yes
script.validate Production Yes
script.execute Production Yes
script.read Production Minimal
script.list Production Minimal
script.delete Production Minimal
script.schedule (tick/load/delay) Production Minimal
training.save Dev only (config toggle) Needs 0.6.0 examples

Data Map

Raw data: data/raw/ — individual JSONL files from various sources Processed data: data/processed/ — merged, filtered, validated Quarantine: data/quarantine/ — failed validation, some salvageable External: data/external/ — IGLU Microsoft Research dataset (nested git repo)

Known data issues (from validator run):

  • 2,891 commands use @s (should be player name)
  • 2,633 commands use enchantment syntax Paper RCON rejects
  • 24,476 examples in old dict format (need conversion to messages[] chat)
  • 7,647 examples have outdated system prompts (missing new tools)

Ignore List

  • __pycache__/ — committed by accident, not meaningful
  • USER_NOTES_IGNORE_ME/ — private notes, not project context
  • data/external/iglu-repo/ — external dataset, read-only
  • eval/results/ — historical eval outputs, may be stale
  • data/processed/pipeline_output.jsonl — 7,032 examples with RCON connection failures marked as success. Do NOT trust.
  • data/raw/scraped_*.jsonl — empty files
  • web/ — admin/community tools, not model-related

Known Problems

  1. Secrets in tracked files — RCON passwords, API keys hardcoded. Should be env vars.
  2. README/MODEL_CARD tool count lag — says 17, reality is 24. Update on release.
  3. agent/serve.py misleads — looks like main entrypoint but isn't. Real runtime is in PaperFork repo.
  4. data/schema.json is legacy — doesn't validate current data. Replaced by validate_all_training.py.
  5. pipeline_output.jsonl is poisoned — connection failures marked as success.

Working Rules for AI Agents

  1. agent/tools/tool_schemas.py is the tool inventory. Not README, not PLAN.md.
  2. The production runtime is NOT in this repo. It's in Sethpc-Minecraft-PaperFork/.
  3. Paper RCON cannot give enchanted items. Use plain give + effect combos.
  4. @s does not work via RCON (no executor context). Training data uses @p as a pragmatic fix, but @p selects nearest player — not always the requester. Prefer explicit player names in new training data.
  5. Dev world is named devworld, not world. WorldGuard needs -w devworld.
  6. When in doubt about a command, RCON-validate it on dev (192.168.0.244:25578, pass REDACTED_RCON).
  7. Keep seed_dataset dominant in training mix to prevent fill_build regression.
  8. The model (0.5.0) cannot produce freeform text in sudo mode — it only outputs JSON. Training 0.6.0 fixes this.
  9. data/processed/pipeline_output.jsonl is poisoned (7,032 examples with RCON connection failures marked success). Excluded from training.