Files

T

Seth 5b28002001 0.6.0 training session: Oracle Bot, RL combat, Mind's Eye, multilingual pipeline

Major changes from this session:

Training:
- 0.6.0 training running: 9B on steel141 3090 Ti, 27B on rented H100 NVL
- 7,256 merged training examples (up from 3,183)
- New training data: failure modes (85), midloop messaging (27),
  prompt injection defense (29), personality (32), gold from quarantine
  bank (232), new tool examples (30), claude's own experience (10)
- All training data RCON-validated at 100% pass rate
- Bake-off: gemma3:27b 66%, qwen3.5:27b 61%, translategemma:27b 56%

Oracle Bot (Mind's Eye):
- Invisible spectator bot (mineflayer) streams world state via WebSocket
- HTML5 Canvas frontend at mind.mortdec.ai
- Real-time tool trace visualization with expandable entries
- Streaming model tokens during inference
- Gateway integration: fire-and-forget POST /trace on every tool call

Reinforcement Learning:
- Gymnasium environment wrapping mineflayer bot (minecraft_env.py)
- PPO training via Stable Baselines3 (10K param policy network)
- Behavioral cloning pretraining (97.5% accuracy on expert policy)
- Infinite training loop with auto-restart and checkpoint resume
- Bot learns combat, survival, navigation from raw experience

Bot Army:
- 8-soldier marching formation with autonomous combat
- Combat bots using mineflayer-pvp, pathfinder, armor-manager
- Multilingual prayer bots via translategemma:27b (18 languages)
- Frame-based AI architecture: LLM planner + reactive micro-scripts

Infrastructure:
- Fixed mattpc.sethpc.xyz billing gateway (API key + player list parser)
- Billing gateway now tracks all LAN traffic (LAN auto-auth)
- Gateway fallback for empty god-mode responses
- Updated mortdec.ai landing page

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

2026-03-22 20:22:50 -04:00

7.0 KiB

Raw Permalink Blame History

Mortdecai — Agent Context

Single source of truth for AI agents working on this project. When docs disagree, trust this file > implementation > historical docs.

Project Identity

Mortdecai is a fine-tuned Qwen3.5-9B (and upcoming 14B) language model for Minecraft server operations. It runs inside a Paper 1.21 server via a LangGraph-style gateway, responding to player prayers (god mode) and commands (sudo mode) using 24 tools. The model is trained via QLoRA on curated + Claude-distilled + RCON-validated data.

This is NOT a chatbot, NOT a general assistant. It is a domain-specific Minecraft operations agent.

Current State (2026-03-22)

Current model: mortdecai:0.5.0 (Qwen3.5-9B, QLoRA fine-tune)
Base model: Qwen/Qwen3.5-9B (HuggingFace)
Known issue: 30% empty god-mode responses (think-block token drain, no freeform text training)
Next version: 0.6.0 (9B + 14B, training on rented H100)
Tool count: 24 (all deployed and wired into gateway)

Canonical Files (trust these)

File	Domain	Trust
`agent/tools/tool_schemas.py`	Tool capability inventory	Canonical — 24 tools
`langgraph_gateway.py` (in Sethpc-Minecraft-PaperFork repo)	Runtime orchestrator	Canonical — production code
`mc_aigod_paper.py` (in Sethpc-Minecraft-PaperFork repo)	Server-side AI handler	Canonical
`training/scripts/validate_all_training.py`	Data quality validator	Current
`training/scripts/merge_datasets.py`	Training data merge	Current
`training/scripts/train_lora.py`	Training script	Current

Historical Files (useful but potentially stale)

File	Note
`PLAN.md`	Reflects 0.4.0→0.5.0 era planning. Not current deployment truth.
`SESSION.md`	Historical session log. May reference old architecture.
`README.md`	Public-facing. Tool count may lag behind tool_schemas.py.
`MODEL_CARD.md`	Reflects 0.5.0 release. Update on each version bump.
`data/schema.json`	LEGACY — does not validate current data. Use validate_all_training.py instead.
`data/validate_dataset.py`	LEGACY — replaced by training/scripts/validate_all_training.py.
`agent/serve.py`	Reference only — not the production runtime. Production runs in PaperFork repo.

Runtime Architecture

Player chat → Paper server (mc_aigod_paper.py) → LangGraph Gateway (langgraph_gateway.py)
                                                      ↓
                                               Mortdecai model (Ollama)
                                                      ↓
                                               Tool loop (24 tools, max 50 steps)
                                                      ↓
                                               RCON execution → Minecraft server

Production entrypoint: mc_aigod_paper.py watches server log, dispatches to gateway
Gateway: langgraph_gateway.py on port 8091 (internal to CT 644)
Model inference: Ollama on steel141 (3090 Ti F16 for dev, RTX 4000 Q4 for prod)
This repo contains: tools, schemas, training data, training scripts, eval harness
The PaperFork repo contains: runtime (gateway + server handler)

24 Tools (all deployed, all wired into gateway)

Tool	Status	In Training Data
rcon.execute	Production	Yes (heavy)
minecraft.lookup	Production	Partial (was wiki_lookup)
plugin.docs_lookup	Production	Yes
world.player_info (+ inventory)	Production	Yes
world.server_state	Production	Yes
world.nearby_entities	Production	Yes
world.scan_area	Production	Needs 0.6.0 examples
world.redstone_trace	Production	Needs 0.6.0 examples
world.render	Production	Needs 0.6.0 examples
server.config	Production	Needs 0.6.0 examples
memory.read	Production	Yes
memory.write	Production	Yes
journal.read	Production	120 multitool examples
journal.write	Production	120 multitool examples
log.query	Production	Needs more examples
user.ask	Production	Needs more examples
script.write	Production	Yes
script.validate	Production	Yes
script.execute	Production	Yes
script.read	Production	Minimal
script.list	Production	Minimal
script.delete	Production	Minimal
script.schedule (tick/load/delay)	Production	Minimal
training.save	Dev only (config toggle)	Needs 0.6.0 examples

Data Map

Raw data: data/raw/ — individual JSONL files from various sources Processed data: data/processed/ — merged, filtered, validated Quarantine: data/quarantine/ — failed validation, some salvageable External: data/external/ — IGLU Microsoft Research dataset (nested git repo)

Known data issues (from validator run):

2,891 commands use @s (should be player name)
2,633 commands use enchantment syntax Paper RCON rejects
24,476 examples in old dict format (need conversion to messages[] chat)
7,647 examples have outdated system prompts (missing new tools)

Ignore List

__pycache__/ — committed by accident, not meaningful
USER_NOTES_IGNORE_ME/ — private notes, not project context
data/external/iglu-repo/ — external dataset, read-only
eval/results/ — historical eval outputs, may be stale
data/processed/pipeline_output.jsonl — 7,032 examples with RCON connection failures marked as success. Do NOT trust.
data/raw/scraped_*.jsonl — empty files
web/ — admin/community tools, not model-related

Known Problems

Secrets in tracked files — RCON passwords, API keys hardcoded. Should be env vars.
README/MODEL_CARD tool count lag — says 17, reality is 24. Update on release.
agent/serve.py misleads — looks like main entrypoint but isn't. Real runtime is in PaperFork repo.
data/schema.json is legacy — doesn't validate current data. Replaced by validate_all_training.py.
pipeline_output.jsonl is poisoned — connection failures marked as success.

Working Rules for AI Agents

agent/tools/tool_schemas.py is the tool inventory. Not README, not PLAN.md.
The production runtime is NOT in this repo. It's in Sethpc-Minecraft-PaperFork/.
Paper RCON cannot give enchanted items. Use plain give + effect combos.
@s does not work via RCON (no executor context). Training data uses @p as a pragmatic fix, but @p selects nearest player — not always the requester. Prefer explicit player names in new training data.
Dev world is named devworld, not world. WorldGuard needs -w devworld.
When in doubt about a command, RCON-validate it on dev (192.168.0.244:25578, pass REDACTED_RCON).
Keep seed_dataset dominant in training mix to prevent fill_build regression.
The model (0.5.0) cannot produce freeform text in sudo mode — it only outputs JSON. Training 0.6.0 fixes this.
data/processed/pipeline_output.jsonl is poisoned (7,032 examples with RCON connection failures marked success). Excluded from training.

7.0 KiB Raw Permalink Blame History