Mortdecai

Author	SHA1	Message	Date
Seth	750cf15c79	1,542 seed + 1,159 tool-calling examples, async processing, validator tracking	2026-03-19 19:03:30 -04:00

    New knowledge baked in:
    - Enchantments (60): all 1.21 enchants, mutual exclusions, max levels, component syntax
    - WorldEdit (45): //set, //replace, //sphere, //stack, selection, brushes
    - Paper server (55): gamerules, permissions, plugins, scoreboard, moderation
    - Cosmetics/XP (42): title, tellraw, playsound, particle, xp, effect mechanics
    - Quantity boundaries (32): item tier caps, greedy→stingy, humble→generous

    Training infrastructure:
    - train_lora.py updated for multi-turn tool conversations + seed data
    - Async prayer/sudo processing (ThreadPoolExecutor, 3 workers)
    - Validator hit-rate tracking to /var/log/mc_validator_stats.json

    Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Seth	e28836106f	Risk_level in all 644 examples + model outputs risk classification	2026-03-18 22:35:50 -04:00

    - All 644 examples tagged: 0=blocked(15), 1=refuse(33), 2=warn(24), 3=normal(498), 4=generous(74)
    - Training output now includes risk_level field for decision transparency
    - Model learns to classify risk before generating commands
    - Validator can sanity-check: risk 0-1 should have empty commands

    Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Seth	62419976e5	361 training examples, default to 1 epoch	2026-03-18 18:03:33 -04:00

    Ingested 128 new examples from bot-driven data collection.
    Dropped: 86 duplicates, 19 language mismatches, 10 prompt leaks, 19 empty.
    Changed default epochs from 3 to 1 (previous run overfit at loss 0.10).

    Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Seth	142e4fd3c4	Fix training script: bf16 for Ampere GPU, add system prompts to training data	2026-03-18 16:26:47 -04:00

    - Switch fp16 to bf16 (RTX 3090 Ti is Ampere, supports BF16 natively)
    - Include system prompt in training conversations (mode-aware: sudo/god/god_system)
    - Include message field only for god modes
    - Add determine_mode() and get_system_prompt() helpers

    Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Seth	48b627d498	Add LoRA training scripts and fix bake-off token budget	2026-03-18 10:40:18 -04:00

    - training/scripts/train_lora.py: Unsloth QLoRA trainer for qwen3:8b
    - training/scripts/train_lora.sh: Launch script for steel141 RTX 3090 Ti
    - eval/bakeoff.py: Fixed token budget (400->1500) that caused qwen3 models to exhaust tokens on thinking, added --no-think flag
    - agent/serve.py: Default model changed to gemma3n:e4b

    Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
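
Commit e28836106f notes that the validator can sanity-check model output: risk levels 0 and 1 should come with empty commands. The repository's actual validator is not shown in this log, so the following is only a minimal sketch of that rule. The function name `risk_is_consistent` and the `commands` list field are assumptions for illustration; only `risk_level` and the five level names come from the commit message.

```python
from typing import Any

# Risk levels as listed in commit e28836106f.
RISK_LABELS = {0: "blocked", 1: "refuse", 2: "warn", 3: "normal", 4: "generous"}


def risk_is_consistent(output: dict[str, Any]) -> bool:
    """Return True when risk_level and the command list agree.

    Assumption: the model output is a dict with an integer `risk_level`
    and a `commands` list. Only `risk_level` is named in the commit log.
    """
    risk = output.get("risk_level")
    # Reject missing, non-integer, or out-of-range risk levels.
    if not isinstance(risk, int) or isinstance(risk, bool) or risk not in RISK_LABELS:
        return False
    commands = output.get("commands") or []
    # Blocked (0) and refuse (1) must not carry any commands.
    if risk <= 1:
        return len(commands) == 0
    return True
```

Under these assumptions, `risk_is_consistent({"risk_level": 1, "commands": ["give @p diamond 64"]})` returns `False`, because a refusal should not emit commands.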

5 Commits
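
Commit 750cf15c79 mentions async prayer/sudo processing with a `ThreadPoolExecutor` and 3 workers. The actual handler code is not part of this log, so the sketch below only illustrates the general pattern. `handle_request`, `process_all`, and the request fields `player` and `mode` are hypothetical stand-ins, not names from the repository.

```python
from concurrent.futures import ThreadPoolExecutor

MAX_WORKERS = 3  # worker count stated in the commit message


def handle_request(request: dict) -> dict:
    """Hypothetical handler; a real one would call the model and validator."""
    return {"player": request["player"], "mode": request["mode"], "commands": []}


def process_all(requests: list[dict]) -> list[dict]:
    """Run handlers concurrently on a 3-worker pool, keeping input order."""
    with ThreadPoolExecutor(max_workers=MAX_WORKERS) as pool:
        # Executor.map yields results in the order the inputs were given.
        return list(pool.map(handle_request, requests))
```

A call such as `process_all([{"player": "alex", "mode": "god"}, {"player": "steve", "mode": "sudo"}])` returns one result per request in the same order. A thread pool suits this kind of workload when each request mostly waits on model inference or network I/O.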