Mortdecai

Files

Seth 9abf9238c5 3-tier self-play: command drills, self-critique, adversarial

Tier 1 — Command drills:
  Random seed prompts → generate commands → RCON validates
  Teaches: accurate command syntax

Tier 2 — Single-shot self-critique:
  Model invents a tricky prompt AND responds in one call
  RCON validates the self-generated commands
  Teaches: edge-case awareness, self-evaluation

Tier 3 — Adversarial self-play:
  Session A generates challenging prompts
  Fresh Session B responds cold, without Session A's context (can't cheat)
  RCON validates, self-corrects on errors
  Teaches: robustness, generalization
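
As a rough illustration of the Tier 3 flow, the sketch below keeps the two sessions separate so that Session B never sees Session A's history. Everything here is hypothetical: the `Session` class, `adversarial_round`, the `validate` callback (standing in for the RCON check), and the `max_retries` limit are names invented for this example and are not taken from the repository's scripts.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Session:
    """Hypothetical stand-in for one model conversation with its own history."""
    respond: Callable[[str], str]
    history: List[str] = field(default_factory=list)

    def ask(self, message: str) -> str:
        # Record both sides of the exchange in this session only.
        self.history.append(message)
        reply = self.respond(message)
        self.history.append(reply)
        return reply


def adversarial_round(
    make_session_a: Callable[[], Session],
    make_session_b: Callable[[], Session],
    validate: Callable[[str], Tuple[bool, str]],
    max_retries: int = 1,  # assumed limit, not from the source
) -> dict:
    """Run one round: A writes the prompt, a fresh B answers it cold."""
    session_a = make_session_a()
    prompt = session_a.ask("Write one challenging prompt.")

    # A new session is created here so B shares none of A's history.
    session_b = make_session_b()
    command = session_b.ask(prompt)

    # `validate` stands in for the RCON check and returns (ok, error_text).
    ok, error = validate(command)
    retries = 0
    while not ok and retries < max_retries:
        command = session_b.ask(f"The command failed: {error}. Correct it.")
        ok, error = validate(command)
        retries += 1

    return {"prompt": prompt, "command": command, "valid": ok, "retries": retries}
```

Passing factories rather than session objects makes the isolation explicit: each round builds Session B from scratch, which is one simple way to ensure it responds cold.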

Usage: --tier 1|2|3|all --rounds N --focus category
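
One way the flags in the usage line could be parsed is sketched below with Python's standard `argparse` module. Only the flag names and the `1|2|3|all` choices come from the usage line; the default values, the help text, and the `build_parser` and `tiers_to_run` helpers are assumptions made for illustration.

```python
import argparse
from typing import List


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the flags named in the usage line."""
    parser = argparse.ArgumentParser(description="Self-play runner (sketch)")
    parser.add_argument(
        "--tier",
        choices=["1", "2", "3", "all"],
        default="all",  # assumed default
        help="which self-play tier to run",
    )
    parser.add_argument(
        "--rounds",
        type=int,
        default=10,  # assumed default
        help="number of rounds per tier",
    )
    parser.add_argument(
        "--focus",
        default=None,  # assumed: no category filter unless given
        help="restrict prompts to one category",
    )
    return parser


def tiers_to_run(tier: str) -> List[int]:
    """Expand the --tier value into the list of tier numbers to run."""
    return [1, 2, 3] if tier == "all" else [int(tier)]
```

Keeping `--tier` as a string choice lets `all` sit alongside the numeric values, with `tiers_to_run` doing the conversion in one place.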

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

2026-03-19 19:39:33 -04:00

configs

Initial project scaffold: dataset schema, 31 seed training examples, Mineflayer bot framework, and 7-phase roadmap

2026-03-18 01:51:28 -04:00

scripts

3-tier self-play: command drills, self-critique, adversarial

2026-03-19 19:39:33 -04:00

MODEL_RESEARCH.md

Add model bake-off harness and base model research

2026-03-18 08:54:11 -04:00