9abf9238c5
Tier 1 — Command drills:
  Random seed prompts → generate commands → RCON validates
  Teaches: accurate command syntax

Tier 2 — Single-shot self-critique:
  Model invents a tricky prompt AND responds in one call
  RCON validates the self-generated commands
  Teaches: edge-case awareness, self-evaluation

Tier 3 — Adversarial self-play:
  Session A generates challenging prompts
  Fresh Session B responds cold (can't cheat)
  RCON validates, self-corrects on errors
  Teaches: robustness, generalization

Usage:
  --tier 1|2|3|all
  --rounds N
  --focus category

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
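
A minimal sketch of how the usage flags might be parsed, for illustration only. The flag names `--tier`, `--rounds`, and `--focus` come from the usage line; the function names, the defaults, and the printed summary are assumptions and are not taken from the actual script.

```python
import argparse


def build_parser():
    # Flag names follow the usage line; defaults below are assumed, not confirmed.
    parser = argparse.ArgumentParser(
        description="Run self-play tiers (hypothetical wrapper)"
    )
    parser.add_argument("--tier", choices=["1", "2", "3", "all"], default="all",
                        help="which tier to run")
    parser.add_argument("--rounds", type=int, default=1,
                        help="number of rounds per tier (assumed default: 1)")
    parser.add_argument("--focus", default=None,
                        help="optional category to focus prompts on")
    return parser


def tiers_to_run(tier):
    # "all" expands to every tier in order; otherwise run the single tier given.
    return [1, 2, 3] if tier == "all" else [int(tier)]


if __name__ == "__main__":
    args = build_parser().parse_args()
    for t in tiers_to_run(args.tier):
        print(f"tier {t}: {args.rounds} round(s), focus={args.focus}")
```

Restricting `--tier` with `choices` rejects anything other than 1, 2, 3, or all at parse time, before any rounds are started.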