Phase 2: eval harness, 182 examples, live bake-off, playtest infrastructure
- Expanded dataset from 31 to 182 examples (45 manual + 106 extracted from server logs)
- Built eval/harness.py with per-category breakdowns and baseline tracking
- Built eval/live_bakeoff.py for RCON-verified model comparison on live server
- Extracted training data from prayer logs, sudo logs, and bug reports on CT 644
- Added Reddit post draft and modmail for playtester recruitment
- Updated server context: all servers now online-mode=false + whitelist
- Updated PLAN.md with Phase 2 progress

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@@ -0,0 +1,13 @@
# Modmail to r/admincraft

**Subject:** Permission to post a playtester request?

**Body:**

Hi, I'm working on a custom feature for my 1.21 Java server that involves AI-powered chat interactions. I'm looking for a small group of players to help test it and I think this community would be a good fit since the people here actually understand server administration.

The post would be a short description of what I'm looking for (10ish playtesters, whitelisted server, a few sessions over a couple weeks) with a link to a Google Form application. It's a hobby/research project, not a product or a server advertisement; I'm not trying to recruit a playerbase.

Wanted to check if this is the kind of thing that's allowed here before posting. Happy to share a draft if that helps.

Thanks for your time.