Add knowledge corpus: 14 command references, server context, and TF-IDF search index (Phase 1.3)

- knowledge/mc-commands/commands.json: 14 MC commands with JE syntax, args, examples, common errors, 1.21 version notes
- knowledge/server-context/servers.json: all 4 servers (mc1, shrink, paper-ai, paper-dev) with full config
- knowledge/build_index.py: TF-IDF indexer + search function (19 docs, 725 terms)
- All command syntax validated live on dev server via RCON (12/13 passed)
- PLAN.md: mark Phase 1.3 complete
@@ -119,16 +119,15 @@ These projects informed the plan but solve different problems:
 - [x] Seed 31 examples from repair code, prayer logs, sudo logs, and session history (`data/processed/seed_dataset.jsonl`)
 
 #### 1.3 Knowledge Corpus
-- [ ] Scrape Minecraft Wiki command reference pages for 1.21.x syntax
-  - Target: `/give`, `/effect`, `/tp`, `/execute`, `/worldborder`, `/weather`, `/gamemode`, `/enchant`, `/fill`, `/setblock`, `/clone`, `/scoreboard`, `/data`, `/function`
-  - Store as structured JSON (command, syntax, parameters, examples, version notes)
-- [ ] Extract and chunk local server context:
-  - `server.properties` from mc1 and shrink-world
-  - Datapack definitions (shrinkborder, morespawns)
-  - Player list and UUID mappings
-  - RCON connection parameters (sanitized)
-- [ ] Index knowledge corpus for RAG retrieval (simple TF-IDF or embedding-based)
-- [ ] Validate: query the index with sample questions, spot-check relevance
+- [x] Scrape Minecraft Wiki command reference pages for 1.21.x syntax (14 commands in `knowledge/mc-commands/commands.json`)
+  - Includes JE syntax, arguments, examples, version notes, and common errors per command
+  - Commands validated live on dev server (Paper 1.21.11) -- 12/13 passed, 1 false negative (already in target state)
+- [x] Extract and chunk local server context (`knowledge/server-context/servers.json`)
+  - All 4 servers (mc1, shrink-world, paper-ai, paper-dev) with ports, RCON, settings, plugins
+  - Player list with UUIDs, infrastructure details, version-specific notes
+- [x] Index knowledge corpus for RAG retrieval (`knowledge/build_index.py` -- TF-IDF with title boosting)
+  - 19 documents indexed, 725 unique terms
+- [x] Validated with 6 test queries -- all return relevant top results
 
 #### 1.4 Baseline Assistant (No Fine-Tuning)
 - [ ] Build prompt-only assistant using `qwen3-coder` (via Ollama at 192.168.0.179)
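
The plan describes the index as "TF-IDF with title boosting". The sketch below illustrates that general idea and is not the contents of `knowledge/build_index.py`, which this page does not show. The names `TITLE_BOOST`, `tokenize`, `build_index`, `search`, and the three sample documents are all hypothetical, and the boost factor of 3 and the smoothed IDF formula are assumptions chosen for illustration.

```python
import math
import re
from collections import Counter

TITLE_BOOST = 3  # assumed weight: each title token is counted this many times

# Hypothetical sample documents; the real corpus lives in knowledge/*.json.
DOCS = [
    {"title": "/give", "body": "Gives an item to a player. Syntax: give <targets> <item> [<count>]"},
    {"title": "/weather", "body": "Sets the weather. Syntax: weather (clear|rain|thunder) [<duration>]"},
    {"title": "/worldborder", "body": "Manages the world border. Syntax: worldborder set <distance> [<time>]"},
]


def tokenize(text):
    """Lowercase the text and return its alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Return (vectors, idf): one TF-IDF dict per document plus the IDF table."""
    term_counts = []
    for doc in docs:
        # Title boosting: repeat title tokens so they dominate term frequency.
        tokens = tokenize(doc["title"]) * TITLE_BOOST + tokenize(doc["body"])
        term_counts.append(Counter(tokens))

    doc_freq = Counter()
    for counts in term_counts:
        doc_freq.update(counts.keys())

    n_docs = len(docs)
    # Smoothed IDF, so a term found in every document keeps a positive weight.
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1 for t, df in doc_freq.items()}

    vectors = []
    for counts in term_counts:
        total = sum(counts.values())
        vectors.append({t: (c / total) * idf[t] for t, c in counts.items()})
    return vectors, idf


def search(query, docs, vectors, idf, top_k=3):
    """Rank documents by cosine similarity to the query; return (score, title) pairs."""
    q_counts = Counter(t for t in tokenize(query) if t in idf)
    if not q_counts:
        return []
    q_vec = {t: c * idf[t] for t, c in q_counts.items()}
    q_norm = math.sqrt(sum(w * w for w in q_vec.values()))

    results = []
    for doc, vec in zip(docs, vectors):
        dot = sum(w * vec.get(t, 0.0) for t, w in q_vec.items())
        if dot == 0:
            continue  # no shared terms, so the document is not a match
        d_norm = math.sqrt(sum(w * w for w in vec.values()))
        results.append((dot / (q_norm * d_norm), doc["title"]))

    results.sort(reverse=True)
    return results[:top_k]


VECTORS, IDF = build_index(DOCS)
```

With this sample corpus, `search("give a player an item", DOCS, VECTORS, IDF)` returns only the `/give` entry, because no other document shares a query term. Repeating title tokens is the simplest way to boost titles; weighting a separate title field at scoring time would work as well.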