Seth
|
a3d139e04f
|
Mortdecai v4 pre-training: /no_think, dedup, 3,369 examples
- /no_think prepended to all system prompts (seed + tool training)
- Deduplicated seed dataset (435 dupes removed)
- Training script updated for Qwen3.5-9B + /no_think
- 2,210 seed + 1,159 tool-calling = 3,369 total examples
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
2026-03-19 20:15:00 -04:00 |
|
Seth
|
ee764cd22a
|
Tool-calling training: 1,159 multi-turn examples with error correction
Tool schemas (agent/tools/tool_schemas.py):
- rcon.execute: execute commands, get success/error results
- minecraft.wiki_lookup: look up syntax and item info
- world.player_info: player health, position, inventory
- world.server_state: time, weather, online players
- 10 RCON error patterns with corrections
- 12 common error scenarios for training
Training data generator (training/scripts/generate_tool_training.py):
- Converts seed dataset to multi-turn tool conversations
- Error correction: model tries wrong command → gets error → self-corrects
- Wiki/player/server lookups for uncertainty scenarios
- Qwen3 native tool-calling format with <tool_call> tags
1,159 examples: 1043 success, 79 error correction, 24 error scenarios,
13 tool lookups. Ready for v4 training.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
2026-03-19 18:49:08 -04:00 |
|