Mortdecai/docs/superpowers/specs/2026-03-22-oracle-bot-design.md
Seth 5b28002001 0.6.0 training session: Oracle Bot, RL combat, Mind's Eye, multilingual pipeline
Major changes from this session:

Training:
- 0.6.0 training running: 9B on steel141 3090 Ti, 27B on rented H100 NVL
- 7,256 merged training examples (up from 3,183)
- New training data: failure modes (85), midloop messaging (27),
  prompt injection defense (29), personality (32), gold from quarantine
  bank (232), new tool examples (30), claude's own experience (10)
- All training data RCON-validated at 100% pass rate
- Bake-off: gemma3:27b 66%, qwen3.5:27b 61%, translategemma:27b 56%

Oracle Bot (Mind's Eye):
- Invisible spectator bot (mineflayer) streams world state via WebSocket
- HTML5 Canvas frontend at mind.mortdec.ai
- Real-time tool trace visualization with expandable entries
- Streaming model tokens during inference
- Gateway integration: fire-and-forget POST /trace on every tool call

Reinforcement Learning:
- Gymnasium environment wrapping mineflayer bot (minecraft_env.py)
- PPO training via Stable Baselines3 (10K param policy network)
- Behavioral cloning pretraining (97.5% accuracy on expert policy)
- Infinite training loop with auto-restart and checkpoint resume
- Bot learns combat, survival, navigation from raw experience

Bot Army:
- 8-soldier marching formation with autonomous combat
- Combat bots using mineflayer-pvp, pathfinder, armor-manager
- Multilingual prayer bots via translategemma:27b (18 languages)
- Frame-based AI architecture: LLM planner + reactive micro-scripts

Infrastructure:
- Fixed mattpc.sethpc.xyz billing gateway (API key + player list parser)
- Billing gateway now tracks all LAN traffic (LAN auto-auth)
- Gateway fallback for empty god-mode responses
- Updated mortdec.ai landing page

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-22 20:22:50 -04:00

# Oracle Bot — Mortdecai Mind's Eye
**Date:** 2026-03-22
**Status:** Approved design, pending implementation
**Public URL:** `mind.mortdec.ai`
## Summary
A live HTML5 viewport that renders what the Mortdecai AI model "sees" during Minecraft server interactions. An invisible spectator bot (mineflayer) maintains real-time world state, the gateway streams tool traces to it, and browsers connect via WebSocket to watch the AI think.
## Architecture
```
Browser (mind.mortdec.ai)
↕ WebSocket (ws://CT644:3333)
Oracle Bot (Node.js, single process)
├── MC Client (mineflayer, spectator mode)
├── Vision Server (Express + ws, port 3333)
├── Trace Receiver (POST /trace from gateway)
└── Command API (POST /command, future tool integration)
↕ MC Protocol (offline auth)
Paper Server (1.21, CT 644:25568 dev)
```
**Approach:** Smart Bot is the Server (Approach 2). One Node.js process handles MC connection, WebSocket streaming, and HTTP endpoints. Designed to evolve into a gateway tool (Approach 3) where the AI controls the bot directly.
## Bot Core
### Three roles in one process:
1. **MC Client** — mineflayer bot in spectator mode. Maintains live chunk cache, entity list, player positions. Username: `OracleBot`. Connects to dev server (port 25568, offline auth).
2. **Vision Server** — Express + ws. Serves the HTML5 frontend. Streams world state and tool traces to connected browsers via WebSocket on port 3333.
3. **Trace Receiver** — `POST /trace` endpoint. Gateway calls this (fire-and-forget) on every tool invocation during the model-driven tool loop.
### Future-ready command API:
- `POST /command` — accepts instructions: `{action: "follow", target: "player"}`, `{action: "scan", center: {x,y,z}, radius: 20}`
- Day one: only `follow` and `scan` implemented
- Endpoint exists so the gateway can later call it as a tool (`oracle.scan`, `oracle.look`)
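
The two day-one payload shapes above could be validated before they reach the bot. The following is a minimal sketch, not the project's implementation: `parseCommand` and its error strings are hypothetical names, and only the `follow` and `scan` shapes come from this spec.

```javascript
// Hypothetical validator for POST /command bodies. Only the two day-one
// actions from the spec are accepted; everything else is rejected.
function parseCommand(body) {
  if (body === null || typeof body !== "object") {
    return { ok: false, error: "body must be a JSON object" };
  }
  switch (body.action) {
    case "follow":
      if (typeof body.target !== "string" || body.target.length === 0) {
        return { ok: false, error: "follow requires a target player name" };
      }
      return { ok: true, command: { action: "follow", target: body.target } };
    case "scan": {
      const c = body.center;
      const coordsOk = c !== null && typeof c === "object" &&
        ["x", "y", "z"].every((k) => Number.isFinite(c[k]));
      if (!coordsOk || !Number.isFinite(body.radius) || body.radius <= 0) {
        return { ok: false, error: "scan requires center {x,y,z} and a positive radius" };
      }
      return {
        ok: true,
        command: { action: "scan", center: { x: c.x, y: c.y, z: c.z }, radius: body.radius },
      };
    }
    default:
      return { ok: false, error: `unsupported action: ${String(body.action)}` };
  }
}
```

Returning a normalized `command` object instead of the raw body means later tool names such as `oracle.scan` could reuse the same validated shape.
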
### Bot behavior:
- On connect: spectator mode, fly to first online player
- Follows the player the AI is currently interacting with (switches on trace events)
- On idle: parks at last active player or world spawn
## Data Flow & States
### Two modes:
**Idle Mode (no active trace):**
- Bot parks at last active player position
- Low-frequency heartbeat to browsers: player list, positions, time, weather
- Update rate: ~5 seconds
- Frontend: calm ambient view, slow-updating minimap, player dots
**Active Mode (trace incoming):**
- Gateway fires `POST /trace` with tool call data
- Bot teleports to relevant player
- Scans burst of chunk data around player
- High-frequency updates: blocks, entities, tool trace overlay
- God mode: dramatic visual (golden glow, Sethian orange accents)
- Sudo mode: clinical/technical (grid overlay, command syntax)
- Persists 10s after last trace, then fades to idle
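
The idle/active switch described above amounts to a small state machine. The sketch below is one possible shape for it: the 10-second linger is from this spec, while `ModeTracker`, its method names, and passing the clock in as an argument are assumptions made for illustration and testability.

```javascript
// Hypothetical mode tracker. The 10 s linger comes from the spec; the class
// and method names are illustrative. Time is passed in so it can be tested.
const LINGER_MS = 10000;

class ModeTracker {
  constructor() {
    this.mode = "idle";
    this.player = null; // kept after going idle so the bot can park there
    this.lastTraceAt = null;
  }

  // Call for every incoming trace; traceMode is "god" or "sudo".
  // Returns true when a {type: "mode"} message should be broadcast.
  onTrace(traceMode, player, nowMs) {
    const changed = this.mode !== traceMode || this.player !== player;
    this.mode = traceMode;
    this.player = player;
    this.lastTraceAt = nowMs;
    return changed;
  }

  // Call periodically; drops back to idle once the linger window has passed.
  tick(nowMs) {
    if (this.mode !== "idle" && nowMs - this.lastTraceAt >= LINGER_MS) {
      this.mode = "idle";
      return true;
    }
    return false;
  }
}
```
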
### WebSocket message types:
```javascript
// Every message also carries v: 1 (see "Message versioning"); omitted below for brevity
// Heartbeat (idle)
{type: "heartbeat", players: [{name, x, y, z}], time: 6000, weather: "clear"}
// World snapshot (active)
{type: "world", center: {x, y, z}, blocks: [{x, y, z, type}], entities: [{type, x, y, z, count}]}
// Tool trace event (active)
{type: "trace", tool: "world.scan_area", input: {...}, result: {...}, step: 2, mode: "god"}
// Mode change
{type: "mode", mode: "god"|"sudo"|"idle", player: "slingshooter08"}
```
## Frontend (HTML5 Canvas)
### Single page, no build step. Pure HTML5 Canvas + vanilla JS.
**Layout:**
```
┌─────────────────────────────────┬──────────────────┐
│                                 │  TOOL TRACE      │
│         WORLD MAP               │                  │
│     (2D top-down tiles)         │  [scan_area] ●   │
│                                 │  [rcon.exec] ●   │
│    ○ player dots                │  [journal]   ●   │
│    █ blocks colored by type     │                  │
│    ◇ entities                   │  step 3/8        │
│                                 │                  │
├─────────────────────────────────┤                  │
│ STATUS BAR                      │                  │
│ Mode: GOD | Player: sling...    │                  │
│ HP: 20 | Pos: (12, -60, 15)     │                  │
└─────────────────────────────────┴──────────────────┘
```
### Visual modes:
- **Idle:** Dark muted palette, slow pulse animation. Sleeping eye aesthetic.
- **God active:** Sethian orange (#D35400), golden particles on commands, dramatic god message text. Blocks glow where AI acts.
- **Sudo active:** Cool blue/green terminal aesthetic, monospace overlays, precise grid. Clinical.
### Block rendering:
- Each block type → color (stone=gray, dirt=brown, water=blue, redstone=red, air=transparent)
- Top-down slice at player Y level (configurable)
- Entities as icons/dots with distance rings
- Scanned areas pulse/highlight as tool traces arrive ("AI is looking here")
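
As a rough illustration of the block-to-color mapping and the top-down slice, the sketch below turns the `blocks` array of a `world` message into tile rectangles. The five colors are the ones listed above; the fallback color, the function names, and the tile-size and canvas-size parameters are assumptions. Real Minecraft block names are more specific than `redstone`, so an actual palette would need more entries.

```javascript
// Hypothetical palette: the five colours named in the spec, as CSS values.
const BLOCK_COLORS = new Map([
  ["stone", "gray"],
  ["dirt", "brown"],
  ["water", "blue"],
  ["redstone", "red"],
  ["air", "transparent"],
]);
const FALLBACK_COLOR = "#555555"; // assumption: neutral colour for unmapped types

function colorFor(blockType) {
  return BLOCK_COLORS.get(blockType) ?? FALLBACK_COLOR;
}

// Turn the blocks of a "world" message into draw calls for one Y slice.
// X maps to the horizontal axis and Z to the vertical axis (top-down view);
// tileSize and canvasSize are in canvas pixels, centred on `center`.
function sliceToTiles(blocks, center, sliceY, tileSize, canvasSize) {
  const half = canvasSize / 2;
  return blocks
    .filter((b) => b.y === sliceY && b.type !== "air")
    .map((b) => ({
      px: half + (b.x - center.x) * tileSize,
      py: half + (b.z - center.z) * tileSize,
      size: tileSize,
      color: colorFor(b.type),
    }));
}

// In the browser each tile would be painted with the standard Canvas 2D API:
//   ctx.fillStyle = tile.color;
//   ctx.fillRect(tile.px, tile.py, tile.size, tile.size);
```
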
### Branding:
- Font: Rajdhani Bold
- Primary accent: Sethian orange (#D35400)
- Background: dark (#1a1a2e)
- Title: "MORTDECAI — MIND'S EYE"
- Subtle eye/pyramid motif
## Security & Resilience
### Public vs internal endpoints:
- **Public (via Caddy):** WebSocket `/ws`, static files `/`, `/index.html`
- **Internal only (localhost):** `POST /trace`, `POST /command` — Caddy must NOT proxy these. Gateway calls them on localhost:3333 directly.
- WebSocket: max 100 concurrent connections, per-IP cap of 5. Connections over either limit are rejected with HTTP 429.
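
The connection caps could be enforced with a small counter consulted during the WebSocket upgrade. This is a sketch under assumptions: `ConnectionLimiter` is a hypothetical class, and because the server sits behind a reverse proxy, the client address would normally have to be read from a forwarded header and not from the socket.

```javascript
// Hypothetical admission check for WebSocket upgrades. The limits are the
// ones stated above; the class itself is illustrative.
class ConnectionLimiter {
  constructor(maxTotal = 100, maxPerIp = 5) {
    this.maxTotal = maxTotal;
    this.maxPerIp = maxPerIp;
    this.total = 0;
    this.perIp = new Map();
  }

  // Returns true and records the connection, or false if a cap is hit
  // (the caller would then answer the upgrade request with HTTP 429).
  tryAcquire(ip) {
    const current = this.perIp.get(ip) || 0;
    if (this.total >= this.maxTotal || current >= this.maxPerIp) return false;
    this.perIp.set(ip, current + 1);
    this.total += 1;
    return true;
  }

  // Call from the socket's close handler so the slot is freed.
  release(ip) {
    const current = this.perIp.get(ip) || 0;
    if (current === 0) return;
    if (current === 1) this.perIp.delete(ip);
    else this.perIp.set(ip, current - 1);
    this.total -= 1;
  }
}
```
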
### Caddy config:
```
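mind.mortdec.ai {
    # Internal endpoints must never be reachable from the public site
    @blocked path /trace /command
    respond @blocked 404

    # Everything else (static files and the /ws WebSocket upgrade) goes to the bot.
    # Deployment puts Caddy on CT 600, so point this at CT 644 instead of localhost there.
    reverse_proxy localhost:3333
}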
```
### Chunk loading after teleport:
- After bot teleports to a player, wait up to 2 seconds for chunk packets before scanning
- `world-state.js` tracks chunk load events and exposes `awaitChunksLoaded(center, radius, timeoutMs)`
- If timeout expires, scan with whatever chunks are loaded (partial data is better than no data)
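
One way `awaitChunksLoaded(center, radius, timeoutMs)` might work is sketched below. Only that function name and signature come from this spec; `createChunkTracker`, `markLoaded`, `requiredChunks`, and the polling interval are hypothetical. It assumes the tracker is fed from the bot's chunk-load event (mineflayer's `chunkColumnLoad` is the presumed source).

```javascript
// Chunk columns are 16x16 blocks; Math.floor handles negative coordinates.
function requiredChunks(center, radius) {
  const minCx = Math.floor((center.x - radius) / 16);
  const maxCx = Math.floor((center.x + radius) / 16);
  const minCz = Math.floor((center.z - radius) / 16);
  const maxCz = Math.floor((center.z + radius) / 16);
  const keys = [];
  for (let cx = minCx; cx <= maxCx; cx++) {
    for (let cz = minCz; cz <= maxCz; cz++) keys.push(`${cx},${cz}`);
  }
  return keys;
}

// Hypothetical tracker: markLoaded/markUnloaded would be called from the bot's
// chunk load and unload events.
function createChunkTracker(pollMs = 50) {
  const loaded = new Set();
  return {
    markLoaded: (cx, cz) => loaded.add(`${cx},${cz}`),
    markUnloaded: (cx, cz) => loaded.delete(`${cx},${cz}`),
    // Resolves true when every required chunk is present, or false on
    // timeout so the caller can scan with whatever is loaded.
    awaitChunksLoaded(center, radius, timeoutMs) {
      const needed = requiredChunks(center, radius);
      const deadline = Date.now() + timeoutMs;
      return new Promise((resolve) => {
        const check = () => {
          if (needed.every((k) => loaded.has(k))) return resolve(true);
          if (Date.now() >= deadline) return resolve(false);
          setTimeout(check, pollMs);
        };
        check();
      });
    },
  };
}
```

Polling keeps the sketch short; an event-driven version that re-checks on each chunk-load event would avoid the fixed delay.
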
### Spectator mode enforcement:
- Add to `mc_aigod_paper.py` PlayerJoinEvent: if player name is `OracleBot`, set gamemode spectator before spawn
- Fallback: bot self-executes `/gamemode spectator OracleBot` via chat on spawn event (requires the bot account to have operator permission)
### Bot reconnection:
- On `kicked` or `end` event: exponential backoff reconnect (1s, 2s, 4s, 8s, max 30s)
- Broadcast `{type: "status", connected: false}` to all browsers on disconnect
- Frontend shows "Bot offline — reconnecting..." overlay with pulse animation
- On reconnect: broadcast `{type: "status", connected: true}`, resume normal flow
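
The backoff schedule and status broadcasts could be wired up roughly as follows. This is a sketch under assumptions: `superviseBot`, `connect`, and `broadcast` are hypothetical names, and it listens only for `end` on the assumption that the bot emits `end` after a kick as well, which avoids scheduling two reconnects for one disconnect.

```javascript
// Delay before reconnect attempt N (0-based): 1s, 2s, 4s, 8s, 16s, then capped at 30s.
function reconnectDelayMs(attempt, baseMs = 1000, maxMs = 30000) {
  return Math.min(baseMs * 2 ** attempt, maxMs);
}

// Hypothetical wiring: `connect` returns a new bot (an event emitter with
// `once`), `broadcast` sends a message to every connected browser, and
// `schedule` defaults to setTimeout but can be swapped out in tests.
function superviseBot(connect, broadcast, schedule = setTimeout) {
  let attempt = 0;
  const start = () => {
    const bot = connect();
    bot.once("spawn", () => {
      attempt = 0; // reset backoff after a successful login
      broadcast({ v: 1, type: "status", connected: true });
    });
    bot.once("end", () => {
      broadcast({ v: 1, type: "status", connected: false });
      schedule(start, reconnectDelayMs(attempt++));
    });
  };
  start();
}
```
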
### Payload limits:
- World snapshots: max 32x32x1 top-down slice (1,024 blocks). Air blocks excluded.
- Delta compression: after initial snapshot, only send changed blocks
- Max WebSocket frame: 64KB. If a payload exceeds this, split it across multiple messages.
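
Delta compression and message splitting could look roughly like this. The function names are hypothetical, and reporting a vanished block as `air` is an assumption made here so the frontend can clear the tile, since snapshots exclude air. The `entities` field of the `world` message is left out for brevity.

```javascript
// Key a block by position so two snapshots can be compared.
const blockKey = (b) => `${b.x},${b.y},${b.z}`;

// Returns blocks that are new or changed type since `prev`, plus blocks that
// disappeared (reported as air: an assumption, see the note above).
function diffBlocks(prev, next) {
  const before = new Map(prev.map((b) => [blockKey(b), b.type]));
  const changed = [];
  for (const b of next) {
    if (before.get(blockKey(b)) !== b.type) changed.push(b);
    before.delete(blockKey(b));
  }
  for (const key of before.keys()) {
    const [x, y, z] = key.split(",").map(Number);
    changed.push({ x, y, z, type: "air" });
  }
  return changed;
}

// Split a block list into "world" messages whose JSON encoding stays under
// maxBytes. Re-serialising on every push is quadratic, which is acceptable
// for the 1,024-block cap but would need rework for larger payloads.
function toWorldMessages(center, blocks, maxBytes = 64 * 1024) {
  const make = (part) => ({ v: 1, type: "world", center, blocks: part });
  const size = (part) => Buffer.byteLength(JSON.stringify(make(part)), "utf8");
  const messages = [];
  let part = [];
  for (const b of blocks) {
    part.push(b);
    if (part.length > 1 && size(part) > maxBytes) {
      part.pop();
      messages.push(make(part));
      part = [b];
    }
  }
  if (part.length > 0) messages.push(make(part));
  return messages;
}
```
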
### Multiple simultaneous sessions:
- Trace events include `session_id` and `player` fields
- Bot follows the most recent trace's player
- Frontend tool trace panel shows all active sessions, color-coded by player
- If two sessions overlap, traces interleave in the timeline (both visible)
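
The session rules above could be tracked with a small registry. The sketch below assumes each trace carries `session_id` and `player` as stated; `SessionRegistry`, `playerColor`, and the hash-to-hue coloring are illustrative choices, not part of the design.

```javascript
// Hypothetical registry of in-flight sessions, keyed by session_id.
class SessionRegistry {
  constructor() {
    this.sessions = new Map(); // session_id -> { player, steps, lastSeen }
    this.followTarget = null;
  }

  // Record a trace and return the player the bot should now follow.
  onTrace(trace, nowMs) {
    const s = this.sessions.get(trace.session_id) ||
      { player: trace.player, steps: 0, lastSeen: 0 };
    s.steps += 1;
    s.lastSeen = nowMs;
    this.sessions.set(trace.session_id, s);
    this.followTarget = trace.player; // most recent trace wins
    return this.followTarget;
  }

  end(sessionId) {
    this.sessions.delete(sessionId);
  }

  // Sessions for the trace panel, most recently active first.
  active() {
    return [...this.sessions.entries()]
      .map(([id, s]) => ({ session_id: id, ...s }))
      .sort((a, b) => b.lastSeen - a.lastSeen);
  }
}

// Deterministic per-player colour so the same player keeps the same hue.
function playerColor(name) {
  let hash = 0;
  for (const ch of name) hash = (hash * 31 + ch.codePointAt(0)) % 360;
  return `hsl(${hash}, 70%, 60%)`;
}
```
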
### Message versioning:
- All WebSocket messages include `v: 1` field
- Frontend ignores messages with unknown `v` values gracefully
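
On the frontend, the version check could sit in front of the per-type handlers. This is a sketch of that idea: `dispatchMessage` and the `handlers` map are hypothetical, and treating a missing `v` the same as an unknown one is an assumption.

```javascript
const SUPPORTED_VERSION = 1;

// Hypothetical frontend dispatcher: `handlers` maps message type to a function.
// Returns true if the message was handled, false if it was ignored.
function dispatchMessage(raw, handlers) {
  let msg;
  try {
    msg = JSON.parse(raw);
  } catch {
    return false; // not JSON: ignore so the render loop is not interrupted
  }
  if (msg === null || typeof msg !== "object") return false;
  if (msg.v !== SUPPORTED_VERSION) return false; // unknown or missing version
  const known = Object.prototype.hasOwnProperty.call(handlers, msg.type);
  if (!known || typeof handlers[msg.type] !== "function") return false;
  handlers[msg.type](msg);
  return true;
}

// In the browser this would be attached with the standard WebSocket API:
//   socket.onmessage = (event) => dispatchMessage(event.data, handlers);
```
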
## Deployment
**Location:** CT 644 (same container as MC servers + gateway). Lowest latency.
**Public access:**
```
mind.mortdec.ai → Caddy (CT 600) → CT 644:3333 (WebSocket upgrade)
```
- No Authelia — fully public
- DNS: CNAME to Caddy ingress
**Gateway integration (minimal):**
- One addition to `langgraph_gateway.py`: fire-and-forget POST to `http://localhost:3333/trace` after each tool call
- Also POST on session start (mode + player) and session end (final response)
- Non-blocking: try/except with 1s timeout. If bot is down, gateway doesn't care.
**Process management:**
- systemd service: `oracle-bot.service`
- Auto-restart on crash
- Logs: `/var/log/oracle-bot.log`
**Server-side setup:**
- Paper server needs to run `/gamemode spectator OracleBot` on join (command block or plugin event)
## Future Evolution
### Phase 2: Gateway tool integration
- `oracle.scan` tool — model queries bot's chunk cache instead of RCON. Faster, richer.
- `oracle.look` tool — bot teleports to coords and returns what it sees.
- `POST /command` endpoint (built day one) becomes the tool backend.
### Phase 3: Multi-bot fleet
- Oracle spawns additional bots on command
- Model dispatches: `oracle.dispatch({task: "watch", target: "SwiftWolf"})`
- All bots feed into same vision server → same frontend
### Phase 4: Multimodal training capture
- Frontend frames captured as screenshots paired with model decisions
- Builds (visual_state, model_action) dataset for multimodal fine-tuning
- Mind's Eye becomes training data for visual understanding
## Tech Stack
- **Bot + Server:** Node.js, mineflayer, express, ws
- **Frontend:** HTML5 Canvas, vanilla JS, WebSocket API
- **Integration:** HTTP POST (gateway → bot), WebSocket (bot → browser)
- **Deployment:** systemd on CT 644, Caddy reverse proxy
## File Structure
```
oracle-bot/
├── package.json # entrypoint: "main": "server.js"
├── server.js # ENTRYPOINT — express + ws, requires bot.js, serves frontend
├── bot.js # mineflayer connection, spectator, chunk tracking
├── world-state.js # abstracted world state (blocks, entities, players)
├── public/
│ └── index.html # single-file frontend (HTML + Canvas + JS + CSS)
└── README.md
```