docs: log AI-player spec approval, update context, add handoff

Updates CLAUDE.md "Current State" + "Key files" to point at the new spec. Adds DECISIONS.md "AI / computer player" section (11 settled decisions). Strikes through the prior "Client-side AI / hint generation — out of scope" row with a "partially superseded" note: the reversal applies only to the human-vs-AI path. Adds 7 new Deferred/Rejected rows for AI-feature scope. Handoff at .claude/handoffs/2026-04-28-170713-ai-player-spec.md captures session state for the next pickup (writing-plans → Phase 1 implementation).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 13:12:04 -04:00
parent 288693fcd6
commit 729199097e
3 changed files with 217 additions and 3 deletions
@@ -50,6 +50,22 @@ Format: `YYYY-MM-DD: <decision> — <why>`
 - 2026-04-28: **WS path through Caddy** — `wss://chess.sethpc.xyz/ws?game=<id>` works without explicit `transport ws` config. Caddy's reverse_proxy handles upgrade transparently.
 - 2026-04-28: **Public DNS** — relies on existing `*.sethpc.xyz` wildcard pointing at the WAN IP; no Pi-hole entry was needed. Caddy host-routes `chess.sethpc.xyz` to 192.168.0.245:3000.

+## AI / computer player (designed 2026-04-28, not yet implemented)
+
+Spec: `docs/superpowers/specs/2026-04-28-ai-player-design.md`. All decisions below are settled at spec-approval time; revisit if implementation surfaces something the spec didn't anticipate.
+
+- 2026-04-28: **Two AI bots, phased delivery** — `CasualBrain` (Phase 1, algorithmic, in-process) ships first; `ReconBrain` (Phase 2, `gemma4:26b` chat agent) ships second. Phased to keep research uncertainty (Recon's actual playing strength) from blocking shipping anything. Rejected: combined launch, single difficulty-dial UX, throwaway Casual-as-stub.
+- 2026-04-28: **Bots use the same view filter as humans** — `BotDriver` calls `buildView(game, botColor)`; bot input is filtered `BoardView` + `Announcement[]`. No oracle access. Preserves the architectural invariant: the view filter is the only egress for board state, even for in-process bots. Rejected: "easy mode" oracle access for Casual to keep it simple.
+- 2026-04-28: **In-process virtual players, not external WS clients** — `BotDriver` lives in the existing Fastify server, dispatches actions through the same `commit` handler humans use. One process, no new deploy targets. Rejected: external bot processes (more operational surface, no real benefit), hybrid Casual-in-process / Recon-external (asymmetric for no gain).
+- 2026-04-28: **Recon bot is a stateful chat agent, not stateless** — per-game chat history persists across turns as the bot's private memory. Each turn appends user (new view + announcements + candidates) + assistant (reasoning + move). Reasoning is hidden from the human during play, revealed in collapsible post-game panel. Rejected: stateless one-shot move-picker (loses belief-tracking across turns), revealing reasoning during play (would leak strategic intent).
+- 2026-04-28: **Endpoint priority: steel141 RTX 3090 Ti primary, pve197 V100 fallback** — preflight on game creation; mid-game failover allowed once (one-way). Rationale: 3090 Ti benchmarks at ~134 tok/s on `gemma4:26b`; V100 estimated ~80 tok/s. Both have the model present. Rejected: no failover (worse UX), bidirectional flap (more complexity, no real benefit).
+- 2026-04-28: **GPU shown to user** — persistent badge under AI's slot reads `"gemma4:26b · RTX 3090 Ti"` (or V100 / failed-over variant). Game-start moderator-panel UI message explicitly names the model + host. Rationale: chess.sethpc.xyz is a personal homelab site; surfacing the hardware is brand-appropriate and gives honest feedback when fallback engages. Rejected: hiding the GPU (would be opaque on slow V100 fallback).
+- 2026-04-28: **`gemma4:26b` model choice** — sweet spot per gemma4-research: ~134 tok/s decode on 3090 Ti (4.7× faster than 31B), MoE 3.8B active, vision-capable (not used here). Rejected: 31B (5× slower, marginal strength gain not worth latency), e4b (too small for this task).
+- 2026-04-28: **Per-move latency budget: 30s normal, 90s first-move** — first-move headroom covers cold-start (steel141 keep_alive=30m policy, ~30-60s reload after idle). Beyond 90s, treat as endpoint failure → failover. Rejected: tighter cap (false-positives on cold start), looser cap (UX death).
+- 2026-04-28: **Recon "done" bar: ≥60% wins over 50 Recon-vs-Casual self-play games** — concrete, measurable acceptance bound. If Recon misses 60% but holds >40%, prompt-engineering rabbit hole; if <40%, design signal (try 31B or feed textual board representation). Self-play harness lives in `scripts/selfplay.ts`, not in CI. Rejected: subjective "feels okay" bar (would let weak Recon ship), bar against humans (untestable at scale).
+- 2026-04-28: **Reasoning hidden during play, revealed post-game** — Gemma's chat history is private during the game; on game end, the chat history is copied to `Game.aiThoughtsLog` and the post-game screen shows a collapsible "View gemma4's reasoning" section. Rejected: live streaming "thinking tokens" to user (leaks strategy), permanent hiding (loses showcase value of the project).
+- 2026-04-28: **`vsAi` field added to `CreateGameRequest`; `aiInfo` field added to `joined`/`update` server messages; `'ai_unavailable'` added to `EndReason`** — minimal protocol surface for the feature. AI metadata is NOT in `ModeratorText` enum (kept clean). UI-system messages for game-start info and failover events are style-distinct from `Announcement` entries.
+
 ## Deferred / Rejected

 <!-- Decisions NOT to do something are just as valuable -- prevents re-proposing rejected ideas -->
@@ -65,4 +81,11 @@ Format: `YYYY-MM-DD: <decision> — <why>`
 - 2026-04-28: **Move log / PGN export, post-game replay** — deferred. Announcements are persisted in-game (so the moderator-panel scrollback works); export and replay are post-MVP.
 - 2026-04-28: **Public lobby / matchmaking / ratings** — out of scope. This is a private-link game, not a chess site.
 - 2026-04-28: **Pre-deploy "server restarting" warning to active players** — stretch goal, not MVP. Mitigation for now: deploy during low-usage windows.
-- 2026-04-28: **Client-side AI / hint generation** — explicitly out of scope. Human vs. human only.
+- 2026-04-28: ~~**Client-side AI / hint generation** — explicitly out of scope. Human vs. human only.~~ **Partially superseded 2026-04-28** by AI-player spec. Reversal applies *only* to the human-vs-AI path; client-side AI / hint generation in human-vs-human games remains rejected.
+- 2026-04-28: **Difficulty slider for AI** — rejected. Two named buttons (Casual, Recon) only. No continuum; the two bots are architecturally different, not tuneable strengths of the same engine.
+- 2026-04-28: **Stockfish for vanilla-mode AI strength** — deferred. Vanilla is a side-effect, not a feature target. Revisit if users explicitly ask for strong vanilla AI.
+- 2026-04-28: **Live token streaming during Gemma's thinking** — rejected for MVP. Static "AI is thinking..." indicator only. Streaming would leak strategic intent and adds protocol complexity.
+- 2026-04-28: **Mid-game GPU flap-back** — rejected. Once failed over to V100, stays there for the rest of the game even if steel141 recovers. Simpler, more predictable, and chat-history is mid-flight.
+- 2026-04-28: **AI vs AI public spectate-able games** — rejected for MVP. Self-play harness is CLI-only (`scripts/selfplay.ts`).
+- 2026-04-28: **Per-turn context compaction** — deferred. Spec uses `num_ctx: 32768` which covers ~128 turns; longer games would overflow but are rare in casual play. Add running-summary compaction if seen in practice.
+- 2026-04-28: **Bot rating / Elo / personalities** — out of scope. Two named buttons, no scoreboard.
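
Reviewer note (not part of the commit): the latency-budget and one-way-failover decisions above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the spec's implementation. The names `EndpointState`, `budgetMs`, and `onEndpointFailure` are hypothetical. The sketch also assumes that exceeding the applicable budget counts as an endpoint failure and that a failure after failover ends the game with `'ai_unavailable'`; the log only states the 30s/90s budgets, the single one-way failover, and the new `EndReason` value.

```typescript
// Hypothetical sketch of the one-way failover policy described in DECISIONS.md.
// All identifiers here are illustrative assumptions, not taken from the spec.

type Endpoint = "steel141" | "pve197";

interface EndpointState {
  active: Endpoint;    // endpoint currently serving the Recon bot
  failedOver: boolean; // true once the single allowed failover has been used
}

const NORMAL_BUDGET_MS = 30_000;     // per-move budget from the decision log
const FIRST_MOVE_BUDGET_MS = 90_000; // first-move headroom for cold start

// Pick the latency budget for the upcoming move.
function budgetMs(isFirstMove: boolean): number {
  return isFirstMove ? FIRST_MOVE_BUDGET_MS : NORMAL_BUDGET_MS;
}

type FailureOutcome =
  | { kind: "failover"; state: EndpointState }
  | { kind: "end_game"; reason: "ai_unavailable" };

// Decide what happens when the active endpoint is judged to have failed.
// Assumption: a second failure (already on the fallback) ends the game.
function onEndpointFailure(state: EndpointState): FailureOutcome {
  if (!state.failedOver) {
    // One-way move to the V100 host; there is no flap-back to steel141.
    return { kind: "failover", state: { active: "pve197", failedOver: true } };
  }
  return { kind: "end_game", reason: "ai_unavailable" };
}
```

Keeping the state to a single `failedOver` flag mirrors the "Mid-game GPU flap-back" rejection: once the flag is set, nothing in the sketch can return the game to the primary endpoint.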