docs: bootstrap repo with bakeoff results and game-mechanics idea bank

This repo opens with the design-discovery work completed before any product
code is written. Two model bakeoffs against gemma4:8b/26b/31b on a local
Ollama established that:

- Whole-puzzle generation in the Connections shape is unreliable on Gemma 4
  (gemma4:31b ~50% structural-pass, gemma4:26b ~20-30%); 31b is
  intentionally out of project scope, so the generation route is harder
  still.
- Atomic semantic-judging skills are reliable: 87.5%/93.75%/100%
  (8b/26b/31b) on JUDGE; *all three models* scored 10/10 on
  CREATIVE_ACCEPT, which tests fair judging of player-INVENTED categories.
  That is the structural unlock vs static hand-curated word games.

The README contains the full writeup, the test bench, and a brainstormed
bank of 10 distinct game-mechanics ideas across the fast/medium/slow tempo
range, plus a primitives table for recombination.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 23:09:46 -04:00
commit 5a2a02e483
10 changed files with 4659 additions and 0 deletions
@@ -0,0 +1,49 @@
+# IDEA.md — seth_semantic_game
+
+## What is this?
+
+A daily word game **based on NYT Connections**, powered by a locally-hosted Gemma 4
+model. Connections gives the player 16 words that have to be sorted into 4 hidden
+groups of 4 by shared semantic category. The twist for this project — what makes it
+worth building rather than just playing the original — is whatever Gemma 4 enables
+that NYT's hand-curated static format cannot.
+
+That twist is **not yet decided**. That's what brainstorming is for.
+
+The base mechanic is fixed:
+- Connections-style grouping puzzle (semantic categories, not letters)
+- Gemma 4 in the loop somewhere (puzzle generation, judging, hint system, or all of
+  the above)
+- Daily-puzzle structure with a socially shareable result (the Connections / Wordle
+  ritual, borrowed *only* for its sharing pattern, not its gameplay)
+
+This is **not** Wordle-derived. The original draft of this file framed it as
+"Wordle-style"; that was wrong. The mechanic is grouping, not letter-guessing.
+
+## Problem it solves
+
+Mostly fun and a real use of the local Gemma 4 stack. NYT Connections is hand-curated
+and ships one puzzle per day; a generative version could ship infinite puzzles, accept
+fuzzy or creative groupings, generate themed/seeded puzzles, or do other things the
+hand-built version structurally cannot. Secondary: a daily-puzzle hook for sethpc.xyz
+alongside other homelab games.
+
+## Constraints / preferences
+
+- Self-hosted: Ollama with Gemma 4 on a commodity GPU (a single 24 GB card is enough)
+- Web frontend, dark theme with orange accents
+- If a puzzle is generative, output must be **deterministic per day** (every player
+  on a given date gets the same puzzle). Likely a date-seeded prompt with cached
+  output rather than a fresh generation per request.
+- Per-guess judging cost should be cheap — at most one Gemma call per submission, and
+  ideally answers are precomputed when the daily puzzle is generated, so judging
+  becomes a cheap lookup.
+- No login required for casual play (cookies/localStorage for streak tracking)
+
+> NOTE on history: this brief originally framed the game as "Wordle-style". That was
+> wrong: the seed game is NYT Connections (16 words → 4 hidden groups of 4).
+> But after the model bakeoffs (see README), the *direction* shifted again:
+> rather than cloning Connections, the project pivots toward gameplay that
+> uses Gemma's per-call CREATIVE_ACCEPT ability to fairly judge
+> player-INVENTED categories — a thing static curated games structurally can't
+> do. The brainstormed game ideas in the README are what came out of that.
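
The determinism and cheap-judging constraints listed under "Constraints / preferences" in IDEA.md could be sketched roughly as follows. This is a minimal illustration, not code from this repo: the function names, the JSON-on-disk cache layout, and the `generate` callable (a stand-in for whatever Gemma 4 call ends up producing a puzzle) are all assumptions. The cache, not the seed, is what guarantees that every player on a given date sees the same puzzle; the seed only makes regeneration reproducible where the model backend honors it.

```python
import hashlib
import json
from datetime import date
from pathlib import Path
from typing import Callable, Dict, FrozenSet, List, Optional

# Hypothetical puzzle shape: category name -> the 4 words in that group.
Puzzle = Dict[str, List[str]]


def daily_seed(day: date) -> int:
    """Derive a stable integer seed from the calendar date."""
    digest = hashlib.sha256(day.isoformat().encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def load_or_generate(
    day: date, cache_dir: Path, generate: Callable[[int], Puzzle]
) -> Puzzle:
    """Return the cached puzzle for `day`; call `generate` only on a cache miss."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    path = cache_dir / f"{day.isoformat()}.json"
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))
    # `generate` is an assumed placeholder for the model call.
    puzzle = generate(daily_seed(day))
    path.write_text(json.dumps(puzzle), encoding="utf-8")
    return puzzle


def build_answer_index(puzzle: Puzzle) -> Dict[FrozenSet[str], str]:
    """Precompute a lookup from each (order-insensitive) word group to its category."""
    return {
        frozenset(word.lower() for word in words): category
        for category, words in puzzle.items()
    }


def judge(index: Dict[FrozenSet[str], str], guess: List[str]) -> Optional[str]:
    """Judge a guess with a dictionary lookup; no model call is needed."""
    return index.get(frozenset(word.lower() for word in guess))
```

With this shape, the model is called at most once per day, and exact-match judging costs a single hash lookup. Judging player-invented categories would still need a per-submission model call, which this sketch does not cover.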