docs: bootstrap repo with bakeoff results and game-mechanics idea bank
This repo opens with the design-discovery work completed before any product code is written. Two model bakeoffs against gemma4:8b/26b/31b on a local Ollama established that:

- Whole-puzzle generation in the Connections shape is unreliable on Gemma 4 (gemma4:31b ~50% structural-pass, gemma4:26b ~20-30%); 31b is intentionally out of project scope, so the generation route is harder still.
- Atomic semantic-judging skills are reliable: 87.5%/93.75%/100% (8b/26b/31b) on JUDGE; *all three models* scored 10/10 on CREATIVE_ACCEPT, the fair judging of player-INVENTED categories. That is the structural unlock vs static hand-curated word games.

The README contains the full writeup, the test bench, and a brainstormed bank of 10 distinct game-mechanics ideas across the fast/medium/slow tempo range, plus a primitives table for recombination.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@@ -0,0 +1,49 @@
# IDEA.md — seth_semantic_game

## What is this?

A daily word game **based on NYT Connections**, powered by a locally hosted Gemma 4
model. Connections gives the player 16 words that have to be sorted into 4 hidden
groups of 4 by shared semantic category. The twist for this project, the thing that
makes it worth building rather than just playing the original, is whatever Gemma 4
enables that NYT's hand-curated static format cannot.

That twist is **not yet decided**. That's what brainstorming is for.

The base mechanic is fixed:

- Connections-style grouping puzzle (semantic categories, not letters)
- Gemma 4 in the loop somewhere (puzzle generation, judging, hint system, or all of
  the above)
- Daily-puzzle structure with social-shareable result (the Connections / Wordle
  ritual, borrowed *only* for its sharing pattern, not its gameplay)

This is **not** Wordle-derived. The original draft of this file framed it as
"Wordle-style"; that was wrong. The mechanic is grouping, not letter-guessing.
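
The 16-words-into-4-groups-of-4 shape described above can be checked mechanically. The following is a minimal sketch, assuming a puzzle is stored as a mapping from category label to word list; that format and the name `structural_errors` are hypothetical, and this is not necessarily the structural check used in the bakeoffs.

```python
from collections import Counter

GROUPS = 4      # number of hidden groups on a board
GROUP_SIZE = 4  # words per group


def structural_errors(puzzle: dict[str, list[str]]) -> list[str]:
    """Return a list of shape problems; an empty list means the shape is valid.

    `puzzle` maps a category label to its words (an assumed format).
    """
    errors = []
    if len(puzzle) != GROUPS:
        errors.append(f"expected {GROUPS} groups, got {len(puzzle)}")
    for label, words in puzzle.items():
        if len(words) != GROUP_SIZE:
            errors.append(
                f"group {label!r} has {len(words)} words, expected {GROUP_SIZE}"
            )
    # Each word may appear only once on the board (compared case-insensitively).
    counts = Counter(w.strip().lower() for words in puzzle.values() for w in words)
    for word, n in counts.items():
        if n > 1:
            errors.append(f"word {word!r} appears {n} times")
    return errors
```

A check like this only covers shape. Whether the categories are semantically fair is a separate question that a shape check cannot answer.
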

## Problem it solves

Mostly fun and a real use of the local Gemma 4 stack. NYT Connections is hand-curated
and ships one puzzle per day; a generative version could ship infinite puzzles, accept
fuzzy or creative groupings, generate themed/seeded puzzles, or do other things the
hand-built version structurally cannot. Secondary: a daily-puzzle hook for sethpc.xyz
alongside other homelab games.

## Constraints / preferences

- Self-hosted: Ollama with Gemma 4 on a commodity GPU (a single 24 GB card is enough)
- Web frontend, dark theme with orange accents
- If a puzzle is generative, output must be **deterministic per day** (every player
  on a given date gets the same puzzle). Likely a date-seeded prompt with cached
  output rather than a fresh generation per request.
- Per-guess judging should be cheap: at most one Gemma call per submission, and
  ideally answers are precomputed when the daily puzzle is generated, so judging
  becomes a cheap lookup.
- No login required for casual play (cookies/localStorage for streak)
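
The per-day determinism and cheap-lookup constraints above could be sketched as follows. This is an illustration under stated assumptions, not a settled design: `daily_puzzle`, `build_answer_key`, `judge`, the JSON-file cache, and the label-to-words puzzle format are all hypothetical, and `generate` stands in for whatever Gemma call ends up producing the puzzle.

```python
import datetime as dt
import json
from pathlib import Path
from typing import Callable, Optional

# Assumed format: category label -> the words in that group.
Puzzle = dict[str, list[str]]


def daily_puzzle(
    day: dt.date, generate: Callable[[str], Puzzle], cache_dir: Path
) -> Puzzle:
    """Return the puzzle for `day`, calling `generate` at most once per date.

    `generate` receives the ISO date as seed text. Serving the cached file is
    what keeps the puzzle identical for every player on that date.
    """
    path = cache_dir / f"{day.isoformat()}.json"
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))
    puzzle = generate(day.isoformat())
    cache_dir.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(puzzle), encoding="utf-8")
    return puzzle


def build_answer_key(puzzle: Puzzle) -> dict[frozenset[str], str]:
    """Precompute each correct group so judging is a dictionary lookup."""
    return {
        frozenset(w.lower() for w in words): label
        for label, words in puzzle.items()
    }


def judge(guess: list[str], key: dict[frozenset[str], str]) -> Optional[str]:
    """Return the category label if the guess is exactly one group, else None."""
    return key.get(frozenset(w.lower() for w in guess))
```

A lookup like this only covers the fixed answer key. Accepting player-invented categories would still need a model call, which is where the "at most one Gemma call per submission" budget applies.
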

> NOTE on history: this brief was originally framed as "Wordle-style". That was
> wrong: the seed game is NYT Connections (16 words → 4 hidden groups of 4).
> But after the model bakeoffs (see README), the *direction* shifted again:
> rather than cloning Connections, the project pivots toward gameplay that
> uses Gemma's per-call CREATIVE_ACCEPT ability to fairly judge
> player-INVENTED categories, something static curated games structurally can't
> do. The brainstormed game ideas in the README are what came out of that.