seth_semantic_game

Mortdecai 5a2a02e483 docs: bootstrap repo with bakeoff results and game-mechanics idea bank

This repo opens with the design-discovery work completed before any product
code is written. Two model bakeoffs against gemma4:8b/26b/31b on a local
Ollama instance established that:

- Whole-puzzle generation in the Connections shape is unreliable on Gemma 4
  (gemma4:31b ~50% structural-pass, gemma4:26b ~20-30%); 31b is intentionally
  out of project scope, so the generation route is harder still.
- Atomic semantic-judging skills are reliable: 87.5%/93.75%/100% (8b/26b/31b)
  on JUDGE; *all three models* scored 10/10 on CREATIVE_ACCEPT — fair judging
  of player-INVENTED categories. That is the structural unlock vs static
  hand-curated word games.
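
For readers unfamiliar with the term, a "structural pass" for a Connections-shaped puzzle can be sketched roughly as below. This is an illustrative sketch only, not the check used in gemma-generation-bakeoff.py: the JSON layout (a `groups` list whose entries have `category` and `words` keys), the 4x4 shape constants, and the names `structural_pass` and `pass_rate` are all assumptions made for the example.

```python
from typing import Any, Iterable

# Assumed shape: 4 groups of 4 words each, as in a Connections-style grid.
GROUPS = 4
WORDS_PER_GROUP = 4


def structural_pass(puzzle: Any) -> bool:
    """Return True if `puzzle` is 4 named groups of 4 words, all 16 unique.

    Hypothetical layout: {"groups": [{"category": str, "words": [str, ...]}, ...]}
    """
    if not isinstance(puzzle, dict):
        return False
    groups = puzzle.get("groups")
    if not isinstance(groups, list) or len(groups) != GROUPS:
        return False

    seen = set()
    for group in groups:
        if not isinstance(group, dict):
            return False
        category = group.get("category")
        words = group.get("words")
        if not isinstance(category, str) or not category.strip():
            return False
        if not isinstance(words, list) or len(words) != WORDS_PER_GROUP:
            return False
        for word in words:
            if not isinstance(word, str) or not word.strip():
                return False
            key = word.strip().lower()  # compare case-insensitively
            if key in seen:  # a word may appear in only one group
                return False
            seen.add(key)
    return True


def pass_rate(puzzles: Iterable[Any]) -> float:
    """Fraction of generated puzzles that pass the structural check."""
    items = list(puzzles)
    if not items:
        return 0.0
    return sum(structural_pass(p) for p in items) / len(items)
```

A check like this only tests shape, not whether the categories make semantic sense, which is why it is separate from the JUDGE-style evaluations.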

The README contains the full writeup, the test bench, and a brainstormed
bank of 10 distinct game-mechanics ideas across the fast/medium/slow tempo
range, plus a primitives table for recombination.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-04-27 23:09:46 -04:00

gemma-generation-bakeoff.py

gemma-semantic-bakeoff.py