Ollama ROCm doesn't auto-detect newer AMD iGPUs (gfx1150/gfx1151).
Setting HSA_OVERRIDE_GFX_VERSION=11.0.0 in the compose file fixes this.
The value is configurable via .env for other AMD chips.
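
As an illustration of the override behaviour described above, here is a minimal Python sketch of how a value from .env could take precedence over the 11.0.0 default. The helper names `parse_dotenv` and `resolve_gfx_override` are hypothetical and are not part of this repository; the actual resolution happens through the compose file, and the 10.3.0 value in the usage note is only an example input.

```python
# Hypothetical sketch: resolve HSA_OVERRIDE_GFX_VERSION from .env text,
# falling back to the default named in this commit message.
DEFAULT_GFX_OVERRIDE = "11.0.0"


def parse_dotenv(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def resolve_gfx_override(dotenv_text: str) -> str:
    """Return the .env value if set and non-empty, else the default."""
    value = parse_dotenv(dotenv_text).get("HSA_OVERRIDE_GFX_VERSION")
    return value or DEFAULT_GFX_OVERRIDE
```

With an empty .env, `resolve_gfx_override("")` yields the default; a line such as `HSA_OVERRIDE_GFX_VERSION=10.3.0` would replace it.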
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Setup script now:
1. Generates API key
2. Starts Docker containers
3. Downloads GGUF from mortdec.ai automatically (~5.3GB)
4. Creates Ollama model with correct chat template
5. Runs test inference
6. Prints connection details for Seth
Matt just runs ./setup.sh; no manual file copying is needed.
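
Step 1 of the list above (generating an API key) might look roughly like the following Python sketch. This is an assumption about the approach, not the code in setup.sh: the function names `generate_api_key` and `is_authorized` are hypothetical, and the key length is an illustrative choice.

```python
import hmac
import secrets


def generate_api_key(num_bytes: int = 32) -> str:
    """Create a random URL-safe key from num_bytes of OS-level entropy."""
    return secrets.token_urlsafe(num_bytes)


def is_authorized(presented: str, expected: str) -> bool:
    """Compare keys in constant time to avoid leaking timing information."""
    return hmac.compare_digest(presented.encode(), expected.encode())
```

The `secrets` module is preferred over `random` here because it draws from a cryptographically secure source, and `hmac.compare_digest` avoids the early-exit behaviour of a plain `==` comparison.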
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- API key auth on all inference endpoints
- Power/cost tracking: GPU TDP × inference time × electricity rate
- Spending cap enforcement
- Web dashboard with live stats
- Docker compose for AMD ROCm (Strix Halo) or NVIDIA
- Auto-setup script with GGUF loading
- Tested against local Ollama
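
The cost formula in the list above (GPU TDP × inference time × electricity rate) can be sketched in Python as follows, together with a simple spending cap check. This is a sketch under stated assumptions, not the project's implementation: TDP is taken in watts, time in seconds, and the rate in currency per kWh, and the names `inference_cost` and `SpendTracker` are hypothetical. Note that TDP is a rated figure rather than measured draw, so the result is an estimate.

```python
from dataclasses import dataclass


def inference_cost(tdp_watts: float, seconds: float, rate_per_kwh: float) -> float:
    """Estimate cost: convert watts and seconds to kWh, then apply the rate."""
    kwh = (tdp_watts / 1000.0) * (seconds / 3600.0)
    return kwh * rate_per_kwh


@dataclass
class SpendTracker:
    """Accumulates estimated spend and reports whether the cap still allows work."""
    cap: float
    spent: float = 0.0

    def record(self, tdp_watts: float, seconds: float, rate_per_kwh: float) -> float:
        """Add the estimated cost of one inference and return it."""
        cost = inference_cost(tdp_watts, seconds, rate_per_kwh)
        self.spent += cost
        return cost

    def allows_request(self) -> bool:
        """True while accumulated spend is still below the cap."""
        return self.spent < self.cap
```

For example, a 120 W GPU running for one hour at a rate of 0.30 per kWh uses 0.12 kWh, for an estimated cost of about 0.036.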
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>