4 Commits

Author SHA1 Message Date
Seth adeda6dd84 Pre-set HSA_OVERRIDE_GFX_VERSION for Strix Halo ROCm detection
Ollama ROCm doesn't auto-detect newer AMD iGPUs (gfx1150/1151).
Setting HSA_OVERRIDE_GFX_VERSION=11.0.0 in the compose file fixes this.
Configurable via .env for other AMD chips.
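
The "pre-set default, overridable via .env" behaviour described above can be sketched roughly as follows. This is an illustrative sketch only, not the repository's code: the helper names `parse_env` and `resolve_gfx_override` are hypothetical, and the real project presumably does this through Docker Compose variable substitution rather than Python. The value 10.3.0 in the usage note is just an example override for a different AMD chip.

```python
DEFAULT_GFX_VERSION = "11.0.0"  # default named in the commit message


def parse_env(text):
    """Parse KEY=VALUE lines from .env-style text, skipping blanks and comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def resolve_gfx_override(env_text):
    """Return the .env override if set and non-empty, else the pre-set default."""
    return parse_env(env_text).get("HSA_OVERRIDE_GFX_VERSION") or DEFAULT_GFX_VERSION
```

With an empty .env the function falls back to 11.0.0; a line such as `HSA_OVERRIDE_GFX_VERSION=10.3.0` replaces it.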

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 20:37:06 -04:00
Seth f470f052aa Fix models mount to read-write for Modelfile creation
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 19:35:45 -04:00
Seth df9f623943 Fully automated setup: downloads GGUF, loads model, tests inference
Setup script now:
1. Generates API key
2. Starts Docker containers
3. Downloads GGUF from mortdec.ai automatically (~5.3GB)
4. Creates Ollama model with correct chat template
5. Runs test inference
6. Prints connection details for Seth
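
Step 1 (API key generation), and the key check the gateway would later perform, might look something like the sketch below. It is an assumption-laden illustration rather than the actual setup script: the function names are hypothetical, and the `Authorization: Bearer <key>` header format is assumed because the commit does not say how the key is presented.

```python
import hmac
import secrets
from typing import Optional


def generate_api_key(nbytes: int = 32) -> str:
    """Create a random, URL-safe API key from nbytes of OS randomness."""
    return secrets.token_urlsafe(nbytes)


def is_authorized(header_value: Optional[str], expected_key: str) -> bool:
    """Check an assumed 'Bearer <key>' Authorization header against the key."""
    prefix = "Bearer "
    if not header_value or not header_value.startswith(prefix):
        return False
    presented = header_value[len(prefix):]
    # Constant-time comparison avoids leaking key contents through timing.
    return hmac.compare_digest(presented.encode(), expected_key.encode())
```

Using `secrets` rather than `random` matters here because the key is a credential; `hmac.compare_digest` is the standard-library way to compare it safely.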

Matt just runs ./setup.sh — no manual file copying.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 19:33:39 -04:00
Seth c5865feb35 Mortdecai Gateway — authenticated Ollama proxy with power metering
- API key auth on all inference endpoints
- Power/cost tracking: GPU TDP × inference time × electricity rate
- Spending cap enforcement
- Web dashboard with live stats
- Docker compose for AMD ROCm (Strix Halo) or NVIDIA
- Auto-setup script with GGUF loading
- Tested against local Ollama
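
The power/cost formula and spending cap listed above can be illustrated with a short sketch. It assumes TDP in watts, inference time in seconds, and an electricity rate per kWh, which requires a watt-seconds to kWh conversion that the one-line formula leaves implicit. The names `inference_cost` and `SpendingCap` are hypothetical and not taken from the gateway's code.

```python
WATT_SECONDS_PER_KWH = 3_600_000  # 1 kWh = 1000 W * 3600 s


def inference_cost(tdp_watts: float, seconds: float, rate_per_kwh: float) -> float:
    """Estimate cost as GPU TDP x inference time x electricity rate."""
    energy_kwh = tdp_watts * seconds / WATT_SECONDS_PER_KWH
    return energy_kwh * rate_per_kwh


class SpendingCap:
    """Track accumulated cost and refuse charges that would exceed the cap."""

    def __init__(self, cap: float) -> None:
        self.cap = cap
        self.spent = 0.0

    def charge(self, cost: float) -> bool:
        """Record the cost and return True, or return False if over the cap."""
        if self.spent + cost > self.cap:
            return False
        self.spent += cost
        return True
```

For example, a 120 W TDP held for one hour at a rate of 0.15 per kWh comes to about 0.018. Because this uses TDP rather than measured draw, it is an upper-bound style estimate, not a meter reading.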

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 19:26:43 -04:00