Gateway: POST /admin/update-model downloads new GGUF and reloads.
Disabled by default — requires ALLOW_MODEL_UPDATES=true in .env.
Matt controls whether remote model updates are allowed.
Self-play: --api-key flag for authenticated gateway connections.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- API key auth on all inference endpoints
- Power/cost tracking: GPU TDP × inference time × electricity rate
- Spending cap enforcement
- Web dashboard with live stats
- Docker compose for AMD ROCm (Strix Halo) or NVIDIA
- Auto-setup script with GGUF loading
- Tested against local Ollama
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>