c5865feb35
- API key auth on all inference endpoints
- Power/cost tracking: GPU TDP × inference time × electricity rate
- Spending cap enforcement
- Web dashboard with live stats
- Docker compose for AMD ROCm (Strix Halo) or NVIDIA
- Auto-setup script with GGUF loading
- Tested against local Ollama

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
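
The power/cost formula and the spending cap named in the commit message can be sketched as follows. This is an illustrative sketch only, not the repository's code: the function names, parameter names, and units (watts, seconds, price per kWh) are assumptions. Using TDP as the power figure treats the GPU as running at its rated maximum for the whole call, so the result is an upper-bound estimate rather than a measurement.

```python
def inference_cost(tdp_watts: float, seconds: float, rate_per_kwh: float) -> float:
    """Estimate the electricity cost of one inference call.

    Hypothetical helper: cost = TDP x inference time x electricity rate,
    with watt-seconds converted to kilowatt-hours.
    """
    kwh = tdp_watts * seconds / 3_600_000  # 1 kWh = 3,600,000 watt-seconds
    return kwh * rate_per_kwh


def within_cap(spent: float, next_cost: float, cap: float) -> bool:
    """Return True if adding next_cost keeps total spending at or under cap."""
    return spent + next_cost <= cap


if __name__ == "__main__":
    # Assumed example values: a 300 W GPU busy for 12 s at 0.25 per kWh.
    cost = inference_cost(tdp_watts=300, seconds=12, rate_per_kwh=0.25)
    print(f"estimated cost: {cost:.6f}")
    print("allowed:", within_cap(spent=4.99, next_cost=cost, cap=5.00))
```

A cap check of this kind would run before the request is forwarded to the model, so that a call which would exceed the cap is rejected rather than billed after the fact.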
7 lines
64 B
Plaintext
.env
SESSION.default.md
GITEA_API.md
__pycache__/
*.pyc
models/