c5865feb35
- API key auth on all inference endpoints - Power/cost tracking: GPU TDP × inference time × electricity rate - Spending cap enforcement - Web dashboard with live stats - Docker compose for AMD ROCm (Strix Halo) or NVIDIA - Auto-setup script with GGUF loading - Tested against local Ollama Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
6 lines
123 B
Docker
6 lines
123 B
Docker
FROM python:3.11-slim
|
|
RUN pip install --no-cache-dir requests
|
|
WORKDIR /app
|
|
COPY gateway.py .
|
|
CMD ["python3", "gateway.py"]
|