Mortdecai Gateway
Authenticated Ollama proxy with power metering. Deploy on any machine with a GPU to contribute inference compute to the Mortdecai training pipeline.
Quick Start
git clone <repo-url>
cd mortdecai-gateway
mkdir -p models
# Copy the GGUF file into models/
cp /path/to/mortdecai-v4.gguf models/
chmod +x setup.sh
./setup.sh
Dashboard: http://localhost:8434/dashboard
What It Does
Your GPU → Ollama → Gateway (auth + metering) → Port 8434 → Internet
The gateway sits in front of Ollama and:
- Authenticates requests via API key
- Tracks inference time, tokens, energy usage
- Estimates electricity cost ((GPU TDP + system overhead) × time × rate)
- Enforces a spending cap
- Provides a dashboard with live stats
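The cost estimate can be sketched in a few lines. This is a minimal illustration, not the gateway's actual code: the function name `estimate_cost` is hypothetical, and the formula (GPU TDP plus system overhead, multiplied by duration and the electricity rate) is inferred from the default values in the Configuration section and the sample figures in the Response Metadata section.

```python
def estimate_cost(duration_seconds, gpu_tdp_watts=54.0,
                  system_overhead_watts=30.0, electricity_rate=0.15):
    """Return (energy_wh, cost_dollars) for one request.

    ASSUMPTION: power draw is modeled as a constant GPU TDP plus a fixed
    system overhead for the whole request; the real gateway may differ.
    """
    total_watts = gpu_tdp_watts + system_overhead_watts
    # Watts x seconds = joules; divide by 3600 to get watt-hours.
    energy_wh = total_watts * duration_seconds / 3600.0
    # The rate is in $/kWh, so convert Wh to kWh first.
    cost = energy_wh / 1000.0 * electricity_rate
    return energy_wh, cost


energy_wh, cost = estimate_cost(3.42)
print(f"{energy_wh:.4f} Wh, ${cost:.6f}")  # prints "0.0798 Wh, $0.000012"
```

With the defaults above, a 3.42-second request reproduces the `energy_wh` and `estimated_cost` values shown in the sample response.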
Configuration
Edit .env:
API_KEY=mk_your_secret_key
GPU_TDP_WATTS=54 # Your GPU's TDP
SYSTEM_OVERHEAD_WATTS=30 # CPU/RAM draw during inference
ELECTRICITY_RATE=0.15 # $/kWh
SPENDING_CAP=10.00 # $ limit; gateway stops accepting requests once reached
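To sanity-check the values before starting the gateway, a file in this format can be read with a few lines of Python. This is only a sketch: `parse_env` is a hypothetical helper, not part of the gateway, and it assumes `KEY=value` lines with optional trailing `#` comments (so a value that itself contains `#` would be cut short).

```python
ENV_TEXT = """\
API_KEY=mk_your_secret_key
GPU_TDP_WATTS=54 # Your GPU's TDP
SYSTEM_OVERHEAD_WATTS=30 # CPU/RAM draw during inference
ELECTRICITY_RATE=0.15 # $/kWh
SPENDING_CAP=10.00 # $ before gateway stops accepting
"""


def parse_env(text):
    """Parse KEY=value lines into a dict of strings, dropping # comments."""
    settings = {}
    for raw_line in text.splitlines():
        # Everything after the first '#' is treated as a comment.
        line = raw_line.split("#", 1)[0].strip()
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


settings = parse_env(ENV_TEXT)
total_watts = float(settings["GPU_TDP_WATTS"]) + float(settings["SYSTEM_OVERHEAD_WATTS"])
print(total_watts, float(settings["SPENDING_CAP"]))
```

To check a real file, pass `open(".env").read()` instead of `ENV_TEXT`.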
Endpoints
| Endpoint | Auth | Description |
|---|---|---|
| GET /health | No | Ollama status + loaded models |
| GET /dashboard | No | Web dashboard with live stats |
| GET /stats | Yes | JSON usage stats |
| POST /api/chat | Yes | Proxied to Ollama |
| POST /api/generate | Yes | Proxied to Ollama |
| * | Yes | Everything else proxied to Ollama |
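A client call to an authenticated endpoint might look like the sketch below. Several details are assumptions, not taken from this README: the `Authorization: Bearer` header scheme (check the gateway source for how the API key is actually passed), the model name `mortdecai-v4` (guessed from the GGUF filename in Quick Start), and the helper name `build_chat_request`. The request body follows the standard Ollama `/api/chat` shape.

```python
import json
import urllib.request

GATEWAY_URL = "http://localhost:8434"  # port from the Quick Start dashboard URL


def build_chat_request(api_key, model, prompt):
    """Build (but do not send) a POST /api/chat request for the gateway."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # ask Ollama for a single JSON response
    }).encode("utf-8")
    return urllib.request.Request(
        f"{GATEWAY_URL}/api/chat",
        data=body,
        headers={
            "Content-Type": "application/json",
            # ASSUMPTION: Bearer scheme; the gateway may expect another header.
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


req = build_chat_request("mk_your_secret_key", "mortdecai-v4", "Hello")
# To send it against a running gateway:
#   with urllib.request.urlopen(req) as resp:
#       reply = json.load(resp)
```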
Response Metadata
Every proxied response includes a _gateway field:
{
"message": { "role": "assistant", "content": "..." },
"_gateway": {
"duration_seconds": 3.42,
"energy_wh": 0.0798,
"estimated_cost": 0.000012,
"total_cost": 0.0342,
"budget_remaining": 9.9658
}
}
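A client can read this field to watch the remaining budget. The sketch below parses the sample response above; `budget_status` and the `warn_below` threshold are hypothetical names chosen for illustration.

```python
import json

SAMPLE_RESPONSE = """
{
  "message": { "role": "assistant", "content": "..." },
  "_gateway": {
    "duration_seconds": 3.42,
    "energy_wh": 0.0798,
    "estimated_cost": 0.000012,
    "total_cost": 0.0342,
    "budget_remaining": 9.9658
  }
}
"""


def budget_status(response_text, warn_below=1.0):
    """Return the remaining budget and whether it is under warn_below dollars."""
    payload = json.loads(response_text)
    meta = payload.get("_gateway", {})  # absent if the response was not proxied
    remaining = meta.get("budget_remaining")
    return {
        "remaining": remaining,
        "low": remaining is not None and remaining < warn_below,
    }


status = budget_status(SAMPLE_RESPONSE)
print(status)  # prints {'remaining': 9.9658, 'low': False}
```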
AMD ROCm
The Docker Compose file uses the ollama/ollama:rocm image by default, which requires ROCm drivers on the host. For Strix Halo, ensure the BIOS is set to reserved VRAM mode.
NVIDIA
Edit docker-compose.yml: uncomment the deploy section and comment out the devices section.