Mortdecai Gateway — authenticated Ollama proxy with power metering

- API key auth on all inference endpoints
- Power/cost tracking: GPU TDP × inference time × electricity rate
- Spending cap enforcement
- Web dashboard with live stats
- Docker compose for AMD ROCm (Strix Halo) or NVIDIA
- Auto-setup script with GGUF loading
- Tested against local Ollama

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 19:26:43 -04:00
commit c5865feb35
7 changed files with 561 additions and 0 deletions
@@ -0,0 +1,78 @@
# Mortdecai Gateway
Authenticated Ollama proxy with power metering. Deploy on any machine with a GPU to contribute inference compute to the Mortdecai training pipeline.
## Quick Start
```bash
git clone <repo-url>
cd mortdecai-gateway
mkdir -p models
# Copy the GGUF file into models/
cp /path/to/mortdecai-v4.gguf models/
chmod +x setup.sh
./setup.sh
```
Dashboard: http://localhost:8434/dashboard
## What It Does
```
Your GPU → Ollama → Gateway (auth + metering) → Port 8434 → Internet
```
The gateway sits in front of Ollama and:
- Authenticates requests via API key
- Tracks inference time, token counts, and energy usage
- Estimates electricity cost (GPU TDP × time × rate)
- Enforces a spending cap
- Provides a dashboard with live stats
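The cost estimate above can be sketched in a few lines of Python. This is an illustration, not the gateway's actual code: the function name is hypothetical, and adding the system overhead wattage to the GPU TDP is an assumption (the bullet only mentions GPU TDP), although it reproduces the sample figures shown under Response Metadata when the default `.env` values are used.

```python
def estimate_inference_cost(duration_seconds,
                            gpu_tdp_watts=54.0,
                            system_overhead_watts=30.0,
                            electricity_rate=0.15):
    """Return (energy_wh, cost_dollars) for one inference call.

    Assumption: the machine draws GPU TDP plus a fixed system overhead
    for the whole duration of the request.
    """
    total_watts = gpu_tdp_watts + system_overhead_watts
    # watts * seconds = joules; 3600 joules = 1 watt-hour
    energy_wh = total_watts * duration_seconds / 3600.0
    # the electricity rate is in $/kWh, so convert Wh to kWh first
    cost_dollars = energy_wh / 1000.0 * electricity_rate
    return energy_wh, cost_dollars


energy_wh, cost = estimate_inference_cost(3.42)
print(f"{energy_wh:.4f} Wh, ${cost:.6f}")  # prints "0.0798 Wh, $0.000012"
```

Because TDP is a rated ceiling rather than a measured draw, a figure computed this way is best read as an upper-bound estimate.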
## Configuration
Edit `.env`:
```
API_KEY=mk_your_secret_key
GPU_TDP_WATTS=54 # Your GPU's TDP
SYSTEM_OVERHEAD_WATTS=30 # CPU/RAM draw during inference
ELECTRICITY_RATE=0.15 # $/kWh
SPENDING_CAP=10.00 # $ spent before the gateway stops accepting requests
```
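As a rough sketch of how these settings might be consumed, the Python below parses `KEY=VALUE` lines with trailing `#` comments and tracks spending against the cap. `parse_env` and `SpendingCap` are hypothetical names, and the rule "reject once the running total reaches the cap" is an assumption about how enforcement could work, not a description of the gateway's code.

```python
SAMPLE_ENV = """\
API_KEY=mk_your_secret_key
GPU_TDP_WATTS=54            # GPU TDP in watts
SYSTEM_OVERHEAD_WATTS=30    # CPU/RAM draw in watts
ELECTRICITY_RATE=0.15       # $/kWh
SPENDING_CAP=10.00          # $ cap
"""


def parse_env(text):
    """Parse KEY=VALUE lines, ignoring blank lines and '#' comments.

    Simplification: everything after the first '#' is dropped, so
    values that themselves contain '#' are not supported here.
    """
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


class SpendingCap:
    """Running total of estimated cost, compared against a fixed cap."""

    def __init__(self, cap_dollars):
        self.cap = cap_dollars
        self.total = 0.0

    def record(self, cost_dollars):
        self.total += cost_dollars

    def remaining(self):
        return max(self.cap - self.total, 0.0)

    def allows_request(self):
        # Assumed policy: stop accepting once the total reaches the cap.
        return self.total < self.cap


settings = parse_env(SAMPLE_ENV)
cap = SpendingCap(float(settings["SPENDING_CAP"]))
cap.record(0.0342)
print(cap.allows_request(), round(cap.remaining(), 4))  # prints "True 9.9658"
```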
## Endpoints
| Endpoint | Auth | Description |
|----------|------|-------------|
| `GET /health` | No | Ollama status + loaded models |
| `GET /dashboard` | No | Web dashboard with live stats |
| `GET /stats` | Yes | JSON usage stats |
| `POST /api/chat` | Yes | Proxied to Ollama |
| `POST /api/generate` | Yes | Proxied to Ollama |
| `*` | Yes | Everything else proxied to Ollama |
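For illustration, an authenticated `POST /api/chat` request could be built with Python's standard library as follows. The table does not say how the API key is transmitted, so the `Authorization: Bearer <key>` header used here is an assumption that may need changing; the model name `mortdecai-v4` is likewise a guess based on the GGUF filename in Quick Start. The body fields (`model`, `messages`, `stream`) follow Ollama's chat API.

```python
import json
import urllib.request


def build_chat_request(base_url, api_key, model, messages,
                       auth_header="Authorization", auth_prefix="Bearer "):
    """Build (but do not send) a POST /api/chat request.

    Assumption: the gateway reads the key from an
    'Authorization: Bearer <key>' header; adjust auth_header and
    auth_prefix if it expects something else.
    """
    body = json.dumps({
        "model": model,
        "messages": messages,
        "stream": False,  # ask Ollama for a single JSON response
    }).encode("utf-8")
    request = urllib.request.Request(
        url=base_url.rstrip("/") + "/api/chat",
        data=body,
        method="POST",
    )
    request.add_header("Content-Type", "application/json")
    request.add_header(auth_header, auth_prefix + api_key)
    return request


request = build_chat_request(
    "http://localhost:8434",
    "mk_your_secret_key",
    "mortdecai-v4",  # hypothetical model name
    [{"role": "user", "content": "Hello"}],
)
# To actually send it (requires a running gateway):
# with urllib.request.urlopen(request) as response:
#     print(json.loads(response.read()))
```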
## Response Metadata
Every proxied response includes a `_gateway` field:
```json
{
  "message": { "role": "assistant", "content": "..." },
  "_gateway": {
    "duration_seconds": 3.42,
    "energy_wh": 0.0798,
    "estimated_cost": 0.000012,
    "total_cost": 0.0342,
    "budget_remaining": 9.9658
  }
}
```
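A client can use this metadata to decide when to stop sending work. The sketch below parses the sample response above; the helper names and the `min_budget` threshold are hypothetical, and it assumes a non-streaming response that arrives as a single JSON object.

```python
import json

SAMPLE_RESPONSE = """
{
  "message": { "role": "assistant", "content": "..." },
  "_gateway": {
    "duration_seconds": 3.42,
    "energy_wh": 0.0798,
    "estimated_cost": 0.000012,
    "total_cost": 0.0342,
    "budget_remaining": 9.9658
  }
}
"""


def gateway_metadata(response_text):
    """Return the `_gateway` object, or an empty dict if it is absent."""
    payload = json.loads(response_text)
    return payload.get("_gateway", {})


def should_keep_sending(meta, min_budget=0.01):
    """Client-side check: continue only while some budget is left."""
    return meta.get("budget_remaining", 0.0) > min_budget


meta = gateway_metadata(SAMPLE_RESPONSE)
print(meta["energy_wh"], should_keep_sending(meta))  # prints "0.0798 True"
```

In the sample, `total_cost` and `budget_remaining` sum to 10.00, which matches the default `SPENDING_CAP`.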
## AMD ROCm
The Docker Compose file uses the `ollama/ollama:rocm` image by default, which requires ROCm drivers on the host. For Strix Halo, ensure the BIOS is set to reserved VRAM mode.
## NVIDIA
Edit `docker-compose.yml`: uncomment the `deploy` section and comment out the `devices` section.