mortdecai-gateway

Seth/mortdecai-gateway

Commit Graph

Author

SHA1

Message

Date

Seth

6d3df9ae58

Full cost model: marginal power, labor, profit, live config

Cost model:
- Marginal billing: only charge for watts above idle
- Dedicated billing: charge for all uptime (optional)
- Labor rate: $/hr for operator time, manually logged
- Profit margin: percentage markup on electricity cost
- All parameters adjustable live via POST /config
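
The cost model above could be sketched roughly as follows. This is an illustration, not the gateway's code: every field name, default value, and function name is an assumption, and the reading of dedicated billing (idle draw for the whole uptime plus the extra draw during inference) is only one plausible interpretation of the bullet.

```python
from dataclasses import dataclass


@dataclass
class CostConfig:
    # Every field name and default here is an illustrative assumption.
    idle_watts: float = 50.0         # draw with no inference running
    rate_per_kwh: float = 0.20       # electricity price, $/kWh
    labor_rate_per_hr: float = 30.0  # $/hr for manually logged operator time
    profit_margin_pct: float = 10.0  # markup applied to electricity cost only
    dedicated: bool = False          # True: bill all uptime, not just inference


def electricity_cost(cfg, load_watts, inference_hours, uptime_hours):
    """Electricity charge in dollars under the selected billing mode."""
    # Marginal billing only counts the watts above idle during inference.
    marginal_wh = max(load_watts - cfg.idle_watts, 0.0) * inference_hours
    if cfg.dedicated:
        # Assumed reading of "charge for all uptime": idle draw for the whole
        # uptime, plus the extra draw while inference is running.
        energy_wh = cfg.idle_watts * uptime_hours + marginal_wh
    else:
        energy_wh = marginal_wh
    return energy_wh / 1000.0 * cfg.rate_per_kwh


def total_owed(cfg, load_watts, inference_hours, uptime_hours, labor_hours):
    """Total owed = electricity + labor + margin on electricity."""
    electricity = electricity_cost(cfg, load_watts, inference_hours, uptime_hours)
    labor = labor_hours * cfg.labor_rate_per_hr
    margin = electricity * cfg.profit_margin_pct / 100.0
    return electricity + labor + margin
```

With these assumed defaults, a 250 W load for 2 hours bills 200 W x 2 h = 0.4 kWh under marginal billing, so the margin applies to that electricity charge and never to labor.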

Dashboard shows:
- Cost breakdown with progress bar
- Power model (idle→load for GPU and system)
- Marginal watts per inference call
- Labor hours + labor cost
- Total owed (electricity + labor + margin)
- GPU utilization, temperature, power draw
- Avg cost per request, estimated remaining requests

Endpoints:
- GET /config — view current cost config
- POST /config — update any parameter live
- GET /stats — full usage stats + cost config (auth required)
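
A live update through POST /config might look something like the sketch below. The commit does not document the request body, so the JSON encoding, the host and port, and the parameter name `profit_margin_pct` are all assumptions. The helper only builds the request; sending it is left to the caller.

```python
import json
import urllib.request


def build_config_update(base_url, changes):
    """Build (but do not send) a POST /config request carrying `changes` as JSON."""
    body = json.dumps(changes).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/config",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical gateway address and parameter name.
req = build_config_update("http://localhost:8080", {"profit_margin_pct": 15})
# To send it against a running gateway: urllib.request.urlopen(req)
```

The commit states that GET /stats requires auth but does not say whether /config does, so no credential header is included here.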

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

2026-03-20 19:49:14 -04:00

Seth

0b37d7de79

Add opt-in model update endpoint + API key support

Gateway: POST /admin/update-model downloads new GGUF and reloads.
Disabled by default — requires ALLOW_MODEL_UPDATES=true in .env.
Matt controls whether remote model updates are allowed.
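
The opt-in gate could be sketched as below. Only the variable name ALLOW_MODEL_UPDATES and the value `true` come from the commit message; the function names, the case-insensitive comparison, and the status codes are assumptions.

```python
import os


def model_updates_allowed(env=None):
    """True only when ALLOW_MODEL_UPDATES is set to 'true'; anything else disables."""
    env = os.environ if env is None else env
    return env.get("ALLOW_MODEL_UPDATES", "").strip().lower() == "true"


def handle_update_model(env=None):
    """Return an assumed (HTTP status, message) pair for POST /admin/update-model."""
    if not model_updates_allowed(env):
        return 403, "model updates are disabled"
    # The GGUF download and reload steps are omitted from this sketch.
    return 202, "update accepted"
```

Treating an unset or unrecognized value as disabled keeps the endpoint off by default, which matches the behavior the commit describes.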

Self-play: --api-key flag for authenticated gateway connections.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

2026-03-20 19:39:50 -04:00

Seth

c5865feb35

Mortdecai Gateway — authenticated Ollama proxy with power metering

- API key auth on all inference endpoints
- Power/cost tracking: GPU TDP × inference time × electricity rate
- Spending cap enforcement
- Web dashboard with live stats
- Docker compose for AMD ROCm (Strix Halo) or NVIDIA
- Auto-setup script with GGUF loading
- Tested against local Ollama
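
The power/cost formula and the spending cap from the list above can be illustrated with a short sketch. The names `inference_cost` and `SpendMeter` are hypothetical, and refusing new requests once the running total reaches the cap is an assumed policy, not one the commit spells out.

```python
def inference_cost(tdp_watts, seconds, rate_per_kwh):
    """GPU TDP x inference time x electricity rate, in dollars."""
    kwh = tdp_watts * (seconds / 3600.0) / 1000.0
    return kwh * rate_per_kwh


class SpendMeter:
    """Running total of inference cost checked against a fixed cap."""

    def __init__(self, cap_dollars):
        self.cap_dollars = cap_dollars
        self.spent = 0.0

    def allow_request(self):
        # Assumed policy: refuse new requests once the cap has been reached.
        return self.spent < self.cap_dollars

    def record(self, tdp_watts, seconds, rate_per_kwh):
        """Add one finished inference call to the running total."""
        self.spent += inference_cost(tdp_watts, seconds, rate_per_kwh)
        return self.spent
```

For example, a 120 W TDP over a 30 second call is 1 Wh, which at $0.15/kWh costs $0.00015.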

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

2026-03-20 19:26:43 -04:00

3 Commits