Add Appendix A: When the Metric Is the Product

Explores the case where the unweighted mean is reported directly to the
client, making the metric itself the source of satisfaction. Under this
model the entire paper's conclusion inverts: SPT genuinely maximizes
client satisfaction at zero marginal cost.

Analyzes this as a moral hazard / pooling equilibrium using game theory,
identifies three fragility conditions (client inspects own ticket,
competitor offers per-ticket SLAs, team internalizes the metric), and
maps the pattern across domains (education, healthcare, finance, software).

Concludes: the incentive exists, the equilibrium is real, and it holds
until it doesn't.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Mortdecai
2026-03-28 17:29:56 -04:00
parent 574eca5b27
commit 3cf815d28b
+173
@@ -874,4 +874,177 @@ where none exists.**
---
## Appendix A. When the Metric Is the Product
The preceding twelve sections rest on an implicit assumption: that client
satisfaction is a function of *experienced service quality* — how long
*their* task took, relative to its size and urgency. If this assumption
holds, the proof is valid and the unweighted mean is a destructive metric.
But there is a scenario in which the assumption fails and the conclusion
no longer follows.
### A.1 The Self-Referential Metric
Suppose the service provider reports the unweighted mean completion time
directly to the client — on a dashboard, in an SLA report, on a marketing
page — and the client's satisfaction is derived primarily from *that number*
rather than from their individual experience.
Define client satisfaction as:
$$U_{\text{client}} = f\!\left(\bar{C}(\sigma)\right), \quad f' < 0$$
That is: the client sees "Average resolution time: 6.56 hours" and is
satisfied, without checking whether *their* ticket — the critical email
outage — took 6.56 hours or 18.75 hours.
Under this model, SPT genuinely maximizes client satisfaction: $f$ is
decreasing and SPT minimizes $\bar{C}$ (Theorem 1). The service provider's
throughput is unchanged (Theorem 6). The business outcome improves: same
work done, happier client.
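The gap between the dashboard number and one client's experience can be sketched in a few lines. The ticket names and durations below are hypothetical, chosen for illustration, and are not the data behind the 6.56 and 18.75 hour figures quoted above.

```python
from itertools import accumulate

# Hypothetical queue of (ticket, hours of work); not the paper's data.
TICKETS = [
    ("email outage (critical)", 8.0),
    ("laptop reimage", 3.0),
    ("printer driver", 1.0),
    ("password reset", 0.5),
]

def completion_times(order):
    """Completion time of each ticket when worked back to back from t = 0."""
    finishes = accumulate(hours for _, hours in order)
    return {name: done for (name, _), done in zip(order, finishes)}

def mean_completion(order):
    """Unweighted mean completion time: the number shown on the dashboard."""
    times = completion_times(order)
    return sum(times.values()) / len(times)

spt = sorted(TICKETS, key=lambda ticket: ticket[1])  # shortest first
critical_first = TICKETS                             # outage first, rest as listed

for label, order in [("SPT", spt), ("critical first", critical_first)]:
    outage_done = completion_times(order)["email outage (critical)"]
    print(f"{label}: dashboard mean = {mean_completion(order)} h, "
          f"outage resolved at {outage_done} h")
```

With these made-up numbers the dashboard reads 4.75 h under SPT while the outage itself is resolved at 12.5 h. Handling the outage first resolves it at 8 h but pushes the dashboard to 10.875 h.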
**Every theorem in this paper remains mathematically correct. But the
conclusion inverts.** The metric is no longer a proxy for service quality
that can be gamed — it *is* the service quality, because the client has
agreed to evaluate quality by the aggregate number rather than by their
individual experience.
### A.2 The Economics
This creates a coherent, stable business equilibrium:
| Actor | Behavior | Outcome |
|-------|----------|---------|
| Provider | Optimizes unweighted mean (SPT) | Metric improves, no extra work |
| Client | Reads dashboard, sees low average | Reports satisfaction |
| Management | Sees satisfied client + good metric | Rewards team |
Throughput is unchanged (Theorem 6), so the same revenue-generating work
is completed. The only thing that changed is the *order* — and therefore
the reported number. Real resources were rearranged and no additional value
was created, yet every business metric moved in the right direction.
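A brute-force check makes the "same work, different number" claim concrete. This is a minimal sketch over a hypothetical set of four task durations; it assumes a single worker processing tasks back to back with no idle time.

```python
from itertools import accumulate, permutations

# Hypothetical task durations in hours (illustrative only).
DURATIONS = [0.5, 1.0, 3.0, 8.0]

def makespan(order):
    """Time at which the last task finishes: total work, whatever the order."""
    return sum(order)

def mean_completion(order):
    """Unweighted mean of the per-task completion times."""
    return sum(accumulate(order)) / len(order)

orders = list(permutations(DURATIONS))
makespans = {makespan(order) for order in orders}
means = [mean_completion(order) for order in orders]
best = min(orders, key=mean_completion)

print(f"{len(orders)} schedules, distinct makespans: {makespans}")
print(f"mean completion ranges from {min(means)} to {max(means)} h")
print(f"schedule with the lowest mean: {best}")
```

All 24 orderings finish at 12.5 h, so throughput is identical, while the reported mean ranges from 4.75 h (shortest first) to 10.875 h (longest first).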
This is *profitable*. The provider extracts satisfaction from the client
at zero marginal cost, by optimizing a number that the client has accepted
as a proxy for quality. The client is no worse off *in their own estimation*,
because they evaluate the aggregate, not their individual experience.
### A.3 The Fragility
This equilibrium is stable only as long as the client never inspects
their own experience. It breaks the moment any of the following occur:
**1. The client checks their own ticket.**
A CTO whose email server was down for 18.75 hours will not be reassured
by a dashboard reading "Average resolution: 6.56 hours." The aggregate
metric and the individual experience diverge maximally for high-priority
tasks (Theorem 4). The clients most likely to inspect their own experience
are exactly the ones receiving the worst service.
**2. A competitor offers per-ticket SLAs.**
If an alternative provider guarantees "P1 incidents resolved within 4 hours"
instead of "average resolution under 7 hours," the aggregate-metric provider
cannot compete for clients with critical needs — which are typically the
highest-value clients.
**3. The provider's team internalizes the metric.**
If the team believes the metric reflects real performance (rather than
consciously gaming it), they lose the ability to recognize when critical
work is being neglected. The metric becomes an epistemic hazard: it
tells the team they are performing well, preventing them from seeing that
they are not.
### A.4 The General Pattern
This is not unique to task scheduling. The structure is:
1. A measurable proxy is established for an unmeasured quality.
2. The proxy is reported as if it were the quality itself.
3. The proxy is optimized, improving the reported number.
4. The underlying quality diverges from the proxy, but no one measures
the underlying quality because the proxy exists.
5. The system is stable until an exogenous shock forces inspection of
the underlying quality.
This pattern appears across domains:
| Domain | Proxy metric | Underlying quality | Divergence |
|--------|-------------|-------------------|------------|
| IT support | Avg. resolution time | Critical system uptime | Server down for 18.75 hrs, avg says 6.56 |
| Education | Standardized test scores | Actual learning | Teaching to the test, understanding declines |
| Healthcare | Patient throughput | Patient outcomes | Faster discharges, higher readmission rates |
| Finance | Quarterly earnings | Long-term value creation | Cost-cutting inflates EPS, erodes capability |
| Software | Velocity (story points) | Deliverable product quality | Point inflation, features half-finished |
In each case, the proxy is optimized, the number improves, and the system
*functions* — profitably, even — until the moment the underlying quality
is tested by reality.
### A.5 A Mathematical Note on Equilibrium Stability
Model the system as a game between provider (P) and client (C).
**Information structure:**
- P observes individual completion times $\{C_i\}$ and chooses schedule $\sigma$
- C observes only the reported aggregate $\bar{C}(\sigma)$
**Payoffs:**
- P's payoff increases with C's satisfaction and is independent of schedule
(throughput is invariant)
- C's *reported* satisfaction $U_{\text{client}} = f(\bar{C})$ (as defined in A.1) is maximized by SPT
- C's *actual* welfare (if they could observe it) depends on individual
$C_i$ values, especially for high-priority tasks
This is a **moral hazard** problem: P's action (the schedule $\sigma$, and
with it the distribution of individual $C_i$) is hidden from C. P's optimal
strategy is to minimize the observable signal ($\bar{C}$) regardless of the
unobservable distribution, which is exactly SPT.
The equilibrium is a **pooling equilibrium**: the reported aggregate looks
identical to the client regardless of the underlying priority-weighted
performance. A provider with PWCT = 10.2 and a provider with PWCT = 10.167
both report $\bar{C} = 6.56$ under SPT. The client cannot distinguish
between them.
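The pooling effect can be illustrated with two hypothetical providers whose queues have the same durations but different priorities. The sketch assumes PWCT is the priority-weighted mean completion time, $\sum_i w_i C_i / \sum_i w_i$. The weights and durations are invented for illustration and do not reproduce the 10.2 and 10.167 figures above.

```python
from itertools import accumulate

def spt_report(tasks):
    """Run tasks shortest-first; return (unweighted mean, assumed PWCT).

    tasks: list of (duration_hours, priority_weight).
    PWCT is assumed here to be sum(w_i * C_i) / sum(w_i).
    """
    ordered = sorted(tasks, key=lambda task: task[0])
    finishes = list(accumulate(hours for hours, _ in ordered))
    mean = sum(finishes) / len(finishes)
    total_weight = sum(weight for _, weight in ordered)
    pwct = sum(w * c for (_, w), c in zip(ordered, finishes)) / total_weight
    return mean, pwct

# Same durations, different priorities (hypothetical data).
provider_a = [(0.5, 1), (1.0, 1), (3.0, 1), (8.0, 5)]  # critical task is the longest
provider_b = [(0.5, 5), (1.0, 1), (3.0, 1), (8.0, 1)]  # critical task is the shortest

mean_a, pwct_a = spt_report(provider_a)
mean_b, pwct_b = spt_report(provider_b)
print(f"A: reported mean {mean_a} h, PWCT {pwct_a} h")
print(f"B: reported mean {mean_b} h, PWCT {pwct_b} h")
```

Both providers report a mean of 4.75 h, yet the assumed PWCT is 8.625 h for A and 2.625 h for B. A client who sees only the mean has no way to tell them apart.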
This equilibrium is stable in the Nash sense, meaning neither player gains
by deviating unilaterally: **C has no incentive to deviate** (they have no
better information source) and **P has no incentive to deviate** (any other
schedule worsens $\bar{C}$ with zero throughput benefit).
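P's side of the no-deviation claim can be written in one line. This is a sketch under an added assumption: P's payoff is written $\pi_P = g(U_{\text{client}})$ with $g' > 0$, where $\pi_P$ and $g$ are notation introduced here for illustration only.

$$\sigma_{\text{SPT}} \in \arg\min_{\sigma} \bar{C}(\sigma) = \arg\max_{\sigma} f\!\left(\bar{C}(\sigma)\right) = \arg\max_{\sigma} g\!\left(f\!\left(\bar{C}(\sigma)\right)\right)$$

The membership is the SPT optimality result cited in A.1 (Theorem 1), the first equality uses $f' < 0$, and the second uses $g' > 0$. Because throughput does not depend on $\sigma$, no other term enters P's payoff under this assumption.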
It is *unstable* under **information revelation**: if C obtains access to
individual $C_i$ values (via a customer portal, a competing vendor's
transparency, or a sufficiently painful incident), the pooling equilibrium
collapses and C's evaluation shifts to the underlying quality.
### A.6 The Uncomfortable Conclusion
The honest answer to "does optimizing the unweighted mean hurt the
business?" is: **not necessarily, as long as the client never looks
behind the number**.
The honest answer to "does it hurt the client?" is: **only when they
have a problem large enough to notice** — which is precisely when the
metric's distortion is largest (Theorem 4).
The honest answer to "is this sustainable?" is: it is exactly as
sustainable as any system in which the seller knows more than the buyer.
Such systems are historically stable for extended periods and then
collapse rapidly when the information asymmetry is punctured — by a
crisis, a competitor, or a regulator.
The mathematical structure is clear: reporting only the unweighted mean
creates an information asymmetry between the provider, who sees every
$C_i$, and the client, who sees only the aggregate. Optimizing the metric
under this asymmetry is *locally rational* for the provider, *locally
satisfying* for the uninspecting client, and *globally fragile* for the
relationship.
Whether one calls this "efficient market behavior" or "a dystopian
consequence of optimizing legible numbers over illegible reality" is not
a mathematical question. The math says only this: **the incentive exists,
the equilibrium is real, and it holds until it doesn't.**
---
*This proof was developed conversationally and formalized on 2026-03-28.*