Add Appendix A: When the Metric Is the Product

Explores the case where the unweighted mean is reported directly to the
client, making the metric itself the source of satisfaction. Under this
model the entire paper's conclusion inverts: SPT genuinely maximizes client
satisfaction at zero marginal cost. Analyzes this as a moral hazard /
pooling equilibrium using game theory, identifies three fragility
conditions (client inspects own ticket, competitor offers per-ticket SLAs,
team internalizes the metric), and maps the pattern across domains
(education, healthcare, finance, software). Concludes: the incentive
exists, the equilibrium is real, and it holds until it doesn't.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@@ -874,4 +874,177 @@ where none exists.**

---

## Appendix A. When the Metric Is the Product

The preceding twelve sections rest on an implicit assumption: that client
satisfaction is a function of *experienced service quality*, meaning how
long *their* task took relative to its size and urgency. If this assumption
holds, the proof is valid and the unweighted mean is a destructive metric.

But there exists a scenario in which the assumption fails and the entire
argument collapses.

### A.1 The Self-Referential Metric

Suppose the service provider reports the unweighted mean completion time
directly to the client (on a dashboard, in an SLA report, on a marketing
page) and the client's satisfaction is derived primarily from *that number*
rather than from their individual experience.

Define client satisfaction as:

$$U_{\text{client}} = f\!\left(\bar{C}(\sigma)\right), \quad f' < 0$$

That is: the client sees "Average resolution time: 6.56 hours" and is
satisfied, without checking whether *their* ticket, the critical email
outage, took 6.56 hours or 18.75 hours.

Under this model, SPT genuinely maximizes client satisfaction: it minimizes
$\bar{C}$ (Theorem 1), and $f$ is decreasing. The service provider's
throughput is unchanged (Theorem 6). The business outcome improves: same
work done, happier client.

**Every theorem in this paper remains mathematically correct. But the
conclusion inverts.** The metric is no longer a proxy for service quality
that can be gamed; it *is* the service quality, because the client has
agreed to evaluate quality by the aggregate number rather than by their
individual experience.
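The gap between the reported number and one client's experience can be made concrete with a small sketch. The ticket names, durations, and priority weights below are hypothetical and are not the paper's worked example; the sketch assumes a single worker processing tickets back to back.

```python
from itertools import accumulate

# Hypothetical ticket queue: (name, duration in hours, priority weight).
# These values are illustrative and are not the paper's worked example.
TICKETS = [
    ("password reset", 0.5, 1),
    ("printer driver", 1.0, 1),
    ("vpn access", 2.0, 2),
    ("email outage", 8.0, 10),
]

def completion_times(schedule):
    """Map each ticket name to its completion time on a single worker."""
    finish = accumulate(duration for _, duration, _ in schedule)
    return {name: t for (name, _, _), t in zip(schedule, finish)}

def unweighted_mean(schedule):
    """The single number the client sees on the dashboard."""
    times = completion_times(schedule)
    return sum(times.values()) / len(times)

spt = sorted(TICKETS, key=lambda t: t[1])           # shortest task first
by_priority = sorted(TICKETS, key=lambda t: -t[2])  # highest priority first

for label, sched in (("SPT", spt), ("priority-first", by_priority)):
    times = completion_times(sched)
    print(label, round(unweighted_mean(sched), 3), times["email outage"])
```

With these assumed inputs, SPT reports a mean of 4.25 hours while the email outage waits 11.5 hours; the priority-first order reports 10.0 hours while resolving the outage in 8.0. A client who reads only the first number prefers SPT.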

### A.2 The Economics

This creates a coherent, stable business equilibrium:

| Actor | Behavior | Outcome |
|-------|----------|---------|
| Provider | Optimizes unweighted mean (SPT) | Metric improves, no extra work |
| Client | Reads dashboard, sees low average | Reports satisfaction |
| Management | Sees satisfied client + good metric | Rewards team |

Throughput is unchanged (Theorem 6), so the same revenue-generating work
is completed. The only thing that changed is the *order*, and therefore
the reported number. Real resources were rearranged and no additional value
was created, yet the business metrics all moved in the right direction.

This is *profitable*. The provider extracts satisfaction from the client
at zero marginal cost, by optimizing a number that the client has accepted
as a proxy for quality. The client is no worse off *in their own estimation*,
because they evaluate the aggregate, not their individual experience.
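The "zero marginal cost" claim can be checked by brute force on a small queue. The sketch below assumes a single worker and a hypothetical set of durations (not the paper's data). It enumerates every processing order and compares the makespan, which stands in for throughput here, against the reported mean.

```python
from itertools import permutations

# Hypothetical task durations in hours (illustrative, not the paper's data).
DURATIONS = [0.5, 1.0, 2.0, 8.0]

def stats(order):
    """Return (makespan, mean completion time) for one processing order."""
    elapsed, total = 0.0, 0.0
    for d in order:
        elapsed += d      # this task finishes at the running total
        total += elapsed  # accumulate completion times for the mean
    return elapsed, total / len(order)

results = [stats(p) for p in permutations(DURATIONS)]
makespans = {m for m, _ in results}
means = sorted({mean for _, mean in results})

print("distinct makespans:", makespans)
print("best and worst mean:", means[0], means[-1])
```

Every one of the 24 orders finishes all work at 11.5 hours, yet the reported mean ranges from 4.25 (shortest first) to 10.125 (longest first). Reordering moves the number without changing how much work gets done.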

### A.3 The Fragility

This equilibrium is stable only as long as nothing forces attention onto
individual tickets. It breaks the moment any of the following occurs:

**1. The client checks their own ticket.**

A CTO whose email server was down for 18.75 hours will not be reassured
by a dashboard reading "Average resolution: 6.56 hours." The aggregate
metric and the individual experience diverge maximally for high-priority
tasks (Theorem 4). The clients most likely to inspect their own experience
are exactly the ones receiving the worst service.

**2. A competitor offers per-ticket SLAs.**

If an alternative provider guarantees "P1 incidents resolved within 4 hours"
instead of "average resolution under 7 hours," the aggregate-metric provider
cannot compete for clients with critical needs, who are typically the
highest-value clients.

**3. The provider's team internalizes the metric.**

If the team comes to believe the metric reflects real performance, rather
than knowing it is being gamed, they lose the ability to recognize when
critical work is being neglected. The metric becomes an epistemic hazard:
it tells the team they are performing well, preventing them from seeing
that they are not.
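The contrast in item 2 between an aggregate SLA and a per-ticket SLA can be sketched as two checks over the same ticket log. The thresholds echo the two promises quoted above; the ticket log itself, the priority labels, and the function names are hypothetical.

```python
# Thresholds taken from the two SLA wordings quoted in the text.
AGGREGATE_LIMIT_H = 7.0  # "average resolution under 7 hours"
P1_LIMIT_H = 4.0         # "P1 incidents resolved within 4 hours"

# Hypothetical resolved tickets: (priority label, completion time in hours).
resolved = [("P3", 0.5), ("P3", 1.5), ("P2", 3.5), ("P1", 11.5)]

def meets_aggregate_sla(tickets, limit=AGGREGATE_LIMIT_H):
    """True if the unweighted mean completion time is under the limit."""
    mean = sum(hours for _, hours in tickets) / len(tickets)
    return mean < limit

def p1_breaches(tickets, limit=P1_LIMIT_H):
    """List every P1 ticket that took longer than the per-ticket limit."""
    return [(p, h) for p, h in tickets if p == "P1" and h > limit]

print("aggregate SLA met:", meets_aggregate_sla(resolved))
print("P1 breaches:", p1_breaches(resolved))
```

On this assumed log the aggregate SLA is met (the mean is 4.25 hours) while the only P1 ticket breaches its per-ticket limit by a wide margin. The two promises are evaluated on the same data and disagree.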

### A.4 The General Pattern

This is not unique to task scheduling. The structure is:

1. A measurable proxy is established for an unmeasured quality.
2. The proxy is reported as if it were the quality itself.
3. The proxy is optimized, improving the reported number.
4. The underlying quality diverges from the proxy, but no one measures
   the underlying quality because the proxy exists.
5. The system is stable until an exogenous shock forces inspection of
   the underlying quality.

This pattern appears across domains:

| Domain | Proxy metric | Underlying quality | Divergence |
|--------|-------------|-------------------|------------|
| IT support | Avg. resolution time | Critical system uptime | Server down for 18.75 hrs, avg. says 6.56 |
| Education | Standardized test scores | Actual learning | Teaching to the test, understanding declines |
| Healthcare | Patient throughput | Patient outcomes | Faster discharges, higher readmission rates |
| Finance | Quarterly earnings | Long-term value creation | Cost-cutting inflates EPS, erodes capability |
| Software | Velocity (story points) | Deliverable product quality | Point inflation, features half-finished |

In each case, the proxy is optimized, the number improves, and the system
*functions* (profitably, even) until the moment the underlying quality
is tested by reality.

### A.5 A Mathematical Note on Equilibrium Stability

Model the system as a game between provider (P) and client (C).

**Information structure:**
- P observes individual completion times $\{C_i\}$ and chooses schedule $\sigma$
- C observes only the reported aggregate $\bar{C}(\sigma)$

**Payoffs:**
- P's payoff increases with C's satisfaction and is independent of schedule
  (throughput is invariant)
- C's *reported* satisfaction $U_C = f(\bar{C})$ (the $U_{\text{client}}$ of
  A.1) is maximized by SPT
- C's *actual* welfare (if they could observe it) depends on individual
  $C_i$ values, especially for high-priority tasks

This is a **moral hazard** problem. P's action (the schedule $\sigma$, and
with it the distribution of $C_i$) is hidden from C, who sees only the
aggregate. P's optimal strategy is to minimize the observable signal
($\bar{C}$) regardless of the unobservable distribution, which is exactly
SPT.

The outcome is a **pooling equilibrium**: P's schedule looks identical
to the client regardless of the underlying priority-weighted performance.
A provider with PWCT = 10.2 and a provider with PWCT = 10.167 both report
$\bar{C} = 6.56$ under SPT. The client cannot distinguish between them.

This equilibrium is stable in the usual Nash sense, because neither player
gains by deviating unilaterally: **C has no incentive to deviate** (they
have no better information source) and **P has no incentive to deviate**
(any other schedule worsens $\bar{C}$ with zero throughput benefit).

It is *unstable* under **information revelation**: if C obtains access to
individual $C_i$ values (via a customer portal, a competing vendor's
transparency, or a sufficiently painful incident), the pooling equilibrium
collapses and C's evaluation shifts to the underlying quality.
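A minimal sketch of the pooling effect, under stated assumptions: two hypothetical providers hold queues with identical durations but different priority weights, and PWCT is taken to be $\sum_i w_i C_i / \sum_i w_i$, which may differ from the definition used earlier in the paper. The function names and all numeric values are illustrative only.

```python
# Two hypothetical providers with identical task durations but different
# priority weights. Tuples are (duration in hours, priority weight).
PROVIDER_A = [(0.5, 1), (1.0, 1), (2.0, 2), (8.0, 10)]  # long task is critical
PROVIDER_B = [(0.5, 10), (1.0, 2), (2.0, 1), (8.0, 1)]  # short task is critical

def spt_completions(tasks):
    """Completion time and weight of each task under shortest-first order."""
    elapsed, out = 0.0, []
    for duration, weight in sorted(tasks, key=lambda t: t[0]):
        elapsed += duration
        out.append((elapsed, weight))
    return out

def reported_mean(tasks):
    """The only signal the client observes: the unweighted mean."""
    done = spt_completions(tasks)
    return sum(c for c, _ in done) / len(done)

def weighted_mean(tasks):
    """Assumed PWCT definition: sum(w_i * C_i) / sum(w_i)."""
    done = spt_completions(tasks)
    return sum(c * w for c, w in done) / sum(w for _, w in done)

for name, tasks in (("A", PROVIDER_A), ("B", PROVIDER_B)):
    print(name, reported_mean(tasks), round(weighted_mean(tasks), 3))
```

Both providers report a mean of 4.25 hours, so a client who sees only that signal cannot separate them. Once per-task completion times and priorities are revealed, the weighted figures (about 8.857 for A against about 1.643 for B) pull them apart, which is the information-revelation collapse described above.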

### A.6 The Uncomfortable Conclusion

The honest answer to "does optimizing the unweighted mean hurt the
business?" is: **not necessarily, as long as the client never looks
behind the number**.

The honest answer to "does it hurt the client?" is: **only when they
have a problem large enough to notice**, which is precisely when the
metric's distortion is largest (Theorem 4).

The honest answer to "is this sustainable?" is: it is exactly as
sustainable as any system in which the seller knows more than the buyer.
Such systems are historically stable for extended periods and then
collapse rapidly when the information asymmetry is punctured by a
crisis, a competitor, or a regulator.

The mathematical structure is clear: reporting only the unweighted mean
creates an information asymmetry between provider and client, a gap
between the metric and the reality. Optimizing the metric under this
asymmetry is *locally rational* for the provider, *locally satisfying*
for the uninspecting client, and *globally fragile* for the relationship.

Whether one calls this "efficient market behavior" or "a dystopian
consequence of optimizing legible numbers over illegible reality" is not
a mathematical question. The math says only this: **the incentive exists,
the equilibrium is real, and it holds until it doesn't.**

---

*This proof was developed conversationally and formalized on 2026-03-28.*