db1044072c
New Section 4.5 proves that completing old tasks is actively punished by the unweighted mean: a single 26-day-old task hurts the average more than 26 one-day tasks help it (same total wait resolved, worse metric). The rational response is not starvation (Theorem 3) but abandonment: closing aged tasks as "won't fix" to protect the average.

Changes:
- New Section 4.5 with Theorem 6.1 and Corollary 6.2
- Old Section 4.5 (Compound Effect) renumbered to 4.6, table updated
- Conclusion updated with new item 3, subsequent items renumbered
- Edition 1 backed up to .backup/README.md.v1

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

# Unweighted Average Completion Time Is Not a Fair Metric for Task Scheduling

A mathematical proof that unweighted average task completion time is a biased statistic that incentivizes cherry-picking easy work, and that any scheduling advantage it appears to reveal is an artifact of the metric, not a reflection of genuine throughput or service quality.

---

## 1. Introduction

Many organizations measure task-execution performance by **unweighted mean completion time**: the average number of hours (or days) between task submission and task resolution, counting each task equally regardless of size or priority.

This paper proves that this metric is not merely imprecise but structurally biased. It can be improved by reordering work without doing any additional work (Theorem 1), while a properly weighted alternative is completely immune to scheduling manipulation (Theorem 2). When combined with a priority system, the metric actively contradicts the organization's own priority classifications (Theorem 9).

The argument proceeds in four parts:

- **Part I** (Sections 2–4) establishes the mathematical foundation: the unweighted mean is gameable by Shortest Processing Time (SPT) scheduling, the work-weighted mean is schedule-invariant, and the resulting service-quality consequences are provably negative.
- **Part II** (Sections 5–6) extends the model to priority-classified tasks, proves the metric becomes adversarial to the priority system, and proposes weighted alternatives with a worked IT service desk example.
- **Part III** (Sections 7–9) examines organizational dynamics: what happens when the metric is reported to clients (information asymmetry), what happens to team members who understand its flaws (psychological harm), and what a single informed manager can do about it (constrained optimization with game-theoretic stability analysis).
- **Part IV** (Sections 10–12) presents honest counterarguments, situates the work in existing literature, and concludes.

The core results build on Smith's (1956) foundational scheduling theory [1], extended through game theory [9, 10], organizational measurement theory [18, 19], and psychology [11–17] to trace a complete chain from a mathematical proof about a specific metric to organizational outcomes.

---

# Part I: Mathematical Foundation

## 2. Definitions

Let there be **n** tasks with processing times $p_1, p_2, \ldots, p_n$. A **schedule** $\sigma$ is a permutation of $\{1, 2, \ldots, n\}$ assigning tasks to execution order on a single executor.

The **completion time** of task $\sigma(k)$ under schedule $\sigma$ is:

$$C_{\sigma(k)} = \sum_{j=1}^{k} p_{\sigma(j)}$$

The **unweighted mean completion time** is:

$$\bar{C}(\sigma) = \frac{1}{n} \sum_{k=1}^{n} C_{\sigma(k)}$$

The **work-weighted mean completion time** is:

$$\bar{C}_w(\sigma) = \frac{\sum_{k=1}^{n} p_{\sigma(k)} \cdot C_{\sigma(k)}}{\sum_{k=1}^{n} p_{\sigma(k)}}$$

---

## 3. Core Results

### 3.1 The Unweighted Mean Is Gameable

**Theorem 1** (Smith, 1956 [1])**.** The schedule that minimizes $\bar{C}(\sigma)$ is Shortest Processing Time first (SPT): sort tasks so that $p_{\sigma(1)} \le p_{\sigma(2)} \le \cdots \le p_{\sigma(n)}$.

**Proof (exchange argument [1, 2]).** Consider any schedule $\sigma$ in which two adjacent tasks $i, j$ satisfy $p_i > p_j$ with task $i$ scheduled immediately before task $j$. Let $t$ be the start time of task $i$.

| | Task $i$ finishes | Task $j$ finishes | Sum |
|---|---|---|---|
| **Before swap** ($i$ then $j$) | $t + p_i$ | $t + p_i + p_j$ | $2t + 2p_i + p_j$ |
| **After swap** ($j$ then $i$) | $t + p_j$ | $t + p_j + p_i$ | $2t + p_i + 2p_j$ |

The swap reduces the sum of completion times by:

$$(2p_i + p_j) - (p_i + 2p_j) = p_i - p_j > 0$$

Every swap of a longer-before-shorter adjacent pair strictly reduces the total. Any non-SPT schedule contains such a pair. Repeated swaps converge to SPT. Therefore SPT minimizes $\bar{C}(\sigma)$, uniquely up to the order of tasks with equal processing times.
$\blacksquare$

### 3.2 The Work-Weighted Mean Is Schedule-Invariant

**Theorem 2.** The work-weighted mean completion time $\bar{C}_w(\sigma)$ is the same for every schedule $\sigma$.

**Proof.** Expand the numerator:

$$\sum_{k=1}^{n} p_{\sigma(k)} \cdot C_{\sigma(k)} = \sum_{k=1}^{n} p_{\sigma(k)} \sum_{j=1}^{k} p_{\sigma(j)}$$

Reindex by letting $a = \sigma(k)$ and $b = \sigma(j)$. The double sum counts every ordered pair $(a, b)$ where $b$ is scheduled no later than $a$:

$$= \sum_{\substack{a, b \\ b \preceq_\sigma a}} p_a \, p_b$$

For any pair $(a, b)$ with $a \ne b$, exactly one of $\{b \preceq_\sigma a\}$ or $\{a \prec_\sigma b\}$ holds. The diagonal terms ($a = b$) contribute $p_a^2$ regardless of order. Therefore:

$$\sum_{\substack{a, b \\ b \preceq_\sigma a}} p_a \, p_b = \sum_{a} p_a^2 + \sum_{\substack{a \ne b \\ b \prec_\sigma a}} p_a \, p_b$$

Together with the complementary sum, the two off-diagonal sums cover every ordered pair $(a, b)$ with $a \ne b$ exactly once:

$$\sum_{\substack{a \ne b \\ b \prec_\sigma a}} p_a \, p_b + \sum_{\substack{a \ne b \\ a \prec_\sigma b}} p_a \, p_b = \sum_{a \ne b} p_a \, p_b$$

The right-hand side is schedule-independent. By symmetry of $p_a p_b$ (swapping the labels $a$ and $b$ maps one sum onto the other), both off-diagonal sums are equal:

$$\sum_{\substack{a \ne b \\ b \prec_\sigma a}} p_a \, p_b = \frac{1}{2} \sum_{a \ne b} p_a \, p_b$$

Therefore, using $\sum_{a \ne b} p_a \, p_b = \left(\sum_a p_a\right)^2 - \sum_a p_a^2$:

$$\sum_{k=1}^{n} p_{\sigma(k)} \cdot C_{\sigma(k)} = \sum_a p_a^2 + \frac{1}{2} \sum_{a \ne b} p_a \, p_b = \frac{1}{2}\left(\sum_a p_a\right)^2 + \frac{1}{2}\sum_a p_a^2$$

This expression contains no reference to $\sigma$. Since the denominator $\sum p_a$ is also schedule-independent:

$$\bar{C}_w(\sigma) = \frac{\frac{1}{2}\left(\sum p_a\right)^2 + \frac{1}{2}\sum p_a^2}{\sum p_a}$$

is **constant across all schedules**. $\blacksquare$

This is an instance of the conservation laws in scheduling identified by Coffman, Shanthikumar, and Yao [20].
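
As a sanity check on Theorems 1 and 2, the following minimal sketch enumerates every schedule of a small batch and compares both statistics. The task sizes and the helper names (`completion_times`, `unweighted_mean`, `work_weighted_mean`) are illustrative assumptions, not part of the original analysis. Exact rational arithmetic is used so that equality can be tested directly.

```python
from fractions import Fraction
from itertools import permutations


def completion_times(order):
    """Cumulative completion times C_k for processing times run in this order."""
    elapsed, times = Fraction(0), []
    for p in order:
        elapsed += p
        times.append(elapsed)
    return times


def unweighted_mean(order):
    """Mean completion time, every task counted once."""
    times = completion_times(order)
    return sum(times) / len(times)


def work_weighted_mean(order):
    """Mean completion time, every task weighted by its processing time."""
    times = completion_times(order)
    return sum(p * c for p, c in zip(order, times)) / sum(order)


# Hypothetical batch of four tasks (hours); any positive sizes would do.
tasks = [Fraction(x) for x in (1, 4, 10, 2)]
schedules = list(permutations(tasks))

# Closed form from the proof of Theorem 2.
total = sum(tasks)
closed_form = (total**2 + sum(p * p for p in tasks)) / (2 * total)

weighted_values = {work_weighted_mean(s) for s in schedules}
best_schedule = min(schedules, key=unweighted_mean)

print("distinct work-weighted means:", weighted_values)
print("closed form:", closed_form)
print("schedule minimizing the unweighted mean:", [int(p) for p in best_schedule])
```

For this batch all 24 schedules share the single work-weighted value 205/17 (about 12.06 hours), while the unweighted mean is smallest for the ascending order 1, 2, 4, 10.
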
The invariance corresponds to measuring how long a unit of *work* waits rather than how long a *task* waits: the unweighted statistic counts completions rather than work, which is why it is gameable. (See also Little [3, 4] for the queueing-theoretic context, with the caveat that Little's Law applies directly only to steady-state systems, not to the batch case analyzed here.)

### 3.3 Illustrative Example

Two tasks: $A$ with $p_A = 1$ hour, $B$ with $p_B = 10$ hours.

| Schedule | $C_A$ | $C_B$ | Unweighted mean | Work-weighted mean |
|----------|-------|-------|-----------------|-------------------|
| SPT (A first) | 1 | 11 | 6.0 | 111/11 ≈ 10.09 |
| Reverse (B first) | 11 | 10 | 10.5 | 111/11 ≈ 10.09 |

SPT appears **4.5 hours better** on the unweighted metric but provides **zero improvement** on the work-weighted metric. The apparent advantage exists only because the unweighted statistic lets a 1-hour task "vote" equally with a 10-hour task.

---

## 4. Consequences for Service Quality

### 4.1 Starvation of Large Tasks

**Theorem 3 (Metric Bias).** Any scheduling policy that minimizes unweighted mean completion time necessarily maximizes the completion time of the largest task.

**Proof.** SPT places the largest task last. Its completion time equals the total processing time $\sum p_i$, which is the maximum possible completion time for any individual task. Under any schedule that does not place the largest task last, that task completes strictly earlier. $\blacksquare$

This creates a **starvation incentive**: rational agents optimizing the unweighted statistic will indefinitely defer large tasks in favor of small ones. Austin [18] identified this general pattern (incomplete measurement creates incentives to optimize the measured dimension at the expense of unmeasured ones) in the context of organizational performance management. Theorem 3 provides the specific mechanism for task scheduling.

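
The starvation incentive can be illustrated with a small simulation of a continuously fed queue. This is a sketch under assumed conditions that are not in the original text: one 10-hour task and one 1-hour task present at time 0, a further 1-hour task arriving every hour, a single non-preemptive executor, and a 40-hour observation window. The function `simulate` and the task names are hypothetical.

```python
def simulate(arrivals, horizon, choose):
    """Run a single non-preemptive executor until `horizon`.

    arrivals: list of (arrival_time, size, name) tuples.
    choose:   key function; the waiting task with the smallest key runs next.
    Returns {name: completion_time} for tasks started before the horizon.
    """
    pending = sorted(arrivals)
    waiting, done = [], {}
    now, nxt = 0, 0
    while now < horizon:
        # Admit everything that has arrived by the current time.
        while nxt < len(pending) and pending[nxt][0] <= now:
            waiting.append(pending[nxt])
            nxt += 1
        if not waiting:
            if nxt == len(pending):
                break  # nothing left to do
            now = pending[nxt][0]  # idle until the next arrival
            continue
        task = min(waiting, key=choose)
        waiting.remove(task)
        now += task[1]
        done[task[2]] = now
    return done


# Assumed workload: one large task plus a steady stream of small ones.
arrivals = [(0, 10, "large")] + [(t, 1, f"small-{t}") for t in range(40)]

spt = simulate(arrivals, horizon=40, choose=lambda task: task[1])   # shortest first
fifo = simulate(arrivals, horizon=40, choose=lambda task: task[0])  # arrival order

print("SPT finished the large task:", "large" in spt)
print("FIFO finished the large task at hour:", fifo.get("large"))
```

Under these assumptions a small task is always available, so SPT never starts the large task inside the window, whereas arrival-order scheduling completes it at hour 11.
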
### 4.2 Maximum Completion Time for the Largest Task

**Theorem 4 (SPT Maximizes Completion Time of the Largest Task).** SPT assigns the maximum possible completion time ($\sum p_i$) to the largest task, and the only schedules that do so are those that, like SPT, place the largest task last.

**Proof.** SPT sorts tasks in ascending order of $p_i$, placing the largest task $p_{\max}$ in the last position. The last task in any schedule has completion time $\sum_{i=1}^{n} p_i$, which is the maximum any individual task can receive. Under any schedule that does not place $p_{\max}$ last, it completes strictly before $\sum p_i$. $\blacksquare$

**Corollary 4.1.** A team optimizing unweighted mean completion time will systematically deliver the worst experience to clients with the most complex needs. This is not a side effect; it is the *mechanism* by which the metric improves.

**Note on slowdown ratios.** SPT actually *compresses* slowdown ratios ($S_i = C_i / p_i$) because larger tasks in later positions have large denominators that absorb the accumulated sum. For example, with tasks $[1, 5, 10]$: SPT gives slowdowns $[1, 1.2, 1.6]$ (low variance) while LPT gives $[1, 3, 16]$ (high variance). SPT's harm to large-task clients is not visible in the slowdown ratio; it is visible in **absolute completion time**. This distinction is important: the scheduling fairness literature [21, 22, 23] has debated SPT/SRPT unfairness primarily through slowdown-based measures, which can obscure the absolute-delay burden proved below.

### 4.3 Delay Concentration

**Theorem 5 (SPT Concentrates Delay on the Largest Task).** Under SPT, the largest task bears the maximum possible absolute delay; under any schedule that does not place it last, its delay is strictly smaller.

**Proof.** Define absolute delay as $\Delta_i = C_i - p_i$ (time spent waiting, independent of own size).
Under SPT, the largest task is in position $n$ with:

$$\Delta_{\max\text{-task}}^{\text{SPT}} = C_n - p_n = \sum_{i=1}^{n-1} p_i$$

This is the sum of all other tasks' processing times, which is the maximum possible delay for any single task. Under any schedule where the largest task is not last, its delay is strictly less.

Meanwhile, SPT gives the smallest task zero delay ($\Delta_1^{\text{SPT}} = 0$). The entire queuing burden is shifted from small tasks to large tasks. $\blacksquare$

SPT minimizes *total* delay (good for aggregate efficiency) by concentrating delay onto the tasks best able to absorb it in slowdown-ratio terms. But in absolute terms (hours spent waiting) the largest task bears the full weight.

### 4.4 Throughput Invariance

**Theorem 6 (Throughput Invariance).** Total work completed over any time horizon $T$ is identical under all scheduling policies.

**Proof.** The executor processes work at a fixed rate. Over any horizon $T \ge \sum p_i$, the total work done is exactly $\sum p_i$ regardless of order. For the steady-state case with ongoing arrivals, the long-run throughput is determined by the service rate $\mu$ and is completely independent of scheduling:

$$\lim_{T \to \infty} \frac{W(T)}{T} = \mu \quad \text{for all schedules } \sigma$$

$\blacksquare$

**Corollary 6.1.** A team that switches from any scheduling policy to SPT will observe an improvement in unweighted mean completion time with **zero change in actual throughput**. The metric improves. The output does not.

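
Theorems 5 and 6 can be seen side by side in a short numerical sketch that reuses the task sizes $[1, 5, 10]$ from the slowdown note in Section 4.2. The helper `summarize` is a hypothetical name introduced for illustration.

```python
def summarize(order):
    """Return makespan, unweighted mean completion time, and the waiting
    time (C_i - p_i) of the largest task for a given execution order."""
    elapsed, completions = 0, []
    for p in order:
        elapsed += p
        completions.append(elapsed)
    largest_index = order.index(max(order))
    return {
        "makespan": completions[-1],
        "mean_completion": sum(completions) / len(completions),
        "largest_task_delay": completions[largest_index] - order[largest_index],
    }


sizes = [1, 5, 10]
spt = summarize(sorted(sizes))                # shortest first
lpt = summarize(sorted(sizes, reverse=True))  # longest first

for name, result in (("SPT", spt), ("LPT", lpt)):
    print(name, result)
```

Both orders finish the same 16 hours of work at hour 16. Only the mean completion time differs (23/3 versus 41/3 hours), and the largest task's wait moves from 0 hours under LPT to 6 hours under SPT.
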
### 4.5 The Compound Effect

Combining Theorems 4, 5, and 6:

| Measure | Effect of optimizing unweighted mean |
|---------|--------------------------------------|
| Throughput (work/time) | No change (Theorem 6) |
| Delay for small tasks | Minimized: approaches zero (SPT) |
| Delay for large tasks | **Maximized**: bears all queuing burden (Theorem 5) |
| Completion time of largest task | **Maximum possible**: $\sum p_i$ (Theorem 4) |

The net effect on perceived quality is negative because:

1. **Loss aversion is asymmetric** [8]. A client whose 100-hour task is deprioritized experiences a large, salient negative. A client whose 1-hour task is expedited experiences a small, often unnoticed positive.
2. **High-effort tasks correlate with high-value clients.** Large tasks are disproportionately likely to come from major clients, complex contracts, or critical business needs.
3. **Starvation compounds.** In a continuous system (Theorem 3), large tasks may be **indefinitely deferred** as new small tasks keep arriving.

**Theorem 7 (The Core Result).** For a team processing tasks of non-uniform size, adopting unweighted mean completion time as a performance metric:

(a) Provides **zero productivity gain** (Theorem 6), while

(b) **Assigning the maximum possible completion time** to the largest task (Theorem 4), and

(c) **Concentrating all queuing delay** onto the largest tasks while eliminating delay for the smallest (Theorem 5).

This is not a tradeoff. The metric creates a pure transfer of service quality from high-effort clients to low-effort clients, with no net work gained. $\blacksquare$

---

# Part II: Priority Systems

## 5. Breakdown Under Priority Classification

The preceding sections proved that unweighted mean completion time is biased when tasks vary in size. We now show that introducing a **priority system** (as virtually all real teams use) causes the metric to become not merely biased but **actively adversarial** to the organization's stated goals.

### 5.1 Extended Model: Tasks With Priority

Let each task $i$ have processing time $p_i$ and a priority class $q_i \in \{1, 2, 3, 4\}$ where 1 is the highest priority (critical) and 4 is the lowest (cosmetic/enhancement).

Assign priority weights:

$$w(q) = \begin{cases} 8 & q = 1 \text{ (Critical)} \\ 4 & q = 2 \text{ (High)} \\ 2 & q = 3 \text{ (Medium)} \\ 1 & q = 4 \text{ (Low)} \end{cases}$$

The specific weights are illustrative; the results hold for any strictly decreasing weight function. The key property is that priority is assigned by **business impact**, not by task size.

### 5.2 The Metric Contradicts the Priority System

**Theorem 8 (Priority-Size Inversion).** When priority is independent of task size, the schedule that minimizes unweighted mean completion time (SPT) will, in expectation, complete low-priority tasks before high-priority tasks of greater size.

**Proof.** SPT orders tasks by $p_i$ ascending, regardless of $q_i$. Consider two tasks:

- Task A: $p_A = 40$ hours, $q_A = 1$ (Critical, e.g., server outage)
- Task B: $p_B = 0.5$ hours, $q_B = 4$ (Low, e.g., cosmetic UI fix)

SPT schedules B before A. The unweighted mean for this pair:

$$\bar{C}^{\text{SPT}} = \frac{0.5 + 40.5}{2} = 20.5 \qquad \bar{C}^{\text{priority}} = \frac{40 + 40.5}{2} = 40.25$$

The metric declares SPT nearly **twice as good**, despite completing a cosmetic fix while a server outage burns.

In general, when $q_i$ is statistically independent of $p_i$, SPT's ordering has **zero correlation** with priority. In practice, Critical tasks (outages, security incidents, data loss) often require more work than Low tasks, so the metric is plausibly **anti-correlated** with the priority system. $\blacksquare$

### 5.3 Information Destruction

The unweighted mean reduces a three-dimensional task $(p_i, q_i, C_i)$ to a one-dimensional signal ($C_i$), then averages uniformly. This discards priority entirely and implicitly inverts size.

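
The claim that an SPT ordering carries no information about priority when size and priority are independent can be checked empirically. The sketch below is an illustration under assumed distributions (exponential task sizes with a 4-hour mean, priority classes uniform on 1 to 4, a fixed random seed); none of these choices come from the original text.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the run is repeatable

# Assumed workload: size and priority are drawn independently.
tasks = [
    (random.expovariate(1 / 4), random.randint(1, 4))  # (hours, priority class)
    for _ in range(20_000)
]

spt_order = sorted(tasks, key=lambda task: task[0])  # shortest first
half = len(spt_order) // 2

early_priority = mean(q for _, q in spt_order[:half])  # first half of the queue
late_priority = mean(q for _, q in spt_order[half:])   # second half of the queue

print(f"mean priority class, first half of SPT queue:  {early_priority:.3f}")
print(f"mean priority class, second half of SPT queue: {late_priority:.3f}")
```

If position in the SPT queue said anything about priority, the two halves would differ. Under the independence assumption both should come out close to 2.5, the mean of the four class labels.
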
**Theorem 9 (Information Destruction).** Let $I(\sigma)$ be the mutual information between the schedule's implicit priority ranking (position) and the actual priority assignment $q_i$. For SPT:

$$I(\sigma_{\text{SPT}}) = 0 \quad \text{when } p_i \perp q_i$$

**Proof.** SPT assigns positions based solely on $p_i$. When $p_i$ and $q_i$ are independent, knowing a task's position in the SPT schedule provides zero information about its priority. $\blacksquare$

**Corollary 9.1.** A team that optimizes unweighted mean completion time is operating a scheduling system that carries zero information about its own priority classification. The priority field in their ticketing system is, with respect to execution order, decorative.

This is an instance of what Austin [18] calls the fundamental problem of incomplete measurement: when the measurement system captures only a subset of the relevant dimensions, optimizing the measurement systematically degrades the unmeasured dimensions.

### 5.4 Priority-Weighted Delay Cost

Define the **priority-weighted delay cost** of a schedule:

$$D(\sigma) = \sum_{i=1}^{n} w(q_i) \cdot C_i$$

**Theorem 10 (SPT and Priority-Weighted Delay Cost).** The optimal schedule for minimizing $D(\sigma)$ is WSJF: order by $w(q_i)/p_i$ descending [1, 5]. SPT's ordering (by $1/p_i$ descending) ignores priority entirely and produces higher $D$ than priority-respecting alternatives when priority is correlated with task size.

**Proof.** By the exchange argument, swapping adjacent tasks $i, j$ (with $i$ scheduled immediately before $j$) reduces $D$ by:

$$\Delta D = w(q_j) \cdot p_i - w(q_i) \cdot p_j$$

The swap improves $D$ when $\Delta D > 0$, that is, when $w(q_j)/p_j > w(q_i)/p_i$ but $j$ is scheduled after $i$. Therefore the optimal order is decreasing $w(q_i)/p_i$, which is the WSJF rule. SPT corresponds to WSJF only when $w(q_i) = \text{const}$ (all tasks have equal priority).

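
For completeness, the exchange step can be written out. Suppose task $i$ runs immediately before task $j$, starting at time $t$, and write $w_i = w(q_i)$ (a shorthand introduced here for readability). Only these two tasks' completion times change when they are swapped:

$$
\begin{aligned}
D_{\text{before}} &= w_i\,(t + p_i) + w_j\,(t + p_i + p_j) + \text{(other terms)} \\
D_{\text{after}} &= w_j\,(t + p_j) + w_i\,(t + p_j + p_i) + \text{(other terms)} \\
D_{\text{before}} - D_{\text{after}} &= w_j\,p_i - w_i\,p_j
\end{aligned}
$$

Because all processing times are positive, the difference is positive (the swap lowers $D$) exactly when $w_j / p_j > w_i / p_i$.
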
**Example.** Critical ($w = 8$, $p = 3$) and Low ($w = 1$, $p = 2$):

- SPT (Low first): $D = 1 \cdot 2 + 8 \cdot 5 = 42$
- WSJF (Critical first): $D = 8 \cdot 3 + 1 \cdot 5 = 29$

SPT incurs 45% more priority-weighted delay. In practice, Critical tasks tend to be larger (outages, security incidents), making the divergence systematic. $\blacksquare$

---

## 6. Proposed Solutions

### 6.1 Priority-Weighted Metrics

Replace unweighted mean completion time with the **Priority-Weighted Completion Score (PWCS)**:

$$\text{PWCS}(\sigma) = \frac{\sum_{i=1}^{n} w(q_i) \cdot \frac{C_i}{p_i}}{\sum_{i=1}^{n} w(q_i)}$$

This is the priority-weighted mean slowdown ratio. It measures how long each task waited relative to its size, weighted by how much that task mattered. Lower is better.

**Properties:**

1. **Priority-respecting.** Delays to Critical tasks cost 8x more than delays to Low tasks.
2. **Size-fair.** Uses slowdown ratio $C_i / p_i$, so large tasks are not penalized for being large.
3. **Not gameable by SPT.** Reordering by processing time does not systematically improve the score.
4. **Reduces to unweighted mean when tasks are uniform.** With equal weights and a common task size, the score is the unweighted mean divided by that size. A strict generalization.

### 6.2 Optimal Policy: WSJF

**Theorem 11.** The schedule minimizing the priority-weighted completion time $\text{PWCT}(\sigma) = \sum w(q_i) \cdot C_i / \sum w(q_i)$ processes tasks in order of decreasing $w(q_i)/p_i$, the **Weighted Shortest Job First (WSJF)** rule [1, 5].

**Proof.** By the exchange argument (as in Theorem 10), the swap of adjacent tasks $i, j$ improves PWCT when $w(q_j)/p_j > w(q_i)/p_i$ but $j$ is scheduled after $i$. The optimal order is therefore decreasing $w(q_i)/p_i$. $\blacksquare$

Within a priority class, this reduces to SPT (shortest first). Across classes, a Critical 4-hour task ($w/p = 2.0$) beats a Low 1-hour task ($w/p = 1.0$).

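
A minimal sketch of the WSJF rule, checked against brute force. The two-task case reuses the Critical/Low example from Section 5.4; the five-task batch and the function names (`weighted_delay`, `wsjf`, `spt`) are assumptions made for illustration.

```python
from itertools import permutations

WEIGHTS = {1: 8, 2: 4, 3: 2, 4: 1}  # illustrative weights from Section 5.1


def weighted_delay(order):
    """D = sum of w(q_i) * C_i for tasks given as (hours, priority class)."""
    elapsed, total = 0, 0
    for hours, priority in order:
        elapsed += hours
        total += WEIGHTS[priority] * elapsed
    return total


def wsjf(tasks):
    """Order tasks by w(q)/p, largest ratio first."""
    return sorted(tasks, key=lambda t: WEIGHTS[t[1]] / t[0], reverse=True)


def spt(tasks):
    """Order tasks by processing time, shortest first."""
    return sorted(tasks, key=lambda t: t[0])


pair = [(3, 1), (2, 4)]  # Critical 3 h, Low 2 h
print("SPT  D =", weighted_delay(spt(pair)))   # 1*2 + 8*5
print("WSJF D =", weighted_delay(wsjf(pair)))  # 8*3 + 1*5

# Hypothetical larger batch: compare WSJF with every possible order.
batch = [(6, 1), (4, 2), (2, 3), (0.5, 4), (3, 1)]
best = min(weighted_delay(order) for order in permutations(batch))
print("WSJF is optimal on the batch:", weighted_delay(wsjf(batch)) == best)
```

The brute-force comparison is only feasible for small batches (it enumerates all $n!$ orders); it is included to confirm the rule on a concrete case, not as a scheduling method.
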
**Practical caveat.** Pure WSJF can place tiny Low-priority tasks ahead of large Critical tasks (a 15-minute Low task has $w/p = 1/0.25 = 4.0$, beating a 6-hour Critical at $w/p = 8/6 = 1.33$). In practice, this is mitigated by enforcing **strict priority-class ordering** and applying WSJF only *within* each class.

### 6.3 Applied Example: IT Service Desk

Consider an IT team with the following ticket queue:

| Ticket | Priority | Type | Est. Hours |
|--------|----------|------|-----------|
| T1 | P1 (Critical) | Email server down | 6 |
| T2 | P2 (High) | VPN failing for remote team | 4 |
| T3 | P3 (Medium) | New employee laptop setup | 2 |
| T4 | P4 (Low) | Update desktop wallpaper policy | 0.5 |
| T5 | P3 (Medium) | Install software license | 1 |
| T6 | P1 (Critical) | Database backup failing | 3 |
| T7 | P2 (High) | Printer fleet offline | 2 |
| T8 | P4 (Low) | Archive old shared drive folder | 0.25 |

**SPT order** (optimizing unweighted mean): T8, T4, T5, T3, T7, T6, T2, T1

| Pos | Ticket | Priority | Hours | Completion | Slowdown |
|-----|--------|----------|-------|------------|----------|
| 1 | T8 (archive folder) | P4 Low | 0.25 | 0.25 | 1.0 |
| 2 | T4 (wallpaper) | P4 Low | 0.5 | 0.75 | 1.5 |
| 3 | T5 (software) | P3 Med | 1 | 1.75 | 1.75 |
| 4 | T3 (laptop) | P3 Med | 2 | 3.75 | 1.875 |
| 5 | T7 (printers) | P2 High | 2 | 5.75 | 2.875 |
| 6 | T6 (backups) | P1 Crit | 3 | 8.75 | 2.917 |
| 7 | T2 (VPN) | P2 High | 4 | 12.75 | 3.188 |
| 8 | T1 (email) | P1 Crit | 6 | 18.75 | 3.125 |

**Practical WSJF** (priority-class-first, SPT within class):

| Pos | Ticket | Priority | Hours | Completion |
|-----|--------|----------|-------|------------|
| 1 | T6 (backups) | P1 Crit | 3 | 3 |
| 2 | T1 (email) | P1 Crit | 6 | 9 |
| 3 | T7 (printers) | P2 High | 2 | 11 |
| 4 | T2 (VPN) | P2 High | 4 | 15 |
| 5 | T5 (software) | P3 Med | 1 | 16 |
| 6 | T3 (laptop) | P3 Med | 2 | 18 |
| 7 | T8 (archive) | P4 Low | 0.25 | 18.25 |
| 8 | T4 (wallpaper) | P4 Low | 0.5 | 18.75 |

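
The two orderings above can be reproduced, and summary statistics computed, with a short script. This is a sketch: the data are the eight tickets from the queue table, the weights are the illustrative ones from Section 5.1, and the helper names (`schedule`, `mean_completion`, `pwct`) are hypothetical.

```python
TICKETS = {  # ticket: (priority class, estimated hours), from the queue table
    "T1": (1, 6), "T2": (2, 4), "T3": (3, 2), "T4": (4, 0.5),
    "T5": (3, 1), "T6": (1, 3), "T7": (2, 2), "T8": (4, 0.25),
}
WEIGHTS = {1: 8, 2: 4, 3: 2, 4: 1}


def schedule(order):
    """Map each ticket to its completion time when run in the given order."""
    elapsed, completion = 0, {}
    for ticket in order:
        elapsed += TICKETS[ticket][1]
        completion[ticket] = elapsed
    return completion


def mean_completion(completion, priority=None):
    """Unweighted mean completion time, optionally for one priority class."""
    values = [c for t, c in completion.items()
              if priority is None or TICKETS[t][0] == priority]
    return sum(values) / len(values)


def pwct(completion):
    """Priority-weighted completion time: sum(w * C) / sum(w)."""
    weights = {t: WEIGHTS[TICKETS[t][0]] for t in completion}
    return sum(weights[t] * c for t, c in completion.items()) / sum(weights.values())


# sorted() is stable, so equal-length tickets keep their queue order.
spt_order = sorted(TICKETS, key=lambda t: TICKETS[t][1])
class_first_order = sorted(TICKETS, key=lambda t: TICKETS[t])  # class, then hours

for name, order in (("SPT", spt_order), ("Practical WSJF", class_first_order)):
    c = schedule(order)
    print(f"{name}: order={order}, mean={mean_completion(c):.2f}, "
          f"P1 mean={mean_completion(c, 1):.2f}, PWCT={pwct(c):.2f}")
```

Passing a priority class to `mean_completion` gives the per-class figures; for example `mean_completion(schedule(spt_order), 2)` is the P2 mean under SPT.
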
**Comparison:**

| Metric | SPT | Practical WSJF | Winner |
|--------|-----|----------------|--------|
| Unweighted mean completion | **6.56 hrs** | 13.63 hrs | SPT |
| P1 mean time to resolution | 13.75 hrs | **6 hrs** | WSJF |
| P2 mean time to resolution | **9.25 hrs** | 13 hrs | SPT |
| Time to fix email server | 18.75 hrs | **9 hrs** | WSJF |
| Time to fix database backups | 8.75 hrs | **3 hrs** | WSJF |
| Time to update wallpaper | **0.75 hrs** | 18.75 hrs | SPT |

The aggregate priority-weighted completion times are nearly identical (PWCT: 10.20 hrs under SPT vs 10.17 hrs under practical WSJF) because aggregation hides distributional damage. The real difference is in the **per-priority-class** breakdown: the email server is down for 18.75 hours under SPT versus 9 hours under WSJF. The database backups fail for 8.75 hours versus 3.

The unweighted metric confidently reports SPT as **more than twice as efficient** (6.56 vs 13.63), rewarding the team that updated desktop wallpaper while the email server was on fire.

### 6.4 Recommended Metric Suite

Even priority-weighted aggregate metrics can fail to distinguish good from bad schedules, because aggregation hides distributional damage. No single metric suffices. A complete measurement system should track:

| Metric | What it measures | Formula |
|--------|-----------------|---------|
| **Mean completion by priority class** | Per-class responsiveness | $\bar{C}$ filtered by $q$ |
| **P1 mean time to resolution** | Critical incident response | $\bar{C}$ for $q = 1$ |
| **Throughput** | Raw work capacity | Work-hours completed / calendar time |
| **Aging violations** | Starvation prevention | Tasks exceeding SLA by priority |
| **Max completion time (P1/P2)** | Worst-case critical response | $\max(C_i)$ for $q \le 2$ |

The key insight: **per-priority-class metrics** expose scheduling failures that aggregate metrics hide.

---

# Part III: Organizational Dynamics

## 7. When the Metric Is the Product

Sections 2–6 assume that client satisfaction is a function of *experienced service quality*. But there exists a scenario in which this assumption fails and the entire argument collapses.

### 7.1 The Self-Referential Metric

Suppose the provider reports the unweighted mean directly to the client (on a dashboard, in an SLA report, on a marketing page) and the client's satisfaction is derived primarily from *that number*:

$$U_{\text{client}} = f\!\left(\bar{C}(\sigma)\right), \quad f' < 0$$

Under this model, SPT genuinely maximizes client satisfaction (Theorem 1). Throughput is unchanged (Theorem 6). The business outcome improves: same work done, happier client.

**Every theorem in this paper remains mathematically correct. But the conclusion inverts.** The metric is no longer a proxy that can be gamed; it *is* the service quality, because the client has agreed to evaluate quality by the aggregate number.

### 7.2 The Economics

This creates a coherent, stable equilibrium:

| Actor | Behavior | Outcome |
|-------|----------|---------|
| Provider | Optimizes unweighted mean (SPT) | Metric improves, no extra work |
| Client | Reads dashboard, sees low average | Reports satisfaction |
| Management | Sees satisfied client + good metric | Rewards team |

The provider extracts satisfaction at zero marginal cost, by optimizing a number the client has accepted as a proxy for quality.

### 7.3 The Fragility

This equilibrium is stable only as long as the client never inspects their own experience. It breaks when:

1. **The client checks their own ticket.** A CTO whose email server was down for 18.75 hours will not be reassured by "Average resolution: 6.56 hours." The clients most likely to inspect are exactly the ones receiving the worst service (Theorem 4).
2. **A competitor offers per-ticket SLAs.** "P1 resolved within 4 hours" beats "average resolution under 7 hours" for any client with critical needs.
3. **The team internalizes the metric.** If the team believes the metric reflects real performance, they lose the ability to recognize when critical work is neglected. The metric becomes an epistemic hazard.

### 7.4 The General Pattern

This pattern (proxy replaces quality, proxy is optimized, quality diverges, system is stable until tested by reality) recurs across domains. Muller [19] documents it extensively as "metric fixation"; Campbell [24] formalized the corrupting effect of using indicators as targets.

| Domain | Proxy metric | Underlying quality | Divergence |
|--------|-------------|-------------------|------------|
| IT support | Avg. resolution time | Critical system uptime | Server down 19 hrs, avg says 6.5 |
| Education | Test scores | Actual learning | Teaching to the test |
| Healthcare | Patient throughput | Patient outcomes | Faster discharges, higher readmission |
| Finance | Quarterly earnings | Long-term value | Cost-cutting inflates EPS, erodes capability |
| Software | Velocity (story points) | Product quality | Point inflation, features half-finished |

### 7.5 Information Asymmetry

Model the system as a game between provider (P) and client (C). P observes individual $\{C_i\}$ and chooses $\sigma$; C observes only $\bar{C}(\sigma)$. This is a **moral hazard** problem [10]: P's optimal strategy is to minimize the observable signal regardless of the unobservable distribution.

The equilibrium is a **pooling equilibrium** [9]: P's reported metric looks identical regardless of the underlying priority-weighted performance. It is stable until C obtains access to individual $C_i$ values, whether via a customer portal, a competitor's transparency, or a sufficiently painful incident.

### 7.6 The Uncomfortable Conclusion

The honest answer to "does optimizing the unweighted mean hurt the business?" is: **not necessarily, as long as the client never looks behind the number**. The honest answer to "is this sustainable?"
is: it is exactly as sustainable as any system in which the seller knows more than the buyer, stable for extended periods and then collapsing rapidly when the asymmetry is punctured.

---

## 8. The Psychological Cost of Knowing

Section 7 modeled the provider as a unitary actor. But teams are composed of individuals. When a team member understands the proof (when they *know* the metric is synthetic, that the dashboard is theater, that the email server is still down while they close wallpaper tickets), a new cost appears that the equilibrium model omitted.

### 8.1 The Hidden Variable: Team Awareness

| Actor | Observes individual $C_i$ | Observes $\bar{C}$ | Understands the proof |
|-------|--------------------------|--------------------|-----------------------|
| Management | Possibly | Yes | Varies |
| Team member | **Yes** | Yes | **Yes** (in this scenario) |
| Client | No | Yes | No |

The team member has full information. They see the ticket queue. They know the email server has been down since 7 AM. They know they are closing a wallpaper ticket because it improves the number. And they know *why*.

### 8.2 Cognitive Dissonance Under Full Information

Cognitive dissonance [11] arises when an individual holds contradictory cognitions. Without understanding *why*, the contradiction can be rationalized: "management knows best." Understanding the proof removes the ambiguity. The team member now holds:

- **Cognition A:** "I am a competent professional. My job is to solve important problems."
- **Cognition B:** "I am closing a wallpaper ticket while the email server is down, because the metric is mathematically biased (Theorem 1), the reordering produces zero throughput gain (Theorem 6), and the only beneficiary is the dashboard (Section 7). I can prove this."

The dissonance is now *load-bearing*. The available resolutions (abandon professional identity, reject the proof, advocate for change, or leave) each impose costs that did not exist before.

### 8.3 Self-Determination Theory: Three Needs Violated

Deci and Ryan's Self-Determination Theory [12, 13] identifies three needs predicting intrinsic motivation:

**Autonomy.** The metric constrains choices in a way the team member knows is mathematically suboptimal. A worker who understands the process is provably counterproductive cannot feel autonomous following it.

**Competence.** The metric rewards *apparent* effectiveness (low $\bar{C}$) while being invariant to *actual* effectiveness (Theorem 6). Genuine competence (fixing the email server first) is *punished* by the metric.

**Relatedness.** The team member knows the client's email server is down. They could help. They are instead updating wallpaper, not because it helps anyone, but because it helps a number. The connection between work and human impact has been severed, and the team member can see the severed ends.

### 8.4 Moral Injury

Moral injury [16, 17] is the lasting harm caused by "perpetrating, failing to prevent, bearing witness to, or learning about acts that transgress deeply held moral beliefs" [17]. It has since been extended to business settings [25].

The key distinction from burnout: **burnout is exhaustion from doing too much. Moral injury is damage from doing the wrong thing.**

A team member who knows the email server is down, knows they should fix it, closes a wallpaper ticket instead, and does so because the metric requires it, is experiencing the structural conditions for moral injury.

### 8.5 Learned Helplessness and Metric Fatalism

Seligman's learned helplessness [14, 15] describes how exposure to uncontrollable negative outcomes leads to passivity. The sequence:

1. The metric is flawed (proof understood).
2. Advocate for change.
3. Rejected ("the numbers are good, don't rock the boat").
4. Repeat with decreasing conviction.
5. Terminal state: "The metric is what it is. I'll just close tickets."

This is not laziness.
It is the rational response to a system that punishes correct behavior and rewards incorrect behavior, when the individual lacks power to change the system.

### 8.6 The Adversarial Selection Spiral

Combining Section 7's equilibrium with the turnover dynamic:

1. Organization adopts unweighted mean. Metric looks good (SPT).
2. Aware, competent team members experience psychological costs (8.2–8.5).
3. Those members leave. Replaced by members who do not understand the metric's flaws or do not care.
4. The metric continues to look good; it always does under SPT, regardless of team competence (Corollary 6.1).
5. Actual service quality degrades, but the metric cannot detect this (Corollary 9.1).
6. Return to step 1.

The metric selects *against* the people who would improve the system and *for* the people who will not challenge it. The system stabilizes at a lower level of competence, invisible to its own measurement apparatus.

### 8.7 The Complete Cost Model

| Section 7 (visible) | Section 8 (hidden) |
|---------------------|---------------------|
| Client satisfied (good number) | Team dissatisfied (bad reality) |
| Throughput unchanged | Discretionary effort withdrawn |
| Metric improves | Competent members leave |
| Business economy stable | Institutional competence degrades |

These operate on different timescales: the equilibrium is visible quarterly; the competence degradation is visible over years.

The complete model is: **the metric works, and it is destructive, and the destruction is invisible to the metric.** The metric is fresh paint on corroded rebar.

---

## 9. Manager Internalization: The Actionable Solution

Sections 2–6 say reject the metric. Section 7 says the metric works (for the business). Section 8 says it destroys the team. In practice, most managers cannot unilaterally change the metric.

The best solution is company-wide metric reform. The *actionable* solution is what a single informed manager can do right now.

### 9.1 The Strategy

A manager who understands the proof can **internalize the metric's limitations without propagating them to the team**:

1. **Schedule primarily by priority.** The team works critical tasks first.
2. **Tactically interleave small tasks.** When a small low-priority task can be completed without materially delaying high-priority work, do it. Not because the metric demands it, but because it also needs to get done and costs almost nothing.
3. **Never reveal the metric as the motivation.** "Knock out this quick one while we wait for the vendor callback on the P1" — not "we need to bring our average down."

The team's intrinsic motivation remains intact (Section 8). The manager absorbs the metric-management burden.

### 9.2 Formalization

The manager's problem is a constrained optimization:

$$\min_{\sigma} \sum_{i=1}^{n} w(q_i) \cdot C_i \quad \text{subject to} \quad \bar{C}(\sigma) \le \bar{C}_{\text{target}}$$

**Theorem 12 (Bounded Metric Cost of Priority Scheduling).** A manager who uses SPT *within* each priority class and priority ordering *between* classes will produce a metric close to the SPT-optimal value — the gap arises only from between-class inversions.

**Proof sketch.** Within each priority class, SPT is free (all tasks have equal priority). The only deviation from global SPT is the between-class ordering. Each cross-class inversion (a larger task scheduled ahead of a smaller task from a lower-priority class) adds $p_{\text{large}} - p_{\text{small}}$ to the unweighted sum of completion times, and no inversions occur within a class, so the gap is bounded by the total of these differences over cross-class pairs. In practice, the gap is estimated to be within 10–20% of SPT-optimal for typical queues, though it grows when the highest-priority tasks are also the largest. $\blacksquare$

### 9.3 The Manager as Information Barrier

| Layer | Sees metric | Sees priorities | Sees proof |
|-------|-----------|----------------|------------|
| Organization | Yes | Nominally | No |
| Manager | Yes | Yes | **Yes** |
| Team | No (shielded) | Yes | Irrelevant |
| Client | Yes (dashboard) | Via SLA | No |

The manager is the only actor holding all three pieces of information.
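
The metric cost of the Section 9.2 policy can be checked on a toy queue. The sketch below uses a hypothetical six-task queue (the priority classes and hour estimates are invented for illustration) and compares global SPT against priority-first ordering with SPT inside each class. It relies on a standard single-machine identity: the excess of any schedule's total completion time over the SPT optimum equals the sum of $p_{\text{large}} - p_{\text{small}}$ over every pair in which the larger task runs first.

```python
from itertools import combinations


def total_completion(order):
    """Sum of completion times when tasks run back-to-back in the given order."""
    clock = 0.0
    total = 0.0
    for _priority, hours in order:
        clock += hours
        total += clock
    return total


# Hypothetical queue: (priority class, hours). Class 1 is the most urgent.
tasks = [(1, 8.0), (1, 2.0), (2, 0.5), (2, 4.0), (3, 0.25), (3, 1.0)]

# Global SPT ignores priority entirely.
spt = sorted(tasks, key=lambda t: t[1])

# Priority ordering between classes, SPT within each class.
by_priority = sorted(tasks, key=lambda t: (t[0], t[1]))

gap = total_completion(by_priority) - total_completion(spt)

# Pairs (a, b) where a runs before b although a is the larger task.
inverted_pairs = [
    (a, b) for a, b in combinations(by_priority, 2) if a[1] > b[1]
]
inversion_cost = sum(a[1] - b[1] for a, b in inverted_pairs)

print(gap, inversion_cost)  # 37.5 37.5
```

On this queue every inverted pair spans two priority classes, which is why within-class SPT costs nothing. The size of the gap is a property of the workload, not of the policy: this example was chosen so that the most urgent tasks are also the longest, which makes the gap large.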
The strategy is not manipulation: the manager is doing the right work in the right order, and the metric happens to be acceptable because within-class SPT is free.

### 9.4 The Competitive Breakdown

This strategy fails when the metric becomes **competitive between teams**.

**Case 1: Cooperative** — Teams measured for parity, not ranking. Each manager independently uses the internalization strategy. The metric is decorative but harmless. This is a **coordination game** with a stable cooperative equilibrium.

**Case 2: Competitive** — Teams ranked by $\bar{C}$. This is a **prisoner's dilemma**:

| | Team B: Priority-first | Team B: SPT |
|---|---|---|
| **Team A: Priority-first** | (Good work, Good work) | (A looks bad, B looks good) |
| **Team A: SPT** | (A looks good, B looks bad) | (Both look good, both do wrong work) |

The Nash equilibrium is (SPT, SPT). The internalization strategy is a cooperative equilibrium that is **not stable under competition**.

### 9.5 Scope

| Condition | Viability |
|-----------|-----------|
| Metric used for health-check / parity | **Viable** |
| Metric visible but not ranked | **Viable** |
| Metric ranked across teams | **Fragile** — requires all managers to cooperate |
| Metric tied to compensation / resources | **Not viable** — prisoner's dilemma dominates |
| Metric reform possible at org level | **Unnecessary** — fix the metric instead |

**The best solution is company-wide. The actionable solution is a manager who understands this proof, shields their team from the metric, schedules by priority, and uses SPT only within priority classes to keep the number reasonable.**

---

# Part IV: Assessment

## 10. Devil's Advocate

Intellectual honesty requires acknowledging where the argument has limits.

### 10.1 Simplicity Has Real Value

**Argument.** The unweighted mean requires no priority weights, no task-size estimates, no calibration.

**Assessment: True.** But the unweighted metric does not avoid assumptions — it *hides* them by implicitly setting all weights to 1 and all sizes to 1. A known-imprecise estimate of task size is still more informative than the implicit assumption that all sizes are equal.

### 10.2 Minimizing the Number of People Waiting

**Argument.** SPT minimizes total person-hours spent waiting. If each task represents one client, this is optimal.

**Assessment: Mathematically correct.** If you run a DMV and every person's time is equally valuable, SPT is the right policy. It breaks down when tasks are not 1:1 with clients, waiting cost is not uniform, or the metric is used to evaluate teams rather than serve a literal queue.

### 10.3 SPT as a Triage Heuristic

**Argument.** When task sizes cluster tightly, SPT approximates FIFO and the unweighted mean approximates the weighted mean.

**Assessment: Correct.** The coefficient of variation $CV = \sigma_p / \bar{p}$ determines distortion severity:

| $CV$ | Task size distribution | Distortion |
|------|----------------------|------------|
| < 0.3 | Tight (call center) | Negligible |
| 0.3 – 1.0 | Moderate (mixed IT) | Moderate |
| > 1.0 | Wide (typical IT queue) | Severe |

A typical IT desk spans 15 minutes to 40+ hours ($CV > 2$). The distortion is not an edge case — it is the default.

### 10.4 Gaming Requires Malice

**Argument.** The theorems show the metric *can* be gamed, not that it *will* be gamed.

**Assessment: This is the strongest counterargument.** If the metric is purely informational and never influences behavior, the gaming incentive is absent. However, any metric reported to management, tied to OKRs, or discussed in retrospectives will influence behavior. This is Goodhart's Law [6, 7] — and it applies to well-intentioned teams as reliably as to cynical ones. The drift happens organically: completing three easy tickets "feels productive" while the metric validates the feeling.
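
The coefficient of variation from Section 10.3 is straightforward to compute from a list of task durations, so a team can check which distortion band its own queue falls into. The sketch below is a minimal illustration, not a calibrated tool: both sample queues are hypothetical, the population standard deviation (`statistics.pstdev`) is an assumed choice, and placing a $CV$ of exactly 0.3 or 1.0 in the moderate band is an assumption about the table's boundaries.

```python
from statistics import mean, pstdev


def coefficient_of_variation(hours):
    """CV = standard deviation of task sizes divided by their mean."""
    return pstdev(hours) / mean(hours)


def distortion_band(cv):
    """Map a CV value onto the bands of the Section 10.3 table."""
    if cv < 0.3:
        return "negligible"
    if cv <= 1.0:
        return "moderate"
    return "severe"


# Hypothetical task durations in hours, invented for illustration.
call_center = [0.20, 0.25, 0.25, 0.30, 0.25]
it_desk = [0.25, 0.25, 0.5, 0.5, 1.0, 1.0, 2.0, 40.0]

for name, queue in [("call center", call_center), ("IT desk", it_desk)]:
    cv = coefficient_of_variation(queue)
    print(f"{name}: CV = {cv:.2f}, distortion {distortion_band(cv)}")
```

In the second queue a single 40-hour task among sub-day tickets is enough to push $CV$ above 2, in line with the range quoted in Section 10.3.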
### 10.5 When the Unweighted Mean Is Defensible

The metric is defensible **only when all four conditions hold**:

1. Task sizes are approximately uniform ($CV < 0.3$)
2. No priority differentiation (all tasks equally important)
3. Each task represents exactly one client
4. The metric is not used to evaluate, reward, or direct behavior

These conditions are rarely met in the systems where the metric is most commonly used.

---

## 11. Related Work

This paper sits at the intersection of several literatures that have not previously been connected.

### 11.1 Scheduling Theory and Fairness

Smith [1] established the SPT optimality result and the WSJF rule in 1956. Conway, Maxwell, and Miller [2] provided the comprehensive textbook treatment. The fairness of size-based scheduling policies has been debated in computer systems scheduling: Bansal and Harchol-Balter [22] investigated SRPT unfairness; Wierman and Harchol-Balter [23] formalized fairness classifications against Processor-Sharing; Angel, Bampis, and Pascual [21] measured SPT schedule quality against fair optimality criteria.

This prior work analyzes fairness in CPU and server scheduling. The present paper applies the same mathematical results to *organizational task management*, where the "scheduler" is a human team, the "jobs" are client requests with business-impact priorities, and the "objective function" is a management metric. The mechanism is identical; the consequences differ because organizational scheduling has priority systems, client relationships, and psychological costs that CPU scheduling does not.

### 11.2 Measurement Dysfunction

Austin [18] proved that incomplete measurement — measuring only a subset of relevant dimensions — creates incentives to optimize the measured dimensions at the expense of unmeasured ones, and that this effect is not merely possible but *inevitable* when measurement is tied to rewards. His information-asymmetry framing closely parallels Section 7.
The present paper provides the specific mathematical mechanism (Theorems 1–2) for the case of task scheduling, and extends the argument through psychology (Section 8) to trace the complete chain of organizational harm.

Muller [19] documented "metric fixation" across education, healthcare, policing, and finance, providing extensive empirical evidence for the patterns theorized in Section 7.4. Campbell [24] formalized the corrupting effect of using indicators as targets, complementing Goodhart's original observation [6] and Strathern's generalization [7]. Bevan and Hood [26] empirically documented gaming behaviors in the English public health system — including the exact patterns of "hitting the target and missing the point" described in our Section 5.2.

### 11.3 Psychological Costs of Metric Dysfunction

The application of moral injury (Shay [16], Litz et al. [17]) to business settings has recent precedent: a 2024 *Journal of Business Ethics* study [25] explicitly extended the construct to for-profit workplaces, finding structural conditions similar to those described in Section 8.4. Moore [27] analyzed moral *disengagement* — the cognitive restructuring that enables unethical behavior under organizational pressure. The present paper addresses the complementary phenomenon: the harm to individuals who *refuse* to disengage.

### 11.4 What Is Novel

The individual components — SPT optimality, Goodhart's Law, measurement dysfunction, moral injury — all have precedent. The contributions of this paper are:

1. **The conservation law (Theorem 2) used prescriptively** — as a constructive argument that work-weighted completion time *cannot* be gamed, rather than as a theoretical scheduling result.
2. **The specific proof that priority classes make the metric algebraically adversarial** (Theorems 8–9) — not merely empirically bad but structurally contradictory, with zero mutual information between the schedule and the priority system.
3. **The integrated chain** from mathematical proof through information asymmetry through psychological harm through adversarial selection spiral — tracing a single metric from Smith (1956) to organizational hollowing.
4. **The manager internalization strategy** (Section 9) with formal game-theoretic analysis of its stability and breakdown conditions under inter-team competition.
5. **The application of scheduling theory to organizational management critique** — proving that a commonly used team metric has specific, quantifiable pathologies rather than arguing from anecdote or general principle.

---

## 12. Conclusion

The unweighted average completion time is a **biased statistic** that:

1. **Can be gamed** by scheduling policy (Theorem 1), unlike work-weighted completion time which is schedule-invariant (Theorem 2).
2. **Incentivizes starvation** of large tasks (Theorem 3).
3. **Degrades client satisfaction** with zero compensating productivity gain (Theorem 7).
4. **Actively contradicts priority systems** by carrying zero information about business-impact classification (Theorem 9).
5. **Ignores priority entirely** in its scheduling recommendation, producing suboptimal priority-weighted delay whenever priority and size are not perfectly inversely correlated (Theorem 10).

A metric that can be improved by reordering work — without doing any additional work — is measuring the scheduling policy, not the system's capacity. When combined with a priority system, it recommends the schedule that inflicts the most damage on the highest-priority work.

When the metric is reported to clients, it creates an information asymmetry (Section 7) whose business equilibrium is profitable but fragile. When team members understand its flaws, it violates their intrinsic motivation and selects for the departure of the most competent people (Section 8).
A single informed manager can partially mitigate these effects through constrained optimization (Section 9), but this cooperative strategy is not stable under inter-team competition.

The unweighted mean is defensible only under narrow conditions (Section 10.5): uniform task sizes, no priorities, one-to-one client-task mapping, and no behavioral influence. These conditions are rarely met.

**Unweighted average completion time is not a fair or accurate measurement of task execution performance. Its adoption as a team metric will rationally produce starvation of complex work, violation of stated priorities, inequitable client outcomes, and the illusion of productivity where none exists.**

The best solution is organizational metric reform. The actionable solution is a manager who understands this proof.

---

## References

### Scheduling Theory

[1] Smith, W. E. (1956). Various optimizers for single-stage production. *Naval Research Logistics Quarterly*, 3(1–2), 59–66. doi:[10.1002/nav.3800030106](https://doi.org/10.1002/nav.3800030106)

> Origin of the SPT optimality result (Theorem 1), the weighted completion
> time rule $w_i/p_i$ descending (WSJF, Theorem 11), and the adjacent-job
> pairwise interchange (exchange argument) proof technique used throughout.

[2] Conway, R. W., Maxwell, W. L., & Miller, L. W. (1967). *Theory of Scheduling*. Addison-Wesley.

> Standard textbook treatment of single-machine scheduling theory,
> extending Smith's results.

[3] Little, J. D. C. (1961). A proof for the queuing formula: L = λW. *Operations Research*, 9(3), 383–387. doi:[10.1287/opre.9.3.383](https://doi.org/10.1287/opre.9.3.383)

> First rigorous proof of Little's Law. Referenced in Section 3.2 for
> queueing-theoretic context.

[4] Little, J. D. C. (2011). Little's Law as viewed on its 50th anniversary. *Operations Research*, 59(3), 536–549. doi:[10.1287/opre.1110.0941](https://doi.org/10.1287/opre.1110.0941)

> Retrospective discussing scope, limitations, and common misapplications.

[5] Reinertsen, D. G. (2009). *The Principles of Product Development Flow: Second Generation Lean Product Development*. Celeritas Publishing. ISBN: 978-0-9844512-0-8.

> Popularized WSJF and "Cost of Delay / Duration" in agile/lean contexts.
> Mathematical foundation is Smith (1956) [1].

### Measurement and Incentives

[6] Goodhart, C. A. E. (1984). Problems of monetary management: The U.K. experience. In *Monetary Theory and Practice* (pp. 91–121). Macmillan.

> Source of Goodhart's Law: "Any observed statistical regularity will tend
> to collapse once pressure is placed upon it for control purposes."

[7] Strathern, M. (1997). 'Improving ratings': Audit in the British university system. *European Review*, 5(3), 305–321. doi:[10.1002/(SICI)1234-981X(199707)5:3<305::AID-EURO184>3.0.CO;2-4](https://doi.org/10.1002/(SICI)1234-981X(199707)5:3%3C305::AID-EURO184%3E3.0.CO;2-4)

> Generalized Goodhart's Law: "When a measure becomes a target, it ceases
> to be a good measure."

### Behavioral Economics

[8] Kahneman, D., & Tversky, A. (1979). Prospect theory: An analysis of decision under risk. *Econometrica*, 47(2), 263–292. doi:[10.2307/1914185](https://doi.org/10.2307/1914185)

> Established loss aversion. Referenced in Section 4.5.

### Game Theory and Contract Theory

[9] Akerlof, G. A. (1970). The market for "lemons": Quality uncertainty and the market mechanism. *The Quarterly Journal of Economics*, 84(3), 488–500. doi:[10.2307/1879431](https://doi.org/10.2307/1879431)

> Information asymmetry and adverse selection. The pooling equilibrium in
> Section 7.5 is structurally analogous.

[10] Holmström, B. (1979). Moral hazard and observability. *The Bell Journal of Economics*, 10(1), 74–91. doi:[10.2307/3003320](https://doi.org/10.2307/3003320)

> Formal treatment of moral hazard. The metric-reporting scenario in
> Section 7.5 is a moral hazard problem.

### Psychology

[11] Festinger, L. (1957). *A Theory of Cognitive Dissonance*. Stanford University Press. ISBN: 978-0-8047-0131-0.

> Foundational theory. Referenced in Section 8.2.

[12] Deci, E. L., & Ryan, R. M. (1985). *Intrinsic Motivation and Self-Determination in Human Behavior*. Plenum Press. ISBN: 978-0-306-42022-1.

> Original treatment of Self-Determination Theory. Referenced in
> Section 8.3.

[13] Ryan, R. M., & Deci, E. L. (2000). Self-determination theory and the facilitation of intrinsic motivation, social development, and well-being. *American Psychologist*, 55(1), 68–78. doi:[10.1037/0003-066X.55.1.68](https://doi.org/10.1037/0003-066X.55.1.68)

> SDT overview linking need satisfaction to intrinsic motivation and
> well-being.

[14] Seligman, M. E. P., & Maier, S. F. (1967). Failure to escape traumatic shock. *Journal of Experimental Psychology*, 74(1), 1–9. doi:[10.1037/h0024514](https://doi.org/10.1037/h0024514)

> Original demonstration of learned helplessness. Referenced in
> Section 8.5.

[15] Seligman, M. E. P. (1975). *Helplessness: On Depression, Development, and Death*. W. H. Freeman. ISBN: 978-0-7167-0752-3.

> Extended treatment connecting learned helplessness to human depression
> and institutional behavior.

[16] Shay, J. (1994). *Achilles in Vietnam: Combat Trauma and the Undoing of Character*. Atheneum / Simon & Schuster. ISBN: 978-0-689-12182-3.

> Introduced the concept of moral injury. Referenced in Section 8.4.

[17] Litz, B. T., Stein, N., Delaney, E., Lebowitz, L., Nash, W. P., Silva, C., & Maguen, S. (2009). Moral injury and moral repair in war veterans: A preliminary model and intervention strategy. *Clinical Psychology Review*, 29(8), 695–706. doi:[10.1016/j.cpr.2009.07.003](https://doi.org/10.1016/j.cpr.2009.07.003)

> Formalized moral injury as a clinical construct. Definition quoted in
> Section 8.4.

### Organizational Measurement

[18] Austin, R. D. (1996). *Measuring and Managing Performance in Organizations*. Dorset House. ISBN: 978-0-932633-36-1.

> Proved that incomplete measurement creates inevitable incentives to
> optimize measured dimensions at the expense of unmeasured ones. The
> information-asymmetry framing closely parallels Section 7. The single
> most important predecessor to this paper's argument.

[19] Muller, J. Z. (2018). *The Tyranny of Metrics*. Princeton University Press. ISBN: 978-0-691-17495-2.

> Comprehensive treatment of "metric fixation" across education,
> healthcare, policing, and finance. Extensive empirical evidence for the
> patterns theorized in Section 7.4.

### Scheduling Fairness

[20] Coffman, E. G., Shanthikumar, J. G., & Yao, D. D. (1992). Multiclass queueing systems: Polymatroid structure and optimal scheduling control. *Operations Research*, 40(S2), S293–S299.

> Conservation laws in scheduling. The schedule-invariance of
> work-weighted completion time (Theorem 2) is an instance of these
> conservation laws.

[21] Angel, E., Bampis, E., & Pascual, F. (2008). How good are SPT schedules for fair optimality criteria? *Annals of Operations Research*, 159(1), 53–64. doi:[10.1007/s10479-007-0267-0](https://doi.org/10.1007/s10479-007-0267-0)

> Directly measures SPT schedule quality against fairness criteria.
> Closest predecessor in scheduling theory to Section 4's fairness
> analysis.

[22] Bansal, N., & Harchol-Balter, M. (2001). Analysis of SRPT scheduling: Investigating unfairness. *ACM SIGMETRICS Performance Evaluation Review*, 29(1), 279–290. doi:[10.1145/384268.378792](https://doi.org/10.1145/384268.378792)

> Investigates the belief that SRPT unfairly penalizes large jobs in
> computer scheduling. Argues unfairness is smaller than believed but
> acknowledges the core tension.

[23] Wierman, A., & Harchol-Balter, M. (2003). Classifying scheduling policies with respect to unfairness in an M/GI/1. *ACM SIGMETRICS Performance Evaluation Review*, 31(1), 238–249.

> Formalizes fairness definitions for scheduling policies by comparison
> to Processor-Sharing.
### Additional References

[24] Campbell, D. T. (1979). Assessing the impact of planned social change. *Evaluation and Program Planning*, 2(1), 67–90. doi:[10.1016/0149-7189(79)90048-X](https://doi.org/10.1016/0149-7189(79)90048-X)

> Campbell's Law: "The more any quantitative social indicator is used for
> social decision-making, the more subject it will be to corruption
> pressures and the more apt it will be to distort and corrupt the social
> processes it is intended to monitor." Complements Goodhart's Law [6].

[25] Ferreira, C. M., et al. (2024). It's business: A qualitative study of moral injury in business settings. *Journal of Business Ethics*. doi:[10.1007/s10551-024-05615-0](https://doi.org/10.1007/s10551-024-05615-0)

> Extends moral injury to for-profit workplaces. Validates Section 8.4's
> application of Shay/Litz beyond military and healthcare settings.

[26] Bevan, G., & Hood, C. (2006). What's measured is what matters: Targets and gaming in the English public health care system. *Public Administration*, 84(3), 517–538. doi:[10.1111/j.1467-9299.2006.00600.x](https://doi.org/10.1111/j.1467-9299.2006.00600.x)

> Empirically documents gaming behaviors including "hitting the target
> and missing the point." Provides real-world evidence for Section 5.2's
> priority-metric contradiction.

[27] Moore, C. (2012). Why employees do bad things: Moral disengagement and unethical organizational behavior. *Personnel Psychology*, 65(1), 1–48. doi:[10.1111/j.1744-6570.2011.01237.x](https://doi.org/10.1111/j.1744-6570.2011.01237.x)

> Analyzes moral *disengagement* — the cognitive restructuring enabling
> unethical behavior. Section 8 addresses the complementary phenomenon:
> harm to individuals who *refuse* to disengage.

---

*This proof was developed conversationally and formalized on 2026-03-28.*