diff --git a/.backup/README.md.v1 b/.backup/README.md.v1
new file mode 100644
index 0000000..ba10bcf
--- /dev/null
+++ b/.backup/README.md.v1
@@ -0,0 +1,1249 @@
+# Unweighted Average Completion Time Is Not a Fair Metric for Task Scheduling
+
+A mathematical proof that unweighted average task completion time is a biased
+statistic that incentivizes cherry-picking easy work, and that any scheduling
+advantage it appears to reveal is an artifact of the metric — not a reflection
+of genuine throughput or service quality.
+
+---
+
+## 1. Introduction
+
+Many organizations measure task-execution performance by **unweighted mean
+completion time**: the average number of hours (or days) between task
+submission and task resolution, counting each task equally regardless of
+size or priority.
+
+This paper proves that this metric is not merely imprecise but structurally
+biased. It can be improved by reordering work without doing any additional
+work (Theorem 1), while a properly weighted alternative is completely
+immune to scheduling manipulation (Theorem 2). When combined with a
+priority system, the metric actively contradicts the organization's own
+priority classifications (Theorem 8).
+
+The argument proceeds in four parts:
+
+- **Part I** (Sections 2–4) establishes the mathematical foundation:
+  the unweighted mean is gameable by Shortest Processing Time (SPT)
+  scheduling, the work-weighted mean is schedule-invariant, and the
+  resulting service-quality consequences are provably negative.
+
+- **Part II** (Sections 5–6) extends the model to priority-classified
+  tasks, proves the metric becomes adversarial to the priority system,
+  and proposes weighted alternatives with a worked IT service desk example.
+
+- **Part III** (Sections 7–9) examines organizational dynamics: what
+  happens when the metric is reported to clients (information asymmetry),
+  what happens to team members who understand its flaws (psychological
+  harm), and what a single informed manager can do about it (constrained
+  optimization with game-theoretic stability analysis).
+
+- **Part IV** (Sections 10–12) presents honest counterarguments, situates
+  the work in existing literature, and concludes.
+
+The core results build on Smith's (1956) foundational scheduling theory [1],
+extended through game theory [9, 10], organizational measurement theory
+[18, 19], and psychology [11–17] to trace a complete chain from a
+mathematical proof about a specific metric to organizational outcomes.
+
+---
+
+# Part I: Mathematical Foundation
+
+## 2. Definitions
+
+Let there be **n** tasks with processing times $p_1, p_2, \ldots, p_n$.
+
+A **schedule** $\sigma$ is a permutation of $\{1, 2, \ldots, n\}$ assigning
+tasks to execution order on a single executor.
+
+The **completion time** of task $\sigma(k)$ under schedule $\sigma$ is:
+
+$$C_{\sigma(k)} = \sum_{j=1}^{k} p_{\sigma(j)}$$
+
+The **unweighted mean completion time** is:
+
+$$\bar{C}(\sigma) = \frac{1}{n} \sum_{k=1}^{n} C_{\sigma(k)}$$
+
+The **work-weighted mean completion time** is:
+
+$$\bar{C}_w(\sigma) = \frac{\sum_{k=1}^{n} p_{\sigma(k)} \cdot C_{\sigma(k)}}{\sum_{k=1}^{n} p_{\sigma(k)}}$$
+
+---
+
+## 3. Core Results
+
+### 3.1 The Unweighted Mean Is Gameable
+
+**Theorem 1** (Smith, 1956 [1])**.** The schedule that minimizes
+$\bar{C}(\sigma)$ is Shortest Processing Time first (SPT): sort tasks so
+that $p_{\sigma(1)} \le p_{\sigma(2)} \le \cdots \le p_{\sigma(n)}$.
+
+**Proof (exchange argument [1, 2]).**
+
+Consider any schedule $\sigma$ in which two adjacent tasks $i, j$ satisfy
+$p_i > p_j$ with task $i$ scheduled immediately before task $j$. Let $t$
+be the start time of task $i$.
+
+| | Task $i$ finishes | Task $j$ finishes | Sum |
+|---|---|---|---|
+| **Before swap** ($i$ then $j$) | $t + p_i$ | $t + p_i + p_j$ | $2t + 2p_i + p_j$ |
+| **After swap** ($j$ then $i$) | $t + p_j$ | $t + p_j + p_i$ | $2t + p_i + 2p_j$ |
+
+Swapping reduces the sum of completion times by:
+
+$$(2p_i + p_j) - (p_i + 2p_j) = p_i - p_j > 0$$
+
+Every swap of a longer-before-shorter adjacent pair strictly reduces the
+total. Any non-SPT schedule contains such a pair, and repeated swaps end in
+SPT. Therefore SPT minimizes $\bar{C}(\sigma)$, uniquely up to the order of
+tasks with equal processing times. $\blacksquare$
+
+### 3.2 The Work-Weighted Mean Is Schedule-Invariant
+
+**Theorem 2.** The work-weighted mean completion time $\bar{C}_w(\sigma)$
+is the same for every schedule $\sigma$.
+
+**Proof.**
+
+Expand the numerator:
+
+$$\sum_{k=1}^{n} p_{\sigma(k)} \cdot C_{\sigma(k)} = \sum_{k=1}^{n} p_{\sigma(k)} \sum_{j=1}^{k} p_{\sigma(j)}$$
+
+Reindex by letting $a = \sigma(k)$ and $b = \sigma(j)$. The double sum
+counts every ordered pair $(a, b)$ where $b$ is scheduled no later than $a$:
+
+$$= \sum_{\substack{a, b \\ b \preceq_\sigma a}} p_a \, p_b$$
+
+For any pair $(a, b)$ with $a \ne b$, exactly one of
+$\{b \preceq_\sigma a\}$ or $\{a \prec_\sigma b\}$ holds. The diagonal
+terms ($a = b$) contribute $p_a^2$ regardless of order. Therefore:
+
+$$\sum_{\substack{a, b \\ b \preceq_\sigma a}} p_a \, p_b = \sum_{a} p_a^2 + \sum_{\substack{a \ne b \\ b \prec_\sigma a}} p_a \, p_b$$
+
+Together with the complementary sum, the two off-diagonal sums cover every
+ordered pair $(a, b)$ with $a \ne b$ exactly once:
+
+$$\sum_{\substack{a \ne b \\ b \prec_\sigma a}} p_a \, p_b + \sum_{\substack{a \ne b \\ a \prec_\sigma b}} p_a \, p_b = \sum_{a \ne b} p_a \, p_b$$
+
+The right-hand side is schedule-independent. By symmetry of $p_a p_b$,
+both off-diagonal sums are equal:
+
+$$\sum_{\substack{a \ne b \\ b \prec_\sigma a}} p_a \, p_b = \frac{1}{2} \sum_{a \ne b} p_a \, p_b$$
+
+Therefore:
+
+$$\sum_{k=1}^{n} p_{\sigma(k)} \cdot C_{\sigma(k)} = \sum_a p_a^2 + \frac{1}{2} \sum_{a \ne b} p_a \, p_b = \frac{1}{2}\left(\sum_a p_a\right)^2 + \frac{1}{2}\sum_a p_a^2$$
+
+This expression contains no reference to $\sigma$. Since the denominator
+$\sum p_a$ is also schedule-independent:
+
+$$\bar{C}_w(\sigma) = \frac{\frac{1}{2}\left(\sum p_a\right)^2 + \frac{1}{2}\sum p_a^2}{\sum p_a}$$
+
+is **constant across all schedules**. $\blacksquare$
+
+This is an instance of the conservation laws in scheduling identified by
+Coffman, Shanthikumar, and Yao [20]. The invariance corresponds to
+measuring how long a unit of *work* waits rather than how long a *task*
+waits — the unweighted statistic counts completions rather than work,
+which is why it is gameable. (See also Little [3, 4] for the
+queueing-theoretic context, with the caveat that Little's Law applies
+directly only to steady-state systems, not to the batch case analyzed here.)
+
+### 3.3 Illustrative Example
+
+Two tasks: $A$ with $p_A = 1$ hour, $B$ with $p_B = 10$ hours.
+
+| Schedule | $C_A$ | $C_B$ | Unweighted mean | Work-weighted mean |
+|----------|-------|-------|-----------------|-------------------|
+| SPT (A first) | 1 | 11 | 6.0 | 111/11 ≈ 10.09 |
+| Reverse (B first) | 11 | 10 | 10.5 | 111/11 ≈ 10.09 |
+
+SPT appears **4.5 hours better** on the unweighted metric but provides
+**zero improvement** on the work-weighted metric. The apparent advantage
+exists only because the unweighted statistic lets a 1-hour task "vote"
+equally with a 10-hour task.
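+
+As a quick check, the closed form from Theorem 2 can be evaluated for this
+example (a worked instance using only the two processing times above, so
+$\sum p_a = 11$ and $\sum p_a^2 = 101$):
+
+$$\bar{C}_w = \frac{\frac{1}{2}(11)^2 + \frac{1}{2}(101)}{11} = \frac{60.5 + 50.5}{11} = \frac{111}{11} \approx 10.09$$
+
+This agrees with both rows of the table, as the invariance predicts.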
+
+---
+
+## 4. Consequences for Service Quality
+
+### 4.1 Starvation of Large Tasks
+
+**Theorem 3 (Metric Bias).** Any scheduling policy that minimizes
+unweighted mean completion time necessarily maximizes the completion time
+of the largest task.
+
+**Proof.** SPT places the largest task last. Its completion time equals
+the total processing time $\sum p_i$, which is the maximum possible
+completion time for any individual task. Under any schedule that does not
+place the largest task last, that task completes strictly earlier.
+$\blacksquare$
+
+This creates a **starvation incentive**: when new small tasks keep arriving,
+rational agents optimizing the unweighted statistic will keep deferring large
+tasks. Austin [18] identified this general pattern in organizational
+performance management: incomplete measurement creates incentives to optimize
+the measured dimension at the expense of unmeasured ones. Theorem 3 provides
+the specific mechanism for task scheduling.
+
+### 4.2 Maximum Completion Time for the Largest Task
+
+**Theorem 4 (SPT Maximizes Completion Time of the Largest Task).**
+SPT assigns the maximum possible completion time ($\sum p_i$) to the
+largest task. Any schedule that does not place that task last completes it
+strictly earlier.
+
+**Proof.** SPT sorts tasks in ascending order of $p_i$, placing the largest
+task $p_{\max}$ in the last position. The last task in any schedule has
+completion time $\sum_{i=1}^{n} p_i$, which is the maximum any individual
+task can receive. Under any schedule that does not place $p_{\max}$ last,
+it completes strictly before $\sum p_i$. $\blacksquare$
+
+**Corollary 4.1.** A team optimizing unweighted mean completion time will
+systematically deliver the worst experience to clients with the most
+complex needs. This is not a side effect — it is the *mechanism* by which
+the metric improves.
+
+**Note on slowdown ratios.** SPT actually *compresses* slowdown ratios
+($S_i = C_i / p_i$) because larger tasks in later positions have large
+denominators that absorb the accumulated sum. For example, with tasks
+$[1, 5, 10]$: SPT gives slowdowns $[1, 1.2, 1.6]$ (low variance) while
+LPT gives $[1, 3, 16]$ (high variance). SPT's harm to large-task clients
+is not visible in the slowdown ratio — it is visible in **absolute
+completion time**. This distinction is important: the scheduling fairness
+literature [21, 22, 23] has debated SPT/SRPT unfairness primarily through
+slowdown-based measures, which can obscure the absolute-delay burden
+proved below.
+
+### 4.3 Delay Concentration
+
+**Theorem 5 (SPT Concentrates Delay on the Largest Task).** Under SPT,
+the largest task bears the maximum possible absolute delay, strictly more
+than under any schedule that does not place it last.
+
+**Proof.** Define absolute delay as $\Delta_i = C_i - p_i$ (time spent
+waiting, independent of own size). Under SPT, the largest task is in
+position $n$ with:
+
+$$\Delta_{\max\text{-task}}^{\text{SPT}} = C_n - p_n = \sum_{i=1}^{n-1} p_i$$
+
+This is the sum of all other tasks' processing times, the maximum possible
+delay for any single task. Under any schedule where the largest task is not
+last, its delay is strictly less. Meanwhile, SPT gives the smallest task
+zero delay ($\Delta_1^{\text{SPT}} = 0$). The queuing burden shifts from
+the smallest tasks toward the largest. $\blacksquare$
+
+SPT minimizes *total* delay (good for aggregate efficiency) by
+concentrating delay onto the tasks best able to absorb it in slowdown-ratio
+terms. But in absolute terms — hours spent waiting — the largest task bears
+the full weight.
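+
+To make the trade-off concrete, take the tasks $[1, 5, 10]$ from the note
+in Section 4.2 (an illustrative instance only; the subscripts below index
+tasks by processing time, a notation assumed here for brevity):
+
+$$\text{SPT: } (\Delta_1, \Delta_5, \Delta_{10}) = (0, 1, 6), \quad \sum_i \Delta_i = 7$$
+
+$$\text{LPT: } (\Delta_1, \Delta_5, \Delta_{10}) = (15, 10, 0), \quad \sum_i \Delta_i = 25$$
+
+SPT cuts total delay from 25 to 7 time units, yet the 10-unit task's own
+delay rises from 0 to 6.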
+
+### 4.4 Throughput Invariance
+
+**Theorem 6 (Throughput Invariance).** Total work completed over any time
+horizon $T$ is identical under all scheduling policies.
+
+**Proof.** The executor processes work at a fixed rate and never idles
+while tasks remain, so by any time $T$ it has completed
+$\min(T, \sum p_i)$ units of work regardless of order. For the steady-state
+case with ongoing arrivals, the long-run throughput is determined by the
+service rate $\mu$ and is independent of scheduling:
+
+$$\lim_{T \to \infty} \frac{W(T)}{T} = \mu \quad \text{for all schedules } \sigma$$
+
+$\blacksquare$
+
+**Corollary 6.1.** A team that switches from any scheduling policy to SPT
+will observe an improvement in unweighted mean completion time with **zero
+change in actual throughput**. The metric improves. The output does not.
+
+### 4.5 The Compound Effect
+
+Combining Theorems 4, 5, and 6:
+
+| Measure | Effect of optimizing unweighted mean |
+|---------|--------------------------------------|
+| Throughput (work/time) | No change (Theorem 6) |
+| Delay for small tasks | Minimized — approaches zero (SPT) |
+| Delay for large tasks | **Maximized** — bears all queuing burden (Theorem 5) |
+| Completion time of largest task | **Maximum possible**: $\sum p_i$ (Theorem 4) |
+
+The net effect on perceived quality is negative because:
+
+1. **Loss aversion is asymmetric** [8]. A client whose 100-hour task is
+   deprioritized experiences a large, salient negative. A client whose
+   1-hour task is expedited experiences a small, often unnoticed positive.
+
+2. **High-effort tasks correlate with high-value clients.** Large tasks
+   are disproportionately likely to come from major clients, complex
+   contracts, or critical business needs.
+
+3. **Starvation compounds.** In a continuous system (Section 4.1), large
+   tasks may be **indefinitely deferred** as new small tasks keep arriving.
+
+**Theorem 7 (The Core Result).** For a team processing tasks of non-uniform
+size, adopting and optimizing unweighted mean completion time as a
+performance metric:
+
+(a) provides **zero productivity gain** (Theorem 6),
+(b) **assigns the maximum possible completion time** to the largest task
+    (Theorem 4), and
+(c) **concentrates queuing delay** onto the largest tasks while
+    eliminating delay for the smallest (Theorem 5).
+
+This is not a tradeoff. The metric creates a pure transfer of service
+quality from high-effort clients to low-effort clients, with no net work
+gained. $\blacksquare$
+
+---
+
+# Part II: Priority Systems
+
+## 5. Breakdown Under Priority Classification
+
+The preceding sections proved that unweighted mean completion time is
+biased when tasks vary in size. We now show that introducing a **priority
+system** — as virtually all real teams use — causes the metric to become
+not merely biased but **actively adversarial** to the organization's stated
+goals.
+
+### 5.1 Extended Model: Tasks With Priority
+
+Let each task $i$ have processing time $p_i$ and a priority class
+$q_i \in \{1, 2, 3, 4\}$ where 1 is the highest priority (critical) and
+4 is the lowest (cosmetic/enhancement). Assign priority weights:
+
+$$w(q) = \begin{cases} 8 & q = 1 \text{ (Critical)} \\ 4 & q = 2 \text{ (High)} \\ 2 & q = 3 \text{ (Medium)} \\ 1 & q = 4 \text{ (Low)} \end{cases}$$
+
+The specific weights are illustrative; the results hold for any strictly
+decreasing weight function. The key property is that priority is assigned
+by **business impact**, not by task size.
+
+### 5.2 The Metric Contradicts the Priority System
+
+**Theorem 8 (Priority-Size Inversion).** When priority is independent of
+task size, the schedule that minimizes unweighted mean completion time
+(SPT) will, in expectation, complete low-priority tasks before
+high-priority tasks of greater size.
+
+**Proof.** SPT orders tasks by $p_i$ ascending, regardless of $q_i$.
+Consider two tasks:
+
+- Task A: $p_A = 40$ hours, $q_A = 1$ (Critical — e.g., server outage)
+- Task B: $p_B = 0.5$ hours, $q_B = 4$ (Low — e.g., cosmetic UI fix)
+
+SPT schedules B before A. The unweighted mean for this pair:
+
+$$\bar{C}^{\text{SPT}} = \frac{0.5 + 40.5}{2} = 20.5 \qquad \bar{C}^{\text{priority}} = \frac{40 + 40.5}{2} = 40.25$$
+
+The metric declares SPT nearly **twice as good** — despite completing a
+cosmetic fix while a server outage burns.
+
+In general, when $q_i$ is statistically independent of $p_i$, SPT's
+ordering has **zero correlation** with priority. In practice, Critical
+tasks (outages, security incidents, data loss) often require more work
+than Low tasks, so the metric is plausibly **anti-correlated** with the
+priority system. $\blacksquare$
+
+### 5.3 Information Destruction
+
+The unweighted mean reduces a three-dimensional task $(p_i, q_i, C_i)$ to
+a one-dimensional signal ($C_i$), then averages uniformly. This discards
+priority entirely and implicitly inverts size.
+
+**Theorem 9 (Information Destruction).** Let $I(\sigma)$ be the mutual
+information between the schedule's implicit priority ranking (position)
+and the actual priority assignment $q_i$. For SPT:
+
+$$I(\sigma_{\text{SPT}}) = 0 \quad \text{when } p_i \perp q_i$$
+
+**Proof.** SPT assigns positions based solely on $p_i$. When $p_i$ and
+$q_i$ are independent, knowing a task's position in the SPT schedule
+provides zero information about its priority. $\blacksquare$
+
+**Corollary 9.1.** A team that optimizes unweighted mean completion time
+is operating a scheduling system that carries zero information about its
+own priority classification. The priority field in their ticketing system
+is, with respect to execution order, decorative.
+
+This is an instance of what Austin [18] calls the fundamental problem of
+incomplete measurement: when the measurement system captures only a subset
+of the relevant dimensions, optimizing the measurement systematically
+degrades the unmeasured dimensions.
+
+### 5.4 Priority-Weighted Delay Cost
+
+Define the **priority-weighted delay cost** of a schedule:
+
+$$D(\sigma) = \sum_{i=1}^{n} w(q_i) \cdot C_i$$
+
+**Theorem 10 (SPT and Priority-Weighted Delay Cost).** The optimal
+schedule for minimizing $D(\sigma)$ is WSJF: order by $w(q_i)/p_i$
+descending [1, 5]. SPT orders by $1/p_i$ descending, ignores priority
+entirely, and produces strictly higher $D$ than WSJF whenever it places a
+task with a lower ratio $w(q_i)/p_i$ ahead of one with a higher ratio.
+
+**Proof.** By the exchange argument, if task $i$ is scheduled immediately
+before task $j$, swapping them reduces $D$ by:
+
+$$\Delta D = w(q_j) \cdot p_i - w(q_i) \cdot p_j$$
+
+The swap improves $D$ exactly when $w(q_j)/p_j > w(q_i)/p_i$, that is, when
+the task with the higher ratio is scheduled later. Therefore the optimal
+order is decreasing $w(q_i)/p_i$, the WSJF rule. SPT coincides with WSJF
+only when $w(q_i) = \text{const}$ (all tasks have equal priority).
+
+**Example.** Critical ($w = 8$, $p = 3$) and Low ($w = 1$, $p = 2$):
+
+- SPT (Low first): $D = 1 \cdot 2 + 8 \cdot 5 = 42$
+- WSJF (Critical first): $D = 8 \cdot 3 + 1 \cdot 5 = 29$
+
+SPT incurs 45% more priority-weighted delay. In practice, Critical tasks
+tend to be larger (outages, security incidents), making the divergence
+systematic. $\blacksquare$
+
+---
+
+## 6. Proposed Solutions
+
+### 6.1 Priority-Weighted Metrics
+
+Replace unweighted mean completion time with the **Priority-Weighted
+Completion Score (PWCS)**:
+
+$$\text{PWCS}(\sigma) = \frac{\sum_{i=1}^{n} w(q_i) \cdot \frac{C_i}{p_i}}{\sum_{i=1}^{n} w(q_i)}$$
+
+This is the priority-weighted mean slowdown ratio. It measures how long
+each task waited relative to its size, weighted by how much that task
+mattered. Lower is better.
+
+**Properties:**
+
+1. **Priority-respecting.** Delays to Critical tasks cost 8x more than
+   delays to Low tasks.
+2. **Size-fair.** Uses slowdown ratio $C_i / p_i$, so large tasks are not
+   penalized for being large.
+3. **Not minimized by pure SPT.** By the same exchange argument, PWCS is
+   minimized by ordering on $w(q_i)/p_i^2$ descending, which coincides
+   with SPT only when all priorities are equal.
+4. **Reduces to the unweighted mean, scaled by $1/p$, when all tasks share
+   one priority and one size $p$.** A generalization up to a constant.
+
+### 6.2 Optimal Policy: WSJF
+
+**Theorem 11.** The schedule minimizing the priority-weighted completion
+time $\text{PWCT}(\sigma) = \sum w(q_i) \cdot C_i / \sum w(q_i)$ processes
+tasks in order of decreasing $w(q_i)/p_i$ — the **Weighted Shortest Job
+First (WSJF)** rule [1, 5].
+
+**Proof.** By the exchange argument (as in Theorem 10), the swap of
+adjacent tasks $i, j$ improves PWCT when $w(q_j)/p_j > w(q_i)/p_i$ but
+$j$ is scheduled after $i$. The optimal order is therefore decreasing
+$w(q_i)/p_i$. $\blacksquare$
+
+Within a priority class, this reduces to SPT (shortest first). Across
+classes, a Critical 4-hour task ($w/p = 2.0$) beats a Low 1-hour task
+($w/p = 1.0$).
+
+**Practical caveat.** Pure WSJF can place tiny Low-priority tasks ahead
+of large Critical tasks (a 15-minute Low task has $w/p = 1/0.25 = 4.0$,
+beating a 6-hour Critical at $w/p = 8/6 = 1.33$). In practice, this is
+mitigated by enforcing **strict priority-class ordering** and applying
+WSJF only *within* each class.
+
+### 6.3 Applied Example: IT Service Desk
+
+Consider an IT team with the following ticket queue:
+
+| Ticket | Priority | Type | Est. Hours |
+|--------|----------|------|-----------|
+| T1 | P1 (Critical) | Email server down | 6 |
+| T2 | P2 (High) | VPN failing for remote team | 4 |
+| T3 | P3 (Medium) | New employee laptop setup | 2 |
+| T4 | P4 (Low) | Update desktop wallpaper policy | 0.5 |
+| T5 | P3 (Medium) | Install software license | 1 |
+| T6 | P1 (Critical) | Database backup failing | 3 |
+| T7 | P2 (High) | Printer fleet offline | 2 |
+| T8 | P4 (Low) | Archive old shared drive folder | 0.25 |
+
+**SPT order** (optimizing unweighted mean): T8, T4, T5, T3, T7, T6, T2, T1
+
+| Pos | Ticket | Priority | Hours | Completion | Slowdown |
+|-----|--------|----------|-------|------------|----------|
+| 1 | T8 (archive folder) | P4 Low | 0.25 | 0.25 | 1.0 |
+| 2 | T4 (wallpaper) | P4 Low | 0.5 | 0.75 | 1.5 |
+| 3 | T5 (software) | P3 Med | 1 | 1.75 | 1.75 |
+| 4 | T3 (laptop) | P3 Med | 2 | 3.75 | 1.875 |
+| 5 | T7 (printers) | P2 High | 2 | 5.75 | 2.875 |
+| 6 | T6 (backups) | P1 Crit | 3 | 8.75 | 2.917 |
+| 7 | T2 (VPN) | P2 High | 4 | 12.75 | 3.188 |
+| 8 | T1 (email) | P1 Crit | 6 | 18.75 | 3.125 |
+
+**Practical WSJF** (priority-class-first, SPT within class):
+
+| Pos | Ticket | Priority | Hours | Completion |
+|-----|--------|----------|-------|------------|
+| 1 | T6 (backups) | P1 Crit | 3 | 3 |
+| 2 | T1 (email) | P1 Crit | 6 | 9 |
+| 3 | T7 (printers) | P2 High | 2 | 11 |
+| 4 | T2 (VPN) | P2 High | 4 | 15 |
+| 5 | T5 (software) | P3 Med | 1 | 16 |
+| 6 | T3 (laptop) | P3 Med | 2 | 18 |
+| 7 | T8 (archive) | P4 Low | 0.25 | 18.25 |
+| 8 | T4 (wallpaper) | P4 Low | 0.5 | 18.75 |
+
+**Comparison:**
+
+| Metric | SPT | Practical WSJF | Winner |
+|--------|-----|----------------|--------|
+| Unweighted mean completion | **6.56 hrs** | 13.63 hrs | SPT |
+| P1 mean time to resolution | 13.75 hrs | **6 hrs** | WSJF |
+| P2 mean time to resolution | **9.25 hrs** | 13 hrs | SPT |
+| Time to fix email server | 18.75 hrs | **9 hrs** | WSJF |
+| Time to fix database backups | 8.75 hrs | **3 hrs** | WSJF |
+| Time to update wallpaper | **0.75 hrs** | 18.75 hrs | SPT |
+
+The aggregate priority-weighted completion times are nearly identical
+(PWCT: 10.2 vs 10.17) because aggregation hides distributional damage.
+The real difference is in the **per-priority-class** breakdown: the email
+server is down for 18.75 hours under SPT versus 9 hours under WSJF. The
+database backups fail for 8.75 hours versus 3.
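+
+For reference, the two PWCT values can be reproduced from the tables above,
+using the weights from Section 5.1 and the PWCT definition from Theorem 11
+(total weight $\sum w(q_i) = 30$; terms are listed in each schedule's
+execution order):
+
+$$\text{PWCT}^{\text{SPT}} = \frac{1(0.25) + 1(0.75) + 2(1.75) + 2(3.75) + 4(5.75) + 8(8.75) + 4(12.75) + 8(18.75)}{30} = \frac{306}{30} = 10.2$$
+
+$$\text{PWCT}^{\text{WSJF}} = \frac{8(3) + 8(9) + 4(11) + 4(15) + 2(16) + 2(18) + 1(18.25) + 1(18.75)}{30} = \frac{305}{30} \approx 10.17$$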
+
+The unweighted metric confidently reports SPT as **more than twice as
+efficient** (6.56 vs 13.63), rewarding the team that updated desktop
+wallpaper while the email server was on fire.
+
+### 6.4 Recommended Metric Suite
+
+Even priority-weighted aggregate metrics can fail to distinguish good from
+bad schedules, because aggregation hides distributional damage. No single
+metric suffices. A complete measurement system should track:
+
+| Metric | What it measures | Formula |
+|--------|-----------------|---------|
+| **Mean completion by priority class** | Per-class responsiveness | $\bar{C}$ filtered by $q$ |
+| **P1 mean time to resolution** | Critical incident response | $\bar{C}$ for $q = 1$ |
+| **Throughput** | Raw work capacity | Work-hours completed / calendar time |
+| **Aging violations** | Starvation prevention | Tasks exceeding SLA by priority |
+| **Max completion time (P1/P2)** | Worst-case critical response | $\max(C_i)$ for $q \le 2$ |
+
+The key insight: **per-priority-class metrics** expose scheduling failures
+that aggregate metrics hide.
+
+---
+
+# Part III: Organizational Dynamics
+
+## 7. When the Metric Is the Product
+
+Sections 2–6 assume that client satisfaction is a function of *experienced
+service quality*. But there exists a scenario in which this assumption
+fails and the entire argument collapses.
+
+### 7.1 The Self-Referential Metric
+
+Suppose the provider reports the unweighted mean directly to the client
+— on a dashboard, in an SLA report, on a marketing page — and the
+client's satisfaction is derived primarily from *that number*:
+
+$$U_{\text{client}} = f\!\left(\bar{C}(\sigma)\right), \quad f' < 0$$
+
+Under this model, SPT genuinely maximizes client satisfaction (Theorem 1).
+Throughput is unchanged (Theorem 6). The business outcome improves: same
+work done, happier client.
+
+**Every theorem in this paper remains mathematically correct. But the
+conclusion inverts.** The metric is no longer a proxy that can be gamed —
+it *is* the service quality, because the client has agreed to evaluate
+quality by the aggregate number.
+
+### 7.2 The Economics
+
+This creates a coherent, stable equilibrium:
+
+| Actor | Behavior | Outcome |
+|-------|----------|---------|
+| Provider | Optimizes unweighted mean (SPT) | Metric improves, no extra work |
+| Client | Reads dashboard, sees low average | Reports satisfaction |
+| Management | Sees satisfied client + good metric | Rewards team |
+
+The provider extracts satisfaction at zero marginal cost, by optimizing a
+number the client has accepted as a proxy for quality.
+
+### 7.3 The Fragility
+
+This equilibrium is stable only as long as the client never inspects their
+own experience. It breaks when:
+
+1. **The client checks their own ticket.** A CTO whose email server was
+   down for 18.75 hours will not be reassured by "Average resolution:
+   6.56 hours." The clients most likely to inspect are exactly the ones
+   receiving the worst service (Theorem 4).
+
+2. **A competitor offers per-ticket SLAs.** "P1 resolved within 4 hours"
+   beats "average resolution under 7 hours" for any client with critical
+   needs.
+
+3. **The team internalizes the metric.** If the team believes the metric
+   reflects real performance, they lose the ability to recognize when
+   critical work is neglected. The metric becomes an epistemic hazard.
+
+### 7.4 The General Pattern
+
+This pattern — proxy replaces quality, proxy is optimized, quality
+diverges, system is stable until tested by reality — recurs across domains.
+Muller [19] documents it extensively as "metric fixation"; Campbell [24]
+formalized the corrupting effect of using indicators as targets.
+
+| Domain | Proxy metric | Underlying quality | Divergence |
+|--------|-------------|-------------------|------------|
+| IT support | Avg. resolution time | Critical system uptime | Server down 19 hrs, avg says 6.5 |
+| Education | Test scores | Actual learning | Teaching to the test |
+| Healthcare | Patient throughput | Patient outcomes | Faster discharges, higher readmission |
+| Finance | Quarterly earnings | Long-term value | Cost-cutting inflates EPS, erodes capability |
+| Software | Velocity (story points) | Product quality | Point inflation, features half-finished |
+
+### 7.5 Information Asymmetry
+
+Model the system as a game between provider (P) and client (C). P observes
+individual $\{C_i\}$ and chooses $\sigma$; C observes only
+$\bar{C}(\sigma)$. This is a **moral hazard** problem [10]: P's optimal
+strategy is to minimize the observable signal regardless of the
+unobservable distribution.
+
+The equilibrium is a **pooling equilibrium** [9]: P's reported metric
+looks identical regardless of the underlying priority-weighted performance.
+It is stable until C obtains access to individual $C_i$ values — via a
+customer portal, a competitor's transparency, or a sufficiently painful
+incident.
+
+### 7.6 The Uncomfortable Conclusion
+
+The honest answer to "does optimizing the unweighted mean hurt the
+business?" is: **not necessarily, as long as the client never looks behind
+the number**. The honest answer to "is this sustainable?" is: it is
+exactly as sustainable as any system in which the seller knows more than
+the buyer — stable for extended periods, then rapid collapse when the
+asymmetry is punctured.
+
+---
+
+## 8. The Psychological Cost of Knowing
+
+Section 7 modeled the provider as a unitary actor. But teams are composed
+of individuals. When a team member understands the proof — when they
+*know* the metric is synthetic, that the dashboard is theater, that the
+email server is still down while they close wallpaper tickets — a new cost
+appears that the equilibrium model omitted.
+
+### 8.1 The Hidden Variable: Team Awareness
+
+| Actor | Observes individual $C_i$ | Observes $\bar{C}$ | Understands the proof |
+|-------|--------------------------|--------------------|-----------------------|
+| Management | Possibly | Yes | Varies |
+| Team member | **Yes** | Yes | **Yes** (in this scenario) |
+| Client | No | Yes | No |
+
+The team member has full information. They see the ticket queue. They know
+the email server has been down since 7 AM. They know they are closing a
+wallpaper ticket because it improves the number. And they know *why*.
+
+### 8.2 Cognitive Dissonance Under Full Information
+
+Cognitive dissonance [11] arises when an individual holds contradictory
+cognitions. Without understanding *why*, the contradiction can be
+rationalized: "management knows best." Understanding the proof removes
+the ambiguity. The team member now holds:
+
+- **Cognition A:** "I am a competent professional. My job is to solve
+  important problems."
+- **Cognition B:** "I am closing a wallpaper ticket while the email
+  server is down, because the metric is mathematically biased (Theorem 1),
+  the reordering produces zero throughput (Theorem 6), and the only
+  beneficiary is the dashboard (Section 7). I can prove this."
+
+The dissonance is now *load-bearing*. The available resolutions — abandon
+professional identity, reject the proof, advocate for change, or leave —
+each impose costs that did not exist before.
+
+### 8.3 Self-Determination Theory: Three Needs Violated
+
+Deci and Ryan's Self-Determination Theory [12, 13] identifies three needs
+predicting intrinsic motivation:
+
+**Autonomy.** The metric constrains choices in a way the team member
+knows is mathematically suboptimal. A worker who understands the process
+is provably counterproductive cannot feel autonomous following it.
+
+**Competence.** The metric rewards *apparent* effectiveness (low $\bar{C}$)
+while being invariant to *actual* effectiveness (Theorem 6). Genuine
+competence — fixing the email server first — is *punished* by the metric.
+
+**Relatedness.** The team member knows the client's email server is down.
+They could help. They are instead updating wallpaper — not because it
+helps anyone, but because it helps a number. The connection between work
+and human impact has been severed, and the team member can see the severed
+ends.
+
+### 8.4 Moral Injury
+
+Moral injury [16, 17] is the lasting harm caused by "perpetrating, failing
+to prevent, bearing witness to, or learning about acts that transgress
+deeply held moral beliefs" [17]. It has since been extended to business
+settings [25]. The key distinction from burnout: **burnout is exhaustion
+from doing too much. Moral injury is damage from doing the wrong thing.**
+
+A team member who knows the email server is down, knows they should fix
+it, closes a wallpaper ticket instead, and does so because the metric
+requires it, is experiencing the structural conditions for moral injury.
+
+### 8.5 Learned Helplessness and Metric Fatalism
+
+Seligman's learned helplessness [14, 15] describes how exposure to
+uncontrollable negative outcomes leads to passivity. The sequence:
+
+1. The metric is flawed (proof understood).
+2. Advocate for change.
+3. Rejected ("the numbers are good, don't rock the boat").
+4. Repeat with decreasing conviction.
+5. Terminal state: "The metric is what it is. I'll just close tickets."
+
+This is not laziness. It is the rational response to a system that
+punishes correct behavior and rewards incorrect behavior, when the
+individual lacks power to change the system.
+
+### 8.6 The Adversarial Selection Spiral
+
+Combining Section 7's equilibrium with the turnover dynamic:
+
+1. Organization adopts unweighted mean. Metric looks good (SPT).
+2. Aware, competent team members experience psychological costs (8.2–8.5).
+3. Those members leave. Replaced by members who do not understand the
+   metric's flaws or do not care.
+4. The metric continues to look good — it always does under SPT,
+   regardless of team competence (Corollary 6.1).
+5. Actual service quality degrades, but the metric cannot detect this
+   (Corollary 9.1).
+6. Return to step 1.
+
+The metric selects *against* the people who would improve the system and
+*for* the people who will not challenge it. The system stabilizes at a
+lower level of competence, invisible to its own measurement apparatus.
+
+### 8.7 The Complete Cost Model
+
+| Section 7 (visible) | Section 8 (hidden) |
+|---------------------|---------------------|
+| Client satisfied (good number) | Team dissatisfied (bad reality) |
+| Throughput unchanged | Discretionary effort withdrawn |
+| Metric improves | Competent members leave |
+| Business economy stable | Institutional competence degrades |
+
+These operate on different timescales: the equilibrium is visible
+quarterly; the competence degradation is visible over years. The complete
+model is: **the metric works, and it is destructive, and the destruction
+is invisible to the metric.** The metric is fresh paint on corroded rebar.
+
+---
+
+## 9. Manager Internalization: The Actionable Solution
+
+Sections 2–6 say reject the metric. Section 7 says the metric works
+(for the business). Section 8 says it destroys the team. In practice,
+most managers cannot unilaterally change the metric. The best solution is
+company-wide metric reform. The *actionable* solution is what a single
+informed manager can do right now.
+
+### 9.1 The Strategy
+
+A manager who understands the proof can **internalize the metric's
+limitations without propagating them to the team**:
+
+1. **Schedule primarily by priority.** The team works critical tasks first.
+2. **Tactically interleave small tasks.** When a small low-priority task
+   can be completed without materially delaying high-priority work, do it.
+   Not because the metric demands it, but because it also needs to get
+   done and costs almost nothing.
+3. **Never reveal the metric as the motivation.** "Knock out this quick
+   one while we wait for the vendor callback on the P1" — not "we need
+   to bring our average down." The team's intrinsic motivation remains
+   intact (Section 8). The manager absorbs the metric-management burden.
+
+### 9.2 Formalization
+
+The manager's problem is a constrained optimization:
+
+$$\min_{\sigma} \sum_{i=1}^{n} w(q_i) \cdot C_i \quad \text{subject to} \quad \bar{C}(\sigma) \le \bar{C}_{\text{target}}$$
+
+**Theorem 12 (Bounded Metric Cost of Priority Scheduling).** A manager
+who uses SPT *within* each priority class and priority ordering *between*
+classes will produce a metric close to the SPT-optimal value — the gap
+arises only from between-class inversions.
+
+**Proof sketch.** Within each priority class, SPT is free (all tasks have
+equal priority). The only deviation from global SPT is the between-class
+ordering. Each inverted pair (a larger task scheduled ahead of a smaller
+task from a lower-priority class) adds exactly
+$p_{\text{large}} - p_{\text{small}}$ to the unweighted sum, and the number
+of such pairs is bounded by the number of cross-class task pairs. In
+practice, the gap is typically within 10–20% of SPT-optimal. $\blacksquare$
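+
+The size of the gap can be written out explicitly. The following is a
+sketch under the single-machine model used throughout, where
+$\text{first}(i,j)$ denotes whichever of tasks $i$ and $j$ is scheduled
+earlier and $I(\sigma)$ is the set of pairs that $\sigma$ schedules
+larger-first (both labels are introduced here for illustration). Each
+task's completion time is its own processing time plus the processing
+times of every task ahead of it, so:
+
+$$\sum_{i=1}^{n} C_i(\sigma) = \sum_{i=1}^{n} p_i + \sum_{\{i,j\}} p_{\text{first}(i,j)} \quad\Longrightarrow\quad \sum_{i=1}^{n} C_i(\sigma) - \sum_{i=1}^{n} C_i(\text{SPT}) = \sum_{\{i,j\} \in I(\sigma)} \left( p_{\text{large}} - p_{\text{small}} \right)$$
+
+Under priority-first ordering with SPT inside each class, $I(\sigma)$
+contains only cross-class pairs. As a hypothetical two-task check: a
+4-hour P1 task and a 1-hour P3 task give completion times $(4, 5)$ under
+priority-first and $(1, 5)$ under SPT, so the sums are $9$ and $6$, a gap
+of $3 = 4 - 1$ from the single inversion.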
+
+### 9.3 The Manager as Information Barrier
+
+| Layer | Sees metric | Sees priorities | Sees proof |
+|-------|-----------|----------------|------------|
+| Organization | Yes | Nominally | No |
+| Manager | Yes | Yes | **Yes** |
+| Team | No (shielded) | Yes | Irrelevant |
+| Client | Yes (dashboard) | Via SLA | No |
+
+The manager is the only actor holding all three pieces of information.
+This is not manipulation — they are doing the right work in the right
+order, and the metric happens to be acceptable because within-class SPT
+is free.
+
+### 9.4 The Competitive Breakdown
+
+This strategy fails when the metric becomes **competitive between teams**.
+
+**Case 1: Cooperative** — Teams measured for parity, not ranking. Each
+manager independently uses the internalization strategy. The metric is
+decorative but harmless. This is a **coordination game** with a stable
+cooperative equilibrium.
+
+**Case 2: Competitive** — Teams ranked by $\bar{C}$. This is a
+**prisoner's dilemma**:
+
+| | Team B: Priority-first | Team B: SPT |
+|---|---|---|
+| **Team A: Priority-first** | (Good work, Good work) | (A looks bad, B looks good) |
+| **Team A: SPT** | (A looks good, B looks bad) | (Both look good, both do wrong work) |
+
+The Nash equilibrium is (SPT, SPT). The internalization strategy is a
+cooperative equilibrium that is **not stable under competition**.
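+
+The equilibrium claim can be checked with ordinal payoffs. As an
+assumption for illustration (the table above gives outcomes, not
+numbers), let $T$ be the payoff for looking good while the other team
+looks bad, $R$ for both teams doing good work, $P$ for both teams running
+SPT, and $S$ for looking bad while the other team looks good, with
+$T > R > P > S$. Then for Team A (and symmetrically for Team B):
+
+$$u_A(\text{SPT}, \text{Priority}) = T > R = u_A(\text{Priority}, \text{Priority}), \qquad u_A(\text{SPT}, \text{SPT}) = P > S = u_A(\text{Priority}, \text{SPT})$$
+
+SPT is the better reply to either choice by the other team, so it is a
+dominant strategy, and (SPT, SPT) is the unique equilibrium even though
+both teams would prefer $R$ to $P$.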
+
+### 9.5 Scope
+
+| Condition | Viability |
+|-----------|-----------|
+| Metric used for health-check / parity | **Viable** |
+| Metric visible but not ranked | **Viable** |
+| Metric ranked across teams | **Fragile** — requires all managers to cooperate |
+| Metric tied to compensation / resources | **Not viable** — prisoner's dilemma dominates |
+| Metric reform possible at org level | **Unnecessary** — fix the metric instead |
+
+**The best solution is company-wide. The actionable solution is a manager
+who understands this proof, shields their team from the metric, schedules
+by priority, and uses SPT only within priority classes to keep the number
+reasonable.**
+
+---
+
+# Part IV: Assessment
+
+## 10. Devil's Advocate
+
+Intellectual honesty requires acknowledging where the argument has limits.
+
+### 10.1 Simplicity Has Real Value
+
+**Argument.** The unweighted mean requires no priority weights, no
+task-size estimates, no calibration.
+
+**Assessment: True.** But the unweighted metric does not avoid assumptions
+— it *hides* them by implicitly setting all weights to 1 and all sizes to
+1. A known-imprecise estimate of task size is still more informative than
+the implicit assumption that all sizes are equal.
+
+### 10.2 Minimizing the Number of People Waiting
+
+**Argument.** SPT minimizes total person-hours spent waiting. If each
+task represents one client, this is optimal.
+
+**Assessment: Mathematically correct.** If you run a DMV and every
+person's time is equally valuable, SPT is the right policy. It breaks
+down when tasks are not 1:1 with clients, waiting cost is not uniform,
+or the metric is used to evaluate teams rather than serve a literal queue.
+
+### 10.3 SPT as a Triage Heuristic
+
+**Argument.** When task sizes cluster tightly, SPT approximates FIFO
+and the unweighted mean approximates the weighted mean.
+
+**Assessment: Correct.** The coefficient of variation $CV = \sigma_p / \bar{p}$ determines distortion severity:
+
+| $CV$ | Task size distribution | Distortion |
+|------|----------------------|------------|
+| < 0.3 | Tight (call center) | Negligible |
+| 0.3 – 1.0 | Moderate (mixed IT) | Moderate |
+| > 1.0 | Wide (typical IT queue) | Severe |
+
+A typical IT desk spans 15 minutes to 40+ hours ($CV > 2$). The
+distortion is not an edge case — it is the default.
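+
+A small worked case shows how quickly $CV$ grows. The numbers are
+hypothetical: a queue of ten tickets, nine taking 1 hour and one taking
+41 hours, with $\sigma_p$ computed as the population standard deviation:
+
+$$\bar{p} = \frac{9 \cdot 1 + 41}{10} = 5, \qquad \sigma_p = \sqrt{\frac{9 \cdot (1 - 5)^2 + (41 - 5)^2}{10}} = \sqrt{144} = 12, \qquad CV = \frac{12}{5} = 2.4$$
+
+A single long ticket among routine ones is enough to place this queue in
+the severe row of the table.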
+
+### 10.4 Gaming Requires Malice
+
+**Argument.** The theorems show the metric *can* be gamed, not that it
+*will* be gamed.
+
+**Assessment: This is the strongest counterargument.** If the metric is
+purely informational and never influences behavior, the gaming incentive
+is absent. However, any metric reported to management, tied to OKRs, or
+discussed in retrospectives will influence behavior. This is Goodhart's
+Law [6, 7] — and it applies to well-intentioned teams as reliably as to
+cynical ones. The drift happens organically: completing three easy tickets
+"feels productive" while the metric validates the feeling.
+
+### 10.5 When the Unweighted Mean Is Defensible
+
+The metric is defensible **only when all four conditions hold**:
+
+1. Task sizes are approximately uniform ($CV < 0.3$)
+2. No priority differentiation (all tasks equally important)
+3. Each task represents exactly one client
+4. The metric is not used to evaluate, reward, or direct behavior
+
+These conditions are rarely met in the systems where the metric is most
+commonly used.
+
+---
+
+## 11. Related Work
+
+This paper sits at the intersection of several literatures that have not
+previously been connected.
+
+### 11.1 Scheduling Theory and Fairness
+
+Smith [1] established the SPT optimality result and the WSJF rule in 1956.
+Conway, Maxwell, and Miller [2] provided the comprehensive textbook
+treatment. The fairness of size-based scheduling policies has been debated
+in computer systems scheduling: Bansal and Harchol-Balter [22] investigated
+SRPT unfairness; Wierman and Harchol-Balter [23] formalized fairness
+classifications against Processor-Sharing; Angel, Bampis, and Pascual [21]
+measured SPT schedule quality against fair optimality criteria.
+
+This prior work analyzes fairness in CPU and server scheduling. The present
+paper applies the same mathematical results to *organizational task
+management*, where the "scheduler" is a human team, the "jobs" are client
+requests with business-impact priorities, and the "objective function" is
+a management metric. The mechanism is identical; the consequences differ
+because organizational scheduling has priority systems, client
+relationships, and psychological costs that CPU scheduling does not.
+
+### 11.2 Measurement Dysfunction
+
+Austin [18] proved that incomplete measurement — measuring only a subset
+of relevant dimensions — creates incentives to optimize the measured
+dimensions at the expense of unmeasured ones, and that this effect is not
+merely possible but *inevitable* when measurement is tied to rewards. His
+information-asymmetry framing closely parallels Section 7. The present
+paper provides the specific mathematical mechanism (Theorems 1–2) for the
+case of task scheduling, and extends the argument through psychology
+(Section 8) to trace the complete chain of organizational harm.
+
+Muller [19] documented "metric fixation" across education, healthcare,
+policing, and finance, providing extensive empirical evidence for the
+patterns theorized in Section 7.4. Campbell [24] formalized the corrupting
+effect of using indicators as targets, complementing Goodhart's original
+observation [6] and Strathern's generalization [7].
+
+Bevan and Hood [26] empirically documented gaming behaviors in the English
+public health system — including the exact patterns of "hitting the target
+and missing the point" described in our Section 5.2.
+
+### 11.3 Psychological Costs of Metric Dysfunction
+
+The application of moral injury (Shay [16], Litz et al. [17]) to business
+settings has recent precedent: a 2024 *Journal of Business Ethics* study
+[25] explicitly extended the construct to for-profit workplaces, finding
+structural conditions similar to those described in Section 8.4. Moore
+et al. [27] analyzed moral *disengagement*: the cognitive restructuring
+that enables unethical behavior under organizational pressure. The present
+paper addresses the complementary phenomenon: the harm to individuals who
+*refuse* to disengage.
+
+### 11.4 What Is Novel
+
+The individual components — SPT optimality, Goodhart's Law, measurement
+dysfunction, moral injury — all have precedent. The contributions of this
+paper are:
+
+1. **The conservation law (Theorem 2) used prescriptively** — as a
+   constructive argument that work-weighted completion time *cannot* be
+   gamed, rather than as a theoretical scheduling result.
+
+2. **The specific proof that priority classes make the metric algebraically
+   adversarial** (Theorems 8–9) — not merely empirically bad but
+   structurally contradictory, with zero mutual information between the
+   schedule and the priority system.
+
+3. **The integrated chain** from mathematical proof through information
+   asymmetry through psychological harm through adversarial selection
+   spiral — tracing a single metric from Smith (1956) to organizational
+   hollowing.
+
+4. **The manager internalization strategy** (Section 9) with formal
+   game-theoretic analysis of its stability and breakdown conditions
+   under inter-team competition.
+
+5. **The application of scheduling theory to organizational management
+   critique** — proving that a commonly used team metric has specific,
+   quantifiable pathologies rather than arguing from anecdote or
+   general principle.
+
+---
+
+## 12. Conclusion
+
+The unweighted average completion time is a **biased statistic** that:
+
+1. **Can be gamed** by scheduling policy (Theorem 1), unlike work-weighted
+   completion time which is schedule-invariant (Theorem 2).
+2. **Incentivizes starvation** of large tasks (Theorem 3).
+3. **Degrades client satisfaction** with zero compensating productivity
+   gain (Theorem 7).
+4. **Actively contradicts priority systems** by carrying zero information
+   about business-impact classification (Theorem 9).
+5. **Ignores priority entirely** in its scheduling recommendation,
+   producing suboptimal priority-weighted delay whenever priority and
+   size are not perfectly inversely correlated (Theorem 10).
+
+A metric that can be improved by reordering work — without doing any
+additional work — is measuring the scheduling policy, not the system's
+capacity. When combined with a priority system, it recommends the schedule
+that inflicts the most damage on the highest-priority work.
+
+When the metric is reported to clients, it creates an information asymmetry
+(Section 7) whose business equilibrium is profitable but fragile. When
+team members understand its flaws, it violates their intrinsic motivation
+and selects for the departure of the most competent people (Section 8).
+A single informed manager can partially mitigate these effects through
+constrained optimization (Section 9), but this cooperative strategy is
+not stable under inter-team competition.
+
+The unweighted mean is defensible only under narrow conditions
+(Section 10.5): uniform task sizes, no priorities, one-to-one client-task
+mapping, and no behavioral influence. These conditions are rarely met.
+
+**Unweighted average completion time is not a fair or accurate measurement
+of task execution performance. Its adoption as a team metric will
+rationally produce starvation of complex work, violation of stated
+priorities, inequitable client outcomes, and the illusion of productivity
+where none exists.**
+
+The best solution is organizational metric reform. The actionable solution
+is a manager who understands this proof.
+
+---
+
+## References
+
+### Scheduling Theory
+
+[1] Smith, W. E. (1956). Various optimizers for single-stage production.
+*Naval Research Logistics Quarterly*, 3(1–2), 59–66.
+doi:[10.1002/nav.3800030106](https://doi.org/10.1002/nav.3800030106)
+
+> Origin of the SPT optimality result (Theorem 1), the weighted completion
+> time rule $w_i/p_i$ descending (WSJF, Theorem 11), and the adjacent-job
+> pairwise interchange (exchange argument) proof technique used throughout.
+
+[2] Conway, R. W., Maxwell, W. L., & Miller, L. W. (1967). *Theory of
+Scheduling*. Addison-Wesley.
+
+> Standard textbook treatment of single-machine scheduling theory,
+> extending Smith's results.
+
+[3] Little, J. D. C. (1961). A proof for the queuing formula: L = λW.
+*Operations Research*, 9(3), 383–387.
+doi:[10.1287/opre.9.3.383](https://doi.org/10.1287/opre.9.3.383)
+
+> First rigorous proof of Little's Law. Referenced in Section 3.2 for
+> queueing-theoretic context.
+
+[4] Little, J. D. C. (2011). Little's Law as viewed on its 50th
+anniversary. *Operations Research*, 59(3), 536–549.
+doi:[10.1287/opre.1110.0941](https://doi.org/10.1287/opre.1110.0941)
+
+> Retrospective discussing scope, limitations, and common misapplications.
+
+[5] Reinertsen, D. G. (2009). *The Principles of Product Development
+Flow: Second Generation Lean Product Development*. Celeritas Publishing.
+ISBN: 978-0-9844512-0-8.
+
+> Popularized WSJF and "Cost of Delay / Duration" in agile/lean contexts.
+> Mathematical foundation is Smith (1956) [1].
+
+### Measurement and Incentives
+
+[6] Goodhart, C. A. E. (1984). Problems of monetary management: The U.K.
+experience. In *Monetary Theory and Practice* (pp. 91–121). Macmillan.
+
+> Source of Goodhart's Law: "Any observed statistical regularity will tend
+> to collapse once pressure is placed upon it for control purposes."
+
+[7] Strathern, M. (1997). 'Improving ratings': Audit in the British
+university system. *European Review*, 5(3), 305–321.
+doi:[10.1002/(SICI)1234-981X(199707)5:3<305::AID-EURO184>3.0.CO;2-4](https://doi.org/10.1002/(SICI)1234-981X(199707)5:3%3C305::AID-EURO184%3E3.0.CO;2-4)
+
+> Generalized Goodhart's Law: "When a measure becomes a target, it ceases
+> to be a good measure."
+
+### Behavioral Economics
+
+[8] Kahneman, D., & Tversky, A. (1979). Prospect theory: An analysis of
+decision under risk. *Econometrica*, 47(2), 263–292.
+doi:[10.2307/1914185](https://doi.org/10.2307/1914185)
+
+> Established loss aversion. Referenced in Section 4.5.
+
+### Game Theory and Contract Theory
+
+[9] Akerlof, G. A. (1970). The market for "lemons": Quality uncertainty
+and the market mechanism. *The Quarterly Journal of Economics*, 84(3),
+488–500. doi:[10.2307/1879431](https://doi.org/10.2307/1879431)
+
+> Information asymmetry and adverse selection. The pooling equilibrium in
+> Section 7.5 is structurally analogous.
+
+[10] Hölmstrom, B. (1979). Moral hazard and observability. *The Bell
+Journal of Economics*, 10(1), 74–91.
+doi:[10.2307/3003320](https://doi.org/10.2307/3003320)
+
+> Formal treatment of moral hazard. The metric-reporting scenario in
+> Section 7.5 is a moral hazard problem.
+
+### Psychology
+
+[11] Festinger, L. (1957). *A Theory of Cognitive Dissonance*. Stanford
+University Press. ISBN: 978-0-8047-0131-0.
+
+> Foundational theory. Referenced in Section 8.2.
+
+[12] Deci, E. L., & Ryan, R. M. (1985). *Intrinsic Motivation and
+Self-Determination in Human Behavior*. Plenum Press.
+ISBN: 978-0-306-42022-1.
+
+> Original treatment of Self-Determination Theory. Referenced in
+> Section 8.3.
+
+[13] Ryan, R. M., & Deci, E. L. (2000). Self-determination theory and
+the facilitation of intrinsic motivation, social development, and
+well-being. *American Psychologist*, 55(1), 68–78.
+doi:[10.1037/0003-066X.55.1.68](https://doi.org/10.1037/0003-066X.55.1.68)
+
+> SDT overview linking need satisfaction to intrinsic motivation and
+> well-being.
+
+[14] Seligman, M. E. P., & Maier, S. F. (1967). Failure to escape
+traumatic shock. *Journal of Experimental Psychology*, 74(1), 1–9.
+doi:[10.1037/h0024514](https://doi.org/10.1037/h0024514)
+
+> Original demonstration of learned helplessness. Referenced in
+> Section 8.5.
+
+[15] Seligman, M. E. P. (1975). *Helplessness: On Depression,
+Development, and Death*. W. H. Freeman. ISBN: 978-0-7167-0752-3.
+
+> Extended treatment connecting learned helplessness to human depression
+> and institutional behavior.
+
+[16] Shay, J. (1994). *Achilles in Vietnam: Combat Trauma and the Undoing
+of Character*. Atheneum / Simon & Schuster. ISBN: 978-0-689-12182-3.
+
+> Introduced the concept of moral injury. Referenced in Section 8.4.
+
+[17] Litz, B. T., Stein, N., Delaney, E., Lebowitz, L., Nash, W. P.,
+Silva, C., & Maguen, S. (2009). Moral injury and moral repair in war
+veterans: A preliminary model and intervention strategy. *Clinical
+Psychology Review*, 29(8), 695–706.
+doi:[10.1016/j.cpr.2009.07.003](https://doi.org/10.1016/j.cpr.2009.07.003)
+
+> Formalized moral injury as a clinical construct. Definition quoted in
+> Section 8.4.
+
+### Organizational Measurement
+
+[18] Austin, R. D. (1996). *Measuring and Managing Performance in
+Organizations*. Dorset House. ISBN: 978-0-932633-36-1.
+
+> Proved that incomplete measurement creates inevitable incentives to
+> optimize measured dimensions at the expense of unmeasured ones. The
+> information-asymmetry framing closely parallels Section 7. The single
+> most important predecessor to this paper's argument.
+
+[19] Muller, J. Z. (2018). *The Tyranny of Metrics*. Princeton University
+Press. ISBN: 978-0-691-17495-2.
+
+> Comprehensive treatment of "metric fixation" across education,
+> healthcare, policing, and finance. Extensive empirical evidence for the
+> patterns theorized in Section 7.4.
+
+### Scheduling Fairness
+
+[20] Shanthikumar, J. G., & Yao, D. D. (1992). Multiclass queueing
+systems: Polymatroidal structure and optimal scheduling control.
+*Operations Research*, 40(S2), S293–S299.
+
+> Conservation laws in scheduling. The schedule-invariance of
+> work-weighted completion time (Theorem 2) is an instance of these
+> conservation laws.
+
+[21] Angel, E., Bampis, E., & Pascual, F. (2008). How good are SPT
+schedules for fair optimality criteria? *Annals of Operations Research*,
+159(1), 53–64. doi:[10.1007/s10479-007-0267-0](https://doi.org/10.1007/s10479-007-0267-0)
+
+> Directly measures SPT schedule quality against fairness criteria.
+> Closest predecessor in scheduling theory to Section 4's fairness
+> analysis.
+
+[22] Bansal, N., & Harchol-Balter, M. (2001). Analysis of SRPT
+scheduling: Investigating unfairness. *ACM SIGMETRICS Performance
+Evaluation Review*, 29(1), 279–290.
+doi:[10.1145/384268.378792](https://doi.org/10.1145/384268.378792)
+
+> Investigates the belief that SRPT unfairly penalizes large jobs in
+> computer scheduling. Argues unfairness is smaller than believed but
+> acknowledges the core tension.
+
+[23] Wierman, A., & Harchol-Balter, M. (2003). Classifying scheduling
+policies with respect to unfairness in an M/GI/1. *ACM SIGMETRICS
+Performance Evaluation Review*, 31(1), 238–249.
+
+> Formalizes fairness definitions for scheduling policies by comparison
+> to Processor-Sharing.
+
+### Additional References
+
+[24] Campbell, D. T. (1979). Assessing the impact of planned social
+change. *Evaluation and Program Planning*, 2(1), 67–90.
+doi:[10.1016/0149-7189(79)90048-X](https://doi.org/10.1016/0149-7189(79)90048-X)
+
+> Campbell's Law: "The more any quantitative social indicator is used for
+> social decision-making, the more subject it will be to corruption
+> pressures and the more apt it will be to distort and corrupt the social
+> processes it is intended to monitor." Complements Goodhart's Law [6].
+
+[25] Ferreira, C. M., et al. (2024). It's business: A qualitative study
+of moral injury in business settings. *Journal of Business Ethics*.
+doi:[10.1007/s10551-024-05615-0](https://doi.org/10.1007/s10551-024-05615-0)
+
+> Extends moral injury to for-profit workplaces. Validates Section 8.4's
+> application of Shay/Litz beyond military and healthcare settings.
+
+[26] Bevan, G., & Hood, C. (2006). What's measured is what matters:
+Targets and gaming in the English public health care system. *Public
+Administration*, 84(3), 517–538.
+doi:[10.1111/j.1467-9299.2006.00600.x](https://doi.org/10.1111/j.1467-9299.2006.00600.x)
+
+> Empirically documents gaming behaviors including "hitting the target
+> and missing the point." Provides real-world evidence for Section 5.2's
+> priority-metric contradiction.
+
+[27] Moore, C., Detert, J. R., Treviño, L. K., Baker, V. L., & Mayer,
+D. M. (2012). Why employees do bad things: Moral disengagement and
+unethical organizational behavior. *Personnel Psychology*, 65(1), 1–48.
+doi:[10.1111/j.1744-6570.2011.01237.x](https://doi.org/10.1111/j.1744-6570.2011.01237.x)
+
+> Analyzes moral *disengagement* — the cognitive restructuring enabling
+> unethical behavior. Section 8 addresses the complementary phenomenon:
+> harm to individuals who *refuse* to disengage.
+
+---
+
+*This proof was developed conversationally and formalized on 2026-03-28.*
diff --git a/README.md b/README.md
index ba10bcf..9d7d4a6 100644
--- a/README.md
+++ b/README.md
@@ -255,9 +255,63 @@ $\blacksquare$
 will observe an improvement in unweighted mean completion time with **zero
 change in actual throughput**. The metric improves. The output does not.
 
-### 4.5 The Compound Effect
+### 4.5 The Aged-Task Abandonment Incentive
 
-Combining Theorems 4, 5, and 6:
+Theorems 3–5 show that SPT deprioritizes large tasks. But the metric
+creates a second, more destructive incentive: **completing old tasks is
+actively punished**.
+
+**Theorem 6.1 (Aged-Task Penalty).** Completing a single task with
+completion time $C_{\text{old}} > 1$ yields a higher running mean than
+completing $C_{\text{old}}$ tasks with completion time 1 each.
+
+**Proof.** Let the team have completed $m$ tasks with running sum
+$S = \sum_{i=1}^{m} C_i$ and running mean $\bar{C} = S/m$.
+
+**Case 1:** Complete one task with completion time $C_{\text{old}}$:
+
+$$\bar{C}_1 = \frac{S + C_{\text{old}}}{m + 1}$$
+
+**Case 2:** Complete $C_{\text{old}}$ tasks each with completion time 1:
+
+$$\bar{C}_2 = \frac{S + C_{\text{old}}}{m + C_{\text{old}}}$$
+
+Both cases add the same value ($C_{\text{old}}$) to the numerator. But
+Case 2 adds $C_{\text{old}}$ completions to the denominator, while Case 1
+adds only 1. Therefore:
+
+$$\bar{C}_1 - \bar{C}_2 = \frac{S + C_{\text{old}}}{m + 1} - \frac{S + C_{\text{old}}}{m + C_{\text{old}}} = (S + C_{\text{old}}) \cdot \frac{C_{\text{old}} - 1}{(m+1)(m + C_{\text{old}})}$$
+
+For $C_{\text{old}} > 1$, this difference is strictly positive: the old
+task produces a **worse average** than the equivalent volume of fresh
+work. $\blacksquare$
+
+**Example.** A team has completed 100 tasks with a running mean of 2 days
+($S = 200$). They can either:
+
+- Complete one 26-day-old task: $\bar{C} = 226/101 = 2.24$ days
+- Complete 26 tasks at 1 day each: $\bar{C} = 226/126 = 1.79$ days
+
+Both options resolve the same 26 days of total wait, yet the metric
+scores the second option as better (1.79 vs 2.24 days).
+
+**Corollary 6.2 (Abandonment Incentive).** Under the unweighted mean,
+the rational response to an aged task is not to deprioritize it (SPT,
+Theorem 3) but to **remove it from the system entirely** — close it as
+"won't fix," transfer it to another team, or let it expire. This removes
+the task from both numerator and denominator, protecting the average.
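+
+The incentive can be read off the numbers in the example above. This is
+an illustration using those same values ($m = 100$, $S = 200$,
+$C_{\text{old}} = 26$), not an additional result; the label
+$\bar{C}_{\text{abandon}}$ is introduced here for the mean after the task
+is removed without being counted:
+
+$$\bar{C}_{\text{abandon}} = \frac{S}{m} = \frac{200}{100} = 2.00 \quad < \quad \bar{C}_1 = \frac{S + C_{\text{old}}}{m + 1} = \frac{226}{101} \approx 2.24$$
+
+In general $\frac{S}{m} < \frac{S + C_{\text{old}}}{m + 1}$ exactly when
+$C_{\text{old}} > \bar{C}$, so abandonment protects the average for any
+task older than the current mean.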
+
+This goes beyond starvation. Theorems 3–5 prove that the metric
+*delays* large and old tasks. Theorem 6.1 proves that the metric
+*punishes completion of them* — meaning the incentive is not merely to
+defer but to abandon. A metric that penalizes resolving the hardest
+problems is not measuring performance; it is measuring avoidance.
+
+---
+
+### 4.6 The Compound Effect
+
+Combining Theorems 4, 5, 6, and 6.1:
 
 | Measure | Effect of optimizing unweighted mean |
 |---------|--------------------------------------|
@@ -265,6 +319,7 @@ Combining Theorems 4, 5, and 6:
 | Delay for small tasks | Minimized — approaches zero (SPT) |
 | Delay for large tasks | **Maximized** — bears all queuing burden (Theorem 5) |
 | Completion time of largest task | **Maximum possible**: $\sum p_i$ (Theorem 4) |
+| Incentive for aged tasks | **Abandon rather than complete** (Theorem 6.1) |
 
 The net effect on perceived quality is negative because:
 
@@ -996,11 +1051,13 @@ The unweighted average completion time is a **biased statistic** that:
 1. **Can be gamed** by scheduling policy (Theorem 1), unlike work-weighted
    completion time which is schedule-invariant (Theorem 2).
 2. **Incentivizes starvation** of large tasks (Theorem 3).
-3. **Degrades client satisfaction** with zero compensating productivity
+3. **Punishes completion of aged tasks**, incentivizing abandonment
+   over resolution (Theorem 6.1).
+4. **Degrades client satisfaction** with zero compensating productivity
    gain (Theorem 7).
-4. **Actively contradicts priority systems** by carrying zero information
+5. **Actively contradicts priority systems** by carrying zero information
    about business-impact classification (Theorem 9).
-5. **Ignores priority entirely** in its scheduling recommendation,
+6. **Ignores priority entirely** in its scheduling recommendation,
    producing suboptimal priority-weighted delay whenever priority and
    size are not perfectly inversely correlated (Theorem 10).