From 574eca5b277e1f1c6209391218a7bd41b82caeef Mon Sep 17 00:00:00 2001 From: Mortdecai Date: Sat, 28 Mar 2026 17:18:31 -0400 Subject: [PATCH] Fix mathematical errors in Theorems 4, 5, 10 and IT example MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Corrections: - Theorem 4: Restated from "maximizes slowdown inequality" (wrong) to "uniquely assigns max completion time to largest task" (correct). SPT actually compresses slowdown variance; harm is in absolute delay. - Theorem 5: Completely rewritten. Old claim that LPT minimizes slowdown variance was backwards (verified: tasks [1,5,10] give SPT var=0.06, LPT var=44.2). New theorem correctly states SPT concentrates absolute delay on the largest task. - Theorem 10: Removed draft language ("Wait —"), corrected cross-term analysis. Old claim that SPT is Pareto-dominated when p_H > 8p_L was wrong (verified: n_H=2,n_L=2,p_H=10,p_L=1 gives D_SPT=275 < D_pri=283). Replaced with correct WSJF exchange argument. - IT example: Fixed PWCT arithmetic (9.225→10.2, 6.633→10.167). Added honest discussion that aggregate PWCT fails to distinguish schedules; per-priority-class metrics are needed. - Section 5: Added caveat that Little's Law batch-case application is not straightforward; clarified what Theorem 2 actually proves. Co-Authored-By: Claude Opus 4.6 (1M context) --- README.md | 255 ++++++++++++++++++++++++++++-------------------------- 1 file changed, 133 insertions(+), 122 deletions(-) diff --git a/README.md b/README.md index b8e9864..0cb47d3 100644 --- a/README.md +++ b/README.md @@ -135,17 +135,23 @@ a 10-hour task. ## 5. Connection to Little's Law -Little's Law states $L = \lambda W$, where $L$ is the average number of tasks -in the system, $\lambda$ is the arrival rate, and $W$ is the average time a -task spends in the system. 
+Little's Law states $L = \lambda W$, where $L$ is the time-averaged number +of tasks in the system, $\lambda$ is the arrival rate, and $W$ is the +average time a task spends in the system. -For a stable system, $L$ and $\lambda$ are determined by arrival and service -rates — not by scheduling policy. Therefore $W = L / \lambda$ is -**schedule-invariant** when measured correctly (i.e., weighted by the quantity -being served). +In a *steady-state* queueing system with fixed arrival and service rates, +$\lambda$ and the long-run service rate are determined by the workload, not +by scheduling policy. Little's Law then tells us that $L$ and $W$ are +linked, but in the batch case (all $n$ tasks present at time 0), $L$ and +$W$ are both schedule-dependent: $\bar{C} = W$, and +$L = \sum C_i / \sum p_i$, both of which SPT minimizes. -SPT appears to violate this only because the unweighted statistic counts -*completions* rather than *work*, systematically underweighting large tasks. +The invariance we proved in Theorem 2 is more specific: *work-weighted* +mean completion time $\bar{C}_w$ is constant across schedules. This +corresponds to measuring the system from the perspective of "how long does +a unit of *work* wait" rather than "how long does a *task* wait." The +unweighted statistic measures the latter and is gameable precisely because +it counts completions rather than work. --- @@ -196,44 +202,34 @@ Client satisfaction is inversely related to slowdown: a client who waits 2x their task size is more satisfied than one who waits 20x, regardless of the absolute times involved. -**Theorem 4 (SPT Maximizes Slowdown Inequality).** Among all schedules, -SPT maximizes the difference between the maximum and minimum slowdown ratios. +**Theorem 4 (SPT Uniquely Maximizes Completion Time of the Largest Task).** +Among all schedules, SPT is the unique policy that assigns the maximum +possible completion time ($\sum p_i$) to the largest task. 
**Proof.** -Under any schedule $\sigma$, the task in position $k$ has completion time -$C_{\sigma(k)} = \sum_{j=1}^{k} p_{\sigma(j)}$ and slowdown: +SPT sorts tasks in ascending order of $p_i$, placing the largest task +$p_{\max}$ in the last position. The last task in any schedule has +completion time $\sum_{i=1}^{n} p_i$, which is the maximum completion time +any individual task can receive. Therefore, under SPT: -$$S_{\sigma(k)} = \frac{\sum_{j=1}^{k} p_{\sigma(j)}}{p_{\sigma(k)}}$$ +$$C_{\max\text{-task}}^{\text{SPT}} = \sum_{i=1}^{n} p_i$$ -Under SPT, the last task (position $n$) is the largest, $p_{\max}$, with: +Under any schedule that does not place $p_{\max}$ last, the largest task +completes strictly before $\sum p_i$. SPT is the unique schedule (among +those ordered by processing time) that assigns this worst-case completion +time to the largest task. -$$S_n^{\text{SPT}} = \frac{\sum_{i=1}^{n} p_i}{p_{\max}}$$ +Note on slowdown: SPT actually *compresses* slowdown ratios ($S_i = C_i / p_i$) +because larger tasks in later positions have large denominators that absorb +the accumulated sum. For example, with tasks $[1, 5, 10]$: -The first task is the smallest, $p_{\min}$, with: +- SPT: slowdowns $[1, 1.2, 1.6]$ — low variance +- LPT: slowdowns $[1, 3, 16]$ — high variance -$$S_1^{\text{SPT}} = \frac{p_{\min}}{p_{\min}} = 1$$ - -The slowdown range under SPT is: - -$$\Delta S^{\text{SPT}} = \frac{\sum p_i}{p_{\max}} - 1$$ - -Now consider the reverse schedule (Longest Processing Time first, LPT). -The largest task goes first with slowdown 1. The smallest task goes last: - -$$S_n^{\text{LPT}} = \frac{\sum p_i}{p_{\min}}, \quad S_1^{\text{LPT}} = 1$$ - -While LPT has a larger maximum slowdown, its minimum is also 1. The critical -difference is *which clients* suffer. Under SPT, the client with the -**largest task** — typically the most complex, highest-stakes, or most -commercially significant request — receives the worst experience. 
Under LPT, -the client with the smallest task suffers most, but their absolute wait is -bounded by $\sum p_i$, the same total for both schedules. - -More precisely: under SPT, the client with the largest task has completion -time $\sum p_i$ (the maximum possible), while under any other schedule, that -client finishes strictly earlier. SPT **uniquely minimizes the satisfaction -of the highest-effort client**. $\blacksquare$ +SPT's harm to large-task clients is not visible in the slowdown ratio. It is +visible in **absolute completion time**: the largest task finishes last, at +$\sum p_i$, while under any other ordering it finishes earlier. $\blacksquare$ **Corollary 4.1.** A team optimizing unweighted mean completion time will systematically deliver the worst experience to clients with the most @@ -244,42 +240,40 @@ The only way to lower the unweighted average is to complete more small tasks early, which necessarily means completing large tasks later. The metric improves *because* high-effort clients are deprioritized. -### 7.2 The Fairness Benchmark: Proportional Slowdown +### 7.2 The Absolute Delay Burden -A **fair** schedule is one where all clients experience equal slowdown: +The slowdown ratio $S_i = C_i / p_i$ might suggest SPT is *fair* — it +compresses slowdown variance by giving everyone a ratio close to 1. But +this obscures the real cost. The correct measure of burden is the +**absolute delay** experienced by each task: -$$S_i = S_j \quad \forall \, i, j$$ +$$\Delta_i = C_i - p_i$$ -This means every client waits the same multiple of their task's inherent -processing time. A 1-hour task might wait 2 hours; a 10-hour task waits 20 -hours. The ratio is the same. +This is the time a task spends waiting for other tasks, independent of its +own size. Under any sequential schedule, the total delay across all tasks +is schedule-dependent (it equals $\sum C_i - \sum p_i$), and SPT minimizes +this total. But the *distribution* of delay matters. 
-**Theorem 5 (Proportional Scheduling).** The unique schedule achieving equal -slowdown for all tasks is to order tasks so that each task's completion time -is proportional to its processing time: +**Theorem 5 (SPT Concentrates Delay on the Largest Task).** Under SPT, the +largest task bears more absolute delay than under any other schedule. -$$C_i = S \cdot p_i \quad \text{where } S = \frac{\sum p_i}{\sum p_i} \cdot \frac{\sum_{j} p_j}{p_i} \text{ ... }$$ +**Proof.** Under SPT, the largest task is in position $n$ with: -In general, equal slowdown is not achievable with sequential scheduling -(it requires parallel or proportional-share scheduling). However, the -schedule that **minimizes slowdown variance** among sequential schedules is -**Longest Processing Time first (LPT)** — the exact opposite of SPT. +$$\Delta_{\max\text{-task}}^{\text{SPT}} = C_n - p_n = \sum_{i=1}^{n-1} p_i$$ -**Proof sketch.** Under LPT, large tasks go first and receive slowdown -close to 1. Small tasks go last and accumulate more slowdown, but their -absolute wait is still bounded. The variance in slowdown ratios is minimized -because the tasks with the largest denominator ($p_i$) also have the -largest numerator ($C_i$), keeping the ratios compressed. +This is the sum of all other tasks' processing times — the maximum possible +delay for any single task. Under any schedule where the largest task is not +last, its delay is strictly less than $\sum_{i \ne \max} p_i$. -Under SPT, the opposite occurs: tasks with the smallest denominator get the -smallest numerator, and tasks with the largest denominator get the largest -numerator, maximizing the spread. +Meanwhile, SPT gives the smallest task zero delay ($\Delta_1^{\text{SPT}} = 0$). +The entire queuing burden is shifted from small tasks to large tasks. 
+$\blacksquare$ -Formally, for any two schedules $\sigma_1$ (SPT) and $\sigma_2$ (LPT): - -$$\text{Var}(S^{\text{SPT}}) \ge \text{Var}(S^{\text{LPT}})$$ - -with equality only when all $p_i$ are equal. $\blacksquare$ +The tension is this: SPT minimizes total delay (good for aggregate +efficiency) by concentrating delay onto the tasks best able to "absorb" it +in slowdown-ratio terms. But in absolute terms — hours spent waiting — the +largest task bears the full weight. If that task represents a critical +business need, the absolute delay, not the ratio, determines the damage. ### 7.3 Productivity Is Not Improved @@ -318,9 +312,9 @@ Combining Theorems 4, 5, and 6: | Measure | Effect of optimizing unweighted mean | |---------|--------------------------------------| | Throughput (work/time) | No change (Theorem 6) | -| Client satisfaction for small tasks | Improves | -| Client satisfaction for large tasks | **Worsens maximally** (Theorem 4) | -| Satisfaction equity across clients | **Worsens maximally** (Theorem 5) | +| Delay for small tasks | Minimized — approaches zero (SPT) | +| Delay for large tasks | **Maximized** — bears all queuing burden (Theorem 5) | +| Completion time of largest task | **Maximum possible**: $\sum p_i$ (Theorem 4) | | Overall perceived quality of service | **Net negative** (see below) | The net effect on perceived quality is negative because: @@ -346,10 +340,10 @@ The net effect on perceived quality is negative because: size, adopting unweighted mean completion time as a performance metric: (a) Provides **zero productivity gain** (Theorem 6), while -(b) **Maximally degrading satisfaction** for clients with the largest tasks +(b) **Assigning the maximum possible completion time** to the largest task (Theorem 4), and -(c) **Maximally increasing inequality** in service quality across clients - (Theorem 5). +(c) **Concentrating all queuing delay** onto the largest tasks while + eliminating delay for the smallest (Theorem 5). 
This is not a tradeoff — there is no compensating benefit on the productivity side. The metric creates a pure transfer of service quality from high-effort @@ -475,54 +469,48 @@ $$D(\sigma) = \sum_{i=1}^{n} w(q_i) \cdot C_i$$ This measures the total business-impact-weighted time spent waiting. -**Theorem 10 (SPT Maximizes Priority-Weighted Delay in the Worst Case).** -Among all schedules, SPT produces the highest priority-weighted delay cost -when high-priority tasks are large and low-priority tasks are small. +**Theorem 10 (SPT and Priority-Weighted Delay Cost).** +The optimal schedule for minimizing priority-weighted delay cost $D(\sigma)$ +is WSJF: order by $w(q_i)/p_i$ descending. SPT's ordering — by $1/p_i$ +descending — ignores priority entirely and produces higher $D$ than +priority-respecting alternatives when priority is correlated with task size. -**Proof.** Consider the worst case: all Critical ($q = 1$) tasks have -processing time $p_H$ and all Low ($q = 4$) tasks have processing time -$p_L$, with $p_H > p_L$. Let there be $n_H$ critical tasks and $n_L$ low -tasks, $n = n_H + n_L$. +**Proof.** By the standard exchange argument (as in Theorem 1), swapping +adjacent tasks $i, j$ in a schedule changes $D$ by: -SPT places all $n_L$ low tasks first, then all $n_H$ critical tasks. +$$\Delta D = w(q_j) \cdot p_i - w(q_i) \cdot p_j$$ -The priority-weighted delay cost under SPT: +The swap improves $D$ when $\Delta D > 0$, i.e., when $w(q_j)/p_j > w(q_i)/p_i$ +but $j$ is scheduled after $i$. Therefore the optimal order is decreasing +$w(q_i)/p_i$ — this is the WSJF rule. -$$D_{\text{SPT}} = w(4) \sum_{k=1}^{n_L} k \cdot p_L + w(1) \sum_{k=1}^{n_H} (n_L \cdot p_L + k \cdot p_H)$$ +SPT orders by $p_i$ ascending (equivalently, $1/p_i$ descending), which +corresponds to WSJF only when $w(q_i) = \text{const}$ — i.e., when all +tasks have equal priority. 
-$$= 1 \cdot \frac{n_L(n_L+1)}{2} p_L + 8 \left( n_H \cdot n_L \cdot p_L + \frac{n_H(n_H+1)}{2} p_H \right)$$ +**Example.** Two tasks: Critical ($w = 8$, $p_H = 10$) and Low ($w = 1$, $p_L = 1$). -Under priority-first scheduling (all Critical tasks first): +WSJF scores: Critical = $8/10 = 0.8$, Low = $1/1 = 1.0$. -$$D_{\text{priority}} = w(1) \sum_{k=1}^{n_H} k \cdot p_H + w(4) \sum_{k=1}^{n_L} (n_H \cdot p_H + k \cdot p_L)$$ +WSJF places the Low task first (higher $w/p$), same as SPT. Here, SPT and +WSJF agree because the Low task's tiny size dominates despite its low weight. -$$= 8 \cdot \frac{n_H(n_H+1)}{2} p_H + 1 \cdot \left( n_L \cdot n_H \cdot p_H + \frac{n_L(n_L+1)}{2} p_L \right)$$ +Now consider: Critical ($w = 8$, $p_H = 3$) and Low ($w = 1$, $p_L = 2$). -The difference $D_{\text{SPT}} - D_{\text{priority}}$ simplifies. The critical -cross-terms are: +WSJF scores: Critical = $8/3 = 2.67$, Low = $1/2 = 0.5$. -- SPT charges $8 \cdot n_H \cdot n_L \cdot p_L$ for Critical tasks waiting - behind Low tasks. -- Priority charges $1 \cdot n_L \cdot n_H \cdot p_H$ for Low tasks waiting - behind Critical tasks. +WSJF places Critical first. SPT places Low first (smaller $p$). The costs: -Since $w(1) = 8$ and $w(4) = 1$: +- SPT (Low first): $D = 1 \cdot 2 + 8 \cdot 5 = 42$ +- WSJF (Critical first): $D = 8 \cdot 3 + 1 \cdot 5 = 29$ -$$D_{\text{SPT}} - D_{\text{priority}} = n_H \cdot n_L \cdot (8 p_L - p_H) + n_H \cdot n_L \cdot (p_H - 8 p_L)$$ +SPT incurs 45% more priority-weighted delay because it ignores the 8x +priority weight of the Critical task. -Wait — let me compute this more carefully. The cross-term in SPT is the -cost of all Critical tasks being delayed by all Low tasks: - -$$\Delta_{\text{cross}} = w(1) \cdot n_H \cdot n_L \cdot p_L - w(4) \cdot n_L \cdot n_H \cdot p_H$$ -$$= n_H \cdot n_L \cdot (8 p_L - p_H)$$ - -When $p_H > 8 p_L$, the priority-first schedule wins on *both* the -priority-weighted metric and unweighted metric — SPT is Pareto-dominated. 
-When $p_L < p_H \le 8 p_L$, SPT wins on the unweighted metric but loses -on the priority-weighted metric. In either case: - -**The unweighted metric recommends the schedule that inflicts the most -business-impact-weighted delay whenever large tasks are high-priority.** $\blacksquare$ +In general, SPT diverges from WSJF — and produces suboptimal $D$ — whenever +priority and task size are not perfectly inversely correlated. In practice, +Critical tasks tend to be larger (outages, security incidents), making the +divergence systematic rather than occasional. $\blacksquare$ --- @@ -633,7 +621,7 @@ Consider an IT team with the following ticket queue on a Monday morning: | 8 | T1 (email) | P1 Crit | 6 | 18.75 | 3.125 | - **Unweighted mean completion:** $(0.25 + 0.75 + 1.75 + 3.75 + 5.75 + 8.75 + 12.75 + 18.75) / 8 = 6.5625$ hours -- **PWCT:** $(1 \cdot 0.25 + 1 \cdot 0.75 + 2 \cdot 1.75 + 2 \cdot 3.75 + 4 \cdot 5.75 + 8 \cdot 8.75 + 4 \cdot 12.75 + 8 \cdot 18.75) / 30 = 9.225$ hours +- **PWCT:** $(1 \cdot 0.25 + 1 \cdot 0.75 + 2 \cdot 1.75 + 2 \cdot 3.75 + 4 \cdot 5.75 + 8 \cdot 8.75 + 4 \cdot 12.75 + 8 \cdot 18.75) / 30 = 306/30 = 10.2$ hours - Email server is down for **18.75 hours**. Database backups fail for **8.75 hours**. **WSJF order (optimizing PWCT by $w(q)/p$ descending):** @@ -669,7 +657,7 @@ priority class ordering and only applying WSJF *within* priority classes. | 8 | T4 (wallpaper) | P4 Low | 0.5 | 18.75 | - **Unweighted mean completion:** $(3 + 9 + 11 + 15 + 16 + 18 + 18.25 + 18.75) / 8 = 13.625$ hours -- **PWCT:** $(8 \cdot 3 + 8 \cdot 9 + 4 \cdot 11 + 4 \cdot 15 + 2 \cdot 16 + 2 \cdot 18 + 1 \cdot 18.25 + 1 \cdot 18.75) / 30 = 6.633$ hours +- **PWCT:** $(8 \cdot 3 + 8 \cdot 9 + 4 \cdot 11 + 4 \cdot 15 + 2 \cdot 16 + 2 \cdot 18 + 1 \cdot 18.25 + 1 \cdot 18.75) / 30 = 305/30 = 10.167$ hours - Email server restored in **9 hours**. Backups fixed in **3 hours**. 
### Comparison @@ -677,32 +665,54 @@ priority class ordering and only applying WSJF *within* priority classes. | Metric | SPT | Practical WSJF | Winner | |--------|-----|----------------|--------| | Unweighted mean completion | **6.5625 hrs** | 13.625 hrs | SPT | -| Priority-weighted completion (PWCT) | 9.225 hrs | **6.633 hrs** | WSJF | +| Priority-weighted completion (PWCT) | 10.2 hrs | **10.167 hrs** | WSJF | | Time to fix email server | 18.75 hrs | **9 hrs** | WSJF | | Time to fix database backups | 8.75 hrs | **3 hrs** | WSJF | | Time to fix printers | 5.75 hrs | **11 hrs** | SPT | | Time to update wallpaper | **0.75 hrs** | 18.75 hrs | SPT | -SPT wins the unweighted metric by completing wallpaper policies and folder -archives first. WSJF wins every metric that accounts for business impact. +The PWCT values are nearly identical (10.2 vs 10.167) because PWCT — as a +*weighted average of completion times* — is dampened by the fact that total +work is constant. **PWCT is not the right metric for this comparison.** The +real difference is visible in the individual completion times of critical +tasks: the email server is down for 18.75 hours under SPT versus 9 hours +under WSJF. The database backups fail for 8.75 hours versus 3 hours. -The unweighted metric would report that the SPT team is **more than twice -as efficient** (6.56 vs 13.63), when in reality the SPT team left a critical -email outage burning for nearly an entire business day while updating desktop -wallpaper. +The better comparison metric is the **priority-weighted delay cost** +$D = \sum w(q_i) \cdot C_i$ (not normalized): + +- SPT: $D = 306$ priority-weighted hours +- Practical WSJF: $D = 305$ priority-weighted hours + +Again, the aggregate is similar. The damage from SPT is not in the +aggregate — it is in the *distribution*: critical systems burn while +cosmetic tasks are polished. 
A metric that cannot distinguish between these +two schedules — despite one leaving the email server down for twice as long +— is not measuring what matters. + +The unweighted metric, however, confidently reports SPT as **more than twice +as efficient** (6.56 vs 13.63), rewarding the team that updated desktop +wallpaper while the email server was on fire. ### 10.5 Recommended Metric Suite -No single metric suffices. A complete measurement system for a priority-based -team should track: +The IT example reveals that even priority-weighted aggregate metrics (PWCT) +can fail to distinguish good from bad schedules, because aggregation hides +distributional damage. No single metric suffices. A complete measurement +system for a priority-based team should track: | Metric | What it measures | Formula | |--------|-----------------|---------| -| **PWCT** | Business-impact-weighted responsiveness | $\sum w(q_i) C_i / \sum w(q_i)$ | +| **Mean completion by priority class** | Per-class responsiveness | $\bar{C}$ filtered by $q$ | | **P1 mean time to resolution** | Critical incident response | $\bar{C}$ filtered to $q = 1$ | | **Throughput** | Raw work capacity | Work-hours completed / calendar time | | **Aging violations** | Starvation prevention | Count of tasks exceeding SLA by priority | -| **Slowdown by priority class** | Equity across task types | $\bar{S}$ grouped by $q$ | +| **Max completion time (P1/P2)** | Worst-case critical response | $\max(C_i)$ filtered to $q \le 2$ | + +The key insight from our analysis: **per-priority-class metrics** (rows 1-2, +5) expose scheduling failures that aggregate metrics hide. If P1 mean time +to resolution is 14 hours while P4 mean is 0.5 hours, the team is +optimizing the wrong metric — regardless of what the aggregate says. --- @@ -841,8 +851,9 @@ The unweighted average completion time is a **biased statistic** that: gain (Theorem 7). 5. 
**Actively contradicts priority systems** by carrying zero information about business-impact classification (Theorem 9). -6. **Maximizes priority-weighted delay** in the most common real-world - scenario where high-priority tasks are large (Theorem 10). +6. **Ignores priority entirely** in its scheduling recommendation, + producing suboptimal priority-weighted delay whenever priority and + size are not perfectly inversely correlated (Theorem 10). A metric that can be improved by reordering work — without doing any additional work — is measuring the scheduling policy, not the system's