Phase 11: Probability Theory II
Subject 11-05: Order Statistics
Prerequisites: 10-07 (Continuous Random Variables), 10-10 (Joint Distributions), 11-04 (Transformations), binomial distribution
Learning Objectives
- Define order statistics X_{(1)} ≤ X_{(2)} ≤ ... ≤ X_{(n)} for a random sample
- Derive the CDF and PDF of the minimum X_{(1)} and maximum X_{(n)}
- Derive the PDF of the k-th order statistic X_{(k)} using the binomial argument
- Compute the joint distribution of all order statistics
- Apply order statistics in reliability (lifetime of series/parallel systems) and extreme value theory
Core Content
1. Definition
Let X₁, X₂, ..., Xₙ be i.i.d. random variables with CDF F and PDF f. The order statistics are the sorted values:
X_{(1)} ≤ X_{(2)} ≤ ... ≤ X_{(n)}
- X_{(1)} = min(X₁, ..., Xₙ) — the sample minimum
- X_{(k)} = the k-th smallest value
- X_{(n)} = max(X₁, ..., Xₙ) — the sample maximum
- For odd n, X_{((n+1)/2)} is the sample median
⚠️ CRITICAL: X_{(k)} are NOT independent (except in degenerate cases). They must satisfy X_{(1)} ≤ X_{(2)} ≤ ... ≤ X_{(n)}. Their joint distribution has this ordering built in.
2. Distribution of the Minimum and Maximum
Maximum — X_{(n)}:
$F_{X_{(n)}}(x) = P(X_{(n)} ≤ x) = P(all Xᵢ ≤ x) = [F(x)]ⁿ
f_{X_{(n)}}(x) = n [F(x)]^{n−1} f(x)
$
Proof: The event {max ≤ x} means every Xᵢ ≤ x. By independence, multiply: [P(X₁ ≤ x)]ⁿ = [F(x)]ⁿ.
Minimum — X_{(1)}:
$F_{X_{(1)}}(x) = P(X_{(1)} ≤ x) = 1 − P(all Xᵢ > x) = 1 − [1 − F(x)]ⁿ
f_{X_{(1)}}(x) = n [1 − F(x)]^{n−1} f(x)
$
Proof: The event {min > x} means every Xᵢ > x: P(min > x) = [1 − F(x)]ⁿ. So F_{min}(x) = 1 − [1 − F(x)]ⁿ.
Example — Exponential sample: If Xᵢ ~ Exponential(λ), then: - X_{(1)} ~ Exponential(nλ). The minimum of n independent Exp(λ) is Exp(nλ). - Proof: 1 − F(x) = e^{−λx}, so F_{min}(x) = 1 − (e^{−λx})ⁿ = 1 − e^{−nλx}. Mean = 1/(nλ). Makes sense — the more components in parallel, the faster the first one fails.
Reliability interpretation: - Series system (all must work): system lifetime = min(X₁, ..., Xₙ) - Parallel system (at least one must work): system lifetime = max(X₁, ..., Xₙ)
3. Distribution of the k-th Order Statistic
For the k-th smallest value X_{(k)}:
$f_{X_{(k)}}(x) = n! / ((k−1)! (n−k)!) · [F(x)]^{k−1} [1 − F(x)]^{n−k} f(x)
$
Derivation (binomial argument): For X_{(k)} to be near x, we need: 1. Exactly k−1 observations < x 2. Exactly n−k observations > x 3. One observation ≈ x (with density f(x))
The number of ways to assign which observations go into which category is the multinomial coefficient n! / ((k−1)! · 1! · (n−k)!). Multiply by [F(x)]^{k−1} · f(x)dx · [1 − F(x)]^{n−k}, cancel the dx, and we get the result.
Alternative derivation (direct CDF): F_{X_{(k)}}(x) = P(at least k observations ≤ x) = Σ_{j=k}^{n} C(n, j) [F(x)]^j [1 − F(x)]^{n−j}.
Differentiating and simplifying yields the PDF above.
Special cases: - k = 1: f(x) = n [1 − F(x)]^{n−1} f(x) — matches minimum ✓ - k = n: f(x) = n [F(x)]^{n−1} f(x) — matches maximum ✓ - k = (n+1)/2 (sample median for odd n): gives the sampling distribution of the median
Example — Uniform(0, 1): F(x) = x, f(x) = 1 for 0 < x < 1. Then:
f_{X_{(k)}}(x) = n! / ((k−1)! (n−k)!) · x^{k−1} (1−x)^{n−k}
This is exactly the PDF of Beta(k, n−k+1)! So X_{(k)} ~ Beta(k, n−k+1) when sampling from Uniform(0, 1).
Mean: E[X_{(k)}] = k/(n+1). The order statistics evenly space themselves out — on average, X_{(1)} ≈ 1/(n+1), X_{(2)} ≈ 2/(n+1), ..., X_{(n)} ≈ n/(n+1).
4. Joint Distribution of Order Statistics
The joint PDF of all n order statistics is:
f(x₁, ..., xₙ) = n! · f(x₁) f(x₂) ... f(xₙ) for x₁ < x₂ < ... < xₙ
and zero otherwise.
Derivation: There are n! equally likely permutations that would produce this sorted ordering. Each specific permutation has joint density f(x₁) f(x₂) ... f(xₙ) (by independence of the original sample). The n! accounts for all possible arrangements.
Joint distribution of two order statistics (X_{(r)}, X_{(s)}) where r < s:
$f(x, y) = n!/((r−1)!(s−r−1)!(n−s)!) · [F(x)]^{r−1} [F(y)−F(x)]^{s−r−1} [1−F(y)]^{n−s} f(x) f(y)
$
for x < y. This generalizes the single order statistic formula.
5. Applications and Extensions
Sample range: R = X_{(n)} − X_{(1)}. For Uniform(0, 1): the joint distribution of (X_{(1)}, X_{(n)}) gives R ~ Beta(n−1, 2).
Extreme value theory: As n → ∞, the distribution of the (properly normalized) maximum converges to one of three families: Gumbel, Fréchet, or Weibull — regardless of the original distribution. This is the Fisher-Tippett-Gnedenko theorem, analogous to the CLT for extremes.
Quantiles: The sample p-th quantile (p ∈ (0, 1)) is approximately X_{(⌊np⌋)}. For large n, the sampling distribution of the sample quantile is approximately normal:
$X_{(⌊np⌋)} ≈ N(F^{−1}(p), p(1−p) / (n [f(F^{−1}(p))]²))
$
Key Terms
- Joint distribution of two order statistics
Worked Examples
Example 1: Minimum of Exponential Components
A system has 3 independent components with lifetimes Xᵢ ~ Exponential(λ = 0.001 per hour). The system is a series system (fails when first component fails). Find: (a) PDF of system lifetime, (b) P(system lasts > 500 hours), (c) expected system lifetime.
Solution:
System lifetime = X_{(1)} = min(X₁, X₂, X₃).
(a) X_{(1)} ~ Exponential(3λ) = Exponential(0.003). PDF: f(t) = 0.003 e^{−0.003t}, t > 0.
(b) P(T > 500) = e^{−0.003·500} = e^{−1.5} ≈ 0.223.
(c) E[T] = 1/(3·0.001) = 1000/3 ≈ 333.3 hours. With 3 components in series, the system lasts on average 1/3 as long as a single component.
Example 2: Distribution of the Median
Let X₁, ..., X₅ be i.i.d. Uniform(0, 1). Find: (a) the PDF of the sample median X_{(3)}, (b) P(0.3 < median < 0.7).
Solution:
(a) For n=5, k=3: f(x) = 5!/(2! 2!) · x²(1−x)² · 1 = 30 x²(1−x)², 0 < x < 1.
This is Beta(3, 3). Mean = 3/(3+3) = 0.5, variance = (3·3)/(6²·7) = 9/(36·7) = 9/252 ≈ 0.0357.
(b) P(0.3 < X_{(3)} < 0.7) = ∫_{0.3}^{0.7} 30 x²(1−x)² dx.
Beta CDF: evaluate F_Beta(0.7; 3, 3) − F_Beta(0.3; 3, 3). Using symmetry of Beta(3,3): F(0.7) − F(0.3) = F(0.7) − F(0.7) (symmetric about 0.5) = 2F(0.7) − 1. Numerically, F_Beta(0.7; 3, 3) ≈ 0.8369, so P ≈ 2(0.8369) − 1 = 0.674. About 67%.
Example 3: Range of Uniform Sample
For n=4 i.i.d. Uniform(0, 1) observations, find the PDF of the sample range R = X_{(4)} − X_{(1)}.
Solution:
Joint PDF of (X_{(1)}, X_{(4)}) for Uniform(0,1): f(u, v) = 4!/(0! 2! 0!) · (v−u)² · 1 · 1 = 12(v−u)² for 0 < u < v < 1.
Transform: R = V − U, S = U. V = R + S, U = S. |∂(u,v)/∂(r,s)| = 1.
Support: 0 < s < s+r < 1, so 0 < s < 1−r, and 0 < r < 1.
f_R(r) = ∫₀^{1−r} f_{U,V}(s, s+r) ds = ∫₀^{1−r} 12r² ds = 12r²(1−r).
So f_R(r) = 12 r² (1−r) for 0 < r < 1. This is Beta(3, 2).
E[R] = 3/(3+2) = 0.6. On average, the range covers 60% of the interval.
Quiz
Q1: For n i.i.d. random variables, the minimum X_{(1)} = min(X₁, ..., Xₙ) has CDF:
A) F_{X_{(1)}}(x) = F_X(x)^n B) F_{X_{(1)}}(x) = 1 − (1−F_X(x))^n C) F_{X_{(1)}}(x) = n F_X(x) D) F_{X_{(1)}}(x) = F_X(x)
Correct: B)
- If you chose B: Correct! P(min ≤ x) = 1 − P(all > x) = 1 − (1−F_X(x))^n.
- If you chose A: This is the CDF of the MAXIMUM: P(max ≤ x) = P(all ≤ x) = F_X(x)^n.
- If you chose C: This doesn't even produce valid CDF values (can exceed 1).
- If you chose D: The distribution of order statistics differs from the original.
Q3: The PDF of the k-th order statistic X_{(k)} for i.i.d. samples with PDF f and CDF F is proportional to:
A) F(x)^{k-1} · (1−F(x))^{n-k} · f(x) B) F(x)^k · f(x) C) f(x)^k D) k · F(x) · f(x)
Correct: A)
- If you chose A: Correct! f_{X_{(k)}}(x) = n!/((k−1)!(n−k)!) · F(x)^{k−1} · (1−F(x))^{n−k} · f(x). The three factors represent: k−1 below, n−k above, and 1 at x.
- If you chose B: This is only correct for k = n (the maximum).
- If you chose C: This ignores the ordering structure entirely.
- If you chose D: This doesn't match any standard order statistic distribution.
Practice Problems
- Let X₁, ..., Xₙ be i.i.d. with CDF F. Derive the CDF of X_{(1)} and X_{(n)}.
- Five fuses have independent lifetimes ~ Exponential(0.01 per hour). Find the probability the first fuse blows within 50 hours.
- For n=3 i.i.d. Uniform(0, 1) observations, find the PDF and expected value of X_{(2)}.
- For i.i.d. Uniform(0, θ), show that X_{(n)} is a sufficient statistic for θ. Find E[X_{(n)}].
- Derive the PDF of the sample median for n=3 from a population with PDF f and CDF F.
- Let X₁, ..., X₁₀ be i.i.d. N(0, 1). What is P(X_{(10)} > 2)?
- For a parallel system with 4 independent components ~ Exponential(λ), find the system lifetime PDF and expected lifetime.
Answers
1. F_{max}(x) = [F(x)]ⁿ, f_{max}(x) = n[F(x)]^{n−1}f(x). F_{min}(x) = 1−[1−F(x)]ⁿ, f_{min}(x) = n[1−F(x)]^{n−1}f(x). 2. X_{(1)} ~ Exp(5·0.01) = Exp(0.05). P(X_{(1)} < 50) = 1 − e^{−0.05·50} = 1 − e^{−2.5} ≈ 0.918. High chance the weakest link goes early. 3. f_{X_{(2)}}(x) = 3!/(1!1!) · x¹(1−x)¹ · 1 = 6x(1−x), 0Summary
- Order statistics are the sorted values X_{(1)} ≤ X_{(2)} ≤ ... ≤ X_{(n)} of an i.i.d. sample; they are dependent with a structured joint distribution
- Min and max have simple CDFs: F_{min}(x) = 1−[1−F(x)]ⁿ, F_{max}(x) = [F(x)]ⁿ — natural for reliability (series = min, parallel = max)
- The k-th order statistic X_{(k)} has PDF: n!/((k−1)!(n−k)!) [F]^{k−1}[1−F]^{n−k} f — derived by a binomial argument about how many observations fall below vs. above x
- For Uniform(0, 1): X_{(k)} ~ Beta(k, n−k+1) with mean k/(n+1) — order statistics evenly space themselves
- Joint PDF: n! f(x₁)...f(xₙ) for x₁<...<xₙ — each permutation of the sample is equally likely
Pitfalls
- Min vs. Max CDF confusion: F_{min}(x) = 1−[1−F(x)]ⁿ, F_{max}(x) = [F(x)]ⁿ. These look similar but are fundamentally different. A common mistake is to write the min CDF as [F(x)]ⁿ (that's the max) or to forget the complement in the min formula.
- Treating order statistics as independent: X_{(1)}, ..., X_{(n)} are dependent random variables — they must satisfy X_{(1)} ≤ ... ≤ X_{(n)}. Applying formulas that assume independence (e.g., multiplying marginal PDFs) is wrong unless you're working with the joint distribution that explicitly accounts for ordering.
- Forgetting the multinomial coefficient: The PDF of X_{(k)} is not just [F]^{k−1}[1−F]^{n−k} f — the factor n!/((k−1)!(n−k)!) is essential. Omitting it gives a function that doesn't integrate to 1.
- Confusing the k-th order statistic with the k-th observation: X_{(k)} is the k-th smallest value in the sorted sample, not the k-th observation collected. The original X₁, ..., Xₙ are i.i.d., but the ordered X_{(k)} are not.
- Applying the Uniform(0,1) Beta result to other distributions: X_{(k)} ~ Beta(k, n−k+1) only when sampling from Uniform(0,1). For other distributions, you must use the general formula with the appropriate CDF and PDF.
Quiz
-
X_{(1)} denotes: a) The first observation collected b) The sample minimum c) The sample mean d) The largest observation Answer: b. X_{(1)} = min(X₁, ..., Xₙ) — the smallest value in the sorted sample.
-
The CDF of the sample maximum X_{(n)} from n i.i.d. observations is: a) n F(x) b) 1 − [1 − F(x)]ⁿ c) [F(x)]ⁿ d) F(x)/n Answer: c. P(all ≤ x) = Π P(Xᵢ ≤ x) = [F(x)]ⁿ.
-
For a series system of n independent components, system lifetime equals: a) max(lifetimes) b) sum of lifetimes c) min(lifetimes) d) average lifetime Answer: c. A series system fails when the first component fails — system lifetime = min.
-
If X₁, ..., Xₙ ~ Uniform(0, 1), then X_{(k)} follows: a) Normal(k/n, k(n−k)/n³) b) Beta(k, n−k+1) c) Exponential(k/n) d) Binomial(n, k/(n+1)) Answer: b. Beta(k, n−k+1) — a key result connecting order statistics to the Beta distribution.
-
For n independent Exponential(λ) components, X_{(1)} ~: a) Exponential(λ) b) Exponential(nλ) c) Gamma(n, λ) d) Exponential(λ/n) Answer: b. The minimum has rate nλ, so it fails faster — the "weakest link" phenomenon.
-
The joint PDF of all order statistics X_{(1)} < ... < X_{(n)} contains which factor? a) n b) F(x₁)ⁿ c) n! d) 1/n! Answer: c. f(x₁,...,xₙ) = n! f(x₁)...f(xₙ) for x₁<...<xₙ. The n! accounts for all permutations.
-
E[X_{(k)}] for n i.i.d. Uniform(0, 1) equals: a) k/n b) (k+1)/(n+1) c) k/(n+1) d) (n−k+1)/(n+1) Answer: c. k/(n+1) — the order statistics evenly partition the unit interval.
-
The expected lifetime of a parallel system with 3 independent Exp(λ) components is: a) 1/(3λ) b) 3/λ c) (1 + 1/2 + 1/3)/λ = 11/(6λ) d) λ/3 Answer: c. For parallel with n components: E[T] = (1/λ) Σ_{k=1}^{n} 1/k. For n=3: (1+0.5+0.333)/λ = 1.833/λ.
Next Steps
Continue to 11-06 Limit Theorems to learn about the Law of Large Numbers, the Central Limit Theorem, and convergence concepts.