Phase 11: Probability Theory II
Subject 11-04: Transformations of Random Variables
Prerequisites: 10-07 (Continuous Random Variables), 10-10 (Joint Distributions), multivariable calculus
Learning Objectives
- Apply the CDF method to find the distribution of Y = g(X) for a single continuous RV
- Apply the Jacobian method for invertible transformations of single random variables
- Extend the Jacobian method to bivariate transformations (X, Y) → (U, V)
- Handle non-monotonic transformations by partitioning the support
- Derive distributions of sums, products, and quotients via transformation techniques
Core Content
1. The CDF Method (Single Variable)
Given X with known CDF F_X, find the CDF of Y = g(X):
Step 1: Express F_Y(y) = P(Y ≤ y) = P(g(X) ≤ y).
Step 2: Rewrite the event in terms of X: P(g(X) ≤ y) = P(X ∈ A(y)), where A(y) = {x : g(x) ≤ y}.
Step 3: Express P(X ∈ A(y)) in terms of F_X.
Step 4: Differentiate to get the PDF: f_Y(y) = F'_Y(y).
⚠️ CRITICAL: The CDF method ALWAYS works, even for non-monotonic g. The Jacobian method only works for monotonic transformations.
Example — Y = X² where X ~ N(0, 1):
For y > 0: F_Y(y) = P(X² ≤ y) = P(−√y ≤ X ≤ √y) = Φ(√y) − Φ(−√y) = 2Φ(√y) − 1.
f_Y(y) = d/dy[2Φ(√y) − 1] = 2 φ(√y) · (1/(2√y)) = φ(√y)/√y = (1/√(2π)) y^{−1/2} e^{−y/2}.
This is the PDF of χ²(1) — the chi-squared distribution with 1 degree of freedom. Indeed, Z² ~ χ²(1).
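The derivation above can be sanity-checked numerically. The following is a minimal sketch using only Python's standard library; the helper names (`std_normal_cdf`, `cdf_Y`, `pdf_Y`, `numeric_derivative`) are illustrative and not part of the lesson. It compares a finite-difference derivative of F_Y(y) = 2Φ(√y) − 1 with the closed-form density.

```python
import math

def std_normal_cdf(x):
    """Φ(x), written via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def cdf_Y(y):
    """F_Y(y) = 2Φ(√y) − 1 for y > 0, from the CDF method."""
    return 2.0 * std_normal_cdf(math.sqrt(y)) - 1.0 if y > 0 else 0.0

def pdf_Y(y):
    """Closed form (1/√(2π)) y^(−1/2) e^(−y/2), the χ²(1) density."""
    return math.exp(-y / 2.0) / math.sqrt(2.0 * math.pi * y)

def numeric_derivative(F, y, h=1e-5):
    """Central finite-difference approximation to F'(y)."""
    return (F(y + h) - F(y - h)) / (2.0 * h)

# The two columns should agree to several decimal places.
for y in (0.5, 1.0, 2.0, 4.0):
    print(y, numeric_derivative(cdf_Y, y), pdf_Y(y))
```

Agreement at a handful of points is not a proof, but it catches a dropped chain-rule factor such as the 1/(2√y) term.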
2. The Jacobian Method (Monotonic Single Variable)
If g is strictly monotonic (and differentiable) on the support of X:
Monotonic increasing: g'(x) > 0.
f_Y(y) = f_X(g^{−1}(y)) · |d/dy [g^{−1}(y)]| = f_X(x) / |g'(x)|, evaluated at x = g^{−1}(y)
Monotonic decreasing: g'(x) < 0. Same formula, absolute value handles the sign.
Derivation (increasing case): F_Y(y) = P(g(X) ≤ y) = P(X ≤ g^{−1}(y)) = F_X(g^{−1}(y)). Differentiate: f_Y(y) = f_X(g^{−1}(y)) · (g^{−1})'(y). By the inverse function theorem: (g^{−1})'(y) = 1/g'(g^{−1}(y)). For decreasing g the inequality flips, so F_Y(y) = 1 − F_X(g^{−1}(y)) and f_Y(y) = −f_X(g^{−1}(y)) · (g^{−1})'(y), where (g^{−1})'(y) < 0. The absolute value covers both cases.
Example — Y = e^X where X ~ N(μ, σ²): g(x) = eˣ is strictly increasing. g^{−1}(y) = ln(y) for y > 0. f_Y(y) = f_X(ln(y)) · |1/y| = (1/(yσ√(2π))) exp(−(ln(y) − μ)²/(2σ²)), y > 0. This is the log-normal distribution.
Example — Y = 1/X where X > 0: g(x) = 1/x is strictly decreasing for x > 0. g^{−1}(y) = 1/y. f_Y(y) = f_X(1/y) · |−1/y²| = f_X(1/y) / y².
If X ~ Exponential(λ): f_Y(y) = λ e^{−λ/y} / y² for y > 0. This is the inverse exponential distribution.
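The single-variable formula can be wrapped in a small helper and applied to both examples above. This is a sketch under stated assumptions: `transform_pdf` is a hypothetical name, the caller supplies the inverse and its derivative by hand, and μ = 0, σ = 1, λ = 2 are arbitrary illustration values.

```python
import math

def transform_pdf(f_X, g_inv, g_inv_prime):
    """Return f_Y with f_Y(y) = f_X(g^{-1}(y)) * |d/dy g^{-1}(y)|.

    Only valid when g is strictly monotonic on the support of X.
    """
    def f_Y(y):
        return f_X(g_inv(y)) * abs(g_inv_prime(y))
    return f_Y

def normal_pdf(x, mu=0.0, sigma=1.0):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

# Y = e^X with X ~ N(0, 1): the inverse is ln(y), its derivative is 1/y.
lognormal_pdf = transform_pdf(normal_pdf, math.log, lambda y: 1.0 / y)

# Y = 1/X with X ~ Exponential(lam): the inverse is 1/y, its derivative is -1/y^2.
lam = 2.0

def exp_pdf(x):
    return lam * math.exp(-lam * x) if x > 0 else 0.0

inv_exp_pdf = transform_pdf(exp_pdf, lambda y: 1.0 / y, lambda y: -1.0 / y**2)

print(lognormal_pdf(1.0))  # equals φ(0) = 1/√(2π), since ln(1) = 0
print(inv_exp_pdf(1.0))    # equals lam * e^{-lam}
```

Taking `abs` inside the helper is what lets the same code handle the decreasing map 1/x without a sign error.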
3. The Jacobian Method (Bivariate)
For a one-to-one transformation (U, V) = (g₁(X,Y), g₂(X,Y)):
f_{U,V}(u, v) = f_{X,Y}(x(u,v), y(u,v)) · |J(u, v)|^{−1}
where J = ∂(u,v)/∂(x,y) = det[[∂u/∂x, ∂u/∂y], [∂v/∂x, ∂v/∂y]] is the Jacobian determinant.
Equivalently, using the inverse transformation:
f_{U,V}(u, v) = f_{X,Y}(x(u,v), y(u,v)) · |∂(x,y)/∂(u,v)|
where ∂(x,y)/∂(u,v) = det[[∂x/∂u, ∂x/∂v], [∂y/∂u, ∂y/∂v]].
The absolute value is critical — it ensures the PDF is non-negative.
Intuition: The Jacobian accounts for the area distortion. When you transform coordinates, a small rectangle in (x, y) space maps to a parallelogram in (u, v) space. The Jacobian determinant gives the area scaling factor.
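The relationship between the forward and inverse Jacobians can be checked with finite differences. The sketch below uses a hypothetical linear map, u = x + 2y and v = 3x − y, chosen only for illustration (it does not appear in the lesson), and a hypothetical helper `jacobian_det`.

```python
def jacobian_det(f, p, q, h=1e-6):
    """Determinant of the 2x2 Jacobian of f at (p, q), by central differences."""
    a1, b1 = f(p + h, q)
    a0, b0 = f(p - h, q)
    c1, d1 = f(p, q + h)
    c0, d0 = f(p, q - h)
    # Partial derivatives of the first (a) and second (b) output components.
    da_dp, db_dp = (a1 - a0) / (2 * h), (b1 - b0) / (2 * h)
    da_dq, db_dq = (c1 - c0) / (2 * h), (d1 - d0) / (2 * h)
    return da_dp * db_dq - da_dq * db_dp

# Hypothetical map for illustration: u = x + 2y, v = 3x - y.
def forward(x, y):
    return x + 2 * y, 3 * x - y

# Its inverse, solved by hand: x = (u + 2v)/7, y = (3u - v)/7.
def inverse(u, v):
    return (u + 2 * v) / 7, (3 * u - v) / 7

x, y = 0.3, -1.2
u, v = forward(x, y)
J_fwd = jacobian_det(forward, x, y)  # ∂(u,v)/∂(x,y), analytically -7
J_inv = jacobian_det(inverse, u, v)  # ∂(x,y)/∂(u,v), analytically -1/7
print(J_fwd, J_inv, abs(J_fwd) * abs(J_inv))
```

Both determinants are negative here, which is why the density formula needs the absolute value, and their product is 1, which is why |J|^{−1} with the forward Jacobian matches |∂(x,y)/∂(u,v)|.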
4. Sum of Two Independent Continuous RVs — Convolution
If X and Y are independent with PDFs f_X and f_Y, the PDF of Z = X + Y is:
f_Z(z) = ∫_{−∞}^{∞} f_X(x) f_Y(z − x) dx [convolution]
Derivation via bivariate transformation: Let U = X, Z = X + Y. Jacobian: ∂(u,z)/∂(x,y) = 1, so ∂(x,y)/∂(u,z) = 1. Joint: f_{U,Z}(u,z) = f_X(u) f_Y(z − u). Marginalize out U: f_Z(z) = ∫ f_X(u) f_Y(z−u) du.
Key examples:
- Sum of independent Normals: N(μ₁,σ₁²) + N(μ₂,σ₂²) = N(μ₁+μ₂, σ₁²+σ₂²)
- Sum of independent Gammas (same rate): Gamma(α₁,β) + Gamma(α₂,β) = Gamma(α₁+α₂, β)
- Sum of independent Exponentials (same rate): Exp(λ) + Exp(λ) = Gamma(2, λ) (Erlang)
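The Exponential case can be checked by evaluating the convolution integral numerically. This is a minimal sketch: `convolve_at` is a hypothetical helper using the midpoint rule, and λ = 1.5 is an arbitrary illustration value.

```python
import math

def convolve_at(f_X, f_Y, z, lo, hi, n=20000):
    """Approximate ∫ f_X(x) f_Y(z − x) dx over [lo, hi] with the midpoint rule."""
    dx = (hi - lo) / n
    total = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * dx
        total += f_X(x) * f_Y(z - x)
    return total * dx

lam = 1.5

def exp_pdf(x):
    return lam * math.exp(-lam * x) if x >= 0 else 0.0

def gamma2_pdf(z):
    """Gamma(2, lam) (Erlang-2) density: lam^2 z e^{-lam z}."""
    return lam**2 * z * math.exp(-lam * z) if z >= 0 else 0.0

for z in (0.5, 1.0, 3.0):
    # The integrand vanishes outside 0 <= x <= z, so integrate over [0, z] only.
    print(z, convolve_at(exp_pdf, exp_pdf, z, 0.0, z), gamma2_pdf(z))
```

Restricting the integration limits to where both densities are nonzero is the same bookkeeping step that hand calculations of convolutions require.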
5. Other Bivariate Transformations
Product Z = XY: f_Z(z) = ∫_{−∞}^{∞} f_{X,Y}(x, z/x) · (1/|x|) dx
Quotient Z = X/Y (with Y ≠ 0): f_Z(z) = ∫_{−∞}^{∞} f_{X,Y}(zy, y) · |y| dy
Ratio of independent standard normals: X/Y ~ Cauchy(0, 1). Derivation: f_{X,Y}(x,y) = (1/(2π)) e^{−(x²+y²)/2}. Let Z = X/Y, W = Y. Then f_Z(z) = ∫ |w| (1/(2π)) e^{−(z²w²+w²)/2} dw = 1/(π(1+z²)) — the standard Cauchy.
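The quotient formula and the Cauchy result can be compared numerically. The sketch below assumes a truncated integration range [−10, 10], beyond which the Gaussian factor is negligible; `quotient_pdf` is a hypothetical helper name.

```python
import math

def joint_std_normal(x, y):
    """Joint density of two independent N(0, 1) variables."""
    return math.exp(-(x * x + y * y) / 2.0) / (2.0 * math.pi)

def quotient_pdf(z, joint, w_max=10.0, n=40000):
    """Approximate f_Z(z) = ∫ f_{X,Y}(z w, w) |w| dw on [-w_max, w_max] (midpoint rule)."""
    dw = 2.0 * w_max / n
    total = 0.0
    for i in range(n):
        w = -w_max + (i + 0.5) * dw
        total += joint(z * w, w) * abs(w)
    return total * dw

def cauchy_pdf(z):
    return 1.0 / (math.pi * (1.0 + z * z))

# The numerical integral and the closed form should agree closely.
for z in (0.0, 0.5, 2.0):
    print(z, quotient_pdf(z, joint_std_normal), cauchy_pdf(z))
```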
Worked Examples
Example 1: CDF Method — Non-Monotonic
X ~ Uniform(−1, 2). Find the PDF of Y = X².
Solution:
X has PDF f_X(x) = 1/3 for −1 < x < 2, 0 otherwise.
For 0 < y < 1: {x : x² ≤ y} = [−√y, √y]. F_Y(y) = P(−√y ≤ X ≤ √y) = (√y − (−√y))/3 = 2√y/3.
For 1 ≤ y < 4: {x : x² ≤ y and x ∈ (−1, 2)} = [−√y, √y] ∩ (−1, 2) = (−1, √y] (since √y ≥ 1, the left end is cut off at −1; since √y < 2, the right end is √y). F_Y(y) = (√y − (−1))/3 = (√y + 1)/3.
For y ≥ 4: F_Y(y) = 1.
Differentiate:
- 0 < y < 1: f_Y(y) = d/dy(2√y/3) = 1/(3√y)
- 1 ≤ y < 4: f_Y(y) = d/dy((√y+1)/3) = 1/(6√y)
Note the piecewise behavior comes from the non-symmetric support of X.
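A simulation gives a quick check on the piecewise CDF in this example. The following is a rough Monte Carlo sketch, not part of the original solution; the sample size and the seed are arbitrary choices, and the empirical values should match the formula only to about two decimal places.

```python
import math
import random

def cdf_Y(y):
    """Piecewise CDF of Y = X² for X ~ Uniform(−1, 2), as derived in Example 1."""
    if y <= 0:
        return 0.0
    if y < 1:
        return 2.0 * math.sqrt(y) / 3.0
    if y < 4:
        return (math.sqrt(y) + 1.0) / 3.0
    return 1.0

rng = random.Random(0)  # fixed seed so the run is repeatable
samples = [rng.uniform(-1.0, 2.0) ** 2 for _ in range(200_000)]

def empirical_cdf(y):
    """Fraction of simulated Y values that are <= y."""
    return sum(s <= y for s in samples) / len(samples)

# One point in each branch, plus the breakpoint y = 1.
for y in (0.25, 1.0, 2.25):
    print(y, empirical_cdf(y), cdf_Y(y))
```

Checking a point on each side of y = 1 is the useful part: an error in only one branch of the piecewise formula shows up as a mismatch on that side alone.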
Example 2: Bivariate Transformation to Polar Coordinates
Let X, Y be i.i.d. N(0, 1). Find the joint distribution of (R, Θ) where X = R cos Θ, Y = R sin Θ.
Solution:
f_{X,Y}(x,y) = (1/(2π)) e^{−(x²+y²)/2}.
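Inverse: r = √(x²+y²), and θ is the polar angle of (x, y) in (0, 2π); arctan(y/x) alone determines it only up to the choice of quadrant. Jacobian matrix of (x, y) with respect to (r, θ): [[∂x/∂r, ∂x/∂θ], [∂y/∂r, ∂y/∂θ]] = [[cosθ, −r sinθ], [sinθ, r cosθ]].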
det = cosθ(r cosθ) − (−r sinθ)(sinθ) = r(cos²θ + sin²θ) = r.
f_{R,Θ}(r, θ) = f_{X,Y}(r cosθ, r sinθ) · |r| = (1/(2π)) e^{−r²/2} · r, for r > 0, 0 < θ < 2π.
This factors! f_{R,Θ}(r, θ) = (r e^{−r²/2}) · (1/(2π)). R has a Rayleigh distribution, Θ ~ Uniform(0, 2π), and they're independent. This is the basis of the Box-Muller transform for generating normal random variables.
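The factorization can be run in reverse to generate normal samples, which is the idea behind Box-Muller. The sketch below is one possible standard-library implementation, not a reference one: it draws R by inverting the Rayleigh CDF F_R(r) = 1 − e^{−r²/2}, draws Θ uniformly, and checks the sample mean and variance. The seed and sample size are arbitrary.

```python
import math
import random
import statistics

def box_muller(rng):
    """Return one pair (X, Y) built from R (Rayleigh) and Θ (uniform on [0, 2π))."""
    u = 1.0 - rng.random()             # lies in (0, 1], so log(u) is defined
    r = math.sqrt(-2.0 * math.log(u))  # solves u = e^{−r²/2}, i.e. u = 1 − F_R(r)
    theta = 2.0 * math.pi * rng.random()
    return r * math.cos(theta), r * math.sin(theta)

rng = random.Random(42)  # fixed seed so the run is repeatable
pairs = [box_muller(rng) for _ in range(100_000)]
xs = [p[0] for p in pairs]
ys = [p[1] for p in pairs]

# Both coordinates should have mean near 0 and variance near 1.
print(statistics.fmean(xs), statistics.pvariance(xs))
print(statistics.fmean(ys), statistics.pvariance(ys))
```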
Example 3: Convolution for Sum
X ~ Uniform(0, 1), Y ~ Uniform(0, 1), independent. Find the PDF of Z = X + Y.
Solution:
f_X(x) = 1 for 0 < x < 1, f_Y(y) = 1 for 0 < y < 1.
f_Z(z) = ∫ f_X(x) f_Y(z − x) dx. The integrand is 1 when 0 < x < 1 AND 0 < z−x < 1, i.e., max(0, z−1) < x < min(1, z).
- For 0 < z < 1: f_Z(z) = ∫₀ᶻ 1 dx = z.
- For 1 ≤ z < 2: f_Z(z) = ∫_{z−1}¹ 1 dx = 2 − z.
- Otherwise: 0.
This is the triangular distribution on (0, 2).
Check: ∫₀¹ z dz + ∫₁² (2−z) dz = 1/2 + 1/2 = 1. ✓
Quiz
Q1: When transforming a continuous random variable X via Y = g(X), the CDF method computes:
A) F_Y(y) = P(g(X) ≤ y) = ∫_{x: g(x) ≤ y} f_X(x) dx
B) F_Y(y) = g(F_X(y))
C) F_Y(y) = F_X(g^{-1}(y))
D) F_Y(y) = f_X(y) · |g'(y)|
Correct: A)
- If you chose A: Correct! The CDF method works for ANY transformation (monotonic or not): first express the event {Y ≤ y} in terms of X, then integrate the PDF over that region.
- If you chose B: This doesn't work in general — CDFs don't compose with arbitrary functions.
- If you chose C: This works only when g is strictly increasing (for decreasing g, F_Y(y) = 1 − F_X(g^{-1}(y))), and is a special case of the CDF method.
- If you chose D: This confuses the PDF transformation formula with the CDF.
Q2: For a monotonic transformation Y = g(X) with g strictly increasing, the PDF of Y is:
A) f_Y(y) = f_X(y) · g'(y)
B) f_Y(y) = f_X(g^{-1}(y)) · |d/dy g^{-1}(y)|
C) f_Y(y) = f_X(g^{-1}(y))
D) f_Y(y) = f_X(y) / g'(y)
Correct: B)
- If you chose B: Correct! The Jacobian method: differentiate the inverse and multiply by the original PDF evaluated at the inverse. The absolute value handles both increasing and decreasing g.
- If you chose A: The argument should be g^{-1}(y), not y, and the derivative is of the inverse.
- If you chose C: Missing the Jacobian factor; the PDF doesn't integrate to 1 without it.
- If you chose D: Dividing by g' is the right idea, but both f_X and g' must be evaluated at x = g^{-1}(y), not at y.
Q3: The convolution formula for the sum Z = X + Y of independent continuous RVs is:
A) f_Z(z) = f_X(z) + f_Y(z)
B) f_Z(z) = ∫ f_X(x) f_Y(z − x) dx
C) f_Z(z) = f_X(z) · f_Y(z)
D) f_Z(z) = ∫ f_X(x) f_Y(x) dx
Correct: B)
- If you chose B: Correct! The PDF of the sum is the convolution of the individual PDFs: f_Z(z) = ∫ f_X(x) f_Y(z − x) dx. This comes from the transformation (X, Y) → (X, X+Y) and integrating out X.
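- If you chose A: A weighted average of PDFs describes a MIXTURE distribution, not the sum of RVs.
- If you chose C: The pointwise product of two densities is generally not a PDF at all, and it is not the density of the sum. (The product XY has its own integral formula, which is also not this.)
- If you chose D: This integral gives a single number, not a function of z.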
Q5: For a bivariate transformation (U, V) = g(X, Y), the joint PDF requires multiplying by:
A) The gradient of g
B) The absolute value of the Jacobian determinant |∂(x,y)/∂(u,v)|
C) The Hessian of g
D) The inverse of g
Correct: B)
- If you chose B: Correct! f_{U,V}(u,v) = f_{X,Y}(x(u,v), y(u,v)) · |det(J)| where J = ∂(x,y)/∂(u,v) is the Jacobian matrix of the inverse transformation.
- If you chose A: The gradient is a vector, not a scalar factor.
- If you chose C: The Hessian is a second-derivative matrix, irrelevant to change of variables.
- If you chose D: You need the determinant of the Jacobian of the inverse, not just the inverse function.
Practice Problems
- Let X ~ Exponential(λ). Find the PDF of Y = √X using the Jacobian method.
- X ~ Uniform(0, 1). Find the PDF of Y = −ln(X) using the CDF method.
- X, Y i.i.d. Uniform(0, 1). Find the PDF of Z = X + Y via convolution.
- Let X and Y be independent standard normals. Use the bivariate Jacobian to find the PDF of U = X + Y and V = X − Y. Are U and V independent?
- X ~ Uniform(−π/2, π/2). Find the PDF of Y = tan(X).
- Let X, Y be independent Exponential(1). Find the PDF of Z = X/Y.
- X ~ N(0, 1). Find the PDF of Y = |X| (the half-normal distribution).
Answers
1. g(x) = √x, increasing. g^{−1}(y) = y². f_Y(y) = f_X(y²) · |2y| = λe^{−λy²} · 2y = 2λy e^{−λy²}, y > 0 (Rayleigh-like).
2. Y = −ln(X). X ~ U(0,1). F_Y(y) = P(−ln X ≤ y) = P(ln X ≥ −y) = P(X ≥ e^{−y}) = 1 − e^{−y}. f_Y(y) = e^{−y} for y > 0. So Y ~ Exponential(1).
3. f_Z(z) = z for 0 < z < 1, 2 − z for 1 ≤ z < 2, and 0 otherwise (see Example 3).
Summary
- The CDF method (F_Y(y) = P(g(X) ≤ y)) always works for finding the distribution of Y = g(X), even for non-monotonic g
- For monotonic differentiable g: f_Y(y) = f_X(g^{−1}(y)) · |(g^{−1})'(y)| — the Jacobian method
- For bivariate transformations: f_{U,V}(u,v) = f_{X,Y}(x(u,v), y(u,v)) · |det ∂(x,y)/∂(u,v)|
- Sums of independent RVs use convolution: f_{X+Y}(z) = ∫ f_X(x) f_Y(z−x) dx
- For independent RVs: Normal + Normal = Normal; Gamma(α₁,β) + Gamma(α₂,β) = Gamma(α₁+α₂,β); the ratio of two independent standard normals is Cauchy
Pitfalls
- Forgetting the absolute value of the Jacobian determinant. The transformation formula requires |det(J)|, not det(J). Without the absolute value, you can get a negative PDF. This is a very common error in transformation problems. Always write |∂(x,y)/∂(u,v)|, with vertical bars for the absolute value.
- Using the Jacobian method for non-monotonic transformations. If g(x) is not strictly monotonic (e.g., g(x) = x²), the Jacobian formula does NOT apply directly. You must either: (a) use the CDF method, (b) partition the domain into monotonic pieces and sum contributions, or (c) use the general formula with multiple inverse branches.
- Using the wrong Jacobian matrix. There are two choices: the forward Jacobian ∂(u,v)/∂(x,y) and the inverse Jacobian ∂(x,y)/∂(u,v). The transformation formula uses the INVERSE Jacobian: |∂(x,y)/∂(u,v)|. If you mistakenly compute ∂(u,v)/∂(x,y) and plug it in, you get the reciprocal of the correct factor.
- Confusing convolution with simply adding the PDFs. The PDF of X + Y is NOT f_X(z) + f_Y(z). That sum integrates to 2, so it is not even a PDF; the average (f_X(z) + f_Y(z))/2 is a mixture distribution (flip a fair coin, draw from X or Y). The PDF of a sum of independent RVs is the CONVOLUTION ∫ f_X(x) f_Y(z−x) dx. A mixture of two well-separated densities has two humps; convolving them typically gives a single smoothed density.
- Forgetting to handle piecewise regions in the CDF method. When g(X) is non-monotonic or X has asymmetric support, the set {x : g(x) ≤ y} may split into multiple intervals. Example: Y = X² with X ~ Uniform(−1, 2) — for 0 < y < 1 the set is [−√y, √y], but for 1 ≤ y < 4 it's [−1, √y]. Each region requires separate CDF expressions and PDF differentiation.
Quiz
- The CDF method for Y = g(X) works by:
  a) Directly differentiating g
  b) Writing F_Y(y) = P(g(X) ≤ y) and solving for X
  c) Multiplying PDFs
  d) Using the chain rule
  Answer: b. The CDF method expresses the event {Y ≤ y} in terms of X, then uses F_X.
- For a monotonic increasing transformation Y = g(X), the PDF is:
  a) f_Y(y) = f_X(g(y))
  b) f_Y(y) = f_X(g^{−1}(y)) · g'(g^{−1}(y))
  c) f_Y(y) = f_X(g^{−1}(y)) · |(g^{−1})'(y)|
  d) f_Y(y) = f_X(g(y)) · g'(y)
  Answer: c. The factor is the derivative of the inverse function, needed for the change of variables.
- The sum of two independent Exponential(λ) random variables has which distribution?
  a) Exponential(2λ)
  b) Gamma(2, λ)
  c) Normal(2/λ, 2/λ²)
  d) Uniform(0, 2/λ)
  Answer: b. Two independent Exp(λ) sum to Gamma(2, λ) (Erlang-2).
- In bivariate transformation, the Jacobian determinant accounts for:
  a) The area scaling factor of the transformation
  b) The correlation between X and Y
  c) The sum of the variables
  d) The means of the marginals
  Answer: a. The Jacobian is the local area scaling factor: how much a small region in (x,y) expands or contracts in (u,v) space.
- For X, Y i.i.d. Uniform(0, 1), the PDF of X + Y on [0, 2] is:
  a) Constant
  b) Triangular (increasing then decreasing)
  c) Bell-shaped
  d) Exponential
  Answer: b. The triangular distribution: f(z) = z for 0<z<1, 2−z for 1≤z<2.
- If X ~ N(0, 1), then Y = X² has distribution:
  a) N(0, 1)
  b) χ²(1)
  c) Exponential(1/2)
  d) Uniform(0, 1)
  Answer: b. The square of a standard normal is chi-squared on 1 degree of freedom.
- The transformation X = R cos Θ, Y = R sin Θ for independent standard normals yields:
  a) Dependent R and Θ
  b) Independent R and Θ
  c) R ~ Uniform, Θ ~ Normal
  d) R = X + Y
  Answer: b. R (Rayleigh) and Θ (Uniform(0, 2π)) are independent, which is the Box-Muller property.
- When is the CDF method preferred over the Jacobian method?
  a) For monotonic transformations
  b) For linear transformations
  c) For non-monotonic transformations
  d) When the transformation is differentiable
  Answer: c. The Jacobian requires monotonicity (one-to-one). For non-monotonic g, partition the support and use the CDF method.
Next Steps
Continue to 11-05 Order Statistics to learn about distributions of sample minimums, maximums, medians, and joint distributions of order statistics.