Phase 10: Probability Theory
Subject 10-10: Joint Distributions
Prerequisites: 10-04 (Discrete RVs), 10-07 (Continuous RVs), 10-06 (Expectation), multivariable calculus (partial derivatives, double integrals)
Learning Objectives
- Define and compute joint PMFs for discrete random vectors and joint PDFs for continuous random vectors
- Derive marginal distributions by summing (discrete) or integrating (continuous) over other variables
- Define conditional distributions and compute conditional expectations E[Y | X = x]
- State and verify independence for jointly distributed random variables: f(x, y) = f_X(x) f_Y(y)
- Apply the bivariate normal distribution and compute probabilities for linear combinations
Core Content
1. Joint PMF (Discrete Case)
For discrete random variables X and Y, the joint probability mass function is:
$p_{X,Y}(x, y) = P(X = x, Y = y)
$
Properties:
1. p_{X,Y}(x, y) ≥ 0 for all (x, y)
2. Σ_x Σ_y p_{X,Y}(x, y) = 1
3. For any event A ⊆ R²: P((X,Y) ∈ A) = Σ_{(x,y)∈A} p_{X,Y}(x, y)
Joint CDF:
$F_{X,Y}(x, y) = P(X ≤ x, Y ≤ y) = Σ_{u≤x} Σ_{v≤y} p_{X,Y}(u, v)
$
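The definitions above can be checked on a small table. The sketch below is only an illustration: the two-coin setup and the helper names `is_valid_pmf`, `prob`, and `joint_cdf` are assumptions introduced here, not part of the lesson.

```python
from fractions import Fraction
from itertools import product

# Illustrative setup (an assumption, not from the text): flip two fair coins,
# X = result of the first flip (0 or 1), Y = total number of heads.
pmf = {}
for a, b in product((0, 1), repeat=2):
    key = (a, a + b)
    pmf[key] = pmf.get(key, Fraction(0)) + Fraction(1, 4)

def is_valid_pmf(p):
    """Non-negative entries that sum to 1."""
    return all(v >= 0 for v in p.values()) and sum(p.values()) == 1

def prob(p, event):
    """P((X, Y) in A), with the event A given as a predicate on (x, y)."""
    return sum(v for (x, y), v in p.items() if event(x, y))

def joint_cdf(p, x, y):
    """F(x, y) = P(X <= x, Y <= y), summed directly from the PMF."""
    return prob(p, lambda u, v: u <= x and v <= y)

print(is_valid_pmf(pmf))                   # True
print(prob(pmf, lambda x, y: x + y >= 2))  # 1/2
print(joint_cdf(pmf, 1, 1))                # 3/4
```

Storing the PMF as a dictionary keyed by (x, y) keeps the event and CDF computations as plain sums, which mirrors the formulas term by term.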
2. Joint PDF (Continuous Case)
For continuous random variables X and Y, the joint probability density function f_{X,Y}(x, y) satisfies the following properties:
1. f_{X,Y}(x, y) ≥ 0 for all (x, y)
2. ∫∫ f_{X,Y}(x, y) dx dy = 1 (integrating over all of R²)
3. For any region A ⊆ R²: P((X,Y) ∈ A) = ∬_A f_{X,Y}(x, y) dx dy
Joint CDF:
$F_{X,Y}(x, y) = P(X ≤ x, Y ≤ y) = ∫_{−∞}^{x} ∫_{−∞}^{y} f_{X,Y}(u, v) dv du
$
Recovering PDF from CDF:
$f_{X,Y}(x, y) = ∂²F_{X,Y} / (∂x ∂y)
$
3. Marginal Distributions
Discrete — Marginal PMF:
$p_X(x) = Σ_y p_{X,Y}(x, y) (sum over all values of Y)
p_Y(y) = Σ_x p_{X,Y}(x, y) (sum over all values of X)
$
Continuous — Marginal PDF:
$f_X(x) = ∫_{−∞}^{∞} f_{X,Y}(x, y) dy (integrate out Y)
f_Y(y) = ∫_{−∞}^{∞} f_{X,Y}(x, y) dx (integrate out X)
$
Intuition: The marginal distribution of X is what you get if you "average over" or "ignore" Y — it's the distribution of X alone, without reference to Y.
Example (continuous):
f_{X,Y}(x, y) = 6xy² for 0 < x < 1, 0 < y < 1
Marginal of X:
$f_X(x) = ∫₀¹ 6xy² dy = 6x[y³/3]₀¹ = 6x · 1/3 = 2x, 0 < x < 1 $
Marginal of Y:
$f_Y(y) = ∫₀¹ 6xy² dx = 6y²[x²/2]₀¹ = 6y² · 1/2 = 3y², 0 < y < 1 $
Verify normalization: ∫₀¹ 2x dx = 1 ✓, ∫₀¹ 3y² dy = 1 ✓.
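As a rough numerical check of the example above, the marginals can be approximated by integrating the joint density with a midpoint rule. This is a sketch under assumptions: the helper names and the grid size `n=2000` are choices made here for illustration.

```python
def f_joint(x, y):
    """Joint density from the example: 6*x*y^2 on the open unit square."""
    return 6.0 * x * y**2 if 0.0 < x < 1.0 and 0.0 < y < 1.0 else 0.0

def midpoint(g, a, b, n=2000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def marginal_x(x):
    """Approximate f_X(x) by integrating out y."""
    return midpoint(lambda y: f_joint(x, y), 0.0, 1.0)

def marginal_y(y):
    """Approximate f_Y(y) by integrating out x."""
    return midpoint(lambda x: f_joint(x, y), 0.0, 1.0)

print(marginal_x(0.3))  # should be close to 2 * 0.3 = 0.6
print(marginal_y(0.5))  # should be close to 3 * 0.5**2 = 0.75
```

The printed values should agree with 2x and 3y² up to the discretization error of the midpoint rule.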
4. Conditional Distributions
Discrete:
$p_{Y|X}(y|x) = P(Y = y | X = x) = p_{X,Y}(x, y) / p_X(x)
$
Continuous:
f_{Y|X}(y|x) = f_{X,Y}(x, y) / f_X(x) for f_X(x) > 0
For fixed x, f_{Y|X}(y|x) as a function of y is a valid PDF: it's non-negative and integrates to 1.
Conditional Expectation:
Discrete:
$E[Y | X = x] = Σ_y y · p_{Y|X}(y|x)
$
Continuous:
$E[Y | X = x] = ∫ y · f_{Y|X}(y|x) dy
$
Law of Total Expectation (Law of Iterated Expectations):
$E[Y] = E[E[Y | X]] $
In the continuous case:
$E[Y] = ∫ E[Y | X = x] f_X(x) dx $
This is a powerful tool: you can compute E[Y] by first conditioning on X, finding the conditional expectation, then averaging over X.
Law of Total Variance:
$Var(Y) = E[Var(Y | X)] + Var(E[Y | X]) $
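Both laws can be verified exactly on a small discrete model. The sketch below assumes a hypothetical setup chosen for this illustration: X is uniform on {1, 2, 3} and, given X = x, Y is uniform on {1, ..., x}. Exact fractions avoid rounding noise.

```python
from fractions import Fraction

# Illustrative model (an assumption for this sketch): X is uniform on {1, 2, 3}
# and, given X = x, Y is uniform on {1, ..., x}, so p(x, y) = 1 / (3x).
pmf = {(x, y): Fraction(1, 3 * x) for x in (1, 2, 3) for y in range(1, x + 1)}

xs = sorted({x for x, _ in pmf})
p_x = {x: sum(v for (a, _), v in pmf.items() if a == x) for x in xs}

def cond_mean(x):
    """E[Y | X = x] from the conditional PMF p(x, y) / p_X(x)."""
    return sum(y * v for (a, y), v in pmf.items() if a == x) / p_x[x]

def cond_var(x):
    """Var(Y | X = x) = E[Y^2 | X = x] - E[Y | X = x]^2."""
    second = sum(y * y * v for (a, y), v in pmf.items() if a == x) / p_x[x]
    return second - cond_mean(x) ** 2

# Direct computation from the joint PMF.
mean_y = sum(y * v for (_, y), v in pmf.items())
var_y = sum(y * y * v for (_, y), v in pmf.items()) - mean_y ** 2

# Computation by conditioning on X.
tower_mean = sum(cond_mean(x) * p_x[x] for x in xs)                   # E[E[Y | X]]
within = sum(cond_var(x) * p_x[x] for x in xs)                        # E[Var(Y | X)]
between = sum((cond_mean(x) - tower_mean) ** 2 * p_x[x] for x in xs)  # Var(E[Y | X])

print(mean_y, tower_mean)       # 3/2 3/2
print(var_y, within + between)  # 17/36 17/36
```

The split into `within` and `between` matches the two terms of the law of total variance: the average spread of Y inside each slice X = x, plus the spread of the slice means.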
5. Independence
X and Y are independent if and only if:
Discrete:
p_{X,Y}(x, y) = p_X(x) · p_Y(y) for all x, y
Continuous:
f_{X,Y}(x, y) = f_X(x) · f_Y(y) for all x, y
Equivalently: the conditional distribution equals the marginal:
$f_{Y|X}(y|x) = f_Y(y) (X gives no information about Y)
$
And: the joint CDF factors: F_{X,Y}(x, y) = F_X(x) F_Y(y).
Theorem: If X and Y are independent, then Cov(X, Y) = 0 (and E[XY] = E[X]E[Y]). The converse is FALSE — zero correlation does NOT imply independence (except for jointly normal variables).
Theorem: If X and Y are independent, then g(X) and h(Y) are independent for any functions g, h.
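To make the gap between zero covariance and independence concrete, here is a minimal sketch for discrete joint PMFs. The helper names `marginals`, `is_independent`, and `covariance` are hypothetical, introduced only for this illustration. The dependent case takes X uniform on {-1, 0, 1} and Y = X².

```python
from fractions import Fraction

def marginals(pmf):
    """Marginal PMFs of X and Y from a joint PMF keyed by (x, y)."""
    p_x, p_y = {}, {}
    for (x, y), v in pmf.items():
        p_x[x] = p_x.get(x, 0) + v
        p_y[y] = p_y.get(y, 0) + v
    return p_x, p_y

def is_independent(pmf):
    """True when p(x, y) = p_X(x) * p_Y(y) on the whole product grid."""
    p_x, p_y = marginals(pmf)
    return all(pmf.get((x, y), 0) == p_x[x] * p_y[y] for x in p_x for y in p_y)

def covariance(pmf):
    """Cov(X, Y) = E[XY] - E[X] E[Y]."""
    e_x = sum(x * v for (x, _), v in pmf.items())
    e_y = sum(y * v for (_, y), v in pmf.items())
    e_xy = sum(x * y * v for (x, y), v in pmf.items())
    return e_xy - e_x * e_y

# X uniform on {-1, 0, 1} and Y = X^2: Y is a function of X, yet Cov(X, Y) = 0.
third = Fraction(1, 3)
dependent = {(-1, 1): third, (0, 0): third, (1, 1): third}

# Two independent fair bits.
quarter = Fraction(1, 4)
independent = {(x, y): quarter for x in (0, 1) for y in (0, 1)}

print(covariance(dependent), is_independent(dependent))      # 0 False
print(covariance(independent), is_independent(independent))  # 0 True
```

Note that the independence check must cover every (x, y) pair on the product grid, including pairs with joint probability zero; a single failing cell is enough to rule out independence.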
6. Bivariate Normal Distribution
The bivariate normal is the most widely used joint continuous distribution. (X, Y) is bivariate normal with means μ_X, μ_Y, variances σ_X², σ_Y², and correlation ρ, where |ρ| < 1 so that the density below is well defined.
Joint PDF:
$f(x, y) = (1 / (2π σ_X σ_Y √(1−ρ²))) · exp(−(1/(2(1−ρ²))) [((x−μ_X)/σ_X)² − 2ρ((x−μ_X)/σ_X)((y−μ_Y)/σ_Y) + ((y−μ_Y)/σ_Y)²]) $
Key properties:
- Marginals: X ~ N(μ_X, σ_X²), Y ~ N(μ_Y, σ_Y²)
- Conditional distributions: Y | X=x ~ N(μ_Y + ρ(σ_Y/σ_X)(x−μ_X), σ_Y²(1−ρ²))
- ρ = 0 ⇔ X and Y are independent. This equivalence relies on joint normality: for general distributions zero correlation does not imply independence, and normal marginals alone are not enough.
- Linear combinations are normal: aX + bY ~ N(aμ_X + bμ_Y, a²σ_X² + b²σ_Y² + 2abρσ_Xσ_Y)
Conditional expectation given X is LINEAR in X:
$E[Y | X = x] = μ_Y + ρ(σ_Y/σ_X)(x − μ_X) $
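This is the "regression line". Because the conditional mean minimizes mean squared error among all predictors of Y based on X, for the bivariate normal the best predictor overall coincides with the best linear predictor.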
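The linear form of the conditional mean can be illustrated by simulation. The sketch below is an assumption-laden demo, not a prescribed method: it builds correlated normal pairs from two independent standard normals, fits a least-squares line, and compares it with the slope ρ(σ_Y/σ_X) and intercept μ_Y − ρ(σ_Y/σ_X)μ_X. The parameter values, sample size, and function names are hypothetical.

```python
import math
import random

def sample_bivariate_normal(n, mu_x, mu_y, sigma_x, sigma_y, rho, seed=0):
    """Draw n pairs using two independent standard normals Z1, Z2."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        x = mu_x + sigma_x * z1
        # Mixing z1 and z2 this way gives Corr(X, Y) = rho.
        y = mu_y + sigma_y * (rho * z1 + math.sqrt(1.0 - rho**2) * z2)
        pairs.append((x, y))
    return pairs

def least_squares_line(pairs):
    """Slope and intercept of the least-squares fit of y on x."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    s_xx = sum((x - mean_x) ** 2 for x, _ in pairs)
    slope = s_xy / s_xx
    return slope, mean_y - slope * mean_x

# Hypothetical parameters chosen for the demo.
mu_x, mu_y, sigma_x, sigma_y, rho = 1.0, -2.0, 2.0, 3.0, 0.5
pairs = sample_bivariate_normal(50_000, mu_x, mu_y, sigma_x, sigma_y, rho)
slope, intercept = least_squares_line(pairs)

theory_slope = rho * sigma_y / sigma_x         # 0.75
theory_intercept = mu_y - theory_slope * mu_x  # -2.75
print(slope, theory_slope)
print(intercept, theory_intercept)
```

With this many samples the fitted slope and intercept should land close to the theoretical values, though the exact digits depend on the random seed.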
Key Terms
- independent
- joint probability density function
- joint probability mass function
Worked Examples
Example 1: Discrete Joint Distribution
Joint PMF of (X, Y):
| X\Y | 1 | 2 | 3 |
|---|---|---|---|
| 0 | 0.1 | 0.1 | 0.0 |
| 1 | 0.2 | 0.3 | 0.3 |
Find: (a) marginal PMFs, (b) P(Y=2 | X=1), (c) E[Y | X=0], (d) Are X and Y independent?
Solution:
(a) p_X(0) = 0.1+0.1+0.0 = 0.2; p_X(1) = 0.2+0.3+0.3 = 0.8. p_Y(1) = 0.1+0.2 = 0.3; p_Y(2) = 0.1+0.3 = 0.4; p_Y(3) = 0.0+0.3 = 0.3.
(b) p_{Y|X}(2|1) = p(1,2)/p_X(1) = 0.3/0.8 = 0.375.
(c) Conditional PMF given X=0: p(1|0)=0.1/0.2=0.5, p(2|0)=0.5, p(3|0)=0. E[Y|X=0] = 1(0.5)+2(0.5)+3(0) = 1.5.
(d) Check: p(0,1) = 0.1, p_X(0)p_Y(1) = 0.2·0.3 = 0.06. Not equal, so dependent.
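The arithmetic in Example 1 can be reproduced with exact fractions. This is a small sketch; the variable names are chosen here for illustration.

```python
from fractions import Fraction

# Joint PMF from Example 1, keyed by (x, y); entries are given in tenths.
table = {(0, 1): 1, (0, 2): 1, (0, 3): 0, (1, 1): 2, (1, 2): 3, (1, 3): 3}
pmf = {k: Fraction(v, 10) for k, v in table.items()}

# (a) Marginals: sum the joint PMF over the other variable.
p_x = {x: sum(v for (a, _), v in pmf.items() if a == x) for x in (0, 1)}
p_y = {y: sum(v for (_, b), v in pmf.items() if b == y) for y in (1, 2, 3)}

# (b) P(Y = 2 | X = 1)
p_y2_given_x1 = pmf[(1, 2)] / p_x[1]

# (c) E[Y | X = 0]
e_y_given_x0 = sum(y * pmf[(0, y)] for y in (1, 2, 3)) / p_x[0]

# (d) Independence requires p(x, y) = p_X(x) p_Y(y) in every cell.
independent = all(pmf[(x, y)] == p_x[x] * p_y[y] for x in p_x for y in p_y)

print(p_x, p_y)
print(p_y2_given_x1, e_y_given_x0, independent)  # 3/8 3/2 False
```

The fractions 3/8 and 3/2 match the decimal answers 0.375 and 1.5 in parts (b) and (c).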
Example 2: Continuous Joint Distribution
Let f(x, y) = 2 for 0 < x < y < 1, zero otherwise.
(a) Verify it's a valid joint PDF. (b) Find marginal PDFs f_X(x) and f_Y(y). (c) Find P(X + Y < 1). (d) Find E[Y | X = 0.5].
Solution:
(a) ∫₀¹ ∫ₓ¹ 2 dy dx = ∫₀¹ 2(1−x) dx = 2[x−x²/2]₀¹ = 2(1−1/2) = 1 ✓. Non-negative on support ✓.
(b) f_X(x) = ∫ₓ¹ 2 dy = 2(1−x) for 0 < x < 1. f_Y(y) = ∫₀ʸ 2 dx = 2y for 0 < y < 1.
(c) P(X+Y < 1) = region where 0 < x < y, x+y < 1. For fixed x, y goes from x to 1−x, but only valid when x < 1−x, i.e., x < 0.5. So: P = ∫₀^{0.5} ∫ₓ^{1−x} 2 dy dx = ∫₀^{0.5} 2(1−2x) dx = 2[x−x²]₀^{0.5} = 2(0.5−0.25) = 0.5.
(d) f_{Y|X}(y|0.5) = f(0.5,y)/f_X(0.5) = 2/(2(1−0.5)) = 2/1 = 2 for 0.5 < y < 1. E[Y|X=0.5] = ∫_{0.5}¹ y·2 dy = [y²]_{0.5}¹ = 1 − 0.25 = 0.75.
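The integrals in Example 2 can be sanity-checked with a coarse midpoint grid. This is a sketch under assumptions: the helper names and the grid size `n=400` are choices made here. Because the density jumps along the line y = x, the grid estimates carry an error of roughly 1/n.

```python
def f(x, y):
    """Joint density from Example 2: 2 on the triangle 0 < x < y < 1."""
    return 2.0 if 0.0 < x < y < 1.0 else 0.0

def double_midpoint(g, n=400):
    """Midpoint-rule approximation of the integral of g over the unit square."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        for j in range(n):
            y = (j + 0.5) * h
            total += g(x, y)
    return total * h * h

def cond_mean_y(x, n=400):
    """E[Y | X = x] as a ratio of two one-dimensional midpoint sums."""
    h = 1.0 / n
    ys = [(j + 0.5) * h for j in range(n)]
    num = sum(y * f(x, y) for y in ys)
    den = sum(f(x, y) for y in ys)
    return num / den

total_mass = double_midpoint(f)
p_sum_below_1 = double_midpoint(lambda x, y: f(x, y) if x + y < 1.0 else 0.0)

print(total_mass)        # approximately 1
print(p_sum_below_1)     # approximately 0.5
print(cond_mean_y(0.5))  # approximately 0.75
```

A finer grid tightens the first two estimates; the conditional mean is less sensitive because the common step size cancels in the ratio.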
Example 3: Bivariate Normal
Let (X, Y) be bivariate normal with μ_X = 170, μ_Y = 65, σ_X = 10, σ_Y = 8, ρ = 0.6.
(a) Find P(Y > 70). (b) Find the conditional distribution of Y given X = 180. (c) Find E[2X − 3Y] and Var(2X − 3Y).
Solution:
(a) Y ~ N(65, 64). Z = (70−65)/8 = 5/8 = 0.625. P(Y > 70) = 1 − Φ(0.625) ≈ 0.266.
(b) Y|X=180 ~ N(μ_{Y|X}, σ²_{Y|X}). μ_{Y|X} = 65 + 0.6(8/10)(180−170) = 65 + 0.6·0.8·10 = 65 + 4.8 = 69.8. σ²_{Y|X} = 64(1−0.36) = 64·0.64 = 40.96 (so σ_{Y|X} = 6.4).
(c) E[2X−3Y] = 2·170 − 3·65 = 340 − 195 = 145. Var(2X−3Y) = 4·100 + 9·64 + 2·2·(−3)·0.6·10·8 = 400 + 576 − 576 = 400. So 2X−3Y ~ N(145, 400).
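The three answers in Example 3 can be reproduced with the standard library alone. In this sketch the standard normal CDF is written with `math.erf`; the function name `normal_cdf` is a helper introduced here.

```python
import math

def normal_cdf(z):
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

mu_x, mu_y, sigma_x, sigma_y, rho = 170.0, 65.0, 10.0, 8.0, 0.6

# (a) P(Y > 70) uses only the marginal Y ~ N(65, 64).
p_a = 1.0 - normal_cdf((70.0 - mu_y) / sigma_y)

# (b) Conditional mean and variance of Y given X = 180.
cond_mean = mu_y + rho * (sigma_y / sigma_x) * (180.0 - mu_x)
cond_var = sigma_y**2 * (1.0 - rho**2)

# (c) Mean and variance of aX + bY with a = 2, b = -3.
a, b = 2.0, -3.0
lin_mean = a * mu_x + b * mu_y
lin_var = a**2 * sigma_x**2 + b**2 * sigma_y**2 + 2 * a * b * rho * sigma_x * sigma_y

print(round(p_a, 3))                            # 0.266
print(round(cond_mean, 2), round(cond_var, 2))  # 69.8 40.96
print(round(lin_mean, 2), round(lin_var, 2))    # 145.0 400.0
```

In part (c) the negative coefficient on Y makes the covariance term negative, which is why the variance drops from 976 to 400.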
Quiz
Q1: For jointly distributed discrete random variables, the marginal PMF p_X(x) is obtained by:
A) Multiplying p_{X,Y}(x,y) by p_Y(y) B) Summing p_{X,Y}(x,y) over all values of y C) Integrating p_{X,Y}(x,y) over y D) Taking the derivative of the joint CDF
Correct: B)
- If you chose B: Correct! p_X(x) = Σ_y p_{X,Y}(x,y) — you "marginalize out" Y by summing its probabilities.
- If you chose A: Multiplying the joint PMF by p_Y(y) does not give a distribution of X. The product rule runs the other way: p_{X,Y}(x,y) = p_{X|Y}(x|y) p_Y(y).
- If you chose C: Integration is for continuous RVs; discrete RVs use summation.
- If you chose D: The marginal CDF is obtained from limits of the joint CDF, not its derivative.
Q2: For continuous X and Y to be independent, which condition must hold?
A) f_{X,Y}(x,y) = f_X(x) + f_Y(y) B) F_{X,Y}(x,y) = F_X(x) F_Y(y) for all x,y C) E[XY] = E[X]E[Y] D) Cov(X,Y) = 0
Correct: B)
- If you chose B: Correct! Independence means the joint CDF factors: F_{X,Y}(x,y) = F_X(x) F_Y(y). Equivalently, f_{X,Y}(x,y) = f_X(x) f_Y(y).
- If you chose A: Joint distributions multiply, not add, when independent.
- If you chose C: This is necessary but NOT sufficient for independence (except for multivariate normal).
- If you chose D: Zero covariance does not imply independence — counterexample: Y = X² with X symmetric about 0.
Q3: To find the marginal PDF f_X(x) from a joint PDF f_{X,Y}(x,y), you:
A) Integrate over x B) Integrate over y C) Set the other variable to its mean D) Take the partial derivative
Correct: B)
- If you chose B: Correct! f_X(x) = ∫ f_{X,Y}(x,y) dy — integrate out Y to get the marginal of X.
- If you chose A: That gives f_Y(y), the marginal of Y.
- If you chose C: This doesn't produce a valid marginal distribution.
- If you chose D: The marginal PDF is obtained by integration, not differentiation.
Q5: For the bivariate normal distribution, zero correlation implies:
A) Nothing about independence B) Independence C) Identical distributions D) The variables are uncorrelated but dependent
Correct: B)
- If you chose B: Correct! For jointly normal variables, uncorrelated implies independent: when ρ = 0 the joint PDF factors into the product of the two marginal PDFs.
- If you chose A: This is true for general distributions, but for bivariate normal specifically, zero correlation implies independence.
- If you chose C: Correlation and identical distributions are unrelated concepts.
- If you chose D: For the bivariate normal, zero correlation actually implies full independence.
Practice Problems
- For the joint PMF: p(0,0)=0.4, p(0,1)=0.1, p(1,0)=0.2, p(1,1)=0.3. Find marginals, check independence, and find E[XY].
- Let f(x,y) = (3/2)(x² + y²) for 0 < x < 1, 0 < y < 1. Verify it's valid, find marginals, and compute P(X < 0.5).
- For f(x,y) = 4xy for 0 < x < 1, 0 < y < 1: find f_{Y|X}(y|x) and E[Y | X = x].
- If (X, Y) is bivariate normal with μ_X=0, μ_Y=0, σ_X=1, σ_Y=1, ρ=0.5, find P(Y > 1 | X = 0.5).
- Prove the law of total expectation for discrete RVs: E[Y] = Σ_x E[Y | X=x] p_X(x).
- Show that if X and Y are independent, then f_{Y|X}(y|x) = f_Y(y).
- Let f(x, y) = e^{−(x+y)} for x > 0, y > 0. Find P(X < Y) and show X and Y are independent.