
Phase 10: Probability Theory

Subject 10-01: Probability Foundations

Prerequisites: Phases 1-9 (through Advanced Linear Algebra) — set theory, basic combinatorics, measure-theoretic intuition


Learning Objectives

  1. Define sample spaces, events as subsets, and the σ-algebra structure for countable settings
  2. State and apply Kolmogorov's three axioms of probability and prove basic consequences (complement rule, monotonicity, bounds)
  3. Compute probabilities using the addition rule for disjoint events and the inclusion-exclusion principle for overlapping events
  4. Interpret probability as a measure on a measurable space and recognize limits of countable additivity
  5. Distinguish between equally-likely-outcome counting, relative frequency, and subjective interpretations of probability

Core Content

1. Sample Spaces and Events

A probability experiment is any process with an uncertain outcome. The sample space Ω is the set of all possible outcomes.

Examples:

  - Coin toss: Ω = {H, T}
  - Two coin tosses: Ω = {HH, HT, TH, TT}
  - Roll of a die: Ω = {1, 2, 3, 4, 5, 6}
  - Lifetime of a lightbulb (continuous): Ω = [0, ∞)

An event is a subset of the sample space: A ⊆ Ω. An event "occurs" if the actual outcome ω ∈ A.

Special events:

  - Certain event: Ω itself (always occurs)
  - Impossible event: ∅ (never occurs)
  - Elementary event: {ω} for a single outcome ω

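A σ-algebra F on Ω is a collection of subsets of Ω that contains Ω and is closed under complements and countable unions; its members are the events to which probabilities are assigned. For a finite or countably infinite Ω, we typically take F to be the power set ℘(Ω), so every subset is an event.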

Set operations on events:

  - Union A ∪ B: "A or B occurs" (or both)
  - Intersection A ∩ B: "A and B both occur"
  - Complement Aᶜ = Ω \ A: "A does not occur"
  - Difference A \ B = A ∩ Bᶜ: "A occurs but B does not"

De Morgan's Laws:

$(A ∪ B)ᶜ = Aᶜ ∩ Bᶜ
(A ∩ B)ᶜ = Aᶜ ∪ Bᶜ
$

Events A and B are mutually exclusive (disjoint) if A ∩ B = ∅ — they cannot occur simultaneously.
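The set operations above map directly onto Python's built-in set type. The following is a minimal sketch for the two-coin-toss sample space; the events A and B and the helper name `complement` are illustrative choices, not part of the lesson.

```python
# Sample space for two coin tosses; events are plain Python sets (subsets of omega).
omega = {"HH", "HT", "TH", "TT"}
A = {"HH", "HT"}          # illustrative event: first toss is heads
B = {"HH", "TH"}          # illustrative event: second toss is heads


def complement(event, space=omega):
    """Return the complement of `event` relative to `space`."""
    return space - event


union = A | B              # "A or B occurs"
intersection = A & B       # "A and B both occur"
difference = A - B         # "A occurs but B does not"

# De Morgan's laws hold for these sets.
assert complement(A | B) == complement(A) & complement(B)
assert complement(A & B) == complement(A) | complement(B)

# The difference A \ B equals A intersected with the complement of B.
assert difference == A & complement(B)

# {HH} and {TT} are mutually exclusive: their intersection is empty.
assert {"HH"} & {"TT"} == set()
```

Checking the identities on one pair of events is a sanity check only; the laws themselves are proved from the definitions of union, intersection, and complement.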

2. Kolmogorov's Axioms (1933)

A probability measure P is a function P: F → ℝ on the σ-algebra F of events satisfying:

Axiom 1 (Non-negativity): P(A) ≥ 0 for every event A.

Axiom 2 (Normalization): P(Ω) = 1.

Axiom 3 (Countable additivity): If A₁, A₂, ... are pairwise disjoint events (Aᵢ ∩ Aⱼ = ∅ for i ≠ j), then:

$P(⋃_{i=1}^{∞} Aᵢ) = Σ_{i=1}^{∞} P(Aᵢ)
$

These three axioms are the foundation of all probability theory. Everything else is derived from them.

Immediate consequences (theorems, not axioms):

Theorem 1 (Complement rule): P(Aᶜ) = 1 − P(A)

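Proof: A and Aᶜ are disjoint, and A ∪ Aᶜ = Ω. Finite additivity is a special case of Axiom 3: taking every Aᵢ = ∅ gives P(∅) = P(∅) + P(∅) + ..., which forces P(∅) = 0, so a finite disjoint family can be padded with empty sets without changing either side. Hence P(A) + P(Aᶜ) = P(Ω) = 1, and P(Aᶜ) = 1 − P(A). ∎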

Theorem 2 (Probability of impossible event): P(∅) = 0

Proof: ∅ = Ωᶜ, so P(∅) = 1 − P(Ω) = 1 − 1 = 0. ∎

Theorem 3 (Monotonicity): If A ⊆ B, then P(A) ≤ P(B)

Proof: B = A ∪ (B \ A), and A ∩ (B \ A) = ∅. So P(B) = P(A) + P(B \ A) ≥ P(A) since P(B \ A) ≥ 0. ∎

Theorem 4 (Bounds): 0 ≤ P(A) ≤ 1 for all A

Proof: ∅ ⊆ A ⊆ Ω, so by monotonicity, 0 = P(∅) ≤ P(A) ≤ P(Ω) = 1. ∎
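On a finite sample space the axioms and Theorems 1 to 4 can be checked exhaustively. The sketch below assumes a hypothetical loaded die whose weights are invented purely for illustration, and takes the power set as the σ-algebra; `prob` and `all_events` are names introduced here.

```python
from fractions import Fraction
from itertools import chain, combinations

# Hypothetical loaded die: these weights are an assumption made for illustration.
weights = {1: Fraction(1, 10), 2: Fraction(1, 10), 3: Fraction(1, 10),
           4: Fraction(1, 10), 5: Fraction(1, 10), 6: Fraction(1, 2)}
omega = frozenset(weights)


def prob(event):
    """P(event) as the sum of the weights of the outcomes in `event`."""
    return sum((weights[w] for w in event), Fraction(0))


def all_events(space):
    """Every subset of `space`, i.e. the power set used as the sigma-algebra."""
    items = list(space)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))
    return [frozenset(s) for s in subsets]


events = all_events(omega)

assert prob(omega) == 1                  # Axiom 2: normalization
assert prob(frozenset()) == 0            # Theorem 2: impossible event

for A in events:
    assert prob(A) >= 0                              # Axiom 1
    assert prob(omega - A) == 1 - prob(A)            # Theorem 1: complement rule
    assert 0 <= prob(A) <= 1                         # Theorem 4: bounds
    for B in events:
        if A <= B:
            assert prob(A) <= prob(B)                # Theorem 3: monotonicity
        if not (A & B):
            assert prob(A | B) == prob(A) + prob(B)  # finite additivity
```

Exact `Fraction` arithmetic is used so that the equalities are checked without floating-point rounding. A passing run confirms the theorems for this one measure only; the proofs above are what establish them in general.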

3. Addition Rule and Inclusion-Exclusion

For disjoint events A and B:

P(A ∪ B) = P(A) + P(B)    (finite additivity from Axiom 3)

For arbitrary events (not necessarily disjoint), we must avoid double-counting A ∩ B:

Addition rule (two events):

$P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
$

Derivation: Write A ∪ B as the disjoint union (A \ B) ∪ (B \ A) ∪ (A ∩ B). Then:

$P(A ∪ B) = P(A \ B) + P(B \ A) + P(A ∩ B)
         = [P(A) − P(A ∩ B)] + [P(B) − P(A ∩ B)] + P(A ∩ B)
         = P(A) + P(B) − P(A ∩ B)
$

Inclusion-exclusion (three events):

$P(A ∪ B ∪ C) = P(A) + P(B) + P(C)
               − P(A ∩ B) − P(A ∩ C) − P(B ∩ C)
               + P(A ∩ B ∩ C)
$

General inclusion-exclusion (n events):

$P(⋃_{i=1}^{n} Aᵢ) = Σ_{i} P(Aᵢ) − Σ_{i<j} P(Aᵢ ∩ Aⱼ) + Σ_{i<j<k} P(Aᵢ ∩ Aⱼ ∩ Aₖ) − ... + (−1)^{n+1} P(A₁ ∩ ... ∩ Aₙ)
$
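The general formula can be sketched in a few lines of Python for the equally-likely case, where each probability is a ratio of set sizes. The function name `union_prob_inclusion_exclusion` and the choice of integers 1 to 30 with divisibility events are assumptions made for this illustration.

```python
from fractions import Fraction
from itertools import combinations


def union_prob_inclusion_exclusion(events, omega):
    """P(union of events) for equally likely outcomes, via inclusion-exclusion."""
    n = len(events)
    total = Fraction(0)
    for k in range(1, n + 1):
        sign = (-1) ** (k + 1)            # + for odd k, - for even k
        for group in combinations(events, k):
            common = set.intersection(*group)   # k-fold intersection
            total += sign * Fraction(len(common), len(omega))
    return total


# Illustrative events on the integers 1..30.
omega = set(range(1, 31))
A = {x for x in omega if x % 2 == 0}
B = {x for x in omega if x % 3 == 0}
C = {x for x in omega if x % 5 == 0}

via_formula = union_prob_inclusion_exclusion([A, B, C], omega)
via_direct = Fraction(len(A | B | C), len(omega))

# The alternating sum agrees with counting the union directly.
assert via_formula == via_direct
```

The loop visits every non-empty subfamily of the events, so the cost grows as 2ⁿ; this is fine for a handful of events but is not meant as an efficient method for large n.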

4. Probability as Measure

Probability is a special case of a measure: a normalized measure where the total measure of the space is 1. This connection to measure theory unifies discrete and continuous probability.

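Common Pitfall: Countable additivity does NOT imply uncountable additivity. You cannot, in general, obtain the probability of an uncountable disjoint union by summing the probabilities of its pieces. For example, under the uniform distribution on [0, 1] every singleton {x} has probability 0, yet the union of all singletons is [0, 1], which has probability 1.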

5. Equally Likely Outcomes (Classical Probability)

When all outcomes are equally likely and Ω is finite:

$P(A) = |A| / |Ω| = (number of favorable outcomes) / (total number of outcomes)
$

This reduces probability to counting. Used extensively in combinatorics problems (cards, dice, lotteries).

Example: Rolling two fair dice. Ω has 36 equally likely ordered pairs. Event "sum = 7" has 6 favorable outcomes: {(1,6),(2,5),(3,4),(4,3),(5,2),(6,1)}. So P(sum = 7) = 6/36 = 1/6.

Edge case: The "equally likely" assumption must be justified. It fails for biased coins, weighted dice, or non-uniform distributions.
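The two-dice example can be reproduced by enumerating the sample space and counting. This is a minimal sketch that assumes fair dice, so that all 36 ordered pairs are equally likely; the helper name `classical_prob` is introduced here for illustration.

```python
from fractions import Fraction
from itertools import product

# All ordered pairs (first die, second die): 36 outcomes.
omega = list(product(range(1, 7), repeat=2))


def classical_prob(predicate, space=omega):
    """|A| / |Omega| where A is the set of outcomes satisfying `predicate`.

    Valid only under the assumption that all outcomes are equally likely.
    """
    favorable = sum(1 for outcome in space if predicate(outcome))
    return Fraction(favorable, len(space))


p_sum_7 = classical_prob(lambda roll: roll[0] + roll[1] == 7)
print(p_sum_7)   # 1/6
```

For a biased die this counting shortcut no longer applies, and each outcome would need its own weight.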



Key Terms

Worked Examples

Example 1: Applying the Axioms

In a sample space Ω, P(A) = 0.4, P(B) = 0.3, P(A ∩ B) = 0.1. Find: (a) P(Aᶜ) (b) P(A ∪ B) (c) P(A ∩ Bᶜ) (d) P(Aᶜ ∩ Bᶜ)

Solution:

(a) P(Aᶜ) = 1 − P(A) = 1 − 0.4 = 0.6

(b) P(A ∪ B) = P(A) + P(B) − P(A ∩ B) = 0.4 + 0.3 − 0.1 = 0.6

(c) A ∩ Bᶜ = A \ B, and A = (A ∩ B) ∪ (A ∩ Bᶜ) with disjoint union. So P(A) = P(A ∩ B) + P(A ∩ Bᶜ) → P(A ∩ Bᶜ) = 0.4 − 0.1 = 0.3

(d) By De Morgan: Aᶜ ∩ Bᶜ = (A ∪ B)ᶜ, so P(Aᶜ ∩ Bᶜ) = 1 − P(A ∪ B) = 1 − 0.6 = 0.4

Verify: A ∪ B and Aᶜ ∩ Bᶜ are complementary, so their probabilities must sum to 1: P(A) + P(B) − P(A ∩ B) + P(Aᶜ ∩ Bᶜ) = 0.4 + 0.3 − 0.1 + 0.4 = 1.0. ✓
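One way to double-check answers like these is to split Ω into the four disjoint pieces determined by A and B and confirm that they sum to 1. The sketch below uses exact fractions for the values given in Example 1; the dictionary name `atoms` and its keys are labels chosen here.

```python
from fractions import Fraction

# Values given in Example 1.
p_A, p_B, p_AB = Fraction(4, 10), Fraction(3, 10), Fraction(1, 10)

# The four disjoint pieces that partition the sample space.
atoms = {
    "A and B": p_AB,
    "A only": p_A - p_AB,                  # part (c): P(A ∩ Bᶜ)
    "B only": p_B - p_AB,
    "neither": 1 - (p_A + p_B - p_AB),     # part (d): P(Aᶜ ∩ Bᶜ)
}

# The pieces are disjoint and cover the whole space, so they sum to 1.
assert sum(atoms.values()) == 1

# Part (b): the union is everything except "neither".
p_union = atoms["A and B"] + atoms["A only"] + atoms["B only"]
assert p_union == 1 - atoms["neither"]
```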


Example 2: Inclusion-Exclusion with Three Events

In a survey of 100 students:

  - 60 study math (M)
  - 45 study physics (P)
  - 35 study chemistry (C)
  - 25 study both math and physics
  - 20 study both math and chemistry
  - 15 study both physics and chemistry
  - 8 study all three

How many study at least one subject? Exactly two subjects?

Solution:

P(at least one) = P(M) + P(P) + P(C) − P(M∩P) − P(M∩C) − P(P∩C) + P(M∩P∩C) = 0.60 + 0.45 + 0.35 − 0.25 − 0.20 − 0.15 + 0.08 = 0.88

So 88 students study at least one subject.

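For exactly two subjects, add the pairwise counts and subtract the triple overlap three times, because each pairwise count includes the 8 students who study all three: |M∩P| + |M∩C| + |P∩C| − 3·|M∩P∩C| = 25 + 20 + 15 − 3(8) = 60 − 24 = 36 students.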
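The survey arithmetic can be cross-checked by breaking the Venn diagram into its regions. This is a small sketch using the counts from Example 2; the variable names are chosen here for readability.

```python
# Counts from the survey in Example 2.
total = 100
M, P, C = 60, 45, 35
MP, MC, PC = 25, 20, 15
MPC = 8

# Inclusion-exclusion on counts.
at_least_one = M + P + C - MP - MC - PC + MPC

# Each pairwise overlap minus the students who are in all three.
exactly_two = (MP - MPC) + (MC - MPC) + (PC - MPC)

# Students in exactly one subject, computed region by region.
only_M = M - MP - MC + MPC
only_P = P - MP - PC + MPC
only_C = C - MC - PC + MPC
exactly_one = only_M + only_P + only_C

# The regions are disjoint and together make up "at least one".
assert exactly_one + exactly_two + MPC == at_least_one

none = total - at_least_one
```

The final assertion is the useful check: if any single count were mistyped, the region totals would generally stop adding up.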


Example 3: Showing P(A \ B) = P(A) − P(A ∩ B)

Proof using axioms:

Write A as the disjoint union A = (A ∩ B) ∪ (A \ B). These are disjoint because (A ∩ B) ∩ (A \ B) = A ∩ B ∩ A ∩ Bᶜ = A ∩ (B ∩ Bᶜ) = ∅.

By finite additivity: P(A) = P(A ∩ B) + P(A \ B). Therefore P(A \ B) = P(A) − P(A ∩ B). ∎


Quiz

Q1: Which of the following is NOT one of Kolmogorov's three axioms of probability?

A) P(A) ≥ 0 for all events A B) P(Ω) = 1 C) P(Aᶜ) = 1 − P(A) D) Countable additivity for disjoint events

Correct: C)


Q2: If P(A) = 0.3 and P(B) = 0.5, and A and B are mutually exclusive (disjoint), what is P(A ∪ B)?

A) 0.2 B) 0.65 C) 0.8 D) 0.15

Correct: C)


Q3: By De Morgan's Law, (A ∪ B)ᶜ is equal to:

A) Aᶜ ∪ Bᶜ B) Aᶜ ∩ Bᶜ C) A ∩ B D) (A ∩ B) ∪ (Aᶜ ∩ Bᶜ)

Correct: B)


Q4: In the inclusion-exclusion formula for three events A, B, C, what is the sign of the P(A ∩ B ∩ C) term?

A) Positive B) Negative C) It depends on whether the events are disjoint D) Zero

Correct: A)


Q5: If P(A) = 0.6 and P(B) = 0.5, what is the maximum possible value of P(A ∩ B)?

A) 0.1 B) 0.5 C) 0.6 D) 1.0

Correct: B)


Q6: Which of the following is a consequence of Kolmogorov's axioms?

A) P(A ∪ B) = P(A) + P(B) for all events B) If A ⊆ B, then P(A) ≤ P(B) C) P(A ∩ B) = P(A)P(B) D) P(A | B) = P(B | A)

Correct: B)


Q7: A fair coin is tossed 3 times. What is P(exactly 2 heads)?

A) 1/8 B) 3/8 C) 1/2 D) 3/4

Correct: B)


Q8: P(∅) = 0 is:

A) An axiom of probability B) A theorem derived from the axioms C) Only true for finite sample spaces D) True only when ∅ is the empty set

Correct: B)


Practice Problems

  1. If P(A) = 0.5, P(B) = 0.4, and P(A ∩ B) = 0.2, compute P(A ∪ B), P(Aᶜ), P(Bᶜ), and P(Aᶜ ∩ B).

  2. Prove that P(A ∩ B) ≥ P(A) + P(B) − 1. (This is Bonferroni's inequality.)

  3. A fair coin is tossed 3 times. List the sample space Ω. Find the probability of: (a) exactly 2 heads, (b) at least 1 head, (c) no heads.

  4. For three events A, B, C, derive the formula for P(A ∪ B ∪ C) by applying the two-event addition rule twice.

  5. A card is drawn from a standard 52-card deck. Find: (a) P(heart or king), (b) P(face card or ace), (c) P(red or spade).

  6. Show that if P(A) = 0, then P(A ∩ B) = 0 for any event B.

  7. Use inclusion-exclusion to find the probability that a randomly chosen integer from 1 to 100 is divisible by 2, 3, or 5.

Answers

  1. P(A ∪ B) = 0.5 + 0.4 − 0.2 = 0.7; P(Aᶜ) = 0.5; P(Bᶜ) = 0.6; P(Aᶜ ∩ B) = P(B) − P(A ∩ B) = 0.4 − 0.2 = 0.2.

  2. P(A ∪ B) = P(A) + P(B) − P(A ∩ B) ≤ 1. Rearranging: P(A ∩ B) ≥ P(A) + P(B) − 1.

  3. Ω = {HHH, HHT, HTH, THH, HTT, THT, TTH, TTT} (8 outcomes). (a) 3/8, (b) 7/8, (c) 1/8.

  4. P(A ∪ B ∪ C) = P((A ∪ B) ∪ C) = P(A ∪ B) + P(C) − P((A ∪ B) ∩ C) = P(A) + P(B) − P(A∩B) + P(C) − P((A∩C) ∪ (B∩C)) = P(A) + P(B) + P(C) − P(A∩B) − P(A∩C) − P(B∩C) + P(A∩B∩C).

  5. (a) Hearts (13) + Kings (4) − King of hearts (1) = 16/52 = 4/13. (b) Face cards (12) + Aces (4) = 16/52 = 4/13 (disjoint). (c) Red (26) + Spades (13) = 39/52 = 3/4 (disjoint: spades are black).

  6. Since A ∩ B ⊆ A, by monotonicity P(A ∩ B) ≤ P(A) = 0, so P(A ∩ B) = 0.

  7. Let A = {div by 2}, B = {div by 3}, C = {div by 5}. |A| = 50, |B| = 33, |C| = 20. |A∩B| = 16 (div by 6), |A∩C| = 10 (div by 10), |B∩C| = 6 (div by 15), |A∩B∩C| = 3 (div by 30). Inclusion-exclusion: 50 + 33 + 20 − 16 − 10 − 6 + 3 = 74. P = 74/100 = 0.74.
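Answers 5 and 7 are counting problems, so they can be verified by brute-force enumeration. The sketch below assumes a standard deck represented as (rank, suit) pairs and treats J, Q, K as the face cards; the helper name `card_prob` and the string labels are choices made for this illustration.

```python
from fractions import Fraction
from itertools import product

# Problem 7: integers 1..100 divisible by 2, 3, or 5.
numbers = range(1, 101)
hits = [n for n in numbers if n % 2 == 0 or n % 3 == 0 or n % 5 == 0]
p_div = Fraction(len(hits), len(numbers))

# Problem 5: a standard 52-card deck as (rank, suit) pairs.
ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = list(product(ranks, suits))


def card_prob(predicate):
    """Fraction of cards satisfying `predicate`, assuming a uniformly random draw."""
    return Fraction(sum(1 for card in deck if predicate(card)), len(deck))


p_heart_or_king = card_prob(lambda c: c[1] == "hearts" or c[0] == "K")
p_face_or_ace = card_prob(lambda c: c[0] in {"J", "Q", "K", "A"})
p_red_or_spade = card_prob(lambda c: c[1] in {"hearts", "diamonds", "spades"})
```

Counting the union directly, as done here, avoids inclusion-exclusion altogether, which makes it an independent check on the hand computation.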

Summary


Pitfalls


Quiz

  1. Which of the following is NOT one of Kolmogorov's axioms? a) P(Ω) = 1 b) P(Aᶜ) = 1 − P(A) c) P(A) ≥ 0 for all events A d) Countable additivity for disjoint events Answer: b. The complement rule is a theorem, not an axiom.

  2. If A and B are disjoint and P(A) = 0.3, P(B) = 0.5, what is P(A ∪ B)? a) 0.8 b) 0.65 c) 0.15 d) Cannot be determined Answer: a. For disjoint events, P(A ∪ B) = P(A) + P(B) = 0.8.

  3. If P(A) = 0.6 and P(B) = 0.5, what is the maximum possible value of P(A ∩ B)? a) 0.1 b) 0.5 c) 0.6 d) 1.0 Answer: b. A ∩ B ⊆ B, so P(A ∩ B) ≤ P(B) = 0.5. The maximum is 0.5 (when B ⊆ A).

  4. De Morgan's law states that (A ∪ B)ᶜ equals: a) Aᶜ ∪ Bᶜ b) Aᶜ ∩ Bᶜ c) A ∩ B d) (A ∩ B)ᶜ Answer: b. (A ∪ B)ᶜ = Aᶜ ∩ Bᶜ.

  5. In inclusion-exclusion for n events, the sign of the k-fold intersection term is: a) Always positive b) (−1)^{k+1} c) (−1)^{k} d) Positive for even k, negative for odd k Answer: b. The general term for a k-way intersection has sign (−1)^{k+1} — positive for k=1, negative for k=2, etc.

  6. If P(A) = 0, which must be true? a) A = ∅ b) P(A ∪ B) = P(B) for any B c) A is impossible d) Both b and c Answer: b. P(A) = 0 does not imply A = ∅ (consider a continuous random variable equaling a specific value). But P(A ∪ B) = P(A) + P(B) − P(A ∩ B) = 0 + P(B) − 0 = P(B).

  7. For a fair die roll, what is P(outcome ≤ 3 or even)? a) 1/2 b) 2/3 c) 5/6 d) 1 Answer: c. A = {1,2,3}, B = {2,4,6}. A ∩ B = {2}. P = 3/6 + 3/6 − 1/6 = 5/6.

  8. True or False: If P(A ∪ B) = P(A) + P(B), then A and B must be disjoint. a) True b) False — they could overlap with P(A ∩ B) = 0 Answer: b. P(A ∪ B) = P(A) + P(B) − P(A ∩ B). So P(A ∪ B) = P(A) + P(B) iff P(A ∩ B) = 0, which does not require A ∩ B = ∅.


Next Steps

Continue to 10-02 Conditional Probability to learn about P(A|B), the multiplication rule, the law of total probability, and Bayes' theorem.