
### 12.10 ANOVA

Phase: Statistics

Prerequisites: 12-08-common-tests, 12-02-sampling-sampling-distributions, 08-08-eigenvalues-eigenvectors

Learning Objectives

By the end of this subject, you will be able to:

  1. State the null and alternative hypotheses for one-way ANOVA
  2. Decompose total variation into between-group and within-group components
  3. Compute the F-statistic and conduct an ANOVA test
  4. Understand when and why post-hoc tests are needed
  5. Describe the conceptual framework of two-way ANOVA

Core Content

โš ๏ธ CRITICAL: Why ANOVA Instead of Multiple t-tests?

If you have 3 groups and run 3 pairwise t-tests at $\alpha = 0.05$ each, the probability of at least one Type I error is roughly $1 - (0.95)^3 \approx 0.143$ (treating the tests as independent), nearly triple the nominal rate! With 5 groups (10 tests), it is about $1 - (0.95)^{10} \approx 0.40$.

ANOVA provides a single omnibus test of the question "Do ANY of the group means differ?", which controls the family-wise error rate.
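The growth of the error rate can be tabulated with a few lines of Python. This is a minimal sketch that assumes the pairwise tests are independent, which is only an approximation because pairwise t-tests share data; `familywise_error_rate` is an illustrative name, not a library function.

```python
from math import comb

def familywise_error_rate(k: int, alpha: float = 0.05) -> float:
    """Approximate P(at least one Type I error) over all pairwise tests.

    Assumes the comb(k, 2) tests are independent (a simplification).
    """
    m = comb(k, 2)  # number of pairwise comparisons among k groups
    return 1 - (1 - alpha) ** m

for k in (3, 5, 10):
    print(k, comb(k, 2), round(familywise_error_rate(k), 3))
```

For $k = 3$ and $k = 5$ this reproduces the 0.143 and 0.40 figures quoted above.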

One-Way ANOVA: The Model

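$Y_{ij} = \mu + \alpha_j + \epsilon_{ij}$, where $Y_{ij}$ is the $i$-th observation in group $j$, $\mu$ is the overall mean, $\alpha_j$ is the effect of group $j$, and $\epsilon_{ij} \sim N(0, \sigma^2)$ independently.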

Hypotheses: $H_0: \alpha_1 = \alpha_2 = \cdots = \alpha_k = 0$ (all group means equal) vs $H_a$: at least one $\alpha_j \neq 0$.

The F-Statistic

ANOVA partitions total variation into between-group and within-group:

$$\text{SST} = \text{SSB} + \text{SSW}$$
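For reference, the three sums of squares are usually defined as follows. The notation is assumed here to match the worked examples: $n_j$ is the size of group $j$, $\bar{Y}_j$ is its sample mean, and $\bar{Y}$ is the grand mean of all $N$ observations.

$$\text{SST} = \sum_{j=1}^{k}\sum_{i=1}^{n_j} (Y_{ij} - \bar{Y})^2, \qquad \text{SSB} = \sum_{j=1}^{k} n_j(\bar{Y}_j - \bar{Y})^2, \qquad \text{SSW} = \sum_{j=1}^{k}\sum_{i=1}^{n_j} (Y_{ij} - \bar{Y}_j)^2$$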

Mean squares:

- $\text{MSB} = \text{SSB} / (k-1)$
- $\text{MSW} = \text{SSW} / (N-k)$

$$F = \frac{\text{MSB}}{\text{MSW}} \sim F_{k-1, N-k} \quad \text{under } H_0$$

Intuition: If $H_0$ is true, both MSB and MSW estimate the same $\sigma^2$, so $F \approx 1$. If $H_0$ is false, MSB also picks up the group effects and tends to overestimate $\sigma^2$, so $F$ tends to be larger than 1.
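This intuition can be made precise through the expected mean squares. Under the model's assumptions, and writing $\mu_j = \mu + \alpha_j$ and $\bar{\mu} = \frac{1}{N}\sum_j n_j \mu_j$ (symbols introduced here for illustration only):

$$E[\text{MSW}] = \sigma^2, \qquad E[\text{MSB}] = \sigma^2 + \frac{1}{k-1}\sum_{j=1}^{k} n_j(\mu_j - \bar{\mu})^2$$

The extra term in $E[\text{MSB}]$ vanishes exactly when all group means are equal, which is why only large values of $F$ count as evidence against $H_0$.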

Decision: Reject $H_0$ if $F > F_{\alpha, k-1, N-k}$ or if p-value $< \alpha$.
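The computation can be sketched in plain Python as below. The function name `one_way_anova`, the shape of the returned dictionary, and the sample data are all assumptions made for illustration; only the standard library is used, so no p-value is computed.

```python
from statistics import mean

def one_way_anova(groups):
    """Compute the one-way ANOVA quantities for a list of numeric samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [mean(g) for g in groups]

    # Between-group variation: group means around the grand mean, weighted by size.
    ssb = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # Within-group variation: observations around their own group mean.
    ssw = sum((y - m) ** 2 for g, m in zip(groups, group_means) for y in g)

    msb = ssb / (k - 1)
    msw = ssw / (n_total - k)
    return {"df": (k - 1, n_total - k), "SSB": ssb, "SSW": ssw,
            "MSB": msb, "MSW": msw, "F": msb / msw}

# Hypothetical data, chosen only to exercise the function.
result = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(result["df"], round(result["F"], 2))
```

The returned $F$ would then be compared with a critical value $F_{\alpha, k-1, N-k}$ taken from a table or a statistics library.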

ANOVA Table

| Source | df | SS | MS | F |
|---|---|---|---|---|
| Between | $k-1$ | SSB | MSB = SSB/($k-1$) | MSB/MSW |
| Within | $N-k$ | SSW | MSW = SSW/($N-k$) | |
| Total | $N-1$ | SST | | |

โš ๏ธ CRITICAL: ANOVA Assumptions

  1. Independence: observations within and between groups
  2. Normality: residuals within each group are approximately normal
  3. Homogeneity of variance: $\sigma_1^2 = \sigma_2^2 = \cdots = \sigma_k^2$ (equal variances across groups)

ANOVA is fairly robust to moderate violations of normality (for balanced designs), but sensitive to variance inequality when group sizes differ.
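A rough first check of the equal-variance assumption is to compare the sample variances directly. The sketch below uses the teaching-methods data from Example 1 of this subject; `variance_ratio` is an illustrative helper. A commonly quoted rule of thumb treats a largest-to-smallest variance ratio below about 4 as acceptable, but that cut-off is a convention rather than a formal test (Levene's test is one formal alternative).

```python
from statistics import variance

def variance_ratio(groups):
    """Ratio of the largest to the smallest sample variance across groups."""
    variances = [variance(g) for g in groups]  # unbiased (n - 1) estimator
    return max(variances) / min(variances)

# Scores for Methods A, B and C from Example 1.
groups = [[78, 82, 80, 79], [72, 75, 74, 73], [68, 70, 69, 67]]
ratio = variance_ratio(groups)
print(round(ratio, 2))
```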

Post-Hoc Tests

If ANOVA rejects $H_0$, we know the group means are not all equal, but not WHICH one(s) differ. Post-hoc tests answer this while controlling the family-wise error rate.

Tukey's HSD (Honestly Significant Difference): Compares all pairwise differences. A difference is significant if:

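$$|\bar{Y}_j - \bar{Y}_{\ell}| > q_{\alpha, k, N-k} \cdot \sqrt{\frac{\text{MSW}}{2}\left(\frac{1}{n_j} + \frac{1}{n_{\ell}}\right)}$$

where $q_{\alpha, k, N-k}$ is the upper-$\alpha$ critical value of the studentised range distribution for $k$ groups and $N-k$ error degrees of freedom.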
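The right-hand side of the criterion is easy to evaluate once the critical value is known. In this sketch the critical value is passed in by hand, because the Python standard library has no studentised range distribution; `tukey_threshold` is an illustrative name, and the inputs (an approximate table value of 3.95 for $\alpha = 0.05$, $k = 3$, 9 error degrees of freedom, with $\text{MSW} = 2.083$ and groups of size 4) are assumed values.

```python
from math import sqrt

def tukey_threshold(q_crit, msw, n_j, n_l):
    """Smallest absolute mean difference declared significant by Tukey's criterion."""
    return q_crit * sqrt((msw / 2) * (1 / n_j + 1 / n_l))

# Illustrative inputs: q_crit is an approximate table lookup, not computed here.
threshold = tukey_threshold(q_crit=3.95, msw=2.083, n_j=4, n_l=4)
print(round(threshold, 2))
```

Any pair of group means further apart than this threshold would be declared significantly different.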

Bonferroni correction: The simplest method. Divide $\alpha$ by the number of comparisons and test each pair at that level. Conservative but valid.

Two-Way ANOVA (Conceptual)

When there are TWO categorical factors (e.g., drug type AND dosage):

$Y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \epsilon_{ijk}$

The F-test for interaction tests whether the effect of one factor is consistent across levels of the other. If interaction is significant, main effects must be interpreted with caution.



Key Terms

Worked Examples

Example 1: One-way ANOVA computation

Three teaching methods tested on students:

| Method A | Method B | Method C |
|---|---|---|
| 78 | 72 | 68 |
| 82 | 75 | 70 |
| 80 | 74 | 69 |
| 79 | 73 | 67 |

$n_A = n_B = n_C = 4$, $N = 12$, $k = 3$

Group means: $\bar{Y}_A = 79.75$, $\bar{Y}_B = 73.5$, $\bar{Y}_C = 68.5$

Grand mean: $\bar{Y} = (79.75 + 73.5 + 68.5)/3 = 221.75/3 = 73.917$ (averaging the group means is valid here because the groups are the same size)

SSB: $4[(79.75-73.917)^2 + (73.5-73.917)^2 + (68.5-73.917)^2] = 4[34.028 + 0.174 + 29.340] = 4 \cdot 63.542 = 254.17$

SSW:

- Method A: $(78-79.75)^2 + (82-79.75)^2 + (80-79.75)^2 + (79-79.75)^2 = 3.0625+5.0625+0.0625+0.5625=8.75$
- Method B: $(72-73.5)^2 + (75-73.5)^2 + (74-73.5)^2 + (73-73.5)^2 = 2.25+2.25+0.25+0.25=5.0$
- Method C: $(68-68.5)^2 + (70-68.5)^2 + (69-68.5)^2 + (67-68.5)^2 = 0.25+2.25+0.25+2.25=5.0$

SSW = 8.75 + 5.0 + 5.0 = 18.75

| Source | df | SS | MS | F |
|---|---|---|---|---|
| Between | 2 | 254.17 | 127.08 | $127.08/2.083 \approx 61.0$ |
| Within | 9 | 18.75 | 2.083 | |
| Total | 11 | 272.92 | | |

$F_{0.05, 2, 9} \approx 4.26$. Since $61.0 > 4.26$, we reject $H_0$: the mean scores of the teaching methods are not all equal.
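The hand calculation rounds intermediate values, so its last digits can differ slightly from the exact result. As a check, the sketch below redoes Example 1 with exact rational arithmetic from Python's `fractions` module; the variable names are illustrative.

```python
from fractions import Fraction

groups = {
    "A": [78, 82, 80, 79],
    "B": [72, 75, 74, 73],
    "C": [68, 70, 69, 67],
}

k = len(groups)
n_total = sum(len(g) for g in groups.values())
grand_mean = Fraction(sum(sum(g) for g in groups.values()), n_total)

ssb = Fraction(0)
ssw = Fraction(0)
for scores in groups.values():
    group_mean = Fraction(sum(scores), len(scores))
    ssb += len(scores) * (group_mean - grand_mean) ** 2   # between-group part
    ssw += sum((y - group_mean) ** 2 for y in scores)     # within-group part

f_stat = (ssb / (k - 1)) / (ssw / (n_total - k))
print(float(ssb), float(ssw), f_stat)
```

With exact arithmetic the statistic comes out as exactly 61, far above the critical value of about 4.26, so the conclusion does not depend on rounding.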

Example 2: Post-hoc (Bonferroni)

With $k=3$ groups, there are $\binom{3}{2} = 3$ comparisons. Bonferroni-adjusted $\alpha = 0.05/3 = 0.0167$.

Compare A vs B: $|\bar{Y}_A - \bar{Y}_B| = 6.25$

$\text{SE} = \sqrt{2.083(1/4+1/4)} = \sqrt{1.0415} = 1.021$

$t = 6.25/1.021 = 6.12$, $df = 9$. The Bonferroni critical value is $t_{0.0083, 9} \approx 2.93$ (two-sided, so $0.0167/2 \approx 0.0083$ in each tail). Since $6.12 > 2.93$, the difference is significant.

Similarly, A vs C ($t \approx 11.02$) and B vs C ($t \approx 4.90$) are significant, so each method differs from every other.
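The three comparisons can be run in one loop. This sketch reuses the group means and within-group sum of squares from Example 1 and takes the critical value 2.93 quoted in this example as given rather than computing it; the variable names are illustrative.

```python
from itertools import combinations
from math import sqrt

means = {"A": 79.75, "B": 73.5, "C": 68.5}  # group means from Example 1
n = 4                                        # observations per group
msw = 18.75 / 9                              # within-group mean square
t_crit = 2.93                                # Bonferroni critical value, taken from the text

se = sqrt(msw * (1 / n + 1 / n))             # same standard error for every pair
results = {}
for a, b in combinations(means, 2):
    t = abs(means[a] - means[b]) / se
    results[(a, b)] = (round(t, 2), t > t_crit)
    print(a, b, results[(a, b)])
```

Because the design is balanced, every pair shares the same standard error, so only the mean differences vary between comparisons.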

Example 3: Interaction interpretation

Two-way ANOVA on crop yield: Factor A = fertiliser (yes/no), Factor B = water (low/high).

| | Low Water | High Water |
|---|---|---|
| No Fertiliser | 10 | 14 |
| Fertiliser | 12 | 22 |

Main effect of fertiliser (averaged over water): $(12+22)/2 - (10+14)/2 = 17 - 12 = 5$

Main effect of water (averaged over fertiliser): $(14+22)/2 - (10+12)/2 = 18 - 11 = 7$

Interaction: Does the fertiliser effect depend on water?

- Low water: 12 - 10 = +2
- High water: 22 - 14 = +8

The effect of fertiliser is much larger under high water, which is an interaction. The main effects alone don't tell the full story.
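The same arithmetic can be written out in Python from the four cell means. This is a descriptive sketch only: with cell means alone there is no within-cell variation, so it cannot say whether the interaction is statistically significant. The dictionary layout and variable names are assumptions for illustration.

```python
# Cell means from Example 3, keyed by (fertiliser, water).
cell = {
    ("no", "low"): 10, ("no", "high"): 14,
    ("yes", "low"): 12, ("yes", "high"): 22,
}

# Main effects: differences between marginal means.
fertiliser_effect = ((cell["yes", "low"] + cell["yes", "high"]) / 2
                     - (cell["no", "low"] + cell["no", "high"]) / 2)
water_effect = ((cell["no", "high"] + cell["yes", "high"]) / 2
                - (cell["no", "low"] + cell["yes", "low"]) / 2)

# Simple effects of fertiliser at each water level.
simple_low = cell["yes", "low"] - cell["no", "low"]
simple_high = cell["yes", "high"] - cell["no", "high"]

# Interaction contrast: zero would mean the same fertiliser effect at both levels.
interaction = simple_high - simple_low
print(fertiliser_effect, water_effect, simple_low, simple_high, interaction)
```

A non-zero interaction contrast (here 6) is what a non-parallel interaction plot would show.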



Quiz

Q1: What does the concept of ANOVA primarily refer to in this subject?

- A) A computational error related to ANOVA
- B) A historical anecdote about ANOVA
- C) The definition and application of ANOVA
- D) A visual representation of ANOVA

Correct: C)

Q2: Which of the following is the key formula discussed in this subject?

- A) $F = \text{MSB}/\text{MSW}$
- B) $F = \text{MSW}/\text{MSB}$
- C) $F = \text{SSB}/\text{SSW}$
- D) $F = \text{SST}/(N-1)$

Correct: A)

Q3: What is the primary purpose of the F-statistic?

- A) It is primarily a historical notation system
- B) It compares between-group variation with within-group variation to test whether the group means are equal
- C) It is used only in advanced research contexts
- D) It replaces all other methods in this domain

Correct: B)

Q4: Which statement about post-hoc tests is TRUE?

- A) Post-hoc tests are a fundamental concept covered in this subject
- B) Post-hoc tests are an advanced topic beyond this subject's scope
- C) Post-hoc tests are not related to this subject
- D) Post-hoc tests are mentioned only as a historical footnote

Correct: A)

Q5: According to this subject, what does Tukey's HSD do?

- A) It replaces the omnibus F-test
- B) It tests whether the group variances are equal
- C) It divides $\alpha$ by the number of comparisons
- D) It compares all pairwise differences between group means

Correct: D)

Q6: How are post-hoc tests and two-way ANOVA related?

- A) Post-hoc tests are the inverse of two-way ANOVA
- B) Post-hoc tests and two-way ANOVA are completely unrelated topics
- C) Post-hoc tests are a special case of two-way ANOVA
- D) Post-hoc tests and two-way ANOVA are closely related concepts

Correct: D)

Q7: What is a common pitfall when comparing several groups with multiple t-tests instead of ANOVA?

- A) Running many pairwise t-tests at $\alpha = 0.05$ each inflates the family-wise Type I error rate
- B) Multiple t-tests always give the same conclusion as ANOVA
- C) Multiple t-tests have no effect on the overall error rate
- D) ANOVA cannot be used with more than two groups

Correct: A)

Q8: When should you apply the one-way ANOVA model?

- A) Only in pure mathematics contexts
- B) When comparing the means of several groups defined by a single categorical factor
- C) Only when explicitly instructed
- D) Never; it is not practically useful

Correct: B)

Practice Problems

  1. For $k=4$ groups with $n=10$ each, what are the ANOVA degrees of freedom?

    Answer: Between groups: $k-1 = 3$. Within groups: $N-k = 40-4 = 36$. Total: $N-1 = 39$. Under $H_0$, the F-statistic follows $F_{3, 36}$.

  2. ANOVA gives $F = 0.87$ with $df = (3, 28)$, $p = 0.47$. Interpret.

    Answer: $F = 0.87$ is close to 1, which suggests MSB and MSW are similar, so there is no evidence that the group means differ. $p = 0.47$ is far above any conventional $\alpha$. Fail to reject $H_0$. Do NOT run post-hoc tests: the omnibus test found no evidence of differences.

  3. Why do we need post-hoc tests after a significant ANOVA?

    Answer: ANOVA only tells you that the group means are not all equal; it doesn't tell you WHICH one(s) differ. Post-hoc tests identify the specific pairs that differ while controlling the family-wise error rate. Running uncorrected pairwise t-tests instead would inflate the Type I error rate.

  4. You run ANOVA and reject $H_0$. You then run Tukey HSD and find NO significant pairwise differences. Is this possible? How?

    Answer: Yes, this can happen. The omnibus F-test and Tukey's HSD use different rejection criteria: the F-test can detect that the means are not all equal (e.g., through a complex contrast) without any single pairwise comparison reaching significance. It's uncommon but possible, especially with many groups.

  5. In a two-way ANOVA, the interaction term is significant ($p = 0.003$). How should you interpret the main effects?

    Answer: When the interaction is significant, main effects should be interpreted with caution, because the effect of one factor depends on the level of the other. Rather than reporting "Factor A increases Y by X units," report the simple effects (the effect of A at each level of B separately). Plotting the interaction helps visualise the dependence.


Summary

Key takeaways:


Pitfalls



Next Steps

Next up: 13-01-entropy.md, where Information Theory begins!