08-06 - Orthogonal Projections

Phase: 8 - Linear Algebra (Rigorous)
Subject: 08-06
Prerequisites: 08-05 - Inner Product Spaces
Next subject: 08-07 - Determinants (Deep)


Learning Objectives

By the end of this subject, you will be able to:

  1. Compute the orthogonal projection of a vector onto a line and onto a general subspace using projection matrices
  2. Characterize orthogonal matrices (Q^T Q = I) and apply them to preserve lengths and angles under transformation
  3. Solve least-squares problems Ax ≈ b via the normal equations A^T A x̂ = A^T b
  4. Compute and interpret the QR decomposition A = QR, and use it to solve least-squares problems efficiently
  5. Understand the geometric meaning of projections: the projection minimizes distance, and the residual is orthogonal to the subspace

Core Content

1. Orthogonal Projection onto a Line

Given a nonzero vector a ∈ R^n, the orthogonal projection of b onto the line span{a} is:

$p = proj_a(b) = (a·b / a·a) a = (a^T b / a^T a) a
$

CRITICAL (Foundational): Orthogonal projection is the geometric foundation of least squares, Fourier series, PCA, and ML. Key property: the residual b − p is ORTHOGONAL to the subspace, which is what makes the projection the 'closest point.'

This can be written as p = P b where P is the projection matrix:

$P = (a a^T) / (a^T a)      (n × n matrix, rank 1)
$

Properties of a projection matrix P:

  - P² = P (idempotent: projecting again doesn't change anything)
  - P^T = P (symmetric)
  - P projects onto C(P) = span{a}
  - I − P projects onto (span{a})^⟂

Common Pitfall: P² = P alone does NOT guarantee orthogonal projection; oblique projections also satisfy this. Orthogonal projection requires BOTH P² = P AND P^T = P.
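To make the pitfall concrete, the following minimal sketch compares two idempotent 2 × 2 matrices: the orthogonal projection onto span{(1, 0)} and an oblique projection onto the same line. The helper names (`matmul`, `transpose`) and both example matrices are illustrative choices, not part of the lesson's notation.

```python
def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(row) for row in zip(*X)]

# Orthogonal projection onto span{(1, 0)}: idempotent AND symmetric.
P_orth = [[1, 0],
          [0, 0]]

# Oblique projection onto the same line, along the direction (1, -1):
# idempotent but NOT symmetric.
P_obl = [[1, 1],
         [0, 0]]

for name, P in [("orthogonal", P_orth), ("oblique", P_obl)]:
    idempotent = matmul(P, P) == P
    symmetric = transpose(P) == P
    print(name, "idempotent:", idempotent, "symmetric:", symmetric)

# Residual of b = (2, 3) under the oblique projection.
b = [2, 3]
p = [sum(P_obl[i][j] * b[j] for j in range(2)) for i in range(2)]
e = [b[i] - p[i] for i in range(2)]
# The residual is not orthogonal to the line's direction (1, 0).
print("oblique residual dot (1, 0):", e[0] * 1 + e[1] * 0)
```

Both matrices pass the P² = P test, but only the symmetric one leaves a residual perpendicular to the line.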

Example: Project b = (3, 1, 0) onto a = (1, 0, 2).

$a·b = 1·3 + 0·1 + 2·0 = 3
a·a = 1 + 0 + 4 = 5
p = (3/5)(1, 0, 2) = (0.6, 0, 1.2)
$

The residual e = b − p = (3, 1, 0) − (0.6, 0, 1.2) = (2.4, 1, −1.2) is orthogonal to a: a·e = 1·2.4 + 0·1 + 2·(−1.2) = 2.4 − 2.4 = 0. ✓
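The same computation can be reproduced exactly in a few lines. This is a minimal sketch in plain Python, using `fractions.Fraction` to avoid rounding; the function name `project_onto_line` is a hypothetical helper introduced here for illustration.

```python
from fractions import Fraction

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))

def project_onto_line(a, b):
    """Orthogonal projection of b onto span{a}; a must be nonzero."""
    coeff = Fraction(dot(a, b), dot(a, a))  # (a.b) / (a.a)
    return [coeff * x for x in a]

a = [1, 0, 2]
b = [3, 1, 0]

p = project_onto_line(a, b)              # (3/5, 0, 6/5)
e = [bi - pi for bi, pi in zip(b, p)]    # residual b - p

print("p =", [str(x) for x in p])
print("e =", [str(x) for x in e])
print("a . e =", dot(a, e))              # 0, so e is orthogonal to a
```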

2. Orthogonal Projection onto a Subspace

Let W be a k-dimensional subspace of R^n with a known basis. The projection of b onto W can be found by two methods:

Method 1 (orthogonal basis): If {u₁, ..., u_k} is an ORTHOGONAL basis of W, then:

$proj_W(b) = Σ_{i=1}^k (b·u_i / u_i·u_i) u_i
$
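As a minimal sketch of Method 1, the sum of line projections can be coded directly. The basis vectors `u1`, `u2` and the vector `b` below are made-up example data, and `project_onto_orthogonal_basis` is a hypothetical helper name; the formula is only valid when the basis vectors are mutually orthogonal, which the code checks first.

```python
from fractions import Fraction

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def project_onto_orthogonal_basis(basis, b):
    """Sum of the line projections of b onto each basis vector.

    Only valid when the basis vectors are mutually orthogonal.
    """
    p = [Fraction(0)] * len(b)
    for u in basis:
        coeff = Fraction(dot(b, u), dot(u, u))
        p = [pi + coeff * ui for pi, ui in zip(p, u)]
    return p

# Made-up example: an orthogonal (not orthonormal) basis of a plane in R^3.
u1 = [1, 1, 0]
u2 = [1, -1, 2]
assert dot(u1, u2) == 0  # the formula requires orthogonality

b = [3, 1, 5]
p = project_onto_orthogonal_basis([u1, u2], b)   # (4, 0, 4)
e = [bi - pi for bi, pi in zip(b, p)]            # (-1, 1, 1)

print("p =", [str(x) for x in p])
print("residual orthogonal to basis:", dot(e, u1) == 0 and dot(e, u2) == 0)
```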

Method 2 (matrix form, normal equations): Let A be an n × k matrix whose columns are a basis of W. The projection is p = A x̂, where x̂ solves:

$A^T A x̂ = A^T b    (normal equations)
$

Since columns of A are independent, A^T A is invertible, so:

$x̂ = (A^T A)⁻¹ A^T b
p = A (A^T A)⁻¹ A^T b = P b
$

The projection matrix is P = A(A^T A)⁻¹ A^T.

Properties: P² = P, P^T = P, rank(P) = k = dim(W).

Geometric interpretation: p is the unique point in W closest to b. That is:

$‖b − p‖ = min_{w ∈ W} ‖b − w‖
$
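The closest-point claim follows from the Pythagorean theorem. A short derivation, using only that b − p is orthogonal to every vector in W and that p − w lies in W for any w ∈ W:

```latex
\begin{aligned}
\|b - w\|^2 &= \|(b - p) + (p - w)\|^2 \\
            &= \|b - p\|^2 + 2\,(b - p)\cdot(p - w) + \|p - w\|^2 \\
            &= \|b - p\|^2 + \|p - w\|^2
               \qquad \text{since } (b - p) \perp (p - w) \\
            &\ge \|b - p\|^2
               \qquad \text{for every } w \in W .
\end{aligned}
```

Equality holds only when w = p, which is why the closest point is unique.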

3. Orthogonal Matrices

Definition: A square matrix Q is orthogonal if Q^T Q = I (equivalently, Q⁻¹ = Q^T).

Properties:

  - Columns of Q form an orthonormal basis of R^n
  - Rows of Q also form an orthonormal basis
  - Q preserves inner products: ⟨Qx, Qy⟩ = (Qx)^T(Qy) = x^T Q^T Q y = x^T y = ⟨x, y⟩
  - Q preserves norms: ‖Qx‖ = ‖x‖
  - Q preserves angles
  - det(Q) = ±1 (rotation: +1; reflection: −1)

Examples:

  - Rotation matrix: Q = [cos θ −sin θ; sin θ cos θ], Q^T Q = I ✓
  - Permutation matrix: reorders coordinates, orthogonal
  - Householder reflection: Q = I − 2uu^T where ‖u‖ = 1
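The norm-preservation and determinant properties can be checked numerically for a rotation and a Householder reflection. This is a minimal sketch in plain Python; the angle, the vector `x`, and the reflection direction are arbitrary illustrative choices, and the helper names are hypothetical.

```python
import math

def matvec(Q, x):
    return [sum(q * xi for q, xi in zip(row, x)) for row in Q]

def norm(x):
    return math.sqrt(sum(xi * xi for xi in x))

def rotation(theta):
    """2 x 2 rotation by angle theta (radians)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def householder(u):
    """Reflection I - 2 u u^T; u is normalized here so that ||u|| = 1."""
    n = norm(u)
    u = [ui / n for ui in u]
    return [[(1.0 if i == j else 0.0) - 2.0 * u[i] * u[j]
             for j in range(len(u))] for i in range(len(u))]

def det2(Q):
    return Q[0][0] * Q[1][1] - Q[0][1] * Q[1][0]

x = [3.0, 4.0]
Q_rot = rotation(math.pi / 6)       # arbitrary angle chosen for illustration
Q_ref = householder([1.0, 1.0])     # reflects across the line y = -x

# All three norms agree (approximately 5) up to floating-point rounding.
print(norm(x), norm(matvec(Q_rot, x)), norm(matvec(Q_ref, x)))
# Determinants are approximately +1 (rotation) and -1 (reflection).
print(det2(Q_rot), det2(Q_ref))
```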

4. QR Decomposition

Any m × n matrix A with linearly independent columns can be factored as:

$A = Q R
$

where:

  - Q is m × n with orthonormal columns (Q^T Q = I_n)
  - R is n × n upper triangular with positive diagonal entries

Computation via Gram-Schmidt:

Let columns of A be a₁, ..., a_n. Orthonormalize to get q₁, ..., q_n (columns of Q). Then R entries are:

r_ij = q_i^T a_j  for i < j    (projection coefficients)
r_jj = ‖a_j − Σ_{i<j} r_ij q_i‖  (length of orthogonalized vector)
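The two formulas for r_ij and r_jj translate almost line for line into code. The following is a minimal sketch of classical Gram-Schmidt in plain Python, assuming the input columns are linearly independent (otherwise some r_jj is zero and the division fails). The function name `gram_schmidt_qr` and the 3 × 2 example are illustrative; numerical libraries typically use Householder reflections instead, which behave better in floating point.

```python
import math

def gram_schmidt_qr(columns):
    """Classical Gram-Schmidt on a list of independent column vectors.

    Returns (q_cols, R): the orthonormal columns of Q and the
    upper-triangular R (as a list of rows) with A = QR.
    """
    n = len(columns)
    q_cols = []
    R = [[0.0] * n for _ in range(n)]
    for j, a in enumerate(columns):
        v = list(a)
        for i, q in enumerate(q_cols):
            R[i][j] = sum(qk * ak for qk, ak in zip(q, a))   # r_ij = q_i^T a_j
            v = [vk - R[i][j] * qk for vk, qk in zip(v, q)]  # remove q_i component
        R[j][j] = math.sqrt(sum(vk * vk for vk in v))        # r_jj = ||v||
        q_cols.append([vk / R[j][j] for vk in v])
    return q_cols, R

# Made-up 3 x 2 example, given column by column.
a1 = [1.0, 2.0, 2.0]
a2 = [1.0, 0.0, 1.0]
q_cols, R = gram_schmidt_qr([a1, a2])

# Rebuild each column of A as sum_i r_ij q_i to confirm A = QR.
for j, a in enumerate([a1, a2]):
    rebuilt = [sum(R[i][j] * q_cols[i][k] for i in range(2)) for k in range(3)]
    print(a, "~", [round(x, 12) for x in rebuilt])
```

For this input the hand calculation gives q₁ = (1/3, 2/3, 2/3), q₂ = (2/3, −2/3, 1/3) and R = [3 1; 0 1], which the code reproduces up to rounding.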

Solving Ax = b via QR:

When A is square and invertible, multiply QRx = b on the left by Q^T and use Q^T Q = I:

$Ax = b  →  QRx = b  →  Rx = Q^T b
$

Since R is upper triangular, solve by back-substitution.

Least-squares via QR:

$A^T A x̂ = A^T b  ⇔  R^T Q^T Q R x̂ = R^T Q^T b  ⇔  R^T R x̂ = R^T Q^T b
$

Since R is invertible, so is R^T; multiplying both sides by (R^T)⁻¹ gives R x̂ = Q^T b. Solve by back-substitution, with no need to form A^T A.
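The whole pipeline (factor, form Q^T b, back-substitute) fits in a short script. This is a minimal sketch in plain Python under the assumption that the columns of A are independent; the data points and the helper names `gram_schmidt_qr` and `back_substitute` are made up for illustration.

```python
import math

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def gram_schmidt_qr(columns):
    """Classical Gram-Schmidt; returns orthonormal columns and upper-triangular R."""
    n = len(columns)
    q_cols, R = [], [[0.0] * n for _ in range(n)]
    for j, a in enumerate(columns):
        v = list(a)
        for i, q in enumerate(q_cols):
            R[i][j] = dot(q, a)
            v = [vk - R[i][j] * qk for vk, qk in zip(v, q)]
        R[j][j] = math.sqrt(dot(v, v))
        q_cols.append([vk / R[j][j] for vk in v])
    return q_cols, R

def back_substitute(R, c):
    """Solve R x = c for upper-triangular R with nonzero diagonal."""
    n = len(c)
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(R[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (c[i] - s) / R[i][i]
    return x

# Made-up data: fit y = c0 + c1 * t to four points.
t = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 3.0, 4.0, 4.0]
columns = [[1.0] * len(t), t]          # columns of A: ones, then t

q_cols, R = gram_schmidt_qr(columns)
qtb = [dot(q, y) for q in q_cols]      # Q^T b
x_hat = back_substitute(R, qtb)        # solves R x = Q^T b
print(x_hat)                           # approximately [1.5, 1.0]
```

The normal equations for the same data, 4c₀ + 6c₁ = 12 and 6c₀ + 14c₁ = 23, give the same answer (c₀, c₁) = (1.5, 1), but the QR route never forms A^T A.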

5. Least-Squares Problems

Given A (m × n, m > n) and b ∈ R^m, find x̂ minimizing ‖Ax − b‖².

Geometric interpretation: Find the point A x̂ in C(A) closest to b, that is, the orthogonal projection of b onto C(A).

Normal equations: A^T A x̂ = A^T b.

Existence and uniqueness: A least-squares solution always exists. If the columns of A are independent, x̂ is unique: x̂ = (A^T A)⁻¹ A^T b.

The residual: e = b − A x̂ lies in N(A^T) = C(A)^⟂. Indeed, A^T e = 0 is just the normal equations rearranged.
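The statement A^T e = 0 is easy to verify on a small system. The sketch below is plain Python with exact fractions, for a made-up 3 × 2 system; `solve_2x2` is a hypothetical helper that only handles the two-unknown case.

```python
from fractions import Fraction

def transpose(M):
    return [list(r) for r in zip(*M)]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def solve_2x2(M, rhs):
    """Solve a 2 x 2 system exactly using the explicit inverse formula."""
    (m11, m12), (m21, m22) = M
    det = Fraction(m11 * m22 - m12 * m21)
    if det == 0:
        raise ValueError("A^T A is singular: columns of A are dependent")
    return [(m22 * rhs[0] - m12 * rhs[1]) / det,
            (m11 * rhs[1] - m21 * rhs[0]) / det]

# Made-up overdetermined system: 3 equations, 2 unknowns.
A = [[1, 0], [1, 1], [1, 2]]
b = [6, 0, 0]

At = transpose(A)
AtA = matmul(At, A)                                      # [[3, 3], [3, 5]]
Atb = [row[0] for row in matmul(At, [[v] for v in b])]   # [6, 0]

x_hat = solve_2x2(AtA, Atb)                              # [5, -3]
Ax = [sum(aij * xj for aij, xj in zip(row, x_hat)) for row in A]
e = [bi - pi for bi, pi in zip(b, Ax)]                   # residual b - A x_hat

# A^T e is the zero vector: e lies in N(A^T) = C(A)^perp.
At_e = [sum(aij * ei for aij, ei in zip(row, e)) for row in At]
print("x_hat =", [str(v) for v in x_hat])
print("A^T e =", [str(v) for v in At_e])
```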

Example: Fit a line y = mx + c to data points (1, 1), (2, 2), (3, 3.5).

For each point: mx_i + c = y_i. In matrix form:

$[1 1] [m]   [1]
[2 1] [c] = [2]
[3 1]       [3.5]
$

A = [1 1; 2 1; 3 1], b = (1, 2, 3.5)^T. The first column of A holds the x-values (multiplying m) and the second column is all ones (multiplying c).

A^T A = [1 2 3; 1 1 1] [1 1; 2 1; 3 1] = [14 6; 6 3]

A^T b = [1 2 3; 1 1 1] [1; 2; 3.5] = [1+4+10.5; 1+2+3.5] = [15.5; 6.5]

Solve: [14 6; 6 3][m; c] = [15.5; 6.5] → 14m + 6c = 15.5, 6m + 3c = 6.5

Subtract 2×(second) from the first: (14m + 6c) − (12m + 6c) = 15.5 − 13 = 2.5 → 2m = 2.5 → m = 1.25. Then from the second equation: 3c = 6.5 − 6(1.25) = −1 → c = −1/3 ≈ −0.333. Line: y = 1.25x − 0.333.

Check: the residual is e = b − A x̂ = (1/12, −1/6, 1/12), and A^T e = (1/12 − 4/12 + 3/12, 1/12 − 2/12 + 1/12) = (0, 0). ✓



Key Terms

Worked Examples

Example 1: Projection onto a Plane

Problem: Find the projection of b = (6, 0, 0) onto the plane W = span{(1, 1, 0), (0, 1, 1)}.

Solution: Use Method 2 (normal equations).

A = [1 0; 1 1; 0 1] (columns are the basis vectors).

A^T A = [1 1 0; 0 1 1] [1 0; 1 1; 0 1] = [2 1; 1 2]

A^T b = [1 1 0; 0 1 1] [6; 0; 0] = [6; 0]

Solve [2 1; 1 2] x̂ = [6; 0]: (2×eq2 − eq1): (2−2)x̂₁ + (4−1)x̂₂ = 0−6 → 3x̂₂ = −6 → x̂₂ = −2. Then x̂₁ = (6 − x̂₂)/2 = (6+2)/2 = 4.

p = A x̂ = [1 0; 1 1; 0 1][4; −2] = (4, 4−2, −2) = (4, 2, −2).

Residual: e = b − p = (6−4, 0−2, 0−(−2)) = (2, −2, 2).

Check orthogonality: e·(1,1,0) = 2−2+0 = 0, e·(0,1,1) = 0−2+2 = 0. ✓
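The same answer can be obtained by building the projection matrix P = A(A^T A)⁻¹A^T explicitly and checking its defining properties. This is a minimal sketch with exact fractions; `inverse_2x2` is a hypothetical helper that only covers the 2 × 2 case needed here.

```python
from fractions import Fraction

def transpose(M):
    return [list(r) for r in zip(*M)]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def inverse_2x2(M):
    """Exact inverse of a 2 x 2 matrix via the adjugate formula."""
    (m11, m12), (m21, m22) = M
    det = Fraction(m11 * m22 - m12 * m21)
    return [[m22 / det, -m12 / det], [-m21 / det, m11 / det]]

A = [[1, 0], [1, 1], [0, 1]]          # columns span the plane W
b = [[6], [0], [0]]                   # b as a column vector

At = transpose(A)
P = matmul(matmul(A, inverse_2x2(matmul(At, A))), At)   # A (A^T A)^-1 A^T

print("P =", [[str(x) for x in row] for row in P])
print("P b =", [str(row[0]) for row in matmul(P, b)])   # the projection p
print("idempotent:", matmul(P, P) == P)
print("symmetric:", transpose(P) == P)
```

Here P works out to (1/3)[2 1 −1; 1 2 1; −1 1 2], and P b reproduces p = (4, 2, −2).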

Example 2: QR Decomposition

Problem: Find QR decomposition of A = [1 1; 1 0; 0 1].

Solution (Gram-Schmidt):

Col1: a₁ = (1, 1, 0). ‖a₁‖ = √2. q₁ = (1/√2, 1/√2, 0). r₁₁ = √2.

Col2: compute r₁₂ = q₁^T a₂ = (1/√2, 1/√2, 0)·(1, 0, 1) = 1/√2.

v₂ = a₂ − r₁₂ q₁ = (1, 0, 1) − (1/√2)(1/√2, 1/√2, 0) = (1, 0, 1) − (1/2, 1/2, 0) = (1/2, −1/2, 1).

‖v₂‖² = 1/4 + 1/4 + 1 = 3/2, so r₂₂ = ‖v₂‖ = √(3/2).

q₂ = v₂/‖v₂‖ = (1/2, −1/2, 1)/√(3/2) = (1/√6, −1/√6, 2/√6).

$Q = [1/√2   1/√6 ]
    [1/√2  -1/√6 ]
    [0      2/√6 ]

R = [√2   1/√2  ]
    [0    √(3/2)]
$

Check: Q^T Q = I₂ and A = QR. ✓
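Both checks can be carried out numerically by entering the hand-computed factors. The sketch below is plain Python; the `close` helper and its tolerance are illustrative choices to absorb floating-point rounding.

```python
import math

s2, s6 = math.sqrt(2), math.sqrt(6)

A = [[1, 1], [1, 0], [0, 1]]
Q = [[1 / s2,  1 / s6],
     [1 / s2, -1 / s6],
     [0.0,     2 / s6]]
R = [[s2, 1 / s2],
     [0.0, math.sqrt(3 / 2)]]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(M):
    return [list(r) for r in zip(*M)]

def close(X, Y, tol=1e-12):
    """Entrywise comparison of two matrices up to a tolerance."""
    return all(abs(x - y) <= tol for rx, ry in zip(X, Y) for x, y in zip(rx, ry))

print("Q^T Q = I:", close(matmul(transpose(Q), Q), [[1, 0], [0, 1]]))
print("Q R = A:  ", close(matmul(Q, R), A))
```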

Example 3: Orthogonal Matrix Properties

Problem: Show that if Q is orthogonal, then ‖Qx‖ = ‖x‖ for all x.

Solution: ‖Qx‖² = (Qx)^T (Qx) = x^T Q^T Q x = x^T I x = x^T x = ‖x‖². Taking square roots: ‖Qx‖ = ‖x‖. ✓

Corollary: det(Q) = ±1, because det(Q^T Q) = det(Q^T) det(Q) = det(Q)² and det(I) = 1, so det(Q)² = 1 → det(Q) = ±1.

Example 4: Least-Squares Parabola Fit

Problem: Fit y = ax² + bx + c to (−1, 2), (0, 1), (1, 3), (2, 6).

Solution: For each (x, y): ax² + bx + c = y.

Matrix form:

$[1  -1   1] [a]   [2]
[0   0   1] [b] = [1]
[1   1   1] [c]   [3]
[4   2   1]       [6]
$

A^T A = [1 0 1 4; −1 0 1 2; 1 1 1 1] [1 −1 1; 0 0 1; 1 1 1; 4 2 1] = [18 8 6; 8 6 2; 6 2 4]

A^T b = [1 0 1 4; −1 0 1 2; 1 1 1 1] [2; 1; 3; 6] = [2+3+24; −2+3+12; 2+1+3+6] = [29; 13; 12]

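Solve [18 8 6; 8 6 2; 6 2 4] x̂ = [29; 13; 12]: elimination gives x̂ = (1, 0.4, 1.3). So y = x² + 0.4x + 1.3.

Check values: f(−1) = 1.9 (actual 2), f(0) = 1.3 (actual 1), f(1) = 2.7 (actual 3), f(2) = 6.1 (actual 6). The fit isn't perfect, but it minimizes the squared error: the residuals (0.1, −0.3, 0.3, −0.1) give a total squared error of 0.2.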
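A 3 × 3 system is tedious to solve by hand, so an exact machine check is worthwhile. The sketch below is plain Python: it rebuilds A^T A and A^T b from the data points and solves the normal equations with Gauss-Jordan elimination over exact fractions. The `solve` helper is a hypothetical illustration that assumes the matrix is invertible; it is not a general-purpose solver.

```python
from fractions import Fraction

def solve(M, rhs):
    """Solve M x = rhs exactly by Gauss-Jordan elimination with row swaps.

    Assumes M is square and invertible.
    """
    n = len(rhs)
    # Build the augmented matrix [M | rhs] with exact rational entries.
    aug = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(M, rhs)]
    for col in range(n):
        # Find a row with a nonzero pivot in this column and swap it up.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # Scale the pivot row, then clear the column in every other row.
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]

xs = [-1, 0, 1, 2]
ys = [2, 1, 3, 6]
A = [[x * x, x, 1] for x in xs]                     # rows [x^2, x, 1]

AtA = [[sum(row[i] * row[j] for row in A) for j in range(3)] for i in range(3)]
Atb = [sum(row[i] * y for row, y in zip(A, ys)) for i in range(3)]

a, b, c = solve(AtA, Atb)
residuals = [y - (a * x * x + b * x + c) for x, y in zip(xs, ys)]
print("coefficients:", a, b, c)
print("sum of squared residuals:", sum(r * r for r in residuals))
```

Comparing the printed coefficients and squared error against the hand calculation is a quick way to catch arithmetic slips.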

Quiz

Q1: The orthogonal projection of b onto the line through a nonzero vector a is:

A) (a·b) a
B) (a·b / b·b) a
C) (a·b / a·a) a
D) (a·b) / ‖a‖

Correct: C)


Q2: An orthogonal projection matrix P must satisfy:

A) P² = P only
B) P² = P and P^T = P
C) P^T P = I
D) det(P) = 1

Correct: B)


Q3: A square matrix Q is orthogonal if:

A) Q has mutually orthogonal columns (not necessarily of unit length)
B) Q^T Q = I
C) Q⁻¹ = Q^T
D) Both B and C (which are equivalent for square matrices)

Correct: D)


Q4: In the QR decomposition A = QR, what is R?

A) A diagonal matrix
B) A lower triangular matrix
C) An upper triangular matrix with positive diagonal entries
D) The identity matrix

Correct: C)


Q5: The least-squares solution x̂ to Ax ≈ b (with A having independent columns) satisfies:

A) Ax̂ = b exactly
B) A^T A x̂ = A^T b (the normal equations)
C) A^T x̂ = b
D) x̂ = A⁻¹b

Correct: B)


Q6: After projecting b onto subspace W, the residual e = b − p satisfies:

A) e ∈ W
B) e ∈ W^⟂ (the orthogonal complement of W)
C) e = 0 always
D) e is parallel to the projection p

Correct: B)


Practice Problems

(Answers are below. Try each problem before checking.)

Problem 1: Project b = (4, −1, 5) onto the line through a = (2, 1, 0). Find the projection matrix P.

Problem 2: Find the orthogonal projection of b = (1, 1, 1, 1) onto W = span{(1, 0, 1, 0), (0, 1, 0, 1)}.

Problem 3: Determine whether Q = [3/5 4/5; 4/5 −3/5] is orthogonal.

Problem 4: Find the QR decomposition of A = [1 2; 0 1; 1 0].

Problem 5: Solve the least-squares problem: minimize ‖Ax − b‖² where A = [1 0; 1 1; 1 2] and b = (0, 1, 3)^T.

Problem 6: Show that if P is an orthogonal projection matrix, then I − P is also an orthogonal projection matrix.

Problem 7: Prove that Q^T is orthogonal when Q is orthogonal.

Answers

Problem 1: a·b = 2·4 + 1·(−1) + 0·5 = 8 − 1 = 7 and a·a = 4 + 1 = 5, so p = (7/5)(2, 1, 0) = (14/5, 7/5, 0) = (2.8, 1.4, 0).

P = a a^T / (a^T a) = [2; 1; 0][2 1 0]/5 = [4 2 0; 2 1 0; 0 0 0]/5 = [0.8 0.4 0; 0.4 0.2 0; 0 0 0].

Check: P² = P and P b = [0.8 0.4 0; 0.4 0.2 0; 0 0 0][4; −1; 5] = [3.2−0.4; 1.6−0.2; 0] = [2.8; 1.4; 0] = p. ✓

Problem 2: The basis vectors are already orthogonal: (1,0,1,0)·(0,1,0,1) = 0. So proj = proj_{v₁}(b) + proj_{v₂}(b).

b·v₁ = 1+0+1+0 = 2, v₁·v₁ = 2, so proj₁ = (2/2)(1,0,1,0) = (1,0,1,0).
b·v₂ = 0+1+0+1 = 2, v₂·v₂ = 2, so proj₂ = (2/2)(0,1,0,1) = (0,1,0,1).

p = (1, 1, 1, 1) = b. The projection equals b itself, which means b ∈ W. Indeed, (1,1,1,1) = (1,0,1,0) + (0,1,0,1).

Problem 3: Q^T Q = [3/5 4/5; 4/5 −3/5] [3/5 4/5; 4/5 −3/5] = [9/25+16/25 12/25−12/25; 12/25−12/25 16/25+9/25] = [1 0; 0 1] = I. ✓ Yes, Q is orthogonal.

det(Q) = (3/5)(−3/5) − (4/5)(4/5) = −9/25 − 16/25 = −1. This is a reflection.

Problem 4: a₁ = (1, 0, 1), ‖a₁‖ = √2, q₁ = (1/√2, 0, 1/√2), r₁₁ = √2.

r₁₂ = q₁^T a₂ = (1/√2, 0, 1/√2)·(2, 1, 0) = 2/√2 = √2.

v₂ = a₂ − r₁₂ q₁ = (2, 1, 0) − √2(1/√2, 0, 1/√2) = (2, 1, 0) − (1, 0, 1) = (1, 1, −1).

‖v₂‖² = 1+1+1 = 3, ‖v₂‖ = √3, so r₂₂ = √3 and q₂ = (1/√3, 1/√3, −1/√3).

$Q = [1/√2   1/√3 ]
    [0      1/√3 ]
    [1/√2  -1/√3 ]

R = [√2   √2 ]
    [0    √3 ]
$

Check A = QR. ✓

Problem 5: A^T A = [1 1 1; 0 1 2] [1 0; 1 1; 1 2] = [3 3; 3 5] and A^T b = [1 1 1; 0 1 2] [0; 1; 3] = [4; 7].

Solve [3 3; 3 5] x̂ = [4; 7]: From eq1: x̂₁ + x̂₂ = 4/3. Subtract eq1 from eq2: 2x̂₂ = 7−4 = 3 → x̂₂ = 1.5. Then x̂₁ = 4/3 − 3/2 = 8/6 − 9/6 = −1/6. So x̂ = (−1/6, 3/2).

Because the first column of A is all ones and the second holds t = 0, 1, 2, this corresponds to the best-fit line y = −1/6 + (3/2)t.

Residual: ‖Ax̂ − b‖² = ‖(−1/6−0, −1/6+3/2−1, −1/6+3−3)‖² = ‖(−1/6, 1/3, −1/6)‖² = 1/36 + 1/9 + 1/36 = 1/18 + 2/18 = 3/18 = 1/6.

Problem 6: Let P be an orthogonal projection matrix (P² = P, P^T = P).

(I−P)² = I² − 2P + P² = I − 2P + P = I − P. ✓ (idempotent)
(I−P)^T = I^T − P^T = I − P. ✓ (symmetric)

Now check what I−P projects onto. Since P(I−P) = P − P² = 0, every vector (I−P)x lies in N(P), so C(I−P) ⊆ N(P). Conversely, if y ∈ N(P), then (I−P)y = y, so y ∈ C(I−P). Hence C(I−P) = N(P) = N(P^T) = C(P)^⟂. Thus I−P projects onto the orthogonal complement.

Problem 7: (Q^T)^T Q^T = Q Q^T. Because Q is square with Q⁻¹ = Q^T, we have Q Q^T = Q Q⁻¹ = I. Thus (Q^T)^T Q^T = I, so Q^T is orthogonal.

Summary

  1. The orthogonal projection of b onto a subspace W gives the unique closest point in W to b; the projection matrix P = A(A^T A)⁻¹A^T is symmetric and idempotent (P² = P)
  2. Orthogonal matrices (Q^T Q = I) preserve inner products, norms, and angles; they represent rotations and reflections, and det(Q) = ±1
  3. QR decomposition A = QR (Gram-Schmidt in matrix form) enables solving Ax = b and least-squares problems via back-substitution on R x = Q^T b, avoiding the formation of A^T A
  4. The least-squares solution to Ax ≈ b minimizes ‖Ax − b‖² and satisfies the normal equations A^T A x̂ = A^T b; the residual b − A x̂ is orthogonal to C(A)
  5. Projection is the geometric foundation of least-squares approximation, Fourier series, and many numerical linear algebra algorithms

Pitfalls

  1. Assuming P² = P is sufficient for orthogonal projection. Idempotency (P² = P) defines a projection, but it could be oblique. An orthogonal projection additionally requires symmetry: P^T = P. Always check both conditions.

  2. Forming A^T A explicitly for least squares. Computing A^T A squares the condition number, turning a mildly ill-conditioned problem into a severely ill-conditioned one. Use QR decomposition instead: solve R x = Q^T b by back-substitution, which preserves the original condition number.

  3. Confusing Q^T Q = I_n with Q Q^T = I_m for tall matrices. When Q is m×n with m > n and orthonormal columns, Q^T Q = I_n (n×n identity), but Q Q^T ≠ I_m. The product Q Q^T is the projection matrix onto C(Q), not the identity.

  4. Forgetting that least-squares requires independent columns for uniqueness. If columns of A are linearly dependent, A^T A is singular and the normal equations have infinitely many solutions. The minimum-norm solution then requires the pseudoinverse.

  5. Not verifying orthogonality of the residual. The defining property of the least-squares solution is that the residual b − A x̂ is orthogonal to C(A). If A^T(b − A x̂) is not zero (up to rounding), the computed x̂ is not the least-squares solution, so this check is a cheap way to catch arithmetic errors.
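Pitfall 3 is easy to see numerically. The sketch below, in plain Python, takes a 3 × 2 matrix with orthonormal columns (illustrative values) and compares Q^T Q with Q Q^T; the helper names and the tolerance are assumptions made for this example.

```python
import math

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(M):
    return [list(r) for r in zip(*M)]

def close(X, Y, tol=1e-12):
    """Entrywise comparison of two matrices up to a tolerance."""
    return all(abs(x - y) <= tol for rx, ry in zip(X, Y) for x, y in zip(rx, ry))

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

# A 3 x 2 matrix with orthonormal columns (illustrative values).
s2, s6 = math.sqrt(2), math.sqrt(6)
Q = [[1 / s2,  1 / s6],
     [1 / s2, -1 / s6],
     [0.0,     2 / s6]]

QtQ = matmul(transpose(Q), Q)   # 2 x 2
QQt = matmul(Q, transpose(Q))   # 3 x 3

print("Q^T Q == I_2:", close(QtQ, identity(2)))           # True
print("Q Q^T == I_3:", close(QQt, identity(3)))           # False
print("Q Q^T idempotent:", close(matmul(QQt, QQt), QQt))  # True
print("Q Q^T symmetric: ", close(QQt, transpose(QQt)))    # True
```

The last two lines confirm that Q Q^T behaves like an orthogonal projection matrix (symmetric and idempotent) rather than the identity.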



Next Steps

Move on to 08-07 - Determinants (Deep) to explore the rigorous definition of determinants via permutations and parity, cofactor expansion, the full list of determinant properties, Cramer's rule, and the geometric interpretation as signed volume.