08-06 - Orthogonal Projections
Phase: 8 - Linear Algebra (Rigorous)
Subject: 08-06
Prerequisites: 08-05 - Inner Product Spaces
Next subject: 08-07 - Determinants (Deep)
Learning Objectives
By the end of this subject, you will be able to:
- Compute the orthogonal projection of a vector onto a line and onto a general subspace using projection matrices
- Characterize orthogonal matrices (Q^T Q = I) and apply them to preserve lengths and angles under transformation
- Solve least-squares problems Ax ≈ b via the normal equations A^T A x̂ = A^T b
- Compute and interpret the QR decomposition A = QR, and use it to solve least-squares problems efficiently
- Understand the geometric meaning of projections: the projection minimizes distance, and the residual is orthogonal to the subspace
Core Content
1. Orthogonal Projection onto a Line
Given a nonzero vector a ∈ R^n, the orthogonal projection of b onto the line span{a} is:
$p = \operatorname{proj}_a(b) = \frac{a \cdot b}{a \cdot a}\, a = \frac{a^T b}{a^T a}\, a$
CRITICAL (Foundational): Orthogonal projection is the geometric foundation of least squares, Fourier series, PCA, and ML. Key property: the residual b − p is ORTHOGONAL to the subspace, and this is what makes the projection the closest point.
This can be written as p = P b, where P is the projection matrix:
$P = \frac{a a^T}{a^T a} \quad (n \times n \text{ matrix, rank } 1)$
Properties of a projection matrix P:
- P² = P (idempotent: projecting again doesn't change anything)
- P^T = P (symmetric)
- P projects onto C(P) = span{a}
- I − P projects onto (span{a})^⊥
Common Pitfall: P² = P alone does NOT guarantee orthogonal projection, because oblique projections also satisfy it. Orthogonal projection requires BOTH P² = P AND P^T = P.
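To see why symmetry matters, here is a small illustrative example (the matrix is chosen for this note, not taken from the course material) of an idempotent matrix that is not an orthogonal projection:

$P = \begin{bmatrix} 1 & 1 \\ 0 & 0 \end{bmatrix}, \qquad P^2 = \begin{bmatrix} 1 & 1 \\ 0 & 0 \end{bmatrix} = P, \qquad P^T = \begin{bmatrix} 1 & 0 \\ 1 & 0 \end{bmatrix} \neq P$

For b = (0, 1) this gives Pb = (1, 0) and residual b − Pb = (−1, 1). The dot product of the residual with (1, 0) ∈ C(P) is −1, not 0, so P sends b onto the x-axis along a slanted direction rather than perpendicularly.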
Example: Project b = (3, 1, 0) onto a = (1, 0, 2).
$a \cdot b = 1 \cdot 3 + 0 \cdot 1 + 2 \cdot 0 = 3, \qquad a \cdot a = 1 + 0 + 4 = 5, \qquad p = \tfrac{3}{5}(1, 0, 2) = (0.6, 0, 1.2)$
The residual e = b − p = (3, 1, 0) − (0.6, 0, 1.2) = (2.4, 1, −1.2) is orthogonal to a: a·e = 1·2.4 + 0·1 + 2·(−1.2) = 2.4 − 2.4 = 0. ✓
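The same computation can be sketched in a few lines of plain Python. The helper names `dot` and `project_onto_line` are made up for this illustration; in practice a numerical library would normally be used instead of lists.

```python
def dot(u, v):
    """Dot product of two equal-length sequences."""
    return sum(ui * vi for ui, vi in zip(u, v))

def project_onto_line(b, a):
    """Orthogonal projection of b onto span{a}; a must be nonzero."""
    scale = dot(a, b) / dot(a, a)          # the scalar (a.b)/(a.a)
    return [scale * ai for ai in a]

a = [1.0, 0.0, 2.0]
b = [3.0, 1.0, 0.0]
p = project_onto_line(b, a)
e = [bi - pi for bi, pi in zip(b, p)]      # residual b - p

print(p)           # the projection, close to (0.6, 0, 1.2)
print(dot(a, e))   # zero up to floating-point rounding
```

Checking `dot(a, e)` against zero is a cheap sanity test for any hand-computed projection.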
2. Orthogonal Projection onto a Subspace
Let W be a k-dimensional subspace of R^n. We can find the projection of b onto W by two methods, depending on the kind of basis available:
Method 1 (orthogonal basis): If {u₁, ..., u_k} is an ORTHOGONAL basis of W, then:
$\operatorname{proj}_W(b) = \sum_{i=1}^{k} \frac{b \cdot u_i}{u_i \cdot u_i}\, u_i$
Method 2 (matrix form, normal equations): Let A be an n × k matrix whose columns are a basis of W. The projection is p = A x̂, where x̂ solves:
$A^T A \hat{x} = A^T b \quad \text{(normal equations)}$
Since the columns of A are independent, A^T A is invertible, so:
$\hat{x} = (A^T A)^{-1} A^T b, \qquad p = A (A^T A)^{-1} A^T b = P b$
The projection matrix is P = A(A^T A)⁻¹ A^T.
Properties: P² = P, P^T = P, rank(P) = k = dim(W).
Geometric interpretation: p is the unique point in W closest to b. That is:
$\|b - p\| = \min_{w \in W} \|b - w\|$
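As a concrete check of Method 2, the sketch below builds P = A(A^T A)⁻¹A^T for an illustrative plane in R^3 using exact fractions, then confirms that P is idempotent and symmetric. The helpers `transpose`, `matmul`, and `inverse_2x2` are hypothetical names written for this example, and the 2 × 2 inverse formula only covers the case of a two-column A.

```python
from fractions import Fraction

def transpose(M):
    return [list(row) for row in zip(*M)]

def matmul(X, Y):
    """Matrix product of two list-of-rows matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def inverse_2x2(M):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

# Columns of A: a basis of the plane span{(1, 1, 0), (0, 1, 1)} (illustrative choice).
A = [[Fraction(1), Fraction(0)],
     [Fraction(1), Fraction(1)],
     [Fraction(0), Fraction(1)]]
At = transpose(A)
P = matmul(matmul(A, inverse_2x2(matmul(At, A))), At)   # P = A (A^T A)^{-1} A^T

print(P == matmul(P, P))   # True: idempotent
print(P == transpose(P))   # True: symmetric
```

Exact fractions avoid rounding noise, so the two equality checks hold exactly rather than approximately.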
3. Orthogonal Matrices
Definition: A square matrix Q is orthogonal if Q^T Q = I (equivalently, Q⁻¹ = Q^T).
Properties:
- Columns of Q form an orthonormal basis of R^n
- Rows of Q also form an orthonormal basis
- Q preserves inner products: ⟨Qx, Qy⟩ = (Qx)^T(Qy) = x^T Q^T Q y = x^T y = ⟨x, y⟩
- Q preserves norms: ‖Qx‖ = ‖x‖
- Q preserves angles
- det(Q) = ±1 (rotation: +1; reflection: −1)
Examples:
- Rotation matrix: Q = [cos θ  −sin θ; sin θ  cos θ], Q^T Q = I ✓
- Permutation matrix: reorders coordinates, orthogonal
- Householder reflection: Q = I − 2uu^T where ‖u‖ = 1
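The norm-preservation property is easy to observe numerically. The following sketch builds a Householder reflection from an arbitrarily chosen unit vector u and compares ‖x‖ with ‖Qx‖; the function names and the test vectors are assumptions made for this illustration.

```python
import math

def householder(u):
    """Q = I - 2 u u^T for a unit vector u, returned as a list of rows."""
    n = len(u)
    return [[(1.0 if i == j else 0.0) - 2.0 * u[i] * u[j] for j in range(n)]
            for i in range(n)]

def matvec(M, x):
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

def norm(x):
    return math.sqrt(sum(xi * xi for xi in x))

u = [1 / 3, 2 / 3, 2 / 3]      # unit vector: 1/9 + 4/9 + 4/9 = 1
Q = householder(u)
x = [3.0, -1.0, 2.0]

print(norm(x), norm(matvec(Q, x)))   # the two norms agree up to rounding
```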
4. QR Decomposition
Any m × n matrix A with linearly independent columns can be factored as:
$A = QR$
where:
- Q is m × n with orthonormal columns (Q^T Q = I_n)
- R is n × n upper triangular with positive diagonal entries
Computation via Gram-Schmidt:
Let the columns of A be a₁, ..., a_n. Orthonormalize them to get q₁, ..., q_n (the columns of Q). Then the entries of R are:
r_ij = q_i^T a_j for i < j (projection coefficients)
r_jj = ‖a_j − Σ_{i<j} r_ij q_i‖ (length of the orthogonalized vector)
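The two formulas for r_ij and r_jj translate almost line for line into code. Below is a minimal sketch of classical Gram-Schmidt in plain Python; the function name `gram_schmidt_qr` and the column-list representation are choices made for this example. Classical Gram-Schmidt can lose orthogonality in floating point for ill-conditioned inputs, so treat this as a teaching sketch rather than a production routine.

```python
import math

def gram_schmidt_qr(columns):
    """Classical Gram-Schmidt on linearly independent column vectors.

    Returns (q_cols, R): the orthonormal columns of Q, and R as a list of rows.
    """
    n = len(columns)
    q_cols = []
    R = [[0.0] * n for _ in range(n)]
    for j, a in enumerate(columns):
        v = list(a)
        for i, q in enumerate(q_cols):
            R[i][j] = sum(qk * ak for qk, ak in zip(q, a))     # r_ij = q_i^T a_j
            v = [vk - R[i][j] * qk for vk, qk in zip(v, q)]    # subtract that component
        R[j][j] = math.sqrt(sum(vk * vk for vk in v))          # r_jj = length of remainder
        q_cols.append([vk / R[j][j] for vk in v])
    return q_cols, R

q_cols, R = gram_schmidt_qr([[1.0, 1.0, 0.0], [1.0, 0.0, 1.0]])
print(q_cols)
print(R)
```

For these two columns the printed R should be close to [√2, 1/√2; 0, √(3/2)].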
Solving Ax = b via QR (for A square and invertible, or more generally whenever b ∈ C(A)):
$Ax = b \;\Rightarrow\; QRx = b \;\Rightarrow\; Rx = Q^T b$
Since R is upper triangular, solve by back-substitution.
Least-squares via QR:
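$A^T A \hat{x} = A^T b \;\Rightarrow\; R^T Q^T Q R \hat{x} = R^T Q^T b \;\Rightarrow\; R^T R \hat{x} = R^T Q^T b$
Since R (and hence R^T) is invertible, multiply both sides by (R^T)⁻¹ to get R x̂ = Q^T b. Solve by back-substitution, with no need to form A^T A.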
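Back-substitution itself is only a short loop. The sketch below solves R x̂ = Q^T b for an illustrative 3 × 2 matrix A = [1 0; 1 1; 1 2] and b = (0, 1, 3); the factors Q and R were computed by hand for this example and are hard-coded, and `back_substitute` is a hypothetical helper name.

```python
import math

def back_substitute(R, y):
    """Solve R x = y for upper triangular R with nonzero diagonal."""
    n = len(y)
    x = [0.0] * n
    for i in range(n - 1, -1, -1):                       # last row first
        known = sum(R[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (y[i] - known) / R[i][i]
    return x

# Hand-computed thin QR factors of A = [1 0; 1 1; 1 2] (illustrative).
s2, s3 = math.sqrt(2), math.sqrt(3)
q_cols = [[1 / s3, 1 / s3, 1 / s3], [-1 / s2, 0.0, 1 / s2]]
R = [[s3, s3], [0.0, s2]]
b = [0.0, 1.0, 3.0]

Qt_b = [sum(qk * bk for qk, bk in zip(q, b)) for q in q_cols]   # Q^T b
x_hat = back_substitute(R, Qt_b)
print(x_hat)   # approximately [-1/6, 3/2]
```

Note that A^T A never appears: only Q^T b and the triangular solve are needed.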
5. Least-Squares Problems
Given A (m × n, m > n) and b ∈ R^m, find x̂ minimizing ‖Ax − b‖².
Geometric interpretation: Find the point A x̂ in C(A) closest to b, which is the orthogonal projection of b onto C(A).
Normal equations: A^T A x̂ = A^T b.
Existence and uniqueness: The normal equations always have a solution. If the columns of A are independent, x̂ is unique: x̂ = (A^T A)⁻¹ A^T b.
The residual: e = b − A x̂ lies in N(A^T) = C(A)^⊥. Indeed, A^T e = 0 is just the normal equations rearranged.
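The normal equations can also be reached by calculus instead of geometry. A short derivation sketch, using only the symbols already defined, expands the squared error and sets its gradient to zero:

$\|Ax - b\|^2 = x^T A^T A x - 2\, x^T A^T b + b^T b, \qquad \nabla_x \|Ax - b\|^2 = 2 A^T A x - 2 A^T b = 0 \;\Longleftrightarrow\; A^T A \hat{x} = A^T b$

Because A^T A is positive semidefinite, this stationary point is a minimum rather than a maximum or saddle.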
Example: Fit a line y = mx + c to the data points (1, 1), (2, 2), (3, 3.5).
For each point: mx_i + c = y_i. In matrix form, with the unknowns ordered as (m, c) so that the first column of A holds the x-values:
$\begin{bmatrix} 1 & 1 \\ 2 & 1 \\ 3 & 1 \end{bmatrix} \begin{bmatrix} m \\ c \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \\ 3.5 \end{bmatrix}$
A = [1 1; 2 1; 3 1], b = (1, 2, 3.5)^T.
A^T A = [1 2 3; 1 1 1] [1 1; 2 1; 3 1] = [14 6; 6 3]
A^T b = [1 2 3; 1 1 1] [1; 2; 3.5] = [1+4+10.5; 1+2+3.5] = [15.5; 6.5]
Solve: [14 6; 6 3][m; c] = [15.5; 6.5], that is, 14m + 6c = 15.5 and 6m + 3c = 6.5.
Divide the second equation by 3: 2m + c = 13/6. Subtract 2×(second) from the first: (14m + 6c) − (12m + 6c) = 15.5 − 13 = 2.5, so 2m = 2.5 and m = 1.25. Then c = 13/6 − 2m = 13/6 − 15/6 = −2/6 ≈ −0.333.
Line: y = 1.25x − 0.333. The data rise by a little over 1 per unit of x, so a positive slope near 1.25 is what we expect.
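A quick exact cross-check of this fit, with the unknowns ordered as (slope, intercept) to match the columns [x_i, 1] of A. The variable names are chosen for this sketch, and Cramer's rule is used only because the system is 2 × 2.

```python
from fractions import Fraction

xs = [1, 2, 3]
ys = [Fraction(1), Fraction(2), Fraction(7, 2)]
n = len(xs)

# Entries of A^T A and A^T b for rows [x_i, 1] and unknowns (m, c).
sxx = sum(x * x for x in xs)                 # 14
sx = sum(xs)                                 # 6
sxy = sum(x * y for x, y in zip(xs, ys))     # 31/2
sy = sum(ys)                                 # 13/2

# Solve [sxx sx; sx n] [m; c] = [sxy; sy] by Cramer's rule.
det = sxx * n - sx * sx
m = (sxy * n - sx * sy) / det
c = (sxx * sy - sx * sxy) / det
print(m, c)   # 5/4 -1/3
```

The slope 5/4 and intercept −1/3 correspond to the line y = 1.25x − 0.333.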
Worked Examples
Example 1: Projection onto a Plane
Problem: Find the projection of b = (6, 0, 0) onto the plane W = span{(1, 1, 0), (0, 1, 1)}.
Solution: Use Method 2 (normal equations).
A = [1 0; 1 1; 0 1] (columns are the basis vectors).
A^T A = [1 1 0; 0 1 1] [1 0; 1 1; 0 1] = [2 1; 1 2]
A^T b = [1 1 0; 0 1 1] [6; 0; 0] = [6; 0]
Solve [2 1; 1 2] x̂ = [6; 0]. Taking 2×(eq2) − (eq1): (2−2)x̂₁ + (4−1)x̂₂ = 0 − 6, so 3x̂₂ = −6 and x̂₂ = −2. Then x̂₁ = (6 − x̂₂)/2 = (6+2)/2 = 4.
p = A x̂ = [1 0; 1 1; 0 1][4; −2] = (4, 4−2, −2) = (4, 2, −2).
Residual: e = b − p = (6−4, 0−2, 0−(−2)) = (2, −2, 2).
Check orthogonality: e·(1,1,0) = 2−2+0 = 0, e·(0,1,1) = 0−2+2 = 0. ✓
Example 2: QR Decomposition
Problem: Find QR decomposition of A = [1 1; 1 0; 0 1].
Solution (Gram-Schmidt):
Col1: a₁ = (1, 1, 0). ‖a₁‖ = √2. q₁ = (1/√2, 1/√2, 0). r₁₁ = √2.
Col2: a₂ = (1, 0, 1). Compute r₁₂ = q₁^T a₂ = (1/√2, 1/√2, 0)·(1, 0, 1) = 1/√2.
v₂ = a₂ − r₁₂ q₁ = (1, 0, 1) − (1/√2)(1/√2, 1/√2, 0) = (1, 0, 1) − (1/2, 1/2, 0) = (1/2, −1/2, 1).
‖v₂‖² = 1/4 + 1/4 + 1 = 3/2, so ‖v₂‖ = √(3/2) and r₂₂ = ‖v₂‖ = √(3/2).
q₂ = v₂/‖v₂‖ = (1/2, −1/2, 1)/√(3/2) = (1/√6, −1/√6, 2/√6).
$Q = \begin{bmatrix} 1/\sqrt{2} & 1/\sqrt{6} \\ 1/\sqrt{2} & -1/\sqrt{6} \\ 0 & 2/\sqrt{6} \end{bmatrix}, \qquad R = \begin{bmatrix} \sqrt{2} & 1/\sqrt{2} \\ 0 & \sqrt{3/2} \end{bmatrix}$
Check: Q^T Q = I₂ and A = QR. ✓
Example 3: Orthogonal Matrix Properties
Problem: Show that if Q is orthogonal, then ‖Qx‖ = ‖x‖ for all x.
Solution: ‖Qx‖² = (Qx)^T (Qx) = x^T Q^T Q x = x^T I x = x^T x = ‖x‖². Taking square roots: ‖Qx‖ = ‖x‖. ∎
Corollary of Q^T Q = I: det(Q) = ±1, because det(Q^T Q) = det(I) gives det(Q^T) det(Q) = det(Q)² = 1, so det(Q) = ±1.
Example 4: Least-Squares Parabola Fit
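Problem: Fit y = ax² + bx + c to (−1, 2), (0, 1), (1, 3), (2, 6).
Solution: For each (x, y): ax² + bx + c = y.
Matrix form:
$\begin{bmatrix} 1 & -1 & 1 \\ 0 & 0 & 1 \\ 1 & 1 & 1 \\ 4 & 2 & 1 \end{bmatrix} \begin{bmatrix} a \\ b \\ c \end{bmatrix} = \begin{bmatrix} 2 \\ 1 \\ 3 \\ 6 \end{bmatrix}$
Write y = (2, 1, 3, 6)^T for the right-hand side, to keep it distinct from the coefficient b.
A^T A = [1 0 1 4; −1 0 1 2; 1 1 1 1] [1 −1 1; 0 0 1; 1 1 1; 4 2 1] = [18 8 6; 8 6 2; 6 2 4]
A^T y = [1 0 1 4; −1 0 1 2; 1 1 1 1] [2; 1; 3; 6] = [2+0+3+24; −2+0+3+12; 2+1+3+6] = [29; 13; 12]
Solve [18 8 6; 8 6 2; 6 2 4] x̂ = [29; 13; 12]. Halving the third equation gives b = 6 − 3a − 2c. Substituting into the second gives a + c = 2.3, and into the first gives 6a + 10c = 19. Hence a = 1, c = 1.3, b = 0.4, so x̂ = (1, 0.4, 1.3) and y ≈ x² + 0.4x + 1.3.
Check values: f(−1) = 1.9 (actual 2), f(0) = 1.3 (actual 1), f(1) = 2.7 (actual 3), f(2) = 6.1 (actual 6). The residuals (0.1, −0.3, 0.3, −0.1) give a total squared error of 0.2. The fit isn't perfect, but no other parabola achieves a smaller squared error.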
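Three-by-three normal equations are easy to mis-solve by hand, so an exact solver is a useful cross-check. The sketch below forms A^T A and the right-hand side for the four data points and solves with Gauss-Jordan elimination over fractions; `solve` is a hypothetical helper written for this example and assumes a square invertible matrix.

```python
from fractions import Fraction

def solve(M, rhs):
    """Solve M x = rhs by Gauss-Jordan elimination (M square, invertible)."""
    n = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(M, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]          # bring pivot row up
        aug[col] = [v / aug[col][col] for v in aug[col]]     # scale pivot to 1
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]

points = [(-1, 2), (0, 1), (1, 3), (2, 6)]
A = [[x * x, x, 1] for x, _ in points]       # rows [x^2, x, 1]
y = [yi for _, yi in points]

AtA = [[sum(row[i] * row[j] for row in A) for j in range(3)] for i in range(3)]
Aty = [sum(row[i] * yi for row, yi in zip(A, y)) for i in range(3)]
coeffs = solve(AtA, Aty)
print(coeffs)   # exact coefficients (a, b, c) = (1, 2/5, 13/10)
```

Substituting the result back into the normal equations, or checking that the residual is orthogonal to every column of A, confirms the solution.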
Quiz
Q1: The orthogonal projection of b onto the line through a nonzero vector a is:
A) (a·b) a
B) (a·b / b·b) a
C) (a·b / a·a) a
D) (a·b) / ‖a‖
Correct: C)
- If you chose C: p = (a·b / a·a) a. The scalar (a·b)/(a·a) scales a to land at the orthogonal projection point. The denominator must be a·a (‖a‖²). Correct!
- If you chose A: Missing the division by a·a; unless ‖a‖ = 1, the result would overshoot or undershoot the projection point.
- If you chose B: Normalizes by the wrong length (b·b instead of a·a).
- If you chose D: Divides by ‖a‖ instead of ‖a‖², and the result is a scalar rather than a vector.
Q2: An orthogonal projection matrix P must satisfy:
A) P² = P only
B) P² = P and P^T = P
C) P^T P = I
D) det(P) = 1
Correct: B)
- If you chose B: Idempotency (P² = P) means projecting twice doesn't change anything. Symmetry (P^T = P) ensures the projection is orthogonal, not oblique. Both are required. Correct!
- If you chose A: Idempotency alone allows oblique projections; symmetry is also needed for orthogonality.
- If you chose C: That defines an orthogonal matrix, not a projection matrix.
- If you chose D: Projection matrices can have determinant 0 (if they project onto a proper subspace).
Q3: A square matrix Q is orthogonal if:
A) Q has orthonormal rows
B) Q^T Q = I
C) Q⁻¹ = Q^T
D) Both B and C (which are equivalent for square matrices)
Correct: D)
- If you chose D: Q^T Q = I means the columns are orthonormal. For a square matrix, this implies Q Q^T = I and Q⁻¹ = Q^T. Correct!
- If you chose A: For a square matrix, orthonormal rows do turn out to be equivalent, but they are a consequence rather than the definition; D states the defining condition together with its inverse form.
- If you chose B: True, but C is also equivalent.
- If you chose C: True, but B is also equivalent.
Q4: In the QR decomposition A = QR, what is R?
A) A diagonal matrix
B) A lower triangular matrix
C) An upper triangular matrix with positive diagonal entries
D) The identity matrix
Correct: C)
- If you chose C: Gram-Schmidt produces an upper triangular R because aⱼ involves only q₁, ..., qⱼ. Diagonal entries rⱼⱼ = ‖aⱼ − (projection onto earlier columns)‖ > 0. Correct!
- If you chose A: R is triangular, not diagonal (unless columns of A are already orthogonal).
- If you chose B: R is upper, not lower triangular.
- If you chose D: R = I only if A already has orthonormal columns.
Q5: The least-squares solution x̂ to Ax ≈ b (with A having independent columns) satisfies:
A) Ax̂ = b exactly
B) A^T A x̂ = A^T b (the normal equations)
C) A^T x̂ = b
D) x̂ = A⁻¹b
Correct: B)
- If you chose B: The normal equations A^T A x̂ = A^T b arise from setting the gradient of ‖Ax − b‖² to zero. Geometrically, the residual b − Ax̂ is orthogonal to C(A). Correct!
- If you chose A: The system is overdetermined (m > n), so an exact solution generally doesn't exist.
- If you chose C: A^T is n × m; A^T x̂ = b doesn't make sense dimensionally unless n = m.
- If you chose D: A is not square (m > n), so A⁻¹ doesn't exist.
Q6: After projecting b onto subspace W, the residual e = b − p satisfies:
A) e ∈ W
B) e ∈ W^⊥ (the orthogonal complement of W)
C) e = 0 always
D) e is parallel to the projection p
Correct: B)
- If you chose B: The residual is orthogonal to every vector in W. This is the defining property of orthogonal projection and the reason p is the closest point in W to b. Correct!
- If you chose A: The projection p ∈ W; the residual is what remains after subtracting the projection.
- If you chose C: e = 0 only if b was already in W.
- If you chose D: e is orthogonal to p (since p ∈ W and e ⊥ W).
Practice Problems
(Answers are below. Try each problem before checking.)
Problem 1: Project b = (4, −1, 5) onto the line through a = (2, 1, 0). Find the projection matrix P.
Problem 2: Find the orthogonal projection of b = (1, 1, 1, 1) onto W = span{(1, 0, 1, 0), (0, 1, 0, 1)}.
Problem 3: Determine whether Q = [3/5 4/5; 4/5 −3/5] is orthogonal.
Problem 4: Find the QR decomposition of A = [1 2; 0 1; 1 0].
Problem 5: Solve the least-squares problem: minimize ‖Ax − b‖² where A = [1 0; 1 1; 1 2] and b = (0, 1, 3)^T.
Problem 6: Show that if P is an orthogonal projection matrix, then I − P is also an orthogonal projection matrix.
Problem 7: Prove that Q^T is orthogonal when Q is orthogonal.
Answers
**Problem 1:** a·b = 2·4 + 1·(−1) + 0·5 = 8 − 1 = 7 and a·a = 4 + 1 = 5, so p = (7/5)(2, 1, 0) = (14/5, 7/5, 0) = (2.8, 1.4, 0).
P = a a^T / (a^T a) = [2; 1; 0][2 1 0]/5 = [4 2 0; 2 1 0; 0 0 0]/5 = [0.8 0.4 0; 0.4 0.2 0; 0 0 0].
Check: P² = P and P b = [0.8 0.4 0; 0.4 0.2 0; 0 0 0][4; −1; 5] = [3.2−0.4; 1.6−0.2; 0] = [2.8; 1.4; 0] = p. ✓
**Problem 2:** Write v₁ = (1, 0, 1, 0) and v₂ = (0, 1, 0, 1). The basis vectors are already orthogonal: v₁·v₂ = 0. So proj_W(b) = proj_{v₁}(b) + proj_{v₂}(b).
b·v₁ = 1+0+1+0 = 2 and v₁·v₁ = 2, so proj₁ = (2/2)(1,0,1,0) = (1,0,1,0).
b·v₂ = 0+1+0+1 = 2 and v₂·v₂ = 2, so proj₂ = (2/2)(0,1,0,1) = (0,1,0,1).
p = (1, 1, 1, 1). Note that p = b, which means b ∈ W. Indeed, (1,1,1,1) = (1,0,1,0) + (0,1,0,1).
**Problem 3:** Q^T Q = [3/5 4/5; 4/5 −3/5] [3/5 4/5; 4/5 −3/5] = [9/25+16/25  12/25−12/25; 12/25−12/25  16/25+9/25] = [1 0; 0 1] = I. ✓ Yes, Q is orthogonal.
det(Q) = (3/5)(−3/5) − (4/5)(4/5) = −9/25 − 16/25 = −1, so this is a reflection.
**Problem 4:** a₁ = (1, 0, 1), ‖a₁‖ = √2, q₁ = (1/√2, 0, 1/√2), r₁₁ = √2.
r₁₂ = q₁^T a₂ = (1/√2, 0, 1/√2)·(2, 1, 0) = 2/√2 = √2.
v₂ = a₂ − r₁₂ q₁ = (2, 1, 0) − √2(1/√2, 0, 1/√2) = (2, 1, 0) − (1, 0, 1) = (1, 1, −1).
‖v₂‖² = 1+1+1 = 3, ‖v₂‖ = √3, r₂₂ = √3, q₂ = (1/√3, 1/√3, −1/√3).
$Q = \begin{bmatrix} 1/\sqrt{2} & 1/\sqrt{3} \\ 0 & 1/\sqrt{3} \\ 1/\sqrt{2} & -1/\sqrt{3} \end{bmatrix}, \qquad R = \begin{bmatrix} \sqrt{2} & \sqrt{2} \\ 0 & \sqrt{3} \end{bmatrix}$
Check A = QR. ✓
**Problem 5:**
A^T A = [1 1 1; 0 1 2] [1 0; 1 1; 1 2] = [3 3; 3 5].
A^T b = [1 1 1; 0 1 2] [0; 1; 3] = [4; 7].
Solve [3 3; 3 5] x̂ = [4; 7]:
From eq1: x̂₁ + x̂₂ = 4/3.
Subtract eq1 from eq2: 2x̂₂ = 7−4 = 3, so x̂₂ = 1.5.
x̂₁ = 4/3 − 1.5 = 4/3 − 3/2 = 8/6 − 9/6 = −1/6.
So x̂ = (−1/6, 3/2). The columns of A are (1, t) for t = 0, 1, 2, so the first component is the intercept and the second is the slope: the best-fit line is y = −1/6 + (3/2)t.
Residual ‖Ax̂ − b‖² = ‖(−1/6−0, −1/6+3/2−1, −1/6+3−3)‖² = ‖(−1/6, 1/3, −1/6)‖² = 1/36 + 1/9 + 1/36 = 1/18 + 2/18 = 3/18 = 1/6.
**Problem 6:** Let P be an orthogonal projection matrix (P² = P, P^T = P).
(I−P)² = I² − 2P + P² = I − 2P + P = I − P. ✓ (idempotent)
(I−P)^T = I^T − P^T = I − P. ✓ (symmetric)
Now check what I − P projects onto. For any y, P(I−P)y = (P − P²)y = 0, so C(I−P) ⊆ N(P). Conversely, if x ∈ N(P) then x = x − Px = (I−P)x, so N(P) ⊆ C(I−P). Hence C(I−P) = N(P) = N(P^T) = C(P)^⊥, and I − P projects onto the orthogonal complement of C(P).
**Problem 7:** (Q^T)^T Q^T = Q Q^T. Since Q is square with Q⁻¹ = Q^T, we have Q Q^T = Q Q⁻¹ = I. Thus (Q^T)^T Q^T = I, so Q^T is orthogonal.
Summary
- The orthogonal projection of b onto a subspace W gives the unique closest point in W to b; the projection matrix P = A(A^T A)⁻¹A^T is symmetric and idempotent (P² = P)
- Orthogonal matrices (Q^T Q = I) preserve inner products, norms, and angles; they represent rotations and reflections, and det(Q) = ±1
- QR decomposition A = QR (Gram-Schmidt in matrix form) enables solving Ax = b and least-squares problems via back-substitution on R x = Q^T b, avoiding the formation of A^T A
- The least-squares solution to Ax ≈ b minimizes ‖Ax − b‖² and satisfies the normal equations A^T A x̂ = A^T b; the residual b − A x̂ is orthogonal to C(A)
- Projection is the geometric foundation of least-squares approximation, Fourier series, and many numerical linear algebra algorithms
Pitfalls
- Assuming P² = P is sufficient for orthogonal projection. Idempotency (P² = P) defines a projection, but it could be oblique. An orthogonal projection additionally requires symmetry: P^T = P. Always check both conditions.
- Forming A^T A explicitly for least squares. Computing A^T A squares the condition number, turning a mildly ill-conditioned problem into a severely ill-conditioned one. Use QR decomposition instead: solving R x = Q^T b by back-substitution works with the condition number of A itself rather than its square.
- Confusing Q^T Q = I_n with Q Q^T = I_m for tall matrices. When Q is m×n with m > n and orthonormal columns, Q^T Q = I_n (the n×n identity), but Q Q^T ≠ I_m. The product Q Q^T is the projection matrix onto C(Q), not the identity.
- Forgetting that least squares requires independent columns for uniqueness. If the columns of A are linearly dependent, A^T A is singular and the normal equations have infinitely many solutions. Picking out the minimum-norm solution then requires the pseudoinverse.
- Not verifying orthogonality of the residual. The defining property of the least-squares solution is that the residual b − A x̂ is orthogonal to C(A). Checking that A^T(b − A x̂) = 0 (up to rounding) is a cheap way to catch arithmetic errors; if the check fails, the computed x̂ is not the least-squares solution.
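The tall-matrix pitfall is easy to see numerically. The sketch below takes two orthonormal columns in R^3 (chosen for illustration) and forms both products; the variable names are made up for this example.

```python
import math

# Orthonormal columns of an illustrative 3x2 matrix Q.
s2, s6 = math.sqrt(2), math.sqrt(6)
q1 = [1 / s2, 1 / s2, 0.0]
q2 = [1 / s6, -1 / s6, 2 / s6]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Q^T Q is 2x2: entries are dot products of the columns.
QtQ = [[dot(u, v) for v in (q1, q2)] for u in (q1, q2)]
# Q Q^T is 3x3: the sum of the outer products q1 q1^T + q2 q2^T.
QQt = [[q1[i] * q1[j] + q2[i] * q2[j] for j in range(3)] for i in range(3)]

print(QtQ)   # close to the 2x2 identity
print(QQt)   # a rank-2 projection matrix, not the 3x3 identity
```

The trace of Q Q^T equals 2, the dimension of C(Q), which is one quick way to tell it apart from the 3 × 3 identity.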
Next Steps
Move on to 08-07 - Determinants (Deep) to explore the rigorous definition of determinants via permutations and parity, cofactor expansion, the full list of determinant properties, Cramer's rule, and the geometric interpretation as signed volume.