09-02 — QR Decomposition
Phase: 9 — Matrix Decompositions & Advanced Linear Algebra Subject: 09-02 Prerequisites: 09-01 — LU Decomposition Next subject: 09-03 — Singular Value Decomposition (SVD)
Learning Objectives
- Derive the QR decomposition $A = QR$ where
Qis orthogonal andRis upper-triangular - Compute QR via Classical and Modified Gram-Schmidt, understanding their numerical differences
- Explain Householder reflections and Givens rotations as numerically stable alternatives for computing QR
- Apply QR decomposition to solve the linear least squares problem $min ||Ax - b||$
- Describe QR's role in the QR algorithm for eigenvalue computation
Core Content
QR decomposition factors an $m × n$ matrix A (with $m ≥ n$) into $A = QR$, where Q is an $m × m$ orthogonal matrix ($Q^T Q = I$) and R is an $m × n$ upper-triangular matrix.
CRITICAL -- Foundational: QR decomposition A = QR is Gram-Schmidt in matrix form. Q preserves lengths/angles. QR powers the QR eigenvalue algorithm and least-squares solvers.
For the "economy" or "thin" QR: $A = Q_1 R_1$ where $Q_1$ is $m × n$ with orthonormal columns and $R_1$ is $n × n$ upper-triangular.
1. Gram-Schmidt QR
The columns of A, denoted $a_1, a_2, ..., a_n$, span some subspace. QR constructs orthonormal columns $q_1, q_2, ..., q_n$ spanning the same subspace, in order.
Classical Gram-Schmidt (CGS):
Common Pitfall: Classical Gram-Schmidt loses orthogonality in floating point. Use Modified Gram-Schmidt or Householder reflections. Never use CGS in production code.
$v_1 = a_1 q_1 = v_1 / ||v_1|| v_2 = a_2 - (q_1^T a_2) q_1 q_2 = v_2 / ||v_2|| v_3 = a_3 - (q_1^T a_3) q_1 - (q_2^T a_3) q_2 q_3 = v_3 / ||v_3|| ... $
The entries of R: $r_{ij} = q_i^T a_j$ for i < j, and $r_{jj} = ||v_j||$.
Matrix form derivation:
From q_1 = a_1 / r_{11}:
$a_1 = r_{11} q_1
$
From $a_2 = q_1^T a_2 q_1 + v_2 = r_{12} q_1 + r_{22} q_2$:
$a_2 = r_{12} q_1 + r_{22} q_2
$
In general:
$a_j = sum_{i=1}^{j} r_{ij} q_i
$
Stacking as columns: $A = Q R$, where R is upper-triangular.
CRITICAL -- Foundational: QR decomposition is Gram-Schmidt in matrix form. Q preserves lengths/angles. QR powers the QR eigenvalue algorithm and least-squares solvers.
$[a_1 a_2 ... a_n] = [q_1 q_2 ... q_n] [r_{11} r_{12} ... r_{1n}]
[ 0 r_{22} ... r_{2n}]
[ ... ... ]
[ 0 0 ... r_{nn}]
$
Numerical issue with CGS: In floating-point arithmetic, orthogonality of computed q vectors degrades for ill-conditioned A. The cause: q_i can have a small component in directions already subtracted, accumulating round-off errors.
Modified Gram-Schmidt (MGS): Subtract projections one at a time, using the most recently computed orthogonal vector:
for j = 1:n:
v = a_j
for i = 1:j-1:
r_{ij} = q_i^T v
v = v - r_{ij} q_i
r_{jj} = ||v||
q_j = v / r_{jj}
MGS is numerically more stable because it subtracts projections against the current v rather than the original a_j, reducing error propagation.
2. Householder Reflections
A Householder reflection is a symmetric orthogonal matrix:
$H = I - 2 v v^T where ||v|| = 1 $
Geometrically, H reflects a vector across the hyperplane orthogonal to v.
Key property: For any vector x, we can choose v so that Hx is a multiple of $e_1$ (the first unit vector). Specifically:
Let $v = x - α e_1$ where $α = -sign(x_1) ||x||$ (the sign choice maximizes stability). Then normalize v:
$Hx = α e_1 $
Householder QR algorithm:
R = A
Q = I
for k = 1:n:
x = R(k:m, k) % the subdiagonal portion of column k
construct H_k to zero out x below the first entry
R = H_k R % apply to trailing submatrix
Q = Q H_k % accumulate (Q = H_1 H_2 ... H_n)
Since each H_k is orthogonal and symmetric, the product $Q = H_1 H_2 ... H_n$ is orthogonal. The resulting R is upper-triangular.
Householder QR requires ~ 2mn² - 2n³/3 flops and is numerically very stable (backward stable).
3. Givens Rotations
A Givens rotation zeros out a single entry using a 2×2 rotation matrix embedded in the identity:
$G(i, j, θ) = [1 ... 0 ... 0 ... 0]
[0 ... c ... s ... 0] row i
[0 ... -s ... c ... 0] row j
[0 ... 0 ... 0 ... 1]
$
where $c = cos θ$, $s = sin θ$. For a vector [a, b]^T:
$c = a / sqrt(a² + b²) s = b / sqrt(a² + b²) $
Then $G [a, b]^T = [sqrt(a²+b²), 0]^T$.
Givens QR: apply rotations one by one to zero out entries below the diagonal. Useful for sparse or structured matrices where you want to introduce zeros selectively.
4. Solving Least Squares with QR
The least squares problem: $minimize ||Ax - b||$ over x.
Given $A = QR$ (full QR):
$||Ax - b||² = ||QRx - b||² = ||Q^T(QRx - b)||² = ||Rx - Q^T b||² $
(since Q preserves norms: $||Qy|| = ||y||$.)
Let $c = Q^T b$. Partition:
$R = [R_1] c = [c_1] (first n rows)
[ 0 ] [c_2] (last m-n rows)
$
Then:
$||Ax - b||² = ||R_1 x - c_1||² + ||c_2||² $
The minimum is achieved when $R_1 x = c_1$ (the normal equations solved stably via triangular solve). The residual norm is $||c_2||$.
Why QR over normal equations? Normal equations $A^T A x = A^T b$ have condition number $κ(A)^2$, making them numerically unstable for ill-conditioned problems. QR works directly with A and has condition number $κ(A)$.
5. Connection to Eigenvalues: the QR Algorithm
The unshifted QR algorithm iterates:
A_0 = A
for k = 0, 1, 2, ...:
compute QR factorization: A_k = Q_k R_k
A_{k+1} = R_k Q_k
Since $A_{k+1} = Q_k^T A_k Q_k$, the iterates are orthogonally similar and share eigenvalues. Under mild conditions, A_k converges to an upper-triangular (or quasi-triangular) matrix whose diagonal entries are the eigenvalues.
Key Terms
- 09 02 Qr Decomposition
- 09-03 Singular Value Decomposition
- Answer: a.
- Answer: b.
- Answer: d.
- End-of-Subject Quiz
- Why QR over normal equations?
Worked Examples
Example 1: QR via Gram-Schmidt (2×2)
Find QR of:
$A = [3 2]
[4 1]
$
Solution:
Column 1: $v_1 = [3, 4]^T$, $||v_1|| = 5$, $q_1 = [3/5, 4/5]^T$
Column 2: $r_{12} = q_1^T a_2 = (3/5)(2) + (4/5)(1) = 10/5 = 2$
$v_2 = a_2 - r_{12} q_1 = [2, 1]^T - 2[3/5, 4/5]^T = [2-6/5, 1-8/5]^T = [4/5, -3/5]^T
||v_2|| = 1
q_2 = [4/5, -3/5]^T
$
$Q = [3/5 4/5] R = [5 2]
[4/5 -3/5] [0 1]
$
Verify: $QR = [[3,2],[4,1]]$ ✓ and $Q^T Q = I$ ✓
Example 2: QR via Householder (3×2)
Compute Householder QR of:
$A = [1 1]
[2 0]
[2 1]
$
Solution:
Step 1: Target column 1. $x = [1, 2, 2]^T$, $||x|| = 3$.
To zero out entries 2 and 3: we want $H_1 x = [-3, 0, 0]^T$ (choose sign for stability; here pick -3 so first entry keeps sign but we use the stable sign convention).
Actually using the stable convention: $α = -sign(x_1)||x|| = -3$.
$v = x - α e_1 = [1, 2, 2]^T - [-3, 0, 0]^T = [4, 2, 2]^T u = v / ||v||: ||v|| = sqrt(24) = 2√6 $
$H_1 = I - 2 u u^T $
But we can compute $H_1 A$ without forming $H_1$ explicitly. We only need $τ = 2/||v||² = 2/24 = 1/12$:
$H_1 A = A - τ v (v^T A) $
$v^T A = [4, 2, 2] [[1,1],[2,0],[2,1]] = [4+4+4, 4+0+2] = [12, 6]
τ v (v^T A) = (1/12) [4,2,2]^T [12,6] = [4,2,2]^T [1, 0.5]
= [[4, 2], [2, 1], [2, 1]]
H_1 A = [[1,1],[2,0],[2,1]] - [[4,2],[2,1],[2,1]]
= [[-3, -1], [0, -1], [0, 0]]
$
Step 2: Now focus on the 2×1 subvector $[-1, 0]^T$ in rows 2-3 of column 2. $x = [-1, 0]^T$, $||x|| = 1$.
$α = -sign(-1)·1 = 1$. $v = [-1, 0]^T - [1, 0]^T = [-2, 0]^T$.
Apply Householder to rows 2-3, but only the bottom of column 2 is affected (top row of column 2 is frozen). After this step:
$R = [-3 -1]
[ 0 1]
[ 0 0]
$
The accumulated $Q = H_1 H_2$ is orthogonal.
Example 3: Least Squares with QR
Solve the overdetermined system in the least squares sense:
$x + y = 2 2x = 3 2x + y = 3 $
$A = [1 1] b = [2]
[2 0] [3]
[2 1] [3]
$
From Example 2, we have the economy QR (first 2 columns of Q, first 2 rows of R):
R_1 = [-3 -1] (note: with sign conventions from earlier)
[ 0 1]
Actually, using MGS gives a cleaner Q. Let me redo with MGS:
$a_1 = [1,2,2]^T$, $r_{11} = 3$, $q_1 = [1/3, 2/3, 2/3]^T$
$r_{12} = q_1^T a_2 = (1/3)(1) + (2/3)(0) + (2/3)(1) = 1$
$v = a_2 - r_{12} q_1 = [1,0,1]^T - [1/3,2/3,2/3]^T = [2/3, -2/3, 1/3]^T$
$r_{22} = ||v|| = sqrt(4/9 + 4/9 + 1/9) = sqrt(1) = 1$
$q_2 = [2/3, -2/3, 1/3]^T$
So:
$R_1 = [3 1] Q_1 = [1/3 2/3]
[0 1] [2/3 -2/3]
[2/3 1/3]
$
Compute $c = Q_1^T b$:
$c_1 = [1/3, 2/3, 2/3]·[2,3,3] = 2/3 + 2 + 2 = 14/3 c_2 = [2/3, -2/3, 1/3]·[2,3,3] = 4/3 - 2 + 1 = 1/3 $
Solve $R_1 x = c$:
$3x_1 + x_2 = 14/3
x_2 = 1/3
$
=> $x_2 = 1/3$, $x_1 = (14/3 - 1/3)/3 = 13/9$
So $x = [13/9, 1/3]^T$.
Residual norm: $||Ax - b|| = sqrt(||c_2||²) = 1/3$. (Note: this is using thin QR; we'd need the full Q to get the formal residual.)
Quiz
Q1: What does the concept of End-of-Subject Quiz primarily refer to in this subject?
A) A historical anecdote about End-of-Subject Quiz B) The definition and application of End-of-Subject Quiz C) A computational error related to End-of-Subject Quiz D) A visual representation of End-of-Subject Quiz
Correct: B)
- If you chose A: This is incorrect. End-of-Subject Quiz is defined as: the definition and application of end-of-subject quiz. The other options describe different aspects that are not the primary focus.
- If you chose B: End-of-Subject Quiz is defined as: the definition and application of end-of-subject quiz. The other options describe different aspects that are not the primary focus. Correct!
- If you chose C: This is incorrect. End-of-Subject Quiz is defined as: the definition and application of end-of-subject quiz. The other options describe different aspects that are not the primary focus.
- If you chose D: This is incorrect. End-of-Subject Quiz is defined as: the definition and application of end-of-subject quiz. The other options describe different aspects that are not the primary focus.
Q2: What is the primary purpose of Common Pitfalls?
A) It replaces all other methods in this domain B) It is used only in advanced research contexts C) It is primarily a historical notation system D) It is used to common pitfalls in mathematical analysis
Correct: D)
- If you chose A: This is incorrect. Common Pitfalls serves the purpose described in the correct answer. The other options misrepresent its role.
- If you chose B: This is incorrect. Common Pitfalls serves the purpose described in the correct answer. The other options misrepresent its role.
- If you chose C: This is incorrect. Common Pitfalls serves the purpose described in the correct answer. The other options misrepresent its role.
- If you chose D: Common Pitfalls serves the purpose described in the correct answer. The other options misrepresent its role. Correct!
Q3: Based on the worked examples in this subject, what is the correct result?
A) An unrelated numerical value B) A different result from a common mistake C) a.** D) The inverse of the correct answer
Correct: C)
- If you chose A: This is incorrect. The worked examples show that the result is a.**. The other options represent common errors.
- If you chose B: This is incorrect. The worked examples show that the result is a.**. The other options represent common errors.
- If you chose C: The worked examples show that the result is a.**. The other options represent common errors. Correct!
- If you chose D: This is incorrect. The worked examples show that the result is a.**. The other options represent common errors.
Practice Problems
-
Compute the QR decomposition of $A = [[1, 1], [1, -1], [0, 1]]$ using Classical Gram-Schmidt.
-
For the matrix in problem 1, use QR to find the least squares solution to $Ax = [3, 1, 1]^T$.
-
Verify that for $A = [[2, -2], [1, 1]]$, CGS and MGS give the same QR (this is well-conditioned). Show the computation.
-
Construct the Householder reflection matrix that maps $x = [3, 4]^T$ to a multiple of $e_1 = [1, 0]^T$.
-
Show that if $A = QR$, then $A^T A = R^T R$. Why is this useful for understanding why QR is more stable than normal equations?
-
Given $Q^T Q = I$, prove that $||Qx|| = ||x||$ for any vector
x.
Answers
1. `q_1 = [1/√2, 1/√2, 0]^T`, `r_{11} = √2` `r_{12} = q_1^T a_2 = 0`, `v_2 = a_2 = [1, -1, 1]^T`, `r_{22} = √3`, `q_2 = [1/√3, -1/√3, 1/√3]^T` `R = [[√2, 0], [0, √3]]`, `Q = [[1/√2, 1/√3], [1/√2, -1/√3], [0, 1/√3]]` 2. `c = Q^T b = [3/√2 + 1/√2, 3/√3 - 1/√3 + 1/√3]^T = [4/√2, 3/√3]^T = [2√2, √3]^T` `R x = c`: `x_1 = (2√2)/√2 = 2`, `x_2 = √3/√3 = 1`. So `x = [2, 1]^T`. 3. CGS: `q_1 = [2/√5, 1/√5]^T`, `r_{12} = q_1^T a_2 = -4/√5 + 1/√5 = -3/√5` `v_2 = [-2,1]^T - (-3/√5)[2/√5, 1/√5]^T = [-2+6/5, 1+3/5]^T = [-4/5, 8/5]^T` `r_{22} = 4/√5`, `q_2 = [-1/√5, 2/√5]^T`. MGS gives identical results. 4. `||x|| = 5`. `α = -sign(3)·5 = -5`. `v = [3,4]^T - [-5,0]^T = [8,4]^T`. `H = I - 2vv^T/||v||² = I - 2[64,32;32,16]/80 = [[1,0],[0,1]] - [[64/40,32/40],[32/40,16/40]]` `= [[-0.6, -0.8], [-0.8, 0.6]]`. Verify: `H[3,4]^T = [-5, 0]^T` ✓ 5. `A^T A = (QR)^T (QR) = R^T Q^T Q R = R^T R`. The normal equations approach computes `A^T A` explicitly, squaring the condition number: `κ(A^T A) = κ(A)^2`. QR works directly with A, maintaining `κ(R) = κ(A)`, so it's typically more numerically stable. 6. `||Qx||² = (Qx)^T (Qx) = x^T Q^T Q x = x^T I x = x^T x = ||x||²`. So `||Qx|| = ||x||`.Summary
- QR decomposition factors $A = QR$ with
Qorthogonal andRupper-triangular, extending Gram-Schmidt to matrix form - Classical Gram-Schmidt is intuitive but numerically unstable; Modified Gram-Schmidt improves stability by sequential projection
- Householder reflections (preferred in practice) and Givens rotations (good for sparse matrices) provide backward-stable QR
- Least squares $min ||Ax-b||$ is solved stably via QR: $R_1 x = Q^T b$ (first n rows), avoiding the condition-squaring of normal equations
- The QR algorithm iteratively applies QR to converge to Schur form, forming the basis of modern eigenvalue computation
Pitfalls
-
Using Classical Gram-Schmidt in production code. CGS loses orthogonality catastrophically for ill-conditioned matrices due to floating-point error accumulation. Always use Modified Gram-Schmidt (sequentially subtract against the current v) or Householder reflections for numerical stability.
-
Using normal equations when QR is available. The normal equations A^T A x = A^T b square the condition number. QR solves R₁ x = Q₁^T b by back-substitution, maintaining the original condition number κ(A) instead of κ(A)². For anything but the most well-conditioned problems, prefer QR.
-
Building economy QR when you need the full Q. The economy QR (Q₁: m×n, R₁: n×n) is sufficient for least squares. But the full residual norm requires the (m−n) trailing rows of Q^T b. If you need the complete residual analysis, compute the full QR.
-
Confusing R from Gram-Schmidt with R from Householder. Gram-Schmidt guarantees positive diagonal entries in R (rⱼⱼ = ‖vⱼ‖ > 0). Householder reflections may produce negative diagonals depending on the sign convention chosen for the reflector. Both are valid QR decompositions, but the R matrices differ.
-
Assuming Q in QR is square for rectangular matrices. For an m×n matrix with m > n, the full Q is m×m but the economy Q₁ is m×n. When applying Q^T b in the least-squares solve, ensure you're using the right Q for the dimensions of b.
Next Steps
Continue to 09-03 Singular Value Decomposition for the most general and powerful matrix factorization: A = U Σ V^T.