09-09 — Numerical Linear Algebra
Phase: 9 — Matrix Decompositions & Advanced Linear Algebra Subject: 09-09 Prerequisites: 09-08 — Tensor Algebra Next subject: 09-10 — Matrix Calculus
Learning Objectives
- Analyze floating-point errors and machine epsilon, and apply backward error analysis to linear algebra algorithms
- Define numerical stability and distinguish between forward stability, backward stability, and mixed stability
- Understand sparse matrix storage formats (CSR, CSC, COO) and their computational implications
- Implement and compare stationary iterative methods: Jacobi, Gauss-Seidel, and Successive Over-Relaxation (SOR)
- Explain the Conjugate Gradient (CG) method for SPD systems and the role of preconditioning
Core Content
1. Floating-Point Arithmetic and Errors
In IEEE 754 double precision, numbers are represented with 53 bits of mantissa (~15-16 decimal digits). The machine epsilon $ε_mach$ is approximately 2^{-52} ≈ 2.22 × 10^{-16}.
Floating-point model: For any real number x within range:
$fl(x) = x(1 + δ) where |δ| ≤ ε_mach $
For an arithmetic operation ∘ ∈ {+, -, ×, ÷}:
$fl(x ∘ y) = (x ∘ y)(1 + δ) where |δ| ≤ ε_mach $
Accumulation of errors in linear algebra:
Consider computing the inner product $s = Σ_{i=1}^{n} x_i y_i$ in floating point:
CRITICAL -- Foundational: NEVER compute A^{-1} explicitly -- use solve(A, b). Computing inverses is 3x more expensive and numerically less stable. Use np.linalg.solve, not np.linalg.inv.
$fl(s) = Σ x_i y_i (1 + δ_i) $
where each $δ_i$ is bounded by roughly $n·ε_mach$. The error grows linearly (not exponentially) with n for basic operations.
Forward error: $||computed - true||$ Backward error: How much we'd need to perturb the input for the exact solution to equal our computed one.
For a linear system $Ax = b$:
- Computed x̂ satisfies $(A + ΔA) x̂ = b$ for some small $ΔA$ (backward error)
- Forward error: $||x̂ - x|| ≤ κ(A) · ||ΔA||/||A||$
2. Numerical Stability
An algorithm is backward stable if it gives the exact answer to a slightly perturbed problem. Specifically, for solving $Ax = b$, the computed x̂ satisfies $(A + ΔA) x̂ = b + Δb$ where $||ΔA||/||A||$ and $||Δb||/||b||$ are O(ε_mach).
Stability of key algorithms:
| Algorithm | Backward Stable? | Notes |
|---|---|---|
| Gaussian elimination with partial pivoting | Usually yes | Growth factor can be large (theoretically 2^{n-1}, but rarely in practice) |
| Cholesky | Yes | No pivoting needed for SPD |
| QR (Householder) | Yes | Gold standard for least squares |
| QR (Modified Gram-Schmidt) | Near-stable | Better than CGS, not quite as stable as Householder |
| QR (Classical Gram-Schmidt) | No | Orthogonality degrades badly |
| Normal equations (A^T A) | No | Squares condition number |
| Power iteration | Conditionally | Stable for well-separated eigenvalues |
Growth factor in GE: With partial pivoting, elements in U are bounded by $2^{n-1} max|a_{ij}|$ in the worst case. In practice, the growth factor is almost always small, making Gaussian elimination with partial pivoting effectively backward stable.
3. Sparse Matrices
A matrix is sparse if most entries are zero. Storing and operating on only the nonzeros yields enormous savings.
Common storage formats:
COO (Coordinate): Store (row, col, value) triples. Simple but inefficient for arithmetic.
row: [0, 0, 1, 2, 2]
col: [0, 2, 1, 0, 2]
values: [1, 2, 3, 4, 5]
CSR (Compressed Sparse Row): Three arrays:
- values: nonzero entries, row by row
- col_idx: column index of each value
- row_ptr: index in values/col_idx where each row starts
$A = [1 0 2] values: [1, 2, 3, 4, 5]
[0 3 0] col_idx: [0, 2, 1, 0, 2]
[4 0 5] row_ptr: [0, 2, 3, 5]
$
Matrix-vector product $y = Ax$ in CSR:
for i = 0 to n-1:
y[i] = 0
for k = row_ptr[i] to row_ptr[i+1]-1:
y[i] += values[k] * x[col_idx[k]]
Cost: O(nnz) instead of O(n²), where nnz = number of nonzeros.
CSC (Compressed Sparse Column): Analogous to CSR but column-oriented. Better for algorithms that access by column (e.g., forward substitution with L).
Applications: Finite element methods, graph algorithms (adjacency matrices), PageRank, large-scale optimization.
Fill-in: Sparse factorization (LU, Cholesky) introduces new nonzeros ("fill-in"). Reordering (e.g., minimum degree, nested dissection, Cuthill-McKee) dramatically reduces fill-in.
4. Stationary Iterative Methods
For large sparse systems where direct methods are too expensive, iterative methods approximate the solution incrementally.
General splitting: Write $A = M - N$ where M is easily invertible. The iteration:
$M x^{(k+1)} = N x^{(k)} + b
=> x^{(k+1)} = M^{-1} N x^{(k)} + M^{-1} b
$
Jacobi method: $M = D$ (diagonal of A), $N = D - A$:
x_i^{(k+1)} = (b_i - Σ_{j≠i} a_{ij} x_j^{(k)}) / a_{ii}
Each component is updated using ONLY values from the previous iteration. Highly parallelizable.
Gauss-Seidel: $M = L + D$ (lower triangular + diagonal):
x_i^{(k+1)} = (b_i - Σ_{j<i} a_{ij} x_j^{(k+1)} - Σ_{j>i} a_{ij} x_j^{(k)}) / a_{ii}
Uses the most recently computed values. Typically converges twice as fast as Jacobi.
Successive Over-Relaxation (SOR): Accelerates Gauss-Seidel:
$x_i^{(k+1)} = (1-ω) x_i^{(k)} + ω · (Gauss-Seidel update)
$
For SPD matrices, optimal ω ∈ (0, 2) can dramatically accelerate convergence.
Convergence condition: All methods converge for any initial guess iff $ρ(M^{-1}N) < 1$, where ρ is the spectral radius. For Jacobi and Gauss-Seidel, diagonal dominance ensures convergence.
5. Conjugate Gradient Method
For SPD systems $Ax = b$, CG is an optimal Krylov subspace method.
Key idea: At step k, CG finds the best approximation x_k in the Krylov subspace:
K_k = span{b, Ab, A²b, ..., A^{k-1}b}
that minimizes $||x - x_k||_A$ (the A-norm: $||v||_A² = v^T A v$).
Algorithm:
x_0 = initial guess
r_0 = b - A x_0
p_0 = r_0
for k = 0, 1, 2, ...:
α_k = (r_k^T r_k) / (p_k^T A p_k)
x_{k+1} = x_k + α_k p_k
r_{k+1} = r_k - α_k A p_k
β_k = (r_{k+1}^T r_{k+1}) / (r_k^T r_k)
p_{k+1} = r_{k+1} + β_k p_k
Convergence: Exact in n steps (in exact arithmetic) because the Krylov subspace spans ℝ^n. In practice, much faster with good conditioning. Error bound:
$||x_k - x_*||_A ≤ 2 ((√κ - 1)/(√κ + 1))^k ||x_0 - x_*||_A $
where κ = κ₂(A).
Convergence is fast when κ is small and/or when eigenvalues are clustered.
6. Preconditioning
Preconditioning transforms $Ax = b$ into an equivalent system with a better condition number:
$M^{-1} A x = M^{-1} b (left preconditioning)
$
or
$A M^{-1} y = b, x = M^{-1} y (right preconditioning)
$
where $M ≈ A$ but M^{-1} is easy to apply.
Common preconditioners:
- Jacobi (diagonal): $M = diag(A)$. Simple but limited.
- Symmetric Gauss-Seidel: One forward + one backward sweep.
- Incomplete Cholesky (IC): Cholesky with zero fill-in at positions where A is zero.
- Incomplete LU (ILU): LU with controlled fill-in.
- Multigrid: Hierarchical; optimal O(n) for many PDE problems.
- Algebraic Multigrid (AMG): Black-box multigrid from matrix only.
Preconditioned CG (PCG):
r_0 = b - A x_0
z_0 = M^{-1} r_0 (solve M z_0 = r_0)
p_0 = z_0
for k = 0, 1, 2, ...:
α_k = (r_k^T z_k) / (p_k^T A p_k)
x_{k+1} = x_k + α_k p_k
r_{k+1} = r_k - α_k A p_k
z_{k+1} = M^{-1} r_{k+1}
β_k = (r_{k+1}^T z_{k+1}) / (r_k^T z_k)
p_{k+1} = z_{k+1} + β_k p_k
A good preconditioner can reduce iterations from thousands to tens.
Key Terms
- DIAgonal format
Worked Examples
Example 1: Jacobi Iteration
Solve:
$4x + y = 5 x + 3y = 4 $
True solution: $x = 1, y = 1$.
Jacobi updates:
$x^{(k+1)} = (5 - y^{(k)}) / 4
y^{(k+1)} = (4 - x^{(k)}) / 3
$
Start $x^{(0)} = 0, y^{(0)} = 0$:
| k | x | y |
|---|---|---|
| 0 | 0 | 0 |
| 1 | 1.25 | 1.333 |
| 2 | 0.917 | 0.917 |
| 3 | 1.021 | 1.028 |
| 4 | 0.993 | 0.993 |
| 5 | 1.002 | 1.002 |
Convergence is linear but effective.
Example 2: Gauss-Seidel Iteration
Same system. Updates:
$x^{(k+1)} = (5 - y^{(k)}) / 4
y^{(k+1)} = (4 - x^{(k+1)}) / 3
$
Note: y uses the just-computed x!
| k | x | y |
|---|---|---|
| 0 | 0 | 0 |
| 1 | 1.25 | 0.917 |
| 2 | 1.021 | 0.993 |
| 3 | 1.002 | 0.999 |
Gauss-Seidel converges faster — roughly twice as fast for this problem.
Example 3: Conjugate Gradient — 2 Steps
$A = [3 1] b = [5]
[1 3] [5]
$
Exact solution: $x = [1.25, 1.25]^T$.
x₀ = [0, 0]^T, r₀ = [5, 5]^T, p₀ = [5, 5]^T.
Iteration 1: $α₀ = (5²+5²) / (p₀^T A p₀) = 50 / ([5,5][20,20]^T) = 50/200 = 0.25$ $x₁ = [0,0] + 0.25[5,5] = [1.25, 1.25]^T$ $r₁ = [5,5] - 0.25[20,20]^T = [0,0]^T$ Done in 1 step! (Because A has only 2 distinct eigenvalues with multiplicity, and x₀ was 0.)
Theoretically CG takes at most n steps; in practice, much fewer.
Quiz
Q1: What does the concept of DIAgonal format primarily refer to in this subject?
A) A historical anecdote about DIAgonal format B) The definition and application of DIAgonal format C) A visual representation of DIAgonal format D) A computational error related to DIAgonal format
Correct: B)
- If you chose A: This is incorrect. DIAgonal format is defined as: the definition and application of diagonal format. The other options describe different aspects that are not the primary focus.
- If you chose B: DIAgonal format is defined as: the definition and application of diagonal format. The other options describe different aspects that are not the primary focus. Correct!
- If you chose C: This is incorrect. DIAgonal format is defined as: the definition and application of diagonal format. The other options describe different aspects that are not the primary focus.
- If you chose D: This is incorrect. DIAgonal format is defined as: the definition and application of diagonal format. The other options describe different aspects that are not the primary focus.
Q2: What is the primary purpose of Common Pitfalls?
A) It replaces all other methods in this domain B) It is primarily a historical notation system C) It is used only in advanced research contexts D) It is used to common pitfalls in mathematical analysis
Correct: D)
- If you chose A: This is incorrect. Common Pitfalls serves the purpose described in the correct answer. The other options misrepresent its role.
- If you chose B: This is incorrect. Common Pitfalls serves the purpose described in the correct answer. The other options misrepresent its role.
- If you chose C: This is incorrect. Common Pitfalls serves the purpose described in the correct answer. The other options misrepresent its role.
- If you chose D: Common Pitfalls serves the purpose described in the correct answer. The other options misrepresent its role. Correct!
Q3: Based on the worked examples in this subject, what is the correct result?
A) A different result from a common mistake B) The inverse of the correct answer C) An unrelated numerical value D) [0,0]
Correct: D)
- If you chose A: This is incorrect. The worked examples show that the result is [0,0]. The other options represent common errors.
- If you chose B: This is incorrect. The worked examples show that the result is [0,0]. The other options represent common errors.
- If you chose C: This is incorrect. The worked examples show that the result is [0,0]. The other options represent common errors.
- If you chose D: The worked examples show that the result is [0,0]. The other options represent common errors. Correct!
Practice Problems
-
Compute 3 iterations of Jacobi on $[[5,2],[2,4]] x = [9, 8]^T$ starting from x₀ = [0, 0]^T.
-
For the same system, compute 3 iterations of Gauss-Seidel. Compare convergence.
-
Show that the spectral radius $ρ(M^{-1}N)$ for Jacobi on a strictly diagonally dominant matrix is < 1 (guaranteeing convergence).
-
Using CSR format, represent the matrix
[[0,3,0],[1,0,0],[0,0,2]]and compute y = Ax for x = [1, 2, 3]^T using the CSR representation. -
Explain why the condition number of
A^T Ais κ(A)² and why this makes normal equations numerically inferior to QR for least squares.
Answers
1. Jacobi: k=0: x=[0,0] k=1: x₁=(9-0)/5=1.8, x₂=(8-0)/4=2.0 → [1.8, 2.0] k=2: x₁=(9-2·2.0)/5=1.0, x₂=(8-2·1.8)/4=1.1 → [1.0, 1.1] k=3: x₁=(9-2·1.1)/5=1.36, x₂=(8-2·1.0)/4=1.5 → [1.36, 1.5] True solution: [1, 1]^T. (Converging slowly because off-diagonal is relatively large.) 2. Gauss-Seidel: k=0: x=[0,0] k=1: x₁=(9-0)/5=1.8, x₂=(8-2·1.8)/4=1.1 → [1.8, 1.1] k=2: x₁=(9-2·1.1)/5=1.36, x₂=(8-2·1.36)/4=1.32 → [1.36, 1.32] k=3: x₁=(9-2·1.32)/5=1.272, x₂=(8-2·1.272)/4=1.364 → [1.272, 1.364] Faster than Jacobi. Both oscillate around the true solution [1, 1]. 3. For Jacobi: M = D (diagonal), N = D - A. M^{-1}N = D^{-1}(D-A) = I - D^{-1}A. For strictly diagonally dominant A: |a_{ii}| > Σ_{j≠i} |a_{ij}|. The row sums of |I - D^{-1}A| are Σ_{j≠i} |a_{ij}/a_{ii}| < 1. By Gershgorin's theorem, all eigenvalues of M^{-1}N have magnitude < 1, so ρ < 1. 4. CSR: values = [3, 1, 2] col_idx = [1, 0, 2] row_ptr = [0, 1, 2, 3] y = Ax: y₀ = values[0]·x[col_idx[0]] = 3·2 = 6. y₁ = values[1]·x[col_idx[1]] = 1·1 = 1. y₂ = values[2]·x[col_idx[2]] = 2·3 = 6. y = [6, 1, 6]^T. Verify: A[1,2,3]^T = [6,1,6]^T ✓. 5. `κ(A^T A) = ||A^T A||₂ · ||(A^T A)^{-1}||₂ = σ_max² · (1/σ_min²) = (σ_max/σ_min)² = κ(A)²`. Forming A^T A squares the condition number, so for κ(A) = 10³, normal equations lose ~6 digits compared to ~3 for QR. When κ(A) is large, this is catastrophic.Summary
- Floating-point arithmetic introduces relative errors bounded by ε_mach ≈ 10⁻¹⁶; backward error analysis reveals how perturbations propagate
- Backward stability means the computed solution is exact for a slightly perturbed problem; Gaussian elimination with pivoting, Cholesky, and Householder QR are backward stable
Common Pitfall: Stable != accurate. A stable algorithm gives the exact solution to a NEARBY problem. If the problem is ill-conditioned, even stable algorithms give inaccurate results. - Sparse matrices (CSR/CSC/COO formats) enable O(nnz) instead of O(n²) operations — critical for large-scale problems - Stationary iterative methods (Jacobi, Gauss-Seidel, SOR) converge when ρ(M^{-1}N) < 1; Gauss-Seidel is typically 2× faster than Jacobi - Conjugate Gradient is optimal for SPD systems: minimizes A-norm error over Krylov subspaces, converging faster with clustered eigenvalues - Preconditioning transforms M^{-1}Ax = M^{-1}b to reduce κ, dramatically accelerating iterative methods
Pitfalls
- Computing A^{-1} explicitly. Never form the inverse — use
solve(A, b)instead. Computing A^{-1} is roughly 3× more expensive and numerically less stable than Gaussian elimination with forward/backward substitution. - Assuming backward stability guarantees accuracy. A backward stable algorithm gives the exact solution to a slightly perturbed problem. If the problem is ill-conditioned, even a perfectly stable algorithm produces large forward errors.
- Using Classical Gram-Schmidt for orthogonalization. CGS loses orthogonality catastrophically due to round-off. Use Modified Gram-Schmidt or (better) Householder transformations for production code.
- Expecting Jacobi or Gauss-Seidel to converge unconditionally. These methods require ρ(M^{-1}N) < 1. Diagonal dominance is sufficient but not necessary — always check the spectral radius for your specific matrix.
- Neglecting preconditioning for CG. Conjugate Gradient on an ill-conditioned system without preconditioning can take thousands of iterations. A good preconditioner (incomplete Cholesky, algebraic multigrid) can reduce this to tens.
Next Steps
Continue to 09-10 Matrix Calculus for the final Phase 9 subject: derivatives involving vectors and matrices, including gradients, Jacobians, and the chain rule in matrix form.