Math graphic
📐 Concept diagram

09-09 — Numerical Linear Algebra

Phase: 9 — Matrix Decompositions & Advanced Linear Algebra Subject: 09-09 Prerequisites: 09-08 — Tensor Algebra Next subject: 09-10 — Matrix Calculus


Learning Objectives

  1. Analyze floating-point errors and machine epsilon, and apply backward error analysis to linear algebra algorithms
  2. Define numerical stability and distinguish between forward stability, backward stability, and mixed stability
  3. Understand sparse matrix storage formats (CSR, CSC, COO) and their computational implications
  4. Implement and compare stationary iterative methods: Jacobi, Gauss-Seidel, and Successive Over-Relaxation (SOR)
  5. Explain the Conjugate Gradient (CG) method for SPD systems and the role of preconditioning

Core Content

1. Floating-Point Arithmetic and Errors

In IEEE 754 double precision, numbers are represented with 53 bits of mantissa (~15-16 decimal digits). The machine epsilon $ε_mach$ is approximately 2^{-52} ≈ 2.22 × 10^{-16}.

Floating-point model: For any real number x within range:

$fl(x) = x(1 + δ)  where |δ| ≤ ε_mach
$

For an arithmetic operation ∘ ∈ {+, -, ×, ÷}:

$fl(x ∘ y) = (x ∘ y)(1 + δ)  where |δ| ≤ ε_mach
$

Accumulation of errors in linear algebra:

Consider computing the inner product $s = Σ_{i=1}^{n} x_i y_i$ in floating point:

CRITICAL -- Foundational: NEVER compute A^{-1} explicitly -- use solve(A, b). Computing inverses is 3x more expensive and numerically less stable. Use np.linalg.solve, not np.linalg.inv.

$fl(s) = Σ x_i y_i (1 + δ_i)
$

where each $δ_i$ is bounded by roughly $n·ε_mach$. The error grows linearly (not exponentially) with n for basic operations.

Forward error: $||computed - true||$ Backward error: How much we'd need to perturb the input for the exact solution to equal our computed one.

For a linear system $Ax = b$: - Computed satisfies $(A + ΔA) x̂ = b$ for some small $ΔA$ (backward error) - Forward error: $||x̂ - x|| ≤ κ(A) · ||ΔA||/||A||$

2. Numerical Stability

An algorithm is backward stable if it gives the exact answer to a slightly perturbed problem. Specifically, for solving $Ax = b$, the computed satisfies $(A + ΔA) x̂ = b + Δb$ where $||ΔA||/||A||$ and $||Δb||/||b||$ are O(ε_mach).

Stability of key algorithms:

Algorithm Backward Stable? Notes
Gaussian elimination with partial pivoting Usually yes Growth factor can be large (theoretically 2^{n-1}, but rarely in practice)
Cholesky Yes No pivoting needed for SPD
QR (Householder) Yes Gold standard for least squares
QR (Modified Gram-Schmidt) Near-stable Better than CGS, not quite as stable as Householder
QR (Classical Gram-Schmidt) No Orthogonality degrades badly
Normal equations (A^T A) No Squares condition number
Power iteration Conditionally Stable for well-separated eigenvalues

Growth factor in GE: With partial pivoting, elements in U are bounded by $2^{n-1} max|a_{ij}|$ in the worst case. In practice, the growth factor is almost always small, making Gaussian elimination with partial pivoting effectively backward stable.

3. Sparse Matrices

A matrix is sparse if most entries are zero. Storing and operating on only the nonzeros yields enormous savings.

Common storage formats:

COO (Coordinate): Store (row, col, value) triples. Simple but inefficient for arithmetic.

row:    [0, 0, 1, 2, 2]
col:    [0, 2, 1, 0, 2]
values: [1, 2, 3, 4, 5]

CSR (Compressed Sparse Row): Three arrays: - values: nonzero entries, row by row - col_idx: column index of each value - row_ptr: index in values/col_idx where each row starts

$A = [1  0  2]    values:  [1, 2, 3, 4, 5]
    [0  3  0]    col_idx: [0, 2, 1, 0, 2]
    [4  0  5]    row_ptr: [0, 2, 3, 5]
$

Matrix-vector product $y = Ax$ in CSR:

for i = 0 to n-1:
    y[i] = 0
    for k = row_ptr[i] to row_ptr[i+1]-1:
        y[i] += values[k] * x[col_idx[k]]

Cost: O(nnz) instead of O(n²), where nnz = number of nonzeros.

CSC (Compressed Sparse Column): Analogous to CSR but column-oriented. Better for algorithms that access by column (e.g., forward substitution with L).

Applications: Finite element methods, graph algorithms (adjacency matrices), PageRank, large-scale optimization.

Fill-in: Sparse factorization (LU, Cholesky) introduces new nonzeros ("fill-in"). Reordering (e.g., minimum degree, nested dissection, Cuthill-McKee) dramatically reduces fill-in.

4. Stationary Iterative Methods

For large sparse systems where direct methods are too expensive, iterative methods approximate the solution incrementally.

General splitting: Write $A = M - N$ where M is easily invertible. The iteration:

$M x^{(k+1)} = N x^{(k)} + b
=> x^{(k+1)} = M^{-1} N x^{(k)} + M^{-1} b
$

Jacobi method: $M = D$ (diagonal of A), $N = D - A$:

x_i^{(k+1)} = (b_i - Σ_{j≠i} a_{ij} x_j^{(k)}) / a_{ii}

Each component is updated using ONLY values from the previous iteration. Highly parallelizable.

Gauss-Seidel: $M = L + D$ (lower triangular + diagonal):

x_i^{(k+1)} = (b_i - Σ_{j<i} a_{ij} x_j^{(k+1)} - Σ_{j>i} a_{ij} x_j^{(k)}) / a_{ii}

Uses the most recently computed values. Typically converges twice as fast as Jacobi.

Successive Over-Relaxation (SOR): Accelerates Gauss-Seidel:

$x_i^{(k+1)} = (1-ω) x_i^{(k)} + ω · (Gauss-Seidel update)
$

For SPD matrices, optimal ω ∈ (0, 2) can dramatically accelerate convergence.

Convergence condition: All methods converge for any initial guess iff $ρ(M^{-1}N) < 1$, where ρ is the spectral radius. For Jacobi and Gauss-Seidel, diagonal dominance ensures convergence.

5. Conjugate Gradient Method

For SPD systems $Ax = b$, CG is an optimal Krylov subspace method.

Key idea: At step k, CG finds the best approximation x_k in the Krylov subspace:

K_k = span{b, Ab, A²b, ..., A^{k-1}b}

that minimizes $||x - x_k||_A$ (the A-norm: $||v||_A² = v^T A v$).

Algorithm:

x_0 = initial guess
r_0 = b - A x_0
p_0 = r_0
for k = 0, 1, 2, ...:
    α_k = (r_k^T r_k) / (p_k^T A p_k)
    x_{k+1} = x_k + α_k p_k
    r_{k+1} = r_k - α_k A p_k
    β_k = (r_{k+1}^T r_{k+1}) / (r_k^T r_k)
    p_{k+1} = r_{k+1} + β_k p_k

Convergence: Exact in n steps (in exact arithmetic) because the Krylov subspace spans ℝ^n. In practice, much faster with good conditioning. Error bound:

$||x_k - x_*||_A ≤ 2 ((√κ - 1)/(√κ + 1))^k ||x_0 - x_*||_A
$

where κ = κ₂(A).

Convergence is fast when κ is small and/or when eigenvalues are clustered.

6. Preconditioning

Preconditioning transforms $Ax = b$ into an equivalent system with a better condition number:

$M^{-1} A x = M^{-1} b    (left preconditioning)
$

or

$A M^{-1} y = b,  x = M^{-1} y    (right preconditioning)
$

where $M ≈ A$ but M^{-1} is easy to apply.

Common preconditioners:

Preconditioned CG (PCG):

r_0 = b - A x_0
z_0 = M^{-1} r_0    (solve M z_0 = r_0)
p_0 = z_0
for k = 0, 1, 2, ...:
    α_k = (r_k^T z_k) / (p_k^T A p_k)
    x_{k+1} = x_k + α_k p_k
    r_{k+1} = r_k - α_k A p_k
    z_{k+1} = M^{-1} r_{k+1}
    β_k = (r_{k+1}^T z_{k+1}) / (r_k^T z_k)
    p_{k+1} = z_{k+1} + β_k p_k

A good preconditioner can reduce iterations from thousands to tens.



Key Terms

Worked Examples

Example 1: Jacobi Iteration

Solve:

$4x + y = 5
x + 3y = 4
$

True solution: $x = 1, y = 1$.

Jacobi updates:

$x^{(k+1)} = (5 - y^{(k)}) / 4
y^{(k+1)} = (4 - x^{(k)}) / 3
$

Start $x^{(0)} = 0, y^{(0)} = 0$:

k x y
0 0 0
1 1.25 1.333
2 0.917 0.917
3 1.021 1.028
4 0.993 0.993
5 1.002 1.002

Convergence is linear but effective.

Example 2: Gauss-Seidel Iteration

Same system. Updates:

$x^{(k+1)} = (5 - y^{(k)}) / 4
y^{(k+1)} = (4 - x^{(k+1)}) / 3
$

Note: y uses the just-computed x!

k x y
0 0 0
1 1.25 0.917
2 1.021 0.993
3 1.002 0.999

Gauss-Seidel converges faster — roughly twice as fast for this problem.

Example 3: Conjugate Gradient — 2 Steps

$A = [3  1]    b = [5]
    [1  3]        [5]
$

Exact solution: $x = [1.25, 1.25]^T$.

x₀ = [0, 0]^T, r₀ = [5, 5]^T, p₀ = [5, 5]^T.

Iteration 1: $α₀ = (5²+5²) / (p₀^T A p₀) = 50 / ([5,5][20,20]^T) = 50/200 = 0.25$ $x₁ = [0,0] + 0.25[5,5] = [1.25, 1.25]^T$ $r₁ = [5,5] - 0.25[20,20]^T = [0,0]^T$ Done in 1 step! (Because A has only 2 distinct eigenvalues with multiplicity, and x₀ was 0.)

Theoretically CG takes at most n steps; in practice, much fewer.


Quiz

Q1: What does the concept of DIAgonal format primarily refer to in this subject?

A) A historical anecdote about DIAgonal format B) The definition and application of DIAgonal format C) A visual representation of DIAgonal format D) A computational error related to DIAgonal format

Correct: B)

Q2: What is the primary purpose of Common Pitfalls?

A) It replaces all other methods in this domain B) It is primarily a historical notation system C) It is used only in advanced research contexts D) It is used to common pitfalls in mathematical analysis

Correct: D)

Q3: Based on the worked examples in this subject, what is the correct result?

A) A different result from a common mistake B) The inverse of the correct answer C) An unrelated numerical value D) [0,0]

Correct: D)

Practice Problems

  1. Compute 3 iterations of Jacobi on $[[5,2],[2,4]] x = [9, 8]^T$ starting from x₀ = [0, 0]^T.

  2. For the same system, compute 3 iterations of Gauss-Seidel. Compare convergence.

  3. Show that the spectral radius $ρ(M^{-1}N)$ for Jacobi on a strictly diagonally dominant matrix is < 1 (guaranteeing convergence).

  4. Using CSR format, represent the matrix [[0,3,0],[1,0,0],[0,0,2]] and compute y = Ax for x = [1, 2, 3]^T using the CSR representation.

  5. Explain why the condition number of A^T A is κ(A)² and why this makes normal equations numerically inferior to QR for least squares.

Answers 1. Jacobi: k=0: x=[0,0] k=1: x₁=(9-0)/5=1.8, x₂=(8-0)/4=2.0 → [1.8, 2.0] k=2: x₁=(9-2·2.0)/5=1.0, x₂=(8-2·1.8)/4=1.1 → [1.0, 1.1] k=3: x₁=(9-2·1.1)/5=1.36, x₂=(8-2·1.0)/4=1.5 → [1.36, 1.5] True solution: [1, 1]^T. (Converging slowly because off-diagonal is relatively large.) 2. Gauss-Seidel: k=0: x=[0,0] k=1: x₁=(9-0)/5=1.8, x₂=(8-2·1.8)/4=1.1 → [1.8, 1.1] k=2: x₁=(9-2·1.1)/5=1.36, x₂=(8-2·1.36)/4=1.32 → [1.36, 1.32] k=3: x₁=(9-2·1.32)/5=1.272, x₂=(8-2·1.272)/4=1.364 → [1.272, 1.364] Faster than Jacobi. Both oscillate around the true solution [1, 1]. 3. For Jacobi: M = D (diagonal), N = D - A. M^{-1}N = D^{-1}(D-A) = I - D^{-1}A. For strictly diagonally dominant A: |a_{ii}| > Σ_{j≠i} |a_{ij}|. The row sums of |I - D^{-1}A| are Σ_{j≠i} |a_{ij}/a_{ii}| < 1. By Gershgorin's theorem, all eigenvalues of M^{-1}N have magnitude < 1, so ρ < 1. 4. CSR: values = [3, 1, 2] col_idx = [1, 0, 2] row_ptr = [0, 1, 2, 3] y = Ax: y₀ = values[0]·x[col_idx[0]] = 3·2 = 6. y₁ = values[1]·x[col_idx[1]] = 1·1 = 1. y₂ = values[2]·x[col_idx[2]] = 2·3 = 6. y = [6, 1, 6]^T. Verify: A[1,2,3]^T = [6,1,6]^T ✓. 5. `κ(A^T A) = ||A^T A||₂ · ||(A^T A)^{-1}||₂ = σ_max² · (1/σ_min²) = (σ_max/σ_min)² = κ(A)²`. Forming A^T A squares the condition number, so for κ(A) = 10³, normal equations lose ~6 digits compared to ~3 for QR. When κ(A) is large, this is catastrophic.

Summary

Common Pitfall: Stable != accurate. A stable algorithm gives the exact solution to a NEARBY problem. If the problem is ill-conditioned, even stable algorithms give inaccurate results. - Sparse matrices (CSR/CSC/COO formats) enable O(nnz) instead of O(n²) operations — critical for large-scale problems - Stationary iterative methods (Jacobi, Gauss-Seidel, SOR) converge when ρ(M^{-1}N) < 1; Gauss-Seidel is typically 2× faster than Jacobi - Conjugate Gradient is optimal for SPD systems: minimizes A-norm error over Krylov subspaces, converging faster with clustered eigenvalues - Preconditioning transforms M^{-1}Ax = M^{-1}b to reduce κ, dramatically accelerating iterative methods


Pitfalls



Next Steps

Continue to 09-10 Matrix Calculus for the final Phase 9 subject: derivatives involving vectors and matrices, including gradients, Jacobians, and the chain rule in matrix form.