08-10 — Symmetric Matrices and the Spectral Theorem

Phase: 8 — Linear Algebra (Rigorous) Subject: 08-10 Prerequisites: 08-07 — Determinants (Deep), 08-08 — Eigenvalues and Eigenvectors, 08-09 — Diagonalisation Next subject: 09-01 — LU Decomposition (Phase 9 — Matrix Decompositions & Advanced Linear Algebra)

Learning Objectives

By the end of this subject, you will be able to:

State and prove the Spectral Theorem for real symmetric matrices: every real symmetric matrix is orthogonally diagonalizable with real eigenvalues
Perform orthogonal diagonalization A = QΛQᵀ by finding orthonormal eigenvectors and constructing an orthogonal matrix Q
Analyze quadratic forms Q(x) = xᵀAx — classify them as positive definite, negative definite, or indefinite using eigenvalues, principal minors, or energy tests
Relate the spectral theorem to geometric concepts including principal axes of ellipsoids, Rayleigh quotients, and constrained optimization
Connect the spectral theorem to downstream applications: PCA, SVD, and the Hessian test in multivariable calculus

Core Content

1. Real Symmetric Matrices — What Makes Them Special

An n × n real matrix A is symmetric if A = Aᵀ, i.e., aᵢⱼ = aⱼᵢ for all i, j.

Symmetric matrices appear everywhere:

Covariance matrices in statistics
Hessian matrices (second derivatives) of smooth functions
Laplacian matrices of graphs
Stiffness matrices in finite element methods
The Gram matrix XᵀX (fundamental in linear regression and PCA)

Key property that sets symmetric matrices apart from general matrices:

General n × n matrix:  may have complex eigenvalues, may not be diagonalizable, eigenvectors not orthogonal
Symmetric matrix:       ALL eigenvalues are REAL, ALWAYS diagonalizable, eigenvectors for distinct eigenvalues are ORTHOGONAL
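A small numerical check makes the contrast concrete. The sketch below assumes NumPy is available; the two matrices are illustrative choices, not taken from the text.

```python
import numpy as np

# A 90-degree rotation: real and non-symmetric, with no real eigenvalues.
rotation = np.array([[0.0, -1.0],
                     [1.0,  0.0]])
rot_eigs = np.linalg.eigvals(rotation)

# A symmetric matrix: eigh exploits symmetry and returns real eigenvalues
# (in ascending order) with orthonormal eigenvectors as columns.
sym = np.array([[2.0, 1.0],
                [1.0, 2.0]])
sym_eigs, sym_vecs = np.linalg.eigh(sym)

print("rotation eigenvalues:", rot_eigs)
print("symmetric eigenvalues:", sym_eigs)
print("eigenvectors orthonormal:", np.allclose(sym_vecs.T @ sym_vecs, np.eye(2)))
```

The rotation has the purely imaginary pair ±i as eigenvalues, while the symmetric matrix yields real eigenvalues and an orthonormal eigenbasis.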

2. The Spectral Theorem (Real Symmetric Case)

Theorem (Spectral Theorem for Real Symmetric Matrices): Let A be an n × n real symmetric matrix. Then:

All eigenvalues λ₁, ..., λ_n of A are real.
Eigenvectors corresponding to distinct eigenvalues are orthogonal.
A is orthogonally diagonalizable: there exists an orthogonal matrix Q (Qᵀ = Q⁻¹) and a diagonal matrix Λ such that

$A = Q Λ Qᵀ
$

Equivalently,

$Qᵀ A Q = Λ
$

Columns of Q are orthonormal eigenvectors of A, and diagonal entries of Λ are the corresponding eigenvalues.

CRITICAL (Foundational): The Spectral Theorem is the central result of this subject. Every real symmetric matrix has REAL eigenvalues and an ORTHONORMAL eigenbasis, so A = QΛQᵀ (not just QΛQ⁻¹). It underpins PCA, SVD, quadratic form classification, and the Hessian test.

The name "spectral": The set of eigenvalues {λ₁, ..., λ_n} is called the spectrum of A. The spectral theorem says the spectrum of a symmetric matrix consists entirely of real numbers.
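The three claims of the theorem can be checked numerically. This is a minimal sketch assuming NumPy; the 4 × 4 size, the seed, and the symmetrization of a random matrix are arbitrary choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)          # fixed seed so the run is repeatable
M = rng.standard_normal((4, 4))
A = (M + M.T) / 2                       # symmetrize an arbitrary matrix

eigenvalues, Q = np.linalg.eigh(A)      # NumPy's routine for symmetric/Hermitian input
Lam = np.diag(eigenvalues)

print(np.isrealobj(eigenvalues))        # eigenvalues are real
print(np.allclose(Q.T @ Q, np.eye(4)))  # Q is orthogonal
print(np.allclose(Q @ Lam @ Q.T, A))    # A = Q Λ Qᵀ
print(np.allclose(Q.T @ A @ Q, Lam))    # Qᵀ A Q = Λ
```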

Proof Outline

Proof of part (1) — Real eigenvalues: Let λ be an eigenvalue with eigenvector x (possibly complex), so Ax = λx. Take the conjugate transpose: xᴴA = λ̄xᴴ (since A is real and symmetric, Aᴴ = A).

Multiply Ax = λx on the left by xᴴ: xᴴAx = λ(xᴴx)
Multiply xᴴA = λ̄xᴴ on the right by x: xᴴAx = λ̄(xᴴx)

Therefore λ(xᴴx) = λ̄(xᴴx). Since xᴴx = ‖x‖² > 0 (x ≠ 0), we get λ = λ̄, so λ is real. ✓

Proof of part (2) — Orthogonal eigenvectors for distinct λ: Let A v = λ₁v and A w = λ₂w with λ₁ ≠ λ₂. Then:

$λ₁(v·w) = (λ₁v)·w = (Av)·w = v·(Aᵀw) = v·(Aw) = v·(λ₂w) = λ₂(v·w)
$

(The fourth equality uses symmetry, Aᵀ = A. The third equality, (Av)·w = (Av)ᵀw = vᵀAᵀw = v·(Aᵀw), holds for any real matrix.)

So λ₁(v·w) = λ₂(v·w) → (λ₁ − λ₂)(v·w) = 0. Since λ₁ ≠ λ₂, we must have v·w = 0. ✓

Proof of part (3) — Orthogonal diagonalizability (sketch): The full proof uses induction. Key idea: pick one eigenvector v₁ (exists because characteristic polynomial splits over C, but part (1) shows λ₁ is real so v₁ is real). Normalize to unit length. Complete {v₁} to an orthonormal basis, form orthogonal matrix U = [v₁ | U₂]. Then:

$Uᵀ A U = [λ₁  0ᵀ]
         [0   A₁]
$

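where A₁ is (n−1)×(n−1) and symmetric. The zeros in the first column come from Av₁ = λ₁v₁ together with the orthonormality of the columns of U; the zeros in the first row then follow because UᵀAU is symmetric. Apply the induction hypothesis to A₁ to get A₁ = Q₁Λ₁Q₁ᵀ with Q₁ orthogonal. Then Q = U·diag(1, Q₁) is orthogonal and A = QΛQᵀ with Λ = diag(λ₁, Λ₁). ✓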
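One induction step can be sketched in code. The helper name `deflate_once` is hypothetical, and completing v₁ to an orthonormal basis through a QR factorization of [v₁ | I] is just one convenient way to do it, assumed here for illustration.

```python
import numpy as np

def deflate_once(A):
    """One induction step: return (U, U.T @ A @ U), where the first column
    of U is a unit eigenvector of the symmetric matrix A."""
    n = A.shape[0]
    _, vectors = np.linalg.eigh(A)
    v1 = vectors[:, 0]                  # any unit eigenvector works
    # QR of [v1 | I] orthonormalizes the columns in order, so the first
    # column of U is +v1 or -v1 and the rest complete an orthonormal basis.
    U, _ = np.linalg.qr(np.column_stack([v1, np.eye(n)]))
    return U, U.T @ A @ U

A = np.array([[2.0, 1.0, 1.0],
              [1.0, 2.0, 1.0],
              [1.0, 1.0, 2.0]])
U, B = deflate_once(A)
print(np.round(B, 10))                  # block form: eigenvalue, zeros, then A₁
```

The lower-right 2 × 2 block of `B` plays the role of A₁: it is symmetric and carries the remaining eigenvalues of A.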

Handling Repeated Eigenvalues

When eigenvalues repeat, the spectral theorem guarantees we can STILL find orthogonal eigenvectors. For an eigenvalue λ with multiplicity m, the eigenspace E_λ has dimension m, and we can apply Gram-Schmidt within it to get m orthonormal eigenvectors.

Example: A = I (identity matrix). λ = 1 with multiplicity n, and ANY orthonormal basis of Rⁿ serves as a set of eigenvectors. For every orthogonal Q, QΛQᵀ = Q I Qᵀ = QQᵀ = I = A.

3. The Spectral Decomposition (Outer Product Form)

Instead of A = QΛQᵀ, we can write:

$A = λ₁ q₁ q₁ᵀ + λ₂ q₂ q₂ᵀ + ... + λ_n q_n q_nᵀ
$

where q_i are the orthonormal eigenvector columns of Q. Each term is a rank-1 projection matrix scaled by the eigenvalue.

Geometric interpretation: A acts on any vector x by:

1. Project x onto each eigen-direction q_i: the component is (q_iᵀ x)
2. Scale each component by λ_i
3. Reassemble: A x = Σ λ_i (q_iᵀ x) q_i

This is the eigendecomposition (or spectral decomposition) — it decomposes the action of A into independent scalar multiplications along orthogonal axes.

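Spectral decomposition and powers: Because q_iᵀq_j = 0 for i ≠ j and q_iᵀq_i = 1, the cross terms vanish when the sum is multiplied by itself, so for every integer k ≥ 1: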

$A^k = λ₁^k q₁ q₁ᵀ + λ₂^k q₂ q₂ᵀ + ... + λ_n^k q_n q_nᵀ
$
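The outer product form and the power formula can be checked numerically. This is a minimal sketch assuming NumPy; the matrix and the exponent k = 5 are illustrative choices.

```python
import numpy as np

A = np.array([[3.0, 1.0],
              [1.0, 3.0]])
eigenvalues, Q = np.linalg.eigh(A)

# Rebuild A as a sum of eigenvalue-weighted rank-1 projections q_i q_iᵀ.
# Rows of Q.T are the eigenvector columns of Q.
A_rebuilt = sum(lam * np.outer(q, q) for lam, q in zip(eigenvalues, Q.T))

# The same projections weighted by λ_i^k give A^k.
k = 5
A_power = sum(lam**k * np.outer(q, q) for lam, q in zip(eigenvalues, Q.T))

print(np.allclose(A_rebuilt, A))
print(np.allclose(A_power, np.linalg.matrix_power(A, k)))
```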

4. Quadratic Forms

A quadratic form in variables x₁, ..., x_n is an expression:

$Q(x) = xᵀ A x = Σᵢ Σⱼ aᵢⱼ xᵢ xⱼ
$

where A is symmetric (we can always assume A is symmetric because xᵀAx = xᵀ((A+Aᵀ)/2)x and the skew-symmetric part contributes zero).

Connection to the spectral theorem: Using A = QΛQᵀ, let y = Qᵀx (an orthogonal change of variables: a rotation, possibly combined with a reflection). Then:

$Q(x) = xᵀ A x = xᵀ Q Λ Qᵀ x = (Qᵀ x)ᵀ Λ (Qᵀ x) = yᵀ Λ y = λ₁y₁² + λ₂y₂² + ... + λ_n y_n²
$

This is the principal axes theorem: every quadratic form can be expressed as a sum of squares (no cross terms) after an orthogonal change of variables. The coordinate axes of y are the principal axes of the quadratic form.

Geometric application — ellipsoids: The equation xᵀAx = 1 defines a quadric surface. After diagonalization:

$λ₁ y₁² + λ₂ y₂² + ... + λ_n y_n² = 1
$

For n = 2 with λ₁, λ₂ > 0, this is an ellipse with semi-axes lengths 1/√λ₁ and 1/√λ₂ along the eigenvector directions.
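The change of variables y = Qᵀx can be tried on a concrete point. The sketch below assumes NumPy; the matrix and the test point are arbitrary illustrative values.

```python
import numpy as np

A = np.array([[2.0, -1.0],
              [-1.0, 2.0]])                  # illustrative symmetric matrix
eigenvalues, Q = np.linalg.eigh(A)

x = np.array([0.7, -1.3])                    # arbitrary test point
y = Q.T @ x                                  # coordinates along the principal axes

with_cross_term = x @ A @ x                  # xᵀ A x
sum_of_squares = np.sum(eigenvalues * y**2)  # Σ λ_i y_i², no cross terms

print(with_cross_term, sum_of_squares)
# An orthogonal change of variables preserves length.
print(np.linalg.norm(x), np.linalg.norm(y))
```

Both expressions evaluate the same quadratic form; only the coordinate system differs.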

5. Definiteness Classification

A symmetric matrix A (equivalently, its quadratic form Q(x) = xᵀAx) is:

Type	Condition	Eigenvalue characterization
Positive definite	xᵀAx > 0 for all x ≠ 0	All λ_i > 0
Positive semidefinite	xᵀAx ≥ 0 for all x	All λ_i ≥ 0
Negative definite	xᵀAx < 0 for all x ≠ 0	All λ_i < 0
Negative semidefinite	xᵀAx ≤ 0 for all x	All λ_i ≤ 0
Indefinite	Takes both positive and negative values	Some λ_i > 0, some λ_i < 0

Three Ways to Check Definiteness

Method 1: Eigenvalues. Compute all eigenvalues. Check signs.

Method 2: Sylvester's Criterion (Principal Minors). For an n × n symmetric A, let Δ_k be the leading principal minor (determinant of the top-left k×k submatrix):

A is positive definite ⟺ Δ_k > 0 for all k = 1, ..., n
A is negative definite ⟺ (−1)^k Δ_k > 0 for all k (signs alternate starting negative)

Method 3: Cholesky factorization. A is positive definite ⟺ A = LLᵀ with L lower triangular having positive diagonal entries. This is both a test and a computational tool.
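The three methods can be sketched side by side. The function names are hypothetical, and the code assumes the input is already symmetric: `np.linalg.cholesky` does not verify symmetry. Strict comparisons with zero are fragile for nearly singular matrices, so treat this as an illustration rather than a robust test.

```python
import numpy as np

def is_pd_eigenvalues(A):
    """Method 1: all eigenvalues strictly positive."""
    return bool(np.all(np.linalg.eigvalsh(A) > 0))

def is_pd_sylvester(A):
    """Method 2: every leading principal minor strictly positive."""
    n = A.shape[0]
    return all(np.linalg.det(A[:k, :k]) > 0 for k in range(1, n + 1))

def is_pd_cholesky(A):
    """Method 3: np.linalg.cholesky raises LinAlgError when the
    factorization A = L Lᵀ fails."""
    try:
        np.linalg.cholesky(A)
        return True
    except np.linalg.LinAlgError:
        return False

pd_matrix = np.array([[2.0, -1.0], [-1.0, 2.0]])
indefinite = np.array([[1.0, 3.0], [3.0, 1.0]])
for name, M in [("pd_matrix", pd_matrix), ("indefinite", indefinite)]:
    print(name, is_pd_eigenvalues(M), is_pd_sylvester(M), is_pd_cholesky(M))
```

In practice the Cholesky route is usually the cheapest of the three, since it avoids computing eigenvalues or determinants.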

Example: Determine the definiteness of:

$A = [4  2]
    [2  3]
$

Δ₁ = 4 > 0 ✓
Δ₂ = det(A) = 12 − 4 = 8 > 0 ✓ → Positive definite. (Eigenvalues: λ² − 7λ + 8 = 0 gives λ = (7 ± √17)/2 ≈ 1.44, 5.56; both > 0, confirming.)

Example (indefinite):

$A = [1  4]
    [4  1]
$

Δ₁ = 1 > 0 ✓
Δ₂ = 1 − 16 = −15 < 0 ✗ → Indefinite. (Eigenvalues: λ = 5, −3 — mixed signs.)

Common misconception: It is NOT sufficient to check only the final determinant Δ_n. You MUST check ALL leading principal minors. Counterexample:

$A = [−1   0]
    [ 0  −1]
$

Δ₂ = 1 > 0, but Δ₁ = −1 < 0 → negative definite, not positive definite.

6. The Rayleigh Quotient

The Rayleigh quotient of a symmetric matrix A is:

R(x) = (xᵀ A x) / (xᵀ x)   for x ≠ 0

Theorem (Courant-Fischer / Rayleigh-Ritz): For a symmetric matrix A with eigenvalues λ_min = λ₁ ≤ λ₂ ≤ ... ≤ λ_n = λ_max:

λ_min ≤ R(x) ≤ λ_max for all x ≠ 0
λ_min is the global minimum of R(x), attained at the corresponding eigenvector
λ_max is the global maximum of R(x), attained at the corresponding eigenvector
The intermediate eigenvalues are given by constrained min-max problems (Courant-Fischer theorem)

Proof sketch: Diagonalize A = QΛQᵀ. Let y = Qᵀx. Then:

$R(x) = (xᵀQ Λ Qᵀx) / (xᵀQ Qᵀx) = (yᵀΛy) / (yᵀy) = (Σ λ_i y_i²) / (Σ y_i²)
$

This is a weighted average of the λ_i, so λ_min ≤ R(x) ≤ λ_max. ✓

Common Pitfall: R(x) is guaranteed to be an eigenvalue only when x is an eigenvector; for a general x it is a weighted average of the eigenvalues. Its minimum is λ_min and its maximum is λ_max, each attained at a corresponding eigenvector. Many optimization problems reduce to extremizing a Rayleigh quotient.
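The bounds can be probed by sampling. This is a sketch assuming NumPy; the helper name `rayleigh_quotient`, the matrix, the seed, and the sample count are illustrative choices.

```python
import numpy as np

def rayleigh_quotient(A, x):
    """R(x) = (xᵀ A x) / (xᵀ x) for a nonzero vector x."""
    return (x @ A @ x) / (x @ x)

A = np.array([[3.0, 1.0],
              [1.0, 3.0]])
eigenvalues, Q = np.linalg.eigh(A)      # ascending: eigenvalues[0] is λ_min
lam_min, lam_max = eigenvalues[0], eigenvalues[-1]

rng = np.random.default_rng(1)
samples = [rayleigh_quotient(A, rng.standard_normal(2)) for _ in range(1000)]

print(lam_min, min(samples), max(samples), lam_max)
# The extremes are attained at the eigenvectors themselves.
print(rayleigh_quotient(A, Q[:, 0]), rayleigh_quotient(A, Q[:, -1]))
```

Random vectors land strictly inside the interval almost surely; only the eigenvector directions reach the endpoints.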

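Practical application: power iteration. To find the dominant eigenvalue (the one of largest absolute value) and its eigenvector, iterate:

$x_{k+1} = (A x_k) / ‖A x_k‖
$

Provided the dominant eigenvalue is unique in absolute value (λ and −λ are not both dominant) and x₀ has a nonzero component in its eigenspace, the Rayleigh quotient R(x_k) converges to that eigenvalue as k → ∞. When A is positive semidefinite, the dominant eigenvalue is λ_max.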
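A minimal sketch of the iteration, assuming NumPy. The function name, iteration count, and random start are illustrative choices. The test matrix is positive definite, so its eigenvalue of largest magnitude is also λ_max.

```python
import numpy as np

def power_iteration(A, num_iters=200, seed=0):
    """Return (Rayleigh quotient, unit vector) after num_iters steps."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(A.shape[0])
    x /= np.linalg.norm(x)
    for _ in range(num_iters):
        y = A @ x
        x = y / np.linalg.norm(y)       # x_{k+1} = A x_k / ‖A x_k‖
    return x @ A @ x, x                 # x has unit length, so R(x) = xᵀ A x

A = np.array([[3.0, 1.0],
              [1.0, 3.0]])
estimate, vec = power_iteration(A)
print(estimate, vec)
```

The error shrinks roughly like the ratio of the two largest eigenvalue magnitudes raised to the iteration count, so convergence is slow when those magnitudes are close.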

7. Connection to the Hessian and Optimization

In multivariable calculus, the Hessian matrix H_f(p) of second partial derivatives is symmetric (by Clairaut's theorem, assuming f is C²):

$H_f(p) = [∂²f/∂x_i∂x_j]
$

At a critical point p (where ∇f(p) = 0):

If H_f(p) is positive definite → p is a strict local minimum
If H_f(p) is negative definite → p is a strict local maximum
If H_f(p) is indefinite → p is a saddle point
If H_f(p) is semidefinite with a zero eigenvalue → the second derivative test is inconclusive

This is the spectral theorem applied to optimization — the eigenvalues of the Hessian determine the local curvature in each principal direction.
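The test translates directly into a check on eigenvalue signs. The function name, the tolerance, and the three sample functions are assumptions made for illustration; each Hessian is evaluated at the origin, which is a critical point of the corresponding function.

```python
import numpy as np

def classify_critical_point(H, tol=1e-10):
    """Second derivative test from the eigenvalues of a symmetric Hessian H.
    Assumes H was evaluated at a point where the gradient vanishes."""
    eigenvalues = np.linalg.eigvalsh(H)         # ascending order
    if np.all(eigenvalues > tol):
        return "strict local minimum"
    if np.all(eigenvalues < -tol):
        return "strict local maximum"
    if eigenvalues[0] < -tol and eigenvalues[-1] > tol:
        return "saddle point"
    return "inconclusive"

H_bowl = np.array([[2.0, 1.0], [1.0, 2.0]])     # f = x² + xy + y²
H_saddle = np.array([[2.0, 0.0], [0.0, -2.0]])  # f = x² − y²
H_flat = np.array([[2.0, 0.0], [0.0, 0.0]])     # f = x² + y⁴

for H in (H_bowl, H_saddle, H_flat):
    print(classify_critical_point(H))
```

The third case shows why the test can be inconclusive: x² + y⁴ does have a minimum at the origin, but the Hessian alone cannot see the quartic term.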

8. Connection to PCA and SVD (Preview)

Given a centered data matrix X (n samples × p features), the covariance matrix is:

$C = (1/(n−1)) Xᵀ X
$

C is p × p and symmetric positive semidefinite. Its spectral decomposition:

$C = Q Λ Qᵀ
$

gives the principal components: columns of Q are directions of maximum variance, eigenvalues in Λ are the variances.

The SVD X = U Σ Vᵀ gives V = Q (right singular vectors = eigenvectors of XᵀX, up to sign and ordering) and Σ²/(n−1) = Λ (the squared singular values divided by n−1 are the eigenvalues of the covariance matrix).
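The link between the two factorizations can be checked on synthetic data. The sketch assumes NumPy; the data, sizes, and seed are made up for illustration. With the 1/(n−1) normalization used for C, the matching scale factor on the squared singular values is also 1/(n−1).

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 50, 3
# Synthetic data with clearly different spread in each column.
X = rng.standard_normal((n, p)) @ np.diag([3.0, 1.0, 0.3])
X = X - X.mean(axis=0)                  # center each feature

C = X.T @ X / (n - 1)                   # covariance matrix
eigenvalues, Q = np.linalg.eigh(C)      # ascending order
eigenvalues, Q = eigenvalues[::-1], Q[:, ::-1]   # descending, to match the SVD

U, s, Vt = np.linalg.svd(X, full_matrices=False)

print(np.allclose(s**2 / (n - 1), eigenvalues))
# Singular vectors match eigenvectors up to a sign per column.
print(np.allclose(np.abs(Vt.T), np.abs(Q)))
```

Working from the SVD of X avoids forming XᵀX explicitly, which is often preferred numerically.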

9. Simultaneous Diagonalization

Theorem: If A and B are symmetric and A is positive definite, then there exists an invertible matrix P such that:

$Pᵀ A P = I   and   Pᵀ B P = D  (diagonal)
$

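This is used in generalized eigenvalue problems Bx = λAx (e.g., in discriminant analysis): the columns of P are the generalized eigenvectors and the diagonal entries of D are the generalized eigenvalues, since BP = APD.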
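One way to construct P is through a Cholesky factor of A followed by an ordinary symmetric eigenproblem. This is a sketch of that route under the theorem's hypotheses, assuming NumPy; the function name and the two matrices are illustrative.

```python
import numpy as np

def simultaneous_diagonalize(A, B):
    """Return (P, d) with Pᵀ A P = I and Pᵀ B P = diag(d).
    Assumes A is symmetric positive definite and B is symmetric."""
    L = np.linalg.cholesky(A)           # A = L Lᵀ
    L_inv = np.linalg.inv(L)
    M = L_inv @ B @ L_inv.T             # symmetric, so eigh applies
    d, W = np.linalg.eigh(M)            # M = W diag(d) Wᵀ
    P = L_inv.T @ W
    return P, d

A = np.array([[4.0, 2.0], [2.0, 3.0]])
B = np.array([[1.0, 4.0], [4.0, 1.0]])
P, d = simultaneous_diagonalize(A, B)
print(np.round(P.T @ A @ P, 10))
print(np.round(P.T @ B @ P, 10))
```

Note that P is invertible but generally not orthogonal, which is why the theorem states Pᵀ rather than P⁻¹.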

10. Common Misconceptions

"Symmetric matrices always have distinct eigenvalues" — FALSE. The identity matrix is symmetric with repeated eigenvalue 1. The spectral theorem guarantees orthogonal diagonalizability regardless of multiplicity.
"Any orthogonally diagonalizable matrix is symmetric" — TRUE. If A = QΛQᵀ with Q orthogonal, then Aᵀ = QΛᵀQᵀ = QΛQᵀ = A. Orthogonal diagonalizability characterizes real symmetric matrices.
"Positive eigenvalues guarantee positive definiteness" — Only for SYMMETRIC matrices. A non-symmetric matrix can have positive real eigenvalues but not be positive definite (positive definiteness is defined only for symmetric matrices, or equivalently requires symmetry as part of the definition in most conventions).
"The spectral theorem works over any field" — No, it requires the real numbers (or complex for Hermitian matrices). Over ℤ₂ or finite fields, symmetric doesn't imply diagonalizable.
"Gram-Schmidt always produces valid eigenvectors" — Only within the same eigenspace. If you Gram-Schmidt eigenvectors from DIFFERENT eigenvalues, you lose the eigenvector property (orthogonal ≠ eigenvector).

Key Terms

Definiteness
Indefinite
Negative definite
Negative semidefinite
Positive definite
Positive semidefinite
Quadratic forms
Rayleigh quotient
The Spectral Theorem

Worked Examples

Example 1: Full Orthogonal Diagonalization

Problem: Orthogonally diagonalize A = [3 1; 1 3].

Solution:

Step 1 — Find eigenvalues:

$det(A − λI) = det[3−λ   1  ] = (3−λ)² − 1 = λ² − 6λ + 9 − 1 = λ² − 6λ + 8 = (λ−2)(λ−4)
              [ 1   3−λ ]
$

λ₁ = 2, λ₂ = 4.

Step 2 — Find eigenvectors:

$λ = 2:  A − 2I = [1  1]  →  x₁ + x₂ = 0  →  x₂ = −x₁
                   [1  1]
        Eigenvector: v₁ = (1, −1)

λ = 4:  A − 4I = [−1   1]  →  −x₁ + x₂ = 0  →  x₂ = x₁
                   [ 1  −1]
        Eigenvector: v₂ = (1, 1)
$

Step 3 — Verify orthogonality: v₁·v₂ = 1·1 + (−1)·1 = 0 ✓ (distinct eigenvalues guarantee this).

Step 4 — Normalize to unit length:

$‖v₁‖ = √(1+1) = √2    →  q₁ = (1/√2, −1/√2)
‖v₂‖ = √2              →  q₂ = (1/√2,  1/√2)
$

Step 5 — Form Q and Λ:

$Q = [1/√2   1/√2]      Λ = [2  0]
    [−1/√2  1/√2]          [0  4]
$

Step 6 — Verify:

$Q Λ Qᵀ = [ 1/√2  1/√2] [2  0] [1/√2  −1/√2]
         [−1/√2  1/√2] [0  4] [1/√2   1/√2]
$

Multiply the first two factors: QΛ = [2/√2 4/√2; −2/√2 4/√2] = [√2 2√2; −√2 2√2]. Then:

$QΛQᵀ = [ √2  2√2] [1/√2  −1/√2] = [ 1+2  −1+2] = [3  1] = A ✓
        [−√2  2√2] [1/√2   1/√2]   [−1+2   1+2]   [1  3]
$

Spectral decomposition form:

$A = 2·q₁q₁ᵀ + 4·q₂q₂ᵀ
  = 2 [ 1/2  −1/2]  +  4 [1/2  1/2]
      [−1/2   1/2]       [1/2  1/2]
  = [1  −1]  +  [2  2]  =  [3  1]
    [−1  1]     [2  2]     [1  3]  ✓
$

Example 2: Repeated Eigenvalue Case

Problem: Orthogonally diagonalize A = [2 1 1; 1 2 1; 1 1 2].

Solution:

Eigenvalues: The characteristic polynomial is det(A−λI). Note A = I + 11ᵀ (all-ones matrix).

$det(A − λI) = det((1−λ)I + (all-ones)), but it is easier to observe that (1,1,1)ᵀ is an eigenvector:
A(1,1,1)ᵀ = (4,4,4)ᵀ = 4·(1,1,1)ᵀ  →  λ₁ = 4
$

The remaining eigenvalues: A − I is the all-ones matrix, which has rank 1, so the eigenspace for λ = 1 has dimension 3 − 1 = 2 and λ = 1 has multiplicity 2. Trace check: 4 + 1 + 1 = 6 = trace(A) ✓. (Equivalently: the eigenvalues of 11ᵀ are 3, 0, 0, so the eigenvalues of A = I + 11ᵀ are 4, 1, 1.)

λ₁ = 4: A−4I = [−2 1 1; 1 −2 1; 1 1 −2]. RREF → x = (t, t, t)ᵀ. Eigenvector: v₁ = (1,1,1). Normalize: q₁ = (1/√3, 1/√3, 1/√3).

λ = 1 (multiplicity 2): A−I = [1 1 1; 1 1 1; 1 1 1]. Eigenspace: {x : x₁+x₂+x₃=0} — a plane. Pick independent vectors:

$v₂ = (1, −1, 0)   (already in the plane: 1+(−1)+0=0 ✓)
v₃ = (1, 0, −1)    (in the plane: 1+0+(−1)=0 ✓)
$

Check orthogonality: v₂·v₃ = 1·1 + (−1)·0 + 0·(−1) = 1 ≠ 0. They span the eigenspace but aren't orthogonal. Apply Gram-Schmidt:

$u₂ = v₂ = (1, −1, 0)
‖u₂‖ = √2  →  q₂ = (1/√2, −1/√2, 0)

u₃ = v₃ − proj_{u₂} v₃ = (1, 0, −1) − ((1)(1)+(0)(−1)+(−1)(0))/2 · (1, −1, 0)
   = (1, 0, −1) − (1/2)(1, −1, 0)
   = (1−1/2, 0+1/2, −1−0) = (1/2, 1/2, −1)

‖u₃‖ = √(1/4+1/4+1) = √(3/2) = √3/√2
q₃ = u₃/‖u₃‖ = (1/2, 1/2, −1) / √(3/2) = (1/√6, 1/√6, −2/√6)
$

Verify: q₂·q₃ = 1/(√2·√6) + (−1)/(√2·√6) + 0 = 0 ✓. q₁·q₂ = 1/(√3·√2) + (−1)/(√3·√2) + 0 = 0 ✓. q₁·q₃ = (1 + 1 − 2)/(√3·√6) = 0 ✓.

$Q = [1/√3   1/√2   1/√6]      Λ = [4  0  0]
    [1/√3  −1/√2   1/√6]          [0  1  0]
    [1/√3     0   −2/√6]          [0  0  1]
$

Verify QᵀQ = I (exercise).
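The Gram-Schmidt step and the final check can be reproduced numerically. This sketch assumes NumPy; `gram_schmidt` is a hypothetical helper implementing the classical procedure, applied only inside the λ = 1 eigenspace.

```python
import numpy as np

def gram_schmidt(vectors):
    """Orthonormalize linearly independent vectors (classical Gram-Schmidt)."""
    basis = []
    for v in vectors:
        # Remove the components along the vectors already in the basis.
        u = v - sum((v @ q) * q for q in basis)
        basis.append(u / np.linalg.norm(u))
    return basis

A = np.array([[2.0, 1.0, 1.0],
              [1.0, 2.0, 1.0],
              [1.0, 1.0, 2.0]])

q1 = np.array([1.0, 1.0, 1.0]) / np.sqrt(3)           # eigenvalue 4
q2, q3 = gram_schmidt([np.array([1.0, -1.0, 0.0]),    # eigenvalue 1
                       np.array([1.0, 0.0, -1.0])])   # eigenvalue 1

Q = np.column_stack([q1, q2, q3])
Lam = np.diag([4.0, 1.0, 1.0])
print(np.allclose(Q.T @ Q, np.eye(3)))
print(np.allclose(Q @ Lam @ Q.T, A))
```

A different starting pair in the plane x₁ + x₂ + x₃ = 0 would give a different but equally valid Q.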

Example 3: Quadratic Form Analysis and Principal Axes

Problem: Given Q(x, y) = 5x² + 4xy + 2y²,

(a) Express Q as xᵀAx with A symmetric.
(b) Find the principal axes (orthogonal transformation to eliminate the cross term).
(c) Classify the quadratic form.
(d) Sketch the curve Q(x, y) = 6.

Solution:

(a) Write Q(x, y) = [x y] [5 2; 2 2] [x; y]. So:

$A = [5  2]
    [2  2]
$

(Cross term 4xy = 2xy + 2yx → off-diagonal entries each get half: 2.)

(b) Find eigenvalues and eigenvectors of A:

$det(A−λI) = det[5−λ   2  ;  2  2−λ] = (5−λ)(2−λ) − 4 = λ² − 7λ + 10 − 4 = λ² − 7λ + 6 = (λ−1)(λ−6)
$

λ₁ = 1, λ₂ = 6.

Eigenvectors:

$λ=1: A−I = [4  2]  →  4x+2y=0 → y=−2x.   v₁ = (1, −2),  ‖v₁‖ = √5,  q₁ = (1/√5, −2/√5)
           [2  1]

λ=6: A−6I = [−1   2]  →  −x+2y=0 → x=2y.  v₂ = (2, 1),  ‖v₂‖ = √5,  q₂ = (2/√5, 1/√5)
            [ 2  −4]
$

Check: q₁·q₂ = 2/5 − 2/5 = 0 ✓.

The principal axes transformation: let [u; v] = Qᵀ[x; y] = [q₁ᵀ; q₂ᵀ][x; y].

$u = (1/√5)x − (2/√5)y     (coordinate along q₁)
v = (2/√5)x + (1/√5)y     (coordinate along q₂)
$

Then Q(x, y) = λ₁u² + λ₂v² = u² + 6v².

(c) Since both eigenvalues are positive (1 > 0, 6 > 0), A is positive definite. The graph z = Q(x, y) is an elliptic paraboloid opening upward.

(d) Q(x, y) = 6 → u² + 6v² = 6 → u²/6 + v²/1 = 1. This is an ellipse with semi-axes √6 ≈ 2.45 in the u-direction (along q₁) and 1 in the v-direction (along q₂). The ellipse is stretched more in the q₁ direction.
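The semi-axes in part (d) can be recovered numerically. This is a sketch assuming NumPy; the variable names are illustrative. The tip of each semi-axis, mapped back to (x, y) coordinates, should satisfy Q(x, y) = 6.

```python
import numpy as np

A = np.array([[5.0, 2.0],
              [2.0, 2.0]])
level = 6.0
eigenvalues, Q = np.linalg.eigh(A)      # ascending: 1, then 6

# λ₁u² + λ₂v² = level  ⇒  the semi-axis along q_i has length sqrt(level / λ_i).
semi_axes = np.sqrt(level / eigenvalues)

for length, q in zip(semi_axes, Q.T):
    point = length * q                  # tip of the semi-axis in (x, y) coordinates
    print(point, point @ A @ point)
```

The longer semi-axis belongs to the smaller eigenvalue, which matches the sketch: the ellipse is stretched along q₁.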

Example 4: Positive Definiteness Test via Principal Minors

Problem: Determine whether A = [3 1 1; 1 2 0; 1 0 2] is positive definite.

Solution:

Compute leading principal minors:

$Δ₁ = 3 > 0
Δ₂ = det([3 1; 1 2]) = 6 − 1 = 5 > 0
Δ₃ = det(A) = 3·det([2 0; 0 2]) − 1·det([1 0; 1 2]) + 1·det([1 2; 1 0])
    = 3·4 − 1·2 + 1·(−2)
    = 12 − 2 − 2 = 8 > 0
$

All Δ_k > 0 → positive definite. ✓

(Confirmation via eigenvalues: det(A − λI) = −(λ−1)(λ−2)(λ−4), so λ = 1, 2, 4, all positive.)

Counterexample — what happens when you skip Δ₂:

$A = [−1  0  0]
    [ 0 −1  0]
    [ 0  0 −1]
$

Δ₁ = −1 < 0 and Δ₃ = det(A) = −1 < 0. Concluding "negative definite" from these two alone happens to be correct here, because Δ₂ = 1 > 0 completes the required pattern −, +, −. The shortcut is unreliable, though. Consider:

$B = [−1  0  0]
    [ 0  1  0]
    [ 0  0 −1]
$

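Δ₁ = −1 < 0, Δ₂ = −1 < 0, Δ₃ = 1 > 0. The signs −, −, + match neither the all-positive pattern (positive definite) nor the alternating pattern −, +, − (negative definite), and no minor is zero, so B is indefinite. Its eigenvalues −1, 1, −1 confirm this.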

Example 5: Rayleigh Quotient and Constrained Optimization

Problem: For A = [3 1; 1 3], find the maximum and minimum of R(x) = (xᵀAx)/(xᵀx) over all nonzero vectors x ∈ R².

Solution:

From Example 1, eigenvalues are λ₁ = 2 and λ₂ = 4.

By Rayleigh-Ritz:

Minimum = λ_min = 2, attained at x = q₁ = (1, −1)/√2
Maximum = λ_max = 4, attained at x = q₂ = (1, 1)/√2

Verification for maximum:

$x = (1, 1)/√2 → xᵀAx = (1/2)[1 1][3 1; 1 3][1; 1] = (1/2)[1 1][4; 4] = (1/2)(8) = 4
xᵀx = 1/2 + 1/2 = 1
R(x) = 4/1 = 4 ✓
$

Verification for min:

$x = (1, −1)/√2 → xᵀAx = (1/2)[1 −1][3 1; 1 3][1; −1] = (1/2)[1 −1][2; −2] = (1/2)(4) = 2
xᵀx = 1
R(x) = 2 ✓
$

Practice Problems

(Answers are below. Try each problem before checking.)

Problem 1: Orthogonally diagonalize A = [6 −2; −2 3]. Write the spectral decomposition as a sum of rank-1 matrices.

Problem 2: Prove that if A is real symmetric and A² = A (idempotent), then all eigenvalues are 0 or 1.

Problem 3: For A = [1 2; 2 1], determine whether the quadratic form Q(x,y) = xᵀAx is positive definite, negative definite, or indefinite.

Problem 4: Orthogonally diagonalize the 3×3 matrix A = [1 0 2; 0 1 0; 2 0 1].

Problem 5: A 2×2 symmetric matrix has eigenvalues 5 and −1. If v = (3, 4)ᵀ is an eigenvector for λ = 5, find A.

Problem 6: Use the Rayleigh quotient to show that for a symmetric positive definite A, the condition number κ(A) = λ_max/λ_min.

Problem 7: Classify the quadratic form Q(x, y, z) = x² + 4xy − 2xz + 3y² + 2yz + 2z². Find its matrix A and determine definiteness.

Problem 8: Show that if A is symmetric and all its eigenvalues are positive, then A is positive definite.

Answers

**Problem 1:**

$Eigenvalues: det[6−λ  −2; −2  3−λ] = (6−λ)(3−λ) − 4 = λ² − 9λ + 18 − 4 = λ² − 9λ + 14 = (λ−2)(λ−7)
λ = 2, 7.
$

λ=2: A−2I = [4 −2; −2 1] → 4x−2y=0 → y=2x. v₁=(1,2), ‖v₁‖=√5, q₁=(1/√5, 2/√5).

λ=7: A−7I = [−1 −2; −2 −4] → x+2y=0 → x=−2y. v₂=(−2,1), ‖v₂‖=√5, q₂=(−2/√5, 1/√5).

Check orthogonality: q₁·q₂ = −2/5 + 2/5 = 0 ✓.

$Q = [1/√5  −2/√5]     Λ = [2  0]
    [2/√5   1/√5]         [0  7]
$

Spectral decomposition:

$A = 2·(1/5)[1  2]  +  7·(1/5)[ 4  −2]
            [2  4]              [−2   1]
  = (1/5)[2   4] + (1/5)[28  −14] = (1/5)[30  −10] = [6  −2] ✓
         [4   8]         [−14   7]         [−10   15]   [−2  3]
$

**Problem 2:** If A² = A and A is symmetric, then A = QΛQᵀ with Λ diagonal. Then: A² = QΛQᵀ QΛQᵀ = QΛ²Qᵀ = A = QΛQᵀ → Λ² = Λ → λ_i² = λ_i for each i → λ_i(λ_i−1) = 0 → λ_i ∈ {0,1}. ✓ Thus any symmetric idempotent matrix is an orthogonal projection matrix (onto the subspace spanned by eigenvectors with λ=1).

**Problem 3:**

$A = [1  2]
    [2  1]
$

Eigenvalues: det[1−λ 2; 2 1−λ] = (1−λ)²−4 = λ²−2λ−3 = (λ−3)(λ+1) → λ=3, −1. Mixed signs → INDEFINITE. (There exist directions where Q > 0 and directions where Q < 0.)

Example: x = (1,1) → Q = [1 1][1 2;2 1][1;1] = [1 1][3;3] = 6 > 0. x = (1,−1) → Q = [1 −1][1 2;2 1][1;−1] = [1 −1][−1;1] = −2 < 0.

**Problem 4:**

$A = [1  0  2]
    [0  1  0]
    [2  0  1]
$

The middle row/column isolates eigenvalue 1 with eigenvector (0,1,0). For the [1 2; 2 1] block, the eigenvalues are 3 and −1 (from Problem 3).

λ₁ = 1: v₁ = (0, 1, 0), q₁ = (0, 1, 0).
λ₂ = 3: 2×2 block eigenvector (1,1) → v₂ = (1, 0, 1), ‖v₂‖=√2, q₂ = (1/√2, 0, 1/√2).
λ₃ = −1: 2×2 block eigenvector (1,−1) → v₃ = (1, 0, −1), ‖v₃‖=√2, q₃ = (1/√2, 0, −1/√2).

Check orthogonality: q₁·q₂ = 0, q₁·q₃ = 0, q₂·q₃ = 1/2 + 0 − 1/2 = 0 ✓.

$Q = [0    1/√2   1/√2]     Λ = [1  0   0]
    [1      0      0  ]         [0  3   0]
    [0    1/√2  −1/√2]         [0  0  −1]
$

**Problem 5:** Symmetric 2×2 with eigenvalues 5, −1. v=(3,4)ᵀ for λ=5. Normalize: q₁ = (3/5, 4/5). The other eigenvector q₂ must be orthogonal to q₁ (distinct eigenvalues): q₂ = (−4/5, 3/5). Spectral decomposition:

$A = 5·(1/25)[9  12] + (−1)·(1/25)[16  −12]
            [12  16]               [−12   9]
  = (1/25)[45  60] − (1/25)[16  −12]
          [60  80]          [−12   9]
  = (1/25)[29  72] = [29/25  72/25]
          [72  71]   [72/25  71/25]
$

Check: A(3,4)ᵀ = ((29·3+72·4)/25, (72·3+71·4)/25) = (375/25, 500/25) = (15, 20) = 5·(3,4)ᵀ ✓. A(−4,3)ᵀ = ((29·(−4)+72·3)/25, (72·(−4)+71·3)/25) = (100/25, −75/25) = (4, −3) = −1·(−4,3)ᵀ ✓.

**Problem 6:** For symmetric positive definite A, λ_min > 0 and λ_max > 0. The 2-norm condition number is κ₂(A) = ‖A‖₂·‖A⁻¹‖₂. For symmetric A, ‖A‖₂ = max |λ_i| = λ_max (since all eigenvalues > 0), and ‖A⁻¹‖₂ = max |1/λ_i| = 1/λ_min (since A⁻¹ has eigenvalues 1/λ_i, of which 1/λ_min is the largest). Thus κ₂(A) = λ_max/λ_min. ✓ In Rayleigh quotient terms, this is the ratio of the extremes: max R(x) / min R(x).

**Problem 7:** The symmetric matrix form of Q(x,y,z):

$x² → a₁₁=1
3y² → a₂₂=3
2z² → a₃₃=2
4xy → a₁₂=a₂₁=2
−2xz → a₁₃=a₃₁=−1
2yz → a₂₃=a₃₂=1

A = [1   2  −1]
    [2   3   1]
    [−1  1   2]
$

Leading principal minors:

$Δ₁ = 1 > 0
Δ₂ = det([1 2; 2 3]) = 3−4 = −1 < 0
$

Δ₁ > 0 but Δ₂ < 0 → INDEFINITE. No need to compute Δ₃. (For completeness: Δ₃ = 1·(6−1) − 2·(4−(−1)) + (−1)·(2−(−3)) = 5 − 2·5 − 1·5 = 5−10−5 = −10 < 0. So the sign sequence is +, −, −, confirming indefinite.)

**Problem 8:** Since A is symmetric, A = QΛQᵀ with λ_i > 0 for all i. For any x ≠ 0, let y = Qᵀx. Then xᵀAx = yᵀΛy = Σ λ_i y_i². Since all λ_i > 0 and y ≠ 0 (because Qᵀ is invertible, so x ≠ 0 ⇒ y ≠ 0), we have Σ λ_i y_i² > 0. Thus xᵀAx > 0 for all x ≠ 0 → A is positive definite. ✓

Summary

The Spectral Theorem is the crowning result: every real symmetric matrix has real eigenvalues, orthogonal eigenvectors (for distinct eigenvalues), and can be written A = QΛQᵀ with Q orthogonal — this is stronger than generic diagonalizability
Quadratic forms xᵀAx become sums of squares Σ λ_i y_i² after an orthogonal change of variables y = Qᵀx, revealing the principal axes and eliminating cross terms
Definiteness of a symmetric matrix (positive definite, negative definite, indefinite) can be determined by eigenvalues (all > 0, all < 0, mixed), by Sylvester's criterion (leading principal minors), or by Cholesky factorization
The Rayleigh quotient R(x) = xᵀAx/xᵀx provides a variational characterization: λ_min ≤ R(x) ≤ λ_max, with the bounds attained at the corresponding eigenvectors
The spectral theorem connects to the Hessian in optimization (eigenvalue signs classify critical points), to PCA (principal components are eigenvectors of the covariance matrix), and to the SVD (eigenvectors of XᵀX are right singular vectors)

Quiz

Q1: Which statement is guaranteed for EVERY real symmetric matrix?

A) All eigenvalues are real
B) All eigenvalues are distinct
C) The matrix is singular
D) All eigenvectors have unit length

Answer & Explanation

**A** — The Spectral Theorem guarantees real eigenvalues for symmetric matrices. B is false (the identity matrix has repeated eigenvalues). C is false (symmetric matrices can be invertible). D is false — eigenvectors are only unit length after normalization.

Q2: If A = QΛQᵀ with Q orthogonal (Qᵀ = Q⁻¹), then A MUST be:

A) Invertible
B) Symmetric
C) Positive definite
D) Diagonal

Answer & Explanation

**B** — Aᵀ = (QΛQᵀ)ᵀ = QΛᵀQᵀ = QΛQᵀ = A. Orthogonal diagonalizability characterizes real symmetric matrices. A is false (Λ may have zeros). C is false (eigenvalues could be negative). D is false (A need not be diagonal, only diagonalizable).

Q3: Eigenvectors corresponding to distinct eigenvalues of a symmetric matrix are:

A) Always parallel
B) Orthogonal
C) Always unit vectors
D) Linearly dependent

Answer & Explanation

**B** — Proof: λ₁(v·w) = (Av)·w = v·(Aᵀw) = v·(Aw) = λ₂(v·w), so (λ₁ − λ₂)(v·w) = 0. Since λ₁ ≠ λ₂, v·w = 0. A is impossible for distinct eigenvalues. C is true only after normalization. D is false — they are independent.

Q4: A symmetric matrix A has leading principal minors Δ₁ = 2, Δ₂ = −1, Δ₃ = −4. A is:

A) Positive definite
B) Negative definite
C) Positive semidefinite
D) Indefinite

Answer & Explanation

**D** — Δ₁ > 0 but Δ₂ < 0. Positive definite requires all Δ_k > 0. Negative definite requires signs to alternate starting negative. Mixed signs → indefinite. The eigenvalues must have mixed signs.

Q5: For a symmetric positive definite matrix A, the level set xᵀAx = 1 in R² is:

A) A hyperbola
B) A parabola
C) An ellipse
D) A straight line

Answer & Explanation

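**C**: After diagonalization, λ₁y₁² + λ₂y₂² = 1 with λ₁, λ₂ > 0. This is the equation of an ellipse with semi-axes 1/√λ₁ and 1/√λ₂. A hyperbola would require mixed eigenvalue signs (indefinite). With one zero eigenvalue (semidefinite) the level set degenerates to a pair of parallel lines, y₁ = ±1/√λ₁. A parabola needs a linear term, which a pure quadratic form xᵀAx does not have.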

Pitfalls

Checking only det(A) for positive definiteness. A positive determinant does NOT guarantee positive definiteness. The matrix diag(−1, −1) has det = 1 > 0 but is negative definite. You must check ALL leading principal minors (Sylvester's criterion) or all eigenvalues.
Applying the Rayleigh quotient to non-symmetric matrices. The Rayleigh quotient R(x) = xᵀAx / xᵀx is only meaningful when A is symmetric. For a non-symmetric matrix, the quadratic form xᵀAx depends only on the symmetric part (A + Aᵀ)/2; the skew-symmetric part contributes zero.
Gram-Schmidt across different eigenspaces. For a symmetric matrix, eigenvectors for DISTINCT eigenvalues are automatically orthogonal — do NOT apply Gram-Schmidt to them. Gram-Schmidt is only needed WITHIN a single eigenspace to obtain an orthonormal basis when an eigenvalue has multiplicity > 1.
Confusing orthogonal diagonalization with generic diagonalization. The spectral theorem gives A = QΛQᵀ with Q orthogonal (Q⁻¹ = Qᵀ), which is stronger than generic diagonalization A = PDP⁻¹. Among real matrices, only symmetric ones are orthogonally diagonalizable. Using Pᵀ in place of P⁻¹ for a non-symmetric matrix gives incorrect results.
Assuming the covariance matrix is always positive definite. XᵀX is always positive SEMIdefinite. It is positive definite only if X has full column rank. When the number of features exceeds the number of samples (p > n), XᵀX is guaranteed to be singular with at least p−n zero eigenvalues, a critical fact for PCA and regularized regression.

Next Steps

Move on to 09-01 — LU Decomposition (Phase 9 — Matrix Decompositions & Advanced Linear Algebra) to learn another fundamental factorization: expressing A = LU for efficient linear system solving. The spectral theorem (A = QΛQᵀ) gave us one decomposition; LU, QR, Cholesky, and SVD complete the decomposition toolkit essential for numerical linear algebra and machine learning.
