08-04 — Matrices as Linear Transformations
Phase: 8 — Linear Algebra (Rigorous) Subject: 08-04 Prerequisites: 08-03 — Linear Transformations Next subject: 08-05 — Inner Product Spaces
Learning Objectives
By the end of this subject, you will be able to:
- Interpret matrix-vector multiplication Ax as a linear combination of columns, and as the action of a linear transformation
- Construct the matrix of a linear transformation relative to given bases and switch between different basis representations
- Define and compute the trace of a matrix, prove its cyclic property, and relate it to eigenvalues
- State and apply the rank-nullity theorem in matrix form: rank(A) + nullity(A) = n
- Understand similarity of matrices: A ~ B if they represent the same linear transformation under different bases
Core Content
1. Matrix-Vector Multiplication as a Linear Transformation
Every m × n matrix A defines a linear transformation L_A : R^n → R^m by L_A(x) = Ax. Conversely, every linear transformation T : R^n → R^m has a unique matrix representation relative to the standard bases.
CRITICAL — Foundational: Every matrix IS a linear transformation. The column picture $Ax = x₁col₁ + ... + x_ncol_n$ connects matrix multiplication to linear combinations and the column space.
Two views of matrix-vector multiplication:
Column picture: Ax is a linear combination of the columns of A:
$Ax = x₁·col₁(A) + x₂·col₂(A) + ... + x_n·col_n(A) $
This is why C(A) = span{columns of A} = range of L_A.
Row picture: Each component of Ax is a dot product of a row of A with x:
$(Ax)_i = row_i(A) · x $
Example:
$A = \begin{bmatrix} 2 & 1 \\ 0 & -1 \end{bmatrix}, \quad x = \begin{bmatrix} 3 \\ 4 \end{bmatrix}$
Column picture: 3·(2,0) + 4·(1,−1) = (6,0) + (4,−4) = (10,−4)
Row picture: (2·3 + 1·4, 0·3 + (−1)·4) = (10, −4)
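The two pictures can be checked side by side in code. This is a minimal sketch, assuming matrices are stored as plain Python lists of rows; the helper names `column_picture` and `row_picture` are illustrative and not part of the course material.

```python
# Matrix and vector from the example above.
A = [[2, 1],
     [0, -1]]
x = [3, 4]

def column_picture(A, x):
    """Build Ax as the sum of x_j times column j of A."""
    m, n = len(A), len(x)
    result = [0] * m
    for j in range(n):          # one column at a time
        for i in range(m):
            result[i] += x[j] * A[i][j]
    return result

def row_picture(A, x):
    """Build Ax entry by entry: (Ax)_i = row_i(A) . x."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

print(column_picture(A, x))  # [10, -4]
print(row_picture(A, x))     # [10, -4]
```

Both functions perform the same multiplications; they differ only in the order of summation, which is why the two pictures always agree.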
2. The Matrix of a Linear Transformation (General Bases)
Let V have basis B = {b₁, ..., b_n} and W have basis C = {c₁, ..., c_m}. For T : V → W:
[T]_{C←B} is the m × n matrix whose j-th column is [T(b_j)]_C.
Formula: [T(v)]_C = [T]_{C←B} [v]_B
Change of basis for transformations: If we change bases on both domain and codomain, the matrix of T changes by:
[T]_{D←A} = P_{D←C} [T]_{C←B} Q_{B←A}
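where P_{D←C} converts C-coordinates to D-coordinates in the codomain and Q_{B←A} converts A-coordinates to B-coordinates in the domain. When the domain and codomain coincide and share a basis, these two matrices are inverses of each other.
For an operator T : V → V where we use the SAME basis for domain and codomain, the formula therefore reduces to: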
$[T]_B = P⁻¹ [T]_C P $
where P = P_{C←B} is the change-of-basis matrix. This is the SIMILARITY relation.
3. Similarity
Definition: Two n × n matrices A and B are similar (A ~ B) if there exists an invertible matrix P such that:
$B = P⁻¹ A P $
Physical meaning: A and B represent the same linear transformation T : V → V under different bases. If A = [T]_C and B = [T]_B, then B = P⁻¹ A P where P = P_{C←B}.
Properties preserved by similarity:
- Rank (dimension of range)
- Nullity (dimension of kernel)
- Determinant
- Trace
- Eigenvalues
- Characteristic polynomial
These are called similarity invariants — they don't depend on the choice of basis.
4. Trace of a Matrix
Definition: For an n × n matrix A = [a_ij], the trace is:
$\operatorname{tr}(A) = a_{11} + a_{22} + \cdots + a_{nn} = \sum_{i=1}^{n} a_{ii}$
Key properties:
1. tr(A + B) = tr(A) + tr(B)
2. tr(cA) = c·tr(A)
3. Cyclic property: tr(AB) = tr(BA) (even when AB ≠ BA)
Proof of cyclic property: tr(AB) = Σ_i (AB)_ii = Σ_i Σ_k a_ik b_ki = Σ_k Σ_i b_ki a_ik = Σ_k (BA)_kk = tr(BA). ∎
Consequences:
- tr(P⁻¹ A P) = tr(A P P⁻¹) = tr(A), so the trace is a similarity invariant
- tr(ABC) = tr(BCA) = tr(CAB) (cyclic permutations are permitted)
Common Pitfall: tr(AB) = tr(BA) always holds, but in general tr(ABC) ≠ tr(ACB). The trace is invariant under CYCLIC permutations only.
A separate property, not a consequence of cyclicity: tr(A^T) = tr(A), because transposing leaves the diagonal unchanged.
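A small numerical check makes the difference between cyclic and non-cyclic reorderings concrete. The sketch below assumes matrices stored as lists of rows; the three 2 × 2 matrices are chosen for illustration only and do not come from the text.

```python
def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]

def trace(X):
    """Sum of the diagonal entries of a square matrix."""
    return sum(X[i][i] for i in range(len(X)))

# Illustrative matrices (an assumption of this sketch).
A = [[1, 2], [3, 4]]
B = [[0, 1], [0, 0]]
C = [[0, 0], [1, 0]]

tr_AB = trace(matmul(A, B))
tr_BA = trace(matmul(B, A))
tr_ABC = trace(matmul(matmul(A, B), C))
tr_BCA = trace(matmul(matmul(B, C), A))
tr_CAB = trace(matmul(matmul(C, A), B))
tr_ACB = trace(matmul(matmul(A, C), B))

print(tr_AB, tr_BA)            # 3 3
print(tr_ABC, tr_BCA, tr_CAB)  # 1 1 1
print(tr_ACB)                  # 4
```

The three cyclic orderings agree, while the non-cyclic ordering ACB gives a different value for this particular choice of matrices.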
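Relation to eigenvalues: tr(A) = Σ λ_i, the sum of the eigenvalues of A counted with algebraic multiplicity (taken over the complex numbers, since a real matrix may have complex eigenvalues).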
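For a 2 × 2 matrix the relation can be verified directly, because the characteristic polynomial is λ² − tr(A)·λ + det(A) and the sum of its two roots is tr(A). The sketch below covers the 2 × 2 case only; the helper name `eigenvalues_2x2` and the two test matrices are assumptions made for illustration.

```python
import cmath

def eigenvalues_2x2(M):
    """Roots of lambda^2 - tr(M)*lambda + det(M) = 0 (2x2 matrices only)."""
    (a, b), (c, d) = M
    tr = a + d
    det = a * d - b * c
    # cmath.sqrt handles a negative discriminant (complex eigenvalues).
    root = cmath.sqrt(tr * tr - 4 * det)
    return (tr + root) / 2, (tr - root) / 2

# Symmetric matrix with real eigenvalues 3 and 1.
A = [[2, 1], [1, 2]]
lam1, lam2 = eigenvalues_2x2(A)

# Rotation by 90 degrees: eigenvalues are +i and -i, trace is 0.
R = [[0, -1], [1, 0]]
mu1, mu2 = eigenvalues_2x2(R)

print(abs((lam1 + lam2) - (A[0][0] + A[1][1])) < 1e-12)
print(abs((mu1 + mu2) - (R[0][0] + R[1][1])) < 1e-12)
```

The rotation matrix shows why the eigenvalues must be counted over the complex numbers: it has no real eigenvalues, yet the complex pair still sums to the trace.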
5. The Rank-Nullity Theorem (Matrix Form)
For an m × n matrix A, considered as a linear map L_A : R^n → R^m:
Rank: rank(A) = dim(C(A)) = dim(range(L_A)) = number of pivot columns in RREF
Nullity: nullity(A) = dim(N(A)) = dim(ker(L_A)) = number of free variables = n − rank(A)
Rank-Nullity Theorem: rank(A) + nullity(A) = n (number of columns)
This is just dim(range) + dim(ker) = dim(domain) applied to L_A.
Corollary (Fundamental Theorem of Linear Algebra, part 1):
- dim(C(A)) + dim(N(A)) = n
- dim(C(A^T)) + dim(N(A^T)) = m
- dim(C(A)) = dim(C(A^T)) = rank(A) (row rank = column rank)
Example:
$A = \begin{bmatrix} 1 & 2 & 1 & 0 \\ 0 & 1 & 1 & 1 \\ 0 & 0 & 0 & 0 \end{bmatrix}$
n = 4, rank = 2 (two pivot columns), nullity = n − rank = 2. The two free variables correspond to a 2-dimensional null space.
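The rank and nullity of this example can be computed mechanically by counting pivots. The following is a minimal sketch, assuming exact arithmetic with `fractions.Fraction` and forward elimination only; the function name `rank` is illustrative, and this is not meant as a numerically robust routine for floating-point data.

```python
from fractions import Fraction

def rank(M):
    """Count pivots found by forward Gaussian elimination (exact arithmetic)."""
    rows = [[Fraction(v) for v in row] for row in M]
    n_cols = len(rows[0]) if rows else 0
    r = 0  # index of the next pivot row
    for c in range(n_cols):
        # Look for a nonzero entry in column c at or below row r.
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue  # no pivot here: column c corresponds to a free variable
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

A = [[1, 2, 1, 0],
     [0, 1, 1, 1],
     [0, 0, 0, 0]]
A_T = [list(col) for col in zip(*A)]  # transpose

n = len(A[0])
rank_A = rank(A)
nullity_A = n - rank_A

print(rank_A, nullity_A)  # 2 2
print(rank(A_T))          # 2
```

The last line illustrates row rank = column rank: the transpose has the same number of pivots as A.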
Edge case (full rank matrices):
- Full column rank (rank = n): N(A) = {0}, L_A is injective
- Full row rank (rank = m): C(A) = R^m, L_A is surjective
- Full rank square (rank = m = n): A is invertible
Worked Examples
Example 1: Constructing the Matrix from Basis Images
Problem: T : R^3 → R^2, T(x, y, z) = (2x − y + z, x + y − 2z). Find [T]_{D←B} where B = {(1,0,1), (0,1,1), (1,1,0)} (domain) and D = {(1,1), (1,−1)} (codomain).
Solution:
Step 1: Compute T on domain basis vectors.
T(1,0,1) = (2(1)−0+1, 1+0−2(1)) = (3, −1)
T(0,1,1) = (0−1+1, 0+1−2) = (0, −1)
T(1,1,0) = (2−1+0, 1+1−0) = (1, 2)
Step 2: Express each in codomain basis D = {(1,1), (1,−1)}.
[T(b₁)]_D: α(1,1) + β(1,−1) = (3,−1) → α+β = 3, α−β = −1 → 2α = 2 → α=1, β=2. [T(b₁)]_D = (1,2)
[T(b₂)]_D: α+β = 0, α−β = −1 → 2α = −1 → α=−1/2, β=1/2. [T(b₂)]_D = (−1/2, 1/2)
[T(b₃)]_D: α+β = 1, α−β = 2 → 2α = 3 → α=3/2, β=−1/2. [T(b₃)]_D = (3/2, −1/2)
Step 3: Place the coordinate vectors as columns.
$[T]_{D←B} = \begin{bmatrix} 1 & -1/2 & 3/2 \\ 2 & 1/2 & -1/2 \end{bmatrix}$
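The three steps of Example 1 can be reproduced in code: apply T to each domain basis vector, solve for the D-coordinates, and stack the results as columns. This is a sketch under stated assumptions: the helper `coords_in_D` is a hypothetical name, and it solves the 2 × 2 system by Cramer's rule, which works here only because the codomain basis has two vectors.

```python
from fractions import Fraction

def T(v):
    """The map from Example 1: T(x, y, z) = (2x - y + z, x + y - 2z)."""
    x, y, z = v
    return (2 * x - y + z, x + y - 2 * z)

B = [(1, 0, 1), (0, 1, 1), (1, 1, 0)]  # domain basis
D = [(1, 1), (1, -1)]                  # codomain basis

def coords_in_D(w):
    """Solve alpha*d1 + beta*d2 = w by Cramer's rule (2x2 case only)."""
    (a, c), (b, d) = D                 # d1 = (a, c), d2 = (b, d)
    det = Fraction(a * d - b * c)
    alpha = (w[0] * d - b * w[1]) / det
    beta = (a * w[1] - c * w[0]) / det
    return (alpha, beta)

# Column j of the matrix is the D-coordinate vector of T(b_j).
columns = [coords_in_D(T(b)) for b in B]
matrix = [[col[i] for col in columns] for i in range(2)]

for row in matrix:
    print([str(entry) for entry in row])
# ['1', '-1/2', '3/2']
# ['2', '1/2', '-1/2']
```

Exact fractions are used so that entries such as −1/2 print as they appear in the worked solution rather than as floating-point approximations.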
Example 2: Similarity and Trace
Problem: Show tr(A) = tr(B) when A and B are similar.
Solution: B = P⁻¹ A P. tr(B) = tr(P⁻¹ A P) = tr(A P P⁻¹) (cyclic property: tr(XY) = tr(YX) with X = P⁻¹, Y = A P) = tr(A I) = tr(A). ✓
Numerical verification: A = [1 2; 3 4], tr(A) = 5. P = [2 1; 1 1], P⁻¹ = [1 −1; −1 2].
B = P⁻¹ A P = [1 −1; −1 2] [1 2; 3 4] [2 1; 1 1] = [1 −1; −1 2] [4 3; 10 7] = [−6 −4; 16 11]
tr(B) = −6 + 11 = 5. ✓
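The same verification can be run in code, which also lets us check a second similarity invariant, the determinant. This is a minimal sketch using the matrices from the example; the helpers `matmul`, `trace`, and `det2` are illustrative names, and `det2` handles the 2 × 2 case only.

```python
def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]

def trace(X):
    return sum(X[i][i] for i in range(len(X)))

def det2(X):
    """Determinant of a 2x2 matrix."""
    return X[0][0] * X[1][1] - X[0][1] * X[1][0]

A = [[1, 2], [3, 4]]
P = [[2, 1], [1, 1]]
P_inv = [[1, -1], [-1, 2]]

# Confirm that P_inv really is the inverse of P.
identity_check = matmul(P, P_inv)

B = matmul(matmul(P_inv, A), P)

print(identity_check)      # [[1, 0], [0, 1]]
print(B)                   # [[-6, -4], [16, 11]]
print(trace(A), trace(B))  # 5 5
print(det2(A), det2(B))    # -2 -2
```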
Example 3: Rank-Nullity Application
Problem: A 5 × 8 matrix A has a null space N(A) of dimension 3. What is rank(A)? Is the equation Ax = b solvable for every b ∈ R^5?
Solution: n = 8, nullity = 3. By rank-nullity: rank(A) = n − nullity = 8 − 3 = 5.
rank(A) = 5 = m (number of rows), so A has full row rank. This means C(A) = R^5, so Ax = b IS solvable for ALL b ∈ R^5.
Note: The solution is not unique — since nullity = 3, there's a 3-dimensional family of solutions.
Example 4: Matrix of a Composition vs Product of Matrices
Problem: T : R^2 → R^3 with T(x,y) = (x, x+y, y), and S : R^3 → R^2 with S(u,v,w) = (u+w, v−w). Compute matrices [T]_E, [S]_E, [S ∘ T]_E, and verify [S ∘ T] = [S][T].
Solution:
[T]_E: T(e₁) = (1,1,0), T(e₂) = (0,1,1).
$[T] = \begin{bmatrix} 1 & 0 \\ 1 & 1 \\ 0 & 1 \end{bmatrix}$
[S]_E: S(e₁) = (1,0), S(e₂) = (0,1), S(e₃) = (1,−1).
$[S] = \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & -1 \end{bmatrix}$
[S ∘ T]: (S ∘ T)(x,y) = S(x, x+y, y) = (x+y, x+y−y) = (x+y, x).
$[S ∘ T]_E = \begin{bmatrix} 1 & 1 \\ 1 & 0 \end{bmatrix}$
Matrix product: [S][T] = [1 0 1; 0 1 −1] [1 0; 1 1; 0 1] = [(1·1+0·1+1·0) (1·0+0·1+1·1); (0·1+1·1+(−1)·0) (0·0+1·1+(−1)·1)] = [1 1; 1 0]. ✓
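The identity [S ∘ T] = [S][T] can also be confirmed by building each standard matrix from the images of the standard basis vectors. The sketch below assumes the maps are written as ordinary Python functions; `matrix_of` is a hypothetical helper, not a library call.

```python
def T(x, y):
    return (x, x + y, y)

def S(u, v, w):
    return (u + w, v - w)

def compose(x, y):
    """(S o T)(x, y) = S(T(x, y))."""
    return S(*T(x, y))

def matrix_of(f, n):
    """Standard matrix of f : R^n -> R^m; column j is f(e_j)."""
    cols = [f(*[1 if i == j else 0 for i in range(n)]) for j in range(n)]
    m = len(cols[0])
    return [[cols[j][i] for j in range(n)] for i in range(m)]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]

M_T = matrix_of(T, 2)
M_S = matrix_of(S, 3)
M_ST = matrix_of(compose, 2)

print(M_T)              # [[1, 0], [1, 1], [0, 1]]
print(M_S)              # [[1, 0, 1], [0, 1, -1]]
print(M_ST)             # [[1, 1], [1, 0]]
print(matmul(M_S, M_T)) # [[1, 1], [1, 0]]
```

Note the order: the matrix of S ∘ T is [S][T], with the map applied first appearing on the right.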
Practice Problems
(Answers are below. Try each problem before checking.)
Problem 1: Write the matrix of T : R^3 → R^2, T(x, y, z) = (3x − 2y + z, x + 4y − z) relative to the standard bases.
Problem 2: For A = [1 2 3; 4 5 6; 7 8 9], find rank(A), nullity(A), and a basis for N(A).
Problem 3: Prove tr(AB) = tr(BA) for arbitrary n × n matrices. Then use this to show tr(ABC) = tr(BCA).
Problem 4: If A and B are similar, do they necessarily have the same eigenvalues? Same eigenvectors? Explain.
Problem 5: A 7 × 4 matrix A has rank 3. Is the transformation x ↦ Ax injective? Surjective?
Problem 6: Find [T]_{B←B} for T : R^2 → R^2, T(x, y) = (3x + y, x + 3y) with B = {(1, 1), (1, −1)}.
Problem 7: Show that if A is n × n and P is invertible n × n, then tr(P⁻¹AP) = tr(A).
Answers
**Problem 1:**
$A = \begin{bmatrix} 3 & -2 & 1 \\ 1 & 4 & -1 \end{bmatrix}$
Columns: T(e₁) = (3,1), T(e₂) = (−2,4), T(e₃) = (1,−1). ✓
**Problem 2:** Row reduce A:
$\begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix} \xrightarrow{R_2 - 4R_1,\ R_3 - 7R_1} \begin{bmatrix} 1 & 2 & 3 \\ 0 & -3 & -6 \\ 0 & -6 & -12 \end{bmatrix} \xrightarrow{R_3 - 2R_2,\ R_2/(-3),\ R_1 - 2R_2} \begin{bmatrix} 1 & 0 & -1 \\ 0 & 1 & 2 \\ 0 & 0 & 0 \end{bmatrix}$
rank = 2 (two pivot columns). nullity = n − rank = 3 − 2 = 1.
RREF equations: x₁ − x₃ = 0, x₂ + 2x₃ = 0. Let x₃ = t: x = (t, −2t, t) = t(1, −2, 1). N(A) = span{(1, −2, 1)}.
**Problem 3:** tr(AB) = Σ_i (AB)_ii = Σ_i Σ_k a_ik b_ki = Σ_k Σ_i b_ki a_ik = Σ_k (BA)_kk = tr(BA).
Then tr(ABC) = tr((AB)C) = tr(C(AB)) = tr(CAB). And tr(BCA) = tr((BC)A) = tr(A(BC)) = tr(ABC). So tr(ABC) = tr(BCA).
**Problem 4:** Similar matrices DO have the same eigenvalues: if Av = λv, then (P⁻¹AP)(P⁻¹v) = P⁻¹Av = λ(P⁻¹v). So λ is an eigenvalue of P⁻¹AP with eigenvector P⁻¹v.
But same eigenvectors? NOT necessarily. B = P⁻¹AP has eigenvectors P⁻¹v (the eigenvectors of A transformed by P⁻¹). In general these differ from v; they coincide only in special cases, for example when v is also an eigenvector of P (as when P is a scalar multiple of I).
**Problem 5:** n = 4 columns, rank = 3.
Injective? nullity = n − rank = 1 > 0, so ker ≠ {0}. NOT injective.
Surjective? The range is 3-dimensional, but the codomain is R^7. NOT surjective. (No 7 × 4 matrix can be surjective, since its rank is at most 4 < 7.)
The map is neither injective nor surjective.
**Problem 6:** T(b₁) = T(1,1) = (4, 4). Express in B: α(1,1) + β(1,−1) = (4,4) → α+β=4, α−β=4 → α=4, β=0. [T(b₁)]_B = (4,0).
T(b₂) = T(1,−1) = (2, −2). Express in B: α+β=2, α−β=−2 → α=0, β=2. [T(b₂)]_B = (0,2).
$[T]_{B←B} = \begin{bmatrix} 4 & 0 \\ 0 & 2 \end{bmatrix}$
This is diagonal! B is a basis of eigenvectors: b₁ has eigenvalue 4, b₂ has eigenvalue 2.
**Problem 7:** tr(P⁻¹AP) = tr(APP⁻¹) = tr(AI) = tr(A). The first equality is the cyclic property tr(XY) = tr(YX) with X = P⁻¹ and Y = AP.
Summary
- Matrix-vector multiplication Ax can be understood as a linear combination of columns (column picture) or as dot products with rows (row picture); both perspectives are useful in different contexts
- The matrix of T relative to bases B, C is constructed by computing T on each basis vector of B and expressing the result in C. For an operator T : V → V, changing the basis replaces the matrix A by P⁻¹AP (similarity)
- The trace is the sum of diagonal entries, is linear, has the cyclic property tr(AB) = tr(BA), and is invariant under similarity; it equals the sum of eigenvalues
- The rank-nullity theorem rank(A) + nullity(A) = n decomposes the domain dimension; a linear system Ax = b is solvable iff rank(A) = rank([A|b])
- Similarity (B = P⁻¹AP) captures the idea that the same linear transformation looks different in different coordinate systems; many properties (rank, trace, det, eigenvalues) are similarity-invariant
Pitfalls
- Thinking similar matrices share eigenvectors. Similar matrices share eigenvalues, but their eigenvectors are related by the change-of-basis matrix: if Av = λv, then (P⁻¹AP)(P⁻¹v) = λ(P⁻¹v). The eigenvectors of B = P⁻¹AP are P⁻¹v, which in general differ from v.
- Applying the cyclic property of trace to non-cyclic permutations. tr(ABC) = tr(BCA) = tr(CAB) works only for CYCLIC permutations. In general tr(ABC) ≠ tr(ACB): exchanging two factors (here B and C) is not a cyclic shift.
- Using pivot columns from RREF instead of the original matrix for column space basis. Row operations preserve the row space and null space but generally change the column space. The pivot columns of the ORIGINAL A, in the positions identified by the RREF, give a basis for C(A). The columns of the RREF itself generally span a different subspace.
- Confusing rank(A) with the number of rows or columns. rank(A) is the dimension of C(A) (or equivalently C(A^T)). It can be at most min(m, n) but is often smaller. A 5 × 3 matrix can have rank 2; don't assume rank equals either dimension.
- Thinking Ax = b is always solvable. The system is consistent only when b ∈ C(A). Full row rank (rank = m) guarantees solvability for every b; full column rank (rank = n) guarantees uniqueness when a solution exists. Neither condition is automatic.
Quiz
Answer each question, then read the explanation for your choice.
Q1: The matrix-vector product Ax can be interpreted as:
A) A dot product of A with x B) A linear combination of the rows of A C) A linear combination of the columns of A D) The transpose of A multiplied by x
Answer and Explanations
**Correct: C) A linear combination of the columns of A**
Ax = x₁·col₁(A) + x₂·col₂(A) + ... + x_n·col_n(A). This column picture directly connects Ax to the column space C(A).
- A) Not defined: A and x have different shapes.
- B) The row picture shows each component, but Ax as a whole is a column combination.
- C) ✓ Correct. This is the fundamental column interpretation.
- D) Different operation entirely.
Q2: If A is 4 × 7 with rank 2, what is dim(N(A))?
A) 2 B) 3 C) 5 D) 7
Answer and Explanations
**Correct: C) 5**
nullity = n − rank = 7 − 2 = 5.
- A) That's the rank.
- B) That would be if n = 5.
- C) ✓ Correct. 7 − 2 = 5.
- D) That would be if rank = 0.
Q3: Which property is NOT preserved under similarity (B = P⁻¹AP)?
A) Trace B) Determinant C) Eigenvectors D) Rank
Answer and Explanations
**Correct: C) Eigenvectors**
Eigenvectors change: if Av = λv, then B(P⁻¹v) = λ(P⁻¹v), so eigenvectors of B are P⁻¹ times eigenvectors of A. In general they differ from the eigenvectors of A.
- A) tr(B) = tr(A), so it is preserved.
- B) det(B) = det(A), so it is preserved (det(P⁻¹AP) = det(P⁻¹)det(A)det(P) = det(A)).
- C) ✓ Not preserved: eigenvectors transform by P⁻¹.
- D) Rank is preserved because P is invertible.
Q4: For n × n matrices A and B, tr(AB) = tr(BA). This is called the:
A) Commutative property B) Cyclic property C) Symmetry property D) Nilpotent property
Answer and Explanations
**Correct: B) Cyclic property**
The trace is invariant under cyclic permutations. In general, AB ≠ BA (matrices don't commute), but their traces are equal.
- A) Commutativity would be AB = BA, which is false in general.
- B) ✓ Correct. You can cyclically permute factors inside a trace.
- C) The trace is symmetric (tr(A^T) = tr(A)), but that's not what tr(AB) = tr(BA) is called.
- D) Nilpotent means A^k = 0 for some k.
Q5: A 6 × 6 matrix has nullity 0. What can we conclude?
A) The matrix is the identity B) The matrix is invertible C) The matrix has trace 0 D) The determinant may be zero
Answer and Explanations
**Correct: B) The matrix is invertible**
nullity = 0 → rank = n − nullity = 6 = full rank. A square matrix with full rank is invertible.
- A) The identity is one example, but any full-rank matrix works.
- B) ✓ Correct. Full rank square matrix ⇔ invertible.
- C) Trace has nothing to do with nullity.
- D) Full rank means det ≠ 0.
Q6: If A is an m × n matrix and rank(A) = m, then:
A) The columns are linearly independent B) Ax = b has a solution for every b ∈ R^m C) N(A) = {0} D) A is square
Answer and Explanations
**Correct: B) Ax = b has a solution for every b ∈ R^m**
rank(A) = m means C(A) = R^m (full row rank), so every b is in the column space → the system is always consistent.
- A) Full column rank (rank = n) gives independent columns. Here rank = m, so the columns may be dependent (if m < n).
- B) ✓ Correct. Full row rank → surjective → consistent for all b.
- C) N(A) = {0} requires rank = n (full column rank), not rank = m.
- D) Full row rank doesn't imply square.
Q7: For a 5 × 3 matrix A with rank 3, what is the dimension of N(A^T)?
A) 0 B) 2 C) 3 D) 5
Answer and Explanations
**Correct: B) 2**
For the transpose: A^T is 3 × 5. dim(N(A^T)) = m − rank(A^T) = m − rank(A) = 5 − 3 = 2.
- A) That would require rank(A) = m = 5, which is impossible for a 5 × 3 matrix (rank ≤ 3).
- B) ✓ Correct. 5 − 3 = 2.
- C) That's the rank.
- D) That's m.
Q8: What is tr(I_n)?
A) 0 B) 1 C) n D) n²
Answer and Explanations
**Correct: C) n**
The identity matrix has ones on the diagonal and zeros elsewhere. Summing n ones gives n.
- A) The zero matrix has trace 0.
- B) That is correct only for the 1 × 1 identity (n = 1).
- C) ✓ Correct. n diagonal entries, each 1.
- D) That's the total number of entries.
Q9: If A and B are similar matrices, which statement is FALSE?
A) det(A) = det(B) B) tr(A) = tr(B) C) A and B have the same eigenvectors D) rank(A) = rank(B)
Answer and Explanations
**Correct: C) A and B have the same eigenvectors**
As shown earlier, eigenvectors transform: if Av = λv, then B(P⁻¹v) = λ(P⁻¹v). The eigenvectors of B are P⁻¹v, not v itself.
- A) True: det(B) = det(P⁻¹AP) = det(P⁻¹)det(A)det(P) = det(A).
- B) True by cyclic property of trace.
- C) ✓ False. Eigenvectors generally change under similarity.
- D) True: multiplying by invertible P and P⁻¹ preserves rank.
Q10: In the rank-nullity theorem rank(A) + dim(N(A)) = n, the n refers to:
A) The number of rows of A B) The number of columns of A C) The number of nonzero rows in RREF D) min(m, n)
Answer and Explanations
**Correct: B) The number of columns of A**
The domain of L_A is R^n, where n is the number of columns. dim(N(A)) + dim(range(L_A)) = dim(domain).
- A) m (rows) appears in rank(A^T) + dim(N(A^T)) = m.
- B) ✓ Correct. Domain dimension = number of columns.
- C) That equals rank(A).
- D) min(m,n) is the maximum possible rank, not the domain dimension.
Next Steps
Move on to 08-05 — Inner Product Spaces to introduce geometry into vector spaces: the dot product and its generalization (inner product), norms, distances, angles, orthogonality, and the Gram-Schmidt orthogonalization process.