06-05 - Chain Rule for Multivariable Functions
Phase: 6 | Subject: 06-05
Prerequisites: 06-03-partial-derivatives.md (∂f/∂x, ∂f/∂y computation), 04-04-differentiation-rules.md (single-variable chain rule), 04-06-implicit-differentiation.md (implicit differentiation in 1D)
Next subject: 06-06-directional-derivatives-and-gradient.md
Learning Objectives
By the end of this subject, you will be able to:
- Apply the multivariable chain rule using tree diagrams to track variable dependencies
- Compute derivatives for Case 1: x(t), y(t) → dz/dt
- Compute partial derivatives for Case 2: x(s,t), y(s,t) → ∂z/∂s, ∂z/∂t
- Apply implicit differentiation in multiple variables using the chain rule
- Understand and compute the total derivative
Core Content
The Chain Rule: Why It Matters
⚠️ CRITICAL FOUNDATION: The multivariable chain rule dz/dt = (∂z/∂x)(dx/dt) + (∂z/∂y)(dy/dt) sums contributions over all paths. This tree-diagram approach generalizes to any number of variables and is the engine behind backpropagation in neural networks, implicit differentiation, and PDE conversions.
In single-variable calculus, the chain rule handles compositions: d/dx[f(g(x))] = f'(g(x))·g'(x).
In multivariable calculus, a function z = f(x,y) might have x and y themselves depending on other variables. The chain rule tells us how z changes when those underlying variables change.
Tree Diagrams
A tree diagram is a reliable way to organize dependencies and write the chain rule correctly:
Case 1: z = f(x,y) with x = g(t), y = h(t) — one parameter
$
    z
   / \
  x   y
  |   |
  t   t
$
z depends on x and y. x and y each depend on t. Resulting derivative:
$dz/dt = (∂z/∂x)(dx/dt) + (∂z/∂y)(dy/dt) $
This is the total derivative with respect to t.
Mnemonic: "Sum over all paths from z to t — multiply partials along each branch."
Case 2: z = f(x,y) with x = g(s,t), y = h(s,t) — two parameters
$
      z
    /   \
   x     y
  / \   / \
 s   t s   t
$
Now z has partials with respect to s and t:
$∂z/∂s = (∂z/∂x)(∂x/∂s) + (∂z/∂y)(∂y/∂s)
∂z/∂t = (∂z/∂x)(∂x/∂t) + (∂z/∂y)(∂y/∂t) $
Key distinction: Use the ordinary d for the derivative of a quantity that depends on a single variable, and the partial ∂ when it depends on several. In Case 1, x, y, and (after substitution) z are functions of t alone, so we write dx/dt, dy/dt, and dz/dt. In Case 2, x and y depend on both s and t, so every derivative in the formula is a partial.
General Rule
For each terminal variable, sum over all paths from z to that variable:
$∂z/∂(terminal) = Σ (∂z/∂(intermediate) × ∂(intermediate)/∂(terminal)) $
summed over all intermediate variables.
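The general rule can be sketched numerically. The following is a minimal illustration rather than a reference implementation: the helper names `partial` and `chain_rule_derivative` are hypothetical, the sample functions are chosen only for demonstration, and partial derivatives are approximated with central differences instead of being computed symbolically.

```python
def partial(func, args, index, h=1e-6):
    """Central-difference estimate of the partial derivative of func
    with respect to args[index], holding the other arguments fixed."""
    up = list(args)
    down = list(args)
    up[index] += h
    down[index] -= h
    return (func(*up) - func(*down)) / (2 * h)


def chain_rule_derivative(outer, inners, params, index):
    """Sum over all paths: for each intermediate variable u_k, multiply
    d(outer)/d(u_k) by d(u_k)/d(params[index]), then add the products."""
    intermediates = [g(*params) for g in inners]
    total = 0.0
    for k, g in enumerate(inners):
        total += partial(outer, intermediates, k) * partial(g, params, index)
    return total


# Illustrative choice (an assumption, not taken from the text):
# z = x * y**2 with x = s + t, y = s * t, evaluated at (s, t) = (1.5, 0.5).
outer = lambda x, y: x * y**2
inners = [lambda s, t: s + t, lambda s, t: s * t]
point = (1.5, 0.5)

dz_ds_chain = chain_rule_derivative(outer, inners, point, 0)

# Independent check: differentiate the fully substituted function directly.
composed = lambda s, t: outer(inners[0](s, t), inners[1](s, t))
dz_ds_direct = partial(composed, point, 0)

print(dz_ds_chain, dz_ds_direct)
```

The loop in `chain_rule_derivative` has one term per intermediate variable, which is exactly one term per path in the tree diagram.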
Case 1 Examples: dz/dt
Example 1
Problem: z = x²y + xy², where x = cos(t), y = sin(t). Find dz/dt.
Solution:
Step 1: ∂z/∂x = 2xy + y², ∂z/∂y = x² + 2xy.
Step 2: dx/dt = -sin(t), dy/dt = cos(t).
Step 3: dz/dt = (2xy + y²)(-sin(t)) + (x² + 2xy)(cos(t)).
Since x = cos(t), y = sin(t):
dz/dt = (2cos(t)sin(t) + sin²(t))(-sin(t)) + (cos²(t) + 2cos(t)sin(t))(cos(t))
= -2cos(t)sin²(t) - sin³(t) + cos³(t) + 2cos²(t)sin(t)
= cos³(t) - sin³(t) + 2cos²(t)sin(t) - 2cos(t)sin²(t)
= (cos(t) - sin(t))(cos²(t) + cos(t)sin(t) + sin²(t)) + 2cos(t)sin(t)(cos(t) - sin(t))
= (cos(t) - sin(t))(1 + cos(t)sin(t) + 2cos(t)sin(t))
= (cos(t) - sin(t))(1 + 3cos(t)sin(t))
(We could also compute this by first substituting: z = cos²t·sin t + cos t·sin²t = cos t·sin t(cos t + sin t) and differentiating directly — both approaches give the same result.)
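The agreement between the two approaches can also be checked numerically. This is a small sketch under stated assumptions: the function names are hypothetical, the sample values of t are arbitrary, and the derivative of the substituted function is estimated by a central difference.

```python
import math


def z_of_t(t):
    # z = x^2 y + x y^2 with x = cos(t), y = sin(t) substituted in.
    x, y = math.cos(t), math.sin(t)
    return x**2 * y + x * y**2


def dz_dt_formula(t):
    # Closed form obtained from the chain rule in Example 1.
    c, s = math.cos(t), math.sin(t)
    return (c - s) * (1 + 3 * c * s)


def central_difference(func, t, h=1e-6):
    # Symmetric finite-difference estimate of func'(t).
    return (func(t + h) - func(t - h)) / (2 * h)


for t in (0.0, 0.7, 2.0):
    print(t, dz_dt_formula(t), central_difference(z_of_t, t))
```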
Example 2: Direct Substitution Check
Problem: z = e^{xy}, x = t², y = t³. Find dz/dt at t = 1.
Solution:
Method 1 (chain rule):
∂z/∂x = ye^{xy} → at t=1, x=1, y=1: 1·e¹ = e.
∂z/∂y = xe^{xy} → at t=1: 1·e¹ = e.
dx/dt = 2t → at t=1: 2.
dy/dt = 3t² → at t=1: 3.
dz/dt = e·2 + e·3 = 5e.
Method 2 (direct substitution):
z = e^{t²·t³} = e^{t⁵}. dz/dt = 5t⁴ e^{t⁵}. At t=1: 5e. ✓
Case 2 Examples: ∂z/∂s and ∂z/∂t
Example 1
Problem: z = x² + xy + y², where x = s + t, y = st. Find ∂z/∂s and ∂z/∂t.
Solution:
∂z/∂x = 2x + y, ∂z/∂y = x + 2y. ∂x/∂s = 1, ∂x/∂t = 1. ∂y/∂s = t, ∂y/∂t = s.
∂z/∂s = (2x + y)(1) + (x + 2y)(t) = 2x + y + xt + 2yt.
Substituting x = s + t, y = st: = 2(s + t) + st + (s + t)t + 2(st)t = 2s + 2t + st + st + t² + 2st² = 2s + 2t + 2st + t² + 2st².
∂z/∂t = (2x + y)(1) + (x + 2y)(s) = 2x + y + xs + 2ys.
Substituting: = 2(s + t) + st + (s + t)s + 2(st)s = 2s + 2t + st + s² + st + 2s²t = s² + 2s + 2t + 2st + 2s²t.
(Check: directly, z = (s+t)² + (s+t)(st) + (st)² = s² + 2st + t² + s²t + st² + s²t². Then ∂z/∂s = 2s + 2t + 2st + t² + 2st² and ∂z/∂t = 2s + 2t + s² + 2st + 2s²t. Both match.)
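The two partials can be spot-checked at a sample point. The sketch below is illustrative only: the function names and the evaluation point (s, t) = (1.2, -0.4) are assumptions, and the reference values come from central differences on the substituted function.

```python
def z_direct(s, t):
    # z = x^2 + x y + y^2 with x = s + t, y = s t substituted in.
    x, y = s + t, s * t
    return x**2 + x * y + y**2


def dz_ds(s, t):
    # Chain-rule result for the partial with respect to s.
    return 2*s + 2*t + 2*s*t + t**2 + 2*s*t**2


def dz_dt(s, t):
    # Chain-rule result for the partial with respect to t.
    return s**2 + 2*s + 2*t + 2*s*t + 2*s**2*t


h = 1e-6
s0, t0 = 1.2, -0.4  # arbitrary sample point

# Vary one parameter at a time while holding the other fixed.
numeric_ds = (z_direct(s0 + h, t0) - z_direct(s0 - h, t0)) / (2 * h)
numeric_dt = (z_direct(s0, t0 + h) - z_direct(s0, t0 - h)) / (2 * h)

print(dz_ds(s0, t0), numeric_ds)
print(dz_dt(s0, t0), numeric_dt)
```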
Example 2: Polar → Cartesian Conversion
Problem: If z = f(x,y) and x = r cos θ, y = r sin θ, express ∂z/∂r and ∂z/∂θ in terms of ∂z/∂x and ∂z/∂y.
Solution:
∂x/∂r = cos θ, ∂x/∂θ = -r sin θ. ∂y/∂r = sin θ, ∂y/∂θ = r cos θ.
∂z/∂r = (∂z/∂x)(cos θ) + (∂z/∂y)(sin θ).
∂z/∂θ = (∂z/∂x)(-r sin θ) + (∂z/∂y)(r cos θ).
These formulas are extensively used when converting PDEs between coordinate systems.
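To see the polar formulas at work on a concrete function, here is a short numerical sketch. The sample function f(x, y) = x²y + y, the point (r, θ) = (2.0, 0.6), and all names are assumptions made for illustration.

```python
import math


def f(x, y):
    return x**2 * y + y      # sample function (illustrative assumption)


def f_x(x, y):
    return 2 * x * y         # partial of f with respect to x


def f_y(x, y):
    return x**2 + 1          # partial of f with respect to y


def z_polar(r, theta):
    # The same function expressed through polar coordinates.
    return f(r * math.cos(theta), r * math.sin(theta))


r0, th0, h = 2.0, 0.6, 1e-6
x0, y0 = r0 * math.cos(th0), r0 * math.sin(th0)

# Chain-rule formulas from the polar example.
dz_dr = f_x(x0, y0) * math.cos(th0) + f_y(x0, y0) * math.sin(th0)
dz_dth = f_x(x0, y0) * (-r0 * math.sin(th0)) + f_y(x0, y0) * (r0 * math.cos(th0))

# Finite-difference estimates taken directly in polar coordinates.
num_dr = (z_polar(r0 + h, th0) - z_polar(r0 - h, th0)) / (2 * h)
num_dth = (z_polar(r0, th0 + h) - z_polar(r0, th0 - h)) / (2 * h)

print(dz_dr, num_dr)
print(dz_dth, num_dth)
```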
Implicit Differentiation Revisited
Single-Variable Recall
For F(x,y) = 0 defining y implicitly as a function of x: dy/dx = -Fₓ/F_y (provided F_y ≠ 0).
Derivation via Chain Rule
If y = y(x) is defined implicitly by F(x, y(x)) = 0, differentiate with respect to x:
d/dx[F(x, y(x))] = Fₓ·dx/dx + F_y·dy/dx = Fₓ + F_y·dy/dx = 0.
Solve: dy/dx = -Fₓ/F_y.
Two-Variable Implicit Differentiation
If z = z(x,y) is defined implicitly by F(x, y, z) = 0:
∂z/∂x = -Fₓ/F_z, ∂z/∂y = -F_y/F_z (provided F_z ≠ 0).
Derivation: Differentiate F(x, y, z(x,y)) = 0 with respect to x: Fₓ·1 + F_y·0 + F_z·∂z/∂x = 0 → ∂z/∂x = -Fₓ/F_z.
Similarly for y.
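Written out, the y-computation mirrors the x-case: x is now held fixed, so its term drops out and the F_y term survives. In LaTeX notation, using the same symbols as the derivation for x:

```latex
\frac{\partial}{\partial y}\,F\bigl(x,\, y,\, z(x,y)\bigr)
  = F_x \cdot 0 + F_y \cdot 1 + F_z \,\frac{\partial z}{\partial y} = 0
\quad\Longrightarrow\quad
\frac{\partial z}{\partial y} = -\frac{F_y}{F_z},
\qquad F_z \neq 0 .
```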
Example: Sphere
Problem: x² + y² + z² = 1 defines z implicitly. Find ∂z/∂x and ∂z/∂y.
Solution:
F(x,y,z) = x² + y² + z² - 1 = 0.
Fₓ = 2x, F_y = 2y, F_z = 2z.
∂z/∂x = -2x/(2z) = -x/z. ∂z/∂y = -2y/(2z) = -y/z.
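These match what you'd get by solving for z and differentiating directly. On the upper hemisphere z = √(1 - x² - y²), ∂z/∂x = -x/√(1 - x² - y²) = -x/z. On the lower hemisphere z = -√(1 - x² - y²), ∂z/∂x = x/√(1 - x² - y²), which is again -x/z. ✓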
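A quick numerical sketch confirms that the single formula -x/z covers both hemispheres. The sample point (x, y) = (0.3, 0.4) and the function names are assumptions for illustration; derivatives of the explicit solutions are estimated by central differences.

```python
import math


def z_upper(x, y):
    # Upper hemisphere of x^2 + y^2 + z^2 = 1.
    return math.sqrt(1 - x**2 - y**2)


def z_lower(x, y):
    # Lower hemisphere of the same sphere.
    return -math.sqrt(1 - x**2 - y**2)


x0, y0, h = 0.3, 0.4, 1e-6

results = []
for branch in (z_upper, z_lower):
    z0 = branch(x0, y0)
    implicit = -x0 / z0   # formula from implicit differentiation
    numeric = (branch(x0 + h, y0) - branch(x0 - h, y0)) / (2 * h)
    results.append((implicit, numeric))
    print(branch.__name__, implicit, numeric)
```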
Example: More Complex Implicit
Problem: e^{xyz} + x² + y² + z² = 2 defines z near (0,0,1). Find ∂z/∂x and ∂z/∂y at (0,0,1).
Solution:
F(x,y,z) = e^{xyz} + x² + y² + z² - 2 = 0. (The point lies on the surface: e⁰ + 0 + 0 + 1 - 2 = 0.)
Fₓ = yz e^{xyz} + 2x → Fₓ(0,0,1) = 0·1·e⁰ + 0 = 0.
F_y = xz e^{xyz} + 2y → F_y(0,0,1) = 0.
F_z = xy e^{xyz} + 2z → F_z(0,0,1) = 0 + 2 = 2.
∂z/∂x = -Fₓ/F_z = -0/2 = 0. ∂z/∂y = -F_y/F_z = -0/2 = 0.
The Total Derivative
For z = f(x,y) where x and y are independent variables:
$dz = fₓ dx + f_y dy $
This was introduced in 06-04 (the differential). But when x and y depend on parameters:
If x = x(t), y = y(t):
$dz/dt = fₓ·dx/dt + f_y·dy/dt $
If x = x(s,t), y = y(s,t):
$∂z/∂s = fₓ·∂x/∂s + f_y·∂y/∂s ∂z/∂t = fₓ·∂x/∂t + f_y·∂y/∂t $
These are all forms of the total derivative — capturing how z changes through all its dependencies.
Common Mistake: Confusing ∂z/∂t with dz/dt
When z = f(x,y) and both x and y depend on t only, there's a single parameter, so we write dz/dt (ordinary derivative), not ∂z/∂t.
When z = F(x,y,t), so that t enters BOTH indirectly through x and y AND directly as an argument of F, the tree is:
$
     z
   / | \
  x  y  t
  |  |
  t  t
$
Then dz/dt = (∂F/∂x)(dx/dt) + (∂F/∂y)(dy/dt) + ∂F/∂t.
The last term ∂F/∂t is the direct partial derivative of F with respect to its third argument, taken with x and y held fixed.
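The effect of dropping the direct term is easy to see numerically. The sketch below uses an assumed example, F(x, y, t) = xy + t² with x = cos t and y = sin t, which is not from the text; the names and the sample value t = 0.9 are likewise illustrative.

```python
import math


def F(x, y, t):
    return x * y + t**2          # t appears directly as the third argument


def z_of_t(t):
    # Full composition: t enters through x, through y, and directly.
    return F(math.cos(t), math.sin(t), t)


t0, h = 0.9, 1e-6
x0, y0 = math.cos(t0), math.sin(t0)

# Partials of F at (x0, y0, t0), computed by hand for this sample F.
F_x, F_y, F_t = y0, x0, 2 * t0
dx_dt, dy_dt = -math.sin(t0), math.cos(t0)

indirect_only = F_x * dx_dt + F_y * dy_dt   # omits the direct branch
full = indirect_only + F_t                  # all three branches of the tree
numeric = (z_of_t(t0 + h) - z_of_t(t0 - h)) / (2 * h)

print(indirect_only, full, numeric)
```

Only `full` agrees with the finite-difference value; `indirect_only` is off by exactly the direct partial 2t.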
Worked Examples
Example 1: Full dz/dt Computation
Problem: z = x ln(y), x = e^t, y = t². Find dz/dt.
Solution:
Step 1: ∂z/∂x = ln(y), ∂z/∂y = x/y.
Step 2: dx/dt = e^t, dy/dt = 2t.
Step 3: dz/dt = ln(y)·e^t + (x/y)·2t.
Step 4: Substitute x = e^t, y = t² (taking t > 0 so that ln(t²) = 2ln(t)): dz/dt = ln(t²)·e^t + (e^t/t²)·2t = 2ln(t)·e^t + 2e^t/t = 2e^t(ln(t) + 1/t).
Check via direct substitution: z = e^t ln(t²) = 2e^t ln(t). dz/dt = 2e^t ln(t) + 2e^t(1/t) = 2e^t(ln(t) + 1/t). ✓
Example 2: Chain Rule with Three Intermediate Variables
Problem: w = x² + y² + z², with x = r cos θ, y = r sin θ, z = φ. Find ∂w/∂r, ∂w/∂θ, ∂w/∂φ.
Solution:
Tree: w depends on (x, y, z). x depends on (r, θ). y depends on (r, θ). z depends on φ.
∂w/∂r = ∂w/∂x·∂x/∂r + ∂w/∂y·∂y/∂r + ∂w/∂z·∂z/∂r = 2x(cos θ) + 2y(sin θ) + 2z(0) = 2r cos²θ + 2r sin²θ = 2r(cos²θ + sin²θ) = 2r.
∂w/∂θ = 2x(-r sin θ) + 2y(r cos θ) + 2z(0) = -2r² cos θ sin θ + 2r² sin θ cos θ = 0.
∂w/∂φ = 2z(1) = 2φ.
Check: w = r² + φ² directly. ∂w/∂r = 2r. ∂w/∂θ = 0. ∂w/∂φ = 2φ. ✓
Example 3: Implicit Differentiation
Problem: yz + x ln(y) + z² = 2 - e defines z implicitly as a function of (x,y). Find ∂z/∂x at (x,y,z) = (1, e, -1).
Solution:
Step 1: F(x,y,z) = yz + x ln(y) + z² - (2 - e) = 0. (The point lies on the surface: e·(-1) + 1·ln(e) + (-1)² = 2 - e.)
Step 2: Fₓ = ln(y), F_y = z + x/y, F_z = y + 2z.
Step 3: ∂z/∂x = -Fₓ/F_z = -ln(y)/(y + 2z).
Step 4: At (1, e, -1): ln(e) = 1. y + 2z = e + 2(-1) = e - 2. ∂z/∂x = -1/(e - 2) ≈ -1/(0.718) ≈ -1.392.
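The value -1/(e - 2) can be cross-checked by solving for z numerically. This is a sketch under stated assumptions: it works on the level set of G(x, y, z) = yz + x ln(y) + z² that passes through (1, e, -1), so the point lies on the surface by construction; the names `G`, `level`, and `solve_z` are hypothetical, and Newton's method is a technique chosen here for illustration.

```python
import math


def G(x, y, z):
    # Left-hand side of the implicit relation.
    return y * z + x * math.log(y) + z**2


x0, y0, z0 = 1.0, math.e, -1.0
level = G(x0, y0, z0)   # constant of the level set through the point


def solve_z(x, y, z_start, steps=50):
    """Newton's method in z for G(x, y, z) = level, started near z_start."""
    z = z_start
    for _ in range(steps):
        z -= (G(x, y, z) - level) / (y + 2 * z)   # dG/dz = y + 2z
    return z


h = 1e-5
# Re-solve for z after nudging x, then take a central difference.
numeric = (solve_z(x0 + h, y0, z0) - solve_z(x0 - h, y0, z0)) / (2 * h)
formula = -math.log(y0) / (y0 + 2 * z0)   # -F_x / F_z at the point

print(formula, numeric)
```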
Quiz
Q1: For z = f(x, y) where x = g(t) and y = h(t), the total derivative dz/dt is:
A) ∂z/∂x + ∂z/∂y
B) (∂z/∂x)(dx/dt) + (∂z/∂y)(dy/dt)
C) ∂z/∂t
D) (∂z/∂x)(∂z/∂y)
Correct: B)
- If you chose B: The chain rule sums over all paths from z to t. Since z depends on x and y, and both depend on t: dz/dt = (∂z/∂x)(dx/dt) + (∂z/∂y)(dy/dt). Correct!
- If you chose A: Missing the dx/dt and dy/dt factors — this would only be correct if x = t and y = t.
- If you chose C: When x and y depend on t, we write dz/dt (ordinary derivative), not ∂z/∂t, but the formula still involves the partials.
- If you chose D: This product has no meaningful interpretation in the chain rule context.
Q2: If z = f(x, y) with x = g(s, t) and y = h(s, t), then ∂z/∂s equals:
A) (∂z/∂x)(∂x/∂s) only
B) (∂z/∂y)(∂y/∂s) only
C) (∂z/∂x)(∂x/∂s) + (∂z/∂y)(∂y/∂s)
D) (∂z/∂x)(∂z/∂y)
Correct: C)
- If you chose C: There are two paths from z to s (through x and through y). The chain rule sums both: multiply along each branch, then add. Correct!
- If you chose A: Missing the path through y — incomplete derivative.
- If you chose B: Missing the path through x — incomplete derivative.
- If you chose D: This product is not the chain rule.
Q3: For the implicit relation F(x, y) = 0 defining y as a function of x, dy/dx equals:
A) F_x / F_y
B) −F_x / F_y
C) F_y / F_x
D) −F_y / F_x
Correct: B)
- If you chose B: Differentiating F(x, y(x)) = 0 with respect to x gives F_x + F_y·(dy/dx) = 0 → dy/dx = −F_x/F_y. Correct!
- If you chose A: Missing the negative sign.
- If you chose C: Both the sign and the ratio are wrong.
- If you chose D: The ratio is inverted.
Q4: If z = x² + y² with x = r cos θ, y = r sin θ, then ∂z/∂r equals:
A) 2r
B) 2x cos θ + 2y sin θ
C) 2x + 2y
D) 2r cos θ sin θ
Correct: A)
- If you chose A: ∂z/∂r = (2x)(cos θ) + (2y)(sin θ) = 2r cos²θ + 2r sin²θ = 2r. Also, z = r² directly, so ∂z/∂r = 2r. Correct!
- If you chose B: This is the unsubstituted form, which simplifies to 2r — not wrong but incomplete.
- If you chose C: Forgot the ∂x/∂r and ∂y/∂r factors; 2x + 2y is not equal to 2r in general.
- If you chose D: This doesn't match the chain rule result.
Q5: In a tree diagram for the chain rule, the derivative with respect to a terminal variable is computed by:
A) Following the single longest path
B) Summing over all paths: multiply derivatives along each branch, then add across paths
C) Multiplying derivatives along each branch and then multiplying across paths
D) Adding all partial derivatives at each level
Correct: B)
- If you chose B: The chain rule: ∂(output)/∂(terminal) = Σ (∂(output)/∂(intermediate) × ∂(intermediate)/∂(terminal)) over all intermediate variables. Multiply along each branch, sum across branches. Correct!
- If you chose A: All paths contribute; you can't pick just one.
- If you chose C: You add (not multiply) across different paths from the output to the terminal variable.
- If you chose D: You multiply partials along each branch, not add at each level.
Q6: What is the key distinction between dz/dt and ∂z/∂t in the chain rule context?
A) They are always interchangeable
B) dz/dt is the total rate of change of z through every dependency on t; ∂z/∂t is only the rate of change due to t appearing directly as an argument
C) dz/dt is for 2D, ∂z/∂t is for 3D
D) ∂z/∂t is always zero
Correct: B)
- If you chose B: If z = F(x, y, t) where x and y also depend on t, then dz/dt = (∂F/∂x)(dx/dt) + (∂F/∂y)(dy/dt) + ∂F/∂t. The ordinary derivative dz/dt captures the total rate of change through all dependencies; ∂F/∂t is only the direct partial. Correct!
- If you chose A: They are not interchangeable — the distinction matters when there are both direct and indirect dependencies on the same variable.
- If you chose C: The distinction is about the dependency structure (direct vs. indirect), not the dimension.
- If you chose D: ∂z/∂t can be nonzero if z depends directly on t.
Practice Problems
1. z = x²y, x = t², y = t + 1. Find dz/dt at t = 1.
2. z = e^{xy}, x = s cos t, y = s sin t. Find ∂z/∂s.
3. If u = f(x,y), x = r cos θ, y = r sin θ, show that (∂u/∂r)² + (1/r²)(∂u/∂θ)² = (∂u/∂x)² + (∂u/∂y)².
4. x³ + y³ + z³ + 6xyz = 1 defines z implicitly. Find ∂z/∂x at (x,y,z) = (1, 0, 0).
5. w = xyz, x = t, y = t², z = t³. Use the chain rule to find dw/dt.
6. z = sin(xy), x = s² + t², y = s² - t². Find ∂z/∂s.
Answers
**Problem 1:** ∂z/∂x = 2xy, ∂z/∂y = x². dx/dt = 2t, dy/dt = 1. At t=1: x=1, y=2. dz/dt = 2(1)(2)·2 + (1)²·1 = 8 + 1 = 9.
**Problem 2:** ∂z/∂x = ye^{xy}, ∂z/∂y = xe^{xy}. ∂x/∂s = cos t, ∂y/∂s = sin t. ∂z/∂s = ye^{xy}cos t + xe^{xy}sin t = e^{xy}(y cos t + x sin t). Substituting: y cos t + x sin t = s sin t cos t + s cos t sin t = s sin 2t, and xy = s² cos t sin t = (s² sin 2t)/2, so ∂z/∂s = s sin(2t) e^{(s² sin 2t)/2}.
**Problem 3:** ∂u/∂r = uₓ cos θ + u_y sin θ. ∂u/∂θ = -uₓ r sin θ + u_y r cos θ. (∂u/∂r)² + (1/r²)(∂u/∂θ)² = (uₓ cos θ + u_y sin θ)² + (1/r²)(-uₓ r sin θ + u_y r cos θ)² = uₓ²cos²θ + 2uₓu_y cosθ sinθ + u_y²sin²θ + uₓ²sin²θ - 2uₓu_y sinθ cosθ + u_y²cos²θ = uₓ²(cos²θ+sin²θ) + u_y²(sin²θ+cos²θ) = uₓ² + u_y². ✓
**Problem 4:** F = x³ + y³ + z³ + 6xyz - 1. Fₓ = 3x² + 6yz → at (1,0,0): 3. F_z = 3z² + 6xy → at (1,0,0): 0. ∂z/∂x = -3/0 → undefined at this point! (F_z = 0 here, so the hypothesis of the implicit function theorem fails and the formula does not apply.)
**Problem 5:** dw/dt = yz·1 + xz·2t + xy·3t². Substituting: t²·t³ + t·t³·2t + t·t²·3t² = t⁵ + 2t⁵ + 3t⁵ = 6t⁵. Direct: w = t·t²·t³ = t⁶. dw/dt = 6t⁵. ✓
**Problem 6:** ∂z/∂x = y cos(xy), ∂z/∂y = x cos(xy). ∂x/∂s = 2s, ∂y/∂s = 2s. ∂z/∂s = y cos(xy)·2s + x cos(xy)·2s = 2s cos(xy)(y + x). Substituting: 2s cos((s²+t²)(s²-t²))·(s²-t² + s²+t²) = 2s cos(s⁴ - t⁴)·2s² = 4s³ cos(s⁴ - t⁴).
Summary
Key takeaways:
- The chain rule sums contributions over all paths from the output to the input variable
- Tree diagrams organize dependencies: multiply along branches, sum across branches
- Case 1 (one parameter): dz/dt = fₓ·dx/dt + f_y·dy/dt (ordinary derivative)
- Case 2 (multiple parameters): ∂z/∂s = fₓ·∂x/∂s + f_y·∂y/∂s (partial derivatives)
- Implicit differentiation: ∂z/∂x = -Fₓ/F_z for F(x,y,z)=0; works because of the chain rule
- Always verify with direct substitution when possible — chain rule and direct differentiation MUST agree
Pitfalls
- Confusing ∂z/∂t with dz/dt. When z = f(x,y) and both x and y depend only on t, there is a single independent variable, so the derivative is dz/dt (ordinary), not ∂z/∂t. Reserve ∂ notation for when there are multiple independent variables. Using the wrong notation can lead to fundamental errors in the chain rule setup.
- Missing branches in the tree diagram. The chain rule sums over ALL paths from the output to the input variable. If z depends on x and y, and both depend on s and t, then ∂z/∂s has TWO terms: (∂z/∂x)(∂x/∂s) + (∂z/∂y)(∂y/∂s). Forgetting a branch — especially when variables appear both directly and indirectly — produces an incomplete derivative.
- Omitting the direct partial derivative when a variable appears explicitly. If z = F(x, y, t) where x = x(t) and y = y(t), then dz/dt = (∂F/∂x)(dx/dt) + (∂F/∂y)(dy/dt) + ∂F/∂t. The last term ∂F/∂t is the direct partial and is NOT multiplied by anything — it accounts for t's explicit appearance in F.
- Evaluating partial derivatives at the wrong point. In the chain rule expression dz/dt = fₓ(x,y)·dx/dt + f_y(x,y)·dy/dt, the partials fₓ and f_y must be evaluated at (x(t), y(t)), not at some fixed point. Leaving x and y unsubstituted gives an expression that still contains x and y rather than t alone.
- Relying on direct substitution as a crutch. Direct substitution is a good verification tool, but using it in place of the chain rule every time means you won't develop the skill needed when substitution is impractical or impossible, which is the norm in applications like backpropagation, PDE transformations, and implicit differentiation.