Chain Rule


We discuss and ultimately develop a rule, the so-called Chain Rule, that allows us to take derivatives of compositions of functions.

Video Lecture

This gives an alternative to the quotient rule, one that allows us to avoid memorizing the order of terms! You can see the supplemental video describing how this works here:

https://youtu.be/WICJy6woKx0

(Supplemental Videos are included via external link so you don’t have to watch them to earn credit.)

Text and Additional Details

Perhaps the most useful, and sometimes the most difficult to use, rule for derivatives is the chain rule. It allows us to tackle compositions of functions, an idea so general that its utility is difficult to overstate. But, like most things, the more versatile the tool, the harder it can be to recognize where, or when, to use it. For this reason we will have a separate segment on how to recognize when, how, and where to use the chain rule.

Function composition is something that can take so many forms that it is nearly impossible to tackle without going exceedingly general. As always, we will start by writing out the limit of the difference quotient that we want to understand before we sidestep to discuss how we will tackle the problems that arise with simplifying it.

Consider the functions $f(x)$ and $g(x)$, and suppose we want to understand the derivative of $f(g(x))$. Then we would need to evaluate the following limit:

$\lim \limits _{\Delta x\rightarrow 0} \frac {f(g(x + \Delta x)) - f(g(x))}{\Delta x}$

Now, without concrete functions for $f(x)$ or $g(x)$ , it seems impossible to figure out what to do with this; after all, there is no obvious route to try to generate some kind of bridge between terms like we did with the product rule, and there is no way to combine the two terms in the numerator in hopes of simplifying the problem like we did with the quotient rule... so what do we do?

Before we pursue a solution to our problem, it is helpful to rewrite our general difference quotient in a somewhat unusual way that will help us see what is happening in the next steps. Rather than the $\Delta x$ form that we wrote above, we will write a slight variation of the definition of the derivative to emphasize a pattern we want to recognize. Consider the following:

$f'(v) = \lim \limits _{u\rightarrow v} \frac {f(u) - f(v)}{u-v}$

What we want to observe here is that $u$ and $v$ are the inputs for the function in the numerator, and they correspond directly to the terms in the denominator. We want to think of this as “$f$ of thing 1 minus $f$ of thing 2, divided by thing 1 minus thing 2”. Importantly, we want to be a little more... open minded... about what “thing 1” and “thing 2” can be. We typically think of them as variables, like $x$ or $\Delta x$, but there’s no reason they can’t be functions. This is the conceptual leap we will need to get at the heart of taking derivatives of compositions.

Let’s return to our seemingly impenetrable problem. Like so many times before, the first step is to multiply by one “cleverly.”

$\displaystyle \lim \limits _{\Delta x\rightarrow 0} \frac {f(g(x + \Delta x)) - f(g(x))}{\Delta x}$	$\displaystyle =\lim \limits _{\Delta x\rightarrow 0} \left (\frac {f(g(x + \Delta x)) - f(g(x))}{\Delta x}\right )\left (\frac {g(x + \Delta x) - g(x)}{g(x + \Delta x) - g(x)}\right )$	Multiply by $1=\frac {g(x + \Delta x) - g(x)}{g(x + \Delta x) - g(x)}$ .
	$\displaystyle =\lim \limits _{\Delta x\rightarrow 0} \left (\frac {f(g(x + \Delta x)) - f(g(x))}{g(x + \Delta x) - g(x)}\right )\left (\frac {g(x + \Delta x) - g(x)}{\Delta x}\right )$	Switch denominators
	$\displaystyle = \left (\lim \limits _{u\rightarrow v}\frac {f(u) - f(v)}{u - v}\right )\left (\lim \limits _{\Delta x\rightarrow 0}\frac {g(x + \Delta x) - g(x)}{\Delta x}\right )$	Sub $u=g(x + \Delta x)$ , $v=g(x)$ .
		And split limit over product.
	$\displaystyle = \bigg (f'(v)\bigg )\bigg (g'(x)\bigg )$	Evaluate limit
	$\displaystyle = \bigg (f'(g(x))\bigg )\bigg (g'(x)\bigg )$	Sub $v=g(x)$ back.

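Here we have “multiplied by 1... cleverly” to bridge the gap between the first and second terms, but in a way that is not obvious. We actually needed the denominator to fill out our $u$ and $v$ substitution to be able to see the pattern and identify the derivative $f'(g(x))$. Note that $u \rightarrow v$ as $\Delta x \rightarrow 0$ because $g$ is differentiable, and therefore continuous. (Strictly speaking, this argument requires $g(x + \Delta x) \neq g(x)$ for small nonzero $\Delta x$, so that we are not dividing by zero; a fully rigorous proof handles the remaining case separately and arrives at the same result.) We record our result succinctly in the following theorem: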

Chain Rule. Let $f(x)$ and $g(x)$ be differentiable functions. Then $\frac {d}{dx} f(g(x)) = f'(g(x)) \cdot g'(x)$.

This is known as the chain rule, and is our rule for taking derivatives of the composition of functions. We will be using this rule, often many times, in almost every derivative we take once we start looking at applications of derivatives, so it is worth practicing and making sure you are very comfortable with the chain rule.
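As a quick sanity check on the formula, the sketch below compares the chain rule's prediction against a numerical estimate of the derivative. The particular functions $f(u) = \sin (u)$ and $g(x) = x^2 + 1$, the helper name `numerical_derivative`, and the step size are illustrative assumptions, not part of the text above.

```python
import math

def numerical_derivative(func, x, h=1e-6):
    """Central-difference approximation of func'(x) (hypothetical helper)."""
    return (func(x + h) - func(x - h)) / (2 * h)

# Illustrative choices (assumptions): f(u) = sin(u), g(x) = x^2 + 1
def f(u):
    return math.sin(u)

def f_prime(u):
    return math.cos(u)

def g(x):
    return x**2 + 1

def g_prime(x):
    return 2 * x

def composite(x):
    # The composition f(g(x))
    return f(g(x))

def chain_rule_derivative(x):
    # Chain rule: f'(g(x)) * g'(x)
    return f_prime(g(x)) * g_prime(x)

x0 = 0.7
approx = numerical_derivative(composite, x0)  # estimated from the difference quotient
exact = chain_rule_derivative(x0)             # predicted by the chain rule
print(approx, exact)
```

The two printed values should agree to several decimal places; the small remaining gap comes from the finite step size $h$, not from the chain rule itself.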

1 : Use the chain rule to compute the derivative of $f(x) = 2(x^2 + 3x + 1)^3$. Answer: $f'(x) = 6(x^2 + 3x + 1)^2(2x + 3)$.
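
One way to organize the computation in the exercise above is to name the pieces explicitly. The labels $h$ (outer function) and $g$ (inner function) are introduced here only for illustration, since the exercise already uses $f$ for the whole composition:

$h(u) = 2u^3$ and $g(x) = x^2 + 3x + 1$, so that $f(x) = h(g(x))$.

$h'(u) = 6u^2$ and $g'(x) = 2x + 3$.

$f'(x) = h'(g(x)) \cdot g'(x) = 6(x^2 + 3x + 1)^2 \cdot (2x + 3)$.

The outer derivative is evaluated at the inner function $g(x)$, not at $x$; forgetting this substitution is a common slip.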

At this point, you may be thinking that this seems like a fairly bizarre rule. After all, composition of limits was relatively straightforward, and this is clearly more complicated; obviously something deeper is going on.

It’s true that the chain rule is remarkably unintuitive from the viewpoint of limit laws. That may make the following all the more surprising: there is a deceptively intuitive way to understand the chain rule from the perspective of rate of change! In essence, the chain rule shows us that rates of change are magnified by the composition process; the inner function changes its input at some rate, and then the outer function acts on the inner function’s output as if it were just an ordinary input. So the rate of change is multiplicative: the rate of change of the outer function magnifies (multiplies) the rate of change of the inner function, giving us the chain rule.
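
A minimal illustration of this multiplicative behavior uses two linear functions, chosen here purely as an example (they do not appear in the text above):

$g(x) = 3x$ triples its input, so $g'(x) = 3$, and $f(u) = 2u$ doubles its input, so $f'(u) = 2$.

$f(g(x)) = 2(3x) = 6x$, so $\frac {d}{dx} f(g(x)) = 6 = 2 \cdot 3 = f'(g(x)) \cdot g'(x)$.

A small change in $x$ is first stretched by a factor of $3$, and that stretched change is then stretched again by a factor of $2$, for a total factor of $6$. For nonlinear functions the same picture holds locally, with the stretching factors given by the derivatives at the relevant points.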