Mit 18.022 12

The document summarizes the chain rule from calculus. It states that if functions f and g are differentiable, their composition g o f is also differentiable, and its derivative is equal to the derivative of g multiplied by the derivative of f. This is demonstrated through examples of specific functions composed together. The chain rule can be used to prove simple rules for derivatives of sums, products, and other operations. A proof of the general chain rule formula is also provided using definitions of differentiability and properties of the Frobenius norm.

Uploaded by

nislam57
Copyright
© All Rights Reserved
The chain rule

Based on lecture notes by James McKernan


Theorem 1 (Chain Rule). Let $U \subset \mathbb{R}^n$ and let $V \subset \mathbb{R}^m$ be two open subsets.
Let $f : U \to V$ and $g : V \to \mathbb{R}^p$ be two functions. If $f$ is differentiable at $p$ and
$g$ is differentiable at $q = f(p)$, then $g \circ f : U \to \mathbb{R}^p$ is differentiable at $p$, with
derivative:
\[ D(g \circ f)(p) = (D(g)(q))(D(f)(p)). \]
It is interesting to untwist this result in specific cases. Suppose we are given
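\[ f : \mathbb{R} \to \mathbb{R}^2 \quad\text{and}\quad g : \mathbb{R}^2 \to \mathbb{R}. \]
So $f(x) = (f_1(x), f_2(x))$ and $g = g(y, z)$. Then
\[
Df(p) = \begin{pmatrix} \dfrac{df_1}{dx}(p) \\[2ex] \dfrac{df_2}{dx}(p) \end{pmatrix}
\quad\text{and}\quad
Dg(q) = \left( \frac{\partial g}{\partial y}(q), \frac{\partial g}{\partial z}(q) \right).
\]
So
\[
\frac{d(g \circ f)}{dx} = D(g \circ f)(p) = Dg(q)Df(p)
= \frac{\partial g}{\partial y}(q)\frac{df_1}{dx}(p) + \frac{\partial g}{\partial z}(q)\frac{df_2}{dx}(p).
\]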
Example 2. Suppose that $f(x) = (x^2, x^3)$ and $g(y, z) = yz$. If we apply the chain
rule, with $y = x^2$ and $z = x^3$, we get
\[ D(g \circ f)(x) = z(2x) + y(3x^2) = x^3(2x) + x^2(3x^2) = 5x^4. \]
On the other hand $(g \circ f)(x) = x^5$, and of course
\[ \frac{dx^5}{dx} = 5x^4. \]
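Example 2 can also be checked numerically. The following is a minimal sketch, not part of the original notes: the helper names (`chain_rule_derivative`, `central_difference`) and the sample point `x0 = 1.5` are chosen here for illustration. It compares the product $Dg(q)Df(p)$ with the closed form $5x^4$ and with a finite-difference estimate of the derivative of $g \circ f$.

```python
def f(x):
    """f : R -> R^2, f(x) = (x^2, x^3)."""
    return (x**2, x**3)


def g(y, z):
    """g : R^2 -> R, g(y, z) = y * z."""
    return y * z


def chain_rule_derivative(x):
    """Dg(q) Df(p) with q = f(p): a 1x2 row times a 2x1 column."""
    y, z = f(x)
    dg = (z, y)              # (dg/dy, dg/dz) evaluated at q = f(x)
    df = (2 * x, 3 * x**2)   # (df1/dx, df2/dx) evaluated at x
    return dg[0] * df[0] + dg[1] * df[1]


def central_difference(func, x, h=1e-6):
    """Symmetric finite-difference estimate of func'(x)."""
    return (func(x + h) - func(x - h)) / (2 * h)


x0 = 1.5  # illustrative sample point (an assumption, not from the notes)
exact = 5 * x0**4
via_chain_rule = chain_rule_derivative(x0)
via_difference = central_difference(lambda x: g(*f(x)), x0)
print(exact, via_chain_rule, via_difference)
```

The first two numbers agree up to rounding, and the finite-difference estimate matches them to several digits.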
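In general, if $f = f(x_1, \dots, x_n)$ and $g = g(y_1, \dots, y_m)$ then the $(i, k)$ entry of
$D(g \circ f)(p)$, that is
\[ \frac{\partial (g \circ f)_i}{\partial x_k}, \]
is given by the dot product of the $i$th row of $Dg(q)$ and the $k$th column of $Df(p)$,
\[
\frac{\partial (g \circ f)_i}{\partial x_k}
= \sum_{j=1}^{m} \frac{\partial g_i}{\partial y_j}(q) \frac{\partial f_j}{\partial x_k}(p).
\]
If $y = f(x)$ and $z = g(y)$ then we get
\[
\frac{\partial z_i}{\partial x_k}
= \sum_{j=1}^{m} \frac{\partial z_i}{\partial y_j} \frac{\partial y_j}{\partial x_k}.
\]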
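The entrywise formula is just a matrix product, which the sketch below spells out. The functions `f` and `g` here are hypothetical examples (not from the notes) with $n = 2$, $m = 3$ and target dimension $2$, and their Jacobians are written out by hand. The result is compared against a finite-difference Jacobian of the composition.

```python
def f(x):
    """Hypothetical f : R^2 -> R^3."""
    x1, x2 = x
    return [x1 * x2, x1 + x2, x1**2]


def Df(x):
    """Hand-computed 3x2 Jacobian of f."""
    x1, x2 = x
    return [[x2, x1], [1.0, 1.0], [2 * x1, 0.0]]


def g(y):
    """Hypothetical g : R^3 -> R^2."""
    y1, y2, y3 = y
    return [y1 + y2 * y3, y1 * y3]


def Dg(y):
    """Hand-computed 2x3 Jacobian of g."""
    y1, y2, y3 = y
    return [[1.0, y3, y2], [y3, 0.0, y1]]


def chain_rule_jacobian(p):
    """Entry (i, k) is the sum over j of dg_i/dy_j(q) * df_j/dx_k(p), q = f(p)."""
    A, B = Dg(f(p)), Df(p)
    return [[sum(A[i][j] * B[j][k] for j in range(len(B)))
             for k in range(len(B[0]))]
            for i in range(len(A))]


def numerical_jacobian(func, p, h=1e-6):
    """Central-difference Jacobian of func at p, returned as a list of rows."""
    columns = []
    for k in range(len(p)):
        plus, minus = list(p), list(p)
        plus[k] += h
        minus[k] -= h
        columns.append([(a - b) / (2 * h)
                        for a, b in zip(func(plus), func(minus))])
    return [[col[i] for col in columns] for i in range(len(columns[0]))]


p = [1.0, 2.0]
by_chain_rule = chain_rule_jacobian(p)
by_differences = numerical_jacobian(lambda x: g(f(x)), p)
print(by_chain_rule)
print(by_differences)
```

At `p = [1.0, 2.0]` the chain-rule product is the matrix with rows `[9, 2]` and `[6, 1]`, which can be confirmed by differentiating $g \circ f$ directly.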
We can use the chain rule to prove some of the simple rules for derivatives.
Suppose that we have
\[ f : \mathbb{R}^n \to \mathbb{R}^m \quad\text{and}\quad g : \mathbb{R}^n \to \mathbb{R}^m. \]
Suppose that $f$ and $g$ are differentiable at $p$. What about $f + g$? Well, there is
a function
\[ a : \mathbb{R}^{2m} \to \mathbb{R}^m, \]
which sends $(\vec{u}, \vec{v}) \in \mathbb{R}^m \times \mathbb{R}^m$ to the sum $\vec{u} + \vec{v}$. In coordinates $(u_1, u_2, \dots, u_m, v_1, v_2, \dots, v_m)$,
\[ a(u_1, u_2, \dots, u_m, v_1, v_2, \dots, v_m) = (u_1 + v_1, u_2 + v_2, \dots, u_m + v_m). \]

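Now $a$ is differentiable (it is a polynomial, linear even). There is a function
\[ h : \mathbb{R}^n \to \mathbb{R}^{2m}, \]
which sends $q$ to $(f(q), g(q))$. The composition $a \circ h : \mathbb{R}^n \to \mathbb{R}^m$ is the function
we want to differentiate; it sends $p$ to $f(p) + g(p)$. Since $Da$ is the $m \times 2m$
block matrix $(I_m \; I_m)$ and $Dh(p)$ is $Df(p)$ stacked on top of $Dg(p)$, the chain rule says that
the function is differentiable at $p$ and
\[ D(f + g)(p) = Df(p) + Dg(p). \]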
Now suppose that $m = 1$. Instead of $a$, consider the function
\[ m : \mathbb{R}^2 \to \mathbb{R}, \]
given by $m(x, y) = xy$. Then $m$ is differentiable, with derivative
\[ Dm(x, y) = (y, x). \]
So the chain rule says the composition $m \circ h$, namely the function which sends
$p$ to the product $f(p)g(p)$, is differentiable and the derivative satisfies the usual rule
\[ D(fg)(p) = g(p)D(f)(p) + f(p)D(g)(p). \]
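To see the sum and product rules come out of the chain rule mechanically, here is a small sketch with $n = 2$ and target dimension $1$. The scalar functions `f` and `g` and their gradients are hypothetical examples chosen for illustration. The code forms $Dh(p)$ by stacking the two gradients, then multiplies on the left by the derivative of the addition map, $(1, 1)$, or of the multiplication map, $(g(p), f(p))$.

```python
def f(p):
    """Hypothetical scalar function f(x, y) = x^2 * y."""
    x, y = p
    return x * x * y


def grad_f(p):
    x, y = p
    return [2 * x * y, x * x]


def g(p):
    """Hypothetical scalar function g(x, y) = x + y^3."""
    x, y = p
    return x + y**3


def grad_g(p):
    x, y = p
    return [1.0, 3 * y * y]


def row_times_matrix(row, matrix):
    """Multiply a 1xr row vector by an rxc matrix given as a list of rows."""
    return [sum(row[j] * matrix[j][k] for j in range(len(row)))
            for k in range(len(matrix[0]))]


def sum_rule(p):
    """D(a o h)(p): derivative (1, 1) of addition times Dh(p)."""
    dh = [grad_f(p), grad_g(p)]   # rows of Dh(p), where h = (f, g)
    return row_times_matrix([1.0, 1.0], dh)


def product_rule(p):
    """D(m o h)(p): Dm(u, v) = (v, u) at (u, v) = h(p), times Dh(p)."""
    dh = [grad_f(p), grad_g(p)]
    return row_times_matrix([g(p), f(p)], dh)


p = [2.0, 1.0]
print(sum_rule(p))      # gradient of f + g at p
print(product_rule(p))  # gradient of f * g at p
```

For this choice, $fg = x^3 y + x^2 y^4$, whose gradient at $(2, 1)$ is $(16, 24)$, in agreement with `product_rule(p)`.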
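Here is another example of the chain rule. Suppose
\[ x = r \cos\theta, \qquad y = r \sin\theta. \]
Then
\begin{align*}
\frac{\partial f}{\partial r}
&= \frac{\partial f}{\partial x}\frac{\partial x}{\partial r} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial r} \\
&= \frac{\partial f}{\partial x}\cos\theta + \frac{\partial f}{\partial y}\sin\theta.
\end{align*}
Similarly,
\begin{align*}
\frac{\partial f}{\partial \theta}
&= \frac{\partial f}{\partial x}\frac{\partial x}{\partial \theta} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial \theta} \\
&= -\frac{\partial f}{\partial x} r\sin\theta + \frac{\partial f}{\partial y} r\cos\theta.
\end{align*}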
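We can rewrite this as
\[
\begin{pmatrix} \dfrac{\partial}{\partial r} \\[2ex] \dfrac{\partial}{\partial \theta} \end{pmatrix}
=
\begin{pmatrix} \cos\theta & \sin\theta \\ -r\sin\theta & r\cos\theta \end{pmatrix}
\begin{pmatrix} \dfrac{\partial}{\partial x} \\[2ex] \dfrac{\partial}{\partial y} \end{pmatrix}.
\]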
Now the determinant of
\[ \begin{pmatrix} \cos\theta & \sin\theta \\ -r\sin\theta & r\cos\theta \end{pmatrix} \]
is
\[ r(\cos^2\theta + \sin^2\theta) = r. \]
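So if $r \neq 0$, then we can invert the matrix above and we get
\[
\begin{pmatrix} \dfrac{\partial}{\partial x} \\[2ex] \dfrac{\partial}{\partial y} \end{pmatrix}
=
\frac{1}{r}
\begin{pmatrix} r\cos\theta & -\sin\theta \\ r\sin\theta & \cos\theta \end{pmatrix}
\begin{pmatrix} \dfrac{\partial}{\partial r} \\[2ex] \dfrac{\partial}{\partial \theta} \end{pmatrix}.
\]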
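The polar-coordinate formulas can be checked as a round trip. This is a sketch under stated assumptions: the test function $f(x, y) = x^2 + y$ and the names `polar_partials` and `cartesian_from_polar` are introduced here for illustration. The code computes $\partial f/\partial r$ and $\partial f/\partial\theta$ from the Cartesian partials, then applies the inverted matrix to recover $\partial f/\partial x$ and $\partial f/\partial y$.

```python
import math


def f_x(x, y):
    """Partial derivative in x of the hypothetical f(x, y) = x^2 + y."""
    return 2 * x


def f_y(x, y):
    """Partial derivative in y of the hypothetical f(x, y) = x^2 + y."""
    return 1.0


def polar_partials(r, theta):
    """(df/dr, df/dtheta) from the chain rule with x = r cos, y = r sin."""
    c, s = math.cos(theta), math.sin(theta)
    x, y = r * c, r * s
    f_r = f_x(x, y) * c + f_y(x, y) * s
    f_theta = -r * s * f_x(x, y) + r * c * f_y(x, y)
    return f_r, f_theta


def cartesian_from_polar(r, theta, f_r, f_theta):
    """Apply (1/r) [[r cos, -sin], [r sin, cos]] to (f_r, f_theta)."""
    if r == 0:
        raise ValueError("the matrix is not invertible at r = 0")
    c, s = math.cos(theta), math.sin(theta)
    return ((r * c * f_r - s * f_theta) / r,
            (r * s * f_r + c * f_theta) / r)


r, theta = 2.0, 0.7  # illustrative sample point
f_r, f_theta = polar_partials(r, theta)
recovered = cartesian_from_polar(r, theta, f_r, f_theta)
direct = (f_x(r * math.cos(theta), r * math.sin(theta)), 1.0)
print(recovered, direct)
```

The guard on `r == 0` mirrors the condition $r \neq 0$ needed to invert the matrix.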

We now turn to a proof of the chain rule. We will need:

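Lemma 3. Let $A \subset \mathbb{R}^n$ be an open subset and let $f : A \to \mathbb{R}^m$ be a function.
If $f$ is differentiable at $p$, then there is a $\delta > 0$ such that if $0 < \|q - p\| < \delta$, then
\[ \|f(q) - f(p)\| < (K + 1)\|q - p\|, \]
where $K$ is the Frobenius norm of $Df(p)$.
Proof. As $f$ is differentiable at $p$, there is a constant $\delta > 0$ such that if $0 < \|q - p\| < \delta$,
then
\[ \frac{\|f(q) - f(p) - Df(p)(q - p)\|}{\|q - p\|} < 1. \]
Hence
\[ \|f(q) - f(p) - Df(p)(q - p)\| < \|q - p\|. \]
But then, using the triangle inequality and the bound $\|Df(p)v\| \leq K\|v\|$,
\begin{align*}
\|f(q) - f(p)\| &= \|f(q) - f(p) - Df(p)(q - p) + Df(p)(q - p)\| \\
&\leq \|f(q) - f(p) - Df(p)(q - p)\| + \|Df(p)(q - p)\| \\
&< \|q - p\| + K\|q - p\| \\
&= (K + 1)\|q - p\|.
\end{align*}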
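The bound in Lemma 3 can be spot-checked on a concrete map. The sketch below assumes a hypothetical $f(x, y) = (x^2, xy)$ and base point $p = (1, 2)$, where $Df(p)$ has Frobenius norm $K = 3$. It samples points $q$ on small circles around $p$ and tests the inequality at each one. A finite sample like this only illustrates the lemma; it does not prove it.

```python
import math


def f(p):
    """Hypothetical f : R^2 -> R^2, f(x, y) = (x^2, x*y)."""
    x, y = p
    return (x * x, x * y)


def Df(p):
    """Hand-computed Jacobian of f."""
    x, y = p
    return ((2 * x, 0.0), (y, x))


def frobenius_norm(matrix):
    """Square root of the sum of squared entries."""
    return math.sqrt(sum(entry * entry for row in matrix for entry in row))


def norm(v):
    """Euclidean norm of a vector."""
    return math.sqrt(sum(c * c for c in v))


def bound_holds(p, q):
    """Check ||f(q) - f(p)|| < (K + 1) ||q - p|| with K = ||Df(p)||_F."""
    K = frobenius_norm(Df(p))
    lhs = norm([a - b for a, b in zip(f(q), f(p))])
    rhs = (K + 1) * norm([a - b for a, b in zip(q, p)])
    return lhs < rhs


p = (1.0, 2.0)
checks = []
for radius in (0.5, 0.1, 0.001):
    for k in range(16):
        angle = 2 * math.pi * k / 16
        q = (p[0] + radius * math.cos(angle), p[1] + radius * math.sin(angle))
        checks.append(bound_holds(p, q))
print(all(checks))
```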

Proof of Theorem 1. Let's fix some notation. We want the derivative at $p$. Let $q = f(p)$.
Let $p'$ be a point in $U$ (which we imagine is close to $p$). Finally, let $q' = f(p')$ (so
if $p'$ is close to $p$, then we expect $q'$ to be close to $q$).
The trick is to carefully define an auxiliary function $G : V \to \mathbb{R}^p$,
\[
G(q') =
\begin{cases}
\dfrac{g(q') - g(q) - Dg(q)(q' - q)}{\|q' - q\|} & \text{if } q' \neq q \\[2ex]
\vec{0} & \text{if } q' = q.
\end{cases}
\]
Then $G$ is continuous at $q = f(p)$, as $g$ is differentiable at $q$. Now,
\begin{align*}
&\frac{(g \circ f)(p') - (g \circ f)(p) - Dg(q)Df(p)(p' - p)}{\|p' - p\|} \\
&\quad= \frac{Dg(q)(f(p') - f(p)) - Dg(q)(q' - q) + g(q') - g(q) - Dg(q)Df(p)(p' - p)}{\|p' - p\|} \\
&\quad= Dg(q)\,\frac{f(p') - f(p) - Df(p)(p' - p)}{\|p' - p\|} + \frac{g(q') - g(q) - Dg(q)(q' - q)}{\|p' - p\|} \\
&\quad= Dg(q)\,\frac{f(p') - f(p) - Df(p)(p' - p)}{\|p' - p\|} + G(f(p'))\,\frac{\|f(p') - f(p)\|}{\|p' - p\|}.
\end{align*}
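As $p'$ approaches $p$, note that
\[ \frac{f(p') - f(p) - Df(p)(p' - p)}{\|p' - p\|} \]
and $G(f(p'))$ both approach zero (the latter because $f$ is continuous at $p$ and $G$ is
continuous at $q$ with $G(q) = \vec{0}$), and, by Lemma 3,
\[ \frac{\|f(p') - f(p)\|}{\|p' - p\|} \leq K + 1. \]
So then
\[ \frac{(g \circ f)(p') - (g \circ f)(p) - Dg(q)Df(p)(p' - p)}{\|p' - p\|} \]
approaches zero as well, which is what we want.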
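The quotient that the proof drives to zero can be watched shrinking on a concrete case. The following sketch reuses the functions of Example 2, $f(x) = (x^2, x^3)$ and $g(y, z) = yz$, at the base point $p = 1$. The base point and step sizes are choices made here for illustration.

```python
def composite(x):
    """(g o f)(x) for f(x) = (x^2, x^3) and g(y, z) = y * z."""
    y, z = x**2, x**3
    return y * z


def difference_quotient(p, step):
    """|(g o f)(p') - (g o f)(p) - Dg(q)Df(p)(p' - p)| / |p' - p|, p' = p + step."""
    derivative = 5 * p**4  # Dg(q)Df(p), as computed in Example 2
    p_prime = p + step
    remainder = composite(p_prime) - composite(p) - derivative * (p_prime - p)
    return abs(remainder) / abs(p_prime - p)


p = 1.0
steps = [10.0 ** (-k) for k in range(1, 6)]
quotients = [difference_quotient(p, s) for s in steps]
for s, quo in zip(steps, quotients):
    print(f"step={s:.0e}  quotient={quo:.3e}")
```

Each tenfold reduction of the step reduces the quotient by roughly a factor of ten, since the remainder is of second order in $p' - p$.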
