Calc 3 Book
CORRAL’S
VECTOR
CALCULUS
Michael Corral
and Anton Petrunin
Corral’s Vector Calculus
This text was typeset in LATEX 2ε with the KOMA-Script bundle, using the GNU Emacs text
editor on a Fedora Linux system. The graphics were created using MetaPost, PGF, and
Gnuplot.
Contents
Preface
2 Curves
2.1 Vector-Valued Functions
2.2 Arc Length
2.3 Curvature
Bibliography
History
Index
1 Vectors in Euclidean Space
1.1 Introduction
In single-variable calculus, the functions that one encounters are functions of a variable
(usually x or t) that varies over some subset of the real number line (which we denote by R).
For such a function, say, y = f ( x), the graph of the function f consists of the points ( x, y) =
( x, f ( x)). These points lie in the Euclidean plane, which, in the Cartesian or rectangular
coordinate system, consists of all ordered pairs of real numbers (a, b). We use the word
“Euclidean” to denote a system in which all the usual rules of Euclidean geometry hold. We
denote the Euclidean plane by R2 ; the “2” represents the number of dimensions of the plane.
The Euclidean plane has two perpendicular coordinate axes: the x-axis and the y-axis.
In vector (or multivariable) calculus, we will deal with functions of two or three variables
(usually x, y or x, y, z, respectively). The graph of a function of two variables, say, z = f ( x, y),
lies in Euclidean space, which in the Cartesian coordinate system consists of all ordered
triples of real numbers (a, b, c). Since Euclidean space is 3-dimensional, we denote it by R3 .
The graph of f consists of the points ( x, y, z) = ( x, y, f ( x, y)). The 3-dimensional coordinate
system of Euclidean space can be represented on a flat surface, such as this page or a black-
board, only by giving the illusion of three dimensions, in the manner shown in Figure 1.1.1.
Euclidean space has three mutually perpendicular coordinate axes ( x, y and z), and three
mutually perpendicular coordinate planes: the xy-plane, yz-plane and xz-plane (see Figure
1.1.2).
Figure 1.1.1 (the point P(a, b, c))    Figure 1.1.2 (the xy-plane, yz-plane and xz-plane)
2 CHAPTER 1. VECTORS IN EUCLIDEAN SPACE
direction of the x-axis, the middle finger in the positive direction of the y-axis, and the thumb
in the positive direction of the z-axis, as in Figure 1.1.3.
An equivalent way of defining a right-handed system is if you can point your thumb up-
wards in the positive z-axis direction while using the remaining four fingers to rotate the
x-axis towards the y-axis. Doing the same thing with the left hand is what defines a left-
handed coordinate system. Notice that switching the x- and y-axes in a right-handed
system results in a left-handed system, and that rotating either type of system does not
change its “handedness”. Throughout the book we will use a right-handed system.
For functions of three variables, the graphs exist in 4-dimensional space (R4 ), which we
cannot see in our 3-dimensional space, let alone simulate in 2-dimensional space. So we can
only think of 4-dimensional space abstractly. For an entertaining discussion of this subject,
see the book by Abbott.1
1 One thing you will learn is why a 4-dimensional creature would be able to reach inside an egg and remove the
So far, we have discussed the position of an object in 2-dimensional or 3-dimensional space.
But what about something such as the velocity of the object, or its acceleration? Or the
gravitational force acting on the object? These phenomena all seem to involve motion and
direction in some way. This is where the idea of a vector comes in.
You have already dealt with velocity and acceleration in single-variable calculus. For
example, for motion along a straight line, if y = f(t) gives the displacement of an object after
time t, then dy/dt = f′(t) is the velocity of the object at time t. The derivative f′(t) is just a
number, which is positive if the object is moving in an agreed-upon “positive” direction, and
negative if it moves in the opposite of that direction. So you can think of that number, which
was called the velocity of the object, as having two components: a magnitude, indicated
by a nonnegative number, preceded by a direction, indicated by a plus or minus symbol
(representing motion in the positive direction or the negative direction, respectively); that
is, f ′ ( t) = ±a for some number a ≥ 0. Then a is the magnitude of the velocity (normally called
the speed of the object), and the ± represents the direction of the velocity (though the + is
usually omitted for the positive direction).
For motion along a straight line (which is a 1-dimensional space) the velocities are also
contained in that 1-dimensional space, since they are just numbers. For general motion
along a curve in 2- or 3-dimensional space, however, velocity will need to be represented by
a multidimensional object which should have both a magnitude and a direction. A geomet-
ric object which has those features is an arrow, which in elementary geometry is called a
“directed line segment”. This is the motivation for how we will define a vector.
Definition 1.1. A (nonzero) vector is a directed line segment drawn from a point P (called
its initial point) to a point Q (called its terminal point), with P and Q being distinct
points. The vector is denoted by $\overrightarrow{PQ}$. Its magnitude is the length of the line segment,
denoted by $\|\overrightarrow{PQ}\|$, and its direction is the same as that of the directed line segment. The
zero vector is just a point, and it is denoted by 0.
To indicate the direction of a vector, we draw an arrow from its initial point to its terminal
point. We will often denote a vector by a single bold-faced letter (for instance, v) and use
the terms “magnitude” and “length” interchangeably. Note that our definition could apply to
systems with any number of dimensions (see Figure 1.1.4 (a)–(c)).
Figure 1.1.4 (a) One dimension (b) Two dimensions (c) Three dimensions
A few things need to be noted about the zero vector. Our motivation for what a vector is
included the notions of magnitude and direction. What is the magnitude of the zero vector?
We define it to be zero; that is, $\|0\| = 0$. This agrees with the definition of the zero vector as
just a point, which has zero length. What about the direction of the zero vector? A single
point really has no well-defined direction. Notice that we were careful to only define the
direction of a nonzero vector, which is well-defined since the initial and terminal points are
distinct. Not everyone agrees on the direction of the zero vector. Some contend that the
zero vector has arbitrary direction, some say that it has indeterminate direction (that is, the
direction cannot be determined), while others say that it has no direction. Our definition of
the zero vector, however, does not require it to have a direction, and we will leave it at that.2
Now that we know what a vector is, we need a way of determining when two vectors are
equal. This leads us to the following definition.
Definition 1.2. Two nonzero vectors are equal if they have the same magnitude and the
same direction. Any vector with zero magnitude is equal to the zero vector.
By this definition, vectors with the same magnitude and direction but with different initial
points would be equal. For example, in Figure 1.1.5 the vectors u, v and w all have the same
magnitude $\sqrt{5}$ (by the Pythagorean Theorem). And we see that u and w are parallel, since
they lie on lines having the same slope $\frac{1}{2}$, and they point in the same direction. So u = w,
even though they have different initial points. We also see that v is parallel to u but points
in the opposite direction. So u ≠ v.
Figure 1.1.5
So we can see that there are an infinite number of vectors for a given magnitude and
direction, those vectors all being equal and differing only by their initial and terminal points.
Is there a single vector which we can choose to represent all those equal vectors? The answer
is yes, and is suggested by the vector w in Figure 1.1.5.
Unless otherwise indicated, when speaking of “the vector” with a given magnitude and
direction, we will mean the one whose initial point is at the origin of the coordinate
system.
Thinking of vectors as starting from the origin provides a way of dealing with vectors in
a standard way, since every coordinate system has an origin. But there will be times when
it is convenient to consider a different initial point for a vector (for example, when adding
vectors, which we will do in the next section).
2 In the subject of linear algebra there is a more abstract way of defining a vector where the concept of “direction”
Another advantage of using the origin as the initial point is that it provides a natural
correspondence between a vector and its terminal point.
Example 1.1. Let v be the vector in R3 whose initial point is at the origin and whose ter-
minal point is (3, 4, 5). Though the point (3, 4, 5) and the vector v are different objects, it is
convenient to write v = (3, 4, 5). When doing this, it is understood that the initial point of v
is at the origin (0, 0, 0) and the terminal point is (3, 4, 5).
(a) The point (3,4,5) (b) The vector (3,4,5)
So $\overrightarrow{PQ} = v = (1, 4, 2)$ and $\overrightarrow{RS} = w = (1, 4, 2)$.
$\therefore \overrightarrow{PQ} = \overrightarrow{RS}$
Figure 1.1.7 Translating $\overrightarrow{PQ}$ to v and $\overrightarrow{RS}$ to w, where P = (2, 1, 5), Q = (3, 5, 7), R = (1, −3, −2), S = (2, 1, 0), and v = w = (1, 4, 2)
Finding the magnitude of a vector v = (a, b) in R2 is a special case of formula (1.2) with
P = (0, 0) and Q = (a, b) :
4 + 9 + 49 = 62.
(d) The magnitude of the vector v = (5, 8, −2) in R3.
Solution: By formula (1.5), $\|v\| = \sqrt{5^2 + 8^2 + (-2)^2} = \sqrt{25 + 64 + 4} = \sqrt{93}$.
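The magnitude computation above can be sketched in a few lines of Python. This is only an illustration, assuming a vector is stored as a tuple of coordinates; the function name `magnitude` is a hypothetical helper, not notation from the text.

```python
import math

def magnitude(v):
    """Length of a vector given as a tuple of coordinates (works in R2 and R3)."""
    # Square each coordinate, add the squares, and take the square root.
    return math.sqrt(sum(c * c for c in v))

print(magnitude((5, 8, -2)))  # the square root of 93, as in part (d)
print(magnitude((3, 4)))      # 5.0
```

The same function covers vectors in R2 and R3 because it only sums squares of whatever coordinates it is given.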
Exercises
A
1. Calculate the magnitudes of the following vectors:
(a) v = (2, −1); (b) v = (2, −1, 0); (c) v = (3, 2, −2); (d) v = (0, 0, 1); (e) v = (6, 4, −4).
2. For the points P = (1, −1, 1), Q = (2, −2, 2), R = (2, 0, 1), S = (3, −1, 2), does $\overrightarrow{PQ} = \overrightarrow{RS}$?
3. For the points P = (0, 0, 0), Q = (1, 3, 2), R = (1, 0, 1), S = (2, 3, 4), does $\overrightarrow{PQ} = \overrightarrow{RS}$?
B
4. Let v = (1, 0, 0) and w = (a, 0, 0) be vectors in R3. Show that $\|w\| = |a|\,\|v\|$.
C
6. Though we will see a simple proof of Theorem 1.1 in the next section, it is possible to prove it using
methods similar to those in the proof of Theorem 1.2. Prove the special case of Theorem 1.1 where the
points P = (x1, y1, z1) and Q = (x2, y2, z2) satisfy the following conditions:
x2 > x1 > 0, y2 > y1 > 0, and z2 > z1 > 0.
(Hint: Think of Case 4 in the proof of Theorem 1.2, and consider Figure 1.1.9.)
Figure 1.1.9 The points P(x1, y1, z1), Q(x2, y2, z2), R(x2, y2, z1), S(x1, y1, 0), T(x2, y2, 0), U(x2, y1, 0)
1.2 Vector Algebra
Two vectors v and w are parallel (denoted by v ∥ w) if one is a scalar multiple of the other.
You can think of scalar multiplication of a vector as stretching or shrinking the vector, and
as flipping the vector in the opposite direction if the scalar is a negative number (see Figure
1.2.1).
Figure 1.2.1 The vectors v, 2v, 3v, 0.5v, −v, −2v
Recall that translating a nonzero vector means that the initial point of the vector is
changed but the magnitude and direction are preserved. We are now ready to define the
sum of two vectors.
Definition 1.5. The sum of vectors v and w, denoted by v + w, is obtained by translating
w so that its initial point is at the terminal point of v; the initial point of v + w is the initial
point of v, and its terminal point is the new terminal point of w.
3 The term scalar was invented by 19th century Irish mathematician, physicist and astronomer William Rowan
Hamilton, to convey the sense of something that could be represented by a point on a scale or graduated ruler.
The word vector comes from Latin, where it means “carrier”.
4 An alternate definition of scalars and vectors, used in physics, is that under certain types of coordinate trans-
formations (for example rotations), a quantity that is not affected is a scalar, while a quantity that is affected
(in a certain way) is a vector. See Marion for details.
(a) Vectors v and w (b) Translate w to the end of v (c) The sum v + w
Notice that our definition is valid for the zero vector (which is just a point, and hence can
be translated), and so we see that v + 0 = v = 0 + v for any vector v. In particular, 0 + 0 = 0.
Also, it is easy to see that v + (−v) = 0, as we would expect. In general, since the scalar
multiple −v = −1 v is a well-defined vector, we can define vector subtraction as follows:
v − w = v + (−w). See Figure 1.2.3.
Figure 1.2.3 (a) Vectors v and w (b) Translate −w to the end of v (c) The difference v − w
Figure 1.2.4 shows the use of “geometric proofs” of various laws of vector algebra, that is,
it uses laws from elementary geometry to prove statements about vectors. For example, (a)
shows that v + w = w + v for any vectors v, w. And (c) shows how you can think of v − w as
the vector that is tacked on to the end of w to add up to v.
Figure 1.2.4 (a) Add vectors (b) Subtract vectors (c) Combined add/subtract
Notice that we have temporarily abandoned the practice of starting vectors at the origin.
In fact, we have not even mentioned coordinates in this section so far. Since we will deal
mostly with Cartesian coordinates in this book, the following two theorems are useful for
performing vector algebra on vectors in R2 and R3 starting at the origin.
Theorem 1.3. Let v = (v1 , v2 ), w = (w1 , w2 ) be vectors in R2 , and let k be a scalar. Then
(a) kv = ( kv1 , kv2 );
(b) v + w = (v1 + w1 , v2 + w2 ).
Proof: (a) Without loss of generality, we assume that v1, v2 > 0 (the other possibilities are
handled in a similar manner). If k = 0 then kv = 0v = 0 = (0, 0) = (0v1, 0v2) = (kv1, kv2), which
is what we needed to show. If k ≠ 0, then (kv1, kv2) lies on a line with slope $\frac{k v_2}{k v_1} = \frac{v_2}{v_1}$, which
is the same as the slope of the line on which v (and hence kv) lies, and (kv1, kv2) points in
the same direction on that line as kv. Also, by formula (1.3) the magnitude of (kv1, kv2) is
$\sqrt{(k v_1)^2 + (k v_2)^2} = \sqrt{k^2 v_1^2 + k^2 v_2^2} = \sqrt{k^2 (v_1^2 + v_2^2)} = |k| \sqrt{v_1^2 + v_2^2} = |k|\,\|v\|$. So kv and (kv1, kv2)
have the same magnitude and direction. This proves (a).
u + (v + w) = (u1, u2, u3) + ((v1, v2, v3) + (w1, w2, w3))
= (u1, u2, u3) + (v1 + w1, v2 + w2, v3 + w3)    by Theorem 1.4(b)
= (u1 + (v1 + w1), u2 + (v2 + w2), u3 + (v3 + w3))    by Theorem 1.4(b)
= ((u1 + v1) + w1, (u2 + v2) + w2, (u3 + v3) + w3)    by properties of real numbers
= (u1 + v1, u2 + v2, u3 + v3) + (w1, w2, w3)    by Theorem 1.4(b)
= (u + v) + w
This completes the analytic proof of (b). Figure 1.2.6 provides the geometric proof.
Figure 1.2.6 Associative Law for vector addition
A unit vector is a vector with magnitude 1. Notice that for any nonzero vector v, the
vector $\frac{v}{\|v\|}$ is a unit vector which points in the same direction as v, since $\frac{1}{\|v\|} > 0$ and
$\left\| \frac{v}{\|v\|} \right\| = \frac{\|v\|}{\|v\|} = 1$. Dividing a nonzero vector v by $\|v\|$ is often called normalizing v.
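As a rough illustration of normalizing, the sketch below divides each coordinate by the magnitude. It assumes vectors are plain tuples; `normalize` is a hypothetical helper name, and raising an error for the zero vector is a design choice made here, since the text only normalizes nonzero vectors.

```python
import math

def magnitude(v):
    """Length of a vector given as a tuple of coordinates."""
    return math.sqrt(sum(c * c for c in v))

def normalize(v):
    """Return the unit vector pointing in the same direction as the nonzero vector v."""
    length = magnitude(v)
    if length == 0:
        # The zero vector has no direction, so it cannot be normalized.
        raise ValueError("cannot normalize the zero vector")
    return tuple(c / length for c in v)

u = normalize((3, 4, 0))
print(u)             # (0.6, 0.8, 0.0)
print(magnitude(u))  # very close to 1, up to floating-point rounding
```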
There are specific unit vectors which we will often use, called the basis vectors:
i = (1, 0, 0), j = (0, 1, 0), and k = (0, 0, 1) in R3 ; i = (1, 0) and j = (0, 1) in R2 .
These are useful for several reasons: they are mutually perpendicular, since they lie on
distinct coordinate axes; they are all unit vectors: $\|i\| = \|j\| = \|k\| = 1$; every vector can
be written as a unique scalar combination of the basis vectors: v = (a, b) = a i + b j in R2 ,
(a) R2 (b) v = a i + b j (c) R3 (d) v = a i + b j + c k
(a) Find v − w.
Solution: v − w = (2 − 3, 1 − (−4), −1 − 2) = (−1, 5, −3).
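The coordinate-wise rules for scalar multiplication, addition and subtraction can be tried out with a short sketch. It assumes vectors are tuples and that the vectors in this example are v = (2, 1, −1) and w = (3, −4, 2), as read off from the subtraction shown in the solution; the helper names are illustrative only.

```python
def scale(k, v):
    """Scalar multiple kv, computed coordinate by coordinate."""
    return tuple(k * c for c in v)

def add(v, w):
    """Sum v + w, computed coordinate by coordinate."""
    return tuple(a + b for a, b in zip(v, w))

def subtract(v, w):
    """Difference v - w, defined as v + (-w)."""
    return add(v, scale(-1, w))

v = (2, 1, -1)
w = (3, -4, 2)
print(subtract(v, w))  # (-1, 5, -3)
print(add(v, w))       # (5, -3, 1)
```

Defining subtraction through `add` and `scale` mirrors the definition v − w = v + (−w) rather than subtracting coordinates directly.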
We can now easily prove Theorem 1.1 from the previous section. The distance d between
two points P = (x1, y1, z1) and Q = (x2, y2, z2) in R3 is the same as the length of the vector w − v,
where the vectors v and w are defined as v = (x1, y1, z1) and w = (x2, y2, z2) (see Figure 1.2.8).
So since w − v = (x2 − x1, y2 − y1, z2 − z1), then $d = \|w - v\| = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2}$ by
Theorem 1.2.
Figure 1.2.8 The vectors v, w and w − v for the points P(x1, y1, z1) and Q(x2, y2, z2)
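The distance argument above (the distance from P to Q is the length of w − v) translates directly into code. This is a minimal sketch assuming points are tuples; `distance` is a hypothetical name and the sample points are made up for illustration.

```python
import math

def distance(p, q):
    """Distance between points p and q, computed as the length of the vector q - p."""
    difference = tuple(b - a for a, b in zip(p, q))
    return math.sqrt(sum(c * c for c in difference))

# Illustrative points (not from the text): the difference is (3, 4, 0).
print(distance((1, 2, 3), (4, 6, 3)))  # 5.0
```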
Exercises
A
1. Let v = (−1, 5, −2) and w = (3, 1, 1).
(a) Find v − w. (b) Find v + w. (c) Find $\frac{v}{\|v\|}$. (d) Find $\left\| \frac{1}{2}(v - w) \right\|$.
(e) Find $\left\| \frac{1}{2}(v + w) \right\|$. (f) Find −2v + 4w. (g) Find v − 2w.
(h) Find the vector u such that u + v + w = i.
(i) Find the vector u such that u + v + w = 2 j + k.
(j) Is there a scalar m such that m(v + 2 w) = k? If so, find it.
2. For the vectors v and w from Exercise 1, is $\|v - w\| = \|v\| - \|w\|$? If not, which quantity
is larger?
C
6. We know that every vector in R3 can be written as a scalar combination of the vectors i,
j, and k. Can every vector in R3 be written as a scalar combination of just i and j; that is
for any vector v in R3 , are there scalars m, n such that v = m i + n j? Justify your answer.
1.3 Dot Product
v · w = v1 w1 + v2 w2 + v3 w3 . (1.6)
Similarly, for vectors v = (v1 , v2 ) and w = (w1 , w2 ) in R2 , the dot product is:
v · w = v1 w1 + v2 w2 . (1.7)
Notice that the dot product of two vectors is a scalar, not a vector. So the associative law
that holds for multiplication of numbers and for addition of vectors (see Theorem 1.5(b),(e)),
does not hold for the dot product of vectors. Why? Because for vectors u, v, w, the dot
product u · v is a scalar, and so (u · v) · w is not defined since the left side of that dot product
(the part in parentheses) is a scalar and not a vector.
For vectors v = v1 i + v2 j + v3 k and w = w1 i + w2 j + w3 k in component form, the dot product
is still v · w = v1 w1 + v2 w2 + v3 w3 .
Also notice that we defined the dot product in an analytic way, that is, by referencing
vector coordinates. There is a geometric way of defining the dot product, which we will now
develop as a consequence of the analytic definition.
Definition 1.7. The angle between two nonzero vectors with the same initial point is the
smallest angle between them.
We do not define the angle between the zero vector and any other vector. Any two nonzero
vectors with the same initial point have two angles between them: θ and 360° − θ. We will
always choose the smallest nonnegative angle θ between them, so that 0° ≤ θ ≤ 180°. See
Figure 1.3.1.
Figure 1.3.1 The angles θ and 360° − θ between two vectors
We can now take a more geometric view of the dot product by establishing a relationship
between the dot product of two vectors and the angle between them.
Theorem 1.6. Let v, w be nonzero vectors, and let θ be the angle between them. Then
$\cos\theta = \frac{v \cdot w}{\|v\|\,\|w\|}$ (1.8)
We will prove the theorem, assuming that the notion of angle as well as the Law of Cosines
are known. In a more rigorous approach, one could define the angles between the vectors
using the statement of the theorem above.
Proof: We will prove the theorem for vectors in R3 (the proof for R2 is similar). Let v =
(v1 , v2 , v3 ) and w = (w1 , w2 , w3 ). By the Law of Cosines (see Figure 1.3.2), we have
$\|v - w\|^2 = \|v\|^2 + \|w\|^2 - 2\,\|v\|\,\|w\| \cos\theta$. (1.9)
(note that equation (1.9) holds even for the “degenerate” cases θ = 0° and 180°).
Figure 1.3.2 The vectors v, w and v − w, with the angle θ between v and w
Example 1.5. Find the angle θ between the vectors v = (2, 1, −1) and w = (3, −4, 1).
Solution: Since v · w = (2)(3) + (1)(−4) + (−1)(1) = 1, $\|v\| = \sqrt{6}$, and $\|w\| = \sqrt{26}$, then
$\cos\theta = \frac{v \cdot w}{\|v\|\,\|w\|} = \frac{1}{\sqrt{6}\,\sqrt{26}} = \frac{1}{2\sqrt{39}} \approx 0.08 \implies \theta \approx 85.41^\circ$.
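The calculation in Example 1.5 can be reproduced numerically. The sketch below assumes vectors are tuples; `dot`, `magnitude` and `angle_degrees` are hypothetical helper names, and the clamping step is an added safeguard against floating-point rounding, not part of the theorem.

```python
import math

def dot(v, w):
    """Dot product: sum of products of corresponding coordinates."""
    return sum(a * b for a, b in zip(v, w))

def magnitude(v):
    return math.sqrt(dot(v, v))

def angle_degrees(v, w):
    """Angle between nonzero vectors v and w, in degrees, using formula (1.8)."""
    cos_theta = dot(v, w) / (magnitude(v) * magnitude(w))
    # Rounding can push the ratio slightly outside [-1, 1]; clamp before acos.
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.degrees(math.acos(cos_theta))

print(round(angle_degrees((2, 1, -1), (3, -4, 1)), 2))  # 85.41
```

The returned angle always lies between 0° and 180°, matching the convention for the angle between two vectors.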
Two nonzero vectors are perpendicular if the angle between them is 90°. Since cos 90° =
0, we have the following important corollary to Theorem 1.6:
Corollary 1.7. Two nonzero vectors v and w are perpendicular if and only if v · w = 0.
By Corollary 1.8, the dot product can be thought of as a way of telling if the angle be-
tween two vectors is acute, obtuse, or a right angle, depending on whether the dot product
is positive, negative, or zero, respectively. See Figure 1.3.3.
Figure 1.3.3 Sign of the dot product & angle between vectors: (a) v · w > 0, 0° ≤ θ < 90° (b) v · w < 0, 90° < θ ≤ 180° (c) v · w = 0, θ = 90°
Example 1.6. Are the vectors v = (−1, 5, −2) and w = (3, 1, 1) perpendicular?
Solution: Yes, v ⊥ w since v · w = (−1)(3) + (5)(1) + (−2)(1) = 0.
The following theorem summarizes the basic properties of the dot product.
Proof: The proofs of parts (a)–(e) are straightforward applications of the definition of the
dot product, and are left to the reader as exercises. We will prove part (f).
(f) If either v = 0 or w = 0, then v · w = 0 by part (c), and so the inequality holds trivially. So
assume that v and w are nonzero vectors. Then by Theorem 1.6,
$v \cdot w = \cos\theta\, \|v\|\,\|w\|$, so
$|v \cdot w| = |\cos\theta|\, \|v\|\,\|w\|$, so
$|v \cdot w| \le \|v\|\,\|w\|$ since $|\cos\theta| \le 1$. QED
For vectors v and w, the collection of all scalar combinations kv + l w is called the span
of v and w. If nonzero vectors v and w are parallel, then their span is a line; if they are
not parallel, then their span is a plane. So what we showed above is that a vector which is
perpendicular to two other vectors is also perpendicular to their span.
The dot product can be used to derive properties of the magnitudes of vectors, the most
important of which is the Triangle Inequality, as given in the following theorem:
$\|v + w\|^2 = (v + w) \cdot (v + w) = v \cdot v + v \cdot w + w \cdot v + w \cdot w$
$= \|v\|^2 + 2(v \cdot w) + \|w\|^2$, so since a ≤ |a| for any real number a, we have
$\le \|v\|^2 + 2\,|v \cdot w| + \|w\|^2$, so by Theorem 1.9(f) we have
$\le \|v\|^2 + 2\,\|v\|\,\|w\| + \|w\|^2 = (\|v\| + \|w\|)^2$ and so
$\|v + w\| \le \|v\| + \|w\|$ after taking square roots of both sides, which proves (b).
The Triangle Inequality gets its name from the fact that in any triangle,
no one side is longer than the sum of the lengths of the other two sides (see
Figure 1.3.4). Another way of saying this is with the familiar statement “the
shortest distance between two points is a straight line.”
Figure 1.3.4
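A numerical spot check is no substitute for the proofs, but it can make the two inequalities concrete. The sketch below tests |v · w| ≤ ‖v‖ ‖w‖ and ‖v + w‖ ≤ ‖v‖ + ‖w‖ on randomly chosen vectors in R3; the function name, the sampling range, the seed and the small tolerance for rounding error are all assumptions made for illustration.

```python
import math
import random

def dot(v, w):
    return sum(a * b for a, b in zip(v, w))

def magnitude(v):
    return math.sqrt(dot(v, v))

def add(v, w):
    return tuple(a + b for a, b in zip(v, w))

def inequalities_hold(trials=1000, seed=0):
    """Check both inequalities on random vectors in R3; return True if none fails."""
    rng = random.Random(seed)
    tol = 1e-9  # allowance for floating-point rounding
    for _ in range(trials):
        v = tuple(rng.uniform(-10, 10) for _ in range(3))
        w = tuple(rng.uniform(-10, 10) for _ in range(3))
        if abs(dot(v, w)) > magnitude(v) * magnitude(w) + tol:
            return False
        if magnitude(add(v, w)) > magnitude(v) + magnitude(w) + tol:
            return False
    return True

print(inequalities_hold())  # True
```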
Exercises
A
1. Let v = (5, 1, −2) and w = (4, −4, 3). Calculate v · w.
7. v = − i + 2 j + k, w = −3 i + 6 j + 3 k; 8. v = i, w = 3 i + 2 j + 4k.
25. For nonzero vectors v and w, the projection of v onto w (sometimes
written as $\mathrm{proj}_w v$) is the vector u along the same line L
as w whose terminal point is obtained by dropping a perpendicular
line from the terminal point of v to L (see Figure 1.3.5).
Show that
$\|u\| = \frac{|v \cdot w|}{\|w\|}$.
(Hint: Consider the angle between v and w.)
Figure 1.3.5
27. Let α, β, and γ be the angles between a nonzero vector v in R3 and the vectors i, j, and k,
respectively. Show that $\cos^2\alpha + \cos^2\beta + \cos^2\gamma = 1$. (The angles α, β, γ are often called the
direction angles of v, and cos α, cos β, cos γ are called the direction cosines.)
1.4 Cross Product
Definition 1.8. Let v = (v1 , v2 , v3 ) and w = (w1 , w2 , w3 ) be vectors in R3 . The cross product
of v and w, denoted by v × w, is the vector in R3 given by:
v × w = (v2 w3 − v3 w2 , v3 w1 − v1 w3 , v1 w2 − v2 w1 ). (1.10)
Example 1.7. Find i × j.
Solution: Since i = (1, 0, 0) and j = (0, 1, 0), then
i × j = ((0)(0) − (0)(1), (0)(0) − (1)(0), (1)(1) − (0)(0))
= (0, 0, 1)
= k.
Figure 1.4.1 (k = i × j)
Similarly it can be shown that j × k = i and k × i = j.
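Formula (1.10) is easy to transcribe into code, which gives a quick way to confirm the three products above. The sketch assumes vectors in R3 are stored as 3-tuples; `cross` is a hypothetical helper name.

```python
def cross(v, w):
    """Cross product of two vectors in R3, following formula (1.10)."""
    v1, v2, v3 = v
    w1, w2, w3 = w
    return (v2 * w3 - v3 * w2,
            v3 * w1 - v1 * w3,
            v1 * w2 - v2 * w1)

i, j, k = (1, 0, 0), (0, 1, 0), (0, 0, 1)
print(cross(i, j))  # (0, 0, 1), which is k
print(cross(j, k))  # (1, 0, 0), which is i
print(cross(k, i))  # (0, 1, 0), which is j
```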
In the above example, the cross product of the given vectors was perpendicular to both
those vectors. It turns out that this will always be the case.
Theorem 1.11. If the cross product v × w of two nonzero vectors v and w is also a nonzero
vector, then it is perpendicular to both v and w.
(v × w) · v = (v2 w3 − v3 w2 , v3 w1 − v1 w3 , v1 w2 − v2 w1 ) · (v1 , v2 , v3 )
= v2 w3 v1 − v3 w2 v1 + v3 w1 v2 − v1 w3 v2 + v1 w2 v3 − v2 w1 v3
= v1 v2 w3 − v1 v2 w3 + w1 v2 v3 − w1 v2 v3 + v1 w2 v3 − v1 w2 v3
= 0 , after rearranging the terms.
∴ v × w ⊥ v by Corollary 1.7.
The proof that v × w ⊥ w is similar. QED
As a consequence of the above theorem and Theorem 1.9, we have the following:
Corollary 1.12. If the cross product v × w of two nonzero vectors v and w is also a nonzero
vector, then it is perpendicular to the span of v and w.
The span of any two nonzero, nonparallel vectors v, w in R3 is a plane P , so the above
corollary shows that v × w is perpendicular to that plane. As shown in Figure 1.4.2, there
are two possible directions for v × w, one the opposite of the other. The choice of direction
of v × w can be visualized using the right-hand rule, that is, the vectors v, w, v × w form
a right-handed system. Recall from Section 1.1 that this means that you can point your
thumb upwards in the direction of v × w while rotating v towards w with the remaining four
fingers.
Figure 1.4.2 The plane P containing v and w, with the vectors v × w and −v × w
We will now derive a formula for the magnitude of v × w, for nonzero vectors v, w:
and now adding and subtracting $v_1^2 w_1^2$, $v_2^2 w_2^2$, and $v_3^2 w_3^2$ on the right side gives
$= v_1^2 (w_1^2 + w_2^2 + w_3^2) + v_2^2 (w_1^2 + w_2^2 + w_3^2) + v_3^2 (w_1^2 + w_2^2 + w_3^2)$
$\quad - \left( v_1^2 w_1^2 + v_2^2 w_2^2 + v_3^2 w_3^2 + 2 (v_1 w_1 v_2 w_2 + v_1 w_1 v_3 w_3 + v_2 w_2 v_3 w_3) \right)$
$= (v_1^2 + v_2^2 + v_3^2)(w_1^2 + w_2^2 + w_3^2)$
$\quad - \left( (v_1 w_1)^2 + (v_2 w_2)^2 + (v_3 w_3)^2 + 2 (v_1 w_1)(v_2 w_2) + 2 (v_1 w_1)(v_3 w_3) + 2 (v_2 w_2)(v_3 w_3) \right)$
$\|v \times w\| = \|v\|\,\|w\| \sin\theta$. (1.11)
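Formula (1.11) can be checked numerically for a particular pair of vectors. The sketch below reuses the vectors of Example 1.5 and compares both sides; the helper names are hypothetical, and the comparison uses a tolerance because both sides are floating-point values.

```python
import math

def dot(v, w):
    return sum(a * b for a, b in zip(v, w))

def magnitude(v):
    return math.sqrt(dot(v, v))

def cross(v, w):
    v1, v2, v3 = v
    w1, w2, w3 = w
    return (v2 * w3 - v3 * w2, v3 * w1 - v1 * w3, v1 * w2 - v2 * w1)

v, w = (2, 1, -1), (3, -4, 1)

# Angle between v and w from formula (1.8).
theta = math.acos(dot(v, w) / (magnitude(v) * magnitude(w)))

left = magnitude(cross(v, w))                        # ||v x w|| computed directly
right = magnitude(v) * magnitude(w) * math.sin(theta)  # ||v|| ||w|| sin(theta)
print(math.isclose(left, right))  # True
```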
It may seem strange to bother with the above formula, when the magnitude of the cross
product can be calculated directly, like for any other vector. The formula is more useful for
its applications in geometry, as in the following example.
Example 1.8. Let △PQR and PQRS be a triangle and parallelogram, respectively, as shown
in Figure 1.4.3.
Figure 1.4.3
Think of the triangle as existing in R3 , and identify the sides QR and QP with vectors v
and w, respectively, in R3. Let θ be the angle between v and w. The area $A_{PQR}$ of △PQR is
$\frac{1}{2} b h$, where b is the base of the triangle and h is the height. So we see that
$b = \|v\|$ and $h = \|w\| \sin\theta$,
$A_{PQR} = \frac{1}{2}\,\|v\|\,\|w\| \sin\theta = \frac{1}{2}\,\|v \times w\|$.
So since the area $A_{PQRS}$ of the parallelogram PQRS is twice the area of the triangle △PQR,
then
$A_{PQRS} = \|v\|\,\|w\| \sin\theta$.
(a) The area A of a triangle with adjacent sides v, w (as vectors in R3) is:
$A = \frac{1}{2}\,\|v \times w\|$;
(b) The area A of a parallelogram with adjacent sides v, w (as vectors in R3) is:
$A = \|v \times w\|$.
It may seem at first glance that since the formulas derived in Example 1.8 were for the
adjacent sides QP and QR only, then the more general statements in Theorem 1.13 that the
formulas hold for any adjacent sides are not justified. We would get a different formula for
the area if we had picked PQ and PR as the adjacent sides, but it can be shown (see Exercise
26) that the different formulas would yield the same value, so the choice of adjacent sides
indeed does not matter, and Theorem 1.13 is valid.
Theorem 1.13 makes it simpler to calculate the area of a triangle in 3-dimensional space
than by using traditional geometric methods.
Example 1.9. Calculate the area of the triangle △PQR , where P = (2, 4, −7), Q = (3, 7, 18),
and R = (−5, 12, 8).
Solution: Let $v = \overrightarrow{PQ}$ and $w = \overrightarrow{PR}$, as in Figure 1.4.4. Then
Example 1.10. Calculate the area of the parallelogram PQRS , where P = (1, 1), Q = (2, 3),
R = (5, 4), and S = (4, 2).
Solution: Let $v = \overrightarrow{SP}$ and $w = \overrightarrow{SR}$, as in Figure 1.4.5. Then
v = (1, 1) − (4, 2) = (−3, −1) and w = (5, 4) − (4, 2) = (1, 2).
But these are vectors in R2, and the cross product is only defined
for vectors in R3. However, R2 can be thought of as the
subset of R3 such that the z-coordinate is always 0. So we can
write v = (−3, −1, 0) and w = (1, 2, 0). Then the area A of the
parallelogram PQRS is
$A = \|v \times w\| = \|(-3, -1, 0) \times (1, 2, 0)\|$
$= \|((-1)(0) - (0)(2),\ (0)(1) - (-3)(0),\ (-3)(2) - (-1)(1))\|$
$= \|(0, 0, -5)\|$.
A = 5.
Figure 1.4.5
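The area formulas for a triangle and a parallelogram can be sketched as two small functions built on the cross product. Vectors are assumed to be 3-tuples, so points in R2 are given a z-coordinate of 0 as in Example 1.10; all function names are hypothetical. The triangle call uses the points of Example 1.9 with the sides PQ and PR.

```python
import math

def subtract(p, q):
    """Vector from q to p, computed coordinate by coordinate."""
    return tuple(a - b for a, b in zip(p, q))

def cross(v, w):
    v1, v2, v3 = v
    w1, w2, w3 = w
    return (v2 * w3 - v3 * w2, v3 * w1 - v1 * w3, v1 * w2 - v2 * w1)

def magnitude(v):
    return math.sqrt(sum(c * c for c in v))

def parallelogram_area(v, w):
    """Area of the parallelogram with adjacent sides v and w."""
    return magnitude(cross(v, w))

def triangle_area(v, w):
    """Area of the triangle with adjacent sides v and w."""
    return 0.5 * parallelogram_area(v, w)

# Example 1.10: points in R2 embedded in R3 with z = 0, sides SP and SR.
P, R, S = (1, 1, 0), (5, 4, 0), (4, 2, 0)
print(parallelogram_area(subtract(P, S), subtract(R, S)))  # 5.0

# Example 1.9: sides PQ and PR of the triangle.
P, Q, R = (2, 4, -7), (3, 7, 18), (-5, 12, 8)
print(triangle_area(subtract(Q, P), subtract(R, P)))  # roughly 123.46
```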
The following theorem summarizes the basic properties of the cross product.
Proof: The proofs of properties (b)–(f) are straightforward. We will prove parts (a) and (g)
and leave the rest to the reader as exercises.
(a) By the definition of the cross product and scalar multiplication, we have:
v × w = (v2w3 − v3w2, v3w1 − v1w3, v1w2 − v2w1)
= −(v3w2 − v2w3, v1w3 − v3w1, v2w1 − v1w2)
= −(w2v3 − w3v2, w3v1 − w1v3, w1v2 − w2v1)
= −w × v
Note that this says that v × w and w × v have the same magnitude
but opposite direction (see Figure 1.4.6).
Figure 1.4.6
i × j = k, j × k = i, k × i = j
j × i = −k, k × j = −i, i × k = −j ,
i × i = j × j = k × k = 0.
Recall that a parallelepiped is a 3-dimensional solid with 6 faces, all of which are parallel-
ograms.6
Hence,
$\mathrm{vol}(P) = A\,h$
$= \|v \times w\| \, \frac{\|u\|\; u \cdot (v \times w)}{\|u\|\,\|v \times w\|}$
$= u \cdot (v \times w)$.
In Example 1.12 the height h of the parallelepiped is $\|u\| \cos\theta$, and not $-\|u\| \cos\theta$, because
the vector u is on the same side of the base parallelogram’s plane as the vector v × w (so that
cos θ > 0). Since the volume is the same no matter which base and height we use, then
repeating the same steps using the base determined by u and v (since w is on the same
side of that base’s plane as u × v), the volume is w · (u × v). Repeating this with the base
determined by w and u, we have the following result:
u · (v × w) = w · (u × v) = v · (w × u). (1.12)
(Note that the equalities hold trivially if any of the vectors are 0.)
Since v × w = −w × v for any vectors v, w in R3 , then picking the wrong order for the three
adjacent sides in the scalar triple product in formula (1.12) will give you the negative of the
volume of the parallelepiped. So taking the absolute value of the scalar triple product for
any order of the three adjacent sides will always give the volume:
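The behaviour of the scalar triple product under reordering can be seen on a small example. The sketch below uses made-up vectors (not from the text) and hypothetical helper names; it shows the three cyclic orders of formula (1.12) agreeing, a swapped order changing the sign, and the absolute value giving the volume.

```python
def dot(v, w):
    return sum(a * b for a, b in zip(v, w))

def cross(v, w):
    v1, v2, v3 = v
    w1, w2, w3 = w
    return (v2 * w3 - v3 * w2, v3 * w1 - v1 * w3, v1 * w2 - v2 * w1)

def scalar_triple(u, v, w):
    """Scalar triple product u . (v x w)."""
    return dot(u, cross(v, w))

def parallelepiped_volume(u, v, w):
    """Volume of the parallelepiped with adjacent sides u, v, w (any order)."""
    return abs(scalar_triple(u, v, w))

# Illustrative vectors chosen only for this sketch.
u, v, w = (1, 2, 3), (4, 5, 6), (7, 8, 10)
print(scalar_triple(u, v, w), scalar_triple(w, u, v), scalar_triple(v, w, u))  # -3 -3 -3
print(scalar_triple(u, w, v))          # 3 (swapping two vectors flips the sign)
print(parallelepiped_volume(u, v, w))  # 3
```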
Another type of triple product is the vector triple product u × (v × w). The proof of the
following theorem is left as an exercise for the reader:
An examination of the formula in Theorem 1.16 gives some idea of the geometry of the
vector triple product. By the right side of formula (1.13), we see that u × (v × w) is a scalar
combination of v and w, and hence lies in the plane containing v and w (that is, the vectors
u × (v × w), v and w are coplanar). This makes sense since, by Theorem 1.11, u × (v × w) is
perpendicular to both u and v × w. In particular, being perpendicular to v × w means that
u × (v × w) lies in the plane containing v and w, since that plane is itself perpendicular to
v × w. But then how is u × (v × w) also perpendicular to u, which could be any vector? The
following example may help to see how this works.
Example 1.13. Find u × (v × w) for u = (1, 2, 4), v = (2, 2, 0), w = (1, 3, 0).
Solution: Since u · v = 6 and u · w = 7, then
u × (v × w) = (u · w)v − (u · v)w
= 7 (2, 2, 0) − 6 (1, 3, 0) = (14, 14, 0) − (6, 18, 0)
= (8, −4, 0).
Note that v and w lie in the x y-plane, and that u × (v × w) also lies in that plane. Also,
u × (v × w) is perpendicular to both u and v × w = (0, 0, 4) (see Figure 1.4.8).
Figure 1.4.8
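The result of Example 1.13 can be cross-checked by computing u × (v × w) directly from the definition of the cross product and comparing it with (u · w)v − (u · v)w. This is a small sketch with hypothetical helper names, assuming vectors are 3-tuples.

```python
def dot(v, w):
    return sum(a * b for a, b in zip(v, w))

def cross(v, w):
    v1, v2, v3 = v
    w1, w2, w3 = w
    return (v2 * w3 - v3 * w2, v3 * w1 - v1 * w3, v1 * w2 - v2 * w1)

def scale(k, v):
    return tuple(k * c for c in v)

def subtract(v, w):
    return tuple(a - b for a, b in zip(v, w))

u, v, w = (1, 2, 4), (2, 2, 0), (1, 3, 0)

direct = cross(u, cross(v, w))                                   # u x (v x w)
by_formula = subtract(scale(dot(u, w), v), scale(dot(u, v), w))  # (u.w)v - (u.v)w

print(direct)      # (8, -4, 0)
print(by_formula)  # (8, -4, 0)
print(dot(direct, u), dot(direct, cross(v, w)))  # 0 0, so it is perpendicular to both
```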
It may help to remember this formula as being the product of the scalars on the downward
diagonal minus the product of the scalars on the upward diagonal.
Example 1.14. $\begin{vmatrix} 1 & 2 \\ 3 & 4 \end{vmatrix} = (1)(4) - (2)(3) = 4 - 6 = -2$.
One way to remember the above formula is the following: multiply each scalar in the first
row by the determinant of the 2 × 2 matrix that remains after removing the row and column
that contain that scalar, then sum those products up, putting alternating plus and minus
signs in front of each (starting with a plus).
Example 1.15.
\[
\begin{vmatrix} 1 & 0 & 2 \\ 4 & -1 & 3 \\ 1 & 0 & 2 \end{vmatrix}
= 1 \begin{vmatrix} -1 & 3 \\ 0 & 2 \end{vmatrix}
- 0 \begin{vmatrix} 4 & 3 \\ 1 & 2 \end{vmatrix}
+ 2 \begin{vmatrix} 4 & -1 \\ 1 & 0 \end{vmatrix}
= 1(-2 - 0) - 0(8 - 3) + 2(0 + 1) = 0.
\]
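The first-row expansion described above translates directly into code. The sketch below is a minimal illustration, assuming matrices are stored as nested Python lists; det2 and det3 are hypothetical helper names.

```python
def det2(m):
    """Determinant of a 2 x 2 matrix: downward diagonal minus upward diagonal."""
    (a, b), (c, d) = m
    return a * d - b * c

def det3(m):
    """Determinant of a 3 x 3 matrix by expansion along the first row."""
    total = 0
    for j in range(3):
        # Remove row 0 and column j to get the remaining 2 x 2 matrix.
        minor = [[row[k] for k in range(3) if k != j] for row in m[1:]]
        # Signs alternate +, -, + across the first row.
        total += (-1) ** j * m[0][j] * det2(minor)
    return total

print(det2([[1, 2], [3, 4]]))
print(det3([[1, 0, 2], [4, -1, 3], [1, 0, 2]]))
```

The two calls reproduce Examples 1.14 and 1.15.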
We defined the determinant as a scalar, derived from algebraic operations on scalar entries
in a matrix. However, if we put three vectors in the first row of a 3 × 3 matrix, then the
definition still makes sense, since we would be performing scalar multiplication on those
three vectors (they would be multiplied by the 2×2 scalar determinants as before). This gives
us a determinant that is now a vector, and lets us write the cross product of v = v1 i + v2 j + v3 k
and w = w1 i + w2 j + w3 k as a determinant:
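\[
\mathbf{v} \times \mathbf{w}
= \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ v_1 & v_2 & v_3 \\ w_1 & w_2 & w_3 \end{vmatrix}
= \begin{vmatrix} v_2 & v_3 \\ w_2 & w_3 \end{vmatrix} \mathbf{i}
- \begin{vmatrix} v_1 & v_3 \\ w_1 & w_3 \end{vmatrix} \mathbf{j}
+ \begin{vmatrix} v_1 & v_2 \\ w_1 & w_2 \end{vmatrix} \mathbf{k}
\]
\[
= (v_2 w_3 - v_3 w_2)\,\mathbf{i} + (v_3 w_1 - v_1 w_3)\,\mathbf{j} + (v_1 w_2 - v_2 w_1)\,\mathbf{k} .
\]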
The scalar triple product can also be written as a determinant. In fact, by Example 1.12,
the following theorem provides an alternate definition of the determinant of a 3 × 3 matrix
as the volume of a parallelepiped whose adjacent sides are the rows of the matrix and form
a right-handed system (a left-handed system would give the negative volume).
Example 1.17. Find the volume of the parallelepiped with adjacent sides u = (2, 1, 3), v =
(−1, 3, 2), w = (1, 1, −2) (see Figure 1.4.9).
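[Figure 1.4.9: the parallelepiped P with adjacent sides u, v, w]
Solution: By Theorem 1.15, the volume vol(P) of the parallelepiped P is the absolute value of the scalar triple product of the three adjacent sides (in any order). By Theorem 1.17,
\[
\mathbf{u} \cdot (\mathbf{v} \times \mathbf{w})
= \begin{vmatrix} 2 & 1 & 3 \\ -1 & 3 & 2 \\ 1 & 1 & -2 \end{vmatrix}
= 2 \begin{vmatrix} 3 & 2 \\ 1 & -2 \end{vmatrix}
- 1 \begin{vmatrix} -1 & 2 \\ 1 & -2 \end{vmatrix}
+ 3 \begin{vmatrix} -1 & 3 \\ 1 & 1 \end{vmatrix}
= 2(-8) - 1(0) + 3(-4) = -28,
\]
so vol(P) = |−28| = 28.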
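The scalar triple product in Example 1.17 can be cross-checked numerically. The sketch below evaluates it as u · (v × w) and takes the absolute value to get the volume; the helper names are assumptions made for illustration.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def scalar_triple(u, v, w):
    """Scalar triple product u . (v x w)."""
    return dot(u, cross(v, w))

u, v, w = (2, 1, 3), (-1, 3, 2), (1, 1, -2)
signed = scalar_triple(u, v, w)    # sign depends on the order of the sides
swapped = scalar_triple(u, w, v)   # swapping two sides flips the sign
volume = abs(signed)               # the volume does not depend on the order
print(signed, swapped, volume)
```

Swapping two of the sides changes only the sign, which is why the absolute value gives the volume for any order.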
Interchanging the dot and cross products can be useful in proving vector identities:
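Example 1.18. Prove:
\[
(\mathbf{u} \times \mathbf{v}) \cdot (\mathbf{w} \times \mathbf{z})
= \begin{vmatrix} \mathbf{u} \cdot \mathbf{w} & \mathbf{u} \cdot \mathbf{z} \\ \mathbf{v} \cdot \mathbf{w} & \mathbf{v} \cdot \mathbf{z} \end{vmatrix}
\]
for all vectors u, v, w, z in R3.
Solution: Let x = u × v. Then
\[
\begin{aligned}
(\mathbf{u} \times \mathbf{v}) \cdot (\mathbf{w} \times \mathbf{z})
&= \mathbf{x} \cdot (\mathbf{w} \times \mathbf{z}) \\
&= \mathbf{w} \cdot (\mathbf{z} \times \mathbf{x}) && \text{(by formula (1.12))} \\
&= \mathbf{w} \cdot (\mathbf{z} \times (\mathbf{u} \times \mathbf{v})) \\
&= \mathbf{w} \cdot ((\mathbf{z} \cdot \mathbf{v})\mathbf{u} - (\mathbf{z} \cdot \mathbf{u})\mathbf{v}) && \text{(by Theorem 1.16)} \\
&= (\mathbf{z} \cdot \mathbf{v})(\mathbf{w} \cdot \mathbf{u}) - (\mathbf{z} \cdot \mathbf{u})(\mathbf{w} \cdot \mathbf{v}) \\
&= (\mathbf{u} \cdot \mathbf{w})(\mathbf{v} \cdot \mathbf{z}) - (\mathbf{u} \cdot \mathbf{z})(\mathbf{v} \cdot \mathbf{w}) && \text{(by commutativity of the dot product)} \\
&= \begin{vmatrix} \mathbf{u} \cdot \mathbf{w} & \mathbf{u} \cdot \mathbf{z} \\ \mathbf{v} \cdot \mathbf{w} & \mathbf{v} \cdot \mathbf{z} \end{vmatrix} .
\end{aligned}
\]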
1.4 Cross Product 31
Exercises
A
For Exercises 1–6, calculate v × w.
5. v = − i + 2 j + k, w = −3 i + 6 j + 3 k; 6. v = i, w = 3 i + 2 j + 4k.
7. P = (5, 1, −2), Q = (4, −4, 3), R = (2, 4, 0); 8. P = (4, 0, 2), Q = (2, 1, 5), R = (−1, 0, −1).
For Exercises 11–12, find the volume of the parallelepiped with adjacent sides u, v, w.
11. u = (1, 1, 3), v = (2, 1, 4), w = (5, 1, −2); 12. u = (1, 3, 2), v = (7, 2, −10), w = (1, 0, 1).
13. u = (1, 1, 1), v = (3, 0, 2), w = (2, 2, 2); 14. u = (1, 0, 2), v = (−1, 0, 3), w = (2, 0, −2).
15. Calculate (u × v) · (w × z) for u = (1, 1, 1), v = (3, 0, 2), w = (2, 2, 2), z = (2, 1, 4).
B
16. If v and w are unit vectors in R3 , under what condition(s) would v × w also be a unit
vector in R3 ? Justify your answer.
24. Prove Theorem 1.17. (Hint: Expand both sides of the equation.)
C
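26. Prove that in Example 1.8 the formula for the area of the triangle △PQR yields the same value no matter which two adjacent sides are chosen. To do this, show that
\[
\tfrac{1}{2}\,\| \mathbf{u} \times (-\mathbf{w}) \| = \tfrac{1}{2}\,\| \mathbf{v} \times \mathbf{w} \| ,
\]
where $\mathbf{u} = \overrightarrow{PR}$, $-\mathbf{w} = \overrightarrow{PQ}$, and $\mathbf{v} = \overrightarrow{QR}$, $\mathbf{w} = \overrightarrow{QP}$ as before. Similarly, show that
\[
\tfrac{1}{2}\,\| (-\mathbf{u}) \times (-\mathbf{v}) \| = \tfrac{1}{2}\,\| \mathbf{v} \times \mathbf{w} \| ,
\]
where $-\mathbf{u} = \overrightarrow{RP}$ and $-\mathbf{v} = \overrightarrow{RQ}$.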
27. Assume that the vector equation a × x = b in R3, with unknown x and a ≠ 0, has a solution. Show that:
(a) a · b = 0.
(b) $\mathbf{x} = \dfrac{\mathbf{b} \times \mathbf{a}}{\|\mathbf{a}\|^2} + k\mathbf{a}$ is a solution to the equation, for any scalar k.
28. Prove the Jacobi identity:
u × (v × w) + v × (w × u) + w × (u × v) = 0.
(u × v) × (w × z) = (z · (u × v))w − (w · (u × v))z
and that
(u × v) × (w × z) = (u · (w × z))v − (v · (w × z))u
31. Describe geometrically the set of points with position vector x satisfying the equation
(v × x) × x = v
Figure 1.5.1
Let r = ( x0 , y0 , z0 ) be the vector pointing from the origin to P . Since multiplying the vector
v by a scalar t lengthens or shrinks v while preserving its direction if t > 0, and reversing
its direction if t < 0, then we see from Figure 1.5.1 that every point on the line L can be
obtained by adding the vector tv to the vector r for some scalar t. That is, as t varies over all
real numbers, the vector r + tv will point to every point on L. We can summarize the vector
representation of L as follows:
Note that we used the correspondence between a vector and its terminal point. Since
v = (a, b, c), then the terminal point of the vector r + tv is ( x0 + at, y0 + bt, z0 + ct). We then get
the parametric representation of L with the parameter t:
In formula (1.17), if a ≠ 0, then we can solve for the parameter t: t = (x − x0)/a. We can also solve for t in terms of y and in terms of z if neither b nor c, respectively, is zero: t = (y − y0)/b and t = (z − z0)/c. These three values all equal the same value t, so we can write the following system of equalities, called the symmetric representation of L:
For a point P = ( x0 , y0 , z0 ) and vector v = (a, b, c) in R3 with a, b and c all nonzero, the line
L through P parallel to v consists of all points ( x, y, z) given by the equations
\[
\frac{x - x_0}{a} = \frac{y - y_0}{b} = \frac{z - z_0}{c} . \tag{1.18}
\]
Example 1.19. Write the line L through the point P = (2, 3, 5) and parallel to the vector
v = (4, −1, 6), in the following forms: (a) vector, (b) parametric, (c) symmetric. Lastly: (d) find
two points on L distinct from P .
Solution: (a) Let r = (2, 3, 5). Then by formula (1.16), L is given by:
Symmetric:
\[
\frac{x - x_1}{x_2 - x_1} = \frac{y - y_1}{y_2 - y_1} = \frac{z - z_1}{z_2 - z_1}
\quad (\text{if } x_1 \neq x_2,\ y_1 \neq y_2, \text{ and } z_1 \neq z_2). \tag{1.22}
\]
Example 1.20. Write the line L through the points P1 = (−3, 1, −4) and P2 = (4, 4, −6) in
parametric form.
Solution: By formula (1.21), L consists of the points ( x, y, z) such that
\[
d = \frac{\| \mathbf{v} \times \mathbf{w} \|}{\| \mathbf{v} \|} . \tag{1.23}
\]
In other words, d is the height of the parallelogram with adjacent sides v and w. Since its area is ‖v × w‖ and its base is ‖v‖, we get the expression (1.23).
Example 1.21. Find the distance d from the point P = (1, 1, 1) to the line L in Example 1.20.
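Solution: From Example 1.20, we see that we can represent L in vector form as: r + tv, for r = (−3, 1, −4) and v = (7, 3, −2). Since the point Q = (−3, 1, −4) is on L, then for $\mathbf{w} = \overrightarrow{QP}$ = (1, 1, 1) − (−3, 1, −4) = (4, 0, 5), we have:
\[
\mathbf{v} \times \mathbf{w}
= \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ 7 & 3 & -2 \\ 4 & 0 & 5 \end{vmatrix}
= \begin{vmatrix} 3 & -2 \\ 0 & 5 \end{vmatrix} \mathbf{i}
- \begin{vmatrix} 7 & -2 \\ 4 & 5 \end{vmatrix} \mathbf{j}
+ \begin{vmatrix} 7 & 3 \\ 4 & 0 \end{vmatrix} \mathbf{k}
= 15\,\mathbf{i} - 43\,\mathbf{j} - 12\,\mathbf{k} , \text{ so}
\]
\[
d = \frac{\| \mathbf{v} \times \mathbf{w} \|}{\| \mathbf{v} \|}
= \frac{\| 15\,\mathbf{i} - 43\,\mathbf{j} - 12\,\mathbf{k} \|}{\| (7, 3, -2) \|}
= \frac{\sqrt{15^2 + (-43)^2 + (-12)^2}}{\sqrt{7^2 + 3^2 + (-2)^2}}
= \frac{\sqrt{2218}}{\sqrt{62}} \approx 5.98 .
\]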
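Formula (1.23) is short enough to evaluate in a few lines. The following Python sketch recomputes the distance in Example 1.21; the function name point_line_distance and its argument order are assumptions chosen for illustration.

```python
import math

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def norm(a):
    return math.sqrt(sum(x * x for x in a))

def point_line_distance(p, q, v):
    """Distance from the point p to the line through q with direction v."""
    w = tuple(pi - qi for pi, qi in zip(p, q))  # the vector from q to p
    return norm(cross(v, w)) / norm(v)          # formula (1.23)

d = point_line_distance((1, 1, 1), (-3, 1, -4), (7, 3, -2))
print(round(d, 2))
```

Any other point Q on the line could be used in place of (−3, 1, −4); the cross product changes, but the quotient does not.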
Two lines
It is clear that two lines L 1 and L 2 , represented in vector form as r1 + sv1 and r2 + tv2 ,
respectively, are parallel (denoted as L 1 ∥ L 2 ) if v1 and v2 are parallel. Also, L 1 and L 2 are
perpendicular (denoted as L 1 ⊥ L 2 ) if v1 and v2 are perpendicular.
In 2-dimensional space, two lines are either identical, parallel, or they intersect. In 3-dimensional space, there is an additional possibility: two lines can be skew, that is, they do not intersect but they are not parallel. However, even though they are not parallel, skew lines are on parallel planes (see Figure 1.5.5).
[Figure 1.5.5: the lines L1 and L2]
To determine whether two lines in R3 intersect, it is often easier to use the parametric representation of the lines. In this case, you should use different parameter variables (usually s and t) for the lines, since the values of the parameters may not be the same at the point of intersection. Setting the two (x, y, z) triples equal will result in a system of 3 equations in 2 unknowns (s and t).
Example 1.22. Find the point of intersection (if any) of the following lines:
\[
\frac{x + 1}{3} = \frac{y - 2}{2} = \frac{z - 1}{-1}
\quad \text{and} \quad
x + 3 = \frac{y - 8}{-3} = \frac{z + 3}{2} .
\]
Solution: First we write the lines in parametric form, with parameters s and t:
x = −1 + 3 s, y = 2 + 2 s, z = 1−s and x = −3 + t, y = 8 − 3 t, z = −3 + 2 t.
−1 + 3s = −3 + t ⇒ t = 2 + 3s,
2 + 2s = 8 − 3t ⇒ 2 + 2s = 8 − 3(2 + 3s) = 2 − 9s ⇒ 2s = −9s ⇒ s = 0 ⇒ t = 2 + 3(0) = 2,
1 − s = −3 + 2t: 1 − 0 = −3 + 2(2) ⇒ 1 = 1. ✓ (Note that we had to check this.)
Letting s = 0 in the equations for the first line, or letting t = 2 in the equations for the second
line, gives the point of intersection (−1, 2, 1).
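The procedure in Example 1.22 (solve two of the equations for s and t, then check the third) can be sketched in Python as follows. This is a minimal illustration that assumes integer inputs and that the x- and y-equations alone determine s and t; the function name intersect_lines is hypothetical.

```python
from fractions import Fraction

def intersect_lines(r1, v1, r2, v2):
    """Intersection of r1 + s*v1 and r2 + t*v2, or None if the lines miss."""
    # x- and y-equations: s*v1 - t*v2 = r2 - r1, solved by Cramer's rule.
    det = -v1[0] * v2[1] + v2[0] * v1[1]
    if det == 0:
        raise ValueError("x-y subsystem is singular; use another pair of coordinates")
    bx, by = r2[0] - r1[0], r2[1] - r1[1]
    s = Fraction(-bx * v2[1] + v2[0] * by, det)
    t = Fraction(v1[0] * by - v1[1] * bx, det)
    # The z-equation was not used to find s and t, so it must be checked.
    if r1[2] + s * v1[2] != r2[2] + t * v2[2]:
        return None
    return tuple(r + s * c for r, c in zip(r1, v1))

point = intersect_lines((-1, 2, 1), (3, 2, -1), (-3, 8, -3), (1, -3, 2))
print(point)
```

Exact fractions are used so that the final check is a true equality test rather than a floating-point comparison.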
1.5 Lines and Planes 37
a( x − x0 ) + b( y − y0 ) + c( z − z0 ) = 0. (1.25)
Example 1.23. Find the equation of the plane P containing the point (−3, 1, 3) and perpen-
dicular to the vector n = (2, 4, 8).
Solution: By formula (1.25), the plane P consists of all points ( x, y, z) such that:
2( x + 3) + 4( y − 1) + 8( z − 3) = 0.
If we multiply out the terms in formula (1.25) and combine the constant terms, we get an
equation of the plane in normal form:
ax + b y + cz + d = 0. (1.26)
Example 1.24. Find the equation of the plane P containing the points (2, 1, 3), (1, −1, 2) and
(3, 2, 1).
Solution: Let Q = (2, 1, 3), R = (1, −1, 2) and S = (3, 2, 1). Then for the vectors $\overrightarrow{QR}$ = (−1, −2, −1) and $\overrightarrow{QS}$ = (1, 1, −2), the plane P has a normal vector
\[
\mathbf{n} = \overrightarrow{QR} \times \overrightarrow{QS} = (-1, -2, -1) \times (1, 1, -2) = (5, -3, 1) .
\]
So using formula (1.25) with the point Q (we could also use R or S ), the plane P consists of
all points ( x, y, z) such that:
5( x − 2) − 3( y − 1) + ( z − 3) = 0,
or in normal form,
5 x − 3 y + z − 10 = 0.
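The technique of Example 1.24 can be sketched in Python: build two vectors in the plane, take their cross product as the normal, and read off the constant d in the normal form ax + by + cz + d = 0. The function name plane_through is an assumption made for illustration.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def plane_through(q, r, s):
    """Return (n, d) with n . (x, y, z) + d = 0 for the plane through q, r, s."""
    qr = tuple(ri - qi for ri, qi in zip(r, q))
    qs = tuple(si - qi for si, qi in zip(s, q))
    n = cross(qr, qs)   # normal vector; it is (0, 0, 0) if the points are collinear
    d = -dot(n, q)      # chosen so that q satisfies the equation
    return n, d

n, d = plane_through((2, 1, 3), (1, -1, 2), (3, 2, 1))
print(n, d)
```

Substituting any of the three points into n · (x, y, z) + d gives zero, which is a quick way to catch sign errors.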
We mentioned earlier that skew lines in R3 lie on separate, parallel planes. So two skew
lines do not determine a plane. But two (nonidentical) lines which either intersect or are
parallel do determine a plane. In both cases, to find the equation of the plane that contains
those two lines, simply pick from the two lines a total of three noncollinear points (one point
from one line and two points from the other), then use the technique above, as in Example
1.24, to write the equation. We will leave examples of this as exercises for the reader.
\[
D = \frac{| a x_0 + b y_0 + c z_0 + d |}{\sqrt{a^2 + b^2 + c^2}} . \tag{1.27}
\]
Proof: Let R = (x, y, z) be any point in the plane P (so that ax + by + cz + d = 0) and let $\mathbf{r} = \overrightarrow{RQ}$ = (x0 − x, y0 − y, z0 − z). Then r ≠ 0 since Q does not lie in P. From the normal form equation for P, we know that n = (a, b, c) is a normal vector for P. Now, any plane divides R3 into two disjoint parts. Assume that n points toward the side of P where the point Q is located. Place n so that its initial point is at R, and let θ be the angle between r and n. Then 0° < θ < 90°, so cos θ > 0. Thus, the distance D is cos θ ‖r‖ = |cos θ| ‖r‖ (see Figure 1.5.8).
[Figure 1.5.8: the plane P, the point R in P, the point Q, the vectors n and r, the angle θ and the distance D]
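By Theorem 1.6 in Section 1.3, we know that $\cos\theta = \dfrac{\mathbf{n} \cdot \mathbf{r}}{\|\mathbf{n}\|\,\|\mathbf{r}\|}$, so
\[
\begin{aligned}
D = |\cos\theta|\,\|\mathbf{r}\|
&= \frac{|\mathbf{n} \cdot \mathbf{r}|}{\|\mathbf{n}\|\,\|\mathbf{r}\|}\,\|\mathbf{r}\|
= \frac{|\mathbf{n} \cdot \mathbf{r}|}{\|\mathbf{n}\|}
= \frac{| a(x_0 - x) + b(y_0 - y) + c(z_0 - z) |}{\sqrt{a^2 + b^2 + c^2}} \\
&= \frac{| a x_0 + b y_0 + c z_0 - (ax + by + cz) |}{\sqrt{a^2 + b^2 + c^2}}
= \frac{| a x_0 + b y_0 + c z_0 - (-d) |}{\sqrt{a^2 + b^2 + c^2}}
= \frac{| a x_0 + b y_0 + c z_0 + d |}{\sqrt{a^2 + b^2 + c^2}} .
\end{aligned}
\]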
If n points away from the side of P where the point Q is located, then 90◦ < θ < 180◦ and
so cos θ < 0. The distance D is then | cos θ | k r k, and thus repeating the same argument as
above still gives the same result. QED
Example 1.25. Find the distance D from (2, 4, −5) to the plane from Example 1.24.
Solution: Recall that the plane is given by 5 x − 3 y + z − 10 = 0. So
\[
D = \frac{| 5(2) - 3(4) + 1(-5) - 10 |}{\sqrt{5^2 + (-3)^2 + 1^2}} = \frac{|-17|}{\sqrt{35}} = \frac{17}{\sqrt{35}} \approx 2.87 .
\]
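Formula (1.27) maps to a one-line computation. The Python sketch below reproduces Example 1.25; the function name point_plane_distance and the tuple layout (a, b, c, d) for the plane are assumptions made for illustration.

```python
import math

def point_plane_distance(point, plane):
    """Distance from point (x0, y0, z0) to the plane ax + by + cz + d = 0."""
    a, b, c, d = plane
    x0, y0, z0 = point
    return abs(a * x0 + b * y0 + c * z0 + d) / math.sqrt(a * a + b * b + c * c)

D = point_plane_distance((2, 4, -5), (5, -3, 1, -10))
print(round(D, 2))
```

A point lying in the plane, such as (2, 1, 3) from Example 1.24, gives a distance of zero.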
Note that two planes are parallel if they have normal vectors that are parallel, and the planes are perpendicular if their normal vectors are perpendicular.
[Figure 1.5.9: two planes intersecting in the line L]
Suppose that two planes P1 and P2 with normal vectors n1 and n2, respectively, intersect in a line L (see Figure 1.5.9). Since n1 × n2 ⊥ n1, then n1 × n2 is parallel to the plane P1. Likewise, n1 × n2 ⊥ n2 means that n1 × n2 is also parallel to P2. Thus, n1 × n2 is parallel to the intersection of P1 and P2, which is L. Thus, we can write L in the following vector form:
\[
L : \ \mathbf{r} + t(\mathbf{n}_1 \times \mathbf{n}_2), \quad \text{for } -\infty < t < \infty ,
\]
where r is any vector pointing to a point belonging to both planes. To find a point in both
planes, find a common solution ( x, y, z) to the two normal form equations of the planes. This
can often be made easier by setting one of the coordinate variables to zero, which leaves you
to solve two equations in just two unknowns.
5 x − 3 y + z − 10 = 0,
2 x + 4 y − z + 3 = 0.
Set x = 0 (why is that a good choice?). Then the above equations are reduced to:
−3 y + z − 10 = 0,
4 y − z + 3 = 0.
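The second equation gives z = 4y + 3; substituting that into the first equation gives y = 7. Then z = 31, and so the point (0, 7, 31) is on L. Since n1 × n2 = (−1, 7, 26), then L is given by:
\[
\mathbf{r} + t(\mathbf{n}_1 \times \mathbf{n}_2) = (0, 7, 31) + t(-1, 7, 26), \quad \text{for } -\infty < t < \infty ,
\]
or in parametric form:
\[
x = -t, \quad y = 7 + 7t, \quad z = 31 + 26t, \quad \text{for } -\infty < t < \infty .
\]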
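As a numerical cross-check of this example, the following Python sketch finds a point on both planes by setting x = 0 and takes n1 × n2 as the direction of the line. It is a sketch under stated assumptions: integer coefficients, planes stored as tuples (a, b, c, d) for ax + by + cz + d = 0, and a hypothetical function name plane_intersection.

```python
from fractions import Fraction

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def plane_intersection(p1, p2):
    """Return (point, direction) for the line where two planes meet."""
    direction = cross(p1[:3], p2[:3])
    # Setting x = 0 works exactly when the direction has a nonzero
    # x-component, because the line then has to cross the plane x = 0.
    if direction[0] == 0:
        raise ValueError("set y or z to zero instead of x")
    # With x = 0: b1*y + c1*z = -d1 and b2*y + c2*z = -d2 (Cramer's rule).
    det = p1[1] * p2[2] - p1[2] * p2[1]
    y = Fraction(-p1[3] * p2[2] + p1[2] * p2[3], det)
    z = Fraction(-p1[1] * p2[3] + p2[1] * p1[3], det)
    return (Fraction(0), y, z), direction

point, direction = plane_intersection((5, -3, 1, -10), (2, 4, -1, 3))
print(point, direction)
```

The 2 × 2 determinant used to solve for y and z equals the x-component of n1 × n2, which is one way to see why x = 0 is a good choice here.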
Projections
Assume we need to find the orthogonal projection S of the given point Q with the position
vector q to the line L given by parametric equation r + tv.
Note that S is the point of intersection of the line L and the plane P through Q perpendicular to v. This plane P is given by the equation (x − q) · v = 0 with unknown x.
Since S belongs to L, its position vector is r + tv for some t. Since it lies on the plane, we get
(r + tv − q) · v = 0.
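Expanding the dot product and solving this equation for t gives t = ((q − r) · v)/(v · v), and the projection is then S = r + tv. The Python sketch below applies this to the point Q = (1, 1, 1) and the line (1, 2, 3) + t(4, 5, 6); the function name project_onto_line is an assumption, and exact fractions are used so the result can be checked without rounding.

```python
from fractions import Fraction

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def project_onto_line(q, r, v):
    """Orthogonal projection of the point q onto the line r + t*v."""
    q_minus_r = [qi - ri for qi, ri in zip(q, r)]
    t = Fraction(dot(q_minus_r, v), dot(v, v))   # from (r + t v - q) . v = 0
    return tuple(ri + t * vi for ri, vi in zip(r, v)), t

q, r, v = (1, 1, 1), (1, 2, 3), (4, 5, 6)
s_point, t = project_onto_line(q, r, v)
# The vector from q to the projection should be perpendicular to v.
residual = dot([si - qi for si, qi in zip(s_point, q)], v)
print(s_point, t, residual)
```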
Example 1.27. Find the projection of the point Q = (1, 1, 1) to the line x = 1 + 4t, y = 2 + 5t, z = 3 + 6t.
Solution: The vector form of the parametric equation is (1, 2, 3) + t(4, 5, 6). Applying the
formula above, we get
Exercises
A
For Exercises 1–4, write the line L through the point P and parallel to the vector v in the
following forms: (a) vector, (b) parametric, and (c) symmetric.
For Exercises 5–6, write the line L through the points P1 and P2 in parametric form.
For Exercises 7–8, (a) find the distance d from the point P to the line L (b) find the orthogonal
projection of P to L
8. P = (0, 0, 0), L : x = 3 + 2 t, y = 4 + 3 t, z = 5 + 4 t.
For Exercises 9–10, find the point of intersection (if any) of the given lines.
9. x = 7 + 3 s, y = −4 − 3 s, z = −7 − 5 s and x = 1 + 6 t, y = 2 + t, z = 3 − 2 t;
10. $\dfrac{x - 6}{4} = y + 3 = z$ and $\dfrac{x - 11}{3} = \dfrac{y - 14}{-6} = \dfrac{z + 9}{2}$.
For Exercises 11–12, write the normal form of the plane P containing the point Q and per-
pendicular to the vector n.
11. Q = (5, 1, −2), n = (4, −4, 3); 12. Q = (6, −2, 0), n = (2, 6, 4).
For Exercises 13–14, write the normal form of the plane containing the given points.
13. (1, 0, 3), (1, 2, −1), (6, 1, 6); 14. (−3, 1, −3), (4, −4, 3), (0, 0, 1).
15. Write the normal form of the plane containing the lines from Exercise 9.
16. Write the normal form of the plane containing the lines from Exercise 10.
For Exercises 17–18, (a) find the distance D from the point Q to the plane P and (b) find the
projection of Q to the plane P
For Exercises 19–20, find the line of intersection (if any) of the given planes.
19. x + 3 y + 2 z − 6 = 0, 2 x − y + z + 2 = 0; 20. 3 x + y − 5 z = 0, x + 2 y + z + 4 = 0.
B
21. Find the point(s) of intersection (if any) of the line $\dfrac{x - 6}{4} = y + 3 = z$ with the plane x + 3y + 2z − 6 = 0. (Hint: Put the equations of the line into the equation of the plane.)
Definition 1.9. A sphere S is the set of all points ( x, y, z) in R3 which are a fixed distance r
(called the radius) from a fixed point P0 = ( x0 , y0 , z0 ) (called the center of the sphere):
\[
S = \{\, (x, y, z) : (x - x_0)^2 + (y - y_0)^2 + (z - z_0)^2 = r^2 \,\} . \tag{1.29}
\]
\[
S = \{\, \mathbf{x} : \| \mathbf{x} - \mathbf{x}_0 \| = r \,\} , \tag{1.30}
\]
[Figure 1.6.1: spheres in R3: (a) radius r, center (0, 0, 0); (b) radius r, center (x0, y0, z0)]
Note in Figure 1.6.1(a) that the intersection of the sphere with the x y-plane is a circle of
radius r (that is, a great circle, given by x2 + y2 = r 2 as a subset of R2 ). Similarly for the
intersections with the xz-plane and the yz-plane. In general, a plane intersects a sphere
either at a single point or in a circle.
8 See O’NEILL for a deeper and more rigorous discussion of surfaces.
Example 1.28. Find the intersection of the sphere x2 + y2 + z2 = 169 with the plane z = 12.
Solution: The sphere is centered at the origin and has radius 13 = √169, so it does intersect the plane z = 12. Putting z = 12 into the equation of the sphere gives
\[
x^2 + y^2 + 12^2 = 169 ,
\]
\[
x^2 + y^2 = 169 - 144 = 25 = 5^2 ,
\]
which is a circle of radius 5 centered at (0, 0, 12), parallel to the xy-plane (see Figure 1.6.2).
[Figure 1.6.2: the sphere and the plane z = 12]
If the equation in formula (1.29) is multiplied out, we get an equation of the form:
\[
x^2 + y^2 + z^2 + ax + by + cz + d = 0 \tag{1.31}
\]
for some constants a, b, c and d. Conversely, an equation of this form may describe a sphere, which can be determined by completing the square for the x, y and z variables.
Note that the equation (1.31) could be written as
\[
\| \mathbf{x} \|^2 + \mathbf{v} \cdot \mathbf{x} + d = 0 ,
\]
where x = (x, y, z) and v = (a, b, c).
Example 1.29. Is 2x² + 2y² + 2z² − 8x + 4y − 16z + 10 = 0 the equation of a sphere?
Solution: Dividing both sides of the equation by 2 gives
\[
\begin{gathered}
x^2 + y^2 + z^2 - 4x + 2y - 8z + 5 = 0 , \\
(x^2 - 4x + 4) + (y^2 + 2y + 1) + (z^2 - 8z + 16) + 5 - 4 - 1 - 16 = 0 , \\
(x - 2)^2 + (y + 1)^2 + (z - 4)^2 = 16 ,
\end{gathered}
\]
which is a sphere of radius 4 centered at (2, −1, 4).
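Completing the square in equation (1.31) always gives center (−a/2, −b/2, −c/2) and squared radius (a² + b² + c²)/4 − d, so the test for a sphere reduces to checking the sign of that last quantity. The Python sketch below encodes this; the function name sphere_from_general is an assumption made for illustration.

```python
def sphere_from_general(a, b, c, d):
    """Center and squared radius for x^2 + y^2 + z^2 + ax + by + cz + d = 0."""
    center = (-a / 2, -b / 2, -c / 2)
    # Positive: a sphere. Zero: a single point. Negative: no points at all.
    r_squared = (a * a + b * b + c * c) / 4 - d
    return center, r_squared

# Example 1.29 after dividing the equation by 2.
center, r_squared = sphere_from_general(-4, 2, -8, 5)
print(center, r_squared)
```

The equation must first be scaled so that the coefficients of x², y² and z² are all 1, as was done in Example 1.29.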
Example 1.30. Find the points(s) of intersection (if any) of the sphere from Example 1.29
and the line x = 3 + t, y = 1 + 2 t, z = 3 − t.
Solution: Put the equations of the line into the equation of the sphere, which was (x − 2)² + (y + 1)² + (z − 4)² = 16, and solve for t:
\[
\begin{gathered}
(3 + t - 2)^2 + (1 + 2t + 1)^2 + (3 - t - 4)^2 = 16 , \\
(t + 1)^2 + (2t + 2)^2 + (-t - 1)^2 = 16 , \\
6t^2 + 12t - 10 = 0 .
\end{gathered}
\]
The quadratic formula gives the solutions $t = -1 \pm \frac{4}{\sqrt{6}}$. Putting those two values into the equations of the line gives the following two points of intersection:
\[
\left( 2 + \frac{4}{\sqrt{6}},\ -1 + \frac{8}{\sqrt{6}},\ 4 - \frac{4}{\sqrt{6}} \right)
\quad \text{and} \quad
\left( 2 - \frac{4}{\sqrt{6}},\ -1 - \frac{8}{\sqrt{6}},\ 4 + \frac{4}{\sqrt{6}} \right) .
\]
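The substitution used in Example 1.30 always leads to a quadratic in t, so it can be sketched as a small Python routine. The function name line_sphere_intersections is an assumption; the line is given as a point r and a direction v.

```python
import math

def line_sphere_intersections(r, v, center, radius):
    """Points where the line r + t*v meets the sphere (zero, one or two points)."""
    w = [ri - ci for ri, ci in zip(r, center)]
    # Substituting the line into the sphere gives a*t^2 + b*t + c = 0.
    a = sum(vi * vi for vi in v)
    b = 2 * sum(vi * wi for vi, wi in zip(v, w))
    c = sum(wi * wi for wi in w) - radius ** 2
    disc = b * b - 4 * a * c
    if disc < 0:
        return []  # the line misses the sphere
    roots = sorted({(-b - math.sqrt(disc)) / (2 * a),
                    (-b + math.sqrt(disc)) / (2 * a)})
    return [tuple(ri + t * vi for ri, vi in zip(r, v)) for t in roots]

points = line_sphere_intersections((3, 1, 3), (1, 2, -1), (2, -1, 4), 4)
for p in points:
    print(p)
```

For the data of Example 1.30 the coefficients are a = 6, b = 12 and c = −10, matching the quadratic in the solution.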
1.6 Elementary surfaces 45
Example 1.31. Find the intersection (if any) of the spheres x² + y² + z² = 25 and x² + y² + (z − 2)² = 16.
Solution: For any point (x, y, z) on both spheres, we see that
\[
\begin{aligned}
x^2 + y^2 + z^2 = 25 \ &\Rightarrow\ x^2 + y^2 = 25 - z^2 , \text{ and} \\
x^2 + y^2 + (z - 2)^2 = 16 \ &\Rightarrow\ x^2 + y^2 = 16 - (z - 2)^2 , \text{ so} \\
16 - (z - 2)^2 = 25 - z^2 \ &\Rightarrow\ 4z - 4 = 9 \ \Rightarrow\ z = 13/4 \\
&\Rightarrow\ x^2 + y^2 = 25 - (13/4)^2 = 231/16 .
\end{aligned}
\]
∴ The intersection is the circle x² + y² = 231/16 in the plane z = 13/4. It has radius √231/4 ≈ 3.8 and is centered at (0, 0, 13/4).
The cylinders that we will consider are right circular cylinders. These are cylinders ob-
tained by moving a line L along a circle C in R3 in a way so that L is always perpendicular
to the plane containing C . We will only consider the cases where the plane containing C is
parallel to one of the three coordinate planes (see Figure 1.6.3).
[Figure 1.6.3: cylinders in R3: (a) x² + y² = r², any z; (b) x² + z² = r², any y; (c) y² + z² = r², any x]
For example, the equation of a cylinder whose base circle C lies in the xy-plane and is centered at (a, b, 0) and has radius r is
\[
(x - a)^2 + (y - b)^2 = r^2 ,
\]
where the value of the z coordinate is unrestricted. Similar equations can be written when
the base circle lies in one of the other coordinate planes. A plane intersects a right circular
cylinder in a circle, ellipse, or one or two lines, depending on whether that plane is parallel,
oblique9 , or perpendicular, respectively, to the plane containing C . The intersection of a
surface with a plane is called the trace of the surface.
9 That is, at an angle strictly between 0◦ and 90◦ .
for some constants A , B, . . . , J . If the above equation is not that of a sphere, cylinder, plane,
line or point, then the resulting surface is called a quadric surface.
One type of quadric surface is the ellipsoid, given by an equation of the form:
\[
\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1 . \tag{1.34}
\]
In the case where a = b = c, this is just a sphere. In general, an ellipsoid is egg-shaped (think of an ellipse rotated around its major axis). Its traces in the coordinate planes are ellipses.
[Figure 1.6.4 Ellipsoid]
Two other types of quadric surfaces are the hyperboloid of one sheet, given by an equation of the form:
\[
\frac{x^2}{a^2} + \frac{y^2}{b^2} - \frac{z^2}{c^2} = 1 \tag{1.35}
\]
and the hyperboloid of two sheets, whose equation has the form:
\[
\frac{x^2}{a^2} - \frac{y^2}{b^2} - \frac{z^2}{c^2} = 1 . \tag{1.36}
\]
[Figure 1.6.5 Hyperboloid of one sheet. Figure 1.6.6 Hyperboloid of two sheets.]
For the hyperboloid of one sheet, the trace in any plane parallel to the x y-plane is an
ellipse. The traces in the planes parallel to the xz- or yz-planes are hyperbolas (see Figure
1.6.5), except for the special cases x = ±a and y = ± b; in those planes the traces are pairs of
intersecting lines (see Exercise 8).
For the hyperboloid of two sheets, the trace in any plane parallel to the x y- or xz-plane is
a hyperbola (see Figure 1.6.6). There is no trace in the yz-plane. In any plane parallel to the
yz-plane for which | x | > | a |, the trace is an ellipse.
The elliptic paraboloid is another type of quadric surface, whose equation has the form:
\[
\frac{x^2}{a^2} + \frac{y^2}{b^2} = \frac{z}{c} . \tag{1.37}
\]
\[
\frac{x^2}{a^2} - \frac{y^2}{b^2} = \frac{z}{c} . \tag{1.38}
\]
The hyperbolic paraboloid can be tricky to draw; using graphing software on a computer
can make it easier. For example, Figure 1.6.8 was created using the free Gnuplot package.
It shows the graph of the hyperbolic paraboloid z = y2 − x2 , which is the special case where
a = b = 1 and c = −1 in equation (1.38). The mesh lines on the surface are the traces in
planes parallel to the coordinate planes. So we see that the traces in planes parallel to the
xz-plane are parabolas pointing upward, while the traces in planes parallel to the yz-plane
are parabolas pointing downward. Also, notice that the traces in planes parallel to the x y-
plane are hyperbolas, though in the x y-plane itself the trace is a pair of intersecting lines
through the origin. This is true in general when c < 0 in equation (1.38). When c > 0, the
surface would be similar to that in Figure 1.6.8, only rotated 90◦ around the z-axis and the
nature of the traces in planes parallel to the xz- or yz-planes would be reversed.
[Figure 1.6.8: Gnuplot graph of the hyperbolic paraboloid z = y² − x², with axes x, y, z]
\[
\frac{x^2}{a^2} + \frac{y^2}{b^2} - \frac{z^2}{c^2} = 0 . \tag{1.39}
\]
The traces in planes parallel to the xy-plane are ellipses, except in the xy-plane itself where the trace is a single point. The traces in planes parallel to the xz- or yz-planes are hyperbolas, except in the xz- and yz-planes themselves where the traces are pairs of intersecting lines.
[Figure 1.6.9 Elliptic cone]
Notice that every point on the elliptic cone is on a line which lies entirely on the surface; in Figure 1.6.9 these lines all go through the origin. This makes the elliptic cone an example of a ruled surface. The cylinder is also a ruled surface.
What may not be as obvious is that both the hyperboloid of one sheet and the hyperbolic
paraboloid are ruled surfaces. In fact, on both surfaces there are two lines through each
point on the surface (see Exercises 11–12). Such surfaces are called doubly ruled surfaces,
and the pairs of lines are called a regulus.
It is clear that for each of the six types of quadric surfaces that we discussed, the surface
can be translated away from the origin (say, by replacing x2 by ( x − x0 )2 in its equation). It can
be proved11 that every quadric surface can be translated and/or rotated so that its equation
matches one of the six types that we described.
For example, z = kxy is a case of equation (1.33) with “mixed” variables; namely D ≠ 0, so that we get an xy term. This equation does not match any of the types we considered. However, by rotating the x- and y-axes by 45° in the xy-plane by means of the coordinate transformation x = (x′ − y′)/√2, y = (x′ + y′)/√2, z = z′, then z = kxy becomes the hyperbolic paraboloid z′ = (k/2)(x′² − y′²) in the (x′, y′, z′) coordinate system. That is, z = kxy describes a hyperbolic paraboloid as in equation (1.38), but rotated 45° in the xy-plane.
Exercises
A
For Exercises 1–4, determine if the given equation describes a sphere. If so, find its radius
and center.
1. x2 + y2 + z2 − 4 x − 6 y − 10 z + 37 = 0; 2. x2 + y2 + z2 + 2 x − 2 y − 8 z + 19 = 0;
3. 2 x2 + 2 y2 + 2 z2 + 4 x + 4 y + 4 z − 44 = 0; 4. x2 + y2 − z2 + 12 x + 2 y − 4 z + 32 = 0.
5. Find the point(s) of intersection of the sphere ( x − 3)2 + ( y + 1)2 + ( z − 3)2 = 9 and the line
x = −1 + 2 t , y = −2 − 3 t , z = 3 + t .
8. Find the trace of the hyperboloid of one sheet $\frac{x^2}{a^2} + \frac{y^2}{b^2} - \frac{z^2}{c^2} = 1$ in the plane x = a, and the trace in the plane y = b.
9. Find the trace of the hyperbolic paraboloid $\frac{x^2}{a^2} - \frac{y^2}{b^2} = \frac{z}{c}$ in the xy-plane.
10. It can be shown that any four noncoplanar points (that is, points that do not lie in the
same plane) determine a sphere.12 Find the equation of the sphere that passes through
the points (0, 0, 0), (0, 0, 2), (1, −4, 3) and (0, −1, 3). (Hint: Equation (1.31))
11 See Ch. 7 in POGORELOV.
12 See WELCHONS and KRICKENBERGER, p. 160, for a proof.
11. Show that the hyperboloid of one sheet is a doubly ruled surface; that is, each point on
the surface is on two lines lying entirely on the surface. (Hint: Write equation (1.35) as $\frac{x^2}{a^2} - \frac{z^2}{c^2} = 1 - \frac{y^2}{b^2}$, factor each side. Recall that two planes intersect in a line.)
12. Show that the hyperbolic paraboloid is a doubly ruled surface. (Hint: Exercise 11)
13. Let S be the sphere with radius 1 centered at (0, 0, 1), and let S∗ be S without the “north pole” point (0, 0, 2). Let (a, b, c) be an arbitrary point on S∗. Then the line passing through (0, 0, 2) and (a, b, c) intersects the xy-plane at some point (x, y, 0), as in Figure 1.6.10. Find this point (x, y, 0) in terms of a, b and c.
(Note: Every point in the xy-plane can be matched with a point on S∗, and vice versa, in this manner. This method is called stereographic projection, which essentially identifies all of R2 with a “punctured” sphere.)
[Figure 1.6.10: the sphere S with the points (0, 0, 2), (a, b, c) and (x, y, 0)]
14. Given two points P and Q in space, consider the set of points X such that the distance from X to P is twice the distance from X to Q. Show that this set is a sphere. Find its radius and center if P = (1, 2, 3) and Q = (2, 4, 5).
15. Show that the set of points equidistant from a plane and a point not on the plane is an elliptic paraboloid. (Hint: Use the coordinate system with the given plane as the xy-plane.)
1.7 Curvilinear Coordinates 51
For this reason, physicists usually switch the definitions of θ and φ to make (ρ , θ , φ) a right-handed system.
Example 1.32. Convert the point (−2, −2, 1) from Cartesian coordinates to (a) cylindrical
and (b) spherical coordinates.
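Solution: (a) $r = \sqrt{(-2)^2 + (-2)^2} = 2\sqrt{2}$, $\theta = \tan^{-1}\left(\frac{-2}{-2}\right) = \tan^{-1}(1) = \frac{5\pi}{4}$, since y = −2 < 0.
∴ $(r, \theta, z) = \left( 2\sqrt{2}, \frac{5\pi}{4}, 1 \right)$.
(b) $\rho = \sqrt{(-2)^2 + (-2)^2 + 1^2} = \sqrt{9} = 3$, $\phi = \cos^{-1}\left(\frac{1}{3}\right) \approx 1.23$ radians.
∴ $(\rho, \theta, \phi) = \left( 3, \frac{5\pi}{4}, 1.23 \right)$.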
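The conversions in Example 1.32 can be sketched in Python. The sketch assumes the conventions that the example's answers imply, namely θ measured in [0, 2π) and φ measured from the positive z-axis; the function names to_cylindrical and to_spherical are illustrative. Using atan2 avoids the separate quadrant check that tan⁻¹ requires.

```python
import math

def to_cylindrical(x, y, z):
    """Cartesian (x, y, z) to cylindrical (r, theta, z), theta in [0, 2*pi)."""
    r = math.hypot(x, y)
    theta = math.atan2(y, x) % (2 * math.pi)  # atan2 picks the correct quadrant
    return r, theta, z

def to_spherical(x, y, z):
    """Cartesian (x, y, z) to spherical (rho, theta, phi); assumes rho > 0."""
    rho = math.sqrt(x * x + y * y + z * z)
    theta = math.atan2(y, x) % (2 * math.pi)
    phi = math.acos(z / rho)                  # angle from the positive z-axis
    return rho, theta, phi

r, theta, z = to_cylindrical(-2, -2, 1)
rho, theta_s, phi = to_spherical(-2, -2, 1)
print(r, theta, z)
print(rho, theta_s, phi)
```

At the origin ρ = 0 and the angles are not determined, so to_spherical would need a special case there.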
For cylindrical coordinates ( r, θ , z), and constants r 0 , θ0 and z0 , we see from Figure 1.7.4
that the surface r = r 0 is a cylinder of radius r 0 centered along the z-axis, the surface θ = θ0
is a half-plane emanating from the z-axis, and the surface z = z0 is a plane parallel to the
x y-plane.
[Figure 1.7.4: cylindrical coordinate surfaces: (a) r = r0, (b) θ = θ0, (c) z = z0]
For spherical coordinates (ρ , θ , φ), and constants ρ 0 , θ0 and φ0 , we see from Figure 1.7.5
that the surface ρ = ρ 0 is a sphere of radius ρ 0 centered at the origin, the surface θ = θ0 is a
half-plane emanating from the z-axis, and the surface φ = φ0 is a circular cone whose vertex
is at the origin.
Figures 1.7.4(a) and 1.7.5(a) show how these coordinate systems got their names.
Sometimes the equation of a surface in Cartesian coordinates can be transformed into a
simpler equation in some other coordinate system, as in the following example.
Using spherical coordinates to write the equation of a sphere does not necessarily make
the equation simpler, if the sphere is not centered at the origin.
[Figure 1.7.5: spherical coordinate surfaces: (a) ρ = ρ0, (b) θ = θ0, (c) φ = φ0]
x² + y² + z² − 4x − 2y + 5 = 9, so we get
ρ² − 4ρ sin φ cos θ − 2ρ sin φ sin θ − 4 = 0, or
ρ² − 2 sin φ (2 cos θ + sin θ) ρ − 4 = 0 after combining terms.
Note that this actually makes it more difficult to figure out what the surface is, as opposed
to the Cartesian equation where you could immediately identify the surface as a sphere of
radius 3 centered at (2, 1, 0).
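One way to gain confidence in a converted equation is to sample points on the Cartesian surface and substitute their spherical coordinates. The Python sketch below does this for the sphere of radius 3 centered at (2, 1, 0), using the uncombined form ρ² − 4ρ sin φ cos θ − 2ρ sin φ sin θ − 4 = 0; the function name and the sampling grid are assumptions made for illustration.

```python
import math

def spherical_equation_residual(x, y, z):
    """Left side of rho^2 - 4 rho sin(phi) cos(theta) - 2 rho sin(phi) sin(theta) - 4."""
    rho = math.sqrt(x * x + y * y + z * z)
    theta = math.atan2(y, x)
    phi = math.acos(z / rho)
    return (rho ** 2
            - 4 * rho * math.sin(phi) * math.cos(theta)
            - 2 * rho * math.sin(phi) * math.sin(theta)
            - 4)

# Sample points on the sphere (x - 2)^2 + (y - 1)^2 + z^2 = 9.
samples = []
for i in range(1, 6):
    for j in range(8):
        a, b = i * math.pi / 6, j * math.pi / 4
        samples.append((2 + 3 * math.sin(a) * math.cos(b),
                        1 + 3 * math.sin(a) * math.sin(b),
                        3 * math.cos(a)))

max_residual = max(abs(spherical_equation_residual(*p)) for p in samples)
print(max_residual)
```

The residual should vanish up to floating-point rounding at every sample, while a point off the sphere gives a clearly nonzero value.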
For Exercises 5–7, write the given equation in (a) cylindrical and (b) spherical coordinates.
5. x2 + y2 + z2 = 25; 6. x2 + y2 = 2 y; 7. x2 + y2 + 9 z2 = 36.
B
8. Describe the intersection of the surfaces whose equations in spherical coordinates are
θ = π/2 and φ = π/4.
9. Show that for a ≠ 0, the equation ρ = 2a sin φ cos θ in spherical coordinates describes a sphere centered at (a, 0, 0) with radius |a|.
C
10. Let P = (a, θ , φ) be a point in spherical coordinates, with a > 0 and 0 < φ < π. Then P
lies on the sphere ρ = a. Since 0 < φ < π, the line segment from the origin to P can be
extended to intersect the cylinder given by r = a (in cylindrical coordinates). Find the
cylindrical coordinates of that point of intersection.
11. Let P1 and P2 be points whose spherical coordinates are (ρ 1 , θ1 , φ1 ) and (ρ 2 , θ2 , φ2 ), respec-
tively. Let v1 be the vector from the origin to P1 , and let v2 be the vector from the origin
to P2 . For the angle γ between v1 and v2 , show that
This formula is used in electrodynamics to prove the addition theorem for spherical har-
monics, which provides a general expression for the electrostatic potential at a point due
to a unit charge. See pp. 100–102 in JACKSON.
12. Show that the distance d between the points P1 and P2 with cylindrical coordinates (r1, θ1, z1) and (r2, θ2, z2), respectively, is
\[
d = \sqrt{ r_1^2 + r_2^2 - 2 r_1 r_2 \cos(\theta_2 - \theta_1) + (z_2 - z_1)^2 } .
\]
13. Show that the distance d between the points P1 and P2 with spherical coordinates (ρ1, θ1, φ1) and (ρ2, θ2, φ2), respectively, is
\[
d = \sqrt{ \rho_1^2 + \rho_2^2 - 2 \rho_1 \rho_2 \left[ \sin\phi_1 \sin\phi_2 \cos(\theta_2 - \theta_1) + \cos\phi_1 \cos\phi_2 \right] } .
\]
2 Curves
for some real-valued functions f 1 ( t), f 2 ( t), f 3 ( t), called the component functions of f. The first
form is often used when emphasizing that f( t) is a vector, and the second form is useful when
considering just the terminal points of the vectors.
Example 2.1. Define f : R → R3 by f(t) = (cos t, sin t, t). This is a parametric equation of a helix (see Figure 2.1.1). As the value of t increases, the terminal point of f(t) spirals upward. For each t, the x- and y-coordinates of f(t) are x = cos t and y = sin t, so
\[
x^2 + y^2 = \cos^2 t + \sin^2 t = 1 .
\]
Thus, f(t) lies on the surface of the right circular cylinder x² + y² = 1 for any t.
[Figure 2.1.1: the helix, with the points f(0) and f(2π) marked]
Since each of the three component functions is real-valued, it will sometimes be the case
that results from single-variable calculus can simply be applied to each of the component
functions to yield a similar result for the vector-valued function. However, there are times
when such generalizations do not hold (see Exercise 13). The concept of a limit, though, can
be extended naturally to vector-valued functions, as in the following definition.
2.1 Vector-Valued Functions 57
Definition 2.2. Let f(t) be a vector-valued function, let a be a real number and let c be a vector. Then we say that the limit of f(t) as t approaches a equals c, written as $\lim_{t \to a} \mathbf{f}(t) = \mathbf{c}$, if $\lim_{t \to a} \| \mathbf{f}(t) - \mathbf{c} \| = 0$.
Equivalently, if f(t) = (f1(t), f2(t), f3(t)), then
\[
\lim_{t \to a} \mathbf{f}(t) = \left( \lim_{t \to a} f_1(t),\ \lim_{t \to a} f_2(t),\ \lim_{t \to a} f_3(t) \right) ,
\]
The above definition shows that continuity and the derivative of a vector-valued function
can also be defined in terms of its component functions.
Definition 2.3. Let $\mathbf{f}(t) = (f_1(t), f_2(t), f_3(t))$ be a vector-valued function, and let a be a real number in its domain. Then f(t) is continuous at a if $\lim_{t \to a} \mathbf{f}(t) = \mathbf{f}(a)$. Equivalently, f(t) is continuous at a if and only if $f_1(t)$, $f_2(t)$, and $f_3(t)$ are continuous at a.
The derivative of f(t) at a, denoted by $\mathbf{f}'(a)$ or $\frac{d\mathbf{f}}{dt}(a)$, is the limit
\[ \mathbf{f}'(a) = \lim_{h \to 0} \frac{\mathbf{f}(a + h) - \mathbf{f}(a)}{h} \]
if that limit exists. Equivalently, $\mathbf{f}'(a) = (f_1'(a), f_2'(a), f_3'(a))$, if the component derivatives exist.
We say that f(t) is differentiable at a if f′(a) exists.
Example 2.2. Let f(t) = (cos t, sin t, t). Then f′(t) = (− sin t, cos t, 1) for all t. The tangent line L to the curve at f(2π) = (1, 0, 2π) is L = f(2π) + s f′(2π) = (1, 0, 2π) + s(0, 1, 1), or in parametric form: x = 1, y = s, z = 2π + s for −∞ < s < ∞.
Figure 2.1.2 Tangent vector f′(a) and tangent line L = f(a) + s f′(a).
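The componentwise description of the derivative can be checked numerically. The following is a minimal sketch, not part of the original text: the helper names `derivative` and `tangent_line` are hypothetical, and a central difference quotient is used in place of the one-sided limit in the definition.

```python
import math

def f(t):
    """Helix f(t) = (cos t, sin t, t) from Example 2.2."""
    return (math.cos(t), math.sin(t), t)

def derivative(func, a, h=1e-6):
    """Approximate f'(a) by a central difference quotient in each component."""
    fwd, bwd = func(a + h), func(a - h)
    return tuple((p - q) / (2 * h) for p, q in zip(fwd, bwd))

def tangent_line(func, a, s):
    """Point f(a) + s f'(a) on the tangent line at f(a)."""
    return tuple(p + s * d for p, d in zip(func(a), derivative(func, a)))

a = 2 * math.pi
print(derivative(f, a))        # should be close to (0, 1, 1)
print(tangent_line(f, a, 1))   # should be close to (1, 1, 2*pi + 1)
```

The approximation agrees with f′(2π) = (0, 1, 1) only up to a small numerical error that depends on the step size h.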
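(a) $\frac{d}{dt}(\mathbf{c}) = \mathbf{0}$;
(b) $\frac{d}{dt}(k\mathbf{f}) = k\,\frac{d\mathbf{f}}{dt}$;
(c) $\frac{d}{dt}(\mathbf{f} + \mathbf{g}) = \frac{d\mathbf{f}}{dt} + \frac{d\mathbf{g}}{dt}$;
(d) $\frac{d}{dt}(\mathbf{f} - \mathbf{g}) = \frac{d\mathbf{f}}{dt} - \frac{d\mathbf{g}}{dt}$;
(e) $\frac{d}{dt}(u\,\mathbf{f}) = \frac{du}{dt}\,\mathbf{f} + u\,\frac{d\mathbf{f}}{dt}$;
(f) $\frac{d}{dt}(\mathbf{f} \cdot \mathbf{g}) = \frac{d\mathbf{f}}{dt} \cdot \mathbf{g} + \mathbf{f} \cdot \frac{d\mathbf{g}}{dt}$;
(g) $\frac{d}{dt}(\mathbf{f} \times \mathbf{g}) = \frac{d\mathbf{f}}{dt} \times \mathbf{g} + \mathbf{f} \times \frac{d\mathbf{g}}{dt}$.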
Proof: The proofs of parts (a)–(e) follow easily by differentiating the component functions
and using the rules for derivatives from single-variable calculus. We will prove part (f),
and leave the proof of part (g) as an exercise for the reader.
(f) Write $\mathbf{f}(t) = (f_1(t), f_2(t), f_3(t))$ and $\mathbf{g}(t) = (g_1(t), g_2(t), g_3(t))$, where the component functions $f_1(t), f_2(t), f_3(t), g_1(t), g_2(t), g_3(t)$ are all differentiable real-valued functions. Then
\begin{align*}
\frac{d}{dt}(\mathbf{f}(t) \cdot \mathbf{g}(t)) &= \frac{d}{dt}\big(f_1(t) g_1(t) + f_2(t) g_2(t) + f_3(t) g_3(t)\big) \\
&= \frac{d}{dt}(f_1(t) g_1(t)) + \frac{d}{dt}(f_2(t) g_2(t)) + \frac{d}{dt}(f_3(t) g_3(t)) \\
&= \frac{df_1}{dt}(t)\, g_1(t) + f_1(t)\, \frac{dg_1}{dt}(t) + \frac{df_2}{dt}(t)\, g_2(t) + f_2(t)\, \frac{dg_2}{dt}(t) + \frac{df_3}{dt}(t)\, g_3(t) + f_3(t)\, \frac{dg_3}{dt}(t) \\
&= \Big(\frac{df_1}{dt}(t), \frac{df_2}{dt}(t), \frac{df_3}{dt}(t)\Big) \cdot (g_1(t), g_2(t), g_3(t)) \\
&\quad + (f_1(t), f_2(t), f_3(t)) \cdot \Big(\frac{dg_1}{dt}(t), \frac{dg_2}{dt}(t), \frac{dg_3}{dt}(t)\Big) \\
&= \frac{d\mathbf{f}}{dt}(t) \cdot \mathbf{g}(t) + \mathbf{f}(t) \cdot \frac{d\mathbf{g}}{dt}(t) \quad \text{for all } t.
\end{align*}
QED
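Parts (f) and (g) can also be checked numerically for a particular pair of curves. The sketch below is only an illustration, not a proof: the curves f and g are arbitrary choices made here, the helper names are hypothetical, and derivatives are approximated by central differences.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def add(u, v):
    return tuple(a + b for a, b in zip(u, v))

def diff(func, t, h=1e-5):
    """Central difference; handles scalar-valued and tuple-valued functions."""
    p, q = func(t + h), func(t - h)
    if isinstance(p, tuple):
        return tuple((a - b) / (2 * h) for a, b in zip(p, q))
    return (p - q) / (2 * h)

# Sample curves chosen only for this check (an assumption, not from the text).
def f(t):
    return (math.cos(t), math.sin(t), t)

def g(t):
    return (t, t ** 2, t ** 3)

t = 0.7
# Rule (f): derivative of the dot product.
lhs_dot = diff(lambda s: dot(f(s), g(s)), t)
rhs_dot = dot(diff(f, t), g(t)) + dot(f(t), diff(g, t))
# Rule (g): derivative of the cross product.
lhs_cross = diff(lambda s: cross(f(s), g(s)), t)
rhs_cross = add(cross(diff(f, t), g(t)), cross(f(t), diff(g, t)))

print(abs(lhs_dot - rhs_dot))                                   # small
print(max(abs(p - q) for p, q in zip(lhs_cross, rhs_cross)))    # small
```

Note that the order of the factors matters in rule (g), since the cross product is not commutative.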
We know that $\|\mathbf{f}(t)\|$ is constant if and only if $\frac{d}{dt}\|\mathbf{f}(t)\| = 0$ for all t. Also, f(t) ⊥ f′(t) if and only if f′(t) · f(t) = 0. Thus, the above example shows this important fact:
This means that if a curve lies completely on a sphere (or circle) centered at the origin, then
the tangent vector f′ ( t) is always perpendicular to the position vector f( t).
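\[ \mathbf{f}''(t) = \frac{d}{dt}\mathbf{f}'(t), \quad \mathbf{f}'''(t) = \frac{d}{dt}\mathbf{f}''(t), \quad \ldots, \quad \frac{d^n\mathbf{f}}{dt^n} = \frac{d}{dt}\Big(\frac{d^{n-1}\mathbf{f}}{dt^{n-1}}\Big) \quad (\text{for } n = 2, 3, 4, \ldots). \]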
We can use vector-valued functions to represent physical quantities, such as velocity, ac-
celeration, force, momentum, etc. For example, let the real variable t represent time elapsed
from some initial time ( t = 0), and suppose that an object of constant mass m is subjected
to some force so that it moves in space, with its position ( x, y, z) at time t a function of
t. That is, x = x( t), y = y( t), z = z( t) for some real-valued functions x( t), y( t), z( t). Call
r( t) = ( x( t), y( t), z( t)) the position vector of the object. We can define various physical quan-
tities associated with the object as follows:1
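acceleration: $\mathbf{a}(t) = \dot{\mathbf{v}}(t) = \mathbf{v}'(t) = \frac{d\mathbf{v}}{dt} = \ddot{\mathbf{r}}(t) = \mathbf{r}''(t) = \frac{d^2\mathbf{r}}{dt^2} = (x''(t), y''(t), z''(t))$;
momentum: $\mathbf{p}(t) = m\mathbf{v}(t)$;
force: $\mathbf{F}(t) = \dot{\mathbf{p}}(t) = \mathbf{p}'(t) = \frac{d\mathbf{p}}{dt}$ (Newton’s Second Law of Motion).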
The magnitude $\|\mathbf{v}(t)\|$ of the velocity vector is called the speed of the object. Note that since
the mass m is a constant, the force equation becomes the familiar F(t) = ma(t).
Example 2.5. Let r( t) = (5 cos t, 3 sin t, 4 sin t) be the position vector of an object at time t ≥ 0.
Find its (a) velocity and (b) acceleration vectors.
Solution: (a) v( t) = ṙ( t) = (−5 sin t, 3 cos t, 4 cos t).
(b) a( t) = v̇( t) = (−5 cos t, −3 sin t, −4 sin t).
Note that $\|\mathbf{r}(t)\| = \sqrt{25\cos^2 t + 25\sin^2 t} = 5$ for all t, so by Example 2.3 we know that r(t) · ṙ(t) = 0 for all t (which we can verify from part (a)). In fact, $\|\mathbf{v}(t)\| = 5$ for all t also. And not only does r(t) lie on the sphere of radius 5 centered at the origin, but perhaps not so obvious is that it lies completely within a circle of radius 5 centered at the origin. Also, note that a(t) = −r(t). It turns out (see Exercise 17) that whenever an object moves in a circle with constant speed, the acceleration vector will point towards the center of the circle.
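The claims in Example 2.5 are easy to test numerically. The sketch below is a sanity check under stated assumptions, not part of the original solution: the helper names are hypothetical, and the normal vector n = (0, 4, −3) is our own choice, picked because 4(3 sin t) − 3(4 sin t) = 0, which would place the curve in a plane through the origin.

```python
import math

def r(t):
    return (5 * math.cos(t), 3 * math.sin(t), 4 * math.sin(t))

def v(t):
    return (-5 * math.sin(t), 3 * math.cos(t), 4 * math.cos(t))

def a(t):
    return (-5 * math.cos(t), -3 * math.sin(t), -4 * math.sin(t))

def dot(u, w):
    return sum(x * y for x, y in zip(u, w))

def norm(u):
    return math.sqrt(dot(u, u))

def report(t):
    """Return (||r||, ||v||, r.v, max|a + r|, r.n) at time t."""
    n = (0.0, 4.0, -3.0)  # assumed normal of the plane containing the curve
    return (norm(r(t)), norm(v(t)), dot(r(t), v(t)),
            max(abs(p + q) for p, q in zip(a(t), r(t))), dot(r(t), n))

for t in (0.0, 1.0, 2.5):
    print(report(t))  # expect roughly (5, 5, 0, 0, 0) each time
```

A curve that lies both on a sphere centered at the origin and in a plane through the origin lies on a great circle of that sphere, which is consistent with the radius 5 stated in the example.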
Recall from Section 1.5 that if r1 , r2 are position vectors to distinct points then r1 + t(r2 − r1 )
represents a line through those two points as t varies over all real numbers. That vector
sum can be written as (1 − t)r1 + tr2 . So the function l( t) = (1 − t)r1 + tr2 is a line through
the terminal points of r1 and r2 , and when t is restricted to the interval [0, 1] it is the line
segment between the points, with l(0) = r1 and l(1) = r2 .
In general, a function of the form f( t) = (a 1 t + b 1 , a 2 t + b 2 , a 3 t + b 3 ) represents a line in R3 . A
function of the form f( t) = (a 1 t2 + b 1 t + c 1 , a 2 t2 + b 2 t + c 2 , a 3 t2 + b 3 t + c 3 ) represents a (possibly
degenerate) parabola in R3 .
Example 2.6. Bézier curves are used in Computer Aided Design to approximate the shape of
a polygonal path in space (called the Bézier polygon or control polygon). For instance, given
three points (or position vectors) b0 , b1 , b2 in R3 , define
for all real t. For t in the interval [0, 1], we see that $\mathbf{b}_0^1(t)$ is the line segment between b0 and b1, and $\mathbf{b}_1^1(t)$ is the line segment between b1 and b2. The function $\mathbf{b}_0^2(t)$ is the Bézier curve for the points b0, b1, b2. Note from the last formula that the curve is a parabola that goes through b0 (when t = 0) and b2 (when t = 1).
As an example, let b0 = (0, 0, 0), b1 = (1, 2, 3), and b2 = (4, 5, 2). Then the explicit formula for the Bézier curve is $\mathbf{b}_0^2(t) = (2t + 2t^2,\ 4t + t^2,\ 6t - 4t^2)$, as shown in Figure 2.1.4, where the line segments are $\mathbf{b}_0^1(t)$ and $\mathbf{b}_1^1(t)$, and the curve is $\mathbf{b}_0^2(t)$.
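The construction can be sketched in code. The defining formulas are not reproduced in this excerpt, so the sketch assumes the standard repeated linear interpolation (de Casteljau) scheme, in which each new point is (1 − t) times one point plus t times the next. The function names `lerp`, `bezier2` and `explicit` are hypothetical. Under that assumption the result agrees with the explicit formula quoted above for the sample points.

```python
def lerp(p, q, t):
    """Point (1 - t) p + t q on the line through p and q."""
    return tuple((1 - t) * a + t * b for a, b in zip(p, q))

def bezier2(b0, b1, b2, t):
    """Quadratic Bezier point, built by two rounds of linear interpolation."""
    b10 = lerp(b0, b1, t)   # point on the segment from b0 to b1
    b11 = lerp(b1, b2, t)   # point on the segment from b1 to b2
    return lerp(b10, b11, t)

def explicit(t):
    """Explicit formula from the text for b0=(0,0,0), b1=(1,2,3), b2=(4,5,2)."""
    return (2 * t + 2 * t ** 2, 4 * t + t ** 2, 6 * t - 4 * t ** 2)

b0, b1, b2 = (0, 0, 0), (1, 2, 3), (4, 5, 2)
for t in (0.0, 0.25, 0.5, 1.0):
    print(t, bezier2(b0, b1, b2, t), explicit(t))
```

At t = 0 and t = 1 the curve passes through b0 and b2, while b1 in general only pulls the curve towards itself.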
Figure 2.1.4 (Bézier curve with control points (0, 0, 0), (1, 2, 3), (4, 5, 2); axes x, y, z)
Example 2.7. The pedal curve is traced by the orthogonal projection of a fixed point P on
the tangent lines of a given curve f(t).
Write a parametric expression h(t) for the pedal curve of the unit circle f(t) = (cos t, sin t)
with respect to the point P = (1, 0), whose position vector is i. (This curve is called a cardioid.)
2 See pp. 27–30 in FARIN.
w( t) = (f′ ( t) · v( t)) f′ ( t)
= (f′ ( t) · (i − f( t))) f′ ( t)
= (sin2 t, − sin t cos t).
and
h( t) = f( t) + w( t)
= (cos t + sin2 t, sin t − sin t cos t).
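The projection step can be written for a general curve and then compared with the closed form h(t) above. This is a minimal sketch with hypothetical helper names; it projects P − f(t) onto the tangent direction f′(t), dividing by f′(t) · f′(t) so that it also works when the tangent vector is not a unit vector.

```python
import math

def pedal_point(f, df, P, t):
    """Orthogonal projection of the point P onto the tangent line of f at f(t)."""
    p, d = f(t), df(t)
    dd = sum(x * x for x in d)                                  # f'(t) . f'(t)
    coeff = sum(x * (a - b) for x, a, b in zip(d, P, p)) / dd   # f'(t).(P - f(t)) / f'.f'
    return tuple(b + coeff * x for b, x in zip(p, d))

def f(t):
    return (math.cos(t), math.sin(t))

def df(t):
    return (-math.sin(t), math.cos(t))

def h(t):
    """Closed form obtained in Example 2.7."""
    return (math.cos(t) + math.sin(t) ** 2,
            math.sin(t) - math.sin(t) * math.cos(t))

P = (1.0, 0.0)
for t in (0.0, 1.0, 2.0, 4.0):
    print(pedal_point(f, df, P, t), h(t))
```

For the unit circle f′(t) is already a unit vector, so the division by f′(t) · f′(t) has no effect there.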
Exercises
A
For Exercises 1–4, calculate f′ ( t) and find the tangent line at f(0).
1. $\mathbf{f}(t) = (t + 1,\ t^2 + 1,\ t^3 + 1)$; 2. $\mathbf{f}(t) = (e^t + 1,\ e^{2t} + 1,\ e^{t^2} + 1)$;
For Exercises 5–6, find the velocity v( t) and acceleration a( t) of an object with the given
position vector r( t).
B
7. Let $\mathbf{f}(t) = \Big( \frac{\cos t}{\sqrt{1 + a^2 t^2}},\ \frac{\sin t}{\sqrt{1 + a^2 t^2}},\ \frac{-at}{\sqrt{1 + a^2 t^2}} \Big)$, with $a \ne 0$.
(a) Show that $\|\mathbf{f}(t)\| = 1$ for all t.
(b) Show directly that f′ ( t) · f( t) = 0 for all t.
8. If f′ ( t) = 0 for all t in some interval (a, b), show that f( t) is a constant vector in (a, b).
(c) Compare f′ (0) and g′ (0). Given your answer to part (a), how do you explain the differ-
ence in the two derivatives?
14. Write a parametric equation for the pedal curve of $\mathbf{f}(t) = (t, t^2, t^3)$ with respect to the
origin.
C
15. The Bézier curve b30 ( t) for four noncollinear points b0 , b1 , b2 , b3 in R3 is defined by the
following algorithm (going from the left column to the right):
16. Let r( t) be the position vector for a particle moving in R3 , v( t) be its velocity and a( t) be
its acceleration. Show that
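\[ \frac{d}{dt}\big(\mathbf{r} \times (\mathbf{v} \times \mathbf{r})\big) = \|\mathbf{r}\|^2\, \mathbf{a} + (\mathbf{r} \cdot \mathbf{v})\, \mathbf{v} - (\|\mathbf{v}\|^2 + \mathbf{r} \cdot \mathbf{a})\, \mathbf{r}. \]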
dt
17. Let r( t) be the position vector in R3 for a particle that moves with constant speed c > 0 in
a circle of radius a > 0 centered at the origin in the x y-plane. Show that its acceleration
a( t) points in the opposite direction as r( t) for all t. (Hint: Use Example 2.3 to show that
r( t) ⊥ v( t) and a( t) ⊥ v( t), and hence a( t) ∥ r( t).)
19. Show that there is no plane which is tangent3 to the curve f( t) = ( t, t2 , t3 ) at two distinct
points.
[Figure: curve with labeled points (0, 0, 0), (0, 1, 1), (2, 3, 0), (4, 5, 2); axes x, y, z.]
If f(t) = (x(t), y(t), z(t)) is the position vector of an object moving in $\mathbb{R}^3$, then its speed at time t is $\|\mathbf{f}'(t)\|$, that is, the magnitude of the velocity vector. Therefore it seems natural to define the distance s traveled by the object from time t = a to time t = b as the definite integral of its speed over that time interval:
\[ s = \int_a^b \|\mathbf{f}'(t)\|\, dt. \qquad (2.1) \]
Example 2.8. Find the length L of the helix f( t) = (cos t, sin t, t) from t = 0 to t = 2π.
Solution: By formula (2.1), we have
Notice that the set traced out by the curve f( t) = (cos t, sin t, t) from Example 2.8 is also
traced out by the function g( t) = (cos 2 t, sin 2 t, 2 t). For example, over the interval [0, π], g( t)
traces out the same section of the curve as f( t) does over the interval [0, 2π]. Intuitively,
this says that g(t) traces the curve twice as fast as f(t). This makes sense since, viewing the functions as position vectors and their derivatives as velocity vectors, the speeds of f(t) and g(t) are $\|\mathbf{f}'(t)\| = \sqrt{2}$ and $\|\mathbf{g}'(t)\| = 2\sqrt{2}$, respectively. We say that g(t) is a reparametrization of the curve f(t).
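The arc length integral can be approximated numerically, which gives a quick way to see that f(t) on [0, 2π] and g(t) on [0, π] trace a curve of the same length. The sketch below uses Simpson's rule, a numerical technique chosen here for illustration (the function name `arc_length` is hypothetical). Both values should come out close to 2√2 π, the length that also appears as the endpoint of the s-interval in Example 2.10.

```python
import math

def arc_length(df, a, b, n=1000):
    """Approximate the integral of ||f'(t)|| over [a, b] by Simpson's rule (n even)."""
    def speed(t):
        return math.sqrt(sum(c * c for c in df(t)))
    h = (b - a) / n
    total = speed(a) + speed(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * speed(a + i * h)
    return total * h / 3

def df(t):
    """Derivative of f(t) = (cos t, sin t, t)."""
    return (-math.sin(t), math.cos(t), 1.0)

def dg(t):
    """Derivative of g(t) = (cos 2t, sin 2t, 2t)."""
    return (-2 * math.sin(2 * t), 2 * math.cos(2 * t), 2.0)

L_f = arc_length(df, 0.0, 2 * math.pi)
L_g = arc_length(dg, 0.0, math.pi)
print(L_f, L_g, 2 * math.sqrt(2) * math.pi)
```

Because both speeds are constant here, the numerical integral is essentially exact; for a general curve the accuracy depends on n.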
Definition 2.5. Let f( t) be a smooth curve in R3 defined on an interval [a, b], and let
α : [ c, d ] → [a, b] be a smooth one-to-one mapping of an interval [ c, d ] onto [a, b]. Then the
function g : [ c, d ] → R3 defined by g( s) = f(α( s)) is a reparametrization of f( t) with param-
eter s. If the derivative of α does not vanish, we say that the reparametrization is regular
and g( s) is equivalent to f( t).
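\[ [c, d] \xrightarrow{\ \alpha\ } [a, b] \xrightarrow{\ \mathbf{f}\ } \mathbb{R}^3, \qquad s \mapsto t = \alpha(s) \mapsto \mathbf{f}(t), \qquad \mathbf{g}(s) = \mathbf{f}(\alpha(s)) = \mathbf{f}(t) \]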
Note that the differentiability of g( s) follows from a version of the Chain Rule for vector-
valued functions (the proof is left as an exercise):
2.2 Arc Length 67
\[ \frac{d\mathbf{g}}{ds} = \frac{d\mathbf{f}}{dt}\,\frac{dt}{ds} \quad \text{or equivalently} \quad \mathbf{g}'(s) = \mathbf{f}'(\alpha(s))\, \alpha'(s) \qquad (2.2) \]
for any s where the composite function f(α( s)) is defined.
Example 2.9. The following are all regular reparametrizations of one curve:
f( t) = (cos t, sin t, t) for t in [0, 2π],
g( s) = (cos 2 s, sin 2 s, 2 s) for s in [0, π],
h( s) = (cos 2π s, sin 2π s, 2π s) for s in [0, 1].
To see that g(s) is a regular reparametrization of f(t), define α : [0, π] → [0, 2π] by α(s) = 2s.
Then α is smooth, one-to-one, maps [0, π] onto [0, 2π], and is strictly increasing (since α′(s) =
2 > 0 for all s). Likewise, defining α : [0, 1] → [0, 2π] by α(s) = 2πs shows that h(s) is a regular
reparametrization of f(t).
A curve can be reparametrized, with different speeds, so which one is the best to use? In
some situations the arc length parametrization can be useful. The idea behind this is to
replace the parameter t, for any given smooth parametrization f( t) defined on [a, b], by the
parameter s given by
\[ s = s(t) = \int_a^t \|\mathbf{f}'(u)\|\, du. \qquad (2.3) \]
In terms of motion along a curve, s is the distance traveled along the curve after time t
has elapsed. So the new parameter will be distance instead of time. There is a natural
correspondence between s and t: from a starting point on the curve, the distance traveled
along the curve (in one direction) is uniquely determined by the amount of time elapsed, and
vice versa.
Since s is the arc length of the curve over the interval [a, t] for each t in [a, b], then it is a
function of t. By the Fundamental Theorem of Calculus, its derivative is
\[ s'(t) = \frac{ds}{dt} = \frac{d}{dt} \int_a^t \|\mathbf{f}'(u)\|\, du = \|\mathbf{f}'(t)\| \quad \text{for all } t \text{ in } [a, b]. \]
Since f(t) is smooth, $\|\mathbf{f}'(t)\| > 0$ for all t in [a, b]. Thus s′(t) > 0 and hence s(t) is strictly increasing on the interval [a, b]. Recall that this means that s is a one-to-one mapping of the interval [a, b] onto the interval [s(a), s(b)]. But we see that
\[ s(a) = \int_a^a \|\mathbf{f}'(u)\|\, du = 0 \quad \text{and} \quad s(b) = \int_a^b \|\mathbf{f}'(u)\|\, du = L = \text{arc length from } t = a \text{ to } t = b. \]
\[ \alpha'(s) = \frac{1}{s'(\alpha(s))} = \frac{1}{\|\mathbf{f}'(\alpha(s))\|}. \]
Example 2.10. Parametrize the helix f( t) = (cos t, sin t, t), for t in [0, 2π], by arc length.
Solution: By Example 2.8 and formula (2.3), we have
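\[ s = \int_0^t \|\mathbf{f}'(u)\|\, du = \int_0^t \sqrt{2}\, du = \sqrt{2}\, t \quad \text{for all } t \text{ in } [0, 2\pi]. \]
So we can solve for t in terms of s: $t = \alpha(s) = \frac{s}{\sqrt{2}}$.
\[ \therefore\ \mathbf{g}(s) = \Big( \cos\frac{s}{\sqrt{2}},\ \sin\frac{s}{\sqrt{2}},\ \frac{s}{\sqrt{2}} \Big) \quad \text{for all } s \text{ in } [0, 2\sqrt{2}\,\pi]. \]
Note that $\|\mathbf{g}'(s)\| = 1$.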
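The unit-speed property of the arc length parametrization can be confirmed numerically. This is a small sketch with a hypothetical helper `speed`, which estimates ‖g′(s)‖ by central differences; it also checks that the endpoint s = 2√2 π corresponds to t = 2π.

```python
import math

def g(s):
    """Arc length parametrization of the helix from Example 2.10."""
    u = s / math.sqrt(2)
    return (math.cos(u), math.sin(u), u)

def speed(func, s, h=1e-6):
    """Approximate ||func'(s)|| using central differences in each component."""
    p, q = func(s + h), func(s - h)
    return math.sqrt(sum(((a - b) / (2 * h)) ** 2 for a, b in zip(p, q)))

s_end = 2 * math.sqrt(2) * math.pi
for s in (0.0, 1.0, s_end):
    print(s, speed(g, s))   # each speed should be close to 1
print(g(s_end))             # should be close to (1, 0, 2*pi)
```

In this example the integral in (2.3) can be inverted by hand; for most curves the inversion has to be done numerically.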
Exercises
A
For Exercises 1–3, calculate the arc length of f( t) over the given interval.
2. $\mathbf{f}(t) = ((t^2 + 1)\cos t,\ (t^2 + 1)\sin t,\ 2\sqrt{2}\, t)$ on [0, 1];
B
6. Assume that g( s) is a regular reparametrization of f( t). Show that both curves have the
same length.
8. Show that the arc length L of a curve whose spherical coordinates are ρ = ρ ( t), θ = θ ( t)
and φ = φ( t) for t in an interval [a, b] is
\[ L = \int_a^b \sqrt{\rho'(t)^2 + (\rho(t)^2 \sin^2\phi(t))\, \theta'(t)^2 + \rho(t)^2 \phi'(t)^2}\ dt. \]
9. Let f( t) be a smooth curve. The pedal curve of f( t) is traced by the orthogonal projections
of the origin on the tangent lines to f. Write a parametric equation for the pedal curve
h( t) for the given smooth curve f( t).
C
10. Assume that the trajectory of the back wheel of an ideal bicycle is given by a smooth plane
curve b(t), where t denotes time. We assume that in the ideal bicycle the distance between the back
wheel and the front wheel is fixed (denote it by R) and that the back wheel always moves in
the direction of the front wheel.
(a) Write an expression for the trajectory of the front wheel f(t).
(b) Show that the speed of the back wheel cannot exceed the speed of the front wheel.
2.3 Curvature
In the field of mathematics known as differential geometry4 special attention is given to
parametrization-independent constructions. For example, depending on the parametrization,
the velocity vector of a curve at a given point can be multiplied by a scalar, so it
is not parametrization-independent. On the other hand, the tangent line at a given point is
parametrization-independent: although it is defined using a parametrization, the resulting
line is the same.
Another example is the so-called osculating plane. Given a smooth regular curve f, its osculating
plane at f(t) is the plane passing through f(t) and containing the velocity vector f′(t) and
the acceleration f′′(t). The osculating plane is defined if f′(t) is not parallel to f′′(t). Note that
in this case the cross product f′(t) × f′′(t) is perpendicular to the osculating plane. Therefore
the equation of the osculating plane at f(t) can be written as
Example 2.11. Let us show that the osculating plane at a given point does not depend on the
parametrization. That is, if g(s) = f(α(s)) is a regular reparametrization, then the plane through
g(s) containing the velocity vector g′(s) and the acceleration g′′(s) is the same as the
plane through f(t) containing the velocity vector f′(t) and the acceleration f′′(t) for t = α(s).
Since f(t) = g(s), we only need to show that f′(t) × f′′(t) ∥ g′(s) × g′′(s).
By the chain rule
g′ ( s) = f′ (α( s))α′ ( s)
Yet another example is the so-called curvature. Assume a smooth regular curve g has an arc
length parametrization. Note that if g parametrizes a straight line then g′(s) is a constant
unit vector and therefore g′′(s) = 0 at all points. Therefore the value $\kappa(s) = \|\mathbf{g}''(s)\|$ can be
used to measure how fast the curve deviates from a straight line. The value κ(s) and the
vector g′′(s) are called the curvature and the curvature vector of the curve g at the point g(s).
4 See O’NEILL for an introduction to elementary differential geometry.
If κ(s) ≠ 0 then the value $R(s) = \frac{1}{\kappa(s)}$ is called the curvature radius of g at the point g(s). It
is called this because the best approximation of the curve g at the point g(s) by a circle,
the so-called osculating circle, has radius R(s). This circle lies in the osculating plane, and its
center lies at distance R(s) from g(s) in the direction of the curvature vector g′′(s). If κ(s) = 0
then the osculating circle degenerates to the tangent line.
Assume you want to find the curvature of the given curve using the definition above. Then
you first have to find the arc length parametrization and then apply the formula above at the
given point. Finding this parametrization often leads to an integral that is either difficult or
impossible to evaluate explicitly. The simple integral in Example 2.10 is the exception, not
the norm. In general, arc length parametrizations are more useful for theoretical purposes
than for practical computations.5
The following theorem provides a direct way to calculate the curvature, without passing
to the reparametrization. Exercise 9 guides you through similar calculations.
Theorem 2.3. The curvature κ of a smooth curve f at the point f( t) can be found using the
following formula:
\[ \kappa = \frac{\|\mathbf{f}''(t) \times \mathbf{f}'(t)\|}{\|\mathbf{f}'(t)\|^3}. \qquad (2.4) \]
Proof: Let g(s) be the arc length parametrization of f(t); in particular $\|\mathbf{g}'(s)\| = 1$ for any s.
As above, we assume t = α(s) and therefore g(s) = f(α(s)) and $\alpha'(s) = \frac{1}{\|\mathbf{f}'(t)\|}$.
5 For example, the usual parametrizations of Bézier curves, which we discussed in Section 2.1, are polynomial
functions in R3. This makes their computation relatively simple, which, in computer-aided design, is desirable.
But their arc length parametrizations are not only not polynomials, they are in fact usually impossible to
calculate at all.
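Since f′(t) × f′(t) = 0, we get
\begin{align*}
\frac{\mathbf{f}''(t) \times \mathbf{f}'(t)}{\|\mathbf{f}'(t)\|^3} &= \Big( \frac{\mathbf{f}''(t)}{\|\mathbf{f}'(t)\|^2} + \mathbf{f}'(t)\, \alpha''(s) \Big) \times \frac{\mathbf{f}'(t)}{\|\mathbf{f}'(t)\|} \\
&= \mathbf{g}''(s) \times \mathbf{g}'(s).
\end{align*}
Since $\|\mathbf{g}'(s)\| = 1$, the vectors g′′(s) and g′(s) are perpendicular (by Example 2.3, applied to g′), and so we have
\begin{align*}
\frac{\|\mathbf{f}''(t) \times \mathbf{f}'(t)\|}{\|\mathbf{f}'(t)\|^3} &= \|\mathbf{g}''(s) \times \mathbf{g}'(s)\| \\
&= \|\mathbf{g}''(s)\|\, \|\mathbf{g}'(s)\| \\
&= \kappa.
\end{align*}
QED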
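Formula (2.4) is straightforward to evaluate once f′ and f′′ are known. The sketch below, with hypothetical function names, applies it to the helix f(t) = (cos t, sin t, t) and to a circle of radius R in the xy-plane (a test case chosen here, not taken from the text), using derivatives computed by hand. The helix should give κ = 1/2, which matches ‖g′′(s)‖ for the arc length parametrization in Example 2.10, and the circle should give κ = 1/R.

```python
import math

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def norm(u):
    return math.sqrt(sum(x * x for x in u))

def curvature(d1, d2):
    """Formula (2.4): ||f'' x f'|| / ||f'||^3, given d1 = f'(t) and d2 = f''(t)."""
    return norm(cross(d2, d1)) / norm(d1) ** 3

def helix_curvature(t):
    d1 = (-math.sin(t), math.cos(t), 1.0)    # f'(t)
    d2 = (-math.cos(t), -math.sin(t), 0.0)   # f''(t)
    return curvature(d1, d2)

def circle_curvature(R, t):
    d1 = (-R * math.sin(t), R * math.cos(t), 0.0)
    d2 = (-R * math.cos(t), -R * math.sin(t), 0.0)
    return curvature(d1, d2)

print(helix_curvature(1.3))        # close to 0.5
print(circle_curvature(3.0, 0.4))  # close to 1/3
```

No arc length parametrization is needed, which is the practical advantage of the theorem.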
Exercises
A
For Exercises 1–4, find the tangent line, the osculating plane and the curvature at each point
of the curve f( t).
B
5. Let f(t) be a smooth regular curve and g(s) = f(α(s)) be a regular reparametrization of it.
Show that the osculating plane of f at f(t) coincides with the osculating plane of g at g(s)
if t = α(s).
6. Let f(t) be a smooth regular curve; in particular, f′(t) ≠ 0 for all t. Then we can define the
unit tangent vector T by
\[ \mathbf{T}(t) = \frac{\mathbf{f}'(t)}{\|\mathbf{f}'(t)\|}. \]
(a) Show that
\[ \mathbf{T}'(t) = \frac{\mathbf{f}'(t) \times (\mathbf{f}''(t) \times \mathbf{f}'(t))}{\|\mathbf{f}'(t)\|^3}. \]
7. Let g( s) be a smooth curve with arc length parametrization and κ( s) be its curvature.
Show that
g′′′ ( s) · g′ ( s) = −κ( s)2 .
8. Let g be a smooth plane curve with arc length parametrization. The curve
h( s) = g( s) − sg′ ( s)
C
9. Let f(t) be a smooth curve in the plane. Assume its curvature κ(t) is increasing in t. Show
that the curve has no self-intersections; that is, if $t_0 \ne t_1$ then $\mathbf{f}(t_0) \ne \mathbf{f}(t_1)$. (Hint: Write an
expression for the center and radius of the osculating circles and use it to show that they
do not intersect each other.)
3 Functions of Several Variables
3.1 Functions of Two or Three Variables
In Section 2.1 we discussed vector-valued functions of a single real variable. We will now
examine real-valued functions of a point (or vector) in R2 or R3 . For the most part these
functions will be defined on sets of points in R2 , but there will be times when we will use
points in R3 , and there will also be times when it will be convenient to think of the points as
vectors (or terminal points of vectors).
A real-valued function f defined on a subset D of R2 is a rule that assigns to each point
( x, y) in D a real number f ( x, y). The largest possible set D in R2 on which f is defined is
called the domain of f , and the range of f is the set of all real numbers f ( x, y) as ( x, y)
varies over the domain D . A similar definition holds for functions f ( x, y, z) defined on points
( x, y, z) in R3 .
f ( x, y) = x y
\[ f(x, y) = \frac{1}{x - y} \]
is all of $\mathbb{R}^2$ except the points (x, y) for which x = y. That is, the domain is the set D = {(x, y) :
x ≠ y}. The range of f is all real numbers except 0.
is the set D = {( x, y) : x2 + y2 ≤ 1}, since the quantity inside the square root is nonnegative if
and only if 1 − ( x2 + y2 ) ≥ 0. We see that D consists of all points on and inside the unit circle
in R2 (D is sometimes called the closed unit disk). The range of f is the interval [0, 1] in R.
\[ f(x, y, z) = e^{x + y - z} \]
is shown below. Note that the level curves (shown both on the surface and projected onto the
x y-plane) are groups of concentric circles.
You may be wondering what happens to the function in Example 3.5 at the point ( x, y) =
(0, 0), since both the numerator and denominator are 0 at that point. The function is not
defined at (0, 0), but the limit of the function exists (and equals 1) as ( x, y) approaches (0, 0).
We will now state explicitly what is meant by the limit of a function of two variables.
Definition 3.1. Let (a, b) be a point in R2 , and let f ( x, y) be a real-valued function defined
on some set containing (a, b) (but not necessarily defined at (a, b) itself). Then we say that
the limit of f ( x, y) equals L as ( x, y) approaches (a, b), written as
\[ \lim_{(x,y) \to (a,b)} f(x, y) = L, \qquad (3.1) \]
A similar definition can be made for functions of three variables. The idea behind the
above definition is that the values of f(x, y) can get arbitrarily close to L (that is, within ε
of L) if we pick (x, y) sufficiently close to (a, b) (that is, inside a circle centered at (a, b) with
some sufficiently small radius δ).
If you recall the “epsilon-delta” proofs of limits of real-valued functions of a single variable,
you may remember how awkward they can be, and how they can usually only be done easily
for simple functions. In general, the multivariable cases are at least equally awkward to go
through, so we will not bother with such proofs. Instead, we will simply state that when the
function f(x, y) is given by a single formula and is defined at the point (a, b) (for example,
is not some indeterminate form like 0/0) then you can just substitute (x, y) = (a, b) into the
formula for f(x, y) to find the limit.
Figure 3.1.1 The function $f(x, y) = \frac{\sin\sqrt{x^2 + y^2}}{\sqrt{x^2 + y^2}}$.
Example 3.6.
\[ \lim_{(x,y) \to (1,2)} \frac{xy}{x^2 + y^2} = \frac{(1)(2)}{1^2 + 2^2} = \frac{2}{5} \]
since $f(x, y) = \frac{xy}{x^2 + y^2}$ is properly defined at the point (1, 2).
The major difference between limits in one variable and limits in two or more variables
has to do with how a point is approached. In the single-variable case, the statement “ x → a”
means that x gets closer to the value a from two possible directions along the real number
line (see Figure 3.1.2(a)). In two dimensions, however, ( x, y) can approach a point (a, b) along
an infinite number of paths (see Figure 3.1.2(b)).
Example 3.7.
\[ \lim_{(x,y) \to (0,0)} \frac{xy}{x^2 + y^2} \quad \text{does not exist} \]
Figure 3.1.2 (a) x → a in $\mathbb{R}$; (b) (x, y) → (a, b) in $\mathbb{R}^2$
Note that we cannot simply substitute (x, y) = (0, 0) into the function, since doing so gives an
indeterminate form 0/0. To show that the limit does not exist, we will show that the function
approaches different values as (x, y) approaches (0, 0) along different paths in $\mathbb{R}^2$. To see this,
suppose that (x, y) → (0, 0) along the positive x-axis, so that y = 0 along that path. Then
\[ f(x, y) = \frac{xy}{x^2 + y^2} = \frac{x \cdot 0}{x^2 + 0^2} = 0 \]
along that path (since x > 0 in the denominator). But if (x, y) → (0, 0) along the straight line
y = x through the origin, for x > 0, then we see that
\[ f(x, y) = \frac{xy}{x^2 + y^2} = \frac{x^2}{x^2 + x^2} = \frac{1}{2}, \]
which means that f(x, y) approaches different values as (x, y) → (0, 0) along different paths.
Hence the limit does not exist.
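The path dependence in Example 3.7 is easy to see by evaluating the function along a few lines through the origin. In this small sketch the slopes are arbitrary choices; along the line y = mx the function reduces to m/(1 + m²), so each slope gives its own constant value.

```python
def f(x, y):
    """The function from Example 3.7; undefined at (0, 0)."""
    return x * y / (x * x + y * y)

for x in (1e-1, 1e-3, 1e-6):
    along_x_axis = f(x, 0.0)       # path y = 0
    along_diagonal = f(x, x)       # path y = x
    along_steeper = f(x, 2 * x)    # path y = 2x, where m / (1 + m^2) = 0.4
    print(x, along_x_axis, along_diagonal, along_steeper)
```

The printed values do not change as x shrinks, but they differ from path to path, which is why no single limit exists.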
Limits of real-valued multivariable functions obey the same algebraic rules as in the
single-variable case, as shown in the following theorem, which we state without proof.
Theorem 3.1. Suppose that $\lim_{(x,y) \to (a,b)} f(x, y)$ and $\lim_{(x,y) \to (a,b)} g(x, y)$ both exist, and that k is
some scalar. Then:
(a) $\lim_{(x,y) \to (a,b)} [f(x, y) \pm g(x, y)] = \Big[ \lim_{(x,y) \to (a,b)} f(x, y) \Big] \pm \Big[ \lim_{(x,y) \to (a,b)} g(x, y) \Big]$;
(b) $\lim_{(x,y) \to (a,b)} k\, f(x, y) = k \Big[ \lim_{(x,y) \to (a,b)} f(x, y) \Big]$;
(c) $\lim_{(x,y) \to (a,b)} [f(x, y)\, g(x, y)] = \Big[ \lim_{(x,y) \to (a,b)} f(x, y) \Big] \Big[ \lim_{(x,y) \to (a,b)} g(x, y) \Big]$;
(d) $\lim_{(x,y) \to (a,b)} \frac{f(x, y)}{g(x, y)} = \frac{\lim_{(x,y) \to (a,b)} f(x, y)}{\lim_{(x,y) \to (a,b)} g(x, y)}$ if $\lim_{(x,y) \to (a,b)} g(x, y) \ne 0$;
Note that in part (e), it suffices to have | f ( x, y) − L | ≤ g( x, y) for all ( x, y) “sufficiently close”
to (a, b) (but excluding (a, b) itself).
Example 3.8. Show that
\[ \lim_{(x,y) \to (0,0)} \frac{y^4}{x^2 + y^2} = 0. \]
Since substituting (x, y) = (0, 0) into the function gives the indeterminate form 0/0, we need
an alternate method for evaluating this limit. We will use Theorem 3.1(e). First, notice that
$y^4 = \big(\sqrt{y^2}\big)^4$ and so $0 \le y^4 \le \big(\sqrt{x^2 + y^2}\big)^4$ for all (x, y). But $\big(\sqrt{x^2 + y^2}\big)^4 = (x^2 + y^2)^2$. Thus, for
all (x, y) ≠ (0, 0) we have
\[ \left| \frac{y^4}{x^2 + y^2} \right| \le \frac{(x^2 + y^2)^2}{x^2 + y^2} = x^2 + y^2 \to 0 \quad \text{as } (x, y) \to (0, 0). \]
Therefore, $\lim_{(x,y) \to (0,0)} \frac{y^4}{x^2 + y^2} = 0$.
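The bound used in Example 3.8 can be illustrated numerically by sampling the function on small circles around the origin. This is a sketch only: the helper name `worst_on_circle` and the number of sample points are our own choices, and sampling finitely many points illustrates the inequality rather than proving it.

```python
import math

def f(x, y):
    """The function from Example 3.8; undefined at (0, 0)."""
    return y ** 4 / (x * x + y * y)

def worst_on_circle(radius, samples=360):
    """Largest sampled value of |f| on the circle of the given radius."""
    worst = 0.0
    for k in range(samples):
        angle = 2 * math.pi * k / samples
        x, y = radius * math.cos(angle), radius * math.sin(angle)
        worst = max(worst, abs(f(x, y)))
    return worst

for radius in (1e-1, 1e-2, 1e-3):
    # The bound from the example says |f| <= x^2 + y^2 = radius^2 on this circle.
    print(radius, worst_on_circle(radius), radius ** 2)
```

Unlike in Example 3.7, the values shrink towards 0 in every sampled direction as the radius decreases.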
Unless indicated otherwise, you can assume that all the functions we deal with are con-
tinuous. In fact, we can modify the function from Example 3.8 so that it is continuous on all
of R2 .
Exercises
A
For Exercises 1–6, state the domain and range of the given function.
1. $f(x, y) = x^2 + y^2 - 1$; 2. $f(x, y) = \frac{1}{x^2 + y^2}$;
3. $f(x, y) = \sqrt{x^2 + y^2 - 4}$; 4. $f(x, y) = \frac{x^2 + 1}{y}$;
5. $f(x, y, z) = \sin(xyz)$; 6. $f(x, y, z) = \sqrt{(x - 1)(yz - 1)}$.
9. $\lim_{(x,y) \to (0,0)} \frac{x^2 - y^2}{x^2 + y^2}$; 10. $\lim_{(x,y) \to (0,0)} \frac{x y^2}{x^2 + y^4}$;
11. $\lim_{(x,y) \to (1,-1)} \frac{x^2 - 2xy + y^2}{x - y}$; 12. $\lim_{(x,y) \to (0,0)} \frac{x y^2}{x^2 + y^2}$;
13. $\lim_{(x,y) \to (1,1)} \frac{x^2 - y^2}{x - y}$; 14. $\lim_{(x,y) \to (0,0)} \frac{x^2 - 2xy + y^2}{x - y}$;
15. $\lim_{(x,y) \to (0,0)} \frac{y^4 \sin(xy)}{x^2 + y^2}$; 16. $\lim_{(x,y) \to (0,0)} (x^2 + y^2) \cos\Big(\frac{1}{xy}\Big)$;
17. $\lim_{(x,y) \to (0,0)} \frac{x}{y}$; 18. $\lim_{(x,y) \to (0,0)} \cos\Big(\frac{1}{xy}\Big)$.
B
19. Show that $f(x, y) = \frac{1}{2\pi\sigma^2}\, e^{-(x^2 + y^2)/2\sigma^2}$, for σ > 0, is constant on the circle of radius r > 0
centered at the origin. This function is called a Gaussian blur, and is used as a filter in
image processing software to produce a “blurred” effect.
(Hint: You will need to use L’Hôpital’s Rule for single-variable limits.)
C
22. Prove Theorem 3.1(a) in the case of addition. (Hint: Use Definition 3.1.)
Definition 3.3. Let f(x, y) be a real-valued function with domain D in $\mathbb{R}^2$, and let (a, b) be
a point in D. Then the partial derivative of f at (a, b) with respect to x, denoted by
$\frac{\partial f}{\partial x}(a, b)$, is defined as
\[ \frac{\partial f}{\partial x}(a, b) = \lim_{h \to 0} \frac{f(a + h, b) - f(a, b)}{h} \qquad (3.2) \]
and the partial derivative of f at (a, b) with respect to y, denoted by $\frac{\partial f}{\partial y}(a, b)$, is defined
as
\[ \frac{\partial f}{\partial y}(a, b) = \lim_{h \to 0} \frac{f(a, b + h) - f(a, b)}{h}. \qquad (3.3) \]
Note: The symbol ∂ is pronounced “del”.1
Recall that the derivative of a function f ( x) can be interpreted as the rate of change of
that function in the (positive) x direction. From the definitions above, we can see that the
partial derivative of a function f ( x, y) with respect to x or y is the rate of change of f ( x, y) in
the (positive) x or y direction, respectively. What this means is that the partial derivative of
a function f ( x, y) with respect to x can be calculated by treating the y variable as a constant,
and then simply differentiating f ( x, y) as if it were a function of x alone, using the usual
rules from single-variable calculus. Likewise, the partial derivative of f ( x, y) with respect to
y is obtained by treating the x variable as a constant and then differentiating f ( x, y) as if it
were a function of y alone.
Example 3.10. Find $\frac{\partial f}{\partial x}(x, y)$ and $\frac{\partial f}{\partial y}(x, y)$ for the function $f(x, y) = x^2 y + y^3$.
Solution: Treating y as a constant and differentiating f(x, y) with respect to x gives
\[ \frac{\partial f}{\partial x}(x, y) = 2xy \]
and treating x as a constant and differentiating f(x, y) with respect to y gives
\[ \frac{\partial f}{\partial y}(x, y) = x^2 + 3y^2. \]
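The limit definitions (3.2) and (3.3) suggest a direct numerical check of Example 3.10. The sketch below uses hypothetical helper names and a central difference quotient instead of the one-sided quotient in the definitions, since it is usually more accurate; the sample point (1.5, −2) is an arbitrary choice.

```python
def f(x, y):
    return x ** 2 * y + y ** 3

def partial_x(func, a, b, h=1e-6):
    """Approximate the partial derivative with respect to x at (a, b): y is held fixed."""
    return (func(a + h, b) - func(a - h, b)) / (2 * h)

def partial_y(func, a, b, h=1e-6):
    """Approximate the partial derivative with respect to y at (a, b): x is held fixed."""
    return (func(a, b + h) - func(a, b - h)) / (2 * h)

a, b = 1.5, -2.0
print(partial_x(f, a, b), 2 * a * b)              # both close to -6
print(partial_y(f, a, b), a ** 2 + 3 * b ** 2)    # both close to 14.25
```

Holding one argument fixed in the code mirrors the rule of treating the other variable as a constant.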
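We will often simply write $\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial y}$ instead of $\frac{\partial f}{\partial x}(x, y)$ and $\frac{\partial f}{\partial y}(x, y)$.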
1 It is not a Greek letter. The symbol was first used by the mathematicians A. Clairaut and L. Euler around
1740, to distinguish it from the letter d used for the “usual” derivative.
3.2 Partial Derivatives 81
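Example 3.11. Find $\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial y}$ for the function $f(x, y) = \frac{\sin(x y^2)}{x^2 + 1}$.
Solution: Treating y as a constant and differentiating f(x, y) with respect to x gives
\[ \frac{\partial f}{\partial x} = \frac{(x^2 + 1)\, y^2 \cos(x y^2) - 2x \sin(x y^2)}{(x^2 + 1)^2} \]
and treating x as a constant and differentiating f(x, y) with respect to y gives
\[ \frac{\partial f}{\partial y} = \frac{2xy \cos(x y^2)}{x^2 + 1}. \]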
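Since both $\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial y}$ are themselves functions of x and y, we can take their partial
derivatives with respect to x and y. This yields the higher-order partial derivatives:
\begin{align*}
\frac{\partial^2 f}{\partial x^2} &= \frac{\partial}{\partial x}\Big(\frac{\partial f}{\partial x}\Big), & \frac{\partial^2 f}{\partial y^2} &= \frac{\partial}{\partial y}\Big(\frac{\partial f}{\partial y}\Big), \\
\frac{\partial^2 f}{\partial y\, \partial x} &= \frac{\partial}{\partial y}\Big(\frac{\partial f}{\partial x}\Big), & \frac{\partial^2 f}{\partial x\, \partial y} &= \frac{\partial}{\partial x}\Big(\frac{\partial f}{\partial y}\Big), \\
\frac{\partial^3 f}{\partial x^3} &= \frac{\partial}{\partial x}\Big(\frac{\partial^2 f}{\partial x^2}\Big), & \frac{\partial^3 f}{\partial y^3} &= \frac{\partial}{\partial y}\Big(\frac{\partial^2 f}{\partial y^2}\Big), \\
\frac{\partial^3 f}{\partial y\, \partial x^2} &= \frac{\partial}{\partial y}\Big(\frac{\partial^2 f}{\partial x^2}\Big), & \frac{\partial^3 f}{\partial x\, \partial y^2} &= \frac{\partial}{\partial x}\Big(\frac{\partial^2 f}{\partial y^2}\Big), \\
\frac{\partial^3 f}{\partial y^2\, \partial x} &= \frac{\partial}{\partial y}\Big(\frac{\partial^2 f}{\partial y\, \partial x}\Big), & \frac{\partial^3 f}{\partial x^2\, \partial y} &= \frac{\partial}{\partial x}\Big(\frac{\partial^2 f}{\partial x\, \partial y}\Big), \\
\frac{\partial^3 f}{\partial x\, \partial y\, \partial x} &= \frac{\partial}{\partial x}\Big(\frac{\partial^2 f}{\partial y\, \partial x}\Big), & \frac{\partial^3 f}{\partial y\, \partial x\, \partial y} &= \frac{\partial}{\partial y}\Big(\frac{\partial^2 f}{\partial x\, \partial y}\Big), \\
&\ \ \vdots
\end{align*}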
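Example 3.12. Find the partial derivatives $\frac{\partial f}{\partial x}$, $\frac{\partial f}{\partial y}$, $\frac{\partial^2 f}{\partial x^2}$, $\frac{\partial^2 f}{\partial y^2}$, $\frac{\partial^2 f}{\partial y\, \partial x}$ and $\frac{\partial^2 f}{\partial x\, \partial y}$ for the
function $f(x, y) = e^{x^2 y} + x y^3$.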
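\begin{align*}
\frac{\partial f}{\partial x} &= 2xy\, e^{x^2 y} + y^3, & \frac{\partial f}{\partial y} &= x^2 e^{x^2 y} + 3xy^2, \\
\frac{\partial^2 f}{\partial x^2} &= \frac{\partial}{\partial x}\big(2xy\, e^{x^2 y} + y^3\big) & \frac{\partial^2 f}{\partial y^2} &= \frac{\partial}{\partial y}\big(x^2 e^{x^2 y} + 3xy^2\big) \\
&= 2y\, e^{x^2 y} + 4x^2 y^2 e^{x^2 y}, & &= x^4 e^{x^2 y} + 6xy, \\
\frac{\partial^2 f}{\partial y\, \partial x} &= \frac{\partial}{\partial y}\big(2xy\, e^{x^2 y} + y^3\big) & \frac{\partial^2 f}{\partial x\, \partial y} &= \frac{\partial}{\partial x}\big(x^2 e^{x^2 y} + 3xy^2\big) \\
&= 2x\, e^{x^2 y} + 2x^3 y\, e^{x^2 y} + 3y^2, & &= 2x\, e^{x^2 y} + 2x^3 y\, e^{x^2 y} + 3y^2.
\end{align*}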
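The two mixed partial derivatives in Example 3.12 can be cross-checked numerically. The sketch below (hypothetical helper names, central differences, an arbitrary sample point) differentiates the hand-computed first partials: ∂f/∂x with respect to y, and ∂f/∂y with respect to x. Both results are compared with the common closed form obtained in the example.

```python
import math

def f_x(x, y):
    """First partial with respect to x, from Example 3.12."""
    return 2 * x * y * math.exp(x * x * y) + y ** 3

def f_y(x, y):
    """First partial with respect to y, from Example 3.12."""
    return x * x * math.exp(x * x * y) + 3 * x * y ** 2

def mixed_exact(x, y):
    """Closed form shared by both mixed partials in Example 3.12."""
    e = math.exp(x * x * y)
    return 2 * x * e + 2 * x ** 3 * y * e + 3 * y ** 2

def d_dx(func, a, b, h=1e-6):
    return (func(a + h, b) - func(a - h, b)) / (2 * h)

def d_dy(func, a, b, h=1e-6):
    return (func(a, b + h) - func(a, b - h)) / (2 * h)

a, b = 0.5, 1.2
f_yx = d_dy(f_x, a, b)   # differentiate f_x with respect to y
f_xy = d_dx(f_y, a, b)   # differentiate f_y with respect to x
print(f_yx, f_xy, mixed_exact(a, b))
```

The three printed numbers should agree to several decimal places at any point where the function is defined.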
Higher-order partial derivatives that are taken with respect to different variables, such
as $\frac{\partial^2 f}{\partial y\, \partial x}$ and $\frac{\partial^2 f}{\partial x\, \partial y}$, are called mixed partial derivatives. Notice in the above example that
$\frac{\partial^2 f}{\partial y\, \partial x} = \frac{\partial^2 f}{\partial x\, \partial y}$. It turns out that this will usually be the case. Specifically, whenever both $\frac{\partial^2 f}{\partial y\, \partial x}$
and $\frac{\partial^2 f}{\partial x\, \partial y}$ are continuous at a point (a, b), then they are equal at that point.2 All the functions
we will deal with will have continuous partial derivatives of all orders, so you can assume in
the remainder of the text that
\[ \frac{\partial^2 f}{\partial y\, \partial x} = \frac{\partial^2 f}{\partial x\, \partial y} \quad \text{for all } (x, y) \text{ in the domain of } f. \]
In other words, it doesn’t matter in which order you take partial derivatives. This applies
even to mixed partial derivatives of order 3 or higher.
The notation for partial derivatives varies. All of the following are equivalent:
$\frac{\partial f}{\partial x}$ : $f_x(x, y)$, $f_1(x, y)$, $D_x(x, y)$, $D_1(x, y)$;
$\frac{\partial f}{\partial y}$ : $f_y(x, y)$, $f_2(x, y)$, $D_y(x, y)$, $D_2(x, y)$;
$\frac{\partial^2 f}{\partial x^2}$ : $f_{xx}(x, y)$, $f_{11}(x, y)$, $D_{xx}(x, y)$, $D_{11}(x, y)$;
$\frac{\partial^2 f}{\partial y^2}$ : $f_{yy}(x, y)$, $f_{22}(x, y)$, $D_{yy}(x, y)$, $D_{22}(x, y)$;
$\frac{\partial^2 f}{\partial y\, \partial x}$ : $f_{xy}(x, y)$, $f_{12}(x, y)$, $D_{xy}(x, y)$, $D_{12}(x, y)$;
$\frac{\partial^2 f}{\partial x\, \partial y}$ : $f_{yx}(x, y)$, $f_{21}(x, y)$, $D_{yx}(x, y)$, $D_{21}(x, y)$.
2 See pp. 214–216 in TAYLOR and MANN for a proof.
Exercises
A
For Exercises 1–16, find $\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial y}$.
1. $f(x, y) = x^2 + y^2$; 2. $f(x, y) = \cos(x + y)$;
3. $f(x, y) = \sqrt{x^2 + y + 4}$; 4. $f(x, y) = \frac{x + 1}{y + 1}$;
5. $f(x, y) = e^{xy} + xy$; 6. $f(x, y) = x^2 - y^2 + 6xy + 4x - 8y + 2$;
7. $f(x, y) = x^4$; 8. $f(x, y) = x + 2y$;
9. $f(x, y) = \sqrt{x^2 + y^2}$; 10. $f(x, y) = \sin(x + y)$;
11. $f(x, y) = \sqrt[3]{x^2 + y + 4}$; 12. $f(x, y) = \frac{xy + 1}{x + y}$;
13. $f(x, y) = e^{-(x^2 + y^2)}$; 14. $f(x, y) = \ln(xy)$;
23. f ( x, y) = x4 ; 24. f ( x, y) = x + 2 y;
B
27. Show that the function $f(x, y) = \sin(x + y) + \cos(x - y)$ satisfies the wave equation
\[
\frac{\partial^2 f}{\partial x^2} - \frac{\partial^2 f}{\partial y^2} = 0 .
\]
28. Let $u$ and $v$ be twice-differentiable functions of a single variable, and let $c \ne 0$ be a constant. Show that $f(x, y) = u(x + cy) + v(x - cy)$ is a solution of the general one-dimensional wave equation³
\[
\frac{\partial^2 f}{\partial x^2} - \frac{1}{c^2}\,\frac{\partial^2 f}{\partial y^2} = 0 .
\]
3 Conversely, it turns out that any solution must be of this form. See Ch. 1 in WEINBERGER.
[Figure 3.3.1: (a) Tangent line $L_x$ in the plane $y = b$, with slope $\frac{\partial f}{\partial x}(a, b)$; (b) Tangent line $L_y$ in the plane $x = a$, with slope $\frac{\partial f}{\partial y}(a, b)$]
Since the derivative $\frac{dy}{dx}$ of a function $y = f(x)$ is used to find the tangent line to the graph of $f$ (which is a curve in $\mathbb{R}^2$), you might expect that partial derivatives can be used to define a tangent plane to the graph of a surface $z = f(x, y)$. This indeed turns out to be the case. First, we need a definition of a tangent plane. The intuitive idea is that a tangent plane “just touches” a surface at a point. The formal definition mimics the intuitive notion of a tangent line to a curve.
Note that since two lines in R3 determine a plane, then the two tangent lines to the surface
z = f ( x, y) in the x and y directions described in Figure 3.3.1 are contained in the tangent
plane at that point, if the tangent plane exists at that point. The existence of those two
3.3 Tangent Plane to a Surface 85
tangent lines does not by itself guarantee the existence of the tangent plane. It is possible
that if we take the trace of the surface in the plane x − y = 0 (which makes a 45◦ angle with
the positive x-axis), the resulting curve in that plane may have a tangent line which is not
in the plane determined by the other two tangent lines, or it may not have a tangent line
at all at that point. Luckily, it turns out⁴ that if $\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial y}$ exist in a region around a point $(a, b)$ and are continuous at $(a, b)$, then the tangent plane to the surface $z = f(x, y)$ will exist at the point $(a, b, f(a, b))$. In this text, those conditions will always hold.
Suppose that we want an equation of the tangent plane $T$ to the surface $z = f(x, y)$ at a point $(a, b, f(a, b))$. Let $L_x$ and $L_y$ be the tangent lines to the traces of the surface in the planes $y = b$ and $x = a$, respectively (as in Figure 3.3.2), and suppose that the conditions for $T$ to exist do hold. Then the equation for $T$ is
\[
A(x - a) + B(y - b) + C(z - f(a, b)) = 0 \tag{3.4}
\]
where $\mathbf{n} = (A, B, C)$ is a normal vector to the plane $T$. Since $T$ contains the lines $L_x$ and $L_y$, all we need are vectors $\mathbf{v}_x$ and $\mathbf{v}_y$ that are parallel to $L_x$ and $L_y$, respectively, and then let $\mathbf{n} = \mathbf{v}_x \times \mathbf{v}_y$.
[Figure 3.3.2 Tangent plane]
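Since the slope of $L_x$ is $\frac{\partial f}{\partial x}(a, b)$, the vector $\mathbf{v}_x = \left(1, 0, \frac{\partial f}{\partial x}(a, b)\right)$ is parallel to $L_x$ (since $\mathbf{v}_x$ lies in the $xz$-plane and lies in a line with slope $\frac{\frac{\partial f}{\partial x}(a, b)}{1} = \frac{\partial f}{\partial x}(a, b)$; see Figure 3.3.3). Similarly, the vector $\mathbf{v}_y = \left(0, 1, \frac{\partial f}{\partial y}(a, b)\right)$ is parallel to $L_y$. Hence, the vector
\[
\mathbf{n} = \mathbf{v}_x \times \mathbf{v}_y =
\begin{vmatrix}
\mathbf{i} & \mathbf{j} & \mathbf{k} \\
1 & 0 & \frac{\partial f}{\partial x}(a, b) \\
0 & 1 & \frac{\partial f}{\partial y}(a, b)
\end{vmatrix}
= -\frac{\partial f}{\partial x}(a, b)\,\mathbf{i} - \frac{\partial f}{\partial y}(a, b)\,\mathbf{j} + \mathbf{k}
\]
is normal to the plane $T$.
[Figure 3.3.3: the vector $\mathbf{v}_x = \left(1, 0, \frac{\partial f}{\partial x}(a, b)\right)$ in the $xz$-plane]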
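As a worked step (a sketch that uses only equation (3.4) and the components of the normal vector $\mathbf{n} = \mathbf{v}_x \times \mathbf{v}_y$ computed above), one can substitute $A = -\frac{\partial f}{\partial x}(a, b)$, $B = -\frac{\partial f}{\partial y}(a, b)$, $C = 1$ into (3.4) and multiply through by $-1$ to express the tangent plane directly in terms of the partial derivatives:

```latex
\begin{align*}
-\frac{\partial f}{\partial x}(a,b)\,(x-a) - \frac{\partial f}{\partial y}(a,b)\,(y-b) + \bigl(z - f(a,b)\bigr) &= 0 \\
\Longrightarrow\quad
\frac{\partial f}{\partial x}(a,b)\,(x-a) + \frac{\partial f}{\partial y}(a,b)\,(y-b) - z + f(a,b) &= 0
\end{align*}
```

For instance, with $f(x, y) = x^2 + y^2$ and $(a, b) = (1, 2)$ this gives $2(x - 1) + 4(y - 2) - z + 5 = 0$.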
Example 3.13. Find the equation of the tangent plane to the surface $z = x^2 + y^2$ at the point $(1, 2, 5)$.
4 See TAYLOR and MANN, § 6.4.
Solution: For the function $f(x, y) = x^2 + y^2$, we have $\frac{\partial f}{\partial x} = 2x$ and $\frac{\partial f}{\partial y} = 2y$, so the equation of the tangent plane at the point $(1, 2, 5)$ is
\[
2(1)(x - 1) + 2(2)(y - 2) - z + 5 = 0 , \quad \text{or} \quad 2x + 4y - z - 5 = 0 .
\]
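The computation in Example 3.13 can be mirrored in a few lines of code. This is a minimal sketch, not part of the original text: the class name `TangentPlane` and the method `tangentPlane` are hypothetical, and the partial derivatives of f(x, y) = x² + y² are entered by hand. It returns the coefficients of the plane written as Ax + By + Cz + D = 0, using the normal vector (∂f/∂x, ∂f/∂y, −1).

```java
// Sketch: coefficients of the tangent plane to z = x^2 + y^2 (Example 3.13).
// The names below are illustrative choices, not taken from the text.
public class TangentPlane {
    public static double f(double x, double y)  { return x*x + y*y; }
    public static double fx(double x, double y) { return 2*x; } // df/dx
    public static double fy(double x, double y) { return 2*y; } // df/dy

    // Returns {A, B, C, D} with A*x + B*y + C*z + D = 0 being the tangent
    // plane at (a, b, f(a,b)); obtained by expanding
    // fx(a,b)(x - a) + fy(a,b)(y - b) - z + f(a,b) = 0.
    public static double[] tangentPlane(double a, double b) {
        double A = fx(a, b);
        double B = fy(a, b);
        double C = -1.0;
        double D = -A*a - B*b + f(a, b);
        return new double[] {A, B, C, D};
    }

    public static void main(String[] args) {
        double[] p = tangentPlane(1, 2);
        System.out.println(p[0] + "x + " + p[1] + "y + " + p[2] + "z + " + p[3] + " = 0");
    }
}
```

At the point (1, 2) the coefficients come out as 2, 4, −1, −5, which is the plane 2x + 4y − z − 5 = 0 found above.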
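Example 3.14. Find the equation of the tangent plane to the surface $x^2 + y^2 + z^2 = 9$ at the point $(2, 2, -1)$.
Solution: For the function $F(x, y, z) = x^2 + y^2 + z^2 - 9$, we have $\frac{\partial F}{\partial x} = 2x$, $\frac{\partial F}{\partial y} = 2y$, and $\frac{\partial F}{\partial z} = 2z$, so the equation of the tangent plane at $(2, 2, -1)$ is
\[
2(2)(x - 2) + 2(2)(y - 2) + 2(-1)(z + 1) = 0 , \quad \text{or} \quad 2x + 2y - z - 9 = 0 .
\]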
Exercises
A
For Exercises 1–6, find the equation of the tangent plane to the surface z = f ( x, y) at the
point P .
For Exercises 7–10, find the equation of the tangent plane to the given surface at the point
P.
7. $\frac{x^2}{4} + \frac{y^2}{9} + \frac{z^2}{16} = 1$, $P = \left(1, 2, \frac{2\sqrt{11}}{3}\right)$; 8. $x^2 + y^2 + z^2 = 9$, $P = (0, 0, 3)$;
9. $x^2 + y^2 - z^2 = 0$, $P = (3, 4, 5)$; 10. $x^2 + y^2 = 4$, $P = (\sqrt{3}, 1, 0)$.
B
11. Find the angles between the curve $\mathbf{f}(t) = (t, t^2, t^3)$ and the surface $x^6 + y^3 + z^2 = 3$ at their intersections.
3.4 Directional Derivatives and the Gradient 87
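Definition 3.5. Let $f(x, y)$ be a real-valued function with domain $D$ in $\mathbb{R}^2$, and let $(a, b)$ be a point in $D$. Let $\mathbf{v}$ be a vector in $\mathbb{R}^2$. Then the directional derivative of $f$ at $(a, b)$ in the direction of $\mathbf{v}$, denoted by $D_{\mathbf{v}} f(a, b)$, is defined as
\[
D_{\mathbf{v}} f(a, b) = \lim_{h \to 0} \frac{f\bigl((a, b) + h\mathbf{v}\bigr) - f(a, b)}{h} .
\]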
Notice in the definition that we seem to be treating the point $(a, b)$ as a vector, since we are adding the vector $h\mathbf{v}$ to it. But this is just the usual idea of identifying vectors with their terminal points, which the reader should be used to by now. If we were to write the vector $\mathbf{v}$ as $\mathbf{v} = (v_1, v_2)$, then
\[
D_{\mathbf{v}} f(a, b) = \lim_{h \to 0} \frac{f(a + hv_1, b + hv_2) - f(a, b)}{h} . \tag{3.9}
\]
From this we can immediately recognize that the partial derivatives $\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial y}$ are special cases of the directional derivative with $\mathbf{v} = \mathbf{i} = (1, 0)$ and $\mathbf{v} = \mathbf{j} = (0, 1)$, respectively. That is, $\frac{\partial f}{\partial x} = D_{\mathbf{i}} f$ and $\frac{\partial f}{\partial y} = D_{\mathbf{j}} f$.
If $f(x, y)$ has continuous partial derivatives $\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial y}$ (which will always be the case in this text), then there is a simple formula for the directional derivative:
Theorem 3.2. Let $f(x, y)$ be a real-valued function with domain $D$ in $\mathbb{R}^2$ such that the partial derivatives $\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial y}$ exist and are continuous in $D$. Let $(a, b)$ be a point in $D$. Then
\[
D_{\mathbf{v}} f(a, b) = v_1 \frac{\partial f}{\partial x}(a, b) + v_2 \frac{\partial f}{\partial y}(a, b) . \tag{3.10}
\]
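Proof: Note that if $\mathbf{v} = \mathbf{i} = (1, 0)$ then the above formula reduces to $D_{\mathbf{v}} f(a, b) = \frac{\partial f}{\partial x}(a, b)$, which we know is true since $D_{\mathbf{i}} f = \frac{\partial f}{\partial x}$, as we noted earlier. Similarly, for $\mathbf{v} = \mathbf{j} = (0, 1)$ the formula reduces to $D_{\mathbf{v}} f(a, b) = \frac{\partial f}{\partial y}(a, b)$, which is true since $D_{\mathbf{j}} f = \frac{\partial f}{\partial y}$. Now fix a vector $\mathbf{v} = (v_1, v_2)$ and fix a number $h \ne 0$. Then
\[
f(a + hv_1, b + hv_2) - f(a, b) = \bigl(f(a + hv_1, b + hv_2) - f(a + hv_1, b)\bigr) + \bigl(f(a + hv_1, b) - f(a, b)\bigr) . \tag{3.11}
\]
Since $g(\alpha) = f(a + hv_1, b + \alpha hv_2)$ is a real-valued function, we can apply the Mean Value Theorem from single-variable calculus on the interval $[0, 1]$. It provides a number $0 < \alpha < 1$ such that
\[
g'(\alpha) = \frac{g(1) - g(0)}{1 - 0} = f(a + hv_1, b + hv_2) - f(a + hv_1, b) .
\]
By the chain rule,
\[
g'(\alpha) = \frac{\partial f}{\partial y}(a + hv_1, b + \alpha hv_2)\, hv_2 .
\]
Therefore,
\[
f(a + hv_1, b + hv_2) - f(a + hv_1, b) = hv_2 \frac{\partial f}{\partial y}(a + hv_1, b + \alpha hv_2) .
\]
By a similar argument, there exists a number $0 < \beta < 1$ such that
\[
f(a + hv_1, b) - f(a, b) = hv_1 \frac{\partial f}{\partial x}(a + \beta hv_1, b) .
\]
Thus, by equation (3.11), we have
\[
\frac{f(a + hv_1, b + hv_2) - f(a, b)}{h}
= \frac{hv_2 \frac{\partial f}{\partial y}(a + hv_1, b + \alpha hv_2) + hv_1 \frac{\partial f}{\partial x}(a + \beta hv_1, b)}{h}
= v_2 \frac{\partial f}{\partial y}(a + hv_1, b + \alpha hv_2) + v_1 \frac{\partial f}{\partial x}(a + \beta hv_1, b) .
\]
As $h \to 0$, the right-hand side tends to $v_2 \frac{\partial f}{\partial y}(a, b) + v_1 \frac{\partial f}{\partial x}(a, b)$ by the continuity of $\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial y}$, so
\[
D_{\mathbf{v}} f(a, b) = v_1 \frac{\partial f}{\partial x}(a, b) + v_2 \frac{\partial f}{\partial y}(a, b) .
\]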
Along the same lines one can prove the following generalization of the chain rule.
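Theorem 3.3. Let $f(x, y)$ be a real-valued function with domain $D$ in $\mathbb{R}^2$ such that the partial derivatives $\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial y}$ exist and are continuous in $D$, and let $\mathbf{h}(t) = (h_1(t), h_2(t))$ be a smooth function with values in $D$. Then
\[
\frac{d}{dt} f(\mathbf{h}(t)) = \frac{\partial f}{\partial x}(\mathbf{h}(t))\, h_1'(t) + \frac{\partial f}{\partial y}(\mathbf{h}(t))\, h_2'(t) .
\]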
Note that $D_{\mathbf{v}} f(a, b) = \mathbf{v} \cdot \left(\frac{\partial f}{\partial x}(a, b), \frac{\partial f}{\partial y}(a, b)\right)$. The second vector has a special name:
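Definition 3.6. For a real-valued function $f(x, y)$, the gradient of $f$, denoted by $\nabla f$,⁵ is the vector
\[
\nabla f = \left(\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}\right) \tag{3.13}
\]
in $\mathbb{R}^2$. For a real-valued function $f(x, y, z)$, the gradient is the vector
\[
\nabla f = \left(\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}, \frac{\partial f}{\partial z}\right) \tag{3.14}
\]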
Corollary 3.4. Under the assumptions of Theorems 3.2 and 3.3, we have
(a) $D_{\mathbf{v}} f = \mathbf{v} \cdot \nabla f$;
Example 3.15. Find the directional derivative of $f(x, y) = xy^2 + x^3 y$ at the point $(1, 2)$ in the direction of $\mathbf{v} = \left(\frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}}\right)$.
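One way to evaluate the directional derivative asked for in Example 3.15 is to apply formula (3.10) and, as a sanity check, compare with the difference quotient from (3.9) for a small h. The sketch below does both; the class and method names and the step size are assumptions made for illustration, and the partial derivatives ∂f/∂x = y² + 3x²y and ∂f/∂y = 2xy + x³ are entered by hand.

```java
// Sketch: directional derivative of f(x,y) = x y^2 + x^3 y at (1,2)
// in the direction v = (1/sqrt(2), 1/sqrt(2)). Names are illustrative.
public class DirectionalDerivative {
    public static double f(double x, double y)  { return x*y*y + x*x*x*y; }
    public static double fx(double x, double y) { return y*y + 3*x*x*y; }   // df/dx
    public static double fy(double x, double y) { return 2*x*y + x*x*x; }   // df/dy

    // Formula (3.10): v1 * df/dx(a,b) + v2 * df/dy(a,b)
    public static double byGradient(double a, double b, double v1, double v2) {
        return v1*fx(a, b) + v2*fy(a, b);
    }

    // Difference quotient from (3.9) for one fixed small h (no limit is taken)
    public static double byQuotient(double a, double b, double v1, double v2, double h) {
        return (f(a + h*v1, b + h*v2) - f(a, b)) / h;
    }

    public static void main(String[] args) {
        double s = 1/Math.sqrt(2);
        System.out.println("Formula (3.10):      " + byGradient(1, 2, s, s));
        System.out.println("Difference quotient: " + byQuotient(1, 2, s, s, 1e-6));
    }
}
```

Since ∇f(1, 2) = (10, 5), formula (3.10) gives (10 + 5)/√2 = 15/√2, and the difference quotient should agree with this to several decimal places.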
A real-valued function $z = f(x, y)$ whose partial derivatives $\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial y}$ exist and are continuous is called continuously differentiable. Assume that $f(x, y)$ is such a function and that $\nabla f \ne \mathbf{0}$. Let $c$ be a real number in the range of $f$ and let $\mathbf{v}$ be a vector in $\mathbb{R}^2$ which is tangent to the level curve $f(x, y) = c$ (see Figure 3.4.1).
5 Sometimes the notation grad($f$) is used instead of $\nabla f$.
[Figure 3.4.1: the level curve $f(x, y) = c$ with a tangent vector $\mathbf{v}$ and the gradient $\nabla f$]
The value of $f(x, y)$ is constant along a level curve, so since $\mathbf{v}$ is a tangent vector to this curve, the rate of change of $f$ in the direction of $\mathbf{v}$ is 0; that is, $D_{\mathbf{v}} f = 0$. But we know that $D_{\mathbf{v}} f = \mathbf{v} \cdot \nabla f$. In other words, $\nabla f \perp \mathbf{v}$, which means that $\nabla f$ is normal to the level curve.
In general, for any unit vector $\mathbf{v}$ in $\mathbb{R}^2$, we have $D_{\mathbf{v}} f = \|\nabla f\| \cos\theta$, where $\theta$ is the angle between $\mathbf{v}$ and $\nabla f$. At a fixed point $(x, y)$ the length $\|\nabla f\|$ is fixed, and the value of $D_{\mathbf{v}} f$ then varies as $\theta$ varies. The largest value that $D_{\mathbf{v}} f$ can take is when $\cos\theta = 1$ ($\theta = 0^\circ$), while the smallest value occurs when $\cos\theta = -1$ ($\theta = 180^\circ$). In other words, the value of the function $f$ increases the fastest in the direction of $\nabla f$ (since $\theta = 0^\circ$ in that case), and the value of $f$ decreases the fastest in the direction of $-\nabla f$ (since $\theta = 180^\circ$ in that case). We have thus proved the following theorem:
Example 3.16. In which direction does the function $f(x, y) = xy^2 + x^3 y$ increase the fastest from the point $(1, 2)$? In which direction does it decrease the fastest?
Solution: Since $\nabla f = (y^2 + 3x^2 y, 2xy + x^3)$, then $\nabla f(1, 2) = (10, 5) \ne \mathbf{0}$. A unit vector in that direction is $\mathbf{v} = \frac{\nabla f}{\|\nabla f\|} = \left(\frac{2}{\sqrt{5}}, \frac{1}{\sqrt{5}}\right)$. Thus, $f$ increases the fastest in the direction of $\left(\frac{2}{\sqrt{5}}, \frac{1}{\sqrt{5}}\right)$ and decreases the fastest in the direction of $\left(\frac{-2}{\sqrt{5}}, \frac{-1}{\sqrt{5}}\right)$.
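The two steps of Example 3.16 (evaluate the gradient, then normalize it) can be sketched as follows. The class name and the helper names `gradient` and `unit` are hypothetical, chosen only for this illustration; the gradient formula is the one from the solution above.

```java
// Sketch: direction of fastest increase of f(x,y) = x y^2 + x^3 y.
// Helper names are illustrative, not taken from the text.
public class SteepestDirection {
    // Gradient of f: (y^2 + 3x^2 y, 2xy + x^3)
    public static double[] gradient(double x, double y) {
        return new double[] { y*y + 3*x*x*y, 2*x*y + x*x*x };
    }

    // Unit vector in the direction of g; undefined for the zero vector
    public static double[] unit(double[] g) {
        double norm = Math.hypot(g[0], g[1]);
        if (norm == 0) {
            throw new IllegalArgumentException("The zero vector has no direction");
        }
        return new double[] { g[0]/norm, g[1]/norm };
    }

    public static void main(String[] args) {
        double[] up = unit(gradient(1, 2));
        System.out.println("Fastest increase: (" + up[0] + "," + up[1] + ")");
        System.out.println("Fastest decrease: (" + (-up[0]) + "," + (-up[1]) + ")");
    }
}
```

At the point (1, 2) the computed unit vector should agree with (2/√5, 1/√5) up to rounding.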
Though we proved Theorem 3.5 for functions of two variables, a similar argument can
be used to show that it also applies to functions of three or more variables. Likewise, the
directional derivative in the three-dimensional case can also be defined by the formula D v f =
v· ∇f .
Example 3.17. The temperature $T$ of a solid is given by the function
\[
T(x, y, z) = e^{-x} + e^{-2y} + e^{4z} ,
\]
where $x, y, z$ are space coordinates relative to the center of the solid. In which direction from the point $(1, 1, 1)$ will the temperature decrease the fastest?
Solution: Since $\nabla T = (-e^{-x}, -2e^{-2y}, 4e^{4z})$, the temperature will decrease the fastest in the direction of $-\nabla T(1, 1, 1) = (e^{-1}, 2e^{-2}, -4e^{4})$.
Exercises
A
For Exercises 1–10, compute the gradient ∇ f .
1. $f(x, y) = x^2 + y^2 - 1$; 2. $f(x, y) = \frac{1}{x^2 + y^2}$;
3. $f(x, y) = \sqrt{x^2 + y^2 + 4}$; 4. $f(x, y) = x^2 e^y$;
5. $f(x, y) = \ln(xy)$; 6. $f(x, y) = 2x + 5y$;
7. $f(x, y, z) = \sin(xyz)$; 8. $f(x, y, z) = x^2 e^{yz}$;
9. $f(x, y, z) = x^2 + y^2 + z^2$; 10. $f(x, y, z) = \sqrt{x^2 + y^2 + z^2}$.
For Exercises 11–14, find the directional derivative of $f$ at the point $P$ in the direction of $\mathbf{v} = \left(\frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}}\right)$.
11. $f(x, y) = x^2 + y^2 - 1$, $P = (1, 1)$; 12. $f(x, y) = \frac{1}{x^2 + y^2}$, $P = (1, 1)$;
13. $f(x, y) = \sqrt{x^2 + y^2 + 4}$, $P = (1, 1)$; 14. $f(x, y) = x^2 e^y$, $P = (1, 1)$.
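19. $\nabla(cf) = c\,\nabla f$; 20. $\nabla(f + g) = \nabla f + \nabla g$;
21. $\nabla(fg) = f\,\nabla g + g\,\nabla f$; 22. $\nabla(f/g) = \frac{g\,\nabla f - f\,\nabla g}{g^2}$ if $g(x, y) \ne 0$;
23. $D_{-\mathbf{v}} f = -D_{\mathbf{v}} f$; 24. $D_{\mathbf{v}}(cf) = c\,D_{\mathbf{v}} f$;
25. $D_{\mathbf{v}}(f + g) = D_{\mathbf{v}} f + D_{\mathbf{v}} g$; 26. $D_{\mathbf{v}}(fg) = f\,D_{\mathbf{v}} g + g\,D_{\mathbf{v}} f$.
27. The function $r(x, y) = \sqrt{x^2 + y^2}$ is the length of the position vector $\mathbf{r} = x\,\mathbf{i} + y\,\mathbf{j}$ for each point $(x, y)$ in $\mathbb{R}^2$. Show that $\nabla r = \frac{1}{r}\,\mathbf{r}$ when $(x, y) \ne (0, 0)$, and that $\nabla(r^2) = 2\,\mathbf{r}$.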
C
28. Let $g(x)$ and $f(x, y)$ be smooth functions such that
\[
f(x, g(x)) = 0 .
\]
Show that
\[
\frac{\partial f}{\partial x}(x, g(x)) + g'(x)\,\frac{\partial f}{\partial y}(x, g(x)) = 0 .
\]
(Hint: Apply Theorem 3.3 for the curve $\mathbf{h}(t) = (t, g(t))$.)
3.5 Maxima and Minima 93
Definition 3.7. Let f ( x, y) be a real-valued function, and let ( x0 , y0 ) be a point in the domain
of f . We say that f has a local maximum at ( x0 , y0 ) if f ( x, y) ≤ f ( x0 , y0 ) for all ( x, y) inside
some disk of positive radius centered at ( x0 , y0 ); that is, there is some sufficiently small r > 0
such that f ( x, y) ≤ f ( x0 , y0 ) for all ( x, y) for which ( x − x0 )2 + ( y − y0 )2 < r 2 .
Likewise, we say that f has a local minimum at ( x0 , y0 ) if f ( x, y) ≥ f ( x0 , y0 ) for all ( x, y)
inside some disk of positive radius centered at ( x0 , y0 ).
If f ( x, y) ≤ f ( x0 , y0 ) for all ( x, y) in the domain of f , then f has a global maximum at
( x0 , y0 ). If f ( x, y) ≥ f ( x0 , y0 ) for all ( x, y) in the domain of f , then f has a global minimum
at ( x0 , y0 ).
Suppose that $(x_0, y_0)$ is a local maximum point for $f(x, y)$, and that the first-order partial derivatives of $f$ exist at $(x_0, y_0)$. We know that $f(x_0, y_0)$ is the largest value of $f(x, y)$ as $(x, y)$ goes in all directions from the point $(x_0, y_0)$, in some sufficiently small disk centered at $(x_0, y_0)$. In particular, $f(x_0, y_0)$ is the largest value of $f$ in the $x$ direction (around the point $(x_0, y_0)$), that is, the single-variable function $g(x) = f(x, y_0)$ has a local maximum at $x = x_0$. So we know that $g'(x_0) = 0$. Since $g'(x) = \frac{\partial f}{\partial x}(x, y_0)$, then $\frac{\partial f}{\partial x}(x_0, y_0) = 0$. Similarly, $f(x_0, y_0)$ is the largest value of $f$ near $(x_0, y_0)$ in the $y$ direction and so $\frac{\partial f}{\partial y}(x_0, y_0) = 0$. We thus have the following theorem:
Theorem 3.6. Let $f(x, y)$ be a real-valued function such that both $\frac{\partial f}{\partial x}(x_0, y_0)$ and $\frac{\partial f}{\partial y}(x_0, y_0)$ exist. Then a necessary condition for $f(x, y)$ to have a local maximum or minimum at $(x_0, y_0)$ is that $\nabla f(x_0, y_0) = \mathbf{0}$.
Note: Theorem 3.6 can be extended to apply to functions of three or more variables.
A point ( x0 , y0 ) where ∇ f ( x0 , y0 ) = 0 is called a critical point for the function f ( x, y).
So given a function $f(x, y)$, to find the critical points of $f$ you have to solve the equations $\frac{\partial f}{\partial x}(x, y) = 0$ and $\frac{\partial f}{\partial y}(x, y) = 0$ simultaneously for $(x, y)$. Similar to the single-variable case, the necessary condition that $\nabla f(x_0, y_0) = \mathbf{0}$ is not always sufficient to guarantee that a critical point is a local maximum or minimum.
Example 3.18. The function $f(x, y) = xy$ has a critical point at $(0, 0)$: $\frac{\partial f}{\partial x} = y = 0 \Rightarrow y = 0$, and $\frac{\partial f}{\partial y} = x = 0 \Rightarrow x = 0$, so $(0, 0)$ is the only critical point. But clearly $f$ does not have a local maximum or minimum at $(0, 0)$ since any disk around $(0, 0)$ contains points $(x, y)$ where the values of $x$ and $y$ have the same sign (so that $f(x, y) = xy > 0 = f(0, 0)$) and different signs (so that $f(x, y) = xy < 0 = f(0, 0)$). In fact, along the path $y = x$ in $\mathbb{R}^2$, $f(x, y) = x^2$, which has a
local minimum at (0, 0), while along the path y = − x we have f ( x, y) = − x2 , which has a local
maximum at (0, 0). So (0, 0) is an example of a saddle point; that is, it is a local maximum
in one direction and a local minimum in another direction. The graph of f ( x, y) is shown in
Figure 3.5.1, which is a hyperbolic paraboloid.
[Figure 3.5.1: graph of $z = f(x, y) = xy$ for $-10 \le x \le 10$, $-10 \le y \le 10$]
From single-variable calculus, you may remember the second derivative test: if $f'(x_0) = 0$ and $f''(x_0) > 0$, then the real-to-real function $f$ has a local minimum at $x_0$.
In order to explain the multi-variable analog of this test, let us introduce the second directional derivative. Fix a vector $\mathbf{v} \in \mathbb{R}^2$ and a smooth function of two variables $f(x, y)$. The directional derivative $h(x, y) = D_{\mathbf{v}} f(x, y)$ is another smooth function of two variables, so we can take its directional derivative again, $D_{\mathbf{v}} h(x, y)$; it is called the second directional derivative and is denoted by $D_{\mathbf{v}}^2 f(x, y)$.
If $D_{\mathbf{v}} f(x_0, y_0) = 0$ and $D_{\mathbf{v}}^2 f(x_0, y_0) > 0$ for any vector $\mathbf{v} \ne \mathbf{0}$, then the smooth function $f(x, y)$ of two variables has a local minimum at $(x_0, y_0)$.
In this form the second derivative test is not useful, since it requires checking the inequality $D_{\mathbf{v}}^2 f(x_0, y_0) > 0$ for an infinite number of vectors $\mathbf{v}$. Let us try to remove this weak point.
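Writing $\mathbf{v} = (a, b)$, Theorem 3.2 gives
\[
D_{\mathbf{v}} f(x_0, y_0) = a\,\frac{\partial f}{\partial x} + b\,\frac{\partial f}{\partial y}
\]
and
\[
D_{\mathbf{v}}^2 f(x_0, y_0) = D_{\mathbf{v}}\!\left(a\,\frac{\partial f}{\partial x} + b\,\frac{\partial f}{\partial y}\right)
= a^2 \frac{\partial^2 f}{\partial x^2} + ab\,\frac{\partial^2 f}{\partial x\,\partial y} + ab\,\frac{\partial^2 f}{\partial y\,\partial x} + b^2 \frac{\partial^2 f}{\partial y^2}
= a^2 \frac{\partial^2 f}{\partial x^2} + 2ab\,\frac{\partial^2 f}{\partial y\,\partial x} + b^2 \frac{\partial^2 f}{\partial y^2} ,
\]
where all partial derivatives are evaluated at $(x_0, y_0)$; the last equality holds since the function $f$ is smooth and, therefore, $\frac{\partial^2 f}{\partial x\,\partial y} = \frac{\partial^2 f}{\partial y\,\partial x}$.
Therefore, the condition $D_{\mathbf{v}}^2 f(x_0, y_0) > 0$ for any $\mathbf{v} \ne \mathbf{0}$ means that
\[
a^2 \frac{\partial^2 f}{\partial x^2} + 2ab\,\frac{\partial^2 f}{\partial y\,\partial x} + b^2 \frac{\partial^2 f}{\partial y^2} > 0
\]
for any pair of real numbers $(a, b)$ at least one of which is not zero.
Analyzing the last inequality for all possible pairs $(a, b)$ leads to the following theorem, which is the true analog of the second derivative test for smooth functions of two variables; it gives sufficient conditions for a critical point to be a local maximum or minimum of a smooth function (that is, a function whose partial derivatives of all orders exist and are continuous). The theorem will not be proved here.⁶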
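Theorem 3.7. Let $f(x, y)$ be a smooth real-valued function with a critical point at $(x_0, y_0)$. Define
\[
D = \frac{\partial^2 f}{\partial x^2}(x_0, y_0)\,\frac{\partial^2 f}{\partial y^2}(x_0, y_0) - \left(\frac{\partial^2 f}{\partial y\,\partial x}(x_0, y_0)\right)^2 .
\]
Then
(a) if $D > 0$ and $\frac{\partial^2 f}{\partial x^2}(x_0, y_0) > 0$, then $f$ has a local minimum at $(x_0, y_0)$
(b) if $D > 0$ and $\frac{\partial^2 f}{\partial x^2}(x_0, y_0) < 0$, then $f$ has a local maximum at $(x_0, y_0)$
(c) if $D < 0$, then $f$ has neither a local minimum nor a local maximum at $(x_0, y_0)$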
If condition (c) holds, then $(x_0, y_0)$ is a saddle point; that is, the second directional derivative $D_{\mathbf{v}}^2 f(x_0, y_0)$ can be positive and negative for different vectors $\mathbf{v}$.
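The case analysis of the second derivative test for two variables translates directly into code. The sketch below is illustrative only: the class name, the method `classify` and the returned labels are hypothetical, and the caller is assumed to supply the three second-order partial derivatives already evaluated at a critical point.

```java
// Sketch: classify a critical point from its second-order partial derivatives
// using D = fxx*fyy - fxy^2. Names and labels are illustrative choices.
public class SecondDerivativeTest {
    public static String classify(double fxx, double fyy, double fxy) {
        double D = fxx*fyy - fxy*fxy;
        if (D > 0 && fxx > 0) return "local minimum";   // case (a)
        if (D > 0 && fxx < 0) return "local maximum";   // case (b)
        if (D < 0) return "saddle point";               // case (c)
        return "inconclusive";                          // D = 0: the test says nothing
    }

    public static void main(String[] args) {
        // Second-order partial derivatives taken from the worked examples
        // later in this section
        System.out.println(classify(2, 2, 1));    // fxx = 2, fyy = 2, fxy = 1
        System.out.println(classify(0, -2, 1));   // fxx = 0, fyy = -2, fxy = 1
        System.out.println(classify(-1, -2, 1));  // fxx = -1, fyy = -2, fxy = 1
        System.out.println(classify(2, 8, -4));   // fxx = 2, fyy = 8, fxy = -4
    }
}
```

Note that when D > 0 the value of fxx cannot be zero, so cases (a) and (b) cover every situation with D > 0; only D = 0 is left undecided.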
6 See TAYLOR and MANN, § 7.6.
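Recall that the assumption that $f(x, y)$ is smooth implies that $\frac{\partial^2 f}{\partial y\,\partial x} = \frac{\partial^2 f}{\partial x\,\partial y}$. Therefore
\[
D =
\begin{vmatrix}
\frac{\partial^2 f}{\partial x^2}(x_0, y_0) & \frac{\partial^2 f}{\partial y\,\partial x}(x_0, y_0) \\[4pt]
\frac{\partial^2 f}{\partial x\,\partial y}(x_0, y_0) & \frac{\partial^2 f}{\partial y^2}(x_0, y_0)
\end{vmatrix} .
\]
Also, if $D > 0$ then $\frac{\partial^2 f}{\partial x^2}(x_0, y_0)\,\frac{\partial^2 f}{\partial y^2}(x_0, y_0) = D + \left(\frac{\partial^2 f}{\partial y\,\partial x}(x_0, y_0)\right)^2 > 0$, and so $\frac{\partial^2 f}{\partial x^2}(x_0, y_0)$ and $\frac{\partial^2 f}{\partial y^2}(x_0, y_0)$ have the same sign. This means that in parts (a) and (b) of the theorem one can replace $\frac{\partial^2 f}{\partial x^2}(x_0, y_0)$ by $\frac{\partial^2 f}{\partial y^2}(x_0, y_0)$ if desired.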
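then the critical points $(x, y)$ are the common solutions of the equations
\[
2x + y - 3 = 0 , \qquad x + 2y = 0 ,
\]
which has the unique solution $(x, y) = (2, -1)$. So $(2, -1)$ is the only critical point.
To use Theorem 3.7, we need the second-order partial derivatives:
\[
\frac{\partial^2 f}{\partial x^2} = 2 , \qquad \frac{\partial^2 f}{\partial y^2} = 2 , \qquad \frac{\partial^2 f}{\partial y\,\partial x} = 1
\]
and so
\[
D = \frac{\partial^2 f}{\partial x^2}(2, -1)\,\frac{\partial^2 f}{\partial y^2}(2, -1) - \left(\frac{\partial^2 f}{\partial y\,\partial x}(2, -1)\right)^2 = (2)(2) - 1^2 = 3 > 0
\]
and $\frac{\partial^2 f}{\partial x^2}(2, -1) = 2 > 0$. Thus, $(2, -1)$ is a local minimum.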
then the critical points $(x, y)$ are the common solutions of the equations
\[
y - 3x^2 = 0 , \qquad x - 2y = 0 .
\]
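The first equation yields $y = 3x^2$; substituting that into the second equation yields $x - 6x^2 = 0$, which has the solutions $x = 0$ and $x = \frac{1}{6}$. So $x = 0 \Rightarrow y = 3(0) = 0$ and $x = \frac{1}{6} \Rightarrow y = 3\left(\frac{1}{6}\right)^2 = \frac{1}{12}$. So the critical points are $(x, y) = (0, 0)$ and $(x, y) = \left(\frac{1}{6}, \frac{1}{12}\right)$.
To use Theorem 3.7, we need the second-order partial derivatives:
\[
\frac{\partial^2 f}{\partial x^2} = -6x , \qquad \frac{\partial^2 f}{\partial y^2} = -2 , \qquad \frac{\partial^2 f}{\partial y\,\partial x} = 1
\]
So
\[
D = \frac{\partial^2 f}{\partial x^2}(0, 0)\,\frac{\partial^2 f}{\partial y^2}(0, 0) - \left(\frac{\partial^2 f}{\partial y\,\partial x}(0, 0)\right)^2 = (-6(0))(-2) - 1^2 = -1 < 0
\]
and thus $(0, 0)$ is a saddle point. Also,
\[
D = \frac{\partial^2 f}{\partial x^2}\left(\tfrac{1}{6}, \tfrac{1}{12}\right)\frac{\partial^2 f}{\partial y^2}\left(\tfrac{1}{6}, \tfrac{1}{12}\right) - \left(\frac{\partial^2 f}{\partial y\,\partial x}\left(\tfrac{1}{6}, \tfrac{1}{12}\right)\right)^2 = \left(-6\left(\tfrac{1}{6}\right)\right)(-2) - 1^2 = 1 > 0
\]
and $\frac{\partial^2 f}{\partial x^2}\left(\frac{1}{6}, \frac{1}{12}\right) = -1 < 0$. Thus, $\left(\frac{1}{6}, \frac{1}{12}\right)$ is a local maximum.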
Example 3.21. Find all local maxima and minima of $f(x, y) = (x - 2)^4 + (x - 2y)^2$.
Solution: First find the critical points; that is, solve $\nabla f = \mathbf{0}$. Since
\[
\frac{\partial f}{\partial x} = 4(x - 2)^3 + 2(x - 2y) \quad \text{and} \quad \frac{\partial f}{\partial y} = -4(x - 2y)
\]
then the critical points $(x, y)$ are the common solutions of the equations
\[
4(x - 2)^3 + 2(x - 2y) = 0 , \qquad -4(x - 2y) = 0 .
\]
The second equation yields $x = 2y$; substituting that into the first equation yields $4(2y - 2)^3 = 0$, which has the solution $y = 1$, and so $x = 2(1) = 2$. Thus, $(2, 1)$ is the only critical point.
To use Theorem 3.7, we need the second-order partial derivatives:
\[
\frac{\partial^2 f}{\partial x^2} = 12(x - 2)^2 + 2 , \qquad \frac{\partial^2 f}{\partial y^2} = 8 , \qquad \frac{\partial^2 f}{\partial y\,\partial x} = -4
\]
So
\[
D = \frac{\partial^2 f}{\partial x^2}(2, 1)\,\frac{\partial^2 f}{\partial y^2}(2, 1) - \left(\frac{\partial^2 f}{\partial y\,\partial x}(2, 1)\right)^2 = (2)(8) - (-4)^2 = 0
\]
and so the test fails. What can be done in this situation? Sometimes it is possible to examine
the function to see directly the nature of a critical point. In our case, we see that f ( x, y) ≥ 0
for all ( x, y), since f ( x, y) is the sum of fourth and second powers of numbers and hence must
be nonnegative. But we also see that f (2, 1) = 0. Thus f ( x, y) ≥ 0 = f (2, 1) for all ( x, y), and
hence (2, 1) is, in fact, a global minimum for f .
Example 3.22. Find all local maxima and minima of $f(x, y) = (x^2 + y^2)\,e^{-(x^2 + y^2)}$.
Solution: First find the critical points; that is, solve $\nabla f = \mathbf{0}$. Since
\[
\frac{\partial f}{\partial x} = 2x\bigl(1 - (x^2 + y^2)\bigr)\,e^{-(x^2 + y^2)}
\]
\[
\frac{\partial f}{\partial y} = 2y\bigl(1 - (x^2 + y^2)\bigr)\,e^{-(x^2 + y^2)}
\]
then the critical points are $(0, 0)$ and all points $(x, y)$ on the unit circle $x^2 + y^2 = 1$.
To use Theorem 3.7, we need the second-order partial derivatives:
\[
\frac{\partial^2 f}{\partial x^2} = 2\bigl[1 - (x^2 + y^2) - 2x^2 - 2x^2(1 - (x^2 + y^2))\bigr]\,e^{-(x^2 + y^2)}
\]
\[
\frac{\partial^2 f}{\partial y^2} = 2\bigl[1 - (x^2 + y^2) - 2y^2 - 2y^2(1 - (x^2 + y^2))\bigr]\,e^{-(x^2 + y^2)}
\]
\[
\frac{\partial^2 f}{\partial y\,\partial x} = -4xy\bigl[2 - (x^2 + y^2)\bigr]\,e^{-(x^2 + y^2)}
\]
At $(0, 0)$, we have $D = 4 > 0$ and $\frac{\partial^2 f}{\partial x^2}(0, 0) = 2 > 0$, so $(0, 0)$ is a local minimum. However, for points $(x, y)$ on the unit circle $x^2 + y^2 = 1$, we have
\[
D = (-4x^2 e^{-1})(-4y^2 e^{-1}) - (-4xy\,e^{-1})^2 = 0
\]
and so the test fails. If we look at the graph of $f(x, y)$, as shown in Figure 3.5.2, it looks like we might have a local maximum for $(x, y)$ on the unit circle $x^2 + y^2 = 1$. If we switch to using polar coordinates $(r, \theta)$ instead of $(x, y)$ in $\mathbb{R}^2$, where $r^2 = x^2 + y^2$, then we see that we can write $f(x, y)$ as a function $g(r)$ of the variable $r$ alone: $g(r) = r^2 e^{-r^2}$. Then $g'(r) = 2r(1 - r^2)\,e^{-r^2}$, so it has a critical point at $r = 1$, and we can check that $g''(1) = -4e^{-1} < 0$, so the Second Derivative Test from single-variable calculus says that $r = 1$ is a local maximum. But $r = 1$ corresponds to the unit circle $x^2 + y^2 = 1$. Thus, the points $(x, y)$ on the unit circle $x^2 + y^2 = 1$ are local maximum points for $f$.
Exercises
A
For Exercises 1–10, find all local maxima and minima of the function f ( x, y).
[Figure 3.5.2 $f(x, y) = (x^2 + y^2)\,e^{-(x^2 + y^2)}$]
1. $f(x, y) = x^3 - 3x + y^2$; 2. $f(x, y) = x^3 - 12x + y^2 + 8y$;
3. $f(x, y) = x^3 - 3x + y^3 - 3y$; 4. $f(x, y) = x^3 + 3x^2 + y^3 - 3y^2$;
5. $f(x, y) = 2x^3 + 6xy + 3y^2$; 6. $f(x, y) = 2x^3 - 6xy + y^2$;
7. $f(x, y) = \sqrt{x^2 + y^2}$; 8. $f(x, y) = x + 2y$;
9. $f(x, y) = 4x^2 - 4xy + 2y^2 + 10x - 6y$; 10. $f(x, y) = -4x^2 + 4xy - 2y^2 + 16x - 12y$.
B
11. For a rectangular solid of volume 1000 cubic meters, find the dimensions that will min-
imize the surface area. (Hint: Use the volume condition to write the surface area as a
function of just two variables.)
12. Prove that if ( x0 , y0 ) is a local maximum or local minimum point for a smooth function
f ( x, y), then the tangent plane to the surface z = f ( x, y) at the point ( x0 , y0 , f ( x0 , y0 )) is
parallel to the x y-plane. (Hint: Use Theorem 3.6.)
C
13. Find three positive numbers $x, y, z$ whose sum is 10 such that $x^2 y^2 z$ is a maximum.
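\[
D(x, y) = \frac{\partial^2 f}{\partial x^2}(x, y)\,\frac{\partial^2 f}{\partial y^2}(x, y) - \left(\frac{\partial^2 f}{\partial y\,\partial x}(x, y)\right)^2 .
\]
Newton’s algorithm: Pick an initial point $(x_0, y_0)$. For $n = 0, 1, 2, 3, \ldots$, define:
\[
x_{n+1} = x_n -
\frac{\begin{vmatrix}
\frac{\partial^2 f}{\partial y^2}(x_n, y_n) & \frac{\partial^2 f}{\partial x\,\partial y}(x_n, y_n) \\[4pt]
\frac{\partial f}{\partial y}(x_n, y_n) & \frac{\partial f}{\partial x}(x_n, y_n)
\end{vmatrix}}{D(x_n, y_n)} ,
\qquad
y_{n+1} = y_n -
\frac{\begin{vmatrix}
\frac{\partial^2 f}{\partial x^2}(x_n, y_n) & \frac{\partial^2 f}{\partial x\,\partial y}(x_n, y_n) \\[4pt]
\frac{\partial f}{\partial x}(x_n, y_n) & \frac{\partial f}{\partial y}(x_n, y_n)
\end{vmatrix}}{D(x_n, y_n)} .
\tag{3.15}
\]
Then the sequence of points $(x_n, y_n)_{n=1}^{\infty}$ typically converges to a critical point. If there are several critical points, then you will have to try different initial points to find them.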
The choice of the formulas in (3.15) is motivated by the following fact, which can be checked by direct calculation. Assume that the partial derivatives $\frac{\partial^2 f}{\partial x^2}(x, y)$, $\frac{\partial^2 f}{\partial x\,\partial y}(x, y)$ and $\frac{\partial^2 f}{\partial y^2}(x, y)$ are constants; in other words, the function $f(x, y)$ can be expressed as a quadratic polynomial in $x$ and $y$, say
\[
f(x, y) = a + bx + cy + lx^2 + mxy + ny^2
\]
for some constants $a, b, c, l, m, n$. Then for any choice of $(x_0, y_0)$ the formulas (3.15) return a critical point $(x_1, y_1)$, which is unique in this case.
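The claim about quadratic polynomials can be tried out numerically. The following sketch applies one step of the formulas (3.15) to a quadratic f(x, y) = a + bx + cy + lx² + mxy + ny². The coefficient values (b = −3, c = 0, l = m = n = 1, so that f(x, y) = a + x² + xy + y² − 3x) and the starting points are assumptions chosen only for illustration; the class and method names are hypothetical as well.

```java
// Sketch: one step of Newton's algorithm (3.15) for a quadratic polynomial
// f(x,y) = a + b x + c y + l x^2 + m x y + n y^2 with illustrative coefficients.
public class NewtonQuadratic {
    static final double b = -3, c = 0, l = 1, m = 1, n = 1;

    public static double fx(double x, double y) { return b + 2*l*x + m*y; } // df/dx
    public static double fy(double x, double y) { return c + m*x + 2*n*y; } // df/dy

    // Returns (x1, y1) computed from (x, y) by the formulas (3.15)
    public static double[] step(double x, double y) {
        double fxx = 2*l, fyy = 2*n, fxy = m;   // constant second-order partials
        double D = fxx*fyy - fxy*fxy;
        if (D == 0) {
            throw new ArithmeticException("D = 0, so (3.15) cannot be applied");
        }
        double xNew = x - (fyy*fx(x, y) - fxy*fy(x, y))/D;
        double yNew = y - (fxx*fy(x, y) - fxy*fx(x, y))/D;
        return new double[] {xNew, yNew};
    }

    public static void main(String[] args) {
        double[] p = step(5, 7);        // arbitrary starting point
        double[] q = step(-100, 3.5);   // a very different starting point
        System.out.println("(" + p[0] + "," + p[1] + ")");
        System.out.println("(" + q[0] + "," + q[1] + ")");
    }
}
```

For these coefficients the gradient vanishes only at (2, −1), and both starting points should land there after a single step, up to rounding.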
7 This is also a problem for the equivalent method (the Second Derivative Test) in single-variable calculus,
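\[
\frac{\partial f}{\partial x} = 3x^2 - y - 1 + y^3 , \qquad \frac{\partial f}{\partial y} = -x + 3xy^2 - 4y^3
\]
\[
\frac{\partial^2 f}{\partial x^2} = 6x , \qquad \frac{\partial^2 f}{\partial y^2} = 6xy - 12y^2 , \qquad \frac{\partial^2 f}{\partial y\,\partial x} = -1 + 3y^2
\]
Notice that solving $\nabla f = \mathbf{0}$ would involve solving two third-degree polynomial equations in $x$ and $y$, which in this case cannot be done easily.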
We need to pick an initial point ( x0 , y0 ) for our algorithm. Looking at the graph of z = f ( x, y)
over a large region may help (see Figure 3.6.1 below), though it may be hard to tell where
the critical points are.
[Figure 3.6.1: graph of $z = f(x, y)$ for $-20 \le x \le 20$, $-20 \le y \le 20$]
Notice in the formulas (3.15) that we divide by $D$, so we should pick an initial point where $D$ is not zero. And we can see that $D(0, 0) = (0)(0) - (-1)^2 = -1 \ne 0$, so take $(0, 0)$ as our initial point. Since it may take a large number of iterations of Newton’s algorithm to be sure that
we are close enough to the actual critical point, and since the computations are quite tedious,
we will let a computer do the computing. For this, we will write a simple program, using
the Java programming language, which will take a given initial point as a parameter and
then perform 100 iterations of Newton’s algorithm. In each iteration the new point will be
printed, so that we can see if there is convergence. The full code is shown in Listing 3.1.
//Program to find the critical points of f(x,y)=x^3-xy-x+xy^3-y^4
public class newton {
    public static void main(String[] args) {
        //Get the initial point (x,y) as command-line parameters
        double x = Double.parseDouble(args[0]); //Initial x value
        double y = Double.parseDouble(args[1]); //Initial y value
        System.out.println("Initial point: (" + x + "," + y + ")");
        //Go through 100 iterations of Newton's algorithm
        for (int n=1; n<=100; n++) {
            double D = fxx(x,y)*fyy(x,y) - Math.pow(fxy(x,y),2);
            double xn = x; double yn = y; //The current x and y values
            if (D == 0) { //We cannot divide by 0
                System.out.println("Error: D = 0 at iteration n = " + n);
                System.exit(0); //End the program
            } else { //Calculate the new values for x and y
                x = xn - (fyy(xn,yn)*fx(xn,yn) - fxy(xn,yn)*fy(xn,yn))/D;
                y = yn - (fxx(xn,yn)*fy(xn,yn) - fxy(xn,yn)*fx(xn,yn))/D;
                System.out.println("n = " + n + ": (" + x + "," + y + ")");
            }
        }
    }
    //Below are the parts specific to the function f
    //The first partial derivative of f wrt x: 3x^2-y-1+y^3
    public static double fx(double x, double y) {
        return 3*Math.pow(x,2) - y - 1 + Math.pow(y,3);
    }
    //The first partial derivative of f wrt y: -x+3xy^2-4y^3
    public static double fy(double x, double y) {
        return -x + 3*x*Math.pow(y,2) - 4*Math.pow(y,3);
    }
    //The second partial derivative of f wrt x: 6x
    public static double fxx(double x, double y) {
        return 6*x;
    }
    //The second partial derivative of f wrt y: 6xy-12y^2
    public static double fyy(double x, double y) {
        return 6*x*y - 12*Math.pow(y,2);
    }
    //The mixed second partial derivative of f wrt x and y: -1+3y^2
    public static double fxy(double x, double y) {
        return -1 + 3*Math.pow(y,2);
    }
}
To use this program, you should first save the code in Listing 3.1 in a plain text file called
newton.java. You will need the Java Development Kit8 to compile the code. In the directory
where newton.java is saved, run this command at a command prompt to compile the code:
javac newton.java
Then run the program with the initial point (0, 0) with this command:
java newton 0 0
Below is the output of the program using (0, 0) as the initial point, truncated to show the
first 10 lines and the last 5 lines:
java newton 0 0
Initial point: (0.0,0.0)
n = 1: (0.0,-1.0)
n = 2: (1.0,-0.5)
n = 3: (0.6065857885615251,-0.44194107452339687)
n = 4: (0.484506572966545,-0.405341511995805)
n = 5: (0.47123972682634485,-0.3966334583092305)
n = 6: (0.47113558510349535,-0.39636450001936047)
n = 7: (0.4711356343449705,-0.3963643379632247)
n = 8: (0.4711356343449874,-0.39636433796318005)
n = 9: (0.4711356343449874,-0.39636433796318005)
n = 10: (0.4711356343449874,-0.39636433796318005)
...
n = 96: (0.4711356343449874,-0.39636433796318005)
n = 97: (0.4711356343449874,-0.39636433796318005)
n = 98: (0.4711356343449874,-0.39636433796318005)
n = 99: (0.4711356343449874,-0.39636433796318005)
n = 100: (0.4711356343449874,-0.39636433796318005)
As you can see, we appear to have converged fairly quickly (after only 8 iterations) to what appears to be an actual critical point (up to Java’s level of precision), namely the point $(0.4711356343449874, -0.39636433796318005)$. It is easy to confirm that $\nabla f = \mathbf{0}$ at this point, either by evaluating $\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial y}$ at the point ourselves or by modifying our program to also print the values of the partial derivatives at the point. It turns out that both partial derivatives are indeed close enough to zero to be considered zero:
\[
\frac{\partial f}{\partial x}(0.4711356343449874, -0.39636433796318005) = 4.85722573273506 \times 10^{-17}
\]
\[
\frac{\partial f}{\partial y}(0.4711356343449874, -0.39636433796318005) = -8.326672684688674 \times 10^{-17}
\]
We also have $D(0.4711356343449874, -0.39636433796318005) = -8.776075636032301 < 0$, so by Theorem 3.7 we know that $(0.4711356343449874, -0.39636433796318005)$ is a saddle point.
8 Available for free at http://www.oracle.com/technetwork/java/javase/downloads/
Since ∇ f consists of cubic polynomials, it seems likely that there may be three critical
points. The computer program makes experimenting with other initial points easy, and
trying different values does indeed lead to different sequences which converge:
java newton -1 -1
Initial point: (-1.0,-1.0)
n = 1: (-0.5,-0.5)
n = 2: (-0.49295774647887325,-0.08450704225352113)
n = 3: (-0.1855674752461383,-1.2047647348546167)
n = 4: (-0.4540060574531383,-0.8643989895639324)
n = 5: (-0.3672160534444,-0.5426077421319053)
n = 6: (-0.4794622222856417,-0.24529117721011612)
n = 7: (0.11570743992954591,-2.4319791238981274)
n = 8: (-0.05837851765533317,-1.6536079835854451)
n = 9: (-0.129841298650007,-1.121516233310142)
n = 10: (-1.004453014967208,-0.9206128022529645)
n = 11: (-0.5161209914612475,-0.4176293491131443)
n = 12: (-0.5788664043863884,0.2918236503332734)
n = 13: (-0.6985177124230715,0.49848120123515316)
n = 14: (-0.6733618916578702,0.4345777963475479)
n = 15: (-0.6704392913413444,0.4252025996474051)
n = 16: (-0.6703832679150286,0.4250147307973365)
n = 17: (-0.6703832459238701,0.42501465652421205)
n = 18: (-0.6703832459238667,0.4250146565242004)
n = 19: (-0.6703832459238667,0.42501465652420045)
n = 20: (-0.6703832459238667,0.42501465652420045)
...
n = 98: (-0.6703832459238667,0.42501465652420045)
n = 99: (-0.6703832459238667,0.42501465652420045)
n = 100: (-0.6703832459238667,0.42501465652420045)
Again, it is easy to confirm that both $\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial y}$ vanish at the point $(-0.6703832459238667, 0.42501465652420045)$, which means it is a critical point. And
\[
D(-0.6703832459238667, 0.42501465652420045) = 15.3853578526055 > 0
\]
\[
\frac{\partial^2 f}{\partial x^2}(-0.6703832459238667, 0.42501465652420045) = -4.0222994755432 < 0
\]
so we know that $(-0.6703832459238667, 0.42501465652420045)$ is a local maximum. An idea of what the graph of $f$ looks like near that point is shown in Figure 3.6.2, which does suggest a local maximum around that point.
Finally, running the computer program with the initial point (−5, −5) yields the critical
point (−7.540962756992551, −5.595509445899435), with D < 0 at that point, which makes
it a saddle point.
[Figure 3.6.2: surface plot of z = f(x, y) near the local maximum, with axes x, y and z]
The derivation of Newton’s algorithm, and the proof that it converges (given a “reasonable”
choice for the initial point), require techniques beyond the scope of this text. See
RALSTON and RABINOWITZ for more detail and for discussion of other numerical methods.
Our description of Newton’s algorithm is the special two-variable case of a more general
algorithm that can be applied to functions of n ≥ 2 variables.
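As an illustration of the two-variable case, here is a minimal sketch of one way to code the Newton iteration for critical points. It assumes the standard Newton step, which solves the 2 × 2 linear system formed by the second partial derivatives. The function f(x, y) = x⁴ + y⁴ − 4xy, the class name NewtonSketch and the starting point are hypothetical choices for illustration, not the ones used elsewhere in this text. The method names fx, fy, fxx, fyy and fxy only mirror the naming mentioned in the exercises.

```java
// Minimal sketch of Newton's algorithm for critical points of f(x, y).
// Assumption: f(x, y) = x^4 + y^4 - 4xy is a made-up example function.
final class NewtonSketch {
    // First and second partial derivatives of the example function.
    static double fx(double x, double y)  { return 4 * x * x * x - 4 * y; }
    static double fy(double x, double y)  { return 4 * y * y * y - 4 * x; }
    static double fxx(double x, double y) { return 12 * x * x; }
    static double fyy(double x, double y) { return 12 * y * y; }
    static double fxy(double x, double y) { return -4; }

    // Runs up to n Newton steps from (x, y) and returns the final point {x, y}.
    static double[] iterate(double x, double y, int n) {
        for (int i = 0; i < n; i++) {
            // D = fxx * fyy - fxy^2 is the determinant of the second-derivative matrix.
            double d = fxx(x, y) * fyy(x, y) - fxy(x, y) * fxy(x, y);
            if (d == 0.0) {
                break; // the Newton step is undefined when D = 0
            }
            // Solve the 2x2 system for the step (dx, dy) by Cramer's rule.
            double dx = (fyy(x, y) * fx(x, y) - fxy(x, y) * fy(x, y)) / d;
            double dy = (fxx(x, y) * fy(x, y) - fxy(x, y) * fx(x, y)) / d;
            x -= dx;
            y -= dy;
        }
        return new double[] { x, y };
    }

    public static void main(String[] args) {
        double[] p = iterate(1.5, 1.2, 20);
        System.out.println("(" + p[0] + "," + p[1] + ")");
    }
}
```

For this example function the gradient vanishes at (1, 1), and starting from (1.5, 1.2) the iterates settle there after a few steps. Other starting points may lead to a different critical point or fail to converge.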
In the case of functions which have a global maximum or minimum, Newton’s algorithm
can be used to find those points. In general, global maxima and minima tend to be more
interesting than local versions, at least in practical applications. A maximization problem
can always be turned into a minimization problem (why?), so a large number of methods
have been developed to find the global minimum of functions of any number of variables.
This field of study is called nonlinear programming.
Many of these methods are based on the steepest descent technique, which is based on
an idea that we discussed in Section 3.4. Recall that the negative gradient −∇f gives the
direction of the fastest rate of decrease of a function f . The crux of the steepest descent idea,
then, is that starting from some initial point, you move a certain amount in the direction
of −∇ f at that point. Wherever that takes you becomes your new point, and you then just
keep repeating that procedure until eventually (hopefully) you reach the point where f has
its smallest value. There is a “pure” steepest descent method, and a multitude of variations
on it that improve the rate of convergence, ease of calculation, etc. For more discussion of
this, and of nonlinear programming in general, see BAZARAA, SHERALI and SHETTY.
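The idea of repeatedly stepping in the direction of −∇f can be sketched in a few lines. The code below is only an illustration of the simplest fixed-step variant, not the “pure” method or any particular refinement. The function f(x, y) = (x − 1)² + 2(y + 1)², the step size 0.1 and the class name SteepestDescentSketch are all assumptions chosen so that the behavior is easy to check by hand (the minimum is at (1, −1)).

```java
// Minimal sketch of fixed-step steepest descent.
// Assumption: f(x, y) = (x - 1)^2 + 2(y + 1)^2 is a made-up example function.
final class SteepestDescentSketch {
    // Partial derivatives of the example function.
    static double fx(double x, double y) { return 2 * (x - 1); }
    static double fy(double x, double y) { return 4 * (y + 1); }

    // Performs n steps, each moving by `step` times the negative gradient.
    static double[] descend(double x, double y, double step, int n) {
        for (int i = 0; i < n; i++) {
            double gx = fx(x, y);
            double gy = fy(x, y);
            x -= step * gx; // move against the gradient in x
            y -= step * gy; // move against the gradient in y
        }
        return new double[] { x, y };
    }

    public static void main(String[] args) {
        double[] p = descend(5.0, 5.0, 0.1, 200);
        System.out.println("(" + p[0] + "," + p[1] + ")");
    }
}
```

With a fixed step the method can overshoot or stall on less well-behaved functions, which is one reason the variations mentioned above choose the step length more carefully.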
Exercises
C
1. Recall Example 3.21 from the previous section, where we showed that the point (2, 1) was
a global minimum for the function f(x, y) = (x − 2)⁴ + (x − 2y)². Notice that our computer
program can be modified fairly easily to use this function (just change the return values
in the fx, fy, fxx, fyy and fxy function definitions to use the appropriate partial derivative).
Either modify that program or write one of your own in a programming language of your
choice to show that Newton’s algorithm does lead to the point (2, 1). First use the initial
point (0, 3), then use the initial point (3, 2), and compare the results. Make sure that your
program attempts to do 100 iterations of the algorithm. Did anything strange happen
when your program ran? If so, how do you explain it? (Hint: Something strange should
happen.)
f 1 ( x, y) = 0 and f 2 ( x, y) = 0 ,
f₁(x, y) = sin(xy) − x − y = 0 and f₂(x, y) = e^{2x} − 2x + 3y = 0 .
Show that you get two different solutions when using (0, 0) and (1, 1) for the initial point
( x0 , y0 ).
3.7 Lagrange Multipliers
Example 3.24. For a rectangle whose perimeter is 20 m, find the dimensions that will max-
imize the area.
Solution: The area A of a rectangle with width x and height y is A = x y. The perimeter P of
the rectangle is then given by the formula P = 2 x + 2 y. Since we are given that the perimeter
P = 20, this problem can be stated as:
Maximize : f ( x, y) = x y
given : 2 x + 2 y = 20
The reader is probably familiar with a simple method, using single-variable calculus, for
solving this problem. Since we must have 2 x + 2 y = 20, then we can solve for, say, y in
terms of x using that equation. This gives y = 10 − x, which we then substitute into f to get
f(x, y) = xy = x(10 − x) = 10x − x². This is now a function of x alone, so we now just have to
maximize the function f(x) = 10x − x² on the interval [0, 10]. Since f′(x) = 10 − 2x = 0 ⇒ x = 5
and f ′′ (5) = −2 < 0, then the Second Derivative Test tells us that x = 5 is a local maximum
for f , and hence x = 5 must be the global maximum on the interval [0, 10] (since f = 0 at
the endpoints of the interval). So since y = 10 − x = 5, then the maximum area occurs for a
rectangle whose width and height both are 5 m.
Notice in the above example that the ease of the solution depended on being able to solve
for one variable in terms of the other in the equation 2 x + 2 y = 20. But what if that were not
possible (which is often the case)? In this section we will use a general method, called the
Lagrange multiplier method⁹, for solving constrained optimization problems:
Maximize (or minimize) : f(x, y)
given : g(x, y) = c
The equation g( x, y) = c is called the constraint equation, and we say that x and y are con-
strained by g( x, y) = c. Points ( x, y) which are maxima or minima of f ( x, y) with the con-
dition that they satisfy the constraint equation g( x, y) = c are called constrained maximum
or constrained minimum points, respectively. Similar definitions hold for functions of three
variables.
The Lagrange multiplier method for solving such problems can now be stated:
9 Named after the French mathematician Joseph Louis Lagrange (1736–1813).
Theorem 3.8. Let f(x, y) and g(x, y) be smooth functions, and suppose that c is a scalar
constant such that ∇g(x, y) ≠ 0 for all (x, y) that satisfy the equation g(x, y) = c. Then to
solve the constrained optimization problem
Maximize (or minimize) : f(x, y)
given : g(x, y) = c ,
find the points (x, y) that solve the equation ∇f(x, y) = λ∇g(x, y) for some constant λ. The
number λ is called the Lagrange multiplier, and the point (x, y) is called a critical point of
f(x, y) constrained by g(x, y) = c.
If there is a constrained maximum or minimum, then it must be such a critical point of
f(x, y) constrained by g(x, y) = c.
If the condition holds, then by the theorem above the maximum and minimum points must be
among the critical points. It remains to find all the critical points (x, y) and compare the values
of f(x, y) at them; the largest of these values is the global maximum of f(x, y) with the constraint
g(x, y) = c, and analogously the smallest value is the global minimum.
Let us formulate a more general condition which guarantees the existence of a minimum, but
not of a maximum.
If the set described by the system g(x, y) = c, f(x, y) ≤ d is nonempty and bounded for some
d, then the constrained minimum of f(x, y) occurs at some point.
Again, to find the minimum we can compute the values of f(x, y) at all the critical points (x, y); the
smallest value is the global minimum of f(x, y) with the constraint g(x, y) = c.
For example, if f(x, y) = x² + y², then the set described by f(x, y) ≤ d is bounded for any d.
Therefore the above condition guarantees the existence of a minimum for any constraint g(x, y) = c.
Here is an analogous condition for a maximum.
If the set described by the system g(x, y) = c, f(x, y) ≥ d is nonempty and bounded for some
d, then the constrained maximum of f(x, y) occurs at some point.
[Figure: the hyperbola xy = 1 together with the lines x + y = 6, x + y = 2 and x + y = −2]
Sometimes the answer depends on hidden constraints. For example, consider the function
f(x, y) = x + y and the constraint xy = 1. The equation ∇f = λ∇g from Theorem 3.8 takes
the form
(1, 1) = λ(y, x),
so x = y = 1/λ. Since xy = 1, we have two critical points: (1, 1) with the multiplier 1 and
(−1, −1) with the multiplier −1. Neither of these points is a maximum or a minimum; in fact,
f takes positive and negative values of arbitrarily large absolute value. On the other hand, it
might happen that by the nature of the problem x and y have to be positive (such conditions
are called implicit constraints); then (1, 1) is the only critical point, it is the constrained
minimum point, and the minimum is 2.
In a general case of that type the maximum or minimum of f(x, y) will occur either at a
point (x, y) satisfying ∇f(x, y) = λ∇g(x, y) or at a “boundary” point of the set described by the
hidden constraints.
A similar thing happens in Example 3.24: the constraint equation 2x + 2y = 20 describes
a line in R², which by itself is not bounded. However, there are “hidden” constraints, due
to the nature of the problem, namely 0 ≤ x, y ≤ 10, which cause that line to be restricted
to a line segment in R², which is bounded; the endpoints of that line segment form the
“boundary”.
Example 3.25. For a rectangle whose perimeter is 20 m, use the Lagrange multiplier method
to find the dimensions that will maximize the area.
Solution: As we saw in Example 3.24, with x and y representing the width and height,
respectively, of the rectangle, this problem can be stated as:
Maximize : f ( x, y) = x y
given : g( x, y) = 2 x + 2 y = 20
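Then solving the system
\[ g(x, y) = 20, \qquad \nabla f(x, y) = \lambda \nabla g(x, y) \]
means solving
\[ g(x, y) = 20, \qquad \frac{\partial f}{\partial x} = \lambda \frac{\partial g}{\partial x}, \qquad \frac{\partial f}{\partial y} = \lambda \frac{\partial g}{\partial y}, \]
namely:
\[ 2x + 2y = 20, \qquad y = 2\lambda, \qquad x = 2\lambda. \]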
The general idea is to solve for λ in the last two equations, then set those expressions equal
(since they both equal λ) to solve for x and y. Doing this we get
\[ \frac{y}{2} = \lambda = \frac{x}{2} \quad\Rightarrow\quad x = y , \]
so now substitute either of the expressions for x or y into the constraint equation to solve for
x and y:
20 = g( x, y) = 2 x + 2 y = 2 x + 2 x = 4 x ⇒ x = 5 ⇒ y = 5
There must be a maximum area, since the minimum area is 0 and f (5, 5) = 25 > 0, so the
point (5, 5) that we found (called a constrained critical point) must be the constrained maxi-
mum.
∴ The maximum area occurs for a rectangle whose width and height both are 5 m.
Example 3.26. Find the points on the circle x² + y² = 80 which are closest to and farthest
from the point (1, 2).
Solution: The distance d from any point (x, y) to the point (1, 2) is
\[ d = \sqrt{(x - 1)^2 + (y - 2)^2} , \]
and minimizing the distance is equivalent to minimizing the square of the distance. Thus
the problem can be stated as:
Maximize (and minimize) : f(x, y) = (x − 1)² + (y − 2)²
given : g(x, y) = x² + y² = 80
Solving ∇f(x, y) = λ∇g(x, y) means solving the following equations:
\[ 2(x - 1) = 2\lambda x , \qquad 2(y - 2) = 2\lambda y \]
Note that x ≠ 0 since otherwise we would get −2 = 0 in the first equation. Similarly, y ≠ 0.
So we can solve both equations for λ as follows:
\[ \frac{x - 1}{x} = \lambda = \frac{y - 2}{y} \quad\Rightarrow\quad xy - y = xy - 2x \quad\Rightarrow\quad y = 2x \]
Substituting this into g(x, y) = x² + y² = 80 yields 5x² = 80, so x = ±4. So the two
constrained critical points are (4, 8) and (−4, −8). Since f(4, 8) = 45 and f(−4, −8) = 125,
and since there must be points on the circle closest to and farthest from (1, 2), then it must
be the case that (4, 8) is the point on the circle closest to (1, 2) and (−4, −8) is the farthest
from (1, 2) (see Figure 3.7.1).
[Figure 3.7.1: the circle x² + y² = 80 with the points (4, 8), (−4, −8) and (1, 2) marked]
Notice that since the constraint equation x² + y² = 80 describes a circle, which is a bounded
set in R², then we were guaranteed that the constrained critical points we found were indeed
the constrained maximum and minimum.
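A quick numerical sanity check of this example is possible because a circle is easy to parametrize. The sketch below is a brute-force scan, not the Lagrange multiplier method. It samples points (√80 cos t, √80 sin t) on the circle and records where the squared distance to (1, 2) is smallest and largest. The class name CircleDistanceCheck and the number of sample points are arbitrary choices for illustration.

```java
// Brute-force check of the closest and farthest points on x^2 + y^2 = 80 from (1, 2).
final class CircleDistanceCheck {
    // Squared distance from (x, y) to the point (1, 2).
    static double f(double x, double y) {
        return (x - 1) * (x - 1) + (y - 2) * (y - 2);
    }

    // Samples n points on the circle and returns {xMin, yMin, xMax, yMax}.
    static double[] scan(int n) {
        double r = Math.sqrt(80.0);
        double[] best = new double[4];
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        for (int i = 0; i < n; i++) {
            double t = 2 * Math.PI * i / n;
            double x = r * Math.cos(t);
            double y = r * Math.sin(t);
            double v = f(x, y);
            if (v < min) { min = v; best[0] = x; best[1] = y; }
            if (v > max) { max = v; best[2] = x; best[3] = y; }
        }
        return best;
    }

    public static void main(String[] args) {
        double[] b = scan(1000000);
        System.out.println("closest:  (" + b[0] + "," + b[1] + ")");
        System.out.println("farthest: (" + b[2] + "," + b[3] + ")");
    }
}
```

The sampled points land near (4, 8) and (−4, −8), in agreement with the constrained critical points found above. Such a scan only works when the constraint curve can be parametrized explicitly.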
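Example 3.27.
Maximize (and minimize) : f(x, y, z) = x + z
given : g(x, y, z) = x² + y² + z² = 1
Solution: Solving ∇f(x, y, z) = λ∇g(x, y, z) means solving the following equations:
\[ 1 = 2\lambda x, \qquad 0 = 2\lambda y, \qquad 1 = 2\lambda z. \]
The first equation implies λ ≠ 0 (otherwise we would have 1 = 0), so we can divide by λ in
the second equation to get y = 0, and we can divide by λ in the first and last equations to get
x = 1/(2λ) = z. Substituting these expressions into the constraint equation x² + y² + z² = 1 yields
the constrained critical points
\[ \left( \tfrac{1}{\sqrt{2}}, 0, \tfrac{1}{\sqrt{2}} \right) \quad\text{and}\quad \left( \tfrac{-1}{\sqrt{2}}, 0, \tfrac{-1}{\sqrt{2}} \right). \]
Since
\[ f\left( \tfrac{1}{\sqrt{2}}, 0, \tfrac{1}{\sqrt{2}} \right) > f\left( \tfrac{-1}{\sqrt{2}}, 0, \tfrac{-1}{\sqrt{2}} \right), \]
and since the constraint equation x² + y² + z² = 1 describes a sphere (which is bounded) in
R³, then (1/√2, 0, 1/√2) is the constrained maximum point and (−1/√2, 0, −1/√2) is the constrained
minimum point.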
Note that solving the equation ∇ f ( x, y) = λ∇ g( x, y) means having to solve a system of two
(possibly nonlinear) equations in three unknowns, which as we have seen before, may not be
possible to do. And the 3-variable case can get even more complicated. All of this somewhat
restricts the usefulness of Lagrange’s method to relatively simple functions. Luckily there
are many numerical methods for solving constrained optimization problems, though we will
not discuss them here.11
Exercises
A
1. Find the constrained maxima and minima of f(x, y) = 2x + y given that x² + y² = 4.
3. Find the constrained minima of f(x, y) = x² + 3y² given that xy = 1 and show that there are
no constrained maxima.
4. Find the points on the circle x² + y² = 100 which are closest to and farthest from the point
(2, 3).
11 See BAZARAA, SHERALI and SHETTY.
B
5. Find the constrained maxima and minima of f(x, y, z) = x + y² + 2z given that
4x² + 9y² − 36z² = 36.
6. Find the volume of the largest rectangular parallelepiped with edges parallel to the
coordinate axes that can be inscribed in the ellipsoid
\[ \frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1 . \]
C
7. Let (x₀, y₀) be a minimum point of a smooth function f(x, y) with the constraint g(x, y) ≤ c.
Assume g(x₀, y₀) = c and ∇g(x₀, y₀) ≠ 0. Show that ∇f(x₀, y₀) = λ∇g(x₀, y₀) for some λ ≤ 0.
(Hint: Note that (x₀, y₀) is also a minimum point of the smooth function f(x, y) with the
constraint g(x, y) = c, and the points (x₀, y₀) − t∇g(x₀, y₀) satisfy the constraint inequality
for small positive t.)
4 Multiple Integrals
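4.1 Double Integrals
[Figure: the cross-sectional area A(x) under the surface z = f(x, y) over the rectangle R = [a, b] × [c, d]]
Then A(x) = ∫_c^d f(x, y) dy since we are treating x as fixed, and only y varies. This makes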
sense since for a fixed x the function f ( x, y) is a continuous function of y over the interval
[ c, d ], so we know that the area under the curve is the definite integral. The area A ( x) is a
function of x, so by the “slice” or cross-section method from single-variable calculus we know
that the volume V of the solid under the surface z = f ( x, y) but above the x y-plane over the
rectangle R is the integral over [a, b] of that cross-sectional area A ( x):
\[ V = \int_a^b A(x)\,dx = \int_a^b \left[ \int_c^d f(x, y)\,dy \right] dx \tag{4.1} \]
We will always refer to this volume as “the volume under the surface”. The above expression
uses what are called iterated integrals. First the function f ( x, y) is integrated as a func-
tion of y, treating the variable x as a constant (this is called integrating with respect to y).
That is what occurs in the “inner” integral between the square brackets in equation (4.1).
This is the first iterated integral. Once that integration is performed, the result is then an
expression involving only x, which can then be integrated with respect to x. That is what
occurs in the “outer” integral above (the second iterated integral). The final result is then
a number (the volume). This process of going through two iterations of integrals is called
double integration, and the last expression in equation (4.1) is called a double integral.
Notice that integrating f ( x, y) with respect to y is the inverse operation of taking the
partial derivative of f ( x, y) with respect to y.
Also, we could have taken the area of cross-sections under the surface which were parallel
to the xz-plane, which would then depend only on the variable y, so that the volume would
be
\[ V = \int_c^d \left[ \int_a^b f(x, y)\,dx \right] dy . \tag{4.2} \]
It turns out that in general1 the order of the iterated integrals does not matter. Also, we will
usually discard the brackets and simply write
\[ V = \int_c^d \int_a^b f(x, y)\,dx\,dy , \tag{4.3} \]
where it is understood that the fact that dx is written before d y means that the function
f ( x, y) is first integrated with respect to x using the “inner” limits of integration a and b,
and then the resulting function is integrated with respect to y using the “outer” limits of
integration c and d . This order of integration can be changed if it is more convenient.
Example 4.1. Find the volume V under the plane z = 8 x + 6 y over the rectangle R = [0, 1] ×
[0, 2].
Suppose we had switched the order of integration. We can verify that we still get the same
answer:
\begin{align*}
V &= \int_0^1 \int_0^2 (8x + 6y)\,dy\,dx \\
  &= \int_0^1 \left( 8xy + 3y^2 \,\Big|_{y=0}^{y=2} \right) dx \\
  &= \int_0^1 (16x + 12)\,dx \\
  &= 8x^2 + 12x \,\Big|_0^1 \\
  &= 20
\end{align*}
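The claim that the order of integration does not matter here can also be checked numerically. The sketch below approximates both iterated integrals with nested midpoint sums, so the inner loop plays the role of the “inner” integral and the outer loop the “outer” one. The class name IteratedIntegralSketch, the method names and the grid size are assumptions made for illustration. The midpoint rule is just one convenient choice of approximation.

```java
import java.util.function.DoubleBinaryOperator;

// Midpoint-rule sketch of iterated integrals over the rectangle [a, b] x [c, d].
final class IteratedIntegralSketch {
    // Inner sum over y for each fixed x, then outer sum over x.
    static double dyThenDx(DoubleBinaryOperator f, double a, double b,
                           double c, double d, int n) {
        double hx = (b - a) / n, hy = (d - c) / n, total = 0.0;
        for (int i = 0; i < n; i++) {
            double x = a + (i + 0.5) * hx;
            double inner = 0.0; // approximates the cross-sectional area A(x)
            for (int j = 0; j < n; j++) {
                double y = c + (j + 0.5) * hy;
                inner += f.applyAsDouble(x, y) * hy;
            }
            total += inner * hx;
        }
        return total;
    }

    // Inner sum over x for each fixed y, then outer sum over y.
    static double dxThenDy(DoubleBinaryOperator f, double a, double b,
                           double c, double d, int n) {
        double hx = (b - a) / n, hy = (d - c) / n, total = 0.0;
        for (int j = 0; j < n; j++) {
            double y = c + (j + 0.5) * hy;
            double inner = 0.0; // approximates the cross-sectional area for fixed y
            for (int i = 0; i < n; i++) {
                double x = a + (i + 0.5) * hx;
                inner += f.applyAsDouble(x, y) * hx;
            }
            total += inner * hy;
        }
        return total;
    }

    public static void main(String[] args) {
        DoubleBinaryOperator f = (x, y) -> 8 * x + 6 * y;
        System.out.println(dyThenDx(f, 0, 1, 0, 2, 200));
        System.out.println(dxThenDy(f, 0, 1, 0, 2, 200));
    }
}
```

For the plane z = 8x + 6y both orders agree with the exact volume 20 up to rounding, since the midpoint rule is exact for linear functions.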
Example 4.2. Find the volume V under the surface z = e^{x+y} over the rectangle R = [2, 3] ×
[1, 2].
Solution: We know that f(x, y) = e^{x+y} > 0 for all (x, y), so
\begin{align*}
V &= \int_1^2 \int_2^3 e^{x+y}\,dx\,dy \\
  &= \int_1^2 \left( e^{x+y} \,\Big|_{x=2}^{x=3} \right) dy \\
  &= \int_1^2 (e^{y+3} - e^{y+2})\,dy \\
  &= e^{y+3} - e^{y+2} \,\Big|_1^2 \\
  &= e^5 - e^4 - (e^4 - e^3) = e^5 - 2e^4 + e^3
\end{align*}
Recall that for a general function f(x), the integral ∫_a^b f(x) dx represents the difference of
the area below the curve y = f ( x) but above the x-axis when f ( x) ≥ 0, and the area above the
curve but below the x-axis when f ( x) ≤ 0. Similarly, the double integral of any continuous
function f ( x, y) represents the difference of the volume below the surface z = f ( x, y) but above
the x y-plane when f ( x, y) ≥ 0, and the volume above the surface but below the x y-plane when
f ( x, y) ≤ 0. Thus, our method of double integration by means of iterated integrals can be
used to evaluate the double integral of any continuous function over a rectangle, regardless
of whether f ( x, y) ≥ 0 or not.
Example 4.3. Evaluate $\int_0^{2\pi} \int_0^{\pi} \sin(x + y)\,dx\,dy$.
Solution: Note that f(x, y) = sin(x + y) is both positive and negative over the rectangle [0, π] ×
[0, 2π]. We can still evaluate the double integral:
\begin{align*}
\int_0^{2\pi} \int_0^{\pi} \sin(x + y)\,dx\,dy &= \int_0^{2\pi} \left( -\cos(x + y) \,\Big|_{x=0}^{x=\pi} \right) dy \\
  &= \int_0^{2\pi} (-\cos(y + \pi) + \cos y)\,dy \\
  &= -\sin(y + \pi) + \sin y \,\Big|_0^{2\pi} = -\sin 3\pi + \sin 2\pi - (-\sin \pi + \sin 0) \\
  &= 0
\end{align*}
Exercises
A
For Exercises 1–4, find the volume under the surface z = f ( x, y) over the rectangle R .
7. $\int_0^2 \int_0^1 (x + 2)\,dx\,dy$        8. $\int_{-1}^2 \int_{-1}^1 x(xy + \sin x)\,dx\,dy$
9. $\int_0^{\pi/2} \int_0^1 xy \cos(x^2 y)\,dx\,dy$        10. $\int_0^{\pi} \int_0^{\pi/2} \sin x \cos(y - \pi)\,dx\,dy$
11. $\int_0^2 \int_1^4 xy\,dx\,dy$        12. $\int_{-1}^1 \int_{-1}^2 1\,dx\,dy$
\[ \int_c^d \int_a^b M\,dx\,dy = M(d - c)(b - a). \]
4.2 Double Integrals Over a General Region
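[Figure 4.2.1: the region R. (a) Vertical slice: $\int_a^b \int_{g_1(x)}^{g_2(x)} f(x, y)\,dy\,dx$, with R between the curves y = g₁(x) and y = g₂(x) for a ≤ x ≤ b. (b) Horizontal slice: $\int_c^d \int_{h_1(y)}^{h_2(y)} f(x, y)\,dx\,dy$, with R between the curves x = h₁(y) and x = h₂(y) for c ≤ y ≤ d.]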
Then using the slice method from the previous section, the double integral of a real-valued
function f(x, y) over the region R, denoted by ∬_R f(x, y) dA, is given by
\[ \iint_R f(x, y)\,dA = \int_a^b \int_{g_1(x)}^{g_2(x)} f(x, y)\,dy\,dx \tag{4.4} \]
This means that we take vertical slices in the region R between the curves y = g 1 ( x) and
y = g 2 ( x). The symbol d A is sometimes called an area element or infinitesimal, with the A
signifying area. Note that f ( x, y) is first integrated with respect to y, with functions of x as
the limits of integration. This makes sense since the result of the first iterated integral will
have to be a function of x alone, which then allows us to take the second iterated integral
with respect to x.
Similarly, if we have a region R in the x y-plane that is bounded on the left by a curve
x = h 1 ( y), bounded on the right by a curve x = h 2 ( y), bounded below by the horizontal line
y = c, and bounded above by the horizontal line y = d (where c < d ), as in Figure 4.2.1(b)
(assuming that h 1 ( y) and h 2 ( y) do not intersect on the open interval ( c, d )), then taking
horizontal slices gives
\[ \iint_R f(x, y)\,dA = \int_c^d \int_{h_1(y)}^{h_2(y)} f(x, y)\,dx\,dy \tag{4.5} \]
Notice that these definitions include the case when the region R is a rectangle. Also, if
f(x, y) ≥ 0 for all (x, y) in the region R, then ∬_R f(x, y) dA is the volume under the surface
z = f(x, y) over the region R.
Example 4.4. Assume the region R is defined by the inequalities x² ≤ y and y² ≤ x. Rewrite
the double integral
\[ \iint_R f(x, y)\,dA \]
as an iterated integral.
Note that the region R does not change if you switch x and y. Therefore, the same integral
could be written as
\[ \iint_R f(x, y)\,dA = \int_0^1 \int_{x^2}^{\sqrt{x}} f(x, y)\,dy\,dx. \]
Example 4.5. Find the volume V under the plane z = 8x + 6y over the plane region R defined
by the inequalities 0 ≤ x ≤ 1 and 0 ≤ y ≤ 2x².
Solution: The region R is shown in Figure 4.2.2. Using vertical slices we get:
\begin{align*}
V &= \iint_R (8x + 6y)\,dA \\
  &= \int_0^1 \int_0^{2x^2} (8x + 6y)\,dy\,dx \\
  &= \int_0^1 \left( 8xy + 3y^2 \,\Big|_{y=0}^{y=2x^2} \right) dx \\
  &= \int_0^1 (16x^3 + 12x^4)\,dx \\
  &= 4x^4 + \tfrac{12}{5}x^5 \,\Big|_0^1 = 4 + \tfrac{12}{5} = \tfrac{32}{5} = 6.4
\end{align*}
[Figure 4.2.2: the region R under the curve y = 2x² for 0 ≤ x ≤ 1]
We get the same answer using horizontal slices (see Figure 4.2.3):
\begin{align*}
V &= \iint_R (8x + 6y)\,dA \\
  &= \int_0^2 \int_{\sqrt{y/2}}^{1} (8x + 6y)\,dx\,dy \\
  &= \int_0^2 \left( 4x^2 + 6xy \,\Big|_{x=\sqrt{y/2}}^{x=1} \right) dy \\
  &= \int_0^2 \left( 4 + 6y - \left( 2y + \tfrac{6}{\sqrt{2}}\, y\sqrt{y} \right) \right) dy = \int_0^2 \left( 4 + 4y - 3\sqrt{2}\, y^{3/2} \right) dy \\
  &= 4y + 2y^2 - \tfrac{6\sqrt{2}}{5}\, y^{5/2} \,\Big|_0^2 = 8 + 8 - \tfrac{6\sqrt{2}\sqrt{32}}{5} = 16 - \tfrac{48}{5} = \tfrac{32}{5} = 6.4
\end{align*}
[Figure 4.2.3: the region R to the right of the curve x = √(y/2) for 0 ≤ y ≤ 2]
Example 4.6. Find the volume V of the solid bounded by the three coordinate planes and
the plane 2 x + y + 4 z = 4.
Solution: The solid is shown in Figure 4.2.4(a) with a typical vertical slice. The volume V
is given by ∬_R f(x, y) dA, where f(x, y) = z = ¼(4 − 2x − y) and the region R is shown in
Figure 4.2.4(b).
[Figure 4.2.4: (a) the solid under the plane 2x + y + 4z = 4, with intercepts (2, 0, 0), (0, 4, 0) and (0, 0, 1); (b) the region R in the xy-plane below the line y = −2x + 4 for 0 ≤ x ≤ 2]
For a general region R, which may not be one of the types of regions we have considered so
far, the double integral ∬_R f(x, y) dA is defined as follows. Assume that f(x, y) is a nonnegative
real-valued function and that R is a bounded region in R², so it can be enclosed in some
rectangle [a, b] × [c, d]. Then divide that rectangle into a grid of subrectangles. Only consider
the subrectangles that are enclosed completely within the region R, as shown by the shaded
subrectangles in Figure 4.2.5(a). In any such subrectangle [x_i, x_{i+1}] × [y_j, y_{j+1}], pick a point
(x_{i*}, y_{j*}). Then the volume under the surface z = f(x, y) over that subrectangle is approximately
f(x_{i*}, y_{j*}) Δx_i Δy_j, where Δx_i = x_{i+1} − x_i, Δy_j = y_{j+1} − y_j, and f(x_{i*}, y_{j*}) is the height and
Δx_i Δy_j is the base area of a parallelepiped, as shown in Figure 4.2.5(b). Then the total volume
under the surface is approximately the sum of the volumes of all such parallelepipeds,
namely
\[ \sum_j \sum_i f(x_{i*}, y_{j*})\,\Delta x_i\,\Delta y_j , \tag{4.6} \]
where the summation occurs over the indices of the subrectangles inside R. If we take
smaller and smaller subrectangles, so that the length of the largest diagonal of the subrectangles
goes to 0, then the subrectangles begin to fill more and more of the region R, and so
the above sum approaches the actual volume under the surface z = f(x, y) over the region R.
We then define ∬_R f(x, y) dA as the limit of that double summation (the limit is taken over all
subdivisions of the rectangle [a, b] × [c, d] as the largest diagonal of the subrectangles goes
to 0).
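The definition can be tried out directly on a region whose integral is already known. The sketch below is an illustration under stated assumptions, not a general-purpose integrator. It uses the region R = {0 ≤ x ≤ 1, 0 ≤ y ≤ 2x²} and the function f(x, y) = 8x + 6y from Example 4.5, encloses R in the rectangle [0, 1] × [0, 2], and adds up f(x*, y*) Δx Δy only over grid cells lying entirely inside R. The class name InnerRiemannSum and the choice of the cell midpoint as the sample point are arbitrary.

```java
// Sketch of the Riemann-sum definition: only subrectangles fully inside R are counted.
// Assumption: R = {0 <= x <= 1, 0 <= y <= 2x^2} and f(x, y) = 8x + 6y, as in Example 4.5.
final class InnerRiemannSum {
    static double f(double x, double y) {
        return 8 * x + 6 * y;
    }

    // Uses an n-by-n grid on the enclosing rectangle [0, 1] x [0, 2].
    static double sum(int n) {
        double dx = 1.0 / n, dy = 2.0 / n, total = 0.0;
        for (int i = 0; i < n; i++) {
            double xLeft = i * dx;
            for (int j = 0; j < n; j++) {
                double yTop = (j + 1) * dy;
                // 2x^2 is increasing on [0, 1], so the cell lies inside R exactly when
                // its top edge is below the curve at the cell's left edge.
                if (yTop <= 2 * xLeft * xLeft) {
                    double xStar = xLeft + 0.5 * dx; // sample point: cell midpoint
                    double yStar = (j + 0.5) * dy;
                    total += f(xStar, yStar) * dx * dy;
                }
            }
        }
        return total;
    }

    public static void main(String[] args) {
        for (int n : new int[] { 10, 100, 1000 }) {
            System.out.println("n = " + n + ": " + sum(n));
        }
    }
}
```

As the grid is refined, the sums increase toward the value 6.4 found in Example 4.5. They stay below it, because the cells cut by the boundary curve are left out.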
[Figure 4.2.5: (a) Subrectangles inside the region R; (b) Parallelepiped over a subrectangle, with volume f(x_{i*}, y_{j*}) Δx_i Δy_j]
A similar definition can be made for a function f(x, y) that is not necessarily always
nonnegative: just replace each mention of volume by the negative volume in the description
above when f(x, y) < 0. In the case of a region of the type shown in Figure 4.2.1, using the
definition of the Riemann integral from single-variable calculus, our definition of ∬_R f(x, y) dA
reduces to a sequence of two iterated integrals.
Finally, the region R does not have to be bounded. We can evaluate improper double inte-
grals (that is, over an unbounded region, or over a region which contains points where the
function f ( x, y) is not defined) as a sequence of iterated improper single-variable integrals.
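Example 4.7. Evaluate $\int_1^{\infty} \int_0^{1/x^2} 2y\,dy\,dx$.
Solution:
\begin{align*}
\int_1^{\infty} \int_0^{1/x^2} 2y\,dy\,dx &= \int_1^{\infty} \left( y^2 \,\Big|_{y=0}^{y=1/x^2} \right) dx \\
  &= \int_1^{\infty} x^{-4}\,dx = -\tfrac{1}{3} x^{-3} \,\Big|_1^{\infty} = 0 - \left( -\tfrac{1}{3} \right) = \tfrac{1}{3}
\end{align*}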
Exercises
A
For Exercises 1–8, evaluate the given double integral.
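1. $\int_0^1 \int_{\sqrt{x}}^1 24x^2 y\,dy\,dx$        2. $\int_0^{\pi} \int_0^y \sin x\,dx\,dy$
3. $\int_1^2 \int_0^{\ln x} 4x\,dy\,dx$        4. $\int_0^2 \int_0^{2y} e^{y^2}\,dx\,dy$
5. $\int_0^{\pi/2} \int_0^y \cos x \sin y\,dx\,dy$        6. $\int_0^{\infty} \int_0^{\infty} xy\,e^{-(x^2 + y^2)}\,dx\,dy$
7. $\int_0^2 \int_0^y 1\,dx\,dy$        8. $\int_0^1 \int_0^{x^2} 2\,dy\,dx$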
10. f(x, y) = x² + y and R is the triangle with vertices (0, 0), (2, 0) and (0, 1).
11. Find the volume V of the solid bounded by the three coordinate planes and the plane
x + y + z = 1.
12. Find the volume V of the solid bounded by the three coordinate planes and the plane
3 x + 2 y + 5 z = 6.
B
13. Explain why the double integral ∬_R 1 dA gives the area of the region R. For simplicity,
you can assume that R is a region of the type shown in Figure 4.2.1(a).
C
15. Show how Exercise 12 can be used to solve Exercise 10. Figure 4.2.6
B
For Exercises 16–17 rewrite the double integral
\[ \iint_R f(x, y)\,dA, \]
where the limit is over all divisions of the rectangular parallelepiped enclosing S into sub-
parallelepipeds whose largest diagonal is going to 0, and the triple summation is over all the
subparallelepipeds inside S . It can be shown that this limit does not depend on the choice
of the rectangular parallelepiped enclosing S . The symbol dV is often called the volume
element.
Physically, what does the triple integral represent? We saw that a double integral could
be thought of as the volume under a two-dimensional surface. It turns out that the triple
integral simply generalizes this idea: it can be thought of as representing the hypervolume
under a three-dimensional hypersurface w = f(x, y, z) whose graph lies in R⁴. In general,
the word “volume” is often used as a general term to signify the same concept for any
n-dimensional object (including length in R¹, area in R² and volume in R³). It may be hard to
get a grasp on the concept of the “volume” of a four-dimensional object, but at least we now
know how to calculate that volume!
In the case where S is a rectangular parallelepiped [x₁, x₂] × [y₁, y₂] × [z₁, z₂], that is, S =
{(x, y, z) : x₁ ≤ x ≤ x₂, y₁ ≤ y ≤ y₂, z₁ ≤ z ≤ z₂}, the triple integral is a sequence of three iterated
integrals, namely
\[ \iiint_S f(x, y, z)\,dV = \int_{z_1}^{z_2} \int_{y_1}^{y_2} \int_{x_1}^{x_2} f(x, y, z)\,dx\,dy\,dz , \tag{4.8} \]
where the order of integration does not matter. This is the simplest case.
A more complicated case is where S is a solid which is bounded below by a surface z =
g₁(x, y), bounded above by a surface z = g₂(x, y), y is bounded between two curves h₁(x) and
h₂(x), and x varies between a and b. Then
\[ \iiint_S f(x, y, z)\,dV = \int_a^b \int_{h_1(x)}^{h_2(x)} \int_{g_1(x, y)}^{g_2(x, y)} f(x, y, z)\,dz\,dy\,dx . \]
Notice in this case that the first iterated integral will result in a function of x and y (since its
limits of integration are functions of x and y), which then leaves you with a double integral of
a type that we learned how to evaluate in Section 4.2. There are, of course, many variations
on this case (for example, changing the roles of the variables x, y, z), so as you can probably
tell, triple integrals can be quite tricky. At this point, just learning how to evaluate a triple
integral, regardless of what it represents, is the most important thing. We will see some
other ways in which triple integrals are used later in the text.
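Example 4.8. Evaluate $\int_0^3 \int_0^2 \int_0^1 (xy + z)\,dx\,dy\,dz$.
Solution:
\begin{align*}
\int_0^3 \int_0^2 \int_0^1 (xy + z)\,dx\,dy\,dz &= \int_0^3 \int_0^2 \left( \tfrac{1}{2}x^2 y + xz \,\Big|_{x=0}^{x=1} \right) dy\,dz \\
  &= \int_0^3 \int_0^2 \left( \tfrac{1}{2}y + z \right) dy\,dz \\
  &= \int_0^3 \left( \tfrac{1}{4}y^2 + yz \,\Big|_{y=0}^{y=2} \right) dz \\
  &= \int_0^3 (1 + 2z)\,dz \\
  &= z + z^2 \,\Big|_0^3 = 12
\end{align*}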
\[ = \tfrac{11}{6}x - x^2 + \tfrac{1}{24}x^4 \,\Big|_0^1 = \tfrac{7}{8} \]
Since the function being integrated is the constant 1, then the above triple integral reduces
to a double integral of the types that we considered in the previous section if the solid is
bounded above by some surface z = f ( x, y) and bounded below by the x y-plane z = 0. There
are many other possibilities. For example, the solid could be bounded below and above by
surfaces z = g 1 ( x, y) and z = g 2 ( x, y), respectively, with y bounded between two curves h 1 ( x)
and h 2 ( x), and x varies between a and b. Then
Exercises
A
For Exercises 1–8, evaluate the given triple integral.
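1. $\int_0^3 \int_0^2 \int_0^1 xyz\,dx\,dy\,dz$        2. $\int_0^1 \int_0^x \int_0^y xyz\,dz\,dy\,dx$
3. $\int_0^{\pi} \int_0^x \int_0^{xy} x^2 \sin z\,dz\,dy\,dx$        4. $\int_0^1 \int_0^z \int_0^y z e^{y^2}\,dx\,dy\,dz$
5. $\int_1^e \int_0^y \int_0^{1/y} x^2 z\,dx\,dz\,dy$        6. $\int_1^2 \int_0^{y^2} \int_0^{z^2} yz\,dx\,dz\,dy$
7. $\int_1^2 \int_2^4 \int_0^3 1\,dx\,dy\,dz$        8. $\int_0^1 \int_0^{1-x} \int_0^{1-x-y} 1\,dz\,dy\,dx$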
B
10. Find the volume V of the solid S bounded by the three coordinate planes, bounded above
by the plane x + y + z = 2, and bounded below by the plane z = x + y.
11. Let S be the solid defined by the inequalities x² − 1 ≤ y ≤ 1 − z². Rewrite the triple
integral
\[ \iiint_S f(x, y, z)\,dV , \]
as an iterated integral.
C
12. Show that
\[ \int_a^b \int_a^z \int_a^y f(x)\,dx\,dy\,dz = \int_a^b \frac{(b - x)^2}{2}\, f(x)\,dx. \]
(Hint: Think of how changing the order of integration in the triple integral changes the
limits of integration.)
4.4 Numerical Approximation of Multiple Integrals
\[ \bar{f} = \frac{1}{b - a} \int_a^b f(x)\,dx . \tag{4.11} \]
The quantity b − a is the length of the interval [a, b], which can be thought of as the
“volume” of the interval. Applying the same reasoning to functions of two or three variables,
we define the average value of f ( x, y) over a region R to be
\[ \bar{f} = \frac{1}{A(R)} \iint_R f(x, y)\,dA , \tag{4.12} \]
where A(R) is the area of the region R, and we define the average value of f(x, y, z) over a
solid S to be
\[ \bar{f} = \frac{1}{V(S)} \iiint_S f(x, y, z)\,dV , \tag{4.13} \]
The average value of f(x, y) over R can be thought of as representing the sum of all the
values of f divided by the number of points in R. However, we cannot take the sum literally,
since there are infinitely many points in any region (in fact, uncountably many: one
cannot enumerate them by natural numbers). But what if we took a very large number N
of random points in the region R (which can be generated by a computer) and then took the
average of the values of f for those points, and used that average as the value of f̄? This is
exactly what the Monte Carlo method does. So in formula (4.14) the approximation we get
is
\[ \iint_R f(x, y)\,dA \approx A(R)\,\bar{f} \pm A(R) \sqrt{\frac{\overline{f^2} - (\bar{f})^2}{N}} , \tag{4.15} \]
where
\[ \bar{f} = \frac{1}{N} \sum_{i=1}^{N} f(x_i, y_i) \quad\text{and}\quad \overline{f^2} = \frac{1}{N} \sum_{i=1}^{N} (f(x_i, y_i))^2 , \tag{4.16} \]
with the sums taken over the N random points ( x1 , y1 ), . . ., ( xN , yN ). The ± “error term” in
formula (4.15) does not really provide hard bounds on the approximation. It represents a
single standard deviation from the expected value of the integral. That is, it provides a likely
bound on the error. Due to its use of random points, the Monte Carlo method is an example
of a probabilistic method (as opposed to deterministic methods such as Newton’s method,
which use a specific formula for generating points).
For example, we can use formula (4.15) to approximate the volume V under the plane
z = 8x + 6y over the rectangle R = [0, 1] × [0, 2]. In Example 4.1 in Section 4.1, we showed
that the actual volume is 20. Below is a code listing (montecarlo.java) for a Java program
that calculates the volume, using a number of points N that is passed on the command line
as a parameter.
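A minimal sketch of how formulas (4.15) and (4.16) translate into code is given here. It is not the montecarlo.java listing itself. The class name MonteCarloSketch, the fixed random seed (used only to make runs repeatable) and the default N = 100 are assumptions made for illustration.

```java
import java.util.Random;

// Sketch of the Monte Carlo estimate (4.15) for f(x, y) = 8x + 6y over R = [0, 1] x [0, 2].
final class MonteCarloSketch {
    static double f(double x, double y) {
        return 8 * x + 6 * y;
    }

    // Returns {estimate, errorTerm}, where errorTerm is the one-standard-deviation term in (4.15).
    static double[] estimate(int n, long seed) {
        Random rng = new Random(seed);
        double area = 1.0 * 2.0; // A(R) for the rectangle [0, 1] x [0, 2]
        double sum = 0.0, sumSq = 0.0;
        for (int i = 0; i < n; i++) {
            double x = rng.nextDouble();       // random x in [0, 1)
            double y = 2.0 * rng.nextDouble(); // random y in [0, 2)
            double v = f(x, y);
            sum += v;
            sumSq += v * v;
        }
        double mean = sum / n;     // f-bar in (4.16)
        double meanSq = sumSq / n; // mean of f^2 in (4.16)
        double variance = Math.max(0.0, meanSq - mean * mean); // guard against rounding
        double err = area * Math.sqrt(variance / n);
        return new double[] { area * mean, err };
    }

    public static void main(String[] args) {
        int n = args.length > 0 ? Integer.parseInt(args[0]) : 100;
        double[] r = estimate(n, 42L);
        System.out.println("N = " + n + ": integral = " + r[0] + " +/- " + r[1]);
    }
}
```

Because the points are random, the printed estimate varies with the seed and with N. It should land close to the exact volume 20, typically within a few multiples of the reported error term.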
The results of running this program with various numbers of random points (for instance,
java montecarlo 100) are shown below:
As you can see, the approximation is fairly good. As N → ∞, it can be shown that the
Monte Carlo approximation converges to the actual volume (with an error on the order of
O(1/√N), in computational complexity terminology).
In the above example the region R was a rectangle. To use the Monte Carlo method for
a nonrectangular (bounded) region R , only a slight modification is needed. Pick a rectangle
R̃ that encloses R , and generate random points in that rectangle as before. Then use those
points in the calculation of f¯ only if they are inside R . There is no need to calculate the area
of R for formula (4.15) in this case, since the exclusion of points not inside R allows you to
use the area of the rectangle R̃ instead, similar to before.
For instance, in Example 4.5 we showed that the volume under the surface z = 8 x + 6 y
over the nonrectangular region R = {( x, y) : 0 ≤ x ≤ 1, 0 ≤ y ≤ 2 x2 } is 6.4. Since the rectangle
R̃ = [0, 1] × [0, 2] contains R , we can use the same program as before, with the only change
being a check to see if y < 2 x2 for a random point ( x, y) in [0, 1] × [0, 2]. Listing 4.2 below
contains the code (montecarlo2.java):
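A minimal sketch of this modification follows; it is not the book's Listing 4.2. The class name MonteCarloRegion and the fixed seed are assumptions. Points outside R contribute 0 to the sum but are still counted in N, which is why the area of the enclosing rectangle is used.

```java
import java.util.Random;

// Hypothetical sketch: Monte Carlo estimate of the volume under z = 8x + 6y
// over R = {(x, y) : 0 <= x <= 1, 0 <= y <= 2x^2}, sampling in [0,1] x [0,2].
class MonteCarloRegion {
    static double estimate(int n, long seed) {
        Random rng = new Random(seed);
        double areaBox = 1.0 * 2.0;              // area of the enclosing rectangle
        double sum = 0.0;
        for (int i = 0; i < n; i++) {
            double x = rng.nextDouble();         // uniform in [0, 1)
            double y = 2.0 * rng.nextDouble();   // uniform in [0, 2)
            if (y < 2 * x * x) {                 // keep only points inside R
                sum += 8 * x + 6 * y;
            }
        }
        return areaBox * sum / n;                // divide by all n points, not just the hits
    }

    public static void main(String[] args) {
        int n = args.length > 0 ? Integer.parseInt(args[0]) : 100000;
        System.out.println("N = " + n + ": " + estimate(n, 42L));
    }
}
```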
The results of running the program with various numbers of random points (for instance,
java montecarlo2 1000) are shown below:
To use the Monte Carlo method to evaluate triple integrals, you will need to generate
random triples ( x, y, z) in a parallelepiped, instead of random pairs ( x, y) in a rectangle, and
use the volume of the parallelepiped instead of the area of a rectangle in formula (4.15) (see
Exercise 2). For a more detailed discussion of numerical integration methods, see PRESS et
al.
Exercises
C
1. Write a program that uses the Monte Carlo method to approximate the double integral
\[ \iint\limits_R e^{xy}\,dA , \]
where R = [0, 1] × [0, 1]. Show the program output for N = 10, 100, 1000, 10000, 100000
and 1000000 random points.
2. Write a program that uses the Monte Carlo method to approximate the triple integral
\[ \iiint\limits_S e^{xyz}\,dV , \]
where S = [0, 1] × [0, 1] × [0, 1]. Show the program output for N = 10, 100, 1000, 10000,
100000 and 1000000 random points.
5. Use the Monte Carlo method to approximate the volume of a sphere of radius 1.
6. Use the Monte Carlo method to approximate the volume of the ellipsoid $\frac{x^2}{9} + \frac{y^2}{4} + \frac{z^2}{1} = 1$.
4.5 Change of Variables in Multiple Integrals 135
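\[
\int_1^2 x^3 \sqrt{x^2 - 1}\,dx ,
\]
\[
\begin{aligned}
u &= x^2 - 1 \;\Rightarrow\; x^2 = u + 1 \\
du &= 2x\,dx \\
x &= 1 \;\Rightarrow\; u = 0 \\
x &= 2 \;\Rightarrow\; u = 3
\end{aligned}
\]
so that we get
\[
\begin{aligned}
\int_1^2 x^3 \sqrt{x^2-1}\,dx &= \int_1^2 \tfrac12\,x^2 \cdot 2x \sqrt{x^2-1}\,dx \\
&= \int_0^3 \tfrac12\,(u+1)\sqrt{u}\,du \\
&= \tfrac12 \int_0^3 \left( u^{3/2} + u^{1/2} \right) du , \text{ which can be integrated to give} \\
&= \frac{14\sqrt{3}}{5} .
\end{aligned}
\]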
Let us take a different look at what happened when we did that substitution, which will give
some motivation for how substitution works in multiple integrals. First, we let u = x2 − 1.
On the interval of integration [1, 2], the function $x \mapsto x^2 - 1$ is strictly increasing (and maps
[1, 2] onto [0, 3]) and hence has an inverse function (defined on the interval [0, 3]). That is,
on [0, 3] we can define x as a function of u, namely
\[ x = g(u) = \sqrt{u+1} . \]
Then substituting that expression for x into the function $f(x) = x^3\sqrt{x^2-1}$ gives
\[ f(x) = f(g(u)) = (u+1)^{3/2}\sqrt{u} , \]
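so since $dx = g'(u)\,du = \tfrac12 (u+1)^{-1/2}\,du$, $g(0) = 1$ and $g(3) = 2$, performing the substitution as before gives
\[
\begin{aligned}
\int_1^2 f(x)\,dx &= \int_1^2 x^3 \sqrt{x^2-1}\,dx \\
&= \int_0^3 \tfrac12\,(u+1)\sqrt{u}\,du , \text{ which can be written as} \\
&= \int_0^3 (u+1)^{3/2}\sqrt{u} \cdot \tfrac12 (u+1)^{-1/2}\,du , \text{ which means} \\
\int_1^2 f(x)\,dx &= \int_{g^{-1}(1)}^{g^{-1}(2)} f(g(u))\,g'(u)\,du .
\end{aligned}
\]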
This is called the change of variable formula for integrals of single-variable functions, and it
is what you were implicitly using when doing integration by substitution.
This formula turns out to be a special case of a more general formula which can be used
to evaluate multiple integrals. We will state the formulas for double and triple integrals
involving real-valued functions of two and three variables, respectively. We will assume
that all the functions involved are continuously differentiable and that the regions and solids
involved all have “reasonable” boundaries. The proof of the following theorem is beyond the
scope of the text.²
² See TAYLOR and MANN, § 15.32 and § 15.62 for all the details.
is never 0 in R ′ . Then
\[
\iint\limits_R f(x,y)\,dA(x,y) = \iint\limits_{R'} f(x(u,v),y(u,v))\,|J(u,v)|\,dA(u,v) . \tag{4.19}
\]
We use the notation d A ( x, y) and d A ( u, v) to denote the area element in the ( x, y) and ( u, v)
coordinates, respectively.
Similarly, if x = x( u, v, w), y = y( u, v, w) and z = z( u, v, w) define a one-to-one mapping of
a solid S ′ in uvw-space onto a solid S in x yz-space such that the determinant
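\[
J(u,v,w) = \begin{vmatrix}
\dfrac{\partial x}{\partial u} & \dfrac{\partial x}{\partial v} & \dfrac{\partial x}{\partial w} \\[2mm]
\dfrac{\partial y}{\partial u} & \dfrac{\partial y}{\partial v} & \dfrac{\partial y}{\partial w} \\[2mm]
\dfrac{\partial z}{\partial u} & \dfrac{\partial z}{\partial v} & \dfrac{\partial z}{\partial w}
\end{vmatrix} \tag{4.20}
\]
is never 0 in S′, then
\[
\iiint\limits_S f(x,y,z)\,dV(x,y,z) = \iiint\limits_{S'} f(x(u,v,w),y(u,v,w),z(u,v,w))\,|J(u,v,w)|\,dV(u,v,w) . \tag{4.21}
\]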
The determinant J ( u, v) in formula (4.18) is called the Jacobian of x and y with respect
to u and v, and is sometimes written as
\[ J(u,v) = \frac{\partial(x,y)}{\partial(u,v)} . \tag{4.22} \]
Similarly, the Jacobian J(u, v, w) of three variables is sometimes written as
\[ J(u,v,w) = \frac{\partial(x,y,z)}{\partial(u,v,w)} . \tag{4.23} \]
Notice that formula (4.19) is saying that d A ( x, y) = | J ( u, v) | d A ( u, v), which you can think of
as a two-variable version of the relation dx = g ′ ( u) du in the single-variable case.
The following example shows how the change of variables formula is used.
Example 4.10. Evaluate $\iint\limits_R e^{\frac{x-y}{x+y}}\,dA$, where R = {(x, y) : x ≥ 0, y ≥ 0, x + y ≤ 1}.
Solution: First, note that evaluating this double integral without using substitution is prob-
ably impossible, at least in a closed form. By looking at the numerator and denominator of
the exponent of e, we will try the substitution u = x − y and v = x + y. To use the change of
variables formula (4.19), we need to write both x and y in terms of u and v. So solving for
x and y gives $x = \tfrac12(u+v)$ and $y = \tfrac12(v-u)$. In Figure 4.5.1 below, we see how the mapping
$x = x(u,v) = \tfrac12(u+v)$, $y = y(u,v) = \tfrac12(v-u)$ maps the region R′ onto R in a one-to-one manner.
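[Figure 4.5.1: The region R in the xy-plane (bounded by the coordinate axes and the line x + y = 1) and the region R′ in the uv-plane (bounded by the lines u = −v, u = v and v = 1)]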
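As a worked sketch of how formula (4.19) applies to this substitution, the Jacobian and the resulting iterated integral can be written out as below. The description of R′ as the triangle bounded by u = −v, u = v and v = 1 is taken from the figure; the rest follows from the formulas for x(u, v) and y(u, v) above.

```latex
\[
J(u,v) = \begin{vmatrix}
\dfrac{\partial x}{\partial u} & \dfrac{\partial x}{\partial v} \\[2mm]
\dfrac{\partial y}{\partial u} & \dfrac{\partial y}{\partial v}
\end{vmatrix}
= \begin{vmatrix} \tfrac12 & \tfrac12 \\[1mm] -\tfrac12 & \tfrac12 \end{vmatrix}
= \tfrac14 + \tfrac14 = \tfrac12
\quad\Rightarrow\quad |J(u,v)| = \tfrac12 ,
\]
\[
\begin{aligned}
\iint\limits_R e^{\frac{x-y}{x+y}}\,dA
  &= \iint\limits_{R'} e^{u/v}\,\tfrac12\,dA(u,v)
   = \int_0^1 \int_{-v}^{v} \tfrac12\,e^{u/v}\,du\,dv \\
  &= \int_0^1 \left( \tfrac{v}{2}\,e^{u/v}\,\Big|_{u=-v}^{u=v} \right) dv
   = \int_0^1 \tfrac{v}{2}\left( e - e^{-1} \right) dv
   = \frac{e - e^{-1}}{4} = \frac{e^2 - 1}{4e} .
\end{aligned}
\]
```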
The change of variables formula can be used to evaluate double integrals in polar coordi-
nates. Letting
x = x( r, θ ) = r cos θ and y = y( r, θ ) = r sin θ ,
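we have
\[
J(u,v) = \begin{vmatrix}
\dfrac{\partial x}{\partial r} & \dfrac{\partial x}{\partial \theta} \\[2mm]
\dfrac{\partial y}{\partial r} & \dfrac{\partial y}{\partial \theta}
\end{vmatrix}
= \begin{vmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{vmatrix}
= r\cos^2\theta + r\sin^2\theta = r \;\Rightarrow\; |J(u,v)| = |r| = r ,
\]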
where the mapping x = r cos θ , y = r sin θ maps the region R ′ in the r θ -plane onto the
region R in the x y-plane in a one-to-one manner.
Example 4.12. Find the volume V inside the cone $z = \sqrt{x^2+y^2}$ for 0 ≤ z ≤ 1.
In a similar fashion, it can be shown (see Exercises 5–6) that triple integrals in cylindrical
and spherical coordinates take the following forms:
Triple Integral in Cylindrical Coordinates
\[
\iiint\limits_S f(x,y,z)\,dx\,dy\,dz = \iiint\limits_{S'} f(r\cos\theta, r\sin\theta, z)\,r\,dr\,d\theta\,dz , \tag{4.25}
\]
where the mapping x = r cos θ , y = r sin θ , z = z maps the solid S ′ in r θ z-space onto the
solid S in x yz-space in a one-to-one manner.
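Example 4.13. For a > 0, find the volume V inside the sphere S given by $x^2 + y^2 + z^2 = a^2$.
\[
\begin{aligned}
V = \iiint\limits_S 1\,dV &= \int_0^{2\pi}\!\int_0^{\pi}\!\int_0^{a} 1\,\rho^2 \sin\phi\,d\rho\,d\phi\,d\theta \\
&= \int_0^{2\pi}\!\int_0^{\pi} \left( \frac{\rho^3}{3}\,\Big|_{\rho=0}^{\rho=a} \right) \sin\phi\,d\phi\,d\theta
 = \int_0^{2\pi}\!\int_0^{\pi} \frac{a^3}{3}\,\sin\phi\,d\phi\,d\theta \\
&= \int_0^{2\pi} \left( -\frac{a^3}{3}\cos\phi\,\Big|_{\phi=0}^{\phi=\pi} \right) d\theta
 = \int_0^{2\pi} \frac{2a^3}{3}\,d\theta = \frac{4\pi a^3}{3} .
\end{aligned}
\]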
Exercises
A
1. Find the volume V inside the paraboloid z = x2 + y2 for 0 ≤ z ≤ 4.
2. Find the volume V inside the cone $z = \sqrt{x^2+y^2}$ for 0 ≤ z ≤ 3.
B
3. Find the volume V of the solid inside both $x^2 + y^2 + z^2 = 4$ and $x^2 + y^2 = 1$.
4. Find the volume V inside both the sphere $x^2 + y^2 + z^2 = 1$ and the cone $z = \sqrt{x^2+y^2}$.
C
10. Show that the volume inside the ellipsoid $\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1$ is $\frac{4\pi abc}{3}$. (Hint: Use the change
of variables x = au, y = bv, z = cw, then consider Example 4.13.)
For any smooth function f ( x, y) which vanishes outside of a bounded region in the plane.
where
\[
M_x = \int_a^b \frac{(f(x))^2}{2}\,dx , \qquad M_y = \int_a^b x f(x)\,dx , \qquad M = \int_a^b f(x)\,dx , \tag{4.27}
\]
assuming that R has uniform density, i.e., the mass of R is uniformly distributed over the
region. In this case the area M of the region is considered the mass of R (the density is
constant, and taken as 1 for simplicity).
In the general case where the density of a region (or lamina) R is a continuous function
δ = δ( x, y) of the coordinates ( x, y) of points inside R (where R can be any region in R2 ) the
coordinates ( x̄, ȳ) of the center of mass of R are given by
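\[ \bar{x} = \frac{M_y}{M} \quad\text{and}\quad \bar{y} = \frac{M_x}{M} , \tag{4.28} \]
where
\[
M_y = \iint\limits_R x\,\delta(x,y)\,dA , \qquad M_x = \iint\limits_R y\,\delta(x,y)\,dA , \qquad M = \iint\limits_R \delta(x,y)\,dA . \tag{4.29}
\]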
The quantities M x and M y are called the moments (or first moments) of the region R about
the x-axis and y-axis, respectively. The quantity M is the mass of the region R . To see this,
think of taking a small rectangle inside R with dimensions ∆ x and ∆ y close to 0. The mass
of that rectangle is approximately δ( x∗ , y∗ )∆ x ∆ y, for some point ( x∗ , y∗ ) in that rectangle.
Then the mass of R is the limit of the sums of the masses of all such rectangles inside R as
the diagonals of the rectangles approach 0, which is the double integral $\iint\limits_R \delta(x,y)\,dA$.
Note that the formulas in (4.27) represent a special case when δ( x, y) = 1 throughout R in
the formulas in (4.29).
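Solution: The region R is shown in Figure 4.6.2. We have
\[
\begin{aligned}
M &= \iint\limits_R \delta(x,y)\,dA \\
&= \int_0^1\!\int_0^{2x^2} (x+y)\,dy\,dx \\
&= \int_0^1 \left( xy + \frac{y^2}{2}\,\Big|_{y=0}^{y=2x^2} \right) dx \\
&= \int_0^1 (2x^3 + 2x^4)\,dx \\
&= \frac{x^4}{2} + \frac{2x^5}{5}\,\Big|_0^1 = \frac{9}{10}
\end{aligned}
\]
[Figure 4.6.2: The region R between the x-axis and the curve y = 2x² for 0 ≤ x ≤ 1]
and
\[
\begin{aligned}
M_x &= \iint\limits_R y\,\delta(x,y)\,dA & M_y &= \iint\limits_R x\,\delta(x,y)\,dA \\
&= \int_0^1\!\int_0^{2x^2} y(x+y)\,dy\,dx & &= \int_0^1\!\int_0^{2x^2} x(x+y)\,dy\,dx \\
&= \int_0^1 \left( \frac{xy^2}{2} + \frac{y^3}{3}\,\Big|_{y=0}^{y=2x^2} \right) dx & &= \int_0^1 \left( x^2 y + \frac{xy^2}{2}\,\Big|_{y=0}^{y=2x^2} \right) dx \\
&= \int_0^1 \left( 2x^5 + \frac{8x^6}{3} \right) dx & &= \int_0^1 (2x^4 + 2x^5)\,dx \\
&= \frac{x^6}{3} + \frac{8x^7}{21}\,\Big|_0^1 = \frac{5}{7} & &= \frac{2x^5}{5} + \frac{x^6}{3}\,\Big|_0^1 = \frac{11}{15} ,
\end{aligned}
\]
Hence
\[
\bar{x} = \frac{M_y}{M} = \frac{11/15}{9/10} = \frac{22}{27} , \qquad \bar{y} = \frac{M_x}{M} = \frac{5/7}{9/10} = \frac{50}{63} .
\]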
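The values above can be checked numerically. The sketch below approximates the three double integrals in (4.29) with a midpoint Riemann sum over the region under y = 2x² with density δ(x, y) = x + y. The class name CenterOfMassCheck and the grid size are assumptions chosen for illustration.

```java
// Hypothetical sketch: midpoint Riemann sums for M, M_x and M_y over
// R = {(x, y) : 0 <= x <= 1, 0 <= y <= 2x^2} with density delta(x, y) = x + y.
class CenterOfMassCheck {
    // Returns {M, M_x, M_y} using n subintervals in x and n subintervals in y.
    static double[] moments(int n) {
        double m = 0.0, mx = 0.0, my = 0.0;
        double dx = 1.0 / n;
        for (int i = 0; i < n; i++) {
            double x = (i + 0.5) * dx;       // midpoint in x
            double top = 2 * x * x;          // upper boundary y = 2x^2
            double dy = top / n;
            for (int j = 0; j < n; j++) {
                double y = (j + 0.5) * dy;   // midpoint in y
                double mass = (x + y) * dx * dy;
                m += mass;
                mx += y * mass;              // moment about the x-axis
                my += x * mass;              // moment about the y-axis
            }
        }
        return new double[] { m, mx, my };
    }

    public static void main(String[] args) {
        double[] r = moments(400);
        System.out.println("x-bar = " + r[2] / r[0] + " (exact 22/27 = " + 22.0 / 27 + ")");
        System.out.println("y-bar = " + r[1] / r[0] + " (exact 50/63 = " + 50.0 / 63 + ")");
    }
}
```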
Note how this center of mass is a little further towards the upper corner of the region R
than when the density is uniform (use the formulas in (4.27) to show that $(\bar{x},\bar{y}) = \left(\tfrac34, \tfrac35\right)$ in
that case). This makes sense since the density function δ(x, y) = x + y increases as (x, y)
approaches that upper corner, where there is quite a bit of area.
In the special case where the density function δ( x, y) is a constant function on the region
R , the center of mass ( x̄, ȳ) is called the centroid of R .
The formulas for the center of mass of a region in R2 can be generalized to a solid S in R3 .
Let S be a solid with a continuous mass density function δ( x, y, z) at any point ( x, y, z) in S .
Then the center of mass of S has coordinates ( x̄, ȳ, z̄), where
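\[ \bar{x} = \frac{M_{yz}}{M} , \qquad \bar{y} = \frac{M_{xz}}{M} , \qquad \bar{z} = \frac{M_{xy}}{M} , \tag{4.30} \]
where
\[
M_{yz} = \iiint\limits_S x\,\delta(x,y,z)\,dV , \qquad M_{xz} = \iiint\limits_S y\,\delta(x,y,z)\,dV , \qquad M_{xy} = \iiint\limits_S z\,\delta(x,y,z)\,dV , \tag{4.31}
\]
\[ M = \iiint\limits_S \delta(x,y,z)\,dV . \tag{4.32} \]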
In this case, $M_{yz}$, $M_{xz}$ and $M_{xy}$ are called the moments (or first moments) of S around the
yz-plane, xz-plane and xy-plane, respectively. Also, M is the mass of S.
\[
= \int_0^{2\pi}\!\int_0^{\pi/2} \sin\phi\,\cos\phi \left( \int_0^a \rho^3\,d\rho \right) d\phi\,d\theta
\]
4.6 Application: Center of Mass 145
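\[
\begin{aligned}
&= \int_0^{2\pi}\!\int_0^{\pi/2} \frac{a^4}{4}\,\sin\phi\,\cos\phi\,d\phi\,d\theta \\
M_{xy} &= \int_0^{2\pi}\!\int_0^{\pi/2} \frac{a^4}{8}\,\sin 2\phi\,d\phi\,d\theta \quad (\text{since } \sin 2\phi = 2\sin\phi\cos\phi) \\
&= \int_0^{2\pi} \left( -\frac{a^4}{16}\cos 2\phi\,\Big|_{\phi=0}^{\phi=\pi/2} \right) d\theta \\
&= \int_0^{2\pi} \frac{a^4}{8}\,d\theta \\
&= \frac{\pi a^4}{4} ,
\end{aligned}
\]
so
\[
\bar{z} = \frac{M_{xy}}{M} = \frac{\;\frac{\pi a^4}{4}\;}{\frac{2\pi a^3}{3}} = \frac{3a}{8} .
\]
Thus, the center of mass of S is $(\bar{x},\bar{y},\bar{z}) = \left(0, 0, \tfrac{3a}{8}\right)$.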
Exercises
A
For Exercises 1–5, find the center of mass of the region R with the given density function
δ( x, y).
1. R = {( x, y) : 0 ≤ x ≤ 2, 0 ≤ y ≤ 4 }, δ( x, y) = 2 y
2. R = {( x, y) : 0 ≤ x ≤ 1, 0 ≤ y ≤ x2 }, δ( x, y) = x + y
3. R = {( x, y) : y ≥ 0, x2 + y2 ≤ a2 }, δ( x, y) = 1
4. R = {(x, y) : y ≥ 0, x ≥ 0, 1 ≤ x² + y² ≤ 4 }, $\delta(x,y) = \sqrt{x^2+y^2}$
5. R = {( x, y) : y ≥ 0, x2 + y2 ≤ 1 }, δ( x, y) = y
B
For Exercises 6–10, find the center of mass of the solid S with the given density function
δ( x, y, z).
6. S = {( x, y, z) : 0 ≤ x ≤ 1, 0 ≤ y ≤ 1, 0 ≤ z ≤ 1 }, δ( x, y, z) = x yz
7. S = {( x, y, z) : z ≥ 0, x2 + y2 + z2 ≤ a2 }, δ( x, y, z) = x2 + y2 + z2
8. S = {( x, y, z) : x ≥ 0, y ≥ 0, z ≥ 0, x2 + y2 + z2 ≤ a2 }, δ( x, y, z) = 1
9. S = {( x, y, z) : 0 ≤ x ≤ 1, 0 ≤ y ≤ 1, 0 ≤ z ≤ 1 }, δ( x, y, z) = x2 + y2 + z2
10. S = {( x, y, z) : 0 ≤ x ≤ 1, 0 ≤ y ≤ 1, 0 ≤ z ≤ 1 − x − y}, δ( x, y, z) = 1
C
11. Let F be a figure in the upper half-plane; denote by $(x_0, y_0)$ its center of mass and by A
its area. Show that
\[ 2\pi A y_0 \]
is the volume of the body of revolution obtained by rotating F around the x-axis.
4.7 Application: Probability and Expected Value 147
Probability
Suppose that you have a standard six-sided (fair) die, and you let a variable X represent
the value rolled. Then the probability of rolling a 3, written as P(X = 3), is $\tfrac16$, since there
are six sides on the die and each one is equally likely to be rolled, and hence in particular
the 3 has a one out of six chance of being rolled. Likewise the probability of rolling at most a
3, written as P(X ≤ 3), is $\tfrac36 = \tfrac12$, since of the six numbers on the die, there are three equally
likely numbers (1, 2, and 3) that are less than or equal to 3. Note that P(X ≤ 3) = P(X =
1) + P ( X = 2) + P ( X = 3). We call X a discrete random variable on the sample space (or
probability space) Ω consisting of all possible outcomes. In our case, Ω = {1, 2, 3, 4, 5, 6}. An
event A is a subset of the sample space. For example, in the case of the die, the event X ≤ 3
is the set {1, 2, 3}.
Now let X be a variable representing a random real number in the interval (0, 1). Note
that for any real number x in (0, 1), it makes no sense to consider P ( X = x) since it must
be 0 (why?). Instead, we consider the probability P ( X ≤ x), which is given by P ( X ≤ x) = x.
The reasoning is this: the interval (0, 1) has length 1, and for x in (0, 1) the interval (0, x)
has length x. So since X represents a random number in (0, 1), and hence is uniformly
distributed over (0, 1), then
\[ P(X \le x) = \frac{\text{length of } (0,x)}{\text{length of } (0,1)} = \frac{x}{1} = x . \]
We call X a continuous random variable on the sample space Ω = (0, 1). An event A is a
subset of the sample space. For example, in our case the event X ≤ x is the set (0, x).
In the case of a discrete random variable, we saw how the probability of an event was
the sum of the probabilities of the individual outcomes comprising that event (for instance,
P ( X ≤ 3) = P ( X = 1) + P ( X = 2) + P ( X = 3) in the die example). For a continuous random
variable, the probability of an event will instead be the integral of a function, which we will
now describe.
Let X be a continuous real-valued random variable on a sample space Ω in R. For simplic-
ity, let Ω = (a, b). Define the distribution function F of X as
\[ F(x) = \int_{-\infty}^{x} f(y)\,dy , \quad\text{for } -\infty < x < \infty , \tag{4.35} \]
and
\[ \int_{-\infty}^{\infty} f(x)\,dx = 1 . \tag{4.36} \]
\[ P(X \le x) = \int_a^x f(y)\,dy , \quad\text{for } a < x < b . \tag{4.37} \]
Example 4.16. Let X represent a randomly selected real number in the interval (0, 1). We
say that X has the uniform distribution on (0, 1), with distribution function
\[
F(x) = P(X \le x) = \begin{cases} 1, & \text{for } x \ge 1 \\ x, & \text{for } 0 < x < 1 \\ 0, & \text{for } x \le 0 , \end{cases} \tag{4.39}
\]
In general, if X represents a randomly selected real number in an interval (a, b), then X has
the uniform distribution function
\[
F(x) = P(X \le x) = \begin{cases} 1, & \text{for } x \ge b \\[1mm] \dfrac{x-a}{b-a}, & \text{for } a < x < b \\[1mm] 0, & \text{for } x \le a , \end{cases} \tag{4.41}
\]
Example 4.17. A famous distribution function is given by the standard normal distribution,
whose probability density function f is
\[ f(x) = \frac{1}{\sqrt{2\pi}}\,e^{-x^2/2} , \quad\text{for } -\infty < x < \infty . \tag{4.43} \]
This is often called a “bell curve”, and is used widely in statistics. Since we are claiming that
f is a probability density function, we should have
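\[ \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}}\,e^{-x^2/2}\,dx = 1 , \tag{4.44} \]
which is equivalent to
\[ \int_{-\infty}^{\infty} e^{-x^2/2}\,dx = \sqrt{2\pi} . \tag{4.45} \]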
We can use a double integral in polar coordinates to verify this integral. First,
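\[
\begin{aligned}
\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} e^{-(x^2+y^2)/2}\,dx\,dy
&= \int_{-\infty}^{\infty} e^{-y^2/2} \left( \int_{-\infty}^{\infty} e^{-x^2/2}\,dx \right) dy \\
&= \left( \int_{-\infty}^{\infty} e^{-x^2/2}\,dx \right) \left( \int_{-\infty}^{\infty} e^{-y^2/2}\,dy \right) \\
&= \left( \int_{-\infty}^{\infty} e^{-x^2/2}\,dx \right)^{2}
\end{aligned}
\]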
since the same function is being integrated twice in the middle equation, just with different
variables. But using polar coordinates, we see that
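\[
\begin{aligned}
\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} e^{-(x^2+y^2)/2}\,dx\,dy
&= \int_0^{2\pi}\!\int_0^{\infty} e^{-r^2/2}\,r\,dr\,d\theta \\
&= \int_0^{2\pi} \left( -e^{-r^2/2}\,\Big|_{r=0}^{r=\infty} \right) d\theta \\
&= \int_0^{2\pi} \left( 0 - (-e^{0}) \right) d\theta = \int_0^{2\pi} 1\,d\theta = 2\pi ,
\end{aligned}
\]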
and so
\[ \left( \int_{-\infty}^{\infty} e^{-x^2/2}\,dx \right)^{2} = 2\pi , \text{ and hence} \]
\[ \int_{-\infty}^{\infty} e^{-x^2/2}\,dx = \sqrt{2\pi} . \]
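As a numerical sanity check of this value, the sketch below approximates the integral of e^(−x²/2) with the trapezoid rule on the truncated interval [−L, L]. The class name GaussianIntegralCheck, the cutoff L = 10 and the number of subintervals are assumptions; the tails beyond |x| = 10 are far too small to affect the printed digits.

```java
// Hypothetical sketch: trapezoid-rule approximation of the integral of
// e^{-x^2/2} over [-L, L], compared with sqrt(2*pi).
class GaussianIntegralCheck {
    static double integrate(double L, int n) {
        double h = 2 * L / n;
        // The two endpoint values each get weight 1/2 in the trapezoid rule.
        double sum = 0.5 * (Math.exp(-L * L / 2) + Math.exp(-L * L / 2));
        for (int i = 1; i < n; i++) {
            double x = -L + i * h;
            sum += Math.exp(-x * x / 2);
        }
        return sum * h;
    }

    public static void main(String[] args) {
        System.out.println("trapezoid rule: " + integrate(10.0, 2000));
        System.out.println("sqrt(2*pi):     " + Math.sqrt(2 * Math.PI));
    }
}
```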
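\[
F(x,y,z) = \int_{-\infty}^{z}\!\int_{-\infty}^{y}\!\int_{-\infty}^{x} f(u,v,w)\,du\,dv\,dw , \quad\text{for } -\infty < x, y, z < \infty \tag{4.47}
\]
and
\[
\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} f(x,y,z)\,dx\,dy\,dz = 1 , \tag{4.48}
\]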
then we call f the joint probability density function for X, Y and Z. In general, for $a_1 < b_1$,
$a_2 < b_2$, $a_3 < b_3$, we have
with the ≤ and < symbols interchangeable in any combination. A triple integral, then, can
be thought of as representing a probability (for a function f which is a probability density
function).
Example 4.18. Let a, b, and c be real numbers selected randomly from the interval (0, 1).
What is the probability that the equation $ax^2 + bx + c = 0$ has at least one real solution x?
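\[
b^2 - 4ac \ge 0 \;\Leftrightarrow\; 0 < 4ac \le b^2 < 1 \;\Leftrightarrow\; 0 < 2\sqrt{a}\sqrt{c} \le b < 1 ,
\]
where the last relation holds for all 0 < a, c < 1 such that $0 < 4ac < 1$, that is, $0 < c < \frac{1}{4a}$.
[Figure 4.7.1: The region $R = R_1 \cup R_2$ in the ac-plane, with $R_1$ over $0 < a < \tfrac14$ and $R_2$ over $\tfrac14 < a < 1$]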
Considering a, b and c as real variables, the region R in the ac-plane where the above
relation holds is given by $R = \{(a,c) : 0 < a < 1,\ 0 < c < 1,\ 0 < c < \frac{1}{4a}\}$, which we can see is a
union of two regions $R_1$ and $R_2$, as in Figure 4.7.1 above.
Now let X , Y and Z be continuous random variables, each representing a randomly se-
lected real number from the interval (0, 1) (think of X , Y and Z representing a, b and c,
respectively). Then, similar to how we showed that f ( x) = 1 is the probability density func-
tion of the uniform distribution on (0, 1), it can be shown that f ( x, y, z) = 1 for x, y, z in (0, 1)
(0 elsewhere) is the joint probability density function of X , Y and Z . Now,
\[ P(b^2 - 4ac \ge 0) = P\left( (a,c) \in R,\ 2\sqrt{a}\sqrt{c} \le b < 1 \right) , \]
so this probability is the triple integral of f(a, b, c) = 1 as b varies from $2\sqrt{a}\sqrt{c}$ to 1 and as
(a, c) varies over the region R. Since R can be divided into two regions $R_1$ and $R_2$, then the
required triple integral can be split into a sum of two triple integrals, using vertical slices in
R:
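\[
\begin{aligned}
P(b^2 - 4ac \ge 0) &= \underbrace{\int_0^{1/4}\!\int_0^{1}\!\int_{2\sqrt{a}\sqrt{c}}^{1} 1\,db\,dc\,da}_{R_1}
 + \underbrace{\int_{1/4}^{1}\!\int_0^{\frac{1}{4a}}\!\int_{2\sqrt{a}\sqrt{c}}^{1} 1\,db\,dc\,da}_{R_2} \\
&= \int_0^{1/4}\!\int_0^{1} \left( 1 - 2\sqrt{a}\sqrt{c} \right) dc\,da
 + \int_{1/4}^{1}\!\int_0^{\frac{1}{4a}} \left( 1 - 2\sqrt{a}\sqrt{c} \right) dc\,da \\
&= \int_0^{1/4} \left( c - \tfrac43 \sqrt{a}\,c^{3/2}\,\Big|_{c=0}^{c=1} \right) da
 + \int_{1/4}^{1} \left( c - \tfrac43 \sqrt{a}\,c^{3/2}\,\Big|_{c=0}^{c=\frac{1}{4a}} \right) da \\
&= \int_0^{1/4} \left( 1 - \tfrac43 \sqrt{a} \right) da + \int_{1/4}^{1} \frac{1}{12a}\,da \\
&= \left( a - \tfrac89\,a^{3/2} \right)\Big|_0^{1/4} + \tfrac{1}{12} \ln a\,\Big|_{1/4}^{1} \\
&= \left( \tfrac14 - \tfrac19 \right) + \left( 0 - \tfrac{1}{12} \ln \tfrac14 \right) = \tfrac{5}{36} + \tfrac{1}{12} \ln 4 \\
P(b^2 - 4ac \ge 0) &= \frac{5 + 3\ln 4}{36} \approx 0.2544
\end{aligned}
\]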
In other words, the equation ax2 + bx + c = 0 has about a 25% chance of being solved!
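This probability can also be estimated by simulation. The sketch below draws a, b and c uniformly from (0, 1) many times and counts how often the discriminant b² − 4ac is nonnegative. The class name QuadraticRootsSim and the fixed seed are assumptions for illustration.

```java
import java.util.Random;

// Hypothetical sketch: Monte Carlo estimate of P(b^2 - 4ac >= 0) for
// a, b, c drawn independently and uniformly from the unit interval.
class QuadraticRootsSim {
    static double estimate(int n, long seed) {
        Random rng = new Random(seed);
        int hits = 0;
        for (int i = 0; i < n; i++) {
            double a = rng.nextDouble();
            double b = rng.nextDouble();
            double c = rng.nextDouble();
            if (b * b - 4 * a * c >= 0) {    // at least one real solution
                hits++;
            }
        }
        return (double) hits / n;
    }

    public static void main(String[] args) {
        double exact = (5 + 3 * Math.log(4)) / 36;
        System.out.println("simulated: " + estimate(1000000, 42L));
        System.out.println("exact:     " + exact);
    }
}
```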
Expected Value
The expected value E X of a random variable X can be thought of as the “average” value of
X as it varies over its sample space. If X is a discrete random variable, then
\[ EX = \sum_{x} x\,P(X = x) , \tag{4.50} \]
with the sum being taken over all elements x of the sample space. For example, if X repre-
sents the number rolled on a six-sided die, then
\[ EX = \sum_{x=1}^{6} x\,P(X = x) = \sum_{x=1}^{6} x \cdot \frac{1}{6} = 3.5 \tag{4.51} \]
\[ EX = \int_{-\infty}^{\infty} x f(x)\,dx . \tag{4.52} \]
For example, if X has the uniform distribution on the interval (0, 1), then its probability
density function is
\[
f(x) = \begin{cases} 1, & \text{for } 0 < x < 1 \\ 0, & \text{elsewhere,} \end{cases} \tag{4.53}
\]
and so
\[ EX = \int_{-\infty}^{\infty} x f(x)\,dx = \int_0^1 x\,dx = \frac{1}{2} . \tag{4.54} \]
For a pair of jointly distributed, real-valued continuous random variables X and Y with
joint probability density function f ( x, y), the expected values of X and Y are given by
\[
EX = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} x f(x,y)\,dx\,dy \quad\text{and}\quad EY = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} y f(x,y)\,dx\,dy , \tag{4.55}
\]
respectively.
Example 4.19. If you were to pick n > 2 random real numbers from the interval (0, 1), what
are the expected values for the smallest and largest of those numbers?
Solution: Let U1 , . . . ,Un be n continuous random variables, each representing a randomly
selected real number from (0, 1) with the uniform distribution on (0, 1). Define random vari-
ables X and Y by
X = min(U1 , . . . ,Un ) and Y = max(U1 , . . . ,Un ) .
Then it can be shown³ that the joint probability density function of X and Y is
\[
f(x,y) = \begin{cases} n(n-1)(y-x)^{n-2}, & \text{for } 0 \le x \le y \le 1 \\ 0, & \text{elsewhere.} \end{cases} \tag{4.56}
\]
\[
\begin{aligned}
EX &= \int_0^1\!\int_x^1 n(n-1)\,x\,(y-x)^{n-2}\,dy\,dx \\
&= \int_0^1 \left( n x (y-x)^{n-1}\,\Big|_{y=x}^{y=1} \right) dx \\
&= \int_0^1 n x (1-x)^{n-1}\,dx , \text{ so integration by parts yields} \\
&= \left( -x(1-x)^n - \frac{1}{n+1}(1-x)^{n+1} \right)\Big|_0^1 \\
EX &= \frac{1}{n+1} ,
\end{aligned}
\]
and similarly (see Exercise 3) it can be shown that
\[ EY = \int_0^1\!\int_0^y n(n-1)\,y\,(y-x)^{n-2}\,dx\,dy = \frac{n}{n+1} . \]
So, for example, if you were to repeatedly take samples of n = 3 random real numbers from
(0, 1), and each time store the minimum and maximum values in the sample, then the average
of the minimums would approach $\tfrac14$ and the average of the maximums would approach
$\tfrac34$ as the number of samples grows. It would be relatively simple (see Exercise 4) to write a
computer program to test this.
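One possible shape for such a test is sketched below. It is only an illustration under assumed names (MinMaxSimulation, averages) and an assumed fixed seed, and it leaves the choice of sample counts and the presentation of results to the reader.

```java
import java.util.Random;

// Hypothetical sketch: average the minimum and maximum of n uniform random
// numbers over many samples, to compare with 1/(n+1) and n/(n+1).
class MinMaxSimulation {
    // Returns {average of minimums, average of maximums}.
    static double[] averages(int n, int samples, long seed) {
        Random rng = new Random(seed);
        double sumMin = 0.0, sumMax = 0.0;
        for (int s = 0; s < samples; s++) {
            double min = 1.0, max = 0.0;
            for (int i = 0; i < n; i++) {
                double u = rng.nextDouble();   // uniform in [0, 1)
                min = Math.min(min, u);
                max = Math.max(max, u);
            }
            sumMin += min;
            sumMax += max;
        }
        return new double[] { sumMin / samples, sumMax / samples };
    }

    public static void main(String[] args) {
        int n = 3;
        double[] r = averages(n, 200000, 42L);
        System.out.println("average minimum: " + r[0] + " (expected " + 1.0 / (n + 1) + ")");
        System.out.println("average maximum: " + r[1] + " (expected " + (double) n / (n + 1) + ")");
    }
}
```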
Exercises
B
³ See Ch. 6 in HOEL, PORT and STONE.
3. Show that $EY = \frac{n}{n+1}$ in Example 4.19.
C
4. Write a computer program (in the language of your choice) that verifies the results in
Example 4.19 for the case n = 3 by taking large numbers of samples.
6. For continuous random variables X , Y with joint probability density function f ( x, y),
define the second moments E ( X 2 ) and E (Y 2 ) by
\[
E(X^2) = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} x^2 f(x,y)\,dx\,dy \quad\text{and}\quad E(Y^2) = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} y^2 f(x,y)\,dx\,dy ,
\]
\[ \rho = \frac{E(XY) - (EX)(EY)}{\sqrt{\operatorname{Var}(X)\,\operatorname{Var}(Y)}} , \]
where
\[ E(XY) = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} x y\,f(x,y)\,dx\,dy . \]
8. In Example 4.18 would the answer change if the interval (0, 100) is used instead of (0, 1)?
Explain.
5 Line and Surface Integrals
\[ W = \int_a^b f(x)\,dx . \]
In this section, we will see how to define the integral of a function (either real-valued or
vector-valued) of two variables over a general curve (also called path) in R2 . This definition
will be motivated by the physical notion of work. We will begin with real-valued functions of
two variables.
In physics, the intuitive idea of work is that
Assume you move an object of unit weight along a curve C in $\mathbb{R}^2$ and want to find the work
done by the force that acts against friction. Suppose f(x, y) is the coefficient of friction at
the point (x, y). In this case the force has magnitude f(x, y) and it is applied in the direction
of motion along C (see Figure 5.1.1 below).
[Figure 5.1.1: Curve C : x = x(t), y = y(t) for t in [a, b], with $\Delta s_i \approx \sqrt{\Delta x_i^2 + \Delta y_i^2}$ between $t = t_i$ and $t = t_{i+1}$]
We will assume for now that the function f ( x, y) is continuous and real-valued, so we only
consider the magnitude of the force. Partition the interval [a, b] as follows:
is approximately the total amount of work done over the entire curve. But since
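\[
\sqrt{\Delta x_i^2 + \Delta y_i^2} = \sqrt{ \left( \frac{\Delta x_i}{\Delta t_i} \right)^{2} + \left( \frac{\Delta y_i}{\Delta t_i} \right)^{2} }\;\Delta t_i ,
\]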
Taking the limit of that sum as the length of the largest subinterval goes to 0, the sum over
all subintervals becomes the integral from t = a to t = b, $\frac{\Delta x_i}{\Delta t_i}$ and $\frac{\Delta y_i}{\Delta t_i}$ become x′(t) and y′(t),
respectively, and $f(x_i^*, y_i^*)$ becomes f(x(t), y(t)), so that
\[ W = \int_a^b f(x(t),y(t))\,\sqrt{x'(t)^2 + y'(t)^2}\,dt . \tag{5.4} \]
The integral on the right side of the above equation gives us our idea of how to define,
for any real-valued function f ( x, y), the integral of f ( x, y) along the curve C , called a line
integral:
\[ s = s(t) = \int_a^t \sqrt{x'(u)^2 + y'(u)^2}\,du , \tag{5.6} \]
which you may recognize from Section 2.2 as the length of the curve C over the interval [a, t],
for all t in [a, b]. That is,
\[ ds = s'(t)\,dt = \sqrt{x'(t)^2 + y'(t)^2}\,dt , \tag{5.7} \]
[Figure: a thin vertical strip of height f(x, y) standing on an arc of the curve C of length ds]
Example 5.1. Use a line integral to show that the lateral surface area A of a right circular
cylinder of radius r and height h is 2π rh.
Solution: We will use the right circular cylinder with base circle C
given by $x^2 + y^2 = r^2$ and with height h in the positive z direction
(see Figure 5.1.3). Parametrize C as follows:
Note in Example 5.1 that if we had traversed the circle C twice (that is, let t vary from 0
to 4π) then we would have gotten an area of 4π rh — twice the desired area, even though the
curve itself is still the same (namely, a circle of radius r ). Also, notice that we traversed the
circle in the counter-clockwise direction. If we had gone in the clockwise direction, using the
parametrization
then it is easy to verify (see Exercise 12) that the value of the line integral is unchanged.
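These observations can be checked numerically with formula (5.4). The sketch below approximates the integral of f(x(t), y(t)) √(x′(t)² + y′(t)²) dt by the midpoint rule and applies it to the constant function f = h on the circle. The class and method names are hypothetical, and the clockwise parametrization used here, x = r cos t, y = −r sin t, is one assumed choice rather than necessarily the one used in the text.

```java
import java.util.function.DoubleBinaryOperator;
import java.util.function.DoubleUnaryOperator;

// Hypothetical sketch: midpoint-rule approximation of a line integral
// with respect to arc length, as in formula (5.4).
class ScalarLineIntegral {
    // Approximates the integral over [a, b] of f(x(t), y(t)) * sqrt(x'(t)^2 + y'(t)^2) dt.
    static double integrate(DoubleBinaryOperator f,
                            DoubleUnaryOperator x, DoubleUnaryOperator y,
                            DoubleUnaryOperator dx, DoubleUnaryOperator dy,
                            double a, double b, int n) {
        double step = (b - a) / n;
        double sum = 0.0;
        for (int i = 0; i < n; i++) {
            double t = a + (i + 0.5) * step;
            double speed = Math.hypot(dx.applyAsDouble(t), dy.applyAsDouble(t));
            sum += f.applyAsDouble(x.applyAsDouble(t), y.applyAsDouble(t)) * speed;
        }
        return sum * step;
    }

    // f(x, y) = h on the circle x = r cos t, y = sign * r sin t, 0 <= t <= tEnd.
    // sign = 1 is counter-clockwise, sign = -1 is clockwise.
    static double cylinderArea(double r, double h, double tEnd, int sign, int n) {
        return integrate((x, y) -> h,
                t -> r * Math.cos(t), t -> sign * r * Math.sin(t),
                t -> -r * Math.sin(t), t -> sign * r * Math.cos(t),
                0.0, tEnd, n);
    }

    public static void main(String[] args) {
        double r = 2.0, h = 3.0;
        System.out.println(cylinderArea(r, h, 2 * Math.PI, 1, 1000));   // about 2*pi*r*h
        System.out.println(cylinderArea(r, h, 2 * Math.PI, -1, 1000));  // same value, clockwise
        System.out.println(cylinderArea(r, h, 4 * Math.PI, 1, 1000));   // about 4*pi*r*h
    }
}
```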
In general, it can be shown (see Exercise 15) that reversing the direction in which a curve
C is traversed leaves $\int_C f(x,y)\,ds$ unchanged, for any f(x, y). If a curve C has a parametrization
x = x(t), y = y(t), a ≤ t ≤ b, then denote by −C the same curve as C but traversed in the
opposite direction. Then −C is parametrized by
\[ x = x(a+b-t) , \quad y = y(a+b-t) , \quad a \le t \le b , \tag{5.9} \]
and we have
\[ \int_C f(x,y)\,ds = \int_{-C} f(x,y)\,ds . \tag{5.10} \]
Notice that our definition of the line integral was with respect to the arc length parameter
s. We can also define
\[ \int_C f(x,y)\,dx = \int_a^b f(x(t),y(t))\,x'(t)\,dt \tag{5.11} \]
5.1 Line Integrals 159
\[ \int_C f(x,y)\,dy = \int_a^b f(x(t),y(t))\,y'(t)\,dt \tag{5.12} \]
r( t) = x( t) i + y( t) j
\[
\begin{aligned}
\int_C P(x,y)\,dx + \int_C Q(x,y)\,dy &= \int_a^b P(x(t),y(t))\,x'(t)\,dt + \int_a^b Q(x(t),y(t))\,y'(t)\,dt \\
&= \int_a^b \left( P(x(t),y(t))\,x'(t) + Q(x(t),y(t))\,y'(t) \right) dt \\
&= \int_a^b \mathbf{f}(x(t),y(t)) \cdot \mathbf{r}'(t)\,dt
\end{aligned}
\]
by definition of f( x, y). Notice that the function f( x( t), y( t)) · r ′ ( t) is a real-valued function on
[a, b], so the last integral on the right looks somewhat similar to our earlier definition of a
line integral. This leads us to the following definition:
where it is understood that the line integral along C is being applied to both P and Q . The
quantity P ( x, y) dx + Q ( x, y) d y is known as a differential form. For a real-valued function
F(x, y), the differential of F is $dF = \frac{\partial F}{\partial x}\,dx + \frac{\partial F}{\partial y}\,dy$. A differential form P(x, y) dx + Q(x, y) dy
is called exact if it equals dF for some function F ( x, y).
Recall that if the points on a curve C have position vector r( t) = x( t) i + y( t) j, then r ′ ( t) is a
tangent vector to C at the point ( x( t), y( t)) in the direction of increasing t (which we call the
direction of C). Since C is a smooth curve, then $\mathbf{r}'(t) \ne \mathbf{0}$ on [a, b] and hence
\[ \mathbf{T}(t) = \frac{\mathbf{r}'(t)}{\|\mathbf{r}'(t)\|} \]
is the unit tangent vector to C at ( x( t), y( t)). Putting Definitions 5.1 and 5.2 together we get
the following theorem:
where $\mathbf{T}(t) = \frac{\mathbf{r}'(t)}{\|\mathbf{r}'(t)\|}$ is the unit tangent vector to C at (x(t), y(t)).
If the vector field f( x, y) represents the force moving an object along a curve C , then the work
W done by this force is
\[ W = \int_C \mathbf{f} \cdot \mathbf{T}\,ds = \int_C \mathbf{f} \cdot d\mathbf{r} . \tag{5.16} \]
Example 5.2. Evaluate $\int_C (x^2 + y^2)\,dx + 2xy\,dy$, where:
(a) C : x = t , y = 2t , 0≤t≤1
(b) C : x = t , y = 2 t2 , 0≤t≤1
Solution: Write $C = C_1 \cup C_2$, where $C_1$ is the curve given by x = 0, y = t,
0 ≤ t ≤ 2 and $C_2$ is the curve given by x = t, y = 2, 0 ≤ t ≤ 1 (see Figure
5.1.5). Then
\[
\begin{aligned}
\int_C (x^2 + y^2)\,dx + 2xy\,dy &= \int_{C_1} (x^2 + y^2)\,dx + 2xy\,dy + \int_{C_2} (x^2 + y^2)\,dx + 2xy\,dy \\
&= \int_0^2 \left( (0^2 + t^2)(0) + 2(0)t(1) \right) dt + \int_0^1 \left( (t^2 + 4)(1) + 2t(2)(0) \right) dt \\
&= \int_0^2 0\,dt + \int_0^1 (t^2 + 4)\,dt \\
&= \frac{t^3}{3} + 4t\,\Big|_0^1 = \frac{1}{3} + 4 = \frac{13}{3}
\end{aligned}
\]
[Figure 5.1.5: The path $C = C_1 \cup C_2$ from (0, 0) up the y-axis to (0, 2), then across to (1, 2)]
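The value 13/3 obtained along this path, and along the two curves of Example 5.2, can be reproduced numerically from formulas (5.11) and (5.12). The sketch below uses a midpoint rule in the parameter t; the class and method names are hypothetical and the derivatives x′(t), y′(t) are supplied by hand for each curve.

```java
import java.util.function.DoubleUnaryOperator;

// Hypothetical sketch: midpoint-rule approximation of the line integral of
// P dx + Q dy with P = x^2 + y^2 and Q = 2xy along a parametrized curve.
class VectorLineIntegral {
    static double p(double x, double y) { return x * x + y * y; }
    static double q(double x, double y) { return 2 * x * y; }

    // Approximates the integral over [a, b] of (P x'(t) + Q y'(t)) dt.
    static double integrate(DoubleUnaryOperator x, DoubleUnaryOperator y,
                            DoubleUnaryOperator dx, DoubleUnaryOperator dy,
                            double a, double b, int n) {
        double step = (b - a) / n;
        double sum = 0.0;
        for (int i = 0; i < n; i++) {
            double t = a + (i + 0.5) * step;
            double xt = x.applyAsDouble(t), yt = y.applyAsDouble(t);
            sum += p(xt, yt) * dx.applyAsDouble(t) + q(xt, yt) * dy.applyAsDouble(t);
        }
        return sum * step;
    }

    // Example 5.2(a): x = t, y = 2t, 0 <= t <= 1.
    static double alongLine(int n) {
        return integrate(t -> t, t -> 2 * t, t -> 1.0, t -> 2.0, 0, 1, n);
    }

    // Example 5.2(b): x = t, y = 2t^2, 0 <= t <= 1.
    static double alongParabola(int n) {
        return integrate(t -> t, t -> 2 * t * t, t -> 1.0, t -> 4 * t, 0, 1, n);
    }

    // The polygonal path: C1 (x = 0, y = t, 0 <= t <= 2) followed by C2 (x = t, y = 2, 0 <= t <= 1).
    static double alongPolygon(int n) {
        double c1 = integrate(t -> 0.0, t -> t, t -> 0.0, t -> 1.0, 0, 2, n);
        double c2 = integrate(t -> t, t -> 2.0, t -> 1.0, t -> 0.0, 0, 1, n);
        return c1 + c2;
    }

    public static void main(String[] args) {
        System.out.println("line:     " + alongLine(10000));
        System.out.println("parabola: " + alongParabola(10000));
        System.out.println("polygon:  " + alongPolygon(10000));
        System.out.println("13/3:     " + 13.0 / 3);
    }
}
```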
Line integral notation varies quite a bit. For example, in physics it is common to see the
notation $\int_a^b \mathbf{f} \cdot d\mathbf{l}$, where it is understood that the limits of integration a and b are for the
underlying parameter t of the curve, and the letter l signifies length. Also, the formulation
$\int_C \mathbf{f} \cdot \mathbf{T}\,ds$ from Theorem 5.1 is often preferred in physics since it emphasizes the idea of
integrating the tangential component f · T of f in the direction of T (that is, in the direction
of C), which is a useful physical interpretation of line integrals.
Exercises
A
For Exercises 1–4, calculate $\int_C f(x,y)\,ds$ for the given function f(x, y) and curve C.
5. Use a line integral to find the lateral surface area of the part of the cylinder
x2 + y2 = 4 below the plane x + 2 y + z = 6 and above the x y-plane.
7. f( x, y) = y i − x j; C : x = cos t, y = sin t, 0 ≤ t ≤ 2π
8. f( x, y) = x i + y j; C : x = cos t, y = sin t, 0 ≤ t ≤ 2π
9. f( x, y) = ( x2 − y) i + ( x − y2 ) j; C : x = cos t, y = sin t, 0 ≤ t ≤ 2π
14. Show that if f points in the same direction as r′(t) at each point r(t) along a smooth
curve C, then
\[ \int_C \mathbf{f} \cdot d\mathbf{r} = \int_C \|\mathbf{f}\|\,ds . \]
C
15. Prove that
\[ \int_C f(x,y)\,ds = \int_{-C} f(x,y)\,ds . \]
(Hint: Use formulas (5.9).)
16. Let C be a smooth curve with arc length L, and suppose that f(x, y) = P(x, y) i + Q(x, y) j
is a vector field such that $\|\mathbf{f}(x,y)\| \le M$ for all (x, y) on C. Show that
\[ \left| \int_C \mathbf{f} \cdot d\mathbf{r} \right| \le ML . \]
(Hint: Recall that $\left| \int_a^b g(x)\,dx \right| \le \int_a^b |g(x)|\,dx$ for Riemann integrals.)
17. Prove that the Riemann integral $\int_a^b f(x)\,dx$ is a special case of a line integral.
For line integrals of vector fields, however, the value does change. To see this, let f( x, y) =
P ( x, y) i + Q ( x, y) j be a vector field, with P and Q continuously differentiable functions. Let
C be a smooth curve parametrized by x = x( t), y = y( t), a ≤ t ≤ b, with position vector r( t) =
x( t) i + y( t) j (we will usually abbreviate this by saying that C : r( t) = x( t) i + y( t) j is a smooth
curve). We know that the curve −C traversed in the opposite direction is parametrized by
x = x(a + b − t), y = y(a + b − t), a ≤ t ≤ b. Then
\[
\begin{aligned}
\int_{-C} P(x,y)\,dx &= \int_a^b P(x(a+b-t), y(a+b-t))\,\frac{d}{dt}\left( x(a+b-t) \right) dt \\
&= \int_a^b P(x(a+b-t), y(a+b-t))\,(-x'(a+b-t))\,dt \quad \text{(by the Chain Rule)} \\
&= \int_b^a P(x(u), y(u))\,(-x'(u))\,(-du) \quad \text{(by letting } u = a+b-t\text{)} \\
&= \int_b^a P(x(u), y(u))\,x'(u)\,du \\
&= -\int_a^b P(x(u), y(u))\,x'(u)\,du , \text{ since } \int_b^a = -\int_a^b , \text{ so} \\
\int_{-C} P(x,y)\,dx &= -\int_C P(x,y)\,dx
\end{aligned}
\]
since we are just using a different letter (u) for the line integral along C. A similar argument
shows that
\[ \int_{-C} Q(x,y)\,dy = -\int_C Q(x,y)\,dy , \]
and hence
\[
\begin{aligned}
\int_{-C} \mathbf{f} \cdot d\mathbf{r} &= \int_{-C} P(x,y)\,dx + \int_{-C} Q(x,y)\,dy \\
&= -\int_C P(x,y)\,dx + -\int_C Q(x,y)\,dy
\end{aligned}
\]
5.2 Properties of Line Integrals 165
\[ = -\left( \int_C P(x,y)\,dx + \int_C Q(x,y)\,dy \right) \]
\[ \int_{-C} \mathbf{f} \cdot d\mathbf{r} = -\int_C \mathbf{f} \cdot d\mathbf{r} . \tag{5.18} \]
The above formula can be interpreted in terms of the work done by a force f( x, y) (treated
as a vector) moving an object along a curve C : the total work performed moving the object
along C from its initial point to its terminal point, and then back to the initial point moving
backwards along the same path, is zero. This is because when force is considered as a vector,
direction is accounted for.
The preceding discussion shows the importance of always taking the direction of the curve
into account when using line integrals of vector fields. For this reason, the curves in line
integrals are sometimes referred to as directed curves or oriented curves.
Recall that our definition of a line integral required that we have a parametrization x =
x( t), y = y( t), a ≤ t ≤ b for the curve C . But as we know, any curve has infinitely many
parametrizations. So could we get a different value for a line integral using some other
parametrization of C , say, x = x̃( u), y = ỹ( u), c ≤ u ≤ d ? If so, this would mean that our
definition is not well-defined. Luckily, it turns out that the value of a line integral of a
vector field is unchanged as long as the direction of the curve C is preserved by whatever
parametrization is chosen:
Proof: Since α(u) is strictly increasing and maps [c, d] onto [a, b], then we know that t =
α(u) has an inverse function $u = \alpha^{-1}(t)$ defined on [a, b] such that $c = \alpha^{-1}(a)$, $d = \alpha^{-1}(b)$,
and $\frac{du}{dt} = \frac{1}{\alpha'(u)}$. Also, $dt = \alpha'(u)\,du$, and by the Chain Rule
\[
\tilde{x}'(u) = \frac{d\tilde{x}}{du} = \frac{d}{du}\left( x(\alpha(u)) \right) = \frac{dx}{dt}\,\frac{dt}{du} = x'(t)\,\alpha'(u) \;\Rightarrow\; x'(t) = \frac{\tilde{x}'(u)}{\alpha'(u)}
\]
so substituting t = α(u) gives
\[
\begin{aligned}
\int_a^b P(x(t),y(t))\,x'(t)\,dt &= \int_{\alpha^{-1}(a)}^{\alpha^{-1}(b)} P(x(\alpha(u)), y(\alpha(u)))\,\frac{\tilde{x}'(u)}{\alpha'(u)}\,(\alpha'(u)\,du) \\
&= \int_c^d P(\tilde{x}(u), \tilde{y}(u))\,\tilde{x}'(u)\,du ,
\end{aligned}
\]
which shows that $\int_C P(x,y)\,dx$ has the same value for both parametrizations. A similar
argument shows that $\int_C Q(x,y)\,dy$ has the same value for both parametrizations, and hence
$\int_C \mathbf{f} \cdot d\mathbf{r}$ has the same value.
QED
Notice that the condition α ′ ( u) > 0 in Theorem 5.2 means that the two parametrizations
move along C in the same direction. That was not the case with the “reverse” parametriza-
tion for −C : for u = a + b − t we have t = α( u) = a + b − u ⇒ α ′ ( u) = −1 < 0.
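Example 5.4. Evaluate the line integral $\int_C (x^2+y^2)\,dx + 2xy\,dy$ from Example 5.2, Section 5.1, along the curve C: x = t, y = 2t², 0 ≤ t ≤ 1, where t = sin u for 0 ≤ u ≤ π/2.
Solution: First, we notice that 0 = sin 0, 1 = sin(π/2), and $\frac{dt}{du} = \cos u > 0$ on (0, π/2). So by Theorem 5.2 we know that if C is parametrized by x = sin u, y = 2 sin² u, 0 ≤ u ≤ π/2, then the line integral has the same value as for the parametrization by t. Indeed,
\[
\begin{aligned}
\int_C (x^2+y^2)\,dx + 2xy\,dy &= \int_0^{\pi/2} \Bigl( \bigl(\sin^2 u + (2\sin^2 u)^2\bigr)\cos u + 2(\sin u)(2\sin^2 u)\,4\sin u\cos u \Bigr)\,du \\
&= \int_0^{\pi/2} \bigl( \sin^2 u + 20\sin^4 u \bigr)\cos u\,du \\
&= \left. \frac{\sin^3 u}{3} + 4\sin^5 u \;\right|_0^{\pi/2} \\
&= \frac{1}{3} + 4 = \frac{13}{3} .
\end{aligned}
\]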
In other words, the line integral is unchanged whether t or u is the parameter for C .
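As a numerical sanity check of Example 5.4, the following minimal sketch approximates the line integral with both parametrizations. The helper name `midpoint_integral` and the number of subintervals are illustrative choices made here, not part of the text, and the midpoint rule only approximates the exact value 13/3.

```python
import math

def midpoint_integral(g, lo, hi, n=20000):
    """Approximate the integral of g over [lo, hi] with the midpoint rule."""
    h = (hi - lo) / n
    return h * sum(g(lo + (k + 0.5) * h) for k in range(n))

def integrand_t(t):
    # Parametrization by t: x = t, y = 2t^2, 0 <= t <= 1.
    x, y = t, 2 * t**2
    dx, dy = 1.0, 4 * t                      # x'(t), y'(t)
    return (x**2 + y**2) * dx + 2 * x * y * dy

def integrand_u(u):
    # Parametrization by u, with t = sin u, 0 <= u <= pi/2.
    x, y = math.sin(u), 2 * math.sin(u)**2
    dx = math.cos(u)                         # derivative of sin u
    dy = 4 * math.sin(u) * math.cos(u)       # derivative of 2 sin^2 u
    return (x**2 + y**2) * dx + 2 * x * y * dy

value_t = midpoint_integral(integrand_t, 0.0, 1.0)
value_u = midpoint_integral(integrand_u, 0.0, math.pi / 2)
print(value_t, value_u, 13 / 3)
```

Both approximations should agree with 13/3 to several decimal places, as Theorem 5.2 predicts for two parametrizations that traverse C in the same direction.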
By a closed curve, we mean a curve C whose initial point and terminal point are the
same; that is, for C : x = x( t), y = y( t), a ≤ t ≤ b, we have ( x(a), y(a)) = ( x( b), y( b)).
A simple closed curve is a closed curve which does not intersect itself. Note that any
closed curve can be regarded as a union of simple closed curves (think of the loops in a figure
eight). We use the special notation
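\[
\oint_C f(x,y)\,ds \qquad\text{and}\qquad \oint_C \mathbf{f}\cdot d\mathbf{r}
\]
to denote line integrals of scalar and vector fields, respectively, along closed curves.
[Figure: (a) Closed curve; (b) Not closed curve]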
So far, the examples we have seen of line integrals (for instance, Example 5.2) have had
the same value for different curves joining the initial point to the terminal point. That is,
the line integral has been independent of the path joining the two points. As we mentioned
before, this is not always the case. The following theorem gives a necessary and sufficient
condition for this path independence:
Theorem 5.3. In a region R, the line integral $\int_C \mathbf{f}\cdot d\mathbf{r}$ is independent of the path between any two points in R if and only if $\oint_C \mathbf{f}\cdot d\mathbf{r} = 0$ for every closed curve C which is contained in R.
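Proof: Suppose that $\oint_C \mathbf{f}\cdot d\mathbf{r} = 0$ for every closed curve C which is contained in R. Let P1 and P2 be two distinct points in R. Let C1 be a curve in R going from P1 to P2, and let C2 be another curve in R going from P1 to P2, as in Figure 5.2.2.
[Figure 5.2.2: curves C1 and C2 going from P1 to P2]
Then C = C1 ∪ −C2 is a closed curve in R (from P1 to P1), and so $\oint_C \mathbf{f}\cdot d\mathbf{r} = 0$. Thus,
\[
\begin{aligned}
0 &= \oint_C \mathbf{f}\cdot d\mathbf{r} \\
&= \int_{C_1} \mathbf{f}\cdot d\mathbf{r} + \int_{-C_2} \mathbf{f}\cdot d\mathbf{r} \\
&= \int_{C_1} \mathbf{f}\cdot d\mathbf{r} - \int_{C_2} \mathbf{f}\cdot d\mathbf{r} ,
\end{aligned}
\]
and so $\int_{C_1} \mathbf{f}\cdot d\mathbf{r} = \int_{C_2} \mathbf{f}\cdot d\mathbf{r}$. This proves path independence.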
Conversely, suppose that the line integral $\int_C \mathbf{f}\cdot d\mathbf{r}$ is independent of the path between any two points in R. Let C be a closed curve contained in R. Let P1 and P2 be two distinct points on C. Let C1 be a part of the curve C that goes from P1 to P2, and let C2 be the remaining part of C that goes from P1 to P2, again as in Figure 5.2.2. Then by path independence we have
\[
\begin{aligned}
\int_{C_1} \mathbf{f}\cdot d\mathbf{r} &= \int_{C_2} \mathbf{f}\cdot d\mathbf{r} \\
\int_{C_1} \mathbf{f}\cdot d\mathbf{r} - \int_{C_2} \mathbf{f}\cdot d\mathbf{r} &= 0 \\
\int_{C_1} \mathbf{f}\cdot d\mathbf{r} + \int_{-C_2} \mathbf{f}\cdot d\mathbf{r} &= 0 , \text{ so} \\
\oint_C \mathbf{f}\cdot d\mathbf{r} &= 0
\end{aligned}
\]
since C = C1 ∪ −C2. QED
Clearly, the above theorem does not give a practical way to determine path independence,
since it is impossible to check the line integrals around all possible closed curves in a region.
What it mostly does is give an idea of the way in which line integrals behave, and how seem-
ingly unrelated line integrals can be related (in this case, a specific line integral between
two points and all line integrals around closed curves).
Recall that if z = f ( x, y) is a continuously differentiable function of x and y, and both
x = x( t) and y = y( t) are differentiable functions of t, then
\[
\frac{dz}{dt} = \frac{\partial z}{\partial x}\frac{dx}{dt} + \frac{\partial z}{\partial y}\frac{dy}{dt} . \tag{5.19}
\]
This is the multivariable version of the Chain Rule; see Theorem 3.3 and Corollary 3.4. We will now use this version of the Chain Rule to prove the following sufficient condition for path independence of line integrals:
where A = ( x(a), y(a)) and B = ( x( b), y( b)) are the endpoints of C . Thus, the line integral is
independent of the path between its endpoints, since it depends only on the values of F at
those endpoints.
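Proof: By definition of $\int_C \mathbf{f}\cdot d\mathbf{r}$, we have
\[
\begin{aligned}
\int_C \mathbf{f}\cdot d\mathbf{r} &= \int_a^b \bigl( P(x(t),y(t))\,x'(t) + Q(x(t),y(t))\,y'(t) \bigr)\,dt \\
&= \int_a^b \left( \frac{\partial F}{\partial x}\frac{dx}{dt} + \frac{\partial F}{\partial y}\frac{dy}{dt} \right) dt \quad \left(\text{since } \nabla F = \mathbf{f} \;\Rightarrow\; \frac{\partial F}{\partial x} = P \text{ and } \frac{\partial F}{\partial y} = Q\right) \\
&= \int_a^b F(x(t),y(t))'\,dt \quad \text{(by the Chain Rule in Theorem 3.3)} \\
&= F(x(t),y(t))\,\Big|_a^b = F(B) - F(A)
\end{aligned}
\]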
Theorem 5.4 can be thought of as the line integral version of the Fundamental Theorem
of Calculus. A real-valued function F ( x, y) such that ∇F ( x, y) = f( x, y) is called a potential
for f. A conservative vector field is one which has a potential.
Example 5.5. Recall from Examples 5.2 and 5.3 in Section 5.1 that the line integral $\int_C (x^2+y^2)\,dx + 2xy\,dy$ was found to have the value $\frac{13}{3}$ for three different curves C going from the point (0, 0) to the point (1, 2). Use Theorem 5.4 to show that this line integral is indeed path independent.
Solution: We need to find a real-valued function F(x, y) such that
\[
\frac{\partial F}{\partial x} = x^2 + y^2 \qquad\text{and}\qquad \frac{\partial F}{\partial y} = 2xy .
\]
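The following sketch checks path independence numerically for this integrand. It uses the potential F(x, y) = x³/3 + xy², which the text quotes later when it revisits this vector field in Section 5.3. The straight segment from (0, 0) to (1, 2) and the helper name `line_integral` are illustrative choices made here, not taken from the text.

```python
def line_integral(path, dpath, lo, hi, n=20000):
    """Midpoint-rule approximation of the integral of (x^2 + y^2) dx + 2xy dy."""
    h = (hi - lo) / n
    total = 0.0
    for k in range(n):
        t = lo + (k + 0.5) * h
        x, y = path(t)
        dx, dy = dpath(t)
        total += ((x**2 + y**2) * dx + 2 * x * y * dy) * h
    return total

def F(x, y):
    # Candidate potential: dF/dx = x^2 + y^2 and dF/dy = 2xy.
    return x**3 / 3 + x * y**2

# The parabola x = t, y = 2t^2 from Example 5.4.
parabola = line_integral(lambda t: (t, 2 * t**2), lambda t: (1.0, 4 * t), 0.0, 1.0)
# A straight segment x = t, y = 2t (a path assumed here for comparison).
segment = line_integral(lambda t: (t, 2 * t), lambda t: (1.0, 2.0), 0.0, 1.0)
potential_difference = F(1, 2) - F(0, 0)
print(parabola, segment, potential_difference)
```

All three numbers should be close to 13/3, which is what Theorem 5.4 predicts once a potential is known.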
A consequence of Theorem 5.4 in the special case where C is a closed curve, so that the
endpoints A and B are the same point, is the following important corollary:
Corollary 5.5. If a vector field f has a potential in a region R, then $\oint_C \mathbf{f}\cdot d\mathbf{r} = 0$ for any closed curve C in R. Equivalently,
\[
\oint_C \nabla F\cdot d\mathbf{r} = 0
\]
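Example 5.6. Evaluate $\oint_C x\,dx + y\,dy$ for C: x = 2 cos t, y = 3 sin t, 0 ≤ t ≤ 2π.
Solution: The vector field f(x, y) = x i + y j has a potential F(x, y):
\[
\begin{aligned}
\frac{\partial F}{\partial x} = x \quad&\Rightarrow\quad F(x,y) = \frac{1}{2}x^2 + g(y) , \text{ so} \\
\frac{\partial F}{\partial y} = y \quad&\Rightarrow\quad g'(y) = y \quad\Rightarrow\quad g(y) = \frac{1}{2}y^2 + K
\end{aligned}
\]
for any constant K, so $F(x,y) = \frac{1}{2}x^2 + \frac{1}{2}y^2$ is a potential for f(x, y). Thus,
\[
\oint_C x\,dx + y\,dy = \oint_C \mathbf{f}\cdot d\mathbf{r} = 0
\]
by Corollary 5.5, since the curve C is closed (it is the ellipse $\frac{x^2}{4} + \frac{y^2}{9} = 1$).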
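A direct numerical evaluation gives the same answer as Corollary 5.5. The sketch below is only an illustration under the parametrization stated in Example 5.6; the function names and the step count are choices made here.

```python
import math

def closed_integral(n=20000):
    """Midpoint-rule value of the integral of x dx + y dy around the ellipse."""
    h = 2 * math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        x, y = 2 * math.cos(t), 3 * math.sin(t)
        dx, dy = -2 * math.sin(t), 3 * math.cos(t)   # x'(t), y'(t)
        total += (x * dx + y * dy) * h
    return total

def F(x, y):
    # Potential found in Example 5.6 (with K = 0).
    return 0.5 * x**2 + 0.5 * y**2

ellipse_value = closed_integral()
# The curve starts and ends at the same point, so F(B) - F(A) vanishes.
endpoint_difference = F(2 * math.cos(2 * math.pi), 3 * math.sin(2 * math.pi)) - F(2.0, 0.0)
print(ellipse_value, endpoint_difference)
```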
Exercises
A
1. Evaluate $\oint_C (x^2+y^2)\,dx + 2xy\,dy$ for C: x = cos t, y = sin t, 0 ≤ t ≤ 2π.
2. Evaluate $\int_C (x^2+y^2)\,dx + 2xy\,dy$ for C: x = cos t, y = sin t, 0 ≤ t ≤ π.
B
6. Let f(x, y) and g(x, y) be vector fields, let a and b be constants, and let C be a curve in R2. Show that
\[
\int_C (a\,\mathbf{f} \pm b\,\mathbf{g})\cdot d\mathbf{r} = a\int_C \mathbf{f}\cdot d\mathbf{r} \pm b\int_C \mathbf{g}\cdot d\mathbf{r} .
\]
7. Let C be a curve whose arc length is L. Show that $\int_C 1\,ds = L$.
C
10. Let g(x) and h(y) be differentiable functions, and let f(x, y) = h(y) i + g(x) j. For which g(x) and h(y) does the vector field f have a potential? Find the potential F(x, y) in each of these cases.
Proof: We will prove the theorem in the case for a simple region R , that is, where the
boundary curve C can be written as C = C 1 ∪ C 2 in two distinct ways:
where X 1 and X 2 are the points on C farthest to the left and right, respectively; and
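where Y1 and Y2 are the lowest and highest points, respectively, on C. See Figure 5.3.1.
[Figure 5.3.1: a simple region R with boundary curve C, bounded below by y = y1(x) and above by y = y2(x) for a ≤ x ≤ b, and on the left by x = x1(y) and on the right by x = x2(y) for c ≤ y ≤ d; X1, X2, Y1, Y2 mark the leftmost, rightmost, lowest, and highest points of C]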
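\[
\begin{aligned}
\oint_C P(x,y)\,dx &= -\int_a^b \bigl( P(x,y_2(x)) - P(x,y_1(x)) \bigr)\,dx \\
&= -\int_a^b \left( P(x,y)\,\Big|_{y=y_1(x)}^{y=y_2(x)} \right) dx \\
&= -\int_a^b \int_{y_1(x)}^{y_2(x)} \frac{\partial P(x,y)}{\partial y}\,dy\,dx \quad \text{(by the Fundamental Theorem of Calculus)} \\
&= -\iint_R \frac{\partial P}{\partial y}\,dA .
\end{aligned}
\]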
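\[
\begin{aligned}
\oint_C Q(x,y)\,dy &= \int_c^d \bigl( Q(x_2(y),y) - Q(x_1(y),y) \bigr)\,dy \\
&= \int_c^d \left( Q(x,y)\,\Big|_{x=x_1(y)}^{x=x_2(y)} \right) dy \\
&= \int_c^d \int_{x_1(y)}^{x_2(y)} \frac{\partial Q(x,y)}{\partial x}\,dx\,dy \quad \text{(by the Fundamental Theorem of Calculus)} \\
&= \iint_R \frac{\partial Q}{\partial x}\,dA , \text{ and so}
\end{aligned}
\]
\[
\begin{aligned}
\oint_C \mathbf{f}\cdot d\mathbf{r} &= \oint_C P(x,y)\,dx + \oint_C Q(x,y)\,dy \\
&= -\iint_R \frac{\partial P}{\partial y}\,dA + \iint_R \frac{\partial Q}{\partial x}\,dA \\
&= \iint_R \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dA .
\end{aligned}
\]
QED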
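\[
\begin{aligned}
\oint_C (x^2+y^2)\,dx + 2xy\,dy &= \iint_R \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dA \\
&= \iint_R (2y - 2y)\,dA = \iint_R 0\,dA = 0 .
\end{aligned}
\]
[Figure 5.3.2]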
We actually already knew that the answer was zero. Recall from Example 5.5 in Section 5.2 that the vector field f(x, y) = (x² + y²) i + 2xy j has a potential function $F(x,y) = \frac{1}{3}x^3 + xy^2$, and so $\oint_C \mathbf{f}\cdot d\mathbf{r} = 0$ by Corollary 5.5.
Though we proved Green’s Theorem only for a simple region R, the theorem can also be proved for more general regions, in particular for regions which admit subdivision into
5.3 Green’s Theorem 175
simple regions.1 This includes regions bounded by several closed curves. For such regions, the “outer” boundary and the “inner” boundaries are traversed so that R is always on the left side.
[Figure 5.3.3: (a) Region R with one hole; (b) Region R with two holes]
The idea for why Green’s Theorem holds for such regions is shown in Figure 5.3.3 above.
The idea is to cut region R so that it is divided into simple subregions. For example, in
Figure 5.3.3(a) the region R is the union of the regions R 1 and R 2 , which are divided by the
slits indicated by the dashed lines. Those slits are part of the boundary of both R 1 and R 2 ,
and we traverse them in the manner indicated by the arrows. Notice that along each slit the
boundary of R 1 is traversed in the opposite direction as that of R 2 , which means that the line
integrals of f along those slits cancel each other out. Assuming that Green’s Theorem holds
for R 1 and R 2 , we get
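\[
\oint_{\text{bdy of } R_1} \mathbf{f}\cdot d\mathbf{r} = \iint_{R_1} \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dA \qquad\text{and}\qquad \oint_{\text{bdy of } R_2} \mathbf{f}\cdot d\mathbf{r} = \iint_{R_2} \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dA .
\]
But since the line integrals along the slits cancel out, we have
\[
\oint_{C_1\cup C_2} \mathbf{f}\cdot d\mathbf{r} = \oint_{\text{bdy of } R_1} \mathbf{f}\cdot d\mathbf{r} + \oint_{\text{bdy of } R_2} \mathbf{f}\cdot d\mathbf{r} ,
\]
and so
\[
\oint_{C_1\cup C_2} \mathbf{f}\cdot d\mathbf{r} = \iint_{R_1} \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dA + \iint_{R_2} \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dA = \iint_{R} \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dA ,
\]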
which shows that Green’s Theorem holds in the region R . A similar argument shows that
the theorem holds in the region with two holes shown in Figure 5.3.3(b).
1 See Taylor and Mann, § 15.31 for a discussion of some of the difficulties involved when the boundary curve is “complicated”.
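and let R = { (x, y) : 0 < x² + y² ≤ 1 }. For the boundary curve C: x² + y² = 1, traversed counterclockwise, it was shown in Exercise 9(b) in Section 5.2 that $\oint_C \mathbf{f}\cdot d\mathbf{r} = 2\pi$. But
\[
\frac{\partial Q}{\partial x} = \frac{y^2 - x^2}{(x^2+y^2)^2} = \frac{\partial P}{\partial y} \quad\Rightarrow\quad \iint_R \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dA = \iint_R 0\,dA = 0 .
\]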
This would seem to contradict Green’s Theorem. However, note that R is not the entire
region enclosed by C , since the point (0, 0) is not contained in R . That is, R has a “hole” at
the origin, so Green’s Theorem does not apply.
If we modify the region R to be the annulus R = { (x, y) : 1/4 ≤ x² + y² ≤ 1 } (see Figure 5.3.4), and take the “boundary” C of R to be C = C1 ∪ C2, where C1 is the unit circle x² + y² = 1 traversed counterclockwise and C2 is the circle x² + y² = 1/4 traversed clockwise, then it can be shown (see Exercise 9) that
\[
\oint_C \mathbf{f}\cdot d\mathbf{r} = 0 .
\]
[Figure 5.3.4: The annulus R]
We would still have $\iint_R \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dA = 0$, so for this R we would have
\[
\oint_C \mathbf{f}\cdot d\mathbf{r} = \iint_R \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dA ,
\]
which shows that Green’s Theorem holds for the annular region R.
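The behaviour around the hole can be illustrated numerically. The sketch below assumes that the field of Example 5.8 is f(x, y) = −y/(x² + y²) i + x/(x² + y²) j; this is an assumption made here because the definition is not reproduced above, although it is consistent with the partial derivatives shown. The function name `circulation` is hypothetical.

```python
import math

def circulation(radius, n=2000):
    """Counterclockwise line integral of the assumed field around a circle about the origin."""
    h = 2 * math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        x, y = radius * math.cos(t), radius * math.sin(t)
        dx, dy = -radius * math.sin(t), radius * math.cos(t)
        p = -y / (x**2 + y**2)     # assumed P(x, y)
        q = x / (x**2 + y**2)      # assumed Q(x, y)
        total += (p * dx + q * dy) * h
    return total

unit_circle = circulation(1.0)     # C1, counterclockwise
inner_circle = circulation(0.5)    # circle x^2 + y^2 = 1/4, counterclockwise
# C2 is traversed clockwise, so its contribution enters with a minus sign.
annulus_boundary = unit_circle - inner_circle
print(unit_circle, annulus_boundary)
```

Under that assumption the circulation around each circle is 2π, so the two contributions cancel on the boundary of the annulus, in agreement with the double integral being zero there.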
\[
\oint_C \mathbf{f}\cdot d\mathbf{r} = \iint_R \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dA = \iint_R 0\,dA = 0 .
\]
For a simply connected region R (that is, a region with no holes), the following can be
shown:
(d) $\frac{\partial P}{\partial y} = \frac{\partial Q}{\partial x}$ in R (in this case, the differential form P dx + Q dy is exact)
A
For Exercises 1–4, use Green’s Theorem to evaluate the given line integral around the curve
C , traversed counterclockwise.
1. $\oint_C (x^2 - y^2)\,dx + 2xy\,dy$; C is the boundary of R = { (x, y) : 0 ≤ x ≤ 1, 2x² ≤ y ≤ 2x }
2. $\oint_C x^2 y\,dx + 2xy\,dy$; C is the boundary of R = { (x, y) : 0 ≤ x ≤ 1, x² ≤ y ≤ x }
3. $\oint_C 2y\,dx - 3x\,dy$; C is the circle x² + y² = 1
4. $\oint_C (e^{x^2} + y^2)\,dx + (e^{y^2} + x^2)\,dy$; C is the boundary of the triangle with vertices (0, 0), (4, 0) and (0, 4)
8. Show that
\[
\oint_C a\,dx + b\,dy = 0
\]
B
9. For the vector field f as in Example 5.8, show directly that $\oint_C \mathbf{f}\cdot d\mathbf{r} = 0$, where C is the boundary of the annulus R = { (x, y) : 1/4 ≤ x² + y² ≤ 1 } traversed so that R is always on the left.
10. Evaluate
\[
\oint_C e^x \sin y\,dx + (y^3 + e^x \cos y)\,dy ,
\]
where C is the boundary of the rectangle with vertices (1, −1), (1, 1), (−1, 1) and (−1, −1),
traversed counterclockwise.
C
11. For a region R bounded by a simple closed curve C , show that the area A of R is
\[
A = -\oint_C y\,dx = \oint_C x\,dy = \frac{1}{2}\oint_C x\,dy - y\,dx ,
\]
where C is traversed so that R is always on the left. (Hint: Use Green’s Theorem and the fact that $A = \iint_R 1\,dA$.)
In the following exercises, use Exercise 11 to find the area bounded by the given curve. (You should figure out how the curve is traversed around the region it bounds.)
13. The deltoid curve (2 cos t + cos 2t, 2 sin t − sin 2t) for 0 ≤ t ≤ 2π. (The deltoid curve is shown in the diagram; you can assume without proof that it has no self-intersections.)
5.4 Surface Integrals and the Divergence Theorem 179
[Figure: a curve C in R3 parametrized by x = x(t), y = y(t), z = z(t) for a ≤ t ≤ b, with position vector r(t) running from (x(a), y(a), z(a)) to (x(b), y(b), z(b))]
Similar to how we used a parametrization of a curve to define the line integral along the
curve, we will use a parametrization of a surface to define a surface integral. We will use
two variables, u and v, to parametrize a surface Σ in R3 : x = x( u, v), y = y( u, v), z = z( u, v),
for ( u, v) in some region R in R2 (see Figure 5.4.2).
[Figure 5.4.2: a surface Σ in R3 parametrized by x = x(u, v), y = y(u, v), z = z(u, v) for (u, v) in a region R in R2, with position vector r(u, v)]
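In this case, the position vector of a point on the surface Σ is given by the vector-valued function
\[
\mathbf{r}(u,v) = x(u,v)\,\mathbf{i} + y(u,v)\,\mathbf{j} + z(u,v)\,\mathbf{k} \quad\text{for } (u,v) \text{ in } R .
\]
Since r(u, v) is a function of two variables, define the partial derivatives $\frac{\partial \mathbf{r}}{\partial u}$ and $\frac{\partial \mathbf{r}}{\partial v}$ for (u, v) in R by
\[
\begin{aligned}
\frac{\partial \mathbf{r}}{\partial u}(u,v) &= \frac{\partial x}{\partial u}(u,v)\,\mathbf{i} + \frac{\partial y}{\partial u}(u,v)\,\mathbf{j} + \frac{\partial z}{\partial u}(u,v)\,\mathbf{k} , \text{ and} \\
\frac{\partial \mathbf{r}}{\partial v}(u,v) &= \frac{\partial x}{\partial v}(u,v)\,\mathbf{i} + \frac{\partial y}{\partial v}(u,v)\,\mathbf{j} + \frac{\partial z}{\partial v}(u,v)\,\mathbf{k} .
\end{aligned}
\]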
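\[
\begin{aligned}
\frac{\partial \mathbf{r}}{\partial u} &\approx \frac{\mathbf{r}(u+\Delta u, v) - \mathbf{r}(u,v)}{\Delta u} , \text{ and} \\
\frac{\partial \mathbf{r}}{\partial v} &\approx \frac{\mathbf{r}(u, v+\Delta v) - \mathbf{r}(u,v)}{\Delta v} ,
\end{aligned}
\]
and so the surface area element dσ is approximately
\[
\bigl\| (\mathbf{r}(u+\Delta u, v) - \mathbf{r}(u,v)) \times (\mathbf{r}(u, v+\Delta v) - \mathbf{r}(u,v)) \bigr\| \approx \left\| \left(\Delta u\,\frac{\partial \mathbf{r}}{\partial u}\right) \times \left(\Delta v\,\frac{\partial \mathbf{r}}{\partial v}\right) \right\| = \left\| \frac{\partial \mathbf{r}}{\partial u} \times \frac{\partial \mathbf{r}}{\partial v} \right\| \Delta u\,\Delta v
\]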
We will write the double integral on the right using the special notation
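\[
\iint_\Sigma d\sigma = \iint_R \left\| \frac{\partial \mathbf{r}}{\partial u} \times \frac{\partial \mathbf{r}}{\partial v} \right\| du\,dv . \tag{5.27}
\]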
This is a special case of a surface integral over the surface Σ, where the surface area element
d σ can be thought of as 1 d σ. Replacing 1 by a general real-valued function f ( x, y, z) defined
in R3 , we have the following:
Example 5.9. A torus T is a surface obtained by revolving a circle of radius a in the yz-plane
around the z-axis, where the circle’s center is at a distance b from the z-axis (0 < a < b), as
in Figure 5.4.3. Find the surface area of T .
Solution: For any point on the circle, the line segment from the center of the circle to that
point makes an angle u with the y-axis in the positive y direction (see Figure 5.4.3(a)). And
as the circle revolves around the z-axis, the line segment from the origin to the center of that
circle sweeps out an angle v with the positive x-axis (see Figure 5.4.3(b)). Thus, the torus
can be parametrized as:
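\[
x = (b + a\cos u)\cos v , \quad y = (b + a\cos u)\sin v , \quad z = a\sin u , \quad 0 \le u \le 2\pi , \; 0 \le v \le 2\pi .
\]
[Figure 5.4.3: (a) Circle in the yz-plane, (y − b)² + z² = a²; (b) Torus T]
So for the position vector
\[
\mathbf{r}(u,v) = (b + a\cos u)\cos v\,\mathbf{i} + (b + a\cos u)\sin v\,\mathbf{j} + a\sin u\,\mathbf{k}
\]
we see that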
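\[
\begin{aligned}
\frac{\partial \mathbf{r}}{\partial u} &= -a\sin u\cos v\,\mathbf{i} - a\sin u\sin v\,\mathbf{j} + a\cos u\,\mathbf{k} \\
\frac{\partial \mathbf{r}}{\partial v} &= -(b + a\cos u)\sin v\,\mathbf{i} + (b + a\cos u)\cos v\,\mathbf{j} + 0\,\mathbf{k} ,
\end{aligned}
\]
and so computing the cross product gives
\[
\frac{\partial \mathbf{r}}{\partial u} \times \frac{\partial \mathbf{r}}{\partial v} = -a(b + a\cos u)\cos v\cos u\,\mathbf{i} - a(b + a\cos u)\sin v\cos u\,\mathbf{j} - a(b + a\cos u)\sin u\,\mathbf{k} ,
\]
which has magnitude
\[
\left\| \frac{\partial \mathbf{r}}{\partial u} \times \frac{\partial \mathbf{r}}{\partial v} \right\| = a(b + a\cos u) .
\]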
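Thus, by formula (5.27), the surface area S of T is
\[
\begin{aligned}
S = \iint_\Sigma d\sigma &= \int_0^{2\pi}\!\!\int_0^{2\pi} \left\| \frac{\partial \mathbf{r}}{\partial u} \times \frac{\partial \mathbf{r}}{\partial v} \right\| du\,dv \\
&= \int_0^{2\pi}\!\!\int_0^{2\pi} a(b + a\cos u)\,du\,dv \\
&= \int_0^{2\pi} \left( abu + a^2\sin u \,\Big|_{u=0}^{u=2\pi} \right) dv \\
&= \int_0^{2\pi} 2\pi ab\,dv \\
&= 4\pi^2 ab
\end{aligned}
\]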
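The result of Example 5.9 can be checked numerically by summing the magnitude of the cross product over a grid in the (u, v) square. This is a minimal sketch: the sample radii a = 1, b = 3 and the grid size are arbitrary choices made here, and the function name `torus_area` is hypothetical.

```python
import math

def torus_area(a, b, n=400):
    """Midpoint-rule approximation of the double integral of |r_u x r_v| over [0, 2pi] x [0, 2pi]."""
    h = 2 * math.pi / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * h
        for j in range(n):
            v = (j + 0.5) * h
            # Partial derivatives of r(u, v) as computed in Example 5.9.
            ru = (-a * math.sin(u) * math.cos(v), -a * math.sin(u) * math.sin(v), a * math.cos(u))
            rv = (-(b + a * math.cos(u)) * math.sin(v), (b + a * math.cos(u)) * math.cos(v), 0.0)
            # Components of the cross product r_u x r_v.
            cx = ru[1] * rv[2] - ru[2] * rv[1]
            cy = ru[2] * rv[0] - ru[0] * rv[2]
            cz = ru[0] * rv[1] - ru[1] * rv[0]
            total += math.sqrt(cx * cx + cy * cy + cz * cz) * h * h
    return total

a, b = 1.0, 3.0                      # sample radii with 0 < a < b
approx_area = torus_area(a, b)
exact_area = 4 * math.pi**2 * a * b  # formula derived in Example 5.9
print(approx_area, exact_area)
```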
Assume that a surface Σ is given by a collection of charts. Note that for each chart r(u, v) in the collection, the vectors $\frac{\partial \mathbf{r}}{\partial u}$ and $\frac{\partial \mathbf{r}}{\partial v}$ are tangent to the surface. Therefore, their cross product $\frac{\partial \mathbf{r}}{\partial u}(u,v) \times \frac{\partial \mathbf{r}}{\partial v}(u,v)$ is normal to Σ at the point with position vector r(u, v).
Assume further that at each point P of the surface Σ one can choose a unit normal vector n in such a way that, for every chart r(u, v) in the collection, the cross product $\frac{\partial \mathbf{r}}{\partial u}(u,v) \times \frac{\partial \mathbf{r}}{\partial v}(u,v)$ is codirectional with n at the point with position vector r(u, v). In this case Σ is called oriented and the vector field n is called the outward unit normal vector of Σ.
[Figure 5.4.4]
Definition 5.4. Let Σ be an oriented surface in R3 and let f( x, y, z) be a vector field defined
on some subset of R3 that contains Σ. The surface integral of f over Σ is
\[
\iint_\Sigma \mathbf{f}\cdot d\boldsymbol{\sigma} = \iint_\Sigma \mathbf{f}\cdot\mathbf{n}\,d\sigma , \tag{5.30}
\]
Note in the above definition that the dot product inside the integral on the right is a real-
valued function, and hence we can use Definition 5.3 to evaluate the integral.
Example 5.10. Evaluate the surface integral $\iint_\Sigma \mathbf{f}\cdot d\boldsymbol{\sigma}$, where f(x, y, z) = yz i + xz j + xy k and Σ is the part of the plane x + y + z = 1 with x ≥ 0, y ≥ 0, and z ≥ 0, with the outward unit normal n pointing in the positive z direction (see Figure 5.4.5).
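\[
\begin{aligned}
\iint_\Sigma \mathbf{f}\cdot d\boldsymbol{\sigma} &= \int_0^1\!\!\int_0^{1-u} \frac{1}{\sqrt{3}}\bigl( (u+v) - (u+v)^2 + uv \bigr)\sqrt{3}\,dv\,du \\
&= \int_0^1 \left( \frac{(u+v)^2}{2} - \frac{(u+v)^3}{3} + \frac{uv^2}{2} \,\Bigg|_{v=0}^{v=1-u} \right) du \\
&= \int_0^1 \left( \frac{1}{6} + \frac{u}{2} - \frac{3u^2}{2} + \frac{5u^3}{6} \right) du \\
&= \frac{u}{6} + \frac{u^2}{4} - \frac{u^3}{2} + \frac{5u^4}{24} \,\Bigg|_0^1 = \frac{1}{8} .
\end{aligned}
\]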
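The value 1/8 can be cross-checked numerically. The sketch below assumes the parametrization x = u, y = v, z = 1 − (u + v) of the plane, which is consistent with the limits and integrand shown above; for it the cross product of the partial derivatives is i + j + k, of length √3. The function name and grid size are illustrative choices.

```python
import math

def flux_through_triangle(n=400):
    """Midpoint-rule value of the double integral over 0 <= v <= 1 - u, 0 <= u <= 1."""
    total = 0.0
    hu = 1.0 / n
    for i in range(n):
        u = (i + 0.5) * hu
        hv = (1.0 - u) / n                      # the v-range depends on u
        for j in range(n):
            v = (j + 0.5) * hv
            x, y, z = u, v, 1.0 - u - v         # point on the plane x + y + z = 1
            f = (y * z, x * z, x * y)           # f = yz i + xz j + xy k
            normal = (1 / math.sqrt(3),) * 3    # unit normal with positive z component
            f_dot_n = sum(fc * nc for fc, nc in zip(f, normal))
            # |r_u x r_v| = sqrt(3) for r(u, v) = u i + v j + (1 - u - v) k
            total += f_dot_n * math.sqrt(3) * hu * hv
    return total

flux = flux_through_triangle()
print(flux, 1 / 8)
```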
Computing surface integrals can often be tedious, especially when the formula for the
outward unit normal vector at each point of Σ changes. The following theorem provides an
easier way in the case when Σ is a closed surface, that is, when Σ encloses a bounded
solid in R3 . For example, spheres, cubes, and ellipsoids are closed surfaces, but planes and
paraboloids are not.
where
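\[
\operatorname{div}\mathbf{f} = \frac{\partial f_1}{\partial x} + \frac{\partial f_2}{\partial y} + \frac{\partial f_3}{\partial z} \tag{5.32}
\]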
is called the divergence of f.
The proof of the Divergence Theorem is very similar to the proof of Green’s Theorem. It is
first proved for the simple case when the solid S is bounded above by one surface, bounded
below by another surface, and bounded laterally by one or more surfaces. The proof can then
be extended to more general solids.2
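Example 5.11. Evaluate $\iint_\Sigma \mathbf{f}\cdot d\boldsymbol{\sigma}$, where f(x, y, z) = x i + y j + z k and Σ is the unit sphere x² + y² + z² = 1.
Solution: We see that div f = 1 + 1 + 1 = 3, so
\[
\begin{aligned}
\iint_\Sigma \mathbf{f}\cdot d\boldsymbol{\sigma} &= \iiint_S \operatorname{div}\mathbf{f}\,dV = \iiint_S 3\,dV \\
&= 3\iiint_S 1\,dV = 3\operatorname{vol}(S) = 3\cdot\frac{4\pi(1)^3}{3} = 4\pi .
\end{aligned}
\]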
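For comparison, the flux in Example 5.11 can also be approximated directly from the surface side. The sketch below assumes the usual spherical-coordinate parametrization of the unit sphere, with surface area element sin φ dφ dθ and outward unit normal equal to the position vector; these are standard facts but are not derived in this section. The function name is hypothetical.

```python
import math

def flux_through_unit_sphere(n=400):
    """Midpoint-rule flux of f = x i + y j + z k through the unit sphere."""
    h_phi = math.pi / n
    h_theta = 2 * math.pi / n
    total = 0.0
    for i in range(n):
        phi = (i + 0.5) * h_phi
        for j in range(n):
            theta = (j + 0.5) * h_theta
            # Point on the sphere; on the unit sphere the outward normal n is (x, y, z).
            x = math.sin(phi) * math.cos(theta)
            y = math.sin(phi) * math.sin(theta)
            z = math.cos(phi)
            f_dot_n = x * x + y * y + z * z
            # Assumed surface area element: sin(phi) dphi dtheta.
            total += f_dot_n * math.sin(phi) * h_phi * h_theta
    return total

flux = flux_through_unit_sphere()
divergence_side = 3 * (4 * math.pi / 3)   # 3 vol(S), from the Divergence Theorem
print(flux, divergence_side)
```

The two numbers should agree to a few decimal places; the small gap comes from the midpoint rule, not from the theorem.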
In physical applications, the surface integral $\iint_\Sigma \mathbf{f}\cdot d\boldsymbol{\sigma}$ is often referred to as the flux of f
through the surface Σ. For example, if f represents the velocity field of a fluid, then the flux
is the net quantity of fluid to flow through the surface Σ per unit time. A positive flux means
there is a net flow out of the surface (that is, in the direction of the outward unit normal
vector n), while a negative flux indicates a net flow inward (in the direction of −n).
2 See Taylor and Mann, § 15.6 for the details.
The term divergence comes from interpreting div f as a measure of how much a vector
field “diverges” from a point. This is best seen by using another definition of div f which is
equivalent3 to the definition given by formula (5.32). Namely, for a point ( x, y, z) in R3 ,
\[
\operatorname{div}\mathbf{f}(x,y,z) = \lim_{V\to 0} \frac{1}{V}\iint_\Sigma \mathbf{f}\cdot d\boldsymbol{\sigma} , \tag{5.33}
\]
where V is the volume enclosed by a closed surface Σ around the point ( x, y, z). In the
limit, V → 0 means that we take smaller and smaller closed surfaces around ( x, y, z), which
means that the volumes they enclose are going to zero. It can be shown that this limit is
independent of the shapes of those surfaces. Notice that the limit being taken is of the
ratio of the flux through a surface to the volume enclosed by that surface, which gives a
rough measure of the flow “leaving” a point, as we mentioned. Vector fields which have zero
divergence are often called solenoidal fields.
The following theorem is a simple consequence of formula (5.33).
Theorem 5.8. If the flux of a vector field f is zero through every closed surface containing a
given point, then div f = 0 at that point.
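\[
\begin{aligned}
\operatorname{div}\mathbf{f}(x,y,z) &= \lim_{V\to 0} \frac{1}{V}\iint_\Sigma \mathbf{f}\cdot d\boldsymbol{\sigma} \quad\text{for closed surfaces } \Sigma \text{ containing } (x,y,z), \text{ so} \\
&= \lim_{V\to 0} \frac{1}{V}\,(0) \quad\text{by our assumption that the flux through each } \Sigma \text{ is zero, so} \\
&= \lim_{V\to 0} 0 \\
&= 0 .
\end{aligned}
\]
QED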
is used to denote surface integrals of scalar and vector fields, respectively, over closed sur-
faces.
Exercises
A
3 See Schey, pp. 36–39, for an intuitive discussion of this.
For Exercises 1–4, use the Divergence Theorem to evaluate the surface integral
\[
\iint_\Sigma \mathbf{f}\cdot d\boldsymbol{\sigma}
\]
1. f( x, y, z) = xi + 2 yj + 3 zk, Σ : x2 + y2 + z2 = 9
3. f( x, y, z) = x3 i + y3 j + z3 k, Σ : x2 + y2 + z2 = 1
4. f( x, y, z) = 2i + 3j + 5k, Σ : x2 + y2 + z2 = 1
B
5. Show that the flux of any constant vector field through any closed surface is zero.
6. Evaluate the surface integral from Exercise 2 without using the Divergence Theorem;
that is, using only Definition 5.3, as in Example 5.10. Note that there will be a different
outward unit normal vector to each of the six faces of the cube.
7. Evaluate the surface integral $\iint_\Sigma \mathbf{f}\cdot d\boldsymbol{\sigma}$, where f(x, y, z) = x² i + xy j + z k and Σ is the part of the plane 6x + 3y + 2z = 6 with x ≥ 0, y ≥ 0, and z ≥ 0, with the outward unit normal n pointing in the positive z direction.
8. Use a surface integral to show that the surface area of a sphere of radius r is 4π r 2 . (Hint:
Use spherical coordinates to parametrize the sphere.)
\[
S = \int_0^{\pi}\!\!\int_0^{2\pi} \sin\phi\,\sqrt{a^2 b^2\cos^2\phi + c^2\,(a^2\sin^2\theta + b^2\cos^2\theta)\sin^2\phi}\;d\theta\,d\phi .
\]
(Note: The above double integral cannot be evaluated by elementary means. For specific values of a, b and c it can be evaluated using numerical methods. An alternative is to express the surface area in terms of elliptic integrals.4)
4 Bowman, F., Introduction to Elliptic Functions, with Applications, New York: Dover, 1961, § III.7.
C
11. Use Definition 5.3 to prove that the surface area S over a region R in R2 of a surface
z = f ( x, y) is given by the formula
\[
S = \iint_R \sqrt{1 + \left(\frac{\partial f}{\partial x}\right)^2 + \left(\frac{\partial f}{\partial y}\right)^2}\;dA .
\]
So far the only types of line integrals which we have discussed are those along curves in R2. But the definitions and properties which were covered in Sections 5.1 and 5.2 can easily be extended to include functions of three variables, so that we can now discuss line integrals along curves in R3.
\[
\int_C f(x,y,z)\,ds = \int_a^b f(x(t),y(t),z(t))\,\sqrt{x'(t)^2 + y'(t)^2 + z'(t)^2}\;dt . \tag{5.34}
\]
\[
\int_C f(x,y,z)\,dx = \int_a^b f(x(t),y(t),z(t))\,x'(t)\,dt . \tag{5.35}
\]
\[
\int_C f(x,y,z)\,dy = \int_a^b f(x(t),y(t),z(t))\,y'(t)\,dt . \tag{5.36}
\]
\[
\int_C f(x,y,z)\,dz = \int_a^b f(x(t),y(t),z(t))\,z'(t)\,dt . \tag{5.37}
\]
Similar to the two-variable case, if f(x, y, z) ≥ 0 then the line integral $\int_C f(x,y,z)\,ds$ can be thought of as the total area of the “picket fence” of height f(x, y, z) at each point along the curve C in R3.
Vector fields in R3 are defined in a similar fashion to those in R2 , which allows us to define
the line integral of a vector field along a curve in R3 .
\[
\frac{dw}{dt} = \frac{\partial w}{\partial x}\frac{dx}{dt} + \frac{\partial w}{\partial y}\frac{dy}{dt} + \frac{\partial w}{\partial z}\frac{dz}{dt} . \tag{5.41}
\]
where A = ( x(a), y(a), z(a)) and B = ( x( b), y( b), z( b)) are the endpoints of C .
Corollary 5.12. If a vector field f has a potential in a solid S, then $\oint_C \mathbf{f}\cdot d\mathbf{r} = 0$ for any closed curve C in S (that is, $\oint_C \nabla F\cdot d\mathbf{r} = 0$ for any real-valued function F(x, y, z)).
\[
\begin{aligned}
x'(t)^2 + y'(t)^2 + z'(t)^2 &= (\sin^2 t + 2t\sin t\cos t + t^2\cos^2 t) + (\cos^2 t - 2t\sin t\cos t + t^2\sin^2 t) + 1 \\
&= t^2(\sin^2 t + \cos^2 t) + \sin^2 t + \cos^2 t + 1 \\
&= t^2 + 2 ,
\end{aligned}
\]
\[
\begin{aligned}
\int_C f(x,y,z)\,ds &= \int_0^{8\pi} f(x(t),y(t),z(t))\,\sqrt{x'(t)^2 + y'(t)^2 + z'(t)^2}\;dt \\
&= \int_0^{8\pi} t\,\sqrt{t^2 + 2}\;dt \\
&= \left( \frac{1}{3}(t^2+2)^{3/2} \right)\Bigg|_0^{8\pi} = \frac{1}{3}\left( (64\pi^2 + 2)^{3/2} - 2\sqrt{2} \right) .
\end{aligned}
\]
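The closed form above can be compared with a direct numerical evaluation. The sketch assumes the curve C: x = t sin t, y = t cos t, z = t, 0 ≤ t ≤ 8π with f(x, y, z) = z; this is inferred here from the derivatives and integrand shown, since the statement of Example 5.12 is not reproduced above. The function name is hypothetical.

```python
import math

def scalar_line_integral(n=100000):
    """Midpoint-rule value of the integral of f ds along the assumed curve, with f = z."""
    lo, hi = 0.0, 8 * math.pi
    h = (hi - lo) / n
    total = 0.0
    for k in range(n):
        t = lo + (k + 0.5) * h
        dx = math.sin(t) + t * math.cos(t)   # x'(t) for x = t sin t
        dy = math.cos(t) - t * math.sin(t)   # y'(t) for y = t cos t
        dz = 1.0                             # z'(t) for z = t
        total += t * math.sqrt(dx * dx + dy * dy + dz * dz) * h
    return total

approx = scalar_line_integral()
exact = ((64 * math.pi**2 + 2) ** 1.5 - 2 * math.sqrt(2)) / 3
print(approx, exact)
```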
Example 5.13. Let f(x, y, z) = x i + y j + 2z k be a vector field in R3. Using the same curve C from Example 5.12, evaluate $\int_C \mathbf{f}\cdot d\mathbf{r}$.
[Figure: the curve C in R3, from t = 0 to t = 8π]
Solution: Note that $F(x,y,z) = \frac{x^2}{2} + \frac{y^2}{2} + z^2$ is a potential for f(x, y, z) (that is, ∇F = f). So by Theorem 5.11 we know that
\[
\begin{aligned}
\int_C \mathbf{f}\cdot d\mathbf{r} &= F(B) - F(A) , \text{ where } A = (x(0),y(0),z(0)) \text{ and } B = (x(8\pi),y(8\pi),z(8\pi)), \text{ so} \\
&= F(8\pi\sin 8\pi,\, 8\pi\cos 8\pi,\, 8\pi) - F(0\sin 0,\, 0\cos 0,\, 0) \\
&= F(0, 8\pi, 8\pi) - F(0, 0, 0) \\
&= 0 + \frac{(8\pi)^2}{2} + (8\pi)^2 - (0 + 0 + 0) = 96\pi^2 .
\end{aligned}
\]
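As a check on Example 5.13, the sketch below evaluates the line integral directly from a parametrization and compares it with F(B) − F(A). It assumes the curve x = t sin t, y = t cos t, z = t, 0 ≤ t ≤ 8π, read off from the endpoints used in the solution; the function names are illustrative.

```python
import math

def work_integral(n=100000):
    """Midpoint-rule value of the integral of x dx + y dy + 2z dz along the curve."""
    lo, hi = 0.0, 8 * math.pi
    h = (hi - lo) / n
    total = 0.0
    for k in range(n):
        t = lo + (k + 0.5) * h
        x, y, z = t * math.sin(t), t * math.cos(t), t
        dx = math.sin(t) + t * math.cos(t)   # x'(t)
        dy = math.cos(t) - t * math.sin(t)   # y'(t)
        dz = 1.0                             # z'(t)
        total += (x * dx + y * dy + 2 * z * dz) * h
    return total

def F(x, y, z):
    # Potential from the solution of Example 5.13.
    return x**2 / 2 + y**2 / 2 + z**2

direct = work_integral()
via_potential = F(0.0, 8 * math.pi, 8 * math.pi) - F(0.0, 0.0, 0.0)
print(direct, via_potential, 96 * math.pi**2)
```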
field N in R3 such that N is nonzero and normal to Σ (that is, perpendicular to the tangent
plane) at each point of Σ. We say that such an N is a normal vector field.
5.5 Stokes’ Theorem 193
For example, the unit sphere x² + y² + z² = 1 is orientable, since the continuous vector field N(x, y, z) = x i + y j + z k is nonzero and normal to the sphere at each point. In fact, −N(x, y, z) is another normal vector field (see Figure 5.5.2). We see in this case that N(x, y, z) is what we have called an outward normal vector, and −N(x, y, z) is an inward normal vector. These “outward” and “inward” normal vector fields on the sphere correspond to an “outer” and “inner” side, respectively, of the sphere. That is, we say that the sphere is a two-sided surface. Roughly, “two-sided” means “orientable”. Other examples of two-sided, and hence orientable, surfaces are cylinders, paraboloids, ellipsoids, and planes.
[Figure 5.5.2: the unit sphere with normal vector fields N and −N]
You may be wondering what kind of surface would not have two sides. An example is the
Möbius strip, which is constructed by taking a thin rectangle and connecting its ends at
the opposite corners, resulting in a “twisted” strip (see Figure 5.5.3).
[Figure 5.5.3: Möbius strip. (a) Connect A to A and B to B along the ends; (b) Not orientable]
If you imagine walking along a line down the center of the Möbius strip, as in Figure
5.5.3(b), then you arrive back at the same place from which you started but upside down!
That is, your orientation changed even though your motion was continuous along that center
line. Informally, thinking of your vertical direction as a normal vector field along the strip,
there is a discontinuity at your starting point (and, in fact, at every point) since your vertical
direction takes two different values there. The Möbius strip has only one side, and hence is
nonorientable.6
For an orientable surface Σ which has a boundary curve C , pick a unit normal vector n
such that if you walked along C with your head pointing in the direction of n, then the
surface would be on your left. We say in this situation that n is a positive unit normal vector
and that C is traversed n-positively. We can now state Stokes’ Theorem:
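where
\[
\operatorname{curl}\mathbf{f} = \left( \frac{\partial R}{\partial y} - \frac{\partial Q}{\partial z} \right)\mathbf{i} + \left( \frac{\partial P}{\partial z} - \frac{\partial R}{\partial x} \right)\mathbf{j} + \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right)\mathbf{k} , \tag{5.46}
\]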
n is a positive unit normal vector over Σ, and C is traversed n-positively.
Proof: As the general case is beyond the scope of this text, we will prove the theorem only
for the special case where Σ is the graph of z = z( x, y) for some smooth real-valued function
z( x, y), with ( x, y) varying over a region D in R2 .
Projecting Σ onto the xy-plane, we see that the closed curve C (the boundary curve of Σ) projects onto a closed curve C_D which is the boundary curve of D (see Figure 5.5.4). Assuming that C has a smooth parametrization, its projection C_D in the xy-plane also has a smooth parametrization, say
\[
C_D : \; x = x(t) , \; y = y(t) , \; a \le t \le b ,
\]
and so C can be parametrized (in R3) as
\[
C : \; x = x(t) , \; y = y(t) , \; z = z(x(t),y(t)) , \; a \le t \le b ,
\]
since the curve C is part of the surface z = z(x, y).
[Figure 5.5.4: the surface Σ: z = z(x, y) with normal n and boundary curve C, above the region D with boundary curve C_D in the xy-plane]
Now, by the Chain Rule (Theorem 3.3), for z = z(x(t), y(t)) as a function of t, we know that
\[
z'(t) = \frac{\partial z}{\partial x}\,x'(t) + \frac{\partial z}{\partial y}\,y'(t) ,
\]
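and so
\[
\begin{aligned}
\oint_C \mathbf{f}\cdot d\mathbf{r} &= \int_C P(x,y,z)\,dx + Q(x,y,z)\,dy + R(x,y,z)\,dz \\
&= \int_a^b \left( P\,x'(t) + Q\,y'(t) + R\left( \frac{\partial z}{\partial x}\,x'(t) + \frac{\partial z}{\partial y}\,y'(t) \right) \right) dt \\
&= \int_a^b \left( \left( P + R\,\frac{\partial z}{\partial x} \right) x'(t) + \left( Q + R\,\frac{\partial z}{\partial y} \right) y'(t) \right) dt \\
&= \int_{C_D} \tilde{P}(x,y)\,dx + \tilde{Q}(x,y)\,dy ,
\end{aligned}
\]
where
\[
\begin{aligned}
\tilde{P}(x,y) &= P(x,y,z(x,y)) + R(x,y,z(x,y))\,\frac{\partial z}{\partial x}(x,y) , \text{ and} \\
\tilde{Q}(x,y) &= Q(x,y,z(x,y)) + R(x,y,z(x,y))\,\frac{\partial z}{\partial y}(x,y)
\end{aligned}
\]
for (x, y) in D. Thus, by Green’s Theorem applied to the region D, we have
\[
\oint_C \mathbf{f}\cdot d\mathbf{r} = \iint_D \left( \frac{\partial \tilde{Q}}{\partial x} - \frac{\partial \tilde{P}}{\partial y} \right) dA . \tag{5.47}
\]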
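Thus,
\[
\begin{aligned}
\frac{\partial \tilde{Q}}{\partial x} &= \frac{\partial}{\partial x}\left( Q(x,y,z(x,y)) + R(x,y,z(x,y))\,\frac{\partial z}{\partial y}(x,y) \right) , \text{ so by the Product Rule we get} \\
&= \frac{\partial}{\partial x}\bigl( Q(x,y,z(x,y)) \bigr) + \left( \frac{\partial}{\partial x} R(x,y,z(x,y)) \right)\frac{\partial z}{\partial y}(x,y) + R(x,y,z(x,y))\,\frac{\partial}{\partial x}\left( \frac{\partial z}{\partial y}(x,y) \right) .
\end{aligned}
\]
Now, by formula (5.42) in Theorem 5.10, we have
\[
\begin{aligned}
\frac{\partial}{\partial x}\bigl( Q(x,y,z(x,y)) \bigr) &= \frac{\partial Q}{\partial x}\frac{\partial x}{\partial x} + \frac{\partial Q}{\partial y}\frac{\partial y}{\partial x} + \frac{\partial Q}{\partial z}\frac{\partial z}{\partial x} \\
&= \frac{\partial Q}{\partial x}\cdot 1 + \frac{\partial Q}{\partial y}\cdot 0 + \frac{\partial Q}{\partial z}\frac{\partial z}{\partial x} \\
&= \frac{\partial Q}{\partial x} + \frac{\partial Q}{\partial z}\frac{\partial z}{\partial x} .
\end{aligned}
\]
Similarly,
\[
\frac{\partial}{\partial x}\bigl( R(x,y,z(x,y)) \bigr) = \frac{\partial R}{\partial x} + \frac{\partial R}{\partial z}\frac{\partial z}{\partial x} .
\]
Thus,
\[
\begin{aligned}
\frac{\partial \tilde{Q}}{\partial x} &= \frac{\partial Q}{\partial x} + \frac{\partial Q}{\partial z}\frac{\partial z}{\partial x} + \left( \frac{\partial R}{\partial x} + \frac{\partial R}{\partial z}\frac{\partial z}{\partial x} \right)\frac{\partial z}{\partial y} + R(x,y,z(x,y))\,\frac{\partial^2 z}{\partial x\,\partial y} \\
&= \frac{\partial Q}{\partial x} + \frac{\partial Q}{\partial z}\frac{\partial z}{\partial x} + \frac{\partial R}{\partial x}\frac{\partial z}{\partial y} + \frac{\partial R}{\partial z}\frac{\partial z}{\partial x}\frac{\partial z}{\partial y} + R\,\frac{\partial^2 z}{\partial x\,\partial y} .
\end{aligned}
\]
In a similar fashion, we can calculate
\[
\frac{\partial \tilde{P}}{\partial y} = \frac{\partial P}{\partial y} + \frac{\partial P}{\partial z}\frac{\partial z}{\partial y} + \frac{\partial R}{\partial y}\frac{\partial z}{\partial x} + \frac{\partial R}{\partial z}\frac{\partial z}{\partial y}\frac{\partial z}{\partial x} + R\,\frac{\partial^2 z}{\partial y\,\partial x} .
\]
So subtracting gives
\[
\frac{\partial \tilde{Q}}{\partial x} - \frac{\partial \tilde{P}}{\partial y} = \left( \frac{\partial Q}{\partial z} - \frac{\partial R}{\partial y} \right)\frac{\partial z}{\partial x} + \left( \frac{\partial R}{\partial x} - \frac{\partial P}{\partial z} \right)\frac{\partial z}{\partial y} + \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) \tag{5.48}
\]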
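since $\frac{\partial^2 z}{\partial x\,\partial y} = \frac{\partial^2 z}{\partial y\,\partial x}$ by the smoothness of z = z(x, y). Hence, by equation (5.47),
\[
\oint_C \mathbf{f}\cdot d\mathbf{r} = \iint_D \left( -\left( \frac{\partial R}{\partial y} - \frac{\partial Q}{\partial z} \right)\frac{\partial z}{\partial x} - \left( \frac{\partial P}{\partial z} - \frac{\partial R}{\partial x} \right)\frac{\partial z}{\partial y} + \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) \right) dA \tag{5.49}
\]
after factoring out a −1 from the terms in the first two products in equation (5.48).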
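Now, recall from Section 2.3 (see p.76) that the vector $\mathbf{N} = -\frac{\partial z}{\partial x}\,\mathbf{i} - \frac{\partial z}{\partial y}\,\mathbf{j} + \mathbf{k}$ is normal to the tangent plane to the surface z = z(x, y) at each point of Σ. Thus,
\[
\mathbf{n} = \frac{\mathbf{N}}{\|\mathbf{N}\|} = \frac{-\frac{\partial z}{\partial x}\,\mathbf{i} - \frac{\partial z}{\partial y}\,\mathbf{j} + \mathbf{k}}{\sqrt{1 + \left(\frac{\partial z}{\partial x}\right)^2 + \left(\frac{\partial z}{\partial y}\right)^2}}
\]
is, in fact, a positive unit normal vector to Σ (see Figure 5.5.4). Hence, using the parametrization r(x, y) = x i + y j + z(x, y) k, for (x, y) in D, of the surface Σ, we have $\frac{\partial \mathbf{r}}{\partial x} = \mathbf{i} + \frac{\partial z}{\partial x}\,\mathbf{k}$ and $\frac{\partial \mathbf{r}}{\partial y} = \mathbf{j} + \frac{\partial z}{\partial y}\,\mathbf{k}$, and so $\left\| \frac{\partial \mathbf{r}}{\partial x} \times \frac{\partial \mathbf{r}}{\partial y} \right\| = \sqrt{1 + \left(\frac{\partial z}{\partial x}\right)^2 + \left(\frac{\partial z}{\partial y}\right)^2}$. So we see that using formula (5.46) for curl f, we have
\[
\begin{aligned}
\iint_\Sigma (\operatorname{curl}\mathbf{f})\cdot\mathbf{n}\,d\sigma &= \iint_D (\operatorname{curl}\mathbf{f})\cdot\mathbf{n}\,\left\| \frac{\partial \mathbf{r}}{\partial x} \times \frac{\partial \mathbf{r}}{\partial y} \right\| dA \\
&= \iint_D \left( \left( \frac{\partial R}{\partial y} - \frac{\partial Q}{\partial z} \right)\mathbf{i} + \left( \frac{\partial P}{\partial z} - \frac{\partial R}{\partial x} \right)\mathbf{j} + \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right)\mathbf{k} \right)\cdot\left( -\frac{\partial z}{\partial x}\,\mathbf{i} - \frac{\partial z}{\partial y}\,\mathbf{j} + \mathbf{k} \right) dA \\
&= \iint_D \left( -\left( \frac{\partial R}{\partial y} - \frac{\partial Q}{\partial z} \right)\frac{\partial z}{\partial x} - \left( \frac{\partial P}{\partial z} - \frac{\partial R}{\partial x} \right)\frac{\partial z}{\partial y} + \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) \right) dA ,
\end{aligned}
\]
which is the right side of equation (5.49). This proves the theorem in this special case.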
Note: The condition in Stokes’ Theorem that the surface Σ have a (continuously vary-
ing) positive unit normal vector n and a boundary curve C traversed n-positively can be
expressed more precisely as follows: if r(t) is the position vector for C and T(t) = r′(t)/‖r′(t)‖
is the unit tangent vector to C , then the vectors T, n, T × n form a right-handed system.
Also, it should be noted that Stokes’ Theorem holds even when the boundary curve C is
piecewise smooth.
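Solution: The positive unit normal vector to the surface z = z(x, y) = x² + y² is
\[
\mathbf{n} = \frac{-\frac{\partial z}{\partial x}\,\mathbf{i} - \frac{\partial z}{\partial y}\,\mathbf{j} + \mathbf{k}}{\sqrt{1 + \left(\frac{\partial z}{\partial x}\right)^2 + \left(\frac{\partial z}{\partial y}\right)^2}} = \frac{-2x\,\mathbf{i} - 2y\,\mathbf{j} + \mathbf{k}}{\sqrt{1 + 4x^2 + 4y^2}} ,
\]
and curl f = (1 − 0) i + (1 − 0) j + (1 − 0) k = i + j + k, so
\[
(\operatorname{curl}\mathbf{f})\cdot\mathbf{n} = (-2x - 2y + 1)\big/\sqrt{1 + 4x^2 + 4y^2} .
\]
[Figure 5.5.5: the surface z = x² + y², with boundary curve C and normal n]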
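\[
\begin{aligned}
\iint_\Sigma (\operatorname{curl}\mathbf{f})\cdot\mathbf{n}\,d\sigma &= \iint_D (\operatorname{curl}\mathbf{f})\cdot\mathbf{n}\,\left\| \frac{\partial \mathbf{r}}{\partial x} \times \frac{\partial \mathbf{r}}{\partial y} \right\| dA \\
&= \iint_D \frac{-2x - 2y + 1}{\sqrt{1 + 4x^2 + 4y^2}}\,\sqrt{1 + 4x^2 + 4y^2}\;dA \\
&= \iint_D (-2x - 2y + 1)\,dA , \text{ so switching to polar coordinates gives} \\
&= \int_0^{2\pi}\!\!\int_0^1 (-2r\cos\theta - 2r\sin\theta + 1)\,r\,dr\,d\theta \\
&= \int_0^{2\pi}\!\!\int_0^1 (-2r^2\cos\theta - 2r^2\sin\theta + r)\,dr\,d\theta \\
&= \int_0^{2\pi} \left( -\tfrac{2r^3}{3}\cos\theta - \tfrac{2r^3}{3}\sin\theta + \tfrac{r^2}{2} \,\Big|_{r=0}^{r=1} \right) d\theta \\
&= \int_0^{2\pi} \left( -\tfrac{2}{3}\cos\theta - \tfrac{2}{3}\sin\theta + \tfrac{1}{2} \right) d\theta \\
&= -\tfrac{2}{3}\sin\theta + \tfrac{2}{3}\cos\theta + \tfrac{1}{2}\theta \,\Big|_0^{2\pi} = \pi .
\end{aligned}
\]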
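The surface side of this computation reduces to a double integral over the unit disk D, which is easy to check numerically in polar coordinates. The sketch below is only an illustration; the function name and grid size are choices made here.

```python
import math

def surface_side(n=400):
    """Midpoint-rule value of the double integral of (-2x - 2y + 1) over the unit disk."""
    hr = 1.0 / n
    ht = 2 * math.pi / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * hr
        for j in range(n):
            theta = (j + 0.5) * ht
            x, y = r * math.cos(theta), r * math.sin(theta)
            # The extra factor r is the polar-coordinate area element r dr dtheta.
            total += (-2 * x - 2 * y + 1) * r * hr * ht
    return total

value = surface_side()
print(value, math.pi)
```

The terms −2x and −2y integrate to zero over the disk by symmetry, so the result is just the area of D, namely π.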
The boundary curve C is the unit circle x2 + y2 = 1 lying in the plane z = 1 (see Figure
by Stokes’ Theorem. Vector fields which have zero curl are often called irrotational fields.
In fact, the term curl was created by the 19th century Scottish physicist James Clerk
Maxwell in his study of electromagnetism, where it is used extensively. In physics, the
curl is interpreted as a measure of circulation density. This is best seen by using another
definition of curl f which is equivalent⁸ to the definition given by formula (5.46). Namely, the
value of n · (curl f) at a point (x, y, z) is
\[
\lim_{S \to 0} \frac{1}{S} \oint_C \mathbf{f} \cdot d\mathbf{r} , \tag{5.50}
\]
where S is the surface area of a surface Σ containing the point ( x, y, z) and with a simple
closed boundary curve C and positive unit normal vector n at ( x, y, z). In the limit, think of
the curve C shrinking to the point ( x, y, z), which causes Σ, the surface it bounds, to have
smaller and smaller surface area. That ratio of circulation to surface area in the limit is
what makes the curl a rough measure of circulation density (that is, circulation per unit
area).
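The limit in formula (5.50) can be watched numerically. The sketch below assumes a sample field f = −y³ i + x³ j (chosen here for illustration, not taken from the text), whose curl is 3(x² + y²) k, and takes Σ to be a small horizontal disk with n = k; the helper name `circulation_density` is hypothetical.

```python
import math

def f(x, y, z):
    """Sample field f = -y^3 i + x^3 j + 0 k (an assumption for illustration).
    Its curl is 3(x^2 + y^2) k."""
    return (-y**3, x**3, 0.0)

def circulation_density(point, radius, n=2000):
    """Circulation of f around a horizontal circle centered at `point`,
    divided by the area of the disk it bounds (unit normal n = k)."""
    x0, y0, z0 = point
    dt = 2.0 * math.pi / n
    circulation = 0.0
    for i in range(n):
        t = (i + 0.5) * dt
        x = x0 + radius * math.cos(t)
        y = y0 + radius * math.sin(t)
        fx, fy, _ = f(x, y, z0)
        # dr = (-radius sin t, radius cos t, 0) dt: counterclockwise when seen from above
        circulation += (fx * (-radius * math.sin(t)) + fy * (radius * math.cos(t))) * dt
    return circulation / (math.pi * radius**2)

point = (1.0, 0.5, 0.0)
exact = 3.0 * (point[0]**2 + point[1]**2)  # k-component of curl f at the point
for radius in (0.5, 0.1, 0.01):
    print(radius, circulation_density(point, radius), exact)
```

As the radius shrinks, the ratio of circulation to area approaches n · (curl f) at the point, which is the sense in which the curl measures circulation per unit area.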
where Σ is any orientable surface inside S whose boundary is C (such a surface is some-
times called a capping surface for C ). So similar to the two-variable case, we have a three-
dimensional version of a result from Section 4.3, for solid regions in R3 which are simply
connected (that is, regions having no holes):
The following statements are equivalent for a simply connected solid region S in R3 :
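(d) $\dfrac{\partial R}{\partial y} = \dfrac{\partial Q}{\partial z}$, $\dfrac{\partial P}{\partial z} = \dfrac{\partial R}{\partial x}$, and $\dfrac{\partial Q}{\partial x} = \dfrac{\partial P}{\partial y}$ in S (that is, curl f = 0 in S).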
Part (d) is also a way of saying that the differential form P dx + Q d y + R dz is exact.
Exercises
A
For Exercises 1–3, calculate $\int_C f(x, y, z)\, ds$ for the given function f(x, y, z) and curve C.
1. f ( x, y, z) = z; C : x = cos t, y = sin t, z = t, 0 ≤ t ≤ 2π
2. $f(x, y, z) = \dfrac{x}{y} + y + 2yz$; C : $x = t^2$, $y = t$, $z = 1$, $1 \le t \le 2$
3. $f(x, y, z) = z^2$; C : $x = t \sin t$, $y = t \cos t$, $z = \frac{2\sqrt{2}}{3}\, t^{3/2}$, $0 \le t \le 1$
For Exercises 4–9, calculate $\int_C \mathbf{f} \cdot d\mathbf{r}$ for the given vector field f(x, y, z) and curve C.
4. f( x, y, z) = i − j + k; C : x = 3 t, y = 2 t, z = t, 0 ≤ t ≤ 1
5. f( x, y, z) = y i − x j + z k; C : x = cos t, y = sin t, z = t, 0 ≤ t ≤ 2π
6. f( x, y, z) = x i + y j + z k; C : x = cos t, y = sin t, z = 2, 0 ≤ t ≤ 2π
7. f(x, y, z) = (y − 2z) i + xy j + (2xz + y) k; C : x = t, y = 2t, z = t² − 1, 0 ≤ t ≤ 1
For Exercises 10–13, state whether or not the vector field f( x, y, z) has a potential in R3 (you
do not need to find the potential itself).
B
For Exercises 14–15, verify Stokes’ Theorem for the given vector field f( x, y, z) and surface Σ.
14. f(x, y, z) = 2y i − x j + z k; Σ : x² + y² + z² = 1, z ≥ 0
15. f(x, y, z) = xy i + xz j + yz k; Σ : z = x² + y², z ≤ 1
16. Construct a Möbius strip from a piece of paper, then draw a line down its center (like
the dotted line in Figure 5.5.3(b)). Cut the Möbius strip along that center line completely
around the strip. How many surfaces does this result in? How would you describe them?
Are they orientable?
C
17. Let Σ be a closed surface and f(x, y, z) a smooth vector field. Show that
$\iint_\Sigma (\operatorname{curl}\,\mathbf{f}) \cdot \mathbf{n}\, d\sigma = 0$. (Hint: Split Σ in half.)
in R3 , where each of the partial derivatives is evaluated at the point ( x, y, z). So in this way,
you can think of the symbol ∇ as being “applied” to a real-valued function f to produce a
vector ∇ f .
It turns out that the divergence and curl can also be expressed in terms of the symbol ∇.
This is done by thinking of ∇ as a vector in R3 , namely
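\[
\nabla = \frac{\partial}{\partial x}\,\mathbf{i} + \frac{\partial}{\partial y}\,\mathbf{j} + \frac{\partial}{\partial z}\,\mathbf{k} . \tag{5.51}
\]
Here, the symbols $\frac{\partial}{\partial x}$, $\frac{\partial}{\partial y}$ and $\frac{\partial}{\partial z}$ are to be thought of as “partial derivative operators” that
will get “applied” to a real-valued function, say f(x, y, z), to produce the partial derivatives
$\frac{\partial f}{\partial x}$, $\frac{\partial f}{\partial y}$ and $\frac{\partial f}{\partial z}$. For instance, $\frac{\partial}{\partial x}$ “applied” to f(x, y, z) produces $\frac{\partial f}{\partial x}$.
Is ∇ really a vector? Strictly speaking, no, since $\frac{\partial}{\partial x}$, $\frac{\partial}{\partial y}$ and $\frac{\partial}{\partial z}$ are not actual numbers. But
it helps to think of ∇ as a vector, especially with the divergence and curl, as we will soon see.
The process of “applying” $\frac{\partial}{\partial x}$, $\frac{\partial}{\partial y}$, $\frac{\partial}{\partial z}$ to a real-valued function f(x, y, z) is normally thought of
as multiplying the quantities:
\[
\left( \frac{\partial}{\partial x} \right)(f) = \frac{\partial f}{\partial x} , \qquad
\left( \frac{\partial}{\partial y} \right)(f) = \frac{\partial f}{\partial y} , \qquad
\left( \frac{\partial}{\partial z} \right)(f) = \frac{\partial f}{\partial z}
\]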
For this reason, ∇ is often referred to as the “del operator”, since it “operates” on functions.
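For example, it is often convenient to write the divergence div f as ∇ · f, since for a vector
field f(x, y, z) = f₁(x, y, z) i + f₂(x, y, z) j + f₃(x, y, z) k, the dot product of f with ∇ (thought of as a
vector) makes sense:
\[
\begin{aligned}
\nabla \cdot \mathbf{f}
&= \left( \frac{\partial}{\partial x}\,\mathbf{i} + \frac{\partial}{\partial y}\,\mathbf{j} + \frac{\partial}{\partial z}\,\mathbf{k} \right) \cdot \bigl( f_1(x, y, z)\,\mathbf{i} + f_2(x, y, z)\,\mathbf{j} + f_3(x, y, z)\,\mathbf{k} \bigr) \\
&= \left( \frac{\partial}{\partial x} \right)(f_1) + \left( \frac{\partial}{\partial y} \right)(f_2) + \left( \frac{\partial}{\partial z} \right)(f_3) \\
&= \frac{\partial f_1}{\partial x} + \frac{\partial f_2}{\partial y} + \frac{\partial f_3}{\partial z} \\
&= \operatorname{div}\,\mathbf{f}
\end{aligned}
\]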
We can also write curl f in terms of ∇, namely as ∇ × f, since for a vector field f( x, y, z) =
P ( x, y, z)i + Q ( x, y, z)j + R ( x, y, z)k, we have:
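\[
\begin{aligned}
\nabla \times \mathbf{f}
&= \begin{vmatrix}
\mathbf{i} & \mathbf{j} & \mathbf{k} \\[2pt]
\dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\[6pt]
P(x, y, z) & Q(x, y, z) & R(x, y, z)
\end{vmatrix} \\
&= \left( \frac{\partial R}{\partial y} - \frac{\partial Q}{\partial z} \right) \mathbf{i} - \left( \frac{\partial R}{\partial x} - \frac{\partial P}{\partial z} \right) \mathbf{j} + \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) \mathbf{k} \\
&= \left( \frac{\partial R}{\partial y} - \frac{\partial Q}{\partial z} \right) \mathbf{i} + \left( \frac{\partial P}{\partial z} - \frac{\partial R}{\partial x} \right) \mathbf{j} + \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) \mathbf{k} \\
&= \operatorname{curl}\,\mathbf{f}
\end{aligned}
\]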
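For a real-valued function f(x, y, z), the gradient $\nabla f(x, y, z) = \frac{\partial f}{\partial x}\,\mathbf{i} + \frac{\partial f}{\partial y}\,\mathbf{j} + \frac{\partial f}{\partial z}\,\mathbf{k}$ is a vector
field, so we can take its divergence:
\[
\begin{aligned}
\operatorname{div} \nabla f = \nabla \cdot \nabla f
&= \left( \frac{\partial}{\partial x}\,\mathbf{i} + \frac{\partial}{\partial y}\,\mathbf{j} + \frac{\partial}{\partial z}\,\mathbf{k} \right) \cdot \left( \frac{\partial f}{\partial x}\,\mathbf{i} + \frac{\partial f}{\partial y}\,\mathbf{j} + \frac{\partial f}{\partial z}\,\mathbf{k} \right) \\
&= \frac{\partial}{\partial x}\left( \frac{\partial f}{\partial x} \right) + \frac{\partial}{\partial y}\left( \frac{\partial f}{\partial y} \right) + \frac{\partial}{\partial z}\left( \frac{\partial f}{\partial z} \right) \\
&= \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} + \frac{\partial^2 f}{\partial z^2}
\end{aligned}
\]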
Note that this is a real-valued function, to which we will give a special name:
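Solution: (a) ∇‖r‖² = 2x i + 2y j + 2z k = 2r
(b) $\nabla \cdot \mathbf{r} = \frac{\partial}{\partial x}(x) + \frac{\partial}{\partial y}(y) + \frac{\partial}{\partial z}(z) = 1 + 1 + 1 = 3$
(c)
\[
\nabla \times \mathbf{r} =
\begin{vmatrix}
\mathbf{i} & \mathbf{j} & \mathbf{k} \\[2pt]
\dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\[6pt]
x & y & z
\end{vmatrix}
= (0 - 0)\,\mathbf{i} - (0 - 0)\,\mathbf{j} + (0 - 0)\,\mathbf{k} = \mathbf{0}
\]
(d) $\Delta \|\mathbf{r}\|^2 = \frac{\partial^2}{\partial x^2}(x^2 + y^2 + z^2) + \frac{\partial^2}{\partial y^2}(x^2 + y^2 + z^2) + \frac{\partial^2}{\partial z^2}(x^2 + y^2 + z^2) = 2 + 2 + 2 = 6$
Note that we could have calculated ∆‖r‖² another way, using the ∇ notation along with parts
(a) and (b):
∆‖r‖² = ∇ · ∇‖r‖² = ∇ · 2r = 2 ∇ · r = 2(3) = 6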
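The four answers in this example can be spot-checked with central differences. This is a rough sketch rather than a method from the text: the helper names (`partial`, `gradient`, `divergence`, `curl`, `laplacian`), the step size h and the sample point p are all assumptions made for illustration.

```python
def partial(g, point, axis, h=1e-3):
    """Central-difference estimate of the partial derivative of g along one axis."""
    forward = list(point)
    backward = list(point)
    forward[axis] += h
    backward[axis] -= h
    return (g(*forward) - g(*backward)) / (2.0 * h)

def norm_squared(x, y, z):
    """||r||^2 = x^2 + y^2 + z^2."""
    return x * x + y * y + z * z

def r_field(x, y, z):
    """Position vector field r = x i + y j + z k."""
    return (x, y, z)

def gradient(g, p):
    return tuple(partial(g, p, axis) for axis in range(3))

def divergence(field, p):
    return sum(partial(lambda *q, a=axis: field(*q)[a], p, axis) for axis in range(3))

def curl(field, p):
    def d(component, axis):
        return partial(lambda *q: field(*q)[component], p, axis)
    # (R_y - Q_z, P_z - R_x, Q_x - P_y)
    return (d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1))

def laplacian(g, p):
    # divergence of the gradient, as in the definition of the Laplacian
    return divergence(lambda *q: gradient(g, q), p)

p = (1.0, -2.0, 0.5)
grad = gradient(norm_squared, p)   # should be close to 2r = (2, -4, 1)
div_r = divergence(r_field, p)     # should be close to 3
curl_r = curl(r_field, p)          # should be close to (0, 0, 0)
lap = laplacian(norm_squared, p)   # should be close to 6
print(grad, div_r, curl_r, lap)
```

Because ‖r‖² is quadratic and r is linear, central differences reproduce these derivatives up to rounding, so the printed values should match parts (a) through (d) closely.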
Notice that in Example 5.17 if we take the curl of the gradient of ‖r‖² we get
\[
\nabla \times (\nabla \|\mathbf{r}\|^2) = \nabla \times 2\mathbf{r} = 2\, \nabla \times \mathbf{r} = 2\,\mathbf{0} = \mathbf{0} .
\]
The following theorem shows that this will be the case in general:
since the mixed partial derivatives in each component are equal. QED
Another way of stating Theorem 5.14 is that gradients are irrotational. Also, notice that
in Example 5.17 if we take the divergence of the curl of r we trivially get
∇ · (∇ × r) = ∇ · 0 = 0 .
The following theorem shows that this will be the case in general:
Corollary 5.17. The flux of the curl of a smooth vector field f( x, y, z) through any closed
surface is zero.
Proof: Let Σ be a closed surface which bounds a solid S . The flux of ∇ × f through Σ is
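\[
\begin{aligned}
\iint_\Sigma (\nabla \times \mathbf{f}) \cdot d\boldsymbol{\sigma}
&= \iiint_S \nabla \cdot (\nabla \times \mathbf{f})\, dV && \text{(by the Divergence Theorem)} \\
&= \iiint_S 0\, dV && \text{(by Theorem 5.16)} \\
&= 0 . && \text{QED}
\end{aligned}
\]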
Since the choice of Σ was arbitrary, we must have (∇ × (∇f)) · n = 0 throughout R³, where
n is any unit vector. Using i, j and k in place of n, we see that we must have ∇ × (∇f) = 0 in
R³, which completes the proof.
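Both identities, ∇ × (∇f) = 0 and ∇ · (∇ × f) = 0, can be illustrated numerically. In the sketch below the sample function f = sin(xy) + z eˣ and the sample field F = yz² i + x sin z j + e^{xy} k are assumptions chosen for illustration; their gradient and curl are computed by hand, and only the outer curl and divergence are approximated by central differences.

```python
import math

def partial(g, point, axis, h=1e-4):
    """Central-difference estimate of the partial derivative of g along one axis."""
    forward = list(point)
    backward = list(point)
    forward[axis] += h
    backward[axis] -= h
    return (g(*forward) - g(*backward)) / (2.0 * h)

def grad_f(x, y, z):
    """Hand-computed gradient of the sample function f = sin(xy) + z e^x."""
    return (y * math.cos(x * y) + z * math.exp(x), x * math.cos(x * y), math.exp(x))

def curl_F(x, y, z):
    """Hand-computed curl of the sample field F = y z^2 i + x sin z j + e^{xy} k."""
    return (x * math.exp(x * y) - x * math.cos(z),
            2.0 * y * z - y * math.exp(x * y),
            math.sin(z) - z * z)

def divergence(field, p):
    return sum(partial(lambda *q, a=axis: field(*q)[a], p, axis) for axis in range(3))

def curl(field, p):
    def d(component, axis):
        return partial(lambda *q: field(*q)[component], p, axis)
    return (d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1))

p = (0.3, -0.7, 1.2)
curl_of_gradient = curl(grad_f, p)          # each component should be near 0
divergence_of_curl = divergence(curl_F, p)  # should be near 0
print(curl_of_gradient, divergence_of_curl)
```

The small residuals that remain come from the finite-difference approximation, not from the identities themselves.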
Example 5.18. A system of electric charges has a charge density ρ ( x, y, z) and produces an
electrostatic field E( x, y, z) at points ( x, y, z) in space. Gauss’ Law states that
\[
\iint_\Sigma \mathbf{E} \cdot d\boldsymbol{\sigma} = 4\pi \iiint_S \rho\, dV
\]
for any closed surface Σ which encloses the charges, with S being the solid region enclosed
by Σ. Show that ∇ · E = 4πρ. This is one of Maxwell’s Equations (in Gaussian, or CGS, units).
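One way to argue this, sketched here under the assumption that E and ρ are smooth enough for the Divergence Theorem to apply (the symbols are those of the example), is to convert the flux integral in Gauss’ Law into a volume integral:

```latex
\begin{align*}
\iiint_S \nabla \cdot \mathbf{E}\, dV
  &= \iint_\Sigma \mathbf{E} \cdot d\boldsymbol{\sigma}
     && \text{(Divergence Theorem)} \\
  &= 4\pi \iiint_S \rho\, dV
     && \text{(Gauss' Law)} \\[4pt]
\text{so}\quad
\iiint_S \left( \nabla \cdot \mathbf{E} - 4\pi\rho \right) dV &= 0
     && \text{for every such solid } S .
\end{align*}
```

If the integrand is continuous and the integral vanishes over every such solid S, however small, then the integrand itself must vanish, which gives ∇ · E = 4πρ.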
Exercises
A
For Exercises 1–6, find the Laplacian of the function f ( x, y, z).
1. $f(x, y, z) = x + y + z$   2. $f(x, y, z) = x^5$   3. $f(x, y, z) = (x^2 + y^2 + z^2)^{3/2}$
4. $f(x, y, z) = e^{x + y + z}$   5. $f(x, y, z) = x^3 + y^3 + z^3$   6. $f(x, y, z) = e^{-x^2 - y^2 - z^2}$
B
For Exercises 7–18, prove the given formula (r = ‖r‖ is the length of the position vector field
r(x, y, z) = x i + y j + z k).
C
19. Prove Theorem 5.16.
\[
\iint_\Sigma \nabla u \cdot d\boldsymbol{\sigma} = 0
\]
Often (especially in physics) it is convenient to use other coordinate systems when dealing
with quantities such as the gradient, divergence, curl and Laplacian. We will present the
formulas for these in cylindrical and spherical coordinates.
Recall from Section 1.7 that a point (x, y, z) can be represented in cylindrical coordinates
(r, θ, z), where x = r cos θ, y = r sin θ, z = z. At each point (r, θ, z), let e_r, e_θ, e_z be unit vectors
in the direction of increasing r, θ, z, respectively (see Figure 5.7.1). Then e_r, e_θ, e_z form an
orthonormal set of vectors. Note, by the right-hand rule, that e_z × e_r = e_θ.
[Figure 5.7.1: the unit vectors e_r, e_θ, e_z (cylindrical coordinates) and e_ρ, e_θ, e_φ (spherical coordinates) at a point (x, y, z)]
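\[
\begin{aligned}
\text{gradient:}\quad & \nabla F = \frac{\partial F}{\partial x}\,\mathbf{i} + \frac{\partial F}{\partial y}\,\mathbf{j} + \frac{\partial F}{\partial z}\,\mathbf{k} \\
\text{divergence:}\quad & \nabla \cdot \mathbf{f} = \frac{\partial f_1}{\partial x} + \frac{\partial f_2}{\partial y} + \frac{\partial f_3}{\partial z} \\
\text{curl:}\quad & \nabla \times \mathbf{f} = \left( \frac{\partial f_3}{\partial y} - \frac{\partial f_2}{\partial z} \right) \mathbf{i} + \left( \frac{\partial f_1}{\partial z} - \frac{\partial f_3}{\partial x} \right) \mathbf{j} + \left( \frac{\partial f_2}{\partial x} - \frac{\partial f_1}{\partial y} \right) \mathbf{k} \\
\text{Laplacian:}\quad & \Delta F = \frac{\partial^2 F}{\partial x^2} + \frac{\partial^2 F}{\partial y^2} + \frac{\partial^2 F}{\partial z^2}
\end{aligned}
\]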
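\[
\begin{aligned}
\text{gradient:}\quad & \nabla F = \frac{\partial F}{\partial r}\,\mathbf{e}_r + \frac{1}{r}\frac{\partial F}{\partial \theta}\,\mathbf{e}_\theta + \frac{\partial F}{\partial z}\,\mathbf{e}_z \\
\text{divergence:}\quad & \nabla \cdot \mathbf{f} = \frac{1}{r}\frac{\partial}{\partial r}(r f_r) + \frac{1}{r}\frac{\partial f_\theta}{\partial \theta} + \frac{\partial f_z}{\partial z} \\
\text{curl:}\quad & \nabla \times \mathbf{f} = \left( \frac{1}{r}\frac{\partial f_z}{\partial \theta} - \frac{\partial f_\theta}{\partial z} \right) \mathbf{e}_r + \left( \frac{\partial f_r}{\partial z} - \frac{\partial f_z}{\partial r} \right) \mathbf{e}_\theta + \frac{1}{r}\left( \frac{\partial}{\partial r}(r f_\theta) - \frac{\partial f_r}{\partial \theta} \right) \mathbf{e}_z \\
\text{Laplacian:}\quad & \Delta F = \frac{1}{r}\frac{\partial}{\partial r}\left( r \frac{\partial F}{\partial r} \right) + \frac{1}{r^2}\frac{\partial^2 F}{\partial \theta^2} + \frac{\partial^2 F}{\partial z^2}
\end{aligned}
\]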
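\[
\begin{aligned}
\text{gradient:}\quad & \nabla F = \frac{\partial F}{\partial \rho}\,\mathbf{e}_\rho + \frac{1}{\rho \sin\phi}\frac{\partial F}{\partial \theta}\,\mathbf{e}_\theta + \frac{1}{\rho}\frac{\partial F}{\partial \phi}\,\mathbf{e}_\phi \\
\text{divergence:}\quad & \nabla \cdot \mathbf{f} = \frac{1}{\rho^2}\frac{\partial}{\partial \rho}(\rho^2 f_\rho) + \frac{1}{\rho \sin\phi}\frac{\partial f_\theta}{\partial \theta} + \frac{1}{\rho \sin\phi}\frac{\partial}{\partial \phi}(\sin\phi\, f_\phi) \\
\text{curl:}\quad & \nabla \times \mathbf{f} = \frac{1}{\rho \sin\phi}\left( \frac{\partial}{\partial \phi}(\sin\phi\, f_\theta) - \frac{\partial f_\phi}{\partial \theta} \right) \mathbf{e}_\rho + \frac{1}{\rho}\left( \frac{\partial}{\partial \rho}(\rho f_\phi) - \frac{\partial f_\rho}{\partial \phi} \right) \mathbf{e}_\theta \\
& \qquad\qquad + \left( \frac{1}{\rho \sin\phi}\frac{\partial f_\rho}{\partial \theta} - \frac{1}{\rho}\frac{\partial}{\partial \rho}(\rho f_\theta) \right) \mathbf{e}_\phi \\
\text{Laplacian:}\quad & \Delta F = \frac{1}{\rho^2}\frac{\partial}{\partial \rho}\left( \rho^2 \frac{\partial F}{\partial \rho} \right) + \frac{1}{\rho^2 \sin^2\phi}\frac{\partial^2 F}{\partial \theta^2} + \frac{1}{\rho^2 \sin\phi}\frac{\partial}{\partial \phi}\left( \sin\phi\, \frac{\partial F}{\partial \phi} \right)
\end{aligned}
\]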
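As a spot check, the spherical Laplacian formula can be compared against the Cartesian one at a sample point. The sketch below assumes the usual conversion x = ρ sin φ cos θ, y = ρ sin φ sin θ, z = ρ cos φ and a sample function F = x²y + z³ (both the function and the helper names are illustrative); the derivatives are approximated by nested central differences.

```python
import math

def to_cartesian(rho, theta, phi):
    """x = rho sin(phi) cos(theta), y = rho sin(phi) sin(theta), z = rho cos(phi)."""
    return (rho * math.sin(phi) * math.cos(theta),
            rho * math.sin(phi) * math.sin(theta),
            rho * math.cos(phi))

def F(rho, theta, phi):
    """Sample function F = x^2 y + z^3 (an assumption), written in spherical variables."""
    x, y, z = to_cartesian(rho, theta, phi)
    return x * x * y + z ** 3

def partial(g, q, axis, h=1e-4):
    """Central-difference estimate of the partial derivative of g along one axis."""
    forward = list(q)
    backward = list(q)
    forward[axis] += h
    backward[axis] -= h
    return (g(*forward) - g(*backward)) / (2.0 * h)

def spherical_laplacian(G, q):
    """The three terms of the spherical Laplacian, each built from nested differences."""
    rho, theta, phi = q
    radial = lambda r, t, p: r * r * partial(G, (r, t, p), 0)
    azimuthal = lambda r, t, p: partial(G, (r, t, p), 1)
    polar = lambda r, t, p: math.sin(p) * partial(G, (r, t, p), 2)
    return (partial(radial, q, 0) / rho ** 2
            + partial(azimuthal, q, 1) / (rho ** 2 * math.sin(phi) ** 2)
            + partial(polar, q, 2) / (rho ** 2 * math.sin(phi)))

q = (1.5, 0.7, 1.1)                      # (rho, theta, phi), away from the z-axis
x, y, z = to_cartesian(*q)
cartesian_value = 2.0 * y + 6.0 * z      # Laplacian of x^2 y + z^3 in Cartesian form
spherical_value = spherical_laplacian(F, q)
print(cartesian_value, spherical_value)  # the two values should agree closely
```

The sample point is kept away from φ = 0 and φ = π, where the factors 1/sin φ in the formula are undefined.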
The derivation of the above formulas for cylindrical and spherical coordinates is straight-
forward but extremely tedious. The basic idea is to take the Cartesian equivalent of the
quantity in question and to substitute into that formula using the appropriate coordinate
transformation. As an example, we will derive the formula for the gradient in spherical
coordinates.
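Goal: Show that the gradient of a real-valued function F(ρ, θ, φ) in spherical coordinates is:
\[
\nabla F = \frac{\partial F}{\partial \rho}\,\mathbf{e}_\rho + \frac{1}{\rho \sin\phi}\frac{\partial F}{\partial \theta}\,\mathbf{e}_\theta + \frac{1}{\rho}\frac{\partial F}{\partial \phi}\,\mathbf{e}_\phi
\]
Idea: In the Cartesian gradient formula $\nabla F(x, y, z) = \frac{\partial F}{\partial x}\,\mathbf{i} + \frac{\partial F}{\partial y}\,\mathbf{j} + \frac{\partial F}{\partial z}\,\mathbf{k}$, put the Cartesian basis
vectors i, j, k in terms of the spherical coordinate basis vectors e_ρ, e_θ, e_φ and functions of
ρ, θ and φ. Then put the partial derivatives $\frac{\partial F}{\partial x}$, $\frac{\partial F}{\partial y}$, $\frac{\partial F}{\partial z}$ in terms of $\frac{\partial F}{\partial \rho}$, $\frac{\partial F}{\partial \theta}$, $\frac{\partial F}{\partial \phi}$ and functions
of ρ, θ and φ.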
Now, since the angle θ is measured in the x y-plane, then the unit vector eθ in the θ
direction must be parallel to the x y-plane. That is, eθ is of the form a i + b j + 0 k. To figure
out what a and b are, note that since eθ ⊥ eρ , then in particular eθ ⊥ eρ when eρ is in the
x y-plane. That occurs when the angle φ is π/2. Putting φ = π/2 into the formula for eρ gives
eρ = cos θ i + sin θ j + 0 k, and we see that a vector perpendicular to that is − sin θ i + cos θ j + 0 k.
Since this vector is also a unit vector and points in the (positive) θ direction, it must be eθ :
eθ = − sin θ i + cos θ j + 0 k
Step 2: Use the three formulas from Step 1 to solve for i, j, k in terms of eρ , eθ , eφ .
This comes down to solving a system of three equations in three unknowns. There are
many ways of doing this, but we will do it by combining the formulas for eρ and eφ to
eliminate k, which will give us an equation involving just i and j. This, with the formula for
eθ , will then leave us with a system of two equations in two unknowns (i and j), which we
will use to solve first for j then for i. Lastly, we will solve for k.
First, note that
sin φ eρ + cos φ eφ = cos θ i + sin θ j
so that
sin θ (sin φ eρ + cos φ eφ ) + cos θ eθ = (sin2 θ + cos2 θ )j = j ,
and so:
j = sin φ sin θ eρ + cos θ eθ + cos φ sin θ eφ
and so:
i = sin φ cos θ eρ − sin θ eθ + cos φ cos θ eφ
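The expressions for i and j above, together with k = cos φ e_ρ − sin φ e_φ (the factor that multiplies ∂F/∂z in Step 5), can be checked numerically. The sketch assumes the standard Cartesian components of e_ρ and e_φ, which are consistent with the relation sin φ e_ρ + cos φ e_φ = cos θ i + sin θ j used in Step 2; the helper names `spherical_basis` and `combine` are hypothetical.

```python
import math

def spherical_basis(theta, phi):
    """Cartesian components of e_rho, e_theta, e_phi (assumed standard formulas)."""
    e_rho = (math.sin(phi) * math.cos(theta), math.sin(phi) * math.sin(theta), math.cos(phi))
    e_theta = (-math.sin(theta), math.cos(theta), 0.0)
    e_phi = (math.cos(phi) * math.cos(theta), math.cos(phi) * math.sin(theta), -math.sin(phi))
    return e_rho, e_theta, e_phi

def combine(a, b, c, e_rho, e_theta, e_phi):
    """Return the vector a*e_rho + b*e_theta + c*e_phi in Cartesian components."""
    return tuple(a * u + b * v + c * w for u, v, w in zip(e_rho, e_theta, e_phi))

theta, phi = 0.9, 0.4
basis = spherical_basis(theta, phi)

# Step 2 formulas for i and j, and the k formula used in Step 5
i_vec = combine(math.sin(phi) * math.cos(theta), -math.sin(theta),
                math.cos(phi) * math.cos(theta), *basis)
j_vec = combine(math.sin(phi) * math.sin(theta), math.cos(theta),
                math.cos(phi) * math.sin(theta), *basis)
k_vec = combine(math.cos(phi), 0.0, -math.sin(phi), *basis)

print(i_vec)  # should be close to (1, 0, 0)
print(j_vec)  # should be close to (0, 1, 0)
print(k_vec)  # should be close to (0, 0, 1)
```

Any other choice of θ and φ should give the same three vectors, since the formulas hold at every point.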
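\[
\begin{aligned}
\frac{\partial F}{\partial \rho} &= \frac{\partial F}{\partial x}\frac{\partial x}{\partial \rho} + \frac{\partial F}{\partial y}\frac{\partial y}{\partial \rho} + \frac{\partial F}{\partial z}\frac{\partial z}{\partial \rho} , \\
\frac{\partial F}{\partial \theta} &= \frac{\partial F}{\partial x}\frac{\partial x}{\partial \theta} + \frac{\partial F}{\partial y}\frac{\partial y}{\partial \theta} + \frac{\partial F}{\partial z}\frac{\partial z}{\partial \theta} , \\
\frac{\partial F}{\partial \phi} &= \frac{\partial F}{\partial x}\frac{\partial x}{\partial \phi} + \frac{\partial F}{\partial y}\frac{\partial y}{\partial \phi} + \frac{\partial F}{\partial z}\frac{\partial z}{\partial \phi} ,
\end{aligned}
\]
which yields:
\[
\begin{aligned}
\frac{\partial F}{\partial \rho} &= \sin\phi \cos\theta\, \frac{\partial F}{\partial x} + \sin\phi \sin\theta\, \frac{\partial F}{\partial y} + \cos\phi\, \frac{\partial F}{\partial z} \\
\frac{\partial F}{\partial \theta} &= -\rho \sin\phi \sin\theta\, \frac{\partial F}{\partial x} + \rho \sin\phi \cos\theta\, \frac{\partial F}{\partial y} \\
\frac{\partial F}{\partial \phi} &= \rho \cos\phi \cos\theta\, \frac{\partial F}{\partial x} + \rho \cos\phi \sin\theta\, \frac{\partial F}{\partial y} - \rho \sin\phi\, \frac{\partial F}{\partial z}
\end{aligned}
\]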
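Step 4: Use the three formulas from Step 3 to solve for $\frac{\partial F}{\partial x}$, $\frac{\partial F}{\partial y}$, $\frac{\partial F}{\partial z}$ in terms of $\frac{\partial F}{\partial \rho}$, $\frac{\partial F}{\partial \theta}$, $\frac{\partial F}{\partial \phi}$.
Again, this involves solving a system of three equations in three unknowns. Using a
similar process of elimination as in Step 2, we get:
\[
\begin{aligned}
\frac{\partial F}{\partial x} &= \frac{1}{\rho \sin\phi} \left( \rho \sin^2\phi \cos\theta\, \frac{\partial F}{\partial \rho} - \sin\theta\, \frac{\partial F}{\partial \theta} + \sin\phi \cos\phi \cos\theta\, \frac{\partial F}{\partial \phi} \right) \\
\frac{\partial F}{\partial y} &= \frac{1}{\rho \sin\phi} \left( \rho \sin^2\phi \sin\theta\, \frac{\partial F}{\partial \rho} + \cos\theta\, \frac{\partial F}{\partial \theta} + \sin\phi \cos\phi \sin\theta\, \frac{\partial F}{\partial \phi} \right) \\
\frac{\partial F}{\partial z} &= \frac{1}{\rho} \left( \rho \cos\phi\, \frac{\partial F}{\partial \rho} - \sin\phi\, \frac{\partial F}{\partial \phi} \right)
\end{aligned}
\]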
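Steps 3 and 4 describe a linear system and its solution, so one way to check them is to write each set of formulas as a 3 × 3 matrix and confirm that the product is the identity. This is a sketch only; the function names and the sample values of ρ, θ, φ are illustrative.

```python
import math

def step3_matrix(rho, theta, phi):
    """Rows give dF/drho, dF/dtheta, dF/dphi as combinations of dF/dx, dF/dy, dF/dz."""
    s, c = math.sin, math.cos
    return [[s(phi) * c(theta), s(phi) * s(theta), c(phi)],
            [-rho * s(phi) * s(theta), rho * s(phi) * c(theta), 0.0],
            [rho * c(phi) * c(theta), rho * c(phi) * s(theta), -rho * s(phi)]]

def step4_matrix(rho, theta, phi):
    """Rows give dF/dx, dF/dy, dF/dz as combinations of dF/drho, dF/dtheta, dF/dphi."""
    s, c = math.sin, math.cos
    k = 1.0 / (rho * s(phi))
    return [[k * rho * s(phi) ** 2 * c(theta), -k * s(theta), k * s(phi) * c(phi) * c(theta)],
            [k * rho * s(phi) ** 2 * s(theta), k * c(theta), k * s(phi) * c(phi) * s(theta)],
            [c(phi), 0.0, -s(phi) / rho]]

def matmul(A, B):
    """Plain 3 x 3 matrix product."""
    return [[sum(A[i][m] * B[m][j] for m in range(3)) for j in range(3)] for i in range(3)]

rho, theta, phi = 2.0, 0.6, 1.3
product = matmul(step4_matrix(rho, theta, phi), step3_matrix(rho, theta, phi))
for row in product:
    print(row)  # rows should be close to (1, 0, 0), (0, 1, 0), (0, 0, 1)
```

Substituting the Step 3 expressions into the Step 4 formulas returns the original Cartesian partial derivatives exactly when this product is the identity matrix.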
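Step 5: Substitute the formulas for i, j, k from Step 2 and the formulas for $\frac{\partial F}{\partial x}$, $\frac{\partial F}{\partial y}$, $\frac{\partial F}{\partial z}$ from
Step 4 into the Cartesian gradient formula $\nabla F(x, y, z) = \frac{\partial F}{\partial x}\,\mathbf{i} + \frac{\partial F}{\partial y}\,\mathbf{j} + \frac{\partial F}{\partial z}\,\mathbf{k}$.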
Doing this last step is perhaps the most tedious, since it involves simplifying 3 × 3 + 3 × 3 +
2 × 2 = 22 terms! Namely,
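\[
\begin{aligned}
\nabla F ={}& \frac{1}{\rho \sin\phi} \left( \rho \sin^2\phi \cos\theta\, \frac{\partial F}{\partial \rho} - \sin\theta\, \frac{\partial F}{\partial \theta} + \sin\phi \cos\phi \cos\theta\, \frac{\partial F}{\partial \phi} \right) \left( \sin\phi \cos\theta\, \mathbf{e}_\rho - \sin\theta\, \mathbf{e}_\theta + \cos\phi \cos\theta\, \mathbf{e}_\phi \right) \\
&+ \frac{1}{\rho \sin\phi} \left( \rho \sin^2\phi \sin\theta\, \frac{\partial F}{\partial \rho} + \cos\theta\, \frac{\partial F}{\partial \theta} + \sin\phi \cos\phi \sin\theta\, \frac{\partial F}{\partial \phi} \right) \left( \sin\phi \sin\theta\, \mathbf{e}_\rho + \cos\theta\, \mathbf{e}_\theta + \cos\phi \sin\theta\, \mathbf{e}_\phi \right) \\
&+ \frac{1}{\rho} \left( \rho \cos\phi\, \frac{\partial F}{\partial \rho} - \sin\phi\, \frac{\partial F}{\partial \phi} \right) \left( \cos\phi\, \mathbf{e}_\rho - \sin\phi\, \mathbf{e}_\phi \right) ,
\end{aligned}
\]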
which we see has 8 terms involving eρ , 6 terms involving eθ , and 8 terms involving eφ . But
the algebra is straightforward and yields the desired result:
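\[
\nabla F = \frac{\partial F}{\partial \rho}\,\mathbf{e}_\rho + \frac{1}{\rho \sin\phi}\frac{\partial F}{\partial \theta}\,\mathbf{e}_\theta + \frac{1}{\rho}\frac{\partial F}{\partial \phi}\,\mathbf{e}_\phi \quad \checkmark
\]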
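For readers who would rather test the result than grind through the 22 terms, the spherical gradient formula can be compared with the Cartesian gradient at a sample point. The sketch below assumes a sample function F = x²y + z³ and the standard Cartesian components of e_ρ, e_θ, e_φ; the helper names and the step size are illustrative choices.

```python
import math

def to_cartesian(rho, theta, phi):
    return (rho * math.sin(phi) * math.cos(theta),
            rho * math.sin(phi) * math.sin(theta),
            rho * math.cos(phi))

def F_spherical(rho, theta, phi):
    """Sample function F = x^2 y + z^3 (an assumption), in spherical variables."""
    x, y, z = to_cartesian(rho, theta, phi)
    return x * x * y + z ** 3

def grad_cartesian(x, y, z):
    """Hand-computed Cartesian gradient of x^2 y + z^3."""
    return (2.0 * x * y, x * x, 3.0 * z * z)

def partial(g, q, axis, h=1e-5):
    """Central-difference estimate of the partial derivative of g along one axis."""
    forward = list(q)
    backward = list(q)
    forward[axis] += h
    backward[axis] -= h
    return (g(*forward) - g(*backward)) / (2.0 * h)

def gradient_from_spherical(q):
    """Assemble grad F from the spherical formula, returned in Cartesian components."""
    rho, theta, phi = q
    a = partial(F_spherical, q, 0)                          # coefficient of e_rho
    b = partial(F_spherical, q, 1) / (rho * math.sin(phi))  # coefficient of e_theta
    c = partial(F_spherical, q, 2) / rho                    # coefficient of e_phi
    e_rho = (math.sin(phi) * math.cos(theta), math.sin(phi) * math.sin(theta), math.cos(phi))
    e_theta = (-math.sin(theta), math.cos(theta), 0.0)
    e_phi = (math.cos(phi) * math.cos(theta), math.cos(phi) * math.sin(theta), -math.sin(phi))
    return tuple(a * u + b * v + c * w for u, v, w in zip(e_rho, e_theta, e_phi))

q = (1.5, 0.7, 1.1)
from_formula = gradient_from_spherical(q)
direct = grad_cartesian(*to_cartesian(*q))
print(from_formula)
print(direct)  # the two vectors should agree closely
```

Agreement at one point is not a proof, but a mismatch would immediately flag an algebra slip in Steps 1 through 5.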
Exercises
A
C
6. Derive the gradient formula in cylindrical coordinates:
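\[
\nabla F = \frac{\partial F}{\partial r}\,\mathbf{e}_r + \frac{1}{r}\frac{\partial F}{\partial \theta}\,\mathbf{e}_\theta + \frac{\partial F}{\partial z}\,\mathbf{e}_z .
\]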
Bibliography
Abbott, E.A., Flatland, 7th edition. New York: Dover Publications, Inc., 1952
Classic tale about a creature living in a 2-dimensional world who encounters a higher-
dimensional creature, with lots of humor thrown in.
Anton, H. and C. Rorres, Elementary Linear Algebra: Applications Version, 8th edition. New
York: John Wiley & Sons, 2000
Standard treatment of elementary linear algebra.
Bazaraa, M.S., H.D. Sherali and C.M. Shetty, Nonlinear Programming: Theory and Algo-
rithms, 2nd edition. New York: John Wiley & Sons, 1993
Thorough treatment of nonlinear optimization.
Farin, G., Curves and Surfaces for Computer Aided Geometric Design: A Practical Guide,
2nd edition. San Diego, CA: Academic Press, 1990
An intermediate-level book on curve and surface design.
Hecht, E., Optics, 2nd edition. Reading, MA: Addison-Wesley Publishing Co., 1987
An intermediate-level book on optics, covering a wide range of topics.
Hoel, P.G., S.C. Port and C.J. Stone, Introduction to Probability Theory, Boston, MA:
Houghton Mifflin Co., 1971
An excellent introduction to elementary, calculus-based probability theory. Lots of good exer-
cises.
Jackson, J.D., Classical Electrodynamics, 2nd edition. New York: John Wiley & Sons, 1975
An advanced book on electromagnetism, famous for being intimidating. Most of the mathemat-
ics will be understandable after reading the present book.
Marion, J.B., Classical Dynamics of Particles and Systems, 2nd edition. New York: Academic
Press, 1970
Standard intermediate-level treatment of classical mechanics. Very thorough.
O’Neill, B., Elementary Differential Geometry, New York: Academic Press, 1966
Intermediate-level book on differential geometry, with a modern approach based on differential
forms.
Press, W.H., S.A. Teukolsky, W.T. Vetterling and B.P. Flannery, Numerical Recipes in FOR-
TRAN: The Art of Scientific Computing, 2nd edition. Cambridge, UK: Cambridge Uni-
versity Press, 1992
An excellent source of information on numerical methods for solving a wide variety of problems.
Though all the examples are in the FORTRAN programming language, the code is clear enough
to implement in the language of your choice.
Protter, M.H. and C.B. Morrey, Analytic Geometry, 2nd edition. Reading, MA: Addison-
Wesley Publishing Co., 1975
Thorough treatment of elementary analytic geometry, with a rigor not found in most recent
books.
Ralston, A. and P. Rabinowitz, A First Course in Numerical Analysis, 2nd edition. New York:
McGraw-Hill, 1978
Standard treatment of elementary numerical analysis.
Reitz, J.R., F.J. Milford and R.W. Christy, Foundations of Electromagnetic Theory, 3rd edi-
tion. Reading, MA: Addison-Wesley Publishing Co., 1979
Intermediate text on electromagnetism.
Schey, H.M., Div, Grad, Curl, and All That: An Informal Text on Vector Calculus, New York:
W.W. Norton & Co., 1973
Very intuitive approach to the subject, from a physicist’s viewpoint. Highly recommended.
Taylor, A.E. and W.R. Mann, Advanced Calculus, 2nd edition. New York: John Wiley & Sons,
1972
Excellent treatment of n-dimensional calculus. A good book to study after the present book.
Many intriguing exercises.
Weinberger, H.F., A First Course in Partial Differential Equations, New York: John Wiley &
Sons, 1965
A good introduction to the vast subject of partial differential equations.
Welchons, A.M. and W.R. Krickenberger, Solid Geometry, Boston, MA: Ginn & Co., 1936
A very thorough treatment of 3-dimensional geometry from an elementary perspective, in-
cludes many topics which (sadly) do not seem to be taught anymore.
Appendix A
Answers and Hints to Selected Exercises
Chapter 1

Section 1.1 (p. 8)
1.(a) $\sqrt{5}$; (b) $\sqrt{5}$; (c) $\sqrt{17}$; (d) 1; (e) $2\sqrt{17}$; 2. Yes; 3. No.

3.(a) (2, 1, 3) + t(1, 0, 1); (b) x = 2 + t, y = 1, z = 3 + t; (c) x − 2 = z − 3, y = 1; 5. x = 1 + 2t, y = −2 + 7t, z = −3 + 8t; 7. 7.65; 9. (1, 2, 3); 11. 4x − 4y + 3z − 10 = 0; 13. x − 2y − z + 2 = 0; 15. 11x − 24y + 21z − 26 = 0; 17. $9/\sqrt{35}$; 19. x = 5t, y = 2 + 3t, z = −7t; 21. (10, −2, 1).
the functions as position vectors; 15. Hint: Theorem 1.16.

Section 2.3 (p. 72)
1. $\frac{3\pi\sqrt{5}}{2}$; 3. $2(5^{3/2} - 8)$; 5. Replace t by $\left( \left( \frac{27s + 16}{2} \right)^{2/3} - 4 \right) \Big/ 9$; 6. Hint: Use Theorem 2.1(e), Example 2.3, and Theorem 1.16; 7. Hint: Use Exercise 6. 9. Hint: Use f′(t) = ‖f′(t)‖ T, differentiate that to get f″(t), put those expressions into f′(t) × f″(t), then write T′(t) in terms of N(t); 11. $\mathbf{T}(t) = \frac{1}{\sqrt{2}}(-\sin t, \cos t, 1)$, N(t) = (−cos t, −sin t, 0), $\mathbf{B}(t) = \frac{1}{\sqrt{2}}(\sin t, -\cos t, 1)$, κ(t) = 1/2

Chapter 3

Section 3.1 (p. 78)
1. domain: R², range: [−1, ∞); 3. domain: {(x, y) : x² + y² ≥ 4}, range: [0, ∞); 5. domain: R³, range: [−1, 1]; 7. 1; 9. does not exist; 11. 2; 13. 2; 15. 0; 17. does not exist.

$\ldots\, y^2)^{-1/2}$, $\frac{\partial f}{\partial y} = y(x^2 + y^2)^{-1/2}$; 11. $\frac{\partial f}{\partial x} = \frac{2x}{3}(x^2 + y + 4)^{-2/3}$, $\frac{\partial f}{\partial y} = \frac{1}{3}(x^2 + y + 4)^{-2/3}$; 13. $\frac{\partial f}{\partial x} = -2x e^{-(x^2 + y^2)}$, $\frac{\partial f}{\partial y} = -2y e^{-(x^2 + y^2)}$; 15. $\frac{\partial f}{\partial x} = y \cos(xy)$, $\frac{\partial f}{\partial y} = x \cos(xy)$; 17. $\frac{\partial^2 f}{\partial x^2} = 2$, $\frac{\partial^2 f}{\partial y^2} = 2$, $\frac{\partial^2 f}{\partial x\, \partial y} = 0$; 19. $\frac{\partial^2 f}{\partial x^2} = (y + 4)(x^2 + y + 4)^{-3/2}$, $\frac{\partial^2 f}{\partial y^2} = -\frac{1}{4}(x^2 + y + 4)^{-3/2}$, $\frac{\partial^2 f}{\partial x\, \partial y} = -\frac{1}{2}x(x^2 + y + 4)^{-3/2}$; 21. $\frac{\partial^2 f}{\partial x^2} = y^2 e^{xy}$, $\frac{\partial^2 f}{\partial y^2} = x^2 e^{xy}$, $\frac{\partial^2 f}{\partial x\, \partial y} = (1 + xy)e^{xy} + 1$; 23. $\frac{\partial^2 f}{\partial x^2} = 12x^2$, $\frac{\partial^2 f}{\partial y^2} = 0$, $\frac{\partial^2 f}{\partial x\, \partial y} = 0$; 25. $\frac{\partial^2 f}{\partial x^2} = -x^{-2}$, $\frac{\partial^2 f}{\partial y^2} = -y^{-2}$, $\frac{\partial^2 f}{\partial x\, \partial y} = 0$

Section 3.3 (p. 86)
1. 2x + 3y − z − 3 = 0; 3. −2x + y − z − 2 = 0; 5. x + 2y = z; 7. $\frac{1}{2}(x - 1) + \frac{4}{9}(y - 2) + \frac{\sqrt{11}}{12}\left( z - \frac{2\sqrt{11}}{3} \right) = 0$; 9. 3x + 4y − 5z = 0.

Section 3.4 (p. 91)
1. (2x, 2y); 3. $\left( \frac{x}{\sqrt{x^2 + y^2 + 4}}, \frac{y}{\sqrt{x^2 + y^2 + 4}} \right)$; 5. $\left( \frac{1}{x}, \frac{1}{y} \right)$; 7. (yz cos(xyz), xz cos(xyz), xy cos(xyz)); 9. (2x, 2y, 2z); 11. $2\sqrt{2}$; 13. $\frac{1}{\sqrt{3}}$; 15. $\sqrt{3}\cos(1)$; 17. increase: (45, 20), decrease: (−45, −20)

Section 3.5 (p. 98)
1. local min. (1, 0); saddle pt. (−1, 0); 3. local min. (1, 1); local max. (−1, −1); saddle pts. (1, −1), (−1, 1); 5. local min. (1, −1), saddle pt. (0, 0); 7. local min. (0, 0); 9. local min. (−1, 1/2); 11. width = height = depth = 10; 13. x = y = 4, z = 2.

Section 3.7 (p. 112)
1. min. $\left( \frac{-4}{\sqrt{5}}, \frac{-2}{\sqrt{5}} \right)$; max. $\left( \frac{4}{\sqrt{5}}, \frac{2}{\sqrt{5}} \right)$; 3. min. $\left( \frac{20}{\sqrt{13}}, \frac{30}{\sqrt{13}} \right)$; max. $\left( -\frac{20}{\sqrt{13}}, -\frac{30}{\sqrt{13}} \right)$ 4. There is no global maximum, nor global minimum. 5. $\frac{8abc}{3\sqrt{3}}$

Chapter 4

Section 4.1 (p. 117)
1. 1; 3. $\frac{7}{12}$; 5. $\frac{7}{6}$; 7. 5; 9. $\frac{1}{2}$; 11. 15.
Everyone is permitted to copy and distribute verbatim copies of this license document, but
changing it is not allowed.
Preamble
The purpose of this License is to make a manual, textbook, or other functional and useful
document "free" in the sense of freedom: to assure everyone the effective freedom to copy
and redistribute it, with or without modifying it, either commercially or noncommercially.
Secondarily, this License preserves for the author and publisher a way to get credit for their
work, while not being considered responsible for modifications made by others.
This License is a kind of "copyleft", which means that derivative works of the document
must themselves be free in the same sense. It complements the GNU General Public License,
which is a copyleft license designed for free software.
We have designed this License in order to use it for manuals for free software, because free
software needs free documentation: a free program should come with manuals providing the
same freedoms that the software does. But this License is not limited to software manuals; it
can be used for any textual work, regardless of subject matter or whether it is published as a
printed book. We recommend this License principally for works whose purpose is instruction
or reference.
Title" of such a section when you modify the Document means that it remains a section
"Entitled XYZ" according to this definition.
The Document may include Warranty Disclaimers next to the notice which states that
this License applies to the Document. These Warranty Disclaimers are considered to be
included by reference in this License, but only as regards disclaiming warranties: any other
implication that these Warranty Disclaimers may have is void and has no effect on the
meaning of this License.
2. VERBATIM COPYING
You may copy and distribute the Document in any medium, either commercially or non-
commercially, provided that this License, the copyright notices, and the license notice say-
ing this License applies to the Document are reproduced in all copies, and that you add no
other conditions whatsoever to those of this License. You may not use technical measures to
obstruct or control the reading or further copying of the copies you make or distribute. How-
ever, you may accept compensation in exchange for copies. If you distribute a large enough
number of copies you must also follow the conditions in section 3.
You may also lend copies, under the same conditions stated above, and you may publicly
display copies.
3. COPYING IN QUANTITY
If you publish printed copies (or copies in media that commonly have printed covers) of
the Document, numbering more than 100, and the Document’s license notice requires Cover
Texts, you must enclose the copies in covers that carry, clearly and legibly, all these Cover
Texts: Front-Cover Texts on the front cover, and Back-Cover Texts on the back cover. Both
covers must also clearly and legibly identify you as the publisher of these copies. The front
cover must present the full title with all words of the title equally prominent and visible.
You may add other material on the covers in addition. Copying with changes limited to the
covers, as long as they preserve the title of the Document and satisfy these conditions, can
be treated as verbatim copying in other respects.
If the required texts for either cover are too voluminous to fit legibly, you should put the
first ones listed (as many as fit reasonably) on the actual cover, and continue the rest onto
adjacent pages.
If you publish or distribute Opaque copies of the Document numbering more than 100,
you must either include a machine-readable Transparent copy along with each Opaque copy,
or state in or with each Opaque copy a computer-network location from which the general
network-using public has access to download using public-standard network protocols a com-
plete Transparent copy of the Document, free of added material. If you use the latter option,
you must take reasonably prudent steps, when you begin distribution of Opaque copies in
quantity, to ensure that this Transparent copy will remain thus accessible at the stated lo-
cation until at least one year after the last time you distribute an Opaque copy (directly or
through your agents or retailers) of that edition to the public.
It is requested, but not required, that you contact the authors of the Document well before
redistributing any large number of copies, to give them a chance to provide you with an
updated version of the Document.
4. MODIFICATIONS
You may copy and distribute a Modified Version of the Document under the conditions of
sections 2 and 3 above, provided that you release the Modified Version under precisely this
License, with the Modified Version filling the role of the Document, thus licensing distribu-
tion and modification of the Modified Version to whoever possesses a copy of it. In addition,
you must do these things in the Modified Version:
A. Use in the Title Page (and on the covers, if any) a title distinct from that of the Document,
and from those of previous versions (which should, if there were any, be listed in the
History section of the Document). You may use the same title as a previous version if the
original publisher of that version gives permission.
B. List on the Title Page, as authors, one or more persons or entities responsible for au-
thorship of the modifications in the Modified Version, together with at least five of the
principal authors of the Document (all of its principal authors, if it has fewer than five),
unless they release you from this requirement.
C. State on the Title page the name of the publisher of the Modified Version, as the pub-
lisher.
E. Add an appropriate copyright notice for your modifications adjacent to the other copy-
right notices.
F. Include, immediately after the copyright notices, a license notice giving the public per-
mission to use the Modified Version under the terms of this License, in the form shown
in the Addendum below.
G. Preserve in that license notice the full lists of Invariant Sections and required Cover
Texts given in the Document’s license notice.
I. Preserve the section Entitled "History", Preserve its Title, and add to it an item stating
at least the title, year, new authors, and publisher of the Modified Version as given on the
Title Page. If there is no section Entitled "History" in the Document, create one stating
the title, year, authors, and publisher of the Document as given on its Title Page, then
add an item describing the Modified Version as stated in the previous sentence.
J. Preserve the network location, if any, given in the Document for public access to a Trans-
parent copy of the Document, and likewise the network locations given in the Document
for previous versions it was based on. These may be placed in the "History" section. You
may omit a network location for a work that was published at least four years before the
Document itself, or if the original publisher of the version it refers to gives permission.
K. For any section Entitled "Acknowledgments" or "Dedications", Preserve the Title of the
section, and preserve in the section all the substance and tone of each of the contributor
acknowledgments and/or dedications given therein.
L. Preserve all the Invariant Sections of the Document, unaltered in their text and in their
titles. Section numbers or the equivalent are not considered part of the section titles.
M. Delete any section Entitled "Endorsements". Such a section may not be included in the
Modified Version.
If the Modified Version includes new front-matter sections or appendices that qualify as
Secondary Sections and contain no material copied from the Document, you may at your
option designate some or all of these sections as invariant. To do this, add their titles to
the list of Invariant Sections in the Modified Version’s license notice. These titles must be
distinct from any other section titles.
You may add a section Entitled "Endorsements", provided it contains nothing but endorse-
ments of your Modified Version by various parties–for example, statements of peer review
or that the text has been approved by an organization as the authoritative definition of a
standard.
You may add a passage of up to five words as a Front-Cover Text, and a passage of up to
25 words as a Back-Cover Text, to the end of the list of Cover Texts in the Modified Version.
Only one passage of Front-Cover Text and one of Back-Cover Text may be added by (or
through arrangements made by) any one entity. If the Document already includes a cover
text for the same cover, previously added by you or by arrangement made by the same entity
you are acting on behalf of, you may not add another; but you may replace the old one, on
explicit permission from the previous publisher that added the old one.
The author(s) and publisher(s) of the Document do not by this License give permission to
use their names for publicity for or to assert or imply endorsement of any Modified Version.
5. COMBINING DOCUMENTS
You may combine the Document with other documents released under this License, under
the terms defined in section 4 above for modified versions, provided that you include in the
combination all of the Invariant Sections of all of the original documents, unmodified, and
list them all as Invariant Sections of your combined work in its license notice, and that you
preserve all their Warranty Disclaimers.
The combined work need only contain one copy of this License, and multiple identical In-
variant Sections may be replaced with a single copy. If there are multiple Invariant Sections
with the same name but different contents, make the title of each such section unique by
adding at the end of it, in parentheses, the name of the original author or publisher of that
section if known, or else a unique number. Make the same adjustment to the section titles
in the list of Invariant Sections in the license notice of the combined work.
In the combination, you must combine any sections Entitled "History" in the various orig-
inal documents, forming one section Entitled "History"; likewise combine any sections En-
titled "Acknowledgments", and any sections Entitled "Dedications". You must delete all
sections Entitled "Endorsements".
6. COLLECTIONS OF DOCUMENTS
You may make a collection consisting of the Document and other documents released un-
der this License, and replace the individual copies of this License in the various documents
with a single copy that is included in the collection, provided that you follow the rules of this
License for verbatim copying of each of the documents in all other respects.
You may extract a single document from such a collection, and distribute it individually
under this License, provided you insert a copy of this License into the extracted document,
and follow this License in all other respects regarding verbatim copying of that document.
8. TRANSLATION
Translation is considered a kind of modification, so you may distribute translations of the
Document under the terms of section 4. Replacing Invariant Sections with translations re-
quires special permission from their copyright holders, but you may include translations of
some or all Invariant Sections in addition to the original versions of these Invariant Sections.
You may include a translation of this License, and all the license notices in the Document,
and any Warranty Disclaimers, provided that you also include the original English version
of this License and the original versions of those notices and disclaimers. In case of a dis-
agreement between the translation and the original version of this License or a notice or
disclaimer, the original version will prevail.
If a section in the Document is Entitled "Acknowledgments", "Dedications", or "History",
the requirement (section 4) to Preserve its Title (section 1) will typically require changing
the actual title.
9. TERMINATION
You may not copy, modify, sublicense, or distribute the Document except as expressly provided
for under this License. Any other attempt to copy, modify, sublicense or distribute the
Document is void, and will automatically terminate your rights under this License. However,
parties who have received copies, or rights, from you under this License will not have
their licenses terminated so long as such parties remain in full compliance.
10. FUTURE REVISIONS OF THIS LICENSE
The Free Software Foundation may publish new, revised versions of the GNU Free Documentation
License from time to time. Such new versions will be similar in spirit to the
present version, but may differ in detail to address new problems or concerns. See
http://www.gnu.org/copyleft/.
Each version of the License is given a distinguishing version number. If the Document
specifies that a particular numbered version of this License "or any later version" applies to
it, you have the option of following the terms and conditions either of that specified version or
of any later version that has been published (not as a draft) by the Free Software Foundation.
If the Document does not specify a version number of this License, you may choose any
version ever published (not as a draft) by the Free Software Foundation.
ADDENDUM: How to use this License for your documents
To use this License in a document you have written, include a copy of the License in the
document and put the following copyright and license notices just after the title page:
If you have Invariant Sections, Front-Cover Texts and Back-Cover Texts, replace the
"with...Texts." line with this:
with the Invariant Sections being LIST THEIR TITLES, with the Front-Cover Texts
being LIST, and with the Back-Cover Texts being LIST.
If you have Invariant Sections without Cover Texts, or some other combination of the
three, merge those two alternatives to suit the situation.
If your document contains nontrivial examples of program code, we recommend releasing
these examples in parallel under your choice of free software license, such as the GNU
General Public License, to permit their use in free software.
History
This section contains the revision history of the book. For persons making modifications to
the book, please record the pertinent information here, following the format in the first item
below.
1. VERSION: 1.0
Date: 2008-01-04
Author(s): Michael Corral
Title: Vector Calculus
Modification(s): Initial version
2. VERSION: 1.1
Date: 2016-12-04
Author(s): Anton Petrunin
Title: Corral’s Vector Calculus
Modification(s): Minor corrections and more exercises.
Index
Symbols
$C^1$, $C^\infty$ . . . . . . 57
$D$ . . . . . . 95
$M_x$, $M_y$ . . . . . . 142
$M_{xy}$, $M_{xz}$, $M_{yz}$ . . . . . . 144
$\mathbb{R}^2$ . . . . . . 1
$\mathbb{R}^3$ . . . . . . 1
$\bar{x}$ . . . . . . 142
$\bar{y}$ . . . . . . 142
$\bar{z}$ . . . . . . 144
$\delta(x, y)$ . . . . . . 142
$\frac{\partial(x, y, z)}{\partial(u, v, w)}$ . . . . . . 137
$\frac{\partial f}{\partial x}$ . . . . . . 80
$\iiint$ . . . . . . 126
$\iint$ . . . . . . 115, 119
$\int_C$ . . . . . . 156, 159
$e_r$, $e_\theta$, $e_z$, $e_\rho$, $e_\varphi$ . . . . . . 207
$i$, $j$, $k$ . . . . . . 12
$\nabla$ . . . . . . 89, 202
$\oiint_\Sigma$ . . . . . . 186
$\oint_C$ . . . . . . 166
$\partial$ . . . . . . 80
$D_v f$ . . . . . . 87
$dr$ . . . . . . 160

A
acceleration . . . . . . 2, 61
angle . . . . . . 15
annulus . . . . . . 176
arc length . . . . . . 57
area element . . . . . . 119
average value . . . . . . 130

B
Bézier curve . . . . . . 61
bounded set . . . . . . 108

C
capping surface . . . . . . 200
Cauchy–Schwarz Inequality . . . . . . 17
center of mass . . . . . . 142
centroid . . . . . . 144
Chain Rule . . . . . . 66, 89
change of variable . . . . . . 135, 137
circulation . . . . . . 198
closed curve . . . . . . 166
closed surface . . . . . . 185
collinear . . . . . . 38
conical helix . . . . . . 191
conservative field . . . . . . 169
constrained critical point . . . . . . 107
continuity . . . . . . 57, 78
continuously differentiable . . . . . . 57, 89
coordinates . . . . . . 1
    Cartesian . . . . . . 1
    curvilinear . . . . . . 51
    cylindrical . . . . . . 51, 208
    ellipsoidal . . . . . . 187
    left-handed . . . . . . 2
    polar . . . . . . 52, 139
    rectangular . . . . . . 1
    right-handed . . . . . . 1
    spherical . . . . . . 51, 208
coplanar . . . . . . 27
correlation . . . . . . 154
covariance . . . . . . 154
critical point . . . . . . 93
L
Lagrange multiplier . . . . . . 107
lamina . . . . . . 142
Laplacian . . . . . . 208
level curve . . . . . . 75
limit . . . . . . 75
    vector-valued function . . . . . . 57
line . . . . . . 33
    intersection of planes . . . . . . 40
    parallel . . . . . . 36
    parametric representation . . . . . . 33
    perpendicular . . . . . . 36
    skew . . . . . . 36
    symmetric representation . . . . . . 34
    through two points . . . . . . 35
    vector representation . . . . . . 33
line integral . . . . . . 156, 159
local maximum . . . . . . 93
local minimum . . . . . . 93

M
mass . . . . . . 142
matrix . . . . . . 28
mixed partial derivative . . . . . . 82
Möbius strip . . . . . . 193
moment . . . . . . 142, 144
momentum . . . . . . 61
Monte Carlo method . . . . . . 130
multiple integral . . . . . . 114
multiply connected . . . . . . 175

N
n-positive direction . . . . . . 193
normal to a curve . . . . . . 90
normal vector field . . . . . . 192

O
orientable . . . . . . 192

P
paraboloid . . . . . . 47
    elliptic . . . . . . 47
    hyperbolic . . . . . . 47, 94
    of revolution . . . . . . 47
parallelepiped . . . . . . 26
    volume . . . . . . 26
parameter . . . . . . 33, 66
parametrization . . . . . . 66
partial derivative . . . . . . 80
partial differential equation . . . . . . 83
path independence . . . . . . 167, 177, 200
pedal curve . . . . . . 69
piecewise smooth curve . . . . . . 161
plane
    coordinate . . . . . . 1
    Euclidean . . . . . . 1
    line of intersection . . . . . . 40
    normal form . . . . . . 37
    normal vector . . . . . . 37
    point-normal form . . . . . . 37
    tangent . . . . . . 84
    through three points . . . . . . 38
position vector . . . . . . 59, 60, 159
potential . . . . . . 169
probability . . . . . . 147
probability density function . . . . . . 148
projection . . . . . . 20

Q
quadric surface . . . . . . 46

R
random variable . . . . . . 147
regular reparametrization . . . . . . 66
reparametrization . . . . . . 66
Riemann integral . . . . . . 155
right-hand rule . . . . . . 22
ruled surface . . . . . . 48

S
saddle point . . . . . . 95
sample space . . . . . . 147
scalar . . . . . . 9
    combination . . . . . . 13
scalar function . . . . . . 58
scalar triple product . . . . . . 26
U
uniform density . . . . . . 142
uniform distribution . . . . . . 148
uniformly distributed . . . . . . 147
unit disk . . . . . . 74

V
variance . . . . . . 154
vector . . . . . . 3
    addition . . . . . . 9
    angle between . . . . . . 15
    basis . . . . . . 12
    components . . . . . . 13