
Vector and Tensor

Calculus
Frankenstein’s Note

Daniel Miranda

Version 0.76
Copyright ©2018
Permission is granted to copy, distribute and/or modify this document under the terms
of the GNU Free Documentation License, Version 1.3 or any later version published by
the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no
Back-Cover Texts.

These notes were written based on, and using excerpts from, the book “Multivariable
and Vector Calculus” by David Santos, and include excerpts from “Vector Calculus” by
Michael Corral, “Linear Algebra via Exterior Products” by Sergei Winitzki, “Linear
Algebra” by David Santos, and “Introduction to Tensor Calculus” by Taha Sochi.
These books are also available under the terms of the GNU Free Documentation License.
History

These notes are based on the LaTeX source of the book “Multivariable and Vector Calculus” by David
Santos, which has undergone profound changes over time. In particular, some examples and figures
from “Vector Calculus” by Michael Corral have been added. The tensor part is based on “Linear
Algebra via Exterior Products” by Sergei Winitzki and on “Introduction to Tensor Calculus” by Taha
Sochi.
What made the creation of these notes possible was the fact that these four books are available
under the terms of the GNU Free Documentation License.
Version 0.76
Corrections in chapters 8, 9, and 11. Section 3.3 has been improved and simplified.
Third Version 0.7 - This version was released 04/2018.
Two new chapters: Multiple Integrals and Integration of Forms. Around 400 corrections in the
first seven chapters. New examples. New figures.
Second Version 0.6 - This version was released 05/2017.
In this version a lot of effort was made to transform the notes into a more coherent text.
First Version 0.5 - This version was released 02/2017.
The first version of the notes.

Acknowledgement

I would like to thank Alan Gomes, Ana Maria Slivar, and Tiago Leite Dias for all their comments and
corrections.

Contents
I. Differential Vector Calculus 1

1. Multidimensional Vectors 3
1.1. Vector Space . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.2. Basis and Change of Basis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
1.2.1. Linear Independence and Spanning Sets . . . . . . . . . . . . . . . . . . . . 8
1.2.2. Basis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
1.2.3. Coordinates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
1.3. Linear Transformations and Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . 15
1.4. Three Dimensional Space . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
1.4.1. Cross Product . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
1.4.2. Cylindrical and Spherical Coordinates . . . . . . . . . . . . . . . . . . . . . 22
1.5. ⋆ Cross Product in the n-Dimensional Space . . . . . . . . . . . . . . . . . . . . . . 27
1.6. Multivariable Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
1.6.1. Graphical Representation of Vector Fields . . . . . . . . . . . . . . . . . . . 29
1.7. Levi-Civita and Einstein Index Notation . . . . . . . . . . . . . . . . . . . . . . . . . 31
1.7.1. Common Definitions in Einstein Notation . . . . . . . . . . . . . . . . . . . 35
1.7.2. Examples of Using Einstein Notation to Prove Identities . . . . . . . . . . . . 36

2. Limits and Continuity 39


2.1. Some Topology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
2.2. Limits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
2.3. Continuity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
2.4. ⋆ Compactness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51

3. Differentiation of Vector Function 55


3.1. Differentiation of Vector Function of a Real Variable . . . . . . . . . . . . . . . . . . 55
3.1.1. Antiderivatives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
3.2. Kepler Law . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
3.3. Definition of the Derivative of Vector Function . . . . . . . . . . . . . . . . . . . . . 64
3.4. Partial and Directional Derivatives . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
3.5. The Jacobi Matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68

3.6. Properties of Differentiable Transformations . . . . . . . . . . . . . . . . . . . . . . 72
3.7. Gradients, Curls and Directional Derivatives . . . . . . . . . . . . . . . . . . . . . . 76
3.8. The Geometrical Meaning of Divergence and Curl . . . . . . . . . . . . . . . . . . . 83
3.8.1. Divergence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
3.8.2. Curl . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
3.9. Maxwell’s Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
3.10. Inverse Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
3.11. Implicit Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
3.12. Common Differential Operations in Einstein Notation . . . . . . . . . . . . . . . . . 91
3.12.1. Common Identities in Einstein Notation . . . . . . . . . . . . . . . . . . . . 92
3.12.2. Examples of Using Einstein Notation to Prove Identities . . . . . . . . . . . . 94

II. Integral Vector Calculus 101

4. Multiple Integrals 103


4.1. Double Integrals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
4.2. Iterated integrals and Fubini’s theorem . . . . . . . . . . . . . . . . . . . . . . . . . 106
4.3. Double Integrals Over a General Region . . . . . . . . . . . . . . . . . . . . . . . . . 111
4.4. Triple Integrals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
4.5. Change of Variables in Multiple Integrals . . . . . . . . . . . . . . . . . . . . . . . . 118
4.6. Application: Center of Mass . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
4.7. Application: Probability and Expected Value . . . . . . . . . . . . . . . . . . . . . . 128

5. Curves and Surfaces 137


5.1. Parametric Curves . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
5.2. Surfaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140
5.2.1. Implicit Surface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142
5.3. Classical Examples of Surfaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
5.4. ⋆ Manifolds . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148
5.5. Constrained optimization. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 149

6. Line Integrals 155


6.1. Line Integrals of Vector Fields . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155
6.2. Parametrization Invariance and Other Properties of Line Integrals . . . . . . . . . . 158
6.3. Line Integral of Scalar Fields . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159
6.3.1. Area above a Curve . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 160
6.4. The First Fundamental Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 162
6.5. Test for a Gradient Field . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164
6.5.1. Irrotational Vector Fields . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164
6.5.2. Work and potential energy . . . . . . . . . . . . . . . . . . . . . . . . . . . 165
6.6. The Second Fundamental Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . 166
6.7. Constructing Potential Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167

6.8. Green’s Theorem in the Plane . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 169
6.9. Application of Green’s Theorem: Area . . . . . . . . . . . . . . . . . . . . . . . . . . 175
6.10. Vector forms of Green’s Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . 177

7. Surface Integrals 179


7.1. The Fundamental Vector Product . . . . . . . . . . . . . . . . . . . . . . . . . . . . 179
7.2. The Area of a Parametrized Surface . . . . . . . . . . . . . . . . . . . . . . . . . . . 182
7.2.1. The Area of a Graph of a Function . . . . . . . . . . . . . . . . . . . . . . . . 187
7.2.2. Pappus Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 189
7.3. Surface Integrals of Scalar Functions . . . . . . . . . . . . . . . . . . . . . . . . . . 190
7.4. Surface Integrals of Vector Functions . . . . . . . . . . . . . . . . . . . . . . . . . . 192
7.4.1. Orientation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 192
7.4.2. Flux . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193
7.5. Kelvin-Stokes Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 196
7.6. Gauss Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 202
7.6.1. Gauss’s Law For Inverse-Square Fields . . . . . . . . . . . . . . . . . . . . . 204
7.7. Applications of Surface Integrals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 206
7.7.1. Conservative and Potential Forces . . . . . . . . . . . . . . . . . . . . . . . 206
7.7.2. Conservation laws . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 207
7.8. Helmholtz Decomposition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 208
7.9. Green’s Identities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 209

III. Tensor Calculus 211

8. Curvilinear Coordinates 213


8.1. Curvilinear Coordinates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 213
8.2. Line and Volume Elements in Orthogonal Coordinate Systems . . . . . . . . . . . . 217
8.3. Gradient in Orthogonal Curvilinear Coordinates . . . . . . . . . . . . . . . . . . . . 220
8.3.1. Expressions for Unit Vectors . . . . . . . . . . . . . . . . . . . . . . . . . . . 221
8.4. Divergence in Orthogonal Curvilinear Coordinates . . . . . . . . . . . . . . . . . . . 222
8.5. Curl in Orthogonal Curvilinear Coordinates . . . . . . . . . . . . . . . . . . . . . . . 223
8.6. The Laplacian in Orthogonal Curvilinear Coordinates . . . . . . . . . . . . . . . . . 224
8.7. Examples of Orthogonal Coordinates . . . . . . . . . . . . . . . . . . . . . . . . . . 225

9. Tensors 231
9.1. Linear Functional . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 231
9.2. Dual Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 232
9.2.1. Dual Basis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 233
9.3. Bilinear Forms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 234
9.4. Tensor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 235
9.4.1. Basis of Tensor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 238
9.4.2. Contraction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 240

9.5. Change of Coordinates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 240
9.5.1. Vectors and Covectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 240
9.5.2. Bilinear Forms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 241
9.6. Symmetry properties of tensors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 241
9.7. Forms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 242
9.7.1. Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 242
9.7.2. Exterior product . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 243
9.7.3. Hodge star operator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 247

10. Tensors in Coordinates 249


10.1. Index notation for tensors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 249
10.1.1. Definition of index notation . . . . . . . . . . . . . . . . . . . . . . . . . . . 249
10.1.2. Advantages and disadvantages of index notation . . . . . . . . . . . . . . . 252
10.2. Tensor Revisited: Change of Coordinate . . . . . . . . . . . . . . . . . . . . . . . . . 252
10.2.1. Rank . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 254
10.2.2. Examples of Tensors of Different Ranks . . . . . . . . . . . . . . . . . . . . . 255
10.3. Tensor Operations in Coordinates . . . . . . . . . . . . . . . . . . . . . . . . . . . . 255
10.3.1. Addition and Subtraction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 256
10.3.2. Multiplication by Scalar . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 257
10.3.3. Tensor Product . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 257
10.3.4. Contraction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 258
10.3.5. Inner Product . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 259
10.3.6. Permutation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 259
10.4. Kronecker and Levi-Civita Tensors . . . . . . . . . . . . . . . . . . . . . . . . . . . . 260
10.4.1. Kronecker δ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 260
10.4.2. Permutation ϵ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 260
10.4.3. Useful Identities Involving δ or/and ϵ . . . . . . . . . . . . . . . . . . . . . . 261
10.4.4. ⋆ Generalized Kronecker delta . . . . . . . . . . . . . . . . . . . . . . . . . . 264
10.5. Types of Tensor Fields . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 265
10.5.1. Isotropic and Anisotropic Tensors . . . . . . . . . . . . . . . . . . . . . . . . 265
10.5.2. Symmetric and Anti-symmetric Tensors . . . . . . . . . . . . . . . . . . . . 266

11. Tensor Calculus 269


11.1. Tensor Fields . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 269
11.1.1. Change of Coordinates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 271
11.2. Derivatives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 273
11.3. Integrals and the Tensor Divergence Theorem . . . . . . . . . . . . . . . . . . . . . 276
11.4. Metric Tensor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 277
11.5. Covariant Differentiation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 278
11.6. Geodesics and The Euler-Lagrange Equations . . . . . . . . . . . . . . . . . . . . . 282

12. Applications of Tensor 285
12.1. The Inertia Tensor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 285
12.1.1. The Parallel Axis Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . 288
12.2. Ohm’s Law . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 289
12.3. Equation of Motion for a Fluid: Navier-Stokes Equation . . . . . . . . . . . . . . . . 289
12.3.1. Stress Tensor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 289
12.3.2. Derivation of the Navier-Stokes Equations . . . . . . . . . . . . . . . . . . . 290

13. Integration of Forms 295


13.1. Differential Forms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 295
13.2. Integrating Differential Forms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 297
13.3. Zero-Manifolds . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 297
13.4. One-Manifolds . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 298
13.5. Closed and Exact Forms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 302
13.6. Two-Manifolds . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 306
13.7. Three-Manifolds . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 307
13.8. Surface Integrals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 308
13.9. Green’s, Stokes’, and Gauss’ Theorems . . . . . . . . . . . . . . . . . . . . . . . . . 311

IV. Appendix 319

A. Answers and Hints 321



B. GNU Free Documentation License 333

References 339

Index 344

Part I.

Differential Vector Calculus

Multidimensional Vectors
1.
1.1. Vector Space
In this section we introduce an algebraic structure for Rn , the vector space in n-dimensions.
We assume that you are familiar with the geometric interpretation of members of R2 and R3 as
the rectangular coordinates of points in a plane and three-dimensional space, respectively.
Although Rn cannot be visualized geometrically if n ≥ 4, geometric ideas from R, R2 , and R3
often help us to interpret the properties of Rn for arbitrary n.

1 Definition
The n-dimensional space, Rn , is defined as the set

Rn = { (x1 , x2 , . . . , xn ) : xk ∈ R } .

Elements v ∈ Rn will be called vectors and will be written in boldface v. On the blackboard,
vectors are generally written with an arrow, ⃗v .

2 Definition
If x and y are two vectors in Rn their vector sum x + y is defined by the coordinatewise addition

x + y = (x1 + y1 , x2 + y2 , . . . , xn + yn ) . (1.1)

Note that the symbol “+” has two distinct meanings in (1.1): on the left, “+” stands for the newly
defined addition of members of Rn and, on the right, for the usual addition of real numbers.
The vector with all components 0 is called the zero vector and is denoted by 0. It has the property
that v + 0 = v for every vector v; in other words, 0 is the identity element for vector addition.

3 Definition
A real number λ ∈ R will be called a scalar. If λ ∈ R and x ∈ Rn we define scalar multiplication
of a vector and a scalar by the coordinatewise multiplication

λx = (λx1 , λx2 , . . . , λxn ) . (1.2)

The space Rn with the operations of sum and scalar multiplication defined above will be called
the n-dimensional vector space.
The vector (−1)x is also denoted by −x and is called the negative or opposite of x.
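As an illustration, the coordinatewise operations (1.1) and (1.2) can be sketched in a few lines of Python; the function names here are our own, chosen for this example.

```python
# Sketch of the coordinatewise operations (1.1) and (1.2) on R^n,
# with a vector represented as a Python tuple of reals.

def vec_add(x, y):
    """Vector sum x + y, computed coordinatewise as in (1.1)."""
    return tuple(xi + yi for xi, yi in zip(x, y))

def scalar_mul(lam, x):
    """Scalar multiple lam * x, computed coordinatewise as in (1.2)."""
    return tuple(lam * xi for xi in x)

x = (1, 2, 3)
y = (4, 5, 6)
print(vec_add(x, y))      # (5, 7, 9)
print(scalar_mul(-1, x))  # (-1, -2, -3), the opposite vector -x
```

Note that adding a vector to its opposite returns the zero vector, as in Theorem 4.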
We leave the proof of the following theorem to the reader.
4 Theorem
If x, z, and y are in Rn and λ, λ1 and λ2 are real numbers, then

Ê x + z = z + x (vector addition is commutative).

Ë (x + z) + y = x + (z + y) (vector addition is associative).

Ì There is a unique vector 0, called the zero vector, such that x + 0 = x for all x in Rn .

Í For each x in Rn there is a unique vector −x such that x + (−x) = 0.

Î λ1 (λ2 x) = (λ1 λ2 )x.

Ï (λ1 + λ2 )x = λ1 x + λ2 x.

Ð λ(x + z) = λx + λz.

Ñ 1x = x.

Clearly, 0 = (0, 0, . . . , 0) and, if x = (x1 , x2 , . . . , xn ), then

−x = (−x1 , −x2 , . . . , −xn ).

We write x + (−z) as x − z. The vector 0 is called the origin.


In a more general context, a nonempty set V , together with two operations +, · is said to be a
vector space if it has the properties listed in Theorem 4. The members of a vector space are called
vectors.
When we wish to note that we are regarding a member of Rn as part of this algebraic structure,
we will speak of it as a vector; otherwise, we will speak of it as a point.
5 Definition
The canonical ordered basis for Rn is the collection of vectors

{e1 , e2 , . . . , en }

with
ek = (0, . . . , 0, 1, 0, . . . , 0) ,
where there is a 1 in the k-th slot and 0’s everywhere else.

Observe that
∑_{k=1}^{n} vk ek = (v1 , v2 , . . . , vn ) . (1.3)
This means that any vector can be written as a sum of scalar multiples of the standard basis. We
will discuss this fact more deeply in the next section.


6 Definition
Let a, b be distinct points in Rn and let x = b − a ̸= 0. The parametric line passing through a in
the direction of x is the set
{r ∈ Rn : r = a + tx, t ∈ R} .

7 Example
Find the parametric equation of the line passing through the points (1, 2, 3) and (−2, −1, 0).

Solution: ▶ The line follows the direction

(1 − (−2), 2 − (−1), 3 − 0) = (3, 3, 3) .

The desired equation is

(x, y, z) = (1, 2, 3) + t (3, 3, 3) .

Equivalently,
(x, y, z) = (−2, −1, 0) + t (3, 3, 3) .
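A quick numeric check of this example, sketched in Python under the parametrization r(t) = a + t x:

```python
# Numeric check of Example 7: the line r(t) = a + t*x through (1, 2, 3)
# with direction x = (3, 3, 3) also passes through (-2, -1, 0).

def line_point(a, x, t):
    """Return the point a + t*x on the parametric line."""
    return tuple(ai + t * xi for ai, xi in zip(a, x))

a = (1, 2, 3)
x = (3, 3, 3)
print(line_point(a, x, 0))   # (1, 2, 3)
print(line_point(a, x, -1))  # (-2, -1, 0)
```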

Length, Distance, and Inner Product


8 Definition
Given vectors x, y of Rn , their inner product or dot product is defined as

x•y = ∑_{k=1}^{n} xk yk .

9 Theorem
For x, y, z ∈ Rn , and α and β real numbers, we have:

Ê (αx + βy)•z = α(x•z) + β(y•z)

Ë x•y = y•x

Ì x•x ≥ 0

Í x•x = 0 if and only if x = 0

The proof of this theorem is simple and will be left as an exercise for the reader.
The norm or length of a vector x, denoted as ∥x∥, is defined as

∥x∥ = √(x•x) .


10 Definition
Given vectors x, y of Rn , their distance is

d(x, y) = ∥x − y∥ = √( (x − y)•(x − y) ) = √( ∑_{i=1}^{n} (xi − yi )² ) .

If n = 1, the previous definition of length reduces to the familiar absolute value; for n = 2 and
n = 3, the length and distance of Definition 10 reduce to the familiar definitions for two- and
three-dimensional space.

11 Definition
A vector x is called a unit vector if
∥x∥ = 1.

12 Definition
Let x be a non-zero vector; then the associated versor (or normalized vector), denoted x̂, is the unit
vector
x̂ = x / ∥x∥ .
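Definitions 8 through 12 translate directly into code; the following Python sketch (function names our own) computes dot products, norms, distances, and versors.

```python
import math

def dot(x, y):
    """Inner product of two vectors of R^n (Definition 8)."""
    return sum(xi * yi for xi, yi in zip(x, y))

def norm(x):
    """Norm ||x|| = sqrt(x . x)."""
    return math.sqrt(dot(x, x))

def distance(x, y):
    """Distance d(x, y) = ||x - y|| (Definition 10)."""
    return norm(tuple(xi - yi for xi, yi in zip(x, y)))

def versor(x):
    """Normalized vector x / ||x||; only defined for non-zero x."""
    n = norm(x)
    return tuple(xi / n for xi in x)

print(norm((3, 4)))              # 5.0
print(distance((1, 2), (4, 6)))  # 5.0
print(versor((3, 0, 0)))         # (1.0, 0.0, 0.0)
```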

We now establish one of the most useful inequalities in analysis.

13 Theorem (Cauchy-Bunyakovsky-Schwarz Inequality)


Let x and y be any two vectors in Rn . Then we have

|x•y| ≤ ∥x∥∥y∥.

Proof. Since the norm of any vector is non-negative, we have

∥x + ty∥ ≥ 0 ⇐⇒ (x + ty)•(x + ty) ≥ 0


⇐⇒ x•x + 2tx•y + t2 y•y ≥ 0
⇐⇒ ∥x∥2 + 2tx•y + t2 ∥y∥2 ≥ 0.

This last expression is a quadratic polynomial in t which is always non-negative. As such its
discriminant must be non-positive, that is,

(2x•y)2 − 4(∥x∥2 )(∥y∥2 ) ≤ 0 ⇐⇒ |x•y| ≤ ∥x∥∥y∥,

giving the theorem. ■

The Cauchy-Bunyakovsky-Schwarz inequality can be written as

| ∑_{k=1}^{n} xk yk | ≤ ( ∑_{k=1}^{n} xk² )^{1/2} ( ∑_{k=1}^{n} yk² )^{1/2} , (1.4)

for real numbers xk , yk .
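Inequality (1.4) is easy to probe numerically. The following Python sketch checks it on a batch of random vectors; this is of course a sanity check, not a proof.

```python
import math
import random

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

random.seed(0)  # fixed seed so the check is reproducible
for _ in range(1000):
    n = random.randint(1, 6)
    x = [random.uniform(-10, 10) for _ in range(n)]
    y = [random.uniform(-10, 10) for _ in range(n)]
    lhs = abs(dot(x, y))
    rhs = math.sqrt(dot(x, x)) * math.sqrt(dot(y, y))
    assert lhs <= rhs + 1e-9  # (1.4) holds, up to rounding
print("Cauchy-Bunyakovsky-Schwarz verified on 1000 random pairs")
```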

14 Theorem (Triangle Inequality)


Let x and y be any two vectors in Rn . Then we have

∥x + y∥ ≤ ∥x∥ + ∥y∥.

Proof.
||x + y||2 = (x + y)•(x + y)
= x•x + 2x•y + y•y
≤ ||x||2 + 2||x||||y|| + ||y||2
= (||x|| + ||y||)2 ,
from where the desired result follows. ■

15 Corollary
If x, y, and z are in Rn , then
∥x − y∥ ≤ ∥x − z∥ + ∥z − y∥.

Proof. Write
x − y = (x − z) + (z − y),

and apply Theorem 14. ■

16 Definition
Let x and y be two non-zero vectors in Rn . Then the angle ∠(x, y) between them is given by the
relation

cos ∠(x, y) = x•y / ( ∥x∥∥y∥ ) .

This expression agrees with the geometry in the case of the dot product for R2 and R3 .

17 Definition
Let x and y be two non-zero vectors in Rn . These vectors are said to be orthogonal if the angle
between them is 90 degrees; equivalently, if x•y = 0 .
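Definitions 16 and 17 give a recipe for computing angles and testing orthogonality; a small Python sketch (helper names our own):

```python
import math

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def norm(x):
    return math.sqrt(dot(x, x))

def angle(x, y):
    """Angle between two non-zero vectors of R^n, in radians (Definition 16)."""
    return math.acos(dot(x, y) / (norm(x) * norm(y)))

def orthogonal(x, y, tol=1e-12):
    """Orthogonality test of Definition 17: x . y = 0."""
    return abs(dot(x, y)) <= tol

print(angle((1, 0), (0, 1)))        # 1.5707963... = pi/2
print(orthogonal((1, 2), (-2, 1)))  # True
```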

Let P0 = (p1 , p2 , . . . , pn ), and n = (n1 , n2 , . . . , nn ) be a nonzero vector.


18 Definition
The hyperplane defined by the point P0 and the vector n is defined as the set of points P = (x1 , x2 , . . . , xn ) ∈
Rn such that the vector drawn from P0 to P is perpendicular to n.

Recalling that two vectors are perpendicular if and only if their dot product is zero, it follows that
the desired hyperplane can be described as the set of all points P such that

n•(P − P0 ) = 0.

Expanded this becomes

n1 (x1 − p1 ) + n2 (x2 − p2 ) + · · · + nn (xn − pn ) = 0,

which is the point-normal form of the equation of a hyperplane. This is just a linear equation

n1 x1 + n2 x2 + · · · + nn xn + d = 0,

where

d = −(n1 p1 + n2 p2 + · · · + nn pn ).
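As a concrete sketch: with P0 = (1, 1, 1) and n = (1, 2, 3) (values chosen purely for illustration), the coefficients of the point-normal form can be computed as follows.

```python
# Point-normal form of a hyperplane: n1 x1 + ... + nn xn + d = 0,
# with d = -(n . P0).  P0 and n below are illustrative values.

def hyperplane_coeffs(p0, n):
    """Return (n1, ..., nn, d) for the hyperplane through p0 normal to n."""
    d = -sum(ni * pi for ni, pi in zip(n, p0))
    return tuple(n) + (d,)

def on_hyperplane(p, coeffs, tol=1e-12):
    """Test whether the point p satisfies the hyperplane equation."""
    *n, d = coeffs
    return abs(sum(ni * pi for ni, pi in zip(n, p)) + d) <= tol

coeffs = hyperplane_coeffs((1, 1, 1), (1, 2, 3))
print(coeffs)                            # (1, 2, 3, -6)
print(on_hyperplane((3, 0, 1), coeffs))  # True: 3 + 0 + 3 - 6 = 0
```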

1.2. Basis and Change of Basis


1.2.1. Linear Independence and Spanning Sets
19 Definition
Let λi ∈ R, 1 ≤ i ≤ n. Then the vectorial sum

∑_{j=1}^{n} λj xj

is said to be a linear combination of the vectors xi ∈ Rn , 1 ≤ i ≤ n.

20 Definition
The vectors xi ∈ Rn , 1 ≤ i ≤ n, are linearly dependent or tied if

∃(λ1 , λ2 , · · · , λn ) ∈ Rn \ {0} such that ∑_{j=1}^{n} λj xj = 0,

that is, if there is a non-trivial linear combination of them adding to the zero vector.

21 Definition
The vectors xi ∈ Rn , 1 ≤ i ≤ n, are linearly independent or free if they are not linearly dependent.

That is, if λi ∈ R, 1 ≤ i ≤ n, then

∑_{j=1}^{n} λj xj = 0 =⇒ λ1 = λ2 = · · · = λn = 0.

A family of vectors is linearly independent if and only if the only linear combination of
them giving the zero-vector is the trivial linear combination.
22 Example

{ }
(1, 2, 3) , (4, 5, 6) , (7, 8, 9)
is a tied family of vectors in R3 , since

(1) (1, 2, 3) + (−2) (4, 5, 6) + (1) (7, 8, 9) = (0, 0, 0) .

23 Definition
A family of vectors {x1 , x2 , . . . , xk , . . . , } ⊆ Rn is said to span or generate Rn if every x ∈ Rn can
be written as a linear combination of the xj ’s.

24 Example
Since

∑_{k=1}^{n} vk ek = (v1 , v2 , . . . , vn ) , (1.5)

the canonical basis generates Rn .

25 Theorem
If {x1 , x2 , . . . , xk , . . . , } ⊆ Rn spans Rn , then any superset

{y, x1 , x2 , . . . , xk , . . . , } ⊆ Rn

also spans Rn .

Proof. This follows at once from

∑_{i=1}^{l} λi xi = 0 y + ∑_{i=1}^{l} λi xi . ■

26 Example
The family of vectors
{ }
i = (1, 0, 0) , j = (0, 1, 0) , k = (0, 0, 1)
spans R3 since given (a, b, c) ∈ R3 we may write

(a, b, c) = ai + bj + ck.

27 Example
Prove that the family of vectors
{ }
t1 = (1, 0, 0) , t2 = (1, 1, 0) , t3 = (1, 1, 1)

spans R3 .

Solution: ▶ This follows from the identity

(a, b, c) = (a − b) (1, 0, 0) + (b − c) (1, 1, 0) + c (1, 1, 1) = (a − b)t1 + (b − c)t2 + ct3 .
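The identity in Example 27 can be spot-checked numerically; a minimal Python sketch:

```python
# Spot-check of Example 27: (a, b, c) = (a - b) t1 + (b - c) t2 + c t3
# for t1 = (1,0,0), t2 = (1,1,0), t3 = (1,1,1).

t1, t2, t3 = (1, 0, 0), (1, 1, 0), (1, 1, 1)

def combo(a, b, c):
    """Evaluate (a - b) t1 + (b - c) t2 + c t3 coordinatewise."""
    coeffs = (a - b, b - c, c)
    return tuple(sum(lam * v[i] for lam, v in zip(coeffs, (t1, t2, t3)))
                 for i in range(3))

print(combo(5, 3, 2))   # (5, 3, 2)
print(combo(-1, 7, 0))  # (-1, 7, 0)
```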

1.2.2. Basis
28 Definition
A family E = {x1 , x2 , . . . , xk , . . .} ⊆ Rn is said to be a basis of Rn if

Ê its vectors are linearly independent,

Ë they span Rn .

29 Example
The family
ei = (0, . . . , 0, 1, 0, . . . , 0) ,
where there is a 1 on the i-th slot and 0’s on the other n − 1 positions, is a basis for Rn .

30 Theorem
All bases of Rn have the same number of vectors.

31 Definition
The dimension of Rn is the number of elements of any of its bases, n.

32 Theorem
Let {x1 , . . . , xn } be a family of vectors in Rn . Then the x’s form a basis if and only if the n × n matrix
A formed by taking the x’s as the columns of A is invertible.

Proof. Since we have the right number of vectors, it is enough to prove that the x’s are linearly
independent. But if X = (λ1 , λ2 , . . . , λn ), then

λ1 x1 + · · · + λn xn = AX.

If A is invertible, then AX = 0 =⇒ X = A−1 0 = 0, meaning that λ1 = λ2 = · · · = λn = 0, so the
x’s are linearly independent.

The converse is left as an exercise. ■
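Theorem 32 gives a practical test for a basis; a short sketch using NumPy (assuming it is available) checks invertibility via the determinant:

```python
import numpy as np

def is_basis(vectors):
    """Theorem 32: vectors form a basis of R^n iff the matrix with
    the vectors as its columns is invertible (nonzero determinant)."""
    A = np.column_stack(vectors)
    return bool(abs(np.linalg.det(A)) > 1e-9)

# The tied family of Example 22 is not a basis ...
print(is_basis([(1, 2, 3), (4, 5, 6), (7, 8, 9)]))  # False
# ... while t1, t2, t3 of Example 27 are.
print(is_basis([(1, 0, 0), (1, 1, 0), (1, 1, 1)]))  # True
```

A tolerance is used because floating-point determinants of singular matrices come out as tiny nonzero numbers rather than exactly zero.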


33 Definition

Ê A basis E = {x1 , x2 , . . . , xk } of vectors in Rn is called orthogonal if

xi •xj = 0

for all i ̸= j.

Ë An orthogonal basis of vectors is called orthonormal if all vectors in E are unit vectors, i.e.,
have norm equal to 1.

1.2.3. Coordinates
34 Theorem
Let E = {e1 , e2 , . . . , en } be a basis for a vector space Rn . Then any x ∈ Rn has a unique represen-
tation
x = a1 e1 + a2 e2 + · · · + an en .

Proof. Let
x = b1 e1 + b2 e2 + · · · + bn en

be another representation of x. Then

0 = (a1 − b1 )e1 + (a2 − b2 )e2 + · · · + (an − bn )en .

Since {e1 , e2 , . . . , en } forms a basis for Rn , they are a linearly independent family. Thus we must
have
a1 − b1 = a2 − b2 = · · · = an − bn = 0R ,

that is
a1 = b1 ; a2 = b2 ; · · · ; an = bn ,

proving uniqueness. ■

35 Definition
An ordered basis E = {e1 , e2 , . . . , en } of a vector space Rn is a basis where the order of the ek has
been fixed. Given an ordered basis {e1 , e2 , . . . , en } of a vector space Rn , Theorem 34 ensures that
there are unique (a1 , a2 , . . . , an ) ∈ Rn such that

x = a1 e1 + a2 e2 + · · · + an en .

The ak ’s are called the coordinates of the vector x.

We will denote the coordinates of the vector x in the basis E by

[x]E

or simply [x].

36 Example
The standard ordered basis for R3 is E = {i, j, k}. The vector (1, 2, 3) ∈ R3 for example, has co-
ordinates (1, 2, 3)E . If the order of the basis were changed to the ordered basis F = {i, k, j}, then
(1, 2, 3) ∈ R3 would have coordinates (1, 3, 2)F .

Usually, when we give a coordinate representation for a vector x ∈ Rn , we assume that


we are using the standard basis.

37 Example
Consider the vector (1, 2, 3) ∈ R3 (given in standard representation). Since

(1, 2, 3) = −1 (1, 0, 0) − 1 (1, 1, 0) + 3 (1, 1, 1) ,

{ }
under the ordered basis E = (1, 0, 0) , (1, 1, 0) , (1, 1, 1) , (1, 2, 3) has coordinates (−1, −1, 3)E .
We write

(1, 2, 3) = (−1, −1, 3)E .

38 Example
The vectors of
{ }
E = (1, 1) , (1, 2)

are non-parallel, and so form a basis for R2 . So do the vectors

{ }
F = (2, 1) , (1, −1) .

Find the coordinates of (3, 4)E in the basis F .

Solution: ▶ We are seeking x, y such that

3 (1, 1) + 4 (1, 2) = x (2, 1) + y (1, −1) ,

that is, in matrix form,

[ 1  1 ] [ 3 ]   [ 2   1 ]
[ 1  2 ] [ 4 ] = [ 1  −1 ] (x, y)F .

Thus

(x, y)F = [ 2   1 ]⁻¹ [ 1  1 ] [ 3 ]
          [ 1  −1 ]   [ 1  2 ] [ 4 ]

        = [ 1/3   1/3 ] [ 1  1 ] [ 3 ]
          [ 1/3  −2/3 ] [ 1  2 ] [ 4 ]

        = [  2/3   1 ] [ 3 ]
          [ −1/3  −1 ] [ 4 ]

        = [  6 ]
          [ −5 ]F .

Let us check by expressing both vectors in the standard basis of R2 :

(3, 4)E = 3 (1, 1) + 4 (1, 2) = (7, 11) ,

(6, −5)F = 6 (2, 1) − 5 (1, −1) = (7, 11) .


In general let us consider bases E , F for the same vector space Rn . We want to convert XE to YF .
We let A be the matrix formed with the column vectors of E in the given order and B be the matrix
formed with the column vectors of F in the given order. Both A and B are invertible matrices since
E and F are bases, in view of Theorem 32. Then we must have

AXE = BYF =⇒ YF = B −1 AXE .

Also,
XE = A−1 BYF .
This prompts the following definition.

39 Definition
Let E = {x1 , x2 , . . . , xn } and F = {y1 , y2 , . . . , yn } be two ordered basis for a vector space Rn .
Let A ∈ Mn×n (R) be the matrix having the x’s as its columns and let B ∈ Mn×n (R) be the matrix
having the y’s as its columns. The matrix P = B −1 A is called the transition matrix from E to F
and the matrix P −1 = A−1 B is called the transition matrix from F to E.
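A short NumPy sketch (assuming NumPy is available) computes the transition matrix for the bases of Example 38 and reproduces the coordinates found there:

```python
import numpy as np

# Bases of Example 38, with the basis vectors as matrix columns:
# E = {(1,1), (1,2)},  F = {(2,1), (1,-1)}.
A = np.array([[1.0, 1.0],
              [1.0, 2.0]])   # columns are the E basis vectors
B = np.array([[2.0, 1.0],
              [1.0, -1.0]])  # columns are the F basis vectors

P = np.linalg.inv(B) @ A     # transition matrix from E to F

x_E = np.array([3.0, 4.0])   # coordinates (3, 4)_E
y_F = P @ x_E                # coordinates in the F basis
print(np.round(y_F))         # ~ (6, -5), matching (6, -5)_F
```

In practice one would solve B y = A x with `np.linalg.solve` rather than form the inverse explicitly, but the code above mirrors the formula YF = B⁻¹ A XE from the text.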
40 Example
Consider the basis of R3
{ }
E = (1, 1, 1) , (1, 1, 0) , (1, 0, 0) ,
{ }
F = (1, 1, −1) , (1, −1, 0) , (2, 0, 0) .
Find the transition matrix from E to F and also the transition matrix from F to E. Also find the
coordinates of (1, 2, 3)E in terms of F .

Solution: ▶ Let

A = [ 1  1  1 ]        [  1   1  2 ]
    [ 1  1  0 ] ,  B = [  1  −1  0 ] .
    [ 1  0  0 ]        [ −1   0  0 ]

The transition matrix from E to F is

P = B⁻¹ A

  = [  1   1  2 ]⁻¹ [ 1  1  1 ]
    [  1  −1  0 ]   [ 1  1  0 ]
    [ −1   0  0 ]   [ 1  0  0 ]

  = [  0    0   −1 ] [ 1  1  1 ]
    [  0   −1   −1 ] [ 1  1  0 ]
    [ 1/2  1/2   1 ] [ 1  0  0 ]

  = [ −1   0    0  ]
    [ −2  −1    0  ] .
    [  2   1   1/2 ]

The transition matrix from F to E is

P⁻¹ = [ −1   0    0  ]⁻¹   [ −1   0  0 ]
      [ −2  −1    0  ]   = [  2  −1  0 ] .
      [  2   1   1/2 ]     [  0   2  2 ]

Now,

YF = [ −1   0    0  ] [ 1 ]   [ −1   ]
     [ −2  −1    0  ] [ 2 ] = [ −4   ] ,
     [  2   1   1/2 ] [ 3 ]   [ 11/2 ]

that is, (1, 2, 3)E has coordinates (−1, −4, 11/2)F .

As a check, observe that in the standard basis for R3

(1, 2, 3)E = 1 (1, 1, 1) + 2 (1, 1, 0) + 3 (1, 0, 0) = (6, 3, 1) ,

(−1, −4, 11/2)F = −1 (1, 1, −1) − 4 (1, −1, 0) + (11/2) (2, 0, 0) = (6, 3, 1) .


1.3. Linear Transformations and Matrices


41 Definition
A linear transformation or homomorphism between Rn and Rm

Rn → Rm
L: ,
x 7→ L(x)

is a function which is

■ Additive: L(x + y) = L(x) + L(y),

■ Homogeneous: L(λx) = λL(x), for λ ∈ R.

It is clear that the above two conditions can be summarized conveniently into

L(x + λy) = L(x) + λL(y).

Assume that {xi }i∈[1;n] is an ordered basis for Rn , and E = {yi }i∈[1;m] an ordered basis for Rm .
Then

$$L(x_1) = a_{11}y_1 + a_{21}y_2 + \cdots + a_{m1}y_m = \begin{pmatrix} a_{11} \\ a_{21} \\ \vdots \\ a_{m1} \end{pmatrix}_E,$$

$$L(x_2) = a_{12}y_1 + a_{22}y_2 + \cdots + a_{m2}y_m = \begin{pmatrix} a_{12} \\ a_{22} \\ \vdots \\ a_{m2} \end{pmatrix}_E,$$

$$\vdots$$

$$L(x_n) = a_{1n}y_1 + a_{2n}y_2 + \cdots + a_{mn}y_m = \begin{pmatrix} a_{1n} \\ a_{2n} \\ \vdots \\ a_{mn} \end{pmatrix}_E.$$


42 Definition
The m × n matrix

$$M_L = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix}$$

formed by the column vectors above is called the matrix representation of the linear map L with
respect to the bases {xi }i∈[1;n] , {yi }i∈[1;m] .

43 Example
Consider L : R3 → R3 ,
L (x, y, z) = (x − y − z, x + y + z, z) .
Clearly L is a linear transformation.

1. Find the matrix corresponding to L under the standard ordered basis.

2. Find the matrix corresponding to L under the ordered basis (1, 0, 0) , (1, 1, 0) , (1, 0, 1) , for both
the domain and the image of L.

Solution: ▶

1. The matrix will be a 3 × 3 matrix. We have L (1, 0, 0) = (1, 1, 0), L (0, 1, 0) = (−1, 1, 0), and
L (0, 0, 1) = (−1, 1, 1), whence the desired matrix is
 
1 −1 −1
 
 
1 1 1 .
 
 
0 0 1

2. Call this basis E. We have

L (1, 0, 0) = (1, 1, 0) = 0 (1, 0, 0) + 1 (1, 1, 0) + 0 (1, 0, 1) = (0, 1, 0)E ,

L (1, 1, 0) = (0, 2, 0) = −2 (1, 0, 0) + 2 (1, 1, 0) + 0 (1, 0, 1) = (−2, 2, 0)E ,


and

L (1, 0, 1) = (0, 2, 1) = −3 (1, 0, 0) + 2 (1, 1, 0) + 1 (1, 0, 1) = (−3, 2, 1)E ,

whence the desired matrix is

$$\begin{pmatrix} 0 & -2 & -3 \\ 1 & 2 & 2 \\ 0 & 0 & 1 \end{pmatrix}.$$

◀
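The computation in Example 43(2) can be replayed mechanically. The sketch below is our code, not the book's: it finds E-coordinates by solving w = a(1, 0, 0) + b(1, 1, 0) + c(1, 0, 1), which gives b = w2 , c = w3 , a = w1 − w2 − w3 .

```python
# Sketch of Example 43(2): the matrix of L relative to the ordered basis
# E = {(1,0,0), (1,1,0), (1,0,1)}.  coords_E is our helper.

def L(v):
    x, y, z = v
    return (x - y - z, x + y + z, z)

E = [(1, 0, 0), (1, 1, 0), (1, 0, 1)]

def coords_E(w):
    # w = a(1,0,0) + b(1,1,0) + c(1,0,1)  =>  b = w[1], c = w[2],
    # and the first components force a = w[0] - w[1] - w[2].
    return (w[0] - w[1] - w[2], w[1], w[2])

cols = [coords_E(L(e)) for e in E]                  # columns of the matrix
M = [[cols[j][i] for j in range(3)] for i in range(3)]
print(M)  # [[0, -2, -3], [1, 2, 2], [0, 0, 1]]
```

The printed rows agree with the matrix found in the example.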


44 Definition
The column rank of A is the dimension of the space generated by the columns of A, while the row rank
of A is the dimension of the space generated by the rows of A.

A fundamental result in linear algebra is that the column rank and the row rank are always equal.
This number (i.e., the number of linearly independent rows or columns) is simply called the rank of
A.

1.4. Three Dimensional Space


In this section we particularize some definitions to the important case of three dimensional space

45 Definition
The 3-dimensional space is defined and denoted by
{ }
R3 = r = (x, y, z) : x ∈ R, y ∈ R, z ∈ R .

Having oriented the z axis upwards, we have a choice for the orientation of the x- and y-axes.
We adopt a convention known as a right-handed coordinate system, as in figure 1.1. Let us explain.
Put
i = (1, 0, 0), j = (0, 1, 0), k = (0, 0, 1),

and observe that


r = (x, y, z) = xi + yj + zk.

Figure 1.1. Right-handed system. Figure 1.2. Right Hand. Figure 1.3. Left-handed system.

1.4.1. Cross Product


The cross product of two vectors is defined only in three-dimensional space R3 . We will define a
generalization of the cross product for the n dimensional space in the section 1.5.
The standard cross product is defined as a product satisfying the following properties.


46 Definition
Let x, y, z be vectors in R3 , and let λ ∈ R be a scalar. The cross product × is a closed binary opera-
tion satisfying

Ê Anti-commutativity: x × y = −(y × x)

Ë Bilinearity:

(x + z) × y = x × y + z × y and x × (z + y) = x × z + x × y

Ì Scalar homogeneity: (λx) × y = x × (λy) = λ(x × y)

Í x×x=0

Î Right-hand Rule:
i × j = k, j × k = i, k × i = j.

It follows that the cross product is an operation that, given two non-parallel vectors on a plane,
allows us to “get out” of that plane.
47 Example
Find
(1, 0, −3) × (0, 1, 2) .

Solution: ▶ We have

(i − 3k) × (j + 2k) = i × j + 2i × k − 3k × j − 6k × k
= k − 2j + 3i + 0
= 3i − 2j + k

Hence
(1, 0, −3) × (0, 1, 2) = (3, −2, 1) .

The cross product of vectors in R3 is not associative, since

i × (i × j) = i × k = −j

but
(i × i) × j = 0 × j = 0.

Operating as in example 47 we obtain


Figure 1.4. Theorem 51. Figure 1.5. Area of a parallelogram.

48 Theorem
Let x = (x1 , x2 , x3 ) and y = (y1 , y2 , y3 ) be vectors in R3 . Then

x × y = (x2 y3 − x3 y2 )i + (x3 y1 − x1 y3 )j + (x1 y2 − x2 y1 )k.

Proof. Since i × i = j × j = k × k = 0, we only worry about the mixed products, obtaining,

x × y = (x1 i + x2 j + x3 k) × (y1 i + y2 j + y3 k)
= x1 y2 i × j + x1 y3 i × k + x2 y1 j × i + x2 y3 j × k
+x3 y1 k × i + x3 y2 k × j
= (x1 y2 − y1 x2 )i × j + (x2 y3 − x3 y2 )j × k + (x3 y1 − x1 y3 )k × i
= (x1 y2 − y1 x2 )k + (x2 y3 − x3 y2 )i + (x3 y1 − x1 y3 )j,

proving the theorem. ■

The cross product can also be expressed as the formal/mnemonic determinant

$$u \times v = \begin{vmatrix} i & j & k \\ u_1 & u_2 & u_3 \\ v_1 & v_2 & v_3 \end{vmatrix}$$

Using cofactor expansion we have

$$u \times v = \begin{vmatrix} u_2 & u_3 \\ v_2 & v_3 \end{vmatrix} i + \begin{vmatrix} u_3 & u_1 \\ v_3 & v_1 \end{vmatrix} j + \begin{vmatrix} u_1 & u_2 \\ v_1 & v_2 \end{vmatrix} k$$

Using the cross product, we may obtain a third vector simultaneously perpendicular to two other
vectors in space.
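The component formula of Theorem 48 fits in a three-line function. The sketch below (ours, for illustration) reproduces Example 47 and spot-checks the perpendicularity asserted by Theorem 49.

```python
# Sketch: cross product via the component formula of Theorem 48.

def cross(x, y):
    return (x[1]*y[2] - x[2]*y[1],
            x[2]*y[0] - x[0]*y[2],
            x[0]*y[1] - x[1]*y[0])

def dot(x, y):
    return sum(a*b for a, b in zip(x, y))

v = cross((1, 0, -3), (0, 1, 2))
print(v)                      # (3, -2, 1), as in Example 47
# Theorem 49: the cross product is perpendicular to both factors.
assert dot(v, (1, 0, -3)) == 0 and dot(v, (0, 1, 2)) == 0
```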


49 Theorem
x ⊥ (x × y) and y ⊥ (x × y), that is, the cross product of two vectors is simultaneously perpen-
dicular to both original vectors.

Proof. We will only check the first assertion, the second verification is analogous.

x•(x × y) = (x1 i + x2 j + x3 k)•((x2 y3 − x3 y2 )i


+(x3 y1 − x1 y3 )j + (x1 y2 − x2 y1 )k)
= x1 x2 y3 − x1 x3 y2 + x2 x3 y1 − x2 x1 y3 + x3 x1 y2 − x3 x2 y1
= 0,

completing the proof. ■

Although the cross product is not associative, we have, however, the following theorem.

50 Theorem

a × (b × c) = (a•c)b − (a•b)c.

Proof.

a × (b × c) = (a1 i + a2 j + a3 k) × ((b2 c3 − b3 c2 )i+


+(b3 c1 − b1 c3 )j + (b1 c2 − b2 c1 )k)
= a1 (b3 c1 − b1 c3 )k − a1 (b1 c2 − b2 c1 )j − a2 (b2 c3 − b3 c2 )k
+a2 (b1 c2 − b2 c1 )i + a3 (b2 c3 − b3 c2 )j − a3 (b3 c1 − b1 c3 )i
= (a1 c1 + a2 c2 + a3 c3 )(b1 i + b2 j + b3 k)+
(−a1 b1 − a2 b2 − a3 b3 )(c1 i + c2 j + c3 k)
= (a•c)b − (a•b)c,

completing the proof. ■
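Theorem 50 (the "BAC−CAB" rule) is easy to spot-check numerically; the following sketch (our code) verifies it on arbitrary integer vectors, re-implementing the component formula of Theorem 48.

```python
# Sketch: numerical spot-check of Theorem 50,
#   a x (b x c) = (a.c) b - (a.b) c.

def cross(x, y):
    return (x[1]*y[2] - x[2]*y[1],
            x[2]*y[0] - x[0]*y[2],
            x[0]*y[1] - x[1]*y[0])

def dot(x, y):
    return sum(p*q for p, q in zip(x, y))

a, b, c = (1, 2, 3), (4, 5, 6), (7, 8, 10)
lhs = cross(a, cross(b, c))
rhs = tuple(dot(a, c)*bi - dot(a, b)*ci for bi, ci in zip(b, c))
print(lhs, rhs)  # both (-12, 9, -2)
```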

51 Theorem
Let ∠(x, y) ∈ [0; π] be the convex angle between two vectors x and y. Then

∥x × y∥ = ∥x∥∥y∥ sin ∠(x, y).

Figure 1.6. The parallelepiped of Theorem 54. Figure 1.7. Example ??.

Proof. We have

||x × y||2 = (x2 y3 − x3 y2 )2 + (x3 y1 − x1 y3 )2 + (x1 y2 − x2 y1 )2


= x22 y32 − 2x2 y3 x3 y2 + x23 y22 + x23 y12 − 2x3 y1 x1 y3 +
+x21 y32 + x21 y22 − 2x1 y2 x2 y1 + x22 y12
= (x21 + x22 + x23 )(y12 + y22 + y32 ) − (x1 y1 + x2 y2 + x3 y3 )2
= ||x||2 ||y||2 − (x•y)2
= ||x||2 ||y||2 − ||x||2 ||y||2 cos2 ∠(x, y)
= ||x||2 ||y||2 sin2 ∠(x, y),

whence the theorem follows. ■

Theorem 51 has the following geometric significance: ∥x × y∥ is the area of the parallelogram
formed when the tails of the vectors are joined. See figure 1.5.
The following corollaries easily follow from Theorem 51.
52 Corollary
Two non-zero vectors x, y satisfy x × y = 0 if and only if they are parallel.

53 Corollary (Lagrange’s Identity)

||x × y||2 = ∥x∥2 ∥y∥2 − (x•y)2 .

The following result mixes the dot and the cross product.

54 Theorem
Let x, y, z, be linearly independent vectors in R3 . The signed volume of the parallelepiped spanned
by them is (x × y) • z.

Proof. See figure 1.6. The area of the base of the parallelepiped is the area of the parallelogram
determined by the vectors x and y, which has area ∥x × y∥. The altitude of the parallelepiped is
∥z∥ cos θ where θ is the angle between z and x × y. The volume of the parallelepiped is thus

∥x × y∥∥z∥ cos θ = (x × y)•z,

proving the theorem. ■

Since we may have used any of the faces of the parallelepiped, it follows that

(x × y)•z = (y × z)•x = (z × x)•y.

In particular, it is possible to “exchange” the cross and dot products:

x•(y × z) = (x × y)•z
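Both Theorem 54 and the cyclic invariance just displayed are easy to confirm on concrete vectors; the sketch below is our illustration, again built on the component formula of Theorem 48.

```python
# Sketch: the scalar triple product (x x y) . z of Theorem 54, and its
# invariance under cyclic permutation of the three vectors.

def cross(x, y):
    return (x[1]*y[2] - x[2]*y[1],
            x[2]*y[0] - x[0]*y[2],
            x[0]*y[1] - x[1]*y[0])

def dot(x, y):
    return sum(p*q for p, q in zip(x, y))

def triple(x, y, z):
    return dot(cross(x, y), z)

x, y, z = (1, 0, 0), (1, 2, 0), (1, 1, 3)
print(triple(x, y, z))        # 6: signed volume of the parallelepiped
assert triple(x, y, z) == triple(y, z, x) == triple(z, x, y)
```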

1.4.2. Cylindrical and Spherical Coordinates


Let B = {x1 , x2 , x3 } be an ordered basis for R3 . As we have already seen, for every v ∈ R3 there is a unique linear combination of the basis vectors that equals v:

v = xx1 + yx2 + zx3 .

The coordinate vector of v relative to B is the sequence of coordinates

[v]B = (x, y, z).

In this representation, the coordinates of a point (x, y, z) are determined by following straight
paths starting from the origin: first parallel to x1 , then parallel to x2 , then parallel to x3 , as
in Figure 1.7.
In curvilinear coordinate systems, these paths can be curved. We will provide the definition of
curvilinear coordinate systems in sections 3.10 and 8. In this section we provide some examples:
the three types of curvilinear coordinates considered here are polar coordinates in the plane, and
cylindrical and spherical coordinates in space.
Instead of referencing a point in terms of sides of a rectangular parallelepiped, as with Cartesian
coordinates, we will think of the point as lying on a cylinder or sphere. Cylindrical coordinates are
often used when there is symmetry around the z-axis; spherical coordinates are useful when there
is symmetry about the origin.
Let P = (x, y, z) be a point in Cartesian coordinates in R3 , and let P0 = (x, y, 0) be the projection
of P upon the xy-plane. Treating (x, y) as a point in R2 , let (r, θ) be its polar coordinates (see Figure
1.7.2). Let ρ be the length of the line segment from the origin to P , and let ϕ be the angle between
that line segment and the positive z-axis (see Figure 1.7.3). ϕ is called the zenith angle. Then the
cylindrical coordinates (r, θ, z) and the spherical coordinates (ρ, θ, ϕ) of P (x, y, z) are defined
as follows:1

1
This “standard” definition of spherical coordinates used by mathematicians results in a left-handed system. For this
reason, physicists usually switch the definitions of θ and ϕ to make (ρ, θ, ϕ) a right-handed system.

22
1.4. Three Dimensional Space

Cylindrical coordinates (r, θ, z):

x = r cos θ        r = √(x2 + y 2 )
y = r sin θ        θ = tan−1 (y/x)
z=z                z=z

where 0 ≤ θ ≤ π if y ≥ 0 and π < θ < 2π if y < 0.

Figure 1.8. Cylindrical coordinates.

Spherical coordinates (ρ, θ, ϕ):

x = ρ sin ϕ cos θ        ρ = √(x2 + y 2 + z 2 )
y = ρ sin ϕ sin θ        θ = tan−1 (y/x)
z = ρ cos ϕ              ϕ = cos−1 (z/√(x2 + y 2 + z 2 ))

where 0 ≤ θ ≤ π if y ≥ 0 and π < θ < 2π if y < 0.

Figure 1.9. Spherical coordinates.

Both θ and ϕ are measured in radians. Note that r ≥ 0, 0 ≤ θ < 2π, ρ ≥ 0 and 0 ≤ ϕ ≤ π. Also,
θ is undefined when (x, y) = (0, 0), and ϕ is undefined when (x, y, z) = (0, 0, 0).
55 Example
Convert the point (−2, −2, 1) from Cartesian coordinates to (a) cylindrical and (b) spherical coordi-
nates.
Solution: ▶ (a) r = √((−2)2 + (−2)2 ) = 2√2, θ = tan−1 (−2/−2) = tan−1 (1) = 5π/4, since
y = −2 < 0.

∴ (r, θ, z) = (2√2, 5π/4, 1)

(b) ρ = √((−2)2 + (−2)2 + 12 ) = √9 = 3, ϕ = cos−1 (1/3) ≈ 1.23 radians.

∴ (ρ, θ, ϕ) = (3, 5π/4, 1.23) ◀
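Example 55 can be reproduced with a few lines of standard-library Python (our sketch, not the book's). Instead of the quadrant rules above, `math.atan2` handles the sign logic of tan⁻¹(y/x) automatically, and reducing mod 2π puts θ in [0, 2π).

```python
# Sketch: Cartesian -> cylindrical/spherical conversion (cf. Example 55).
import math

def to_cylindrical(x, y, z):
    r = math.hypot(x, y)
    theta = math.atan2(y, x) % (2 * math.pi)  # quadrant-aware tan^{-1}(y/x)
    return r, theta, z

def to_spherical(x, y, z):
    rho = math.sqrt(x*x + y*y + z*z)
    theta = math.atan2(y, x) % (2 * math.pi)
    phi = math.acos(z / rho)                  # zenith angle in [0, pi]
    return rho, theta, phi

r, theta, z = to_cylindrical(-2, -2, 1)   # r = 2*sqrt(2), theta = 5*pi/4
rho, _, phi = to_spherical(-2, -2, 1)     # rho = 3, phi ~ 1.23 rad
```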

For cylindrical coordinates (r, θ, z), and constants r0 , θ0 and z0 , we see from Figure 1.10 that the
surface r = r0 is a cylinder of radius r0 centered along the z-axis, the surface θ = θ0 is a half-plane
emanating from the z-axis, and the surface z = z0 is a plane parallel to the xy-plane.
The unit vectors r̂, θ̂, k̂ at any point P are perpendicular to the surfaces r = constant, θ = con-
stant, z = constant through P in the directions of increasing r, θ, z. Note that the direction of the


Figure 1.10. Cylindrical coordinate surfaces: (a) r = r0 ; (b) θ = θ0 ; (c) z = z0 .

unit vectors r̂, θ̂ vary from point to point, unlike the corresponding Cartesian unit vectors.
Figure: the unit vectors r̂, θ̂, k̂ at a point P1 in cylindrical coordinates.

For spherical coordinates (ρ, θ, ϕ), and constants ρ0 , θ0 and ϕ0 , we see from Figure 1.11 that the
surface ρ = ρ0 is a sphere of radius ρ0 centered at the origin, the surface θ = θ0 is a half-plane
emanating from the z-axis, and the surface ϕ = ϕ0 is a circular cone whose vertex is at the origin.

Figure 1.11. Spherical coordinate surfaces: (a) ρ = ρ0 ; (b) θ = θ0 ; (c) ϕ = ϕ0 .

Figures 1.10(a) and 1.11(a) show how these coordinate systems got their names.
Sometimes the equation of a surface in Cartesian coordinates can be transformed into a simpler
equation in some other coordinate system, as in the following example.

56 Example
Write the equation of the cylinder x2 + y 2 = 4 in cylindrical coordinates.


Solution: ▶ Since r2 = x2 + y 2 , the equation in cylindrical coordinates is r = 2. ◀
Using spherical coordinates to write the equation of a sphere does not necessarily make the
equation simpler, if the sphere is not centered at the origin.

57 Example
Write the equation (x − 2)2 + (y − 1)2 + z 2 = 9 in spherical coordinates.

Solution: ▶ Multiplying the equation out gives

x2 + y 2 + z 2 − 4x − 2y + 5 = 9 , so we get
ρ2 − 4ρ sin ϕ cos θ − 2ρ sin ϕ sin θ − 4 = 0 , or
ρ2 − 2 sin ϕ (2 cos θ + sin θ ) ρ − 4 = 0

after combining terms. Note that this actually makes it more difficult to figure out what the surface
is, as opposed to the Cartesian equation where you could immediately identify the surface as a
sphere of radius 3 centered at (2, 1, 0). ◀
58 Example
Describe the surface given by θ = z in cylindrical coordinates.

Solution: ▶ This surface is called a helicoid. As the (vertical) z coordinate increases, so does the
angle θ, while the radius r is unrestricted. So this sweeps out a (ruled!) surface shaped like a spiral
staircase, where the spiral has an infinite radius. Figure 1.12 shows a section of this surface restricted
to 0 ≤ z ≤ 4π and 0 ≤ r ≤ 2. ◀

Figure 1.12. Helicoid θ = z


Exercises
A
For Exercises 1-4, find the (a) cylindrical and (b) spherical coordinates of the point whose Cartesian
coordinates are given.
1. (2, 2√3, −1)

2. (−5, 5, 6)

3. (√21, −√7, 0)

4. (0, 2, 2)

For Exercises 5-7, write the given equation in (a) cylindrical and (b) spherical coordinates.

5. x2 + y 2 + z 2 = 25

6. x2 + y 2 = 2y

7. x2 + y 2 + 9z 2 = 36

B
8. Describe the intersection of the surfaces whose equations in spherical coordinates are θ = π/2
and ϕ = π/4.
9. Show that for a ̸= 0, the equation ρ = 2a sin ϕ cos θ in spherical coordinates describes a
sphere centered at (a, 0, 0) with radius |a|.
C
10. Let P = (a, θ, ϕ) be a point in spherical coordinates, with a > 0 and 0 < ϕ < π. Then
P lies on the sphere ρ = a. Since 0 < ϕ < π, the line segment from the origin to P can
be extended to intersect the cylinder given by r = a (in cylindrical coordinates). Find the
cylindrical coordinates of that point of intersection.

11. Let P1 and P2 be points whose spherical coordinates are (ρ1 , θ1 , ϕ1 ) and (ρ2 , θ2 , ϕ2 ), respec-
tively. Let v1 be the vector from the origin to P1 , and let v2 be the vector from the origin to P2 .
Show that the angle γ between v1 and v2 satisfies

cos γ = cos ϕ1 cos ϕ2 + sin ϕ1 sin ϕ2 cos( θ2 − θ1 ).

This formula is used in electrodynamics to prove the addition theorem for spherical harmon-
ics, which provides a general expression for the electrostatic potential at a point due to a unit
charge. See pp. 100-102 in [36].

12. Show that the distance d between the points P1 and P2 with cylindrical coordinates (r1 , θ1 , z1 )
and (r2 , θ2 , z2 ), respectively, is
»
d= r12 + r22 − 2r1 r2 cos( θ2 − θ1 ) + (z2 − z1 )2 .

13. Show that the distance d between the points P1 and P2 with spherical coordinates (ρ1 , θ1 , ϕ1 )
and (ρ2 , θ2 , ϕ2 ), respectively, is
»
d= ρ21 + ρ22 − 2ρ1 ρ2 [sin ϕ1 sin ϕ2 cos( θ2 − θ1 ) + cos ϕ1 cos ϕ2 ] .


1.5. ⋆ Cross Product in the n-Dimensional Space


In this section we will answer the following question: Can one define a cross product in the n-
dimensional space so that it will have properties similar to the usual 3 dimensional one?
Clearly the answer depends which properties we require.
The most direct generalizations of the cross product are to define either:

■ a binary product × : Rn × Rn → Rn which takes as input two vectors and gives as output a
vector;

■ an (n − 1)-ary product × : Rn × · · · × Rn → Rn (with n − 1 factors) which takes as input
n − 1 vectors, and gives as output one vector.

Under the correct assumptions it can be proved that a binary product exists only in the dimen-
sions 3 and 7. A simple proof of this fact can be found in [51].
In this section we focus on the definition of the (n − 1)-ary product.

59 Definition
Let v1 , . . . , vn−1 be vectors in Rn , and let λ ∈ R be a scalar. Then we define their generalized cross
product vn = v1 × · · · × vn−1 as the (n − 1)-ary product satisfying

Ê Anti-commutativity: v1 × · · · × vi × vi+1 × · · · × vn−1 = −v1 × · · · × vi+1 × vi × · · · × vn−1 ,
i.e., interchanging two consecutive vectors introduces a minus sign.

Ë Bilinearity: v1 × · · · × (vi + x) × vi+1 × · · · × vn−1 = v1 × · · · × vi × vi+1 × · · · × vn−1 + v1 ×
· · · × x × vi+1 × · · · × vn−1 .

Ì Scalar homogeneity: v1 × · · · × (λvi ) × vi+1 × · · · × vn−1 = λ(v1 × · · · × vi × vi+1 × · · · × vn−1 ).

Í Right-hand Rule: e1 ×· · ·×en−1 = en , e2 ×· · ·×en = e1 , and so forth for cyclic permutations
of indices.

We will also write

×(v1 , . . . , vn−1 ) := v1 × · · · × vn−1 .

In coordinates, one can give a formula for this (n − 1)-ary analogue of the cross product in Rn
by:

60 Proposition
Let e1 , . . . , en be the canonical basis of Rn and let v1 , . . . , vn−1 be vectors in Rn , with coordinates

$$v_1 = (v_{11}, \ldots, v_{1n}), \quad \ldots, \quad v_{n-1} = (v_{n-1,1}, \ldots, v_{n-1,n})$$

in the canonical basis. Then

$$\times(v_1, \ldots, v_{n-1}) = \begin{vmatrix} v_{11} & \cdots & v_{1n} \\ \vdots & \ddots & \vdots \\ v_{n-1,1} & \cdots & v_{n-1,n} \\ e_1 & \cdots & e_n \end{vmatrix}$$

This formula is very similar to the determinant formula for the normal cross product in R3 except
that the row of basis vectors is the last row in the determinant rather than the first.
The reason for this is to ensure that the ordered vectors

(v1 , . . . , vn−1 , ×(v1 , . . . , vn−1 ))

have a positive orientation with respect to

(e1 , ..., en ).

61 Proposition
The vector product has the following properties:

Ê the vector ×(v1 , . . . , vn−1 ) is perpendicular to each vi ;

Ë the magnitude of ×(v1 , . . . , vn−1 ) is the volume of the solid defined by the vectors v1 , . . . , vn−1 ;

Ì $$v_n \bullet v_1 \times \cdots \times v_{n-1} = \begin{vmatrix} v_{11} & \cdots & v_{1n} \\ \vdots & \ddots & \vdots \\ v_{n-1,1} & \cdots & v_{n-1,n} \\ v_{n1} & \cdots & v_{nn} \end{vmatrix}.$$
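The determinant formula of Proposition 60 translates directly into code: component k of the product is the cofactor of ek in the last row. The sketch below is our illustration (the Leibniz-sum determinant is deliberately naive and only meant for small n); it recovers the ordinary cross product in R3 and the right-hand rule e1 × e2 × e3 = e4 in R4.

```python
# Sketch: the (n-1)-ary cross product of Proposition 60, by cofactor
# expansion of the determinant along its last row (the row of e_k's).
from itertools import permutations

def det(M):
    """Determinant via the Leibniz permutation-sum formula (small n only)."""
    n = len(M)
    total = 0
    for p in permutations(range(n)):
        # parity of p = parity of its inversion count
        inv = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
        term = (-1) ** inv
        for i in range(n):
            term *= M[i][p[i]]
        total += term
    return total

def gen_cross(vectors):
    n = len(vectors) + 1            # n-1 vectors living in R^n
    out = []
    for k in range(n):
        minor = [[v[j] for j in range(n) if j != k] for v in vectors]
        sign = (-1) ** (n - 1 + k)  # cofactor sign of the last-row entry
        out.append(sign * det(minor))
    return out

print(gen_cross([(1, 0, -3), (0, 1, 2)]))                     # [3, -2, 1]
print(gen_cross([(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0)]))  # [0, 0, 0, 1]
```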

1.6. Multivariable Functions


Let A ⊆ Rn . For most of this course, our concern will be functions of the form

f : A ⊆ Rn → Rm .

If m = 1, we say that f is a scalar field. If m ≥ 2, we say that f is a vector field.
We would like to develop a calculus analogous to the situation in R. In particular, we would like to
examine limits, continuity, differentiability, and integrability of multivariable functions. Needless
to say, the introduction of more variables greatly complicates the analysis. For example, recall that
the graph of a function f : A → Rm , A ⊆ Rn . is the set

{(x, f (x)) : x ∈ A)} ⊆ Rn+m .

If m + n > 3, we have an object of more than three dimensions! In the case n = 2, m = 1, we have
a surface in three-dimensional space. We will now briefly examine this case.

62 Definition
Let A ⊆ R2 and let f : A → R be a function. Given c ∈ R, the level curve at z = c is the curve
resulting from the intersection of the surface z = f (x, y) and the plane z = c, if there is such a
curve.

63 Example
The level curves of the surface f (x, y) = x2 + 3y 2 (an elliptic paraboloid) are the concentric ellipses

x2 + 3y 2 = c, c > 0.

Figure 1.13. Level curves for f (x, y) = x2 + 3y 2 .

1.6.1. Graphical Representation of Vector Fields


In this section we present a graphical representation of vector fields. For this intent, we limit our-
selves to low dimensional spaces.
A vector field v : R3 → R3 is an assignment of a vector v = v(x, y, z) to each point (x, y, z) of
a subset U ⊂ R3 . Each vector v of the field can be regarded as a ”bound vector” attached to the
corresponding point (x, y, z). In components

v(x, y, z) = v1 (x, y, z)i + v2 (x, y, z)j + v3 (x, y, z)k.

64 Example
Sketch each of the following vector fields.

a) F = xi + yj

b) F = −yi + xj

c) r = xi + yj + zk

Solution: ▶
a) The vector field is null at the origin; at other points, F is a vector pointing away from the origin;
b) This vector field is perpendicular to the first one at every point;
c) The vector field is null at the origin; at other points, F is a vector pointing away from the origin.
This is the 3-dimensional analogous of the first one. ◀


65 Example
Suppose that an object of mass M is located at the origin of a three-dimensional coordinate system.
We can think of this object as inducing a force field g in space. The effect of this gravitational field
is to attract any object placed in the vicinity of the origin toward it with a force that is governed by
Newton’s Law of Gravitation.
F = GmM/r2
To find an expression for g , suppose that an object of mass m is located at a point with position
vector r = xi + yj + zk .
The gravitational field is the gravitational force exerted per unit mass on a small test mass (that
won’t distort the field) at a point in the field. Like force, it is a vector quantity: a point mass M at the
origin produces the gravitational field
g = g(r) = −(GM/r3 ) r,

where r is the position relative to the origin and where r = ∥r∥. Its magnitude is

g = −GM/r2
and, due to the minus sign, at each point g is directed opposite to r, i.e. towards the central mass.

Exercises


Figure 1.14. Gravitational Field

66 Problem
Sketch the level curves for the following maps.

1. (x, y) 7→ x + y

2. (x, y) 7→ xy

3. (x, y) 7→ min(|x|, |y|)

4. (x, y) 7→ x3 − x

5. (x, y) 7→ x2 + 4y 2

6. (x, y) 7→ sin(x2 + y 2 )

7. (x, y) 7→ cos(x2 − y 2 )

67 Problem
Sketch the level surfaces for the following maps.

1. (x, y, z) 7→ x + y + z

2. (x, y, z) 7→ xyz

3. (x, y, z) 7→ min(|x|, |y|, |z|)

4. (x, y, z) 7→ x2 + y 2

5. (x, y, z) 7→ x2 + 4y 2

6. (x, y, z) 7→ sin(z − x2 − y 2 )

7. (x, y, z) 7→ x2 + y 2 + z 2

1.7. Levi-Civita and Einstein Index Notation


We need an efficient abbreviated notation to handle the complexity of mathematical structure be-
fore us. We will use indices of a given “type” to denote all possible values of given index ranges. By
index type we mean a collection of similar letter types, like those from the beginning or middle of
the Latin alphabet, or Greek letters

a, b, c, . . .
i, j, k, . . .
λ, β, γ . . .

each index of which is understood to have a given common range of successive integer values. Vari-
ations of these might be barred or primed letters or capital letters. For example, suppose we are
looking at linear transformations between Rn and Rm where m ̸= n. We would need two different

index ranges to denote vector components in the two vector spaces of different dimensions, say
i, j, k, ... = 1, 2, . . . , n and λ, β, γ, . . . = 1, 2, . . . , m.
In order to introduce the so called Einstein summation convention, we agree to the following
limitations on how indices may appear in formulas. A given index letter may occur only once in a
given term in an expression (call this a “free index”), in which case the expression is understood to
stand for the set of all such expressions for which the index assumes its allowed values, or it may
occur twice but only as a superscript-subscript pair (one up, one down) which will stand for the sum
over all allowed values (call this a “repeated index”). Here are some examples. If i, j = 1, . . . , n
then

$A^i$ ←→ n expressions: $A^1 , A^2 , \ldots, A^n$,

$A^i{}_i$ ←→ $\sum_{i=1}^n A^i{}_i$, a single expression with n terms
(this is called the trace of the matrix $A = (A^i{}_j)$),

$A^{ji}{}_i$ ←→ $\sum_{i=1}^n A^{1i}{}_i , \ldots, \sum_{i=1}^n A^{ni}{}_i$, n expressions each of which has n terms in the sum,

$A_{ii}$ ←→ no sum, just an expression for each i, if we want to refer to a specific
diagonal component (entry) of a matrix, for example,

$A_i v_i + A_i w_i = A_i (v_i + w_i )$, 2 sums of n terms each (left) or one combined sum (right).

A repeated index is a “dummy index,” like the dummy variable in a definite integral

$$\int_a^b f(x)\, dx = \int_a^b f(u)\, du.$$

We can change them at will: Ai i = Aj j .

In order to emphasize that we are using Einstein’s convention, we will enclose any
terms under consideration with ⌜ · ⌟.

68 Example
Using Einstein’s Summation convention, the dot product of two vectors x ∈ Rn and y ∈ Rn can be
written as

n
x•y = xi yi = ⌜xt yt ⌟.
i=1

69 Example
Given that ai , bj , ck , dl are the components of vectors in R3 , a, b, c, d respectively, what is the mean-
ing of
⌜ai bi ck dk ⌟?

Solution: ▶ We have

⌜ai bi ck dk ⌟ = ($\sum_{i=1}^3$ ai bi )⌜ck dk ⌟ = (a•b)⌜ck dk ⌟ = (a•b) $\sum_{k=1}^3$ ck dk = (a•b)(c•d). ◀

70 Example
Using Einstein’s Summation convention, the ij-th entry (AB)ij of the product of two matrices A ∈
Mm×n (R) and B ∈ Mn×r (R) can be written as

(AB)ij = $\sum_{k=1}^n$ Aik Bkj = ⌜Ait Btj ⌟.

71 Example
Using Einstein’s Summation convention, the trace tr (A) of a square matrix A ∈ Mn×n (R) is
tr (A) = $\sum_{t=1}^n$ Att = ⌜Att ⌟.
72 Example
Demonstrate, via Einstein’s Summation convention, that if A, B are two n × n matrices, then

tr (AB) = tr (BA) .

Solution: ▶ We have
tr (AB) = tr ((AB)ij ) = tr (⌜Aik Bkj ⌟) = ⌜⌜Atk Bkt ⌟⌟,

and

tr (BA) = tr ((BA)ij ) = tr (⌜Bik Akj ⌟) = ⌜⌜Btk Akt ⌟⌟,
from where the assertion follows, since the indices are dummy variables and can be exchanged. ◀
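Example 72 is easy to confirm on a concrete pair of matrices; the sketch below (ours) writes out the repeated-index sums of the Einstein convention explicitly.

```python
# Sketch: tr(AB) = tr(BA) (Example 72), with the implicit sums of the
# Einstein convention written out as explicit Python sums.

def trace(M):
    return sum(M[i][i] for i in range(len(M)))         # A^t_t

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n))  # A_ik B_kj
             for j in range(n)] for i in range(n)]

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(trace(matmul(A, B)), trace(matmul(B, A)))  # 69 69
```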
73 Definition (Kronecker’s Delta)
The symbol δij is defined as follows:

$$\delta_{ij} = \begin{cases} 0 & \text{if } i \neq j \\ 1 & \text{if } i = j. \end{cases}$$

74 Example
It is easy to see that ⌜δik δkj ⌟ = $\sum_{k=1}^3$ δik δkj = δij .
75 Example
We see that

⌜δij ai bj ⌟ = $\sum_{i=1}^3 \sum_{j=1}^3$ δij ai bj = $\sum_{k=1}^3$ ak bk = a•b.

Recall that a permutation of distinct objects is a reordering of them. The 3! = 6 permutations of the
index set {1, 2, 3} can be classified into even or odd. We start with the identity permutation 123 and
say it is even. Now, for any other permutation, we will say that it is even if it takes an even number of
transpositions (switching only two elements in one move) to regain the identity permutation, and
odd if it takes an odd number of transpositions to regain the identity permutation. Since

231 → 132 → 123, 312 → 132 → 123,

the permutations 123 (identity), 231, and 312 are even. Since

132 → 123, 321 → 123, 213 → 123,

the permutations 132, 321, and 213 are odd.


76 Definition (Levi-Civita’s Alternating Tensor)
The symbol εjkl is defined as follows:

$$\varepsilon_{jkl} = \begin{cases} 0 & \text{if } \{j, k, l\} \neq \{1, 2, 3\} \\ -1 & \text{if } \begin{pmatrix} 1 & 2 & 3 \\ j & k & l \end{pmatrix} \text{ is an odd permutation} \\ +1 & \text{if } \begin{pmatrix} 1 & 2 & 3 \\ j & k & l \end{pmatrix} \text{ is an even permutation} \end{cases}$$

In particular, if one subindex is repeated we have εrrs = εrsr = εsrr = 0. Also,

ε123 = ε231 = ε312 = 1, ε132 = ε321 = ε213 = −1.


77 Example
Using the Levi-Civita alternating tensor and Einstein’s summation convention, the cross product can
also be expressed, if i = e1 , j = e2 , k = e3 , then

x × y = ⌜εjkl (xk yl )ej ⌟.


78 Example
If A = [aij ] is a 3 × 3 matrix, then, using the Levi-Civitta alternating tensor,

det A = ⌜εijk a1i a2j a3k ⌟.


79 Example
Let x, y, z be vectors in R3 . Then

x•(y × z) = ⌜xi (y × z)i ⌟ = ⌜xi εikl (yk zl )⌟.
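For computation, the Levi-Civita symbol on indices 1..3 admits the closed form ε(i, j, k) = (i − j)(j − k)(k − i)/2 (our choice of formula, not the book's): it vanishes on repeated indices and gives ±1 according to the parity of the permutation. The sketch below uses it to realize the cross product of Example 77.

```python
# Sketch: Levi-Civita symbol and the cross product [x x y]_j = eps_jkl x_k y_l,
# with the implicit sums over k, l written out explicitly.

def eps(i, j, k):
    return (i - j) * (j - k) * (k - i) // 2

assert eps(1, 2, 3) == eps(2, 3, 1) == eps(3, 1, 2) == 1
assert eps(1, 3, 2) == eps(3, 2, 1) == eps(2, 1, 3) == -1
assert eps(1, 1, 2) == 0                    # repeated index

def cross(a, b):
    return tuple(sum(eps(i, j, k) * a[j - 1] * b[k - 1]
                     for j in (1, 2, 3) for k in (1, 2, 3))
                 for i in (1, 2, 3))

print(cross((1, 0, -3), (0, 1, 2)))  # (3, -2, 1), matching Example 47
```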

Identities Involving δ and ϵ

$$\epsilon_{ijk} \delta_{1i} \delta_{2j} \delta_{3k} = \epsilon_{123} = 1 \tag{1.11}$$

$$\epsilon_{ijk}\epsilon_{lmn} = \begin{vmatrix} \delta_{il} & \delta_{im} & \delta_{in} \\ \delta_{jl} & \delta_{jm} & \delta_{jn} \\ \delta_{kl} & \delta_{km} & \delta_{kn} \end{vmatrix} = \delta_{il}\delta_{jm}\delta_{kn} + \delta_{im}\delta_{jn}\delta_{kl} + \delta_{in}\delta_{jl}\delta_{km} - \delta_{il}\delta_{jn}\delta_{km} - \delta_{im}\delta_{jl}\delta_{kn} - \delta_{in}\delta_{jm}\delta_{kl} \tag{1.12}$$

$$\epsilon_{ijk}\epsilon_{lmk} = \begin{vmatrix} \delta_{il} & \delta_{im} \\ \delta_{jl} & \delta_{jm} \end{vmatrix} = \delta_{il}\delta_{jm} - \delta_{im}\delta_{jl} \tag{1.13}$$

The last identity is very useful in manipulating and simplifying tensor expressions and proving vec-
tor and tensor identities.

$$\epsilon_{ijk}\epsilon_{ljk} = 2\delta_{il} \tag{1.14}$$
$$\epsilon_{ijk}\epsilon_{ijk} = 2\delta_{ii} = 6 \tag{1.15}$$
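Since each index only runs over 1, 2, 3, the contracted identity (1.13) can be verified by brute force over all index values; the following sketch (our code) does exactly that, and also checks (1.15).

```python
# Sketch: brute-force check of eps_ijk eps_lmk = d_il d_jm - d_im d_jl (1.13)
# over all values of the free indices i, j, l, m.

def eps(i, j, k):
    return (i - j) * (j - k) * (k - i) // 2   # Levi-Civita on 1..3

def delta(i, j):
    return 1 if i == j else 0

R = (1, 2, 3)
ok = all(sum(eps(i, j, k) * eps(l, m, k) for k in R)
         == delta(i, l) * delta(j, m) - delta(i, m) * delta(j, l)
         for i in R for j in R for l in R for m in R)
print(ok)  # True

# (1.15): full contraction gives 6
assert sum(eps(i, j, k) ** 2 for i in R for j in R for k in R) == 6
```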

80 Example
Write the following identities using Einstein notation

1. A · (B × C) = C · (A × B) = B · (C × A)

2. A × (B × C) = B (A · C) − C (A · B)

Solution: ▶

A · (B × C) = C · (A × B) = B · (C × A)
⇕ ⇕ (1.16)
ϵijk Ai Bj Ck = ϵkij Ck Ai Bj = ϵjki Bj Ck Ai

A × (B × C) = B (A · C) − C (A · B)
⇕ (1.17)
ϵijk Aj ϵklm Bl Cm = Bi (Am Cm ) − Ci (Al Bl )

1.7.1. Common Definitions in Einstein Notation


The trace of a matrix A is:
tr (A) = Aii (1.18)

For a 3 × 3 matrix the determinant is:

$$\det(A) = \begin{vmatrix} A_{11} & A_{12} & A_{13} \\ A_{21} & A_{22} & A_{23} \\ A_{31} & A_{32} & A_{33} \end{vmatrix} = \epsilon_{ijk} A_{1i} A_{2j} A_{3k} = \epsilon_{ijk} A_{i1} A_{j2} A_{k3} \tag{1.19}$$

where the last two equalities represent the expansion of the determinant by row and by column.
Alternatively

$$\det(A) = \frac{1}{3!}\, \epsilon_{ijk} \epsilon_{lmn} A_{il} A_{jm} A_{kn} \tag{1.20}$$
For an n × n matrix the determinant is:

$$\det(A) = \epsilon_{i_1 \cdots i_n} A_{1 i_1} \cdots A_{n i_n} = \epsilon_{i_1 \cdots i_n} A_{i_1 1} \cdots A_{i_n n} = \frac{1}{n!}\, \epsilon_{i_1 \cdots i_n} \epsilon_{j_1 \cdots j_n} A_{i_1 j_1} \cdots A_{i_n j_n} \tag{1.21}$$
The inverse of a matrix A is:

$$\left[ A^{-1} \right]_{ij} = \frac{1}{2 \det(A)}\, \epsilon_{jmn} \epsilon_{ipq} A_{mp} A_{nq} \tag{1.22}$$
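Equations (1.19) and (1.22) can be turned into (deliberately naive) code and verified by multiplying back to the identity; the sketch below is our illustration, with a sample matrix of our choosing.

```python
# Sketch: det A via eps_ijk A_1i A_2j A_3k (Eq. 1.19) and
# [A^{-1}]_ij = eps_jmn eps_ipq A_mp A_nq / (2 det A)  (Eq. 1.22).

def eps(i, j, k):
    return (i - j) * (j - k) * (k - i) // 2   # Levi-Civita on 1..3

R = (1, 2, 3)

def det3(A):
    return sum(eps(i, j, k) * A[0][i-1] * A[1][j-1] * A[2][k-1]
               for i in R for j in R for k in R)

def inv3(A):
    d = det3(A)
    return [[sum(eps(j, m, n) * eps(i, p, q) * A[m-1][p-1] * A[n-1][q-1]
                 for m in R for n in R for p in R for q in R) / (2 * d)
             for j in R] for i in R]

A = [[2, 0, 1], [1, 1, 0], [0, 3, 1]]   # sample matrix (ours)
Ainv = inv3(A)
prod = [[sum(A[i][k] * Ainv[k][j] for k in range(3)) for j in range(3)]
        for i in range(3)]
print(det3(A))  # 5
assert all(abs(prod[i][j] - (1 if i == j else 0)) < 1e-9
           for i in range(3) for j in range(3))
```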

The multiplication of a matrix A by a vector b as defined in linear algebra is:

[Ab]i = Aij bj (1.23)

The multiplication of two n × n matrices A and B as defined in linear algebra is:

[AB]ik = Aij Bjk (1.24)

Again, here we are using matrix notation; otherwise a dot should be inserted between the two matrices.
The dot product of two vectors is:

A · B =δij Ai Bj = Ai Bi (1.25)

The cross product of two vectors is:

[A × B]i = ϵijk Aj Bk (1.26)

The scalar triple product of three vectors is:

$$A \cdot (B \times C) = \begin{vmatrix} A_1 & A_2 & A_3 \\ B_1 & B_2 & B_3 \\ C_1 & C_2 & C_3 \end{vmatrix} = \epsilon_{ijk} A_i B_j C_k \tag{1.27}$$

The vector triple product of three vectors is:


$$\left[ A \times (B \times C) \right]_i = \epsilon_{ijk} \epsilon_{klm} A_j B_l C_m \tag{1.28}$$

1.7.2. Examples of Using Einstein Notation to Prove Identities


81 Example
A · (B × C) = C · (A × B) = B · (C × A):

Solution: ▶

A · (B × C) = ϵijk Ai Bj Ck (Eq. ??)


= ϵkij Ai Bj Ck (Eq. 10.40)
= ϵkij Ck Ai Bj (commutativity)
= C · (A × B) (Eq. ??) (1.29)
= ϵjki Ai Bj Ck (Eq. 10.40)
= ϵjki Bj Ck Ai (commutativity)
= B · (C × A) (Eq. ??)

The negative permutations of these identities can be similarly obtained and proved by changing
the order of the vectors in the cross products which results in a sign change.

82 Example
Show that A × (B × C) = B (A · C) − C (A · B):

Solution: ▶
[A × (B × C)]i = ϵijk Aj [B × C]k (Eq. ??)
= ϵijk Aj ϵklm Bl Cm (Eq. ??)
= ϵijk ϵklm Aj Bl Cm (commutativity)
= ϵijk ϵlmk Aj Bl Cm (Eq. 10.40)
= (δil δjm − δim δjl ) Aj Bl Cm (Eq. 10.58)
= δil δjm Aj Bl Cm − δim δjl Aj Bl Cm (distributivity)
= (δil Bl )(δjm Aj Cm ) − (δim Cm )(δjl Aj Bl ) (commutativity and grouping)
= Bi (Am Cm ) − Ci (Al Bl ) (Eq. 10.32)
= Bi (A · C) − Ci (A · B) (Eq. 1.25)
= [B (A · C)]i − [C (A · B)]i (definition of index)
= [B (A · C) − C (A · B)]i (Eq. ??)
(1.30)
Because i is a free index the identity is proved for all components. Other variants of this identity
[e.g. (A × B) × C] can be obtained and proved similarly by changing the order of the factors in
the external cross product with adding a minus sign. ◀

Exercises
83 Problem
Let x, y, z be vectors in R3 . Demonstrate that

⌜xi yi zj ⌟ = (x•y)z.

2. Limits and Continuity

2.1. Some Topology
84 Definition
Let a ∈ Rn and let ε > 0. An open ball centered at a of radius ε is the set

Bε (a) = {x ∈ Rn : ∥x − a∥ < ε}.

An open box is a Cartesian product of open intervals

]a1 ; b1 [×]a2 ; b2 [× · · · ×]an−1 ; bn−1 [×]an ; bn [,

where the ak , bk are real numbers.

The set
Bε (a) = {x ∈ Rn : ∥x − a∥ < ε}.

is also called the ε-neighborhood of the point a.


Figure 2.1. Open ball in R2 . Figure 2.2. Open rectangle in R2 .

Figure 2.3. Open ball in R3 . Figure 2.4. Open box in R3 .

85 Example
An open ball in R is an open interval, an open ball in R2 is an open disk and an open ball in R3 is
an open sphere. An open box in R is an open interval, an open box in R2 is a rectangle without its
boundary and an open box in R3 is a box without its boundary.

86 Definition
A set A ⊆ Rn is said to be open if for every point belonging to it we can surround the point by a
sufficiently small open ball so that this ball lies completely within the set. That is, ∀a ∈ A ∃ε > 0
such that Bε (a) ⊆ A.

Figure 2.5. Open Sets

87 Example
The open interval ]−1; 1[ is open in R. The interval ]−1; 1] is not open, however, as no interval centred
at 1 is totally contained in ] − 1; 1].

88 Example
The region ] − 1; 1[×]0; +∞[ is open in R2 .

89 Example
The ellipsoidal region {(x, y) ∈ R² : x² + 4y² < 4} is open in R².

The reader will recognize that open boxes, open ellipsoids and their unions and finite intersections
are open sets in Rn .

90 Definition
A set F ⊆ Rn is said to be closed in Rn if its complement Rn \ F is open.

91 Example
The closed interval [−1; 1] is closed in R, as its complement, R \ [−1; 1] =] − ∞; −1[∪]1; +∞[ is open
in R. The interval ] − 1; 1] is neither open nor closed in R, however.

92 Example
The region [−1; 1] × [0; +∞[×[0; 2] is closed in R3 .

93 Lemma
If x1 and x2 are in Sr (x0 ) for some r > 0, then so is every point on the line segment from x1 to x2 .

Proof. The line segment is given by

x = tx2 + (1 − t)x1 , 0 < t < 1.

Suppose that r > 0. If


|x1 − x0 | < r, |x2 − x0 | < r,

and 0 < t < 1, then

|x − x0 | = |tx2 + (1 − t)x1 − tx0 − (1 − t)x0 | (2.1)


= |t(x2 − x0 ) + (1 − t)(x1 − x0 )| (2.2)
≤ t|x2 − x0 | + (1 − t)|x1 − x0 | (2.3)
< tr + (1 − t)r = r.

94 Definition
A sequence of points {xk } in Rn converges to the limit x if

lim |xk − x| = 0.
k→∞

In this case we write


lim xk = x.
k→∞

The next two theorems follow from this, the definition of distance in Rn , and what we already
know about convergence in R.

95 Theorem
Let
x = (x1 , x2 , . . . , xn ) and xk = (x1k , x2k , . . . , xnk ), k ≥ 1.

Then lim_{k→∞} xk = x if and only if

lim_{k→∞} xik = xi ,   1 ≤ i ≤ n;

that is, a sequence {xk } of points in Rn converges to a limit x if and only if the sequences of compo-
nents of {xk } converge to the respective components of x.
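Theorem 95 can be illustrated numerically: the sequence xk = (1/k, 1 + 2^−k) in R² converges to (0, 1) precisely because each component sequence converges. In the sketch below (plain Python; `dist` is our own Euclidean-distance helper), the distance to the limit is bounded by the sum of the component errors.

```python
import math

def dist(x, y):
    # Euclidean distance in R^n
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

limit = (0.0, 1.0)
for k in [1, 10, 100, 1000]:
    xk = (1.0 / k, 1.0 + 2.0 ** (-k))
    # |xk - x| <= |x1k - x1| + |x2k - x2|, so componentwise convergence
    # forces convergence of the whole sequence
    assert dist(xk, limit) <= abs(xk[0] - limit[0]) + abs(xk[1] - limit[1]) + 1e-12
print(dist((1e-6, 1.0 + 1e-9), limit) < 1e-5)  # → True
```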


96 Theorem (Cauchy’s Convergence Criterion)


A sequence {xk } in Rn converges if and only if for each ε > 0 there is an integer K such that

∥xr − xs ∥ < ε if r, s ≥ K.

97 Definition
Let S be a subset of R. Then

1. x0 is a limit point of S if every deleted neighborhood of x0 contains a point of S.

2. x0 is a boundary point of S if every neighborhood of x0 contains at least one point in S and


one not in S. The set of boundary points of S is the boundary of S, denoted by ∂S. The closure
of S, denoted by S̄, is S̄ = S ∪ ∂S.

3. x0 is an isolated point of S if x0 ∈ S and there is a neighborhood of x0 that contains no other


point of S.

4. x0 is exterior to S if x0 is in the interior of S c . The collection of such points is the exterior of S.

98 Example
Let S = (−∞, −1] ∪ (1, 2) ∪ {3}. Then

1. The set of limit points of S is (−∞, −1] ∪ [1, 2].

2. ∂S = {−1, 1, 2, 3} and S̄ = (−∞, −1] ∪ [1, 2] ∪ {3}.

3. 3 is the only isolated point of S.

4. The exterior of S is (−1, 1) ∪ (2, 3) ∪ (3, ∞).

99 Example
For n ≥ 1, let

In = [1/(2n + 1); 1/(2n)]   and   S = ∪_{n=1}^∞ In .

Then

1. The set of limit points of S is S ∪ {0}.

2. ∂S = {x | x = 0 or x = 1/n (n ≥ 2)} and S̄ = S ∪ {0}.

3. S has no isolated points.

4. The exterior of S is

(−∞, 0) ∪ [ ∪_{n=1}^∞ (1/(2n + 2), 1/(2n + 1)) ] ∪ (1/2, ∞).

100 Example
Let S be the set of rational numbers. Since every interval contains a rational number, every real num-
ber is a limit point of S; thus, S̄ = R. Since every interval also contains an irrational number, every
real number is a boundary point of S; thus ∂S = R. The interior and exterior of S are both empty,
and S has no isolated points. S is neither open nor closed.

The next theorem says that S is closed if and only if S = S̄ (Exercise 108).

101 Theorem
A set S is closed if and only if no point of S c is a limit point of S.

Proof. Suppose that S is closed and x0 ∈ S c . Since S c is open, there is a neighborhood of x0 that
is contained in S c and therefore contains no points of S. Hence, x0 cannot be a limit point of S. For
the converse, if no point of S c is a limit point of S then every point in S c must have a neighborhood
contained in S c . Therefore, S c is open and S is closed. ■

Theorem 101 is usually stated as follows.


102 Corollary
A set is closed if and only if it contains all its limit points.

A polygonal curve P is a curve specified by a sequence of points (A1 , A2 , . . . , An ) called its ver-
tices. The curve itself consists of the line segments connecting the consecutive vertices.


Figure 2.6. Polygonal curve

103 Definition
A domain is a path connected open set. A path connected set D means that any two points of this
set can be connected by a polygonal curve lying within D.

104 Definition
A simply connected domain is a path-connected domain where one can continuously shrink any
simple closed curve into a point while remaining in the domain.

Equivalently a pathwise-connected domain U ⊆ R3 is called simply connected if for every sim-


ple closed curve Γ ⊆ U , there exists a surface Σ ⊆ U whose boundary is exactly the curve Γ.

Exercises

Figure 2.7. Domains: (a) simply connected domain; (b) non-simply connected domain

105 Problem
Determine whether the following subsets of R² are open, closed, or neither, in R².

1. A = {(x, y) ∈ R² : |x| < 1, |y| < 1}

2. B = {(x, y) ∈ R² : |x| < 1, |y| ≤ 1}

3. C = {(x, y) ∈ R² : |x| ≤ 1, |y| ≤ 1}

4. D = {(x, y) ∈ R² : x² ≤ y ≤ x}

5. E = {(x, y) ∈ R² : xy > 1}

6. F = {(x, y) ∈ R² : xy ≤ 1}

7. G = {(x, y) ∈ R² : |y| ≤ 9, x < y²}

106 Problem (Putnam Exam 1969)
Let p(x, y) be a polynomial with real coefficients in the real variables x and y, defined over the entire plane R². What are the possibilities for the image (range) of p(x, y)?

107 Problem (Putnam 1998)
Let F be a finite collection of open disks in R² whose union contains a set E ⊆ R². Shew that there is a pairwise disjoint subcollection D1 , . . . , Dn in F such that

E ⊆ ∪_{j=1}^n 3Dj .

108 Problem
A set S is closed if and only if no point of S c is a limit point of S.

2.2. Limits
We will start with the notion of limit.
109 Definition
A function f : Rn → Rm is said to have a limit L ∈ Rm at a ∈ Rn if ∀ε > 0, ∃δ > 0 such that

0 < ||x − a|| < δ =⇒ ||f (x) − L|| < ε.

In such a case we write,


lim f (x) = L.
x→a

The notions of infinite limits, limits at infinity, and continuity at a point, are analogously defined.


110 Theorem
A function f : Rn → Rm has limit

lim_{x→a} f(x) = L

if and only if the coordinate functions f1 , f2 , . . . , fm have limits L1 , L2 , . . . , Lm respectively, i.e.,
fi → Li .

Proof.
We start with the following observation:

∥f(x) − L∥² = |f1 (x) − L1 |² + |f2 (x) − L2 |² + · · · + |fm (x) − Lm |².

So, if

|f1 (x) − L1 | < ε,  |f2 (x) − L2 | < ε,  . . . ,  |fm (x) − Lm | < ε,

then ∥f(x) − L∥ < √m ε.

Conversely, if ∥f(x) − L∥ < ε, then

|f1 (x) − L1 | < ε,  |f2 (x) − L2 | < ε,  . . . ,  |fm (x) − Lm | < ε.  ■

Limits in more than one dimension are perhaps trickier to find, as one must approach the test
point from infinitely many directions.
111 Example
Find lim_{(x,y)→(0,0)} ( x²y/(x² + y²), x⁵y³/(x⁶ + y⁴) ).

Solution: ▶ First we calculate lim_{(x,y)→(0,0)} x²y/(x² + y²). We use the sandwich theorem.
Observe that 0 ≤ x² ≤ x² + y², and so 0 ≤ x²/(x² + y²) ≤ 1. Thus

lim_{(x,y)→(0,0)} 0 ≤ lim_{(x,y)→(0,0)} |x²y/(x² + y²)| ≤ lim_{(x,y)→(0,0)} |y|,

and hence

lim_{(x,y)→(0,0)} x²y/(x² + y²) = 0.

Now we find lim_{(x,y)→(0,0)} x⁵y³/(x⁶ + y⁴).
Either |x| ≤ |y| or |x| ≥ |y|. Observe that if |x| ≤ |y|, then

|x⁵y³/(x⁶ + y⁴)| ≤ y⁸/y⁴ = y⁴.

If |y| ≤ |x|, then

|x⁵y³/(x⁶ + y⁴)| ≤ x⁸/x⁶ = x².

Thus

|x⁵y³/(x⁶ + y⁴)| ≤ max(y⁴, x²) ≤ y⁴ + x² −→ 0,

as (x, y) → (0, 0).

Aliter: Let X = x³, Y = y². Then

x⁵y³/(x⁶ + y⁴) = X^{5/3} Y^{3/2}/(X² + Y²).

Passing to polar coordinates X = ρ cos θ, Y = ρ sin θ, we obtain

x⁵y³/(x⁶ + y⁴) = X^{5/3} Y^{3/2}/(X² + Y²) = ρ^{5/3+3/2−2} |cos θ|^{5/3} |sin θ|^{3/2} ≤ ρ^{7/6} → 0,

as (x, y) → (0, 0). ◀
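The two sandwich bounds can be probed numerically. The sketch below (plain Python; evidence, not a proof) evaluates both component functions at points approaching the origin along several directions and checks the bounds used above; the lambdas `f1` and `f2` are our own names for the two components.

```python
import math

f1 = lambda x, y: x**2 * y / (x**2 + y**2)
f2 = lambda x, y: x**5 * y**3 / (x**6 + y**4)

for t in [1e-1, 1e-2, 1e-3]:
    for ang in [0.3, 1.0, 2.5]:          # a few approach directions
        x, y = t * math.cos(ang), t * math.sin(ang)
        assert abs(f1(x, y)) <= abs(y) + 1e-12          # the sandwich bound
        assert abs(f2(x, y)) <= y**4 + x**2 + 1e-12     # the max bound
print("bounds hold; both components shrink to 0")
```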



112 Example
Find lim_{(x,y)→(0,0)} (1 + x + y)/(x² − y²).

Solution: ▶ When y = 0,

(1 + x)/x² → +∞,

as x → 0. When x = 0,

(1 + y)/(−y²) → −∞,

as y → 0. The limit does not exist. ◀
113 Example
Find lim_{(x,y)→(0,0)} xy⁶/(x⁶ + y⁸).

Solution: ▶ Putting x = t⁴, y = t³, we find

xy⁶/(x⁶ + y⁸) = t²²/(t²⁴ + t²⁴) = 1/(2t²) → +∞,

as t → 0. But when y = 0, the function is 0. Thus the limit does not exist. ◀
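The two-path behavior is easy to see numerically: along the curve (t⁴, t³) the function grows without bound, while along the x-axis it is identically 0, so no limit can exist. A minimal sketch (plain Python, illustrative only):

```python
f = lambda x, y: x * y**6 / (x**6 + y**8)

# along the curve x = t^4, y = t^3 the value is 1/(2 t^2)
along_curve = [f(t**4, t**3) for t in [0.5, 0.1, 0.01]]
# along the x-axis the value is identically zero
along_axis = [f(t, 0.0) for t in [0.5, 0.1, 0.01]]
print(along_curve)  # grows without bound as t shrinks
print(along_axis)   # all zero
```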
114 Example
Find lim_{(x,y)→(0,0)} ((x − 1)² + y²) ln((x − 1)² + y²)/(|x| + |y|).

Figure 2.8. Example 114.   Figure 2.9. Example 115.
Figure 2.10. Example 116.  Figure 2.11. Example 113.
Solution: ▶ When y = 0 we have

2(x − 1)² ln(|1 − x|)/|x| ∼ −2x/|x|,

and so the function does not have a limit at (0, 0). ◀


115 Example
Find lim_{(x,y)→(0,0)} (sin(x⁴) + sin(y⁴))/√(x⁴ + y⁴).

Solution: ▶ |sin(x⁴) + sin(y⁴)| ≤ x⁴ + y⁴ and so

|(sin(x⁴) + sin(y⁴))/√(x⁴ + y⁴)| ≤ √(x⁴ + y⁴) → 0,

as (x, y) → (0, 0). ◀


116 Example
Find lim_{(x,y)→(0,0)} (sin x − y)/(x − sin y).

Solution: ▶ When y = 0 we obtain

sin x/x → 1,

as x → 0. When y = x the function is identically −1. Thus the limit does not exist. ◀

If f : R² → R, it may be that the limits

lim_{y→y0} ( lim_{x→x0} f(x, y) ),   lim_{x→x0} ( lim_{y→y0} f(x, y) ),

both exist. These are called the iterated limits of f as (x, y) → (x0 , y0 ). The following possibilities might occur.

1. If lim_{(x,y)→(x0 ,y0 )} f(x, y) exists, then any of the iterated limits that exists must equal it.

2. If the iterated limits exist and lim_{y→y0} ( lim_{x→x0} f(x, y) ) ≠ lim_{x→x0} ( lim_{y→y0} f(x, y) ), then lim_{(x,y)→(x0 ,y0 )} f(x, y) does not exist.

3. It may occur that lim_{y→y0} ( lim_{x→x0} f(x, y) ) = lim_{x→x0} ( lim_{y→y0} f(x, y) ), but that lim_{(x,y)→(x0 ,y0 )} f(x, y) does not exist.

4. It may occur that lim_{(x,y)→(x0 ,y0 )} f(x, y) exists, but one of the iterated limits does not.
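Possibility 2 is exhibited by f(x, y) = (x − y)/(x + y) (Problem 127). The sketch below (plain Python, a numerical approximation, not a proof) takes the inner limit by plugging in a value of the inner variable many orders of magnitude smaller than the outer one.

```python
f = lambda x, y: (x - y) / (x + y)

# lim_{x->0} ( lim_{y->0} f(x,y) ): first y -> 0 with x fixed, then x small
inner_y_then_x = f(1e-6, 1e-15)   # y << x
# lim_{y->0} ( lim_{x->0} f(x,y) ): first x -> 0 with y fixed, then y small
inner_x_then_y = f(1e-15, 1e-6)   # x << y
print(round(inner_y_then_x, 6), round(inner_x_then_y, 6))  # → 1.0 -1.0
```

Since the two iterated limits differ, the full limit at the origin cannot exist.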

Exercises

117 Problem
Sketch the domain of definition of (x, y) 7→ √(4 − x² − y²).

118 Problem
Sketch the domain of definition of (x, y) 7→ log(x + y).

119 Problem
Sketch the domain of definition of (x, y) 7→ 1/(x² + y²).

120 Problem
Find lim_{(x,y)→(0,0)} (x² + y²) sin(1/(xy)).

121 Problem
Find lim_{(x,y)→(0,2)} sin(xy)/x.

122 Problem
For what c will the function

f(x, y) = √(1 − x² − 4y²), if x² + 4y² ≤ 1;   c, if x² + 4y² > 1

be continuous everywhere on the xy-plane?

123 Problem
Find lim_{(x,y)→(0,0)} √(x² + y²) sin(1/(x² + y²)).

124 Problem
Find lim_{(x,y)→(+∞,+∞)} max(|x|, |y|)/√(x⁴ + y⁴).

125 Problem
Find lim_{(x,y)→(0,0)} (2x² sin y² + y⁴ e^{−|x|})/√(x² + y²).

126 Problem
Demonstrate that lim_{(x,y,z)→(0,0,0)} x²y²z²/(x² + y² + z²) = 0.

127 Problem
Prove that

lim_{x→0} ( lim_{y→0} (x − y)/(x + y) ) = 1 = − lim_{y→0} ( lim_{x→0} (x − y)/(x + y) ).

Does lim_{(x,y)→(0,0)} (x − y)/(x + y) exist?

128 Problem
Let

f(x, y) = x sin(1/x) + y sin(1/y), if x ≠ 0, y ≠ 0;   0 otherwise.

Prove that lim_{(x,y)→(0,0)} f(x, y) exists, but that the iterated limits lim_{x→0} ( lim_{y→0} f(x, y) ) and lim_{y→0} ( lim_{x→0} f(x, y) ) do not exist.

129 Problem
Prove that

lim_{x→0} ( lim_{y→0} x²y²/(x²y² + (x − y)²) ) = 0

and that

lim_{y→0} ( lim_{x→0} x²y²/(x²y² + (x − y)²) ) = 0,

but still lim_{(x,y)→(0,0)} x²y²/(x²y² + (x − y)²) does not exist.

2.3. Continuity


130 Definition
Let U ⊂ Rm be a domain, and f : U → Rd be a function. We say f is continuous at a if lim_{x→a} f(x) = f(a).

131 Definition
If f is continuous at every point a ∈ U , then we say f is continuous on U (or sometimes simply f is
continuous).

Again the standard results on continuity from one-variable calculus hold. Sums, products, quotients (with a non-zero denominator) and composites of continuous functions all yield continuous functions.
The notion of continuity is useful in computing limits along arbitrary curves.
132 Proposition
Let f : Rd → R be a function, and a ∈ Rd . Let γ : [0, 1] → Rd be any continuous function with
γ(0) = a and γ(t) ≠ a for all t > 0. If lim_{x→a} f(x) = l, then we must have lim_{t→0} f(γ(t)) = l.

133 Corollary
Suppose there exist two continuous functions γ1 , γ2 : [0, 1] → Rd such that for i ∈ {1, 2} we have γi (0) = a
and γi (t) ≠ a for all t > 0. If lim_{t→0} f(γ1 (t)) ≠ lim_{t→0} f(γ2 (t)), then lim_{x→a} f(x) cannot exist.
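Corollary 133 in action: for f(x, y) = x²y²/(x²y² + (x − y)²) (the function of Problem 129), the paths γ1(t) = (t, t) and γ2(t) = (t, 0) give different limits, so lim_{(x,y)→(0,0)} f cannot exist. A minimal sketch (plain Python, illustrative only):

```python
f = lambda x, y: x**2 * y**2 / (x**2 * y**2 + (x - y)**2)

for t in [0.1, 0.01, 0.001]:
    assert f(t, t) == 1.0     # along gamma_1(t) = (t, t): value is identically 1
    assert f(t, 0.0) == 0.0   # along gamma_2(t) = (t, 0): value is identically 0
print("different path limits: 1 vs 0, so the limit does not exist")
```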
t→0 t→0 x→a

134 Theorem
The vector function f : R → Rd is continuous at t0 if and only if the coordinate functions f1 , f2 , . . . , fd
are continuous at t0 .

The proof of this Theorem is very similar to the proof of Theorem 110.

Exercises

135 Problem
Sketch the domain of definition of (x, y) 7→ √(4 − x² − y²).

136 Problem
Sketch the domain of definition of (x, y) 7→ log(x + y).

137 Problem
Sketch the domain of definition of (x, y) 7→ 1/(x² + y²).

138 Problem
Find lim_{(x,y)→(0,0)} (x² + y²) sin(1/(xy)).

139 Problem
Find lim_{(x,y)→(0,2)} sin(xy)/x.

140 Problem
For what c will the function

f(x, y) = √(1 − x² − 4y²), if x² + 4y² ≤ 1;   c, if x² + 4y² > 1

be continuous everywhere on the xy-plane?

141 Problem
Find lim_{(x,y)→(0,0)} √(x² + y²) sin(1/(x² + y²)).

142 Problem
Find lim_{(x,y)→(+∞,+∞)} max(|x|, |y|)/√(x⁴ + y⁴).

143 Problem
Find lim_{(x,y)→(0,0)} (2x² sin y² + y⁴ e^{−|x|})/√(x² + y²).

144 Problem
Demonstrate that lim_{(x,y,z)→(0,0,0)} x²y²z²/(x² + y² + z²) = 0.

145 Problem
Prove that

lim_{x→0} ( lim_{y→0} (x − y)/(x + y) ) = 1 = − lim_{y→0} ( lim_{x→0} (x − y)/(x + y) ).

Does lim_{(x,y)→(0,0)} (x − y)/(x + y) exist?

146 Problem
Let

f(x, y) = x sin(1/x) + y sin(1/y), if x ≠ 0, y ≠ 0;   0 otherwise.

Prove that lim_{(x,y)→(0,0)} f(x, y) exists, but that the iterated limits lim_{x→0} ( lim_{y→0} f(x, y) ) and lim_{y→0} ( lim_{x→0} f(x, y) ) do not exist.

147 Problem
Prove that

lim_{x→0} ( lim_{y→0} x²y²/(x²y² + (x − y)²) ) = 0

and that

lim_{y→0} ( lim_{x→0} x²y²/(x²y² + (x − y)²) ) = 0,

but still lim_{(x,y)→(0,0)} x²y²/(x²y² + (x − y)²) does not exist.

2.4. ⋆ Compactness
The next definition generalizes the definition of the diameter of a circle or sphere.

148 Definition
If S is a nonempty subset of Rn , then
d(S) = sup { |x − Y| : x, Y ∈ S }

is the diameter of S. If d(S) < ∞, S is bounded; if d(S) = ∞, S is unbounded.

149 Theorem (Principle of Nested Sets)


If S1 , S2 , … are closed nonempty subsets of Rn such that

S1 ⊃ S2 ⊃ · · · ⊃ Sr ⊃ · · · (2.4)

and
lim d(Sr ) = 0, (2.5)
r→∞

then the intersection

I = ∩_{r=1}^∞ Sr

contains exactly one point.

Proof. Let {xr } be a sequence such that xr ∈ Sr (r ≥ 1). Because of (2.4), xr ∈ Sk if r ≥ k, so

|xr − xs | < d(Sk ) if r, s ≥ k.

From (2.5) and Theorem 96, xr converges to a limit x. Since x is a limit point of every Sk and every
Sk is closed, x is in every Sk (Corollary 102). Therefore, x ∈ I, so I ̸= ∅. Moreover, x is the only
point in I, since if Y ∈ I, then
|x − Y| ≤ d(Sk ), k ≥ 1,

and (2.5) implies that Y = x. ■

We can now prove the Heine–Borel theorem for Rn . This theorem concerns compact sets. As in
R, a compact set in Rn is a closed and bounded set.
Recall that a collection H of open sets is an open covering of a set S if

S ⊂ ∪ {H : H ∈ H}.

150 Theorem (Heine–Borel Theorem)


If H is an open covering of a compact subset S, then S can be covered by finitely many sets from H.

Proof. The proof is by contradiction. We first consider the case where n = 2, so that you can
visualize the method. Suppose that there is a covering H for S from which it is impossible to select
a finite subcovering. Since S is bounded, S is contained in a closed square

T = {(x, y) | a1 ≤ x ≤ a1 + L, a2 ≤ y ≤ a2 + L}

with sides of length L.


Bisecting the sides of T leads to four closed squares, T (1) , T (2) , T (3) , and T (4) , with sides of length L/2. Let

S (i) = S ∩ T (i) ,   1 ≤ i ≤ 4.

Each S (i) , being the intersection of closed sets, is closed, and


S = ∪_{i=1}^4 S (i) .

Moreover, H covers each S (i) , but at least one S (i) cannot be covered by any finite subcollection of
H, since if all the S (i) could be, then so could S. Let S1 be a set with this property, chosen from S (1) ,

52
2.4. ⋆ Compactness
S (2) , S (3) , and S (4) . We are now back to the situation we started from: a compact set S1 covered by
H, but not by any finite subcollection of H. However, S1 is contained in a square T1 with sides of
length L/2 instead of L. Bisecting the sides of T1 and repeating the argument, we obtain a subset
S2 of S1 that has the same properties as S, except that it is contained in a square with sides of
length L/4. Continuing in this way produces a sequence of nonempty closed sets S0 (= S), S1 , S2 ,
…, such that Sk ⊃ Sk+1 and d(Sk ) ≤ L/2^{k−1/2} (k ≥ 0). From Theorem 149, there is a point x in
∩_{k=1}^∞ Sk . Since x ∈ S, there is an open set H in H that contains x, and this H must also contain
some ε-neighborhood of x. Since every point y in Sk satisfies the inequality

|y − x| ≤ 2^{−k+1/2} L,

it follows that Sk ⊂ H for k sufficiently large. This contradicts our assumption on H, which led
us to believe that no Sk could be covered by a finite number of sets from H. Consequently, this
assumption must be false: H must have a finite subcollection that covers S. This completes the
proof for n = 2.
The idea of the proof is the same for n > 2. The counterpart of the square T is the hypercube
with sides of length L:
T = {(x1 , x2 , . . . , xn ) : ai ≤ xi ≤ ai + L, i = 1, 2, . . . , n}.

Halving the intervals of variation of the n coordinates x1 , x2 , …, xn divides T into 2n closed hyper-
cubes with sides of length L/2:
T (i) = {(x1 , x2 , . . . , xn ) : bi ≤ xi ≤ bi + L/2, 1 ≤ i ≤ n},

where bi = ai or bi = ai + L/2. If no finite subcollection of H covers S, then at least one of these
smaller hypercubes must contain a subset of S that is not covered by any finite subcollection of H.
Now the proof proceeds as for n = 2. ■

151 Theorem (Bolzano-Weierstrass)


Every bounded infinite set in Rn has at least one limit point.

Proof. We will show that a bounded nonempty set without a limit point can contain only a finite
number of points. If S has no limit points, then S is closed (Theorem 101) and every point x of S
has an open neighborhood Nx that contains no point of S other than x. The collection

H = {Nx : x ∈ S}

is an open covering for S. Since S is also bounded, Theorem 150 implies that S can be covered by a finite collec-
tion of sets from H, say Nx1 , …, Nxn . Since these sets contain only x1 , …, xn from S, it follows that
S = {x1 , . . . , xn }. ■

3. Differentiation of Vector Function
In this chapter we consider functions f : Rn → Rm . These functions are usually classified based on
the dimensions n and m:

Ê if the dimensions n and m are equal to 1, such a function is called a real function of a real
variable.

Ë if m = 1 and n > 1 the function is called a real-valued function of a vector variable or, more
briefly, a scalar field.

Ì if n = 1 and m > 1 it is called a vector-valued function of a real variable.

Í if n > 1 and m > 1 it is called a vector-valued function of a vector variable, or simply a vector
field.

We suppose that the cases of real functions of a real variable and of scalar fields have been studied
before.
This chapter extends the concepts of limit, continuity, and derivative to vector-valued functions
and vector fields.
We start with the simplest one: vector-valued functions.

3.1. Differentiation of Vector Function of a Real Variable
152 Definition
A vector-valued function of a real variable is a rule that associates a vector f(t) with a real number
t, where t is in some subset D of R (called the domain of f). We write f : D → Rn to denote that f is
a mapping of D into Rn .

f : R → Rn
( )
f(t) = f1 (t), f2 (t), . . . , fn (t)

with

f1 , f2 , . . . , fn : R → R

called the component functions of f.


In R3 vector-valued function of a real variable can be written in component form as

f(t) = f1 (t)i + f2 (t)j + f3 (t)k

or in the form
f(t) = (f1 (t), f2 (t), f3 (t))

for some real-valued functions f1 (t), f2 (t), f3 (t). The first form is often used when emphasizing
that f(t) is a vector, and the second form is useful when considering just the terminal points of
the vectors. By identifying vectors with their terminal points, a curve in space can be written as a
vector-valued function.
153 Example
For example, f(t) = ti + t2 j + t3 k is a vector-valued function in R3 , defined for all real numbers t. At
t = 1 the value of the function is the vector i + j + k, which in Cartesian coordinates has the terminal
point (1, 1, 1).


154 Example
Define f : R → R3 by f(t) = (cos t, sin t, t).
This is the equation of a helix. As the value of t increases, the terminal points of f(t)
trace out a curve spiraling upward. For each t, the x- and y-coordinates of f(t) are x = cos t and
y = sin t, so
x2 + y 2 = cos2 t + sin2 t = 1.

Thus, the curve lies on the surface of the right circular cylinder x2 + y 2 = 1.

It may help to think of vector-valued functions of a real variable in Rn as a generalization of the


parametric functions in R2 which you learned about in single-variable calculus. Much of the theory
of real-valued functions of a single real variable can be applied to vector-valued functions of a real
variable.

155 Definition
Let f(t) = (f1 (t), f2 (t), . . . , fn (t)) be a vector-valued function, and let a be a real number in its

domain. The derivative of f(t) at a, denoted by f′ (a) or (df/dt)(a), is the limit

f′ (a) = lim_{h→0} ( f(a + h) − f(a) ) / h
if that limit exists. Equivalently, f′ (a) = (f1′ (a), f2′ (a), . . . , fn′ (a)), if the component derivatives exist.
We say that f(t) is differentiable at a if f′ (a) exists.

The derivative of a vector-valued function is a tangent vector to the curve in space which the
function represents, and it lies on the tangent line to the curve (see Figure 3.1).
Figure 3.1. Tangent vector f′ (a) and tangent line L = f(a) + sf′ (a)

156 Example
Let f(t) = (cos t, sin t, t). Then f′ (t) = (− sin t, cos t, 1) for all t. The tangent line L to the curve at
f(2π) = (1, 0, 2π) is L = f(2π) + s f′ (2π) = (1, 0, 2π) + s(0, 1, 1), or in parametric form: x = 1,
y = s, z = 2π + s for −∞ < s < ∞.

Note that if u(t) is a scalar function and f(t) is a vector-valued function, then their product, de-
fined by (u f)(t) = u(t) f(t) for all t, is a vector-valued function (since the product of a scalar with a
vector is a vector).
The basic properties of derivatives of vector-valued functions are summarized in the following
theorem.
157 Theorem
Let f(t) and g(t) be differentiable vector-valued functions, let u(t) be a differentiable scalar func-
tion, let k be a scalar, and let c be a constant vector. Then

d
Ê c=0
dt

d df
Ë (kf) = k
dt dt

d df dg
Ì (f + g) = +
dt dt dt

d df dg
Í (f − g) = −
dt dt dt

d du df
Î (u f) = f+u
dt dt dt

d df dg
Ï (f•g) = •g + f•
dt dt dt

d df dg
Ð (f × g) = ×g + f×
dt dt dt
Proof. The proofs of parts (1)-(5) follow easily by differentiating the component functions and using
the rules for derivatives from single-variable calculus. We will prove part (6), and leave the proof of
part (7) as an exercise for the reader.
( ) ( )
(6) Write f(t) = f1 (t), f2 (t), f3 (t) and g(t) = g1 (t), g2 (t), g3 (t) , where the component functions
f1 (t), f2 (t), f3 (t), g1 (t), g2 (t), g3 (t) are all differentiable real-valued functions. Then
d/dt (f(t)•g(t)) = d/dt ( f1 (t) g1 (t) + f2 (t) g2 (t) + f3 (t) g3 (t) )

                 = d/dt (f1 (t) g1 (t)) + d/dt (f2 (t) g2 (t)) + d/dt (f3 (t) g3 (t))

                 = f1′ (t) g1 (t) + f1 (t) g1′ (t) + f2′ (t) g2 (t) + f2 (t) g2′ (t) + f3′ (t) g3 (t) + f3 (t) g3′ (t)

                 = ( f1′ (t), f2′ (t), f3′ (t) ) • ( g1 (t), g2 (t), g3 (t) )
                   + ( f1 (t), f2 (t), f3 (t) ) • ( g1′ (t), g2′ (t), g3′ (t) )

                 = (df/dt)(t)•g(t) + f(t)•(dg/dt)(t)   for all t. ■                       (3.2)

158 Example
Suppose f(t) is differentiable. Find the derivative of ∥f(t)∥.

Solution: ▶
Since ∥f(t)∥ is a real-valued function of t, by the Chain Rule for real-valued functions we know that

d/dt ∥f(t)∥² = 2 ∥f(t)∥ d/dt ∥f(t)∥.

But ∥f(t)∥² = f(t)•f(t), so d/dt ∥f(t)∥² = d/dt (f(t)•f(t)). Hence, we have

2 ∥f(t)∥ d/dt ∥f(t)∥ = d/dt (f(t)•f(t)) = f′ (t)•f(t) + f(t)•f′ (t)   by Theorem 157(6)
                     = 2 f′ (t)•f(t),

so if f(t) ≠ 0 then

d/dt ∥f(t)∥ = f′ (t)•f(t) / ∥f(t)∥. ◀
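This formula is easy to confirm numerically. The sketch below (plain Python, illustrative only) uses the helix f(t) = (cos t, sin t, t) and compares a central difference of ∥f(t)∥ against f′(t)•f(t)/∥f(t)∥.

```python
import math

f  = lambda t: (math.cos(t), math.sin(t), t)
fp = lambda t: (-math.sin(t), math.cos(t), 1.0)   # f' computed by hand

norm = lambda v: math.sqrt(sum(c * c for c in v))
dot  = lambda u, v: sum(a * b for a, b in zip(u, v))

t, h = 1.3, 1e-6
numeric = (norm(f(t + h)) - norm(f(t - h))) / (2 * h)   # d/dt ||f(t)||
exact = dot(fp(t), f(t)) / norm(f(t))                    # f'(t).f(t)/||f(t)||
print(abs(numeric - exact) < 1e-6)  # → True
```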


We know that ∥f(t)∥ is constant if and only if (d/dt) ∥f(t)∥ = 0 for all t. Also, f(t) ⊥ f′ (t) if and only if
f′ (t)•f(t) = 0. Thus, the above example shows this important fact:

159 Proposition
If f(t) ≠ 0, then ∥f(t)∥ is constant if and only if f(t) ⊥ f′ (t) for all t.

This means that if a curve lies completely on a sphere (or circle) centered at the origin, then the
tangent vector f′ (t) is always perpendicular to the position vector f(t).

160 Example
The spherical spiral f(t) = ( cos t/√(1 + a²t²), sin t/√(1 + a²t²), −at/√(1 + a²t²) ), for a ≠ 0.
Figure 3.2 shows the graph of the curve when a = 0.2. In the exercises, the reader will be asked to
show that this curve lies on the sphere x2 + y 2 + z 2 = 1 and to verify directly that f′ (t)•f(t) = 0 for
all t.
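Both claims about the spherical spiral, that it lies on the unit sphere and that f′(t)•f(t) = 0, can be checked numerically. In the sketch below (plain Python, not a substitute for the requested direct verification), f′ is approximated by a central difference rather than computed symbolically.

```python
import math

a = 0.2
def f(t):
    d = math.sqrt(1 + a * a * t * t)
    return (math.cos(t) / d, math.sin(t) / d, -a * t / d)

dot = lambda u, v: sum(x * y for x, y in zip(u, v))

h = 1e-6
for t in [-3.0, 0.5, 7.0]:
    assert abs(dot(f(t), f(t)) - 1.0) < 1e-12           # on the unit sphere
    # central-difference approximation of f'(t)
    fprime = tuple((p - m) / (2 * h) for p, m in zip(f(t + h), f(t - h)))
    assert abs(dot(fprime, f(t))) < 1e-6                # tangent is orthogonal to position
print("on the sphere x^2 + y^2 + z^2 = 1, and f'(t).f(t) = 0")
```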

Figure 3.2. Spherical spiral with a = 0.2

Just as in single-variable calculus, higher-order derivatives of vector-valued functions are obtained by repeatedly differentiating the (first) derivative of the function:

f′′ (t) = d/dt f′ (t),   f′′′ (t) = d/dt f′′ (t),   . . . ,   dⁿf/dtⁿ = d/dt ( d^{n−1}f/dt^{n−1} )   (for n = 2, 3, 4, . . .)

We can use vector-valued functions to represent physical quantities, such as velocity, accelera-
tion, force, momentum, etc. For example, let the real variable t represent time elapsed from some
initial time (t = 0), and suppose that an object of constant mass m is subjected to some force so
that it moves in space, with its position (x, y, z) at time t a function of t. That is, x = x(t), y = y(t),
z = z(t) for some real-valued functions x(t), y(t), z(t). Call r(t) = (x(t), y(t), z(t)) the position
vector of the object. We can define various physical quantities associated with the object as follows:1

position:      r(t) = (x(t), y(t), z(t))

velocity:      v(t) = ṙ(t) = r′ (t) = dr/dt = (x′ (t), y′ (t), z′ (t))

acceleration:  a(t) = v̇(t) = v′ (t) = dv/dt = r̈(t) = r′′ (t) = d²r/dt² = (x′′ (t), y′′ (t), z′′ (t))

momentum:      p(t) = mv(t)

force:         F(t) = ṗ(t) = p′ (t) = dp/dt   (Newton’s Second Law of Motion)

The magnitude ∥v(t)∥ of the velocity vector is called the speed of the object. Note that since the
mass m is a constant, the force equation becomes the familiar F(t) = ma(t).

161 Example
Let r(t) = (5 cos t, 3 sin t, 4 sin t) be the position vector of an object at time t ≥ 0. Find its (a) velocity
and (b) acceleration vectors.

Solution: ▶
(a) v(t) = ṙ(t) = (−5 sin t, 3 cos t, 4 cos t)
(b) a(t) = v̇(t) = (−5 cos t, −3 sin t, −4 sin t)

Note that ∥r(t)∥ = √(25 cos² t + 25 sin² t) = 5 for all t, so by Example 158 we know that r(t)•ṙ(t) =
0 for all t (which we can verify from part (a)). In fact, ∥v(t)∥ = 5 for all t also. And not only does
r(t) lie on the sphere of radius 5 centered at the origin, but perhaps not so obvious is that it lies
completely within a circle of radius 5 centered at the origin. Also, note that a(t) = −r(t). It turns
out (see Exercise 16) that whenever an object moves in a circle with constant speed, the acceleration
vector will point in the opposite direction of the position vector (i.e. towards the center of the circle). ◀
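The observations of Example 161 are easy to confirm numerically: ∥r(t)∥ = ∥v(t)∥ = 5, r(t)•v(t) = 0, and a(t) = −r(t) at sample times. A minimal sketch (plain Python, illustrative only):

```python
import math

r = lambda t: (5 * math.cos(t), 3 * math.sin(t), 4 * math.sin(t))
v = lambda t: (-5 * math.sin(t), 3 * math.cos(t), 4 * math.cos(t))
a = lambda t: (-5 * math.cos(t), -3 * math.sin(t), -4 * math.sin(t))

norm = lambda u: math.sqrt(sum(c * c for c in u))
dot  = lambda u, w: sum(x * y for x, y in zip(u, w))

for t in [0.0, 1.0, 2.7]:
    assert abs(norm(r(t)) - 5.0) < 1e-12       # on the sphere of radius 5
    assert abs(norm(v(t)) - 5.0) < 1e-12       # constant speed 5
    assert abs(dot(r(t), v(t))) < 1e-12        # position orthogonal to velocity
    assert all(abs(ai + ri) < 1e-12 for ai, ri in zip(a(t), r(t)))  # a = -r
print("all checks pass")
```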

3.1.1. Antiderivatives
162 Definition
An antiderivative of a vector-valued function f is a vector-valued function F such that

F′ (t) = f(t).
ˆ
The indefinite integral f(t) dt of a vector-valued function f is the general antiderivative of f
and represents the collection of all antiderivatives of f.

1
We will often use the older dot notation for derivatives when physics is involved.

The same reasoning that allows us to differentiate a vector-valued function componentwise applies to integrating as well. Recall that the integral of a sum is the sum of the integrals and also that we can remove constant factors from integrals. So, given f(t) = x(t)i + y(t)j + z(t)k, it follows that we can integrate componentwise. Expressed more formally:
If f(t) = x(t)i + y(t)j + z(t)k, then

∫ f(t) dt = ( ∫ x(t) dt ) i + ( ∫ y(t) dt ) j + ( ∫ z(t) dt ) k.

163 Proposition
Two antiderivatives of f(t) differ by a constant vector, i.e., if F(t) and G(t) are antiderivatives of f then there exists
c ∈ Rn such that

F(t) − G(t) = c
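Componentwise integration can be illustrated with a simple midpoint Riemann sum: integrating f(t) = (cos t, 2t, 3t²) over [0, 1] component by component should approximate the exact values (sin 1, 1, 1). The sketch below is a rough numerical illustration (plain Python; `integrate` is our own helper), not a symbolic antiderivative.

```python
import math

f = lambda t: (math.cos(t), 2 * t, 3 * t**2)

def integrate(f, a, b, n=100000):
    """Midpoint-rule integral of a vector-valued function, component by component."""
    h = (b - a) / n
    total = [0.0, 0.0, 0.0]
    for k in range(n):
        val = f(a + (k + 0.5) * h)
        for i in range(3):
            total[i] += val[i] * h
    return tuple(total)

approx = integrate(f, 0.0, 1.0)
exact = (math.sin(1.0), 1.0, 1.0)   # antiderivative (sin t, t^2, t^3) at t = 1
print(all(abs(x - y) < 1e-6 for x, y in zip(approx, exact)))  # → True
```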

Exercises

164 Problem
For Exercises 1-4, calculate f′ (t) and find the tangent line at f(0).

1. f(t) = (t + 1, t² + 1, t³ + 1)

2. f(t) = (e^t + 1, e^{2t} + 1, e^{t²} + 1)

3. f(t) = (cos 2t, sin 2t, t)

4. f(t) = (sin 2t, 2 sin² t, 2 cos t)

For Exercises 5-6, find the velocity v(t) and acceleration a(t) of an object with the given position vector r(t).

5. r(t) = (t, t − sin t, 1 − cos t)

6. r(t) = (3 cos t, 2 sin t, 1)

165 Problem

1. Let f(t) = ( cos t/√(1 + a²t²), sin t/√(1 + a²t²), −at/√(1 + a²t²) ), with a ≠ 0.

   (a) Show that ∥f(t)∥ = 1 for all t.

   (b) Show directly that f′ (t)•f(t) = 0 for all t.

2. If f′ (t) = 0 for all t in some interval (a, b), show that f(t) is a constant vector in (a, b).

3. For a constant vector c ≠ 0, the function f(t) = tc represents a line parallel to c.

   (a) What kind of curve does g(t) = t³c represent? Explain.

   (b) What kind of curve does h(t) = e^t c represent? Explain.

   (c) Compare f′ (0) and g′ (0). Given your answer to part (a), how do you explain the difference in the two derivatives?

4. Show that d/dt ( f × df/dt ) = f × d²f/dt².

5. Let a particle of (constant) mass m have position vector r(t), velocity v(t), acceleration a(t) and momentum p(t) at time t. The angular momentum L(t) of the particle with respect to the origin at time t is defined as L(t) = r(t) × p(t). If F(t) is the force acting on the particle at time t, then define the torque N(t) acting on the particle with respect to the origin as N(t) = r(t) × F(t). Show that L′ (t) = N(t).

6. Show that d/dt ( f•(g × h) ) = (df/dt)•(g × h) + f•( (dg/dt) × h ) + f•( g × (dh/dt) ).

7. The Mean Value Theorem does not hold for vector-valued functions: Show that for f(t) = (cos t, sin t, t), there is no t in the interval (0, 2π) such that

   f′ (t) = ( f(2π) − f(0) ) / (2π − 0).

3.2. Kepler Law


Why do planets have elliptical orbits? In this section we will solve the two-body problem, i.e., describe the trajectory of two bodies that interact under the force of gravity. In particular we will prove that the trajectory of a body is an ellipse with a focus at the other body.

Figure 3.3. Two Body System

We will make two simplifying assumptions:

Ê The bodies are spherically symmetric and can be treated as point masses.

Ë There are no external or internal forces acting upon the bodies other than their mutual grav-
itation.

Two point mass objects with masses m1 and m2 and position vectors x1 and x2 relative to some
inertial reference frame experience gravitational forces:
m₁ẍ₁ = −(Gm₁m₂/r²) r̂

m₂ẍ₂ = (Gm₁m₂/r²) r̂

where x is the relative position vector of mass 1 with respect to mass 2, expressed as

x = x₁ − x₂,

r̂ is the unit vector in that direction, and r is the length of that vector.
Dividing by their respective masses and subtracting the second equation from the first yields the equation of motion for the acceleration of the first object with respect to the second:

ẍ = −(µ/r²) r̂ (3.3)

where µ is the parameter

µ = G(m₁ + m₂).

With the versor r̂ we can write r = r r̂, and with this notation equation 3.3 can be written

r̈ = −(µ/r²) r̂. (3.4)
For movement under any central force, i.e. a force parallel to r, the relative angular momentum

L = r × ṙ

stays constant. This fact can be easily deduced:


L̇ = d/dt (r × ṙ) = ṙ × ṙ + r × r̈ = 0 + 0 = 0

Since L = r × ṙ is constant, the position vector and the velocity always lie in the fixed plane through the origin orthogonal to L. This implies the vector function traces a plane curve.

Writing r = r r̂, it follows that

L = r × ṙ = r r̂ × (d/dt)(r r̂) = r r̂ × (r r̂˙ + ṙ r̂) = r²(r̂ × r̂˙) + r ṙ(r̂ × r̂) = r² r̂ × r̂˙.

Now consider

r̈ × L = −(µ/r²) r̂ × (r² r̂ × r̂˙) = −µ r̂ × (r̂ × r̂˙) = −µ[(r̂•r̂˙)r̂ − (r̂•r̂)r̂˙].

Since r̂•r̂ = |r̂|² = 1 we have that

r̂•r̂˙ = ½(r̂•r̂˙ + r̂˙•r̂) = ½ (d/dt)(r̂•r̂) = 0.

Substituting these values into the previous equation, we have:

r̈ × L = µr̂˙

Now, since L is constant, r̈ × L = (d/dt)(ṙ × L); integrating both sides yields

ṙ × L = µr̂ + c

where c is a constant vector. Taking the inner product of both sides of the previous equation with r yields an interesting result:

r•(ṙ × L) = r•(µr̂ + c) = µ r•r̂ + r•c = µr(r̂•r̂) + rc cos(θ) = r(µ + c cos(θ)),

where θ is the angle between r and c. Solving for r:

r = r•(ṙ × L)/(µ + c cos(θ)) = (r × ṙ)•L/(µ + c cos(θ)) = |L|²/(µ + c cos(θ)).

Finally, we note that (r, θ) are effectively the polar coordinates of the vector function. Making the substitutions p = |L|²/µ and e = c/µ, we arrive at the equation

r = p/(1 + e cos θ). (3.5)
Equation 3.5 is the equation in polar coordinates of a conic section with one focus at the origin.
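The derivation above can be checked numerically. The following sketch (not part of the original text) integrates the equation of motion (3.4) in the plane with a hand-rolled RK4 scheme and verifies that the resulting orbit satisfies the conic equation (3.5); the value µ = 1 and the initial conditions are arbitrary choices.

```python
import math

# Integrate r'' = -(mu/r^2) r_hat with RK4 and check r = p/(1 + e cos theta).
# mu and the starting state are hypothetical sample values.
mu = 1.0

def accel(x, y):
    r = math.hypot(x, y)
    return (-mu * x / r**3, -mu * y / r**3)

def rk4_step(state, h):
    def f(s):
        x, y, vx, vy = s
        ax, ay = accel(x, y)
        return (vx, vy, ax, ay)
    k1 = f(state)
    k2 = f(tuple(s + 0.5*h*k for s, k in zip(state, k1)))
    k3 = f(tuple(s + 0.5*h*k for s, k in zip(state, k2)))
    k4 = f(tuple(s + h*k for s, k in zip(state, k3)))
    return tuple(s + h/6*(a + 2*b + 2*c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

# start at an apsis: r = 1, purely tangential speed 1.2 (a bound ellipse)
state = (1.0, 0.0, 0.0, 1.2)
L = state[0]*state[3] - state[1]*state[2]   # z-component of r x r' (constant)
p = L*L / mu                                # semi-latus rectum |L|^2 / mu
e = p / 1.0 - 1.0                           # from r = p/(1 + e cos 0) at start

h, max_err = 1e-3, 0.0
for _ in range(20000):
    state = rk4_step(state, h)
    x, y = state[0], state[1]
    r, theta = math.hypot(x, y), math.atan2(y, x)
    max_err = max(max_err, abs(r - p / (1.0 + e*math.cos(theta))))

print(max_err < 1e-5)  # the conic equation (3.5) holds along the orbit
```

The angular momentum L is computed once and never updated, exactly because the derivation shows it is a constant of the motion.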

3.3. Definition of the Derivative of Vector Function


Observe that since we may not divide by vectors, the corresponding definition in higher dimensions
involves quotients of norms.

166 Definition
Let A ⊆ Rn be an open set. A function f : A → Rm is said to be differentiable at a ∈ A if there is a
linear transformation, called the derivative of f at a, Da (f) : Rn → Rm such that

lim_{x→a} ∥f(x) − f(a) − Da(f)(x − a)∥ / ∥x − a∥ = 0.

If we denote by E(h) the difference (error)

E(h) := f(a + h) − f(a) − Da(f)(h),

then we may reformulate the definition of the derivative as

167 Definition
A function f : A → Rm is said to be differentiable at a ∈ A if there is a linear transformation Da (f)
such that
f(a + h) − f(a) = Da (f)(h) + E(h),


with E(h) a function that satisfies lim_{h→0} E(h)/∥h∥ = 0.

The condition for differentiability at a is also equivalent to

f(x) − f(a) = Da(f)(x − a) + E(x − a),

with E(x − a) a function that satisfies lim_{x→a} E(x − a)/∥x − a∥ = 0.

168 Theorem
The derivative Da (f ) is uniquely determined.

Proof. Let L : Rⁿ → Rᵐ be another linear transformation satisfying definition 166. We must prove that ∀v ∈ Rⁿ, L(v) = Da(f)(v). Since A is open, a + h ∈ A for sufficiently small ∥h∥. By definition, we have

f(a + h) − f(a) = Da(f)(h) + E₁(h),

with lim_{h→0} E₁(h)/∥h∥ = 0, and

f(a + h) − f(a) = L(h) + E₂(h),

with lim_{h→0} E₂(h)/∥h∥ = 0. Now, observe that

Da(f)(h) − L(h) = (Da(f)(h) − f(a + h) + f(a)) + (f(a + h) − f(a) − L(h)) = E₂(h) − E₁(h).

By the triangle inequality,

∥Da(f)(h) − L(h)∥ ≤ ∥E₁(h)∥ + ∥E₂(h)∥,

and hence ∥Da(f)(h) − L(h)∥/∥h∥ → 0 as h → 0. Now fix v ≠ 0 and put h = tv with t → 0⁺. By the linearity of Da(f) and L,

∥Da(f)(v) − L(v)∥/∥v∥ = ∥Da(f)(tv) − L(tv)∥/∥tv∥ → 0.

This means that ∥L(v) − Da(f)(v)∥ = 0, i.e., L(v) = Da(f)(v), completing the proof. ■

169 Example
If L : Rn → Rm is a linear transformation, then Da (L) = L, for any a ∈ Rn .

Solution: ▶ Since Rⁿ is an open set, we know that Da(L) is uniquely determined. Thus if L satisfies
definition 166, then the claim is established. But by linearity
||L(x) − L(a) − L(x − a)|| = ||L(x) − L(a) − L(x) + L(a)|| = ∥0∥ = 0,
whence the claim follows. ◀
170 Example
Let
R3 × R3 → R
f:
(x, y) 7→ x•y
be the usual dot product in R3 . Show that f is differentiable and that
D(x,y) f(h, k) = x•k + h•y.

Solution: ▶ We have

f(x + h, y + k) − f(x, y) = (x + h)•(y + k) − x•y
= x•y + x•k + h•y + h•k − x•y
= x•k + h•y + h•k.

Since x•k + h•y is a linear function of (h, k), if we choose E(h, k) = h•k we have, by the Cauchy–Bunyakovsky–Schwarz inequality, |h•k| ≤ ∥h∥∥k∥ and hence

lim_{(h,k)→(0,0)} |E(h, k)|/∥(h, k)∥ ≤ lim_{(h,k)→(0,0)} ∥k∥ = 0,

which proves the assertion. ◀
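Example 170 can be illustrated numerically: the sketch below (not from the original text) checks that the error term E(h, k) = h•k is o(∥(h, k)∥); the sample vectors are arbitrary random choices.

```python
import numpy as np

# For f(x, y) = x . y the error E(h, k) = h . k satisfies
# |E| / ||(h, k)|| -> 0; shrinking (h, k) by 10x shrinks the ratio by 10x.
rng = np.random.default_rng(0)
x, y = rng.standard_normal(3), rng.standard_normal(3)
h, k = rng.standard_normal(3), rng.standard_normal(3)

ratios = []
for t in (1e-1, 1e-2, 1e-3):
    E = np.dot(x + t*h, y + t*k) - np.dot(x, y) - (np.dot(x, t*k) + np.dot(t*h, y))
    norm_hk = np.linalg.norm(np.concatenate((t*h, t*k)))
    ratios.append(abs(E) / norm_hk)

print(ratios)  # decreasing roughly by a factor of 10 each step
```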
Just like in the one-variable case, differentiability at a point implies continuity at that point.

171 Theorem
Suppose A ⊆ Rⁿ is open and f : A → Rᵐ is differentiable on A. Then f is continuous on A.

Proof. Given a ∈ A, we must shew that

lim_{x→a} f(x) = f(a).

Since f is differentiable at a we have

f(x) − f(a) = Da(f)(x − a) + E(x − a).

Since lim_{h→0} E(h)/∥h∥ = 0, we have lim_{h→0} E(h) = 0, and so

f(x) − f(a) → 0

as x → a, proving the theorem. ■

Exercises

172 Problem
Let L : R³ → R³ be a linear transformation and

F : R³ → R³, x ↦ x × L(x).

Shew that F is differentiable and that

Dx(F)(h) = x × L(h) + h × L(x).

173 Problem
Let f : Rⁿ → R, n ≥ 1, f(x) = ∥x∥ be the usual norm in Rⁿ, with ∥x∥² = x•x. Prove that

Dx(f)(v) = x•v/∥x∥,

for x ≠ 0, but that f is not differentiable at 0.

3.4. Partial and Directional Derivatives


174 Definition
Let A ⊆ Rⁿ, f : A → Rᵐ, and put

f(x) = (f₁(x₁, . . . , xₙ), f₂(x₁, . . . , xₙ), . . . , fₘ(x₁, . . . , xₙ)).

Here fᵢ : Rⁿ → R. The partial derivative ∂fᵢ/∂xⱼ (x) is defined as

∂ⱼfᵢ(x) := ∂fᵢ/∂xⱼ (x) := lim_{h→0} [fᵢ(x₁, . . . , xⱼ + h, . . . , xₙ) − fᵢ(x₁, . . . , xⱼ, . . . , xₙ)]/h,

whenever this limit exists.

To find partial derivatives with respect to the j-th variable, we simply keep the other variables fixed (xᵢ = constant for i ≠ j) and differentiate with respect to the j-th variable.

175 Example
If f : R³ → R, and f(x, y, z) = x + y² + z³ + 3xy²z³, then

∂f/∂x (x, y, z) = 1 + 3y²z³,

∂f/∂y (x, y, z) = 2y + 6xyz³,

and

∂f/∂z (x, y, z) = 3z² + 9xy²z².
Let f(x) be a vector-valued function. Then the derivative of f(x) in the direction u is defined as

D_u f(x) := Df(x)[u] = [d/dα f(x + αu)]_{α=0}

for all vectors u.


176 Proposition

Ê If f(x) = f₁(x) + f₂(x) then D_u f(x) = D_u f₁(x) + D_u f₂(x).

Ë If f(x) = f₁(x) × f₂(x) then D_u f(x) = (D_u f₁(x)) × f₂(x) + f₁(x) × (D_u f₂(x)).

3.5. The Jacobi Matrix


We now establish a way which simplifies the process of finding the derivative of a function at a given
point.
Since the derivative of a function f : Rn → Rm is a linear transformation, it can be represented
by aid of matrices. The following theorem will allow us to determine the matrix representation for
Da (f ) under the standard bases of Rn and Rm .

177 Theorem
Let

f(x) = (f₁(x₁, . . . , xₙ), f₂(x₁, . . . , xₙ), . . . , fₘ(x₁, . . . , xₙ)).

Suppose A ⊆ Rⁿ is an open set and f : A → Rᵐ is differentiable. Then each partial derivative ∂fᵢ/∂xⱼ (x) exists, and the matrix representation of Dx(f) with respect to the standard bases of Rⁿ and Rᵐ is the Jacobi matrix

f′(x) =
⎡ ∂f₁/∂x₁(x)  ∂f₁/∂x₂(x)  . . .  ∂f₁/∂xₙ(x) ⎤
⎢ ∂f₂/∂x₁(x)  ∂f₂/∂x₂(x)  . . .  ∂f₂/∂xₙ(x) ⎥
⎢      ⋮            ⋮        ⋱         ⋮     ⎥
⎣ ∂fₘ/∂x₁(x)  ∂fₘ/∂x₂(x)  . . .  ∂fₘ/∂xₙ(x) ⎦.
Proof. Let eⱼ, 1 ≤ j ≤ n, be the standard basis for Rⁿ. To obtain the Jacobi matrix, we must compute Dx(f)(eⱼ), which will give us the j-th column of the Jacobi matrix. Let f′(x) = (Jᵢⱼ), and observe that

Dx(f)(eⱼ) = (J₁ⱼ, J₂ⱼ, . . . , Jₘⱼ),

and put y = x + εeⱼ, ε ∈ R. Notice that

∥f(y) − f(x) − Dx(f)(y − x)∥ / ∥y − x∥
= ∥f(x₁, . . . , xⱼ + ε, . . . , xₙ) − f(x₁, . . . , xⱼ, . . . , xₙ) − εDx(f)(eⱼ)∥ / |ε|.

Since the sinistral side → 0 as ε → 0, so does the i-th component of the numerator, and so

|fᵢ(x₁, . . . , xⱼ + ε, . . . , xₙ) − fᵢ(x₁, . . . , xⱼ, . . . , xₙ) − εJᵢⱼ| / |ε| → 0.

This entails that

Jᵢⱼ = lim_{ε→0} [fᵢ(x₁, . . . , xⱼ + ε, . . . , xₙ) − fᵢ(x₁, . . . , xⱼ, . . . , xₙ)]/ε = ∂fᵢ/∂xⱼ (x).

This finishes the proof. ■

Strictly speaking, the Jacobi matrix is not the derivative of a function at a point. It is a matrix representation of the derivative with respect to the standard bases of Rⁿ and Rᵐ. We will abuse language, however, and refer to f′ when we mean the Jacobi matrix of f.
178 Example
Let f : R³ → R² be given by

f(x, y, z) = (xy + yz, log_e (xy)).

Compute the Jacobi matrix of f.

Solution: ▶ The Jacobi matrix is the 2 × 3 matrix

f′(x, y, z) =
⎡ ∂ₓf₁  ∂_yf₁  ∂_zf₁ ⎤   ⎡ y    x + z  y ⎤
⎣ ∂ₓf₂  ∂_yf₂  ∂_zf₂ ⎦ = ⎣ 1/x  1/y    0 ⎦.

◀
179 Example
Let f(ρ, θ, z) = (ρ cos θ, ρ sin θ, z) be the function which changes from cylindrical coordinates to Cartesian coordinates. We have

f′(ρ, θ, z) =
⎡ cos θ  −ρ sin θ  0 ⎤
⎢ sin θ   ρ cos θ  0 ⎥
⎣ 0       0        1 ⎦.

180 Example
Let f(ρ, ϕ, θ) = (ρ cos θ sin ϕ, ρ sin θ sin ϕ, ρ cos ϕ) be the function which changes from spherical coordinates to Cartesian coordinates. We have

f′(ρ, ϕ, θ) =
⎡ cos θ sin ϕ  ρ cos θ cos ϕ  −ρ sin θ sin ϕ ⎤
⎢ sin θ sin ϕ  ρ sin θ cos ϕ   ρ cos θ sin ϕ ⎥
⎣ cos ϕ       −ρ sin ϕ         0             ⎦.
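A quick way to gain confidence in a Jacobi matrix is to build it by finite differences and check a known consequence. The sketch below (an addition to these notes, with an arbitrarily chosen sample point) approximates the Jacobi matrix of the spherical map of Example 180 and verifies that its determinant equals ρ² sin ϕ, the familiar spherical volume factor.

```python
import math

# Finite-difference Jacobi matrix of Example 180's map, plus det check.
def F(rho, phi, theta):
    return (rho*math.cos(theta)*math.sin(phi),
            rho*math.sin(theta)*math.sin(phi),
            rho*math.cos(phi))

def jacobian(f, pt, h=1e-6):
    n = len(pt)
    cols = []
    for j in range(n):
        a = list(pt); a[j] += h
        b = list(pt); b[j] -= h
        cols.append([(fa - fb)/(2*h) for fa, fb in zip(f(*a), f(*b))])
    # rows = component functions, columns = variables
    return [[cols[j][i] for j in range(n)] for i in range(n)]

rho, phi, theta = 2.0, 0.7, 1.1
J = jacobian(F, (rho, phi, theta))
det = (J[0][0]*(J[1][1]*J[2][2] - J[1][2]*J[2][1])
     - J[0][1]*(J[1][0]*J[2][2] - J[1][2]*J[2][0])
     + J[0][2]*(J[1][0]*J[2][1] - J[1][1]*J[2][0]))
ok = abs(det - rho**2*math.sin(phi)) < 1e-6
print(ok)  # True: det f'(rho, phi, theta) = rho^2 sin(phi)
```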

The concept of repeated partial derivatives is akin to the concept of repeated differentiation.
Similarly with the concept of implicit partial differentiation. The following examples should be self-
explanatory.
181 Example
Let f(u, v, w) = eᵘ v cos w. Determine ∂²f/∂u∂v at (1, −1, π/4).

Solution: ▶ We have

∂²/∂u∂v (eᵘ v cos w) = ∂/∂u (eᵘ cos w) = eᵘ cos w,

which is e√2/2 at the desired point. ◀
182 Example
The equation z^{xy} + (xy)^z + xy²z³ = 3 defines z as an implicit function of x and y. Find ∂z/∂x and ∂z/∂y at (1, 1, 1).

Solution: ▶ We have

∂/∂x (z^{xy}) = ∂/∂x (e^{xy log z}) = (y log z + (xy/z) ∂z/∂x) z^{xy},

∂/∂x ((xy)^z) = ∂/∂x (e^{z log xy}) = ((∂z/∂x) log xy + z/x) (xy)^z,

∂/∂x (xy²z³) = y²z³ + 3xy²z² ∂z/∂x.

Hence, at (1, 1, 1) we have

∂z/∂x + 1 + 1 + 3 ∂z/∂x = 0 ⟹ ∂z/∂x = −1/2.

Similarly,

∂/∂y (z^{xy}) = ∂/∂y (e^{xy log z}) = (x log z + (xy/z) ∂z/∂y) z^{xy},

∂/∂y ((xy)^z) = ∂/∂y (e^{z log xy}) = ((∂z/∂y) log xy + z/y) (xy)^z,

∂/∂y (xy²z³) = 2xyz³ + 3xy²z² ∂z/∂y.

Hence, at (1, 1, 1) we have

∂z/∂y + 1 + 2 + 3 ∂z/∂y = 0 ⟹ ∂z/∂y = −3/4.

◀

Exercises
183 Problem
Let f : [0; +∞[ × ]0; +∞[ → R, f(r, t) = tⁿ e^{−r²/4t}, where n is a constant. Determine n such that

∂f/∂t = (1/r²) ∂/∂r (r² ∂f/∂r).

184 Problem
Let f : R² → R, f(x, y) = min(x, y²). Find ∂f(x, y)/∂x and ∂f(x, y)/∂y.

185 Problem
Let f : R² → R² and g : R³ → R² be given by

f(x, y) = (xy², x²y), g(x, y, z) = (x − y + 2z, xy).

Compute (f ∘ g)′(1, 0, 1), if at all defined. If undefined, explain. Compute (g ∘ f)′(1, 0), if at all defined. If undefined, explain.

186 Problem
Let f(x, y) = (xy, x + y) and g(x, y) = (x − y, x²y², x + y). Find (g ∘ f)′(0, 1).

187 Problem
Assuming that the equation xy² + 3z = cos z² defines z implicitly as a function of x and y, find ∂ₓz.

188 Problem
If w = e^{uv} and u = r + s, v = rs, determine ∂w/∂r.

189 Problem
Let z be an implicitly-defined function of x and y through the equation (x + z)² + (y + z)² = 8. Find ∂z/∂x at (1, 1, 1).


3.6. Properties of Differentiable Transformations


Just like in the one-variable case, we have the following rules of differentiation.

190 Theorem
Let A ⊆ Rⁿ, B ⊆ Rᵐ be open sets, f, g : A → Rᵐ, α ∈ R, be differentiable on A, h : B → Rˡ be differentiable on B, and f(A) ⊆ B. Then we have

■ Addition Rule: Dx(f + αg) = Dx(f) + αDx(g).

■ Chain Rule: Dx(h ∘ f) = (D_{f(x)}(h)) ∘ (Dx(f)).

Since composition of linear mappings expressed as matrices is matrix multiplication, the Chain Rule takes the alternative form when applied to the Jacobi matrix:

(h ∘ f)′ = (h′ ∘ f)(f′). (3.6)

191 Example
Let

f(u, v) = (ueᵛ, u + v, uv),

h(x, y, z) = (x² + y, y + z).

Find (f ∘ h)′(x, y, z).

Solution: ▶ We have

f′(u, v) =
⎡ eᵛ  ueᵛ ⎤
⎢ 1   1   ⎥
⎣ v   u   ⎦,

and

h′(x, y, z) =
⎡ 2x  1  0 ⎤
⎣ 0   1  1 ⎦.

Observe also that

f′(h(x, y, z)) =
⎡ e^{y+z}  (x² + y)e^{y+z} ⎤
⎢ 1        1               ⎥
⎣ y + z    x² + y          ⎦.

Hence

(f ∘ h)′(x, y, z) = f′(h(x, y, z)) h′(x, y, z)
=
⎡ 2x e^{y+z}   (1 + x² + y)e^{y+z}  (x² + y)e^{y+z} ⎤
⎢ 2x           2                    1               ⎥
⎣ 2xy + 2xz    x² + 2y + z          x² + y          ⎦.

◀

192 Example
Let

f : R² → R, f(u, v) = u² + eᵛ,

u, v : R³ → R, u(x, y, z) = xz, v(x, y, z) = y + z.

Put h(x, y, z) = f(u(x, y, z), v(x, y, z)). Find the partial derivatives of h.

Solution: ▶ Put g : R³ → R², g(x, y, z) = (u(x, y, z), v(x, y, z)) = (xz, y + z). Observe that h = f ∘ g. Now,

g′(x, y, z) =
⎡ z  0  x ⎤
⎣ 0  1  1 ⎦,

f′(u, v) = [2u  eᵛ],

f′(g(x, y, z)) = [2xz  e^{y+z}].

Thus

[∂h/∂x  ∂h/∂y  ∂h/∂z] = h′(x, y, z) = (f′(g(x, y, z)))(g′(x, y, z))

= [2xz  e^{y+z}] ⎡ z  0  x ⎤
                 ⎣ 0  1  1 ⎦

= [2xz²  e^{y+z}  2x²z + e^{y+z}].

Equating components, we obtain

∂h/∂x = 2xz², ∂h/∂y = e^{y+z}, ∂h/∂z = 2x²z + e^{y+z}.

◀

193 Theorem
Let F = (f1 , f2 , . . . , fm ) : Rn → Rm , and suppose that the partial derivatives

∂fᵢ/∂xⱼ, 1 ≤ i ≤ m, 1 ≤ j ≤ n, (3.7)

exist on a neighborhood of x0 and are continuous at x0 . Then F is differentiable at x0 .

We say that F is continuously differentiable on a set S if S is contained in an open set on which


the partial derivatives in (3.7) are continuous. The next three lemmas give properties of continu-
ously differentiable transformations that we will need later.
194 Lemma
Suppose that F : Rn → Rm is continuously differentiable on a neighborhood N of x0 . Then, for every
ϵ > 0, there is a δ > 0 such that

|F(x) − F(y)| < (∥F′(x₀)∥ + ϵ)|x − y| if x, y ∈ Bδ(x₀). (3.8)

Proof. Consider the auxiliary function

G(x) = F(x) − F′ (x0 )x. (3.9)

The components of G are

gᵢ(x) = fᵢ(x) − ∑ⱼ₌₁ⁿ (∂fᵢ(x₀)/∂xⱼ) xⱼ,

so

∂gᵢ(x)/∂xⱼ = ∂fᵢ(x)/∂xⱼ − ∂fᵢ(x₀)/∂xⱼ.
Thus, ∂gi /∂xj is continuous on N and zero at x0 . Therefore, there is a δ > 0 such that

|∂gᵢ(x)/∂xⱼ| < ϵ/√(mn)  for 1 ≤ i ≤ m, 1 ≤ j ≤ n, if |x − x₀| < δ. (3.10)

Now suppose that x, y ∈ Bδ(x₀). By Theorem ??,

gᵢ(x) − gᵢ(y) = ∑ⱼ₌₁ⁿ (∂gᵢ(xᵢ)/∂xⱼ)(xⱼ − yⱼ), (3.11)

where xᵢ is on the line segment from x to y, so xᵢ ∈ Bδ(x₀). From (3.10), (3.11), and Schwarz's inequality,

(gᵢ(x) − gᵢ(y))² ≤ (∑ⱼ₌₁ⁿ [∂gᵢ(xᵢ)/∂xⱼ]²) |x − y|² < (ϵ²/m)|x − y|².

Summing this from i = 1 to i = m and taking square roots yields

|G(x) − G(y)| < ϵ|x − y| if x, y ∈ Bδ (x0 ). (3.12)

To complete the proof, we note that

F(x) − F(y) = G(x) − G(y) + F′ (x0 )(x − y), (3.13)

so (3.12) and the triangle inequality imply (3.8). ■

195 Lemma
Suppose that F : Rⁿ → Rⁿ is continuously differentiable on a neighborhood of x₀ and F′(x₀) is nonsingular. Let

r = 1/∥(F′(x₀))⁻¹∥. (3.14)

Then, for every ϵ > 0, there is a δ > 0 such that

|F(x) − F(y)| ≥ (r − ϵ)|x − y| if x, y ∈ Bδ (x0 ). (3.15)

Proof. Let x and y be arbitrary points in DF and let G be as in (3.9). From (3.13),


|F(x) − F(y)| ≥ |F′ (x0 )(x − y)| − |G(x) − G(y)| , (3.16)

Since
x − y = [F′ (x0 )]−1 F′ (x0 )(x − y),

(3.14) implies that


|x − y| ≤ (1/r)|F′(x₀)(x − y)|,
so
|F′ (x0 )(x − y)| ≥ r|x − y|. (3.17)

Now choose δ > 0 so that (3.12) holds. Then (3.16) and (3.17) imply (3.15). ■

196 Definition
A function f is said to be continuously differentiable if the derivative f′ exists and is itself a contin-
uous function.
Continuously differentiable functions are said to be of class C 1 . A function is of class C 2 if the first
and second derivative of the function both exist and are continuous. More generally, a function is
said to be of class Cᵏ if the first k derivatives exist and are continuous. If derivatives f⁽ⁿ⁾ exist for all positive integers n, the function is said to be smooth or, equivalently, of class C^∞.


3.7. Gradients, Curls and Directional Derivatives


197 Definition
Let f : Rⁿ → R, x ↦ f(x), be a scalar field. The gradient of f is the vector defined and denoted by

∇f(x) := Df(x) := (∂₁f(x), ∂₂f(x), . . . , ∂ₙf(x)).

The gradient operator is the operator

∇ = (∂₁, ∂₂, . . . , ∂ₙ).

198 Theorem
Let A ⊆ Rn be open and let f : A → R be a scalar field, and assume that f is differentiable in A.
Let K ∈ R be a constant. Then ∇f (x) is orthogonal to the surface implicitly defined by f (x) = K.

Proof. Let c : R → Rⁿ, t ↦ c(t), be a curve lying on this surface. Choose t₀ so that c(t₀) = x. Then

(f ∘ c)(t) = f(c(t)) = K for all t,

and using the chain rule


Df (c(t0 ))Dc(t0 ) = 0,

which translates to
(∇f (x))•(c′ (t0 )) = 0.

Since c′ (t0 ) is tangent to the surface and its dot product with ∇f (x) is 0, we conclude that ∇f (x)
is normal to the surface. ■

199 Remark
Now let c(t) be a curve in Rⁿ (not necessarily in the surface), and let θ be the angle between ∇f(x) and c′(t₀). Since

(∇f(x))•(c′(t₀)) = ∥∇f(x)∥∥c′(t₀)∥ cos θ,

the rate of change of f along c at x is greatest when cos θ = 1, i.e., when c′(t₀) points along ∇f(x). Hence ∇f(x) is the direction in which f is changing the fastest.

200 Example
Find a unit vector normal to the surface x³ + y³ + z = 4 at the point (1, 1, 2).

Solution: ▶ Here f(x, y, z) = x³ + y³ + z − 4 has gradient

∇f(x, y, z) = (3x², 3y², 1),

which at (1, 1, 2) is (3, 3, 1). Normalizing this vector we obtain

(3/√19, 3/√19, 1/√19). ◀

201 Example
Find the direction of the greatest rate of increase of f(x, y, z) = xyeᶻ at the point (2, 1, 2).

Solution: ▶ The direction is that of the gradient vector. Here

∇f(x, y, z) = (yeᶻ, xeᶻ, xyeᶻ),

which at (2, 1, 2) becomes (e², 2e², 2e²). Normalizing this vector we obtain

(1/3)(1, 2, 2). ◀

202 Example
Sketch the gradient vector field for f(x, y) = x² + y² as well as several contours for this function.

Solution: ▶ The contours for a function are the curves defined by

f(x, y) = k,

for various values of k. So, for our function the contours are defined by the equation

x² + y² = k,

and so they are circles centered at the origin with radius √k. The gradient vector field for this function is

∇f(x, y) = 2xi + 2yj.

Here is a sketch of several of the contours as well as the gradient vector field. ◀
203 Example
Let f : R3 → R be given by
f (x, y, z) = x + y 2 − z 2 .

Find the equation of the tangent plane to f at (1, 2, 3).

[Figure: several contours x² + y² = k of Example 202 together with its gradient vector field ∇f(x, y) = 2xi + 2yj.]

Solution: ▶ A vector normal to the plane is ∇f (1, 2, 3). Now

∇f (x, y, z) = (1, 2y, −2z)

which is
(1, 4, −6)

at (1, 2, 3). The equation of the tangent plane is thus

1(x − 1) + 4(y − 2) − 6(z − 3) = 0,

or
x + 4y − 6z = −9.

204 Definition
Let f : Rⁿ → Rⁿ, x ↦ f(x), be a vector field with

f(x) = (f₁(x), f₂(x), . . . , fₙ(x)).

The divergence of f is defined and denoted by

div f(x) = ∇•f(x) := Tr(Df(x)) := ∂₁f₁(x) + ∂₂f₂(x) + · · · + ∂ₙfₙ(x).

205 Example
If f(x, y, z) = (x², y², ye^{z²}) then

div f(x) = 2x + 2y + 2yze^{z²}.

Mean Value Theorem for Scalar Fields The mean value theorem generalizes to scalar fields. The
trick is to use parametrization to create a real function of one variable, and then apply the one-
variable theorem.

78
3.7. Gradients, Curls and Directional Derivatives

206 Theorem (Mean Value Theorem for Scalar Fields)


Let U be an open connected subset of Rn , and let f : U → R be a differentiable function. Fix points
x, y ∈ U such that the segment connecting x to y lies in U . Then

f (y) − f (x) = ∇f (z) · (y − x)

where z is a point in the open segment connecting x to y.

Proof. Let U be an open connected subset of Rⁿ, and let f : U → R be a differentiable function. Fix points x, y ∈ U such that the segment connecting x to y lies in U, and define g(t) := f((1 − t)x + ty). Since f is a differentiable function in U, the function g is continuous in [0, 1] and differentiable in (0, 1). The mean value theorem gives

g(1) − g(0) = g′(c)

for some c ∈ (0, 1). But since g(0) = f(x) and g(1) = f(y), computing g′(c) explicitly we have

f(y) − f(x) = ∇f((1 − c)x + cy) · (y − x),

or

f(y) − f(x) = ∇f(z) · (y − x),

where z is a point in the open segment connecting x to y. ■

By the Cauchy–Schwarz inequality, the equation gives the estimate

|f(y) − f(x)| ≤ ∥∇f((1 − c)x + cy)∥ ∥y − x∥.

Curl
207 Definition
If F : R³ → R³ is a vector field with components F = (F₁, F₂, F₃), we define the curl of F as

∇ × F := (∂₂F₃ − ∂₃F₂, ∂₃F₁ − ∂₁F₃, ∂₁F₂ − ∂₂F₁).

This is sometimes also denoted by curl(F).

208 Remark
A mnemonic to remember this formula is to write

∇ × F = (∂₁, ∂₂, ∂₃) × (F₁, F₂, F₃),

and compute the cross product treating both terms as 3-dimensional vectors.

209 Example
If F(x) = x/|x|3 , then ∇ × F = 0.

210 Remark
In the example above, F is proportional to a gravitational force exerted by a body at the origin. We
know from experience that when a ball is pulled towards the earth by gravity alone, it doesn’t start to
rotate; which is consistent with our computation ∇ × F = 0.

211 Example
If v(x, y, z) = (sin z, 0, 0), then ∇ × v = (0, cos z, 0).

212 Remark
Think of v above as the velocity field of a fluid between two plates placed at z = 0 and z = π. A small
ball placed closer to the bottom plate experiences a higher velocity near the top than it does at the
bottom, and so should start rotating counter clockwise along the y-axis. This is consistent with our
calculation of ∇ × v.

The definition of the curl operator can be generalized to the n dimensional space.

213 Definition
Let g_k : Rⁿ → Rⁿ, 1 ≤ k ≤ n − 2, be vector fields with gᵢ = (g_{i1}, g_{i2}, . . . , g_{in}). Then the curl of (g₁, g₂, . . . , g_{n−2}) is

curl(g₁, g₂, . . . , g_{n−2})(x) = det
⎡ e₁             e₂             . . .  eₙ             ⎤
⎢ ∂₁             ∂₂             . . .  ∂ₙ             ⎥
⎢ g₁₁(x)         g₁₂(x)         . . .  g_{1n}(x)      ⎥
⎢ g₂₁(x)         g₂₂(x)         . . .  g_{2n}(x)      ⎥
⎢ ⋮              ⋮              ⋱      ⋮              ⎥
⎣ g_{(n−2)1}(x)  g_{(n−2)2}(x)  . . .  g_{(n−2)n}(x) ⎦.

214 Example
If f(x, y, z, w) = (e^{xyz}, 0, 0, w²), g(x, y, z, w) = (0, 0, z, 0) then

curl(f, g)(x, y, z, w) = det
⎡ e₁       e₂  e₃  e₄ ⎤
⎢ ∂₁       ∂₂  ∂₃  ∂₄ ⎥
⎢ e^{xyz}  0   0   w² ⎥
⎣ 0        0   z   0  ⎦
= (xz²e^{xyz})e₄.


215 Definition
Let A ⊆ Rⁿ be open and let f : A → R be a scalar field, and assume that f is differentiable in A. Let v ∈ Rⁿ \ {0} be such that x + tv ∈ A for sufficiently small t ∈ R. Then the directional derivative of f in the direction of v at the point x is defined and denoted by

D_v f(x) = lim_{t→0} [f(x + tv) − f(x)]/t.

Some authors require that the vector v in definition 215 be a unit vector.

216 Theorem
Let A ⊆ Rn be open and let f : A → R be a scalar field, and assume that f is differentiable in A. Let
v ∈ Rn \ {0} be such that x + tv ∈ A for sufficiently small t ∈ R. Then the directional derivative
of f in the direction of v at the point x is given by

∇f (x)•v.

217 Example
Find the directional derivative of f(x, y, z) = x³ + y³ − z² in the direction of (1, 2, 3).

Solution: ▶ We have

∇f(x, y, z) = (3x², 3y², −2z)

and so

∇f(x, y, z)•v = 3x² + 6y² − 6z. ◀
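Theorem 216 can be checked on Example 217 directly from the limit definition; the sketch below (an addition to these notes, with an arbitrarily chosen point) compares the difference quotient with ∇f(x)•v.

```python
import math

# The limit quotient defining D_v f approaches grad f . v.
def f(x, y, z):
    return x**3 + y**3 - z**2

x, y, z = 1.2, -0.5, 2.0
v = (1.0, 2.0, 3.0)
grad_dot_v = 3*x**2*v[0] + 3*y**2*v[1] - 2*z*v[2]

t = 1e-6
quotient = (f(x + t*v[0], y + t*v[1], z + t*v[2]) - f(x, y, z)) / t
ok = abs(quotient - grad_dot_v) < 1e-4
print(ok)  # True
```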

The following is a collection of useful differentiation formulae in R3 .
218 Theorem

Ê ∇•ψu = ψ∇•u + u•∇ψ

Ë ∇ × ψu = ψ∇ × u + ∇ψ × u

Ì ∇•u × v = v•∇ × u − u•∇ × v

Í ∇ × (u × v) = v•∇u − u•∇v + u(∇•v) − v(∇•u)

Î ∇(u•v) = u•∇v + v•∇u + u × (∇ × v) + v × (∇ × u)

Ï ∇ × (∇ψ) = curl (grad ψ) = 0

Ð ∇•(∇ × u) = div (curl u) = 0

Ñ ∇•(∇ψ1 × ∇ψ2 ) = 0

Ò ∇ × (∇ × u) = curl (curl u) = grad (div u) − ∇2 u

where

∆f = ∇²f = ∇ · ∇f = ∂²f/∂x² + ∂²f/∂y² + ∂²f/∂z²

is the Laplacian operator and

∇²u = (∂²/∂x² + ∂²/∂y² + ∂²/∂z²)(uₓi + u_yj + u_zk) (3.18)
= (∂²uₓ/∂x² + ∂²uₓ/∂y² + ∂²uₓ/∂z²)i + (∂²u_y/∂x² + ∂²u_y/∂y² + ∂²u_y/∂z²)j + (∂²u_z/∂x² + ∂²u_z/∂y² + ∂²u_z/∂z²)k.

Finally, for the position vector r the following are valid:

Ê ∇•r = 3

Ë ∇ × r = 0

Ì (u•∇)r = u

where u is any vector.
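Identities such as Ð, ∇•(∇ × u) = 0, lend themselves to numerical sanity checks. The sketch below (an addition to these notes; the smooth field u and the sample point are arbitrary choices) approximates div(curl u) with nested central differences and confirms it vanishes to within discretization error.

```python
import math

# Numerical check of div(curl u) = 0 for a sample smooth field.
def u(x, y, z):
    return (x*y*z, math.sin(y + z), x**2 - z)

def partial(field, pt, j, comp, eps=1e-5):
    a = list(pt); a[j] += eps
    b = list(pt); b[j] -= eps
    return (field(*a)[comp] - field(*b)[comp]) / (2*eps)

def curl(pt):
    return (partial(u, pt, 1, 2) - partial(u, pt, 2, 1),
            partial(u, pt, 2, 0) - partial(u, pt, 0, 2),
            partial(u, pt, 0, 1) - partial(u, pt, 1, 0))

def div_of(field, pt, eps=1e-4):
    total = 0.0
    for j in range(3):
        a = list(pt); a[j] += eps
        b = list(pt); b[j] -= eps
        total += (field(a)[j] - field(b)[j]) / (2*eps)
    return total

pt = (0.7, -1.2, 0.4)
div_curl = div_of(lambda q: curl(tuple(q)), pt)
ok = abs(div_curl) < 1e-4
print(ok)  # True
```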

Exercises
219 Problem
The temperature at a point in space is T = xy + yz + zx.
a) Find the direction in which the temperature changes most rapidly with distance from (1, 1, 1). What is the maximum rate of change?
b) Find the derivative of T in the direction of the vector 3i − 4k at (1, 1, 1).

220 Problem
For each of the following vector functions F, determine whether ∇ϕ = F has a solution and determine it if it exists.
a) F = 2xyz³i − (x²z³ + 2y)j + 3x²yz²k
b) F = 2xyi + (x² + 2yz)j + (y² + 1)k

221 Problem
Let f(x, y, z) = xe^{yz}. Find

(∇f)(2, 1, 1).

222 Problem
Let f(x, y, z) = (xz, e^{xy}, z). Find

(∇ × f)(2, 1, 1).

223 Problem
Find the tangent plane to the surface x²/2 − y² − z² = 0 at the point (2, −1, 1).

224 Problem
Find the point on the surface

x² + y² − 5xy + xz − yz = −3

for which the tangent plane is x − 7y = −6.

225 Problem
Find a vector pointing in the direction in which f(x, y, z) = 3xy − 9xz² + y increases most rapidly at the point (1, 1, 0).

226 Problem
Let D_u f(x, y) denote the directional derivative of f at (x, y) in the direction of the unit vector u. If ∇f(1, 2) = 2i − j, find D_{(3/5, 4/5)} f(1, 2).

227 Problem
Use a linear approximation of the function f(x, y) = eˣ cos 2y at (0, 0) to estimate f(0.1, 0.2).

228 Problem
Prove that

∇•(u × v) = v•(∇ × u) − u•(∇ × v).

229 Problem
Find the point on the surface

2x² + xy + y² + 4x + 8y − z + 14 = 0

for which the tangent plane is 4x + y − z = 0.

230 Problem
Let ϕ : R³ → R be a scalar field, and let U, V : R³ → R³ be vector fields. Prove that

1. ∇•(ϕV) = ϕ∇•V + V•∇ϕ
2. ∇ × (ϕV) = ϕ∇ × V + (∇ϕ) × V
3. ∇ × (∇ϕ) = 0
4. ∇•(∇ × V) = 0
5. ∇(U•V) = (U•∇)V + (V•∇)U + U × (∇ × V) + V × (∇ × U)

231 Problem
Find the angles made by the gradient of f(x, y) = x√3 + y at the point (1, 1) with the coordinate axes.

3.8. The Geometrical Meaning of Divergence and Curl


In this section we provide some heuristics about the meaning of divergence and curl. These interpretations will be formally proved in chapters 6 and 7.

3.8.1. Divergence

[Figure 3.4. Computing the vertical contribution to the flux: a small closed parallelepiped with sides ∆x, ∆y, ∆z; the bottom face has outward normal −k.]

Consider a small closed parallelepiped, with sides parallel to the coordinate planes, as shown in
Figure 3.4. What is the flux of F out of the parallelepiped?
Consider first the vertical contribution, namely the flux up through the top face plus the flux
through the bottom face. These two sides each have area ∆A = ∆x ∆y, but the outward normal
vectors point in opposite directions so we get

∑_{top+bottom} F•∆A ≈ F(z + ∆z)•k ∆x ∆y − F(z)•k ∆x ∆y

≈ (F_z(z + ∆z) − F_z(z)) ∆x ∆y

≈ [(F_z(z + ∆z) − F_z(z))/∆z] ∆x ∆y ∆z

≈ (∂F_z/∂z) ∆x ∆y ∆z  (by the Mean Value Theorem)

where we have multiplied and divided by ∆z to obtain the volume ∆V = ∆x ∆y ∆z in the third step, and used the definition of the derivative in the final step.
Repeating this argument for the remaining pairs of faces, it follows that the total flux out of the
parallelepiped is
Ç å
∑ ∂Fx ∂Fy ∂Fz
total flux = F•∆A ≈ + + ∆V
parallelepiped
∂x ∂y ∂z

Since the total flux is proportional to the volume of the parallelepiped, it approaches zero as the
volume of the parallelepiped shrinks down. The interesting quantity is therefore the ratio of the
flux to volume; this ratio is called the divergence.
At any point P , we can define the divergence of a vector field F, written ∇•F, to be the flux of F
per unit volume leaving a small parallelepiped around the point P .
Hence, the divergence of F at the point P is the flux per unit volume through a small parallelepiped around P, which is given in rectangular coordinates by

∇•F = flux / unit volume = ∂F_x/∂x + ∂F_y/∂y + ∂F_z/∂z.
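The flux-per-volume picture can be illustrated numerically. The sketch below (an addition to these notes; the field F and the point p are arbitrary choices) approximates the outward flux of F through a small cube by sampling F at the six face centres, divides by the volume, and watches the ratio converge to the divergence as the cube shrinks.

```python
import math

# "Divergence = flux per unit volume": flux out of a cube of side d,
# divided by d^3, approaches div F as d -> 0.
def F(x, y, z):
    return (math.exp(x*y), y**3, x*z**2)

def flux_over_volume(p, d):
    # opposite faces paired by coordinate direction; F sampled at face centres
    x, y, z = p
    fx = (F(x + d/2, y, z)[0] - F(x - d/2, y, z)[0]) * d*d
    fy = (F(x, y + d/2, z)[1] - F(x, y - d/2, z)[1]) * d*d
    fz = (F(x, y, z + d/2)[2] - F(x, y, z - d/2)[2]) * d*d
    return (fx + fy + fz) / d**3

p = (1.0, 0.5, -0.7)
div_exact = p[1]*math.exp(p[0]*p[1]) + 3*p[1]**2 + 2*p[0]*p[2]
errs = [abs(flux_over_volume(p, d) - div_exact) for d in (0.1, 0.01, 0.001)]
print(errs)  # shrinks roughly like d^2
```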

Analogous computations can be used to determine expressions for the divergence in other coor-
dinate systems. These computations are presented in chapter 8.

3.8.2. Curl
Intuitively, curl is the circulation per unit area, circulation density, or rate of rotation (amount of
twisting at a single point).
Consider a small rectangle in the yz-plane, with sides parallel to the coordinate axes, as shown
in Figure 1. What is the circulation of F around this rectangle?
Consider first the horizontal edges, on each of which dr = ∆y j. However, when computing
the circulation of F around this rectangle, we traverse these two edges in opposite directions. In


[Figure 3.5. Computing the horizontal contribution to the circulation around a small rectangle with sides ∆y and ∆z.]

particular, when traversing the rectangle in the counterclockwise direction, ∆y < 0 on top and
∆y > 0 on the bottom.

∑_{top+bottom} F•dr ≈ −F(z + ∆z)•j ∆y + F(z)•j ∆y (3.19)

≈ −(F_y(z + ∆z) − F_y(z)) ∆y

≈ −[(F_y(z + ∆z) − F_y(z))/∆z] ∆y ∆z

≈ −(∂F_y/∂z) ∆y ∆z  (by the Mean Value Theorem)

where we have multiplied and divided by ∆z to obtain the surface element ∆A = ∆y ∆z in the third step, and used the definition of the derivative in the final step.
Just as with the divergence, in making this argument we are assuming that F doesn’t change
much in the x and y directions, while nonetheless caring about the change in the z direction.
Repeating this argument for the remaining two sides leads to

∑_{sides} F•dr ≈ F(y + ∆y)•k ∆z − F(y)•k ∆z (3.20)

≈ (F_z(y + ∆y) − F_z(y)) ∆z

≈ [(F_z(y + ∆y) − F_z(y))/∆y] ∆y ∆z

≈ (∂F_z/∂y) ∆y ∆z
where care must be taken with the signs, which are different from those in (3.19). Adding up both expressions, we obtain

total yz-circulation ≈ (∂F_z/∂y − ∂F_y/∂z) ∆y ∆z (3.21)
Since this is proportional to the area of the rectangle, it approaches zero as the area of the rectangle
converges to zero. The interesting quantity is therefore the ratio of the circulation to area.

We are computing the i-component of the curl:

curl(F)•i := yz-circulation / unit area = ∂F_z/∂y − ∂F_y/∂z. (3.22)
The rectangular expression for the full curl now follows by cyclic symmetry, yielding

curl(F) = (∂F_z/∂y − ∂F_y/∂z) i + (∂F_x/∂z − ∂F_z/∂x) j + (∂F_y/∂x − ∂F_x/∂y) k (3.23)

which is more easily remembered in the form

curl(F) = ∇ × F = det
⎡ i     j     k     ⎤
⎢ ∂/∂x  ∂/∂y  ∂/∂z  ⎥
⎣ F_x   F_y   F_z   ⎦ (3.24)

Figure 3.6. Consider a small paddle wheel placed in a vector field v. If the v_y component is an increasing function of x, this tends to make the paddle wheel want to spin (positive, counterclockwise) about the k̂-axis. If the v_x component is a decreasing function of y, this also tends to make the paddle wheel want to spin (positive, counterclockwise) about the k̂-axis. The net impulse to spin around the k̂-axis is the sum of the two. Source: MIT.
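The circulation-per-area picture can also be checked numerically. The sketch below (an addition to these notes; the field F and the point p are arbitrary choices) integrates F•dr counterclockwise around a small square in the yz-plane with the midpoint rule, divides by the area, and compares with (curl F)•i.

```python
import math

# Circulation of F around a small yz-square, divided by its area,
# approaches the i-component of curl F.
def F(x, y, z):
    return (0.0, z*y, x + y**2)

def line_int(g, a, b, n=200):
    # midpoint rule for the integral of g over [a, b]
    step = (b - a) / n
    return step * sum(g(a + (k + 0.5)*step) for k in range(n))

def circulation_over_area(p, d):
    x0, y0, z0 = p
    c  = line_int(lambda y: F(x0, y, z0 - d/2)[1], y0 - d/2, y0 + d/2)  # bottom
    c += line_int(lambda z: F(x0, y0 + d/2, z)[2], z0 - d/2, z0 + d/2)  # right
    c -= line_int(lambda y: F(x0, y, z0 + d/2)[1], y0 - d/2, y0 + d/2)  # top
    c -= line_int(lambda z: F(x0, y0 - d/2, z)[2], z0 - d/2, z0 + d/2)  # left
    return c / d**2

p = (0.3, 1.5, -0.7)
curl_i = 2*p[1] - p[1]    # dFz/dy - dFy/dz = 2y - y at p
err = abs(circulation_over_area(p, 0.01) - curl_i)
print(err < 1e-6)  # True
```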

3.9. Maxwell’s Equations


Maxwell’s Equations is a set of four equations that describes the behaviors of electromagnetism.
Together with the Lorentz Force Law, these equations describe completely (classical) electromag-
netism, i. e., all other results are simply mathematical consequences of these equations.
To begin with, there are two fields that govern electromagnetism, known as the electric and mag-
netic field. These are denoted by E(r, t) and B(r, t) respectively.
To understand electromagnetism, we need to explain how the electric and magnetic fields are formed, and how these fields affect charged particles. The latter is rather straightforward, and is described by the Lorentz force law.


232 Definition (Lorentz force law)


A point charge q experiences a force of

F = q(E + ṙ × B).

The dynamics of the field itself is governed by Maxwell’s Equations. To state the equations, first
we need to introduce two more concepts.

233 Definition (Charge and current density)

■ ρ(r, t) is the charge density, defined as the charge per unit volume.

■ j(r, t) is the current density, defined as the electric current per unit area of cross section.

Then Maxwell’s equations are

234 Definition (Maxwell’s equations)

    ∇ · E = ρ/ε0
    ∇ · B = 0
    ∇ × E + ∂B/∂t = 0
    ∇ × B − µ0 ε0 ∂E/∂t = µ0 j,
where ε0 is the electric constant (i.e., the permittivity of free space) and µ0 is the magnetic constant (i.e., the permeability of free space).
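As an aside (not in the original text), one classical consequence of these equations is charge conservation: taking the divergence of the last equation and using ∇ · (∇ × B) = 0 together with the first yields ∂ρ/∂t + ∇ · j = 0. The identity ∇ · (∇ × B) = 0 that drives this argument can be checked symbolically; the component names below are placeholders for a generic field.

```python
import sympy as sp

x, y, z, t = sp.symbols('x y z t')
Bx, By, Bz = (sp.Function(n)(x, y, z, t) for n in ('Bx', 'By', 'Bz'))

# Divergence of a curl vanishes identically for smooth fields, because
# mixed partial derivatives commute.
curlB = (sp.diff(Bz, y) - sp.diff(By, z),
         sp.diff(Bx, z) - sp.diff(Bz, x),
         sp.diff(By, x) - sp.diff(Bx, y))
div_curlB = sp.diff(curlB[0], x) + sp.diff(curlB[1], y) + sp.diff(curlB[2], z)
print(sp.simplify(div_curlB))  # 0
```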

3.10. Inverse Functions


A function f is said to be one-to-one if f(x1 ) and f(x2 ) are distinct whenever x1 and x2 are distinct points of Dom(f). In this case, we can define a function g on the image

    Im(f) = {u | u = f(x) for some x ∈ Dom(f)}

of f by defining g(u) to be the unique point x ∈ Dom(f) such that f(x) = u. Then

Dom(g) = Im(f) and Im(g) = Dom(f).

Moreover, g is one-to-one,
g(f(x)) = x, x ∈ Dom(f),
and
f(g(u)) = u, u ∈ Dom(g).

We say that g is the inverse of f, and write g = f−1 . The relation between f and g is symmetric; that
is, f is also the inverse of g, and we write f = g−1 .
A transformation f may fail to be one-to-one, but be one-to-one on a subset S of Dom(f). By this
we mean that f(x1 ) and f(x2 ) are distinct whenever x1 and x2 are distinct points of S. In this case,
f is not invertible, but if f|S is defined on S by

f|S (x) = f(x), x ∈ S,

and left undefined for x ̸∈ S, then f|S is invertible.


We say that f|S is the restriction of f to S, and that (f|S )−1 is the inverse of f restricted to S. The domain of (f|S )−1 is f(S).
The question of invertibility of an arbitrary transformation f : Rn → Rn is too general to have a
useful answer. However, there is a useful and easily applicable sufficient condition which implies
that one-to-one restrictions of continuously differentiable transformations have continuously dif-
ferentiable inverses.
235 Definition
If the function f is one-to-one on a neighborhood of the point x0 , we say that f is locally invertible at x0 . If a function is locally invertible at every x0 in a set S, then f is said to be locally invertible on S.

To motivate our study of this question, let us first consider the linear transformation
  
              ⎛ a11 a12 · · · a1n ⎞ ⎛ x1 ⎞
              ⎜ a21 a22 · · · a2n ⎟ ⎜ x2 ⎟
f(x) = Ax =   ⎜  ..   ..  . .  .. ⎟ ⎜ .. ⎟ .
              ⎝ an1 an2 · · · ann ⎠ ⎝ xn ⎠

The function f is invertible if and only if A is nonsingular, in which case Im(f) = Rn and

f−1 (u) = A−1 u.

Since A and A−1 are the differential matrices of f and f−1 , respectively, we can say that a linear
transformation is invertible if and only if its differential matrix f′ is nonsingular, in which case the
differential matrix of f−1 is given by
(f−1 )′ = (f′ )−1 .
Because of this, it is tempting to conjecture that if f : Rn → Rn is continuously differentiable and
f′ (x) is nonsingular, or, equivalently, D(f)(x) ̸= 0, for x in a set S, then f is one-to-one on S.
However, this is false. For example, if

f(x, y) = [ex cos y, ex sin y] ,

then

                 │ ex cos y   −ex sin y │
    D(f)(x, y) = │                      │ = e2x ̸= 0,    (3.25)
                 │ ex sin y    ex cos y │

but f is not one-to-one on R2 . The best that can be said in general is that if f is continuously differ-
entiable and D(f)(x) ̸= 0 in an open set S, then f is locally invertible on S, and the local inverses
are continuously differentiable. This is part of the inverse function theorem, which we will prove
presently.
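The failure of injectivity here can be seen concretely: f is 2π-periodic in y, so distinct inputs land on the same output even though the Jacobian determinant e2x never vanishes. A minimal numeric illustration:

```python
import math

def f(x, y):
    # f(x, y) = (e^x cos y, e^x sin y); its Jacobian determinant e^(2x)
    # never vanishes, yet f is 2*pi-periodic in y, hence not one-to-one.
    return (math.exp(x) * math.cos(y), math.exp(x) * math.sin(y))

print(f(0.0, 0.0))          # (1.0, 0.0)
print(f(0.0, 2 * math.pi))  # same point (up to rounding), different input
```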

236 Theorem (Inverse Function Theorem)


If f : U → Rn is differentiable at a and Da (f) is invertible, then there exist domains U ′ , V ′ such
that a ∈ U ′ ⊆ U , f(a) ∈ V ′ and f : U ′ → V ′ is bijective. Further, the inverse function g : V ′ → U ′
is differentiable.

The proof of the Inverse Function Theorem will be presented in the Section ??.
We note that the condition about the invertibility of Da (f) is necessary. If f has a differentiable inverse g in a neighborhood of a, then Da (f) must be invertible. To see this, differentiate the identity

    f(g(x)) = x

at x = f(a): the chain rule gives Da (f) Df(a) (g) = I, so Da (f) is invertible.
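Differentiating this identity with the chain rule forces the differential of a local inverse to be the inverse of the differential. A numeric sketch with the map above and its explicit local inverse near (1, 0); the finite-difference `jacobian` helper is illustrative, not from the text.

```python
import numpy as np

def f(p):
    x, y = p
    return np.array([np.exp(x) * np.cos(y), np.exp(x) * np.sin(y)])

def g(q):
    # local inverse of f near (1, 0): x = log|q|, y = atan2(v, u)
    u, v = q
    return np.array([0.5 * np.log(u**2 + v**2), np.arctan2(v, u)])

def jacobian(F, p, h=1e-6):
    # central finite differences; column j approximates dF/dp_j
    p = np.asarray(p, dtype=float)
    cols = []
    for j in range(len(p)):
        e = np.zeros_like(p); e[j] = h
        cols.append((F(p + e) - F(p - e)) / (2 * h))
    return np.column_stack(cols)

a = np.array([0.2, 0.3])
J_f = jacobian(f, a)
J_g = jacobian(g, f(a))
print(np.round(J_g @ J_f, 6))  # approximately the 2x2 identity matrix
```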

3.11. Implicit Functions


Let U ⊆ Rn+1 be a domain and f : U → R be a differentiable function. If x ∈ Rn and y ∈ R, we’ll
concatenate the two vectors and write (x, y) ∈ Rn+1 .

237 Theorem (Special Implicit Function Theorem)


Suppose c = f (a, b) and ∂y f (a, b) ̸= 0. Then, there exists a domain U ′ ∋ a and differentiable
function g : U ′ → R such that g(a) = b and f (x, g(x)) = c for all x ∈ U ′ .
Further, there exists a domain V ′ ∋ b such that
    {(x, y) | x ∈ U ′ , y ∈ V ′ , f (x, y) = c} = {(x, g(x)) | x ∈ U ′ }.

In other words, for all x ∈ U ′ the equation f (x, y) = c has a unique solution in V ′ and is given
by y = g(x).
238 Remark
To see why ∂y f ̸= 0 is needed, let f (x, y) = αx + βy and consider the equation f (x, y) = c. To
express y as a function of x we need β ̸= 0 which in this case is equivalent to ∂y f ̸= 0.
239 Remark
If n = 1, one expects f (x, y) = c to define some curve in R2 . To write this curve in the form y = g(x)
using a differentiable function g, one needs the curve to never be vertical. Since ∇f is perpendicular
to the curve, this translates to ∇f never being horizontal, or equivalently ∂y f ̸= 0 as assumed in the
theorem.
240 Remark
For simplicity we choose y to be the last coordinate above. It could have been any other, just as long
as the corresponding partial was non-zero. Namely if ∂i f (a) ̸= 0, then one can locally solve the
equation f (x) = f (a) (uniquely) for the variable xi and express it as a differentiable function of the
remaining variables.

241 Example
f (x, y) = x2 + y 2 with c = 1. Here ∂y f = 2y, so away from y = 0 the circle can be written locally as y = g(x) with g(x) = ±√(1 − x2 ).

Proof. [of the Special Implicit Function Theorem] Let f(x, y) = (x, f (x, y)), and observe that D(f)(a,b) ̸= 0. By the inverse function theorem f has a unique local inverse g. Note g must be of the form g(x, y) = (x, g(x, y)). Also f ◦ g = Id implies (x, y) = f(x, g(x, y)) = (x, f (x, g(x, y))). Hence y = g(x, c) uniquely solves f (x, y) = c in a small neighbourhood of (a, b). ■

Instead of y ∈ R above, we could have been fancier and allowed y ∈ Rn . In this case f needs to
be an Rn valued function, and we need to replace ∂y f ̸= 0 with the assumption that the n×n minor
in D(f ) (corresponding to the coordinate positions of y) is invertible. This is the general version of
the implicit function theorem.

242 Theorem (General Implicit Function Theorem)


Let U ⊆ Rm+n be a domain. Suppose f : Rn × Rm → Rm is C 1 on an open set containing (a, b)
where a ∈ Rn and b ∈ Rm . Suppose f(a, b) = 0 and that the m × m matrix M = (Dn+j fi (a, b))
is nonsingular. Then there is an open set A ⊂ Rn containing a and an open set B ⊂ Rm
containing b such that, for each x ∈ A, there is a unique g(x) ∈ B such that f(x, g(x)) = 0.
Furthermore, g is differentiable.

In other words: if the matrix M is invertible, then one can locally solve the equation f(x) =
f(a) (uniquely) for the variables xi1 , …, xim and express them as a differentiable function of the
remaining n variables.
The proof of the General Implicit Function Theorem will be presented in the Section ??.
243 Example
Consider the equations

(x − 1)2 + y 2 + z 2 = 5 and (x + 1)2 + y 2 + z 2 = 5

for which x = 0, y = 0, z = 2 is one solution. For all other solutions close enough to this point,
determine which of variables x, y, z can be expressed as differentiable functions of the others.

Solution: ▶ Let a = (0, 0, 2) and


 
                ⎛ (x − 1)2 + y 2 + z 2 ⎞
F (x, y, z) =   ⎝ (x + 1)2 + y 2 + z 2 ⎠

Observe
          ⎛ −2  0  4 ⎞
DFa =     ⎝  2  0  4 ⎠ ,

and the 2 × 2 minor using the first and last column is invertible. By the implicit function theorem
this means that in a small neighborhood of a, x and z can be (uniquely) expressed in terms of y. ◀

244 Remark
In the above example, one can of course solve explicitly and obtain
    x = 0   and   z = √(4 − y 2 ),

but in general we won’t be so lucky.
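The explicit solution in the remark can at least be spot-checked numerically; the sample y values below are an arbitrary choice within the admissible range.

```python
import numpy as np

# Along x = 0, z = sqrt(4 - y^2), both sphere equations from Example 243
# should evaluate to 5 for every admissible y.
ys = np.linspace(-1.5, 1.5, 7)
x = 0.0
z = np.sqrt(4 - ys**2)
lhs1 = (x - 1)**2 + ys**2 + z**2
lhs2 = (x + 1)**2 + ys**2 + z**2
print(np.allclose(lhs1, 5) and np.allclose(lhs2, 5))  # True
```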

3.12. Common Differential Operations in Einstein Notation

Here we present the most common differential operations expressed in Einstein notation.
The operator ∇ is a spatial partial differential operator defined in Cartesian coordinate systems
by:

    ∇i = ∂/∂xi    (3.26)
The gradient of a differentiable scalar function of position f is a vector given by:

    [∇f ]i = ∇i f = ∂f /∂xi = ∂i f = f,i    (3.27)
The gradient of a differentiable vector function of position A (which is the outer product, as defined in § 10.3.3, between the ∇ operator and the vector) is defined by:

[∇A]ij = ∂i Aj (3.28)

The gradient operation is distributive but not commutative or associative:

∇ (f + h) = ∇f + ∇h (3.29)

∇f ̸= f ∇ (3.30)

(∇f ) h ̸= ∇ (f h) (3.31)

where f and h are differentiable scalar functions of position.


The divergence of a differentiable vector A is a scalar given by:
    ∇ · A = δij ∂Ai /∂xj = ∂Ai /∂xi = ∇i Ai = ∂i Ai = Ai,i    (3.32)

The divergence of a differentiable rank-2 tensor A is a vector defined in one of its forms by:

[∇ · A]i = ∂j Aji (3.33)

and in another form by


[∇ · A]j = ∂i Aji (3.34)

These two different forms can be given, respectively, in symbolic notation by:

∇·A & ∇ · AT (3.35)

where AT is the transpose of A.
The divergence operation is distributive but not commutative or associative:

∇ · (A + B) = ∇ · A + ∇ · B (3.36)

∇ · A ̸= A · ∇ (3.37)
∇ · (f A) ̸= ∇f · A (3.38)
where A and B are differentiable vector functions of position.
The curl of a differentiable vector A is a vector given by:
    [∇ × A]i = ϵijk ∂Ak /∂xj = ϵijk ∇j Ak = ϵijk ∂j Ak = ϵijk Ak,j    (3.39)
The curl operation is distributive but not commutative or associative:

∇ × (A + B) = ∇ × A + ∇ × B (3.40)

∇ × A ̸= A × ∇ (3.41)
∇ × (A × B) ̸= (∇ × A) × B (3.42)
The Laplacian scalar operator, also called the harmonic operator, acting on a differentiable scalar
f is given by:
    ∆f = ∇2 f = δij ∂ 2 f /∂xi ∂xj = ∂ 2 f /∂xi ∂xi = ∇ii f = ∂ii f = f,ii    (3.43)
The Laplacian operator acting on a differentiable vector A is defined for each component of the
vector similar to the definition of the Laplacian acting on a scalar, that is
î ó
∇2 A i = ∂jj Ai (3.44)

The following scalar differential operator is commonly used in science (e.g. in fluid dynamics):

    A · ∇ = Ai ∇i = Ai ∂/∂xi = Ai ∂i    (3.45)
where A is a vector. As indicated earlier, the order of Ai and ∂i should be respected.
The following vector differential operator also has common applications in science:

[A × ∇]i = ϵijk Aj ∂k (3.46)
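The index machinery above can be exercised numerically. The sketch below (an illustration, not part of the text) builds ϵijk as a numpy array, recovers the cross product [A × B]i = ϵijk Aj Bk with einsum, and checks the contracted epsilon identity ϵijk ϵilm = δjl δkm − δjm δkl that is used repeatedly in the next subsection.

```python
import numpy as np

# Build the rank-3 Levi-Civita symbol epsilon_{ijk}.
eps = np.zeros((3, 3, 3))
for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    eps[i, j, k] = 1.0   # even permutations
    eps[i, k, j] = -1.0  # odd permutations

rng = np.random.default_rng(0)
A, B = rng.standard_normal(3), rng.standard_normal(3)

# [A x B]_i = eps_ijk A_j B_k, the index form behind the curl in (3.39).
cross = np.einsum('ijk,j,k->i', eps, A, B)
print(np.allclose(cross, np.cross(A, B)))  # True

# eps_ijk eps_ilm = delta_jl delta_km - delta_jm delta_kl
delta = np.eye(3)
lhs = np.einsum('ijk,ilm->jklm', eps, eps)
rhs = np.einsum('jl,km->jklm', delta, delta) - np.einsum('jm,kl->jklm', delta, delta)
print(np.allclose(lhs, rhs))  # True
```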

3.12.1. Common Identities in Einstein Notation


Here we present some of the widely used identities of vector calculus in the traditional vector nota-
tion and in its equivalent Einstein Notation. In the following bullet points, f and h are differentiable
scalar fields; A, B, C and D are differentiable vector fields; and r = xi ei is the position vector.

∇·r=n
⇕ (3.47)
∂i xi = n

where n is the space dimension.

∇×r=0
⇕ (3.48)
ϵijk ∂j xk = 0

∇ (a · r) = a
⇕ (3.49)
∂i (aj xj ) = ai

where a is a constant vector.

∇ · (∇f ) = ∇2 f
⇕ (3.50)
∂i (∂i f ) = ∂ii f

∇ · (∇ × A) = 0
⇕ (3.51)
ϵijk ∂i ∂j Ak = 0

∇ × (∇f ) = 0
⇕ (3.52)
ϵijk ∂j ∂k f = 0

∇ (f h) = f ∇h + h∇f
⇕ (3.53)
∂i (f h) = f ∂i h + h∂i f

∇ · (f A) = f ∇ · A + A · ∇f
⇕ (3.54)
∂i (f Ai ) = f ∂i Ai + Ai ∂i f

∇ × (f A) = f ∇ × A + ∇f × A
⇕ (3.55)
ϵijk ∂j (f Ak ) = f ϵijk ∂j Ak + ϵijk (∂j f ) Ak

A × (∇ × B) = (∇B) · A − A · ∇B
⇕ (3.56)
ϵijk ϵklm Aj ∂l Bm = (∂i Bm ) Am − Al (∂l Bi )

∇ × (∇ × A) = ∇ (∇ · A) − ∇2 A
⇕ (3.57)
ϵijk ϵklm ∂j ∂l Am = ∂i (∂m Am ) − ∂ll Ai

∇ (A · B) = A × (∇ × B) + B × (∇ × A) + (A · ∇) B + (B · ∇) A
⇕ (3.58)
∂i (Am Bm ) = ϵijk Aj (ϵklm ∂l Bm ) + ϵijk Bj (ϵklm ∂l Am ) + (Al ∂l ) Bi + (Bl ∂l ) Ai

∇ · (A × B) = B · (∇ × A) − A · (∇ × B)
⇕ (3.59)
∂i (ϵijk Aj Bk ) = Bk (ϵkij ∂i Aj ) − Aj (ϵjik ∂i Bk )

∇ × (A × B) = (B · ∇) A + (∇ · B) A − (∇ · A) B − (A · ∇) B
⇕ (3.60)
ϵijk ϵklm ∂j (Al Bm ) = (Bm ∂m ) Ai + (∂m Bm ) Ai − (∂j Aj ) Bi − (Aj ∂j ) Bi



                    │ A · C   A · D │
(A × B) · (C × D) = │               │
                    │ B · C   B · D │
⇕ (3.61)
ϵijk Aj Bk ϵilm Cl Dm = (Al Cl ) (Bm Dm ) − (Am Dm ) (Bl Cl )
(A × B) × (C × D) = [D · (A × B)] C − [C · (A × B)] D
⇕ (3.62)
ϵijk ϵjmn Am Bn ϵkpq Cp Dq = (ϵqmn Dq Am Bn ) Ci − (ϵpmn Cp Am Bn ) Di

In Einstein notation, the condition for a vector field A to be solenoidal is:

∇·A=0
⇕ (3.63)
∂i Ai = 0

In Einstein notation, the condition for a vector field A to be irrotational is:

∇×A=0
⇕ (3.64)
ϵijk ∂j Ak = 0
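Before proving these identities by index manipulation, one can spot-check them symbolically. The sketch below verifies ∇ × (∇ × A) = ∇ (∇ · A) − ∇2 A for generic field components (the names A1, A2, A3 are placeholders, not from the text).

```python
import sympy as sp

x, y, z = sp.symbols('x y z')
co = (x, y, z)
A = [sp.Function(n)(x, y, z) for n in ('A1', 'A2', 'A3')]

def curl(F):
    return [sp.diff(F[2], y) - sp.diff(F[1], z),
            sp.diff(F[0], z) - sp.diff(F[2], x),
            sp.diff(F[1], x) - sp.diff(F[0], y)]

def div(F):
    return sum(sp.diff(F[i], co[i]) for i in range(3))

def laplacian(f):
    return sum(sp.diff(f, c, 2) for c in co)

# curl(curl A) = grad(div A) - Laplacian A, componentwise
lhs = curl(curl(A))
gd = [sp.diff(div(A), c) for c in co]
rhs = [gd[i] - laplacian(A[i]) for i in range(3)]
print(all(sp.simplify(lhs[i] - rhs[i]) == 0 for i in range(3)))  # True
```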

3.12.2. Examples of Using Einstein Notation to Prove Identities


245 Example
Show that ∇ · r = n:

Solution: ▶
∇ · r = ∂i xi   (Eq. 3.32)
      = δii     (Eq. 10.35)          (3.65)
      = n       (Eq. 10.36)

246 Example
Show that ∇ × r = 0:

Solution: ▶

[∇ × r]i = ϵijk ∂j xk   (Eq. 3.39)
         = ϵijk δkj     (Eq. 10.35)
         = ϵijj         (Eq. 10.32)     (3.66)
         = 0            (Eq. 10.27)

Since i is a free index the identity is proved for all components.



247 Example
∇ (a · r) = a:

Solution: ▶
[∇ (a · r)]i = ∂i (aj xj )           (Eqs. 3.27 & 1.25)
             = aj ∂i xj + xj ∂i aj   (product rule)
             = aj ∂i xj              (aj is constant)
             = aj δji                (Eq. 10.35)          (3.67)
             = ai                    (Eq. 10.32)
             = [a]i                  (definition of index)

Since i is a free index the identity is proved for all components.



∇ · (∇f ) = ∇2 f :

∇ · (∇f ) = ∂i [∇f ]i   (Eq. 3.32)
          = ∂i (∂i f )   (Eq. 3.27)
          = ∂i ∂i f      (rules of differentiation)     (3.68)
          = ∂ii f        (definition of 2nd derivative)
          = ∇2 f         (Eq. 3.43)

∇ · (∇ × A) = 0:

∇ · (∇ × A) = ∂i [∇ × A]i      (Eq. 3.32)
            = ∂i (ϵijk ∂j Ak )  (Eq. 3.39)
            = ϵijk ∂i ∂j Ak     (∂ not acting on ϵ)
            = ϵijk ∂j ∂i Ak     (continuity condition)     (3.69)
            = −ϵjik ∂j ∂i Ak    (Eq. 10.40)
            = −ϵijk ∂i ∂j Ak    (relabeling dummy indices i and j)
            = 0                 (since ϵijk ∂i ∂j Ak = −ϵijk ∂i ∂j Ak )

This can also be concluded from line three by arguing that: since by the continuity condition ∂i and
∂j can change their order with no change in the value of the term while a corresponding change of
the order of i and j in ϵijk results in a sign change, we see that each term in the sum has its own
negative and hence the terms add up to zero (see Eq. 10.50).
∇ × (∇f ) = 0:
[∇ × (∇f )]i = ϵijk ∂j [∇f ]k   (Eq. 3.39)
             = ϵijk ∂j (∂k f )   (Eq. 3.27)
             = ϵijk ∂j ∂k f      (rules of differentiation)
             = ϵijk ∂k ∂j f      (continuity condition)     (3.70)
             = −ϵikj ∂k ∂j f     (Eq. 10.40)
             = −ϵijk ∂j ∂k f     (relabeling dummy indices j and k)
             = 0                 (since ϵijk ∂j ∂k f = −ϵijk ∂j ∂k f )

This can also be concluded from line three by a similar argument to the one given in the previous point. Because [∇ × (∇f )]i is an arbitrary component, each component is zero.
∇ (f h) = f ∇h + h∇f :
[∇ (f h)]i = ∂i (f h)            (Eq. 3.27)
           = f ∂i h + h∂i f      (product rule)     (3.71)
           = [f ∇h]i + [h∇f ]i   (Eq. 3.27)
           = [f ∇h + h∇f ]i      (Eq. ??)

Because i is a free index the identity is proved for all components.


∇ · (f A) = f ∇ · A + A · ∇f :

∇ · (f A) = ∂i [f A]i          (Eq. 3.32)
          = ∂i (f Ai )          (definition of index)
          = f ∂i Ai + Ai ∂i f   (product rule)     (3.72)
          = f ∇ · A + A · ∇f    (Eqs. 3.32 & 3.45)

∇ × (f A) = f ∇ × A + ∇f × A:
[∇ × (f A)]i = ϵijk ∂j [f A]k                  (Eq. 3.39)
             = ϵijk ∂j (f Ak )                  (definition of index)
             = f ϵijk ∂j Ak + ϵijk (∂j f ) Ak   (product rule & commutativity)     (3.73)
             = f ϵijk ∂j Ak + ϵijk [∇f ]j Ak    (Eq. 3.27)
             = [f ∇ × A]i + [∇f × A]i           (Eqs. 3.39 & ??)
             = [f ∇ × A + ∇f × A]i              (Eq. ??)

Because i is a free index the identity is proved for all components.

A × (∇ × B) = (∇B) · A − A · ∇B:
[A × (∇ × B)]i = ϵijk Aj [∇ × B]k                    (Eq. ??)
               = ϵijk Aj ϵklm ∂l Bm                   (Eq. 3.39)
               = ϵijk ϵklm Aj ∂l Bm                   (commutativity)
               = ϵijk ϵlmk Aj ∂l Bm                   (Eq. 10.40)
               = (δil δjm − δim δjl ) Aj ∂l Bm        (Eq. 10.58)
               = δil δjm Aj ∂l Bm − δim δjl Aj ∂l Bm  (distributivity)     (3.74)
               = Am ∂i Bm − Al ∂l Bi                  (Eq. 10.32)
               = (∂i Bm ) Am − Al (∂l Bi )            (commutativity & grouping)
               = [(∇B) · A]i − [A · ∇B]i
               = [(∇B) · A − A · ∇B]i                 (Eq. ??)

Because i is a free index the identity is proved for all components.


∇ × (∇ × A) = ∇ (∇ · A) − ∇2 A:
[∇ × (∇ × A)]i = ϵijk ∂j [∇ × A]k                     (Eq. 3.39)
               = ϵijk ∂j (ϵklm ∂l Am )                 (Eq. 3.39)
               = ϵijk ϵklm ∂j (∂l Am )                 (∂ not acting on ϵ)
               = ϵijk ϵlmk ∂j ∂l Am                    (Eq. 10.40 & definition of derivative)
               = (δil δjm − δim δjl ) ∂j ∂l Am         (Eq. 10.58)
               = δil δjm ∂j ∂l Am − δim δjl ∂j ∂l Am   (distributivity)
               = ∂m ∂i Am − ∂l ∂l Ai                   (Eq. 10.32)
               = ∂i (∂m Am ) − ∂ll Ai                  (∂ shift, grouping & Eq. ??)
               = [∇ (∇ · A)]i − [∇2 A]i                (Eqs. 3.32, 3.27 & 3.44)
               = [∇ (∇ · A) − ∇2 A]i                   (Eqs. ??)
(3.75)
Because i is a free index the identity is proved for all components. This identity can also be consid-
ered as an instance of the identity before the last one, observing that in the second term on the right
hand side the Laplacian should precede the vector, and hence no independent proof is required.
∇ (A · B) = A × (∇ × B) + B × (∇ × A) + (A · ∇) B + (B · ∇) A:
We start from the right hand side and end with the left hand side
[A × (∇ × B) + B × (∇ × A) + (A · ∇) B + (B · ∇) A]i
  = [A × (∇ × B)]i + [B × (∇ × A)]i + [(A · ∇) B]i + [(B · ∇) A]i                             (Eq. ??)
  = ϵijk Aj [∇ × B]k + ϵijk Bj [∇ × A]k + (Al ∂l ) Bi + (Bl ∂l ) Ai                           (Eqs. ??, 3.32 & indexing)
  = ϵijk Aj (ϵklm ∂l Bm ) + ϵijk Bj (ϵklm ∂l Am ) + (Al ∂l ) Bi + (Bl ∂l ) Ai                 (Eq. 3.39)
  = ϵijk ϵklm Aj ∂l Bm + ϵijk ϵklm Bj ∂l Am + (Al ∂l ) Bi + (Bl ∂l ) Ai                       (commutativity)
  = ϵijk ϵlmk Aj ∂l Bm + ϵijk ϵlmk Bj ∂l Am + (Al ∂l ) Bi + (Bl ∂l ) Ai                       (Eq. 10.40)
  = (δil δjm − δim δjl ) Aj ∂l Bm + (δil δjm − δim δjl ) Bj ∂l Am + (Al ∂l ) Bi + (Bl ∂l ) Ai (Eq. 10.58)     (3.76)

  = (δil δjm Aj ∂l Bm − δim δjl Aj ∂l Bm ) + (δil δjm Bj ∂l Am − δim δjl Bj ∂l Am ) + (Al ∂l ) Bi + (Bl ∂l ) Ai   (distributivity)
  = δil δjm Aj ∂l Bm − Al ∂l Bi + δil δjm Bj ∂l Am − Bl ∂l Ai + (Al ∂l ) Bi + (Bl ∂l ) Ai     (Eq. 10.32)
  = δil δjm Aj ∂l Bm − (Al ∂l ) Bi + δil δjm Bj ∂l Am − (Bl ∂l ) Ai + (Al ∂l ) Bi + (Bl ∂l ) Ai   (grouping)
  = δil δjm Aj ∂l Bm + δil δjm Bj ∂l Am   (cancellation)
  = Am ∂i Bm + Bm ∂i Am                   (Eq. 10.32)
  = ∂i (Am Bm )                           (product rule)
  = [∇ (A · B)]i                          (Eqs. 3.27 & 3.32)

Because i is a free index the identity is proved for all components.


∇ · (A × B) = B · (∇ × A) − A · (∇ × B):

∇ · (A × B) = ∂i [A × B]i                         (Eq. 3.32)
            = ∂i (ϵijk Aj Bk )                     (Eq. ??)
            = ϵijk ∂i (Aj Bk )                     (∂ not acting on ϵ)
            = ϵijk (Bk ∂i Aj + Aj ∂i Bk )          (product rule)
            = ϵijk Bk ∂i Aj + ϵijk Aj ∂i Bk        (distributivity)     (3.77)
            = ϵkij Bk ∂i Aj − ϵjik Aj ∂i Bk        (Eq. 10.40)
            = Bk (ϵkij ∂i Aj ) − Aj (ϵjik ∂i Bk )  (commutativity & grouping)
            = Bk [∇ × A]k − Aj [∇ × B]j            (Eq. 3.39)
            = B · (∇ × A) − A · (∇ × B)            (Eq. 1.25)

∇ × (A × B) = (B · ∇) A + (∇ · B) A − (∇ · A) B − (A · ∇) B:

[∇ × (A × B)]i = ϵijk ∂j [A × B]k                                          (Eq. 3.39)
               = ϵijk ∂j (ϵklm Al Bm )                                      (Eq. ??)
               = ϵijk ϵklm ∂j (Al Bm )                                      (∂ not acting on ϵ)
               = ϵijk ϵklm (Bm ∂j Al + Al ∂j Bm )                           (product rule)
               = ϵijk ϵlmk (Bm ∂j Al + Al ∂j Bm )                           (Eq. 10.40)
               = (δil δjm − δim δjl )(Bm ∂j Al + Al ∂j Bm )                 (Eq. 10.58)
               = δil δjm Bm ∂j Al + δil δjm Al ∂j Bm − δim δjl Bm ∂j Al − δim δjl Al ∂j Bm   (distributivity)
               = Bm ∂m Ai + Ai ∂m Bm − Bi ∂j Aj − Aj ∂j Bi                  (Eq. 10.32)
               = (Bm ∂m ) Ai + (∂m Bm ) Ai − (∂j Aj ) Bi − (Aj ∂j ) Bi      (grouping)
               = [(B · ∇) A]i + [(∇ · B) A]i − [(∇ · A) B]i − [(A · ∇) B]i  (Eqs. 3.45 & 3.32)
               = [(B · ∇) A + (∇ · B) A − (∇ · A) B − (A · ∇) B]i           (Eq. ??)
(3.78)
Because i is a free index the identity is proved for all components.




                    │ A · C   A · D │
(A × B) · (C × D) = │               │ :
                    │ B · C   B · D │

(A × B) · (C × D) = [A × B]i [C × D]i                                   (Eq. 1.25)
                  = ϵijk Aj Bk ϵilm Cl Dm                                (Eq. ??)
                  = ϵijk ϵilm Aj Bk Cl Dm                                (commutativity)
                  = (δjl δkm − δjm δkl ) Aj Bk Cl Dm                     (Eqs. 10.40 & 10.58)
                  = δjl δkm Aj Bk Cl Dm − δjm δkl Aj Bk Cl Dm            (distributivity)
                  = (δjl Aj Cl )(δkm Bk Dm ) − (δjm Aj Dm )(δkl Bk Cl )  (commutativity & grouping)
                  = (Al Cl )(Bm Dm ) − (Am Dm )(Bl Cl )                  (Eq. 10.32)
                  = (A · C)(B · D) − (A · D)(B · C)                      (Eq. 1.25)

                    │ A · C   A · D │
                  = │               │                                    (definition of determinant)
                    │ B · C   B · D │
(3.79)
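The identity (3.79) involves no derivatives, so it can be checked directly on random vectors:

```python
import numpy as np

rng = np.random.default_rng(1)
A, B, C, D = rng.standard_normal((4, 3))

# (A x B) . (C x D) = (A.C)(B.D) - (A.D)(B.C)
lhs = np.dot(np.cross(A, B), np.cross(C, D))
rhs = np.dot(A, C) * np.dot(B, D) - np.dot(A, D) * np.dot(B, C)
print(np.isclose(lhs, rhs))  # True
```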
(A × B) × (C × D) = [D · (A × B)] C − [C · (A × B)] D:
[(A × B) × (C × D)]i = ϵijk [A × B]j [C × D]k                        (Eq. ??)
                     = ϵijk ϵjmn Am Bn ϵkpq Cp Dq                     (Eq. ??)
                     = ϵijk ϵkpq ϵjmn Am Bn Cp Dq                     (commutativity)
                     = ϵijk ϵpqk ϵjmn Am Bn Cp Dq                     (Eq. 10.40)
                     = (δip δjq − δiq δjp ) ϵjmn Am Bn Cp Dq          (Eq. 10.58)
                     = (δip δjq ϵjmn − δiq δjp ϵjmn ) Am Bn Cp Dq     (distributivity)
                     = (δip ϵqmn − δiq ϵpmn ) Am Bn Cp Dq             (Eq. 10.32)
                     = δip ϵqmn Am Bn Cp Dq − δiq ϵpmn Am Bn Cp Dq    (distributivity)
                     = ϵqmn Am Bn Ci Dq − ϵpmn Am Bn Cp Di            (Eq. 10.32)
                     = ϵqmn Dq Am Bn Ci − ϵpmn Cp Am Bn Di            (commutativity)
                     = (ϵqmn Dq Am Bn ) Ci − (ϵpmn Cp Am Bn ) Di      (grouping)
                     = [D · (A × B)] Ci − [C · (A × B)] Di            (Eq. ??)
                     = [[D · (A × B)] C]i − [[C · (A × B)] D]i        (definition of index)
                     = [[D · (A × B)] C − [C · (A × B)] D]i           (Eq. ??)
(3.80)
Because i is a free index the identity is proved for all components.

Part II.

Integral Vector Calculus

4. Multiple Integrals
In this chapter we develop the theory of integration for scalar functions.
Recall that the definite integral of a nonnegative function f (x) ≥ 0 represents the area
“under” the curve y = f (x). As we will now see, the double integral of a nonnegative real-valued
function f (x, y) ≥ 0 represents the volume “under” the surface z = f (x, y).

4.1. Double Integrals


Let R = [a, b] × [c, d] ⊆ R2 be a rectangle, and f : R → R be continuous. Let P = {x0 , . . . , xM , y0 , . . . , yN } where a = x0 < x1 < · · · < xM = b and c = y0 < y1 < · · · < yN = d. The set P determines a partition of R into a grid of (non-overlapping) rectangles Ri,j = [xi , xi+1 ] × [yj , yj+1 ] for 0 ≤ i < M and 0 ≤ j < N . Given P , choose a collection of points M = {ξi,j } so that ξi,j ∈ Ri,j for all i, j.


248 Definition
The Riemann sum of f with respect to the partition P and points M is defined by


    R(f, P, M) := ∑_{i=0}^{M −1} ∑_{j=0}^{N −1} f (ξi,j ) area(Ri,j ) = ∑_{i=0}^{M −1} ∑_{j=0}^{N −1} f (ξi,j )(xi+1 − xi )(yj+1 − yj )

249 Definition
The mesh size of a partition P is defined by
    ∥P ∥ = max( {xi+1 − xi | 0 ≤ i < M } ∪ {yj+1 − yj | 0 ≤ j < N } ).

250 Definition
The Riemann integral of f over the rectangle R is defined by
    ∬R f (x, y) dx dy := lim∥P ∥→0 R(f, P, M),

provided the limit exists and is independent of the choice of the points M. A function is said to be
Riemann integrable over R if the Riemann integral exists and is finite.

251 Remark
A few other popular notation conventions used to denote the integral are
    ∬R f dA,   ∬R f dx dy,   ∬R f dx1 dx2 ,   and   ∬R f.

252 Remark
The double integral represents the volume of the region under the graph of f . Alternately, if f (x, y) is
the density of a planar body at point (x, y), the double integral is the total mass.

253 Theorem
Any bounded continuous function is Riemann integrable on a bounded rectangle.

254 Remark
Most bounded functions we will encounter will be Riemann integrable. Bounded functions with reasonable discontinuities (e.g. finitely many jumps) are usually Riemann integrable on bounded rectangles. An example of a “badly discontinuous” function that is not Riemann integrable is the function
f (x, y) = 1 if x, y ∈ Q and 0 otherwise.
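A small numeric sketch of Definition 250: Riemann sums over finer and finer grids approach the exact integral. Here f(x, y) = x²y over [0, 1]², whose integral is 1/6; the midpoint choice of ξi,j and the helper name are illustrative.

```python
import numpy as np

def riemann_sum(f, a, b, c, d, n):
    # Riemann sum with midpoint sample points on an n x n uniform grid
    # over the rectangle [a, b] x [c, d].
    xs = a + (np.arange(n) + 0.5) * (b - a) / n
    ys = c + (np.arange(n) + 0.5) * (d - c) / n
    X, Y = np.meshgrid(xs, ys, indexing='ij')
    return f(X, Y).sum() * ((b - a) / n) * ((d - c) / n)

f = lambda x, y: x**2 * y     # exact integral over [0, 1]^2 is 1/6
for n in (4, 16, 64):
    print(n, abs(riemann_sum(f, 0, 1, 0, 1, n) - 1/6))  # error shrinks with n
```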

Now suppose U ⊆ R2 is a nice bounded1 domain, and f : U → R is a function. Find a bounded


rectangle R ⊇ U , and as before let P be a partition of R into a grid of rectangles. Now we define
1
We will subsequently always assume U is “nice”. Namely, U is open, connected and the boundary of U is a piecewise
differentiable curve. More precisely, we need to assume that the “area” occupied by the boundary of U is 0. While
you might suspect this should be true for all open sets, it isn’t! There exist open sets of finite area whose boundary
occupies an infinite area!

the Riemann sum by only summing over all rectangles Ri,j that are completely contained inside U .
Explicitly, let

    χi,j = 1 if Ri,j ⊆ U , and χi,j = 0 otherwise,

and define

    R(f, P, M, U ) := ∑_{i=0}^{M −1} ∑_{j=0}^{N −1} χi,j f (ξi,j )(xi+1 − xi )(yj+1 − yj ).

255 Definition
The Riemann integral of f over the domain U is defined by
    ∬U f (x, y) dx dy := lim∥P ∥→0 R(f, P, M, U ),

provided the limit exists and is independent of the choice of the points M. A function is said to be Riemann integrable over U if the Riemann integral exists and is finite.

256 Theorem
Any bounded continuous function is Riemann integrable on a bounded region.

257 Remark
As before, most reasonable bounded functions we will encounter will be Riemann integrable.

To deal with unbounded functions over unbounded domains, we use a limiting process.

258 Definition
Let U ⊆ R2 be a domain (which is not necessarily bounded) and f : U → R be a (not necessarily
bounded) function. We say f is integrable if
    limR→∞ ∬U ∩B(0,R) χR |f | dA

exists and is finite. Here χR (x) = 1 if f (x) < R and 0 otherwise.

259 Proposition
If f is integrable on the domain U , then
    limR→∞ ∬U ∩B(0,R) χR f dA

exists and is finite.


260 Remark
If f is integrable, then the above limit is independent of how you expand your domain. Namely, you
can take the limit of the integral over U ∩ [−R, R]2 instead, and you will still get the same answer.


261 Definition
If f is integrable we define
    ∬U f dx dy = limR→∞ ∬U ∩B(0,R) χR f dA

4.2. Iterated integrals and Fubini’s theorem


Let f (x, y) be a continuous function such that f (x, y) ≥ 0 for all (x, y) on the rectangle R =
{(x, y) : a ≤ x ≤ b, c ≤ y ≤ d} in R2 . We will often write this as R = [a, b] × [c, d]. For any number
x∗ in the interval [a, b], slice the surface z = f (x, y) with the plane x = x∗ parallel to the yz-plane.
Then the trace of the surface in that plane is the curve f (x∗, y), where x∗ is fixed and only y varies.
The area A under that curve (i.e. the area of the region between the curve and the xy-plane) as y
varies over the interval [c, d] then depends only on the value of x∗. So using the variable x instead
of x∗, let A(x) be that area (see Figure 4.1).
[Figure 4.1. The area A(x) varies with x]


Then A(x) = ∫_c^d f (x, y) dy since we are treating x as fixed, and only y varies. This makes sense
since for a fixed x the function f (x, y) is a continuous function of y over the interval [c, d], so we
know that the area under the curve is the definite integral. The area A(x) is a function of x, so by
the “slice” or cross-section method from single-variable calculus we know that the volume V of the
solid under the surface z = f (x, y) but above the xy-plane over the rectangle R is the integral over
[a, b] of that cross-sectional area A(x):
    V = ∫_a^b A(x) dx = ∫_a^b [ ∫_c^d f (x, y) dy ] dx    (4.1)

We will always refer to this volume as “the volume under the surface”. The above expression uses
what are called iterated integrals. First the function f (x, y) is integrated as a function of y, treating
the variable x as a constant (this is called integrating with respect to y). That is what occurs in the
“inner” integral between the square brackets in equation (4.1). This is the first iterated integral.
Once that integration is performed, the result is then an expression involving only x, which can

then be integrated with respect to x. That is what occurs in the “outer” integral above (the second
iterated integral). The final result is then a number (the volume). This process of going through two
iterations of integrals is called double integration, and the last expression in equation (4.1) is called
a double integral.
Notice that integrating f (x, y) with respect to y is the inverse operation of taking the partial
derivative of f (x, y) with respect to y. Also, we could just as easily have taken the area of cross-
sections under the surface which were parallel to the xz-plane, which would then depend only on
the variable y, so that the volume V would be
    V = ∫_c^d [ ∫_a^b f (x, y) dx ] dy .    (4.2)

It turns out that in general due to Fubini’s Theorem the order of the iterated integrals does not
matter. Also, we will usually discard the brackets and simply write
    V = ∫_c^d ∫_a^b f (x, y) dx dy ,    (4.3)

where it is understood that the fact that dx is written before dy means that the function f (x, y)
is first integrated with respect to x using the “inner” limits of integration a and b, and then the
resulting function is integrated with respect to y using the “outer” limits of integration c and d. This
order of integration can be changed if it is more convenient.
Let U ⊆ R2 be a domain.

262 Definition
For x ∈ R, define
    Sx U = {y | (x, y) ∈ U }   and   Ty U = {x | (x, y) ∈ U }

263 Example
If U = [a, b] × [c, d] then
 

    Sx U = [c, d] if x ∈ [a, b], and ∅ if x ̸∈ [a, b];   Ty U = [a, b] if y ∈ [c, d], and ∅ if y ̸∈ [c, d].

For domains we will consider, Sx U and Ty U will typically be an interval (or a finite union of in-
tervals).

264 Definition
Given a function f : U → R, we define the two iterated integrals by
    ∫_{x∈R} ( ∫_{y∈Sx U} f (x, y) dy ) dx   and   ∫_{y∈R} ( ∫_{x∈Ty U} f (x, y) dx ) dy,

with the convention that an integral over the empty set is 0. (We included the parenthesis above for
clarity; and will drop them as we become more familiar with iterated integrals.)


Suppose f (x, y) represents the density of a planar body at point (x, y). For any x ∈ R,
    ∫_{y∈Sx U} f (x, y) dy

represents the mass of the body contained in the vertical line through the point (x, 0). It’s only natural to expect that if we integrate this with respect to x, we will get the total mass, which is
the double integral. By the same argument, we should get the same answer if we had sliced it
horizontally first and then vertically. Consequently, we expect both iterated integrals to be equal
to the double integral. This is true, under a finiteness assumption.

265 Theorem (Fubini’s theorem)


Suppose f : U → R is a function such that either

    ∫_{x∈R} ( ∫_{y∈Sx U} |f (x, y)| dy ) dx < ∞   or   ∫_{y∈R} ( ∫_{x∈Ty U} |f (x, y)| dx ) dy < ∞,    (4.4)

then f is integrable over U and

    ∬U f dA = ∫_{x∈R} ( ∫_{y∈Sx U} f (x, y) dy ) dx = ∫_{y∈R} ( ∫_{x∈Ty U} f (x, y) dx ) dy.

Without the assumption (4.4) the iterated integrals need not be equal, even though both may
exist and be finite.
266 Example
Define

    f (x, y) = −∂x ∂y tan−1 (y/x) = (x2 − y 2 )/(x2 + y 2 )2 .

Then

    ∫_{x=0}^{1} ∫_{y=0}^{1} f (x, y) dy dx = π/4   and   ∫_{y=0}^{1} ∫_{x=0}^{1} f (x, y) dx dy = −π/4
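Both iterated integrals in this counterexample can be reproduced symbolically (assuming sympy):

```python
import sympy as sp

x, y = sp.symbols('x y', positive=True)
f = (x**2 - y**2) / (x**2 + y**2)**2

# The two orders of integration give different answers: pi/4 and -pi/4.
I_yx = sp.integrate(sp.integrate(f, (y, 0, 1)), (x, 0, 1))
I_xy = sp.integrate(sp.integrate(f, (x, 0, 1)), (y, 0, 1))
print(I_yx, I_xy)
```

The inner integral in the first order is 1/(1 + x²), which is positive, so the sign flip in the other order comes entirely from the antisymmetry f(x, y) = −f(y, x).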

267 Example
Let f (x, y) = (x − y)/(x + y)3 if x, y > 0 and 0 otherwise, and U = (0, 1)2 . The iterated integrals
of f over U both exist, but are not equal.

268 Example
Define

    f (x, y) = 1 if y ∈ (x, x + 1) and x ≥ 0;  −1 if y ∈ (x − 1, x) and x ≥ 0;  0 otherwise.

Then the iterated integrals of f both exist and are not equal.

269 Example
Find the volume V under the plane z = 8x + 6y over the rectangle R = [0, 1] × [0, 2].

Solution: ▶ We see that f (x, y) = 8x + 6y ≥ 0 for 0 ≤ x ≤ 1 and 0 ≤ y ≤ 2, so:

    V = ∫_0^2 ∫_0^1 (8x + 6y) dx dy
      = ∫_0^2 [ 4x2 + 6xy ]_{x=0}^{x=1} dy
      = ∫_0^2 (4 + 6y) dy
      = [ 4y + 3y 2 ]_0^2
      = 20

Suppose we had switched the order of integration. We can verify that we still get the same answer:

    V = ∫_0^1 ∫_0^2 (8x + 6y) dy dx
      = ∫_0^1 [ 8xy + 3y 2 ]_{y=0}^{y=2} dx
      = ∫_0^1 (16x + 12) dx
      = [ 8x2 + 12x ]_0^1
      = 20

◀

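Example 269 can be confirmed symbolically (assuming sympy):

```python
import sympy as sp

x, y = sp.symbols('x y')

# Volume under z = 8x + 6y over the rectangle [0, 1] x [0, 2].
V = sp.integrate(8 * x + 6 * y, (x, 0, 1), (y, 0, 2))
print(V)  # 20
```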

270 Example
Find the volume V under the surface z = ex+y over the rectangle R = [2, 3] × [1, 2].

Solution: ▶ We know that f (x, y) = ex+y > 0 for all (x, y), so

    V = ∫_1^2 ∫_2^3 ex+y dx dy
      = ∫_1^2 [ ex+y ]_{x=2}^{x=3} dy
      = ∫_1^2 (ey+3 − ey+2 ) dy
      = [ ey+3 − ey+2 ]_1^2
      = e5 − e4 − (e4 − e3 ) = e5 − 2e4 + e3

◀

Recall that for a general function f (x), the integral ∫_a^b f (x) dx represents the difference of the
area below the curve y = f (x) but above the x-axis when f (x) ≥ 0, and the area above the curve
but below the x-axis when f (x) ≤ 0. Similarly, the double integral of any continuous function

109
4. Multiple Integrals
f (x, y) represents the difference of the volume below the surface z = f (x, y) but above the xy-
plane when f (x, y) ≥ 0, and the volume above the surface but below the xy-plane when f (x, y) ≤
0. Thus, our method of double integration by means of iterated integrals can be used to evaluate
the double integral of any continuous function over a rectangle, regardless of whether f (x, y) ≥ 0
or not.

271 Example
Evaluate ∫_0^{2π} ∫_0^{π} sin(x + y) dx dy.

Solution: ▶ Note that f(x, y) = sin(x + y) is both positive and negative over the rectangle [0, π] × [0, 2π]. We can still evaluate the double integral:

    ∫_0^{2π} ∫_0^{π} sin(x + y) dx dy = ∫_0^{2π} [ −cos(x + y) ]_{x=0}^{x=π} dy
      = ∫_0^{2π} (−cos(y + π) + cos y) dy
      = [ −sin(y + π) + sin y ]_0^{2π} = −sin 3π + sin 2π − (−sin π + sin 0)
      = 0

◀

Exercises
A
For Exercises 1-4, find the volume under the surface z = f(x, y) over the rectangle R.

1. f(x, y) = 4xy, R = [0, 1] × [0, 1]
2. f(x, y) = e^{x+y}, R = [0, 1] × [−1, 1]
3. f(x, y) = x³ + y², R = [0, 1] × [0, 1]
4. f(x, y) = x⁴ + xy + y³, R = [1, 2] × [0, 2]

For Exercises 5-12, evaluate the given double integral.

5. ∫_0^1 ∫_1^2 (1 − y)x² dx dy
6. ∫_0^1 ∫_0^2 x(x + y) dx dy
7. ∫_0^2 ∫_0^1 (x + 2) dx dy
8. ∫_{−1}^2 ∫_{−1}^1 x(xy + sin x) dx dy
9. ∫_0^{π/2} ∫_0^1 xy cos(x²y) dx dy
10. ∫_0^{π} ∫_0^{π/2} sin x cos(y − π) dx dy
11. ∫_0^2 ∫_1^4 xy dx dy
12. ∫_{−1}^1 ∫_{−1}^2 1 dx dy

13. Let M be a constant. Show that ∫_c^d ∫_a^b M dx dy = M(d − c)(b − a).


4.3. Double Integrals Over a General Region


In the previous section we got an idea of what a double integral over a rectangle represents. We can
now define the double integral of a real-valued function f (x, y) over more general regions in R2 .
Suppose that we have a region R in the xy-plane that is bounded on the left by the vertical line
x = a, bounded on the right by the vertical line x = b (where a < b), bounded below by a curve
y = g1 (x), and bounded above by a curve y = g2 (x), as in Figure 4.2(a). We will assume that g1 (x)
and g2 (x) do not intersect on the open interval (a, b) (they could intersect at the endpoints x = a
and x = b, though).

Figure 4.2. Double integral over a nonrectangular region R: (a) vertical slices between y = g_1(x) and y = g_2(x) give ∫_a^b ∫_{g_1(x)}^{g_2(x)} f(x, y) dy dx; (b) horizontal slices between x = h_1(y) and x = h_2(y) give ∫_c^d ∫_{h_1(y)}^{h_2(y)} f(x, y) dx dy.

Then, using the slice method from the previous section, the double integral of a real-valued function f(x, y) over the region R, denoted by ∬_R f(x, y) dA, is given by

    ∬_R f(x, y) dA = ∫_a^b [ ∫_{g_1(x)}^{g_2(x)} f(x, y) dy ] dx    (4.5)

This means that we take vertical slices in the region R between the curves y = g1 (x) and y = g2 (x).
The symbol dA is sometimes called an area element or infinitesimal, with the A signifying area. Note
that f (x, y) is first integrated with respect to y, with functions of x as the limits of integration. This
makes sense since the result of the first iterated integral will have to be a function of x alone, which
then allows us to take the second iterated integral with respect to x.
Similarly, if we have a region R in the xy-plane that is bounded on the left by a curve x = h1 (y),
bounded on the right by a curve x = h2 (y), bounded below by the horizontal line y = c, and
bounded above by the horizontal line y = d (where c < d), as in Figure 4.2(b) (assuming that h1 (y)

and h_2(y) do not intersect on the open interval (c, d)), then taking horizontal slices gives

    ∬_R f(x, y) dA = ∫_c^d [ ∫_{h_1(y)}^{h_2(y)} f(x, y) dx ] dy    (4.6)

Notice that these definitions include the case when the region R is a rectangle. Also, if f(x, y) ≥ 0 for all (x, y) in the region R, then ∬_R f(x, y) dA is the volume under the surface z = f(x, y) over the region R.

272 Example
Find the volume V under the plane z = 8x + 6y over the region R = {(x, y) : 0 ≤ x ≤ 1, 0 ≤ y ≤ 2x²}.

Solution: ▶ The region R is shown in Figure 4.3. Using vertical slices we get:

    V = ∬_R (8x + 6y) dA
      = ∫_0^1 [ ∫_0^{2x²} (8x + 6y) dy ] dx
      = ∫_0^1 [ 8xy + 3y² ]_{y=0}^{y=2x²} dx
      = ∫_0^1 (16x³ + 12x⁴) dx
      = [ 4x⁴ + (12/5)x⁵ ]_0^1 = 4 + 12/5 = 32/5 = 6.4

Figure 4.3. The region R, bounded above by the curve y = 2x² for 0 ≤ x ≤ 1.

We get the same answer using horizontal slices (see Figure 4.4):

    V = ∬_R (8x + 6y) dA
      = ∫_0^2 [ ∫_{√(y/2)}^{1} (8x + 6y) dx ] dy
      = ∫_0^2 [ 4x² + 6xy ]_{x=√(y/2)}^{x=1} dy
      = ∫_0^2 ( 4 + 6y − (2y + (6/√2) y√y) ) dy
      = ∫_0^2 ( 4 + 4y − 3√2 y^{3/2} ) dy
      = [ 4y + 2y² − (6√2/5) y^{5/2} ]_0^2 = 8 + 8 − (6√2/5)·2^{5/2} = 16 − 48/5 = 32/5 = 6.4

Figure 4.4. The same region described by x = √(y/2) for 0 ≤ y ≤ 2.

◀
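The value 32/5 can also be approached numerically by summing f(x, y) Δx Δy over small grid cells whose centers satisfy 0 ≤ y ≤ 2x² — a direct sketch of a double integral over a non-rectangular region (an illustrative addition):

```python
# Example 272 check: approximate ∬_R (8x + 6y) dA over
# R = {(x, y) : 0 ≤ x ≤ 1, 0 ≤ y ≤ 2x²} by summing over grid cells of the
# enclosing rectangle [0, 1] × [0, 2] whose centers lie inside R.

def region_sum(f, inside, a, b, c, d, n=1000):
    """Riemann-sum approximation over the part of [a,b] × [c,d] inside a region."""
    hx, hy = (b - a) / n, (d - c) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * hx
        for j in range(n):
            y = c + (j + 0.5) * hy
            if inside(x, y):
                total += f(x, y)
    return total * hx * hy

V = region_sum(lambda x, y: 8 * x + 6 * y,
               lambda x, y: y <= 2 * x * x,
               0.0, 1.0, 0.0, 2.0)
print(V)  # ≈ 6.4
```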


273 Example
Find the volume V of the solid bounded by the three coordinate planes and the plane 2x + y + 4z = 4.

Figure 4.5. (a) The solid bounded by the coordinate planes and the plane 2x + y + 4z = 4; (b) the region R in the xy-plane bounded by the line y = −2x + 4.

Solution: ▶ The solid is shown in Figure 4.5(a) with a typical vertical slice. The volume V is given by ∬_R f(x, y) dA, where f(x, y) = z = (1/4)(4 − 2x − y) and the region R, shown in Figure 4.5(b), is R = {(x, y) : 0 ≤ x ≤ 2, 0 ≤ y ≤ −2x + 4}. Using vertical slices in R gives

    V = ∬_R (1/4)(4 − 2x − y) dA
      = ∫_0^2 [ ∫_0^{−2x+4} (1/4)(4 − 2x − y) dy ] dx
      = ∫_0^2 [ −(1/8)(4 − 2x − y)² ]_{y=0}^{y=−2x+4} dx
      = ∫_0^2 (1/8)(4 − 2x)² dx
      = [ −(1/48)(4 − 2x)³ ]_0^2 = 64/48 = 4/3

◀

For a general region R, which may not be one of the types of regions we have considered so far, the double integral ∬_R f(x, y) dA is defined as follows. Assume that f(x, y) is a nonnegative real-valued function and that R is a bounded region in R², so it can be enclosed in some rectangle [a, b] × [c, d]. Then divide that rectangle into a grid of subrectangles. Only consider the subrectangles that are enclosed completely within the region R, as shown by the shaded subrectangles in Figure 4.6(a). In any such subrectangle [x_i, x_{i+1}] × [y_j, y_{j+1}], pick a point (x_{i∗}, y_{j∗}). Then the volume under the surface z = f(x, y) over that subrectangle is approximately f(x_{i∗}, y_{j∗}) Δx_i Δy_j, where Δx_i = x_{i+1} − x_i, Δy_j = y_{j+1} − y_j, and f(x_{i∗}, y_{j∗}) is the height and Δx_i Δy_j is the base area of a parallelepiped, as shown in Figure 4.6(b). Then the total volume under the surface is approximately the sum of the volumes of all such parallelepipeds, namely

    ∑_j ∑_i f(x_{i∗}, y_{j∗}) Δx_i Δy_j ,    (4.7)

where the summation occurs over the indices of the subrectangles inside R. If we take smaller and
smaller subrectangles, so that the length of the largest diagonal of the subrectangles goes to 0, then
the subrectangles begin to fill more and more of the region R, and so the above sum approaches the
actual volume under the surface z = f(x, y) over the region R. We then define ∬_R f(x, y) dA as the
limit of that double summation (the limit is taken over all subdivisions of the rectangle [a, b] × [c, d]
as the largest diagonal of the subrectangles goes to 0).
Figure 4.6. Double integral over a general region R: (a) subrectangles inside the region R; (b) the parallelepiped over a subrectangle, with volume f(x_{i∗}, y_{j∗}) Δx_i Δy_j.

A similar definition can be made for a function f(x, y) that is not necessarily always nonnegative: just replace each mention of volume by the negative volume in the description above when f(x, y) < 0. In the case of a region of the type shown in Figure 4.2, using the definition of the Riemann integral from single-variable calculus, our definition of ∬_R f(x, y) dA reduces to a sequence of two iterated integrals.
Finally, the region R does not have to be bounded. We can evaluate improper double integrals
(i.e. over an unbounded region, or over a region which contains points where the function f (x, y)
is not defined) as a sequence of iterated improper single-variable integrals.
274 Example
Evaluate ∫_1^∞ ∫_0^{1/x²} 2y dy dx.

Solution: ▶

    ∫_1^∞ ∫_0^{1/x²} 2y dy dx = ∫_1^∞ [ y² ]_{y=0}^{y=1/x²} dx
      = ∫_1^∞ x^{−4} dx = [ −(1/3)x^{−3} ]_1^∞ = 0 − (−1/3) = 1/3

◀
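Numerically, the improper integral can be seen converging by cutting the unbounded region off at a large x = X (an illustrative addition; the inner integral ∫_0^{1/x²} 2y dy = 1/x⁴ is exact):

```python
# Example 274: the inner integral is ∫₀^{1/x²} 2y dy = 1/x⁴, so the improper
# integral equals ∫₁^∞ x⁻⁴ dx = 1/3. Truncate at a large cutoff X and watch
# the value settle toward 1/3.

def outer_tail(X, n=100000):
    """Midpoint approximation of the integral of x⁻⁴ over [1, X]."""
    h = (X - 1.0) / n
    return h * sum((1.0 + (k + 0.5) * h) ** -4 for k in range(n))

for X in (10.0, 100.0, 1000.0):
    print(X, outer_tail(X))  # approaches 1/3 as X grows
```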


Exercises
A
For Exercises 1-6, evaluate the given double integral.

1. ∫_0^1 ∫_x^1 24x²y dy dx
2. ∫_0^π ∫_0^y sin x dx dy
3. ∫_1^2 ∫_0^{ln x} 4x dy dx
4. ∫_0^2 ∫_0^{2y} e^{y²} dx dy
5. ∫_0^{π/2} ∫_0^y cos x sin y dx dy
6. ∫_0^∞ ∫_0^∞ xy e^{−(x²+y²)} dx dy
7. ∫_0^2 ∫_0^y 1 dx dy
8. ∫_0^1 ∫_0^{x²} 2 dy dx

9. Find the volume V of the solid bounded by the three coordinate planes and the plane x + y + z = 1.

10. Find the volume V of the solid bounded by the three coordinate planes and the plane 3x + 2y + 5z = 6.

B

11. Explain why the double integral ∬_R 1 dA gives the area of the region R. For simplicity, you can assume that R is a region of the type shown in Figure 4.2(a).

12. Prove that the volume of a tetrahedron with mutually perpendicular adjacent sides of lengths a, b, and c, as in Figure 4.7, is abc/6. (Hint: Mimic Example 273, and recall from Section 1.5 how three noncollinear points determine a plane.)

Figure 4.7. A tetrahedron with mutually perpendicular adjacent sides a, b, c.

13. Show how Exercise 12 can be used to solve Exercise 10.

4.4. Triple Integrals


Our definition of a double integral of a real-valued function f (x, y) over a region R in R2 can be
extended to define a triple integral of a real-valued function f (x, y, z) over a solid S in R3 . We simply
proceed as before: the solid S can be enclosed in some rectangular parallelepiped, which is then
divided into subparallelepipeds. In each subparallelepiped inside S, with sides of lengths ∆x, ∆y

and Δz, pick a point (x∗, y∗, z∗). Then define the triple integral of f(x, y, z) over S, denoted by ∭_S f(x, y, z) dV, by

    ∭_S f(x, y, z) dV = lim ∑∑∑ f(x∗, y∗, z∗) Δx Δy Δz ,    (4.8)

where the limit is over all divisions of the rectangular parallelepiped enclosing S into subparallelepipeds whose largest diagonal goes to 0, and the triple summation is over all the subparallelepipeds inside S. It can be shown that this limit does not depend on the choice of the rectangular parallelepiped enclosing S. The symbol dV is often called the volume element.
Physically, what does the triple integral represent? We saw that a double integral could be thought
of as the volume under a two-dimensional surface. It turns out that the triple integral simply gen-
eralizes this idea: it can be thought of as representing the hypervolume under a three-dimensional
hypersurface w = f (x, y, z) whose graph lies in R4 . In general, the word “volume” is often used as
a general term to signify the same concept for any n-dimensional object (e.g. length in R1 , area in
R2 ). It may be hard to get a grasp on the concept of the “volume” of a four-dimensional object, but
at least we now know how to calculate that volume!
In the case where S is a rectangular parallelepiped [x₁, x₂] × [y₁, y₂] × [z₁, z₂], that is, S = {(x, y, z) : x₁ ≤ x ≤ x₂, y₁ ≤ y ≤ y₂, z₁ ≤ z ≤ z₂}, the triple integral is a sequence of three iterated integrals, namely

    ∭_S f(x, y, z) dV = ∫_{z₁}^{z₂} ∫_{y₁}^{y₂} ∫_{x₁}^{x₂} f(x, y, z) dx dy dz ,    (4.9)

where the order of integration does not matter. This is the simplest case.
A more complicated case is where S is a solid which is bounded below by a surface z = g1 (x, y),
bounded above by a surface z = g2 (x, y), y is bounded between two curves h1 (x) and h2 (x), and
x varies between a and b. Then
    ∭_S f(x, y, z) dV = ∫_a^b ∫_{h₁(x)}^{h₂(x)} ∫_{g₁(x,y)}^{g₂(x,y)} f(x, y, z) dz dy dx .    (4.10)

Notice in this case that the first iterated integral will result in a function of x and y (since its limits
of integration are functions of x and y), which then leaves you with a double integral of a type that
we learned how to evaluate in Section 4.3. There are, of course, many variations on this case (for
example, changing the roles of the variables x, y, z), so as you can probably tell, triple integrals can
be quite tricky. At this point, just learning how to evaluate a triple integral, regardless of what it
represents, is the most important thing. We will see some other ways in which triple integrals are
used later in the text.

275 Example
Evaluate ∫_0^3 ∫_0^2 ∫_0^1 (xy + z) dx dy dz.

Solution: ▶

    ∫_0^3 ∫_0^2 ∫_0^1 (xy + z) dx dy dz = ∫_0^3 ∫_0^2 [ (1/2)x²y + xz ]_{x=0}^{x=1} dy dz
      = ∫_0^3 ∫_0^2 ( (1/2)y + z ) dy dz
      = ∫_0^3 [ (1/4)y² + yz ]_{y=0}^{y=2} dz
      = ∫_0^3 (1 + 2z) dz
      = [ z + z² ]_0^3 = 12

◀

276 Example
Evaluate ∫_0^1 ∫_0^{1−x} ∫_0^{2−x−y} (x + y + z) dz dy dx.

Solution: ▶

    ∫_0^1 ∫_0^{1−x} ∫_0^{2−x−y} (x + y + z) dz dy dx = ∫_0^1 ∫_0^{1−x} [ (x + y)z + (1/2)z² ]_{z=0}^{z=2−x−y} dy dx
      = ∫_0^1 ∫_0^{1−x} ( (x + y)(2 − x − y) + (1/2)(2 − x − y)² ) dy dx
      = ∫_0^1 ∫_0^{1−x} ( 2 − (1/2)x² − xy − (1/2)y² ) dy dx
      = ∫_0^1 [ 2y − (1/2)x²y − (1/2)xy² − (1/6)y³ ]_{y=0}^{y=1−x} dx
      = ∫_0^1 ( 11/6 − 2x + (1/6)x³ ) dx
      = [ (11/6)x − x² + (1/24)x⁴ ]_0^1 = 7/8

◀

Note that the volume V of a solid in R³ is given by


    V = ∭_S 1 dV .    (4.11)
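A triple midpoint sum that respects the variable limits gives a numerical sketch of Example 276 (exact value 7/8); the same pattern works for the volume formula (4.11). This is an illustrative addition to the text:

```python
# Approximate ∫₀¹ ∫₀^{1−x} ∫₀^{2−x−y} (x + y + z) dz dy dx (Example 276)
# with nested midpoint sums whose inner step sizes depend on the outer
# variables, exactly mirroring the iterated integral.

def triple_iterated(n=60):
    total = 0.0
    hx = 1.0 / n
    for i in range(n):
        x = (i + 0.5) * hx
        hy = (1.0 - x) / n          # y ranges over [0, 1 − x]
        for j in range(n):
            y = (j + 0.5) * hy
            hz = (2.0 - x - y) / n  # z ranges over [0, 2 − x − y]
            for k in range(n):
                z = (k + 0.5) * hz
                total += (x + y + z) * hx * hy * hz
    return total

print(triple_iterated())  # ≈ 0.875
```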

Since the function being integrated is the constant 1, the above triple integral reduces to a
double integral of the types that we considered in the previous section if the solid is bounded above
by some surface z = f (x, y) and bounded below by the xy-plane z = 0. There are many other
possibilities. For example, the solid could be bounded below and above by surfaces z = g₁(x, y) and z = g₂(x, y), respectively, with y bounded between two curves h₁(x) and h₂(x), and x varying between a and b. Then

    V = ∭_S 1 dV = ∫_a^b ∫_{h₁(x)}^{h₂(x)} ∫_{g₁(x,y)}^{g₂(x,y)} 1 dz dy dx = ∫_a^b ∫_{h₁(x)}^{h₂(x)} ( g₂(x, y) − g₁(x, y) ) dy dx

just like in equation (4.10). See Exercise 10 for an example.


Exercises
A
For Exercises 1-8, evaluate the given triple integral.

1. ∫_0^3 ∫_0^2 ∫_0^1 xyz dx dy dz
2. ∫_0^1 ∫_0^x ∫_0^y xyz dz dy dx
3. ∫_0^π ∫_0^x ∫_0^{xy} x² sin z dz dy dx
4. ∫_0^1 ∫_0^z ∫_0^y z e^{y²} dx dy dz
5. ∫_1^e ∫_0^y ∫_0^{1/y} x²z dx dz dy
6. ∫_1^2 ∫_0^{y²} ∫_0^{z²} yz dx dz dy
7. ∫_1^2 ∫_2^4 ∫_0^3 1 dx dy dz
8. ∫_0^1 ∫_0^{1−x} ∫_0^{1−x−y} 1 dz dy dx

9. Let M be a constant. Show that ∫_{z₁}^{z₂} ∫_{y₁}^{y₂} ∫_{x₁}^{x₂} M dx dy dz = M(z₂ − z₁)(y₂ − y₁)(x₂ − x₁).

B

10. Find the volume V of the solid S bounded by the three coordinate planes, bounded above by the plane x + y + z = 2, and bounded below by the plane z = x + y.

C

11. Show that ∫_a^b ∫_a^z ∫_a^y f(x) dx dy dz = ∫_a^b ((b − x)²/2) f(x) dx. (Hint: Think of how changing the order of integration in the triple integral changes the limits of integration.)

4.5. Change of Variables in Multiple Integrals


Given the difficulty of evaluating multiple integrals, the reader may be wondering if it is possible to
simplify those integrals using a suitable substitution for the variables. The answer is yes, though it is
a bit more complicated than the substitution method which you learned in single-variable calculus.
Recall that if you are given, for example, the definite integral

    ∫_1^2 x³ √(x² − 1) dx ,

then you would make the substitution

    u = x² − 1  ⇒  x² = u + 1
    du = 2x dx

which changes the limits of integration

    x = 1 ⇒ u = 0
    x = 2 ⇒ u = 3

118
4.5. Change of Variables in Multiple Integrals
so that we get
ˆ 2 √ ˆ 2 √
x 3
x2 − 1 dx = 1 2
2x · 2x x2 − 1 dx
1 1
ˆ 3
1

= 2 (u + 1) u du
0
ˆ 3( )
= 1
2 u3/2 + u1/2 du , which can be easily integrated to give
0

14 3
= 5 .
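Both sides of this substitution can be checked numerically (an illustrative addition); each comes out to 14√3/5 ≈ 4.8497:

```python
import math

# u-substitution check: ∫₁² x³ √(x² − 1) dx  vs.  (1/2) ∫₀³ (u + 1)√u du.

def midpoint(g, a, b, n=200000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (k + 0.5) * h) for k in range(n))

lhs = midpoint(lambda x: x**3 * math.sqrt(x * x - 1.0), 1.0, 2.0)
rhs = 0.5 * midpoint(lambda u: (u + 1.0) * math.sqrt(u), 0.0, 3.0)
exact = 14.0 * math.sqrt(3.0) / 5.0
print(lhs, rhs, exact)  # all ≈ 4.8497
```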

Let us take a different look at what happened when we did that substitution, which will give some motivation for how substitution works in multiple integrals. First, we let u = x² − 1. On the interval of integration [1, 2], the function x ↦ x² − 1 is strictly increasing (and maps [1, 2] onto [0, 3]) and hence has an inverse function (defined on the interval [0, 3]). That is, on [0, 3] we can define x as a function of u, namely

    x = g(u) = √(u + 1) .

Then substituting that expression for x into the function f(x) = x³ √(x² − 1) gives

    f(x) = f(g(u)) = (u + 1)^{3/2} √u ,

and we see that

    dx/du = g′(u)  ⇒  dx = g′(u) du
    dx = (1/2)(u + 1)^{−1/2} du ,

so since

    g(0) = 1 ⇒ 0 = g⁻¹(1)
    g(3) = 2 ⇒ 3 = g⁻¹(2)

then performing the substitution as we did earlier gives

    ∫_1^2 f(x) dx = ∫_1^2 x³ √(x² − 1) dx
      = (1/2) ∫_0^3 (u + 1)√u du , which can be written as
      = ∫_0^3 (u + 1)^{3/2} √u · (1/2)(u + 1)^{−1/2} du , which means

    ∫_1^2 f(x) dx = ∫_{g⁻¹(1)}^{g⁻¹(2)} f(g(u)) g′(u) du .

In general, if x = g(u) is a one-to-one, differentiable function from an interval [c, d] (which


you can think of as being on the “u-axis”) onto an interval [a, b] (on the x-axis), which means that

g′(u) ≠ 0 on the interval (c, d), so that a = g(c) and b = g(d), then c = g⁻¹(a) and d = g⁻¹(b), and

    ∫_a^b f(x) dx = ∫_{g⁻¹(a)}^{g⁻¹(b)} f(g(u)) g′(u) du .    (4.12)

This is called the change of variable formula for integrals of single-variable functions, and it is what
you were implicitly using when doing integration by substitution. This formula turns out to be a
special case of a more general formula which can be used to evaluate multiple integrals. We will
state the formulas for double and triple integrals involving real-valued functions of two and three
variables, respectively. We will assume that all the functions involved are continuously differen-
tiable and that the regions and solids involved all have “reasonable” boundaries. The proof of the
following theorem is beyond the scope of the text.

277 Theorem (Change of Variables Formula for Multiple Integrals)
Let x = x(u, v) and y = y(u, v) define a one-to-one mapping of a region R′ in the uv-plane onto a region R in the xy-plane such that the determinant

    J(u, v) = | ∂x/∂u  ∂x/∂v |
              | ∂y/∂u  ∂y/∂v |    (4.13)

is never 0 in R′. Then

    ∬_R f(x, y) dA(x, y) = ∬_{R′} f(x(u, v), y(u, v)) |J(u, v)| dA(u, v) .    (4.14)

We use the notation dA(x, y) and dA(u, v) to denote the area element in the (x, y) and (u, v) coordinates, respectively.

Similarly, if x = x(u, v, w), y = y(u, v, w) and z = z(u, v, w) define a one-to-one mapping of a solid S′ in uvw-space onto a solid S in xyz-space such that the determinant

    J(u, v, w) = | ∂x/∂u  ∂x/∂v  ∂x/∂w |
                 | ∂y/∂u  ∂y/∂v  ∂y/∂w |
                 | ∂z/∂u  ∂z/∂v  ∂z/∂w |    (4.15)

is never 0 in S′, then

    ∭_S f(x, y, z) dV(x, y, z) = ∭_{S′} f(x(u, v, w), y(u, v, w), z(u, v, w)) |J(u, v, w)| dV(u, v, w) .    (4.16)

The determinant J(u, v) in formula (4.13) is called the Jacobian of x and y with respect to u and v, and is sometimes written as

    J(u, v) = ∂(x, y)/∂(u, v) .    (4.17)

Similarly, the Jacobian J(u, v, w) of three variables is sometimes written as

    J(u, v, w) = ∂(x, y, z)/∂(u, v, w) .    (4.18)

Notice that formula (4.14) is saying that dA(x, y) = |J(u, v)| dA(u, v), which you can think of as a two-variable version of the relation dx = g′(u) du in the single-variable case.
The following example shows how the change of variables formula is used.
278 Example
Evaluate ∬_R e^{(x−y)/(x+y)} dA, where R = {(x, y) : x ≥ 0, y ≥ 0, x + y ≤ 1}.

Solution: ▶ First, note that evaluating this double integral without using substitution is probably impossible, at least in a closed form. By looking at the numerator and denominator of the exponent of e, we will try the substitution u = x − y and v = x + y. To use the change of variables formula (4.14), we need to write both x and y in terms of u and v. Solving for x and y gives x = (1/2)(u + v) and y = (1/2)(v − u). In Figure 4.8 below, we see how the mapping x = x(u, v) = (1/2)(u + v), y = y(u, v) = (1/2)(v − u) maps the region R′ onto R in a one-to-one manner.
Figure 4.8. The regions R and R′: the mapping takes the triangle R′ in the uv-plane (bounded by u = −v, u = v and v = 1) onto the triangle R in the xy-plane (bounded by x + y = 1 and the coordinate axes).

Now we see that

    J(u, v) = |  1/2  1/2 | = (1/2)(1/2) − (1/2)(−1/2) = 1/2  ⇒  |J(u, v)| = |1/2| = 1/2 ,
              | −1/2  1/2 |

so using horizontal slices in R′, we have

    ∬_R e^{(x−y)/(x+y)} dA = ∬_{R′} f(x(u, v), y(u, v)) |J(u, v)| dA
      = ∫_0^1 ∫_{−v}^{v} e^{u/v} (1/2) du dv
      = ∫_0^1 [ (v/2) e^{u/v} ]_{u=−v}^{u=v} dv
      = ∫_0^1 (v/2)(e − e⁻¹) dv
      = [ (v²/4)(e − e⁻¹) ]_0^1 = (1/4)(e − 1/e) = (e² − 1)/(4e)
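As an independent check (an addition to the text), the integral can also be attacked head-on, without the change of variables, by summing over small grid cells of the triangle; the result agrees with (e² − 1)/(4e) ≈ 0.5876:

```python
import math

# Example 278, direct check: sum e^{(x−y)/(x+y)} over small cells whose
# centers lie in the triangle x ≥ 0, y ≥ 0, x + y ≤ 1.

def triangle_sum(n=800):
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        for j in range(n):
            y = (j + 0.5) * h
            if x + y <= 1.0:
                total += math.exp((x - y) / (x + y))
    return total * h * h

exact = (math.e ** 2 - 1.0) / (4.0 * math.e)
print(triangle_sum(), exact)  # both ≈ 0.5876
```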
◀

The change of variables formula can be used to evaluate double integrals in polar coordinates. Letting

    x = x(r, θ) = r cos θ  and  y = y(r, θ) = r sin θ ,

we have

    J(r, θ) = | ∂x/∂r  ∂x/∂θ | = | cos θ  −r sin θ | = r cos²θ + r sin²θ = r  ⇒  |J(r, θ)| = |r| = r ,
              | ∂y/∂r  ∂y/∂θ |   | sin θ   r cos θ |

so we have the following formula:

Double Integral in Polar Coordinates

    ∬_R f(x, y) dx dy = ∬_{R′} f(r cos θ, r sin θ) r dr dθ ,    (4.19)

where the mapping x = r cos θ, y = r sin θ maps the region R′ in the rθ-plane onto the region R in the xy-plane in a one-to-one manner.

279 Example
Find the volume V inside the paraboloid z = x² + y² for 0 ≤ z ≤ 1.

Solution: ▶ Using vertical slices, we see that

    V = ∬_R (1 − z) dA = ∬_R ( 1 − (x² + y²) ) dA ,

where R = {(x, y) : x² + y² ≤ 1} is the unit disk in R² (see Figure 4.9). In polar coordinates (r, θ) we know that x² + y² = r² and that the unit disk R is the set R′ = {(r, θ) : 0 ≤ r ≤ 1, 0 ≤ θ ≤ 2π}. Thus,

    V = ∫_0^{2π} ∫_0^1 (1 − r²) r dr dθ
      = ∫_0^{2π} ∫_0^1 (r − r³) dr dθ
      = ∫_0^{2π} [ r²/2 − r⁴/4 ]_{r=0}^{r=1} dθ
      = ∫_0^{2π} (1/4) dθ
      = π/2

◀

Figure 4.9. The paraboloid z = x² + y² over the unit disk x² + y² ≤ 1.
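A quick numerical rendering of the polar computation (an illustrative addition): the integrand is independent of θ, so the double integral collapses to 2π ∫₀¹ (r − r³) dr; note the Jacobian factor r.

```python
import math

# Example 279 in polar coordinates: V = ∫₀^{2π} ∫₀¹ (1 − r²) r dr dθ
#                                     = 2π ∫₀¹ (r − r³) dr = π/2.

def polar_volume(n=100000):
    h = 1.0 / n
    return 2.0 * math.pi * h * sum(
        r - r ** 3 for r in ((k + 0.5) * h for k in range(n))
    )

print(polar_volume(), math.pi / 2)  # both ≈ 1.5708
```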
280 Example
Find the volume V inside the cone z = √(x² + y²) for 0 ≤ z ≤ 1.

Solution: ▶ Using vertical slices, we see that

    V = ∬_R (1 − z) dA = ∬_R ( 1 − √(x² + y²) ) dA ,

where R = {(x, y) : x² + y² ≤ 1} is the unit disk in R² (see Figure 4.10). In polar coordinates (r, θ) we know that √(x² + y²) = r and that the unit disk R is the set R′ = {(r, θ) : 0 ≤ r ≤ 1, 0 ≤ θ ≤ 2π}. Thus,

    V = ∫_0^{2π} ∫_0^1 (1 − r) r dr dθ
      = ∫_0^{2π} ∫_0^1 (r − r²) dr dθ
      = ∫_0^{2π} [ r²/2 − r³/3 ]_{r=0}^{r=1} dθ
      = ∫_0^{2π} (1/6) dθ
      = π/3

◀

Figure 4.10. The cone z = √(x² + y²) over the unit disk x² + y² ≤ 1.

In a similar fashion, it can be shown (see Exercises 5-6) that triple integrals in cylindrical and spherical coordinates take the following forms:

Triple Integral in Cylindrical Coordinates

    ∭_S f(x, y, z) dx dy dz = ∭_{S′} f(r cos θ, r sin θ, z) r dr dθ dz ,    (4.20)

where the mapping x = r cos θ, y = r sin θ, z = z maps the solid S′ in rθz-space onto the solid S in xyz-space in a one-to-one manner.

Triple Integral in Spherical Coordinates

    ∭_S f(x, y, z) dx dy dz = ∭_{S′} f(ρ sin ϕ cos θ, ρ sin ϕ sin θ, ρ cos ϕ) ρ² sin ϕ dρ dϕ dθ ,    (4.21)

where the mapping x = ρ sin ϕ cos θ, y = ρ sin ϕ sin θ, z = ρ cos ϕ maps the solid S′ in ρϕθ-space onto the solid S in xyz-space in a one-to-one manner.

281 Example
For a > 0, find the volume V inside the sphere S given by x² + y² + z² = a².

Solution: ▶ We see that S is the set ρ = a in spherical coordinates, so

    V = ∭_S 1 dV = ∫_0^{2π} ∫_0^π ∫_0^a ρ² sin ϕ dρ dϕ dθ
      = ∫_0^{2π} ∫_0^π [ ρ³/3 ]_{ρ=0}^{ρ=a} sin ϕ dϕ dθ = ∫_0^{2π} ∫_0^π (a³/3) sin ϕ dϕ dθ
      = ∫_0^{2π} (a³/3) [ −cos ϕ ]_{ϕ=0}^{ϕ=π} dθ = ∫_0^{2π} (2a³/3) dθ = 4πa³/3 .

◀
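The spherical-coordinate computation can be sketched numerically as well (an illustrative addition); the θ-integral just contributes a factor 2π:

```python
import math

# Example 281: volume of the ball of radius a in spherical coordinates,
# ∫₀^{2π} ∫₀^π ∫₀^a ρ² sin ϕ dρ dϕ dθ = 4πa³/3.  (Jacobian: ρ² sin ϕ.)

def ball_volume(a, n=400):
    h_rho, h_phi = a / n, math.pi / n
    total = 0.0
    for i in range(n):
        rho = (i + 0.5) * h_rho
        for j in range(n):
            phi = (j + 0.5) * h_phi
            total += rho ** 2 * math.sin(phi)
    return total * h_rho * h_phi * 2.0 * math.pi  # θ contributes a factor 2π

a = 2.0
print(ball_volume(a), 4.0 * math.pi * a ** 3 / 3.0)  # both ≈ 33.510
```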

Exercises
A

1. Find the volume V inside the paraboloid z = x² + y² for 0 ≤ z ≤ 4.

2. Find the volume V inside the cone z = √(x² + y²) for 0 ≤ z ≤ 3.

3. Find the volume V of the solid inside both x² + y² + z² = 4 and x² + y² = 1.

4. Find the volume V inside both the sphere x² + y² + z² = 1 and the cone z = √(x² + y²).

5. Prove formula (4.20).

6. Prove formula (4.21).

7. Evaluate ∬_R sin((x + y)/2) cos((x − y)/2) dA, where R is the triangle with vertices (0, 0), (2, 0) and (1, 1). (Hint: Use the change of variables u = (x + y)/2, v = (x − y)/2.)

8. Find the volume of the solid bounded by z = x² + y² and z² = 4(x² + y²).

9. Find the volume inside the elliptic cylinder x²/a² + y²/b² = 1 for 0 ≤ z ≤ 2.

C

10. Show that the volume inside the ellipsoid x²/a² + y²/b² + z²/c² = 1 is 4πabc/3. (Hint: Use the change of variables x = au, y = bv, z = cw, then consider Example 281.)

11. Show that the Beta function, defined by

    B(x, y) = ∫_0^1 t^{x−1} (1 − t)^{y−1} dt , for x > 0, y > 0,

satisfies the relation B(y, x) = B(x, y) for x > 0, y > 0.

12. Using the substitution t = u/(u + 1), show that the Beta function can be written as

    B(x, y) = ∫_0^∞ u^{x−1}/(u + 1)^{x+y} du , for x > 0, y > 0.


4.6. Application: Center of Mass


Recall from single-variable calculus that for a region R = {(x, y) : a ≤ x ≤ b, 0 ≤ y ≤ f(x)} in R² that represents a thin, flat plate (see Figure 4.11), where f(x) is a continuous function on [a, b], the center of mass of R has coordinates (x̄, ȳ) given by

    x̄ = M_y / M  and  ȳ = M_x / M ,

where

    M_x = ∫_a^b (f(x))²/2 dx ,   M_y = ∫_a^b x f(x) dx ,   M = ∫_a^b f(x) dx ,    (4.22)

Figure 4.11. Center of mass of R.

assuming that R has uniform density, i.e. the mass of R is uniformly distributed over the region. In this case the area M of the region is considered the mass of R (the density is constant, and taken as 1 for simplicity).
In the general case where the density of a region (or lamina) R is a continuous function δ = δ(x, y) of the coordinates (x, y) of points inside R (where R can be any region in R²), the coordinates (x̄, ȳ) of the center of mass of R are given by

    x̄ = M_y / M  and  ȳ = M_x / M ,    (4.23)

where

    M_y = ∬_R x δ(x, y) dA ,   M_x = ∬_R y δ(x, y) dA ,   M = ∬_R δ(x, y) dA .    (4.24)

The quantities M_x and M_y are called the moments (or first moments) of the region R about the x-axis and y-axis, respectively. The quantity M is the mass of the region R. To see this, think of taking a small rectangle inside R with dimensions Δx and Δy close to 0. The mass of that rectangle is approximately δ(x∗, y∗) Δx Δy, for some point (x∗, y∗) in that rectangle. Then the mass of R is the limit of the sums of the masses of all such rectangles inside R as the diagonals of the rectangles approach 0, which is the double integral ∬_R δ(x, y) dA.
Note that the formulas in (4.22) represent a special case when δ(x, y) = 1 throughout R in the formulas in (4.24).

282 Example
Find the center of mass of the region R = {(x, y) : 0 ≤ x ≤ 1, 0 ≤ y ≤ 2x2 }, if the density function
at (x, y) is δ(x, y) = x + y.

Solution: ▶ The region R is shown in Figure 4.12. We have

    M = ∬_R δ(x, y) dA
      = ∫_0^1 ∫_0^{2x²} (x + y) dy dx
      = ∫_0^1 [ xy + y²/2 ]_{y=0}^{y=2x²} dx
      = ∫_0^1 (2x³ + 2x⁴) dx
      = [ x⁴/2 + 2x⁵/5 ]_0^1 = 1/2 + 2/5 = 9/10

Figure 4.12. The region R bounded above by y = 2x² for 0 ≤ x ≤ 1.

and

    M_x = ∬_R y δ(x, y) dA = ∫_0^1 ∫_0^{2x²} y(x + y) dy dx
        = ∫_0^1 [ xy²/2 + y³/3 ]_{y=0}^{y=2x²} dx
        = ∫_0^1 ( 2x⁵ + 8x⁶/3 ) dx
        = [ x⁶/3 + 8x⁷/21 ]_0^1 = 1/3 + 8/21 = 5/7 ,

    M_y = ∬_R x δ(x, y) dA = ∫_0^1 ∫_0^{2x²} x(x + y) dy dx
        = ∫_0^1 [ x²y + xy²/2 ]_{y=0}^{y=2x²} dx
        = ∫_0^1 ( 2x⁴ + 2x⁵ ) dx
        = [ 2x⁵/5 + x⁶/3 ]_0^1 = 2/5 + 1/3 = 11/15 ,

so the center of mass (x̄, ȳ) is given by

    x̄ = M_y / M = (11/15)/(9/10) = 22/27 ,   ȳ = M_x / M = (5/7)/(9/10) = 50/63 .

Note how this center of mass is a little further towards the upper corner of the region R than when the density is uniform (it is easy to use the formulas in (4.22) to show that (x̄, ȳ) = (3/4, 3/5) in that case). This makes sense since the density function δ(x, y) = x + y increases as (x, y) approaches that upper corner, where there is quite a bit of area. ◀
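The moments in Example 282 can also be approximated directly from their defining double integrals (an illustrative addition); the ratios converge to (22/27, 50/63) ≈ (0.8148, 0.7937):

```python
# Example 282: center of mass of R = {0 ≤ x ≤ 1, 0 ≤ y ≤ 2x²}
# with density δ(x, y) = x + y, computed from the defining integrals.

def center_of_mass(n=600):
    M = Mx = My = 0.0
    hx = 1.0 / n
    for i in range(n):
        x = (i + 0.5) * hx
        hy = 2.0 * x * x / n        # y ranges over [0, 2x²]
        for j in range(n):
            y = (j + 0.5) * hy
            dm = (x + y) * hx * hy  # mass of one small cell
            M += dm
            Mx += y * dm            # moment about the x-axis
            My += x * dm            # moment about the y-axis
    return My / M, Mx / M

print(center_of_mass())  # ≈ (0.8148, 0.7937)
```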

In the special case where the density function δ(x, y) is a constant function on the region R, the
center of mass (x̄, ȳ) is called the centroid of R.
The formulas for the center of mass of a region in R2 can be generalized to a solid S in R3 . Let
S be a solid with a continuous mass density function δ(x, y, z) at any point (x, y, z) in S. Then the
center of mass of S has coordinates (x̄, ȳ, z̄), where
    x̄ = M_{yz} / M ,   ȳ = M_{xz} / M ,   z̄ = M_{xy} / M ,    (4.25)

where

    M_{yz} = ∭_S x δ(x, y, z) dV ,   M_{xz} = ∭_S y δ(x, y, z) dV ,   M_{xy} = ∭_S z δ(x, y, z) dV ,    (4.26)

    M = ∭_S δ(x, y, z) dV .    (4.27)

In this case, M_{yz}, M_{xz} and M_{xy} are called the moments (or first moments) of S around the yz-plane, xz-plane and xy-plane, respectively. Also, M is the mass of S.

283 Example
Find the center of mass of the solid S = {(x, y, z) : z ≥ 0, x2 + y 2 + z 2 ≤ a2 }, if the density function
at (x, y, z) is δ(x, y, z) = 1.
Solution: ▶ The solid S is just the upper hemisphere inside the sphere of radius a centered at the origin (see Figure 4.13). Since the density function is a constant and S is symmetric about the z-axis, it is clear that x̄ = 0 and ȳ = 0, so we need only find z̄. We have

    M = ∭_S δ(x, y, z) dV = ∭_S 1 dV = Volume(S).

But since the volume of S is half the volume of the sphere of radius a, which we know by Example 281 is 4πa³/3, then M = 2πa³/3. And

    M_{xy} = ∭_S z δ(x, y, z) dV
           = ∭_S z dV , which in spherical coordinates is
           = ∫_0^{2π} ∫_0^{π/2} ∫_0^a (ρ cos ϕ) ρ² sin ϕ dρ dϕ dθ
           = ∫_0^{2π} ∫_0^{π/2} sin ϕ cos ϕ ( ∫_0^a ρ³ dρ ) dϕ dθ
           = ∫_0^{2π} ∫_0^{π/2} (a⁴/4) sin ϕ cos ϕ dϕ dθ
           = ∫_0^{2π} ∫_0^{π/2} (a⁴/8) sin 2ϕ dϕ dθ    (since sin 2ϕ = 2 sin ϕ cos ϕ)
           = ∫_0^{2π} [ −(a⁴/16) cos 2ϕ ]_{ϕ=0}^{ϕ=π/2} dθ
           = ∫_0^{2π} (a⁴/8) dθ
           = πa⁴/4 ,

so

    z̄ = M_{xy} / M = (πa⁴/4) / (2πa³/3) = 3a/8 .

Thus, the center of mass of S is (x̄, ȳ, z̄) = (0, 0, 3a/8). ◀

Figure 4.13. The upper hemisphere x² + y² + z² ≤ a², z ≥ 0, with its center of mass on the z-axis.

Exercises
A
For Exercises 1-5, find the center of mass of the region R with the given
density function δ(x, y).

1. R = {(x, y) : 0 ≤ x ≤ 2, 0 ≤ y ≤ 4}, δ(x, y) = 2y

2. R = {(x, y) : 0 ≤ x ≤ 1, 0 ≤ y ≤ x²}, δ(x, y) = x + y

3. R = {(x, y) : y ≥ 0, x² + y² ≤ a²}, δ(x, y) = 1

4. R = {(x, y) : y ≥ 0, x ≥ 0, 1 ≤ x² + y² ≤ 4}, δ(x, y) = x² + y²

5. R = {(x, y) : y ≥ 0, x² + y² ≤ 1}, δ(x, y) = y

B
For Exercises 6-10, find the center of mass of the solid S with the given density function δ(x, y, z).

6. S = {(x, y, z) : 0 ≤ x ≤ 1, 0 ≤ y ≤ 1, 0 ≤ z ≤ 1}, δ(x, y, z) = xyz

7. S = {(x, y, z) : z ≥ 0, x² + y² + z² ≤ a²}, δ(x, y, z) = x² + y² + z²

8. S = {(x, y, z) : x ≥ 0, y ≥ 0, z ≥ 0, x² + y² + z² ≤ a²}, δ(x, y, z) = 1

9. S = {(x, y, z) : 0 ≤ x ≤ 1, 0 ≤ y ≤ 1, 0 ≤ z ≤ 1}, δ(x, y, z) = x² + y² + z²

10. S = {(x, y, z) : 0 ≤ x ≤ 1, 0 ≤ y ≤ 1, 0 ≤ z ≤ 1 − x − y}, δ(x, y, z) = 1

4.7. Application: Probability and Expected Value


In this section we will briefly discuss some applications of multiple integrals in the field of prob-
ability theory. In particular we will see ways in which multiple integrals can be used to calculate
probabilities and expected values.

Probability
Suppose that you have a standard six-sided (fair) die, and you let a variable X represent the
value rolled. Then the probability of rolling a 3, written as P(X = 3), is 1/6, since there are six sides
on the die and each one is equally likely to be rolled, and hence in particular the 3 has a one out
of six chance of being rolled. Likewise the probability of rolling at most a 3, written as P (X ≤ 3),

is 3/6 = 1/2, since of the six numbers on the die, there are three equally likely numbers (1, 2, and 3)
that are less than or equal to 3. Note that P (X ≤ 3) = P (X = 1) + P (X = 2) + P (X = 3).
We call X a discrete random variable on the sample space (or probability space) Ω consisting of all
possible outcomes. In our case, Ω = {1, 2, 3, 4, 5, 6}. An event A is a subset of the sample space.
For example, in the case of the die, the event X ≤ 3 is the set {1, 2, 3}.
Now let X be a variable representing a random real number in the interval (0, 1). Note that the
set of all real numbers between 0 and 1 is not a discrete (or countable) set of values, i.e. it can not
be put into a one-to-one correspondence with the set of positive integers.2 In this case, for any real
number x in (0, 1), it makes no sense to consider P (X = x) since it must be 0 (why?). Instead, we
consider the probability P (X ≤ x), which is given by P (X ≤ x) = x. The reasoning is this: the
interval (0, 1) has length 1, and for x in (0, 1) the interval (0, x) has length x. So since X represents
a random number in (0, 1), and hence is uniformly distributed over (0, 1), then
    P(X ≤ x) = (length of (0, x)) / (length of (0, 1)) = x/1 = x.
We call X a continuous random variable on the sample space Ω = (0, 1). An event A is a subset of
the sample space. For example, in our case the event X ≤ x is the set (0, x).
In the case of a discrete random variable, we saw how the probability of an event was the sum
of the probabilities of the individual outcomes comprising that event (e.g. P (X ≤ 3) = P (X =
1) + P (X = 2) + P (X = 3) in the die example). For a continuous random variable, the probability
of an event will instead be the integral of a function, which we will now describe.
Let X be a continuous real-valued random variable on a sample space Ω in R. For simplicity, let
Ω = (a, b). Define the distribution function F of X as

F (x) = P (X ≤ x) , for −∞ < x < ∞                    (4.28)

        ⎧ 1,           for x ≥ b
      = ⎨ P (X ≤ x),   for a < x < b                   (4.29)
        ⎩ 0,           for x ≤ a .

Suppose that there is a nonnegative, continuous real-valued function f on R such that

F (x) = ∫_{−∞}^{x} f (y) dy , for −∞ < x < ∞ ,        (4.30)

and

∫_{−∞}^{∞} f (x) dx = 1 .                              (4.31)

Then we call f the probability density function (or p.d.f. for short) for X. We thus have

P (X ≤ x) = ∫_{a}^{x} f (y) dy , for a < x < b .       (4.32)

Also, by the Fundamental Theorem of Calculus, we have

F ′ (x) = f (x) , for −∞ < x < ∞. (4.33)

² For a proof see pp. 9–10 in kam.

4. Multiple Integrals
284 Example
Let X represent a randomly selected real number in the interval (0, 1). We say that X has the uniform distribution on (0, 1), with distribution function

                     ⎧ 1,  for x ≥ 1
F (x) = P (X ≤ x) =  ⎨ x,  for 0 < x < 1               (4.34)
                     ⎩ 0,  for x ≤ 0 ,

and probability density function

                   ⎧ 1,  for 0 < x < 1
f (x) = F ′ (x) =  ⎩ 0,  elsewhere.                    (4.35)

In general, if X represents a randomly selected real number in an interval (a, b), then X has the uniform distribution function

                     ⎧ 1,                for x ≥ b
F (x) = P (X ≤ x) =  ⎨ (x − a)/(b − a),  for a < x < b  (4.36)
                     ⎩ 0,                for x ≤ a ,

and probability density function

                   ⎧ 1/(b − a),  for a < x < b
f (x) = F ′ (x) =  ⎩ 0,          elsewhere.             (4.37)

285 Example
A famous distribution function is given by the standard normal distribution, whose probability density function f is

f (x) = (1/√(2π)) e^{−x²/2} , for −∞ < x < ∞.          (4.38)

This is often called a “bell curve”, and is used widely in statistics. Since we are claiming that f is a p.d.f., we should have

∫_{−∞}^{∞} (1/√(2π)) e^{−x²/2} dx = 1                  (4.39)

by formula (4.31), which is equivalent to

∫_{−∞}^{∞} e^{−x²/2} dx = √(2π) .                      (4.40)

We can use a double integral in polar coordinates to verify this integral. First,

∫_{−∞}^{∞} ∫_{−∞}^{∞} e^{−(x²+y²)/2} dx dy = ∫_{−∞}^{∞} e^{−y²/2} ( ∫_{−∞}^{∞} e^{−x²/2} dx ) dy

                                            = ( ∫_{−∞}^{∞} e^{−x²/2} dx ) ( ∫_{−∞}^{∞} e^{−y²/2} dy )

                                            = ( ∫_{−∞}^{∞} e^{−x²/2} dx )²

since the same function is being integrated twice in the middle equation, just with different variables. But using polar coordinates, we see that

∫_{−∞}^{∞} ∫_{−∞}^{∞} e^{−(x²+y²)/2} dx dy = ∫₀^{2π} ∫₀^{∞} e^{−r²/2} r dr dθ

                                            = ∫₀^{2π} ( −e^{−r²/2} |_{r=0}^{r=∞} ) dθ

                                            = ∫₀^{2π} (0 − (−e⁰)) dθ = ∫₀^{2π} 1 dθ = 2π ,

and so

( ∫_{−∞}^{∞} e^{−x²/2} dx )² = 2π , and hence

∫_{−∞}^{∞} e^{−x²/2} dx = √(2π) .
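The value √(2π) can also be checked numerically; the sketch below approximates the integral with a plain trapezoid sum over a wide truncated range (truncation is harmless because e^{−x²/2} decays extremely fast):

```python
import numpy as np

# Trapezoid-rule approximation of the Gaussian integral; the integrand
# underflows to 0 well before |x| = 40, so truncating the range is harmless.
x = np.linspace(-40.0, 40.0, 800_001)
y = np.exp(-x**2 / 2)
dx = x[1] - x[0]
integral = dx * (y.sum() - 0.5 * (y[0] + y[-1]))

assert abs(integral - np.sqrt(2 * np.pi)) < 1e-6
```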

In addition to individual random variables, we can consider jointly distributed random variables.
For this, we will let X, Y and Z be three real-valued continuous random variables defined on the
same sample space Ω in R (the discussion for two random variables is similar). Then the joint dis-
tribution function F of X, Y and Z is given by

F (x, y, z) = P (X ≤ x, Y ≤ y, Z ≤ z) , for −∞ < x, y, z < ∞. (4.41)

If there is a nonnegative, continuous real-valued function f on R³ such that

F (x, y, z) = ∫_{−∞}^{z} ∫_{−∞}^{y} ∫_{−∞}^{x} f (u, v, w) du dv dw , for −∞ < x, y, z < ∞   (4.42)

and

∫_{−∞}^{∞} ∫_{−∞}^{∞} ∫_{−∞}^{∞} f (x, y, z) dx dy dz = 1 ,                                  (4.43)

then we call f the joint probability density function (or joint p.d.f. for short) for X, Y and Z. In general, for a₁ < b₁ , a₂ < b₂ , a₃ < b₃ , we have

P (a₁ < X ≤ b₁ , a₂ < Y ≤ b₂ , a₃ < Z ≤ b₃ ) = ∫_{a₃}^{b₃} ∫_{a₂}^{b₂} ∫_{a₁}^{b₁} f (x, y, z) dx dy dz ,   (4.44)

with the ≤ and < symbols interchangeable in any combination. A triple integral, then, can be
thought of as representing a probability (for a function f which is a p.d.f.).

286 Example
Let a, b, and c be real numbers selected randomly from the interval (0, 1). What is the probability that
the equation ax2 + bx + c = 0 has at least one real solution x?


Solution: ▶ We know by the quadratic formula that there is at least one real solution if b² − 4ac ≥ 0. So we need to calculate P (b² − 4ac ≥ 0). We will use three jointly distributed random variables to do this. First, since 0 < a, b, c < 1, we have

b² − 4ac ≥ 0 ⇔ 0 < 4ac ≤ b² < 1 ⇔ 0 < 2√a √c ≤ b < 1 ,

where the last relation holds for all 0 < a, c < 1 such that

0 < 4ac < 1 ⇔ 0 < c < 1/(4a) .

Figure 4.14. Region R = R₁ ∪ R₂ (the part of the unit square in the ac-plane lying below the curve c = 1/(4a))
Considering a, b and c as real variables, the region R in the ac-plane where the above relation holds is given by R = {(a, c) : 0 < a < 1, 0 < c < 1, 0 < c < 1/(4a)}, which we can see is a union of two regions R₁ and R₂ , as in Figure 4.14 above.
Now let X, Y and Z be continuous random variables, each representing a randomly selected real number from the interval (0, 1) (think of X, Y and Z representing a, b and c, respectively). Then, similar to how we showed that f (x) = 1 is the p.d.f. of the uniform distribution on (0, 1), it can be shown that f (x, y, z) = 1 for x, y, z in (0, 1) (0 elsewhere) is the joint p.d.f. of X, Y and Z. Now,

P (b² − 4ac ≥ 0) = P ((a, c) ∈ R, 2√a √c ≤ b < 1) ,

so this probability is the triple integral of f (a, b, c) = 1 as b varies from 2√a √c to 1 and as (a, c) varies over the region R. Since R can be divided into two regions R₁ and R₂ , the required triple integral can be split into a sum of two triple integrals, using vertical slices in R:
P (b² − 4ac ≥ 0) = ∫₀^{1/4} ∫₀^{1} ∫_{2√a√c}^{1} 1 db dc da + ∫_{1/4}^{1} ∫₀^{1/(4a)} ∫_{2√a√c}^{1} 1 db dc da

(the first triple integral covers R₁ , the second covers R₂ )

                 = ∫₀^{1/4} ∫₀^{1} (1 − 2√a √c) dc da + ∫_{1/4}^{1} ∫₀^{1/(4a)} (1 − 2√a √c) dc da

                 = ∫₀^{1/4} ( c − (4/3)√a c^{3/2} |_{c=0}^{c=1} ) da + ∫_{1/4}^{1} ( c − (4/3)√a c^{3/2} |_{c=0}^{c=1/(4a)} ) da

                 = ∫₀^{1/4} ( 1 − (4/3)√a ) da + ∫_{1/4}^{1} 1/(12a) da

                 = ( a − (8/9) a^{3/2} ) |₀^{1/4} + (1/12) ln a |_{1/4}^{1}

                 = ( 1/4 − 1/9 ) + ( 0 − (1/12) ln(1/4) ) = 5/36 + (1/12) ln 4 ,

so

P (b² − 4ac ≥ 0) = (5 + 3 ln 4)/36 ≈ 0.2544 .

In other words, the equation ax² + bx + c = 0 has about a 25% chance of having a real solution! ◀
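The answer is easy to corroborate by simulation: draw a, b, c uniformly from (0, 1) many times and count how often b² − 4ac ≥ 0. A quick Monte Carlo sketch in Python:

```python
import random

# Monte Carlo estimate of P(b² − 4ac ≥ 0) with a, b, c uniform on (0, 1);
# the exact answer derived above is (5 + 3 ln 4)/36 ≈ 0.2544.
random.seed(1)
trials = 500_000
hits = 0
for _ in range(trials):
    a, b, c = random.random(), random.random(), random.random()
    if b * b - 4 * a * c >= 0:
        hits += 1

estimate = hits / trials
assert abs(estimate - 0.2544) < 0.005
```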

Expected Value
The expected value EX of a random variable X can be thought of as the “average” value of X as it varies over its sample space. If X is a discrete random variable, then

EX = ∑ₓ x P (X = x) ,                                  (4.45)

with the sum being taken over all elements x of the sample space. For example, if X represents the number rolled on a six-sided die, then

EX = ∑_{x=1}^{6} x P (X = x) = ∑_{x=1}^{6} x · (1/6) = 3.5   (4.46)

is the expected value of X, which is the average of the integers 1 through 6.
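The die computation in (4.46) can be reproduced exactly with rational arithmetic:

```python
from fractions import Fraction

# Expected value of a fair six-sided die via (4.45): EX = Σ x · P(X = x),
# with P(X = x) = 1/6 for x = 1, ..., 6.
EX = sum(Fraction(x, 6) for x in range(1, 7))

print(EX)  # 7/2, i.e. 3.5
```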


If X is a real-valued continuous random variable with p.d.f. f , then

EX = ∫_{−∞}^{∞} x f (x) dx .                           (4.47)

For example, if X has the uniform distribution on the interval (0, 1), then its p.d.f. is

         ⎧ 1,  for 0 < x < 1
f (x) =  ⎩ 0,  elsewhere,                              (4.48)

and so

EX = ∫_{−∞}^{∞} x f (x) dx = ∫₀¹ x dx = 1/2 .          (4.49)

For a pair of jointly distributed, real-valued continuous random variables X and Y with joint p.d.f. f (x, y), the expected values of X and Y are given by

EX = ∫_{−∞}^{∞} ∫_{−∞}^{∞} x f (x, y) dx dy   and   EY = ∫_{−∞}^{∞} ∫_{−∞}^{∞} y f (x, y) dx dy ,   (4.50)

respectively.

287 Example
If you were to pick n > 2 random real numbers from the interval (0, 1), what are the expected values
for the smallest and largest of those numbers?

Solution: ▶ Let U1 , . . . , Un be n continuous random variables, each representing a randomly


selected real number from (0, 1), i.e. each has the uniform distribution on (0, 1). Define random
variables X and Y by

X = min(U1 , . . . , Un ) and Y = max(U1 , . . . , Un ) .

Then it can be shown³ that the joint p.d.f. of X and Y is

            ⎧ n(n − 1)(y − x)^{n−2} ,  for 0 ≤ x ≤ y ≤ 1
f (x, y) =  ⎩ 0,                       elsewhere.          (4.51)

³ See Ch. 6 in [34].

Thus, the expected value of X is

EX = ∫₀¹ ∫ₓ¹ n(n − 1) x (y − x)^{n−2} dy dx

   = ∫₀¹ ( n x (y − x)^{n−1} |_{y=x}^{y=1} ) dx

   = ∫₀¹ n x (1 − x)^{n−1} dx , so integration by parts yields

   = ( −x(1 − x)^n − (1/(n + 1)) (1 − x)^{n+1} ) |₀¹

EX = 1/(n + 1) ,

and similarly (see Exercise 3) it can be shown that

EY = ∫₀¹ ∫₀^{y} n(n − 1) y (y − x)^{n−2} dx dy = n/(n + 1) .
So, for example, if you were to repeatedly take samples of n = 3 random real numbers from (0, 1), and each time store the minimum and maximum values in the sample, then the average of the minimums would approach 1/4 and the average of the maximums would approach 3/4 as the number of samples grows. It would be relatively simple (see Exercise 4) to write a computer program to test this. ◀
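The program that Exercise 4 asks for can be as short as the following sketch, which estimates the averages of the minima and maxima for n = 3 and compares them with 1/4 and 3/4:

```python
import random

# Estimate E[min] and E[max] of n = 3 uniform draws from (0, 1);
# Example 287 predicts 1/(n+1) = 1/4 and n/(n+1) = 3/4.
random.seed(0)
n, trials = 3, 200_000
min_sum = max_sum = 0.0
for _ in range(trials):
    sample = [random.random() for _ in range(n)]
    min_sum += min(sample)
    max_sum += max(sample)

avg_min = min_sum / trials
avg_max = max_sum / trials
assert abs(avg_min - 0.25) < 0.01
assert abs(avg_max - 0.75) < 0.01
```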

Exercises
B

1. Evaluate the integral ∫_{−∞}^{∞} e^{−x²} dx using anything you have learned so far.

2. For σ > 0 and µ > 0, evaluate ∫_{−∞}^{∞} (1/(σ√(2π))) e^{−(x−µ)²/(2σ²)} dx.
3. Show that EY = n/(n + 1) in Example 287.

C
4. Write a computer program (in the language of your choice) that verifies the results in Example
287 for the case n = 3 by taking large numbers of samples.

5. Repeat Exercise 4 for the case when n = 4.

6. For continuous random variables X, Y with joint p.d.f. f (x, y), define the second moments E(X²) and E(Y²) by

E(X²) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} x² f (x, y) dx dy   and   E(Y²) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} y² f (x, y) dx dy ,

and the variances Var(X) and Var(Y ) by

Var(X) = E(X²) − (EX)²   and   Var(Y ) = E(Y²) − (EY )² .

Find Var(X) and Var(Y ) for X and Y as in Example 287.

7. Continuing Exercise 6, the correlation ρ between X and Y is defined as

ρ = ( E(XY ) − (EX)(EY ) ) / √( Var(X) Var(Y ) ) ,

where E(XY ) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} xy f (x, y) dx dy. Find ρ for X and Y as in Example 287.
(Note: The quantity E(XY ) − (EX)(EY ) is called the covariance of X and Y .)

8. In Example 286 would the answer change if the interval (0, 100) is used instead of (0, 1)? Explain.

5. Curves and Surfaces
5.1. Parametric Curves
There are many ways we can describe a curve. We can, say, describe it by an equation that the points on the curve satisfy. For example, a circle can be described by x² + y² = 1. However, this is
not a good way to do so, as it is rather difficult to work with. It is also often difficult to find a closed
form like this for a curve.
Instead, we can imagine the curve to be specified by a particle moving along the path. So it is
represented by a function f : R → Rn , and the curve itself is the image of the function. This is
known as a parametrization of a curve. In addition to simplified notation, this also has the benefit
of giving the curve an orientation.

288 Definition
We say Γ ⊆ Rⁿ is a differentiable curve if there exists a differentiable function γ : I = [a, b] → Rⁿ such that Γ = γ([a, b]).
The function γ is said to be a parametrization of the curve Γ, and the function γ : I = [a, b] → Rⁿ is called a parametric curve.

Sometimes Γ = γ[I] ⊆ Rⁿ is called the image of the parametric curve. We note that a curve in Rⁿ can be the image of several distinct parametric curves.
289 Remark
Usually we will denote the image of the curve and its parametrization by the same letter and we will
talk about the curve γ with parametrization γ(t).

290 Definition
A parametrization γ(t) : I → Rn is regular if γ ′ (t) ̸= 0 for all t ∈ I.

The parametrization provides the curve with an orientation. Since Γ = γ([a, b]), we can think of the curve as the trace of a motion that starts at γ(a) and ends at γ(b).
291 Example
The curve x2 + y 2 = 1 can be parametrized by γ(t) = (cos t, sin t) for t ∈ [0, 2π]


Figure 5.1. Orientation of a Curve

z
(x(t), y(t), z(t))
(x(a), y(a), z(a))
x = x(t) C
y = y(t) r(t) (x(b), y(b), z(b))
z = z(t)
R y
a t b 0
x

Figure 5.2. Parametrization of a curve C in R3

Given a parametric curve γ : I = [a, b] → Rn

■ The curve is said to be simple if γ is injective, i.e. if for all x, y in (a, b), we have γ(x) = γ(y)
implies x = y.

■ If γ(x) = γ(y) for some x ̸= y in (a, b), then γ(x) is called a multiple point of the curve.

■ A curve γ is said to be closed if γ(a) = γ(b).

■ A simple closed curve is a closed curve which does not intersect itself.

Note that any closed curve can be regarded as a union of simple closed curves (think of the loops
in a figure eight)

Figure 5.3. Closed vs non-closed curves: (a) a closed curve, (b) a non-closed curve

292 Theorem (Jordan Curve Theorem)


Let γ be a simple closed curve in the plane R2 . Then its complement, R2 \ γ, consists of exactly
two connected components. One of these components is bounded (the interior) and the other is

unbounded (the exterior), and the curve γ is the boundary of each component.

The Jordan Curve Theorem asserts that every simple closed curve in the plane divides the plane into an “interior” region bounded by the curve and an “exterior” region. While the statement of this theorem is intuitively obvious, its proof is intricate.
293 Example
Find a parametric representation for the curve resulting from the intersection of the plane 3x + y + z = 1
and the cylinder x2 + 2y 2 = 1 in R3 .

Solution: ▶ The projection of the intersection of the plane 3x + y + z = 1 and the cylinder is the ellipse x² + 2y² = 1, on the xy-plane. This ellipse can be parametrized as

x = cos t,   y = (√2/2) sin t,   0 ≤ t ≤ 2π.

From the equation of the plane,

z = 1 − 3x − y = 1 − 3 cos t − (√2/2) sin t.

Thus we may take the parametrization

r(t) = ( x(t), y(t), z(t) ) = ( cos t, (√2/2) sin t, 1 − 3 cos t − (√2/2) sin t ) .
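It is prudent to verify that a proposed parametrization actually satisfies the defining equations; the sketch below spot-checks the parametrization r(t) = (cos t, (√2/2) sin t, 1 − 3 cos t − (√2/2) sin t) against both the cylinder and the plane at many values of t:

```python
import math

# Every point r(t) should lie on the cylinder x² + 2y² = 1
# and on the plane 3x + y + z = 1.
for k in range(360):
    t = 2 * math.pi * k / 360
    x = math.cos(t)
    y = (math.sqrt(2) / 2) * math.sin(t)
    z = 1 - 3 * x - y
    assert abs(x**2 + 2 * y**2 - 1) < 1e-12
    assert abs(3 * x + y + z - 1) < 1e-12
```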

294 Proposition
Let f : R^{n+1} → Rⁿ be differentiable, c ∈ Rⁿ, and let γ = { x ∈ R^{n+1} | f (x) = c } be the level set of f . If at every point in γ the matrix Df has rank n, then γ is a curve.

Proof. Let a ∈ γ. Since rank(D(f)_a) = n, there must be n linearly independent columns in the matrix D(f)_a . For simplicity assume these are the first n ones. The implicit function theorem applies and guarantees that the equation f (x) = c can be solved for x₁ , . . . , xₙ , and each xᵢ can be expressed as a differentiable function of x_{n+1} (close to a). That is, there exist open sets U′ ⊆ Rⁿ, V′ ⊆ R and a differentiable function g such that a ∈ U′ × V′ and γ ∩ (U′ × V′) = { (g(x_{n+1}), x_{n+1}) | x_{n+1} ∈ V′ }. ■

295 Remark
A curve can have many parametrizations. For example, δ(t) = (cos t, sin(−t)) also parametrizes
the unit circle, but runs clockwise instead of counter clockwise. Choosing a parametrization requires
choosing the direction of traversal through the curve.

We can change the parametrization of r by taking an invertible smooth function u ↦ ũ, and obtain a new parametrization r(ũ) = r(ũ(u)). Then by the chain rule,

dr/du = (dr/dũ) · (dũ/du) ,   and hence   dr/dũ = (dr/du) / (dũ/du) .

296 Proposition
Let γ be a regular curve and γ be a parametrization, a = γ(t₀) ∈ γ. Then the tangent line through a is { γ(t₀) + t γ′(t₀) | t ∈ R }.

If we think of γ(t) as the position of a particle at time t, then the above says that the tangent
space is spanned by the velocity of the particle.
That is, the velocity of the particle is always tangent to the curve it traces out. However, the accel-
eration of the particle (defined to be γ ′′ ) need not be tangent to the curve! In fact if the magnitude

of the velocity γ ′ is constant, then the acceleration will be perpendicular to the curve!
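The last claim is a one-line computation: if the speed |γ′| is constant then so is γ′ · γ′, and differentiating gives

```latex
0 = \frac{d}{dt}\bigl(\gamma'(t)\cdot\gamma'(t)\bigr)
  = 2\,\gamma'(t)\cdot\gamma''(t) ,
```

so the acceleration γ′′ is orthogonal to the velocity γ′ at every point.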
So far we have always insisted all curves and parametrizations are differentiable or C 1 . We now
relax this requirement and subsequently only assume that all curves (and parametrizations) are
piecewise differentiable, or piecewise C 1 .

297 Definition
A function f : [a, b] → Rn is called piecewise C 1 if there exists a finite set F ⊆ [a, b] such that f is
C 1 on [a, b] − F , and further both left and right limits of f and f ′ exist at all points in F .

Figure 5.4. Piecewise C¹ function

298 Definition
A (connected) curve γ is piecewise C 1 if it has a parametrization which is continuous and piecewise
C 1.

Figure 5.5. The boundary of a square is a piecewise C¹ curve, but not a differentiable curve.

299 Remark
A piecewise C 1 function need not be continuous. But curves are always assumed to be at least contin-
uous; so for notational convenience, we define a piecewise C 1 curve to be one which has a parametriza-
tion which is both continuous and piecewise C 1 .

5.2. Surfaces
We have seen that a space curve C can be parametrized by a vector function r = r(u) where u
ranges over some interval I of the u-axis. In an analogous manner we can parametrize a surface S
in space by a vector function r = r(u, v) where (u, v) ranges over some region Ω of the uv-plane.

Figure 5.6. Parametrization of a surface S in R³

300 Definition
A parametrized surface is given by a one-to-one transformation r : Ω → Rn , where Ω is a domain
in the plane R2 . The transformation is then given by

r(u, v) = (x1 (u, v), . . . , xn (u, v)).

301 Example
(The graph of a function) The graph of a function

y = f (x), x ∈ [a, b]

can be parametrized by setting

r(u) = ui + f (u)j, u ∈ [a, b].

In the same vein the graph of a function

z = f (x, y), (x, y) ∈ Ω

can be parametrized by setting

r(u, v) = ui + vj + f (u, v)k, (u, v) ∈ Ω.

As (u, v) ranges over Ω, the tip of r(u, v) traces out the graph of f .

302 Example (Plane)
If two vectors a and b are not parallel, then the set of all linear combinations ua + vb generates a plane
p0 that passes through the origin. We can parametrize this plane by setting

r(u, v) = ua + vb, (u, v) ∈ R × R.

The plane p that is parallel to p0 and passes through the tip of c can be parametrized by setting

r(u, v) = ua + vb + c, (u, v) ∈ R × R.

Note that the plane contains the lines

l1 : r(u, 0) = ua + c and l2 : r(0, v) = vb + c.

303 Example (Sphere)
The sphere of radius a centered at the origin can be parametrized by

r(u, v) = a cos u cos v i + a sin u cos v j + a sin v k

with (u, v) ranging over the rectangle R : 0 ≤ u ≤ 2π, −π/2 ≤ v ≤ π/2.
Let us derive this parametrization. The points of latitude v form a circle of radius a cos v on the horizontal plane z = a sin v. This circle can be parametrized by

R(u) = a cos v (cos u i + sin u j) + a sin v k,   u ∈ [0, 2π].

This expands to give

R(u, v) = a cos u cos v i + a sin u cos v j + a sin v k,   u ∈ [0, 2π].

Letting v range from −π/2 to π/2, we obtain the entire sphere. The xyz-equation for this same sphere is x² + y² + z² = a². It is easy to verify that the parametrization satisfies this equation:

x² + y² + z² = a² cos²u cos²v + a² sin²u cos²v + a² sin²v
             = a² (cos²u + sin²u) cos²v + a² sin²v
             = a² (cos²v + sin²v) = a².

304 Example (Cone)


Consider a cone with apex semiangle α and slant height s. The points of slant height v form a circle of radius v sin α on the horizontal plane z = v cos α. This circle can be parametrized by

C(u) = v sin α(cos ui + sin uj) + v cos αk

= v cos u sin αi + v sin u sin αj + v cos αk, u ∈ [0, 2π].

Since we can obtain the entire cone by letting v range from 0 to s, the cone is parametrized by

r(u, v) = v cos u sin αi + v sin u sin αj + v cos αk,

with 0 ≤ u ≤ 2π, 0 ≤ v ≤ s.

305 Example (Spiral Ramp)
A rod of length l initially resting on the x-axis and attached at one end to the z-axis sweeps out a
surface by rotating about the z-axis at constant rate ω while climbing at a constant rate b.
To parametrize this surface we mark the point of the rod at a distance u from the z-axis (0 ≤ u ≤ l)
and ask for the position of this point at time v. At time v the rod will have climbed a distance bv and
rotated through an angle ωv. Thus the point will be found at the tip of the vector

u(cos ωvi + sin ωvj) + bvk = u cos ωv i + u sin ωvj + bvk.

The entire surface can be parametrized by

r(u, v) = u cos ωvi + u sin ωvj + bvk with 0 ≤ u ≤ l, 0 ≤ v.

306 Definition
A regular parametrized surface is a smooth mapping φ : U → Rⁿ, where U is an open subset of R², of maximal rank. This is equivalent to saying that the Jacobian matrix of φ has rank 2 at every point of U.

Let (u, v) be coordinates in R2 , (x1 , . . . , xn ) be coordinates in Rn . Then

φ(u, v) = (x1 (u, v), . . . , xn (u, v)),

where xi (u, v) admit partial derivatives and the Jacobian matrix has rank two.

5.2.1. Implicit Surface


An implicit surface is the set of zeros of a function of three variables, i.e, an implicit surface is a
surface in Euclidean space defined by an equation

F (x, y, z) = 0.

Let F : U → R be a differentiable function. A regular point is a point p ∈ U for which the differential dF_p is surjective.
We say that q is a regular value if, for every point p in F⁻¹(q), p is a regular point.

307 Theorem (Regular Value Theorem)
Let U ⊂ R³ be open and F : U → R be differentiable. If q is a regular value of F then F⁻¹(q) is a regular surface.

308 Example
Show that the circular cylinder x2 + y 2 = 1 is a regular surface.

Solution: ▶ Define the function F (x, y, z) = x² + y² − 1. Then the cylinder is the set F⁻¹(0). Observe that

∂F/∂x = 2x,   ∂F/∂y = 2y,   ∂F/∂z = 0.

These partial derivatives vanish simultaneously if and only if x = y = 0, and since F (0, 0, z) = −1 ̸= 0, no point with x = y = 0 belongs to F⁻¹(0). Hence for every u ∈ F⁻¹(0) the differential dF_u is surjective, i.e. 0 is a regular value of F . By Theorem 307, the circular cylinder is a regular surface. ◀
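Taking F (x, y, z) = x² + y² − 1 as the defining function of the cylinder, the nonvanishing of ∇F along the surface can be spot-checked numerically:

```python
import math

# On the cylinder x² + y² = 1 the gradient of F(x, y, z) = x² + y² − 1 is
# (2x, 2y, 0), which has norm 2 at every point of the surface, so the
# differential dF_p is surjective there.
for k in range(100):
    t = 2 * math.pi * k / 100
    x, y, z = math.cos(t), math.sin(t), k / 7.0  # z is arbitrary
    grad_norm = math.hypot(2 * x, 2 * y, 0.0)
    assert abs(grad_norm - 2.0) < 1e-12
```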


5.3. Classical Examples of Surfaces


In this section we consider various surfaces that we shall periodically encounter in subsequent sec-
tions.
Let us start with the plane. Recall that if a, b, c are real numbers, not all zero, then the Cartesian
equation of a plane with normal vector (a, b, c) and passing through the point (x0 , y0 , z0 ) is

a(x − x0 ) + b(y − y0 ) + c(z − z0 ) = 0.

If we know that the vectors u and v are on the plane (parallel to the plane) then with the parameters
p, q the equation of the plane is
x − x0 = pu1 + qv1 ,
y − y0 = pu2 + qv2 ,
z − z0 = pu3 + qv3 .

309 Definition
A surface S consisting of all lines parallel to a given line ∆ and passing through a given curve γ is
called a cylinder. The line ∆ is called the directrix of the cylinder.

To recognise whether a given surface is a cylinder we look at its Cartesian equation. If it


is of the form f (A, B) = 0, where A, B are secant planes, then the surface is a cylinder.
Under these conditions, the lines generating S will be parallel to the line of equation
A = 0, B = 0. In practice, if one of the variables x, y, or z is missing, then the surface
is a cylinder, whose directrix will be the axis of the missing coordinate.

Figure 5.7. Circular cylinder x² + y² = 1.

Figure 5.8. The parabolic cylinder z = y².

310 Example
Figure 5.7 shews the cylinder with Cartesian equation x2 +y 2 = 1. One starts with the circle x2 +y 2 =
1 on the xy-plane and moves it up and down the z-axis. A parametrization for this cylinder is the
following:
x = cos v, y = sin v, z = u, u ∈ R, v ∈ [0; 2π].

311 Example
Figure 5.8 shews the parabolic cylinder with Cartesian equation z = y 2 . One starts with the parabola
z = y 2 on the yz-plane and moves it up and down the x-axis. A parametrization for this parabolic
cylinder is the following:

x = u, y = v, z = v2, u ∈ R, v ∈ R.
312 Example
Figure 5.9 shews the hyperbolic cylinder with Cartesian equation x² − y² = 1. One starts with the hyperbola x² − y² = 1 on the xy-plane and moves it up and down the z-axis. A parametrization for this hyperbolic cylinder is the following:

x = ± cosh v,   y = sinh v,   z = u,   u ∈ R, v ∈ R.

We need a choice of sign for each of the portions. We have used the fact that cosh2 v − sinh2 v = 1.

313 Definition
Given a point Ω ∈ R3 (called the apex) and a curve γ (called the generating curve), the surface S
obtained by drawing rays from Ω and passing through γ is called a cone.

In practice, if the Cartesian equation of a surface can be put into the form f (A/C, B/C) = 0, where A, B, C are planes secant at exactly one point, then the surface is a cone, and its apex is given by A = 0, B = 0, C = 0.
314 Example
The surface in R³ implicitly given by

z² = x² + y²

is a cone, as its equation can be put in the form (x/z)² + (y/z)² − 1 = 0. Considering the planes x = 0, y = 0, z = 0, the apex is located at (0, 0, 0). The graph is shewn in figure 5.11.

315 Definition
A surface S obtained by making a curve γ turn around a line ∆ is called a surface of revolution.
We then say that ∆ is the axis of revolution. The intersection of S with a half-plane bounded by ∆ is
called a meridian.

If the Cartesian equation of S can be put in the form f (A, S) = 0, where A is a plane
and S is a sphere, then the surface is of revolution. The axis of S is the line passing
through the centre of S and perpendicular to the plane A.

316 Example
Find the equation of the surface of revolution generated by revolving the hyperbola

x2 − 4z 2 = 1

about the z-axis.


Figure 5.10. The torus.


x2 y2 z2
Figure 5.9. The hyperbolic cylinder x2 − y 2 = 1. Figure 5.11. Cone 2
+ 2 = 2.
a b c

Solution: ▶ Let (x, y, z) be a point on S. If this point were on the xz plane, it would be on the hyperbola, and its distance to the axis of rotation would be |x| = √(1 + 4z²). Anywhere else, the distance of (x, y, z) to the axis of rotation is the same as the distance of (x, y, z) to (0, 0, z), that is √(x² + y²). We must have

√(x² + y²) = √(1 + 4z²),

which is to say

x² + y² − 4z² = 1.

This surface is called a hyperboloid of one sheet. See figure 5.15. Observe that when z = 0, x² + y² = 1 is a circle on the xy plane. When x = 0, y² − 4z² = 1 is a hyperbola on the yz plane. When y = 0, x² − 4z² = 1 is a hyperbola on the xz plane.
A parametrization for this hyperboloid is

x = √(1 + 4u²) cos v,   y = √(1 + 4u²) sin v,   z = u,   u ∈ R, v ∈ [0; 2π]. ◀

317 Example
The circle (y − a)2 + z 2 = r2 , on the yz plane (a, r are positive real numbers) is revolved around the
z-axis, forming a torus T . Find the equation of this torus.

Solution: ▶ Let (x, y, z) be a point on T . If this point were on the yz plane, it would be on the circle, and its distance to the axis of rotation would be y = a + sgn (y − a) √(r² − z²), where sgn (t) (with sgn (t) = −1 if t < 0, sgn (t) = 1 if t > 0, and sgn (0) = 0) is the sign of t. Anywhere else, the distance from (x, y, z) to the z-axis is the distance of this point to the point (0, 0, z): √(x² + y²). We must have

x² + y² = (a + sgn (y − a) √(r² − z²))² = a² + 2a sgn (y − a) √(r² − z²) + r² − z².

Rearranging,

x² + y² + z² − a² − r² = 2a sgn (y − a) √(r² − z²),

or

(x² + y² + z² − (a² + r²))² = 4a²r² − 4a²z²

since (sgn (y − a))² = 1 (it could not be 0, why?). Rearranging again,

(x² + y² + z²)² − 2(a² + r²)(x² + y²) + 2(a² − r²)z² + (a² − r²)² = 0.

The equation of the torus, thus, is of fourth degree, and its graph appears in figure 5.10.
A parametrization for the torus generated by revolving the circle (y − a)² + z² = r² around the z-axis is

x = a cos θ + r cos θ cos α,   y = a sin θ + r sin θ cos α,   z = r sin α,

with (θ, α) ∈ [−π; π]².
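For concreteness, the quartic equation and the parametrization can be checked against each other numerically; the sample values a = 2, r = 1 below are arbitrary choices, not part of the text:

```python
import math

# Check numerically that the torus parametrization satisfies the quartic
# (x² + y² + z² − (a² + r²))² = 4a²r² − 4a²z² on a grid of (θ, α) values.
def torus_residual(a, r, steps=24):
    worst = 0.0
    for i in range(steps):
        for j in range(steps):
            th = -math.pi + 2 * math.pi * i / steps
            al = -math.pi + 2 * math.pi * j / steps
            x = a * math.cos(th) + r * math.cos(th) * math.cos(al)
            y = a * math.sin(th) + r * math.sin(th) * math.cos(al)
            z = r * math.sin(al)
            lhs = (x**2 + y**2 + z**2 - (a**2 + r**2)) ** 2
            rhs = 4 * a**2 * r**2 - 4 * a**2 * z**2
            worst = max(worst, abs(lhs - rhs))
    return worst

assert torus_residual(2.0, 1.0) < 1e-9  # a = 2, r = 1: arbitrary sample radii
```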


Figure 5.12. Paraboloid z = x²/a² + y²/b².

Figure 5.13. Hyperbolic paraboloid z = x²/a² − y²/b².

Figure 5.14. Two-sheet hyperboloid z²/c² = x²/a² + y²/b² + 1.
a b a b c a b

318 Example
The surface z = x2 + y 2 is called an elliptic paraboloid. The equation clearly requires that z ≥ 0.
For fixed z = c, c > 0, x2 + y 2 = c is a circle. When y = 0, z = x2 is a parabola on the xz plane.
When x = 0, z = y 2 is a parabola on the yz plane. See figure 5.12. The following is a parametrization
of this paraboloid:
x = √u cos v,   y = √u sin v,   z = u,   u ∈ [0; +∞[, v ∈ [0; 2π].
319 Example
The surface z = x2 − y 2 is called a hyperbolic paraboloid or saddle. If z = 0, x2 − y 2 = 0 is a pair
of lines in the xy plane. When y = 0, z = x2 is a parabola on the xz plane. When x = 0, z = −y 2
is a parabola on the yz plane. See figure 5.13. The following is a parametrization of this hyperbolic
paraboloid:
x = u, y = v, z = u2 − v 2 , u ∈ R, v ∈ R.
320 Example
The surface z 2 = x2 + y 2 + 1 is called an hyperboloid of two sheets. For z 2 − 1 < 0, x2 + y 2 < 0 is
impossible, and hence there is no graph when −1 < z < 1. When y = 0, z 2 − x2 = 1 is a hyperbola
on the xz plane. When x = 0, z 2 − y 2 = 1 is a hyperbola on the yz plane. When z = c is a constant
c > 1, then the x2 + y 2 = c2 − 1 are circles. See figure 5.14. The following is a parametrization for
the top sheet of this hyperboloid of two sheets

x = u cos v,   y = u sin v,   z = √(u² + 1),   u ∈ R, v ∈ [0; 2π]

and the following parametrizes the bottom sheet,

x = u cos v,   y = u sin v,   z = −√(u² + 1),   u ∈ R, v ∈ [0; 2π].

321 Example
The surface z 2 = x2 + y 2 − 1 is called an hyperboloid of one sheet. For x2 + y 2 < 1, z 2 < 0
is impossible, and hence there is no graph when x2 + y 2 < 1. When y = 0, z 2 − x2 = −1 is a
hyperbola on the xz plane. When x = 0, z 2 − y 2 = −1 is a hyperbola on the yz plane. When z = c is
a constant, then the x2 + y 2 = c2 + 1 are circles See figure 5.15. The following is a parametrization
for this hyperboloid of one sheet
x = √(u² + 1) cos v,   y = √(u² + 1) sin v,   z = u,   u ∈ R, v ∈ [0; 2π].

Figure 5.15. One-sheet hyperboloid z²/c² = x²/a² + y²/b² − 1.

Figure 5.16. Ellipsoid x²/a² + y²/b² + z²/c² = 1.

322 Example
Let a, b, c be strictly positive real numbers. The surface x²/a² + y²/b² + z²/c² = 1 is called an ellipsoid. For z = 0, x²/a² + y²/b² = 1 is an ellipse on the xy plane. When y = 0, x²/a² + z²/c² = 1 is an ellipse on the xz plane. When x = 0, y²/b² + z²/c² = 1 is an ellipse on the yz plane. See figure 5.16. We may parametrize the ellipsoid using spherical coordinates:

x = a cos θ sin ϕ,   y = b sin θ sin ϕ,   z = c cos ϕ,   θ ∈ [0; 2π], ϕ ∈ [0; π].
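This parametrization can be verified symbolically rather than numerically, assuming SymPy is available:

```python
import sympy as sp

# Symbolic check that the spherical-coordinate parametrization lies on the
# ellipsoid x²/a² + y²/b² + z²/c² = 1.
a, b, c, th, ph = sp.symbols('a b c theta phi', positive=True)
x = a * sp.cos(th) * sp.sin(ph)
y = b * sp.sin(th) * sp.sin(ph)
z = c * sp.cos(ph)
lhs = sp.simplify(x**2 / a**2 + y**2 / b**2 + z**2 / c**2)

assert lhs == 1
```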

Exercises

323 Problem
Find the equation of the surface of revolution S generated by revolving the ellipse 4x² + z² = 1 about the z-axis.

324 Problem
Find the equation of the surface of revolution generated by revolving the line 3x + 4y = 1 about the y-axis.

325 Problem
Describe the surface parametrized by φ(u, v) ↦ (v cos u, v sin u, au), (u, v) ∈ (0, 2π) × (0, 1), a > 0.

326 Problem
Describe the surface parametrized by φ(u, v) = (au cos v, bu sin v, u²), (u, v) ∈ (1, +∞) × (0, 2π), a, b > 0.

327 Problem
Consider the spherical cap defined by

S = {(x, y, z) ∈ R³ : x² + y² + z² = 1, z ≥ 1/√2}.

Parametrise S using Cartesian, Spherical, and Cylindrical coordinates.

328 Problem
Demonstrate that the surface in R³

S : e^{x²+y²+z²−2xz} − (x + z)e = 0,

implicitly defined, is a cylinder.

329 Problem
Shew that the surface in R³ implicitly defined by

x⁴ + y⁴ + z⁴ − 4xyz(x + y + z) = 1

is a surface of revolution, and find its axis of revolution.

330 Problem
Shew that the surface S in R³ given implicitly by the equation

1/(x − y) + 1/(y − z) + 1/(z − x) = 1

is a cylinder and find the direction of its directrix.

331 Problem
Shew that the surface S in R³ implicitly defined as

xy + yz + zx + x + y + z + 1 = 0

is of revolution and find its axis.

332 Problem
Demonstrate that the surface in R³ given implicitly by

z² − xy = 2z − 1

is a cone.

333 Problem (Putnam Exam 1970)
Determine, with proof, the radius of the largest circle which can lie on the ellipsoid

x²/a² + y²/b² + z²/c² = 1,   a > b > c > 0.

334 Problem
The hyperboloid of one sheet in figure 5.17 has the property that if it is cut by planes at z = ±2, its projection on the xy plane produces the ellipse x² + y²/4 = 1, and if it is cut by a plane at z = 0, its projection on the xy plane produces the ellipse 4x² + y² = 1. Find its equation.

Figure 5.17. Problem 334. (Cross-sections: z = ±2 give x² + y²/4 = 1; z = 0 gives 4x² + y² = 1.)

5.4. ⋆ Manifolds
335 Definition
We say M ⊆ Rn is a d-dimensional (differentiable) manifold if for every a ∈ M there exists domains
U ⊆ Rn , V ⊆ Rn and a differentiable function f : V → U such that rank(D(f)) = d at every point

in V and U M = f(V ).

336 Remark
For d = 1 this is just a curve, and for d = 2 this is a surface.

337 Remark
If d = 1 and γ is connected, then there exists an interval U and an injective differentiable function
γ : U → Rn such that Dγ ̸= 0 on U and γ(U ) = γ. If d > 1 this is no longer true: even though near
every point the surface is a differentiable image of a rectangle, the entire surface need not be one.

As before, d-dimensional manifolds can be obtained as level sets of functions f : Rn+d → Rn, provided we have rank(D(f)) = n on the entire level set.

338 Proposition
Let f : Rn+d → Rn be differentiable, c ∈ Rn, and let γ = { x ∈ Rn+d | f(x) = c } be the level set of f. If at every point in γ the matrix D(f) has rank n, then γ is a d-dimensional manifold.

The results from the previous section about tangent spaces of implicitly defined manifolds gen-
eralize naturally in this context.

339 Definition
Let U ⊆ Rd, f : U → R be a differentiable function, and M = { (x, f(x)) ∈ Rd+1 | x ∈ U } be the graph of f. (Note M is a d-dimensional manifold in Rd+1.) Let (a, f(a)) ∈ M.

■ The tangent “plane” at the point (a, f(a)) is defined by

{ (x, y) ∈ Rd+1 | y = f(a) + Df_a (x − a) }

■ The tangent space at the point (a, f(a)) (denoted by TM_(a,f(a))) is the subspace defined by

TM_(a,f(a)) = { (x, y) ∈ Rd+1 | y = Df_a x }.

340 Remark
When d = 2 the tangent plane is really a plane. For d = 1 it is a line (the tangent line), and for other
values it is a d-dimensional hyper-plane.

341 Proposition
Suppose f : Rn+d → Rn is differentiable, and the level set γ = { x | f(x) = c } is a d-dimensional manifold. Suppose further that D(f)_a has rank n for all a ∈ γ. Then the tangent space at a is precisely the kernel of D(f)_a, and the vectors ∇f1, …, ∇fn are n linearly independent vectors that are normal to the tangent space.

5.5. Constrained optimization.


Consider an implicitly defined surface S = {g = c}, for some g : R3 → R. Our aim is to maximise
or minimise a function f on this surface.


342 Definition
We say a function f attains a local maximum at a on the surface S, if there exists ϵ > 0 such that
|x − a| < ϵ and x ∈ S imply f (a) ≥ f (x).

343 Remark
This is sometimes called constrained local maximum, or local maximum subject to the constraint g =
c.

344 Proposition
If f attains a local maximum at a on the surface S, then ∃λ ∈ R such that ∇f(a) = λ∇g(a).

Proof. [Intuition] If ∇f(a) ̸= 0, then S′ := { f = f(a) } is a surface. If f attains a constrained maximum at a then S′ must be tangent to S at the point a. This forces ∇f(a) and ∇g(a) to be parallel. ■

345 Proposition (Multiple constraints)
Let f, g1, …, gn : Rd → R be differentiable. If f attains a local maximum at a subject to the constraints g1 = c1, g2 = c2, …, gn = cn, then ∃λ1, . . . , λn ∈ R such that ∇f(a) = Σ_{i=1}^{n} λi ∇gi(a).

To explicitly find constrained local maxima in Rn with n constraints we do the following:

■ Simultaneously solve the system of equations

∇f(x) = λ1∇g1(x) + · · · + λn∇gn(x)
g1(x) = c1,
...
gn(x) = cn.

■ The unknowns are the d-coordinates of x, and the Lagrange multipliers λ1 , …, λn . This is
n + d variables.

■ The first equation above is a vector equation where both sides have d coordinates. The re-
maining are scalar equations. So the above system is a system of n + d equations with n + d
variables.

■ The typical situation will yield a finite number of solutions.

■ There is a test involving the bordered Hessian for whether these points are constrained local minima / maxima or neither. This test is quite complicated and usually more trouble than it is worth, so one typically uses some ad-hoc method to decide whether the solution found is a local maximum or not.
346 Example
Find necessary conditions for f(x, y) = y to attain a local maximum/minimum subject to the constraint y = g(x).

Of course, from one variable calculus, we know that the local maxima / minima must occur at points where g′ = 0. Let's revisit it using the constrained optimization technique above.

Proof. [Solution] Note our constraint is of the form y − g(x) = 0. So at a local maximum we must have

(0, 1) = ∇f = λ∇(y − g(x)) = λ(−g′(x), 1) and y = g(x).

This forces λ = 1 and hence g′(x) = 0, as expected. ■

347 Example
Maximise xy subject to the constraint x²/a² + y²/b² = 1.

Proof. [Solution] At a local maximum,

(y, x) = ∇(xy) = λ∇(x²/a² + y²/b²) = λ(2x/a², 2y/b²),

which forces y² = x²b²/a². Substituting this in the constraint gives x = ±a/√2 and y = ±b/√2. This gives four possibilities for xy to attain a maximum. Directly checking shows that the points (a/√2, b/√2) and (−a/√2, −b/√2) both correspond to a local maximum, and the maximum value is ab/2. ■
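The Lagrange computation above is easy to sanity-check numerically. The following Python sketch (the values of a and b are arbitrary test choices, not from the text) samples the ellipse through its parametrization x = a cos t, y = b sin t and confirms the maximum value ab/2:

```python
import math

# On the ellipse, xy = a*cos(t) * b*sin(t) = (a*b/2)*sin(2t),
# so the maximum over t should be a*b/2.
a, b = 3.0, 2.0
n = 100000
best = max(a * math.cos(2 * math.pi * k / n) * b * math.sin(2 * math.pi * k / n)
           for k in range(n))
print(best, a * b / 2)  # the sampled maximum approaches a*b/2
```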

348 Proposition (Cauchy-Schwarz)
If x, y ∈ Rn then |x • y| ≤ |x||y|.

Proof. Maximise x • y subject to the constraints |x| = a and |y| = b. ■

349 Proposition (Inequality of the means)
If xi ≥ 0, then

(1/n) Σ_{i=1}^{n} xi ≥ ( Π_{i=1}^{n} xi )^(1/n).

350 Proposition (Young's inequality)
If p, q > 1 and 1/p + 1/q = 1, then

|xy| ≤ |x|^p/p + |y|^q/q.

6. Line Integrals
6.1. Line Integrals of Vector Fields
We start with some motivation. With this objective in mind, we recall the definition of work:

351 Definition
If a constant force f acting on a body produces a displacement ∆x, then the work done by the force is f • ∆x.

We want to generalize this definition to the case in which the force is not constant. For this pur-
pose let γ ⊆ Rn be a curve, with a given direction of traversal, and f : Rn → Rn be a vector
function.
Here f represents the force that acts on a body and pushes it along the curve γ. The work done by the force can be approximated by

W ≈ Σ_{i=0}^{N−1} f(xi) • (xi+1 − xi) = Σ_{i=0}^{N−1} f(xi) • ∆xi,

where x0, x1, …, xN−1 are N points on γ, chosen along the direction of traversal. The limit as the largest distance between neighbors approaches 0 is the work done:

W = lim_{∥P∥→0} Σ_{i=0}^{N−1} f(xi) • ∆xi.

This motivates the following definition:

352 Definition
Let γ ⊆ Rn be a curve with a given direction of traversal, and f : γ → Rn be a (vector) function. The line integral of f over γ is defined to be

∫_γ f • dℓ = lim_{∥P∥→0} Σ_{i=0}^{N−1} f(xi*) • (xi+1 − xi) = lim_{∥P∥→0} Σ_{i=0}^{N−1} f(xi*) • ∆xi,

if the above limit exists. Here P = {x0, x1, . . . , xN−1}, the points xi are chosen along the direction of traversal, and ∥P∥ = max |xi+1 − xi|.

353 Remark
If f = (f1, . . . , fn), where fi : γ → R are functions, then one often writes the line integral in the differential form notation as

∫_γ f • dℓ = ∫_γ f1 dx1 + · · · + fn dxn.

The following result provides an explicit way of calculating line integrals using a parametrization of the curve.
354 Theorem
If γ : [a, b] → Rn is a parametrization of γ (in the direction of traversal), then

∫_γ f • dℓ = ∫_a^b f ◦ γ(t) • γ′(t) dt.    (6.1)

Proof. Let a = t0 < t1 < · · · < tN = b be a partition of [a, b] and let xi = γ(ti). The line integral of f over γ is defined to be

∫_γ f • dℓ = lim_{∥P∥→0} Σ_{i=0}^{N−1} f(xi) • ∆xi = lim_{∥P∥→0} Σ_{i=0}^{N−1} Σ_{j=1}^{n} fj(xi) (∆xi)j.

By the Mean Value Theorem, we have (∆xi)j = γj′(ti*) ∆ti for some ti* ∈ [ti, ti+1], so

Σ_{j=1}^{n} Σ_{i=0}^{N−1} fj(xi) (∆xi)j = Σ_{j=1}^{n} Σ_{i=0}^{N−1} fj(xi) γj′(ti*) ∆ti → Σ_{j=1}^{n} ∫_a^b fj(γ(t)) γj′(t) dt = ∫_a^b f ◦ γ(t) • γ′(t) dt. ■

In the differential form notation (when n = 2) say

f = (f, g) and γ(t) = (x(t), y(t)),

where f, g : γ → R are functions. Then Theorem 354 says

∫_γ f • dℓ = ∫_γ f dx + g dy = ∫_a^b [ f(x(t), y(t)) x′(t) + g(x(t), y(t)) y′(t) ] dt.

355 Remark
Sometimes (6.1) is used as the definition of the line integral. In this case, one needs to verify that this
definition is independent of the parametrization. Since this is a good exercise, we’ll do it anyway a
little later.

356 Example
Take F(r) = (xe^y, z², xy); we want to find the line integral from a = (0, 0, 0) to b = (1, 1, 1).

[Figure: two paths C1 and C2 from a to b.]

We first integrate along the curve C1 : r(u) = (u, u², u³). Then r′(u) = (1, 2u, 3u²), and F(r(u)) = (ue^(u²), u⁶, u³). So

∫_{C1} F • dℓ = ∫_0^1 F • r′(u) du
            = ∫_0^1 ( ue^(u²) + 2u⁷ + 3u⁵ ) du
            = (e/2 − 1/2) + 1/4 + 1/2
            = e/2 + 1/4.

Now we try to integrate along another curve C2 : r(t) = (t, t, t). So r′(t) = (1, 1, 1), and

∫_{C2} F • dℓ = ∫_0^1 F • r′(t) dt
            = ∫_0^1 ( te^t + 2t² ) dt
            = 5/3.

We see that the line integral depends in general on the curve C, not just on the endpoints a and b.
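Both computations are easy to check numerically. The Python sketch below (function names are illustrative choices) approximates each line integral by a midpoint Riemann sum of F(r(t)) • r′(t):

```python
import math

def line_integral(F, r, rp, a, b, n=100000):
    """Midpoint Riemann sum for the line integral of F along r on [a, b]."""
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        t = a + (i + 0.5) * h
        Ft, rt = F(*r(t)), rp(t)
        total += (Ft[0] * rt[0] + Ft[1] * rt[1] + Ft[2] * rt[2]) * h
    return total

F = lambda x, y, z: (x * math.exp(y), z ** 2, x * y)

# C1: r(u) = (u, u^2, u^3)
I1 = line_integral(F, lambda u: (u, u ** 2, u ** 3),
                   lambda u: (1.0, 2 * u, 3 * u ** 2), 0.0, 1.0)
# C2: r(t) = (t, t, t)
I2 = line_integral(F, lambda t: (t, t, t),
                   lambda t: (1.0, 1.0, 1.0), 0.0, 1.0)
print(I1, math.e / 2 + 0.25)   # both approximately 1.609
print(I2, 5 / 3)               # both approximately 1.667
```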

357 Example
Suppose a body of mass M is placed at the origin. The force experienced by a body of mass m at the point x ∈ R3 is given by f(x) = −GMm x/|x|³, where G is the gravitational constant. Compute the work done when the body is moved from a to b along a straight line.

Solution: ▶ Let γ be the straight line joining a and b. Clearly γ : [0, 1] → γ defined by γ(t) = a + t(b − a) is a parametrization of γ. Now

W = ∫_γ f • dℓ = ∫_0^1 −GMm γ(t)/|γ(t)|³ • γ′(t) dt = GMm/|b| − GMm/|a|. ◀


358 Remark
If the line joining a and b passes through the origin, then some care has to be taken when doing the above computation. We will see later that gravity is a conservative force, and that the above line integral only depends on the endpoints and not the actual path taken.


6.2. Parametrization Invariance and Other Properties of Line Integrals
Since line integrals can be defined in terms of ordinary integrals, they share many of the properties
of ordinary integrals.

359 Definition
The curve γ is said to be the union of two curves γ1 and γ2 if γ is defined on an interval [a, b] and, for some d ∈ (a, b), the curves γ1 and γ2 are the restrictions γ|[a,d] and γ|[d,b].

360 Proposition

■ Linearity with respect to the integrand:

∫_γ (αf + βG) • dℓ = α ∫_γ f • dℓ + β ∫_γ G • dℓ

■ Additivity with respect to the path of integration: if the curve γ is the union of the two curves γ1 and γ2, then

∫_γ f • dℓ = ∫_{γ1} f • dℓ + ∫_{γ2} f • dℓ

The proofs of these properties follow immediately from the definition of the line integral.

361 Definition
Let h : I → I1 be a C¹ real-valued function that is a one-to-one map of an interval I = [a, b] onto another interval I1 = [a1, b1]. Let γ1 : I1 → Rn be a piecewise C¹ path. Then we call the composition

γ2 = γ1 ◦ h : I → Rn

a reparametrization of γ1.

It is implicit in the definition that h must carry endpoints to endpoints; that is, either h(a) = a1 and h(b) = b1, or h(a) = b1 and h(b) = a1. We distinguish these two types of reparametrizations.

■ In the first case, the reparametrization is said to be orientation-preserving, and a particle tracing the path γ1 ◦ h moves in the same direction as a particle tracing γ1.

■ In the second case, the reparametrization is described as orientation-reversing, and a particle tracing the path γ1 ◦ h moves in the opposite direction to that of a particle tracing γ1.
362 Proposition (Parametrization invariance)
If γ1 : [a1, b1] → γ and γ2 : [a2, b2] → γ are two parametrizations of γ that traverse it in the same direction, then

∫_{a1}^{b1} f ◦ γ1(t) • γ1′(t) dt = ∫_{a2}^{b2} f ◦ γ2(t) • γ2′(t) dt.

Proof. Let φ : [a1, b1] → [a2, b2] be defined by φ = γ2⁻¹ ◦ γ1. Since γ1 and γ2 traverse the curve in the same direction, φ must be increasing. One can also show (using the inverse function theorem) that φ is continuous and piecewise C¹. Now γ1 = γ2 ◦ φ, so

∫_{a1}^{b1} f ◦ γ1(t) • γ1′(t) dt = ∫_{a1}^{b1} f(γ2(φ(t))) • γ2′(φ(t)) φ′(t) dt.

Making the substitution s = φ(t) finishes the proof. ■

6.3. Line Integral of Scalar Fields


363 Definition
If γ ⊆ Rn is a piecewise C¹ curve, then

length(γ) = ∫_γ |dℓ| = lim_{∥P∥→0} Σ_{i=0}^{N−1} |xi+1 − xi|,

where as before P = {x0, . . . , xN−1}.

More generally:

364 Definition
If f : γ → R is any scalar function, we define^a

∫_γ f |dℓ| := lim_{∥P∥→0} Σ_{i=0}^{N−1} f(xi*) |xi+1 − xi|.

^a Unfortunately ∫_γ f |dℓ| is also called the line integral. To avoid confusion, we will call this the line integral with respect to arc-length instead.

The integral ∫_γ f |dℓ| is also denoted by

∫_γ f ds = ∫_γ f |dℓ|

365 Theorem
Let γ ⊆ Rn be a piecewise C¹ curve, γ : [a, b] → Rn be any parametrization (in the given direction of traversal), and f : γ → R be a scalar function. Then

∫_γ f |dℓ| = ∫_a^b f(γ(t)) |γ′(t)| dt,

and consequently

length(γ) = ∫_γ 1 |dℓ| = ∫_a^b |γ′(t)| dt.

366 Example
Compute the circumference of a circle of radius r.

367 Example
The trace of

r(t) = i cos t + j sin t + kt

is known as a cylindrical helix. To find the length of the helix as t traverses the interval [0, 2π], first observe that

∥dℓ∥ = √( (−sin t)² + (cos t)² + 1 ) dt = √2 dt,

and thus the length is

∫_0^{2π} √2 dt = 2π√2.
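A quick numerical check of the helix length, approximating the curve by inscribed chords (a sketch; the number of subdivisions is an arbitrary choice):

```python
import math

# Sum the chord lengths of the helix r(t) = (cos t, sin t, t), 0 <= t <= 2*pi.
# The total should approach 2*pi*sqrt(2).
n = 200000
length = 0.0
prev = (math.cos(0.0), math.sin(0.0), 0.0)
for i in range(1, n + 1):
    t = 2 * math.pi * i / n
    cur = (math.cos(t), math.sin(t), t)
    length += math.dist(prev, cur)   # Euclidean distance between successive points
    prev = cur
print(length, 2 * math.pi * math.sqrt(2))
```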

6.3.1. Area above a Curve

If γ is a curve in the xy-plane and f(x, y) is a nonnegative continuous function defined on the curve γ, then the integral

∫_γ f(x, y) |dℓ|

can be interpreted as the area A of the curtain obtained as the union of all vertical line segments that extend upward from the point (x, y) to a height of f(x, y), i.e., the area bounded by the curve γ and the graph of f.
This fact comes from the approximation by rectangles:

area = lim_{∥P∥→0} Σ_{i=0}^{N−1} f(xi*) |xi+1 − xi|.

368 Example
Use a line integral to show that the lateral surface area A of a right circular cylinder of radius r and
height h is 2πrh.


Figure 6.1. Right circular cylinder of radius r and


height h

Solution: ▶ We will use the right circular cylinder with base circle C given by x² + y² = r² and with height h in the positive z direction (see Figure 6.1). Parametrize C as follows:

x = x(t) = r cos t ,  y = y(t) = r sin t ,  0 ≤ t ≤ 2π

Let f(x, y) = h for all (x, y). Then

A = ∫_C f(x, y) ds = ∫_a^b f(x(t), y(t)) √( x′(t)² + y′(t)² ) dt
  = ∫_0^{2π} h √( (−r sin t)² + (r cos t)² ) dt
  = h ∫_0^{2π} r √( sin²t + cos²t ) dt
  = rh ∫_0^{2π} 1 dt = 2πrh ◀

369 Example
Find the area of the surface extending upward from the circle x² + y² = 1 in the xy-plane to the parabolic cylinder z = 1 − y².

Solution: ▶ The circle C given by x² + y² = 1 can be parametrized as follows:

x = x(t) = cos t ,  y = y(t) = sin t ,  0 ≤ t ≤ 2π

Let f(x, y) = 1 − y² for all (x, y). Above the circle we have f(x(t), y(t)) = 1 − sin²t. Then

A = ∫_C f(x, y) ds = ∫_a^b f(x(t), y(t)) √( x′(t)² + y′(t)² ) dt
  = ∫_0^{2π} (1 − sin²t) √( (−sin t)² + (cos t)² ) dt
  = ∫_0^{2π} (1 − sin²t) dt = π ◀
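This value is easy to verify numerically; the sketch below applies the midpoint rule to the arc-length integral (note |r′(t)| = 1 on the unit circle):

```python
import math

# Integrate f = 1 - sin(t)^2 over [0, 2*pi]; the result should be pi.
n = 100000
h = 2 * math.pi / n
area = sum((1 - math.sin((i + 0.5) * h) ** 2) * h for i in range(n))
print(area, math.pi)
```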

6.4. The First Fundamental Theorem


370 Definition
Suppose U ⊆ Rn is a domain. A vector field F is a gradient field in U if there exists a C¹ function φ : U → R such that

F = ∇φ.

The function φ is called the potential of the vector field F.

371 Definition
Suppose U ⊆ Rn is a domain. A vector field f : U → Rn is a path-independent vector field if the integral of f over any piecewise C¹ curve in U depends only on the curve's endpoints.

372 Theorem (First Fundamental theorem for line integrals)
Suppose U ⊆ Rn is a domain, φ : U → R is C¹ and γ ⊆ Rn is any differentiable curve that starts at a, ends at b and is completely contained in U. Then

∫_γ ∇φ • dℓ = φ(b) − φ(a).

Proof. Let γ : [0, 1] → γ be a parametrization of γ. Note

∫_γ ∇φ • dℓ = ∫_0^1 ∇φ(γ(t)) • γ′(t) dt = ∫_0^1 d/dt φ(γ(t)) dt = φ(b) − φ(a). ■

The above theorem can be restated as: a gradient vector field is a path-independent vector field.
If γ is a closed curve, then line integrals over γ are denoted by

∮_γ f • dℓ.

373 Corollary
If γ ⊆ Rn is a closed curve, and φ : γ → R is C¹, then

∮_γ ∇φ • dℓ = 0.

374 Definition
Let U ⊆ Rn, and f : U → Rn be a vector function. We say f is a conservative force (or conservative vector field) if

∮_γ f • dℓ = 0

for all closed curves γ which are completely contained inside U.

Clearly if f = ∇ϕ for some C¹ function ϕ : U → R, then f is conservative. The converse is also true provided U is simply connected, which we'll return to later. For a conservative vector field f = ∇ϕ:

∫_γ f • dℓ = ∫_γ ∇ϕ • dℓ = [ϕ]_a^b = ϕ(b) − ϕ(a)

We note that the result is independent of the path γ joining a to b.


[Figure: two curves γ1 and γ2 joining A to B.]
375 Example
If φ fails to be C¹ even at one point, the above can fail quite badly. Let φ(x, y) = tan⁻¹(y/x), extended to R² − { (x, y) | x ≤ 0 } in the usual way. Then

∇φ = ( 1/(x² + y²) ) (−y, x),

which is defined on R² − {(0, 0)}. In particular, if γ = { (x, y) | x² + y² = 1 }, then ∇φ is defined on all of γ. However, you can easily compute

∮_γ ∇φ • dℓ = 2π ̸= 0.

The reason this doesn't contradict the previous corollary is that Corollary 373 requires φ itself to be defined on all of γ, and not just ∇φ! This example leads into something called the winding number which we will return to later.
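A direct numerical computation confirms this (a sketch; the midpoint rule is applied to the standard parametrization of the unit circle):

```python
import math

# Integrate grad(phi) = (-y, x)/(x^2 + y^2) once counterclockwise around
# the unit circle; the result should be 2*pi, not 0.
n = 100000
h = 2 * math.pi / n
total = 0.0
for i in range(n):
    t = (i + 0.5) * h
    x, y = math.cos(t), math.sin(t)
    r2 = x * x + y * y
    Fx, Fy = -y / r2, x / r2
    dx, dy = -math.sin(t), math.cos(t)   # gamma'(t)
    total += (Fx * dx + Fy * dy) * h
print(total, 2 * math.pi)
```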


6.5. Test for a Gradient Field


If a vector field F is a gradient field, and the potential φ has continuous second derivatives, then the second-order mixed partial derivatives must be equal:

∂Fi/∂xj (x) = ∂Fj/∂xi (x) for all i, j.

So if F = (F1, . . . , Fn) is a gradient field and the components of F have continuous partial derivatives, then we must have

∂Fi/∂xj (x) = ∂Fj/∂xi (x) for all i, j.

If these partial derivatives do not agree, then the vector field cannot be a gradient field.
This gives us an easy way to determine that a vector field is not a gradient field.
376 Example
The vector field (−y, x, −yx) is not a gradient field because ∂₂F₁ = −1 is not equal to ∂₁F₂ = 1.

When F is defined on a simply connected domain and has continuous partial derivatives, the check works the other way as well. If F = (F1, . . . , Fn) is a vector field whose components have continuous partial derivatives satisfying

∂Fi/∂xj (x) = ∂Fj/∂xi (x) for all i, j,

then F is a gradient field (i.e., there is a potential function f such that F = ∇f). This gives us a very nice way of checking if a vector field is a gradient field.
377 Example
The vector field F = (x, z, y) is a gradient field because F is defined on all of R3, each component has continuous partial derivatives, and, writing F = (M, N, P) = (x, z, y), we have My = 0 = Nx, Mz = 0 = Px, and Nz = 1 = Py. Notice that f = x²/2 + yz gives ∇f = ⟨x, z, y⟩ = F.
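A finite-difference check of this example (a sketch; the sample point and step size are arbitrary illustrative choices):

```python
# The numerical gradient of f(x, y, z) = x^2/2 + y*z should reproduce F = (x, z, y).
def f(x, y, z):
    return x * x / 2 + y * z

def num_grad(f, x, y, z, h=1e-6):
    """Central-difference approximation of grad f at (x, y, z)."""
    return ((f(x + h, y, z) - f(x - h, y, z)) / (2 * h),
            (f(x, y + h, z) - f(x, y - h, z)) / (2 * h),
            (f(x, y, z + h) - f(x, y, z - h)) / (2 * h))

x, y, z = 1.3, -0.7, 2.1
g = num_grad(f, x, y, z)
F = (x, z, y)
print(g, F)   # the two triples agree up to finite-difference error
```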

6.5.1. Irrotational Vector Fields


In this section we restrict our attention to three-dimensional space.

378 Definition
Let f : U → R3 be a C¹ vector field defined in the open set U. Then the vector field f is called irrotational if and only if its curl is 0 everywhere in U, i.e., if

∇ × f ≡ 0.

For any C² scalar field φ on U, we have

∇ × (∇φ) ≡ 0,

so every C¹ gradient vector field on U is also an irrotational vector field on U.

Provided that U is simply connected, the converse of this is also true:


379 Theorem
Let U ⊂ R3 be a simply connected domain and let f be a C¹ vector field in U. Then the following are equivalent:

■ f is an irrotational vector field;

■ f is a gradient vector field on U;

■ f is a conservative vector field on U.

The proof of this theorem is presented in Section 7.7.1.

The above statement is not true in general if U is not simply connected, as we have already seen in Example 375.

6.5.2. Work and potential energy


380 Definition (Work and potential energy)
If F(r) is a force, then ∫_C F • dℓ is the work done by the force along the curve C. It is the limit of a sum of terms F(r) • δr, i.e. the force along the direction of δr.

Consider a point particle moving under F(r) according to Newton's second law: F(r) = mr̈. Since the kinetic energy is defined as

T(t) = (1/2) m ṙ²,

the rate of change of energy is

d/dt T(t) = m ṙ • r̈ = F • ṙ.

Suppose the path of the particle is a curve C from a = r(α) to b = r(β). Then

T(β) − T(α) = ∫_α^β dT/dt dt = ∫_α^β F • ṙ dt = ∫_C F • dℓ.

So the work done on the particle is the change in kinetic energy.

381 Definition (Potential energy)
Given a conservative force F = −∇V, V(x) is the potential energy. Then

∫_C F • dℓ = V(a) − V(b).

So the work done (gain in kinetic energy) is the loss in potential energy, and the total energy T + V is conserved, i.e. constant during motion.
We see that energy is conserved for conservative forces. In fact, the converse is true — energy is conserved only for conservative forces.


6.6. The Second Fundamental Theorem


The gradient theorem states that if the vector field f is the gradient of some scalar-valued function,
then f is a path-independent vector field. This theorem has a powerful converse:

382 Theorem
Suppose U ⊆ Rn is a domain of Rn . If F is a path-independent vector field in U , then F is the
gradient of some scalar-valued function.

It is straightforward to show that a vector field is path-independent if and only if the integral of
the vector field over every closed loop in its domain is zero. Thus the converse can alternatively be
stated as follows: If the integral of f over every closed loop in the domain of f is zero, then f is the
gradient of some scalar-valued function.
Proof. Suppose U is an open, path-connected subset of Rn, and F : U → Rn is a continuous and path-independent vector field. Fix some point a of U, and define f : U → R by

f(x) := ∫_{γ[a,x]} F(u) • dℓ

Here γ[a, x] is any differentiable curve in U originating at a and terminating at x. We know that f is well-defined because F is path-independent.
Let v be any nonzero vector in Rn. By the definition of the directional derivative,

∂f/∂v (x) = lim_{t→0} [ f(x + tv) − f(x) ] / t                                    (6.2)
          = lim_{t→0} (1/t) [ ∫_{γ[a,x+tv]} F(u) • dℓ − ∫_{γ[a,x]} F(u) • dℓ ]    (6.3)
          = lim_{t→0} (1/t) ∫_{γ[x,x+tv]} F(u) • dℓ                               (6.4)

To calculate the integral within the final limit, we must parametrize γ[x, x + tv]. Since F is path-independent, U is open, and t is approaching zero, we may assume that this path is a straight line, and parametrize it as u(s) = x + sv for 0 ≤ s ≤ t. Now, since u′(s) = v, the limit becomes

lim_{t→0} (1/t) ∫_0^t F(u(s)) • u′(s) ds = d/dt ∫_0^t F(x + sv) • v ds |_{t=0} = F(x) • v.
Thus we have a formula for ∂f/∂v, where v is arbitrary. Taking v to be the standard basis vectors and writing x = (x1, x2, . . . , xn),

∇f(x) = ( ∂f(x)/∂x1, ∂f(x)/∂x2, . . . , ∂f(x)/∂xn ) = F(x).

Thus we have found a scalar-valued function f whose gradient is the path-independent vector field F, as desired. ■

6.7. Constructing Potential Functions


If f is a conservative field on an open connected set U, the line integral of f is independent of the path in U. Therefore we can find a potential simply by integrating f from some fixed point a to an arbitrary point x in U, using any piecewise smooth path lying in U. The scalar field so obtained depends on the choice of the initial point a. If we start from another initial point, say b, we obtain a new potential. But, because of the additive property of line integrals, the two potentials can differ only by a constant, this constant being the integral of f from a to b.

Construction of a potential on an open rectangle. If f is a conservative vector field on an open rectangle in Rn, a potential f can be constructed by integrating from a fixed point to an arbitrary point along a set of line segments parallel to the coordinate axes.

[Figure: paths from (a, b) to (x, y) through (x, b) or through (a, y), along segments parallel to the axes.]

We will simplify the deduction, assuming that n = 2. In this case we can integrate first from (a, b) to (x, b) along a horizontal segment, then from (x, b) to (x, y) along a vertical segment. Along the horizontal segment we use the parametric representation

γ1(t) = ti + bj, a ≤ t ≤ x,

and along the vertical segment we use the parametrization

γ2(t) = xi + tj, b ≤ t ≤ y.

If F(x, y) = F1(x, y)i + F2(x, y)j, the resulting formula for a potential f(x, y) is

f(x, y) = ∫_a^x F1(t, b) dt + ∫_b^y F2(x, t) dt.

We could also integrate first from (a, b) to (a, y) along a vertical segment and then from (a, y) to (x, y) along a horizontal segment as indicated by the dotted lines in the figure. This gives us another formula for f(x, y),

f(x, y) = ∫_b^y F2(a, t) dt + ∫_a^x F1(t, y) dt.

Both formulas give the same value for f(x, y) because the line integral of a gradient is independent of the path.

Construction of a potential using anti-derivatives. There is another way to find a potential of a conservative vector field: use the fact that ∂V/∂x = Fx to conclude that V(x, y) must be of the form ∫_a^x Fx(u, y) du + G(y), and similarly ∂V/∂y = Fy implies that V(x, y) must be of the form ∫_b^y Fy(x, v) dv + H(x). So you find functions G(y) and H(x) such that

∫_a^x Fx(u, y) du + G(y) = ∫_b^y Fy(x, v) dv + H(x).

383 Example
Show that

F = (e^x cos y + yz)i + (xz − e^x sin y)j + (xy + z)k

is conservative over its natural domain and find a potential function for it.

Solution: ▶
The natural domain of F is all of space, which is connected and simply connected. Let's define the following:

M = e^x cos y + yz
N = xz − e^x sin y
P = xy + z

and calculate

∂P/∂x = y = ∂M/∂z
∂P/∂y = x = ∂N/∂z
∂N/∂x = z − e^x sin y = ∂M/∂y
Because these partial derivatives are continuous and satisfy the equalities above on a simply connected domain, F is conservative. Now that we know there exists a function f whose gradient is equal to F, let's find f.
∂f/∂x = e^x cos y + yz
∂f/∂y = xz − e^x sin y
∂f/∂z = xy + z

If we integrate the first of the three equations with respect to x, we find that

f(x, y, z) = ∫ (e^x cos y + yz) dx = e^x cos y + xyz + g(y, z),

where g(y, z) is a “constant” of integration that may depend on y and z. We then calculate the partial derivative with respect to y from this equation and match it with the equation above.

∂/∂y (f(x, y, z)) = −e^x sin y + xz + ∂g/∂y = xz − e^x sin y

This means that the partial derivative of g with respect to y is 0, thus eliminating y from g entirely and leaving it as a function of z alone.

f(x, y, z) = e^x cos y + xyz + h(z)

We then repeat the process with the partial derivative with respect to z:

∂/∂z (f(x, y, z)) = xy + dh/dz = xy + z,

which means that

dh/dz = z,

so we can find h(z) by integrating:

h(z) = z²/2 + C.

Therefore,

f(x, y, z) = e^x cos y + xyz + z²/2 + C.
We still have infinitely many potential functions for F, one at each value of C. ◀
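The potential just constructed can be checked against F by finite differences (a sketch; the sample point is an arbitrary choice, and the constant C is taken to be 0):

```python
import math

# The numerical gradient of f(x, y, z) = e^x cos y + x*y*z + z^2/2 should
# reproduce F = (e^x cos y + y*z, x*z - e^x sin y, x*y + z).
def f(x, y, z):
    return math.exp(x) * math.cos(y) + x * y * z + z * z / 2

x, y, z = 0.4, 1.1, -0.8
h = 1e-6
g = ((f(x + h, y, z) - f(x - h, y, z)) / (2 * h),
     (f(x, y + h, z) - f(x, y - h, z)) / (2 * h),
     (f(x, y, z + h) - f(x, y, z - h)) / (2 * h))
F = (math.exp(x) * math.cos(y) + y * z,
     x * z - math.exp(x) * math.sin(y),
     x * y + z)
print(g)
print(F)   # agrees with g up to finite-difference error
```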

6.8. Green’s Theorem in the Plane


384 Definition
A positively oriented curve is a planar simple closed curve such that when travelling on it one al-
ways has the curve interior to the left. If in the previous definition one interchanges left and right,
one obtains a negatively oriented curve.

We will now see a way of evaluating the line integral of a smooth vector field around a simple
closed curve. A vector field f(x, y) = P (x, y) i + Q(x, y) j is smooth if its component functions
P (x, y) and Q(x, y) are smooth. We will use Green’s Theorem (sometimes called Green’s Theorem
in the plane) to relate the line integral around a closed curve with a double integral over the region
inside the curve:

385 Theorem (Green's Theorem - Simple Regions)
Let Ω be a region in R² whose boundary is a positively oriented curve γ which is piecewise smooth. Let f(x, y) = P(x, y) i + Q(x, y) j be a smooth vector field defined on both Ω and γ. Then

∮_γ f • dℓ = ∬_Ω ( ∂Q/∂x − ∂P/∂y ) dA,    (6.5)

where γ is traversed so that Ω is always on the left side of γ.


γ1 ◀γ 1


γ3 γ2




(b) positively oriented curve


(a) positively oriented curve

γ1


(c) negatively oriented curve

Figure 6.2. Orientations of Curves

Proof. We will prove the theorem in the case of a simple region Ω, that is, where the boundary curve γ can be written as γ = γ1 ∪ γ2 in two distinct ways:

γ1 = the curve y = y1(x) from the point X1 to the point X2    (6.6)
γ2 = the curve y = y2(x) from the point X2 to the point X1,   (6.7)

where X1 and X2 are the points on γ farthest to the left and right, respectively; and

γ1 = the curve x = x1(y) from the point Y2 to the point Y1    (6.8)
γ2 = the curve x = x2(y) from the point Y1 to the point Y2,   (6.9)

where Y1 and Y2 are the lowest and highest points, respectively, on γ. See the figure.
[Figure: the region Ω bounded by γ, with y = y1(x) and y = y2(x) for a ≤ x ≤ b, and x = x1(y) and x = x2(y) for c ≤ y ≤ d.]

Integrate P(x, y) around γ using the representation γ = γ1 ∪ γ2. Since y = y1(x) along γ1 (as x goes from a to b) and y = y2(x) along γ2 (as x goes from b to a), as we see from the figure, then we have

∮_γ P(x, y) dx = ∫_{γ1} P(x, y) dx + ∫_{γ2} P(x, y) dx
             = ∫_a^b P(x, y1(x)) dx + ∫_b^a P(x, y2(x)) dx
             = ∫_a^b P(x, y1(x)) dx − ∫_a^b P(x, y2(x)) dx
             = − ∫_a^b ( P(x, y2(x)) − P(x, y1(x)) ) dx
             = − ∫_a^b P(x, y) |_{y=y1(x)}^{y=y2(x)} dx
             = − ∫_a^b ∫_{y1(x)}^{y2(x)} ∂P(x, y)/∂y dy dx   (by the Fundamental Theorem of Calculus)
             = − ∬_Ω ∂P/∂y dA.

Likewise, integrate Q(x, y) around γ using the representation γ = γ1 ∪ γ2. Since x = x1(y) along γ1 (as y goes from d to c) and x = x2(y) along γ2 (as y goes from c to d), as we see from the figure, then we have

∮_γ Q(x, y) dy = ∫_{γ1} Q(x, y) dy + ∫_{γ2} Q(x, y) dy
             = ∫_d^c Q(x1(y), y) dy + ∫_c^d Q(x2(y), y) dy
             = − ∫_c^d Q(x1(y), y) dy + ∫_c^d Q(x2(y), y) dy
             = ∫_c^d ( Q(x2(y), y) − Q(x1(y), y) ) dy
             = ∫_c^d Q(x, y) |_{x=x1(y)}^{x=x2(y)} dy
             = ∫_c^d ∫_{x1(y)}^{x2(y)} ∂Q(x, y)/∂x dx dy   (by the Fundamental Theorem of Calculus)
             = ∬_Ω ∂Q/∂x dA, and so

∮_γ f • dℓ = ∮_γ P(x, y) dx + ∮_γ Q(x, y) dy
           = − ∬_Ω ∂P/∂y dA + ∬_Ω ∂Q/∂x dA
           = ∬_Ω ( ∂Q/∂x − ∂P/∂y ) dA. ■

386 Remark
Note, Green’s theorem requires that Ω is bounded and f (or P and Q) is C 1 on all of Ω. If this fails at
even one point, Green’s theorem need not apply anymore!

387 Example
Evaluate ∮_C (x² + y²) dx + 2xy dy, where C is the boundary traversed counterclockwise of the region R = { (x, y) : 0 ≤ x ≤ 1, 2x² ≤ y ≤ 2x }.

[Figure: the region R between y = 2x² and y = 2x for 0 ≤ x ≤ 1, with corner point (1, 2).]

Solution: ▶ R is the shaded region in the figure above. By Green's Theorem, for P(x, y) = x² + y² and Q(x, y) = 2xy, we have

∮_C (x² + y²) dx + 2xy dy = ∬_Ω ( ∂Q/∂x − ∂P/∂y ) dA = ∬_Ω (2y − 2y) dA = ∬_Ω 0 dA = 0.

There is another way to see that the answer is zero. The vector field f(x, y) = (x² + y²) i + 2xy j has a potential function F(x, y) = (1/3)x³ + xy², and so ∮_C f • dℓ = 0. ◀

388 Example
Let f(x, y) = P(x, y) i + Q(x, y) j, where

P(x, y) = −y/(x² + y²) and Q(x, y) = x/(x² + y²),

and let R = { (x, y) : 0 < x² + y² ≤ 1 }. For the boundary curve C : x² + y² = 1, traversed counterclockwise, it was shown in Exercise 9(b) in Section 4.2 that ∮_C f • dℓ = 2π. But

∂Q/∂x = (y² − x²)/(x² + y²)² = ∂P/∂y  ⇒  ∬_Ω ( ∂Q/∂x − ∂P/∂y ) dA = ∬_Ω 0 dA = 0.

This would seem to contradict Green's Theorem. However, note that R is not the entire region enclosed by C, since the point (0, 0) is not contained in R. That is, R has a “hole” at the origin, so Green's Theorem does not apply.
389 Example
Calculate the work done by the force

f(x, y) = (sin x − y³) i + (e^y + x³) j

to move a particle around the unit circle x² + y² = 1 in the counterclockwise direction.

Solution: ▶

W = ∮_C f • dℓ
  = ∮_C (sin x − y³) dx + (e^y + x³) dy
  = ∬_R [ ∂/∂x (e^y + x³) − ∂/∂y (sin x − y³) ] dA   (by Green's Theorem)
  = 3 ∬_R (x² + y²) dA
  = 3 ∫_0^{2π} ∫_0^1 r³ dr dθ = 3π/2   (using polar coordinates)
◀
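The answer can be confirmed by evaluating the line integral directly (a sketch using the midpoint rule on the standard parametrization of the unit circle):

```python
import math

# Work of f = (sin x - y^3, e^y + x^3) around the unit circle; should be 3*pi/2.
n = 200000
h = 2 * math.pi / n
W = 0.0
for i in range(n):
    t = (i + 0.5) * h
    x, y = math.cos(t), math.sin(t)
    dx, dy = -math.sin(t), math.cos(t)
    W += ((math.sin(x) - y ** 3) * dx + (math.exp(y) + x ** 3) * dy) * h
print(W, 3 * math.pi / 2)
```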


Green's Theorem can be generalized:

390 Theorem (Green's Theorem - Regions with Holes)
Let Ω ⊆ R² be a bounded domain whose exterior boundary is a piecewise C¹ curve γ. If Ω has holes, let γ1, …, γN be the interior boundaries. If f : Ω̄ → R² is C¹, then

∬_Ω [∂1F2 − ∂2F1] dA = ∮_γ f • dℓ + Σ_{i=1}^{N} ∮_{γi} f • dℓ,

where all line integrals above are computed by traversing the exterior boundary counter clockwise, and every interior boundary clockwise, i.e., such that the boundary is a positively oriented curve.

391 Remark
A common convention is to denote the boundary of Ω by ∂Ω and write

∂Ω = γ ∪ ( ∪_{i=1}^{N} γi ).

Then Theorem 390 becomes

∬_Ω [∂1F2 − ∂2F1] dA = ∮_{∂Ω} f • dℓ,

where again the exterior boundary is oriented counter clockwise and the interior boundaries are all oriented clockwise.
392 Remark
In the differential form notation, Green's theorem is stated as

∬_Ω [∂x Q − ∂y P] dA = ∫_{∂Ω} P dx + Q dy,

where P, Q : Ω̄ → R are C¹ functions. (We use the same assumptions as before on the domain Ω, and orientations of the line integrals on the boundary.)

Proof. The full proof is a little cumbersome. But the main idea can be seen by first proving it when Ω is a square. Indeed, suppose first Ω = (0, 1)².

Then the fundamental theorem of calculus gives


    ∬_Ω [∂₁F₂ − ∂₂F₁] dA = ∫₀¹ [F₂(1, y) − F₂(0, y)] dy − ∫₀¹ [F₁(x, 1) − F₁(x, 0)] dx.

The first integral is the line integral of f on the two vertical sides of the square, and the second one
is the line integral of f on the two horizontal sides of the square. This proves Theorem 390 in the case
when Ω is a square.
For line integrals, when adding two rectangles with a common edge the common edges are tra-
versed in opposite directions so the sum is just the line integral over the outside boundary.

Similarly when adding a lot of rectangles: everything cancels except the outside boundary. This
extends Green’s Theorem on a rectangle to Green’s Theorem on a sum of rectangles. Since any
region can be approximated as closely as we want by a sum of rectangles, Green’s Theorem must
hold on arbitrary regions.

393 Example
Evaluate ∮_γ y³ dx − x³ dy, where γ consists of the two circles of radius 2 and radius 1 centered at the origin, both with positive orientation.

Solution: ▶

    ∮_γ y³ dx − x³ dy = −3 ∬_D (x² + y²) dA       (6.17)
                      = −3 ∫₀^{2π} ∫₁² r³ dr dθ     (6.18)
                      = −45π/2.                     (6.19)

◀
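This value can be checked numerically (an illustration of ours, assuming the orientation convention above: outer circle counterclockwise, inner circle clockwise):

```python
import math

def circle_integral(R, n=20000, ccw=True):
    # Midpoint Riemann sum for ∮ y^3 dx - x^3 dy on a circle of radius R.
    dt = 2 * math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * dt
        x, y = R * math.cos(t), R * math.sin(t)
        dx, dy = -R * math.sin(t), R * math.cos(t)
        total += (y**3 * dx - x**3 * dy) * dt
    return total if ccw else -total

# outer boundary (radius 2) counterclockwise, inner boundary (radius 1) clockwise
value = circle_integral(2) + circle_integral(1, ccw=False)
print(value)  # ≈ -70.686, i.e. -45π/2
```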

6.9. Application of Green’s Theorem: Area


Green’s theorem can be used to compute area by a line integral. Let C be a positively oriented, piecewise smooth, simple closed curve in the plane, and let U be the region bounded by C. The area of the domain U is given by A = ∬_U dA.

Then, if we choose P and Q such that ∂Q/∂x − ∂P/∂y = 1, the area is given by

    A = ∮_C (P dx + Q dy).

Possible formulas for the area of U include:

    A = ∮_C x dy = −∮_C y dx = ½ ∮_C (−y dx + x dy).
394 Corollary
Let Ω ⊆ R² be a bounded set with a C¹ boundary ∂Ω. Then

    area(Ω) = ½ ∫_{∂Ω} [−y dx + x dy] = ∫_{∂Ω} −y dx = ∫_{∂Ω} x dy.

395 Example
Use Green’s Theorem to calculate the area of the disk D of radius r.

Solution: ▶ The boundary of D is the circle of radius r:

C(t) = (r cos t, r sin t), 0 ≤ t ≤ 2π.

Then

C ′ (t) = (−r sin t, r cos t),

and, by Corollary 394,


    area of D = ∬_D dA
              = ½ ∮_C x dy − y dx
              = ½ ∫₀^{2π} [(r cos t)(r cos t) − (r sin t)(−r sin t)] dt
              = ½ ∫₀^{2π} r² (sin² t + cos² t) dt = (r²/2) ∫₀^{2π} dt = πr².

◀

396 Example
Use the Green’s theorem for computing the area of the region bounded by the x -axis and the arch of
the cycloid:
x = t − sin(t), y = 1 − cos(t), 0 ≤ t ≤ 2π

Solution: ▶

    Area(D) = ∬_D dA = ∮_C −y dx.

Along the x-axis we have y = 0, so we only need to compute the integral over the arch of the cycloid. Note that this parametrization of the arch is a clockwise parametrization, so in the following calculation the answer will be minus the area:

    ∫₀^{2π} (cos t − 1)(1 − cos t) dt = −∫₀^{2π} (1 − 2 cos t + cos² t) dt = −3π.

Hence the area is 3π. ◀

397 Corollary (Surveyor’s Formula)
Let P ⊆ R2 be a (not necessarily convex) polygon whose vertices, ordered counter clockwise, are
(x1 , y1 ), …, (xN , yN ). Then
    area(P) = ((x₁y₂ − x₂y₁) + (x₂y₃ − x₃y₂) + ··· + (x_N y₁ − x₁y_N)) / 2.
Proof. Let P be the set of points belonging to the polygon. We have that

    A = ∬_P dx dy.

Using Corollary 394 we have

    ∬_P dx dy = ∫_{∂P} (x dy)/2 − (y dx)/2.


We can write ∂P = ∪_{i=1}^{n} L(i), where L(i) is the line segment from (xᵢ, yᵢ) to (x_{i+1}, y_{i+1}). With this notation, we may write

    ∫_{∂P} (x dy)/2 − (y dx)/2 = Σ_{i=1}^{n} ∫_{L(i)} (x dy)/2 − (y dx)/2 = ½ Σ_{i=1}^{n} ∫_{L(i)} x dy − y dx.

Parameterizing each line segment, we can write the integrals as

    ½ Σ_{i=1}^{n} ∫₀¹ (xᵢ + (x_{i+1} − xᵢ)t)(y_{i+1} − yᵢ) − (yᵢ + (y_{i+1} − yᵢ)t)(x_{i+1} − xᵢ) dt.

Integrating we get

    ½ Σ_{i=1}^{n} ½ [(xᵢ + x_{i+1})(y_{i+1} − yᵢ) − (yᵢ + y_{i+1})(x_{i+1} − xᵢ)].

Simplifying yields the result:

    area(P) = ½ Σ_{i=1}^{n} (xᵢ y_{i+1} − x_{i+1} yᵢ).
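The Surveyor's Formula translates directly into code. A minimal sketch (function name ours), with the vertices listed counterclockwise:

```python
def polygon_area(vertices):
    # Surveyor's (shoelace) formula: vertices are (x, y) pairs listed
    # counterclockwise; returns the enclosed area.
    n = len(vertices)
    total = 0.0
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]  # wrap back to the first vertex
        total += x1 * y2 - x2 * y1
    return total / 2

print(polygon_area([(0, 0), (1, 0), (1, 1), (0, 1)]))  # 1.0 (unit square)
print(polygon_area([(0, 0), (4, 0), (0, 3)]))          # 6.0 (right triangle)
```

Listing the vertices clockwise instead would negate the result, mirroring the orientation convention of Corollary 394.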

6.10. Vector forms of Green’s Theorem


398 Theorem (Stokes’ Theorem in the Plane)
Let F = Li + M j. Then

    ∮_γ F · dℓ = ∬_Ω ∇ × F · dS.

Proof.

    ∇ × F = (∂M/∂x − ∂L/∂y) k̂.

Over the region Ω we can write dx dy = dS and dS = k̂ dS. Thus, using Green's Theorem:

    ∮_γ F · dℓ = ∬_Ω k̂ · (∇ × F) dS = ∬_Ω ∇ × F · dS.

■

399 Theorem (Divergence Theorem in the Plane)


Let F = M i − Lj. Then

    ∬_R ∇•F dx dy = ∮_γ F · n̂ ds.


Proof.

    ∇•F = ∂M/∂x − ∂L/∂y,

and so Green's theorem can be rewritten as

    ∬_R ∇•F dx dy = ∮_γ F₁ dy − F₂ dx.

Now it can be shown that

    n̂ ds = (dy i − dx j),

where s is arclength along γ, and n̂ is the unit normal to γ. Therefore we can rewrite Green's theorem as

    ∬_R ∇•F dx dy = ∮_γ F · n̂ ds.

■

400 Theorem (Green’s identities in the Plane)


Let ϕ(x, y) and ψ(x, y) be two C² scalar functions defined on the open set Ω ⊂ R². Then

    ∮_γ ϕ (∂ψ/∂n) ds = ∬_Ω [ϕ∇²ψ + (∇ϕ)·(∇ψ)] dx dy

and

    ∮_γ [ϕ (∂ψ/∂n) − ψ (∂ϕ/∂n)] ds = ∬_Ω (ϕ∇²ψ − ψ∇²ϕ) dx dy.

Proof. Apply the divergence theorem

    ∬_Ω ∇•F dx dy = ∮_γ F · n̂ ds

to F = ϕ∇ψ. Since ∇•(ϕ∇ψ) = ϕ∇²ψ + (∇ϕ)·(∇ψ) and (ϕ∇ψ)·n̂ = ϕ ∂ψ/∂n, this yields the first identity:

    ∮_γ ϕ (∂ψ/∂n) ds = ∬_Ω [ϕ∇²ψ + (∇ϕ)·(∇ψ)] dx dy.

Interchanging ϕ and ψ and subtracting the two identities yields the second:

    ∮_γ [ϕ (∂ψ/∂n) − ψ (∂ϕ/∂n)] ds = ∬_Ω (ϕ∇²ψ − ψ∇²ϕ) dx dy.

■
7. Surface Integrals
In this chapter we restrict our study to the case of surfaces in three-dimensional space. Similar
results for manifolds in the n-dimensional space are presented in the chapter 13.

7.1. The Fundamental Vector Product


401 Definition
A parametrized surface is given by a one-to-one transformation r : Ω → Rn , where Ω is a domain
in the plane R2 . This amounts to being given three scalar functions, x = x(u, v), y = y(u, v) and
z = z(u, v) of two variables, u and v, say. The transformation is then given by

r(u, v) = (x(u, v), y(u, v), z(u, v)).

and is called the parametrization of the surface.

Figure 7.1. Parametrization of a surface S in R³

402 Definition

■ A parametrization is said to be regular at the point (u₀, v₀) in Ω if

    ∂u r(u₀, v₀) × ∂v r(u₀, v₀) ≠ 0.

■ The parametrization is regular if it is regular at all points in Ω.

■ A surface that admits a regular parametrization is said to be a regular parametrized surface.

Henceforth, we will assume that all surfaces are regular parametrized surfaces.
Now we consider two curves in S. The first one C1 is given by the vector function

r1 (u) = r(u, v0 ), u ∈ (a, b)

obtained keeping the variable v fixed at v0 . The second curve C2 is given by the vector function

    r₂(v) = r(u₀, v), v ∈ (c, d),

this time keeping the variable u fixed at u₀.


Both curves pass through the point r(u0 , v0 ) :

■ The curve C1 has tangent vector r1′ (u0 ) = ∂u r(u0 , v0 )

■ The curve C2 has tangent vector r2′ (v0 ) = ∂v r(u0 , v0 ).

The cross product n(u₀, v₀) = ∂u r(u₀, v₀) × ∂v r(u₀, v₀), which we have assumed to be different
from zero, is thus perpendicular to both curves at the point r(u0 , v0 ) and can be taken as a normal
vector to the surface at that point.
We record the result as follows:

Figure 7.2. Parametrization of a surface S in R³


403 Definition
If S is a regular surface given by a differentiable function r = r(u, v), then the cross product

n(u, v) = ∂u r × ∂v r

is called the fundamental vector product of the surface.

404 Example
For the plane r(u, v) = ua + vb + c we have ∂u r(u, v) = a, ∂v r(u, v) = b, and therefore n(u, v) = a × b. The vector a × b is normal to the plane.
405 Example
We parametrized the sphere x2 + y 2 + z 2 = a2 by setting

r(u, v) = a cos u sin vi + a sin u sin vj + a cos vk,

with 0 ≤ u ≤ 2π, 0 ≤ v ≤ π. In this case

∂u r(u, v) = −a sin u sin vi + a cos u sin vj

and
∂v r(u, v) = a cos u cos vi + a sin u cos vj − a sin vk.
Thus

    n(u, v) = ∂u r(u, v) × ∂v r(u, v)
            = (−a sin u sin v i + a cos u sin v j) × (a cos u cos v i + a sin u cos v j − a sin v k)
            = −a sin v (a cos u sin v i + a sin u sin v j + a cos v k)
            = −a sin v r(u, v).
As was to be expected, the fundamental vector product of a sphere is parallel to the radius vector
r(u, v).

406 Definition (Boundary)


A surface S can have a boundary ∂S. We are interested in the case where the boundary consists of a piecewise smooth curve or of a union of piecewise smooth curves.
A surface is bounded if it can be contained in a solid sphere of radius R, and is called unbounded otherwise. A bounded surface with no boundary is called closed.

407 Example
The boundary of a hemisphere is a circle (drawn in red).

408 Example
The sphere and the torus are examples of closed surfaces. Both are bounded and without boundaries.

7.2. The Area of a Parametrized Surface


We will now learn how to perform integration over a surface in R3 .
Similar to how we used a parametrization of a curve to define the line integral along the curve,
we will use a parametrization of a surface to define a surface integral. We will use two variables, u
and v, to parametrize a surface S in R3 : x = x(u, v), y = y(u, v), z = z(u, v), for (u, v) in some
region Ω in R2 (see Figure 7.3).

Figure 7.3. Parametrization of a surface S in R³

In this case, the position vector of a point on the surface S is given by the vector-valued function

r(u, v) = x(u, v)i + y(u, v)j + z(u, v)k for (u, v) in Ω.

The parametrization of S can be thought of as “transforming” a region in R2 (in the uv-plane)


into a 2-dimensional surface in R3 . This parametrization of the surface is sometimes called a patch,
based on the idea of “patching” the region Ω onto S in the grid-like manner shown in Figure 7.3.
In fact, those gridlines in Ω lead us to how we will define a surface integral over S. Along the vertical gridlines in Ω, the variable u is constant. So those lines get mapped to curves on S, and the variable u is constant along the position vector r(u, v). Thus, the tangent vector to those curves at a point (u, v) is ∂r/∂v. Similarly, the horizontal gridlines in Ω get mapped to curves on S whose tangent vectors are ∂r/∂u.


Now take a point (u, v) in Ω as, say, the lower left corner of one of the rectangular grid sections
in Ω, as shown in Figure 7.3. Suppose that this rectangle has a small width and height of ∆u and
∆v, respectively. The corner points of that rectangle are (u, v), (u + ∆u, v), (u + ∆u, v + ∆v) and
(u, v + ∆v). So the area of that rectangle is A = ∆u ∆v.
Then that rectangle gets mapped by the parametrization onto some section of the surface S
which, for ∆u and ∆v small enough, will have a surface area (call it dS) that is very close to the
area of the parallelogram which has adjacent sides r(u + ∆u, v) − r(u, v) (corresponding to the
line segment from (u, v) to (u + ∆u, v) in Ω) and r(u, v + ∆v) − r(u, v) (corresponding to the line
segment from (u, v) to (u, v + ∆v) in Ω). But by combining our usual notion of a partial derivative
with that of the derivative of a vector-valued function applied to a function of two variables, we
have
    ∂r/∂u ≈ (r(u + ∆u, v) − r(u, v))/∆u,   and
    ∂r/∂v ≈ (r(u, v + ∆v) − r(u, v))/∆v,

and so the surface area element dS is approximately

    ∥(r(u + ∆u, v) − r(u, v)) × (r(u, v + ∆v) − r(u, v))∥ ≈ ∥(∆u ∂r/∂u) × (∆v ∂r/∂v)∥ = ∥∂r/∂u × ∂r/∂v∥ ∆u ∆v.

Thus, the total surface area S of S is approximately the sum of all the quantities ∥∂r/∂u × ∂r/∂v∥ ∆u ∆v, summed over the rectangles in Ω.
Taking the limit of that sum as the diagonal of the largest rectangle goes to 0 gives

    S = ∬_Ω ∥∂r/∂u × ∂r/∂v∥ du dv.   (7.1)

We will write the double integral on the right using the special notation

    ∬_S dS = ∬_Ω ∥∂r/∂u × ∂r/∂v∥ du dv.   (7.2)

This is a special case of a surface integral over the surface S, where the surface area element dS can
be thought of as 1 dS. Replacing 1 by a general real-valued function f (x, y, z) defined in R3 , we
have the following:


409 Definition
Let S be a surface in R3 parametrized by

x = x(u, v), y = y(u, v), z = z(u, v),

for (u, v) in some region Ω in R2 . Let r(u, v) = x(u, v)i + y(u, v)j + z(u, v)k be the position vector
for any point on S. The surface area S of S is defined as
    S = ∬_S 1 dS = ∬_Ω ∥∂r/∂u × ∂r/∂v∥ du dv   (7.3)

410 Example
A torus T is a surface obtained by revolving a circle of radius a in the yz-plane around the z-axis, where
the circle’s center is at a distance b from the z-axis (0 < a < b), as in Figure 7.4. Find the surface area
of T .

Figure 7.4. (a) The circle (y − b)² + z² = a² in the yz-plane; (b) the torus T.

Solution: ▶
For any point on the circle, the line segment from the center of the circle to that point makes
an angle u with the y-axis in the positive y direction (see Figure 7.4(a)). And as the circle revolves
around the z-axis, the line segment from the origin to the center of that circle sweeps out an angle
v with the positive x-axis (see Figure 7.4(b)). Thus, the torus can be parametrized as:

x = (b + a cos u) cos v , y = (b + a cos u) sin v , z = a sin u , 0 ≤ u ≤ 2π , 0 ≤ v ≤ 2π

So for the position vector

r(u, v) = x(u, v)i + y(u, v)j + z(u, v)k


= (b + a cos u) cos v i + (b + a cos u) sin v j + a sin u k

we see that

    ∂r/∂u = −a sin u cos v i − a sin u sin v j + a cos u k,
    ∂r/∂v = −(b + a cos u) sin v i + (b + a cos u) cos v j + 0k,
and so computing the cross product gives

    ∂r/∂u × ∂r/∂v = −a(b + a cos u) cos v cos u i − a(b + a cos u) sin v cos u j − a(b + a cos u) sin u k,

which has magnitude

    ∥∂r/∂u × ∂r/∂v∥ = a(b + a cos u).
Thus, the surface area of T is

    S = ∬_S 1 dS
      = ∫₀^{2π} ∫₀^{2π} ∥∂r/∂u × ∂r/∂v∥ du dv
      = ∫₀^{2π} ∫₀^{2π} a(b + a cos u) du dv
      = ∫₀^{2π} (abu + a² sin u)|_{u=0}^{u=2π} dv
      = ∫₀^{2π} 2πab dv
      = 4π²ab.

◀
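The result 4π²ab can be checked with a numerical sum over the parameter square (an illustrative sketch of ours; since the area element a(b + a cos u) does not depend on v, the v-integral is done exactly):

```python
import math

def torus_area(a, b, n=400):
    # Midpoint Riemann sum in u of the area element |∂r/∂u × ∂r/∂v| =
    # a(b + a cos u); the inner v-integral over [0, 2π] contributes 2π.
    h = 2 * math.pi / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * h
        element = a * (b + a * math.cos(u))
        total += element * h * (2 * math.pi)
    return total

a, b = 1.0, 3.0
print(torus_area(a, b), 4 * math.pi**2 * a * b)  # both ≈ 118.435
```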



411 Example
(The surface area of a sphere) The function

    r(u, v) = a cos u sin v i + a sin u sin v j + a cos v k,

with (u, v) ranging over the set 0 ≤ u ≤ 2π, 0 ≤ v ≤ π, parametrizes a sphere of radius a. For this parametrization

    n(u, v) = −a sin v r(u, v)   and   ∥n(u, v)∥ = a² |sin v| = a² sin v.

So,

    area of the sphere = ∬ a² sin v du dv = ∫₀^{2π} (∫₀^π a² sin v dv) du = 2πa² ∫₀^π sin v dv = 4πa²,

which is known to be correct.

412 Example (The area of a region of the plane)


If S is a plane region Ω, then S can be parametrized by setting

    r(u, v) = ui + vj, (u, v) ∈ Ω.

Here n(u, v) = ∂u r(u, v) × ∂v r(u, v) = i × j = k and ∥n(u, v)∥ = 1. In this case we reobtain the familiar formula

    A = ∬_Ω du dv.
413 Example (The area of a surface of revolution)


Let S be the surface generated by revolving the graph of a function

y = f (x), x ∈ [a, b]

about the x-axis. We will assume that f is positive and continuously differentiable.
We can parametrize S by setting

r(u, v) = vi + f (v) cos u j + f (v) sin u k

with (u, v) ranging over the set Ω : 0 ≤ u ≤ 2π, a ≤ v ≤ b. In this case




    n(u, v) = ∂u r(u, v) × ∂v r(u, v)
            = (−f(v) sin u j + f(v) cos u k) × (i + f′(v) cos u j + f′(v) sin u k)
            = −f(v)f′(v) i + f(v) cos u j + f(v) sin u k.

Therefore ∥n(u, v)∥ = f(v) √([f′(v)]² + 1) and

    area(S) = ∬_Ω f(v) √([f′(v)]² + 1) du dv
            = ∫₀^{2π} (∫ₐᵇ f(v) √([f′(v)]² + 1) dv) du = 2π ∫ₐᵇ f(v) √([f′(v)]² + 1) dv.

414 Example (Spiral ramp)
One turn of the spiral ramp of Example 5 is the surface

    S : r(u, v) = u cos ωv i + u sin ωv j + bv k

with (u, v) ranging over the set Ω : 0 ≤ u ≤ l, 0 ≤ v ≤ 2π/ω. In this case

    ∂u r(u, v) = cos ωv i + sin ωv j,   ∂v r(u, v) = −ωu sin ωv i + ωu cos ωv j + bk.

Therefore

    n(u, v) = (cos ωv i + sin ωv j) × (−ωu sin ωv i + ωu cos ωv j + bk) = b sin ωv i − b cos ωv j + ωu k

and

    ∥n(u, v)∥ = √(b² + ω²u²).

Thus

    area of S = ∬_Ω √(b² + ω²u²) du dv
              = ∫₀^{2π/ω} (∫₀ˡ √(b² + ω²u²) du) dv = (2π/ω) ∫₀ˡ √(b² + ω²u²) du.

The integral can be evaluated by setting u = (b/ω) tan x.

7.2.1. The Area of a Graph of a Function


Let S be the graph of a function f(x, y):

z = f (x, y), (x, y) ∈ Ω.

We are to show that if f is continuously differentiable, then


    area(S) = ∬_Ω √([fₓ′(x, y)]² + [f_y′(x, y)]² + 1) dx dy.

We can parametrize S by setting

r(u, v) = ui + vj + f (u, v)k, (u, v) ∈ Ω.

We may just as well use x and y and write

r(x, y) = xi + yj + f (x, y)k, (x, y) ∈ Ω.

Clearly
rx (x, y) = i + fx (x, y)k and ry (x, y) = j + fy (x, y)k.

Thus

    n(x, y) = (i + fₓ(x, y)k) × (j + f_y(x, y)k) = −fₓ(x, y) i − f_y(x, y) j + k.

Therefore ∥n(x, y)∥ = √([fₓ(x, y)]² + [f_y(x, y)]² + 1) and the formula is verified.
x y

415 Example
Find the surface area of that part of the parabolic cylinder z = y 2 that lies over the triangle with
vertices (0, 0), (0, 1), (1, 1) in the xy-plane.

Solution: ▶
Here f (x, y) = y 2 so that
fx (x, y) = 0, fy (x, y) = 2y.

The base triangle can be expressed by writing

Ω : 0 ≤ y ≤ 1, 0 ≤ x ≤ y.

The surface has area

    area = ∬_Ω √([fₓ′(x, y)]² + [f_y′(x, y)]² + 1) dx dy
         = ∫₀¹ ∫₀^y √(4y² + 1) dx dy
         = ∫₀¹ y √(4y² + 1) dy = (5√5 − 1)/12.

◀
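A numerical check of this area (our own sketch; since the integrand does not depend on x, the inner integral over 0 ≤ x ≤ y just contributes a factor of y):

```python
import math

def parabolic_cylinder_area(n=100000):
    # Midpoint Riemann sum of ∫₀¹ y √(4y² + 1) dy.
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        y = (k + 0.5) * h
        total += y * math.sqrt(4 * y * y + 1) * h
    return total

print(parabolic_cylinder_area(), (5 * math.sqrt(5) - 1) / 12)  # both ≈ 0.84836
```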

416 Example
Find the surface area of that part of the hyperbolic paraboloid z = xy that lies inside the cylinder
x2 + y 2 = a2 .

Solution: ▶ Let f (x, y) = xy so that

fx (x, y) = y, fy (x, y) = x.

The formula gives

    A = ∬_Ω √(x² + y² + 1) dx dy.

In polar coordinates the base region takes the form

    0 ≤ r ≤ a, 0 ≤ θ ≤ 2π.

Thus we have

    A = ∬_Ω √(r² + 1) r dr dθ = ∫₀^{2π} ∫₀ᵃ √(r² + 1) r dr dθ = (2π/3) [(a² + 1)^{3/2} − 1].

◀
There is an elegant version of this last area formula that is geometrically vivid. We know that the vector

    rₓ(x, y) × r_y(x, y) = −fₓ(x, y)i − f_y(x, y)j + k

is normal to the surface at the point (x, y, f(x, y)). The unit vector in that direction,

    n(x, y) = (−fₓ(x, y)i − f_y(x, y)j + k) / √([fₓ(x, y)]² + [f_y(x, y)]² + 1),

is called the upper unit normal (it is the unit normal with a nonnegative k-component).

Now let γ(x, y) be the angle between n(x, y) and k. Since n(x, y) and k are both unit vectors,

    cos[γ(x, y)] = n(x, y)•k = 1 / √([fₓ′(x, y)]² + [f_y′(x, y)]² + 1).

Taking reciprocals we have

    sec[γ(x, y)] = √([fₓ′(x, y)]² + [f_y′(x, y)]² + 1).

The area formula can therefore be written

    A = ∬_Ω sec[γ(x, y)] dx dy.

7.2.2. Pappus Theorem


417 Theorem
Let γ be a curve in the plane. The area of the surface obtained when γ is revolved around an external
axis is equal to the product of the arc length of γ and the distance traveled by the centroid of γ

Proof. If (x(t), z(t)), a ≤ t ≤ b, parametrizes a smooth plane curve C in the half-plane x > 0, the surface S obtained by revolving C about the z-axis may be parametrized by

    γ(s, t) = (x(t) cos s, x(t) sin s, z(t)),   a ≤ t ≤ b, 0 ≤ s ≤ 2π.

The partial derivatives are

    ∂γ/∂s = (−x(t) sin s, x(t) cos s, 0),
    ∂γ/∂t = (x′(t) cos s, x′(t) sin s, z′(t));

their cross product is

    ∂γ/∂s × ∂γ/∂t = x(t) (z′(t) cos s, z′(t) sin s, −x′(t));

the fundamental vector product has magnitude

    ∥∂γ/∂s × ∂γ/∂t∥ = x(t) √(z′(t)² + x′(t)²).

The surface area of S is

    ∫ₐᵇ ∫₀^{2π} x(t) √(z′(t)² + x′(t)²) ds dt = 2π ∫ₐᵇ x(t) √(z′(t)² + x′(t)²) dt.

If

    ℓ = ∫ₐᵇ √(z′(t)² + x′(t)²) dt

denotes the arc length of C, the area of S becomes

    2π ∫ₐᵇ x(t) √(z′(t)² + x′(t)²) dt = 2π ℓ ((1/ℓ) ∫ₐᵇ x(t) √(z′(t)² + x′(t)²) dt) = ℓ (2π x̄),

the length of C times the circumference of the circle swept by the centroid of C.
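Pappus' theorem can be checked numerically against the torus of Example 410: compute the arc length ℓ and centroid distance x̄ of the generating circle, and compare ℓ · 2π x̄ with 4π²ab (a sketch of ours; function name hypothetical):

```python
import math

def pappus_torus_area(a=1.0, b=3.0, n=20000):
    # Generating circle: x(t) = b + a cos t, z(t) = a sin t, 0 <= t <= 2π.
    h = 2 * math.pi / n
    length = moment = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        speed = math.hypot(-a * math.sin(t), a * math.cos(t))  # |(x', z')| = a
        length += speed * h
        moment += (b + a * math.cos(t)) * speed * h
    xbar = moment / length       # centroid distance from the rotation axis
    return length * 2 * math.pi * xbar

print(pappus_torus_area(), 4 * math.pi**2 * 1.0 * 3.0)  # both ≈ 118.435
```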

7.3. Surface Integrals of Scalar Functions


418 Definition
Let S be a surface in R3 parametrized by

x = x(u, v), y = y(u, v), z = z(u, v),

for (u, v) in some region Ω in R2 . Let r(u, v) = x(u, v)i + y(u, v)j + z(u, v)k be the position vector
for any point on S. And let f : S → R be a continuous function.
The integral of f over S is defined as

    ∬_S f dS = ∬_Ω f(r(u, v)) ∥∂r/∂u × ∂r/∂v∥ du dv.   (7.4)

419 Remark
Other common notations for the surface integral are

    ∬_S f dS,   ∬_Ω f dS,   ∬_Ω f dA.

420 Remark
If the surface cannot be parametrized by a unique function, the integral can be computed by breaking
up S into finitely many pieces which can be parametrized.
The formula above will yield an answer that is independent of the chosen parametrization and how
you break up the surface (if necessary).

421 Example
Evaluate ∬_S z dS, where S is the upper half of a sphere of radius 2.

Solution: ▶ As we already computed (Example 405), n(u, v) = −a sin v r(u, v), so ∥n(u, v)∥ = a² sin v. With a = 2 and z = 2 cos v we get

    ∬_S z dS = ∫₀^{2π} ∫₀^{π/2} (2 cos v)(4 sin v) dv du = 8π.

◀


422 Example
Integrate the function g(x, y, z) = yz over the surface of the wedge in the first octant bounded by the
coordinate planes and the planes x = 2 and y + z = 1.

Solution: ▶ If a surface consists of many different pieces, then a surface integral over such a surface is the sum of the integrals over each of the pieces.

The portions are S₁ : x = 0 for 0 ≤ y ≤ 1, 0 ≤ z ≤ 1 − y; S₂ : x = 2 for 0 ≤ y ≤ 1, 0 ≤ z ≤ 1 − y; S₃ : y = 0 for 0 ≤ x ≤ 2, 0 ≤ z ≤ 1; S₄ : z = 0 for 0 ≤ x ≤ 2, 0 ≤ y ≤ 1; and S₅ : z = 1 − y for 0 ≤ x ≤ 2, 0 ≤ y ≤ 1. Hence, to find ∬_S g dS, we must evaluate all 5 integrals. We compute dS₁ = √(1 + 0 + 0) dz dy, dS₂ = √(1 + 0 + 0) dz dy, dS₃ = √(0 + 1 + 0) dz dx, dS₄ = √(0 + 0 + 1) dy dx, dS₅ = √(0 + (−1)² + 1) dy dx = √2 dy dx, and so

    ∬_{S₁} g dS + ∬_{S₂} g dS + ∬_{S₃} g dS + ∬_{S₄} g dS + ∬_{S₅} g dS
      = ∫₀¹ ∫₀^{1−y} yz dz dy + ∫₀¹ ∫₀^{1−y} yz dz dy + ∫₀² ∫₀¹ (0)z dz dx + ∫₀² ∫₀¹ y·(0) dy dx + ∫₀² ∫₀¹ y(1 − y) √2 dy dx
      = 1/24 + 1/24 + 0 + 0 + √2/3
      = 1/12 + √2/3.

◀


423 Example
The temperature at each point in space on the surface of a sphere of radius 3 is given by T (x, y, z) =
sin(xy + z). Calculate the average temperature.

Solution: ▶
The average temperature on the sphere is given by the surface integral

    AV = (1/σ) ∬_S T dS,

where σ is the surface area of the sphere.
A parametrization of the surface is

r(θ, ϕ) = ⟨3 cos θ sin ϕ, 3 sin θ sin ϕ, 3 cos ϕ⟩

for 0 ≤ θ ≤ 2π and 0 ≤ ϕ ≤ π. We have

T (θ, ϕ) = sin((3 cos θ sin ϕ)(3 sin θ sin ϕ) + 3 cos ϕ),

and the surface area differential is dS = |r_θ × r_ϕ| dϕ dθ = 9 sin ϕ dϕ dθ.

The surface area is

    σ = ∫₀^{2π} ∫₀^π 9 sin ϕ dϕ dθ = 36π,

and the average temperature on the surface is

    AV = (1/σ) ∫₀^{2π} ∫₀^π sin((3 cos θ sin ϕ)(3 sin θ sin ϕ) + 3 cos ϕ) 9 sin ϕ dϕ dθ.

◀
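The double integral above is easy to evaluate numerically; the midpoint-rule sketch below (our own check, not from the text) suggests the average is 0, which also follows from the symmetry of sin(xy + z) under θ ↦ −θ combined with ϕ ↦ π − ϕ:

```python
import math

def average_temperature(n=200):
    # Midpoint double sum of (1/σ) ∬ sin(xy + z) dS on the sphere of radius 3.
    ht, hp = 2 * math.pi / n, math.pi / n
    total = area = 0.0
    for i in range(n):
        th = (i + 0.5) * ht
        for j in range(n):
            ph = (j + 0.5) * hp
            x = 3 * math.cos(th) * math.sin(ph)
            y = 3 * math.sin(th) * math.sin(ph)
            z = 3 * math.cos(ph)
            w = 9 * math.sin(ph) * ht * hp   # dS weight
            total += math.sin(x * y + z) * w
            area += w
    return total / area

print(average_temperature())  # ≈ 0
```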


424 Example
Consider the surface which is the upper hemisphere of radius 3, with density δ(x, y, z) = z². Calculate its surface area, its mass, and its center of mass.

Solution: ▶
A parametrization of the surface is

r(θ, ϕ) = ⟨3 cos θ sin ϕ, 3 sin θ sin ϕ, 3 cos ϕ⟩

for 0 ≤ θ ≤ 2π and 0 ≤ ϕ ≤ π/2. The surface area differential is

dS = |rθ × rϕ |dθdϕ = 9 sin ϕdθdϕ.

The surface area is

    S = ∫₀^{2π} ∫₀^{π/2} 9 sin ϕ dϕ dθ.

If the density is δ(x, y, z) = z 2 , then we have


    ȳ = (∬_S y δ dS)/(∬_S δ dS) = (∫₀^{2π} ∫₀^{π/2} (3 sin θ sin ϕ)(3 cos ϕ)² (9 sin ϕ) dϕ dθ)/(∫₀^{2π} ∫₀^{π/2} (3 cos ϕ)² (9 sin ϕ) dϕ dθ).

◀

7.4. Surface Integrals of Vector Functions


7.4.1. Orientation
Like curves, we can parametrize a surface in two different orientations. The orientation of a curve is given by the unit tangent vector t̂; the orientation of a surface is given by the unit normal vector n̂. Unless we are dealing with an unusual surface, a surface has two sides. We can pick the normal vector to point out one side of the surface, or we can pick it to point out the other side. Our choice of normal vector specifies the orientation of the surface. We call the side of the surface with the normal vector the positive side of the surface.


425 Definition
We say (S, n̂) is an oriented surface if S ⊆ R³ is a C¹ surface, n̂ : S → R³ is a continuous function such that for every x ∈ S, the vector n̂(x) is normal to the surface S at the point x, and ∥n̂(x)∥ = 1.

426 Example
Let S = {x ∈ R³ : ∥x∥ = 1}, and choose n̂(x) = x/∥x∥.

427 Remark
At any point x ∈ S there are exactly two possible choices of n̂(x). An oriented surface simply provides a consistent choice of one of these in a continuous way on the entire surface. Surprisingly, this isn't always possible! The surface of a Möbius strip, for instance, cannot be oriented.

428 Example
If S is the graph of a function, we orient S by choosing n̂ to always be the unit normal vector with a positive z coordinate.

429 Example
If S is a closed surface, then we will typically orient S by letting n̂ to be the outward pointing normal
vector.


Recall that normal vectors to a plane can point in two opposite directions. By an outward unit
normal vector to a surface S, we will mean the unit vector that is normal to S and points to the
“outer” part of the surface.
430 Example
The surface of a Möbius strip, for instance, cannot be oriented.

7.4.2. Flux
If S is some oriented surface with unit normal n̂, then the amount of fluid flowing through S per
unit time is exactly ¨
f•n̂ dS.
S
Note, both f and n̂ above are vector functions, and f•n̂ : S → R is a scalar function. The surface
integral of this was defined in the previous section.


Figure 7.5. The Möbius strip is an example of a surface that is not orientable
Figure 7.6. Möbius Strip II - M.C. Escher

431 Definition
Let (S, n̂) be an oriented surface, and f : S → R3 be a C 1 vector field. The surface integral of f over
S is defined to be ¨
f•n̂ dS.
S

432 Remark
Other common notations for the surface integral are

    ∬_S f•n̂ dS = ∬_S f•dS = ∬_S f•dA.

433 Example
Evaluate the surface integral ∬_S f•dS, where f(x, y, z) = yzi + xzj + xyk and S is the part of the plane x + y + z = 1 with x ≥ 0, y ≥ 0, and z ≥ 0, with the outward unit normal n pointing in the positive z direction.

Solution: ▶


Since the vector v = (1, 1, 1) is normal to the plane x + y + z = 1 (why?), dividing v by its length yields the outward unit normal vector n = (1/√3, 1/√3, 1/√3). We now need to parametrize S. As we can see from the figure, projecting S onto the xy-plane yields the triangular region R = {(x, y) : 0 ≤ x ≤ 1, 0 ≤ y ≤ 1 − x}. Thus, using (u, v) instead of (x, y), we see that

    x = u, y = v, z = 1 − (u + v), for 0 ≤ u ≤ 1, 0 ≤ v ≤ 1 − u

is a parametrization of S over R (since z = 1 − (x + y) on S). So on S,

    f•n = (yz, xz, xy)•(1/√3, 1/√3, 1/√3) = (1/√3)(yz + xz + xy)
        = (1/√3)((x + y)z + xy) = (1/√3)((u + v)(1 − (u + v)) + uv)
        = (1/√3)((u + v) − (u + v)² + uv)

for (u, v) in R, and for r(u, v) = x(u, v)i + y(u, v)j + z(u, v)k = ui + vj + (1 − (u + v))k we have

    ∂r/∂u × ∂r/∂v = (1, 0, −1) × (0, 1, −1) = (1, 1, 1)  ⇒  ∥∂r/∂u × ∂r/∂v∥ = √3.
Thus, integrating over R using vertical slices (e.g. as indicated by the dashed line in Figure 4.4.5) gives

    ∬_S f•dS = ∬_S f•n dS
             = ∬_R (f(x(u, v), y(u, v), z(u, v))•n) ∥∂r/∂u × ∂r/∂v∥ dv du
             = ∫₀¹ ∫₀^{1−u} (1/√3)((u + v) − (u + v)² + uv) √3 dv du
             = ∫₀¹ [ (u + v)²/2 − (u + v)³/3 + uv²/2 ]_{v=0}^{v=1−u} du
             = ∫₀¹ (1/6 + u/2 − 3u²/2 + 5u³/6) du
             = [ u/6 + u²/4 − u³/2 + 5u⁴/24 ]₀¹ = 1/8.

◀
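The flux value 1/8 can be cross-checked numerically (a sketch of ours; as in the computation above, the √3 factors cancel, leaving a plain double integral over the triangle):

```python
def flux_through_triangle(n=1000):
    # Midpoint sum of (u+v) - (u+v)^2 + u*v over the triangle
    # 0 <= u <= 1, 0 <= v <= 1-u.
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * h
        cells = int((1.0 - u) / h)       # v-cells fitting under v = 1 - u
        for j in range(cells):
            v = (j + 0.5) * h
            total += ((u + v) - (u + v) ** 2 + u * v) * h * h
    return total

print(flux_through_triangle())  # ≈ 0.125
```

The jagged triangular boundary makes this sum only first-order accurate, but it already agrees with 1/8 to three decimal places.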


434 Proposition
Let r : Ω → S be a parametrization of the oriented surface (S, n̂). Then either

    n̂ ◦ r = (∂u r × ∂v r)/∥∂u r × ∂v r∥   (7.5)

on all of S, or

    n̂ ◦ r = −(∂u r × ∂v r)/∥∂u r × ∂v r∥   (7.6)

on all of S. Consequently, in the case (7.5) holds, we have

    ∬_S F•n̂ dS = ∬_Ω (F ◦ r)•(∂u r × ∂v r) du dv.   (7.7)

Proof. The vector ∂u r × ∂v r is normal to S and hence parallel to n̂. Thus

    n̂ • (∂u r × ∂v r)/∥∂u r × ∂v r∥

must be a function that only takes on the values ±1. Since this function is also continuous, it must be either identically 1 or identically −1, finishing the proof. ■

435 Example
Gauss’s law states that the total charge enclosed by a surface S is given by

    Q = ϵ₀ ∬_S E•dS,

where ϵ0 the permittivity of free space, and E is the electric field. By convention, the normal vector is
chosen to be pointing outward.
If E(x) = e3 , compute the charge enclosed by the top half of the hemisphere bounded by ∥x∥ = 1
and x3 = 0.

7.5. Kelvin-Stokes Theorem


Given a surface S ⊂ R³ with boundary ∂S, you are free to choose the orientation of S, i.e., the direction of the normal, but you have to orient S and ∂S coherently. This means that if you are an "observer" walking along the boundary of the surface with the normal as your upright direction, you are moving in the positive direction if the interior of S is to your left.
436 Example
Consider the annulus
A := {(x, y, 0) | a2 ≤ x2 + y 2 ≤ b2 }

in the (x, y)-plane, and from the two possible normal unit vectors (0, 0, ±1) choose n̂ := (0, 0, 1). If
you are an "observer" walking along the boundary of the surface with n̂ as your upright direction, this means that

the outer boundary circle of A should be oriented counterclockwise. Staring at the figure you can
convince yourself that the inner boundary circle has to be oriented clockwise to make the interior of
A lie to the left of ∂A. One might write

∂A = ∂Db − ∂Da ,

where Dr is the disk of radius r centered at the origin, and its boundary circle ∂Dr is oriented coun-
terclockwise.





437 Theorem (Kelvin–Stokes Theorem)


Let U ⊆ R³ be a domain, (S, n̂) ⊆ U be a bounded, oriented, piecewise C¹ surface whose boundary is the (piecewise C¹) curve γ. If f : U → R³ is a C¹ vector field, then

    ∬_S (∇ × f)•n̂ dS = ∮_γ f•dℓ.

Here γ is traversed in the counterclockwise direction when viewed by an observer standing with his feet on the surface and head in the direction of the normal vector.

Proof. Let f = f₁i + f₂j + f₃k. Consider

    ∇ × (f₁i) = (∂f₁/∂z) j − (∂f₁/∂y) k.

Then we have

    ∬_S [∇ × (f₁i)] · dS = ∬_S n̂ · [∇ × (f₁i)] dS
                         = ∬_S [ (∂f₁/∂z)(j · n̂) − (∂f₁/∂y)(k · n̂) ] dS.

We prove the theorem in the case S is a graph of a function, i.e., S is parametrized as

r = xi + yj + g(x, y)k

where g(x, y) : Ω → R. In this case the boundary γ of S is the image of C, the boundary curve of Ω.

Let the equation of S be z = g(x, y). Then we have

    n̂ = (−(∂g/∂x)i − (∂g/∂y)j + k) / ((∂g/∂x)² + (∂g/∂y)² + 1)^{1/2}.

Therefore on Ω:

    j · n̂ = −(∂g/∂y)(k · n̂) = −(∂z/∂y)(k · n̂).
Thus

    ∬_S [∇ × (f₁i)] · dS = −∬_S ( ∂f₁/∂y|_{z,x} + ∂f₁/∂z|_{y,x} ∂z/∂y|_x ) (k · n̂) dS.

Using the chain rule for partial derivatives,

    = −∬_S ∂/∂y|_x [f₁(x, y, g(x, y))] (k · n̂) dS.

Then:

    = −∬_Ω ∂/∂y f₁(x, y, g) dx dy
    = ∮_C f₁(x, y, g(x, y)) dx,

with the last line following by Green's theorem. However, on γ we have z = g and

    ∮_C f₁(x, y, g) dx = ∮_γ f₁(x, y, z) dx.

We have therefore established that

    ∬_S [∇ × (f₁i)] · dS = ∮_γ f₁ dx.

In a similar way we can show that

    ∬_S [∇ × (f₂j)] · dS = ∮_γ f₂ dy

and

    ∬_S [∇ × (f₃k)] · dS = ∮_γ f₃ dz,

and so the theorem is proved by adding all three results together. ■

438 Example
Verify Stokes’ Theorem for f(x, y, z) = z i + x j + y k when S is the paraboloid z = x² + y² such that z ≤ 1 (see Figure 7.7).

Solution: ▶ The positive unit normal vector to the surface z = z(x, y) = x² + y² is

    n = (−(∂z/∂x) i − (∂z/∂y) j + k) / √(1 + (∂z/∂x)² + (∂z/∂y)²) = (−2x i − 2y j + k) / √(1 + 4x² + 4y²),

and ∇ × f = (1 − 0) i + (1 − 0) j + (1 − 0) k = i + j + k, so

    (∇ × f)•n = (−2x − 2y + 1) / √(1 + 4x² + 4y²).

Figure 7.7. z = x² + y²

Since S can be parametrized as r(x, y) = x i + y j + (x² + y²) k for (x, y) in the region D = {(x, y) : x² + y² ≤ 1}, then
    ∬_S (∇ × f)•n dS = ∬_D (∇ × f)•n ∥∂r/∂x × ∂r/∂y∥ dx dy
                     = ∬_D [(−2x − 2y + 1)/√(1 + 4x² + 4y²)] √(1 + 4x² + 4y²) dx dy
                     = ∬_D (−2x − 2y + 1) dx dy,  so switching to polar coordinates gives
                     = ∫₀^{2π} ∫₀¹ (−2r cos θ − 2r sin θ + 1) r dr dθ
                     = ∫₀^{2π} ∫₀¹ (−2r² cos θ − 2r² sin θ + r) dr dθ
                     = ∫₀^{2π} [ −(2r³/3) cos θ − (2r³/3) sin θ + r²/2 ]_{r=0}^{r=1} dθ
                     = ∫₀^{2π} (−(2/3) cos θ − (2/3) sin θ + 1/2) dθ
                     = [ −(2/3) sin θ + (2/3) cos θ + θ/2 ]₀^{2π} = π.

The boundary curve C is the unit circle x2 + y 2 = 1 laying in the plane z = 1 (see Figure), which
can be parametrized as x = cos t, y = sin t, z = 1 for 0 ≤ t ≤ 2π. So
∮_C f•dr = ∫₀^{2π} ( (1)(−sin t) + (cos t)(cos t) + (sin t)(0) ) dt
= ∫₀^{2π} ( −sin t + (1 + cos 2t)/2 ) dt    (here we used cos² t = (1 + cos 2t)/2)
= [ cos t + t/2 + (sin 2t)/4 ]₀^{2π} = π .
So we see that ∮_C f•dr = ∬_S (∇ × f)•n dS, as predicted by Stokes' Theorem. ◀
The line integral in the preceding example was far simpler to calculate than the surface integral,
but this will not always be the case.
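As a quick sanity check (not part of the original text), both sides of Example 438 can be reproduced symbolically, assuming the sympy library is available:

```python
import sympy as sp

r, th, t = sp.symbols('r theta t', real=True)

# Surface side: the integrand reduces to (-2x - 2y + 1) over the unit
# disk D; in polar coordinates this is the double integral below.
surface = sp.integrate((-2*r*sp.cos(th) - 2*r*sp.sin(th) + 1) * r,
                       (r, 0, 1), (th, 0, 2*sp.pi))

# Line side: f = (z, x, y) on the circle x = cos t, y = sin t, z = 1.
x, y, z = sp.cos(t), sp.sin(t), sp.Integer(1)
line = sp.integrate(z*sp.diff(x, t) + x*sp.diff(y, t) + y*sp.diff(z, t),
                    (t, 0, 2*sp.pi))

print(surface, line)  # pi pi
```

Both integrals come out to π, as computed by hand above.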
439 Example
Let S be the section of a sphere of radius a with 0 ≤ θ ≤ α. In spherical coordinates,

dS = a2 sin θer dθ dφ.

Let F = (0, xz, 0). Then ∇ × F = (−x, 0, z). Then

∬_S ∇ × F•dS = πa³ cos α sin² α .

Our boundary C = ∂S is

r(φ) = a(sin α cos φ, sin α sin φ, cos α).

The right hand side of Stokes’ is
ˆ ˆ 2π
F dℓ =
• a sin α cos φ a cos α a sin α cos φ dφ
C 0 | {z } | {z } | {z }
x z dy
ˆ 2π
= a3 sin2 α cos α cos2 φ dφ
0
= πa3 sin2 α cos α.

So they agree.
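A symbolic check of Example 439 (an added sketch, assuming sympy is available) confirms that both sides equal πa³ sin²α cos α:

```python
import sympy as sp

a, alpha, phi, theta = sp.symbols('a alpha phi theta', positive=True)

# Line side: F·dl = x z dy with x = a sin(alpha) cos(phi),
# z = a cos(alpha), dy = a sin(alpha) cos(phi) dphi.
line = sp.integrate(a**3*sp.sin(alpha)**2*sp.cos(alpha)*sp.cos(phi)**2,
                    (phi, 0, 2*sp.pi))

# Surface side: curl F = (-x, 0, z) dotted with the radial unit normal,
# with area element a^2 sin(theta) dtheta dphi on the spherical cap.
x = a*sp.sin(theta)*sp.cos(phi)
z = a*sp.cos(theta)
integrand = (-x*sp.sin(theta)*sp.cos(phi) + z*sp.cos(theta)) * a**2*sp.sin(theta)
surf = sp.integrate(integrand, (theta, 0, alpha), (phi, 0, 2*sp.pi))

target = sp.pi*a**3*sp.sin(alpha)**2*sp.cos(alpha)
print(sp.simplify(line - target), sp.simplify(surf - target))  # 0 0
```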
440 Remark
The rule determining the direction of traversal of γ is often called the right hand rule. Namely, if you put your right hand on the surface with thumb aligned with n̂, then γ is traversed in the direction pointed to by your index finger.
441 Remark
If the surface S has holes in it, then (as we did with Green's theorem) we orient each of the holes clockwise, and the exterior boundary counterclockwise following the right hand rule. Now Kelvin–Stokes theorem becomes

∬_S ∇ × f•n̂ dS = ∮_{∂S} f•dℓ,

where the line integral over ∂S is defined to be the sum of the line integrals over each component of the boundary.
442 Remark
If S is contained in the x, y plane and is oriented by choosing n̂ = e₃, then Kelvin–Stokes theorem reduces to Green's theorem.

Kelvin–Stokes theorem allows us to quickly see how the curl of a vector field measures the infinitesimal circulation.
443 Proposition
Suppose a small, rigid paddle wheel of radius a is placed in a fluid with center at x₀ and rotation axis parallel to n̂. Let v : R³ → R³ be the vector field describing the velocity of the ambient fluid. If ω is the angular speed of rotation of the paddle wheel about the axis n̂, then

lim_{a→0} ω = (∇ × v(x₀)•n̂) / 2 .
Proof. Let S be the surface of a disk with center x₀, radius a, and face perpendicular to n̂, and γ = ∂S. (Here S represents the face of the paddle wheel, and γ the boundary.) The angular speed ω will be such that

∮_γ (v − aωτ̂)•dℓ = 0,

where τ̂ is a unit vector tangent to γ, pointing in the direction of traversal. Consequently

ω = (1/(2πa²)) ∮_γ v•dℓ = (1/(2πa²)) ∬_S ∇ × v•n̂ dS → (∇ × v(x₀)•n̂)/2 as a → 0. ■

444 Example
Let S be the elliptic paraboloid z = x²/4 + y²/9 for z ≤ 1, and let C be its boundary curve. Calculate ∮_C f•dr for f(x, y, z) = (9xz + 2y)i + (2x + y²)j + (−2y² + 2z)k, where C is traversed counterclockwise.

Solution: ▶ The surface is similar to the one in Example 438, except now the boundary curve C is the ellipse x²/4 + y²/9 = 1 lying in the plane z = 1. In this case, using Stokes' Theorem is easier than computing the line integral directly. As in Example 438, at each point (x, y, z(x, y)) on the surface z = z(x, y) = x²/4 + y²/9 the vector

n = ( −(∂z/∂x) i − (∂z/∂y) j + k ) / √(1 + (∂z/∂x)² + (∂z/∂y)²) = ( −(x/2) i − (2y/9) j + k ) / √(1 + x²/4 + 4y²/81) ,

is a positive unit normal vector to S. And calculating the curl of f gives

∇ × f = (−4y − 0)i + (9x − 0)j + (2 − 2)k = − 4y i + 9x j + 0 k ,

so

(∇ × f)•n = [ (−4y)(−x/2) + (9x)(−2y/9) + (0)(1) ] / √(1 + x²/4 + 4y²/81) = (2xy − 2xy + 0) / √(1 + x²/4 + 4y²/81) = 0 ,
and so by Stokes’ Theorem
˛ ¨ ¨
f•dr = (∇ × f )•n dS = 0 dS = 0 .
C
S S
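The curl computation in Example 444 is easy to verify symbolically; the following snippet (an added check, not from the book, assuming sympy) also confirms that (∇ × f)·n vanishes identically:

```python
import sympy as sp

x, y, z = sp.symbols('x y z', real=True)
f = sp.Matrix([9*x*z + 2*y, 2*x + y**2, -2*y**2 + 2*z])

# Componentwise curl in Cartesian coordinates.
curl = sp.Matrix([
    sp.diff(f[2], y) - sp.diff(f[1], z),
    sp.diff(f[0], z) - sp.diff(f[2], x),
    sp.diff(f[1], x) - sp.diff(f[0], y),
])

# Unnormalized upward normal to z = x^2/4 + y^2/9.
n = sp.Matrix([-x/2, -2*y/sp.Integer(9), 1])
print(list(curl), sp.expand(curl.dot(n)))  # [-4*y, 9*x, 0] 0
```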

7.6. Gauss Theorem


445 Theorem (Divergence Theorem or Gauss Theorem)
Let U ⊆ R3 be a bounded domain whose boundary is a (piecewise) C 1 surface denoted by ∂U . If
f : U → R3 is a C 1 vector field, then
∭_U (∇•f) dV = ∯_{∂U} f•n̂ dS,

where n̂ is the outward pointing unit normal vector.

446 Remark
Similar to our convention with line integrals, we denote surface integrals over closed surfaces with the symbol ∯.

447 Remark
Let B_R = B(x₀, R) and observe

lim_{R→0} (1/volume(B_R)) ∯_{∂B_R} f•n̂ dS = lim_{R→0} (1/volume(B_R)) ∭_{B_R} ∇•f dV = ∇•f(x₀),

which justifies our intuition that ∇•f measures the outward flux of a vector field.
448 Remark
If V ⊆ R², U = V × [a, b] is a cylinder, and f : R³ → R³ is a vector field that doesn't depend on x₃, then the divergence theorem reduces to Green's theorem.

Proof. [Proof of the Divergence Theorem] Suppose first that the domain U is the unit cube (0, 1)³ ⊆ R³. In this case

∭_U ∇•f dV = ∭_U (∂₁f₁ + ∂₂f₂ + ∂₃f₃) dV.

Taking the first term on the right, the fundamental theorem of calculus gives

∭_U ∂₁f₁ dV = ∫₀¹ ∫₀¹ ( f₁(1, x₂, x₃) − f₁(0, x₂, x₃) ) dx₂ dx₃ = ∫_R f•n̂ dS + ∫_L f•n̂ dS,

where L and R are the left and right faces of the cube respectively. The ∂₂f₂ and ∂₃f₃ terms give the surface integrals over the other four faces. This proves the divergence theorem in the case that the domain is the unit cube.

449 Example
Evaluate ∬_S f•dS, where f(x, y, z) = x i + y j + z k and S is the unit sphere x² + y² + z² = 1.

Solution: ▶ We see that div f = 1 + 1 + 1 = 3, so, with B the unit ball enclosed by S,

∬_S f•dS = ∭_B div f dV = ∭_B 3 dV = 3 ∭_B 1 dV = 3 vol(B) = 3 · (4π(1)³/3) = 4π .
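A quick symbolic check of Example 449 (an added sketch, assuming sympy): on the unit sphere f·n̂ = (x, y, z)·(x, y, z) = 1, so the flux equals the surface area.

```python
import sympy as sp

theta, phi = sp.symbols('theta phi')

# Flux = integral of f·n̂ = 1 over the sphere; dS = sin(phi) dphi dtheta.
flux = sp.integrate(sp.sin(phi), (phi, 0, sp.pi), (theta, 0, 2*sp.pi))
print(flux)  # 4*pi
```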


450 Example
Consider a hemisphere.


V is a solid hemisphere
x2 + y 2 + z 2 ≤ a2 , z ≥ 0,

and ∂V = S1 + S2 , the hemisphere and the disc at the bottom.


Take F = (0, 0, z + a), so that ∇•F = 1. Then

∭_V ∇•F dV = (2/3) πa³ ,

the volume of the hemisphere.


On S₁, the outward pointing fundamental vector is

n(u, v) = a sin v r(u, v) = a sin v (x, y, z).

Then

F•n(u, v) = a z(z + a) sin v = a³ cos v (cos v + 1) sin v .

Then

∬_{S₁} F•dS = a³ ∫₀^{2π} du ∫₀^{π/2} sin v (cos² v + cos v) dv
= 2πa³ [ −(1/3) cos³ v − (1/2) cos² v ]₀^{π/2}
= (5/3) πa³ .

On S₂, dS = n dS = −(0, 0, 1) dS. Then F•dS = −a dS. So

∬_{S₂} F•dS = −πa³ .

So

∬_{S₁} F•dS + ∬_{S₂} F•dS = (5/3 − 1) πa³ = (2/3) πa³ ,

in accordance with Gauss' theorem.

7.6.1. Gauss’s Law For Inverse-Square Fields


451 Proposition (Gauss’s gravitational law)
Let g : R3 → R3 be the gravitational field of a mass distribution (i.e. g(x) is the force experienced by
a point mass located at x). If S is any closed C 1 surface, then
∮_S g•n̂ dS = −4πGM,

where M is the mass enclosed by the region S. Here G is the gravitational constant, and n̂ is the
outward pointing unit normal vector.

Proof. The core of the proof is the following calculation. Given a fixed y ∈ R³, define the vector field f by

f(x) = (x − y) / ∥x − y∥³ .

The vector field −Gm f(x) represents the gravitational field of a mass m located at y. Then

∮_S f•n̂ dS = 4π if y is in the region enclosed by S, and 0 otherwise.    (7.8)

For simplicity, we subsequently assume y = 0.


To prove (7.8), observe
∇•f = 0,

when x ̸= 0. Let U be the region enclosed by S. If 0 ∉ U, then the divergence theorem applies in the region U and we have

∮_S f•n̂ dS = ∭_U ∇•f dV = 0.

On the other hand, if 0 ∈ U, the divergence theorem will not directly apply, since f ∉ C¹(U). To circumvent this, let ϵ > 0 and U′ = U − B(0, ϵ), and S′ be the boundary of U′. Since 0 ∉ U′, f is C¹ on all of U′ and the divergence theorem gives

0 = ∭_{U′} ∇•f dV = ∫_{∂U′} f•n̂ dS,

and hence

∮_S f•n̂ dS = −∮_{∂B(0,ϵ)} f•n̂ dS = ∮_{∂B(0,ϵ)} (1/ϵ²) dS = 4π,

as claimed. (Above the normal vector on ∂B(0, ϵ) points outward with respect to the domain U ′ ,
and inward with respect to the ball B(0, ϵ).)
Now, in the general case, suppose the mass distribution has density ρ. Then the gravitational
field g(x) will be the superposition of the gravitational fields at x due to a point mass of size ρ(y) dV
placed at y. Namely, this means
g(x) = −G ∫_{R³} ρ(y)(x − y) / ∥x − y∥³ dV(y).

Now using Fubini’s theorem,


¨ ˆ ˆ
x−y
g(x) n̂(x) dS(x) = −G
• ρ(y) •n̂(x) dS(x) dV (y)
S y∈R3 x∈S ∥x − y∥3
ˆ
= −4πG ρ(y) dV (y) = −4πGM,
y∈U

where the second last equality followed from (7.8). ■

452 Example
A system of electric charges has a charge density ρ(x, y, z) and produces an electrostatic field E(x, y, z) at points (x, y, z) in space. Gauss' Law states that

∯_S E•dS = 4π ∭_B ρ dV

for any closed surface S which encloses the charges, with B being the solid region enclosed by S. Show that ∇•E = 4πρ. This is one of Maxwell's Equations.¹

Solution: ▶ By the Divergence Theorem, we have

∭_B ∇•E dV = ∯_S E•dS
= 4π ∭_B ρ dV by Gauss' Law, so combining the integrals gives
∭_B (∇•E − 4πρ) dV = 0 , so
∇•E − 4πρ = 0 since S and hence B was arbitrary, so
∇•E = 4πρ .

7.7. Applications of Surface Integrals


7.7.1. Conservative and Potential Forces
We’ve seen before that any potential force must be conservative. We demonstrate the converse
here.

453 Theorem
Let U ⊆ R3 be a simply connected domain, and f : U → R3 be a C 1 vector field. Then f is a
conservative force, if and only if f is a potential force, if and only if ∇ × f = 0.

Proof. Clearly, if f is a potential force, equality of mixed partials shows ∇ × f = 0. Suppose now
∇ × f = 0. By Kelvin–Stokes theorem
∮_γ f•dℓ = ∬_S ∇ × f•n̂ dS = 0,

and so f is conservative. Thus to finish the proof of the theorem, we only need to show that a con-
servative force is a potential force. We do this next.
1
In Gaussian (or CGS) units.

Suppose f is a conservative force. Fix x₀ ∈ U and define

V(x) = −∫_γ f•dℓ,

where γ is any path joining x₀ and x that is completely contained in U. Since f is conservative, we have seen before that the line integral above will not depend on the path itself but only on the endpoints. Now let h > 0, and let γ be a path that joins x₀ to a, and is a straight line between a and a + h e₁. Then

−∂₁V(a) = lim_{h→0} (1/h) ∫₀^h f₁(a + t e₁) dt = f₁(a).

The other partials can be computed similarly to obtain f = −∇V, concluding the proof. ■

7.7.2. Conservation laws


454 Definition (Conservation equation)
Suppose we are interested in a quantity Q. Let ρ(r, t) be the amount of stuff per unit volume and
j(r, t) be the flow rate of the quantity (eg if Q is charge, j is the current density).
The conservation equation is
∂ρ
+ ∇•j = 0.
∂t

This is stronger than the claim that the total amount of Q in the universe is fixed. It says that Q
cannot just disappear here and appear elsewhere. It must continuously flow out.
In particular, let V be a fixed time-independent volume with boundary S = ∂V . Then
Q(t) = ∭_V ρ(r, t) dV

Then the rate of change of the amount of Q in V is

dQ/dt = ∭_V ∂ρ/∂t dV = −∭_V ∇•j dV = −∯_S j•dS

by the divergence theorem. So this states that the rate of change of the quantity Q in V is minus the flux of Q out through the surface, i.e. Q cannot just disappear but must smoothly flow out.
In particular, if V is the whole universe (i.e. R³), and j → 0 sufficiently rapidly as |r| → ∞, then we calculate the total amount of Q in the universe by taking V to be a solid sphere of radius R, and take the limit as R → ∞. Then the surface integral → 0, and the equation states that

dQ/dt = 0.

455 Example
If ρ(r, t) is the charge density (ie. ρδV is the amount of charge in a small volume δV ), then Q(t) is the
total charge in V . j(r, t) is the electric current density. So j•dS is the charge flowing through δS per
unit time.

456 Example
Let j = ρu with u being the velocity field. Then (ρu δt)•δS is equal to the mass of fluid crossing δS in time δt. So

dQ/dt = −∯_S j•dS
does indeed imply the conservation of mass. The conservation equation in this case is

∂ρ
+ ∇•(ρu) = 0
∂t
For the case where ρ is constant and uniform (ie. independent of r and t), we get that ∇•u = 0. We
say that the fluid is incompressible.

7.8. Helmholtz Decomposition


The Helmholtz theorem, also known as the Fundamental Theorem of Vector Calculus, states that
a vector field F which vanishes at the boundaries can be written as the sum of two terms, one of
which is irrotational and the other, solenoidal.
Roughly:

“A vector field is uniquely defined (within an additive constant) by specifying its diver-
gence and its curl”.

457 Theorem (Helmholtz Decomposition for R3 )


If F is a C² vector function on R³ that vanishes faster than 1/r as r → ∞, then F can be decomposed into a curl-free component and a divergence-free component:

F = −∇Φ + ∇ × A.

Proof. We will demonstrate first the case when F satisfies

F = −∇2 Z (7.9)

for some vector field Z


Now, consider the following identity for an arbitrary vector field Z(r) :

− ∇2 Z = −∇(∇ · Z) + ∇ × ∇ × Z (7.10)

then it follows that


F = −∇U + ∇ × W (7.11)

with

U = ∇ · Z    (7.12)

and

W = ∇ × Z    (7.13)

Eq.(7.11) is Helmholtz’s theorem, as ∇U is irrotational and ∇ × W is solenoidal.
Now we will generalize for all vector field: if V vanishes at infinity fast enough, for, then, the
equation
∇2 Z = −V , (7.14)

which is Poisson’s equation, has always the solution


ˆ
1 V(r′ )
Z(r) = d3 r′ . (7.15)
4π |r − r′ |

It is now a simple matter to prove, from Eq. (7.11), that V is determined by its divergence and curl. Taking, in fact, the divergence of Eq. (7.11), we have:

div(V) = −∇²U    (7.16)

which is, again, Poisson's equation, and so determines U as

U(r) = (1/4π) ∫ ∇′·V(r′)/|r − r′| d³r′    (7.17)

Take now the curl of Eq. (7.11). We have

∇ × V = ∇ × ∇ × W = ∇(∇·W) − ∇²W    (7.18)

Now, ∇·W = 0, as W = ∇ × Z, so another Poisson equation determines W. Using U and W so determined in Eq. (7.11) proves the decomposition. ■

458 Theorem (Helmholtz Decomposition for Bounded Domains)


Let F be a C² vector function on a bounded domain V ⊂ R³ and let S be the surface that encloses the domain V. Then F can be decomposed into a curl-free component and a divergence-free component:

F = −∇Φ + ∇ × A,

where

Φ(r) = (1/4π) ∭_V ∇′•F(r′)/|r − r′| dV′ − (1/4π) ∯_S n̂′•F(r′)/|r − r′| dS′

A(r) = (1/4π) ∭_V ∇′ × F(r′)/|r − r′| dV′ − (1/4π) ∯_S n̂′ × F(r′)/|r − r′| dS′

and ∇′ is the gradient with respect to r′, not r.

7.9. Green’s Identities


459 Theorem
Let ϕ and ψ be two scalar fields with continuous second derivatives. Then
■ ∬_S ϕ (∂ψ/∂n) dS = ∭_U [ϕ∇²ψ + (∇ϕ)·(∇ψ)] dV    (Green's first identity)

■ ∬_S [ ϕ (∂ψ/∂n) − ψ (∂ϕ/∂n) ] dS = ∭_U (ϕ∇²ψ − ψ∇²ϕ) dV    (Green's second identity)

Proof.
Consider the quantity

F = ϕ∇ψ .

It follows that

div F = ϕ∇²ψ + (∇ϕ)·(∇ψ) ,    n̂ · F = ϕ ∂ψ/∂n .

Applying the divergence theorem we obtain

∬_S ϕ (∂ψ/∂n) dS = ∭_U [ϕ∇²ψ + (∇ϕ)·(∇ψ)] dV

which is known as Green's first identity. Interchanging ϕ and ψ we have

∬_S ψ (∂ϕ/∂n) dS = ∭_U [ψ∇²ϕ + (∇ψ)·(∇ϕ)] dV

Subtracting the second equation from the first we obtain

∬_S [ ϕ (∂ψ/∂n) − ψ (∂ϕ/∂n) ] dS = ∭_U (ϕ∇²ψ − ψ∇²ϕ) dV

which is known as Green's second identity. ■


Part III.

Tensor Calculus

8. Curvilinear Coordinates
8.1. Curvilinear Coordinates
The location of a point P in space can be represented in many different ways. Three systems commonly used in applications are the rectangular Cartesian coordinates (x, y, z), the cylindrical polar coordinates (r, θ, z) and the spherical coordinates (r, ϕ, θ). The last two are the best examples of orthogonal curvilinear coordinate systems (u₁, u₂, u₃).

460 Definition
A function u : U → V is called a (differentiable) coordinate change if

■ u is bijective

■ u is differentiable

■ Du is invertible at every point.

Figure 8.1. Coordinate System

In the tridimensional case, suppose that (x, y, z) are expressible as single-valued functions of the variables (u₁, u₂, u₃). Suppose also that (u₁, u₂, u₃) can be expressed as single-valued functions of (x, y, z).
Through each point P : (a, b, c) of the space we have three surfaces: u₁ = c₁, u₂ = c₂ and u₃ = c₃, where the constants cᵢ are given by cᵢ = uᵢ(a, b, c).
If, say, u₂ and u₃ are held fixed and u₁ is made to vary, a path results. Such a path is called a u₁ curve. u₂ and u₃ curves can be constructed in an analogous manner.

[Figure: the coordinate surfaces u₁ = const, u₂ = const, u₃ = const, which intersect pairwise in the coordinate curves u₁, u₂, u₃.]

The system (u1 , u2 , u3 ) is said to be a curvilinear coordinate system.


461 Example
The parabolic cylindrical coordinates are defined in terms of the Cartesian coordinates by:

x = στ
1Ä 2 ä
y= τ − σ2
2
z=z

The constant surfaces are the plane

z = z₁

and the parabolic cylinders

2y = x²/σ² − σ²    and    2y = −x²/τ² + τ² .

Coordinates I
The surfaces u2 = u2 (P ) and u3 = u3 (P ) intersect in a curve, along which only u1 varies.

[Figure: the uᵢ curve through P, obtained by intersecting the surfaces u₂ = u₂(P) and u₃ = u₃(P), with position vector r(P) and unit tangent vectors ê₁, ê₂, ê₃.]

Let ê₁ be the unit vector tangential to the curve at P. Let ê₂, ê₃ be unit vectors tangential to curves along which only u₂, u₃ vary. Clearly

êᵢ = (∂r/∂uᵢ) / |∂r/∂uᵢ| .

And if we define hᵢ = |∂r/∂uᵢ| then

∂r/∂uᵢ = hᵢ êᵢ .

The quantities hi are often known as the length scales for the coordinate system.

462 Example (Versors in Spherical Coordinates)


In spherical coordinates r = (r cos θ sin ϕ, r sin θ sin ϕ, r cos ϕ), so:

e_r = (∂r/∂r) / |∂r/∂r| = (cos θ sin ϕ, sin θ sin ϕ, cos ϕ) / 1

e_r = (cos θ sin ϕ, sin θ sin ϕ, cos ϕ)

e_θ = (∂r/∂θ) / |∂r/∂θ| = (−r sin θ sin ϕ, r cos θ sin ϕ, 0) / (r sin ϕ)

e_θ = (−sin θ, cos θ, 0)

e_ϕ = (∂r/∂ϕ) / |∂r/∂ϕ| = (r cos θ cos ϕ, r sin θ cos ϕ, −r sin ϕ) / r

e_ϕ = (cos θ cos ϕ, sin θ cos ϕ, −sin ϕ)

Coordinates II
Let (ê¹, ê², ê³) be unit vectors at P in the directions normal to u₁ = u₁(P), u₂ = u₂(P), u₃ = u₃(P) respectively, such that u₁, u₂, u₃ increase in the directions ê¹, ê², ê³. Clearly we must have

êⁱ = ∇uᵢ / |∇uᵢ| .

463 Definition
If (ê¹, ê², ê³) are mutually orthogonal, the coordinate system is said to be an orthogonal curvilinear coordinate system.

464 Theorem
The following statements are equivalent:

1. (ê₁, ê₂, ê₃) are mutually orthogonal;

2. (ê¹, ê², ê³) are mutually orthogonal;

3. êᵢ = êⁱ = (∂r/∂uᵢ)/|∂r/∂uᵢ| = ∇uᵢ/|∇uᵢ| for i = 1, 2, 3.

So we associate to a general curvilinear coordinate system two sets of basis vectors at every point:

{ê₁, ê₂, ê₃}

is the covariant basis, and

{ê¹, ê², ê³}

is the contravariant basis.

Figure 8.2. Covariant and Contravariant Basis

Note the following important equality:

êⁱ · êⱼ = δⁱⱼ .

465 Example
Cylindrical coordinates (r, θ, z):

x = r cos θ        r = √(x² + y²)
y = r sin θ        θ = tan⁻¹(y/x)
z = z              z = z

where 0 ≤ θ ≤ π if y ≥ 0 and π < θ < 2π if y < 0.

For cylindrical coordinates (r, θ, z), and constants r0 , θ0 and z0 , we see from Figure 8.3 that the
surface r = r0 is a cylinder of radius r0 centered along the z-axis, the surface θ = θ0 is a half-plane
emanating from the z-axis, and the surface z = z0 is a plane parallel to the xy-plane.

(a) r = r₀    (b) θ = θ₀    (c) z = z₀

Figure 8.3. Cylindrical coordinate surfaces

The unit vectors r̂, θ̂, k̂ at any point P are perpendicular to the surfaces r = constant, θ = constant, z = constant through P, in the directions of increasing r, θ, z. Note that the directions of the unit vectors r̂, θ̂ vary from point to point, unlike the corresponding Cartesian unit vectors.

8.2. Line and Volume Elements in Orthogonal Coordinate Systems
466 Definition (Line Element)
Since r = r(u₁, u₂, u₃), the line element dr is given by

dr = (∂r/∂u₁) du₁ + (∂r/∂u₂) du₂ + (∂r/∂u₃) du₃
   = h₁ du₁ ê₁ + h₂ du₂ ê₂ + h₃ du₃ ê₃

If the system is orthogonal, then it follows that

(ds)² = (dr) · (dr) = h₁²(du₁)² + h₂²(du₂)² + h₃²(du₃)²

In what follows we will assume we have an orthogonal system, so that

êᵢ = êⁱ = (∂r/∂uᵢ)/|∂r/∂uᵢ| = ∇uᵢ/|∇uᵢ|    for i = 1, 2, 3.

In particular, line elements along curves of intersection of uᵢ surfaces have lengths h₁du₁, h₂du₂, h₃du₃ respectively.

467 Definition (Volume Element)


In R3 , the volume element is given by

dV = dx dy dz.

In a coordinate system x = x(u₁, u₂, u₃), y = y(u₁, u₂, u₃), z = z(u₁, u₂, u₃), the volume element is:

dV = |∂(x, y, z)/∂(u₁, u₂, u₃)| du₁ du₂ du₃ .

468 Proposition
In an orthogonal system we have

dV = (h1 du1 )(h2 du2 )(h3 du3 )


= h1 h2 h3 du1 du2 du3

In this section we find the expressions of the line and volume elements in some classic orthogonal coordinate systems.
(i) Cartesian Coordinates (x, y, z)

dV = dxdydz
dr = dxî + dy ĵ + dz k̂
(ds)2 = (dr) · (dr) = (dx)2 + (dy)2 + (dz)2

(ii) Cylindrical polar coordinates (r, θ, z) The coordinates are related to Cartesian by

x = r cos θ, y = r sin θ, z = z

We have that (ds)² = (dx)² + (dy)² + (dz)², but we can write

dx = (∂x/∂r) dr + (∂x/∂θ) dθ + (∂x/∂z) dz = (cos θ) dr − (r sin θ) dθ

and

dy = (∂y/∂r) dr + (∂y/∂θ) dθ + (∂y/∂z) dz = (sin θ) dr + (r cos θ) dθ

Therefore we have

(ds)² = (dx)² + (dy)² + (dz)² = (dr)² + r²(dθ)² + (dz)²

Thus we see that for this coordinate system, the length scales are

h₁ = 1, h₂ = r, h₃ = 1

and the element of volume is

dV = r dr dθ dz

(iii) Spherical polar coordinates (r, ϕ, θ). In this case the relationship between the coordinates is

x = r sin ϕ cos θ;    y = r sin ϕ sin θ;    z = r cos ϕ

Again, we have that (ds)² = (dx)² + (dy)² + (dz)², and we know that

dx = (∂x/∂r) dr + (∂x/∂θ) dθ + (∂x/∂ϕ) dϕ = (sin ϕ cos θ) dr − (r sin ϕ sin θ) dθ + (r cos ϕ cos θ) dϕ

and

dy = (∂y/∂r) dr + (∂y/∂θ) dθ + (∂y/∂ϕ) dϕ = (sin ϕ sin θ) dr + (r sin ϕ cos θ) dθ + (r cos ϕ sin θ) dϕ

together with

dz = (∂z/∂r) dr + (∂z/∂θ) dθ + (∂z/∂ϕ) dϕ = (cos ϕ) dr − (r sin ϕ) dϕ

Therefore in this case, we have (after some work)

(ds)² = (dx)² + (dy)² + (dz)² = (dr)² + r²(dϕ)² + r² sin² ϕ (dθ)²

Thus the length scales are

h₁ = 1, h₂ = r, h₃ = r sin ϕ

and the volume element is

dV = r² sin ϕ dr dϕ dθ
469 Example
Find the volume and surface area of a sphere of radius a, and also find the surface area of a cap of the sphere that subtends an angle α at the centre of the sphere.
dV = r² sin ϕ dr dϕ dθ

and an element of surface of a sphere of radius a is (by removing h₁ du₁ = dr):

dS = a² sin ϕ dϕ dθ

∴ the total volume is

∫_V dV = ∫_{θ=0}^{2π} ∫_{ϕ=0}^{π} ∫_{r=0}^{a} r² sin ϕ dr dϕ dθ = 2π [−cos ϕ]₀^π ∫₀^a r² dr = 4πa³/3

The surface area is

∫_S dS = ∫_{θ=0}^{2π} ∫_{ϕ=0}^{π} a² sin ϕ dϕ dθ = 2πa² [−cos ϕ]₀^π = 4πa²

The surface area of the cap is

∫_{θ=0}^{2π} ∫_{ϕ=0}^{α} a² sin ϕ dϕ dθ = 2πa² [−cos ϕ]₀^α = 2πa² (1 − cos α)
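The three integrals in Example 469 can be reproduced symbolically (an added sketch, assuming sympy):

```python
import sympy as sp

a, alpha = sp.symbols('a alpha', positive=True)
r, phi, theta = sp.symbols('r phi theta')

# dV = r^2 sin(phi) dr dphi dtheta, dS = a^2 sin(phi) dphi dtheta.
vol = sp.integrate(r**2*sp.sin(phi), (r, 0, a), (phi, 0, sp.pi), (theta, 0, 2*sp.pi))
area = sp.integrate(a**2*sp.sin(phi), (phi, 0, sp.pi), (theta, 0, 2*sp.pi))
cap = sp.integrate(a**2*sp.sin(phi), (phi, 0, alpha), (theta, 0, 2*sp.pi))

print(vol, area, sp.simplify(cap))
```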

8.3. Gradient in Orthogonal Curvilinear Coordinates


Let

∇Φ = λ₁ ê₁ + λ₂ ê₂ + λ₃ ê₃

in a general coordinate system, where λ₁, λ₂, λ₃ are to be found. Recall that the element of length is given by

dr = h₁ du₁ ê₁ + h₂ du₂ ê₂ + h₃ du₃ ê₃

Now

dΦ = (∂Φ/∂u₁) du₁ + (∂Φ/∂u₂) du₂ + (∂Φ/∂u₃) du₃
   = (∂Φ/∂x) dx + (∂Φ/∂y) dy + (∂Φ/∂z) dz
   = (∇Φ) · dr

But, using our expressions for ∇Φ and dr above:

(∇Φ) · dr = λ1 h1 du1 + λ2 h2 du2 + λ3 h3 du3

and so we see that

hᵢ λᵢ = ∂Φ/∂uᵢ    (i = 1, 2, 3)

Thus we have the result that

470 Proposition (Gradient in Orthogonal Curvilinear Coordinates)

∇Φ = (ê₁/h₁) ∂Φ/∂u₁ + (ê₂/h₂) ∂Φ/∂u₂ + (ê₃/h₃) ∂Φ/∂u₃

This proposition allows us to write down ∇ easily for other coordinate systems.
(i) Cylindrical polars (r, θ, z). Recall that h₁ = 1, h₂ = r, h₃ = 1. Thus

∇ = r̂ ∂/∂r + (θ̂/r) ∂/∂θ + ẑ ∂/∂z

(ii) Spherical polars (r, ϕ, θ). We have h₁ = 1, h₂ = r, h₃ = r sin ϕ, and so

∇ = r̂ ∂/∂r + (ϕ̂/r) ∂/∂ϕ + (θ̂/(r sin ϕ)) ∂/∂θ

471 Example
Calculate the gradient of the function expressed in cylindrical coordinates as

f(r, θ, z) = r sin θ + z.

Solution: ▶

∇f = r̂ (∂f/∂r) + (θ̂/r)(∂f/∂θ) + ẑ (∂f/∂z)    (8.1)
   = r̂ sin θ + θ̂ cos θ + ẑ    (8.2)
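The answer in Example 471 can be cross-checked against the Cartesian gradient (an added sketch, assuming sympy): in Cartesian coordinates f = r sin θ + z is just y + z, so ∇f = (0, 1, 1), and rebuilding the claimed cylindrical answer in the Cartesian frame should give the same vector.

```python
import sympy as sp

theta = sp.symbols('theta', real=True)

grad_cart = sp.Matrix([0, 1, 1])  # grad(y + z) in Cartesian coordinates

# Cylindrical frame vectors expressed in Cartesian components:
rhat = sp.Matrix([sp.cos(theta), sp.sin(theta), 0])
thetahat = sp.Matrix([-sp.sin(theta), sp.cos(theta), 0])
zhat = sp.Matrix([0, 0, 1])

# The claimed answer r̂ sinθ + θ̂ cosθ + ẑ, rebuilt in Cartesian:
recon = sp.sin(theta)*rhat + sp.cos(theta)*thetahat + zhat
print(sp.simplify(recon - grad_cart).T)  # zero vector
```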

8.3.1. Expressions for Unit Vectors


From the expression for ∇ we have just derived, it is easy to see that

êᵢ = hᵢ ∇uᵢ

Alternatively, since the unit vectors are orthogonal, if we know two unit vectors we can find the third from the relation

ê₁ = ê₂ × ê₃ = h₂h₃ (∇u₂ × ∇u₃)

and similarly for the other components, by permuting in a cyclic fashion.


8.4. Divergence in Orthogonal Curvilinear Coordinates
Suppose we have a vector field

A = A₁ ê₁ + A₂ ê₂ + A₃ ê₃

Then consider

∇ · (A₁ ê₁) = ∇ · [A₁ h₂h₃ (∇u₂ × ∇u₃)]
            = A₁ h₂h₃ ∇ · (∇u₂ × ∇u₃) + ∇(A₁ h₂h₃) · ê₁/(h₂h₃)
using the results established just above. Also we know that

∇ · (B × C) = C · curl B − B · curl C

and so it follows that

∇ · (∇u2 × ∇u3 ) = (∇u3 ) · curl(∇u2 ) − (∇u2 ) · curl(∇u3 ) = 0

since the curl of a gradient is always zero. Thus we are left with

∇ · (A₁ ê₁) = ∇(A₁ h₂h₃) · ê₁/(h₂h₃) = (1/(h₁h₂h₃)) ∂(A₁ h₂h₃)/∂u₁
We can proceed in a similar fashion for the other components, and establish that

472 Proposition (Divergence in Orthogonal Curvilinear Coordinates)

∇ · A = (1/(h₁h₂h₃)) [ ∂(h₂h₃A₁)/∂u₁ + ∂(h₃h₁A₂)/∂u₂ + ∂(h₁h₂A₃)/∂u₃ ]

Using the above proposition it is now easy to write down the divergence in other coordinate systems.
(i) Cylindrical polars (r, θ, z). Since h₁ = 1, h₂ = r, h₃ = 1, using the above formula we have:

∇ · A = (1/r) [ ∂(rA₁)/∂r + ∂A₂/∂θ + ∂(rA₃)/∂z ]
      = ∂A₁/∂r + A₁/r + (1/r) ∂A₂/∂θ + ∂A₃/∂z

(ii) Spherical polars (r, ϕ, θ). We have h₁ = 1, h₂ = r, h₃ = r sin ϕ. So

∇ · A = (1/(r² sin ϕ)) [ ∂(r² sin ϕ A₁)/∂r + ∂(r sin ϕ A₂)/∂ϕ + ∂(rA₃)/∂θ ]

473 Example
Calculate the divergence of the vector field expressed in spherical coordinates (r, ϕ, θ) as f = r̂ + ϕ̂ + θ̂.

Solution: ▶

∇ · f = (1/(r² sin ϕ)) [ ∂(r² sin ϕ)/∂r + ∂(r sin ϕ)/∂ϕ + ∂r/∂θ ]    (8.3)
      = (1/(r² sin ϕ)) [ 2r sin ϕ + r cos ϕ ]    (8.4)
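The same computation, done symbolically (an added sketch, assuming sympy), with the three partial derivatives spelled out:

```python
import sympy as sp

r, phi, theta = sp.symbols('r phi theta', positive=True)
A1, A2, A3 = 1, 1, 1   # the field r̂ + ϕ̂ + θ̂ has unit components

# Divergence formula with h1 = 1, h2 = r, h3 = r sin(phi):
div = (sp.diff(r**2*sp.sin(phi)*A1, r)
       + sp.diff(r*sp.sin(phi)*A2, phi)
       + sp.diff(r*A3, theta)) / (r**2*sp.sin(phi))

print(sp.simplify(div))
```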

8.5. Curl in Orthogonal Curvilinear Coordinates


We will calculate the curl of the first component of A:

∇ × (A₁ ê₁) = ∇ × (A₁ h₁ ∇u₁)
            = A₁ h₁ ∇ × (∇u₁) + ∇(A₁ h₁) × ∇u₁
            = 0 + ∇(A₁ h₁) × ∇u₁
            = [ (ê₁/h₁) ∂(A₁h₁)/∂u₁ + (ê₂/h₂) ∂(A₁h₁)/∂u₂ + (ê₃/h₃) ∂(A₁h₁)/∂u₃ ] × ê₁/h₁
            = (ê₂/(h₃h₁)) ∂(h₁A₁)/∂u₃ − (ê₃/(h₁h₂)) ∂(h₁A₁)/∂u₂

(since ê₁ × ê₁ = 0, ê₂ × ê₁ = −ê₃, ê₃ × ê₁ = ê₂).
We can obviously find curl(A₂ ê₂) and curl(A₃ ê₃) in a similar way. These can be shown to be

∇ × (A₂ ê₂) = (ê₃/(h₁h₂)) ∂(h₂A₂)/∂u₁ − (ê₁/(h₂h₃)) ∂(h₂A₂)/∂u₃

∇ × (A₃ ê₃) = (ê₁/(h₂h₃)) ∂(h₃A₃)/∂u₂ − (ê₂/(h₃h₁)) ∂(h₃A₃)/∂u₁
h3 h2 ∂u2 h3 h1 ∂u1

Adding these three contributions together, we find we can write this in the form of a determinant
as

474 Proposition (Curl in Orthogonal Curvilinear Coordinates)

                       │ h₁ê₁    h₂ê₂    h₃ê₃   │
curl A = (1/(h₁h₂h₃))  │ ∂/∂u₁   ∂/∂u₂   ∂/∂u₃  │
                       │ h₁A₁    h₂A₂    h₃A₃   │

It’s then straightforward to write down the expressions of the curl in various orthogonal coordi-
nate systems.

(i) Cylindrical polars

                │ r̂      rθ̂     ẑ     │
curl A = (1/r)  │ ∂/∂r    ∂/∂θ   ∂/∂z   │
                │ A₁      rA₂    A₃     │

(ii) Spherical polars

                          │ r̂      rϕ̂     r sin ϕ θ̂  │
curl A = (1/(r² sin ϕ))   │ ∂/∂r    ∂/∂ϕ   ∂/∂θ        │
                          │ A₁      rA₂    r sin ϕ A₃  │

8.6. The Laplacian in Orthogonal Curvilinear Coordinates
From the formulae already established for the gradient and the divergence, we can see that

475 Proposition (The Laplacian in Orthogonal Curvilinear Coordinates)

∇²Φ = ∇ · (∇Φ)
    = (1/(h₁h₂h₃)) [ ∂/∂u₁ ( (h₂h₃/h₁) ∂Φ/∂u₁ ) + ∂/∂u₂ ( (h₃h₁/h₂) ∂Φ/∂u₂ ) + ∂/∂u₃ ( (h₁h₂/h₃) ∂Φ/∂u₃ ) ]

(i) Cylindrical polars (r, θ, z)

∇²Φ = (1/r) [ ∂/∂r ( r ∂Φ/∂r ) + ∂/∂θ ( (1/r) ∂Φ/∂θ ) + ∂/∂z ( r ∂Φ/∂z ) ]
    = ∂²Φ/∂r² + (1/r) ∂Φ/∂r + (1/r²) ∂²Φ/∂θ² + ∂²Φ/∂z²

(ii) Spherical polars (r, ϕ, θ)

∇²Φ = (1/(r² sin ϕ)) [ ∂/∂r ( r² sin ϕ ∂Φ/∂r ) + ∂/∂ϕ ( sin ϕ ∂Φ/∂ϕ ) + ∂/∂θ ( (1/sin ϕ) ∂Φ/∂θ ) ]
    = ∂²Φ/∂r² + (2/r) ∂Φ/∂r + (cot ϕ/r²) ∂Φ/∂ϕ + (1/r²) ∂²Φ/∂ϕ² + (1/(r² sin² ϕ)) ∂²Φ/∂θ²

476 Example
In Example ?? we showed that ∇∥r∥² = 2 r and ∆∥r∥² = 6, where r(x, y, z) = x i + y j + z k in Cartesian coordinates. Verify that we get the same answers if we switch to spherical coordinates.
Solution: Since ∥r∥² = x² + y² + z² = ρ² in spherical coordinates, let F(ρ, θ, ϕ) = ρ² (so that F(ρ, θ, ϕ) = ∥r∥²). The gradient of F in spherical coordinates is

∇F = (∂F/∂ρ) e_ρ + (1/(ρ sin ϕ))(∂F/∂θ) e_θ + (1/ρ)(∂F/∂ϕ) e_ϕ
   = 2ρ e_ρ + (1/(ρ sin ϕ))(0) e_θ + (1/ρ)(0) e_ϕ
   = 2ρ e_ρ = 2ρ (r/∥r∥) , as we showed earlier, so
   = 2ρ (r/ρ) = 2 r , as expected. And the Laplacian is

∆F = (1/ρ²) ∂/∂ρ ( ρ² ∂F/∂ρ ) + (1/(ρ² sin² ϕ)) ∂²F/∂θ² + (1/(ρ² sin ϕ)) ∂/∂ϕ ( sin ϕ ∂F/∂ϕ )
   = (1/ρ²) ∂/∂ρ ( ρ² 2ρ ) + (1/(ρ² sin² ϕ))(0) + (1/(ρ² sin ϕ)) ∂/∂ϕ ( sin ϕ (0) )
   = (1/ρ²) ∂/∂ρ ( 2ρ³ ) + 0 + 0
   = (1/ρ²)(6ρ²) = 6 , as expected.
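The Laplacian computation in Example 476 reduces to a one-liner symbolically (an added sketch, assuming sympy):

```python
import sympy as sp

rho, theta, phi = sp.symbols('rho theta phi', positive=True)
F = rho**2

# Spherical Laplacian with the radial, azimuthal and polar terms:
lap = (sp.diff(rho**2*sp.diff(F, rho), rho)/rho**2
       + sp.diff(F, theta, 2)/(rho**2*sp.sin(phi)**2)
       + sp.diff(sp.sin(phi)*sp.diff(F, phi), phi)/(rho**2*sp.sin(phi)))

print(sp.simplify(lap))  # 6
```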

∇Φ = (ê₁/h₁) ∂Φ/∂u₁ + (ê₂/h₂) ∂Φ/∂u₂ + (ê₃/h₃) ∂Φ/∂u₃

∇ · A = (1/(h₁h₂h₃)) [ ∂(h₂h₃A₁)/∂u₁ + ∂(h₃h₁A₂)/∂u₂ + ∂(h₁h₂A₃)/∂u₃ ]

                       │ h₁ê₁    h₂ê₂    h₃ê₃   │
curl A = (1/(h₁h₂h₃))  │ ∂/∂u₁   ∂/∂u₂   ∂/∂u₃  │
                       │ h₁A₁    h₂A₂    h₃A₃   │

∇²Φ = (1/(h₁h₂h₃)) [ ∂/∂u₁ ( (h₂h₃/h₁) ∂Φ/∂u₁ ) + ∂/∂u₂ ( (h₃h₁/h₂) ∂Φ/∂u₂ ) + ∂/∂u₃ ( (h₁h₂/h₃) ∂Φ/∂u₃ ) ]

Table 8.1. Vector operators in orthogonal curvilinear coordinates u₁, u₂, u₃.

8.7. Examples of Orthogonal Coordinates


Spherical Polar Coordinates (r, ϕ, θ) ∈ [0, ∞) × [0, π] × [0, 2π)

x = r sin ϕ cos θ (8.5)


y = r sin ϕ sin θ (8.6)
z = r cos ϕ (8.7)

The scale factors for the Spherical Polar Coordinates are:

h1 = 1 (8.8)
h2 = r (8.9)
h3 = r sin ϕ (8.10)

Cylindrical Polar Coordinates (r, θ, z) ∈ [0, ∞) × [0, 2π) × (−∞, ∞)

x = r cos θ (8.11)
y = r sin θ (8.12)
z=z (8.13)

The scale factors for the Cylindrical Polar Coordinates are:

h1 = h3 = 1 (8.14)
h2 = r (8.15)

Parabolic Cylindrical Coordinates (u, v, z) ∈ (−∞, ∞) × [0, ∞) × (−∞, ∞)

1
x = (u2 − v 2 ) (8.16)
2
y = uv (8.17)
z=z (8.18)

The scale factors for the Parabolic Cylindrical Coordinates are:


h₁ = h₂ = √(u² + v²)    (8.19)
h3 = 1 (8.20)
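The scale factors of any of these systems follow mechanically from hᵢ = |∂r/∂uᵢ|. As an added sketch (assuming sympy), here is that computation for the parabolic cylindrical map, reproducing h₁ = h₂ = √(u² + v²), h₃ = 1:

```python
import sympy as sp

u, v, z = sp.symbols('u v z', positive=True)

# Parabolic cylindrical map (x, y, z) as functions of (u, v, z):
X = sp.Matrix([(u**2 - v**2)/2, u*v, z])

# h_i = |dX/dq_i| for each coordinate q_i:
h = [sp.simplify(sp.sqrt(sum(c**2 for c in X.diff(q)))) for q in (u, v, z)]
print(h)  # [sqrt(u**2 + v**2), sqrt(u**2 + v**2), 1]
```

Swapping in any other map from this section gives that system's scale factors the same way.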

Paraboloidal Coordinates (u, v, θ) ∈ [0, ∞) × [0, ∞) × [0, 2π)

x = uv cos θ (8.21)
y = uv sin θ (8.22)
1
z = (u2 − v 2 ) (8.23)
2

The scale factors for the Paraboloidal Coordinates are:



h1 = h2 = u2 + v 2 (8.24)
h3 = uv (8.25)

Elliptic Cylindrical Coordinates (u, v, z) ∈ [0, ∞) × [0, 2π) × (−∞, ∞)

x = a cosh u cos v (8.26)


y = a sinh u sin v (8.27)
z=z (8.28)

The scale factors for the Elliptic Cylindrical Coordinates are:


h₁ = h₂ = a √(sinh² u + sin² v)    (8.29)
h3 = 1 (8.30)

Prolate Spheroidal Coordinates (ξ, η, θ) ∈ [0, ∞) × [0, π] × [0, 2π)

x = a sinh ξ sin η cos θ (8.31)


y = a sinh ξ sin η sin θ (8.32)
z = a cosh ξ cos η (8.33)

The scale factors for the Prolate Spheroidal Coordinates are:


h₁ = h₂ = a √(sinh² ξ + sin² η)    (8.34)
h3 = a sinh ξ sin η (8.35)

Oblate Spheroidal Coordinates (ξ, η, θ) ∈ [0, ∞) × [−π/2, π/2] × [0, 2π)

x = a cosh ξ cos η cos θ (8.36)


y = a cosh ξ cos η sin θ (8.37)
z = a sinh ξ sin η (8.38)

The scale factors for the Oblate Spheroidal Coordinates are:


h₁ = h₂ = a √(sinh² ξ + sin² η)    (8.39)
h3 = a cosh ξ cos η (8.40)

Ellipsoidal Coordinates (λ, µ, ν)    (8.41)

λ < c² < b² < a²,    (8.42)
c² < µ < b² < a²,    (8.43)
c² < b² < ν < a²,    (8.44)

x²/(a² − qᵢ) + y²/(b² − qᵢ) + z²/(c² − qᵢ) = 1,  where (q₁, q₂, q₃) = (λ, µ, ν).

The scale factors for the Ellipsoidal Coordinates are:

hᵢ = (1/2) √[ (qⱼ − qᵢ)(qₖ − qᵢ) / ((a² − qᵢ)(b² − qᵢ)(c² − qᵢ)) ]

Bipolar Coordinates (u, v, z) ∈ [0, 2π) × (−∞, ∞) × (−∞, ∞)

x = a sinh v / (cosh v − cos u)    (8.45)
y = a sin u / (cosh v − cos u)    (8.46)
z = z    (8.47)

The scale factors for the Bipolar Coordinates are:

h₁ = h₂ = a / (cosh v − cos u)    (8.48)
h₃ = 1    (8.49)

Toroidal Coordinates (u, v, θ) ∈ (−π, π] × [0, ∞) × [0, 2π)

x = a sinh v cos θ / (cosh v − cos u)    (8.50)
y = a sinh v sin θ / (cosh v − cos u)    (8.51)
z = a sin u / (cosh v − cos u)    (8.52)

The scale factors for the Toroidal Coordinates are:

h₁ = h₂ = a / (cosh v − cos u)    (8.53)
h₃ = a sinh v / (cosh v − cos u)    (8.54)

Conical Coordinates (λ, µ, ν)    (8.55)

ν² < b² < µ² < a²    (8.56)
λ ∈ [0, ∞)    (8.57)

x = λµν/(ab)    (8.58)

y = (λ/a) √[ (µ² − a²)(ν² − a²) / (a² − b²) ]    (8.59)

z = (λ/b) √[ (µ² − b²)(ν² − b²) / (a² − b²) ]    (8.60)

The scale factors for the Conical Coordinates are:

h₁ = 1    (8.61)

h₂² = λ²(µ² − ν²) / [ (µ² − a²)(b² − µ²) ]    (8.62)

h₃² = λ²(µ² − ν²) / [ (ν² − a²)(ν² − b²) ]    (8.63)


Exercises
A
For Exercises 1-6, find the Laplacian of the function f (x, y, z) in Cartesian coordinates.

1. f (x, y, z) = x + y + z    2. f (x, y, z) = x⁵    3. f (x, y, z) = (x² + y² + z²)^{3/2}

4. f (x, y, z) = e^{x+y+z}    5. f (x, y, z) = x³ + y³ + z³    6. f (x, y, z) = e^{−x²−y²−z²}

7. Find the Laplacian of the function in Exercise 3 in spherical coordinates.

8. Find the Laplacian of the function in Exercise 6 in spherical coordinates.


z
9. Let f (x, y, z) = in Cartesian coordinates. Find ∇f in cylindrical coordinates.
x2 + y 2
10. For f(r, θ, z) = r er + z sin θ eθ + rz ez in cylindrical coordinates, find div f and curl f.

11. For f(ρ, θ, ϕ) = eρ + ρ cos θ eθ + ρ eϕ in spherical coordinates, find div f and curl f.

B
For Exercises 12-23, prove the given formula (r = ∥r∥ is the length of the position vector field
r(x, y, z) = x i + y j + z k).

12. ∇ (1/r) = −r/r3 13. ∆ (1/r) = 0 14. ∇•(r/r3 ) = 0 15. ∇ (ln r) = r/r2

16. div (F + G) = div F + div G 17. curl (F + G) = curl F + curl G

18. div (f F) = f div F + F•∇f 19. div (F×G) = G•curl F − F•curl G

20. div (∇f ×∇g) = 0 21. curl (f F) = f curl F + (∇f )×F

22. curl (curl F) = ∇(div F) − ∆ F 23. ∆ (f g) = f ∆ g + g ∆ f + 2(∇f •∇g)
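Before proving identities like Exercise 14, it can be reassuring to spot-check them numerically. The sketch below (a hypothetical helper, not part of the exercises) estimates div(r/r³) with central differences at a point away from the origin, where the identity predicts zero.

```python
import numpy as np

def F(p):
    # The field of Exercise 14: F = r / r^3
    r = np.linalg.norm(p)
    return p / r**3

def divergence(f, p, h=1e-5):
    # Central-difference estimate of div f at the point p
    total = 0.0
    for i in range(3):
        dp = np.zeros(3)
        dp[i] = h
        total += (f(p + dp)[i] - f(p - dp)[i]) / (2 * h)
    return total

d = divergence(F, np.array([1.0, 2.0, 2.0]))   # should be ~0 away from the origin
```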

9. Tensors
In this chapter we define a tensor as a multilinear map.

9.1. Linear Functional


477 Definition
A function f : Rn → R is a linear functional if it satisfies the

linearity condition: f (au + bv) = af (u) + bf (v) ,

or in words: “the value on a linear combination is the linear combination of the values.”

A linear functional is also called a linear function, 1-form, or covector.


This easily extends to linear combinations with any number of terms; for example

f (v) = f ( Σ_{i=1}^N v^i e_i ) = Σ_{i=1}^N v^i f (e_i )

where the coefficients fi ≡ f (ei ) are the “components” of a covector with respect to the basis {ei },
or in our shorthand notation

f (v) = f (v^i e_i )   (express in terms of basis)
= v^i f (e_i )   (linearity)
= v^i f_i .   (definition of components)

A covector f is entirely determined by its values fi on the basis vectors, namely its components
with respect to that basis.
Our linearity condition is usually presented separately as a pair of separate conditions on the two
operations which define a vector space:

■ sum rule: the value of the function on a sum of vectors is the sum of the values, f (u + v) =
f (u) + f (v),

■ scalar multiple rule: the value of the function on a scalar multiple of a vector is the scalar
times the value on the vector, f (cu) = cf (u).
478 Example
In the usual notation on R3 , with Cartesian coordinates (x1 , x2 , x3 ) = (x, y, z), linear functions are
of the form f (x, y, z) = ax + by + cz.
479 Example
If we fix a vector n, the function n∗ : Rn → R defined by

n∗ (v) := n · v

is a linear functional.
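In coordinates the covector n∗ of the example above is just a dot product, and the linearity condition can be verified directly; a minimal sketch with an arbitrary sample n:

```python
import numpy as np

n = np.array([1.0, -2.0, 3.0])            # the fixed vector (arbitrary sample)
n_star = lambda v: float(n @ v)           # n*(v) := n . v

u = np.array([1.0, 0.0, 2.0])
v = np.array([0.0, 5.0, -1.0])
a, b = 2.0, -3.0

lhs = n_star(a*u + b*v)                   # value on a linear combination
rhs = a*n_star(u) + b*n_star(v)           # linear combination of the values
```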

9.2. Dual Spaces


480 Definition
We define the dual space of Rn , denoted as (Rn )∗ , as the set of all real-valued linear functions on
Rn ;

(Rn )∗ = {f : f : Rn → R is a linear function }

The dual space (Rn )∗ is itself an n-dimensional vector space, with linear combinations of covectors defined in the usual way one takes linear combinations of any functions, i.e., in terms of values

covector addition: (af + bg)(v) ≡ af (v) + bg(v) , f, g covectors, v a vector .

481 Theorem
Suppose that vectors in Rn are represented as column vectors

x = (x_1 , . . . , x_n )⊤ .

For each row vector

[a] = [a_1 . . . a_n ]

there is a linear functional f defined by

f (x) = [a_1 . . . a_n ] (x_1 , . . . , x_n )⊤ = a_1 x_1 + · · · + a_n x_n ,

and each linear functional on Rn can be expressed in this form.

482 Remark
As a consequence of the previous theorem we can view vectors as column matrices and covectors as row matrices, and the action of a covector on a vector as the matrix product of the row vector with the column vector.

Rn = { (x_1 , . . . , x_n )⊤ : x_i ∈ R }   (9.1)

(Rn )∗ = { [a_1 . . . a_n ] : a_i ∈ R }   (9.2)

483 Remark
closure of the dual space
Show that the dual space is closed under this linear combination operation. In other words, show
that if f, g are linear functions, satisfying our linearity condition, then a f + b g also satisfies the lin-
earity condition for linear functions:

(a f + b g)(c1 u + c2 v) = c1 (a f + b g)(u) + c2 (a f + b g)(v) .

9.2.1. Dual Basis


Let us produce a basis for (Rn )∗ , called the dual basis {e^i } or “the basis dual to {e_i },” by defining n covectors which satisfy the following “duality relations”

e^i (e_j ) = δ^i_j ≡ { 1 if i = j ;  0 if i ≠ j },

where the symbol δji is called the “Kronecker delta,” nothing more than a symbol for the compo-
nents of the n × n identity matrix I = (δji ). We then extend them to any other vector by linearity.
Then by linearity

e^i (v) = e^i (v^j e_j )   (expand in basis)
= v^j e^i (e_j )   (linearity)
= v^j δ^i_j   (duality)
= v^i   (Kronecker delta definition)

where the last equality follows since for each i, only the term with j = i in the sum over j con-
tributes to the sum. Alternatively matrix multiplication of a vector on the left by the identity matrix
δji v j = v i does not change the vector. Thus the calculation shows that the i-th dual basis covector
ei picks out the i-th component v i of a vector v.
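For a concrete non-standard basis of Rⁿ, the dual basis can be computed by inverting the matrix whose columns are the e_j: the rows of the inverse are the components of the covectors e^i, which is exactly the condition e^i(e_j) = δ^i_j. A sketch (the basis below is an arbitrary example):

```python
import numpy as np

# Columns of E are the basis vectors e_1, e_2, e_3 of R^3 (arbitrary choice)
E = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [0.0, 0.0, 1.0]])

# Rows of E^{-1} are the components of the dual basis covectors e^1, e^2, e^3
E_dual = np.linalg.inv(E)

# Duality relations: pairings[i, j] = e^i(e_j) should be the identity matrix
pairings = E_dual @ E
```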


484 Theorem
The n covectors {ei } form a basis of (Rn )∗ .

Proof.

1. spanning condition:
Using linearity and the definition fi = f (ei ), this calculation shows that every linear function
f can be written as a linear combination of these covectors

f (v) = f (v^i e_i )   (expand in basis)
= v^i f (e_i )   (linearity)
= v^i f_i   (definition of components)
= v^i δ^j_i f_j   (Kronecker delta definition)
= v^i e^j (e_i ) f_j   (dual basis definition)
= (f_j e^j )(v^i e_i )   (linearity)
= (f_j e^j )(v) .   (expansion in basis, in reverse)

Thus f and fi ei have the same value on every v ∈ Rn so they are the same function: f = fi ei ,
where fi = f (ei ) are the “components” of f with respect to the basis {ei } of (Rn )∗ also said
to be the “components” of f with respect to the basis {ei } of Rn already introduced above.
The index i on fi labels the components of f , while the index i on ei labels the dual basis
covectors.

2. linear independence:
Suppose fi ei = 0 is the zero covector. Then evaluating each side of this equation on ej and
using linearity

0 = 0(e_j )   (zero scalar = value of zero linear function)
= (f_i e^i )(e_j )   (expand zero covector in basis)
= f_i e^i (e_j )   (definition of linear combination function value)
= f_i δ^i_j   (duality)
= f_j   (Kronecker delta definition)

forces all the coefficients of ei to vanish, i.e., no nontrivial linear combination of these covec-
tors exists which equals the zero covector so these covectors are linearly independent. Thus
(Rn )∗ is also an n-dimensional vector space.

9.3. Bilinear Forms


A bilinear form is a function that is linear in each argument separately:

1. B(u + v, w) = B(u, w) + B(v, w) and B(λu, v) = λB(u, v)

2. B(u, v + w) = B(u, v) + B(u, w) and B(u, λv) = λB(u, v)

Let B(v, w) be a bilinear form and let e_1 , . . . , e_n be a basis in this space. The numbers B_ij determined by the formula

B_ij = B(e_i , e_j )   (9.3)

are called the coordinates or the components of the form B in the basis e_1 , . . . , e_n . The numbers (9.3) are written in the form of a matrix

B = [ B_11 · · · B_1n ; . . . ; B_n1 · · · B_nn ] ,   (9.4)

which is called the matrix of the bilinear form B in the basis e_1 , . . . , e_n . For the element B_ij in the matrix (9.4) the first index i specifies the row number, the second index j specifies the column number.
The matrix of a symmetric bilinear form B is also symmetric: B_ij = B_ji . Let v^1 , . . . , v^n and w^1 , . . . , w^n be the coordinates of two vectors v and w in the basis e_1 , . . . , e_n . Then the value B(v, w) of the bilinear form is calculated by the following formula:

B(v, w) = Σ_{i=1}^n Σ_{j=1}^n B_ij v^i w^j ,   (9.5)
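Formula (9.5) is a matrix sandwich: with v and w written as coordinate columns, B(v, w) = vᵀ B w. A quick sketch with arbitrary sample data:

```python
import numpy as np

B = np.array([[2.0, 1.0],
              [1.0, 3.0]])       # components B_ij = B(e_i, e_j) (arbitrary sample)
v = np.array([1.0, -1.0])
w = np.array([4.0, 2.0])

# (9.5): B(v, w) = sum_ij B_ij v^i w^j, i.e. the matrix product v^T B w
value = v @ B @ w
by_sum = sum(B[i, j] * v[i] * w[j] for i in range(2) for j in range(2))
```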

9.4. Tensor
Let V = Rn and let V ∗ = Rn∗ denote its dual space. We let

V^k = V × · · · × V   (k times).

485 Definition
A k-multilinear map on V is a function T : V k → R which is linear in each variable.

T(v1 , . . . , λv + w, vi+1 , . . . , vk ) = λT(v1 , . . . , v, vi+1 , . . . , vk ) + T(v1 , . . . , w, vi+1 , . . . , vk )

In other words, given (k − 1) vectors v1 , v2 , . . . , vi−1 , vi+1 , . . . , vk , the map Ti : V → R defined


by Ti (v) = T(v1 , v2 , . . . , v, vi+1 , . . . , vk ) is linear.

486 Definition

■ A tensor of type (r, s) on V is a multilinear map T : V r × (V ∗ )s → R.

■ A covariant k-tensor on V is a multilinear map T : V k → R

■ A contravariant k-tensor on V is a multilinear map T : (V ∗ )k → R.

In other words, a covariant k-tensor is a tensor of type (k, 0) and a contravariant k-tensor is a
tensor of type (0, k).
487 Example

■ Vectors can be seen as functions V ∗ → R, so vectors are contravariant tensors.

■ Linear functionals are covariant tensors.

■ Inner products are functions V × V → R, so they are covariant tensors.

■ The determinant of a matrix is a multilinear function of the columns (or rows) of a square matrix, so it is a covariant tensor.

The above terminology seems backwards, Michael Spivak explains:

”Nowadays such situations are always distinguished by calling the things which go in
the same direction “covariant” and the things which go in the opposite direction “con-
travariant.” Classical terminology used these same words, and it just happens to have
reversed this... And no one had the gall or authority to reverse terminology sanctified
by years of usage. So it’s very easy to remember which kind of tensor is covariant, and
which is contravariant — it’s just the opposite of what it logically ought to be.”

488 Definition
We denote the space of tensors of type (r, s) by T^s_r (V ).

So, in particular,

T_k (V ) := T^0_k (V ) = {covariant k-tensors}

T^k (V ) := T^k_0 (V ) = {contravariant k-tensors}.

Two important special cases are:

T_1 (V ) = {covariant 1-tensors} = V ∗
T^1 (V ) = {contravariant 1-tensors} = V ∗∗ ≅ V.

This last line means that we can regard vectors v ∈ V as contravariant 1-tensors. That is, every
vector v ∈ V can be regarded as a linear functional V ∗ → R via

v(ω) := ω(v),

where ω ∈ V ∗ .
The rank of an (r, s)-tensor is defined to be r + s.
In particular, vectors (contravariant 1-tensors) and dual vectors (covariant 1-tensors) have rank 1.


489 Definition
If S ∈ T^{s_1}_{r_1} (V ) is an (r_1 , s_1 )-tensor, and T ∈ T^{s_2}_{r_2} (V ) is an (r_2 , s_2 )-tensor, we can define their tensor product S ⊗ T ∈ T^{s_1+s_2}_{r_1+r_2} (V ) by

(S⊗T)(v1 , . . . , vr1 +r2 , ω1 , . . . , ωs1 +s2 ) = S(v1 , . . . , vr1 , ω1 , . . . , ωs1 )·T(vr1 +1 , . . . , vr1 +r2 , ωs1 +1 , . . . , ωs1 +s2 ).

490 Example
Let u, v ∈ V . Again, since V ∼= T1 (V ), we can regard u, v ∈ T1 (V ) as (0, 1)-tensors. Their tensor
product u ⊗ v ∈ T2 (V ) is a (0, 2)-tensor defined by

(u ⊗ v)(ω, η) = u(ω) · v(η)

491 Example
Let V = R3 . Write u = (1, 2, 3)⊤ ∈ V in the standard basis, and η = (4, 5, 6) ∈ V ∗ in the dual basis. For the inputs, let’s also write ω = (x, y, z) ∈ V ∗ and v = (p, q, r)⊤ ∈ V . Then

(u ⊗ η)(ω, v) = u(ω) · η(v)
= (x + 2y + 3z)(4p + 5q + 6r)
= 4px + 5qx + 6rx
+ 8py + 10qy + 12ry
+ 12pz + 15qz + 18rz

= [x, y, z] [ 4 5 6 ; 8 10 12 ; 12 15 18 ] (p, q, r)⊤

= ω [ 4 5 6 ; 8 10 12 ; 12 15 18 ] v.
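In components, the (1,1)-tensor u ⊗ η of this example is the outer product of the column u with the row η, i.e. the 3 × 3 matrix above. The sketch below verifies the computation for one arbitrary choice of ω and v:

```python
import numpy as np

u = np.array([1.0, 2.0, 3.0])        # u in V (column)
eta = np.array([4.0, 5.0, 6.0])      # eta in V* (row)

M = np.outer(u, eta)                 # component matrix of u (x) eta

omega = np.array([1.0, -1.0, 2.0])   # arbitrary sample omega in V*
v = np.array([0.5, 1.0, -2.0])       # arbitrary sample v in V

lhs = (omega @ u) * (eta @ v)        # u(omega) . eta(v)
rhs = omega @ M @ v                  # omega M v
```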

492 Example
If S has components αi j k , and T has components β rs then S ⊗ T has components αi j k β rs , because

S ⊗ T(ui , uj , uk , ur , us ) = S(ui , uj , uk )T(ur , us ).

Tensors satisfy algebraic laws such as:

(i) R ⊗ (S + T) = R ⊗ S + R ⊗ T,

(ii) (λR) ⊗ S = λ(R ⊗ S) = R ⊗ (λS),

(iii) (R ⊗ S) ⊗ T = R ⊗ (S ⊗ T).

But
S ⊗ T ̸= T ⊗ S
in general. To prove those we look at components wrt a basis, and note that

αi jk (β r s + γ r s ) = αi jk β r s + αi jk γ r s ,

for example, but


αi β j ̸= β j αi
in general.
Some authors take the definition of an (r, s)-tensor to mean a multilinear map V s × (V ∗ )r → R
(note that the r and s are reversed).

9.4.1. Basis of Tensor


493 Theorem
Let T^s_r (V ) be the space of tensors of type (r, s). Let {e_1 , . . . , e_n } be a basis for V , and {e^1 , . . . , e^n } be the dual basis for V ∗ .
Then

{ e^{j_1} ⊗ · · · ⊗ e^{j_r} ⊗ e_{j_{r+1}} ⊗ · · · ⊗ e_{j_{r+s}} : 1 ≤ j_i ≤ n }

is a basis for T^s_r (V ).

So any tensor T ∈ T^s_r (V ) can be written as a combination of this basis. Let T ∈ T^s_r (V ) be an (r, s) tensor and let {e_1 , . . . , e_n } be a basis for V , and {e^1 , . . . , e^n } be the dual basis for V ∗ ; then we can define a collection of scalars A^{j_{r+1} ··· j_{r+s}}_{j_1 ··· j_r} by

T(e_{j_1} , . . . , e_{j_r} , e^{j_{r+1}} , . . . , e^{j_{r+s}} ) = A^{j_{r+1} ··· j_{r+s}}_{j_1 ··· j_r}

Then the scalars { A^{j_{r+1} ··· j_{r+s}}_{j_1 ··· j_r} : 1 ≤ j_i ≤ n } completely determine the multilinear function T.

494 Theorem
Given T ∈ T^s_r (V ) an (r, s) tensor, define a collection of scalars A^{j_{r+1} ··· j_{r+s}}_{j_1 ··· j_r} by

A^{j_{r+1} ··· j_{r+s}}_{j_1 ··· j_r} = T(e_{j_1} , . . . , e_{j_r} , e^{j_{r+1}} , . . . , e^{j_{r+s}} )

The tensor T can be expressed as:

T = Σ_{j_1 =1}^n · · · Σ_{j_{r+s} =1}^n A^{j_{r+1} ··· j_{r+s}}_{j_1 ··· j_r} e^{j_1} ⊗ · · · ⊗ e^{j_r} ⊗ e_{j_{r+1}} ⊗ · · · ⊗ e_{j_{r+s}}

As a consequence of the previous theorem we have the following expression for the value of a tensor:


495 Theorem
Given T ∈ T^s_r (V ) an (r, s) tensor, and

v_i = Σ_{j_i =1}^n v_i^{j_i} e_{j_i}

for 1 ≤ i ≤ r, and

v^i = Σ_{j_i =1}^n v^i_{j_i} e^{j_i}

for r + 1 ≤ i ≤ r + s, then

T(v_1 , . . . , v^{(r+s)} ) = Σ_{j_1 =1}^n · · · Σ_{j_{r+s} =1}^n A^{j_{r+1} ··· j_{r+s}}_{j_1 ··· j_r} v_1^{j_1} · · · v_r^{j_r} v^{r+1}_{j_{r+1}} · · · v^{r+s}_{j_{r+s}}

496 Example
Let’s take a trilinear function

f : R2 × R2 × R2 → R.

A basis for R2 is {e1 , e2 } = {(1, 0), (0, 1)}. Let

f (ei , ej , ek ) = Aijk ,

where i, j, k ∈ {1, 2}. In other words, the constant Aijk is a function value at one of the eight possible
triples of basis vectors (since there are two choices for each of the three Vi ), namely:

{e1 , e1 , e1 }, {e1 , e1 , e2 }, {e1 , e2 , e1 }, {e1 , e2 , e2 }, {e2 , e1 , e1 }, {e2 , e1 , e2 }, {e2 , e2 , e1 }, {e2 , e2 , e2 }.

Each vector v_i ∈ V_i = R2 can be expressed as a linear combination of the basis vectors

v_i = Σ_{j=1}^2 v_i^j e_j = v_i^1 e_1 + v_i^2 e_2 = v_i^1 (1, 0) + v_i^2 (0, 1).

The function value at an arbitrary collection of three vectors v_i ∈ R2 can be expressed as

f (v_1 , v_2 , v_3 ) = Σ_{i=1}^2 Σ_{j=1}^2 Σ_{k=1}^2 A_{ijk} v_1^i v_2^j v_3^k .

Or, in expanded form as

f ((a, b), (c, d), (e, f )) = ace × f (e1 , e1 , e1 ) + acf × f (e1 , e1 , e2 ) (9.6)
+ ade × f (e1 , e2 , e1 ) + adf × f (e1 , e2 , e2 ) + bce × f (e2 , e1 , e1 ) + bcf × f (e2 , e1 , e2 )
(9.7)
+ bde × f (e2 , e2 , e1 ) + bdf × f (e2 , e2 , e2 ). (9.8)
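The triple sum above is a contraction of the component array A against the three coordinate vectors; in NumPy it is a single `einsum`. A sketch with an arbitrary choice of the eight constants A_ijk, checking multilinearity in the first slot:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((2, 2, 2))   # the eight constants A_ijk (arbitrary here)

def f(v1, v2, v3):
    # f(v1, v2, v3) = sum_ijk A_ijk v1^i v2^j v3^k
    return np.einsum('ijk,i,j,k->', A, v1, v2, v3)

v2 = np.array([-1.0, 0.5])
v3 = np.array([3.0, 1.0])

# Multilinearity in the first slot: f(a u + b w, v2, v3) = a f(u, v2, v3) + b f(w, v2, v3)
u = np.array([0.0, 1.0])
w = np.array([2.0, -1.0])
lhs = f(2.0*u + 3.0*w, v2, v3)
rhs = 2.0*f(u, v2, v3) + 3.0*f(w, v2, v3)
```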

9.4.2. Contraction
The simplest case of contraction is the pairing of V with its dual vector space V ∗ .

C :V∗⊗V →R (9.9)
C(f ⊗ v) = f (v) (9.10)

where f is in V ∗ and v is in V .
The above operation can be generalized to a tensor of type (r, s) (with r ≥ 1, s ≥ 1):

C : T^s_r (V ) → T^{s−1}_{r−1} (V )   (9.11)
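For a tensor f ⊗ v with components v^j f_k, the contraction is the trace of the corresponding matrix, and it reproduces the pairing f(v); a small sketch:

```python
import numpy as np

f = np.array([1.0, 2.0, 3.0])     # covector components f_k (arbitrary sample)
v = np.array([4.0, 0.0, -1.0])    # vector components v^k (arbitrary sample)

T = np.outer(v, f)                # components T[j, k] = v^j f_k of f (x) v

contracted = np.trace(T)          # C(f (x) v) = sum_k v^k f_k
pairing = f @ v                   # f(v)
```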

9.5. Change of Coordinates


9.5.1. Vectors and Covectors
Suppose that V is a vector space and E = {v_1 , · · · , v_n } and F = {w_1 , · · · , w_n } are two ordered bases for V . E and F give rise to the dual bases E ∗ = {v^1 , · · · , v^n } and F ∗ = {w^1 , · · · , w^n } for V ∗ respectively.
If [T ]^E_F = [λ_i^j ] is the matrix representation of the coordinate transformation from E to F , i.e.

[ w_1 ; . . . ; w_n ] = [ λ_1^1 λ_1^2 · · · λ_1^n ; . . . ; λ_n^1 λ_n^2 · · · λ_n^n ] [ v_1 ; . . . ; v_n ]

What is the matrix of the coordinate transformation from E ∗ to F ∗ ?


We can write wj ∈ F ∗ as a linear combination of basis elements in E ∗ :

w^j = µ^j_1 v^1 + · · · + µ^j_n v^n

We get a matrix representation [S]^{E∗}_{F∗} = [µ_i^j ] as the following:

[ w^1 · · · w^n ] = [ v^1 · · · v^n ] [ µ_1^1 µ_1^2 · · · µ_1^n ; . . . ; µ_n^1 µ_n^2 · · · µ_n^n ]

We know that w_i = λ_i^1 v_1 + · · · + λ_i^n v_n . Evaluating the functional w^j at w_i ∈ V we get:

w^j (w_i ) = µ^j_1 v^1 (w_i ) + · · · + µ^j_n v^n (w_i ) = δ_i^j

w^j (w_i ) = µ^j_1 v^1 (λ_i^1 v_1 + · · · + λ_i^n v_n ) + · · · + µ^j_n v^n (λ_i^1 v_1 + · · · + λ_i^n v_n ) = δ_i^j

w^j (w_i ) = µ^j_1 λ_i^1 + · · · + µ^j_n λ_i^n = Σ_{k=1}^n µ^j_k λ_i^k = δ_i^j

But Σ_{k=1}^n µ^j_k λ_i^k is the (i, j) entry of the matrix product TS. Therefore TS = I_n and S = T^{−1} .
k=1
If we want to write down the transformation from E ∗ to F ∗ as column vectors instead of row vectors and name the new matrix that represents this transformation U , we observe that U = S⊤ and therefore U = (T^{−1} )⊤ .
Therefore if T represents the transformation from E to F by the equation w = T v, then w∗ = U v∗ .
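The conclusion above — covector components transform with U = (T⁻¹)ᵀ when vector components transform with T — can be checked via the basis-independence of the pairing f(v). A sketch with an arbitrary invertible T:

```python
import numpy as np

# T maps E-coordinates of a vector to F-coordinates: w = T v (arbitrary invertible sample)
T = np.array([[2.0, 1.0],
              [1.0, 1.0]])
U = np.linalg.inv(T).T            # covector components transform with (T^{-1})^T

v = np.array([3.0, -1.0])         # a vector in E-coordinates
f = np.array([5.0, 2.0])          # a covector in E*-coordinates

w = T @ v                         # the same vector in F-coordinates
g = U @ f                         # the same covector in F*-coordinates

# The pairing f(v) must not depend on the basis
pair_E = f @ v
pair_F = g @ w
```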

9.5.2. Bilinear Forms


Let e_1 , . . . , e_n and ẽ_1 , . . . , ẽ_n be two bases in a linear vector space V . Let’s denote by S the transition matrix for passing from the first basis to the second one. Denote T = S^{−1} . From 9.3 we easily derive the formula relating the components of a bilinear form f(v, w) in these two bases. For this purpose it is sufficient to substitute the expression for the change of basis into the formula 9.3 and use the bilinearity of the form f(v, w):

f_ij = f(e_i , e_j ) = Σ_{k=1}^n Σ_{q=1}^n T_ki T_qj f(ẽ_k , ẽ_q ) = Σ_{k=1}^n Σ_{q=1}^n T_ki T_qj f̃_kq .

The reverse formula expressing f̃_kq through f_ij is derived similarly:

f_ij = Σ_{k=1}^n Σ_{q=1}^n T_ki T_qj f̃_kq ,   f̃_kq = Σ_{i=1}^n Σ_{j=1}^n S_ik S_jq f_ij .   (9.13)

In matrix form these relationships are written as follows:

F = T⊤ F̃ T ,   F̃ = S⊤ F S .   (9.14)

Here S⊤ and T⊤ are the two matrices obtained from S and T by transposition.
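Relation (9.14) can be verified numerically: transform a random matrix F with F̃ = Sᵀ F S, transform the coordinates with T = S⁻¹, and check that the value of the form is unchanged. A sketch (S and F are arbitrary random samples):

```python
import numpy as np

rng = np.random.default_rng(1)
F = rng.standard_normal((3, 3))   # matrix of the form in the basis e_1, ..., e_n
S = rng.standard_normal((3, 3))   # transition matrix to the tilde basis (invertible a.s.)
T = np.linalg.inv(S)

F_tilde = S.T @ F @ S             # (9.14): matrix of the form in the new basis

v = rng.standard_normal(3)
w = rng.standard_normal(3)
v_t, w_t = T @ v, T @ w           # coordinates of the same vectors in the new basis

val_old = v @ F @ w
val_new = v_t @ F_tilde @ w_t     # must agree: the form's value is basis-free
```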

9.6. Symmetry properties of tensors


Symmetry properties involve the behavior of a tensor under the interchange of two or more arguments. Of course, to even consider the value of a tensor after the permutation of some of its arguments, the arguments must be of the same type, i.e., covectors have to go in covector arguments and vectors in vector arguments; no other combinations are allowed.
The simplest case to consider are tensors with only 2 arguments of the same type. For vector
arguments we have (0, 2)-tensors. For such a tensor T introduce the following terminology:

T(Y, X) = T(X, Y ) , T is symmetric in X and Y ,


T(Y, X) = −T(X, Y ) , T is antisymmetric or “ alternating” in X and Y .

Letting (X, Y ) = (ei , ej ) and using the definition of components, we get a corresponding condition
on the components

Tji = Tij , T is symmetric in the index pair (i, j),


Tji = −Tij , T is antisymmetric (alternating) in the index pair (i, j).

For an antisymmetric tensor, the last condition immediately implies that no index can be repeated
without the corresponding component being zero

Tji = −Tij → Tii = 0 .

Any (0, 2)-tensor can be decomposed into symmetric and antisymmetric parts by defining

1
[SY M (T)](X, Y ) = [T(X, Y ) + T(Y, X)] , (“the symmetric part of T”),
2
1
[ALT (T)](X, Y ) = [T(X, Y ) − T(Y, X)] , (“the antisymmetric part of T”),
2
T = SY M (T) + ALT (T) .

The last equality holds since evaluating it on the pair (X, Y ) immediately leads to an identity.
[Check.]
Again letting (X, Y ) = (ei , ej ) leads to corresponding component formulas

1
[SY M (T)]ij = (Tij + Tji ) ≡ T(ij) , (n(n + 1)/2 independent components),
2
1
[ALT (T)]ij = (Tij − Tji ) ≡ T[ij] , (n(n − 1)/2 independent components),
2
Tij = T(ij) + T[ij] , (n2 = n(n + 1)/2 + n(n − 1)/2 independent components).

Round brackets around a pair of indices denote the symmetrization operation, while square brackets denote antisymmetrization. This is a very convenient shorthand. All of this can be repeated for (2, 0)-tensors and just reflects what we already know about the symmetric and antisymmetric parts of matrices.
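For a (0,2)-tensor stored as a matrix, this decomposition is exactly the familiar symmetric/antisymmetric split of the matrix; a sketch:

```python
import numpy as np

rng = np.random.default_rng(2)
Tm = rng.standard_normal((4, 4))       # components T_ij of a (0,2)-tensor (arbitrary)

sym = 0.5 * (Tm + Tm.T)                # [SYM(T)]_ij = (T_ij + T_ji)/2
alt = 0.5 * (Tm - Tm.T)                # [ALT(T)]_ij = (T_ij - T_ji)/2
```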

9.7. Forms
9.7.1. Motivation
Oriented area and Volume We define the oriented area function A(a, b) by

A(a, b) = ± |a| · |b| · sin α,

where the sign is chosen positive when the angle α is measured from the vector a to the vector b in
the counterclockwise direction, and negative otherwise.

Statement: The oriented area A(a, b) of a parallelogram spanned by the vectors a and b in the
two-dimensional Euclidean space is an antisymmetric and bilinear function of the vectors a and b:

A(a, b) = −A(b, a),


A(λa, b) = λ A(a, b),
A(a, b + c) = A(a, b) + A(a, c). (the sum law)

The ordinary (unoriented) area is then obtained as the absolute value of the oriented area, Ar(a, b) = |A(a, b)|. It turns out that the oriented area, due to its strict linearity properties, is a much more convenient and powerful construction than the unoriented area.

497 Theorem
Let a, b, c, be linearly independent vectors in R3 . The signed volume of the parallelepiped spanned
by them is (a × b)•c.

Statement: The oriented volume V (a, b, c) of a parallelepiped spanned by the vectors a, b and
c in the three-dimensional Euclidean space is an antisymmetric and trilinear function of the vectors
a, b and c:

V (a, b, c) = −V (b, a, c),


V (λa, b, c) = λ V (a, b, c),
V (a, b + d, c) = V (a, b, c) + V (a, d, c). (the sum law)

9.7.2. Exterior product


In three dimensions, an oriented area is represented by the cross product a × b, which is indeed an
antisymmetric and bilinear product. So we expect that the oriented area in higher dimensions can
be represented by some kind of new antisymmetric product of a and b; let us denote this product
(to be defined below) by a ∧ b, pronounced “a wedge b.” The value of a ∧ b will be a vector in a new
vector space. We will also construct this new space explicitly.

Definition of exterior product We will construct an antisymmetric product using the tensor prod-
uct space.

498 Definition
Given a vector space V , we define a new vector space V ∧ V called the exterior product (or anti-
symmetric tensor product, or alternating product, or wedge product) of two copies of V . The space
V ∧ V is the subspace in V ⊗ V consisting of all antisymmetric tensors, i.e. tensors of the form

v1 ⊗ v2 − v2 ⊗ v1 , v1,2 ∈ V,

and all linear combinations of such tensors. The exterior product of two vectors v1 and v2 is the
expression shown above; it is obviously an antisymmetric and bilinear function of v1 and v2 .

For example, here is one particular element from V ∧ V , which we write in two different ways
using the properties of the tensor product:

(u + v) ⊗ (v + w) − (v + w) ⊗ (u + v) = u ⊗ v − v ⊗ u
+u ⊗ w − w ⊗ u + v ⊗ w − w ⊗ v ∈ V ∧ V. (9.15)

Remark: A tensor v1 ⊗ v2 ∈ V ⊗ V is not equal to the tensor v2 ⊗ v1 if v1 ̸= v2 .


It is quite cumbersome to perform calculations in the tensor product notation as we did in Eq. (9.15).
So let us write the exterior product as u∧v instead of u⊗v−v⊗u. It is then straightforward to see
that the “wedge” symbol ∧ indeed works like an anti-commutative multiplication, as we intended.
The rules of computation are summarized in the following statement.

Statement 1: One may save time and write u ⊗ v − v ⊗ u ≡ u ∧ v ∈ V ∧ V , and the result of
any calculation will be correct, as long as one follows the rules:

u ∧ v = −v ∧ u, (9.16)
(λu) ∧ v = λ (u ∧ v) , (9.17)
(u + v) ∧ x = u ∧ x + v ∧ x. (9.18)

It follows also that u ∧ (λv) = λ (u ∧ v) and that v ∧ v = 0. (These identities hold for any vectors
u, v ∈ V and any scalars λ ∈ K.)

Proof: These properties are direct consequences of the properties of the tensor product when
applied to antisymmetric tensors. For example, the calculation (9.15) now requires a simple expan-
sion of brackets,
(u + v) ∧ (v + w) = u ∧ v + u ∧ w + v ∧ w.

Here we removed the term v ∧ v which vanishes due to the antisymmetry of ∧. Details left as
exercise. ■
Elements of the space V ∧ V , such as a ∧ b + c ∧ d, are sometimes called bivectors.1 We will
also want to define the exterior product of more than two vectors. To define the exterior product
of three vectors, we consider the subspace of V ⊗ V ⊗ V that consists of antisymmetric tensors of
the form

a⊗b⊗c−b⊗a⊗c+c⊗a⊗b−c⊗b⊗a
+b ⊗ c ⊗ a − a ⊗ c ⊗ b (9.19)
¹ It is important to note that a bivector is not necessarily expressible as a single-term product of two vectors; see the Exercise at the end of Sec. ??.

and linear combinations of such tensors. These tensors are called totally antisymmetric because
they can be viewed as (tensor-valued) functions of the vectors a, b, c that change sign under ex-
change of any two vectors. The expression in Eq. (9.19) will be denoted for brevity by a ∧ b ∧ c,
similarly to the exterior product of two vectors, a ⊗ b − b ⊗ a, which is denoted for brevity by a ∧ b.
Here is a general definition.

Definition 2: The exterior product of k copies of V (also called the k-th exterior power of V ) is
denoted by ∧k V and is defined as the subspace of totally antisymmetric tensors within V ⊗ ... ⊗ V .
In the concise notation, this is the space spanned by expressions of the form

v1 ∧ v2 ∧ ... ∧ vk , vj ∈ V,

assuming that the properties of the wedge product (linearity and antisymmetry) hold as given by
Statement 1. For instance,

u ∧ v1 ∧ ... ∧ vk = (−1)k v1 ∧ ... ∧ vk ∧ u (9.20)

(“pulling a vector through k other vectors changes sign k times”). ■


The previously defined space of bivectors is in this notation V ∧ V ≡ ∧²V . A natural extension of this notation is ∧⁰V = K and ∧¹V = V . I will also use the following “wedge product” notation,

⋀_{k=1}^n v_k ≡ v_1 ∧ v_2 ∧ ... ∧ v_n .

Tensors from the space ∧n V are also called n-vectors or antisymmetric tensors of rank n.

Question: How to compute expressions containing multiple products such as a ∧ b ∧ c?

Answer: Apply the rules shown in Statement 1. For example, one can permute adjacent vectors
and change sign,
a ∧ b ∧ c = −b ∧ a ∧ c = b ∧ c ∧ a,

one can expand brackets,

a ∧ (x + 4y) ∧ b = a ∧ x ∧ b + 4a ∧ y ∧ b,
and so on. If the vectors a, b, c are given as linear combinations of some basis vectors {e_j }, we
can thus reduce a ∧ b ∧ c to a linear combination of exterior products of basis vectors, such as
e1 ∧ e2 ∧ e3 , e1 ∧ e2 ∧ e4 , etc.
Example 1: Suppose we work in R3 and have vectors a = (0, 1/2, −1/2), b = (2, −2, 0), c = (−2, 5, −3). Let us compute various exterior products. Calculations are easier if we introduce the basis {e_1 , e_2 , e_3 } explicitly:

a = (1/2)(e_2 − e_3 ) ,   b = 2(e_1 − e_2 ),   c = −2e_1 + 5e_2 − 3e_3 .

We compute the 2-vector a ∧ b by using the properties of the exterior product, such as x ∧ x = 0
and x ∧ y = −y ∧ x, and simply expanding the brackets as usual in algebra:
a ∧ b = (1/2)(e_2 − e_3 ) ∧ 2 (e_1 − e_2 )
= (e2 − e3 ) ∧ (e1 − e2 )
= e2 ∧ e1 − e3 ∧ e1 − e2 ∧ e2 + e3 ∧ e2
= −e1 ∧ e2 + e1 ∧ e3 − e2 ∧ e3 .

The last expression is the result; note that now there is nothing more to compute or to simplify. The
expressions such as e1 ∧ e2 are the basic expressions out of which the space R3 ∧ R3 is built.
Let us also compute the 3-vector a ∧ b ∧ c,

a ∧ b ∧ c = (a ∧ b) ∧ c
= (−e1 ∧ e2 + e1 ∧ e3 − e2 ∧ e3 ) ∧ (−2e1 + 5e2 − 3e3 ).

When we expand the brackets here, terms such as e1 ∧ e2 ∧ e1 will vanish because

e1 ∧ e2 ∧ e1 = −e2 ∧ e1 ∧ e1 = 0,

so only terms containing all different vectors need to be kept, and we find

a ∧ b ∧ c = 3e1 ∧ e2 ∧ e3 + 5e1 ∧ e3 ∧ e2 + 2e2 ∧ e3 ∧ e1


= (3 − 5 + 2) e1 ∧ e2 ∧ e3 = 0.

We note that all the terms are proportional to the 3-vector e1 ∧ e2 ∧ e3 , so only the coefficient in
front of e1 ∧ e2 ∧ e3 was needed; then, by coincidence, that coefficient turned out to be zero. So
the result is the zero 3-vector. ■
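In R³ the coefficient of e₁ ∧ e₂ ∧ e₃ in a ∧ b ∧ c equals the determinant of the matrix whose rows are the components of a, b, c, so the vanishing found above can be cross-checked with a determinant; a sketch of that check:

```python
import numpy as np

a = np.array([0.0, 0.5, -0.5])
b = np.array([2.0, -2.0, 0.0])
c = np.array([-2.0, 5.0, -3.0])

# a ^ b ^ c = det([a; b; c]) e1 ^ e2 ^ e3; here the coefficient should vanish
coeff = np.linalg.det(np.array([a, b, c]))
```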

Remark: Origin of the name “exterior.” The construction of the exterior product is a modern
formulation of the ideas dating back to H. Grassmann (1844). A 2-vector a ∧ b is interpreted geo-
metrically as the oriented area of the parallelogram spanned by the vectors a and b. Similarly, a
3-vector a ∧ b ∧ c represents the oriented 3-volume of a parallelepiped spanned by {a, b, c}. Due
to the antisymmetry of the exterior product, we have (a∧b)∧(a∧c) = 0, (a∧b∧c)∧(b∧d) = 0,
etc. We can interpret this geometrically by saying that the “product” of two volumes is zero if these
volumes have a vector in common. This motivated Grassmann to call his antisymmetric product
“exterior.” In his reasoning, the product of two “extensive quantities” (such as lines, areas, or vol-
umes) is nonzero only when each of the two quantities is geometrically “to the exterior” (outside)
of the other.

Exercise 2: Show that in a two-dimensional space V , any 3-vector such as a ∧ b ∧ c can be sim-
plified to the zero 3-vector. Prove the same for n-vectors in N -dimensional spaces when n > N .

One can also consider the exterior powers of the dual space V ∗ . Tensors from ∧n V ∗ are usually
(for historical reasons) called n-forms (rather than “n-covectors”).

Definition 3: The action of a k-form f∗^1 ∧ ... ∧ f∗^k on a k-vector v_1 ∧ ... ∧ v_k is defined by

Σ_σ (−1)^{|σ|} f∗^1 (v_{σ(1)} ) · · · f∗^k (v_{σ(k)} ),

where the summation is performed over all permutations σ of the ordered set (1, ..., k).

Example 2: With k = 3 we have

(p∗ ∧ q∗ ∧ r∗ )(a ∧ b ∧ c)
= p∗ (a)q∗ (b)r∗ (c) − p∗ (b)q∗ (a)r∗ (c)
+ p∗ (b)q∗ (c)r∗ (a) − p∗ (c)q∗ (b)r∗ (a)
+ p∗ (c)q∗ (a)r∗ (b) − p∗ (a)q∗ (c)r∗ (b).
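The signed sum over permutations in Definition 3 is precisely the determinant of the k × k pairing matrix M_ij = f∗^i(v_j). The sketch below checks this for arbitrary covectors and vectors with k = 3:

```python
import numpy as np
from itertools import permutations

rng = np.random.default_rng(3)
k, n = 3, 3
covs = rng.standard_normal((k, n))   # rows: the covectors p*, q*, r* (arbitrary)
vecs = rng.standard_normal((k, n))   # rows: the vectors a, b, c (arbitrary)

# Pairing matrix M[i, j] = f*^i(v_j)
M = covs @ vecs.T

def sign(p):
    # Parity of a permutation given as a tuple, via cycle sorting
    s, p = 1, list(p)
    for i in range(len(p)):
        while p[i] != i:
            j = p[i]
            p[i], p[j] = p[j], p[i]
            s = -s
    return s

perm_sum = sum(sign(p) * np.prod([covs[i] @ vecs[p[i]] for i in range(k)])
               for p in permutations(range(k)))
det_val = np.linalg.det(M)
```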

Exercise 3: a) Show that a ∧ b ∧ ω = ω ∧ a ∧ b where ω is any antisymmetric tensor (e.g. ω =


x ∧ y ∧ z).
b) Show that
ω1 ∧ a ∧ ω2 ∧ b ∧ ω3 = −ω1 ∧ b ∧ ω2 ∧ a ∧ ω3 ,

where ω1 , ω2 , ω3 are arbitrary antisymmetric tensors and a, b are vectors.


c) Due to antisymmetry, a ∧ a = 0 for any vector a ∈ V . Is it also true that ω ∧ ω = 0 for any
bivector ω ∈ ∧2 V ?

9.7.3. Hodge star operator

10. Tensors in Coordinates
“The introduction of numbers as coordinates is an act of violence.”
Hermann Weyl.

10.1. Index notation for tensors


So far we have used a coordinate-free formalism to define and describe tensors. However, in many
calculations a basis in V is fixed, and one needs to compute the components of tensors in that basis.
In this cases the index notation makes such calculations easier.
Suppose a basis {e_1 , ..., e_n } in V is fixed; then the dual basis {e^k } is also fixed. Any vector v ∈ V is decomposed as v = Σ_k v^k e_k and any covector as f∗ = Σ_k f_k e^k .
Any tensor from V ⊗ V is decomposed as

A = Σ_{j,k} A^{jk} e_j ⊗ e_k ∈ V ⊗ V

and so on. The action of a covector on a vector is f∗ (v) = Σ_k f_k v^k , and the action of an operator on a vector is Σ_{j,k} A^j_k v^k e_j . However, it is cumbersome to keep writing these sums. In the index notation, one writes only the components v^k or A^j_k of vectors and tensors.

499 Definition
Given T ∈ T^s_r (V ):

T = Σ_{j_1 =1}^n · · · Σ_{j_{r+s} =1}^n T^{j_{r+1} ··· j_{r+s}}_{j_1 ··· j_r} e^{j_1} ⊗ · · · ⊗ e^{j_r} ⊗ e_{j_{r+1}} ⊗ · · · ⊗ e_{j_{r+s}}

The index notation of this tensor is

T^{j_{r+1} ··· j_{r+s}}_{j_1 ··· j_r}

10.1.1. Definition of index notation


The rules for expressing tensors in the index notation are as follows:

o Basis vectors ek and basis tensors (e.g. ek ⊗ e∗l ) are never written explicitly. (It is assumed
that the basis is fixed and known.)

o Instead of a vector v ∈ V , one writes its array of components v k with the superscript index.
Covectors f∗ ∈ V ∗ are written fk with the subscript index. The index k runs over integers
from 1 to N . Components of vectors and tensors may be thought of as numbers.

o Tensors are written as multidimensional arrays of components with superscript or subscript indices as necessary, for example A_{jk} ∈ V ∗ ⊗ V ∗ or B^{lm}_k ∈ V ⊗ V ⊗ V ∗ . Thus e.g. the Kronecker delta symbol is written as δ^j_k when it represents the identity operator 1̂_V .

o Tensors with subscript indices, like A_{ij} , are called covariant, while tensors with superscript indices, like A^k , are called contravariant. Tensors with both types of indices, like A^{lmn}_{lk} , are called mixed type.

o Subscript indices, rather than subscripted tensors, are also dubbed “covariant” and super-
script indices are dubbed “contravariant”.

o For tensor invariance, a pair of dummy indices should in general be complementary in their
variance type, i.e. one covariant and the other contravariant.

o As indicated earlier, tensor order is equal to the number of its indices while tensor rank is
equal to the number of its free indices; hence vectors (terms, expressions and equalities) are
represented by a single free index and rank-2 tensors are represented by two free indices. The
dimension of a tensor is determined by the range taken by its indices.

o The choice of indices must be consistent; each index corresponds to a particular copy of V
or V ∗ . Thus it is wrong to write vj = uk or vi + ui = 0. Correct equations are vj = uj and
v i + ui = 0. This disallows meaningless expressions such as v∗ + u (one cannot add vectors
from different spaces).

o Sums over indices such as $\sum_{k=1}^{n} a_k b^k$ are not written explicitly; the symbol $\sum$ is omitted, and the Einstein summation convention is used instead: summation over all values of an index is always implied when that index letter appears once as a subscript and once as a superscript. In this case the letter is called a dummy (or mute) index. Thus one writes $f_k v^k$ instead of $\sum_k f_k v^k$ and $A^j_k v^k$ instead of $\sum_k A^j_k v^k$.

o Summation is allowed only over one subscript and one superscript but never over two subscripts or two superscripts and never over three or more coincident indices. This corresponds to requiring that we are only allowed to compute the canonical pairing of $V$ and $V^*$ but no other pairing. The expression $v^k v^k$ is not allowed because there is no canonical pairing of $V$ and $V$, so, for instance, the sum $\sum_{k=1}^{n} v^k v^k$ depends on the choice of the basis. For the same reason (dependence on the basis), expressions such as $u_i v^i w^i$ or $A_{ii} B^{ii}$ are not allowed. Correct expressions are $u_i v^i w^k$ and $A_{ik} B^{ik}$.

10.1. Index notation for tensors
o One needs to pay close attention to the choice and the position of the letters such as j, k, l,... used
as indices. Indices that are not repeated are free indices. The rank of a tensor expression is
equal to the number of free subscript and superscript indices. Thus Ajk v k is a rank 1 tensor
(i.e. a vector) because the expression Ajk v k has a single free index, j, and a summation over k
is implied.

o The tensor product symbol $\otimes$ is never written. For example, if $v \otimes f^* = \sum_{j,k} v^j f^*_k\, e_j \otimes e^k$, one writes $v^k f_j$ to represent the tensor $v \otimes f^*$. The index letters in the expression $v^k f_j$ are intentionally chosen to be different (in this case, $k$ and $j$) so that no summation would be implied. In other words, a tensor product is written simply as a product of components, and the index letters are chosen appropriately. Then one can interpret $v^k f_j$ as simply the product of numbers. In particular, it makes no difference whether one writes $f_j v^k$ or $v^k f_j$. The position of the indices (rather than the ordering of vectors) shows in every case how the tensor product is formed. Note that it is not possible to distinguish $V \otimes V^*$ from $V^* \otimes V$ in the index notation.
500 Example
It follows from the definition of $\delta^i_j$ that $\delta^i_j v^j = v^i$. This is the index representation of the identity transformation $\hat{1} v = v$.
501 Example
Suppose w, x, y, and z are vectors from V whose components are wi , xi , y i , z i . What are the compo-
nents of the tensor w ⊗ x + 2y ⊗ z ∈ V ⊗ V ?

Solution: ▶ wi xk + 2y i z k . (We need to choose another letter for the second free index, k, which
corresponds to the second copy of V in V ⊗ V .) ◀
502 Example
The operator $\hat{A} \equiv \hat{1}_V + \lambda v \otimes u^* \in V \otimes V^*$ acts on a vector $x \in V$. Calculate the resulting vector $y \equiv \hat{A} x$.
In the index-free notation, the calculation is
$$y = \hat{A} x = \left(\hat{1}_V + \lambda v \otimes u^*\right) x = x + \lambda u^*(x)\, v.$$
In the index notation, the calculation looks like this:
$$y^k = \left(\delta^k_j + \lambda v^k u_j\right) x^j = x^k + \lambda v^k u_j x^j.$$
In this formula, $j$ is a dummy index and $k$ is a free index. We could have also written $\lambda x^j v^k u_j$ instead of $\lambda v^k u_j x^j$ since the ordering of components makes no difference in the index notation.
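A minimal numerical check of this example, using NumPy with arbitrary illustrative values for λ, v, u and x, confirms that the index computation and the index-free computation agree:

```python
import numpy as np

lam = 0.5
v = np.array([1.0, 0.0, 2.0])
u = np.array([3.0, 1.0, 0.0])   # components u_j of the covector u*
x = np.array([1.0, 1.0, 1.0])

# Index computation: y^k = (delta^k_j + lambda v^k u_j) x^j
A = np.eye(3) + lam * np.einsum('k,j->kj', v, u)
y_index = np.einsum('kj,j->k', A, x)

# Index-free computation: y = x + lambda u*(x) v
y_free = x + lam * (u @ x) * v

assert np.allclose(y_index, y_free)
```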

503 Example
In a physics book you find the following formula,
$$H^\alpha_{\mu\nu} = \frac{1}{2}\left(h_{\beta\mu\nu} + h_{\beta\nu\mu} - h_{\mu\nu\beta}\right) g^{\alpha\beta}.$$
To what spaces do the tensors $H$, $g$, $h$ belong (assuming these quantities represent tensors)? Rewrite this formula in the coordinate-free notation.

Solution: ▶ $H \in V \otimes V^* \otimes V^*$, $h \in V^* \otimes V^* \otimes V^*$, $g \in V \otimes V$. Assuming the simplest case,
$$h = h^*_1 \otimes h^*_2 \otimes h^*_3, \qquad g = g_1 \otimes g_2,$$
the coordinate-free formula is
$$H = \frac{1}{2}\, g_1 \otimes \left( h^*_1(g_2)\, h^*_2 \otimes h^*_3 + h^*_1(g_2)\, h^*_3 \otimes h^*_2 - h^*_3(g_2)\, h^*_1 \otimes h^*_2 \right). ◀$$

10.1.2. Advantages and disadvantages of index notation


Index notation is conceptually easier than the index-free notation because one can imagine ma-
nipulating “merely” some tables of numbers, rather than “abstract vectors.” In other words, we
are working with less abstract objects. The price is that we obscure the geometric interpretation of
what we are doing, and proofs of general theorems become more difficult to understand.
The main advantage of the index notation is that it makes computations with complicated ten-
sors quicker.
Some disadvantages of the index notation are:

o If the basis is changed, all components need to be recomputed. In textbooks that use the index notation, quite some time is spent studying the transformation laws of tensor components under a change of basis. If different bases are used simultaneously, confusion may result.

o The geometrical meaning of many calculations appears hidden behind a mass of indices. It
is sometimes unclear whether a long expression with indices can be simplified and how to
proceed with calculations.

Despite these disadvantages, the index notation enables one to perform practical calculations
with high-rank tensor spaces, such as those required in field theory and in general relativity. For
this reason, and also for historical reasons (Einstein used the index notation when developing the
theory of relativity), most physics textbooks use the index notation. In some cases, calculations can
be performed equally quickly using index and index-free notations. In other cases, especially when
deriving general properties of tensors, the index-free notation is superior.

10.2. Tensor Revisited: Change of Coordinate


Vectors, covectors, linear operators, and bilinear forms are examples of tensors. They are multilin-
ear maps that are represented numerically when some basis in the space is chosen.
This numeric representation is specific to each of them: vectors and covectors are represented by
one-dimensional arrays, linear operators and quadratic forms are represented by two-dimensional
arrays. Apart from the number of indices, their position does matter. The coordinates of a vector

are numerated by one upper index, which is called the contravariant index. The coordinates of
a covector are numerated by one lower index, which is called the covariant index. In a matrix of
bilinear form we use two lower indices; therefore bilinear forms are called twice-covariant tensors.
Linear operators are tensors of mixed type; their components are numerated by one upper and one lower index. The number of indices and their positions determine the transformation rules, i.e. the way the components of each particular tensor behave under a change of basis. In the general case,
any tensor is represented by a multidimensional array with a definite number of upper indices and
a definite number of lower indices. Let’s denote these numbers by r and s. Then we have a tensor
of the type (r, s), or sometimes the term valency is used. A tensor of type (r, s), or of valency (r, s)
is called an r-times contravariant and an s-times covariant tensor. This is terminology; now let’s
proceed to the exact definition. It is based on the following general transformation formulas:

$$X^{i_1\ldots i_r}_{j_1\ldots j_s} = \sum_{h_1,\ldots,h_r=1}^{n}\ \sum_{k_1,\ldots,k_s=1}^{n} S^{i_1}_{h_1}\cdots S^{i_r}_{h_r}\, T^{k_1}_{j_1}\cdots T^{k_s}_{j_s}\, \tilde{X}^{h_1\ldots h_r}_{k_1\ldots k_s}, \qquad (10.1)$$

$$\tilde{X}^{i_1\ldots i_r}_{j_1\ldots j_s} = \sum_{h_1,\ldots,h_r=1}^{n}\ \sum_{k_1,\ldots,k_s=1}^{n} T^{i_1}_{h_1}\cdots T^{i_r}_{h_r}\, S^{k_1}_{j_1}\cdots S^{k_s}_{j_s}\, X^{h_1\ldots h_r}_{k_1\ldots k_s}. \qquad (10.2)$$

504 Definition (Tensor Definition in Coordinates)
A $(r+s)$-dimensional array $X^{i_1\ldots i_r}_{j_1\ldots j_s}$ of real numbers such that the components of this array obey the transformation rules

$$X^{i_1\ldots i_r}_{j_1\ldots j_s} = \sum_{h_1,\ldots,h_r=1}^{n}\ \sum_{k_1,\ldots,k_s=1}^{n} S^{i_1}_{h_1}\cdots S^{i_r}_{h_r}\, T^{k_1}_{j_1}\cdots T^{k_s}_{j_s}\, \tilde{X}^{h_1\ldots h_r}_{k_1\ldots k_s}, \qquad (10.3)$$

$$\tilde{X}^{i_1\ldots i_r}_{j_1\ldots j_s} = \sum_{h_1,\ldots,h_r=1}^{n}\ \sum_{k_1,\ldots,k_s=1}^{n} T^{i_1}_{h_1}\cdots T^{i_r}_{h_r}\, S^{k_1}_{j_1}\cdots S^{k_s}_{j_s}\, X^{h_1\ldots h_r}_{k_1\ldots k_s} \qquad (10.4)$$

under a change of basis is called a tensor of type $(r,s)$, or of valency $(r,s)$.

Formula 10.4 is derived from 10.3, so it is sufficient to remember only one of them. Let it be formula 10.3. Though huge, formula 10.3 is easy to remember.
Indices $i_1, \ldots, i_r$ and $j_1, \ldots, j_s$ are free indices. In the right hand side of the equality 10.3 they are distributed in the $S$-s and $T$-s, each having only one entry and each keeping its position, i.e. upper indices $i_1, \ldots, i_r$ remain upper and lower indices $j_1, \ldots, j_s$ remain lower in the right hand side of the equality 10.3.
The other indices $h_1, \ldots, h_r$ and $k_1, \ldots, k_s$ are summation indices; they enter the right hand side of 10.3 pairwise: once as an upper index and once as a lower index, once in the $S$-s or $T$-s and once in the components of the array $\tilde{X}^{h_1\ldots h_r}_{k_1\ldots k_s}$.
When expressing $X^{i_1\ldots i_r}_{j_1\ldots j_s}$ through $\tilde{X}^{h_1\ldots h_r}_{k_1\ldots k_s}$, each upper index is served by the direct transition matrix $S$ and produces one summation in 10.3:
$$X^{\ldots\, i_\alpha\, \ldots}_{\ldots\, \ldots\, \ldots} = \sum \ldots \sum_{h_\alpha=1}^{n} \sum \ldots\ \ldots\, S^{i_\alpha}_{h_\alpha} \ldots\, \tilde{X}^{\ldots\, h_\alpha\, \ldots}_{\ldots\, \ldots\, \ldots}. \qquad (10.5)$$

In a similar way, each lower index is served by the inverse transition matrix $T$ and also produces one summation in formula 10.3:
$$X^{\ldots\, \ldots\, \ldots}_{\ldots\, j_\alpha\, \ldots} = \sum \ldots \sum_{k_\alpha=1}^{n} \sum \ldots\ \ldots\, T^{k_\alpha}_{j_\alpha} \ldots\, \tilde{X}^{\ldots\, \ldots\, \ldots}_{\ldots\, k_\alpha\, \ldots}. \qquad (10.6)$$
Formulas 10.5 and 10.6 are the same as 10.3 and are used only to highlight how 10.3 is written. So tensors are defined. Further we shall consider more examples showing that many well-known objects satisfy Definition 504.
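The transformation rule can be checked numerically in the simplest non-trivial case, a $(1,1)$-tensor (a linear operator), for which formula 10.3 reduces to the familiar similarity transformation. The sketch below uses NumPy (not part of the text); the transition matrix $S$ is an arbitrary invertible matrix:

```python
import numpy as np

rng = np.random.default_rng(0)

# An arbitrary invertible direct transition matrix S and its inverse T
S = rng.standard_normal((3, 3)) + 3 * np.eye(3)   # shifted to keep it well-conditioned
T = np.linalg.inv(S)

# Components of a (1,1) tensor (a linear operator) in the tilde basis
X_tilde = rng.standard_normal((3, 3))

# Formula 10.3 with r = s = 1:  X^i_j = S^i_h T^k_j X~^h_k
X = np.einsum('ih,kj,hk->ij', S, T, X_tilde)

# For a linear operator this is exactly the similarity transformation S X~ S^{-1}
assert np.allclose(X, S @ X_tilde @ T)

# The trace (a full contraction, hence a scalar) is invariant under the change of basis
assert np.isclose(np.trace(X), np.trace(X_tilde))
```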
505 Example
Verify that the formulas for change of basis of vectors, covectors, linear operators and bilinear forms are special cases of formula 10.3. What are the valencies of vectors, covectors, linear operators, and bilinear forms when they are considered as tensors?

506 Example
The Kronecker delta $\delta^j_i$ is a tensor.

Solution: ▶
$$\delta'^j_i = A^j_k (A^{-1})^l_i\, \delta^k_l = A^j_k (A^{-1})^k_i = \delta^j_i$$
◀

507 Example
The Levi-Civita symbol $\epsilon_{ijk}$ is a pseudo-tensor.

508 Example
Let $a_{ij}$ be the matrix of some bilinear form $a$. Denote by $b^{ij}$ the components of the inverse matrix of $a_{ij}$. Prove that the matrix $b^{ij}$ under a change of basis transforms like the matrix of a twice-contravariant tensor. Hence it determines a tensor $b$ of valency $(2,0)$. Tensor $b$ is called the dual bilinear form for $a$.

10.2.1. Rank
The order of a tensor is identified by the number of its indices (e.g. Aijk is a tensor of order 3) which
normally identifies the tensor rank as well. However, when contraction (see S 10.3.4) takes place
once or more, the order of the tensor is not affected but its rank is reduced by two for each contrac-
tion operation.1

■ “Zero tensor” is a tensor whose all components are zero.

■ “Unit tensor” or “unity tensor”, which is usually defined for rank-2 tensors, is a tensor whose
all elements are zero except the ones with identical values of all indices which are assigned
the value 1.
1
In the literature of tensor calculus, rank and order of tensors are generally used interchangeably; however some au-
thors differentiate between the two as they assign order to the total number of indices, including repetitive indices,
while they keep rank to the number of free indices. We think the latter is better and hence we follow this convention
in the present text.

■ While tensors of rank-0 are generally represented in a common form of light face non-indexed
symbols, tensors of rank ≥ 1 are represented in several forms and notations, the main ones
are the index-free notation, which may also be called direct or symbolic or Gibbs notation,
and the indicial notation which is also called index or component or tensor notation. The
first is a geometrically oriented notation with no reference to a particular reference frame
and hence it is intrinsically invariant to the choice of coordinate systems, whereas the sec-
ond takes an algebraic form based on components identified by indices and hence the no-
tation is suggestive of an underlying coordinate system, although being a tensor makes it
form-invariant under certain coordinate transformations and therefore it possesses certain
invariant properties. The index-free notation is usually identified by using bold face symbols,
like a and B, while the indicial notation is identified by using light face indexed symbols such
as ai and Bij .

10.2.2. Examples of Tensors of Different Ranks


o Examples of rank-0 tensors (scalars) are energy, mass, temperature, volume and density.
These are totally identified by a single number regardless of any coordinate system and hence
they are invariant under coordinate transformations.

o Examples of rank-1 tensors (vectors) are displacement, force, electric field, velocity and ac-
celeration. These need for their complete identification a number, representing their mag-
nitude, and a direction representing their geometric orientation within their space. Alterna-
tively, they can be uniquely identified by a set of numbers, equal to the number of dimensions
of the underlying space, in reference to a particular coordinate system and hence this iden-
tification is system-dependent although they still have system-invariant properties such as
length.

o Examples of rank-2 tensors are Kronecker delta (see S 10.4.1), stress, strain, rate of strain and
inertia tensors. These require for their full identification a set of numbers each of which is
associated with two directions.

o Examples of rank-3 tensors are the Levi-Civita tensor (see S 10.4.2) and the tensor of piezo-
electric moduli.

o Examples of rank-4 tensors are the elasticity or stiffness tensor, the compliance tensor and
the fourth-order moment of inertia tensor.

o Tensors of high ranks are relatively rare in science.

10.3. Tensor Operations in Coordinates


There are many operations that can be performed on tensors to produce other tensors in general.
Some examples of these operations are addition/subtraction, multiplication by a scalar (rank-0 ten-
sor), multiplication of tensors (each of rank > 0), contraction and permutation. Some of these

operations, such as addition and multiplication, involve more than one tensor while others are
performed on a single tensor, such as contraction and permutation.
In tensor algebra, division is allowed only for scalars; hence if the components of an indexed tensor should appear in a denominator, the tensor should be redefined to avoid this, e.g. $B_i = \frac{1}{A_i}$.

10.3.1. Addition and Subtraction


Tensors of the same rank and type can be added algebraically to produce a tensor of the same rank
and type, e.g.
a=b+c (10.7)

Ai = Bi − Ci (10.8)

Aij = Bij + Cij (10.9)

509 Definition
Given two tensors $X^{i_1\ldots i_r}_{j_1\ldots j_s}$ and $Y^{i_1\ldots i_r}_{j_1\ldots j_s}$ of the same type, we define their sum $Z$ componentwise:
$$Z^{i_1\ldots i_r}_{j_1\ldots j_s} = X^{i_1\ldots i_r}_{j_1\ldots j_s} + Y^{i_1\ldots i_r}_{j_1\ldots j_s}.$$

510 Theorem
Given two tensors $X^{i_1\ldots i_r}_{j_1\ldots j_s}$ and $Y^{i_1\ldots i_r}_{j_1\ldots j_s}$ of type $(r,s)$, their sum
$$Z^{i_1\ldots i_r}_{j_1\ldots j_s} = X^{i_1\ldots i_r}_{j_1\ldots j_s} + Y^{i_1\ldots i_r}_{j_1\ldots j_s}$$
is also a tensor of type $(r,s)$.

Proof. Both summands obey the transformation rule 10.3:
$$X^{i_1\ldots i_r}_{j_1\ldots j_s} = \sum_{h_1,\ldots,h_r=1}^{n}\ \sum_{k_1,\ldots,k_s=1}^{n} S^{i_1}_{h_1}\cdots S^{i_r}_{h_r}\, T^{k_1}_{j_1}\cdots T^{k_s}_{j_s}\, \tilde{X}^{h_1\ldots h_r}_{k_1\ldots k_s},$$
$$Y^{i_1\ldots i_r}_{j_1\ldots j_s} = \sum_{h_1,\ldots,h_r=1}^{n}\ \sum_{k_1,\ldots,k_s=1}^{n} S^{i_1}_{h_1}\cdots S^{i_r}_{h_r}\, T^{k_1}_{j_1}\cdots T^{k_s}_{j_s}\, \tilde{Y}^{h_1\ldots h_r}_{k_1\ldots k_s}.$$
Then
$$Z^{i_1\ldots i_r}_{j_1\ldots j_s} = \sum_{h_1,\ldots,h_r=1}^{n}\ \sum_{k_1,\ldots,k_s=1}^{n} S^{i_1}_{h_1}\cdots S^{i_r}_{h_r}\, T^{k_1}_{j_1}\cdots T^{k_s}_{j_s}\, \left( \tilde{X}^{h_1\ldots h_r}_{k_1\ldots k_s} + \tilde{Y}^{h_1\ldots h_r}_{k_1\ldots k_s} \right),$$
so $Z$ obeys the same transformation rule. ■

Addition of tensors is associative and commutative:

(A + B) + C = A + (B + C) (10.10)

A+B=B+A (10.11)

10.3.2. Multiplication by Scalar


A tensor can be multiplied by a scalar, which generally should not be zero, to produce a tensor of the same variance type and rank, e.g.
$$A^i_{jk} = a B^i_{jk} \qquad (10.12)$$
where $a$ is a non-zero scalar.
511 Definition
Given a tensor $X^{i_1\ldots i_r}_{j_1\ldots j_s}$ of type $(r,s)$ and a scalar $\alpha$, we define the multiplication of $X^{i_1\ldots i_r}_{j_1\ldots j_s}$ by $\alpha$ as:
$$Y^{i_1\ldots i_r}_{j_1\ldots j_s} = \alpha\, X^{i_1\ldots i_r}_{j_1\ldots j_s}.$$

512 Theorem
Given a tensor $X^{i_1\ldots i_r}_{j_1\ldots j_s}$ of type $(r,s)$ and a scalar $\alpha$, then
$$Y^{i_1\ldots i_r}_{j_1\ldots j_s} = \alpha\, X^{i_1\ldots i_r}_{j_1\ldots j_s}$$
is also a tensor of type $(r,s)$.

The proof of this Theorem is very similar to the proof of the Theorem 510 and the proof is left as an
exercise to the reader.
As indicated above, multiplying a tensor by a scalar means multiplying each component of the
tensor by that scalar.
Multiplication by a scalar is commutative, and associative when more than two factors are in-
volved.

10.3.3. Tensor Product


This may also be called outer or exterior or direct or dyadic multiplication, although some of these
names may be reserved for operations on vectors.
The tensor product is defined by a more tricky formula. Suppose we have a tensor $X$ of type $(r,s)$ and a tensor $Y$ of type $(p,q)$; then we can write:
$$Z^{i_1\ldots i_{r+p}}_{j_1\ldots j_{s+q}} = X^{i_1\ldots i_r}_{j_1\ldots j_s}\, Y^{i_{r+1}\ldots i_{r+p}}_{j_{s+1}\ldots j_{s+q}}.$$
This formula produces a new tensor $Z$ of type $(r+p,\, s+q)$. It is called the tensor product of $X$ and $Y$ and denoted $Z = X \otimes Y$.

513 Example
$$A_i B_j = C_{ij} \qquad (10.13)$$
$$A_{ij} B_{kl} = C_{ijkl} \qquad (10.14)$$

Direct multiplication of tensors is not commutative.


514 Example (Outer Product of Vectors)
The outer product of two vectors is equivalent to the matrix multiplication $\mathbf{u}\mathbf{v}^T$, provided that $\mathbf{u}$ and $\mathbf{v}$ are represented as column vectors, so that $\mathbf{v}^T$ is a row vector.
$$\mathbf{u} \otimes \mathbf{v} = \mathbf{u}\mathbf{v}^T = \begin{pmatrix} u_1 \\ u_2 \\ u_3 \\ u_4 \end{pmatrix} \begin{pmatrix} v_1 & v_2 & v_3 \end{pmatrix} = \begin{pmatrix} u_1 v_1 & u_1 v_2 & u_1 v_3 \\ u_2 v_1 & u_2 v_2 & u_2 v_3 \\ u_3 v_1 & u_3 v_2 & u_3 v_3 \\ u_4 v_1 & u_4 v_2 & u_4 v_3 \end{pmatrix}. \qquad (10.15)$$

In index notation:

$$(\mathbf{u}\mathbf{v}^T)_{ij} = u_i v_j$$

The outer product operation is distributive with respect to the algebraic sum of tensors:

A (B ± C) = AB ± AC & (B ± C) A = BA ± CA (10.16)

Multiplication of a tensor by a scalar (refer to S 10.3.2) may be regarded as a special case of direct
multiplication.
The rank-2 tensor constructed as a result of the direct multiplication of two vectors is commonly
called dyad.
Tensors may be expressed as an outer product of vectors where the rank of the resultant product
is equal to the number of the vectors involved (e.g. 2 for dyads and 3 for triads).
Not every tensor can be synthesized as a product of lower rank tensors.
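A short numerical sketch (NumPy, arbitrary values; not part of the text) illustrates the outer product, its non-commutativity, and the fact that a dyad is a very special rank-2 tensor:

```python
import numpy as np

u = np.array([1.0, 2.0, 3.0, 4.0])
v = np.array([5.0, 6.0, 7.0])

# Outer product u (x) v: a 4x3 array of components u_i v_j (a dyad)
C = np.einsum('i,j->ij', u, v)
assert np.allclose(C, np.outer(u, v))
assert C[0, 2] == u[0] * v[2]

# Direct multiplication is not commutative: v (x) u has shape 3x4, not 4x3
assert np.einsum('i,j->ij', v, u).shape == (3, 4)

# Not every rank-2 tensor is a dyad: a dyad always has matrix rank 1
assert np.linalg.matrix_rank(C) == 1
```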

10.3.4. Contraction
Contraction of a tensor of rank $> 1$ is to make two free indices identical, by unifying their symbols, and perform summation over these repeated indices, e.g.
$$A^j_i \xrightarrow{\ \text{contraction}\ } A^i_i \qquad (10.17)$$
$$A^{jk}_{il} \xrightarrow{\ \text{contraction on } jl\ } A^{mk}_{im} \qquad (10.18)$$
Contraction results in a reduction of the rank by 2 since it implies the annihilation of two free indices. Therefore, the contraction of a rank-2 tensor is a scalar, the contraction of a rank-3 tensor is a vector, the contraction of a rank-4 tensor is a rank-2 tensor, and so on.

For non-Cartesian coordinate systems, the pair of contracted indices should be different in their
variance type, i.e. one upper and one lower. Hence, contraction of a mixed tensor of type (m, n)
will, in general, produce a tensor of type (m − 1, n − 1).
A tensor of type (p, q) can have p × q possible contractions, i.e. one contraction for each pair of
lower and upper indices.
515 Example (Trace)
In matrix algebra, taking the trace (summing the diagonal elements) can also be considered as con-
traction of the matrix, which under certain conditions can represent a rank-2 tensor, and hence it yields
the trace which is a scalar.
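The two contraction examples above can be reproduced numerically; in NumPy's einsum (assumed here as the illustration tool) a repeated index letter within one operand that is absent from the output performs precisely this diagonal-and-sum operation:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3))

# Contraction of a rank-2 tensor A^i_j on its two indices gives a scalar: the trace
assert np.isclose(np.einsum('ii->', A), np.trace(A))

# A rank-4 tensor A^{jk}_{il}, stored as B[j, k, i, l]
B = rng.standard_normal((3, 3, 3, 3))

# Contraction on j and l (one upper, one lower index) yields the rank-2 tensor A^{mk}_{im}
C = np.einsum('mkim->ki', B)
assert C.shape == (3, 3)

# Explicit check of the implied summation over the repeated index m
check = np.array([[sum(B[m, k, i, m] for m in range(3)) for i in range(3)]
                  for k in range(3)])
assert np.allclose(C, check)
```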

10.3.5. Inner Product


On taking the outer product of two tensors of rank ≥ 1 followed by a contraction on two indices of
the product, an inner product of the two tensors is formed. Hence if one of the original tensors is
of rank-m and the other is of rank-n, the inner product will be of rank-(m + n − 2).
The inner product operation is usually symbolized by a single dot between the two tensors, e.g.
A · B, to indicate contraction following outer multiplication.
In general, the inner product is not commutative. When one or both of the tensors involved in
the inner product are of rank > 1 the order of the multiplicands does matter.
The inner product operation is distributive with respect to the algebraic sum of tensors:

A · (B ± C) = A · B ± A · C & (B ± C) · A = B · A ± C · A (10.19)

516 Example (Dot Product)


A common example of contraction is the dot product operation on vectors which can be regarded as
a direct multiplication (refer to S 10.3.3) of the two vectors, which results in a rank-2 tensor, followed
by a contraction.

517 Example (Matrix acting on vectors)
Another common example (from linear algebra) of inner product is the multiplication of a matrix (representing a rank-2 tensor) by a vector (rank-1 tensor) to produce a vector, e.g.
$$[\mathbf{A}\mathbf{b}]_{ijk} = A_{ij} b_k \xrightarrow{\ \text{contraction on } jk\ } [\mathbf{A} \cdot \mathbf{b}]_i = A_{ij} b_j \qquad (10.20)$$
The multiplication of two $n \times n$ matrices is another example of inner product (see Eq. ??).
For tensors whose outer product produces a tensor of rank > 2, various contraction operations
between different sets of indices can occur and hence more than one inner product, which are dif-
ferent in general, can be defined. Moreover, when the outer product produces a tensor of rank > 3
more than one contraction can take place simultaneously.
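The two-step construction (outer product, then contraction) can be verified numerically; the sketch below, with arbitrary NumPy arrays, reproduces both Eq. 10.20 and the dot product of Example 516:

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((3, 3))
b = rng.standard_normal(3)

# Outer product first: a rank-3 tensor [Ab]_{ijk} = A_{ij} b_k ...
outer = np.einsum('ij,k->ijk', A, b)

# ... then contraction on j and k gives the inner product [A.b]_i = A_{ij} b_j
inner = np.einsum('ijj->i', outer)
assert np.allclose(inner, A @ b)

# The dot product of two vectors is the same two-step construction
u, v = rng.standard_normal(3), rng.standard_normal(3)
assert np.isclose(np.einsum('ii->', np.einsum('i,j->ij', u, v)), u @ v)
```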

10.3.6. Permutation
A tensor may be obtained by exchanging the indices of another tensor, e.g. transposition of rank-2
tensors.

Obviously, tensor permutation applies only to tensors of rank ≥ 2.
The collection of tensors obtained by permuting the indices of a basic tensor may be called iso-
mers.

10.4. Kronecker and Levi-Civita Tensors


These tensors are of particular importance in tensor calculus due to their distinctive properties and unique transformation attributes. They are numerical tensors with fixed components in all coordinate systems. The first is called the Kronecker delta or unit tensor, while the second is called the Levi-Civita tensor.
The δ and ϵ tensors are conserved under coordinate transformations and hence they are the same for all systems of coordinates.2

10.4.1. Kronecker δ
This is a rank-2 symmetric tensor in all dimensions, i.e.

δij = δji (i, j = 1, 2, . . . , n) (10.21)

Similar identities apply to the contravariant and mixed types of this tensor.
It is invariant in all coordinate systems, and hence it is an isotropic tensor.3
It is defined as:
$$\delta_{ij} = \begin{cases} 1 & (i = j) \\ 0 & (i \neq j) \end{cases} \qquad (10.22)$$
and hence it can be considered as the identity matrix, e.g. for 3D
$$\left[\delta_{ij}\right] = \begin{pmatrix} \delta_{11} & \delta_{12} & \delta_{13} \\ \delta_{21} & \delta_{22} & \delta_{23} \\ \delta_{31} & \delta_{32} & \delta_{33} \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \qquad (10.23)$$
Covariant, contravariant and mixed types of this tensor are the same, that is
$$\delta^{ij} = \delta_{ij} = \delta^i_{\ j} = \delta_i^{\ j} \qquad (10.24)$$

10.4.2. Permutation ϵ
This is an isotropic tensor. It has a rank equal to the number of dimensions; hence, a rank-n per-
mutation tensor has nn components.
It is totally anti-symmetric in each pair of its indices, i.e. it changes sign on swapping any two of
its indices, that is
ϵi1 ...ik ...il ...in = −ϵi1 ...il ...ik ...in (10.25)
2
For the permutation tensor, the statement applies to proper coordinate transformations.
3
In fact it is more general than isotropic as it is invariant even under improper coordinate transformations.

The reason is that any exchange of two indices requires an even/odd number of single-step shifts
to the right of the first index plus an odd/even number of single-step shifts to the left of the sec-
ond index, so the total number of shifts is odd and hence it is an odd permutation of the original
arrangement.
It is a pseudo tensor since it acquires a minus sign under improper orthogonal transformation of
coordinates (inversion of axes with possible superposition of rotation).
Definition of rank-2 ϵ (ϵij ):

ϵ12 = 1, ϵ21 = −1 & ϵ11 = ϵ22 = 0 (10.26)

Definition of rank-3 $\epsilon$ ($\epsilon_{ijk}$):
$$\epsilon_{ijk} = \begin{cases} 1 & (i,j,k \text{ is an even permutation of } 1,2,3) \\ -1 & (i,j,k \text{ is an odd permutation of } 1,2,3) \\ 0 & (\text{repeated index}) \end{cases} \qquad (10.27)$$
The definition of rank-$n$ $\epsilon$ ($\epsilon_{i_1 i_2 \ldots i_n}$) is similar to the definition of rank-3 $\epsilon$ considering index repetition and even or odd permutations of its indices $(i_1, i_2, \cdots, i_n)$ corresponding to $(1, 2, \cdots, n)$, that is
$$\epsilon_{i_1 i_2 \ldots i_n} = \begin{cases} 1 & \left[(i_1, i_2, \ldots, i_n) \text{ is an even permutation of } (1, 2, \ldots, n)\right] \\ -1 & \left[(i_1, i_2, \ldots, i_n) \text{ is an odd permutation of } (1, 2, \ldots, n)\right] \\ 0 & \left[\text{repeated index}\right] \end{cases} \qquad (10.28)$$

$\epsilon$ may be considered a contravariant relative tensor of weight $+1$ or a covariant relative tensor of weight $-1$. Hence, in 2, 3 and $n$ dimensional spaces respectively we have:
$$\epsilon^{ij} = \epsilon_{ij} \qquad (10.29)$$
$$\epsilon^{ijk} = \epsilon_{ijk} \qquad (10.30)$$
$$\epsilon^{i_1 i_2 \ldots i_n} = \epsilon_{i_1 i_2 \ldots i_n} \qquad (10.31)$$
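The sign-based characterization of $\epsilon$ translates directly into code. The sketch below (the helper name levi_civita is ours, and NumPy is assumed as the illustration tool) builds the rank-$n$ symbol from the product of the signs of index differences and checks total anti-symmetry and the normalization $\epsilon_{i_1\ldots i_n}\epsilon^{i_1\ldots i_n} = n!$ of Eq. 10.48:

```python
import numpy as np
from itertools import permutations
from math import factorial

def levi_civita(n):
    """Rank-n permutation tensor: each permutation gets the product of signs of index differences."""
    eps = np.zeros((n,) * n, dtype=int)
    for idx in permutations(range(n)):
        sign = 1
        for a in range(n):
            for b in range(a + 1, n):
                sign *= (1 if idx[b] > idx[a] else -1)
        eps[idx] = sign          # entries with repeated indices stay 0
    return eps

e3 = levi_civita(3)
assert e3[0, 1, 2] == 1 and e3[1, 0, 2] == -1 and e3[0, 0, 2] == 0

# Total anti-symmetry: swapping any two indices flips the sign
assert np.allclose(np.swapaxes(e3, 0, 1), -e3)

# Normalization: the sum of squares over all index values is n!
assert int(np.einsum('ijk,ijk->', e3, e3)) == factorial(3)
```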

10.4.3. Useful Identities Involving δ or/and ϵ


Identities Involving δ

When an index of the Kronecker delta is involved in a contraction operation by repeating an index
in another tensor in its own term, the effect of this is to replace the shared index in the other tensor
by the other index of the Kronecker delta, that is

δij Aj = Ai (10.32)

In such cases the Kronecker delta is described as the substitution or index replacement operator.
Hence,
δij δjk = δik (10.33)

Similarly,
δij δjk δki = δik δki = δii = n (10.34)

where n is the space dimension.


Because the coordinates are independent of each other:

$$\frac{\partial x_i}{\partial x_j} = \partial_j x_i = x_{i,j} = \delta_{ij} \qquad (10.35)$$

Hence, in an n dimensional space we have

∂i xi = δii = n (10.36)

For orthonormal Cartesian systems:
$$\frac{\partial x_i}{\partial x_j} = \frac{\partial x_j}{\partial x_i} = \delta_{ij} = \delta^{ij} \qquad (10.37)$$
For a set of orthonormal basis vectors in orthonormal Cartesian systems:

ei · ej = δij (10.38)

The double inner product of two dyads formed by orthonormal basis vectors of an orthonormal
Cartesian system is given by:
ei ej : ek el = δik δjl (10.39)

Identities Involving ϵ

For rank-3 ϵ:

ϵijk = ϵkij = ϵjki = −ϵikj = −ϵjik = −ϵkji (sense of cyclic order) (10.40)

These equations demonstrate the fact that rank-3 ϵ is totally anti-symmetric in all of its indices since
a shift of any two indices reverses the sign. This also reflects the fact that the above tensor system
has only one independent component.
For rank-2 ϵ:
ϵij = (j − i) (10.41)

For rank-3 ϵ:
1
ϵijk = (j − i) (k − i) (k − j) (10.42)
2
For rank-4 ϵ:
1
ϵijkl = (j − i) (k − i) (l − i) (k − j) (l − j) (l − k) (10.43)
12
For rank-$n$ $\epsilon$:
$$\epsilon_{a_1 a_2 \cdots a_n} = \prod_{i=1}^{n-1} \left[ \frac{1}{i!} \prod_{j=i+1}^{n} (a_j - a_i) \right] = \frac{1}{S(n-1)} \prod_{1 \le i < j \le n} (a_j - a_i) \qquad (10.44)$$

where $S(n-1)$ is the super-factorial function of $(n-1)$ which is defined as
$$S(k) = \prod_{i=1}^{k} i! = 1! \cdot 2! \cdots k! \qquad (10.45)$$

A simpler formula for rank-$n$ $\epsilon$ can be obtained from the previous one by ignoring the magnitude of the multiplication factors and taking only their signs, that is
$$\epsilon_{a_1 a_2 \cdots a_n} = \sigma\!\left( \prod_{1 \le i < j \le n} (a_j - a_i) \right) = \prod_{1 \le i < j \le n} \sigma(a_j - a_i) \qquad (10.46)$$
where
$$\sigma(k) = \begin{cases} +1 & (k > 0) \\ -1 & (k < 0) \\ 0 & (k = 0) \end{cases} \qquad (10.47)$$
For rank-n ϵ:
ϵi1 i2 ···in ϵi1 i2 ···in = n! (10.48)

because this is the sum of the squares of ϵi1 i2 ···in over all the permutations of n different indices
which is equal to n! where the value of ϵ of each one of these permutations is either +1 or −1 and
hence in both cases their square is 1.
For a symmetric tensor Ajk :
ϵijk Ajk = 0 (10.49)

because an exchange of the two indices of Ajk does not affect its value due to the symmetry whereas
a similar exchange in these indices in ϵijk results in a sign change; hence each term in the sum has
its own negative and therefore the total sum will vanish.

ϵijk Ai Aj = ϵijk Ai Ak = ϵijk Aj Ak = 0 (10.50)

because, due to the commutativity of multiplication, an exchange of the indices in A’s will not affect
the value but a similar exchange in the corresponding indices of ϵijk will cause a change in sign;
hence each term in the sum has its own negative and therefore the total sum will be zero.
For a set of orthonormal basis vectors in a 3D space with a right-handed orthonormal Cartesian
coordinate system:
$$e_i \times e_j = \epsilon_{ijk}\, e_k \qquad (10.51)$$
$$e_i \cdot \left( e_j \times e_k \right) = \epsilon_{ijk} \qquad (10.52)$$

Identities Involving δ and ϵ

ϵijk δ1i δ2j δ3k = ϵ123 = 1 (10.53)

For rank-2 $\epsilon$:
$$\epsilon_{ij}\, \epsilon_{kl} = \begin{vmatrix} \delta_{ik} & \delta_{il} \\ \delta_{jk} & \delta_{jl} \end{vmatrix} = \delta_{ik}\delta_{jl} - \delta_{il}\delta_{jk} \qquad (10.54)$$

ϵil ϵkl = δik (10.55)

ϵij ϵij = 2 (10.56)

For rank-3 $\epsilon$:
$$\epsilon_{ijk}\, \epsilon_{lmn} = \begin{vmatrix} \delta_{il} & \delta_{im} & \delta_{in} \\ \delta_{jl} & \delta_{jm} & \delta_{jn} \\ \delta_{kl} & \delta_{km} & \delta_{kn} \end{vmatrix} = \delta_{il}\delta_{jm}\delta_{kn} + \delta_{im}\delta_{jn}\delta_{kl} + \delta_{in}\delta_{jl}\delta_{km} - \delta_{il}\delta_{jn}\delta_{km} - \delta_{im}\delta_{jl}\delta_{kn} - \delta_{in}\delta_{jm}\delta_{kl} \qquad (10.57)$$
$$\epsilon_{ijk}\, \epsilon_{lmk} = \begin{vmatrix} \delta_{il} & \delta_{im} \\ \delta_{jl} & \delta_{jm} \end{vmatrix} = \delta_{il}\delta_{jm} - \delta_{im}\delta_{jl} \qquad (10.58)$$

The last identity is very useful in manipulating and simplifying tensor expressions and proving vec-
tor and tensor identities.
ϵijk ϵljk = 2δil (10.59)

ϵijk ϵijk = 2δii = 6 (10.60)

since the rank and dimension of ϵ are the same, which is 3 in this case.
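Identities 10.57–10.60 are finite statements and can be verified exhaustively by a short computation. The sketch below (NumPy, not part of the text) builds $\epsilon_{ijk}$ from the explicit product formula 10.42 and checks Eq. 10.58 together with the contracted forms 10.59 and 10.60:

```python
import numpy as np
from itertools import permutations

# Build eps_{ijk} from the rank-3 product formula: eps_{ijk} = (j-i)(k-i)(k-j)/2
eps = np.zeros((3, 3, 3))
for i, j, k in permutations(range(3)):
    eps[i, j, k] = (j - i) * (k - i) * (k - j) / 2

# Eq. 10.58: eps_{ijk} eps_{lmk} = delta_{il} delta_{jm} - delta_{im} delta_{jl}
lhs = np.einsum('ijk,lmk->ijlm', eps, eps)
d = np.eye(3)
rhs = np.einsum('il,jm->ijlm', d, d) - np.einsum('im,jl->ijlm', d, d)
assert np.allclose(lhs, rhs)

# The doubly and fully contracted forms, Eqs. 10.59 and 10.60
assert np.allclose(np.einsum('ijk,ljk->il', eps, eps), 2 * d)
assert np.isclose(np.einsum('ijk,ijk->', eps, eps), 6)
```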
For rank-$n$ $\epsilon$:
$$\epsilon_{i_1 i_2 \cdots i_n}\, \epsilon_{j_1 j_2 \cdots j_n} = \begin{vmatrix} \delta_{i_1 j_1} & \delta_{i_1 j_2} & \cdots & \delta_{i_1 j_n} \\ \delta_{i_2 j_1} & \delta_{i_2 j_2} & \cdots & \delta_{i_2 j_n} \\ \vdots & \vdots & \ddots & \vdots \\ \delta_{i_n j_1} & \delta_{i_n j_2} & \cdots & \delta_{i_n j_n} \end{vmatrix} \qquad (10.61)$$

According to Eqs. 10.27 and 10.32:

ϵijk δij = ϵijk δik = ϵijk δjk = 0 (10.62)

10.4.4. ⋆ Generalized Kronecker delta


The generalized Kronecker delta is defined by:
$$\delta^{i_1 \ldots i_n}_{j_1 \ldots j_n} = \begin{cases} 1 & \left[(j_1 \ldots j_n) \text{ is an even permutation of } (i_1 \ldots i_n)\right] \\ -1 & \left[(j_1 \ldots j_n) \text{ is an odd permutation of } (i_1 \ldots i_n)\right] \\ 0 & \left[\text{repeated } j\text{'s}\right] \end{cases} \qquad (10.63)$$
It can also be defined by the following $n \times n$ determinant:
$$\delta^{i_1 \ldots i_n}_{j_1 \ldots j_n} = \begin{vmatrix} \delta^{i_1}_{j_1} & \delta^{i_1}_{j_2} & \cdots & \delta^{i_1}_{j_n} \\ \delta^{i_2}_{j_1} & \delta^{i_2}_{j_2} & \cdots & \delta^{i_2}_{j_n} \\ \vdots & \vdots & \ddots & \vdots \\ \delta^{i_n}_{j_1} & \delta^{i_n}_{j_2} & \cdots & \delta^{i_n}_{j_n} \end{vmatrix} \qquad (10.64)$$

where the δji entries in the determinant are the normal Kronecker delta as defined by Eq. 10.22.
Accordingly, the relation between the rank-n ϵ and the generalized Kronecker delta in an n di-
mensional space is given by:

$$\epsilon_{i_1 i_2 \ldots i_n} = \delta^{1\, 2\, \ldots\, n}_{i_1 i_2 \ldots i_n} \qquad \& \qquad \epsilon^{i_1 i_2 \ldots i_n} = \delta^{i_1 i_2 \ldots i_n}_{1\, 2\, \ldots\, n} \qquad (10.65)$$

Hence, the permutation tensor ϵ may be considered as a special case of the generalized Kronecker
delta. Consequently the permutation symbol can be written as an n × n determinant consisting of
the normal Kronecker deltas.
If we define
$$\delta^{ij}_{lm} = \delta^{ijk}_{lmk} \qquad (10.66)$$
then Eq. 10.58 will take the following form:
$$\delta^{ij}_{lm} = \delta^i_l \delta^j_m - \delta^i_m \delta^j_l \qquad (10.67)$$

Other identities involving δ and ϵ can also be formulated in terms of the generalized Kronecker
delta.
On comparing Eq. 10.61 with Eq. 10.64 we conclude
$$\delta^{i_1 \ldots i_n}_{j_1 \ldots j_n} = \epsilon^{i_1 \ldots i_n}\, \epsilon_{j_1 \ldots j_n} \qquad (10.68)$$
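The determinant definition 10.64 and the relation 10.68 can likewise be checked exhaustively in three dimensions (the helper name gen_delta is ours; NumPy is assumed as the illustration tool):

```python
import numpy as np
from itertools import product

n = 3
d = np.eye(n)

# Generalized Kronecker delta via the determinant definition (10.64)
def gen_delta(upper, lower):
    return np.linalg.det(np.array([[d[i, j] for j in lower] for i in upper]))

# Rank-3 permutation symbol from (10.65): eps_{i1 i2 i3} = delta^{123}_{i1 i2 i3}
eps = np.array([[[gen_delta((0, 1, 2), (i, j, k)) for k in range(n)]
                 for j in range(n)] for i in range(n)])
assert eps[0, 1, 2] == 1 and eps[1, 0, 2] == -1 and eps[0, 0, 1] == 0

# Check Eq. 10.68 over all index combinations:
# delta^{i1..in}_{j1..jn} = eps^{i1..in} eps_{j1..jn}
for up in product(range(n), repeat=n):
    for lo in product(range(n), repeat=n):
        assert np.isclose(gen_delta(up, lo), eps[up] * eps[lo])
```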

10.5. Types of Tensor Fields


In the following subsections we introduce a number of tensor types and categories and highlight
their main characteristics and differences. These types and categories are not mutually exclusive
and hence they overlap in general; moreover they may not be exhaustive in their classes as some
tensors may not instantiate any one of a complementary set of types such as being symmetric or
anti-symmetric.

10.5.1. Isotropic and Anisotropic Tensors


Isotropic tensors are characterized by the property that the values of their components are invari-
ant under coordinate transformation by proper rotation of axes. In contrast, the values of the com-
ponents of anisotropic tensors are dependent on the orientation of the coordinate axes. Notable
examples of isotropic tensors are scalars (rank-0), the vector 0 (rank-1), Kronecker delta δij (rank-2)
and Levi-Civita tensor ϵijk (rank-3). Many tensors describing physical properties of materials, such
as stress and magnetic susceptibility, are anisotropic.
Direct and inner products of isotropic tensors are isotropic tensors.
The zero tensor of any rank is isotropic; therefore if the components of a tensor vanish in a partic-
ular coordinate system they will vanish in all properly and improperly rotated coordinate systems.4
4
For improper rotation, this is more general than being isotropic.

10. Tensors in Coordinates
Consequently, if the components of two tensors are identical in a particular coordinate system they
are identical in all transformed coordinate systems.
As indicated, all rank-0 tensors (scalars) are isotropic. Also, the zero vector, 0, of any dimension
is isotropic; in fact it is the only rank-1 isotropic tensor.

518 Theorem
Any isotropic second order tensor T_{ij} can be written as

T_{ij} = \lambda \delta_{ij}

for some scalar λ.

Proof. First we will prove that T is diagonal. Let R be the reflection in the hyperplane perpendicular to the j-th vector in the standard ordered basis:

R_{kl} = \begin{cases} -1 & \text{if } k = l = j \\ \delta_{kl} & \text{otherwise} \end{cases}

therefore

R = R^T \ \wedge\ R^2 = I \ \Rightarrow\ R^T R = R R^T = I

Therefore, for i ≠ j (no summation over i and j in the second step):

T_{ij} = \sum_{p,q} R_{ip} R_{jq} T_{pq} = R_{ii} R_{jj} T_{ij} = -T_{ij} \ \Rightarrow\ T_{ij} = 0

Now we will prove that T_{jj} = T_{11}. Let P be the permutation matrix that interchanges the 1st and j-th rows when acting by left multiplication:

P_{kl} = \begin{cases} \delta_{jl} & \text{if } k = 1 \\ \delta_{1l} & \text{if } k = j \\ \delta_{kl} & \text{otherwise} \end{cases}

P is orthogonal:

(P^T P)_{kl} = \sum_m P^T_{km} P_{ml} = \sum_m P_{mk} P_{ml} = \sum_{m \neq 1,j} \delta_{mk}\delta_{ml} + \delta_{jk}\delta_{jl} + \delta_{1k}\delta_{1l} = \delta_{kl}

Therefore, since T is diagonal:

T_{jj} = \sum_{p,q} P_{jp} P_{jq} T_{pq} = \sum_q P_{jq}^2\, T_{qq} = \sum_q \delta_{1q}^2\, T_{qq} = \sum_q \delta_{1q}\, T_{qq} = T_{11} \qquad \blacksquare
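The content of Theorem 518 can also be probed numerically: any multiple of δ_{ij} keeps exactly the same components under an arbitrary proper rotation. A minimal sketch (the QR-based construction of a random rotation is our own choice, not from the text):

```python
import numpy as np

rng = np.random.default_rng(0)

# Build an arbitrary proper rotation from the QR decomposition of a
# random matrix (an assumed construction).
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
if np.linalg.det(Q) < 0:
    Q[:, 0] *= -1.0  # flip one column so that det Q = +1

lam = 2.5
T = lam * np.eye(3)  # candidate isotropic tensor T_ij = lambda * delta_ij

# Rank-2 transformation law: T'_ij = R_ip R_jq T_pq
T_rot = np.einsum('ip,jq,pq->ij', Q, Q, T)

assert np.allclose(T_rot, T)  # components are unchanged under the rotation
```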

10.5.2. Symmetric and Anti-symmetric Tensors


These types apply to tensors of rank ≥ 2 only. Moreover, they are not exhaustive, even for tensors of rank ≥ 2, as there are high-rank tensors which are neither symmetric nor anti-symmetric.

A rank-2 tensor Aij is symmetric iff for all i and j

Aji = Aij (10.69)

and anti-symmetric or skew-symmetric iff

Aji = −Aij (10.70)

Similar conditions apply to contravariant type tensors (refer also to the following).
A rank-n tensor Ai1 ...in is symmetric in its two indices ij and il iff

Ai1 ...il ...ij ...in = Ai1 ...ij ...il ...in (10.71)

and anti-symmetric or skew-symmetric in its two indices ij and il iff

Ai1 ...il ...ij ...in = −Ai1 ...ij ...il ...in (10.72)

Any rank-2 tensor Aij can be synthesized from (or decomposed into) a symmetric part A(ij) (marked
with round brackets enclosing the indices) and an anti-symmetric part A[ij] (marked with square
brackets) where
A_{ij} = A_{(ij)} + A_{[ij]}, \qquad A_{(ij)} = \frac{1}{2}\left(A_{ij} + A_{ji}\right) \quad \& \quad A_{[ij]} = \frac{1}{2}\left(A_{ij} - A_{ji}\right) \tag{10.73}
A rank-3 tensor Aijk can be symmetrized by

A_{(ijk)} = \frac{1}{3!}\left(A_{ijk} + A_{kij} + A_{jki} + A_{ikj} + A_{jik} + A_{kji}\right) \tag{10.74}

and anti-symmetrized by

A_{[ijk]} = \frac{1}{3!}\left(A_{ijk} + A_{kij} + A_{jki} - A_{ikj} - A_{jik} - A_{kji}\right) \tag{10.75}
A rank-n tensor Ai1 ...in can be symmetrized by

A_{(i_1 \ldots i_n)} = \frac{1}{n!}\,(\text{sum of all even \& odd permutations of indices } i\text{'s}) \tag{10.76}

and anti-symmetrized by

A_{[i_1 \ldots i_n]} = \frac{1}{n!}\,(\text{sum of all even permutations minus sum of all odd permutations}) \tag{10.77}
For a symmetric tensor Aij and an anti-symmetric tensor Bij (or the other way around) we have

Aij Bij = 0 (10.78)

The indices whose exchange defines the symmetry and anti-symmetry relations should be of the
same variance type, i.e. both upper or both lower.
The symmetry and anti-symmetry characteristic of a tensor is invariant under coordinate trans-
formation.

A tensor of high rank (> 2) may be symmetrized or anti-symmetrized with respect to only some
of its indices instead of all of its indices, e.g.
A_{(ij)k} = \frac{1}{2}\left(A_{ijk} + A_{jik}\right) \quad \& \quad A_{[ij]k} = \frac{1}{2}\left(A_{ijk} - A_{jik}\right) \tag{10.79}
A tensor is totally symmetric iff
Ai1 ...in = A(i1 ...in ) (10.80)

and totally anti-symmetric iff


Ai1 ...in = A[i1 ...in ] (10.81)

For a totally skew-symmetric tensor (i.e. anti-symmetric in all of its indices), nonzero entries can
occur only when all the indices are different.
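Equations 10.73–10.78 can be checked directly with a short NumPy sketch (the helper names perm_sign, symmetrize and antisymmetrize are ours):

```python
import numpy as np
from itertools import permutations
from math import factorial

def perm_sign(p):
    # Sign of a permutation, computed by counting inversions.
    s = 1
    for i in range(len(p)):
        for j in range(i + 1, len(p)):
            if p[i] > p[j]:
                s = -s
    return s

def symmetrize(A):
    # A_(i1...in): average over all index permutations (Eq. 10.76).
    n = A.ndim
    return sum(np.transpose(A, p) for p in permutations(range(n))) / factorial(n)

def antisymmetrize(A):
    # A_[i1...in]: signed average over all index permutations (Eq. 10.77).
    n = A.ndim
    return sum(perm_sign(p) * np.transpose(A, p)
               for p in permutations(range(n))) / factorial(n)

A = np.random.default_rng(1).normal(size=(3, 3))
S, B = symmetrize(A), antisymmetrize(A)

assert np.allclose(A, S + B)                         # Eq. 10.73: decomposition
assert abs(np.einsum('ij,ij->', S, B)) < 1e-12       # Eq. 10.78: full contraction vanishes
```

The same helpers work for any rank, since they permute all axes of the array.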

11. Tensor Calculus
11.1. Tensor Fields
In many applications, especially in differential geometry and physics, it is natural to consider a
tensor with components that are functions of the point in a space. This was the setting of Ricci’s
original work. In modern mathematical terminology such an object is called a tensor field and often
referred to simply as a tensor.

519 Definition
A tensor field of type (r, s) is a map T : V → Tsr (V ).

The space of all tensor fields of type (r, s) is denoted T_s^r(V). In this way, given T ∈ T_s^r(V), if we apply it to a point p ∈ V, we obtain T(p) ∈ T_s^r(V).
It’s usual to write the point p as an index:

Tp : (v1 , . . . , ωn ) 7→ Tp (v1 , . . . , ωn ) ∈ R

520 Example

■ If f ∈ T00 (V ) then f is a scalar function.

■ If T ∈ T10 (V ) then T is a vector field.

■ If T ∈ T01 (V ) then T is called differential form of rank 1.

521 Example

M_i{}^j = \begin{pmatrix} x & x + y \\ x - y^2 & x \end{pmatrix}

Differential Now we will construct one of the most important tensor fields: the differential.
Given a differentiable scalar function f the directional derivative


D_v f(p) := \left.\frac{d}{dt}\, f(p + tv)\right|_{t=0}

is a linear function of v.

(Dv+w f )(p) = (Dv f )(p) + (Dw f )(p) (11.1)


(Dcv f )(p) = c(Dv f )(p) (11.2)

As we already know the directional derivative is the Jacobian applied to the vector

Dv f (p) = Dfp (v) = [∂1 f, . . . ∂n f ][v1 , . . . , vn ]T

In other words Dv f (p) ∈ T01 (V )

522 Definition
Let f : V → R be a differentiable function. The differential of f , denoted by df , is the differential
form defined by
dfp v = Dv f (p).

Clearly, df ∈ T01 (V )

Let {u1 , u2 , . . . , un } be a coordinate system. Since the coordinates {u1 , u2 , . . . , un } are them-
selves functions, we define the associated differential-forms {du1 , du2 , . . . , dun }.
523 Proposition
Let {u¹, u², . . . , uⁿ} be a coordinate system and ∂r/∂uⁱ(p) the corresponding basis of V. Then the differential forms {du¹, du², . . . , duⁿ} are the corresponding dual basis:

du^i_p\!\left(\frac{\partial r}{\partial u^j}(p)\right) = \delta^i_j

Since \frac{\partial u^i}{\partial u^j} = \delta^i_j, it follows that

df = \sum_{i=1}^{n} \frac{\partial f}{\partial u^i}\, du^i.
We also have the following product rule

d(f g) = (df )g + f (dg)
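The formula df = Σᵢ (∂f/∂uⁱ) duⁱ and the product rule can be verified symbolically, representing the one-forms duⁱ by formal symbols (a sketch; all names are ours):

```python
import sympy as sp

x, y = sp.symbols('x y')
dx, dy = sp.symbols('dx dy')  # formal basis one-forms du^i

def differential(f, coords, forms):
    # df = sum_i (∂f/∂u^i) du^i
    return sum(sp.diff(f, u) * du for u, du in zip(coords, forms))

f = x**2 * y
g = sp.sin(x) + y

df = differential(f, (x, y), (dx, dy))
assert sp.simplify(df - (2*x*y*dx + x**2*dy)) == 0

# Product rule: d(fg) = (df) g + f (dg)
lhs = differential(f*g, (x, y), (dx, dy))
rhs = differential(f, (x, y), (dx, dy))*g + f*differential(g, (x, y), (dx, dy))
assert sp.simplify(sp.expand(lhs - rhs)) == 0
```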

As a consequence of Proposition 523 we have:

524 Theorem
Let T ∈ T_s^r(V) be a tensor field of type (r, s). Then T can be expressed in coordinates as:

T = \sum_{j_1=1}^{n} \cdots \sum_{j_{r+s}=1}^{n} A^{\,j_{r+1} \cdots j_{r+s}}_{\,j_1 \cdots j_r}\; du^{j_1} \otimes \cdots \otimes du^{j_r} \otimes \frac{\partial r}{\partial u^{j_{r+1}}}(p) \otimes \cdots \otimes \frac{\partial r}{\partial u^{j_{r+s}}}(p)


11.1.1. Change of Coordinates


Let {u¹, u², . . . , uⁿ} and {ū¹, ū², . . . , ūⁿ} be two coordinate systems, with bases {∂r/∂uⁱ(p)} and {∂r/∂ūⁱ(p)} of V and corresponding dual bases {du^j} and {dū^j}.
By the chain rule the basis vectors change as:

\frac{\partial r}{\partial \bar u^j}(p) = \frac{\partial u^i}{\partial \bar u^j}(p)\, \frac{\partial r}{\partial u^i}(p)

So the matrix of the change of basis is:

A^i_j = \frac{\partial u^i}{\partial \bar u^j}

and the covectors change by the inverse:

\left(A^{-1}\right)^j_i = \frac{\partial \bar u^j}{\partial u^i}

525 Theorem (Change of Basis For Tensor Fields)

Let {u¹, u², . . . , uⁿ} and {ū¹, ū², . . . , ūⁿ} be two coordinate systems and T a tensor field; then

\hat T^{\,i'_1 \ldots i'_p}_{\,j'_1 \ldots j'_q}(\bar u^1, \ldots, \bar u^n) = \frac{\partial \bar u^{i'_1}}{\partial u^{i_1}} \cdots \frac{\partial \bar u^{i'_p}}{\partial u^{i_p}}\, \frac{\partial u^{j_1}}{\partial \bar u^{j'_1}} \cdots \frac{\partial u^{j_q}}{\partial \bar u^{j'_q}}\; T^{\,i_1 \ldots i_p}_{\,j_1 \ldots j_q}(u^1, \ldots, u^n).

526 Example (Contravariance)


The tangent vector to a curve is a contravariant vector.

Solution: ▶ Let the curve be given by the parameterization x^i = x^i(t). Then the tangent vector to the curve is

T^i = \frac{dx^i}{dt}

Under a change of coordinates, the curve is given by

x'^i = x'^i(t) = x'^i\big(x^1(t), \cdots, x^n(t)\big)

and the tangent vector in the new coordinate system is given by:

T'^i = \frac{dx'^i}{dt}

By the chain rule,

\frac{dx'^i}{dt} = \frac{\partial x'^i}{\partial x^j}\, \frac{dx^j}{dt}

Therefore,

T'^i = \frac{\partial x'^i}{\partial x^j}\, T^j

which shows that the tangent vector transforms contravariantly and thus is a contravariant vector. ◀

527 Example (Covariance)
The gradient of a scalar field is a covariant vector field.

Solution: ▶ Let ϕ(x) be a scalar field and let

G = \nabla\phi = \left( \frac{\partial \phi}{\partial x^1}, \frac{\partial \phi}{\partial x^2}, \frac{\partial \phi}{\partial x^3}, \cdots, \frac{\partial \phi}{\partial x^n} \right)

thus

G_i = \frac{\partial \phi}{\partial x^i}

In the primed coordinate system, the gradient is

G'_i = \frac{\partial \phi'}{\partial x'^i}

where ϕ'(x') = ϕ(x(x')). By the chain rule,

\frac{\partial \phi'}{\partial x'^i} = \frac{\partial \phi}{\partial x^j}\, \frac{\partial x^j}{\partial x'^i}

Thus

G'_i = \frac{\partial x^j}{\partial x'^i}\, G_j

which shows that the gradient is a covariant vector. ◀

528 Example
A covariant tensor has components xy, z², 3y − x in rectangular coordinates. Write its components in spherical coordinates.

Solution: ▶ Let A_j denote its components in rectangular coordinates (x^1, x^2, x^3) = (x, y, z):

A_1 = xy, \qquad A_2 = z^2, \qquad A_3 = 3y - x

Let \bar A_k denote its components in spherical coordinates (\bar x^1, \bar x^2, \bar x^3) = (r, \phi, \theta). Then

\bar A_k = \frac{\partial x^j}{\partial \bar x^k}\, A_j

The relation between the two coordinate systems is given by:

x = r \sin\phi \cos\theta; \qquad y = r \sin\phi \sin\theta; \qquad z = r \cos\phi

And so:

\bar A_1 = \frac{\partial x^1}{\partial \bar x^1} A_1 + \frac{\partial x^2}{\partial \bar x^1} A_2 + \frac{\partial x^3}{\partial \bar x^1} A_3 \tag{11.3}
= \sin\phi \cos\theta\, (xy) + \sin\phi \sin\theta\, (z^2) + \cos\phi\, (3y - x) \tag{11.4}
= \sin\phi \cos\theta\, (r \sin\phi \cos\theta)(r \sin\phi \sin\theta) + \sin\phi \sin\theta\, (r \cos\phi)^2 \tag{11.5}
\quad + \cos\phi\, (3r \sin\phi \sin\theta - r \sin\phi \cos\theta) \tag{11.6}

\bar A_2 = \frac{\partial x^1}{\partial \bar x^2} A_1 + \frac{\partial x^2}{\partial \bar x^2} A_2 + \frac{\partial x^3}{\partial \bar x^2} A_3 \tag{11.7}
= r \cos\phi \cos\theta\, (xy) + r \cos\phi \sin\theta\, (z^2) - r \sin\phi\, (3y - x) \tag{11.8}
= r \cos\phi \cos\theta\, (r \sin\phi \cos\theta)(r \sin\phi \sin\theta) + r \cos\phi \sin\theta\, (r \cos\phi)^2 \tag{11.9}
\quad - r \sin\phi\, (3r \sin\phi \sin\theta - r \sin\phi \cos\theta) \tag{11.10}

\bar A_3 = \frac{\partial x^1}{\partial \bar x^3} A_1 + \frac{\partial x^2}{\partial \bar x^3} A_2 + \frac{\partial x^3}{\partial \bar x^3} A_3 \tag{11.11}
= -r \sin\phi \sin\theta\, (xy) + r \sin\phi \cos\theta\, (z^2) + 0 \tag{11.12}
= -r \sin\phi \sin\theta\, (r \sin\phi \cos\theta)(r \sin\phi \sin\theta) + r \sin\phi \cos\theta\, (r \cos\phi)^2 \tag{11.13}

◀
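The transformation in this example can be reproduced with SymPy (a sketch checking the r-component against the formula above; the variable names are ours):

```python
import sympy as sp

r, phi, theta = sp.symbols('r phi theta', positive=True)
x = r*sp.sin(phi)*sp.cos(theta)
y = r*sp.sin(phi)*sp.sin(theta)
z = r*sp.cos(phi)

A = [x*y, z**2, 3*y - x]          # covariant components expressed in (x, y, z)
xs, us = [x, y, z], [r, phi, theta]

# Covariant transformation law: Abar_k = (∂x^j / ∂xbar^k) A_j
Abar = [sp.expand(sum(sp.diff(xs[j], us[k]) * A[j] for j in range(3)))
        for k in range(3)]

# Abar[0] should match the expansion of A-bar_1 term by term
expected = (sp.sin(phi)*sp.cos(theta)*x*y
            + sp.sin(phi)*sp.sin(theta)*z**2
            + sp.cos(phi)*(3*y - x))
assert sp.simplify(Abar[0] - expected) == 0
```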

11.2. Derivatives
In this section we consider two different types of derivatives of tensor fields: differentiation with respect to the spatial variables x¹, . . . , xⁿ and differentiation with respect to parameters other than the spatial ones.
The second type of derivative is simpler to define. Suppose we have a tensor field T of type (r, s) depending on an additional parameter t (for instance, this could be a time variable). Then, upon choosing some Cartesian coordinate system, we can write

\frac{\partial X^{i_1 \ldots i_r}_{j_1 \ldots j_s}}{\partial t} = \lim_{h \to 0} \frac{X^{i_1 \ldots i_r}_{j_1 \ldots j_s}(t + h, x^1, \ldots, x^n) - X^{i_1 \ldots i_r}_{j_1 \ldots j_s}(t, x^1, \ldots, x^n)}{h}. \tag{11.15}
The left hand side of 11.15 is a tensor since the fraction on the right hand side is constructed by means of two tensorial operations: difference and scalar multiplication. Taking the limit h → 0 preserves the tensorial nature of this fraction since the matrices of change of coordinates are time-independent.
So the differentiation with respect to external parameters is a tensorial operation producing new
tensors from existing ones.
Now let's consider the spatial derivative of the tensor field T, e.g. the derivative with respect to x¹. In this case we want to write the derivative as

\frac{\partial T^{i_1 \ldots i_r}_{j_1 \ldots j_s}}{\partial x^1} = \lim_{h \to 0} \frac{T^{i_1 \ldots i_r}_{j_1 \ldots j_s}(x^1 + h, \ldots, x^n) - T^{i_1 \ldots i_r}_{j_1 \ldots j_s}(x^1, \ldots, x^n)}{h}, \tag{11.16}

but in the numerator of the fraction on the right hand side of 11.16 we get the difference of two tensors bound to different points of space: the point x¹, . . . , xⁿ and the point x¹ + h, . . . , xⁿ.
In general we can't sum the components of tensors defined at different points, since these tensors are written with respect to distinct bases of vectors and covectors, as both bases vary with the point. In a Cartesian coordinate system we don't have this dependence: both tensors are written in the same basis and everything is well defined.
We now claim:


529 Theorem
For any tensor field T of type (r, s), the partial derivatives with respect to the spatial variables u¹, . . . , uⁿ,

\underbrace{\frac{\partial}{\partial u^a} \cdots \frac{\partial}{\partial u^c}}_{m}\, T^{i_1 \ldots i_r}_{j_1 \ldots j_s},

in any Cartesian coordinate system represent another tensor field of type (r, s + m).

Proof. Since T is a tensor,

T^{i_1 \ldots i_p}_{j_1 \ldots j_q}(u^1, \ldots, u^n) = \frac{\partial u^{i_1}}{\partial \bar u^{i'_1}} \cdots \frac{\partial u^{i_p}}{\partial \bar u^{i'_p}}\, \frac{\partial \bar u^{j'_1}}{\partial u^{j_1}} \cdots \frac{\partial \bar u^{j'_q}}{\partial u^{j_q}}\; \hat T^{i'_1 \ldots i'_p}_{j'_1 \ldots j'_q}(\bar u^1, \ldots, \bar u^n),

and so, by the product rule:

\frac{\partial}{\partial u^a} T^{i_1 \ldots i_p}_{j_1 \ldots j_q} = \frac{\partial}{\partial u^a}\!\left( \frac{\partial u^{i_1}}{\partial \bar u^{i'_1}} \cdots \frac{\partial \bar u^{j'_q}}{\partial u^{j_q}} \right) \hat T^{i'_1 \ldots i'_p}_{j'_1 \ldots j'_q} + \frac{\partial u^{i_1}}{\partial \bar u^{i'_1}} \cdots \frac{\partial \bar u^{j'_q}}{\partial u^{j_q}}\, \frac{\partial}{\partial u^a} \hat T^{i'_1 \ldots i'_p}_{j'_1 \ldots j'_q} \tag{11.17}

We are assuming that both coordinate systems are Cartesian, so the matrices

\frac{\partial u^{i_s}}{\partial \bar u^{i'_s}}, \qquad \frac{\partial \bar u^{j'_l}}{\partial u^{j_l}}

are constant matrices, and so

\frac{\partial}{\partial u^a} \frac{\partial u^{i_s}}{\partial \bar u^{i'_s}} = 0, \qquad \frac{\partial}{\partial u^a} \frac{\partial \bar u^{j'_l}}{\partial u^{j_l}} = 0.

Hence the first term in 11.17 vanishes, and

\frac{\partial}{\partial u^a} T^{i_1 \ldots i_p}_{j_1 \ldots j_q} = \frac{\partial u^{i_1}}{\partial \bar u^{i'_1}} \cdots \frac{\partial \bar u^{j'_q}}{\partial u^{j_q}}\, \frac{\partial \bar u^{a'}}{\partial u^a}\, \frac{\partial}{\partial \bar u^{a'}} \hat T^{i'_1 \ldots i'_p}_{j'_1 \ldots j'_q}(\bar u^1, \ldots, \bar u^n), \tag{11.20}

which is the transformation law of a tensor of type (p, q + 1). ■

530 Remark
We note that in general the partial derivative of a tensor is not a tensor. Given a vector field

v = v^j\, \frac{\partial r}{\partial u^j},

then

\frac{\partial v}{\partial u^i} = \frac{\partial v^j}{\partial u^i} \frac{\partial r}{\partial u^j} + v^j\, \frac{\partial^2 r}{\partial u^i \partial u^j}.

The term \frac{\partial^2 r}{\partial u^i \partial u^j} in general does not vanish if the coordinate system is not Cartesian.
531 Example
Calculate

\partial_{x_m} \partial_{\lambda_n} \left( A_{ij}\, \lambda^i x^j + B^{ij}\, x_i \lambda_j \right)

Solution: ▶

\partial_{x_m} \partial_{\lambda_n} \left( A_{ij}\, \lambda^i x^j + B^{ij}\, x_i \lambda_j \right) = A_{ij}\, \delta^i_n\, \delta^j_m + B^{ij}\, \delta^m_i\, \delta^n_j \tag{11.22}
= A_{nm} + B^{mn} \tag{11.23}

◀


532 Example
Prove that if F_{ik} is an antisymmetric tensor then

T_{ijk} = \partial_i F_{jk} + \partial_j F_{ki} + \partial_k F_{ij}

is a tensor.

Solution: ▶
The tensor F transforms as:

\bar F_{ab} = \frac{\partial x^j}{\partial x'^a}\, \frac{\partial x^k}{\partial x'^b}\, F_{jk}

Differentiating this relation,

\partial'_c \bar F_{ab} = \frac{\partial x^i}{\partial x'^c}\, \frac{\partial x^j}{\partial x'^a}\, \frac{\partial x^k}{\partial x'^b}\, \partial_i F_{jk} + \partial'_c\!\left( \frac{\partial x^j}{\partial x'^a}\, \frac{\partial x^k}{\partial x'^b} \right) F_{jk}.

The quantity

T_{ijk} = \partial_i F_{jk} + \partial_j F_{ki} + \partial_k F_{ij}

is totally antisymmetric under any index pair exchange. Now perform a coordinate change; T_{ijk} will transform as

\bar T_{abc} = \frac{\partial x^i}{\partial x'^a}\, \frac{\partial x^j}{\partial x'^b}\, \frac{\partial x^k}{\partial x'^c}\, T_{ijk} + I_{abc},

where I_{abc} collects the terms with derivatives of the transformation coefficients:

I_{abc} = \partial'_a\!\left( \frac{\partial x^j}{\partial x'^b}\, \frac{\partial x^k}{\partial x'^c} \right) F_{jk} + \cdots

Such I_{abc} will clearly also be totally antisymmetric under exchange of any pair of the indices a, b, c. Notice now that we can rewrite:

I_{abc} = \frac{\partial^2 x^j}{\partial x'^a \partial x'^b}\, \frac{\partial x^k}{\partial x'^c}\, F_{jk} + \frac{\partial x^j}{\partial x'^b}\, \frac{\partial^2 x^k}{\partial x'^a \partial x'^c}\, F_{jk} + \cdots

and these terms all vanish because the object is antisymmetric in the indices a, b, c while the mixed partial derivatives are symmetric (remember that an object both symmetric and antisymmetric is zero); hence T_{ijk} is a tensor. ◀

533 Problem
Give a more detailed explanation of why the time derivative of a tensor of type (r, s) is tensor of type
(r, s).

11.3. Integrals and the Tensor Divergence Theorem


It is also straightforward to do integrals. Since we can sum tensors and take limits, the definition of a tensor-valued integral is straightforward. For example,

\int_V T_{ij\cdots k}(x)\, dV

is a tensor of the same rank as T_{ij\cdots k} (think of the integral as the limit of a sum).
It is easy to generalize the divergence theorem from vectors to tensors.

534 Theorem (Divergence Theorem for Tensors)

Let T_{ij\cdots k\ell} be a continuously differentiable tensor field defined on a domain V with a piecewise-differentiable boundary S (i.e. for almost all points we have a well-defined normal vector n^\ell); then we have

\int_S T_{ij\cdots k\ell}\, n^\ell\, dS = \int_V \frac{\partial}{\partial x^\ell} T_{ij\cdots k\ell}\; dV,

with n being an outward pointing normal.

The regular divergence theorem is the case where T has one index and is a vector field.
Proof. The tensor form of the divergence theorem can be obtained by applying the usual divergence theorem to the vector field v defined by v_\ell = a_i b_j \cdots c_k\, T_{ij\cdots k\ell}, where a, b, \ldots, c are fixed constant vectors. Then

\nabla \cdot v = \frac{\partial v_\ell}{\partial x^\ell} = a_i b_j \cdots c_k\, \frac{\partial}{\partial x^\ell} T_{ij\cdots k\ell},

and

n \cdot v = n_\ell v_\ell = a_i b_j \cdots c_k\, T_{ij\cdots k\ell}\, n^\ell.

Since a, b, · · · , c are arbitrary, therefore they can be eliminated, and the tensor divergence theo-
rem follows. ■


11.4. Metric Tensor


This is a rank-2 tensor which may also be called the fundamental tensor.
The main purpose of the metric tensor is to generalize the concept of distance to general curvi-
linear coordinate frames and maintain the invariance of distance in different coordinate systems.
In orthonormal Cartesian coordinate systems the distance element squared, (ds)², between two infinitesimally neighboring points in space, one with coordinates x^i and the other with coordinates x^i + dx^i, is given by

(ds)^2 = dx^i\, dx^i = \delta_{ij}\, dx^i dx^j \tag{11.27}

This definition of distance is the key to introducing a rank-2 tensor, gij , called the metric tensor
which, for a general coordinate system, is defined by

(ds)^2 = g_{ij}\, dx^i dx^j \tag{11.28}

The metric tensor also has a contravariant form, i.e. g^{ij}.


The components of the metric tensor are given by:

g_{ij} = \hat{e}_i \cdot \hat{e}_j \qquad \& \qquad g^{ij} = \hat{e}^i \cdot \hat{e}^j \tag{11.29}

where the indexed \hat{e} are the covariant and contravariant basis vectors:

\hat{e}_i = \frac{\partial r}{\partial u^i} \qquad \& \qquad \hat{e}^i = \nabla u^i \tag{11.30}

where r is the position vector in Cartesian coordinates and ui is a generalized curvilinear coordi-
nate.
The mixed type metric tensor is given by:

g^i{}_j = \hat{e}^i \cdot \hat{e}_j = \delta^i{}_j \qquad \& \qquad g_i{}^j = \hat{e}_i \cdot \hat{e}^j = \delta_i{}^j \tag{11.31}

and hence it is the same as the unity tensor.


For a coordinate system in which the metric tensor can be cast in a diagonal form where the
diagonal elements are ±1 the metric is called flat.
For Cartesian coordinate systems, which are orthonormal flat-space systems, we have

g ij = δ ij = gij = δij (11.32)

The metric tensor is symmetric, that is

gij = gji & g ij = g ji (11.33)

The contravariant metric tensor is used for raising indices of covariant tensors and the covariant
metric tensor is used for lowering indices of contravariant tensors, e.g.

A^i = g^{ij} A_j \qquad A_i = g_{ij} A^j \tag{11.34}

where the metric tensor acts, like a Kronecker delta, as an index replacement operator. Hence, any
tensor can be cast into a covariant or a contravariant form, as well as a mixed form. However, the
order of the indices should be respected in this process, e.g.

A^i{}_j = g_{jk} A^{ik} \neq A_j{}^i = g_{jk} A^{ki} \tag{11.35}

Some authors insert dots (e.g. A_{\cdot j}{}^{i}) to remove any ambiguity about the order of the indices.
The covariant and contravariant metric tensors are inverses of each other, that is
\left[ g_{ij} \right] = \left[ g^{ij} \right]^{-1} \qquad \& \qquad \left[ g^{ij} \right] = \left[ g_{ij} \right]^{-1} \tag{11.36}

Hence

g^{ik} g_{kj} = \delta^i{}_j \qquad \& \qquad g_{ik} g^{kj} = \delta_i{}^j \tag{11.37}

It is common to reserve the “metric tensor” to the covariant form and call the contravariant form,
which is its inverse, the “associate” or “conjugate” or “reciprocal” metric tensor.
As a tensor, the metric has a significance regardless of any coordinate system although it requires
a coordinate system to be represented in a specific form.
For orthogonal coordinate systems the metric tensor is diagonal, i.e. gij = g ij = 0 for i ̸= j.
For flat-space orthonormal Cartesian coordinate systems in a 3D space, the metric tensor is given
by:

\left[ g_{ij} \right] = \left[ \delta_{ij} \right] = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} = \left[ \delta^{ij} \right] = \left[ g^{ij} \right] \tag{11.38}

For cylindrical coordinate systems with coordinates (ρ, ϕ, z), the metric tensor is given by:

\left[ g_{ij} \right] = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \rho^2 & 0 \\ 0 & 0 & 1 \end{pmatrix} \qquad \& \qquad \left[ g^{ij} \right] = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \dfrac{1}{\rho^2} & 0 \\ 0 & 0 & 1 \end{pmatrix} \tag{11.39}

For spherical coordinate systems with coordinates (r, θ, ϕ), the metric tensor is given by:

\left[ g_{ij} \right] = \begin{pmatrix} 1 & 0 & 0 \\ 0 & r^2 & 0 \\ 0 & 0 & r^2 \sin^2\theta \end{pmatrix} \qquad \& \qquad \left[ g^{ij} \right] = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \dfrac{1}{r^2} & 0 \\ 0 & 0 & \dfrac{1}{r^2 \sin^2\theta} \end{pmatrix} \tag{11.40}
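Equations 11.29–11.30 give a direct recipe for computing a metric: differentiate the position vector and take dot products. A SymPy sketch recovering the spherical metric of Eq. 11.40 (the variable names are ours):

```python
import sympy as sp

r, theta, phi = sp.symbols('r theta phi', positive=True)
u = [r, theta, phi]

# Position vector of spherical coordinates, in Cartesian components.
R = sp.Matrix([r*sp.sin(theta)*sp.cos(phi),
               r*sp.sin(theta)*sp.sin(phi),
               r*sp.cos(theta)])

# e_i = ∂R/∂u^i  and  g_ij = e_i · e_j   (Eqs. 11.29-11.30)
E = [R.diff(ui) for ui in u]
g = sp.Matrix(3, 3, lambda i, j: sp.simplify(E[i].dot(E[j])))

assert sp.simplify(g - sp.diag(1, r**2, r**2*sp.sin(theta)**2)) == sp.zeros(3, 3)
```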

11.5. Covariant Differentiation


Let {x¹, . . . , xⁿ} be a coordinate system and

\left\{ \left. \frac{\partial r}{\partial x^i} \right|_p : i \in \{1, \ldots, n\} \right\}

the associated basis. The metric tensor is

g_{ij} = \left\langle \frac{\partial r}{\partial x^i}, \frac{\partial r}{\partial x^j} \right\rangle.
Given a vector field

v = v^j\, \frac{\partial r}{\partial x^j},

then

\frac{\partial v}{\partial x^i} = \frac{\partial v^j}{\partial x^i} \frac{\partial r}{\partial x^j} + v^j\, \frac{\partial^2 r}{\partial x^i \partial x^j}.

The last term can be expressed as a linear combination of the tangent space basis vectors using the Christoffel symbols:

\frac{\partial^2 r}{\partial x^i \partial x^j} = \Gamma^k{}_{ij}\, \frac{\partial r}{\partial x^k}.

535 Definition
The covariant derivative ∇_{e_i} v, also written ∇_i v, is defined as:

\nabla_{e_i} v := \frac{\partial v}{\partial x^i} = \left( \frac{\partial v^k}{\partial x^i} + v^j\, \Gamma^k{}_{ij} \right) \frac{\partial r}{\partial x^k}.

The Christoffel symbols can be calculated using the inner product:

\left\langle \frac{\partial^2 r}{\partial x^i \partial x^j}, \frac{\partial r}{\partial x^l} \right\rangle = \Gamma^k{}_{ij} \left\langle \frac{\partial r}{\partial x^k}, \frac{\partial r}{\partial x^l} \right\rangle = \Gamma^k{}_{ij}\, g_{kl}.

On the other hand,

\frac{\partial g_{ab}}{\partial x^c} = \left\langle \frac{\partial^2 r}{\partial x^c \partial x^a}, \frac{\partial r}{\partial x^b} \right\rangle + \left\langle \frac{\partial r}{\partial x^a}, \frac{\partial^2 r}{\partial x^c \partial x^b} \right\rangle;

using the symmetry of the scalar product and swapping the order of partial differentiations we have

\frac{\partial g_{jk}}{\partial x^i} + \frac{\partial g_{ki}}{\partial x^j} - \frac{\partial g_{ij}}{\partial x^k} = 2 \left\langle \frac{\partial^2 r}{\partial x^i \partial x^j}, \frac{\partial r}{\partial x^k} \right\rangle

and so we have expressed the Christoffel symbols for the Levi-Civita connection in terms of the metric:

g_{kl}\, \Gamma^k{}_{ij} = \frac{1}{2} \left( \frac{\partial g_{jl}}{\partial x^i} + \frac{\partial g_{li}}{\partial x^j} - \frac{\partial g_{ij}}{\partial x^l} \right).

536 Definition
The Christoffel symbol of the second kind is defined by:

\Gamma^k_{ij} = \frac{g^{kl}}{2} \left( \frac{\partial g_{il}}{\partial x^j} + \frac{\partial g_{jl}}{\partial x^i} - \frac{\partial g_{ij}}{\partial x^l} \right) \tag{11.41}

where the indexed g is the metric tensor in its contravariant and covariant forms, with implied summation over l. It is noteworthy that the Christoffel symbols are not tensors.


The Christoffel symbols of the second kind are symmetric in their two lower indices:

Γkij = Γkji (11.42)


537 Example
For Cartesian coordinate systems, the Christoffel symbols are zero for all the values of indices.

538 Example
For cylindrical coordinate systems (ρ, ϕ, z), the Christoffel symbols are zero for all the values of indices except:

\Gamma^1_{22} = -\rho \tag{11.43}

\Gamma^2_{12} = \Gamma^2_{21} = \frac{1}{\rho}

where (1, 2, 3) stand for (ρ, ϕ, z).
539 Example
For spherical coordinate systems (r, θ, φ), the Christoffel symbols can be computed from

ds^2 = dr^2 + r^2\, d\theta^2 + r^2 \sin^2\theta\, d\varphi^2

We can easily then see that the metric tensor and the inverse metric tensor are:

g = \begin{pmatrix} 1 & 0 & 0 \\ 0 & r^2 & 0 \\ 0 & 0 & r^2 \sin^2\theta \end{pmatrix}, \qquad g^{-1} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & r^{-2} & 0 \\ 0 & 0 & r^{-2} \sin^{-2}\theta \end{pmatrix}

Using the formula

\Gamma^m_{ij} = \frac{1}{2}\, g^{ml} \left( \partial_j g_{il} + \partial_i g_{lj} - \partial_l g_{ji} \right),

where upper indices indicate the inverse matrix, we obtain (writing Γ^m for the matrix with entries Γ^m_{ij}):

\Gamma^1 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & -r & 0 \\ 0 & 0 & -r \sin^2\theta \end{pmatrix}

\Gamma^2 = \begin{pmatrix} 0 & \frac{1}{r} & 0 \\ \frac{1}{r} & 0 & 0 \\ 0 & 0 & -\sin\theta \cos\theta \end{pmatrix}

\Gamma^3 = \begin{pmatrix} 0 & 0 & \frac{1}{r} \\ 0 & 0 & \cot\theta \\ \frac{1}{r} & \cot\theta & 0 \end{pmatrix}
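The matrices of Example 539 can be recomputed from Eq. 11.41 with SymPy (a sketch; the helper name christoffel is ours):

```python
import sympy as sp

r, th = sp.symbols('r theta', positive=True)
u = [r, th, sp.Symbol('phi')]
g = sp.diag(1, r**2, r**2*sp.sin(th)**2)   # spherical metric
ginv = g.inv()

def christoffel(m, i, j):
    # Γ^m_ij = (1/2) g^{ml} (∂_j g_{il} + ∂_i g_{lj} - ∂_l g_{ij})
    return sp.simplify(sum(
        ginv[m, l]*(sp.diff(g[i, l], u[j]) + sp.diff(g[l, j], u[i])
                    - sp.diff(g[i, j], u[l])) for l in range(3)) / 2)

assert christoffel(0, 1, 1) == -r                           # Γ^1_22 = -r
assert christoffel(1, 0, 1) == 1/r                          # Γ^2_12 = 1/r
assert sp.simplify(christoffel(2, 1, 2) - sp.cot(th)) == 0  # Γ^3_23 = cot θ
```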

540 Theorem
Under a change of variable from (y 1 , . . . , y n ) to (x1 , . . . , xn ), the Christoffel symbol transform as

\bar\Gamma^k{}_{ij} = \frac{\partial x^p}{\partial y^i} \frac{\partial x^q}{\partial y^j}\, \Gamma^r{}_{pq}\, \frac{\partial y^k}{\partial x^r} + \frac{\partial y^k}{\partial x^m} \frac{\partial^2 x^m}{\partial y^i \partial y^j}

where the overline denotes the Christoffel symbols in the y coordinate system.

541 Definition (Derivatives of Tensors in Coordinates)

■ For a differentiable scalar f the covariant derivative is the same as the normal partial deriva-
tive, that is:
f;i = f,i = ∂i f (11.44)

This is justified by the fact that the covariant derivative is different from the normal partial
derivative because the basis vectors in general coordinate systems are dependent on their
spatial position, and since a scalar is independent of the basis vectors the covariant and par-
tial derivatives are identical.

■ For a differentiable vector A the covariant derivative is:

A_{j;i} = \partial_i A_j - \Gamma^k_{ji} A_k \quad \text{(covariant)}
A^j{}_{;i} = \partial_i A^j + \Gamma^j_{ki} A^k \quad \text{(contravariant)} \tag{11.45}

■ For a differentiable rank-2 tensor A the covariant derivative is:

A_{jk;i} = \partial_i A_{jk} - \Gamma^l_{ji} A_{lk} - \Gamma^l_{ki} A_{jl} \quad \text{(covariant)}
A^{jk}{}_{;i} = \partial_i A^{jk} + \Gamma^j_{li} A^{lk} + \Gamma^k_{li} A^{jl} \quad \text{(contravariant)} \tag{11.46}
A^k{}_{j;i} = \partial_i A^k{}_j + \Gamma^k_{li} A^l{}_j - \Gamma^l_{ji} A^k{}_l \quad \text{(mixed)}

■ For a differentiable rank-n tensor A the covariant derivative is:

A^{ij\ldots k}{}_{lm\ldots p;q} = \partial_q A^{ij\ldots k}{}_{lm\ldots p} + \Gamma^i_{aq} A^{aj\ldots k}{}_{lm\ldots p} + \Gamma^j_{aq} A^{ia\ldots k}{}_{lm\ldots p} + \cdots + \Gamma^k_{aq} A^{ij\ldots a}{}_{lm\ldots p} \tag{11.47}
\qquad\qquad - \Gamma^a_{lq} A^{ij\ldots k}{}_{am\ldots p} - \Gamma^a_{mq} A^{ij\ldots k}{}_{la\ldots p} - \cdots - \Gamma^a_{pq} A^{ij\ldots k}{}_{lm\ldots a}

Since the Christoffel symbols are identically zero in Cartesian coordinate systems, the covariant
derivative is the same as the normal partial derivative for all tensor ranks.
The covariant derivative of the metric tensor is zero in all coordinate systems.
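The last statement, g_{jk;i} = 0, can be verified symbolically for the spherical metric using the covariant-derivative formula 11.46 (a sketch; the helper name Gamma is ours):

```python
import sympy as sp

r, th, ph = sp.symbols('r theta phi', positive=True)
u = [r, th, ph]
g = sp.diag(1, r**2, r**2*sp.sin(th)**2)
ginv = g.inv()

def Gamma(m, i, j):
    # Christoffel symbol of the second kind, Eq. 11.41.
    return sum(ginv[m, l]*(sp.diff(g[i, l], u[j]) + sp.diff(g[l, j], u[i])
                           - sp.diff(g[i, j], u[l])) for l in range(3)) / 2

# g_{jk;i} = ∂_i g_{jk} - Γ^l_{ji} g_{lk} - Γ^l_{ki} g_{jl}  (Eq. 11.46, covariant)
for i in range(3):
    for j in range(3):
        for k in range(3):
            cov = (sp.diff(g[j, k], u[i])
                   - sum(Gamma(l, j, i)*g[l, k] for l in range(3))
                   - sum(Gamma(l, k, i)*g[j, l] for l in range(3)))
            assert sp.simplify(cov) == 0
```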

Several rules of normal differentiation similarly apply to covariant differentiation. For example,
covariant differentiation is a linear operation with respect to algebraic sums of tensor terms:

∂;i (aA ± bB) = a∂;i A ± b∂;i B (11.48)

where a and b are scalar constants and A and B are differentiable tensor fields. The product rule
of normal differentiation also applies to covariant differentiation of tensor multiplication:
\partial_{;i}(AB) = \left( \partial_{;i} A \right) B + A\, \partial_{;i} B \tag{11.49}

This rule is also valid for the inner product of tensors because the inner product is an outer prod-
uct operation followed by a contraction of indices, and covariant differentiation and contraction of
indices commute.
The covariant derivative operator can bypass the raising/lowering index operator:

Ai = gij Aj =⇒ ∂;m Ai = gij ∂;m Aj (11.50)

and hence the metric behaves like a constant with respect to the covariant operator.
A principal difference between normal partial differentiation and covariant differentiation is that
for successive differential operations the partial derivative operators do commute with each other
(assuming certain continuity conditions) but the covariant operators do not commute, that is

∂i ∂j = ∂j ∂i but ∂;i ∂;j ̸= ∂;j ∂;i (11.51)

Higher order covariant derivatives are similarly defined as derivatives of derivatives; however the
order of differentiation should be respected (refer to the previous point).

11.6. Geodesics and The Euler-Lagrange Equations


Given the metric tensor g in some domain U ⊂ Rn , the length of a continuously differentiable curve
γ : [a, b] → Rn is defined by
L(\gamma) = \int_a^b \sqrt{g_{\gamma(t)}\left( \dot\gamma(t), \dot\gamma(t) \right)}\; dt.

In coordinates, if γ(t) = (x¹(t), . . . , xⁿ(t)) then:

L(\gamma) = \int_a^b \sqrt{g_{\mu\nu}\, \frac{dx^\mu}{dt} \frac{dx^\nu}{dt}}\; dt

The distance d(p, q) between two points p and q is defined as the infimum of the length taken over
all continuous, piecewise continuously differentiable curves γ : [a, b] → Rn such that γ(a) = p
and γ(b) = q. The geodesics are then defined as the locally distance-minimizing paths.
So the geodesics are the curves γ such that the functional

L(\gamma) = \int_a^b \sqrt{g_{\gamma(t)}\left( \dot\gamma(t), \dot\gamma(t) \right)}\; dt

is minimized over all smooth (or piecewise smooth) curves γ such that γ(a) = p and γ(b) = q.
This problem can be simplified if we introduce the energy functional

E(\gamma) = \frac{1}{2} \int_a^b g_{\gamma(t)}\left( \dot\gamma(t), \dot\gamma(t) \right) dt.
For a piecewise C 1 curve, the Cauchy–Schwarz inequality gives

L(γ)2 ≤ 2(b − a)E(γ)

with equality if and only if g(γ′, γ′) is constant.
Hence the minimizers of E(γ) also minimize L(γ).
The previous problem is an example from the calculus of variations, which is concerned with the extrema of functionals. The fundamental problem of the calculus of variations is to find a function x(t) such that the functional

I(x) = \int_a^b f\left( t, x(t), x'(t) \right) dt

is minimized over all smooth (or piecewise smooth) functions x(t) satisfying certain boundary conditions, for example x(a) = A and x(b) = B.
If x̂(t) is the smooth function at which the desired minimum of I(x) occurs, and if I(x̂(t) + εη(t)) is defined for some arbitrary smooth function η(t) with η(a) = 0 and η(b) = 0, for small enough ε, then

I(\hat x + \varepsilon \eta) = \int_a^b f\left( t,\, \hat x + \varepsilon \eta,\, \hat x' + \varepsilon \eta' \right) dt

is now a function of ε, which must have a minimum at ε = 0. In that case, if I(ε) is smooth enough, we must have

\left. \frac{dI}{d\varepsilon} \right|_{\varepsilon = 0} = \int_a^b \left[ f_x(t, \hat x, \hat x')\, \eta(t) + f_{x'}(t, \hat x, \hat x')\, \eta'(t) \right] dt = 0.

If we integrate the second term by parts we get, using η(a) = 0 and η(b) = 0,

\int_a^b \left( f_x(t, \hat x, \hat x') - \frac{d}{dt} f_{x'}(t, \hat x, \hat x') \right) \eta(t)\, dt = 0.

One can then argue that since η(t) was arbitrary and x̂ is smooth, we must have the quantity in brackets identically zero. This gives the Euler-Lagrange equation:

\frac{\partial}{\partial x} f(t, x, x') - \frac{d}{dt} \frac{\partial}{\partial x'} f(t, x, x') = 0. \tag{11.52}

In general this gives a second-order ordinary differential equation which can be solved to obtain the extremal function x̂(t). We remark that the Euler–Lagrange equation is a necessary, but not a sufficient, condition for an extremum.
This can be generalized to many variables: given the functional

I(x) = \int_a^b f\left( t, x_1(t), x_1'(t), \ldots, x_n(t), x_n'(t) \right) dt,

we have the corresponding Euler-Lagrange equations:

\frac{\partial f}{\partial x_k} - \frac{d}{dt} \frac{\partial f}{\partial x_k'} = 0, \qquad k = 1, \ldots, n. \tag{11.53}

542 Theorem
A necessary condition for a curve γ to be a geodesic is

\frac{d^2 \gamma^\lambda}{dt^2} + \Gamma^\lambda{}_{\mu\nu}\, \frac{d\gamma^\mu}{dt} \frac{d\gamma^\nu}{dt} = 0

Proof. The geodesics are the minimizers of the functional

L(\gamma) = \int_a^b \sqrt{g_{\gamma(t)}\left( \dot\gamma(t), \dot\gamma(t) \right)}\; dt,

and hence, by the previous discussion, of the energy functional. Let

E = \frac{1}{2}\, g_{\mu\nu}\, \frac{dx^\mu}{d\lambda} \frac{dx^\nu}{d\lambda}.

We will write the Euler-Lagrange equations

\frac{d}{d\lambda} \frac{\partial E}{\partial (dx^\mu / d\lambda)} = \frac{\partial E}{\partial x^\mu}.

The right hand side is

\frac{\partial E}{\partial x^\lambda} = \frac{1}{2}\, \partial_\lambda g_{\mu\nu}\, \dot x^\mu \dot x^\nu.

The derivative on the left hand side is

\frac{\partial E}{\partial \dot x^\lambda} = g_{\mu\lambda}(x(\lambda))\, \dot x^\mu,

where we have made the dependence of g on λ clear for the next step. Now we differentiate with respect to the curve parameter:

\frac{d}{d\lambda} \left[ g_{\mu\lambda}(x(\lambda))\, \dot x^\mu \right] = \partial_\nu g_{\mu\lambda}\, \dot x^\mu \dot x^\nu + g_{\mu\lambda}\, \ddot x^\mu = \frac{1}{2} \partial_\nu g_{\mu\lambda}\, \dot x^\mu \dot x^\nu + \frac{1}{2} \partial_\mu g_{\nu\lambda}\, \dot x^\mu \dot x^\nu + g_{\mu\lambda}\, \ddot x^\mu.

Putting it all together, we obtain

g_{\mu\lambda}\, \ddot x^\mu = -\frac{1}{2} \left( \partial_\nu g_{\mu\lambda} + \partial_\mu g_{\nu\lambda} - \partial_\lambda g_{\mu\nu} \right) \dot x^\mu \dot x^\nu = -\Gamma_{\lambda\mu\nu}\, \dot x^\mu \dot x^\nu,

where in the last step we used the definition of the Christoffel symbols with three lower indices. Now contract with the inverse metric to raise the first index and cancel the metric on the left hand side. So

\ddot x^\lambda = -\Gamma^\lambda{}_{\mu\nu}\, \dot x^\mu \dot x^\nu. \qquad \blacksquare
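As a concrete check of the geodesic equation, the equator of the unit sphere, θ = π/2, φ = t, satisfies it identically. A short SymPy sketch using the spherical Christoffel symbols from the previous section (names are ours):

```python
import sympy as sp

t = sp.Symbol('t')
th = sp.pi/2          # equator of the unit sphere
ph = t                # traversed at unit speed

# Metric on the unit sphere in (theta, phi): g = diag(1, sin^2 theta).
# Nonzero Christoffel symbols: Γ^θ_φφ = -sinθ cosθ,  Γ^φ_θφ = Γ^φ_φθ = cot θ
geo_theta = sp.diff(th, t, 2) - sp.sin(th)*sp.cos(th)*sp.diff(ph, t)**2
geo_phi = sp.diff(ph, t, 2) + 2*sp.cot(th)*sp.diff(th, t)*sp.diff(ph, t)

assert sp.simplify(geo_theta) == 0
assert sp.simplify(geo_phi) == 0
```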

12. Applications of Tensor
12.1. The Inertia Tensor
Consider masses mα with positions rα , all rotating with angular velocity ω about 0. So the velocities
are vα = ω × rα . The total angular momentum is

L = \sum_\alpha r_\alpha \times m_\alpha v_\alpha = \sum_\alpha m_\alpha\, r_\alpha \times (\omega \times r_\alpha) = \sum_\alpha m_\alpha \left( |r_\alpha|^2\, \omega - (r_\alpha \cdot \omega)\, r_\alpha \right).

by vector identities. In components, we have

Li = Iij ωj ,

where

543 Definition (Inertia tensor)

The inertia tensor is defined as

I_{ij} = \sum_\alpha m_\alpha \left[ |r_\alpha|^2\, \delta_{ij} - (r_\alpha)_i (r_\alpha)_j \right].

For a rigid body occupying volume V with mass density ρ(r), we replace the sum with an integral to obtain

I_{ij} = \int_V \rho(r) \left( x_k x_k\, \delta_{ij} - x_i x_j \right) dV.

By inspection, I is a symmetric tensor.

544 Example
Consider a rotating cylinder with uniform density ρ0 . The total mass is 2ℓπa2 ρ0 .

[Figure: a cylinder of radius a and length 2ℓ, with the x₃-axis along its axis of symmetry and x₁, x₂ across it.]

Use cylindrical polar coordinates:

x_1 = r \cos\theta, \qquad x_2 = r \sin\theta, \qquad x_3 = x_3, \qquad dV = r\, dr\, d\theta\, dx_3

We have

I_{33} = \int_V \rho_0\, (x_1^2 + x_2^2)\, dV = \rho_0 \int_0^a \int_0^{2\pi} \int_{-\ell}^{\ell} r^2\, (r\, dr\, d\theta\, dx_3) = \rho_0 \cdot 2\pi \cdot 2\ell \left[ \frac{r^4}{4} \right]_0^a = \rho_0\, \pi \ell a^4.

Similarly, we have

I_{11} = \int_V \rho_0\, (x_2^2 + x_3^2)\, dV
= \rho_0 \int_0^a \int_0^{2\pi} \int_{-\ell}^{\ell} (r^2 \sin^2\theta + x_3^2)\, r\, dr\, d\theta\, dx_3
= \rho_0 \int_0^a \int_0^{2\pi} r \left( r^2 \sin^2\theta\, [x_3]_{-\ell}^{\ell} + \left[ \frac{x_3^3}{3} \right]_{-\ell}^{\ell} \right) d\theta\, dr
= \rho_0 \int_0^a \int_0^{2\pi} r \left( 2\ell\, r^2 \sin^2\theta + \frac{2}{3} \ell^3 \right) d\theta\, dr
= \rho_0 \left( 2\pi \frac{a^2}{2} \cdot \frac{2}{3} \ell^3 + 2\ell \int_0^a r^3\, dr \int_0^{2\pi} \sin^2\theta\, d\theta \right)
= \rho_0\, \pi a^2 \ell \left( \frac{a^2}{2} + \frac{2}{3} \ell^2 \right)

By symmetry, the result for I22 is the same.

How about the off-diagonal elements?

I_{13} = -\int_V \rho_0\, x_1 x_3\, dV = -\rho_0 \int_0^a \int_{-\ell}^{\ell} \int_0^{2\pi} r^2 \cos\theta\, x_3\, dr\, dx_3\, d\theta = 0

since \int_0^{2\pi} \cos\theta\, d\theta = 0. Similarly, the other off-diagonal elements are all 0. So the non-zero components are

I_{33} = \frac{1}{2} M a^2

I_{11} = I_{22} = M \left( \frac{a^2}{4} + \frac{\ell^2}{3} \right)

In the particular case where \ell = \frac{\sqrt{3}}{2} a, we have I_{ij} = \frac{1}{2} M a^2\, \delta_{ij}. So in this case,

L = \frac{1}{2} M a^2\, \omega
for rotation about any axis.
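The integrals for I₃₃ and I₁₁ can be confirmed with SymPy (a sketch; the symbol names are ours):

```python
import sympy as sp

r, th, x3, a, l, rho0 = sp.symbols('r theta x3 a ell rho0', positive=True)
M = 2*l*sp.pi*a**2*rho0  # total mass of the cylinder

# I33 = ∫ rho0 (x1^2 + x2^2) dV in cylindrical coordinates (dV = r dr dθ dx3)
I33 = sp.integrate(rho0 * r**2 * r, (r, 0, a), (th, 0, 2*sp.pi), (x3, -l, l))
assert sp.simplify(I33 - M*a**2/2) == 0

# I11 = ∫ rho0 (x2^2 + x3^2) dV
I11 = sp.integrate(rho0 * (r**2*sp.sin(th)**2 + x3**2) * r,
                   (r, 0, a), (th, 0, 2*sp.pi), (x3, -l, l))
assert sp.simplify(I11 - M*(a**2/4 + l**2/3)) == 0
```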
545 Example (Inertia Tensor of a Cube about the Center of Mass)
The high degree of symmetry here means we only need to do two out of nine possible integrals.
I_{xx} = \int dV\, \rho\, (y^2 + z^2) \tag{12.1}
= \rho \int_{-b/2}^{b/2} dx \int_{-b/2}^{b/2} dy \int_{-b/2}^{b/2} dz\, (y^2 + z^2) \tag{12.2}
= \rho b \int_{-b/2}^{b/2} dy \left( z y^2 + \frac{1}{3} z^3 \right)\Big|_{z=-b/2}^{z=b/2} \tag{12.3}
= \rho b \int_{-b/2}^{b/2} dy \left( b y^2 + \frac{1}{3} \frac{b^3}{4} \right) \tag{12.4}
= \rho b \left( \frac{1}{3} b y^3 + \frac{1}{12} b^3 y \right)\Big|_{y=-b/2}^{y=b/2} \tag{12.5}
= \rho b \left( \frac{1}{12} b^4 + \frac{1}{12} b^4 \right) \tag{12.6}
= \frac{1}{6} \rho b^5 = \frac{1}{6} M b^2. \tag{12.7}
6 6
On the other hand, all the off-diagonal moments are zero; for example, I_{xy} = \int dV\, \rho\, (-xy). This is an odd function of x and y, and our integration is symmetric about the origin in all directions, so it vanishes identically. So the inertia tensor of the cube about its center is

I = \frac{1}{6} M b^2 \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}.

12.1.1. The Parallel Axis Theorem
The Parallel Axis Theorem relates the inertia tensor about the center of gravity and the inertia tensor
about a parallel axis.
For this purpose we consider two coordinate systems: the first r = (x, y, z) with origin at the
center of mass of an arbitrary object, and the second r′ = (x′ , y ′ , z ′ ) offset by some distance. We
consider that the object is translated from the origin, but not rotated, by some constant vector a.
In vector form, the coordinates are related as

r′ = a + r.

Note that a points towards the center of mass - the direction is important.

546 Theorem
If I_{ij} is the inertia tensor calculated in center-of-mass coordinates, and J_{ij} is the tensor in the translated coordinates, then:

J_{ij} = I_{ij} + M \left( a^2 \delta_{ij} - a_i a_j \right).

547 Example (Inertia Tensor of a Cube about a corner)


The CM inertia tensor was
I = M b^2 \begin{pmatrix} 1/6 & 0 & 0 \\ 0 & 1/6 & 0 \\ 0 & 0 & 1/6 \end{pmatrix}

If instead we want the tensor about one corner of the cube, the displacement vector is

a = (b/2, b/2, b/2),

so a² = (3/4)b². We can construct the difference as a matrix: the diagonal components are

M[ (3/4)b² − (b/2)(b/2) ] = (1/2) M b²,

and the off-diagonal components are

M[ −(b/2)(b/2) ] = −(1/4) M b²,
so the shifted inertia tensor is
J = M b^2 \begin{pmatrix} 1/6 & 0 & 0 \\ 0 & 1/6 & 0 \\ 0 & 0 & 1/6 \end{pmatrix} + M b^2 \begin{pmatrix} 1/2 & -1/4 & -1/4 \\ -1/4 & 1/2 & -1/4 \\ -1/4 & -1/4 & 1/2 \end{pmatrix}  (12.8)

= M b^2 \begin{pmatrix} 2/3 & -1/4 & -1/4 \\ -1/4 & 2/3 & -1/4 \\ -1/4 & -1/4 & 2/3 \end{pmatrix}  (12.9)
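The matrix sum in (12.8)–(12.9) follows directly from Theorem 546; a short symbolic sketch (assuming sympy is available) builds J = I + M(a²δ_ij − a_i a_j) for a = (b/2, b/2, b/2):

```python
import sympy as sp

b, M = sp.symbols('b M', positive=True)
I_cm = (M * b**2 / 6) * sp.eye(3)             # inertia tensor about the center of mass
a = sp.Matrix([b/2, b/2, b/2])                # displacement vector to the corner
shift = M * (a.dot(a) * sp.eye(3) - a * a.T)  # M (a^2 delta_ij - a_i a_j)
J = sp.simplify(I_cm + shift)
print(J[0, 0], J[0, 1])  # 2*M*b**2/3 -M*b**2/4
```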


12.2. Ohm’s Law


Ohm’s law is an empirical law that states that there is a linear relationship between the electric
current j flowing through a material and the electric field E applied to this material. This law can
be written as
j = σE

where the constant of proportionality σ is known as the conductivity (the conductivity is defined as
the inverse of resistivity).
One important consequence of Ohm's law as written above is that the vectors j and E are necessarily parallel. This law is true for some materials, but not for all. For example, if the medium is made of alternate layers of a conductor and an insulator, then the current can only flow along the layers, regardless of the direction of the electric field. It is useful therefore to have an alternative formulation in which j and E do not have to be parallel.
This can be achieved by introducing the conductivity tensor, σik , which relates j and E through
the equation:
ji = σik Ek

We note that as j and E are vectors, it follows from the quotient rule that σik is a tensor.

12.3. Equation of Motion for a Fluid: Navier-Stokes Equation
12.3.1. Stress Tensor
The stress tensor consists of nine components σij that completely define the state of stress at a
point inside a material in the deformed state, placement, or configuration.
 
σ = \begin{pmatrix} σ_{11} & σ_{12} & σ_{13} \\ σ_{21} & σ_{22} & σ_{23} \\ σ_{31} & σ_{32} & σ_{33} \end{pmatrix}

The stress tensor can be separated into two components. One component is a hydrostatic or
dilatational stress that acts to change the volume of the material only; the other is the deviator
stress that acts to change the shape only.

\begin{pmatrix} σ_{11} & σ_{12} & σ_{31} \\ σ_{12} & σ_{22} & σ_{23} \\ σ_{31} & σ_{23} & σ_{33} \end{pmatrix} = \begin{pmatrix} σ_H & 0 & 0 \\ 0 & σ_H & 0 \\ 0 & 0 & σ_H \end{pmatrix} + \begin{pmatrix} σ_{11} - σ_H & σ_{12} & σ_{31} \\ σ_{12} & σ_{22} - σ_H & σ_{23} \\ σ_{31} & σ_{23} & σ_{33} - σ_H \end{pmatrix}

12.3.2. Derivation of the Navier-Stokes Equations
The Navier-Stokes equations can be derived from the conservation and continuity equations and
some properties of fluids. In order to derive the equations of fluid motion, we will first derive the
continuity equation, apply the equation to conservation of mass and momentum, and finally com-
bine the conservation equations with a physical understanding of what a fluid is.
The first assumption is that the motion of a fluid is described by the flow velocity of the fluid:

548 Definition
The flow velocity v of a fluid is a vector field

v = v(x, t)

which gives the velocity of an element of fluid at a position x and time t

Material Derivative

A normal derivative dT/dt is the rate of change of a property at a point. For instance, the value could be the rate of change of temperature at a point (x, y). However, a material derivative is the rate of change of a property on a particle in a velocity field. It incorporates two things:

■ the rate of change of the property, dL/dt;

■ the change in position of the particle in the velocity field v.

Therefore, the material derivative can be defined as

549 Definition (Material Derivative)


Given a function u(t, x, y, z),

Du/Dt = ∂u/∂t + (v · ∇)u.

Continuity Equation

An intensive property is a quantity whose value does not depend on the amount of the substance
for which it is measured. For example, the temperature of a system is the same as the temperature
of any part of it. If the system is divided the temperature of each subsystem is identical. The same
applies to the density of a homogeneous system; if the system is divided in half, the mass and the
volume change in the identical ratio and the density remains unchanged.
The volume will be denoted by U and its bounding surface area is referred to as ∂U . The conti-
nuity equation derived can later be applied to mass and momentum.

Reynolds' Transport Theorem The first basic assumption is Reynolds' Transport Theorem:


550 Theorem (Reynolds' Transport Theorem)

Let U be a region in Rⁿ with a C¹ boundary ∂U. Let x(t) be the positions of points in the region and let v(x, t) be the velocity field in the region. Let n(x, t) be the outward unit normal to the boundary. Let L(x, t) be a C² scalar field. Then

d/dt ∫_U L dV = ∫_U ∂L/∂t dV + ∫_{∂U} (v · n)L dA.

What we will write in a simplified way as

d/dt ∫_U L dV = −∫_{∂U} Lv · n dA − ∫_U Q dV.  (12.10)
The left hand side of the equation denotes the rate of change of the property L contained inside
the volume U . The right hand side is the sum of two terms:
■ A flux term, ∫_{∂U} Lv · n dA, which indicates how much of the property L is leaving the volume by flowing over the boundary ∂U.

■ A sink term, ∫_U Q dV, which describes how much of the property L is leaving the volume due to sinks or sources inside the boundary.
This equation states that the change in the total amount of a property is due to how much flows
out through the volume boundary as well as how much is lost or gained through sources or sinks
inside the boundary.
If the intensive property we’re dealing with is density, then the equation is simply a statement of
conservation of mass: the change in mass is the sum of what leaves the boundary and what appears
within it; no mass is left unaccounted for.

Divergence Theorem The Divergence Theorem allows the flux term of the above equation to be expressed as a volume integral. By the Divergence Theorem,

∫_{∂U} Lv · n dA = ∫_U ∇ · (Lv) dV.

Therefore, we can now rewrite our previous equation as

d/dt ∫_U L dV = −∫_U [ ∇ · (Lv) + Q ] dV.

Differentiating under the integral sign, we find that

∫_U (dL/dt) dV = −∫_U [ ∇ · (Lv) + Q ] dV.

Equivalently,

∫_U [ dL/dt + ∇ · (Lv) + Q ] dV = 0.

This relation applies to any volume U; the only way the above equality remains true for any volume U is if the integrand itself is zero. Thus, we arrive at the differential form of the continuity equation

dL/dt + ∇ · (Lv) + Q = 0.

Conservation of Mass

Applying the continuity equation to density, we obtain

dρ/dt + ∇ · (ρv) + Q = 0.

This is the conservation of mass because we are operating with a constant volume U. With no sources or sinks of mass (Q = 0),

dρ/dt + ∇ · (ρv) = 0.  (12.11)

Equation 12.11 is called conservation of mass.

In certain cases it is useful to simplify it further. For an incompressible fluid, the density is con-
stant. Setting the derivative of density equal to zero and dividing through by a constant ρ, we obtain
the simplest form of the equation
∇ · v = 0.

Conservation of Momentum

We start with
F = ma.

Writing b for the body force per unit volume and substituting density for mass, we get a similar equation

b = ρ (d/dt) v(x, y, z, t).
Applying the chain rule to the derivative of velocity, we get
b = ρ ( ∂v/∂t + (∂v/∂x)(∂x/∂t) + (∂v/∂y)(∂y/∂t) + (∂v/∂z)(∂z/∂t) ).

Equivalently,

b = ρ ( ∂v/∂t + v · ∇v ).
Substituting the value in parentheses for the definition of a material derivative, we obtain

Dv
ρ = b. (12.12)
Dt

Equations of Motion

The conservation equations derived above, in addition to a few assumptions about the forces and
the behaviour of fluids, lead to the equations of motion for fluids.
We assume that the body force on the fluid parcels is due to two components, fluid stresses and
other, external forces.
b = ∇ · σ + f. (12.13)

Here, σ is the stress tensor, and f represents external forces. Intuitively, the fluid stress is repre-
sented as the divergence of the stress tensor because the divergence is the extent to which the
tensor acts like a sink or source; in other words, the divergence of the tensor results in a momen-
tum source or sink, also known as a force. For many applications f is the gravity force, but for now
we will leave the equation in its most general form.

General Form of the Navier-Stokes Equation

We divide the stress tensor σ into the hydrostatic and deviator part. Denoting the stress deviator
tensor as T , we can make the substitution

σ = −pI + T. (12.14)

Substituting this into the previous equation, we arrive at the most general form of the Navier-
Stokes equation:
Dv
ρ = −∇p + ∇ · T + f. (12.15)
Dt

13. Integration of Forms

13.1. Differential Forms
551 Definition
A k-differential form field in Rn is an expression of the form

ω = ∑_{1≤j1≤j2≤···≤jk≤n} a_{j1 j2 ...jk} dx_{j1} ∧ dx_{j2} ∧ · · · ∧ dx_{jk},

where the aj1 j2 ...jk are differentiable functions in Rn .

A 0-differential form in Rn is simply a differentiable function in Rn .


552 Example

g(x, y, z, w) = x + y 2 + z 3 + w4

is a 0-form in R4 .

553 Example
An example of a 1-form field in R3 is

ω = xdx + y 2 dy + xyz 3 dz.

554 Example
An example of a 2-form field in R3 is

ω = x2 dx ∧ dy + y 2 dy ∧ dz + dz ∧ dx.

555 Example
An example of a 3-form field in R3 is

ω = (x + y + z)dx ∧ dy ∧ dz.

We shew now how to multiply differential forms.

556 Example
The product of the 1-form fields in R3

ω1 = ydx + xdy,

ω2 = −2xdx + 2ydy,

is
ω1 ∧ ω2 = (2x2 + 2y 2 )dx ∧ dy.

557 Definition
Let f (x1 , x2 , . . . , xn ) be a 0-form in Rn . The exterior derivative df of f is


df = ∑_{i=1}^{n} (∂f/∂x_i) dx_i.
Furthermore, if
ω = f (x1 , x2 , . . . , xn )dxj1 ∧ dxj2 ∧ · · · ∧ dxjk

is a k-form in Rn , the exterior derivative dω of ω is the (k + 1)-form

dω = df (x1 , x2 , . . . , xn ) ∧ dxj1 ∧ dxj2 ∧ · · · ∧ dxjk .

558 Example
If in R2 , ω = x3 y 4 , then
d(x3 y 4 ) = 3x2 y 4 dx + 4x3 y 3 dy.

559 Example
If in R2 , ω = x2 ydx + x3 y 4 dy then

dω = d(x2 ydx + x3 y 4 dy)


= (2xydx + x2 dy) ∧ dx + (3x2 y 4 dx + 4x3 y 3 dy) ∧ dy
= x2 dy ∧ dx + 3x2 y 4 dx ∧ dy
= (3x2 y 4 − x2 )dx ∧ dy

560 Example
Consider the change of variables x = u + v, y = uv. Then

dx = du + dv,

dy = vdu + udv,

whence
dx ∧ dy = (u − v)du ∧ dv.
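This change of variables can be verified by computing the Jacobian determinant, since dx ∧ dy = det(J) du ∧ dv; a sketch assuming sympy is available:

```python
import sympy as sp

u, v = sp.symbols('u v')
x = u + v
y = u * v
# Jacobian of (x, y) with respect to (u, v); its determinant is the
# coefficient in dx ∧ dy = det(J) du ∧ dv.
J = sp.Matrix([x, y]).jacobian([u, v])
print(J.det())  # u - v
```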

561 Example
Consider the transformation of coordinates xyz into uvw coordinates given by

z y+z
u = x + y + z, v = ,w= .
y+z x+y+z

Then

du = dx + dy + dz,
dv = −z/(y + z)² dy + y/(y + z)² dz,
dw = −(y + z)/(x + y + z)² dx + x/(x + y + z)² dy + x/(x + y + z)² dz.

Multiplication gives

du ∧ dv ∧ dw = ( −zx/((y + z)²(x + y + z)²) − xy/((y + z)²(x + y + z)²) − y(y + z)/((y + z)²(x + y + z)²) − z(y + z)/((y + z)²(x + y + z)²) ) dx ∧ dy ∧ dz
= −(y + z)(x + y + z)/((y + z)²(x + y + z)²) dx ∧ dy ∧ dz
= −1/((y + z)(x + y + z)) dx ∧ dy ∧ dz.
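The Jacobian determinant of this transformation can be checked symbolically; the following sketch assumes sympy is available:

```python
import sympy as sp

x, y, z = sp.symbols('x y z', positive=True)
u = x + y + z
v = z / (y + z)
w = (y + z) / (x + y + z)
# Determinant of the Jacobian of (u, v, w) with respect to (x, y, z),
# i.e. the coefficient in du ∧ dv ∧ dw = det(J) dx ∧ dy ∧ dz.
det = sp.simplify(sp.Matrix([u, v, w]).jacobian([x, y, z]).det())
print(det)
```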

13.2. Integrating Differential Forms


Let

ω = ∑_{i1<···<ik} a_{i1,...,ik}(x) dx_{i1} ∧ . . . ∧ dx_{ik}

be a differential form and M a differentiable manifold over which we wish to integrate, where M has the parameterization

M(u) = (x1(u), . . . , xk(u))

for u in the parameter domain D. Then the integral of the differential form over M is defined as

∫_M ω = ∫_D ∑_{i1<···<ik} a_{i1,...,ik}(M(u)) (∂(x_{i1}, . . . , x_{ik})/∂(u¹, . . . , u^k)) du¹ · · · du^k,

where the integral on the right-hand side is the standard Riemann integral over D, and

∂(x_{i1}, . . . , x_{ik})/∂(u¹, . . . , u^k)

is the determinant of the Jacobian.

13.3. Zero-Manifolds


562 Definition
A 0-dimensional oriented manifold of Rn is simply a point x ∈ Rn , with a choice of the + or − sign.
A general oriented 0-manifold is a union of oriented points.

563 Definition
Let M = +{b} ∪ −{a} be an oriented 0-manifold, and let ω be a 0-form. Then
∫_M ω = ω(b) − ω(a).

−x has opposite orientation to +x and


∫_{−x} ω = −∫_{+x} ω.

564 Example
Let M = −{(1, 0, 0)} ∪ +{(1, 2, 3)} ∪ −{(0, −2, 0)}¹ be an oriented 0-manifold, and let ω = x + 2y + z². Then

∫_M ω = −ω((1, 0, 0)) + ω((1, 2, 3)) − ω((0, −2, 0)) = −(1) + (14) − (−4) = 17.

13.4. One-Manifolds
565 Definition
A 1-dimensional oriented manifold of Rⁿ is simply an oriented smooth curve Γ ⊆ Rⁿ, with a choice
of a + orientation if the curve traverses in the direction of increasing t, or with a choice of a − sign
if the curve traverses in the direction of decreasing t. A general oriented 1-manifold is a union of
oriented curves.

The curve −Γ has opposite orientation to Γ and


ˆ ˆ
ω = − ω.
−Γ Γ
 
If f : R² → R² and if dr = (dx, dy)ᵀ, the classical way of writing this is

∫_Γ f · dr.

We now turn to the problem of integrating 1-forms.


1
Do not confuse, say, −{(1, 0, 0)} with −(1, 0, 0) = (−1, 0, 0). The first one means that the point (1, 0, 0) is given
negative orientation, the second means that (−1, 0, 0) is the additive inverse of (1, 0, 0).

566 Example
Calculate

∫_Γ xy dx + (x + y) dy

where Γ is the parabola y = x2 , x ∈ [−1; 2] oriented in the positive direction.

Solution: ▶ We parametrise the curve as x = t, y = t2 . Then

xy dx + (x + y) dy = t³ dt + (t + t²) d(t²) = (3t³ + 2t²) dt,

whence

∫_Γ ω = ∫_{−1}^{2} (3t³ + 2t²) dt = [ (3/4)t⁴ + (2/3)t³ ]_{−1}^{2} = 69/4.
What would happen if we had given the curve above a different parametrisation? First observe that
the curve travels from (−1, 1) to (2, 4) on the parabola y = x2 . These conditions are met with the
parametrisation x = √t − 1, y = (√t − 1)², t ∈ [0; 9]. Then

xy dx + (x + y) dy = (√t − 1)³ d(√t − 1) + ((√t − 1) + (√t − 1)²) d((√t − 1)²)
= (3(√t − 1)³ + 2(√t − 1)²) d(√t − 1)
= (1/(2√t)) (3(√t − 1)³ + 2(√t − 1)²) dt,

whence

∫_Γ ω = ∫₀⁹ (1/(2√t)) (3(√t − 1)³ + 2(√t − 1)²) dt = [ (3/4)t² − (7/3)t^{3/2} + (5/2)t − t^{1/2} ]₀⁹ = 69/4,
as before.

It turns out that if two different parametrisations of the same curve have the same ori-
entation, then their integrals are equal. Hence, we only need to worry about finding a
suitable parametrisation.
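Both parametrisations can be checked mechanically; the following sketch (assuming sympy is available) recomputes the integral each way:

```python
import sympy as sp

t = sp.symbols('t', positive=True)
# First parametrisation: x = t, y = t**2, t in [-1, 2].
x1, y1 = t, t**2
I1 = sp.integrate(x1*y1*sp.diff(x1, t) + (x1 + y1)*sp.diff(y1, t), (t, -1, 2))
# Second parametrisation: x = sqrt(t) - 1, y = (sqrt(t) - 1)**2, t in [0, 9].
x2 = sp.sqrt(t) - 1
y2 = (sp.sqrt(t) - 1)**2
I2 = sp.integrate(x2*y2*sp.diff(x2, t) + (x2 + y2)*sp.diff(y2, t), (t, 0, 9))
print(I1, sp.simplify(I2))  # both equal 69/4
```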

567 Example
Calculate the line integral

∫_Γ y sin x dx + x cos y dy,

where Γ is the line segment from (0, 0) to (1, 1) in the positive direction.

Solution: ▶ This line has equation y = x, so we choose the parametrisation x = y = t. The
integral is thus
∫_Γ y sin x dx + x cos y dy = ∫₀¹ (t sin t + t cos t) dt = [t(sin t − cos t)]₀¹ − ∫₀¹ (sin t − cos t) dt = 2 sin 1 − 1,

upon integrating by parts. ◀



568 Example
Calculate the path integral

∫_Γ (x + y)/(x² + y²) dy + (x − y)/(x² + y²) dx
around the closed square Γ = ABCD with A = (1, 1), B = (−1, 1), C = (−1, −1), and D =
(1, −1) in the direction ABCDA.

Solution: ▶ On AB, y = 1, dy = 0, on BC, x = −1, dx = 0, on CD, y = −1, dy = 0, and on


DA, x = 1, dx = 0. The integral is thus

∫_Γ ω = ∫_{AB} ω + ∫_{BC} ω + ∫_{CD} ω + ∫_{DA} ω
= ∫₁^{−1} (x − 1)/(x² + 1) dx + ∫₁^{−1} (y − 1)/(y² + 1) dy + ∫_{−1}^{1} (x + 1)/(x² + 1) dx + ∫_{−1}^{1} (y + 1)/(y² + 1) dy
= 4 ∫_{−1}^{1} dx/(x² + 1)
= 4 arctan x |_{−1}^{1}
= 2π. ◀
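The four one-variable integrals over the sides of the square can be verified with a short sketch (assuming sympy is available):

```python
import sympy as sp

x, y = sp.symbols('x y')
P = (x - y) / (x**2 + y**2)   # coefficient of dx
Q = (x + y) / (x**2 + y**2)   # coefficient of dy
I  = sp.integrate(P.subs(y, 1), (x, 1, -1))    # side AB: y = 1, dy = 0
I += sp.integrate(Q.subs(x, -1), (y, 1, -1))   # side BC: x = -1, dx = 0
I += sp.integrate(P.subs(y, -1), (x, -1, 1))   # side CD: y = -1, dy = 0
I += sp.integrate(Q.subs(x, 1), (y, -1, 1))    # side DA: x = 1, dx = 0
print(sp.simplify(I))  # 2*pi
```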

When the integral is along a closed path, like in the preceding example, it is customary to use the symbol ∮_Γ rather than ∫_Γ. The positive direction of integration is that sense that when traversing the path, the area enclosed by the curve is to the left of the curve.

569 Example
Calculate the path integral

∮_Γ x² dy + y² dx,

where Γ is the ellipse 9x2 + 4y 2 = 36 traversed once in the positive sense.

Solution: ▶ Parametrise the ellipse as x = 2 cos t, y = 3 sin t, t ∈ [0; 2π]. Observe that when
traversing this closed curve, the area of the ellipse is on the left hand side of the path, so this
parametrisation traverses the curve in the positive sense. We have
∮_Γ ω = ∫₀^{2π} ((4 cos² t)(3 cos t) + (9 sin² t)(−2 sin t)) dt = ∫₀^{2π} (12 cos³ t − 18 sin³ t) dt = 0. ◀

570 Definition
Let Γ be a smooth curve. The integral
∫_Γ f(x) ∥dx∥

is called the path integral of f along Γ.

571 Example
Find ∫_Γ x∥dx∥ where Γ is the triangle path starting at A : (−1, −1), going to B : (2, −2), and ending at C : (1, 2).

Solution: ▶ The lines passing through the given points have equations L_{AB} : y = (−x − 4)/3 and L_{BC} : y = −4x + 6. On L_{AB},

x∥dx∥ = x √((dx)² + (dy)²) = x √(1 + (−1/3)²) dx = (√10/3) x dx,

and on L_{BC},

x∥dx∥ = x √((dx)² + (dy)²) = x √(1 + (−4)²) dx = √17 x dx.

Hence

∫_Γ x∥dx∥ = ∫_{L_{AB}} x∥dx∥ + ∫_{L_{BC}} x∥dx∥ = ∫_{−1}^{2} (√10/3) x dx + ∫_{2}^{1} √17 x dx = √10/2 − (3√17)/2. ◀

Homework

Figure 13.1. Example 571.

572 Problem
Consider ∫_C x dx + y dy and ∫_C xy∥dx∥.

1. Evaluate ∫_C x dx + y dy where C is the straight line path that starts at (−1, 0), goes to (0, 1), and ends at (1, 0), by parametrising this path. Calculate also ∫_C xy∥dx∥ using this parametrisation.

2. Evaluate ∫_C x dx + y dy where C is the semicircle that starts at (−1, 0), goes to (0, 1), and ends at (1, 0), by parametrising this path. Calculate also ∫_C xy∥dx∥ using this parametrisation.

573 Problem
Find ∫_Γ x dx + y dy where Γ is the path shewn in figure ??, starting at O(0, 0), going on a straight line to A(4 cos π/6, 4 sin π/6), and continuing on an arc of a circle to B(4 cos 5π/6, 4 sin 5π/6).

574 Problem
Find ∮_Γ z dx + x dy + y dz where Γ is the intersection of the sphere x² + y² + z² = 1 and the plane x + y = 1, traversed in the positive direction.

13.5. Closed and Exact Forms


575 Lemma (Poincaré Lemma)
If ω is a p-differential form of continuously differentiable functions in Rn then

d(dω) = 0.

Proof. We will prove this by induction on p. For p = 0, if

ω = f(x1, x2, . . . , xn)

then

dω = ∑_{k=1}^{n} (∂f/∂x_k) dx_k

and

d(dω) = ∑_{k=1}^{n} d(∂f/∂x_k) ∧ dx_k
= ∑_{j=1}^{n} ∑_{k=1}^{n} (∂²f/∂x_j ∂x_k) dx_j ∧ dx_k
= ∑_{1≤j<k≤n} ( ∂²f/∂x_j ∂x_k − ∂²f/∂x_k ∂x_j ) dx_j ∧ dx_k
= 0,
since ω is continuously differentiable and so the mixed partial derivatives are equal. Consider now
an arbitrary p-form, p > 0. Since such a form can be written as

ω= aj1 j2 ...jp dxj1 ∧ dxj2 ∧ · · · dxjp ,
1≤j1 ≤j2 ≤···≤jp ≤n

where the aj1 j2 ...jp are continuous differentiable functions in Rn , we have



dω = daj1 j2 ...jp ∧ dxj1 ∧ dxj2 ∧ · · · dxjp
1≤j1 ≤j2 ≤···≤jp ≤n Ñ é
∑ ∑
n
∂aj1 j2 ...jp
= dxi ∧ dxj1 ∧ dxj2 ∧ · · · dxjp ,
1≤j1 ≤j2 ≤···≤jp ≤n i=1
∂xi

it is enough to prove that for each summand


d( da ∧ dx_{j1} ∧ dx_{j2} ∧ · · · ∧ dx_{jp} ) = 0.

But

d( da ∧ dx_{j1} ∧ · · · ∧ dx_{jp} ) = dda ∧ dx_{j1} ∧ · · · ∧ dx_{jp} − da ∧ d( dx_{j1} ∧ · · · ∧ dx_{jp} ) = −da ∧ d( dx_{j1} ∧ · · · ∧ dx_{jp} ),

since dda = 0 from the case p = 0. But an independent induction argument proves that

d( dx_{j1} ∧ dx_{j2} ∧ · · · ∧ dx_{jp} ) = 0,

completing the proof. ■

576 Definition
A differential form ω is said to be exact if there is a continuously differentiable function F such that

dF = ω.
577 Example
The differential form
xdx + ydy
is exact, since

x dx + y dy = d( (1/2)(x² + y²) ).

578 Example
The differential form
ydx + xdy

is exact, since
ydx + xdy = d (xy) .

579 Example
The differential form
x/(x² + y²) dx + y/(x² + y²) dy

is exact, since

x/(x² + y²) dx + y/(x² + y²) dy = d( (1/2) log_e(x² + y²) ).

Let ω = dF be an exact form. By the Poincaré Lemma (Theorem 575), dω = ddF = 0.


A result of Poincaré says that for certain domains (called star-shaped domains) the
converse is also true, that is, if dω = 0 on a star-shaped domain then ω is exact.

580 Example
Determine whether the differential form

ω = 2x(1 − e^y)/(1 + x²)² dx + e^y/(1 + x²) dy

is exact.

Solution: ▶ Assume there is a function F such that

dF = ω.

By the Chain Rule


dF = ∂F/∂x dx + ∂F/∂y dy.

This demands that

∂F/∂x = 2x(1 − e^y)/(1 + x²)²,
∂F/∂y = e^y/(1 + x²).

We have a choice here of integrating either the first, or the second expression. Since integrating the second expression (with respect to y) is easier, we find

F(x, y) = e^y/(1 + x²) + ϕ(x),

where ϕ(x) is a function depending only on x. To find it, we differentiate the obtained expression for F with respect to x and find

∂F/∂x = −2x e^y/(1 + x²)² + ϕ′(x).

Comparing this with our first expression for ∂F/∂x, we find

ϕ′(x) = 2x/(1 + x²)²,

that is,

ϕ(x) = −1/(1 + x²) + c,

where c is a constant. We then take

F(x, y) = (e^y − 1)/(1 + x²) + c. ◀
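One can confirm that dF = ω by differentiating the potential found above; a sketch assuming sympy is available:

```python
import sympy as sp

x, y = sp.symbols('x y')
F = (sp.exp(y) - 1) / (1 + x**2)          # the potential found above
wx = 2*x*(1 - sp.exp(y)) / (1 + x**2)**2  # dx-coefficient of omega
wy = sp.exp(y) / (1 + x**2)               # dy-coefficient of omega
print(sp.simplify(sp.diff(F, x) - wx))  # 0
print(sp.simplify(sp.diff(F, y) - wy))  # 0
```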

581 Example
Is there a continuously differentiable function F such that

dF = ω = y 2 z 3 dx + 2xyz 3 dy + 3xy 2 z 2 dz ?

Solution: ▶ We have

dω = (2yz 3 dy + 3y 2 z 2 dz) ∧ dx
+(2yz 3 dx + 2xz 3 dy + 6xyz 2 dz) ∧ dy
+(3y 2 z 2 dx + 6xyz 2 dy + 6xy 2 zdz) ∧ dz
= 0,

so this form is exact in a star-shaped domain. So put


dF = ∂F/∂x dx + ∂F/∂y dy + ∂F/∂z dz = y²z³ dx + 2xyz³ dy + 3xy²z² dz.

Then

∂F/∂x = y²z³ =⇒ F = xy²z³ + a(y, z),
∂F/∂y = 2xyz³ =⇒ F = xy²z³ + b(x, z),
∂F/∂z = 3xy²z² =⇒ F = xy²z³ + c(x, y).
Comparing these three expressions for F , we obtain F (x, y, z) = xy 2 z 3 . ◀
We have the following equivalent of the Fundamental Theorem of Calculus.

582 Theorem
Let U ⊆ Rn be an open set. Assume ω = dF is an exact form, and Γ a path in U with starting point
A and endpoint B. Then
∫_Γ ω = ∫_A^B dF = F(B) − F(A).

In particular, if Γ is a simple closed path, then

∮_Γ ω = 0.


583 Example
Evaluate the integral
∮_Γ 2x/(x² + y²) dx + 2y/(x² + y²) dy

where Γ is the closed polygon with vertices at A = (0, 0), B = (5, 0), C = (7, 2), D = (3, 2),
E = (1, 1), traversed in the order ABCDEA.

Solution: ▶ Observe that


d( 2x/(x² + y²) dx + 2y/(x² + y²) dy ) = −4xy/(x² + y²)² dy ∧ dx − 4xy/(x² + y²)² dx ∧ dy = 0,

and so the form is exact in a star-shaped domain. By virtue of Theorem 582, the integral is 0. ◀
584 Example
Calculate the path integral
∮_Γ (x² − y) dx + (y² − x) dy,

where Γ is a loop of x3 + y 3 − 2xy = 0 traversed once in the positive sense.

Solution: ▶ Since
∂/∂y (x² − y) = −1 = ∂/∂x (y² − x),
the form is exact, and since this is a closed simple path, the integral is 0. ◀

13.6. Two-Manifolds
585 Definition
A 2-dimensional oriented manifold of R² is simply an open set (region) D ⊆ R², where the +
orientation is counter-clockwise and the − orientation is clockwise. A general oriented 2-manifold
is a union of open sets.

The region −D has opposite orientation to D and


∫_{−D} ω = −∫_D ω.

We will often write

∫_D f(x, y) dA
where dA denotes the area element.

In this section, unless otherwise noted, we will choose the positive orientation for the
regions considered. This corresponds to using the area form dxdy.

Let D ⊆ R2 . Given a function f : D → R, the integral
∫_D f dA

is the sum of all the values of f restricted to D. In particular,

∫_D dA

is the area of D.

13.7. Three-Manifolds
586 Definition
A 3-dimensional oriented manifold of R³ is simply an open set (body) V ⊆ R³, where the + ori-
entation is in the direction of the outward pointing normal to the body, and the − orientation is in
the direction of the inward pointing normal to the body. A general oriented 3-manifold is a union of
open sets.

The region −M has opposite orientation to M and


∫_{−M} ω = −∫_M ω.

We will often write

∫_M f dV
where dV denotes the volume element.

In this section, unless otherwise noted, we will choose the positive orientation for the
regions considered. This corresponds to using the volume form dx ∧ dy ∧ dz.

Let V ⊆ R3 . Given a function f : V → R, the integral


∫_V f dV

is the sum of all the values of f restricted to V . In particular,

∫_V dV

is the oriented volume of V .


587 Example
Find

∫_{[0;1]³} x²y e^{xyz} dV.

Solution: ▶ The integral is

∫₀¹ ( ∫₀¹ ( ∫₀¹ x²y e^{xyz} dz ) dy ) dx = ∫₀¹ ( ∫₀¹ x(e^{xy} − 1) dy ) dx = ∫₀¹ (e^x − x − 1) dx = e − 5/2. ◀
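The iterated integral can be reproduced symbolically; a sketch assuming sympy is available:

```python
import sympy as sp

x, y, z = sp.symbols('x y z', positive=True)
# Integrate innermost over z first, then y, then x, as in the example.
I = sp.integrate(x**2 * y * sp.exp(x*y*z),
                 (z, 0, 1), (y, 0, 1), (x, 0, 1))
print(sp.simplify(I))  # E - 5/2
```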

13.8. Surface Integrals


588 Definition
A 2-dimensional oriented manifold of R³ is simply a smooth surface Σ ⊆ R³, where the + orienta-
tion is in the direction of the outward normal pointing away from the origin and the − orientation is
in the direction of the inward normal pointing towards the origin. A general oriented 2-manifold in
R3 is a union of surfaces.

The surface −Σ has opposite orientation to Σ and


∫_{−Σ} ω = −∫_Σ ω.

In this section, unless otherwise noted, we will choose the positive orientation for the
regions considered. This corresponds to using the ordered basis

{dy ∧ dz, dz ∧ dx, dx ∧ dy}.

589 Definition
Let f : R3 → R. The integral of f over the smooth surface Σ (oriented in the positive sense) is given
by the expression

∫_Σ f d²x.

Here

d²x = √((dx ∧ dy)² + (dz ∧ dx)² + (dy ∧ dz)²)
is the surface area element.

590 Example
Evaluate ∫_Σ z d²x where Σ is the outer surface of the section of the paraboloid z = x² + y², 0 ≤ z ≤ 1.

Solution: ▶ We parametrise the paraboloid as follows. Let x = u, y = v, z = u2 + v 2 . Observe
that the domain D of Σ is the unit disk u2 + v 2 ≤ 1. We see that

dx ∧ dy = du ∧ dv,

dy ∧ dz = −2udu ∧ dv,
dz ∧ dx = −2vdu ∧ dv,
and so
d²x = √(1 + 4u² + 4v²) du ∧ dv.

Now,

∫_Σ z d²x = ∫_D (u² + v²) √(1 + 4u² + 4v²) du dv.

To evaluate this last integral we use polar coordinates, and so

∫_D (u² + v²) √(1 + 4u² + 4v²) du dv = ∫₀^{2π} ∫₀¹ ρ³ √(1 + 4ρ²) dρ dθ = (π/12)(5√5 + 1/5). ◀
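The final polar integral is elementary; a sympy sketch (an addition, not from the original text) confirms the value:

```python
import sympy as sp

rho, theta = sp.symbols('rho theta', positive=True)
# The polar-coordinate integral from the example above.
I = sp.integrate(rho**3 * sp.sqrt(1 + 4*rho**2),
                 (rho, 0, 1), (theta, 0, 2*sp.pi))
expected = sp.pi/12 * (5*sp.sqrt(5) + sp.Rational(1, 5))
print(sp.simplify(I - expected))  # 0
```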

591 Example
Find the area of that part of the cylinder x2 + y 2 = 2y lying inside the sphere x2 + y 2 + z 2 = 4.

Solution: ▶ We have

x² + y² = 2y ⇐⇒ x² + (y − 1)² = 1.

We parametrise the cylinder by putting x = cos u, y − 1 = sin u, and z = v. Hence

dx = −sin u du, dy = cos u du, dz = dv,

whence

dx ∧ dy = 0, dy ∧ dz = cos u du ∧ dv, dz ∧ dx = sin u du ∧ dv,

and so

d²x = √((dx ∧ dy)² + (dz ∧ dx)² + (dy ∧ dz)²) = √(cos² u + sin² u) du ∧ dv = du ∧ dv.

The cylinder and the sphere intersect when x² + y² = 2y and x² + y² + z² = 4, that is, when z² = 4 − 2y, i.e. v² = 4 − 2(1 + sin u) = 2 − 2 sin u. Also 0 ≤ u ≤ 2π, since the whole cylinder lies inside the sphere. The integral is thus

∫_Σ d²x = ∫₀^{2π} ∫_{−√(2−2 sin u)}^{√(2−2 sin u)} dv du = ∫₀^{2π} 2√(2 − 2 sin u) du = 2√2 ∫₀^{2π} √(1 − sin u) du = 2√2 · 4√2 = 16. ◀

592 Example
Evaluate

∫_Σ x dydz + (z² − zx) dzdx − xy dxdy,

where Σ is the top side of the triangle with vertices at (2, 0, 0), (0, 2, 0), (0, 0, 4).

Solution: ▶ Observe that the plane passing through the three given points has equation 2x +
2y + z = 4. We project this plane onto the coordinate axes obtaining
∫_Σ x dydz = ∫₀⁴ ∫₀^{2−z/2} (2 − y − z/2) dy dz = 8/3,

∫_Σ (z² − zx) dzdx = ∫₀² ∫₀^{4−2x} (z² − zx) dz dx = 8,

−∫_Σ xy dxdy = −∫₀² ∫₀^{2−y} xy dx dy = −2/3,

and hence

∫_Σ x dydz + (z² − zx) dzdx − xy dxdy = 10. ◀

Homework
593 Problem
Evaluate ∫_Σ y d²x where Σ is the surface z = x + y², 0 ≤ x ≤ 1, 0 ≤ y ≤ 2.

594 Problem
Consider the cone z = √(x² + y²). Find the surface area of the part of the cone which lies between the planes z = 1 and z = 2.

595 Problem
Evaluate ∫_Σ x² d²x where Σ is the surface of the unit sphere x² + y² + z² = 1.

596 Problem
Evaluate ∫_S z d²x over the conical surface z = √(x² + y²) between z = 0 and z = 1.

597 Problem
You put a perfectly spherical egg through an egg slicer, resulting in n slices of identical height, but you forgot to peel it first! Shew that the amount of egg shell in any of the slices is the same. Your argument must use surface integrals.

598 Problem
Evaluate

∫_Σ xy dydz − x² dzdx + (x + z) dxdy,

where Σ is the top of the triangular region of the plane 2x + 2y + z = 6 bounded by the first octant.

13.9. Green’s, Stokes’, and Gauss’ Theorems


We are now in a position to state the general Stokes' Theorem.

599 Theorem (General Stokes' Theorem)

Let M be a smooth oriented manifold, having boundary ∂M. If ω is a differential form, then

∫_{∂M} ω = ∫_M dω.

In R2 , if ω is a 1-form, this takes the name of Green’s Theorem.


600 Example
Evaluate ∮_C (x − y³) dx + x³ dy where C is the circle x² + y² = 1.

Solution: ▶ We will first use Green’s Theorem and then evaluate the integral directly. We have

dω = d(x − y 3 ) ∧ dx + d(x3 ) ∧ dy
= (dx − 3y 2 dy) ∧ dx + (3x2 dx) ∧ dy
= (3y 2 + 3x2 )dx ∧ dy.

The region M is the area enclosed by the circle x² + y² = 1. Thus by Green's Theorem, and using polar coordinates,

∮_C (x − y³) dx + x³ dy = ∫_M (3y² + 3x²) dxdy = ∫₀^{2π} ∫₀¹ 3ρ² ρ dρ dθ = 3π/2.

Aliter: We can evaluate this integral directly, again resorting to polar coordinates.

∮_C (x − y³) dx + x³ dy = ∫₀^{2π} ((cos θ − sin³ θ)(−sin θ) + (cos³ θ)(cos θ)) dθ = ∫₀^{2π} (sin⁴ θ + cos⁴ θ − sin θ cos θ) dθ.

To evaluate the last integral, observe that 1 = (sin² θ + cos² θ)² = sin⁴ θ + 2 sin² θ cos² θ + cos⁴ θ, whence the integral equals

∫₀^{2π} (1 − 2 sin² θ cos² θ − sin θ cos θ) dθ = 3π/2. ◀


In general, let

ω = f(x, y) dx + g(x, y) dy

be a 1-form in R². Then

dω = df(x, y) ∧ dx + dg(x, y) ∧ dy
= ( ∂f/∂x dx + ∂f/∂y dy ) ∧ dx + ( ∂g/∂x dx + ∂g/∂y dy ) ∧ dy
= ( ∂g/∂x − ∂f/∂y ) dx ∧ dy,

which gives the classical Green's Theorem

∫_{∂M} f(x, y) dx + g(x, y) dy = ∫_M ( ∂g/∂x − ∂f/∂y ) dxdy.

In R3 , if ω is a 2-form, the above theorem takes the name of Gauss or the Divergence Theorem.

601 Example
Evaluate ∫_S (x − y) dydz + z dzdx − y dxdy where S is the surface of the sphere

x2 + y 2 + z 2 = 9

and the positive direction is the outward normal.

Solution: ▶ The region M is the interior of the sphere x² + y² + z² = 9. Now,

dω = (dx − dy) ∧ dy ∧ dz + dz ∧ dz ∧ dx − dy ∧ dx ∧ dy = dx ∧ dy ∧ dz.

The integral becomes

∫_M dxdydz = (4π/3)(27) = 36π.

Aliter: We could evaluate this integral directly. We have

∫_Σ (x − y) dydz = ∫_Σ x dydz,

since (x, y, z) 7→ −y is an odd function of y and the domain of integration is symmetric with respect to y. Now,

∫_Σ x dydz = ∫_{−3}^{3} ∫₀^{2π} |ρ| √(9 − ρ²) dρ dθ = 36π.

Also

∫_Σ z dzdx = 0,

since (x, y, z) 7→ z is an odd function of z and the domain of integration is symmetric with respect to z. Similarly

∫_Σ −y dxdy = 0,

since (x, y, z) 7→ −y is an odd function of y and the domain of integration is symmetric with respect to y. ◀
In general, let

ω = f(x, y, z) dy ∧ dz + g(x, y, z) dz ∧ dx + h(x, y, z) dx ∧ dy

be a 2-form in R³. Then

dω = df ∧ dy ∧ dz + dg ∧ dz ∧ dx + dh ∧ dx ∧ dy
= ( ∂f/∂x dx + ∂f/∂y dy + ∂f/∂z dz ) ∧ dy ∧ dz + ( ∂g/∂x dx + ∂g/∂y dy + ∂g/∂z dz ) ∧ dz ∧ dx + ( ∂h/∂x dx + ∂h/∂y dy + ∂h/∂z dz ) ∧ dx ∧ dy
= ( ∂f/∂x + ∂g/∂y + ∂h/∂z ) dx ∧ dy ∧ dz,

which gives the classical Gauss's Theorem

∫_{∂M} f dydz + g dzdx + h dxdy = ∫_M ( ∂f/∂x + ∂g/∂y + ∂h/∂z ) dxdydz.

Using classical notation, if

a = (f(x, y, z), g(x, y, z), h(x, y, z))ᵀ, dS = (dydz, dzdx, dxdy)ᵀ,

then

∫_M (∇ · a) dV = ∫_{∂M} a · dS.

The classical Stokes’ Theorem occurs when ω is a 1-form in R3 .


602 Example
Evaluate ∮_C y dx + (2x − z) dy + (z − x) dz where C is the intersection of the sphere x² + y² + z² = 4
and the plane z = 1.

Solution: ▶ We have

dω = (dy) ∧ dx + (2dx − dz) ∧ dy + (dz − dx) ∧ dz = −dx ∧ dy + 2dx ∧ dy + dy ∧ dz + dz ∧ dx = dx ∧ dy + dy ∧ dz + dz ∧ dx.

Since on C, z = 1, the surface Σ on which we are integrating is the inside of the circle x² + y² + 1 = 4, i.e., x² + y² = 3. Also, z = 1 implies dz = 0 and so

∫_Σ dω = ∫_Σ dxdy.

Since this is just the area of the circular region x² + y² ≤ 3, the integral evaluates to

∫_Σ dxdy = 3π. ◀
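As a consistency check for Stokes' theorem here, the line integral can also be computed directly on the circle x² + y² = 3, z = 1 (a sketch assuming sympy is available):

```python
import sympy as sp

t = sp.symbols('t')
r = sp.sqrt(3)
x, y, z = r*sp.cos(t), r*sp.sin(t), sp.Integer(1)  # the curve C: x^2 + y^2 = 3, z = 1
I = sp.integrate(y*sp.diff(x, t) + (2*x - z)*sp.diff(y, t) + (z - x)*sp.diff(z, t),
                 (t, 0, 2*sp.pi))
print(sp.simplify(I))  # 3*pi
```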

In general, let

ω = f(x, y, z) dx + g(x, y, z) dy + h(x, y, z) dz

be a 1-form in R³. Then

dω = df ∧ dx + dg ∧ dy + dh ∧ dz
= ( ∂f/∂x dx + ∂f/∂y dy + ∂f/∂z dz ) ∧ dx + ( ∂g/∂x dx + ∂g/∂y dy + ∂g/∂z dz ) ∧ dy + ( ∂h/∂x dx + ∂h/∂y dy + ∂h/∂z dz ) ∧ dz
= ( ∂h/∂y − ∂g/∂z ) dy ∧ dz + ( ∂f/∂z − ∂h/∂x ) dz ∧ dx + ( ∂g/∂x − ∂f/∂y ) dx ∧ dy,

which gives the classical Stokes' Theorem

∫_{∂M} f(x, y, z) dx + g(x, y, z) dy + h(x, y, z) dz = ∫_M ( ∂h/∂y − ∂g/∂z ) dydz + ( ∂f/∂z − ∂h/∂x ) dzdx + ( ∂g/∂x − ∂f/∂y ) dxdy.
Using classical notation, if

a = (f(x, y, z), g(x, y, z), h(x, y, z))ᵀ,   dr = (dx, dy, dz)ᵀ,   dS = (dy dz, dz dx, dx dy)ᵀ,

then

∫_M (∇ × a) · dS = ∫_{∂M} a · dr.
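The correspondence between the coefficients of dω and the components of ∇ × a can be checked symbolically. The following sketch (an addition, assuming sympy and an arbitrary sample field) compares the coefficients derived above with sympy's built-in curl:

```python
import sympy as sp
from sympy.vector import CoordSys3D, curl

N = CoordSys3D('N')
x, y, z = N.x, N.y, N.z
# Hypothetical sample components of the 1-form / vector field.
f = x*y*z
g = sp.sin(x) + z**2
h = x + y**2

a = f*N.i + g*N.j + h*N.k
c = curl(a)

# Coefficients of d(omega) as derived above.
dydz = sp.diff(h, y) - sp.diff(g, z)   # coefficient of dy ^ dz
dzdx = sp.diff(f, z) - sp.diff(h, x)   # coefficient of dz ^ dx
dxdy = sp.diff(g, x) - sp.diff(f, y)   # coefficient of dx ^ dy

assert sp.simplify(c.dot(N.i) - dydz) == 0
assert sp.simplify(c.dot(N.j) - dzdx) == 0
assert sp.simplify(c.dot(N.k) - dxdy) == 0
```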

13.9. Green’s, Stokes’, and Gauss’ Theorems

Homework
603 Problem
Evaluate ∮_C x³y dx + xy dy, where C is the square with vertices at (0, 0), (2, 0), (2, 2), and (0, 2).

604 Problem
Consider the triangle △ with vertices A : (0, 0), B : (1, 1), C : (−2, 2).

1. If L_{PQ} denotes the equation of the line joining P and Q, find L_{AB}, L_{AC}, and L_{BC}.

2. Evaluate
   ∮_{∂△} y² dx + x dy.

3. Find
   ∫_D (1 − 2y) dx ∧ dy,
   where D is the interior of △.
605 Problem
Problems 1 through 4 refer to the differential form

ω = x dy ∧ dz + y dz ∧ dx + 2z dx ∧ dy,

and the solid M whose boundaries are the paraboloid z = 1 − x² − y², 0 ≤ z ≤ 1, and the disc x² + y² ≤ 1, z = 0. The surface ∂M of the solid is positively oriented upon considering outward normals.

1. Prove that dω = 4 dx ∧ dy ∧ dz.

2. Prove that in Cartesian coordinates,
   ∫_{∂M} ω = ∫_{−1}^{1} ∫_{−√(1−x²)}^{√(1−x²)} ∫_{0}^{1−x²−y²} 4 dz dy dx.

3. Prove that in cylindrical coordinates,
   ∫_M dω = ∫_{0}^{2π} ∫_{0}^{1} ∫_{0}^{1−r²} 4r dz dr dθ.

4. Prove that
   ∫_{∂M} x dy dz + y dz dx + 2z dx dy = 2π.
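The cylindrical-coordinate integral in parts 3 and 4 can be checked with sympy (an addition, not part of the original problem set):

```python
import sympy as sp

r, theta, z = sp.symbols('r theta z')
# Part 3: integral of d(omega) = 4 dx^dy^dz over M, in cylindrical coordinates.
vol = sp.integrate(4*r, (z, 0, 1 - r**2), (r, 0, 1), (theta, 0, 2*sp.pi))
assert vol == 2*sp.pi
```

This is the value 2π claimed in part 4.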

606 Problem
Problems 1 through 4 refer to the box

M = {(x, y, z) ∈ R³ : 0 ≤ x ≤ 1, 0 ≤ y ≤ 1, 0 ≤ z ≤ 2},

the upper face of the box

U = {(x, y, z) ∈ R³ : 0 ≤ x ≤ 1, 0 ≤ y ≤ 1, z = 2},

the boundary of the box without the upper face, S = ∂M \ U, and the differential form

ω = (arctan y − x²) dy ∧ dz + (cos x sin z − y³) dz ∧ dx + (2zx + 6zy²) dx ∧ dy.

1. Prove that dω = 3y² dx ∧ dy ∧ dz.

2. Prove that
   ∫_{∂M} (arctan y − x²) dy dz + (cos x sin z − y³) dz dx + (2zx + 6zy²) dx dy = ∫_{0}^{2} ∫_{0}^{1} ∫_{0}^{1} 3y² dx dy dz = 2.
   Here the boundary of the box is positively oriented considering outward normals.

3. Prove that the integral on the upper face of the box is
   ∫_U (arctan y − x²) dy dz + (cos x sin z − y³) dz dx + (2zx + 6zy²) dx dy = ∫_{0}^{1} ∫_{0}^{1} (4x + 12y²) dx dy = 6.

4. Prove that the integral on the open box is
   ∫_{∂M∖U} (arctan y − x²) dy dz + (cos x sin z − y³) dz dx + (2zx + 6zy²) dx dy = −4.
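The three numerical claims fit together as total = 2, upper face = 6, remainder = 2 − 6 = −4; a sympy check (an addition):

```python
import sympy as sp

x, y, z = sp.symbols('x y z')
total = sp.integrate(3*y**2, (x, 0, 1), (y, 0, 1), (z, 0, 2))  # part 2
upper = sp.integrate(4*x + 12*y**2, (x, 0, 1), (y, 0, 1))      # part 3
assert total == 2
assert upper == 6
assert total - upper == -4                                     # part 4
```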

607 Problem
Problems 1 through 3 refer to a triangular surface T in R³ and a differential form ω. The vertices of T are at A(6, 0, 0), B(0, 12, 0), and C(0, 0, 3). The boundary ∂T of the triangle is oriented positively by starting at A, continuing to B, following to C, and ending again at A. The surface T is oriented positively by considering the top of the triangle, as viewed from a point far above the triangle. The differential form is

ω = (2xz + arctan eˣ) dx + (xz + (y + 1)ʸ) dy + (xy + y²/2 + log(1 + z²)) dz.

1. Prove that the equation of the plane that contains the triangle T is 2x + y + 4z = 12.

2. Prove that dω = y dy ∧ dz + (2x − y) dz ∧ dx + z dx ∧ dy.

3. Prove that
   ∫_{∂T} (2xz + arctan eˣ) dx + (xz + (y + 1)ʸ) dy + (xy + y²/2 + log(1 + z²)) dz = ∫_{0}^{3} ∫_{0}^{12−4z} y dy dz + ∫_{0}^{6} ∫_{0}^{3−x/2} 2x dz dx = 108.
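The two iterated integrals in part 3 can be evaluated with sympy (an addition):

```python
import sympy as sp

x, y, z = sp.symbols('x y z')
I1 = sp.integrate(y, (y, 0, 12 - 4*z), (z, 0, 3))      # = 72
I2 = sp.integrate(2*x, (z, 0, 3 - x/2), (x, 0, 6))     # = 36
assert I1 + I2 == 108
```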

608 Problem
Use Green's Theorem to prove that

∮_Γ (x² + 2y³) dy = 16π,

where Γ is the circle (x − 2)² + y² = 4. Also, prove this directly by using a path integral.

609 Problem
Let Γ denote the curve of intersection of the plane x + y = 2 and the sphere x² − 2x + y² − 2y + z² = 0, oriented clockwise when viewed from the origin. Use Stokes' Theorem to prove that

∮_Γ y dx + z dy + x dz = −2π√2.

Prove this directly by parametrising the boundary of the surface and evaluating the path integral.

610 Problem
Use Green's Theorem to evaluate

∮_C (x³ − y³) dx + (x³ + y³) dy,

where C is the positively oriented boundary of the region between the circles x² + y² = 2 and x² + y² = 4.

Part IV.

Appendix

A. Answers and Hints
106 Since polynomials are continuous functions and the image of a connected set is connected for a contin-
uous function, the image must be an interval of some sort. If the image were a finite interval, then f (x, kx)
would be bounded for every constant k, and so the image would just be the point f (0, 0). The possibilities
are thus

1. a single point (take for example, p(x, y) = 0),

2. a semi-infinite interval with an endpoint (take for example p(x, y) = x2 whose image is [0; +∞[),

3. a semi-infinite interval with no endpoint (take for example p(x, y) = (xy − 1)2 + x2 whose image is
]0; +∞[),

4. all real numbers (take for example p(x, y) = x).

120 0

121 2

122 c = 0.

123 0

126 By AM-GM,

x²y²z²/(x² + y² + z²) ≤ (x² + y² + z²)³/(27(x² + y² + z²)) = (x² + y² + z²)²/27 → 0

as (x, y, z) → (0, 0, 0).

138 0

139 2

140 c = 0.

141 0

144 By AM-GM,

x²y²z²/(x² + y² + z²) ≤ (x² + y² + z²)³/(27(x² + y² + z²)) = (x² + y² + z²)²/27 → 0

as (x, y, z) → (0, 0, 0).

172 We have

F(x + h) − F(x) = (x + h) × L(x + h) − x × L(x)
                = (x + h) × (L(x) + L(h)) − x × L(x)
                = x × L(h) + h × L(x) + h × L(h).

Now, we will prove that ∥h × L(h)∥/∥h∥ → 0 as h → 0. For let

h = Σ_{k=1}^{n} h_k e_k,

where the e_k are the standard basis for Rⁿ. Then

L(h) = Σ_{k=1}^{n} h_k L(e_k),

and hence, by the triangle inequality and by the Cauchy-Bunyakovsky-Schwarz inequality,

∥L(h)∥ ≤ Σ_{k=1}^{n} |h_k| ∥L(e_k)∥
       ≤ (Σ_{k=1}^{n} |h_k|²)^{1/2} (Σ_{k=1}^{n} ∥L(e_k)∥²)^{1/2}
       = ∥h∥ (Σ_{k=1}^{n} ∥L(e_k)∥²)^{1/2},

whence, by the bound ∥u × v∥ ≤ ∥u∥ ∥v∥,

∥h × L(h)∥ ≤ ∥h∥ ∥L(h)∥ ≤ ∥h∥² (Σ_{k=1}^{n} ∥L(e_k)∥²)^{1/2}.

And so

∥h × L(h)∥/∥h∥ ≤ ∥h∥ (Σ_{k=1}^{n} ∥L(e_k)∥²)^{1/2} → 0.

184 Observe that

f(x, y) = x    if x ≤ y²,
          y²   if x > y².

Hence

∂f/∂x (x, y) = 1   if x < y²,
               0   if x > y²,

and

∂f/∂y (x, y) = 0   if x < y²,
               2y  if x > y².

185 Observe that (writing matrices row by row, with rows separated by semicolons)

g(1, 0, 1) = (3, 0)ᵀ,   f′(x, y) = ( y²  2xy ; 2xy  x² ),   g′(x, y) = ( 1  −1  2 ; y  x  0 ),
and hence

g′(1, 0, 1) = ( 1  −1  2 ; 0  1  0 ),   f′(g(1, 0, 1)) = f′(3, 0) = ( 0  0 ; 0  9 ).

This gives, via the Chain Rule,

(f ◦ g)′(1, 0, 1) = f′(g(1, 0, 1)) g′(1, 0, 1) = ( 0  0 ; 0  9 )( 1  −1  2 ; 0  1  0 ) = ( 0  0  0 ; 0  9  0 ).

The composition g ◦ f is undefined, for the output of f lies in R², while the input of g must lie in R³.
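The answer only specifies f′ and g′; choosing candidate maps consistent with those derivatives (an assumption: f(x, y) = (xy², x²y) and g(x, y, z) = (x − y + 2z, xy) are hypothetical antiderivative choices), the Chain Rule product can be confirmed with sympy:

```python
import sympy as sp

x, y, z, u, v = sp.symbols('x y z u v')
# Hypothetical maps whose Jacobians match the f' and g' quoted above.
f = sp.Matrix([u*v**2, u**2*v])          # f'(u,v) = ( v^2  2uv ; 2uv  u^2 )
g = sp.Matrix([x - y + 2*z, x*y])        # g'(x,y,z) = ( 1  -1  2 ; y  x  0 )

fg = f.subs({u: g[0], v: g[1]})          # the composition f o g
J = fg.jacobian([x, y, z]).subs({x: 1, y: 0, z: 1})
assert J == sp.Matrix([[0, 0, 0], [0, 9, 0]])
```

Computing the Jacobian of the composition directly reproduces the matrix obtained via the Chain Rule.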

186 Since f(0, 1) = (0, 1)ᵀ, the Chain Rule gives

(g ◦ f)′(0, 1) = (g′(f(0, 1)))(f′(0, 1)) = (g′(0, 1))(f′(0, 1)) = ( 1  −1 ; 0  0 ; 1  1 )( 1  0 ; 1  1 ) = ( 0  −1 ; 0  0 ; 2  1 ).

189 We have

∂/∂x [(x + z)² + (y + z)²] = ∂/∂x [8] ⟹ 2(1 + ∂z/∂x)(x + z) + 2 (∂z/∂x)(y + z) = 0.

At (1, 1, 1) the last equation becomes

4(1 + ∂z/∂x) + 4 ∂z/∂x = 0 ⟹ ∂z/∂x = −1/2.
219 a) Here ∇T = (y + z)i + (x + z)j + (y + x)k. The maximum rate of change at (1, 1, 1) is |∇T(1, 1, 1)| = 2√3, and the direction cosines are

∇T/|∇T| = (1/√3) i + (1/√3) j + (1/√3) k = cos α i + cos β j + cos γ k.

b) The required derivative is

∇T(1, 1, 1) • (3i − 4k)/|3i − 4k| = −2/5.
220 a) Here ∇ϕ = F requires ∇ × F = 0, which is not the case here, so there is no solution.
b) Here ∇ × F = 0, so that

ϕ(x, y, z) = x²y + y²z + z + c.

221 ∇f(x, y, z) = (e^{yz}, xz e^{yz}, xy e^{yz}) ⟹ (∇f)(2, 1, 1) = (e, 2e, 2e).

222 (∇ × f)(x, y, z) = (0, x, y e^{xy}) ⟹ (∇ × f)(2, 1, 1) = (0, 2, e²).

224 The vector (1, −7, 0) is perpendicular to the plane. Put f(x, y, z) = x² + y² − 5xy + xz − yz + 3. Then (∇f)(x, y, z) = (2x − 5y + z, 2y − 5x − z, x − y). Observe that ∇f(x, y, z) must be parallel to the vector (1, −7, 0), and hence there exists a constant a such that

(2x − 5y + z, 2y − 5x − z, x − y) = a (1, −7, 0) ⟹ x = a, y = a, z = 4a.

Since the point is on the plane,

x − 7y = −6 ⟹ a − 7a = −6 ⟹ a = 1.

Thus x = y = 1 and z = 4.

227 Observe that

f(0, 0) = 1,   f_x(x, y) = (cos 2y) e^{x cos 2y} ⟹ f_x(0, 0) = 1,
f_y(x, y) = −2x sin 2y e^{x cos 2y} ⟹ f_y(0, 0) = 0.

Hence

f(x, y) ≈ f(0, 0) + f_x(0, 0)(x − 0) + f_y(0, 0)(y − 0) ⟹ f(x, y) ≈ 1 + x.

This gives f(0.1, −0.2) ≈ 1 + 0.1 = 1.1.
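A quick numerical comparison of the linear approximation with the exact value (an addition):

```python
import math

# f(x, y) = e^{x cos 2y}; its linearisation at (0, 0) is 1 + x.
f = lambda x, y: math.exp(x * math.cos(2 * y))
exact = f(0.1, -0.2)
approx = 1 + 0.1
assert abs(exact - approx) < 0.01   # exact is about 1.0965, close to 1.1
```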

228 This is essentially the product rule: d(uv) = u dv + v du, where ∇ acts as the differential operator and × is the product. Recall that when we defined the volume of a parallelepiped spanned by the vectors a, b, c, we
saw that
a • (b × c) = (a × b) • c.
Treating ∇ = ∇u + ∇v as a vector, first keeping v constant and then keeping u constant we then see that

∇u • (u × v) = (∇ × u) • v, ∇v • (u × v) = −∇ • (v × u) = −(∇ × v) • u.

Thus

∇ • (u × v) = (∇u + ∇v ) • (u × v) = ∇u • (u × v) + ∇v • (u × v) = (∇ × u) • v − (∇ × v) • u.
231 An angle of π/6 with the x-axis and π/3 with the y-axis.
 
323 Let (x, y, z)ᵀ be a point on S. If this point were on the xz plane, it would be on the ellipse, and its distance to the axis of rotation would be |x| = (1/2)√(1 − z²). Anywhere else, the distance from (x, y, z)ᵀ to the z-axis is the distance of this point to the point (0, 0, z)ᵀ, namely √(x² + y²). This distance is the same as the length of the segment on the xz-plane going from the z-axis. We thus have

√(x² + y²) = (1/2)√(1 − z²),

or

4x² + 4y² + z² = 1.
 
324 Let (x, y, z)ᵀ be a point on S. If this point were on the xy plane, it would be on the line, and its distance to the axis of rotation would be |x| = (1/3)|1 − 4y|. Anywhere else, the distance of (x, y, z)ᵀ to the axis of rotation is the same as the distance of (x, y, z)ᵀ to (0, y, 0)ᵀ, that is, √(x² + z²). We must have

√(x² + z²) = (1/3)|1 − 4y|,

which is to say

9x² + 9z² − 16y² + 8y − 1 = 0.

325 A spiral staircase.

326 A spiral staircase.

328 The planes A : x + z = 0 and B : y = 0 are secant. The surface has an equation of the form f(A, B) = e^{A² + B²} − A = 0, and it is thus a cylinder. The directrix has direction i − k.

329 Rearranging,

(x² + y² + z²)² − (1/2)((x + y + z)² − (x² + y² + z²)) − 1 = 0,

and so we may take A : x + y + z = 0, S : x² + y² + z² = 0, shewing that the surface is of revolution. Its axis is the line in the direction i + j + k.

330 Considering the planes A : x − y = 0, B : y − z = 0, the equation takes the form

f(A, B) = 1/A + 1/B − 1/(A + B) − 1 = 0,

thus the equation represents a cylinder. To find its directrix, we find the intersection of the planes x = y and y = z. This gives (x, y, z)ᵀ = t(1, 1, 1)ᵀ. The direction vector is thus i + j + k.

331 Rearranging,

(x + y + z)² − (x² + y² + z²) + 2(x + y + z) + 2 = 0,

so we may take A : x + y + z = 0, S : x² + y² + z² = 0 as our plane and sphere. The axis of revolution is then in the direction of i + j + k.

332 After rearranging, we obtain

(z − 1)² − xy = 0,

or

−(x/(z − 1))(y/(z − 1)) + 1 = 0.

Considering the planes

A : x = 0,   B : y = 0,   C : z = 1,

we see that our surface is a cone, with apex at (0, 0, 1).

333 The largest circle has radius b. Parallel cross sections of the ellipsoid are similar ellipses, hence we may increase the size of these by moving towards the centre of the ellipse. Every plane through the origin which makes a circular cross section must intersect the yz-plane, and the diameter of any such cross section must be a diameter of the ellipse x = 0, y²/b² + z²/c² = 1. Therefore, the radius of the circle is at most b. Arguing similarly on the xy-plane shews that the radius of the circle is at least b. To shew that circular cross sections of radius b actually exist, one may verify that the two planes given by a²(b² − c²)z² = c²(a² − b²)x² give circular cross sections of radius b.

334 Any hyperboloid oriented like the one in the figure has an equation of the form

z²/c² = x²/a² + y²/b² − 1.

When z = 0 we must have

4x² + y² = 1 ⟹ a = 1/2, b = 1.

Thus

z²/c² = 4x² + y² − 1.

Hence, letting z = ±2,

4/c² = 4x² + y² − 1 ⟹ 1/c² = x² + y²/4 − 1/4 = 1 − 1/4 = 3/4,

since at z = ±2, x² + y²/4 = 1. The equation is thus

3z²/4 = 4x² + y² − 1.

572

1. Let L1 : y = x + 1, L2 : y = −x + 1. Then

∫_C x dx + y dy = ∫_{L1} x dx + y dy + ∫_{L2} x dx + y dy
               = ∫_{−1}^{0} (x dx + (x + 1) dx) + ∫_{0}^{1} (x dx − (−x + 1) dx)
               = 0.

Also, both on L1 and on L2 we have ∥dx∥ = √2 dx, thus

∫_C xy ∥dx∥ = ∫_{L1} xy ∥dx∥ + ∫_{L2} xy ∥dx∥
            = √2 ∫_{−1}^{0} x(x + 1) dx + √2 ∫_{0}^{1} x(−x + 1) dx
            = 0.

2. We put x = sin t, y = cos t, t ∈ [−π/2 ; π/2]. Then

∫_C x dx + y dy = ∫_{−π/2}^{π/2} ((sin t)(cos t) − (cos t)(sin t)) dt = 0.

Also, ∥dx∥ = √((cos t)² + (− sin t)²) dt = dt, and thus

∫_C xy ∥dx∥ = ∫_{−π/2}^{π/2} (sin t)(cos t) dt = [(sin t)²/2]_{−π/2}^{π/2} = 0.
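Both computations can be replayed in sympy (an addition):

```python
import sympy as sp

x, t = sp.symbols('x t')
# Part 1: along L1 (y = x + 1) and L2 (y = -x + 1).
I1 = (sp.integrate(x + (x + 1), (x, -1, 0))
      + sp.integrate(x - (-x + 1), (x, 0, 1)))
I2 = (sp.sqrt(2)*sp.integrate(x*(x + 1), (x, -1, 0))
      + sp.sqrt(2)*sp.integrate(x*(-x + 1), (x, 0, 1)))
# Part 2: along the semicircle x = sin t, y = cos t.
I3 = sp.integrate(sp.sin(t)*sp.cos(t) - sp.cos(t)*sp.sin(t), (t, -sp.pi/2, sp.pi/2))
I4 = sp.integrate(sp.sin(t)*sp.cos(t), (t, -sp.pi/2, sp.pi/2))
assert I1 == I2 == I3 == I4 == 0
```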


573 Let Γ1 denote the straight line segment path from O to A = (2√3, 2), and let Γ2 denote the arc of the circle centred at (0, 0) with radius 4, going counterclockwise from θ = π/6 to θ = π/5.

Observe that the Cartesian equation of the line OA is y = x/√3. Then on Γ1,

x dx + y dy = x dx + (x/√3) d(x/√3) = (4/3) x dx.

Hence

∫_{Γ1} x dx + y dy = ∫_{0}^{2√3} (4/3) x dx = 8.

On the arc of the circle we may put x = 4 cos θ, y = 4 sin θ and integrate from θ = π/6 to θ = π/5. Observe that there

x dx + y dy = 16 cos θ d(cos θ) + 16 sin θ d(sin θ) = −16 sin θ cos θ dθ + 16 sin θ cos θ dθ = 0,

and since the integrand is 0, the integral will be zero.

Assembling these two pieces,

∫_Γ x dx + y dy = ∫_{Γ1} x dx + y dy + ∫_{Γ2} x dx + y dy = 8 + 0 = 8.

Using the same parametrisations, we find on Γ1 that

x ∥dx∥ = x √((dx)² + (dy)²) = x √(1 + 1/3) dx = (2/√3) x dx,

whence

∫_{Γ1} x ∥dx∥ = ∫_{0}^{2√3} (2/√3) x dx = 4√3.

On Γ2,

x ∥dx∥ = x √((dx)² + (dy)²) = 16 cos θ √(sin² θ + cos² θ) dθ = 16 cos θ dθ,

whence

∫_{Γ2} x ∥dx∥ = ∫_{π/6}^{π/5} 16 cos θ dθ = 16 sin(π/5) − 16 sin(π/6) = 16 sin(π/5) − 8.

Assembling these, we gather that

∫_Γ x ∥dx∥ = ∫_{Γ1} x ∥dx∥ + ∫_{Γ2} x ∥dx∥ = 4√3 − 8 + 16 sin(π/5).
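These values can be confirmed with sympy (an addition):

```python
import sympy as sp

x, theta = sp.symbols('x theta')
work = sp.integrate(sp.Rational(4, 3)*x, (x, 0, 2*sp.sqrt(3)))       # x dx + y dy on Gamma_1
line = sp.integrate(2*x/sp.sqrt(3), (x, 0, 2*sp.sqrt(3)))            # x ||dx|| on Gamma_1
arc = sp.integrate(16*sp.cos(theta), (theta, sp.pi/6, sp.pi/5))      # x ||dx|| on Gamma_2
assert work == 8
assert sp.simplify(line - 4*sp.sqrt(3)) == 0
assert sp.simplify(arc - (16*sp.sin(sp.pi/5) - 8)) == 0
```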

325
A. Answers and Hints
574 The curve lies on the sphere, and to parametrise this curve we dispose of one of the variables, y say, from where y = 1 − x and x² + y² + z² = 1 give

x² + (1 − x)² + z² = 1 ⟹ 2x² − 2x + z² = 0
                       ⟹ 2(x − 1/2)² + z² = 1/2
                       ⟹ 4(x − 1/2)² + 2z² = 1.

So we now put

x = 1/2 + (cos t)/2,   z = (sin t)/√2,   y = 1 − x = 1/2 − (cos t)/2.

We must integrate on the side of the plane that can be viewed from the point (1, 1, 0) (observe that the vector (1, 1, 0)ᵀ is normal to the plane). On the zx-plane, 4(x − 1/2)² + 2z² = 1 is an ellipse. To obtain a positive parametrisation we must integrate from t = 2π to t = 0 (this is because when you look at the ellipse from the point (1, 1, 0) the positive x-axis is to your left, and not your right). Thus

∮_Γ z dx + x dy + y dz = ∫_{2π}^{0} (sin t/√2) d(1/2 + cos t/2)
                       + ∫_{2π}^{0} (1/2 + cos t/2) d(1/2 − cos t/2)
                       + ∫_{2π}^{0} (1/2 − cos t/2) d(sin t/√2)
                       = ∫_{2π}^{0} ((sin t cos t)/4 + (cos t)/(2√2) + (sin t)/4 − 1/(2√2)) dt
                       = π/√2.
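The final value π/√2 can be confirmed by integrating the pulled-back form in sympy (an addition):

```python
import sympy as sp

t = sp.symbols('t')
x = sp.Rational(1, 2) + sp.cos(t)/2
y = sp.Rational(1, 2) - sp.cos(t)/2
z = sp.sin(t)/sp.sqrt(2)

integrand = z*sp.diff(x, t) + x*sp.diff(y, t) + y*sp.diff(z, t)
# The orientation runs from t = 2*pi down to t = 0, as in the solution.
result = sp.integrate(integrand, (t, 2*sp.pi, 0))
assert sp.simplify(result - sp.pi/sp.sqrt(2)) == 0
```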

593 We parametrise the surface by letting x = u, y = v, z = u + v². The domain D of Σ is the square [0; 1] × [0; 2]. Observe that

dx ∧ dy = du ∧ dv,
dy ∧ dz = −du ∧ dv,
dz ∧ dx = −2v du ∧ dv,

and so

d²x = √(2 + 4v²) du ∧ dv.

The integral becomes

∫_Σ y d²x = ∫_{0}^{2} ∫_{0}^{1} v √(2 + 4v²) du dv
          = (∫_{0}^{1} du) (∫_{0}^{2} v √(2 + 4v²) dv)
          = 13√2/3.
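The last integral can be evaluated with sympy (an addition):

```python
import sympy as sp

u, v = sp.symbols('u v')
# Integrand y = v against the surface element sqrt(2 + 4 v^2) du dv.
I = sp.integrate(v*sp.sqrt(2 + 4*v**2), (u, 0, 1), (v, 0, 2))
assert sp.simplify(I - 13*sp.sqrt(2)/3) == 0
```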

594 Using x = r cos θ, y = r sin θ, 1 ≤ r ≤ 2, 0 ≤ θ ≤ 2π, the surface area is

√2 ∫_{0}^{2π} ∫_{1}^{2} r dr dθ = 3π√2.

595 We use spherical coordinates, (x, y, z) = (cos θ sin ϕ, sin θ sin ϕ, cos ϕ). Here θ ∈ [0; 2π] is the longitude and ϕ ∈ [0; π] is the colatitude. Observe that

dx ∧ dy = sin ϕ cos ϕ dϕ ∧ dθ,
dy ∧ dz = cos θ sin² ϕ dϕ ∧ dθ,
dz ∧ dx = − sin θ sin² ϕ dϕ ∧ dθ,

and so

d²x = sin ϕ dϕ ∧ dθ.

The integral becomes

∫_Σ x² d²x = ∫_{0}^{2π} ∫_{0}^{π} cos² θ sin³ ϕ dϕ dθ = 4π/3.
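A sympy check of this value (an addition):

```python
import sympy as sp

theta, phi = sp.symbols('theta phi')
I = sp.integrate(sp.cos(theta)**2 * sp.sin(phi)**3,
                 (phi, 0, sp.pi), (theta, 0, 2*sp.pi))
assert sp.simplify(I - 4*sp.pi/3) == 0
```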

596 Put x = u, y = v, z² = u² + v². Then

dx = du,   dy = dv,   z dz = u du + v dv,

whence

dx ∧ dy = du ∧ dv,   dy ∧ dz = −(u/z) du ∧ dv,   dz ∧ dx = −(v/z) du ∧ dv,

and so

d²x = √((dx ∧ dy)² + (dz ∧ dx)² + (dy ∧ dz)²)
    = √(1 + (u² + v²)/z²) du ∧ dv
    = √2 du ∧ dv.

Hence

∫_Σ z d²x = ∫∫_{u²+v²≤1} √2 √(u² + v²) du dv = √2 ∫_{0}^{2π} ∫_{0}^{1} ρ² dρ dθ = 2π√2/3.
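A sympy check of the polar-coordinate integral (an addition):

```python
import sympy as sp

rho, theta = sp.symbols('rho theta')
I = sp.sqrt(2)*sp.integrate(rho**2, (rho, 0, 1), (theta, 0, 2*sp.pi))
assert sp.simplify(I - 2*sp.pi*sp.sqrt(2)/3) == 0
```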

597 If the egg has radius R, each slice will have height 2R/n. A slice can be parametrised by 0 ≤ θ ≤ 2π, ϕ1 ≤ ϕ ≤ ϕ2, with

R cos ϕ1 − R cos ϕ2 = 2R/n.

The area of the part of the surface of the sphere in the slice is

∫_{0}^{2π} ∫_{ϕ1}^{ϕ2} R² sin ϕ dϕ dθ = 2πR²(cos ϕ1 − cos ϕ2) = 4πR²/n.

This means that each of the n slices has identical area 4πR²/n.

598 We project this surface onto the coordinate planes, obtaining

∫_Σ xy dy dz = ∫_{0}^{6} ∫_{0}^{3−z/2} (3 − y − z/2) y dy dz = 27/4,

−∫_Σ x² dz dx = −∫_{0}^{3} ∫_{0}^{6−2x} x² dz dx = −27/2,

∫_Σ (x + z) dx dy = ∫_{0}^{3} ∫_{0}^{3−y} (6 − x − 2y) dx dy = 27/2,

and hence

∫_Σ xy dy dz − x² dz dx + (x + z) dx dy = 27/4.

603 Evaluating this directly would result in evaluating four path integrals, one for each side of the square. We will use Green's Theorem instead. We have

dω = d(x³y) ∧ dx + d(xy) ∧ dy
   = (3x²y dx + x³ dy) ∧ dx + (y dx + x dy) ∧ dy
   = (y − x³) dx ∧ dy.

The region M is the area enclosed by the square. The integral equals

∮_C x³y dx + xy dy = ∫_{0}^{2} ∫_{0}^{2} (y − x³) dx dy = −4.
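A sympy check of the double integral (an addition):

```python
import sympy as sp

x, y = sp.symbols('x y')
I = sp.integrate(y - x**3, (x, 0, 2), (y, 0, 2))
assert I == -4
```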

604 We have

1. L_{AB} is y = x; L_{AC} is y = −x; and L_{BC} is clearly y = −x/3 + 4/3.

2. We have

∫_{AB} y² dx + x dy = ∫_{0}^{1} (x² + x) dx = 5/6,

∫_{BC} y² dx + x dy = ∫_{1}^{−2} ((−x/3 + 4/3)² − x/3) dx = −15/2,

∫_{CA} y² dx + x dy = ∫_{−2}^{0} (x² − x) dx = 14/3.

Adding these integrals we find

∮ y² dx + x dy = −2.

3. We have

∫_D (1 − 2y) dx ∧ dy = ∫_{−2}^{0} (∫_{−x}^{−x/3+4/3} (1 − 2y) dy) dx + ∫_{0}^{1} (∫_{x}^{−x/3+4/3} (1 − 2y) dy) dx
                     = −44/27 − 10/27
                     = −2.

608 Observe that

d(x² + 2y³) ∧ dy = 2x dx ∧ dy.

Hence by the generalised Stokes' Theorem the integral equals

∫_{(x−2)²+y²≤4} 2x dx ∧ dy = ∫_{−π/2}^{π/2} ∫_{0}^{4 cos θ} 2ρ² cos θ dρ dθ = 16π.

To do it directly, put x − 2 = 2 cos t, y = 2 sin t, 0 ≤ t ≤ 2π. Then the integral becomes

∫_{0}^{2π} ((2 + 2 cos t)² + 16 sin³ t) d(2 sin t) = ∫_{0}^{2π} (8 cos t + 16 cos² t + 8 cos³ t + 32 cos t sin³ t) dt = 16π.
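Both computations can be reproduced with sympy (an addition):

```python
import sympy as sp

rho, theta, t = sp.symbols('rho theta t')
# Green's theorem side, in polar coordinates about the origin.
green = sp.integrate(2*rho**2*sp.cos(theta),
                     (rho, 0, 4*sp.cos(theta)), (theta, -sp.pi/2, sp.pi/2))
# Direct parametrisation x = 2 + 2 cos t, y = 2 sin t.
direct = sp.integrate(((2 + 2*sp.cos(t))**2 + 16*sp.sin(t)**3)*sp.diff(2*sp.sin(t), t),
                      (t, 0, 2*sp.pi))
assert sp.simplify(green - 16*sp.pi) == 0
assert sp.simplify(direct - 16*sp.pi) == 0
```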

609 At the intersection path,

0 = x² + y² + z² − 2(x + y) = (2 − y)² + y² + z² − 4 = 2y² − 4y + z² = 2(y − 1)² + z² − 2,

which describes an ellipse on the yz-plane. Similarly we get 2(x − 1)² + z² = 2 on the xz-plane. We have

d(y dx + z dy + x dz) = dy ∧ dx + dz ∧ dy + dx ∧ dz = −dx ∧ dy − dy ∧ dz − dz ∧ dx.

Since dx ∧ dy = 0, by Stokes' Theorem the integral sought is

−∫_{2(y−1)²+z²≤2} dy dz − ∫_{2(x−1)²+z²≤2} dz dx = −2π√2.

(To evaluate the integrals you may resort to the fact that the area of the elliptical region (x − x₀)²/a² + (y − y₀)²/b² ≤ 1 is πab.)

If we were to evaluate this integral directly, we would set

y = 1 + cos θ,   z = √2 sin θ,   x = 2 − y = 1 − cos θ.

The integral becomes

∫_{0}^{2π} (1 + cos θ) d(1 − cos θ) + √2 sin θ d(1 + cos θ) + (1 − cos θ) d(√2 sin θ),

which in turn

= ∫_{0}^{2π} (sin θ + sin θ cos θ − √2 + √2 cos θ) dθ = −2π√2.
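The direct computation can be checked with sympy (an addition):

```python
import sympy as sp

t = sp.symbols('t')
y = 1 + sp.cos(t)
z = sp.sqrt(2)*sp.sin(t)
x = 1 - sp.cos(t)

I = sp.integrate(y*sp.diff(x, t) + z*sp.diff(y, t) + x*sp.diff(z, t),
                 (t, 0, 2*sp.pi))
assert sp.simplify(I + 2*sp.pi*sp.sqrt(2)) == 0
```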

B. GNU Free Documentation License
Version 1.2, November 2002
Copyright © 2000,2001,2002 Free Software Foundation, Inc.

51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA

Everyone is permitted to copy and distribute verbatim copies of this license document, but
changing it is not allowed.

Preamble

The purpose of this License is to make a manual, textbook, or other functional and useful docu-
ment “free” in the sense of freedom: to assure everyone the effective freedom to copy and redis-
tribute it, with or without modifying it, either commercially or noncommercially. Secondarily, this
License preserves for the author and publisher a way to get credit for their work, while not being
considered responsible for modifications made by others.
This License is a kind of “copyleft”, which means that derivative works of the document must
themselves be free in the same sense. It complements the GNU General Public License, which is a
copyleft license designed for free software.
We have designed this License in order to use it for manuals for free software, because free soft-
ware needs free documentation: a free program should come with manuals providing the same
freedoms that the software does. But this License is not limited to software manuals; it can be used
for any textual work, regardless of subject matter or whether it is published as a printed book. We
recommend this License principally for works whose purpose is instruction or reference.

1. APPLICABILITY AND DEFINITIONS


This License applies to any manual or other work, in any medium, that contains a notice placed
by the copyright holder saying it can be distributed under the terms of this License. Such a notice
grants a world-wide, royalty-free license, unlimited in duration, to use that work under the condi-
tions stated herein. The “Document”, below, refers to any such manual or work. Any member of
the public is a licensee, and is addressed as “you”. You accept the license if you copy, modify or
distribute the work in a way requiring permission under copyright law.

A “Modified Version” of the Document means any work containing the Document or a portion
of it, either copied verbatim, or with modifications and/or translated into another language.
A “Secondary Section” is a named appendix or a front-matter section of the Document that deals
exclusively with the relationship of the publishers or authors of the Document to the Document’s
overall subject (or to related matters) and contains nothing that could fall directly within that overall
subject. (Thus, if the Document is in part a textbook of mathematics, a Secondary Section may not
explain any mathematics.) The relationship could be a matter of historical connection with the
subject or with related matters, or of legal, commercial, philosophical, ethical or political position
regarding them.
The “Invariant Sections” are certain Secondary Sections whose titles are designated, as being
those of Invariant Sections, in the notice that says that the Document is released under this License.
If a section does not fit the above definition of Secondary then it is not allowed to be designated as
Invariant. The Document may contain zero Invariant Sections. If the Document does not identify
any Invariant Sections then there are none.
The “Cover Texts” are certain short passages of text that are listed, as Front-Cover Texts or Back-
Cover Texts, in the notice that says that the Document is released under this License. A Front-Cover
Text may be at most 5 words, and a Back-Cover Text may be at most 25 words.
A “Transparent” copy of the Document means a machine-readable copy, represented in a for-
mat whose specification is available to the general public, that is suitable for revising the docu-
ment straightforwardly with generic text editors or (for images composed of pixels) generic paint
programs or (for drawings) some widely available drawing editor, and that is suitable for input to
text formatters or for automatic translation to a variety of formats suitable for input to text format-
ters. A copy made in an otherwise Transparent file format whose markup, or absence of markup,
has been arranged to thwart or discourage subsequent modification by readers is not Transparent.
An image format is not Transparent if used for any substantial amount of text. A copy that is not
“Transparent” is called “Opaque”.
Examples of suitable formats for Transparent copies include plain ASCII without markup, Tex-
info input format, LaTeX input format, SGML or XML using a publicly available DTD, and standard-
conforming simple HTML, PostScript or PDF designed for human modification. Examples of trans-
parent image formats include PNG, XCF and JPG. Opaque formats include proprietary formats that
can be read and edited only by proprietary word processors, SGML or XML for which the DTD and/or
processing tools are not generally available, and the machine-generated HTML, PostScript or PDF
produced by some word processors for output purposes only.
The “Title Page” means, for a printed book, the title page itself, plus such following pages as are
needed to hold, legibly, the material this License requires to appear in the title page. For works
in formats which do not have any title page as such, “Title Page” means the text near the most
prominent appearance of the work’s title, preceding the beginning of the body of the text.
A section “Entitled XYZ” means a named subunit of the Document whose title either is precisely
XYZ or contains XYZ in parentheses following text that translates XYZ in another language. (Here XYZ
stands for a specific section name mentioned below, such as “Acknowledgements”, “Dedications”,
“Endorsements”, or “History”.) To “Preserve the Title” of such a section when you modify the

Document means that it remains a section “Entitled XYZ” according to this definition.
The Document may include Warranty Disclaimers next to the notice which states that this License
applies to the Document. These Warranty Disclaimers are considered to be included by reference in
this License, but only as regards disclaiming warranties: any other implication that these Warranty
Disclaimers may have is void and has no effect on the meaning of this License.

2. VERBATIM COPYING
You may copy and distribute the Document in any medium, either commercially or noncommer-
cially, provided that this License, the copyright notices, and the license notice saying this License
applies to the Document are reproduced in all copies, and that you add no other conditions whatso-
ever to those of this License. You may not use technical measures to obstruct or control the reading
or further copying of the copies you make or distribute. However, you may accept compensation
in exchange for copies. If you distribute a large enough number of copies you must also follow the
conditions in section 3.
You may also lend copies, under the same conditions stated above, and you may publicly display
copies.

3. COPYING IN QUANTITY
If you publish printed copies (or copies in media that commonly have printed covers) of the Docu-
ment, numbering more than 100, and the Document’s license notice requires Cover Texts, you must
enclose the copies in covers that carry, clearly and legibly, all these Cover Texts: Front-Cover Texts
on the front cover, and Back-Cover Texts on the back cover. Both covers must also clearly and legi-
bly identify you as the publisher of these copies. The front cover must present the full title with all
words of the title equally prominent and visible. You may add other material on the covers in addi-
tion. Copying with changes limited to the covers, as long as they preserve the title of the Document
and satisfy these conditions, can be treated as verbatim copying in other respects.
If the required texts for either cover are too voluminous to fit legibly, you should put the first ones
listed (as many as fit reasonably) on the actual cover, and continue the rest onto adjacent pages.
If you publish or distribute Opaque copies of the Document numbering more than 100, you must
either include a machine-readable Transparent copy along with each Opaque copy, or state in or
with each Opaque copy a computer-network location from which the general network-using pub-
lic has access to download using public-standard network protocols a complete Transparent copy
of the Document, free of added material. If you use the latter option, you must take reasonably
prudent steps, when you begin distribution of Opaque copies in quantity, to ensure that this Trans-
parent copy will remain thus accessible at the stated location until at least one year after the last
time you distribute an Opaque copy (directly or through your agents or retailers) of that edition to
the public.
It is requested, but not required, that you contact the authors of the Document well before redis-
tributing any large number of copies, to give them a chance to provide you with an updated version
of the Document.

4. MODIFICATIONS

You may copy and distribute a Modified Version of the Document under the conditions of sections
2 and 3 above, provided that you release the Modified Version under precisely this License, with
the Modified Version filling the role of the Document, thus licensing distribution and modification
of the Modified Version to whoever possesses a copy of it. In addition, you must do these things in
the Modified Version:

A. Use in the Title Page (and on the covers, if any) a title distinct from that of the Document,
and from those of previous versions (which should, if there were any, be listed in the History
section of the Document). You may use the same title as a previous version if the original
publisher of that version gives permission.

B. List on the Title Page, as authors, one or more persons or entities responsible for authorship
of the modifications in the Modified Version, together with at least five of the principal authors
of the Document (all of its principal authors, if it has fewer than five), unless they release you
from this requirement.

C. State on the Title page the name of the publisher of the Modified Version, as the publisher.

D. Preserve all the copyright notices of the Document.

E. Add an appropriate copyright notice for your modifications adjacent to the other copyright
notices.

F. Include, immediately after the copyright notices, a license notice giving the public permis-
sion to use the Modified Version under the terms of this License, in the form shown in the
Addendum below.

G. Preserve in that license notice the full lists of Invariant Sections and required Cover Texts
given in the Document’s license notice.

H. Include an unaltered copy of this License.

I. Preserve the section Entitled “History”, Preserve its Title, and add to it an item stating at least
the title, year, new authors, and publisher of the Modified Version as given on the Title Page.
If there is no section Entitled “History” in the Document, create one stating the title, year,
authors, and publisher of the Document as given on its Title Page, then add an item describing
the Modified Version as stated in the previous sentence.

J. Preserve the network location, if any, given in the Document for public access to a Trans-
parent copy of the Document, and likewise the network locations given in the Document for
previous versions it was based on. These may be placed in the “History” section. You may
omit a network location for a work that was published at least four years before the Document
itself, or if the original publisher of the version it refers to gives permission.

K. For any section Entitled “Acknowledgements” or “Dedications”, Preserve the Title of the sec-
tion, and preserve in the section all the substance and tone of each of the contributor ac-
knowledgements and/or dedications given therein.

L. Preserve all the Invariant Sections of the Document, unaltered in their text and in their titles.
Section numbers or the equivalent are not considered part of the section titles.

M. Delete any section Entitled “Endorsements”. Such a section may not be included in the Mod-
ified Version.

N. Do not retitle any existing section to be Entitled “Endorsements” or to conflict in title with
any Invariant Section.

O. Preserve any Warranty Disclaimers.

If the Modified Version includes new front-matter sections or appendices that qualify as Sec-
ondary Sections and contain no material copied from the Document, you may at your option des-
ignate some or all of these sections as invariant. To do this, add their titles to the list of Invariant
Sections in the Modified Version’s license notice. These titles must be distinct from any other sec-
tion titles.
You may add a section Entitled “Endorsements”, provided it contains nothing but endorsements
of your Modified Version by various parties–for example, statements of peer review or that the text
has been approved by an organization as the authoritative definition of a standard.
You may add a passage of up to five words as a Front-Cover Text, and a passage of up to 25 words
as a Back-Cover Text, to the end of the list of Cover Texts in the Modified Version. Only one passage
of Front-Cover Text and one of Back-Cover Text may be added by (or through arrangements made
by) any one entity. If the Document already includes a cover text for the same cover, previously
added by you or by arrangement made by the same entity you are acting on behalf of, you may not
add another; but you may replace the old one, on explicit permission from the previous publisher
that added the old one.
The author(s) and publisher(s) of the Document do not by this License give permission to use
their names for publicity for or to assert or imply endorsement of any Modified Version.

5. COMBINING DOCUMENTS
You may combine the Document with other documents released under this License, under the
terms defined in section 4 above for modified versions, provided that you include in the combi-
nation all of the Invariant Sections of all of the original documents, unmodified, and list them all
as Invariant Sections of your combined work in its license notice, and that you preserve all their
Warranty Disclaimers.
The combined work need only contain one copy of this License, and multiple identical Invariant
Sections may be replaced with a single copy. If there are multiple Invariant Sections with the same
name but different contents, make the title of each such section unique by adding at the end of
it, in parentheses, the name of the original author or publisher of that section if known, or else a
unique number. Make the same adjustment to the section titles in the list of Invariant Sections in
the license notice of the combined work.
In the combination, you must combine any sections Entitled “History” in the various original
documents, forming one section Entitled “History”; likewise combine any sections Entitled “Ac-
knowledgements”, and any sections Entitled “Dedications”. You must delete all sections Entitled
“Endorsements”.

6. COLLECTIONS OF DOCUMENTS
You may make a collection consisting of the Document and other documents released under this
License, and replace the individual copies of this License in the various documents with a single
copy that is included in the collection, provided that you follow the rules of this License for verbatim
copying of each of the documents in all other respects.
You may extract a single document from such a collection, and distribute it individually under
this License, provided you insert a copy of this License into the extracted document, and follow this
License in all other respects regarding verbatim copying of that document.

7. AGGREGATION WITH INDEPENDENT WORKS


A compilation of the Document or its derivatives with other separate and independent docu-
ments or works, in or on a volume of a storage or distribution medium, is called an “aggregate” if
the copyright resulting from the compilation is not used to limit the legal rights of the compilation’s
users beyond what the individual works permit. When the Document is included in an aggregate,
this License does not apply to the other works in the aggregate which are not themselves derivative
works of the Document.
If the Cover Text requirement of section 3 is applicable to these copies of the Document, then
if the Document is less than one half of the entire aggregate, the Document’s Cover Texts may be
placed on covers that bracket the Document within the aggregate, or the electronic equivalent of
covers if the Document is in electronic form. Otherwise they must appear on printed covers that
bracket the whole aggregate.

8. TRANSLATION
Translation is considered a kind of modification, so you may distribute translations of the Docu-
ment under the terms of section 4. Replacing Invariant Sections with translations requires special
permission from their copyright holders, but you may include translations of some or all Invariant
Sections in addition to the original versions of these Invariant Sections. You may include a trans-
lation of this License, and all the license notices in the Document, and any Warranty Disclaimers,
provided that you also include the original English version of this License and the original versions
of those notices and disclaimers. In case of a disagreement between the translation and the origi-
nal version of this License or a notice or disclaimer, the original version will prevail.
If a section in the Document is Entitled “Acknowledgements”, “Dedications”, or “History”, the re-
quirement (section 4) to Preserve its Title (section 1) will typically require changing the actual title.

9. TERMINATION
You may not copy, modify, sublicense, or distribute the Document except as expressly provided
for under this License. Any other attempt to copy, modify, sublicense or distribute the Document is
void, and will automatically terminate your rights under this License. However, parties who have
received copies, or rights, from you under this License will not have their licenses terminated so
long as such parties remain in full compliance.

10. FUTURE REVISIONS OF THIS LICENSE


The Free Software Foundation may publish new, revised versions of the GNU Free Documentation
License from time to time. Such new versions will be similar in spirit to the present version, but may
differ in detail to address new problems or concerns. See http://www.gnu.org/copyleft/.
Each version of the License is given a distinguishing version number. If the Document specifies
that a particular numbered version of this License “or any later version” applies to it, you have the
option of following the terms and conditions either of that specified version or of any later version
that has been published (not as a draft) by the Free Software Foundation. If the Document does
not specify a version number of this License, you may choose any version ever published (not as a
draft) by the Free Software Foundation.

Bibliography
[1] E. Abbott. Flatland. 7th edition. New York: Dover Publications, Inc., 1952.
[2] R. Abraham, J. E. Marsden, and T. Ratiu. Manifolds, tensor analysis, and applications. Vol-
ume 75. Springer Science & Business Media, 2012.
[3] H. Anton and C. Rorres. Elementary Linear Algebra: Applications Version. 8th edition. New
York: John Wiley & Sons, 2000.
[4] T. M. Apostol. Calculus, Volume I. John Wiley & Sons, 2007.
[5] T. M. Apostol. Calculus, Volume II. John Wiley & Sons, 2007.
[6] V. Arnold. Mathematical Methods of Classical Mechanics. New York: Springer-Verlag, 1978.
[7] M. Bazaraa, H. Sherali, and C. Shetty. Nonlinear Programming: Theory and Algorithms. 2nd edi-
tion. New York: John Wiley & Sons, 1993.
[8] R. L. Bishop and S. I. Goldberg. Tensor analysis on manifolds. Courier Corporation, 2012.
[9] A. I. Borisenko, I. E. Tarapov, and P. L. Balise. “Vector and tensor analysis with applications”.
In: Physics Today 22.2 (1969), pages 83–85.
[10] F. Bowman. Introduction to Elliptic Functions, with Applications. New York: Dover, 1961.
[11] M. A. P. Cabral. Curso de Cálculo de Uma Variável. 2013.
[12] M. P. do Carmo Valero. Riemannian geometry. 1992.
[13] A. Chorin and J. Marsden. A Mathematical Introduction to Fluid Mechanics. New York: Springer-
Verlag, 1979.
[14] M. Corral et al. Vector Calculus. Citeseer, 2008.
[15] R. Courant. Differential and Integral Calculus. Volume 2. John Wiley & Sons, 2011.
[16] P. Dawkins. Paul’s Online Math Notes. (Visited on 12/02/2015).
[17] B. Demidovitch. Problemas e Exercícios de Análise Matemática. 1977.
[18] T. P. Dence and J. B. Dence. Advanced Calculus: A Transition to Analysis. Academic Press, 2009.
[19] M. P. Do Carmo. Differential forms and applications. Springer Science & Business Media, 2012.
[20] M. P. Do Carmo. Differential geometry of curves and surfaces. Englewood Cliffs, NJ: Prentice-Hall, 1976.

[21] C. H. Edwards and D. E. Penney. Calculus and Analytic Geometry. Prentice-Hall, 1982.
[22] H. M. Edwards. Advanced calculus: a differential forms approach. Springer Science & Business
Media, 2013.
[23] Euclid. Euclid’s Elements. Translated by T. L. Heath. Santa Fe, NM: Green Lion Press, 2002.
[24] G. Farin. Curves and Surfaces for Computer Aided Geometric Design: A Practical Guide. 2nd edi-
tion. San Diego, CA: Academic Press, 1990.
[25] P. Fitzpatrick. Advanced Calculus. Volume 5. American Mathematical Soc., 2006.
[26] W. H. Fleming. Functions of several variables. Springer Science & Business Media, 2012.
[27] D. Guichard, N. Koblitz, and H. J. Keisler. Calculus: Early Transcendentals. Whitman College,
2014.
[28] H. L. Guidorizzi. Um curso de Cálculo, vol. 2. Grupo Gen-LTC, 2000.
[29] H. L. Guidorizzi. Um curso de Cálculo, vol. 3. Grupo Gen-LTC, 2000.
[30] H. L. Guidorizzi. Um Curso de Cálculo. Livros Técnicos e Científicos Editora, 2001.
[31] D. Halliday and R. Resnick. Physics: Parts 1&2 Combined. 3rd edition. New York: John Wiley &
Sons, 1978.
[32] G. Hartman. APEX Calculus II. 2015.
[33] E. Hecht. Optics. 2nd edition. Reading, MA: Addison-Wesley Publishing Co., 1987.
[34] P. Hoel, S. Port, and C. Stone. Introduction to Probability Theory. Boston, MA: Houghton Mifflin
Co., 1971 (cited on page 133).
[35] J. H. Hubbard and B. B. Hubbard. Vector calculus, linear algebra, and differential forms: a uni-
fied approach. Matrix Editions, 2015.
[36] J. Jackson. Classical Electrodynamics. 2nd edition. New York: John Wiley & Sons, 1975 (cited
on page 26).
[37] L. G. Kallam and M. Kallam. “An Investigation into a Problem-Solving Strategy for Indefinite
Integration and Its Effect on Test Scores of General Calculus Students.” In: (1996).
[38] D. Kay. Schaum’s Outline of Tensor Calculus. McGraw Hill Professional, 1988.
[39] M. Kline. Calculus: An Intuitive and Physical Approach. Courier Corporation, 1998.
[40] S. G. Krantz. The Integral: a Crux for Analysis. Volume 4. 1. Morgan & Claypool Publishers, 2011,
pages 1–105.
[41] S. G. Krantz and H. R. Parks. The implicit function theorem: history, theory, and applications.
Springer Science & Business Media, 2012.
[42] J. Kuipers. Quaternions and Rotation Sequences. Princeton, NJ: Princeton University Press,
1999.
[43] S. Lang. Calculus of several variables. Springer Science & Business Media, 1987.
[44] L. Leithold. The Calculus with Analytic Geometry. Volume 1. Harper & Row, 1972.

[45] E. L. Lima. Análise Real Volume 1. 2008.
[46] E. L. Lima. Analise real, volume 2: funções de n variáveis. Impa, 2013.
[47] I. Malta, S. Pesco, and H. Lopes. Cálculo a Uma Variável. 2002.
[48] J. Marion. Classical Dynamics of Particles and Systems. 2nd edition. New York: Academic Press,
1970.
[49] J. E. Marsden and A. Tromba. Vector calculus. Macmillan, 2003.
[50] P. C. Matthews. Vector calculus. Springer Science & Business Media, 2012.
[51] P. F. McLoughlin. “When Does a Cross Product on Rn Exist”. In: arXiv preprint arXiv:1212.3515
(2012) (cited on page 27).
[52] P. R. Mercer. More Calculus of a Single Variable. Springer, 2014.
[53] C. Misner, K. Thorne, and J. Wheeler. Gravitation. New York: W.H. Freeman & Co., 1973.
[54] J. R. Munkres. Analysis on manifolds. Westview Press, 1997.
[55] S. M. Musa and D. A. Santos. Multivariable and Vector Calculus: An Introduction. Mercury Learn-
ing & Information, 2015.
[56] B. O’Neill. Elementary Differential Geometry. New York: Academic Press, 1966.
[57] O. de Oliveira et al. “The Implicit and Inverse Function Theorems: Easy Proofs”. In: Real Anal-
ysis Exchange 39.1 (2013), pages 207–218.
[58] A. Pogorelov. Analytical Geometry. Moscow: Mir Publishers, 1980.
[59] J. Powell and B. Crasemann. Quantum Mechanics. Reading, MA: Addison-Wesley Publishing
Co., 1961.
[60] M. Protter and C. Morrey. Analytic Geometry. 2nd edition. Reading, MA: Addison-Wesley Pub-
lishing Co., 1975.
[61] J. Reitz, F. Milford, and R. Christy. Foundations of Electromagnetic Theory. 3rd edition. Read-
ing, MA: Addison-Wesley Publishing Co., 1979.
[62] W. Rudin. Principles of Mathematical Analysis. 3rd edition. New York: McGraw-Hill, 1976.
[63] H. Schey. Div, Grad, Curl, and All That: An Informal Text on Vector Calculus. New York: W.W.
Norton & Co., 1973.
[64] A. H. Schoenfeld. Presenting a Strategy for Indefinite Integration. JSTOR, 1978, pages 673–678.
[65] R. Sharipov. “Quick introduction to tensor analysis”. In: arXiv preprint math/0403252 (2004).
[66] G. F. Simmons. Calculus with Analytic Geometry. Volume 10. 1985, page 12.
[67] M. Spivak. Calculus on manifolds. Volume 1. WA Benjamin New York, 1965.
[68] M. Spivak. Calculus. 1984.
[69] M. Spivak. The Hitchhiker’s Guide to Calculus. Mathematical Assn of America, 1995.
[70] J. Stewart. Calculus: Early Transcendentals. Cengage Learning, 2015.

[71] E. W. Swokowski. Calculus with Analytic Geometry. Taylor & Francis, 1979.
[72] S. Tan. Calculus: Early Transcendentals. Cengage Learning, 2010.
[73] A. Taylor and W. Mann. Advanced Calculus. 2nd edition. New York: John Wiley & Sons, 1972.
[74] J. Uspensky. Theory of Equations. New York: McGraw-Hill, 1948.
[75] H. Weinberger. A First Course in Partial Differential Equations. New York: John Wiley & Sons,
1965.
[76] W. Kaplan. Advanced Calculus. 2002.
[77] S. Winitzki. Linear algebra via exterior products. Sergei Winitzki, 2010.

Index
Mx, My, 125   Jacobi matrix, 69
Mxy, Mxz, Myz, 126   locally invertible on S, 88
x̄, 125   partial derivative, 67
ȳ, 125   repeated partial derivatives, 70
z̄, 126   scalar multiplication, 3
δ(x, y), 125   star-shaped domains, 302
∂(x, y, z)/∂(u, v, w), 121   the tensor product, 255
∭, 116   a dual bilinear form, 252
∬S, 107   acceleration, 60
∬R, 111   alternating, 239
ε-neighborhood, 39   antisymmetric, 239, 241
k-differential form field in Rn , 293 antisymmetric tensor, 243
k-th exterior power, 243 antisymmetric tensors, 243
n-forms, 244 apex, 145
n-vectors, 243 area element, 111, 304
1-form, 229
basis, 10
curl, 80 Beta function, 124
derivative of f at a, 64 bivector, 242
differentiable, 64 bivectors, 242
directional derivative of f in the direction of v boundary, 171, 179
at the point x, 81 boundary point, 42
directional derivative of f in the direction of v boundary, 42
at the point x, 81 bounded, 179
distance, 6 bounded, 51
divergence, 78
dot product, 5 canonical ordered basis, 4
gradient, 76 center of mass, 125
gradient operator, 76 centroid, 126

change of variable, 118, 120 density, 125
charge density, 87 derivative, 57
class C 1 , 75 deviator stress, 287
class C 2 , 75 diameter, 51
class C ∞ , 75 differentiable, 57
class C k , 75 differential form, 154
clockwise, 171, 172 dilatational, 287
closed, 40, 138, 179 dimension, 10
closed surfaces, 200 directrix, 144
closure of the dual space, 231 distribution function, 129
closure, 42 joint, 131
compact, 52 normal, 130
component functions, 56 Divergence Theorem, 310
components, 233 domain, 43, 55
conductivity tensor, 287 double integral, 107, 111
cone, 145 polar coordinates, 122
conservative force, 155, 161 dual space, 230
conservative vector field, 161 dummy index, 248
continuous at t0 , 50
ellipsoid, 124, 148
continuously differentiable, 75
elliptic paraboloid, 147
continuously differentiable, 74
even, 33
contravariant, 234
exact, 301
contravariant basis, 214
expected value, 133
converges to the limit, 41
exterior, 42
coordinate change, 211
exterior derivative, 294
coordinates, 11, 233
exterior product, 241, 243
curvilinear, 22
origin of the name, 244
cylindrical, 22
exterior, 42
polar, 22, 122
spherical, 22 force, 60
correlation, 135 free, 8
counter clockwise, 171, 172 free index, 249
covariance, 135 fundamental vector product, 179
covariant, 234
covariant basis, 214 Gauss, 310
covector, 229 generate, 9
curl of F, 79 geodesics, 280
current density, 87 gradient field, 160
curve, 137 gravitational constant, 155
curvilinear coordinate system, 212 Green’s Theorem, 167
cylinder, 144 Green’s Theorem., 309

helicoid, 25 limit point, 42
helix, 56 line integral, 153
Helmholtz decomposition, 206, 207 line integral with respect to arc-length, 157
Hodge linear combination, 8
Star, 245 linear function, 229
homomorphism, 15 linear functional, 229
hydrostatic, 287 linear homomorphism, 15
hyperbolic paraboloid, 147 linear transformation, 15
hyperboloid of one sheet, 146, 148 linearly dependent, 8
hyperboloid of two sheets, 147 linearly independent, 8
hypercube, 53 locally invertible, 88
hypersurface, 116
manifold, 149
hypervolume, 116
mass, 125
improper integral, 114 matrix
in a continuous way, 191 transition, 13
independent, 154 matrix representation of the linear map L with
inertia tensor, 283 respect to the basis {xi }i∈[1;m] , {yi }i∈[1;n] .,
inner product, 5 16
integral meridian, 145
double, 107, 111 mesh size, 104
improper, 114 moment, 125, 126
iterated, 106 momentum, 60
multiple, 103 multilinear map, 233
surface, 180, 182, 188 multiple integral, 103
triple, 116
negative, 4
inverse of f restricted to S, 88
negatively oriented curve, 167
inverse, 88
norm, 5
irrotational, 162
normal, 194
isolated point, 42
isomers, 258 odd, 33
iterated integral, 107 one-to-one, 87
iterated limits of f as (x, y) → (x0 , y0 ), 48 open, 40
open ball, 39
Jacobian, 121 open box, 39
joint distribution, 131 opposite, 4
ordered basis, 11
Kelvin–Stokes Theorem, 195
orientation, 137
lamina, 125 orientation-preserving, 156
length, 5 orientation-reversing, 156
length scales, 213 oriented area, 240
level curve, 29 oriented surface, 191

origin, 4 smooth, 75
orthogonal, 11, 214 space of tensors, 234
orthonormal, 11 span, 9
outward pointing, 191 spanning set, 9
spherical spiral, 59
parametric line, 5
standard normal distribution, 130
parametrization, 137
star, 245
parametrized surface, 141, 177
surface area element, 306
path integral of f along Γ., 299
surface integral, 180, 182, 188, 192
path-independent, 160
surface of revolution., 145
permutation, 33
symmetric, 239
piecewise C 1 , 140
piecewise differentiable, 140 tangent “plane”, 150
polygonal curve, 43 tangent space, 150
position vector, 59, 60 tangent vector, 57
positively oriented curve, 167 tensor, 233
probability, 129 tensor field, 267
probability density function, 129 tensor of type (r, s), 233
tensor product, 235
random variable, 129
tied, 8
rank of an (r, s)-tensor, 234
torus, 182
real function of a real variable, 55
totally antisymmetric, 243
regular, 137, 178
transition matrix, 13
regular parametrized surface, 143, 178
triple integral, 116
regular point, 143
cylindrical coordinates, 123
regular value, 143
spherical coordinates, 123
reparametrization, 156
restriction of f to S, 88 unbounded, 51
Riemann integrable, 104, 105 uniform density, 125
Riemann integral, 104, 105 uniform distribution, 130
Riemann sum, 104 uniformly distributed, 129
right hand rule, 199 unit vector, 6
right-handed coordinate system, 17 upper unit normal, 187

saddle, 147 variance, 134


sample space, 129 vector
scalar, 3 tangent, 57
scalar field, 29, 55 vector field, 29, 55
second moment, 134 smooth, 167
simple closed curve, 138 vector functions, 191
simply connected, 43, 161 vector space, 4
simply connected domain, 43 vector sum, 3
single-term exterior products, 242 vector-valued function

antiderivative, 60
indefinite integral, 60
vector-valued function of a real variable, 55
vectors, 3, 4
velocity, 60
versor, 6
volume element, 116, 305

wedge product, 241


winding number, 161
work, 153

zenith angle, 22
zero vector, 3

