www.orangegrovetexts.com
20 Lectures on Eigenvectors, Eigenvalues, and Their Applications
L. E. Johns
ISBN 978-1-61610-165-7
Orange Grove Texts Plus is an imprint of the University Press of Florida, which is the scholarly
publishing agency for the State University System of Florida, comprising Florida A&M University,
Florida Atlantic University, Florida Gulf Coast University, Florida International University, Florida
State University, New College of Florida, University of Central Florida, University of Florida,
University of North Florida, University of South Florida, and University of West Florida.
Acknowledgment ix
Reader’s Guide xv
Lecture 10: A Word To The Reader Upon Leaving Finite Dimensional Vector Spaces xxv
1 Getting Going 3
2.3 Looking at the Problem Ax=b from the Point of View of Linearly Independent Sets of Vectors 32
2.5 The Number of Linearly Independent Vectors in a Set of Vectors and the Rank of a Matrix 37
3 Vector Spaces 47
3.2 The Image and the Kernel of a Matrix: The Geometric Meaning of Its Rank 48
3.5 Example 55
4 Inner Products 67
4.2 Adjoints 68
4.6 Projections 73
5 Eigenvectors 85
5.5 The Spectral Representation of a Matrix and a Derivation of the Kremser Equation 99
5.7 Eigenvector Expansions and the Solution to the Problem Ax=b 105
10 A Word To The Reader Upon Leaving Finite Dimensional Vector Spaces 243
16.2 The Facts about the Solutions to the Eigenvalue Problem 393
16.3 The Critical Size of a Region Confining an Autothermal Heat Source 394
16.5 ∇⁴ 399
17.1 Separating Variables in Cartesian, Cylindrical and Spherical Coordinate Systems 409
17.3 Solving a Two Dimensional Diffusion Problem in Plane Polar Coordinates (Spherical Coordinates in Two Dimensions) 418
19.10 Turning a Differential Equation into an Integral Equation: Autothermal Heat Generation 521
Index 627
Acknowledgment
In writing out these lectures in this way I was guided by what I imagined Charles Petty and Anthony
DeGance, students in the 60’s and 70’s, and Ranganathan Narayanan, a colleague from the 80’s to
the present, might ask me if I were teaching this material to them.
Thanks are due Debbie Sandoval and Santiago A. Tavares for the best possible help.
What Sets this Book Apart?
First it has a detailed reader's guide. Setting that aside, we answer the question:
There is no formal mathematics in this book; the proofs presented are mostly sketches. But
plausibility is not slighted and the geometric interpretation of the results obtained is a theme that
is maintained throughout the book.
We present the theory of finite dimensional spaces in Part I, first because there are many in-
teresting problems that can be formulated and solved in finite dimensions and second because the
main ideas in n dimensional spaces can be illustrated by drawing pictures in two dimensions.
Then, without stretching the reader's imagination too much, in Part II we carry these pictures
over to infinite dimensional spaces. Indeed, often what is of interest in an infinite dimensional
space is a finite dimensional subspace.
In both finite and infinite dimensional spaces our search is always for a problem specific basis,
hence eigenvectors and eigenfunctions become a second theme. We make the jump from Part I to
Part II by introducing difference approximations to the solutions to the diffusion equation.
Now whether we are solving problems in Part I or in Part II, what we do is always the same. We
solve an eigenvalue problem, then all the steps come down to summation in Part I or integration in
Part II. No series, finite or infinite, is differentiated in order to solve a problem. For instance the
method of separation of variables is used only to solve eigenvalue problems, never to solve initial
value problems.
An explanation of domain perturbations is presented in order to extend the problems that can
be solved to domains other than cubes, cylinders and spheres.
Chemical engineering students need to solve problems having physical origins. Thus problems
of physical interest are presented in every lecture and the readers will meet many of these problems
in their other courses, in their research or they will find these problems to be the first problems they
would solve in learning a new subject. This is a theme.
We often obtain linear problems via perturbation methods and due to this, there is a strong
emphasis on solvability conditions and hence on inner products. Solvability is thus a theme as is
the use of the eigenfunctions themselves to reveal patterns in nonlinear problems.
A short list of the examples presented in the lectures, and some of what they illustrate, follows:
Boiling curves: linear approximation, matrix multiplication
Greatest number of reactions among M molecules made up of A atoms: linear independence, rank, determinant
Kremser equation: spectral decomposition
Dynamic stripping cascade: generalized eigenvectors
Chemostat: eigenvalues, branch points
Stirred tank reactor: Hopf bifurcations
Isomerization reactions: eigenvectors
Draining tanks, difference approximations: Gerschgorin's theorem
Chromatography: power moments
Electrical potential: multipole expansions
Activator-inhibitor kinetics: eigenvalues
Petri dish: solvability
Size of a confined autothermal heat source: eigenvalues
Saffman-Taylor problem: separation of variables, integral constraints
Rayleigh-Taylor problem: patterns derived from eigenfunctions
Energy of a quantum ball: separation of variables
Solute dispersion: eigenvalues and eigenfunctions
Oscillations of an inviscid drop: Cartesian, cylindrical and spherical coordinates
Many home problems stem from these examples and many of the home problems are not questions but stories leading the reader to derive some well known results.
Reader’s Guide
Before we outline the main ideas presented in each lecture we present an overview stating how the
lectures fit together.
Part I has to do with problems whose solutions lie in finite dimensional spaces. Part II has to
do with problems whose solutions are functions.
Thus in Part I the subject is matrices, in Part II the subject is ∇² and the differential equations
arising upon use of the method of separation of variables.
The applications are indicated in the opening statement: “What Sets this Book Apart?”
Diffusion is a theme of Part II, but only because ∇² would not be so interesting if there were
no diffusion. Of course other themes could have been chosen from classical physics.
The first five lectures bring the reader to the point where they understand the basic facts about
the eigenvectors and eigenvalues of matrices.
The sixth lecture then uses these ideas to write the solution to systems of constant coefficient
ordinary differential equations.
The seventh and eighth lectures are applications of the sixth lecture to problems the reader
might have learned something about as an undergraduate.
The ninth lecture makes the transition to Part II by solving difference approximations to the
diffusion equation. The idea is that the expansion of the solution in the eigenvectors of a matrix
carries over to the solution of the diffusion equation itself in Part II.
Lectures 11, 14, 16 and 17 present the basic facts about the eigenvalues and eigenvectors of
∇2 in a bounded domain. Lectures 12 and 13 explain a little about problems in unbounded do-
mains, mostly by the use of the method of moments to derive simple facts about concentration
distributions.
Lecture 15 presents two applications, one to activator-inhibitor kinetics, the other to the con-
struction of the solution to a nonlinear reaction-diffusion problem.
Lecture 16 repeats Lecture 14, but now we have three space dimensions and volume and surface
sources are taken into account.
The method of separation of variables is presented in Lecture 17 and two well-known stability
problems are solved in Lecture 18.
Lecture 19 is about the second order ordinary differential equations that present themselves
upon the use of separation of variables. Their solutions by Frobenius' method appear in Lecture 20.
Our aim is to introduce linear problems and to suggest how they might arise.
L x = f

where x lies in the space of unknown vectors and f lies in the space of vectors that drive the problem. We seek x such that L carries x into f, where L is a linear operator in the sense that

L (x + y) = L x + L y
The simplest linear problems are those where x = (x1, x2, . . . , xn)ᵀ and f = (f1, f2, . . . , fm)ᵀ. Then L = A = (aij), an m × n matrix, and we write our problem Ax = f.
The most important point in Lecture 1 is that A should be viewed as a set of n columns
A = (a1 a2 . . . an)
What we are mostly interested in is learning whether or not there is a solution and if there is,
how many.
a1 x1 + a2 x2 + · · · + an xn = f
This causes us to ask our question in the following way: Can f be written as a linear combination
of the set of vectors a1 , a2 , · · · , an ? If it can then the coefficients in the expansion of f are the
elements of x, our answer.
We have a vector f lying in the space Cm of dimension m and we have a set of vectors a1, a2, · · · , an also lying in Cm, and we want to know how much of Cm can be included in the linear combinations of these vectors.
The main idea is introduced: linear independence. It is illustrated by the question: how many
independent chemical reactions can be written among M molecules made up of A atoms?
The function det is introduced where det acting on a square matrix produces a scalar, called
its determinant.
We define the rank of a matrix, denoted r, where the matrix may or may not be square. And
we identify a set of r basis columns. The basis columns are independent, r is the largest number
of independent columns that can be found and any set of r independent columns is a set of basis
columns. Every set of r + 1 columns is dependent and each of the remaining n − r columns must
be a linear combination of the r basis columns.
The idea of a vector space is presented, its dimension is defined and the idea of a basis is introduced.
Two vector spaces associated to an m × n matrix A are introduced: Im A and Ker A. The
dimension of Im A is r, the dimension of Ker A is n − r. Im A is the collection of all vectors A x,
Ker A is the collection of all vectors x such that A x = 0.
To be able to extend what we now know to problems beyond matrix problems and free ourselves of
the determinant function which only applies to matrix problems, we introduce the idea of an inner
product, paying most attention to the case where A is n × n.
We are looking for a way to tell if a vector b lies in the subspace Im A, where the dimension
of Im A is r.
The solutions to

Ax = 0

make up Ker A; it is of dimension n − r. With Im A and Ker A in hand we can describe all the solutions to

Ax = b
Lecture 5: Eigenvectors
Here we denote by A an n × n matrix and we ask if there are vectors x that are mapped by A
without change of direction, i.e., we ask for x’s such that
Ax = λx
This is the eigenvalue problem for A and solutions (x ≠ 0, λ) are called eigenvectors and eigenvalues. If x is a solution, likewise c x for any c ≠ 0.
Writing A x = λ x as
(A − λ I) x = 0
we see that in order that solutions x ≠ 0 can be found the λ's must be such that the rank of A − λI is less than n, i.e., we must have
det (A − λ I) = 0
The number of independent solutions corresponding to a root λ is the dimension of Ker (A − λ I).
Now we pay most of our attention to the plain vanilla case where d, the number of distinct eigenvalues, equals n and n1 = 1 = m1, etc.
Then we have n distinct eigenvalues and the corresponding n eigenvectors are independent and
form a basis.
Introducing an inner product, we can derive A∗ , the adjoint of A, and write its eigenvalue
problem
A∗ y = µ y
whereupon we find the µ's are the complex conjugates of the λ's and the set of eigenvectors y 1, y 2, · · · , y n is the set of vectors orthogonal to the set x 1, x 2, · · · , x n, viz., ⟨y i, x j⟩ = 0, i ≠ j
x = Σ ci x i

where ci = ⟨y i, x⟩, and

x = Σ di y i

where di = ⟨x i, x⟩.
We solve the linear stripping cascade problem and derive a symmetric form of the Kremser
equation.
dx/dt = A x,   t > 0

where x = (x1, x2, . . . , xn)ᵀ denotes the time dependent unknowns, x at t = 0 is specified, and A denotes an n × n matrix of constants.
To do this we introduce an inner product, denoted ⟨ , ⟩, and in this inner product we derive the adjoint of A, viz., A∗.
Thus we have

c1 = ⟨y 1, x (t)⟩, c2 = ⟨y 2, x (t)⟩, etc.

and

⟨y 1, dx/dt⟩ = ⟨y 1, A x⟩ ⟹ (d/dt) ⟨y 1, x⟩ = ⟨A∗ y 1, x⟩ ⟹ (d/dt) c1 = λ1 ⟨y 1, x⟩ = λ1 c1

and hence we have

c1 (t) = c1 (t = 0) e^{λ1 t}
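A minimal numerical sketch of this recipe, in Python with numpy and scipy (both assumed available); the matrix, the initial state and the time below are invented for illustration:

```python
import numpy as np
from scipy.linalg import expm

# An invented 2 x 2 matrix with distinct eigenvalues, and an invented initial state.
A = np.array([[-2.0, 1.0],
              [0.5, -3.0]])
x0 = np.array([1.0, 2.0])
t = 0.7

lam, X = np.linalg.eig(A)        # columns of X are the eigenvectors x_i
Y = np.linalg.inv(X).conj().T    # columns of Y are the adjoint eigenvectors y_i,
                                 # scaled so that <y_i, x_j> is 1 if i = j, else 0
c0 = Y.conj().T @ x0             # c_i(0) = <y_i, x(0)>
xt = X @ (np.exp(lam * t) * c0)  # x(t) = sum_i c_i(0) e^{lam_i t} x_i

print(np.allclose(xt, expm(A * t) @ x0))   # True: agrees with the matrix exponential
```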
x (n + 1) = A x (n),   n = 0, 1, 2, . . .
we find
Stability of solutions to our differential equation requires all the eigenvalues of A to lie in the
left half of the complex plane, viz., Re λ < 0, whereas stability of solutions to our difference equation requires all eigenvalues to lie inside the unit circle, |λ| < 1.
We then explain what to do if eigenvectors are missing. For example if λ1 is a root of algebraic
multiplicity 2 but geometric multiplicity 1, i.e., dim Ker (A − λ1 I) = 1 so that there is only one
independent eigenvector corresponding to λ1 , we are going to be short one eigenvector and we will
not have an eigenvector basis for our space. What we do to overcome this difficulty is to introduce
generalized eigenvectors. Thus we write
A x 1 = λ1 x 1
and
A x 2 = x 1 + λ1 x 2
we find
dc1/dt = λ1 c1 + c2
and
dc2/dt = λ1 c2
whereupon c1 and c2 can be found sequentially, and the factor teλ1 t appears.
We put to use what we have been learning and we do this in the context of a chemostat and a very
simple stirred tank reactor, but a reactor that retains the interesting physics of these reactors. Thus
as the reaction proceeds it releases heat, it uses up reactants and it speeds up as the temperature
increases.
The stirred tank reactor model is two dimensional and therefore 2 × 2 matrices turn up in the
investigation of the stability of its steady states. The eigenvalues of a 2 × 2 matrix depend on
its trace T and determinant D and in the D − T plane stability obtains in the fourth quadrant,
D > 0, T < 0.
The model may be a bit too simple, but then we only need to solve quadratic equations to see
what is going on.
dx/dt = A x
The aim is to derive the elements of the matrix A from the measurements. The main idea is
that there are straight line paths, x (t) vs t, and that if we can find the directions of these straight
lines, we have the eigenvectors of A, and hence we can derive A from its spectral expansion
A = Σ λi x i y iᵀ
We illustrate this by solving the problem of measuring reaction rate coefficients in a system of
isomers, the Wei and Prater problem.
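The spectral expansion itself is easy to check numerically; a small sketch, assuming a diagonalizable matrix (the rate-coefficient-like matrix below is invented, its columns summing to zero as in a closed system of isomers):

```python
import numpy as np

A = np.array([[-1.0, 0.5, 0.2],
              [0.6, -0.9, 0.3],
              [0.4, 0.4, -0.5]])

lam, X = np.linalg.eig(A)   # columns of X: right eigenvectors x_i
YT = np.linalg.inv(X)       # rows of YT: the biorthogonal y_i^T

# A = sum_i lam_i x_i y_i^T, a sum of rank-one matrices
A_spectral = sum(lam[i] * np.outer(X[:, i], YT[i, :]) for i in range(3))
print(np.allclose(A, A_spectral))   # True
```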
This is a lecture on difference approximations to the solution of the diffusion equation, first, to
present some ideas about diffusion, second, to illustrate the use of Gerschgorin’s theorem in es-
timating the eigenvalues of a matrix and, third, to present the method of solution which will be
carried over to the diffusion equation itself in Part II.
This lecture presents a warning that we are leaving behind solutions that are finite sums and moving
ahead to solutions that are infinite sums, and that it is now important that we do not substitute our
proposed solutions into our equations, i.e., integration is the rule not differentiation.
We begin by defining the gradient operator ∇ in terms of the derivative of a function along a
curve. Then we write ∇ in Cartesian coordinates and in any orthogonal coordinate system derived
from Cartesian coordinates.
We introduce the surface gradient operator, ∇S , so that we can differentiate a function defined
only on a surface and we go on and obtain a formula for the mean curvature of a surface.
We present a formula for ∇² in an arbitrary orthogonal coordinate system and we work out
the details in cylindrical and spherical coordinate systems. Our emphasis is on the variation of the
base vectors from one point in space to a nearby point.
Then in order to solve problems on domains close to domains we like, we explain how domain
perturbations are carried out.
We begin our study of diffusion, and ∇², by deriving formulas for the power moments of a con-
centration field in an unbounded domain. Viewing the concentration as the probability of finding
a solute molecule at a certain point at a certain time, we introduce its mean, its variance, etc. We
then present an example where the effect of convection on the variance can be derived.
And we present the point source solution to the diffusion equation and explain superposition.
We continue solving problems in an unbounded domain and derive solutions to the problem of
steady diffusion from a source near the origin to a sink at infinity.
We introduce the monopole, dipole, quadrupole, etc. moments of the source and expand the
solution in these moments.
By doing this we obtain solutions to Poisson’s equation. We derive the electrical potential due
to a set of charges and thus the electrostatic potential energy of two charge distributions.
∂c/∂t = ∂²c/∂x²

where at t = 0, c = c0 (x) ≥ 0
This is the source of the solute. The sinks are at x = 0 and x = 1 where we specify a variety of
homogeneous boundary conditions.
We introduce the eigenvalue problem

d²ψ/dx² + λ²ψ = 0,   0 ≤ x ≤ 1
and try to decide what homogeneous boundary conditions we ought to require ψ to satisfy at x = 0
and x = 1 in order that we can use the eigenfunctions and eigenvalues in solving for c.
∫_0^1 φ (d²ψ/dx²) dx = [φ dψ/dx]_0^1 − ∫_0^1 (dφ/dx)(dψ/dx) dx

and

∫_0^1 φ (d²ψ/dx²) dx = [φ dψ/dx − ψ dφ/dx]_0^1 + ∫_0^1 ψ (d²φ/dx²) dx
Our expectation is that by solving our eigenvalue problem we will find an infinite set of or-
thogonal eigenfunctions in an inner product denoted ⟨ , ⟩ and by the theory of Fourier series we
expect to be able to expand the solution to our diffusion problem as a linear combination of these
functions.
c (x, t) = Σ ci (t) ψi (x)
where
ci (t) = ⟨ψi , c⟩
It is at this point that we decide how to choose the boundary conditions that ψi must satisfy.
Thus if c is specified at x = 0 and x = 1, we set ψi = 0 at x = 0 and x = 1 to eliminate
the unknown ∂c/∂x at x = 0, 1. If ∂c/∂x is specified at x = 0 and c is specified at x = 1 we choose ∂ψ/∂x = 0 at x = 0 and ψ = 0 at x = 1. The plan is now apparent. We choose ψ at the boundary to remove the indeterminacy in the equation for ci, whereupon the λ²'s and ψ's depend on the boundary conditions satisfied by c.
We present several examples differing from one another only in the boundary conditions at
x = 0 and x = 1.
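To sketch how such an expansion is used, here is a Python fragment solving ∂c/∂t = ∂²c/∂x² with c = 0 at x = 0, 1; the initial condition c0(x) = x(1 − x) and the truncation at N terms are assumptions made for illustration:

```python
import numpy as np

# Eigenfunctions psi_n(x) = sqrt(2) sin(n pi x), orthonormal in <f, g> = integral of f g over (0, 1).
# For c0(x) = x(1 - x) the coefficients <psi_n, c0> come out analytically.
N = 50
n = np.arange(1, N + 1)
coef0 = np.sqrt(2.0) * 2.0 * (1.0 - (-1.0) ** n) / (n * np.pi) ** 3

def c(x, t):
    # Each coefficient simply decays as exp(-(n pi)^2 t); no series is differentiated.
    psi = np.sqrt(2.0) * np.sin(np.outer(n, np.pi * x))
    return psi.T @ (coef0 * np.exp(-((n * np.pi) ** 2) * t))

x = np.linspace(0.0, 1.0, 11)
print(c(x, 0.01))   # the decaying profile at t = 0.01
```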
Two examples of diffusion in one dimension are presented. The first is an activator–inhibitor
model which illustrates an instability caused by diffusion, viz., the inhibitor diffuses away before
it can arrest the growth of a perturbation. The second is our Petri Dish problem, first introduced
in Lecture 1, where we now explain how to find a solution branch which appears as some input to
the problem advances beyond its critical value. We see that the amplitude of the branch depends
on the input variable in different ways for different nonlinearities.
In this lecture the use of the eigenfunctions of ∇² to solve inhomogeneous problems on bounded,
three dimensional domains is explained.
Our first job is to use Green’s two theorems to help us discover the boundary conditions that
the eigenfunctions must satisfy and then to derive the important facts about the eigenvalues and the
eigenfunctions.
We then indicate how diffusion eigenvalues can be used to estimate critical conditions in non-
linear problems, say, the critical size of a region confining an autothermal heat source.
∂c/∂t = ∇²c + Q

and the eigenvalue problem is

∇²ψ + λ²ψ = 0
where ψ satisfies homogeneous conditions on the boundary of our domain. The specified function
Q assigns sources and sinks of solute on the domain. Other sources and sinks may be assigned at
the boundary of our domain.
The simplest domain shapes that we can deal with are those where we would introduce Carte-
sian, cylindrical or spherical coordinates.
Now separation of variables is the method ordinarily used to solve the eigenvalue problem.
And our aim here is to explain how it works in simple cases.
We write ψ = X (x) Y (y) Z (z) in Cartesian coordinates, set λ² = α² + β² + γ², and obtain

∂²X/∂x² + α²X = 0     (1)

and

∂²Y/∂y² + β²Y = 0     (2)

and

∂²Z/∂z² + γ²Z = 0     (3)

In cylindrical coordinates we obtain

∂²Z/∂z² + γ²Z = 0     (4)

and

∂²Θ/∂θ² + m²Θ = 0     (5)

and

∂²R/∂r² + (1/r) ∂R/∂r + (λ² − m²/r² − γ²) R = 0     (6)

and in spherical coordinates

d²Φ/dφ² + m²Φ = 0     (7)

and

(1/sin θ) d/dθ (sin θ dΘ/dθ) + (ℓ (ℓ + 1) − m²/sin²θ) Θ = 0     (8)

and

d²R/dr² + (2/r) dR/dr + (λ² − ℓ (ℓ + 1)/r²) R = 0     (9)
Eqs. (4), (5) and (7) are just like Eqs. (1), (2) and (3). Eqs. (6), (8) and (9) are new.
The reader ought to observe that Eqs. (1), (2) and (3) are independent of one another but by the time we get to Eqs. (7), (8) and (9), Eq. (7) must be solved first so that the m's are available in Eq. (8), then Eq. (8) must be solved so that the ℓ's are available in Eq. (9); λ² appears only in Eq. (9), and it is independent of m².
We work out two simple two dimensional problems in order to see what changes occur as we
go from one coordinate system to another.
First our diffusion problem is set on a rectangle of sides a and b, c is specified on the perimeter
and Q is specified on the domain. We find two sets of orthogonal functions, viz.,
X = sin (mπx/a),   α² = m²π²/a²,   m = 1, 2, . . .

and

Y = sin (nπy/b),   β² = n²π²/b²,   n = 1, 2, . . .
Cartesian coordinates are special. These two sets of orthogonal functions are all that we need
to solve our problem and to obtain c (x, y, t) we take the same steps we took in the one dimensional
case, in Lecture 14.
Second our diffusion problem is now set on a circle of radius R0 . And again we assume c is
specified on the circumference, but now bounded at the origin and periodic in θ.
Θm (θ) = (1/√(2π)) e^{imθ},   m = · · · , −2, −1, 0, 1, 2, · · ·

and we see something new: for each value of m² we will have a corresponding set of radial eigenfunctions and the set will differ as m² differs.
In this Lecture we see that eigenvalues and eigenfunctions of ∇² are themselves of great interest,
whether or not a series solution to a diffusion problem is being sought.
To make this point we solve two stability problems, the Saffman-Taylor problem and the
Rayleigh-Taylor problem. In both cases we imagine the setting to be a cylinder of circular cross
section bounding a porous solid. Two immiscible fluids fill the pores and in one case a less viscous
fluid is displacing a more viscous fluid, in the other case a heavy fluid lies above a light fluid.
The eigenvalues of ∇² tell us the critical value of the input variable of interest, the eigenfunc-
tions tell us the pattern we ought to see at critical.
In the second problem the distinction between free and pinned edges bears on the possibility
of separating variables.
Separation of variables leads to second order, linear, ordinary differential equations. The simple
facts about these equations are presented in this lecture, before the method of Frobenius for solving
these equations is outlined in Lecture 20. Our problem is to find u where u satisfies
Lu = f
B0 u = a0 at x=0
and
B1 u = a1 at x=1
where L is a linear, second order differential operator and B0 u and B1 u are linear combinations of u and u′.
The homogeneous problem

Lu = 0
B0 u = 0 at x=0
and
B1 u = 0 at x=1
is taken up and conditions that it has solutions other than zero are presented.
Then the facts about the eigenvalue problem

Lψ + λ²ψ = 0

B0 ψ = 0 at x = 0

and

B1 ψ = 0 at x = 1

are derived.
Solutions to the eigenvalue problem for ∇² are presented in this lecture. The coordinate systems
of interest are Cartesian, cylindrical and spherical.
The method of Frobenius is presented and first used to obtain a power series expansion for the
Bessel function I0 (x). The coefficients in the series define the nature of the functions so obtained.
To emphasize this point we derive the zeros of J0 (z) and cos z from the coefficients in their power
series expansions.
Then the bounded solutions of the associated Legendre equation are worked out and the spher-
ical harmonics are introduced.
Applications to the problem of solute dispersion due to a velocity gradient, to the problem of
small amplitude oscillations of a nonviscous sphere and to the energies of a quantum ball in a
gravitational field are presented.
Part I
Elementary Matrices
This is the title of an old book by Frazer, Duncan and Collar; while their book is by no means elementary the title does fit Part I of these lectures.
Lecture 1
Getting Going
In this lecture we introduce some ideas which point in the direction of our future work. The next
to last section presents the facts about the solutions to linear algebraic equations.
The last section suggests that we can learn something about nonlinear problems by solving
certain linear problems.
To determine a boiling curve for a liquid solution, we heat the liquid and draw off an equilibrium
vapor. We denote the species making up the liquid by 1, 2, . . . , n in order of decreasing volatility or
increasing boiling point. Ordinarily the vapor is enriched in the more volatile species and so, as the
boiling goes on, the liquid composition shifts in favor of the less volatile species. To write a model
of this we denote by xi and yi the mole fractions of species i in the liquid and in the equilibrium
vapor and by N the number of moles of liquid being heated, then we write
dxi/ds = −yi + xi,   i = 1, . . . , n
where
s = −ln (N(t)/N(t = 0))
where s increases as t increases, and where, at constant pressure, y1 , . . . , yn and T are determined
by x1 , . . . , xn , assuming the phases remain in equilibrium as the liquid is boiled off. Writing
the phase equilibrium equations at constant pressure as yi = fi (x1, x2, . . . , xn) and assuming Σ yi = 1 whenever Σ xi = 1 and yi = 0 whenever xi = 0, we see that if Σ xi = 1 and xi ≥ 0 at any point in the boiling process then Σ xi = 1 and xi ≥ 0 at any subsequent point.
The state space is not a vector space, indeed no sum of two vectors in the state space lies in the
state space; it is a subset but not a subspace of R3 .
Should we wish to solve our system of equations, we first ought to develop some guidelines
to what the solution looks like. The simplest guide posts are the points where the system comes
to rest, the so called critical points, steady state points, equilibrium points, etc. These points are
defined by dxi/ds = 0, i = 1, 2, . . . , n, and here this corresponds to the equations −yi + xi = 0, i = 1, 2, . . . , n, i.e., to the homogeneous azeotropes. The simplest of these lie at the vertices of the
triangle, (points 1, 2, 3), next are the binary azeotropes lying on the edges (e.g., point 4) then the
full ternary azeotropes on the face of the triangle (e.g., point 5), viz.,

(Figure: the composition triangle with vertices 1, 2 and 3; binary azeotropes such as point 4 lie on the edges, ternary azeotropes such as point 5 lie in the interior.)
If we can decide how points move in the neighborhood of these rest points, that is whether
they are attracted to or repelled by the rest points, we can begin to make a qualitative sketch of the
family of solutions to our problem. This property of a critical point is referred to as its stability and
we can get some information on stability by assuming the system is displaced a small amount from
a rest point and then determining whether this small displacement is strengthened or weakened.
To do this we construct a linear approximation to our model in the neighborhood of a rest point of
interest and then investigate its solution.
The rest point x0 satisfies 0 = −fi (x1⁰, x2⁰, · · · , xn⁰) + xi⁰,   i = 1, 2, . . . , n
Then approximate the model when x is near x0 by writing x = x0 + ξ and retain only terms linear
in ξ. By doing this we obtain
dξ/ds = A ξ
where A is an n × n matrix whose elements are aij = −(∂fi/∂xj)(x1⁰, x2⁰, · · · , xn⁰) + δij.
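The matrix A can be built numerically by differencing; a sketch, reusing the constant-relative-volatility equilibrium assumed above:

```python
import numpy as np

alpha = np.array([4.0, 2.0, 1.0])        # assumed relative volatilities, as before

def f(x):
    return alpha * x / np.dot(alpha, x)  # y_i = f_i(x)

def linearized_matrix(x0, h=1e-7):
    # a_ij = -(df_i/dx_j)(x0) + delta_ij, the derivatives taken by central differences
    n = len(x0)
    A = np.eye(n)
    for j in range(n):
        e = np.zeros(n); e[j] = h
        A[:, j] -= (f(x0 + e) - f(x0 - e)) / (2.0 * h)
    return A

A = linearized_matrix(np.array([0.0, 0.0, 1.0]))  # the vertex of the least volatile species
print(np.linalg.eigvals(A))
# The eigenvectors whose components sum to zero, i.e., the perturbations that keep the
# mole fractions summing to one, carry the negative eigenvalues here; the vertex attracts.
```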
∂xj
This is a system of linear differential equations; what its solutions look like is determined by
the matrix A. To understand stability problems is one of our main interests in studying matrices
but it is far from our only interest as we will soon see. But for now, what is important is that
this example introduces the multiplication Aξ. The reader can carry out the calculation described
above and learn the rule determining the product.
The rule is (A ξ)i = Σj aij ξj. This formula defines matrix multiplication but to see what is going on when two matrices are
multiplied it is useful to think about a matrix in terms of its columns or its rows instead of in terms
of its elements. Indeed an m × n matrix A is made up of n columns, each a column vector lying
in C m . Denoting these a1 , a2 , . . . , an where
a1 = (a11, a21, . . . , am1)ᵀ,   a2 = (a12, a22, . . . , am2)ᵀ,   · · · ,   an = (a1n, a2n, . . . , amn)ᵀ
we can write
A = (a1 a2 . . . an)
x1 a1 + x2 a2 + · · · + xn an ∈ C m

where

x = (x1, x2, . . . , xn)ᵀ ∈ C n
The product Ax is then the linear combination of the columns of A determined by coefficients
taken to be the elements of x. Likewise in the product AB, each column of B belongs to C n
and determines the coefficients for the linear combination of the columns of A that adds up to the
corresponding column of the product. So the j th column of a product AB is a linear combination
of the columns of A, the coefficients being the elements of the j th column of B, indeed AB =
(Ab1 Ab2 . . .). Again each row of BA is a linear combination of the rows of A, the coefficients for
constructing the ith row of BA being the elements of the ith row of B. Ordinarily if AB is defined
BA is not and vice versa, the exception is when A and B are square. Then AB and BA need
not be equal. This way of looking at matrix multiplication will help us understand the solvability
conditions for linear algebraic equations, viz., Ax = b. In fact, if m > n, the picture
(Figure: A carries x in C n to the linear combination x1 a1 + · · · + xn an in C m, which may or may not reach b.)
Just as

Ax = (a1 a2 . . . an) (x1, x2, . . . , xn)ᵀ = a1 x1 + a2 x2 + · · · + an xn

so also

AX = (a1 a2 . . . an) (y1ᵀ; y2ᵀ; . . . ; ynᵀ) = a1 y1ᵀ + a2 y2ᵀ + · · · + an ynᵀ
where y1ᵀ is the first row of X, y2ᵀ the second, etc. and where a1 y1ᵀ is a matrix each of whose columns is a multiple of a1. This way of looking at a matrix multiplication, instead of AX = (Ax1 Ax2 . . .) as above, will be useful later on in turning the solutions to the eigenvalue problem for A into a spectral representation of A.
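Both views of the product are easily verified; a small sketch with invented matrices:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])       # 3 x 2
B = np.array([[1.0, 0.0, 2.0],
              [3.0, 1.0, 0.0]])  # 2 x 3

# Column view: the j-th column of AB is A acting on the j-th column of B,
# a linear combination of the columns of A.
col_view = np.column_stack([A @ B[:, j] for j in range(B.shape[1])])

# Outer-product view: AB is the sum over i of (i-th column of A)(i-th row of B).
outer_view = sum(np.outer(A[:, i], B[i, :]) for i in range(A.shape[1]))

print(np.allclose(col_view, A @ B), np.allclose(outer_view, A @ B))   # True True
```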
Columns and rows turn up in a symmetric way. For every result about a set of columns there is
a corresponding result about a set of rows. We emphasize columns and write a system of algebraic
equations as Ax = b but we can use the transpose operation, denoted T, where the columns of A become the rows of Aᵀ, viz., Aᵀ = (a1ᵀ; a2ᵀ; . . . ; anᵀ), and write the problem Ax = b as xᵀAᵀ = bᵀ.
Before leaving the boiling curve problem we can observe that the temperature of the liquid
plays no role. Because we ordinarily expect the temperature to increase as the boiling proceeds we
expect a family of solution curves for a plain vanilla liquid system to look as follows
(Figure: a family of solution curves on the composition triangle, drawn toward the least volatile vertex 3.)
If there are binary and ternary azeotropes we first observe that binary azeotropes come in two
kinds, maximum boiling which are stable and minimum boiling which are unstable. Knowing
whether a binary azeotrope is maximum or minimum boiling allows us to determine what the
system does on the binary edges. But it may not determine what happens in the interior even in the
absence of ternary azeotropes. Thus if the 1, 2 binary has a maximum boiling azeotrope we expect
either
(Figure: two candidate phase portraits on the composition triangle, labeled with vertices 1, 2, 3 and the 1-2 azeotrope on the bottom edge)
depending on whether the 1-2 azeotrope boils at a higher or lower temperature than does 3.
We might guess that if we calculate the temperature associated with each state and plot the
isotherms on the state diagram we can sketch the solution curves of our problem by insisting only
that the temperature not decrease as the boiling goes on. The temperature then is a sort of potential
for this problem.
The arrows on the edges 1-3 and 2-3 show that the vertex 3 is stable and therefore has at least
a small region of attraction. (Arrows pointing at vertex 3 will lie on the two edges that converge
on vertex 3 as long as there is not a 1-3 or 2-3 maximum boiling azeotrope.) This makes the first
of the two figures doubtful. Assuming the 1-2 azeotrope is stable whenever its boiling point is
higher than the boiling point at vertex 3, and knowing that vertex 3 is stable, we can speculate that
this boiling point ordering is sufficient that an unstable ternary azeotrope mediates the dynamics
of the system, anchoring a boundary separating the regions of attraction of vertex 3 and the 1-2
azeotrope.
There is a theory, called index theory, an account of which can be found in Coddington and
Levinson's book "Theory of Ordinary Differential Equations," which gives global information
about questions of this sort. It establishes conditions that must be satisfied by the sum over the
local stability at each of a set of multiple critical points. This theory requires ideas beyond what
we intend to explain in these lectures and is therefore a direction for advanced study.
dx/dt = a (t) x + b (t)

Its solution is

x (t) = x (t = t0) exp(∫_{t0}^{t} a (s) ds) + ∫_{t0}^{t} exp(∫_{s}^{t} a (u) du) b (s) ds
This formula tells us that the value of x at time t is determined by its value at time t0 and the
values of b on the interval (t0 , t). It shows that the contributions of the two sources to the solution
are independent and additive. This is one way, but not the only way, of stating the principle of
superposition. It exhibits the main way in which linear problems are special.
When a and b are constants this is

x (t) = x (t = t0) e^{a(t − t0)} + (b/a) {e^{a(t − t0)} − 1}
We can make use of this formula in studying the dynamics of a simple evaporator. The problem
is to concentrate a nonvolatile solute in a feed stream by boiling off the volatile solvent. The
stream to be concentrated, specified by its feed rate F [#/hr], concentration xF [mass fraction] and
temperature TF [°F], is run into a tank containing a heat exchanger of area A [ft²] and heat transfer coefficient U [Btu/hr ft² °F] supplied with steam condensing at temperature Ts. The pressure in
the system is determined by the conditions under which the solvent being boiled off is condensed.
This is done in a heat exchanger of area Ac and heat transfer coefficient Uc , supplied with cooling
water at temperature Tc .
(Figure: evaporator flow sheet; feed F, TF, xF enters the tank; vapor V, T goes to a condenser cooled to Tc; steam at Ts heats the tank; liquid product leaves as L, T, x.)
The simplest model corresponds to concentrating a dilute solution. In this case we assume that the physical properties of the solution are those of the solvent and add that its heat capacity cp [Btu/# °F] and latent heat λ [Btu/#] are constant. Then under steady conditions we write
0 = F − L − V

0 = xF F − x L

0 = hF F + UA (Ts − T) − h L − H V

0 = H V − h V − Uc Ac (T − Tc)
where T is the boiling point of the solvent, x the concentration of the product and h and H [Btu/#] the enthalpies of the liquid and the vapor streams. The pressure is the vapor pressure of the solvent
at temperature T . If F , xF , TF , Ts , UA, Tc and Uc Ac (the operating variables) are set then the
number of equations equals the number of unknowns and we can determine x, L, V and T (the
performance variables). Indeed eliminating L and introducing cp and λ we get
0 = xF F − x (F − V)

0 = cp (TF − T) F + UA (Ts − T) − λ V

0 = λ V − Uc Ac (T − Tc)
and we see that as long as
UA Ts + cp F TF > (UA + cp F) Tc
then T > Tc and a pressure is established so that a boiling, i.e., V > 0, solution to these equations
is obtained. We say, then, that the pressure is established so that the heat balance balances, i.e., so
that the heat supplied at the evaporator equals the heat removed at the condenser:
cp (TF − T) F + UA (Ts − T) = Uc Ac (T − Tc)
Now let the system be in a steady state corresponding to assigned values of the operating
variables and suppose that certain of these are changed to new values at t = 0. Then, while we
know how to determine the new steady state reached as t → ∞ we need to answer the question:
how does the system make the transition from the old to the new steady state?
Letting M[#] denote the amount of well mixed solution held in the evaporator and assuming
that L is adjusted to hold M fixed, we replace the left hand sides of the original equations by
dM/dt = 0, M dx/dt, M dh/dt = cp M dT/dt and 0, this last by assuming the condenser hold up to be small.
Then eliminating L we find
cp M dT/dt = cp F (TF − T) + UA (Ts − T) − Uc Ac (T − Tc)
The value of T at t = 0 is the old steady state value while the operating variables take their new
values at t = 0.
In this simple model T vs t can be found using the formula introduced at the beginning of this
section. Then V vs t is determined by
V = (Uc Ac / λ) (T − Tc)
and x vs t by

M dx/dt = xF F − (F − V) x
again using our now favorite formula. Setting M = 1000 the reader can determine T vs t, V vs t
and x vs t as the evaporator makes the transition from the steady state corresponding to T = 160
to that corresponding to T = 133.
It is helpful in dealing with problems like this to scale the variables so that only dimension-
less variables appear. In doing this there may be a variety of time scales and each may sug-
gest a useful approximation. Here the problem is so simple that the only important time scale is
cp M/(cp F + UA + Uc Ac) and this determines how fast the system responds to step changes.
If in the foregoing we make the value of Uc Ac very large then the solvent condenses at the temper-
ature Tc and the evaporator operates at constant pressure. The problem remains interesting when
the pressure is fixed if we include in the model the possibility of a boiling point rise. As a dilute
solution can exhibit a significant boiling point rise we retain all the simplifying approximations in
the foregoing save one: we now assume the boiling point of the solution to be Tc + βx where βx
is the boiling point rise. Then as long as the solution is boiling, i.e., V > 0, we can write
M dx/dt = (xF − x) F + x V

M cp dT/dt = UA (Ts − T) + cp (TF − T) F − λ V
and
T = Tc + βx
Whereas if the solution is not boiling, i.e., V = 0, we can write

M dx/dt = (xF − x) F

M cp dT/dt = UA (Ts − T) + cp (TF − T) F
and
T < Tc + βx
When the solution is boiling, the model contains the nonlinear term xV but only two of the
three equations are differential equations. To determine V we use T = Tc + βx and hence
Mcp dT/dt = Mcp β dx/dt to conclude that

UA (Ts − T) + cp (TF − T) F − λ V = cp β {(xF − x) F + x V}
This formula determines V as a function of x and T and can be used to eliminate V from the dif-
ferential equations. We will return to this problem and examine the stability of its steady solutions
to small upsets.
The error is f (x) − Pn (x) and if we determine a0, a1, · · · , an to make the integral square error, ∫_a^b {f (x) − Pn (x)}² dx, as small as possible, we find, on setting the derivatives of this with respect to a0, a1, · · · , an to zero, that

Σ_{j=0}^{n} (∫_a^b x^{j+i} dx) aj = ∫_a^b f (x) x^i dx,   i = 0, 1, . . . , n
The matrix on the left hand side is called the Hilbert matrix and the corresponding equations
are remarkable for how difficult they are to solve for values of n that are not large. Problems such
as this require for their solution the use of correction methods designed to improve approximations
obtained by elimination methods. The determinants of the 2 × 2 and 3 × 3 Hilbert matrices are
1/12 and 1/2160, where the numerators are 4 − 3 = 1 and 81 − 80 = 1. Now, if 1/3 is replaced by 33/100, where 1/3 − 33/100 = 1/300, the determinant of the altered 3 × 3 Hilbert matrix is 63/10⁶, which is about 10% of 1/2160.
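These numbers are quickly checked with exact fractions; a sketch:

```python
from fractions import Fraction

def det3(M):
    # cofactor expansion of a 3 x 3 determinant
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

third = Fraction(1, 3)
H = [[Fraction(1), Fraction(1, 2), third],
     [Fraction(1, 2), third, Fraction(1, 4)],
     [third, Fraction(1, 4), Fraction(1, 5)]]
print(det3(H))    # 1/2160

# Replace every 1/3 by 33/100: the determinant falls to 63/1000000.
t = Fraction(33, 100)
Hp = [[Fraction(1), Fraction(1, 2), t],
      [Fraction(1, 2), t, Fraction(1, 4)],
      [t, Fraction(1, 4), Fraction(1, 5)]]
print(det3(Hp))   # 63/1000000
```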
If we approximate f (x) by a0 + a1 x and denote (1/(b − a)) ∫_a^b ( ) dx by ( )avg, we get

a0 = ((x²)avg favg − xavg (xf)avg) / ((x²)avg − (xavg)²)

and

a1 = ((xf)avg − xavg favg) / ((x²)avg − (xavg)²)
and
Now let X and Y be random variables, the values x of X lying in [a, b], the values y of Y lying
in [c, d]. And let f (X, Y ) be the joint probability density: f (x, y)dxdy being the probability that
the point (X, Y ) lies in the rectangle (x, x + dx) × (y, y + dy). The expected value of any function
G(X, Y ) is
E G(X, Y) = ∫_a^b ∫_c^d G(x, y) f (x, y) dx dy
To approximate Y by a0 + a1 X we seek to determine a0 and a1 so that E (Y − (a0 + a1 X))² is least. Then as

E (Y − (a0 + a1 X))² = E (Y²) − 2a1 E (XY) − 2a0 E (Y) + a1² E (X²) + 2a0 a1 E (X) + a0²

we find, on setting the derivatives of this with respect to a0 and a1 to zero and solving for a0 and a1, that
a0 = (E (X²) E (Y) − E (X) E (XY)) / (E (X²) − E (X)²)

and

a1 = (E (XY) − E (X) E (Y)) / (E (X²) − E (X)²)
These formulas state in another way what we found just above. If X and Y are uncorrelated, we
have E(XY ) = E(X)E(Y ) whence a0 = E(Y ), a1 = 0.
Introducing

σX² = E (X − E (X))² = E (X²) − E (X)²

σY² = E (Y²) − E (Y)²

and

σX σY ρXY = E (XY) − E (X) E (Y)
we can write
a0 = E (Y ) − E (X) a1
and
a1 = ρXY (σY / σX)
The least value of E (Y − (a0 + a1 X))² is then σY² (1 − ρXY²). If X and Y are uncorrelated this is σY², whereas if they are perfectly correlated it is 0.
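A sketch checking these formulas against sampled data; the joint distribution below is invented:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, 100000)
Y = 2.0 * X + rng.normal(0.0, 0.1, X.size)   # an invented linear-plus-noise relation

EX, EY, EXY, EX2 = X.mean(), Y.mean(), (X * Y).mean(), (X * X).mean()
a0 = (EX2 * EY - EX * EXY) / (EX2 - EX**2)
a1 = (EXY - EX * EY) / (EX2 - EX**2)
print(a0, a1)   # near 0 and 2, the line we built in

# The same a1 via the correlation form a1 = rho_XY sigma_Y / sigma_X
rho = np.corrcoef(X, Y)[0, 1]
print(rho * Y.std() / X.std())
```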
We will use some simple ideas about random variables and probability densities when we deal
with the diffusion of a solute in a solvent.
Instead of having the values of a function everywhere on an interval, we may have its values only
at a discrete set of points. Call these values y1 , y2 , . . . , yn corresponding to x = x1 , x2 , . . . , xn .
Then we can try to find a polynomial of degree n − 1 that fits this information. Thus, writing
yi = Pn−1 (xi ) , i = 1, . . . , n
which we write as
V a = y

where a = (a0, a1, . . . , an−1)ᵀ, y = (y1, y2, . . . , yn)ᵀ and where the ith row of V is

(1  xi  xi²  · · ·  xi^{n−1}),   i = 1, . . . , n
and where V is called the Vandermonde matrix. This is a system of n equations in n unknowns.
Ordinarily it has one and only one solution, but the solution may be sensitive to small changes in
the y’s.
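A sketch of the interpolation calculation; the nodes and data are invented:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([1.0, 3.0, 2.0, 5.0])   # invented data

V = np.vander(x, increasing=True)    # row i is (1, x_i, x_i^2, x_i^3)
a = np.linalg.solve(V, y)            # the coefficients a_0, ..., a_3
print(np.allclose(V @ a, y))         # True: the cubic passes through every point
print(np.linalg.cond(V))             # already noticeably ill conditioned
```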
If now we have the values y1, y2, . . . , ym at m > n points x1, x2, . . . , xm and still seek a polynomial of degree n − 1, we can again write yi = Pn−1 (xi), i = 1, . . . , m, and this is a system of m equations in n unknowns. It is overdetermined and we suspect that it does not have a solution, either because our function is not really a polynomial of degree n − 1 or, if it is, that errors in the data preclude us from seeing this. In fact, what we will find is that a system of m equations in n unknowns may (i) not have a solution or (ii) may have exactly one solution or (iii) may have many solutions. This also summarizes the possibilities if m = n, whereas if m < n, an underdetermined system, the second possibility must be excluded.
We write this V a = y, where now V is m × n, and the error at the point xi is yi − Σ_{j=0}^{n−1} aj xi^j. Then the sum of the squares of the errors is

Σ_{i=1}^{m} (yi − Σ_{j=0}^{n−1} xi^j aj)² = Σ_{i=1}^{m} (y − V a)i²
= (y − V a)ᵀ (y − V a)
= yᵀ y − yᵀ V a − (V a)ᵀ y + (V a)ᵀ (V a)
= yᵀ y − 2aᵀ Vᵀ y + aᵀ VᵀV a
To find a0 , a1 , . . . , an−1 so that the sum of the squares of the errors takes its least value, we
set the derivative of this expression with respect to each ak , k = 0, . . . , n − 1, to zero getting n
equations:
Σ_{j=0}^{n−1} (Σ_{i=1}^{m} xi^k xi^j) aj = Σ_{i=1}^{m} xi^k yi,   k = 0, . . . , n − 1
VᵀV a = Vᵀy
This is a system of n equations in n unknowns where the elements of the n × n coefficient matrix
VᵀV are

(VᵀV)ij = Σ_{k=1}^{m} xk^{i+j},   i, j = 0, . . . , n − 1

and this is just what we would expect to turn up in this, the discrete problem, knowing that the elements of the corresponding matrix in the continuous problem are ∫_a^b x^{i+j} dx.
It is not easy to get an accurate solution to the problem VᵀV a = Vᵀy, as VᵀV, like the Hilbert matrix, does not work well when elimination methods are used. To see what is going on suppose that x1, x2, . . . , xm is an increasing sequence of positive numbers. Then the columns of VᵀV, viz.,

(Σ xi⁰, Σ xi¹, Σ xi², . . .)ᵀ,   (Σ xi¹, Σ xi², Σ xi³, . . .)ᵀ,   · · · ,   (Σ xi^{n−1}, Σ xi^n, Σ xi^{n+1}, . . .)ᵀ
lie in the positive cone of Rn and their directions converge to a limiting direction as n grows large.
Linear independence is retained for all n, but just barely as n grows large. The reader can see this
simply by letting m = 4, n = 3 and x1 = 1, x2 = 2, x3 = 3 and x4 = 4.
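The reader's little example takes only a few lines; as n grows, the last two columns of VᵀV point in nearly the same direction and the conditioning worsens:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])   # m = 4 points, as suggested in the text

for n in range(2, 5):
    V = np.vander(x, n, increasing=True)
    G = V.T @ V
    u, v = G[:, -2], G[:, -1]        # the last two columns of V^T V
    cosang = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    print(n, cosang, np.linalg.cond(G))
# the cosine creeps toward 1 while the condition number of V^T V climbs
```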
To see why nearly dependent columns lead to uncertainties in numerical work, observe that the
solution to

(1 1; 0 ε) (x, y)ᵀ = (1, 0)ᵀ

is x = 1, y = 0 whereas it is x = 0, y = 1 if (1, 0)ᵀ on the RHS is replaced by (1, ε)ᵀ.
For later work we record the observation that VᵀV a = Vᵀy can be written in terms of the error, y − V a, as

(y − V a)ᵀ V = 0ᵀ
It is important to build up a picture of the facts about the solutions to the problem Ax = b. We
begin to do this in the hope that the readers will add to it as they go along.
Write Ax = b as
x1 a1 + x2 a2 + · · · + xn an = b
(Figure: one column a1 in the plane; on the LHS b does not lie along a1, on the RHS it does.)
On the LHS there is no value of x satisfying xa1 = b while on the RHS there is exactly one
value of x. On the LHS we can determine the value of x so that xa1 is as close as possible to b but
xa1 cannot equal b.
With two columns a1 and a2 the picture is

(Figure: on the LHS b lies off the plane spanned by a1 and a2, on the RHS b lies in it.)
and the conclusions are as before: on the LHS there are no values of x and y such that xa1 + ya2 =
b, while on the RHS there is exactly one value of x and one value of y. But if r = 1 the picture is
(Figure: a1 and a2 lie along the same line; on the LHS b lies off that line, on the RHS b lies on it.)
and again on the LHS there are no values of x and y such that xa1 + ya2 = b. The RHS is new: as
before x and y can be determined, but now this is possible in many ways.
These pictures lead us to certain conclusions about the solutions in terms of the numerical
values of n, r and m. If r < m the problem has solutions for some values of b but not for others.
If the problem has a solution and if r = n then it is the only solution, but if r < n there are many
solutions.
Certainly r cannot exceed n and, as a1 , . . . , an ∈ C m , r cannot exceed m either. Using this the
reader can draw more conclusions about the solutions when n < m, n = m and n > m.
Assuming that most of the interesting problems a student will face are nonlinear, we ought to
indicate at least one source of linear problems.
Suppose that upon writing a model to explain or predict an experimental observation, viz., the
output variable, we have to solve a nonlinear problem. Included in the specification of the problem
will be the values of the input variables.
Often a simple steady solution to our problem can be found where, possibly, the nonlinear
terms vanish and we would then like to know if we can see this solution if we run the experiment.
To answer this question we add to the simple base solution we have a small correction and
substitute the sum into our nonlinear equation in order to obtain an equation for the correction.
Upon discarding squares, etc., of small quantities, this will be a linear equation and our aim will
be to discover if the small displacement grows or dies out in time.
If the displacement grows, our base solution will be called unstable and we will not see it in an
experiment.
If all displacements die out our base solution will be stable to small displacements and we may
be able to see it in an experiment.
Ordinarily there will be ranges of inputs where stability obtains and ranges of inputs where
it does not. The critical values of the inputs divide these ranges. Hence we may decide to run a
sequence of experiments where we increase an input to its critical value and ask what we expect to
see if the input is advanced just beyond its critical value.
By asking this question, we are led to derive a sequence of linear problems which are inhomo-
geneous versions of the homogeneous stability problem and which introduce solvability questions.
Boiling an Azeotrope
d~x/ds = −~y + ~x
where ~x is specified at s = 0 and where, at constant pressure, ~y is known as a function of ~x. Hence,
if we are boiling an azeotrope, i.e., if ~x at s = 0 is an azeotropic composition so that at s = 0 we
have ~y = ~x, then, for all s we have the solution
~x = ~x(t = 0)
dx/ds = −y + x
(Figure: equilibrium curves y vs x crossing the diagonal at the azeotrope composition xA; a maximum boiling azeotrope on the left, a minimum boiling azeotrope on the right.)
Setting x = xA + ξ and linearizing, we obtain

dξ/ds = −f ′(xA) ξ + ξ
Now we observe that t = 0 corresponds to s = 0 and that s is a time like variable. Hence we have
stability if f ′ (xA ) > 1, instability if f ′ (xA ) < 1 and we find that a maximum boiling azeotrope
can be boiled off at constant composition. But a minimum boiling azeotrope cannot be sustained.
Here it is not that we expect a composition fluctuation to occur during boiling, instead the problem
lies in the preparation of the initial state.
Going back to our boiling point rise model where the inputs, viz., M, xF , TF , F , cp , λ, UA, Ts ,
Tc and β, are held fixed and where the outputs are x, T and V , denote by x0 , T0 , V0 > 0, a boiling
steady state. Is it stable? To see, substitute
V = V0 + εV1 ,
x = x0 + εx1 ,
and
T = T0 + εT1
into the model presented earlier, where ε is small, and obtain equations for x1 , T1 and V1 . Eliminate
V1 from the first two equations, then eliminate T1 via T1 = βx1 and draw the conclusion that for
any small initial displacement of the steady state, x1 goes to zero as t increases.
To introduce a solvability question we present the following model, where c denotes the concentra-
tion of a solute in a one dimensional domain and F (c) denotes its rate of formation. All variables
are scaled and at first we say only that F (0) = 0 and F ′ (0) > 0. Thus an excursion away from
c = 0 reinforces itself. Our model is
∂c/∂t = ∂²c/∂x² + λ F(c)
where c = 0 at x = 0, 1, i.e., there is a solute sink at the ends of our domain, and where λ denotes
the strength of the source.
Our aim is to find the value of λ at which diffusion to the solute sinks at x = 0, 1 can no longer
control the solute source on the domain.
We have a solution c = 0 for all λ and we might wish to know for what values of λ we can
observe this solution. There are two simple things we can do, both leading to the same conclusion.
First we can introduce a small perturbation, viz., c = 0 + c1 where c1 is small and find that c1
satisfies
∂c1/∂t = ∂²c1/∂x² + λ F ′(0) c1
where c1 = 0 at x = 0, 1.
We write

c1 = ψ(x) e^{σt}
where σ is the growth rate of a perturbation whose spatial dependence is ψ(x). Then σ and ψ solve
the homogeneous problem
d²ψ/dx² + λF ′(0) ψ − σ ψ = 0
where ψ = 0 at x = 0, 1, and this problem has solutions other than ψ = 0 only for special values
of σ.
σ = λF ′(0) − π², λF ′(0) − 4π², . . .
and these σ’s are the growth rates of an independent set of perturbations spanning all allowable
perturbations. Because the strength of diffusion increases as the spatial variation increases, we see
that sin πx is the most dangerous perturbation.
At λ = 0, the greatest value of σ is found to be −π². And for all λ > 0 the greatest σ is λF ′(0) − π², whereupon the greatest σ becomes zero at λ = π²/F ′(0). This then is the critical value of λ beyond which the solution c = 0 is not stable.
The second thing we can do is to look only at steady solutions, viz., solutions to
d²c/dx² + λF (c) = 0
where c = 0 at x = 0, 1, and observe that one such solution is c = 0 for all λ. Then we can ask: if we have the solution c = 0 at some λ can we advance it if we advance λ, i.e., can we find dc/dλ? Denoting dc/dλ by ċ we have
d²ċ/dx² + λF ′(0) ċ = 0
where ċ = 0 at x = 0, 1.
This is the foregoing, viz., ψ, problem at σ = 0 and it has only the solution ċ = 0 for all λ < λcrit, hence we keep finding only the solution c = 0 as λ increases from zero until we reach λ = λcrit, whereupon a solution ċ ≠ 0 appears, signaling something new may be happening.
Now we can go back to the problem for σ and differentiate with respect to λ, obtaining

d²ψ̇/dx² + λF ′(0) ψ̇ − σ ψ̇ = σ̇ ψ − F ′(0) ψ

where ψ̇ = 0 at x = 0, 1.
The corresponding homogeneous equation is the equation for ψ and, at λcrit, it has a solution other than ψ = 0. Hence ψ̇ must exist and, therefore, the equation for ψ̇ must be solvable. The solvability condition then determines σ̇. The reader can multiply the ψ̇ equation by ψ, the ψ equation by ψ̇, subtract and integrate the difference over 0 ≤ x ≤ 1, learning that σ̇ is positive at λ = λcrit. Thus we have σ = 0 and dσ/dλ > 0 at λ = λcrit. We therefore anticipate seeing a nonzero solution branch for λ > λcrit.
We will return to the Petri Dish problem later on and try to decide what the nonzero solution
looks like for λ just beyond λcrit . And we can do this by solving only linear equations.
1. A graph is a set of points connected pairwise by directed line segments. If there are n points
and a line segment runs from point i to point j then the ij element in an n × n matrix is
set to 1, otherwise it is 0. The resulting matrix is called a connection matrix. What do the
powers of a connection matrix tell us about the graph? The powers of a connection matrix
are easy to determine because its columns are made up of zeros and ones. Each column of
the product of a matrix A multiplied on the right by a connection matrix is simply the sum
of certain columns of A.
2. Suppose the elements of the square matrix A themselves are square matrices. Then A is
called a block matrix. Write A as LU where L is blockwise lower triangular and U is
blockwise upper triangular, its diagonal blocks being I.
3. Determine T vs. t, V vs. t and x vs. t in the simple evaporator problem presented in Lecture
1.
Lecture 2

Independent and Dependent Sets of Vectors

The m × n matrix A is assigned. In this and the next two lectures we show how to determine
whether or not a vector x ∈ C n can be found so that the vector Ax ∈ C m equals an assigned
vector b ∈ C m . Assuming the problem Ax = b has a solution, we then show how to write its
general solution.
The main idea is linear independence. This, or its opposite, linear dependence, is a property of
a set of vectors. Indeed, starting with a set of vectors, {v 1 , v 2 , ..., v n } , we can create additional
vectors by making linear combinations of the assigned vectors using arbitrary complex numbers,
viz., c1 v1 + c2 v 2 + ... + cn v n . The set v 1 , v 2 , ..., v n is said to be independent if the only way
we can create the vector 0 is by setting c1 , c2 , ..., cn to zero. In other words the set of vectors
v 1 , v 2 , ..., vn is said to be independent iff
c1 v1 + c2 v2 + ... + cn vn = 0 implies c1 = c2 = · · · = cn = 0
The idea of linear dependence can be stated in terms of linear combinations in the following
way: at least one vector of the set v 1 , v2 , ..., vn is a linear combination of the others if and only if
the set is dependent, i.e., if and only if the equation c1 v1 + c2 v2 + ... + cn v n = 0 is satisfied for
c1 , c2 , ..., cn other than c1 , c2 , ..., cn all zero.
The idea of linear independence is not special to sets of column vectors and is defined as above
for vectors in general, using the corresponding zero vector; indeed it pertains to sets of row vectors
as well as to sets of column vectors.
It turns out that matrices have a surprising property: the greatest number of independent rows cannot exceed the greatest number of independent columns (and vice versa); and this, willy nilly, cannot exceed the total number of columns. Writing each reaction among the molecules m = 1, 2, . . . , M as a row of stoichiometric coefficients νr1, νr2, . . . , νrM, the number of molecules then is a bound on the greatest number of independent reactions that can be written.
If we take into account that each molecule is made up of atoms from the set a = 1, 2, ..., A
and denote by αma the number of atoms a in molecule m, then as each atom is conserved in each
reaction, we must have

Σ_{m=1}^{M} νrm αma = 0

for all r and all a. We can identify the atom a with the column αa = (α1a, α2a, . . . , αMa)ᵀ, whence the conservation conditions are ν αa = 0, a = 1, 2, ..., A, and these can be written

ν (α1 α2 · · · αA) = 0
Each independent condition of this kind reduces by one the greatest number of independent columns in the set ν1, ν2, ..., νM. Assuming the atoms to be distributed independently over the molecules, i.e., assuming the set α1, α2, ..., αA to be independent, the greatest number of independent reactions that can be written using M molecules made up of A atoms is M − A, hence the requirement that balanced reactions be written reduces the greatest number of independent reactions from M to M − A. But even the bound M corresponding to arbitrary reactions is interesting and it has nothing to do with the requirement that the stoichiometric coefficients be integers. It holds assuming the νrm to be arbitrary complex numbers and cannot be lowered by the restriction to integers.
To every set of n column vectors in C^m there corresponds a set of m row vectors in C^n generated by writing the column vectors as the columns of an m × n matrix A. The row vectors are then the rows of A. If we rephrase the question as to the greatest number of independent vectors in the set a1, a2, ..., an as a corresponding question about the greatest number of independent columns in the matrix A, we will get information not only about the columns of A but also about the rows of A. But the rows of A are the columns of A^T, so answers to questions about n column vectors in C^m are also answers to questions about a corresponding set of m column vectors in C^n.
To see what linear independence has to do with our main problem of determining whether or not solutions of Ax = b exist and, if they do, writing them, we let a1, a2, ..., an denote the columns of A and write the problem Ax = b as

x1 a1 + x2 a2 + · · · + xn an = b

and the problem Ax = 0 as

x1 a1 + x2 a2 + · · · + xn an = 0

We see therefore that Ax = 0 has solutions other than x1, x2, ..., xn all zero iff the set of vectors a1, a2, ..., an is dependent, i.e., not independent, and that Ax = b has solutions iff on joining b to the set a1, a2, ..., an we do not increase the greatest number of independent columns.
Our first problem, therefore, is to determine the greatest number of independent vectors in a set
of n vectors, a1 , a2 , ..., an in C m . To do this we introduce the determinant of a square matrix. The
determinant is a function defined on square matrices mapping each square matrix into a complex
number.
Let A = (aij) = (a1 a2 ... an) be an n × n matrix; then the determinant of A, denoted det A, is defined by

det A = Σ (±) a_{α1,1} a_{α2,2} · · · a_{αn,n}

where we have chosen to write the column indices in their natural order and where the sum is over all sets of integers α1, α2, ..., αn that are permutations of 1, 2, ..., n, the + sign to be used if the permutation is even, the − sign if it is odd. This then is a sum of n! terms, each term being a product of n factors, where each row and each column is represented once in each term. This
definition leads to the following four properties from which our conclusions can be drawn:

(i) det A^T = det A

(ii) interchanging two columns of A changes the sign of det A

(iii) if the ith column of A is written ai = c1 b1 + c2 b2 + · · ·, then det A = c1 det (a1 ... b1 ... an) + c2 det (a1 ... b2 ... an) + · · ·

(iv) det A is not changed by adding to its ith column a linear combination of its other columns

In (ii) two columns are interchanged; in (iii) a fixed column is written as a linear combination of arbitrary column vectors; in (iv) a linear combination of other columns is added to the ith column. All else is held fixed. The proofs of these properties and their corollaries, such as det A = 0 if ai = aj for any i ≠ j, det A = 0 if ai = 0 for any i, etc., come easily out of the definition and either can be supplied by the reader or can be found in Shilov's book "Linear Algebra."
Property (i) is important in turning column theorems into row theorems and vice versa. Properties (ii), (iii) and (iv) are properties of columns and therefore properties of the columns of A^T. As the rows of A are the columns of A^T they are also properties of the rows of A.
If a1, a2, ..., an is a dependent set of vectors in C^n then det A = 0. This is so as we can write one of a dependent set of vectors as a linear combination of the others and then use this combination in (iii) to show that det A is zero. We can restate this as: if det A ≠ 0 then {a1, a2, ..., an} is independent.
Each term in the expansion of det A contains one factor from the jth column. Of the n! terms, (n − 1)! contain the common factor a1j, (n − 1)! contain the common factor a2j, etc. Writing the first of these n sets of (n − 1)! terms as a1j A1j, the second as a2j A2j, etc., we find that

det A = a1j A1j + a2j A2j + · · · = Σ_{i=1}^n aij Aij

and

∂ det A / ∂aij = Aij
This is the expansion of det A via its jth column and it can be written for each j = 1, 2, ..., n. Each factor Aij, i = 1, 2, ..., n, is called the cofactor of the corresponding element aij, i = 1, 2, ..., n, and by definition their values do not depend on the values of the elements in the jth column. When this construction is carried out for each column, j = 1, 2, ..., n, it generates n² elements Aij. Then letting

A1 = (A11, A21, ..., An1)^T, A2 = (A12, A22, ..., An2)^T, etc.

where A1 is the column of cofactors of a1, the first column of A, etc., we can write

det A = Aj^T aj, j = 1, 2, ..., n

The matrix whose columns are A1, A2, ..., An is called the matrix of the cofactors of A and its transpose, denoted adj A, where adj A = (A1 A2 ... An)^T, is called the adjugate of A.

It turns out that Aij = (−1)^{i+j} Mij where Mij, a minor of A, is the determinant of the (n − 1) × (n − 1) submatrix of A obtained by deleting its ith row and jth column. It is worth stating that, unless n = 2 or 3, it is not practical to evaluate determinants directly from the definition, requiring the evaluation of n! terms, nor by the expansion in cofactors, requiring the evaluation of (n − 1)! terms n times, etc.
We now have two sets of columns {a1, a2, ..., an} and {A1, A2, ..., An}, which satisfy

Aj^T ak = det A, k = j

Aj^T ak = 0, k ≠ j

The second formula is the expansion, via the jth column, of the determinant obtained by writing ak in place of aj as the jth column of A and hence is zero. The multiplication on the left hand side is a column multiplied on the left by a row. The product is a scalar, indeed Aj^T ak = ak^T Aj.
Two sets of vectors satisfying

Ai^T aj ≠ 0, i = j

Ai^T aj = 0, i ≠ j

are called biorthogonal sets. This is a useful idea. Its usefulness stems from the observation that if we expand a vector in one of the sets, the coefficients in the expansion can be determined simply by operating on the expansion using vectors of the other set. Indeed to solve the problem Ax = b
where x and b belong to C^n and det A ≠ 0, we rewrite the equation as

x1 a1 + x2 a2 + · · · + xn an = b

and multiply both sides by Aj^T to obtain

xj Aj^T aj = Aj^T b

whereupon

xj = Aj^T b / det A, j = 1, ..., n.
Collecting the components we have

x = (A1 A2 ... An)^T b / det A = ( adj A / det A) b ≡ A^{−1} b
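As a numerical illustration of these formulas, here is a minimal sketch (not from the lectures; Python with numpy, and the matrix and right hand side are hypothetical) that builds the cofactor columns Aj from the definition, checks the biorthogonality relations Aj^T ak = (det A) δjk, and solves Ax = b by xj = Aj^T b / det A:

    import numpy as np

    def cofactor_matrix(A):
        """Matrix of cofactors: entry (i, j) is (-1)**(i+j) times the minor M_ij."""
        n = A.shape[0]
        C = np.empty_like(A, dtype=float)
        for i in range(n):
            for j in range(n):
                M = np.delete(np.delete(A, i, axis=0), j, axis=1)  # drop row i, column j
                C[i, j] = (-1) ** (i + j) * np.linalg.det(M)
        return C

    A = np.array([[2.0, 1.0, 0.0],
                  [1.0, 3.0, 1.0],
                  [0.0, 1.0, 2.0]])
    b = np.array([1.0, 2.0, 3.0])

    C = cofactor_matrix(A)        # columns are A_1, A_2, ..., A_n
    adjA = C.T                    # the adjugate
    d = np.linalg.det(A)

    # biorthogonality: (adj A) A = (det A) I and A (adj A) = (det A) I
    print(np.allclose(adjA @ A, d * np.eye(3)))     # True
    print(np.allclose(A @ adjA, d * np.eye(3)))     # True

    # Cramer's rule, x_j = A_j^T b / det A, agrees with a library solve
    x = adjA @ b / d
    print(np.allclose(x, np.linalg.solve(A, b)))    # True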
Before going on we write our results in another way: the column formulas for the expansion of a determinant, viz.,

Aj^T aj = Σ_{i=1}^n aij Aij = det A, j = 1, ..., n

and

Aj^T ak = Σ_{i=1}^n aik Aij = 0, j = 1, ..., n, k = 1, ..., n, k ≠ j

can be written

( adj A) A = (det A) I

where I = (δij).
Now there are row formulas for the expansion of a determinant which can be obtained either by going back to the definition and factoring an element in the ith row out of each term or by writing the column formulas using A^T in place of A. The row formulas

det A = Σ_{j=1}^n aij Aij, i = 1, ..., n

and

0 = Σ_{j=1}^n akj Aij, i = 1, ..., n, k = 1, ..., n, k ≠ i

the second obtained by expanding, via the ith row, the determinant obtained by replacing the ith row of A by its kth row, can be written

A ( adj A) = (det A) I
The readers may wish to satisfy themselves that aij is multiplied by one and the same coeffi-
cient, denoted Aij , whether it appears in a row or column expansion. If det A = 0, we see that
A( adj A) = 0 and hence that each column of adj A is a solution of Ax = 0.
The determinant, defined only on square matrices, can be used to determine the greatest number of independent columns in an m × n matrix A = (a1 a2 ... an), where aj ∈ C^m, j = 1, ..., n, and where m need not be equal to n. To do this we introduce square submatrices of A of order k by deleting all but k rows and all but k columns and then we calculate the determinants, called minors of order k, of all these submatrices. Using this information we define the rank of A to be the order of the
largest non-vanishing minor of A and denote it by r; we call the set of r columns of A running
through that minor a set of basis columns. The rank is unique but, as more than one minor of order
r may be non-vanishing, a set of basis columns need not be unique, however each such set is made
up of r columns, and each such set is independent. To see this let a1 , a2 , ..., ar be a set of r basis
columns. (In the problem Ax = b interchanging columns of A and interchanging corresponding
elements of x leave the problem unchanged.) Then to see if the set is independent we investigate
the solutions of
c1 a1 + c2 a2 + ... + cr ar = 0
Looking at the r equations corresponding to the basis minor we see by Cramer’s rule that their
only solution is c1 , c2 , ..., cr all zero. This then is the only solution to the full set of m equations.
What we have established is this: in an m × n matrix there is at least one set of r columns, the
basis columns, that is independent, and r, the rank of the matrix, cannot exceed the smaller of m
or n. To go on we require a result telling us how the columns not in a set of basis columns depend
on the basis columns. Indeed what we need is the basis minor theorem. As stated and proved in
Shilov’s book ”Linear Algebra,” on p. 25, this theorem tells us that any column of a matrix can be
written as a linear combination of any set of basis columns. This most important result in linear
algebra is surprisingly easy to prove. As a set of basis columns is independent each such expansion
must be unique. The basis minor theorem tells us directly that any set of r + 1 or more columns of
which r are basis columns is dependent. Indeed any set of r + 1 or more columns in a matrix of
rank r must be dependent. The argument for this can be found early in the next lecture. Hence the
greatest number of independent columns in an m × n matrix of rank r is r. Of course any subset
of an independent set of vectors is also independent.
Because of property (i) we see that the ranks of A and AT coincide, every square submatrix of
A being the transpose of a square submatrix of AT . The foregoing argument in terms of AT then
shows that r is also the greatest number of independent rows in A. Indeed if m > n the greatest
number of independent rows cannot exceed n whatever values are assigned to the aij .
LECTURE 2. INDEPENDENT AND DEPENDENT SETS OF VECTORS 39
We also see that if A is square, i.e., n × n, and det A = 0 then its rank is at most n − 1 whence the columns of A must be dependent. This is the converse of the earlier result that if det A ≠ 0 its columns are independent.
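The definition of rank as the order of the largest non-vanishing minor can be checked directly, for small matrices, by brute force. The sketch below (not from the lectures; Python with numpy, with a small tolerance standing in for exact zero) enumerates all square submatrices and compares the result with a library rank computation:

    import numpy as np
    from itertools import combinations

    def rank_via_minors(A, tol=1e-10):
        """Order of the largest non-vanishing minor of A (practical only for small A)."""
        m, n = A.shape
        for k in range(min(m, n), 0, -1):
            for rows in combinations(range(m), k):
                for cols in combinations(range(n), k):
                    if abs(np.linalg.det(A[np.ix_(rows, cols)])) > tol:
                        return k      # found a non-vanishing minor of order k
        return 0

    A = np.array([[1.0, 2.0, 3.0],
                  [2.0, 4.0, 6.0],    # twice the first row
                  [1.0, 0.0, 1.0]])

    print(rank_via_minors(A))             # 2
    print(np.linalg.matrix_rank(A))       # 2, by a far more practical method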
To wind this lecture down we develop some useful results having to do with the determinant.
Let the elements aij of a matrix A be functions of t. Then det A is a function of t and its derivative
is
d det A / dt = Σ_i Σ_j (∂ det A / ∂aij) (daij / dt)

Using

∂ det A / ∂aij = Aij

we have

d det A / dt = Σ_i Σ_j Aij daij / dt
This can be written as the sum of n determinants in two ways, using column or row expansions.
Now let x1(t), x2(t), ..., xn(t) denote n solutions of

dx/dt = A(t) x

and let W = det (x1(t) x2(t) ... xn(t)) = det (xij(t)). Then, as above,

dW/dt = Σ_i Σ_j Xij dxij / dt

where Xij denotes the cofactor of xij. But this is

dW/dt = Σ_i Σ_j Σ_k aik xkj Xij

and hence, using Σ_j xkj Xij = W δki, we find

dW/dt = tr A(t) W

Integrating this, W(t) = W(t0) exp(∫_{t0}^t tr A(s) ds), and we conclude that W(t) is either always zero or never zero. This is an important result in the theory of differential equations. It is required in Lecture 19. The determinant W(t) is called the Wronskian of the solutions.
If A = (a1 a2 ... an) and B = (b1 b2 ... bn) then the ij element of A^T B is ai^T bj whence tr (A^T B) = Σ_{i=1}^n ai^T bi. Using this, the formula for the derivative of a determinant, viz.,

d det A / dt = Σ_j Σ_i Aij daij / dt = Σ_j Aj^T daj / dt

can be written

d det A / dt = tr ( adj A dA/dt )
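A quick numerical check of this formula (a sketch, not from the lectures; Python with numpy, the matrix function A(t) being hypothetical, and the adjugate formed as det(A) A^{−1} since A is non-singular here) compares a finite difference derivative of det A(t) with tr(adj A dA/dt):

    import numpy as np

    def A_of_t(t):
        # an arbitrary smooth matrix-valued function of t
        return np.array([[2.0 + t,   np.sin(t), 0.0],
                         [t ** 2,    3.0,       1.0],
                         [np.cos(t), t,         2.0]])

    t, h = 0.7, 1e-6
    A = A_of_t(t)
    adjA = np.linalg.det(A) * np.linalg.inv(A)   # adj A = det(A) A^{-1}, det A != 0

    dA = (A_of_t(t + h) - A_of_t(t - h)) / (2.0 * h)
    lhs = (np.linalg.det(A_of_t(t + h)) - np.linalg.det(A_of_t(t - h))) / (2.0 * h)
    rhs = np.trace(adjA @ dA)

    print(lhs, rhs)     # the two numbers agree to finite-difference accuracy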
We give here some simple results which the readers can verify and some not so simple results.
A square matrix is called diagonal if its elements off the main diagonal vanish; it is called upper
or lower triangular if its elements below or above the main diagonal vanish. The determinant of
each such matrix is the product of its diagonal elements.
The determinant of a product of square matrices is the product of the determinants of the
factors. This is a particular instance of a result by which a minor of a product can be expressed as
a sum of products of minors of the factors, see p. 91 of Shilov’s “Linear Algebra.”
The reader can discover that (AB)^T = B^T A^T and then that rank AB ≤ rank A as each column of AB is a linear combination of the columns of A. As rank AB = rank (AB)^T ≤ rank B^T = rank B we also see that rank AB ≤ rank B.

If AA^{−1} = I and BB^{−1} = I then ABB^{−1}A^{−1} = I and so (AB)^{−1} = B^{−1}A^{−1}. This result assumes A and B are square. If AB is square but A and B are not, more work is required.
Ordinarily we can write a square matrix A as a product LU where L is lower triangular and U is upper triangular having 1's on its main diagonal. This is easy to do column by column: on writing

(a1 a2 ... an) = (ℓ1 ℓ2 ... ℓn) U

where U has 1's on its diagonal and elements u12, u13, u23, etc., above it, we find

a1 = ℓ1

a2 = u12 ℓ1 + ℓ2

a3 = u13 ℓ1 + u23 ℓ2 + ℓ3

etc.

Thus, because ℓ12 = 0, u12 can be determined to be a12/a11, etc. What appears on the diagonal of L is a11, (a11 a22 − a12 a21)/a11, ..., i.e., the ratios of the determinants of the upper left hand submatrices of A. If one of these is zero the calculation, as indicated above, cannot go on. This may happen whether or not det A = 0. For instance the matrix with rows (0 1) and (1 1) cannot be so expanded but the problem disappears if the columns are interchanged.
The readers can show that if A is tridiagonal then L and U are bidiagonal. The readers can also satisfy themselves that the recipe for the determination of L and U can be improved by calculating the columns of L and the rows of U in the following sequence: first column of L, first row of U, second column of L, second row of U, etc.; a sketch of this recipe appears below. Indeed if A is partitioned into blocks, the decomposition can be carried out blockwise. In doing this the blocks of A must satisfy certain minimum conditions, e.g., the diagonal blocks must be square. The equation LUx = b is easy to solve.
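Here is a minimal sketch of that recipe (not from the lectures; Python with numpy, no pivoting, so it fails exactly when a leading submatrix determinant vanishes), alternating columns of L with rows of U and then doing the easy forward and back substitutions for LUx = b:

    import numpy as np

    def crout_lu(A):
        """A = L U with L lower triangular and U upper triangular, diag(U) = 1."""
        n = A.shape[0]
        L = np.zeros((n, n))
        U = np.eye(n)
        for k in range(n):
            # k-th column of L
            L[k:, k] = A[k:, k] - L[k:, :k] @ U[:k, k]
            if L[k, k] == 0.0:
                raise ZeroDivisionError("leading submatrix singular; interchange rows/columns")
            # k-th row of U, to the right of the diagonal
            U[k, k + 1:] = (A[k, k + 1:] - L[k, :k] @ U[:k, k + 1:]) / L[k, k]
        return L, U

    def solve_lu(L, U, b):
        n = len(b)
        y = np.zeros(n)
        for i in range(n):                    # forward: L y = b
            y[i] = (b[i] - L[i, :i] @ y[:i]) / L[i, i]
        x = np.zeros(n)
        for i in reversed(range(n)):          # backward: U x = y, diag(U) = 1
            x[i] = y[i] - U[i, i + 1:] @ x[i + 1:]
        return x

    A = np.array([[4.0, 2.0, 1.0],
                  [2.0, 5.0, 2.0],
                  [1.0, 2.0, 6.0]])
    b = np.array([1.0, 2.0, 3.0])

    L, U = crout_lu(A)
    print(np.allclose(L @ U, A))                                   # True
    print(np.allclose(solve_lu(L, U, b), np.linalg.solve(A, b)))   # True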
Again we denote by a1, a2, ..., an the n columns of an m × n matrix A and we say that the set {a1, a2, ..., an} is independent iff c1 a1 + c2 a2 + · · · + cn an = 0 has only the solution c1 = c2 = · · · = cn = 0.

The reader should then believe that if {a1, a2, ..., an} is independent, the equation Ax = 0 has only the solution x = 0 and if {a1, a2, ..., an, b} is independent, the equation Ax = b has no solution.
Suppose the R × M matrix ν, having rank r, is our reaction-molecule matrix and the M × A matrix α = (α1, ..., αA) is our molecule-atom matrix. We will see in Lecture 3 that the equation νx = 0 has M − r independent solutions. Thus if

ν αa = 0, a = 1, ..., A

then the A independent columns of α all solve νx = 0, whence A ≤ M − r, i.e.,

r ≤ M − A
To do this expand D by its last row and observe that it is a polynomial of degree n − 1 in xn whose coefficients depend on x1, x2, ..., x_{n−1}. The n − 1 zeros of this polynomial are x1, x2, ..., x_{n−1}, and it can be written

D(x1, x2, ..., xn) = c Π_{i=1}^{n−1} (xn − xi)
2. A determinant of order k can be written as a linear combination of its first minors, determinants of order k − 1. Let A be an m × n matrix. If all its minors of order k vanish then so too do all its minors of order k + 1, k + 2, etc. Prove this.
3. Using the formula for the derivative of a determinant and the formula for the expansion of a determinant in terms of its first minors show that: (d/dλ) det A is a linear homogeneous function of the first minors of A, (d²/dλ²) det A is a linear homogeneous function of the second minors of A, etc.
4. Because you see 6 molecules and 3 atoms in the set of 5 reactions listed below, you believe
that at most 3 of the reactions are independent. Calculate the rank of the reaction-molecule
matrix.
C + H2 O −→ CO + H2
C + 2H2 −→ CH4
C + CO2 −→ 2CO
CO + H2 O −→ CO2 + H2
Answer:

J = 1 + ε tr A + ε² (a11 a22 − a12 a21 + a11 a33 − a13 a31 + a22 a33 − a23 a32) + ε³ det A
An example is:

C + O2 −→ CO2
C + (1/2) O2 −→ CO
CO + (1/2) O2 −→ CO2

R = 3, M = 4, A = 2
A scalar valued function f of A satisfying

f(QAQ^T) = f(A)

for every orthogonal matrix Q is called a scalar invariant of A; show that I1(A), ..., In(A) are scalar invariants of A. They are called the principal invariants of A.

Define the gradient of a scalar function of A, denoted fA(A), having components ∂f/∂Aij, by

(d/ds) f(A + sC) |_{s=0} = Σ (∂f/∂Aij) Cij = tr ( fA(A) C^T )

for any C.

Derive

detA (A) = det A (A^{−1})^T

and

trA (A) = I
Lecture 3
Vector Spaces
A set of vectors on which rules for addition and scalar multiplication are defined is called a vector space if the sum of any two vectors in the set is in the set and the product of any scalar and any vector in the set is in the set. The set of columns of m complex numbers, denoted C^m, is a vector space.

Let the vectors v1, ..., vn belong to C^m. Then the set of all linear combinations of v1, ..., vn, i.e., the set of all vectors c1 v1 + c2 v2 + · · · + cn vn corresponding to all ways of choosing c1, ..., cn, is a vector space. It is a subspace of C^m; it is called the manifold spanned by v1, ..., vn and it is denoted [v1, v2, ..., vn]. A set of independent vectors in a vector space that spans the space is called a basis for the space. Each vector in the space has a unique expansion in a set of basis vectors and the coefficients in the expansion are called the components of the vector in the basis.
If there are n vectors in a basis for a space then every set of n + 1 vectors in the space is dependent. Indeed if v1, ..., vn is a basis then u1, ..., u_{n+1} must be dependent. To see this let

uj = Σ_{i=1}^n ξij vi, j = 1, ..., n + 1

Then

c1 u1 + c2 u2 + · · · + c_{n+1} u_{n+1} = 0

can be written

Σ_{i=1}^n vi Σ_{j=1}^{n+1} ξij cj = 0

whence we have

Σ_{j=1}^{n+1} ξij cj = 0, i = 1, ..., n

and denoting (ξ1j, ξ2j, ..., ξnj)^T by ξj ∈ C^n, j = 1, 2, ..., n + 1, this is

c1 ξ1 + c2 ξ2 + · · · + c_{n+1} ξ_{n+1} = 0
As the rank of the n × (n + 1) matrix (ξ1 ξ2 · · · ξ_{n+1}) cannot exceed n, the set of columns ξ1, ξ2, ..., ξ_{n+1} must be dependent and so too therefore the set of vectors u1, u2, ..., u_{n+1}. This tells us: if there are n vectors in some basis for a space then every basis for the space is made up of n vectors. We go on and define the dimension of the space to be n. It follows directly that any set of n independent vectors in a space of dimension n is a basis for the space.
There are two subspaces associated to an m × n matrix A that are important to us, one a subspace of C^m, the other of C^n. Both depend for their identification on the basis columns of A. We denote by r the rank of A, and assume the columns a1, a2, ..., ar to be a set of basis columns. Then we denote the set of vectors Ax ∈ C^m, for all x ∈ C^n, by Im A and call it the image of A. Because c1 Ax1 + c2 Ax2 = A(c1 x1 + c2 x2), Im A is a vector space and hence a subspace of C^m. Because Ax = x1 a1 + x2 a2 + · · · + xn an, we can write Im A = [a1, a2, ..., an] and hence by the basis minor theorem Im A = [a1, a2, ..., ar]. As {a1, a2, ..., ar} is independent, it is a basis for Im A. Hence the dimension of Im A is r and the rank of a matrix, an algebraic quantity, turns out to have a geometric interpretation as the dimension of its image.
This leads directly to results such as rank AB ≤ rank A inasmuch as Im AB cannot lie outside
Im A.
The geometric interpretation of the rank of A leads to a practical way of determining its value. Indeed we do not change the rank of A by carrying out operations on the columns of A that do not change the dimension of Im A. The following operations satisfy this requirement and are therefore rank preserving: interchanging two columns, multiplying a column by a scalar other than zero, and adding to one column a multiple of another.

These rank preserving column operations can be used to produce from A a matrix whose rank, i.e., number of independent columns, can be established by inspection. The idea is to create zeros in the first row in columns 2, ..., n, then in the second row in columns 3, ..., n, then etc.
The second subspace is Ker A, the kernel of A, the set of all solutions x ∈ C^n of Ax = 0. By the basis minor theorem the columns not in the set of basis columns can be expanded in the basis columns, viz.,

c1 a1 + c2 a2 + · · · + cr ar + a_{r+1} = 0

d1 a1 + d2 a2 + · · · + dr ar + a_{r+2} = 0

etc.

whence the vectors x1 = (c1, ..., cr, 1, 0, ..., 0)^T, x2 = (d1, ..., dr, 0, 1, 0, ..., 0)^T, etc., all lie in Ker A. Now let x ∈ Ker A have the components ξ1, ξ2, ..., ξn. Using the expansions of a_{r+1}, a_{r+2}, ... in terms of a1, a2, ..., ar and the fact that {a1, a2, ..., ar} is independent, we find that

(ξ1, ξ2, ..., ξr)^T − ξ_{r+1} (c1, ..., cr)^T − ξ_{r+2} (d1, ..., dr)^T − · · · = 0

and hence that

x − ξ_{r+1} x1 − ξ_{r+2} x2 − · · · = 0

This tells us that x1, x2, ..., x_{n−r} is a basis for Ker A, which we can now write as [x1, x2, ..., x_{n−r}], and that the dimension of Ker A, the solution space of Ax = 0, is n − r, the number of columns of A less its rank.
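A fundamental system of solutions can also be produced numerically. A common sketch (not from the lectures; Python with numpy, using the singular value decomposition rather than basis columns) extracts an orthonormal basis for Ker A and confirms dim Ker A = n − r:

    import numpy as np

    def null_space_basis(A, tol=1e-10):
        """Columns of the returned matrix form an orthonormal basis for Ker A."""
        U, s, Vh = np.linalg.svd(A)
        r = int(np.sum(s > tol))      # the rank
        return Vh[r:].conj().T        # right singular vectors for the zero singular values

    A = np.array([[1.0, 2.0, 3.0, 4.0],
                  [0.0, 1.0, 1.0, 1.0],
                  [1.0, 3.0, 4.0, 5.0]])   # row 3 = row 1 + row 2, so r = 2

    N = null_space_basis(A)
    print(N.shape[1])                  # 2 = n - r independent solutions
    print(np.allclose(A @ N, 0.0))     # True: each column solves Ax = 0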
As simple examples the readers may satisfy themselves that the homogeneous equation

f^T x = 0

has n − 1 independent solutions if f ≠ 0, and that the pair of equations

f1^T x = 0

and

f2^T x = 0,

rewritten as

(f1^T; f2^T) x = (0; 0),

has n − 2 independent solutions if f1 and f2 are independent, etc. We will see that the solutions to the first equation may be interpreted as the set of vectors perpendicular to f, the solutions to the second as the set of vectors perpendicular to f1 and f2, etc.
We can sum up the facts about the problem Ax = b. Even though all the foregoing is phrased in terms of matrix operators and column vectors the conclusions hold as well for all other linear operator problems so we will state them generally:

The problem Ax = b has a solution iff b ∈ Im A. If b ∈ Im A and x0 satisfies Ax = b then so also does x0 + y for any y ∈ Ker A. And all solutions can be so written, for if Ax1 = b then y1 = x1 − x0 ∈ Ker A and x1 = x0 + y1. If x = 0 is the only solution of Ax = 0, then if b ∈ Im A, the solution to Ax = b is unique.

For a matrix of n columns and rank r the general solution of Ax = b depends on n − r constants and is x0 + c1 x1 + c2 x2 + · · · + c_{n−r} x_{n−r} where x0 is any particular solution and x1, x2, ..., x_{n−r} is a fundamental system of solutions of Ax = 0. In this the set of fundamental solutions, viz., x1, x2, ..., x_{n−r}, may be replaced by any basis for Ker A.
Working out the following problem (this problem is on page 71 in Shilov’s “Linear Algebra”)
will help the reader get all this straightened out. The problem is to determine the solution to a
system of four equations in five unknowns:
x1 + x2 + x3 + x4 + x5 = 7

etc. Carrying out row operations and keeping the two independent equations, the system can be put in the form

(1; 3) x1 + (1; 2) x2 = (7; −2) − (1; 1) x3 − (1; 1) x4 + (1; −3) x5

From here a particular solution and the fundamental system of solutions to the homogeneous equations can be obtained easily. A particular solution is obtained by setting x3, x4 and x5 to zero and then the fundamental system is obtained by dropping (7; −2) and first setting x3 = 1, x4 = 0, x5 = 0, then setting x3 = 0, x4 = 1, x5 = 0, etc.
To illustrate that what we have been doing has more significance than what its face value would suggest define pij, a polynomial differential operator, by

pij = aij + bij d/dt + cij d²/dt² + · · ·
and suppose that we have n differential equations which determine n functions u1(t), u2(t), ..., un(t), via

(p11 p12 · · · p1n; p21 p22 · · · p2n; ...; pn1 pn2 · · · pnn) (u1; u2; ...; un) = (b1(t); b2(t); ...; bn(t))

Now the differential operators pij can be added and multiplied as if they were complex numbers and so, denoting by Pij the cofactor of pij and by D the determinant of the operator matrix, we can write

Σ_i pij Pij = D, j = 1, 2, ..., n

and

Σ_i pik Pij = 0, k, j = 1, 2, ..., n, k ≠ j

or in better notation

Pj^T pk = D δjk

Then writing P u = b as

p1 u1 + p2 u2 + · · · + pn un = b

and operating on both sides with Pj^T we obtain

D u1 = P1^T b

D u2 = P2^T b

etc.

The result is that we have turned n differential equations in n unknowns, of order equal to the highest order among the pij, into one differential equation in one unknown of order equal to the order of D. Only the right hand sides of the equations determining u1, u2, ... differ. A small symbolic sketch of this elimination appears below.
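The following sketch (not from the lectures; Python with sympy, treating d/dt as a commuting symbol s, the 2 × 2 system being hypothetical) carries out this elimination, forming D = det P and the right hand sides Pj^T b via the adjugate:

    import sympy as sp

    s = sp.symbols('s')            # stands for d/dt; the p_ij commute, so a symbol suffices
    b1, b2 = sp.symbols('b1 b2')

    # a hypothetical system: (s**2 + 1) u1 + s u2 = b1,  s u1 + (s + 2) u2 = b2
    P = sp.Matrix([[s**2 + 1, s],
                   [s,        s + 2]])
    b = sp.Matrix([b1, b2])

    D = sp.expand(P.det())               # the operator multiplying each unknown alone
    rhs = sp.expand(P.adjugate() * b)    # the P_j^T b, since (adj P) P = D I

    print(D)          # s**3 + s**2 + s + 2, i.e. u1''' + u1'' + u1' + 2 u1 = ...
    print(rhs[0])     # (s + 2) b1 - s b2, the right hand side of D u1 = P_1^T b
    print(rhs[1])     # -s b1 + (s**2 + 1) b2, the right hand side of D u2 = P_2^T b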
3.5 Example

Let y satisfy

d²y/dx² + λy = f(x)

Add to this

y = 0 at x = 0, 1

If λ is not one of the values π², 4π², 9π², etc., the homogeneous problem

d²y/dx² + λy = 0, y = 0 at x = 0, 1

has only the solution y = 0, whereupon our problem has a solution for all f's and it is unique. If λ is one of the values π², 4π², 9π², etc., then the homogeneous problem has the solution

y = sin √λ x
and its rate by ξ̇e. Here νe specifies a reaction not a species and νse is a net stoichiometric coefficient as a species may be written more than one time in an elementary reaction. Writing

νe = Σ_{r=1}^R Cre μr

the rate of production of the species, Σ_{e=1}^E νe ξ̇e, can be written in terms of the rates of the apparent reactions as

Σ_{e=1}^E ξ̇e Σ_{r=1}^R Cre μr = Σ_{r=1}^R μr η̇r

where

η̇r = Σ_{e=1}^E Cre ξ̇e
This last equation is the rule for writing rate laws for apparent reactions in terms of rate laws
for elementary reactions. As an example take the following set of elementary reactions
O3 + O3 ⇄ O1 + O2 + O3
O3 + O1 ⇄ O1 + O1 + O2
O3 + O2 ⇄ O1 + O2 + O2
O3 + O1 ⇄ O2 + O2
O2 + O1 ⇄ O1 + O1 + O1
O2 + O3 ⇄ O1 + O1 + O3
and the set of apparent reactions

O3 ⇄ 3O1

2O3 ⇄ 3O2
1. The iteration

xi+1 = xi (2 − a xi)

converges, for a suitable first guess, to the root of

1 − 1/(ax) = 0

i.e., to 1/a. Because division is not required to determine the sequence of approximations, we can try to use this iteration formula in matrix inversion. To see why it might work suppose that B is an estimate of A^{−1}, differing from it by a small amount ∆, so that AB and BA are close to I. Then we can write

A (B + ∆) = I

which leads to

A∆ = I − AB

and then to

BA∆ = B − BAB

Replacing BA by I we get ∆ ≈ B − BAB, whence

B + ∆ = 2B − BAB
Let A be the n × n Hilbert matrix where n = 5. Find an approximate inverse and then improve it by carrying out this iteration; a numerical sketch follows.
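A minimal sketch of that experiment (not from the lectures; Python with numpy, with a deliberately crude but safe starting guess) runs the iteration X ← X(2I − AX) on the 5 × 5 Hilbert matrix:

    import numpy as np

    n = 5
    A = np.array([[1.0 / (i + j + 1) for j in range(n)] for i in range(n)])  # Hilbert

    # a crude starting guess that guarantees convergence: X0 = A^T / (||A||_1 ||A||_inf)
    X = A.T / (np.linalg.norm(A, 1) * np.linalg.norm(A, np.inf))

    for k in range(60):
        X = X @ (2.0 * np.eye(n) - A @ X)   # X_{i+1} = X_i (2 I - A X_i), no divisions

    print(np.linalg.norm(np.eye(n) - A @ X))                       # residual is tiny
    print(np.allclose(X, np.linalg.inv(A), rtol=1e-6, atol=1e-6))  # True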
2. Here is another way to think about this. Suppose we can find approximations to the solution of Ax = b, possibly by using an elimination procedure corrupted by round-off errors. Solving this way for each column of an approximate inverse, denote the result by X0 and its residual by R0 = I − AX0. Solving A∆0 = R0 the same approximate way, i.e., taking ∆0 = X0 R0, and writing

R1 = R0 − A∆0

we can estimate R1 as

R0 − AX0 R0 = R0 − (I − R0) R0 = R0²

Hence the improved estimate

X1 = X0 + ∆0

is approximately 2X0 − X0 AX0, the formula of the first problem again.
3. To find the inverse of (A b; c^T d) in terms of A^{−1}, find (W x; y^T z) so that

(A b; c^T d) (W x; y^T z) = (I 0; 0^T 1)

The first column requires

AW + b y^T = I

and

c^T W + d y^T = 0^T

whence

W + A^{−1} b y^T = A^{−1}

and

y^T = c^T A^{−1} / (−d + c^T A^{−1} b)

so that

W = A^{−1} − A^{−1} b c^T A^{−1} / (−d + c^T A^{−1} b)

Indeed (A b; c^T d) has an inverse iff −d + c^T A^{−1} b ≠ 0. In the same way find x and z.
4. To see what happens in solving Ax = b when A is nearly singular write the LU decomposition of A as

A = LU = (L1 0; ℓ^T 1) (U1 u; 0^T ε)

where here 1's lie on the diagonal of L instead of on the diagonal of U and where det A = ε det U1. Requiring 1's on the diagonal of L instead of on the diagonal of U introduces no new idea; but requiring the small quantity ε to be in the lower right hand corner of U may require interchanging some columns and/or rows of A.

Then write Ax = b as

(L1 U1 , L1 u; ℓ^T U1 , ℓ^T u + ε) (x1; x) = (b1; b)

whence

L1 U1 x1 + L1 u x = b1

and

ℓ^T U1 x1 + (ℓ^T u + ε) x = b

The first gives

x1 + U1^{−1} u x = (L1 U1)^{−1} b1

and then the second gives

x = (b − ℓ^T L1^{−1} b1) / ε

whence

(x1; x) = ((L1 U1)^{−1} b1; 0) + (−U1^{−1} u; 1) (b − ℓ^T L1^{−1} b1) / ε

This formula is the main result of this problem. It tells us that the closer ε is to zero the better job we must do in the determination of b − ℓ^T L1^{−1} b1. Write this formula as

x = ((L1 U1)^{−1} b1; 0) + (1/ε) (ψ^T b) φ

where φ = (U1^{−1} u; −1) and ψ = (L1^{−T} ℓ; −1). Observe that if ψ^T b = 0 then x is independent of ε. Show that this obtains when b ∈ Im A(ε = 0) by observing that A(ε = 0) φ = 0 and A^T(ε = 0) ψ = 0.
Retaining the fractions, find the inverse. Round to four decimal places, write the decimals as
fractions and find the inverse.
If

X = v1^{p1} v2^{p2} · · · vn^{pn}

and [v1] = M^{a1} L^{b1} T^{c1}, etc., then

[X] = (M^{a1} L^{b1} T^{c1})^{p1} (M^{a2} L^{b2} T^{c2})^{p2} · · · (M^{an} L^{bn} T^{cn})^{pn}

For example

[v] = M^0 L^1 T^{−1}
[ℓ] = M^0 L^1 T^0
[d] = M^0 L^1 T^0
[µ] = M^1 L^{−1} T^{−1}
[ρ] = M^1 L^{−3} T^0
[σ] = M^1 L^0 T^{−2}
[g] = M^0 L^1 T^{−2}
Suppose

[P1] = M^{α1} L^{β1} T^{γ1}

etc., and write the definition of dimensional independence in terms of the rank of the matrix

(α1 α2 · · · αn; β1 β2 · · · βn; γ1 γ2 · · · γn)
8. The dimensions of P are independent of the dimensions of P1 and P2 if there are no solutions, p and q, to

(α; β; γ) = (α1 α2; β1 β2; γ1 γ2) (p; q)

where

[P] = M^α L^β T^γ

etc. Are the dimensions of pressure independent of the dimensions of viscosity and velocity?
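For the last question, a small sketch (not from the lectures; Python with numpy, using the exponents of pressure M L^{−1} T^{−2}, viscosity M L^{−1} T^{−1} and velocity L T^{−1}) tests whether the 3 × 2 system has a solution:

    import numpy as np

    # columns: exponents of (M, L, T) for viscosity mu and velocity v
    B = np.array([[1.0,  0.0],
                  [-1.0, 1.0],
                  [-1.0, -1.0]])
    target = np.array([1.0, -1.0, -2.0])    # pressure: M^1 L^-1 T^-2

    pq, *_ = np.linalg.lstsq(B, target, rcond=None)
    print(pq)                                # the best (p, q)
    print(np.allclose(B @ pq, target))       # False: no exact solution, so the
                                             # dimensions of pressure are independent
                                             # of those of viscosity and velocity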
9. The differential equations determining the growth rate of a small disturbance superimposed on the base state of a rotating fluid layer under an adverse temperature gradient are

( D² − a² − Pr σ , 0 , (β/κ) d² ;
0 , D² − a² − σ , (2Ω/ν) d D ;
−(gα/ν) d a² , −(2Ω/ν) d³ D , (D² − a²)(D² − a² − σ) ) (Θ; Z; W) = (0; 0; 0)

where D denotes d/dz. Write this as a single differential equation in Θ or Z or W.
Lecture 4
Inner Products
An inner product on C^n is a function that assigns a complex number to every pair of vectors in C^n. The complex number assigned to x and y is denoted ⟨x, y⟩ and is required to satisfy the following conditions

⟨x, y⟩ = the complex conjugate of ⟨y, x⟩

⟨x, c1 y1 + c2 y2⟩ = c1 ⟨x, y1⟩ + c2 ⟨x, y2⟩

and

⟨x, x⟩ > 0 unless x = 0

Every inner product on C^n can be written

⟨x, y⟩ = x̄^T G y

where G is an n × n matrix satisfying Ḡ^T = G (Hermitian) and x̄^T G x > 0 (positive definite) for all x ≠ 0. Then to each Hermitian positive definite matrix G there corresponds an inner product on C^n. The simplest inner product, which we call the plain vanilla inner product, corresponds to G = I and therein ⟨x, y⟩ = x̄^T y = Σ_{i=1}^n x̄i yi
Defining an inner product makes it easy to use the idea of biorthogonal sets of vectors and we do this at every opportunity as it greatly simplifies our work. We can also formulate a solvability condition that is not specific to matrix problems, unlike the rank condition presented earlier. To do this in C^n let A be an n × n matrix. Then A maps vectors x ∈ C^n into vectors Ax ∈ Im A, a subspace of C^n of dimension r.
4.2 Adjoints
What we seek is a new test to tell us when an arbitrary vector belongs to Im A. At first all we have to work with is Ker A, a subspace of C^n having the interesting dimension n − r but otherwise bearing no special relation to Im A. Indeed Ker A may be wholly inside Im A or wholly outside Im A or . . .. What we do then is this: we define a matrix whose kernel helps us identify Im A. To do this we fix an inner product on C^n. Then in this inner product A acquires a companion, denoted A∗, and called the adjoint of A, by the requirement that

⟨A∗x, y⟩ = ⟨x, Ay⟩

for all x and y in C^n. Writing this out using ⟨x, y⟩ = x̄^T G y we can determine a formula for A∗. It is

A∗ = G^{−1} Ā^T G

and this shows how A∗ depends on the inner product in which we are working. If G = I then

A∗ = Ā^T
Now the rank of A∗ is equal to the rank of A. Indeed it is easy to see that the rank of Ā^T is equal to the rank of A; it is less easy to see, but no less true, that the rank of G^{−1} Ā^T G is also equal to the rank of A. Bezout's theorem in Gantmacher's book "Theory of Matrices" can be used to produce an algebraic proof of this. The reader can produce a geometric proof by showing that the dimensions of the subspaces Im G^{−1} Ā^T G and Im Ā^T are equal. In doing this it is useful to observe that, because G is not singular, if the set of vectors x1, x2, ... is independent then so also is the set of vectors G^{−1} x1, G^{−1} x2, ....
Now we need to see what these subspaces tell us about the subspaces Im A and Ker A. To do this we first let S be an arbitrary subspace of C^n. Then if we fix an inner product on C^n, S acquires a companion, denoted S^⊥ and called the orthogonal complement of S, by the requirement that S^⊥ be the set of all vectors in C^n perpendicular to each vector in S. It is a subspace of C^n and it depends on the inner product being used. We write the definition of S^⊥ as

S^⊥ = { y : ⟨y, x⟩ = 0 ∀x ∈ S }

The readers can verify that S^⊥⊥ = S; they can also verify that if dim S = r then dim S^⊥ = n − r by using the fact that an r × n system of homogeneous equations of rank r has n − r independent solutions. Then x ∈ S iff x ⊥ S^⊥.
In terms of A∗ and Ker A∗ we can formulate our main result; it is

Im A = ( Ker A∗)^⊥

[Figure: C^n pictured with the subspaces Im A and Ker A∗ meeting only in 0.]
The solvability condition for the problem Ax = b is the requirement that b ∈ Im A. Our main result tells us that this is also the requirement that b ∈ (Ker A∗)^⊥. To use this we fix an inner product on C^n and obtain A∗. We then determine Ker A∗ by finding n − r independent solutions to A∗y = 0 and, using these, decide whether or not b ⊥ Ker A∗. Hence the solvability condition for the problem Ax = b is this: either ⟨b, y⟩ = 0 for all y such that A∗y = 0, whence Ax = b is solvable, or ⟨b, y⟩ ≠ 0 for some y such that A∗y = 0, whence Ax = b is not solvable.
The proof that Ker A∗ = (Im A)^⊥ is simple. It is just this: If y ∈ Ker A∗ and x ∈ Im A then x = Az and ⟨y, x⟩ = ⟨y, Az⟩ = ⟨A∗y, z⟩ = 0 whence Ker A∗ ⊂ (Im A)^⊥. If y ∈ (Im A)^⊥ and z ∈ C^n then Az ∈ Im A and ⟨y, Az⟩ = 0 = ⟨A∗y, z⟩; hence setting z = A∗y we have A∗y = 0 whence (Im A)^⊥ ⊂ Ker A∗.
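In the plain vanilla inner product A∗ is the conjugate transpose, and the test b ⊥ Ker A∗ is easy to carry out numerically. A sketch (not from the lectures; Python with numpy, on a hypothetical rank-deficient A):

    import numpy as np

    def kernel(M, tol=1e-10):
        """Orthonormal basis for Ker M, via the SVD."""
        U, s, Vh = np.linalg.svd(M)
        return Vh[int(np.sum(s > tol)):].conj().T

    A = np.array([[1.0, 2.0],
                  [2.0, 4.0],
                  [3.0, 6.0]])          # rank 1, Im A = span{(1, 2, 3)}

    Y = kernel(A.conj().T)              # Ker A* = (Im A)-perp, here 2 dimensional

    b_good = np.array([2.0, 4.0, 6.0])  # in Im A
    b_bad = np.array([1.0, 0.0, 0.0])   # not in Im A

    print(np.allclose(Y.conj().T @ b_good, 0.0))  # True: <y, b> = 0 for all y, solvable
    print(np.allclose(Y.conj().T @ b_bad, 0.0))   # False: not solvable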
Matrices satisfying A∗ = A in some inner product are called self-adjoint in that inner product; they turn up throughout these lectures (beginning on p. 85) and are important ingredients in solving linear operator problems. Given a matrix A, then, we may want to determine if there is an inner product in which it is self-adjoint. This amounts to determining a positive definite, Hermitian matrix G such that A∗ = A. The condition on G is that A = G^{−1} Ā^T G or GA = Ā^T G, viz., GA must be Hermitian. Matrices A for which there is a solution to this equation are called symmetrizable. They are self-adjoint in the corresponding inner product.
To this point our operators have been n × n matrices mapping C^n into itself. We conclude this lecture by producing the solvability condition for the problem Ax = b, x ∈ C^n, b ∈ C^m. Then we introduce projection operators. Projections are basic to what are called generalized inverses which can be used to construct x such that Ax is as close as possible to b when b ∉ Im A.

Let ⟨ , ⟩_m and ⟨ , ⟩_n denote inner products on C^m and C^n; then A, an m × n matrix mapping C^n into C^m, has an adjoint A∗, an n × m matrix mapping C^m into C^n. It is defined by requiring

⟨A∗y, x⟩_n = ⟨y, Ax⟩_m

whence

A∗ = G_n^{−1} Ā^T G_m
Denote by f and g two real valued functions, vanishing at x = 0 and x = 1 and belonging to the set of smooth functions defined on the interval [0, 1]. Then

∫_0^1 f(x) g(x) dx = ⟨f, g⟩

defines an inner product on this set, and in this inner product the operator

L = d²/dx²

acting on these functions satisfies ⟨Lf, g⟩ = ⟨f, Lg⟩, i.e., it is its own adjoint. To decide then whether

d²u/dx² + π²u = f(x)

where u = 0 at x = 0, 1, has a solution, we observe that Ker (L + π²) is spanned by sin πx. Hence our problem is solvable for all right hand sides such that

∫_0^1 sin πx f(x) dx = 0

i.e., for every f orthogonal to sin πx, e.g., for every function odd about 1/2.
4.6 Projections
Let S be any assigned subspace of C^m and let y be any fixed vector of C^m. Then we can ask: what vector in S is closest to y? The answer is given by the projection theorem. Introduce the operator P, the projection of C^m onto S, by the requirements

P = I on S

and

P = 0 on S^⊥

Then as I − P satisfies

I − P = 0 on S

and

I − P = I on S^⊥

it is the projection of C^m onto S^⊥. Because (I − P) y lies in S^⊥ and P z lies in S we have

0 = ⟨(I − P) y, P z⟩ = ⟨y, (I − P)∗ P z⟩ ∀ y, z ∈ C^m

whence P = P∗P; it follows that P∗ = P and then P² = P.
The projection theorem tells us that P y is the vector in S that is closest to y. This is established
later. The difference y − P y = (I − P ) y lies in S ⊥ and is, therefore, perpendicular to every vector
in S. This is the error in approximating y by P y.
More precisely, the theorem asserts that to each vector b ∈ C^m there corresponds exactly one vector a such that

a ∈ S

and

b − a ∈ S^⊥

To construct P we let f1, f2, ..., fr be a basis for S and write

F = (f1 f2 ... fr)

Then P y lies in S and can be expanded as

P y = a1 f1 + a2 f2 + · · · + ar fr = F a

where a denotes the column (a1, a2, ..., ar)^T. The coefficients are determined by the requirement that (I − P) y be perpendicular to S:

⟨fi, (I − P) y⟩ = 0, i = 1, 2, ..., r

This is

f̄i^T G_m (y − F a) = 0, i = 1, 2, ..., r

and denoting

(f̄1^T; f̄2^T; ...; f̄r^T) G_m = F̄^T G_m

by F∗ we have

F∗ y − F∗ F a = 0

whence a = (F∗F)^{−1} F∗ y and

P y = F a = F (F∗F)^{−1} F∗ y

Because this must be true for all y ∈ C^m we have for P the formula

P = F (F∗F)^{−1} F∗
The use of the notation F∗ for F̄^T G_m is not too far fetched. If A maps C^n into C^m and inner products ⟨ , ⟩_m and ⟨ , ⟩_n are defined on C^m and C^n via positive definite, Hermitian matrices G_m and G_n, then A∗ mapping C^m into C^n is given by

A∗ = G_n^{−1} Ā^T G_m

Here F maps C^r into C^m and, using the plain vanilla inner product on C^r, F∗ = F̄^T G_m.
We can write any vector y ∈ C^m as the sum of a vector in S and another vector in S^⊥. The expansion is unique; it is

y = P y + (I − P) y

whence

‖y‖² = ⟨y, y⟩ = ⟨P y + (I − P) y, P y + (I − P) y⟩ = ⟨P y, P y⟩ + ⟨(I − P) y, (I − P) y⟩ = ‖P y‖² + ‖(I − P) y‖²

where ‖y‖ = ⟨y, y⟩^{1/2} is the length of the vector y.
We can also write any vector y − s ∈ C^m, where y ∈ C^m and s ∈ S, as the sum of a vector in S and another in S^⊥. The expansion is

y − s = (−s + P y) + (y − P y)

whence

‖y − s‖² = ‖y − P y‖² + ‖s − P y‖²

and

‖y − s‖² ≥ ‖y − P y‖²

and hence P y lies at least as close to y as does any other vector in S. The vector P y is then the best approximation to y in S. The error in this approximation, y − P y, lies in S^⊥ and is perpendicular to all the vectors in S. This establishes the projection theorem, save for the question of uniqueness.
We can investigate this best approximation problem in another way which shows how it is the same as the least squares problem.

Again let y be any vector in C^m and S be a subspace of C^m. Then if f1, f2, ..., fr is a set of r independent vectors spanning S we can write any vector s ∈ S as

s = a1 f1 + a2 f2 + · · · + ar fr = F a

where

F = (f1 f2 ... fr)

and

a = (a1, a2, ..., ar)^T

To make the squared distance

⟨y − Σ ai fi, y − Σ aj fj⟩

least, we look for its stationary points by setting its derivatives with respect to ak, k = 1, ..., r, to zero. Writing

⟨y − Σ ai fi, y − Σ aj fj⟩ = ⟨y − F a, y − F a⟩ = (ȳ^T − ā^T F̄^T) G (y − F a)
we use

a = Re a + i Im a

and

ā^T = Re a^T − i Im a^T

with

∂ Re a / ∂ Re ai = ei, ∂ Re a^T / ∂ Re ai = ei^T, ∂ Im a / ∂ Re ai = 0, etc.

Setting the derivative with respect to Re ai to zero gives

(ȳ^T − ā^T F̄^T) G (−F ei) + (−ei^T F̄^T) G (y − F a) = 0, i = 1, 2, ..., r

and hence

Re { F̄^T G (y − F a) } = 0

Setting the derivative with respect to Im ai to zero gives

(ȳ^T − ā^T F̄^T) G (−iF ei) + i ei^T F̄^T G (y − F a) = 0, i = 1, 2, ..., r

and hence

Im { F̄^T G (y − F a) } = 0

Together these require

F̄^T G y = F̄^T G F a

whence

a = (F∗F)^{−1} F∗ y

This formula solves the least squares problem. The solution to the best approximation problem is F a where

F a = F (F∗F)^{−1} F∗ y = P y
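These formulas can be exercised directly. A sketch (not from the lectures; Python with numpy, in the plain vanilla inner product G = I, so F∗ is the conjugate transpose, and the data are hypothetical) builds P = F(F∗F)^{−1}F∗ and checks it against a library least squares solve:

    import numpy as np

    F = np.array([[1.0, 1.0],
                  [1.0, 0.0],
                  [0.0, 1.0],
                  [0.0, 0.0]])          # two independent columns spanning S in C^4
    y = np.array([1.0, 2.0, 3.0, 4.0])

    Fs = F.conj().T                      # F* when G_m = I
    P = F @ np.linalg.inv(Fs @ F) @ Fs   # the projection onto S

    a = np.linalg.inv(Fs @ F) @ Fs @ y   # a = (F*F)^{-1} F* y
    a_lib, *_ = np.linalg.lstsq(F, y, rcond=None)
    print(np.allclose(a, a_lib))         # True: the least squares solution

    print(np.allclose(P @ P, P))                   # True: P^2 = P
    print(np.allclose(Fs @ (y - P @ y), 0.0))      # True: the error is perpendicular to S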
Now suppose the problem

Ax = y

where A is an m × n matrix and y ∈ C^m, does not have a solution, i.e., y ∉ Im A, and suppose we measure the size of a vector via

‖y‖² = ⟨y, y⟩

Then the best we can do is to determine x so that Ax is the vector in Im A closest to y, viz., to solve

Ax = P y

where P, the projection of C^m onto Im A, can be constructed using the basis columns of A. The error y − Ax = y − P y = (I − P) y is then perpendicular to Im A.
The solutions of Ax = P y can be written x0 + ξ where x0 is any one of them and ξ ∈ Ker A. To make the solution to the best approximation problem unique we make x0 + ξ lie as close to 0 as possible. To do this we let I − Q be the projection of C^n onto Ker A, Q being the projection of C^n onto (Ker A)^⊥. Then the vector (I − Q)(x0 + ξ) is the closest vector in Ker A to x0 + ξ and their difference, Q(x0 + ξ) = Qx0, is independent of ξ. Hence, every solution is the same distance from Ker A as every other, this distance being the length of Qx0. But Qx0 = x0 − (I − Q)x0 ∈ (Ker A)^⊥ is also a solution, due to (I − Q)x0 ∈ Ker A, and, because its projection on Ker A is 0, it must be the solution closest to 0. So by requiring Ax to be the best approximation to y and x to be as short as possible we get a unique solution to our best approximation problem, viz., if x0 is any vector satisfying

Ax0 = P y

then our solution is

x = Qx0

Indeed because Qx0 is independent of x0, i.e., Q(x0 + ξ) = Qx0 ∀ξ ∈ Ker A, it is the unique solution to our problem.
It remains only to express x in terms of A and y and thereby define a generalized inverse, denoted A^I, such that A^I, mapping C^m into C^n, satisfies the requirement that to any y ∈ C^m, x = A^I y is the shortest vector in C^n such that Ax is the best approximation in Im A to y.

To do this write

A = F R

where R is an r × n matrix whose columns are the coefficients in the expansion of the columns of A in the basis f1, f2, ..., fr for Im A. Hence R is unique and of rank r. Because F∗ is defined to be F̄^T G_m, we take R∗ to be the n × r matrix G_n^{−1} R̄^T and find that

A∗ = G_n^{−1} Ā^T G_m = G_n^{−1} R̄^T F̄^T G_m = R∗ F∗

where the r independent columns of R∗ span Im A∗. Then P, where P = F (F∗F)^{−1} F∗, is the projection of C^m onto Im A and Q, where Q = R∗ (R∗∗ R∗)^{−1} R∗∗, is the projection of C^n onto Im A∗. And observing that R∗∗ = R, viz.,

R∗∗ = (the conjugate transpose of G_n^{−1} R̄^T) G_n = R G_n^{−1} G_n = R

we can write Q as

Q = R∗ (R R∗)^{−1} R

The reader can verify that

P A = A, A Q = A

and

Q A∗ = A∗, A∗ P = A∗

Then defining

A^I = R∗ (R R∗)^{−1} (F∗F)^{−1} F∗

we get

Q A^I = A^I, A^I P = A^I

and

A A^I = P

A^I A = Q

A A^I A = A

and

A^I A A^I = A^I

These formulas tell us that for any y ∈ C^m, we have A A^I y = P y and so A^I y is a solution of

Ax = P y

Indeed, as Q A^I y = A^I y, A^I y is the shortest such solution. Hence A^I is the generalized inverse of A, i.e., A^I y is the shortest vector such that A A^I y is the best approximation to y.
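For G_m = I and G_n = I this construction reproduces the familiar Moore–Penrose pseudoinverse. A sketch (not from the lectures; Python with numpy, with F taken to be the basis columns of a hypothetical rank 2 matrix and R the coefficients of all its columns in that basis):

    import numpy as np

    A = np.array([[1.0, 2.0, 3.0],
                  [2.0, 4.0, 6.0],
                  [1.0, 0.0, 1.0]])     # rank 2; a3 = a1 + a2

    F = A[:, :2]                         # basis columns a1, a2
    R = np.array([[1.0, 0.0, 1.0],
                  [0.0, 1.0, 1.0]])      # coefficients: a1 = f1, a2 = f2, a3 = f1 + f2
    assert np.allclose(F @ R, A)

    Fs, Rs = F.conj().T, R.conj().T      # adjoints when G_m = G_n = I
    P = F @ np.linalg.inv(Fs @ F) @ Fs   # projection of C^m onto Im A
    Q = Rs @ np.linalg.inv(R @ Rs) @ R   # projection of C^n onto Im A*

    AI = Rs @ np.linalg.inv(R @ Rs) @ np.linalg.inv(Fs @ F) @ Fs

    print(np.allclose(A @ AI, P), np.allclose(AI @ A, Q))             # True True
    print(np.allclose(A @ AI @ A, A), np.allclose(AI @ A @ AI, AI))   # True True
    print(np.allclose(AI, np.linalg.pinv(A)))    # True: it is the pseudoinverse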
1. Let x1, x2, ..., xn be a basis and y1, y2, ..., yn its biorthogonal set. The elements of a matrix A in the basis x1, x2, ..., xn are defined by

A xj = Σ_i Aij xi

Show that

Aij = ⟨yi, A xj⟩

Then, using

A x = Σ_i xi ⟨yi, A x⟩

show that

(AB)ij = Σ_k ⟨yi, A xk⟩ ⟨yk, B xj⟩ = Σ_k Aik Bkj
2. Let A : C^n → C^m and let G_n and G_m be the weighting factors in the corresponding inner products. Then A∗A : C^n → C^n and AA∗ : C^m → C^m. Show that (A∗A)∗ = A∗A and (AA∗)∗ = AA∗. Be careful as there are three different ∗'s here.
3. The problem Ax = y, where

A = (1 1 1 1; 1 −1 1 1; 1 1 −1 1; 1 1 1 −1)

and

y = (1, 1, 1, 1)^T

has a unique solution. It is x = (1, 0, 0, 0)^T. Determine c1 and c2 so that A{c1 x1 + c2 x2} is the best approximation to y where

x1 = (1, 1, 0, 0)^T and x2 = (1, 0, 1, 0)^T
Lecture 5

Eigenvectors
Eigenvectors are defined for operators that map vector spaces into themselves. Operators that
do this can act repetitively, and hence their squares, cubes, etc., are defined. Their action can be
understood in terms of their invariant subspaces and these may be built up out of their eigenvectors.
Any vector x ≠ 0 in C^n satisfying

Ax = λx

for some complex number λ is called an eigenvector of A and λ is called the corresponding eigenvalue. Each eigenvector of A spans a one dimensional A-invariant subspace of C^n and each vector in this span is also an eigenvector of A corresponding to λ. Eigenvectors are not unique, their lengths being arbitrary.
To find the eigenvalues and eigenvectors we write this as

(A − λI) x = 0

and observe that solutions other than x = 0 can be found only for certain values of λ. To find the
eigenvectors we must first find the eigenvalues, viz., the values of λ such that the solution space
of our homogeneous problem contains vectors other than 0, i.e., such that dim Ker (A − λI) > 0
and therefore that dim Im (A − λI) < n. This is satisfied iff the rank of A − λI is less than n
which in turn is satisfied iff
det (A − λI) = 0
Each value of λ satisfying this equation is an eigenvalue of A and the corresponding eigen-
vectors make up the solution space Ker (A − λI), called the eigenspace corresponding to the
eigenvalue λ. The number of independent eigenvectors corresponding to λ is dim Ker (A − λI)
and this is n less the rank of (A − λI). It is called the geometric multiplicity of the eigenvalue λ.
The eigenvalues are the roots of the characteristic polynomial

∆(λ) = det (λI − A) = λ^n − ∆1 λ^{n−1} + ∆2 λ^{n−2} − · · · + (−1)^n ∆n

where the coefficient ∆i is the sum of the i × i principal minors of A and where a minor is a principal minor if the elements on its diagonal are also on the diagonal of A. The coefficients in ∆(λ) can be written in terms of the eigenvalues. The coefficient ∆i is the sum of the products of the eigenvalues taken i at a time; in particular

∆1 = tr A = λ1 + λ2 + · · · + λn

and

∆n = det A = λ1 λ2 · · · λn
The polynomial ∆ (λ) has n roots in C, counting each root according to its multiplicity. To be
definite we call its distinct roots the eigenvalues of A and we denote them
λ 1 , λ2 , · · · , λd
The geometric multiplicity of each eigenvalue, i.e., the greatest number of independent eigenvectors corresponding to that eigenvalue, cannot exceed its algebraic multiplicity. Indeed if we let n1 = dim Ker (A − λ1 I) and determine ∆(λ) in a basis for C^n whose first n1 vectors span Ker (A − λ1 I), we see that ∆(λ) contains the factor (λ − λ1)^{n1} whence n1 ≤ m1, the algebraic multiplicity of λ1.
We introduce the eigenvectors of A in the hope of constructing a basis for C n which will
simplify certain calculations that we plan to make. But we will not always be able to find an
eigenvector basis for C n . To make the distinction we need to make, we call an eigenvalue problem
plain vanilla if it leads to n algebraically simple eigenvalues. Then the geometric multiplicity of
each eigenvalue is the same as its algebraic multiplicity and this value is one. In fact we will
go on and call an eigenvalue problem plain vanilla whenever the geometric multiplicity of each
eigenvalue is also its algebraic multiplicity.
There are eigenvalue problems that are not plain vanilla. But this is the exception, not the rule.
To see why this is so and why it is important, we look first at the eigenvalue problem in the simplest case, n = 2, where the characteristic polynomial is

λ² − ( tr A) λ + det A
and observe that the characteristic equation has either two distinct roots or one double root. The
matrix A then has either two eigenvalues each of algebraic multiplicity one or one eigenvalue of
algebraic multiplicity two. In the first instance we write
Ax1 = λ1 x1
and
Ax2 = λ2 x2
and derive the important fact that x1 and x2 are independent. Indeed c1 = 0 = c2 is the only
solution to c1 x1 + c2 x2 = 0 for if
c1 x1 + c2 x2 = 0
then
c1 λ1 x1 + c2 λ2 x2 = 0
and so
c2 (λ2 − λ1 ) x2 = 0
whence c2 = 0 and then c1 = 0, so {x1, x2} is a basis for C². Introducing an inner product ⟨x, y⟩ = x̄^T G y, we define the biorthogonal set y1, y2 by

⟨yi, xj⟩ = δij

The set of vectors y1, y2 is independent and likewise a basis for C²; indeed the matrix

(ȳ1^T G; ȳ2^T G)

is the inverse of the matrix (x1 x2).
We remind the reader that the idea of biorthogonal sets is a powerful idea. If any vector x is expanded as

x = c1 x1 + c2 x2

or as

x = d1 y1 + d2 y2

then

c1 = ⟨y1, x⟩, c2 = ⟨y2, x⟩

and

d1 = ⟨x1, x⟩, d2 = ⟨x2, x⟩
But there is even more. If we introduce A∗, the adjoint of A, in the same inner product used to define y1 and y2, we can expand A∗ yi as

A∗ yi = ⟨x1, A∗ yi⟩ y1 + ⟨x2, A∗ yi⟩ y2 = ⟨Ax1, yi⟩ y1 + ⟨Ax2, yi⟩ y2 = λ̄1 ⟨x1, yi⟩ y1 + λ̄2 ⟨x2, yi⟩ y2

and hence

A∗ y1 = λ̄1 y1

and

A∗ y2 = λ̄2 y2

This tells us that the eigenvalues of A and A∗ are complex conjugates { i.e., if det (λI − A) = 0 then det (λ̄I − A∗) = 0 where A∗ = G^{−1} Ā^T G } and that their eigenvectors form biorthogonal sets.
The first case is complete: when two eigenvalues of algebraic multiplicity one turn up, the
corresponding eigenvectors determine a basis for C 2 . If one eigenvalue of multiplicity two is
obtained this continues to be true if dim Ker (A − λ1 I) = 2 as then we can find two independent
eigenvectors, viz., any two independent vectors in C 2 , and write
Ax1 = λ1 x1
and
Ax2 = λ1 x2
and go on as before. But if dim Ker (A − λ1 I) = 1 a complication arises: we cannot find two
independent eigenvectors.
LECTURE 5. EIGENVECTORS 91
This corresponds to the rank of A − λ1 I having the value one instead of zero whence both
Im (A − λ1 I) and Ker (A − λ1 I) are one dimensional subspaces of C 2 and there is at most one
independent eigenvector. Denoting this x1 we have Ker (A − λ1 I) = [x1 ]. And we observe that
Ker (A − λ1 I) = Im (A − λ1 I) otherwise λ1 cannot be a double root. Indeed, as Im (A − λ1 I)
is one dimensional and A invariant, any vector in Im (A − λ1 I) is an eigenvector of A correspond-
ing to an eigenvalue other than λ1 unless Ker (A − λ1 I) = Im (A − λ1 I). This is established
again in section 5.2.
So, being short an eigenvector and observing that x1 ∈ Im (A − λ1 I) we seek a vector x2 satisfying

(A − λ1 I) x2 = x1

The set {x1, x2} is independent, for if

c1 x1 + c2 x2 = 0

then, multiplying by A,

c1 λ1 x1 + c2 {x1 + λ1 x2} = 0

or, subtracting λ1 times the first equation,

c2 x1 = 0

whence c2 = 0 and then c1 = 0.
This illustrates the main idea: when we cannot find enough eigenvectors to make up a basis for
our space we generalize the eigenvector problem in such a way that to each eigenvalue of algebraic
multiplicity m there corresponds m eigenvectors and generalized eigenvectors. The only new idea
required when n is greater than two is that an eigenvector may generate a chain of more than one
generalized eigenvector and there may be more than one chain corresponding to each eigenvalue.
In the case at hand we have

Ax1 = λ1 x1

and

Ax2 = x1 + λ1 x2

where {x1, x2} is independent and a basis for C². The corresponding biorthogonal set, {y1, y2}, is also independent and a basis for C². Making the calculation A∗ yi we find

A∗ y1 = λ̄1 y1 + y2

and

A∗ y2 = λ̄1 y2

The readers need to carry out this calculation by expanding A∗ yi in {y1, y2} to satisfy themselves that the eigenvalue problem for A∗ is required to generalize in just this way. The check on this is that the solvability condition for determining x2 is:

x1 ⊥ Ker [(A − λ1 I)∗] = Ker [A∗ − λ̄1 I] = [ y2 ]
To explain what can happen, let λ1 be an eigenvalue of A of algebraic multiplicity m1 and let the dimension of Ker (A − λ1 I) be n1 so that we have n1 independent eigenvectors. If n1 < m1 then Ker (A − λ1 I) and Im (A − λ1 I) intersect in at least one vector x1 and we take x1 to be one of our eigenvectors. Using this vector we can determine a vector y1 such that (A − λ1 I) y1 = x1 and x1 and y1 are the first two vectors in a chain. If y1 is not in Im (A − λ1 I) the chain terminates, otherwise we can determine a vector z1 such that (A − λ1 I) z1 = y1, etc. The vectors
x1, y1, z1, ... satisfy the equations

Ax1 = λ1 x1

Ay1 = x1 + λ1 y1

Az1 = y1 + λ1 z1

etc.

which now generalize the eigenvalue problem. As there may be more than one chain corresponding to the eigenvalue λ1, it is important in selecting a basis for Ker (A − λ1 I) to first span Ker (A − λ1 I) ∩ Im (A − λ1 I). The condition that a vector lie in Im (A − λ1 I) is that it be orthogonal to Ker [(A − λ1 I)∗].
We can illustrate the main idea in the case n = 2. Suppose λ1 is an eigenvalue of algebraic multiplicity 2 and geometric multiplicity 1. Then we have

Ker (A − λ1 I) = [ x1 ]

and

dim Im (A − λ1 I) = 1

whereupon

Im (A − λ1 I) = [ x ]

for some vector x. Because Im (A − λ1 I) is A invariant,

(A − λ1 I) x = c x

hence

Ax = (λ1 + c) x

and x is an eigenvector of A corresponding to the eigenvalue λ1 + c. We claim that

Im (A − λ1 I) = Ker (A − λ1 I)

Suppose not, due to the fact that x1 ∉ Im (A − λ1 I), Im (A − λ1 I) being A invariant. But as λ1 is a double root it must be a root of det (A_{n−1 n−1} − λ I_{n−1 n−1}) = 0 whence A must have a second eigenvector corresponding to λ1; it must lie in Im (A − λ1 I) and it must be independent of x1. But this is not so and we conclude that x1 ∈ Im (A − λ1 I). This is important because it is the solvability condition for the problem

(A − λ1 I) x = x1
So in place of a second eigenvector we obtain a generalized eigenvector x2 satisfying

(A − λ1 I) x2 = x1

Indeed to any particular solution x2 may be added any multiple of x1 but all such x2's are independent of x1. That is, if

c1 x1 + c2 x2 = 0

then

(A − λ1 I) (c1 x1 + c2 x2) = c2 x1 = 0

whence c2 = 0 and then c1 = 0. So we have

Ax1 = λ1 x1

and

Ax2 = x1 + λ1 x2

And x2 ∉ Im (A − λ1 I) otherwise there would be a vector x3 such that (A − λ1 I) x3 = x2 and λ1 would be a triple root of det (A − λI) = 0. As x1 ∈ Im (A − λ1 I) but x2 ∉ Im (A − λ1 I), x1, but not x2, is perpendicular to Ker (A∗ − λ̄1 I).
Indeed

[ y2 ] = Ker (A∗ − λ̄1 I) = [ Im (A − λ1 I) ]^⊥

and

A∗ y1 = λ̄1 y1 + y2

A∗ y2 = λ̄1 y2

[Figure: the vectors x1, x2, y1, y2 and the subspaces Ker (A − λ1 I) = [x1], Im (A − λ1 I) = [y2]^⊥, Ker (A∗ − λ̄1 I) = [y2], Im (A∗ − λ̄1 I) = [x1]^⊥.]
Our expectation when we solve an eigenvalue problem in C^n must be: either we will determine a set of n independent eigenvectors and hence a basis for C^n or we will not. A set of n independent eigenvectors is called a complete set. There are sufficient conditions for this; one is that the eigenvalues of A turn out to be simple roots of ∆(λ) = 0. This requires A to have n distinct eigenvalues. Another is that A∗ = A in some inner product. If this is so we can determine an eigenvector x1 in the usual way and then observe that as [x1] is A invariant so also is [x1]^⊥. Restricting A to this n − 1 dimensional subspace we can then start over and determine an eigenvector, x2, of the restriction of A to [x1]^⊥ in the usual way. This will be the second eigenvector of A and it will satisfy ⟨x1, x2⟩ = 0. If n > 2 we can continue this to determine a set of n mutually orthogonal eigenvectors, orthogonal in the inner product in which A∗ = A.
For n = 2 the eigenvalues are the roots of

λ² − ( tr A) λ + det A = 0

where tr A = a11 + a22 and det A = a11 a22 − a21 a12. This equation has a double root iff (tr A)² − 4 det A = 0; otherwise it has two simple roots. If a11, a12, a21 and a22 are real numbers then the double root is real and it corresponds to the one dimensional locus (tr A)² − 4 det A = 0 in the det A, tr A plane separating the region corresponding to two simple real roots and the region corresponding to two simple complex roots (which are complex conjugates). Two simple roots is generic, being realized almost everywhere in the det A, tr A plane; the alternative, a double root, turns up only on a set of measure zero. This continues to be true for n > 2. Our emphasis then is on the ordinary and simplest possibility; we take up exceptions by example. What we require at the outset is that A determine a basis for C^n made up of independent eigenvectors; we refer to this as a complete set of eigenvectors, and n simple eigenvalues is sufficient but not necessary for this.
Before going on we introduce a simple way to find all the eigenvectors lying in one dimensional eigenspaces. Let A be an n × n matrix. The corresponding eigenvalues are the roots of ∆(λ) = det (λI − A) = 0. Letting B(λ) = adj (λI − A), we see that the elements of B(λ) are polynomials of degree n − 1 in λ and so we can write

B(λ) = λ^{n−1} I − B1 λ^{n−2} + B2 λ^{n−3} − · · · + (−1)^{n−1} B_{n−1}

And, as

(λI − A) B(λ) = ∆(λ) I

we have

(λ1 I − A) B(λ1) = ∆(λ1) I = 0

and we see that corresponding to any eigenvalue, say λ1, where ∆(λ1) = 0, the non-vanishing
columns of B (λ1 ) are eigenvectors of A. Now B (λ1 ) is a matrix whose rank is either one or
zero depending on whether the rank of (λ1 I − A) is either n − 1 or less than n − 1. If the rank of
(λ1 I − A) is n−1 we have dim Ker (λ1 I − A) = 1 and then there is one independent eigenvector
and a candidate can be found among the columns of B (λ1 ). This is all that is required if λ1 is a
simple eigenvalue. To determine B1, B2, ..., B_{n−1} and hence B(λ), we can equate the coefficients of the powers of λ on the two sides of

(λI − A) (λ^{n−1} I − B1 λ^{n−2} + · · · + (−1)^{n−1} B_{n−1}) = (λ^n − ∆1 λ^{n−1} + ∆2 λ^{n−2} − · · ·) I

obtaining

B1 = ∆1 I − A

B2 = ∆2 I − A B1

etc.
In Gantmacher’s book, “Theory of Matrices,” a method is explained for determining the sequences
∆1 , ∆2 , . . . and B1 , B2 , . . . simultaneously.
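That simultaneous determination is the Faddeev–LeVerrier recursion. A sketch (not from the lectures; Python with numpy, with the signs arranged so that M_{k+1} = A M_k − c_k I and c_k = tr(A M_k)/k absorb the alternating signs above) computes the characteristic polynomial and extracts an eigenvector from the non-vanishing columns of B(λ1):

    import numpy as np

    def faddeev_leverrier(A):
        """Coefficients c_k with det(lambda I - A) = l^n - c1 l^{n-1} - ... - cn,
        and matrices M_1, ..., M_n with adj(lambda I - A) = sum_k l^{n-k} M_k."""
        n = A.shape[0]
        M = np.eye(n)
        Ms, cs = [M], []
        for k in range(1, n + 1):
            c = np.trace(A @ M) / k
            cs.append(c)
            M = A @ M - c * np.eye(n)
            Ms.append(M)
        assert np.allclose(Ms[-1], 0.0)   # M_{n+1} = 0: Cayley-Hamilton in disguise
        return cs, Ms[:-1]

    A = np.array([[2.0, 1.0, 0.0],
                  [0.0, 2.0, 0.0],
                  [0.0, 0.0, 3.0]])

    cs, Ms = faddeev_leverrier(A)
    print(np.poly(A))                     # numpy's characteristic coefficients
    print([1.0] + [-c for c in cs])       # the same numbers

    lam1, n = 3.0, A.shape[0]             # a simple eigenvalue of A
    B = sum(lam1 ** (n - 1 - k) * Ms[k] for k in range(n))  # B(lam1) = adj(lam1 I - A)
    v = B[:, np.argmax(np.linalg.norm(B, axis=0))]          # a non-vanishing column
    print(np.allclose(A @ v, lam1 * v))   # True: v is an eigenvector for lam1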
Henceforth we let n be arbitrary and assume, unless an exception is made, that we have a complete set of eigenvectors. Then the algebraic multiplicity of the eigenvalues is not important and we can first denote the eigenvectors x1, x2, ..., xn and then denote the corresponding eigenvalues λ1, λ2, ..., λn where λ1, λ2, ..., λn may not be distinct complex numbers. Upon solving the eigenvalue problem Ax = λx we obtain a set of n independent eigenvectors. Then we introduce an inner product in C^n and construct its biorthogonal set. Denoting this y1, y2, ..., yn we require

⟨yi, xj⟩ = δij, i, j = 1, 2, ..., n

Each of the sets of vectors x1, x2, ..., xn and y1, y2, ..., yn is a basis for C^n. Now the set of n² equations ⟨yi, xj⟩ = ȳi^T G xj = δij can be written

(ȳ1^T G; ȳ2^T G; ...; ȳn^T G) (x1 x2 ... xn) = I

whence the matrix whose rows are ȳ1^T G, ȳ2^T G, ... is the inverse of the matrix (x1 x2 ... xn). Indeed the vectors ȳ1^T G, ȳ2^T G, ... are independent of G.
The n eigenvalue problems

A xi = λi xi, i = 1, 2, ..., n

can be written

A (x1 x2 ... xn) = (λ1 x1 λ2 x2 ... λn xn)

and multiplying both sides on the right by the inverse of (x1 x2 ... xn) we obtain the spectral representation of A:

A = λ1 x1 ȳ1^T G + λ2 x2 ȳ2^T G + · · · + λn xn ȳn^T G

Denoting xi ȳi^T G by Pi, the reader can verify the multiplication rules

Pi Pi = Pi

and

Pi Pj = 0, i ≠ j

and we write

A = λ1 P1 + λ2 P2 + · · · + λn Pn

We say Pi selects xi because Pi xi = xi and Pi xj = 0, i ≠ j, viz., Pi Σ cj xj = ci xi. This formula simplifies certain calculations via the multiplication rules for the Pi; indeed it can be used to derive powers of A via

A^k = λ1^k P1 + λ2^k P2 + · · · + λn^k Pn
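A numerical sketch of the spectral representation (not from the lectures; Python with numpy, in the plain vanilla inner product, so the rows ȳi^T come from the inverse of the eigenvector matrix; the matrix is hypothetical):

    import numpy as np

    A = np.array([[3.0, 1.0],
                  [0.0, 2.0]])         # simple eigenvalues 3 and 2

    lam, X = np.linalg.eig(A)          # columns of X are eigenvectors x_i
    Y = np.linalg.inv(X)               # rows of Y are the biorthogonal rows

    P = [np.outer(X[:, i], Y[i, :]) for i in range(2)]   # P_i = x_i y_i^T

    print(np.allclose(P[0] @ P[0], P[0]), np.allclose(P[0] @ P[1], 0.0))  # True True
    print(np.allclose(A, lam[0] * P[0] + lam[1] * P[1]))                  # True

    k = 7
    Ak = lam[0] ** k * P[0] + lam[1] ** k * P[1]
    print(np.allclose(Ak, np.linalg.matrix_power(A, k)))                  # True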
As an example of its use, the balance and equilibrium equations corresponding to the linear, n stage, counter current separating cascade sketched below

[Figure: an n stage counter current cascade; stage i receives the L phase stream of composition x_{i+1} and the V phase stream of composition y_{i−1} and sends on streams of compositions xi and yi; y0 = yin and x1 = xout.]

are

L (x_{i+1} − xi) = V (yi − y_{i−1})

yi∗ = m xi

and

yi − y_{i−1} = E (yi∗ − y_{i−1})

where y0 and x_{n+1} are the compositions of the V and L phase feed streams and yn and x1 are the compositions of the V and L phase product streams. We write these equations
as

(x_{i+1}; yi) = A (xi; y_{i−1})

The eigenvalues of A, viz., λ1 = 1 and λ2 = 1 + E (mV/L − 1), are simple unless mV/L − 1 = 0. The corresponding eigenvectors are

x1 = (1; m), x2 = (V/L; 1)

As a result we have

(xin; yout) = [1/(1 − mV/L)] (1 , −V/L; m , −mV/L) (xout; yin)
+ [(1 + E (mV/L − 1))^n / (1 − mV/L)] (−mV/L , V/L; −m , 1) (xout; yin)
This is equivalent to the Kremser equation and the overall material balance, but in a symmetric form. What is important is that we have constructed a useful representation of A^n and we did not need a concrete value of n to do this.

If mV/L − 1 is zero, then λ1 = 1 is a double root and to it there corresponds only one independent eigenvector, x1 = (1; m). The spectral representation of A must take this into account. And to do this for a 2 × 2 matrix A we write

Ax1 = λ1 x1

and

Ax2 = x1 + λ1 x2

where x2 is a generalized eigenvector, here x2 = (1; 0). Then we write this

A (x1 x2) = (λ1 x1 , x1 + λ1 x2)

and using (x1 x2)^{−1} = (ȳ1^T G; ȳ2^T G), where {x1, x2} and {y1, y2} are biorthogonal sets, we have

A = λ1 x1 ȳ1^T G + x1 ȳ2^T G + λ1 x2 ȳ2^T G = λ1 P1 + P12 + λ1 P2 = λ1 I + P12

where the multiplication rules are now P1 P12 = P12, P12 P1 = 0, P2 P12 = 0, P12 P2 = P12 and P12 P12 = 0. Hence we have A^n = λ1^n I + n λ1^{n−1} P12 and this can be used to derive the Kremser equation when the equilibrium and operating lines are parallel.
This restates what we already know in the case n = 2. Thus if A has a complete set of eigenvectors, then, in the same inner product in which we determine y1, y2, ..., yn, we can obtain A∗ and calculate A∗ yi. Expanding A∗ yi in y1, y2, ..., yn we get

A∗ yi = Σ_{j=1}^n ⟨xj, A∗ yi⟩ yj = Σ_{j=1}^n ⟨Axj, yi⟩ yj = λ̄i yi

This tells us that the vectors yi satisfy the eigenvalue problem for A∗, viz.,

A∗ yi = λ̄i yi, i = 1, 2, ..., n

The sets of eigenvectors of A and A∗ are biorthogonal, while the sets of eigenvalues are complex conjugates. Indeed det (A − λI) = 0 implies det (A∗ − λ̄I) = det (G^{−1} Ā^T G − λ̄I) = 0.
We plan to use what we have learned in this lecture in the next lecture to write the solution to differential and difference equations. Before we do this, and to illustrate the useful fact that we can solve problems by expanding their solutions in a convenient basis, we return to the problem $Ax = b$ and assume it has a solution. Then expanding $x$ in the set of eigenvectors of $A$, assumed to be complete, we write
$$x = \sum c_i x_i$$
and our job is to determine the coefficients $c_i$, where $c_i = \langle y_i, x \rangle$. We can find $c_i$ by calculating the inner product of $y_i$ and both sides of $Ax = b$. Indeed we have
$$\langle y_i, Ax \rangle = \langle y_i, b \rangle$$
$$\langle A^* y_i, x \rangle = \langle y_i, b \rangle$$
$$\lambda_i \langle y_i, x \rangle = \langle y_i, b \rangle$$
whence
$$\langle y_i, x \rangle = \frac{\langle y_i, b \rangle}{\lambda_i}$$
and
$$x = \sum \frac{\langle y_i, b \rangle}{\lambda_i}\, x_i$$
is the solution of $Ax = b$. We see that each coefficient $c_i$ is determined independent of the other coefficients.
The subspaces of $C^n$ important to the problem $Ax = b$, viz., $\mathrm{Im}\,A$ and $\mathrm{Ker}\,A$, can be thought about in terms of the eigenvectors of $A$ and $A^*$: $\mathrm{Im}\,A$ is the span of the eigenvectors of $A$ corresponding to eigenvalues that are not zero, $\mathrm{Ker}\,A^*$ is the span of the eigenvectors of $A^*$ corresponding to eigenvalues that are zero and $\mathrm{Ker}\,A$ is the span of the eigenvectors of $A$ corresponding to eigenvalues that are zero. For instance if $\lambda_1 = 0$ in the foregoing, the solvability condition is $\langle y_1, b \rangle = 0$ and if that is satisfied $\dfrac{\langle y_1, b \rangle}{\lambda_1}$ is indeterminate and can be replaced by an arbitrary constant $c_1$. The solution then contains an arbitrary multiple of $x_1$, a basis vector for $\mathrm{Ker}\,A$.

In the case $n = 2$, writing
$$x = \langle y_1, x \rangle\, x_1 + \langle y_2, x \rangle\, x_2$$
the equations
$$\langle y_i, Ax \rangle = \langle y_i, b \rangle, \quad i = 1, 2$$
imply
$$\lambda_1 \langle y_1, x \rangle = \langle y_1, b \rangle$$
and
$$\lambda_2 \langle y_2, x \rangle = \langle y_2, b \rangle$$
whereupon, if $\lambda_1$ and $\lambda_2$ are not zero, we obtain $\langle y_1, x \rangle$ and $\langle y_2, x \rangle$ and write the solution to $Ax = b$ accordingly.
Solvability conditions are important in perturbation calculations. To see why this is so, suppose a matrix of interest, $A$, is close to a matrix $A_0$ whose eigenvalue problem has been solved resulting in a complete set of eigenvectors and simple eigenvalues:
$$A = A_0 + \varepsilon A_1 + \varepsilon^2 A_2 + \cdots$$
Then writing
$$\lambda_i = \lambda_{0i} + \varepsilon \lambda_{1i} + \varepsilon^2 \lambda_{2i} + \cdots$$
and
$$x_i = x_{0i} + \varepsilon x_{1i} + \varepsilon^2 x_{2i} + \cdots$$
and substituting these expansions into $Ax_i = \lambda_i x_i$, we find, at successive orders in $\varepsilon$,
$$\left(A_0 - \lambda_{0i} I\right) x_{0i} = 0$$
$$\left(A_0 - \lambda_{0i} I\right) x_{1i} = \left(\lambda_{1i} I - A_1\right) x_{0i}$$
$$\left(A_0 - \lambda_{0i} I\right) x_{2i} = \left(\lambda_{2i} I - A_2\right) x_{0i} + \left(\lambda_{1i} I - A_1\right) x_{1i}$$
etc.

The first problem determines $x_{0i}$ and $\lambda_{0i}$. And at every succeeding order the matrix $A_0 - \lambda_{0i} I$ appears and the homogeneous problem $\left(A_0 - \lambda_{0i} I\right) x = 0$ has a non zero solution, viz., $x_{0i}$. To determine the first corrections, $x_{1i}$ and $\lambda_{1i}$, we turn to the second problem. To get $x_{1i}$ requires that a solvability condition be satisfied. This is the requirement that $\left(\lambda_{1i} I - A_1\right) x_{0i}$ belong to $\mathrm{Im}\left(A_0 - \lambda_{0i} I\right)$ and hence be perpendicular to $\mathrm{Ker}\left(\left(A_0 - \lambda_{0i} I\right)^*\right)$. But this is spanned by $y_{0i}$ and hence the solvability condition
$$\left\langle y_{0i}, \left(\lambda_{1i} I - A_1\right) x_{0i} \right\rangle = 0$$
determines $\lambda_{1i}$ as
$$\lambda_{1i} = \langle y_{0i}, A_1 x_{0i} \rangle$$
whence
$$\lambda_i = \lambda_{0i} + \langle y_{0i}, A_1 x_{0i} \rangle\, \varepsilon + \cdots$$
Continuing the calculation requires no new ideas but a lot of tedious work. Indeed to determine $x_{1i}$ we use the solution of $Ax = b$ written above, putting $A_0 - \lambda_{0i} I$ in place of $A$ and $\left(\lambda_{1i} I - A_1\right) x_{0i}$ in place of $b$. Because the eigenvalues and eigenvectors of $A_0 - \lambda_{0i} I$ are $\lambda_{0j} - \lambda_{0i}$ and $x_{0j}$, $j = 1, 2, \ldots, n$, we get
$$x_{1i} = \sum_{j \neq i} \frac{\left\langle y_{0j}, \left(\lambda_{1i} I - A_1\right) x_{0i} \right\rangle}{\lambda_{0j} - \lambda_{0i}}\, x_{0j} + c_{1i}\, x_{0i}$$
This is required to determine $\lambda_{2i}$; the readers may wish to satisfy themselves that $\lambda_{2i}$, where $\lambda_{2i} = \langle y_{0i}, A_2 x_{0i} \rangle - \left\langle y_{0i}, \left(\lambda_{1i} I - A_1\right) x_{1i} \right\rangle$, is independent of the value assigned to $c_{1i}$.
The calculation becomes more interesting as complexities arise. Suppose $\lambda_{01}$ turns out to be a double root and $\mathrm{Ker}\left(A_0 - \lambda_{01} I\right)$ is two dimensional so that $A_0$ retains a complete set of eigenvectors. On perturbation, $\lambda_{01}$ is likely to split into two simple roots $\lambda_1$ and $\lambda_2$ to which correspond independent eigenvectors $x_1$ and $x_2$. Now $x_1$ and $x_2$ approach definite vectors $x_{01}$ and $x_{02}$ in $\mathrm{Ker}\left(A_0 - \lambda_{01} I\right)$ as $\varepsilon$ goes to zero but we cannot know in advance what these limits are and hence we cannot select $x_{01}$ and $x_{02}$ in advance out of all of the possibilities in $\mathrm{Ker}\left(A_0 - \lambda_{01} I\right)$. This means that at the outset when we write $x_1 = x_{01} + \varepsilon x_{11} + \cdots$ and $x_2 = x_{02} + \varepsilon x_{12} + \cdots$ we do not know $x_{01}$ and $x_{02}$ and we must determine their values as we go along. We do not explore this and other complications. That would deflect us from our elementary goals.
1. Early in the lecture we said that if there is an inner product in which $A^* = A$ then $A$ has a complete set of eigenvectors. Indeed in that inner product $A$ has $n$ mutually perpendicular eigenvectors. This establishes their linear independence, and linear independence, if not orthogonality, is retained as we introduce other inner products. The reader may go on and show that the corresponding eigenvalues are real by calculating $\langle x_i, A x_i \rangle = \lambda_i \langle x_i, x_i \rangle$. The condition that $A^* = A$ is $GA = \overline{(GA)}^T$.

The converse of this is true. If $A$ has a complete set of eigenvectors and real eigenvalues then $A^* = A$ in some inner product. Denoting the eigenvectors $x_1, x_2, \ldots, x_n$ and the corresponding eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$ we can write
$$A \begin{pmatrix} x_1 & x_2 & \cdots & x_n \end{pmatrix} = \begin{pmatrix} x_1 & x_2 & \cdots & x_n \end{pmatrix} \mathrm{diag}\left(\lambda_1\ \lambda_2\ \cdots\ \lambda_n\right)$$
hence letting $X = (x_1\ x_2\ \cdots\ x_n)$ we have
$$A X X^T = X\, \mathrm{diag}\left(\lambda_1\ \lambda_2\ \cdots\ \lambda_n\right) X^T$$
so if $G = X X^T$ then $G = G^T$, $x^T G x > 0$ for all $x \neq 0$, and
$$AG = (AG)^T$$
This result tells us that the requirement $A^* = A$ in some inner product is necessary and sufficient that $A$ have a complete set of eigenvectors and real eigenvalues. The readers may ask: why is it that the $\lambda$'s must be real?
2. Eigenvectors corresponding to distinct eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_d$ are independent. To see this, set
$$c_1 x_1 + c_2 x_2 + \cdots + c_d x_d = 0$$
Multiplying by $\left(A - \lambda_1 I\right)$ we get
$$c_2 \left(\lambda_2 - \lambda_1\right) x_2 + \cdots + c_d \left(\lambda_d - \lambda_1\right) x_d = 0$$
and then multiplying by $\left(A - \lambda_2 I\right)$, \ldots, we ultimately get
$$c_d \left(\lambda_d - \lambda_1\right)\left(\lambda_d - \lambda_2\right) \cdots \left(\lambda_d - \lambda_{d-1}\right) x_d = 0$$
whence $c_d = 0$ and, repeating the argument, every $c_i = 0$.
Even more is true. If corresponding to each distinct eigenvalue we determine a set of independent eigenvectors and then corresponding to each of these a chain of generalized eigenvectors, then all of these eigenvectors and generalized eigenvectors are independent. The idea is that by multiplying a linear combination of these vectors by $\left(A - \lambda_1 I\right)$ sufficiently many times we can remove from it all vectors corresponding to $\lambda_1$. To see how this works take the simple example where $x_1$ and $x_2$ correspond to $\lambda_1$ and $x_3$ to $\lambda_3$. Then write
$$Ax_1 = \lambda_1 x_1$$
$$Ax_2 = x_1 + \lambda_1 x_2$$
and
$$Ax_3 = \lambda_3 x_3$$
and set
$$c_1 x_1 + c_2 x_2 + c_3 x_3 = 0$$
Multiply this by $\left(A - \lambda_1 I\right)$ to get
$$c_2 x_1 + c_3 \left(\lambda_3 - \lambda_1\right) x_3 = 0$$
and then again by $\left(A - \lambda_1 I\right)$ to get
$$c_3 \left(\lambda_3 - \lambda_1\right)^2 x_3 = 0$$
whence $c_3 = 0$, then $c_2 = 0$ and then $c_1 = 0$.
3. Any set of independent vectors in $C^n$ determines a unique biorthogonal set in its span. For example if $x_1$ and $x_2$ are independent in $C^n$ then we can determine $y_1$ and $y_2$ in $\mathrm{span}\,\{x_1, x_2\}$ satisfying $\langle y_i, x_j \rangle = \delta_{ij}$.
We introduce the idea that a matrix represents a linear operator in a specified basis. Let $\vec{x}$ be a vector (possibly a column vector) and $\vec{e}_1, \vec{e}_2, \ldots, \vec{e}_n$ be a basis for the $n$ dimensional vector space in which $\vec{x}$ resides. Then we can write
$$\vec{x} = x_1 \vec{e}_1 + x_2 \vec{e}_2 + \cdots + x_n \vec{e}_n$$
and the coefficients $x_1, x_2, \ldots, x_n$ make up the column vector $x$ representing $\vec{x}$ in this basis. If $\vec{f}_1, \vec{f}_2, \ldots, \vec{f}_n$ is a second basis, then the column vectors representing the vectors $\vec{f}_1, \vec{f}_2, \ldots, \vec{f}_n$ in the basis $\vec{e}_1, \vec{e}_2, \ldots, \vec{e}_n$ are the columns of a matrix denoted $P$. And because $\vec{f}_1, \vec{f}_2, \ldots, \vec{f}_n$ is independent so also is the set of columns of $P$ and hence $\det P \neq 0$. If we now write
$$x = P y$$
the formula $x = P y$ determines the column vector $x$ representing a vector $\vec{x}$ in the basis $\vec{e}_1, \vec{e}_2, \ldots, \vec{e}_n$ in terms of the column vector $y$ representing the vector $\vec{x}$ in the other basis $\vec{f}_1, \vec{f}_2, \ldots, \vec{f}_n$. The columns of the transformation matrix $P$ are the column vectors representing the second basis vectors in the first basis. Each vector $\vec{x}$ is represented by many column vectors corresponding to many bases and each column vector represents many vectors, again corresponding to many bases, but the representation is one-to-one in a fixed basis.
If $L$ is a linear operator (possibly an $n \times n$ matrix) acting in this vector space we can write
$$L\vec{e}_j = a_{1j}\, \vec{e}_1 + a_{2j}\, \vec{e}_2 + \cdots + a_{nj}\, \vec{e}_n, \quad j = 1, 2, \ldots, n$$
and denote by $A$ the matrix whose columns are the column vectors representing $L\vec{e}_1, L\vec{e}_2, \ldots$ in the basis $\vec{e}_1, \vec{e}_2, \ldots, \vec{e}_n$. We call this the matrix representing $L$ in the basis $\vec{e}_1, \vec{e}_2, \ldots, \vec{e}_n$. Likewise denoting by $B$ the matrix representing $L$ in the basis $\vec{f}_1, \vec{f}_2, \ldots, \vec{f}_n$ we find:
$$A = P B P^{-1}$$
The formula $A = P B P^{-1}$ determines the matrix $A$ representing $L$ in the basis $\vec{e}_1, \vec{e}_2, \ldots, \vec{e}_n$ in terms of the matrix $B$ representing $L$ in another basis $\vec{f}_1, \vec{f}_2, \ldots, \vec{f}_n$. Each linear operator $L$ is represented by many matrices corresponding to many bases and all display the same information but this information is easier to obtain in some bases than it is in others. Indeed if $A$, $x$ and $b$ represent $L$, $\vec{x}$ and $\vec{b}$ in the basis $\vec{e}_1, \vec{e}_2, \ldots, \vec{e}_n$ while in $\vec{f}_1, \vec{f}_2, \ldots, \vec{f}_n$ the representation is $B$, $y$ and $c$, then the equation $L\vec{x} = \vec{b}$ is represented in $C^n$ by both $Ax = b$ and $By = c$. And one of these may be easier to solve than the other.
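A short numerical sketch may help fix the bookkeeping; it assumes numpy and uses arbitrary stand-in matrices:

```python
import numpy as np

# The same operator represented in two bases: A = P B P^(-1).
A = np.array([[1.0, 2.0],
              [0.0, 3.0]])      # matrix of L in the basis e1, e2
P = np.array([[1.0, 1.0],
              [1.0, 2.0]])      # second basis written in the first

B = np.linalg.inv(P) @ A @ P    # matrix of L in the second basis
# same operator, hence the same eigenvalues
assert np.allclose(np.sort(np.linalg.eigvals(A)), np.sort(np.linalg.eigvals(B)))

# representations of one fixed vector: x = P y
y = np.array([1.0, -1.0])       # column vector of x-arrow in the second basis
x = P @ y                       # column vector of the same vector in the first
assert np.allclose(A @ x, P @ (B @ y))   # L x-arrow, represented both ways
```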
Using $A = P B P^{-1}$ and the theorem that the determinant of a product is the product of the determinants of its factors we see that
$$\det(\lambda I - A) = \det(\lambda I - B)$$
and hence we define the characteristic polynomial of $L$ to be the characteristic polynomial of any matrix that represents it. We then define the eigenvalues of $L$ to be the eigenvalues of any matrix that represents it. The eigenvalues of any two matrices $A$ and $B$, where $A = P B P^{-1}$ for any nonsingular matrix $P$, are the same. The eigenvalues of $L$ are independent of the basis used for their determination.

If $L$ has an invariant subspace of dimension $k$, then, using a basis whose first $k$ vectors lie in this subspace, the matrix representing $L$ in this basis reflects this structure by exhibiting an $(n - k) \times k$ block of zeros in its lower left hand corner. Its determinant then factors as the product of the determinants of its upper left hand $k \times k$ block and its lower right hand $(n - k) \times (n - k)$ block. This establishes the result that the geometric multiplicity of an eigenvalue cannot exceed its algebraic multiplicity. To see this let $\dim \mathrm{Ker}\left(A - \lambda_1 I\right) = n_1$ and let the first $n_1$ vectors in a basis be a basis for $\mathrm{Ker}\left(A - \lambda_1 I\right)$; then $\det(\lambda I - A)$ contains the factor $\left(\lambda - \lambda_1\right)^{m_1}$ where $m_1$ cannot be less than $n_1$.
Let the linear operator $L$ be the $n \times n$ matrix $A$. Then it is represented by itself in the natural basis,
$$\begin{pmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix}, \cdots, \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{pmatrix}.$$
If $A$ has a complete set of eigenvectors $x_1, x_2, \ldots, x_n$ its matrix in the basis $x_1, x_2, \ldots, x_n$ is the diagonal matrix of the corresponding eigenvalues. Such a basis is called a diagonalizing basis. If to the eigenvalue $\lambda_1$, repeated twice, there corresponds only the eigenvector $x_1$, we construct a generalized eigenvector $x_2$ and in the basis $x_1, x_2, \ldots$ the matrix of $A$ has the block $\begin{pmatrix} \lambda_1 & 1 \\ 0 & \lambda_1 \end{pmatrix}$ in the upper left hand corner. Using a basis of eigenvectors and generalized eigenvectors, we find that the matrix representing $A$ is block diagonal. To each eigenvalue $\lambda_i$ of multiplicity $m_i$, $i = 1, 2, \ldots, d$, there corresponds an $m_i \times m_i$ block. Outside of these $d$ blocks all elements vanish. Inside the $i$th block $\lambda_i$ appears on the diagonal, 1 or 0 on the superdiagonal, and 0 elsewhere. The structure of the superdiagonal is determined by the chains of generalized eigenvectors. For instance if $\lambda_1$ is a threefold root to which corresponds only the eigenvector $x_1$ then $x_1$ generates the chain $x_1 \to x_2 \to x_3$ and the corresponding block is
$$\begin{pmatrix} \lambda_1 & 1 & 0 \\ 0 & \lambda_1 & 1 \\ 0 & 0 & \lambda_1 \end{pmatrix};$$
but if there are two eigenvectors $x_1$ and $x_2$ and $x_2$ generates the chain $x_2 \to x_3$ the block is
$$\begin{pmatrix} \lambda_1 & 0 & 0 \\ 0 & \lambda_1 & 1 \\ 0 & 0 & \lambda_1 \end{pmatrix}.$$
The forms we have been talking about, including the purely diagonal form, are called Jordan forms. Such forms are either diagonal or as close to diagonal as we can get using basis transformations. In $A = P J P^{-1}$ the columns of the transformation matrix are the column vectors representing the eigenvectors and generalized eigenvectors of $A$ in the natural basis.
Shilov’s book “Linear Algebra” gives an algebraic account of this via polynomial algebras and
their ideals. Gantmacher’s book “Theory of Matrices” gives both an algebraic and a geometric
explanation.
1. Derive the formula for the eigenvalues and eigenvectors of the block triangular matrix
$$\begin{pmatrix} A & 0 \\ B & C \end{pmatrix}$$
where the blocks on the diagonal are square and have simple eigenvalues. To do this, write the eigenvalue problem in block form and solve it in terms of the solutions to the eigenvalue problems for $A$ and $C$.
2. For a countercurrent cascade the Kremser equation is the corresponding formula for $\begin{pmatrix} x_{in} \\ y_{out} \end{pmatrix}$ in terms of $\begin{pmatrix} x_{out} \\ y_{in} \end{pmatrix}$. For fixed $x_{in}$, $y_{in}$, investigate $x_{out}$ and $y_{out}$ as $n$ grows large. The results will depend on whether $\dfrac{mV}{L}$ is greater or less than 1 and on whether $y_{in}$ is greater or less than $m x_{in}$.

3. Let $\dfrac{mV}{L} = 1$ and rederive the Kremser equation. In this instance the eigenvalue 1 is repeated and corresponds to only one independent eigenvector. Is the result the limit of the ordinary Kremser equation for $\dfrac{mV}{L} \neq 1$ as $\dfrac{mV}{L} \to 1$? A related problem: find the eigenvalues and eigenvectors of
$$I + a\, b^T$$
4. Show that the eigenvalues of diagonal and triangular matrices are their diagonal elements.
5. If the simple Drude model is used to derive the potential energy of three molecules lying on a straight line, the matrix
$$\begin{pmatrix} a & -b & -c \\ -b & a & -d \\ -c & -d & a \end{pmatrix}$$
turns up. The numbers $a$, $b$, $c$ and $d$ are all positive and $a \gg b, c, d$. The numbers $b$, $c$ and $d$ denote the dipole-dipole interactions. The invariants are
$$\Delta_1 = 3a$$
$$\Delta_2 = 3a^2 - \left(b^2 + c^2 + d^2\right)$$
$$\Delta_3 = a^3 - a\left(b^2 + c^2 + d^2\right) - 2bcd$$
If $c = 0 = d$ the eigenvalues are
$$\lambda_1 = a - b, \quad \lambda_2 = a, \quad \lambda_3 = a + b$$
If $d = 0$ they are
$$\lambda_1 = a - \sqrt{b^2 + c^2}, \quad \lambda_2 = a, \quad \lambda_3 = a + \sqrt{b^2 + c^2}$$
The estimates
$$\lambda_1 = a - \sqrt{b^2 + c^2 + d^2}, \quad \lambda_2 = a, \quad \lambda_3 = a + \sqrt{b^2 + c^2 + d^2}$$
satisfy
$$\lambda_1 + \lambda_2 + \lambda_3 = \Delta_1$$
$$\lambda_1 \lambda_2 + \lambda_2 \lambda_3 + \lambda_3 \lambda_1 = \Delta_2$$
but
$$\lambda_1 \lambda_2 \lambda_3 = \Delta_3 + 2bcd$$
Write
$$\begin{pmatrix} a & -b & -c \\ -b & a & -d \\ -c & -d & a \end{pmatrix} = \begin{pmatrix} a & -b & -c \\ -b & a & 0 \\ -c & 0 & a \end{pmatrix} - d \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}$$
and estimate the eigenvalues by a perturbation calculation. Carry this out to first order in $d$ and then calculate $\lambda_1 + \lambda_2 + \lambda_3$, $\lambda_1 \lambda_2 + \lambda_2 \lambda_3 + \lambda_3 \lambda_1$ and $\lambda_1 \lambda_2 \lambda_3$. Are your estimates improved if $d^2$ is added to $b^2 + c^2$ wherever $b^2 + c^2$ appears?
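The perturbation calculation suggested in Problem 5 can be checked numerically; here is a minimal sketch, assuming numpy, with stand-in values of $a$, $b$, $c$, $d$ chosen so that $a \gg b, c, d$:

```python
import numpy as np

a, b, c, d = 10.0, 0.4, 0.3, 0.2
A0 = np.array([[a, -b, -c],
               [-b, a, 0.0],
               [-c, 0.0, a]])
A1 = -np.array([[0.0, 0.0, 0.0],
                [0.0, 0.0, 1.0],
                [0.0, 1.0, 0.0]])    # the full matrix is A0 + d*A1

lam0, X0 = np.linalg.eigh(A0)        # A0 is symmetric, so eigh applies
# first order corrections lam_1i = <x_0i, A1 x_0i> (orthonormal eigenvectors)
lam1 = np.array([X0[:, i] @ (A1 @ X0[:, i]) for i in range(3)])

print(np.sort(lam0 + d * lam1))      # first order estimates
print(np.linalg.eigvalsh(A0 + d * A1))   # exact eigenvalues, for comparison
```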
6. Here is an $n \times n$ matrix:
$$\begin{pmatrix} 1 & 1 & \cdots & 1 \\ 1 & 1 & \cdots & 1 \\ \vdots & \vdots & & \vdots \\ 1 & 1 & \cdots & 1 \end{pmatrix}$$
Show that its eigenvalues are $n$ and 0 where 0 is repeated $n - 1$ times. Show that the corresponding eigenvectors are
$$\begin{pmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{pmatrix}$$
and
$$\begin{pmatrix} 1 \\ -1 \\ 0 \\ 0 \\ \vdots \\ 0 \end{pmatrix}, \begin{pmatrix} 1 \\ 0 \\ -1 \\ 0 \\ \vdots \\ 0 \end{pmatrix}, \cdots, \begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \\ \vdots \\ -1 \end{pmatrix}$$
7. Let $A$, $P$ and $Q$ be $n \times n$ matrices where $A$ and $Q$ are known and $P$ is to be determined. The equation for doing this is
$$A^T P + P A = -Q$$
This can be written
$$a\, p = -q$$
where $a$ is an $n^2 \times n^2$ matrix and $p$ and $q$ are $n^2 \times 1$ vectors. To see what this equation looks like and to decide when it can be solved, suppose that the eigenvalue problems for $A$ and $A^T$ lead to the biorthogonal sets of eigenvectors $x_1, x_2, \ldots, x_n$ and $y_1, y_2, \ldots, y_n$ and the corresponding eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$. Then write
$$A^T P + P A = -Q$$
in the basis $y_i\, y_j^T$ and show that the linear operator $P \mapsto A^T P + P A$ has the $n^2$ eigenvalues $\lambda_i + \lambda_j$.
8. Find the eigenvalues and eigenvectors of
$$I + a\, b^T + b\, a^T$$
To do this let $x_1, x_2, \ldots, x_{n-2}$ be independent and lie in $[a, b]^{\perp}$. Then $x_1, x_2, \ldots, x_{n-2}$ are eigenvectors corresponding to the eigenvalue 1. The remaining two eigenvectors lie in $[a, b]^{\perp\perp} = [a, b]$. To find these and the corresponding eigenvalues put
$$x = \alpha\, a + \beta\, b$$
Then
$$\left(I + a\, b^T + b\, a^T\right)\left(\alpha\, a + \beta\, b\right) = \lambda\left(\alpha\, a + \beta\, b\right)$$
requires
$$\alpha + \alpha\, b^T a + \beta\, b^T b = \lambda\, \alpha$$
and
$$\beta + \alpha\, a^T a + \beta\, a^T b = \lambda\, \beta$$
or
$$\begin{pmatrix} 1 + b^T a & b^T b \\ a^T a & 1 + a^T b \end{pmatrix} \begin{pmatrix} \alpha \\ \beta \end{pmatrix} = \lambda \begin{pmatrix} \alpha \\ \beta \end{pmatrix}$$
The determinant and the trace of the matrix on the left hand side are
$$\det = \left(1 + a^T b\right)^2 - a^T a\; b^T b$$
and
$$\mathrm{tr} = 2\left(1 + a^T b\right)$$
whence
$$\mathrm{tr}^2 - 4 \det > 0$$
and this tells us that the remaining two eigenvalues are real and not equal. Work out the case
$$a^T b = 0, \quad a^T a = 1, \quad b^T b = 1$$
and then go on to the matrix
$$I + a\, b^T + c\, d^T$$
9. The generalized eigenvalue problem
$$Ax = \lambda Bx$$
can sometimes be of interest. Begin the study of this problem by finding a bound on the largest number of independent solutions corresponding to any $\lambda \neq 0$.

10. $AA^* = A^*A$
Suppose the departure of $x_1(t), x_2(t), \ldots, x_n(t)$ from assigned initial values is determined by the set of $n$ ordinary differential equations
$$\frac{dx_i}{dt} = \sum_{j=1}^n a_{ij}\, x_j, \quad i = 1, 2, \ldots, n,$$
i.e., by
$$\frac{dx}{dt} = Ax$$
where $x(t = 0)$ is assigned and where the constants $a_{ij}$ are the elements of the $n \times n$ matrix $A$. We suppose, first, that $A$ has a complete set of eigenvectors and construct the solution to this problem in terms of these eigenvectors.
To introduce the notation let the eigenvectors and eigenvalues of $A$ and $A^*$ satisfy
$$Ax_i = \lambda_i x_i, \quad i = 1, \ldots, n$$
and
$$A^* y_i = \bar{\lambda}_i\, y_i, \quad i = 1, \ldots, n$$
where $\langle y_i, x_j \rangle = \delta_{ij}$. Then we write
$$x(t) = \sum_{i=1}^n c_i(t)\, x_i$$
and seek the coefficients $c_i(t)$ in this expansion, where $c_i(t) = \langle y_i, x(t) \rangle$.
Thus, what we do to solve linear problems requires three steps to be carried out: determine an
eigenvector basis, expand the solution in this basis and determine the coefficients in this expansion.
Its simplicity rests on the idea of biorthogonal sets of vectors.
This leads to
$$\frac{d}{dt} \langle y_i, x \rangle = \langle y_i, Ax \rangle = \langle A^* y_i, x \rangle = \lambda_i \langle y_i, x \rangle,$$
i.e., to
$$\frac{dc_i}{dt} = \lambda_i c_i.$$
As a result we have
$$c_i(t) = c_i(t = 0)\, e^{\lambda_i t}$$
and hence
$$x(t) = \sum_{i=1}^n \langle y_i, x(t = 0) \rangle\, e^{\lambda_i t}\, x_i$$
Indeed if
$$\frac{dx}{dt} = Ax + b(t)$$
we need to add
$$\sum_{i=1}^n \int_0^t e^{\lambda_i (t - \tau)}\, \langle y_i, b(\tau) \rangle\, d\tau\; x_i$$
to the foregoing to obtain the solution. And this may be discovered using the same steps that led to the solution of the problem where $b(t) = 0$.
The problem as originally written requires that we determine the unknown functions $x_1(t), \ldots, x_n(t)$ simultaneously. These functions are the components of $x(t)$ in the natural basis for the problem and each component ordinarily appears in each equation. So we look for a way to break this coupling. When $A$ has a complete set of eigenvectors we can do this by expanding the solution $x(t)$ in the eigenvector basis. Then the determination of the expansion coefficients $c_1(t), \ldots, c_n(t)$, unlike the determination of the natural components $x_1(t), \ldots, x_n(t)$, is a completely uncoupled problem.
Example (i)

Let $A = \begin{pmatrix} -1 & -1 \\ 1 & -1 \end{pmatrix}$; then $\mathrm{tr}\,A = -2$ and $\det A = 2$. The eigenvalues are $\lambda_1 = -1 + i$ and $\lambda_2 = -1 - i$ and the corresponding eigenvectors are $x_1 = \begin{pmatrix} 1 \\ -i \end{pmatrix}$ and $x_2 = \begin{pmatrix} 1 \\ i \end{pmatrix}$. Because $A$ is real and $\lambda_1$ and $x_1$ satisfy the eigenvalue problem so also do $\bar{\lambda}_1$ and $\bar{x}_1$. And although we find, in the plain vanilla inner product, that $\langle x_1, x_2 \rangle = 0$, $A^*$ is not equal to $A$. What in fact is true is that $A A^T = A^T A$, i.e., that $A$ is normal. In the plain vanilla inner product the biorthogonal set is
$$y_1 = \frac{1}{2} \begin{pmatrix} 1 \\ -i \end{pmatrix}, \quad y_2 = \frac{1}{2} \begin{pmatrix} 1 \\ i \end{pmatrix}$$
whence
$$c_1 = \left\langle \frac{1}{2} \begin{pmatrix} 1 \\ -i \end{pmatrix}, x(t = 0) \right\rangle$$
and
$$c_2 = \left\langle \frac{1}{2} \begin{pmatrix} 1 \\ i \end{pmatrix}, x(t = 0) \right\rangle.$$
When $A$ is real and $x(t = 0)$ is real then $x(t)$ must be real for all values of $t$. If $\lambda_2 = \bar{\lambda}_1$ and we require $x_2 = \bar{x}_1$ then $y_2 = \bar{y}_1$ and hence $c_2 = \bar{c}_1$ and so the two terms adding to $x(t)$, $c_1 e^{\lambda_1 t} x_1$ and $c_2 e^{\lambda_2 t} x_2$, are complex conjugates. As a result $x(t)$ can be written as $2\,\mathrm{Re}\left\{c_1 e^{\lambda_1 t} x_1\right\}$. In this example this is
$$2\,\mathrm{Re}\left\{ c_1\, e^{(-1+i)t} \begin{pmatrix} 1 \\ -i \end{pmatrix} \right\}$$
and, on writing
$$\lambda_1 = \mathrm{Re}\,\lambda_1 + i\, \mathrm{Im}\,\lambda_1 = -1 + i$$
$$x_1 = \mathrm{Re}\,x_1 + i\, \mathrm{Im}\,x_1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix} + i \begin{pmatrix} 0 \\ -1 \end{pmatrix}$$
and
$$c_1 = \rho\, e^{i\phi},$$
we have
$$x(t) = 2\rho\, e^{-t} \begin{pmatrix} \cos(t + \phi) \\ \sin(t + \phi) \end{pmatrix}$$
Example (ii)

Let $A = \begin{pmatrix} 1 & -2 \\ 3 & -4 \end{pmatrix}$; then $\mathrm{tr}\,A = -3$, $\det A = 2$ and
$$\lambda_1 = -1, \quad x_1 = \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \quad \lambda_2 = -2, \quad x_2 = \begin{pmatrix} 1 \\ 3/2 \end{pmatrix}.$$
In the plain vanilla inner product the biorthogonal set is $y_1 = \begin{pmatrix} 3 \\ -2 \end{pmatrix}$, $y_2 = \begin{pmatrix} -2 \\ 2 \end{pmatrix}$, whence
$$c_1 = \left\langle \begin{pmatrix} 3 \\ -2 \end{pmatrix}, x(t = 0) \right\rangle$$
and
$$c_2 = \left\langle \begin{pmatrix} -2 \\ 2 \end{pmatrix}, x(t = 0) \right\rangle.$$
We see that as $t$ increases the second term dies out exponentially fast compared to the first and for large enough values of $t$
$$x(t) \sim c_1\, e^{-t} \begin{pmatrix} 1 \\ 1 \end{pmatrix}$$
This tells us that $x(t)$ approaches 0 from the direction $\begin{pmatrix} 1 \\ 1 \end{pmatrix}$. We will use this fact in Lecture 8 to help us turn experimental data into estimates of the elements of $A$.
There are special solutions to $dx/dt = Ax$ called equilibrium solutions. These satisfy $Ax = 0$ and hence $dx/dt = 0$. An equilibrium solution is constant in time and can only be obtained by starting there. If $\det A \neq 0$ then 0 is the only solution to $Ax = 0$ and hence is the only equilibrium solution. All other solutions are always on the move and according to where they start are given by our formula
$$x(t) = \sum_{i=1}^n \langle y_i, x(t = 0) \rangle\, e^{\lambda_i t}\, x_i.$$
Now these may or may not converge to the equilibrium solution as time grows large. If all do, we call the equilibrium solution asymptotically stable and a necessary and sufficient condition for this is that $\mathrm{Re}\,\lambda_i < 0$, $i = 1, 2, \ldots, n$, i.e., that all eigenvalues lie in the left half of the complex plane. If this is so $x(t)$ goes to 0 exponentially fast as $t$ grows large, at a rate determined by the largest of the $\mathrm{Re}\,\lambda_i$, $i = 1, 2, \ldots, n$.
If $n$ is 2, the eigenvalues of $A$ are roots of a quadratic equation and it is easy to see that asymptotic stability obtains iff $\mathrm{tr}\,A < 0$ and $\det A > 0$. But as $n$ increases beyond 2 eigenvalues are increasingly difficult to determine and what we need is a simple estimate of where the eigenvalues of $A$ lie. The best of these is Gerschgorin's circle theorem and it is surprisingly easy to prove; it tells us that each eigenvalue of $A$ lies on or inside at least one of $n$ circles in the complex plane. In fact, there are two sets of $n$ circles; they are
$$|\lambda - a_{ii}| \leq \sum_{j \neq i} |a_{ij}|, \quad i = 1, \ldots, n$$
and
$$|\lambda - a_{ii}| \leq \sum_{j \neq i} |a_{ji}|, \quad i = 1, \ldots, n$$
The first set of circles corresponds to the rows of $A$. The second set to the rows of $A^T$ and hence to the columns of $A$. For instance, in the second example, the best estimate via Gerschgorin's theorem is that the eigenvalues cannot lie in the region outside the two circles of radius 2 centered on $-4$ and 1. And both sets of circles are required to determine this.
If $A$ is diagonal, Gerschgorin's theorem predicts all its eigenvalues; if $A$ is triangular the theorem predicts the eigenvalues $a_{11}$ and $a_{nn}$; the more diagonally dominant the matrix, the better the estimates made by the theorem.
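Both sets of circles are easy to tabulate; here is a minimal sketch, assuming numpy, for the matrix of Example (ii):

```python
import numpy as np

A = np.array([[1.0, -2.0],
              [3.0, -4.0]])
lam = np.linalg.eigvals(A)           # -1 and -2

rows = [(A[i, i], sum(abs(A[i, j]) for j in range(2) if j != i)) for i in range(2)]
cols = [(A[i, i], sum(abs(A[j, i]) for j in range(2) if j != i)) for i in range(2)]

for centre, radius in rows + cols:
    print(f"circle: centre {centre:+.0f}, radius {radius:.0f}")
for l in lam:
    # every eigenvalue lies in some row circle and in some column circle
    assert any(abs(l - c) <= r + 1e-12 for c, r in rows)
    assert any(abs(l - c) <= r + 1e-12 for c, r in cols)
```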
As a variation on the foregoing problem, where the evolution of $x(t)$ is continuous in time, we also look at the problem where $x(k)$ evolves discretely in time. Suppose $x(k)$, $k = 1, 2, \ldots$ satisfies
$$x(k + 1) = A\,x(k)$$
where $x(0)$ is assigned. Again we expand in the eigenvector basis:
$$x(k) = \sum_{i=1}^n c_i(k)\, x_i$$
where $c_i(k) = \langle y_i, x(k) \rangle$. To find an equation satisfied by $c_i(k)$ we take the inner product of $y_i$ with both sides of $x(k + 1) = A\,x(k)$ to obtain
$$\langle y_i, x(k + 1) \rangle = \langle y_i, A\,x(k) \rangle = \langle A^* y_i, x(k) \rangle = \lambda_i \langle y_i, x(k) \rangle,$$
i.e.,
$$c_i(k + 1) = \lambda_i\, c_i(k)$$
and therefore
$$x(k) = \sum_{i=1}^n \langle y_i, x(0) \rangle\, \lambda_i^k\, x_i$$
Using this formula we see that the evolution of $x(k)$ as $k$ increases depends on where the eigenvalues of $A$ lie.
What we have found then is this: stability for the problem $dx/dt = Ax$ requires the eigenvalues of $A$ to lie in the left half of the complex plane; stability for the problem $x(k + 1) = A\,x(k)$ requires the eigenvalues of $A$ to lie inside the unit circle. To see what these conditions have to do with one another, let $x(t)$ denote the solution to $dx/dt = Ax$. Then a simple Euler approximation to $x(t)$ satisfies
$$x(k + 1) = (I + \Delta t\, A)\, x(k)$$
The two solutions are
$$\sum_{i=1}^n \langle y_i, x(t = 0) \rangle\, e^{\lambda_i t}\, x_i$$
and
$$\sum_{i=1}^n \langle y_i, x(k = 0) \rangle\, (1 + \Delta t\, \lambda_i)^k\, x_i$$
Now the first thing to observe is that the approximation converges to $x(t)$ as $\Delta t \to 0$ where the limit is taken holding $k \Delta t = t$ fixed. The second thing to observe is that, even assuming the stability of the differential equation, the difference equation is not stable for all values of $\Delta t$. Indeed we see that the difference equation is stable iff, $\forall \lambda_i$, $i = 1, 2, \ldots, n$,
$$|1 + \Delta t\, \lambda_i|^2 < 1$$
or
$$2\, \mathrm{Re}\,\lambda_i + \Delta t\, |\lambda_i|^2 < 0$$
That it is possible to satisfy this by making $\Delta t$ sufficiently small, when $\mathrm{Re}\,\lambda_i < 0$, is due to the fact that $\mathrm{Im}\,\lambda_i$ is multiplied by $(\Delta t)^2$ whereas $\mathrm{Re}\,\lambda_i$ is multiplied by $\Delta t$.
It is ordinarily true that the eigenvalue whose real part is most negative sets a bound on how large $\Delta t$ may be. Indeed stability of the difference equation, in the case where the eigenvalues are real and negative, requires that $\forall \lambda_i$
$$-2 < \Delta t\, \lambda_i < 0$$
or
$$\Delta t < \frac{2}{|\lambda_i|}$$
So, the most negative eigenvalue, the eigenvalue associated with the term in $x(t)$ that dies out most rapidly, controls the size of $\Delta t$ in the difference approximation. In numerical work this is referred to as the stiffness problem. Problems where the real parts of the eigenvalues are widely separated, so that insignificant parts of their solutions, at least for $t > \epsilon$, $\epsilon$ small, control approximations to their solutions, are called stiff problems. In doing a calculation you can never get rid of the most negative eigenvalue due to the fact that numerical errors act like new initial conditions.
The results obtained in this lecture will be used in subsequent lectures to investigate problems
where we can establish the fact of a complete set of eigenvectors. Before turning to this we take
up a problem where the set of eigenvectors is not complete and where the use of generalized
eigenvectors is required.
To show how the solution to such a problem can be found, we set $n = 2$ and suppose that $\lambda_1$ is a double root of $\Delta(\lambda) = 0$. Then if $\dim \mathrm{Ker}\,(A - \lambda_1 I) = 1$, we write
$$Ax_1 = \lambda_1 x_1$$
$$Ax_2 = x_1 + \lambda_1 x_2$$
and
$$A^* y_1 = \bar{\lambda}_1 y_1 + y_2$$
$$A^* y_2 = \bar{\lambda}_1 y_2$$
where $\langle y_i, x_j \rangle = \delta_{ij}$. To determine $x(t)$ where $dx/dt = Ax$ we write $x(t) = c_1(t)\, x_1 + c_2(t)\, x_2$ where $c_i(t) = \langle y_i, x(t) \rangle$ and discover in the usual way that $c_1(t)$ and $c_2(t)$ satisfy:
$$\left\langle y_1, \frac{dx}{dt} \right\rangle = \frac{dc_1}{dt} = \langle y_1, Ax \rangle = \left\langle \bar{\lambda}_1 y_1 + y_2, x \right\rangle = \lambda_1 c_1 + c_2$$
and
$$\left\langle y_2, \frac{dx}{dt} \right\rangle = \frac{dc_2}{dt} = \langle y_2, Ax \rangle = \left\langle \bar{\lambda}_1 y_2, x \right\rangle = \lambda_1 c_2$$
whence
$$c_1 = e^{\lambda_1 t}\, c_1(t = 0) + \int_0^t e^{\lambda_1 (t - \tau)}\, c_2(\tau)\, d\tau$$
and
$$c_2 = e^{\lambda_1 t}\, c_2(t = 0)$$
whence
$$c_1 = e^{\lambda_1 t}\, c_1(t = 0) + t\, e^{\lambda_1 t}\, c_2(t = 0)$$
and
$$x(t) = \left\{ \langle y_1, x(t = 0) \rangle\, e^{\lambda_1 t} + \langle y_2, x(t = 0) \rangle\, t\, e^{\lambda_1 t} \right\} x_1 + \langle y_2, x(t = 0) \rangle\, e^{\lambda_1 t}\, x_2$$
What we see then is this: when a pair of eigenvectors is replaced by an eigenvector and a generalized eigenvector the purely exponential time dependence $e^{\lambda_1 t}$ and $e^{\lambda_2 t}$ is replaced by $e^{\lambda_1 t}$ and $t\, e^{\lambda_1 t}$. If $\lambda_1$ were repeated three times, assuming $n > 2$, the number of possibilities increases. We may have three eigenvectors, two eigenvectors and a generalized eigenvector or an eigenvector and two generalized eigenvectors. The first corresponds to a complete set of eigenvectors, viz., $\dim \mathrm{Ker}\,(A - \lambda_1 I) = 3$; the second is like the above, $\dim \mathrm{Ker}\,(A - \lambda_1 I) = 2$, and one of the eigenvectors, but not the other, must lie in $\mathrm{Im}\,(A - \lambda_1 I)$ in order that it lead to a generalized eigenvector. The third possibility is new, $\dim \mathrm{Ker}\,(A - \lambda_1 I) = 1$ and $\mathrm{Ker}\,(A - \lambda_1 I)$ lies inside $\mathrm{Im}\,(A - \lambda_1 I)$; the readers can satisfy themselves that the time dependence is now given by $e^{\lambda_1 t}$, $t\, e^{\lambda_1 t}$ and $\frac{1}{2} t^2 e^{\lambda_1 t}$.
A problem where generalized eigenvectors are required turns up in the study of a simple stripping cascade operated as follows: Let $M$ denote the heavy phase holdup in each stage of the cascade and suppose every $T$ units of time we transfer the heavy phase contents of stage $i$ to stage $i - 1$, taking $M$ units of product from stage 1, adding $M$ units of feed to stage $n$. By doing this we achieve a heavy phase throughput $L = M/T$. The light phase is run as before and strips the $n$ stages of the cascade for a period of time $T$. Indeed this is the way sugar is stripped out of sugar beets using water.

If $y_{in} = 0$ and $E = 1$ we can determine, using the Kremser equation, that the ordinary performance of such a stripping cascade is predicted by
$$\frac{x_{out}}{x_{in}} = \frac{1}{1 + S + S^2 + \cdots + S^n}$$
where $S = mV/L$ is the stripping factor. We propose to show that the ordinary operation can be greatly improved upon.
In the newly proposed method of running the separation cascade, our equations are
$$M \frac{dx_i}{dt} = V y_{i-1} - V y_i, \quad i = 1, \ldots, n$$
i.e.,
$$\frac{dx_i}{dt} = \frac{S}{T} \left( x_{i-1} - x_i \right), \quad i = 1, \ldots, n$$
where $x_n(t = 0) = x_{in}$ and $x_1(t = T) = x_{out}$. Setting $x = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}$ we have
$$\frac{dx}{dt} = \frac{S}{T}\, A\, x$$
where
$$A = \begin{pmatrix} -1 & 0 & 0 & \cdots \\ 1 & -1 & 0 & \cdots \\ 0 & 1 & -1 & \cdots \\ & & & \ddots \end{pmatrix}$$
This matrix has the eigenvalue $\lambda_1 = -1$ repeated $n$ times and, as $\dim \mathrm{Ker}\,(A - \lambda_1 I) = 1$, to it there corresponds only one independent eigenvector which we can take to be $x_1 = \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{pmatrix}$. This eigenvector initiates a chain of generalized eigenvectors $x_2, \ldots, x_n$ via
$$Ax_2 = x_1 + \lambda_1 x_2$$
$$Ax_3 = x_2 + \lambda_1 x_3$$
etc.
For $n = 2$ we have $x_1 = \begin{pmatrix} 0 \\ 1 \end{pmatrix}$, $x_2 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$ and, in the plain vanilla inner product, $y_1 = \begin{pmatrix} 0 \\ 1 \end{pmatrix}$, $y_2 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$. Putting this in our earlier formula, we get
$$x_1(t) = x_1(t = 0)\, e^{-\frac{S}{T} t}$$
and
$$x_2(t) = x_2(t = 0)\, e^{-\frac{S}{T} t} + \frac{S}{T}\, x_1(t = 0)\, t\, e^{-\frac{S}{T} t}$$
These formulas will take us through the startup period where for the first cycle we have $x_1(t = 0) = x_{in}$. Thereafter $x_1(t = 0)$ will be $x_2(t = T)$ for the preceding cycle. After some number of cycles we assume, and the reader can demonstrate, that the stripping cascade achieves a repetitive operation wherein $x_2(t = T)$, and indeed $x_1(t)$ and $x_2(t)$, $0 < t < T$, is repeated cycle after cycle. Then using $x_1(t = 0) = x_2(t = T)$ in the foregoing we get
$$\frac{x_{out}}{x_{in}} = \frac{1}{e^{2S} - S\, e^S}$$
in place of the ordinary result
$$\frac{1}{1 + S + S^2}$$
For a single stage the corresponding results are
$$\frac{1}{e^S}$$
and
$$\frac{1}{1 + S}$$
We present an alternative, solving $dx/dt = Ax$ by using the Laplace transformation, assuming the eigenvalues of $A$ to be distinct. To do this we write
$$(\lambda I - A)\, \mathrm{adj}\,(\lambda I - A) = \det(\lambda I - A)\, I$$
and observe that if $\lambda_i$ is a simple root of $\det(\lambda I - A) = 0$ then the columns of $\mathrm{adj}\,(\lambda_i I - A)$ are all proportional to $x_i$ while the columns of $\mathrm{adj}\left(\lambda_i I - A^T\right)$ are all proportional to $\bar{y}_i$. Because $\mathrm{adj}\left(\lambda_i I - A^T\right) = \left\{ \mathrm{adj}\,(\lambda_i I - A) \right\}^T$ we can write
$$\mathrm{adj}\,(\lambda_i I - A) = c_i\, x_i\, \bar{y}_i^T$$
Then, using
$$\frac{d}{d\lambda} \det(\lambda I - A) = \mathrm{tr}\left\{ \mathrm{adj}\,(\lambda I - A)\, \frac{d}{d\lambda}(\lambda I - A) \right\} = \mathrm{tr}\left\{ \mathrm{adj}\,(\lambda I - A) \right\}$$
we have
$$\left. \frac{d}{d\lambda} \det(\lambda I - A) \right|_{\lambda = \lambda_i} = \mathrm{tr}\left\{ c_i\, x_i\, \bar{y}_i^T \right\} = c_i\, \bar{y}_i^T x_i$$
To determine the solution to $dx/dt = Ax$ we take the Laplace transformation of both sides obtaining
$$s\, \mathcal{L}(x) - x(t = 0) = A\, \mathcal{L}(x)$$
and hence
$$(sI - A)\, \mathcal{L}(x) = x(t = 0)$$
whence
$$\mathcal{L}(x) = \frac{\mathrm{adj}\,(sI - A)}{\det(sI - A)}\, x(t = 0)$$
The right hand side has simple poles at $s = \lambda_1, \lambda_2, \ldots, \lambda_n$. Thus $\mathcal{L}(x)$ can be written
$$\mathcal{L}(x) = \sum_{i=1}^n \left. \frac{\mathrm{adj}\,(sI - A)}{\dfrac{d}{ds} \det(sI - A)} \right|_{s = \lambda_i} \frac{1}{s - \lambda_i}\, x(t = 0)$$
Then, using our formulas for the adjugate and for the derivative of the determinant, we have
$$\mathcal{L}(x) = \sum_{i=1}^n \frac{x_i\, \bar{y}_i^T\, x(t = 0)}{\bar{y}_i^T x_i}\, \frac{1}{s - \lambda_i} = \sum_{i=1}^n x_i\, \langle y_i, x(t = 0) \rangle\, \frac{1}{s - \lambda_i}$$
on requiring $\bar{y}_i^T x_i = \langle y_i, x_i \rangle = 1$ and hence
$$x(t) = \sum_{i=1}^n x_i\, \langle y_i, x(t = 0) \rangle\, e^{\lambda_i t}$$
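The two facts used along the way are easy to confirm numerically; here is a minimal sketch, assuming numpy, on the matrix of Example (ii):

```python
import numpy as np

# Check: adj(lam_i I - A) has columns proportional to x_i, and
# d/d lam det(lam I - A) = tr adj(lam I - A).
A = np.array([[1.0, -2.0],
              [3.0, -4.0]])

def adj2(M):                          # adjugate of a 2x2 matrix
    return np.array([[M[1, 1], -M[0, 1]],
                     [-M[1, 0], M[0, 0]]])

lam1 = -1.0
B = adj2(lam1 * np.eye(2) - A)
print(B)                              # both columns proportional to x_1 = (1, 1)^T

h = 1e-6                              # numerical derivative of det(lam I - A)
ddet = (np.linalg.det((lam1 + h) * np.eye(2) - A)
        - np.linalg.det((lam1 - h) * np.eye(2) - A)) / (2 * h)
assert np.isclose(ddet, np.trace(B), atol=1e-5)
```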
Frazer, Duncan and Collar in their book "Elementary Matrices" propose a very nice way to construct solutions to quite general systems of linear differential equations. We outline the essential idea here in the hope that this provokes some readers to go and look at this great old book. Let $f_{ij}(\lambda)$, $i, j = 1, \ldots, n$, be $n^2$ polynomials in $\lambda$. Then $f_{ij}\left(\dfrac{d}{dt}\right)$ is a polynomial differential operator and our problem is to determine solutions to the system of differential equations
$$f\left(\frac{d}{dt}\right) x(t) = 0$$
where $f(\lambda) = \lambda^n A_0 + \lambda^{n-1} A_1 + \cdots$ is called a lambda matrix. We let $F(\lambda)$ denote the adjugate of $f(\lambda)$, i.e., $F(\lambda) = \mathrm{adj}\, f(\lambda)$, and write
$$f(\lambda)\, F(\lambda) = \Delta(\lambda)\, I$$
where $\Delta(\lambda) = \det f(\lambda)$. Then if $\lambda_1$ is a root of $\Delta(\lambda) = 0$ of algebraic multiplicity $m_1$ we have
$$\Delta(\lambda_1) = 0$$
$$\Delta^{(1)}(\lambda_1) = 0$$
$$\cdots$$
$$\Delta^{(m_1 - 1)}(\lambda_1) = 0$$
$$\Delta^{(m_1)}(\lambda_1) \neq 0$$
where $\Delta^{(1)}(\lambda) = d\Delta(\lambda)/d\lambda$, etc. We can then find a set of solutions corresponding to $\lambda_1$ by observing first that
$$f\left(\frac{d}{dt}\right) \left\{ e^{\lambda t} F(\lambda) \right\} = e^{\lambda t}\, f(\lambda)\, F(\lambda) = e^{\lambda t}\, \Delta(\lambda)\, I$$
and then that
$$f\left(\frac{d}{dt}\right) \frac{d}{d\lambda} \left\{ e^{\lambda t} F(\lambda) \right\} = \frac{d}{d\lambda}\, f\left(\frac{d}{dt}\right) \left\{ e^{\lambda t} F(\lambda) \right\} = \left\{ t\, e^{\lambda t}\, \Delta(\lambda) + e^{\lambda t}\, \Delta^{(1)}(\lambda) \right\} I$$
etc. Hence the columns of
$$e^{\lambda_1 t}\, F(\lambda_1),$$
$$\left. \frac{d}{d\lambda} \left\{ e^{\lambda t} F(\lambda) \right\} \right|_{\lambda = \lambda_1},$$
$$\cdots$$
$$\left. \frac{d^{m_1 - 1}}{d\lambda^{m_1 - 1}} \left\{ e^{\lambda t} F(\lambda) \right\} \right|_{\lambda = \lambda_1}$$
satisfy
$$f\left(\frac{d}{dt}\right) x(t) = 0$$
Frazer, Duncan and Collar present the properties of the lambda matrix $f(\lambda)$ and its adjugate $F(\lambda)$ that are required to make this a workable method for writing the general solution to $f\left(\dfrac{d}{dt}\right) x(t) = 0$.
The stability of an equilibrium point to a small displacement requires the real parts of the eigenval-
ues of some matrix to be negative. To decide stability then, we must determine the eigenvalues of
this matrix and to do this we must first determine its characteristic polynomial and then the roots
of this polynomial. If the problem depends on parameters and the dimension is large this can be a
difficult calculation.
We would like to be able to determine the signs of the real parts of the eigenvalues, short of
determining the eigenvalues themselves, either by looking at the elements of the matrix or, if that
fails, by looking at the coefficients of its characteristic polynomial. Gerschgorin’s circle theorem
is a step in this direction but often it does not resolve the question; yet it always provides estimates
of the eigenvalues that can be refined and hence it is always helpful. In what follows we let n = 2
and 3 and state necessary and sufficient conditions in terms of the coefficients of the characteristic
polynomial that its roots, the eigenvalues, have negative real parts. The matrix is assumed to be
real.
A useful reference is Porter’s book: Stability Criteria for Linear Dynamical Systems. Besides
providing a nice way of looking at this problem, this beautiful little book has a simple derivation
of the Routh criteria.
For $n = 2$ the characteristic equation is
$$\lambda^2 - \Delta_1 \lambda + \Delta_2 = 0$$
where $\Delta_1 = T$ and $\Delta_2 = D$, $T$ and $D$ denoting trace and determinant. The necessary and sufficient condition that $\mathrm{Re}\,\lambda_1$ and $\mathrm{Re}\,\lambda_2$ be negative is: $T < 0$, $D > 0$.

For $n = 3$ the characteristic equation is
$$\lambda^3 - \Delta_1 \lambda^2 + \Delta_2 \lambda - \Delta_3 = 0$$
or
$$\lambda^3 - T \lambda^2 + S \lambda - D = 0$$
where now $S$ denotes the second invariant. The roots satisfy
$$\lambda_1 + \lambda_2 + \lambda_3 = T = \lambda_1 + 2x$$
$$\lambda_1 \lambda_2 + \lambda_2 \lambda_3 + \lambda_3 \lambda_1 = S = \lambda_1 (2x) + x^2 + y^2$$
$$\lambda_1 \lambda_2 \lambda_3 = D = \lambda_1 \left( x^2 + y^2 \right)$$
where on the right hand side we assume $\lambda_2 = \bar{\lambda}_3 = x + iy$. It is easy to see that if $\lambda_1$, $\lambda_2$ and $\lambda_3$ are negative or have negative real parts then $T < 0$, $S > 0$ and $D < 0$. So if $T$ is not negative, or $S$ is not positive or $D$ is not negative, we can conclude that not all of $\mathrm{Re}\,\lambda_1$, $\mathrm{Re}\,\lambda_2$ and $\mathrm{Re}\,\lambda_3$ can be negative.

Substituting $\lambda = x + iy$ into
$$\lambda^3 - T \lambda^2 + S \lambda - D = 0$$
and separating real and imaginary parts, we get
$$x^3 - 3xy^2 - T \left( x^2 - y^2 \right) + Sx - D = 0$$
and
$$3x^2 y - y^3 - T (2xy) + Sy = 0$$
and so, dividing the second by $y$ and using the result to eliminate $y^2$ in the first, we get
$$-8x^3 + 8Tx^2 - 2\left( S + T^2 \right) x + TS - D = 0$$
and this tells us: $T < 0$, $S > 0$, $D < 0$ and $TS - D < 0$ is sufficient that $x$ not be positive. But
$$TS - D = (\lambda_1 + \lambda_2 + \lambda_3)(\lambda_1 \lambda_2 + \lambda_2 \lambda_3 + \lambda_3 \lambda_1) - \lambda_1 \lambda_2 \lambda_3$$
$$= (\lambda_1 + 2x)\left( \lambda_1 (2x) + x^2 + y^2 \right) - \lambda_1 \left( x^2 + y^2 \right)$$
$$= \left( \lambda_1^2 + x^2 + y^2 \right)(2x) + (2x)^2 \lambda_1$$
and so $\lambda_1 < 0$ and $x < 0$ is sufficient that $TS - D < 0$. As $TS - D < 0$ if $\lambda_1 < 0$, $\lambda_2 < 0$ and $\lambda_3 < 0$ we discover: the necessary and sufficient condition that $\mathrm{Re}\,\lambda_1$, $\mathrm{Re}\,\lambda_2$ and $\mathrm{Re}\,\lambda_3$ all be negative is $T < 0$, $S > 0$, $D < 0$ and $TS - D < 0$.

The cubic has three real roots if $4S^3 - S^2 T^2 + 27D^2 + 4DT^3 - 18TSD \leq 0$; otherwise, i.e., if $4S^3 - S^2 T^2 + 27D^2 + 4DT^3 - 18TSD > 0$, there is one real root and a complex conjugate pair.

If our cubic equation depends on a parameter and the parameter changes, then: if it has three real roots, one changes sign on crossing the plane $D = 0$, $T < 0$, $S > 0$; if it has one real root and a complex conjugate pair, the real root changes sign on crossing the plane $D = 0$, $T < 0$, $S > 0$ but now $4S^3 - S^2 T^2 + 27D^2 + 4DT^3 - 18TSD > 0$; if it has one real root and a complex conjugate pair, the real part of the complex conjugate pair changes sign on crossing the surface $TS - D = 0$, $T < 0$, $S > 0$, $D < 0$.
The solution to
$$\frac{dx}{dt} = Ax$$
can be written as
$$x(t) = e^{tA}\, x(t = 0)$$
where
$$e^{tA} = I + tA + \frac{1}{2} t^2 A^2 + \frac{1}{6} t^3 A^3 + \cdots$$
and where the infinite sum converges to $e^{tA}$ regardless of any complications pertaining to the eigenvectors of $A$. However, there are questions regarding the rate of convergence of the series; yet if $t$ is small we might get a fair estimate of $x(t)$ by assuming $e^{tA} = I + tA$.

Suppose now that
$$\frac{dx}{dt} = (A + B)\, x$$
The solution is
$$x(t) = e^{t(A + B)}\, x(t = 0)$$
but $e^{t(A + B)}$ is not $e^{tA}\, e^{tB}$. In obtaining an estimate of $e^{t(A + B)}$ it is sometimes useful to observe that $e^{t(A + B)}$ and $e^{\frac{1}{2} tA}\, e^{tB}\, e^{\frac{1}{2} tA}$ agree through their first three terms in powers of $t$.
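The agreement through three terms shows up numerically as a third order error; here is a minimal sketch, assuming numpy and scipy, with stand-in matrices that do not commute:

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[0.0, 1.0],
              [-1.0, 0.0]])
B = np.array([[-1.0, 0.0],
              [0.0, -2.0]])

for t in (0.1, 0.05, 0.025):
    exact = expm(t * (A + B))
    split = expm(t * A / 2) @ expm(t * B) @ expm(t * A / 2)
    naive = expm(t * A) @ expm(t * B)
    print(t, np.linalg.norm(exact - split), np.linalg.norm(exact - naive))
    # the symmetric splitting error falls like t^3, the naive product like t^2
```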
6.11 A = A(t)

Now the matrix $A$ may depend on $t$:
$$\frac{dx}{dt} = A(t)\, x$$
Let $x_1(t), x_2(t), \ldots, x_n(t)$ denote a set of independent solutions and introduce $M(t)$, a fundamental solution matrix, via
$$M(t) = \begin{pmatrix} x_1(t) & x_2(t) & \cdots & x_n(t) \end{pmatrix}$$
Then $dM/dt = A(t)\, M$ and hence every fundamental solution matrix can be written $M(t)\, C$ where $\det C \neq 0$.

Suppose now that $A$ is periodic:
$$A(t + T) = A(t)$$
Then $M(t + T)$ is also a fundamental solution matrix and hence
$$M(t + T) = M(t)\, C.$$
Writing
$$C = e^{TR}$$
and putting $P(t) = M(t)\, e^{-tR}$, whereupon
$$P(t + T) = M(t + T)\, e^{-(t + T)R} = M(t)\, C\, e^{-(t + T)R} = M(t)\, e^{TR}\, e^{-(t + T)R} = M(t)\, e^{-tR} = P(t)$$
we have $M(t) = P(t)\, e^{tR}$ where $P$ is periodic. Now $M(t)$ is stable if the eigenvalues of $R$ have negative real parts. The simplest case is that in which the eigenvalues of $C$ are distinct, for then the eigenvectors of $C$ and $R$ coincide and the eigenvalues of $R$, denoted $\mu$, and of $C$, denoted $\lambda$, satisfy
$$\lambda = e^{T\mu}.$$
As an example,
$$\frac{d^2\psi}{dt^2} + \cos t\; \psi = 0$$
can be written
$$\frac{d}{dt} \begin{pmatrix} \psi \\ \dfrac{d\psi}{dt} \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ -\cos t & 0 \end{pmatrix} \begin{pmatrix} \psi \\ \dfrac{d\psi}{dt} \end{pmatrix}$$
1. Let $A = \begin{pmatrix} -1 & 1 \\ -1 & -2 \end{pmatrix}$ and show that its eigenvalues lie inside its Gerschgorin circles. Then let $x(t = 0) = \begin{pmatrix} 1 \\ 1 \end{pmatrix}$ and sketch the solution to
$$\frac{dx}{dt} = Ax$$
in the $x_1, x_2$ plane.

Let $A = \begin{pmatrix} -3/2 & 1/4 \\ 1 & -3/2 \end{pmatrix}$ and $A = \begin{pmatrix} -3 & -4 \\ 1 & 1 \end{pmatrix}$ and repeat the above calculations. In the last problem $\lambda_1 = -1$ is a double root to which corresponds only one independent eigenvector, $x_1 = \begin{pmatrix} -2 \\ 1 \end{pmatrix}$. A generalized eigenvector $x_2 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$ can be found so that $x_1, x_2$ is a basis.
2. Determine $\dfrac{x_{out}}{x_{in}}$ once a dynamic stripping cascade reaches a repetitive state if $\dfrac{1}{2}$ of the heavy phase on each stage is transferred to the stage below every $\dfrac{1}{2} T$ seconds. Is this better than transferring all of the heavy phase every $T$ seconds? What is the result in the limit of transferring $\dfrac{1}{n}$th of the heavy phase every $\dfrac{1}{n} T$ seconds as $n \to \infty$? Do this assuming a two stage cascade.
3. Determine $\dfrac{x_{out}}{x_{in}}$ for a three stage dynamic stripping cascade once a repetitive operation is established. The model is then
$$\frac{dx}{dt} = \frac{S}{T}\, A\, x, \quad 0 \leq t \leq T$$
where $S = \dfrac{mV}{L}$, $T = \dfrac{M}{L}$,
$$A = \begin{pmatrix} -1 & 0 & 0 \\ 1 & -1 & 0 \\ 0 & 1 & -1 \end{pmatrix}$$
and
$$x_1(t = 0) = x_2(t = T)$$
$$x_2(t = 0) = x_3(t = T)$$
$$x_3(t = 0) = x_{in}$$
4. The difference equation
$$x_{i+1} = c\, x_i\, (1 - x_i), \quad c > 0$$
is well known in the theory of deterministic chaos. Show that its constant solutions are $x_i = 0$ and $x_i = 1 - \dfrac{1}{c}$.

This is a simple model of population variation, $x_i$ being the population of a species in year $i$ scaled so that $0 < x_i < 1$. The interesting range of $c$ is then $0 < c < 4$ as $0 < x(1 - x) < \dfrac{1}{4}$ when $0 < x < 1$.

Show that the solution $x_i = 0$ is stable to small upsets if and only if $0 < c < 1$. Show that the solution $x_i = 1 - \dfrac{1}{c}$ is stable to small upsets if and only if $1 < c < 3$.

When $c = 3$ show that the eigenvalue of the linear approximation is $-1$. Because $(-1)(-1) = 1$ this leads to a stable period 2 solution, $x_{i+2} = x_i$, which takes the place of the unstable constant or period 1 solution, $x_{i+1} = x_i$. Determine the period 2 solution.
The range 3 < c < 4 is interesting. As c increases beyond 3 there is a region of period
2 solutions, then a region of period 4 solutions, then a region of period 8 solutions, etc. The
width of successive regions decreases geometrically until what is called deterministic chaos
sets in. This is the period doubling route to chaos and it can be observed on a hand calculator.
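The period doubling can just as well be observed in a few lines of code as on a hand calculator; here is a minimal sketch in plain Python that iterates the map past its transient and prints the attractor at several stand-in values of $c$:

```python
# Iterate x_(i+1) = c x_i (1 - x_i) and record the long-time attractor.
def attractor(c, n_transient=2000, n_keep=8):
    x = 0.3
    for _ in range(n_transient):
        x = c * x * (1.0 - x)
    orbit = []
    for _ in range(n_keep):
        x = c * x * (1.0 - x)
        orbit.append(round(x, 6))
    return sorted(set(orbit))

print(attractor(2.5))   # one value: 1 - 1/c = 0.6
print(attractor(3.2))   # two values: a period 2 solution
print(attractor(3.5))   # four values: period 4
print(attractor(3.9))   # many values: deterministic chaos
```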
5. The simple equilibrium stage model sketched below illustrates how a separation by chromatography works:

[Sketch: a cascade of equilibrium stages, each holding a carrier phase volume $V_c$ and an adsorbent phase volume $V_a$, swept by a carrier flow $Q$.]

Denote by $c$ and $a$ the compositions of a dilute solute in the carrier and in the adsorbent phases, where $V_c$ and $V_a$ denote the volumes of the phases. Assume phase equilibrium holds in each stage, viz., $c = Ka$, where strong binding corresponds to small values of $K$. The subscript $i$ denotes the stage. Then, scaling time by $\dfrac{V_c + \dfrac{V_a}{K}}{Q}$, i.e., writing
$$t = \frac{V_c + \dfrac{V_a}{K}}{Q}\, \theta$$
and denoting by $c$ the vector
$$c = \begin{pmatrix} c_0 \\ c_1 \\ c_2 \\ \vdots \end{pmatrix}$$
you have
$$\frac{dc}{d\theta} = A\, c$$
where
$$A = \begin{pmatrix} -1 & 0 & 0 & \cdots \\ 1 & -1 & 0 & \cdots \\ 0 & 1 & -1 & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix}$$
and where
$$c(\theta = 0) = \begin{pmatrix} c_0(\theta = 0) \\ 0 \\ 0 \\ \vdots \end{pmatrix}$$
i.e., initially $N$ moles of solute equilibrate in stage zero, the other stages and the inlet carrier being solute free; then the carrier is turned on at the volumetric flow rate $Q$.

The solute is swept through the cascade of stages and your job is to show that
$$\frac{c_i(\theta)}{c_0(\theta = 0)} = \frac{\theta^i\, e^{-\theta}}{i!}, \quad i = 0, 1, 2, \ldots$$
where
$$c_0(\theta = 0) = \frac{N}{V_c + \dfrac{V_a}{K}}$$
The distribution of solute over the stages at a given time $\theta$ is of most interest, viz., $c_i(\theta)$ vs $i$. Now $\dfrac{c_i(\theta)}{c_0(\theta = 0)}$ is the probability of finding a solute molecule in stage $i$ at time $\theta$, given it was in stage zero at time zero. Show that the average and variance of $i$, denoted $\bar{i}$ and $\sigma^2$, are given by
$$\bar{i} = \theta = \sigma^2$$
Then show that the speed of solute through the cascade, in stages per time, is
$$\frac{d\bar{i}}{dt} = \frac{Q}{V_c + \dfrac{V_a}{K}}$$
where $\dfrac{Q}{V_c}$ is the carrier speed. Thus the stronger the binding the slower the speed. Hence different solutes having different $K$'s move at different speeds.
6. The Stefan-Maxwell equations
$$\nabla y_i = \sum_{\substack{j=1 \\ j \neq i}}^n \frac{y_i \vec{N}_j - y_j \vec{N}_i}{c\, D_{ij}}$$
present us with a model for diffusion in an ideal gas at constant temperature and pressure, where $c$ is the mole density of the gas, $y_i$ and $\vec{N}_i$ are the mole fraction and mole flow rate per unit area of species $i$ and $D_{ij} = D_{ji} > 0$ is the diffusion coefficient in an ideal gas made up of species $i$ and $j$.

If diffusion takes place steadily in one spatial direction the vector notation is not necessary and we have
$$\frac{dy_i}{dz} = \sum_{\substack{j=1 \\ j \neq i}}^n \frac{y_i N_j - y_j N_i}{c\, D_{ij}}$$
where the $N_i$ are constants independent of $z$. This equation can be used to find the $N_i$ from measurements of the $y_i$ at the two ends of a diffusion path in a two bulb diffusion experiment where a long, small diameter tube provides a diffusion path between two large, well mixed bulbs of gas at different compositions.
We can study this experiment under the assumption that the compositions in the bulbs remain constant in time if the bulb volumes are large and the tube cross sectional area is small. Then for a ternary ideal gas, denoting $\begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix}$ by $y$, we can write the Stefan-Maxwell equations as
$$\frac{d}{dz}\, y = \frac{1}{c}\, B\, y$$
where $B$ is
$$B = \begin{pmatrix} \dfrac{N_2}{D_{12}} + \dfrac{N_3}{D_{31}} & -\dfrac{N_1}{D_{12}} & -\dfrac{N_1}{D_{31}} \\[8pt] -\dfrac{N_2}{D_{12}} & \dfrac{N_3}{D_{23}} + \dfrac{N_1}{D_{12}} & -\dfrac{N_2}{D_{23}} \\[8pt] -\dfrac{N_3}{D_{31}} & -\dfrac{N_3}{D_{23}} & \dfrac{N_1}{D_{31}} + \dfrac{N_2}{D_{23}} \end{pmatrix}$$
Let
$$\sigma_1 = \frac{1}{D_{12}} + \frac{1}{D_{31}}, \quad \delta_1 = \frac{1}{D_{12}} - \frac{1}{D_{31}}$$
$$\sigma_2 = \frac{1}{D_{23}} + \frac{1}{D_{12}}, \quad \delta_2 = \frac{1}{D_{23}} - \frac{1}{D_{12}}$$
$$\sigma_3 = \frac{1}{D_{31}} + \frac{1}{D_{23}}, \quad \delta_3 = \frac{1}{D_{31}} - \frac{1}{D_{23}}$$
and order the species so that $D_{31} > D_{12} > D_{23}$; then $\sigma_2 > \sigma_3 > \sigma_1 > 0$ and $\delta_2 > \delta_1 > 0 > \delta_3$.
Show that the invariants of $B$ are
$$I = \mathrm{tr}\, B = \sigma^T N,$$
$$I^2 - 4\, II = N^T \Delta\, N$$
and
$$III = \det B = 0$$
where
$$\sigma = \begin{pmatrix} \sigma_1 \\ \sigma_2 \\ \sigma_3 \end{pmatrix}, \quad N = \begin{pmatrix} N_1 \\ N_2 \\ N_3 \end{pmatrix}$$
and
$$\Delta = \begin{pmatrix} \delta_1^2 & -\delta_1 \delta_2 & -\delta_3 \delta_1 \\ -\delta_1 \delta_2 & \delta_2^2 & -\delta_2 \delta_3 \\ -\delta_3 \delta_1 & -\delta_2 \delta_3 & \delta_3^2 \end{pmatrix}$$
Then the eigenvalues of $B$ are $0$ and $\dfrac{I \pm \sqrt{I^2 - 4\, II}}{2}$, and the eigenvectors of $B$ and $B^T$ corresponding to the eigenvalue zero are $\begin{pmatrix} N_1 \\ N_2 \\ N_3 \end{pmatrix}$ and $\begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}$. Indeed it is worth observing that $II$ is proportional to $N_1 + N_2 + N_3$.
In the two bulb diffusion experiment the condition of constant pressure requires that $N_1 + N_2 + N_3 = 0$. Under this condition show that
$$I = \delta_1 N_2 - \delta_2 N_1$$
and
$$I^2 - 4\, II = I^2$$
Then the eigenvalues of $B$ are $0, 0, I$. Show that the $2 \times 2$ minors of $B$ are products of $N_1$ or $N_2$ or $N_3$ and $\dfrac{\delta_1 N_2}{D_{23}} - \dfrac{\delta_2 N_1}{D_{31}}$ and, assuming that this is not zero, show that the rank of $B$ is two. Then the eigenvalue zero is repeated but to it there corresponds but one eigenvector.

Show that to the eigenvalue zero there corresponds the eigenvector $\begin{pmatrix} N_1 \\ N_2 \\ N_3 \end{pmatrix}$ and the generalized eigenvector
$$\frac{1}{\dfrac{\delta_1 N_2}{D_{23}} - \dfrac{\delta_2 N_1}{D_{31}}} \begin{pmatrix} \dfrac{N_1}{D_{31}} \\[8pt] \dfrac{N_2}{D_{23}} \\[8pt] \delta_1 N_2 - \delta_2 N_1 - \dfrac{N_1}{D_{31}} - \dfrac{N_2}{D_{23}} \end{pmatrix}$$
and that to the eigenvalue $I$ there corresponds the eigenvector $\begin{pmatrix} \delta_1 \\ \delta_2 \\ \delta_3 \end{pmatrix}$.
Using this information show how to predict the composition in one bulb of a two bulb diffusion experiment in terms of the composition in the other bulb and the values of $N_1$, $N_2$, $N_3$, $D_{12}$, $D_{23}$ and $D_{31}$.

A full account of equimolar counter diffusion in an ideal gas can be found in H. L. Toor's 1957 paper "Diffusion in Three Component Gas Mixtures," A.I.Ch.E. J. 3, 198. He finds conditions where $N_1 = 0$ but $\dfrac{dy_1}{dz}$ is not zero, where $\dfrac{dy_1}{dz} = 0$ but $N_1$ is not zero and where $\mathrm{sgn}\, N_1 = \mathrm{sgn}\, \dfrac{dy_1}{dz}$.
7. If in Problem 6 reactions among the three species take place at one end of the diffusion path then stoichiometry requires $N_1 + N_2 + N_3 = 0$ and the diffusion process is again represented by a matrix that does not have a complete set of eigenvectors. If, instead of this, a single reaction
$$\nu_1 A_1 + \nu_2 A_2 + \nu_3 A_3 = 0$$
takes place there, the $N_i$ stand in the ratios of the stoichiometric coefficients. Suppose that the stoichiometric coefficients are such that the eigenvalues of $B$ are 0 and a complex conjugate pair. Then $y$ vs. $z$ will be a spiral in composition space.
8. Consider the equations
$$\frac{dx_i}{ds} = -y_i + x_i, \quad i = 1, 2, \ldots, n$$
where
$$y_i = \frac{P_i(T)}{P}\, x_i, \quad i = 1, 2, \ldots, n$$
and
$$\sum_{i=1}^n \frac{P_i(T)}{P}\, x_i = 1$$
where $P_i(T)$ denotes the vapor pressure of pure species $i$, with
$$P_1(T) < P_2(T) < \cdots < P_n(T)$$
and
$$\frac{dP_1}{dT} = P_1' > 0, \quad \frac{dP_2}{dT} = P_2' > 0, \quad \text{etc.}$$
The only rest states are the $n$ pure component states. Determine the stability of each such state to a small perturbation and show that only the state $x_n = 1 = y_n$ is stable.
9. The equations for the small transverse motions of a set of $n$ particles, each of mass $m$, equally spaced on a string of fixed tension, viz.,

[Sketch: $n$ particles on a stretched string with transverse displacements $y_1, y_2, \ldots, y_{n-1}, y_n$.]

follow from the Lagrangian
$$L = \frac{1}{2} m \left( \dot{y}_1^2 + \dot{y}_2^2 + \cdots + \dot{y}_n^2 \right) - \frac{1}{2} m\, \omega_0^2 \left( y_1^2 + (y_2 - y_1)^2 + \cdots + (y_n - y_{n-1})^2 + y_n^2 \right)$$
i.e., from
$$L = \frac{1}{2} m\, \dot{y}^T I\, \dot{y} - \frac{1}{2} m\, \omega_0^2\, y^T A\, y$$
where
$$y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix}$$
and
$$A = \begin{pmatrix} 2 & -1 & 0 & 0 & 0 & \cdots & 0 \\ -1 & 2 & -1 & 0 & 0 & \cdots & 0 \\ 0 & -1 & 2 & -1 & 0 & \cdots & 0 \\ & & & \text{etc.} & & & \end{pmatrix}$$
and lead to
$$m \frac{d^2 y}{dt^2} + m\, \omega_0^2\, A\, y = 0$$
The matrix $A$ is self adjoint in the plain vanilla inner product and so has $n$ orthogonal eigenvectors, denoted $x_1, x_2, \ldots, x_n$, and real eigenvalues denoted $\lambda_1, \lambda_2, \ldots, \lambda_n$. The generalized coordinates $q_1, q_2, \ldots, q_n$ are introduced via
$$y = x_1 q_1 + x_2 q_2 + \cdots + x_n q_n,$$
where $q_i = \langle x_i, y \rangle$.

Find the equations satisfied by $q_1, q_2, \ldots, q_n$ and show that each generalized coordinate executes a purely harmonic motion at frequency $\omega_i$ where $\omega_i^2 = \lambda_i\, \omega_0^2$. Such a motion is called a normal mode of vibration.

Let $n = 2$ and 3, determine the eigenvectors and eigenvalues of $A$ and sketch the configuration of the particles in each normal mode of vibration.
10. The matrix
$$f(\lambda) = \lambda^m A_0 + \lambda^{m-1} A_1 + \cdots + A_m$$
is called a lambda matrix. Its elements are polynomials in $\lambda$ of degree at most $m$. The latent roots of $f(\lambda)$ are the solutions of
$$\det f(\lambda) = 0$$
and the corresponding latent vectors satisfy
$$f(\lambda)\, x = 0$$
The higher order equations
$$\left( A_0 \frac{d^m}{dt^m} + A_1 \frac{d^{m-1}}{dt^{m-1}} + \cdots + A_m \right) x(t) = 0$$
and
$$\left( A_0 S^m + A_1 S^{m-1} + \cdots + A_m \right) x(k) = 0$$
where $S^1 x(k) = x(k + 1)$, $S^2 x(k) = x(k + 2)$, etc., can be turned into first order equations, i.e., equations in $\dfrac{d}{dt}$ and $S^1$ only.

Show that if $\lambda_1$ is a latent root of $f(\lambda)$ and $x_1$ is a corresponding latent vector then
$$e^{\lambda_1 t}\, x_1$$
satisfies the differential equation and
$$\lambda_1^k\, x_1$$
satisfies the difference equation.
11. For a one shell pass, two tube pass heat exchanger, viz.,

[Sketch: a shell stream at $T_2$ between $z = 0$ and $z = L$; the tube fluid enters at $T_1$ and returns at $T_3$.]

we have:
$$w_1 c_{p1} \frac{dT_1}{dz} = U_1 \pi D_1 \left\{ T_2 - T_1 \right\}$$
$$w_2 c_{p2} \frac{dT_2}{dz} = U_1 \pi D_1 \left\{ T_1 + T_3 - 2T_2 \right\}$$
$$-w_1 c_{p1} \frac{dT_3}{dz} = U_1 \pi D_1 \left\{ T_2 - T_3 \right\}$$
where $w_1$ and $w_2$ denote the mass flow rates of the tube and the shell fluids, and where
$$T_1(z = 0) = T_{1\,in}$$
and
$$T_2(z = 0) = T_{2\,in}$$
Write this
$$\begin{pmatrix} w_1 c_{p1} & 0 & 0 \\ 0 & w_2 c_{p2} & 0 \\ 0 & 0 & -w_1 c_{p1} \end{pmatrix} \frac{d}{dz} \begin{pmatrix} T_1 \\ T_2 \\ T_3 \end{pmatrix} = U_1 \pi D_1 \begin{pmatrix} -1 & 1 & 0 \\ 1 & -2 & 1 \\ 0 & 1 & -1 \end{pmatrix} \begin{pmatrix} T_1 \\ T_2 \\ T_3 \end{pmatrix}$$
i.e.,
$$\frac{d}{dz} \begin{pmatrix} T_1 \\ T_2 \\ T_3 \end{pmatrix} = \begin{pmatrix} -a & a & 0 \\ b & -2b & b \\ 0 & -a & a \end{pmatrix} \begin{pmatrix} T_1 \\ T_2 \\ T_3 \end{pmatrix}$$
where
$$a = \frac{U_1 \pi D_1}{w_1 c_{p1}}$$
and
$$b = \frac{U_1 \pi D_1}{w_2 c_{p2}}$$
Solve this and impose the turn around condition
$$T_1(z = L) = T_3(z = L)$$
12. By reversing the direction of the shell flow, a second one shell pass, two tube pass heat
exchanger configuration is obtained. Repeat the calculation in Problem 11 for this second
configuration. Sketch the temperatures vs z in the two configurations if the shell side is hot
and the tube side is cold. Show that the tube side temperature cannot cross the shell side
temperature in the first configuration. In the second configuration no such restriction obtains
and T3 vs z need not be monotone.
13. For a two shell pass, four tube pass heat exchanger, viz.,

[Sketch: a two shell pass, four tube pass heat exchanger with streams $T_1, T_2, \ldots, T_6$ between $z = 0$ and $z = L$.]

we have:
$$w_1 c_{p1} \frac{dT_1}{dz} = U_1 \pi D_1 \left( T_2 - T_1 \right)$$
$$-w_2 c_{p2} \frac{dT_2}{dz} = U_1 \pi D_1 \left( T_1 + T_3 - 2T_2 \right)$$
etc.
Denote the matrix on the RHS by $A$. It is block diagonal and its diagonal blocks turn up in Problem 11.

Find the eigenvalues and eigenvectors of $A$ and write a formula in terms of six undetermined constants for the dependence of $T_1, T_2, \ldots, T_6$ on $z$. It is only the values of these constants that make $T_1, T_2, T_3$ and $T_4, T_5, T_6$ interdependent.

Find $T_{2\,out}$ and $T_{6\,out}$ in terms of $T_{1\,in}$ and $T_{5\,in}$. Sketch the temperatures vs $z$ if the shell side is hot and the tube side is cold.
14. A one shell pass, one tube pass heat exchanger, i.e., a simple double pipe heat exchanger, is often built using $n$ small diameter pipes in place of one large diameter pipe:

[Sketch: $n$ parallel pipes carrying $T_1, T_2, \ldots, T_n$ inside a shell at temperature $T_{n+1}$, between $z = 0$ and $z = L$.]

The equations are
$$\frac{1}{n}\, w c_p \frac{dT_1}{dz} = U \pi D \left( T_{n+1} - T_1 \right)$$
$$\vdots$$
$$\frac{1}{n}\, w c_p \frac{dT_n}{dz} = U \pi D \left( T_{n+1} - T_n \right)$$
$$-w_s c_{ps} \frac{dT_{n+1}}{dz} = U \pi D \left( T_1 + T_2 + \cdots + T_n - n T_{n+1} \right)$$
where $w$ denotes the total pipe side flow and $a$ is $n$ times as large as before.

The pipe side flow is already equally divided over the $n$ pipes. If the pipe side inlet temperatures are also the same, show that all but two of the $n + 1$ constants in the solution must be zero. These are the constants corresponding to the eigenvalue $-a$, which is repeated $n - 1$ times. This tells us that the temperature in each pipe is the same as it is in all other pipes. Under this condition the model reduces to the model of a simple double pipe heat exchanger if $\dfrac{1}{n}$ of the shell flow is assigned to each pipe.
15. For a simple double pipe heat exchanger, run in counterflow or in parallel flow,

[Sketch: the two configurations of a double pipe exchanger carrying $T_1$ and $T_2$.]

we introduce the column vector $\begin{pmatrix} T_1 \\ T_2 \end{pmatrix}$. In the counterflow configuration we have
$$\frac{d}{dz} \begin{pmatrix} T_1 \\ T_2 \end{pmatrix} = \begin{pmatrix} -a & a \\ -b & b \end{pmatrix} \begin{pmatrix} T_1 \\ T_2 \end{pmatrix}$$
where
$$a = \frac{U \pi D}{w_1 c_{p1}} \quad \text{and} \quad b = \frac{U \pi D}{w_2 c_{p2}}$$
The eigenvalues and the eigenvectors of the matrix on the RHS are
$$0, \begin{pmatrix} 1 \\ 1 \end{pmatrix} \quad \text{and} \quad -a + b, \begin{pmatrix} a \\ b \end{pmatrix}$$
Using these produce the usual formulas for this heat exchanger.

The case where $a = b$ leads to a double eigenvalue to which corresponds only one independent eigenvector. Work out this case using a generalized eigenvector.
16. In the configuration sketched below a hot feed and a cold feed, each of flow rate $2W$, are split into streams of flow rate $W$ that exchange heat across plane walls:

[Sketch: four streams $T_1, T_2, T_3, T_4$, each of flow rate $W$, between $z = 0$ and $z = L$; the feeds enter at flow rate $2W$ and leave as $T_{cold\,out}$ and $T_{hot\,out}$.]

The performance is measured by
$$e^{-\frac{(UA)_{eff}}{w c_p}} = \frac{T_{hot\,out} - T_{cold\,out}}{T_{hot\,in} - T_{cold\,in}}$$
Suppose the internal heat transfer coefficient is $U$ and the area of the plane wall separating each stream is $A$. Show that this formula obtains and determine $(UA)_{eff}$ in terms of $U$ and $A$.
17. To see what happens when a condition such as $\sum x_i = 1$ must be satisfied in a problem where the stability of an equilibrium point is being investigated, let
$$\frac{dx_1}{dt} = -f_1(x_1, x_2) + x_1$$
and
$$\frac{dx_2}{dt} = -f_2(x_1, x_2) + x_2$$
where
$$f_1(x_1, x_2) + f_2(x_1, x_2) = 1 \qquad (*)$$
whenever $x_1 + x_2 = 1$.

Let $x_{01}, x_{02}$ be an equilibrium point and let $x_1 = x_{01} + \xi_1$, $x_2 = x_{02} + \xi_2$ be a small excursion, where $x_{01} + x_{02} = 1$ and $\xi_1 + \xi_2 = 0$.

What does condition $(*)$ tell us about the eigenvalues and eigenvectors of the Jacobian matrix? What is the motion when
$$\begin{pmatrix} \xi_1(t = 0) \\ \xi_2(t = 0) \end{pmatrix} \propto \begin{pmatrix} 1 \\ -1 \end{pmatrix}$$
and is it what would be predicted by the single equation
$$\frac{dx_1}{dt} = -f_1(x_1, 1 - x_1) + x_1\,?$$
18. A rider on a merry-go-round throws a ball at another rider directly opposite. The radius is $R$, the angular velocity is $\vec{w} = w\, \vec{k}$ and the speed of the ball is initially $V$.

The motion of the ball is viewed by an observer at the center of the merry-go-round and rigidly fixed to it. Under force free conditions the equation for the motion of the ball is
$$\vec{a} = -2\, \vec{w} \times \vec{v} - \vec{w} \times \left( \vec{w} \times \vec{r} \right)$$
where $\vec{r}(t = 0) = R\, \vec{i}$ and $\vec{v}(t = 0) = -V\, \vec{i}$.

Let $\vec{r} = x\, \vec{i} + y\, \vec{j}$, write the equations for the motion of the ball and put them in the form
$$\frac{d}{dt} \begin{pmatrix} x \\ \dfrac{dx}{dt} \\ y \\ \dfrac{dy}{dt} \end{pmatrix} = \begin{pmatrix} 0 & 1 & 0 & 0 \\ w^2 & 0 & 0 & 2w \\ 0 & 0 & 0 & 1 \\ 0 & -2w & w^2 & 0 \end{pmatrix} \begin{pmatrix} x \\ \dfrac{dx}{dt} \\ y \\ \dfrac{dy}{dt} \end{pmatrix}$$
Solve this, noticing that the matrix on the RHS has two double eigenvalues and to each there corresponds only one independent eigenvector.

Find the motion of the ball as seen by an observer fixed to the ground at the center of the merry-go-round. The equation of motion is then
$$\vec{a} = 0$$
where $\vec{r}(t = 0) = R\, \vec{i}$ and $\vec{v}(t = 0) = -V\, \vec{i} + wR\, \vec{j}$.

This might explain why the physics of the problem requires that the observer fixed to the merry-go-round find a pair of double eigenvalues each corresponding to a single eigenvector.
19. Suppose we have a system of particles whose state is given by generalized coordinates $q_1, \ldots, q_n$ and momenta $p_1, \ldots, p_n$ satisfying Hamilton's equations
$$\frac{dq_i}{dt} = \frac{\partial H}{\partial p_i}$$
and
$$\frac{dp_i}{dt} = -\frac{\partial H}{\partial q_i}$$
where, for small motions about an equilibrium point,
$$H = \frac{1}{2} \sum \sum Q_{ij}\, q_i q_j + \frac{1}{2} \sum \sum P_{ij}\, p_i p_j$$
Then we have
$$\frac{dq_i}{dt} = \sum P_{ij}\, p_j$$
and hence
$$\frac{d^2 q_i}{dt^2} = \sum P_{ij} \frac{dp_j}{dt} = \sum P_{ij} \left( -\frac{\partial H}{\partial q_j} \right) = -\sum \sum P_{ij}\, Q_{jk}\, q_k$$
i.e.,
$$\frac{d^2 q}{dt^2} = -W q, \quad W = PQ$$
The normal modes of vibration correspond to the solutions of the eigenvalue problem
$$W q_i = w_i^2\, q_i$$
Then if $w_i^2, q_i$ and $w_j^2, q_j$ are two solutions to the eigenvalue problem, prove
$$q_i^T\, Q\, q_j = 0$$
whenever $w_i^2 \neq w_j^2$. To do this, notice that
$$Q\, P\, Q\, q = w^2\, Q\, q$$
20. For a three stage stripping column, where we add $H$ units of solution to the top every $T$ units of time, we have
$$\frac{dx_3}{dt} = S\, (x_2 - x_3)$$
$$\frac{dx_2}{dt} = S\, (x_1 - x_2)$$
$$\frac{dx_1}{dt} = -S x_1$$
where $t$ is scaled by $T$, $S = \dfrac{mV}{L}$ and $L = \dfrac{H}{T}$. The $x$'s are scaled by $x_{in}$.

Plot $x_1$, $x_2$ and $x_3$ vs $t$ for the first few cycles where
$$x_3(t = 0) = 1$$
$$x_2(t = 0) = 1 \quad \text{(first cycle)}$$
$$x_1(t = 0) = 1$$
and
$$x_3(t = 0) = 1$$
$$x_2(t = 0) = x_3(t = 1) \ \text{of the cycle before} \quad \text{(thereafter)}$$
$$x_1(t = 0) = x_2(t = 1) \ \text{of the cycle before}$$
Show that in the repetitive state
$$\frac{x_{out}}{x_{in}} = \frac{1}{e^{3S} - 2S\, e^{2S} + \dfrac{1}{2} S^2 e^S}$$
and notice that the eigenvector and generalized eigenvectors used along the way are orthogonal in the plain vanilla inner product.
21. If
$$A = \begin{pmatrix} -1 & -1 \\ 1 & -1 \end{pmatrix} \quad \text{and} \quad \langle y, x \rangle = \bar{y}^T x$$
determine $A^*$. Observe that $A \neq A^*$ but $AA^* = A^*A$. If $AA^* = A^*A$ in one inner product, is $AA^* = A^*A$ in another inner product?
22. We have a set of parallel channels of rectangular cross section through which hot and cold fluids flow, viz.,

[Sketch: four parallel channels between $z = 0$ and $z = L$, carrying cold, hot, cold, hot streams with temperatures $T_1(z), T_2(z), T_3(z), T_4(z)$ and flow rates $W_{cold}$, $W_{hot}$, $W_{cold}$, $W_{hot}$.]

The cold flow rates are denoted $W_{cold}$, the hot flow rates are denoted $W_{hot}$. All the cold side heat transfer coefficients are the same, so too all the hot side heat transfer coefficients. Hence all the $U$'s are the same.

Denoting $\dfrac{W_{cold}\, c_{p\,cold}}{W_{hot}\, c_{p\,hot}}$ by $f$ and $\dfrac{U b}{W_{cold}\, c_{p\,cold}}$ by $\beta$ we have
$$\frac{d}{dz}\, T = \beta\, A\, T$$
where $T = \begin{pmatrix} T_1 \\ T_2 \\ T_3 \\ T_4 \end{pmatrix}$,
$$A = \begin{pmatrix} -1 & 1 & 0 & 0 \\ f & -2f & f & 0 \\ 0 & 1 & -2 & 1 \\ 0 & 0 & f & -f \end{pmatrix}$$
and
$$T(z = 0) = \begin{pmatrix} T_{cold\,in} \\ T_{hot\,in} \\ T_{cold\,in} \\ T_{hot\,in} \end{pmatrix}$$
Our aim is a formula of the form
$$\frac{(T_{hot} - T_{cold})_{out}}{(T_{hot} - T_{cold})_{in}} = e^{-\frac{(UA)_{eff}\, (f + 1)}{2\, W_{cold}\, c_{p\,cold}}}$$
Let $D = \mathrm{diag}\,(1, f, 1, f)$ and denote $A(f = 1)$ by $A_1$, where $A_1 = A_1^T$; then observe that $A = D A_1$.

Show that the eigenvalues of $A$ are real and not positive and that its eigenvectors are orthogonal in the inner product where $G = D^{-1}$.
Hence, write
$$T(z = L) = \sum \left\langle x_i, T(z = 0) \right\rangle_G\, e^{\lambda_i \frac{U b L}{W_{cold}\, c_{p\,cold}}}\, x_i$$
Work out the characteristic equation
$$\lambda^4 + 3(1 + f)\, \lambda^3 + 2\left( 1 + 3f + f^2 \right) \lambda^2 + 2f (1 + f)\, \lambda + 0 = 0$$
and conclude:

eigenvalues:
$$\lambda_1 = 0, \quad \lambda_2 = -f - 1 + \sqrt{f^2 + 1}, \quad \lambda_3 = -f - 1, \quad \lambda_4 = -f - 1 - \sqrt{f^2 + 1}$$
unnormalized eigenvectors:
$$x_1 = \begin{pmatrix} 1 \\ 1 \\ 1 \\ 1 \end{pmatrix}, \quad x_2 = \begin{pmatrix} 1 \\ -f + \sqrt{f^2 + 1} \\ \dfrac{1 - \sqrt{f^2 + 1}}{f} \\ -1 \end{pmatrix}, \quad x_3 = \begin{pmatrix} 1 \\ -f \\ -f \\ f^2 \end{pmatrix}, \quad x_4 = \begin{pmatrix} 1 \\ -f - \sqrt{f^2 + 1} \\ \dfrac{1 + \sqrt{f^2 + 1}}{f} \\ -1 \end{pmatrix}$$
Then derive a formula for $(UA)_{eff}$ in terms of $U$. You worked out the case $f = 1$ in Problem 16.
23. You are going for a walk on a network of $N$ points. A point is denoted $i$, $i = 1, \ldots, N$.

Each point is linked to others and we set $\ell_{ij} = 1$ if there is a one step path from point $j$ to point $i$, otherwise $\ell_{ij} = 0$.

Denote by $\ell_j$ the number of one step paths from point $j$ to the points of the network, viz.,
$$\ell_j = \sum_i \ell_{ij}, \quad \ell_j \neq 0$$
and assume on taking a step from point $j$ the possible destinations are chosen with equal probability, $\dfrac{1}{\ell_j}$. Thus the probability of the step $j \to i$ is
$$p_{ij} = \frac{1}{\ell_j} > 0 \quad \text{if } \ell_{ij} = 1, \qquad p_{ij} = 0 \quad \text{if } \ell_{ij} = 0$$
Denoting by $p_i^{(m)}$ the probability of being at point $i$ after $m$ steps, we have
$$p_i^{(m+1)} = \sum_j p_{ij}\, p_j^{(m)}$$
and hence
$$p^{(m+1)} = P\, p^{(m)}$$
where
$$\begin{pmatrix} 1 & 1 & \cdots & 1 \end{pmatrix} p^{(m)} = 1$$
Investigate when
$$\lim_{m \to \infty} p^{(m)} = p,$$
independent of $p^{(0)}$.
A chemostat is a vessel in which a population of cells grows. We denote by $V$ the volume of the vessel and assume the conditions therein to be spatially uniform. Then if $n$ denotes the number density of cells in the vessel and $W$ denotes the volume flow into and out of the vessel, the number of cells in the vessel satisfies
$$V \frac{dn}{dt} = -W n + k n V$$
where $k$ is the growth constant and where the cells are not fed but grow from an initial injection which establishes $n(t = 0)$. Because we have
$$\frac{dn}{dt} = \left( -\frac{W}{V} + k \right) n,$$
if $k > \dfrac{W}{V}$ the cell culture grows without bound, whereas if $k < \dfrac{W}{V}$ it washes out.
To get a more interesting model we make the simple assumption that the value of k, which tells
us the rate of cell multiplication, instead of being a constant, depends on the concentration of a
single limiting nutrient. We let c denote this concentration and write k = k (c). Assuming that the
nutrient is consumed only when the cell population grows and that it must be fed to the chemostat, we write

$$V \frac{dc}{dt} = W c_{in} - W c - \nu k n V$$

or

$$\frac{dc}{dt} = \frac{W}{V}\{c_{in} - c\} - \nu k n$$

where $c_{in}$ is the nutrient concentration in the feed and $\nu$ is a stoichiometric coefficient. The steady solutions are then given by $n = 0$, $c = c_{in}$, the washout solution, and by

$$n = \frac{c_{in} - c}{\nu}, \qquad k(c) = \frac{W}{V}.$$

We take k(c) to increase with c to a limiting value. Then as long as $W > V k(c_{in})$ we find only the washout solution, $n = 0$, $c = c_{in}$. The point $W = V k(c_{in})$ is a branch point; as W passes through $V k(c_{in})$ a new solution branches off the washout solution, and we have the following steady state diagram:

[Figure: steady state diagram showing the washout branch and the new branch appearing at $W = V k(c_{in})$.]
To establish the stability of these steady state solution branches, we investigate what happens to a small excursion from a steady solution denoted $n_0$, $c_0$. Writing our model

$$\frac{dn}{dt} = f(n, c) = \left( -\frac{W}{V} + k(c) \right) n$$

and

$$\frac{dc}{dt} = g(n, c) = \frac{W}{V}(c_{in} - c) - \nu k(c)\, n$$

we find its linear approximation near $n_0$, $c_0$, in terms of the small displacements $\xi$ and $\eta$, to be

$$\frac{d}{dt}\begin{pmatrix} \xi \\ \eta \end{pmatrix} = \begin{pmatrix} f_n & f_c \\ g_n & g_c \end{pmatrix}\begin{pmatrix} \xi \\ \eta \end{pmatrix} = \begin{pmatrix} -\frac{W}{V} + k(c_0) & k'(c_0)\, n_0 \\ -\nu k(c_0) & -\frac{W}{V} - \nu k'(c_0)\, n_0 \end{pmatrix}\begin{pmatrix} \xi \\ \eta \end{pmatrix} = J \begin{pmatrix} \xi \\ \eta \end{pmatrix}.$$
At the washout solution, where $n_0 = 0$ and $c_0 = c_{in}$, the Jacobian matrix is

$$J = \begin{pmatrix} -\frac{W}{V} + k(c_{in}) & 0 \\ -\nu k(c_{in}) & -\frac{W}{V} \end{pmatrix}$$

whereas, at a non washout solution, where $\frac{W}{V} = k(c_0)$, our Jacobian matrix is

$$J = \begin{pmatrix} 0 & k'(c_0)\, n_0 \\ -\nu \frac{W}{V} & -\frac{W}{V} - \nu k'(c_0)\, n_0 \end{pmatrix}.$$
Hence if $W < V k(c_{in})$ the new branch is stable whereas the washout branch is unstable. If $W > V k(c_{in})$ the washout branch is stable. This is true because the eigenvalues of the Jacobian matrix on the new branch are $-\nu k'(c_0)\, n_0$ and $-\frac{W}{V}$ whereas on the washout branch they are $-\frac{W}{V} + k(c_0)$ and $-\frac{W}{V}$. Indeed as W decreases and passes through $V k(c_{in})$ the washout solution loses its stability while the new solution picks up the lost stability of the washout solution. We observe that at the branch point an eigenvalue vanishes, i.e., the branch point is the point where the determinant of the Jacobian matrix vanishes. This corresponds to passing from the fourth to the third quadrant in the plane whose axes are the determinant and the trace of the Jacobian matrix.
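A sketch of the stability computation with a concrete k(c); the Monod form and all the parameter values below are assumptions made for illustration.

    import numpy as np

    kmax, K, nu, cin, V = 1.0, 0.5, 2.0, 4.0, 1.0
    k = lambda c: kmax * c / (K + c)          # assumed Monod growth law
    kp = lambda c: kmax * K / (K + c) ** 2    # k'(c)

    for W in [0.2, 1.2]:                      # below and above V*k(cin)
        if W < V * k(cin):                    # the new branch exists
            c0 = K * (W / V) / (kmax - W / V) # solves k(c0) = W/V
            n0 = (cin - c0) / nu
            J = np.array([[0.0, kp(c0) * n0],
                          [-nu * W / V, -W / V - nu * kp(c0) * n0]])
            print(f"W={W}: new branch eigenvalues {np.linalg.eigvals(J)}")
        print(f"W={W}: washout eigenvalues {(-W/V + k(cin), -W/V)}")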
The reader may wish to rework this problem after adding a term to account for cell metabolism, i.e., $-\mu n$, $\mu > 0$, and specifying $k(c) = \beta c$, so that

$$\frac{dc}{dt} = \frac{W}{V}(c_{in} - c) - (\nu \beta c + \mu)\, n$$
This is a model problem having a long history in chemistry and chemical engineering. There are
many variations corresponding to many ways of making the reaction speed itself up. We assume
the reaction is autothermal. The rate of a chemical reaction is ordinarily a strongly increasing
function of the temperature at which the reaction takes place. This leads to a positive feedback
when a reaction releases heat, for this heat then speeds up the reaction. This feedback makes
the problem interesting, even in the simplest case, and carrying out the reaction in a stirred tank
reactor produces a very simple problem. The model is plain vanilla, retaining only the Arrhenius
temperature dependence of the chemical rate coefficient, and this in a simplified form. Our work
is a part of what can be found in Poore’s paper “A Model Equation Arising from Chemical Reactor
Theory” (Arch. Rational Mech. Anal. 52, 358 (1973)).
Before we turn to this problem we present a brief reminder, setting n = 2 so that $x = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}$ and $A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}$. The solution to

$$\frac{dx}{dt} = A x$$

where x(t=0) is assigned is

$$x = \left\langle y_1, x(t=0) \right\rangle e^{\lambda_1 t} x_1 + \left\langle y_2, x(t=0) \right\rangle e^{\lambda_2 t} x_2$$

where $\{x_1, x_2\}$ and $\{y_1, y_2\}$ are biorthogonal sets of vectors in the inner product being used to write the solution, $x_1$ and $x_2$ being eigenvectors of A, $y_1$ and $y_2$ being eigenvectors of $A^*$. The corresponding eigenvalues, $\lambda_1$ and $\lambda_2$, satisfy

$$\lambda^2 - (\operatorname{tr} A)\, \lambda + \det A = 0$$

where $\operatorname{tr} A = a_{11} + a_{22}$ and $\det A = a_{11} a_{22} - a_{21} a_{12}$.
We assume that A is real and we observe that the qualitative behavior of x(t) differs according to where the point (det A, tr A) lies in the det A - tr A plane. The algebraic signs of the eigenvalues or their real parts are: +,+ in the first quadrant; +,- in the second and third quadrants; -,- in the fourth quadrant. The fourth quadrant is divided into two regions by the curve $(\operatorname{tr} A)^2 - 4 \det A = 0$. Above the curve the two eigenvalues are complex conjugates having a negative real part, below it they are negative real numbers. The path x(t) vs t differs in shape according to where the point (det A, tr A) lies. If it lies below $(\operatorname{tr} A)^2 - 4 \det A = 0$, x(t) is the sum of two vectors
each remaining fixed in direction, their lengths shrinking exponentially. If (det A, tr A) lies above $(\operatorname{tr} A)^2 - 4 \det A = 0$, x(t) returns to 0 on a spiral path. To see this we write $\lambda_2 = \bar\lambda_1$, $x_2 = \bar x_1$, $y_2 = \bar y_1$; then because x(t=0) is real, x(t) is given by

$$x(t) = 2\,\mathrm{Re}\left\{ \left\langle y_1, x(t=0) \right\rangle e^{\lambda_1 t}\, x_1 \right\}$$

and, on writing $\langle y_1, x(t=0) \rangle = \rho e^{i\phi}$, this is

$$x(t) = 2\rho\, e^{(\mathrm{Re}\,\lambda_1)\, t}\left[ \cos\left( \mathrm{Im}\,\lambda_1\, t + \phi \right) \mathrm{Re}\, x_1 - \sin\left( \mathrm{Im}\,\lambda_1\, t + \phi \right) \mathrm{Im}\, x_1 \right].$$

Hence Re $\lambda_1$ tells us the rate of decay of the spiral, Im $\lambda_1$ tells us its frequency of revolution and Re $x_1$ and Im $x_1$ determine its shape.
The point (det A, trA) can leave the fourth quadrant in two ways, either by crossing the line
det A = 0 or by crossing the line trA = 0. We call the first instance an exchange of stability, the
second a Hopf bifurcation.
We turn now to the stirred tank reactor. Reactants are fed to the tank and products are removed
along with some of the unused reactants and what we have is an autothermal process controlled by
heat loss and reactant loss. We determine the steady states of the reactor and study their stability.
We will find that the point (det A, trA) tells us all we want to know about the reactor close to a
steady solution. The location of this point depends on the input variables to the problem and, as
the values of these variables change, it may leave the fourth quadrant by crossing either the line
det A = 0 or the line trA = 0.
We write a simple model assuming that an exothermic, first order decomposition of a reactant in the feed stream takes place in a tank maintained spatially uniform by sufficient stirring. It is

$$V \frac{dc}{dt} = q c_{in} - q c - k c V$$

and

$$V \rho c_P \frac{dT}{dt} = q \rho c_P T_{in} - q \rho c_P T + \{-\Delta H\}\, k c V - UA (T - T_c)$$
where c denotes the concentration of the reactant and T denotes the temperature. The tank is
equipped with a heat exchanger to remove the heat released by the reaction, and this explains the
heat sink appearing as the fourth term on the right hand side of the second equation. The preceding
term is the heat source, as −∆H > 0. The density, the heat capacity, etc. are taken to be constants
while the chemical reaction rate coefficient is specified by the Arrhenius formula:

$$k = A e^{-E/RT}.$$

This we write as

$$k = k_{in}\, e^{-\frac{E}{R}\left( \frac{1}{T} - \frac{1}{T_{in}} \right)} = k_{in}\, e^{\frac{y}{1 + \frac{RT_{in}}{E} y}}$$

where $y = \frac{E}{RT_{in}} \frac{T - T_{in}}{T_{in}}$, and then if $\frac{RT_{in}}{E}\, y = \frac{T - T_{in}}{T_{in}} \ll 1$ it is

$$k = k_{in}\, e^{y}.$$

This is what we use henceforth. It is called the Frank-Kamenetskii approximation after D. A. Frank-Kamenetskii, a mining engineer interested in the problem of thermal explosions. The approximation makes sense as long as $\frac{T - T_{in}}{T_{in}} \ll 1$. It is explained in physical terms in his book, "Diffusion and Heat Exchange in Chemical Kinetics." We use it for its mathematical convenience.
Then, letting x denote $1 - \frac{c}{c_{in}}$, the fractional conversion of the feed, scaling time by the holding time $\frac{V}{q}$ and writing it again as t, and introducing the dimensionless groups

$$\beta = \frac{UA}{\rho c_P V}\, \frac{V}{q} \geq 0$$

$$B = \frac{E}{RT_{in}}\, \frac{(-\Delta H)\, c_{in}}{\rho c_P T_{in}} > 0$$

and

$$D = \frac{V}{q}\, k_{in} > 0$$
we have

$$\frac{dx}{dt} = -x + D e^{y} (1 - x) \equiv f(x, y)$$

and

$$\frac{dy}{dt} = -y + B D e^{y} (1 - x) - \beta y \equiv g(x, y)$$

where we have assumed $T_c = T_{in}$ and obtained a non-essential simplification. The input variables D, B and β measure the strengths of the chemical reaction, the heat source and the heat sink.
We will assume that they can be adjusted independently whereas that may not be so in a definite physical problem. Indeed to study the response of a system to the holding time τ, where $\tau = \frac{V}{q}$, it would be better to put $\beta_0 \tau$ and $k_0 \tau$ in place of β and D, where $\frac{1}{\beta_0}$ and $\frac{1}{k_0}$ are the natural time scales for heat exchange and reaction.
The steady states satisfy

$$0 = -x + D e^{y}(1 - x)$$

and

$$0 = -y + B D e^{y}(1 - x) - \beta y$$

or

$$y = \frac{B}{1 + \beta}\, x$$

and

$$D = \frac{x}{1 - x}\, e^{-y} = \frac{x}{1 - x}\, e^{-\frac{B}{1+\beta} x}$$
To each x on (0,1) there corresponds one value of D. As x increases from 0 to 1 the right hand side, called RHS henceforth, increases from 0 to ∞ and the question is: is this a monotonic increase? If it is, there will be one value of x corresponding to each value of D; otherwise there will be more than one value of x corresponding to some values of D. The answer depends on the size of $\frac{B}{1+\beta}$, for this determines whether the factor $e^{-\frac{B}{1+\beta}x}$ can turn around the strongly increasing factor $\frac{x}{1-x}$. To answer our question, the implicit function theorem instructs us to find where $\frac{d\,\text{RHS}}{dx}$ vanishes. Thus we calculate $\frac{d\,\text{RHS}}{dx}$ and find:

$$\frac{d\,\text{RHS}}{dx} = \frac{1}{(1-x)^2}\, \frac{e^{-\frac{B}{1+\beta}x}}{1+\beta}\, \left\{ B x^2 - B x + (1 + \beta) \right\}$$
dRHS
The algebraic sign of is that of {Bx2 − Bx + (1 + β)} and hence is positive for x = 0
dx
and for x → 1. Our question then reduces to: does Bx2 − Bx + (1 + β) vanish for intermediate
values of x : 0 < x < 1? We let x1 and x2 denote the roots of Bx2 − Bx + (1 + β) = 0 and find
r
1 1 4 (1 + β)
x1,2 = ± 1−
2 2 B
There are two possibilities: either B < 4 (1 + β) , x1 and x2 do not lie on (0,1) and RHS is a
monotonic increasing function of x, ∀x ∈ (0, 1) or B > 4 (1 + β) , x1 and x2 lie on (0,1) and
RHS exhibits turning points at x1 and x2 .
The line B = 4 (1 + β) divides the positive quadrant of the β − B plane into two regions. In
the lower region x vs D is monotonic, in the upper x vs D is S-shaped. Schematically then the
steady state diagram looks as follows:

[Figure: x vs D; monotonic in the lower region, S-shaped in the upper region.]
The steady state curve in the upper part of the sketch is not like the branching diagram in the
chemostat problem. On this S-shaped curve jumps take place as D increases through the lower
turning point corresponding to x1 (ignition point) or decreases through the upper turning point
corresponding to x2 (extinction point). And we conclude that conversions between x1 and x2 may
be difficult to achieve.
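A short sketch of the steady state curve; the pair β = 0.5, B = 10 is an assumed sample lying in the S-shaped region B > 4(1 + β).

    import numpy as np

    beta, B = 0.5, 10.0

    def D(x):
        # steady state curve D(x) = x/(1-x) e^{-Bx/(1+beta)}
        return x / (1.0 - x) * np.exp(-B * x / (1.0 + beta))

    disc = np.sqrt(1.0 - 4.0 * (1.0 + beta) / B)
    x1, x2 = 0.5 - 0.5 * disc, 0.5 + 0.5 * disc   # roots of Bx^2 - Bx + (1+beta)
    print(f"ignition at x1 = {x1:.3f}, D = {D(x1):.4f}")
    print(f"extinction at x2 = {x2:.3f}, D = {D(x2):.4f}")

    # between D(x2) and D(x1) three conversions correspond to one D
    for x in np.linspace(0.05, 0.95, 10):
        print(f"x = {x:.2f}  D = {D(x):.4f}")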
To establish the stability of the steady solutions we suppose the system to be in a steady state denoted (x, y) and ask whether a small excursion to (x + ξ, y + η) does or does not return to (x, y). Because we have

$$\frac{d\xi}{dt} = f(x+\xi, y+\eta), \qquad \frac{d\eta}{dt} = g(x+\xi, y+\eta)$$

and

$$f(x, y) = 0 = g(x, y)$$

the small excursion is determined, to first order, by the matrix

$$A = \begin{pmatrix} f_x & f_y \\ g_x & g_y \end{pmatrix}$$

where

$$f_x = -1 - D e^{y}, \quad f_y = D e^{y}(1-x), \quad g_x = -B D e^{y}, \quad g_y = -(1+\beta) + B D e^{y}(1-x).$$

Using

$$D e^{y} = \frac{x}{1-x}$$

we find

$$A = \begin{pmatrix} -\frac{1}{1-x} & x \\ -\frac{Bx}{1-x} & -(1+\beta) + Bx \end{pmatrix}$$

whence we have

$$\det A = \frac{1}{1-x}\left\{ B x^2 - B x + (1+\beta) \right\}$$

and

$$\operatorname{tr} A = -\frac{1}{1-x}\left\{ B x^2 - (B + 1 + \beta)\, x + 2 + \beta \right\}.$$
Looking at det A first, we see that as x → 0 and as x → 1 det A is positive and that det A
vanishes at x1 and x2 , the turning points of RHS. Hence det A is positive unless B > 4 (1 + β) and
x1 ≤ x ≤ x2 . The algebraic sign of det A is marked on the foregoing sketch and it indicates that
the branch of the S-shaped steady state curve running between the turning points corresponding
to x1 and x2 is unstable. Indeed as x increases through x1 or decreases through x2 an exchange
of stabilities takes place which corresponds to passing from the fourth to the third quadrant in the
det A − trA plane by crossing the line det A = 0. While the turning points at x1 and x2 correspond
to det A = 0 they do not look like the branch point, where also det A = 0, discovered in the
chemostat problem.
Thus at each point (β, B) where B < 4 (1 + β), we have det A > 0 for all values of D but
where B > 4 (1 + β) we have a bounded range of D, depending on β and B, where det A < 0.
To see what is going on when det A is positive we must look at trA. This is negative as x → 0
and as x → 1 so that all such states are stable to small upsets. The question is whether or not trA
vanishes for intermediate values of x: 0 < x < 1. Denoting by x3 and x4 the roots of trA = 0 we
find their values to be
$$x_{3,4} = \frac{B + (1+\beta) \pm \sqrt{\left( B + (1+\beta) \right)^2 - 4B(2+\beta)}}{2B}$$
The condition that $(B + (1+\beta))^2 - 4B(2+\beta)$ vanish places two curves on the β - B diagram:

$$B = 3 + \beta \pm 2\sqrt{2+\beta}$$

and between these curves $x_3$ and $x_4$ are complex conjugates and so do not lie on (0,1). Hence in the region between the two curves we have tr A < 0.
If the point (β, B) lies above the upper curve, $x_3$ and $x_4$ are real and positive. Putting z = x - 1 we write

$$B x^2 - (B + 1 + \beta)\, x + 2 + \beta = B z^2 - (-B + 1 + \beta)\, z + 1 = 0$$

but $-B + 1 + \beta < -2 - 2\sqrt{2+\beta} < 0$ when $B > 3 + \beta + 2\sqrt{2+\beta}$, hence the real parts of $z_3$ and $z_4$ must be negative and we see that $x_3$ and $x_4$ lie to the left of x = 1.

Likewise, if the point (β, B) lies below the lower curve, i.e., $B < 3 + \beta - 2\sqrt{2+\beta}$, $x_3$ and $x_4$ must also be real and positive but now they lie to the right of x = 1.
So in terms of tr A what we find is this: for all points (β, B) lying below $B = 3 + \beta + 2\sqrt{2+\beta}$ we have tr A < 0, $\forall x \in (0,1)$; for all points (β, B) lying above $B = 3 + \beta + 2\sqrt{2+\beta}$ we have tr A < 0 for $0 < x < x_3$, tr A > 0 for $x_3 < x < x_4$ and tr A < 0 for $x_4 < x < 1$. The correspondence between x and D is $D = \frac{x}{1-x}\, e^{-\frac{B}{1+\beta}x}$.

Thus at each point (β, B) where $B < 3 + \beta + 2\sqrt{2+\beta}$, we have tr A < 0 for all values of D but where $B > 3 + \beta + 2\sqrt{2+\beta}$ we have a bounded range of D, depending on β and B, where tr A > 0.
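A sketch locating the window in D where tr A > 0, using β = 3, B = 14, the pair appearing in Problem 7 below.

    import numpy as np

    beta, B = 3.0, 14.0
    a, b, c = B, -(B + 1.0 + beta), 2.0 + beta     # tr A = 0 <=> a x^2 + b x + c = 0
    x3, x4 = sorted(np.roots([a, b, c]).real)
    D = lambda x: x / (1.0 - x) * np.exp(-B * x / (1.0 + beta))
    print(f"tr A vanishes at x3 = {x3:.3f} and x4 = {x4:.3f}")
    print(f"tr A > 0 for D between {D(x3):.3f} and {D(x4):.3f}")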
The two curves $B = 4(1+\beta)$ and $B = 3 + \beta + 2\sqrt{2+\beta}$ divide the β - B plane into four regions as follows:

[Figure: the β - B plane divided into four regions by the two curves, each region shown with a representative x vs D curve; in the lowest region det A > 0 and tr A < 0 hold for all x, while det A < 0 or tr A > 0 occur on parts of the curves in the other regions.]

At each point where the sign of det A or tr A is not indicated, both signs are possible depending on the value of x.
At each point of the uppermost region $x_1$, $x_2$, $x_3$ and $x_4$ all fall on the interval (0,1) and so, as x increases from 0 to 1, det A takes positive, negative, then positive values while tr A takes negative, positive, then negative values. This region can be subdivided depending on how $x_1$, $x_2$, $x_3$ and $x_4$ are ordered. This is worked out in Poore's paper.
If we suppose that (β, B) is such that $x_1 < x_2 < x_3 < x_4$ then it is possible to find:

[Figure: the S-shaped x vs D curve with the signs of tr A and det A marked along it; det A vanishes at $x_1$ and $x_2$, tr A vanishes at $x_3$ and $x_4$.]
and where trA > 0 or where det A < 0 we have unstable steady states, the remaining steady
states being stable to small disturbances. The sketch should convince the readers that they have
no idea what happens as D increases and x passes through x1 . The points corresponding to x3
and x4 , where trA = 0 and det A > 0 are called Hopf bifurcation points. For D and x such
that x is just below x3 or just above x4 the corresponding steady state is stable but the return of a
small perturbation to the steady state is not monotonic. As D increases and x increases through
x3 or as D decreases and x decreases through x4 both of which correspond to passing from the
fourth to the first quadrant in the det A − trA plane by crossing the line trA = 0, something new
turns up that we can only guess at. In each instance it may be that a branch of stable periodic
solutions grows from the bifurcation point, but there are other possibilities, and Poore deals with
such questions. A simple way to get the required information can be found in Kuramoto’s book
“Chemical Oscillations, Waves and Turbulence.”
For example, as D increases in the sketch below and the state crosses tr A = 0 from S to U,

[Figure: x vs D, the steady branch marked S (stable) and then U (unstable) past the crossing.]

we may see

[Figure: two alternative pairs of phase portraits in the x, y plane, the steady states marked S and U.]
The reader may wish to work out the adiabatic case, viz., β = 0.
The two problems presented in this lecture illustrate the basic ideas of small amplitude stability
studies and therefore serve our purposes very well. But they are too simple to represent real chem-
ical reactor problems and even too simple to represent what is in the chemical reactor literature.
The greatest simplification is in the use of two variables to define the state of the system and the
consequent use of 2 × 2 matrices to determine its stability.
But even in two state variable problems there is a lot going on. To begin to learn about this the
reader can consult chapter six in Gray and Scott’s book “Chemical Oscillations and Instabilities.”
1. The autocatalytic reaction

$$A + P \xrightarrow{\;k\;} 2P$$

takes place in a spatially uniform reactor whose holding time is θ. If $c_A$ and $c_P$ denote the concentrations of A and P and $c_{P\,in} = 0$, we can write

$$\frac{dc_A}{dt} = \frac{c_{A\,in}}{\theta} - \frac{c_A}{\theta} - k\, c_A c_P$$

and

$$\frac{dc_P}{dt} = 0 - \frac{c_P}{\theta} + k\, c_A c_P$$

Find the steady solutions of this system of equations and their stability to small upsets, i.e., show that the steady state diagram in terms of $c_A$ is

[Figure: $c_P$ vs $c_{A\,in}$; the branch $c_P = 0$ is stable (S) for $c_{A\,in} < \frac{1}{k\theta}$ and unstable (U) beyond, where the branch $c_P = c_{A\,in} - \frac{1}{k\theta}$ appears.]
2. The reactions

A → X
B + X → Y + C
2X + Y → 3X
X → D

take place in a spatially uniform reactor. Letting a denote the concentration of A, etc., assuming that a and b remain fixed and requiring all the elementary kinetic rate constants to be equal, we can write

$$\frac{dx}{dt} = a - bx + x^2 y - x$$

and

$$\frac{dy}{dt} = bx - x^2 y$$
Find the values of a and b for which these equations have a stable equilibrium point and
show that the curve b = 1 + a2 is a locus of Hopf bifurcations.
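A numerical check of the Hopf locus; the equilibrium (x, y) = (a, b/a) follows from setting the right hand sides to zero, and the sample values of a below are assumptions.

    import numpy as np

    for a in [0.5, 1.0, 2.0]:
        for b in [1.0 + a**2 - 0.1, 1.0 + a**2 + 0.1]:
            x, y = a, b / a                       # equilibrium point
            J = np.array([[-b + 2*x*y - 1.0, x**2],
                          [b - 2*x*y, -x**2]])
            tr, det = np.trace(J), np.linalg.det(J)
            print(f"a={a}, b={b:.2f}: tr A = {tr:+.2f}, det A = {det:.2f}")

Here det A = a² > 0 always, while tr A = b - 1 - a² changes sign exactly on b = 1 + a², the Hopf condition.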
3. The predator-prey equations, viz.,

$$\frac{dx}{dt} = Ax - Bxy$$

$$\frac{dy}{dt} = Cxy - Dy$$

where A, B, C and D are positive constants and where y is the predator population while x is the prey population. Find the equilibrium points and determine their stability.
4. The Lorenz equations, "Deterministic Nonperiodic Flow," J. Atmos. Sci. 20, 130 (1963), viz.,

$$\frac{dx}{dt} = \sigma (y - x)$$

$$\frac{dy}{dt} = \rho x - y - xz$$

and

$$\frac{dz}{dt} = -\beta z + xy$$

where σ, ρ and β are positive constants, are well known in the theory of deterministic chaos. These equations represent a three mode truncation of the Boussinesq equations for natural convection in a fluid layer heated from below. The parameters σ, ρ and β denote the Prandtl number, the Rayleigh number and an aspect ratio. Lorenz set σ = 10 and β = 8/3. For fixed σ and β investigate the equilibrium solutions as they depend on ρ and establish their stability. The point ρ = 1 is called a pitchfork as two new equilibrium solutions break off from the equilibrium solution x = y = z = 0. Be sure to find the Hopf bifurcation on each of the new equilibrium branches. To do this let I, II and III denote the principal invariants of the Jacobian matrix and observe that I × II = III at a Hopf bifurcation.
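A sketch of the suggested computation on the nonzero equilibrium branch x = y = √(β(ρ-1)), z = ρ-1; the range of ρ scanned is an arbitrary choice.

    import numpy as np

    sigma, beta = 10.0, 8.0 / 3.0

    def invariants(rho):
        x = np.sqrt(beta * (rho - 1.0))
        J = np.array([[-sigma, sigma, 0.0],
                      [1.0, -1.0, -x],             # rho - z = 1 on this branch
                      [x, x, -beta]])
        I = np.trace(J)
        III = np.linalg.det(J)
        II = 0.5 * (I**2 - np.trace(J @ J))        # sum of principal 2x2 minors
        return I, II, III

    for rho in np.linspace(20.0, 28.0, 9):
        I, II, III = invariants(rho)
        print(f"rho = {rho:.1f}: I*II - III = {I*II - III:+.1f}")

The sign change marks the Hopf bifurcation, near ρ = σ(σ + β + 3)/(σ - β - 1).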
5. The reaction A −→ B, where A is a gas and B is a liquid, is ordinarily carried out by bub-
bling a gas stream containing A through a liquid containing B. Assuming that A dissolves in
the liquid and that the reaction takes place there, we can write a simple model by supposing
that the solubility of A is independent of temperature, that the rate of absorption of A is not
enhanced by the chemical reaction and that the concentration of B is in large excess. The
concentration of dissolved A and the temperature of the liquid then satisfy
$$V \frac{dc_A}{dt} = k_L^0\, a V \left( c_A^* - c_A \right) - L c_A - k\, c_A c_{B\,in} V$$

and

$$V \rho c_P \frac{dT}{dt} = \rho c_P L (T_{in} - T) + (-\Delta H)\, k\, c_A c_{B\,in} V - UA (T - T_{in})$$

where L denotes the liquid feed rate and V denotes the liquid volume in the reactor and where we put $c_B = c_{B\,in}$. The gas stream maintains A at partial pressure $P_A$ and creates the surface area aV; diffusional resistance lies entirely on the liquid side and $c_A^* = \frac{P_A}{H}$.
Introducing

$$k = k_{in}\, e^{\frac{E}{RT_{in}} \frac{T - T_{in}}{T_{in}}}$$

$$\tau = \frac{V}{L}$$

$$D = \tau k_{in} c_{B\,in}$$

$$D_m = \tau k_L^0\, a$$

$$Y = \frac{E}{RT_{in}}\, \frac{T - T_{in}}{T_{in}}$$

$$X = \frac{c_A^* - c_A}{c_A^*}$$

we can write our model

$$-\tau \frac{dX}{dt} = D_m X - \left( 1 + D e^{Y} \right)(1 - X)$$

and

$$\tau \frac{dY}{dt} = -Y + B D e^{Y} (1 - X) - \beta Y$$
Find the steady solutions of this system of equations and their stability to small upsets.
6. In problems where the model of a process is a system of differential and algebraic equations,
e.g., p balance equations and q phase equilibrium equations to determine p+q state variables,
the response of the system is inherently slow. To see this, suppose that x, y and v satisfy

$$\frac{dx}{dt} = f(x, y, v)$$

$$\frac{dy}{dt} = g(x, y, v)$$

and

$$x = y.$$
The equilibrium states satisfy

$$f(x, y, v) = 0, \qquad g(x, y, v) = 0$$

and

$$x - y = 0.$$

Because x = y holds at all times we also have

$$\frac{dx}{dt} = \frac{dy}{dt}$$

or

$$f(x, y, v) = g(x, y, v)$$

and this determines v = v(x, y), where

$$v_x = -\frac{g_x - f_x}{g_v - f_v}$$

and

$$v_y = -\frac{g_y - f_y}{g_v - f_v}.$$

Then the model is

$$\frac{dx}{dt} = F(x, y), \qquad \frac{dy}{dt} = G(x, y)$$

where $F(x, y) = f\left( x, y, v(x, y) \right)$ and $G(x, y) = g\left( x, y, v(x, y) \right)$.
Using

$$F_x = f_x + f_v v_x, \quad F_y = f_y + f_v v_y, \quad G_x = g_x + g_v v_x, \quad G_y = g_y + g_v v_y$$

show that

$$\det \begin{pmatrix} F_x & F_y \\ G_x & G_y \end{pmatrix} = F_x G_y - G_x F_y = 0.$$
This tells us that the linear approximation always exhibits an eigenvalue that is zero. Hence the response of the system, when it is in an equilibrium state, to a small upset is slow. To see this in a definite problem, consider the boiling steady states of a constant pressure evaporator, which exhibit it. The model is:

$$M \frac{dx}{dt} = x_F F - x (F - V)$$

$$M c_P \frac{dT}{dt} = UA (T_S - T) + c_P (T_F - T) F - \lambda V$$

and

$$T = T_0 + \beta x$$

where the state variables are x, T and V, all else being fixed.
In terms of the holding time $\tau = \frac{M}{F}$, the heat transfer time $\tau_h = \frac{M c_P}{UA}$ and the dimensionless variables

$$y = \frac{T - T_0}{\beta}, \qquad v = \frac{V}{F} \qquad \text{and} \qquad q = \frac{\lambda}{c_P \beta}$$

the model is

$$\frac{dx}{dt} = x_F - x (1 - v)$$

$$\frac{dy}{dt} = \frac{\tau}{\tau_h}\left( y_S - y \right) + y_F - y - q v$$

and

$$x = y.$$
Solving for v = v(x, y) we find

$$\frac{\partial v}{\partial y} = -\frac{\frac{\tau}{\tau_h} + 1}{q + x}$$

and

$$\frac{\partial v}{\partial x} = \frac{1 - v}{q + x}$$

and the model becomes

$$\frac{dx}{dt} = x_F - x + x\, v(x, y)$$

and

$$\frac{dy}{dt} = \frac{\tau}{\tau_h}\left( y_S - y \right) + y_F - y - q\, v(x, y)$$
where 0 < v < 1, and the linear approximation is determined by the matrix

$$\begin{pmatrix} -1 + v + x \frac{\partial v}{\partial x} & x \frac{\partial v}{\partial y} \\ -q \frac{\partial v}{\partial x} & -\frac{\tau}{\tau_h} - 1 - q \frac{\partial v}{\partial y} \end{pmatrix}$$

Show that the determinant of this matrix is zero and that its trace is

$$\frac{q}{q + x}\left( -1 + v \right) - \left( \frac{\tau}{\tau_h} + 1 \right) \frac{x}{q + x} < 0$$

and as a result show that the solution to the linear equations describing a small upset of a boiling steady state is the sum of two terms, one constant in time, the other dying exponentially in time.
By carrying out an Euler approximation or otherwise, determine what in fact does happen in the system

$$\frac{dx}{dt} = f(x, y, v), \qquad \frac{dy}{dt} = g(x, y, v), \qquad x = y.$$
The boiling curve for a three component ideal solution is given by solving

$$\frac{dx_1}{dt} = \left( -\frac{P_1(T)}{P} + 1 \right) x_1$$

$$\frac{dx_2}{dt} = \left( -\frac{P_2(T)}{P} + 1 \right) x_2$$

and

$$x_1 \frac{P_1(T)}{P} + x_2 \frac{P_2(T)}{P} + (1 - x_1 - x_2) \frac{P_3(T)}{P} = 1.$$

Here T comes into the third equation and this system is not as sluggish as its look-alike.
7. In the stirred tank reactor model put β = 3 and B = 14 (setting β = 2 and B = 10 leads to a better x vs D curve); then

$$3 + \beta + 2\sqrt{2+\beta} < B < 4(1+\beta)$$

and the conversion x is a monotonic increasing function of D. On this curve the determinant of the Jacobian matrix is always positive but its trace vanishes at $x_3 = 0.406$ and $x_4 = 0.880$. Set D = 0.162, which corresponds to a steady conversion x = 0.380, and integrate the differential equations for the start up of the reactor from a variety of initial conditions, in particular x = 0, y = 0, which corresponds to starting the reactor up using its feed conditions as initial conditions. Do this using a machine having good graphics and plot the start up trajectories in the x, y plane.
You will find that only a small set of initial conditions lead to the one and only steady
state. This steady state is stable to small upsets but it is close to the Hopf bifurcation point
at x3 = 0.406. As D increases through this bifurcation point, a small limit cycle breaks
off. If the bifurcation is forward this cycle is stable and surrounds the steady state which
turns unstable. But if it is backward, as it is here, it is unstable and for values of D short of
the bifurcation point an unstable limit cycle surrounds the stable steady state. This in turn
is surrounded by a large stable limit cycle that sets in at a lower value of D. As D passes
through the bifurcation point the small unstable limit cycle vanishes, the stable steady state
turns unstable while the large stable limit cycle is not sensitive to all of this. The picture is
then:

[Figure: for D < D_bifurcation, the stable steady state (S) is surrounded by an unstable limit cycle (U), itself surrounded by a large stable limit cycle; for D > D_bifurcation, only the large stable limit cycle remains, surrounding the now unstable steady state.]

The top picture explains why your calculations turn out the way they do.
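A minimal sketch of the start up calculation; the second initial condition below is an assumed point near the steady state, and a trajectory captured by the large limit cycle will show a persistent oscillation in the reported range.

    import numpy as np
    from scipy.integrate import solve_ivp

    beta, B, D = 3.0, 14.0, 0.162

    def rhs(t, s):
        x, y = s
        return [-x + D * np.exp(y) * (1.0 - x),
                -y + B * D * np.exp(y) * (1.0 - x) - beta * y]

    for s0 in [[0.0, 0.0], [0.38, 1.33]]:       # feed start and near-steady start
        sol = solve_ivp(rhs, (0.0, 50.0), s0, max_step=0.01)
        tail = sol.y[:, sol.t > 40.0]           # late-time behavior
        print(f"start {s0}: x over 40 < t < 50 ranges "
              f"[{tail[0].min():.3f}, {tail[0].max():.3f}]")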
8. In the chemostat problem put

$$\frac{dc}{dt} = \frac{W}{V}\{c_{in} - c\} - \nu k n - \mu n$$

where µ > 0. Determine the steady solutions and their stability as it depends on $\frac{W}{V}$.
9. In the boiling curve problem for a three component ideal solution, assume
P3 (T ) < P2 (T ) < P1 (T ) and x3 (t = 0) is nearly zero. Show that as t → ∞ we have
x3 → 1.
Lecture 8

The Inverse Problem
In this lecture we present two problems where we must determine the values of the elements $a_{ij}$ of a matrix A using measurements of the elements $x_i(t)$ of a vector x(t) satisfying $\frac{dx}{dt} = Ax$. The first problem has been solved by J. Wei and C. D. Prater in "The Structure and Analysis of Complex Reaction Systems," Advances in Catalysis 13, 204 (1962).
In each problem A is self-adjoint in some inner product denoted $\langle\,,\,\rangle_G$ and we can discover this inner product while A itself remains unknown. Ordinarily this is not the plain vanilla inner product, but it does tell us that A has a complete set of eigenvectors corresponding to real eigenvalues. Denoting these $x_1, x_2, \ldots, x_n$ and $\lambda_1, \lambda_2, \ldots, \lambda_n$, where $\langle x_i, x_j \rangle_G = \delta_{ij} = \langle y_i, x_j \rangle_I$, the idea is to obtain A by deriving from measurements of x(t) the terms that make up its spectral decomposition:

$$A = \sum \lambda_i x_i y_i^T.$$

The solution to $\frac{dx}{dt} = Ax$ is

$$x(t) = \sum \left\langle y_i, x(t=0) \right\rangle_I e^{\lambda_i t} x_i$$

where $\langle y_i, x(t=0) \rangle_I = \langle x_i, x(t=0) \rangle_G$ and where measurements give us the left hand side of this formula. The exponential separation of the terms on the right hand side as t grows large, together with the fact that x(t=0) is under our control, leads to a plan for determining the
eigenvectors of A and then the terms in its spectral decomposition. Indeed, if, for some value of i, we can guess x(t=0) so that $\langle y_j, x(t=0) \rangle_I = 0\; \forall j \neq i$, then x(t) is just $\langle y_i, x(t=0) \rangle_I\, e^{\lambda_i t} x_i$ and

$$\ln \frac{\left\langle y_i, x(t) \right\rangle}{\left\langle y_i, x(t=0) \right\rangle} = \lambda_i t$$

This done, A can be recovered in terms of its eigenvalues and eigenvectors via

$$A = \sum \lambda_i x_i y_i^T.$$
The problem to be solved then is the selection of a useful sequence of x(t=0)'s. What makes this possible is that the terms in the expansion of x(t) go to 0 at differing rates. Indeed if 0 is the unique equilibrium point, $x_1$ can be obtained using the long time data from an arbitrary experiment, $x_2$ can be obtained using the long time data from an experiment satisfying $\langle x_1, x(t=0) \rangle_G = 0$, etc.
Let i = 1, ..., n denote the species in a chemically reacting system and suppose that each pair participates in a reversible chemical reaction:

$$i \; \underset{k_{ij}}{\overset{k_{ji}}{\rightleftharpoons}} \; j$$

where $k_{ji} > 0$ and $k_{ij} > 0$ are the forward and reverse chemical rate constants. A system of n chemical isomers provides the simplest physical realization of this. We carry out the reaction at constant temperature in a closed vessel. At time t = 0 we specify the number of moles of each species in the vessel and inquire as to how these numbers change as time runs on. As the total number of moles is fixed we can define the state of the system most easily in terms of the mole fractions of the species. Letting $x_i$, i = 1, ..., n, denote these we write

$$\frac{dx_i}{dt} = \sum_{\substack{j=1 \\ j \neq i}}^{n} k_{ij} x_j - \sum_{\substack{j=1 \\ j \neq i}}^{n} k_{ji} x_i$$
or in terms of $x = (x_1, x_2, \ldots, x_n)^T$

$$\frac{dx}{dt} = K x$$

where the off-diagonal elements of K are $k_{ij}$, the diagonal elements being the negative of the sums of the off-diagonal elements in the same column. This tells us that the Gerschgorin column circles are all centered on the negative real line with radius equal to the distance from the center to the origin. As a result the eigenvalues of K cannot have positive real parts, and if a real part is zero the eigenvalue itself must be zero. In the special case n = 3 the matrix K is

$$K = \begin{pmatrix} -k_{21} - k_{31} & k_{12} & k_{13} \\ k_{21} & -k_{12} - k_{32} & k_{23} \\ k_{31} & k_{32} & -k_{13} - k_{23} \end{pmatrix}$$
We have

$$(1\;\; 1\;\; \cdots\;\; 1)\, K = 0^T,$$

so at least one eigenvalue of K is zero; and if $\sum x_i = 1$ when t = 0 then $\sum x_i = 1$ for all t ≥ 0 due to

$$\frac{d}{dt} \sum x_i = (1\;\; 1\;\; \cdots\;\; 1)\, \frac{dx}{dt} = (1\;\; 1\;\; \cdots\;\; 1)\, K x = 0.$$

Also it is not hard to see that if all $x_i \geq 0$ for t = 0 then each $x_i \geq 0$ for all t ≥ 0. And so the motion of the state vector x takes place on the plane $\sum x_i = 1$ in the positive cone (quadrant, octant, ...) $x_i \geq 0$ of the composition space.
The principle of detailed balance requires that, at equilibrium, each reaction be separately equilibrated, viz.,

$$k_{ij}\, x_{j\,eq} = k_{ji}\, x_{i\,eq}$$

for all i, j = 1, 2, ..., n.
A readable explanation of the principle of detailed balance can be found in “Treatise on Irre-
versible and Statistical Thermophysics” by W. Yourgrau, A. van der Merwe and G. Raw. In H.
Haken’s book “Synergetics” the reader will find information on detailed balance as it has to do
with what are called master equations.
The principle of detailed balance is sufficient that the first two assumptions hold. The requirement $k_{ij} x_{j\,eq} = k_{ji} x_{i\,eq}$, $k_{ij} > 0$, leads to the requirement that solutions of Kx = 0 have singly signed components. This tells us that the problem Kx = 0 can have but one independent solution, because two independent singly signed solutions have nonsingly signed linear combinations. Hence $x_{eq}$ is unique up to a constant multiplier. It is unique and it lies in the positive cone as it is required to satisfy $\sum x_i = 1$.
The principle of detailed balance is also sufficient that K be self adjoint. To see this let $X_{eq} = \operatorname{diag}\left( x_{1\,eq}\;\; x_{2\,eq}\;\; \ldots\;\; x_{n\,eq} \right)$ and observe that $X_{eq}$ is a positive definite, Hermitian matrix. Then, by detailed balance, we can write $K X_{eq} = \left( K X_{eq} \right)^T$, due to the fact that the jth column of $K X_{eq}$ is the jth column of K multiplied by $x_{j\,eq}$. Hence, in the inner product where $G = X_{eq}^{-1}$, we find that $K^* = K$. The readers should work this out for themselves using $K^* = G^{-1} K^T G$.
We conclude therefore that K has a complete set of eigenvectors, denoted $x_1, x_2, \ldots, x_n$, and we denote the corresponding eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$. The eigenvalues are real and not positive and we require that they be ordered: $\lambda_1 = 0 > \lambda_2 \geq \lambda_3 \geq \cdots \geq \lambda_n$. The eigenvectors are orthogonal in the inner product $G = X_{eq}^{-1}$, i.e.,

$$x_i^T X_{eq}^{-1} x_j = 0, \qquad i \neq j.$$
We denote by $y_1, y_2, \ldots, y_n$ the set of vectors biorthogonal to $x_1, x_2, \ldots, x_n$ in the plain vanilla inner product and observe that, as $K^* = K^T$ in this inner product and $K^T (1\; 1\; \cdots\; 1)^T = 0$, we have $y_1 = (1\; 1\; \cdots\; 1)^T$ if we set $x_1 = x_{eq}$. Then for any initial composition we can write the solution to $\frac{dx}{dt} = Kx$ as
$$x(t) = \sum_{i=1}^{n} \left\langle y_i, x(t=0) \right\rangle e^{\lambda_i t} x_i = x_{eq} + \sum_{i=2}^{n} \left\langle y_i, x(t=0) \right\rangle e^{\lambda_i t} x_i$$
To see how this can be used to find the matrix K and hence the n(n-1) chemical rate constants we put n = 3 and suppose $\lambda_3 < \lambda_2$. Then $x(t) - x_{eq}$ is the sum of two terms, $\langle y_2, x(t=0) \rangle e^{\lambda_2 t} x_2$ and $\langle y_3, x(t=0) \rangle e^{\lambda_3 t} x_3$, where the first dies out more slowly than the second. We call $x_2$ the slow direction, $x_3$ the fast direction. For large values of t the term in the slow direction approximates $x(t) - x_{eq}$ and so the long time data provide an estimate of $x_2$ which we can refine in successive experiments. The idea is illustrated in the following sketch where the long time tangent direction is the estimate of $x_2$. It is determined with increasing accuracy in successive experiments by using the latest estimate of $x_2$ to derive a new initial condition wherein the magnitude of $\langle y_2, x(t=0) \rangle$ is increased vis-à-vis $\langle y_3, x(t=0) \rangle$. This can be done by extrapolating the long time tangent to the latest reaction path back to the edge of the triangle. The sequence $x(t=0) - x_{eq}$ then turns toward $x_2$ and away from $x_3$:
[Figure: the composition triangle showing two successive experiments; the first reaction path (1) approaches $x_{eq}$ along the slow line $x_{eq} + c_2 x_2$, and its long time tangent, extrapolated back to the edge of the triangle, supplies the initial condition for the second path (2).]
In the case n = 3 the subsequent work is especially simple for, having obtained an estimate of $x_2$ as indicated above, we can determine $x_3$ in terms of $x_1$ and $x_2$ via orthogonality in the inner product $G = X_{eq}^{-1}$, i.e., $x_3$ can be obtained as a solution to

$$x_1^T X_{eq}^{-1} x_3 = (1\;\; 1\;\; \cdots\;\; 1)\, x_3 = 0$$

and

$$x_2^T X_{eq}^{-1} x_3 = 0.$$

Then, using the plain vanilla inner product, we can produce $y_1, y_2, y_3$ via $\langle y_i, x_j \rangle = \delta_{ij}$ and return to a trajectory, such as trajectory 1 in the foregoing sketch, having, at least for short time, roughly equal contributions in the $x_2$ and $x_3$ directions, and use it to obtain $\lambda_2$ and $\lambda_3$ via

$$\ln \frac{\left\langle y_i, x(t) \right\rangle}{\left\langle y_i, x(t=0) \right\rangle} = \lambda_i t, \qquad i = 2, 3,$$

and

$$K = \lambda_1 x_1 y_1^T + \lambda_2 x_2 y_2^T + \lambda_3 x_3 y_3^T.$$
This is what underlies the evaluation of K by the method of Wei and Prater.

And it obtains whatever the value of n. Indeed for any value of n we have $x_1 = x_{eq}$ and we can find $x_2$ as above. To get $x_3$ we select an initial condition at random and write

$$x(t=0) = x_{eq} + c_2 x_2 + c_3 x_3 + \cdots + c_n x_n.$$

Then we determine

$$c_2 = \frac{x_2^T X_{eq}^{-1}\, x(t=0)}{x_2^T X_{eq}^{-1}\, x_2}$$

and use the corrected initial condition $x(t=0) - c_2 x_2$ to generate a family of trajectories that will produce $x_3$ in the same way that a random initial condition will produce $x_2$. But there is a technical difficulty as the estimate of $x_2$ we have is not perfect and neither are our composition measurements. Both factors make it impossible to completely free x(t=0) of its $x_2$ component and hence $x_2$ tends to reassert itself in any trajectory as time runs on. But this flaw is not fatal, it just makes the method somewhat more tedious than it might at first seem.
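A sketch of the first step of the method on synthetic data; the equilibrium composition and rate constants below are assumptions, chosen to satisfy detailed balance.

    import numpy as np

    xeq = np.array([0.25, 0.50, 0.25])
    k = {(0, 1): 2.0, (0, 2): 1.0, (1, 2): 0.5}   # k_ij for the step j -> i

    K = np.zeros((3, 3))
    for (i, j), kij in k.items():
        K[i, j] = kij
        K[j, i] = kij * xeq[j] / xeq[i]           # detailed balance
    K -= np.diag(K.sum(axis=0))                   # columns sum to zero

    lam, X = np.linalg.eig(K)
    order = np.argsort(lam)[::-1]                 # 0 = lam1 > lam2 > lam3
    lam, X = lam[order], X[:, order]

    # simulate an experiment and take the long time tangent direction
    x0 = np.array([1.0, 0.0, 0.0])
    c = np.linalg.solve(X, x0)
    x = lambda t: X @ (c * np.exp(lam * t))
    tangent = x(8.0) - xeq                        # the slow term dominates here
    tangent /= np.linalg.norm(tangent)
    x2 = X[:, 1] / np.linalg.norm(X[:, 1])
    print("estimated slow direction:", np.round(tangent, 4))
    print("true x2 (up to sign):    ", np.round(x2, 4))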
Instead of using data produced by a closed or batch reactor to determine K, we can investigate
the possibility of using data produced by an open or flow reactor. By doing this we can avoid the
problem of determining the time to which the composition measurements correspond.
Now the model for the steady operation of a well mixed reactor is

$$0 = \frac{1}{\theta}\left( x_{in} - x_{out} \right) + K x_{out}, \quad \text{i.e.,} \quad x_{in} = \left( I - \theta K \right) x_{out}$$

where θ is the holding time. The values of $x_{in}$ and θ are under our control; the experiment produces the corresponding value of $x_{out}$. Letting n = 3, denoting the eigenvectors of K as $x_{eq}$, $x_2$ and $x_3$ and the corresponding eigenvalues as 0, $\lambda_2$ and $\lambda_3$, where $0 > \lambda_2 > \lambda_3$, and writing

$$x_{in} = x_{eq} + c_2 x_2 + c_3 x_3$$

and

$$x_{out} = x_{eq} + d_2 x_2 + d_3 x_3$$

we find

$$0 = c_2 - d_2 + \theta \lambda_2 d_2$$

and

$$0 = c_3 - d_3 + \theta \lambda_3 d_3.$$
This tells us that $x_{in}$, $x_{out}$ and $x_{eq}$ lie on a straight line iff that line is in the direction of an eigenvector of K. The reader can use this observation to devise a method for determining the matrix K via its spectral representation. In so doing it is useful to observe that an experiment turns $x_{in} - x_{eq}$ into the direction of the line through $x_{eq}$ parallel to $x_2$ and away from the line through $x_{eq}$ in the direction of $x_3$. Indeed as

$$x_{out} - x_{eq} = \left( I - \theta K \right)^{-1} \left( x_{in} - x_{eq} \right)$$

a sequence of experiments that might be worth some study is that in which the choice of $x_{in}$ in any experiment is the value of $x_{out}$ in the preceding experiment. This is easily achieved by running a train of reactors in series.

Our second problem is a set of interconnected tanks. The flow in the line connecting tanks i and j is taken to be

$$k_{ij}\left( h_i - h_j \right)$$

where $k_{ij}$ is the conductivity of the pipe connecting tanks i and j and where $k_{ij} = k_{ji} > 0$, there being only one line connecting tanks i and j. Indeed under steady laminar flow conditions we would anticipate

$$k_{ij} = \frac{\pi R^4 g}{8 L \nu}$$
The idea is to determine the values of the constants $k_{ij}$ by studying the dynamics of the levels in a set of interconnected tanks as the levels go to equilibrium from an assigned set of initial values.

We can work in terms of the height, $h_i$, or the volume, $V_i$, of the liquid held in tank i. The heights make the equilibrium state simple but complicate the constant of the motion, whereas the reverse is true for the volumes. While this problem is more like the earlier problem when it is written in terms of volumes, we work in terms of heights and write

$$A_i \frac{dh_i}{dt} = -\sum_{j \neq i} k_{ij}\left( h_i - h_j \right) = \sum_{j \neq i} k_{ij} h_j - \sum_{j \neq i} k_{ij} h_i$$
or in terms of $h = (h_1, h_2, \ldots, h_n)^T$

$$\frac{dh}{dt} = A^{-1} K h$$

where $A = \operatorname{diag}\left( A_1\;\; A_2\;\; \ldots\;\; A_n \right)$ and where the off-diagonal elements of K are $k_{ij}$, the diagonal elements being the negative sums of the off-diagonal elements in the same column or row.

We first observe that $A^T = A$ and $K^T = K$ and then that

$$(A_1\;\; A_2\;\; \cdots\;\; A_n)\, A^{-1} K = (1\;\; 1\;\; \cdots\;\; 1)\, K = 0^T.$$

Hence at least one eigenvalue of $A^{-1}K$ is zero and

$$(A_1\;\; A_2\;\; \cdots\;\; A_n)\, \frac{dh}{dt} = \frac{d}{dt} \sum_i A_i h_i = \frac{dV}{dt} = 0$$

so that $\sum_i A_i h_i = V$ is constant and the motion takes place on a plane of constant volume, a plane whose normal is $(A_1\;\; A_2\;\; \ldots\;\; A_n)^T$ in the plain vanilla inner product. Also it is not hard to see that if all $h_i \geq 0$ for t = 0 then each $h_i \geq 0$ for all t ≥ 0. Therefore the curve mapped out by h(t), t ≥ 0, lies on the plane $\sum_i A_i h_i = V$ in the positive cone of the vector space $R^n$ where h resides.
In an inner product with weight G we have

$$\left( A^{-1} K \right)^* = G^{-1} \left( A^{-1} K \right)^T G = G^{-1} K^T \left( A^{-1} \right)^T G = G^{-1} K A^{-1} G$$

and so, on taking G = A, we discover that $\left( A^{-1} K \right)^* = A^{-1} K$. Therefore in the inner product where G = A, $A^{-1}K$ is self adjoint and we conclude that $A^{-1}K$ has a complete set of eigenvectors and that the corresponding eigenvalues are real. We denote the eigenvectors $x_1, x_2, \ldots, x_n$ and the corresponding eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$.

The rows of $A^{-1}K$ are multiples of the rows of K. Hence the Gerschgorin row circles for $A^{-1}K$, as for K itself, are all centered on the negative real line with radius equal to the distance from the center to the origin, whence the eigenvalues of $A^{-1}K$ cannot be positive. They can be zero, and zero is in fact an eigenvalue. In the plain vanilla inner product

$$\left( A^{-1} K \right)^* = \left( A^{-1} K \right)^T = K A^{-1}$$

and the eigenvector of $\left( A^{-1} K \right)^*$ corresponding to the eigenvalue zero is

$$y_1 = \frac{1}{A_1 + A_2 + \cdots + A_n} \left( A_1\;\; A_2\;\; \ldots\;\; A_n \right)^T$$

due to

$$K A^{-1} \left( A_1\;\; A_2\;\; \ldots\;\; A_n \right)^T = K \left( 1\;\; 1\;\; \ldots\;\; 1 \right)^T = 0.$$
It is easy to see that the only solutions other than x = 0 to Kx = 0 are multiples of $(1\;\; 1\;\; \ldots\;\; 1)^T$. Indeed, as the off-diagonal elements of K satisfy $k_{ij} = k_{ji} > 0$ and the diagonal elements are the negative sums of the off-diagonal elements in the same column or row, we can eliminate $x_1$ from Kx = 0 and discover that $x^1 = (x_2\;\; x_3\;\; \ldots\;\; x_n)^T$ satisfies $K^1 x^1 = 0$ where, like K, the off-diagonal elements of $K^1$ satisfy $k_{ij}^1 = k_{ji}^1 > 0$ and the diagonal elements are the negative sums of the off-diagonal elements in the same column or row. Eliminating $x_2, x_3, \ldots, x_{n-2}$ in the same way we find that

$$\begin{pmatrix} -a & a \\ a & -a \end{pmatrix} \begin{pmatrix} x_{n-1} \\ x_n \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \qquad a > 0,$$

whence $x_{n-1} = x_n$ and, running the elimination backward, all the components of x must be equal.
At equilibrium we have

$$A^{-1} K h_{eq} = 0$$

or

$$K h_{eq} = 0.$$

By this the equilibrium vector $h_{eq}$ must be a multiple of $(1\;\; 1\;\; \ldots\;\; 1)^T$.
Detailed balance holds here but, K being symmetric, it does not help us as much as it did in the earlier problem where we had two independent paths connecting i and j.
Now zero is an eigenvalue of A−1 K and it is simple as long as all kij > 0. If connecting lines
are cut and the corresponding kij are set to zero, zero remains a simple eigenvalue as long as there
remains at least one indirect flow path from each tank to each other tank. At the point where this
is lost, zero cannot remain simple and our network splits into two disjoint subnetworks.
Now

$$h(t) = \sum_{i=1}^{n} \left\langle y_i, h(t=0) \right\rangle e^{\lambda_i t} x_i$$

and as

$$\left\langle y_1, h(t=0) \right\rangle = \frac{\sum A_i h_i(t=0)}{\sum A_i} = \frac{V}{\sum A_i} = h_{eq},$$

in the case n = 3, which is sufficient to illustrate the main idea, we write

$$h(t) = h_{eq} \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} + \left\langle y_2, h(t=0) \right\rangle e^{\lambda_2 t} x_2 + \left\langle y_3, h(t=0) \right\rangle e^{\lambda_3 t} x_3,$$

the equilibrium vector being $h_{eq} (1\;\; 1\;\; 1)^T$.
This formula determines h(t) in terms of h(t=0) and it can be used to recover K from experimental data for a variety of initial conditions in the way explained earlier in this lecture and used to determine chemical rate constants. In short, when $\lambda_3 < \lambda_2$ we see that as t grows large h(t) approaches $h_{eq}$ from the direction of $x_2$. So, using the direction of the tangent at $h_{eq}$ to each of a sequence of experimental trajectories to determine an initial condition for the next trajectory, we step by step reduce the magnitude of $\langle y_3, h(t=0) \rangle$ in favor of $\langle y_2, h(t=0) \rangle$ and thereby determine $x_2$ as accurately as we like. Using the orthogonality conditions $x_3^T A x_1 = 0 = x_3^T A x_2$ in the inner product where G = A we can find $x_3$ and then $y_2$ and $y_3$ in the plain vanilla inner product. As $y_2$, $y_3$ and an arbitrary trajectory determine $\lambda_2$ and $\lambda_3$ via $\langle y_2, h(t) \rangle = \langle y_2, h(t=0) \rangle e^{\lambda_2 t}$ and $\langle y_3, h(t) \rangle = \langle y_3, h(t=0) \rangle e^{\lambda_3 t}$, the values of the $k_{ij}$ are then recovered via

$$A^{-1} K = \lambda_1 x_1 y_1^T + \lambda_2 x_2 y_2^T + \lambda_3 x_3 y_3^T$$
Along any trajectory we have

$$h = h_{eq} + c_2 e^{\lambda_2 t} x_2 + c_3 e^{\lambda_3 t} x_3$$

where $c_2 = \langle y_2, h(t=0) \rangle$ and $c_3 = \langle y_3, h(t=0) \rangle$ are not known. But the second and third terms may not separate until t is so large that $h - h_{eq}$ is inside experimental accuracy. This happens if $|c_2/c_3|$ is sufficiently small. The assumption is made that $|c_2/c_3|$ is large enough in the first run that separation takes place early enough in time that an estimate of $x_2$ is obtained that can be used to increase $|c_2/c_3|$ for the second run. Then separation will take place even earlier in time, leading to a better estimate of $x_2$, etc. Such a sequence of experiments will produce an estimate of $x_2$ limited only by the accuracy of liquid level measurements.
Because the curve h(t) vs t lies on the plane $\sum A_i h_i = V$ and $h_1 \geq 0$, $h_2 \geq 0$, $h_3 \geq 0$, it lies on a plane triangle. The readers can work out how to transfer this plane triangle to a piece of graph paper so that they can draw a graph of an experimental trajectory.
Ordinarily a set of experimental runs will be carried out at different volumes but as the eigen-
vectors and eigenvalues of A−1 K do not depend on volume a sequence of runs can be brought to
a common volume quite easily. Yet all this can be avoided by working in volume fractions. Then
the matrix of interest is KA−1 and this also turns out to be self adjoint, but now G = A−1 .
All the reader needs are three fifty-five gallon drums, measuring sticks and some pipe to build
a nice unit operations lab experiment.
1. The reactions

$$1 \rightleftharpoons 2, \qquad 2 \rightleftharpoons 3, \qquad 3 \rightleftharpoons 1$$

are carried out in a spatially uniform flow reactor whose holding time is denoted θ. We have

$$\theta \frac{dx}{dt} = x_F - x + \theta K x$$

where x is the column vector of species mole fractions and $x_F$ denotes the feed composition.
The steady state $x_S$ satisfies

$$x_F = \left( I - \theta K \right) x_S$$

and $y = x - x_S$ satisfies

$$\frac{dy}{dt} = \left( -\frac{1}{\theta} I + K \right) y$$

where $y(t=0) = x(t=0) - x_S$. Show that as t grows large y(t) converges to 0 for all values of y(t=0).
2. The reversible reactions

$$i \; \underset{k_{ij}}{\overset{k_{ji}}{\rightleftharpoons}} \; j$$

now take place in a stagnant film where the species also diffuse, the concentrations satisfying

$$D \frac{d^2 c}{dx^2} + K c = 0, \qquad 0 < x < L$$

$$c(x=0) = c_0$$

and

$$\frac{dc}{dx}(x=L) = 0$$

where $c = (c_1, c_2, \ldots, c_n)^T$ and $D = \operatorname{diag}\left( D_1\;\; D_2\;\; \ldots\;\; D_n \right)$, $D_i > 0$. Determine the total rate of reaction in the film and express this in terms of an effectiveness factor matrix.
Because D and K do not ordinarily have a complete set of eigenvectors in common this might seem like a new problem. Multiplication by $D^{-1}$ shows that it is not. But the facts about $D^{-1}K$ have yet to be established. This can be done by observing that $D^{-1}K$ is similar to $D^{-1/2} K D^{-1/2}$ and to $K D^{-1}$. The first is symmetrizable due to

$$D^{-1/2} K D^{-1/2}\, C_{eq} = \left( D^{-1/2} K D^{-1/2}\, C_{eq} \right)^T.$$

To see this requires $K C_{eq} = \left( K C_{eq} \right)^T$ and use of the symmetry and commutativity of diagonal matrices. The Gerschgorin column circles of the second lie in the left half plane because its columns are multiples of the corresponding columns of K, the multiplying factors being positive, i.e., $\frac{1}{D_1}, \frac{1}{D_2}, \ldots$.

The matrix K is self adjoint in the inner product $G = C_{eq}^{-1}$. Show that the matrix $D^{-1}K$ is self adjoint in the inner product $G = C_{eq}^{-1} D$.
3. For reactions taking place in a solvent layer in contact with a reservoir supplying the reactants, the effectiveness factor matrix multiplies the rate of production vector evaluated at reservoir conditions to determine the true rate of production vector. When the reactions are

$$i \; \underset{k_{ij}}{\overset{k_{ji}}{\rightleftharpoons}} \; j$$

show that the effectiveness factor matrix is

$$\sum_{i=1}^{n} D\, \frac{\tanh\left( \sqrt{-\lambda_i}\, L \right)}{\sqrt{-\lambda_i}\, L}\; x_i y_i^T D^{-1}$$
When the single reaction

$$1 \; \underset{k'}{\overset{k}{\rightleftharpoons}} \; 2$$

takes place in the solvent show that the effectiveness factor matrix is

$$\frac{1}{D_1 k' + D_2 k} \left[ \begin{pmatrix} D_1 k' & D_1 k' \\ D_2 k & D_2 k \end{pmatrix} + \frac{\tanh\left( \sqrt{-\lambda_2}\, L \right)}{\sqrt{-\lambda_2}\, L} \begin{pmatrix} D_2 k & -D_1 k' \\ -D_2 k & D_1 k' \end{pmatrix} \right]$$

where

$$-\lambda_2 = \frac{D_1 k' + D_2 k}{D_1 D_2}.$$

Observe that the rate of production of either species depends on the rate of production of both species at reservoir conditions.
4. Let

$$A = \begin{pmatrix} -1 & 1 \\ -1 & -1 \end{pmatrix}.$$

Show that

$$A x = \lambda x$$

is satisfied by

$$\lambda_1 = -1 + i, \quad x_1 = \begin{pmatrix} i \\ -1 \end{pmatrix}$$

and

$$\lambda_2 = -1 - i, \quad x_2 = \begin{pmatrix} -i \\ -1 \end{pmatrix}$$

and that $x_1, x_2$ and $y_1, y_2$ are biorthogonal sets if

$$y_1 = -\frac{1}{2} \begin{pmatrix} -i \\ 1 \end{pmatrix}, \qquad y_2 = -\frac{1}{2} \begin{pmatrix} i \\ 1 \end{pmatrix}.$$

Then the solution to

$$\frac{dx}{dt} = A x$$

where x(t=0) is assigned is

$$x(t) = \left\langle y_1, x(t=0) \right\rangle e^{(-1+i)t} \begin{pmatrix} i \\ -1 \end{pmatrix} + \left\langle y_2, x(t=0) \right\rangle e^{(-1-i)t} \begin{pmatrix} -i \\ -1 \end{pmatrix}$$

Sketch the solution in the $x_1$, $x_2$ plane when $x(t=0) = \begin{pmatrix} 1 \\ -1 \end{pmatrix}$. Because the decay constant is Re $\lambda_1 = -1$ and the period of the revolution is $\frac{2\pi}{\mathrm{Im}\,\lambda_1} = 2\pi$ it may be difficult to see the spiral as $e^{-2\pi}$ is small, where $e^{-2\pi}$ is the factor by which the length of x(t) is shortened each period.
5. Three interconnected tanks, of cross sectional areas $A_1 = 1$, $A_2 = 2$ and $A_3 = 3$, drain one into another, and two experiments produce the level data:

    t     h1      h2      h3        h1      h2      h3
    0     1.0     4.0     2.0       1.0     2.0     10/3
    1     2.1226  2.7030  2.4905    1.8900  2.4323  2.7484
    2     2.3848  2.5275  2.5201    2.2677  2.4908  2.5835
    4     2.4860  2.5005  2.5043    2.4680  2.4998  2.5108
    8     2.4994  2.5000  2.5002
    ∞     2.5     2.5     2.5       2.5     2.5     2.5

Use the data on the right hand side to estimate the conductivities of the connecting pipes. Then use the estimates to predict the data on the left hand side.
It is of some interest to see how the "experimental data" were determined. As $A_1 = 1$, $A_2 = 2$ and $A_3 = 3$, the geometric conditions of the problem require

$$x_1 = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}, \qquad y_1 = \frac{1}{A_1 + A_2 + A_3} \begin{pmatrix} A_1 \\ A_2 \\ A_3 \end{pmatrix} = \frac{1}{6} \begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix}.$$

Then as $x_2$ and $x_3$ can be determined only up to a constant multiple, one degree of freedom is used in setting $x_2$ and this also determines $x_3$ via

$$x_1^T A x_3 = 0 = x_2^T A x_3.$$

Then $x_1$, $x_2$ and $x_3$ determine $y_1$, $y_2$ and $y_3$. The remaining two degrees of freedom are used in setting $\lambda_2$ and $\lambda_3$ ($\lambda_3 < \lambda_2 < 0$ as $\lambda_1 = 0$). However this cannot be done arbitrarily as $k_{12}$, $k_{23}$ and $k_{31}$ must be positive.
6. A batch experiment on a system of three isomerizing species produces the composition data:

    t      x1      x2      x3
    0      1       0       0
    1/4    0.7910  0.1967  0.0122
    1/2    0.6451  0.3161  0.0389
    1      0.4677  0.4324  0.0999
    2      0.3222  0.4908  0.1869
    4      0.2592  0.4998  0.2409
    ∞      0.2500  0.5000  0.2500

Determine the chemical rate coefficients $k_{ij}$. Improve the estimates of the rate coefficients given that the point $x_1 = 0.5$, $x_2 = 0.5$, $x_3 = 0$ lies on the slow straight line path.
7. Our model for the one dimensional diffusion of two solutes is

$$\frac{\partial c}{\partial t} = D \frac{\partial^2 c}{\partial x^2}$$

whose solutions, for initial data jumping by ∆ about a base composition $c_0$, depend on x and t through $\xi = x t^{-1/2}$, where $D_1 > D_2 > 0$ denote the eigenvalues of D corresponding to the eigenvectors $x_1$ and $x_2$.

Sketch $c(\xi) - c_0$ vs ξ for several ∆'s. You should see the curves turning toward the slow straight line path, away from the fast straight line path. Use your favorite matrix D.
Lecture 9

More Uses of Gerschgorin's Circle Theorem

As we explained in Lecture 6, Gerschgorin's circle theorem establishes a set of circles in the complex plane inside of which the eigenvalues of a matrix must lie, outside of which they cannot lie.
The theorem leads to estimates of the eigenvalues and while the estimates may not be sharp, neither
are they difficult to obtain.
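A small utility along these lines computes the circles and checks them; the matrix is a sample.

    import numpy as np

    def gerschgorin_rows(A):
        # row circles: centered at a_ii with radius = sum of |a_ij|, j != i
        centers = np.diag(A)
        radii = np.abs(A).sum(axis=1) - np.abs(centers)
        return centers, radii

    A = np.array([[-2.0, 1.0, 0.0],
                  [1.0, -2.0, 1.0],
                  [0.0, 1.0, -2.0]])
    centers, radii = gerschgorin_rows(A)
    for lam in np.linalg.eigvals(A):
        inside = np.any(np.abs(lam - centers) <= radii)
        print(f"eigenvalue {lam:+.4f} inside a circle: {inside}")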
Now stability conditions and, what is the same thing, convergence conditions tell us where the
eigenvalues of a matrix must not lie. So when convergence is the problem, what the circle theorem
tells us about where the eigenvalues of a matrix do not lie is interesting. This is our emphasis in
this lecture.
Because we will study the diffusion equation in subsequent lectures, we investigate here some
simple approximations to its solution.
The diffusion equation acts to smooth out solute irregularities. Hence it acts to damp concen-
tration excursions from equilibrium in a region whose boundary is held at equilibrium, viz.,
[Figure: c vs x, solute diffusion filling up troughs at the expense of crests.]

We also should see this in approximations to its solution.
The problem is

$$\frac{\partial c}{\partial t} = \frac{\partial^2 c}{\partial x^2}, \qquad 0 < x < 1$$

and

$$c(x=0) = 0 = c(x=1), \qquad t > 0$$

where c(t=0) is assigned. Here distance is scaled by L and time is scaled by $\frac{L^2}{D}$. We subdivide
. We subdivide
the interval (0,1) into n+1 subintervals of length h and approximate c at each of the points xi = ih,
i = 1, . . . , n, by ci so that at any time t the function c (x, t) is approximated by the vector
c (t)
1
c2 (t)
c (t) =
..
.
.
cn (t)
Using a second central difference to approximate the spatial derivative, we require ci to satisfy
dc 1
= 2 Ac
dt h
where
−2 1 0 0 ···
1 −2 1 0 · · ·
A=
0 1 −2 1 · · ·
···
As $A^T = A$ we see that A is self adjoint in the plain vanilla inner product, viz., G = I. Hence its eigenvalues, denoted $\lambda_i$, must be real and its eigenvectors, denoted $x_i$, can be scaled so that $\langle x_i, x_j \rangle = \delta_{ij}$ in the plain vanilla inner product. Then our approximation is

$$c(t) = \sum_{i=1}^{n} \left\langle x_i, c(t=0) \right\rangle e^{\frac{\lambda_i t}{h^2}} x_i$$

The circle theorem tells us that $-4 \leq \lambda_i \leq 0$ and as $\det A \neq 0$, $\lambda_i \neq 0$. Ordering the eigenvalues as $-4 \leq \lambda_n \leq \lambda_{n-1} \leq \cdots \leq \lambda_1 < 0$ we see that c(t) dies out to 0 exponentially as t grows large, the last gasp being $\langle x_1, c(t=0) \rangle e^{\frac{\lambda_1 t}{h^2}} x_1$. So, no matter how ragged the initial solute concentration, c(t=0), it is finally only as ragged as $x_1$.
In this problem the eigenvectors and eigenvalues of A are known explicitly: the components of $x_i$ are proportional to $\sin \frac{ij\pi}{n+1}$, j = 1, ..., n, and

$$\lambda_i = 2\cos\frac{i\pi}{n+1} - 2.$$
We can go on and ask for another approximation defined only at equally spaced values of t, viz., t = kT, k = 1, 2, .... The approximation resulting on replacing the time derivative by a forward difference satisfies

$$c_i^{k+1} - c_i^k = \frac{T}{h^2}\left( c_{i+1}^k - 2 c_i^k + c_{i-1}^k \right)$$

or

$$c^{k+1} = \left( I + \frac{T}{h^2} A \right) c^k.$$

The eigenvectors of $I + \frac{T}{h^2}A$ are those of A itself, the corresponding eigenvalues being $1 + \frac{T}{h^2}\lambda_i$, and as $-4 \leq \lambda_i < 0$, we find $1 - 4\frac{T}{h^2} \leq 1 + \frac{T}{h^2}\lambda_i < 1$. The approximation then is

$$c^k = \sum_{i=1}^{n} \left\langle x_i, c(k=0) \right\rangle \left( 1 + \frac{T}{h^2}\lambda_i \right)^k x_i$$

and $c^k \to 0$ as $k \to \infty$ iff $\left| 1 + \frac{T}{h^2}\lambda_i \right| < 1$. To be sure that this is so we must have $1 - 4\frac{T}{h^2} > -1$ or $T < \frac{1}{2}h^2$. In fact we require $1 - 4\frac{T}{h^2} > 0$ or $T < \frac{1}{4}h^2$ to be certain that $0 < 1 + \frac{T}{h^2}\lambda_i < 1$, thereby eliminating the possibility that terms in the approximation alternate in sign step by step.
We see that in this second approximation, having set h, we cannot set T freely and be certain that the approximation is well behaved. We also see that the two approximations differ, the factor $e^{\frac{\lambda_i t}{h^2}}$ in the first being replaced in the second by $\left( 1 + \frac{T}{h^2}\lambda_i \right)^{t/T}$ where t = kT. In fact the second converges to the first if we fix t and let $k \to \infty$ and $T \to 0$ so that kT = t.
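A sketch of the stability boundary: the spectral radius of $I + \frac{T}{h^2}A$ crosses 1 as T passes $\frac{1}{2}h^2$ (n = 20 is an arbitrary choice).

    import numpy as np

    n = 20
    h = 1.0 / (n + 1)
    A = -2.0 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)

    for T in [0.4 * h**2, 0.6 * h**2]:          # below and above h^2/2
        M = np.eye(n) + (T / h**2) * A
        rho = np.abs(np.linalg.eigvals(M)).max()
        print(f"T/h^2 = {T/h**2:.1f}: spectral radius = {rho:.4f}")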
Instead of replacing the time derivative by a forward difference, we can get another approximation if a backward difference is used. It satisfies

$$c_i^k - c_i^{k-1} = \frac{T}{h^2}\left( c_{i+1}^k - 2 c_i^k + c_{i-1}^k \right)$$

or

$$c^{k+1} = \left( I - \frac{T}{h^2} A \right)^{-1} c^k.$$

The eigenvectors of $\left( I - \frac{T}{h^2}A \right)^{-1}$, like those of $I + \frac{T}{h^2}A$, are those of A; the corresponding eigenvalues are now $\frac{1}{1 - \frac{T}{h^2}\lambda_i}$. Because $\lambda_i < 0$, these eigenvalues are all positive and lie to the left of 1, whence the approximation

$$c^k = \sum_{i=1}^{n} \left\langle x_i, c(k=0) \right\rangle \left( \frac{1}{1 - \frac{T}{h^2}\lambda_i} \right)^k x_i$$

goes to 0 as k goes to ∞, each term maintaining a fixed sign for all values of k, and we do not need a condition on T to make this happen. If we fix t and let $k \to \infty$ and $T \to 0$ so that kT = t, this approximation, like the second, converges to the first.
Anticipating higher accuracy, we can replace $\frac{\partial c}{\partial t}$ by the average of the two one-sided differences. Indeed as

$$e^{\frac{T}{h^2}\lambda_i} = 1 + \frac{T}{h^2}\lambda_i + \frac{1}{2}\left( \frac{T}{h^2}\lambda_i \right)^2 + \cdots$$

$$1 + \frac{T}{h^2}\lambda_i = 1 + \frac{T}{h^2}\lambda_i$$

and

$$\frac{1}{1 - \frac{T}{h^2}\lambda_i} = 1 + \frac{T}{h^2}\lambda_i + \left( \frac{T}{h^2}\lambda_i \right)^2 + \cdots$$

the average of the second and third expansions agrees with the first to three terms while each by itself agrees with the first to only two terms. Using this idea, then, we get

$$c_i^{k+1} - c_i^{k-1} = \frac{2T}{h^2}\left( c_{i+1}^k - 2 c_i^k + c_{i-1}^k \right)$$

or

$$c^{k+1} = \frac{2T}{h^2} A c^k + I c^{k-1}.$$

This is a second order difference equation; but it can be rewritten as a first order difference equation, viz.,

$$\begin{pmatrix} c^{k+1} \\ c^k \end{pmatrix} = \begin{pmatrix} \frac{2T}{h^2} A & I \\ I & 0 \end{pmatrix} \begin{pmatrix} c^k \\ c^{k-1} \end{pmatrix}$$

and solved as above. The readers can use Gerschgorin's theorem to see if they can learn anything about this problem.
A second order difference equation

$$c^{k+1} = A c^k + B c^{k-1}$$

turns up now and then, and it may be introduced in an attempt to stabilize an unstable first order difference equation by introducing a delay. The second order equation can be written in first order form as

$$\begin{pmatrix} c^{k+1} \\ c^k \end{pmatrix} = \begin{pmatrix} A & B \\ I & 0 \end{pmatrix} \begin{pmatrix} c^k \\ c^{k-1} \end{pmatrix}$$

and, on putting $c^k = \lambda^k y$, the requirement on λ and y is then

$$\left( \lambda^2 I - \lambda A - B \right) y = 0.$$

The condition that there be solutions y ≠ 0 is that $\det\left( \lambda^2 I - \lambda A - B \right) = 0$. A matrix whose elements are polynomials in a scalar λ is called a lambda matrix. Information about lambda matrices, their latent roots and latent vectors can be found in Lancaster's book "Theory of Matrices" and in the book "Matrix Polynomials" by Gohberg, Lancaster and Rodman, as well as in Frazer, Duncan and Collar's book "Elementary Matrices". If, however, A and B have a common set of eigenvectors, elementary methods can be used.
In our problem, where

$$c^{k+1} = \frac{2T}{h^2} A c^k + I c^{k-1},$$

this is the case. Writing

$$c^k = \sum_{i=1}^{n} \left\langle x_i, c^k \right\rangle x_i,$$

we find

$$\left\langle x_i, c^{k+1} \right\rangle = \frac{2T}{h^2}\lambda_i \left\langle x_i, c^k \right\rangle + \left\langle x_i, c^{k-1} \right\rangle.$$

This is a second order constant coefficient difference equation to be solved for each value of i. Its solution is

$$\left\langle x_i, c^k \right\rangle = a_i \mu_{i1}^k + b_i \mu_{i2}^k$$

where $\mu_{i1}$ and $\mu_{i2}$ are the roots of $\mu^2 - 2\frac{T}{h^2}\lambda_i \mu - 1 = 0$ and where $a_i$ and $b_i$ satisfy

$$\left\langle x_i, c(k=0) \right\rangle = a_i + b_i$$

and

$$\left\langle x_i, c(k=1) \right\rangle = a_i \mu_{i1} + b_i \mu_{i2}.$$

The product of the roots is $\mu_{i1}\mu_{i2} = -1$ and hence, because $\lambda_i < 0$, half of these values lie to the left of -1 and so the approximation

$$c^k = \sum_{i=1}^{n} \left( a_i \mu_{i1}^k + b_i \mu_{i2}^k \right) x_i$$

grows in magnitude and, sooner or later, oscillates in sign each time k increases by 1 whatever values of T and h are used.
This fourth approximation is named after L. Richardson and is an example of a good idea that
did not work out.
The forward difference approximation leads to the iteration matrix $I + \frac{T}{h^2}A$ whereas the backward difference approximation leads to $\left( I - \frac{T}{h^2}A \right)^{-1}$. Stability places a condition on T in the first but not in the second. But to expand the second in powers of $\frac{T}{h^2}A$ we must have $\left| \frac{T}{h^2}\lambda_i \right| < 1$ and for this it is sufficient that $T < \frac{1}{4}h^2$.

The readers may wish to investigate the stability of the Crank-Nicholson approximation which leads to the iteration matrix $\left( 2I - \frac{T}{h^2}A \right)^{-1}\left( 2I + \frac{T}{h^2}A \right)$.
1. Two approximations to

$$\frac{\partial c}{\partial t} = D \frac{\partial^2 c}{\partial x^2} - v \frac{\partial c}{\partial x}$$

result on making a forward or a backward difference approximation to $\frac{\partial c}{\partial x}$, viz.,

$$\frac{\partial c_i}{\partial t} = D\, \frac{c_{i+1} - 2 c_i + c_{i-1}}{h^2} - v\, \frac{c_{i+1} - c_i}{h} \quad \text{or} \quad \frac{\partial c_i}{\partial t} = D\, \frac{c_{i+1} - 2 c_i + c_{i-1}}{h^2} - v\, \frac{c_i - c_{i-1}}{h}.$$

Let v be positive and use Gerschgorin's circle theorem to obtain stability conditions on the size of h in each approximation. When v = 0 stability is obtained for all values of h.
2. Use Gerschgorin's circle theorem to determine where the eigenvalues of

$$\left( 2I - \frac{T}{h^2} A \right)^{-1} \left( 2I + \frac{T}{h^2} A \right)$$

lie as a function of $\frac{T}{h^2}$.
3. To solve Ax = b by iteration, write A = L + D + U, where L, D and U denote the strictly lower triangular, diagonal and strictly upper triangular parts of A. Then Ax = b can be written

$$x = -D^{-1}(L + U)\, x + D^{-1} b$$

and the Jacobi iteration is

$$x^{k+1} = -D^{-1}(L + U)\, x^k + D^{-1} b.$$

The error $e^k = x^k - x$ then satisfies

$$e^{k+1} = J e^k$$

where $J = -D^{-1}(L + U)$ and convergence obtains iff the eigenvalues of J lie inside the unit circle. You can obtain a sufficient condition for this, viz.,

$$\sum_{j \neq i} |a_{ij}| < |a_{ii}|, \qquad i = 1, 2, \ldots, n,$$

by using Gerschgorin's theorem, and it tells you to carry out elementary row and column operations on your problem so that when you write it Ax = b, A is as diagonally dominant as possible.
4. The Gauss iteration results on writing Ax = b as

$$x = -(L + D)^{-1} U x + (L + D)^{-1} b$$

whence we have

$$e^{k+1} = G e^k$$

where $G = -(L + D)^{-1} U$. Use Gerschgorin's circle theorem to derive sufficient conditions for a Gauss iteration to converge.
Part II

In Part II we are going to study linear differential equations in which the differential operator is, for the most part, ∇². The solutions to our problems will depend on position as well as on time and the spaces where they reside will be called function spaces. The solutions to the eigenvalue problem for ∇² will be called eigenfunctions and ordinarily there will be infinitely many independent solutions. The function spaces will be infinite dimensional and our solutions will be in the form of infinite series. Many difficult questions will then arise that did not arise in Part I and many of these difficulties can be reduced to the question: what interpretation can we place on an infinite sum of eigenfunctions?
Before we take up such questions, if we do so at all, we explain the way to go about constructing
the solution to a linear differential equation. We do this by expanding the solution in a series of
eigenfunctions. To obtain the eigenfunctions we need to explain how to solve the eigenvalue
problem. We do this by separation of variables. So we have two aims: first to explain how
eigenfunction expansions are used to solve linear differential equations, second to explain the
method of separation of variables as it is used to solve the corresponding eigenvalue problem.
As ∇2 is the focus of our work, we first establish the elementary facts about ∇2 . We then use
separation of variables to reduce the eigenvalue problem for ∇2 to a set of three one-dimensional
eigenvalue problems and we use Frobenius’ method to solve these eigenvalue problems.
Lecture 10

A Word To The Reader Upon Leaving Finite Dimensional Vector Spaces

Earlier, in Part I, the solutions to our problems were finite sums and we might have, but did not, substitute such a sum directly into an equation to be solved in order to determine the coefficients
of the eigenvectors in the sum. Indeed, to see how easily this works, the reader needs to determine the equations for the $c_i(t)$ by substituting

$$x(t) = \sum c_i(t)\, x_i$$

into

$$\frac{dx}{dt} = A x.$$
In Part II this may work, but it may not. The problem is that the solutions are infinite sums
and it may not be known before the problem is solved whether or not the derivative of a sum is in
fact the sum of the derivatives of its terms. There is at least one simple example in what follows to
illustrate this point.
What we do does not differ from what we did in Part I and does not require that an assumed
solution be substituted into the equation being solved. Indeed the coefficients to be determined are
found by integrating this equation, after it is multiplied by suitable weighting functions.
The lectures are on the diffusion equation, its solution in bounded regions in terms of eigen-
functions, the solution of the eigenvalue problem by separation of variables and some problems in
Cartesian, cylindrical and spherical coordinate systems to fill in the details.
Lecture 11

The Differential Operator ∇²
Part II is about the differential operator ∇2 . To begin we need to learn how to write ∇ and
∇2 = ∇ · ∇ in coordinate systems of interest and for us this means only orthogonal coordinate
systems. The simplest of these and our starting point is a system of Cartesian coordinates denoted
x, y, z.
A point in space, say P , can be located with respect to an origin O by the vector
−→
~r = x~i + y~j + z~k = OP where (x, y, z) denotes the Cartesian coordinates of the point P and
where ~i, ~j and ~k are unit vectors along the axes Ox, Oy and Oz. By design we have
~i · ~j = 0, ~j · ~k = 0, ~k · ~i = 0
~i × ~j = ~k, ~j × ~k = ~i, ~k × ~i = ~j
Then, at the point P , we have tangents to the coordinate curves passing through P , viz., the
∂~r ~
tangent to the coordinate curve where x is increasing at constant y and z is = i. Likewise we
∂x
245
LECTURE 11. THE DIFFERENTIAL OPERATOR ∇2 246
∂~r ∂~r
have = ~j and = ~k. Thus, at P , the set of vectors ~i, ~j, ~k forms an orthogonal basis of
∂y ∂z
unit length vectors, the same basis at any point P as at any other point. Hence the nine derivatives
∂~i
, etc., all vanish. This is what makes Cartesian coordinates the simplest coordinate system.
∂x
Now, suppose that we have a smooth scalar or vector or tensor valued function defined through-
out a region of space and that we wish to introduce a notation that allows us to differentiate this
function. To do this let C denote a curve lying in this region and let s denote arc length along this
curve. The positions of points on the curve are denoted ~r (s) or x(s), y(s), z(s), and the tangent to
d~r
the curve at a point P on the curve is denoted by ~t where ~t = and ~t · ~t = 1 due to ds2 = d~r · d~r.
ds
We introduce the differential operator ∇ so that at a point P of C the derivative of f with
respect to arc length along C is given by
df ~
= t · ∇f
ds
where ∇f depends on the point P and ~t, at the point P , depends on the curve C.
We now select three curves passing through P having unit tangents at P denoted ~t1 , ~t2 and ~t3
and we introduce the set of vectors ~a1 , ~a2 , ~a3 ⊥ ~t1 , ~t2 , ~t3 . Then we have
whereupon
df df df
∇f = ~a1~t1 + ~a2~t2 + ~a3~t3 · ∇f = ~a1 + ~a2 + ~a3
ds1 ds2 ds3
df df df
and therefore we have ∇f in terms of three derivatives of f , viz., , and , along three
ds1 ds2 ds3
curves C1 , C2 and C3 passing through P .
Thus on any curve C the derivative of f with respect to arc length along the curve is given by
df ~ df df df
= t · ~a1 + ~t · ~a2 + ~t · ~a3
ds ds1 ds2 ds3
Now if we have a coordinate system, we will choose the three curves through P to be the three
LECTURE 11. THE DIFFERENTIAL OPERATOR ∇2 247
coordinate curves passing through P . Hence in Cartesian coordinates we choose ~t1 = ~i, ~t2 = ~j,
and ~t3 = ~k whereupon we have
df ∂f df ∂f df ∂f
= , = , and =
ds1 ∂x ds2 ∂y ds3 ∂z
and, therefore,
∂f ~ ∂f ~ ∂f
∇f = ~i +j +k
∂x ∂y ∂z
~i ∂ + ~j ∂ + ~k ∂
∂x ∂y ∂z
whereupon we have
2 ∂2 ∂2 ∂2
∇ =∇·∇= 2 + 2 + 2
∂x ∂y ∂z
making use of the fact that ~i, ~j and ~k are independent of x, y and z.
Now, we may introduce a new coordinate system where the new coordinates, (u, v, w), like (x, y, z),
are coordinates of a point P . We do this by writing
x = f (u, v, w)
y = g (u, v, w)
z = h (u, v, w)
LECTURE 11. THE DIFFERENTIAL OPERATOR ∇2 248
are tangent to the coordinate curves, viz., the curves u increasing at constant v and w, v increasing
at constant w and u, etc.
This being the case, and it is the only case of interest to us, we introduce unit length vectors along
the three coordinate curves by scaling ~ru, ~rv and ~rw, viz.,
And now at each point of space we have an orthogonal basis of unit vectors that, algebraically, acts
just like ~i, ~j, ~k , viz.,
~i · ~i = 0, ~i · ~i = 0 and ~i · ~i = 0
u v v w w u
and
~i × ~i = ~i , ~i × ~i = ~i and ~i × ~i = ~i
u v w v w u w u v
However, the vectors ~ru, ~rv and ~rw ordinarily do not remain fixed in direction as we move from a
point P to a nearby point.
To write a formula for ∇ in our new coordinate system, we introduce a curve C defined by
and then we differentiate f with respect to arc length along this curve, viz.,
df ∂f du ∂f dv ∂f dw
= + +
ds ∂u ds ∂v ds ∂w ds
we have
du ~r · ~t ~i
= u 2 = u · ~t
ds | ~ru| | ~ru|
dv ~r · ~t ~i
= v 2 = v · ~t
ds | ~rv| | ~rv|
and
dw ~r · ~t ~i
= w 2 = w · ~t
ds | ~rw| | ~rw|
whence we obtain
( )
df ~ ~i ~ ~
=t· u ∂ + iv ∂ + iw ∂ f = ~t · ∇f
ds | ~ru| ∂u | ~rv | ∂v | ~rw| ∂w
~i ∂ ~i ∂ ~i ∂
∇= u + v + w
hu ∂u hv ∂v hw ∂w
and where
Hence if we can write a formula for ds2 in our coordinate system, we can read off the formulas for
hu, hv and hw. For example, in cylindrical coordinates we have
ds2 = dr 2 + r 2 dθ2 + dz 2
We now have a formula for ∇ in any orthogonal coordinate system and we can proceed to
derive a formula for ∇2 = ∇ · ∇.
Before we do this, we introduce the surface gradient, denoted ∇S in order that we may differentiate
functions defined on a surface.
x = f (α, β)
LECTURE 11. THE DIFFERENTIAL OPERATOR ∇2 251
y = g (α, β)
and
z = h (α, β)
lie on a surface, denoted S, on which the curves α constant, β increasing, and α increasing, β
constant, are coordinate curves. The vectors ~rα and ~rβ , tangent to these curves at a point P of S,
ordinarily are not perpendicular. They can be used to determine the unit normal to S at P via
~rα × ~rβ
~n =
~rα × ~rβ
where
2 2 2
2
~rα × ~rβ = ~rα ~rβ − ~rα · ~rβ
The two sets of vectors ~a, ~b, ~n and ~rα, ~rβ , ~n are biorthogonal if ~a and ~b are given by
~rβ × ~n
~a =
~rα · ~rβ × ~n
and
~b = ~n × ~rα
~rα · ~rβ × ~n
df df dα df dβ
= +
ds dα ds dβ ds
where α = α (s) and β = β (s) define the curve and s denotes arc length along the curve.
LECTURE 11. THE DIFFERENTIAL OPERATOR ∇2 252
Due to
we have
dα dβ ~ ~
= ~a · ~t, =b·t
ds ds
and hence
df ~ ∂ ∂
= t · ~a + ~b f
ds ∂α ∂β
which we write as
df ~
= t · ∇S f
ds
where
∂ ∂
∇S = ~a + ~b
∂α ∂β
d
∇ = ∇S + ~n
ds
The mean curvature of a surface, denoted by H, is important in some of our later examples and
we record the fact that it can be obtained via
2H = −∇S · ~n
gαα = ~rα · ~rα , gαβ = ~rα · ~rβ , and gββ = ~rβ · ~rβ ,
LECTURE 11. THE DIFFERENTIAL OPERATOR ∇2 253
we have
−1
g
αα gαβ bαα bαβ
2H = tr
g bαβ bββ
αβ gββ
z = Z (x, y)
~k − Z ~i − Z ~j
x y
~n = q
1 + Z x2 + Z y 2
and
~rxy = Z xy ~k
whereupon we find
1 + Z y2 Z xx − 2 Z x Z y Z xy + 1 + Z x2 Z yy
2H = 3/2
2
1 + Zx + Zy 2
LECTURE 11. THE DIFFERENTIAL OPERATOR ∇2 254
To derive a formula for ∇2 in the u, v, w coordinate system we first notice that if f is a vector
valued function, say ~v , where
then to calculate the tensor ∇~v where ∇~v can be used to find the derivative of ~v with respect to arc
length along a curve, say a particle path, we write
n o
1~ ∂
∇~v = iu +··· vu~iu + · · ·
hu ∂u
1 ~ ∂ n ~ o 1 ∂vu ~ ~ vu ~ ∂~iu
iu vu iu = i i + i
hu ∂u hu ∂u u u hu u ∂u
and we see that to get ∇~v we are going to need three derivatives of each of the three base vectors,
twenty seven components in all. These components are of the form
∂~iη
~i ·
ξ ∂ζ
First we have
∂~iξ
~i · =0
ξ ∂η
∂ 2~r ∂ 2~r
=
∂ξ ∂η ∂η ∂ξ
LECTURE 11. THE DIFFERENTIAL OPERATOR ∇2 255
we have
∂ ~ ∂ ~
hη iη = h i
∂ξ ∂η ξ ξ
hence
whereupon we obtain
∂~iξ 1 ∂hη
~i · =
η ∂η hξ ∂ξ
and this is all that we need to derive a formula for ∇ · ~v or for ∇2 = ∇ · ∇, i. e., many of the terms
that appear in ∇v or in ∇∇ are eliminated by the dot product.
1 ~ ∂ 1 ~ ∂
The terms coming from iv · and iw · can be written by replacing u, v and w by v, w
hv ∂v hw ∂w
and u, etc., in this formula.
More simplifications are possible. In fact the reader can derive the formula:
2 1 ∂ hv hw ∂ ∂ hw hu ∂ ∂ hu hv ∂
∇ = + +
hu hv hw ∂u hu ∂u ∂v hv ∂v ∂w hw ∂w
But we turn our attention to some examples which indicate a direct way to write ∇2 .
x = r cos θ
y = r sin θ
z=z
Z
iz
iθ
ir
z
r Y
θ
X
LECTURE 11. THE DIFFERENTIAL OPERATOR ∇2 257
whereupon we have
~i = ~k, hz = 1
z
∂~ir ~ ∂~iθ
= iθ, = −~ir
∂θ ∂θ
∂ 1 ∂ ~ ∂
∇ = ~ir + ~iθ + iz
∂r r ∂θ ∂z
and
∂2 1 ∂ 1 ∂2 ∂2
∇2 = + + +
∂r 2 r ∂r r 2 ∂θ2 ∂z 2
1 ∂ 1~ ∂ ∂
where the term appears due to iθ · ~ir and where
r ∂r r ∂θ ∂r
∂2 1 ∂ 1 ∂ ∂
+ = r
∂r 2 r ∂r r ∂r ∂r
y = r sin θ sin φ
z = r cos θ
LECTURE 11. THE DIFFERENTIAL OPERATOR ∇2 258
ir
iφ
θ
r
iθ
Y
φ
and we have
∂~ir ~ ∂~ir
= iθ, = sin θ ~iφ,
∂θ ∂φ
∂~iθ ∂~iθ
= −~ir, = cos θ ~iφ,
∂θ ∂φ
∂~iφ
= − sin θ ~ir − cos θ ~iθ
∂φ
LECTURE 11. THE DIFFERENTIAL OPERATOR ∇2 259
Hence we have
∂ 1 ∂ 1 ~ ∂
∇ = ~ir + ~iθ + i
∂r r ∂θ r sin θ φ ∂φ
whereupon
2
~i ∂ · ∇ = ∂
r ∂r ∂r 2
and
1~ ∂ 1 ∂ 1 ∂2
iθ ·∇= + 2 2
r ∂θ r ∂r r ∂θ
1 ~ ∂ 1 ∂ 1 1 ∂ 1 ∂2
iφ ·∇= sin θ + cos θ + 2 2
r sin θ ∂φ r sin θ ∂r r sin θ r ∂θ r sin θ ∂φ2
and we find
2 1 ∂ 2 ∂ 1 ∂ ∂ 1 ∂2
∇ = 2 r + 2 sin θ + 2 2
r ∂r ∂r r sin θ ∂θ ∂θ r sin θ ∂φ2
Now we have formulas for ∇2 in three coordinate systems and enough information to write ∇2
in any other orthogonal coordinate system. Given a coordinate system it is often easier to proceed
to ∇2 directly via the base vectors and their derivatives than it is to use general formulas.
For axisymmetric flow of an incompressible fluid, i.e., a flow where the velocity components
are the same in every plane through the z-axis we can write
1~
~v = i × ∇ψ (r, z)
r θ
LECTURE 11. THE DIFFERENTIAL OPERATOR ∇2 260
1 ~
~v = i × ∇ψ (r, θ)
r sin θ φ
in spherical coordinates.
and the reader ought to work out the corresponding formula in spherical coordinates, viz.,
1
∇ × ~v = E 2 ψ ~iφ
r sin θ
Our aim is to solve problems on a domain D and to do this in terms of the eigenfunctions of ∇2 on
the domain we must solve the eigenvalue problem
∇2 ψ + λ 2 ψ = 0 on D
where, say, ψ = 0 on the boundary of D. We now know how to write ∇2 in a variety of orthogonal
coordinate systems and if one of these coordinate systems fits our needs we will try to solve our
eigenvalue problem in this coordinate system.
But there are many domains where we can not do this and for some of these, those close to a
domain where our methods will work, we have an option.
Suppose our domain D lies close to a domain D0 on which we are able to solve our problem.
We call D0 the reference domain and we wish to know what problems to solve on D0 in order to
estimate the solution to our problem on D.
LECTURE 11. THE DIFFERENTIAL OPERATOR ∇2 261
To sketch the main idea, we assume D and D0 are two dimensional and in a Cartesian coordi-
nate system we denote the points of D by (x, y) and those of D0 by (x0 , y0).
We imagine a family of domains, Dε, growing out of D0 , one of these being D. Then points
(x, y, ε) ∈ Dε grow out of points (x0 , y0) ∈ D0 via
where the boundary points of Dε, viz., y = Y (x, ε) = g (x0 , Y0 (x0 ) , ε) grow out of the corre-
sponding boundary points of D0 , viz., y0 = Y0 (x0 )
x = x0 , y = g (x0 , y0 , ε)
dg 1 d2 g
g (x0 , y0, ε) = g (x0 , y0 , ε = 0) + ε (x0 , y0 , ε = 0) + ε2 2 (x0 , y0, ε = 0) + · · ·
dε 2 dε
where
g (x0 , y0, ε = 0) = y0
and we define
dg
y1 = (x0 , y0 , ε = 0)
dε
d2 g
y2 = (x0 , y0 , ε = 0)
dε2
etc.
x = x0
LECTURE 11. THE DIFFERENTIAL OPERATOR ∇2 262
1 2
y = y0 + ε y1 (x0 , y0 ) + ε y2 (x0 , y0 ) + · · ·
2
on the domain and
x = x0
1 2
y = Y (x, ε) = g (x0 , Y0 (x) , ε) = Y0 (x0 ) + ε Y1 (x0 ) + ε Y2 (x0 ) + · · ·
2
on its boundary where Y0 (x0 ) defines the boundary of D0 , where
Y1 (x0 ) = y1 x0 , Y (x0 ) , Y2 (x0 ) = y2 x0 , Y (x0 ) , etc.
Now our problem is to find a function, denoted u, defined on a domain Dε. Assuming the
equations satisfied by u on Dε have the same form for all values of ε, we begin by expanding
u (x, y, ε) along the mapping where all derivatives along the mapping hold x0 and y0 fixed.
du
u (x, y, ε) = u x = x0 , y = y0 , ε = 0 + ε x = x0 , y = y0 , ε = 0 +
dε
1 2 d2 u
ε x = x0 , y = y 0 , ε = 0 +···
2 dε2
d
where u depends on x, y, ε and holds x0 and y0 fixed.
dε
Hence we have, using the chain rule,
du ∂u ∂u dy
(x, y, ε) = (x, y, ε) + (x, y, ε) (x0 , y0 , ε)
dε ∂ε ∂y dε
and therefore
du ∂u0
(x = x0 , y = y0 , ε = 0) = u1 (x0 , y0 ) + y1 (x0 , y0 ) (x0 , y0)
dε ∂y0
LECTURE 11. THE DIFFERENTIAL OPERATOR ∇2 263
where
∂u
u1 (x0 , y0 ) = (x = x0 , y = y0 , ε = 0)
∂ε
dg
y1 (x0 , y0 ) = (x0 , y0 , ε = 0)
dε
and
∂u0 ∂u
(x0 , y0 ) = (x = x0 , y = y0 , ε = 0)
∂y0 ∂y
Likewise we have
d2 u ∂u1 (x0 , y0 )
2
(x = x0 , y = y0 , ε = 0) = u2 (x0 , y0 ) + 2y1 (x0 , y0 ) +
dε ∂y0
∂ 2 u0 (x0 , y0 ) ∂u0 (x0 , y0 )
y12 (x0 , y0 ) 2
+ y2 (x0 , y0 )
∂y0 ∂y0
where
∂2u
u2 (x0 , y0 ) = (x = x0 , y = y0 , ε = 0)
∂ε2
1 2
x = x0 , y = y0 + ε y1 (x0 , y0) + ε y2 (x0 , y0) + · · ·
2
LECTURE 11. THE DIFFERENTIAL OPERATOR ∇2 264
and we find
∂u (x, y, ε) ∂u0 ∂u1 ∂ 2 u0
= +ε + y1 2 +
∂y ∂y0 ∂y0 ∂y0
1 2 ∂u2 ∂ 2 u1 3
2 ∂ u0 ∂ 2 u0
ε + 2y1 2 + y1 + y2 +···
2 ∂y0 ∂y0 ∂y03 ∂y02
Likewise we have
∂u (x, y, ε) ∂u0 ∂u1 ∂ 2 u0
= +ε + y1 +
∂x ∂x0 ∂x0 ∂y0 ∂x0
1 2 ∂u2 ∂ 2 u1 3
2 ∂ u0 ∂ 2 u0
ε + 2y1 + y1 2 + y2 +···
2 ∂x0 ∂y0 ∂x0 ∂y0 ∂x0 ∂y0 ∂x0
The reader may notice that only y1 , y2 , . . . appear in these two formulas, not their derivatives.
To see how the derivatives are lost, the algebra must be worked out. The main idea is to replace
∂y0 ∂y1 1 2 ∂y2 ∂u
with 1 − ε − ε − · · · in the derivation of the formula for .
∂y ∂y 2 ∂y ∂y
Now to derive the equations for u0 , u1 , u2, . . . on the reference domain, we substitute our ex-
pansions for u and its derivatives into the equation satisfied by u. Doing this we discover that the
mappings do not survive. For example, if our problem is
∇2 u = f on Dε
we substitute
∂2u ∂ 2 u0 ∂ 2 u1 ∂ ∂ 2 u1
= +ε + y 1 +···
∂x2 ∂x20 ∂x20 ∂y0 ∂x20
∂2u ∂ 2 u0 ∂ 2 u1 ∂ ∂ 2 u1
= +ε + y 1 +···
∂y 2 ∂y02 ∂y02 ∂y0 ∂y02
and
∂
f = f0 + ε f1 + y1 f0 + · · ·
∂y0
LECTURE 11. THE DIFFERENTIAL OPERATOR ∇2 265
to obtain
∂ ∂
∇20 u0 2
+ ε ∇0 u 1 + y 1 2
∇ u0 + · · · = f0 + ε f1 + y1 f0 + · · ·
∂y0 0 ∂y0
∇20 u0 = f0
∇20 u1 = f1
etc.
and the conclusion is this: the equations for u0 , u1, u2 on D0 , i.e., on the reference domain, can be
derived by ordinary methods from the equation for u on Dε. The mapping of D0 into Dε does not
appear, and we are grateful, because we can never know what it is.
1
Thus we can substitute u = u0 + εu1 + ε2 u2 + · · · into the equation for u and set to zero
2
terms of order zero, one, two, etc. while paying no attention to the fact that Dε is not D0 . By doing
this we obtain the equations for u0 , u1 , u2 , . . . on D0 .
The boundary is different. For example suppose u = 0 must be satisfied at y = Y (x) in the
forgoing problem. Then using the expansion
∂u0
u(x, Y (x) ) = u0 + ε u1 + Y1 +··· +···
∂y0
we obtain at y0 = Y0 (x0 )
u0 = 0
∂u0
u1 + Y1 =0
∂y0
etc.
And we see that in the boundary conditions the displacement of D0 into Dε appears.
To obtain u (x, y, ε) from u0 (x0 , y0 ), u1 (x0 , y0 ), etc., not knowing the mapping, we rearrange
LECTURE 11. THE DIFFERENTIAL OPERATOR ∇2 266
the expansion
2
∂u0 1 2 ∂u1 2 ∂ u0 ∂u0
u (x, y, ε) = u0 (x0 , y0 ) + ε u1 + y1 + ε u2 + 2y1 + y1 + y2 +···
∂y0 2 ∂y0 ∂y02 ∂y0
using
1
y − y0 = εy1 + ε2 y2 + · · ·
2
and
∂u0 1 ∂ 2 u0
u0 (x, y) = u0 (x0 , y0 ) + (x0 , y0 ) (y − y0 ) + 2
(x0 , y0) (y − y0 )2 + · · ·
∂y0 2 ∂y0
etc.
and conclude
1 2
u (x, y, ε) = u0 (x, y) + εu1 (x, y) + ε u2 (x, y) + · · ·
2
and by this we obtain u at all points (x, y) of Dε which are also points of D0 .
∇2 ψ + λ 2 ψ = 0 on D
ψ=0 on y = Y (x)
we solve
∇2 ψ0 + λ20 ψ0 = 0 on D0
ψ0 = 0 on y0 = Y0 (x0 )
LECTURE 11. THE DIFFERENTIAL OPERATOR ∇2 267
∇2 ψ1 + λ20 ψ1 = −λ21 ψ0 on D0
∂ψ0
ψ1 + Y1 =0 on y0 = Y0 (x0 )
∂y0
∂ψ1 ∂ 2 ψ0 ∂ψ0
ψ2 + 2Y1 + Y12 2
+ Y2 =0 on y0 = Y0 (x0 )
∂y0 ∂y0 ∂y0
etc.
And we notice that the homogeneous part of every problem has a solution, not zero. Hence a
solvability condition must be satisfied at every order and it will determine λ21 , λ22 , . . . the corrections
to λ20 . The displacement of the boundary , given by Y1 , Y2 , . . ., will appear in the solvability
conditions.
Domain perturbations would be useful if we have a heavy fluid lying above a light fluid in
a container of arbitrary cross section and we wish to learn if the interface is stable to a small
displacement. Hopefuly, the arbitrary cross section is close to, say, a circle, and the displaced
interface is close to, say, a plane.
More the details can be found in the book ”Interfacial Instability” by L.E. Johns and R.
Narayanan.
~v = ~k × ∇ψ (x, y)
1~
~v = i × ∇ψ (r, z) cylindrical, r 2 = x2 + y 2
r θ
or
1 ~
~v = i × ∇ψ (r, θ) spherical, r 2 = x2 + y 2 + z 2
r sin θ φ
Introducing orthogonal coordinates ξ, η, derive formulas for vξ and vη in terms of ψ (ξ, η).
x = f (ξ, η) , y = g (ξ, η)
r = f (ξ, η) , z = g (ξ, η)
or
r = f (ξ, η) , θ = g (ξ, η)
2. You are to write out the terms appearing in the Navier-Stokes equation. Do this in cylindrical
coordinates where
and where
∂~ir ∂~iθ
= ~iθ and = −~ir
∂θ ∂θ
∇~v
∇ · ~v
∇ × ~v
~v · ∇~v
∇ · ∇~v = ∇2~v
∇ · (∇~v )T = ∇ (∇ · ~v )
x = r cos θ
y = r sin θ
r = r (ξ, η) , z = z (ξ, η)
LECTURE 11. THE DIFFERENTIAL OPERATOR ∇2 270
so that
x = r (ξ, η) cos θ
y = r (ξ, η) sin θ
z = z (ξ, η)
r (ξ, η) = ξ sin η
z (ξ, η) = ξ cos η
r = c sinh ξ sin η
z = c cosh ξ cos η
and
r = c cosh ξ sin η
z = c sinh ξ cos η
where 0 ≤ ξ ≤ ∞, 0 ≤ η ≤ π.
Show that prolate and oblate spheroidal coordinates are orthogonal, write ∇2 and reduce
(∇2 + λ2 ) ψ = 0 to ordinary differential equations by separation of variables.
LECTURE 11. THE DIFFERENTIAL OPERATOR ∇2 271
1~ ~
1 x2 + y 2
4. Show that ~v = xi − y j and p = − ρ satisfy
T 2 T2
and
∇ · ~v = 0
Notice that:
p decreasing
~v = | ~v | ~t
d ~ d | ~v | ~ d~t
~a = ~v · ∇~v = | ~v | ~t · ∇ | ~v | ~t = | ~v | | ~v | t = | ~v | t + | ~v |
ds ds ds
d~t
and where = κ~p and p~ · ~t = 0.
ds
5. Denote by m
~ a magnetic dipole. The vector potential and the magnetic induction due to m
~
are:
~ = µ0 m
A ~ × ~r
4π
LECTURE 11. THE DIFFERENTIAL OPERATOR ∇2 272
and
~ =∇×A
B ~
To learn something about diffusion without doing very much, we present some diffusion problems
in unbounded domains.
∂c ∂2c
= D 2, −∞ < x < ∞
∂t ∂x
and we only need c to vanish strongly enough as | x| → ∞ so that all of its moments are finite.
273
LECTURE 12. DIFFUSION IN UNBOUNDED DOMAINS 274
we can derive the equation satisfied by cm (t) by multiplying the diffusion equation by xm and
integrating the result over all x. Simplifying the right hand side by integration by parts and setting
all the terms evaluated at ±∞ to zero we get
dcm
= Dm(m − 1)cm−2 , m = 0, 1, 2, . . .
dt
dc0
=0
dt
dc1
=0
dt
and
dc2
= 2Dc0
dt
c0 = c0 (t = 0)
c1 = c1 (t = 0)
and
c2 = c2 (t = 0) + 2Dtc0 (t = 0)
The first result expresses the fact that solute is neither gained nor lost in diffusion, it is just
redistributed. The second and third results tell us something about this redistribution. If we think
of c at any time t as the result of an experiment designed to determine the spatial positions of the
LECTURE 12. DIFFUSION IN UNBOUNDED DOMAINS 275
c(x, t)
solute molecules at that time then dx is the probability that the measurement of a molecule’s
c0
position falls between x and x + dx at time t and hence we have
Z +∞
c(x, t) c1
x = x dx =
−∞ c0 c0
and
Z +∞
2 c(x, t) c2
x = x2 dx =
−∞ c0 c0
where h i denotes the average or expected value of a function of x, weighted by the solute density.
So c1 (t) and c2 (t) determine the expected values of x and x2 and in a simple diffusion experiment
and we see hxi is fixed while hx2 i increases linearly in time.
The variance of the solute distribution, i.e., the average of (x − hxi)2 , tells us about the spread-
ing of the solute, and as this is
D 2 E
2
σ = x− x
= hx2 i − hxi2
2
c2 c1
= −
c0 c0
we find that
σ 2 = σ 2 (t = 0) + 2Dt
This is a formula for D in terms of the variance of the solute concentration as independent
solute molecules spread out in a solvent, viz.,
1 dσ 2 1 dc2
D= =
2 dt 2c0 dt
All this carries over to three dimensional problems where a solute distribution is assigned at
LECTURE 12. DIFFUSION IN UNBOUNDED DOMAINS 276
then satisfy
dcℓmn
= Dℓ(ℓ − 1)c(ℓ−2)mn + Dm(m − 1)cℓ(m−2)n + Dn(n − 1)cℓm(n−2)
dt
and again this set of equations can be solved recursively. And a formula for D in three dimensional
diffusion in terms of
σ 2 = h x2 i − h x i2 + h y 2 i − h y i2 + h z 2 i − h z i2
can be obtained.
Suppose now that an initial solute distribution not only spreads out by diffusion but is also
displaced by a flow field. Here too, we can get useful information about what is going on by
looking at the moments of the solute distribution. To do the simplest example we suppose that an
initial solute distribution in two dimensions is being displaced by a stagnation flow such as
x~ y ~
~v = i− j
T T
viz.,
LECTURE 12. DIFFUSION IN UNBOUNDED DOMAINS 277
whose tendency is to stretch the solute out in the x direction while compressing it in the y direction
so that as t increases the solute might pass through the following stages:
∂c
= D∇2 c − ~v · ∇c
∂t
x~ y ~
and using ~v = i − j this is
T T
∂c ∂2c ∂2c x ∂c y ∂c
=D 2 +D 2 − +
∂t ∂x ∂y T ∂x T ∂y
LECTURE 12. DIFFUSION IN UNBOUNDED DOMAINS 278
we can determine the equations satisfied by the moments just as we did before and here we find
dcmn m−n
= Dm(m − 1)c(m−2)n + Dn(n − 1)Cm(n−2) + cmn
dt T
Again the equations can be solved recursively and doing this we find that the moment equations
dc00
=0
dt
dc10 1
= c10
dt T
dc01 1
= − c01
dt T
dc11
=0
dt
dc20 2
= 2Dc00 + c20
dt T
and
dc02 2
= 2Dc00 − c02
dt T
lead to
c00 = c00 (t = 0)
c11 = c11 (t = 0)
LECTURE 12. DIFFUSION IN UNBOUNDED DOMAINS 279
c20 = e2t/T c20 (t = 0) + T Dc00 (t = 0) e2t/T − 1
and
c02 = e−2t/T c02 (t = 0) − T Dc00 (t = 0) e−2t/T − 1
2
c10 2 c20 c20 c10
hxi = , hx i = , σx2 2 2
= hx i − hxi = − , etc.
c00 c00 c00 c00
x = x (t = 0)et/T
y = y (t = 0)e−t/T
σx2 = σx2 (t = 0)e2t/T + T D e2t/T − 1
and
σy2 = σy2 (t = 0)e−2t/T − T D e−2t/T − 1
This tells us that as time grows large σy2 achieves a constant value T D. This value expresses the
balance between the tendency of diffusion to increase σy2 and the tendency of the flow to decrease
it. The tendency of the flow is to carry the solute toward the line y = 0 building concentration
gradients there until ultimately diffusion in the y direction can just offset this.
The reader can use this method to investigate the displacement of a solute concentration field
1
by a simple shearing flow: vx = y, vy = 0.
T
The reader can also do two simple calculations in the spherically symmetric case where
∂c 1 ∂ 2 ∂c
=D 2 r
∂t r ∂r ∂r
LECTURE 12. DIFFUSION IN UNBOUNDED DOMAINS 280
and
Z ∞
c2 = r 2 c 4πr 2 dr
0
and derive
dc0
=0
dt
and
dc2
= 6Dc0
dt
dc0
=0
dt
R +∞
and c1 (t) = −∞
xc dx satisfies
Z +∞
dc1
= D ′ (x)c dx
dt −∞
whence we have
d
hxi = hD ′ (x)i
dt
LECTURE 12. DIFFUSION IN UNBOUNDED DOMAINS 281
c1
where hxi =
c0
Thus we see that solute no longer redistributes about a fixed point. And moving on to solute
spreading we observe that
Z +∞
c2 = x2 c dx
−∞
satisfies
Z +∞
dc2
=2 (xD(x))′ c dx
dt −∞
whereupon
dσ 2 1 dc2 c1 dc1
= −2
dt c0 dt c0 dt
dhxi
= 2h(xD(x))′ i − 2hxi
dt
Now these formulas certainly have the expected form and we could use them if we could write
c(x, t) in terms of c0 (t), c1 (t), etc. but that is not so easy.
Carrying on with our plan to try to learn something about diffusion without doing much, we turn
to a problem in which we discover diffusion where at first there would appear to be none.
A solute is injected into a carrier gas flowing through a packed column where the solute is
adsorbed by the packing. As the solute moves through our column it is adsorbed at its leading
edge, desorbed at its trailing edge. The solute concentration in the carrier phase and in the solid
phase are denoted by c and by a and the dilute solute equilibrium isotherm is assumed to be
c = ma where m is constant
LECTURE 12. DIFFUSION IN UNBOUNDED DOMAINS 282
The smaller the value of m, the more strongly the solute is bound to the solid.
We denote the volumetric flow rate of the carrier by G, the cross sectional area of the empty
column by A and the porosity of the bed by ε.
G v0
Then v0 = denotes the superficial velocity of the carrier and v = denotes its interstitial
A ε
velocity.
We can write a simple model, assuming no variation of c and a on the cross section. It is
∂c ∂c
ε + v0 = −K (c − ma)
∂t ∂z
and
∂a
(1 − ε) = K (c − ma)
∂t
1
where the volumetric mass transfer coefficient is denoted by K and where [K] = .
time
The bed is assumed to be infinitely long and at time zero a finite amount of solute is injected
into the carrier near z = 0.
Our model equations can be solved, but we do not do so. Instead, we will see what we can
learn by solving the moment equations.
satisfy
dc0
ε = −K (c0 − ma0 )
dt
and
da0
(1 − ε) = K (c0 − ma0 )
dt
LECTURE 12. DIFFUSION IN UNBOUNDED DOMAINS 283
whence we have
1 m
−tK +
c0 = A0 + B0 e ε 1−ε
and, as t → ∞,
c0 → A0
dc0
→0
dt
and
c0
a0 →
m
satisfy
dc1
ε − v0 c0 = −K (c1 − ma1 )
dt
and
da1
(1 − ε) = K (c1 − ma1 )
dt
LECTURE 12. DIFFUSION IN UNBOUNDED DOMAINS 284
Again, adding these equations, differentiating the first and eliminating a1 , we have
d 2 c1 1 m dc1 v0 m v0 dc0
+K + =K c0 +
dt2 ε 1−ε dt ε 1−ε ε dt
whereupon we obtain
zero
1 m m
−tK + v0
c1 = A1 + B1 e ε 1−ε + 1−ε A t
0
ε 1 m
+
ε 1−ε
The average distance transversed by the solute in the carrier phase, measured from the injection
point, z = 0, and denoted hzi, viz.,
R +∞
zc dz c1
hzi = R−∞
+∞ =
c dz c0
−∞
m
dhzi 1 dc1 v0 1−ε
= =
dt c0 dt ε 1 m
+
ε 1−ε
dhzi
and we call the speed at which the solute in the carrier phase proceeds through the bed.
dt
For large m, weakly bound solute, we have
dhzi v0
= ,
dt ε
LECTURE 12. DIFFUSION IN UNBOUNDED DOMAINS 285
v0
where is the carrier speed, whereas for small m, strongly bound solute, we have
ε
dhzi v0 ε
= m
dt ε 1−ε
Thus the ratio of the speeds of two strongly bound solutes is the ratio of their m’s. The mass
transfer coefficient does not appear in these formulas.
The reader may wish to find the speed of the solute in the solid phase.
What is interesting about this problem is that we have a transverse distribution of longitudinal
v0
speeds, simple though it may be, viz., in the carrier, zero in the solid, and we have solute that
ε
can move between the regions of different speed. This is a recipe for solute spreading in the flow
direction and, therefore, a hint that a diffusion model may be appropriate.
dc2
ε − 2v0 c1 = −K (c2 − ma2 )
dt
and
da2
(1 − ε) = K (c2 − ma2 )
dt
By eliminating a2 we obtain
d 2 c2 1 m dc2 v0 dc1 v0 m
+K + =2 +2 K c1
dt2 ε 1−ε dt ε dt ǫ 1−ε
m
v0 1 − ε A t = A + const A t
c1 = A1 + 0 1 0
ε 1 m
+
ε 1−ε
LECTURE 12. DIFFUSION IN UNBOUNDED DOMAINS 286
Now we introduce σ 2 , the longitudinal variance of the solute distribution in the carrier, where
R +∞ 2
2 −∞
( z − hzi )2 c dz c2 c1
σ = R +∞ = −
c dz c0 c0
−∞
dσ 2 1 dc2 1 dc1
= − 2 2c1
dt c0 dt c0 dt
dσ 2 dσ 2
where tells us the rate of spreading of the solute. If σ 2 is a multiple of t or is a constant,
dt dt
this constant defines the longitudinal diffusion coefficient.
dσ 2
The long time value of can be worked out and after some algebra we obtain
dt
m
2 2
dσ (v0 /ε) 1−ε
=2 2
dt Kε 1 m
+
ε 1−ε
dσ 2 (v0 /ε)2 2 m
=2 ε
dt K 1−ε
Hence, the larger the value of K and the smaller the value of m, the more the solute hangs
together. Indeed, the larger the value of K, the less the effect of the velocity gradient, and the
smaller the value of m, the stronger the solute is bound to the solid.
The reader ought to derive these formulas. Only a particular solution of the c2 equation is
needed.
1 dc2 1 dc1
In order for the spreading process to be diffusive, the terms in and in 2 2c1 that
c0 dt c0 dt
LECTURE 12. DIFFUSION IN UNBOUNDED DOMAINS 287
dσ 2
are multiples of t must cancel, otherwise will be a multiple of t and in that case the spreading
dt
is called ballistic, not diffusive.
This problem serves as an introduction to the problem called Taylor dispersion presented in
Lecture 20.
hxi = 0
and
hx2 i = 2Dt
assuming all the solute particles start at x = 0 and diffuse in the ±x directions.
The probability in a simple random walk on a one-dimensional lattice satisfies the difference
equation
where P (N1 , N2 ; N) is the probability that in N steps, N1 steps are to the right, N2 steps are to
the left and where, at each step, p is the probability that it is to the right, q the probability that it is
to the left. We must have N1 + N2 = N and p + q = 1. After N steps the possible values of N1
are 0, 1, 2, . . . , N.
N!
P (N1 , N2 ; N) = p N1 q N2
N1 !N2 !
where N1 + N2 = N and p + q = 1.
LECTURE 12. DIFFUSION IN UNBOUNDED DOMAINS 288
This formula is called the binormal distribution. It tells us the probability of N1 successes in N
trials where the trials are independent and where p denotes the probability of a success in any trial.
Now, due to
XX
P (N1 , N2 : N) = (p + q)N , N1 + N2 = N
N1 N2
hN1 i = Np
and
hN12 i = Np (1 − p) + N 2 p2
Hence if we denote N1 − N2 by ∆, i.e., the net number of steps to the right, then the average values
of ∆ and ∆2 are
h∆i = N (2p − 1)
and
If the lattice spacing is ℓ and a particle is released at x = 0 then its position on the lattice after
1
N steps is x = ℓ∆. So if all particles are released at x = 0 and if p = q = then
2
hxi = 0
and
hx2 i = ℓ2 N
LECTURE 12. DIFFUSION IN UNBOUNDED DOMAINS 289
Assuming the time required to take a step is τ , we see that N steps corresponds to time t = Nτ
and
hx2 i = ℓvt
where v = ℓ/τ is the speed of a particle. Our two formulas for hx2 i tell us that random walk acts
statistically like a diffusion process where
1
D= ℓv
2
We can repeat this calculation assuming that diffusion takes place, not in one, but in three
dimensions. Then if all the particles start at the origin the diffusion equation tells us directly that
r2 = x2 + y2 + z2 = 6Dt
To introduce the corresponding simple random walk on a three dimensional cubic lattice we let
P (N1 , N2 , N3 , N4 , N5 , N6 ; N) be the probabiity that in N steps, N1 steps are in the +x direction,
N2 steps are in the −x direction, N3 steps are in the +y direction, etc. And at each step we let p
be the probability that it is in the +x direction, q the probability that it is in the −x direction, r the
probability that it is in the +y direction, etc. Then we find
N!
P (N1 , N2 , . . . , N6 ; N) = p N1 q N2 r N3 s N4 u N5 v N6
N1 !N2 !N3 !N4 !N5 !N6 !
XX X
··· P (N1 , N2 , . . . , N6 ; N) = (p + q + r + s + u + v)N = 1, N1 +N2 +· · ·+N6 = N
N1 N2 N6
LECTURE 12. DIFFUSION IN UNBOUNDED DOMAINS 290
XX X
N1 = ··· N1 P N1 , N2 , . . . , N6 ; N
N1 N2 N6
∂ XX X
=p ··· P N1 , N2 , . . . , N6 ; N
∂p N N N
1 2 6
= pN
and
XX X
N12 = ··· N12 P N1 , N2 , . . . , N6 ; N
N1 N2 N6
∂ ∂ XX X
=p p ··· P N1 , N2 , . . . , N6 ; N
∂p ∂p N N N 1 2 6
= pN + p2 N N − 1
Likewise we find
N2 = qN
N1 N2 = pqN N − 1
and
N22 = qN + q 2 N N − 1
∆x = N1 − N2 = p−q N
LECTURE 12. DIFFUSION IN UNBOUNDED DOMAINS 291
x =0
and
1
x2 = ℓ2 N
3
1
Likewise we have hyi = 0 = hzi and hy 2i = ℓ2 N = hz 2 i.
3
Again if the time required to take a step is τ then t = Nτ and
r2 = x2 + y2 + z2 = ℓvt
whence
1
D= ℓv
6
So when we hold the length of a free path fixed, require the free path to lie on a lattice and hold
the speed of a particle fixed we get
1
D= ℓv
2
LECTURE 12. DIFFUSION IN UNBOUNDED DOMAINS 292
and
1
D= ℓv
6
This may be a surprising result. Even more surprising is the way the lattice calculations turn
out. After N steps the average of the square of the distance from the origin is the same whether the
random walk is on a one dimensional lattice where hx2 i = ℓ2 N or on a three dimensional lattice
where hr 2 i = ℓ2 N.
1
D= ℓv
3
A solute is distributed throughout all space at time t = 0. The amount of solute is finite. There are
no natural length and time scales. Our job is to predict the solute concentration, c, at t > 0. To do
this we must solve
∂c
= D∇2 c, ∀~r
∂t
where we must have c → 0 as | ~r| → ∞ and where c(t = 0) is assigned. Our first step is to find
out how a point source of solute spreads out in time in an unbounded domain. The result will be
the Green’s function for our problem and we obtain it by assuming we know the solution to a step
function initial condition. Then we come back to Green’s functions in Lecture 19.
x
A + B erf √
4t
LECTURE 12. DIFFUSION IN UNBOUNDED DOMAINS 293
satisfies
∂c ∂2c
= 2
∂t ∂x
erf(z)
1.00
0.75
0.50
0.25 1.00 2.00 3.00
z
-3.00 -2.00 -1.00 − 0.25
− 0.50
− 0.75
− 1.00
While this would not seem to be a very useful solution to the diffusion equation due to the
limited class of boundary conditions that can be satisfied by setting two constants, nonetheless it
plays an important role in modeling diffusion processes as can be seen by looking in Bird, Stewart
and Lightfoot’s book “Transport Phenomena.” To get the solution when the diffusivity is other
than D = 1, write Dt in place of t.
If c(t = 0) is zero for x < ξ and x > ξ + ∆ξ and uniform but not zero for ξ < x < ξ + ∆ξ
then, for t > 0 we have
1 x−ξ 1 x − (ξ + ∆ξ)
c = c0 erf √ − erf √
2 4t 2 4t
LECTURE 12. DIFFUSION IN UNBOUNDED DOMAINS 294
(x − ξ)2
1 −
c0 √ e 4π ∆ξ
4πt
∂c ∂2c
= 2
∂t ∂x
∂c
= ∇2 c
∂t
satisfies
∂c
= ∇2 c
∂t
where
ξ < x < ξ + ∆ξ
c(t = 0) = c0 η < y < η + ∆η
ζ < z < ζ + ∆ζ
=0 otherwise
The product c0 ∆ξ∆η∆ζ is the total amount of solute initially present. If we hold this fixed,
LECTURE 12. DIFFUSION IN UNBOUNDED DOMAINS 295
equal to one unit of mass, and let ∆ξ, ∆η and ∆ζ go to zero, we find
and c satisfies
∂c
= ∇2 c, t>0
∂t
and
Z +∞ Z +∞ Z +∞
c dV = 1, ∀t ≥ 0
−∞ −∞ −∞
This is called the point source solution to the diffusion equation as it tells us the solute density
at the point (x, y, z) and at the time t if at t = 0 a unit mass of solute is injected into a region
of vanishingly small extent at the point (ξ, η, ζ). It is called the Green’s function for the diffusion
equation in an unbounded region.
If the diffusivities are other that D = 1 it is a simple matter to decide that the point source
solution to
is
−1
Dxx (x − ξ)2 + Dyy
−1
(x − η)2 + Dzz
−1
(x − ζ)2
1 1 −
c = √ 3 √ p √ e 4t
4πt Dxx Dyy Dzz
∂c ~~ ~~
=∇·D · ∇c = D : ∇∇c
∂t
LECTURE 12. DIFFUSION IN UNBOUNDED DOMAINS 296
~~
where we have assumed D to be symmetric. Thus being so there is an orthogonal basis in which
~~ ~~
D is diagonal. Hence requiring ~i, ~j and ~k to lie along the eigenvectors of D, we can write
~~ ~~
where Dxx , Dyy and Dzz are the eigenvalues of D. And in the eigenbasis of D we can write the
point source solution to the anisotropic diffusion equation as above. In coordinate free form it is
~~ −1
D : (~r − r~0 )(~r − r~0 )
1 −
c (~r, t) = n o3 n o e 4t
√ ~~ 1/2
4πt det D
We can use the Green’s function to write the solution to the diffusion equation in a way that
does not require us to decompose the sources but accepts them as they stand. To do this observe
that the point source solution tells us the solute concentration at the point (x, y, z) at a time t
following the introduction of a unit mass of solute at the point (ξ, η, ζ). Knowing this we can write
the solution to
∂c ~~
=D : ∇∇c + Q (~r, t)
∂t
where
c(t = 0) = c0 (~r)
LECTURE 12. DIFFUSION IN UNBOUNDED DOMAINS 297
~~ −1
Z +∞ Z +∞ Z +∞ D : (~r − r~0 )(~r − r~0 )
1 −
c (~r, t) = ( )3 (q )e 4t c0 (r~0 ) dV0
−∞ −∞ −∞ √ ~~
4πt det D
~~ −1
Z Z Z Z D : (~r − r~0 )(~r − r~0 )
+∞ +∞ +∞ t
1 −
+ ( )3 (q ) e 4 (t − t0 ) Q (r~0 , t0 ) dV0 dt0
−∞ −∞ −∞ 0 p ~~
4π (t − t0 ) det D
inasmuch as c0 (r~0 )dV0 is the amount of solute introduced into dV0 at r~0 at t = 0 and Q (~
r0 , t0 ) dV0 dt0
is the amount of solute introduced into dV0 at r~0 during dt0 at t0 . Again the solution is a sum of
terms, each term itself being the solution corresponding to one of the sources when the others
~~ ~~
vanish. If this result is used to determine D, only symmetric D’s can be found.
The point source solution for diffusion in three dimensions leads easily to the point source
solution for diffusion in two dimensions and then in one dimension.
If we introduce M ′′′ mass units of solute into an isotropic solvent at the point ~r = r~0 and at
time t = 0 then the resulting solute concentration is
3 (x − x0 )2 + (y − y0 )2 + (z − z0 )2
1 −
c (~r, t) = M ′′′ √ e 4Dt
4πDt
and this is the point source solution in three dimensions. The dimensions of c are M/L3 and the
√
dimension of Dt is L.
If instead we introduce M ′′ mass units of solute per unit of length uniformly on the line
x = x0 , y = y0 , −∞ < z < ∞, at time t = 0, then the resulting solute concentration is
Z +∞ 3 (x − x0 )2 + (y − y0 )2 + (z − z0 )2
1 −
c (~r, t) = M ′′ dz0 √ e 4Dt
−∞ 4πDt
2 (x − x0 )2 + (y − y0 )2
1 −
= M ′′ √ e 4Dt
4πDt
and this is called the line source solution in three dimensions. It is uniform in z and does not vanish
LECTURE 12. DIFFUSION IN UNBOUNDED DOMAINS 298
as | z| → ∞. It can be called the point source solution in two dimensions, taking the dimensions
of M ′′ to be M instead of M/L and then the dimensions of c to be M/L2. Likewise if we introduce
M ′ mass units of solute per unit of area uniformly over the plane
x = x0 , −∞ < y < ∞, −∞ < z < ∞, at time t = 0, then the resulting solute concentration is
(x − x0 )2
1 −
c (~r, t) = M ′ √ e 4Dt
4πDt
and this is called the plane source solution in three dimensions. It is uniform in y and z and does
not vanish as | y| → ∞ or as | z| → ∞. It can be called the point source solution in one dimension,
taking the dimensions of M ′ to be M instead of M/L2 and then the dimensions of c to be M/L.
1. Suppose c satisfies
∂c ~~
=D : ∇∇c
∂t
where c (t = 0) is assigned and c vanishes strongly as | ~r| → ∞. Derive equations for the
power moments of c, viz.,
ZZZ
c0 = c dV
ZZZ
~c1 = ~r c dV
ZZZ
~~c2 = ~r ~r c dV
etc.
~~
Derive a formula for D in terms of the power moments of c.
LECTURE 12. DIFFUSION IN UNBOUNDED DOMAINS 299
~r ∇2 c = ∇2 (~rc) − 2∇c
~~
~r ~r∇2 c = ∇2 ~r ~r∇2 c − 2Ic − 2∇c ~r − 2~r∇c
and
~~
(∇c) ~r = ∇ (~rc) − Ic
It helps to know:
ZZZ ZZ
∇f dV = dA ~nf
V S
2. A thermometer, your finger, senses the temperature of its tip shortly after it touches another
body. Let αB , kB and TB denote the thermal diffusivity, conductivity and temperature of a
body while αF , kF and TF denote the thermal diffusivity, conductivity and temperature of
the thermometer.
to decide that two bodies at the same temperature ordinarily do not feel like they are at the
same temperature. What determines which feels cooler?
3. Denote by σ a surface heat source and assume σ is constant over a sphere of radius R.
Derive a formula for T inside and for T outside, assuming T → 0 as r → ∞. The thermal
conductivities inside and outside may differ.
LECTURE 12. DIFFUSION IN UNBOUNDED DOMAINS 300
Here, ∇2 T is simply
1 d 2 dT
r
r 2 dr 2 dr
Write your result in terms of Σ = 4πR2 σ and then let R → 0 holding Σ constant.
g p0
water z = h ( x, t )
z
µ ∂p
vx = −
K ∂x
p = p0 + ρg (h − z)
and using
∂vx ∂vz
+ = 0, vz = 0 at z=0
∂x ∂z
derive
ρ ∂ ∂h ∂h
K g h =
µ ∂x ∂x ∂t
LECTURE 12. DIFFUSION IN UNBOUNDED DOMAINS 301
starting from the balance satisfied by the water at the surface z = h (x, t), viz.,
∂h ∂h
vz − vx =
∂x ∂t
Notice that
Kρg L2
h =
µ T
and that what you have is a nonlinear diffusion equation where the diffusivity at (x, t) de-
pends on how much h is there.
Define
Z +∞
hm = xm h (x, t) dx
−∞
dh0 dh1
=0=
dt dt
and
dh2
>0
dt
Thus the amount of water does not change nor does its mean position as it spreads out.
5. Chromatography
Solve for
c0 c1
, , ···
a0 a1
LECTURE 12. DIFFUSION IN UNBOUNDED DOMAINS 302
c = ε c + (1 − ε) a
1 dc1
Veff =
c0 dt
and
1 dc2 c1 dc1
Deff = − 2
2c0 dt c0 dt
Multipole Expansions
In this lecture we deal with the problem of steady solute diffusion or heat conduction from a source
of solute or heat near the origin to a sink at zero concentration or temperature, infinitely far away.
Again we derive a point source solution.
Suppose heat is generated uniformly inside a sphere of radius R centered on the origin O.
Then, assuming the heat is conducted steadily to T = 0 at r = ∞, we have
Q heat
∇2 T + = 0 , [Q] =
k volume time
∇2 T = 0
1 d 2d
∇2 = r
r 2 dr dr
303
LECTURE 13. MULTIPOLE EXPANSIONS 304
B
T =A+
r
Before going on we ought to notice that there would be no steady solution if the sphere were
replaced by an infinitely long cylinder.
We find
1 Q
T = A − r2
6 k
inside,
B
T =
r
1 Q
B = R3
3 k
1 Q1 VQ1
T = R3 = , r>R
3 kr 4πk r
where V Q is the heat supplied per unit time. Denoting V Q by Q, we have, for a point source of
heat at the origin:
Q 1
T = , r > 0.
4πk r
Q 1 heat
T (→
−
r)= , [Q] =
4πk r − →
−
→ −
r0 time
The readers ought to convince themselves that the integral makes sense, whether −
→
r lies inside
or outside the sphere, by looking at the special case where heat is generated uniformly in a finite
sphere centered on O
ρ (−
→r)
∇2 T + = 0, T → 0 as | →
−
r | →∞
k
Now suppose we have a system of point sources in the neighbourhood of a point O and we
wish to know T at a point P some distance away, viz.,
LECTURE 13. MULTIPOLE EXPANSIONS 306
At P we have
X Qi 1
T (P ) =
4πk | r − →
→
− −
ri |
where
−
→ →
− 2 →
− →
− →
− →
− 2 ri ri2
| r − ri | = ( r − ri ) · ( r − ri ) = r 1 − 2 cos θi + 2
r r
and therefore
2
1 1 ri r2 3 ri
|→
−
r −−
→ −1
ri | = 1− −2 cos θi + i2 + 2
4 2 cos θi + · · · + · · ·
r 2 r r 8 r
Then using
−
→
r ·−→
ri
ri cos θi = ,
r
we have
X X X 3
1 1 1→
− →
− 1→
− →
− →
− →
− 1→
− →
− ⇒
T (P ) = Qi + 3 r · Qi ri + 5 r r : Qi r r − r · ri I + · · ·
4πk r r r 2 i i 2 i
P P P 3− 1− ⇒
The factors Qi , Qi →
−
ri, Qi → →
− → →
−
r r − r · ri I , · · · depend only on the distribution
2 i i 2 i
of the point heat sources near O and are called the monopole, dipole, quadrupole, · · · moments of
→ ⇒
−
the point source distribution. Denoting these M, D , Q,· · · our formula can be written
1 1→ − → 1 −− ⇒
4πkT (P ) = M + 3 −
r · D + 5→
r→r : Q+···
r r r
→ ⇒
−
where −
→
r denotes the position of P with respect to O and M, D , Q, · · · do not depend on the field
LECTURE 13. MULTIPOLE EXPANSIONS 307
point −
→
r.
If, instead of a heat conduction problem, we have an electrostatic problem, where the electrical
potential through out space is created by a system of point charges near the origin, we find for the
electrostatic potential at P , via Coulomb’s law or, as above, via ∇2 φ = 0:
1 1− − → 1−− ⇒
4πǫ0 φ (P ) = MO + 3 →
r · DO + 5 →
r→r : QO + · · ·
r r r
→ ⇒
−
where MO , D O , QO , · · · denote the monopole, dipole, quadrupole, · · · moments of the charge
distribution about O.
We might then have a second charge distribution in the neighbourhood of P and ask for the
potential energy of this second set of charges due to the electrical potential created by the first
set of charges, i.e., the potential energy of one molecule due to the electrical potential created by
another molecule. The result is
X X
PE = φ (Pi ) Qi = φ (→
−
r +−
→
ri ) Qi
where −
→
ri denotes the position with respect to P of the ith charge in the second charge distribution.
The potential energy can be written in terms of the moments of the two charge distributions by
expanding φ (Pi ) via
1 →−
φ (−
→
r +−
→
ri ) = φ (−
→
r )+−
→
ri · ∇φ (−
→
r )+ −ri →
ri : ∇∇φ (−
→
r ) +···
2
Hence we have
X
1− −
PE = Qi φ (→
−
r )+−
→ r )+ →
ri · ∇φ (−
→ r→ r : ∇∇φ (−
→
r )+···
2 i i
and using
→
−
1 r
∇ =− 3
r r
→
−
1 r 3→
−
r−→r 1⇒
∇∇ = −∇ = − I
r r3 r5 r3
and
−
→
r 3−
→
r−→r 1⇒
∇ = − + I
r3 r5 r3
1
we find, to terms of order ,
r3
1 1→− →
− 1− → →
− ⇒
4πǫ0 P E = MP MO + 3 r · D O + 5 r r : QO
r r r
− → −
→
− r 3→ r−→r 1⇒ − →
+ D P · − 3 MO + − 5 + 3 I · D O
r r r
−
X 1
→
− →
− 3→
r−→r 1⇒
+ Qi r i r i : − 3 I MO
2 r5 r
Then, using
− ⇒ −
3→
r−→r 1⇒ 3→
r→−r 1⇒
tr − 3I =I : − 3I =0
r5 r r5 r
LECTURE 13. MULTIPOLE EXPANSIONS 309
and
⇒
tr QP = 0
where
X
⇒ 3−→ 1− → ⇒
QP = Qi ri →
−
ri − →ri · −
ri I
2 2
we have
MO MP →
−
r n→− −
→ o 1⇒ 1→
→ −
− →
4πǫ0 P E = + 3 · D O MP − D P MO + − →
−
I − 5 r r : DO DP
r r r 3 r
1− → −
→ ⇒ ⇒
+ 5 r r : QO MP + QP MO
r
−→
where −
→
r = OP .
We see that if the two charge distributions are neutral; i.e., MO = 0 = MP , the leading term is
1
the dipole-dipole term, falling off as 3 .
r
If the point charges were replaced by point masses and ε0 by G, and if O and P were taken
→
− → −
− →
to lie at the centers of mass so that D O = 0 = D P , then the first correction to the leading term
MO MP
would be the quadrupole term, where the quadrupole moment of a mass distribution can
r
be expressed in terms of its inertia tensor.
σ = σ0 + σ1 cos θ
Your job is to solve ∇2 T = 0 for r > R and r < R where T → 0 as r → ∞. The thermal
conductivities inside and outside may differ.
LECTURE 13. MULTIPOLE EXPANSIONS 310
You have
2 1 ∂ 2 ∂ 1 ∂ ∂
∇ = 2 r + 2 sin θ
r ∂r ∂r r sin θ ∂θ ∂θ
and derive
1 d 2 dT0
r =0
r 2 dr dr
and
1 d 2 dT1 2
r − T1 = 0
r 2 dr dr r2
whereupon
B0
T0 = A0 +
r
and
B1
T1 = A1 r +
r2
~n
T (~r)
~r
V0
S0
∇ · ~q = Q
This suggests that perhaps we can replace all the moments of Q over V0 by moments of
~n · ~q over S0 in our equation for T (~r).
Show that
~r ∇ · ~q = ∇ · (~q ~r) − ~q
LECTURE 13. MULTIPOLE EXPANSIONS 312
and conclude that we cannot replace first moments of Q over V0 by first moments of ~n · ~q
over S0 .
However, we have
ZZZ ZZZ ZZ
~q dV0 = −k ∇T dV0 = −k ~n T dA0
V0 V0 S0
3. Derive a formula for the potential energy due to systems of charges near O, P and Q.
LECTURE 13. MULTIPOLE EXPANSIONS 313
O
~rP
P
~rQ
where | ~rP |, | ~rQ | and | ~rP − ~rQ | are all much greater than the distances of the charges from
O, P and Q.
Assume the monopole moments are all zero, and account only for the dipole moments.
Q
T (~r) = 4πk
| ~r − ~r0 |
Assume ~r0 = z0~k, z0 > 0 and show that the temperature at the points (R, θ, φ) is
Q
T (R, θ, φ) = p 2 4πk
z0 − 2z0 R cos θ + R2
Setting this aside, find the temperature at ~r if the temperature, T (R, θ, φ), is specified
on the surface of a sphere of radius R, r > R and T → 0 as r → ∞.
The result is the sum of two contributions, that of the point source in the absence of the
sphere and that of a sphere whose surface temperature is
Q
TS − p 2 4πk
z0 − 2z0 R cos θ + R2
How should we think about finding the temperature at ~r due to two spheres of radius
R, one centered at ~r1 having surface temperature T1 , the other centered at ~r2 having surface
temperature T2 , T1 and T2 constants?
5. Suppose our charge distribution is composed of two charges, q1 at ~r1 and q2 at ~r2 , where
q1 = q = −q2 and ~r1 − ~r2 = d~u and where ~u is a vector of unit length in the direction
~r1 − ~r2 .
Derive the monopole, dipole and quadrupole moments of this charge distribution. What is
the limiting form of each of these moments as d → 0 holding qd fixed.
6. The solute concentration at a point ~r due to point sources at ~r1 , ~r2 , . . ., near the origin is
X mi
4πDc (~r) =
| ~r − ~ri |
LECTURE 13. MULTIPOLE EXPANSIONS 315
~r2
V0
~r1
O
~r
If ~r lies inside V0 we might wonder about the integral on the right hand side as ~r0 passes
through ~r. Suppose ~r = 0 and write
ZZZ ZZZ ZZZ
= +
V0 V0 −Vε Vε
Show that
ZZZ
ρ (r0 )
dV0
| ~r0 |
Vε
Suppose the solute source is distributed over a surface S0 , its density being denoted by
σ. Then decide
ZZ
σ (~r0 )
4πD c (~r) = dA0
| ~r − ~r0 |
S0
Suppose S0 is the surface of a sphere of radius R0 centered at the point ~r ∗ and define
1
G (~r, ~r0 ) = , ~r0 ∈ S0
| ~r − ~r0 |
1
+ (~r0 − ~r ∗) (~r0 − ~r ∗) : ∇0 ∇0 G (~r, ~r ∗) + · · ·
2
where ~r0 − ~r ∗ = R0 ~n (~r0 ) and ~n is the unit normal to the sphere at ~r0 .
LECTURE 13. MULTIPOLE EXPANSIONS 317
+···
Suppose the interior of the sphere is not permeable to solute. How must the foregoing be
corrected?
Assume ρ is constant inside a sphere of radius R0 centered on the origin and zero outside.
Derive
ρ1 2
T (origin) = R
k 2 0
14.1 Introduction
We begin with a simple problem, solute diffusion in one dimension. The diffusion takes place in a
solvent layer separating two solute reservoirs where we control what is going on in the reservoirs.
∂2
The differential operator ∇2 is then 2 and our problem is to solve
∂x
∂c ∂2c
=D 2 , 0<x<L
∂t ∂x
In our first problem the solvent layer is in perfect contact with large reservoirs maintained
x Dt
solute free. Hence, in scaled variables ⇒ x, 2 ⇒ t we have
L L
∂c ∂2c
= 2 , 0<x<1
∂t ∂x
and
c (x = 0) = 0 = c (x = 1)
319
LECTURE 14. ONE DIMENSIONAL DIFFUSION IN BOUNDED DOMAINS 320
We are going to make several assumptions as we go along, one being that our functions are
smooth enough that integration by parts formulas can be used.
Earlier we solved
dx
= Ax , x (t = 0) specified
dt
by expanding x (t) in the eigenvectors of A and solving for the time dependent coefficients in the
∂2c
expansion. In fact, in one of our problems A was derived from difference approximations to 2 .
∂x
To maintain continuity with our earlier work, we are going to introduce the eigenvalue problem
d2
for the differential operator 2 , viz.
dx
d2
ψ = −λ2 ψ , 0 < x < 1
dx2
Solutions, ψ, to this problem are called eigenfunctions, and we will call the corresponding values
of λ2 the eigenvalues.
This problem has been stated too generally and we need to introduce restrictions on its solu-
tions. At this point we can only say that
ψ = A cos λx + B sin λx
where A, B and λ are unknown. The problem is homogeneous and will remain homogeneous as
we introduce additional conditions, hence if ψ is a solution, so too any multiple of ψ.
d2
What we need to do is to define the domain of the differential operator 2 in a way that is
dx
specific to the problem at hand. Here, as our solutions satisfy c (x = 0) = 0 = c (x = 1) it is
natural to require ψ (x = 0) = 0 = ψ (x = 1). This is the best choice and we will see why this is
so as we go along. We might simply argue that if we make each term in a sum vanish at x = 0,
then the sum vanishes at x = 0.
LECTURE 14. ONE DIMENSIONAL DIFFUSION IN BOUNDED DOMAINS 321
d2 ψ
= −λ2 ψ , 0 < x < 1
dx2
and
ψ (x = 0) = 0 = ψ (x = 1)
sin λ = 0
Thus, λ = nπ, n = 0, ±1, ±2, . . .. The value n = 0 leads to ψ = 0 whereas the values n =
−1, −2, . . . lead to eigenfunctions that are multiples of those corresponding to n = 1, 2, . . .
Thus, we find an infinite set of eigenfunctions, ψ = sin nπx, corresponding to an infinite set of
eigenvalues.
λ2 = n2 π 2 , n = 1, 2, . . .
The main question that we would like to answer about an infinite set of functions, such as these
eigenfunctions, is whether or not it can be used as a basis for the expansion of a fairly arbitrary
function, viz., the function c (t = 0). This is what the theory of Fourier series is about and a
good elementary account of this theory can be found in Weinberger’s book (A First Course in
Partial Differential Equations with Complex Variables and Transform Methods). The expansion
we require is an infinite series, not a finite sum, and the question is not easy to answer. Still
there are conditions on the arbitrary function and on the basis functions sufficient for the series
LECTURE 14. ONE DIMENSIONAL DIFFUSION IN BOUNDED DOMAINS 322
representation of the function to mean something. On their part, the set of basis functions sin nπx,
n = 1, 2, . . . satisfy the conditions required, viz., first, they are a set of orthogonal functions. We
can establish this directly by using the functions themselves or indirectly by using the eigenvalue
problem defining the functions.
Z1 1 Z1
d2 ψ dψ dφ dψ
φ 2 dx = φ − dx
dx dx 0 dx dx
0 0
and
Z1 1 Z1 2
d2 ψ dψ dφ d φ
φ 2 dx = φ − ψ + ψdx
dx dx dx 0 dx2
0 0
where
[φ]10 = φ (x = 1) − φ (x = 0)
and where these formulas hold for complex valued functions as well as for real valued functions.
d2
Then as ψ and λ2 satisfy the eigenvalue problem for 2 whenever it is satisfied by ψ and λ2 ,
dx
we let ψ be an eigenfunction and φ be its complex conjugate in the second formula and determine
that λ2 = λ2 , and hence that λ2 is real. It follows that if ψ is an eigenfunction corresponding to the
eigenvalue λ2 so too is its complex conjugate and its real and imaginary parts.
Again if ψ and φ denote an eigenfunction and its complex conjugate, the first formula tells us
LECTURE 14. ONE DIMENSIONAL DIFFUSION IN BOUNDED DOMAINS 323
that:
Z1 1 Z1 2 Z1 2
dψ dψ dψ
−λ2 |ψ|2 dx = ψ − dx = − dx
dx 0 dx dx
0 0 0
and hence that the corresponding eigenvalue, λ2 , is strictly positive. If λ2 were zero, this formula
would require that ψ be a constant and as ψ (x = 0) = 0 = ψ (x = 1) this constant would then be
zero.
Z1
φψdx = 0
0
and we call φ and ψ orthogonal functions. We can introduce the inner product
Z1
hφ, ψi = φψdx
0
hφ, ψi = 0.
In terms of this inner product the two integration by parts formulas can be rewritten as
1 Z1
d2 dψ dφ dψ
φ, 2 ψ = φ − dx
dx dx 0 dx dx
0
and
1
d2 dψ dφ d2
φ, 2 ψ = φ − ψ + φ, ψ .
dx dx dx 0 dx2
What we have established then is just what we can show to be true by direct calculation:
LECTURE 14. ONE DIMENSIONAL DIFFUSION IN BOUNDED DOMAINS 324
the eigenvalues n2 π 2 , n = 1, 2, . . ., are real and positive and the eigenfunctions, sin nπx, n =
1, 2, . . ., are orthogonal, viz.,
Z1
sin mπx sin nπx dx = 0 , m 6= n
0
What is important is not the confirmation of results we already have, but the possibility of
obtaining new results. To see this we observe the important role played by the boundary con-
1 1
dψ dψ dφ
ditions in eliminating the term φ in the first formula and the term φ − ψ in the
dx 0 dx dx 0
second. Indeed if φ and ψ, and therefore their complex conjugates, satisfy the boundary conditions
dψ dψ
ψ (x = 0) = 0, (x = 1) = 0 or the boundary conditions (x = 0) = 0, ψ (x = 1) = 0
dx dx
dψ
then the conclusions are again as above. So too if the boundary conditions are (x = 0) =
dx
dψ
0, (x = 1) = 0, except that now we can conclude only that λ2 ≥ 0 as ψ = constant 6= 0
dx
satisfies the boundary conditions. This also obtains for periodic boundary conditions, where
1 1
dψ dψ dψ dψ dφ
ψ (x = 0) = ψ (x = 1) and (x = 0) = (x = 1), as again the terms φ and φ − ψ
dx dx dx 0 dx dx 0
vanish and ψ = constant 6= 0 satisfies the boundary conditions.
dψ
Because the specification of a linear combination of ψ and at a boundary is of physical
dx
dψ dψ
interest, we look also at the boundary conditions (x = 0)+ β0 ψ (x = 0) = 0 and (x = 1) +
dx dx 1
dψ dφ
β1 ψ (x = 1) = 0 where the constants β0 and β1 take real values. Then because φ − ψ =
dx dx 0
0 all conclusions drawn from the second integration by parts formula are as above whereas the first
now tells us that
Z1 Z1 2
2 2 2 2 dψ
−λ |ψ| dx = −β1 |ψ (x = 1)| + β0 |ψ (x = 0)| − dx.
dx
0 0
If β0 and β1 are not both zero then β1 ≥ 0, β0 ≤ 0 are sufficient, but, as we shall see, not necessary,
that λ2 > 0.
d2
What we conclude then is that the operator , restricted to a variety of domains by a variety
dx2
of boundary conditions of differing physical interpretation, leads via the solution of its eigenvalue
problem to a variety of sets of orthogonal functions. Denote one such set ψ1 , ψ2 , . . . and require
LECTURE 14. ONE DIMENSIONAL DIFFUSION IN BOUNDED DOMAINS 325
R1 2
that ψ i ψi dx = 1 so that hψi , ψj i = δij . Then sums of terms like e−λi t ψi (x) satisfy the diffusion
0
equation and the corresponding homogeneous boundary conditions. It remains only to determine
the weight of each such term in a series solution to a diffusion problem. And this must be decided
by the assigned solute distribution at the initial instant, t = 0. Thus we need to learn how to
determine the coefficients in an expansion such as
c (t = 0) = c1 ψ1 (x) + c2 ψ2 (x) + · · ·
Writing c (t = 0) as f (x) we find the error in an n term approximation to f (x) to be f (x)−Sn (x)
Pn
where Sn (x) = ci ψi (x). The mean square error is then
i=1
Z1 n o
f (x) − Sn (x) {f (x) − Sn (x)} dx
0
R1
which, on using ψ i ψj dx = δij , can be rewritten
0
2 2
Z1 n
X Z1 n
X Z1
|f |2 dx + ci − ψ i f dx − ψ i f dx
0 i=1 0 i=1 0
2
P
n R1
where only the second term, ci − ψ i f dx , depends on the values of the coefficients c1 , c2 ,
i=1 0
. . ., cn . Because the second term is not negative we can make the mean square error least by
making this term zero. To do this we assign the coefficients in the expansion the values
Z1
ci = ψ i f dx = hψi , f i
0
LECTURE 14. ONE DIMENSIONAL DIFFUSION IN BOUNDED DOMAINS 326
n
X
hψi , f i ψi (x)
i=1
Z1 n
X
2
|f | dx − |ci |2 .
0 i=1
What is going on is indicated in the picture below where the best approximation of f using
only ψ1 is shown
−
→ →
− →
− →
− →
− D−
→ − →E
and where we see that the length of f −c1 ψ 1 is least if f −c1 ψ 1 is ⊥ to ψ 1 or if c1 = ψ1, f .
Pn
Indeed on requiring f − ci ψi to be ⊥ to ψ1 , ψ2 , · · · , ψn we get immediately ci = hψi , f i. And
i=1
we see that the values of the coefficients c1 , c2 . · · · , cn do not depend on n, remaining fixed as n
increases once determined for some value of n.
n
X Z1
2
|ci | ≤ |f |2 dx
i=1 0
and for n −→ ∞
∞
X Z1
|ci |2 ≤ |f |2 dx.
i=1 0
LECTURE 14. ONE DIMENSIONAL DIFFUSION IN BOUNDED DOMAINS 327
P
∞
This tells us that the series |ci |2 converges and therefore that |ci |2 −→ 0 as i −→ ∞. The
i=1
R1
coefficients ci = ψ i f dx = hψi , f i are called the Fourier coefficients of f and the set of orthogonal
0
functions ψ1 , ψ2 , · · · is said to be complete if for every function f in a class of functions of interest
we have
∞
X Z1
2
|ci | = |f |2 dx.
i=1 0
P
n
Then f (x) is approximated by Sn (x) = hψi , f i ψi to a mean square error that vanishes as
i=1
n −→ ∞ and the sequence Sn (x) is said to converge to f (x) in the mean or in the norm. This
does not imply that the sequence Sn (x) converges to f (x) for an arbitrary value of x on the
interval [0, 1]. But it does imply that the sequence Sn (x) converges to f (x) pointwise almost
everywhere. A discussion of pointwise and norm convergence is given in Weinberger’s book in
terms of conditions on the functions being approximated.
We will assume that the sets of orthogonal functions that we generate by solving the eigenvalue
d2
problem for 2 , and indeed for ∇2 itself, are complete vis-a-vis functions of interest in physical
dx
problems. But whereas completeness implies only convergence in norm, we will go on and write
∞
X
f (x) = hψi , f i ψi (x)
i=1
mindful of the warning that this might not be true for all values of x. Indeed the series obtained
P
∞
by differentiating hψi , f i ψi termwise might not converge for any value of x and therefore not
i=1
have a meaning in any ordinary sense.
An infinite series is nearly as useful as a finite sum if the function f (x) is smooth enough and
d2 f
satisfies the same boundary conditions as do the functions ψ1 (x), ψ2 (x), · · · . For suppose 2
dx
P∞ d2 f
has the Fourier series di ψi (x) where di = ψi , 2 and di −→ 0 as i −→ ∞. Then we have
i=1 dx
Z1 1 Z1 2
d2 f df dψ i d ψi
di = ψ i 2 dx = ψ i − f + f dx
dx dx dx 0 dx2
0 0
LECTURE 14. ONE DIMENSIONAL DIFFUSION IN BOUNDED DOMAINS 328
or
Z1
di = −λ2i ψ i f dx = −λ2i hψi , f i
0
and hence
di
ci = hψi , f i = −
λ2i
P
∞
where ci ψi (x) is the Fourier series corresponding to f . Assuming λ2i ∝ i2 , we see that ci −→ 0
i=1
faster than i12 as i −→ ∞ and hence that the Fourier series for f may then be a useful representation.
With the foregoing as background we investigate a sequence of problems where diffusion takes
place in a solvent layer separating two reservoirs. If the reservoirs are very large, well mixed
and solute free we put c (x = 0) = 0 = c (x = 1). If the reservoirs are impermeable to solute,
∂c ∂c
we put (x = 0) = 0 = (x = 1). If the solute diffusing to the right hand boundary is
∂x ∂x
D ∂c
dissipated by a first order reaction taking place there we put − (x = 1) = kc (x = 1) or
l ∂x
∂c kl
(x = 1) + βc (x = 1) = 0 where β = . As we have written it, k is positive for an ordinary
∂x D
decomposition of solute, but negative for an autocatalytic reaction wherein, at the wall, solute
catalyzes the production of more solute.
We can also assume that the solute diffusing to the right hand boundary accumulates in a finite,
well mixed reservoir whose composition is in equilibrium with the composition at the right-hand
edge of the film. Then we put
D ∂c l1 m ∂c
− (x = 1) = 2 (x = 1)
l ∂x l /D ∂t
LECTURE 14. ONE DIMENSIONAL DIFFUSION IN BOUNDED DOMAINS 329
or
∂c ∂c
(x = 1) + α (x = 1) = 0
∂t ∂t
l1
where α = m, l1 denoting the volume of the reservoir divided by the cross sectional area of the
l
diffusion layer, m denoting the equilibrium distribution ratio. This problem comes up in separa-
tions by chromatography and its solutions can be used to explain why retention times depend on
initial solute distributions even if there is no competition for the adsorbent.
In each of these problems, differing only in the conditions specified at the boundary of the
diffusion layer, we have
∂c ∂2c
= 2 , 0<x<1
∂t ∂x
where the coefficients ci (t) = hψi , ci remain to be determined. To find the equations satisfied by
the ci we multiply the diffusion equation by ψ i and integrate over the domain, viz.,
Z1 Z1
∂c ∂2c
ψi dx = ψi dx
∂t ∂x2
0 0
1
d ∂c dψ i
hψi , ci = ψ i − c − λ2i hψi , ci .
dt ∂x dx 0
∂c
Now we have what appears to be a technical difficulty: we do not know both c and at both
∂x
x = 0 and x = 1.
LECTURE 14. ONE DIMENSIONAL DIFFUSION IN BOUNDED DOMAINS 330
If the first term on the right hand side vanishes, and we set the boundary conditions in the
eigenvalue problem to make this happen, if we can, this is simply
d
hψi , ci = −λ2i hψi , ci
dt
whence
2
hψi , ci = e−λi t hψi , c (t = 0)i
∞
X 2
c (x, t) = hψi , c (t = 0)i e−λi t ψi (x) .
i=1
This series is an increasingly useful representation of c (x, t) as t increases. How useful it is for
small values of t depends on what c (t = 0) is. For very large values of t, c (x, t) is approximately
2
hψ1 , c (t = 0)i e−λ1 t ψ1 (x)
2
hψ1 , c (t = 0)i ψ1 (x) + hψ2 , c (t = 0)i e−λ2 t ψ2 (x)
It is important to notice that we have the solution to our problem, even though it is an infinite
series, and that we have not differentiated the series to obtain it.
We now look at the solutions to the eigenvalue problem corresponding to a variety of boundary
conditions where in every case ψ = A cos λx + B sin λx satisfies
∂2ψ
+ λ2 ψ = 0.
∂x2
LECTURE 14. ONE DIMENSIONAL DIFFUSION IN BOUNDED DOMAINS 331
Example (1): c (x = 0) = 0 = c (x = 1)
In the first problem a solute initially distributed over a solvent layer according to c (t = 0), is lost to
the adjacent reservoirs which maintain the edges of the diffusion layer solute free, i.e., c (x = 0) =
0 = c (x = 1). Hence we choose the boundary conditions satisfied by ψ to be ψ (x = 0) = 0 =
1
∂c ∂ψ i
ψ (x = 1) whereupon the term ψ i − c on the right hand side of the equation for hψi , ci
∂x ∂x 0
vanishes.
√
ψi = 2 sin iπx , i = 1, 2, · · ·
and
λ2i = i2 π 2 , i = 1, 2, · · ·
∂c
Example (2): (x = 0) = 0, c (x = 1) = 0
∂x
The solute sink at x = 0 in Example (1) is replaced by a barrier impermeable to solute, all else
remaining the same.
To solve this problem we need a new set of eigenfunctions which we obtain by introducing new
boundary conditions, viz.,
dψ
(x = 0) = 0, ψ (x = 1) = 0
dx
1
∂c dψi
for this is what is required to make the term ψ i −c on the right hand side of the equation
∂x dx 0
for hψi , ci vanish.
dψ
Then (x = 0) = 0 implies B = 0 and ψ (x = 0) = 0 implies cos λ = 0, A 6= 0, whence we
dx
LECTURE 14. ONE DIMENSIONAL DIFFUSION IN BOUNDED DOMAINS 332
have
√ 1
ψi = 2 cos i − πx, i = 1, 2, · · ·
2
and
2
1
λ2i = i− π 2 , i = 1, 2, · · ·
2
The loss of solute is slowed by imposing the barrier on the left hand side of the layer. For long
times, after the details of the initial solute distribution are forgotten, the second layer is effectively
twice as thick as the first and to see this we need only observe that λ21 in the second problem is one
fourth λ21 in the first.
∂c ∂c
Example (3): (x = 0) = 0, (x = 1) = 0
∂x ∂x
Here both edges of the solute layer are isolated from the bounding reservoirs and the initial solute
distribution in the film is simply rearranged by diffusion, no solute being lost. The boundary
conditions on ψ are now
∂ψ ∂ψ
(x = 0) = 0, (x = 1) = 0
∂x ∂x
as this leads to
1
∂c dψi
ψi − c =0
∂x dx 0
Because
dψ
= −Aλ sin λ + Bλ cos λx
dx
dψ
we see that (x = 0) = 0 implies Bλ = 0. Thus we have either
dx
λ = 0, ψ = A
LECTURE 14. ONE DIMENSIONAL DIFFUSION IN BOUNDED DOMAINS 333
or
B = 0, ψ = A cos λx, A 6= 0
dψ
where (x = 1) = 0 implies
dx
Aλ sin λ = 0
and
λ2i = (i − 1)2 π 2 , i = 1, 2, · · · .
Z1
cavg (t) = c (x, t) dx = hψ1 , ci
0
∞
X 2
c (x, t) = cavg (t = 0) + hψi , c (t = 0)i e−λi t ψi (x)
i=2
∂c ∂c
Example (4): (x = 0) = 0, (x = 1) + βc (x = 1) = 0, β ≥ 0
∂x ∂x
Again the left hand edge of the solute layer is impermeable to solute but at the right hand edge
solute is lost by a first order process. If β = 0 we get Example (3), if β −→ ∞ we get Example
(2).
dψ ∂ψ
(x = 0) = 0, (x = 1) + βψ (x = 1) = 0
dx ∂x
vanish.
First we notice that if λ = 0, then ψ must be constant. But that constant must be zero if β 6= 0.
dψ
The boundary condition (x = 0) = 0 implies B = 0 hence we have
dx
ψ = A cos λx , A 6= 0
λ sin λ − β cos λ = 0
where if λ is a solution so too −λ, both implying the same λ2 and ψ. Hence we write our equation
λ
= cot λ
β
and look for solutions λ > 0. These can be obtained graphically. The figure illustrates their
dependence on β.
LECTURE 14. ONE DIMENSIONAL DIFFUSION IN BOUNDED DOMAINS 335
We see that 0 ≤ λ1 ≤ 12 π, π ≤ λ2 ≤ 23 π and indeed that (i − 1) π ≤ λi ≤ i − 1 + 12 π =
i − 21 π, i = 1, 2, · · · . We also see that λi −→ (i − 1) π as β −→ 0, the result for Example (3),
and λi −→ i − 12 π as β −→ ∞, the result for Example (2). As β increases from 0 to ∞, λi
increases monotonically from (i − 1) π to i − 12 π and this corresponds to an increasing rate of
loss of solute. Some information on how the eigenvalues depend on β is in Appendix 1.
While each eigenvalue is a smooth monotonic function of β, moving from its β −→ 0 limit
(chemical reaction control) to its β −→ ∞ limit (diffusion control) as β increases, we observe
that if we hold β fixed, at any value other than ∞, then as i −→ ∞, λi −→ (i − 1) π, and this
is its β = 0 value. The larger the value of β the larger the value of i before λi can be closely
approximated by its β = 0 value, viz., lim lim λi (β) 6= lim lim λi (β). Ordinarily it is only
i−→∞β−→∞ β−→∞i−→∞
LECTURE 14. ONE DIMENSIONAL DIFFUSION IN BOUNDED DOMAINS 336
the first few λi ’s that depart greatly from their β = 0 values but these are the most important
eigenvalues for all but the shortest times.
This is the first problem where we might not have been able to use trigonometric identities to
discover that
Z1
ψi ψj dx = 0 , i 6= j
0
and yet this condition holds here just as it does in the earlier problems.
The eigenvalues then are the squares of the values λ1 , λ2 , · · · determined as above and the
q R1
1 1
corresponding eigenfunctions are cos λi x divided by 2λi
sin λ i cos λ i + 2
as cos2 λxdx =
0
1
2λ
sin λ cos λ + 21 .
It may be worthwhile to observe that for each value of β we generate an infinite set of orthog-
onal eigenfunctions:
where λ1 , λ2 , λ3 , · · · depend on β. The eigenfunctions for one value of β are not particularly
useful in writing the solution for another value of β. The readers may wish to satisfy themselves
that this is so. An example is presented in Appendix 3.
∂c
Example (5): c (x = 0) = 0, (x = 1) + βc (x = 1) = 0, β < 0
∂x
Here we have a source of solute at x = 1, not a sink as in the earlier case where β > 0, and
for β near zero we might imagine that all λ2 ’s are positive. Solute is produced at the right hand
boundary, lost at the left hand boundary and we wish to know: at what value of β, as the source
becomes increasingly stronger, can diffusion across the layer no longer control the source.
d2 ψ
+ λ2 ψ = 0 , 0 < x < 1
dx2
LECTURE 14. ONE DIMENSIONAL DIFFUSION IN BOUNDED DOMAINS 337
and
dψ
ψ = 0 at x = 0; + βψ = 0 at x = 1
dx
Z1 1 Z1
2 2 dψ dψ dψ
−λ ψ dx = ψ − dx
dx 0 dx dx
0 0
−βψ 2 (x = 1) .
Hence if β > 0 we have λ2 > 0, but if β < 0 we cannot tell the sign of λ2 without a calculation.
The first term depends explicitly on β, the second implicitly. The signs of both are known
and opposite if β < 0. We therefore anticipate that stability will be lost if β becomes sufficiently
negative (i.e., at least one value of λ2 will become negative). Indeed our formula for λ2 continues
dψ
to hold if ψ (x = 0) = 0, a sink, is replaced by (x = 0) = 0, a barrier. In that case the critical
dx
value of β is certainly zero; i.e., every negative value of β leads to growth.
By choosing ψ to satisfy
dψ
ψ (x = 0) = 0, (x = 1) + βψ (x = 1) = 0
dx
we have
1
∂c dψi
ψi − c =0
∂x dx 0
and the equation for hψi , ci is the same as in the earlier examples.
λ cos λ + β sin λ = 0.
LECTURE 14. ONE DIMENSIONAL DIFFUSION IN BOUNDED DOMAINS 338
This is an equation for λ2 . If λ is a solution so also is −λ and both ±λ lead to the same
eigenvalue and eigenfunction. However, if we anticipate a solution λ = 0, we ought to write
ψ = A + Bx
and then we see that λ = 0 is indeed a solution corresponding to ψ = βx, iff β = −1.
First we look for positive real values of λ leading to positive real values of λ2 . To do this we
write our equation
λ
− = tan λ
β
For β > 0 and indeed for β > −1 all is well. The value of λ1 , the smallest positive root,
π π
decreases from π to as β decreases from ∞ to 0 and then decreases from to 0 as β decreases
2 2
from 0 to -1. This makes physical sense as it tells us that an initial solute distribution dies out
more and more slowly as a solute sink loses strength and turns into a weak source. But as β passes
through -1 a root is lost and something new seems to happen. Indeed if we were to inquire as to
whether a steady concentration field were possible, wherein diffusion to the left hand reservoir just
balances production at the right hand boundary, we would find that such a condition obtains only
for β = −1.
LECTURE 14. ONE DIMENSIONAL DIFFUSION IN BOUNDED DOMAINS 340
Now λ2 must be real and, when β ≥ 0, it must also be positive. This is what directed our
earlier attention to real and positive values of λ; but if we admit purely imaginary values of λ, i.e.,
λ = iω where ω is real, then λ2 = −ω 2 and now λ2 is real, as it must be, but it is negative and this
is new. To see if this might be what is happening we put λ = iω, ω > 0, into λ cos λ + β sin λ = 0
and then use cos iω = cosh ω and sin iω = i sinh ω to get
ω cosh ω + β sinh ω = 0.
If β is not negative, this equation is not satisfied by any real values of ω. If β is negative we write
β = − |β| whence ω satisfies
ω
= tanh ω
|β|
But tanh ω increases monotonically from 0 to 1 while its derivative decreases monotonically from
tanh ω
1 to 0 as ω increases from 0 to ∞ and lim = 1. Hence there are no solutions to this
ω−→0 ω
equation for 0 < |β| < 1; but for |β| > 1 there is a solution as we can see on the graph:
So the eigenvalue λ21 decreases from π 2 to 0 as β decreases from ∞ to −1 and then decreases
from 0 to −∞ as β decreases from −1 to −∞. Hence for β < −1 an initial solute distribution, no
LECTURE 14. ONE DIMENSIONAL DIFFUSION IN BOUNDED DOMAINS 341
matter what its shape, runs away, while for β > −1 an initial solute distribution, again no matter
what its shape, dies out.
What we have found then is this. The diffusion equation, under the conditions c (x = 0) = 0,
∂c
(x = 1) + βc (x = 1) = 0, acts to dissipate imposed solute fields (of any size) so long as
∂x
kl
−1 < β < ∞. The parameter β is , where k depends on temperature and l is determined by the
D
proximity of the source and the sink, i.e., the diffusion path length. Calculations of this sort are of
interest in the design of cylinders for the storage of acetylene, but there the autocatalytic reaction
is homogeneous and is controlled by diffusion to the wall where deactivation takes place. Again
the diameter of the tank is important as is the temperature.
We might have expected to see first one negative value of λ2 , then two, then three, etc., as β
decreases below −1. But we do not.
The problem
∂c ∂2c
= 2, 0 < x < 1
∂t ∂x
and
∂c
c (x = 0) = 0, (x = 1) + βc (x = 1) = 0
∂x
where c (t = 0) is assigned has the solution c (x, t) = 0 for all values of β if c (t = 0) = 0; likewise
the corresponding steady problem has the solution c (x) = 0 for all values of β. If β > −1 this is
the long time limit of all unsteady solutions.
c = A + Bx
d2 ψ
+ λ2 ψ = 0, 0 < x < 1
dx2
and
dψ
ψ (x = 0) = 0, (x = 1) + βψ (x = 1) = 0
dx
B
ψ = A cos λx + sin λx
λ
B
where A = 0 as ψ (x = 0) = 0. The eigenvalues corresponding to ψ = sin λx are determined
λ
by the boundary condition at x = 1 and hence by the solutions to
β
cos λ + sin λ = 0.
λ
This has the solution λ = 0 iff β = −1. For small λ we can write this equation
1 2 β 1 3
1− λ ··· + λ− λ ··· = 0
2 λ 6
or
2 1 1
(1 + β) − λ + β +··· = 0
2 6
whence λ2 = 0 is a root and a simple root, iff β = −1. When λ2 is zero it corresponds to the
eigenfunction ψ = Bx.
So if β = −1, the eigenvalues are the squares of the roots of λ cos λ − sin λ = 0 and the
LECTURE 14. ONE DIMENSIONAL DIFFUSION IN BOUNDED DOMAINS 343
√
corresponding eigenfunctions are ψ1 = 3x, ψ2 = B2 sin λ2 x, · · · . And we can demonstrate by
R1
direct calculation that x sin λi xdx = 0, i = 2, · · · .
0
The solution to our problem when β = −1 is then
1
Z √ √ ∞
X 2
c= 3xc (t = 0) dx 3x + hψi , c (t = 0)i e−λi t ψi (x)
0 i=2
∂c ∂c ∂c
Example (6): (x = 0) = 0, (x = 1) + α (x = 1) = 0, α > 0
∂x ∂x ∂t
Now the diffusion layer is isolated from the left hand reservoir but exchanges solute with the
right hand reservoir, assumed to be of finite extent so that its concentration responds to this solute
exchange. To simplify the problem we assume that the right hand edge of the diffusion layer and
the reservoir remain in phase equilibrium for all time.
Here we see something new: time derivatives appear in our problem in two places and hence
we anticipate that our eigenvalue problem will not be the same as it was in the earlier examples.
But imagining that our time dependence will remain exponential we propose that ψ and λ2 must
satisfy
d2 ψ
+ λ2 ψ = 0
dx2
dψ dψ
(x = 0) = 0 and (x = 1) − αλ2 ψ (x = 1) = 0
dx dx
where the eigenvalue λ2 now also appears in the boundary condition.
Before we solve this problem we ought to use our two integration formulas to learn something
about it.
distinct eigenvalues, in the second formula. We see that φ and ψ are not orthogonal in the plane
vanilla inner product. Instead they are orthogonal in the inner product
Z1
ha, bi = ab dx + αa (x = 1) b (x = 1)
0
X
c= ci (t) ψi
and find
d
hψi , ci = −λ2i hψi , ci
dt
just as before, but now in a new inner product, and our solution is as before, but now in a new inner
product, viz.,
X 2
c= hψi , c (t = 0)i e−λi t ψi
where
Z1
hψi , c (t = 0)i = ψi c (t = 0) dx + αψi (x = 1) c (t = 0) |
x=1
0
and we can see how the initial states of the diffusion layer and the reservoir come into the solution.
It remains only to solve the eigenvalue problem. First we observe that zero is an eigenvalue
corresponding to ψ = 1. This is not surprising because we expect the system to come to rest as
t −→ ∞ with a uniform solute concentration in the diffusion layer, in equilibrium with whatever
solute concentration winds up in the reservoir. So writing
ψ = A cos λx + B sin λx
dψ
we see that (x = 0) = 0 implies that B = 0, hence A must not be zero. The values of λ then
dx
LECTURE 14. ONE DIMENSIONAL DIFFUSION IN BOUNDED DOMAINS 345
satisfy
dψ
(x = 1) − λ2 αψ (x = 1) = 0
dx
λ sin λ + λ2 α cos λ = 0.
This is an equation for λ2 and we see that λ2 = 0 is a simple root. We can find the remaining
values of λ2 by graphical means, by solving
αλ = − tan λ
for λ > 0. Because α is positive there is no solution λ = iω and hence λ2 is not negative.
Suppose we have
c (x = 0) = c (x = 1)
and
∂c ∂c
(x = 0) = (x = 1)
∂x ∂x
vanishes if we choose
ψ (x = 0) = ψ (x = 1)
LECTURE 14. ONE DIMENSIONAL DIFFUSION IN BOUNDED DOMAINS 346
and
dψ dψ
(x = 0) = (x = 1) .
dx dx
d2 ψ
2
+ λ2 ψ = 0
dx
as
ψ = A cos λx + B sin λx
A = A cos λ + B sin λ
and
and hence
cos λ − 1 sin λ A 0
=
−λ sin λ λ (cos λ − 1) B 0
To have a solution to these homogeneous equations such A and B are not both zero, we must have
cos λ − 1 sin λ
det =0
−λ sin λ λ (cos λ − 1)
cos λ = 1
or
λ = 2π, 4π, . . .
0, (2π)2 , (4π)2 , . . .
n = 1, 2, · · · Thus to every eigenvalue not zero, we have two periodic eigenfunctions, viz., A = 1,
B = 0 and A = 0, B = 1. But corresponding to λ = 0, we have only one periodic solution.
The reader may observe that the expansion of a periodic function in these eigenfunctions is
what is ordinarily called a Fourier series expansion, where the coefficient of the first eigenfunction
is the average value of the function being expanded.
We present here an eigenvalue problem that is a little out of the ordinary. First, suppose φ and µ2
satisfy
d2 φ
+ µ2 φ = 0, −1 < x < 1
dx2
LECTURE 14. ONE DIMENSIONAL DIFFUSION IN BOUNDED DOMAINS 348
where
φ (x = −1) = 0 = φ (x = 1)
2
1 2 1
cos πx µ = π
2 2
sin πx µ2 = π 2
2
3 2 3
cos πx µ = π
2 2
etc.
where
ψ (x = −1) = 0 = ψ (x = 1) .
This is a model for an eigenvalue problem arising in frictional heating. It appears if a small pertur-
bation is imposed on the solution to a problem in plane Couette flow.
sin πx ν 2 = π2
etc.
LECTURE 14. ONE DIMENSIONAL DIFFUSION IN BOUNDED DOMAINS 349
Then setting
Z1
C= ψdx
−1
we have
d2 ψ
+ ν2ψ = ν2C
dx2
whereupon
ψ = C + A cos νx + B sin νx
and
2A
C = 2C + sin ν
ν
The case B 6= 0, A = 0, ν = π, 2π, · · · corresponds to the odd solutions about x = 0 and hence
to C = 0. In the remaining case, C 6= 0, we have A 6= 0, B = 0 and
1 sin ν
ν= = tan ν
2 cos ν
This equation has many positive solutions, which can be found graphically, as well as one negative
solution ν = iω, whence ν 2 = −ω 2 < 0.
There are similarities here to two earlier problems in this lecture, one of which also has a
negative eigenvalue.
The reader can use our two integration by parts formulas, now on the interval −1 ≤ x ≤ 1, to
LECTURE 14. ONE DIMENSIONAL DIFFUSION IN BOUNDED DOMAINS 350
derive general conclusions about the solutions to this new eigenvalue problem. For example setting
φ = ψ in our second integration by parts formula and observing that ψ, λ2 is a solution if ψ, λ2 is
a solution we have
1 1
Z Z1 Z1 Z Z1 Z1
−ν 2 ψψdx − ψdx ψdx = −ν 2 ψψdx − ψdx ψdx
−1 −1 −1 −1 −1 −1
d2 ψ
+ λ2 ψ = 0, ψ = 0 at x=0
dx2
and
dψ
= −βψ, at x=1
dx
dψ
= αλ2 ψ
dx
All β’s make physical sense but β ≥ 0 is the ordinary case. Only α ≥ 0 makes sense but here
we suppose α < 0 is possible, i.e., we have an antireservoir.
In both problems
ψ = A sin λx
LECTURE 14. ONE DIMENSIONAL DIFFUSION IN BOUNDED DOMAINS 351
λ
= tan λ
−β
1
= tan λ
αλ
If we ask: is there a value of α or β where λ2 = 0, in the first problem we find one and only
one value of β : β = −1. In the second problem there are no values of α such that λ2 = 0. For
β > −1 all λ2 ’s are positive, at β = −1, one becomes zero and at β < −1, one is negative, the
others remaining positive because there is no value of β other than −1 where λ2 = 0. All λ2 ’s are
smooth functions of β.
This is not the way the second problem works. For α ≥ 0 all λ2 ’s are positive, α = 0 cor-
responding to an impermeable wall at x = 1. However, if we admit the possibility α < 0 we
have
LECTURE 14. ONE DIMENSIONAL DIFFUSION IN BOUNDED DOMAINS 352
tan λ
π⁄2 3π ⁄ 2
λ
1
αλ
and the root π/2 at α = 0 appears to have been lost. However, at α < 0 by setting λ = iω we have
1
= tan ω
(−α) ω
λ1
2
( π ⁄ 2 )2
Our pictures show that in the second problem all λ2 ’s are positive for α ≥ 0, one λ2 is negative,
all others remaining positive, for α < 0 and now no λ2 is ever zero.
ψ (x = ±1) = 0
1
The crisis here occurs at γ = . The two earlier cases correspond to γ = 0 and γ = 1.
2
If we now put our autocatalytic reaction on the domain, our eigenvalue problem is
d2 ψ
2
+ λ2 ψ − βψ = 0, β<0
dx
ψ = 0 at x = 0, 1
LECTURE 14. ONE DIMENSIONAL DIFFUSION IN BOUNDED DOMAINS 354
And we have
p
ψ = A sin λ2 − β x
where
λ2 = β + n2 π 2
β = −n2 π 2 , n = 1, 2, . . .
Every λ2 can turn negative and we see that the larger −β the more spatial variation is needed to
control growth. But once −β exceeds π 2 we have lost stability.
and
d d
ψ (x = 0) = 0, + β ψ (x = 1) = 0
dx dx
and our aim here is to determine the dependence of λ21 , λ22 , · · · on β. To do this we find, by
dψ
differentiating the problem, that satisfies
dβ
d2 dψ dλ2
+ λ2 =− ψ
dx2 dβ dβ
LECTURE 14. ONE DIMENSIONAL DIFFUSION IN BOUNDED DOMAINS 355
and
d dψ d dψ
(x = 0) = 0, +β (x = 1) = −ψ (x = 1)
dx dβ dx dβ
The corresponding homogeneous problem has a nonzero solution, hence a solvability condition
dψ
must be satisfied. It is satisfied because can be found by differentiating ψ, and solvability
dβ
dλ2
determines . Thus we use our second integration by parts formula to write
dβ
Z1 1 Z1 2
d2 dψ d dψ dψ dψ d dψ
ψ + λ2 dx = ψ − + 2
+ λ ψ dx
dx2 dβ dx dβ dx dβ 0 dx 2 dβ
0 0
d2 dψ dλ2 d2
and then substitute + λ2 =− ψ and 2
+ λ ψ = 0 into this to get
dx2 dβ dβ dx2
Z1
dλ2
ψψdx = −ψψ (x = 1)
dβ
0
dλ2 cos2 λ
= 1 1
dβ 2λ
sin λ cos λ + 2
dλ2 2λ2
= 2
dβ λ + β2 + β
dλ2i
(β = 0) = 2, i = 2, · · ·
dβ
dλ21
but (β = 0) is indeterminate because λ21 (β = 0) = 0.
dβ
LECTURE 14. ONE DIMENSIONAL DIFFUSION IN BOUNDED DOMAINS 356
To see what λ21 is doing when β is small we first observe that λ21 as a function of β satisfies
λ1 sin λ1 − β cos λ1 = 0.
Then when λ21 is near zero, as it is when β is small, we can approximate λ1 sin λ1 and cos λ1 by
1 1 6
λ1 sin λ1 = λ21 − λ41 + λ −···
6 120 1
and
1 1
cos λ1 = 1 − λ21 + λ41 − · · ·
2 24
and write
λ21 = c1 β + c2 β 2 + · · ·
Substituting this in
1 4 1 6 1 2 1 4
λ21 − λ1 + λ − · · · − β 1 − λ1 + λ1 − · · · = 0
6 120 1 2 24
we find that
1
c1 = 1, c2 = − , · · ·
3
1
λ21 = β − β 2
3
or
β
λ21 = .
1
1+ β
3
LECTURE 14. ONE DIMENSIONAL DIFFUSION IN BOUNDED DOMAINS 357
dλ21
(β = 0) = 1.
dβ
To get a clear idea how much help a diagonalizing basis is in writing the solution to the diffusion
equation we carry out a calculation in a nondiagonalizing basis.
Let c satisfy
∂c ∂2c
= 2, 0 < x < 1
∂t ∂x
∂c
(x = 0) = 0
∂x
and
∂c
(x = 1) + βc (x = 1) = 0
∂x
d2 ψ
+ λ2 ψ = 0, 0 < x < 1
dx2
dψ
(x = 0) = 0
dx
and
dψ
(x = 1) + βψ (x = 1) = 0
dx
and it is satisfied by
B
ψ = A cos λx + sin λx
λ
LECTURE 14. ONE DIMENSIONAL DIFFUSION IN BOUNDED DOMAINS 358
ψi = A cos λi x
λ sin λ − β cos λ = 0
and the A′ s are normalization constants. For each value of β we get a set of eigenfunctions and
these eigenfunctions depend on the value of β.
Earlier, Example (4) page 334, we solved this problem. Whatever the value of β we expanded
the solution in the corresponding eigenfunctions. Here we try something else. Let β > 0 be fixed.
Then to determine the solution for this value of β we expand it in the eigenfunctions corresponding
to β = 0 as they are easy to determine.
2
which is an equation in λ(0) having simple roots, viz.,
2
λ(0) = 0, π 2 , 22 π 2 , · · ·
ψ00 = 1
√
ψ10 = 2 cos πx
√
ψ20 = 2 cos 2πx
etc
LECTURE 14. ONE DIMENSIONAL DIFFUSION IN BOUNDED DOMAINS 359
To solve the problem corresponding to a fixed value of β > 0 in terms of the eigenfunctions
corresponding to β = 0 we write
∞
X
c= ci ψi0
i=0
Z1
ci = ψi0 , c = ψi0 cdx
0
To do this we multiply the equation for c by ψi0 , integrate the result from 0 to 1 and use the
integration by parts formula
Z1 Z1 1
d2 v d2 u dv du
u 2 dx = vdx + u − v
dx dx2 dx dx 0
0 0
to get
1
∂c ∂2c d2 ψi0 0 ∂c dψi0
ψi0 , = ψi0 , = , c + ψi − c
∂t ∂x2 dx2 ∂x dx 0
and this is
d 2 ∂c
ci = − λ0i ci + ψi0 (x = 1) (x = 1) .
dt ∂x
∂c ∂c
The technical difficulty here is that (x = 1) is not zero. In fact (x = 1) = −βc (x = 1) =
P ∂x ∂x
−β cj ψj0 (x = 1) and using this we get
j=0
d 2 X
ci = − λ0i ci − βψi0 (x = 1) cj ψj0 (x = 1)
dt j=0
This illustrates what happens when we do not use a diagonalizing basis; the equations satisfied by
the ci are not uncoupled.
LECTURE 14. ONE DIMENSIONAL DIFFUSION IN BOUNDED DOMAINS 360
√ 2 2
Using ψ00 (x = 1) = 1, ψj0 (x = 1) = 2 (−1)j , j = 1, 2, · · · , (λ00 ) = 0 and λ0j = j 2π2,
j = 1, 2, · · · we must solve
dci 2 √ X
= − λ0i ci − βψi0 (x = 0) c0 − 2βψi0 (x = 0) (−1)j cj
dt j=1
or
dc0 √ X
= 0 − βc0 − 2β (−1)j cj
dt j=1
and
dci √ X
= −i2 π 2 ci − 2β (−1)i c0 − 2β (−1)i (−1)j cj , i = 1, 2, · · · .
dt j=1
To try to learn something about our solution, we truncate the first two equations to
dc0 √
= −βc0 + 2βc1
dt
and
dc1 √
= −π 2 c1 + 2βc0 − 2βc1
dt
and hence to determine c0 and c1 in this approximation we need the eigenvalues of the matrix
√
−β 2β
√
2β −π 2 − 2β
For small β these are −β and −π 2 and so for long time and small β the solute is dissipated as e−βt
and this is correct. But more information than this is difficult to obtain in this basis.
LECTURE 14. ONE DIMENSIONAL DIFFUSION IN BOUNDED DOMAINS 361
Before we go on, we can get an idea of what is to come and at the same time make an observation
about Fourier series. The eigenvalue problem
d2 ψ
2
+ λ2 ψ = 0 , 0 < x < 1
dx
and
ψ (x = 0) = 0, ψ (x = 1) = 0
√
has solutons λ2i = i2 π 2 and ψi = 2 sin iπx, i = 1, 2, · · · . This orthogonal set of eigenfunctions
can be used to solve problems such as
d2 c
0= + Q (x)
dx2
where
c (x = 0) = c0 , c (x = 1) = c1 .
Indeed, writing
∞
X
c (x) = hψi , ci ψi
i=1
d2 c
we can find hψi , ci by multiplying 0 = 2 + Q by ψi and integrating over 0 ≤ x ≤ 1. Doing this
dx
we get
d2 c
0= ψi , 2 + hψi , Qi
dx
LECTURE 14. ONE DIMENSIONAL DIFFUSION IN BOUNDED DOMAINS 362
1 2
dc dψi d ψi
0 = ψi − c + , c + hψi , Qi
dx dx 0 dx2
1
dc dψi
Something new happens here. The term ψi − c does not vanish, but it can be evaluated
dx dx 0
1
dc
because ψi vanishes on the boundary, eliminating the piece ψi , while c is assigned there,
dx 0
1
dψi
establishing the value of the piece − c . Using this we get
dx 0
and hence
We see, then, that c (x) is the sum of three terms each accounting for the contribution of one of
the three sources. The boundary sources introduce a special problem. To see this let Q = 0, c0 = 0
and c1 = 1, then
X 1 dψ X 1 √ X∞
√ 2
c (x) = − 2
i
(x = 1) ψi = − 2 2
2iπ cos iπ 2 sin iπx = − (−1)i sin iπx
λi dx iπ i=1
iπ
Because c (x) = x satisfies this problem, this expansion must be the Fourier series for x and indeed
it is, viz.,
∞
X 2
x= − (−1)i sin iπx
i=1
iπ
1
The terms in this series fall off as and so convergence depends on the alternating sign, (−1)i ,
i
1 1 1
i.e., − ∼ 2 , and gets a little help from the sign pattern of sin iπx.
i i+1 i
What we seem to have here is the function f (x) = x on 0 ≤ x ≤ 1 expanded in a series of
LECTURE 14. ONE DIMENSIONAL DIFFUSION IN BOUNDED DOMAINS 363
odd functions of period 2. But in fact what we really have is the function f (x) = x on 0 ≤ x ≤ 1,
extended first to −1 ≤ x ≤ 1 as an odd function and then extended to all x as a function of period
2, expanded in a series of functions of period 2. That is, what we have expanded is the function
shown here:
+1
-3 -1 +1 +3
-1
The convergence is slow because this function is not smooth, having a jump at x = 1. But the
series converges to x for all x : 0 ≤ x < 1. It converges to 0 at x = 1, where 0 is the average of 1
and -1, the limits of the value of the function as x goes to 1 from the left and the right.
X
−2 (−1)i cos iπx,
The solution to this boundary source problem is not a superposition of terms that satisfy the
original differential equation as we found earlier for initial value problems nor is it a superposition
of terms that satisfy special problems of the same kind as we find for interior sources where
hψi , Qi ψi
c=
λ2i
LECTURE 14. ONE DIMENSIONAL DIFFUSION IN BOUNDED DOMAINS 364
satisfies
d2 c
0= + hψi , Qi ψi
dx2
and
c (x = 0) = 0, c (x = 1) = 0.
What we have then is a series expansion of the solution to a problem driven by a boundary source
which is correct but which cannot be verified by direct substitution into the problem. Indeed what
can be learned about the solution to the problem using its series expansion is limited to operations
on the series that exclude differentiation.
1. Let c1 and c2 denote the concentrations of two solute species dissolved in a solvent which
is confined to a thin layer. The two solute species are distributed across the solvent layer at
t = 0 in some assigned way. The edges of the layer are impermeable to both species and an
estimate of the time for the layer to equilibrate is required. In terms of
c1
c=
c2
∂c ∂2c
= D 2, 0<x<1
∂t ∂x
and
∂c ∂c
(x = 0) = 0, (x = 1) = 0,
∂x ∂x
LECTURE 14. ONE DIMENSIONAL DIFFUSION IN BOUNDED DOMAINS 365
and where at t = 0:
1, 1
0≤x<
c1 = 2
1
0, <x≤1
2
0, 1
0≤x<
c2 = 2
1
1, <x≤1
2
How long before the maximum deviation from uniformity is less than 0.000001?
kji
i j
kij
Let the solute be in chemical equilibrium and be distributed uniformly in the solvent
which is confined to a thin layer, 0 < x < 1. At t = 0 the edges of the solvent layer contact
large solute free reservoirs that hold the solute concentration there at zero for all t > 0. Then
for the loss of solute to the reservoir we have
∂c ∂2c
= D 2 + Kc, 0<x<1
∂t ∂x
c (x = 0) = 0, c (x = 1) = 0
LECTURE 14. ONE DIMENSIONAL DIFFUSION IN BOUNDED DOMAINS 366
and
c (t = 0) = c eq
where
c1
c2
c=
..
.
cn
Kc eq = 0
and
D D12 · · ·
11
D = D21 D22 · · ·
.. .. ..
. . .
Determine c (x, t) and hence the time required for the solvent to be cleared of solute.
In problem 1 expanding c in the eigenvectors of D leads to two familiar diffusion equations.
That idea will not work here unless D and K have a complete set of eigenvectors in common.
But this is not ordinarily so, even if D is diagonal.
Yet the problem is special in another way: the boundary conditions are Dirichlet con-
ditions for all solute species. And so its solution can be obtained by expanding c in the
eigenfunctions determined by the ordinary eigenvalue problem
d2 Ψ
+ λ2 Ψ = 0
dx2
and
Ψ (x = 0) = 0, Ψ (x = 1) = 0
LECTURE 14. ONE DIMENSIONAL DIFFUSION IN BOUNDED DOMAINS 367
X
c= ci (t) Ψi (x)
Z 1
where ci (t) = h Ψi , c i and h , i = dx. Then as
0
Z 1 1
d 2 c d 2 Ψi dc dΨi
Ψi 2 − c dx = Ψi − c =0
0 dx dx2 dx dx 0
our expansion works out here just as it does in more familiar problems.
How much time must elapse before only 1% of the initial equilibrium solute remains in
the solvent, if n = 2, if, in units of film thickness,
1 0
D= 1
0
2
and if
−1 1
K= ?
1 −1
∂c ∂2c
= 2, 0<x<1
∂t ∂x
(i) c (x = 0) = 0 = c (x = 1)
∂c
(ii) c (x = 0) = 0 = (x = 1)
∂x
∂c ∂c
(iii) (x = 0) = 0 = (x = 1)
∂x ∂x
LECTURE 14. ONE DIMENSIONAL DIFFUSION IN BOUNDED DOMAINS 368
Order the rates at which the solute goes to its long time uniform state in the three experi-
ments. Observe that the uniform state in (iii) cannot be determined by solving the steady
equation but depends on c (t = 0). This is not true in (i) and (ii).
4. Free radicals are created in acetylene whereupon they catalyse the production of more free
radicals.
The growth of free radicals is controlled by diffusion to the wall whereupon the free
radicals are destroyed upon collision with the solid surface. A tank of acetylene must not be
too large in diameter if a runaway is to be averted.
Suppose acetylene is stored between two plane walls a distance L apart. In terms of k
and D, how large can L be before a runaway occurs?
The model is
1 ∂c ∂2c k
= 2 + c, k>0
D ∂t ∂x D
L2
c = 0 at x = 0 and x = L where c denotes the free radical concentration, [D] = and
T
1
[k] = .
T
Does the value of L depend on c (t = 0) ?
∂2c k
0= 2
+ c, 0<x<L
∂x D
and
c (x = 0) = 0 = c (x = L)
kL2 kL2
has the solution c = 0 for all values of > 0. But for special values of it has
D D
solutions other than c = 0. Find these values.
LECTURE 14. ONE DIMENSIONAL DIFFUSION IN BOUNDED DOMAINS 369
The steady problem does not have only non-negative solutions. But you can show that
if c (t = 0) > 0, the unsteady problem must have non-negative solutions. Do this.
kL2
Write the solution to the unsteady problem if c (t = 0) > 0 and is less than, equal
D
to or greater than π 2 .
kL2
If = π 2 , the solution to the steady problem can only be obtained by solving the
D
unsteady problem and then letting t grow large. This steady solution depends on c (t = 0).
5. Two species having concentrations a and b are distributed over a one dimensional domain,
∂b
0 < x < 1. At the ends we have a = 0 and = 0 and on the domain, where the reaction
∂x
a b
∂a ∂2a
= 2 +b−a
∂t ∂x
and
∂b ∂2b
= 2 −b+a
∂t ∂x
∂c ∂2c
= 2, 0<x<1
∂t ∂x
c (x = 0) = 0, c (x = 1) = 1
LECTURE 14. ONE DIMENSIONAL DIFFUSION IN BOUNDED DOMAINS 370
and
c (t = 0) = 0
d2 Ψ
+ λ2 Ψ = 0
dx2
and
Ψ (x = 0) = 0, Ψ (x = 1) = 0
Take the limit of the solution as t → ∞ and verify that this is the Fourier series for f (x) = x
on the interval −1 ≤ x ≤ 1. This series converges for all values of x and defines a periodic
function of period 2. Its value when x = 1 is zero, its limit as x → 1− is one. It is the solution
to the problem
d2 c
0= , 0<x<1
dx2
and
c (x = 0) = 0, c (x = 1) = 1
This can be verified by construction but not by direct substitution, as the series derived by
termwise differentiation does not converge.
d2
D = 2, 0<x<ε
dx
d2
D=β , ε<x<1
dx2
LECTURE 14. ONE DIMENSIONAL DIFFUSION IN BOUNDED DOMAINS 371
Z 1 ε 1 Z ε Z 1
dv dv du dv du dv
uDv dx = u + uβ − dx − β dx
0 dx 0 dx ε 0 dx dx ε dx dx
ε 1 Z 1
dv du dv du
= u − v + uβ −β v + Duv dx
dx dx 0 dx dx ε 0
DΨ + λ2 Ψ = 0, 0<x<1
Ψ (x = 0) = 0, Ψ (x = 1) = 0
Ψ x = ε− = Ψ x = ε+
and
dΨ dΨ
x = ε− = β x = ε+
dx dx
Use the solutions to this eigenvalue problem to obtain a formula for the solution to
a diffusion problem where a solute, initially distributed over a layer composed of two im-
miscible solvents, diffuses out of the layer when it is placed in contact at t = 0 with two
solute free reservoirs that maintain the solute concentration at its edges at c = 0. The solute
concentration then satisfies
∂2c
D1 2 , 0 < x < x12
∂c ∂x
=
∂t
2
D2 ∂ c ,
x12 < x < ℓ
∂x2
and
c (x = 0) = 0, c (x = ℓ) = 0
LECTURE 14. ONE DIMENSIONAL DIFFUSION IN BOUNDED DOMAINS 372
In this and the preceding problem, by using our integration by parts formula, you can
prove that λ2 must be real, etc.
Lecture 15
Diffusion is always smoothing and ordinarily it is stabilizing; nonetheless there is a paper by Segel
and Jackson (L. A. Segel, J. L. Jackson, J. Theoretical Biology, (1972), 37, 545) in which it is
proposed that diffusion is destabilizing, causing non uniformities to appear in an otherwise stable,
spatially uniform system.
We present the model. Two chemical species occupy the real line, −∞ < x < ∞. We denote
their concentrations c1 and c2 and refer to the first as the activator, the second as the inhibitor.
The model is
∂c1 ∂ 2 c1
= D1 2 + R1 (c1 , c2 )
∂t ∂x
∂c2 ∂ 2 c2
= D2 2 + R2 (c1 , c2 )
∂t ∂x
R1 (c1 , c2 ) = 0 = R2 (c1 , c2 )
373
LECTURE 15. TWO EXAMPLES OF DIFFUSION IN ONE DIMENSION 374
Interest in this stems from the fact that an activator and an inhibitor might be in balance in a
cell wall where there also reside receptors picking up signals that the cell must respond to. Our
uniform activator-inhibitor state may not persist in the face of perturbations due to such signals
and our aim might be to find conditions that cause the cell to spring into action.
(0) (0)
To do this we introduce small perturbations to c1 and c2 , denoted ξ1 and ξ2 , and derive
∂2
D
∂ ξ1 1 ∂x2 0 ξ1 a11 a12 ξ1
= 2 +
∂t ξ2 ∂ ξ a21 a12 ξ2
0 D2 2 2
∂x
In the absence of diffusion, the uniform state is assumed to be stable. Thus we have
and
and
ξ1 = a1 cos kxe σt
ξ2 = a2 cos kxe σt
whereupon we have
a1 D1 0 a1 a11 a12
σ = −k 2 + A , A=
a2 0 D2 a2 a21 a22
and we see that the σ’s, the growth constants, are eigenvalues of the matrix
a11 − k 2 D1 a12
2
a21 a22 − k D2
where, because x runs from −∞ to +∞ all values of k are admissible. Had x a finite range
the admissible k’s would be limited by the end conditions, eg., Neumann conditions, periodic
conditions, etc.
Our system is stable to long wave length perturbations (k 2 = 0), by construction, and to small
wave length perturbations (k 2 → ∞) due to diffusive smoothing.
Upon setting
D1 0
det −k 2 + A − σI = 0
0 D2
and
−k 2 D1 + a11 −k 2 D2 + a22 − a21 a12 > 0 (det condition)
det k 2 = D1 D2 k 4 − (D1 a22 + D2 a11 ) k 2 + a11 a22 − a21 a12
| {z } | {z }
(+) (+)
whence we need
D2 > D1
LECTURE 15. TWO EXAMPLES OF DIFFUSION IN ONE DIMENSION 377
det (k2)
k2
and we see that to have an instability det (k 2 ) must be negative at its least value. The least value
occurs at
2
D1 a22 + D2 a11 > 4D1 D2 a11 a22 − a21 a12
| {z } | {z } | {z }
(−) (+) (+)
Thus we can arrange an instability and we can understand it: At a site where a perturbation due
to an outside signal increases the activator concentration, and hence the rates of production of the
activator and the inhibitor are both increased, instability obtains if the activator remains in place
LECTURE 15. TWO EXAMPLES OF DIFFUSION IN ONE DIMENSION 378
(low D1 ) and the inhibitor diffuses away (high D2 ), leaving the activator to reinforce itself. This is
referred to as diffusion induced symmetry breaking.
Given an unstable range of k’s, we could go on and get an estimate of the non uniform state
that appears. This is done in Grindrod’s book “The Theory and Applications of Reaction-Diffusion
Equations: Patterns and Waves.” We turn instead to the Petri Dish problem where this is easier to
do.
A steady solution to our Petri Dish problem (Lecture 1) satisfies, in scaled variables,
d2 c
0= + λF (c)
dx2
We already know that c = c0 = 0 is a solution for all values of λ, and we know that this solution
is stable to small perturbations for all values of λ < λcrit (see also §15.3); beyond λ = λcrit it is
unstable and we wish to find out what the new solution looks like for λ just beyond λcrit.
To find the non zero steady solution branch emerging from λcrit as λ increases we write
1 2
c = c0 + ε c 1 + ε c2 + · · ·
2
There is a method, called dominant balance, for doing this and it is explained in the books by
Bender and Orzag and by Grindrod. (“Advanced Mathematical Methods for Scientists and Engi-
neers” and “The Theory and Application of Reaction-Diffusion Equations.”) But we can illustrate
the main ideas by trying two possibilities:
(1) λ = λcrit + ε
and
(2) λ = λcrit + 21 ε2
LECTURE 15. TWO EXAMPLES OF DIFFUSION IN ONE DIMENSION 379
1 2 ′
F (c) = F (c0 ) + εF ′ (c0 ) c1 + ε F (c0 ) c2 + F ′′ (c0 ) c21
2
1 3
+ ε F ′ (c0 ) c3 + 3F ′′ (c0 ) c1 c2 + F ′′′ (c0 ) c31 + · · ·
6
d 2 c1
2
+ λcritF ′ (c0 ) c1 = 0, c1 = 0 at x = 0, 1
dx
and
d 2 c2
+ λcritF ′ (c0 ) c2 = −2F ′ (c0 ) c1 − λcritF ′′ (c0 ) c21 , c2 = 0 at x = 0, 1
dx2
1 2
at order ε and at order ε
2
The first problem is the eigenvalue problem that we solved earlier, Lecture 1, to obtain λcrit,
and we found λcritF ′ (c0 ) = π 2 . Whence we have
c1 = A sin πx
For our expansion, here (1), to be correct we need to be able to find c1 , c2 , . . . and hence we
move on to the second order problem and we notice that the homogeneous part of this problem
is the eigenvalue problem and we already know that it has a non zero solution, viz., c1 . Thus a
solvability condition must be satisfied in order for the calculation to continue. and we find this
condition by multiplying the second problem by c1 the first by c2 , subtracting and integrating over
0 < x < 1.
The result is
Z 1
′ ′′
c1 2F (c0 ) c1 + λcritF (c0 ) c21 dx = 0
0
LECTURE 15. TWO EXAMPLES OF DIFFUSION IN ONE DIMENSION 380
First we suppose
F ′ (0) = 1
F (c) = c − c2 F ′′ (0) = −2
F ′′′ (0) = 0
c = (λ − λcrit) A sin πx
Second we suppose
F ′ (0) = 1
F (c) = c − c3 F ′′ (0) = 0
F ′′′ (0) = −6
and now we find, at second order, that A = 0. Hence we conclude that expansion (1) fails in this
case.
Turning to expansion (2), and continuing with our second case, we have, at first and second
orders,
d 2 c1
+ λcritF ′ (c0 ) c1 = 0, c1 = 0 at x = 0, 1
dx2
and
d 2 c2
2
+ λcritF ′ (c0 ) c2 = −λcritF ′′ (c0 ) c21 , c2 = 0 at x = 0, 1
dx
LECTURE 15. TWO EXAMPLES OF DIFFUSION IN ONE DIMENSION 381
whence
c1 = A sin πx
is satisfied for all values of A due to F ′′ (c0 ) = 0. Hence we must go to third order where we have
d 2 c3
2
+ λcritF ′ (c0 ) c3 = −3λcritF ′′ (c0 ) c1 c2 − λcritF ′′′ (c0 ) c31 − 3F ′ (c0 ) c1
dx
c3 = 0 at x = 0, 1
and solvability must be satisfied if we are to be able to find c3 and continue our calculations. Ordi-
narily c2 would be needed, and it can be found, but it is not needed here because
F ′′ (c0 ) = 0. (It is not usually true that the condition needed to satisfy solvability at second
order eliminates the need for c2 at third order.) The solvability condition at third order is
Z 1 Z 1
′′′
−λcritF (c0 ) c41 ′
dx − 3F (c0 ) c21 dx = 0
0 0
where we have used λcritF ′ (c0 ) = π 2 , F ′ (c0 ) = 1 and F ′′′ (c0 ) = −6.
p
c= 2 (λ − λcrit) A sin πx
and we see that how our solution depends on λ, for λ just beyond λcrit, differs as the nonlinearity
LECTURE 15. TWO EXAMPLES OF DIFFUSION IN ONE DIMENSION 382
differs.
We wish to find the critical value of λ at which the solution c = 0, which holds for all λ, becomes
unstable. The model is
∂c ∂2c
= 2 + λF (c)
∂t ∂x
∂c1 ∂ 2 c1
= + λF ′ (0) c1
∂t ∂x2
where c1 = 0 at x = 0, 1.
∂2ψ
+ µ2 ψ = 0
∂x2
where ψ = 0 at x = 0, 1.
ψ = A sin µx µ2 = π 2 , (2π)2 , . . .
d2
Expanding c1 in the eigenfunctions of we have
dx2
X
c1 = h ψ, c1 i ψ
where
d
h ψ, c1 i = − µ2 + λF ′ (0) h ψ, c1 i
dt
LECTURE 15. TWO EXAMPLES OF DIFFUSION IN ONE DIMENSION 383
As λ increases from zero, −µ2 + λF ′ (0) starts out negative. It first becomes zero at λ = λcrit,
corresponding to µ2 = µ21 , and thereafter remains positive, viz.,
µ2 = µ21
µ2 = µ22
λ
λcrit
1. Solve the Segel and Jackson problem on a bounded, one dimensional domain, assuming
homogeneous Neumann conditions at the ends.
Lecture 16
In this lecture we do not assume the boundary conditions are homogeneous and we do not assume
d2
the domain is one dimensional. Thus, we replace the differential operator 2 by ∇2 corresponding
dx
to diffusion in more than one dimension. Our emphasis will be on ∇2 and its eigenvalue problem.
We suppose that at time t = 0 a solute is distributed in some specified way throughout a solvent
occupying a bounded region of three dimensional space. We let V denote the region as well as its
volume and we suppose that V is separated from the remainder of physical space, over which we
have control, by a piecewise smooth surface S.
Our interest is in determining how our initial distribution of solute changes as time goes on and
we assume that its concentration satisfies
∂c
= ∇2 c + Q (−
→
r , t) , t > 0, →
−
r ∈V
∂t
385
LECTURE 16. DIFFUSION IN BOUNDED, THREE DIMENSIONAL DOMAINS 386
where c (t = 0) is assigned ∀ −→
r ∈ V and where we measure lengths in units of a length L, say
1
L = V 3 , and measure time in units of L2 /D. This sets the value of the diffusion coefficient to one
in the scaled units. The assigned functions Q (−
→
r , t) and c (t = 0) specify the source of solute in
the region and the initial distribution of solute there, but the problem is indeterminate until we go
on and specify the conditions on S, the boundary of V , which define the effect of the surroundings
on what is going on inside V . To do this we divide S into three parts S1 , S2 and S3 and on each of
these we specify a definite condition:
c (−
→
r , t) = g1 (−
→
r , t) , →
−
r ∈ S1 , t > 0
−
→
n · ∇c (−
→
r , t) = g2 (−
→
r , t) , →
−
r ∈ S2 , t > 0
and
−
→
n · ∇c (−
→
r , t) + βc (−
→
r , t) = g3 (−
→
r , t) , →
−
r ∈ S3 , t > 0
where −
→
n is the outward unit normal vector to S and where β is assigned on S3 . It is assumed to
be real and it may not be constant. Ordinarily it is a positive constant. Thus we specify the solute
concentration itself on S1 , the rate of solute diffusion across S2 and a linear combination of these
on S3 by specifying the functions g1 , g2 , and g3 defined on S1 , S2 and S3 ∀ t > 0. The conditions
on S1 , S2 and S3 are called Dirichlet, Neuman and Robin conditions and if S1 = S the problem is
called a Dirichlet problem, etc.
Our goal here is to learn how to write the solution to this problem. To do this we introduce the
eigenvalue problem
∇2 ψ = −λ2 ψ, →
−
r ∈V
and denote its solutions, the eigenfunctions and the eigenvalues, ψ1 , ψ2 , . . . corresponding to λ21 , λ22 , . . .
We face two problems. The first is to specify the boundary conditions on S1 , S2 and S3 that the
eigenfunctions must satisfy in order that they can be used in solving for c. The second is to prove
that eigenfunctions are an orthogonal set of functions so that the coefficients in the expansion of c
LECTURE 16. DIFFUSION IN BOUNDED, THREE DIMENSIONAL DOMAINS 387
Thus we complete the statement of the eigenvalue problem and then prove orthogonality of the
eigenfunctions in the inner product
ZZZ
φ, ψ = φ ψ dV
V
ψi , ψi =1
By asking what we must do to determine the coefficients in the expansion of the solution to our
problem in the eigenfunctions of ∇2 , we will discover the conditions that the eigenfunctions must
satisfy on S1 , S2 and S3 and to do this we need two integration by parts formulas for functions
defined on V , then the argument is much as it was in Lecture 14.
In Brand’s book, “Vector and Tensor Analysis,” there are many very general integration theo-
rems. But all we need are the three dimensional forms of our earlier integration by parts formulas.
If φ and ψ are sufficiently smooth these are:
ZZZ ZZ ZZZ
2
φ∇ ψ dV = φ→
−
n · ∇ψ dA − ∇φ · ∇ψ dV
V S V
and
ZZZ ZZ ZZZ
2
φ∇ ψ dV = {φ→
−
n · ∇ψ − ψ −
→
n · ∇φ} dA + ψ∇2 φ dV
V S V
To begin we assume that the solution to the diffusion equation can be expanded in the eigen-
LECTURE 16. DIFFUSION IN BOUNDED, THREE DIMENSIONAL DOMAINS 388
functions of ∇2 and that the coefficients in the expansion are the Fourier coefficients of c (−
→
r , t).
Thus we write
∞
X
c (−
→
r , t) = ci (t) ψi (−
→
r)
i=1
To derive the equation satisfied by ci (t) we multiply the diffusion equation by ψ i and integrate
over V obtaining
ZZZ ZZZ ZZZ
∂c 2
ψ i dV = ψ i ∇ c dV + ψ i QdV.
∂t
V V V
Then, using Green’s second theorem to turn the first term on the right hand side into terms we can
evaluate, we have
ZZ
d →
hψi , ci = dA ψ i −
n · ∇c − c−
→
n · ∇ψ i + ∇2 ψi , c + hψi , Qi
dt
S
d 2
hψi , ci = −λi hψi , ci + hψi , Qi
dt
ZZ
→
+ dA ψ i − n · ∇c − c−→
n · ∇ψ i
S1
ZZ
→
+ dA ψ i −
n · ∇c − c−
→
n · ∇ψ i
S2
ZZ
→
+ dA ψ i −
n · ∇c − c−
→
n · ∇ψ i
S3
LECTURE 16. DIFFUSION IN BOUNDED, THREE DIMENSIONAL DOMAINS 389
Assuming that we have the eigenfunctions and the eigenvalues, this is an equation by which we can
determine ci (t) = hψi , ci, i.e., the coefficient of ψi in the solution to our problem. The first two
terms on the right hand side present no problem. But in the third term − →n ·∇c is not specified on S ,
1
in the fourth term c is not specified on S2 while in the fifth term neither →
−
n · ∇c nor c are specified
on S3 . The equation then is indeterminate and it is our choice of the boundary conditions satisfied
by ψi , which completes the definition of the eigenvalue problem, that removes this indeterminancy.
So to make this a determinate equation for hψi , ci, we put ψi = 0 on the part of the boundary where
c is specified but −
→
n · ∇c is not; while on the part of the boundary where − →
n · ∇c is specified but c
is not we put −
→
n · ∇ψi = 0; on the remaining part of the boundary where −
→
n · ∇c + βc is specified
we put −→
n · ∇ψ + βψ = 0. Then the differential equation
i i
d
hψi , ci = −λ2i hψi , ci + hψi , Qi
dt
ZZ
+ − g1 −→n · ∇ψ i dA
S1
ZZ
+ ψ i g2 dA
S2
ZZ
+ ψ i g3 dA
S3
X
c (−
→
r , t) = hψi , ci ψi (−
→
r ).
Each coefficient, hψi , ci, can be written as the sum of five terms, each one corresponding to one
of the sources: c (t = 0), Q, g1 , g2 and g3 . If , as an example, S1 is all of S then the sources are
LECTURE 16. DIFFUSION IN BOUNDED, THREE DIMENSIONAL DOMAINS 390
2
hψi , ci = hψi , c (t = 0)i e−λi t
Zt
2
+ e−λi (t−τ ) hψi , Q (t = τ )i dτ
0
Zt ZZ
− e−λ2i (t−τ ) →
g1 (t = τ ) −
n · ∇ψ i dAdτ
0 S
This, when multiplied by ψi and summed over i, is the solution to our problem as it depends on the
three sources of solute: c (t = 0) ,Q and g1 . Each term, in fact, produces a solution to the diffusion
equation corresponding to one of the sources when the other two vanish.
This method of solving for c requires that an eigenvalue problem be solved. Doing this deter-
mines a set of eigenfunctions and a way of doing this will be presented in Lecture 17. Then to pro-
duce a solution to the diffusion equation acting under a specified set of sources, each eigenfunction
must be multiplied by a coefficient hψi , ci and the product summed over the set of eigenfunctions.
Each coefficient is the solution of a linear, constant coefficient, first order differential equation,
independent of every other coefficient. Each coefficient depends in its own way on the sources of
the solute , i.e., on c (t = 0), Q, g1 , g2 and g3 , and its dependence on the sources is additive. The
coefficient hψi , ci depends on t and is a sum of terms each depending on t and each corresponding
to one of the sources of the field. This separation of the contributions of the sources carries over to
the solution itself and is one form of the principle of superposition satisfied by the solutions to the
diffusion equation. It is also satisfied by the solutions to all linear equations.
We now know what our method is and how the sources of the field make their contribution to
the solution. We also know what the eigenvalue problem is; it is the homogeneous problem
∇2 ψ = −λ2 ψ, −
→
r ∈V
ψ = 0, −
→
r ∈ S1
−
→
n · ∇ψ = 0, −
→
r ∈ S2
LECTURE 16. DIFFUSION IN BOUNDED, THREE DIMENSIONAL DOMAINS 391
−
→
n · ∇ψ + βψ = 0, −
→
r ∈ S1
The nonzero solutions ψ are called eigenfunctions while the corresponding values of λ2 are called
eigenvalues.
n
X
f= ci ψi
i=1
n
X
Sn = ci ψi
i=1
hence the error is f − Sn and the mean square error, MSE, viz.,
ZZZ
(f − Sn ) (f − Sn ) dV
V
is positive.
Now we have
(f − Sn ) (f − Sn ) = f f − S n f − f Sn + S n Sn
X X X X
=ff− ci ψ i f − ci ψi f + ci ψ i cj ψj
LECTURE 16. DIFFUSION IN BOUNDED, THREE DIMENSIONAL DOMAINS 392
and therefore
ZZZ X ZZZ X ZZZ X
2
MSE = | f | dV − ci ψ i f dV − ci ψi f dV + ci ci
V V V
2 2
ZZZ n
X ZZZ n
X ZZZ
= | f |2 dV + ci − ψ i f dV − ψ i f dV
V i=1 V i=1 V
due to
2
ZZZ ZZZ ZZZ
ci − ψ i f dV = ci ci − ci ψ i f dV − ci ψi f dV
V V V
ZZZ ZZZ
+ ψ i f dV ψi f dV
V V
Hence, we see that only the second term depends on the ci ’s and to make MSE least we should
set the ci ’s to
ZZZ
ci = ψ i f dV = ψi , f
V
n
X
ψi , f ψi
i=1
ZZZ n
X
| f | dV − | ci |2 > 0
V i=1
ZZZ ∞
X
2
| f | dV − | ci | 2 ≥ 0
V i=1
LECTURE 16. DIFFUSION IN BOUNDED, THREE DIMENSIONAL DOMAINS 393
∞
X ZZZ
2
| ci | = | f |2 dV
i=1 V
for all functions f of interest. Hence our series for f converges to f in the mean, i.e., the MSE
vanishes as n → ∞, and we ordinarily expect to have pointwise convergence almost everywhere
in V .
We can go on and learn something about the eigenfunctions and the eigenvalues of ∇2 by using
Green’s two theorems. If ψ and λ2 satisfy the eigenvalue problem then so also do ψ and λ2 and
hence on putting φ = ψ in Green’s second theorem we conclude that λ2 must be real. Then on
putting φ = ψ in Green’s first theorem we get
ZZZ ZZ ZZZ
−λ 2 2
|ψ| dV = ψ→
−
n · ∇ψdA − |∇ψ|2 dV
V S V
ZZ ZZZ
= − β |ψ|2 dA − |∇ψ|2 dV
S3 V
This is an equation telling us the sign of λ2 . First, if β > 0, then λ2 = 0 cannot be a solution
and we have λ2 > 0. But, if β = 0 and S2 includes S3 , then λ = 0 and ψ = constant might be
a solution. Indeed if S2 = S, and we have a Neumann problem, λ2 = 0, ψ = constant 6= 0 is
one solution to our eigenvalue problem. Otherwise, β > 0 or S1 = S, we must have λ2 > 0. We
go on and put φ = ψ i , ψ = ψj in Green’s second theorem, where ψi and ψj are solutions to the
eigenvalue problem corresponding to distinct eigenvalues, and learn that
hψi , ψj i = 0.
LECTURE 16. DIFFUSION IN BOUNDED, THREE DIMENSIONAL DOMAINS 394
This, and the observation that any independent set of eigenfunctions corresponding to the same
eigenvalue can be replaced by an orthogonal set of eigenfunctions, shows that the eigenvalue prob-
RRR
lem determines an orthogonal set of eigenfunctions in the inner product hφ, ψi = φψdV . In
V
fact restricting ∇2 to the class of functions on V satisfying homogeneous boundary conditions on
S = S1 + S2 + S3 we have hφ, ∇2 ψi = h∇2 φ, ψi and we say that ∇2 is self-adjoint on that class
of functions. In a way we have been very lucky. The boundary conditions of physical interest,
the plain vanilla inner product and the differential operator ∇2 have gotten together and given us
simple answers to the important questions. In another inner product or for other boundary con-
ditions or for another differential operator we would be required to determine an adjoint operator
and adjoint boundary conditions to work out our theory. This comes up again in Lecture 19.
In the next lecture we turn to the question: how do we solve the eigenvalue problem? And we
explain the method of separation of variables for doing this. The readers may satisfy themselves
that there are places in the foregoing where it is important that the coefficient β in the Robin
boundary condition be real and places where it is important that β be positive but nowhere is it
required that β be constant. While this is so, in solving the eigenvalue problem by the method
of separation of variables it will also be important that β be a constant, or at least be piecewise
constant and constant on each coordinate surface.
The more distant the heat source is from the boundary, the higher the temperature must rise
there to conduct it away. If the source is assigned in advance the heat generation can always
LECTURE 16. DIFFUSION IN BOUNDED, THREE DIMENSIONAL DOMAINS 395
be balanced by heat conduction. But if the source depends on the temperature, and increases in
strength as the temperature increases, then there is a positive feed back and this may create a critical
condition beyond which a steady solution cannot be found.
To see why this might be so we can study the problem as the region grows larger in size. Then
heat is generated at greater and greater distances from the boundary and the temperature required
to dissipate it must increase. The greatest temperature in the region then increases as the size of
the region increases and, as this goes on, the source grows stronger. Depending on how fast the
strength of the source increases as the temperature increases, there may be a critical size of the
region beyond which the heat generated therein cannot be conducted steadily to the surroundings.
This critical condition is called a runaway condition and it leads to a thermal explosion.
We suppose that the temperature is the only variable of interest and that the heat source is an
exponential function of the temperature. Then in scaled variables we have
$$\nabla^2u+\mu^2e^u=0,\qquad\vec r\in V$$
and
$$u=0,\qquad\vec r\in S$$
where the size of the region appears in the constant µ2 which indicates the strength of the source.
This model is introduced by D.A. Frank-Kamenetskii in his book “Diffusion and Heat Conduction
in Chemical Kinetics.”
In certain simple geometries this equation can be solved and the critical value of µ2 can be
determined. But that is not our aim here. What we do instead is use the eigenvalue problem for ∇2
in V to estimate the critical value of µ2 . To do this we write our problem:
$$\nabla^2u+\mu^2f(u)=0,\qquad\vec r\in V$$
and
$$u=0,\qquad\vec r\in S$$
where f(u) denotes the nonlinear source of heat and where u and f(u) have been scaled so that f(u = 0) = 1 and (df/du)(u = 0) = 1. Our interest is in how large µ² can be, consistent with a bounded solution u.
To go on we write
$$f(u)=u+g(u)$$
where g(u = 0) = 1 and (dg/du)(u = 0) = 0 and where we must have g(u) ≥ 0 for all u ≥ 0. Indeed if d²f/du² ≥ 0 for all u ≥ 0 then g(u) ≥ 1 for all u ≥ 0.
The solutions to our problem must be non negative and our job is to estimate the range of values
of the control variable µ2 where this obtains. To do this we assume that we have a non negative
solution to our problem
$$\nabla^2u+\mu^2u+\mu^2g(u)=0,\qquad\vec r\in V$$
and
$$u=0,\qquad\vec r\in S$$
for some value of µ², and then introduce for comparison the eigenvalue problem for ∇² in V, viz.,
$$\nabla^2\psi+\lambda^2\psi=0,\qquad\vec r\in V$$
and
$$\psi=0,\qquad\vec r\in S$$
whose least eigenvalue we denote λ₁² and whose corresponding eigenfunction, ψ₁, may be taken to be positive; the slowest decaying term in the solution of the corresponding diffusion problem is ⟨ψ₁, c(t = 0)⟩e^{−λ₁²t}ψ₁.
To determine a bound on µ² we observe that our second integration by parts formula tells us
$$\iiint_V\{\psi_1\nabla^2u-u\nabla^2\psi_1\}\,dV=\iint_S\vec n\cdot\{\psi_1\nabla u-u\nabla\psi_1\}\,dA$$
where the right hand side is zero by the boundary conditions on u and ψ1 . The left hand side is
$$\iiint_V\{-\mu^2\psi_1u-\mu^2\psi_1g(u)+\lambda_1^2u\psi_1\}\,dV$$
and, as this must vanish while ψ₁ and u are non negative and g(u) > 0, we conclude that
$$\mu^2<\lambda_1^2.$$
This tells us that the critical value of µ² lies to the left of λ₁².
This is interesting. It tells us that the critical value of µ², a control variable in a nonlinear problem, cannot exceed λ₁², the slowest diffusion eigenvalue, where λ₁² can be determined by solving a linear eigenvalue problem. If we replace e^u by 1 + u, viz., linear heating, we expect to find
$$\mu^2_{crit}=\lambda_1^2.$$
The heating problem now looks a lot like the eigenvalue problem.
16.4 Solvability
We may ask whether or not a problem presented to us is solvable. The question ordinarily comes up in a steady state problem such as
$$\nabla^2\phi+\mu^2\phi=Q\ \text{ in } V,\qquad\phi=0\ \text{ on } S$$
We are not asking whether or not the expansion of Q in the eigenfunctions of ∇² makes sense. In fact Q ordinarily does not satisfy the same conditions on S as do the eigenfunctions and hence its series expansion most likely converges in norm, not pointwise. What we ask is whether we can determine the coefficients in the expansion
$$\phi=\sum c_i\psi_i$$
where cᵢ = ⟨ψᵢ, φ⟩. To do this we use our second integration by parts formula to obtain
$$\langle\nabla^2\psi_i,\phi\rangle+\iint_S\{\psi_i\,\vec n\cdot\nabla\phi-\phi\,\vec n\cdot\nabla\psi_i\}\,dA+\mu^2\langle\psi_i,\phi\rangle=\langle\psi_i,Q\rangle$$
Thus we have
$$(-\lambda_i^2+\mu^2)\langle\psi_i,\phi\rangle=\langle\psi_i,Q\rangle$$
and we conclude that so long as µ² is not one of the eigenvalues of ∇² we can find a solution to our problem. If, however, µ² = λᵢ², then Q must be perpendicular to every independent eigenfunction corresponding to λᵢ². This is the solvability condition.
16.5 ∇⁴
The linear differential operator ∇⁴ = ∇²∇² appears in problems in the slow flow of viscous fluids and in the deformation of elastic solids. Its eigenvalue problem is
$$\nabla^4\psi=\lambda^4\psi,\qquad\vec r\in V$$
$$\psi=0$$
$$\vec n\cdot\nabla\psi=0$$
$$\nabla^2\psi=0$$
on each part of S. (Here, as earlier, the boundary conditions on the problem to be solved will
determine the boundary conditions on ψ. The conditions listed are not all that are physically
interesting and to these can be added their linear combinations.)
The integral formulas we need here can be obtained from Green's second theorem. First we put ∇²ψ in place of ψ to get
$$\iiint_V\phi\nabla^4\psi\,dV=\iint_S\{\phi\,\vec n\cdot\nabla\nabla^2\psi-\nabla^2\psi\,\vec n\cdot\nabla\phi\}\,dA+\iiint_V\nabla^2\psi\,\nabla^2\phi\,dV.$$
And then we put ∇²φ in place of φ, and use the result to rewrite the second term on the right hand side, to get
$$\iiint_V\phi\nabla^4\psi\,dV=\iint_S\{\phi\,\vec n\cdot\nabla\nabla^2\psi-\nabla^2\psi\,\vec n\cdot\nabla\phi\}\,dA+\iint_S\{\nabla^2\phi\,\vec n\cdot\nabla\psi-\psi\,\vec n\cdot\nabla\nabla^2\phi\}\,dA+\iiint_V\psi\nabla^4\phi\,dV$$
These two formulas can be used in solving equations in ∇⁴ in just the same way that Green's first and second theorems can be used in solving the diffusion equation or any other equation in ∇². They are especially useful in exhibiting the way in which the sources of the field make their contribution to the field itself.
To get information about the eigenvalues and eigenfunctions of ∇⁴ we first observe that if λ⁴ and ψ satisfy the eigenvalue problem then so also do λ̄⁴ and ψ̄. Hence putting ψ̄ in place of φ in the second formula we discover that λ⁴ must be real. Likewise putting ψ̄ in place of φ in the first formula we get
$$\lambda^4\iiint_V|\psi|^2\,dV=\iiint_V|\nabla^2\psi|^2\,dV$$
and conclude that λ⁴ is not negative. In both calculations the integrals over S vanish due to the
conditions that we assume the eigenfunctions satisfy on S. To establish orthogonality we can go on and put φ = ψ̄ᵢ, ψ = ψⱼ in the second formula, where ψᵢ and ψⱼ are solutions to the eigenvalue problem corresponding to different eigenvalues, and obtain
$$\langle\psi_i,\psi_j\rangle=\iiint_V\bar\psi_i\psi_j\,dV=0.$$
On this class of functions, then, ∇⁴ is self-adjoint:
$$\langle\phi,\nabla^4\psi\rangle=\langle\nabla^4\phi,\psi\rangle.$$
The set of eigenfunctions determined by the eigenvalue problem for ∇⁴ can be used in writing the solution to equations such as (this equation is not entirely made up, at least not by me; it is in the book "Fractal Concepts in Surface Growth" by Barabasi and Stanley)
$$\frac{\partial c}{\partial t}=-\nabla^4c+Q,\qquad\vec r\in V$$
Expanding c in the eigenfunctions ψᵢ and using the second formula, we obtain
$$\frac{d}{dt}\langle\psi_i,c\rangle=-\lambda_i^4\langle\psi_i,c\rangle+\langle\psi_i,Q\rangle-\iint_S\{\psi_i\,\vec n\cdot\nabla\nabla^2c-\nabla^2c\,\vec n\cdot\nabla\psi_i\}\,dA-\iint_S\{\nabla^2\psi_i\,\vec n\cdot\nabla c-c\,\vec n\cdot\nabla\nabla^2\psi_i\}\,dA$$
where the first surface integral vanishes as we require ψᵢ = 0 and n⃗·∇ψᵢ = 0 on S, and the terms in the second can be determined from the assigned values of c and n⃗·∇c on S.
The eigenvalue problem
$$\nabla^4\psi=\lambda^4\psi$$
is not easy to solve. There are home problems in Lecture 17 which illustrate the difficulty.
We can also consider the vector eigenvalue problem
$$\nabla^2\vec\psi=-\lambda^2\vec\psi,\qquad\vec r\in V$$
where ψ⃗ satisfies homogeneous conditions on the boundary of V.
To derive some facts about the solutions to this problem, we first use
$$\nabla\cdot(\mathbf T\cdot\vec v)=(\nabla\cdot\mathbf T)\cdot\vec v+\mathbf T:(\nabla\vec v)^T$$
to get
$$\nabla\cdot(\nabla\vec\psi\cdot\vec\phi)=\nabla^2\vec\psi\cdot\vec\phi+\nabla\vec\psi:(\nabla\vec\phi)^T$$
and
$$\nabla\cdot(\nabla\vec\phi\cdot\vec\psi)=\nabla^2\vec\phi\cdot\vec\psi+\nabla\vec\phi:(\nabla\vec\psi)^T$$
where the last terms on the right hand sides are equal, as tr(AB) = tr(BᵀAᵀ); we then use
$$\iiint_V\nabla\cdot\vec v\,dV=\iint_S\vec n\cdot\vec v\,dA$$
and obtain
$$\iiint_V\nabla^2\vec\psi\cdot\vec\phi\,dV=\iint_S\vec n\cdot\{\nabla\vec\psi\cdot\vec\phi-\nabla\vec\phi\cdot\vec\psi\}\,dA+\iiint_V\nabla^2\vec\phi\cdot\vec\psi\,dV$$
To go on we require that
$$\iint_S\vec n\cdot\{\nabla\vec\psi\cdot\vec\phi-\nabla\vec\phi\cdot\vec\psi\}\,dA=0$$
whenever ψ⃗ and φ⃗ satisfy the homogeneous form of the boundary conditions assigned to v⃗; then the second formula reduces to
$$\iiint_V\nabla^2\vec\psi\cdot\vec\phi\,dV=\iiint_V\nabla^2\vec\phi\cdot\vec\psi\,dV$$
If ψ⃗ is a solution of the eigenvalue problem corresponding to λ², so also is ψ̄⃗ corresponding to λ̄². Then putting φ⃗ = ψ̄⃗ in the second formula shows that λ² must be real, and hence that the real and imaginary parts of ψ⃗ must also be eigenfunctions corresponding to λ². And putting φ⃗ = ψ̄⃗ in the first formula shows that
$$-\lambda^2\iiint_V\vec\psi\cdot\overline{\vec\psi}\,dV=\iint_S\vec n\cdot\{\nabla\vec\psi\cdot\overline{\vec\psi}\}\,dA-\iiint_V\nabla\vec\psi:(\nabla\overline{\vec\psi})^T\,dV$$
where the second integral on the right hand side is not negative, and both integrals are zero only if there is an eigenfunction such that ∇ψ⃗ = 0. If the boundary conditions are such that the first term vanishes, this formula shows that
$$\lambda^2\ge0;$$
otherwise the boundary conditions must be such that the sign of the right hand side is set by the second term if we are to have λ² ≥ 0.
Putting φ⃗ = ψ̄⃗ᵢ, ψ⃗ = ψ⃗ⱼ in the second formula then shows that
$$\langle\vec\psi_i,\vec\psi_j\rangle=0$$
whenever ψ⃗ᵢ and ψ⃗ⱼ are eigenfunctions corresponding to different eigenvalues, where
$$\langle\vec\psi_i,\vec\psi_j\rangle=\iiint_V\overline{\vec\psi}_i\cdot\vec\psi_j\,dV.$$
In each of these three problems there is an eigenvalue, denoted σ. You are to see if you can prove
it is real.
1. You have an incompressible fluid at rest whose density varies upward: ρ = ρ0 (z). The fluid
is inviscid and you are to find out if the rest state is stable to small perturbations.
Upon perturbation, the base density is carried by the perturbation flow and you have
$$\rho_0\frac{\partial\vec v_1}{\partial t}=-\nabla p_1-\rho_1g\vec k$$
$$\nabla\cdot\vec v_1=0$$
and
$$\frac{\partial\rho_1}{\partial t}+\vec v_1\cdot\nabla\rho_0=0$$
Assume a solution vz1 = v̂z1(z)e^{σt}e^{i(kₓx+k_yy)}, etc., and eliminate v̂x1 and v̂y1 in favor of v̂z1 and p̂₁. Then eliminate p̂₁, obtaining
$$\frac{d}{dz}\Big(\rho_0\frac{d\hat v_{z1}}{dz}\Big)-k^2\rho_0\hat v_{z1}=-\frac{k^2}{\sigma^2}g\frac{d\rho_0}{dz}\hat v_{z1}$$
2. Account for viscosity. The base state is
$$\rho=\rho_0(z),\qquad\vec v_0=\vec 0$$
and
$$\frac{dp_0}{dz}=-\rho_0g$$
The perturbation equations are
$$\rho_0\frac{\partial v_{x1}}{\partial t}=-\frac{\partial p_1}{\partial x}+\mu\nabla^2v_{x1}$$
$$\rho_0\frac{\partial v_{y1}}{\partial t}=-\frac{\partial p_1}{\partial y}+\mu\nabla^2v_{y1}$$
$$\rho_0\frac{\partial v_{z1}}{\partial t}=-\frac{\partial p_1}{\partial z}+\mu\nabla^2v_{z1}-\rho_1g$$
$$\frac{\partial v_{x1}}{\partial x}+\frac{\partial v_{y1}}{\partial y}+\frac{\partial v_{z1}}{\partial z}=0$$
and
$$\frac{\partial\rho_1}{\partial t}+v_{z1}\frac{d\rho_0}{dz}=0$$
Assume a solution vz1 = v̂z1(z)e^{σt}e^{i(kₓx+k_yy)}, etc., whereupon
$$\sigma\hat\rho_1+\hat v_{z1}\frac{d\rho_0}{dz}=0$$
Eliminate ρ̂₁ and p̂₁ to derive an equation for v̂z1, where v̂z1 = 0 = dv̂z1/dz at z = 0, H.
dz
You now have an eigenvalue problem, where σ is the eigenvalue. Can you say anything
about σ without a calculation?
3. You can account for viscosity more easily by assuming your fluid saturates a porous solid.
Then you can use Darcy’s law and you have
$$\mu\vec v=K\big(-\nabla p-\rho g\vec k\big)$$
$$\nabla\cdot\vec v=0$$
and
$$\frac{\partial\rho}{\partial t}+\vec v\cdot\nabla\rho=0$$
Assume you have a two dimensional problem whereupon your perturbation equations
are
$$\mu v_{x1}=-K\frac{\partial p_1}{\partial x}$$
$$\mu v_{z1}=-K\frac{\partial p_1}{\partial z}-K\rho_1g$$
$$\frac{\partial v_{x1}}{\partial x}+\frac{\partial v_{z1}}{\partial z}=0$$
and
$$\frac{\partial\rho_1}{\partial t}+v_{z1}\frac{d\rho_0}{dz}=0$$
Writing p₁ = p̂₁(z) cos kx e^{σt}, vx1 = v̂x1(z) sin kx e^{σt}, vz1 = v̂z1(z) cos kx e^{σt} and ρ₁ = ρ̂₁(z) cos kx e^{σt}, you have
$$\mu\hat v_{x1}=Kk\,\hat p_1$$
$$\mu\hat v_{z1}=-K\frac{d\hat p_1}{dz}-K\hat\rho_1g$$
$$k\hat v_{x1}+\frac{d\hat v_{z1}}{dz}=0$$
$$\sigma\hat\rho_1+\hat v_{z1}\frac{d\rho_0}{dz}=0$$
whence
$$\frac{d^2\hat v_{z1}}{dz^2}-k^2\hat v_{z1}=-\frac{k^2}{\sigma}\frac{gK}{\mu}\frac{d\rho_0}{dz}\hat v_{z1}$$
Separation of Variables
In Lecture 16 we found that the eigenvalue problem that must be solved in order to solve the
diffusion equation is:
$$\nabla^2\psi+\lambda^2\psi=0,\qquad\vec r\in V$$
$$\psi=0,\qquad\vec r\in S_1$$
$$\vec n\cdot\nabla\psi=0,\qquad\vec r\in S_2$$
$$\vec n\cdot\nabla\psi+\beta\psi=0,\qquad\vec r\in S_3$$
and we learned that the eigenvalues are real and that the eigenfunctions corresponding to different eigenvalues are orthogonal. We can add the term V(r⃗)ψ to the left hand side of ∇²ψ + λ²ψ = 0, where V(r⃗) is real valued, and not change the conclusion that the eigenvalues are real and that the eigenfunctions are orthogonal in the plain vanilla inner product. We now turn to the method of solving this eigenvalue problem and present the details in three coordinate systems. The job begins here and is finished in Lecture 20.
The method we use to do this is called separation of variables. To see how it goes we suppose
that we have an orthogonal coordinate system which is such that the bounding surface of the region
V coincides piecewise with a finite number of coordinate surfaces. Then the first question to ask
is this: in what form can we express the solutions to our eigenvalue problem?
If it works out, the method of separation of variables answers this question. The idea is to
reduce a three dimensional problem to three one dimensional problems. In certain orthogonal
coordinate systems this can be done. It is done by assuming that ψ can be written as the product
of three functions, each depending on only one of the three coordinates, substituting this into
∇2 ψ + λ2 ψ = 0, dividing by ψ and then determining which parts of the result must be constants.
We begin by showing how this works out in the Cartesian, cylindrical and spherical coordinate
systems.
Substituting ψ = X(x)Y(y)Z(z) into
$$(\nabla^2+\lambda^2)\psi=\Big(\frac{\partial^2}{\partial x^2}+\frac{\partial^2}{\partial y^2}+\frac{\partial^2}{\partial z^2}+\lambda^2\Big)\psi=0$$
and dividing by ψ = XYZ, we get
$$\frac{1}{X}\frac{d^2X}{dx^2}+\frac{1}{Y}\frac{d^2Y}{dy^2}+\frac{1}{Z}\frac{d^2Z}{dz^2}+\lambda^2=0$$
The first term depends only on x, the second only on y and the third only on z. Because these terms are independent of one another we conclude that each term must be equal to a constant. Denoting these undetermined constants by −α², −β² and −γ², we have replaced (∇² + λ²)ψ = 0 by the three equations
$$\frac{d^2X}{dx^2}+\alpha^2X=0\qquad(1)$$
$$\frac{d^2Y}{dy^2}+\beta^2Y=0\qquad(2)$$
and
$$\frac{d^2Z}{dz^2}+\gamma^2Z=0\qquad(3)$$
In cylindrical coordinates we substitute ψ = R(r)Θ(θ)Z(z) into
$$(\nabla^2+\lambda^2)\psi=\Big(\frac{1}{r}\frac{\partial}{\partial r}\Big(r\frac{\partial}{\partial r}\Big)+\frac{1}{r^2}\frac{\partial^2}{\partial\theta^2}+\frac{\partial^2}{\partial z^2}+\lambda^2\Big)\psi=0$$
and divide by ψ. We conclude first that (1/Z)(d²Z/dz²) must be a constant and then that (1/Θ)(d²Θ/dθ²) must be a constant. Denoting these constants by −γ² and −m², we have replaced (∇² + λ²)ψ = 0 in cylindrical coordinates by the three equations
$$\frac{d^2Z}{dz^2}+\gamma^2Z=0\qquad(4)$$
$$\frac{d^2\Theta}{d\theta^2}+m^2\Theta=0\qquad(5)$$
and
$$\frac{1}{r}\frac{d}{dr}\Big(r\frac{dR}{dr}\Big)+\Big(\lambda^2-\gamma^2-\frac{m^2}{r^2}\Big)R=0\qquad(6)$$
In spherical coordinates we substitute ψ = R(r)Θ(θ)Φ(φ) into
$$(\nabla^2+\lambda^2)\psi=\Big(\frac{1}{r^2}\frac{\partial}{\partial r}\Big(r^2\frac{\partial}{\partial r}\Big)+\frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\Big(\sin\theta\frac{\partial}{\partial\theta}\Big)+\frac{1}{r^2\sin^2\theta}\frac{\partial^2}{\partial\phi^2}+\lambda^2\Big)\psi=0$$
and divide by ψ. Now we see that (1/Φ)(d²Φ/dφ²) must be a constant. Calling this −m², we then see that (1/Θ)(1/sin θ)(d/dθ)(sin θ dΘ/dθ) − m²/sin²θ must be a constant. Calling this constant −ℓ(ℓ + 1), we have replaced (∇² + λ²)ψ = 0 in spherical coordinates by
$$\frac{d^2\Phi}{d\phi^2}+m^2\Phi=0\qquad(7)$$
$$\frac{1}{\sin\theta}\frac{d}{d\theta}\Big(\sin\theta\frac{d\Theta}{d\theta}\Big)+\Big(\ell(\ell+1)-\frac{m^2}{\sin^2\theta}\Big)\Theta=0\qquad(8)$$
and
$$\frac{1}{r^2}\frac{d}{dr}\Big(r^2\frac{dR}{dr}\Big)+\Big(\lambda^2-\frac{\ell(\ell+1)}{r^2}\Big)R=0\qquad(9)$$
For Cartesian, cylindrical and spherical coordinate systems we have now reduced the problem of solving ∇²ψ + λ²ψ = 0 to the problem of solving nine second order, linear, homogeneous ordinary differential equations, three for each coordinate system.
Indeed each of these problems is a one dimensional eigenvalue problem in its own right, and some of them are one dimensional forms of the eigenvalue problem for ∇², while all of them are one dimensional forms of the eigenvalue problem for ∇² + V(r⃗).
In Lecture 19 we will describe in a little more detail the elementary facts about linear ordinary
differential equations but for now we assume only that each of these equations has two independent
solutions, observing that these two solutions can be written in many ways. Taking equation (1) as
an example we can write its general solution as
$$A\cos\alpha x+B\sin\alpha x$$
or
$$Ae^{i\alpha x}+Be^{-i\alpha x}$$
etc.
These familiar functions satisfy our needs in Eqs. (1), (2), (3), (4), (5) and (7). Equations
(6), (8) and (9) have solutions that are less familiar. For instance we denote by Jm and Ym two
independent solutions of Eq. (6) which is called Bessel’s equation. And while these functions may
not be as familiar as cosine and sine, Watson's book "Theory of Bessel Functions" has nearly 1,000 pages of information on these and related functions, and so Jm and Ym are very familiar to some people. The same is true of the solutions of Eqs. (8) and (9).
While the solutions of each of the nine equations are denoted by special symbols, in every
instance the symbols stand for power series, either infinite or finite, or power series multiplied
by familiar functions. The power series solutions are determined by what is called the method of
Frobenius and we will show how this works by using Eq. (8) as an example in Lecture 20.
For now we observe only that J₀(z) is the name assigned to the series
$$J_0(z)=\sum_{m=0}^{\infty}\frac{(-1)^m\big(\tfrac{1}{2}z\big)^{2m}}{(m!)^2}$$
which satisfies
$$\Big(\frac{1}{z}\frac{d}{dz}\Big(z\frac{d}{dz}\Big)+1\Big)\psi=0,\qquad\psi(z=0)=1,\qquad\psi'(z=0)=0$$
just as cos z is the name assigned to the series
$$\sum_{m=0}^{\infty}(-1)^m\frac{z^{2m}}{(2m)!}=1-\frac{1}{2!}z^2+\frac{1}{4!}z^4-\cdots$$
which satisfies
$$\Big(\frac{d^2}{dz^2}+1\Big)\psi=0,\qquad\psi(z=0)=1,\qquad\psi'(z=0)=0$$
It is worth observing that when a function is defined by a power series it is, in fact, defined by the
sequence of coefficients in the series which then must encode all of the properties of the function.
To illustrate this idea we show in §17.4 how the zeros of cos z and J0 (z) can be determined using
the coefficients in their power series.
It is also worth observing that new technical difficulties come up as we move away from Cartesian coordinates. Eqs. (1), (2) and (3) are independent of one another. But in Eqs. (4), (5) and (6), m² must be determined in Eq. (5) before Eq. (6) can be solved, while in Eqs. (7), (8) and (9), m² must be determined in Eq. (7) before Eq. (8) can be solved and ℓ(ℓ + 1) must be determined in Eq. (8) before Eq. (9) can be solved. Equations (5) and (6) correspond to spherical coordinates in two dimensions, Eqs. (7), (8) and (9) correspond to spherical coordinates in three dimensions, and in §17.5 we observe that the pattern we see here obtains in spherical coordinates in four dimensions. Also in §17.5 we carry out separation of variables in elliptic cylinder coordinates, where again something new happens that we have not seen heretofore. There are a dozen or so orthogonal coordinate systems where we can separate (∇² + λ²)ψ = 0, and information on these can be found in Moon and Spencer's book "Field Theory Handbook," and in Morse and Feshbach's book "Methods of Theoretical Physics," as well as in many other books. Indeed a lot of information on orthogonal coordinate systems and on ∇² can be found in Pauling and Wilson's book "Quantum Mechanics" and Happel and Brenner's book "Low Reynolds Number Hydrodynamics." The titles of these books suggest the wide range of application of the method of separation of variables.
We turn now to an explanation of how we use Eqs. (1), ..., (9). The boundary conditions in a
specific problem may be any of a large number of possibilities, yet we can illustrate the essential
ideas by taking up a small number of concrete examples. We begin by observing that Eqs. (1), (2),
(3), (4), (5) and (7) are identical and that we can write the general solution to each in terms of a
linear combination of the functions cosine and sine. The boundary conditions for Eqs. (1), (2), (3)
and (4) are ordinarily imposed on two coordinate surfaces and this may also be true for Eqs. (5)
and (7), though often periodic conditions are imposed. To see how the boundary conditions select
the solutions we use we take Eq. (1) as an example and work out the Dirichlet problem where c
is specified on the boundary of the region. Many more examples appear in Lecture 14. Then the
solutions to Eq. (1) must satisfy
X (x = a) = 0
and
X (x = b) = 0
Writing X = A cos αx + B sin αx, we must find A and B such that
$$0=A\cos\alpha a+B\sin\alpha a$$
and
$$0=A\cos\alpha b+B\sin\alpha b$$
or
$$\begin{pmatrix}\cos\alpha a&\sin\alpha a\\\cos\alpha b&\sin\alpha b\end{pmatrix}\begin{pmatrix}A\\B\end{pmatrix}=\begin{pmatrix}0\\0\end{pmatrix}$$
are satisfied. Now for arbitrary values of α the only solution to this homogeneous equation is
A = 0 = B. To get solutions other than A = 0 = B we must find the special values of α that make the determinant of the matrix on the left hand side vanish. To each such value of α there are solutions such that A ≠ 0 or B ≠ 0 or both, but there is only one independent solution, as the rank of
$$\begin{pmatrix}\cos\alpha a&\sin\alpha a\\\cos\alpha b&\sin\alpha b\end{pmatrix}$$
is one.
But not all values of α that make the determinant vanish produce new solutions. If α makes the determinant vanish so also does −α, but α and −α determine the same eigenvalue α² and dependent eigenfunctions.
Assuming we solve Eqs. (2) and (3) in a similar way we can determine the eigenvalues and
eigenfunctions of a problem in Cartesian coordinates as
$$\lambda^2=\alpha^2+\beta^2+\gamma^2$$
and
$$\psi=XYZ$$
where the arbitrary multiples in X, Y and Z will be determined so that ⟨ψ, ψ⟩ = 1. And it is worth
observing that only three sets of orthogonal functions are needed to solve problems in Cartesian
coordinates. In this way Cartesian coordinates are special.
While the foregoing shows how we deal with Eqs. (5) and (7) when surfaces θ = constant or
φ = constant separate the system from its surroundings, it often happens that the boundary of a
region can be completely specified in terms that are independent of θ in Eq. (5) or φ in Eq. (7).
Then we do not have surfaces on which we can specify physical boundary conditions and in place
of this we must require the solution to our problem to be periodic in θ or φ of period 2π. This
requirement is passed on to the eigenfunctions and then to their θ or φ dependent parts and so the
boundary conditions for Eqs. (5) and (7) can be taken to be
Θ (θ + 2π) = Θ (θ)
or
Φ (φ + 2π) = Φ (φ)
These ideas carry over to Eqs. (6), (8) and (9), but we have not yet explained how to write the solutions to these equations. Before we do this we take up a simple concrete example which shows how the various parts of the solution fit together.
We solve the diffusion equation ∂c/∂t = ∇²c on the annulus a ≤ r ≤ b in plane polar coordinates, where
$$\frac{\partial c}{\partial r}(r=a)=0=c(r=b)$$
and c is assigned at t = 0. The solution is
$$c(r,\theta,t)=\sum\langle\psi_i,c(t=0)\rangle\,e^{-\lambda_i^2t}\,\psi_i(r,\theta)$$
where the ψᵢ satisfy
$$\nabla^2\psi+\lambda^2\psi=0$$
and
$$\frac{\partial\psi}{\partial r}(r=a)=0=\psi(r=b),\qquad 0\le\theta<2\pi$$
The expectation might be that we will have to piece together two sets of orthogonal functions in order to build up the eigenfunctions we use to solve our problem. That would be true in Cartesian coordinates, where the separated eigenvalue problems are not coupled, but it is not ordinarily true, and it is not true here.
To solve the eigenvalue problem we put ψ = R (r) Θ (θ) and conclude that R and Θ satisfy
$$\frac{d^2\Theta}{d\theta^2}+m^2\Theta=0$$
and
$$\frac{d^2R}{dr^2}+\frac{1}{r}\frac{dR}{dr}-\frac{m^2}{r^2}R+\lambda^2R=0$$
where
$$\frac{dR}{dr}(r=a)=0=R(r=b)$$
We replace the periodicity requirement
$$\Theta(\theta)=\Theta(\theta+2\pi)$$
by
$$\Theta(0)=\Theta(2\pi)\qquad\text{and}\qquad\Theta'(0)=\Theta'(2\pi)$$
Now writing Θ as
$$\Theta=A\cos m\theta+B\sin m\theta$$
and imposing these two conditions, we have
$$\begin{pmatrix}\cos2\pi m-1&\sin2\pi m\\-m\sin2\pi m&m(\cos2\pi m-1)\end{pmatrix}\begin{pmatrix}A\\B\end{pmatrix}=\begin{pmatrix}0\\0\end{pmatrix}$$
And we see that only for special values of m does this system of homogeneous, linear, algebraic equations have solutions other than A = 0 = B. These special values of m are those that make the determinant of the matrix on the left hand side vanish, and as this determinant is
$$m\,(2-2\cos2\pi m)$$
they are
$$m=0,\pm1,\pm2,\cdots$$
Now the simplest way to denote this set of eigenfunctions is to write the independent solutions corresponding to m² = 0, 1, 4, … as e^{imθ} and e^{−imθ}, m = 0, 1, 2, … Doing this we exhibit the eigenfunctions satisfying periodic conditions as
$$\Theta_m=\frac{1}{\sqrt{2\pi}}\,e^{im\theta}\qquad m=\ldots,-2,-1,0,1,2,\ldots$$
In this way we deal with Eq. (5) in cylindrical coordinates and Eq. (7) in spherical coordinates. Now, having established the values of m², viz., 0, 1, 4, …, we turn to the R equation and notice that we get a different R equation for each value of m² and that λ² appears only in the R equations. For each value of m² we denote the two independent solutions of the R equation by Jm(λr) and Ym(λr), m = 0, 1, 2, …, where Jm and Ym denote independent solutions of Bessel's equation for nonnegative integer values of m.
Now there are many R equations, one corresponding to each fixed value of m²; hence we write R = A Jm(λr) + B Ym(λr), whereupon the boundary conditions require
$$\lambda A\,J_m'(\lambda a)+\lambda B\,Y_m'(\lambda a)=0$$
and
$$A\,J_m(\lambda b)+B\,Y_m(\lambda b)=0$$
where J′m denotes dJm(x)/dx. Thus we have
$$\begin{pmatrix}\lambda J_m'(\lambda a)&\lambda Y_m'(\lambda a)\\J_m(\lambda b)&Y_m(\lambda b)\end{pmatrix}\begin{pmatrix}A\\B\end{pmatrix}=\begin{pmatrix}0\\0\end{pmatrix}$$
and to each fixed value of m² the values of λ² can be determined by finding the values of λ that make the determinant of the matrix on the left hand side vanish. Indeed only for values of λ such that
$$\lambda J_m'(\lambda a)\,Y_m(\lambda b)-\lambda Y_m'(\lambda a)\,J_m(\lambda b)=0$$
can constants other than A = 0 = B, and, therefore, solutions other than R = 0, be determined. And we see that this equation must be solved for m = 0, 1, 2, ….
To go on, we put a = 0. By doing this we turn up a technical difficulty. The boundary of the region is now the circle r = b. And as this is the only surface on which we can assign a physical boundary condition, we no longer have the two boundary conditions required to evaluate the two constants in the solution of the R equation. What gets us out of this is the discovery that Ym(λr) is not bounded as r → 0 and hence, upon requiring c to be bounded, and therefore ψ to be bounded, we must require R to be bounded, and to achieve this we put B = 0. So if a = 0 and b = 1 we write
$$R=A\,J_m(\lambda r)$$
and the values of λ must satisfy
$$J_m(\lambda)=0$$
If λ is a root of this equation then so also is −λ but λ and −λ lead to the same eigenvalue and to de-
pendent eigenfunctions as Jm (−z) = ±Jm (z). If zero is a root of this equation the corresponding
eigenfunction is zero everywhere.
We let λ_{|m|1}, λ_{|m|2}, … denote the positive roots of J_{|m|}(λ) = 0 and then organize the solutions to the eigenvalue problem in terms of m by assigning to each value of m, i.e., to …, −2, −1, 0, 1, 2, …, the eigenvalues
$$\lambda^2_{|m|1},\ \lambda^2_{|m|2},\ \cdots$$
and the eigenfunctions
$$J_{|m|}(\lambda_{|m|1}r)\,\frac{e^{im\theta}}{\sqrt{2\pi}},\ \ J_{|m|}(\lambda_{|m|2}r)\,\frac{e^{im\theta}}{\sqrt{2\pi}},\ \cdots$$
In this way, of the two eigenfunctions corresponding to the eigenvalue λ²_{|m|i}, one is assigned to |m|, the other to −|m|. In problem 1 you will derive the factor normalizing the Bessel functions. Letting ψ_{mi}(r, θ) denote the normalized eigenfunction R_{|m|i}(r)Θm(θ), where
R_{|m|i}(r) ∝ J_{|m|}(λ_{|m|i}r), we can write the solution to our problem as
$$c(r,\theta,t)=\sum_{m=-\infty}^{+\infty}\sum_{i=1}^{\infty}\langle\psi_{mi},c(t=0)\rangle\,e^{-\lambda^2_{|m|i}t}\,\psi_{mi}(r,\theta)$$
where
$$\langle\psi_{mi},c(t=0)\rangle=\int_0^1\!\!\int_0^{2\pi}\overline{R_{|m|i}(r)\,\Theta_m(\theta)}\,c(t=0)\,r\,dr\,d\theta=\Big\langle R_{|m|i},\ \langle\Theta_m,c(t=0)\rangle_\theta\Big\rangle_r$$
Indeed
$$\big\langle R_{|m|i}\Theta_m,\ R_{|n|j}\Theta_n\big\rangle=\big\langle R_{|m|i},R_{|n|j}\big\rangle_r\,\big\langle\Theta_m,\Theta_n\big\rangle_\theta$$
and this is zero if m ≠ n, whereas it is ⟨R_{|m|i}, R_{|m|j}⟩_r if m = n, where
$$\big\langle R_{|m|i},R_{|m|j}\big\rangle_r=\begin{cases}0,&i\ne j\\1,&i=j\end{cases}$$
So, corresponding to each value of m2 we have a complete orthogonal set of functions of r and
indeed a different set for each value of m2 . The completeness we assume; the orthogonality we
infer from Lecture 15 or establish directly using the integration by parts formulas:
$$\int_0^1\phi\,\frac{1}{r}\frac{d}{dr}\Big(r\frac{d\psi}{dr}\Big)\,r\,dr=\Big[\phi\,r\frac{d\psi}{dr}\Big]_0^1-\int_0^1r\frac{d\phi}{dr}\frac{d\psi}{dr}\,dr$$
and
$$\int_0^1\phi\,\frac{1}{r}\frac{d}{dr}\Big(r\frac{d\psi}{dr}\Big)\,r\,dr=\Big[\phi\,r\frac{d\psi}{dr}-r\frac{d\phi}{dr}\psi\Big]_0^1+\int_0^1\frac{1}{r}\frac{d}{dr}\Big(r\frac{d\phi}{dr}\Big)\psi\,r\,dr$$
If R and λ² satisfy
$$\frac{1}{r}\frac{d}{dr}\Big(r\frac{dR}{dr}\Big)-\frac{m^2}{r^2}R+\lambda^2R=0$$
and
$$R(r=1)=0$$
where R is required to be bounded as r → 0 and where m² is fixed, then so also do R̄ and λ̄². On setting φ = R̄ and ψ = R in the first and second formulas we find that λ² is real and positive. On setting φ = R_{|m|i} and ψ = R_{|m|j} in the second formula, where R_{|m|i} and R_{|m|j} are two solutions corresponding to different values of λ², we find that
$$\int_0^1R_{|m|i}R_{|m|j}\,r\,dr=\big\langle R_{|m|i},R_{|m|j}\big\rangle_r=0$$
What is going on here is this: the index m sorts out the θ variation of c(t = 0) and then the index i sorts out the corresponding r variation. Indeed we first expand an assigned initial solute distribution c(t = 0) in the set of functions {Θm(θ)}_{m=−∞}^{+∞} as
$$c(t=0)=\sum_{m=-\infty}^{+\infty}\big\langle\Theta_m,c(t=0)\big\rangle_\theta\,\Theta_m(\theta)$$
The complex and real Fourier series are two forms of the same expansion. If Σ_{m=−∞}^{+∞} c_m e^{imφ} = Σ_{m=0}^{∞} (a_m cos mφ + b_m sin mφ), then a_m = c_m + c_{−m} and b_m = i c_m − i c_{−m}.
This resolves c(t = 0) into its various angular pieces, where each of the resulting coefficients ⟨Θm, c(t = 0)⟩_θ is a function of r. This function, defining the part of the r dependence of c(t = 0) that corresponds to Θm(θ), is then expanded in the set of functions {R_{|m|i}(r)}_{i=1}^{∞}, and this set is special to each value of |m|.
When c(t = 0) is real the solution can be written
$$c(r,\theta,t)=\sum_{m=0}^{\infty}\sum_{i=1}^{\infty}2\,\mathrm{Re}\Big\{\big\langle R_{mi},\langle\Theta_m,c(t=0)\rangle_\theta\big\rangle_r\,R_{mi}(r)\,\Theta_m(\theta)\Big\}\,e^{-\lambda^2_{|m|i}t}$$
and when c(t = 0) is independent of θ only the m = 0 terms survive and
$$c(r,t)=\sum_{i=1}^{\infty}\big\langle R_{0i},c(t=0)\big\rangle_r\,R_{0i}(r)\,e^{-\lambda^2_{0i}t}$$
[Sketch: J₀(z), J₁(z) and J₂(z), with the eigenvalues λ₁², λ₂², λ₃² marked.] The sketches indicate that the positive zeros of J₀, J₁, J₂, … are ordered in the following way: the lowest is the first zero of J₀, the next lowest is the first positive zero of J₁, then the first positive zero of J₂, before the second zero of J₀.
Hence as t grows large the last remaining term in our solution corresponds to m = 0, i = 1, and this term is uniform in θ. The next to the last term corresponds to m = 1, i = 1, not to m = 0, i = 2. Indeed estimates of how large t must be before the series can be replaced by its first term are too short if we look at λ₀₁ and λ₀₂. We need to look at λ₀₁ and λ₁₁.
The terms corresponding to larger values of |m| and i die out faster than the terms corresponding to smaller values of |m| and i. This is the smoothing we associate with diffusion, as the eigenfunctions corresponding to larger values of |m| and i exhibit more oscillations and their contribution to the solution dies out faster.
The sketch below of J₀(λ₀₁r), J₀(λ₀₂r), J₀(λ₀₃r), … shows that this set of orthogonal functions is constructed from J₀(z) by scaling, in turn, its positive zeros, z₁, z₂, z₃, …, to 1. Likewise J₁(λ₁₁r), J₁(λ₁₂r), J₁(λ₁₃r), … is constructed from J₁(z) in just this same way. Etc. This illustrates the rule that in a set of orthogonal functions each function can be identified by the number of its interior zeros. It also shows that the zeros of any two functions in such a set are nested.
[Sketch: J₀(λ₀₁r), J₀(λ₀₂r) and J₀(λ₀₃r) on 0 ≤ r ≤ 1.]
As the last remaining term in our solution as t grows large is a multiple of J₀(λ₀₁r), it is important that J₀(λ₀₁r) be singly signed. It is also important that J₁(z = 0) = 0, otherwise the eigenfunctions J₁(λ₁ᵢr) cos θ would be poorly behaved as r → 0. Likewise J₂(z = 0) = 0 = J′₂(z = 0), and this is important as the eigenfunctions exhibiting J₂ as a factor are multiplied by cos 2θ. Etc.
In a general orthogonal coordinate system (ξ, η, ζ) the volume element is
$$dV=h_\xi h_\eta h_\zeta\,d\xi\,d\eta\,d\zeta$$
hence the inner product over V factors into one dimensional inner products only under special conditions. And these conditions are not satisfied in the elliptic cylinder coordinates introduced in §17.5. Further, we have
$$\nabla^2=\frac{1}{h_\xi h_\eta h_\zeta}\frac{\partial}{\partial\xi}\Big(\frac{h_\eta h_\zeta}{h_\xi}\frac{\partial}{\partial\xi}\Big)+\cdots$$
and for separation of variables to work out as simply as above places strong requirements on h_ξ, h_η and h_ζ, conditions not satisfied in all orthogonal coordinate systems.
In the next lecture we solve some problems where what we have done in this lecture is suffi-
cient.
To determine the zeros of cos z or J₀(z) in terms of the coefficients in their power series expansions, we observe that if q(z) has a simple zero at z₀ then q′(z)/q(z) has a simple pole there and its residue is 1. Then because the contour integral
$$\frac{1}{2\pi i}\oint_C\frac{1}{(w-z)}\frac{\cos'w}{\cos w}\,dw$$
is equal to the sum of the residues of its integrand at its poles inside C, and because the integral vanishes as the diameter of C grows large, we get
$$0=\frac{\cos'z}{\cos z}+\Big(\frac{1}{z_1-z}+\frac{1}{-z_1-z}\Big)+\cdots$$
where the zeros of cos z are real and where 0 < z₁ < z₂ < ⋯ denote the positive zeros. Writing this as
$$\frac{\cos'z}{\cos z}=\frac{2z}{z^2-z_1^2}+\frac{2z}{z^2-z_2^2}+\cdots$$
or
$$-\frac{1}{2z}\frac{\cos'z}{\cos z}=\sum_i\frac{1}{z_i^2}\,\frac{1}{1-\dfrac{z^2}{z_i^2}}$$
and expanding 1/(1 − z²/zᵢ²) as Σⱼ(z²/zᵢ²)ʲ when |z²| < |zᵢ²|, we get
$$-\frac{1}{2z}\frac{\cos'z}{\cos z}=\sum_{i=1}^{\infty}\frac{1}{z_i^2}\sum_{j=0}^{\infty}\Big(\frac{z^2}{z_i^2}\Big)^j=\sum_{j=0}^{\infty}z^{2j}\sum_{i=1}^{\infty}\Big(\frac{1}{z_i^2}\Big)^{j+1}$$
where cos z is a function of z² and (d/dz²)cos z = (1/2z)cos′z. Using this to write
$$-\frac{d}{dz^2}\cos z=\cos z\sum_{j=0}^{\infty}z^{2j}\sum_{i=1}^{\infty}\Big(\frac{1}{z_i^2}\Big)^{j+1}$$
and expanding both sides in powers of z², we can evaluate the sums Σᵢ(1/zᵢ²)^{j+1}. Thus, using the series defining cos z, viz.,
$$\cos z=1-\frac{1}{2}z^2+\frac{1}{24}z^4-\frac{1}{720}z^6+\cdots$$
and
$$\frac{d}{dz^2}\cos z=-\frac{1}{2}+\frac{1}{12}z^2-\frac{1}{240}z^4+\cdots$$
we get
$$\sum\frac{1}{z_i^2}=\frac{1}{2}$$
$$\sum\frac{1}{z_i^4}=\frac{1}{6}$$
etc.
Indeed √6 = 2.449 is already a fair approximation to z₁² = π²/4 = 2.467.
The corresponding equations for J₀ are
$$\sum\frac{1}{z_i^2}=\frac{1}{4}$$
$$\sum\frac{1}{z_i^4}=\frac{1}{32}$$
$$\sum\frac{1}{z_i^6}=\frac{1}{192}$$
etc., where 192^{1/3} = 5.769 is a very good approximation to the square of the smallest positive zero of J₀.
We let w, x, y and z denote rectangular Cartesian coordinates in four dimensions and define spherical coordinates via
$$w=r\cos\omega$$
$$x=r\sin\omega\cos\theta$$
$$y=r\sin\omega\sin\theta\cos\phi$$
$$z=r\sin\omega\sin\theta\sin\phi$$
The result of substituting ψ = R(r)Ω(ω)Θ(θ)Φ(φ) into (∇² + λ²)ψ, dividing by RΩΘΦ and identifying terms which must be constant is
$$\frac{d^2\Phi}{d\phi^2}+m^2\Phi=0$$
$$\frac{1}{\sin\theta}\frac{d}{d\theta}\Big(\sin\theta\frac{d\Theta}{d\theta}\Big)+\Big(\ell(\ell+1)-\frac{m^2}{\sin^2\theta}\Big)\Theta=0$$
$$\frac{1}{\sin^2\omega}\frac{d}{d\omega}\Big(\sin^2\omega\frac{d\Omega}{d\omega}\Big)+\Big(k(k+2)-\frac{\ell(\ell+1)}{\sin^2\omega}\Big)\Omega=0$$
and
$$\frac{1}{r^3}\frac{d}{dr}\Big(r^3\frac{dR}{dr}\Big)+\Big(\lambda^2-\frac{k(k+2)}{r^2}\Big)R=0$$
Turning to elliptic cylinder coordinates, we have
$$x=c\cosh\xi\cos\eta,\qquad y=c\sinh\xi\sin\eta,\qquad z=z$$
Then we find h_ξ = h_η = c√(sinh²ξ + sin²η) and h_z = 1, and hence
$$\nabla^2=\frac{1}{c^2(\sinh^2\xi+\sin^2\eta)}\Big(\frac{\partial^2}{\partial\xi^2}+\frac{\partial^2}{\partial\eta^2}\Big)+\frac{\partial^2}{\partial z^2}$$
Substituting ψ = X(ξ)Y(η)Z(z), separation of variables yields
$$\frac{d^2Z}{dz^2}+\gamma^2Z=0$$
$$\frac{d^2X}{d\xi^2}+\big\{c^2\sinh^2\xi\,(\lambda^2-\gamma^2)-m^2\big\}X=0$$
$$\frac{d^2Y}{d\eta^2}+\big\{c^2\sin^2\eta\,(\lambda^2-\gamma^2)+m^2\big\}Y=0$$
We notice that λ² appears in two equations. This is new. And we see that the orthogonality does not factor, viz.,
$$\langle\psi_{ij},\psi_{k\ell}\rangle=\int\!\!\int\overline{X_i(\xi)\,Y_j(\eta)}\,X_k(\xi)\,Y_\ell(\eta)\,c^2\big(\sinh^2\xi+\sin^2\eta\big)\,d\xi\,d\eta$$
This is new.
1. Multiply Bessel's equation
$$\frac{1}{r}\frac{d}{dr}\Big(r\frac{d\psi}{dr}\Big)-\frac{m^2}{r^2}\psi+\lambda^2\psi=0$$
by 2r²(dψ/dr) to obtain
$$2\,r\frac{d\psi}{dr}\,\frac{d}{dr}\Big(r\frac{d\psi}{dr}\Big)=2\big(-\lambda^2r^2+m^2\big)\psi\frac{d\psi}{dr}$$
Now this is
$$\frac{d}{dr}\Big(r\frac{d\psi}{dr}\Big)^2=\big(-\lambda^2r^2+m^2\big)\frac{d\psi^2}{dr}$$
Integrate to derive the factor normalizing the Bessel functions.
2. Heat is generated in a circle of radius R. The temperature at the edge is held fixed at T = 0. The rate of heat generation is a linear function of temperature, increasing as temperature increases:
$$\nabla^2T+\lambda^2(1+T)=0,\qquad T=0\ \text{ at } r=R$$
What is the greatest value of R at which there is a solution to our problem, at a fixed value of λ²? (We could ask: what is the greatest value of λ² at which there is a solution to our problem at a fixed value of R?) A cooling pipe of radius R₀ is introduced at the center of the circle. Its temperature is T = 0. By how much can R be increased?
It is not possible to center the cooling pipe precisely. What is its effect if it is off center by a small amount ε?
In the expansion
$$R=R_0+\varepsilon R_1(\theta)+\frac{1}{2}\varepsilon^2R_2(\theta)+\cdots$$
we have
$$R_1=\cos\theta,\qquad R_2=-\frac{\sin^2\theta}{R_0}$$
3. Assume the temperature, T > 0, is specified at the cross section z = 0 of an infinite circular cylinder of radius R. The walls are held at T = 0. Find the temperature in the cylinder, z > 0, by solving
$$-\frac{\partial^2T}{\partial z^2}=\frac{1}{r}\frac{\partial}{\partial r}\Big(r\frac{\partial T}{\partial r}\Big)$$
using the eigenfunctions defined by
$$\frac{1}{r}\frac{d}{dr}\Big(r\frac{d\psi}{dr}\Big)+\lambda^2\psi=0,\quad 0<r<R,\qquad\text{and}\qquad\psi=0\ \text{ at } r=R$$
Compare this with the solution of the transient problem
$$\frac{\partial T}{\partial t}=\frac{1}{r}\frac{\partial}{\partial r}\Big(r\frac{\partial T}{\partial r}\Big)$$
where T is specified at t = 0.
4. The eigenvalue problem
$$\Big(\frac{d^2}{dr^2}+\frac{1}{r}\frac{d}{dr}\Big)^2\psi=\lambda^4\psi$$
$$\psi(r=1)=0$$
$$\psi'(r=1)=0$$
where ψ is required to be bounded, has solutions ψ = A J₀(λr) + B I₀(λr), due to
$$\Big(\frac{d^2}{dr^2}+\frac{1}{r}\frac{d}{dr}\Big)J_0(\lambda r)=-\lambda^2J_0(\lambda r)$$
and
$$\Big(\frac{d^2}{dr^2}+\frac{1}{r}\frac{d}{dr}\Big)I_0(\lambda r)=\lambda^2I_0(\lambda r)$$
Imposing the boundary conditions, we find that, to have a solution such that A and B are not both zero, the λ's must satisfy
$$\lambda\big\{J_0(\lambda)\,I_0'(\lambda)-J_0'(\lambda)\,I_0(\lambda)\big\}=0$$
and then
$$B=-\frac{A\,J_0(\lambda)}{I_0(\lambda)}$$
or
$$\psi=A\big\{J_0(\lambda r)\,I_0(\lambda)-I_0(\lambda r)\,J_0(\lambda)\big\}$$
Denote by W(λ) the Wronskian of J₀(λ) and I₀(λ) and show that
$$\frac{dW}{d\lambda}=-\frac{1}{\lambda}W+2J_0(\lambda)\,I_0(\lambda)$$
The λ's satisfy W(λ) = 0 and you will find that W(λ) is not a very nice function.
5. Show that
$$\Big(\frac{d^2}{dr^2}+\frac{1}{r}\frac{d}{dr}\Big)^2\psi=\lambda^4\psi$$
has solutions J₀(λr), Y₀(λr), I₀(λr), K₀(λr), and that
$$\frac{d^4\psi}{dz^4}=\lambda^4\psi$$
has solutions cos λz, sin λz, cosh λz, sinh λz. Each of these solutions has enough flexibility to satisfy four boundary conditions.
To solve
$$\Big(\frac{\partial^2}{\partial r^2}+\frac{1}{r}\frac{\partial}{\partial r}+\frac{\partial^2}{\partial z^2}\Big)^2\psi=4\lambda^4\psi$$
observe that
$$\Big(\frac{\partial^2}{\partial r^2}+\frac{1}{r}\frac{\partial}{\partial r}+\frac{\partial^2}{\partial z^2}\Big)I_0(\lambda r)\sin\lambda z=0$$
$$\Big(\frac{\partial^2}{\partial r^2}+\frac{1}{r}\frac{\partial}{\partial r}+\frac{\partial^2}{\partial z^2}\Big)I_0(\lambda r)\sinh\lambda z=2\lambda^2I_0(\lambda r)\sinh\lambda z$$
etc., whereupon among the solutions are
$$\big\{A\,J_0(\lambda r)+B\,Y_0(\lambda r)\big\}\big\{C\sin\lambda z+D\cos\lambda z\big\}$$
and
$$\big\{A\,I_0(\lambda r)+B\,K_0(\lambda r)\big\}\big\{C\sinh\lambda z+D\cosh\lambda z\big\}$$
and also
$$\big\{A\,J_0(\sqrt3\,\lambda r)+B\,Y_0(\sqrt3\,\lambda r)\big\}\big\{C\sinh\lambda z+D\cosh\lambda z\big\}$$
etc.
But it is not easy to find ψ's with enough flexibility to satisfy the boundary conditions you are likely to meet. Show that the same problem arises if you are trying to solve
$$\nabla^4\psi=\Big(\frac{\partial^2}{\partial x^2}+\frac{\partial^2}{\partial y^2}\Big)^2\psi=0$$
To solve
$$\nabla^4\phi=f(x,y)$$
on, say, −1 ≤ x ≤ 1, you might introduce the eigenfunctions of
$$\frac{d^4\psi}{dx^4}=\lambda^4\psi$$
viz.,
$$\psi_\lambda=A\cos\lambda x+B\cosh\lambda x$$
or
$$\psi_\lambda=C\sin\lambda x+D\sinh\lambda x$$
and write
$$\phi(x,y)=\sum_\lambda c_\lambda(y)\,\psi_\lambda(x)$$
where
$$c_\lambda(y)=\int_{-1}^{+1}\psi_\lambda(x)\,\phi(x,y)\,dx$$
and then, to find the equation satisfied by the c_λ's, you multiply ∇⁴φ = f by ψ_λ and integrate over −1 ≤ x ≤ 1, obtaining
$$\int_{-1}^{+1}\psi_\lambda\frac{\partial^4\phi}{\partial x^4}\,dx+2\frac{d^2}{dy^2}\int_{-1}^{+1}\psi_\lambda\frac{\partial^2\phi}{\partial x^2}\,dx+\frac{d^4}{dy^4}\int_{-1}^{+1}\psi_\lambda\phi\,dx=\int_{-1}^{+1}\psi_\lambda f\,dx$$
On carrying out integration by parts as many times as you need to, you discover one, and only one, technical difficulty: the term
$$2\frac{d^2}{dy^2}\int_{-1}^{+1}\frac{d^2\psi_\lambda}{dx^2}\,\phi\,dx$$
appears and it is not easy to write d²ψ_λ/dx² in terms of the ψ_λ's.
It turns out that, in two dimensional problems, the theory of complex variables comes to your rescue: if you write z = x + iy, then the solutions of ∇⁴ψ = 0 can be built out of two analytic functions, viz., ψ = Re{z̄f(z) + g(z)}.
6. You have a solute diffusing in a two dimensional domain according to
$$\frac{\partial c}{\partial t}=\nabla^2c$$
where c = 0 on the boundary. Introducing the eigenvalue problem
$$\nabla^2\psi+\lambda^2\psi=0$$
you can write
$$c=\sum\langle\psi,c(t=0)\rangle\,e^{-\lambda^2t}\,\psi$$
Hence all you need are the solutions to the eigenvalue problem in order to estimate c. First, suppose the domain is a circle of radius R₀. Find the two eigenvalues in control of the final stages of solute loss to the surroundings.
Then perturb the boundary, writing
$$R(\theta)=R_0+\varepsilon R_1+\frac{1}{2}\varepsilon^2R_2+\cdots$$
where R₁ = cos θ, R₂ = −sin²θ/R₀, and find the corrections to the above two eigenvalues in order to learn whether the diffusion of solute is faster or slower, i.e., find λ₁² and λ₂² in the series
$$\lambda^2=\lambda_0^2+\varepsilon\lambda_1^2+\frac{1}{2}\varepsilon^2\lambda_2^2+\cdots$$
The hope is you find λ₁² = 0 = λ₂² for both eigenvalues and you are curious to know why this is so.
7. Introduce elliptic cylinder coordinates
$$x=c\cosh\xi\cos\eta,\qquad y=c\sinh\xi\sin\eta,\qquad z=z$$
The curves ξ = constant are ellipses and the curves η = constant are branches of hyperbolas; far from the origin
$$\frac{y}{x}=\tan\eta$$
and hence the four values of η are the angles that the branches make with the positive x axis. The ellipses and hyperbolas are centered at x = 0 = y and their foci lie at x = ±c, y = 0. The curves η = π/3, 2π/3, 4π/3 and 5π/3, corresponding to branches of hyperbolas, are shown in the sketch.
[Sketch: the branches η = π/3, 2π/3, 4π/3, 5π/3, the rays η = 0, π, 2π, and the foci (±c, 0).]
Write ∇² in this coordinate system and reduce the eigenvalue problem (∇² + λ²)ψ = 0 to three one dimensional eigenvalue problems by separation of variables.
8. Our domain is now the elliptical cylinder
$$\frac{x^2}{a^2}+\frac{y^2}{b^2}=1,\qquad a^2-b^2=c^2$$
capped by the planes z = 0, z = d. Assuming that the solute concentration on the lateral surface of the elliptical cylinder does not depend on z, show that c(t > 0) is independent of z iff c(t = 0) is independent of z.
9. Let the lateral surface be the coordinate surface ξ = ξ₁, where
$$a^2=c^2\cosh^2\xi_1,\qquad b^2=c^2\sinh^2\xi_1$$
and suppose that our domain is in contact with a solute free reservoir so that c(ξ = ξ₁) = 0 for all t > 0. Then determine whether or not the solute concentration is always independent of η if it is initially independent of η. Moon and Spencer's book "Field Theory Handbook: Including Coordinate Systems, Differential Equations, and their Solutions" is a useful reference.
10. Occasionally it is possible to use eigenfunctions in a region of simple shape to derive eigen-
functions in a not so simple region of interest. To do this we make linear combinations of the
eigenfunctions we have in a way that satisfies the conditions on the boundary of the region
of interest.
For example sin mπx sin nπy and sin nπx sin mπy are eigenfunctions of ∇² in the square, 0 ≤ x ≤ 1, 0 ≤ y ≤ 1, satisfying ψ = 0 along x = 0, x = 1, y = 0 and y = 1. The eigenvalue is
$$\lambda^2=\pi^2\big(m^2+n^2\big)$$
11. Solve
$$\nabla^2\psi+\lambda^2\psi=0,\qquad r\le R(\theta),\quad 0\le\theta<2\pi$$
where
$$\psi=0\ \text{ at } r=R(\theta)$$
and R(θ) traces an ellipse of semi-axes
$$a=R_0+\varepsilon,\qquad b=R_0-\varepsilon+\frac{\varepsilon^2}{R_0}-\cdots$$
so that
$$R(\theta)=R_0+\varepsilon R_1(\theta)+\frac{1}{2}\varepsilon^2R_2(\theta)+\cdots$$
Use
$$b^2x^2+a^2y^2=a^2b^2$$
where x = R cos θ, y = R sin θ, to find R₁(θ), R₂(θ), … By doing this you should have R₁(θ) = cos 2θ and R₂(θ) = your job.
Then any eigenfunction and eigenvalue on the circle, viz., ψ₀, λ₀², can be corrected for a slight displacement by writing
$$\psi=\psi_0+\varepsilon\psi_1+\frac{1}{2}\varepsilon^2\psi_2+\cdots$$
$$\lambda^2=\lambda_0^2+\varepsilon\lambda_1^2+\frac{1}{2}\varepsilon^2\lambda_2^2+\cdots$$
deriving the equations for ψ₁, λ₁² and ψ₂, λ₂² on the circle in the usual way, and then deriving the boundary conditions they satisfy by expanding ψ = 0 at r = R(θ) about r = R₀, viz.,
$$\psi_1=-R_1(\theta)\frac{\partial\psi_0}{\partial r}$$
and
$$\psi_2=-2R_1(\theta)\frac{\partial\psi_1}{\partial r}-R_1^2(\theta)\frac{\partial^2\psi_0}{\partial r^2}-R_2(\theta)\frac{\partial\psi_0}{\partial r}$$
where the RHS is evaluated at r = R₀. To see how this goes first take ψ₀ = J₀(λ₀r), where J₀(λ₀R₀) = 0, and then ψ₀ = J₁(λ₀r) cos θ, where J₁(λ₀R₀) = 0.
J0 (λ0 R0 ) = 0 and then ψ0 = J1 (λ0 r) cos θ where J1 (λ0 R0 ) = 0.
You should notice that at each order 1st , 2nd , etc., the homogeneous problem is the ze-
roth order problem and it has a solution, viz., ψ0 , not zero. Hence a solvability condition must
be satisfied. This determines λ21 at first order, before ψ1 , λ22 at second order, before ψ2 , etc. In
solving for ψ₁, ψ₂, etc. you ought to expand the inhomogeneous terms in 1, cos θ, cos 2θ, …, e.g., cos²θ = 1/2 + (1/2)cos 2θ.
12. The setting for the free radical problem, see Lecture 14, is now a very long circular cylinder
of radius R. What is the critical value of R in terms of k and D?
By how much can the critical value of R be increased if the cylinder is of length L and
c = 0 at z = 0 and L.
13. Define ψ by
$$\frac{1}{r}\frac{d}{dr}\Big(r\frac{d\psi}{dr}\Big)-\frac{m^2}{r^2}\psi+\lambda^2\psi=0,\qquad R_1<r<R_2$$
Multiply this equation by 2r²(dψ/dr) and integrate the product over R₁ < r < R₂. Derive a formula for
$$\int_{R_1}^{R_2}r\,\psi^2\,dr$$
14. You have an inviscid fluid satisfying
$$\frac{\partial\vec v}{\partial t}+\vec v\cdot\nabla\vec v=-\nabla p,\qquad\nabla\cdot\vec v=0$$
where p denotes pressure/ρ. Show that
$$v_r=0,\qquad v_\theta=r\Omega,\qquad v_z=0$$
is a solution inside the cylinder r = R. Then impose a small perturbation, writing vr1 = v̂r(r)e^{σt}e^{imθ}e^{ikz}, etc., to obtain
$$\imath(\sigma+m\Omega)\hat v_r-2\Omega\hat v_\theta=-\frac{d\hat p}{dr}$$
$$\imath(\sigma+m\Omega)\hat v_\theta+2\Omega\hat v_r=-\frac{\imath m}{r}\hat p$$
$$\imath(\sigma+m\Omega)\hat v_z=-\imath k\hat p$$
and
$$\frac{d\hat v_r}{dr}+\frac{\hat v_r}{r}+\frac{\imath m\hat v_\theta}{r}+\imath k\hat v_z=0$$
and at r = R, vr = 0 implies
$$\frac{d\hat p}{dr}+\frac{2\Omega m}{\sigma+m\Omega}\frac{\hat p}{r}=0$$
15. Solve the Petri dish problem in a circular domain, assuming homogeneous Dirichlet conditions along the circumference.
16. A cold rod of radius κR₀ lies inside a hot pipe of radius R₀. The temperatures T_cold and T_hot are held fixed and the temperature of the fluid in the annular region is
$$A+B\ln\frac{r}{R_0}$$
where
$$T_{cold}=A+B\ln\kappa\qquad\text{and}\qquad T_{hot}=A$$
The fluid and the cylinders are spinning at constant angular velocity, Ω, such that the fluid velocity is Ω⃗ × r⃗, where Ω⃗ = Ωk⃗ and k⃗ lies along the axis of the rod. The density is
$$\rho=\rho_{ref}\big\{1-\alpha(T-T_{ref})\big\}$$
Accounting for the variation of ρ only in the v⃗·∇v⃗ terms, derive the equations satisfied by a small perturbation of the base state, assume a solution
$$v_{r1}=\hat v_r(r)\,e^{\sigma t}e^{\imath m\theta},\quad v_{\theta1}=\hat v_\theta(r)\,e^{\sigma t}e^{\imath m\theta},\quad p_1=\hat p(r)\,e^{\sigma t}e^{\imath m\theta},\quad T_1=\hat T(r)\,e^{\sigma t}e^{\imath m\theta}$$
and derive the equations for v̂r, v̂θ, p̂ and T̂. There is no gravity, no z variation, no vz, and for κ near 1 the base temperature is more or less linear. Making this approximation, find the critical value of T_hot − T_cold, i.e., the smallest value of T_hot − T_cold such that σ = 0.
17. A hot rod of radius R0 loses heat by conduction to a cold pipe of radius κR0 , κ > 1. Their
temperatures are Th > Tc . Derive a formula for the rate of heat loss. Move the rod off center
by a small amount ε so that its surface is now
$$R(\theta)=R_0+\varepsilon R_1(\theta)+\frac{1}{2}\varepsilon^2R_2(\theta)+\cdots$$
where
$$R_1(\theta)=\cos\theta,\qquad R_2(\theta)=-\frac{\sin^2\theta}{R_0}$$
and find out by how much the heat loss is changed.
18. The solutions to
$$\nabla^2\psi+\lambda^2\psi=0,\qquad 0\le r\le R_0,\quad 0\le\theta<2\pi$$
where
$$\psi=0\ \text{ at } r=R_0$$
are
$$J_{|m|}(\lambda_{|m|i}\,r)\,\frac{e^{im\theta}}{\sqrt{2\pi}},\qquad\text{etc.}$$
Now displace the circle slightly into an ellipse of the same area,
$$r=R(\theta)=R_0+\varepsilon R_1+\cdots$$
where
$$ab=R_0^2,\qquad a=R_0+\varepsilon,\qquad b=\frac{R_0^2}{a}=R_0\Big(1-\frac{\varepsilon}{R_0}+\cdots\Big)$$
so that we have
$$R_1=\cos2\theta,\qquad R_2=-\frac{1}{R_0}\Big(\frac{1}{2}+\cos2\theta-\frac{3}{2}\cos4\theta\Big),\ \cdots$$
Then solve
R0 2 2
x2 y 2
∇2 ψ + λ2 ψ = 0, + 2 ≤1
a2 b
where
x2 y 2
ψ = 0, at + 2 =1
a2 b
λ2 = λ20 + ελ21 + · · ·
ψ = ψ0 + εψ1 + · · ·
where λ20 and ψ0 are the corresponding eigenvalues and eigenvectors on the circle, derive the
result
19. Solve the eigenvalue problem on the domain 0 ≤ x ≤ 1, 0 ≤ y ≤ Y(x), where
$$y=Y(x)=Y_0+\varepsilon Y_1(x)+\frac{1}{2}\varepsilon^2Y_2(x)+\cdots$$
and where Y₁(x), Y₂(x), ⋯ are inputs. You can make your job easy by assuming
$$Y_1(x)=\sin2\pi x,\qquad Y_2(x)=0,\ \ldots$$
[Sketch: the domain 0 ≤ x ≤ 1, 0 ≤ y ≤ Y(x), and the base domain 0 ≤ x₀ ≤ 1, 0 ≤ y₀ ≤ Y₀.]
The solutions to the base problem on the rectangle 0 ≤ x₀ ≤ 1, 0 ≤ y₀ ≤ Y₀ are
$$\psi_0=\sin m\pi x_0\,\sin n\pi\frac{y_0}{Y_0},\qquad m,n=1,2,\ldots$$
and
$$\lambda_0^2=m^2\pi^2+\frac{n^2\pi^2}{Y_0^2}$$
Write
$$\lambda^2=\lambda_0^2+\varepsilon\lambda_1^2+\frac{1}{2}\varepsilon^2\lambda_2^2+\cdots$$
where λ₀², ψ₀ correspond to definite values of m and n, say m₀, n₀, held fixed henceforth.
The boundary conditions along y₀ = Y₀ are, first,
$$\psi_1=-Y_1(x_0)\frac{\partial\psi_0}{\partial y_0}(x_0,y_0=Y_0)$$
and, second,
$$\psi_2=-2Y_1(x_0)\frac{\partial\psi_1}{\partial y_0}(x_0,y_0=Y_0)-Y_1^2(x_0)\frac{\partial^2\psi_0}{\partial y_0^2}(x_0,y_0=Y_0)-Y_2(x_0)\frac{\partial\psi_0}{\partial y_0}(x_0,y_0=Y_0)$$
etc. At each order the homogeneous problem has a solution, not zero. Hence a solvability condition must be satisfied, and these conditions lead you to λ₁², λ₂², etc. Second, solve for ψ₁ by deriving formulas for the coefficients A_{mn}, where
$$\psi_1=\sum\sum A_{mn}\,\psi_{0mn}$$
and where
$$A_{mn}=\int\!\!\int\psi_{0mn}\,\psi_1\,dx_0\,dy_0$$
assuming ∬ψ²_{0mn} dx₀ dy₀ = 1.
20. Solve
$$\nabla^2\psi+\lambda^2\psi=C$$
on the square −L ≤ x ≤ L, −L ≤ y ≤ L, where
$$\psi=0\ \text{ at the sides}$$
and
$$\iint_A\psi\,dx\,dy=0$$
and where ψ, C and λ² are the outputs. Your interest is in the solutions where C ≠ 0.
The problem
$$\nabla^2\phi+\mu^2\phi=0,\qquad\phi=0\ \text{ at the sides}$$
has solutions
$$\sin m\pi\frac{x}{L}\,\sin n\pi\frac{y}{L},\qquad m,n=1,2,\ldots$$
$$\sin m\pi\frac{x}{L}\,\cos\Big(n+\frac{1}{2}\Big)\pi\frac{y}{L},\qquad m=1,2,\ldots\quad n=0,1,\ldots$$
$$\cos\Big(m+\frac{1}{2}\Big)\pi\frac{x}{L}\,\sin n\pi\frac{y}{L},\qquad m=0,1,\ldots\quad n=1,2,\ldots$$
all of which integrate to zero, hence all of which are ψ's and λ²'s corresponding to C = 0. For C ≠ 0, expanding ψ in the remaining eigenfunctions φ_{mn}, you should find that the λ²'s satisfy
$$\sum_{m,n}\frac{\big(\iint\phi_{mn}\,dx\,dy\big)^2}{\lambda^2-\mu^2_{mn}}=0$$
Two Stability Problems
Using what we did in Lecture 17 we can work out two stability problems: the Saffman-Taylor problem (P. G. Saffman, G. I. Taylor, Proc. Roy. Soc., Vol. 245, 312, 1958) and the Rayleigh-Taylor problem (S. Chandrasekhar, "Hydrodynamic and Hydromagnetic Stability").
The setting for each is a cylindrical column of circular cross section bounding a porous solid. Fluid fills the pores and its velocity is given by Darcy's law.
We will need the balances, expressing the conservation laws, across a surface z = Z(x, y, t) separating two phases, denoted (1) and (2).
[Sketch: the surface z = Z(x, y, t), phase (1) below, phase (2) above, with unit normal n⃗ and normal velocity u.]
Here
$$\vec n=\frac{\vec k-Z_x\vec i-Z_y\vec j}{\sqrt{1+Z_x^2+Z_y^2}}$$
$$u=\frac{Z_t}{\sqrt{1+Z_x^2+Z_y^2}}$$
and
$$2H=\frac{\big(1+Z_y^2\big)Z_{xx}-2Z_xZ_yZ_{xy}+\big(1+Z_x^2\big)Z_{yy}}{\big(1+Z_x^2+Z_y^2\big)^{3/2}}$$
Assuming the phases are immiscible, neither crossing the surface, we have at z = Z (x, y, t):
$$\vec n\cdot\vec v^{(1)}=u=\vec n\cdot\vec v^{(2)}$$
i.e.,
$$\big\{v_z-Z_xv_x-Z_yv_y\big\}^{(1)}=Z_t=\big\{v_z-Z_xv_x-Z_yv_y\big\}^{(2)}$$
and
$$-\vec n\vec n:\mathbf T^{(1)}+\gamma\,2H=-\vec n\vec n:\mathbf T^{(2)}$$
It is convenient to write
$$\nabla=\vec k\frac{\partial}{\partial z}+\nabla_H$$
and
$$\vec v=v_z\vec k+\vec v_H$$
and to obtain the perturbation equations by assuming
$$Z=Z_0+\varepsilon Z_1+\cdots$$
The Saffman-Taylor Problem
In this problem the stability of the surface separating two immiscible fluids is of interest, one fluid displacing the other in a porous rock. The flow is in the z direction at a speed U and gravity is not important. What is important is that the viscosity of the two fluids differs.
[Sketch: a column of radius R; the fluid of viscosity µ lies above the surface z = Z(r, θ, t), the fluid of viscosity µ* below, displaced upward at speed U; the base surface is z = Z₀ = 0.]
We introduce an observer moving at the velocity U ~k. Then, in the moving frame, we write
the nonlinear equations making up our model for the dynamics of the surface, z = Z (r, θ, t),
separating the two phases. First, we have Darcy’s law, which is not Galilean invariant, above and
below the surface, viz.,
$$\frac{\mu}{K}\big(\vec v+\vec U\big)=-\nabla p,\qquad z>Z$$
and
$$\frac{\mu^*}{K}\big(\vec v^*+\vec U\big)=-\nabla p^*,\qquad z<Z$$
where ∇·v⃗ = 0 = ∇·v⃗*, where U⃗ = Uk⃗ and where K denotes the permeability of the porous solid, whereupon
$$\nabla^2p=0=\nabla^2p^*$$
At the side wall we have
$$\vec n\cdot\vec v=0=\vec n\cdot\nabla_Hp,\qquad z>Z$$
and
$$\vec n\cdot\vec v^*=0=\vec n\cdot\nabla_Hp^*,\qquad z<Z$$
and contact at right angles to the wall implies n⃗·∇_H Z = 0. Far from the surface z = Z the pressures p and p* must be bounded.
At the surface z = Z we have
$$v_z-\vec v_H\cdot\nabla_HZ=Z_t=v_z^*-\vec v_H^*\cdot\nabla_HZ$$
and
$$p-p^*=\gamma\,2H$$
The base solution is
$$\vec v_0=\vec 0=\vec v_0^*$$
$$\frac{dp_0}{dz}=-\frac{\mu}{K}U$$
and
$$\frac{dp_0^*}{dz}=-\frac{\mu^*}{K}U$$
where the surface separating the two fluids lies at z = Z₀ = 0, defining the base domain.
Imposing a small displacement on our base solution and denoting the perturbation variables by
the subscript 1, viz., Z = Z0 + εZ1 , we obtain the perturbation problem. It is defined on the base
domain and we have
$$\nabla^2p_1=0,\qquad z>0$$
and
$$\nabla^2p_1^*=0,\qquad z<0$$
$$v_{r1}=0\ \ \therefore\ \ \frac{\partial p_1}{\partial r}=0\ \text{ at } r=R$$
$$v_{r1}^*=0\ \ \therefore\ \ \frac{\partial p_1^*}{\partial r}=0\ \text{ at } r=R$$
and
$$\frac{\partial Z_1}{\partial r}=0\ \text{ at } r=R$$
And at z = 0 we have
$$\Big(p_1+Z_1\frac{dp_0}{dz}\Big)-\Big(p_1^*+Z_1\frac{dp_0^*}{dz}\Big)=\gamma\nabla_H^2Z_1$$
$$-\frac{K}{\mu}\frac{\partial p_1}{\partial z}=v_{z1}=\frac{\partial Z_1}{\partial t}=v_{z1}^*=-\frac{K}{\mu^*}\frac{\partial p_1^*}{\partial z}$$
and
$$\int_0^{2\pi}\!\!\int_0^RZ_1\,r\,dr\,d\theta=0$$
This is a linear problem in p₁, p₁* and Z₁, where each of these variables satisfies homogeneous Neumann conditions at r = R. The eigenvalue problem of interest is then
$$\nabla_H^2\psi+\lambda^2\psi=0,\qquad\frac{\partial\psi}{\partial r}=0\ \text{ at } r=R$$
whose solutions are
$$\psi=J_m(\lambda r)\cos m\theta$$
where λ is a root of
$$J_m'(\lambda R)=0$$
and J′m(x) denotes (d/dx)Jm(x).
Our plan is to determine the growth rate of surface displacements in the shape of any of the allowable eigenfunctions. Writing p₁ = p̂₁(z)ψe^{σt}, p₁* = p̂₁*(z)ψe^{σt} and Z₁ = Ẑ₁ψe^{σt}, we find
$$\frac{d^2\hat p_1}{dz^2}-\lambda^2\hat p_1=0,\qquad z>0$$
$$\frac{d^2\hat p_1^*}{dz^2}-\lambda^2\hat p_1^*=0,\qquad z<0$$
and at z = 0 we have
$$-\frac{K}{\mu}\frac{d\hat p_1}{dz}=\sigma\hat Z_1=-\frac{K}{\mu^*}\frac{d\hat p_1^*}{dz}$$
$$\Big(\hat p_1+\hat Z_1\Big(\!-\frac{\mu}{K}U\Big)\Big)-\Big(\hat p_1^*+\hat Z_1\Big(\!-\frac{\mu^*}{K}U\Big)\Big)=-\gamma\lambda^2\hat Z_1$$
while the constraint
$$\int_0^{2\pi}\!\!\int_0^R\hat Z_1\,\psi\,e^{\sigma t}\,r\,dr\,d\theta=0$$
restricts the allowable ψ's. The bounded solutions are
$$\hat p_1=Ae^{-\lambda z}$$
and
$$\hat p_1^*=A^*e^{\lambda z}$$
hence at z = 0 we find
$$\frac{K}{\mu}\lambda A=\sigma\hat Z_1=-\frac{K}{\mu^*}\lambda A^*$$
and
$$\Big(A-\frac{\mu}{K}U\hat Z_1\Big)-\Big(A^*-\frac{\mu^*}{K}U\hat Z_1\Big)=-\gamma\lambda^2\hat Z_1$$
The inputs to our problem are U and R, the output is σ and we have three linear homogeneous
equations in A, A∗ and Zb1 which have a non vanishing solution iff the determinant of the matrix
of coefficients vanishes. This determines σ, the growth rate of a surface displacement in the shape
ψ. The readers can work this out.
Setting σ = 0 gives the critical condition
$$\frac{U}{K}(\mu-\mu^*)=\gamma\lambda^2$$
which tells us this: if µ* > µ there is no critical condition, i.e., the surface separating a more viscous fluid displacing a less viscous fluid is stable to any small displacement. A critical value of U is possible iff µ* < µ, i.e., a less viscous fluid displacing a more viscous fluid. A plot of the critical value of U vs λ², then, looks as follows
[Sketch: U_crit vs λ², a straight line through the origin; points above the line are unstable, below it stable.]
and if we mark the allowable values of λ² on the abscissa we see that the lowest allowable λ² sets the pattern of the instability.
The allowable λ's are set by the roots of J′m(λR) = 0:
[Sketch: Jm(x) for m = 0, 1, 2, with the roots of J′₀ = 0, J′₁ = 0 and J′₂ = 0 marked along the x axis.]
We look first at the eigenfunctions ψ = J₀(λr), where J′₀(λR) = 0, and observe that J′₀(x) = 0 has a root at x = 0; hence we have a solution λ = 0 and ψ = 1. This solution does not satisfy the constraint ∬Ẑ₁ψ r dr dθ = 0, and hence it is not marked on the diagram. Every other root of J′₀(x) = 0 is allowable because
$$\int_0^RJ_0(\lambda r)\,r\,dr=\frac{1}{\lambda}\Big[rJ_1(\lambda r)\Big]_0^R$$
vanishes there, J′₀ being −J₁. All the eigenfunctions Jm(λr) cos mθ, m ≥ 1, where J′m(λR) = 0, are allowable due to
$$\int_0^{2\pi}\cos m\theta\,d\theta=0$$
and we observe that J′₂(x), J′₃(x), etc., all vanish at x = 0, but in each case the corresponding eigenfunction is zero.
The least allowable λ² corresponds to the first root of J′₁, and the critical velocity is
$$U_{crit}=\frac{K\gamma}{(\mu-\mu^*)}\lambda^2$$
Thus the pattern we should expect to see as we increase U in an experiment to just beyond U_crit should have a cos θ dependence, viz.,
[Sketch: the m = 1 pattern, rising on one side (+) and falling on the other (−), and the graph of J₁(λr) on 0 ≤ r ≤ R.]
Now we can ask another question: at what value of R does the surface become unstable, given
that U is fixed at a positive value?
The critical condition is now read as
$$\frac{U}{K\gamma}(\mu-\mu^*)=\lambda^2,\qquad\lambda^2=\frac{x^2}{R^2}$$
where x denotes a root of J′m(x) = 0. If R is very small the right hand side is very large, even for the smallest root of J′₁(x) = 0. Hence small diameter columns are stable unless U is very large. Upon increasing R we arrive at its critical value, where
$$\frac{U}{K\gamma}(\mu-\mu^*)=\frac{x_1^2}{R^2_{crit}}$$
The Rayleigh-Taylor Problem
This problem does not differ from the Saffman-Taylor problem by much. Here the instability is caused by gravity and g⃗ takes the place of U⃗. Again we set the problem in a porous rock and use Darcy's law. We have two fluids of different density lying in a gravitational field, the heavy fluid above the light fluid.
[Sketch: g⃗ = −gk⃗; the surface z = Z(r, θ, t) separates the upper fluid, density ρ, from the lower fluid, density ρ*.]
We have
$$\frac{\mu}{K}\vec v=-\nabla p+\rho\vec g,\qquad\nabla\cdot\vec v=0$$
whereupon
$$\nabla^2p=0,\qquad z>Z$$
and
$$\nabla^2p^*=0,\qquad z<Z$$
$$\frac{\partial p}{\partial r}=0\ \text{ at } r=R,\quad z>Z$$
and
$$\frac{\partial p^*}{\partial r}=0\ \text{ at } r=R,\quad z<Z$$
due to no flow across the side walls; p must be bounded as z → ∞ and p* must be bounded as z → −∞; and, at the surface z = Z,
$$\vec n\cdot\vec v=\frac{Z_t}{\sqrt{1+Z_r^2+\frac{1}{r^2}Z_\theta^2}}=\vec n\cdot\vec v^*$$
$$p-p^*=\gamma\,2H$$
and
$$\int_0^{2\pi}\!\!\int_0^RZ(r,\theta,t)\,r\,dr\,d\theta=0$$
At this point we do something a little different than before. We are going to change the bound-
ary conditions satisfied by Z and require pinned edges in place of free edges, i.e., Z = 0 at r = R.
Hence the boundary conditions satisfied by p1, p1∗ and Z 1 at the wall in the perturbed problem
differ and will not allow us to separate variables as we did above, viz., p1 and p1∗ will be ask-
ing for one set of ψ’s, corresponding to Neumann conditions, Z 1 will be asking for another set
corresponding to Dirichlet conditions. Therefore we are limited in what we can do easily.
We do not ask for σ, instead we set σ to zero and look for the neutral condition. By setting σ
to zero in the perturbation problem we have
∇2 p1 = 0 = ∇2 p1∗
∂p1 ∂p ∗
=0= 1 at r=R
∂r ∂r
and
∂p1 ∂p ∗
=0= 1 at z=0
∂z ∂z
where p1 and p1∗ must be bounded. Hence, at the critical value of R, p1 and p1∗ must be constants,
but not necessarily zero as would be the case if the edges of the surface were free instead of pinned.
At z = 0 the pressure jump condition is
$$\Big(p_1+Z_1\frac{dp_0}{dz}\Big)-\Big(p_1^*+Z_1\frac{dp_0^*}{dz}\Big)=\gamma\,2H_1$$
and using
$$\frac{dp_0}{dz}=-\rho g,\qquad\frac{dp_0^*}{dz}=-\rho^*g\qquad\text{and}\qquad 2H_1=\nabla_H^2Z_1$$
we have
$$\gamma C-g(\rho-\rho^*)Z_1=\gamma\nabla_H^2Z_1$$
where γC denotes the constant value of p₁ − p₁* and where
$$Z_1=0\ \text{ at } r=R,\qquad Z_1\ \text{ bounded at } r=0$$
and
$$\int_0^{2\pi}\!\!\int_0^RZ_1\,r\,dr\,d\theta=0$$
This is a homogeneous problem in Z₁ and C, and we are looking for the value of g(ρ − ρ*)/γ such that Z₁ is not zero.
Thus we write
$$\nabla_H^2Z_1+\lambda^2Z_1=C,\qquad Z_1=0\ \text{ at } r=R$$
and
$$\int_0^{2\pi}\!\!\int_0^RZ_1\,r\,dr\,d\theta=0$$
and, scaling r with R, we obtain the eigenvalue problem
$$\nabla_H^2\psi+\lambda^2\psi=C$$
$$\psi=0\ \text{ at } r=1$$
and
$$\int_0^{2\pi}\!\!\int_0^1\psi\,r\,dr\,d\theta=0$$
Then
$$R^2\,\frac{g}{\gamma}(\rho-\rho^*)$$
must be one of the λ²'s and we can look for critical values of R given (g/γ)(ρ − ρ*).
First we see that if ρ* > ρ the surface is stable to small perturbations for all values of R. This is the case of a light fluid lying above a heavy fluid. Then, for ρ > ρ* and R very small, R²(g/γ)(ρ − ρ*) will be less than all the λ²'s, and a heavy fluid lying above a light fluid will be stable to small perturbations.
As we increase R, the critical value of R will be reached when R²(g/γ)(ρ − ρ*) becomes equal to λ₁², the smallest eigenvalue among the set of λ²'s satisfying our eigenvalue problem.
For m = 1, 2, … the constraint ∬ψ r dr dθ = 0 is satisfied automatically, due to ∫₀^{2π}cos mθ dθ = 0, so that C = 0 and the λ's are the roots of Jm(λ) = 0.
[Sketch: J₁(x) and J₂(x); x₁ = 3.83171 is the first positive root of J₁(x) = 0.]
This leaves only the case of axisymmetric disturbances, viz., m = 0, where we cannot use ∫₀^{2π}cos mθ dθ = 0, m = 1, 2, …, to easily conclude that C = 0. In fact at m = 0, C is not zero. Indeed at m = 0 we have
$$\psi=A\,J_0(\lambda r)+\frac{C}{\lambda^2}$$
as the bounded solution of
$$\frac{d^2\psi}{dr^2}+\frac{1}{r}\frac{d\psi}{dr}+\lambda^2\psi=C$$
Now λ cannot be zero, for then C must be zero and we have ψ = A, whereupon A must be zero. Hence to find the positive λ's we observe that
$$\psi(r=1)=0$$
and
$$\int_0^1\psi\,r\,dr=0$$
imply that
$$A\,J_0(\lambda)+\frac{C}{\lambda^2}=0$$
and
$$A\int_0^1J_0(\lambda r)\,r\,dr+\frac{1}{2}\frac{C}{\lambda^2}=0$$
Then using
$$\frac{d}{dr}\big(rJ_1(\lambda r)\big)=\lambda rJ_0(\lambda r)$$
to evaluate ∫₀¹J₀(λr) r dr = J₁(λ)/λ, we obtain the condition
$$\lambda J_0(\lambda)-2J_1(\lambda)=0$$
whose solutions are the values of λ at m = 0. The lowest solution lies to the right of x₁, the smallest positive root of J₁(x) = 0. Hence we conclude that the critical value of R is given by
$$R^2\,\frac{g}{\gamma}(\rho-\rho^*)=x_1^2$$
We present a graph of xJ₀(x) − 2J₁(x) vs x and indicate the first few positive roots.
[Graph: xJ₀(x) − 2J₁(x) vs x, with its first few positive roots marked.]
For comparison: J₁(x) = 0 has x₁ = 3.8317, while J′₁(x) = 0 has x₁ = 1.8412.
Hence the case of pinned edges is most unstable to a cos θ perturbation and, although both free
and pinned edges are most unstable to a cos θ perturbation, it takes a much larger value of R to
destabilize the case where the edges are pinned.
1. You have two isothermal horizontal planes bounding a fluid. The lower one, at z = 0, is hot; the upper one, at z = H, is cold. The density of the fluid depends on its temperature via
$$\rho=\rho_{ref}\big\{1-\alpha(T-T_{ref})\big\}$$
The base state is
$$\vec v_0=\vec 0,\qquad\frac{dT_0}{dz}=-\frac{T_H-T_C}{H}$$
Your model is
$$\rho\frac{\partial\vec v}{\partial t}+\rho\vec v\cdot\nabla\vec v=-\nabla p+\mu\nabla^2\vec v+\rho\vec g,\qquad\nabla\cdot\vec v=0$$
and
$$\frac{\partial T}{\partial t}+\vec v\cdot\nabla T=\kappa\nabla^2T$$
where, at the planes,
$$v_z=0,\qquad\frac{\partial v_z}{\partial x}+\frac{\partial v_x}{\partial z}=0$$
The perturbation equations can be reduced to
$$\rho\frac{\partial}{\partial t}\nabla^2v_{z1}=\mu\nabla^2\nabla^2v_{z1}+\rho_{ref}\,\alpha g\,\frac{\partial^2T_1}{\partial x^2}$$
and
$$\frac{\partial T_1}{\partial t}+v_{z1}\frac{dT_0}{dz}=\kappa\nabla^2T_1$$
Your job is to find dT₀/dz at neutral conditions, where steady values of v_{z1} and T₁, not both zero, prevail. Dropping ∂/∂t and scaling, you can obtain
$$\nabla^2\nabla^2v_{z1}-\Delta T\,\frac{\partial^2T_1}{\partial x^2}=0$$
and
$$\nabla^2T_1+v_{z1}=0$$
where T₁ = 0 = v_{z1} at z = 0, 1 and
$$\frac{\partial^2v_{z1}}{\partial x^2}-\frac{\partial^2v_{z1}}{\partial z^2}=0\ \text{ at } z=0,1$$
Eliminating T₁, you have
$$\nabla^2\nabla^2\nabla^2v_{z1}+\Delta T\,\frac{\partial^2v_{z1}}{\partial x^2}=0$$
Your result should look like this, where σ = 0 curves are plotted as ∆T vs. k².
[Sketch: the neutral curves ∆T vs k² for n = 1, 2, …; below the n = 1 curve lies the stable region, above it the region unstable to n = 1, then to n = 1, 2, etc.; ∆T_crit and k²_crit mark the minimum of the n = 1 curve.]
where k² is an input, telling you the horizontal wave length, 2π/k, of the perturbation. You will notice that very long and very short wave length disturbances are very stable.
Observe that, with the x dependence separated out, the operator ∇² is replaced by d²/dz² − k², and that
$$\int_0^1\Big\{a\Big(\frac{d^2}{dz^2}-k^2\Big)^3b-b\Big(\frac{d^2}{dz^2}-k^2\Big)^3a\Big\}\,dz=0$$
whenever a and b satisfy the boundary conditions of the problem. The physical picture is this: when the lower plane is hot, an element of fluid displaced upward finds itself lighter than its new surroundings, for the density stratification is unstable. This is offset by the fact that our element of fluid is now hotter than its new surroundings and it cools, its density increasing.
2. You have a porous rock bounded by a cylinder of constant cross section. At the top and
bottom are planes held at constant temperature, viz., Thot at z = 0, Tcold at z = H.
The side wall is an insulated, no-flow surface. The top and bottom planes are isothermal
no-flow surfaces.
Your model is
$$\frac{\mu}{K}\vec v=-\nabla p-\rho g\vec k\qquad\text{(Darcy's law)}$$
$$\nabla\cdot\vec v=0$$
and
$$\frac{\partial T}{\partial t}+\vec v\cdot\nabla T=\kappa\nabla^2T$$
where
$$\vec n\cdot\vec v=0\ \text{ at all surfaces},\qquad T=T_{hot}\ \text{ at } z=0\qquad\text{and}\qquad T=T_{cold}\ \text{ at } z=H$$
The base solution is
$$T_0=T_0(z),\qquad\vec v_0=\vec 0,\qquad\frac{dp_0}{dz}=-\rho(T_0)\,g,\qquad\frac{dT_0}{dz}=-\frac{T_{hot}-T_{cold}}{H}<0$$
You may proceed without declaring the shape of the cross section by writing
$$\vec v=v_z\vec k+\vec v_H\qquad\text{and}\qquad\nabla=\vec k\frac{\partial}{\partial z}+\nabla_H$$
whereupon
$$\nabla^2=\frac{\partial^2}{\partial z^2}+\nabla_H^2,\qquad\nabla\cdot\vec v=\frac{\partial v_z}{\partial z}+\nabla_H\cdot\vec v_H$$
$$\frac{\mu}{K}v_z=-\frac{\partial p}{\partial z}-\rho g\qquad\text{and}\qquad\frac{\mu}{K}\vec v_H=-\nabla_Hp$$
Eliminating p, we obtain
$$\nabla^2v_z=\frac{K}{\mu}\rho_{ref}\,\alpha g\,\nabla_H^2T$$
and
$$\frac{\partial T}{\partial t}+v_z\frac{\partial T}{\partial z}+\vec v_H\cdot\nabla_HT=\kappa\nabla^2T$$
where, at the side wall,
$$\vec n\cdot\nabla_Hv_z=0=\vec n\cdot\nabla_HT$$
You introduce a small perturbation of the base solution and derive the perturbation problem for v_{z1} and T₁. And, assuming a steady solution at the critical value of dT₀/dz, you solve your problem by separation of variables, viz.,
$$T_1=\hat T_1(z)\,\psi(x,y),\qquad v_{z1}=\hat v_{z1}(z)\,\psi(x,y)$$
where
$$\nabla_H^2\psi+\lambda^2\psi=0,\qquad\vec n\cdot\nabla_H\psi=0\ \text{ at the edge}$$
So far the cross section has not come into the problem. But at this point it determines the allowable values of λ², and these depend on the shape and diameter of the cross section. Assume the cross section is a circle of radius R₀ and deduce the convection pattern seen at critical as R₀ increases from a small value to its critical value, at fixed T_hot − T_cold.
The dip in the plot of the LHS vs. λ² allows you to see many patterns at the critical value of T_hot − T_cold. Set n = 1 and assume the cross section to be one dimensional, having side walls at x = 0 and x = L. Then ψ = cos kx (k = λ), where the allowable values of k are mπ/L, m = 0, 1, 2, …. For small values of L the most dangerous value of k corresponds to m = 1. As L increases, show that the most dangerous value of k corresponds to increasing values of m and, therefore, that many patterns can be seen at the critical value of T_hot − T_cold, depending on the width of the cell.
3. Assume the cross section in Problem 2 to be a thin rectangle of length a and width b, a >> b.
Is it the value of a or b that controls the critical temperature difference?
4. Your job is to look again at the Rayleigh-Taylor problem, assuming Darcy’s law tells you the
velocity. Do this on an arbitrary cross section, writing
∂
∇ = ~k + ∇H
∂z
and
~v = vz ~k + ~vH
and suppose that the surface is not pinned at the edge but contacts the side wall at right
angles, viz.,
~n · ∇H Z = 0
The question is: is there an effect of the fluid depths on the critical diameter of the cross
section?
LECTURE 18. TWO STABILITY PROBLEMS 483
(1) (2)
At infinite depths you have p1 = 0 = p1 whereas at finite depths you have, instead,
(1)
p1 = c1 , p(2) = c2 . And your equation for Z1 is then
c2 − c1 + g −ρ(2) + ρ(1) Z1 = γ∇H2 Z1
Do you think this would also be true if the edges were pinned, viz., Z1 = 0 at the edges?
5. You are going to try to predict what you might see in a Rayleigh-Taylor experiment, assum-
ing you see the pattern having the greatest growth rate.
You have a cylinder of circular cross section. The radius is denoted R. A heavy fluid,
density ρ, lies above a light fluid, density ρ⋆. The surface separating the fluids is denoted
z = Z (r, θ, t) and at first the two fluids are at rest, being separated by the horizontal surface
dp0 dp0⋆
z = Z0 = 0. You have − →v0 = 0 = −→v0⋆, = −ρg and = −ρ⋆g.
dz dz
The domain equations are
→
−
µ−
→
v = K −∇p − ρg k
and
∇·−
→
v =0
and therefore
∇2 p = 0
LECTURE 18. TWO STABILITY PROBLEMS 484
Zθ ⋆ Zθ vθ⋆
vz − Zr vr − vθ = Zt = v⋆
z − Zr vr −
r r
and
p − p⋆ = γ2H
The surface is given a small perturbation, viz., Z = Z0 +ε Z1 and your perturbation equations
are then
∇2 p1 = 0 = ∇2 p1⋆
∂p1 ∂p1⋆
=0= at r=R
∂r ∂r
and
∂Z1
= 0 at r=R
∂r
At z = 0 you have
K ∂p1 K ∂p⋆
= vz1 = Z1t = v⋆
1
− z1 = −
µ ∂z µ⋆ ∂z
and
dp0 ⋆ dp⋆
0
p1 + Z 1 − p1 + Z 1 = γ2H1 = γ∇H2Z1
dz dz
Writing
and
where
′
Jm (λR) = 0
you have
pb1 = Ae−λz
and
pb1⋆ = A⋆eλz
K K
Aλ = σ Zb1 = − ⋆ A⋆λ
µ µ
and
A − A⋆ = −γλ2 + (ρ − ρ⋆) g Zb1
whereupon to have a solution A, A⋆ and Zb1 not all zero you find
σ (µ + µ⋆)
= λ −γλ2 + (ρ − ρ⋆) g
K
Your second job is to notice that σ vs λ is one curve, you can sketch it and you can observe
that there is a greatest value of σ. The curve rises due to the kinematic condition then falls
(ρ − ρ⋆) g
due to surface tension crossing zero at λ2 =
γ
Now the allowable λ’s depend on R via
′
Jm (λR) = 0
′ xm
Denoting by xm the solution to Jm (x) = 0 your λ’s are: λ = and you have
R
Jm (x)
J0 (x)
J′1 = 0
J′0 = 0 J′1 = 0
X
J1 (x)
J′0 = 0 J′1 = 0
∆ρg
For a small value of R all the λ2 ’s lie to the right ofand all perturbations are stable. As
γ
R increases they all move leftward but maintain their order. Soon the lowest moves to the
∆ρg
left of and the problem is unstable to the corresponding perturbation. Upon increasing
γ
R we can make any of the λ2 ’s the fastest growing and that λ2 determines the pattern you
see.
LECTURE 18. TWO STABILITY PROBLEMS 487
Your third job is to satisfy yourself that all of the above is true and that first you will see an
m = 1 pattern followed by an m = 0 pattern.
6. A heavy fluid lies above a light fluid. The two fluids are in hydrostatic equilibrium. The
problem is two dimensional. The interface is horizontal, its ends are pinned and the width of
the cell is such that the equilibrium is stable. The volume of the heavy fluid is 2LH.
z=H
→
g
z
x= −L x=L
x
z = − H∗
Now you add heavy fluid and remove light fluid resulting in
x
LECTURE 18. TWO STABILITY PROBLEMS 488
ψ = 0 at x = ±L
and
Z L
ψ dx = 0
−L
Z !
L
1
3/2
ψx2 dx
−L (1 + Z02 (x))
λ2 = Z L
ψ 2 dx
−L
This is called a Rayleigh quotient and the facts about Rayleigh quotients are explained in
Weinberger’s book.
Z L
Every trial function you put into the RHS, satisfying ψ = 0 at x = ±L, ψ dx = 0 gives
−L
an estimate of λ21 lying above the true value of λ21 .
Your job is to satisfy yourself that λ21 in the first picture lies above λ21 in the second.
7. You have a cylinder of arbitrary cross section bounded above and below by parallel horizon-
tal planes, one at z = 0, the other at z = H. The cylinder is filled with a porous solid whose
free space is filled with a liquid.
ρ = ρref 1 − α (T − T ref)
LECTURE 18. TWO STABILITY PROBLEMS 489
The lower plane is at temperature T hot, the upper plane is at temperature T cold. And you
need to take into account only the temperature dependence of the density.
All walls are no flow, the vertical side walls are adiabatic and the upper and lower walls are
isothermal.
The density is unstably stratified, heavy over light; and you are to, first, derive the critical
value of T hot at which flow sets in and then you are to increase T hot slightly beyond its
critical value and begin the process of estimating the steady flow above T hot critical.
The model is
−
→ K →
−
v = − ∇p − ρ g k , ∇·−
→
v =0
µ
and
∂T −
+→
v · ∇T = κ ∇2 T
∂t
where
T = T cold at z=H
T = T hot at z=0
vz = 0 at z = 0, H
and
−
→
n ·−
→
v =0=−
→
n · ∇T at the side walls.
−
→ →
−
v0 = 0
LECTURE 18. TWO STABILITY PROBLEMS 490
and
dT0 T − T cold 0
− = hot 0
dz H
Now set
→ ∂
−
∇= k + ∇H
∂z
and
−
→ → →
−
v = vz k + −
vH
K ∂p
vz = − − ρg
µ ∂z
−
→ K
vH = − ∇H p
µ
∂vz
+ ∇H · →
−
vH = 0
∂z
and
∂T − ∂2
vz +→
vH · ∇H T = κ 2
+ ∇H T
∂z ∂z 2
T = T cold 0, vz = 0 at z=H
LECTURE 18. TWO STABILITY PROBLEMS 491
T = T hot 0, vz = 0
−
→
n · ∇H vz = 0 = −
→
n · ∇H T at the side walls
To find the critical value of T hot 0, impose a small perturbation on the base solution, denote
the perturbation variables by the subscript one and derive the perturbation problem at zero
growth rate, viz.,
∂2
+ ∇H vz1 − ρref gα∇H2 T1 = 0
2
∂z 2
∂2 2 1 dT0
+ ∇H T1 + vz1
∂z 2 κ − dz = 0
where
T1 = 0 = vz1 at z=0=0
and
−
→
n · ∇H vz = 0 = −
→
n · ∇H T at the side walls.
dT0
Your job is to find the values of − at which this problem has solutions other than
dz
vz1 = 0 = T1
∇H2 ψ + λ2 ψ = 0
and
−
→
n · ∇H ψ = 0 at the edge
LECTURE 18. TWO STABILITY PROBLEMS 492
λ21 , λ22 , . . .
ψ1 , ψ2 , . . .
These solutions depend on what the cross section is and if we denote its diameter by d then
the λ2 ’s are multiples of d−2 . Henceforth by not specifying d you can view λ2 as a continuous
variable.
Assuming we have a perturbation in the shape of the eigenfunction ψ we can separate vari-
ables and write
and
T1 = b
T1 (z) ψ
vbz1 = 0 = b
T1 at z = 0, H
nπ
vbz1 = β sin z
H
LECTURE 18. TWO STABILITY PROBLEMS 493
and
b nπ
T1 = sin z
H
n = 1, 2, . . .
dT0
Hence for each value of n you have the critical value of − as a function of λ2 , viz.,
dz
2
n2 π 2
+ λ2
ρref α g dT0 H2
− =
κ dz λ2
dT0 π2
The least critical value of − occurs at n = 1, λ2 = 2 where
dz H
ρref α g dT0 π2
− =4 2
κ dz crit H
and
π2 2
+λ
H2
β=
1 dT0
κ − dz
1 2
T hot = T hot 0 + ε T hot 2
2
1
vz = vz0 + ε vz1 + ε2 vz2
2
and
1 2
T = T0 + εT1 + ε T2
2
In the Petrie dish problem in Lecture 15 you learned that the successful expansion of the
control variable depends on the nonlinearity at hand.
LECTURE 18. TWO STABILITY PROBLEMS 494
Your job is to find out that the expansion above is the correct expansion by proving that
solvability is satisfied at second order. Thus you have
vz0 = 0
z
T0 = T hot 0 + T cold 0 − T hot 0
H
vz1 = Ab
vz1 ψ
and
T1 = A b
T1 ψ
where T hot 0 is the critical value of T hot, ψ corresponds to the critical value of λ2 and vbz1
and b
T are known from above.
1
This can be found at third order if you can get through the second order problem without
finding that A = 0 which would tell you that your expansion of T hot is not correct.
T2 = 0 at z=H
and
T2 = T hot 2 at z=0
LECTURE 18. TWO STABILITY PROBLEMS 495
And
z
vz2 = 0, T2 = T hot 2 1 −
H
2−
→
v1 · ∇ T1
is a solution if is zero, so henceforth we set
κ
T2 = 0 at z=0
vz2 0
L = −
2→
v1 · ∇ T1
T2
κ
and
vz2 = 0 = T2 at z = 0, H
The corresponding homogeneous problem has a nonzero solution, hence a solvability condi-
tion must be satisfied, viz.,
T
Z Z
H v⋆0
z1
dz dx dy −
2→
v1 · ∇ T1
=0
0 A T1⋆
κ
where
v⋆ 0
L⋆
z1
=
T1⋆ 0
and
v⋆ ⋆
z1 = 0 = T1 at z = 0, H
LECTURE 18. TWO STABILITY PROBLEMS 496
πz
v⋆ ⋆
z1 = β sin ψ
H
and
πz
T1⋆ = sin ψ
H
And then you can conclude that solvability is satisfied at second order due entirely to the z
integration, viz.,
Z H
πz πz πz
sin sin cos dz = 0
0 H H H
Thus, if we had to, we could go on to third order in the expectation of finding A as a function
of T hot 2. This is the way the Petrie dish problem worked out for the cubic nonlinearity
Lecture 19
As we now know, using the method of separation of variables to solve the eigenvalue problem
(∇2 + λ2 ) ψ = 0 reduces this problem to the solution of ordinary differential equations. In this
lecture we present the elementary facts about second order, linear, ordinary differential equations.
d2 u du
Lu = a (x) 2
+ b (x) + c (x)
dx dx
497
LECTURE 19. ORDINARY DIFFERENTIAL EQUATIONS 498
d2 v dv
Integrating ua 2 twice by parts and ub once, we can write u, Lv as
dx dx
h i1
u, Lv = a {uv ′ − u′ v} − {a′ − b} uv + L∗u, v
0
da
where a′ denotes , etc. By doing this we introduce the operator L∗, associated to L and called
dx
its adjoint, where
d2 d
L∗u = (au) − (bu) + cu
dx2 dx
Lu = f, 0<x<1
where f (x) is an assigned function on the interval (0, 1). To complete the specification of the
problem two boundary conditions must be assigned. Most of the boundary conditions of physical
interest can be taken into account by assigning values to two linear combinations of u and u′ at the
boundary, i. e., to two linear combinations of u (x = 0), u′ (x = 0), u (x = 1), u′ (x = 1). This
includes both initial value and boundary value problems. We limit ourselves to unmixed boundary
value problems and write the boundary conditions
at x = 0 : B0 u = a0 u + b0 u′ = g0
and
at x = 1 : B1 u = a1 u + b1 u′ = g1
Occasionally periodic conditions are imposed and these are of mixed type, viz.,
u (x = 0) − u (x = 1) = 0
LECTURE 19. ORDINARY DIFFERENTIAL EQUATIONS 499
and
u′ (x = 0) − u′ (x = 1) = 0
A problem then is defined by the operators L, B0 and B1 and assigned sources f (x), g0 and
g1 . Associated with this is the adjoint problem and to formulate the adjoint problem we need to
identify the adjoint boundary operators B0∗ and B1∗ that go with the adjoint differential operator
L∗. To do this write
h i1
u, Lv − L∗u, v = a uv ′ − u′ v − a′ − b uv
0
and define B0∗ and B1∗ such that B0∗u = 0 = B1∗u and B0 v = 0 = B1 v imply
h i1
a uv ′ − u′ v − a′ − b uv = 0
0
Lu = f, 0<x<1
and
B0 u = g0 , B1 u = g1
has a solution and, if it does, to decide what it is. Naimark’s book “Linear Differential Operators”
deals with this, and more, but we do not need very many of Naimark’s results as we deal only with
second order differential operators and in this case a simplification obtains by which we need deal
only with self adjoint differential operators.
LECTURE 19. ORDINARY DIFFERENTIAL EQUATIONS 500
To see why this is so, observe that Lv, L∗u and uLv − L∗uv are
Lv = av ′′ + bv ′ + cv
and
uLv − L∗uv = a (uv ′ − u′ v) − (a′ − b) uv
′
Then if b = a′ we get
L∗ = L
and
′
uLv − Luv = a (uv ′ − u′ v)
u (x = 0) v ′ (x = 0) − u′ (x = 0) v (x = 0) = 0
Likewise if B1 u = 0 = B1 v, then
u (x = 1) v ′ (x = 1) − u′ (x = 1) v (x = 1) = 0
1
This tells us that if B0∗ = B0 and B1∗ = B1 then a (uv ′ − u′ v) 0 = 0.
LECTURE 19. ORDINARY DIFFERENTIAL EQUATIONS 501
and
L∗ = L
B0∗ = B0
and
B1∗ = B1
When this is so, a problem defined by L, B0 and B1 is called self adjoint in the plain vanilla inner
product (L is called self adjoint if L∗ = L). We get all this by requiring only b = a′ , but it must
be observed that we have assumed special forms for B0 and B1 , yet none of this depends on the
values assigned to a0 , b0 , a1 and b1 .
The condition b = a′ is important for two reasons. The first is that all of the ordinary differential
1
operators coming from ∇2 on separation of variables can be written as times a self adjoint
w (x)
operator and hence are themselves self adjoint in the inner product
Z 1
u, v = uvw dx
0
The second is that any second order linear differential operator, viz.,
Lu = au′′ + bu′ + cu
can be written
a d du c
Lu = d + du
d dx dx a
LECTURE 19. ORDINARY DIFFERENTIAL EQUATIONS 502
where
Z x
b
dξ
d (x) = e x0 a
d
and hence is self adjoint in a weighted inner product with w = . Henceforth then we will assume
a
that L is self adjoint in the plain vanilla inner product and write
d du
Lu = p − qu
dx dx
and
d
uLv − Luv = p (uv ′ − u′ v)
dx
Two functions u and v are linearly dependent if and only if their Wronskian vanishes.
d u 0 1 u
=
p p
′
dx u ′ − u ′
q p
and then observe, as we discovered in Lecture 2, that if u and v are any two solutions of this
LECTURE 19. ORDINARY DIFFERENTIAL EQUATIONS 503
This tells us that if u and v satisfy Lu = 0 then their Wronskian, multiplied by p, remains constant,
i.e., pW = const. This is also a simple consequence of the formula
′
uLv − Luv = p (uv ′ − u′ v) = (pW )′ . So if p is not zero, then W is either always zero or
never zero. And if pW is not zero and p → 0 as x → x0 then W → ∞ as x → x0 and at least one of
u and v does not remain bounded as x → x0 .
We can now write the general solution, i.e., no end conditions, to Lu = f in terms of the solutions
to Lu = 0. As Coddington and Levinson explain, there are always two independent solutions to
Lu = 0 and every other solution can be expresses as a linear combination of any two independent
solutions. We let u1 and u2 denote two independent solutions of Lu = 0, then Lu1 = 0 = Lu2 ,
W = u1 u′2 − u′1u2 does not vanish, pW is a nonzero constant and
Z x Z x
1
u0 = −u1 (x) u2 (y) f (y) dy + u2 (x) u1 (y) f (y) dy
pW 0 0
u = u 0 + c1 u 1 + c2 u 2
Lu = 0
and
B0 u = 0, B1 u = 0
u = c1 u 1 + c2 u 2
and observe that W (x = 0) is not zero and that a0 and b0 cannot be both zero.
Hence we always have one non zero solution, and it is the only independent solution, to Lu = 0,
B0 u0 = 0. Likewise, there is one independent solution to Lu = 0, B1 u = 0. And all this is true no
matter the value of D. Then if D is zero, we have one independent solution of Lu = 0, B0 u = 0 ,
B1 u = 0. If u1 is this solution then neither B0 u2 nor B1 u2 can be zero.
These results depend on the boundary conditions being unmixed as both cos 2πx and sin 2πx
satisfy
d2 2
+ 4π u = 0
dx2
u (0) = u (1)
and
u′ (0) = u′ (1)
Thus we have
B0 u0 + c1 B0 u1 + c2 B0 u2 = g0
and
B1 u0 + c1 B1 u1 + c2 B1 u2 = g1
and hence
B0 u1 B0 u2 c1 g0 − B0 u0
=
B1 u1 B1 u2 c2 g1 − B1 u0
and, as
Z x Z x
1
u0 = −u1 u2 f dy + u2 u1 f dy
pW 0 0
and
Z x Z x
1
u′0 = −u′1 u2 f dy + u′2 u1 f dy
pW 0 0
we have
Z x Z x
1
Bu0 = −Bu1 u2 f dy + Bu2 u1 f dy
pW 0 0
Now, if D is not zero, the constants c1 and c2 can be determined uniquely and our problem has
a unique solution; otherwise, to have a solution, a solvability condition must be satisfied and if the
LECTURE 19. ORDINARY DIFFERENTIAL EQUATIONS 507
solvability condition is satisfied, the solution is not unique. The solvability condition for u is the
solvability condition for c1 and c2 .
Whether D is zero or not depends only on L, B0 and B1 . It does not depend on how we select
the two independent solutions of Lu = 0 denoted u1 and u2 .
Lu = f
and
B0 u = g0 , B1 u = g1
Lu = 0
and
B0 u = 0, B1 u = 0
To determine this unique solution we first must find two independent solutions of Lu = 0.
Denoting these u1 and u2 where W = u1 u′2 − u′1 u2 , W (u1 , u2 ) 6= 0 and pW = constant 6= 0, we
evaluate D. Then we write the solution
u = u 0 + c1 u 1 + c2 u 2
LECTURE 19. ORDINARY DIFFERENTIAL EQUATIONS 508
where
Z x Z x
1
u0 = −u1 u2 f dy + u2 u1 f dy
pW 0 0
and where
c1 B1 u2 −B0 u2 g − B0 u0
= 1 0
c2 D −B1 u1 B0 u1 g1 − B1 u0
B0 u0 = 0
and
B1 u2 B1 u1
B1 u0 = u1 , f − u2 , f
pW pW
B0 u1 = 0 = B1 u2
for then
g1 1
c1 = + u2 , f
B1 u1 pW
and
g0
c2 =
B0 u2
LECTURE 19. ORDINARY DIFFERENTIAL EQUATIONS 509
The solvability condition for u is the solvability condition for c1 and c2 . Now when D = 0, the
rank of
B0 u1 B0 u2
B1 u1 B1 u2
is one and the solvability condition for c1 and c2 is simply the requirement that the rank of
B0 u1 B0 u2 g0 − B0 u0
B1 u1 B1 u2 g1 − B1 u1
If one of these determinants is zero then so is the other as their first columns are dependent due to
D = 0.
B1 u2
where B0 u0 = 0 and B1 u0 = u1 , f . Hence we have
pW
B0 u2 B1 u2
B0 u2 g1 − B1 u2 g0 = u1 , f
pW
or
(pW )1 (pW )0
g1 − g0 = u1 , f
B1 u2 B0 u2
This, the simplest expression of the solvability condition, does not depend on how u2 is selected
once u1 is set, so long as u1 and u2 are independent. Indeed, because W is the Wronskian of u1
W1
and u2 and B1 u1 = 0, is unchanged if u2 is replaced by a linear combination of u1 and u2 .
B1 u2
W0
The same is true of .
B0 u2
u1 , f =0
Lu = f
and
B0 u = 0, B1 u = 0
Lu = 0
and
B0 u = 0, B1 u = 0
LECTURE 19. ORDINARY DIFFERENTIAL EQUATIONS 511
g0 g1 − B1 u0
c2 = =
B0 u2 B1 u2
u0 + c2 u2 + cu1
cu1
and
u (x = 0) = 0, u (x = 1) = 0
d2
has a solution. The differential operator L = + a2 is self adjoint in the plain vanilla inner
dx2
1
product and the functions u1 = sin ax and u2 = cos ax are two independent solutions of Lu = 0.
a
As B0 u = u and B1 u = u we find that D is
−1
D= sin a
a
LECTURE 19. ORDINARY DIFFERENTIAL EQUATIONS 512
and hence that D is not zero unless a2 = π 2 , 22 π 2 , . . .. So for each value of a2 other than
π 2 , 22 π 2 , . . . this problem has a solution for all functions f (x). But if a2 = n2 π 2 , where n is
a positive integer, the problem has a solution if and only if the function f (x) satisfies
Z 1
sin nπxf (x) dx = 0
0
We suppose D is not zero so that our problem has a unique solution. We denote the two inde-
pendent solutions of Lu = 0 by u1 and u2 where W (u1 , u2 ) 6= 0. The simplest result obtains
if
Lu1 = 0, B0 u1 = 0
and
Lu2 = 0, B1 u2 = 0
for then
c1 B1 u2 −B0 u2 g − B0 u0
= 1 0
c2 D −B1 u1 B0 u1 g1 − B1 u1
where
B0 u0 = 0
and
Z 1 Z 1
1
B1 u0 = −B1 u1 u2 f dy + B1 u2 u1 f dy
pW 0 0
LECTURE 19. ORDINARY DIFFERENTIAL EQUATIONS 513
c1
and is simply
c2
c1 0 g
−B0 u2 0
= 1 Z 1
D 1
c2 −B1 u1 0 g1 + B1 u1 u2 f dy
pW 0
where D = −B1 u1 B0 u2 .
and
c2 = 0
where g is called the Green’s function for our problem and we have
1
pW u1 (y) u2 (x) , y<x
g (x, y) =
1
u1 (x) u2 (y) , x<y
pW
Lg = 0, B0 g = 0, x<y
LECTURE 19. ORDINARY DIFFERENTIAL EQUATIONS 514
and
Lg = 0, B1 g = 0, y<x
where
g (x, y) − g (x, y) = 0
x → y+ x → y−
and
∂g ∂g 1
(x, y) − (x, y) =
∂x ∂x p (y)
x → y+ x → y−
Hence we can find g by solving the foregoing equations but we see that g is not an ordinary
∂2g
solution to Lg = 0 ( as are u1 and u2 ) because 2 does not exist at x = y.
∂x
As an example, to solve
d2 u
= f, 0<x<1
dx2
and
u (x = 0) = 0 = u (x = 1)
we can write
Z 1
u (x) = g (x, y) f (y) dy
0
LECTURE 19. ORDINARY DIFFERENTIAL EQUATIONS 515
d2 g
= 0, 0<x<y
dx2
g (x = 0, y) = 0
d2 g
= 0, y<x<1
dx2
g (x = 1, y) = 0
g y+, y − g y−, y = 0
and
∂g + ∂g −
y ,y − y ,y = 1
∂x ∂x
whence
g (x, y) = −x (1 − y) , 0<x<y
= −y (1 − x) , y<x<1
The reader can continue and determine how to use g (x, y) to write the solution if g0 and g1 are
not zero. This will complete the introduction to the Green’s function when D is not zero. When
D is zero it is possible to introduce a generalized Green’s function and it is possible to do this in a
way that produces a best approximation when a solution cannot be determined. We do not go into
this but the main ideas can be found back in Lecture 4.
LECTURE 19. ORDINARY DIFFERENTIAL EQUATIONS 516
u = u0 + c1 + c2 (ln x)
and the requirement that u remain bounded as x → 0 replaces the boundary condition B0 u = g0 . It
is satisfied if and only if c2 = 0.
Using this the reader can determine the Green’s function for this problem and show that it
satisfies
Lg = 0, 0<x<y
g (x = 0, y) finite
Lg = 0, y<x<1
B1 g = 0
LECTURE 19. ORDINARY DIFFERENTIAL EQUATIONS 517
g y+, y − g y−, y = 0
and
∂ ∂ 1
g y +, y − g y −, y =
∂x ∂x y
The solution to
Lu = xf
u finite at x = 0
and
B1 u = 0
is then
Z 1
u (y) = xf (x) g (x, y) dx
0
Note:
In §19.9 we introduce the delta function. We can use it to define the Green’s function via
Lg = δ (x − y)
g (x, y) finite at x = 0
and
B1 g = 0
LECTURE 19. ORDINARY DIFFERENTIAL EQUATIONS 518
The short answer for us is nothing. This is, more or less, the first place where the symbol δ (x) has
been used. But we have made great use of the integration by parts formula
Z 1 h i1 Z 1
dg df
f dx = f g − g dx
0 dx 0 0 dx
and we have not inquired whether this use is justified. To see why there might be a question let f
be smooth and suppose first that g is continuous but that its derivative is not. For instance let g and
dg
be
dx
dg
g dx 1
1 1−ξ
1−ξ
1 x−ξ
1−ξ
0 0
0 0
X X
0 ξ 1 0 ξ 1
dg
As LHS = RHS we see that the discontinuity in does not require us to give up our integration
dx
by parts formula.
dg
But suppose that g and are now
dx
dg
g dx
1
1
0 0 0
0 0
X X
0 ξ 1 0 ξ 1
whereas the left hand side is ambiguous. If we suppose it to be zero then our integration by parts
formula is incorrect. To account for the jump in g and to enlarge the class of functions for which
our formula is correct we simply let the right hand side define the left hand side whenever the left
hand side is ambiguous.
LECTURE 19. ORDINARY DIFFERENTIAL EQUATIONS 520
Then
Z b
f (x) δ (x − x0 ) dx = f (x0 ) , a < x0 < b
a
dg
and so if we write = δ (x − ξ) in the second example we have
dx
LHS = f (ξ)
What is really useful about this notation is that in terms of it the introduction of Green’s func-
tions can be formalized and simplified because we can now use our integration by parts formula to
evaluate terms such as
Z
uLg dx
where g is a Green’s function. Its derivative takes a jump at a point where Lg is not defined. This
integral then is like that in the second example just above.
Lu = f (x) , 0<x<1
and
u (x = 0) = 0, u (x = 1) = 0
LECTURE 19. ORDINARY DIFFERENTIAL EQUATIONS 521
Lg = δ (x − ξ)
and
g (x = 0, ξ) = 0, g (x = 1, ξ) = 0
Then as
Z 1 n o h i1
′ ′
gLu − uLg dx = p {gu − ug } =0
0 0
we have
Z 1
u (ξ) = g (x, ξ) f (x) dx
0
To take a more interesting example, suppose that heat is released in a thin layer of fluid of width
L bounded on either side by large reservoirs to which heat can be rejected. Where the fluid is
in contact with the reservoirs, the fluid temperature is held at the reservoir temperature, T0 . Our
problem is to determine whether or not the heat released can be balanced by heat conduction to
the reservoirs. In the simplest case we get an interesting problem by assuming that the local heat
source is autothermal with the temperature dependence of its rate given by the Arrhenius formula.
Then scaling distance by the width of the fluid layer our problem is to determine u so that
u
d2 u γ
1+u = 0
+ λ e
dx2
LECTURE 19. ORDINARY DIFFERENTIAL EQUATIONS 522
and
u (x = 0) = 0, u (x = 1) = 0
T − T0
are satisfied where u = , γ is a scaled activation energy and λ is a multiple of L2 .
T0
This is not the kind of problem we have been talking about. The source is not specified in
advance but instead is a function of the solution and a non-linear function at that. Nonetheless we
can determine the Green’s function to be
g (x, y) = −x (1 − y) , 0<x<y
= −y (1 − x) , y<x<1
Now we have not solved our problem, we have simply put it in another form. But this form is
useful as −g (x, y) is a non-negative function and using this integral equation it is easy to construct
both bounds on and approximations to u (x).
Writing L, B0 and B1 as
d du
Lu = p − qu
dx dx
B0 u = a0 u + b0 u′ at x = 0
and
B1 u = a1 u + b1 u′ at x = 1
LECTURE 19. ORDINARY DIFFERENTIAL EQUATIONS 523
we have
d dv du dv
uLv = up −p − quv
dx dx dx dx
and
d n o d
uLv − Luv = p (uv ′ − u′v) = {pW }
dx dx
Now, assuming a0 and b0 are not both zero and a1 and b1 are not both zero, B0 u = 0 = B0 v and
B1 u = 0 = B1 v imply
W0 = {uv ′ − u′ v} (x = 0) = 0
and
W1 = {uv ′ − u′ v} (x = 1) = 0
Z 1 1
{uLv − Luv} dx = pW =0
0 0
Z 1 1 Z 1
′
uLv dx = puv − {pu′v ′ + quv} dx
0 0 0
and
Z 1 1
′ ′
{uLv − Luv} dx = {p (uv − u v)}
0 0
LECTURE 19. ORDINARY DIFFERENTIAL EQUATIONS 524
Lψ + λ2 ψ = 0, 0<x<1
and
B0 ψ = 0 = B1 ψ
where the eigenfunctions are the non zero solutions, ψ, and the corresponding values of λ2 are the
eigenvalues.
Lu + λ2 u = 0
has two independent solutions denoted u1 (λ2 ) and u2 (λ2 ), where W u1 (λ2 ) , u2 (λ2 ) , does not
vanish. The general solution of (L + λ2 ) ψ = 0 is then
ψ = c1 u 1 λ 2 + c2 u 2 λ 2
D λ2 = B0 u1 λ2 B1 u2 λ2 − B1 u1 λ2 B0 u2 λ2
is an eigenvalue for then c1 and c2 , not both zero, can be determined so that
2 2
B0 u1 (λ ) B0 u2 (λ ) c1 0
=
B1 u1 (λ2 ) B1 u2 (λ2 ) c2 0
and hence a solution to the eigenvalue problem other than ψ = 0can be obtained. To each eigen-
2 2
B0 u1 (λ ) B0 u2 (λ )
value there will be one independent eigenfunction as the rank of can-
2 2
B1 u1 (λ ) B1 u2 (λ )
not be zero. The eigenvalues will be isolated as D will ordinarily be an analytic function of λ2 and
LECTURE 19. ORDINARY DIFFERENTIAL EQUATIONS 525
hence its zeros will be isolated. And D will ordinarily have infinitely many zeros.
Z Z ( )
1 h i1 1
dψ
2
−λ2 | ψ|2 dx = p ψ ψ′ − p + q | ψ|2 dx
0 0 0 dx
and likewise
0, b1 = 0
p ψ ψ ′ (x = 1) = a
− 1 p | ψ|2 (x = 1) , otherwise
b1
The second formula then shows that if the eigenfunctions ψ1 and ψ2 correspond to distinct
eigenvalues, λ21 and λ22 , we have
Z 1
ψ1 , ψ2 = ψ1 ψ2 dx = 0
0
Note: The eigenvalues are real and to each there is one independent eigenfunction. So we take
the eigenfunction to be a real valued function. Under other boundary conditions it may be of
some advantage to indroduce complex valued eigenfunctions. Then if ψ is an eigenfunction
so also Re ψ, Im ψ and ψ.
LECTURE 19. ORDINARY DIFFERENTIAL EQUATIONS 526
1
Lψ + λ2 ψ = 0
w
In using the integration by parts formulas to again determine that the eigenvalues are real and not
negative, we now put Lψ = −λ2 wψ instead of Lψ = −λ2 ψ as above. Then in determining the
sign of λ2 we use
Z Z ( )
1 h i1 1
dψ
2
−λ2 w | ψ|2 dx = p ψ ψ′ − p + q | ψ|2 dx
0 0 0 dx
1
which is not unexpected as L is self-adjoint in the inner product
w
Z 1
u, v = uvw dx
0
The simplest way to decide whether or not a problem has a solution is to determine if all of the
steps required to write the solution can be carried out.
Lψ + λ2 ψ = 0, 0<x<1
LECTURE 19. ORDINARY DIFFERENTIAL EQUATIONS 527
and
B0 ψ = 0, B1 ψ = 0
Lu = f, 0<x<1
and
B0 u = g0 , B1 u = g1
we write
X
u= ci ψi
and try to determine the coefficients c1 , c2 , · · · in this expansion. To find the equation satisfied by
the coefficient ci = h ψi , u i, we multiply Lu = f by ψi and integrate over 0 ≤ x ≤ 1, getting
Z 1 Z 1
ψi Lu dx = ψi f dx
0 0
Then as
Z 1 h i1 Z 1
ψi Lu dx = pW + Lψi u dx
0 0 0
h i1
′
ψi , f = p {ψi u − ψi′ u} − λ2i ci
0
This is the result we need. It tells us this: in the expansion of u the coefficient of an eigenfunc-
tion corresponding to a non vanishing eigenvalue has one, and only one, value. But the coefficient
LECTURE 19. ORDINARY DIFFERENTIAL EQUATIONS 528
h i1
ψ1 , f 6= p {ψ1 u′ − ψ1′ u}
0
h i1
′
ψ1 , f = p {ψ1 u − ψ1′ u}
0
−g0 ψ1′ (x = 0)
a0 =
W0
and
g0 ψ1 (x = 0)
b0 =
W0
Likewise we get
−g1 ψ1′ (x = 1)
a1 =
W1
and
g1 ψ1 (x = 1)
b1 =
W1
LECTURE 19. ORDINARY DIFFERENTIAL EQUATIONS 529
Then if a0 and a1 are not zero we can write the solvability condition as
g1 g0
ψ1 , f = −p (x = 1) ψ1′ (x = 1) + p (x = 0) ψ1′ (x = 0)
a1 a0
Our model is
d2 cA
− k cA cB = 0
dx2
d2 cB
− k cA cB = 0
dx2
where
dcA dcB
cA (x = 0) = cA⋆, (x = 1) = 0, (x = 0) = 0, cB (x = 1) = c⋆
B
dx dx
and
where cA0 = cA⋆, cB0 = cB⋆ and cA1, cA2, . . . , cB1, cB2, . . . satisfy
d2 cA1
− cA0 cB0 = 0
dx2
d2 cB1
− cA0 cB0 = 0
dx2
where
dcA1 dcB1
cA1 (x = 0) = 0, (x = 1) = 0, (x = 0) = 0, cB1 (x = 1) = 0
dx dx
and
d2 cA2
− cA0 cB1 − cA1 cB0 = 0
dx2
LECTURE 19. ORDINARY DIFFERENTIAL EQUATIONS 531
where
dcA2
cA2 (x = 0) = 0, (x = 1) = 0
dx
etc.
Derive the Green’s function for the cA1, cA2, etc. problems and for the cB1, cB2, etc.
problems. Use these Green’s functions to find the first few terms in the two expansions.
d2 u
0= + λ2 (1 + u)
dx2
where
u (x = −1) = 0 = u (x = +1)
by adding to the particular solution u0 = −1 the general solution to the homogeneous equa-
tion
A cos λx + B sin λx
d2 u
0= 2
+ λ2 u
dx
LECTURE 19. ORDINARY DIFFERENTIAL EQUATIONS 532
where
u (x = −1) = 0 = u (x = +1)
A cos λx + B sin λx
cos λ sin λ
To all values of λ such that (cos λ sin λ) 6= 0 the constants A and B are both zero and
u = 0 is the only solution to the homogeneous problem. Our problem then has a unique
solution.
1 3
To all values of λ such that cos λ = 0, i.e., to
π, π, . . ., the constant B must be zero
2 2
but A is indeterminate and the homogeneous problem has solutions A cos λx. Our problem
then requires that a solvability condition be satisfied. Show that it is not satisfied.
To all values of λ such that sin λ = 0, i.e., to 0, π, . . ., the constant A must be zero but
B is indeterminate and the homogeneous problem has solutions B sin λx. When λ = 0 this
is 0 but otherwise our problem again requires that a solvability condition be satisfied. Show
that it is satisfied.
π 3π
So when λ = , , . . . our problem has no solution; when λ = π, . . . it has a solution
2 2
but it is not unique.
LECTURE 19. ORDINARY DIFFERENTIAL EQUATIONS 533
∂u ∂2u
= 2 + λ2 (1 + u)
∂t ∂x
where
u (x = −1) = 0 = u (x = +1)
1
and where u (t = 0) > 0 is assigned. Do this when λ = π and π.
2
3. You have seen how the Green’s function can be used to solve
Lu = f, B0 u = 0, B1 u = 0
use it to solve
Lu = 0, B0 u = g0 , B1 u = 0
4. We have
′
Lu = (pu′ ) − qu
and therefore
1 p′ q
Lu = u′′ + u′ − u
p p p
1
Show that L is self adjoint in the inner product
p
Z 1
h u, v i = uvp dx
0
LECTURE 19. ORDINARY DIFFERENTIAL EQUATIONS 534
5. We present three frictional heating problems for your enjoyment, all in cylindrical coordi-
nates. In each case the temperature depends only on r as does the only non zero velocity
component. The viscosity depends on temperature via
µ (Twall)
= 1 + β (T − Twall)
µ (T )
And in each case a stress component is specified. The problems are written in terms of a
scaled temperature
β (T − Twall)
R1
z
R0
T (r = R1 ) = Twall
dT
(r = R0 ) = 0
dr
Trz (r = R0 ) input
d2 T 1 dT 2 1
+ + λ (1 + T ) = 0
dr 2 r dr r2
LECTURE 19. ORDINARY DIFFERENTIAL EQUATIONS 535
β
λ2 = T 2 (r = R0 ) R02
k µwall rz
R0 z
T (r = R0 ) = Twall
T (r = 0) bounded
dp
input
dz
Our problem is
d2 T 1 dT
2
+ + λ2 r 2 (1 + T ) = 0
dr r dr
2
2 β 1 dp
λ =
k Twall 4 dz
R0
R1
T (r = R1 ) = Twall
dT
(r = R0 ) = 0
dr
Tr θ (r = R0 ) input
Our problem is
d2 T 1 dT 1
2
+ + λ2 4 (1 + T ) = 0
dr r dr r
β
λ2 = Tr2 θ (r = R0 ) R04
k µwall
Your job is to solve these problems for increasing values of λ2 and to find the value of
λ2 at which T becomes unbounded.
d2 ψ 1 dψ
+ + λ2 r 2 ψ = 0
dr 2 r dr
or
d2 ψ 1 dψ 1
2
+ + λ2 4 ψ = 0
dr r dr r
LECTURE 19. ORDINARY DIFFERENTIAL EQUATIONS 537
ψ = J0 (µr p )
where in the second problem p = 2 and in the third p = −1 and where µ is to be found.
This is not true of the first problem which, however, is a Bernoulli equation.
µwall
= 1 + β (T − Twall)
µ (T )
The fluid lying between two fixed plane walls at x = L and x = −L⋆ is sheared by a plane
wall at x = 0 moving to the right.
⋆ are constants. But they are not independent. Our problem, in
The stresses Txz and Txz
scaled temperature, is
d2 T
2
+ λ2 (1 + T ) = 0, T (x = L) = 0
dx
d2 T⋆ 2
2
+ λ⋆ 1 + T⋆ = 0, T x = −L⋆ = 0
dx
where
β 2 β 2
λ2 = T2 , λ⋆ = T⋆xz
k µwall xz k µwall
and where
T (x = 0) = T⋆ (x = 0)
and
dT dT⋆
(x = 0) = (x = 0)
dx dx
LECTURE 19. ORDINARY DIFFERENTIAL EQUATIONS 538
Because the speed of the moving wall is common, Txz and T⋆xz are not independent.
Show that Txz and T⋆xz have opposite signs and
Z L Z 0
−Txz (1 + T ) dx = T⋆xz 1 + T⋆ dx
0 −L ⋆
Assume Txz to be the input variable, determine the value of λ2 at which the temperatures
become infinite.
dT
First do the T problem assuming T (x = 0) = 0 and then (x = 0) = 0, before
dx
solving the T, T⋆ problem.
Lecture 20
We have seen in some of our earlier examples that the information we can get out of the eigenvalues
and the eigenfunctions is interesting in itself whether or not we plan to use them in an infinite series.
Of the nine one-dimensional problems arising upon separating variables in our three simple
coordinate systems only problems (6), (8) and (9) might be unfamiliar, all the others are of the
form
d2 X
2
+ α2 X = 0
dx
X = A sin αx + B cos αx
539
LECTURE 20. EIGENVALUES AND EIGENFUNCTIONS OF ∇2 540
where what we do next depends on the boundary conditions that must be satisfied.
The solutions of problems (6), (8) and (9) can be obtained by a method due to Fuchs and
Frobenius which we will explain when we come to problem (8).
But first we can present another example of the importance of the eigenvalues themselves by
looking at the simplest quantum mechanical problem.
The rules for writing Schrödinger’s equation and the rules for interpreting its solutions are the
postulates of quantum mechanics. They cannot be proved; they can only be shown to lead to
conclusions that either agree or do not agree with experimental results. We obtain Schrödinger’s
equation in Cartesian coordinates, for a system of particles, if in the classical formula
H =T +V =E
where
1 n 2 o 1 n 2 o
T = p x + p2y + p2z + p x + p2y + p2z + · · ·
2m1 1 1 1 2m2 2 2 2
h ∂ h ∂ h ∂
we replace px , py , . . . by , , · · · and E by − , and then introduce a
1 1 2πi ∂x1 2πi ∂y1 2πi ∂t
function Ψ for these differential operators to act on. It is easy to see how ∇2 turns up. Indeed
∇2 turns up for each particle and if we have N particles of the same mass we turn up ∇2 in a 3N
dimensional space.
We can restrict a particle having only kinetic energy to a box 0 < x < a, 0 < y < a,
0 < z < a by setting V = 0 inside the box and V = ∞ outside the box. Then Schrödinger’s
equation for this particle is
2
1 h h ∂Ψ
∇2 Ψ = −
2m 2πi 2πi ∂t
2
1 h
∇2 ψ = Eψ
2m 2πi
LECTURE 20. EIGENVALUES AND EIGENFUNCTIONS OF ∇2 541
where ψ = 0 on the boundary of the box. It determines the allowable values of the energy of the
particle in its stationary states. These values are important for they are the possible outcomes of a
measurement of the energy of the particle. We introduce
8π 2 m
λ2 = E
h2
∇2 ψ + λ 2 ψ = 0
d2 X
+ kx2 X = 0
dx2
X (x = 0) = 0 = X (x = a)
d2 Y
2
+ k 2y Y = 0
dy
Y (y = 0) = 0 = Y (y = a)
and
d2 Z
+ k 2z Z = 0
dz 2
Z (z = 0) = 0 = Z (z = a)
where λ2 = k 2x + k 2y + k 2z , where
π π π
kx = n , ky = n , kz = n
a x a y a z
π2 n 2 o
λ2 = n x + n2
y + n2
z
a2
LECTURE 20. EIGENVALUES AND EIGENFUNCTIONS OF ∇2 542
and
h2 2 h2 n 2 2 2
o
E= λ = n x + n y + n z
8π 2 m 8a2 m
The points nx, ny, nz , where nx, ny and nz = 1, 2, . . .. lie at the nodes of a cubic lat-
tice in the positive octant of quantum number space. To each point of the lattice there corre-
p2
sponds an eigenfunction, i.e., a quantum mechanical state. The energy of the state is , where
n o 2m
2
h h
p~ = nx~ix + ny ~jy + nz ~kz , and this is times the distance of the lattice point from the
2a 8ma2
origin.
We can make this example a little more interesting by assuming that the particle inside the box
is in a uniform gravitational field, ~g = −g~k. Then we have V = mgz and our problem is
2
1 h
∇2 ψ + mgzψ = Eψ
2m 2πi
or
2 2m2 2
∇ ψ+ λ − 2 z ψ=0
h
2
22mE h 2m g 1
where λ = , h = and = . Separating variables leads to the same X and
h2 2π h2 L3
Y problems as before but the Z problem is now
d2 Z 2 2m2 g
+ kz − z Z=0
dz 2 h2
where Z (z = 0) = 0 = Z (z = a). Again λ2 = kx2 + ky2 + kz2 but now the values of kz2 are new.
We can write the solution to our new Z problem in terms of Airy functions, cf., Bender and
Orzag, “Advanced Mathematical Methods for Engineers and Scientists.”
In terms of the Fuchs, Frobenius classification, the point x = 0 is an ordinary point of the
LECTURE 20. EIGENVALUES AND EIGENFUNCTIONS OF ∇2 543
d2 y
+y =0
dx2
and
d2 y
− xy = 0
dx2
The second equation is called Airy’s equation and two of its independent solutions are denoted
Ai (x) and Bi (x) where these are the names of Taylor series about x = 0 having infinite radii of
convergence.
1.00
Bi( x )
0.75
0.50
0.25
Ai( x )
0.00 x
− 0.25
− 0.50
− 10 −5 0 5
− 15
A sketch of Ai (x) and Bi (x) is presented above and we see that the Airy functions look like
trigonometric functions to the left, exponential functions to the right.
2m2 g
Introducing a characteristic length, denoted β, where β 3 = 1, and replacing z by βz we
h2
have
d2 Z
2
+ β 2 kz2 − z Z = 0
dz
LECTURE 20. EIGENVALUES AND EIGENFUNCTIONS OF ∇2 544
whence
Z = c1 Ai z − β 2 kz2 + c2 Bi z − β 2 kz2
Thus, in order that the values of c1 and c2 are not both zero, kz2 must satisfy
a a
Ai −β 2 kz2 Bi − β 2 kz2 − Ai − β kz Bi −β 2 kz2 = 0
2 2
β β
Solute Dispersion
Solute dispersion refers to the longitudinal spreading of a solute introduced into a flowing solvent
stream. We have seen a simple example of this in our model of a chromatographic separation
where we observed that the longitudinal dispersion of solute is due to a transverse variation of the
solvent velocity.
For a solvent in straight line flow in a pipe our problem is at least two dimensional. One
dimensional models are called dispersion equations and the coefficients appearing in these models
are called dispersion coefficients.
Assuming the process takes place in a pipe of circular cross section, we use cylindrical coordi-
nates and ∇2 in cylindrical coordinates is what this lecture is about.
Suppose a solvent is in straight line flow in a long straight pipe of circular cross section. At
time t = 0 a solute is injected into the solvent, its initial concentration being denoted c (t = 0).
Our job is to estimate its concentration some time later.
We align the z-axis with the axis of the pipe and denote the diameter of the pipe by 2a. Then
the solute concentration satisfies
∂c ∂c
+ vz = D∇2 c, θ ≤ r ≤ a, 0 ≤ θ < 2π, −∞ < z < ∞
∂t ∂z
LECTURE 20. EIGENVALUES AND EIGENFUNCTIONS OF ∇2 545
where
r2
vz = 2 vz 1 − 2
a
where vz denotes the average of v z over the pipe cross section and where
2 1 ∂ ∂ 1 ∂2 ∂2
∇ = r + +
r ∂r ∂r r 2 ∂θ2 ∂z 2
Assuming that the wall of the pipe is impermeable to solute and that only a finite amount of
solute is put into the solvent at t = 0, we require c to satisfy
∂c
(r = a) = 0
∂r
Now solvent near the axis of the pipe is moving faster than solvent near the wall; the result is
that solute is carried down stream by the solvent at different rates depending on where it initially
resides on the cross section of the pipe. This convective distortion of the initial solute distribution
creates transverse concentration gradients which tend to move solute from the axis to the wall at
the leading edge of the distribution and in the opposite direction at the trailing edge.
The problem of determining c, or at least something about c, is called the problem of solute
dispersion or Taylor dispersion. The first work leading to an understanding of this problem was
reported by G.I. Taylor in his 1953 paper “Dispersion of Solute Matter in Solvent Flowing Slowly
Through a Tube.” We deal with this problem because it requires us to deal with the eigenfunctions
of ∇2 in cylindrical coordinates and because it requires us to solve the diffusion equation taking
into account homogeneous sources. The latter two reasons are artifacts of our way of doing the
problem and fit well into our sequence of lectures, Taylor did not require any of this.
The convective diffusion equation is a linear equation in c and as such its solution can be
written in terms of the transverse eigenfunctions of ∇2 and a special set of orthogonal functions in
z. The result is not instructive. What invites the construction of models is that v is not constant but
LECTURE 20. EIGENVALUES AND EIGENFUNCTIONS OF ∇2 546
depends on r. As Taylor was able to measure the transverse average solute concentration, i.e., c,
where
Z a Z 2π
1
c= 2 cr drdθ
πa 0 0
determining something about c as it depends on z and t became the goal of his and much subsequent
work on this problem
To begin we inquire as to what the velocity field by itself is doing to the solute distribution in the
pipe. If diffusion is set aside then c satisfies the purely convective equation
∂c ∂c
= −v (r)
∂t ∂z
and this is easy to solve for any assigned c (t = 0) as it is simply the statement that the value
of c at the point r0 , θ0 , z0 at the time t = 0 will be found also at the point r1 = r0 , θ1 = θ0 ,
z1 = z0 + v (r0 ) t1 at the time t1 and hence c (t > 0) can be calculated in terms of c (t = 0).
In the end the information we seek is independent of c (t = 0) and we can get it most easily by
introducing the longitudinal power moments of c. Denoting these by cm , where
Z +∞
cm = z mc dz, m = 0, 1, 2, . . .
−∞
we can derive the equation for cm by multiplying the purely convective equation by z m, integrating
over z from −∞ to +∞, using integration by parts and discarding the terms evaluated at z = ±∞.
Doing this for m = 0, 1, 2, . . . we get
∂c0
=0
∂t
∂c1
= vc0
∂t
LECTURE 20. EIGENVALUES AND EIGENFUNCTIONS OF ∇2 547
∂c2
= 2vc1
∂t
etc.
c0 = c0 (t = 0)
c1 = c1 (t = 0) + vc0 (t = 0) t
c2 = c2 (t = 0) + 2vc1 (t = 0) t + v 2 c0 (t = 0) t2
etc.
c0 = c0 (t = 0)
c1 = c1 (t = 0) + vc0 (t = 0) t
c2 = c2 (t = 0) + 2 vc1 (t = 0) t + v 2 c0 (t = 0) t2
etc.
To see how fast the solute is spreading in the axial direction due to nonuniform flow on the
cross section we calculate the longitudinal variance of the transverse average solute concentration
and determine how fast this is growing in time. Denoting the expected value of z m by
R +∞ m
z c dz c
zm = −∞
R +∞ = m
c dz c0
−∞
LECTURE 20. EIGENVALUES AND EIGENFUNCTIONS OF ∇2 548
we have
D E 2
2 2 c2
2 c1
(z − h z i) = z i− z i = −
c0 c0
If v were uniform on the cross section the variance would remain at its initial value. But a
non-uniform v causes the variance to grow as t2 . To see how much of this might be due to the
average motion we repeat the calculation in a fame moving at the average speed. To do this let
z′ = z − v t
and
t′ = t
Then c satisfies
∂c ∂c
′
= − (v − v) ′
∂t ∂z
and the new formula for the variance requires only that we put v − v in place of v in the formula we
already have. As this leads to no change, we see that the variance is the same whether we examine
solute spreading in a frame at rest or in a frame moving at the average speed of the solvent.
Now we know that diffusion and a linear growth of the variance go hand in hand, hence, if
accounting for transverse diffusion can eliminate the quadratic term in the above formula, we can
think about solute dispersion as a longitudinal diffusion process.
Assuming transverse diffusion can cut the growth of the longitudinal variance of the solute
LECTURE 20. EIGENVALUES AND EIGENFUNCTIONS OF ∇2 549
distribution from quadratic in time to linear in time, solute dispersion can be thought to be, on the
average, a longitudinal diffusion process. But as the balance between longitudinal convection and
transverse diffusion, on which elimination of the quadratic growth term depends, may take some
time to be established and as this time may depend on how the solute is initially distributed, we
need to view the representation of solute dispersion in terms of longitudinal diffusion as a long
time representation.
1 v2 a2
Deff =
48 D
where the reader can observe that Deff is smaller as D is larger, by a route that takes us through
familiar territory. To begin observe that if the model for c is
∂c ∂2c ∂c
= Deff 2 − V eff
∂t ∂z ∂z
satisfy
dc0
=0
dt
dc1
= V eff c0
dt
dc2
= 2Deff c0 + 2V eff c1
dt
dc3
= 6Deff c1 + 3V eff c2
dt
etc.
LECTURE 20. EIGENVALUES AND EIGENFUNCTIONS OF ∇2 550
and these equations tell us that we can determine V eff and Deff in terms of c0 , c1 and c2 via
1 dc1 d c1
V eff = =
c0 dt dt c0
and
( 2 )
1 dc2 c1 dc1 1d c2 c1
Deff = −2 = −
2c0 dt c0 dt 2 dt c0 c0
2
c1 c2 c1 c (z, t)
where and − are the average and the variance of z distributed according to
c0 c0 c0 c0
at time t.
and
∂c
(r = a) = 0
∂r
where c (t = 0) is assigned. So to determine V eff and Deff we need only determine c0 , c1 and c2
and use their averages in the above formulas. This does not mean that the model is then correct,
only that its first three power moments match those of the true solute distribution.
a2 D
Before we go on we scale length, time and velocity by a, and then Deff is scaled by
D a
D and we do not introduce new symbols. In terms of scaled variables the solute concentration
satisfies
∂c 1 ∂ ∂c ∂c ∂ 2 c
= r −v +
∂t r ∂r ∂r ∂z ∂z 2
and
∂c
(r = 1) = 0
∂r
LECTURE 20. EIGENVALUES AND EIGENFUNCTIONS OF ∇2 551
then
u = h 1, u i
satisfy
∂c0 1 ∂ ∂c0
= r
∂t r ∂r ∂r
∂c1 1 ∂ ∂c1
= r + vc0
∂t r ∂r ∂r
∂c2 1 ∂ ∂c2
= r + 2vc1 + 2c0
∂t r ∂r ∂r
etc.
which can be obtained by multiplying the equation satisfied by c by z m, integrating this over z
from −∞ to +∞, using integration by parts and then discarding terms evaluated at z = ±∞. The
boundary conditions to be satisfied are
∂c0
(r = 1) = 0
∂r
LECTURE 20. EIGENVALUES AND EIGENFUNCTIONS OF ∇2 552
∂c1
(r = 1) = 0
∂r
∂c2
(r = 1) = 0
∂r
etc.
The moment equations can be solved recursively. The equation for c0 is simply a transverse
diffusion equation in a region where the boundary is impermeable. So also the equation for c1
but now there is a source depending on c0 ; likewise for c2 , the source now depending on c0 and
c1 . Indeed the equation for cm is a transverse diffusion equation having a source depending on
cm − 1 and cm − 2. The moment equations all take the same form. It is that of an unsteady, radial
diffusion equation driven by an assigned source. The equations must be solved in sequence so that
the sources can be determined before they are required. The power moments lead to this useful
∂c
structure as the mth moment of is a multiple of the m − 1st moment of c, etc. This lowering of
∂z
the order of the moments moves the variable coefficient v (r) into the source term in each equation.
and
∂ψ
(r = 1) = 0
∂r
The eigenvalues, λ2 , are real and not negative and the eigenfunctions corresponding to different
eigenvalues are orthogonal in the inner product
Z 1
u, v =2 uvr dr
0
The eigenfunctions are thus solutions of Bessel’s equation and because they must be bounded
LECTURE 20. EIGENVALUES AND EIGENFUNCTIONS OF ∇2 553
at r = 0, we have
ψ = AJ0 (λr)
J0′ (λ) = 0
Again, maybe for the third time, we say: J0 (z) is a power series in z 2 , determined by Frobenius’
method. The coefficients in the series tell us everything about J0 .
To every positive root, λ, there corresponds a negative root, −λ, but as J0 (z) = J0 (−z) this
root adds neither a new eigenvalue nor a new eigenfunction. Denoting the non-negative roots then
as λ0 , λ1 , λ2 , . . . we have the corresponding normalized eigenfunctions ψ0 , ψ1 , ψ2 , . . ., where
J0 (λi r)
ψi = Z 1/2
1
2 J02 (λi r) r dr
0
u= ψ0 , u
1 J0(λ0 r)
J0(λ2 r)
0 r
1
J0(λ1 r)
J0 (z)
0 z
λ1 λ2
λ0
∞
X
c0 = ψj , c0 ψj
j=0
LECTURE 20. EIGENVALUES AND EIGENFUNCTIONS OF ∇2 555
and obtain the Fourier coefficients of c0 by multiplying the c0 equation by rψj and integrating over
0 ≤ r ≤ 1, viz.,
* +
D ∂c0 E 1 ∂ ∂c0
ψj , = ψj , r
∂t r ∂r ∂r
∂ψi ∂c0
Then, because (r = 1) = 0 = (r = 1), we have
∂r ∂r
d
h ψj , c0 i = −λ2j h ψj , c0 i
dt
and hence
∞
X 2
c0 = ψj , c0 (t = 0) e−λj t ψj
j=0
whereupon
c0 = ψ0 , c0 = ψ0 , c0 (t = 0) = c0 (t = 0)
and
∂c1
(r = 1) = 0
∂r
where c1 (t = 0) is assigned and where c0 is now known. This is a diffusion equation driven by
a homogeneous source. Again we expand the solution in the eigenfunctions ψ0 , ψ1 , ψ2 , . . . and
LECTURE 20. EIGENVALUES AND EIGENFUNCTIONS OF ∇2 556
∞
X
c1 = ψj , c1 ψj
j=0
∂ψj ∂c1
Then because (r = 1) = 0 = (r = 1) we have
∂r ∂r
d
ψj , c1 = −λ2j ψj , c1 + ψj , vc0
dt
2 2
e−λk t − e−λj t
X ∞
2
ψj , c1 = ψj , c1 (t = 0) e−λj t + ψj , vψk ψk , c0 (t = 0)
k=0
−λ2k + λ2j
2 e−λ2k t − e−λ2j t
−λ t
where when k = j we write te j in place of . Thus we have determined c1 as
−λ2k + λ2j
2 2
e−λk t − e−λj t
∞
X XX ∞ ∞
2
c1 = ψj , c1 (t = 0) e−λj t ψj + ψj , vψk ψk , c0 (t = 0) ψj
j=0 j=0 k=0
−λ2k + λ2j
LECTURE 20. EIGENVALUES AND EIGENFUNCTIONS OF ∇2 557
c1 = ψ0 , c1
2
e−λk t − 1
∞
X
= ψ0 , c1 (t = 0) + ψ0 , vψk ψk , c0 (t = 0)
k=0
−λ2k
2
e−λk t − 1
∞
X
= c1 (t = 0) + vψk ψk , c0 (t = 0)
k=0
−λ2k
We now have c0 , c0 , c1 and c1 and the reader can go on and obtain c2 , c2 , . . . in a similar way;
to do this simply requires a notational scheme to keep track of what is going on. Everything falls
into place once such a scheme is invented.
It turns out that c2 is not needed in order to derive a formula for c2 and hence for Deff . To see
this we average the equations satisfied by c0 , c1 and c2 obtaining
dc0 1 ∂ ∂c0
= r
dt r ∂r ∂r
dc1 1 ∂ ∂c1
= r + vc0
dt r ∂r ∂r
and
dc2 1 ∂ ∂c2
= r + 2vc1 + 2c0
dt r ∂r ∂r
Now ψ0 and ci satisfy homogeneous Neumann conditions at r = 1 and λ0 is zero. This tells us that
* + * +
1 ∂ ∂c i 1 ∂ ∂c i 1 d dψ0
r = ψ0 , r = r , ci = 0
r ∂r ∂r r ∂r ∂r r dr dr
dc0
=0
dt
LECTURE 20. EIGENVALUES AND EIGENFUNCTIONS OF ∇2 558
dc1
= vc0
dt
and
dc2
= 2vc1 + 2c0
dt
These formulas simplify the determination of V eff and Deff. Putting them into
1 dc1
V eff =
c0 dt
and
1 dc2 c1 dc1
Deff = −2
2c0 dt c0 dt
we get
vc0
V eff =
c0
and
vc1 vc0 c1
Deff = − +1
c0 c0 c0
and this tells us that we need only c0 , c0 , c1 and c1 to determine V eff and Deff.
Now because
∞
X 2
c0 = ψj , c0 (t = 0) e−λj t ψj
j=0
∞
X 2
= c0 (t = 0) + ψj , c0 (t = 0) e−λj t ψj
j=1
and
c0 = c0 (t = 0)
LECTURE 20. EIGENVALUES AND EIGENFUNCTIONS OF ∇2 559
we find
∞
X 2
vc0 = ψj , c0 (t = 0) e−λj t vψ j
j=0
2
= c0 (t = 0) v + O e 1 t
−λ
and we see that V eff turns out to be time dependent. This results whenever c0 (t = 0) differs from
its equilibrium value c0 (t = 0). This difference weights the streamlines more or less heavily than
their equilibrium weighting and as the streamlines differ in speed this leads to V eff being other than
v. But for large enough values of t the equilibrium distribution of c0 is attained and V eff reaches a
constant value. It is this that we now denote by V eff and its value is
V eff = v
2
e−λk t − e−λj t
∞ ∞ ∞ 2
X 2 XX
c1 = ψj , c1 (t = 0) e−λj t ψj + ψj , vψk ψk , c0 (t = 0) ψj
j=0 j=0 k=0
−λ2k + λ2j
and
2
e−λk t − 1
∞
X
c1 = c1 (t = 0) + vψk ψk , c0 (t = 0)
k=0
−λ2k
∞
X 1 2
= c1 (t = 0) + vc0 (t = 0) t + vψk ψk , c0 (t = 0) 2
+ O e 1t
−λ
k=1
λk
LECTURE 20. EIGENVALUES AND EIGENFUNCTIONS OF ∇2 560
we find
∞
X 2
vc1 = ψj , c1 (t = 0) e−λj t vψj
k=0
2
e−λk t − e−λj t
∞ X
∞ 2
X
+ ψj , vψk ψk , c0 (t = 0) vψj
j=0 k=0
−λ2k + λ2j
2
= c1 (t = 0) v + O e−λ1 t + vc0 (t = 0) v t
2
e−λk t − 1
∞
X
+ vψk ψk , c0 (t = 0) v
k=1
−λ2k
2
1 − e−λj t 2
∞
X
+ ψj , v c0 (t = 0) 2
vψj + O te 1 t
−λ
j=1
λj
where the first of the last four terms corresponds to j = 0, k = 0, the second to j = 0, k > 0,
the third to j > 0, k = 0 and the last to j > 0, k > 0. Using these formulas and writing
ψj , v = 1, ψj v = ψj v we find
X∞
vψj vψj
vc1 c0 − vc0 c1 = c c + O te−λ21 t
0 0
j=1
λ2j
and hence, for large enough values of t, we we see that Deff reaches a constant value. It is
X∞
vψj vψj
Deff = 1 +
j=1
λ2j
1 2
1+ v
48
we find that
1 d dv
r = −8v
r dr dr
dv
where v (r = 1) = 0 and (r = 1) = −4v. Then, integrating by parts, we have
dr
* + 1 * +
1 d dv dv dψj 1 d dψj
ψj , r = 2 ψj r − r v + r ,v
r dr dr dr dr 0 r dr dr
8vψj (r = 1)
ψj , v =−
λ2j
Now, using
J0 (λj )
ψj (r = 1) = Z 1/2
1
2 J02 (λj r) r dr
0
Z 1
1 n ′2 o
J02 (λj r) r dr = J0 (λ) + J02 (λ)
0 2
and
J0′ (λj ) = 0
we obtain
ψj (r = 1) = 1
8v
ψj , v =−
λ2j
LECTURE 20. EIGENVALUES AND EIGENFUNCTIONS OF ∇2 562
whereupon we have
X∞ X∞
vψj vψj 2 1
2
= 64v
j=1
λj λ6
j=1 j
X∞
1
The sum 6
is found, cf., A. E. DeGance and L. E. Johns, “Infinite Sums in the Theory of
j=1
λ j
1
Dispersion of Chemically Reactive Solute” SIAM J. Math. Anal. 18, 473 (1987), to be ,
3072
using the fact that J0′ (z) = −J1 (z), and the formula for Deff, when v = 2v (1 − r 2 ), is then
1 2
Deff = 1 + v
48
Having now determined V eff and Deff , we must be careful not to claim too much for these
results. Indeed we have not really matched the moments of the solution to the model equation to
the moments of the true solute distribution, at least not yet. Of course we have not even looked at
the moments for i > 2. But more than this, if we look at c1 where
2
e−λk t − 1
∞
X
c1 = c1 (t = 0) + ψ0 , vψk ψk , c0 (t = 0)
k=0
−λ2k
2
e−λk t − 1
∞
X
c1 = c1 (t = 0) + vc0 (t = 0) t + ψ0 , vψk ψk , c0 (t = 0)
k=1
−λ2k
whereas from the model we get, assuming V eff to be constant at its long time value,
c1 (t = 0) + V eff c0 (t = 0) t
This cannot match the true c1 for all time but we can determine V eff, c0 (t = 0) and c1 (t = 0) to
match it to the true c1 as t grows large. To do this we let Veff = v and retain c0 (t = 0) at its true
LECTURE 20. EIGENVALUES AND EIGENFUNCTIONS OF ∇2 563
value. But we cannot take c1 (t = 0) to be its true value; we must instead take it to be
∞
X 1
c1 (t = 0) + ψ0 , vψk ψk , c0 (t = 0)
λ2k
k=1
What we see then is that in using the model we must determine ceff (t = 0) as well as V eff and
Deff. If we do this we make the difference between the model prediction of c1 and the true c1 , i.e.,
2
e−λk t
X∞
vψk ψk , c0 (t = 0) , to be O e−λ21 t ; otherwise the difference will be O (1). A
−λ2k
k=1
similar observation pertains to the moment c2 . To have the model get c2 right as t grows large,
additional conditions are required to be satisfied by ceff (t = 0).
The Bessel function J1 (z) is defined to be the sum of the infinite series
2k
k z
1 z 2
z 4
1 X
∞ (−1)
z 1− 2 + 2 −··· = z 2
2 2 (1! 2!) 2 (2! 3!) 2 k=0 k! (k + 1)!
This series converges for all finite values of z as do the series obtained from it by term-by-term
differentiation.
J1′ (z)
The zeros J1 are real and simple and we denote them 0, ±z1 , ±z2 , . . .. The function has
J1 (z)
simple poles at z = 0, ±z1 , ±z2 , . . . and its residue at each pole is +1. Thus on any closed contour
C the value of the integral
Z
1 J1′ (z)
dz
C z − ζ J1 (z)
is the product of 2πi and the sum of the residues of the integrand at its poles inside C. In the limit
as C grows large and encloses the entire complex plane, the integral vanishes and we get
J1′ (ζ) 1 1 1 1 1
0 = 2πi + + + + + +···
J1 (ζ) 0 − ζ z1 − ζ −z1 − ζ z2 − ζ −z2 − ζ
LECTURE 20. EIGENVALUES AND EIGENFUNCTIONS OF ∇2 564
whereupon
1 1 J1′ (z) 1 1
− = 2
+ +···
2z z J1 (z) z z2
z12 1− 2 z22 1− 2
z1 z2
For any z such that | z| < | z1 | where | z1 | < | z2 | < · · · we can expand the factors 1 on
z2
1− 2
zi
the right hand side to get
∞ ∞ ∞
J1 (z) − zJ1′ (z) X 1 X 1
2
X 1
2
= 2
+ 4
z + 6
z4 + · · ·
2z J1 (z) z
i=1 i
z
i=1 i
z
i=1 i
and then, using the series for J1 (z) to expand the left hand side in powers of z 2 , we can match the
coefficients of 1, z 2 , z 4 , . . . on the two sides to get
X∞
1 1
2
=
z
i=1 i
8
X∞
1 1
4
=
z
i=1 i
192
X∞
1 1
6
=
z
i=1 i
3072
etc.
These formulas can be used to estimate the zeros of J1 (z); the third formula is of interest as it
stands in evaluating Deff .
Functions such as cos z, sin z, J0 (z), J1 (z), etc., defined by infinite sums converging for all
finite values of z are generalizations of polynomials. As such they also have infinite product
LECTURE 20. EIGENVALUES AND EIGENFUNCTIONS OF ∇2 565
expansions which simplify this kind of work. The infinite product expansion of J1 leads directly
to the formula
J1′ (z) 1 2z 2z
= + 2 2
+ 2 +···
J1 (z) z z − z1 z − z22
The eigenvalue problem for ∇2 , viz., (∇2 + λ2 ) ψ = 0, when written out in spherical coordinates
leads on separation of variables to three problems. Each of these determines one of the factors
making up ψ (r, θ, φ) as the product R (r) Θ (θ) Φ (φ). Solving for Φ (φ) is easy and will be re-
viewed here. But our main job in this lecture is the determination of Θ (θ). This requires us to
explain Frobenius’ method and we use Θ (θ) as an example of this. The determination of R (r) is
also easy and is not worked out in detail.
To get going we recall the multipole moment expansion of the electrical potential due to a set
of point changes. This is of interest because the potential, denoted φ, satisfies ∇2 φ = 0.
To write φ at a point P , whose position is denoted by ~r, we use Coulomb’s law (in rationalized
MKS units), viz.,
n
X qi
φ (~r) =
i=1
4πε0 | ~r − ~ri|
= r 2 − 2r ri cos θi + ri2
LECTURE 20. EIGENVALUES AND EIGENFUNCTIONS OF ∇2 566
1 1 1
= r r 2
| ~r − ~ri| r r
1 − 2 i cos θi + i
r r
so that if the field point is further from the origin then any source point we can use the expansion
1 1 3 1 3 5
− − − − − −
1 2 2 2 2 2 2
√ =1+ z+ z2 + z3 + · · ·
1+z 1! 2! 3!
ri r 2
This requires −1 < −2 cos θi + i < 1.
r r
r
On rewriting this in ascending powers of i we get
r
ri r 2 1
1 + cos θi + i 2
3 cos θi − 1 + · · ·
r r 2
1
Our interest in this lies in the functions 1, cos θi, 3 cos2 θi −1 , . . . obtained on expanding
2
1 2 ri
using the binomial theorem, writing z = −2xy + y where x = cos θi and y = ,
{1 + z}1/2 r
and then arranging the result in a power series in y. We can write the expansion directly in powers
of y as (see Linus Pauling and E. Bright Wilson, jr. “Introduction to quantum mechanics, with
applications to chemistry”)
∞
X
1
T (x, y) = = Pℓ (x) y ℓ
{1 − 2xy + y 2 }1/2 ℓ=0
LECTURE 20. EIGENVALUES AND EIGENFUNCTIONS OF ∇2 567
T (x, y = 0) = 1 = P0 (x)
and
∂
T (x, y = 0) = x = P1 (x)
∂y
Then we can derive a recursion formula expressing any one function in terms of the previous two.
Indeed using
1 X
T (x, y) = = Pℓ (x) y ℓ
{1 − 2xy + y 2 }1/2
and
∂ x−y X
T (x, y) = 3/2
= ℓPℓ (x) y ℓ−1
∂y 2
{1 − 2xy + y }
we get
X X
(x − y) Pℓ (x) y ℓ = 1 − 2xy + y 2 ℓPℓ (x) y ℓ−1
3 2 1
P2 (x) = x − , . . .. And by doing this we can see that Pℓ (x) is a polynomial of degree ℓ
2 2
and that it is odd or even as ℓ is odd or even. The polynomials so defined are called Legendre
polynomials and they are fundamental to work in spherical coordinate systems.
We can use the generating function to obtain the differential equation satisfied by Pℓ (x). To do
this we observe that
∂T y X
(x, y) = 3/2
= Pℓ′ (x) y ℓ
∂x 2
{1 − 2xy + y }
X X
y Pℓ (x) y ℓ = 1 − 2xy + y 2 Pℓ′ (x) y ℓ
and again requiring the coefficients of y ℓ to agree on the two sides we get
′
Pℓ+1 (x) − 2xPℓ′ (x) + Pℓ−1
′
(x) − Pℓ (x) = 0
Then multiplying this by ℓ+1 and subtracting the result from the derivative of the recursion formula
leads to
Likewise we find
′
Pℓ+1 (x) − xPℓ′ (x) − (ℓ + 1) Pℓ (x) = 0
and subtracting the first, multiplied by x, from the second, written with ℓ−1 in place of ℓ we obtain
1 − x2 Pℓ′ (x) + ℓxPℓ (x) − ℓPℓ−1 (x) = 0
which holds for any smooth and bounded functions P and Q. Then setting P = Pℓ and Q = Pℓ ′
produces the required result.
1n o
Pℓ (z) = (2ℓ − 1) zPℓ−1 (z) − (ℓ − 1) Pℓ−2 (z)
ℓ
multiply by Pℓ (z) and integrate from −1 to +1. Due to the orthogonality of Pℓ and Pℓ−2 we get
Z +1 Z +1
2ℓ − 1
Pℓ2 (z) dz = Pℓ (z) zPℓ−1 (z) dz
−1 ℓ −1
LECTURE 20. EIGENVALUES AND EIGENFUNCTIONS OF ∇2 570
1 n o
Then as zPℓ (z) = (ℓ + 1) Pℓ+1 (z) + ℓPℓ−1 (z) we see that
2ℓ + 1
Z +1 Z +1
2ℓ − 1
Pℓ2 (z) dz = 2
Pℓ−1 (z) dz
−1 2ℓ + 1 −1
We will rediscover the functions Pℓ (z) in the course of solving for the eigenfunctions of ∇2 in
spherical coordinates; it is to this that we now turn. Writing ∇2 in spherical coordinates and
substituting ψ (r, θ, φ) = R (r) Θ (θ) Φ (φ) into the eigenvalue problem (∇2 + λ2 ) ψ = 0 we find
that R (r), Θ (θ) and Φ (φ) satisfy
d2 Φ
2
+ m2 Φ = 0
dφ
1 d dΘ m2
2
sin θ + β − Θ=0
sin θ dθ dθ sin2 θ
and
1 d 2 dR β2 2
r + λ − 2 R=0
r 2 dr dr r
These three ordinary differential equations turn up in the order listed and −m2 and −β 2 are
separation constants introduced in the course of carrying out the separation of variables algorithm.
The values of m2 , β 2 and λ2 must be determined as well as expressions for the functions Φ, Θ and
R. The problems must be solved in order, because m2 in the first carries over to the second and
β 2 in the second carries over to the third. Each of these equations is a one-dimensional eigenvalue
problem in its own right but is not a definite problem until boundary conditions are assigned.
The boundary conditions derive from those satisfied by ψ which in turn come from the physical
LECTURE 20. EIGENVALUES AND EIGENFUNCTIONS OF ∇2 571
problem of interest.
The φ and θ equations are common to many problems and we work out their solutions in some
detail.
We suppose that physical boundary conditions are assigned only on surfaces where r is con-
stant. Hence we use symmetry and boundedness conditions to derive the dependence of ψ on θ
and φ and therefore Θ on θ and Φ on φ.
d2 Φ
+ m2 Φ = 0
dφ2
as
1
Φ = A cos mφ + B sin mφ
m
whereupon using Φ (0) = Φ (2π) and Φ′ (0) = Φ′ (2π) we have, for A, B and m,
1
cos 2πm − 1 sin 2πm A 0
m =
−m sin 2πm cos 2πm − 1 B 0
2 − 2 cos 2πm = 0
information.
However what we do is this: we replace cos mφ and sin mφ with two linear combinations,
eimφ and e−imφ . Then we assign eimφ to m and let m run through · · · , −2, −1, 0, 1, 2, · · · . So
for m = 0, ±1, ±2, . . . we have
1
Φm (φ) = √ eimφ
2π
where
Z 2π 1, m=n
ΦmΦn dφ =
0 0, m 6= n
+∞
X
f (φ) = cmΦm (φ)
m=−∞
are given by
Z 2π Z 2π
cm = φm (φ) f (φ) dφ = φ−m (φ) f (φ) dφ
0 0
or
d2 P2 dP 2 m2
1−z − 2z + β − P = 0, −1 ≤ z ≤ 1
dz 2 dz 1 − z2
LECTURE 20. EIGENVALUES AND EIGENFUNCTIONS OF ∇2 573
Our aim is to derive the solution to this equation by obtaining the coefficients in a series expansion
of P (z) in powers of z.
But first we outline the method we use, due to Fuchs and Frobenius, for obtaining solutions to
equations of the form
A point x0 is called an ordinary point if all pi (x) have Taylor series expansions about x0 . For
example x = 0 is an ordinary point of
d2 y
+y =0
dx2
and the reader may seek a solution to this equation in the form
∞
X
y= an xn
n=0
producing the series for cos x if a0 = 1, a1 = 0 and for sin x if a0 = 0, a1 = 1. The coefficients in
these series tell us everything we might want to know about the functions denoted sin and cos.
all possess Taylor series expansions about x0 . If x0 is a regular singular point, we can find a
solution
X
y = (x − x0 ) α an (x − x0 ) n, a0 6= 0
LECTURE 20. EIGENVALUES AND EIGENFUNCTIONS OF ∇2 574
In the case of second order equations, where x0 is a regular singular point, we write
p (x) ′ q (x)
y ′′ + y + y=0
x − x0 (x − x0 )2
where p (x) and q (x) have Taylor series expansions about x = x0 , viz.,
X
p= pn (x − x0 ) n
n=0
and
X
q= qn (x − x0 ) n
n=0
Substituting
X X
y = (x − x0 ) α an (x − x0 ) n = an (x − x0 ) n + α
X X
y′ = α an (x − x0 ) n + α − 1 + nan (x − x0 ) n + α − 1
X X
y ′′ = α (α − 1) an (x − x0 ) n + α − 2 + 2α nan (x − x0 ) n + α − 2
X
+ n (n − 1) an (x − x0 ) n + α − 2
X X
α (α − 1) an (x − x0 ) n + α − 2 + 2α nan (x − x0 )n+α−2
X
+ n (n − 1) an (x − x0 ) n + α − 2
X X
+ pn (x − x0 ) n (α + n) an (x − x0 ) n + α − 2
X X
+ qn (x − x0 ) n an (x − x0 ) n + α − 2 = 0
LECTURE 20. EIGENVALUES AND EIGENFUNCTIONS OF ∇2 575
n=0: α2 + (p0 − 1) α + q0 a0 = 0
n = 1, 2, . . . : (α + n)2 + (p0 − 1) (α + n) + q0 an =
n−1
X
− (α + k) pn−k + qn−k ak
k=0
n−1
X
P (α + n) an = − (α + k) pn−k + qn−k ak
k=0
We denote by α1 and α2 the two roots of P (α) = 0 and assume Re α1 ≥ Re α2 . Then for
α = α1 we have P (α + n) 6= 0 for n = 1, 2, . . . because α2 is the only root, other than α1 , of
P (α) = 0 and α2 6= α1 + n. Hence we always have one solution to our equation.
As an example, we solve
1 ′
′′ ν2
y + y − 1+ 2 y =0
x x
X
which has a regular singular point at x = 0. We substitute y = x α an x n obtaining
X X
(α + n) (α + n − 1) an x n + α − 2 + (α + n) an x n + α − 2
n=0 n=0
X X n+α−2
− an−2 x n + α − 2 − ν 2 x =0
n=2 n=0
hence we have
α2 − ν 2 a0 = 0
LECTURE 20. EIGENVALUES AND EIGENFUNCTIONS OF ∇2 576
(α + 1)2 − ν 2 a1 = 0
(α + n)2 − ν 2 an = an−2 , n = 2, . . .
a2n−2 a2n−4
a2n = 2
=
2 n (ν + n) 24 n (n − 1) (ν + n) (ν + n − 1)
etc.
2 −ν
where, chosing a0 = , we have the series
Γ (ν + 1)
2n + ν
1
X 2x
, Γ (z) = zΓ (z − 1)
n=0
n!Γ (ν + n + 1)
which has an infinite radius of convergence and defines the functions Iν (x).
If α1 − α2 is not an integer, we can find a second solution by using α2 in place of α1 . But often
α1 − α2 is an integer and this presents a technical difficulty.
X
y (x, α) = (x − x0 ) α an (α) (x − x0 ) n
∂ X d
y (x, α) = y (x, α1 ) ln (x − x0 ) + an (α) (x − x0 ) n + α1
∂α α=α1 dα α = α1
Much more than this sketch can be found in “Advanced Mathematical Methods for Scientists
and Engineers” by Carl M. Bender and Steven A. Orszag and in “An Introduction to the Theory of
Functions of a Complex Variable” by E. T. Copson.
The reader can work out the closely related problem, again Bessel’s equation,
d2 y 1 dy m2
+ + 1− 2 y =0
dx2 x dx x
X
y = xα an x n
X X
y ′ = αx α − 1 an x n + x α nan xn−1
and
X X X
y ′′ = α (α − 1) x α − 2 an x n + 2αx α − 1 nan x n − 1 + x α n (n − 1) an x n − 2
n o αX X
α (α − 1) + α + x − m 2
x an x n + {2α + 1} x α + 1
2
nan x n − 1
X
+x α + 2 n (n − 1) an x n − 2 = 0
The coefficient of xα on the LHS is (α2 − m2 ) a0 and assuming a0 6= 0, you have α = ±m. The
LECTURE 20. EIGENVALUES AND EIGENFUNCTIONS OF ∇2 578
coefficient of x α + n is
an n2 + 2αn − an−2
−1
an = an−2
n2 + 2αn
and this determines the even numbered coefficients in the series in terms of a0 .
The odd numbered coefficients, multiples of a1 , are zero. For example, setting m = 0, a0 = 1
you have a power series solution for all finite x. The series is called J0 (x) where
1 2 1 1
J0 (x) = 1 − 2
x + 2 2 x4 − 2 2 2 x6 + · · ·
2 2 4 2 4 6
1
A second independent solution can be obtained. Because the Wronskian of two solutions is , this
x
second solution will diverge logarithmically as x → 0.
where m = . . . , −2, −1, 0, 1, 2, . . . , and observe that it has regular singular points at z = ±1.
Because our aim is to expand P in a power series in z about z = 0 and to make use of the condition
that P must be bounded, we need to determine what is going on near z = ±1 by investigating the
indicial equation at each of these points. To see what happens at z = +1 we introduce x = 1 − z
and write R (x) = P (z), translating the singular point to x = 0. Then our equation is
d dR 2 m2
x (2 − x) + β − R=0
dx dx x (2 − x)
LECTURE 20. EIGENVALUES AND EIGENFUNCTIONS OF ∇2 579
and substituting
X
R = xα an xn , a0 6= 0
into
n o
d2 R dR
x (2 − x) x (2 − x) 2 + (2 − 2x) + x (2 − x) β 2 − m2 R=0
dx dx
4α2 − m2 a0
1 2
α2 = m
4
or
| m|
α=±
2
Likewise at z = −1 we find
| m|
α=±
2
| m|
Because we are looking for bounded solutions on −1 ≤ z ≤ 1 we discard the factors (1 − z)− 2
| m|
and (1 + z)− 2 and assume that P (z) can be written
| m| | m|
P (z) = (1 − z) 2 (1 + z) 2 G (z)
| m|
= 1 − z2 2 G (z)
LECTURE 20. EIGENVALUES AND EIGENFUNCTIONS OF ∇2 580
where G (z) is a power series in z about z = 0. The differential equation satisfied by G is then
1 − z 2 G′′ − 2 | m| + 1 zG′ + β 2 − | m| {| m| + 1} G = 0
The points z = ±1 remain regular singular points but the corresponding indicial equations require
α = 0. And so we look for a solution in the form of an ordinary power series about z = 0 where
the singular points z = ±1 bound the interval of interest. Writing
X
G= an z n
X
G′ = nan z n−1
and
X
G′′ = n (n − 1) an z n−2
X X X
n (n − 1) an z n−2 − n (n − 1) an z n − 2 | m| + 1 nan z n
X
+ β 2 − | m| {| m| + 1} an z n = 0
Observing that
∞
X ∞
X
n (n − 1) an z n−2 = (n + 2) (n + 1) an+2 z n
n=0 n=0
and setting the coefficient of z n on the left hand side to zero we discover the two term recursion
formula
β 2 − | m| {| m| + 1} − 2n | m| + 1 − n (n − 1)
an+2 =− an
(n + 2) (n + 1)
(n + | m|) (n + | m| + 1) − β 2
= an
(n + 2) (n + 1)
LECTURE 20. EIGENVALUES AND EIGENFUNCTIONS OF ∇2 581
So, for fixed values of | m| and β 2 we ordinarily do not get a bounded solution on −1 ≤ z ≤ 1.
But for fixed values of | m| there is the possibility that special values of β 2 lead to a bounded
solution. These values of β 2 are those that make one series or the other terminate in a finite number
of terms. For each fixed value of | m| we can select a sequence of values of β 2 that terminate the
even series in 1, 2, . . . terms and a sequence of values of β 2 that terminate the odd series in 1, 2, . . .
terms. To each such value of β 2 one series terminates in a polynomial, while the other series does
not terminate and is discarded. (The boundedness condition works here just like it did in cylindrical
coordinates where we discarded a solution to Bessel’s equation on the same grounds.)
So for each fixed value of | m| we get an infinite sequence of polynomial solutions. The poly-
nomial whose highest power is z ν , ν = 0, 1, 2, . . ., corresponds to
β 2 = (ν + | m|) (ν + | m| + 1)
For each even value of ν the even series terminates in a polynomial and we discard the divergent
odd series; for each odd value of ν the odd series terminates in a polynomial and we discard the
divergent even series.
Again, to each fixed value of | m| there corresponds to each value of β 2 such that
β 2 = (ν + | m|) (ν + | m| + 1) , ν = 0, 1, 2, . . .
a polynomial solution to the equation for G whose highest power is z ν , and it is odd or even as ν is
odd or even. The polynomial is completely determined by the recursion formula up to a constant
LECTURE 20. EIGENVALUES AND EIGENFUNCTIONS OF ∇2 582
factor. If we write
ℓ = ν + | m|
then
β 2 = ℓ (ℓ + 1)
ℓ = | m| , (| m| + 1) , (| m| + 2) , . . .
Going back to the equation for P (z), and hence to the equation for Θ (θ), we see that to each
fixed value of | m|, | m| = 0, 1, 2, . . ., we have determined a sequence of eigenvalues β 2 , where
β 2 = ℓ (ℓ + 1)
and where
ℓ = | m| + ν, ν = 0, 1, 2, . . .
To each such eigenvalue we have one independent eigenfunction and it is of the form
| m| an even or odd polynomial in z
P (z) = 1 − z2 2 ×
of degree ν = ℓ − | m|
To each value of | m| and to each ℓ the polynomial is defined by the recursion formula
(ν + | m|) (ν + | m| + 1) − ℓ (ℓ + 1)
aν+2 = aν
(ν + 1) (ν + 2)
ℓ = | m| , (| m| + 1) , . . ., whence
LECTURE 20. EIGENVALUES AND EIGENFUNCTIONS OF ∇2 583
ℓ = | m| leads to a0 ,
ℓ = | m| + 1 leads to a1 z,
ℓ = | m| + 2 leads to a0 + a2 z 2 , etc.
We can write out these eigenfunctions using the recursion formula for the coefficients in the
polynomials, but there is a better way of presenting our results. Before we do this we establish the
orthogonality of two eigenfunctions corresponding to different eigenvalues. Let | m| be fixed and
let P (z; ℓ, m) denote the bounded solution of
d d
2 m2
1−z P + ℓ (ℓ + 1) − P =0
dz dz 1 − z2
corresponding to ℓ = | m| , (| m| + 1) , . . .. Then as
Z +1 Z +1
d dQ
2 d 2
dP
P 1−z dz − Q 1−z dz =
−1 dz dz −1 dz dz
+1
dQ 2
dP
P 1−z − Q 1 − z2 =0
dz dz −1
where ℓ and ℓ′ are distinct values taken from the sequence | m|, (| m| + 1), (| m| + 2) , . . .
We emphasize the fact that to each value of | m| there corresponds an infinite set of polynomi-
als. And the sets of polynomials differ as | m| differs. The orthogonality is a condition satisfied by
the polynomials in each one of these sets.
and as polynomial solutions of a fixed degree of this equation are unique up to a constant factor we
can take the solutions to our problem for | m| = 0 to be
Pℓ (z) , ℓ = 0, 1, 2, . . .
This we do henceforth
We can use the Legendre polynomials to define the associated Legendre functions of degree ℓ
and order | m| via
| m| | m| d| m|
Pℓ = 1 − z2 2 Pℓ (z)
dz | m|
| m| times we get
d| m| + 2 d| m| + 1
1 − z2 Pℓ (z) − 2 {| m| + 1} z Pℓ (z)
dz | m| + 2 dz | m| + 1
n o
+ ℓ (ℓ + 1) − | m| {| m| + 1} Pℓ (z) = 0
| m| | m| d| m|
and using Pℓ = 1 − z2 2 Pℓ (z) we find
dz | m|
d2 Pℓ| m| | m|
2 dPℓ m2 | m|
1−z − 2z + ℓ (ℓ + 1) − Pℓ =0
dz 2 dz 1 − z2
LECTURE 20. EIGENVALUES AND EIGENFUNCTIONS OF ∇2 585
and we see that the associated Legendre functions are bounded solutions of our differential equa-
tion, viz.,
1 d dΘ m2
sin θ + ℓ (ℓ + 1) − Θ=0
sin θ dθ dθ sin2 θ
| m|
| m|
where z = cos θ and P (z) = Θ (θ). And as Pℓ (z) is (1 − z 2 ) 2 times an even or odd poly-
nomial of degree ℓ − | m| the associated Legendre functions must be constant multiples of the
solutions we determined by Froebenius’ method.
| m| | m| d| m|
Pℓ = 1 − z2 2 Pℓ (z)
dz | m|
where Pℓ (z), ℓ = 0, 1, 2, . . . are the Legendre polynomials. Indeed all that we may wish to know
about the associated Legendre functions can be determined from the Legendre polynomials. For
instance the integrals of their squares,
Z +1 2
| m|
Pℓ (z) dz
−1
2 {ℓ + | m|}!
2ℓ + 1 {ℓ − | m|}!
This establishes the angular eigenfunctions of ∇2 . So, while the radial parts of the eigenfunc-
tions of ∇2 remain to be determined, we now have a complete picture of the angular part. Defining
LECTURE 20. EIGENVALUES AND EIGENFUNCTIONS OF ∇2 586
Yℓ m (θ, φ) by
s
2ℓ + 1 {ℓ − | m|}! | m|
Yℓ m (θ, φ) = P (cos θ) Φm (φ)
2 {ℓ + | m|}! ℓ
m = . . . , −2, −1, 0, 1, 2, . . .
ℓ = | m| , (| m| + 1) , . . .
we have a complete set of orthogonal functions defined on the surface of a sphere. The orthogo-
nality works like this. Because
Z 2π Z π
Y ℓ m Yℓ ′ m′ sin θ dθdφ
0 0
p p Z π Z 2π
′ ′ | m| | m′ |
= ℓ, m ℓ , m Pℓ (cos θ) Pℓ ′ (cos θ) sin θ dθ Φm (φ) Φm′ (φ) dφ
0 0
p p Z +1 Z 2π
′ ′ | m| | m′ |
= ℓ, m ℓ , m Pℓ (z) Pℓ ′ (z) dz Φm (φ) Φm′ (φ) dφ
−1 0
where
s
p 2ℓ + 1 {ℓ − | m|}!
ℓ, m =
2 {ℓ + | m|}!
and
s
p 2ℓ′ + 1 {ℓ′ − | m|}!
ℓ ′, m′ =
2 {ℓ′ + | m|}!
The functions Yℓm are called spherical harmonics and as they are eigenfunctions of ∇2 on the
surface of a sphere, viz.,
ℓ (ℓ + 1)
∇2 Yℓm = − Yℓm
r2
LECTURE 20. EIGENVALUES AND EIGENFUNCTIONS OF ∇2 587
∂c
= ∇2 c, 0 ≤ φ ≤ 2π, 0≤θ≤π
∂t
+∞
X ∞
X
c (θ, φ, t) = cℓm Yℓm (θ, φ) e−ℓ (ℓ + 1) t
m=−∞ ℓ=| m|
where
Z 2π Z π
cℓm = Yℓm (θ, φ) c (t = 0) sin θ dθdφ
0 0
and where distance and time are scaled so that the radius of the sphere is one unit of length and
the diffusivity is one unit of length2 /time. The readers can determine that c (t > 0) is independent
of φ for all t > 0 if c (t = 0) is independent of φ and then go on and investigate the long time
dependence of c on θ for arbitrary c (t = 0).
P P∞ P∞ Pℓ
The double sum +∞ m=−∞ ℓ=| m| is often written ℓ=0 m=−ℓ where a given value of ℓ
8π 2 µ
∇2 ψ + {E − V } ψ = 0
h2
m1 m2
where E is the energy of the electron, assuming the nucleus is fixed, µ = is the reduced
m1 + m2
mass, m1 and m2 being the masses of the particles, and r, θ and φ are the spherical coordinates
of the electron taking the nucleus to be at the origin of our coordinate system. Both particles are
assumed to be point particles and the potential energy is that due to the Coulombic attraction of
LECTURE 20. EIGENVALUES AND EIGENFUNCTIONS OF ∇2 588
Ze2
V =−
r
The eigenfunctions of ∇2 in spherical coordinates, periodic in φ and bounded in θ, take the form
Ψ = R (r) Yℓ m (θ, φ)
and where m = 0, ±1, ±2, . . . and ℓ = | m|, (| m| + 1), (| m| + 2) , . . .. This equation can be
reduced to Bessel’s equation and solved in terms of Bessel functions. We observe that if ℓ = 0 two
LECTURE 20. EIGENVALUES AND EIGENFUNCTIONS OF ∇2 589
sin λr cos λr
and
λr λr
where the first, but not the second, is bounded as r → 0. Likewise for each value of ℓ, ℓ =
0, 1, 2, . . ., two independent solutions can be written in terms of sin λr, cos λr and powers of λr.
These can be used to solve diffusion problems in a sphere when c (t = 0) and all other sources of
solute are assigned. The reader can determine that
sin λr cos λr
−
(λr)2 λr
1
∇2 r ℓ = ℓ (ℓ + 1) r ℓ
r2
and
1 1 1
∇2 = ℓ (ℓ + 1) ℓ+1
r ℓ+1 r 2 r
the function
ℓ 1
Aℓ m r + Bℓ m Yℓ m (θ, φ)
r ℓ+1
h f, ∇2g i = h ∇2f, g i
Hence to solve
∇2 c = 0, R1 ≤ r ≤ R2 , 0 ≤ θ ≤ π, 0 ≤ φ ≤ 2π
XX
c= cℓ m (r) Yℓ m (θ, φ)
where the radial part of the solution is denoted cℓ m (r) and where
cℓ m (r) = h Yℓ m , c i
The equation satisfied by cℓ m (r) is then obtained by multiplying Laplace’s equation by Y ℓ m and
integrating over θ and φ, viz.,
0 = h Y ℓ m , ∇2 c i
1 d 2d
= r h Yℓ m , c i + h Yℓ m , ∇2θ,φ c i
r 2 dr dr
1 d 2d
= r h Y ℓ m , c i + h ∇2 Y ℓ m , c i
r 2 dr dr
1 d 2d ℓ (ℓ + 1)
2
r h Yℓ m , c i − h Yℓ m , c i = 0
r dr dr r2
and, using
Bℓ m
h Yℓ m , c i = Aℓ m r ℓ +
r ℓ+1
LECTURE 20. EIGENVALUES AND EIGENFUNCTIONS OF ∇2 591
we get
XX Bℓ m
ℓ
c= Aℓ m r + ℓ+1 Yℓ m (θ, φ)
r
where Aℓ m and Bℓ m can be determined using the conditions assigned on r = R1 and r = R2 . This
will work in solving Poisson’s equation as well, viz.,
0 = ∇2 c + Q (r, θ, φ)
The differential equation determining the radial part of the solution is then
1 d 2d ℓ (ℓ + 1)
2
r h Yℓ m , c i − h Yℓ m , c i + h Yℓ m , Q i = 0
r dr dr r2
To do any other problem, where, for instance, c is driven by an initial condition or by an initial
condition and a volume source, brings us back to
∇2 ψ + λ 2 ψ = 0
where
2 1 d 2d ℓ (ℓ + 1)
∇ + λ2 R (r) Yℓ m = 2
r + λ2 − R Yℓ m
r dr dr r2
and where a homogeneous condition must be satisfied by ψ and hence by R on the surface of a
sphere (or on two spheres) corresponding to a physical condition imposed on c there. The angular
parts of this, the Yℓ m ’s, are now known, only the radial part part remains to be determined, viz.,
the solutions to
1 d 2d ℓ (ℓ + 1)
2
r + λ2 − R=0
r dr dr r2
We wish to find the frequencies of small amplitude oscillations of a sphere of inviscid fluid.
The radius of the sphere is denoted R0 and we introduce a displacement so that its surface is
r = R (θ, φ, t) .
We have
∂−
→v
ρ = −∇p, ∇·−
→
v =0
∂t
and hence
∇2 p = 0
p = −γ2H
and
Rθ Rφ ∂R
vr − vθ − vφ =
R R sin θ ∂t
2 → →
−
Now introducing a small displacement of the rest state, viz., the state p0 = γ , −
v0 = 0 , we
R0
write
R (θ, φ, t) = R0 + ε R1 (θ, φ, t)
vr = εv r1
p = p0 + ε p 1
LECTURE 20. EIGENVALUES AND EIGENFUNCTIONS OF ∇2 593
and
2 2 1 2
2H = − +ε + ∇ R1
R0 R02 R02 θφ
where
1 ∂ ∂ 1 ∂2
∇2θφ = sin θ +
sin θ ∂θ ∂θ sin2 θ ∂φ2
∂vr1 ∂p1
ρ =−
∂t ∂r
∇2 p 1 = 0
∂R1
vr1 = at r = R0
∂t
and
γ
p1 = − 2 + ∇2θφ R1 at r = R0
R02
and
b Y ℓm (θ, φ) eσt
R1 = R1
b satisfy
where vbr1, bp1, and R1
dbp1
ρ σ vbr1 = −
dr
LECTURE 20. EIGENVALUES AND EIGENFUNCTIONS OF ∇2 594
1 d 2dbp1 ℓ (ℓ + 1)
r − bp1 = 0
r 2 dr dr r2
b
vbr1 = σ R at r = R0
1
and
γ b
bp1 = − {2 − ℓ (ℓ + 1)} R at r = R0
R02 1
bp1 = A r ℓ
where
ℓ = | m| , | m| + 1, . . .
Thus we obtain
γ
σ2 = − ℓ (ℓ − 1) (ℓ + 2)
ρR03
and for | m| = 1, 2, . . . , ℓ ≥ | m|, the integral is zero. But m = 0, ℓ = 0 must be ruled out assuming
the volume of the sphere remains fixed on perturbation.
Then, substituting
x = R sin θ cos φ
y = R sin θ sin φ
z = R cos θ
R = R0 + ε R1
into
(x − ε)2 + y 2 + z 2 = R02
we find
R1 = sin θ cos φ
R1 = sin θ sin φ
and if our sphere is centered at (0, 0, ε) we find R1 = cos θ. Now sin θ cos φ, sin θ sin φ and cos θ
are eigenfunctions of ∇2θ,φ corresponding to | m| = 1, ℓ = 1 and m = 0, ℓ = 1. Hence all of the
small displacements at ℓ = 1 correspond simply to moving the sphere off center and hence lead to
σ = 0.
LECTURE 20. EIGENVALUES AND EIGENFUNCTIONS OF ∇2 596
Coulomb’s law tells us that the electrostatic potential at the field point ~r due to electric charges qi
concentrated at the source points ~ri is
1 X qi
φ (~r) =
4πε0 | ~r − ~ri |
If the charge is distributed continuously, instead of discretely, due to an assigned charge density ρ,
then the electrostatic potential is
ZZZ
1 ρ (~r ′ )
φ (~r) = dV ′
4πε0 | ~r − ~ri |
where the source at ~r ′ is ρ (~r ′ ) dV ′ and the sum over point charges is replaced by an integral over
the charge density.
∇2 φ = −ρ/ε0
To see this we write our second integration by parts formula in the form
ZZZ ZZ ZZ ZZZ
2
φ∇ ψ dV = dA~n · {φ∇ψ − ψ∇φ} + dA~n · {φ∇ψ − ψ∇φ} + ψ∇2 φ dV
V S S V
2 1
where S1 and S2 bound a region V . To use this we assume the origin lies inside S1 which in turn
1 1
lies inside S2 . Then setting ψ = so that ∇ψ = − 2 ~ir and ∇2 ψ = 0, we get
r r
ZZ ZZ ZZZ
φ~ 1 φ~ 1 1 2
0=− dA~n · 2
ir + ∇φ − dA~n · 2
ir + ∇φ + ∇ φ dV
r r r r r
S S V
2 1
We now let S2 be a sphere of very large radius and we let S1 be a sphere of radius ε where
1
ε → 0. Then by requiring that φ → 0 at least as fast as as r → ∞, the first term on the right hand
r
side vanishes and by requiring that φ and ∇φ remain bounded as r → 0, the second reduces to
LECTURE 20. EIGENVALUES AND EIGENFUNCTIONS OF ∇2 597
4πφ ~0 . Using this our integration by parts formula simplifies to
ZZZ
~ 1 −∇2 φ
φ 0 = dV
4π r
V
ρ
and so, if φ satisfies Poisson’s equation, ∇2 φ = − , we get
ε0
ZZZ
1 ρ (~r)
φ ~0 = dV
4πε0 r
V
By doing this we have discovered the Green’s function for the operator ∇2 when its domain is
1
the set of functions defined throughout all space, required to vanish as r → ∞ at least as fast as .
r
Indeed we observe that
ZZ ZZZ
1 2 1
dA~n · ∇ = ∇ dV = 0
r r
S V
whenever S is the complete surface bounding any region V that does not include the origin. Then
when S is any surface enclosing the origin we conclude that
ZZ ZZ
1 1
dA~n · ∇ =− dA~n · ∇ = −4π
r r
S Sε
1 1 1
g= ·1
D 4π r
M
and where the factor 1 is written because it has physical dimensions , satisfies
T
∇2 g = 0, ∀~r 6= ~0
LECTURE 20. EIGENVALUES AND EIGENFUNCTIONS OF ∇2 598
and
ZZ
dA~n · {−D∇g} = 1
S
where S is any surface enclosing the origin. It is therefore the concentration field resulting when a
M
steady point source of unit strength, i.e., of strength 1 in units , is established at ~r = ~0 ∀t.
T
2
This is the Green’s function for ∇ . If a mass source is distributed via a continuous source
density so that Q (~r) dV units of mass per unit of time is introduced at ~r then by superposition
ZZZ
1
c (~r) = Q (~r ′ ) dV ′
D4π | ~r − ~r ′ |
V
satisfies
D∇2 c + Q = 0
∞
V
m
g
z=0 z
d2 ψ 2m
2
= 2 (gz − E) ψ
dz ~
α3 2mg α2 2mg
Replacing z by αz where = 1 and setting E = ε you have
~2 ~2
d2 ψ
= (z − ε) ψ
dz 2
The solution is
ψ = CAi (z − ε) + DBi (z − ε)
2. An incompressible fluid is in steady straight line flow in a long straight pipe of rectangular
cross section. Determine vz (x, y) where
1 dp
∇2 vz = <0
µ dz
dp
where is an input and where vz is zero at x = ± a and y = ± b.
dz
Determine Q, the volumetric flow rate, and then determine the values of a and b so that the
volumetric flow rate is greatest at a fixed cross sectional area.
∂ 2 vz ∂ 2 vz 2 1 ∂p
∇2 vz = + = a
∂x2 ∂y 2 µ ∂z
where
vz (x = 0, 1) = 0 = vz (y = 0, 1)
LECTURE 20. EIGENVALUES AND EIGENFUNCTIONS OF ∇2 600
by writing
∞
X
vz = cn (y) sin nπx
n=1
This series is the sum of products of functions of x times functions of y, unlike the
solution in Problem 1.
4. An incompressible fluid is in steady straight line flow in a long straight pipe. The axial
velocity depends on x and y and satisfies
1 ∂p
∇2 vz = , (x, y) ∈ A
µ ∂z
and
vz = 0, (x, y) ∈ P
dp
where is a negative constant, A denotes the fixed cross section of the pipe and P denotes
dz
its perimeter.
Polynomial solutions to this equation can be obtained for a variety of cross sections using
∇2 {1, x, y} = {0, 0, 0)
∇2 x2 , xy, y 2 = {2, 0, 2)
∇2 x3 , x2 y, xy 2, y 3 = {6x, 2y, 2x, 6y)
etc.
LECTURE 20. EIGENVALUES AND EIGENFUNCTIONS OF ∇2 601
Determine the solution to the problem when the cross section of the pipe is bounded
by:
1
y=− √ a
2 3
√ 1
y = 3x + √ a
3
√ 1
y = − 3x+ √ a
3
Show that for the same pressure gradient and the same area, the circular cross section
carries the greatest volumetric flow.
Show that polynomial solutions cannot be found when the cross section is a square.
5. This has to do with the nonlinear heat generation problem presented at the end of Lecture 16.
Let the region V in which a heat generating material resides a be a rectangular paral-
lelepipde of side lengths a, b and c. Its volume is then abc. Determine the eigenvalues of ∇2
in this region when the eigenfunctions are required to vanish on its boundary. Observe that
1 1 1
λ21 =π 2
2
+ 2+ 2
a b c
Holding the volume of the region fixed show that a cube might be thought to be the most
dangerous shape. Is this result expected on physical grounds?
6. A solvent is in straight line flow in a long straight pipe of rectangular cross section. The pipe
is aligned with the axis 0z of a Cartesian coordinate system and −a < x < a, −b < y < b,
−∞ < z < ∞.
LECTURE 20. EIGENVALUES AND EIGENFUNCTIONS OF ∇2 602
The longitudinal velocity of the solvent, denoted vz (x, y), is known from Problem 1.
and
ZZ
1
u= u dA
A A
then
u = h 1, u i
8. Two planes, one at x = 0 and the other at x = L, are held at temperature T0 . A plane at
y = 0 is held at temperature T1 > T0 .
Derive a formula for the steady temperature field in the region 0 < x < L, 0 < y < ∞.
∂T
To estimate the heat that must be supplied to establish this temperature field find
∂y
along the hot plane y = 0.
First differentiate the formula for T (x, y) term by term, observe that the result is a
series that diverges at y = 0 and conclude that, while the series produces T , it does not
produce everything that we might want to know about the problem.
LECTURE 20. EIGENVALUES AND EIGENFUNCTIONS OF ∇2 603
∂T
Devise a way of determining (y = 0).
∂y
P = a0 + a1 z + a2 z 2 + · · ·
z = r cos ω
w = r sin ω cos θ
and derive
2 1 ∂ 3 ∂ 1 ∂ 2 ∂
∇ = 3 r+ 2 sin ω
r ∂r r sin2 ω ∂ω
∂r ∂ω
1 ∂ ∂ 1 ∂2
+ 2 sin θ + 2
r sin2 ω sin θ ∂θ ∂θ r sin2 ω sin2 θ ∂φ2
∇2 + λ 2 ψ = 0
and obtain four one dimensional eigenvalue problems. The θ and φ equations are just the θ
LECTURE 20. EIGENVALUES AND EIGENFUNCTIONS OF ∇2 604
and φ equations that come up in three dimensions. The ω equation is new, being
1 d 2 dΩ 2 ℓ (ℓ + 1)
2 sin ω + β − Ω=0
sin ω dω dω sin2 ω
1 1
s= ℓ, − (ℓ + 1)
2 2
ℓ/2
Write P (z) = (1 − z 2 ) G (z) and show that G satisfies
d2 G dG 2
1 − z2 2
− (2ℓ + 3) z + β − ℓ (ℓ + 2) G = 0
dz dz
(v + ℓ) (v + ℓ + 2) − β 2
av + 2 = av
(v + 1) (v + 2)
β 2 = k (k + 2) , k = ℓ, ℓ + 1
in two dimensions: m (m + 0)
in three dimensions: m (m + 0)
ℓ (ℓ + 1)
LECTURE 20. EIGENVALUES AND EIGENFUNCTIONS OF ∇2 605
in four dimensions: m (m + 0)
ℓ (ℓ + 1)
k (k + 2)
2 k (k + 2)
∇ωθφ Ykℓm = − Ykℓm
r2
Show that
1
r k Ykℓm and Ykℓm
rk + 2
Almost everything that needs to be known, in any number of dimensions, can be in-
ferred by pursuing the pattern that is emerging.
11. A dye drop in the shape of North and South America is absorbed on the surface of a sphere of
water in which it is insoluble. The dye cannot escape into the surrounding air but being sub-
jected to collisions by the water molecules it can diffuse over the surface of the water. Find
a formula for the diffusive homogenization of the dye. Scale length and time so that D = 1
and R = 1 where R is the radius of the sphere of water. Then the surface concentration of
dye satisfies
∂c
= ∇2 c, 0 ≤ θ ≤ π, 0 ≤ φ ≤ 2π
∂t
where c (t = 0) is assigned.
Show that as t grows large the dye becomes uniformly distributed over the surface of the
sphere. Find the long time limiting concentration of the dye. What is the θ and φ dependence
of the non-uniform terms that die out most slowly? If c (t = 0) depends only on θ, is this
symmetry maintained for all t > 0?
LECTURE 20. EIGENVALUES AND EIGENFUNCTIONS OF ∇2 606
12. In the Debye model for the reorientation of rigid rods by Brownian motion, the direction
of a rod can be specified by a unit vector lying along its axis and hence by a point (θ, φ)
lying on the surface of the unit sphere. If the initial orientation of the rods is specified by the
probability density p (t = 0), then the probability density p (t > 0) satisfies
∂p 2
= D ∇θφ p
∂t
1
where [D] = . Write a formula for p (t > 0).
T
If in the initial orientation, the rods are concentrated at (θ0 , φ0) then
δ (θ − θ0 ) δ (φ − φ0 )
p (t = 0) =
sin θ0
where the delta function is introduced in Lecture 19, Appendix 3. Show that
+∞
X ∞
X
p (t > 0) = Yℓm (θ0 , φ0 ) Yℓm (θ, φ) e−ℓ (ℓ + 1) Dt
m=−∞ ℓ=| m|
This is a transition probability. It is the probability that a rod acquires the orientation (θ, φ)
at time t, given its initial orientation is (θ0 , φ0 ).
8π 2 µ
∇2 ψ = − Eψ
h2
where
ZZZ
ψψ dV < ∞
V
14. This is a 2-dimensional heat conduction problem, based on the 1-dimensional problem pre-
sented in Lecture 14
LECTURE 20. EIGENVALUES AND EIGENFUNCTIONS OF ∇2 607
We have a region 0 < z < H, 0 < x < L in which the temperature is specified at
t = 0. The wall at z = H is at a fixed temperature, the walls at x = 0 and x = L are
insulated. The wall at z = 0 is in perfect contact with a well stirred reservoir. The uniform
temperature of the reservoir is denoted T0 . Hence we have T (z = 0) = T0 for all x ∈ (0, L)
∂T
whereupon T is uniform in x at z = 0 but need not be uniform.
∂z
At z = 0 we also have
Z L
∂T dT0
dx = constant ×
0 ∂z dt
∂T
= κ∇2 T
∂t
on the domain.
First we scale our problem and then introduce an eigenvalue problem to aid us in solving
the scaled problem.
Thus we have
∂T
= ∇2 T 0 < x < 1, 0<z<H
∂z
T = 0 at z=H
∂T
= 0 at x = 0, 1
∂x
and
Z 1
∂T ∂T
dx = C at z = 0
0 ∂z ∂t
LECTURE 20. EIGENVALUES AND EIGENFUNCTIONS OF ∇2 608
∇2 ψ + λ 2 ψ = 0
ψ = 0 at z=H
∂ψ
= 0 at x = 0, 1
∂x
Z 1
∂ψ
dx + Cλ2 ψ = 0 at z=0
0 ∂z
k = 0, π, 2π, . . .
and obtain the ψ’s and λ’s. Then the ψ’s and λ’s must be used to obtain T (x, z, t) where
T (x, z, t = 0) is specified.
Our integration by parts formulas can be used to light our path, i.e., to tell us the inner
product we ought to be using.
15. The Stokes’ equation for the slow flow of a constant density fluid is
1
∇2~v = ∇p, ∇ · ~v = 0
µ
∇2 p = 0
LECTURE 20. EIGENVALUES AND EIGENFUNCTIONS OF ∇2 609
and
2 1 1
∇ ~r p = ∇p
2µ µ
A nr n + B nr − (n + 1) Ynm
Then we have
X
p= pn
X n+3
~v = ∇ × (~rχ n) + ∇φn + r 2 ∇p n
2µ (n + 1) (2n + 3)
n
− ~r p n
µ (n + 1) (2n + 3)
satisfies
1
∇2~v = ∇p and ∇ · ~v = 0
µ
16. Stokes’ equation for slow flow past a sphere has no solution in two dimensions. This is
Stokes’ paradox. It has a solution in three dimensions, but no first order corrections to
account for non zero Reynolds number. This is Whitehead’s paradox.
Your job is to see what you can find out about slow flow past a four dimensional sphere.
First you ought to derive the symmetry conditions and then ask if a streamfunction can be
found.
LECTURE 20. EIGENVALUES AND EIGENFUNCTIONS OF ∇2 610
17. We have a sphere of radius R centered at the origin lying in an unbounded region. We
impose a constant temperature gradient at a great distance from the sphere. The thermal
conductivities of the sphere and its surroundings are kS and k.
∂T dT0 ∂T ∂T
= , =0=
∂z dz ∂x ∂y
No temperature is specified at any point in the problem and we have ∇2 T = 0 inside and
outside the sphere. The axisymmetric solutions to this equation are
B0 B1
2 B2 3 2 1
A0 + + A1 r + 2 cos θ + A2 r + 3 cos θ − + etc.
r r r 2 2
3k dT0
T = r cos θ
kS + 2k dz
and
dT0 dT0 k − kS R3
T = r cos θ + r cos θ
dz dz kS + 2k r3
18. D. J. Jeffrey, “Conduction through a random suspension of spheres,” Proc. Roy. Soc. Lon-
don, A335 (1973) 355, proposes to determine the effective conductivity of a dilute suspen-
sion of spheres in the following way:
Write
− ~q = k eff ∇T
LECTURE 20. EIGENVALUES AND EIGENFUNCTIONS OF ∇2 611
where
D E ZZZ
1
() = lim ( ) dV
V →∞ V
V
Set
ZZZ
~i =
∆ (kS − k) ∇T dV
V sphere i
whereupon
1 n~ ~ 2 +···
o
h − ~q i = k h ∇T i + ∆1 + ∆
V
n ~
h − ~q i = k h ∇T i + ∆1
V
Assuming the spheres are dilute and do not interact with one another, we can use
3k dT0 ~
∇TS = k
kS + 2k dz
LECTURE 20. EIGENVALUES AND EIGENFUNCTIONS OF ∇2 612
k eff kS − k
= 1 + 3φ
k kS + 2k
4
n πR3
where φ = 3 is the volume fraction spheres.
V
19. In cylindrical coordinates the Bessel’s functions Im (λr) and Km (λr) satisfy
d2 1 d m2 2
+ − 2 − λ {Im (λr)} = 0
dr 2 r dr r
and
d2 1 d m2 2
+ − 2 − λ {Km (λr)} = 0
dr 2 r dr r
∇2~v = ∇p, ∇ · ~v = 0
Show that
∇2 p = 0 and ∇4~v = ~0
p = Im (λr) eı mθ eı λz
∂p
∇2 v z =
∂z
v 2 ∂v ∂p
∇2 vr − r2 − 2 θ =
r r ∂θ ∂r
v 2 ∂v 1 ∂p
∇2 vθ − θ2 + 2 r =
r r ∂θ r ∂θ
and
Observe that
ψ = I√ (λr) eı mθ eı λz
m2 + 1
satisfies
2 1
∇ − 2 ψ=0
r
vz = I0 (λr) eı λz
vr = I1 (λr) eı λz
and
vθ = I1 (λr) eı λz
Assuming
you have
Then write
1 d
f= r Im (λr)
2λ dr
LECTURE 20. EIGENVALUES AND EIGENFUNCTIONS OF ∇2 615
20. A cylindrical column of inviscid fluid is rigid body rotation about its axis of symmetry at
angular velocity Ω. Thus you have vr = 0, vθ = r Ω and vz = 0. A small axisymmetric
perturbation is introduced and your job is to find the frequencies of small amplitude oscilla-
tions.
∂−
→v
v · ∇−
+−
→ →
v = −∇p, ∇·−
→
v =0
∂t
p
where has been replaced by p.
ρ
And you need to derive the equations corresponding to m = 0, viz.,
∂vr1 ∂p1
− 2Ω vθ1 = −
∂t ∂r
∂vθ1
+ 2Ω vr1 = 0
∂t
∂vz1 ∂p1
=−
∂t ∂z
∂vr1 vr1 ∂vz1
+ + =0
∂r r ∂z
Then seeking solutions vr1 = vbr1 (r) eiωt eikz , etc. you can eliminate vbθ1 in favor of vbr1
and vbz1 in favor of pb1 and arrive at an equation for vbr1:
d2 vbr1 1 db
vr1 1 4Ω2
+ − vb + k 2 − 1 vbr1 = 0
dr 2 r dr r 2 r1 ω2
This eigenvalue problem tells you the values of ω 2 as they depend on k 2 . Photographs
illustrating what you have found are presented by D. Fultz, J. Meteorology, 16 199 (1959).
∂c
= ∇2 c, 0 < x < 1, 0<y<a
∂t
LECTURE 20. EIGENVALUES AND EIGENFUNCTIONS OF ∇2 616
where c (t = 0) is assigned.
In case a) you are to deduce the fact that if a >> 1, the x variation dies out quickly
leaving the y variation in control of the loss of solute to the surroundings. If a << 1, it is
the reverse, i.e., the long direction is slow.
In case b) there is no solute loss, the initial solute distribution is simply working its way
to uniformity. You have eigenfunctions
• independent of x and y,
For a >> 1 and for a << 1 does the x or y variation control the final approach to
uniformity.
22. Suppose we inject a decomposing solute into a solvent in straight line flow in a long pipe of
circular cross section.
The solute concentration decreases due to a first order reaction and we have, in scaled vari-
ables,
∂c 1 ∂ ∂c ∂c ∂ 2 c
= r −v + − kc
∂t r ∂r ∂r ∂z ∂z 2
and
∂c
(r = 1) = 0
∂r
LECTURE 20. EIGENVALUES AND EIGENFUNCTIONS OF ∇2 617
where v = 2v (1 − r 2 ).
Our model is
∂c ∂2c ∂c
= D eff 2 − V eff − K eff c
∂t ∂z ∂z
First derive
1 dc0
K eff = −
c0 dt
d c1
V eff =
dt c0
etc.
c0 (k > 0) = c0 (k = 0) e−kt
c1 (k > 0) = c1 (k = 0) e−kt
etc.
and conclude that K eff = k and that V eff and D eff are independent of k.
Estimate the fraction of solute remaining at the time V eff and D eff become nearly con-
stant.
23. Your tennis ball, having a diameter 2R and a wall thickness L is filled with air at pressure
P0 > P atm. The air diffuses across the wall and your tennis ball goes flat. Assume all the
pressure drop is across the wall and diffusion through the wall is steady. Then write
∇· c~
v =0
LECTURE 20. EIGENVALUES AND EIGENFUNCTIONS OF ∇2 618
P
c=
RT
and
K
~v = ∇P (Darcy’s law)
µ
At constant temperature and assuming one dimensional diffusion derive the equation
d 2 dP
Pr =0
dr dr
solve it and derive a formula for the time at which the pressure in your tennis ball falls to
P0 + P atm
.
2
24. A solid sphere of radius R0 and density cS dissolves sparingly in a solvent. It is in equilib-
rium with the solvent at solids concentration c∗, where
c∗ = c∗∞− Aγ2H
Writing
R = R0 + ε R1
2
2H0 = −
R0
and
1 ∂ 2 R1 cos θ ∂R1
2H1 = 2 2R1 + +
R0 ∂θ2 sin θ ∂θ
LECTURE 20. EIGENVALUES AND EIGENFUNCTIONS OF ∇2 619
∂c1
= D∇2 c1
∂t
c1 = c1∗ = Aγ2H1 at r = R0
and
∂R1 ∂c1
(cS − c0∗) =D at r = R0
∂t ∂r
∂R0
where c1∗ does not appear in the third equation because =0
∂t
Your aim is to find out how fast the perturbation dies out.
Assume a solution
and
b Pℓ (cos θ) eσt
R1 = R1
and derive the domain equation for bc1 (r). Its solutions are denoted
r ! r !
−σ −σ
j r and y r
ℓ D ℓ D
The case ℓ = 0 does not maintain the volume of the sphere fixed and the case ℓ = 1 is
neutral, so set ℓ = 2. Then a technical difficulty arises, viz., j2 (r) and y2 (r) do not vanish
as fast as you would like as r → ∞.
∂R
To get some idea of what is going on drop σ on the domain on the grounds that is
∂t
controlling the equilibration.
LECTURE 20. EIGENVALUES AND EIGENFUNCTIONS OF ∇2 620
Then
B
bc1 = Ar 2 +
r3
→
− →
−
25. Your job is to find the shape of a sphere spinning at angular velocity Ω = Ω k . To do this
you need the velocity of the fluid:
−
→ →
−
v = r sin θΩ iφ
Now writing
1 4
R = R0 + Ω2 R1 + Ω R2 + · · ·
2
At r = R (θ, φ) you have p + γ2H = 0 and hence to first order in Ω2 you have, at r = R0 ,
p1 + γ2H1 = 0
LECTURE 20. EIGENVALUES AND EIGENFUNCTIONS OF ∇2 621
where
1 1 d d
2H1 = 2 2+ sin θ R1
R0 sin θ dθ dθ
Now you can find R1 and hence you will have R to first order in Ω2 .
Answer:
2ρ 2
R = R0 + Ω R04 − 2 cos2 θ
γ 3
3 1
At m = 0, the Yℓm ’s are the Pℓ ’s: P0 = 1, P1 = cos θ, P2 = cos2 θ −
2 2
To go to second order in Ω2 , you will need 2H2 . At r = R0 you will find
∂p1
2R1 + γ2H2 = 0
∂r
d 2 p0 dp0
because p2 , 2
and are all zero.
dr dr
26. An inviscid fluid confined to a circle of radius R0 by the surface tension acting at its edge is
spinning at a constant angular velocity, Ω.
−
→ →
−
v0 = rΩ iθ
and
dp0
= ρΩ2 r
dr
γ
where p0 = at r = R0
R0
LECTURE 20. EIGENVALUES AND EIGENFUNCTIONS OF ∇2 622
Your job is to find the frequency of oscillation, σ, assuming the surface is given a small
displacement, viz., R = R0 + εR1 and assuming there is no z variation and vz = 0.
You have
∂−
→v p
v · ∇−
+−
→ →
v = −∇ , →
∇·−
v =0
∂t ρ
and, at r = R,
Rθ
vr − vθ = Rt
R
and
p γ
+ 2H = 0
ρ ρ
where
1 Rθθ 1 2R2θ
2H = 3/2 − − 3
R2 R2 R R
1 + 2θ
R
pb1 d
Eliminate by differentiation and use im vbθ1 = − rb
vr1 to eliminate vbθ1 arriving at
ρ dr
d2 3 d 1 − m2
+ + vr1 = 0
b
dr 2 r dr r2
Retain the bounded solution, observing that m = 0 is ruled out if the area of the circle is
LECTURE 20. EIGENVALUES AND EIGENFUNCTIONS OF ∇2 623
γ
σ2 = m (m − 1) (m + 1)
ρR03
Answer:
γ
(σ + mΩ)2 − (σ + mΩ) 2Ω + m2 Ω2 + 3
m (m − 1)2 = 0
ρR0
27. The mechanical energy equation may tell you something important about a problem before
you try to solve it, eg., the oscillating drop problem. As a simple example, we may have a
fluid occupying a volume V bounded by a surface S where we omit the fluid outside S.
Then in V we have
∂−
→v →
−
→
−
ρ + ρ−
→ −
v · ∇→
v =∇· T, ∇·−
→
v =0
∂t
and on S we have
−
→
n ·−
→
v =u
→
−
→
−
−−
→
n−→
n : T + γ 2H = 0
→→ −
− →
→
−
−t −
n : T =0
LECTURE 20. EIGENVALUES AND EIGENFUNCTIONS OF ∇2 624
S
V
→
− →
− −
→ T
→
− →
− →
− − → →
−
∇· T · v =∇· T · v − T : ∇−
→
v,
−
→
→
−
I : ∇−
→
v =∇·−
→
v = 0,
−
→
→
− →
−
→
− →
−
→
−
T = −p I + 2µ D ,
where
→
−
→ −
− →
−
→ −
→
→ −
− →
−
→
∇−
→
v = D + W, D :W =0
to derive
Z Z Z −
→
d 1 − → −
− →
→
−
ρ→
v 2 dV = dA γ 2Hu − 2µ dV D : D
dt 2
V S V
LECTURE 20. EIGENVALUES AND EIGENFUNCTIONS OF ∇2 625
Now, because
Z Z
d
dA = − dA 2Hu
dt
S S
you have
Z Z Z →
−
d 1 − d → −
− →
→
−
ρ→
v 2 dV + γ dA = −2µ D : D dV
dt 2 dt
V S V
Now if you wish, you can easily include gravity in your formula by adding +ρ−
→
g, →
−
g =
−∇φ, to the RHS at the beginning.
Suppose you have a spherical drop at rest in free space, no gravity. Denote its radius by R0
its volume by V0 . You give it a small displacement from rest. Prove that the drop returns to
rest
Index
627
INDEX 628
M P
magnetic dipole, 271 periodic conditions, 375
manifold, 47 periodic eigenfunctions, 347
Mathieu equation, 148 perturbation calculations, 107
matrix multiplication, 6 Petri Dish Problem, 25
matrix of the cofactors, 34 plain vanilla inner product, 68, 497
maximum boiling azeotrope, 9, 24 plane source solution in three dimensions, 298
mean curvature, 252 point source, 292
mean square error, 325 point source solution, 295, 303
merry-go-round, 168 point source solution in one dimension, 298
method of Frobenius, 414, 573 point source solution in three dimensions, 297
minimum boiling azeotrope, 24 point source solution in two dimensions, 298
minor, 34 Poisson’s equation, 305, 591, 596
minors, 37 polynomial differential operator, 53
monopole, 306 population of cells, 179
Multiplication of a Vector by a Matrix, 6 positive cone, 210
Multipole Expansions, 303 positive definite, 67
multipole moment expansion, 565 positive feed back, 395
potential energy, 307
N
power moments, 273, 546
Navier-Stokes equation, 268
power series, 414
nearly dependent columns, 20
principal minors, 86
Neumann conditions, 375, 462, 469, 557
principle of detailed balance, 210
norm convergence, 327
principle of superposition, 10, 390
nutrient concentration, 180
probability density, 16
O projection, 73
one-dimensional lattice, 287 projection theorem, 73, 74
ordinary differential equations, 497 prolate and oblate spheroidal coordinates, 270
ordinary point, 542, 573 Pythagorean Theorem, 76
orthogonal basis, 246
orthogonal complement, 69
Q
quadrupole, 306
orthogonal coordinate system, 248, 410
quantum mechanics, 540
orthogonal functions, 323
INDEX 631
W
washout branch, 182
washout solution, 181
Wronskian, 502
Wronskian of the solutions, 40
20 Lectures on
www.orangegrovetexts.com
j o hns
L. E. Johns
56500
9 781616 101657