The document is a publication titled '20 Lectures on Eigenvectors, Eigenvalues, and Their Applications' by L. E. Johns, aimed at providing educational resources in the field of chemical engineering. It is published by the University Press of Florida and is part of the Orange Grove Texts Plus initiative, which focuses on accessible and customizable scholarly materials. The book covers a range of topics related to eigenvectors and eigenvalues, including their applications in solving differential equations and modeling chemical processes.

20 Lectures on Eigenvectors, Eigenvalues, and Their Applications
Problems in Chemical Engineering

L. E. Johns

Orange Grove Texts Plus seeks to redefine publishing in an electronic world. This imprint
of the University Press of Florida provides faculty, students, and researchers worldwide
with the latest scholarship and course materials in a twenty-first-century format that is
readily discoverable, easily customizable, and consistently affordable.

www.orangegrovetexts.com

Orange Grove Texts Plus · University Press of Florida

ISBN 978-1-61610-165-7  $65.00

Florida A&M University, Tallahassee
Florida Atlantic University, Boca Raton
Florida Gulf Coast University, Ft. Myers
Florida International University, Miami
Florida State University, Tallahassee
New College of Florida, Sarasota
University of Central Florida, Orlando
University of Florida, Gainesville
University of North Florida, Jacksonville
University of South Florida, Tampa
University of West Florida, Pensacola
20 Lectures on Eigenvectors,
Eigenvalues, and Their Applications
Problems in Chemical Engineering

L. E. Johns

University Press of Florida

Gainesville · Tallahassee · Tampa · Boca Raton


Pensacola · Orlando · Miami · Jacksonville · Ft. Myers · Sarasota
Copyright 2015 by the University of Florida Board of Trustees on behalf of the University of Florida
Department of Chemical Engineering

This work is licensed under a modified Creative Commons Attribution-Noncommercial-No


Derivative Works 3.0 Unported License. To view a copy of this license, visit http://creativecommons.
org/licenses/by-nc-nd/3.0/. You are free to electronically copy, distribute, and transmit this work if
you attribute authorship. However, all printing rights are reserved by the University Press of Florida
(http://www.upf.com). Please contact UPF for information about how to obtain copies of the work
for print distribution. You must attribute the work in the manner specified by the author or licensor
(but not in any way that suggests that they endorse you or your use of the work). For any reuse or
distribution, you must make clear to others the license terms of this work. Any of the above conditions
can be waived if you get permission from the University Press of Florida. Nothing in this license
impairs or restricts the author’s moral rights.

ISBN 978-1-61610-165-7

Orange Grove Texts Plus is an imprint of the University Press of Florida, which is the scholarly
publishing agency for the State University System of Florida, comprising Florida A&M University,
Florida Atlantic University, Florida Gulf Coast University, Florida International University, Florida
State University, New College of Florida, University of Central Florida, University of Florida,
University of North Florida, University of South Florida, and University of West Florida.

University Press of Florida


15 Northwest 15th Street
Gainesville, FL 32611-2079
http://orangegrovetexts.com
Contents

Acknowledgment ix

What Sets this Book Apart? xi

Reader’s Guide xv

Lecture 1: Getting Going . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xvi

Lecture 2: Independent and Dependent Sets of Vectors . . . . . . . . . . . . . . . . . . xvii

Lecture 3: Vector Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xviii

Lecture 4: Inner Products . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xviii

Lecture 5: Eigenvectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xix

Lecture 6: The Solution of Differential and Difference Equations . . . . . . . . . . . . xxi

Lecture 7: Simple Chemical Reactor Models . . . . . . . . . . . . . . . . . . . . . . . xxiv

Lecture 8: The Inverse Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xxiv

Lecture 9: More Uses of Gerschgorin’s Circle Theorem . . . . . . . . . . . . . . . . . . xxv

Lecture 10: A Word To The Reader Upon Leaving Finite Dimensional Vector Spaces . . xxv

Lecture 11: The Differential Operator ∇2 . . . . . . . . . . . . . . . . . . . . . . . . . xxv

Lecture 12: Diffusion in Unbounded Domains . . . . . . . . . . . . . . . . . . . . . . xxvi

Lecture 13: Multipole Expansions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xxvi

Lecture 14: One Dimensional Diffusion in Bounded Domains . . . . . . . . . . . . . . xxvi

Lecture 15: Two Examples of Diffusion in One Dimension . . . . . . . . . . . . . . . . xxviii


Lecture 16: Diffusion in Bounded, Three Dimensional Domains . . . . . . . . . . . . . xxviii

Lecture 17: Separation of Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . xxix

Lecture 18: Two Stability Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . xxxi

Lecture 19: Ordinary Differential Equations . . . . . . . . . . . . . . . . . . . . . . . . xxxii

Lecture 20: Eigenvalues and Eigenfunctions of ∇2 in Cartesian, Cylindrical and Spherical
Coordinate Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xxxiii

Part I: Elementary Matrices 1

1 Getting Going 3

1.1 Boiling curves . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3

1.2 A Simple Evaporator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10

1.3 A Less Simple Evaporator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13

1.4 The Hilbert Matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15

1.5 More Equations than Unknowns . . . . . . . . . . . . . . . . . . . . . . . . . . . 17

1.6 Getting Ready for Lectures 2 and 3 . . . . . . . . . . . . . . . . . . . . . . . . . . 21

1.7 A Source of Linear Problems: Boiling an Azeotrope . . . . . . . . . . . . . . . . . 23

1.8 Home Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27

2 Independent and Dependent Sets of Vectors 29

2.1 Linear Independence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29

2.2 Independent Chemical Reactions . . . . . . . . . . . . . . . . . . . . . . . . . . . 30

2.3 Looking at the Problem Ax=b from the Point of View of Linearly Independent
Sets of Vectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32

2.4 Biorthogonal Sets of Vectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35

2.5 The Number of Linearly Independent Vectors in a Set of Vectors and the Rank of
a Matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37

2.6 Derivatives of Determinants . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39



2.7 Work for the Reader . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41

2.8 Looking Ahead . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43

2.9 Home Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43

3 Vector Spaces 47

3.1 Vector Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47

3.2 The Image and the Kernel of a Matrix: The Geometric Meaning of Its Rank . . . . 48

3.3 The Facts about the Solutions to the Problem Ax=b . . . . . . . . . . . . . . . . 52

3.4 Systems of Differential Equations . . . . . . . . . . . . . . . . . . . . . . . . . . 53

3.5 Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55

3.6 Home Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56

4 Inner Products 67

4.1 Inner Products . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67

4.2 Adjoints . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68

4.3 Solvability Conditions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69

4.4 Invariant Subspaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70

4.5 Example: Functions Defined on 0 ≤ x ≤ 1 . . . . . . . . . . . . . . . . . . . . . 72

4.6 Projections . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73

4.7 The Projection Theorem: Least Squares Approximations . . . . . . . . . . . . . . 76

4.8 Generalized Inverses . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79

4.9 Home Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83

5 Eigenvectors 85

5.1 Eigenvectors and Eigenvalues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85

5.2 Generalized Eigenvectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91

5.3 The Generalized Eigenvector Corresponding to a Double Eigenvalue . . . . . . . . 94

5.4 Complete Sets of Eigenvectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97



5.5 The Spectral Representation of a Matrix and a Derivation of the Kremser Equation 99

5.6 The Adjoint Eigenvalue Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . 105

5.7 Eigenvector Expansions and the Solution to the Problem Ax=b . . . . . . . . . . 105

5.8 Solvability Conditions and the Solution to Perturbation Problems . . . . . . . . . . 107

5.9 More Information . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109

5.10 Similarity or Basis Transformations . . . . . . . . . . . . . . . . . . . . . . . . . 112

5.11 Home Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115

6 The Solution of Differential and Difference Equations 123

6.1 A Formula for the Solution to dx/dt = Ax . . . . . . . . . . . . . . . . . . . . 123

6.2 Gerschgorin’s Circle Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129

6.3 A Formula for the Solution to x(k + 1) = Ax(k) . . . . . . . . . . . . . . . . . 130

6.4 The Stiffness Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131

6.5 The Use of Generalized Eigenvectors . . . . . . . . . . . . . . . . . . . . . . . . . 133

6.6 Improving the Performance of a Linear Stripping Cascade . . . . . . . . . . . . . 135

6.7 Another Way to Solve dx/dt = Ax . . . . . . . . . . . . . . . . . . . . . . . . . 138

6.8 The Solution to Higher Order Equations . . . . . . . . . . . . . . . . . . . . . . . 140

6.9 Roots of Polynomials . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142

6.10 The Matrix e^A . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145

6.11 A = A (t) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146

6.12 Home Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148

7 Simple Chemical Reactor Models 179

7.1 The Chemostat . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 179

7.2 The Stirred Tank Reactor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 182

7.3 Home Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 195



8 The Inverse Problem 207

8.1 A First Order Chemical Reaction Network. . . . . . . . . . . . . . . . . . . . . . 208

8.2 Liquid Levels in a Set of Interconnected Tanks . . . . . . . . . . . . . . . . . . . . 216

8.3 Home Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 222

9 More Uses of Gerschgorin’s Circle Theorem 231

9.1 Difference Approximations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 231

9.2 Home Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 238

Part II: Elementary Differential Equations 241

10 A Word To The Reader Upon Leaving Finite Dimensional Vector Spaces 243

11 The Differential Operator ∇2 245

11.1 The Differential Operator ∇ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 245

11.2 New Coordinate Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 247

11.3 The Surface Gradient . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 250

11.4 A Formula for ∇2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 254

11.5 Domain Perturbations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 260

11.6 Home Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 267

12 Diffusion in Unbounded Domains 273

12.1 Power Moments of an Evolving Solute Concentration Field:
The Formula D = (1/2) dσ²/dt . . . . . . . . . . . . . . . . . . . . . . . . . . . . 273
12.2 Chromatographic Separations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 281

12.3 Random Walk Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 287

12.4 The Point Source Solution to the Diffusion Equation . . . . . . . . . . . . . . . . 292

12.5 Home Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 298



13 Multipole Expansions 303

13.1 Steady Diffusion in an Unbounded Domain . . . . . . . . . . . . . . . . . . . . . 303

13.2 Home Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 309

14 One Dimensional Diffusion in Bounded Domains 319

14.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 319

14.2 Orthogonality of the Functions ψn . . . . . . . . . . . . . . . . . . . . . . . . . . 322

14.3 Least Mean Square Error . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 325

14.4 Example Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 328

14.5 An Eigenvalue Problem Arising in Frictional Heating . . . . . . . . . . . . . . . . 347

14.6 More on Examples (5) and (6) . . . . . . . . . . . . . . . . . . . . . . . . . . . . 350

14.7 Differentiating the Eigenvalue Problem . . . . . . . . . . . . . . . . . . . . . . . 354

14.8 The Use of a Nondiagonalizing Basis. . . . . . . . . . . . . . . . . . . . . . . . . 357

14.9 A Warning About Series Solutions . . . . . . . . . . . . . . . . . . . . . . . . . . 361

14.10 Home Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 364

15 Two Examples of Diffusion in One Dimension 373

15.1 Instability due to Diffusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 373

15.2 Petri Dish . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 378

15.3 A Lecture 14 Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 382

15.4 Home Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 383

16 Diffusion in Bounded, Three Dimensional Domains 385

16.1 The Use of the Eigenfunctions of ∇2 to Solve Inhomogeneous Problems . . . . . . 385

16.2 The Facts about the Solutions to the Eigenvalue Problem . . . . . . . . . . . . . . 393

16.3 The Critical Size of a Region Confining an Autothermal Heat Source . . . . . . . . 394

16.4 Solvability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 398

16.5 ∇4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 399

16.6 Vector Eigenvalue Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 401

16.7 Home Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 403

17 Separation of Variables 409

17.1 Separating Variables in Cartesian, Cylindrical and Spherical Coordinate Systems . 409

17.2 How the Boundary Conditions Fix the Eigenvalues . . . . . . . . . . . . . . . . . 415

17.3 Solving a Two Dimensional Diffusion Problem in Plane Polar Coordinates (Spherical
Coordinates in Two Dimensions) . . . . . . . . . . . . . . . . . . . . . . . . . . . 418

17.4 The Zeros of cos z and J0(z) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 428

17.5 Separation of Variables in Two More Coordinate Systems . . . . . . . . . . . . . . 431

17.6 Home Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 433

18 Two Stability Problems 457

18.1 The Saffman-Taylor Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 459

18.2 The Rayleigh-Taylor Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . 468

18.3 Home Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 475

19 Ordinary Differential Equations 497

19.1 Boundary Value Problems in Ordinary Differential Equations . . . . . . . . . . . . 497

19.2 The Wronskian of Two Solutions to Lu = 0 . . . . . . . . . . . . . . . . . . . . . 502

19.3 The General Solution to Lu = f . . . . . . . . . . . . . . . . . . . . . . . . . . . 503

19.4 Solving the Homogeneous Problem f = 0, g0 = 0, g1 = 0 . . . . . . . . . . . . . 504

19.5 Solving the Inhomogeneous Problem . . . . . . . . . . . . . . . . . . . . . . . . . 505

19.6 The Case D ≠ 0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 507

19.7 The Case D = 0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 509

19.8 The Green’s Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 512

19.9 What is δ(x) Doing? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 518



19.10 Turning a Differential Equation into an Integral Equation: Autothermal Heat
Generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 521

19.11 The Eigenvalue Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 522

19.12 Solvability Conditions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 526

19.13 Home Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 529

20 Eigenvalues and Eigenfunctions of ∇2 in Cartesian, Cylindrical
and Spherical Coordinate Systems 539

20.1 Cartesian Coordinates: Sines, Cosines and Airy Functions . . . . . . . . . . . . . 539

20.2 Cylindrical Coordinates: Bessel Functions . . . . . . . . . . . . . . . . . . . . . . 544

20.3 Spherical Coordinates: Spherical Harmonics, the Method of Frobenius . . . . . . 565

20.4 Small Amplitude Oscillations of an Inviscid Sphere . . . . . . . . . . . . . . . . . 592

20.5 The Solution to Poisson’s Equation . . . . . . . . . . . . . . . . . . . . . . . . . 596

20.6 Home Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 598

Index 627
Acknowledgment

In writing out these lectures in this way I was guided by what I imagined Charles Petty and Anthony
DeGance, students in the 60’s and 70’s, and Ranganathan Narayanan, a colleague from the 80’s to
the present, might ask me if I were teaching this material to them.

Thanks are due Debbie Sandoval and Santiago A. Tavares for the best possible help.

What Sets this Book Apart?

First, it has a detailed reader's guide. Setting that aside, we answer the question:

There is no formal mathematics in this book; the proofs presented are mostly sketches. But
plausibility is not slighted, and the geometric interpretation of the results obtained is a theme that
is maintained throughout the book.

We present the theory of finite dimensional spaces in Part I, first because there are many in-
teresting problems that can be formulated and solved in finite dimensions and second because the
main ideas in n dimensional spaces can be illustrated by drawing pictures in two dimensions.

Then, without stretching the reader's imagination too much, in Part II we carry these pictures
over to infinite dimensional spaces. Indeed, often what is of interest in an infinite dimensional
space is a finite dimensional subspace.

In both finite and infinite dimensional spaces our search is always for a problem specific basis,
hence eigenvectors and eigenfunctions become a second theme. We make the jump from Part I to
Part II by introducing difference approximations to the solutions to the diffusion equation.

Now whether we are solving problems in Part I or in Part II, what we do is always the same. We
solve an eigenvalue problem; then all the steps come down to summation in Part I or integration in
Part II. No series, finite or infinite, is differentiated in order to solve a problem. For instance, the
method of separation of variables is used only to solve eigenvalue problems, never to solve initial
value problems.

An explanation of domain perturbations is presented in order to extend the problems that can
be solved to domains other than cubes, cylinders and spheres.

Chemical engineering students need to solve problems having physical origins. Thus problems
of physical interest are presented in every lecture, and readers will meet many of these problems
in their other courses or in their research, or they will find these problems to be the first problems
they would solve in learning a new subject. This is a theme.

We often obtain linear problems via perturbation methods, and because of this there is a strong
emphasis on solvability conditions and hence on inner products. Solvability is thus a theme, as is
the use of the eigenfunctions themselves to reveal patterns in nonlinear problems.

A short list of the examples presented in the lectures, and some of what they illustrate, follows:


Boiling curves: linear approximation; matrix multiplication

Greatest number of reactions among M molecules made up of A atoms: linear independence; rank; determinant

Kremser equation: spectral decomposition

Dynamic stripping cascade: generalized eigenvectors

Chemostat and stirred tank reactor: eigenvalues; branch points; Hopf bifurcations

Isomerization reactions and draining tanks: eigenvectors

Difference approximations: Gerschgorin's theorem

Chromatography: power moments

Electrical potential: multipole expansions

Activator-inhibitor kinetics: eigenvalues

Petri dish: solvability

Size of a confined autothermal heat source: eigenvalues

Saffman-Taylor and Rayleigh-Taylor problems: separation of variables; integral constraints; patterns derived from eigenfunctions

Energy of a quantum ball: separation of variables

Solute dispersion: eigenvalues and eigenfunctions

Oscillations of an inviscid drop: Cartesian, cylindrical and spherical coordinates

Many home problems stem from these examples, and many of the home problems are not
questions but stories leading the reader to derive some well-known results.
Reader’s Guide

Before we outline the main ideas presented in each lecture we present an overview stating how the
lectures fit together.

Part I has to do with problems whose solutions lie in finite dimensional spaces. Part II has to
do with problems whose solutions are functions.

Thus in Part I the subject is matrices; in Part II the subject is ∇2 and the differential equations
arising upon use of the method of separation of variables.

The applications are indicated in the opening statement: “What Sets this Book Apart?”

Diffusion is a theme of Part II, but only because ∇2 would not be so interesting if there were
no diffusion. Of course other themes could have been chosen from classical physics.

The first five lectures bring the reader to the point where they understand the basic facts about
the eigenvectors and eigenvalues of matrices.

The sixth lecture then uses these ideas to write the solution to systems of constant coefficient
ordinary differential equations.

The seventh and eighth lectures are applications of the sixth lecture to problems the reader
might have learned something about as an undergraduate.

The ninth lecture makes the transition to Part II by solving difference approximations to the
diffusion equation. The idea is that the expansion of the solution in the eigenvectors of a matrix
carries over to the solution of the diffusion equation itself in Part II.

Lectures 11, 14, 16 and 17 present the basic facts about the eigenvalues and eigenvectors of
∇2 in a bounded domain. Lectures 12 and 13 explain a little about problems in unbounded
domains, mostly by the use of the method of moments to derive simple facts about concentration
distributions.

In Lecture 11 formulas for ∇ and ∇2 are presented in various coordinate systems.

In Lecture 14 the dependence of the eigenfunctions and eigenvalues of ∇2 on the boundary
conditions is illustrated in one space dimension.

Lecture 15 presents two applications, one to activator-inhibitor kinetics, the other to the con-
struction of the solution to a nonlinear reaction-diffusion problem.

Lecture 16 repeats Lecture 14, but now we have three space dimensions and volume and surface
sources are taken into account.

The method of separation of variables is presented in Lecture 17 and two well-known stability
problems are solved in Lecture 18.

Lecture 19 is about the second order ordinary differential equations that present themselves
upon the use of separation of variables. Their solutions by Frobenius' method appear in Lecture
20.

Lecture 20 presents applications of the eigenfunctions and eigenvalues of ∇2 to problems I
like in Cartesian, cylindrical and spherical coordinates. The problems are not about summing
infinite series but instead illustrate the physical significance of the eigenvalues and eigenfunctions
themselves.

Lecture 1: Getting Going

Our aim is to introduce linear problems and to suggest how they might arise.

A linear problem can be written

L x = f

where x lies in the space of unknown vectors and f lies in the space of vectors that drive the
problem. We seek x such that L carries x into f, where L is a linear operator in the sense that

L (x + y) = L x + L y
The simplest linear problems are those where

x = (x1, x2, · · · , xn)^T  and  f = (f1, f2, · · · , fm)^T

Then L = A, an m × n matrix with entries aij, and we write our problem A x = f.
The most important point in Lecture 1 is that A should be viewed as a set of n columns
 
A = ( a1  a2  · · ·  an )

each lying in the space where f lies.

What we are mostly interested in is learning whether or not there is a solution and if there is,
how many.

So we start with A x = f and write it

a1 x1 + a2 x2 + · · · + an xn = f

This causes us to ask our question in the following way: Can f be written as a linear combination
of the set of vectors a1 , a2 , · · · , an ? If it can then the coefficients in the expansion of f are the
elements of x, our answer.
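The column picture can be tried out numerically. A minimal sketch, assuming NumPy; the 3 × 2 matrix and right-hand side below are invented for illustration:

```python
import numpy as np

# A viewed as a set of columns a1, a2 lying in the space where f lies (here R^3).
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
f = np.array([2.0, 3.0, 5.0])  # f = 2*a1 + 3*a2, so an expansion exists

# Least squares finds coefficients x making a1*x1 + a2*x2 as close to f as possible;
# if the fit is exact, f really is a linear combination of the columns.
x, residual, rank, _ = np.linalg.lstsq(A, f, rcond=None)
print(x)                       # coefficients in the expansion of f
print(np.allclose(A @ x, f))   # True: f lies in the span of the columns
```

The coefficients returned are exactly the elements of the answer x in A x = f.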

Lecture 2: Independent and Dependent Sets of Vectors

We have a vector f lying in the space Cm of dimension m and we have a set of vectors
a1, a2, · · · , an also lying in Cm, and we want to know how much of Cm can be included in the
linear combinations of a1, a2, · · · , an. Then we want to know if f lies in that part of Cm.

The main idea is introduced: linear independence. It is illustrated by the question: how many
independent chemical reactions can be written among M molecules made up of A atoms?

The function det is introduced where det acting on a square matrix produces a scalar, called
its determinant.

We define the rank of a matrix, denoted r, where the matrix may or may not be square. And
we identify a set of r basis columns. The basis columns are independent, r is the largest number
of independent columns that can be found and any set of r independent columns is a set of basis
columns. Every set of r + 1 columns is dependent and each of the remaining n − r columns must
be a linear combination of the r basis columns.
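The reaction-counting question can be answered with the rank alone. A sketch, assuming NumPy; the molecule set {CO, CO2, O2, C} is chosen here only as an example:

```python
import numpy as np

# Atom matrix for molecules {CO, CO2, O2, C} over atoms {C, O}:
# entry (i, j) = number of atoms of type i in molecule j.
atom_matrix = np.array([[1, 1, 0, 1],   # C
                        [1, 2, 2, 0]])  # O

M = atom_matrix.shape[1]                # number of molecules
r = np.linalg.matrix_rank(atom_matrix)  # largest number of independent columns

# Every balanced reaction is a dependence among the columns, so the greatest
# number of independent reactions is M - r.
print(M - r)  # 2, e.g. C + O2 -> CO2 and 2 CO -> CO2 + C
```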

Lecture 3: Vector Spaces

The idea of a vector space is presented, its dimension is defined and the idea of a basis is introduced.

Two vector spaces associated with an m × n matrix A are introduced: Im A and Ker A. The
dimension of Im A is r, the dimension of Ker A is n − r. Im A is the collection of all vectors A x,
Ker A is the collection of all vectors x such that A x = 0.

The solvability condition for A x = b is then given. Thus if b ∈ Im A then A x = b is
solvable, and the general solution is x0, a particular solution, plus an arbitrary vector lying in
Ker A.
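The structure "particular solution plus anything in Ker A" can be checked numerically. A sketch, assuming NumPy; the rank-one matrix and the tolerance 1e-12 are invented for illustration:

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0]])  # rank 1: the second row is twice the first

# Ker A from the SVD: right singular vectors whose singular value is (numerically) zero.
U, s, Vt = np.linalg.svd(A)
r = int(np.sum(s > 1e-12))      # dim Im A
kernel = Vt[r:].T               # columns span Ker A, dimension n - r

b = np.array([6.0, 12.0])       # b in Im A, since b = A @ [1, 1, 1]
x0, *_ = np.linalg.lstsq(A, b, rcond=None)  # one particular solution

# Any x0 + kernel @ c also solves A x = b.
c = np.array([0.7, -1.3])
x = x0 + kernel @ c
print(np.allclose(A @ x, b))    # True
print(A.shape[1] - r)           # dim Ker A = n - r = 2
```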

Lecture 4: Inner Products

To be able to extend what we now know to problems beyond matrix problems and free ourselves of
the determinant function which only applies to matrix problems, we introduce the idea of an inner
product, paying most attention to the case where A is n × n.

We are looking for a way to tell if a vector b lies in the subspace Im A, where the dimension
of Im A is r.

Now we can find Ker A by solving

A x = 0

It is of dimension n − r.

To find Im A we introduce a new matrix A∗ , called the adjoint of A, where A∗ depends on


the inner product in which we are working.

Having A∗ we find Ker A∗ by solving A∗ x = 0, where Ker A∗ has dimension n − r. And


Ker A∗ has an interesting geometric property: it is orthogonal to Im A. Thus b ∈ Im A if and
only if b is orthogonal to all y ∈ Ker A∗ .

Hence to find out if the problem

Ax = b

is solvable we must test b against n − r independent solutions of A∗ y = 0: A x = b is
solvable if and only if b is orthogonal to each of them.
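This orthogonality test is easy to sketch numerically, assuming NumPy and taking A∗ in the standard inner product (the conjugate transpose); the rank-deficient matrix and tolerances below are invented for illustration:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [2.0, 4.0]])   # rank 1, so Ker A* has dimension n - r = 1

# Left singular vectors with zero singular value span Ker A*.
U, s, Vt = np.linalg.svd(A)
r = int(np.sum(s > 1e-12))
y = U[:, r:]                  # columns span Ker A*

def solvable(b, tol=1e-10):
    # A x = b is solvable iff b is orthogonal to every y in Ker A*.
    return bool(np.all(np.abs(y.T @ b) < tol))

print(solvable(np.array([1.0, 2.0])))  # True:  (1, 2) lies in Im A
print(solvable(np.array([1.0, 0.0])))  # False: (1, 0) does not
```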

Lecture 5: Eigenvectors

Here we denote by A an n × n matrix and we ask if there are vectors x that are mapped by A
without change of direction, i.e., we ask for x’s such that

Ax = λx

This is the eigenvalue problem for A, and solutions (x ≠ 0, λ) are called eigenvectors and
eigenvalues. If x is a solution, so is c x for any c ≠ 0.

Writing A x = λ x as

(A − λ I) x = 0
READER’S GUIDE xx

we see that in order that solutions x ≠ 0 can be found the λ's must be such that the rank of
A − λ I is less than n, i.e., we must have

det (A − λ I) = 0

This equation has n roots: d distinct values λ1 , λ2 , · · · , λd repeated m1 , m2 , · · · , md times,
where m1 + m2 + · · · + md = n. The m's are called the algebraic multiplicities of the λ's, and to
each λ there is at least one x ≠ 0.

The number of independent solutions corresponding to a root λ is the dimension of Ker (A − λ I).

Denoting these dimensions n1 , n2 , · · · , nd , corresponding to λ1 , λ2 , · · · , λd we have n1 in-


dependent eigenvectors corresponding to λ1 , etc. and n1 ≤ m1 , etc.

Now we pay most of our attention to the plain vanilla case where d = n and
n1 = 1 = m1 , etc.

Then we have n distinct eigenvalues and the corresponding n eigenvectors are independent and
form a basis.

Introducing an inner product, we can derive A∗ , the adjoint of A, and write its eigenvalue
problem

A∗ y = µ y

whereupon we find the µ’s are the complex conjugates of the λ’s and the set of eigenvectors y 1 , y 2 , · · · , y n is biorthogonal to the set x 1 , x 2 , · · · , x n , viz., ⟨ y i , x j ⟩ = 0 whenever i ≠ j

Thus any vector x can be expanded in two ways:

x = Σ ci x i ,  where ci = ⟨ y i , x ⟩

and

x = Σ di y i ,  where di = ⟨ x i , x ⟩
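The first expansion can be sketched numerically, assuming the ordinary complex inner product ⟨ u, v ⟩ = ūT v, in which A∗ is the conjugate transpose; the matrix here is an arbitrary illustration:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [0.0, 3.0]])            # distinct eigenvalues 2 and 3

lam, X = np.linalg.eig(A)             # columns of X are the eigenvectors x_i
Y = np.linalg.inv(X).conj().T         # columns are adjoint eigenvectors y_i,
                                      # normalized so <y_i, x_j> = delta_ij

# Expand an arbitrary x in the eigenvector basis: c_i = <y_i, x>.
x = np.array([1.0, 1.0])
c = Y.conj().T @ x
x_rebuilt = X @ c
```

Taking the adjoint eigenvectors from the rows of the inverse of the eigenvector matrix builds in the normalization ⟨ y i , x i ⟩ = 1.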

We solve the linear stripping cascade problem and derive a symmetric form of the Kremser
equation.

Lecture 6: The Solution of Differential and Difference Equations

We are going to learn how to solve the system of differential equations

dx/dt = A x ,  t > 0

where x = (x1 , x2 , . . . , xn )T denotes the time dependent unknowns, x at t = 0 is specified, and A denotes an n × n matrix of constants.

The ordinary case is where A has a complete set of n independent eigenvectors, x 1 , x 2 , · · · , x n , corresponding to eigenvalues λ1 , λ2 , · · · , λn . To solve our problem we are going to write x (t) in terms of x 1 , x 2 , · · · , x n , viz.,

x (t) = c1 (t) x 1 + c2 (t) x 2 + · · · + cn (t) x n

and try to find c1 (t) , c2 (t) · · ·

To do this we introduce an inner product, denoted ⟨ , ⟩, and in this inner product we derive the adjoint of A, viz., A∗ .

Then the eigenvectors of A∗ , viz., y 1 , y 2 , · · · , y n , with the normalization ⟨ y 1 , x 1 ⟩ = 1, etc. form a set of vectors biorthogonal to the set of eigenvectors of A.

Thus we have

c1 = ⟨ y 1 , x (t) ⟩

c2 = ⟨ y 2 , x (t) ⟩

etc.

and to derive the equation for c1 we take the steps

⟨ y 1 , dx/dt ⟩ = ⟨ y 1 , A x ⟩  =⇒  (d/dt) ⟨ y 1 , x ⟩ = ⟨ A∗ y 1 , x ⟩  =⇒

d c1 /dt = ⟨ λ̄1 y 1 , x ⟩ = λ1 ⟨ y 1 , x ⟩ = λ1 c1
and hence we have

c1 (t) = c1 (t = 0) e^{λ1 t}

And our solution is

x (t) = ⟨ y 1 , x (t = 0) ⟩ e^{λ1 t} x 1 + ⟨ y 2 , x (t = 0) ⟩ e^{λ2 t} x 2 + etc.
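The recipe can be checked numerically. A sketch, with an arbitrary 2 × 2 matrix and the ordinary inner product, verifying that the eigenexpansion really satisfies dx/dt = A x:

```python
import numpy as np

A = np.array([[0.0, 1.0],
              [-2.0, -3.0]])          # eigenvalues -1 and -2
lam, X = np.linalg.eig(A)
Y = np.linalg.inv(X).conj().T         # adjoint eigenvectors, <y_i, x_j> = delta_ij

def x_of_t(x0, t):
    # x(t) = sum_i <y_i, x(0)> e^{lam_i t} x_i
    c0 = Y.conj().T @ x0
    return (X * np.exp(lam * t)) @ c0

x0 = np.array([1.0, 0.0])
t, h = 0.5, 1e-6
# Centered difference in t: should reproduce A x(t).
dxdt = (x_of_t(x0, t + h) - x_of_t(x0, t - h)) / (2 * h)
```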

If instead of a differential equation we have a difference equation, viz.,

x (n + 1) = A x (n) n = 0, 1, 2, . . .

we find

x (n) = ⟨ y 1 , x (n = 0) ⟩ λ1^n x 1 + ⟨ y 2 , x (n = 0) ⟩ λ2^n x 2 + etc.

Stability of solutions to our differential equation requires all the eigenvalues of A to lie in the left half of the complex plane, viz., Re λ < 0, whereas stability of solutions to our difference equation requires all eigenvalues to lie inside the unit circle, | λ | < 1.

We then explain what to do if eigenvectors are missing. For example if λ1 is a root of algebraic
multiplicity 2 but geometric multiplicity 1, i.e., dim Ker (A − λ1 I) = 1 so that there is only one
independent eigenvector corresponding to λ1 , we are going to be short one eigenvector and we will
not have an eigenvector basis for our space. What we do to overcome this difficulty is to introduce
generalized eigenvectors. Thus we write

A x 1 = λ1 x 1

and

A x 2 = x 1 + λ1 x 2

And hence writing

x (t) = c1 (t) x 1 + c2 (t) x 2

we find

d c1 /dt = λ1 c1 + c2

and

d c2 /dt = λ1 c2

whereupon c1 and c2 can be found sequentially, and the factor teλ1 t appears.
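The defective case can be sketched with a 2 × 2 Jordan block (chosen here for illustration); the closed form exhibiting the factor t e^{λ1 t} is checked against a power series evaluation of e^{At}:

```python
import numpy as np

lam = -1.0
A = np.array([[lam, 1.0],
              [0.0, lam]])   # one eigenvector (1, 0); (0, 1) is generalized

def x_of_t(x0, t):
    # Sequential solution: c2 = c2(0) e^{lam t}, c1 = (c1(0) + c2(0) t) e^{lam t}.
    return np.exp(lam * t) * np.array([x0[0] + x0[1] * t, x0[1]])

def expm_series(M, terms=40):
    # e^M from its power series, well converged here since ||M|| is small.
    out, term = np.eye(2), np.eye(2)
    for k in range(1, terms):
        term = term @ M / k
        out = out + term
    return out

x0, t = np.array([1.0, 2.0]), 0.7
```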

We use this to solve a dynamic linear stripping cascade problem.



Lecture 7: Simple Chemical Reactor Models

We put to use what we have been learning and we do this in the context of a chemostat and a very
simple stirred tank reactor, but a reactor that retains the interesting physics of these reactors. Thus
as the reaction proceeds it releases heat, it uses up reactants and it speeds up as the temperature
increases.

The stirred tank reactor model is two dimensional and therefore 2 × 2 matrices turn up in the
investigation of the stability of its steady states. The eigenvalues of a 2 × 2 matrix depend on
its trace T and determinant D and in the D − T plane stability obtains in the fourth quadrant,
D > 0, T < 0.
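A minimal sketch of the trace–determinant test (the matrices are arbitrary illustrations, cross-checked against the eigenvalues themselves):

```python
import numpy as np

def stable(A):
    # Eigenvalues of a 2x2 are the roots of s^2 - T s + D = 0; both have
    # negative real part exactly when T < 0 and D > 0.
    return np.linalg.det(A) > 0 and np.trace(A) < 0

A_focus = np.array([[-1.0, 2.0],
                    [-1.0, -1.0]])   # T = -2, D = 3: stable
A_saddle = np.array([[1.0, 2.0],
                     [2.0, 1.0]])    # D = -3 < 0: a saddle, unstable
```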

The model may be a bit too simple, but then we only need to solve quadratic equations to see
what is going on.

Lecture 8: The Inverse Problem

In this lecture we assume we have a model

dx/dt = A x

and that we can run experiments where we measure x (t) vs t.

The aim is to derive the elements of the matrix A from the measurements. The main idea is
that there are straight line paths, x (t) vs t, and that if we can find the directions of these straight
lines, we have the eigenvectors of A, and hence we can derive A from its spectral expansion

A = Σ λi xi yi^T

We illustrate this by solving the problem of measuring reaction rate coefficients in a system of
isomers, the Wei and Prater problem.
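The spectral expansion itself is easy to sketch numerically (real A, ordinary inner product; the yi are taken here from the rows of the inverse of the eigenvector matrix):

```python
import numpy as np

A = np.array([[1.0, 4.0],
              [2.0, 3.0]])            # eigenvalues 5 and -1
lam, X = np.linalg.eig(A)
Y = np.linalg.inv(X).T                # row i of inv(X) is y_i^T, so y_i^T x_j = delta_ij

# A = sum_i lam_i x_i y_i^T
A_rebuilt = sum(lam[i] * np.outer(X[:, i], Y[:, i]) for i in range(2))
```

In the inverse problem the xi come from the observed straight line paths; the expansion then delivers A.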

Lecture 9: More Uses of Gerschgorin’s Circle Theorem

This is a lecture on difference approximations to the solution of the diffusion equation, first, to
present some ideas about diffusion, second, to illustrate the use of Gerschgorin’s theorem in es-
timating the eigenvalues of a matrix and, third, to present the method of solution which will be
carried over to the diffusion equation itself in Part II.
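As a sketch of the theorem in this setting, here are the Gerschgorin discs of the usual second difference matrix for d²/dx² (assuming ψ = 0 at both ends and a hypothetical grid of 5 interior points with spacing h = 1/6):

```python
import numpy as np

def gerschgorin_discs(A):
    # Every eigenvalue of A lies in at least one disc centered at a_ii
    # with radius sum over j != i of |a_ij|.
    radii = np.sum(np.abs(A), axis=1) - np.abs(np.diag(A))
    return list(zip(np.diag(A), radii))

n, h = 5, 1.0 / 6
A = (np.diag(-2.0 * np.ones(n))
     + np.diag(np.ones(n - 1), 1)
     + np.diag(np.ones(n - 1), -1)) / h**2

discs = gerschgorin_discs(A)        # all centered at -2/h^2, radius at most 2/h^2
eigs = np.linalg.eigvalsh(A)        # A symmetric, so the eigenvalues are real
```

The discs place every eigenvalue in the interval [−4/h², 0], which is the estimate used for the difference approximation to the diffusion equation.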

Lecture 10: A Word To The Reader Upon Leaving Finite Dimensional Vector Spaces

This lecture presents a warning that we are leaving behind solutions that are finite sums and moving
ahead to solutions that are infinite sums, and that it is now important that we do not substitute our
proposed solutions into our equations, i.e., integration is the rule not differentiation.

Lecture 11: The Differential Operator ∇2

The main idea is to learn how to write ∇2 in orthogonal coordinate systems.

We begin by defining the gradient operator ∇ in terms of the derivative of a function along a
curve. Then we write ∇ in Cartesian coordinates and in any orthogonal coordinate system derived
from Cartesian coordinates.

We introduce the surface gradient operator, ∇S , so that we can differentiate a function defined
only on a surface and we go on and obtain a formula for the mean curvature of a surface.

We present a formula for ∇2 in an arbitrary orthogonal coordinate system and we work out
the details in cylindrical and spherical coordinate systems. Our emphasis is on the variation of the
base vectors from one point in space to a nearby point.

Then in order to solve problems on domains close to domains we like, we explain how domain
perturbations are carried out.

Lecture 12: Diffusion in Unbounded Domains

We begin our study of diffusion, and ∇2 , by deriving formulas for the power moments of a con-
centration field in an unbounded domain. Viewing the concentration as the probability of finding
a solute molecule at a certain point at a certain time, we introduce its mean, its variance, etc. We
then present an example where the effect of convection on the variance can be derived.

We use power moments to explain how chromatographic separations work.

We introduce a random walk model.

And we present the point source solution to the diffusion equation and explain superposition.

Lecture 13: Multipole Expansions

We continue solving problems in an unbounded domain and derive solutions to the problem of
steady diffusion from a source near the origin to a sink at infinity.

We introduce the monopole, dipole, quadrupole, etc. moments of the source and expand the
solution in these moments.

By doing this we obtain solutions to Poisson’s equation. We derive the electrical potential due
to a set of charges and thus the electrostatic potential energy of two charge distributions.

Lecture 14: One Dimensional Diffusion in Bounded Domains

We solve a one dimensional diffusion problem, viz.,

∂c/∂t = ∂²c/∂x²

on a bounded domain, say 0 ≤ x ≤ 1. The solute concentration is specified at t = 0, via

c = c0 (x) ≥ 0

This is the source of the solute. The sinks are at x = 0 and x = 1 where we specify a variety of
homogeneous boundary conditions.

To solve our diffusion problem we introduce an eigenvalue problem

d²ψ/dx² + λ² ψ = 0,  0 ≤ x ≤ 1

and try to decide what homogeneous boundary conditions we ought to require ψ to satisfy at x = 0
and x = 1 in order that we can use the eigenfunctions and eigenvalues in solving for c.

To help us do this we introduce two integration by parts formulas:

∫_0^1 φ (d²ψ/dx²) dx = [ φ dψ/dx ]_0^1 − ∫_0^1 (dφ/dx)(dψ/dx) dx

and

∫_0^1 φ (d²ψ/dx²) dx = [ φ dψ/dx − ψ dφ/dx ]_0^1 + ∫_0^1 ψ (d²φ/dx²) dx

Our expectation is that by solving our eigenvalue problem we will find an infinite set of orthogonal eigenfunctions in an inner product denoted ⟨ , ⟩ and by the theory of Fourier series we expect to be able to expand the solution to our diffusion problem as a linear combination of these functions.

Thus we write our solution

c (x, t) = Σ ci (t) ψi (x)

where

ci (t) = ⟨ ψi , c ⟩

and we derive the equation for ci , via


∂c/∂t = ∂²c/∂x²  =⇒  ψi ∂c/∂t = ψi ∂²c/∂x²  =⇒  ∫_0^1 ψi (∂c/∂t) dx = ∫_0^1 ψi (∂²c/∂x²) dx  =⇒

d ci /dt = [ ψi ∂c/∂x − c ∂ψi /∂x ]_0^1 + ∫_0^1 c (∂²ψi /∂x²) dx

It is at this point that we decide how to choose the boundary conditions that ψi must satisfy. Thus if c is specified at x = 0 and x = 1, we set ψi = 0 at x = 0 and x = 1 to eliminate the unknown ∂c/∂x at x = 0, 1. If ∂c/∂x is specified at x = 0 and c is specified at x = 1 we choose dψ/dx = 0 at x = 0 and ψ = 0 at x = 1. The plan is now apparent. We choose ψ at the boundary to remove the indeterminacy in the equation for ci , whereupon the λ²’s and ψ’s depend on the boundary conditions satisfied by c.

We present several examples differing from one another only in the boundary conditions at
x = 0 and x = 1.

Lecture 15: Two Examples of Diffusion in One Dimension

Two examples of diffusion in one dimension are presented. The first is an activator–inhibitor
model which illustrates an instability caused by diffusion, viz., the inhibitor diffuses away before
it can arrest the growth of a perturbation. The second is our Petri Dish problem, first introduced
in Lecture 1, where we now explain how to find a solution branch which appears as some input to
the problem advances beyond its critical value. We see that the amplitude of the branch depends
on the input variable in different ways for different nonlinearities.

Lecture 16: Diffusion in Bounded, Three Dimensional Domains

In this lecture the use of the eigenfunctions of ∇2 to solve inhomogeneous problems on bounded,
three dimensional domains is explained.

Our first job is to use Green’s two theorems to help us discover the boundary conditions that
the eigenfunctions must satisfy and then to derive the important facts about the eigenvalues and the
eigenfunctions.

We then indicate how diffusion eigenvalues can be used to estimate critical conditions in non-
linear problems, say, the critical size of a region confining an autothermal heat source.

Lecture 17: Separation of Variables

To solve the solute diffusion problem

∂c/∂t = ∇²c + Q

in a bounded domain, we introduce the eigenvalue problem

∇2 ψ + λ 2 ψ = 0

where ψ satisfies homogeneous conditions on the boundary of our domain. The specified function
Q assigns sources and sinks of solute on the domain. Other sources and sinks may be assigned at
the boundary of our domain.

The simplest domain shapes that we can deal with are those where we would introduce Carte-
sian, cylindrical or spherical coordinates.

Now separation of variables is the method ordinarily used to solve the eigenvalue problem.
And our aim here is to explain how it works in simple cases.

In Cartesian, cylindrical and spherical coordinates we substitute

ψ = X (x) Y (y) Z (z)

ψ = R (r) Θ (θ) Z (z)

and

ψ = R (r) Θ (θ) Φ (φ)

and obtain

d²X/dx² + α² X = 0    (1)

d²Y/dy² + β² Y = 0    (2)

and

d²Z/dz² + γ² Z = 0,    (3)

d²Z/dz² + γ² Z = 0    (4)

d²Θ/dθ² + m² Θ = 0    (5)

and

( d²/dr² + (1/r) d/dr ) R + ( λ² − m²/r² − γ² ) R = 0    (6)

and

d²Φ/dφ² + m² Φ = 0    (7)

(1/sin θ) d/dθ ( sin θ dΘ/dθ ) + ( ℓ (ℓ + 1) − m²/sin²θ ) Θ = 0    (8)

and

( d²/dr² + (2/r) d/dr ) R + ( λ² − ℓ (ℓ + 1)/r² ) R = 0    (9)

Eqs. (4), (5) and (7) are just like Eqs. (1), (2) and (3). Eqs. (6), (8) and (9) are new.

The reader ought to observe that Eqs. (1), (2) and (3) are independent of one another, but by the time we get to Eqs. (7), (8) and (9), Eq. (7) must be solved first so that the m’s are available in Eq. (8); then Eq. (8) must be solved so that the ℓ’s are available in Eq. (9). λ² appears only in Eq. (9), and it is independent of m².

We work out two simple two dimensional problems in order to see what changes occur as we
go from one coordinate system to another.

First our diffusion problem is set on a rectangle of sides a and b, c is specified on the perimeter
and Q is specified on the domain. We find two sets of orthogonal functions, viz.,

X = sin (mπx/a),  α² = m²π²/a²,  m = 1, 2, . . .

and

Y = sin (nπy/b),  β² = n²π²/b²,  n = 1, 2, . . .

Cartesian coordinates are special. These two sets of orthogonal functions are all that we need
to solve our problem and to obtain c (x, y, t) we take the same steps we took in the one dimensional
case, in Lecture 14.
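A sketch of the rectangle case (with hypothetical sides a = 1 and b = 2), checking that the product eigenfunction really satisfies ∇²ψ + λ²ψ = 0 with λ² = α² + β²:

```python
import numpy as np

a, b = 1.0, 2.0   # hypothetical sides of the rectangle

def lam2(m, n):
    # lambda^2 = alpha^2 + beta^2 for the product eigenfunction
    return (m * np.pi / a) ** 2 + (n * np.pi / b) ** 2

def psi(m, n, x, y):
    return np.sin(m * np.pi * x / a) * np.sin(n * np.pi * y / b)

# Check grad^2 psi + lambda^2 psi = 0 at an interior point by second differences.
m, n, x, y, h = 2, 3, 0.3, 0.7, 1e-4
lap = (psi(m, n, x + h, y) + psi(m, n, x - h, y)
       + psi(m, n, x, y + h) + psi(m, n, x, y - h)
       - 4.0 * psi(m, n, x, y)) / h**2
```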

Second our diffusion problem is now set on a circle of radius R0 . And again we assume c is specified on the circumference, while the solution must now also be bounded at the origin and periodic in θ.

Thus we obtain a set of orthogonal angular functions

Θm (θ) = (1/√(2π)) e^{i m θ} ,  m = · · · , −2, −1, 0, 1, 2, · · ·

and these can be used to expand the θ variation of our solution.

The corresponding radial functions satisfy


 
( d²/dr² + (1/r) d/dr − m²/r² + λ² ) R = 0

and we see something new: for each value of m2 we will have a corresponding set of radial
eigenfunctions and the set will differ as m2 differs.
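A sketch of the simplest radial set, taking m = 0 and ψ = 0 at r = R0 : the bounded solution is then J0 (λr), so the eigenvalues are λk = j0,k /R0 with j0,k the zeros of J0 , computed here from the power series in the spirit of Lecture 20:

```python
def J0(z, terms=40):
    # J0 from its power series: sum_k (-1)^k (z/2)^{2k} / (k!)^2
    s, term = 1.0, 1.0
    for k in range(1, terms):
        term *= -((z / 2.0) ** 2) / k**2   # ratio of consecutive terms
        s += term
    return s

def bisect(f, lo, hi, iters=80):
    # Assumes f changes sign exactly once on [lo, hi].
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(lo) * f(mid) <= 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

R0 = 1.0
j01 = bisect(J0, 2.0, 3.0)   # first zero of J0, near 2.4048
lam1 = j01 / R0              # smallest radial eigenvalue for m = 0
```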

Lecture 18: Two Stability Problems

In this Lecture we see that eigenvalues and eigenfunctions of ∇2 are themselves of great interest,
whether or not a series solution to a diffusion problem is being sought.

To make this point we solve two stability problems, the Saffman-Taylor problem and the
Rayleigh-Taylor problem. In both cases we imagine the setting to be a cylinder of circular cross section bounding a porous solid. Two immiscible fluids fill the pores and in one case a less viscous fluid is displacing a more viscous fluid, in the other case a heavy fluid lies above a light fluid.

The eigenvalues of ∇2 tell us the critical value of the input variable of interest, the eigenfunc-
tions tell us the pattern we ought to see at critical.

In the second problem the distinction between free and pinned edges bears on the possibility
of separating variables.

Lecture 19: Ordinary Differential Equations

Separation of variables leads to second order, linear, ordinary differential equations. The simple
facts about these equations are presented in this lecture, before the method of Frobenius for solving
these equations is outlined in Lecture 20. Our problem is to find u where u satisfies

Lu = f

B0 u = a0 at x=0

and

B1 u = a1 at x=1

where L is a linear, second order differential operator and B0 u, B1 u are linear combinations of u and u′ .

Two independent solutions of Lu = 0 are introduced and the general solution to Lu = f is presented. Then the homogeneous problem

Lu = 0

B0 u = 0 at x=0

and

B1 u = 0 at x=1

is taken up and conditions under which it has solutions other than zero are presented.

This leads to solvability conditions for the inhomogeneous problem.

The Green’s function is introduced.

The simple facts about the eigenvalue problem

Lψ + λ2 ψ = 0

B0 ψ = 0 at x=0

and

B1 ψ = 0 at x=1

are derived.

Lecture 20: Eigenvalues and Eigenfunctions of ∇2 in Cartesian, Cylindrical and Spherical Coordinate Systems

Solutions to the eigenvalue problem for ∇2 are presented in this lecture. The coordinate systems
of interest are Cartesian, cylindrical and spherical.

The method of Frobenius is presented and first used to obtain a power series expansion for the
Bessel function I0 (x). The coefficients in the series define the nature of the functions so obtained.
To emphasize this point we derive the zeros of J0 (z) and cos z from the coefficients in their power
series expansions.

Then the bounded solutions of the associated Legendre equation are worked out and the spher-
ical harmonics are introduced.

Applications to the problem of solute dispersion due to a velocity gradient, to the problem of
small amplitude oscillations of a nonviscous sphere and to the energies of a quantum ball in a
gravitational field are presented.
Part I

Elementary Matrices

This is the title of an old book by Frazer, Duncan and Collar; while that book is by no means elementary, the title does fit Part I of these lectures.
Lecture 1

Getting Going

In this lecture we introduce some ideas which point in the direction of our future work. The next
to last section presents the facts about the solutions to linear algebraic equations.

The last section suggests that we can learn something about nonlinear problems by solving
certain linear problems.

1.1 Boiling curves

To determine a boiling curve for a liquid solution, we heat the liquid and draw off an equilibrium
vapor. We denote the species making up the liquid by 1, 2, . . . , n in order of decreasing volatility or
increasing boiling point. Ordinarily the vapor is enriched in the more volatile species and so, as the
boiling goes on, the liquid composition shifts in favor of the less volatile species. To write a model
of this we denote by xi and yi the mole fractions of species i in the liquid and in the equilibrium
vapor and by N the number of moles of liquid being heated, then we write

dxi /ds = −yi + xi ,  i = 1, . . . , n

where
 
s = −ln [ N(t)/N(t = 0) ]


where s increases as t increases, and where, at constant pressure, y1 , . . . , yn and T are determined
by x1 , . . . , xn , assuming the phases remain in equilibrium as the liquid is boiled off. Writing
the phase equilibrium equations at constant pressure as yi = fi (x1 , x2 , . . . , xn ) and assuming Σ yi = 1 whenever Σ xi = 1 and yi = 0 whenever xi = 0 we see that if Σ xi = 1 and xi ≥ 0 at any point in the boiling process then Σ xi = 1 and xi ≥ 0 at any subsequent point.

It is enough to let n = 3. Then we can represent the state of the liquid geometrically by ~x = x1~i + x2~j + x3~k or algebraically by x = (x1 , x2 , x3 )T and observe that the state moves on the plane Σ xi = 1 but remains in the positive octant, i.e., the state space is the equilateral triangle whose vertices lie at ~i, ~j and ~k, or at (1, 0, 0)T , (0, 1, 0)T and (0, 0, 1)T .

The state space is not a vector space, indeed no sum of two vectors in the state space lies in the state space; it is a subset but not a subspace of R3 .

Should we wish to solve our system of equations, we first ought to develop some guidelines to what the solution looks like. The simplest guide posts are the points where the system comes to rest, the so called critical points, steady state points, equilibrium points, etc. These points are defined by dxi /ds = 0, i = 1, 2, . . . , n, and here this corresponds to the equations −yi + xi = 0, i = 1, 2, . . . , n, i.e., to the homogeneous azeotropes. The simplest of these lie at the vertices of the triangle, (points 1, 2, 3), next are the binary azeotropes lying on the edges (e.g., point 4) then the full ternary azeotropes on the face of the triangle (e.g., point 5), viz,
(Figure: the triangular state space with vertices 1, 2, 3, a binary azeotrope 4 on an edge and a ternary azeotrope 5 in the interior.)

If we can decide how points move in the neighborhood of these rest points, that is whether
they are attracted to or repelled by the rest points, we can begin to make a qualitative sketch of the
family of solutions to our problem. This property of a critical point is referred to as its stability and
we can get some information on stability by assuming the system is displaced a small amount from
a rest point and then determining whether this small displacement is strengthened or weakened.
To do this we construct a linear approximation to our model in the neighborhood of a rest point of
interest and then investigate its solution.

Denote by x0i , i = 1, 2, . . . , n a solution to the equations

0 = −fi (x01 , x02 , · · · , x0n ) + x0i ,  i = 1, 2, . . . , n

Then approximate the model when x is near x0 by writing x = x0 + ξ and retain only terms linear in ξ. By doing this we obtain

dξ/ds = A ξ

where A is an n × n matrix whose elements are aij = −(∂fi /∂xj )(x01 , x02 , · · · , x0n ) + δij
∂xj
This is a system of linear differential equations; what its solutions look like is determined by
the matrix A. To understand stability problems is one of our main interests in studying matrices
but it is far from our only interest as we will soon see. But for now, what is important is that
this example introduces the multiplication Aξ. The reader can carry out the calculation described
above and learn the rule determining the product.
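The linearization can be sketched numerically, with the derivatives taken by centered differences; the constant relative volatility model yi = αi xi / Σ αj xj and the volatilities used here are hypothetical illustrations, not the text’s:

```python
import numpy as np

alpha = np.array([3.0, 2.0, 1.0])   # hypothetical volatilities; species 1 most volatile

def f(x):
    # constant relative volatility equilibrium: y_i = alpha_i x_i / sum_j alpha_j x_j
    return alpha * x / (alpha @ x)

def linearize(x0, h=1e-6):
    # a_ij = -(df_i/dx_j)(x0) + delta_ij, derivatives by centered differences
    n = len(x0)
    J = np.zeros((n, n))
    for j in range(n):
        e = np.zeros(n)
        e[j] = h
        J[:, j] = (f(x0 + e) - f(x0 - e)) / (2 * h)
    return -J + np.eye(n)

A = linearize(np.array([0.0, 0.0, 1.0]))   # vertex 3: pure species 3, a rest point
eigs = np.linalg.eigvals(A)
```

Here the two eigenvalues seen by perturbations lying in the plane Σ ξi = 0 are negative, so the vertex of the least volatile species attracts; the third eigenvalue, +1, belongs to the direction off that plane and is not physical.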

How to Think about the Multiplication of a Vector by a Matrix


 
A column of n complex numbers is called a column vector and is denoted x = (x1 , x2 , . . . , xn )T . It belongs to the vector space denoted C n . It is also an n × 1 matrix and its transpose xT = (x1 x2 . . . xn ) is a 1 × n matrix. In the foregoing we multiplied a column vector on the left by a matrix, the result of doing this being another column vector. The rule for doing this is:

the product of an m × n matrix A and an n × p matrix B is an m × p matrix C where cij = Σ_{k=1}^{n} aik bkj and where aij lies in the ith row and the jth column of the matrix A.

This formula defines matrix multiplication but to see what is going on when two matrices are
multiplied it is useful to think about a matrix in terms of its columns or its rows instead of in terms
of its elements. Indeed an m × n matrix A is made up of n columns, each a column vector lying
in C m . Denoting these a1 , a2 , . . . , an , where a1 = (a11 , a21 , . . . , am1 )T , a2 = (a12 , a22 , . . . , am2 )T , · · · , an = (a1n , a2n , . . . , amn )T , we can write

A = (a1 a2 . . . an )

whence the product Ax is the column vector

x1 a1 + x2 a2 + · · · + xn an ∈ C m
LECTURE 1. GETTING GOING 7

where

x = (x1 , x2 , . . . , xn )T ∈ C n

The product Ax is then the linear combination of the columns of A determined by coefficients
taken to be the elements of x. Likewise in the product AB, each column of B belongs to C n
and determines the coefficients for the linear combination of the columns of A that adds up to the
corresponding column of the product. So the j th column of a product AB is a linear combination
of the columns of A, the coefficients being the elements of the j th column of B, indeed AB =
(Ab1 Ab2 . . .). Again each row of BA is a linear combination of the rows of A, the coefficients for
constructing the ith row of BA being the elements of the ith row of B. Ordinarily if AB is defined
BA is not and vice versa, the exception is when A and B are square. Then AB and BA need
not be equal. This way of looking at matrix multiplication will help us understand the solvability
conditions for linear algebraic equations, viz., Ax = b. In fact, if m > n, the picture

(Figure: the coefficient vector x in C n and, in C m , the columns a1 , a2 , . . . , an together with a vector b lying off their span)

suggests the need for a solvability condition.
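The column picture of the product is easy to check in a few lines (a sketch; the matrix is an arbitrary illustration with m = 3 > n = 2):

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])      # m = 3 > n = 2: the columns span a plane in C^3
x = np.array([2.0, 3.0])

by_rule = A @ x                               # the c_ij = sum_k a_ik b_kj rule
by_columns = x[0] * A[:, 0] + x[1] * A[:, 1]  # linear combination of the columns
```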



Just as

Ax = (a1 a2 . . . an ) (x1 , x2 , . . . , xn )T = a1 x1 + a2 x2 + · · · + an xn

so also, writing X in terms of its rows y1T , y2T , . . . , ynT ,

AX = (a1 a2 . . . an ) X = a1 y1T + a2 y2T + · · · + an ynT

where y1T is the first row of X, y2T the second, etc. and where a1 y1T is a matrix each of whose columns is a multiple of a1 . This way of looking at a matrix multiplication, instead of AX = (Ax1 Ax2 . . .) as above, will be useful later on in turning the solutions to the eigenvalue problem for A into a spectral representation of A.

Columns and rows turn up in a symmetric way. For every result about a set of columns there is a corresponding result about a set of rows. We emphasize columns and write a system of algebraic equations as Ax = b but we can use the transpose operation, denoted T , where the columns of A become the rows of AT , viz., AT has rows a1T , a2T , . . . , anT , and write the problem Ax = b as xT AT = bT .
Before leaving the boiling curve problem we can observe that the temperature of the liquid
plays no role. Because we ordinarily expect the temperature to increase as the boiling proceeds we
expect a family of solution curves for a plain vanilla liquid system to look as follows
(Figure: a family of solution curves on the triangular state space, moving away from vertices 1 and 2 toward the least volatile vertex 3.)

If there are binary and ternary azeotropes we first observe that binary azeotropes come in two
kinds, maximum boiling which are stable and minimum boiling which are unstable. Knowing
whether a binary azeotrope is maximum or minimum boiling allows us to determine what the
system does on the binary edges. But it may not determine what happens in the interior even in the
absence of ternary azeotropes. Thus if the 1, 2 binary has a maximum boiling azeotrope we expect
one of two phase portraits,

(Figure: two triangles side by side, each with vertex 3 at the top and vertices 1 and 2 at the base)

depending on whether the 1-2 azeotrope boils at a higher or lower temperature than does 3.

We might guess that if we calculate the temperature associated with each state and plot the
isotherms on the state diagram we can sketch the solution curves of our problem by insisting only
that the temperature not decrease as the boiling goes on. The temperature then is a sort of potential
for this problem.

The arrows on the edges 1-3 and 2-3 show that the vertex 3 is stable and therefore has at least
a small region of attraction. (Arrows pointing at vertex 3 will lie on the two edges that converge
on vertex 3 as long as there is not a 1-3 or 2-3 maximum boiling azeotrope.) This makes the first
of the two figures doubtful. Assuming the 1-2 azeotrope is stable whenever its boiling point is
higher than the boiling point at vertex 3, and knowing that vertex 3 is stable, we can speculate that
this boiling point ordering is sufficient that an unstable ternary azeotrope mediates the dynamics
of the system, anchoring a boundary separating the regions of attraction of vertex 3 and the 1-2
azeotrope.

There is a theory, called index theory, an account of which can be found in Coddington and
Levinson’s book ”Theory of Ordinary Differential Equations,” which gives global information
about questions of this sort. It establishes conditions that must be satisfied by the sum over the
local stability at each of a set of multiple critical points. This theory requires ideas beyond what
we intend to explain in these lectures and is therefore a direction for advanced study.

1.2 A Simple Evaporator

The solution of the differential equation

dx/dt = a(t) x + b(t)

where x (t = t0 ) and b (t) are assigned is


x (t) = x (t = t0 ) e^{∫_{t0}^{t} a(λ) dλ} + ∫_{t0}^{t} e^{∫_{τ}^{t} a(λ) dλ} b(τ ) dτ

This formula tells us that the value of x at time t is determined by its value at time t0 and the
values of b on the interval (t0 , t). It shows that the contributions of the two sources to the solution
are independent and additive. This is one way, but not the only way, of stating the principle of
superposition. It exhibits the main way in which linear problems are special.

If a is constant the formula is


x (t) = x (t = t0 ) e^{a(t − t0 )} + ∫_{t0}^{t} e^{a(t − τ )} b(τ ) dτ

and if b is also constant it is

x (t) = x (t = t0 ) e^{a(t − t0 )} + (b/a) { e^{a(t − t0 )} − 1 }
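A numerical sketch checking the constant coefficient formula against a crude explicit Euler integration of dx/dt = a x + b (the values of a, b, t0 and x(t0) are arbitrary):

```python
import math

a, b, t0, x0 = -2.0, 3.0, 0.0, 1.0   # arbitrary constants
t = 1.5

# the closed form: x(t) = x(t0) e^{a (t - t0)} + (b/a)(e^{a (t - t0)} - 1)
exact = x0 * math.exp(a * (t - t0)) + (b / a) * (math.exp(a * (t - t0)) - 1.0)

# a crude explicit Euler march as a cross-check
n = 200_000
h = (t - t0) / n
x = x0
for _ in range(n):
    x += h * (a * x + b)
```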

We can make use of this formula in studying the dynamics of a simple evaporator. The problem
is to concentrate a nonvolatile solute in a feed stream by boiling off the volatile solvent. The
stream to be concentrated, specified by its feed rate F [#/hr], concentration xF [mass fraction] and
temperature TF [o F], is run into a tank containing a heat exchanger of area A [ft2 ] and heat transfer
coefficient U [Btu/hr ft2 o F] supplied with steam condensing at temperature Ts . The pressure in
the system is determined by the conditions under which the solvent being boiled off is condensed.
This is done in a heat exchanger of area Ac and heat transfer coefficient Uc , supplied with cooling
water at temperature Tc .
(Figure: the evaporator, with vapor product V at temperature T overhead, feed F at TF and xF entering, steam at Ts on the heating surface, cooling water at Tc on the condenser, and liquid product L at T and x leaving.)

The simplest model corresponds to concentrating a dilute solution. In this case we assume that the physical properties of the solution are those of the solvent and add that its heat capacity cp [Btu/# °F] and latent heat λ [Btu/#] are constant. Then under steady conditions we write

0 = F − L − V

0 = xF F − x L

0 = hF F + UA (Ts − T ) − h L − H V

0 = H V − h V − Uc Ac (T − Tc )

where T is the boiling point of the solvent, x the concentration of the product and h and H [Btu/#] the enthalpies of the liquid and the vapor streams. The pressure is the vapor pressure of the solvent
at temperature T . If F , xF , TF , Ts , UA, Tc and Uc Ac (the operating variables) are set then the
number of equations equals the number of unknowns and we can determine x, L, V and T (the
performance variables). Indeed eliminating L and introducing cp and λ we get

0 = xF F − x (F − V )

0 = cp (TF − T ) F + UA (Ts − T ) − λ V

0 = λ V − Uc Ac (T − Tc )

and we see that as long as

UA Ts + cp F TF > (UA + cp F ) Tc

then T > Tc and a pressure is established so that a boiling, i.e., V > 0, solution to these equations is obtained. We say, then, that the pressure is established so that the heat balance balances, i.e., so that the heat supplied at the evaporator equals the heat removed at the condenser:

cp (TF − T ) F + UA (Ts − T ) = Uc Ac (T − Tc )

Indeed taking F = 1000, TF = 100, UA = 2000, Ts = 300, Uc Ac = 2000, Tc = 50, cp = 1 and λ = 1000 we find T = 160, V = 220, whereas if the feed is colder and faster, viz., F = 2000, TF = 50, we find T = 133, V = 166. Assuming P to be the vapor pressure of water we can understand the sensitivity of P to the operating conditions in an evaporator.
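These numbers can be reproduced by solving the heat balance for T (a sketch of the arithmetic; the function name is ours, not the text’s):

```python
def steady(F, TF, UA, Ts, UcAc, Tc, cp, lam):
    # T from  cp F (TF - T) + UA (Ts - T) = Uc Ac (T - Tc)
    T = (cp * F * TF + UA * Ts + UcAc * Tc) / (cp * F + UA + UcAc)
    V = UcAc * (T - Tc) / lam
    return T, V

T1, V1 = steady(1000, 100, 2000, 300, 2000, 50, 1, 1000)   # first operating point
T2, V2 = steady(2000, 50, 2000, 300, 2000, 50, 1, 1000)    # colder, faster feed
```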

Now let the system be in a steady state corresponding to assigned values of the operating
variables and suppose that certain of these are changed to new values at t = 0. Then, while we
know how to determine the new steady state reached as t → ∞ we need to answer the question:
how does the system make the transition from the old to the new steady state?

Letting M [#] denote the amount of well mixed solution held in the evaporator and assuming that L is adjusted to hold M fixed, we replace the left hand sides of the original equations by dM/dt = 0, M dx/dt, M dh/dt = cp M dT /dt and 0, this last by assuming the condenser hold up to be small. Then eliminating L we find

cp M dT /dt = cp F (TF − T ) + UA (Ts − T ) − Uc Ac (T − Tc )

The value of T at t = 0 is the old steady state value while the operating variables take their new
values at t = 0.

In this simple model T vs t can be found using the formula introduced at the beginning of this
section. Then V vs t is determined by

V = (Uc Ac /λ) (T − Tc )

and, using this, x vs t can be found by solving

M dx/dt = xF F − (F − V ) x

again using our now favorite formula. Setting M = 1000 the reader can determine T vs t, V vs t
and x vs t as the evaporator makes the transition from the steady state corresponding to T = 160
to that corresponding to T = 133.
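For the transition itself the energy balance is linear in T, so T relaxes exponentially from the old steady state to the new one (a sketch using the formula above; the variable names are ours):

```python
import math

cp, M = 1.0, 1000.0
F, TF, UA, Ts, UcAc, Tc = 2000.0, 50.0, 2000.0, 300.0, 2000.0, 50.0   # new operating values

# steady state reached as t -> infinity, and the time constant of the approach
T_new = (cp * F * TF + UA * Ts + UcAc * Tc) / (cp * F + UA + UcAc)
tau = cp * M / (cp * F + UA + UcAc)

def T_of_t(t, T_old=160.0):
    # solution of cp M dT/dt = cp F (TF - T) + UA (Ts - T) - Uc Ac (T - Tc)
    return T_new + (T_old - T_new) * math.exp(-t / tau)
```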

It is helpful in dealing with problems like this to scale the variables so that only dimensionless variables appear. In doing this there may be a variety of time scales and each may suggest a useful approximation. Here the problem is so simple that the only important time scale is cp M/(cp F + UA + Uc Ac ) and this determines how fast the system responds to step changes.

1.3 A Less Simple Evaporator

If in the foregoing we make the value of Uc Ac very large then the solvent condenses at the temper-
ature Tc and the evaporator operates at constant pressure. The problem remains interesting when
the pressure is fixed if we include in the model the possibility of a boiling point rise. As a dilute
solution can exhibit a significant boiling point rise we retain all the simplifying approximations in
the foregoing save one: we now assume the boiling point of the solution to be Tc + βx where βx
is the boiling point rise. Then as long as the solution is boiling, i.e., V > 0, we can write

$$M\frac{dx}{dt} = \left(x_F - x\right)F + xV$$
$$Mc_p\frac{dT}{dt} = UA\left(T_s - T\right) + c_p\left(T_F - T\right)F - \lambda V$$
and
$$T = T_c + \beta x$$
whereas if V = 0 we write instead
$$M\frac{dx}{dt} = \left(x_F - x\right)F$$
$$Mc_p\frac{dT}{dt} = UA\left(T_s - T\right) + c_p\left(T_F - T\right)F$$
and
$$T < T_c + \beta x$$

When the solution is boiling, the model contains the nonlinear term xV but only two of the
three equations are differential equations. To determine V we use $T = T_c + \beta x$, and hence $Mc_p\dfrac{dT}{dt} = Mc_p\beta\dfrac{dx}{dt}$, to conclude that
$$UA\left(T_s - T\right) + c_p\left(T_F - T\right)F - \lambda V = c_p\beta\left[\left(x_F - x\right)F + xV\right]$$

This formula determines V as a function of x and T and can be used to eliminate V from the dif-
ferential equations. We will return to this problem and examine the stability of its steady solutions
to small upsets.
1.4 The Hilbert Matrix

To approximate an assigned function f (x) on the interval a ≤ x ≤ b by a polynomial of degree n,


viz., by Pn (x) = a0 + a1 x + · · · + an xn , the problem is to find the n + 1 coefficients a0 , a1 , · · · , an .

The error is $f(x) - P_n(x)$ and if we determine $a_0, a_1, \ldots, a_n$ to make the integral square error,
$\int_a^b \{f(x) - P_n(x)\}^2\,dx$, as small as possible, we find, on setting the derivatives of this with respect
to $a_0, a_1, \ldots, a_n$ to zero, that
$$\sum_{j=0}^{n}\left(\int_a^b x^{j+i}\,dx\right)a_j = \int_a^b f(x)\,x^i\,dx,\qquad i = 0, 1, \ldots, n$$

This is a system of n + 1 equations in n + 1 unknowns which, when a = 0 and b = 1, can be written
$$\begin{pmatrix} 1 & \tfrac12 & \cdots & \tfrac{1}{n+1}\\ \tfrac12 & \tfrac13 & \cdots & \tfrac{1}{n+2}\\ \vdots & \vdots & \ddots & \vdots\\ \tfrac{1}{n+1} & \tfrac{1}{n+2} & \cdots & \tfrac{1}{2n+1} \end{pmatrix}\begin{pmatrix} a_0\\ a_1\\ \vdots\\ a_n \end{pmatrix} = \begin{pmatrix} \int_0^1 f(x)\,dx\\ \int_0^1 f(x)\,x\,dx\\ \vdots\\ \int_0^1 f(x)\,x^n\,dx \end{pmatrix}$$

The matrix on the left hand side is called the Hilbert matrix and the corresponding equations
are remarkable for how difficult they are to solve accurately, even for values of n that are not large. Problems such
as this require for their solution the use of correction methods designed to improve approximations
obtained by elimination methods. The determinants of the 2 × 2 and 3 × 3 Hilbert matrices are
1/12 and 1/2160, where the numerators are 4 − 3 = 1 and 81 − 80 = 1. Now, if 1/3 is replaced
by 33/100, where $\tfrac13 - \tfrac{33}{100} = \tfrac{1}{300}$, the determinant of the altered 3 × 3 Hilbert matrix is $63/10^6$,
which is about 14% of 1/2160.
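The near-cancellation in the 3 × 3 determinant can be checked with exact rational arithmetic:

```python
from fractions import Fraction

def hilbert(n):
    """Exact n x n Hilbert matrix, H[i][j] = 1/(i + j + 1)."""
    return [[Fraction(1, i + j + 1) for j in range(n)] for i in range(n)]

def det3(m):
    """Determinant of a 3 x 3 matrix by cofactor expansion."""
    a, b, c = m[0]
    d, e, f = m[1]
    g, h, i = m[2]
    return a*(e*i - f*h) - b*(d*i - f*g) + c*(d*h - e*g)

H = hilbert(3)
print(det3(H))                 # 1/2160

# Perturb every 1/3 entry to 33/100 and watch the determinant collapse.
P = [[Fraction(33, 100) if v == Fraction(1, 3) else v for v in row] for row in H]
print(det3(P))                 # 63/1000000
```

A change of 1/300 in one matrix entry moves the determinant by roughly a factor of seven, which is the cancellation the text is pointing at.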
If we approximate f(x) by $a_0 + a_1x$ and denote $\dfrac{1}{b-a}\displaystyle\int_a^b (\;)\,dx$ by $(\;)_{\rm avg}$, we get
$$a_0 = \frac{(x^2)_{\rm avg}\,(f)_{\rm avg} - (x)_{\rm avg}\,(xf)_{\rm avg}}{(x^2)_{\rm avg} - (x)_{\rm avg}^2}$$
and
$$a_1 = \frac{(xf)_{\rm avg} - (x)_{\rm avg}\,(f)_{\rm avg}}{(x^2)_{\rm avg} - (x)_{\rm avg}^2}$$

Now let X and Y be random variables, the values x of X lying in [a, b], the values y of Y lying
in [c, d]. And let f (X, Y ) be the joint probability density: f (x, y)dxdy being the probability that
the point (X, Y ) lies in the rectangle (x, x + dx) × (y, y + dy). The expected value of any function
G(X, Y) is
$$E\left[G(X, Y)\right] = \int_a^b\!\int_c^d G(x, y)\,f(x, y)\,dx\,dy$$


To approximate Y by $a_0 + a_1X$ we seek to determine $a_0$ and $a_1$ so that $E\left[(Y - (a_0 + a_1X))^2\right]$
is least. Then as
$$E\left[(Y - (a_0 + a_1X))^2\right] = E(Y^2) - 2a_1E(XY) - 2a_0E(Y) + a_1^2E(X^2) + 2a_0a_1E(X) + a_0^2$$
we find, on setting the derivatives of this with respect to $a_0$ and $a_1$ to zero and solving for $a_0$ and
$a_1$, that
$$a_0 = \frac{E(X^2)\,E(Y) - E(X)\,E(XY)}{E(X^2) - E(X)^2}$$
and
$$a_1 = \frac{E(XY) - E(X)\,E(Y)}{E(X^2) - E(X)^2}$$

These formulas state in another way what we found just above. If X and Y are uncorrelated, we
have E(XY ) = E(X)E(Y ) whence a0 = E(Y ), a1 = 0.
Defining the variance $\sigma^2$ and the correlation coefficient ρ as
$$\sigma_X^2 = E\left[(X - E(X))^2\right] = E(X^2) - E(X)^2$$
$$\sigma_Y^2 = E(Y^2) - E(Y)^2$$
and
$$\rho_{XY}\,\sigma_X\sigma_Y = E\left[(X - E(X))(Y - E(Y))\right] = E(XY) - E(X)\,E(Y)$$

we can write
$$a_0 = E(Y) - E(X)\,a_1$$
and
$$a_1 = \rho_{XY}\,\frac{\sigma_Y}{\sigma_X}$$
The least value of $E\left[(Y - (a_0 + a_1X))^2\right]$ is then $\sigma_Y^2\left(1 - \rho_{XY}^2\right)$. If X and Y are uncorrelated
this is $\sigma_Y^2$, whereas if they are perfectly correlated it is 0.
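The moment formulas for $a_0$ and $a_1$ can be tried on sampled data, replacing each expectation by a sample average. The line y = 2 + 3x and the noise level below are assumed test data, not taken from the text:

```python
import random

random.seed(0)
xs = [random.uniform(0.0, 1.0) for _ in range(10000)]
ys = [2.0 + 3.0 * x + random.gauss(0.0, 0.1) for x in xs]  # assumed test data

def avg(vs):
    return sum(vs) / len(vs)

# Sample analogues of E(X), E(Y), E(X^2), E(XY).
Ex, Ey = avg(xs), avg(ys)
Exx = avg([x * x for x in xs])
Exy = avg([x * y for x, y in zip(xs, ys)])

a1 = (Exy - Ex * Ey) / (Exx - Ex * Ex)
a0 = (Exx * Ey - Ex * Exy) / (Exx - Ex * Ex)
print(a0, a1)   # close to the assumed 2 and 3
```

This is the same least-squares line as in the continuous problem, with $(\;)_{\rm avg}$ built from data rather than an integral.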

We will use some simple ideas about random variables and probability densities when we deal
with the diffusion of a solute in a solvent.

1.5 More Equations than Unknowns

Instead of having the values of a function everywhere on an interval, we may have its values only
at a discrete set of points. Call these values y1 , y2 , . . . , yn corresponding to x = x1 , x2 , . . . , xn .
Then we can try to find a polynomial of degree n − 1 that fits this information. Thus, writing
$$P_{n-1}(x) = a_0 + a_1x + \cdots + a_{n-1}x^{n-1}$$


we determine a0 , a1 , . . . , an−1 via

$$y_i = P_{n-1}(x_i),\qquad i = 1, \ldots, n$$

which we write as
$$V\begin{pmatrix}a_0\\ a_1\\ \vdots\\ a_{n-1}\end{pmatrix} = \begin{pmatrix}y_1\\ y_2\\ \vdots\\ y_n\end{pmatrix}$$
where
$$V = \begin{pmatrix}1 & x_1 & x_1^2 & \cdots & x_1^{n-1}\\ 1 & x_2 & x_2^2 & \cdots & x_2^{n-1}\\ \vdots & \vdots & \vdots & & \vdots\\ 1 & x_n & x_n^2 & \cdots & x_n^{n-1}\end{pmatrix}$$

and where V is called the Vandermonde matrix. This is a system of n equations in n unknowns.
Ordinarily it has one and only one solution, but the solution may be sensitive to small changes in
the y’s.
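That sensitivity to small changes in the y's can be seen in a small sketch; the quadratic data and the size of the nudge below are assumed for illustration:

```python
# Fit a quadratic through three points by solving the Vandermonde system,
# with and without a small change in one y value.
def solve3(A, b):
    """Gauss-Jordan elimination, no pivoting (fine for this small example)."""
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(3):
        M[c] = [v / M[c][c] for v in M[c]]
        for r in range(3):
            if r != c:
                M[r] = [v - M[r][c] * w for v, w in zip(M[r], M[c])]
    return [row[3] for row in M]

xs = [1.0, 2.0, 3.0]
V = [[x**j for j in range(3)] for x in xs]
a  = solve3(V, [1.0, 4.0, 9.0])      # fits y = x^2 exactly
a2 = solve3(V, [1.0, 4.01, 9.0])     # one y nudged by 0.01
print(a)    # ~ [0, 0, 1]
print(a2)   # ~ [-0.03, 0.04, 0.99]: moves several times the 0.01 nudge
```

A change of 0.01 in one datum moves the coefficients by up to 0.04, and the effect worsens rapidly as the degree grows.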

If we measure our function at a set of m points, where m > n, our problem is
$$\begin{pmatrix}1 & x_1 & x_1^2 & \cdots & x_1^{n-1}\\ \vdots & \vdots & \vdots & & \vdots\\ 1 & x_n & x_n^2 & \cdots & x_n^{n-1}\\ \vdots & \vdots & \vdots & & \vdots\\ 1 & x_m & x_m^2 & \cdots & x_m^{n-1}\end{pmatrix}\begin{pmatrix}a_0\\ a_1\\ \vdots\\ a_{n-1}\end{pmatrix} = \begin{pmatrix}y_1\\ \vdots\\ y_n\\ \vdots\\ y_m\end{pmatrix}$$

and this is a system of m equations in n unknowns. It is overdetermined and we suspect that it does
not have a solution, either because our function is not really a polynomial of degree n−1 or, if it is,
that errors in the data preclude us from seeing this. In fact, what we will find is that a system of m
equations in n unknowns may (i) not have a solution or (ii) may have exactly one solution or (iii)
may have many solutions. This also summarizes the possibilities if m = n, whereas if m < n, an
underdetermined system, the second possibility must be excluded.

We write this

Va = y

where V is the m × n Vandermonde matrix, a is the n × 1 column of unknown coefficients and y is


the m × 1 column of measured values of our function. If V and y are the results of measurements
and m > n it is unlikely that there is a solution and we can look for an approximation, that is a
value of a such that Va is as close as possible to y. The error in the ith equation is
$$y_i - \sum_{j=0}^{n-1} a_j x_i^j$$
and the sum of the squares of the errors is
$$\begin{aligned}\sum_{i=1}^{m}\Bigl(y_i - \sum_{j=0}^{n-1} x_i^j a_j\Bigr)^{2} &= \sum_{i=1}^{m}\left(y - Va\right)_i^2 = \left(y - Va\right)^T\left(y - Va\right)\\ &= y^Ty - y^TVa - (Va)^Ty + (Va)^T(Va)\\ &= y^Ty - 2a^TV^Ty + a^TV^TVa\end{aligned}$$

To find $a_0, a_1, \ldots, a_{n-1}$ so that the sum of the squares of the errors takes its least value, we
set the derivative of this expression with respect to each $a_k$, k = 0, ..., n − 1, to zero, getting n
equations:
$$\sum_{j=0}^{n-1}\left\{\sum_{i=1}^{m} x_i^k\,x_i^j\right\}a_j = \sum_{i=1}^{m} x_i^k\,y_i,\qquad k = 0, \ldots, n-1$$
which can be written
$$V^TVa = V^Ty$$
This is a system of n equations in n unknowns where the elements of the n × n coefficient matrix
$V^TV$ are
$$\left(V^TV\right)_{ij} = \sum_{k=1}^{m} x_k^{i+j},\qquad i, j = 0, \ldots, n-1$$
and this is just what we would expect to turn up in this, the discrete problem, knowing that the
elements of the corresponding matrix in the continuous problem are $\int_a^b x^{i+j}\,dx$.

It is not easy to get an accurate solution to the problem V T V a = V T y, as V T V , like the Hilbert
matrix, does not work well when elimination methods are used. To see what is going on suppose
that x1 , x2 , . . . , xm is an increasing sequence of positive numbers. Then the columns of V T V , viz.,
$$\begin{pmatrix}\sum x_i^0\\ \sum x_i^1\\ \sum x_i^2\\ \vdots\end{pmatrix},\qquad \begin{pmatrix}\sum x_i^1\\ \sum x_i^2\\ \sum x_i^3\\ \vdots\end{pmatrix},\qquad \cdots,\qquad \begin{pmatrix}\sum x_i^{n-1}\\ \sum x_i^{n}\\ \sum x_i^{n+1}\\ \vdots\end{pmatrix}$$

lie in the positive cone of Rn and their directions converge to a limiting direction as n grows large.
Linear independence is retained for all n, but just barely as n grows large. The reader can see this
simply by letting m = 4, n = 3 and x1 = 1, x2 = 2, x3 = 3 and x4 = 4.
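Carrying out the reader's check numerically, using the cosine of the angle between two columns of $V^TV$ as a measure of near-dependence:

```python
import math

# The reader's check: m = 4, n = 3, x = 1, 2, 3, 4.
xs = [1.0, 2.0, 3.0, 4.0]

# (V^T V)_{ij} = sum_k x_k^(i+j), i, j = 0, 1, 2.
VtV = [[sum(x**(i + j) for x in xs) for j in range(3)] for i in range(3)]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / math.sqrt(sum(a*a for a in u) * sum(b*b for b in v))

cols = list(zip(*VtV))
print(cosine(cols[1], cols[2]))   # ~0.9998: the columns are nearly dependent
```

The columns remain independent, but the angle between the second and third columns is already well under one degree.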

To see why nearly dependent columns lead to uncertainties in numerical work, observe that the
solution to
$$\begin{pmatrix}1 & 1\\ 0 & \varepsilon\end{pmatrix}\begin{pmatrix}x\\ y\end{pmatrix} = \begin{pmatrix}1\\ 0\end{pmatrix}$$
is x = 1, y = 0, whereas it is x = 0, y = 1 if $\begin{pmatrix}1\\ 0\end{pmatrix}$ on the RHS is replaced by $\begin{pmatrix}1\\ \varepsilon\end{pmatrix}$.
For later work we record the observation that $V^TVa = V^Ty$ can be written, in terms of the
error $y - Va$, as
$$\left(y - Va\right)^T V = 0^T$$

1.6 Getting Ready for Lectures 2 and 3

It is important to build up a picture of the facts about the solutions to the problem Ax = b. We
begin to do this in the hope that the readers will add to it as they go along.

Write Ax = b as

x1 a1 + x2 a2 + · · · + xn an = b

where $x \in C^n$, where $a_1, a_2, \ldots, a_n, b \in C^m$ and where r denotes the largest number of independent vectors in $\{a_1, a_2, \ldots, a_n\}$.

The simplest case is n = 1, r = 1, m = 2. Then the picture corresponding to the problem
$xa_1 = b$ shows two possibilities.

(Figure: on the LHS, b does not lie along the line of $a_1$; on the RHS, b lies along that line.)

On the LHS there is no value of x satisfying $xa_1 = b$ while on the RHS there is exactly one
value of x. On the LHS we can determine the value of x so that $xa_1$ is as close as possible to b, but
$xa_1$ cannot equal b.

If n = 2, r = 2, m = 3 the problem is $xa_1 + ya_2 = b$.

(Figure: on the LHS, b lies off the plane spanned by $a_1$ and $a_2$; on the RHS, b lies in that plane.)

The conclusions are as before: on the LHS there are no values of x and y such that $xa_1 + ya_2 = b$, while on the RHS there is exactly one value of x and one value of y. But if r = 1, so that $a_1$ and $a_2$ lie along a common line, the picture is

(Figure: on the LHS, b lies off the common line of $a_1$ and $a_2$; on the RHS, b lies on that line.)

and again on the LHS there are no values of x and y such that xa1 + ya2 = b. The RHS is new: as
before x and y can be determined, but now this is possible in many ways.

These pictures lead us to certain conclusions about the solutions in terms of the numerical
values of n, r and m. If r < m the problem has solutions for some values of b but not for others.
If the problem has a solution and if r = n then it is the only solution, but if r < n there are many
solutions.

Certainly r cannot exceed n and, as a1 , . . . , an ∈ C m , r cannot exceed m either. Using this the
reader can draw more conclusions about the solutions when n < m, n = m and n > m.
1.7 A Source of Linear Problems: Boiling an Azeotrope

Assuming that most of the interesting problems a student will face are nonlinear, we ought to
indicate at least one source of linear problems.

Suppose that upon writing a model to explain or predict an experimental observation, viz., the
output variable, we have to solve a nonlinear problem. Included in the specification of the problem
will be the values of the input variables.

Often a simple steady solution to our problem can be found where, possibly, the nonlinear
terms vanish and we would then like to know if we can see this solution if we run the experiment.

To answer this question we add to the simple base solution we have a small correction and
substitute the sum into our nonlinear equation in order to obtain an equation for the correction.
Upon discarding squares, etc., of small quantities, this will be a linear equation and our aim will
be to discover if the small displacement grows or dies out in time.

If the displacement grows, our base solution will be called unstable and we will not see it in an
experiment.

If all displacements die out our base solution will be stable to small displacements and we may
be able to see it in an experiment.

Ordinarily there will be ranges of inputs where stability obtains and ranges of inputs where
it does not. The critical values of the inputs divide these ranges. Hence we may decide to run a
sequence of experiments where we increase an input to its critical value and ask what we expect to
see if the input is advanced just beyond its critical value.

By asking this question, we are led to derive a sequence of linear problems which are inhomo-
geneous versions of the homogeneous stability problem and which introduce solvability questions.

Boiling an Azeotrope

As a simple example, recall that in our boiling problem we have

$$\frac{d\vec{x}}{ds} = -\vec{y} + \vec{x}$$
where $\vec{x}$ is specified at s = 0 and where, at constant pressure, $\vec{y}$ is known as a function of $\vec{x}$. Hence,
if we are boiling an azeotrope, i.e., if $\vec{x}$ at s = 0 is an azeotropic composition so that at s = 0 we
have $\vec{y} = \vec{x}$, then for all s we have the solution
$$\vec{x} = \vec{x}(s = 0)$$
and we can ask: is this what we see in an experiment?

The easy case is n = 2, where we have

$$\frac{dx}{ds} = -y + x$$

and y as a function of x looks one of two ways:

(Figures: y vs. x equilibrium curves crossing the line y = x at the azeotropic composition $x_A$; one panel for a maximum boiling azeotrope, one for a minimum boiling azeotrope.)

And starting a boiling experiment at x = xA we predict x = xA for all s.

To see if this solution is stable we write y = f(x), substitute $x = x_A + \xi$, ξ small, and obtain
$$\frac{d\xi}{ds} = -f'(x_A)\,\xi + \xi$$

Now we observe that t = 0 corresponds to s = 0 and that s is a time like variable. Hence we have
stability if f ′ (xA ) > 1, instability if f ′ (xA ) < 1 and we find that a maximum boiling azeotrope
can be boiled off at constant composition. But a minimum boiling azeotrope cannot be sustained.
Here it is not that we expect a composition fluctuation to occur during boiling, instead the problem
lies in the preparation of the initial state.
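A hedged numerical check of the criterion, assuming for illustration an equilibrium curve linearized about the azeotrope, $y = x_A + k(x - x_A)$, so that $f'(x_A) = k$:

```python
# Toy check: dx/ds = -y + x = (1 - k)*(x - xA) for the assumed linear curve.
def boil(k, xA=0.5, xi0=1e-3, ds=1e-3, steps=2000):
    """Euler march of a small initial perturbation xi0; returns |x - xA|."""
    x = xA + xi0
    for _ in range(steps):
        y = xA + k * (x - xA)
        x += ds * (-y + x)
    return abs(x - xA)

print(boil(2.0) < 1e-3)   # f'(xA) > 1: perturbation decays (stable)
print(boil(0.5) > 1e-3)   # f'(xA) < 1: perturbation grows (unstable)
```

The perturbation decays when $f'(x_A) > 1$ and grows when $f'(x_A) < 1$, matching the maximum/minimum boiling conclusion above.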
Boiling Point Rise Model

Going back to our boiling point rise model where the inputs, viz., M, xF , TF , F , cp , λ, UA, Ts ,
Tc and β, are held fixed and where the outputs are x, T and V , denote by x0 , T0 , V0 > 0, a boiling
steady state. Is it stable? To see, substitute

$$V = V_0 + \varepsilon V_1,\qquad x = x_0 + \varepsilon x_1,\qquad T = T_0 + \varepsilon T_1$$

into the model presented earlier, where ε is small, and obtain equations for x1 , T1 and V1 . Eliminate
V1 from the first two equations, then eliminate T1 via T1 = βx1 and draw the conclusion that for
any small initial displacement of the steady state, x1 goes to zero as t increases.

Petri Dish Problem

To introduce a solvability question we present the following model, where c denotes the concentra-
tion of a solute in a one dimensional domain and F (c) denotes its rate of formation. All variables
are scaled and at first we say only that F (0) = 0 and F ′ (0) > 0. Thus an excursion away from
c = 0 reinforces itself. Our model is

$$\frac{\partial c}{\partial t} = \frac{\partial^2 c}{\partial x^2} + \lambda F(c)$$

where c = 0 at x = 0, 1, i.e., there is a solute sink at the ends of our domain, and where λ denotes
the strength of the source.

Our aim is to find the value of λ at which diffusion to the solute sinks at x = 0, 1 can no longer
control the solute source on the domain.

We have a solution c = 0 for all λ and we might wish to know for what values of λ we can
observe this solution. There are two simple things we can do, both leading to the same conclusion.
First we can introduce a small perturbation, viz., c = 0 + c1 where c1 is small and find that c1
satisfies

$$\frac{\partial c_1}{\partial t} = \frac{\partial^2 c_1}{\partial x^2} + \lambda F'(0)\,c_1$$

where c1 = 0 at x = 0, 1.

This is a linear problem and we can seek solutions of the form

$$c_1 = \psi(x)\,e^{\sigma t}$$

where σ is the growth rate of a perturbation whose spatial dependence is ψ(x). Then σ and ψ solve
the homogeneous problem

$$\frac{d^2\psi}{dx^2} + \lambda F'(0)\,\psi - \sigma\psi = 0$$

where ψ = 0 at x = 0, 1, and this problem has solutions other than ψ = 0 only for special values
of σ.

The solutions are ψ = sin πx, sin 2πx, . . . corresponding to

$$\sigma = \lambda F'(0) - \pi^2,\ \lambda F'(0) - 4\pi^2,\ \ldots$$

and these σ’s are the growth rates of an independent set of perturbations spanning all allowable
perturbations. Because the strength of diffusion increases as the spatial variation increases, we see
that sin πx is the most dangerous perturbation.

At λ = 0, the greatest value of σ is found to be $-\pi^2$. And for all λ > 0 the greatest σ is
$\lambda F'(0) - \pi^2$, whereupon the greatest σ becomes zero at $\lambda = \pi^2/F'(0)$. This then is the critical
value of λ beyond which the solution c = 0 is not stable.
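The growth rates and the critical λ can be tabulated in a few lines; the value $F'(0) = 1$ below is an assumption made only to have numbers:

```python
import math

# Growth rates of the perturbations sin(k*pi*x):
#   sigma_k = lam*F'(0) - (k*pi)^2, with F'(0) assumed equal to 1.
Fp0 = 1.0
lam_crit = math.pi**2 / Fp0        # critical source strength

def sigma(k, lam):
    return lam * Fp0 - (k * math.pi)**2

# Just below critical every mode decays; just above, only k = 1 grows.
print(all(sigma(k, 0.99 * lam_crit) < 0 for k in range(1, 6)))
print(sigma(1, 1.01 * lam_crit) > 0 and sigma(2, 1.01 * lam_crit) < 0)
```

Only the most dangerous perturbation, sin πx, changes sign at the critical value; every higher mode remains damped there.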

The second thing we can do is to look only at steady solutions, viz., solutions to

$$\frac{d^2c}{dx^2} + \lambda F(c) = 0$$
where c = 0 at x = 0, 1, and observe that one such solution is c = 0 for all λ. Then we can ask: if
we have the solution c = 0 at some λ, can we advance it if we advance λ, i.e., can we find $\dfrac{dc}{d\lambda}$?
Denoting $\dfrac{dc}{d\lambda}$ by $\dot{c}$ we have
$$\frac{d^2\dot{c}}{dx^2} + \lambda F'(0)\,\dot{c} = 0$$

where $\dot{c} = 0$ at x = 0, 1.

This is the foregoing, viz., ψ, problem at σ = 0 and it has only the solution $\dot{c} = 0$ for all
$\lambda < \lambda_{\rm crit}$; hence we keep finding only the solution c = 0 as λ increases from zero until we reach
$\lambda = \lambda_{\rm crit}$, whereupon a solution $\dot{c} \neq 0$ appears, signaling that something new may be happening.

Now we can go back to the problem for σ and differentiate with respect to λ, obtaining
$$\frac{d^2\dot\psi}{dx^2} + \lambda F'(0)\,\dot\psi - \sigma\dot\psi = \dot\sigma\,\psi - F'(0)\,\psi$$
where $\dot\psi = 0$ at x = 0, 1.

The corresponding homogeneous equation is the equation for ψ and, at $\lambda_{\rm crit}$, it has a solution
other than ψ = 0. Hence ψ, not zero, must exist and, therefore, the equation for $\dot\psi$ must be solvable.
The solvability condition then determines $\dot\sigma$. The reader can multiply the $\dot\psi$ equation by ψ, the ψ
equation by $\dot\psi$, subtract and integrate the difference over 0 ≤ x ≤ 1, learning that $\dot\sigma$ is positive at
$\lambda = \lambda_{\rm crit}$. Thus we have σ = 0 and $\dot\sigma > 0$ at $\lambda = \lambda_{\rm crit}$. We therefore anticipate seeing a nonzero
solution branch for $\lambda > \lambda_{\rm crit}$.

We will return to the Petri Dish problem later on and try to decide what the nonzero solution
looks like for λ just beyond λcrit . And we can do this by solving only linear equations.

1.8 Home Problems

1. A graph is a set of points connected pairwise by directed line segments. If there are n points
and a line segment runs from point i to point j then the ij element in an n × n matrix is
set to 1 otherwise it is 0. The resulting matrix is called a connection matrix. What do the
powers of a connection matrix tell us about the graph? The powers of a connection matrix
are easy to determine because its columns are made up of zeros and ones. Each column of
the product of a matrix A multiplied on the right by a connection matrix is simply the sum
of certain columns of A.

2. Suppose the elements of the square matrix A themselves are square matrices. Then A is
called a block matrix. Write A as LU where L is blockwise lower triangular and U is
blockwise upper triangular, its diagonal blocks being I.

3. Determine T vs. t, V vs. t and x vs. t in the simple evaporator problem presented in Lecture
1.

4. In the nonlinear evaporator model, set UA = 2000, $T_s$ = 300, $T_c$ = 40, β = 1000, $c_p$ = 1
and λ = 1000. Then if F = 1000, $x_F = \tfrac{1}{10}$ and $T_F$ = 100 the steady solution corresponds
to boiling: T = 165, V = 200, $x = \tfrac{1}{8}$. But if F = 4000 and $T_F$ = 50 the steady solution
corresponds simply to heating: T = 133, V = 0, $x = \tfrac{1}{10}$. Set M = 1000 and determine
how the system makes the boiling-to-nonboiling transition if it is in the first steady state
when t < 0 and then at t = 0 the above step changes in F and TF are made. A simple Euler
approximation will do the job. It is instructive to see how T = Tc + βx is used to determine
V at each step. Once V is zero the calculation can be continued without approximation.
Lecture 2

Independent and Dependent Sets of Vectors

The m × n matrix A is assigned. In this and the next two lectures we show how to determine
whether or not a vector x ∈ C n can be found so that the vector Ax ∈ C m equals an assigned
vector b ∈ C m . Assuming the problem Ax = b has a solution, we then show how to write its
general solution.

2.1 Linear Independence

The main idea is linear independence. This, or its opposite, linear dependence, is a property of
a set of vectors. Indeed, starting with a set of vectors, $\{v_1, v_2, \ldots, v_n\}$, we can create additional
vectors by making linear combinations of the assigned vectors using arbitrary complex numbers,
viz., $c_1v_1 + c_2v_2 + \cdots + c_nv_n$. The set $v_1, v_2, \ldots, v_n$ is said to be independent if the only way
we can create the vector 0 is by setting $c_1, c_2, \ldots, c_n$ to zero. In other words, the set of vectors
$v_1, v_2, \ldots, v_n$ is said to be independent iff
$$c_1, c_2, \ldots, c_n\ \text{all zero}$$
is the only solution to the equation
$$c_1v_1 + c_2v_2 + \cdots + c_nv_n = 0$$

otherwise it is said to be dependent.

The idea of linear dependence can be stated in terms of linear combinations in the following
way: at least one vector of the set v 1 , v2 , ..., vn is a linear combination of the others if and only if
the set is dependent, i.e., if and only if the equation c1 v1 + c2 v2 + ... + cn v n = 0 is satisfied for
c1 , c2 , ..., cn other than c1 , c2 , ..., cn all zero.

The idea of linear independence is not special to sets of column vectors and is defined as above
for vectors in general, using the corresponding zero vector; indeed it pertains to sets of row vectors
as well as to sets of column vectors.

2.2 Independent Chemical Reactions

To illustrate this idea suppose we have M molecules, m = 1, 2, ..., M, participating in R chemical
reactions, r = 1, 2, ..., R. Let $\nu_{rm}$ denote the stoichiometric coefficient of molecule m in reaction
r. Then we can identify the molecule m with the column
$$\nu_m = \begin{pmatrix}\nu_{1m}\\ \nu_{2m}\\ \vdots\\ \nu_{Rm}\end{pmatrix}$$
and the reaction r with the row $(\nu_{r1}\ \nu_{r2}\ \cdots\ \nu_{rM})$. Independent reactions then correspond to independent rows. A question
of interest in chemical equilibrium calculations is this: for an assigned set of molecules how many
independent reactions can we write? Writing our columns (or rows) as the columns (or rows) of
a matrix ν, where ν = (ν 1 ν 2 ... ν M ), called the stoichiometric or reaction-molecule matrix, we
must determine the greatest number of independent rows in this R × M matrix. Indeed we must
determine whether the greatest number of independent rows increases indefinitely or is bounded
as the number of rows increases indefinitely.

It turns out that matrices have a surprising property: the greatest number of independent rows
cannot exceed the greatest number of independent columns (and vice versa); and this, willy nilly,
cannot exceed the total number of columns. The number of molecules then is a bound on the
greatest number of independent reactions that can be written.

If we take into account that each molecule is made up of atoms from the set a = 1, 2, ..., A
and denote by αma the number of atoms a in molecule m, then as each atom is conserved in each
reaction we have for reaction r and atom a
$$\sum_{m=1}^{M}\nu_{rm}\,\alpha_{ma} = 0$$
for all r and all a. We can identify the atom a with the column
$$\alpha_a = \begin{pmatrix}\alpha_{1a}\\ \alpha_{2a}\\ \vdots\\ \alpha_{Ma}\end{pmatrix}$$
whence the conservation conditions are $\nu\,\alpha_a = 0$, a = 1, 2, ..., A, and these can be written
$$\nu_1\,\alpha_{1a} + \nu_2\,\alpha_{2a} + \cdots + \nu_M\,\alpha_{Ma} = 0,\qquad a = 1, 2, \ldots, A$$

Each independent condition of this kind reduces by one the greatest number of independent
columns in the set $\nu_1, \nu_2, \ldots, \nu_M$. Assuming the atoms to be distributed independently over the
molecules, i.e., assuming the set $\alpha_1, \alpha_2, \ldots, \alpha_A$ to be independent, the greatest number of independent reactions that can be written using M molecules made up of A atoms is M − A; hence the
requirement that balanced reactions be written reduces the greatest number of independent reactions from M to M − A. But even the bound M corresponding to arbitrary reactions is interesting,
and it has nothing to do with the requirement that the stoichiometric coefficients be integers. It
holds assuming the $\nu_{rm}$ to be arbitrary complex numbers and cannot be lowered by the restriction
to integers.
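The count M − A can be checked on a small assumed example; the molecules H2, O2, H2O and the rational row-reduction below are illustrations, not part of the text:

```python
from fractions import Fraction

# Assumed example: H2, O2, H2O built from the atoms H and O, so the
# atom-molecule matrix alpha (rows = molecules, columns = atoms) is
alpha = [[2, 0],   # H2
         [0, 2],   # O2
         [2, 1]]   # H2O

def rank(rows):
    """Row-reduce over the rationals and count the pivots."""
    m = [[Fraction(v) for v in row] for row in rows]
    r = 0
    for c in range(len(m[0])):
        piv = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if piv is None:
            continue
        m[r], m[piv] = m[piv], m[r]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                f = m[i][c] / m[r][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

M, A = len(alpha), rank(alpha)
print(M - A)   # at most M - A = 1 independent reaction: 2 H2 + O2 -> 2 H2O
```

Here the two atom columns are independent, so A = 2 and the bound M − A = 1 is attained by the single water-forming reaction.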

To every set of n column vectors in C m there corresponds a set of m row vectors in C n gener-
ated by writing the column vectors as the columns of an m × n matrix A. The row vectors are then
the rows of A. If we rephrase the question as to the greatest number of independent vectors in the
set a1 , a2 , ..., an to a corresponding question about the greatest number of independent columns in
the matrix A we will get information not only about the columns of A but also about the rows of A
as well. But the rows of A are the columns of AT so answers to questions about n column vectors
in C m are also answers to questions about a corresponding set of m column vectors in C n .
2.3 Looking at the Problem Ax = b from the Point of View of Linearly Independent Sets of Vectors

To see what linear independence has to do with our main problem of determining whether or not
solutions of Ax = b exist and, if they do, writing them, we let a1 , a2 , ..., an denote the columns of
A and write the problem Ax = b as

x1 a1 + x2 a2 + ... + xn an = b

which requires b to be a linear combination of a1 , a2 , ..., an . The corresponding homogeneous


problem, Ax = 0, can be written

x1 a1 + x2 a2 + ... + xn an = 0

We see therefore that Ax = 0 has solutions other than x1 , x2 , ..., xn all zero iff the set of vectors
a1 , a2 , ..., an is dependent, i.e., not independent, and that Ax = b has solutions iff on joining b to
the set a1 , a2 , ..., an we do not increase the greatest number of independent columns.

Our first problem, therefore, is to determine the greatest number of independent vectors in a set
of n vectors, a1 , a2 , ..., an in C m . To do this we introduce the determinant of a square matrix. The
determinant is a function defined on square matrices mapping each square matrix into a complex
number.

Let $A = (a_{ij}) = (a_1\ a_2\ \cdots\ a_n)$ be an n × n matrix; then the determinant of A, denoted det A, is
defined by
$$\det A = \sum \pm\, a_{\alpha_1 1}\,a_{\alpha_2 2}\cdots a_{\alpha_n n}$$

where we have chosen to write the column indices in their natural order and where the sum is
over all sets of integers α1 , α2 , ..., αn that are permutations of 1, 2, ..., n, the + sign to be used if
the permutation is even, the − sign if it is odd. This then is a sum of n! terms, each term being
a product of n factors, where each row and each column is represented once in each term. This
definition leads to the following four properties from which our conclusions can be drawn:

(i) $\det A = \det A^T$

(ii) $\det(\ldots a_i \ldots a_j \ldots) = -\det(\ldots a_j \ldots a_i \ldots)$

(iii) $\det\bigl(\ldots \textstyle\sum_k c_kb_k \ldots\bigr) = \sum_k c_k\det(\ldots b_k \ldots)$

(iv) $\det\bigl(\ldots a_i + \textstyle\sum_{k\neq i} c_ka_k \ldots\bigr) = \det(\ldots a_i \ldots)$

In (ii) two columns are interchanged; in (iii) a fixed column is written as a linear combination of
arbitrary column vectors; in (iv) a linear combination of other columns is added to the ith column.
All else is held fixed. The proofs of these properties and their corollaries, such as det A = 0 if
$a_i = a_j$ for any $i \neq j$, det A = 0 if $a_i = 0$ for any i, etc., come easily out of the definition and
either can be supplied by the reader or can be found in Shilov’s book “Linear Algebra.”

Property (i) is important in turning column theorems into row theorems and vice-versa. Prop-
erties (ii), (iii) and (iv) are properties of columns and therefore properties of the columns of AT .
As the rows of A are the columns of AT they are also properties of the rows of A.
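Properties (ii) and (iv) are easy to spot-check on a random integer matrix (a sketch; the check is exact because the arithmetic is integer):

```python
import random

random.seed(1)

def det3(c1, c2, c3):
    """Determinant of the 3 x 3 matrix whose columns are c1, c2, c3."""
    (a, d, g), (b, e, h), (c, f, i) = c1, c2, c3
    return a*(e*i - f*h) - b*(d*i - f*g) + c*(d*h - e*g)

c1, c2, c3 = ([random.randint(-5, 5) for _ in range(3)] for _ in range(3))

# (ii): swapping two columns flips the sign.
print(det3(c1, c2, c3) == -det3(c2, c1, c3))
# (iv): adding a multiple of another column leaves det unchanged.
c1p = [u + 7 * v for u, v in zip(c1, c2)]
print(det3(c1p, c2, c3) == det3(c1, c2, c3))
```

Such spot checks are no substitute for the proofs, but they make the column properties concrete.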

If a1 , a2 , ..., an is a dependent set of vectors in C n then det A = 0. This is so as we can write one
of a dependent set of vectors as a linear combination of the others and then use this combination
in (iii) to show that det A is zero. We can restate this as: if $\det A \neq 0$ then $\{a_1, a_2, \ldots, a_n\}$ is
independent.

Each term in the expansion of det A contains one factor from the j th column. Of the n! terms,
(n − 1)! contain the common factor a1j , (n − 1)! contain the common factor a2j , etc. Writing the
first of these n sets of (n − 1)! terms as a1j A1j , the second as a2j A2j , etc., we find that

$$\det A = a_{1j}A_{1j} + a_{2j}A_{2j} + \cdots = \sum_{i=1}^{n} a_{ij}A_{ij}$$
and we notice that
$$\frac{\partial\det A}{\partial a_{ij}} = A_{ij}$$
This is the expansion of det A via its jth column and it can be written for each j = 1, 2, ..., n.
Each factor $A_{ij}$, i = 1, 2, ..., n, is called the cofactor of the corresponding element $a_{ij}$, and by
definition the values of the cofactors do not depend on the values of the elements in the jth column. When
this construction is carried out for each column, j = 1, 2, ..., n, it generates $n^2$ elements $A_{ij}$. Then
letting
$$A_1 = \begin{pmatrix}A_{11}\\ A_{21}\\ \vdots\\ A_{n1}\end{pmatrix},\qquad A_2 = \begin{pmatrix}A_{12}\\ A_{22}\\ \vdots\\ A_{n2}\end{pmatrix},\ \text{etc.,}$$
where $A_1$ is the column of cofactors of $a_1$, the first column of A, etc., we can write
$$\det A = A_j^T\,a_j,\qquad j = 1, 2, \ldots, n$$

The matrix whose columns are A1 , A2 , ..., An is called the matrix of the cofactors of A and its
transpose, denoted adj A, where adj A = (A1 A2 ...An )T , is called the adjugate of A.

It turns out that $A_{ij} = (-1)^{i+j}M_{ij}$ where $M_{ij}$, a minor of A, is the determinant of the $(n-1)\times(n-1)$ submatrix of A obtained by deleting its ith row and jth column. It is worth stating that,
unless n = 2 or 3, it is not practical to evaluate determinants directly from the definition, requiring
the evaluation of n! terms, nor by the expansion in cofactors, requiring the evaluation of (n − 1)!
terms n times, etc.

We now have two sets of columns {a1 , a2 , ..., an } and {A1 , A2 , ..., An }, which satisfy

$$A_j^T a_k = \det A,\quad k = j$$
$$A_j^T a_k = 0,\quad k \neq j$$
The second formula is the expansion, via the jth column, of the determinant obtained by writing
$a_k$ in place of $a_j$ as the jth column of A and hence is zero. The multiplication on the left hand side
is a column multiplied on the left by a row. The product is a scalar; indeed $A_j^T a_k = a_k^T A_j$.

2.4 Biorthogonal Sets of Vectors

Two sets of n vectors belonging to $C^n$ satisfying the conditions
$$A_i^T a_j \neq 0,\quad i = j$$
$$A_i^T a_j = 0,\quad i \neq j$$
are called biorthogonal sets. This is a useful idea. Its usefulness stems from the observation that
if we expand a vector in one of the sets, the coefficients in the expansion can be determined simply
by operating on the expansion using vectors of the other set. Indeed, to solve the problem Ax = b
where x and b belong to $C^n$ and $\det A \neq 0$, we rewrite the equation as

x1 a1 + x2 a2 + ... + xn an = b

and multiply both sides by $A_j^T$ to obtain
$$x_j\,A_j^T a_j = A_j^T b$$
whereupon
$$x_j = \frac{A_j^T b}{\det A},\qquad j = 1, \ldots, n.$$
This is Cramer's rule and it can be written
$$x = \frac{(A_1\ A_2\ \cdots\ A_n)^T}{\det A}\,b = \frac{\operatorname{adj} A}{\det A}\,b \equiv A^{-1}b$$

Indeed the vectors $A_1, A_2, \ldots, A_n$ enable us to select from expressions such as $x_1a_1 + x_2a_2 + \cdots + x_na_n$ any coefficient we wish to look at. This is familiar from analytic geometry where, if
$\vec{r} = x\vec{i} + y\vec{j} + z\vec{k}$, then $x = \vec{i}\cdot\vec{r}$, as $\{\vec{i}, \vec{j}, \vec{k}\}$ is its own biorthogonal set. Cramer's rule produces
the unique solution to Ax = b when A is square and $\det A \neq 0$.
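Cramer's rule, written with cofactors in this way, can be exercised on a small assumed system:

```python
# Assumed 3 x 3 system for illustration.
A = [[2, 1, 0],
     [1, 3, 1],
     [0, 1, 2]]
b = [1, 2, 3]

def cof(A, i, j):
    """Cofactor A_ij = (-1)^(i+j) times the minor deleting row i, column j."""
    m = [row[:j] + row[j+1:] for k, row in enumerate(A) if k != i]
    return (-1)**(i + j) * (m[0][0]*m[1][1] - m[0][1]*m[1][0])

detA = sum(A[i][0] * cof(A, i, 0) for i in range(3))   # expand via column 1

# x_j = (A_j^T b) / det A, with A_j the column of cofactors of column j.
x = [sum(cof(A, i, j) * b[i] for i in range(3)) / detA for j in range(3)]
print(x)   # [0.5, 0.0, 1.5]
```

The same cofactor columns give both the determinant, via $A_j^T a_j$, and the solution, via $A_j^T b$; that is the biorthogonality at work.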

Before going on we write our results in another way: the column formulas for the expansion of
a determinant, viz.,

$$A_j^T a_j = \sum_{i=1}^{n} a_{ij}A_{ij} = \det A,\qquad j = 1, \ldots, n$$
and
$$A_j^T a_k = \sum_{i=1}^{n} a_{ik}A_{ij} = 0,\qquad j = 1, \ldots, n,\ k = 1, \ldots, n,\ k \neq j$$
can be written
$$(\operatorname{adj} A)\,A = (\det A)\,I$$
where $I = (\delta_{ij})$.

Now there are row formulas for the expansion of a determinant which can be obtained either by
going back to the definition and factoring an element in the ith row out of each term or by writing
the column formulas using AT in place of A. The row formulas
$$\det A = \sum_{j=1}^{n} a_{ij}A_{ij},\qquad i = 1, \ldots, n$$
and their corollaries
$$0 = \sum_{j=1}^{n} a_{kj}A_{ij},\qquad i = 1, \ldots, n,\ k = 1, \ldots, n,\ k \neq i$$
obtained by expanding, via the ith row, the determinant obtained by replacing the ith row of A by
its kth row, can be written
$$A\,(\operatorname{adj} A) = (\det A)\,I$$

The readers may wish to satisfy themselves that a_{ij} is multiplied by one and the same coefficient, denoted A_{ij}, whether it appears in a row or a column expansion. If det A = 0, we see that A(adj A) = 0 and hence that each column of adj A is a solution of Ax = 0.

2.5 The Number of Linearly Independent Vectors in a Set of


Vectors and the Rank of a Matrix

The determinant, defined only on square matrices, can be used to determine the greatest number of
independent columns in an m × n matrix A = (a1 a2 ...an ), where aj ∈ C m , j = 1, ..., n, and where
m need not be equal to n. To do this we introduce square submatrices of A of order k by deleting
all but k rows and all but k columns and then we calculate the determinants, called minors of order
k, of all these submatrices. Using this information we define the rank of A to be the order of the

largest non-vanishing minor of A and denote it by r; we call the set of r columns of A running
through that minor a set of basis columns. The rank is unique but, as more than one minor of order
r may be non-vanishing, a set of basis columns need not be unique, however each such set is made
up of r columns, and each such set is independent. To see this let a1 , a2 , ..., ar be a set of r basis
columns. (In the problem Ax = b interchanging columns of A and interchanging corresponding
elements of x leave the problem unchanged.) Then to see if the set is independent we investigate
the solutions of

c1 a1 + c2 a2 + ... + cr ar = 0

Looking at the r equations corresponding to the basis minor we see by Cramer’s rule that their
only solution is c1 , c2 , ..., cr all zero. This then is the only solution to the full set of m equations.

What we have established is this: in an m × n matrix there is at least one set of r columns, the
basis columns, that is independent, and r, the rank of the matrix, cannot exceed the smaller of m
or n. To go on we require a result telling us how the columns not in a set of basis columns depend
on the basis columns. Indeed what we need is the basis minor theorem. As stated and proved in
Shilov’s book ”Linear Algebra,” on p. 25, this theorem tells us that any column of a matrix can be
written as a linear combination of any set of basis columns. This most important result in linear
algebra is surprisingly easy to prove. As a set of basis columns is independent each such expansion
must be unique. The basis minor theorem tells us directly that any set of r + 1 or more columns of
which r are basis columns is dependent. Indeed any set of r + 1 or more columns in a matrix of
rank r must be dependent. The argument for this can be found early in the next lecture. Hence the
greatest number of independent columns in an m × n matrix of rank r is r. Of course any subset
of an independent set of vectors is also independent.

Because of property (i) we see that the ranks of A and AT coincide, every square submatrix of
A being the transpose of a square submatrix of AT . The foregoing argument in terms of AT then
shows that r is also the greatest number of independent rows in A. Indeed if m > n the greatest
number of independent rows cannot exceed n whatever values are assigned to the aij .

We also see that if A is square, i.e., n × n, and det A = 0, then its rank is at most n − 1, whence the columns of A must be dependent. This is the converse of the earlier result that if det A ≠ 0 its columns are independent.

To wind this lecture down we develop some useful results having to do with the determinant.

2.6 Derivatives of Determinants

Let the elements aij of a matrix A be functions of t. Then det A is a function of t and its derivative
is

\[ \frac{d}{dt} \det A = \sum_{i} \sum_{j} \frac{\partial \det A}{\partial a_{ij}}\, \frac{d a_{ij}}{dt} \]

Using

\[ \frac{\partial \det A}{\partial a_{ij}} = A_{ij} \]

we get the important formula

\[ \frac{d}{dt} \det A = \sum_{i} \sum_{j} A_{ij}\, \frac{d a_{ij}}{dt} \]

This can be written as the sum of n determinants in two ways, using column or row expansions.

Now let x1 (t), x2 (t), ..., xn (t) be a set of n solutions of



\[ \frac{dx}{dt} = A(t)\, x \]

and let W = det (x1 (t) x2 (t) ... xn (t)) = det (xij (t)). Then, as above,

\[ \frac{dW}{dt} = \sum_{i} \sum_{j} X_{ij}\, \frac{d x_{ij}}{dt} \]

But this is

\[ \frac{dW}{dt} = \sum_{i} \sum_{j} \sum_{k} a_{ik}\, x_{kj}\, X_{ij} \]

and hence, using \sum_{j} x_{kj} X_{ij} = W \delta_{ki}, we find

\[ \frac{dW}{dt} = \operatorname{tr} A(t)\, W \]

where tr A, called the trace of A, is a11 + a22 + · · · + ann . As


\[ W(t) = W(t_0)\, e^{\int_{t_0}^{t} \operatorname{tr} A(t)\, dt} \]

we conclude that W (t) is either always zero or never zero. This is an important result in the
theory of differential equations. It is required in Lecture 19. The determinant W (t) is called the
Wronskian of the solutions.
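The all-or-nothing behavior of W(t) can be seen numerically. For a constant matrix A the solution matrix is e^{At}, whose determinant is e^{t tr A} and hence never zero. The sketch below is illustrative only: the 2 × 2 matrix, the value of t, and the truncated power series for the exponential are all arbitrary choices:

```python
import math

def mat_mul(X, Y):
    n, p = len(X), len(Y[0])
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(p)]
            for i in range(n)]

def expm(A, t, terms=30):
    # e^{At} by the truncated power series sum_k (At)^k / k!
    n = len(A)
    S = [[float(i == j) for j in range(n)] for i in range(n)]  # identity
    T = [row[:] for row in S]
    for k in range(1, terms):
        T = mat_mul(T, [[a * t / k for a in row] for row in A])  # (At)^k / k!
        S = [[S[i][j] + T[i][j] for j in range(n)] for i in range(n)]
    return S

A = [[1.0, 2.0], [0.5, -3.0]]
t = 0.7
E = expm(A, t)
W = E[0][0] * E[1][1] - E[0][1] * E[1][0]  # Wronskian of the two solution columns
# W agrees with exp(t * tr A), which is never zero
```

The comparison below checks det e^{At} against e^{t tr A} to roundoff accuracy.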

Derivatives of Determinants: The Trace Formula

If A = (a_1\ a_2\ \dots\ a_n) and B = (b_1\ b_2\ \dots\ b_n), then the ij element of A^T B is a_i^T b_j, whence \operatorname{tr}(A^T B) = \sum_{i=1}^{n} a_i^T b_i. Using this, the formula for the derivative of a determinant, viz.,

\[ \frac{d}{dt} \det A = \sum_{j} \sum_{i} A_{ij}\, \frac{d a_{ij}}{dt} = \sum_{j} A_j^T\, \frac{d a_j}{dt} \]
dt

can be written

 
\[ \frac{d}{dt} \det A = \operatorname{tr}\!\left( (\operatorname{adj} A)\, \frac{dA}{dt} \right) \]

2.7 Work for the Reader

We give here some simple results which the readers can verify and some not so simple results.

A square matrix is called diagonal if its elements off the main diagonal vanish; it is called upper
or lower triangular if its elements below or above the main diagonal vanish. The determinant of
each such matrix is the product of its diagonal elements.

The determinant of a product of square matrices is the product of the determinants of the
factors. This is a particular instance of a result by which a minor of a product can be expressed as
a sum of products of minors of the factors, see p. 91 of Shilov’s “Linear Algebra.”

The reader can discover that (AB)^T = B^T A^T and then that rank AB ≤ rank A, as each column of AB is a linear combination of the columns of A. As rank AB = rank (AB)^T ≤ rank B^T = rank B, we also see that rank AB ≤ rank B.

If AA−1 = I and BB −1 = I then ABB −1 A−1 = I and so (AB)−1 = B −1 A−1 . This result
assumes A and B are square. If AB is square but A and B are not, more work is required.

Ordinarily we can write a square matrix A as a product LU where L is lower triangular and
U is upper triangular having 1’s on its main diagonal. This is easy to do column by column: on
writing

\[ (a_1\ a_2\ \dots\ a_n) = (\ell_1\ \ell_2\ \dots\ \ell_n) \begin{pmatrix} 1 & u_{12} & \cdots \\ 0 & 1 & \cdots \\ \vdots & \vdots & \\ 0 & 0 & \cdots \end{pmatrix} \]

we can derive the columns of L and U recursively via

a 1 = ℓ1

a2 = u12 ℓ1 + ℓ2

a3 = u13 ℓ1 + u23 ℓ2 + ℓ3

etc.

Thus, because \ell_{12} = 0, u_{12} can be determined to be a_{12}/a_{11}, etc. What appears on the diagonal of L is a_{11},\ (a_{11} a_{22} - a_{12} a_{21})/a_{11},\ \dots, i.e., the ratios of the determinants of the upper left hand submatrices of A. If one of these is zero the calculation, as indicated above, cannot go on. This may happen whether or not det A = 0. For instance \begin{pmatrix} 0 & 1 \\ 1 & 1 \end{pmatrix} cannot be so expanded, but the problem disappears if the columns are interchanged.

The readers can show that if A is tridiagonal then L and U are bidiagonal. The readers can also satisfy themselves that the recipe for the determination of L and U can be improved by calculating the columns of L and the rows of U in the following sequence: first column of L, first row of U, second column of L, second row of U, etc. Indeed if A is partitioned into blocks, the decomposition can be carried out blockwise. In doing this the blocks of A must satisfy certain minimum conditions, e.g., the diagonal blocks must be square. The equation LUx = b is easy to solve.
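The column-by-column recipe can be sketched in a few lines. This is an illustrative implementation, not the author's: the test matrix is an arbitrary example, and exact fractions are used so the factorization can be checked exactly. The convention matches the text: L lower triangular, U upper triangular with 1's on its diagonal.

```python
from fractions import Fraction

def lu(A):
    # A = L U, alternating: column j of L, then row j of U, as in the text
    n = len(A)
    L = [[Fraction(0)] * n for _ in range(n)]
    U = [[Fraction(1) if i == j else Fraction(0) for j in range(n)] for i in range(n)]
    for j in range(n):
        for i in range(j, n):        # column j of L: l_ij = a_ij - sum_{k<j} l_ik u_kj
            L[i][j] = A[i][j] - sum(L[i][k] * U[k][j] for k in range(j))
        for k in range(j + 1, n):    # row j of U: u_jk = (a_jk - sum_{m<j} l_jm u_mk) / l_jj
            U[j][k] = (A[j][k] - sum(L[j][m] * U[m][k] for m in range(j))) / L[j][j]
    return L, U

A = [[Fraction(v) for v in row] for row in [[2, 1, 1], [4, 3, 3], [8, 7, 9]]]
L, U = lu(A)
back = [[sum(L[i][k] * U[k][j] for k in range(3)) for j in range(3)] for i in range(3)]
# back reproduces A; the division by L[j][j] fails exactly when a leading minor vanishes
```

Once L and U are in hand, LUx = b is solved by one forward and one back substitution.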

2.8 Looking Ahead

Again we denote by a_1, a_2, \dots, a_n the n columns of an m × n matrix A and we say that the set \{a_1, a_2, \dots, a_n\} is independent iff c_1 a_1 + c_2 a_2 + \cdots + c_n a_n = 0 has only the solution: all c's = 0.

The reader should then believe that if {a1 , a2 , . . . , an } is independent, the equation Ax = 0
has only the solution x = 0 and if {a1 , a2 , . . . , an , b} is independent, the equation Ax = b has no
solution.

Suppose the R × M matrix ν, having rank r, is our reaction-molecule matrix and the M × A
matrix α = (α1 , . . . , αA ) is our molecule-atom matrix. We will see in Lecture 3 that the equation
νx = 0 has M − r independent solutions. Thus if

\[ \nu\, \alpha_a = 0, \qquad a = 1, \dots, A \]

accounts for A of these, we must have M − r ≥ A and thus

\[ r \le M - A \]

where r is the greatest number of independent reactions.

2.9 Home Problems

1. Denote by D(x_1, x_2, \dots, x_n) the determinant of the Vandermonde matrix

\[ \begin{pmatrix} 1 & x_1 & x_1^2 & \cdots & x_1^{n-1} \\ 1 & x_2 & x_2^2 & \cdots & x_2^{n-1} \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & x_n & x_n^2 & \cdots & x_n^{n-1} \end{pmatrix} \]

Then derive the formula

\[ D(x_1, x_2, \dots, x_n) = \prod_{1 \le j < i \le n} (x_i - x_j) \]


LECTURE 2. INDEPENDENT AND DEPENDENT SETS OF VECTORS 44

This is a problem in Shilov’s book.

To do this expand D by its last row and observe that it is a polynomial of degree n − 1
in xn whose coefficients depend on x1 , x2 , . . . , xn−1 . The n − 1 zeros of this polynomial
are x1 , x2 , . . . , xn−1 , and it can be written

\[ D(x_1, x_2, \dots, x_n) = c \prod_{i=1}^{n-1} (x_n - x_i) \]

where c is the coefficient of x_n^{n-1} and is therefore D(x_1, x_2, \dots, x_{n-1}).
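The product formula is easy to spot-check with exact arithmetic. The sample points below are arbitrary; the determinant routine is a plain cofactor expansion:

```python
from fractions import Fraction

def det(M):
    # cofactor expansion along the first row
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] *
               det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

xs = [Fraction(1), Fraction(2), Fraction(4), Fraction(7)]
n = len(xs)
V = [[x ** k for k in range(n)] for x in xs]   # Vandermonde matrix, rows (1, x, x^2, ...)
d = det(V)
prod = Fraction(1)
for i in range(n):
    for j in range(i):
        prod *= xs[i] - xs[j]                  # product over 1 <= j < i <= n
print(d, prod)  # 540 540
```

The formula also shows at a glance that D ≠ 0 exactly when the x's are distinct.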

2. A determinant of order k can be written as a linear combination of its first minors, determi-
nants of order k − 1. Let A be an m × n matrix. If all its minors of order k vanish then so
too all its minors of order k + 1, k + 2, etc. Prove this.

3. Suppose the elements of an n × n matrix A are polynomials in a scalar λ. Then det A is


also a polynomial in λ. Denote by the term first minors all the minors of order n − 1, by the
term second minors all the minors of order n − 2, etc. Then second minors are first minors
of first minors, etc.

Using the formula for the derivative of a determinant and the formula for the expansion of a determinant in terms of its first minors, show that: \frac{d}{d\lambda} \det A is a linear homogeneous function of the first minors of A, \frac{d^2}{d\lambda^2} \det A is a linear homogeneous function of the second minors of A, etc.

4. Because you see 6 molecules and 3 atoms in the set of 5 reactions listed below, you believe
that at most 3 of the reactions are independent. Calculate the rank of the reaction-molecule
matrix.

C + H2O → CO + H2

C + 2 H2O → CO2 + 2 H2

C + 2 H2 → CH4

C + CO2 → 2 CO

CO + H2O → CO2 + H2
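One way to check the belief is to build the 5 × 6 reaction-molecule matrix of net stoichiometric coefficients for the reactions above and compute its rank exactly. The column order C, H2O, CO, H2, CO2, CH4 and the sign convention (products minus reactants) are choices made here, not specified in the text:

```python
from fractions import Fraction

def rank(A):
    # exact Gaussian elimination; elementary row operations preserve rank
    M = [[Fraction(v) for v in row] for row in A]
    r = 0
    for c in range(len(M[0])):
        piv = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        for i in range(len(M)):
            if i != r and M[i][c] != 0:
                f = M[i][c] / M[r][c]
                M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

# rows: the five reactions; columns: C, H2O, CO, H2, CO2, CH4
nu = [[-1, -1,  1,  1,  0, 0],
      [-1, -2,  0,  2,  1, 0],
      [-1,  0,  0, -2,  0, 1],
      [-1,  0,  2,  0, -1, 0],
      [ 0, -1, -1,  1,  1, 0]]
print(rank(nu))  # 3
```

Indeed rows 2 and 4 are the sum and difference of rows 1 and 5, leaving three independent reactions.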

5. Suppose m = 3 = n and expand J = det (I + εA) in powers of ε

Answer:

\[ J = 1 + \varepsilon\, \operatorname{tr} A + \varepsilon^2 \left( a_{11} a_{22} - a_{12} a_{21} + a_{11} a_{33} - a_{13} a_{31} + a_{22} a_{33} - a_{23} a_{32} \right) + \varepsilon^3 \det A \]

6. Given R reactions among M molecules composed of A atoms, multiply the R × M reaction-


molecule matrix by the corresponding M × A molecule-atom matrix obtaining an R × A
matrix all of whose elements are?

An example is (R = 3, M = 4, A = 2):

C + O2 → CO2

C + (1/2) O2 → CO

CO + (1/2) O2 → CO2

7. A scalar valued function of an n × n matrix A, say f (A), is said to be a scalar invariant of


A iff


\[ f\!\left( Q A Q^T \right) = f(A) \]

for all Q's such that Q^{-1} = Q^T.

Expanding det (λI + A) as

\[ \lambda^n + \lambda^{n-1} I_1(A) + \cdots + \lambda I_{n-1}(A) + I_n(A) \]

show that I1 (A) , . . . , In (A) are scalar invariants of A. They are called the principal invari-
ants of A.
Define the gradient of a scalar function of A, denoted f_A(A), having components \partial f / \partial A_{ij}, by

\[ \left. \frac{d}{ds} f(A + sC) \right|_{s=0} = \sum_{i,j} \frac{\partial f}{\partial A_{ij}}\, C_{ij} = \operatorname{tr}\!\left( f_A(A)\, C^T \right) \]

for any C.

Derive

\[ \det{}_A(A) = \det A \left( A^{-1} \right)^T \]

and

\[ \operatorname{tr}_A(A) = I \]
Lecture 3

Vector Spaces

3.1 Vector Spaces

A set of vectors on which rules for addition and scalar multiplication are defined is called a vector
space if the sum of any two vectors in the set is in the set and the product of any scalar and any
vector in the set is in the set. The set of columns of m complex numbers, denoted C m , is a vector
space.

Let the vectors v 1 , . . . , vn belong to C m . Then the set of all linear combinations of v1 , . . . , vn ,
i.e., the set of all vectors c1 v 1 + c2 v 2 + . . . + cn v n corresponding to all ways of choosing c1 , . . . , cn
is a vector space. It is a subspace of C m ; it is called the manifold spanned by v 1 , . . . , vn and it
is denoted [v 1 , v 2 , . . . , v n ]. A set of independent vectors in a vector space that spans the space
is called a basis for the space. Each vector in the space has a unique expansion in a set of basis
vectors and the coefficients in the expansion are called the components of the vector in the basis.

If there are n vectors in a basis for a space then every set of n + 1 vectors in the space is
dependent. Indeed if v 1 , . . . , vn is a basis then u1 , . . . , un+1 must be dependent. To see this let

\[ u_j = \sum_{i=1}^{n} \xi_{ij}\, v_i, \qquad j = 1, \dots, n + 1 \]


and observe that the equation

c1 u1 + c2 u2 + · · · + cn+1 un+1 = 0

can be written

\[ \sum_{i=1}^{n} v_i \sum_{j=1}^{n+1} \xi_{ij}\, c_j = 0 \]

whence we have

\[ \sum_{j=1}^{n+1} \xi_{ij}\, c_j = 0, \qquad i = 1, \dots, n \]

 
and denoting \begin{pmatrix} \xi_{1j} \\ \xi_{2j} \\ \vdots \\ \xi_{nj} \end{pmatrix} by \xi_j \in C^n, j = 1, 2, \dots, n + 1, this is

\[ c_1 \xi_1 + c_2 \xi_2 + \cdots + c_{n+1} \xi_{n+1} = 0 \]

 
As the rank of the n × (n + 1) matrix (\xi_1\ \xi_2\ \cdots\ \xi_{n+1}) cannot exceed n, the set of columns \xi_1, \xi_2, \dots, \xi_{n+1} must be dependent and so too therefore the set of vectors u_1, u_2, \dots, u_{n+1}. This
tells us: if there are n vectors in some basis for a space then every basis for the space is made up
of n vectors. We go on and define the dimension of the space to be n. It follows directly that any
set of n independent vectors in a space of dimension n is a basis for the space.

3.2 The Image and the Kernel of a Matrix: The Geometric


Meaning of Its Rank

There are two subspaces associated to an m × n matrix A that are important to us, one a subspace
of C m the other of C n . Both depend for their identification on the basis columns of A. We denote

by r the rank of A, and assume the columns a1 , a2 , · · · , ar to be a set of basis columns. Then we
denote the set of vectors Ax ∈ C m , for all x ∈ C n , by Im A and call it the image of A. Because
c1 Ax1 + c2 Ax2 = A (c1 x1 + c2 x2 ), Im A is a vector space and hence a subspace of C m . Because
Ax = x1 a1 + x2 a2 + · · · + xn an , we can write Im A = [a1 , a2 , . . . , an ] and hence by the basis
minor theorem Im A = [a1 , a2 , . . . , ar ]. As {a1 , a2 , . . . , ar } is independent, it is a basis for Im A.
Hence the dimension of Im A is r and the rank of a matrix, an algebraic quantity, turns out to have
a geometric interpretation as the dimension of its image.

This leads directly to results such as rank AB ≤ rank A inasmuch as Im AB cannot lie outside
Im A.

The geometric interpretation of the rank of A leads to a practical way of determining its value.
Indeed we do not change the rank of A by carrying out operations on the columns of A that do not
change the dimension of Im A. The following operations satisfy this requirement and are therefore
rank preserving:

1. interchange two columns

2. multiply a column by a non-zero number

3. add a multiple of a column to another column

These rank preserving column operations can be used to produce from A a matrix whose rank,
i.e., number of independent columns, can be established by inspection. The idea is to create zeros
in the first row in columns 2, . . . , n, then in the second row in columns 3, . . . , n, then etc.
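The same elimination idea is easy to mechanize. The sketch below uses row operations rather than column operations (they preserve rank equally, since rank A = rank A^T) and exact fractions so the count of independent columns is not disturbed by roundoff; the example matrix is arbitrary:

```python
from fractions import Fraction

def rank(A):
    # rank-preserving elimination: create zeros below each pivot, count pivots
    M = [[Fraction(v) for v in row] for row in A]
    r = 0
    for c in range(len(M[0])):
        piv = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if piv is None:
            continue                      # no pivot in this column
        M[r], M[piv] = M[piv], M[r]       # interchange two rows
        for i in range(len(M)):
            if i != r and M[i][c] != 0:   # add a multiple of the pivot row
                f = M[i][c] / M[r][c]
                M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

A = [[1, 2, 3],
     [2, 4, 6],    # twice row 1
     [1, 1, 1]]
print(rank(A))  # 2
```

After elimination the number of independent rows (equivalently, columns) can be read off by inspection, exactly as the text suggests.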

If the problem Ax = b has a solution then b = x1 a1 + x2 a2 + · · · + xn an for certain values


of x1 , x2 , . . . , xn and b ∈ Im A, if it does not, then b 6= x1 a1 + x2 a2 + · · · + xn an for any
values of x1 , x2 , . . . , xn . and b ∈
/ Im A. The solvability condition then is this: Ax = b is
solvable iff b ∈ Im A; in terms of ranks this is: Ax = b is solvable iff rank (a1 a2 . . . an b) =
rank (a1 a2 . . . an ). Indeed if rank (a1 a2 . . . an b) = rank A then {a1 , a2 , . . . , ar } is a set of
basis columns for (a1 a2 . . . an b) and by the basis minor theorem Ax = b is solvable. Conversely
if Ax = b is solvable then b is a linear combination of a1 , a2 , . . . , an and hence the manifolds
[a1 , a2 , . . . , an , b] and [a1 , a2 , . . . , an ] are identical and so therefore are their dimensions. The
practical evaluation of rank makes the rank test for solvability practical as well.

The corresponding homogeneous problem Ax = 0 is always solvable; for there to be solutions


other than x = 0, the rank of A must be less than the number of its columns for then by the basis
minor theorem each of n−r columns can be written as a linear combination of the r basis columns.

We denote the set of vectors x in C n satisfying Ax = 0 by Ker A, called the kernel of A. As


A (c1 x1 + c2 x2 ) = 0 if Ax1 = 0 = Ax2 , Ker A is a vector space and hence a subspace of C n . It is
the solution space for Ax = 0. To identify it and determine its dimension we need to find a basis,
i.e., a set of independent solutions of Ax = 0 which span all solutions of Ax = 0. To do this we
write Ax = x1 a1 + x2 a2 + · · · + xn an = 0 and observe that ar+1 , ar+2 , · · · , an belong to Im A for
which {a1 , a2 , · · · , ar } is a basis. Hence each vector ar+1 , ar+2 , · · · , an has a unique expansion
in terms of a1 , a2 , · · · , ar and writing these expansions

c1 a1 + c2 a2 + · · · + cr ar + ar+1 = 0

d1 a1 + d2 a2 + · · · + dr ar + ar+2 = 0

etc.

we read off n − r independent solutions of Ax = 0 as

\[ \begin{pmatrix} c_1 \\ \vdots \\ c_r \\ 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix}, \qquad \begin{pmatrix} d_1 \\ \vdots \\ d_r \\ 0 \\ 1 \\ \vdots \\ 0 \end{pmatrix}, \qquad \text{etc.} \]

We denote these fundamental solutions of Ax = 0 by x_1, x_2, \dots, x_{n-r}. The set of vectors x_1, x_2, \dots, x_{n-r} is independent by inspection. And if x = (\xi_1, \xi_2, \dots, \xi_n)^T is any solution of

Ax = 0, then ξ1 a1 + ξ2 a2 + · · · + ξn an = 0 and this implies, using the expansions of

a_{r+1}, a_{r+2}, \dots in terms of a_1, a_2, \dots, a_r and the fact that \{a_1, a_2, \dots, a_r\} is independent, that

\[ \begin{pmatrix} \xi_1 \\ \xi_2 \\ \vdots \\ \xi_r \end{pmatrix} - \xi_{r+1} \begin{pmatrix} c_1 \\ c_2 \\ \vdots \\ c_r \end{pmatrix} - \xi_{r+2} \begin{pmatrix} d_1 \\ d_2 \\ \vdots \\ d_r \end{pmatrix} - \cdots = \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{pmatrix} \]

and hence that

\[ x - \xi_{r+1} x_1 - \xi_{r+2} x_2 - \cdots = 0 \]

  
This tells us that \{x_1, x_2, \dots, x_{n-r}\} is a basis for Ker A, which we can now write as [x_1, x_2, \dots, x_{n-r}], and that the dimension of Ker A, the solution space of Ax = 0, is n − r, the number of columns of A less its rank.

As simple examples the readers may satisfy themselves that the homogeneous equation

\[ f^T x = 0 \]

has n − 1 independent solutions if f ≠ 0, whereas the set of two homogeneous equations

\[ f_1^T x = 0 \]

and

\[ f_2^T x = 0, \]

rewritten as

\[ \begin{pmatrix} f_1^T \\ f_2^T \end{pmatrix} x = \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \]

has n − 2 independent solutions if f_1 and f_2 are independent, etc. We will see that the solutions

to the first equation may be interpreted as the set of vectors perpendicular to f , the solutions to the
second as the set of vectors perpendicular to f 1 and f 2 , etc.

3.3 The Facts about the Solutions to the Problem Ax=b

We can sum up the facts about the problem Ax = b. Even though all the foregoing is phrased
in terms of matrix operators and column vectors the conclusions hold as well for all other linear
operator problems so we will state them generally:

The problem A\vec x = \vec b has a solution iff \vec b \in Im A. If \vec b \in Im A and \vec x_0 satisfies A\vec x = \vec b, then so also does \vec x_0 + \vec y for any \vec y \in Ker A. And all solutions can be so written, for if A\vec x_1 = \vec b then \vec y_1 = \vec x_1 - \vec x_0 \in Ker A and \vec x_1 = \vec x_0 + \vec y_1. If \vec x = \vec 0 is the only solution of A\vec x = \vec 0, then if \vec b \in Im A, the solution to A\vec x = \vec b is unique.

For a matrix of n columns and rank r the general solution of Ax = b depends on n−r constants
and is x0 + c1 x1 + c2 x2 + · · · + cn−r xn−r where x0 is any particular solution and x1 , x2 , · · · , xn−r
is a fundamental system of solutions of Ax = 0. In this the set of fundamental solutions, viz.,

x1 , x2 , · · · , xn−r , may be replaced by any basis for Ker A.

Working out the following problem (this problem is on page 71 in Shilov’s “Linear Algebra”)
will help the reader get all this straightened out. The problem is to determine the solution to a
system of four equations in five unknowns:

x1 + x2 + x3 + x4 + x5 = 7

3x1 + 2x2 + x3 + x4 − 3x5 = −2

x2 + 2x3 + 2x4 + 6x5 = 23

5x1 + 4x2 + 3x3 + 3x4 − x5 = 12

The rank of the 4 × 5 coefficient matrix A is 2. So Im A is a 2 dimensional subspace of R4


while Ker A is a 3 dimensional subspace of R5 . Because the solvability condition is satisfied there
are two independent equations. It helps therefore to drop two dependent equations, for they must
be satisfied if the remaining two independent equations are satisfied. Then taking the basis minor
to be in the upper left hand corner and transposing dependent columns to the right hand side, we
write

\[ \begin{pmatrix} 1 \\ 3 \end{pmatrix} x_1 + \begin{pmatrix} 1 \\ 2 \end{pmatrix} x_2 = \begin{pmatrix} 7 \\ -2 \end{pmatrix} - x_3 \begin{pmatrix} 1 \\ 1 \end{pmatrix} - x_4 \begin{pmatrix} 1 \\ 1 \end{pmatrix} - x_5 \begin{pmatrix} 1 \\ -3 \end{pmatrix} \]

From here a particular solution and the fundamental system of solutions to the homogeneous equations can be obtained easily. A particular solution is obtained by setting x_3, x_4 and x_5 to zero, and then the fundamental system is obtained by dropping \begin{pmatrix} 7 \\ -2 \end{pmatrix} and first setting x_3 = 1, x_4 = 0, x_5 = 0, then setting x_3 = 0, x_4 = 1, x_5 = 0, etc.
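For this 4 × 5 system, one concrete particular solution and fundamental system, obtained by hand following the recipe above (with the basis minor taken in the upper left corner), can be checked directly. The vectors below are derived values, so the assertions verify them against the original equations:

```python
from fractions import Fraction as F

A = [[1, 1, 1, 1,  1],
     [3, 2, 1, 1, -3],
     [0, 1, 2, 2,  6],
     [5, 4, 3, 3, -1]]
b = [7, -2, 23, 12]

def apply(A, x):
    # matrix-vector product in exact arithmetic
    return [sum(F(a) * xi for a, xi in zip(row, x)) for row in A]

x0 = [F(-16), F(23), F(0), F(0), F(0)]   # particular solution (x3 = x4 = x5 = 0)
k1 = [F(1), F(-2), F(1), F(0), F(0)]     # fundamental solutions of Ax = 0
k2 = [F(1), F(-2), F(0), F(1), F(0)]
k3 = [F(5), F(-6), F(0), F(0), F(1)]
# general solution: x0 + c1 k1 + c2 k2 + c3 k3
```

Note that the two dependent equations (rows 3 and 4) are satisfied automatically, as the text promises.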

3.4 Systems of Differential Equations

To illustrate that what we have been doing has more significance than what its face value would
suggest define pij , a polynomial differential operator, by

\[ p_{ij} = a_{ij} + b_{ij}\, \frac{d}{dt} + c_{ij}\, \frac{d^2}{dt^2} + \cdots \]

and suppose that we have n differential equations which determine n functions u1 (t), u2 (t), . . . , un (t),
via

\[ \begin{pmatrix} p_{11} & p_{12} & \cdots & p_{1n} \\ p_{21} & p_{22} & \cdots & p_{2n} \\ \vdots & \vdots & & \vdots \\ p_{n1} & p_{n2} & \cdots & p_{nn} \end{pmatrix} \begin{pmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{pmatrix} = \begin{pmatrix} b_1(t) \\ b_2(t) \\ \vdots \\ b_n(t) \end{pmatrix} \]

or P u = b. Then D = det P is a polynomial differential operator, as is Pij , the cofactor of pij in


the matrix P .

Now the differential operators pij can be added and multiplied as if they were complex numbers
and so we can write

\[ \sum_{i} p_{ij} P_{ij} = D, \qquad j = 1, 2, \dots, n \]

and

\[ \sum_{i} p_{ik} P_{ij} = 0, \qquad k, j = 1, 2, \dots, n,\ k \neq j \]

or in better notation

\[ P_j^T\, p_k = D\, \delta_{jk} \]

Then writing P u = b as

\[ p_1 u_1 + p_2 u_2 + \cdots + p_n u_n = b \]

we can use the selection properties of P 1 , P 2 , . . . to write



\[ D u_1 = P_1^T b \]

\[ D u_2 = P_2^T b \]

etc.

The result is that we have turned n differential equations in n unknowns, of order equal to the
highest order among the pij , into one differential equation in one unknown of order equal to the
order of D. Only the right hand sides of the equations determining u1 , u2, . . . differ.

3.5 Example

Let y satisfy

\[ \frac{d^2 y}{dx^2} + \lambda y = f(x) \]

then, of course, there is a solution for any f .

Add to this

y=0 at x = 0, 1

and ask again: is there a solution?

If \lambda is not \pi^2, 4\pi^2, 9\pi^2, etc., the homogeneous problem

\[ \frac{d^2 y}{dx^2} + \lambda y = 0, \qquad y = 0\ \text{at}\ x = 0, 1 \]

has only the solution y = 0, whereupon our problem has a solution for all f ’s and it is unique.

If \lambda is one of the values \pi^2, 4\pi^2, 9\pi^2, etc., then the homogeneous problem has the solution

\[ y = \sin\!\left( \sqrt{\lambda}\, x \right) \]

and our problem may or may not have a solution.


If we introduce a difference approximation to \frac{d^2 y}{dx^2}, we can use the results of this lecture. Otherwise, we need a new idea.

3.6 Home Problems

1. We have S atomic or molecular species, s = 1, 2, . . . , S participating in E elementary re-


actions, e = 1, 2, . . . , E and we denote the stoichiometric coefficients in each elementary
reaction by

\[ \nu_e = \begin{pmatrix} \nu_{1e} \\ \nu_{2e} \\ \vdots \\ \nu_{Se} \end{pmatrix} \]

and its rate by \dot\xi_e. Here \nu_e specifies a reaction, not a species, and \nu_{se} is a net stoichiometric coefficient, as a species may be written more than one time in an elementary reaction.

If the rank of the S × E matrix \nu = (\nu_1\ \nu_2\ \dots\ \nu_E) is R, then an independent set of R columns can be selected from \{\nu_1, \nu_2, \dots, \nu_E\}, say \{\mu_1, \mu_2, \dots, \mu_R\}, and as each of the \mu_i's may differ from all of the \nu_j's, the corresponding set of reactions is called a set of apparent reactions. Denote their rates \dot\eta_r, r = 1, 2, \dots, R.

Then as unique coefficients C_{re} can be found such that

\[ \nu_e = \sum_{r=1}^{R} C_{re}\, \mu_r \]

the rate of production of the species, \sum_{e=1}^{E} \nu_e \dot\xi_e, can be written in terms of the rates of the apparent reactions as

\[ \sum_{e=1}^{E} \dot\xi_e \sum_{r=1}^{R} C_{re}\, \mu_r = \sum_{r=1}^{R} \mu_r\, \dot\eta_r \]

where

\[ \dot\eta_r = \sum_{e=1}^{E} C_{re}\, \dot\xi_e \]

This last equation is the rule for writing rate laws for apparent reactions in terms of rate laws
for elementary reactions. As an example take the following set of elementary reactions

O3 + O3 ⇄ O1 + O2 + O3

O3 + O1 ⇄ O1 + O1 + O2

O3 + O2 ⇄ O1 + O2 + O2

O3 + O1 ⇄ O2 + O2

O2 + O1 ⇄ O1 + O1 + O1

O2 + O3 ⇄ O1 + O1 + O3

and show that it is equivalent to the apparent reactions

O3 ⇄ 3O1

2O3 ⇄ 3O2

by writing \dot\eta_1 and \dot\eta_2 in terms of \dot\xi_1, \dot\xi_2, \dots, \dot\xi_6.

2. Given a real number a and an estimate of a−1 , the iteration formula

\[ x_{i+1} = x_i\,(2 - a x_i) \]

produces a sequence of approximations converging to a−1 if the initial estimate of a−1 is



close enough. This is Newton’s iteration as it is used to find the roots of

\[ 1 - \frac{1}{ax} = 0 \]

Because division is not required to determine the sequence of approximations, we can try to
use this iteration formula in matrix inversion. To see why it might work suppose that B is an
estimate of A−1 , differing from it by a small amount ∆, so that AB and BA are close to I.
Then we can write

A (B + ∆) = I

which leads to

A∆ = I − AB

and then to

BA∆ = B − BAB

and we thereby discover that ∆ is approximately

B − BAB

Using this, a new estimate of A−1 can be determined from B as

B + ∆ = 2B − BAB

and this is the iteration formula

Bi+1 = 2Bi − Bi ABi

Let A be the n × n Hilbert matrix where n = 5. Find an approximate inverse and then

try to improve it using the above iteration formula.
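A sketch of the iteration on the 5 × 5 Hilbert matrix, in plain floating point. The starting guess B_0 = A^T / (‖A‖_1 ‖A‖_∞), a standard choice assumed here rather than taken from the text, guarantees that AB_0 is close enough to I for convergence; since the Hilbert matrix is badly conditioned, many iterations are needed before the quadratic convergence takes hold:

```python
def mat_mul(X, Y):
    n, p = len(X), len(Y[0])
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(p)]
            for i in range(n)]

n = 5
A = [[1.0 / (i + j + 1) for j in range(n)] for i in range(n)]  # Hilbert matrix

# A is symmetric with decreasing row sums, so ||A||_1 = ||A||_inf = sum of row 0
norm = sum(A[0])
B = [[A[i][j] / norm ** 2 for j in range(n)] for i in range(n)]

for _ in range(80):                      # B_{i+1} = 2 B_i - B_i A B_i
    BAB = mat_mul(B, mat_mul(A, B))
    B = [[2 * B[i][j] - BAB[i][j] for j in range(n)] for i in range(n)]

residual = max(abs((1.0 if i == j else 0.0) - sum(A[i][k] * B[k][j] for k in range(n)))
               for i in range(n) for j in range(n))
```

The residual I − AB squares at each step, so once it is small it shrinks very quickly, down to a floor set by the conditioning of A and double-precision roundoff.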

Here is another way to think about this. Suppose we can find approximations to the
solution of Ax = b, possibly by using an elimination procedure corrupted by round-off
errors.

Let X0 be an approximation to X, the solution to AX = I. Then the error,


E0 = X − X0 , satisfies AE0 = R0 where the residual R0 is I − AX0 .

Now let ∆0 be an approximation to E0 and define an improved approximation via


X1 = X0 + ∆0 . The new error E1 = X − X1 satisfies AE1 = R1 where the residual R1 is
I − AX1 = R0 − A∆0 .

Let ∆1 be an approximation to E1 and define X2 = X1 +∆1 . The error is E2 = X −X2 ,


the residual is R2 = I − AX2 = R1 − A∆1 and AE2 = R2 .

Etc.

Because AE0 = R0 , we have E0 = XR0 and we can estimate E0 to be X0 R0 . Taking


this to be ∆0 and using

R1 = R0 − A∆0

we can estimate R1 as

\[ R_0 - A X_0 R_0 = R_0 - (I - R_0) R_0 = R_0^2. \]

So if R0 is small, R1 is smaller yet and

X1 = X0 + ∆0

is approximately

X0 + X0 R0 = X0 + X0 (I − AX0 ) = 2X0 − X0 AX0



This is the same iteration formula as before.

Bi+1 = 2Bi − Bi ABi

   
3. To find the inverse of \begin{pmatrix} A & b \\ c^T & d \end{pmatrix} in terms of A^{-1}, find \begin{pmatrix} W & x \\ y^T & z \end{pmatrix} so that

\[ \begin{pmatrix} A & b \\ c^T & d \end{pmatrix} \begin{pmatrix} W & x \\ y^T & z \end{pmatrix} = \begin{pmatrix} I & 0 \\ 0^T & 1 \end{pmatrix} \]

To do this show that W and y T satisfy

AW + b y T = I

and

cT W + d y T = 0T

Then write the first of these as

W + A−1 b y T = A−1

multiply this by cT and use the second to determine y T via

\[ y^T = \frac{c^T A^{-1}}{-d + c^T A^{-1} b} \]

and then W via

\[ W = A^{-1} - \frac{A^{-1} b\, c^T A^{-1}}{-d + c^T A^{-1} b} \]
 
Indeed \begin{pmatrix} A & b \\ c^T & d \end{pmatrix} has an inverse iff -d + c^T A^{-1} b \neq 0. In the same way find x and z.

This is the bordering algorithm used to invert (n + 1) × (n + 1) matrices in terms of


the inverses of n × n matrices. Use it to invert
 
\[ \begin{pmatrix} 1 & 0 & 0 & 1 \\ 2 & 1 & 0 & 1 \\ 3 & 2 & 1 & 1 \\ 1 & 1 & 1 & 1 \end{pmatrix} \]
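A sketch of the bordering algorithm as an inversion routine. The formulas for y^T and W are those derived above; x = A^{-1}b/s and z = -1/s, with s = -d + c^T A^{-1} b, follow in the same way from the second column of the block equation (these two are worked out here, since the problem leaves them to the reader):

```python
from fractions import Fraction

def border_invert(A):
    # grow the inverse from the leading 1x1 submatrix, one border at a time
    n = len(A)
    inv = [[Fraction(1) / A[0][0]]]
    for m in range(1, n):
        b = [A[i][m] for i in range(m)]
        c = [A[m][j] for j in range(m)]
        d = A[m][m]
        ib = [sum(inv[i][k] * b[k] for k in range(m)) for i in range(m)]   # A^{-1} b
        ci = [sum(c[k] * inv[k][j] for k in range(m)) for j in range(m)]   # c^T A^{-1}
        s = -d + sum(c[k] * ib[k] for k in range(m))       # must be nonzero
        W = [[inv[i][j] - ib[i] * ci[j] / s for j in range(m)] for i in range(m)]
        x = [ib[i] / s for i in range(m)]
        y = [ci[j] / s for j in range(m)]
        z = Fraction(-1) / s
        inv = [W[i] + [x[i]] for i in range(m)] + [y + [z]]
    return inv

M = [[Fraction(v) for v in row] for row in
     [[1, 0, 0, 1], [2, 1, 0, 1], [3, 2, 1, 1], [1, 1, 1, 1]]]
Minv = border_invert(M)
check = [[sum(M[i][k] * Minv[k][j] for k in range(4)) for j in range(4)]
         for i in range(4)]
# check equals the 4x4 identity
```

The recursion succeeds here because every leading principal submatrix of the given matrix is invertible, so s is nonzero at each step.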

4. To see what happens in solving Ax = b when A is nearly singular write the LU decomposi-
tion of A as
   
\[ A = LU = \begin{pmatrix} L_1 & 0 \\ \ell^T & 1 \end{pmatrix} \begin{pmatrix} U_1 & u \\ 0^T & \varepsilon \end{pmatrix} \]

where here 1’s lie on the diagonal of L instead of on the diagonal of U and where
det A = ε det U1 . Requiring 1’s on the diagonal of L instead of on the diagonal of U intro-
duces no new idea; but requiring the small quantity ε to be in the lower right hand corner of
U may require interchanging some columns and/or rows of A.

Then write Ax = b as
     
\[ \begin{pmatrix} L_1 U_1 & L_1 u \\ \ell^T U_1 & \ell^T u + \varepsilon \end{pmatrix} \begin{pmatrix} x_1 \\ x \end{pmatrix} = \begin{pmatrix} b_1 \\ b \end{pmatrix} \]

and multiply this out to get

L1 U1 x1 + L1 u x = b1

and


\[ \ell^T U_1 x_1 + \left( \ell^T u + \varepsilon \right) x = b \]

Write the first of these as

\[ x_1 + U_1^{-1} u\, x = (L_1 U_1)^{-1} b_1 \]

multiply this by \ell^T U_1 and use the second to find x via

\[ x = \frac{b - \ell^T L_1^{-1} b_1}{\varepsilon} \]

whence

\[ \begin{pmatrix} x_1 \\ x \end{pmatrix} = \begin{pmatrix} (L_1 U_1)^{-1} b_1 \\ 0 \end{pmatrix} + \begin{pmatrix} -U_1^{-1} u \\ 1 \end{pmatrix} \frac{b - \ell^T L_1^{-1} b_1}{\varepsilon} \]

This formula is the main result of this problem. It tells us that the closer \varepsilon is to zero, the better job we must do in the determination of b - \ell^T L_1^{-1} b_1. Write this formula as

\[ x = \begin{pmatrix} (L_1 U_1)^{-1} b_1 \\ 0 \end{pmatrix} + \frac{1}{\varepsilon} \left( \psi^T b \right) \varphi \]

where \varphi = \begin{pmatrix} U_1^{-1} u \\ -1 \end{pmatrix} and \psi = \begin{pmatrix} (L_1^{-1})^T \ell \\ -1 \end{pmatrix}. Observe that if \psi^T b = 0 then x is independent of \varepsilon. Show that this obtains when b \in Im A(\varepsilon = 0) by observing that A(\varepsilon = 0)\, \varphi = 0 and A^T(\varepsilon = 0)\, \psi = 0.

5. The 5 × 5 Hilbert matrix is

\[ \begin{pmatrix} 1 & 1/2 & 1/3 & 1/4 & 1/5 \\ 1/2 & 1/3 & 1/4 & 1/5 & 1/6 \\ 1/3 & 1/4 & 1/5 & 1/6 & 1/7 \\ 1/4 & 1/5 & 1/6 & 1/7 & 1/8 \\ 1/5 & 1/6 & 1/7 & 1/8 & 1/9 \end{pmatrix} \]

Retaining the fractions, find the inverse. Round to four decimal places, write the decimals as
fractions and find the inverse.

6. Let v1 , v2 , . . . , vn be a set of n variables and let X be a dimensionless variable made up


using products of powers of these variables via

\[ X = v_1^{p_1} v_2^{p_2} \cdots v_n^{p_n} \]

Denote the three fundamental dimensions by M, L and T and write

\[ [v_1] = M^{a_1} L^{b_1} T^{c_1} \]

\[ [v_2] = M^{a_2} L^{b_2} T^{c_2} \]

etc.

then

\[ [X] = \left\{ M^{a_1} L^{b_1} T^{c_1} \right\}^{p_1} \left\{ M^{a_2} L^{b_2} T^{c_2} \right\}^{p_2} \cdots \left\{ M^{a_n} L^{b_n} T^{c_n} \right\}^{p_n} \]

where p_1, p_2, \dots, p_n must satisfy

\[ \begin{pmatrix} a_1 & a_2 & \dots & a_n \\ b_1 & b_2 & \dots & b_n \\ c_1 & c_2 & \dots & c_n \end{pmatrix} \begin{pmatrix} p_1 \\ p_2 \\ \vdots \\ p_n \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix} \]

to make X dimensionless. This is a system of three homogeneous equations in n unknowns.

Each solution to these equations leads to a dimensionless variable. Independent solu-


tions produce independent dimensionless variables. The number of independent dimension-
less variables is the number of independent solutions. And because the sum of independent
solutions is a solution so also the product of dimensionless variables is a dimensionless vari-
able.

As an example, find an independent set of dimensionless variables in terms of which


data on the flow of a liquid in a pipe can be correlated. The variables of interest are: the
pressure drop across the pipe, the velocity of the flow, the length and diameter of the pipe,
the viscosity, density and surface tension of the liquid and the acceleration due to gravity.
Take the dimensions of these variables to be

\[ [\Delta p] = M^1 L^{-1} T^{-2} \]

\[ [v] = M^0 L^1 T^{-1} \]

\[ [\ell] = M^0 L^1 T^0 \]

\[ [d] = M^0 L^1 T^0 \]

\[ [\mu] = M^1 L^{-1} T^{-1} \]

\[ [\rho] = M^1 L^{-3} T^0 \]

\[ [\sigma] = M^1 L^0 T^{-2} \]

\[ [g] = M^0 L^1 T^{-2} \]
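For the pipe-flow example, the columns of the dimension matrix are the exponent triples listed above, ordered \Delta p, v, \ell, d, \mu, \rho, \sigma, g (an ordering chosen here). A basis for the kernel of that matrix then gives an independent set of dimensionless groups; this sketch computes one such basis exactly:

```python
from fractions import Fraction

# rows: exponents of M, L, T; columns: dp, v, l, d, mu, rho, sigma, g
dims = [[ 1,  0, 0, 0,  1,  1,  1,  0],
        [-1,  1, 1, 1, -1, -3,  0,  1],
        [-2, -1, 0, 0, -1,  0, -2, -2]]

def null_basis(A):
    # reduced row echelon form, then one basis vector per free column
    M = [[Fraction(v) for v in row] for row in A]
    rows, cols = len(M), len(M[0])
    pivots, r = [], 0
    for c in range(cols):
        piv = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        M[r] = [a / M[r][c] for a in M[r]]
        for i in range(rows):
            if i != r and M[i][c] != 0:
                f = M[i][c]
                M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        pivots.append(c)
        r += 1
    basis = []
    for free in (c for c in range(cols) if c not in pivots):
        v = [Fraction(0)] * cols
        v[free] = Fraction(1)
        for i, pc in enumerate(pivots):
            v[pc] = -M[i][free]
        basis.append(v)
    return basis

groups = null_basis(dims)   # 8 variables minus rank 3 gives 5 dimensionless groups
```

Each basis vector is an exponent list p making v_1^{p_1} ⋯ v_n^{p_n} dimensionless; products of the resulting groups (e.g. recombinations into Reynolds- or Froude-like numbers) are again dimensionless, matching the remark that sums of solutions are solutions.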

7. A set of variables P1 , P2 , . . . is said to be dimensionally independent if the dimension of


none of the variables can be written as a product of powers of the dimensions of the others.

Suppose

\[ [P_1] = M^{\alpha_1} L^{\beta_1} T^{\gamma_1} \]

etc.

and write the definition of dimensional independence in terms of the rank of the matrix
 
\[ \begin{pmatrix} \alpha_1 & \alpha_2 & \cdots & \alpha_n \\ \beta_1 & \beta_2 & \cdots & \beta_n \\ \gamma_1 & \gamma_2 & \cdots & \gamma_n \end{pmatrix} \]

8. The dimensions of P are independent of the dimensions of P1 and P2 if there are no solutions,
p and q, to

\[ \begin{pmatrix} \alpha \\ \beta \\ \gamma \end{pmatrix} = \begin{pmatrix} \alpha_1 & \alpha_2 \\ \beta_1 & \beta_2 \\ \gamma_1 & \gamma_2 \end{pmatrix} \begin{pmatrix} p \\ q \end{pmatrix} \]

where

\[ [P] = M^{\alpha} L^{\beta} T^{\gamma} \]

etc.

How can this be stated in terms of the rank of a matrix?

Are the dimensions of pressure independent of the dimensions of viscosity and veloc-
ity?

9. The differential equations determining the growth rate of a small disturbance superimposed
on the base state of a rotating fluid layer under an adverse temperature gradient are
 
( D² − a² − Pr σ       0                    (β/κ) d²                )   ( Θ )     ( 0 )
( 0                    D² − a² − σ          (2Ω/ν) d D              )   ( Z )  =  ( 0 )
( −(gα/ν) a² d²        −(2Ω/ν) d³ D         (D² − a²)(D² − a² − σ)  )   ( W )     ( 0 )

where D denotes d/dz.
Write this as a single differential equation in Θ or Z or W .
Lecture 4

Inner Products

4.1 Inner Products

An inner product on C n is a function that assigns a complex number to every pair of vectors in
C n . The complex number assigned to x and y is denoted ⟨x, y⟩ and is required to satisfy the
following conditions

⟨x, y⟩ = \overline{⟨y, x⟩}

⟨x, c1 y 1 + c2 y 2 ⟩ = c1 ⟨x, y 1 ⟩ + c2 ⟨x, y 2 ⟩

and

⟨x, x⟩ > 0 unless x = 0

where an overbar denotes a complex conjugate. This definition is not special to C n .

All inner products on C n take the form

⟨x, y⟩ = x̄ᵀ G y

where G is an n × n matrix satisfying Ḡᵀ = G (Hermitian) and x̄ᵀ G x > 0 for all x ≠ 0 (positive
definite). Then to each Hermitian positive definite matrix G there corresponds an inner product
on C n . The simplest inner product, which we call the plain vanilla inner product, corresponds to
G = I and therein ⟨x, y⟩ = x̄ᵀ y = Σi x̄i yi .

In a specific inner product two vectors x and y are said to be perpendicular if ⟨x, y⟩ = 0;
when this is so, x and Gy are perpendicular in the plain vanilla inner product.

Defining an inner product makes it easy to use the idea of biorthogonal sets of vectors and we
do this at every opportunity as it greatly simplifies our work. We can also formulate a solvability
condition that is not specific to matrix problems, unlike the rank condition presented earlier. To
do this in C n let A be an n × n matrix. Then A maps vectors x ∈ C n into vectors Ax ∈ Im A, a
subspace of C n of dimension r.

4.2 Adjoints

What we seek is a new test to tell us when an arbitrary vector belongs to Im A. At first all we have
to work with is Ker A, a subspace of C n having the interesting dimension n − r but otherwise
bearing no special relation to Im A. Indeed Ker A may be wholly inside Im A or wholly outside
Im A or . . .. What we do then is this: we define a matrix whose kernel helps us identify Im A. To
do this we fix an inner product on C n . Then in this inner product A acquires a companion, denoted
A∗ , and called the adjoint of A, by the requirement that

⟨A∗ x, y⟩ = ⟨x, Ay⟩

for all x and y in C n . Writing this out using ⟨x, y⟩ = x̄ᵀ G y we can determine a formula for A∗ .
It is

A∗ = G⁻¹ Āᵀ G

and this shows how A∗ depends on the inner product in which we are working. If G = I then
A∗ = Āᵀ .
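A quick numerical check of the defining property ⟨A∗x, y⟩ = ⟨x, Ay⟩, sketched in numpy with a randomly generated Hermitian positive definite G (the matrices here are illustrative, not from the text):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
G = B.conj().T @ B + n * np.eye(n)            # Hermitian and positive definite

def ip(x, y):
    return x.conj() @ G @ y                   # <x, y> = x̄ᵀ G y

A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A_star = np.linalg.inv(G) @ A.conj().T @ G    # A* = G⁻¹ Āᵀ G

x = rng.standard_normal(n) + 1j * rng.standard_normal(n)
y = rng.standard_normal(n) + 1j * rng.standard_normal(n)
assert np.isclose(ip(A_star @ x, y), ip(x, A @ y))   # <A*x, y> = <x, Ay>
```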

Now the rank of A∗ is equal to the rank of A. Indeed it is easy to see that the rank of Āᵀ
is equal to the rank of A; it is less easy to see, but no less true, that the rank of G⁻¹ Āᵀ G is also
equal to the rank of A. Bezout’s theorem in Gantmacher’s book “Theory of Matrices” can be used
to produce an algebraic proof of this. The reader can produce a geometric proof by showing that
the dimensions of the subspaces Im G⁻¹ Āᵀ G and Im Āᵀ are equal. In doing this it is useful to
observe that, because G is not singular, if the set of vectors { x1 , x2 , . . . } is independent then so
also is the set of vectors { G⁻¹ x1 , G⁻¹ x2 , . . . }.

Fixing an inner product on C n , we can identify an n × n matrix A∗ , the adjoint of A. And as


A∗ maps C n into itself it defines two subspaces of C n : Im A∗ , of dimension r, and Ker A∗ , of
dimension n − r.

4.3 Solvability Conditions

Now we need to see what these subspaces tell us about the subspaces Im A and Ker A. To do this
we first let S be an arbitrary subspace of C n . Then if we fix an inner product on C n , S acquires a
companion, denoted S ⊥ and called the orthogonal complement of S, by the requirement that S ⊥ be
the set of all vectors in C n perpendicular to each vector in S. It is a subspace of C n and it depends
on the inner product being used. We write the definition of S ⊥ as

S ⊥ = { y : ⟨y, x⟩ = 0  ∀ x ∈ S }

The reader can verify that S ⊥⊥ = S, and also that if dim S = r then dim S ⊥ = n − r, by using
the fact that an r × n system of homogeneous equations of rank r has n − r independent solutions.
Then x ∈ S iff x ⊥ S ⊥ .
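In the plain vanilla inner product these dimension counts are easy to verify numerically; the sketch below (names ours) takes S spanned by the columns of a random matrix and computes S⊥ as Ker Sᵀ from the singular value decomposition:

```python
import numpy as np

rng = np.random.default_rng(1)
n, r = 5, 2
S = rng.standard_normal((n, r))        # columns span S, dim S = r (G = I)

# S⊥ = Ker Sᵀ: the right singular vectors of Sᵀ for the zero singular values
_, _, Vt = np.linalg.svd(S.T)
S_perp = Vt[r:].T                      # n × (n − r) basis for S⊥

assert S_perp.shape[1] == n - r        # dim S⊥ = n − r
assert np.allclose(S.T @ S_perp, 0)    # each basis vector is ⊥ every vector in S
```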
In terms of A and Ker A∗ we can formulate our main result; it is

Im A = ( Ker A∗ )⊥
Schematically this is:


[Figure: C n pictured as the sum of the subspaces Im A and Ker A∗ , which meet only at 0.]

The solvability condition for the problem Ax = b is the requirement that b ∈ Im A. Our main
result tells us that this is also the requirement that b ∈ ( Ker A∗ )⊥ . To use this we fix an inner
product on C n and obtain A∗ . We then determine Ker A∗ by finding n − r independent solutions
to A∗ y = 0 and, using these, decide whether or not b ⊥ Ker A∗ . Hence the solvability condition
for the problem Ax = b is this: either ⟨b, y⟩ = 0 for all y such that A∗ y = 0, whence Ax = b is
solvable, or ⟨b, y⟩ ≠ 0 for some y such that A∗ y = 0, whence Ax = b is not solvable.

The proof that Ker A∗ = ( Im A)⊥ is simple. It is just this: If y ∈ Ker A∗ and x ∈ Im A then
x = Az and ⟨y, x⟩ = ⟨y, Az⟩ = ⟨A∗ y, z⟩ = 0, whence Ker A∗ ⊂ ( Im A)⊥ . If y ∈ ( Im A)⊥
and z ∈ C n then Az ∈ Im A and ⟨y, Az⟩ = 0 = ⟨A∗ y, z⟩; hence setting z = A∗ y we have
A∗ y = 0, whence ( Im A)⊥ ⊂ Ker A∗ .

4.4 Invariant Subspaces

A subspace S is said to be A invariant if Ax ∈ S whenever x ∈ S. If S is A invariant then S ⊥
is A∗ invariant. To see this let y ∈ S ⊥ ; then if x ∈ S, Ax ∈ S and ⟨A∗ y, x⟩ = ⟨y, Ax⟩ = 0,
hence A∗ y ∈ S ⊥ . If we fix an inner product, obtain A∗ and find that A∗ = A, then A is called
self-adjoint (in that inner product). Self-adjoint matrices (and self-adjoint operators in general)
satisfy special and useful conditions. Many of these come from the fact that if S is A invariant
and A∗ = A then S ⊥ is also A invariant. This lies at the heart of the fact that a self-adjoint
matrix has a complete set of eigenvectors. Eigenvectors are introduced in Lecture 5 (on page
85) and are important ingredients in solving linear operator problems. Given a matrix A, then, we
may want to determine if there is an inner product in which it is self-adjoint. This amounts to
determining a positive definite, Hermitian matrix G such that A∗ = A. The condition on G is that
A = G⁻¹ Āᵀ G, or G A = Āᵀ G. Matrices A for which there is a solution to this equation are
called symmetrizable. They are self-adjoint in the corresponding inner product.

To this point our operators have been n × n matrices mapping C n into itself. We conclude this
lecture by producing the solvability condition for the problem Ax = b, x ∈ C n , b ∈ C m . Then we
introduce projection operators. Projections are basic to what are called generalized inverses which
can be used to construct x such that Ax is as close as possible to b when b ∉ Im A.

Let ⟨ , ⟩m and ⟨ , ⟩n denote inner products on C m and C n ; then A, an m × n matrix mapping
C n into C m , has an adjoint A∗ , an n × m matrix mapping C m into C n . It is defined by requiring

⟨A∗ y, x⟩n = ⟨y, Ax⟩m

for all y ∈ C m , x ∈ C n and therefore it can be obtained via the formula

A∗ = Gn⁻¹ Āᵀ Gm

By arguments not unlike the above we discover that

1. Im A and Ker A∗ are subspaces of C m of dimensions r and m − r, whereas Im A∗ and
Ker A are subspaces of C n of dimensions r and n − r, and

2. Im A = ( Ker A∗ )⊥ and Im A∗ = ( Ker A)⊥ .

The solvability condition for Ax = b is as before: Ax = b is solvable iff ⟨b, y⟩m = 0 for all y
such that A∗ y = 0.
 
As an example of the use of this to determine solvability, let

          1 1 1 1  1
A   =     3 2 1 1 −3
          0 1 2 2  6
          5 4 3 3 −1

and b = ( 7, −2, 23, 12 )ᵀ . Then taking Gm = Im and Gn = In we have

               1  3 0  5
               1  2 1  4
A∗ = Aᵀ =      1  1 2  3
               1  1 2  3
               1 −3 6 −1

whence Ker A∗ = [ y 1 , y 2 ] where y 1 = ( −3, 1, 1, 0 )ᵀ and y 2 = ( −2, −1, 0, 1 )ᵀ .

As b passes the test ⟨b, y 1 ⟩m = 0 = ⟨b, y 2 ⟩m , Ax = b is solvable, confirming our earlier
conclusion.
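The same test can be run numerically: a basis for Ker A∗ (= Ker Aᵀ, since Gm = Im) falls out of the singular value decomposition of A, and b need only be checked against it (a numpy sketch):

```python
import numpy as np

A = np.array([[1, 1, 1, 1,  1],
              [3, 2, 1, 1, -3],
              [0, 1, 2, 2,  6],
              [5, 4, 3, 3, -1]], dtype=float)
b = np.array([7.0, -2.0, 23.0, 12.0])

r = np.linalg.matrix_rank(A)           # r = 2
U, s, Vt = np.linalg.svd(A)
N = U[:, r:]                           # columns span Ker Aᵀ = Ker A*
assert np.allclose(A.T @ N, 0)
assert np.allclose(N.T @ b, 0)         # b ⊥ Ker A*, so Ax = b is solvable
```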

4.5 Example: Functions Defined on 0 ≤ x ≤ 1

Denote by f and g two real valued functions, vanishing at x = 0 and x = 1 and belonging to the
set of smooth functions defined on the interval [0, 1]. Then
∫₀¹ f (x) g(x) dx = ⟨f, g⟩

defines an inner product on these functions, and

L = d²/dx²

is a linear differential operator acting on this set of functions.

Now L is self-adjoint, and so too is L + π² , and hence to decide if the problem

d²u/dx² + π² u = f (x)

where u = 0 at x = 0, 1 has a solution, we observe that Ker (L + π² ) is spanned by sin πx. Hence

our problem is solvable for all right hand sides such that

∫₀¹ sin πx f (x) dx = 0

i.e., it is solvable for every f orthogonal to sin πx; in particular, for every function odd about
x = 1/2.
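The solvability test is a single quadrature. A small sketch, approximating ∫₀¹ sin πx f(x) dx by the midpoint rule for two illustrative right hand sides:

```python
import numpy as np

N = 100000
x = (np.arange(N) + 0.5) / N                     # midpoints on [0, 1]

def solvable(f, tol=1e-8):
    # mean of sin(πx) f(x) over [0, 1] ≈ ∫₀¹ sin(πx) f(x) dx
    return abs(np.mean(np.sin(np.pi * x) * f(x))) < tol

assert solvable(lambda t: np.sin(2 * np.pi * t))      # odd about 1/2: solvable
assert not solvable(lambda t: np.sin(np.pi * t))      # ∫ sin² = 1/2: not solvable
```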

4.6 Projections

Let S be any assigned subspace of C m and let y be any fixed vector of C m . Then we can ask: what
vector in S is closest to y? The answer is given by the projection theorem.

To introduce the projection of C m onto S we construct S ⊥ , the orthogonal complement of S,


using the inner product at hand, and then define P , the projection of C m onto S by

P = I on S

and

P = 0 on S ⊥

Then as I − P satisfies

I − P = 0 on S

and

I − P = I on S ⊥

we see that I − P is the projection of C m onto S ⊥ . Indeed we have P² = P and P ∗ = P ; to see
that P ∗ = P we observe that

0 = ⟨(I − P ) y, P z⟩ = ⟨y, (I − P ∗ ) P z⟩   ∀ y, z ∈ C m

Thus we have P ∗ P = P and therefore P ∗ = I on S. But Ker P ∗ = ( Im P )⊥ = S ⊥ ; hence
P ∗ = 0 on S ⊥ and we have P ∗ = P .

The projection theorem tells us that P y is the vector in S that is closest to y. This is established
later. The difference y − P y = (I − P ) y lies in S ⊥ and is, therefore, perpendicular to every vector
in S. This is the error in approximating y by P y.

If dim S = r then dim S ⊥ = m − r and so if b is assigned and a is required to be its best


approximation in S then a is defined by the conditions

a∈S

and

b − a ∈ S⊥

This requires a to satisfy m − r + r = m conditions and a is uniquely defined as a function of b.

To construct P we take any set of r independent vectors in S, where dim S = r, say f 1 , f 2 , . . . , f r ,
and let F denote the m × r matrix

F = ( f 1 f 2 . . . f r )

where F is of rank r. Then for any y ∈ C m we have

P y = a1 f 1 + a2 f 2 + · · · + ar f r = F a,   a = ( a1 , a2 , . . . , ar )ᵀ

where the coefficients a1 , a2 , . . . , ar are uniquely determined. Now as (I − P ) y ∈ S ⊥ we have

⟨f i , (I − P ) y⟩ = 0   i = 1, 2, . . . , r

This is

f̄ iᵀ Gm ( y − F a ) = 0   i = 1, 2, . . . , r

and denoting the r × m matrix whose rows are f̄ 1ᵀ , f̄ 2ᵀ , . . . , f̄ rᵀ , multiplied on the right by Gm ,
viz., F̄ᵀ Gm , by F ∗ we have

F ∗ y − F ∗ F a = 0

and hence we get

P y = F a = F ( F ∗ F )⁻¹ F ∗ y

Because this must be true for all y ∈ C m we have for P the formula

P = F ( F ∗ F )⁻¹ F ∗

where, as F ∗ is an r × m matrix of rank r, F ∗ F is an r × r matrix of rank r.
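With G = I (so that F∗ = Fᵀ) the construction is a few lines of numpy; a sketch with an arbitrary rank-r matrix F:

```python
import numpy as np

rng = np.random.default_rng(2)
F = rng.standard_normal((5, 2))              # r = 2 independent columns spanning S
P = F @ np.linalg.inv(F.T @ F) @ F.T         # P = F (F*F)⁻¹ F*

assert np.allclose(P @ P, P)                 # P² = P
assert np.allclose(P.T, P)                   # P* = P when G = I
y = rng.standard_normal(5)
assert np.allclose(F.T @ (y - P @ y), 0)     # the error y − Py is ⊥ S
```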

The use of the notation F ∗ for F̄ᵀ Gm is not too far fetched. If A maps C n into C m and inner
products ⟨ , ⟩m and ⟨ , ⟩n are defined on C m and C n via positive definite, Hermitian matrices
Gm and Gn , then A∗ mapping C m into C n is given by

A∗ = Gn⁻¹ Āᵀ Gm

This is the way in which F ∗ is to be understood: its leading factor is Ir , the inner product on C r
being taken plain vanilla.



4.7 The Projection Theorem: Least Squares Approximations

Let S be a subspace of C m and let ⟨ , ⟩ denote an inner product defined on C m in terms of a
positive definite, Hermitian matrix Gm . Let S ⊥ denote the orthogonal complement of S, and P and
I − P the projections of C m onto S and S ⊥ .

We can write any vector y ∈ C m as the sum of a vector in S and another vector in S ⊥ . The
expansion is unique, it is

y = P y + (I − P ) y

and it leads to the Pythagorean Theorem:

‖y‖² = ⟨y, y⟩ = ⟨P y + (I − P ) y, P y + (I − P ) y⟩
     = ⟨P y, P y⟩ + ⟨(I − P ) y, (I − P ) y⟩
     = ‖P y‖² + ‖(I − P ) y‖²

where ‖y‖ = ⟨y, y⟩^(1/2) is the length of the vector y.

We can also write any vector y − s ∈ C m , where y ∈ C m and s ∈ S, as the sum of a vector in
S and another in S ⊥ . The expansion is

y − s = ( P y − s ) + ( y − P y )

and using the Pythagorean theorem we can conclude

‖y − s‖² = ‖y − P y‖² + ‖s − P y‖²

This tells us that for any vector s ∈ S

‖y − s‖² ≥ ‖y − P y‖²

and hence that P y lies at least as close to y as does any other vector in S. The vector P y is then
the best approximation to y in S. The error in this approximation, y − P y, lies in S ⊥ and is
perpendicular to all the vectors in S. This establishes the projection theorem, save for the question
of uniqueness.

We can investigate this best approximation problem in another way which shows how it is the
same as the least squares problem.
 
Again let y be any vector in C m and S be a subspace of C m . Then if { f 1 , f 2 , . . . , f r } is a set
of r independent vectors spanning S we can write any vector s ∈ S as

s = a1 f 1 + a2 f 2 + · · · + ar f r = F a

where

F = ( f 1 f 2 . . . f r )   and   a = ( a1 , a2 , . . . , ar )ᵀ

The problem is to determine a1 , a2 , . . . , ar so that s ∈ S is the best approximation to y.

To make the square of the length of the error vector, viz.,

⟨ y − Σi ai f i , y − Σj aj f j ⟩,

least, we look for its stationary points by setting its derivatives with respect to ak , k = 1, . . . , r, to
zero. To do this we write it out as

⟨ y − Σi ai f i , y − Σj aj f j ⟩ = ⟨y − F a, y − F a⟩ = ( ȳᵀ − āᵀ F̄ᵀ ) G ( y − F a )

use

a = Re a + i Im a   and   āᵀ = Re aᵀ − i Im aᵀ

and then differentiate this with respect to both Re ai and Im ai . Using

∂ Re a / ∂ Re ai = ei ,   ∂ Re aᵀ / ∂ Re ai = eiᵀ ,   ∂ Im a / ∂ Re ai = 0,   etc.

we find, on setting the derivative with respect to Re ai to zero, that

( ȳᵀ − āᵀ F̄ᵀ ) G (−F ei ) + ( −eiᵀ F̄ᵀ ) G ( y − F a ) = 0,   i = 1, 2, . . . , r

and hence

Re { F̄ᵀ G ( y − F a ) } = 0

Likewise on setting the derivative with respect to Im ai to zero, we find

( ȳᵀ − āᵀ F̄ᵀ ) G (−iF ei ) + ( i eiᵀ F̄ᵀ ) G ( y − F a ) = 0,   i = 1, 2, . . . , r

and hence

Im { F̄ᵀ G ( y − F a ) } = 0

Our result then is

F̄ᵀ G y = F̄ᵀ G F a

but F̄ᵀ G = F ∗ and therefore we get

a = ( F ∗ F )⁻¹ F ∗ y

This formula solves the least squares problem. The solution to the best approximation problem is
F a where

F a = F ( F ∗ F )⁻¹ F ∗ y = P y

and this is our earlier result.
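With G = I this is exactly the least squares solution that `numpy.linalg.lstsq` computes, which gives a quick check of the formula (an illustrative sketch):

```python
import numpy as np

rng = np.random.default_rng(3)
F = rng.standard_normal((6, 3))
y = rng.standard_normal(6)

a = np.linalg.inv(F.T @ F) @ F.T @ y              # a = (F*F)⁻¹ F* y with G = I
a_lstsq, *_ = np.linalg.lstsq(F, y, rcond=None)   # numpy's least squares solver
assert np.allclose(a, a_lstsq)
```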

4.8 Generalized Inverses

Suppose A is an m × n matrix mapping C n into C m . The problem

Ax = y

has a solution iff y ∈ Im A.

Suppose dim Im A is less than m; then there are vectors in C m not in Im A. If y ∉ Im A,
there is no solution to our problem but we can ask for the best approximation, i.e., a vector x ∈ C n
such that Ax ∈ Im A is as close as possible to y.

To do this we introduce an inner product ⟨ , ⟩ on C m , i.e., we select a positive definite,
Hermitian matrix Gm . Then we denote the length of a vector y ∈ C m by ‖y‖ where

‖y‖² = ⟨y, y⟩

To find x such that Ax is the best approximation to y, we first determine Ax via

Ax = P y

where P , the projection of C m onto Im A, can be constructed using the basis columns of A. The
error y − Ax = y − P y = (I − P ) y is then perpendicular to Im A.

The vector P y, the best approximation to y in Im A, is uniquely determined, hence so too


Ax. But x need not be unique. Because P y ∈ Im A, the solvability condition is satisfied and
we can begin to determine x by letting x0 be a particular solution to Ax = P y. Then the general
solution is x0 + ξ where ξ ∈ Ker A ⊂ C n . Now ξ can be any vector in Ker A, hence x0 + ξ can
be any vector in a plane parallel to Ker A. This plane is simply Ker A translated by x0 and it is
independent of the particular solution x0 we happen to be using.

To make the solution to the best approximation problem unique we make x0 + ξ lie as close
to 0 as possible. To do this we let I − Q be the projection of C n onto Ker A, Q being the
projection of C n onto ( Ker A)⊥ . Then the vector (I − Q)(x0 + ξ) is the closest vector in Ker A
to x0 + ξ and their difference, Q(x0 + ξ) = Qx0 , is independent of ξ. Hence, every solution
is the same distance from Ker A as every other, this distance being the length of Qx0 . But
Qx0 = x0 − (I − Q) x0 ∈ ( Ker A)⊥ is also a solution, due to (I − Q) x0 ∈ Ker A, and, because
its projection on Ker A is 0, it must be the solution closest to 0. So by requiring Ax to be the
best approximation to y and x to be as short as possible we get a unique solution to our best
approximation problem, viz., if x0 is any vector satisfying

Ax0 = P y

then the unique solution x is

x = Qx0

Indeed because Qx0 is independent of x0 , i.e., Q (x0 + ξ) = Qx0 ∀ξ ∈ Ker A, it is the unique
solution to our problem.

It remains only to express x in terms of A and y and thereby define a generalized inverse,
denoted AI , such that AI , mapping C m into C n , satisfies the requirement that to any y ∈ C m ,
x = AI y is the shortest vector in C n such that Ax is the best approximation in Im A to y.

To do this let f 1 , f 2 , . . . , f r be as in the construction of P , i.e., a set of r independent vectors


spanning Im A, then we can write A as

A = FR

where R is an r × n matrix whose columns are the coefficients in the expansion of the columns of
A in the basis f 1 , f 2 , . . . , f r for Im A. Hence R is unique and of rank r. Because F ∗ is defined
to be F̄ᵀ Gm , we take R∗ to be the n × r matrix Gn⁻¹ R̄ᵀ and find that

A∗ = Gn⁻¹ Āᵀ Gm = Gn⁻¹ R̄ᵀ F̄ᵀ Gm = R∗ F ∗

where the r independent columns of R∗ span Im A∗ . Then P , where P = F ( F ∗ F )⁻¹ F ∗ , is
the projection of C m onto Im A and Q, where Q = R∗ ( R∗∗ R∗ )⁻¹ R∗∗ , is the projection of C n
onto Im A∗ . And observing that R∗∗ = R, viz.,

R∗∗ = \overline{( Gn⁻¹ R̄ᵀ )}ᵀ Gn = R Gn⁻¹ Gn = R

we can write Q as

Q = R∗ ( R R∗ )⁻¹ R

These formulas for A, A∗ , P and Q establish the important facts that

P A = A, AQ = A

and

QA∗ = A∗ , A∗ P = A∗

Then defining AI via

AI = R∗ ( R R∗ )⁻¹ ( F ∗ F )⁻¹ F ∗

we get

QAI = AI , AI P = AI

AAI = P

AI A = Q

AAI A = A

and

AI AAI = AI

These formulas tell us that for any y ∈ C m , we have AAI y = P y and so AI y is a solution of

Ax = P y

Indeed, as QAI y = AI y, AI y is the shortest such solution. Hence AI is the generalized inverse of
A, i.e., AI y is the shortest vector such that AAI y is the best approximation to y.
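When both inner products are plain vanilla (Gm = Im , Gn = In) the generalized inverse constructed here is the familiar Moore–Penrose pseudoinverse, available as `numpy.linalg.pinv`. A sketch checking the displayed identities on the 4 × 5 matrix used earlier in this lecture:

```python
import numpy as np

A = np.array([[1, 1, 1, 1,  1],
              [3, 2, 1, 1, -3],
              [0, 1, 2, 2,  6],
              [5, 4, 3, 3, -1]], dtype=float)

AI = np.linalg.pinv(A)        # generalized inverse for Gm = I, Gn = I
P = A @ AI                    # projection of C⁴ onto Im A
Q = AI @ A                    # projection of C⁵ onto (Ker A)⊥ = Im A*

assert np.allclose(A @ AI @ A, A)      # A AI A = A
assert np.allclose(AI @ A @ AI, AI)    # AI A AI = AI
assert np.allclose(P @ P, P) and np.allclose(P.T, P)
assert np.allclose(Q @ Q, Q) and np.allclose(Q.T, Q)
```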

4.9 Home Problems



1. Let { x1 , x2 , . . . , xn } be a basis for C n . Then introduce an inner product and let
{ y 1 , y 2 , . . . , y n } be its biorthogonal set. Show that the elements of a matrix A in the basis
{ x1 , x2 , . . . , xn } are

Aij = ⟨y i , Axj ⟩

where the elements of a matrix A in a basis { x1 , x2 , . . . , xn } are defined by

A xj = Σi Aij xi

Using

Ax = Σi xi ⟨y i , Ax⟩

show that the elements of the product AB are

(AB)ij = Σk ⟨y i , A xk ⟩ ⟨y k , B xj ⟩ = Σk Aik Bkj

2. Let A : C n → C m and let Gn and Gm be the weighting factors in the corresponding inner
products. Then A∗ A : C n → C n and AA∗ : C m → C m . Show that ( AA∗ )∗ = AA∗ and
( A∗ A )∗ = A∗ A. Be careful as there are three different ∗ ’s here.

3. Let A be the matrix in the numerical example in Lectures 3 and 4 where m = 4, n = 5


and r = 2. Then A has many generalized inverses each one corresponding to definite inner
products on C 4 and C 5 . Determine the generalized inverse of A when G4 = I4 and G5 = I5 .
Use this to solve the example problem in Lecture 3.

4. The problem Ax = y where

          1  1  1  1
A   =     1 −1  1  1
          1  1 −1  1
          1  1  1 −1

and y = ( 1, 1, 1, 1 )ᵀ has a unique solution; it is x = ( 1, 0, 0, 0 )ᵀ . Determine c1 and c2
so that A {c1 x1 + c2 x2 } is the best approximation to y where

x1 = ( 1, 1, 0, 0 )ᵀ   and   x2 = ( 1, 0, 1, 0 )ᵀ

Is c1 x1 + c2 x2 the best approximation to x in [ x1 , x2 ]?


Lecture 5

Eigenvectors

5.1 Eigenvectors and Eigenvalues

Eigenvectors are defined for operators that map vector spaces into themselves. Operators that
do this can act repetitively, and hence their squares, cubes, etc., are defined. Their action can be
understood in terms of their invariant subspaces and these may be built up out of their eigenvectors.

Let A be an n × n matrix mapping C n into itself. Then a vector x 6= 0 which satisfies

Ax = λx

for some complex number λ is called an eigenvector of A and λ is called the corresponding eigen-
value. Each eigenvector of A spans a one dimensional A-invariant subspace of C n and each vector
in this span is also an eigenvector of A corresponding to λ. Eigenvectors are not unique, their
lengths being arbitrary.

To determine the eigenvectors of A we write Ax = λx as

(A − λI) x = 0

and observe that solutions other than x = 0 can be found only for certain values of λ. To find the


eigenvectors we must first find the eigenvalues, viz., the values of λ such that the solution space
of our homogeneous problem contains vectors other than 0, i.e., such that dim Ker (A − λI) > 0
and therefore that dim Im (A − λI) < n. This is satisfied iff the rank of A − λI is less than n
which in turn is satisfied iff

det (A − λI) = 0

Each value of λ satisfying this equation is an eigenvalue of A and the corresponding eigen-
vectors make up the solution space Ker (A − λI), called the eigenspace corresponding to the
eigenvalue λ. The number of independent eigenvectors corresponding to λ is dim Ker (A − λI)
and this is n less the rank of (A − λI). It is called the geometric multiplicity of the eigenvalue λ.

To determine the eigenvalues of A we let ∆ (λ) = det (λI − A), write
λI − A = ( λe1 − a1   λe2 − a2   · · ·   λen − an ) and expand ∆ (λ) using property (iii) of determi-
nants, page 33, to get

∆ (λ) = det ( λe1   λe2 − a2   · · · ) + det ( −a1   λe2 − a2   · · · )
      = det ( λe1   λe2   λe3 − a3   · · · ) + det ( λe1   −a2   λe3 − a3   · · · ) +
        det ( −a1   λe2   λe3 − a3   · · · ) + det ( −a1   −a2   λe3 − a3   · · · )
      = etc.

and conclude that ∆ (λ) is a monic polynomial of degree n in λ. We write it

∆ (λ) = λn − ∆1 λn−1 + ∆2 λn−2 − · · · + (−1)n ∆n

where the coefficient ∆i is the sum of the i × i principal minors of A, and where a minor is a
principal minor if the elements on its diagonal are also on the diagonal of A. The coefficients in
∆ (λ) can be written in terms of the eigenvalues. The coefficient ∆i is the sum of the products of

the eigenvalues taken i at a time, e.g.,

∆1 = tr A = λ1 + λ2 + · · · + λn

and

∆n = det A = λ1 λ2 · · · λn

where tr and det denote trace and determinant.
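Both identities are easy to confirm numerically (a sketch with an arbitrary matrix):

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((4, 4))
lam = np.linalg.eigvals(A)

assert np.isclose(np.trace(A), lam.sum().real)          # Δ1 = tr A = Σ λi
assert np.isclose(np.linalg.det(A), lam.prod().real)    # Δn = det A = Π λi
```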

The polynomial ∆ (λ) has n roots in C, counting each root according to its multiplicity. To be
definite we call its distinct roots the eigenvalues of A and we denote them

λ 1 , λ2 , · · · , λd

denoting their algebraic multiplicities m1 , m2 , . . . , md where m1 + m2 + · · · + md = n.

The geometric multiplicity of each eigenvalue, i.e., the greatest number of independent eigen-
vectors corresponding to that eigenvalue, cannot exceed its algebraic multiplicity. Indeed if we
let n1 = dim Ker (A − λ1 I) and determine ∆ (λ) in a basis for C n whose first n1 vectors span
Ker (A − λ1 I), we see that ∆ (λ) contains the factor (λ − λ1 )^n1 , whence n1 ≤ m1 . This may
cause the reader to look at section 5.9 on page 109.

We introduce the eigenvectors of A in the hope of constructing a basis for C n which will
simplify certain calculations that we plan to make. But we will not always be able to find an
eigenvector basis for C n . To make the distinction we need to make, we call an eigenvalue problem
plain vanilla if it leads to n algebraically simple eigenvalues. Then the geometric multiplicity of
each eigenvalue is the same as its algebraic multiplicity and this value is one. In fact we will
go on and call an eigenvalue problem plain vanilla whenever the geometric multiplicity of each
eigenvalue is also its algebraic multiplicity.

There are eigenvalue problems that are not plain vanilla. But this is the exception, not the rule.
To see why this is so and why it is important, we look first at the eigenvalue problem in the simplest

case, n = 2. To do this we write out the characteristic polynomial of a 2 × 2 matrix A as

λ2 − ( tr A) λ + det A

and observe that the characteristic equation has either two distinct roots or one double root. The
matrix A then has either two eigenvalues each of algebraic multiplicity one or one eigenvalue of
algebraic multiplicity two. In the first instance we write

Ax1 = λ1 x1

and

Ax2 = λ2 x2

and derive the important fact that x1 and x2 are independent. Indeed c1 = 0 = c2 is the only
solution to c1 x1 + c2 x2 = 0 for if

c1 x1 + c2 x2 = 0

then

c1 λ1 x1 + c2 λ2 x2 = 0

and so

c2 (λ2 − λ1 ) x2 = 0

As λ1 ≠ λ2 and x2 ≠ 0 we find only c2 = 0 and likewise only c1 = 0. And so eigenvectors corre-
sponding to distinct eigenvalues are independent. By an easy extension of this we see that if n > 2
and λ1 , λ2 , · · · , λd are distinct eigenvalues then any set of d eigenvectors, one corresponding to
each of λ1 , λ2 , · · · , λd , is independent. This is worked out in section 5.8 on page 107.

Because {x1 , x2 } is an independent set of vectors it is a basis for C 2 . Introducing an inner
product on C 2 we can construct its biorthogonal set, denoted { y 1 , y 2 }, via

⟨y i , xj ⟩ = δij

The set of vectors { y 1 , y 2 } is independent and likewise a basis for C 2 ; indeed the matrix with
rows ȳ 1ᵀ G and ȳ 2ᵀ G is the inverse of the matrix ( x1 x2 ).

We remind the reader that the idea of biorthogonal sets is a powerful idea. If any vector x is
expanded as

x = c1 x1 + c2 x2

or as

x = d1 y 1 + d2 y 2

then the coefficients are simply

c1 = ⟨y 1 , x⟩,   c2 = ⟨y 2 , x⟩

and

d1 = ⟨x1 , x⟩,   d2 = ⟨x2 , x⟩

But there is even more. If we introduce A∗ , the adjoint of A, in the same inner product used to

construct y 1 and y 2 , then y 1 and y 2 turn out to be eigenvectors of A∗ , viz.,

A∗ y i = ⟨x1 , A∗ y i ⟩ y 1 + ⟨x2 , A∗ y i ⟩ y 2
       = ⟨Ax1 , y i ⟩ y 1 + ⟨Ax2 , y i ⟩ y 2
       = λ̄1 ⟨x1 , y i ⟩ y 1 + λ̄2 ⟨x2 , y i ⟩ y 2

and hence

A∗ y 1 = λ̄1 y 1

and

A∗ y 2 = λ̄2 y 2

This tells us that the eigenvalues of A and A∗ are complex conjugates { i.e., if det (λI − A) = 0
then det ( λ̄I − A∗ ) = 0, where A∗ = G⁻¹ Āᵀ G } and that their eigenvectors form biorthogonal
sets.
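With the plain vanilla inner product A∗ = Āᵀ, and both statements — conjugate eigenvalues and biorthogonal eigenvectors — can be checked in numpy (an illustrative sketch; the eigenvalues of a random real matrix are distinct with probability one):

```python
import numpy as np

rng = np.random.default_rng(5)
A = rng.standard_normal((3, 3))
lam, X = np.linalg.eig(A)              # A x_i = λ_i x_i
mu, Y = np.linalg.eig(A.conj().T)      # A* y = μ y, with A* = Āᵀ (G = I)

# the eigenvalues of A* are the complex conjugates of those of A
assert np.allclose(np.sort_complex(mu), np.sort_complex(lam.conj()))

# eigenvectors of A and A* belonging to non-matching eigenvalues are perpendicular
for i in range(3):
    j = np.argmin(np.abs(mu - lam[i].conj()))    # pair μ_j = conj(λ_i)
    for k in range(3):
        if k != i:
            assert abs(Y[:, j].conj() @ X[:, k]) < 1e-8
```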

The first case is complete: when two eigenvalues of algebraic multiplicity one turn up, the
corresponding eigenvectors determine a basis for C 2 . If one eigenvalue of multiplicity two is
obtained this continues to be true if dim Ker (A − λ1 I) = 2 as then we can find two independent
eigenvectors, viz., any two independent vectors in C 2 , and write

Ax1 = λ1 x1

and

Ax2 = λ1 x2

and go on as before. But if dim Ker (A − λ1 I) = 1 a complication arises: we cannot find two
independent eigenvectors.

This corresponds to the rank of A − λ1 I having the value one instead of zero whence both
Im (A − λ1 I) and Ker (A − λ1 I) are one dimensional subspaces of C 2 and there is at most one
independent eigenvector. Denoting this x1 we have Ker (A − λ1 I) = [x1 ]. And we observe that
Ker (A − λ1 I) = Im (A − λ1 I) otherwise λ1 cannot be a double root. Indeed, as Im (A − λ1 I)
is one dimensional and A invariant, any vector in Im (A − λ1 I) is an eigenvector of A correspond-
ing to an eigenvalue other than λ1 unless Ker (A − λ1 I) = Im (A − λ1 I). This is established
again in section 5.2.

5.2 Generalized Eigenvectors

So, being short an eigenvector and observing that x1 ∈ Im (A − λ1 I) we seek a vector x2 satis-
fying

(A − λ1 I) x2 = x1

Now, to find x2 , a solvability condition must be satisfied because (A − λ1 I) x = 0 has a solution


other than 0, viz., x1 . However x1 ∈ Im (A − λ1 I) and hence the solvability condition is satisfied.
And therefore a solution, x2 , can be found. It is called a generalized eigenvector. It satisfies
(A − λ1 I)2 x2 = 0 but not (A − λ1 I) x2 = 0. And it is independent of x1 : if

c1 x1 + c2 x2 = 0

then

c1 λ1 x1 + c2 {x1 + λ1 x2 } = 0

or

c2 x 1 = 0

and we find only c1 = 0 = c2 . Of course x2 is not uniquely determined; a multiple of x1 can be


added to a particular x2 to produce another possible x2 .

This illustrates the main idea: when we cannot find enough eigenvectors to make up a basis for
our space we generalize the eigenvector problem in such a way that to each eigenvalue of algebraic
multiplicity m there corresponds m eigenvectors and generalized eigenvectors. The only new idea
required when n is greater than two is that an eigenvector may generate a chain of more than one
generalized eigenvector and there may be more than one chain corresponding to each eigenvalue.

In C 2 the solutions to the generalized eigenvalue problem for A satisfy

Ax1 = λ1 x1

and

Ax2 = x1 + λ1 x2

where {x1 , x2 } is independent and a basis for C 2 . The corresponding biorthogonal set, { y 1 , y 2 },
is also independent and a basis for C 2 . Making the calculation A∗ y i we find

A∗ y 1 = λ̄1 y 1 + y 2

and

A∗ y 2 = λ̄1 y 2

The readers need to carry out this calculation by expanding A∗ y i in { y 1 , y 2 } to satisfy them-
selves that the eigenvalue problem for A∗ is required to generalize in just this way. The check on
this is that the solvability condition for determining x2 is:

x1 ⊥ Ker ( (A − λ1 I)∗ ) = Ker ( A∗ − λ̄1 I ) = [ y 2 ]
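A minimal concrete instance (ours, not the text's): the matrix below has the double eigenvalue λ1 = 2 with geometric multiplicity one, and the chain x1 , x2 is found exactly as described:

```python
import numpy as np

lam1 = 2.0
A = np.array([[2.0, 1.0],
              [0.0, 2.0]])
B = A - lam1 * np.eye(2)                        # A − λ1 I

x1 = np.array([1.0, 0.0])                       # eigenvector: (A − λ1 I) x1 = 0
assert np.allclose(B @ x1, 0)

x2 = np.array([0.0, 1.0])                       # generalized eigenvector
assert np.allclose(B @ x2, x1)                  # (A − λ1 I) x2 = x1
assert np.allclose(B @ B @ x2, 0)               # (A − λ1 I)² x2 = 0
```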

The problem of generalized eigenvectors leads to more possibilities than we found in C 2 . To



explain what can happen, let λ1 be an eigenvalue of A of algebraic multiplicity m1 and let the
dimension of Ker (A − λ1 I) be n1 so that we have n1 independent eigenvectors. If n1 < m1 then
Ker (A − λ1 I) and Im (A − λ1 I) intersect in at least one vector x1 and we take x1 to be one of
our eigenvectors. Using this vector we can determine a vector y 1 such that (A − λ1 I) y 1 = x1
and x1 and y 1 are the first two vectors in a chain. If y 1 is not in Im (A − λ1 I) the chain ter-
minates, otherwise we can determine a vector z 1 such that (A − λ1 I) z 1 = y 1 , etc. The vectors
x1 , y 1 , z 1 , . . . satisfy the equations

Ax1 = λ1 x1

Ay 1 = x1 + λ1 y 1

Az 1 = y 1 + λ1 z 1
..
.

which now generalize the eigenvalue problem. As there may be more than one chain correspond-
ing to the eigenvalue λ1 , it is important in selecting a basis for Ker (A − λ1 I) to first span
Ker (A − λ1 I) ∩ Im (A − λ1 I). The condition that a vector lie in Im (A − λ1 I) is that it be
orthogonal to Ker (A − λ1 I)∗ .

We can illustrate the main idea in the case n = 2. Suppose λ1 is an eigenvalue of algebraic
multiplicity 2 and geometric multiplicity 1. Then we have

Ker (A − λ1 I) = [ x1 ]

and

dim Im (A − λ1 I) = 1

whereupon

Im (A − λ1 I) = [ x ]

and we see that

(A − λ1 I) x = c x

hence

Ax = (λ1 + c) x

whence c must be zero and x must be a multiple of x1 . And we conclude that

Im (A − λ1 I) = Ker (A − λ1 I)

5.3 The Generalized Eigenvector Corresponding to a Double


Eigenvalue

Let λ1 be a double root of det A−λI = 0. { If A is real and λ1 is not real then λ1 is also a double
root. } Let x1 be an eigenvector corresponding to λ1 and suppose there is no other solution of
    
(A − λ1 I) x = 0 independent of x1 , i.e., Ker(A − λ1 I) = [x1]. Then dim Im(A − λ1 I) = n − 1
and we can see that x1 ∈ Im(A − λ1 I). Indeed if x1 ∉ Im(A − λ1 I), then, in a basis made up
of x1 and n − 1 independent vectors in Im(A − λ1 I), A has the representation

( λ1         0^T_{n−1}   )
( 0_{n−1}    A_{n−1 n−1} )

due to the fact that x1 ∉ Im(A − λ1 I) and Im(A − λ1 I) is A invariant. But as λ1 is a double root
it must be a root of det(A_{n−1 n−1} − λ I_{n−1 n−1}) = 0, whence A must have a second eigenvector
corresponding to λ1 ; it must lie in Im(A − λ1 I) and it must be independent of x1. But this is
not so and we conclude that x1 ∈ Im(A − λ1 I). This is important because it is the solvability
condition for the problem


(A − λ1 I) x = x1

and hence we can find x2 such that


(A − λ1 I) x2 = x1

Indeed to any particular solution x2 may be added any multiple of x1 but all such x2 ’s are indepen-
dent of x1 . That is, if

c1 x1 + c2 x2 = 0

then


(A − λ1 I)(c1 x1 + c2 x2) = c2 x1 = 0

whence c2 = 0 and so too c1 = 0.


 
So, if λ1 is a double root of det(A − λI) = 0 and dim Ker(A − λ1 I) = 1, then
Ker(A − λ1 I) ⊂ Im(A − λ1 I) and we can write

Ax1 = λ1 x1

and

Ax2 = x1 + λ1 x2

 
And x2 ∉ Im(A − λ1 I), otherwise there would be a vector x3 such that (A − λ1 I) x3 = x2 and
λ1 would be a triple root of det(A − λI) = 0. As x1 ∈ Im(A − λ1 I) but x2 ∉ Im(A − λ1 I),
x1 , but not x2 , is perpendicular to Ker(A∗ − λ̄1 I).

If we denote by y2 a nonzero solution to A∗ y = λ̄1 y, then

[y2] = Ker(A∗ − λ̄1 I) = Im(A − λ1 I)^⊥

and there is no other solution independent of y2 . Because ⟨x1 , y2⟩ = 0 and ⟨x2 , y2⟩ ≠ 0 we can
require y2 to satisfy ⟨x2 , y2⟩ = 1. Now as x1 ∈ Im(A − λ1 I) we conclude:
   
y2 ⊥ [x1] = Ker(A − λ1 I) and hence y2 ∈ Im(A∗ − λ̄1 I). Using this we can let y1 denote a
solution to (A∗ − λ̄1 I) y1 = y2 . Then y1 is independent of y2 , y1 ∉ Im(A∗ − λ̄1 I) and we can
write

A∗ y1 = λ̄1 y1 + y2

and

A∗ y2 = λ̄1 y2

By calculating ⟨λ̄1 y1 + y2 , x2⟩ = ⟨A∗ y1 , x2⟩ = ⟨y1 , A x2⟩ we can see that ⟨x1 , y1⟩ = 1 while y1 is determined only up to an
additive multiple of y2 . It remains only to select the constant c so that ⟨x2 , y1 + c y2⟩ = 0 and
rename y1 + c y2 as y1 ; then {x1 , x2} and {y1 , y2} are biorthogonal sets in C^n.
     
It is ordinarily not true that [x1 , x2] and [y1 , y2] coincide; this obtains only if [x1 , x2] is
A∗ invariant and then y1 , y2 can be determined directly as the biorthogonal set to x1 , x2 in
[x1 , x2].
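The chain and its biorthogonal companion are easy to check numerically. Below is a minimal sketch, assuming the plain inner product (G = I) and a made-up 2 × 2 matrix with a double eigenvalue and a single eigenvector; it is an illustration, not a computation from the text:

```python
import numpy as np

lam = 2.0
A = np.array([[3.0, 1.0],
              [-1.0, 1.0]])        # (tr A)^2 - 4 det A = 0: double eigenvalue 2

x1 = np.array([1.0, -1.0])         # spans Ker(A - 2I)
x2 = np.array([1.0, 0.0])          # generalized eigenvector: (A - 2I) x2 = x1
assert np.allclose(A @ x1, lam * x1)
assert np.allclose((A - lam * np.eye(2)) @ x2, x1)

# Biorthogonal set: the rows of [x1 x2]^{-1} (plain inner product, G = I)
Y = np.linalg.inv(np.column_stack([x1, x2]))
y1, y2 = Y[0], Y[1]
assert np.allclose(A.T @ y2, lam * y2)                  # A* y2 = lam y2
assert np.allclose((A.T - lam * np.eye(2)) @ y1, y2)    # A* y1 = lam y1 + y2
```

Taking the rows of [x1 x2]^{-1} to get y1 and y2 at once is just the inverse-matrix observation made again in Section 5.5.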

A picture may help; in C^2 it would show the vectors x1 , x2 , y1 , y2 together with the subspaces

Ker(A − λ1 I) = [x1] = Im(A − λ1 I) = [y2]^⊥

and

Ker(A∗ − λ̄1 I) = [y2] = Im(A∗ − λ̄1 I) = [x1]^⊥

5.4 Complete Sets of Eigenvectors

Our expectation when we solve an eigenvalue problem in C n must be: either we will determine a
set of n independent eigenvectors and hence a basis for C n or we will not. A set of n independent
eigenvectors is called a complete set. There are sufficient conditions for this; one is that the eigen-
values of A turn out to be simple roots of ∆ (λ) = 0. This requires A to have n distinct eigenvalues.
Another is that A∗ = A in some inner product. If this is so we can determine an eigenvector x1
in the usual way and then observe that as [x1 ] is A invariant so also is [x1 ]⊥ . Restricting A to
this n − 1 dimensional subspace we can then start over and determine an eigenvector, x2 , of the
restriction of A to [x1 ]⊥ in the usual way. This will be the second eigenvector of A and it will
satisfy ⟨x1 , x2⟩ = 0. If n > 2 we can continue this to determine a set of n mutually orthogonal
eigenvectors, orthogonal in the inner product in which A∗ = A.

To decide the likelihood of turning up n independent eigenvectors we go back to the case n = 2


where ∆ (λ) = det (λI − A) = 0 is the quadratic equation

λ^2 − (tr A) λ + det A = 0

and where tr A = a11 + a22 and det A = a11 a22 − a21 a12 . This equation has a double root iff
(tr A)^2 − 4 det A = 0; otherwise it has two simple roots. If a11 , a12 , a21 and a22 are real numbers
then the double root is real and it corresponds to the one dimensional locus (tr A)^2 − 4 det A = 0 in
the det A, tr A plane separating the region corresponding to two simple real roots from the region
corresponding to two simple complex roots (which are complex conjugates). Two simple roots are
generic, being realized almost everywhere in the det A, tr A plane; the alternative, a double root,
turns up only on a set of measure zero. This continues to be true for n > 2. Our emphasis then
is on the ordinary and simplest possibility, we take up exceptions by example. What we require at
the outset is that A determine a basis for C n made up of independent eigenvectors; we refer to this
as a complete set of eigenvectors, and n simple eigenvalues is sufficient but not necessary for this.
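The genericity claim for n = 2 can be illustrated with a short sketch; the three sample matrices are invented for the illustration:

```python
import numpy as np

def classify_2x2(A, tol=1e-12):
    # Classify the roots of lambda^2 - (tr A) lambda + det A = 0
    tr, det = np.trace(A), np.linalg.det(A)
    disc = tr ** 2 - 4.0 * det
    if abs(disc) <= tol:
        return "double root"
    return "two simple real roots" if disc > 0 else "complex conjugate pair"

A_generic = np.array([[2.0, 1.0], [0.0, 3.0]])    # eigenvalues 2 and 3
A_defective = np.array([[1.0, 1.0], [0.0, 1.0]])  # double root 1, one eigenvector
A_rotation = np.array([[0.0, -1.0], [1.0, 0.0]])  # eigenvalues +i and -i
```

Perturbing A_defective by almost any small amount moves it off the measure-zero locus (tr A)^2 − 4 det A = 0 into one of the two generic regions.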

Before going on we introduce a simple way to find all the eigenvectors lying in one dimensional
eigenspaces. Let A be an n × n matrix. The corresponding eigenvalues are the roots of ∆(λ) =
det(λI − A), where we write

∆(λ) = λ^n − ∆1 λ^{n−1} + ∆2 λ^{n−2} − · · · + (−1)^n ∆n

Letting B (λ) = adj (λI − A), we see that the elements of B (λ) are polynomials of degree n − 1
in λ and so we can write

B(λ) = λ^{n−1} I − B1 λ^{n−2} + · · · + (−1)^{n−1} B_{n−1}

And, as

(λI − A) adj (λI − A) = det (λI − A) I

we have

(λI − A) B (λ) = ∆ (λ) I,

and we see that corresponding to any eigenvalue, say λ1 , where ∆ (λ1 ) = 0, the non-vanishing
columns of B (λ1 ) are eigenvectors of A. Now B (λ1 ) is a matrix whose rank is either one or
zero depending on whether the rank of (λ1 I − A) is either n − 1 or less than n − 1. If the rank of
(λ1 I − A) is n−1 we have dim Ker (λ1 I − A) = 1 and then there is one independent eigenvector
and a candidate can be found among the columns of B (λ1 ). This is all that is required if λ1 is a
simple eigenvalue. To determine B1 , B2 , . . . , Bn−1 and hence B (λ), we can equate the coefficients
of the powers of λ on the two sides of


(λI − A)(λ^{n−1} I − B1 λ^{n−2} + · · · + (−1)^{n−1} B_{n−1}) = (λ^n − ∆1 λ^{n−1} + ∆2 λ^{n−2} − · · ·) I

Doing this we get

B1 = ∆1 I − A

B2 = ∆2 I − AB1

etc.

In Gantmacher’s book, “Theory of Matrices,” a method is explained for determining the sequences
∆1 , ∆2 , . . . and B1 , B2 , . . . simultaneously.
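As a sketch of the recipe, the adjugate can be formed directly from cofactors for a small invented matrix, and a non-vanishing column of B(λ1) then delivers an eigenvector; this bypasses the recursion, which pays off only for larger n:

```python
import numpy as np

def adjugate(M):
    # Classical adjoint: adj(M)[i, j] = (-1)^(i+j) * det(minor of M with row j, column i removed)
    n = M.shape[0]
    adj = np.zeros_like(M, dtype=float)
    for i in range(n):
        for j in range(n):
            minor = np.delete(np.delete(M, j, axis=0), i, axis=1)
            adj[i, j] = (-1) ** (i + j) * np.linalg.det(minor)
    return adj

A = np.array([[4.0, 1.0, 0.0],
              [0.0, 3.0, 1.0],
              [0.0, 0.0, 2.0]])
lam = 3.0                                   # a simple eigenvalue of A
B = adjugate(lam * np.eye(3) - A)           # rank one when rank(lam I - A) = n - 1
col = B[:, np.argmax(np.linalg.norm(B, axis=0))]   # a non-vanishing column
assert np.allclose(A @ col, lam * col)      # the column is an eigenvector
```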

5.5 The Spectral Representation of a Matrix and a Derivation


of the Kremser Equation

Henceforth we let n be arbitrary and assume, unless an exception is made, that we have a com-
plete set of eigenvectors. Then the algebraic multiplicity of the eigenvalues is not important and
we can first denote the eigenvectors x1 , x2 , . . . , xn and then denote the corresponding eigenval-
ues λ1 , λ2 , . . . , λn where λ1 , λ2 , . . . , λn may not be distinct complex numbers. Upon solving the
eigenvalue problem Ax = λx we obtain a set of independent eigenvectors. Then we introduce an

inner product in C n and construct its biorthogonal set. Denoting this y 1 , y 2 , . . . , y n we require

⟨yi , xj⟩ = δij , i, j = 1, 2, . . . , n

 
Each of the sets of vectors {x1 , x2 , . . . , xn} and {y1 , y2 , . . . , yn} is a basis for C^n. Now the set of
n^2 equations ⟨yi , xj⟩ = yi^T G xj = δij can be written

( y1^T G )
( y2^T G )
(   ..   ) ( x1 x2 . . . xn ) = I
( yn^T G )
and we see that the matrix

( y1^T G )
( y2^T G )
(   ..   )
( yn^T G )

is the inverse of the matrix ( x1 x2 . . . xn ). Indeed the vectors y1^T G, y2^T G, . . . are independent of
G.

Writing the n vector equations

Axi = λi xi , i = 1, 2, . . . , n

as the matrix equation

 
A ( x1 x2 . . . xn ) = ( λ1 x1 λ2 x2 . . . λn xn )

we get a formula for A, viz.,


 
                                   ( y1^T G )
                                   ( y2^T G )
A = ( λ1 x1  λ2 x2  . . .  λn xn ) (   ..   )
                                   ( yn^T G )

And when this is written out, we have

A = λ1 x1 y1^T G + λ2 x2 y2^T G + · · · + λn xn yn^T G

This is called the spectral representation of A. Letting P1 = x1 y1^T G, P2 = x2 y2^T G, etc., and using

yi^T G xj = δij , we find

Pi Pi = Pi
and

Pi Pj = 0, i ≠ j

and we write

A = λ1 P1 + λ2 P2 + · · · + λn Pn

We say Pi selects xi because Pi xi = xi and Pi xj = 0, i ≠ j; viz., Pi Σj cj xj = ci xi .

This formula simplifies certain calculations via the multiplication rules for the Pi ; indeed it can
be used to derive powers of A via

A^p = λ1^p P1 + λ2^p P2 + · · · + λn^p Pn , p = 0, 1, 2, . . .

where I = P1 + P2 + · · · + Pn . It can also be used to define polynomials and power series in


A in terms of polynomials and power series in λ, i.e., if f (λ) is a polynomial or power series in
λ then f (A) = f (λ1 ) P1 + f (λ2 ) P2 + · · · . The formula holds as well for p = −1, −2, . . . if
λi ≠ 0, i = 1, 2, . . . , n.
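A minimal numerical sketch of the spectral representation and the power formula, with an invented 2 × 2 matrix and the plain inner product (G = I):

```python
import numpy as np

A = np.array([[3.0, 1.0],
              [0.0, 2.0]])                     # simple eigenvalues 3 and 2
lam, X = np.linalg.eig(A)                      # columns of X are the eigenvectors
Y = np.linalg.inv(X)                           # row i plays the role of y_i^T G
P = [np.outer(X[:, i], Y[i, :]) for i in range(2)]

# Multiplication rules: P_i P_i = P_i, P_i P_j = 0 (i != j), P_1 + P_2 = I
assert np.allclose(P[0] @ P[0], P[0])
assert np.allclose(P[0] @ P[1], np.zeros((2, 2)))
assert np.allclose(P[0] + P[1], np.eye(2))

# A^p = lam_1^p P_1 + lam_2^p P_2, here for p = 5
A5 = lam[0] ** 5 * P[0] + lam[1] ** 5 * P[1]
assert np.allclose(A5, np.linalg.matrix_power(A, 5))
```

The same projectors give f(A) for any polynomial or power series f, since only the eigenvalues are transformed.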

As an example of its use, the balance and equilibrium equations corresponding to the linear, n
stage, counter current separating cascade sketched below
{ The sketch shows the cascade stages numbered from the bottom: stage i receives the L phase at
composition xi+1 from stage i + 1 and the V phase at yi−1 from stage i − 1, and discharges xi and
yi ; at stage 1, y0 = yin and x1 = xout . }

are:

Lxi+1 + V yi−1 = Lxi + V yi

yi∗ = mxi

and

 
yi − yi−1 = E (yi^∗ − yi−1)

where y0 and xn+1 are the compositions of the V and L phase feed streams and yn and x1 are the
compositions of the V and L phase product streams. We write these equations
( xi+1 )     ( xi   )
( yi   ) = A ( yi−1 )

Then by stepping through the cascade we determine its input-output formula to be


   
( xin  )       ( xout )
( yout ) = A^n ( yin  )

where A is the 2 × 2 matrix


 
A = ( 1 + (mV/L)E    −(V/L)E )
    ( mE              1 − E  )

The eigenvalues of A, viz., λ1 = 1 and λ2 = 1 + E(mV/L − 1), are simple unless mV/L − 1 = 0. The
corresponding eigenvectors are

x1 = ( 1 ) ,   x2 = ( V/L )
     ( m )          (  1  )

and in the plain vanilla inner product, viz., G = I, we find

y1 = 1/(1 − mV/L) (   1  ) ,   y2 = 1/(1 − mV/L) ( −m )
                  ( −V/L )                       (  1 )

As a result we have

( xin  )                  {  ( 1    −V/L  )                           ( −mV/L   V/L )  }  ( xout )
( yout )  =  1/(1 − mV/L) {  ( m    −mV/L )  +  [1 + E(mV/L − 1)]^n  (  −m      1   )  }  ( yin  )

This is equivalent to the Kremser equation and the overall material balance, but in a symmetric
form. What is important is that we have constructed a useful representation of A^n and we did not
need a concrete value of n to do this.
If mV/L − 1 is zero, then λ1 = 1 is a double root and to it there corresponds only one independent
eigenvector, x1 = (1, m)^T. The spectral representation of A must take this into account. And to
do this for a 2 × 2 matrix A we write

Ax1 = λ1 x1

and

Ax2 = x1 + λ1 x2

where x2 is a generalized eigenvector; here x2 = (1/E)(1, 0)^T, the factor 1/E being needed so that
(A − I) x2 = x1 .

 
A ( x1 x2 ) = ( λ1 x1   x1 + λ1 x2 )

and using ( x1 x2 )^{−1} = ( y1^T G ; y2^T G ), the stacked rows, where {x1 , x2} and {y1 , y2} are biorthogonal sets, we have

A = λ1 x1 y1^T G + x1 y2^T G + λ1 x2 y2^T G

  = λ1 P1 + P12 + λ1 P2

  = λ1 I + P12

where the multiplication rules are now P1 P12 = P12 , P12 P1 = 0, P2 P12 = 0, P12 P2 = P12 and
P12 P12 = 0. Hence we have A^n = λ1^n I + n λ1^{n−1} P12 and this can be used to derive the Kremser
equation when the equilibrium and operating lines are parallel.
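Since P12 P12 = 0, the binomial expansion of (λ1 I + P12)^n collapses to the two terms just quoted. A short check with made-up numbers:

```python
import numpy as np

lam1 = 0.5
A = np.array([[lam1, 0.7],
              [0.0,  lam1]])       # double eigenvalue lam1, one eigenvector
P12 = A - lam1 * np.eye(2)
assert np.allclose(P12 @ P12, np.zeros((2, 2)))   # P12 is nilpotent

n = 8
An = lam1 ** n * np.eye(2) + n * lam1 ** (n - 1) * P12
assert np.allclose(An, np.linalg.matrix_power(A, n))
```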

5.6 The Adjoint Eigenvalue Problem

This restates what we already know in the case n = 2. Thus if A has a complete set of eigenvectors,

then, in the same inner product in which we determine y 1 , y 2 , · · · , y n , we can obtain A∗ and

calculate A∗ y i . Expanding A∗ y i in y 1 , y 2 , · · · , y n we get

A∗ yi = Σ_{j=1}^{n} ⟨xj , A∗ yi⟩ yj

      = Σ_{j=1}^{n} ⟨Axj , yi⟩ yj

      = λ̄i yi

This tells us that the vectors yi satisfy the eigenvalue problem for A∗, viz.,

A∗ yi = λ̄i yi , i = 1, 2, · · · , n

The sets of eigenvectors of A and A∗ are biorthogonal, while the sets of eigenvalues are complex
conjugates. Indeed det(A − λI) = 0 implies det(A∗ − λ̄I) = det(G^{−1} Ā^T G − λ̄I) = 0.

5.7 Eigenvector Expansions and the Solution to the Problem


Ax=b

We plan to use what we have learned in this lecture in the next lecture to write the solution to
differential and difference equations. Before we do this, and to illustrate the useful fact that we
can solve problems by expanding their solutions in a convenient basis, we return to the problem
Ax = b and assume it has a solution. Then expanding x in the set of eigenvectors of A, assumed
to be complete, we write

x = Σi ci xi

and our job is to determine the coefficients ci , where ci = ⟨yi , x⟩. We can find ci by calculating
the inner product of yi and both sides of Ax = b. Indeed we have

⟨yi , Ax⟩ = ⟨yi , b⟩

⟨A∗ yi , x⟩ = ⟨yi , b⟩

⟨λ̄i yi , x⟩ = ⟨yi , b⟩

λi ⟨yi , x⟩ = ⟨yi , b⟩

whence, assuming λi ≠ 0, we get

⟨yi , x⟩ = ⟨yi , b⟩ / λi

and so we conclude that

x = Σi ( ⟨yi , b⟩ / λi ) xi

is the solution of Ax = b. We see that each coefficient ci is determined independently of the other
coefficients.
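The recipe (eigenvector basis, expansion, coefficients) can be sketched as follows; the matrix and right hand side are invented, and G = I so the biorthogonal set comes from the rows of X^{-1}:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([1.0, 4.0])

lam, X = np.linalg.eig(A)          # complete set of eigenvectors (A is symmetric here)
Y = np.linalg.inv(X).T             # biorthogonal set: column i plays the role of y_i
c = (Y.T @ b) / lam                # c_i = <y_i, b> / lam_i, each found independently
x = X @ c                          # x = sum_i c_i x_i
assert np.allclose(A @ x, b)
```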

The subspaces of C n important to the problem Ax = b, viz., Im A and Ker A, can be thought
about in terms of the eigenvectors of A and A∗: Im A is the span of the eigenvectors of A cor-
responding to eigenvalues that are not zero, Ker A∗ is the span of the eigenvectors of A∗ corre-
sponding to eigenvalues that are zero and Ker A is the span of the eigenvectors of A corresponding
to eigenvalues that are zero. For instance if λ1 = 0 in the foregoing, the solvability condition is
⟨y1 , b⟩ = 0 and if that is satisfied the quotient ⟨y1 , b⟩/λ1 is indeterminate and can be replaced by an
arbitrary constant c1 . The solution then contains an arbitrary multiple of x1 , a basis vector for Ker A.

If corresponding to λ1 (m1 = 2, n1 = 1) we have an eigenvector x1 and a generalized eigen-


vector x2 , we write

x = ⟨y1 , x⟩ x1 + ⟨y2 , x⟩ x2 + · · ·

and observe that

⟨yi , Ax⟩ = ⟨yi , b⟩ , i = 1, 2

imply

⟨y2 , x⟩ + λ1 ⟨y1 , x⟩ = ⟨y1 , b⟩

and

λ1 ⟨y2 , x⟩ = ⟨y2 , b⟩

whereupon, if λ1 is not zero, we obtain ⟨y1 , x⟩ and ⟨y2 , x⟩ and write the solution to Ax = b
accordingly.

5.8 Solvability Conditions and the Solution to Perturbation


Problems

Solvability conditions are important in perturbation calculations. To see why this is so, suppose a
matrix of interest, A, is close to a matrix A0 whose eigenvalue problem has been solved resulting
in a complete set of eigenvectors and simple eigenvalues:

A = A0 + εA1 + ε2 A2 + · · ·

where ε is small and

A0 x0i = λ0i x0i , i = 1, 2, . . . , n


A0∗ y0i = λ̄0i y0i , i = 1, 2, . . . , n

Then writing

xi = x0i + εx1i + ε2 x2i + · · ·

and

λi = λ0i + ελ1i + ε2 λ2i + · · ·

substituting into Axi = λi xi and equating the coefficients of 1, ε, ε2 , . . . to zero we find


(A0 − λ0i I) x0i = 0

(A0 − λ0i I) x1i = (λ1i I − A1) x0i

(A0 − λ0i I) x2i = (λ2i I − A2) x0i + (λ1i I − A1) x1i

etc.

The first problem determines x0i and λ0i . And at every succeeding order the matrix A0 − λ0i I
appears and the homogeneous problem (A0 − λ0i I) x = 0 has a non zero solution, viz., x0i . To
determine the first corrections, x1i and λ1i , we turn to the second problem. To get x1i requires
that a solvability condition be satisfied. This is the requirement that (λ1i I − A1) x0i belong to
Im(A0 − λ0i I) and hence be perpendicular to Ker(A0∗ − λ̄0i I). But this is [y0i] and hence the
solvability condition

⟨y0i , (λ1i I − A1) x0i⟩ = 0

determines λ1i as

λ1i = ⟨y0i , A1 x0i⟩

and so to first order

λi = λ0i + ⟨y0i , A1 x0i⟩ ε

Continuing the calculation requires no new ideas but a lot of tedious work. Indeed to determine x1i
we use the solution of Ax = b written above, putting A0 − λ0i I in place of A and (λ1i I − A1) x0i
in place of b. Because the eigenvalues and eigenvectors of A0 − λ0i I are λ0j − λ0i and x0j , j =
1, 2, . . . , n, we get

x1i = Σ_{j≠i} [ ⟨y0j , (λ1i I − A1) x0i⟩ / (λ0j − λ0i) ] x0j + c1i x0i

This is required to determine λ2i ; the readers may wish to satisfy themselves that λ2i , where
λ2i = ⟨y0i , A2 x0i⟩ − ⟨y0i , (λ1i I − A1) x1i⟩, is independent of the value assigned to c1i .

The calculation becomes more interesting as complexities arise. Suppose λ01 turns out to be a
double root and Ker(A0 − λ01 I) is two dimensional so that A0 retains a complete set of eigen-
vectors. On perturbation, λ01 is likely to split into two simple roots λ1 and λ2 to which corre-
spond independent eigenvectors x1 and x2 . Now x1 and x2 approach definite vectors x01 and x02 in
Ker(A0 − λ01 I) as ε goes to zero but we cannot know in advance what these limits are and hence
we cannot select x01 and x02 in advance out of all of the possibilities in Ker(A0 − λ01 I). This means
that at the outset when we write x1 = x01 + εx11 + · · · and x2 = x02 + εx12 + · · · we do not know
x01 and x02 and we must determine their values as we go along. We do not explore this and other
complications. That would deflect us from our elementary goals.
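The first order formula λi ≈ λ0i + ⟨y0i , A1 x0i⟩ ε is easy to check against an exact computation; the matrices below are invented, and A0 is taken diagonal so that x0i = y0i = ei and the first-order shifts are just the diagonal elements of A1:

```python
import numpy as np

A0 = np.diag([1.0, 2.0, 4.0])          # solved problem: natural basis, simple eigenvalues
A1 = np.array([[0.5, 0.3, 0.1],
               [0.3, -0.2, 0.2],
               [0.1, 0.2, 0.4]])
eps = 1e-3

# lam_1i = <y_0i, A1 x_0i> reduces to the diagonal of A1 here
approx = np.diag(A0) + eps * np.diag(A1)
exact = np.linalg.eigvalsh(A0 + eps * A1)   # ascending, matching diag(A0)
assert np.allclose(approx, exact, atol=1e-5)  # error is O(eps^2)
```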

5.9 More Information

Here is more information on the material in this lecture.

1. Early in the lecture we said that if there is an inner product in which A∗ = A then A has
a complete set of eigenvectors. Indeed in that inner product A has n mutually perpendicu-
lar eigenvectors. This establishes their linear independence and linear independence if not
orthogonality is retained as we introduce other inner products. The reader may go on and
show that the corresponding eigenvalues are real by calculating λi ⟨xi , xi⟩. The condition
that A∗ = A is that GA equal its own conjugate transpose, i.e., that GA be Hermitian.

The converse of this is true. If A has a complete set of eigenvectors and real eigenvalues
then A∗ = A in some inner product. Denoting the eigenvectors x1 , x2 , . . . , xn and the
corresponding eigenvalues λ1 , λ2 , . . . , λn we can write

  
A ( x1 x2 . . . xn ) = ( x1 x2 . . . xn ) diag( λ1 λ2 . . . λn )

hence letting X = ( x1 x2 . . . xn ) we have

A X X̄^T = X diag( λ1 λ2 . . . λn ) X̄^T

so if G = X X̄^T then G = Ḡ^T, x̄^T G x > 0 for all x ≠ 0, and AG equals its own conjugate
transpose, i.e., AG is Hermitian.

This result tells us that the requirement A∗ = A in some inner product is necessary and
sufficient that A have a complete set of eigenvectors and real eigenvalues. The readers may
ask: why is it that the λ’s must be real?

2. If λ1 , λ2 , . . . , λd are distinct eigenvalues then any set of d eigenvectors, each eigenvector


corresponding to a different eigenvalue, is independent. This is true whatever the multiplici-

ties of λ1 λ2 . . . λd . If the set of eigenvectors is x1 , x2 , . . . , xd then to determine whether
it is independent or dependent we must solve the equation

c1 x1 + c2 x2 + · · · + cd xd = 0


To do this we multiply by (A − λ1 I) getting

c2 (λ2 − λ1) x2 + · · · + cd (λd − λ1) xd = 0

and then by (A − λ2 I), . . . , ultimately getting

cd (λd − λ1)(λd − λ2) · · · (λd − λd−1) xd = 0

or cd = 0. Likewise c1 = 0, c2 = 0, etc. More is true: If n1 independent eigenvectors


correspond to λ1 , n2 to λ2 , . . . then the set of n1 + n2 + · · · eigenvectors is independent.

Even more is true. If corresponding to each distinct eigenvalue we determine a set of inde-
pendent eigenvectors and then corresponding to each of these a chain of generalized eigen-
vectors, then all of these eigenvectors and generalized eigenvectors are independent. The

idea is that by multiplying a linear combination of these vectors by (A − λ1 I) sufficiently
many times we can remove from it all vectors corresponding to λ1 . To see how this works
take the simple example where x1 and x2 correspond to λ1 and x3 to λ3 . Then write

Ax1 = λ1 x1

Ax2 = x1 + λ1 x2

and

Ax3 = λ3 x3

To determine c1 , c2 and c3 so that

c1 x1 + c2 x2 + c3 x3 = 0


multiply this by (A − λ1 I) to get

c2 x1 + c3 (λ3 − λ1) x3 = 0

and then again by (A − λ1 I) to get

c3 (λ3 − λ1)^2 x3 = 0.

By doing this we discover that c3 = 0 and this implies c2 = 0 and so c1 = 0.

3. Any set of independent vectors in C n determines a unique biorthogonal set in its span. For
 
example if x1 and x2 are independent in C^n then we can determine y1 and y2 in [x1 , x2]
so that ⟨y1 , x1⟩ = 1, ⟨y1 , x2⟩ = 0, ⟨y2 , x1⟩ = 0, and ⟨y2 , x2⟩ = 1. Indeed writing
y1 = a x1 + b x2 , y2 = c x1 + d x2 the four biorthogonality conditions determine a, b, c and d
uniquely. If x1 , x2 , and x3 are independent we can determine y1 , y2 , and y3 in [x1 , x2 , x3]
so that ⟨yi , xj⟩ = δij but now y1 and y2 need not lie in [x1 , x2].

5.10 Similarity or Basis Transformations

We introduce the idea that a matrix represents a linear operator in a specified basis. Let ~x be a

vector (possibly a column vector) and ~e1 , ~e2 , . . . , ~en be a basis for the n dimensional vector
space in which ~x resides. Then we can write

~x = x1~e1 + x2~e2 + · · · + xn~en


 
and denote by x = ( x1 , x2 , . . . , xn )^T the column vector representing ~x in the basis {~e1 , ~e2 , . . . , ~en}.
If {f~1 , f~2 , . . . , f~n} is a second basis written in terms of the first by

f~1 = p11 ~e1 + p21 ~e2 + · · · + pn1 ~en

f~2 = p12 ~e1 + p22 ~e2 + · · · + pn2 ~en

etc.


then the column vectors representing the vectors f~1 , f~2 , . . . , f~n in the basis {~e1 , ~e2 , . . . , ~en} are
the columns of a matrix denoted P . And because {f~1 , f~2 , . . . , f~n} is independent so also is the set
of columns of P and hence det P ≠ 0. If we now write

~x = y1 f~1 + y2 f~2 + · · · + yn f~n


then ~x is represented in the basis {f~1 , f~2 , . . . , f~n} by the column vector y = ( y1 , y2 , . . . , yn )^T and a
simple calculation shows that

x = P y

The formula x = P y determines the column vector x representing a vector ~x in the basis

~e1 , ~e2 , . . . , ~en in terms of the column vector y representing the vector ~x in another basis

f~1 , f~2 , . . . , f~n . The columns of the transformation matrix P are the column vectors represent-
ing the second basis vectors in the first basis. Each vector ~x is represented by many column vectors
corresponding to many bases and each column vector represents many vectors again corresponding
to many bases but the representation is one-to-one in a fixed basis.

If L is a linear operator (possibly an n × n matrix) acting in this vector space we can write

L~e1 = a11 ~e1 + a21 ~e2 + · · · + an1 ~en

L~e2 = a12 ~e1 + a22 ~e2 + · · · + an2 ~en

etc.

and denote by A the matrix whose columns are the column vectors representing L~e1 , L~e2 , . . . in
 
the basis ~e1 , ~e2 , . . . , ~en . We call this the matrix representing L in the basis ~e1 , ~e2 , . . . , ~en .

Likewise denoting by B the matrix representing L in the basis f~1 , f~2 , . . . , f~n we find:

A = P B P^{−1}


The formula A = P B P^{−1} determines the matrix A representing L in the basis {~e1 , ~e2 , . . . , ~en}

in terms of the matrix B representing L in another basis f~1 , f~2 , . . . , f~n . Each linear operator
L is represented by many matrices corresponding to many bases and all display the same informa-
tion but this information is easier to obtain in some bases than it is in others. Indeed if A, x and b
 
represent L, ~x and ~b in the basis ~e1 , ~e2 , . . . , ~en while in f~1 , f~2 , . . . , f~n the representation
is B, y and c then the equation L~x = ~b is represented in C n by both Ax = b and By = c. And one
of these may be easier to solve than the other.

Using A = P B P^{−1} and the theorem that the determinant of a product is the product of the
determinants of its factors we see that

det(λI − A) = det(λI − B)

and hence we define the characteristic polynomial of L to be the characteristic polynomial of any
matrix that represents it. We then define the eigenvalues of L to be the eigenvalues of any matrix
that represents it. The eigenvalues of any two matrices A and B, where A = P B P^{−1} for any
nonsingular matrix P , are the same. The eigenvalues of L are independent of the basis used for
their determination.
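A sketch of these statements with invented data; the columns of P hold the second basis vectors expressed in the first basis:

```python
import numpy as np

rng = np.random.default_rng(0)
B = rng.standard_normal((3, 3))            # L represented in the basis f_1, f_2, f_3
P = rng.standard_normal((3, 3))            # columns: f_j expressed in e_1, e_2, e_3
A = P @ B @ np.linalg.inv(P)               # the same operator in the basis e_i

# The characteristic polynomial, hence the eigenvalues, is basis independent
assert np.allclose(np.sort_complex(np.linalg.eigvals(A)),
                   np.sort_complex(np.linalg.eigvals(B)))

# A column vector y in the f basis represents the same vector as x = P y in the e basis
y = np.array([1.0, -2.0, 0.5])
x = P @ y
assert np.allclose(A @ x, P @ (B @ y))     # L x computed in either basis agrees
```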

If L has an invariant subspace of dimension k, then, using a basis whose first k vectors lie
in this subspace, the matrix representing L in this basis reflects this structure by exhibiting an
(n − k) × k block of zeros in its lower left hand corner. Its determinant then factors as the product
of the determinants of its upper left hand k × k block and its lower right hand (n − k) × (n − k)
block. This establishes the result that the geometric multiplicity of an eigenvalue cannot exceed
its algebraic multiplicity. To see this let dim Ker(A − λ1 I) = n1 and let the first n1 vectors in a
basis be a basis for Ker(A − λ1 I); then det(λI − A) contains the factor (λ − λ1)^{n1} and hence
the algebraic multiplicity m1 cannot be less than n1 .

Let the linear operator L be the n × n matrix A. Then it is represented by itself in the natural
basis, ( 1, 0, . . . , 0 )^T , · · · , ( 0, . . . , 0, 1 )^T . If A has a complete set of eigenvectors x1 , x2 , . . . , xn
its matrix in the basis {x1 , x2 , . . . , xn} is the diagonal matrix of the corresponding eigenvalues. Such a basis
is called a diagonalizing basis. If to the eigenvalue λ1 , repeated twice, there corresponds only
the eigenvector x1 , we construct a generalized eigenvector x2 and in the basis {x1 , x2 , . . .} the
matrix of A has the block

( λ1   1  )
( 0    λ1 )

in the upper left hand corner. Using a basis of eigenvectors
and generalized eigenvectors, we find that the matrix representing A is block diagonal. To each
eigenvalue λi of multiplicity mi , i = 1, 2, . . . , d, there corresponds an mi × mi block. Outside
of these d blocks all elements vanish. Inside the ith block λi appears on the diagonal, 1 or 0
on the superdiagonal, and 0 elsewhere. The structure of the superdiagonal is determined by the
chains of generalized eigenvectors. For instance if λ1 is a threefold root to which corresponds
only the eigenvector x1 , then x1 generates the chain x1 → x2 → x3 and the corresponding block is

( λ1   1    0  )
( 0    λ1   1  )
( 0    0    λ1 )

but if there are two eigenvectors x1 and x2 and x2 generates the chain x2 → x3 the block is

( λ1   0    0  )
( 0    λ1   1  )
( 0    0    λ1 )

The forms we have been talking about, including the purely diagonal form, are called Jordan
forms. Such forms are either diagonal or as close to diagonal as we can get using basis trans-
formations. In A = P J P^{−1} the columns of the transformation matrix are the column vectors
representing the eigenvectors and generalized eigenvectors of A in the natural basis.

Shilov’s book “Linear Algebra” gives an algebraic account of this via polynomial algebras and
their ideals. Gantmacher’s book “Theory of Matrices” gives both an algebraic and a geometric
explanation.

5.11 Home Problems

1. Derive the formula for the eigenvalues and eigenvectors of the block triangular matrix

   ( A  0 )
   ( B  C )

   where the blocks on the diagonal are square and have simple eigenvalues. To

do this, write the eigenvalue problem as


    
( A  0 ) ( x )     ( x )
( B  C ) ( y ) = λ ( y )

and solve it in terms of the solutions to the eigenvalue problems for A and C.

2. A linear separating cascade is run steadily in cocurrent flow. Show that

   ( xi+1 )                     ( 1     (V/L)E          ) ( xi )
   ( yi+1 ) = 1/(1 + (mV/L)E) · ( mE    1 + (mV/L − 1)E ) ( yi )

   and hence determine ( xout , yout )^T in terms of ( xin , yin )^T for an n stage cascade. The
   performance of such a cascade cannot exceed that of one equilibrium stage. Indeed if E = 1
   the stepping matrix is singular and ( xi , yi )^T cannot be determined from ( xi+1 , yi+1 )^T.
   And for good reason.

 Fora countercurrent
 cascade
 the Kremser equation is the corresponding formula for
x x
 in  in terms of  out . For fixed xin , yin , investigate xout and yout as n grows large.
yout yin
mV
The results will depend on whether is greater or less than 1 and on whether yin is greater
L
or less than mxin .
Let mV/L = 1 and rederive the Kremser equation. In this instance the eigenvalue 1 is
repeated and corresponds to only one independent eigenvector. Is the result the limit of the
ordinary Kremser equation for mV/L ≠ 1 as mV/L → 1?

3. Determine the eigenvectors and the eigenvalues of the matrix

I + a b^T

Show that its trace is n + b^T a and that its determinant is 1 + b^T a.

4. Show that the eigenvalues of diagonal and triangular matrices are their diagonal elements.

5. If the simple Drude model is used to derive the potential energy of three molecules lying on
a straight line, the matrix
 
(  a   −b   −c )
( −b    a   −d )
( −c   −d    a )

turns up. The numbers a, b, c and d are all positive and a >> b, c, d. The numbers b, c and d
denote the dipole-dipole interactions.

The principal invariants of this matrix are

∆1 = 3a

∆2 = 3a^2 − (b^2 + c^2 + d^2)

∆3 = a^3 − a (b^2 + c^2 + d^2) − 2bcd

and so its eigenvalues remain unchanged if b, c and d are interchanged.

Show that if c = d = 0 the eigenvalues are

λ1 = a − b

λ2 = a

λ3 = a + b

while if only d = 0 the eigenvalues are

λ1 = a − √(b^2 + c^2)

λ2 = a

λ3 = a + √(b^2 + c^2)

Indeed if d ≠ 0 we might guess a fair approximation to be

λ1 = a − √(b^2 + c^2 + d^2)

λ2 = a

λ3 = a + √(b^2 + c^2 + d^2)

and using this guess we get

λ1 + λ2 + λ3 = ∆1

λ1 λ2 + λ2 λ3 + λ3 λ1 = ∆2

λ1 λ2 λ3 = ∆3 + 2bcd

Write
     
(  a   −b   −c )   (  a   −b   −c )     ( 0  0  0 )
( −b    a   −d ) = ( −b    a    0 ) − d ( 0  0  1 )
( −c   −d    a )   ( −c    0    a )     ( 0  1  0 )

and estimate the eigenvalues by a perturbation calculation. Carry this out to first order in d
and then calculate λ1 + λ2 + λ3 , λ1 λ2 + λ2 λ3 + λ3 λ1 and λ1 λ2 λ3 . Are your estimates
improved if d^2 is added to b^2 + c^2 wherever b^2 + c^2 appears?

6. Here is an n × n matrix:

   ( 1  1  · · ·  1 )
   ( 1  1  · · ·  1 )
   ( ·  ·         · )
   ( 1  1  · · ·  1 )

Show that its eigenvalues are n and 0 where 0 is repeated n − 1 times. Show that the
corresponding eigenvectors are

   ( 1 )
   ( 1 )
   ( · )
   ( 1 )

and

   (  1 )   (  1 )           (  1 )
   ( −1 )   (  0 )           (  0 )
   (  0 ) , ( −1 ) , · · · , (  0 )
   (  · )   (  · )           (  · )
   (  0 )   (  0 )           ( −1 )

Construct an orthogonal set of eigenvectors.

7. Let A, P and Q be n × n matrices where A and Q are known and P is to be determined. The
equation for doing this is

AT P + P A = −Q

This is a system of linear equations in the unknown elements of the matrix P .



As such it can be written

ap = −q

where a is an n2 × n2 matrix and p and q are n2 × 1 vectors. To see what this equation looks
like and to decide when it can be solved, suppose that the eigenvalue problems for A and AT
 
lead to the biorthogonal sets of eigenvectors x1 , x2 , . . . , xn and y 1 , y 2 , . . . , y n and

the corresponding eigenvalues λ1 , λ2 , . . . , λn . Then write

A^T P + P A = −Q

in the basis { yi yj^T } and show that the linear operator

P ↦ A^T P + P A

is non singular iff λi + λj ≠ 0 ∀ i, j.
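One way to exhibit the n² × n² matrix a concretely is through Kronecker products: with the column-major vec operation, vec(AᵀP + PA) = (I ⊗ Aᵀ + Aᵀ ⊗ I) vec(P). A sketch with invented 2 × 2 data whose eigenvalues satisfy λi + λj ≠ 0:

```python
import numpy as np

A = np.array([[-1.0, 2.0],
              [0.0, -3.0]])                 # eigenvalues -1 and -3: lam_i + lam_j != 0
Q = np.eye(2)

# vec(A^T P + P A) = (I kron A^T + A^T kron I) vec(P), column-major vec
n = A.shape[0]
a = np.kron(np.eye(n), A.T) + np.kron(A.T, np.eye(n))
p = np.linalg.solve(a, -Q.flatten(order="F"))
P = p.reshape((n, n), order="F")
assert np.allclose(A.T @ P + P @ A, -Q)
```

The eigenvalues of a are exactly the sums λi + λj, which is where the solvability condition comes from.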

8. Determine the eigenvalues and the eigenvectors of the matrix

I + a b^T + b a^T

taking a and b to be independent.

To do this let x1 , x2 , . . . , xn−2 be independent and lie in [a, b]⊥ . Then x1 , x2 , . . . , xn−2
are eigenvectors corresponding to the eigenvalue 1.

The remaining two eigenvectors lie in [a, b]⊥⊥ = [a, b]. To find these and the corre-
sponding eigenvalues put

x = αa+βb

and observe that

  
(I + a b^T + b a^T)(α a + β b) = λ (α a + β b)

requires

α + α b^T a + β b^T b = λ α

and

β + α a^T a + β a^T b = λ β

or

( 1 + b^T a     b^T b   ) ( α )     ( α )
(   a^T a     1 + a^T b ) ( β ) = λ ( β )

The determinant and the trace of the matrix on the left hand side are

det = (1 + a^T b)^2 − (a^T a)(b^T b)

and

tr = 2 (1 + a^T b)

whence

tr^2 − 4 det > 0

and this tells us that the remaining two eigenvalues are real and not equal.

As a simple example, let

a^T b = 0, a^T a = 1, b^T b = 1

then the eigenvalues of the matrix

( 1  1 )
( 1  1 )

are 0 and 2, corresponding to α = 1, β = −1 and α = 1, β = 1.

Determine the eigenvectors and the eigenvalues of

I + a b^T + c d^T

taking a and c and b and d to be independent.

9. Let A and B be m × n matrices of ranks r and s. Then solutions, x ∈ R^n , x ≠ 0, to the


problem

Ax = λBx

can sometimes be of interest. Begin the study of this problem by finding a bound on the
largest number of independent solutions corresponding to any λ ≠ 0.

How can Im A ∩ Im B be determined?

10. Suppose A commutes with its adjoint A∗ , viz.,

AA∗ = A∗ A

whereupon A is called normal.

Denote by x, λ an eigenvector and the corresponding eigenvalue of A. Assume the


eigenspaces of A are one dimensional and prove that x is an eigenvector of A∗ . Hence
conclude that the eigenvectors of A form an orthogonal set of vectors.
Lecture 6

The Solution of Differential and Difference


Equations

6.1 A Formula for the Solution to dx/dt = Ax

Suppose the departure of x1 (t), x2 (t), . . . , xn (t) from assigned initial values is determined by the
set of n ordinary differential equations

dxᵢ/dt = Σⱼ₌₁ⁿ aᵢⱼ xⱼ ,   i = 1, 2, . . . , n.

Then introducing x(t) via

x(t) = ( x1(t), . . . , xn(t) )ᵀ

we can write this as

dx/dt = Ax


where x(t = 0) is assigned and where the constants aij are the elements of the n × n matrix A. We
suppose, first, that A has a complete set of eigenvectors and construct the solution to this problem
in terms of these eigenvectors.

To introduce the notation let the eigenvectors and eigenvalues of A and A∗ satisfy

Axᵢ = λᵢ xᵢ ,   i = 1, . . . , n

and

A∗yᵢ = λ̄ᵢ yᵢ ,   i = 1, . . . , n

where ⟨yᵢ, xⱼ⟩ = δᵢⱼ.

Then to solve our problem we expand its solution in the eigenvectors of A as

x(t) = Σᵢ₌₁ⁿ cᵢ(t) xᵢ

and seek the coefficients cᵢ(t) in this expansion, where cᵢ(t) = ⟨yᵢ, x(t)⟩.

Thus, what we do to solve linear problems requires three steps to be carried out: determine an
eigenvector basis, expand the solution in this basis and determine the coefficients in this expansion.
Its simplicity rests on the idea of biorthogonal sets of vectors.

To establish an equation satisfied by cᵢ(t) we multiply both sides of dx/dt = Ax by ȳᵢᵀG, i.e., we take the inner product with yᵢ, obtaining

⟨yᵢ, dx/dt⟩ = ⟨yᵢ, Ax⟩

This leads to

(d/dt) ⟨yᵢ, x⟩ = ⟨A∗yᵢ, x⟩ = λᵢ ⟨yᵢ, x⟩

whence each coefficient ci (t) satisfies

dcᵢ/dt = λᵢ cᵢ .

As a result we have

cᵢ(t) = cᵢ(t = 0) e^{λᵢt}

and the solution to our problem is

x(t) = Σᵢ₌₁ⁿ ⟨yᵢ, x(t = 0)⟩ e^{λᵢt} xᵢ

Indeed if

dx/dt = Ax + b(t)

we need to add

Σᵢ₌₁ⁿ ∫₀ᵗ e^{λᵢ(t−τ)} ⟨yᵢ, b(τ)⟩ dτ xᵢ

to the foregoing to obtain the solution. And this may be discovered using the same steps that led
to the solution of the problem where b(t) = 0.

The problem as originally written requires that we determine the unknown functions
x1 (t), . . . , xn (t) simultaneously. These functions are the components of x(t) in the natural ba-
sis for the problem and each component ordinarily appears in each equation. So we look for a way

to break this coupling. When A has a complete set of eigenvectors we can do this by expanding
the solution x(t) in the eigenvector basis. Then the determination of the expansion coefficients
c1 (t), ..., cn (t), unlike the determination of the natural components x1 (t), ..., xn (t), is a completely
uncoupled problem.

We work out two examples:

Example (i)

Let

A = ⎡ −1  −1 ⎤
    ⎣  1  −1 ⎦ ;

then trA = −2 and detA = 2. The eigenvalues are λ1 = −1 + i and λ2 = −1 − i and the corresponding eigenvectors are x1 = (1, −i)ᵀ and x2 = (1, i)ᵀ. Because A is real and λ1 and x1 satisfy the eigenvalue problem so also do λ̄1 and x̄1. And although we find, in the plain vanilla inner product, that ⟨x1, x2⟩ = 0, A∗ is not equal to A. What in fact is true is that AAᵀ = AᵀA, i.e., that A is normal. In the plain vanilla inner product the biorthogonal set is

y1 = ½ (1, −i)ᵀ ,   y2 = ½ (1, i)ᵀ .

Hence the solution to dx/dt = Ax is

x(t) = c1 e^{(−1+i)t} (1, −i)ᵀ + c2 e^{(−1−i)t} (1, i)ᵀ

where here c1 denotes c1(t = 0), etc., and

c1 = ⟨ ½ (1, −i)ᵀ , x(t = 0) ⟩

and

c2 = ⟨ ½ (1, i)ᵀ , x(t = 0) ⟩ .

When A is real and x(t = 0) is real then x(t) must be real for all values of t. If λ2 = λ̄1 and we require x2 = x̄1 then y2 = ȳ1 and hence c2 = c̄1, and so the two terms adding to x(t), c1 e^{λ1t} x1 and c2 e^{λ2t} x2, are complex conjugates. As a result x(t) can be written as 2Re{ c1 e^{λ1t} x1 }. In this example this is

2Re{ c1 e^{(−1+i)t} (1, −i)ᵀ }

and, on writing

λ1 = Reλ1 + i Imλ1 = −1 + i

x1 = Rex1 + i Imx1 = (1, 0)ᵀ + i (0, −1)ᵀ

and

c1 = ρ e^{iφ} ,

we have

2ρ e^{Reλ1 t} { cos(Imλ1 t + φ) Rex1 − sin(Imλ1 t + φ) Imx1 }

which corresponds to a converging spiral in the x1, x2 plane.
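A sketch confirming that 2Re{c1 e^{λ1t} x1} solves dx/dt = Ax for this example, checked against the closed form x(t) = e^{−t}(x10 cos t − x20 sin t, x10 sin t + x20 cos t)ᵀ; the initial condition is a hypothetical choice.

```python
# Check the real form of the complex-conjugate eigenpair solution for
# A = [[-1, -1], [1, -1]] against the known closed-form spiral.
import cmath, math

lam1 = complex(-1.0, 1.0)
x1 = [1.0 + 0.0j, -1.0j]          # eigenvector (1, -i)
y1 = [0.5 + 0.0j, -0.5j]          # biorthogonal vector (1/2)(1, -i)

x0 = [2.0, -1.0]                  # hypothetical initial condition
# c1 = <y1, x(0)> in the conjugate (plain vanilla) inner product
c1 = sum(yi.conjugate() * x0i for yi, x0i in zip(y1, x0))

def spectral(t):
    # x(t) = 2 Re{ c1 e^{lam1 t} x1 }
    return [2.0 * (c1 * cmath.exp(lam1 * t) * xi).real for xi in x1]

def closed(t):
    e = math.exp(-t)
    return [e * (x0[0] * math.cos(t) - x0[1] * math.sin(t)),
            e * (x0[0] * math.sin(t) + x0[1] * math.cos(t))]

for t in (0.0, 0.4, 1.3, 2.7):
    assert all(abs(a - b) < 1e-12 for a, b in zip(spectral(t), closed(t)))
print("2 Re{c1 e^{lam1 t} x1} matches the closed-form spiral")
```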

Example (ii)

Let

A = ⎡ 1  −2 ⎤
    ⎣ 3  −4 ⎦ ;

then trA = −3, detA = 2 and

λ1 = −1 ,  x1 = (1, 1)ᵀ ,   λ2 = −2 ,  x2 = (1, 3/2)ᵀ .

The biorthogonal set in the plain vanilla inner product is

y1 = 2 (3/2, −1)ᵀ ,   y2 = 2 (−1, 1)ᵀ .

and the solution to dx/dt = Ax is

x(t) = c1 e^{−t} (1, 1)ᵀ + c2 e^{−2t} (1, 3/2)ᵀ

where again c1 denotes c1(t = 0), etc., and

c1 = ⟨ 2 (3/2, −1)ᵀ , x(t = 0) ⟩

and

c2 = ⟨ 2 (−1, 1)ᵀ , x(t = 0) ⟩ .

We see that as t increases the second term dies out exponentially fast compared to the first and for large enough values of t

x(t) ∼ c1 e^{−t} (1, 1)ᵀ .

This tells us that x(t) approaches 0 from the direction (1, 1)ᵀ. We will use this fact in Lecture 8 to help us turn experimental data into estimates of the elements of A.

There are special solutions to dx/dt = Ax called equilibrium solutions. These satisfy Ax = 0 as then dx/dt = 0. An equilibrium solution is constant in time and can only be obtained by starting there. If detA ≠ 0 then 0 is the only solution to Ax = 0 and hence is the only equilibrium solution. All other solutions are always on the move and according to where they start are given by our formula

x(t) = Σᵢ₌₁ⁿ ⟨yᵢ, x(t = 0)⟩ e^{λᵢt} xᵢ .

Now these may or may not converge to the equilibrium solution as time grows large. If all do, we
call the equilibrium solution asymptotically stable and a necessary and sufficient condition for this
is that Reλi < 0, i = 1, 2, . . . , n, i.e., all eigenvalues lie in the left half of the complex plane. If
this is so x(t) goes to 0 exponentially fast as t grows large at a rate determined by the largest of the
Reλi , i = 1, 2, . . . , n.

6.2 Gerschgorin’s Circle Theorem

If n is 2, the eigenvalues of A are roots of a quadratic equation and it is easy to see that asymptotic
stability obtains iff trA < 0 and detA > 0. But as n increases beyond 2 eigenvalues are increasingly
difficult to determine and what we need is a simple estimate of where the eigenvalues of A lie. The
best of these is Gerschgorin’s circle theorem and it is surprisingly easy to prove; it tells us that
each eigenvalue of A lies on or inside at least one of n circles in the complex plane. In fact, there
are two sets of n circles: they are

|λ − aᵢᵢ| ≤ Σⱼ≠ᵢ |aᵢⱼ| ,   i = 1, . . . , n

and

|λ − aᵢᵢ| ≤ Σⱼ≠ᵢ |aⱼᵢ| ,   i = 1, . . . , n

The first set of circles corresponds to the rows of A. The second set to the rows of AT and hence
to the columns of A. For instance, in the second example, the best estimate via Gerschgorin’s
theorem is that the eigenvalues cannot lie in the region outside the two circles of radius 2 centered
on -4 and 1. And both sets of circles are required to determine this.

If A is diagonal, Gerschgorin’s theorem predicts all its eigenvalues; if A is triangular the theorem predicts the eigenvalues a11 and ann; the more diagonally dominant the matrix, the better the estimates made by the theorem.
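A sketch of both sets of circles for the matrix of Example (ii), whose eigenvalues are −1 and −2; every eigenvalue must lie in the union of the row disks and also in the union of the column disks.

```python
# Gerschgorin disks (centers and radii) for A = [[1, -2], [3, -4]].
def disks(A):
    n = len(A)
    return [(A[i][i], sum(abs(A[i][j]) for j in range(n) if j != i))
            for i in range(n)]

def transpose(A):
    return [list(col) for col in zip(*A)]

A = [[1.0, -2.0], [3.0, -4.0]]
row_disks = disks(A)             # centers 1 and -4, radii 2 and 3
col_disks = disks(transpose(A))  # centers 1 and -4, radii 3 and 2

def in_union(lam, ds):
    return any(abs(lam - c) <= r for c, r in ds)

for lam in (-1.0, -2.0):         # the true eigenvalues of A
    assert in_union(lam, row_disks) and in_union(lam, col_disks)
print("both eigenvalues lie in both unions of disks")
```

Note how −2 escapes the row disk about 1 but is caught by the column disk about −4, which is why both sets of circles are needed to get the best estimate.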

6.3 A Formula for the Solution to x(k + 1) = Ax(k)

As a variation on the foregoing problem, where the evolution of x (t) is continuous in time, we also
look at the problem where x (k) evolves discretely in time. Suppose x (k), k = 1, 2, . . . satisfies

x (k + 1) = Ax (k)

where x(k = 0) is assigned. This is a system of linear constant coefficient difference equations and we can write its solution x(k) = Aᵏ x(0) and use the spectral representation of A to discover the nature of the solution. But we can also expand the solution in the eigenvectors of A and to do this we write

x(k) = Σᵢ₌₁ⁿ cᵢ(k) xᵢ

where cᵢ(k) = ⟨yᵢ, x(k)⟩. To find an equation satisfied by cᵢ(k) we multiply both sides of x(k + 1) = Ax(k) by ȳᵢᵀG to obtain

⟨yᵢ, x(k + 1)⟩ = ⟨yᵢ, Ax(k)⟩ = ⟨A∗yᵢ, x(k)⟩ = λᵢ ⟨yᵢ, x(k)⟩ .

Hence we discover that

cᵢ(k + 1) = λᵢ cᵢ(k)

and so

cᵢ(k) = λᵢᵏ cᵢ(0)

and therefore

x(k) = Σᵢ₌₁ⁿ ⟨yᵢ, x(0)⟩ λᵢᵏ xᵢ

Using this formula we see that the evolution of x (k) as k increases depends on where the eigen-
values of A lie.

The equilibrium solutions of x(k + 1) = Ax(k) satisfy x = Ax or {A − I} x = 0 because then x(k + 1) = x(k) and x(k) remains constant. An equilibrium solution can only be obtained by starting there. If det {A − I} ≠ 0 then 0 is the only equilibrium solution. All other solutions
are always on the move and are given by our formula. They may or may not converge to the
equilibrium solution as k grows large. If all do we call the equilibrium solution asymptotically
stable. A necessary and sufficient condition for this is that |λᵢ| < 1, i = 1, 2, . . . , n, where |λᵢ|² = λᵢ λ̄ᵢ = (Reλᵢ)² + (Imλᵢ)². The stability requirement is that the eigenvalues lie inside the
unit circle in the complex plane.
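As a check on the formula x(k) = Σᵢ ⟨yᵢ, x(0)⟩ λᵢᵏ xᵢ, a short sketch using the eigenvectors and biorthogonal set of Example (ii); the initial vector is a hypothetical choice.

```python
# Compare direct iteration x(k+1) = A x(k) with the eigenvector expansion
# x(k) = sum_i <y_i, x(0)> lam_i^k x_i, for the matrix of Example (ii).
A = [[1.0, -2.0], [3.0, -4.0]]
lam = [-1.0, -2.0]
X = [[1.0, 1.0], [1.0, 1.5]]       # eigenvectors x1, x2 (stored as rows)
Y = [[3.0, -2.0], [-2.0, 2.0]]     # biorthogonal y1 = 2(3/2, -1), y2 = 2(-1, 1)

x0 = [1.0, -2.0]                   # hypothetical starting vector
dot = lambda u, v: sum(a * b for a, b in zip(u, v))

def direct(k):
    x = x0[:]
    for _ in range(k):
        x = [dot(row, x) for row in A]
    return x

def spectral(k):
    return [sum(dot(Y[i], x0) * lam[i] ** k * X[i][j] for i in range(2))
            for j in range(2)]

for k in range(6):
    assert all(abs(a - b) < 1e-9 for a, b in zip(direct(k), spectral(k)))
print("direct iteration matches the eigenvector expansion")
```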

6.4 The Stiffness Problem

What we have found then is this: stability for the problem dx/dt = Ax requires the eigenvalues of
A to lie in the left half of the complex plane; stability for the problem x (k + 1) = Ax (k) requires
the eigenvalues of A to lie inside the unit circle. To see what these conditions have to do with one
another, let x (t) denote the solution to dx/dt = Ax . Then a simple Euler approximation to x (t)
satisfies

x (k + 1) = (I + ∆tA) x (k)

whence x(t) and its approximation are given by

Σᵢ₌₁ⁿ ⟨yᵢ, x(t = 0)⟩ e^{λᵢt} xᵢ

and

Σᵢ₌₁ⁿ ⟨yᵢ, x(k = 0)⟩ (1 + ∆t λᵢ)ᵏ xᵢ .

This is so as xᵢ is an eigenvector of both A and I + ∆tA corresponding to the eigenvalues λᵢ and 1 + ∆tλᵢ.

Now the first thing to observe is that the approximation converges to x (t) as ∆t → 0 where
the limit is taken holding k∆t = t fixed. The second thing to observe is that, even assuming
the stability of the differential equation, the difference equation is not stable for all values of ∆t.
Indeed we see that the difference equation is stable iff, ∀λi , i = 1, 2, . . . , n,

|1 + ∆t λᵢ|² < 1

or

(1 + ∆t Reλᵢ)² + (∆t Imλᵢ)² < 1.

That it is possible to satisfy this by making ∆t sufficiently small, when Reλᵢ < 0, is due to the fact that Imλᵢ is multiplied by (∆t)² whereas Reλᵢ is multiplied by ∆t.

It is ordinarily true that the eigenvalue whose real part is most negative sets a bound on how large ∆t may be. Indeed stability of the difference equation, in the case where the eigenvalues are real and negative, requires that ∀λᵢ

−1 < 1 + ∆t λᵢ < 1

or

∆t < 2 / |λᵢ| .

So, the most negative eigenvalue, the eigenvalue associated with the term in x(t) that dies out most rapidly, controls the size of ∆t in the difference approximation. In numerical work this is referred to as the stiffness problem. Problems where the real parts of the eigenvalues are widely separated, so that insignificant parts of their solutions, at least for t > ε, ε small, control approximations to their solutions, are called stiff problems. In doing a calculation you can never get rid of the most negative eigenvalue due to the fact that numerical errors act like new initial conditions.
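The bound ∆t < 2/|λᵢ| is easy to see in computation. A sketch with a hypothetical stiff diagonal system dx/dt = diag(−1, −100) x: the fast mode e^{−100t} is negligible almost immediately, yet it alone decides whether the Euler iteration blows up.

```python
# Explicit Euler on a stiff diagonal system; stability needs dt < 2/|lam|
# for EVERY eigenvalue, so the most negative one controls the step.
def euler(dt, steps):
    x = [1.0, 1.0]
    for _ in range(steps):
        x = [x[0] + dt * (-1.0) * x[0],      # slow mode, lam = -1
             x[1] + dt * (-100.0) * x[1]]    # fast mode, lam = -100
    return x

stable = euler(0.01, 1000)    # dt < 2/100: both components decay
unstable = euler(0.05, 200)   # 1 + dt*(-100) = -4, |1 + dt*lam| > 1: blow-up

assert abs(stable[0]) < 1.0 and abs(stable[1]) < 1e-6
assert abs(unstable[1]) > 1e6    # fast mode wrecks the step
print("dt is limited by the most negative eigenvalue")
```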

The results obtained in this lecture will be used in subsequent lectures to investigate problems
where we can establish the fact of a complete set of eigenvectors. Before turning to this we take
up a problem where the set of eigenvectors is not complete and where the use of generalized
eigenvectors is required.

6.5 The Use of Generalized Eigenvectors

To show how the solution to such a problem can be found, we set n = 2 and suppose that λ1 is a
double root of ∆ (λ) = 0. Then if dim Ker (A − λ1 I) = 1, we write

Ax1 = λ1 x1

Ax2 = x1 + λ1 x2

and

A∗y1 = λ1 y1 + y2

A∗y2 = λ1 y2
where ⟨yᵢ, xⱼ⟩ = δᵢⱼ. To determine x(t) where dx/dt = Ax we write x(t) = c1(t) x1 + c2(t) x2 where cᵢ(t) = ⟨yᵢ, x(t)⟩ and discover in the usual way that c1(t) and c2(t) satisfy:

⟨y1, dx/dt⟩ = dc1/dt = ⟨y1, Ax⟩ = ⟨λ1 y1 + y2, x⟩ = λ1 c1 + c2

and

⟨y2, dx/dt⟩ = dc2/dt = ⟨y2, Ax⟩ = λ1 ⟨y2, x⟩ = λ1 c2

Solving these we get

c1 = e^{λ1t} c1(t = 0) + ∫₀ᵗ e^{λ1(t−τ)} c2(τ) dτ

and

c2 = e^{λ1t} c2(t = 0)

whence

c1 = e^{λ1t} c1(t = 0) + t e^{λ1t} c2(t = 0)

and as a result we find

x(t) = { ⟨y1, x(t = 0)⟩ e^{λ1t} + ⟨y2, x(t = 0)⟩ t e^{λ1t} } x1 + ⟨y2, x(t = 0)⟩ e^{λ1t} x2

What we see then is this: when a pair of eigenvectors is replaced by an eigenvector and a generalized eigenvector the purely exponential time dependence e^{λ1t} and e^{λ2t} is replaced by e^{λ1t} and t e^{λ1t}. If λ1 were repeated three times, assuming n > 2, the number of possibilities increases. We may have three eigenvectors, two eigenvectors and a generalized eigenvector or an eigenvector and two generalized eigenvectors. The first corresponds to a complete set of eigenvectors, viz., dim Ker(A − λ1I) = 3; the second is like the above, dim Ker(A − λ1I) = 2 and one of the eigenvectors, but not the other, must lie in Im(A − λ1I) in order that it lead to a generalized eigenvector. The third possibility is new, dim Ker(A − λ1I) = 1 and Ker(A − λ1I) lies inside Im(A − λ1I); the readers can satisfy themselves that the time dependence is now given by e^{λ1t}, t e^{λ1t} and ½ t² e^{λ1t}.
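The appearance of the t e^{λ1t} term can be confirmed on the 2 × 2 Jordan block A = [[λ1, 1], [0, λ1]], for which the solution above is equivalent to e^{tA} = e^{λ1t} [[1, t], [0, 1]]. A sketch comparing this closed form with a truncated power series for e^{tA} (λ1 and t are arbitrary choices):

```python
# e^{tA} for a 2x2 Jordan block, checked against the power series
# I + tA + (tA)^2/2! + ... which converges for any matrix.
import math

lam, t = -0.5, 1.3

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[lam, 1.0], [0.0, lam]]
tA = [[t * a for a in row] for row in A]

S = [[1.0, 0.0], [0.0, 1.0]]      # running sum, starts at I
term = [[1.0, 0.0], [0.0, 1.0]]   # running term (tA)^k / k!
for k in range(1, 30):
    term = [[e / k for e in row] for row in matmul(term, tA)]
    S = [[S[i][j] + term[i][j] for j in range(2)] for i in range(2)]

closed = [[math.exp(lam * t), t * math.exp(lam * t)],
          [0.0, math.exp(lam * t)]]
assert all(abs(S[i][j] - closed[i][j]) < 1e-10
           for i in range(2) for j in range(2))
print("t e^{lam t} term confirmed for the Jordan block")
```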

6.6 Improving the Performance of a Linear Stripping Cascade

A problem where generalized eigenvectors are required turns up in the study of a simple stripping
cascade operated as follows: Let M denote the heavy phase holdup in each stage of the cascade
and suppose every T units of time we transfer the heavy phase contents of stage i to stage i − 1,
taking M units of product from stage 1, adding M units of feed to stage n. By doing this we
achieve a heavy phase throughput L = M/T . The light phase is run as before and strips the n
stages of the cascade for a period of time T . Indeed this is the way sugar is stripped out of sugar
beets using water.

If yin = 0 and E = 1 we can determine, using the Kremser equation, that the ordinary performance of such a stripping cascade is predicted by

xout / xin = 1 / (1 + S + S² + · · · + Sⁿ)

where S = mV /L is the stripping factor. We propose to show that the ordinary operation can be
greatly improved upon.

In the newly proposed method of running the separation cascade, our equations are

M dxᵢ/dt = V yᵢ₋₁ − V yᵢ ,   i = 1, . . . , n

which can be written

dxᵢ/dt = (S/T)(xᵢ₋₁ − xᵢ) ,   i = 1, . . . , n
where xn(t = 0) = xin and x1(t = T) = xout. Setting x = (x1, x2, . . . , xn)ᵀ we have

dx/dt = (S/T) Ax

where

A = ⎡ −1   0   0  · · · ⎤
    ⎢  1  −1   0  · · · ⎥
    ⎢  0   1  −1  · · · ⎥
    ⎣               ⋱   ⎦

This matrix has the eigenvalue λ1 = −1 repeated n times and, as dim Ker(A − λ1I) = 1, to it there corresponds only one independent eigenvector which we can take to be x1 = (0, 0, . . . , 1)ᵀ. This
eigenvector initiates a chain of generalized eigenvectors x2 , . . . , xn via

Ax2 = x1 + λ1 x2

Ax3 = x2 + λ1 x3

etc.

or (A − λ1I)ⁿ xn = 0 but (A − λ1I)ⁿ⁻¹ xn ≠ 0, (A − λ1I)ⁿ⁻¹ xn−1 = 0 but (A − λ1I)ⁿ⁻² xn−1 ≠ 0, etc. Indeed in this simple example we find

x2 = (0, 0, . . . , 0, 1, 0)ᵀ ,   x3 = (0, 0, . . . , 1, 0, 0)ᵀ , · · ·

   
For n = 2 we have x1 = (0, 1)ᵀ, x2 = (1, 0)ᵀ and, in the plain vanilla inner product, y1 = (0, 1)ᵀ, y2 = (1, 0)ᵀ. Putting this in our earlier formula, we get

x1(t) = x1(t = 0) e^{−St/T}

and

x2(t) = x2(t = 0) e^{−St/T} + (S/T) x1(t = 0) t e^{−St/T}

These formulas will take us through the startup period where for the first cycle we have
x1 (t = 0) = xin . Thereafter x1 (t = 0) will be x2 (t = T ) for the preceding cycle. After some
number of cycles we assume, and the reader can demonstrate, that the stripping cascade achieves a
repetitive operation wherein x2 (t = T ), and indeed x1 (t) and x2 (t), 0 < t < T , is repeated cycle
after cycle. Then using x1 (t = 0) = x2 (t = T ) in the foregoing we get

xout / xin = 1 / ( e^{2S} − S e^{S} )

and this is better than the Kremser equation prediction

1 / (1 + S + S²) .

The two results for n = 1 are

1 / e^{S}

and

1 / (1 + S) .

The readers can determine the general result.
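The repetitive state can be reached by simply iterating the startup cycle; a sketch for n = 2 at a hypothetical stripping factor S = 1, using the closed forms above at t = T.

```python
# Iterate the two-stage cyclic operation to its repetitive state and confirm
# x_out/x_in = 1/(e^{2S} - S e^S), which beats the Kremser value 1/(1+S+S^2).
import math

S, xin = 1.0, 1.0
x1, x2 = 0.0, 0.0            # start the cascade solute-free
e = math.exp(-S)
for _ in range(200):         # each pass: transfer heavy phase, then strip for T
    x1, x2 = x2, xin         # stage 2 -> stage 1; fresh feed into stage 2
    x1, x2 = x1 * e, x2 * e + S * x1 * e   # the text's formulas at t = T
xout = x1                    # product drawn from stage 1 at the end of a cycle

assert abs(xout / xin - 1.0 / (math.exp(2 * S) - S * math.exp(S))) < 1e-12
assert xout / xin < 1.0 / (1.0 + S + S ** 2)   # better than the Kremser operation
print("repetitive-state recovery:", xout / xin)
```

The cycle map contracts by the factor S e^{−S} < 1, so convergence to the repetitive state is fast.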

6.7 Another Way to Solve dx/dt = Ax

We present an alternative, solving dx/dt = Ax by using the Laplace transformation, assuming the
eigenvalues of A to be distinct. To do this we write

(λI − A) adj (λI − A) = det (λI − A) I

and

(λ̄I − A∗) adj(λ̄I − A∗) = det(λ̄I − A∗) I

and observe that if λᵢ is a simple root of det(λI − A) then the columns of adj(λᵢI − A) are all proportional to xᵢ while the columns of adj(λ̄ᵢI − A∗) are all proportional to yᵢ. Because adj(λ̄ᵢI − A∗) = { adj(λᵢI − A) }∗ we can write

adj(λᵢI − A) = cᵢ xᵢ ȳᵢᵀ

for some non-zero constant cᵢ.


We also need to evaluate (d/dλ) det(λI − A) |_{λ=λᵢ}. We can do this by using our formula for differentiating a determinant, viz.,

(d/dλ) det(λI − A) = tr { adj(λI − A) (d/dλ)(λI − A) } = tr { adj(λI − A) }

whence

(d/dλ) det(λI − A) |_{λ=λᵢ} = tr ( cᵢ xᵢ ȳᵢᵀ ) = cᵢ ȳᵢᵀ xᵢ .

To determine the solution to dx/dt = Ax we take the Laplace transformation of both sides obtaining

sL(x) − x(t = 0) = AL(x)

and hence

(sI − A) L(x) = x(t = 0)

whence

L(x) = [ adj(sI − A) / det(sI − A) ] x(t = 0) .

The right hand side has simple poles at s = λ1, λ2, . . . , λn. Thus L(x) can be written

L(x) = Σᵢ₌₁ⁿ [ adj(sI − A) / ( (d/ds) det(sI − A) ) ]|_{s=λᵢ} x(t = 0) · 1/(s − λᵢ)

Then, using our formulas for the adjugate and for the derivative of the determinant, we have

L(x) = Σᵢ₌₁ⁿ [ xᵢ ȳᵢᵀ x(t = 0) / ( ȳᵢᵀ xᵢ ) ] · 1/(s − λᵢ) = Σᵢ₌₁ⁿ xᵢ ⟨yᵢ, x(t = 0)⟩ · 1/(s − λᵢ)

on requiring ȳᵢᵀ xᵢ = ⟨yᵢ, xᵢ⟩ = 1 and hence

x(t) = Σᵢ₌₁ⁿ xᵢ ⟨yᵢ, x(t = 0)⟩ e^{λᵢt}

6.8 The Solution to Higher Order Equations

Frazer, Duncan and Collar in their book “Elementary Matrices” propose a very nice way to construct solutions to quite general systems of linear differential equations. We outline the essential idea here in the hope that this provokes some readers to go and look at this great old book. Let fᵢⱼ(λ), i, j = 1, . . . , n, be n² polynomials in λ. Then fᵢⱼ(d/dt) is a polynomial differential operator and our problem is to determine solutions to the system of differential equations

f(d/dt) x(t) = 0

where f(λ) = λⁿ A0 + λⁿ⁻¹ A1 + · · · is called a lambda matrix. We let F(λ) denote the adjugate of f(λ), i.e., F(λ) = adj f(λ), and write

f(λ) F(λ) = ∆(λ) I

where ∆(λ) = det f(λ). Then if λ1 is a root of ∆(λ) = 0 of algebraic multiplicity m1 we have

∆(λ1) = 0
∆^(1)(λ1) = 0
· · ·
∆^(m1−1)(λ1) = 0
∆^(m1)(λ1) ≠ 0

where ∆^(1)(λ) = d∆(λ)/dλ, etc. We can then find a set of solutions corresponding to λ1 by observing first that

f(d/dt) { e^{λt} F(λ) } = e^{λt} f(λ) F(λ) = e^{λt} ∆(λ) I

and then that

f(d/dt) { (d/dλ)( e^{λt} F(λ) ) } = (d/dλ) { f(d/dt)( e^{λt} F(λ) ) } = { t e^{λt} ∆(λ) + e^{λt} ∆^(1)(λ) } I

etc.

Whence all of the columns of

e^{λ1t} F(λ1) ,
(d/dλ1) { e^{λ1t} F(λ1) } ,
· · ·
(d^{m1−1}/dλ1^{m1−1}) { e^{λ1t} F(λ1) }

satisfy

f(d/dt) x(t) = 0

and so also for the remaining roots of ∆(λ).

Frazer, Duncan and Collar present the properties of the lambda matrix f(λ) and its adjugate F(λ) that are required to make this a workable method for writing the general solution to f(d/dt) x(t) = 0.

6.9 Roots of Polynomials

The stability of an equilibrium point to a small displacement requires the real parts of the eigenval-
ues of some matrix to be negative. To decide stability then, we must determine the eigenvalues of
this matrix and to do this we must first determine its characteristic polynomial and then the roots
of this polynomial. If the problem depends on parameters and the dimension is large this can be a
difficult calculation.

We would like to be able to determine the signs of the real parts of the eigenvalues, short of
determining the eigenvalues themselves, either by looking at the elements of the matrix or, if that
fails, by looking at the coefficients of its characteristic polynomial. Gerschgorin’s circle theorem
is a step in this direction but often it does not resolve the question; yet it always provides estimates
of the eigenvalues that can be refined and hence it is always helpful. In what follows we let n = 2
and 3 and state necessary and sufficient conditions in terms of the coefficients of the characteristic
polynomial that its roots, the eigenvalues, have negative real parts. The matrix is assumed to be
real.

A useful reference is Porter’s book: Stability Criteria for Linear Dynamical Systems. Besides
providing a nice way of looking at this problem, this beautiful little book has a simple derivation
of the Routh criteria.

In the case n = 2 the characteristic equation is

λ² − ∆1 λ + ∆2 = 0

where ∆1 = T and ∆2 = D, T and D denoting trace and determinant. The necessary and sufficient
condition that Reλ1 and Reλ2 be negative is: T < 0, D > 0.

For n = 3 the characteristic equation is

λ³ − ∆1 λ² + ∆2 λ − ∆3 = 0

or

λ³ − T λ² + S λ − D = 0

where we assume T, S and D to be real. Then λ1, λ2 and λ3 satisfy

λ1 + λ2 + λ3 = T = λ1 + 2x

λ1λ2 + λ2λ3 + λ3λ1 = S = λ1(2x) + x² + y²

λ1λ2λ3 = D = λ1(x² + y²)

where on the right hand side we assume λ2 = λ̄3 = x + iy. It is easy to see that if λ1, λ2 and λ3 are negative or have negative real parts then T < 0, S > 0 and D < 0.

So if T is not negative, or S is not positive or D is not negative we can conclude that not all of
Reλ1 , Reλ2 and Reλ3 can be negative.

If T < 0, S > 0 and D < 0 then

λ³ − T λ² + S λ − D = 0

cannot have real positive roots. On substituting λ = x + iy, y ≠ 0, we get

x³ − 3xy² − T(x² − y²) + Sx − D = 0

and

3x²y − y³ − T(2xy) + Sy = 0

and so, dividing the second by y and using the result to eliminate y² in the first, we get

−8x³ + 8Tx² − 2(S + T²)x + TS − D = 0

and this tells us: T < 0, S > 0, D < 0 and TS − D < 0 is sufficient that x not be positive. But

TS − D = (λ1 + λ2 + λ3)(λ1λ2 + λ2λ3 + λ3λ1) − λ1λ2λ3

       = (λ1 + 2x)( λ1(2x) + x² + y² ) − λ1(x² + y²)

       = (λ1² + x² + y²)(2x) + (2x)² λ1

and so λ1 < 0 and x < 0 is sufficient that TS − D < 0. As TS − D < 0 if λ1 < 0, λ2 < 0 and λ3 < 0 we discover: the necessary and sufficient condition that Reλ1, Reλ2 and Reλ3 all be negative is T < 0, S > 0, D < 0 and TS − D < 0.

Three real roots correspond to

4S³ − S²T² + 27D² + 4DT³ − 18TSD < 0 ;

otherwise, i.e., 4S³ − S²T² + 27D² + 4DT³ − 18TSD > 0, there is one real root and a complex conjugate pair.

If our cubic equation depends on a parameter and the parameter changes, then: if it has three real roots, one changes sign on crossing the plane D = 0, T < 0, S > 0; if it has one real root and a complex conjugate pair, the real root changes sign on crossing the plane D = 0, T < 0, S > 0 but now 4S³ − S²T² + 27D² + 4DT³ − 18TSD > 0; if it has one real root and a complex conjugate pair, the real part of the complex conjugate pair changes sign on crossing the surface TS − D = 0, T < 0, S > 0, D < 0.

6.10 The Matrix e^{A}

The solution to

dx/dt = Ax

can be written as

x(t) = e^{tA} x(t = 0)

where

e^{tA} = I + tA + (1/2) t²A² + (1/6) t³A³ + · · ·

and where the infinite sum converges to e^{tA} regardless of any complications pertaining to the eigenvectors of A. However, there are questions regarding the rate of convergence of the series, yet if t is small we might get a fair estimate of x(t) by assuming e^{tA} ≈ I + tA.

Now we may need to solve

dx/dt = (A + B) x

where A and B are large matrices and AB ≠ BA.

The solution is

x(t) = e^{t(A+B)} x(t = 0)

but e^{t(A+B)} is not e^{tA} e^{tB}. In obtaining an estimate of e^{t(A+B)} it is sometimes useful to observe that e^{t(A+B)} and e^{tA/2} e^{tB} e^{tA/2} agree through their first three terms in powers of t.
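This symmetric splitting is easy to see numerically. A sketch with a hypothetical non-commuting pair A, B, comparing e^{t(A+B)} with e^{tA/2} e^{tB} e^{tA/2} (error O(t³)) and with the one-sided product e^{tA} e^{tB} (error O(t²)):

```python
# Compare the exact 2x2 matrix exponential with the symmetric and
# one-sided splittings, using a truncated power series for each exponential.
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def scale(M, s):
    return [[s * e for e in row] for row in M]

def expm(M, terms=40):
    S = [[1.0, 0.0], [0.0, 1.0]]
    P = [[1.0, 0.0], [0.0, 1.0]]
    for k in range(1, terms):
        P = [[e / k for e in row] for row in matmul(P, M)]
        S = [[S[i][j] + P[i][j] for j in range(2)] for i in range(2)]
    return S

def err(X, Y):
    return max(abs(X[i][j] - Y[i][j]) for i in range(2) for j in range(2))

A = [[0.0, 1.0], [0.0, 0.0]]      # hypothetical non-commuting pair
B = [[0.0, 0.0], [1.0, 0.0]]
t = 0.1
AB = [[A[i][j] + B[i][j] for j in range(2)] for i in range(2)]
exact = expm(scale(AB, t))
sym = matmul(matmul(expm(scale(A, t / 2)), expm(scale(B, t))), expm(scale(A, t / 2)))
one_sided = matmul(expm(scale(A, t)), expm(scale(B, t)))

assert err(exact, sym) < err(exact, one_sided)   # symmetric product is closer
assert err(exact, sym) < 2e-4                    # O(t^3) at t = 0.1
print("splitting errors:", err(exact, sym), err(exact, one_sided))
```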

6.11 A = A (t)

Suppose our problem is to solve

dx/dt = A(t) x

where, as indicated, A depends on t.

To learn what is known about this problem, denote by

{x1 (t) , x2 (t) , · · · , xn (t)}

a set of independent solutions and introduce M (t), a fundamental solution matrix, via

M (t) = (x1 (t) x2 (t) · · · xn (t))

Then every solution can be written

c1 x1 (t) + c2 x2 (t) + · · · + cn xn (t)

and hence every fundamental solution matrix can be written M(t) C where det C ≠ 0.

A case of interest is that in which A is periodic, viz.,

A (t + T ) = A (t)

whereupon M (t + T ) is a fundamental solution matrix and hence we have

M (t + T ) = M (t) C.

If we define a matrix R such that

C = e^{TR}

and define P(t) via

P(t) = M(t) e^{−tR}

whereupon

M(t) = P(t) e^{tR}

then, we find that P(t) is periodic, viz.,

P(t + T) = M(t + T) e^{−(t+T)R}

         = M(t) C e^{−(t+T)R}

         = M(t) e^{−tR}

         = P(t) .

Hence if we determine M(t) for 0 ≤ t ≤ T, then C = M⁻¹(0) M(T) = e^{TR} implies R and P(t) = M(t) e^{−tR} implies P(t), 0 ≤ t ≤ T. Thus we have P(t) for all t and therefore also M(t) = P(t) e^{tR} for all t, i.e., M(t), 0 ≤ t ≤ T, implies M(t) for all t.

Now M (t) is stable if the eigenvalues of R have negative real parts. The simplest case is that
in which the eigenvalues of C are distinct, for then the eigenvectors of C and R coincide and the
eigenvalues of R, denoted µ, and C, denoted λ, satisfy

λ = e^{Tµ} .

Thus we have

Reµ < 0 iff |λ| < 1

and we can decide the stability of M by looking at the eigenvalues of C = M(0)⁻¹ M(T).



As an example, the Mathieu equation, viz.,

d²ψ/dt² + cos t ψ = 0

can be written

d/dt ⎡   ψ   ⎤   ⎡    0     1 ⎤ ⎡   ψ   ⎤
     ⎣ dψ/dt ⎦ = ⎣ −cos t   0 ⎦ ⎣ dψ/dt ⎦

whereupon our matrix A(t) is

⎡    0     1 ⎤
⎣ −cos t   0 ⎦
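The recipe above — determine M(t) on 0 ≤ t ≤ T, form C = M⁻¹(0) M(T), and look at its eigenvalues — can be sketched for the Mathieu system. Taking M(0) = I gives C = M(T) directly, and since tr A(t) = 0 Liouville's formula forces det C = 1; the RK4 integrator and step count below are implementation choices.

```python
# Floquet analysis of x' = A(t) x with A(t) = [[0, 1], [-cos t, 0]], T = 2*pi.
# Integrate the two columns of M(t) from M(0) = I and inspect C = M(T).
import math

def rhs(t, x):
    return [x[1], -math.cos(t) * x[0]]

def rk4(x, t0, t1, steps=4000):
    h = (t1 - t0) / steps
    t = t0
    for _ in range(steps):
        k1 = rhs(t, x)
        k2 = rhs(t + h / 2, [x[i] + h / 2 * k1[i] for i in range(2)])
        k3 = rhs(t + h / 2, [x[i] + h / 2 * k2[i] for i in range(2)])
        k4 = rhs(t + h, [x[i] + h * k3[i] for i in range(2)])
        x = [x[i] + h / 6 * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i])
             for i in range(2)]
        t += h
    return x

T = 2.0 * math.pi
col1 = rk4([1.0, 0.0], 0.0, T)
col2 = rk4([0.0, 1.0], 0.0, T)
C = [[col1[0], col2[0]], [col1[1], col2[1]]]
detC = C[0][0] * C[1][1] - C[0][1] * C[1][0]

assert abs(detC - 1.0) < 1e-8    # Liouville: det C = exp(integral of tr A) = 1
# the eigenvalues of C satisfy lam^2 - (tr C) lam + 1 = 0, so
# |tr C| <= 2 puts them on the unit circle (bounded solutions)
print("tr C =", C[0][0] + C[1][1])
```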

6.12 Home Problems

 
1. Let

A = ⎡ −1   1 ⎤
    ⎣ −1  −2 ⎦

and show that its eigenvalues lie inside its Gerschgorin circles. Then let x(t = 0) = (1, 1)ᵀ and sketch the solution to

dx/dt = Ax

in the x1, x2 plane.

Let

A = ⎡ −3/2   1/4 ⎤      and      A = ⎡ −3  4 ⎤
    ⎣   1   −3/2 ⎦                   ⎣ −1  1 ⎦

and repeat the above calculations. In the last problem λ1 = −1 is a double root to which corresponds only one independent eigenvector x1 = (−2, −1)ᵀ. A generalized eigenvector x2 = (1, 0)ᵀ can be found so that x1, x2 is a basis.

2. Determine xout/xin once a dynamic stripping cascade reaches a repetitive state if 1/2 of the heavy phase on each stage is transferred to the stage below every T/2 seconds. Is this better than transferring all of the heavy phase every T seconds? What is the result in the limit of transferring 1/n th of the heavy phase every T/n seconds as n → ∞? Do this assuming a two stage cascade.

3. Determine xout/xin for a three stage dynamic stripping cascade once a repetitive operation is established. The model is then

dx/dt = (S/T) Ax ,   0 ≤ t ≤ T

where S = mV/L, T = M/L,

A = ⎡ −1   0   0 ⎤
    ⎢  1  −1   0 ⎥
    ⎣  0   1  −1 ⎦

and

x1(t = 0) = x2(t = T)

x2(t = 0) = x3(t = T)

x3(t = 0) = xin

4. The difference equation

xᵢ₊₁ = c xᵢ (1 − xᵢ) ,   c > 0

is well known in the theory of deterministic chaos. Show that its constant solutions are xᵢ = 0 and xᵢ = 1 − 1/c.

This is a simple model of population variation, xᵢ being the population of a species in year i scaled so that 0 < xᵢ < 1. The interesting range of c is then 0 < c < 4 as 0 < x(1 − x) ≤ 1/4 when 0 < x < 1.

Show that the solution xᵢ = 0 is stable to small upsets if and only if 0 < c < 1. Show that the solution xᵢ = 1 − 1/c is stable to small upsets if and only if 1 < c < 3.

When c = 3 show that the eigenvalue of the linear approximation is −1. Because (−1)(−1) = 1 this leads to a stable period 2 solution, xᵢ₊₂ = xᵢ, which takes the place of the unstable constant or period 1 solution, xᵢ₊₁ = xᵢ. Determine the period 2 solution.

The range 3 < c < 4 is interesting. As c increases beyond 3 there is a region of period 2 solutions, then a region of period 4 solutions, then a region of period 8 solutions, etc. The width of successive regions decreases geometrically until what is called deterministic chaos sets in. This is the period doubling route to chaos and it can be observed on a hand calculator.
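The statements in this problem can be explored numerically; a sketch at the hypothetical value c = 3.2, where iterates leave the unstable fixed point 1 − 1/c and settle onto the period 2 solution:

```python
# Iterate the logistic map at c = 3.2 and confirm a period-2 attractor
# distinct from the (now unstable) fixed point 1 - 1/c.
c = 3.2
x = 0.4
for _ in range(2000):            # discard the transient
    x = c * x * (1.0 - x)
a = x
b = c * a * (1.0 - a)

assert abs(c * b * (1.0 - b) - a) < 1e-9    # period 2: f(f(a)) = a
assert abs(a - b) > 0.1                     # genuinely period 2, not period 1
assert abs(a - (1.0 - 1.0 / c)) > 1e-3      # not the fixed point
print("period-2 cycle:", a, b)
```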

5. The simple equilibrium stage model sketched below illustrates how a separation by chromatography works:

[Figure: a cascade of stages i = 0, 1, . . . , n, each holding a carrier volume Vc and an adsorbent volume Va, swept by carrier at volumetric flow rate Q.]

Denote by c and a the compositions of a dilute solute in the carrier and in the adsorbent phases, where Vc and Va denote the volumes of the phases. Assume phase equilibrium holds in each stage, viz., c = Ka, where strong binding corresponds to small values of K. The subscript i denotes the stage.

Then, scaling time by (Vc + Va/K)/Q, i.e.,

t = [ (Vc + Va/K) / Q ] θ

and denoting by c the vector

c = ( c0, c1, c2, . . . )ᵀ

you have

dc/dθ = Ac

where

A = ⎡ −1   0   0  · · · ⎤
    ⎢  1  −1   0  · · · ⎥
    ⎢  0   1  −1  · · · ⎥
    ⎣  ⋮    ⋮    ⋮   ⋱  ⎦

and where

c(θ = 0) = ( c0(θ = 0), 0, 0, . . . )ᵀ

i.e., initially N moles of solute equilibrate in stage zero, the other stages and the inlet carrier being solute free, then the carrier is turned on at the volumetric flow rate Q.

The solute is swept through the cascade of stages and your job is to show that

ci(θ) / c0(θ = 0) = θⁱ e^{−θ} / i! ,   i = 0, 1, 2, . . .

where

c0(θ = 0) = N / ( Vc + Va/K )

The distribution of solute over the stages at a given time θ is of most interest, viz., ci(θ) vs i. Now ci(θ)/c0(θ = 0) is the probability of finding a solute molecule in stage i at time θ, given it was in stage zero at time zero. Show that the average and variance of i, denoted ī and σ², are given by

ī = θ = σ² .

Then show that the speed of solute through the cascade, in stages per time, is

dī/dt = Q / ( Vc + Va/K )

where Q/Vc is the carrier speed.
Thus the stronger the binding the slower the speed. Hence different solutes having
different K’s move at different speeds.
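A sketch that checks the claimed profile by direct integration of dc/dθ = Ac; the stage count and step size are arbitrary choices, and truncating the cascade at n stages does not affect stages 0, . . . , n − 1 because A is lower triangular.

```python
# Integrate dc/dtheta = Ac for the chromatography cascade with RK4 and
# compare with the Poisson profile c_i(theta) = theta^i e^{-theta} / i!.
import math

n = 8
def rhs(c):
    # dc_i/dtheta = c_{i-1} - c_i, with no solute inflow to stage 0
    return [(c[i - 1] if i > 0 else 0.0) - c[i] for i in range(n)]

c = [1.0] + [0.0] * (n - 1)      # all solute equilibrated in stage 0 at theta = 0
h, steps = 1e-3, 2000            # integrate to theta = 2
for _ in range(steps):
    k1 = rhs(c)
    k2 = rhs([c[i] + h / 2 * k1[i] for i in range(n)])
    k3 = rhs([c[i] + h / 2 * k2[i] for i in range(n)])
    k4 = rhs([c[i] + h * k3[i] for i in range(n)])
    c = [c[i] + h / 6 * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i]) for i in range(n)]

theta = h * steps
for i in range(n):
    assert abs(c[i] - theta ** i * math.exp(-theta) / math.factorial(i)) < 1e-7
print("stage profile matches the Poisson distribution at theta =", theta)
```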

6. The Stefan-Maxwell equations, viz.,

∇yᵢ = Σⱼ₌₁, ⱼ≠ᵢⁿ ( yᵢ N⃗ⱼ − yⱼ N⃗ᵢ ) / ( c Dᵢⱼ )

present us with a model for diffusion in an ideal gas at constant temperature and pressure, where c is the mole density of the gas, yᵢ and N⃗ᵢ are the mole fraction and mole flow rate per unit area of species i and Dᵢⱼ = Dⱼᵢ > 0 is the diffusion coefficient in an ideal gas made up of species i and j.

If diffusion takes place steadily in one spatial direction the vector notation is not necessary

and we can write

dyᵢ/dz = Σⱼ₌₁, ⱼ≠ᵢⁿ ( yᵢ Nⱼ − yⱼ Nᵢ ) / ( c Dᵢⱼ )

where the Ni are constants independent of z. This equation can be used to find the Ni from
measurements of the yi at the two ends of a diffusion path in a two bulb diffusion experiment
where a long, small diameter tube provides a diffusion path between two large, well mixed
bulbs of gas at different compositions.

We can study this experiment under the assumption that the compositions in the bulbs remain constant in time if the bulb volumes are large and the tube cross sectional area is small. Then for a ternary ideal gas, denoting (y1, y2, y3)ᵀ by y, we can write the Stefan-Maxwell equations as

dy/dz = (1/c) B y
where B, viz.,
 
$$B = \begin{pmatrix} \dfrac{N_2}{D_{12}} + \dfrac{N_3}{D_{31}} & -\dfrac{N_1}{D_{12}} & -\dfrac{N_1}{D_{31}} \\[3mm] -\dfrac{N_2}{D_{12}} & \dfrac{N_1}{D_{12}} + \dfrac{N_3}{D_{23}} & -\dfrac{N_2}{D_{23}} \\[3mm] -\dfrac{N_3}{D_{31}} & -\dfrac{N_3}{D_{23}} & \dfrac{N_1}{D_{31}} + \dfrac{N_2}{D_{23}} \end{pmatrix}$$

is constant on a diffusion path.

Let

$$\sigma_1 = \frac{1}{D_{12}} + \frac{1}{D_{31}}, \qquad \delta_1 = \frac{1}{D_{12}} - \frac{1}{D_{31}}$$

$$\sigma_2 = \frac{1}{D_{23}} + \frac{1}{D_{12}}, \qquad \delta_2 = \frac{1}{D_{23}} - \frac{1}{D_{12}}$$

$$\sigma_3 = \frac{1}{D_{31}} + \frac{1}{D_{23}}, \qquad \delta_3 = \frac{1}{D_{31}} - \frac{1}{D_{23}}$$

and order the species so that $D_{31} > D_{12} > D_{23}$; then $\sigma_2 > \sigma_3 > \sigma_1 > 0$ and $\delta_2 > \delta_1 > 0 > \delta_3$.

Write the characteristic polynomial of B as

$$\lambda^3 - I\lambda^2 + II\,\lambda - III$$

and show that

$$I = \operatorname{tr} B = \sigma^T N,$$

$$I^2 - 4\,II = N^T \Delta N$$

and

$$III = \det B = 0$$

where
 
$$\sigma = \begin{pmatrix} \sigma_1 \\ \sigma_2 \\ \sigma_3 \end{pmatrix}, \qquad N = \begin{pmatrix} N_1 \\ N_2 \\ N_3 \end{pmatrix}$$

and

$$\Delta = \begin{pmatrix} \delta_1^2 & -\delta_1\delta_2 & -\delta_3\delta_1 \\ -\delta_1\delta_2 & \delta_2^2 & -\delta_2\delta_3 \\ -\delta_3\delta_1 & -\delta_2\delta_3 & \delta_3^2 \end{pmatrix}$$

Then the eigenvalues of $B$ are $0$ and $\dfrac{I \pm \sqrt{I^2 - 4\,II}}{2}$, and the eigenvectors of $B$ and $B^T$ corresponding to the eigenvalue zero are $(N_1, N_2, N_3)^T$ and $(1, 1, 1)^T$. Indeed it is worth observing that $II$ is proportional to $N_1 + N_2 + N_3$.

In the two bulb diffusion experiment the condition of constant pressure requires that
N1 + N2 + N3 = 0. Under this condition show that

$$I = \delta_1 N_2 - \delta_2 N_1$$

and

$$I^2 - 4\,II = I^2$$

Then the eigenvalues of $B$ are $0$, $0$, $I$. Show that the $2 \times 2$ minors of $B$ are products of $N_1$ or $N_2$ or $N_3$ and $\dfrac{\delta_1 N_2}{D_{23}} - \dfrac{\delta_2 N_1}{D_{31}}$ and, assuming that this is not zero, show that the rank of $B$ is two. Then the eigenvalue zero is repeated but to it there corresponds but one eigenvector.

Show that to the eigenvalue zero there corresponds the eigenvector $(N_1, N_2, N_3)^T$ and the generalized eigenvector

$$\frac{1}{\dfrac{\delta_1 N_2}{D_{23}} - \dfrac{\delta_2 N_1}{D_{31}}} \begin{pmatrix} \dfrac{N_1}{D_{31}} \\[3mm] \dfrac{N_2}{D_{23}} \\[3mm] \delta_1 N_2 - \delta_2 N_1 - \dfrac{N_1}{D_{31}} - \dfrac{N_2}{D_{23}} \end{pmatrix}$$

and that to the eigenvalue $I$ there corresponds the eigenvector $(\delta_1, \delta_2, \delta_3)^T$.

Using this information show how to predict the composition in one bulb of a two bulb
diffusion experiment in terms of the composition in the other bulb and the values of N1 , N2 ,
N3 , D12 , D23 and D31 .

A full account of equimole counter diffusion in an ideal gas can be found in H.L. Toor's 1957 paper "Diffusion in Three Component Gas Mixtures," A.I.Ch.E. J. 3, 198. He finds conditions where $N_1 = 0$ but $\dfrac{dy_1}{dz}$ is not zero, where $\dfrac{dy_1}{dz} = 0$ but $N_1$ is not zero, and where $\operatorname{sgn} N_1 = \operatorname{sgn} \dfrac{dy_1}{dz}$.

7. If in Problem 6 reactions that conserve total moles take place at one end of the diffusion path then stoichiometry requires $N_1 + N_2 + N_3 = 0$ and the diffusion process is again represented by a
matrix that does not have a complete set of eigenvectors. If, instead of this, a single reaction

ν1 A1 + ν2 A2 + ν3 A3 = 0

takes place where $\nu_1 + \nu_2 + \nu_3 \ne 0$ then $N_1 + N_2 + N_3 \ne 0$, the eigenvalue zero is simple and the matrix $B$ ordinarily has a complete set of eigenvectors.

Suppose that the stoichiometric coefficients are such that the eigenvalues of B are 0 and
a complex conjugate pair. Then y vs. z will be a spiral in composition space.

Show that if a reservoir where the reaction $\nu_1 A_1 + \nu_2 A_2 + \nu_3 A_3 = 0$ takes place is fed by diffusion from a reservoir where $y$ is fixed, the center of the spiral cannot lie in the physical part of composition space where $y_1 \ge 0$, $y_2 \ge 0$, $y_3 \ge 0$, $y_1 + y_2 + y_3 = 1$.

8. The boiling curve for an ideal solution is obtained by solving

$$\frac{dx_i}{ds} = -y_i + x_i, \qquad i = 1, 2, \ldots, n$$

where

$$y_i = \frac{P_i(T)}{P} x_i, \qquad i = 1, 2, \ldots, n$$

and

$$\sum_{i=1}^{n} \frac{P_i(T)}{P} x_i = 1$$

where

P1 (T ) > P2 (T ) > · · · > 0

and

$$\frac{dP_1}{dT} = P_1' > 0, \qquad \frac{dP_2}{dT} = P_2' > 0, \quad \text{etc.}$$
dT dT

The only rest states are the n pure component states. Determine the stability of each such
state to a small perturbation and show that only the state xn = 1 = yn is stable.

For example, one rest state is $x_1 = 1 = y_1$, $P_1(T) = P$, $x_i = 0 = y_i$, $P_i(T) < P$, $i = 2, \ldots, n$. The perturbation of $x_1$ must be negative, the perturbations of $x_i$, $i = 2, \ldots, n$ must be positive, and the perturbation of $T$ must be positive.

9. The equations for the small transverse motions of a set of n particles, each of mass m, equally
spaced on a string of fixed tension, viz.,

(Figure: the string with the transverse particle displacements $y_1, y_2, \ldots, y_{n-1}, y_n$.)

can be obtained from

$$L = \frac{1}{2} m \left( \dot y_1^2 + \dot y_2^2 + \cdots + \dot y_n^2 \right) - \frac{1}{2} m \omega_0^2 \left( y_1^2 + (y_2 - y_1)^2 + \cdots + (y_n - y_{n-1})^2 + y_n^2 \right)$$

where L is the Lagrangian.

Show that L can be written

$$L = \frac{1}{2} m\, \dot y^T I \dot y - \frac{1}{2} m \omega_0^2\, y^T A y$$

where
 
$$y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix}$$

and

$$A = \begin{pmatrix} 2 & -1 & 0 & 0 & 0 & \cdots & 0 \\ -1 & 2 & -1 & 0 & 0 & \cdots & 0 \\ 0 & -1 & 2 & -1 & 0 & \cdots & 0 \\ & & & \text{etc.} & & & \end{pmatrix}$$

Show that Lagrange’s equations of motion, i.e.,


 
$$\frac{d}{dt} \left( \frac{\partial L}{\partial \dot y_i} \right) - \frac{\partial L}{\partial y_i} = 0, \qquad i = 1, 2, \ldots, n$$

lead to

$$m \frac{d^2 y}{dt^2} + m \omega_0^2 A y = 0$$

The matrix $A$ is self adjoint in the plain vanilla inner product and so has $n$ orthogonal eigenvectors, denoted $x_1, x_2, \ldots, x_n$, and real eigenvalues denoted $\lambda_1, \lambda_2, \ldots, \lambda_n$. The circle theorem tells us the eigenvalues cannot be negative.

Requiring $\langle x_i, x_i \rangle = 1$, we can solve this second order differential equation by introducing generalized coordinates $q_1, q_2, \ldots, q_n$ via

$$y = x_1 q_1 + x_2 q_2 + \cdots + x_n q_n, \qquad q_i = \langle x_i, y \rangle$$

Find the equations satisfied by $q_1, q_2, \ldots, q_n$ and show that each generalized coordinate executes a purely harmonic motion at frequency $\omega_i$ where $\omega_i^2 = \lambda_i \omega_0^2$. Such a motion is called a normal mode of vibration.

Write L in terms of the generalized coordinates.

Let n = 2 and 3, determine the eigenvectors and eigenvalues of A and sketch the
configuration of the particles in each normal mode of vibration.
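As a quick numerical companion to this problem (ours, not part of the text), the eigenvalues of the tridiagonal matrix $A$ can be computed for $n = 2$ and $3$ and compared with the standard closed form $\lambda_k = 2 - 2\cos\left(k\pi/(n+1)\right)$ for this matrix:

```python
import numpy as np

def string_modes(n):
    """Eigenpairs of the n x n tridiagonal matrix A = tridiag(-1, 2, -1)."""
    A = 2*np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
    return np.linalg.eigh(A)   # symmetric: real eigenvalues, orthogonal eigenvectors

for n in (2, 3):
    lam, X = string_modes(n)
    exact = 2 - 2*np.cos(np.arange(1, n + 1)*np.pi/(n + 1))
    print(n, lam, exact)       # frequencies are omega_k = omega_0 * sqrt(lam_k)
```

For $n = 2$ the eigenvalues are $1$ and $3$; for $n = 3$ they are $2 - \sqrt{2}$, $2$, $2 + \sqrt{2}$, all nonnegative, as the circle theorem promised.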

10. Let A0 , A1 , . . . , Am be a set of n × n matrices. Then

$$f(\lambda) = \lambda^m A_0 + \lambda^{m-1} A_1 + \cdots + A_m$$

is called a lambda matrix. Its elements are polynomials in λ of degree at most m. The latent
roots of f (λ) are the solutions of

det f (λ) = 0

The corresponding latent vectors are the non-zero solutions of

f (λ) x = 0

Show that the equations


 
$$\left( A_0 \frac{d^m}{dt^m} + A_1 \frac{d^{m-1}}{dt^{m-1}} + \cdots + A_m \right) x(t) = 0$$

and


$$\left( A_0 S^m + A_1 S^{m-1} + \cdots + A_m \right) x(k) = 0$$

where $S^1 x(k) = x(k+1)$, $S^2 x(k) = x(k+2)$, etc., can be turned into first order equations, i.e., equations in $\dfrac{d}{dt}$ and $S^1$ only.
Show that if λ1 is a latent root of f (λ) and x1 is a corresponding latent vector then

$$e^{\lambda_1 t} x_1$$

satisfies the first equation while

$$\lambda_1^k x_1$$

satisfies the second.
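The reduction to first order and the latent-root claim can both be seen numerically; the sketch below (ours, with random $2 \times 2$ coefficient matrices and $m = 2$) builds the usual companion form and checks that an eigenvalue of the companion matrix is a latent root with the top half of its eigenvector a latent vector.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2
A0, A1, A2 = rng.standard_normal((3, n, n))   # hypothetical coefficients, m = 2

# First order (companion) form: with z = (x, x'), the equation
# A0 x'' + A1 x' + A2 x = 0 becomes z' = C z.
C = np.block([[np.zeros((n, n)),         np.eye(n)],
              [-np.linalg.solve(A0, A2), -np.linalg.solve(A0, A1)]])

lam, Z = np.linalg.eig(C)
l1 = lam[0]
x1 = Z[:n, 0]               # top half of the eigenvector is a latent vector

residual = (l1**2*A0 + l1*A1 + A2) @ x1
print(np.abs(residual).max())   # ~ 0: f(l1) x1 = 0
```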

11. For a one shell pass, two tube pass heat exchanger, viz.,

(Figure: one shell pass, two tube pass exchanger between $z = 0$ and $z = L$; tube stream temperatures $T_1$ and $T_3$, shell stream temperature $T_2$.)

we have:

$$w_1 c_{p1} \frac{dT_1}{dz} = U_1 \pi D_1 \left\{ T_2 - T_1 \right\}$$

$$w_2 c_{p2} \frac{dT_2}{dz} = U_1 \pi D_1 \left\{ T_1 + T_3 - 2T_2 \right\}$$

$$-w_1 c_{p1} \frac{dT_3}{dz} = U_1 \pi D_1 \left\{ T_2 - T_3 \right\}$$
where w1 and w2 denote the mass flow rates of the tube and the shell fluids, and where

T1 (z = 0) = T1in

and

T2 (z = 0) = T2in

Write this

$$\begin{pmatrix} w_1 c_{p1} & 0 & 0 \\ 0 & w_2 c_{p2} & 0 \\ 0 & 0 & -w_1 c_{p1} \end{pmatrix} \frac{d}{dz} \begin{pmatrix} T_1 \\ T_2 \\ T_3 \end{pmatrix} = U_1 \pi D_1 \begin{pmatrix} -1 & 1 & 0 \\ 1 & -2 & 1 \\ 0 & 1 & -1 \end{pmatrix} \begin{pmatrix} T_1 \\ T_2 \\ T_3 \end{pmatrix}$$

and show that


 
$$\begin{pmatrix} 1 & 1 & 1 \end{pmatrix} \begin{pmatrix} -1 & 1 & 0 \\ 1 & -2 & 1 \\ 0 & 1 & -1 \end{pmatrix} = \begin{pmatrix} 0 & 0 & 0 \end{pmatrix}$$

is the conservation of energy equation.

To determine T1 , T2 and T3 write


    
$$\frac{d}{dz} \begin{pmatrix} T_1 \\ T_2 \\ T_3 \end{pmatrix} = \begin{pmatrix} -a & a & 0 \\ b & -2b & b \\ 0 & -a & a \end{pmatrix} \begin{pmatrix} T_1 \\ T_2 \\ T_3 \end{pmatrix}$$
and expand $(T_1, T_2, T_3)^T$ in terms of the eigenvalues and eigenvectors of the matrix

$$\begin{pmatrix} -a & a & 0 \\ b & -2b & b \\ 0 & -a & a \end{pmatrix}$$

where

$$a = \frac{U_1 \pi D_1}{w_1 c_{p1}}$$

and

$$b = \frac{U_1 \pi D_1}{w_2 c_{p2}}$$

Find T3out in terms of T1in and T2in by observing that

T1 (z = L) = T3 (z = L)

What happens in the limit as L grows large?
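A numerical sketch of the computation being asked for (ours; $a$, $b$, $L$ and the inlet temperatures are made-up values). Rather than the eigenvector expansion it uses the matrix exponential, and it pins down $T_3(z=0) = T_{3\,\text{out}}$ from the turnaround condition $T_1(L) = T_3(L)$:

```python
import numpy as np
from scipy.linalg import expm

a, b, L = 0.8, 0.5, 2.0          # made-up values of U1*pi*D1/(w*cp) and tube length
T1in, T2in = 20.0, 90.0          # cold tube feed, hot shell feed

M = np.array([[ -a,    a, 0.0],
              [  b, -2*b,   b],
              [0.0,   -a,   a]])
E = expm(M * L)                  # T(L) = E T(0)

# T(0) = (T1in, T2in, u) with u = T3(z=0) = T3out unknown;
# the turnaround condition T1(L) = T3(L) is linear in u.
r = E[0] - E[2]
u = -(r[0]*T1in + r[1]*T2in) / r[2]

T_L = E @ np.array([T1in, T2in, u])
print(u)                         # T3out: tube outlet temperature
print(T_L[0], T_L[2])            # equal, as the turnaround condition requires
```

Increasing $L$ in this sketch drives the outlet toward the limit asked about in the last question.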

12. By reversing the direction of the shell flow, a second one shell pass, two tube pass heat
exchanger configuration is obtained. Repeat the calculation in Problem 11 for this second
configuration. Sketch the temperatures vs z in the two configurations if the shell side is hot
and the tube side is cold. Show that the tube side temperature cannot cross the shell side
temperature in the first configuration. In the second configuration no such restriction obtains
and T3 vs z need not be monotone.

13. For a two shell pass, four tube pass heat exchanger, viz.,

(Figure: two shell pass, four tube pass exchanger between $z = 0$ and $z = L$; stream temperatures $T_1, \ldots, T_6$.)

we have:

$$w_1 c_{p1} \frac{dT_1}{dz} = U_1 \pi D_1 \left\{ T_2 - T_1 \right\}$$

$$-w_2 c_{p2} \frac{dT_2}{dz} = U_1 \pi D_1 \left\{ T_1 + T_3 - 2T_2 \right\}$$

etc.

which can be written


 
$$\frac{d}{dz} \begin{pmatrix} T_1 \\ T_2 \\ \vdots \\ T_6 \end{pmatrix} = \begin{pmatrix} -a & a & 0 & 0 & 0 & 0 \\ -b & 2b & -b & 0 & 0 & 0 \\ 0 & -a & a & 0 & 0 & 0 \\ 0 & 0 & 0 & -a & a & 0 \\ 0 & 0 & 0 & b & -2b & b \\ 0 & 0 & 0 & 0 & -a & a \end{pmatrix} \begin{pmatrix} T_1 \\ T_2 \\ \vdots \\ T_6 \end{pmatrix}$$

Denote the matrix on the RHS by A. It is block diagonal and its diagonal blocks turn up in

models of one shell pass, two tube pass heat exchangers.

Find the eigenvalues and eigenvectors of $A$ and write a formula in terms of six undetermined constants for the dependence of $T_1, T_2, \ldots, T_6$ on $z$. It is only the values of these constants that make $T_1, T_2, T_3$ and $T_4, T_5, T_6$ interdependent.

Find T2out and T6out in terms of T1in and T5in . Sketch the temperatures vs z if the
shell side is hot and the tube side is cold.

14. A one shell pass, one tube pass heat exchanger, i.e., a simple double pipe heat exchanger, is
often built using n small diameter pipes in place of one large diameter pipe:

(Figure: one shell pass, one tube pass exchanger built from $n$ small pipes between $z = 0$ and $z = L$; pipe temperatures $T_1, \ldots, T_n$, shell temperature $T_{n+1}$.)

For this we have:

$$\frac{1}{n} w c_p \frac{dT_1}{dz} = U \pi D \left\{ T_{n+1} - T_1 \right\}$$
$$\vdots$$
$$\frac{1}{n} w c_p \frac{dT_n}{dz} = U \pi D \left\{ T_{n+1} - T_n \right\}$$

$$-w_s c_{ps} \frac{dT_{n+1}}{dz} = U \pi D \left\{ T_1 + T_2 + \cdots + T_n - n T_{n+1} \right\}$$

The matrix that turns up here is


 
$$\begin{pmatrix} -a & 0 & 0 & \cdots & 0 & a \\ 0 & -a & 0 & \cdots & 0 & a \\ \vdots & \vdots & \vdots & & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & -a & a \\ -b & -b & -b & \cdots & -b & nb \end{pmatrix}$$

where w denotes the total pipe side flow and a is n times as large as before.

Determine the eigenvalues and eigenvectors of this matrix.

The pipe side flow is already equally divided over the n pipes. If the pipe side inlet
temperatures are also the same, show that all but two of the n + 1 constants in the solution
must be zero. These are the constants corresponding to the eigenvalue $-a$, which is repeated $n-1$ times.

This tells us that the temperature in each pipe is the same as it is in all other pipes.
Under this condition the model reduces to the model of a simple double pipe heat exchanger
if $\dfrac{1}{n}$ of the shell flow is assigned to each pipe.

15. For the simple 1 − 1 heat exchanger, viz.,

(Figure: simple 1-1 exchanger in cocurrent or counterflow configuration, streams $T_1$ and $T_2$.)

we introduce the column vector $(T_1, T_2)^T$. In the counterflow configuration we have

    
d  T1   −a a   T1 
=
dz T2 −b b T2

where

$$a = \frac{U \pi D}{w_1 c_{p1}} \qquad \text{and} \qquad b = \frac{U \pi D}{w_2 c_{p2}}$$

The eigenvalues and the eigenvectors of the matrix on the RHS are

$$0, \begin{pmatrix} 1 \\ 1 \end{pmatrix} \qquad \text{and} \qquad -a + b, \begin{pmatrix} a \\ b \end{pmatrix}$$

Using these produce the usual formulas for this heat exchanger.

The case where a = b leads to a double eigenvalue to which corresponds only one
independent eigenvector. Work out this case using a generalized eigenvector.
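For $a = b$ the matrix is defective and, in fact, it squares to zero, so the matrix exponential terminates after the linear term; that is exactly the polynomial-in-$z$ solution the generalized eigenvector supplies. A quick check of this (ours, with an arbitrary $a$):

```python
import numpy as np
from scipy.linalg import expm

a = 0.7                          # a = b: double eigenvalue zero, one eigenvector
M = np.array([[-a, a],
              [-a, a]])

z = 3.0
print(np.linalg.eigvals(M))      # 0, 0
print(M @ M)                     # the zero matrix: M is nilpotent
print(np.allclose(expm(M*z), np.eye(2) + M*z))   # exp(Mz) = I + Mz exactly
```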

16. A cocurrent honeycomb heat exchanger is shown below.

(Figure: cocurrent honeycomb exchanger between $z = 0$ and $z = L$; four channels, each carrying flow $W$, with temperatures $T_1, \ldots, T_4$; the hot feed and the cold feed, each $2W$, are split between alternate channels.)

As a model, we write, assuming all $c_p$'s are the same,

$$e^{-\frac{(UA)_{\text{eff}}}{w c_p}} = \frac{\overline{T}_{\text{hot out}} - \overline{T}_{\text{cold out}}}{\overline{T}_{\text{hot in}} - \overline{T}_{\text{cold in}}}$$

Suppose the internal heat transfer coefficient is $U$ and the area of the plane wall separating each stream is $A$.

Show that

$$e^{-\frac{(UA)_{\text{eff}}}{w c_p}} = \frac{1}{4} \left( 2 - \sqrt{2} \right) e^{\left( -2 + \sqrt{2} \right) \frac{UA}{w c_p}} + \frac{1}{4} \left( 2 + \sqrt{2} \right) e^{\left( -2 - \sqrt{2} \right) \frac{UA}{w c_p}}$$

17. To see what happens when a condition such as $\sum x_i = 1$ must be satisfied in a problem
where the stability of an equilibrium point is being investigated let

$$\frac{dx_1}{dt} = -f_1(x_1, x_2) + x_1$$

and

$$\frac{dx_2}{dt} = -f_2(x_1, x_2) + x_2$$

where f1 + f2 = 1 whenever x1 + x2 = 1. This requires

$$\frac{\partial f_1}{\partial x_1} - \frac{\partial f_1}{\partial x_2} + \frac{\partial f_2}{\partial x_1} - \frac{\partial f_2}{\partial x_2} = 0 \qquad (*)$$

whenever x1 + x2 = 1.

Let $x_1^0, x_2^0$ be an equilibrium point and let $x_1 = x_1^0 + \xi_1$, $x_2 = x_2^0 + \xi_2$ be a small excursion where $x_1^0 + x_2^0 = 1$ and $\xi_1 + \xi_2 = 0$.

What does condition (∗) tell us about the eigenvalues and eigenvectors of the Jacobian
matrix and about the solution to

$$\frac{d}{dt} \begin{pmatrix} \xi_1 \\ \xi_2 \end{pmatrix} = J \begin{pmatrix} \xi_1 \\ \xi_2 \end{pmatrix}$$

when

$$\begin{pmatrix} \xi_1(t=0) \\ \xi_2(t=0) \end{pmatrix} \propto \begin{pmatrix} 1 \\ -1 \end{pmatrix}$$

Does this agree with what is obtained on reducing the problem to

$$\frac{dx_1}{dt} = -f_1(x_1, 1 - x_1) + x_1 \; ?$$

18. A rider on a merry-go-round throws a ball at another rider directly opposite. The radius is $R$, the angular velocity is $\vec{w} = w\,\vec{k}$ and the speed of the ball is initially $V$.

The motion of the ball is viewed by an observer at the center of the merry-go-round and
rigidly fixed to it. Under force free conditions the equation for the motion of the ball is

$$\vec{a} = -2\, \vec{w} \times \vec{v} - \vec{w} \times \left( \vec{w} \times \vec{r} \right)$$

where $\vec{r}(t=0) = R\,\vec{i}$ and $\vec{v}(t=0) = -V\,\vec{i}$.

Let $\vec{r} = x\,\vec{i} + y\,\vec{j}$, write the equations for the motion of the ball and put them in the form

$$\frac{d}{dt} \begin{pmatrix} x \\[1mm] \dfrac{dx}{dt} \\[1mm] y \\[1mm] \dfrac{dy}{dt} \end{pmatrix} = \begin{pmatrix} 0 & 1 & 0 & 0 \\ w^2 & 0 & 0 & 2w \\ 0 & 0 & 0 & 1 \\ 0 & -2w & w^2 & 0 \end{pmatrix} \begin{pmatrix} x \\[1mm] \dfrac{dx}{dt} \\[1mm] y \\[1mm] \dfrac{dy}{dt} \end{pmatrix}$$

Solve this, noticing that the matrix on the RHS has two double eigenvalues and to each there

corresponds but one eigenvector.

Find the motion of the ball as seen by an observer fixed to the ground at the center of
the merry-go-round. The equation of motion is then

~a = 0

where $\vec{r}(t=0) = R\,\vec{i}$ and $\vec{v}(t=0) = -V\,\vec{i} + wR\,\vec{j}$.

This might explain why the physics of the problem requires that the observer fixed to the
merry-go-round find a pair of double eigenvalues each corresponding to a single eigenvector.
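The rotating-frame solution can be checked against the ground-frame straight line. The sketch below is ours, with sample values of $R$, $V$ and $w$; it integrates the $4 \times 4$ system numerically and compares with the force-free motion rotated into the merry-go-round frame.

```python
import numpy as np
from scipy.integrate import solve_ivp

R, V, w = 1.0, 2.0, 0.5            # sample values (ours)

A = np.array([[0.0,   1.0,  0.0,  0.0],
              [w**2,  0.0,  0.0,  2*w],
              [0.0,   0.0,  0.0,  1.0],
              [0.0,  -2*w, w**2,  0.0]])
print(np.linalg.eigvals(A))        # +iw and -iw, each a double (defective) eigenvalue

s0 = [R, -V, 0.0, 0.0]             # (x, dx/dt, y, dy/dt) at t = 0
sol = solve_ivp(lambda t, s: A @ s, (0.0, 2.0), s0,
                rtol=1e-10, atol=1e-12, dense_output=True)

# Ground frame: straight line X = R - V t, Y = w R t, rotated back by -wt.
t = 1.7
X, Y = R - V*t, w*R*t
x_exact = X*np.cos(w*t) + Y*np.sin(w*t)
y_exact = -X*np.sin(w*t) + Y*np.cos(w*t)
x_num, _, y_num, _ = sol.sol(t)
print((x_num, y_num), (x_exact, y_exact))
```

The rotated straight line contains secular terms proportional to $t\cos wt$ and $t\sin wt$, which is why the rotating observer must meet defective eigenvalues.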

19. Suppose we have a system of particles whose state is given by generalized coordinates $q_1, q_2, \ldots$ and generalized momenta $p_1, p_2, \ldots$. Denote the Hamiltonian for the system by $H = H(q_1, q_2, \ldots, p_1, p_2, \ldots)$. The equations of motion are

$$\frac{dq_i}{dt} = \frac{\partial H}{\partial p_i}$$

and

$$\frac{dp_i}{dt} = -\frac{\partial H}{\partial q_i}$$

Now for small oscillations about an equilibrium point, H can be approximated by

$$H = \frac{1}{2} \sum_i \sum_j Q_{ij}\, q_i q_j + \frac{1}{2} \sum_i \sum_j P_{ij}\, p_i p_j$$

where Q and P are real, symmetric and positive definite.

Then we have

$$\frac{dq_i}{dt} = \sum_j P_{ij}\, p_j$$

and hence
 
$$\frac{d^2 q_i}{dt^2} = \sum_j P_{ij} \frac{dp_j}{dt} = \sum_j P_{ij} \left( -\frac{\partial H}{\partial q_j} \right) = -\sum_j \sum_k P_{ij} Q_{jk}\, q_k$$

which we can write

$$\frac{d^2 q}{dt^2} = -W q, \qquad W = P Q$$

The eigenvalue problem for W is

$$W q_i = w_i^2\, q_i$$

Prove $w_i^2 > 0$.

Then if $w_i^2, q_i$ and $w_j^2, q_j$ are two solutions to the eigenvalue problem, with $w_i^2 \ne w_j^2$, prove

$$q_i^T Q\, q_j = 0$$

To get going multiply the eigenvalue problem by $Q$, whereupon it is:

$$Q P Q\, q = w^2 Q\, q$$
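Both claims, $w_i^2 > 0$ and $Q$-orthogonality, can be seen numerically before they are proved. The sketch below is ours, with random symmetric positive definite $P$ and $Q$; it is only a check, not a proof.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
M1, M2 = rng.standard_normal((2, n, n))
Q = M1 @ M1.T + n*np.eye(n)      # random symmetric positive definite matrices
P = M2 @ M2.T + n*np.eye(n)

W = P @ Q                        # not symmetric, yet...
w2, X = np.linalg.eig(W)
print(np.sort(w2.real))          # ...all eigenvalues are real and positive

# Q-orthogonality of eigenvectors belonging to distinct eigenvalues:
i, j = np.argsort(w2.real)[:2]
print(X[:, i].real @ Q @ X[:, j].real)   # ~ 0
```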

20. For a three stage stripping column, where we add H units of solution to the top every T units
of time, we have

$$\frac{dx_3}{dt} = S(x_2 - x_3)$$

$$\frac{dx_2}{dt} = S(x_1 - x_2)$$

$$\frac{dx_1}{dt} = -S x_1$$

where $t$ is scaled by $T$, $S = \dfrac{mV}{L}$ and $L = \dfrac{H}{T}$. The $x$'s are scaled by $x_{\text{in}}$.
Plot x1 , x2 and x3 vs t for the first few cycles where

$$x_3(t=0) = 1, \quad x_2(t=0) = 1, \quad x_1(t=0) = 1 \qquad \text{(first cycle)}$$

$$x_3(t=0) = 1, \quad x_2(t=0) = x_3(t=1)\big|_{\text{cycle before}}, \quad x_1(t=0) = x_2(t=1)\big|_{\text{cycle before}} \qquad \text{(thereafter)}$$

After many cycles x1 (t = 1) should approach

$$\frac{1}{e^{3S} - 2S e^{2S} + \dfrac{1}{2} S^2 e^{S}}$$

Notice that the only eigenvalue of

$$\begin{pmatrix} -1 & 0 & 0 \\ 1 & -1 & 0 \\ 0 & 1 & -1 \end{pmatrix}$$

is $\lambda = -1$, to which there corresponds the eigenvector and generalized eigenvectors

$$\begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}, \quad \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}, \quad \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}$$

and notice that these vectors are orthogonal in the plain vanilla inner product.
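The cycling described in this problem is easy to iterate numerically. The sketch below is ours, with an arbitrary $S$; it applies the one-cycle matrix exponential, shifts the stage compositions between cycles, and compares $x_1(t=1)$ with the many-cycle limit quoted in the problem.

```python
import numpy as np
from scipy.linalg import expm

S = 1.3                                   # arbitrary stripping factor (ours)
M = S * np.array([[-1.0,  0.0,  0.0],
                  [ 1.0, -1.0,  0.0],
                  [ 0.0,  1.0, -1.0]])
E = expm(M)                               # advances (x1, x2, x3) through one cycle

x = np.array([1.0, 1.0, 1.0])             # first cycle: everything starts at x_in
for _ in range(200):
    x1_end, x2_end, x3_end = E @ x
    x = np.array([x2_end, x3_end, 1.0])   # drop down one stage, fresh charge on top

limit = 1.0 / (np.exp(3*S) - 2*S*np.exp(2*S) + 0.5*S**2*np.exp(S))
print(x1_end, limit)                      # agree after many cycles
```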

21. If
 
$$A = \begin{pmatrix} -1 & -1 \\ 1 & -1 \end{pmatrix} \qquad \text{and} \qquad \langle y, x \rangle = y^T x$$

show that the eigenvectors of A are orthogonal.

Observe that $A \ne A^*$ but $A A^* = A^* A$. If $A A^* = A^* A$ in one inner product, is $A A^* = A^* A$ in another inner product?

22. Cocurrent honeycomb heat exchanger.

We have a set of parallel channels of rectangular cross section through which hot and
cold fluids flow, viz.,

(Figure: four parallel rectangular channels between $z = 0$ and $z = L$, carrying cold, hot, cold, hot streams; channel $i$ has temperature $T_i(z)$, the cold channels flow rate $W_{\text{cold}}$, the hot channels $W_{\text{hot}}$; the ports are numbered 1-4.)

The cold flow rates are denoted Wcold, the hot flow rates are denoted Whot. All the cold
side heat transfer coefficients are the same, so too all the hot side heat transfer coefficients.
Hence all the U’s are the same.
Denoting $\dfrac{W_{\text{cold}}\, c_{p\,\text{cold}}}{W_{\text{hot}}\, c_{p\,\text{hot}}}$ by $f$ and $\dfrac{U b}{W_{\text{cold}}\, c_{p\,\text{cold}}}$ by $\beta$ we have

$$\frac{d}{dz} T = \beta A T$$

where $T = (T_1, T_2, T_3, T_4)^T$,

 
$$A = \begin{pmatrix} -1 & 1 & 0 & 0 \\ f & -2f & f & 0 \\ 0 & 1 & -2 & 1 \\ 0 & 0 & f & -f \end{pmatrix}$$

and

$$T(z=0) = \begin{pmatrix} T_{\text{cold in}} \\ T_{\text{hot in}} \\ T_{\text{cold in}} \\ T_{\text{hot in}} \end{pmatrix}$$

Having access to inputs and outputs only at points 1, 2, 3 and 4 and knowing only that the flows are cocurrent, we define $(UA)_{\text{eff}}$ by

$$\frac{(T_{\text{hot}} - T_{\text{cold}})_{\text{out}}}{(T_{\text{hot}} - T_{\text{cold}})_{\text{in}}} = e^{-\frac{(UA)_{\text{eff}} (f+1)}{2 W_{\text{cold}}\, c_{p\,\text{cold}}}}$$

and our job is to measure $(UA)_{\text{eff}}$ and estimate $U$, where

$$(T_{\text{hot}} - T_{\text{cold}})_{\text{out}} = \overline{T}_4 - \overline{T}_3 = \frac{1}{2} \begin{pmatrix} -1 & 1 & -1 & 1 \end{pmatrix} T(z=L)$$

Define $D$, a diagonal matrix, by

$$D = \begin{pmatrix} 1 & & & \\ & f & & \\ & & 1 & \\ & & & f \end{pmatrix}, \qquad \text{where } D = D^T,$$

and denote $A(f=1)$ by $A_1$, where $A_1 = A_1^T$; then observe that $A = D A_1$.

Show that the eigenvalues of $A$ are real and not positive and that its eigenvectors are orthogonal in the inner product where $G = D^{-1}$.

Hence, write

$$T(z=L) = \sum_i \left\langle x_i, T(z=0) \right\rangle_G \, e^{\lambda_i \frac{U b L}{W_{\text{cold}}\, c_{p\,\text{cold}}}} \, x_i$$

Derive the characteristic polynomial of $A$, viz.,

$$\lambda^4 + 3(1+f)\lambda^3 + 2\left(1 + 3f + f^2\right)\lambda^2 + 2f(1+f)\lambda + \text{zero} = 0$$

and conclude:

eigenvalues:

$$\lambda_1 = 0, \quad \lambda_2 = -f - 1 + \sqrt{f^2 + 1}, \quad \lambda_3 = -f - 1, \quad \lambda_4 = -f - 1 - \sqrt{f^2 + 1}$$

unnormalized eigenvectors:

$$x_1 = \begin{pmatrix} 1 \\ 1 \\ 1 \\ 1 \end{pmatrix}, \qquad x_2 = \begin{pmatrix} 1 \\ -f + \sqrt{f^2 + 1} \\ \dfrac{1}{f}\left(1 - \sqrt{f^2 + 1}\right) \\ -1 \end{pmatrix},$$

$$x_3 = \begin{pmatrix} 1 \\ -f \\ -f \\ f^2 \end{pmatrix}, \qquad x_4 = \begin{pmatrix} 1 \\ -f - \sqrt{f^2 + 1} \\ \dfrac{1}{f}\left(1 + \sqrt{f^2 + 1}\right) \\ -1 \end{pmatrix}$$

Then derive a formula for (UA) eff in terms of U. You worked out the case f = 1 in Problem
16.
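The eigenvalues, the $G$-orthogonality and the eigenvectors quoted above can all be checked numerically. The sketch below is ours, with an arbitrary $f$; it is only a verification of the formulas, not a derivation.

```python
import numpy as np

f = 1.7                                    # arbitrary flow-rate ratio (ours)
A = np.array([[-1.0,  1.0,  0.0,  0.0],
              [   f, -2*f,    f,  0.0],
              [ 0.0,  1.0, -2.0,  1.0],
              [ 0.0,  0.0,    f,   -f]])

s = np.sqrt(f**2 + 1)
claimed = np.sort([0.0, -f - 1 + s, -f - 1, -f - 1 - s])
print(np.sort(np.linalg.eigvals(A).real))  # matches the claimed eigenvalues
print(claimed)

x2 = np.array([1.0, -f + s, (1 - s)/f, -1.0])
x3 = np.array([1.0, -f, -f, f**2])
G = np.diag([1.0, 1/f, 1.0, 1/f])          # G = D^{-1}
print(x2 @ G @ x3)                         # ~ 0: orthogonal in the G inner product
```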

23. You are going for a walk on a network of N points. A point is denoted i, i = 1, . . . , N.

Each point is linked to others and we set ℓij = 1 if there is a one step path from point j
to point i, otherwise ℓij = 0.

Denote by $\ell_j$ the number of one step paths from point $j$ to the points of the network, viz.,

$$\ell_j = \sum_i \ell_{ij}, \qquad \ell_j \ne 0$$

and assume on taking a step from point $j$ the possible destinations are chosen with equal probability, $\dfrac{1}{\ell_j}$. Thus the probability of the step $j \to i$ is

$$p_{ij} = \frac{1}{\ell_j} > 0 \quad \text{if } \ell_{ij} = 1, \qquad p_{ij} = 0 \quad \text{if } \ell_{ij} = 0$$

The probability of visiting point $i$ at step $m+1$ is

$$p_i^{(m+1)} = \sum_j p_{ij}\, p_j^{(m)}$$

and hence

$$p^{(m+1)} = P p^{(m)}$$

where

$$\begin{pmatrix} 1 & 1 & \cdots & 1 \end{pmatrix} p^{(m)} = 1$$

What must be true of $P$ in order that

$$\lim_{m \to \infty} p^{(m)} = p$$

independent of $p^{(0)}$?

Given P how would you find p ?
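A sketch of the limit in question (ours, on a small made-up network): iterate $p^{(m+1)} = P p^{(m)}$ and compare with the eigenvector of $P$ for eigenvalue one. For a walk on an undirected network the answer turns out to be proportional to the number of links at each point.

```python
import numpy as np

# A small connected, non-bipartite network (ours): triangle 1-2-3 plus point 4
# linked to points 2 and 3.  l[i, j] = 1 if there is a one-step path j -> i.
l = np.array([[0, 1, 1, 0],
              [1, 0, 1, 1],
              [1, 1, 0, 1],
              [0, 1, 1, 0]], dtype=float)

P = l / l.sum(axis=0)                # column stochastic: p_ij = 1/l_j

p = np.array([1.0, 0.0, 0.0, 0.0])   # start at point 1
for _ in range(200):
    p = P @ p

# p is the eigenvector of P for eigenvalue 1, normalized to sum to 1;
# for this network it is proportional to the link counts (2, 3, 3, 2).
print(p)
```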


Lecture 7

Simple Chemical Reactor Models

7.1 The Chemostat

A chemostat is a vessel in which a population of cells grows. We denote by V the volume of the
vessel and assume the conditions therein to be spatially uniform. Then if $n$ denotes the number density of cells in the vessel and $W$ denotes the volume flow into and out of the vessel, the number of cells in the vessel satisfies

$$V \frac{dn}{dt} = -W n + k n V$$

where $k$ is the growth constant and where the cells are not fed but grow from an initial injection which establishes $n(t=0)$. Because we have

$$\frac{dn}{dt} = \left( -\frac{W}{V} + k \right) n,$$

if $k > \dfrac{W}{V}$ the cell culture grows without bound, whereas if $k < \dfrac{W}{V}$ it washes out.

To get a more interesting model we make the simple assumption that the value of k, which tells
us the rate of cell multiplication, instead of being a constant, depends on the concentration of a
single limiting nutrient. We let c denote this concentration and write k = k (c). Assuming that the
nutrient is consumed only when the cell population grows and that it must be fed to the chemostat


to make up for this, we write

$$V \frac{dc}{dt} = W c_{\text{in}} - W c - \nu k n V$$

and our model is


 
$$\frac{dn}{dt} = \left( -\frac{W}{V} + k \right) n$$

and

$$\frac{dc}{dt} = \frac{W}{V} \left\{ c_{\text{in}} - c \right\} - \nu k n$$

where cin is the nutrient concentration in the feed and ν is a stoichiometric coefficient.

The steady states are solutions to

$$0 = \left( -\frac{W}{V} + k \right) n$$

and

$$0 = \frac{W}{V} \left\{ c_{\text{in}} - c \right\} - \nu k n$$

and these equations are satisfied by

n = 0, c = cin (wash out)

and by

$$n = \frac{c_{\text{in}} - c}{\nu}, \qquad k(c) = \frac{W}{V}.$$

Ordinarily $k$ is a monotonically increasing function of $c$ and assuming this to be so we require $k'(c) > 0$. As the values of $c$ lie on the interval $[0, c_{\text{in}}]$, the largest value of $k$ is $k(c_{\text{in}})$. We suppose that $c_{\text{in}}$ and $V$ are held at fixed values and that $W$ is decreased from an arbitrarily large

value. Then as long as W > V k (cin ) we find only the washout solution, n = 0, c = cin . The
point W = V k (cin ) is a branch point; as W passes through V k (cin ) a new solution branches off
the washout solution, and we have the following steady state diagram:

To establish the stability of these steady state solution branches, we investigate what happens to a small excursion from a steady solution denoted $n_0, c_0$. Writing our model

$$\frac{dn}{dt} = f(n, c) = \left( -\frac{W}{V} + k(c) \right) n$$

and

$$\frac{dc}{dt} = g(n, c) = \frac{W}{V} \left( c_{\text{in}} - c \right) - \nu k(c)\, n$$

we find its linear approximation near $n_0, c_0$, in terms of the small displacements $\xi$ and $\eta$, to be

$$\frac{d}{dt} \begin{pmatrix} \xi \\ \eta \end{pmatrix} = \begin{pmatrix} f_n & f_c \\ g_n & g_c \end{pmatrix} \begin{pmatrix} \xi \\ \eta \end{pmatrix} = \begin{pmatrix} -\dfrac{W}{V} + k(c_0) & k'(c_0)\, n_0 \\[2mm] -\nu k(c_0) & -\dfrac{W}{V} - \nu k'(c_0)\, n_0 \end{pmatrix} \begin{pmatrix} \xi \\ \eta \end{pmatrix} = J \begin{pmatrix} \xi \\ \eta \end{pmatrix}.$$

At a washout solution, where $n_0 = 0$, $c_0 = c_{\text{in}}$, we have

$$J = \begin{pmatrix} -\dfrac{W}{V} + k(c_{\text{in}}) & 0 \\[2mm] -\nu k(c_{\text{in}}) & -\dfrac{W}{V} \end{pmatrix}$$

whereas, at a non washout solution, where $\dfrac{W}{V} = k(c_0)$, our Jacobian matrix is

$$J = \begin{pmatrix} 0 & k'(c_0)\, n_0 \\[2mm] -\nu \dfrac{W}{V} & -\dfrac{W}{V} - \nu k'(c_0)\, n_0 \end{pmatrix}.$$

Hence if $W < V k(c_{\text{in}})$ the new branch is stable whereas the washout branch is unstable. If $W > V k(c_{\text{in}})$ the washout branch is stable. This is true because the eigenvalues of the Jacobian matrix on the new branch are $-\nu k'(c_0)\, n_0$ and $-\dfrac{W}{V}$, whereas on the washout branch they are $-\dfrac{W}{V} + k(c_{\text{in}})$ and $-\dfrac{W}{V}$. Indeed as $W$ decreases and passes through $V k(c_{\text{in}})$ the washout solution
loses its stability while the new solution picks up the lost stability of the washout solution. We
observe that at the branch point an eigenvalue vanishes, i.e., the branch point is the point where the
determinant of the Jacobian matrix vanishes. This corresponds to passing from the fourth to the
third quadrant in the plane whose axes are the determinant and the trace of the Jacobian matrix.

The reader may wish to rework this problem after adding a term to account for cell metabolism, i.e., $-\mu n$, $\mu > 0$, and specifying

k (c) = βc, β > 0,

which rules out cell growth at $c = 0$. Then the model is

$$\frac{dn}{dt} = \left( -\frac{W}{V} + \beta c \right) n$$

and

$$\frac{dc}{dt} = \frac{W}{V} \left( c_{\text{in}} - c \right) - \left( \nu \beta c + \mu \right) n$$

7.2 The Stirred Tank Reactor

This is a model problem having a long history in chemistry and chemical engineering. There are
many variations corresponding to many ways of making the reaction speed itself up. We assume
the reaction is autothermal. The rate of a chemical reaction is ordinarily a strongly increasing
LECTURE 7. SIMPLE CHEMICAL REACTOR MODELS 183

function of the temperature at which the reaction takes place. This leads to a positive feedback
when a reaction releases heat, for this heat then speeds up the reaction. This feedback makes
the problem interesting, even in the simplest case, and carrying out the reaction in a stirred tank
reactor produces a very simple problem. The model is plain vanilla, retaining only the Arrhenius
temperature dependence of the chemical rate coefficient, and this in a simplified form. Our work
is a part of what can be found in Poore’s paper “A Model Equation Arising from Chemical Reactor
Theory” (Arch. Rational Mech. Anal. 52, 358 (1973)).

Before we turn to this problem we present a brief reminder, setting $n = 2$ so that $x = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}$ and $A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}$. The solution to

$$\frac{dx}{dt} = A x$$

where x (t = 0) is assigned is

D E D E
x = y 1 , x (t = 0) eλ1 t x1 + y 2 , x (t = 0) eλ2 t x2

where $\{x_1, x_2\}$ and $\{y_1, y_2\}$ are biorthogonal sets of vectors in the inner product being used to write the solution, $x_1$ and $x_2$ being eigenvectors of $A$, $y_1$ and $y_2$ being eigenvectors of $A^*$. The
corresponding eigenvalues, λ1 and λ2 , satisfy

$$\lambda^2 - \operatorname{tr} A\, \lambda + \det A = 0$$

where trA = a11 + a22 and det A = a11 a22 − a21 a12 .

We assume that $A$ is real and we observe that the qualitative behavior of $x(t)$ differs according to where the point $(\det A, \operatorname{tr} A)$ lies in the $\det A$-$\operatorname{tr} A$ plane. The algebraic signs of the eigenvalues or their real parts are: $+,+$ in the first quadrant; $+,-$ in the second and third quadrants; $-,-$ in the fourth quadrant. The fourth quadrant is divided into two regions by the curve $(\operatorname{tr} A)^2 - 4 \det A = 0$. Above the curve the two eigenvalues are complex conjugates having a negative real part, below they are negative real numbers. The path $x(t)$ vs $t$ differs in shape according to where the point $(\det A, \operatorname{tr} A)$ lies. If it lies below $(\operatorname{tr} A)^2 - 4 \det A = 0$, $x(t)$ is the sum of two vectors

each remaining fixed in direction, their lengths shrinking exponentially. If $(\det A, \operatorname{tr} A)$ lies above $(\operatorname{tr} A)^2 - 4 \det A = 0$, $x(t)$ returns to $0$ on a spiral path. To see this we write $\lambda_2 = \overline{\lambda_1}$, $x_2 = \overline{x_1}$, $y_2 = \overline{y_1}$; then because $x(t=0)$ is real, $x(t)$ is given by

$$x(t) = 2 \operatorname{Re} \left\{ \left\langle y_1, x(t=0) \right\rangle e^{\lambda_1 t} x_1 \right\}$$

and, on writing $\left\langle y_1, x(t=0) \right\rangle = \rho e^{i\phi}$, this is

$$x(t) = 2 \rho\, e^{\operatorname{Re}\lambda_1 t} \left\{ \cos\left( \operatorname{Im}\lambda_1 t + \phi \right) \operatorname{Re} x_1 - \sin\left( \operatorname{Im}\lambda_1 t + \phi \right) \operatorname{Im} x_1 \right\}$$

Hence Reλ1 tells us the rate of decay of the spiral, Imλ1 tells us its frequency of revolution and
Rex1 and Imx1 determine its shape.

The point (det A, trA) can leave the fourth quadrant in two ways, either by crossing the line
det A = 0 or by crossing the line trA = 0. We call the first instance an exchange of stability, the
second a Hopf bifurcation.

We turn now to the stirred tank reactor. Reactants are fed to the tank and products are removed
along with some of the unused reactants and what we have is an autothermal process controlled by
heat loss and reactant loss. We determine the steady states of the reactor and study their stability.
We will find that the point (det A, trA) tells us all we want to know about the reactor close to a
steady solution. The location of this point depends on the input variables to the problem and, as
the values of these variables change, it may leave the fourth quadrant by crossing either the line
det A = 0 or the line trA = 0.

We write a simple model assuming that an exothermic, first order decomposition of a reactant
in the feed stream takes place in a tank maintained spatially uniform by sufficient stirring. It is

$$V \frac{dc}{dt} = q c_{\text{in}} - q c - k c V$$

and

$$V \rho c_P \frac{dT}{dt} = q \rho c_P T_{\text{in}} - q \rho c_P T + \{-\Delta H\}\, k c V - U A \left( T - T_c \right)$$

where c denotes the concentration of the reactant and T denotes the temperature. The tank is
equipped with a heat exchanger to remove the heat released by the reaction, and this explains the
heat sink appearing as the fourth term on the right hand side of the second equation. The preceding
term is the heat source, as −∆H > 0. The density, the heat capacity, etc. are taken to be constants
while the chemical reaction rate coefficient is specified by the Arrhenius formula:

k = Ae−E/RT .

This can be written in terms of $k_{\text{in}}$ as

$$k = k_{\text{in}}\, e^{-\frac{E}{R}\left( \frac{1}{T} - \frac{1}{T_{\text{in}}} \right)} = k_{\text{in}}\, e^{\frac{\textstyle y}{\textstyle 1 + \frac{R T_{\text{in}}}{E} y}},$$

where $y = \dfrac{E}{R T_{\text{in}}} \dfrac{T - T_{\text{in}}}{T_{\text{in}}}$, and then if $\dfrac{R T_{\text{in}}}{E} y = \dfrac{T - T_{\text{in}}}{T_{\text{in}}} \ll 1$ it is

$$k = k_{\text{in}}\, e^{y}.$$

This is what we use henceforth. It is called the Frank-Kamenetskii approximation after D.A. Frank-Kamenetskii, a mining engineer interested in the problem of thermal explosions. The approximation makes sense as long as $\dfrac{T - T_{\text{in}}}{T_{\text{in}}} \ll 1$. It is explained in physical terms in his book, "Diffusion and Heat Exchange in Chemical Kinetics." We use it for its mathematical convenience.
Then, letting $x$ denote $1 - \dfrac{c}{c_{\text{in}}}$, the fractional conversion of the feed, scaling time by the holding time $\dfrac{V}{q}$ and writing it again as $t$, and introducing the dimensionless groups

$$\beta = \frac{U A}{\rho c_P V} \frac{V}{q} \ge 0$$

$$B = \frac{E}{R T_{\text{in}}} \frac{(-\Delta H)\, c_{\text{in}}}{\rho c_P T_{\text{in}}} > 0$$

and

$$D = \frac{V}{q} k_{\text{in}} > 0$$

we have

$$\frac{dx}{dt} = -x + D e^{y} (1 - x) \equiv f(x, y)$$

and

$$\frac{dy}{dt} = -y + B D e^{y} (1 - x) - \beta y \equiv g(x, y)$$

where we have assumed $T_c = T_{\text{in}}$ and obtained a non-essential simplification. The input variables $D$, $B$ and $\beta$ measure the strengths of the chemical reaction, the heat source and the heat sink. We will assume that they can be adjusted independently whereas that may not be so in a definite physical problem. Indeed to study the response of a system to the holding time $\tau$, where $\tau = \dfrac{V}{q}$, it would be better to put $\beta_0 \tau$ and $k_0 \tau$ in place of $\beta$ and $D$ where $\dfrac{1}{\beta_0}$ and $\dfrac{1}{k_0}$ are the natural time scales for heat exchange and reaction.

The steady solutions satisfy

$$0 = -x + D e^{y} (1 - x)$$

and

$$0 = -y + B D e^{y} (1 - x) - \beta y$$

or

$$y = \frac{B}{1 + \beta} x$$

and

$$D = \frac{x}{1 - x}\, e^{-y}$$

The dependence of $x$ on $D$ is then given implicitly by

$$D = \frac{x}{1 - x}\, e^{-\frac{B}{1+\beta} x}$$

To each $x$ on $(0,1)$ there corresponds one value of $D$. As $x$ increases from $0$ to $1$ the right hand side, called RHS henceforth, increases from $0$ to $\infty$ and the question is: is this a monotonic increase? If it is, there will be one value of $x$ corresponding to each value of $D$; otherwise there will be more than one value of $x$ corresponding to some values of $D$. The answer depends on the size of $\dfrac{B}{1+\beta}$, for this determines whether the factor $e^{-\frac{B}{1+\beta} x}$ can turn around the strongly increasing factor $\dfrac{x}{1-x}$. To answer our question, the implicit function theorem instructs us to find where $\dfrac{d\,\text{RHS}}{dx}$ vanishes. Thus we calculate $\dfrac{d\,\text{RHS}}{dx}$ and find:
dx

$$\frac{d\,\text{RHS}}{dx} = \frac{1}{(1-x)^2} \frac{1}{1+\beta}\, e^{-\frac{B}{1+\beta} x} \left\{ B x^2 - B x + (1 + \beta) \right\}$$

The algebraic sign of $\dfrac{d\,\text{RHS}}{dx}$ is that of $\{B x^2 - B x + (1+\beta)\}$ and hence is positive for $x = 0$ and for $x \to 1$. Our question then reduces to: does $B x^2 - B x + (1+\beta)$ vanish for intermediate values of $x$: $0 < x < 1$? We let $x_1$ and $x_2$ denote the roots of $B x^2 - B x + (1+\beta) = 0$ and find

$$x_{1,2} = \frac{1}{2} \pm \frac{1}{2} \sqrt{1 - \frac{4(1+\beta)}{B}}$$

There are two possibilities: either $B < 4(1+\beta)$, in which case $x_1$ and $x_2$ do not lie on $(0,1)$ and RHS is a monotonic increasing function of $x$, $\forall x \in (0,1)$; or $B > 4(1+\beta)$, in which case $x_1$ and $x_2$ lie on $(0,1)$ and RHS exhibits turning points at $x_1$ and $x_2$.
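The turning-point condition can be seen numerically. This sketch (ours, with made-up $B$ and $\beta$ on each side of $B = 4(1+\beta)$) tabulates RHS and locates $x_1$ and $x_2$:

```python
import numpy as np

def rhs(x, B, beta):
    """Steady-state D corresponding to conversion x (the RHS above)."""
    return x/(1 - x) * np.exp(-B*x/(1 + beta))

x = np.linspace(1e-3, 0.999, 2000)
for B, beta in [(3.0, 0.0), (10.0, 0.0)]:      # below and above B = 4(1 + beta)
    D = rhs(x, B, beta)
    print(B, beta, bool(np.all(np.diff(D) > 0)))  # monotone only for B < 4(1+beta)

# Turning points for B > 4(1 + beta): roots of B x^2 - B x + (1 + beta) = 0
B, beta = 10.0, 0.0
x1, x2 = np.sort(np.roots([B, -B, 1 + beta]).real)
print(x1, x2)                                  # both on (0, 1)
```

For $B = 10$, $\beta = 0$ the curve is S-shaped with turning points near $x_1 \approx 0.113$ and $x_2 \approx 0.887$.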

The line B = 4 (1 + β) divides the positive quadrant of the β − B plane into two regions. In
the lower region x vs D is monotonic, in the upper x vs D is S-shaped. Schematically then the
steady state diagram looks as follows:

The steady state curve in the upper part of the sketch is not like the branching diagram in the
chemostat problem. On this S-shaped curve jumps take place as D increases through the lower
turning point corresponding to x1 (ignition point) or decreases through the upper turning point
corresponding to x2 (extinction point). And we conclude that conversions between x1 and x2 may
be difficult to achieve.

To establish the stability of the steady solutions we suppose the system to be in a steady state denoted $(x, y)$ and ask whether a small excursion to $(x + \xi, y + \eta)$ does or does not return to $(x, y)$. Because we have

$$\frac{d\xi}{dt} = f(x + \xi, y + \eta), \qquad \frac{d\eta}{dt} = g(x + \xi, y + \eta)$$

and

f (x, y) = 0 = g (x, y)

we find, for small excursions from (x, y)

d/dt (ξ, η)ᵀ = A (ξ, η)ᵀ

where

A = [ fx  fy
      gx  gy ]

and where A is to be evaluated at the steady state under investigation. Now as

fx = −1 − De^y

fy = De^y (1 − x)

gx = −BDe^y

and

gy = −(1 + β) + BDe^y (1 − x)

and as f (x, y) = 0 implies

De^y = x/(1 − x)

we find
A = [ −1/(1 − x)              x
      −Bx/(1 − x)    −(1 + β) + Bx ]

whence we have

det A = (1/(1 − x)) {Bx^2 − Bx + (1 + β)}

and

trA = −(1/(1 − x)) {Bx^2 − (B + 1 + β)x + 2 + β}.

Looking at det A first, we see that as x → 0 and as x → 1 det A is positive and that det A
vanishes at x1 and x2 , the turning points of RHS. Hence det A is positive unless B > 4 (1 + β) and
x1 ≤ x ≤ x2 . The algebraic sign of det A is marked on the foregoing sketch and it indicates that
the branch of the S-shaped steady state curve running between the turning points corresponding
to x1 and x2 is unstable. Indeed as x increases through x1 or decreases through x2 an exchange
of stabilities takes place which corresponds to passing from the fourth to the third quadrant in the
det A − trA plane by crossing the line det A = 0. While the turning points at x1 and x2 correspond
to det A = 0 they do not look like the branch point, where also det A = 0, discovered in the
chemostat problem.

Thus at each point (β, B) where B < 4 (1 + β), we have det A > 0 for all values of D but
where B > 4 (1 + β) we have a bounded range of D, depending on β and B, where det A < 0.

To see what is going on when det A is positive we must look at trA. This is negative as x → 0
and as x → 1 so that all such states are stable to small upsets. The question is whether or not trA
vanishes for intermediate values of x: 0 < x < 1. Denoting by x3 and x4 the roots of trA = 0 we
find their values to be
x_{3,4} = [B + (1 + β) ± √((B + (1 + β))^2 − 4B(2 + β))] / (2B)

The condition that (B + (1 + β))^2 − 4B(2 + β) vanish places two curves on the β − B diagram:

B = 3 + β ± 2√(2 + β)

and between these curves x3 and x4 are complex conjugates and so do not lie on (0,1). Hence in
the region between the two curves we have trA < 0.

Now the roots, x3 and x4, of trA = 0 satisfy Bx^2 − (B + 1 + β)x + 2 + β = 0 and because B + 1 + β > 0 and 2 + β > 0 their real parts must be positive. If, then, the point (β, B) lies above the upper curve, i.e., B > 3 + β + 2√(2 + β), the roots must be real and positive. In terms of z, where z = x − 1, the roots of trA = 0 satisfy

Bx^2 − (B + 1 + β)x + 2 + β = Bz^2 − (−B + 1 + β)z + 1 = 0

but −B + 1 + β < −2 − 2√(2 + β) < 0 when B > 3 + β + 2√(2 + β), hence the real parts of z3 and z4 must be negative and we see that x3 and x4 lie to the left of x = 1.
Likewise, if the point (β, B) lies below the lower curve, i.e., B < 3 + β − 2√(2 + β), x3 and x4 must also be real and positive but now they lie to the right of x = 1.

So in terms of trA what we find is this: for all points (β, B) lying below B = 3 + β + 2√(2 + β) we have trA < 0, ∀x ∈ (0, 1); for all points (β, B) lying above B = 3 + β + 2√(2 + β) we have trA < 0 for 0 < x < x3, trA > 0 for x3 < x < x4 and trA < 0 for x4 < x < 1. The correspondence between x and D is D = (x/(1 − x)) e^{−Bx/(1+β)}.

Thus at each point (β, B) where B < 3 + β + 2√(2 + β), we have trA < 0 for all values of D; but where B > 3 + β + 2√(2 + β) we have a bounded range of D, depending on β and B, where trA > 0.
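The formulas for det A and trA can be checked against a direct numerical evaluation of the matrix A. A short sketch (numpy assumed; the parameter values are illustrative):

```python
import numpy as np

def jacobian(x, B, beta):
    # A evaluated at a steady state with conversion x, using De^y = x/(1-x)
    return np.array([[-1.0 / (1.0 - x),                    x],
                     [-B * x / (1.0 - x), -(1.0 + beta) + B * x]])

B, beta = 14.0, 3.0
for x in np.linspace(0.05, 0.95, 19):
    A = jacobian(x, B, beta)
    det_formula = (B * x**2 - B * x + (1 + beta)) / (1 - x)
    tr_formula = -(B * x**2 - (B + 1 + beta) * x + (2 + beta)) / (1 - x)
    assert abs(np.linalg.det(A) - det_formula) < 1e-9
    assert abs(np.trace(A) - tr_formula) < 1e-9
```

At β = 3, B = 14 and x = 0.5, for example, trA = 1 > 0 while det A = 1 > 0, consistent with x lying between x3 and x4.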

The two curves B = 4(1 + β) and B = 3 + β + 2√(2 + β) divide the positive quadrant of the β − B plane into four regions as follows:

(Sketch: the β − B plane divided by the curves B = 4(1 + β) and B = 3 + β + 2√(2 + β); in the regions below both curves, det A > 0 and trA < 0.)

At each point where the sign of det A or trA is not indicated, both signs are possible depending
on the value of x.

The next figure presents one possibility:

(Sketch: x vs D steady-state curves for representative points in each region, with the portions where det A < 0 or trA > 0 marked.)

At each point of the uppermost region x1, x2, x3 and x4 all fall on the interval (0,1) and so as x increases from 0 to 1 det A takes positive, negative, then positive values while trA takes negative, positive, then negative values. This region can be subdivided depending on how x1, x2, x3 and x4 are ordered. This is worked out in Poore’s paper.

If we suppose that (β, B) is such that x1 < x2 < x3 < x4 then it is possible to find:

(Sketch: the S-shaped x vs D curve with the signs of trA and det A marked along it; det A vanishes at the turning points corresponding to x1 and x2, trA vanishes at x3 and x4.)

and where trA > 0 or where det A < 0 we have unstable steady states, the remaining steady
states being stable to small disturbances. The sketch should convince the readers that they have
no idea what happens as D increases and x passes through x1 . The points corresponding to x3
and x4 , where trA = 0 and det A > 0 are called Hopf bifurcation points. For D and x such
that x is just below x3 or just above x4 the corresponding steady state is stable but the return of a
small perturbation to the steady state is not monotonic. As D increases and x increases through
x3 or as D decreases and x decreases through x4 both of which correspond to passing from the
fourth to the first quadrant in the det A − trA plane by crossing the line trA = 0, something new
turns up that we can only guess at. In each instance it may be that a branch of stable periodic
solutions grows from the bifurcation point, but there are other possibilities, and Poore deals with
such questions. A simple way to get the required information can be found in Kuramoto’s book
“Chemical Oscillations, Waves and Turbulence.”

For example, as D increases in the sketch below, and the state crosses trA = 0 from S to U,

(Sketch: the x vs D curve with a stable branch S losing stability at U as trA crosses zero.)
we may see
(Sketch: phase portraits in the x−y plane showing a stable limit cycle S surrounding the unstable steady state U.)

or

(Sketch: phase portraits in the x−y plane showing an unstable limit cycle U surrounding the stable steady state S.)

The reader may wish to work out the adiabatic case, viz., β = 0.

The two problems presented in this lecture illustrate the basic ideas of small amplitude stability studies and therefore serve our purposes very well. But they are too simple to represent real chemical reactor problems and even too simple to represent what is in the chemical reactor literature. The greatest simplification is in the use of two variables to define the state of the system and the consequent use of 2 × 2 matrices to determine its stability.

But even in two state variable problems there is a lot going on. To begin to learn about this the
reader can consult chapter six in Gray and Scott’s book “Chemical Oscillations and Instabilities.”

More information on populations of microorganisms can be found in Waltman’s book “Competition Models in Population Biology.”

7.3 Home Problems

1. The autocatalytic reaction

A + P −→ 2P   (rate constant k)

takes place in a spatially uniform reactor whose holding time is θ. If cA and cP denote the concentrations of A and P and cP,in = 0, we can write

dcA/dt = cA,in/θ − cA/θ − k cA cP

and

dcP/dt = 0 − cP/θ + k cA cP

where cA (t = 0) and cP (t = 0) > 0 are specified.

Find the steady solutions of this system of equations and their stability to small upsets, i.e.,
show that the steady state diagram in terms of cA,in is:

(Sketch: cP vs cA,in; the branch cP = 0 is stable (S) for cA,in < 1/(kθ) and unstable (U) beyond, where the stable branch cP = cA,in − 1/(kθ) takes over.)

2. The reaction A + B −→ C + D takes place by the mechanism

A −→ X

B + X −→ Y + C

2X + Y −→ 3X

X −→ D

Letting a denote the concentration of A, etc., assuming that a and b remain fixed and requir-
ing all the elementary kinetic rate constants to be equal, we can write

dx/dt = a − bx + x^2 y − x

and

dy/dt = bx − x^2 y

Find the values of a and b for which these equations have a stable equilibrium point and show that the curve b = 1 + a^2 is a locus of Hopf bifurcations.
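The proposed Hopf locus can be rehearsed numerically (Python with numpy assumed). The equilibrium is x = a, y = b/a, and on b = 1 + a^2 the trace of the Jacobian vanishes while its determinant stays positive:

```python
import numpy as np

def jac(a, b):
    # Jacobian at the equilibrium x = a, y = b/a of
    #   dx/dt = a - (b+1)x + x^2 y,  dy/dt = bx - x^2 y
    x, y = a, b / a
    return np.array([[-(b + 1) + 2 * x * y,  x**2],
                     [ b - 2 * x * y,       -x**2]])

for a in (0.5, 1.0, 2.0):
    b = 1 + a**2                      # the proposed Hopf locus
    J = jac(a, b)
    assert abs(np.trace(J)) < 1e-12   # trace vanishes on b = 1 + a^2
    assert np.linalg.det(J) > 0       # det = a^2 > 0: a Hopf point, not a saddle
```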

3. The predator-prey equations of Volterra and Lotka are

dx/dt = Ax − Bxy

dy/dt = Cxy − Dy
where A, B, C and D are positive constants and where y is the predator population while x
is the prey population. Find the equilibrium points and determine their stability.
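A numerical check of the two equilibrium points (numpy assumed; the constants are chosen only for illustration): (0, 0) is a saddle, while (D/C, A/B) has purely imaginary eigenvalues ±i√(AD):

```python
import numpy as np

A_, B_, C_, D_ = 1.0, 0.5, 0.4, 1.2   # illustrative positive constants

def jac(x, y):
    # Jacobian of dx/dt = Ax - Bxy, dy/dt = Cxy - Dy
    return np.array([[A_ - B_ * y, -B_ * x],
                     [C_ * y,       C_ * x - D_]])

# (0, 0): eigenvalues A > 0 and -D < 0, a saddle point
ev0 = np.linalg.eigvals(jac(0.0, 0.0))
assert np.allclose(sorted(np.real(ev0)), sorted([-D_, A_]))

# (D/C, A/B): purely imaginary pair +/- i sqrt(AD) -- a center of the linearization
ev1 = np.linalg.eigvals(jac(D_ / C_, A_ / B_))
assert np.allclose(np.real(ev1), 0.0)
assert np.allclose(sorted(np.imag(ev1)), [-np.sqrt(A_ * D_), np.sqrt(A_ * D_)])
```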

4. The Lorenz equations, “Deterministic Nonperiodic Flow,” J. Atmos. Sci. 20, 130 (1963), viz.,

dx/dt = σ(y − x)

dy/dt = ρx − y − xz

and

dz/dt = −βz + xy

where σ, ρ and β are positive constants, are well known in the theory of deterministic chaos. These equations represent a three mode truncation of the Boussinesq equations for natural convection in a fluid layer heated from below. The parameters σ, ρ and β denote the Prandtl number, the Rayleigh number and an aspect ratio. Lorenz set σ = 10 and β = 8/3. For
fixed σ and β investigate the equilibrium solutions as they depend on ρ and establish their
stability. The point ρ = 1 is called a pitchfork as two new equilibrium solutions break off
from the equilibrium solution x = y = z = 0. Be sure to find the Hopf bifurcation on each
of the new equilibrium branches. To do this let I, II and III denote the principal invariants
of the Jacobian matrix and observe that I × II = III at a Hopf bifurcation.
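The criterion I × II = III can be used to locate the Hopf bifurcation on the branches C± = (±√(β(ρ−1)), ±√(β(ρ−1)), ρ−1) numerically. The sketch below (numpy assumed) uses Lorenz's values σ = 10, β = 8/3, at which the condition is met at ρ = σ(σ + β + 3)/(σ − β − 1):

```python
import numpy as np

sigma, beta = 10.0, 8.0 / 3.0      # Lorenz's values

def jac(rho):
    # Jacobian at the equilibrium C+ = (s, s, rho - 1), s = sqrt(beta(rho - 1))
    s = np.sqrt(beta * (rho - 1.0))
    return np.array([[-sigma, sigma,  0.0],
                     [1.0,    -1.0,   -s],
                     [s,       s,     -beta]])

def invariants(J):
    I = np.trace(J)
    II = sum(np.linalg.det(J[np.ix_(p, p)]) for p in ((0, 1), (0, 2), (1, 2)))
    III = np.linalg.det(J)
    return I, II, III

rho_H = sigma * (sigma + beta + 3.0) / (sigma - beta - 1.0)   # about 24.74
I, II, III = invariants(jac(rho_H))
assert abs(I * II - III) < 1e-6     # the Hopf condition I * II = III holds

ev = np.linalg.eigvals(jac(rho_H))
pair = ev[np.argsort(abs(ev.real))][:2]
assert np.allclose(pair.real, 0.0, atol=1e-6)   # a purely imaginary pair at rho_H
```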

5. The reaction A −→ B, where A is a gas and B is a liquid, is ordinarily carried out by bub-
bling a gas stream containing A through a liquid containing B. Assuming that A dissolves in
the liquid and that the reaction takes place there, we can write a simple model by supposing
that the solubility of A is independent of temperature, that the rate of absorption of A is not
enhanced by the chemical reaction and that the concentration of B is in large excess. The
concentration of dissolved A and the temperature of the liquid then satisfy
V dcA/dt = k_L⁰ a V (cA* − cA) − L cA − k cA cB,in V

and

V ρ cP dT/dt = ρ cP L (Tin − T) + (−∆H) k cA cB,in V − UA (T − Tin)

where L denotes the liquid feed rate and V denotes the liquid volume in the reactor and where we put cB = cB,in. The gas stream maintains A at partial pressure PA and creates the surface area aV; diffusional resistance lies entirely on the liquid side and cA* = PA/H.
Introducing

k = kin e^{(E/(R Tin))(T − Tin)/Tin}

τ = V/L

D = τ kin cB,in

Dm = τ k_L⁰ a

Y = (E/(R Tin))(T − Tin)/Tin

X = (cA* − cA)/cA*
we can write our model
−τ dX/dt = Dm X − (1 + De^Y)(1 − X)

and

τ dY/dt = −Y + BDe^Y (1 − X) − βY

Find the steady solutions of this system of equations and their stability to small upsets.
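The steady states can be generated by parametrizing the branch by X: the steady equations give Y = B((Dm + 1)X − 1)/(1 + β) and then D follows. A sketch (numpy assumed; the values of B, β, Dm and τ are invented for illustration):

```python
import numpy as np

B, beta, Dm, tau = 14.0, 3.0, 2.0, 1.0   # illustrative values, not from the text

def steady_state(X):
    # parametrize the steady-state branch by the conversion X > 1/(1 + Dm)
    Y = B * ((Dm + 1.0) * X - 1.0) / (1.0 + beta)
    D = ((Dm + 1.0) * X - 1.0) / (1.0 - X) * np.exp(-Y)
    return Y, D

def jacobian(X, Y, D):
    E = D * np.exp(Y)
    return np.array([[(-Dm - (1.0 + E)) / tau,  E * (1.0 - X) / tau],
                     [-B * E / tau, (-(1.0 + beta) + B * E * (1.0 - X)) / tau]])

branch = []
for X in np.linspace(1.0 / (1.0 + Dm) + 0.01, 0.95, 8):
    Y, D = steady_state(X)
    E = D * np.exp(Y)
    # residuals of the two steady equations vanish along the branch
    assert abs(Dm * X - (1.0 + E) * (1.0 - X)) < 1e-9
    assert abs(-(1.0 + beta) * Y + B * E * (1.0 - X)) < 1e-9
    stable = bool(np.all(np.linalg.eigvals(jacobian(X, Y, D)).real < 0.0))
    branch.append((D, X, stable))
```

Plotting X against D from `branch` and marking the stable points gives the steady state diagram asked for.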

6. In problems where the model of a process is a system of differential and algebraic equations,
e.g., p balance equations and q phase equilibrium equations to determine p+q state variables,

the response of the system is inherently slow. To see this, suppose that x, y and v satisfy

dx/dt = f (x, y, v)

dy/dt = g (x, y, v)
and

x = y.

Then the equilibrium values of x, y and v satisfy

f (x, y, v) = 0

g (x, y, v) = 0

and

x−y =0

and we assume that at an equilibrium point

det [ fx  fy  fv
      gx  gy  gv
      1   −1   0 ] ≠ 0

The dynamic value of v is determined by

dx/dt = dy/dt

or

f (x, y, v) = g (x, y, v)
LECTURE 7. SIMPLE CHEMICAL REACTOR MODELS 200

If x0, y0, v0 denotes an equilibrium solution, we can construct a linear approximation to the nearby dynamics. To do this we obtain v(x, y) via f (x, y, v) = g (x, y, v), whence

vx = −(gx − fx)/(gv − fv)

and

vy = −(gy − fy)/(gv − fv)

Then the linear approximation is

d/dt (ξ, η)ᵀ = [ Fx  Fy
                Gx  Gy ] (ξ, η)ᵀ

where F(x, y) = f(x, y, v(x, y)) and G(x, y) = g(x, y, v(x, y)).

Using

Fx = fx + fv vx

Fy = fy + fv vy

Gx = gx + gv vx

and

Gy = gy + gv vy

show that

det [ Fx  Fy
      Gx  Gy ] = Fx Gy − Gx Fy = 0.

This tells us that the linear approximation always exhibits an eigenvalue that is zero. Hence
the response of the system, when it is in an equilibrium state, to a small upset is slow. To see
whether an upset strengthens or weakens, we need to go on and look at quadratic terms.
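The vanishing determinant can be verified symbolically (sympy assumed); in fact Fx = Gx and Fy = Gy, so the two rows of the linearization coincide:

```python
import sympy as sp

fx, fy, fv, gx, gy, gv = sp.symbols('f_x f_y f_v g_x g_y g_v')

# v(x, y) obtained from f = g gives these partial derivatives
vx = -(gx - fx) / (gv - fv)
vy = -(gy - fy) / (gv - fv)

Fx, Fy = fx + fv * vx, fy + fv * vy
Gx, Gy = gx + gv * vx, gy + gv * vy

assert sp.simplify(Fx - Gx) == 0                # the rows coincide...
assert sp.simplify(Fy - Gy) == 0
assert sp.simplify(Fx * Gy - Gx * Fy) == 0      # ...so the determinant vanishes
```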

The boiling steady states of a constant pressure evaporator exhibit this problem. The
model is:

M dx/dt = xF F − x (F − V)

M cP dT/dt = UA (TS − T) + cP (TF − T) F − λV

and

T = T0 + βx

where the state variables are x, T and V, all else being fixed.

In terms of the holding time τ = M/F, the heat transfer time τh = M cP/UA and the dimensionless variables

y = (T − T0)/β,   v = V/F   and   q = λ/(cP β)

the model is

dx/dt = xF − x (1 − v)

dy/dt = (τ/τh)(yS − y) + yF − y − qv

and

x = y

where t is written in place of t/τ.


dx dy
To determine v in terms of x and y use x = y and hence dx/dt = dy/dt to conclude that

(τ/τh)(yS − y) + yF − y − qv = xF − x + xv
This tells us that

∂v/∂y = −(τ/τh + 1)/(q + x)

and

∂v/∂x = (1 − v)/(q + x)

Then as long as boiling is taking place we have

dx/dt = xF − x + x v(x, y)

and

dy/dt = (τ/τh)(yS − y) + yF − y − q v(x, y)

where 0 < v < 1, and the linear approximation is determined by the matrix

[ −1 + v + x ∂v/∂x              x ∂v/∂y
  −q ∂v/∂x          −(τ/τh) − 1 − q ∂v/∂y ]

evaluated at a solution x0 , y0 , v0 of the steady equations.

Show that the determinant of this matrix is zero and that its trace is

(−1 + v) q/(q + x) − (τ/τh + 1) x/(q + x) < 0

and as a result show that the solution to the linear equations describing a small upset of a boiling steady state is the sum of two terms, one constant in time, the other dying exponentially in time.

By carrying out an Euler approximation or otherwise, determine what in fact does happen when a boiling steady state experiences an upset.
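An Euler calculation along the lines asked for might look as follows. The parameter values are invented for illustration and chosen so that x = y = 0.8, v = 0.375 is a boiling steady state; an upset is taken along the constraint x = y, since upsets off it are ruled out by the algebraic equation:

```python
import numpy as np

# invented parameters: x = y = 0.8, v = 0.375 is then a boiling steady state
xF, yS, yF, q, r = 0.5, 2.0, 0.95, 2.0, 0.5    # r = tau/tau_h

def v_of(x, y):
    # v from the algebraic condition dx/dt = dy/dt
    return (r * (yS - y) + yF - y - xF + x) / (q + x)

x = y = 0.9                    # an upset of the steady state, along x = y
dt = 0.01
for _ in range(5000):
    v = v_of(x, y)
    dx = xF - x + x * v
    dy = r * (yS - y) + yF - y - q * v
    x, y = x + dt * dx, y + dt * dy          # Euler step

assert abs(x - y) < 1e-9       # the motion stays on the constraint x = y
assert abs(x - 0.8) < 1e-3     # and the upset decays back to the steady state
```

By construction of v, dx/dt = dy/dt at every step, so the simulated upset never leaves the constraint and, for these parameter values, dies away.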

This model takes the form

dx/dt = f (x, y, v)

dy/dt = g (x, y, v)

x = y

where v does not come into the third equation.

The boiling curve for a three component ideal solution is given by solving

dx1/dt = (−P1(T)/P + 1) x1

dx2/dt = (−P2(T)/P + 1) x2

and

x1 P1(T)/P + x2 P2(T)/P + (1 − x1 − x2) P3(T)/P = 1

where x1 + x2 + x3 = 1 is used to eliminate x3 .

Here T comes into the third equation and this system is not as sluggish as its look-a-like.

7. In the stirred tank reactor model put β = 3 and B = 14 (setting β = 2 and B = 10 leads to a better x vs D curve); then

3 + β + 2√(2 + β) < B < 4(1 + β)

and the conversion x is a monotonic increasing function of D. On this curve the determinant of the Jacobian matrix is always positive but its trace vanishes at x3 = 0.406 and x4 = 0.880.

Set D = 0.162, which corresponds to a steady conversion x = 0.380, and integrate the
differential equations for the start up of the reactor from a variety of initial conditions, in
particular x = 0, y = 0 which corresponds to starting the reactor up using its feed conditions
as initial conditions. Do this using a machine having good graphics and plot the start up
trajectories in the x, y plane.

You will find that only a small set of initial conditions lead to the one and only steady
state. This steady state is stable to small upsets but it is close to the Hopf bifurcation point
at x3 = 0.406. As D increases through this bifurcation point, a small limit cycle breaks
off. If the bifurcation is forward this cycle is stable and surrounds the steady state which
turns unstable. But if it is backward, as it is here, it is unstable and for values of D short of
the bifurcation point an unstable limit cycle surrounds the stable steady state. This in turn
is surrounded by a large stable limit cycle that sets in at a lower value of D. As D passes
through the bifurcation point the small unstable limit cycle vanishes, the stable steady state
turns unstable while the large stable limit cycle is not sensitive to all of this. The picture is
then:

(Sketch: for D < Dbifurcation, the stable steady state S is surrounded by an unstable limit cycle U, which in turn is surrounded by a large stable limit cycle; for D > Dbifurcation, only the large stable limit cycle remains and the steady state is unstable.)

The first of these pictures explains why your calculations turn out the way they do.
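The start-up calculation can be sketched as follows (Python with scipy assumed). The steady state for D = 0.162 is x ≈ 0.380 with y = Bx/(1 + β) ≈ 1.33, and, as described above, the start-up from x = y = 0 does not settle there but keeps cycling:

```python
import numpy as np
from scipy.integrate import solve_ivp

beta, B, D = 3.0, 14.0, 0.162

def f(t, s):
    x, y = s
    E = D * np.exp(y)
    return [-x + E * (1.0 - x),
            -(1.0 + beta) * y + B * E * (1.0 - x)]

xs = 0.380                       # steady conversion quoted in the text
ys = B * xs / (1.0 + beta)       # y = Bx/(1+beta) at a steady state
r = f(0.0, [xs, ys])
assert abs(r[0]) < 1e-2 and abs(r[1]) < 1e-1   # (xs, ys) is nearly steady

# start the reactor up using its feed conditions as initial conditions
sol = solve_ivp(f, (0.0, 200.0), [0.0, 0.0], max_step=0.05, rtol=1e-8)
x = sol.y[0]
late = sol.t > 150.0
assert np.all(np.isfinite(x)) and np.all(x < 1.0)
assert np.ptp(x[late]) > 0.02    # sustained oscillation: the large stable limit cycle
```

Plotting sol.y[0] against sol.y[1] displays the start-up trajectory in the x, y plane.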

8. To put metabolism into the chemostat model in a simple way write

dn/dt = (−W/V + k) n

and

dc/dt = (W/V)(c_in − c) − ν k n − µ n

where µ > 0. Determine the steady solutions and their stability as it depends on W/V.

9. In the boiling curve problem for a three component ideal solution, assume
P3 (T ) < P2 (T ) < P1 (T ) and x3 (t = 0) is nearly zero. Show that as t → ∞ we have
x3 → 1.
Lecture 8

The Inverse Problem

In this lecture we present two problems where we must determine the values of the elements aij of a matrix A using measurements of the elements xi(t) of a vector x(t) satisfying dx/dt = Ax. The first problem has been solved by J. Wei and C.D. Prater in “The Structure and Analysis of Complex Reaction Systems,” Advances in Catalysis 13, 204 (1962).

In each problem A is self-adjoint in some inner product denoted ⟨ , ⟩G and we can discover this inner product while A itself remains unknown. Ordinarily this is not the plain vanilla inner product, but it does tell us that A has a complete set of eigenvectors corresponding to real eigenvalues. Denoting these x1, x2, . . . , xn and λ1, λ2, . . . , λn, where ⟨xi, xj⟩G = δij = ⟨yi, xj⟩I, the idea is to obtain A by deriving from measurements of x(t) the terms that make up its spectral decomposition:

A = Σ λi xi yiᵀ.

To do this we expand the data, x(t), in the eigenvectors of A and write

x(t) = Σ ⟨yi, x(t = 0)⟩I e^{λi t} xi

where ⟨yi, x(t = 0)⟩I = ⟨xi, x(t = 0)⟩G and where measurements give us the left hand side of this formula. The exponential separation of the terms on the right hand side as t grows large, together with the fact that x(t = 0) is under our control, leads to a plan for determining the eigenvectors of A and then the terms in its spectral decomposition. Indeed, if, for some value of i, we can guess x(t = 0) so that ⟨yj, x(t = 0)⟩I = 0 ∀j ≠ i, x(t) is just

x(t) = ⟨yi, x(t = 0)⟩ e^{λi t} xi

and measurements of x(t) determine xi.

If, in this way, x1, x2, . . . , xn can be determined in a sequence of experiments then y1, y2, . . . , yn can be calculated and λ1, λ2, . . . , λn can be obtained via

ln (⟨yi, x(t)⟩ / ⟨yi, x(t = 0)⟩) = λi t

This done, A can be recovered in terms of its eigenvalues and eigenvectors via

A = Σ λi xi yiᵀ.

The problem to be solved then is the selection of a useful sequence of x(t = 0)’s. What makes this possible is that the terms in the expansion of x(t) go to 0 at differing rates. Indeed if 0 is the unique equilibrium point, x1 can be obtained using the long time data from an arbitrary experiment, x2 can be obtained using the long time data from an experiment satisfying ⟨x1, x(t = 0)⟩G = 0, etc.
etc.

8.1 A First Order Chemical Reaction Network.

Let i = 1, . . . , n denote the species in a chemically reacting system and suppose that each pair
participates in a reversible chemical reaction:

i ⇌ j   (kji forward, kij reverse)

where kji > 0 and kij > 0 are the forward and reverse chemical rate constants. A system of n
chemical isomers provides the simplest physical realization of this. We carry out the reaction at
constant temperature in a closed vessel. At time t = 0 we specify the number of moles of each
species in the vessel and inquire as to how these numbers change as time runs on. As the total
number of moles is fixed we can define the state of the system most easily in terms of the mole
fractions of the species. Letting xi , i = 1, . . . , n, denote these we write

dxi/dt = Σ_{j≠i} kij xj − Σ_{j≠i} kji xi

or in terms of x = (x1, x2, . . . , xn)ᵀ

dx/dt = Kx

where the off-diagonal elements of K are kij , the diagonal elements being the negative of the sums
of the off-diagonal elements in the same column. This tells us that the Gerschgorin column circles
are all centered on the negative real line with radius equal to the distance from the center to the
origin. As a result the eigenvalues of K cannot have positive real parts, and if a real part is zero
the eigenvalue itself must be zero. In the special case n = 3 the matrix K is

K = [ −k21 − k31        k12            k13
          k21       −k12 − k32         k23
          k31           k32       −k13 − k23 ]

Now we have det K = 0 because the rows of K add to 0ᵀ, i.e.,

(1, 1, . . . , 1) K = 0ᵀ,

so at least one eigenvalue of K is zero; and if Σ xi = 1 when t = 0 then Σ xi = 1 for all t ≥ 0 due to

d/dt Σ xi = (1, 1, . . . , 1) dx/dt = (1, 1, . . . , 1) Kx = 0

Also it is not hard to see that if all xi ≥ 0 for t = 0 then each xi ≥ 0 for all t ≥ 0. And so the motion of the state vector x takes place on the plane Σ xi = 1 in the positive cone (quadrant, octant, . . .) xi ≥ 0 of the composition space.

Because det K = 0, the equation Kx = 0 has solutions other than x = 0. We let xeq = (x1eq, x2eq, . . . , xneq)ᵀ denote one such solution. We make three assumptions about the problem: first that xieq is positive, i = 1, 2, . . . , n; second that xeq is the unique solution of Kx = 0 satisfying Σ xieq = 1 (we call xeq the equilibrium point for the reaction network); and third that when the network is in equilibrium each reaction is in equilibrium. This is the principle of detailed balance and it tells us that

kij xjeq = kji xieq

for all i, j = 1, 2, . . . , n.

A readable explanation of the principle of detailed balance can be found in “Treatise on Irre-
versible and Statistical Thermophysics” by W. Yourgrau, A. van der Merwe and G. Raw. In H.
Haken’s book “Synergetics” the reader will find information on detailed balance as it has to do
with what are called master equations.

The principle of detailed balance is sufficient that the first two assumptions hold. The require-
ment kij xjeq = kji xieq , kij > 0, leads to the requirement that solutions of Kx = 0 have singly
signed components. This tells us that the problem Kx = 0 can have but one independent solu-
tion, because two independent singly signed solutions have nonsingly signed linear combinations.
Hence xeq is unique up to a constant multiplier. It is unique and it lies in the positive cone as it is required to satisfy Σ xi = 1.

The principle of detailed balance is also sufficient that K be self adjoint. To see this let Xeq = diag(x1eq, x2eq, . . . , xneq) and observe that Xeq is a positive definite, Hermitian matrix. Then, by detailed balance, we can write KXeq = (KXeq)ᵀ, due to the fact that the jth column of KXeq is the jth column of K multiplied by xjeq. Hence, in the inner product where G = Xeq⁻¹, we find that K* = K. The readers should work this out for themselves using K* = G⁻¹ Kᵀ G.
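A small numerical illustration (numpy assumed; the rate constants are invented but satisfy detailed balance for xeq = (0.5, 0.3, 0.2)ᵀ):

```python
import numpy as np

xeq = np.array([0.5, 0.3, 0.2])
# invented K obeying detailed balance: K[i, j] xeq[j] = K[j, i] xeq[i], i != j
K = np.array([[-1.6,  2.0,  1.0],
              [ 1.2, -5.0,  4.5],
              [ 0.4,  3.0, -5.5]])
Xeq = np.diag(xeq)
G = np.linalg.inv(Xeq)

assert np.allclose(K @ Xeq, (K @ Xeq).T)            # K Xeq is symmetric
assert np.allclose(np.linalg.inv(G) @ K.T @ G, K)   # K* = G^{-1} K^T G = K

# so the eigenvalues are real and the eigenvectors G-orthogonal
lam, V = np.linalg.eig(K)
assert np.allclose(np.imag(lam), 0.0)
M = V.T @ G @ V                                     # Gram matrix in the G inner product
assert np.allclose(M - np.diag(np.diag(M)), 0.0, atol=1e-10)
```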

We conclude therefore that K has a complete set of eigenvectors, denoted x1, x2, . . . , xn, and we denote the corresponding eigenvalues λ1, λ2, . . . , λn. The eigenvalues are real and not positive and we require that they be ordered: λ1 = 0 > λ2 ≥ λ3 ≥ · · · ≥ λn. The eigenvectors are orthogonal in the inner product G = Xeq⁻¹, i.e.,

xiᵀ Xeq⁻¹ xj = 0, i ≠ j

 
We denote by {y1, y2, . . . , yn} the set of vectors orthogonal to {x1, x2, . . . , xn} in the plain vanilla inner product and observe that, as K* = Kᵀ in this inner product and Kᵀ (1, 1, . . . , 1)ᵀ = 0, we have y1 = (1, 1, . . . , 1)ᵀ if we set x1 = xeq. Then for any initial composition we can write the solution to dx/dt = Kx as

x(t) = Σ_{i=1}^n ⟨yi, x(t = 0)⟩ e^{λi t} xi

= xeq + Σ_{i=2}^n ⟨yi, x(t = 0)⟩ e^{λi t} xi

To see how this can be used to find the matrix K and hence the n(n − 1) chemical rate constants we put n = 3 and suppose λ3 < λ2. Then x(t) − xeq is the sum of two terms, ⟨y2, x(t = 0)⟩ e^{λ2 t} x2 and ⟨y3, x(t = 0)⟩ e^{λ3 t} x3, where the first dies out more slowly than the second. We call x2 the slow direction, x3 the fast direction. For large values of t the term in the slow direction approximates x(t) − xeq and so the long time data provide an estimate of x2 which we can refine in successive experiments. The idea is illustrated in the following sketch where the long time tangent direction is the estimate of x2. It is determined with increasing accuracy in successive experiments by using the latest estimate of x2 to derive a new initial condition wherein the magnitude of ⟨y2, x(t = 0)⟩ is increased vis-a-vis ⟨y3, x(t = 0)⟩. This can be done by extrapolating the long time tangent to the latest reaction path back to the edge of the triangle. The sequence x(t = 0) − xeq then turns toward x2 and away from x3:
(Sketch: the composition triangle showing reaction paths 1 (first experiment) and 2 (second experiment), the points xeq + c2 x2 and xeq + c3 x3, and the sequence of initial conditions turning toward x2.)

In the case n = 3 the subsequent work is especially simple for, having obtained an estimate of x2 as indicated above, we can determine x3 in terms of x1 and x2 via orthogonality in the inner product G = Xeq⁻¹, i.e., x3 can be obtained as a solution to

x1ᵀ Xeq⁻¹ x3 = (1, 1, . . . , 1) x3 = 0

and

x2ᵀ Xeq⁻¹ x3 = 0

Then using the plain vanilla inner product, we can produce y1, y2, y3 via ⟨yi, xj⟩ = δij and return to a trajectory, such as trajectory 1, having, at least for short time, roughly equal contributions in the x2 and x3 directions and use it to obtain λ2 and λ3 via

⟨y2, x(t)⟩ = ⟨y2, x(t = 0)⟩ e^{λ2 t}

and

⟨y3, x(t)⟩ = ⟨y3, x(t = 0)⟩ e^{λ3 t}

This information determines K via

K = λ1 x1 y1ᵀ + λ2 x2 y2ᵀ + λ3 x3 y3ᵀ.

This is what underlies the evaluation of K by the method of Wei and Prater.

And it obtains whatever the value of n. Indeed for any value of n we have x1 = xeq and we can find x2 as above. To get x3 we select an initial condition at random and write

x(t = 0) = xeq + c2 x2 + c3 x3 + · · · + cn xn.

This cannot be used to determine x3 unless c2 = 0, but because the eigenvectors x1, x2, . . . , xn are orthogonal in the inner product G = Xeq⁻¹ we can estimate c2 as

c2 = x2ᵀ Xeq⁻¹ x(t = 0) / (x2ᵀ Xeq⁻¹ x2)

and use the corrected initial condition x(t = 0) − c2 x2 to generate a family of trajectories that will produce x3 in the same way that a random initial condition will produce x2. But there is a technical difficulty, as the estimate of x2 we have is not perfect and neither are our composition measurements. Both factors make it impossible to completely free x(t = 0) of its x2 component and hence x2 tends to reassert itself in any trajectory as time runs on. But this flaw is not fatal; it just makes the method somewhat more tedious than it might at first seem.

The Use of Flow Reactor Data

Instead of using data produced by a closed or batch reactor to determine K, we can investigate
the possibility of using data produced by an open or flow reactor. By doing this we can avoid the
problem of determining the time to which the composition measurements correspond.

Now the model for the steady operation of a well mixed reactor is

0 = xin − xout + θKxout

where θ is the holding time. The values of xin and θ are under our control; the experiment produces
the corresponding value of xout . Letting n = 3, denoting the eigenvectors of K as xeq , x2 and x3
and the corresponding eigenvalues as 0, λ2 and λ3 , where 0 > λ2 > λ3 , and writing

xin = xeq + c2 x2 + c3 x3

and

xout = xeq + d2 x2 + d3 x3

we find

0 = c2 − d2 + θλ2 d2

and

0 = c3 − d3 + θλ3 d3

This tells us that xin , xout and xeq lie on a straight line iff that line is in the direction of an eigen-
vector of K. The reader can use this observation to devise a method for determining the matrix K
via its spectral representation. In so doing it is useful to observe that an experiment turns xin − xeq
into the direction of the line through xeq parallel to x2 and away from the line through xeq in the
direction of x3. Indeed, as

xout − xeq = (I − θK)⁻¹ (xin − xeq)

a sequence of experiments that might be worth some study is that in which the choice of xin in any experiment is the value of xout in the preceding experiment. This is easily achieved by running a set of well mixed reactors in series.
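The series-of-reactors idea can be rehearsed numerically (numpy assumed; K and xeq are invented but satisfy detailed balance). Each pass multiplies the x2 component of xout − xeq by 1/(1 − θλ2) and the x3 component by the smaller factor 1/(1 − θλ3), so the deviation turns toward x2:

```python
import numpy as np

xeq = np.array([0.5, 0.3, 0.2])            # invented equilibrium composition
K = np.array([[-1.6,  2.0,  1.0],          # invented K obeying detailed balance
              [ 1.2, -5.0,  4.5],
              [ 0.4,  3.0, -5.5]])
theta = 0.5
M = np.linalg.inv(np.eye(3) - theta * K)   # xout - xeq = M (xin - xeq)

lam, V = np.linalg.eig(K)
order = np.argsort(-lam.real)
x2 = V[:, order[1]].real                   # slow eigenvector (lambda_2)

x = np.array([0.8, 0.1, 0.1])              # feed to the first reactor
for _ in range(12):                        # twelve reactors in series
    x = xeq + M @ (x - xeq)

d = x - xeq
cos = abs(d @ x2) / (np.linalg.norm(d) * np.linalg.norm(x2))
assert cos > 0.999                         # the deviation now points along x2
```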

8.2 Liquid Levels in a Set of Interconnected Tanks

We denote by i = 1, 2, . . . , n the tanks in a network of n pairwise connected tanks, by hi the liquid level in tank i and by Ai its cross sectional area. Then the volume flow from tank i to tank j is

kij (hi − hj)

where kij is the conductivity of the pipe connecting tanks i and j and where kij = kji > 0, there being only one line connecting tanks i and j. Indeed under steady laminar flow conditions we would anticipate

kij = π R⁴ g / (8 L ν)

where R is the radius of the connecting line, L its length, etc.

The idea is to determine the values of the constants kij by studying the dynamics of the levels
in a set of interconnected tanks as the levels go to equilibrium from an assigned set of initial values.

We can work in terms of the height, hi , or the volume, Vi , of the liquid held in tank i. The
heights make the equilibrium state simple but complicate the constant of the motion, whereas the
reverse is true for the volumes. While this problem is more like the earlier problem when it is
written in terms of volumes, we work in terms of heights and write

Ai dhi/dt = −Σ_{j≠i} kij (hi − hj) = Σ_{j≠i} kij hj − (Σ_{j≠i} kij) hi

or in terms of h = (h1, h2, . . . , hn)ᵀ

dh/dt = A⁻¹Kh

where A = diag(A1, A2, . . . , An) and where the off-diagonal elements of K are kij, the diagonal elements being the negative sums of the off-diagonal elements in the same column or row.

We first observe that Aᵀ = A and Kᵀ = K and then that (A1, A2, . . . , An) A⁻¹K = (1, 1, . . . , 1) K = 0ᵀ. Hence at least one eigenvalue of A⁻¹K is zero and

(A1, A2, . . . , An) dh/dt = d/dt Σ_i Ai hi = dV/dt = 0

so that Σ_i Ai hi = V is constant and the motion takes place on a plane of constant volume, a plane whose normal is (A1, A2, . . . , An)ᵀ in the plain vanilla inner product. Also it is not hard to see that if all hi ≥ 0 for t = 0 then each hi ≥ 0 for all t ≥ 0. Therefore the curve mapped out by h(t), t ≥ 0, lies on the plane Σ_i Ai hi = V in the positive cone of the vector space Rⁿ where h resides.

The adjoint of A⁻¹K in the inner product ⟨x, y⟩G = xᵀGy is

(A⁻¹K)* = G⁻¹ (A⁻¹K)ᵀ G = G⁻¹ Kᵀ (A⁻¹)ᵀ G = G⁻¹ K A⁻¹ G

and so, on taking G = A, we discover that (A⁻¹K)* = A⁻¹K. Therefore in the inner product where G = A, A⁻¹K is self adjoint and we conclude that A⁻¹K has a complete set of eigenvectors and that the corresponding eigenvalues are real. We denote the eigenvectors x1, x2, . . . , xn and the corresponding eigenvalues λ1, λ2, . . . , λn.

The rows of A⁻¹K are multiples of the rows of K. Hence the Gerschgorin row circles for A⁻¹K, as for K itself, are all centered on the negative real line with radius equal to the distance from the center to the origin, whence the eigenvalues of A⁻¹K cannot be positive. They can be ordered: λn ≤ λn−1 ≤ · · · ≤ λ2 ≤ λ1 ≤ 0 where λ1 = 0 and corresponding to λ1 we have x1 = (1, 1, . . . , 1)ᵀ due to Kx1 = 0.
In the plain vanilla inner product, i.e., G = I, the adjoint of A⁻¹K is

(A⁻¹K)* = (A⁻¹K)ᵀ = KA⁻¹

and we denote its eigenvectors y1, y2, . . . , yn, where ⟨xi, yj⟩ = δij and where

y1 = (1/(A1 + A2 + · · · + An)) (A1, A2, . . . , An)ᵀ

due to

KA⁻¹ (A1, A2, . . . , An)ᵀ = K (1, 1, . . . , 1)ᵀ = 0

It is easy to see that the only solutions other than x = 0 to Kx = 0 are multiples of (1, 1, . . . , 1)ᵀ. Indeed, as the off-diagonal elements of K satisfy kij = kji > 0 and the diagonal elements are the negative sums of the off-diagonal elements in the same column or row, we can eliminate x1 from Kx = 0 and discover that x¹ = (x2, x3, . . . , xn)ᵀ satisfies K¹x¹ = 0 where, like K, the off-diagonal elements of K¹ satisfy k¹ij = k¹ji > 0 and the diagonal elements are the negative sums of the off-diagonal elements in the same column or row. Eliminating x2, x3, . . . , xn−2 in the same way we find that

[ −a   a ] [ xn−1 ]   [ 0 ]
[  a  −a ] [ xn   ] = [ 0 ],   a > 0,

whence xn−1 = xn and indeed we have xn−2 = xn−1, . . . , x1 = x2.

At equilibrium we have

$$A^{-1} K\, h_{eq} = 0$$

and hence, due to det A ≠ 0,

$$K h_{eq} = 0.$$

By this the equilibrium vector heq must be a multiple of

$$\begin{pmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{pmatrix}.$$
Detailed balance holds in this problem, but it does not help us as much here, where K is symmetric, as it did in the earlier problem where we had two independent paths connecting i and j.

Now zero is an eigenvalue of A−1 K and it is simple as long as all kij > 0. If connecting lines
are cut and the corresponding kij are set to zero, zero remains a simple eigenvalue as long as there
remains at least one indirect flow path from each tank to each other tank. At the point where this
is lost, zero cannot remain simple and our network splits into two disjoint subnetworks.

The solution to our problem is

$$h(t) = \sum_{i=1}^n \langle y_i, h(t=0)\rangle\, e^{\lambda_i t}\, x_i$$

and as

$$\langle y_1, h(t=0)\rangle = \frac{\sum A_i h_i(t=0)}{\sum A_i} = \frac{V}{\sum A_i} = h_{eq},$$

we can write this

$$h(t) = h_{eq}\begin{pmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{pmatrix} + \sum_{i=2}^n \langle y_i, h(t=0)\rangle\, e^{\lambda_i t}\, x_i$$

and so in the case n = 3, which is sufficient to illustrate the main idea, we write

$$h(t) = h_{eq} + \langle y_2, h(t=0)\rangle\, e^{\lambda_2 t}\, x_2 + \langle y_3, h(t=0)\rangle\, e^{\lambda_3 t}\, x_3$$

where

$$h_{eq} = h_{eq}\begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}.$$

This formula determines h(t) in terms of h(t = 0) and it can be used to recover K from experimental data for a variety of initial conditions in the way explained earlier in this lecture and used to determine chemical rate constants. In short when λ3 < λ2 we see that as t grows large h(t) approaches heq from the direction of x2. So using the direction of the tangent at heq to each of a sequence of experimental trajectories to determine an initial condition for the next trajectory, we step by step reduce the magnitude of ⟨y3, h(t = 0)⟩ in favor of ⟨y2, h(t = 0)⟩ and thereby determine x2 as accurately as we like. Using the orthogonality conditions x3TAx1 = 0 = x3TAx2 in the inner product where G = A we can find x3 and then y2 and y3 in the plain vanilla inner product. As y2, y3, and an arbitrary trajectory determine λ2 and λ3 via ⟨y2, h(t)⟩ = ⟨y2, h(t = 0)⟩ e^{λ2 t} and ⟨y3, h(t)⟩ = ⟨y3, h(t = 0)⟩ e^{λ3 t}, the values of the kij are then recovered via

$$A^{-1}K = \lambda_1 x_1 y_1^T + \lambda_2 x_2 y_2^T + \lambda_3 x_3 y_3^T$$

The idea is to make a run at random where

$$h = h_{eq} + c_2\, e^{\lambda_2 t}\, x_2 + c_3\, e^{\lambda_3 t}\, x_3$$

and where c2 = ⟨y2, h(t = 0)⟩ and c3 = ⟨y3, h(t = 0)⟩ are not known. But the second and
third terms may not separate until t is so large that h − h eq is inside experimental accuracy. This
happens if |c2 /c3 | is sufficiently small. The assumption is made that |c2 /c3 | is large enough in the
first run that separation takes place early enough in time that an estimate of x2 is obtained that can
be used to increase |c2 /c3 | for the second run. Then separation will take place even earlier in time
leading to a better estimate of x2 , etc. Such a sequence of experiments will produce an estimate of
x2 limited only by the accuracy of liquid level measurements.
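The spectral facts used above are easy to check numerically. The sketch below is not from the text; the tank areas and pipe conductivities are made-up values. It builds A and K for a hypothetical three-tank network, confirms that A⁻¹K has real, non-positive eigenvalues with a simple zero whose eigenvector is a multiple of (1, 1, 1)ᵀ, and reconstructs A⁻¹K from the formula A⁻¹K = Σ λi xi yiᵀ.

```python
import numpy as np

A = np.diag([1.0, 2.0, 3.0])                 # tank cross-sectional areas (made up)
k12, k23, k31 = 1.0, 2.0, 0.5                # pipe conductivities, all > 0 (made up)
K = np.array([[-(k12 + k31),  k12,           k31],
              [  k12,        -(k12 + k23),   k23],
              [  k31,          k23,         -(k23 + k31)]])

M = np.linalg.inv(A) @ K                     # the matrix A^{-1} K

# real, non-positive eigenvalues with a simple zero; order so lambda_1 = 0 comes first
lam, X = np.linalg.eig(M)
order = np.argsort(lam.real)[::-1]
lam, X = lam.real[order], X.real[:, order]

# biorthogonal left eigenvectors: the rows of inv(X) satisfy y_i^T x_j = delta_ij
Y = np.linalg.inv(X)

# spectral reconstruction A^{-1} K = sum_i lambda_i x_i y_i^T
M_rec = sum(lam[i] * np.outer(X[:, i], Y[i]) for i in range(3))
print(np.allclose(M_rec, M))                 # True
```

The reconstruction is exactly the route by which measured xi, yi and λi recover the conductivities in the procedure described above.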
Because the curve h(t) vs t lies on the plane Σi Ai hi = V and h1 ≥ 0, h2 ≥ 0, h3 ≥ 0, it lies
on a plane triangle. The readers can work out how to transfer this plane triangle to a piece of graph
paper so that they can draw a graph of an experimental trajectory.

Ordinarily a set of experimental runs will be carried out at different volumes but as the eigen-
vectors and eigenvalues of A−1 K do not depend on volume a sequence of runs can be brought to
a common volume quite easily. Yet all this can be avoided by working in volume fractions. Then
the matrix of interest is KA−1 and this also turns out to be self adjoint, but now G = A−1 .

All the reader needs are three fifty-five gallon drums, measuring sticks and some pipe to build
a nice unit operations lab experiment.

8.3 Home Problems

1. The reactions

[diagram: reversible reactions among species 1, 2 and 3]

are carried out in a spatially uniform flow reactor whose holding time is denoted θ. We have

$$\theta \frac{dx}{dt} = x_F - x + \theta K x$$

where x is the column vector of species mole fractions and xF denotes the feed composition.

The steady solutions, denoted xS, satisfy

$$x_F = \left(I - \theta K\right) x_S$$

Determine the eigenvectors and eigenvalues of I − θK in terms of the eigenvectors and eigenvalues of K. Show that there is one and only one value of xS corresponding to each value of xF. Show that if xF is physically meaningful then so too is xS.

Let y = x − xS; then y satisfies

$$\frac{dy}{dt} = -\left(\frac{1}{\theta} I - K\right) y$$

where y(t = 0) = x(t = 0) − xS.

Show that as t grows large y(t) converges to 0 for all values of y(t = 0).
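A quick numerical illustration of the first parts of this problem; the two-species K, the holding time θ and the feed are made-up values. The eigenvalues of I − θK are 1 − θλi ≥ 1, so the steady state is unique, and the deviation equation's coefficient matrix K − (1/θ)I has eigenvalues no larger than −1/θ, so y → 0.

```python
import numpy as np

theta = 2.0
K = np.array([[-1.0,  0.5],
              [ 1.0, -0.5]])               # a two-species isomerization; columns sum to 0
xF = np.array([0.7, 0.3])

# unique steady state: eigenvalues of I - theta K are 1 - theta lambda_i >= 1
xS = np.linalg.solve(np.eye(2) - theta * K, xF)
print(np.round(xS, 4), round(float(xS.sum()), 4))    # [0.425 0.575] 1.0

# deviations y = x - xS decay: eigenvalues of K - I/theta are all <= -1/theta
lam = np.linalg.eigvals(K - np.eye(2) / theta)
print(bool((lam.real <= -1 / theta + 1e-12).all()))  # True
```

Note that xS sums to one whenever xF does, which is the "physically meaningful" claim of the problem.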

2. Let n solutes be dissolved in a solvent which is confined to a layer of thickness L. The


solvent layer is in contact with a reservoir at x = 0 which maintains the solute concentration
there at the value c0 . The edge of the layer at x = L is impermeable to solute. The solutes
undergo an isomerization reaction

$$i \;\underset{k_{ij}}{\overset{k_{ji}}{\rightleftharpoons}}\; j$$

in the solvent and for their distribution there we have

$$D\,\frac{d^2 c}{dx^2} + K c = 0, \qquad 0 < x < L$$

c (x = 0) = c0

and

$$\frac{dc}{dx}(x = L) = 0$$
where

$$c = \begin{pmatrix} c_1 \\ c_2 \\ \vdots \\ c_n \end{pmatrix} \quad \text{and} \quad D = \begin{pmatrix} D_1 & 0 & \cdots & 0 \\ 0 & D_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & D_n \end{pmatrix}, \qquad D_i > 0.$$

Determine the total rate of reaction in the film and express this in terms of an effectiveness factor matrix.

Because D and K do not ordinarily have a complete set of eigenvectors in common this might seem like a new problem. Multiplication by D−1 shows that it is not. But the facts about D−1K have yet to be established. This can be done by observing that D−1K is similar to D−1/2KD−1/2 and to KD−1. The first is symmetrizable due to

$$\left(D^{-\frac12} K D^{-\frac12}\right) C_{eq} = \left(\left(D^{-\frac12} K D^{-\frac12}\right) C_{eq}\right)^T.$$

To see this requires KCeq = (KCeq)T and use of the symmetry and commutativity of diagonal matrices. The Gerschgorin column circles of the second lie in the left half plane because its columns are multiples of the corresponding columns of K, the multiplying factors being positive, i.e., 1/D1, 1/D2, . . ..

The matrix K is self adjoint in the inner product G = Ceq−1. Show that the matrix D−1K is self adjoint in the inner product G = Ceq−1D.

3. For reactions taking place in a solvent layer in contact with a reservoir supplying the re-
actants, the effectiveness factor matrix multiplies the rate of production vector evaluated at
reservoir conditions to determine the true rate of production vector. When the reactions are

$$i \;\underset{k_{ij}}{\overset{k_{ji}}{\rightleftharpoons}}\; j$$

the effectiveness factor matrix is

$$D \sum_{i=1}^n \frac{\tanh\left(\sqrt{-\lambda_i}\,L\right)}{\sqrt{-\lambda_i}\,L}\; x_i\, y_i^T\, D^{-1}$$

where λi and xi, i = 1, 2, . . . , n, denote the eigenvalues and eigenvectors of D−1K and

$$y_i^T x_j = \delta_{ij}$$

When the simple reversible reaction

$$1 \;\underset{k'}{\overset{k}{\rightleftharpoons}}\; 2$$

takes place in the solvent show that the effectiveness factor matrix is

$$\frac{1}{D_1 k' + D_2 k}\begin{pmatrix} D_1 k' & D_1 k' \\ D_2 k & D_2 k \end{pmatrix} + \frac{\tanh\left(\sqrt{-\lambda_2}\,L\right)}{\sqrt{-\lambda_2}\,L}\,\frac{1}{D_1 k' + D_2 k}\begin{pmatrix} D_2 k & -D_1 k' \\ -D_2 k & D_1 k' \end{pmatrix}$$

where

$$-\lambda_2 = \frac{D_1 k' + D_2 k}{D_1 D_2}$$

Observe that the rate of production of either species depends on the rate of production
of both species at reservoir conditions.
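The closed form can be checked against the eigenvector-sum expression for the effectiveness factor matrix. The sketch below, with made-up parameter values, evaluates both for the reaction 1 ⇌ 2 and compares them; kp stands in for k′.

```python
import numpy as np

D1, D2, k, kp, L = 1.0, 2.0, 3.0, 0.5, 1.5   # made-up values
D = np.diag([D1, D2])
K = np.array([[-k, kp], [k, -kp]])

def eta(s):
    # tanh(sqrt(s) L)/(sqrt(s) L), with the limit 1 at s = 0
    x = np.sqrt(max(s, 0.0)) * L
    return 1.0 if x < 1e-12 else np.tanh(x) / x

# eigenvector-sum form: D sum_i eta(-lambda_i) x_i y_i^T D^{-1}
lam, X = np.linalg.eig(np.linalg.inv(D) @ K)
lam = lam.real
Y = np.linalg.inv(X)                          # rows y_i with y_i^T x_j = delta_ij
E_sum = sum(eta(-lam[i]) * D @ np.outer(X[:, i], Y[i]) @ np.linalg.inv(D)
            for i in range(2))

# closed form for the reversible reaction 1 <-> 2
lam2 = -(D1 * kp + D2 * k) / (D1 * D2)
g = eta(-lam2)
denom = D1 * kp + D2 * k
E_cf = (np.array([[D1 * kp, D1 * kp], [D2 * k, D2 * k]])
        + g * np.array([[D2 * k, -D1 * kp], [-D2 * k, D1 * kp]])) / denom
print(np.allclose(E_sum, E_cf))               # True
```

The coupling between the species comes from the off-diagonal entries of both matrices, which is the observation made above.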

4. Let

$$A = \begin{pmatrix} -1 & 1 \\ -1 & -1 \end{pmatrix}.$$

Then show that the eigenvalue problem

$$Ax = \lambda x$$

is satisfied by

$$\lambda_1 = -1 + i, \qquad x_1 = \begin{pmatrix} i \\ -1 \end{pmatrix}$$

and

$$\lambda_2 = -1 - i, \qquad x_2 = \begin{pmatrix} -i \\ -1 \end{pmatrix}$$

 
and that {x1, x2} and {y1, y2} are biorthogonal sets if

$$y_1 = -\frac{1}{2}\begin{pmatrix} -i \\ 1 \end{pmatrix}, \qquad y_2 = -\frac{1}{2}\begin{pmatrix} i \\ 1 \end{pmatrix}$$

Show that the solution to

$$\frac{dx}{dt} = Ax$$

where x(t = 0) is assigned is

$$x(t) = \left\langle y_1, x(t=0)\right\rangle e^{(-1+i)t}\begin{pmatrix} i \\ -1 \end{pmatrix} + \left\langle y_2, x(t=0)\right\rangle e^{(-1-i)t}\begin{pmatrix} -i \\ -1 \end{pmatrix}$$

 
Sketch the solution in the x1, x2 plane when

$$x(t=0) = \begin{pmatrix} 1 \\ -1 \end{pmatrix}.$$

Because the decay constant is Re λ1 = −1 and the period of the revolution is 2π/Im λ1 = 2π it may be difficult to see the spiral as e−2π is small, where e−2π is the factor by which the length of x(t) is shortened each period.

An experiment on a system where either Re λ1 is much less than zero or Im λ1 is close to zero may look like it identifies a straight line path as the spiral will be difficult to detect and a long time tangent direction may appear to be defined. But the direction of the apparent approach to 0 will depend on x(t = 0). To see this sketch the solution in the x1, x2 plane when

$$x(t=0) = \begin{pmatrix} 1 \\ 1 \end{pmatrix}$$

and then look at the earlier sketch.
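A short numerical companion to this problem; the sample times are our own choice. The two complex conjugate terms combine into a real trajectory, and the length of x(t) shrinks by the factor e^{−2π} each period 2π.

```python
import numpy as np

A = np.array([[-1.0, 1.0], [-1.0, -1.0]])
lam, X = np.linalg.eig(A)                 # -1 + i and -1 - i
Y = np.linalg.inv(X)                      # rows biorthogonal to the columns of X
x0 = np.array([1.0, -1.0])

def x(t):
    # sum_i <y_i, x0> e^{lam_i t} x_i; the conjugate pair sums to a real vector
    return sum((Y[i] @ x0) * np.exp(lam[i] * t) * X[:, i] for i in range(2)).real

# the trajectory starts at x0 and spirals in, shrinking by e^{-2 pi} per revolution
shrink = np.linalg.norm(x(2 * np.pi)) / np.linalg.norm(x(0.0))
print(np.allclose(x(0.0), x0), np.isclose(shrink, np.exp(-2 * np.pi)))  # True True
```

Plotting x(t) for x(t = 0) = (1, −1)ᵀ and then for (1, 1)ᵀ shows the two different apparent approach directions described above.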

5. Two experiments run on three tanks having cross-sectional areas A1 = 1, A2 = 2 and


A3 = 3 produce the following data:

t      h1      h2      h3          h1      h2      h3
0      1.0     4.0     2.0         1.0     2.0     10/3
1      2.1226  2.7030  2.4905      1.8900  2.4323  2.7484
2      2.3848  2.5275  2.5201      2.2677  2.4908  2.5835
4      2.4860  2.5005  2.5043      2.4680  2.4998  2.5108
8      2.4994  2.5000  2.5002
∞      2.5     2.5     2.5         2.5     2.5     2.5

Use the data on the right hand side to estimate the conductivities of the connecting pipes.
Then use the estimates to predict the data on the left hand side.

It is of some interest to see how the “experimental data” were determined. As A1 = 1, A2 = 2 and A3 = 3 the geometric conditions of the problem require

$$x_1 = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}, \qquad y_1 = \frac{1}{A_1 + A_2 + A_3}\begin{pmatrix} A_1 \\ A_2 \\ A_3 \end{pmatrix} = \frac{1}{6}\begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix}.$$

Then as x2 and x3 can be determined only up to a constant multiple one degree of freedom is used in setting x2 and this also determines x3 via

$$x_1^T A\, x_3 = 0 = x_2^T A\, x_3$$

Then x1, x2 and x3 determine y1, y2 and y3. The remaining two degrees of freedom are used in setting λ2 and λ3 (λ3 < λ2 < 0 as λ1 = 0). However this cannot be done arbitrarily as k12, k23 and k31 must be positive.
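The recipe can be carried out in a few lines. In the sketch below the choice of x2 and of λ2, λ3 is our own, made up for illustration (and not checked against the positivity of k12, k23 and k31, which the warning above concerns).

```python
import numpy as np

A = np.diag([1.0, 2.0, 3.0])
x1 = np.ones(3)
x2 = np.array([1.0, 1.0, -1.0])          # free choice; note x1^T A x2 = 0
x3 = np.cross(A @ x1, A @ x2)            # enforces x1^T A x3 = 0 = x2^T A x3

X = np.column_stack([x1, x2, x3])
Y = np.linalg.inv(X)                     # rows y_i with y_i^T x_j = delta_ij
lam = np.array([0.0, -1.0, -2.5])        # assumed lambda_2 and lambda_3

def h(t, h0):
    # h(t) = sum_i <y_i, h0> e^{lambda_i t} x_i
    return X @ (np.exp(lam * t) * (Y @ h0))

h0 = np.array([1.0, 2.0, 10.0 / 3.0])    # the right-hand experiment's start
print(np.round(h(20.0, h0), 4))          # [2.5 2.5 2.5], the common equilibrium
```

With the actual λ2, λ3 and x2 behind the table (which we do not know), this same code would reproduce the tabulated values.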

6. An experiment on the isomerization of species A1 , A2 and A3 produces the data



t x1 x2 x3
0 1 0 0
1/4 0.7910 0.1967 0.0122
1/2 0.6451 0.3161 0.0389
1 0.4677 0.4324 0.0999
2 0.3222 0.4908 0.1869
4 0.2592 0.4998 0.2409
∞ 0.2500 0.5000 0.2500

Determine the chemical rate coefficients kij . Improve the estimates of the rate coefficients
given that the point x1 = 0.5, x2 = 0.5, x3 = 0 lies on the slow straight line path.

7. Straight Line Paths in Diffusion

Denote

$$\begin{pmatrix} c_1 \\ c_2 \end{pmatrix} \;\text{by } c \quad \text{and} \quad \begin{pmatrix} D_{11} & D_{12} \\ D_{21} & D_{22} \end{pmatrix} \;\text{by } D$$

where c1 and c2 are the compositions of two dilute solutes diffusing in a solvent.

At t = 0 we have c = c0 for 0 ≤ x < ∞ and as x → ∞ we have c = c0 for all


t ≥ 0. At x = 0 we have c = c0 + ∆ for t > 0.

Our model is

$$\frac{\partial c}{\partial t} = D\,\frac{\partial^2 c}{\partial x^2}$$

and its solution is

$$c(\xi) = c_0 + \left[I - \operatorname{erf}\left(\frac{1}{2}\,\xi\, D^{-1/2}\right)\right]\Delta$$

where ξ = xt−1/2 and where D1 > D2 > 0 denote the eigenvalues of D corresponding to
the eigenvectors x1 and x2 .

Sketch c (ξ) − c0 vs ξ for several ∆’s. You should see the curves turning toward the
slow straight line path, away from the fast straight line path. Use your favorite matrix D.

To run an experiment, we would try to bring two solutions of different compositions


together at t = 0 and run for a short time. The reader ought to derive a formula for c in this
experiment.
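The matrix function in c(ξ) is evaluated eigenvalue by eigenvalue. A sketch follows in which D, c0 and ∆ are made-up values; it also exhibits the two limits c(0) = c0 + ∆ at the wall and c → c0 far out.

```python
import numpy as np
from math import erf

D = np.array([[1.0, 0.3],
              [0.2, 0.5]])               # a favorite D, eigenvalues 1.1 and 0.4
lam, X = np.linalg.eig(D)
Y = np.linalg.inv(X)
c0 = np.array([1.0, 1.0])
Delta = np.array([0.5, -0.2])

def c(xi):
    # [I - erf(xi D^{-1/2}/2)] Delta, applied eigenvalue by eigenvalue
    f = np.array([1.0 - erf(xi / (2.0 * np.sqrt(l))) for l in lam])
    return c0 + X @ (f * (Y @ Delta))

print(np.round(c(0.0) - c0, 6))          # Delta at the wall; c -> c0 as xi grows
```

Plotting c(ξ) − c0 for several ∆'s traces out the curves bending toward the slow eigenvector, as the problem asks you to see.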
Lecture 9

More Uses of Gerschgorin’s Circle Theorem

9.1 Difference Approximations

As we explained in Lecture 6 Gerschgorin’s circle theorem establishes a set of circles in the com-
plex plane inside of which the eigenvalues of a matrix must lie, outside of which they cannot lie.
The theorem leads to estimates of the eigenvalues and while the estimates may not be sharp, neither
are they difficult to obtain.

Now stability conditions and, what is the same thing, convergence conditions tell us where the
eigenvalues of a matrix must not lie. So when convergence is the problem, what the circle theorem
tells us about where the eigenvalues of a matrix do not lie is interesting. This is our emphasis in
this lecture.

Because we will study the diffusion equation in subsequent lectures, we investigate here some
simple approximations to its solution.

The diffusion equation acts to smooth out solute irregularities. Hence it acts to damp concen-
tration excursions from equilibrium in a region whose boundary is held at equilibrium, viz.,

LECTURE 9. MORE USES OF GERSCHGORIN’S CIRCLE THEOREM 232

[figure: solute concentration c vs x, showing diffusion filling up troughs at the expense of crests]
We also should see this in approximations to its solution.

For one dimensional diffusion across a layer of thickness L, we write

$$\frac{\partial c}{\partial t} = \frac{\partial^2 c}{\partial x^2}, \qquad 0 < x < 1$$

and

c (x = 0) = 0 = c (x = 1) , t > 0

where c(t = 0) is assigned. Here distance is scaled by L and time is scaled by L²/D. We subdivide
the interval (0,1) into n+1 subintervals of length h and approximate c at each of the points xi = ih,
i = 1, . . . , n, by ci so that at any time t the function c (x, t) is approximated by the vector
 
$$c(t) = \begin{pmatrix} c_1(t) \\ c_2(t) \\ \vdots \\ c_n(t) \end{pmatrix}$$

Using a second central difference to approximate the spatial derivative, we require ci to satisfy

$$\frac{dc_i}{dt} = \frac{c_{i+1} - 2 c_i + c_{i-1}}{h^2}$$
which we can write

$$\frac{dc}{dt} = \frac{1}{h^2}\, A\, c$$

where

$$A = \begin{pmatrix} -2 & 1 & 0 & 0 & \cdots \\ 1 & -2 & 1 & 0 & \cdots \\ 0 & 1 & -2 & 1 & \cdots \\ & & \cdots & & \end{pmatrix}$$
As AT = A we see that A is self adjoint in the plain vanilla inner product, viz., G = I. Hence its eigenvalues, denoted λi, must be real and its eigenvectors, denoted xi, can be scaled so that ⟨xi, xj⟩ = δij in the plain vanilla inner product. Then our approximation is

$$c(t) = \sum_{i=1}^n \langle x_i, c(t=0)\rangle\, e^{\frac{1}{h^2}\lambda_i t}\, x_i$$

The circle theorem tells us that −4 ≤ λi ≤ 0 and as det A ≠ 0, λi ≠ 0. Ordering the eigenvalues as −4 ≤ λn ≤ λn−1 ≤ · · · ≤ λ1 < 0 we see that c(t) dies out to 0 exponentially as t grows large, the last gasp being ⟨x1, c(t = 0)⟩ e^{λ1 t/h²} x1. So, no matter how ragged the initial solute concentration, c(t = 0), it is finally only as ragged as x1.

In this simple case the solutions to the eigenvalue problem Ax = λx are

$$x_i \propto \begin{pmatrix} \sin\dfrac{1}{n+1}\, i\pi \\ \vdots \\ \sin\dfrac{n}{n+1}\, i\pi \end{pmatrix}$$

and

$$\lambda_i = 2\cos\frac{i\pi}{n+1} - 2$$

so x1 is smooth and singly signed.
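These eigenpairs, and the Gerschgorin bound −4 ≤ λi < 0, are easy to confirm numerically for a small n; the value of n below is an arbitrary choice.

```python
import numpy as np

n = 6
A = -2 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)   # the second-difference matrix

i = np.arange(1, n + 1)
lam_exact = 2 * np.cos(i * np.pi / (n + 1)) - 2         # the stated eigenvalues
lam_num = np.linalg.eigvalsh(A)                         # ascending order

print(np.allclose(np.sort(lam_exact), lam_num))               # True
print(bool((lam_num >= -4).all() and (lam_num < 0).all()))    # True: Gerschgorin bound
```

Plotting the eigenvector for λ1 would show the single-signed half sine referred to above.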



We can go on and ask for another approximation defined only at equally spaced values of t,
viz., t = kT , k = 1, 2, . . .. The approximation resulting on replacing the time derivative by a
forward difference satisfies

$$c_i^{k+1} - c_i^k = \frac{T}{h^2}\left(c_{i+1}^k - 2 c_i^k + c_{i-1}^k\right)$$

or

$$c^{k+1} = \left(I + \frac{T}{h^2} A\right) c^k.$$

The eigenvectors of I + (T/h²)A are those of A itself, the corresponding eigenvalues being 1 + (T/h²)λi, and as −4 ≤ λi < 0, we find 1 − 4T/h² ≤ 1 + (T/h²)λi < 1. The approximation then is

$$c^k = \sum_{i=1}^n \langle x_i, c(k=0)\rangle \left(1 + \frac{T}{h^2}\lambda_i\right)^k x_i$$

and ck → 0 as k → ∞ iff |1 + (T/h²)λi| < 1. To be sure that this is so we must have 1 − 4T/h² > −1 or T < ½h². In fact we require 1 − 4T/h² > 0 or T < ¼h² to be certain that 0 < 1 + (T/h²)λi < 1, thereby eliminating the possibility that terms in the approximation alternate in sign step by step.

We see that in this second approximation, having set h, we cannot set T freely and be certain that the approximation is well behaved. We also see that the two approximations differ, the factor e^{λi t/h²} in the first being replaced in the second by (1 + (T/h²)λi)^{t/T} where t = kT. In fact the second converges to the first if we fix t and let k → ∞ and T → 0 so that kT = t.

Instead of replacing the time derivative by a forward difference, we can get another approxi-
mation if a backward difference is used. It satisfies

$$c_i^k - c_i^{k-1} = \frac{T}{h^2}\left(c_{i+1}^k - 2 c_i^k + c_{i-1}^k\right)$$

or

$$c^{k+1} = \left(I - \frac{T}{h^2} A\right)^{-1} c^k.$$
The eigenvectors of (I − (T/h²)A)⁻¹, like those of I + (T/h²)A, are those of A; the corresponding eigenvalues are now 1/(1 − (T/h²)λi). Because λi < 0, these eigenvalues are all positive and lie to the left of +1, and so this approximation,

$$c^k = \sum_{i=1}^n \langle x_i, c(k=0)\rangle \left(\frac{1}{1 - \frac{T}{h^2}\lambda_i}\right)^k x_i,$$

goes to 0 as k goes to ∞, each term maintaining a fixed sign for all values of k, and we do not need a condition on T to make this happen. If we fix t and let k → ∞ and T → 0 so that kT = t, this approximation, like the second, converges to the first.
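The two stability conclusions can be seen side by side by computing spectral radii of the iteration matrices; the mesh and time steps below are arbitrary choices.

```python
import numpy as np

n, h = 10, 1.0 / 11
A = -2 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)
I = np.eye(n)

def rho(M):
    # spectral radius of an iteration matrix
    return max(abs(np.linalg.eigvals(M)))

T_bad, T_good = 0.6 * h**2, 0.4 * h**2
print(rho(I + (T_bad / h**2) * A) > 1)                   # True: forward, T > h^2/2
print(rho(I + (T_good / h**2) * A) < 1)                  # True: forward, T < h^2/2
print(rho(np.linalg.inv(I - (T_bad / h**2) * A)) < 1)    # True: backward, any T
```

A spectral radius below one is exactly the condition that c^k → 0 for every initial vector.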
Anticipating higher accuracy, we can replace ∂c/∂t by the average of the two one-sided differences. Indeed as

$$e^{\frac{T}{h^2}\lambda_i} = 1 + \frac{T}{h^2}\lambda_i + \frac{1}{2}\left(\frac{T}{h^2}\lambda_i\right)^2 + \cdots$$

$$1 + \frac{T}{h^2}\lambda_i = 1 + \frac{T}{h^2}\lambda_i$$

and

$$\frac{1}{1 - \frac{T}{h^2}\lambda_i} = 1 + \frac{T}{h^2}\lambda_i + \left(\frac{T}{h^2}\lambda_i\right)^2 + \cdots$$

the average of the second and third expansions agrees with the first to three terms while each by itself agrees with the first to only two terms. Using this idea, then, we get

$$c_i^{k+1} - c_i^{k-1} = \frac{2T}{h^2}\left(c_{i+1}^k - 2 c_i^k + c_{i-1}^k\right)$$

or

$$c^{k+1} = \frac{2T}{h^2}\, A\, c^k + I\, c^{k-1}$$

This is a second order difference equation; but it can be rewritten as a first order difference equation, viz.,

$$\begin{pmatrix} c^{k+1} \\ c^k \end{pmatrix} = \begin{pmatrix} \dfrac{2T}{h^2} A & I \\ I & 0 \end{pmatrix}\begin{pmatrix} c^k \\ c^{k-1} \end{pmatrix}$$

and solved as above. The readers can use Gerschgorin’s theorem to see if they can learn anything
about this problem.

Ordinarily a simple second order difference equation looks like this

$$c^{k+1} = A c^k + B c^{k-1}$$

and it may be introduced in an attempt to stabilize an unstable first order difference equation by introducing a delay. The second order equation can be written in first order form as

$$\begin{pmatrix} c^{k+1} \\ c^k \end{pmatrix} = \begin{pmatrix} A & B \\ I & 0 \end{pmatrix}\begin{pmatrix} c^k \\ c^{k-1} \end{pmatrix}$$

whereupon the corresponding eigenvalue problem,

$$\begin{pmatrix} A & B \\ I & 0 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = \lambda \begin{pmatrix} x \\ y \end{pmatrix},$$

is then

$$\left(\lambda^2 I - \lambda A - B\right) y = 0$$

The condition that there be solutions y ≠ 0 is that det(λ²I − λA − B) = 0. A matrix whose
elements are polynomials in a scalar λ is called a lambda matrix. Information about lambda ma-
trices, their latent roots and latent vectors can be found in Lancaster’s book “Theory of Matrices”
and in the book “Matrix Polynomials” by Gohberg, Lancaster and Rodman, as well as in Frazer,
Duncan and Collar’s book “Elementary Matrices”. If, however, A and B have a common set of
eigenvectors, elementary methods can be used.

Returning then to our problem

$$c^{k+1} = \frac{2T}{h^2}\, A\, c^k + I\, c^{k-1}$$

we expand ck in the eigenvectors of A as

$$c^k = \sum_{i=1}^n \langle x_i, c^k\rangle\, x_i,$$

and find the equation satisfied by ⟨xi, ck⟩ to be

$$\langle x_i, c^{k+1}\rangle = \frac{2T}{h^2}\lambda_i\, \langle x_i, c^k\rangle + \langle x_i, c^{k-1}\rangle$$

This is a second order constant coefficient difference equation to be solved for each value of i. Its solution is

$$\langle x_i, c^k\rangle = a_i \mu_{i1}^k + b_i \mu_{i2}^k$$

where μi1 and μi2 are the roots of μ² − 2(T/h²)λi μ − 1 = 0 and where ai and bi satisfy

$$\langle x_i, c(k=0)\rangle = a_i + b_i$$

and

$$\langle x_i, c(k=1)\rangle = a_i \mu_{i1} + b_i \mu_{i2}$$

The values of μi1 and μi2 are

$$\mu_{i1,2} = \frac{T}{h^2}\lambda_i \pm \sqrt{\left(\frac{T}{h^2}\lambda_i\right)^2 + 1}$$

and hence, because λi < 0, half of these values lie to the left of −1 and so the approximation

$$c^k = \sum_{i=1}^n \left(a_i \mu_{i1}^k + b_i \mu_{i2}^k\right) x_i$$

grows in magnitude and, sooner or later, oscillates in sign each time k increases by 1 whatever values of T and h are used.

This fourth approximation is named after L. Richardson and is an example of a good idea that
did not work out.
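The unconditional failure is visible directly from the roots: for λi < 0 the minus root is b − √(b² + 1) with b = (T/h²)λi, and its magnitude |b| + √(b² + 1) exceeds one for every T and h. A sketch (mesh and step are arbitrary choices):

```python
import numpy as np

n, h, T = 10, 1.0 / 11, 1e-4              # arbitrary mesh and time step
lam = 2 * np.cos(np.arange(1, n + 1) * np.pi / (n + 1)) - 2   # eigenvalues of A
b = (T / h**2) * lam
mu_minus = b - np.sqrt(b**2 + 1)          # roots of mu^2 - 2 b mu - 1 = 0
print(bool((mu_minus < -1).all()))        # True: one root per mode lies left of -1
```

Shrinking T only pushes these roots toward −1; it never brings them inside the unit circle.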

The forward difference approximation leads to the iteration matrix I + (T/h²)A whereas the backward difference approximation leads to (I − (T/h²)A)⁻¹. Stability places a condition on T in the first but not in the second. But to expand the second in powers of (T/h²)A we must have |(T/h²)λi| < 1 and for this it is sufficient that T < ¼h².

The readers may wish to investigate the stability of the Crank-Nicholson approximation which leads to the iteration matrix (2I − (T/h²)A)⁻¹(2I + (T/h²)A).

9.2 Home Problems

1. Two simple difference approximations to the convective diffusion equation

$$\frac{\partial c}{\partial t} = D\,\frac{\partial^2 c}{\partial x^2} - v\,\frac{\partial c}{\partial x}$$

result on making a forward or a backward difference approximation to ∂c/∂x, viz.,

$$\frac{dc_i}{dt} = D\,\frac{c_{i+1} - 2 c_i + c_{i-1}}{h^2} - \frac{v}{h}\begin{cases} c_{i+1} - c_i & \text{or} \\ c_i - c_{i-1} & \end{cases}$$

Let v be positive and use Gerschgorin circle theorem to obtain stability conditions on the
size of h in each approximation. When v = 0 stability is obtained for all values of h.

2. Determine where the eigenvalues of the Crank-Nicholson iteration matrix

$$\left(2I - \frac{T}{h^2} A\right)^{-1}\left(2I + \frac{T}{h^2} A\right)$$

lie as a function of T/h².

3. Every practical method of solving Ax = b stops at some point, producing an estimate of x.


Often this estimate needs to be improved. Two simple iterations used to do this are called
Jacobi’s and Gauss’ iterations.

To present these improvement methods we write A = L + D + U, where L and U are


strictly lower and upper triangular. Then in Jacobi’s method, we rearrange Ax = b as

$$x = -D^{-1}\left(L + U\right) x + D^{-1} b$$

and use the iteration formula

$$x^{k+1} = -D^{-1}\left(L + U\right) x^k + D^{-1} b$$

to generate a sequence of estimates x1, x2, . . . from an initial guess x0. The error satisfies

$$e^{k+1} = J e^k$$

where J = −D⁻¹(L + U) and convergence obtains iff the eigenvalues of J lie inside the unit circle. You can obtain a sufficient condition for this, viz.,

$$\sum_{j \ne i} |a_{ij}| < |a_{ii}|, \qquad i = 1, 2, \ldots, n$$

by using Gerschgorin’s theorem, and it tells you to carry out elementary row and column
operations on your problem so that when you write it Ax = b, A is as diagonally dominant
as possible.

In Gauss’ iteration we rearrange Ax = b as

$$x = -\left(L + D\right)^{-1} U x + \left(L + D\right)^{-1} b$$

whence we have

$$e^{k+1} = G e^k$$

where G = −(L + D)⁻¹U.
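Both improvement iterations are a few lines each. The sketch below runs them on a made-up, diagonally dominant system so that the sufficient condition above holds and both sequences converge.

```python
import numpy as np

A = np.array([[4.0, 1.0, 1.0],
              [1.0, 5.0, 2.0],
              [0.0, 1.0, 3.0]])           # diagonally dominant by construction
b = np.array([6.0, 8.0, 4.0])
D = np.diag(np.diag(A))
L = np.tril(A, -1)
U = np.triu(A, 1)

x_jac = np.zeros(3)
for _ in range(100):                      # Jacobi: x <- -D^{-1}(L+U) x + D^{-1} b
    x_jac = np.linalg.solve(D, b - (L + U) @ x_jac)

x_gauss = np.zeros(3)
for _ in range(100):                      # Gauss: x <- -(L+D)^{-1} U x + (L+D)^{-1} b
    x_gauss = np.linalg.solve(L + D, b - U @ x_gauss)

print(np.allclose(A @ x_jac, b), np.allclose(A @ x_gauss, b))  # True True
```

Solving against D is just a division by the diagonal, and solving against L + D is a forward substitution, which is why each sweep is cheap.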

4. In an iteration process where xk determines xk+1, the components of xk+1 are not obtained simultaneously but one at a time. Show that in a Jacobi iteration, the components of xk+1 obtained in the step k → k + 1 are not used in the calculation until the step k + 1 → k + 2. Show that in a Gauss iteration, the components obtained in the step k → k + 1 are used in the calculation as soon as they are obtained, i.e., the i + 1 component of xk+1 is determined using the first i components of xk+1 and the last n − i components of xk.

Use Gerschgorin’s circle theorem to derive sufficient conditions for a Gauss iteration to
converge.
Part II

Elementary Differential Equations


Lecture 10

A Word To The Reader Upon Leaving


Finite Dimensional Vector Spaces

In Part II we are going to study linear differential equations in which the differential operator is, for
the most part, ∇2 . The solutions to our problems will depend on position as well as on time and the
spaces where they reside will be called function spaces. The solutions to the eigenvalue problem for
∇2 will be called eigenfunctions and ordinarily there will be infinitely many independent solutions.
The function spaces will be infinite dimensional and our solutions will be in the form of infinite
series. Many difficult questions will then arise that did not arise in Part I and many of these
difficulties can be reduced to the question: what interpretation can we place on an infinite sum of
eigenfunctions?

Before we take up such questions, if we do so at all, we explain the way to go about constructing
the solution to a linear differential equation. We do this by expanding the solution in a series of
eigenfunctions. To obtain the eigenfunctions we need to explain how to solve the eigenvalue
problem. We do this by separation of variables. So we have two aims: first to explain how
eigenfunction expansions are used to solve linear differential equations, second to explain the
method of separation of variables as it is used to solve the corresponding eigenvalue problem.

As ∇2 is the focus of our work, we first establish the elementary facts about ∇2 . We then use
separation of variables to reduce the eigenvalue problem for ∇2 to a set of three one-dimensional
eigenvalue problems and we use Frobenius’ method to solve these eigenvalue problems.

LECTURE 10. A WORD TO THE READER UPON LEAVING FINITE DIMENSIONAL VECTOR SPACES244

Earlier, in Part I, the solutions to our problems were finite sums and we might have, but did not, substitute such a sum directly into an equation to be solved in order to determine the coefficients
of the eigenvectors in the sum. Indeed to see how easily this works the reader needs to determine
the equations for the ci (t) by substituting

$$x(t) = \sum_i c_i(t)\, x_i$$

into the equation

$$\frac{dx}{dt} = A x$$

In Part II this may work, but it may not. The problem is that the solutions are infinite sums
and it may not be known before the problem is solved whether or not the derivative of a sum is in
fact the sum of the derivatives of its terms. There is at least one simple example in what follows to
illustrate this point.

What we do does not differ from what we did in Part I and does not require that an assumed
solution be substituted into the equation being solved. Indeed the coefficients to be determined are
found by integrating this equation, after it is multiplied by suitable weighting functions.

The lectures are on the diffusion equation, its solution in bounded regions in terms of eigen-
functions, the solution of the eigenvalue problem by separation of variables and some problems in
Cartesian, cylindrical and spherical coordinate systems to fill in the details.
Lecture 11

The Differential Operator ∇2

11.1 The Differential Operator ∇

Part II is about the differential operator ∇2 . To begin we need to learn how to write ∇ and
∇2 = ∇ · ∇ in coordinate systems of interest and for us this means only orthogonal coordinate
systems. The simplest of these and our starting point is a system of Cartesian coordinates denoted
x, y, z.

A point in space, say P, can be located with respect to an origin O by the vector ~r = x~i + y~j + z~k, drawn from O to P, where (x, y, z) denotes the Cartesian coordinates of the point P and
where ~i, ~j and ~k are unit vectors along the axes Ox, Oy and Oz. By design we have

~i · ~j = 0, ~j · ~k = 0, ~k · ~i = 0

i. e., a Cartesian coordinate system is constructed to be an orthogonal coordinate system, and we


have

~i × ~j = ~k, ~j × ~k = ~i, ~k × ~i = ~j

Then, at the point P, we have tangents to the coordinate curves passing through P, viz., the tangent to the coordinate curve where x is increasing at constant y and z is ∂~r/∂x = ~i. Likewise we

LECTURE 11. THE DIFFERENTIAL OPERATOR ∇2 246

have ∂~r/∂y = ~j and ∂~r/∂z = ~k. Thus, at P, the set of vectors (~i, ~j, ~k) forms an orthogonal basis of unit length vectors, the same basis at any point P as at any other point. Hence the nine derivatives ∂~i/∂x, etc., all vanish. This is what makes Cartesian coordinates the simplest coordinate system.
Now, suppose that we have a smooth scalar or vector or tensor valued function defined through-
out a region of space and that we wish to introduce a notation that allows us to differentiate this
function. To do this let C denote a curve lying in this region and let s denote arc length along this
curve. The positions of points on the curve are denoted ~r (s) or x(s), y(s), z(s), and the tangent to
the curve at a point P on the curve is denoted by ~t where ~t = d~r/ds and ~t · ~t = 1 due to ds² = d~r · d~r.
We introduce the differential operator ∇ so that at a point P of C the derivative of f with
respect to arc length along C is given by

$$\frac{df}{ds} = \vec t \cdot \nabla f$$

where ∇f depends on the point P and ~t, at the point P , depends on the curve C.

We now select three curves passing through P having unit tangents at P denoted ~t1, ~t2 and ~t3 and we introduce the set of vectors (~a1, ~a2, ~a3) biorthogonal to (~t1, ~t2, ~t3). Then we have

$$\vec t_1 \cdot \nabla f = \frac{df}{ds_1}, \qquad \vec t_2 \cdot \nabla f = \frac{df}{ds_2}, \qquad \vec t_3 \cdot \nabla f = \frac{df}{ds_3}$$

whereupon

$$\nabla f = \left(\vec a_1 \vec t_1 + \vec a_2 \vec t_2 + \vec a_3 \vec t_3\right) \cdot \nabla f = \vec a_1 \frac{df}{ds_1} + \vec a_2 \frac{df}{ds_2} + \vec a_3 \frac{df}{ds_3}$$

and therefore we have ∇f in terms of three derivatives of f, viz., df/ds1, df/ds2 and df/ds3, along three curves C1, C2 and C3 passing through P.

Thus on any curve C the derivative of f with respect to arc length along the curve is given by

$$\frac{df}{ds} = \left(\vec t \cdot \vec a_1\right)\frac{df}{ds_1} + \left(\vec t \cdot \vec a_2\right)\frac{df}{ds_2} + \left(\vec t \cdot \vec a_3\right)\frac{df}{ds_3}$$

Now if we have a coordinate system, we will choose the three curves through P to be the three

coordinate curves passing through P . Hence in Cartesian coordinates we choose ~t1 = ~i, ~t2 = ~j,
and ~t3 = ~k whereupon we have

$$\frac{df}{ds_1} = \frac{\partial f}{\partial x}, \qquad \frac{df}{ds_2} = \frac{\partial f}{\partial y}, \qquad \text{and} \qquad \frac{df}{ds_3} = \frac{\partial f}{\partial z}$$

and, therefore,

$$\nabla f = \vec i\,\frac{\partial f}{\partial x} + \vec j\,\frac{\partial f}{\partial y} + \vec k\,\frac{\partial f}{\partial z}$$

Hence, in Cartesian coordinates, we denote by ∇ the differential operator

$$\vec i\,\frac{\partial}{\partial x} + \vec j\,\frac{\partial}{\partial y} + \vec k\,\frac{\partial}{\partial z}$$

whereupon we have

$$\nabla^2 = \nabla \cdot \nabla = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}$$

making use of the fact that ~i, ~j and ~k are independent of x, y and z.

11.2 New Coordinate Systems

Now, we may introduce a new coordinate system where the new coordinates, (u, v, w), like (x, y, z),
are coordinates of a point P . We do this by writing

x = f (u, v, w)

y = g (u, v, w)

z = h (u, v, w)
where the vectors

$$\vec r_u = \frac{\partial \vec r}{\partial u}, \qquad \vec r_v = \frac{\partial \vec r}{\partial v} \qquad \text{and} \qquad \vec r_w = \frac{\partial \vec r}{\partial w}$$

are tangent to the coordinate curves, viz., the curves u increasing at constant v and w, v increasing

are tangent to the coordinate curves, viz., the curves u increasing at constant v and w, v increasing
at constant w and u, etc.

Then our new coordinate system is an orthogonal coordinate system iff

~ru · ~rv = 0, ~rv · ~rw = 0 and ~rw · ~ru = 0

This being the case, and it is the only case of interest to us, we introduce unit length vectors along
the three coordinate curves by scaling ~ru, ~rv and ~rw, viz.,

$$\vec i_u = \frac{\vec r_u}{|\vec r_u|}, \qquad \vec i_v = \frac{\vec r_v}{|\vec r_v|} \qquad \text{and} \qquad \vec i_w = \frac{\vec r_w}{|\vec r_w|}$$

And now at each point of space we have an orthogonal basis of unit vectors that, algebraically, acts just like (~i, ~j, ~k), viz.,

$$\vec i_u \cdot \vec i_v = 0, \qquad \vec i_v \cdot \vec i_w = 0 \qquad \text{and} \qquad \vec i_w \cdot \vec i_u = 0$$

and

$$\vec i_u \times \vec i_v = \vec i_w, \qquad \vec i_v \times \vec i_w = \vec i_u \qquad \text{and} \qquad \vec i_w \times \vec i_u = \vec i_v$$

However, the vectors ~ru, ~rv and ~rw ordinarily do not remain fixed in direction as we move from a
point P to a nearby point.

To write a formula for ∇ in our new coordinate system, we introduce a curve C defined by

u = u (s) , v = v (s) and w = w (s)


and then we differentiate f with respect to arc length along this curve, viz.,

$$\frac{df}{ds} = \frac{\partial f}{\partial u}\frac{du}{ds} + \frac{\partial f}{\partial v}\frac{dv}{ds} + \frac{\partial f}{\partial w}\frac{dw}{ds}$$

Observing that the tangent to the curve is given by

$$\vec t = \frac{d\vec r}{ds} = \vec r_u \frac{du}{ds} + \vec r_v \frac{dv}{ds} + \vec r_w \frac{dw}{ds}$$

we have

$$\frac{du}{ds} = \frac{\vec r_u \cdot \vec t}{|\vec r_u|^2} = \frac{\vec i_u}{|\vec r_u|} \cdot \vec t, \qquad \frac{dv}{ds} = \frac{\vec r_v \cdot \vec t}{|\vec r_v|^2} = \frac{\vec i_v}{|\vec r_v|} \cdot \vec t$$

and

$$\frac{dw}{ds} = \frac{\vec r_w \cdot \vec t}{|\vec r_w|^2} = \frac{\vec i_w}{|\vec r_w|} \cdot \vec t$$

whence we obtain

$$\frac{df}{ds} = \vec t \cdot \left\{ \frac{\vec i_u}{|\vec r_u|}\frac{\partial}{\partial u} + \frac{\vec i_v}{|\vec r_v|}\frac{\partial}{\partial v} + \frac{\vec i_w}{|\vec r_w|}\frac{\partial}{\partial w}\right\} f = \vec t \cdot \nabla f$$

where, in our new coordinate system,

$$\nabla = \frac{\vec i_u}{h_u}\frac{\partial}{\partial u} + \frac{\vec i_v}{h_v}\frac{\partial}{\partial v} + \frac{\vec i_w}{h_w}\frac{\partial}{\partial w}$$

and where

$$h_u = |\vec r_u|, \qquad h_v = |\vec r_v| \qquad \text{and} \qquad h_w = |\vec r_w|$$


Now a displacement d~r is given in terms of du, dv and dw by

$$d\vec r = \vec r_u\, du + \vec r_v\, dv + \vec r_w\, dw$$

whereupon we see that

$$ds^2 = d\vec r \cdot d\vec r = |\vec r_u|^2\, du^2 + |\vec r_v|^2\, dv^2 + |\vec r_w|^2\, dw^2 = h_u^2\, du^2 + h_v^2\, dv^2 + h_w^2\, dw^2$$

Hence if we can write a formula for ds2 in our coordinate system, we can read off the formulas for
hu, hv and hw. For example, in cylindrical coordinates we have

ds2 = dr 2 + r 2 dθ2 + dz 2

and therefore hr = 1, hθ = r, hz = 1. And in spherical coordinates we have

ds2 = dr 2 + r 2 dθ2 + r 2 sin2 θ dφ2

and therefore hr = 1, hθ = r, hφ = r sin θ.
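Reading off hu, hv and hw this way is easy to verify: difference ~r numerically along each coordinate curve and take lengths. The sketch below does this for spherical coordinates; the test point and step size are arbitrary choices.

```python
import numpy as np

def pos(r, th, ph):
    # position vector in spherical coordinates (r, theta, phi)
    return np.array([r * np.sin(th) * np.cos(ph),
                     r * np.sin(th) * np.sin(ph),
                     r * np.cos(th)])

r0, th0, ph0, eps = 2.0, 0.7, 1.1, 1e-6
h_r  = np.linalg.norm((pos(r0 + eps, th0, ph0) - pos(r0 - eps, th0, ph0)) / (2 * eps))
h_th = np.linalg.norm((pos(r0, th0 + eps, ph0) - pos(r0, th0 - eps, ph0)) / (2 * eps))
h_ph = np.linalg.norm((pos(r0, th0, ph0 + eps) - pos(r0, th0, ph0 - eps)) / (2 * eps))
print(np.round([h_r, h_th, h_ph], 5))     # numerically [1, r, r sin(theta)]
```

The three numbers agree with hr = 1, hθ = r and hφ = r sin θ at the chosen point.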

We now have a formula for ∇ in any orthogonal coordinate system and we can proceed to
derive a formula for ∇2 = ∇ · ∇.

11.3 The Surface Gradient

Before we do this, we introduce the surface gradient, denoted ∇S in order that we may differentiate
functions defined on a surface.

Points (x, y, z), such that

x = f (α, β)

y = g (α, β)

and

z = h (α, β)

lie on a surface, denoted S, on which the curves α constant, β increasing, and α increasing, β
constant, are coordinate curves. The vectors ~rα and ~rβ , tangent to these curves at a point P of S,
ordinarily are not perpendicular. They can be used to determine the unit normal to S at P via

~n = (~rα × ~rβ)/| ~rα × ~rβ|

where

| ~rα × ~rβ|² = | ~rα|² | ~rβ|² − (~rα · ~rβ)²

 
The two sets of vectors ~a, ~b, ~n and ~rα, ~rβ , ~n are biorthogonal if ~a and ~b are given by

~a = (~rβ × ~n)/(~rα · ~rβ × ~n)

and

~b = (~n × ~rα)/(~rα · ~rβ × ~n)

Then if f is defined on S as a function of α and β and ~t is tangent to a curve C on S passing


through P , we have

df/ds = (∂f/∂α)(dα/ds) + (∂f/∂β)(dβ/ds)

where α = α (s) and β = β (s) define the curve and s denotes arc length along the curve.

Due to

~t = d~r/ds = ~rα dα/ds + ~rβ dβ/ds

we have

dα/ds = ~a · ~t, dβ/ds = ~b · ~t

and hence
 
df/ds = ~t · ( ~a ∂/∂α + ~b ∂/∂β ) f

which we write as

df/ds = ~t · ∇S f

where

∇S = ~a ∂/∂α + ~b ∂/∂β

and at points on the surface we have

∇ = ∇S + ~n ∂/∂n

where ∂/∂n denotes differentiation along the normal.

The mean curvature of a surface, denoted by H, is important in some of our later examples and
we record the fact that it can be obtained via

2H = −∇S · ~n

More simply, defining

gαα = ~rα · ~rα , gαβ = ~rα · ~rβ , and gββ = ~rβ · ~rβ ,

bαα = ~n · ~rαα , bαβ = ~n · ~rαβ , and bββ = ~n · ~rββ

we have
 −1  
 g 
αα gαβ   bαα bαβ 
2H = tr 
 g bαβ bββ 
αβ gββ

For example if the surface is defined by

z = Z (x, y)

where x and y are surface coordinates, we then have

~rx = ~i + Zx ~k, ~rxx = Zxx ~k

~ry = ~j + Zy ~k, ~ryy = Zyy ~k

~n = (~k − Zx ~i − Zy ~j)/√(1 + Zx² + Zy²)

and

~rxy = Zxy ~k

whereupon we find

2H = [ (1 + Zy²) Zxx − 2 Zx Zy Zxy + (1 + Zx²) Zyy ] / (1 + Zx² + Zy²)^(3/2)
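The 2H formula can be exercised on a surface whose curvature is known in advance. This sketch (the radius and sample point are arbitrary choices for illustration) evaluates the formula by finite differences for a hemisphere z = Z(x, y) of radius R, for which |2H| = 2/R everywhere:

```python
import math

R = 1.0
def Z(x, y):
    # Upper hemisphere of radius R, written as a graph z = Z(x, y)
    return math.sqrt(R * R - x * x - y * y)

h = 1e-5
x0, y0 = 0.3, 0.2
Zx = (Z(x0 + h, y0) - Z(x0 - h, y0)) / (2 * h)
Zy = (Z(x0, y0 + h) - Z(x0, y0 - h)) / (2 * h)
Zxx = (Z(x0 + h, y0) - 2 * Z(x0, y0) + Z(x0 - h, y0)) / h ** 2
Zyy = (Z(x0, y0 + h) - 2 * Z(x0, y0) + Z(x0, y0 - h)) / h ** 2
Zxy = (Z(x0 + h, y0 + h) - Z(x0 + h, y0 - h)
       - Z(x0 - h, y0 + h) + Z(x0 - h, y0 - h)) / (4 * h ** 2)

twoH = (((1 + Zy ** 2) * Zxx - 2 * Zx * Zy * Zxy + (1 + Zx ** 2) * Zyy)
        / (1 + Zx ** 2 + Zy ** 2) ** 1.5)
print(twoH)   # a sphere of radius R has |2H| = 2/R at every point
```

With the normal ~n = (~k − Zx~i − Zy~j)/√(1 + Zx² + Zy²) the computed value comes out negative, −2/R, reflecting the sign convention 2H = −∇S · ~n.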

11.4 A Formula for ∇2

To derive a formula for ∇2 in the u, v, w coordinate system we first notice that if f is a vector
valued function, say ~v , where

~v = vu~iu + vv~iv + vw~iw

then to calculate the tensor ∇~v where ∇~v can be used to find the derivative of ~v with respect to arc
length along a curve, say a particle path, we write
∇~v = { (1/hu) ~iu ∂/∂u + · · · } { vu ~iu + · · · }

The first term is

(1/hu) ~iu ∂/∂u { vu ~iu } = (1/hu) (∂vu/∂u) ~iu ~iu + (vu/hu) ~iu ∂~iu/∂u

and we see that to get ∇~v we are going to need three derivatives of each of the three base vectors, twenty-seven components in all. These components are of the form

~iξ · ∂~iη/∂ζ

where ξ, η and ζ are any of u, v and w.

First we have

~iξ · ∂~iξ/∂η = 0

due to ~iξ · ~iξ = 1 and then because

∂²~r/∂ξ∂η = ∂²~r/∂η∂ξ

we have

∂(hη ~iη)/∂ξ = ∂(hξ ~iξ)/∂η

hence

(∂hη/∂ξ) ~iη + hη ∂~iη/∂ξ = (∂hξ/∂η) ~iξ + hξ ∂~iξ/∂η

whereupon we obtain

~iη · ∂~iξ/∂η = (1/hξ) ∂hη/∂ξ

and this is all that we need to derive a formula for ∇ · ~v or for ∇² = ∇ · ∇, i.e., many of the terms that appear in ∇~v or in ∇∇ are eliminated by the dot product.

To obtain a formula for ∇² we write

∇² = { (~iu/hu) ∂/∂u + (~iv/hv) ∂/∂v + (~iw/hw) ∂/∂w } · { (~iu/hu) ∂/∂u + (~iv/hv) ∂/∂v + (~iw/hw) ∂/∂w }

where the first term, (1/hu) ~iu ∂/∂u acting on the operator to its right, leads to

(1/hu) ~iu · { ∂(1/hu)/∂u ~iu ∂/∂u + (1/hu) ∂~iu/∂u ∂/∂u + (1/hu) ~iu ∂²/∂u²

+ ∂(1/hv)/∂u ~iv ∂/∂v + (1/hv) ∂~iv/∂u ∂/∂v + (1/hv) ~iv ∂²/∂u∂v

+ ∂(1/hw)/∂u ~iw ∂/∂w + (1/hw) ∂~iw/∂u ∂/∂w + (1/hw) ~iw ∂²/∂u∂w }

= (1/hu) ∂(1/hu)/∂u ∂/∂u + (1/hu²) ∂²/∂u² + (1/(hu hv)) ~iu · ∂~iv/∂u ∂/∂v + (1/(hu hw)) ~iu · ∂~iw/∂u ∂/∂w

= (1/hu) ∂(1/hu)/∂u ∂/∂u + (1/hu²) ∂²/∂u² + (1/(hu hv²)) ∂hu/∂v ∂/∂v + (1/(hu hw²)) ∂hu/∂w ∂/∂w

The terms coming from (1/hv) ~iv · ∂/∂v and (1/hw) ~iw · ∂/∂w can be written by replacing u, v and w by v, w and u, etc., in this formula.

More simplifications are possible. In fact the reader can derive the formula:

∇² = (1/(hu hv hw)) { ∂/∂u [ (hv hw/hu) ∂/∂u ] + ∂/∂v [ (hw hu/hv) ∂/∂v ] + ∂/∂w [ (hu hv/hw) ∂/∂w ] }
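The general formula can be tested against the Cartesian Laplacian. The sketch below (the test function and sample point are arbitrary choices) applies it in spherical coordinates, with hr = 1, hθ = r, hφ = r sin θ, using nested central differences:

```python
import math

# Cartesian test function with a known Laplacian: f = x^2 y + z^3,
# so that del^2 f = 2y + 6z.
def f_cart(x, y, z):
    return x * x * y + z ** 3

def F(r, th, ph):   # the same function expressed in spherical coordinates
    return f_cart(r * math.sin(th) * math.cos(ph),
                  r * math.sin(th) * math.sin(ph),
                  r * math.cos(th))

def laplacian_spherical(r, th, ph, h=1e-4):
    # General orthogonal-coordinate formula with h_r = 1, h_th = r, h_ph = r sin(th)
    def flux_r(r_, th_, ph_):    # (h_th h_ph / h_r) dF/dr
        return (r_ ** 2 * math.sin(th_)
                * (F(r_ + h, th_, ph_) - F(r_ - h, th_, ph_)) / (2 * h))
    def flux_th(r_, th_, ph_):   # (h_ph h_r / h_th) dF/dth
        return (math.sin(th_)
                * (F(r_, th_ + h, ph_) - F(r_, th_ - h, ph_)) / (2 * h))
    def flux_ph(r_, th_, ph_):   # (h_r h_th / h_ph) dF/dph
        return ((1.0 / math.sin(th_))
                * (F(r_, th_, ph_ + h) - F(r_, th_, ph_ - h)) / (2 * h))
    pref = 1.0 / (r ** 2 * math.sin(th))   # 1/(h_r h_th h_ph)
    return pref * (
        (flux_r(r + h, th, ph) - flux_r(r - h, th, ph)) / (2 * h)
        + (flux_th(r, th + h, ph) - flux_th(r, th - h, ph)) / (2 * h)
        + (flux_ph(r, th, ph + h) - flux_ph(r, th, ph - h)) / (2 * h))

r, th, ph = 1.5, 0.8, 0.6
x, y, z = (r * math.sin(th) * math.cos(ph),
           r * math.sin(th) * math.sin(ph), r * math.cos(th))
print(laplacian_spherical(r, th, ph), 2 * y + 6 * z)   # the two agree
```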

But we turn our attention to some examples which indicate a direct way to write ∇2 .

First we introduce cylindrical coordinates defined by

x = r cos θ

y = r sin θ

z=z

[Figure: cylindrical coordinates, showing r, θ and z relative to the Cartesian axes X, Y, Z, with the unit vectors ~ir and ~iz]

whereupon we have

~ir = cos θ ~i + sin θ ~j, hr = 1

~iθ = − sin θ ~i + cos θ ~j, hθ = r

~iz = ~k, hz = 1

and hence we learn that

∂~ir/∂θ = ~iθ, ∂~iθ/∂θ = −~ir

the other seven derivatives being zero. Therefore we have

∇ = ~ir ∂/∂r + ~iθ (1/r) ∂/∂θ + ~iz ∂/∂z

and

∇² = ∂²/∂r² + (1/r) ∂/∂r + (1/r²) ∂²/∂θ² + ∂²/∂z²

where the term (1/r) ∂/∂r appears due to (1/r) ~iθ · ∂(~ir ∂/∂r)/∂θ and where

∂²/∂r² + (1/r) ∂/∂r = (1/r) ∂/∂r ( r ∂/∂r )
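A quick check of the cylindrical formula (the sample points are arbitrary): the function r² cos 2θ, which is x² − y², is harmonic, so the formula must return zero:

```python
import math

# f = r^2 cos(2*theta), i.e. x^2 - y^2, which is harmonic
def f(r, th):
    return r * r * math.cos(2 * th)

def lap_cyl(r, th, h=1e-5):
    # del^2 = d2/dr2 + (1/r) d/dr + (1/r^2) d2/dth2  (no z dependence here)
    frr = (f(r + h, th) - 2 * f(r, th) + f(r - h, th)) / h ** 2
    fr = (f(r + h, th) - f(r - h, th)) / (2 * h)
    ftt = (f(r, th + h) - 2 * f(r, th) + f(r, th - h)) / h ** 2
    return frr + fr / r + ftt / r ** 2

print(lap_cyl(1.7, 0.4), lap_cyl(0.9, 2.1))   # both ~ 0
```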

Second, spherical coordinates are defined by

x = r sin θ cos φ, 0 ≤ φ ≤ 2π, 0≤θ≤π

y = r sin θ sin φ

z = r cos θ

[Figure: spherical coordinates, showing r, θ and φ relative to the Cartesian axes, with the unit vector ~ir]

and we have

~ir = sin θ cos φ ~i + sin θ sin φ ~j + cos θ ~k, hr = 1

~iθ = cos θ cos φ ~i + cos θ sin φ ~j − sin θ ~k, hθ = r

~iφ = − sin φ ~i + cos φ ~j, hφ = r sin θ

and we see that

∂~ir/∂θ = ~iθ, ∂~ir/∂φ = sin θ ~iφ,

∂~iθ/∂θ = −~ir, ∂~iθ/∂φ = cos θ ~iφ,

∂~iφ/∂φ = − sin θ ~ir − cos θ ~iθ

the remaining four derivatives being zero.

Hence we have

∇ = ~ir ∂/∂r + ~iθ (1/r) ∂/∂θ + ~iφ (1/(r sin θ)) ∂/∂φ

whereupon

~ir ∂/∂r · ∇ = ∂²/∂r²

and

(1/r) ~iθ ∂/∂θ · ∇ = (1/r) ∂/∂r + (1/r²) ∂²/∂θ²

All of the interesting terms come out of

(1/(r sin θ)) ~iφ ∂/∂φ · ∇ = (sin θ/(r sin θ)) ∂/∂r + (cos θ/(r sin θ)) (1/r) ∂/∂θ + (1/(r² sin² θ)) ∂²/∂φ²

and we find

∇² = (1/r²) ∂/∂r ( r² ∂/∂r ) + (1/(r² sin θ)) ∂/∂θ ( sin θ ∂/∂θ ) + (1/(r² sin² θ)) ∂²/∂φ²

Now we have formulas for ∇2 in three coordinate systems and enough information to write ∇2
in any other orthogonal coordinate system. Given a coordinate system it is often easier to proceed
to ∇2 directly via the base vectors and their derivatives than it is to use general formulas.

For axisymmetric flow of an incompressible fluid, i.e., a flow where the velocity components
are the same in every plane through the z-axis we can write

~v = (1/r) ~iθ × ∇ψ (r, z)

in cylindrical coordinates and

~v = (1/(r sin θ)) ~iφ × ∇ψ (r, θ)

in spherical coordinates.

Hence in cylindrical coordinates we have


   
∇ × ~v = (1/r) { r ∂/∂r [ (1/r) ∂ψ/∂r ] + ∂²ψ/∂z² } ~iθ = (1/r) E²ψ ~iθ

and the reader ought to work out the corresponding formula in spherical coordinates, viz.,

∇ × ~v = (1/(r sin θ)) E²ψ ~iφ

learning, in the course of the calculation, a formula for E 2 in spherical coordinates.

11.5 Domain Perturbations

Our aim is to solve problems on a domain D and to do this in terms of the eigenfunctions of ∇2 on
the domain we must solve the eigenvalue problem

∇² ψ + λ² ψ = 0 on D

where, say, ψ = 0 on the boundary of D. We now know how to write ∇2 in a variety of orthogonal
coordinate systems and if one of these coordinate systems fits our needs we will try to solve our
eigenvalue problem in this coordinate system.

But there are many domains where we cannot do this and for some of these, those close to a domain where our methods will work, we have an option.

Suppose our domain D lies close to a domain D0 on which we are able to solve our problem.
We call D0 the reference domain and we wish to know what problems to solve on D0 in order to
estimate the solution to our problem on D.

To sketch the main idea, we assume D and D0 are two dimensional and in a Cartesian coordi-
nate system we denote the points of D by (x, y) and those of D0 by (x0 , y0).

We imagine a family of domains, Dε, growing out of D0 , one of these being D. Then points
(x, y, ε) ∈ Dε grow out of points (x0 , y0) ∈ D0 via

x = f (x0 , y0, ε) , y = g (x0 , y0 , ε)

where the boundary points of Dε, viz., y = Y (x, ε) = g (x0 , Y0 (x0 ) , ε), grow out of the corresponding boundary points of D0 , viz., y0 = Y0 (x0 ).

We simplify our mapping of D0 into Dε to

x = x0 , y = g (x0 , y0 , ε)

and we first expand y = g (x0 , y0 , ε), viz.,

g (x0 , y0 , ε) = g (x0 , y0 , ε = 0) + ε dg/dε (x0 , y0 , ε = 0) + (1/2) ε² d²g/dε² (x0 , y0 , ε = 0) + · · ·

where

g (x0 , y0, ε = 0) = y0

and we define

y1 = dg/dε (x0 , y0 , ε = 0)

y2 = d²g/dε² (x0 , y0 , ε = 0)
etc.

Thus our mapping of D0 into Dε can be written

x = x0

y = y0 + ε y1 (x0 , y0 ) + (1/2) ε² y2 (x0 , y0 ) + · · ·
on the domain and

x = x0

y = Y (x, ε) = g (x0 , Y0 (x0 ) , ε) = Y0 (x0 ) + ε Y1 (x0 ) + (1/2) ε² Y2 (x0 ) + · · ·
on its boundary where Y0 (x0 ) defines the boundary of D0 , where

 
Y1 (x0 ) = y1 (x0 , Y0 (x0 )) , Y2 (x0 ) = y2 (x0 , Y0 (x0 )) , etc.

and where Y0 , Y1 , Y2 , · · · define the boundary of Dε.

Now our problem is to find a function, denoted u, defined on a domain Dε. Assuming the
equations satisfied by u on Dε have the same form for all values of ε, we begin by expanding
u (x, y, ε) along the mapping where all derivatives along the mapping hold x0 and y0 fixed.

Thus expanding u in powers of ε along the mapping of D0 into Dε we have:

u (x, y, ε) = u (x = x0 , y = y0 , ε = 0) + ε du/dε (x = x0 , y = y0 , ε = 0) + (1/2) ε² d²u/dε² (x = x0 , y = y0 , ε = 0) + · · ·

where, in du/dε, u depends on x, y and ε, and x0 and y0 are held fixed.

Hence we have, using the chain rule,

du/dε (x, y, ε) = ∂u/∂ε (x, y, ε) + ∂u/∂y (x, y, ε) dy/dε (x0 , y0 , ε)

and therefore

du/dε (x = x0 , y = y0 , ε = 0) = u1 (x0 , y0 ) + y1 (x0 , y0 ) ∂u0/∂y0 (x0 , y0 )

where

u1 (x0 , y0 ) = ∂u/∂ε (x = x0 , y = y0 , ε = 0)

y1 (x0 , y0 ) = dg/dε (x0 , y0 , ε = 0)

and

∂u0/∂y0 (x0 , y0 ) = ∂u/∂y (x = x0 , y = y0 , ε = 0)

Likewise we have

d²u/dε² (x = x0 , y = y0 , ε = 0) = u2 (x0 , y0 ) + 2 y1 (x0 , y0 ) ∂u1/∂y0 (x0 , y0 ) + y1² (x0 , y0 ) ∂²u0/∂y0² (x0 , y0 ) + y2 (x0 , y0 ) ∂u0/∂y0 (x0 , y0 )

where

u2 (x0 , y0 ) = ∂²u/∂ε² (x = x0 , y = y0 , ε = 0)

Thus we write our expansion of u as

u (x, y, ε) = u0 + ε { u1 + y1 ∂u0/∂y0 } + (1/2) ε² { u2 + 2 y1 ∂u1/∂y0 + y1² ∂²u0/∂y0² + y2 ∂u0/∂y0 } + · · ·

where u0 , u1 , y1 , etc. are all evaluated at (x0 , y0) in D0 .


To obtain the expansion of ∂u/∂y we differentiate the RHS of the expansion of u with respect to y holding x and ε fixed, where x0 and y0 are obtained in terms of x, y and ε via

x = x0 , y = y0 + ε y1 (x0 , y0 ) + (1/2) ε² y2 (x0 , y0 ) + · · ·
2

and we find

∂u (x, y, ε)/∂y = ∂u0/∂y0 + ε { ∂u1/∂y0 + y1 ∂²u0/∂y0² } + (1/2) ε² { ∂u2/∂y0 + 2 y1 ∂²u1/∂y0² + y1² ∂³u0/∂y0³ + y2 ∂²u0/∂y0² } + · · ·

Likewise we have

∂u (x, y, ε)/∂x = ∂u0/∂x0 + ε { ∂u1/∂x0 + y1 ∂²u0/∂y0 ∂x0 } + (1/2) ε² { ∂u2/∂x0 + 2 y1 ∂²u1/∂y0 ∂x0 + y1² ∂³u0/∂y0² ∂x0 + y2 ∂²u0/∂y0 ∂x0 } + · · ·

The reader may notice that only y1 , y2 , . . . appear in these two formulas, not their derivatives. To see how the derivatives are lost, the algebra must be worked out. The main idea is to replace ∂y0/∂y with 1 − ε ∂y1/∂y − (1/2) ε² ∂y2/∂y − · · · in the derivation of the formula for ∂u/∂y.
Now to derive the equations for u0 , u1 , u2, . . . on the reference domain, we substitute our ex-
pansions for u and its derivatives into the equation satisfied by u. Doing this we discover that the
mappings do not survive. For example, if our problem is

∇2 u = f on Dε

we substitute

∂²u/∂x² = ∂²u0/∂x0² + ε { ∂²u1/∂x0² + y1 (∂/∂y0) (∂²u0/∂x0²) } + · · ·

∂²u/∂y² = ∂²u0/∂y0² + ε { ∂²u1/∂y0² + y1 (∂/∂y0) (∂²u0/∂y0²) } + · · ·

and

f = f0 + ε { f1 + y1 ∂f0/∂y0 } + · · ·

to obtain

∇0² u0 + ε { ∇0² u1 + y1 (∂/∂y0) ∇0² u0 } + · · · = f0 + ε { f1 + y1 ∂f0/∂y0 } + · · ·

whereupon we have, on D0,

∇0² u0 = f0

∇0² u1 = f1

etc.

and the conclusion is this: the equations for u0 , u1, u2 on D0 , i.e., on the reference domain, can be
derived by ordinary methods from the equation for u on Dε. The mapping of D0 into Dε does not
appear, and we are grateful, because we can never know what it is.
Thus we can substitute u = u0 + ε u1 + (1/2) ε² u2 + · · · into the equation for u and set to zero terms of order zero, one, two, etc. while paying no attention to the fact that Dε is not D0 . By doing this we obtain the equations for u0 , u1 , u2 , . . . on D0 .

The boundary is different. For example suppose u = 0 must be satisfied at y = Y (x) in the foregoing problem. Then using the expansion

u (x, Y (x) , ε) = u0 + ε { u1 + Y1 ∂u0/∂y0 } + · · ·

we obtain at y0 = Y0 (x0 )

u0 = 0

u1 + Y1 ∂u0/∂y0 = 0

etc.

And we see that in the boundary conditions the displacement of D0 into Dε appears.

To obtain u (x, y, ε) from u0 (x0 , y0 ), u1 (x0 , y0 ), etc., not knowing the mapping, we rearrange

the expansion

u (x, y, ε) = u0 (x0 , y0 ) + ε { u1 + y1 ∂u0/∂y0 } + (1/2) ε² { u2 + 2 y1 ∂u1/∂y0 + y1² ∂²u0/∂y0² + y2 ∂u0/∂y0 } + · · ·

using

y − y0 = ε y1 + (1/2) ε² y2 + · · ·
2

and

u0 (x, y) = u0 (x0 , y0 ) + ∂u0/∂y0 (x0 , y0 ) (y − y0 ) + (1/2) ∂²u0/∂y0² (x0 , y0 ) (y − y0 )² + · · ·

etc.

and conclude

u (x, y, ε) = u0 (x, y) + ε u1 (x, y) + (1/2) ε² u2 (x, y) + · · ·

and by this we obtain u at all points (x, y) of Dε which are also points of D0 .

Now to solve our eigenvalue problem

∇² ψ + λ² ψ = 0 on D

ψ = 0 on y = Y (x)

we solve

∇² ψ0 + λ0² ψ0 = 0 on D0

ψ0 = 0 on y0 = Y0 (x0 )

∇² ψ1 + λ0² ψ1 = −λ1² ψ0 on D0

ψ1 + Y1 ∂ψ0/∂y0 = 0 on y0 = Y0 (x0 )

∇² ψ2 + λ0² ψ2 = −λ2² ψ0 − 2 λ1² ψ1 on D0

ψ2 + 2 Y1 ∂ψ1/∂y0 + Y1² ∂²ψ0/∂y0² + Y2 ∂ψ0/∂y0 = 0 on y0 = Y0 (x0 )

etc.

And we notice that the homogeneous part of every problem has a solution, not zero. Hence a solvability condition must be satisfied at every order and it will determine λ1², λ2², . . . , the corrections to λ0². The displacement of the boundary, given by Y1 , Y2 , . . . , will appear in the solvability conditions.
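A one dimensional analog (assumed here for illustration, not taken from the text) shows the machinery at work: ψ'' + λ²ψ = 0 on (0, 1 + ε) with ψ vanishing at both ends, perturbed about the reference interval (0, 1). Here ψ0 = sin πx, λ0² = π², the boundary displacement is Y1 = 1, and the solvability condition gives λ1² = −2π², which can be compared with the exact eigenvalue π²/(1 + ε)²:

```python
import math

# Exact lowest eigenvalue on the stretched interval (0, 1+eps)
def exact(eps):
    return math.pi ** 2 / (1 + eps) ** 2

# First-order perturbation estimate: lam0^2 + eps*lam1^2 with lam1^2 = -2 pi^2
def first_order(eps):
    return math.pi ** 2 * (1 - 2 * eps)

for eps in (0.1, 0.01):
    err = abs(exact(eps) - first_order(eps))
    print(eps, err, err / eps ** 2)   # the error shrinks like eps^2
```

The quadratic decay of the error confirms that the solvability condition has produced the correct first correction to λ0².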

Domain perturbations would be useful if we have a heavy fluid lying above a light fluid in a container of arbitrary cross section and we wish to learn if the interface is stable to a small displacement. Hopefully, the arbitrary cross section is close to, say, a circle, and the displaced interface is close to, say, a plane.

More details can be found in the book "Interfacial Instability" by L. E. Johns and R. Narayanan.

11.6 Home Problems

1. Suppose ~v satisfies ∇ · ~v = 0 and ~v

(a) is tangent to the family of planes z = const,


or

(b) is tangent to the family of planes passing through the z–axis



In (a) we can write

~v = ~k × ∇ψ (x, y)

whereas in (b) we can write

~v = (1/r) ~iθ × ∇ψ (r, z) (cylindrical, r² = x² + y²)

or

~v = (1/(r sin θ)) ~iφ × ∇ψ (r, θ) (spherical, r² = x² + y² + z²)

Introducing orthogonal coordinates ξ, η, derive formulas for vξ and vη in terms of ψ (ξ, η).

In (a) you have

x = f (ξ, η) , y = g (ξ, η)

whereas in (b) you have either

r = f (ξ, η) , z = g (ξ, η)

or

r = f (ξ, η) , θ = g (ξ, η)

Derive ~v · ∇ψ = 0 and conclude that ~v is tangent to surfaces ψ = const.

2. You are to write out the terms appearing in the Navier-Stokes equation. Do this in cylindrical
coordinates where

~v = vr~ir + vθ~iθ + vz~iz



and where

∂~ir/∂θ = ~iθ and ∂~iθ/∂θ = −~ir

the other seven derivatives being zero. Write out

∇~v

∇ · ~v

∇ × ~v

~v · ∇~v

∇ · ∇~v = ∇2~v

∇ · (∇~v )T = ∇ (∇ · ~v )

3. Denote by r, θ and z cylindrical coordinates, viz.,

x = r cos θ

y = r sin θ

where r 2 = x2 + y 2 and define coordinate systems of revolution by

r = r (ξ, η) , z = z (ξ, η)

so that

x = r (ξ, η) cos θ

y = r (ξ, η) sin θ

z = z (ξ, η)

For example, spherical coordinates obtain if θ is replaced by φ and

r (ξ, η) = ξ sin η

z (ξ, η) = ξ cos η

where ξ² = x² + y² + z² and cos η = ~k · ~r/ξ

Prolate and oblate spheroidal coordinates are defined by

r = c sinh ξ sin η

z = c cosh ξ cos η

and

r = c cosh ξ sin η

z = c sinh ξ cos η

where 0 ≤ ξ ≤ ∞, 0 ≤ η ≤ π.

Show that prolate and oblate spheroidal coordinates are orthogonal, write ∇2 and reduce
(∇2 + λ2 ) ψ = 0 to ordinary differential equations by separation of variables.

4. Show that ~v = (1/T) (x~i − y ~j) and p = −(1/2) ρ (x² + y²)/T² satisfy

ρ~v · ∇~v = −∇p + µ∇2~v

and

∇ · ~v = 0

Notice that

~v = | ~v | ~t

where, denoting by κ the curvature of a particle path, we have

~a = ~v · ∇~v = | ~v | ~t · ∇ ( | ~v | ~t ) = | ~v | d( | ~v | ~t )/ds = | ~v | (d| ~v |/ds) ~t + | ~v |² d~t/ds

and where d~t/ds = κ ~p and ~p · ~t = 0.
ds

5. Denote by ~m a magnetic dipole. The vector potential and the magnetic induction due to ~m are:

~A = (µ0/4π) ( ~m × ~r )/| ~r |³


and

~B = ∇ × ~A

Assume ~m = m ~k and find ~B.
Lecture 12

Diffusion in Unbounded Domains

To learn something about diffusion without doing very much, we present some diffusion problems
in unbounded domains.

12.1 Power Moments of an Evolving Solute Concentration Field: The Formula D = (1/2) dσ²/dt
An easy way to get information about the solution to the diffusion equation without actually solving
the equation is to determine the spatial power moments of the concentration field. As spatial
derivatives in the diffusion equation lower the order of the power moments, a sequence of ordinary
differential equations results that can be solved recursively and the first few moments tell us some
interesting things about diffusion. To see how this works we begin with a one dimensional problem.
Suppose that a solute is distributed at t = 0 over −∞ < x < ∞ according to the smooth function
c(t = 0) where c(t = 0) = 0 for | x| > a, a < ∞. Then, for t > 0, the solute distribution satisfies

∂c/∂t = D ∂²c/∂x², −∞ < x < ∞

and we only need c to vanish strongly enough as | x| → ∞ so that all of its moments are finite.


Defining the power moments of c via

cm (t) = ∫_{−∞}^{+∞} x^m c(x, t) dx, m = 0, 1, 2, . . .

we can derive the equation satisfied by cm (t) by multiplying the diffusion equation by xm and
integrating the result over all x. Simplifying the right hand side by integration by parts and setting
all the terms evaluated at ±∞ to zero we get

dcm/dt = D m(m − 1) cm−2 , m = 0, 1, 2, . . .

Indeed, for m = 0, 1 and 2, we have

dc0/dt = 0

dc1/dt = 0

and

dc2/dt = 2 D c0

whence, for t > 0, we obtain

c0 = c0 (t = 0)

c1 = c1 (t = 0)

and

c2 = c2 (t = 0) + 2Dtc0 (t = 0)

The first result expresses the fact that solute is neither gained nor lost in diffusion, it is just
redistributed. The second and third results tell us something about this redistribution. If we think
of c at any time t as the result of an experiment designed to determine the spatial positions of the solute molecules at that time then (c(x, t)/c0) dx is the probability that the measurement of a molecule’s position falls between x and x + dx at time t and hence we have

⟨x⟩ = ∫_{−∞}^{+∞} x (c(x, t)/c0) dx = c1/c0

and

⟨x²⟩ = ∫_{−∞}^{+∞} x² (c(x, t)/c0) dx = c2/c0

where ⟨ ⟩ denotes the average or expected value of a function of x, weighted by the solute density. So c1 (t) and c2 (t) determine the expected values of x and x² in a simple diffusion experiment and we see ⟨x⟩ is fixed while ⟨x²⟩ increases linearly in time.

The variance of the solute distribution, i.e., the average of (x − hxi)2 , tells us about the spread-
ing of the solute, and as this is

σ² = ⟨ (x − ⟨x⟩)² ⟩

= ⟨x²⟩ − ⟨x⟩²

= c2/c0 − (c1/c0)²

we find that

σ 2 = σ 2 (t = 0) + 2Dt

This is a formula for D in terms of the variance of the solute concentration as independent
solute molecules spread out in a solvent, viz.,

D = (1/2) dσ²/dt = (1/(2 c0)) dc2/dt

It can be used to determine the value of D if measurements of c2 (t) can be made.
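The formula can also be watched at work in a finite-difference solution of the diffusion equation (the grid, time step and initial box profile below are assumptions for illustration); the discrete second moment grows at exactly 2Dc0 per unit time:

```python
# Explicit finite-difference diffusion on a 1D grid; the variance of the
# profile grows at the rate d(sigma^2)/dt = 2D.
D, dx, dt = 1.0, 0.1, 0.002          # dt < dx^2/(2D) for stability
n = 401
x = [(i - n // 2) * dx for i in range(n)]
c = [1.0 if abs(xi) <= 0.5 else 0.0 for xi in x]   # initial box of solute

def moments(c):
    c0 = sum(c)
    c1 = sum(xi * ci for xi, ci in zip(x, c))
    c2 = sum(xi * xi * ci for xi, ci in zip(x, c))
    return c0, c1 / c0, c2 / c0 - (c1 / c0) ** 2   # mass, <x>, sigma^2

c0_start, mean_start, var_start = moments(c)
alpha = D * dt / dx ** 2
steps = 500                                        # advance to t = 1.0
for _ in range(steps):
    c = ([c[0]] +
         [c[i] + alpha * (c[i + 1] - 2 * c[i] + c[i - 1])
          for i in range(1, n - 1)] + [c[-1]])
c0_end, mean_end, var_end = moments(c)
t = steps * dt
print(var_end - var_start, 2 * D * t)   # these agree: mass is conserved
                                        # and sigma^2 grows by 2*D*t
```

The agreement is not approximate: the central second difference shifts the discrete second moment by exactly 2D dt c0 per step, mirroring the integration by parts above.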

All this carries over to three dimensional problems where a solute distribution is assigned at

t = 0 according to c(t = 0) and then spreads out for t > 0 according to


 
∂c/∂t = D ( ∂²c/∂x² + ∂²c/∂y² + ∂²c/∂z² ) = D ∇²c

The power moments of c defined by


Z +∞ Z +∞ Z +∞
cℓmn = xℓ y m z n c(x, y, z, t) dx dy dz
−∞ −∞ −∞

then satisfy

dcℓmn/dt = D ℓ(ℓ − 1) c(ℓ−2)mn + D m(m − 1) cℓ(m−2)n + D n(n − 1) cℓm(n−2)

and again this set of equations can be solved recursively. And a formula for D in three dimensional
diffusion in terms of

  
σ² = ( ⟨x²⟩ − ⟨x⟩² ) + ( ⟨y²⟩ − ⟨y⟩² ) + ( ⟨z²⟩ − ⟨z⟩² )

can be obtained.

Suppose now that an initial solute distribution not only spreads out by diffusion but is also
displaced by a flow field. Here too, we can get useful information about what is going on by
looking at the moments of the solute distribution. To do the simplest example we suppose that an
initial solute distribution in two dimensions is being displaced by a stagnation flow such as

~v = (x/T) ~i − (y/T) ~j

whose tendency is to stretch the solute out in the x direction while compressing it in the y direction, so that as t increases the solute spreads into an increasingly elongated distribution.

Then, assuming c(t = 0) is assigned, we have for c(t > 0)

∂c/∂t = D ∇²c − ~v · ∇c

and using ~v = (x/T) ~i − (y/T) ~j this is

∂c/∂t = D ∂²c/∂x² + D ∂²c/∂y² − (x/T) ∂c/∂x + (y/T) ∂c/∂y

where T is a time constant of the flow.

Defining the power moments of c via

cmn = ∫_{−∞}^{+∞} ∫_{−∞}^{+∞} x^m y^n c(x, y, t) dx dy

we can determine the equations satisfied by the moments just as we did before and here we find

dcmn/dt = D m(m − 1) c(m−2)n + D n(n − 1) cm(n−2) + ((m − n)/T) cmn

Again the equations can be solved recursively and doing this we find that the moment equations

dc00/dt = 0

dc10/dt = (1/T) c10

dc01/dt = −(1/T) c01

dc11/dt = 0

dc20/dt = 2 D c00 + (2/T) c20

and

dc02/dt = 2 D c00 − (2/T) c02

lead to

c00 = c00 (t = 0)

c10 = c10 (t = 0) e^{t/T}

c01 = c01 (t = 0) e^{−t/T}

c11 = c11 (t = 0)

c20 = e^{2t/T} c20 (t = 0) + T D c00 (t = 0) ( e^{2t/T} − 1 )

and

c02 = e^{−2t/T} c02 (t = 0) − T D c00 (t = 0) ( e^{−2t/T} − 1 )

Using these formulas we can determine averages and variances, viz.,

⟨x⟩ = c10/c00 , ⟨x²⟩ = c20/c00 , σx² = ⟨x²⟩ − ⟨x⟩² = c20/c00 − (c10/c00)² , etc.

and the results are

⟨x⟩ = ⟨x⟩(t = 0) e^{t/T}

⟨y⟩ = ⟨y⟩(t = 0) e^{−t/T}

σx² = σx²(t = 0) e^{2t/T} + T D ( e^{2t/T} − 1 )

and

σy² = σy²(t = 0) e^{−2t/T} − T D ( e^{−2t/T} − 1 )

This tells us that as time grows large σy2 achieves a constant value T D. This value expresses the
balance between the tendency of diffusion to increase σy2 and the tendency of the flow to decrease
it. The tendency of the flow is to carry the solute toward the line y = 0 building concentration
gradients there until ultimately diffusion in the y direction can just offset this.

The reader can use this method to investigate the displacement of a solute concentration field by a simple shearing flow: vx = (1/T) y, vy = 0.
T
The reader can also do two simple calculations in the spherically symmetric case where

∂c/∂t = D (1/r²) ∂/∂r ( r² ∂c/∂r )

Define c0 and c2 via

c0 = ∫_0^∞ c 4πr² dr

and

c2 = ∫_0^∞ r² c 4πr² dr

and derive

dc0/dt = 0

and

dc2/dt = 6 D c0

Now suppose we have one dimensional diffusion in an inhomogeneous region so that D = D(x); then c, the solute concentration, satisfies

∂c/∂t = ∂/∂x ( D(x) ∂c/∂x )
whereupon c0 (t) = ∫_{−∞}^{+∞} c dx satisfies

dc0/dt = 0
and c1 (t) = ∫_{−∞}^{+∞} x c dx satisfies

dc1/dt = ∫_{−∞}^{+∞} D′(x) c dx

whence we have

d⟨x⟩/dt = ⟨D′(x)⟩

where ⟨x⟩ = c1/c0 .
Thus we see that solute no longer redistributes about a fixed point. And moving on to solute spreading we observe that

c2 = ∫_{−∞}^{+∞} x² c dx

satisfies

dc2/dt = 2 ∫_{−∞}^{+∞} (x D(x))′ c dx

whereupon

dσ²/dt = (1/c0) dc2/dt − 2 (c1/c0²) dc1/dt

= 2 ⟨(x D(x))′⟩ − 2 ⟨x⟩ d⟨x⟩/dt

= 2 ⟨D(x)⟩ + 2 ( ⟨x D′(x)⟩ − ⟨x⟩ ⟨D′(x)⟩ )

Now these formulas certainly have the expected form and we could use them if we could write
c(x, t) in terms of c0 (t), c1 (t), etc. but that is not so easy.

12.2 Chromatographic Separations

Carrying on with our plan to try to learn something about diffusion without doing much, we turn
to a problem in which we discover diffusion where at first there would appear to be none.

A solute is injected into a carrier gas flowing through a packed column where the solute is
adsorbed by the packing. As the solute moves through our column it is adsorbed at its leading
edge, desorbed at its trailing edge. The solute concentration in the carrier phase and in the solid
phase are denoted by c and by a and the dilute solute equilibrium isotherm is assumed to be

c = ma where m is constant

The smaller the value of m, the more strongly the solute is bound to the solid.

We denote the volumetric flow rate of the carrier by G, the cross sectional area of the empty column by A and the porosity of the bed by ε. Then v0 = G/A denotes the superficial velocity of the carrier and v = v0/ε denotes its interstitial velocity.

We can write a simple model, assuming no variation of c and a on the cross section. It is

ε ∂c/∂t + v0 ∂c/∂z = −K (c − ma)

and

(1 − ε) ∂a/∂t = K (c − ma)

where the volumetric mass transfer coefficient is denoted by K and where [K] = 1/time.
The bed is assumed to be infinitely long and at time zero a finite amount of solute is injected
into the carrier near z = 0.

Our model equations can be solved, but we do not do so. Instead, we will see what we can
learn by solving the moment equations.

The zeroth moments of c and a, viz.,

c0 = ∫_{−∞}^{+∞} c dz, a0 = ∫_{−∞}^{+∞} a dz

satisfy

ε dc0/dt = −K (c0 − m a0)

and

(1 − ε) da0/dt = K (c0 − m a0)

By adding these equations, differentiating the first and eliminating a0 we obtain

d²c0/dt² + K ( 1/ε + m/(1 − ε) ) dc0/dt = 0

whence we have

c0 = A0 + B0 e^{−tK(1/ε + m/(1−ε))}

and, as t → ∞,

c0 → A0

dc0/dt → 0

and

a0 → c0/m

The first order moments of c and a, viz.,

c1 = ∫_{−∞}^{+∞} z c dz, a1 = ∫_{−∞}^{+∞} z a dz

satisfy

ε dc1/dt − v0 c0 = −K (c1 − m a1)

and

(1 − ε) da1/dt = K (c1 − m a1)

Again, adding these equations, differentiating the first and eliminating a1 , we have

d²c1/dt² + K ( 1/ε + m/(1 − ε) ) dc1/dt = K (v0/ε) (m/(1 − ε)) c0 + (v0/ε) dc0/dt

Hence, at times long after solute injection, we must solve

d²c1/dt² + K ( 1/ε + m/(1 − ε) ) dc1/dt = K (v0/ε) (m/(1 − ε)) A0

whereupon we obtain

c1 = A1 + B1 e^{−tK(1/ε + m/(1−ε))} + [ (v0/ε) (m/(1 − ε)) / ( 1/ε + m/(1 − ε) ) ] A0 t

The average distance traversed by the solute in the carrier phase, measured from the injection point, z = 0, and denoted ⟨z⟩, viz.,

⟨z⟩ = ∫_{−∞}^{+∞} z c dz / ∫_{−∞}^{+∞} c dz = c1/c0

advances, at times long after solute injection, according to

d⟨z⟩/dt = (1/c0) dc1/dt = (v0/ε) (m/(1 − ε)) / ( 1/ε + m/(1 − ε) )

and we call d⟨z⟩/dt the speed at which the solute in the carrier phase proceeds through the bed.
For large m, weakly bound solute, we have

d⟨z⟩/dt = v0/ε ,

where v0/ε is the carrier speed, whereas for small m, strongly bound solute, we have

d⟨z⟩/dt = (v0/ε) ε m/(1 − ε)

Thus the ratio of the speeds of two strongly bound solutes is the ratio of their m’s. The mass
transfer coefficient does not appear in these formulas.

The reader may wish to find the speed of the solute in the solid phase.
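The moment equations can also be integrated numerically and the long-time speed compared with the formula above, which rearranges to d⟨z⟩/dt = v0 m/(εm + 1 − ε). The parameter values in this sketch are arbitrary choices:

```python
# Euler integration of the zeroth- and first-moment equations of the
# chromatography model; the long-time pulse speed should match
# d<z>/dt = v0*m/(eps*m + 1 - eps).
eps, m, K, v0 = 0.4, 0.5, 10.0, 1.0
c0, a0, c1, a1 = 1.0, 0.0, 0.0, 0.0   # all solute initially in the carrier, at z = 0
dt = 1e-4
for _ in range(200000):               # integrate to t = 20, >> the relaxation time
    r0 = K * (c0 - m * a0)
    r1 = K * (c1 - m * a1)
    c0 += dt * (-r0) / eps
    a0 += dt * r0 / (1 - eps)
    c1 += dt * (v0 * c0 - r1) / eps
    a1 += dt * r1 / (1 - eps)
# numerical speed d<z>/dt = (1/c0) dc1/dt once c0 has leveled off
speed_num = (v0 * c0 - K * (c1 - m * a1)) / (eps * c0)
speed_formula = v0 * m / (eps * m + 1 - eps)
print(speed_num, speed_formula)
```

Note that K drops out of the long-time speed, exactly as the text observes: the mass transfer coefficient sets how fast the pulse settles onto its traveling speed, not the speed itself.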

What is interesting about this problem is that we have a transverse distribution of longitudinal speeds, simple though it may be, viz., v0/ε in the carrier, zero in the solid, and we have solute that can move between the regions of different speed. This is a recipe for solute spreading in the flow direction and, therefore, a hint that a diffusion model may be appropriate.

To see this we look at second moments of c and a defined by

c2 = ∫_{−∞}^{+∞} z² c dz, a2 = ∫_{−∞}^{+∞} z² a dz

These moments satisfy

ε dc2/dt − 2 v0 c1 = −K (c2 − m a2)

and

(1 − ε) da2/dt = K (c2 − m a2)

By eliminating a2 we obtain

d²c2/dt² + K ( 1/ε + m/(1 − ε) ) dc2/dt = 2 (v0/ε) dc1/dt + 2 (v0/ε) K (m/(1 − ε)) c1

whence, upon using

c1 = A1 + [ (v0/ε) (m/(1 − ε)) / ( 1/ε + m/(1 − ε) ) ] A0 t = A1 + const A0 t

we have, for times long after solute injection,

d²c2/dt² + K ( 1/ε + m/(1 − ε) ) dc2/dt = 2 (v0/ε) const A0 + 2 (v0/ε) K (m/(1 − ε)) { A1 + const A0 t }

Now we introduce σ², the longitudinal variance of the solute distribution in the carrier, where

σ² = ∫_{−∞}^{+∞} (z − ⟨z⟩)² c dz / ∫_{−∞}^{+∞} c dz = c2/c0 − (c1/c0)²

and, therefore, we have, at long time,

dσ²/dt = (1/c0) dc2/dt − 2 (c1/c0²) dc1/dt

where dσ²/dt tells us the rate of spreading of the solute. If σ² is a multiple of t, or dσ²/dt is a constant, this constant defines the longitudinal diffusion coefficient. The long time value of dσ²/dt can be worked out and after some algebra we obtain

dσ²/dt = 2 (v0/ε)² (m/(1 − ε)) / [ K ε ( 1/ε + m/(1 − ε) )³ ]

which, for strongly bound solute, is

dσ²/dt = 2 ( (v0/ε)²/K ) ε² m/(1 − ε)

Hence, the larger the value of K and the smaller the value of m, the more the solute hangs
together. Indeed, the larger the value of K, the less the effect of the velocity gradient, and the
smaller the value of m, the stronger the solute is bound to the solid.

The reader ought to derive these formulas. Only a particular solution of the c2 equation is needed.

In order for the spreading process to be diffusive, the terms in (1/c0) dc2/dt and in 2 (c1/c0²) dc1/dt that are multiples of t must cancel, otherwise dσ²/dt will be a multiple of t and in that case the spreading is called ballistic, not diffusive.

This problem serves as an introduction to the problem called Taylor dispersion presented in
Lecture 20.

12.3 Random Walk Model

The diffusion equation tells us that

⟨x⟩ = 0

and

⟨x²⟩ = 2 D t

assuming all the solute particles start at x = 0 and diffuse in the ±x directions.

The probability in a simple random walk on a one-dimensional lattice satisfies the difference
equation

P (N1 , N2 ; N) = p P (N1 − 1, N2 ; N − 1) + q P (N1 , N2 − 1 ; N − 1)

where P (N1 , N2 ; N) is the probability that in N steps, N1 steps are to the right, N2 steps are to
the left and where, at each step, p is the probability that it is to the right, q the probability that it is
to the left. We must have N1 + N2 = N and p + q = 1. After N steps the possible values of N1
are 0, 1, 2, . . . , N.

The solution to this difference equation is

P (N1 , N2 ; N) = ( N! / (N1! N2!) ) p^{N1} q^{N2}

where N1 + N2 = N and p + q = 1.

This formula is called the binomial distribution. It tells us the probability of N1 successes in N trials where the trials are independent and where p denotes the probability of a success in any trial.

Now, due to

Σ_{N1} Σ_{N2} P (N1 , N2 ; N) = (p + q)^N , N1 + N2 = N

we can determine the average values of N1 and N1² to be

⟨N1⟩ = N p

and

⟨N1²⟩ = N p (1 − p) + N² p²
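These two averages can be confirmed exactly by summing against the binomial distribution (the values of N and p below are arbitrary):

```python
import math

# Exact check of <N1> = N p and <N1^2> = N p(1-p) + N^2 p^2 for the
# binomial distribution P(N1; N) = C(N, N1) p^N1 q^(N - N1)
N, p = 12, 0.3
q = 1 - p
P = [math.comb(N, k) * p ** k * q ** (N - k) for k in range(N + 1)]
mean = sum(k * P[k] for k in range(N + 1))
second = sum(k * k * P[k] for k in range(N + 1))
print(mean, N * p)
print(second, N * p * (1 - p) + (N * p) ** 2)
```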

Hence if we denote N1 − N2 by ∆, i.e., the net number of steps to the right, then the average values of ∆ and ∆² are

⟨∆⟩ = N (2p − 1)

and

⟨∆²⟩ = 4 N p (1 − p) + N² (2p − 1)²

If the lattice spacing is ℓ and a particle is released at x = 0 then its position on the lattice after N steps is x = ℓ∆. So if all particles are released at x = 0 and if p = q = 1/2 then

⟨x⟩ = 0

and

⟨x²⟩ = ℓ² N

Assuming the time required to take a step is τ , we see that N steps corresponds to time t = Nτ and

⟨x²⟩ = ℓ v t

where v = ℓ/τ is the speed of a particle. Our two formulas for ⟨x²⟩ tell us that random walk acts statistically like a diffusion process where

D = (1/2) ℓ v
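A Monte Carlo sketch (the walker count and seed are arbitrary choices) reproduces ⟨x⟩ = 0 and ⟨x²⟩ = ℓ²N for the symmetric walk:

```python
import random

# Symmetric 1D random walk: N steps of length ell, many independent walkers.
# Expect <x> near 0 and <x^2> near ell^2 * N, i.e. D = ell*v/2 with v = ell/tau.
random.seed(1)
ell, N, walkers = 1.0, 100, 20000
total, total_sq = 0.0, 0.0
for _ in range(walkers):
    x = sum(ell if random.random() < 0.5 else -ell for _ in range(N))
    total += x
    total_sq += x * x
mean_x = total / walkers
mean_x2 = total_sq / walkers
print(mean_x, mean_x2)   # mean_x near 0, mean_x2 near ell**2 * N = 100
```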

We can repeat this calculation assuming that diffusion takes place, not in one, but in three dimensions. Then if all the particles start at the origin the diffusion equation tells us directly that

⟨r²⟩ = ⟨x²⟩ + ⟨y²⟩ + ⟨z²⟩ = 6 D t

To introduce the corresponding simple random walk on a three dimensional cubic lattice we let P (N1 , N2 , N3 , N4 , N5 , N6 ; N) be the probability that in N steps, N1 steps are in the +x direction, N2 steps are in the −x direction, N3 steps are in the +y direction, etc. And at each step we let p be the probability that it is in the +x direction, q the probability that it is in the −x direction, r the probability that it is in the +y direction, etc. Then we find

P (N1 , N2 , . . . , N6 ; N) = ( N! / (N1! N2! N3! N4! N5! N6!) ) p^{N1} q^{N2} r^{N3} s^{N4} u^{N5} v^{N6}

where N1 + N2 + · · · + N6 = N and p + q + · · · + v = 1 and where

∑_{N1} ∑_{N2} · · · ∑_{N6} P(N1 , N2 , . . . , N6 ; N) = (p + q + r + s + u + v)^N = 1 ,   N1 + N2 + · · · + N6 = N

We can determine the averages we need as follows

⟨N1⟩ = ∑_{N1} ∑_{N2} · · · ∑_{N6} N1 P(N1 , N2 , . . . , N6 ; N)

     = p ∂/∂p ∑_{N1} ∑_{N2} · · · ∑_{N6} P(N1 , N2 , . . . , N6 ; N)

     = pN

and

⟨N1²⟩ = ∑_{N1} ∑_{N2} · · · ∑_{N6} N1² P(N1 , N2 , . . . , N6 ; N)

      = p ∂/∂p ( p ∂/∂p ∑_{N1} ∑_{N2} · · · ∑_{N6} P(N1 , N2 , . . . , N6 ; N) )

      = pN + p²N(N − 1)

Likewise we find

⟨N2⟩ = qN

⟨N1 N2⟩ = pqN(N − 1)

and

⟨N2²⟩ = qN + q²N(N − 1)

So, if we let ∆x = N1 − N2 , ∆y = N3 − N4 and ∆z = N5 − N6 then ∆x is the net number of
steps in the positive x direction. Its average is

⟨∆x⟩ = ⟨N1⟩ − ⟨N2⟩ = (p − q)N

while the average of its square is

⟨∆x²⟩ = ⟨N1²⟩ − 2⟨N1 N2⟩ + ⟨N2²⟩

      = (p + q)N + (p − q)²N(N − 1)

If the lattice spacing is ℓ and a particle is released at x = y = z = 0 then its position on
the lattice after N steps is x = ℓ∆x , y = ℓ∆y and z = ℓ∆z . So if all particles are released at
x = y = z = 0 and if p = q = r = s = u = v = 1/6 then

⟨x⟩ = 0

and

⟨x²⟩ = ℓ²N/3

Likewise we have ⟨y⟩ = 0 = ⟨z⟩ and ⟨y²⟩ = ℓ²N/3 = ⟨z²⟩.
3
Again if the time required to take a step is τ then t = Nτ and

⟨r²⟩ = ⟨x²⟩ + ⟨y²⟩ + ⟨z²⟩ = ℓvt

whence

D = ℓv/6

So when we hold the length of a free path fixed, require the free path to lie on a lattice and hold
the speed of a particle fixed we get

D = ℓv/2

and

D = ℓv/6

depending on whether the random walk is in one or in three dimensions.

This may be a surprising result. Even more surprising is the way the lattice calculations turn
out. After N steps the average of the square of the distance from the origin is the same whether the
random walk is on a one dimensional lattice, where ⟨x²⟩ = ℓ²N, or on a three dimensional lattice,
where ⟨r²⟩ = ℓ²N.
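This can be verified by brute force for small N, enumerating all 6^N equally likely walks on the cubic lattice (a sketch in plain Python; the helper name sum_r2 is ours):

```python
from itertools import product

# The six unit steps of the simple cubic lattice walk.
STEPS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]

def sum_r2(N):
    """Sum of r^2 (in units of l^2) over all 6^N equally likely N-step walks."""
    total = 0
    for walk in product(STEPS, repeat=N):
        x = sum(s[0] for s in walk)
        y = sum(s[1] for s in walk)
        z = sum(s[2] for s in walk)
        total += x*x + y*y + z*z
    return total

N = 5
assert sum_r2(N) == N * 6**N   # <r^2> = l^2 N, exactly as on the one dimensional lattice
```

The equality is exact, integer arithmetic throughout, because the cross terms between steps average to zero.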

It is worth recording the elementary mean free path result that

D = ℓv/3

when a dilute gas is diffusing in a gas of fixed atoms.

12.4 The Point Source Solution to the Diffusion Equation

A solute is distributed throughout all space at time t = 0. The amount of solute is finite. There are
no natural length and time scales. Our job is to predict the solute concentration, c, at t > 0. To do
this we must solve

∂c/∂t = D∇²c ,   ∀ ~r

where we must have c → 0 as | ~r| → ∞ and where c(t = 0) is assigned. Our first step is to find
out how a point source of solute spreads out in time in an unbounded domain. The result will be
the Green’s function for our problem and we obtain it by assuming we know the solution to a step
function initial condition. Then we come back to Green’s functions in Lecture 19.

To begin we first observe that

A + B erf( x/√(4t) )

satisfies

∂c/∂t = ∂²c/∂x²

where erf z, called the error function, is defined by

erf z = (2/√π) ∫₀^z e^(−ζ²) dζ

and where for real values of z it looks like:

[Figure: graph of erf(z) for −3 ≤ z ≤ 3, rising monotonically from −1 to +1 through the origin]

While this would not seem to be a very useful solution to the diffusion equation due to the
limited class of boundary conditions that can be satisfied by setting two constants, nonetheless it
plays an important role in modeling diffusion processes as can be seen by looking in Bird, Stewart
and Lightfoot’s book “Transport Phenomena.” To get the solution when the diffusivity is other
than D = 1, write Dt in place of t.

If c(t = 0) is zero for x < ξ and x > ξ + ∆ξ and uniform but not zero for ξ < x < ξ + ∆ξ
then, for t > 0 we have

c = (c0/2) [ erf( (x − ξ)/√(4t) ) − erf( (x − (ξ + ∆ξ))/√(4t) ) ]

and to the first order in ∆ξ this is

c0 ∆ξ ( 1/√(4πt) ) e^( −(x − ξ)²/4t )

To go to three dimensions observe that if f(x, t) satisfies

∂f/∂t = ∂²f/∂x²

then f(x, t) f(y, t) f(z, t) satisfies

∂c/∂t = ∇²c

and hence to first order in ∆ξ, ∆η and ∆ζ

c = c0 ∆ξ∆η∆ζ ( 1/√(4πt) )³ e^( −[(x − ξ)² + (y − η)² + (z − ζ)²]/4t )

satisfies

∂c/∂t = ∇²c

where

c(t = 0) = c0   for ξ < x < ξ + ∆ξ ,  η < y < η + ∆η ,  ζ < z < ζ + ∆ζ

         = 0    otherwise

The product c0 ∆ξ∆η∆ζ is the total amount of solute initially present. If we hold this fixed,
equal to one unit of mass, and let ∆ξ, ∆η and ∆ζ go to zero, we find

c = ( 1/√(4πt) )³ e^( −[(x − ξ)² + (y − η)² + (z − ζ)²]/4t )

and c satisfies

∂c/∂t = ∇²c ,   t > 0

c(t = 0) = 0 ,   ∀ (x, y, z) ≠ (ξ, η, ζ)

and

∫₋∞^∞ ∫₋∞^∞ ∫₋∞^∞ c dV = 1 ,   ∀ t ≥ 0

This is called the point source solution to the diffusion equation as it tells us the solute density
at the point (x, y, z) and at the time t if at t = 0 a unit mass of solute is injected into a region
of vanishingly small extent at the point (ξ, η, ζ). It is called the Green’s function for the diffusion
equation in an unbounded region.
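Because the point source solution factors into three identical one dimensional Gaussians, its total mass can be checked with a one dimensional quadrature; a sketch taking D = 1 (the helper name is ours):

```python
from math import exp, pi, sqrt

def one_dim_mass(t, L=30.0, n=4000):
    """Trapezoid rule for the integral of (4*pi*t)**(-1/2) * exp(-x**2/(4t))
    over [-L, L]; the tails beyond L are negligible for the t used here."""
    h = 2.0*L/n
    s = 0.0
    for i in range(n + 1):
        x = -L + i*h
        w = 0.5 if i in (0, n) else 1.0
        s += w*exp(-x*x/(4.0*t))/sqrt(4.0*pi*t)
    return s*h

for t in (0.1, 1.0, 5.0):
    m = one_dim_mass(t)
    assert abs(m - 1.0) < 1e-8       # each Cartesian factor integrates to 1
    assert abs(m**3 - 1.0) < 1e-7    # so the triple integral of c over all space is 1
```

Mass is conserved at every t, as the unit-source normalization requires.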

If the diffusivities are other than D = 1 it is a simple matter to decide that the point source
solution to

∂c ∂2c ∂2c ∂2c


= Dxx 2 + Dyy 2 + Dzz 2
∂t ∂x ∂y ∂z

is

−1
Dxx (x − ξ)2 + Dyy
−1
(x − η)2 + Dzz
−1
(x − ζ)2
1 1 −
c = √ 3 √ p √ e 4t
4πt Dxx Dyy Dzz

and this is the Green’s function for diffusion in an anisotropic solvent.

Thus suppose c satisfies

∂c/∂t = ∇ · ( ~~D · ∇c ) = ~~D : ∇∇c

where we have assumed ~~D to be symmetric. This being so, there is an orthogonal basis in which
~~D is diagonal. Hence requiring ~i, ~j and ~k to lie along the eigenvectors of ~~D, we can write

~~D : ∇∇c = Dxx ∂²c/∂x² + Dyy ∂²c/∂y² + Dzz ∂²c/∂z² ,

where Dxx , Dyy and Dzz are the eigenvalues of ~~D. And in the eigenbasis of ~~D we can write the
point source solution to the anisotropic diffusion equation as above. In coordinate free form it is

c(~r, t) = 1/{ (√(4πt))³ (det ~~D)^(1/2) } e^( −~~D⁻¹ : (~r − ~r0)(~r − ~r0)/4t )

where ~r0 = ξ~i + η~j + ζ~k.

We can use the Green’s function to write the solution to the diffusion equation in a way that
does not require us to decompose the sources but accepts them as they stand. To do this observe
that the point source solution tells us the solute concentration at the point (x, y, z) at a time t
following the introduction of a unit mass of solute at the point (ξ, η, ζ). Knowing this we can write
the solution to

∂c/∂t = ~~D : ∇∇c + Q(~r, t)

where

c(t = 0) = c0 (~r)

in terms of a superposition of point source solutions. The result is

c(~r, t) = ∫∫∫ 1/{ (√(4πt))³ (det ~~D)^(1/2) } e^( −~~D⁻¹ : (~r − ~r0)(~r − ~r0)/4t ) c0(~r0) dV0

         + ∫∫∫ ∫₀^t 1/{ (√(4π(t − t0)))³ (det ~~D)^(1/2) } e^( −~~D⁻¹ : (~r − ~r0)(~r − ~r0)/4(t − t0) ) Q(~r0 , t0) dt0 dV0

where the spatial integrals run over all space, inasmuch as c0(~r0) dV0 is the amount of solute introduced into dV0 at ~r0 at t = 0 and Q(~r0 , t0) dV0 dt0
is the amount of solute introduced into dV0 at ~r0 during dt0 at t0 . Again the solution is a sum of
terms, each term itself being the solution corresponding to one of the sources when the others
vanish. If this result is used to determine ~~D, only symmetric ~~D's can be found.
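In one dimension (isotropic, constant D) the first term of this superposition is just a convolution with the Gaussian Green's function, which can be checked against the known fact that a Gaussian initial condition of variance σ0² evolves into one of variance σ0² + 2Dt. A sketch (names are ours):

```python
from math import exp, pi, sqrt

def G(x, t, D=1.0):
    """One dimensional point source (Green's) function of the diffusion equation."""
    return exp(-x*x/(4.0*D*t))/sqrt(4.0*pi*D*t)

def superpose(x, t, c0, xs, dx, D=1.0):
    """c(x, t) = integral of G(x - x0, t) c0(x0) dx0, by a simple Riemann sum."""
    return sum(G(x - x0, t, D)*c0(x0) for x0 in xs)*dx

s0 = 0.5
c0 = lambda x: exp(-x*x/(2.0*s0**2))/sqrt(2.0*pi*s0**2)   # unit-mass Gaussian
n, L = 2001, 12.0
dx = 2*L/(n - 1)
xs = [-L + i*dx for i in range(n)]
t, D = 0.8, 1.0
s2 = s0**2 + 2.0*D*t          # the variance grows by 2Dt
for x in (0.0, 0.5, 1.5):
    exact = exp(-x*x/(2.0*s2))/sqrt(2.0*pi*s2)
    assert abs(superpose(x, t, c0, xs, dx, D) - exact) < 1e-6
```

The numerically superposed solution matches the analytic Gaussian to quadrature accuracy.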

The point source solution for diffusion in three dimensions leads easily to the point source
solution for diffusion in two dimensions and then in one dimension.

If we introduce M′′′ mass units of solute into an isotropic solvent at the point ~r = ~r0 and at
time t = 0 then the resulting solute concentration is

c(~r, t) = M′′′ ( 1/√(4πDt) )³ e^( −[(x − x0)² + (y − y0)² + (z − z0)²]/4Dt )

and this is the point source solution in three dimensions. The dimensions of c are M/L³ and the
dimension of √(Dt) is L.

If instead we introduce M′′ mass units of solute per unit of length uniformly on the line
x = x0 , y = y0 , −∞ < z < ∞, at time t = 0, then the resulting solute concentration is

c(~r, t) = M′′ ∫₋∞^∞ dz0 ( 1/√(4πDt) )³ e^( −[(x − x0)² + (y − y0)² + (z − z0)²]/4Dt )

        = M′′ ( 1/√(4πDt) )² e^( −[(x − x0)² + (y − y0)²]/4Dt )

and this is called the line source solution in three dimensions. It is uniform in z and does not vanish
as |z| → ∞. It can be called the point source solution in two dimensions, taking the dimensions
of M′′ to be M instead of M/L and then the dimensions of c to be M/L². Likewise if we introduce
M′ mass units of solute per unit of area uniformly over the plane
x = x0 , −∞ < y < ∞, −∞ < z < ∞, at time t = 0, then the resulting solute concentration is

c(~r, t) = M′ ( 1/√(4πDt) ) e^( −(x − x0)²/4Dt )

and this is called the plane source solution in three dimensions. It is uniform in y and z and does
not vanish as |y| → ∞ or as |z| → ∞. It can be called the point source solution in one dimension,
taking the dimensions of M′ to be M instead of M/L² and then the dimensions of c to be M/L.

12.5 Home Problems

1. Suppose c satisfies

∂c/∂t = ~~D : ∇∇c

where c(t = 0) is assigned and c vanishes strongly as |~r| → ∞. Derive equations for the
power moments of c, viz.,

c0 = ∫∫∫ c dV

~c1 = ∫∫∫ ~r c dV

~~c2 = ∫∫∫ ~r ~r c dV

etc.

Derive a formula for ~~D in terms of the power moments of c.

You ought to begin by deriving:

~r ∇²c = ∇²(~r c) − 2∇c

~r ~r ∇²c = ∇²(~r ~r c) − 2~~I c − 2(∇c)~r − 2~r(∇c)

and

(∇c)~r = ∇(~r c) − ~~I c

It helps to know:

∫∫∫_V ∇f dV = ∫∫_S dA ~n f

where f can be scalar, vector, etc. valued.

2. A thermometer, your finger, senses the temperature of its tip shortly after it touches another
body. Let αB , kB and TB denote the thermal diffusivity, conductivity and temperature of a
body while αF , kF and TF denote the thermal diffusivity, conductivity and temperature of
the thermometer.

Use the special solution of the diffusion equation, viz.,

A + B erf( x/√(4αt) )

to decide that two bodies at the same temperature ordinarily do not feel like they are at the
same temperature. What determines which feels cooler?

3. Denote by σ a surface heat source and assume σ is constant over a sphere of radius R.
Derive a formula for T inside and for T outside, assuming T → 0 as r → ∞. The thermal
conductivities inside and outside may differ.

Here, ∇²T is simply

(1/r²) d/dr ( r² dT/dr )

Write your result in terms of Σ = 4πR²σ and then let R → 0 holding Σ constant.

4. Leveling a storm surge.

Water lies in a porous rock above an impermeable plane, viz.,

[Figure: a layer of water with free surface z = h(x, t) at pressure p0 , resting on an impermeable plane z = 0, gravity g acting downward]

Writing Darcy's law for vx , viz.,

vx = −(K/μ) ∂p/∂x

assuming the pressure is hydrostatic, viz.,

p = p0 + ρg (h − z)

and using

∂vx/∂x + ∂vz/∂z = 0 ,   vz = 0 at z = 0

derive

(Kρg/μ) ∂/∂x ( h ∂h/∂x ) = ∂h/∂t

starting from the balance satisfied by the water at the surface z = h(x, t), viz.,

vz − vx ∂h/∂x = ∂h/∂t

Notice that

[ (Kρg/μ) h ] = L²/T

and that what you have is a nonlinear diffusion equation where the diffusivity at (x, t) de-
pends on how much h is there.

Define

hm = ∫₋∞^∞ x^m h(x, t) dx

and show that

dh0/dt = 0 = dh1/dt

and

dh2/dt > 0

Thus the amount of water does not change nor does its mean position as it spreads out.
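A conservative finite difference scheme makes these moment properties easy to see numerically. The sketch below (our own discretization, in units where Kρg/μ = 1) advances h_t = (h h_x)_x explicitly and tracks h0 , h1 and h2 :

```python
def step(h, dx, dt):
    """One conservative explicit step of h_t = (h h_x)_x (units where K*rho*g/mu = 1)."""
    n = len(h)
    flux = [0.0]*(n + 1)     # flux[i] = h h_x at the face between cells i-1 and i
    for i in range(1, n):
        flux[i] = 0.5*(h[i] + h[i-1]) * (h[i] - h[i-1])/dx
    return [h[i] + dt*(flux[i+1] - flux[i])/dx for i in range(n)]

def moments(h, xs, dx):
    return (sum(h)*dx,
            sum(x*v for x, v in zip(xs, h))*dx,
            sum(x*x*v for x, v in zip(xs, h))*dx)

n, dx, dt = 201, 0.05, 0.0005
xs = [(i - n//2)*dx for i in range(n)]
h = [max(0.0, 1.0 - x*x) for x in xs]      # initial surge, zero near the boundaries
m0a, m1a, m2a = moments(h, xs, dx)
for _ in range(400):
    h = step(h, dx, dt)
m0b, m1b, m2b = moments(h, xs, dx)
assert abs(m0b - m0a) < 1e-8    # h0: the amount of water is conserved
assert abs(m1b - m1a) < 1e-8    # h1: its mean position is conserved
assert m2b > m2a                # h2: the surge spreads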

5. Chromatography

Solve for

( c0 )   ( c1 )
( a0 ) , ( a1 ) , · · ·

in terms of the eigenvalues and eigenvectors of

(  −K/ε        Km/ε      )
(   K/(1−ε)   −Km/(1−ε)  )

Define c̄, Veff and Deff via

c̄ = ε c + (1 − ε) a

Veff = (1/c̄0) dc̄1/dt

and

Deff = (1/2c̄0) dc̄2/dt − (c̄1/c̄0²) dc̄1/dt

and derive formulas for Veff and Deff as t → ∞.


Lecture 13

Multipole Expansions

13.1 Steady Diffusion in an Unbounded Domain

In this lecture we deal with the problem of steady solute diffusion or heat conduction from a source
of solute or heat near the origin to a sink at zero concentration or temperature, infinitely far away.
Again we derive a point source solution.

Suppose heat is generated uniformly inside a sphere of radius R centered on the origin O.
Then, assuming the heat is conducted steadily to T = 0 at r = ∞, we have

∇²T + Q/k = 0 ,   [Q] = heat/(volume · time)

inside the sphere and

∇²T = 0

outside the sphere where

∇² = (1/r²) d/dr ( r² d/dr )

LECTURE 13. MULTIPOLE EXPANSIONS 304

and therefore ∇²T = 0 implies

T = A + B/r

Before going on we ought to notice that there would be no steady solution if the sphere were
replaced by an infinitely long cylinder.

We find

T = A − (1/6)(Q/k) r²

inside,

T = B/r

outside and, denoting by k the outside thermal conductivity, we have

B = (1/3)(Q/k) R³

and therefore, outside,

T = (1/3)(Q/k) R³/r = (V Q/4πk)(1/r) ,   r > R

where V Q is the heat supplied per unit time. Denoting V Q by Q, we have, for a point source of
heat at the origin:

T = (Q/4πk)(1/r) ,   r > 0.

Now that we have the temperature at a point ~r due to a point source of heat at a point ~r0 , viz.,

T(~r) = (Q/4πk) 1/|~r − ~r0| ,   [Q] = heat/time

we can find the temperature at the point ~r due to a continuously distributed source of heat specified
by the density ρ, vanishing outside a region of space denoted V0 , viz.,

T(~r) = (1/4πk) ∫∫∫_V0 ρ(~r0)/|~r − ~r0| dV0 ,   [ρ] = heat/(time · volume)

The readers ought to convince themselves that the integral makes sense, whether ~r lies inside
or outside the sphere, by looking at the special case where heat is generated uniformly in a finite
sphere centered on O.

Thus, we have the solution to

∇²T + ρ(~r)/k = 0 ,   T → 0 as |~r| → ∞

in an infinite region. This equation is called Poisson's equation.
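The uniform sphere case can also serve as a numerical check: evaluated outside the sphere, the Poisson integral should collapse to ρR³/3kr, i.e. Q/4πkr with Q = (4/3)πR³ρ. A sketch in spherical coordinates, taking ρ = k = R = 1 (the function name is ours):

```python
from math import pi, sin, cos, sqrt

def T_outside(d, R=1.0, rho=1.0, k=1.0, n=200):
    """Midpoint rule for T at distance d from the center of a uniformly
    heated sphere of radius R: (1/4*pi*k) * triple integral of rho/|r - r0|.
    The azimuthal integral contributes the factor 2*pi by symmetry."""
    dr, dth = R/n, pi/n
    total = 0.0
    for i in range(n):
        r0 = (i + 0.5)*dr
        for j in range(n):
            th = (j + 0.5)*dth
            dist = sqrt(d*d - 2.0*d*r0*cos(th) + r0*r0)
            total += r0*r0*sin(th)/dist
    return (rho/(4*pi*k)) * 2*pi * total * dr * dth

for d in (2.0, 3.0, 4.0):
    exact = 1.0/(3.0*d)          # rho*R^3/(3*k*d)
    assert abs(T_outside(d) - exact) < 1e-3*exact
```

Outside the sphere the field is indistinguishable from that of a point source of the same total strength, the conduction analogue of the shell theorem.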

Now suppose we have a system of point sources in the neighbourhood of a point O and we
wish to know T at a point P some distance away, viz.,

[Figure: point sources Qi at positions ~ri near O; field point P at position ~r, with θi the angle between ~r and ~ri]

At P we have

T(P) = ∑ (Qi/4πk) 1/|~r − ~ri|

where

|~r − ~ri|² = (~r − ~ri) · (~r − ~ri) = r² ( 1 − 2 (ri/r) cos θi + ri²/r² )

and therefore

|~r − ~ri|⁻¹ = (1/r) { 1 − (1/2)( −2 (ri/r) cos θi + ri²/r² ) + (3/8)( −2 (ri/r) cos θi + ri²/r² )² + · · · }

and we find that T(P) is given by

T(P) = (1/4πk)(1/r) ∑ Qi { 1 + (ri/r) cos θi + (3/2)(ri cos θi)²/r² − (1/2) ri²/r² + · · · }

Then using

ri cos θi = ~r · ~ri / r ,

we have

T(P) = (1/4πk) { (1/r) ∑ Qi + (1/r³) ~r · ∑ Qi ~ri + (1/r⁵) ~r ~r : ∑ Qi ( (3/2) ~ri ~ri − (1/2) ~ri · ~ri ~~I ) + · · · }

 
The factors ∑ Qi , ∑ Qi ~ri , ∑ Qi ( (3/2) ~ri ~ri − (1/2) ~ri · ~ri ~~I ) , · · · depend only on the distribution
of the point heat sources near O and are called the monopole, dipole, quadrupole, · · · moments of
the point source distribution. Denoting these M, ~D, ~~Q, · · · our formula can be written

4πkT(P) = (1/r) M + (1/r³) ~r · ~D + (1/r⁵) ~r ~r : ~~Q + · · ·

where ~r denotes the position of P with respect to O and M, ~D, ~~Q, · · · do not depend on the field
point ~r.
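The expansion is easy to test: for charges clustered near O and a field point several cluster radii away, monopole + dipole + quadrupole reproduce the direct sum to high accuracy. A sketch in plain Python (names ours; ~r ~r : ~~Q is the double-dot contraction):

```python
from math import sqrt

def dot3(a, b): return a[0]*b[0] + a[1]*b[1] + a[2]*b[2]

def exact_T(r, charges):
    """Direct sum of Qi/|r - ri| (this is 4*pi*k*T(P))."""
    tot = 0.0
    for Q, ri in charges:
        d = (r[0] - ri[0], r[1] - ri[1], r[2] - ri[2])
        tot += Q/sqrt(dot3(d, d))
    return tot

def multipole_T(r, charges):
    """Monopole + dipole + quadrupole terms of 4*pi*k*T(P)."""
    M = sum(Q for Q, _ in charges)
    D = [sum(Q*ri[m] for Q, ri in charges) for m in range(3)]
    Qd = [[sum(Q*(1.5*ri[a]*ri[b] - 0.5*dot3(ri, ri)*(a == b))
               for Q, ri in charges) for b in range(3)] for a in range(3)]
    rr = sqrt(dot3(r, r))
    quad = sum(r[a]*r[b]*Qd[a][b] for a in range(3) for b in range(3))
    return M/rr + dot3(r, D)/rr**3 + quad/rr**5

charges = [(1.0, (0.1, 0.0, -0.05)), (-2.0, (-0.07, 0.12, 0.0)),
           (0.5, (0.0, -0.1, 0.08))]
P = (5.0, 3.0, -4.0)
assert abs(multipole_T(P, charges) - exact_T(P, charges)) < 1e-3*abs(exact_T(P, charges))
```

With sources within about 0.15 of the origin and a field point at distance about 7, the omitted octupole and higher terms are smaller than the stated tolerance.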

If, instead of a heat conduction problem, we have an electrostatic problem, where the electrical
potential throughout space is created by a system of point charges near the origin, we find for the
electrostatic potential at P , via Coulomb's law or, as above, via ∇²φ = 0:

4πε0 φ(P) = (1/r) MO + (1/r³) ~r · ~DO + (1/r⁵) ~r ~r : ~~QO + · · ·

where MO , ~DO , ~~QO , · · · denote the monopole, dipole, quadrupole, · · · moments of the charge
distribution about O.

We might then have a second charge distribution in the neighbourhood of P and ask for the
potential energy of this second set of charges due to the electrical potential created by the first
set of charges, i.e., the potential energy of one molecule due to the electrical potential created by
another molecule. The result is

PE = ∑ φ(Pi) Qi = ∑ φ(~r + ~ri) Qi

corresponding to the sketch

[Figure: a second set of charges Qi at positions ~ri about P , with ~r the vector from O to P]

where ~ri denotes the position with respect to P of the ith charge in the second charge distribution.

The potential energy can be written in terms of the moments of the two charge distributions by
expanding φ (Pi ) via

φ(~r + ~ri) = φ(~r) + ~ri · ∇φ(~r) + (1/2) ~ri ~ri : ∇∇φ(~r) + · · ·

Hence we have

PE = ∑ Qi { φ(~r) + ~ri · ∇φ(~r) + (1/2) ~ri ~ri : ∇∇φ(~r) + · · · }

and using

∇(1/r) = −~r/r³

∇∇(1/r) = −∇(~r/r³) = 3~r~r/r⁵ − ~~I/r³

and

∇(~r/r³) = −3~r~r/r⁵ + ~~I/r³

we find, to terms of order 1/r³,

4πε0 PE = MP { (1/r) MO + (1/r³) ~r · ~DO + (1/r⁵) ~r ~r : ~~QO }

        + ~DP · { −(~r/r³) MO + ( −3~r~r/r⁵ + ~~I/r³ ) · ~DO }

        + ∑ (1/2) Qi ~ri ~ri : ( 3~r~r/r⁵ − ~~I/r³ ) MO

Then, using

tr ( 3~r~r/r⁵ − ~~I/r³ ) = ~~I : ( 3~r~r/r⁵ − ~~I/r³ ) = 0

and

tr ~~QP = 0

where

~~QP = ∑ Qi ( (3/2) ~ri ~ri − (1/2) ~ri · ~ri ~~I )

we have

4πε0 PE = MO MP /r + (~r/r³) · { ~DO MP − ~DP MO } + ( (1/r³) ~~I − (3/r⁵) ~r ~r ) : ~DO ~DP

        + (1/r⁵) ~r ~r : ( ~~QO MP + ~~QP MO )

where ~r is the vector from O to P .

We see that if the two charge distributions are neutral, i.e., MO = 0 = MP , the leading term is
the dipole-dipole term, falling off as 1/r³.
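The dipole-dipole term can be checked against the direct Coulomb sum for two small, neutral charge pairs (a sketch in units where 4πε0 = 1; names ours):

```python
from math import sqrt

def dot(a, b): return sum(x*y for x, y in zip(a, b))

def direct_PE(set_O, set_P):
    """Sum of qa*qb/|ra - rb| over all cross pairs of the two charge sets."""
    tot = 0.0
    for qa, ra in set_O:
        for qb, rb in set_P:
            d = [x - y for x, y in zip(ra, rb)]
            tot += qa*qb/sqrt(dot(d, d))
    return tot

def dipole_PE(DO, DP, r):
    """Dipole-dipole term: DO.DP/r^3 - 3 (r.DO)(r.DP)/r^5."""
    rr = sqrt(dot(r, r))
    return dot(DO, DP)/rr**3 - 3.0*dot(r, DO)*dot(r, DP)/rr**5

s = 0.01                                # charge separation inside each dipole
set_O = [(1.0, (0.0, 0.0,  s/2)), (-1.0, (0.0, 0.0, -s/2))]
r = (6.0, 0.0, 2.0)                     # vector from O to P
set_P = [(1.0, (r[0] + s/2, r[1], r[2])), (-1.0, (r[0] - s/2, r[1], r[2]))]
DO, DP = (0.0, 0.0, s), (s, 0.0, 0.0)   # dipole moments about O and about P
assert abs(direct_PE(set_O, set_P) - dipole_PE(DO, DP, r)) < 1e-3*abs(dipole_PE(DO, DP, r))
```

A symmetric ± pair has no quadrupole moment about its center, so the next correction here is two orders down in s/r.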
If the point charges were replaced by point masses and ε0 by G, and if O and P were taken
to lie at the centers of mass so that ~DO = ~0 = ~DP , then the first correction to the leading term
MO MP /r would be the quadrupole term, where the quadrupole moment of a mass distribution can
be expressed in terms of its inertia tensor.

13.2 Home Problems

1. A source of heat is distributed over the surface of a sphere of radius R according to

σ = σ0 + σ1 cos θ

Your job is to solve ∇²T = 0 for r > R and r < R where T → 0 as r → ∞. The thermal
conductivities inside and outside may differ.

You have

∇² = (1/r²) ∂/∂r ( r² ∂/∂r ) + (1/r² sin θ) ∂/∂θ ( sin θ ∂/∂θ )

and to solve your problem write, inside and outside,

T = T0 (r) + T1 (r) cos θ

and derive

(1/r²) d/dr ( r² dT0/dr ) = 0

and

(1/r²) d/dr ( r² dT1/dr ) − (2/r²) T1 = 0

whereupon

T0 = A0 + B0/r

and

T1 = A1 r + B1/r²

What if the inside of the sphere is non conducting?

2. A heat source, denoted Q, is distributed over a region V0 . The temperature at a point ~r is


given by

T(~r) = (1/4πk) ∫∫∫_V0 Q(~r0)/|~r − ~r0| dV0

[Figure: a source region V0 with boundary surface S0 and outward normal ~n; field point at ~r]

show that if |~r| >> |~r0| we have

T(~r) = (1/4πk) { (1/r) ∫∫∫_V0 Q(~r0) dV0 + (~r/r³) · ∫∫∫_V0 ~r0 Q(~r0) dV0 + · · · }

Now if ~q denotes −k∇T we have

∇ · ~q = Q

and hence we see that

∫∫_S0 ~n · ~q dA0 = ∫∫∫_V0 Q(~r0) dV0

This suggests that perhaps we can replace all the moments of Q over V0 by moments of
~n · ~q over S0 in our equation for T (~r).

Show that

~r ∇ · ~q = ∇ · (~q ~r) − ~q

and hence derive

∫∫∫_V0 ~r0 Q dV0 = ∫∫_S0 ~r0 ~n · ~q dA0 − ∫∫∫_V0 ~q dV0

and conclude that we cannot replace first moments of Q over V0 by first moments of ~n · ~q
over S0 .

However, we have

∫∫∫_V0 ~q dV0 = −k ∫∫∫_V0 ∇T dV0 = −k ∫∫_S0 ~n T dA0

whence if T is constant on S0 we have

∫∫∫_V0 ~q dV0 = ~0

Turn to second moments and evaluate

∫∫∫_V0 ~r0 ~r0 Q dV0 − ∫∫_S0 ~n · ~q ~r0 ~r0 dA0

3. Derive a formula for the potential energy due to systems of charges near O, P and Q.

[Figure: charge systems near points O, P and Q, with ~rP the vector from O to P and ~rQ the vector from O to Q]

where |~rP |, |~rQ | and |~rP − ~rQ | are all much greater than the distances of the charges from
O, P and Q.

Assume the monopole moments are all zero, and account only for the dipole moments.

4. The temperature at ~r due to a point source at ~r0 is

T(~r) = Q/( 4πk |~r − ~r0| )

Assume ~r0 = z0~k, z0 > 0 and show that the temperature at the points (R, θ, φ) is

T(R, θ, φ) = Q/( 4πk √(z0² − 2z0 R cos θ + R²) )

Assume z0 >> R and write out the first three terms.

Setting this aside, find the temperature at ~r if the temperature, T (R, θ, φ), is specified
on the surface of a sphere of radius R, r > R and T → 0 as r → ∞.

Now we have a point source of heat at z = z0 , x = 0 = y and we wish to derive a


formula for T (~r) due to this point source, given that a sphere of radius R is centered at the
origin and given that its surface temperature is held constant at a temperature TS > 0.

The result is the sum of two contributions, that of the point source in the absence of the
sphere and that of a sphere whose surface temperature is

TS − Q/( 4πk √(z0² − 2z0 R cos θ + R²) )

in the absence of the point source.

How should we think about finding the temperature at ~r due to two spheres of radius
R, one centered at ~r1 having surface temperature T1 , the other centered at ~r2 having surface
temperature T2 , T1 and T2 constants?

5. Suppose our charge distribution is composed of two charges, q1 at ~r1 and q2 at ~r2 , where
q1 = q = −q2 and ~r1 − ~r2 = d~u and where ~u is a vector of unit length in the direction
~r1 − ~r2 .

Derive the monopole, dipole and quadrupole moments of this charge distribution. What is
the limiting form of each of these moments as d → 0 holding qd fixed.

6. The solute concentration at a point ~r due to point sources at ~r1 , ~r2 , . . ., near the origin is

4πD c(~r) = ∑ mi /|~r − ~ri|

[Figure: point sources at ~r1 , ~r2 , . . . inside a region V0 near the origin O; field point at ~r]

which we can write

4πD c(~r) = ∫∫∫_V0 ρ(~r0)/|~r − ~r0| dV0

if the discrete sources are replaced by a continuous source density ρ.

If ~r lies inside V0 we might wonder about the integral on the right hand side as ~r0 passes
through ~r. Suppose ~r = 0 and write

∫∫∫_V0 = ∫∫∫_(V0 − Vε) + ∫∫∫_Vε

where Vε is the volume of a sphere of radius ε centered on O.

Show that

∫∫∫_Vε ρ(~r0)/|~r0| dV0

is not singular if ρ is bounded.



Now for ~r outside V0 derive

4πD c(~r) = (1/r) ∫∫∫_V0 ρ(~r0) dV0 + (~r/r³) · ∫∫∫_V0 ~r0 ρ(~r0) dV0

          + (1/r³) ( (3/2)(~r/r)(~r/r) − (1/2) ~~I ) : ∫∫∫_V0 ~r0 ~r0 ρ(~r0) dV0 + etc.

Suppose the solute source is distributed over a surface S0 , its density being denoted by
σ. Then decide

4πD c(~r) = ∫∫_S0 σ(~r0)/|~r − ~r0| dA0

If ~r is a point of S0 does the right hand side present a technical difficulty?

Suppose S0 is the surface of a sphere of radius R0 centered at the point ~r* and define

G(~r, ~r0) = 1/|~r − ~r0| ,   ~r0 ∈ S0

Then expand G about ~r* via

G(~r, ~r0) = G(~r, ~r*) + (~r0 − ~r*) · ∇0 G(~r, ~r*)

          + (1/2) (~r0 − ~r*)(~r0 − ~r*) : ∇0 ∇0 G(~r, ~r*) + · · ·

where ~r0 − ~r* = R0 ~n(~r0) and ~n is the unit normal to the sphere at ~r0 .

Then we can write

4πD c(~r) = ∫∫_S0 σ(~r0) dA0 G(~r, ~r*)

          + ∫∫_S0 σ(~r0)(~r0 − ~r*) dA0 · ∇0 G(~r, ~r*)

          + (1/2) ∫∫_S0 σ(~r0)(~r0 − ~r*)(~r0 − ~r*) dA0 : ∇0 ∇0 G(~r, ~r*)

          + · · ·

Suppose the interior of the sphere is not permeable to solute. How must the foregoing be
corrected?

7. The temperature T at a point P due to a heat source distributed over a volume V0 is


ZZ
1 ρ (~r0 )
T (P ) = dV0
4πk | ~r − ~r0 |
V0

Assume ρ is constant inside a sphere of radius R0 centered on the origin and zero outside.
Derive

T(origin) = (ρ/k) R0²/2

Check this by solving



   0,
1 d dT r > R0
r2 =
r 2 dr dr  −ρ, r < R0
k
Lecture 14

One Dimensional Diffusion in Bounded


Domains

14.1 Introduction

We begin with a simple problem, solute diffusion in one dimension. The diffusion takes place in a
solvent layer separating two solute reservoirs where we control what is going on in the reservoirs.
The differential operator ∇² is then ∂²/∂x² and our problem is to solve

∂c/∂t = D ∂²c/∂x² ,   0 < x < L

to determine c (t > 0) whenever c (t = 0) is assigned.

In our first problem the solvent layer is in perfect contact with large reservoirs maintained
solute free. Hence, in scaled variables x/L ⇒ x, Dt/L² ⇒ t we have

∂c/∂t = ∂²c/∂x² ,   0 < x < 1

and

c (x = 0) = 0 = c (x = 1)

LECTURE 14. ONE DIMENSIONAL DIFFUSION IN BOUNDED DOMAINS 320

where c (t = 0) is assigned and it is the only source of solute in the problem.

We are going to make several assumptions as we go along, one being that our functions are
smooth enough that integration by parts formulas can be used.

Earlier we solved

dx/dt = Ax ,   x(t = 0) specified

by expanding x(t) in the eigenvectors of A and solving for the time dependent coefficients in the
expansion. In fact, in one of our problems A was derived from difference approximations to ∂²c/∂x².

To maintain continuity with our earlier work, we are going to introduce the eigenvalue problem
for the differential operator d²/dx², viz.

d²ψ/dx² = −λ²ψ ,   0 < x < 1

Solutions, ψ, to this problem are called eigenfunctions, and we will call the corresponding values
of λ2 the eigenvalues.

This problem has been stated too generally and we need to introduce restrictions on its solu-
tions. At this point we can only say that

ψ = A cos λx + B sin λx

where A, B and λ are unknown. The problem is homogeneous and will remain homogeneous as
we introduce additional conditions, hence if ψ is a solution, so too any multiple of ψ.
What we need to do is to define the domain of the differential operator d²/dx² in a way that is
specific to the problem at hand. Here, as our solutions satisfy c (x = 0) = 0 = c (x = 1) it is
natural to require ψ (x = 0) = 0 = ψ (x = 1). This is the best choice and we will see why this is
so as we go along. We might simply argue that if we make each term in a sum vanish at x = 0,
then the sum vanishes at x = 0.

So, the eigenvalue problem is

d²ψ/dx² = −λ²ψ ,   0 < x < 1

and

ψ(x = 0) = 0 = ψ(x = 1)

and its solutions are

ψ = sin nπx ,   λ² = n²π² ,   n = 1, 2, . . .

where A = 0 and we set B = 1.

The eigenvalues are determined by the condition at x = 1, viz.,

sin λ = 0

Thus, λ = nπ, n = 0, ±1, ±2, . . .. The value n = 0 leads to ψ = 0 whereas the values n =
−1, −2, . . . lead to eigenfunctions that are multiples of those corresponding to n = 1, 2, . . .
Thus, we find an infinite set of eigenfunctions, ψ = sin nπx, corresponding to an infinite set of
eigenvalues.

λ² = n²π² ,   n = 1, 2, . . .

The main question that we would like to answer about an infinite set of functions, such as these
eigenfunctions, is whether or not it can be used as a basis for the expansion of a fairly arbitrary
function, viz., the function c (t = 0). This is what the theory of Fourier series is about and a
good elementary account of this theory can be found in Weinberger’s book (A First Course in
Partial Differential Equations with Complex Variables and Transform Methods). The expansion
we require is an infinite series, not a finite sum, and the question is not easy to answer. Still
there are conditions on the arbitrary function and on the basis functions sufficient for the series

representation of the function to mean something. On their part, the set of basis functions sin nπx,
n = 1, 2, . . . satisfy the conditions required, viz., first, they are a set of orthogonal functions. We
can establish this directly by using the functions themselves or indirectly by using the eigenvalue
problem defining the functions.

14.2 Orthogonality of the Functions ψn


We show that orthogonality is implied by the differential operator d²/dx² and the boundary conditions
and we go on and draw some conclusions about the eigenvalues. To do this we use two integration
formulas. If the functions φ and ψ are smooth enough so that we can integrate by parts, we get two
formulas

∫₀¹ φ (d²ψ/dx²) dx = [ φ dψ/dx ]₀¹ − ∫₀¹ (dφ/dx)(dψ/dx) dx

and

∫₀¹ φ (d²ψ/dx²) dx = [ φ dψ/dx − (dφ/dx) ψ ]₀¹ + ∫₀¹ (d²φ/dx²) ψ dx

where

[φ]₀¹ = φ(x = 1) − φ(x = 0)

and where these formulas hold for complex valued functions as well as for real valued functions.

Then as ψ̄ and λ̄² satisfy the eigenvalue problem for d²/dx² whenever it is satisfied by ψ and λ²,
we let ψ be an eigenfunction and φ be its complex conjugate in the second formula and determine
that λ̄² = λ², and hence that λ² is real. It follows that if ψ is an eigenfunction corresponding to the
eigenvalue λ² so too is its complex conjugate and its real and imaginary parts.

Again if ψ and φ denote an eigenfunction and its complex conjugate, the first formula tells us
that:

−λ² ∫₀¹ |ψ|² dx = [ ψ̄ dψ/dx ]₀¹ − ∫₀¹ |dψ/dx|² dx = − ∫₀¹ |dψ/dx|² dx

and hence that the corresponding eigenvalue, λ², is strictly positive. If λ² were zero, this formula
would require that ψ be a constant and as ψ(x = 0) = 0 = ψ(x = 1) this constant would then be
zero.

Returning to the second formula and taking φ and ψ to be eigenfunctions corresponding to
different eigenvalues, we discover that

∫₀¹ φ̄ψ dx = 0

and we call φ and ψ orthogonal functions. We can introduce the inner product

⟨φ, ψ⟩ = ∫₀¹ φ̄ψ dx

and restate this as

⟨φ, ψ⟩ = 0.

In terms of this inner product the two integration by parts formulas can be rewritten as

⟨φ, d²ψ/dx²⟩ = [ φ̄ dψ/dx ]₀¹ − ∫₀¹ (dφ̄/dx)(dψ/dx) dx

and

⟨φ, d²ψ/dx²⟩ = [ φ̄ dψ/dx − (dφ̄/dx) ψ ]₀¹ + ⟨d²φ/dx², ψ⟩ .

What we have established then is just what we can show to be true by direct calculation:
the eigenvalues n²π², n = 1, 2, . . ., are real and positive and the eigenfunctions, sin nπx, n =
1, 2, . . ., are orthogonal, viz.,

∫₀¹ sin mπx sin nπx dx = 0 ,   m ≠ n
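Orthogonality, along with the normalization ∫₀¹ sin² nπx dx = 1/2, can be confirmed by quadrature; a sketch (the helper name inner is ours):

```python
from math import sin, pi

def inner(m, n, N=20000):
    """Trapezoid rule for the integral of sin(m*pi*x)*sin(n*pi*x) over [0, 1]."""
    h = 1.0/N
    s = 0.0
    for i in range(N + 1):
        w = 0.5 if i in (0, N) else 1.0
        s += w*sin(m*pi*i*h)*sin(n*pi*i*h)
    return s*h

for m in range(1, 5):
    for n in range(1, 5):
        target = 0.5 if m == n else 0.0
        assert abs(inner(m, n) - target) < 1e-6
```

The off-diagonal inner products vanish and the diagonal ones equal 1/2, so √2 sin nπx is an orthonormal set.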

What is important is not the confirmation of results we already have, but the possibility of
obtaining new results. To see this we observe the important role played by the boundary con-
ditions in eliminating the term [φ̄ dψ/dx]₀¹ in the first formula and the term [φ̄ dψ/dx − (dφ̄/dx)ψ]₀¹ in the
second. Indeed if φ and ψ, and therefore their complex conjugates, satisfy the boundary conditions
ψ(x = 0) = 0, dψ/dx (x = 1) = 0 or the boundary conditions dψ/dx (x = 0) = 0, ψ(x = 1) = 0
then the conclusions are again as above. So too if the boundary conditions are dψ/dx (x = 0) =
0, dψ/dx (x = 1) = 0, except that now we can conclude only that λ² ≥ 0 as ψ = constant ≠ 0
satisfies the boundary conditions. This also obtains for periodic boundary conditions, where
ψ(x = 0) = ψ(x = 1) and dψ/dx (x = 0) = dψ/dx (x = 1), as again the terms [φ̄ dψ/dx]₀¹ and [φ̄ dψ/dx − (dφ̄/dx)ψ]₀¹
vanish and ψ = constant ≠ 0 satisfies the boundary conditions.

Because the specification of a linear combination of ψ and dψ/dx at a boundary is of physical
interest, we look also at the boundary conditions dψ/dx (x = 0) + β0 ψ(x = 0) = 0 and dψ/dx (x = 1) +
β1 ψ(x = 1) = 0 where the constants β0 and β1 take real values. Then because [φ̄ dψ/dx − (dφ̄/dx)ψ]₀¹ =
0 all conclusions drawn from the second integration by parts formula are as above whereas the first
now tells us that

−λ² ∫₀¹ |ψ|² dx = −β1 |ψ(x = 1)|² + β0 |ψ(x = 0)|² − ∫₀¹ |dψ/dx|² dx.

If β0 and β1 are not both zero then β1 ≥ 0, β0 ≤ 0 are sufficient, but, as we shall see, not necessary,
that λ² > 0.

What we conclude then is that the operator d²/dx², restricted to a variety of domains by a variety
of boundary conditions of differing physical interpretation, leads via the solution of its eigenvalue
problem to a variety of sets of orthogonal functions. Denote one such set ψ1 , ψ2 , . . . and require

that ∫₀¹ ψ̄i ψi dx = 1 so that ⟨ψi , ψj ⟩ = δij . Then sums of terms like e^(−λi²t) ψi (x) satisfy the diffusion
equation and the corresponding homogeneous boundary conditions. It remains only to determine
the weight of each such term in a series solution to a diffusion problem. And this must be decided
by the assigned solute distribution at the initial instant, t = 0. Thus we need to learn how to
determine the coefficients in an expansion such as

c (t = 0) = c1 ψ1 (x) + c2 ψ2 (x) + · · ·

where c (t = 0) is a fairly arbitrary function of x, 0 ≤ x ≤ 1.

14.3 Least Mean Square Error

Writing c (t = 0) as f (x) we find the error in an n term approximation to f (x) to be f (x) − Sₙ (x)
where Sₙ (x) = Σᵢ₌₁ⁿ cᵢψᵢ (x). The mean square error is then

    ∫₀¹ {f̄ (x) − S̄ₙ (x)}{f (x) − Sₙ (x)} dx

which, on using ∫₀¹ ψ̄ᵢψⱼ dx = δᵢⱼ, can be rewritten

    ∫₀¹ |f|² dx + Σᵢ₌₁ⁿ |cᵢ − ∫₀¹ ψ̄ᵢ f dx|² − Σᵢ₌₁ⁿ |∫₀¹ ψ̄ᵢ f dx|²

where only the second term, Σᵢ₌₁ⁿ |cᵢ − ∫₀¹ ψ̄ᵢ f dx|², depends on the values of the coefficients c₁, c₂,
. . ., cₙ. Because the second term is not negative we can make the mean square error least by
making this term zero. To do this we assign the coefficients in the expansion the values

    cᵢ = ∫₀¹ ψ̄ᵢ f dx = ⟨ψᵢ, f⟩

whence our approximation to f (x) is

    Σᵢ₌₁ⁿ ⟨ψᵢ, f⟩ ψᵢ (x)

and its mean square error is

    ∫₀¹ |f|² dx − Σᵢ₌₁ⁿ |cᵢ|².

What is going on is indicated in the picture below where the best approximation of f using
only ψ₁ is shown

    [figure: the vector f, its projection c₁ψ₁, and the error f − c₁ψ₁]

and where we see that the length of f − c₁ψ₁ is least if f − c₁ψ₁ is ⊥ to ψ₁ or if c₁ = ⟨ψ₁, f⟩.
Indeed on requiring f − Σᵢ₌₁ⁿ cᵢψᵢ to be ⊥ to ψ₁, ψ₂, · · · , ψₙ we get immediately cᵢ = ⟨ψᵢ, f⟩. And
we see that the values of the coefficients c₁, c₂, · · · , cₙ do not depend on n, remaining fixed as n
increases once determined for some value of n.
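This projection argument is easy to check numerically. The sketch below is illustrative, not from the text: it assumes the orthonormal set √2 sin iπx on [0, 1] and the sample function f = x(1 − x), and shows that perturbing any coefficient away from ⟨ψᵢ, f⟩ increases the mean square error.

```python
import numpy as np

# Orthonormal set psi_i(x) = sqrt(2) sin(i pi x) on [0, 1] (an illustrative choice)
x = np.linspace(0.0, 1.0, 20001)

def integrate(y):
    # composite trapezoid rule on the grid x
    return float(((y[1:] + y[:-1]) * (x[1:] - x[:-1])).sum() / 2.0)

def psi(i):
    return np.sqrt(2.0) * np.sin(i * np.pi * x)

f = x * (1.0 - x)                                  # sample function to approximate

c = [integrate(psi(i) * f) for i in range(1, 6)]   # Fourier coefficients <psi_i, f>

def mse(coeffs):
    # mean square error of the partial sum built from the given coefficients
    S = sum(a * psi(i + 1) for i, a in enumerate(coeffs))
    return integrate((f - S) ** 2)

err_best = mse(c)                                  # least mean square error
err_worse = mse([c[0] + 0.01] + c[1:])             # any perturbation makes it larger
```

The same cᵢ minimize the error for every n, which is the statement that the coefficients, once determined, remain fixed as n increases.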

As the mean square error is non-negative we have

    Σᵢ₌₁ⁿ |cᵢ|² ≤ ∫₀¹ |f|² dx

and for n −→ ∞

    Σᵢ₌₁^∞ |cᵢ|² ≤ ∫₀¹ |f|² dx.
LECTURE 14. ONE DIMENSIONAL DIFFUSION IN BOUNDED DOMAINS 327

This tells us that the series Σᵢ₌₁^∞ |cᵢ|² converges and therefore that |cᵢ|² −→ 0 as i −→ ∞. The
coefficients cᵢ = ∫₀¹ ψ̄ᵢ f dx = ⟨ψᵢ, f⟩ are called the Fourier coefficients of f and the set of orthogonal
functions ψ₁, ψ₂, · · · is said to be complete if for every function f in a class of functions of interest
we have

    Σᵢ₌₁^∞ |cᵢ|² = ∫₀¹ |f|² dx.

Then f (x) is approximated by Sₙ (x) = Σᵢ₌₁ⁿ ⟨ψᵢ, f⟩ ψᵢ to a mean square error that vanishes as
n −→ ∞ and the sequence Sn (x) is said to converge to f (x) in the mean or in the norm. This
does not imply that the sequence Sn (x) converges to f (x) for an arbitrary value of x on the
interval [0, 1]. But it does imply that the sequence Sn (x) converges to f (x) pointwise almost
everywhere. A discussion of pointwise and norm convergence is given in Weinberger’s book in
terms of conditions on the functions being approximated.

We will assume that the sets of orthogonal functions that we generate by solving the eigenvalue
problem for d²/dx², and indeed for ∇² itself, are complete vis-a-vis functions of interest in physical
problems. But whereas completeness implies only convergence in norm, we will go on and write


    f (x) = Σᵢ₌₁^∞ ⟨ψᵢ, f⟩ ψᵢ (x)

mindful of the warning that this might not be true for all values of x. Indeed the series obtained
by differentiating Σᵢ₌₁^∞ ⟨ψᵢ, f⟩ ψᵢ termwise might not converge for any value of x and therefore not
have a meaning in any ordinary sense.

An infinite series is nearly as useful as a finite sum if the function f (x) is smooth enough and
satisfies the same boundary conditions as do the functions ψ₁ (x), ψ₂ (x), · · · . For suppose d²f/dx²
has the Fourier series Σᵢ₌₁^∞ dᵢψᵢ (x) where dᵢ = ⟨ψᵢ, d²f/dx²⟩ and dᵢ −→ 0 as i −→ ∞. Then we have

    dᵢ = ∫₀¹ ψ̄ᵢ (d²f/dx²) dx = [ψ̄ᵢ df/dx − f dψ̄ᵢ/dx]₀¹ + ∫₀¹ f (d²ψ̄ᵢ/dx²) dx

or

    dᵢ = −λᵢ² ∫₀¹ ψ̄ᵢ f dx = −λᵢ² ⟨ψᵢ, f⟩

and hence

    cᵢ = ⟨ψᵢ, f⟩ = −dᵢ/λᵢ²

where Σᵢ₌₁^∞ cᵢψᵢ (x) is the Fourier series corresponding to f . Assuming λᵢ² ∝ i², we see that cᵢ −→ 0
faster than 1/i² as i −→ ∞ and hence that the Fourier series for f may then be a useful representation.
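The decay estimate can be illustrated numerically. In this sketch the Dirichlet set √2 sin iπx and f = x(1 − x) are assumed for illustration; f satisfies the same boundary conditions as the ψᵢ, the coefficients fall off like 1/i³, faster than 1/i², and the relation cᵢ = −dᵢ/λᵢ² is confirmed.

```python
import numpy as np

x = np.linspace(0.0, 1.0, 40001)

def integrate(y):
    # composite trapezoid rule on the grid x
    return float(((y[1:] + y[:-1]) * (x[1:] - x[:-1])).sum() / 2.0)

def psi(i):
    return np.sqrt(2.0) * np.sin(i * np.pi * x)

f = x * (1.0 - x)                # f(0) = 0 = f(1) and d2f/dx2 = -2

def c(i):                        # Fourier coefficient of f
    return integrate(psi(i) * f)

def d(i):                        # Fourier coefficient of d2f/dx2
    return integrate(psi(i) * (-2.0))

ratio = c(3) / c(1)              # exact value 1/27: the 1/i^3 decay of the odd terms
check = -d(5) / (5 * np.pi) ** 2 # reproduces c(5), i.e. c_i = -d_i / lambda_i^2
```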

14.4 Example Problems

With the foregoing as background we investigate a sequence of problems where diffusion takes
place in a solvent layer separating two reservoirs. If the reservoirs are very large, well mixed
and solute free we put c (x = 0) = 0 = c (x = 1). If the reservoirs are impermeable to solute,
we put ∂c/∂x (x = 0) = 0 = ∂c/∂x (x = 1). If the solute diffusing to the right hand boundary is
dissipated by a first order reaction taking place there we put −(D/l) ∂c/∂x (x = 1) = kc (x = 1) or
∂c/∂x (x = 1) + βc (x = 1) = 0 where β = kl/D. As we have written it, k is positive for an ordinary
decomposition of solute, but negative for an autocatalytic reaction wherein, at the wall, solute
catalyzes the production of more solute.

We can also assume that the solute diffusing to the right hand boundary accumulates in a finite,
well mixed reservoir whose composition is in equilibrium with the composition at the right-hand
edge of the film. Then we put

    −(D/l) ∂c/∂x (x = 1) = (l₁m/(l²/D)) ∂c/∂t (x = 1)

or

    ∂c/∂x (x = 1) + α ∂c/∂t (x = 1) = 0

where α = (l₁/l) m, l₁ denoting the volume of the reservoir divided by the cross sectional area of the
diffusion layer, m denoting the equilibrium distribution ratio. This problem comes up in separations
by chromatography and its solutions can be used to explain why retention times depend on
initial solute distributions even if there is no competition for the adsorbent.

In each of these problems, differing only in the conditions specified at the boundary of the
diffusion layer, we have

    ∂c/∂t = ∂²c/∂x² , 0 < x < 1

where c (t = 0) is an assigned function of x, 0 ≤ x ≤ 1. To solve our problem, we expand the
solution we seek in the eigenfunctions of d²/dx² and so we write

    c (x, t) = Σᵢ₌₁^∞ cᵢ (t) ψᵢ (x)

where the coefficients cᵢ (t) = ⟨ψᵢ, c⟩ remain to be determined. To find the equations satisfied by
the cᵢ we multiply the diffusion equation by ψ̄ᵢ and integrate over the domain, viz.,

    ∫₀¹ ψ̄ᵢ (∂c/∂t) dx = ∫₀¹ ψ̄ᵢ (∂²c/∂x²) dx

and observe, by our second integration by parts formula, that this is

    d⟨ψᵢ, c⟩/dt = [ψ̄ᵢ ∂c/∂x − c dψ̄ᵢ/dx]₀¹ − λᵢ² ⟨ψᵢ, c⟩.

Now we have what appears to be a technical difficulty: we do not know both c and ∂c/∂x at both
x = 0 and x = 1.

If the first term on the right hand side vanishes, and we set the boundary conditions in the
eigenvalue problem to make this happen, if we can, this is simply

    d⟨ψᵢ, c⟩/dt = −λᵢ² ⟨ψᵢ, c⟩

whence

    ⟨ψᵢ, c⟩ = e^(−λᵢ²t) ⟨ψᵢ, c (t = 0)⟩

and our solution is

    c (x, t) = Σᵢ₌₁^∞ ⟨ψᵢ, c (t = 0)⟩ e^(−λᵢ²t) ψᵢ (x).

We order the eigenvalues so that 0 ≤ λ₁² ≤ λ₂² ≤ · · · .

This series is an increasingly useful representation of c (x, t) as t increases. How useful it is for
small values of t depends on what c (t = 0) is. For very large values of t, c (x, t) is approximately

    ⟨ψ₁, c (t = 0)⟩ e^(−λ₁²t) ψ₁ (x)

if λ₁² > 0 and λ₂² > λ₁² or

    ⟨ψ₁, c (t = 0)⟩ ψ₁ (x) + ⟨ψ₂, c (t = 0)⟩ e^(−λ₂²t) ψ₂ (x)

if λ₁² = 0, λ₂² > 0 and λ₃² > λ₂².

It is important to notice that we have the solution to our problem, even though it is an infinite
series, and that we have not differentiated the series to obtain it.

We now look at the solutions to the eigenvalue problem corresponding to a variety of boundary
conditions where in every case ψ = A cos λx + B sin λx satisfies

    d²ψ/dx² + λ²ψ = 0.

Example (1): c (x = 0) = 0 = c (x = 1)

In the first problem a solute initially distributed over a solvent layer according to c (t = 0), is lost to
the adjacent reservoirs which maintain the edges of the diffusion layer solute free, i.e., c (x = 0) =
0 = c (x = 1). Hence we choose the boundary conditions satisfied by ψ to be ψ (x = 0) = 0 =
ψ (x = 1) whereupon the term [ψ̄ᵢ ∂c/∂x − c dψ̄ᵢ/dx]₀¹ on the right hand side of the equation for ⟨ψᵢ, c⟩
vanishes.

Then ψ (x = 0) = 0 implies A = 0 and ψ (x = 1) = 0 implies sin λ = 0 whence the
eigenfunctions and the corresponding eigenvalues are

    ψᵢ = √2 sin iπx , i = 1, 2, · · ·

and

    λᵢ² = i²π² , i = 1, 2, · · ·
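As a concrete check, here is a minimal numerical sketch of the series for Example (1) with the illustrative initial state c (t = 0) = 1 (not a case worked in the text), whose coefficients are ⟨ψᵢ, 1⟩ = √2 (1 − cos iπ)/(iπ); at large t only the first term survives.

```python
import numpy as np

def c_series(x, t, n_terms=200):
    # partial sum of <psi_i, c(t=0)> exp(-lambda_i^2 t) psi_i(x) for c(t=0) = 1
    total = np.zeros_like(x)
    for i in range(1, n_terms + 1):
        coeff = np.sqrt(2.0) * (1.0 - np.cos(i * np.pi)) / (i * np.pi)
        total += coeff * np.exp(-(i * np.pi) ** 2 * t) * np.sqrt(2.0) * np.sin(i * np.pi * x)
    return total

x = np.linspace(0.0, 1.0, 201)
mid0 = c_series(np.array([0.5]), 0.0)[0]   # series recovers c(t=0) = 1 in the interior
late = c_series(x, 0.5)                    # at t = 0.5 the i = 1 term dominates
one_term = (2.0 / np.pi) * np.exp(-np.pi ** 2 * 0.5) * 2.0 * np.sin(np.pi * x)
```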

Example (2): ∂c/∂x (x = 0) = 0, c (x = 1) = 0
The solute sink at x = 0 in Example (1) is replaced by a barrier impermeable to solute, all else
remaining the same.

To solve this problem we need a new set of eigenfunctions which we obtain by introducing new
boundary conditions, viz.,

    dψ/dx (x = 0) = 0, ψ (x = 1) = 0

for this is what is required to make the term [ψ̄ᵢ ∂c/∂x − c dψ̄ᵢ/dx]₀¹ on the right hand side of the equation
for ⟨ψᵢ, c⟩ vanish.

Then dψ/dx (x = 0) = 0 implies B = 0 and ψ (x = 1) = 0 implies cos λ = 0, A ≠ 0, whence we

have

    ψᵢ = √2 cos (i − 1/2) πx , i = 1, 2, · · ·

and

    λᵢ² = (i − 1/2)² π² , i = 1, 2, · · ·

The loss of solute is slowed by imposing the barrier on the left hand side of the layer. For long
times, after the details of the initial solute distribution are forgotten, the second layer is effectively
twice as thick as the first and to see this we need only observe that λ₁² in the second problem is one
fourth λ₁² in the first.

Example (3): ∂c/∂x (x = 0) = 0, ∂c/∂x (x = 1) = 0
Here both edges of the solute layer are isolated from the bounding reservoirs and the initial solute
distribution in the film is simply rearranged by diffusion, no solute being lost. The boundary
conditions on ψ are now

    dψ/dx (x = 0) = 0, dψ/dx (x = 1) = 0

as this leads to

    [ψ̄ᵢ ∂c/∂x − c dψ̄ᵢ/dx]₀¹ = 0

Because

    dψ/dx = −Aλ sin λx + Bλ cos λx

we see that dψ/dx (x = 0) = 0 implies Bλ = 0. Thus we have either

    λ = 0, ψ = A

or

    B = 0, ψ = A cos λx, A ≠ 0

where dψ/dx (x = 1) = 0 implies

    Aλ sin λ = 0

and our eigenfunctions and eigenvalues are

    ψᵢ = 1 , i = 1
    ψᵢ = √2 cos (i − 1) πx , i = 2, · · ·

and

    λᵢ² = (i − 1)² π² , i = 1, 2, · · · .

The average solute concentration in the solvent layer is

    cavg (t) = ∫₀¹ c (x, t) dx = ⟨ψ₁, c⟩

and in terms of this the solution can be written

    c (x, t) = cavg (t = 0) + Σᵢ₌₂^∞ ⟨ψᵢ, c (t = 0)⟩ e^(−λᵢ²t) ψᵢ (x)

Due to ⟨ψ₁, ψᵢ⟩ = 0, i = 2, · · · , we have

    cavg (t) = cavg (t = 0) .



Example (4): ∂c/∂x (x = 0) = 0, ∂c/∂x (x = 1) + βc (x = 1) = 0, β ≥ 0
Again the left hand edge of the solute layer is impermeable to solute but at the right hand edge
solute is lost by a first order process. If β = 0 we get Example (3), if β −→ ∞ we get Example
(2).

The boundary conditions are now

    dψ/dx (x = 0) = 0, dψ/dx (x = 1) + βψ (x = 1) = 0

because this choice makes the term

    [ψ̄ᵢ ∂c/∂x − c dψ̄ᵢ/dx]₀¹

vanish.

First we notice that if λ = 0, then ψ must be constant. But that constant must be zero if β ≠ 0.
The boundary condition dψ/dx (x = 0) = 0 implies B = 0 hence we have

    ψ = A cos λx , A ≠ 0

whereupon the λ′s are the solutions of

    λ sin λ − β cos λ = 0

where if λ is a solution so too is −λ, both implying the same λ² and ψ. Hence we write our equation

    λ/β = cot λ

and look for solutions λ > 0. These can be obtained graphically. The figure illustrates their
dependence on β.


We see that 0 ≤ λ₁ ≤ π/2, π ≤ λ₂ ≤ 3π/2 and indeed that (i − 1) π ≤ λᵢ ≤ (i − 1 + 1/2) π =
(i − 1/2) π, i = 1, 2, · · · . We also see that λᵢ −→ (i − 1) π as β −→ 0, the result for Example (3),
and λᵢ −→ (i − 1/2) π as β −→ ∞, the result for Example (2). As β increases from 0 to ∞, λᵢ
increases monotonically from (i − 1) π to (i − 1/2) π and this corresponds to an increasing rate of
loss of solute. Some information on how the eigenvalues depend on β is in Appendix 1.
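The graphical solution can be reproduced by bisection. A minimal sketch (β = 1 is an illustrative value) using the bracket (i − 1)π < λᵢ < (i − 1/2)π quoted above:

```python
import math

def f(lam, beta):
    # eigenvalue condition of Example (4): lam sin lam - beta cos lam = 0
    return lam * math.sin(lam) - beta * math.cos(lam)

def eigen_lambda(i, beta):
    # bisect on ((i-1) pi, (i-1/2) pi), where the i-th root lies for beta > 0
    lo = (i - 1) * math.pi + 1e-9
    hi = (i - 0.5) * math.pi - 1e-9
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if f(lo, beta) * f(mid, beta) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

beta = 1.0
lams = [eigen_lambda(i, beta) for i in range(1, 4)]
```

For β = 1 the first root is near 0.8603; increasing β pushes each λᵢ up toward (i − 1/2)π, consistent with the monotonic dependence described above.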

While each eigenvalue is a smooth monotonic function of β, moving from its β −→ 0 limit
(chemical reaction control) to its β −→ ∞ limit (diffusion control) as β increases, we observe
that if we hold β fixed, at any value other than ∞, then as i −→ ∞, λᵢ −→ (i − 1) π, and this
is its β = 0 value. The larger the value of β the larger the value of i before λᵢ can be closely
approximated by its β = 0 value, viz., lim_{i−→∞} lim_{β−→∞} λᵢ (β) ≠ lim_{β−→∞} lim_{i−→∞} λᵢ (β). Ordinarily it is only

the first few λi ’s that depart greatly from their β = 0 values but these are the most important
eigenvalues for all but the shortest times.

This is the first problem where we might not have been able to use trigonometric identities to
discover that

    ∫₀¹ ψᵢψⱼ dx = 0 , i ≠ j

and yet this condition holds here just as it does in the earlier problems.

The eigenvalues then are the squares of the values λ₁, λ₂, · · · determined as above and the
corresponding eigenfunctions are cos λᵢx divided by √((1/(2λᵢ)) sin λᵢ cos λᵢ + 1/2) as

    ∫₀¹ cos² λx dx = (1/(2λ)) sin λ cos λ + 1/2 .

It may be worthwhile to observe that for each value of β we generate an infinite set of
orthogonal eigenfunctions:

    cos λ₁x, cos λ₂x, cos λ₃x, · · ·

where λ₁, λ₂, λ₃, · · · depend on β. The eigenfunctions for one value of β are not particularly
useful in writing the solution for another value of β. The readers may wish to satisfy themselves
that this is so. An example is presented in Appendix 3.

Example (5): c (x = 0) = 0, ∂c/∂x (x = 1) + βc (x = 1) = 0, β < 0
Here we have a source of solute at x = 1, not a sink as in the earlier case where β > 0, and
for β near zero we might imagine that all λ²'s are positive. Solute is produced at the right hand
boundary, lost at the left hand boundary and we wish to know: at what value of β, as the source
becomes increasingly strong, can diffusion across the layer no longer control the source?

Our eigenvalue problem is

    d²ψ/dx² + λ²ψ = 0 , 0 < x < 1

and

    ψ = 0 at x = 0; dψ/dx + βψ = 0 at x = 1

and we observe, assuming λ² and ψ to be real, that

    −λ² ∫₀¹ ψ² dx = [ψ dψ/dx]₀¹ − ∫₀¹ (dψ/dx)² dx

where the first term on the right is

    −βψ² (x = 1) .

Hence if β > 0 we have λ² > 0, but if β < 0 we cannot tell the sign of λ² without a calculation.

The first term depends explicitly on β, the second implicitly. The signs of both are known
and opposite if β < 0. We therefore anticipate that stability will be lost if β becomes sufficiently
negative (i.e., at least one value of λ² will become negative). Indeed our formula for λ² continues
to hold if ψ (x = 0) = 0, a sink, is replaced by dψ/dx (x = 0) = 0, a barrier. In that case the critical
value of β is certainly zero; i.e., every negative value of β leads to growth.

By choosing ψ to satisfy

    ψ (x = 0) = 0, dψ/dx (x = 1) + βψ (x = 1) = 0

we have

    [ψ̄ᵢ ∂c/∂x − c dψ̄ᵢ/dx]₀¹ = 0

and the equation for ⟨ψᵢ, c⟩ is the same as in the earlier examples.

The boundary condition ψ (x = 0) = 0 implies A = 0 and λ ≠ 0, whence B ≠ 0. Then the
condition dψ/dx (x = 1) + βψ (x = 1) = 0 tells us that λ must satisfy

    λ cos λ + β sin λ = 0.

This is an equation for λ². If λ is a solution so also is −λ and both ±λ lead to the same
eigenvalue and eigenfunction. However, if we anticipate a solution λ = 0, we ought to write

    ψ = A + Bx

and then we see that λ = 0 is indeed a solution, corresponding to ψ = Bx, iff β = −1.

First we look for positive real values of λ leading to positive real values of λ². To do this we
write our equation

    −λ/β = tan λ

and indicate its solutions graphically as:

    [figure: graphs of tan λ and the line −λ/β, whose intersections give the roots]

For β > 0 and indeed for β > −1 all is well. The value of λ₁, the smallest positive root,
decreases from π to π/2 as β decreases from ∞ to 0 and then decreases from π/2 to 0 as β decreases
from 0 to −1. This makes physical sense as it tells us that an initial solute distribution dies out
more and more slowly as a solute sink loses strength and turns into a weak source. But as β passes
through −1 a root is lost and something new seems to happen. Indeed if we were to inquire as to
whether a steady concentration field were possible, wherein diffusion to the left hand reservoir just
balances production at the right hand boundary, we would find that such a condition obtains only
for β = −1.

Now λ² must be real and, when β ≥ 0, it must also be positive. This is what directed our
earlier attention to real and positive values of λ; but if we admit purely imaginary values of λ, i.e.,
λ = iω where ω is real, then λ² = −ω² and now λ² is real, as it must be, but it is negative and this
is new. To see if this might be what is happening we put λ = iω, ω > 0, into λ cos λ + β sin λ = 0
and then use cos iω = cosh ω and sin iω = i sinh ω to get

    ω cosh ω + β sinh ω = 0.

If β is not negative, this equation is not satisfied by any real values of ω. If β is negative we write
β = − |β| whence ω satisfies

    ω/|β| = tanh ω

But tanh ω increases monotonically from 0 to 1 while its derivative decreases monotonically from
1 to 0 as ω increases from 0 to ∞ and lim_{ω−→0} (tanh ω)/ω = 1. Hence there are no solutions to this
equation for 0 < |β| < 1; but for |β| > 1 there is a solution as we can see on the graph:

    [figure: graphs of tanh ω and the line ω/|β|, which intersect at an ω > 0 when |β| > 1]
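A numerical sketch confirming the negative eigenvalue (|β| = 2 is an illustrative value):

```python
import math

# For |beta| > 1 the line omega/|beta| crosses tanh(omega) at an omega > 0,
# giving lambda = i omega and hence one negative eigenvalue lambda^2 = -omega^2.
def omega_root(abs_beta):
    g = lambda w: math.tanh(w) - w / abs_beta
    lo, hi = 1e-6, 50.0       # g > 0 near 0 when |beta| > 1, g < 0 for large omega
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if g(lo) * g(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

w = omega_root(2.0)
lam_sq = -w * w               # the runaway mode's eigenvalue
residual = w * math.cosh(w) + (-2.0) * math.sinh(w)   # original equation, beta = -2
```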

So the eigenvalue λ₁² decreases from π² to 0 as β decreases from ∞ to −1 and then decreases
from 0 to −∞ as β decreases from −1 to −∞. Hence for β < −1 an initial solute distribution, no
matter what its shape, runs away, while for β > −1 an initial solute distribution, again no matter
what its shape, dies out.

What we have found then is this. The diffusion equation, under the conditions c (x = 0) = 0,
∂c/∂x (x = 1) + βc (x = 1) = 0, acts to dissipate imposed solute fields (of any size) so long as
−1 < β < ∞. The parameter β is kl/D, where k depends on temperature and l is determined by the
proximity of the source and the sink, i.e., the diffusion path length. Calculations of this sort are of
interest in the design of cylinders for the storage of acetylene, but there the autocatalytic reaction
is homogeneous and is controlled by diffusion to the wall where deactivation takes place. Again
the diameter of the tank is important as is the temperature.

We might have expected to see first one negative value of λ², then two, then three, etc., as β
decreases below −1. But we do not.

The Steady Solution When β = −1

The problem

    ∂c/∂t = ∂²c/∂x² , 0 < x < 1

and

    c (x = 0) = 0, ∂c/∂x (x = 1) + βc (x = 1) = 0

where c (t = 0) is assigned has the solution c (x, t) = 0 for all values of β if c (t = 0) = 0; likewise
the corresponding steady problem has the solution c (x) = 0 for all values of β. If β > −1 this is
the long time limit of all unsteady solutions.

The steady solution is

    c = A + Bx

and as c (x = 0) = 0 we have A = 0. Then B must satisfy B + βB = 0 and hence B = 0 for
all values of β except β = −1 where B is indeterminate. If β = −1 we must solve the dynamic
problem to discover the value of B, for the steady solution depends on how much solute is in the
film at the outset.

The eigenvalue problem is

    d²ψ/dx² + λ²ψ = 0, 0 < x < 1

and

    ψ (x = 0) = 0, dψ/dx (x = 1) + βψ (x = 1) = 0

and its solutions are

    ψ = A cos λx + (B/λ) sin λx

where A = 0 as ψ (x = 0) = 0. The eigenvalues corresponding to ψ = (B/λ) sin λx are determined
by the boundary condition at x = 1 and hence by the solutions to

    cos λ + (β/λ) sin λ = 0.

This has the solution λ = 0 iff β = −1. For small λ we can write this equation

    (1 − (1/2)λ² + · · ·) + (β/λ)(λ − (1/6)λ³ + · · ·) = 0

or

    (1 + β) − λ² (1/2 + (1/6)β) + · · · = 0

whence λ² = 0 is a root, and a simple root, iff β = −1. When λ² is zero it corresponds to the
eigenfunction ψ = Bx.

So if β = −1, the eigenvalues are the squares of the roots of λ cos λ − sin λ = 0 and the
corresponding eigenfunctions are ψ₁ = √3 x, ψ₂ = B₂ sin λ₂x, · · · . And we can demonstrate by
direct calculation that ∫₀¹ x sin λᵢx dx = 0, i = 2, · · · .

The solution to our problem when β = −1 is then

    c = (∫₀¹ √3 x c (t = 0) dx) √3 x + Σᵢ₌₂^∞ ⟨ψᵢ, c (t = 0)⟩ e^(−λᵢ²t) ψᵢ (x)

the first term being the steady solution.
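The orthogonality of ψ₁ = √3 x to the sine modes can be confirmed numerically. A minimal sketch, finding the first nonzero root of tan λ = λ (which lies just below 3π/2) and evaluating the integral:

```python
import math
import numpy as np

def root_tan_eq_lam():
    # first positive root of tan(lam) = lam, located in (pi, 3 pi / 2)
    h = lambda lam: math.tan(lam) - lam
    lo, hi = math.pi + 1e-6, 1.5 * math.pi - 1e-6
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if h(lo) * h(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

lam2 = root_tan_eq_lam()          # approx 4.4934

# trapezoid-rule evaluation of the integral of x sin(lam2 x) on [0, 1]
x = np.linspace(0.0, 1.0, 20001)
y = x * np.sin(lam2 * x)
integral = float(((y[1:] + y[:-1]) * (x[1:] - x[:-1])).sum() / 2.0)
```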

Example (6): ∂c/∂x (x = 0) = 0, ∂c/∂x (x = 1) + α ∂c/∂t (x = 1) = 0, α > 0
Now the diffusion layer is isolated from the left hand reservoir but exchanges solute with the
right hand reservoir, assumed to be of finite extent so that its concentration responds to this solute
exchange. To simplify the problem we assume that the right hand edge of the diffusion layer and
the reservoir remain in phase equilibrium for all time.

Here we see something new: time derivatives appear in our problem in two places and hence
we anticipate that our eigenvalue problem will not be the same as it was in the earlier examples.
But imagining that our time dependence will remain exponential we propose that ψ and λ² must
satisfy

    d²ψ/dx² + λ²ψ = 0

    dψ/dx (x = 0) = 0 and dψ/dx (x = 1) − αλ²ψ (x = 1) = 0

where the eigenvalue λ² now also appears in the boundary condition.

Before we solve this problem we ought to use our two integration formulas to learn something
about it.

Thus, as λ̄² and ψ̄ is a solution to our problem whenever λ² and ψ is a solution, we substitute
φ = ψ̄ in the second formula to see that λ² = λ̄², hence the eigenvalues must be real. Then
substituting φ = ψ̄ in the first formula, we see that λ² must be non negative.

Something new appears on substituting φ = ψ̄₁ and ψ = ψ₂, eigenfunctions corresponding to
distinct eigenvalues, in the second formula. We see that φ and ψ are not orthogonal in the plain
vanilla inner product. Instead they are orthogonal in the inner product

    ⟨a, b⟩ = ∫₀¹ āb dx + α ā (x = 1) b (x = 1)

Hence, to solve our diffusion problem, we write

    c = Σ cᵢ (t) ψᵢ

and find

    d⟨ψᵢ, c⟩/dt = −λᵢ² ⟨ψᵢ, c⟩

just as before, but now in a new inner product, and our solution is as before, but now in a new inner
product, viz.,

    c = Σ ⟨ψᵢ, c (t = 0)⟩ e^(−λᵢ²t) ψᵢ

where

    ⟨ψᵢ, c (t = 0)⟩ = ∫₀¹ ψ̄ᵢ c (t = 0) dx + α ψ̄ᵢ (x = 1) c (t = 0)|_(x=1)

and we can see how the initial states of the diffusion layer and the reservoir come into the solution.

It remains only to solve the eigenvalue problem. First we observe that zero is an eigenvalue
corresponding to ψ = 1. This is not surprising because we expect the system to come to rest as
t −→ ∞ with a uniform solute concentration in the diffusion layer, in equilibrium with whatever
solute concentration winds up in the reservoir. So writing

    ψ = A cos λx + B sin λx

we see that dψ/dx (x = 0) = 0 implies that B = 0, hence A must not be zero. The values of λ then
satisfy

    dψ/dx (x = 1) − λ²αψ (x = 1) = 0

and this tells us that

    λ sin λ + λ²α cos λ = 0.

This is an equation for λ² and we see that λ² = 0 is a simple root. We can find the remaining
values of λ² by graphical means, by solving

    αλ = − tan λ

for λ > 0. Because α is positive there is no solution λ = iω and hence λ² is not negative.
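The orthogonality in the new inner product can be verified numerically. The sketch below assumes α = 1 for illustration and brackets the roots of tan λ = −αλ in ((k − 1/2)π, kπ):

```python
import math
import numpy as np

alpha = 1.0

def root(k):
    # k-th nonzero root of tan(lam) + alpha lam = 0, in ((k - 1/2) pi, k pi)
    g = lambda lam: math.tan(lam) + alpha * lam
    lo, hi = (k - 0.5) * math.pi + 1e-9, k * math.pi - 1e-9
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if g(lo) * g(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

l1, l2 = root(1), root(2)
x = np.linspace(0.0, 1.0, 20001)
p = np.cos(l1 * x) * np.cos(l2 * x)
plain = float(((p[1:] + p[:-1]) * (x[1:] - x[:-1])).sum() / 2.0)  # trapezoid rule

# <a, b> = integral(a b) + alpha a(1) b(1): zero for distinct eigenfunctions
weighted = plain + alpha * math.cos(l1) * math.cos(l2)
```

The plain integral does not vanish; only the weighted inner product does, which is the point of the new inner product.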

Example (7): Periodic Conditions

Suppose we have

c (x = 0) = c (x = 1)

and

    ∂c/∂x (x = 0) = ∂c/∂x (x = 1)

then we find that

    [ψ̄ᵢ ∂c/∂x − c dψ̄ᵢ/dx]₀¹

vanishes if we choose

ψ (x = 0) = ψ (x = 1)

and

    dψ/dx (x = 0) = dψ/dx (x = 1) .

Hence writing the solution to

    d²ψ/dx² + λ²ψ = 0

as

    ψ = A cos λx + B sin λx

we see that A, B and λ must satisfy

    A = A cos λ + B sin λ

and

    Bλ = −Aλ sin λ + Bλ cos λ

and hence

    [ cos λ − 1    sin λ         ] [ A ]   [ 0 ]
    [ −λ sin λ     λ (cos λ − 1) ] [ B ] = [ 0 ]

To have a solution to these homogeneous equations such that A and B are not both zero, we must
have

    det [ cos λ − 1    sin λ         ] = 0
        [ −λ sin λ     λ (cos λ − 1) ]

First, λ = 0 is a solution (and at λ = 0 we have ψ = A + Bx) and it implies ψ = A ≠ 0, i.e., there
is only one periodic solution at λ = 0. Then, for λ ≠ 0, we have

    cos λ = 1

or

    λ = 2π, 4π, . . .

Hence our eigenvalues are

    0, (2π)², (4π)², . . .

and the corresponding eigenfunctions (not normalized) are

    1, {cos 2πx, sin 2πx}, · · · , {cos 2πnx, sin 2πnx}, · · ·

n = 1, 2, · · · . Thus to every eigenvalue not zero, we have two periodic eigenfunctions, viz., A = 1,
B = 0 and A = 0, B = 1. But corresponding to λ = 0, we have only one periodic solution.

The reader may observe that the expansion of a periodic function in these eigenfunctions is
what is ordinarily called a Fourier series expansion, where the coefficient of the first eigenfunction
is the average value of the function being expanded.
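A short numerical sketch of that observation (f(x) = x on [0, 1] is an illustrative function; the normalized eigenfunctions are 1, √2 cos 2πnx, √2 sin 2πnx):

```python
import numpy as np

x = np.linspace(0.0, 1.0, 100001)

def integrate(y):
    # composite trapezoid rule on the grid x
    return float(((y[1:] + y[:-1]) * (x[1:] - x[:-1])).sum() / 2.0)

f = x                                   # function to expand

c0 = integrate(f)                       # coefficient of psi = 1: the average of f
a1 = integrate(f * np.sqrt(2.0) * np.cos(2.0 * np.pi * x))  # first cosine coefficient
b1 = integrate(f * np.sqrt(2.0) * np.sin(2.0 * np.pi * x))  # first sine coefficient
```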

14.5 An Eigenvalue Problem Arising in Frictional Heating

We present here an eigenvalue problem that is a little out of the ordinary. First, suppose φ and μ²
satisfy

    d²φ/dx² + μ²φ = 0, −1 < x < 1

where

    φ (x = −1) = 0 = φ (x = 1)

The solutions are

    cos (1/2)πx ,  μ² = ((1/2)π)²
    sin πx ,       μ² = π²
    cos (3/2)πx ,  μ² = ((3/2)π)²
    etc.

Now the eigenvalue problem of interest in this example is

    d²ψ/dx² + ν² (ψ − ∫₋₁¹ ψ dx) = 0

where

    ψ (x = −1) = 0 = ψ (x = 1) .

This is a model for an eigenvalue problem arising in frictional heating. It appears if a small
perturbation is imposed on the solution to a problem in plane Couette flow.

The solutions odd about x = 0 are as above, viz.,

    sin πx ,   ν² = π²
    sin 2πx ,  ν² = (2π)²
    etc.

Then setting

    C = ∫₋₁¹ ψ dx

we have

    d²ψ/dx² + ν²ψ = ν²C

whereupon

    ψ = C + A cos νx + B sin νx

and we see that A, B, C and ν satisfy

    0 = C + A cos ν − B sin ν
    0 = C + A cos ν + B sin ν

i.e.,

    0 = C + A cos ν, 0 = B sin ν

and

    C = 2C + (2A/ν) sin ν

The case B ≠ 0, A = 0, ν = π, 2π, · · · corresponds to the odd solutions about x = 0 and hence
to C = 0. In the remaining case, C ≠ 0, we have A ≠ 0, B = 0 and

    (1/2) ν = sin ν/cos ν = tan ν

This equation has many positive solutions, which can be found graphically, as well as one purely
imaginary solution ν = iω, whence ν² = −ω² < 0.

There are similarities here to two earlier problems in this lecture, one of which also has a
negative eigenvalue.

The reader can use our two integration by parts formulas, now on the interval −1 ≤ x ≤ 1, to
derive general conclusions about the solutions to this new eigenvalue problem. For example setting
φ = ψ̄ in our second integration by parts formula and observing that ψ̄, ν̄² is a solution if ψ, ν² is
a solution we have

    −ν² {∫₋₁¹ ψ̄ψ dx − ∫₋₁¹ ψ̄ dx ∫₋₁¹ ψ dx} = −ν̄² {∫₋₁¹ ψψ̄ dx − ∫₋₁¹ ψ dx ∫₋₁¹ ψ̄ dx}

whereupon we conclude ν² must be real.
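Both transcendental equations for the even modes can be solved by bisection; a minimal sketch confirming one positive root of (1/2)ν = tan ν and the single negative eigenvalue:

```python
import math

def bisect(g, lo, hi):
    # simple bisection; assumes g changes sign on (lo, hi)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if g(lo) * g(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

# a positive root of (1/2) nu = tan(nu) lies in (pi, 3 pi / 2)
nu1 = bisect(lambda v: math.tan(v) - 0.5 * v, math.pi + 1e-6, 1.5 * math.pi - 1e-6)

# nu = i omega turns (1/2) nu = tan(nu) into (1/2) omega = tanh(omega)
omega = bisect(lambda w: math.tanh(w) - 0.5 * w, 1e-6, 20.0)
neg_eig = -omega ** 2             # the one negative eigenvalue nu^2
```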

14.6 More on Examples (5) and (6)

Our problem in Example (5) is to solve the eigenvalue problem

    d²ψ/dx² + λ²ψ = 0, ψ = 0 at x = 0

and

    dψ/dx = −βψ, at x = 1

whereas in Example (6) the boundary condition at x = 1 is

    dψ/dx = αλ²ψ

All β's make physical sense but β ≥ 0 is the ordinary case. Only α ≥ 0 makes sense but here
we suppose α < 0 is possible, i.e., we have an antireservoir.

In both problems

    ψ = A sin λx

where in the first we have

    λ/(−β) = tan λ

whereas in the second

    1/(αλ) = tan λ

The readers can satisfy themselves that all λ²'s are real.

If we ask: is there a value of α or β where λ² = 0, in the first problem we find one and only
one value of β : β = −1. In the second problem there are no values of α such that λ² = 0. For
β > −1 all λ²'s are positive, at β = −1 one becomes zero and at β < −1 one is negative, the
others remaining positive because there is no value of β other than −1 where λ² = 0. All λ²'s are
smooth functions of β.

This is not the way the second problem works. For α ≥ 0 all λ²'s are positive, α = 0
corresponding to an impermeable wall at x = 1. However, if we admit the possibility α < 0 we
have

    [figure: graphs of tan λ and 1/(αλ) vs λ, with asymptotes at π/2 and 3π/2]

and the root π/2 at α = 0 appears to have been lost. However, at α < 0 by setting λ = iω we have

    1/((−α) ω) = tanh ω

and we recover our lost root. Now we have: α −→ 0⁻ =⇒ ω −→ ∞ and α −→ −∞ =⇒ ω −→ 0
and we see

    [figure: λ₁² vs α, with the value (π/2)² marked at α = 0]

Our pictures show that in the second problem all λ²'s are positive for α ≥ 0, one λ² is negative,
all others remaining positive, for α < 0 and now no λ² is ever zero.

Our frictional heating problem is like the second problem if we write it

    d²ψ/dx² + ν² (ψ − γ ∫₋₁¹ ψ dx) = 0

    ψ (x = ±1) = 0

The crisis here occurs at γ = 1/2. The two earlier cases correspond to γ = 0 and γ = 1.
If we now put our autocatalytic reaction on the domain, our eigenvalue problem is

    d²ψ/dx² + λ²ψ − βψ = 0, β < 0

    ψ = 0 at x = 0, 1

And we have

    ψ = A sin √(λ² − β) x

where

    λ² = β + n²π²

And if we ask: at what values of β can λ² be zero we find

    β = −n²π² , n = 1, 2, . . .

Every λ² can turn negative and we see that the larger −β the more spatial variation is needed to
control growth. But once −β exceeds π² we have lost stability.

14.7 Differentiating the Eigenvalue Problem

In Example (4) we have

    (d²/dx² + λ²) ψ = 0

and

    dψ/dx (x = 0) = 0, (d/dx + β) ψ (x = 1) = 0

and our aim here is to determine the dependence of λ₁², λ₂², · · · on β. To do this we find, by
differentiating the problem, that dψ/dβ satisfies

    (d²/dx² + λ²) dψ/dβ = −(dλ²/dβ) ψ

and

    (d/dx)(dψ/dβ) (x = 0) = 0, (d/dx + β)(dψ/dβ) (x = 1) = −ψ (x = 1)

The corresponding homogeneous problem has a nonzero solution, hence a solvability condition
must be satisfied. It is satisfied because dψ/dβ can be found by differentiating ψ, and solvability
determines dλ²/dβ. Thus we use our second integration by parts formula to write

    ∫₀¹ ψ (d²/dx² + λ²)(dψ/dβ) dx = [ψ (d/dx)(dψ/dβ) − (dψ/dβ)(dψ/dx)]₀¹ + ∫₀¹ (dψ/dβ)(d²/dx² + λ²) ψ dx

and then substitute (d²/dx² + λ²)(dψ/dβ) = −(dλ²/dβ) ψ and (d²/dx² + λ²) ψ = 0 into this to get

    −(dλ²/dβ) ∫₀¹ ψψ dx = −ψψ (x = 1)
Using the fact that ψ is a multiple of cos λx we can write this

    dλ²/dβ = cos²λ / ((1/(2λ)) sin λ cos λ + 1/2)

and hence, using λ sin λ − β cos λ = 0 and sin²λ + cos²λ = 1, we get

    dλ²/dβ = 2λ² / (λ² + β² + β)

This differential equation determines λᵢ² vs β given that λᵢ² (β = 0) = (i − 1)² π², i = 1, 2, · · · .


Indeed we see that

    dλᵢ²/dβ (β = 0) = 2, i = 2, · · ·

but dλ₁²/dβ (β = 0) is indeterminate because λ₁² (β = 0) = 0.


To see what λ₁² is doing when β is small we first observe that λ₁ as a function of β satisfies

    λ₁ sin λ₁ − β cos λ₁ = 0.

Then when λ₁² is near zero, as it is when β is small, we can approximate λ₁ sin λ₁ and cos λ₁ by

    λ₁ sin λ₁ = λ₁² − (1/6) λ₁⁴ + (1/120) λ₁⁶ − · · ·

and

    cos λ₁ = 1 − (1/2) λ₁² + (1/24) λ₁⁴ − · · ·

and write

    λ₁² = c₁β + c₂β² + · · ·

Substituting this in

    (λ₁² − (1/6) λ₁⁴ + (1/120) λ₁⁶ − · · ·) − β (1 − (1/2) λ₁² + (1/24) λ₁⁴ − · · ·) = 0

we find that

    c₁ = 1, c₂ = −1/3, · · ·

and hence as β −→ 0 that

    λ₁² = β − (1/3) β²

or

    λ₁² = β / (1 + (1/3) β).

This approximation is useful in many ways; indeed it shows that

    dλ₁²/dβ (β = 0) = 1.
14.8 The Use of a Nondiagonalizing Basis.

To get a clear idea how much help a diagonalizing basis is in writing the solution to the diffusion
equation we carry out a calculation in a nondiagonalizing basis.

Let c satisfy

    ∂c/∂t = ∂²c/∂x² , 0 < x < 1

    ∂c/∂x (x = 0) = 0

and

    ∂c/∂x (x = 1) + βc (x = 1) = 0

where c (t = 0) is assigned. Then the eigenvalue problem is

    d²ψ/dx² + λ²ψ = 0, 0 < x < 1

    dψ/dx (x = 0) = 0

and

    dψ/dx (x = 1) + βψ (x = 1) = 0

and it is satisfied by

    ψ = A cos λx + (B/λ) sin λx
LECTURE 14. ONE DIMENSIONAL DIFFUSION IN BOUNDED DOMAINS 358

where A, B and λ remain to be determined.

As dψ/dx (x = 0) = 0 we find B = 0, and the eigenfunctions are then

    ψᵢ = A cos λᵢx

where the λᵢ are the non-negative solutions to

    λ sin λ − β cos λ = 0

and the A′s are normalization constants. For each value of β we get a set of eigenfunctions and
these eigenfunctions depend on the value of β.

Earlier, Example (4) page 334, we solved this problem. Whatever the value of β we expanded
the solution in the corresponding eigenfunctions. Here we try something else. Let β > 0 be fixed.
Then to determine the solution for this value of β we expand it in the eigenfunctions corresponding
to β = 0 as they are easy to determine.

When β = 0 the eigenvalues satisfy

λ⁽⁰⁾ sin λ⁽⁰⁾ = 0

which is an equation in (λ⁽⁰⁾)² having simple roots, viz.,

(λ⁽⁰⁾)² = 0, π², 2²π², · · ·

We normalize the corresponding eigenfunctions and denote them ψᵢ⁰, i = 0, 1, 2, · · · , where

ψ₀⁰ = 1

ψ₁⁰ = √2 cos πx

ψ₂⁰ = √2 cos 2πx

etc.

To solve the problem corresponding to a fixed value of β > 0 in terms of the eigenfunctions
corresponding to β = 0 we write


c = Σ_{i=0}^∞ cᵢ ψᵢ⁰

and our job is to find the coefficients ci in this series, where

cᵢ = ⟨ψᵢ⁰, c⟩ = ∫₀¹ ψᵢ⁰ c dx

To do this we multiply the equation for c by ψi0 , integrate the result from 0 to 1 and use the
integration by parts formula

∫₀¹ u (d²v/dx²) dx = ∫₀¹ (d²u/dx²) v dx + [u dv/dx − (du/dx) v]₀¹

to get

⟨ψᵢ⁰, ∂c/∂t⟩ = ⟨ψᵢ⁰, ∂²c/∂x²⟩ = ⟨d²ψᵢ⁰/dx², c⟩ + [ψᵢ⁰ ∂c/∂x − (dψᵢ⁰/dx) c]₀¹

and this is

d cᵢ/dt = −(λᵢ⁰)² cᵢ + ψᵢ⁰ (x = 1) ∂c/∂x (x = 1).

The technical difficulty here is that ∂c/∂x (x = 1) is not zero. In fact
∂c/∂x (x = 1) = −βc (x = 1) = −β Σ_{j=0}^∞ cⱼ ψⱼ⁰ (x = 1) and using this we get

d cᵢ/dt = −(λᵢ⁰)² cᵢ − βψᵢ⁰ (x = 1) Σ_{j=0}^∞ cⱼ ψⱼ⁰ (x = 1)

This illustrates what happens when we do not use a diagonalizing basis; the equations satisfied by
the ci are not uncoupled.
Using ψ₀⁰ (x = 1) = 1, ψⱼ⁰ (x = 1) = √2 (−1)ʲ, j = 1, 2, · · · , (λ₀⁰)² = 0 and (λⱼ⁰)² = j²π²,
j = 1, 2, · · · we must solve

d cᵢ/dt = −(λᵢ⁰)² cᵢ − βψᵢ⁰ (x = 1) c₀ − √2 βψᵢ⁰ (x = 1) Σ_{j=1}^∞ (−1)ʲ cⱼ

or

d c₀/dt = −βc₀ − √2 β Σ_{j=1}^∞ (−1)ʲ cⱼ

and

d cᵢ/dt = −i²π² cᵢ − √2 β (−1)ⁱ c₀ − 2β (−1)ⁱ Σ_{j=1}^∞ (−1)ʲ cⱼ,   i = 1, 2, · · · .

To try to learn something about our solution, we truncate the first two equations to

d c₀/dt = −βc₀ + √2 βc₁

and

d c₁/dt = −π²c₁ + √2 βc₀ − 2βc₁

and hence to determine c0 and c1 in this approximation we need the eigenvalues of the matrix
 √ 
−β 2β
 √ 
2β −π 2 − 2β

For small β these are −β and −π² and so for long time and small β the solute is dissipated as e^{−βt}
and this is correct. But more information than this is difficult to obtain in this basis.
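This can be seen numerically. The check below is mine (the quadratic formula standing in for a general eigenvalue routine): the two eigenvalues of the truncated 2 × 2 matrix approach −β and −π² as β → 0.

```python
import math

def eigs_2x2(beta):
    """Eigenvalues of [[-beta, sqrt(2) beta], [sqrt(2) beta, -pi^2 - 2 beta]]."""
    a, b = -beta, math.sqrt(2.0) * beta
    c, d = math.sqrt(2.0) * beta, -math.pi ** 2 - 2.0 * beta
    tr, det = a + d, a * d - b * c
    disc = math.sqrt(tr * tr - 4.0 * det)   # real: the matrix is symmetric
    return (tr + disc) / 2.0, (tr - disc) / 2.0

for beta in (0.01, 0.1):
    print(beta, eigs_2x2(beta))   # tends to (-beta, -pi^2) as beta -> 0
```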

14.9 A Warning About Series Solutions

Before we go on, we can get an idea of what is to come and at the same time make an observation
about Fourier series. The eigenvalue problem

d²ψ/dx² + λ²ψ = 0,   0 < x < 1

and

ψ (x = 0) = 0, ψ (x = 1) = 0


has solutions λᵢ² = i²π² and ψᵢ = √2 sin iπx, i = 1, 2, · · · . This orthogonal set of eigenfunctions
can be used to solve problems such as

0 = d²c/dx² + Q (x)

where

c (x = 0) = c0 , c (x = 1) = c1 .

Indeed, writing


c (x) = Σ_{i=1}^∞ ⟨ψᵢ, c⟩ ψᵢ

we can find ⟨ψᵢ, c⟩ by multiplying 0 = d²c/dx² + Q by ψᵢ and integrating over 0 ≤ x ≤ 1. Doing this
we get
 
0 = ⟨ψᵢ, d²c/dx²⟩ + ⟨ψᵢ, Q⟩

and using our second integration by parts formula, we find

0 = [ψᵢ dc/dx − (dψᵢ/dx) c]₀¹ + ⟨d²ψᵢ/dx², c⟩ + ⟨ψᵢ, Q⟩

Something new happens here. The term [ψᵢ dc/dx − (dψᵢ/dx) c]₀¹ does not vanish, but it can be
evaluated because ψᵢ vanishes on the boundary, eliminating the piece [ψᵢ dc/dx]₀¹, while c is
assigned there, establishing the value of the piece [−(dψᵢ/dx) c]₀¹. Using this we get

⟨ψᵢ, c⟩ = ⟨ψᵢ, Q⟩/λᵢ² − (1/λᵢ²) (dψᵢ/dx)(x = 1) c₁ + (1/λᵢ²) (dψᵢ/dx)(x = 0) c₀

and hence

c (x) = Σᵢ (⟨ψᵢ, Q⟩/λᵢ²) ψᵢ − c₁ Σᵢ (1/λᵢ²) (dψᵢ/dx)(x = 1) ψᵢ + c₀ Σᵢ (1/λᵢ²) (dψᵢ/dx)(x = 0) ψᵢ

We see, then, that c (x) is the sum of three terms each accounting for the contribution of one of
the three sources. The boundary sources introduce a special problem. To see this let Q = 0, c0 = 0
and c1 = 1, then

c (x) = −Σᵢ (1/λᵢ²) (dψᵢ/dx)(x = 1) ψᵢ = −Σᵢ (1/(i²π²)) √2 iπ cos iπ · √2 sin iπx = −Σ_{i=1}^∞ (2/(iπ)) (−1)ⁱ sin iπx

Because c (x) = x satisfies this problem, this expansion must be the Fourier series for x and indeed
it is, viz.,


x = −Σ_{i=1}^∞ (2/(iπ)) (−1)ⁱ sin iπx

The terms in this series fall off as 1/i and so convergence depends on the alternating sign, (−1)ⁱ,
i.e., 1/i − 1/(i + 1) ∼ 1/i², and gets a little help from the sign pattern of sin iπx.
i i+1 i
What we seem to have here is the function f (x) = x on 0 ≤ x ≤ 1 expanded in a series of

odd functions of period 2. But in fact what we really have is the function f (x) = x on 0 ≤ x ≤ 1,
extended first to −1 ≤ x ≤ 1 as an odd function and then extended to all x as a function of period
2, expanded in a series of functions of period 2. That is, what we have expanded is the function
shown here:

[Sketch: the odd extension of f (x) = x to −1 ≤ x ≤ 1, repeated with period 2 — a sawtooth ranging between −1 and +1 with jumps at x = ±1, ±3, · · · .]

The convergence is slow because this function is not smooth, having a jump at x = 1. But the
series converges to x for all x : 0 ≤ x < 1. It converges to 0 at x = 1, where 0 is the average of 1
and -1, the limits of the value of the function as x goes to 1 from the left and the right.
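These convergence claims can be observed in a few lines (an illustration of mine, not from the text): partial sums of the sine series for x converge slowly near the jump, and every term vanishes at x = 1.

```python
import math

def partial_sum(x, n):
    """n-term partial sum of  x = -sum_{i>=1} (2/(i pi)) (-1)^i sin(i pi x)."""
    return sum(-2.0 / (i * math.pi) * (-1) ** i * math.sin(i * math.pi * x)
               for i in range(1, n + 1))

print(partial_sum(0.5, 50))   # near 0.5; error ~ size of the first omitted term
print(partial_sum(0.99, 50))  # convergence is slow this close to the jump
print(partial_sum(1.0, 50))   # every term vanishes at x = 1
```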

The series that results on termwise differentiation,

−2 Σ_{i=1}^∞ (−1)ⁱ cos iπx,

does not converge anywhere and has no ordinary meaning.

The solution to this boundary source problem is not a superposition of terms that satisfy the
original differential equation, as we found earlier for initial value problems, nor is it a superposition
of terms that satisfy special problems of the same kind as we find for interior sources, where

c = (⟨ψᵢ, Q⟩/λᵢ²) ψᵢ

satisfies

0 = d²c/dx² + ⟨ψᵢ, Q⟩ ψᵢ

and

c (x = 0) = 0, c (x = 1) = 0.

What we have then is a series expansion of the solution to a problem driven by a boundary source
which is correct but which cannot be verified by direct substitution into the problem. Indeed what
can be learned about the solution to the problem using its series expansion is limited to operations
on the series that exclude differentiation.

14.10 Home Problems

1. Let c1 and c2 denote the concentrations of two solute species dissolved in a solvent which
is confined to a thin layer. The two solute species are distributed across the solvent layer at
t = 0 in some assigned way. The edges of the layer are impermeable to both species and an
estimate of the time for the layer to equilibrate is required. In terms of
 
c = [ c₁ ]
    [ c₂ ]

we have for the diffusive homogenization of the solvent layer

∂c/∂t = D ∂²c/∂x²,   0 < x < 1
∂t ∂x

and

∂c/∂x (x = 0) = 0,   ∂c/∂x (x = 1) = 0,
∂x ∂x

where, in units of the layer thickness,


   
D = [ D₁₁  D₁₂ ] = [ 3/2  1/2 ]
    [ D₂₁  D₂₂ ]   [ 1/2  3/2 ]

and where at t = 0:


 1, 1
0≤x<
c1 = 2
 1
 0, <x≤1
2


 0, 1
0≤x<
c2 = 2
 1
 1, <x≤1
2
How long before the maximum deviation from uniformity is less than 0.000001?

2. Let c1 , c2 , . . . , cn denote the concentrations of n solute species dissolved in a solvent and


suppose that the species participate in a set of reversible first order chemical reactions, viz.,

i ⇌ j   (with rate constants kⱼᵢ and kᵢⱼ)

Then the rate of production of solute is Kc where K is introduced in Lecture 8.

Let the solute be in chemical equilibrium and be distributed uniformly in the solvent
which is confined to a thin layer, 0 < x < 1. At t = 0 the edges of the solvent layer contact
large solute free reservoirs that hold the solute concentration there at zero for all t > 0. Then
for the loss of solute to the reservoir we have

∂c/∂t = D ∂²c/∂x² + Kc,   0 < x < 1

c (x = 0) = 0, c (x = 1) = 0

and

c (t = 0) = c^eq

where
 
c = [ c₁ ]
    [ c₂ ]
    [ ⋮  ]
    [ cₙ ]

K c^eq = 0

and
 
D = [ D₁₁  D₁₂  · · · ]
    [ D₂₁  D₂₂  · · · ]
    [  ⋮    ⋮    ⋱   ]

Determine c (x, t) and hence the time required for the solvent to be cleared of solute.
In problem 1 expanding c in the eigenvectors of D leads to two familiar diffusion equations.
That idea will not work here unless D and K have a complete set of eigenvectors in common.
But this is not ordinarily so, even if D is diagonal.

Yet the problem is special in another way: the boundary conditions are Dirichlet con-
ditions for all solute species. And so its solution can be obtained by expanding c in the
eigenfunctions determined by the ordinary eigenvalue problem

d²Ψ/dx² + λ²Ψ = 0

and

Ψ (x = 0) = 0, Ψ (x = 1) = 0

Denote the solutions to this Ψi , i = 1, 2, . . ., and write

c = Σ cᵢ(t) Ψᵢ(x)

where cᵢ(t) = ⟨Ψᵢ, c⟩ and ⟨ , ⟩ = ∫₀¹ · dx. Then as

∫₀¹ (Ψᵢ d²c/dx² − (d²Ψᵢ/dx²) c) dx = [Ψᵢ dc/dx − (dΨᵢ/dx) c]₀¹ = 0

our expansion works out here just as it does in more familiar problems.

How much time must elapse before only 1% of the initial equilibrium solute remains in
the solvent, if n = 2, if, in units of film thickness,

D = [ 1   0  ]
    [ 0  1/2 ]

and if
 
K = [ −1   1 ] ?
    [  1  −1 ]
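As a side check on the remark in problem 2 that expanding in the eigenvectors of D fails unless D and K have a complete set of eigenvectors in common: for symmetric matrices this is equivalent to DK = KD, and the matrices given here do not commute. (This check is mine, not part of the problem statement.)

```python
# Test DK - KD for the matrices of problem 2; a nonzero commutator means
# the two symmetric matrices have no common complete set of eigenvectors.
D = [[1.0, 0.0], [0.0, 0.5]]
K = [[-1.0, 1.0], [1.0, -1.0]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

DK = matmul(D, K)
KD = matmul(K, D)
commutator = [[DK[i][j] - KD[i][j] for j in range(2)] for i in range(2)]
print(commutator)   # nonzero off-diagonal entries
```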

3. Let c, the concentration of a solute in a solvent, satisfy

∂c/∂t = ∂²c/∂x²,   0 < x < 1

where c (t = 0) is assigned. Study the homogenization of the solute in three experiments:

(i) c (x = 0) = 0 = c (x = 1)

(ii) c (x = 0) = 0 = ∂c/∂x (x = 1)

(iii) ∂c/∂x (x = 0) = 0 = ∂c/∂x (x = 1)

Order the rates at which the solute goes to its long time uniform state in the three experi-
ments. Observe that the uniform state in (iii) cannot be determined by solving the steady
equation but depends on c (t = 0). This is not true in (i) and (ii).

4. Free radicals are created in acetylene whereupon they catalyse the production of more free
radicals.

The growth of free radicals is controlled by diffusion to the wall whereupon the free
radicals are destroyed upon collision with the solid surface. A tank of acetylene must not be
too large in diameter if a runaway is to be averted.

Suppose acetylene is stored between two plane walls a distance L apart. In terms of k
and D, how large can L be before a runaway occurs?

The model is

(1/D) ∂c/∂t = ∂²c/∂x² + (k/D) c,   k > 0

c = 0 at x = 0 and x = L, where c denotes the free radical concentration, [D] = L²/T and
[k] = 1/T.
Does the value of L depend on c (t = 0) ?

The steady problem

0 = ∂²c/∂x² + (k/D) c,   0 < x < L

and

c (x = 0) = 0 = c (x = L)

has the solution c = 0 for all values of kL²/D > 0. But for special values of kL²/D it has
solutions other than c = 0. Find these values.

The steady problem does not have only non-negative solutions. But you can show that
if c (t = 0) > 0, the unsteady problem must have non-negative solutions. Do this.
Write the solution to the unsteady problem if c (t = 0) > 0 and kL²/D is less than, equal
to or greater than π².
If kL²/D = π², the solution to the steady problem can only be obtained by solving the
unsteady problem and then letting t grow large. This steady solution depends on c (t = 0).

5. Two species having concentrations a and b are distributed over a one dimensional domain,
0 < x < 1. At the ends we have a = 0 and ∂b/∂x = 0 and on the domain, where the reaction

a ⇌ b

takes place, we have

∂a/∂t = ∂²a/∂x² + b − a

and

∂b/∂t = ∂²b/∂x² − b + a

Estimate the rate at which a and b go to zero.

This simple looking problem is not so simple.

6. Write the solution to the problem

∂c/∂t = ∂²c/∂x²,   0 < x < 1

c (x = 0) = 0, c (x = 1) = 1

and

c (t = 0) = 0

Do this by expanding c (x, t) in the solutions to the eigenvalue problem

d²Ψ/dx² + λ²Ψ = 0

and

Ψ (x = 0) = 0, Ψ (x = 1) = 0

Take the limit of the solution as t → ∞ and verify that this is the Fourier series for f (x) = x
on the interval −1 ≤ x ≤ 1. This series converges for all values of x and defines a periodic
function of period 2. Its value when x = 1 is zero, its limit as x → 1− is one. It is the solution
to the problem

0 = d²c/dx²,   0 < x < 1

and

c (x = 0) = 0, c (x = 1) = 1

This can be verified by construction but not by direct substitution, as the series derived by
termwise differentiation does not converge.

7. Let D be the linear differential operator

D = d²/dx²,   0 < x < ε

D = β d²/dx²,   ε < x < 1

Then using the integration by parts formulas

∫₀¹ u Dv dx = [u dv/dx]₀^ε + [uβ dv/dx]_ε¹ − ∫₀^ε (du/dx)(dv/dx) dx − ∫_ε¹ β (du/dx)(dv/dx) dx

            = [u dv/dx − (du/dx) v]₀^ε + [uβ dv/dx − β (du/dx) v]_ε¹ + ∫₀¹ (Du) v dx

show that the solutions to the eigenvalue problem

DΨ + λ²Ψ = 0,   0 < x < 1

Ψ (x = 0) = 0, Ψ (x = 1) = 0
 
Ψ (x = ε⁻) = Ψ (x = ε⁺)

and

dΨ/dx (x = ε⁻) = β dΨ/dx (x = ε⁺)

satisfy the usual orthogonality, etc., conditions.

Use the solutions to this eigenvalue problem to obtain a formula for the solution to
a diffusion problem where a solute, initially distributed over a layer composed of two im-
miscible solvents, diffuses out of the layer when it is placed in contact at t = 0 with two
solute free reservoirs that maintain the solute concentration at its edges at c = 0. The solute
concentration then satisfies


∂c/∂t = { D₁ ∂²c/∂x²,   0 < x < x₁₂
        { D₂ ∂²c/∂x²,   x₁₂ < x < ℓ

and

c (x = 0) = 0, c (x = ℓ) = 0

where c (t = 0) is assigned. At x = x12 , the solvent-solvent interface, phase equilibrium


obtains and is given by the equilibrium distribution ratio α. The rate of solute diffusion is
continuous there.

8. Solve the eigenvalue problem


d²ψ/dx² + λ²ψ = c,   ψ = 0 at x = ±1,   ∫₋₁⁺¹ ψ dx = 0

9. Solve the eigenvalue problem


d²ψ/dx² + λ²ψ = ∫₋₁⁺¹ ψ dx,   ψ = 0 at x = ±1

In this and the preceding problem, by using our integration by parts formula, you can
prove that λ2 must be real, etc.
Lecture 15

Two Examples of Diffusion in One Dimension

15.1 Instability due to Diffusion

Diffusion is always smoothing and ordinarily it is stabilizing; nonetheless there is a paper by Segel
and Jackson (L. A. Segel, J. L. Jackson, J. Theoretical Biology, (1972), 37, 545) in which it is
proposed that diffusion is destabilizing, causing non uniformities to appear in an otherwise stable,
spatially uniform system.

We present the model. Two chemical species occupy the real line, −∞ < x < ∞. We denote
their concentrations c1 and c2 and refer to the first as the activator, the second as the inhibitor.

The model is

∂c₁/∂t = D₁ ∂²c₁/∂x² + R₁ (c₁, c₂)

∂c₂/∂t = D₂ ∂²c₂/∂x² + R₂ (c₁, c₂)
∂t ∂x

where the equations

R1 (c1 , c2 ) = 0 = R2 (c1 , c2 )


have a solution c₁ = c₁⁽⁰⁾, c₂ = c₂⁽⁰⁾ and hence c₁ = c₁⁽⁰⁾, c₂ = c₂⁽⁰⁾ is a spatially uniform, time
independent solution of our model problem.

We would like to know if we can see this solution in an experiment.

Interest in this stems from the fact that an activator and an inhibitor might be in balance in a
cell wall where there also reside receptors picking up signals that the cell must respond to. Our
uniform activator-inhibitor state may not persist in the face of perturbations due to such signals
and our aim might be to find conditions that cause the cell to spring into action.
To do this we introduce small perturbations to c₁⁽⁰⁾ and c₂⁽⁰⁾, denoted ξ₁ and ξ₂, and derive

       
∂/∂t [ ξ₁ ] = [ D₁ ∂²/∂x²      0     ] [ ξ₁ ] + [ a₁₁  a₁₂ ] [ ξ₁ ]
     [ ξ₂ ]   [     0      D₂ ∂²/∂x² ] [ ξ₂ ]   [ a₂₁  a₂₂ ] [ ξ₂ ]

where the algebraic signs of the aij have a physical meaning.

Because species 1 is an activator, it causes both species to grow, hence we have

a11 > 0, a21 > 0

Likewise, because species 2 is an inhibitor, we have

a12 < 0, a22 < 0

In the absence of diffusion, the uniform state is assumed to be stable. Thus we have

a11 + a22 < 0 (trace condition)

and

a11 a22 − a21 a12 > 0 (det condition)



and this implies

(−a22 ) > a11

and

a21 (−a12 ) > a11 (−a22 )

Then, for a perturbation of wave number k we write

ξ₁ = a₁ cos kx e^{σt}

ξ₂ = a₂ cos kx e^{σt}

whereupon we have
        
σ [ a₁ ] = ( −k² [ D₁  0  ] + A ) [ a₁ ],    A = [ a₁₁  a₁₂ ]
  [ a₂ ]   (     [ 0   D₂ ]     ) [ a₂ ]         [ a₂₁  a₂₂ ]

and we see that the σ’s, the growth constants, are eigenvalues of the matrix
 
[ a₁₁ − k²D₁       a₁₂      ]
[     a₂₁      a₂₂ − k²D₂   ]

where, because x runs from −∞ to +∞, all values of k are admissible. Had x a finite range,
the admissible k's would be limited by the end conditions, e.g., Neumann conditions, periodic
conditions, etc.

Our system is stable to long wave length perturbations (k² = 0), by construction, and to small
wave length perturbations (k² → ∞) due to diffusive smoothing.

If there is an intermediate range of wave numbers where stability is lost, we say that the
instability is brought about by diffusion, though both A and

−k² [ D₁  0  ]
    [ 0   D₂ ]

are stable matrices.

Upon setting
   
det( −k² [ D₁  0  ] + A − σI ) = 0
         [ 0   D₂ ]

we obtain Re σ < 0 iff

(−k²D₁ + a₁₁) + (−k²D₂ + a₂₂) < 0 (trace condition)

and

 
(−k²D₁ + a₁₁)(−k²D₂ + a₂₂) − a₂₁a₁₂ > 0 (det condition)

Because we have stability at k² = 0, viz., a₁₁ + a₂₂ < 0, we see that

(−k²D₁ + a₁₁) + (−k²D₂ + a₂₂) < 0

and the trace condition is satisfied for all values of k².

Turning to the det condition then, we have


det (k²) = D₁D₂ k⁴ − (D₁a₂₂ + D₂a₁₁) k² + a₁₁a₂₂ − a₂₁a₁₂
            (+)                             (+)

and we observe that if det (k²) is increasing at k² = 0, it will continue to increase as k² increases
and it will always be positive. Thus to have a chance of finding an instability caused by the presence
of diffusion we must have d det (k²)/dk² negative at k² = 0, i.e., we must have

D₁a₂₂ + D₂a₁₁ > 0

whence we need

D2 > D1

due to (−a22 ) > a11 .

A graph of det (k²) vs k² appears as shown in the sketch:

[Sketch: det (k²) vs k² — a parabola in k², positive at k² = 0, dipping through a minimum before increasing again.]

and we see that to have an instability det (k²) must be negative at its least value. The least value
occurs at

2D₁D₂ k² − (D₁a₂₂ + D₂a₁₁) = 0

where the value of det (k²) is

−(1/4) (D₁a₂₂ + D₂a₁₁)²/(D₁D₂) + a₁₁a₂₂ − a₂₁a₁₂
4 D1 D2

and hence we must have

(D₁a₂₂ + D₂a₁₁)² > 4D₁D₂ (a₁₁a₂₂ − a₂₁a₁₂)

and this can be satisfied if D1 is small enough or D2 is large enough.

Thus we can arrange an instability and we can understand it: At a site where a perturbation due
to an outside signal increases the activator concentration, and hence the rates of production of the
activator and the inhibitor are both increased, instability obtains if the activator remains in place

(low D1 ) and the inhibitor diffuses away (high D2 ), leaving the activator to reinforce itself. This is
referred to as diffusion induced symmetry breaking.
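A concrete illustration (the parameter values below are my own choice, not from the text): with a₁₁ = 1, a₁₂ = −3, a₂₁ = 1, a₂₂ = −2 the uniform state is stable without diffusion, yet with D₁ = 0.05 and D₂ = 1 the quartic det (k²) dips below zero over a band of wave numbers.

```python
a11, a12, a21, a22 = 1.0, -3.0, 1.0, -2.0
D1, D2 = 0.05, 1.0

# The no-diffusion stability conditions hold for these values.
assert a11 + a22 < 0 and a11 * a22 - a21 * a12 > 0

def det_k2(k2):
    """det(k^2) = D1 D2 k^4 - (D1 a22 + D2 a11) k^2 + a11 a22 - a21 a12."""
    return D1 * D2 * k2 ** 2 - (D1 * a22 + D2 * a11) * k2 + a11 * a22 - a21 * a12

k2_min = (D1 * a22 + D2 * a11) / (2.0 * D1 * D2)   # where det(k^2) is least
print(k2_min, det_k2(k2_min))   # the minimum value is negative: instability
```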

Given an unstable range of k’s, we could go on and get an estimate of the non uniform state
that appears. This is done in Grindrod’s book “The Theory and Applications of Reaction-Diffusion
Equations: Patterns and Waves.” We turn instead to the Petri Dish problem where this is easier to
do.

15.2 Petri Dish

A steady solution to our Petri Dish problem (Lecture 1) satisfies, in scaled variables,

0 = d²c/dx² + λF (c)

where F (0) = 0, F ′ (0) > 0 and where c = 0 at x = 0, 1.

We already know that c = c0 = 0 is a solution for all values of λ, and we know that this solution
is stable to small perturbations for all values of λ < λcrit (see also §15.3); beyond λ = λcrit it is
unstable and we wish to find out what the new solution looks like for λ just beyond λcrit.

To find the non zero steady solution branch emerging from λcrit as λ increases we write

c = c₀ + ε c₁ + (1/2) ε² c₂ + · · ·

where c0 = 0, and we try to find how λ depends on ε or vice versa.

There is a method, called dominant balance, for doing this and it is explained in the books by
Bender and Orszag and by Grindrod ("Advanced Mathematical Methods for Scientists and Engineers"
and "The Theory and Application of Reaction-Diffusion Equations"). But we can illustrate
the main ideas by trying two possibilities:

(1) λ = λcrit + ε

and

(2) λ = λcrit + (1/2) ε²

First we expand the nonlinear part of the problem in powers of ε, viz.,

F (c) = F (c₀) + εF′(c₀) c₁ + (1/2) ε² (F′(c₀) c₂ + F′′(c₀) c₁²)
        + (1/6) ε³ (F′(c₀) c₃ + 3F′′(c₀) c₁c₂ + F′′′(c₀) c₁³) + · · ·

and then, assuming expansion (1), we have, using F (c0 ) = 0,

d²c₁/dx² + λcrit F′(c₀) c₁ = 0,   c₁ = 0 at x = 0, 1

and

d²c₂/dx² + λcrit F′(c₀) c₂ = −2F′(c₀) c₁ − λcrit F′′(c₀) c₁²,   c₂ = 0 at x = 0, 1

at order ε and at order (1/2) ε².
2
The first problem is the eigenvalue problem that we solved earlier, Lecture 1, to obtain λcrit,
and we found λcrit F′(c₀) = π². Whence we have

c1 = A sin πx

and our job is to find the factor A.

For our expansion, here (1), to be correct we need to be able to find c1 , c2 , . . . and hence we
move on to the second order problem and we notice that the homogeneous part of this problem
is the eigenvalue problem and we already know that it has a non zero solution, viz., c1 . Thus a
solvability condition must be satisfied in order for the calculation to continue, and we find this
condition by multiplying the second problem by c1 the first by c2 , subtracting and integrating over
0 < x < 1.

The result is
∫₀¹ c₁ (2F′(c₀) c₁ + λcrit F′′(c₀) c₁²) dx = 0

and we look at two functions F (c):

First we suppose


F (c) = c − c²,   so that   F′(0) = 1,   F′′(0) = −2,   F′′′(0) = 0

whereupon solvability at second order tells us that


Z 1 Z 1
′ 2 2 ′′ 3
2F (0) A sin πx dx + λcrit F (0) A sin3 πx dx = 0
0 0

This determines A, hence our solution just beyond λcrit is

c = (λ − λcrit) A sin πx

and expansion (1) appears to be correct.
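The amplitude in this first case can be evaluated numerically (a check of mine; the value 3/(8π) below is my own evaluation, not stated in the text). With F′(0) = 1, F′′(0) = −2 and λcrit = π², the solvability condition gives A = −2F′(0) ∫sin² / (λcrit F′′(0) ∫sin³):

```python
import math

def integral(f, n=10000):
    """Midpoint rule on [0, 1]."""
    h = 1.0 / n
    return h * sum(f((k + 0.5) * h) for k in range(n))

I2 = integral(lambda x: math.sin(math.pi * x) ** 2)   # = 1/2
I3 = integral(lambda x: math.sin(math.pi * x) ** 3)   # = 4/(3 pi)
# Solvability: 2*F'(0)*A^2*I2 + lam_crit*F''(0)*A^3*I3 = 0 with F'=1, F''=-2
A = I2 / (math.pi ** 2 * I3)
print(A, 3.0 / (8.0 * math.pi))
```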

Second we suppose


F (c) = c − c³,   so that   F′(0) = 1,   F′′(0) = 0,   F′′′(0) = −6

and now we find, at second order, that A = 0. Hence we conclude that expansion (1) fails in this
case.

Turning to expansion (2), and continuing with our second case, we have, at first and second
orders,

d²c₁/dx² + λcrit F′(c₀) c₁ = 0,   c₁ = 0 at x = 0, 1

and

d²c₂/dx² + λcrit F′(c₀) c₂ = −λcrit F′′(c₀) c₁²,   c₂ = 0 at x = 0, 1

whence

c1 = A sin πx

as before, but now solvability at second order, viz.,


∫₀¹ c₁ (−λcrit F′′(c₀) c₁²) dx = 0

is satisfied for all values of A due to F ′′ (c0 ) = 0. Hence we must go to third order where we have

d²c₃/dx² + λcrit F′(c₀) c₃ = −3λcrit F′′(c₀) c₁c₂ − λcrit F′′′(c₀) c₁³ − 3F′(c₀) c₁

c₃ = 0 at x = 0, 1

and solvability must be satisfied if we are to be able to find c3 and continue our calculations. Ordi-
narily c2 would be needed, and it can be found, but it is not needed here because
F ′′ (c0 ) = 0. (It is not usually true that the condition needed to satisfy solvability at second
order eliminates the need for c2 at third order.) The solvability condition at third order is
−λcrit F′′′(c₀) ∫₀¹ c₁⁴ dx − 3F′(c₀) ∫₀¹ c₁² dx = 0

and this determines A² as


A² = (1/2π²) · ∫₀¹ sin²πx dx / ∫₀¹ sin⁴πx dx

where we have used λcrit F′(c₀) = π², F′(c₀) = 1 and F′′′(c₀) = −6.

Our solution branch as λ passes through λcrit is then

c = √(2 (λ − λcrit)) A sin πx

and we see that how our solution depends on λ, for λ just beyond λcrit, differs as the nonlinearity

differs.

15.3 A Lecture 14 Problem

We wish to find the critical value of λ at which the solution c = 0, which holds for all λ, becomes
unstable. The model is

∂c/∂t = ∂²c/∂x² + λF (c)

where c = 0 at x = 0 and at x = 1, where F (0) = 0 and where F ′ (0) > 0.

The perturbation equation is

∂c₁/∂t = ∂²c₁/∂x² + λF′(0) c₁

where c1 = 0 at x = 0, 1.

The eigenvalue problem is

∂²ψ/∂x² + µ²ψ = 0

where ψ = 0 at x = 0, 1.

Its solutions are

ψ = A sin µx,   µ² = π², (2π)², . . .

Expanding c₁ in the eigenfunctions of d²/dx² we have

c₁ = Σ ⟨ψ, c₁⟩ ψ

where

d⟨ψ, c₁⟩/dt = (−µ² + λF′(0)) ⟨ψ, c₁⟩

As λ increases from zero, −µ² + λF′(0) starts out negative. It first becomes zero at λ = λcrit,
corresponding to µ² = µ₁², and thereafter remains positive, viz.,

[Sketch: the growth constants −µ² + λF′(0) vs λ for µ² = µ₁² and µ² = µ₂²; the line corresponding to µ₁² crosses zero first, at λ = λcrit.]
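The sketch can be reproduced in a few lines (mine, with F′(0) = 1 assumed for illustration): each mode's growth constant −µ² + λF′(0) is linear in λ, and the first to cross zero does so at λcrit = π²/F′(0).

```python
import math

def growth_rates(lam, Fprime0=1.0, n_modes=3):
    """Growth constants -mu_i^2 + lam*F'(0) for mu_i = i*pi, i = 1..n_modes."""
    return [-(i * math.pi) ** 2 + lam * Fprime0 for i in range(1, n_modes + 1)]

lam_crit = math.pi ** 2 / 1.0
print(growth_rates(0.9 * lam_crit))   # all negative: the zero solution is stable
print(growth_rates(1.1 * lam_crit))   # first mode positive: instability sets in
```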

15.4 Home Problems

1. Solve the Segel and Jackson problem on a bounded, one dimensional domain, assuming
homogeneous Neumann conditions at the ends.
Lecture 16

Diffusion in Bounded, Three Dimensional Domains

16.1 The Use of the Eigenfunctions of ∇² to Solve Inhomogeneous Problems

In this lecture we do not assume the boundary conditions are homogeneous and we do not assume
the domain is one dimensional. Thus, we replace the differential operator d²/dx² by ∇², corresponding
to diffusion in more than one dimension. Our emphasis will be on ∇² and its eigenvalue problem.

We suppose that at time t = 0 a solute is distributed in some specified way throughout a solvent
occupying a bounded region of three dimensional space. We let V denote the region as well as its
volume and we suppose that V is separated from the remainder of physical space, over which we
have control, by a piecewise smooth surface S.

Our interest is in determining how our initial distribution of solute changes as time goes on and
we assume that its concentration satisfies

∂c/∂t = ∇²c + Q (r⃗, t),   t > 0,   r⃗ ∈ V


where c (t = 0) is assigned ∀ r⃗ ∈ V and where we measure lengths in units of a length L, say
L = V^{1/3}, and measure time in units of L²/D. This sets the value of the diffusion coefficient to one
in the scaled units. The assigned functions Q (r⃗, t) and c (t = 0) specify the source of solute in
the region and the initial distribution of solute there, but the problem is indeterminate until we go
on and specify the conditions on S, the boundary of V, which define the effect of the surroundings
on what is going on inside V. To do this we divide S into three parts S₁, S₂ and S₃ and on each of
these we specify a definite condition:

c (r⃗, t) = g₁ (r⃗, t),   r⃗ ∈ S₁, t > 0

n⃗ · ∇c (r⃗, t) = g₂ (r⃗, t),   r⃗ ∈ S₂, t > 0

and

n⃗ · ∇c (r⃗, t) + βc (r⃗, t) = g₃ (r⃗, t),   r⃗ ∈ S₃, t > 0
r ∈ S3 , t > 0

where n⃗ is the outward unit normal vector to S and where β is assigned on S₃. It is assumed to
be real and it may not be constant. Ordinarily it is a positive constant. Thus we specify the solute
concentration itself on S₁, the rate of solute diffusion across S₂ and a linear combination of these
on S₃ by specifying the functions g₁, g₂, and g₃ defined on S₁, S₂ and S₃ ∀ t > 0. The conditions
on S₁, S₂ and S₃ are called Dirichlet, Neumann and Robin conditions and if S₁ = S the problem is
called a Dirichlet problem, etc.

Our goal here is to learn how to write the solution to this problem. To do this we introduce the
eigenvalue problem

∇²ψ = −λ²ψ,   r⃗ ∈ V

and denote its solutions, the eigenfunctions and the eigenvalues, ψ₁, ψ₂, . . . corresponding to λ₁², λ₂², . . .

We face two problems. The first is to specify the boundary conditions on S1 , S2 and S3 that the
eigenfunctions must satisfy in order that they can be used in solving for c. The second is to prove
that eigenfunctions are an orthogonal set of functions so that the coefficients in the expansion of c

in the eigenfunctions are the Fourier coefficients of c.

Thus we complete the statement of the eigenvalue problem and then prove orthogonality of the
eigenfunctions in the inner product
⟨φ, ψ⟩ = ∫∫∫_V φ̄ ψ dV

We assume our set of eigenfunctions is complete and we introduce the normalization

⟨ψᵢ, ψᵢ⟩ = 1
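A concrete case helps here. Take V to be the unit cube with Dirichlet conditions on all of S (an assumption of mine; the text keeps V general). Then ψ_{lmn} = √8 sin lπx sin mπy sin nπz with λ² = (l² + m² + n²)π², the 3D inner product factors into 1D integrals, and orthonormality can be checked by quadrature:

```python
import math

def overlap_1d(i, j, n=2000):
    """Midpoint-rule value of the 1D factor  \int_0^1 2 sin(i pi x) sin(j pi x) dx."""
    h = 1.0 / n
    return h * sum(2.0 * math.sin(i * math.pi * (k + 0.5) * h)
                         * math.sin(j * math.pi * (k + 0.5) * h)
                   for k in range(n))

def inner(lmn, pqr):
    # The factor 2 per dimension accounts for the sqrt(8) normalization.
    return math.prod(overlap_1d(i, j) for i, j in zip(lmn, pqr))

print(inner((1, 2, 3), (1, 2, 3)))   # ~ 1: normalized
print(inner((1, 2, 3), (1, 2, 2)))   # ~ 0: orthogonal
```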

Integration by Parts Formulas

By asking what we must do to determine the coefficients in the expansion of the solution to our
problem in the eigenfunctions of ∇², we will discover the conditions that the eigenfunctions must
satisfy on S₁, S₂ and S₃, and to do this we need two integration by parts formulas for functions
defined on V; then the argument is much as it was in Lecture 14.

In Brand’s book, “Vector and Tensor Analysis,” there are many very general integration theo-
rems. But all we need are the three dimensional forms of our earlier integration by parts formulas.
If φ and ψ are sufficiently smooth these are:
∫∫∫_V φ ∇²ψ dV = ∫∫_S φ n⃗ · ∇ψ dA − ∫∫∫_V ∇φ · ∇ψ dV

and

∫∫∫_V φ ∇²ψ dV = ∫∫_S {φ n⃗ · ∇ψ − ψ n⃗ · ∇φ} dA + ∫∫∫_V ψ ∇²φ dV

where φ and ψ are real or complex valued functions of r⃗ defined throughout V and on its boundary.
These formulas are called Green's first and second theorems and we can write the solution to our
diffusion problem using them.

To begin we assume that the solution to the diffusion equation can be expanded in the eigen-
functions of ∇² and that the coefficients in the expansion are the Fourier coefficients of c (r⃗, t).
Thus we write


c (r⃗, t) = Σ_{i=1}^∞ cᵢ(t) ψᵢ(r⃗)

and our job is to find the coefficients ci (t), viz., to find


cᵢ(t) = ⟨ψᵢ, c⟩ = ∫∫∫_V ψ̄ᵢ c dV.

To derive the equation satisfied by cᵢ(t) we multiply the diffusion equation by ψ̄ᵢ and integrate
over V obtaining

∫∫∫_V ψ̄ᵢ ∂c/∂t dV = ∫∫∫_V ψ̄ᵢ ∇²c dV + ∫∫∫_V ψ̄ᵢ Q dV.

Then, using Green's second theorem to turn the first term on the right hand side into terms we can
evaluate, we have

d⟨ψᵢ, c⟩/dt = ∫∫_S {ψ̄ᵢ n⃗ · ∇c − c n⃗ · ∇ψ̄ᵢ} dA + ⟨∇²ψᵢ, c⟩ + ⟨ψᵢ, Q⟩

and we write this

d⟨ψᵢ, c⟩/dt = −λᵢ² ⟨ψᵢ, c⟩ + ⟨ψᵢ, Q⟩
              + ∫∫_{S₁} {ψ̄ᵢ n⃗ · ∇c − c n⃗ · ∇ψ̄ᵢ} dA
              + ∫∫_{S₂} {ψ̄ᵢ n⃗ · ∇c − c n⃗ · ∇ψ̄ᵢ} dA
              + ∫∫_{S₃} {ψ̄ᵢ n⃗ · ∇c − c n⃗ · ∇ψ̄ᵢ} dA

The Boundary Conditions Satisfied by the Eigenfunctions

Assuming that we have the eigenfunctions and the eigenvalues, this is an equation by which we can
determine cᵢ(t) = ⟨ψᵢ, c⟩, i.e., the coefficient of ψᵢ in the solution to our problem. The first two
terms on the right hand side present no problem. But in the third term n⃗ · ∇c is not specified on S₁,
in the fourth term c is not specified on S₂, while in the fifth term neither n⃗ · ∇c nor c is specified
on S₃. The equation then is indeterminate and it is our choice of the boundary conditions satisfied
by ψᵢ, which completes the definition of the eigenvalue problem, that removes this indeterminacy.
So to make this a determinate equation for ⟨ψᵢ, c⟩, we put ψᵢ = 0 on the part of the boundary where
c is specified but n⃗ · ∇c is not; while on the part of the boundary where n⃗ · ∇c is specified but c
is not we put n⃗ · ∇ψᵢ = 0; on the remaining part of the boundary where n⃗ · ∇c + βc is specified
we put n⃗ · ∇ψᵢ + βψᵢ = 0. Then the differential equation

d⟨ψᵢ, c⟩/dt = −λᵢ² ⟨ψᵢ, c⟩ + ⟨ψᵢ, Q⟩
              − ∫∫_{S₁} g₁ n⃗ · ∇ψ̄ᵢ dA
              + ∫∫_{S₂} ψ̄ᵢ g₂ dA
              + ∫∫_{S₃} ψ̄ᵢ g₃ dA

and the initial condition

⟨ψᵢ, c⟩ (t = 0) = ⟨ψᵢ, c (t = 0)⟩

determine the coefficient ⟨ψᵢ, c⟩ in the expansion

X
c (−

r , t) = hψi , ci ψi (−

r ).

Each coefficient, ⟨ψᵢ, c⟩, can be written as the sum of five terms, each one corresponding to one
of the sources: c (t = 0), Q, g₁, g₂ and g₃. If, as an example, S₁ is all of S then the sources are

c (t = 0), Q and g₁ and ⟨ψᵢ, c⟩ is

⟨ψᵢ, c⟩ = ⟨ψᵢ, c (t = 0)⟩ e^{−λᵢ²t}
          + ∫₀ᵗ e^{−λᵢ²(t−τ)} ⟨ψᵢ, Q (t = τ)⟩ dτ
          − ∫₀ᵗ e^{−λᵢ²(t−τ)} ∫∫_S g₁ (t = τ) n⃗ · ∇ψ̄ᵢ dA dτ

This, when multiplied by ψi and summed over i, is the solution to our problem as it depends on the
three sources of solute: c (t = 0) ,Q and g1 . Each term, in fact, produces a solution to the diffusion
equation corresponding to one of the sources when the other two vanish.
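Each coefficient obeys d cᵢ/dt = −λᵢ² cᵢ + qᵢ(t), where qᵢ(t) lumps together the interior and boundary source terms, and the displayed formula is the variation-of-constants solution of that scalar equation. A small check of mine, using a constant qᵢ so that a closed form is available:

```python
import math

def coefficient(t, lam2, c0, q, n=4000):
    """c_i(t) = c_i(0) e^{-lam2 t} + integral_0^t e^{-lam2 (t - tau)} q(tau) dtau,
    with the integral done by the midpoint rule."""
    h = t / n
    integral = h * sum(math.exp(-lam2 * (t - (k + 0.5) * h)) * q((k + 0.5) * h)
                       for k in range(n))
    return c0 * math.exp(-lam2 * t) + integral

lam2, c0, q0, t = math.pi ** 2, 1.0, 2.0, 0.3
numeric = coefficient(t, lam2, c0, lambda tau: q0)
exact = c0 * math.exp(-lam2 * t) + q0 / lam2 * (1.0 - math.exp(-lam2 * t))
print(numeric, exact)
```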

This method of solving for $c$ requires that an eigenvalue problem be solved. Doing this determines a set of eigenfunctions, and a way of doing this will be presented in Lecture 17. Then to produce a solution to the diffusion equation acting under a specified set of sources, each eigenfunction must be multiplied by a coefficient $\langle\psi_i, c\rangle$ and the product summed over the set of eigenfunctions. Each coefficient is the solution of a linear, constant coefficient, first order differential equation, independent of every other coefficient. Each coefficient depends in its own way on the sources of the solute, i.e., on $c(t=0)$, $Q$, $g_1$, $g_2$ and $g_3$, and its dependence on the sources is additive. The coefficient $\langle\psi_i, c\rangle$ depends on $t$ and is a sum of terms, each depending on $t$ and each corresponding to one of the sources of the field. This separation of the contributions of the sources carries over to the solution itself and is one form of the principle of superposition satisfied by the solutions to the diffusion equation. It is also satisfied by the solutions to all linear equations.
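The independence of the coefficients and the additivity of the sources can be seen in a small discrete sketch. Everything here — the grid, the second-difference matrix standing in for $\nabla^2$ on $(0,1)$ with Dirichlet ends, the initial distribution and the steady source — is made up for illustration, not taken from the text:

```python
import numpy as np

# Discrete stand-in for nabla^2 on (0, 1) with c = 0 at both ends.
n = 200
h = 1.0 / (n + 1)
A = (np.diag(-2.0 * np.ones(n)) + np.diag(np.ones(n - 1), 1)
     + np.diag(np.ones(n - 1), -1)) / h**2

lam2, psi = np.linalg.eigh(-A)          # eigenvalues lambda_i^2 > 0, orthonormal columns psi_i
x = np.linspace(h, 1.0 - h, n)
c0 = np.exp(-50.0 * (x - 0.3)**2)       # made-up initial solute distribution
Q = np.ones(n)                          # made-up steady volumetric source

def c(t, c_init, source):
    """Each coefficient <psi_i, c> evolves on its own:
       a_i(t) = <psi_i, c0> e^{-lam_i^2 t} + <psi_i, Q>(1 - e^{-lam_i^2 t})/lam_i^2."""
    a = (psi.T @ c_init) * np.exp(-lam2 * t) \
        + (psi.T @ source) * (1.0 - np.exp(-lam2 * t)) / lam2
    return psi @ a

# Superposition: the solution driven by c0 and Q together is the sum of the
# solutions driven by each source alone.
t = 0.05
both = c(t, c0, Q)
print(np.allclose(both, c(t, c0, 0 * Q) + c(t, 0 * c0, Q)))
# The smallest eigenvalue is near pi^2, and the long-time limit is the steady state.
print(abs(lam2[0] - np.pi**2) < 1e-2, np.allclose(c(10.0, c0, Q), np.linalg.solve(-A, Q)))
```

Each coefficient is integrated in closed form, independently of the others, exactly as the text describes; the sources enter additively.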

We now know what our method is and how the sources of the field make their contribution to
the solution. We also know what the eigenvalue problem is; it is the homogeneous problem

$$\nabla^2\psi = -\lambda^2\psi, \quad \vec{r}\in V$$

$$\psi = 0, \quad \vec{r}\in S_1$$

$$\vec{n}\cdot\nabla\psi = 0, \quad \vec{r}\in S_2$$

$$\vec{n}\cdot\nabla\psi + \beta\psi = 0, \quad \vec{r}\in S_3$$

The nonzero solutions ψ are called eigenfunctions while the corresponding values of λ2 are called
eigenvalues.

Expanding a Function in a Series of Eigenfunctions

Suppose we wish to approximate a function defined in a region V by a sum

$$f = \sum_{i=1}^{n} c_i\,\psi_i$$

The question is: how should we assign values to the coefficients ci ?

Ordinarily we expect to have an orthonormal set of functions $\psi_1, \psi_2, \ldots$, viz.,

$$\langle\psi_i, \psi_j\rangle = \iiint_V \overline{\psi}_i\,\psi_j\, dV = \delta_{ij}$$

and we do not expect this set of functions to be finite in number.

We denote by $S_n$ the $n$-term approximation to $f$,

$$S_n = \sum_{i=1}^{n} c_i\,\psi_i$$

hence the error is $f - S_n$ and the mean square error, MSE, viz.,

$$\iiint_V \overline{(f - S_n)}\,(f - S_n)\, dV$$

is nonnegative.

Now we have

$$\overline{(f - S_n)}\,(f - S_n) = \overline{f}f - \overline{S}_n f - \overline{f}S_n + \overline{S}_n S_n = \overline{f}f - \sum_i \overline{c}_i\,\overline{\psi}_i f - \sum_i c_i\,\overline{f}\psi_i + \sum_i\sum_j \overline{c}_i\,\overline{\psi}_i\, c_j\,\psi_j$$

and therefore

$$\mathrm{MSE} = \iiint_V |f|^2\, dV - \sum_i \overline{c}_i \iiint_V \overline{\psi}_i f\, dV - \sum_i c_i \iiint_V \psi_i\,\overline{f}\, dV + \sum_i \overline{c}_i c_i = \iiint_V |f|^2\, dV + \sum_{i=1}^{n}\left| c_i - \iiint_V \overline{\psi}_i f\, dV\right|^2 - \sum_{i=1}^{n}\left|\iiint_V \overline{\psi}_i f\, dV\right|^2$$

due to

$$\left| c_i - \iiint_V \overline{\psi}_i f\, dV\right|^2 = \overline{c}_i c_i - \overline{c}_i \iiint_V \overline{\psi}_i f\, dV - c_i \iiint_V \psi_i\,\overline{f}\, dV + \iiint_V \overline{\psi}_i f\, dV \iiint_V \psi_i\,\overline{f}\, dV$$

Hence, we see that only the second term depends on the $c_i$'s, and to make the MSE least we should set the $c_i$'s to

$$c_i = \iiint_V \overline{\psi}_i f\, dV = \langle\psi_i, f\rangle$$

Then our best $n$-term approximation is

$$\sum_{i=1}^{n} \langle\psi_i, f\rangle\,\psi_i$$

and its mean square error is

$$\iiint_V |f|^2\, dV - \sum_{i=1}^{n} |c_i|^2 \geq 0$$

Upon increasing $n$, no $c_i$'s already known need be reevaluated, and we have

$$\iiint_V |f|^2\, dV - \sum_{i=1}^{\infty} |c_i|^2 \geq 0$$

and the infinite series converges, whereupon ci → 0 as i → ∞.

The coefficients $c_i$ are called the Fourier coefficients of $f$ and we assume

$$\sum_{i=1}^{\infty} |c_i|^2 = \iiint_V |f|^2\, dV$$

for all functions f of interest. Hence our series for f converges to f in the mean, i.e., the MSE
vanishes as n → ∞, and we ordinarily expect to have pointwise convergence almost everywhere
in V .

The set of functions ψi is then said to be complete.
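A quadrature sketch makes the argument above concrete. The function $f$ and the quadrature details below are made up; the sine functions are the familiar orthonormal eigenfunctions of the unit interval with Dirichlet ends:

```python
import numpy as np

# Orthonormal eigenfunctions psi_i(x) = sqrt(2) sin(i pi x) on (0, 1), and a
# made-up function f to expand.
x = np.linspace(0.0, 1.0, 4001)
dx = x[1] - x[0]
f = x * (1.0 - x)

def inner(u, v):
    # plain (real) inner product <u, v>; trapezoid rule (integrands vanish at the ends)
    return np.sum(u * v) * dx

f2 = inner(f, f)                         # integral of |f|^2
running = 0.0
mse = []
for i in range(1, 21):
    psi_i = np.sqrt(2.0) * np.sin(i * np.pi * x)
    ci = inner(psi_i, f)                 # Fourier coefficient <psi_i, f>
    running += ci**2
    mse.append(f2 - running)             # mean square error of the best i-term sum

# Bessel's inequality holds at every n, and the error is driven toward zero,
# as completeness requires.
print(all(m > -1e-6 for m in mse), mse[0] > mse[-1], mse[-1] < 1e-6)
```

Each new term only adds a coefficient; the ones already computed never change, just as the text observes.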

16.2 The Facts about the Solutions to the Eigenvalue Problem

We can go on and learn something about the eigenfunctions and the eigenvalues of $\nabla^2$ by using Green's two theorems. If $\psi$ and $\lambda^2$ satisfy the eigenvalue problem then so also do $\overline{\psi}$ and $\overline{\lambda}^2$, and hence on putting $\phi = \overline{\psi}$ in Green's second theorem we conclude that $\lambda^2$ must be real. Then on putting $\phi = \overline{\psi}$ in Green's first theorem we get

$$-\lambda^2 \iiint_V |\psi|^2\, dV = \iint_S \overline{\psi}\,\vec{n}\cdot\nabla\psi\, dA - \iiint_V |\nabla\psi|^2\, dV = -\iint_{S_3} \beta\,|\psi|^2\, dA - \iiint_V |\nabla\psi|^2\, dV$$

This is an equation telling us the sign of $\lambda^2$. First, if $\beta > 0$, then $\lambda^2 = 0$ cannot be a solution and we have $\lambda^2 > 0$. But, if $\beta = 0$ and $S_2$ includes $S_3$, then $\lambda = 0$ and $\psi = \text{constant}$ might be a solution. Indeed if $S_2 = S$, and we have a Neumann problem, $\lambda^2 = 0$, $\psi = \text{constant} \neq 0$ is one solution to our eigenvalue problem. Otherwise, $\beta > 0$ or $S_1 = S$, we must have $\lambda^2 > 0$. We go on and put $\phi = \overline{\psi}_i$, $\psi = \psi_j$ in Green's second theorem, where $\psi_i$ and $\psi_j$ are solutions to the eigenvalue problem corresponding to distinct eigenvalues, and learn that

$$\langle\psi_i, \psi_j\rangle = 0.$$

This, and the observation that any independent set of eigenfunctions corresponding to the same eigenvalue can be replaced by an orthogonal set of eigenfunctions, shows that the eigenvalue problem determines an orthogonal set of eigenfunctions in the inner product $\langle\phi, \psi\rangle = \iiint_V \overline{\phi}\,\psi\, dV$. In fact, restricting $\nabla^2$ to the class of functions on $V$ satisfying homogeneous boundary conditions on $S = S_1 + S_2 + S_3$, we have $\langle\phi, \nabla^2\psi\rangle = \langle\nabla^2\phi, \psi\rangle$ and we say that $\nabla^2$ is self-adjoint on that class of functions. In a way we have been very lucky. The boundary conditions of physical interest, the plain vanilla inner product and the differential operator $\nabla^2$ have gotten together and given us simple answers to the important questions. In another inner product, or for other boundary conditions, or for another differential operator, we would be required to determine an adjoint operator and adjoint boundary conditions to work out our theory. This comes up again in Lecture 19.
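The same luck is visible in a discrete picture. A hypothetical second-difference stand-in for $\nabla^2$ with Neumann ends (all numbers below are made up) is a symmetric matrix in the plain inner product, so its eigenvalues come out real, its eigenvectors orthogonal, and the Neumann case keeps $\lambda^2 = 0$ with a constant eigenfunction, as found above:

```python
import numpy as np

n, h = 150, 1.0 / 150
# Symmetric second-difference stand-in for nabla^2 with Neumann conditions
# (n . grad psi = 0 at both ends): a made-up discrete example.
A = np.diag(-2.0 * np.ones(n)) + np.diag(np.ones(n - 1), 1) + np.diag(np.ones(n - 1), -1)
A[0, 0] = A[-1, -1] = -1.0
A /= h**2

lam2, psi = np.linalg.eigh(-A)          # -A symmetric: real eigenvalues, orthonormal columns
print(np.all(lam2 >= -1e-8))            # no negative eigenvalues
print(abs(lam2[0]) < 1e-8)              # lambda^2 = 0 survives for the pure Neumann problem
const = psi[:, 0]
print(np.allclose(const, const.mean())) # ...with a constant eigenfunction
print(np.allclose(psi.T @ psi, np.eye(n)))  # orthogonality in the plain inner product
```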

In the next lecture we turn to the question: how do we solve the eigenvalue problem? And we
explain the method of separation of variables for doing this. The readers may satisfy themselves
that there are places in the foregoing where it is important that the coefficient β in the Robin
boundary condition be real and places where it is important that β be positive but nowhere is it
required that β be constant. While this is so, in solving the eigenvalue problem by the method
of separation of variables it will also be important that β be a constant, or at least be piecewise
constant and constant on each coordinate surface.

16.3 The Critical Size of a Region Confining an Autothermal Heat Source

But before we go on to separation of variables we introduce the use of the eigenvalues of ∇2 to


estimate critical conditions in nonlinear problems.

To be definite, let heat be generated in a bounded region V of three dimensional space. To


reach a steady condition where the heat lost to the surroundings balances the heat generated in V ,
the heat must be carried to the boundary of the region by conduction. The boundary is assumed to
be in good contact with a heat bath held at a fixed temperature T0 .

The more distant the heat source is from the boundary, the higher the temperature must rise
there to conduct it away. If the source is assigned in advance the heat generation can always

be balanced by heat conduction. But if the source depends on the temperature, and increases in strength as the temperature increases, then there is a positive feedback and this may create a critical condition beyond which a steady solution cannot be found.

To see why this might be so we can study the problem as the region grows larger in size. Then
heat is generated at greater and greater distances from the boundary and the temperature required
to dissipate it must increase. The greatest temperature in the region then increases as the size of
the region increases and, as this goes on, the source grows stronger. Depending on how fast the
strength of the source increases as the temperature increases, there may be a critical size of the
region beyond which the heat generated therein cannot be conducted steadily to the surroundings.
This critical condition is called a runaway condition and it leads to a thermal explosion.

We suppose that the temperature is the only variable of interest and that the heat source is an
exponential function of the temperature. Then in scaled variables we have

$$\nabla^2 u + \mu^2 e^u = 0, \quad \vec{r}\in V$$

and

$$u = 0, \quad \vec{r}\in S$$

where the size of the region appears in the constant µ2 which indicates the strength of the source.
This model is introduced by D.A. Frank-Kamenetskii in his book “Diffusion and Heat Conduction
in Chemical Kinetics.”

In certain simple geometries this equation can be solved and the critical value of µ2 can be
determined. But that is not our aim here. What we do instead is use the eigenvalue problem for ∇2
in V to estimate the critical value of µ2 . To do this we write our problem:

$$\nabla^2 u + \mu^2 f(u) = 0, \quad \vec{r}\in V$$

and

$$u = 0, \quad \vec{r}\in S$$

where $f(u)$ denotes the nonlinear source of heat and where $u$ and $f(u)$ have been scaled so that $f(u=0) = 1$ and $\dfrac{df}{du}(u=0) = 1$. Our interest is in how large $\mu^2$ can be, consistent with a bounded solution $u$.

We assume that $f(u)$ can be written

$$f(u) = u + g(u)$$

where $g(u=0) = 1$ and $\dfrac{dg}{du}(u=0) = 0$ and where we must have $g(u) \geq 0\ \forall\, u \geq 0$. Indeed if $\dfrac{d^2 f}{du^2} \geq 0\ \forall\, u \geq 0$ then $g(u) \geq 1\ \forall\, u \geq 0$.
The solutions to our problem must be non-negative and our job is to estimate the range of values of the control variable $\mu^2$ where this obtains. To do this we assume that we have a non-negative solution to our problem

$$\nabla^2 u + \mu^2 u + \mu^2 g(u) = 0, \quad \vec{r}\in V$$

and

$$u = 0, \quad \vec{r}\in S$$

for some value of $\mu^2$ and then introduce for comparison the eigenvalue problem for $\nabla^2$ in $V$, viz.,

$$\nabla^2\psi + \lambda^2\psi = 0, \quad \vec{r}\in V$$

and

$$\psi = 0, \quad \vec{r}\in S$$

This problem determines a set of eigenfunctions $\psi_1, \psi_2, \cdots$ and a corresponding set of eigenvalues $\lambda_1^2, \lambda_2^2, \cdots$ where the eigenvalues are greater than zero. If we have $\lambda_1^2 < \lambda_2^2$ then $\psi_1$ must be singly signed and we take it to be non-negative. Weinberger has a simple argument for this but it can be seen to be true on physical grounds as the long time solution to an initial value problem in diffusion is

$$\langle\psi_1, c(t=0)\rangle\, e^{-\lambda_1^2 t}\,\psi_1$$

To determine a bound on $\mu^2$ we observe that our second integration by parts formula tells us

$$\iiint_V \left(\psi_1\nabla^2 u - u\nabla^2\psi_1\right) dV = \iint_S \vec{n}\cdot\{\psi_1\nabla u - u\nabla\psi_1\}\, dA$$

where the right hand side is zero by the boundary conditions on $u$ and $\psi_1$. The left hand side is

$$\iiint_V \left(-\mu^2\psi_1 u - \mu^2\psi_1 g(u) + \lambda_1^2 u\psi_1\right) dV$$

and as this must be zero the value of $\mu^2$ corresponding to $u$ must satisfy

$$\frac{\mu^2}{\lambda_1^2} = \frac{\displaystyle\iiint_V u\psi_1\, dV}{\displaystyle\iiint_V u\psi_1\, dV + \iiint_V g(u)\,\psi_1\, dV}$$

and hence we have

$$\mu^2 < \lambda_1^2.$$

This tells us that the critical value of $\mu^2$ lies to the left of $\lambda_1^2$.

This is interesting. It tells us that the critical value of $\mu^2$, a control variable in a nonlinear problem, cannot exceed $\lambda_1^2$, the slowest diffusion eigenvalue, where $\lambda_1^2$ can be determined by solving a linear eigenvalue problem. If we replace $e^u$ by $1 + u$, viz., linear heating, we expect to find

$$\mu^2_{\text{crit}} = \lambda_1^2.$$

The heating problem now looks a lot like the eigenvalue problem.
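A numerical sketch of the bound, under stated assumptions: a hypothetical one-dimensional version of the autothermal problem on $(0,1)$ with $u = 0$ at the ends, where $\lambda_1^2 = \pi^2 \approx 9.87$. The grid, the successive-substitution scheme and the cutoffs below are all made up; the point is only that a steady temperature exists for small $\mu^2$ while the iteration runs away well before $\mu^2$ reaches $\lambda_1^2$, consistent with $\mu^2_{\text{crit}} < \lambda_1^2$:

```python
import numpy as np

# Hypothetical 1-D version of the autothermal problem on (0, 1), u = 0 at the ends.
n = 200
h = 1.0 / (n + 1)
L = (np.diag(-2.0 * np.ones(n)) + np.diag(np.ones(n - 1), 1)
     + np.diag(np.ones(n - 1), -1)) / h**2        # discrete nabla^2, Dirichlet
lam1_sq = np.linalg.eigvalsh(-L)[0]               # slowest diffusion eigenvalue, near pi^2

def steady(mu_sq, iters=400, blowup=50.0):
    """Successive substitution u <- mu^2 (-nabla^2)^{-1} e^u; None if it runs away."""
    u = np.zeros(n)
    for _ in range(iters):
        u_new = mu_sq * np.linalg.solve(-L, np.exp(u))
        if u_new.max() > blowup:
            return None                           # thermal runaway
        if np.max(np.abs(u_new - u)) < 1e-10:
            return u_new                          # steady temperature found
        u = u_new
    return None

print(round(lam1_sq, 2))            # near pi^2
print(steady(1.0) is not None)      # mu^2 = 1: a steady state exists
print(steady(12.0) is None)         # mu^2 = 12 > lambda_1^2: runaway
```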

16.4 Solvability

We may ask whether or not a problem presented to us is solvable. The question ordinarily comes
up in a steady state problem such as

$$\nabla^2\phi + \mu^2\phi = Q \quad \text{in } V$$

where, for example,

$$\phi = 0 \quad \text{on } S$$

and Q is assigned throughout V .

We are not asking whether or not the expansion of Q in the eigenfunctions of ∇2 makes sense.
In fact Q ordinarily does not satisfy the same conditions on S as do the eigenfunctions and hence
its series expansion most likely converges in norm, not pointwise.

Our problem is to find the coefficients in an expansion of $\phi$ in the eigenfunctions of $\nabla^2$, viz.,

$$\phi = \sum_i c_i\,\psi_i$$

where $c_i = \langle\psi_i, \phi\rangle$. To do this we use our second integration by parts formula to obtain

$$\langle\nabla^2\psi_i, \phi\rangle + \iint_S \left(\overline{\psi}_i\,\vec{n}\cdot\nabla\phi - \phi\,\vec{n}\cdot\nabla\overline{\psi}_i\right) dA + \mu^2\langle\psi_i, \phi\rangle = \langle\psi_i, Q\rangle$$

Thus we have

$$\left(-\lambda_i^2 + \mu^2\right)\langle\psi_i, \phi\rangle = \langle\psi_i, Q\rangle$$

and we conclude that so long as $\mu^2$ is not one of the eigenvalues of $\nabla^2$ we can find a solution to our problem. If, however, $\mu^2 = \lambda_i^2$ then $Q$ must be perpendicular to every independent eigenfunction corresponding to $\lambda_i^2$. This is the solvability condition.
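The discrete analogue shows the same dichotomy. Below, a made-up numpy illustration with the Dirichlet second-difference matrix standing in for $\nabla^2$: when $\mu^2$ hits an eigenvalue, $(\nabla^2 + \mu^2)\phi = Q$ is solvable only if $Q$ is perpendicular to the corresponding eigenfunction.

```python
import numpy as np

# Dirichlet second-difference stand-in for nabla^2 (a hypothetical discrete example).
n = 60
h = 1.0 / (n + 1)
L = (np.diag(-2.0 * np.ones(n)) + np.diag(np.ones(n - 1), 1)
     + np.diag(np.ones(n - 1), -1)) / h**2
lam2, psi = np.linalg.eigh(-L)

mu_sq = lam2[0]                       # choose mu^2 equal to the first eigenvalue
M = L + mu_sq * np.eye(n)             # (nabla^2 + mu^2): singular

def residual(Q):
    # size of min ||M phi - Q||: zero iff the problem is solvable
    phi = np.linalg.lstsq(M, Q, rcond=None)[0]
    return np.linalg.norm(M @ phi - Q)

print(residual(psi[:, 0]) > 0.9)      # Q along psi_1: not solvable
print(residual(psi[:, 1]) < 1e-6)     # Q perpendicular to psi_1: solvable
```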

16.5 ∇4

The linear differential operator $\nabla^4 = \nabla^2\nabla^2$ appears in problems in the slow flow of viscous fluids and in the deformation of elastic solids. Its eigenvalue problem is

$$\nabla^4\psi = \lambda^4\psi, \quad \vec{r}\in V$$

and any two of the three homogeneous conditions

$$\psi = 0$$

$$\vec{n}\cdot\nabla\psi = 0$$

$$\nabla^2\psi = 0$$

on each part of S. (Here, as earlier, the boundary conditions on the problem to be solved will
determine the boundary conditions on ψ. The conditions listed are not all that are physically
interesting and to these can be added their linear combinations.)

The integral formulas we need here can be obtained from Green's second theorem. First we put $\nabla^2\psi$ in place of $\psi$ to get

$$\iiint_V \phi\,\nabla^4\psi\, dV = \iint_S \left(\phi\,\vec{n}\cdot\nabla\nabla^2\psi - \nabla^2\psi\,\vec{n}\cdot\nabla\phi\right) dA + \iiint_V \nabla^2\psi\,\nabla^2\phi\, dV.$$

And then we put $\nabla^2\phi$ in place of $\phi$, and use the result to rewrite the second term on the right hand side, to get

$$\iiint_V \phi\,\nabla^4\psi\, dV = \iint_S \left(\phi\,\vec{n}\cdot\nabla\nabla^2\psi - \nabla^2\psi\,\vec{n}\cdot\nabla\phi\right) dA + \iint_S \left(\nabla^2\phi\,\vec{n}\cdot\nabla\psi - \psi\,\vec{n}\cdot\nabla\nabla^2\phi\right) dA + \iiint_V \psi\,\nabla^4\phi\, dV$$

These two formulas can be used in solving equations in ∇4 in just the same way that Green’s

first and second theorems can be used in solving the diffusion equation or any other equation in
∇2 . They are especially useful in exhibiting the way in which the sources of the field make their
contribution to the field itself.

To get information about the eigenvalues and eigenfunctions of $\nabla^4$ we first observe that if $\lambda^4$ and $\psi$ satisfy the eigenvalue problem then so also do $\overline{\lambda}^4$ and $\overline{\psi}$. Hence putting $\overline{\psi}$ in place of $\phi$ in the second formula we discover that $\lambda^4$ must be real. Likewise putting $\overline{\psi}$ in place of $\phi$ in the first formula we get

$$\lambda^4 \iiint_V |\psi|^2\, dV = \iiint_V \left|\nabla^2\psi\right|^2 dV$$

and conclude that $\lambda^4$ is not negative. In both calculations the integrals over $S$ vanish due to the conditions that we assume the eigenfunctions satisfy on $S$. To establish orthogonality we can go on and put $\phi = \overline{\psi}_i$, $\psi = \psi_j$ in the second formula, where $\psi_i$ and $\psi_j$ are solutions to the eigenvalue problem corresponding to different eigenvalues, and obtain

$$\langle\psi_i, \psi_j\rangle = \iiint_V \overline{\psi}_i\,\psi_j\, dV = 0.$$

As real eigenvalues and orthogonal eigenfunctions correspond to self-adjoint operators, we observe that $\nabla^4$, restricted to the class of functions satisfying the homogeneous boundary conditions of the eigenvalue problem, is self-adjoint, viz.,

$$\langle\phi, \nabla^4\psi\rangle = \langle\nabla^4\phi, \psi\rangle.$$

The set of eigenfunctions determined by the eigenvalue problem for $\nabla^4$ can be used in writing the solution to equations such as (This equation is not entirely made up, at least not by me. It is in the book “Fractal Concepts in Surface Growth” by Barabasi and Stanley.)

$$\frac{\partial c}{\partial t} = -\nabla^4 c + Q, \quad \vec{r}\in V$$

where the values of $c$ and $\vec{n}\cdot\nabla c$ are specified on $S$ $\forall\, t > 0$ and $c$ is specified at $t = 0$ $\forall\,\vec{r}\in V$.

Indeed $\langle\psi_i, c\rangle$ is determined by solving

$$\frac{d}{dt}\langle\psi_i, c\rangle = -\iint_S \left\{\overline{\psi}_i\,\vec{n}\cdot\nabla\nabla^2 c - \nabla^2 c\,\vec{n}\cdot\nabla\overline{\psi}_i\right\} dA - \iint_S \left\{\nabla^2\overline{\psi}_i\,\vec{n}\cdot\nabla c - c\,\vec{n}\cdot\nabla\nabla^2\overline{\psi}_i\right\} dA - \lambda_i^4\langle\psi_i, c\rangle + \langle\psi_i, Q\rangle$$

where the terms in the first surface integral vanish as we require $\psi_i = 0$ and $\vec{n}\cdot\nabla\psi_i = 0$ on $S$, and the terms in the second surface integral can be determined from the assigned values of $c$ and $\vec{n}\cdot\nabla c$ on $S$.

A warning is appropriate: the eigenvalue problem

$$\nabla^4\psi = \lambda^4\psi$$

is not easy to solve. There are home problems in Lecture 17 which illustrate the difficulty.
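One special case is tractable and worth a sketch (a hypothetical one-dimensional discrete illustration, not one of the home problems): with the pair of conditions $\psi = 0$ and $\nabla^2\psi = 0$ on the boundary, the discrete $\nabla^4$ is just the square of the Dirichlet second-difference matrix, so its eigenvalues are the squares of the Laplacian's and the smallest is near $\pi^4$. The clamped pair $\psi = 0$, $\vec{n}\cdot\nabla\psi = 0$ enjoys no such shortcut, which is where the difficulty lies.

```python
import numpy as np

# Dirichlet second-difference matrix A ~ -d^2/dx^2 on (0, 1); with psi = 0 and
# nabla^2 psi = 0 at the ends, the discrete biharmonic operator is simply A @ A.
n = 400
h = 1.0 / (n + 1)
A = (np.diag(2.0 * np.ones(n)) - np.diag(np.ones(n - 1), 1)
     - np.diag(np.ones(n - 1), -1)) / h**2       # positive definite
lam4 = np.linalg.eigvalsh(A @ A)                 # eigenvalues lambda^4

print(abs(lam4[0] - np.pi**4) < 0.1)             # lambda_1^4 near pi^4
print(np.allclose(np.sort(np.linalg.eigvalsh(A))**2, lam4))  # lambda^4 = (lambda^2)^2
```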

16.6 Vector Eigenvalue Problems

To determine a vector field $\vec{v}$ defined in a region of space $V$ and satisfying there an equation in $\nabla^2$, we can expand our solution in the eigenfunctions of $\nabla^2$.

The corresponding eigenvalue problem is then

$$\nabla^2\vec{\psi} = -\lambda^2\vec{\psi}, \quad \vec{r}\in V$$

where $\vec{\psi}$ satisfies homogeneous conditions on the boundary of $V$.

To derive some facts about the solutions to this problem, we first use

$$\nabla\cdot\left(\mathbf{T}\cdot\vec{v}\right) = \left(\nabla\cdot\mathbf{T}\right)\cdot\vec{v} + \mathbf{T} : \left(\nabla\vec{v}\right)^T$$

to get

$$\nabla\cdot\left(\nabla\vec{\psi}\cdot\vec{\phi}\right) = \nabla^2\vec{\psi}\cdot\vec{\phi} + \nabla\vec{\psi} : \left(\nabla\vec{\phi}\right)^T$$

and

$$\nabla\cdot\left(\nabla\vec{\phi}\cdot\vec{\psi}\right) = \nabla^2\vec{\phi}\cdot\vec{\psi} + \nabla\vec{\phi} : \left(\nabla\vec{\psi}\right)^T$$

where the last terms on the right hand sides are equal as $\mathrm{tr}\,(AB) = \mathrm{tr}\left(B^T A^T\right)$; we then use

$$\iiint_V \nabla\cdot\vec{v}\, dV = \iint_S dA\,\vec{n}\cdot\vec{v}$$

to get our two integration by parts formulas

$$\iiint_V \nabla^2\vec{\psi}\cdot\vec{\phi}\, dV = \iint_S dA\,\vec{n}\cdot\left(\nabla\vec{\psi}\right)\cdot\vec{\phi} - \iiint_V \nabla\vec{\psi} : \left(\nabla\vec{\phi}\right)^T dV$$

and

$$\iiint_V \nabla^2\vec{\psi}\cdot\vec{\phi}\, dV = \iint_S dA\,\vec{n}\cdot\left\{\left(\nabla\vec{\psi}\right)\cdot\vec{\phi} - \left(\nabla\vec{\phi}\right)\cdot\vec{\psi}\right\} + \iiint_V \nabla^2\vec{\phi}\cdot\vec{\psi}\, dV$$

To go on we require that

$$\iint_S dA\,\vec{n}\cdot\left\{\left(\nabla\vec{\psi}\right)\cdot\vec{\phi} - \left(\nabla\vec{\phi}\right)\cdot\vec{\psi}\right\} = 0$$

whenever $\vec{\psi}$ and $\vec{\phi}$ satisfy the homogeneous form of the boundary conditions assigned to $\vec{v}$; then the second formula reduces to

$$\iiint_V \nabla^2\vec{\psi}\cdot\vec{\phi}\, dV = \iiint_V \nabla^2\vec{\phi}\cdot\vec{\psi}\, dV$$


If $\vec{\psi}$ is a solution of the eigenvalue problem corresponding to $\lambda^2$ so also is $\overline{\vec{\psi}}$ corresponding to $\overline{\lambda}^2$. Then putting $\vec{\phi} = \overline{\vec{\psi}}$ in the second formula shows that $\lambda^2$ must be real and hence that the real and imaginary parts of $\vec{\psi}$ must also be eigenfunctions corresponding to $\lambda^2$. And putting $\vec{\phi} = \overline{\vec{\psi}}$ in the first formula shows that

$$-\lambda^2 \iiint_V \vec{\psi}\cdot\overline{\vec{\psi}}\, dV = \iint_S dA\,\vec{n}\cdot\left(\nabla\vec{\psi}\right)\cdot\overline{\vec{\psi}} - \iiint_V \nabla\vec{\psi} : \left(\nabla\overline{\vec{\psi}}\right)^T dV$$

where the second integral on the right hand side is not negative and both integrals are zero if there is an eigenfunction such that $\nabla\vec{\psi} = \mathbf{0}$. If the boundary conditions are such that the first term vanishes, this formula shows that

$$\lambda^2 \geq 0;$$

otherwise the boundary conditions must be such that the sign of the right hand side is the sign of the second term if we must have $\lambda^2 \geq 0$.

The second formula establishes orthogonality. It shows that

$$\left\langle\vec{\psi}_i, \vec{\psi}_j\right\rangle = 0$$

whenever $\vec{\psi}_i$ and $\vec{\psi}_j$ are eigenfunctions corresponding to different eigenvalues, where

$$\left\langle\vec{\psi}_i, \vec{\psi}_j\right\rangle = \iiint_V \overline{\vec{\psi}}_i\cdot\vec{\psi}_j\, dV.$$

16.7 Home Problems

In each of these three problems there is an eigenvalue, denoted σ. You are to see if you can prove
it is real.

1. You have an incompressible fluid at rest whose density varies upward: ρ = ρ0 (z). The fluid
is inviscid and you are to find out if the rest state is stable to small perturbations.

Upon perturbation, the base density is carried by the perturbation flow and you have

$$\rho_0\frac{\partial\vec{v}_1}{\partial t} = -\nabla p_1 - \rho_1 g\,\vec{k}$$

$$\nabla\cdot\vec{v}_1 = 0$$

and

$$\frac{\partial\rho_1}{\partial t} + \vec{v}_1\cdot\nabla\rho_0 = 0$$

Assuming solutions of the form

$$v_{x1} = \hat{v}_{x1}(z)\, e^{\sigma t} e^{ik_x x} e^{ik_y y}$$

etc., eliminate $\hat{v}_{x1}$ and $\hat{v}_{y1}$ in favor of $\hat{v}_{z1}$ and $\hat{p}_1$. Then eliminate $\hat{p}_1$, obtaining

$$\frac{d}{dz}\left(\rho_0\frac{d\hat{v}_{z1}}{dz}\right) - k^2\rho_0\,\hat{v}_{z1} = -\frac{k^2}{\sigma^2}\, g\,\frac{d\rho_0}{dz}\,\hat{v}_{z1}$$

where $k^2 = k_x^2 + k_y^2$, where $\rho_0(z) > 0$ and where $\hat{v}_{z1} = 0$ at $z = 0, H$.

Your job is to prove that $\sigma^2(k^2)$ is real no matter what $\dfrac{d\rho_0}{dz}$ is, and that $\dfrac{d\rho_0}{dz} < 0$ implies $\sigma^2(k^2) < 0$.

2. To take viscosity into account, you again have


→ →

ρ = ρ0 (z) , v0 = 0

and

dp0
= −ρ0 g
dz

but now your perturbation problem is

$$\rho_0\frac{\partial v_{x1}}{\partial t} = -\frac{\partial p_1}{\partial x} + \mu\nabla^2 v_{x1}$$

$$\rho_0\frac{\partial v_{y1}}{\partial t} = -\frac{\partial p_1}{\partial y} + \mu\nabla^2 v_{y1}$$

$$\rho_0\frac{\partial v_{z1}}{\partial t} = -\frac{\partial p_1}{\partial z} + \mu\nabla^2 v_{z1} - \rho_1 g$$

$$\frac{\partial v_{x1}}{\partial x} + \frac{\partial v_{y1}}{\partial y} + \frac{\partial v_{z1}}{\partial z} = 0$$

and

$$\frac{\partial\rho_1}{\partial t} + v_{z1}\frac{d\rho_0}{dz} = 0$$

Assume a solution

$$v_{x1} = \hat{v}_{x1}(z)\, e^{ik_x x} e^{ik_y y} e^{\sigma t}$$

etc., and, eliminating $\hat{v}_{x1}$ and $\hat{v}_{y1}$, obtain

$$\rho_0\,\sigma\left(-\frac{d\hat{v}_{z1}}{dz}\right) = k^2\hat{p}_1 + \mu\left(\frac{d^2}{dz^2} - k^2\right)\left(-\frac{d\hat{v}_{z1}}{dz}\right)$$

which must be solved together with

$$\rho_0\,\sigma\,\hat{v}_{z1} = -\frac{d\hat{p}_1}{dz} + \mu\left(\frac{d^2}{dz^2} - k^2\right)\hat{v}_{z1} - \hat{\rho}_1 g$$

and

$$\sigma\hat{\rho}_1 + \hat{v}_{z1}\frac{d\rho_0}{dz} = 0$$

Eliminate $\hat{\rho}_1$ and $\hat{p}_1$ to derive an equation for $\hat{v}_{z1}$ where $\hat{v}_{z1} = 0 = \dfrac{d\hat{v}_{z1}}{dz}$ at $z = 0, H$.

You now have an eigenvalue problem, where σ is the eigenvalue. Can you say anything
about σ without a calculation?

3. You can account for viscosity more easily by assuming your fluid saturates a porous solid. Then you can use Darcy's law and you have

$$\mu\,\vec{v} = -K\nabla p - \rho g\,\vec{k}$$

$$\nabla\cdot\vec{v} = 0$$

and

$$\frac{\partial\rho}{\partial t} + \vec{v}\cdot\nabla\rho = 0$$
∂t

Assume you have a two dimensional problem, whereupon your perturbation equations are

$$\mu v_{x1} = -K\frac{\partial p_1}{\partial x}$$

$$\mu v_{z1} = -K\frac{\partial p_1}{\partial z} - \rho_1 g$$

$$\frac{\partial v_{x1}}{\partial x} + \frac{\partial v_{z1}}{\partial z} = 0$$

and

$$\frac{\partial\rho_1}{\partial t} + v_{z1}\frac{d\rho_0}{dz} = 0$$

Writing

$$v_{x1} = \hat{v}_{x1}(z)\,\sin kx\; e^{\sigma t}$$

$$v_{z1} = \hat{v}_{z1}(z)\,\cos kx\; e^{\sigma t}$$

$$p_1 = \hat{p}_1(z)\,\cos kx\; e^{\sigma t}$$

and

$$\rho_1 = \hat{\rho}_1(z)\,\cos kx\; e^{\sigma t}$$

you have

$$\mu\,\hat{v}_{x1} = Kk\,\hat{p}_1$$

$$\mu\,\hat{v}_{z1} = -K\frac{d\hat{p}_1}{dz} - \hat{\rho}_1 g$$

$$k\,\hat{v}_{x1} + \frac{d\hat{v}_{z1}}{dz} = 0$$

$$\sigma\hat{\rho}_1 + \hat{v}_{z1}\frac{d\rho_0}{dz} = 0$$

Hence you obtain

$$\frac{d^2\hat{v}_{z1}}{dz^2} - k^2\hat{v}_{z1} = -\frac{k^2}{\sigma}\frac{g}{\mu}\frac{d\rho_0}{dz}\,\hat{v}_{z1}$$

where $\hat{v}_{z1} = 0$ at $z = 0, H$. This is an eigenvalue problem where $\sigma$ is the eigenvalue.

Assuming $\dfrac{d\rho_0}{dz}$ is a constant, derive a formula for $\sigma(k^2)$.
Lecture 17

Separation of Variables

17.1 Separating Variables in Cartesian, Cylindrical and Spherical Coordinate Systems

In Lecture 16 we found that the eigenvalue problem that must be solved in order to solve the diffusion equation is:

$$\nabla^2\psi + \lambda^2\psi = 0, \quad \vec{r}\in V$$

$$\psi = 0, \quad \vec{r}\in S_1$$

$$\vec{n}\cdot\nabla\psi = 0, \quad \vec{r}\in S_2$$

$$\vec{n}\cdot\nabla\psi + \beta\psi = 0, \quad \vec{r}\in S_3$$

and we learned that the eigenvalues are real and that the eigenfunctions corresponding to different eigenvalues are orthogonal. We can add the term $V(\vec{r})\,\psi$ to the left hand side of $\nabla^2\psi + \lambda^2\psi = 0$, where $V(\vec{r})$ is real valued, and not change the conclusion that the eigenvalues are real and that the eigenfunctions are orthogonal in the plain vanilla inner product. We now turn to the method of solving this eigenvalue problem and present the details in three coordinate systems. The job begins here and is finished in Lecture 20.

LECTURE 17. SEPARATION OF VARIABLES 410

The method we use to do this is called separation of variables. To see how it goes we suppose
that we have an orthogonal coordinate system which is such that the bounding surface of the region
V coincides piecewise with a finite number of coordinate surfaces. Then the first question to ask
is this: in what form can we express the solutions to our eigenvalue problem?

If it works out, the method of separation of variables answers this question. The idea is to
reduce a three dimensional problem to three one dimensional problems. In certain orthogonal
coordinate systems this can be done. It is done by assuming that ψ can be written as the product
of three functions, each depending on only one of the three coordinates, substituting this into
∇2 ψ + λ2 ψ = 0, dividing by ψ and then determining which parts of the result must be constants.
We begin by showing how this works out in the Cartesian, cylindrical and spherical coordinate
systems.

Cartesian coordinate systems

To separate variables in Cartesian coordinates we substitute

$$\psi(x, y, z) = X(x)\,Y(y)\,Z(z)$$

into

$$\left(\nabla^2 + \lambda^2\right)\psi = \left(\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2} + \lambda^2\right)\psi = 0$$

and then divide by $XYZ$ to get

$$\frac{1}{X}\frac{d^2X}{dx^2} + \frac{1}{Y}\frac{d^2Y}{dy^2} + \frac{1}{Z}\frac{d^2Z}{dz^2} + \lambda^2 = 0$$

The first term depends only on x, the second only on y and the third only on z. Because these
terms are independent of one another we conclude that each term must be equal to a constant.
Denoting these undetermined constants by −α2 , −β 2 and −γ 2 , we have replaced (∇2 + λ2 ) ψ = 0

in Cartesian coordinates by the three equations

$$\frac{d^2X}{dx^2} + \alpha^2 X = 0 \qquad (1)$$

$$\frac{d^2Y}{dy^2} + \beta^2 Y = 0 \qquad (2)$$

and

$$\frac{d^2Z}{dz^2} + \gamma^2 Z = 0 \qquad (3)$$
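A quick finite-difference check (the values of $\alpha$, $\beta$, $\gamma$ and the sample point below are made up) confirms that the product $X(x)\,Y(y)\,Z(z)$ built from solutions of Eqs. (1)–(3) satisfies $\left(\nabla^2 + \lambda^2\right)\psi = 0$ with $\lambda^2 = \alpha^2 + \beta^2 + \gamma^2$:

```python
import math

a, b, g = math.pi, 2.0 * math.pi, 3.0 * math.pi   # alpha, beta, gamma: made-up values
lam2 = a * a + b * b + g * g                      # lambda^2 = alpha^2 + beta^2 + gamma^2

def psi(x, y, z):
    # product solution X(x) Y(y) Z(z) of Eqs. (1), (2), (3)
    return math.sin(a * x) * math.sin(b * y) * math.sin(g * z)

def laplacian(f, x, y, z, h=1e-3):
    # central second differences in each of the three coordinates
    return (f(x + h, y, z) + f(x - h, y, z)
            + f(x, y + h, z) + f(x, y - h, z)
            + f(x, y, z + h) + f(x, y, z - h) - 6.0 * f(x, y, z)) / h**2

x0, y0, z0 = 0.3, 0.7, 0.2                        # an arbitrary interior point
print(abs(laplacian(psi, x0, y0, z0) + lam2 * psi(x0, y0, z0)) < 1e-2)
```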

Cylindrical coordinate systems

To separate variables in cylindrical coordinates we substitute

$$\psi(r, \theta, z) = R(r)\,\Theta(\theta)\,Z(z)$$

into

$$\left(\nabla^2 + \lambda^2\right)\psi = \left(\frac{1}{r}\frac{\partial}{\partial r}\left(r\frac{\partial}{\partial r}\right) + \frac{1}{r^2}\frac{\partial^2}{\partial\theta^2} + \frac{\partial^2}{\partial z^2} + \lambda^2\right)\psi = 0$$

and then divide by $R\Theta Z$ to get

$$\frac{1}{R}\frac{1}{r}\frac{d}{dr}\left(r\frac{dR}{dr}\right) + \frac{1}{r^2}\frac{1}{\Theta}\frac{d^2\Theta}{d\theta^2} + \frac{1}{Z}\frac{d^2Z}{dz^2} + \lambda^2 = 0$$

We conclude first that $\dfrac{1}{Z}\dfrac{d^2Z}{dz^2}$ must be a constant and then that $\dfrac{1}{\Theta}\dfrac{d^2\Theta}{d\theta^2}$ must be a constant. Denoting these constants by $-\gamma^2$ and $-m^2$ we have replaced $\left(\nabla^2 + \lambda^2\right)\psi = 0$ in cylindrical coordinates by the three equations

$$\frac{d^2Z}{dz^2} + \gamma^2 Z = 0 \qquad (4)$$

$$\frac{d^2\Theta}{d\theta^2} + m^2\Theta = 0 \qquad (5)$$

and

$$\frac{1}{r}\frac{d}{dr}\left(r\frac{dR}{dr}\right) + \left(\lambda^2 - \gamma^2 - \frac{m^2}{r^2}\right)R = 0 \qquad (6)$$

Spherical coordinate systems

To separate variables in spherical coordinates we substitute

$$\psi(r, \theta, \phi) = R(r)\,\Theta(\theta)\,\Phi(\phi)$$

into

$$\left(\nabla^2 + \lambda^2\right)\psi = \left(\frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial}{\partial r}\right) + \frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial}{\partial\theta}\right) + \frac{1}{r^2\sin^2\theta}\frac{\partial^2}{\partial\phi^2} + \lambda^2\right)\psi$$

and divide by $R\Theta\Phi$ to get

$$\frac{1}{R}\frac{1}{r^2}\frac{d}{dr}\left(r^2\frac{dR}{dr}\right) + \frac{1}{r^2}\frac{1}{\Theta}\frac{1}{\sin\theta}\frac{d}{d\theta}\left(\sin\theta\frac{d\Theta}{d\theta}\right) + \frac{1}{r^2\sin^2\theta}\frac{1}{\Phi}\frac{d^2\Phi}{d\phi^2} + \lambda^2 = 0$$

Now we see that $\dfrac{1}{\Phi}\dfrac{d^2\Phi}{d\phi^2}$ must be a constant. Calling this $-m^2$, we then see that $\dfrac{1}{\Theta}\dfrac{1}{\sin\theta}\dfrac{d}{d\theta}\left(\sin\theta\dfrac{d\Theta}{d\theta}\right) - \dfrac{m^2}{\sin^2\theta}$ must be a constant. Calling this constant $-\ell(\ell+1)$ we have replaced $\left(\nabla^2 + \lambda^2\right)\psi = 0$ in spherical coordinates by

$$\frac{d^2\Phi}{d\phi^2} + m^2\Phi = 0 \qquad (7)$$

$$\frac{1}{\sin\theta}\frac{d}{d\theta}\left(\sin\theta\frac{d\Theta}{d\theta}\right) + \left(\ell(\ell+1) - \frac{m^2}{\sin^2\theta}\right)\Theta = 0 \qquad (8)$$

and

$$\frac{1}{r^2}\frac{d}{dr}\left(r^2\frac{dR}{dr}\right) + \left(\lambda^2 - \frac{\ell(\ell+1)}{r^2}\right)R = 0 \qquad (9)$$

For Cartesian, cylindrical and spherical coordinate systems we have now reduced the problem of
solving ∇2 ψ +λ2 ψ = 0 to the problem of solving nine second order, linear, homogeneous ordinary
differential equations, three for each coordinate system.

The homogeneous boundary conditions satisfied by $\psi$ lead to homogeneous boundary conditions that must be satisfied by each of the factors making up $\psi$. And each of these factors satisfies a homogeneous ordinary differential equation depending on an undetermined constant. For arbitrary values of these constants the solutions must be zero. Our job will be to determine the special values of these constants for which the solutions are other than zero.

Indeed each of these problems is a one dimensional eigenvalue problem in its own right, and some of them are one dimensional forms of the eigenvalue problem for $\nabla^2$, while all of them are one dimensional forms of the eigenvalue problem for $\nabla^2 + V(\vec{r})$.

In Lecture 19 we will describe in a little more detail the elementary facts about linear ordinary
differential equations but for now we assume only that each of these equations has two independent
solutions, observing that these two solutions can be written in many ways. Taking equation (1) as
an example we can write its general solution as

$$A\cos\alpha x + B\sin\alpha x$$

or

$$A e^{i\alpha x} + B e^{-i\alpha x}$$

or

$$A\cosh i\alpha x + B\sinh i\alpha x$$

These familiar functions satisfy our needs in Eqs. (1), (2), (3), (4), (5) and (7). Equations
(6), (8) and (9) have solutions that are less familiar. For instance we denote by Jm and Ym two
independent solutions of Eq. (6) which is called Bessel’s equation. And while these functions may
not be as familiar as cosine and sine, Watson’s book “Theory of Bessel Functions” has nearly 1,000

pages of information on these and related functions and so Jm and Ym are very familiar to some
people. The same is true of the solutions of Eqs. (8) and (9).

While the solutions of each of the nine equations are denoted by special symbols, in every
instance the symbols stand for power series, either infinite or finite, or power series multiplied
by familiar functions. The power series solutions are determined by what is called the method of
Frobenius and we will show how this works by using Eq. (8) as an example in Lecture 20.
For now we observe only that $J_0(z)$ is the name assigned to the series $\displaystyle\sum_{m=0}^{\infty}\frac{(-1)^m\left(\tfrac{1}{2}z\right)^{2m}}{(m!)^2}$, which satisfies

$$\left(\frac{1}{z}\frac{d}{dz}\left(z\frac{d}{dz}\right) + 1\right)\psi = 0$$

$$\psi(z=0) = 1$$

and

$$\psi'(z=0) = 0$$

Likewise $\cos z$ is the name assigned to the series

$$\sum_{m=0}^{\infty}(-1)^m\frac{z^{2m}}{(2m)!} = 1 - \frac{1}{2!}z^2 + \frac{1}{4!}z^4 - \cdots$$

which satisfies

$$\left(\frac{d^2}{dz^2} + 1\right)\psi = 0$$

$$\psi(z=0) = 1$$

and

$$\psi'(z=0) = 0$$

It is worth observing that when a function is defined by a power series it is, in fact, defined by the
sequence of coefficients in the series which then must encode all of the properties of the function.
To illustrate this idea we show in §17.4 how the zeros of cos z and J0 (z) can be determined using
the coefficients in their power series.
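As a taste of what §17.4 carries out properly, the first zero of $J_0$ can already be pinned down from the series coefficients alone. The sketch below (partial sums plus bisection; the term count and bracket are chosen ad hoc) locates it near the tabulated value $2.4048$:

```python
import math

def J0(z, terms=30):
    # partial sum of J0's power series: sum of (-1)^m (z/2)^(2m) / (m!)^2
    return sum((-1) ** m * (z / 2.0) ** (2 * m) / math.factorial(m) ** 2
               for m in range(terms))

# J0(1) > 0 and J0(3) < 0, so bisect for the sign change.
lo, hi = 1.0, 3.0
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if J0(lo) * J0(mid) <= 0.0:
        hi = mid
    else:
        lo = mid

print(abs(0.5 * (lo + hi) - 2.40483) < 1e-4)     # first zero of J0
```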

It is also worth observing that new technical difficulties come up as we move away from Cartesian coordinates. Eqs. (1), (2) and (3) are independent of one another. But in Eqs. (4), (5) and (6), $m^2$ must be determined in Eq. (5) before Eq. (6) can be solved, while in Eqs. (7), (8) and (9), $m^2$ must be determined in Eq. (7) before Eq. (8) can be solved and $\ell(\ell+1)$ must be determined in Eq. (8) before Eq. (9) can be solved. Equations (5) and (6) correspond to spherical coordinates in two dimensions, Eqs. (7), (8) and (9) correspond to spherical coordinates in three dimensions, and in §17.5 we observe that the pattern we see here obtains in spherical coordinates in four dimensions. Also in §17.5 we carry out separation of variables in elliptic cylinder coordinates where again something new happens that we have not seen heretofore. There are a dozen or so orthogonal coordinate systems where we can separate $\left(\nabla^2 + \lambda^2\right)\psi = 0$ and information on these can be found in Moon and Spencer's book “Field Theory Handbook,” and in Morse and Feshbach's book “Methods of Theoretical Physics,” as well as in many other books. Indeed a lot of information on orthogonal coordinate systems and on $\nabla^2$ can be found in Pauling and Wilson's book “Quantum Mechanics” and Happel and Brenner's book “Low Reynolds Number Hydrodynamics.” The titles of these books suggest the wide range of application of the method of separation of variables.

17.2 How the Boundary Conditions Fix the Eigenvalues

We turn now to an explanation of how we use Eqs. (1), ..., (9). The boundary conditions in a
specific problem may be any of a large number of possibilities, yet we can illustrate the essential
ideas by taking up a small number of concrete examples. We begin by observing that Eqs. (1), (2),
(3), (4), (5) and (7) are identical and that we can write the general solution to each in terms of a
linear combination of the functions cosine and sine. The boundary conditions for Eqs. (1), (2), (3)
and (4) are ordinarily imposed on two coordinate surfaces and this may also be true for Eqs. (5)
and (7), though often periodic conditions are imposed. To see how the boundary conditions select
the solutions we use we take Eq. (1) as an example and work out the Dirichlet problem where c

is specified on the boundary of the region. Many more examples appear in Lecture 14. Then the
solutions to Eq. (1) must satisfy

X (x = a) = 0

and

X (x = b) = 0

where the values of a and b are determined by the problem at hand.

Writing the general solution to Eq. (1) as

X = A cos αx + B sin αx

we must determine the constants A, B and α so that the conditions

0 = A cos αa + B sin αa

and

0 = A cos αb + B sin αb

or

$$\begin{pmatrix}\cos\alpha a & \sin\alpha a\\ \cos\alpha b & \sin\alpha b\end{pmatrix}\begin{pmatrix}A\\ B\end{pmatrix} = \begin{pmatrix}0\\ 0\end{pmatrix}$$

are satisfied. Now for arbitrary values of $\alpha$ the only solution to this homogeneous equation is $A = 0 = B$. To get solutions other than $A = 0 = B$ we must find the special values of $\alpha$ that make the determinant of the matrix on the left hand side vanish. To each such value of $\alpha$ there are solutions such that $A \neq 0$ or $B \neq 0$ or $A \neq 0$ and $B \neq 0$, but there is only one independent solution as the rank of $\begin{pmatrix}\cos\alpha a & \sin\alpha a\\ \cos\alpha b & \sin\alpha b\end{pmatrix}$ is one.

To each value of $\alpha$ that satisfies

$$\cos\alpha a\,\sin\alpha b - \cos\alpha b\,\sin\alpha a = 0$$

there corresponds one independent solution of Eq. (1) satisfying $X(x=a) = 0 = X(x=b)$. If $\cos\alpha a \neq 0$ it can be written

$$X = B\left(-\frac{\sin\alpha a}{\cos\alpha a}\cos\alpha x + \sin\alpha x\right)$$

But not all values of $\alpha$ that make the determinant vanish produce new solutions. If $\alpha$ makes the determinant vanish so also does $-\alpha$, but $\alpha$ and $-\alpha$ determine the same eigenvalue $\alpha^2$ and dependent eigenfunctions.
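The determinant collapses by a trig identity to $\sin\alpha(b-a)$, which makes the admissible values of $\alpha$ explicit. A quick check with made-up values:

```python
import math

# det = cos(alpha a) sin(alpha b) - cos(alpha b) sin(alpha a) = sin(alpha (b - a))
for alpha, a, b in [(1.3, 0.2, 1.7), (2.0, -0.5, 0.9), (math.pi, 0.0, 1.0)]:
    det = math.cos(alpha * a) * math.sin(alpha * b) - math.cos(alpha * b) * math.sin(alpha * a)
    assert abs(det - math.sin(alpha * (b - a))) < 1e-12

# so the determinant vanishes exactly when alpha (b - a) = n pi, n = 1, 2, ...
print(abs(math.sin(math.pi * (1.0 - 0.0))) < 1e-12)
```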

The simplest result obtains if $a = 0$ and $b = 1$; then

$$\alpha^2 = n^2\pi^2, \quad X = B\sin n\pi x, \qquad n = 1, 2, \ldots$$

Assuming we solve Eqs. (2) and (3) in a similar way we can determine the eigenvalues and
eigenfunctions of a problem in Cartesian coordinates as

$$\lambda^2 = \alpha^2 + \beta^2 + \gamma^2$$

and

ψ = XY Z

where the arbitrary multiples in $X$, $Y$ and $Z$ will be determined so that $\langle\psi, \psi\rangle = 1$. And it is worth
observing that only three sets of orthogonal functions are needed to solve problems in Cartesian
coordinates. In this way Cartesian coordinates are special.

While the foregoing shows how we deal with Eqs. (5) and (7) when surfaces θ = constant or
φ = constant separate the system from its surroundings, it often happens that the boundary of a
region can be completely specified in terms that are independent of θ in Eq. (5) or φ in Eq. (7).
Then we do not have surfaces on which we can specify physical boundary conditions and in place
of this we must require the solution to our problem to be periodic in θ or φ of period 2π. This
requirement is passed on to the eigenfunctions and then to their θ or φ dependent parts and so the
boundary conditions for Eqs. (5) and (7) can be taken to be

Θ (θ + 2π) = Θ (θ)

or

Φ (φ + 2π) = Φ (φ)

17.3 Solving a Two Dimensional Diffusion Problem in Plane Polar Coordinates (Spherical Coordinates in Two Dimensions)

These ideas carry over to Eqs. (6), (8) and (9), but we have not yet explained how to write the
solutions to these equations. Before we do this we take up a simple concrete example which shows
how the various parts of the solution fit together.

Suppose we wish to solve a 2-dimensional diffusion problem in a region bounded by concentric
circles, i.e., the concentration is uniform in the z-direction in cylindrical coordinates and the
only source is a specified nonzero initial solute distribution, denoted c (t = 0). The inner circle is
impermeable to solute; the outer circle is in perfect contact with a large solute free reservoir.

Thus we have to solve


 
∂c/∂t = ∇²c = (1/r) ∂/∂r (r ∂c/∂r) + (1/r²) ∂²c/∂θ²,  a < r < b, 0 ≤ θ < 2π

and

(∂c/∂r)(r = a) = 0 = c (r = b)

where c (t = 0) is specified. The solution is

c (r, θ, t) = Σᵢ ⟨ψᵢ, c (t = 0)⟩ e^{−λᵢ²t} ψᵢ (r, θ)

where the eigenvalue problem is

∇²ψ + λ²ψ = 0,  a < r < b, 0 ≤ θ < 2π

and

(∂ψ/∂r)(r = a) = 0 = ψ (r = b),  0 ≤ θ < 2π

The expectation might be that we will have to piece together two sets of orthogonal functions
in order to build up the eigenfunctions we use to solve our problem. That would be true in Cartesian
coordinates where the separated eigenvalue problems are not coupled, but it is not ordinarily true,
and it is not true here.

To solve the eigenvalue problem we put ψ = R (r) Θ (θ) and conclude that R and Θ satisfy

d²Θ/dθ² + m²Θ = 0

and

d²R/dr² + (1/r) dR/dr − (m²/r²) R + λ²R = 0

where

(dR/dr)(r = a) = 0 = R (r = b)

We assume c, and therefore ψ, and therefore Θ to be periodic in θ and we assign to ψ and


therefore to R homogeneous conditions corresponding to the boundary conditions assigned to c by
the physics of the problem.

First we look at the θ part of the problem and we replace

Θ (θ) = Θ (θ + 2π)

by

Θ (0) = Θ (2π)

and

Θ′ (0) = Θ′ (2π)

as then we see that Θ′′ (0) = Θ′′ (2π), etc.

Now writing Θ as

Θ = A cos mθ + B sin mθ

we find A, B and m must satisfy

A = A cos 2πm + B sin 2πm

and

mB = −mA sin 2πm + mB cos 2πm

whereupon we have
    
[ cos 2πm − 1          sin 2πm       ] [ A ]   [ 0 ]
[ −m sin 2πm       m (cos 2πm − 1)   ] [ B ] = [ 0 ]

And we see that only for special values of m does this system of homogeneous, linear, algebraic
equations have solutions other than A = 0 = B. These special values of m are those that make the
determinant of the matrix on the left hand side vanish and as this determinant is

m (2 − 2 cos 2πm)

the solutions of interest correspond to

m = 0, ±1, ±2, · · ·

When m is ±1, ±2, · · · the rank of this matrix is zero and the system of equations has two
independent solutions, the simplest choice being (1, 0) and (0, 1). So, to each integer value
of m other than zero there corresponds the eigenvalue m2 and the two eigenfunctions cos mθ and
sin mθ. As this obtains both for m and its negative we need admit only the values m = 1, 2, . . ..
When m = 0 we should write Θ = A + Bθ whereupon we find only one periodic solution which we
take to be A = 1, B = 0. So, to m = 0 there corresponds the eigenvalue 0 and the eigenfunction
1.

Now the simplest way to denote this set of eigenfunctions is to write the independent solutions
corresponding to m² = 0, 1, 4, . . . as e^{imθ} and e^{−imθ}, m = 0, 1, 2, . . . Doing this we exhibit the
eigenfunctions satisfying periodic conditions as

Θm = (1/√(2π)) e^{imθ},  m = . . . , −2, −1, 0, 1, 2, . . .

and we observe that



Z 2π  0 m 6= n
Θm (θ) Θn (θ) dθ =
0  1 m=n

This can be seen by carrying out the integration but it can also be seen by observing that periodic
boundary conditions make the term φ (dψ/dx) − ψ (dφ/dx) in our second integration by parts formula
vanish.
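A numerical sketch of this orthonormality (the conjugate on the first factor and the quadrature are choices made here, not taken from the text):

```python
import cmath, math

# Theta_m = exp(i m theta)/sqrt(2 pi); check <Theta_m, Theta_n> over [0, 2 pi],
# conjugating the first factor, by a rectangle rule (exact for these integrands).
def inner(m, n, pts=4096):
    h = 2 * math.pi / pts
    s = 0j
    for k in range(pts):
        th = k * h
        s += cmath.exp(-1j * m * th) * cmath.exp(1j * n * th) * h
    return s / (2 * math.pi)   # the two 1/sqrt(2 pi) normalizations combined

print(abs(inner(2, 3)) < 1e-12)        # m != n -> inner product ~ 0 -> True
print(abs(inner(2, 2) - 1) < 1e-12)    # m = n  -> inner product ~ 1 -> True
```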

In this way we deal with Eq. (5) in cylindrical coordinates and Eq. (7) in spherical coordinates.
Now having established the values of m², viz., 0, 1, 4, . . . , we turn to the R equation and notice that
we get a different R equation for each value of m² and that λ² appears only in the R equations.
For each value of m² we denote the two independent solutions of the R equation by Jm (λr) and
Ym (λr), m = 0, 1, 2, . . . , where Jm and Ym denote independent solutions of Bessel’s equation for
nonnegative integer values of m.

Now there are many R equations, one corresponding to each fixed value of m², hence we write

R = AJm (λr) + BYm (λr)

and seek to determine A, B and λ via the conditions at r = a and r = b, viz.,


λA Jm′(λa) + λB Ym′(λa) = 0

and

A Jm(λb) + B Ym(λb) = 0

where Jm′ denotes dJm(x)/dx. Thus we have

[ λ Jm′(λa)   λ Ym′(λa) ] [ A ]   [ 0 ]
[  Jm(λb)      Ym(λb)   ] [ B ] = [ 0 ]

and to each fixed value of m² the values of λ² can be determined by finding the values of λ that
make the determinant of the matrix on the left hand side vanish. Indeed only for values of λ such
that

λ Jm′(λa) Ym(λb) − λ Ym′(λa) Jm(λb) = 0

can constants other than A = 0 = B, and, therefore, solutions other than R = 0, be determined.
And we see that this equation must be solved for m = 0, 1, 2, . . ..

To go on, we put a =0. By doing this we turn up a technical difficulty. The boundary of the
region is now the circle r = b. And as this is the only surface on which we can assign a physical
boundary condition we no longer have the two boundary conditions required to evaluate the two
constants in the solution of the R equation. What gets us out of this is the discovery that Ym (r) is
not bounded as r → 0 and hence, upon requiring c to be bounded, and therefore ψ to be bounded,
we must require R to be bounded and to achieve this we put B = 0. So if a = 0 and b = 1 we
write

R = A Jm (λr)

and determine λ via

Jm (λ) = 0

If λ is a root of this equation then so also is −λ but λ and −λ lead to the same eigenvalue and to de-
pendent eigenfunctions as Jm (−z) = ±Jm (z). If zero is a root of this equation the corresponding
eigenfunction is zero everywhere.

We let λ|m|,1, λ|m|,2, . . . denote the positive roots of J|m|(λ) = 0 and then organize the
solutions to the eigenvalue problem in terms of m by assigning to each value of m, i.e., to
. . . , −2, −1, 0, 1, 2, . . . , the eigenvalues

λ²|m|,1, λ²|m|,2, · · ·

and the corresponding unnormalized eigenfunctions,

J|m|(λ|m|,1 r) e^{imθ}/√(2π),  J|m|(λ|m|,2 r) e^{imθ}/√(2π),  · · ·

In this way, of the two eigenfunctions corresponding to the eigenvalue λ²|m|,i, one is assigned to
|m|, the other to −|m|. In problem 1 you will derive the factor normalizing the Bessel’s functions.

Letting ψmi (r, θ) denote the normalized eigenfunction R|m|,i (r) Θm (θ), where
R|m|,i (r) ∝ J|m|(λ|m|,i r), we can write the solution to our problem as

c (r, θ, t) = Σ_{m=−∞}^{+∞} Σ_{i=1}^{∞} ⟨ψmi, c (t = 0)⟩ e^{−λ²|m|,i t} ψmi (r, θ)

where

⟨ψmi, c (t = 0)⟩ = ∫₀¹ ∫₀²π R|m|,i (r) Θ̄m (θ) c (t = 0) r dr dθ

and where this is

⟨ψmi, c (t = 0)⟩ = ∫₀¹ R|m|,i (r) { ∫₀²π Θ̄m (θ) c (t = 0) dθ } r dr = ⟨R|m|,i, ⟨Θm, c (t = 0)⟩θ⟩r

The orthogonality works out as follows

⟨R|m|,i Θm, R|n|,j Θn⟩ = ⟨R|m|,i, R|n|,j⟩r ⟨Θm, Θn⟩θ

and this is zero if m ≠ n whereas it is ⟨R|m|,i, R|m|,j⟩r if m = n where

⟨R|m|,i, R|m|,j⟩r = { 0, i ≠ j;  1, i = j }

So, corresponding to each value of m2 we have a complete orthogonal set of functions of r and
indeed a different set for each value of m2 . The completeness we assume; the orthogonality we
infer from Lecture 15 or establish directly using the integration by parts formulas:

∫₀¹ φ (1/r) (d/dr)(r dψ/dr) r dr = [ φ r dψ/dr ]₀¹ − ∫₀¹ r (dφ/dr)(dψ/dr) dr

and

∫₀¹ φ (1/r) (d/dr)(r dψ/dr) r dr = [ φ r dψ/dr − r (dφ/dr) ψ ]₀¹ + ∫₀¹ (1/r) (d/dr)(r dφ/dr) ψ r dr

Thus if R and λ² satisfy

(1/r) d/dr (r dR/dr) + (λ² − m²/r²) R = 0

and

R (r = 1) = 0

where R is required to be bounded as r → 0 and where m² is fixed, then so also do R̄ and λ̄². On
setting φ = R̄ and ψ = R in the first and second formulas we find that λ² is real and positive. On
setting φ = R|m|,i and ψ = R|m|,j in the second formula, where R|m|,i and R|m|,j are two solutions
corresponding to different values of λ², we find that

∫₀¹ R|m|,i R|m|,j r dr = ⟨R|m|,i, R|m|,j⟩r = 0

and this is the orthogonality condition that we require.
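This weighted orthogonality is easy to check numerically for m = 0; the power-series evaluation of J0, the tabulated zeros, and the trapezoid rule below are implementation choices, not part of the text:

```python
import math

# J0 from its power series: J0(x) = sum_k (-1)^k (x^2/4)^k / (k!)^2
def J0(x):
    term = s = 1.0
    for k in range(1, 40):
        term *= -(x * x) / (4.0 * k * k)
        s += term
    return s

# first two positive zeros of J0 (standard tabulated values)
lam1, lam2 = 2.404825557695773, 5.520078110286311

# trapezoid rule for the weighted inner product  int_0^1 J0(lam1 r) J0(lam2 r) r dr
pts = 4000
h = 1.0 / pts
ip = sum((0.5 if k in (0, pts) else 1.0) * J0(lam1 * k * h) * J0(lam2 * k * h) * (k * h) * h
         for k in range(pts + 1))

print(abs(ip) < 1e-6)   # distinct eigenvalues -> weighted inner product ~ 0 -> True
```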

What is going on here is this: the index m sorts out the θ variation of c (t = 0) and then the
index i sorts out the corresponding r variation. Indeed we first expand an assigned initial solute
distribution c (t = 0) in the set of functions

{ Θm (θ) },  m = −∞, . . . , +∞

as

c (t = 0) = Σ_{m=−∞}^{+∞} ⟨Θm, c (t = 0)⟩θ Θm (θ)


The complex and real Fourier series are two forms of the same expansion. If
Σ_{m=−∞}^{+∞} cm e^{imφ} = Σ_{m=0}^{∞} (am cos mφ + bm sin mφ) then am = cm + c−m and
bm = i cm − i c−m.
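The footnote's relation between complex and real Fourier coefficients can be checked on a sample function; f(φ) = 3 cos 2φ + 4 sin 2φ and the quadrature below are illustrative choices:

```python
import cmath, math

def f(phi):
    # a sample real function with known real coefficients a_2 = 3, b_2 = 4
    return 3 * math.cos(2 * phi) + 4 * math.sin(2 * phi)

# complex Fourier coefficient c_m = (1/2 pi) int_0^{2 pi} f(phi) exp(-i m phi) dphi
def c(m, pts=4096):
    h = 2 * math.pi / pts
    return sum(f(k * h) * cmath.exp(-1j * m * k * h) * h for k in range(pts)) / (2 * math.pi)

a2 = c(2) + c(-2)              # should recover the cos(2 phi) coefficient, 3
b2 = 1j * c(2) - 1j * c(-2)    # should recover the sin(2 phi) coefficient, 4
print(round(a2.real, 6), round(b2.real, 6))   # 3.0 4.0
```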

This resolves c (t = 0) into its various angular pieces where each of the resulting coefficients
⟨Θm, c (t = 0)⟩θ is a function of r. This function, defining the part of the r dependence of
c (t = 0) that corresponds to Θm (θ), is then expanded in the set of functions { R|m|,i (r) },
i = 1, 2, . . . , and this set is special to each value of |m|.

The orthogonality of two functions R|m|,i and R|n|,j, m ≠ n, is never an issue.


As c (t = 0) is real we observe that ⟨Θm, c (t = 0)⟩θ Θm (θ) and
⟨Θ−m, c (t = 0)⟩θ Θ−m (θ) are complex conjugates and hence that the terms corresponding
to +m and −m in our solution are complex conjugates. Therefore we can write c (r, θ, t) more
simply as

c (r, θ, t) = Σ_{m=0}^{∞} Σ_{i=1}^{∞} 2 Re{ ⟨R|m|,i, ⟨Θm, c (t = 0)⟩θ⟩r R|m|,i (r) Θm (θ) } e^{−λ²|m|,i t}

and we see that if ⟨Θm, c (t = 0)⟩θ = 0 then ⟨Θm, c⟩θ = 0 ∀ t > 0. Indeed if
⟨Θm, c (t = 0)⟩θ = 0 ∀ m other than m = 0 then we have

c (r, t) = Σ_{i=1}^{∞} ⟨R0,i, c (t = 0)⟩r R0,i (r) e^{−λ²0,i t}

and so if c (t = 0) is uniform in θ then c itself is uniform in θ ∀ t > 0.

Sketches of J0 (z), J1 (z), J2 (z), . . .

[Sketch: J0 (z), J1 (z), J2 (z) versus z, with the zeros λ01, λ11, λ21, λ02, λ12, λ22, λ03, λ13, λ23 marked along the axis]


indicate that the positive zeros of J0 , J1 , J2 , . . . are ordered in the following way: The lowest is
the first zero of J0 , the next lowest is the first positive zero of J1 , then the first positive zero of J2 ,
before the second zero of J0 .

Hence as t grows large the last remaining term in our solution corresponds to m = 0, i = 1
and this term is uniform in θ. The next to the last term corresponds to m = 1, i = 1, not to
m = 0, i = 2. Indeed estimates of how large t must be before the series can be replaced by its first
term are too short if we look at λ01 and λ02 . We need to look at λ01 and λ11 .
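The ordering λ01 < λ11 < λ21 < λ02 can be confirmed numerically; summing the power series of Jm and bisecting on hand-chosen brackets are choices made here, not part of the text:

```python
import math

# J_m from its power series: J_m(x) = sum_k (-1)^k (x/2)^{2k+m} / (k! (k+m)!)
def J(m, x):
    term = (0.5 * x) ** m / math.factorial(m)
    s = term
    for k in range(1, 40):
        term *= -(x * x) / (4.0 * k * (k + m))
        s += term
    return s

def zero(m, lo, hi):
    # bisection for a zero of J_m bracketed by (lo, hi)
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        lo, hi = (lo, mid) if J(m, lo) * J(m, mid) <= 0 else (mid, hi)
    return 0.5 * (lo + hi)

l01, l11 = zero(0, 2.0, 3.0), zero(1, 3.0, 4.5)
l21, l02 = zero(2, 4.5, 6.0), zero(0, 5.0, 6.0)
print(l01 < l11 < l21 < l02)                        # True
print([round(v, 3) for v in (l01, l11, l21, l02)])  # [2.405, 3.832, 5.136, 5.52]
```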

The terms corresponding to larger values of | m| and i die out faster than the terms corre-
sponding to smaller values of | m| and i. This is the smoothing we associate with diffusion as
the eigenfunctions corresponding to larger values of | m| and i exhibit more oscillations and their
contribution to the solution dies out faster.

The sketch below of J0 (λ01 r), J0 (λ02 r), J0 (λ03 r), . . . shows that this set of orthogonal func-
tions is constructed from J0 (z) by scaling, in turn, its positive zeros, z1 , z2 , z3 , . . . to 1. Likewise
J1 (λ11 r), J1 (λ12 r), J1 (λ13 r), . . . is constructed from J1 (z) in just this same way. Etc. This illus-
trates the rule that in a set of orthogonal functions each function can be identified by the number
of its interior zeros. It also shows that the zeros of any two functions in such a set are nested.

[Sketch: J0 (λ01 r), J0 (λ02 r), J0 (λ03 r) on 0 ≤ r ≤ 1]

As the last remaining term in our solution as t grows large is a multiple of J0 (λ01 r), it is im-
portant that J0 (λ01 r) be singly signed. It is also important that J1 (z = 0) = 0 otherwise the eigen-
functions J1 (λ1i r) cos θ would be poorly behaved as r → 0. Likewise
J2 (z = 0) = 0 = J2′ (z = 0) and this is important as the eigenfunctions exhibiting J2 as a fac-
tor are multiplied by cos 2θ. Etc.

We have been a little lucky. Ordinarily, in orthogonal coordinate systems we have

dV = hξ hη hζ dξ dη dζ

hence the inner product over V factors into one dimensional inner products only under special
conditions. And these conditions are not satisfied in the elliptic cylinder coordinates introduced in
§17.5.

Further, we have

∇² = (1/(hξ hη hζ)) ∂/∂ξ ( (hη hζ / hξ) ∂/∂ξ ) + · · ·

and for separation of variables to work out as simply as above places strong requirements on hξ, hη
and hζ, conditions not satisfied in all orthogonal coordinate systems.

In the next lecture we solve some problems where what we have done in this lecture is suffi-
cient.

17.4 The Zeros of cos z and J0(z)

To determine the zeros of cos z or J0 (z) in terms of the coefficients in their power series expansions
we observe that if q (z) has a simple zero at z0 then q′(z)/q(z) has a simple pole there and its
residue is 1. Then because the contour integral

(1/2πi) ∮_C cos′w / ((w − z) cos w) dw

is equal to the sum of the residues of its integrand at its poles inside C and because the integral
vanishes as the diameter of C grows large, we get

0 = cos′z/cos z + 1/(z1 − z) + 1/(−z1 − z) + · · ·

where the zeros of cos z are real and where 0 < z1 < z2 < · · · denote the positive zeros. Writing
this as

cos′z/cos z = 2z/(z² − z1²) + 2z/(z² − z2²) + · · ·

or

−(1/2z)(cos′z/cos z) = Σᵢ (1/zᵢ²) · 1/(1 − z²/zᵢ²)

and expanding 1/(1 − z²/zᵢ²) as Σⱼ (z²/zᵢ²)ʲ when |z²| < |zᵢ²| we get

−(1/2z)(cos′z/cos z) = Σ_{i=1}^{∞} (1/zᵢ²) Σ_{j=0}^{∞} (z²/zᵢ²)ʲ = Σ_{j=0}^{∞} z²ʲ Σ_{i=1}^{∞} (1/zᵢ²)^{j+1}

where cos z is a function of z² and (d/dz²) cos z = (1/2z) cos′z. Using this to write

−(d/dz²) cos z = cos z · Σ_{j=0}^{∞} z²ʲ Σ_{i=1}^{∞} (1/zᵢ²)^{j+1}
and expanding both sides in powers of z², we can evaluate the sums Σ_{i=1}^{∞} (1/zᵢ²)^{j+1}. Thus, using
the series defining cos z, viz.,

cos z = 1 − (1/2) z² + (1/24) z⁴ − (1/720) z⁶ + · · ·

and

(d/dz²) cos z = −1/2 + (1/12) z² − (1/240) z⁴ + · · ·

we get

Σ 1/zᵢ² = 1/2

Σ 1/zᵢ⁴ = 1/6

etc.

Indeed √6 = 2.449 is already a fair approximation to z1² = π²/4 = 2.467.
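Since the positive zeros of cos z are zi = (2i − 1)π/2, both sums can be verified by partial summation (a sketch; the truncation at 200 000 terms is arbitrary):

```python
import math

# partial sums of sum_i 1/z_i^2 and sum_i 1/z_i^4 over the zeros z_i = (2i-1) pi/2
N = 200_000
s2 = sum(1.0 / ((2 * i - 1) * math.pi / 2) ** 2 for i in range(1, N + 1))
s4 = sum(1.0 / ((2 * i - 1) * math.pi / 2) ** 4 for i in range(1, N + 1))
print(round(s2, 4), round(s4, 6))   # 0.5 0.166667  (the first sum converges slowly)
```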
The corresponding equations for J0 are

Σ 1/zᵢ² = 1/4

Σ 1/zᵢ⁴ = 1/32

Σ 1/zᵢ⁶ = 1/192

etc.


where ∛192 = 5.769 is a very good approximation to the square of the smallest positive zero of
J0.
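The Euler–Rayleigh computation for J0 can be organized as a recurrence in exact arithmetic: with u = z², J0 = Σ aₖ uᵏ where aₖ = (−1)ᵏ/(4ᵏ (k!)²), and −f′ = f (S₁ + S₂u + · · ·) gives Sₖ = −k aₖ − Σ_{j=1}^{k−1} aⱼ S_{k−j} for the sums Sₖ = Σᵢ 1/zᵢ²ᵏ. A sketch:

```python
import math
from fractions import Fraction as F

# coefficients of J0 as a series in u = z^2: a_k = (-1)^k / (4^k (k!)^2)
a = [F((-1) ** k, 4 ** k * math.factorial(k) ** 2) for k in range(5)]

# power sums S_k = sum_i 1/z_i^(2k) via the recurrence from -f' = f (S_1 + S_2 u + ...)
S = {}
for k in range(1, 4):
    S[k] = -(k * a[k] + sum(a[j] * S[k - j] for j in range(1, k)))

print(S[1], S[2], S[3])   # 1/4 1/32 1/192
```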

This method was used by Rayleigh and goes back to Euler.



17.5 Separation of Variables in Two More Coordinate Systems

(I) Spherical coordinates in four dimensions

We let w, x, y and z denote rectangular Cartesian coordinates in four dimensions and define spher-
ical coordinates via

w = r cos ω

x = r sin ω cos θ

y = r sin ω sin θ cos φ

z = r sin ω sin θ sin φ

Then we have hr = 1, hω = r, hθ = r sin ω and hφ = r sin ω sin θ and hence


   
∇² = (1/r³) ∂/∂r (r³ ∂/∂r) + (1/(r² sin²ω)) ∂/∂ω (sin²ω ∂/∂ω)
   + (1/(r² sin²ω sin θ)) ∂/∂θ (sin θ ∂/∂θ) + (1/(r² sin²ω sin²θ)) ∂²/∂φ²

The result of substituting ψ = R (r) Ω (ω) Θ (θ) Φ (φ) into (∇² + λ²)ψ, dividing by RΩΘΦ and
identifying terms which must be constant is

d²Φ/dφ² + m²Φ = 0

(1/sin θ) d/dθ (sin θ dΘ/dθ) + ( ℓ(ℓ + 1) − m²/sin²θ ) Θ = 0

(1/sin²ω) d/dω (sin²ω dΩ/dω) + ( k(k + 2) − ℓ(ℓ + 1)/sin²ω ) Ω = 0

and

(1/r³) d/dr (r³ dR/dr) + ( λ² − k(k + 2)/r² ) R = 0

(II) Confocal elliptical cylinder coordinates

We define elliptical cylinder coordinates via

x = c cosh ξ cos η

y = c sinh ξ sin η

z=z

Then we find hξ = hη = c √(sinh²ξ + sin²η) and hz = 1 and hence

∇² = (1/(c²(sinh²ξ + sin²η))) ( ∂²/∂ξ² + ∂²/∂η² ) + ∂²/∂z²

Substituting ψ = X (ξ) Y (η) Z (z) into (∇² + λ²)ψ = 0 we get

d²Z/dz² + γ²Z = 0

d²X/dξ² + { c² sinh²ξ (λ² − γ²) − m² } X = 0

d²Y/dη² + { c² sin²η (λ² − γ²) + m² } Y = 0

We notice that λ² appears in two equations. This is new. And we see that the orthogonality
does not factor, viz.,

⟨ψij, ψkℓ⟩ = ∫∫ Xi (ξ) Xk (ξ) Yj (η) Yℓ (η) ( c² sinh²ξ + c² sin²η ) dξ dη

This is new.

17.6 Home Problems

1. Normalization of Bessel Functions

To normalize Bessel’s functions you need to evaluate integrals such as

∫₀¹ Jm(λr)² r dr

To do this put ψ = Jm (λr) and multiply

(1/r) d/dr (r dψ/dr) = ( −λ² + m²/r² ) ψ

by 2r² (dψ/dr) to obtain

2r (dψ/dr) d/dr (r dψ/dr) = 2( −λ²r² + m² ) ψ (dψ/dr)

Now this is

d/dr [ (r dψ/dr)² ] = ( −λ²r² + m² ) d/dr (ψ²)

hence you have

[ (r dψ/dr)² ]₀¹ = −λ² ∫₀¹ r² (d/dr)(ψ²) dr + m² [ ψ² ]₀¹

and integrating by parts gets you a formula for

∫₀¹ ψ² r dr

2. Heat is generated in a circle of radius R. The temperature at the edge is held fixed at T = 0.
The rate of heat generation is a linear function of temperature, increasing as temperature

increases. Hence we have

∇²T + λ²(1 + T) = 0

T = 0 at r = R

What is the greatest value of R at which there is a solution to our problem, at a fixed
value of λ²? { We could ask: what is the greatest value of λ² at which there is a solution to
our problem at a fixed value of R? }.

A cooling pipe of radius R0 is introduced at the center of the circle. Its temperature is
T = 0. By how much can R be increased?

It is not possible to center the cooling pipe precisely. What is its effect if it is off center
by a small amount ε?

In the expansion

R = R0 + ε R1 (θ) + (1/2) ε² R2 (θ) + · · ·

we have

R1 = cos θ,  R2 = −sin²θ/R0

3. Assume the temperature, T > 0, is specified at the cross section z = 0 of an infinite circular
cylinder of radius R. The walls are held at T = 0. Find the temperature in the cylinder,
z > 0, by solving

−∂²T/∂z² = (1/r) ∂/∂r (r ∂T/∂r)

Do this by using the solutions to the eigenvalue problem

(1/r) d/dr (r dψ/dr) + λ²ψ = 0,  0 < r < R, and ψ = 0 at r = R

This should bring to mind the problem

∂T/∂t = (1/r) ∂/∂r (r ∂T/∂r)

where T is specified at t = 0.

4. The eigenvalue problem

( d²/dr² + (1/r) d/dr )² ψ = λ⁴ψ

where ψ is bounded at r = 0 and

ψ (r = 1) = 0

ψ ′ (r = 1) = 0

has the solution

ψ = AJ0 (λr) + BI0 (λr)

due to
 
( d²/dr² + (1/r) d/dr ) J0 (λr) = −λ² J0 (λr)

and

( d²/dr² + (1/r) d/dr ) I0 (λr) = λ² I0 (λr)

Hence A, B and λ satisfy

AJ0 (λ) + BI0 (λ) = 0



and

AλJ0′ (λ) + BλI0′ (λ) = 0

whereupon, to have a solution such that A and B are not both zero, the λ’s must satisfy


λ ( J0 (λ) I0′(λ) − J0′(λ) I0 (λ) ) = 0

and then we have

B = −A J0 (λ) / I0 (λ)

or


ψ = A ( J0 (λr) I0 (λ) − I0 (λr) J0 (λ) )

Your job is to prove


∫₀¹ r ψλ ψμ dr = 0,  λ ≠ μ

and estimate the first few values of λ.

Denote by W (λ) the Wronskian of J0 (λ) and I0 (λ) and show that

dW/dλ = −(1/λ) W + 2 J0 (λ) I0 (λ)

where

W (λ) = J0 (λ) dI0 (λ)/dλ − I0 (λ) dJ0 (λ)/dλ

The λ’s satisfy W (λ) = 0 and you will find that W (λ) is not a very nice function.
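For orientation only (the problem asks you to do this properly), the first root of W(λ) = 0 can be bracketed and bisected numerically; the power series for J0 and I0 and the finite-difference derivatives are implementation choices:

```python
import math

def J0(x):
    # J0(x) = sum_k (-1)^k (x^2/4)^k / (k!)^2
    term = s = 1.0
    for k in range(1, 40):
        term *= -(x * x) / (4.0 * k * k)
        s += term
    return s

def I0(x):
    # I0(x) = sum_k (x^2/4)^k / (k!)^2
    term = s = 1.0
    for k in range(1, 40):
        term *= (x * x) / (4.0 * k * k)
        s += term
    return s

def W(lam, h=1e-6):
    # W = J0 I0' - J0' I0, derivatives taken by central differences
    dJ = (J0(lam + h) - J0(lam - h)) / (2 * h)
    dI = (I0(lam + h) - I0(lam - h)) / (2 * h)
    return J0(lam) * dI - dJ * I0(lam)

lo, hi = 2.0, 4.0   # W changes sign on this bracket
for _ in range(60):
    mid = 0.5 * (lo + hi)
    lo, hi = (lo, mid) if W(lo) * W(mid) <= 0 else (mid, hi)

print(round(0.5 * (lo + hi), 3))   # ~3.196, the smallest positive root
```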

5. Show that
( d²/dr² + (1/r) d/dr )² ψ = λ⁴ψ

has solutions

A J0 (λr) + B Y0 (λr) + C I0 (λr) + D K0 (λr)

and that

d⁴ψ/dz⁴ = λ⁴ψ

has solutions

A sin λz + B cos λz + C sinh λz + D cosh λz

Each of these solutions has enough flexibility to satisfy four boundary conditions.

To solve
( ∂²/∂r² + (1/r) ∂/∂r + ∂²/∂z² )² ψ = 4λ⁴ψ

observe that
 
( ∂²/∂r² + (1/r) ∂/∂r + ∂²/∂z² ) I0 (λr) sin λz = 0

( ∂²/∂r² + (1/r) ∂/∂r + ∂²/∂z² ) I0 (λr) sinh λz = 2λ² I0 (λr) sinh λz

etc.

Hence you have solutions

 
( A J0 (λr) + B Y0 (λr) ) ( C sin λz + D cos λz )

and

 
( A I0 (λr) + B K0 (λr) ) ( C sinh λz + D cosh λz )

and also

( A J0 (√3 λr) + B Y0 (√3 λr) ) ( C sinh λz + D cosh λz )

etc.

But it is not easy to find ψ’s with enough flexibility to satisfy the boundary conditions
you are likely to meet. Show that the same problem arises if you are trying to solve

∇⁴ψ = ( ∂²/∂x² + ∂²/∂y² )² ψ = 0

viz., show that about the best you can do is


 
ψ = { A sin λy + B cos λy } { Â sinh λx + B̂ cosh λx + Ĉ (1/2λ) x sinh λx + D̂ (1/2λ) x cosh λx }

6. You are to solve

∇4 φ = f (x, y)

where φ = 0 = φx at x = −1 and x = 1 and where φ and φy are specified as functions of x


at y = 0 and y = 1.

You begin by solving the eigenvalue problem

d⁴ψ/dx⁴ = λ⁴ψ

where ψ = 0 = ψ′ at x = −1 and x = 1 and find even solutions, viz.,

ψλ = A cos λx + B cosh λx

where tan λ = −tanh λ and odd solutions, viz.,

ψλ = C sin λx + D sinh λx

where tan λ = tanh λ. You then prove


∫₋₁⁺¹ ψλ′ ψλ dx = 0,  λ ≠ λ′
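For orientation, the first root of the even-mode condition tan λ = −tanh λ lies between π/2 and π and is easy to locate by bisection (a sketch, with an arbitrary bracket; not part of the problem statement):

```python
import math

# even modes: tan(lam) = -tanh(lam); f changes sign on (pi/2, pi)
def f(lam):
    return math.tan(lam) + math.tanh(lam)

lo, hi = math.pi / 2 + 1e-6, math.pi - 1e-6
for _ in range(60):
    mid = 0.5 * (lo + hi)
    lo, hi = (lo, mid) if f(lo) * f(mid) <= 0 else (mid, hi)

print(round(0.5 * (lo + hi), 4))   # 2.365, the smallest even-mode lambda
```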

Thus you write

φ (x, y) = Σλ cλ (y) ψλ (x)

where
cλ (y) = ∫₋₁⁺¹ ψλ (x) φ (x, y) dx

and then to find the equation satisfied by the cλ’s you multiply ∇⁴φ = f by ψλ and integrate
over −1 ≤ x ≤ 1 obtaining

∫₋₁⁺¹ ψλ (∂⁴φ/∂x⁴) dx + 2 (d²/dy²) ∫₋₁⁺¹ ψλ (∂²φ/∂x²) dx + (d⁴/dy⁴) ∫₋₁⁺¹ ψλ φ dx = ∫₋₁⁺¹ ψλ f dx

On carrying out integration by parts as many times as you need to, you discover one,
and only one, technical difficulty: the term

2 (d²/dy²) ∫₋₁⁺¹ (d²ψλ/dx²) φ dx

appears and it is not easy to write d²ψλ/dx² in terms of ψλ.
It turns out, in two dimensional problems, the theory of complex variables comes to
your rescue: if you write z = x + iy then

∇²( y f(z) ) = 2 i f′(z)



in any region where f (z) has a Taylor series.

7. You are solving

∂c/∂t = ∇²c

on a bounded plane domain, c (t = 0) assigned and c = 0 at the edge of the domain.

You introduce the eigenvalue problem

∇²ψ + λ²ψ = 0

on the domain and ψ = 0 at the edge and write your solution

c = Σ ⟨ψ, c (t = 0)⟩ e^{−λ²t} ψ

Hence all you need are the solutions to the eigenvalue problem in order to estimate c.

First, suppose the domain is a circle of radius R0 . Find the two eigenvalues in control
of the final stages of solute loss to the surroundings.

Then suppose the domain is a small displacement of the circle defined by

R (θ) = R0 + ε R1 + (1/2) ε² R2 + · · ·

where R1 = cos θ, R2 = −sin²θ/R0, and find the corrections to the above two eigenvalues in
order to learn whether the diffusion of solute is faster or slower, i.e., find λ1² and λ2² in the
series

λ² = λ0² + ε λ1² + (1/2) ε² λ2² + · · ·

The hope is you find λ1² = 0 = λ2² for both eigenvalues and you are curious to know
why this is so.

To see, take a circle of radius R0, displace its center to x = ε, y = 0 and determine R1
and R2 in the expansion

R (θ) = R0 + ε R1 + (1/2) ε² R2 + · · ·

8. Elliptical cylinder coordinates are defined by

x = c cosh ξ cos η

y = c sinh ξ sin η

z=z

In the x, y plane the family of curves defined by holding ξ fixed, 0 ≤ ξ < ∞, is a


family of confocal ellipses. The family defined by holding η fixed, 0 ≤ η < 2π, is a family
of confocal hyperbolas. The two branches of each hyperbola correspond to four values of η.
As ξ → ∞, we have

y/x = tan η

and hence the four values of η are the angles that the branches make with the positive x axis.
The ellipses and hyperbolas are centered at x = 0 = y and their foci lie at x = ±c, y = 0.
The curves η = π/3, 2π/3, 4π/3 and 5π/3, corresponding to the two branches of one hyperbola, are
shown in the sketch.

[Sketch: the two branches of the hyperbola, labeled η = π/3, 2π/3, 4π/3, 5π/3, with η = 0, π, 2π along the x axis and foci at (−c, 0) and (c, 0)]

Write ∇² in this coordinate system and reduce the eigenvalue problem (∇² + λ²)ψ = 0
to three one dimensional eigenvalue problems by separation of variables.

Suppose a solute distribution is assigned at t = 0 in the domain bounded by

x²/a² + y²/b² = 1,  a² − b² = c²

z = 0,  z = d

The planes z = 0 and z = d are not permeable to solute.

Assuming that the solute concentration on the lateral surface of the elliptical cylinder
does not depend on z, show that c (t > 0) is independent of z iff c (t = 0) is independent
of z.

Suppose ξ = ξ1 defines the elliptical cylinder, viz.,

a² = c² cosh²ξ1,  b² = c² sinh²ξ1

and suppose that our domain is in contact with a solute free reservoir so that c (ξ = ξ1 ) = 0
for all t > 0. Then determine whether or not the solute concentration is always independent
of η if it is initially independent of η. Moon and Spencer’s book “Field Theory Handbook:

Including Coordinate Systems, Differential Equations, and their Solutions” is a useful refer-
ence.

9. A solute is initially distributed throughout an infinitely long pipe of radius R according to


c (t = 0) = c0 (r, z). Determine the solute concentration for all t > 0 by expanding c in
the radial eigenfunctions of ∇2 and then using the point source solution to the longitudinal
diffusion equation to determine the coefficients of the eigenfunctions in the expansion. The
wall of the pipe is impermeable to the solute.

10. Occasionally it is possible to use eigenfunctions in a region of simple shape to derive eigen-
functions in a not so simple region of interest. To do this we make linear combinations of the
eigenfunctions we have in a way that satisfies the conditions on the boundary of the region
of interest.

For example sin mπx sin nπy and sin nπx sin mπy are eigenfunctions of ∇2 in the
square, 0 ≤ x ≤ 1, 0 ≤ y ≤ 1, satisfying ψ = 0 along x = 0, x = 1, y = 0 and
y = 1. The eigenvalue is


λ² = π² ( m² + n² )

You are to derive eigenfunctions and eigenvalues of ∇2 on the region 0 ≤ x ≤ 1,


0 ≤ y ≤ 1−x satisfying ψ = 0 on x = 0, y = 0 and x+y = 1. Do this by making x+y = 1
a line where a linear combination of the eigenfunctions on the square, corresponding to a
fixed eigenvalue, vanishes.
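A sketch of the idea for one such combination (the choice m = 1, n = 3 is illustrative; for the difference combination m and n must have the same parity): sin(mπx) sin(nπy) − sin(nπx) sin(mπy) vanishes on x = 0, y = 0 and, for that parity choice, on x + y = 1, with eigenvalue π²(m² + n²).

```python
import math

def psi(m, n, x, y):
    # antisymmetric combination of the two square eigenfunctions
    return (math.sin(m * math.pi * x) * math.sin(n * math.pi * y)
            - math.sin(n * math.pi * x) * math.sin(m * math.pi * y))

# vanishes along the hypotenuse x + y = 1 when m, n have the same parity
on_hyp = [psi(1, 3, x, 1.0 - x) for x in (0.05, 0.25, 0.6, 0.9)]
print(all(abs(v) < 1e-9 for v in on_hyp))   # True
print(abs(psi(1, 3, 0.2, 0.3)) > 1e-3)      # nonzero in the interior -> True
```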

11. You are to solve the eigenvalue problem

∇²ψ + λ²ψ = 0,  r ≤ R (θ),  0 ≤ θ < 2π

where

ψ = 0 at r = R (θ)

The region is an ellipse obtained by slightly displacing a circle of radius R0 , holding


the area fixed.

Thus, you write

a = R0 + ε

b = R0 − ε + ε²/R0 − · · ·

and

R (θ) = R0 + ε R1 (θ) + (1/2) ε² R2 (θ) + · · ·

and substitute these expansions into

b²x² + a²y² = a²b²

where x = R cos θ, y = R sin θ, to find R1 (θ) , R2 (θ) , . . . By doing this you should have
R1 (θ) = cos 2θ and R2 (θ) = your job.

Then any eigenfunction and eigenvalue on the circle, viz., ψ0 , λ20 , can be corrected for
a slight displacement by writing

ψ = ψ0 + ε ψ1 + (1/2) ε² ψ2 + · · ·

λ² = λ0² + ε λ1² + (1/2) ε² λ2² + · · ·

deriving the equations for ψ1, λ1² and ψ2, λ2² on the circle in the usual way and then deriving

the conditions satisfied by ψ1 and ψ2 at r = R0 by writing


 
ψ( r = R (θ) ) = ψ0 + ε ( ψ1 + R1 ∂ψ0/∂r )
   + (1/2) ε² ( ψ2 + 2 R1 ∂ψ1/∂r + R1² ∂²ψ0/∂r² + R2 ∂ψ0/∂r ) + · · ·

where the RHS is evaluated at r = R0 . To see how this goes first take ψ0 = J0 (λ0 r) where
J0 (λ0 R0 ) = 0 and then ψ0 = J1 (λ0 r) cos θ where J1 (λ0 R0 ) = 0.

You should notice that at each order, 1st, 2nd, etc., the homogeneous problem is the zeroth
order problem and it has a solution, viz., ψ0, not zero. Hence a solvability condition must
be satisfied. This determines λ1² at first order, before ψ1, and λ2² at second order, before ψ2, etc. In
solving for ψ1, ψ2, etc. you ought to expand the inhomogeneous terms in 1, cos θ, cos 2θ, . . . ,
e.g., cos²θ = 1/2 + (1/2) cos 2θ.
2 2

12. The setting for the free radical problem, see Lecture 14, is now a very long circular cylinder
of radius R. What is the critical value of R in terms of k and D?

By how much can the critical value of R be increased if the cylinder is of length L and
c = 0 at z = 0 and L.

13. Normalization of Bessel Functions

Define ψ by

ψ = AJm (λr) + BYm (λr)

and notice that ψ satisfies


 
(1/r) d/dr (r dψ/dr) + ( λ² − m²/r² ) ψ = 0


Multiply this equation by 2r² (dψ/dr) and integrate the product over R1 < r < R2. Derive a
formula for

∫_{R1}^{R2} r ψ² dr

in terms of ψ and ψ′ at R1 and R2.

14. The frequencies of oscillation of a spinning column of inviscid fluid

An inviscid fluid lies in a cylinder of radius R, spinning at angular velocity Ω.

The equations are

∂v⃗/∂t + v⃗ · ∇v⃗ = −∇p

∇ · v⃗ = 0

where p denotes p/ρ.
Show that

vr = 0, vθ = rΩ, vz = 0

satisfies these equations.

Introduce a small perturbation, write the perturbation equations and assume



vr1 = v̂r (r) e^{iσt} e^{imθ} e^{ikz}

vθ1 = v̂θ (r) e^{iσt} e^{imθ} e^{ikz}

vz1 = v̂z (r) e^{iσt} e^{imθ} e^{ikz}

p1 = p̂ (r) e^{iσt} e^{imθ} e^{ikz}

to obtain

i(σ + mΩ) v̂r − 2Ω v̂θ = −dp̂/dr

i(σ + mΩ) v̂θ + 2Ω v̂r = −(im/r) p̂

i(σ + mΩ) v̂z = −ik p̂

and

dv̂r/dr + v̂r/r + (im/r) v̂θ + ik v̂z = 0

This is an eigenvalue problem where σ is the eigenvalue, m and k are inputs.


Eliminate (im/r) v̂θ + ik v̂z from the last equation and v̂θ from the first two, thereby
obtaining two equations in v̂r and p̂. Then eliminate v̂r, whereupon you have

d²p̂/dr² + (1/r) dp̂/dr + ( k² ( 4Ω²/(σ + mΩ)² − 1 ) − m²/r² ) p̂ = 0
dr 2 r dr (σ + mΩ)2 r

and at r = R, v̂r = 0 implies

dp̂/dr + ( 2Ωm/(σ + mΩ) ) (p̂/r) = 0

Find the frequencies of oscillation, given k², in the simple case m = 0. At m = 0 we
have a problem in σ² but at m ≠ 0 it is a problem in (σ + mΩ)².

15. Solve the Petri dish problem in a circular domain, assuming homogeneous Dirichlet conditions along the circumference.

16. A cold rod of radius κR0 lies inside a hot pipe of radius R0 . The temperatures T cold and

T hot are held fixed and the temperature of the fluid in the annular region is

A + B ln(r/R0)

where

T cold = A + B ln κ

and

T hot = A

The fluid and the cylinders are spinning at constant angular velocity, Ω, such that the fluid
velocity is Ω⃗ × r⃗ where Ω⃗ = Ω k⃗ and k⃗ lies along the axis of the rod.

The density of the fluid depends on temperature via

 
ρ = ρref ( 1 − α (T − T ref) )

and our base state exhibits an unstable density stratification.

Accounting for the variation of ρ only in the v⃗ · ∇v⃗ terms derive the equations satisfied
by a small perturbation of the base state, assume a solution

vr1 = v̂r (r) e^{σt} e^{imθ}

vθ1 = v̂θ (r) e^{σt} e^{imθ}

p1 = p̂ (r) e^{σt} e^{imθ}

T1 = T̂ (r) e^{σt} e^{imθ}

and derive the equations for v̂r, v̂θ, p̂, and T̂. There is no gravity, no z variation, no vz and
for κ near 1 the base temperature is more or less linear.

Making this approximation find the critical value of T hot −T cold, i.e., the smallest value
of T hot − T cold such that σ = 0.

17. A hot rod of radius R0 loses heat by conduction to a cold pipe of radius κR0 , κ > 1. Their
temperatures are Th > Tc . Derive a formula for the rate of heat loss. Move the rod off center
by a small amount ε so that its surface is now

R (θ) = R0 + ε R1 (θ) + (1/2) ε² R2 (θ) + · · ·

R1 (θ) = cos θ,  R2 (θ) = −sin²θ/R0
and find out by how much the heat loss is changed.

18. The solutions to

∇²ψ + λ²ψ = 0,  0 ≤ r ≤ R0, 0 ≤ θ < 2π

where

ψ = 0 at r = R0

are

m = 0:  J0 (λr),  J0 (λR0) = 0

m = 1:  J1 (λr) cos θ,  J1 (λR0) = 0

etc.

If the circle is displaced into an ellipse of the same area, viz.,

r = R (θ) = R0 + εR1 + · · ·

where

ab = R0²

a = R0 + ε

b = R0²/a = R0 ( 1 − ε/R0 + · · · )

we have
 
R1 = cos 2θ,  R2 = (1/R0) ( −1/2 − cos 2θ + (3/2) cos 4θ ),  · · ·

and we wish to solve

∇²ψ + λ²ψ = 0,  x²/a² + y²/b² ≤ 1

where

ψ = 0 at x²/a² + y²/b² = 1

Writing the eigenvalues and eigenvectors on the ellipse

λ² = λ0² + ε λ1² + · · ·

ψ = ψ0 + ε ψ1 + · · ·

where λ20 and ψ0 are the corresponding eigenvalues and eigenvectors on the circle, derive the
result

∀ λ20 : λ21 = 0 at m=0

∀ λ20 : λ21 ≠ 0 at m=1



This second result ought to surprise you.

19. Your job is to estimate the solution to

∇2 ψ + λ2 ψ = 0, ψ=0 on all sides of the domain (see below)

The curved side is specified by

y = Y (x) = Y0 + ε Y1 (x) + (1/2) ε2 Y2 (x) + · · ·

where Y1 (x) , Y2 (x) , · · · are inputs. You can make your job easy by assuming
Y1 (x) = sin 2πx, Y2 (x) = 0, . . .

[Figure: the domain, 0 ≤ x ≤ 1, 0 ≤ y ≤ Y (x), with the curved top side y = Y (x) near y = Y0.]

The reference domain is



[Figure: the reference domain, 0 ≤ x0 ≤ 1, 0 ≤ y0 ≤ Y0.]

The solutions to

∇02 ψ0 + λ20 ψ0 = 0, ψ0 = 0 on all sides

are

ψ0 = sin mπx0 sin (nπ y0/Y0) ,    m, n = 1, 2, . . .

and

λ20 = m2 π2 + n2 π2 / Y02

First select an eigenvalue to be corrected and write

λ2 = λ20 + ε λ21 + (1/2) ε2 λ22 + · · ·

where λ20 , ψ0 correspond to definite values of m and n, say m0 , n0 , held fixed henceforth.

Then the λ21 , λ22 , . . . problems are, first,

∇02 ψ1 + λ20 ψ1 = −λ21 ψ0 , ψ1 = 0 on three sides

ψ1 = −Y1 (x0) ∂ψ0/∂y0 (x0 , y0 = Y0)    on y0 = Y0
and, second,

∇02 ψ2 + λ20 ψ2 = −2λ21 ψ1 − λ22 ψ0 , ψ2 = 0 on three sides

ψ2 = −2Y1 (x0) ∂ψ1/∂y0 (x0 , y0 = Y0) − Y12 (x0) ∂2ψ0/∂y02 (x0 , y0 = Y0)
      − Y2 (x0) ∂ψ0/∂y0 (x0 , y0 = Y0)    on y0 = Y0

etc.

At each order the homogeneous problem has a solution, not zero. Hence a solvability condition
must be satisfied and these conditions lead you to λ21 , λ22 , etc.

First, derive a formula for λ21 , viz.,


∫0^1 dx0 (−Y1 (x0)) ( ∂ψ0/∂y0 (x0 , y0 = Y0) )2 = λ21 ∫0^Y0 ∫0^1 ψ02 dx0 dy0

Second, solve for ψ1 by deriving formulas for the coefficients Amn , where

ψ1 = Σm Σn Amn ψ0mn

and where
Amn = ∫∫ ψ0mn ψ1 dx0 dy0

assuming ∫∫ (ψ0mn)2 dx0 dy0 = 1.
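For the easy choice Y1 (x) = sin 2πx the first-order shift λ21 in fact vanishes for every mode, since sin 2πx integrates to zero against sin2 (mπx). A quick numerical check of the solvability formula (a sketch assuming numpy and scipy are available; the values of Y0, m, n are arbitrary test choices, not from the text):

```python
import numpy as np
from scipy.integrate import quad

# psi0 = sin(m pi x0) sin(n pi y0 / Y0); on y0 = Y0 its normal derivative is
# (n pi / Y0) sin(m pi x0) cos(n pi)
Y0, m, n = 0.7, 3, 2     # arbitrary test values

dpsi0 = lambda x: (n * np.pi / Y0) * np.sin(m * np.pi * x) * np.cos(n * np.pi)
Y1 = lambda x: np.sin(2 * np.pi * x)

num, _ = quad(lambda x: -Y1(x) * dpsi0(x) ** 2, 0.0, 1.0)
den = 0.25 * Y0          # ∫∫ psi0^2 dx0 dy0 = (1/2)(Y0/2)
lam21 = num / den
print(lam21)             # vanishes to quadrature accuracy
```

Any other Y1 with a nonzero projection on sin2 (mπx) would give a nonzero λ21.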

20. You wish to find the eigenvalues, λ2 , where

∇2 ψ + λ 2 ψ = C

ψ = 0, at the sides

and
∫∫A ψ dx dy = 0

The domain is the square:

[Figure: the square −L ≤ x ≤ L, −L ≤ y ≤ L.]

and ψ, C and λ2 are the outputs. Your interest is the solutions where C ≠ 0.

The problem

∇2 φ + µ 2 φ = 0

and

φ = 0, at the sides

has solutions

sin (mπx/L) sin (nπy/L) ,    m, n = 1, 2, . . .

sin (mπx/L) cos ((n + 1/2) πy/L) ,    m = 1, 2, . . .  n = 0, 1, . . .

cos ((m + 1/2) πx/L) sin (nπy/L) ,    m = 0, 1, . . .  n = 1, 2, . . .

all of which integrate to zero hence all of which are ψ’s and λ2 ’s corresponding to C = 0.

It also has solutions


   
cos ((m + 1/2) πx/L) cos ((n + 1/2) πy/L) ,    m, n = 0, 1, . . .

none of which integrates to zero.

Expanding ψ in these solutions, at C ≠ 0, derive

Σm,n ( ∫∫ φmn dx dy )2 / (λ2 − µ2mn) = 0

and estimate the smallest λ2 .
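The smallest λ2 can be estimated by truncating the double sum: the secular function decreases monotonically between consecutive poles µ2mn, so its first root is pinned between µ200 and the next pole. A numerical sketch (numpy and scipy assumed; L = 1 and the truncation order are my choices, and a common positive factor in the weights is dropped since it does not move the root):

```python
import numpy as np
from scipy.optimize import brentq

L, M = 1.0, 60                                # truncation order M is an assumption
m = np.arange(M)[:, None] + 0.5
n = np.arange(M)[None, :] + 0.5
mu2 = (m**2 + n**2) * np.pi**2 / L**2         # eigenvalues of the cos-cos modes
w = 1.0 / (m * n)**2                          # (∫∫ φmn dxdy)², up to a common factor

f = lambda lam2: np.sum(w / (lam2 - mu2))     # the secular function
lo = mu2[0, 0]                                # = π²/2L², the first pole
hi = (0.5**2 + 1.5**2) * np.pi**2 / L**2      # the next pole
lam2 = brentq(f, lo * 1.0001, hi * 0.9999)
print(lam2)                                   # the smallest λ² with C ≠ 0
```

Because f runs from +∞ to −∞ and is strictly decreasing on the interval, `brentq` finds the unique root there.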


Lecture 18

Two Stability Problems

Using what we did in Lecture 17 we can work out two stability problems: the Saffman-Taylor
problem (P. G. Saffman, G. I. Taylor, Proc. Roy. Soc., Vol. 245, 312, 1958) and the Rayleigh-
Taylor problem. (S. Chandrasekhar, “Hydrodynamic and Hydromagnetic Stability.”)

The setting for each is a cylindrical column of circular cross section bounding a porous solid.
Fluid fills the pores and its velocity is given by Darcy’s law.

We will need the balances, expressing the conservation laws, across a surface separating two
phases, denoted (1) and (2):

[Figure: phase (2) above and phase (1) below the surface z = Z (x, y, t); ~n is the unit normal
and ~u = u ~n is the normal velocity of the surface.]

where

~n = (~k − Zx~i − Zy~j) / √(1 + Zx2 + Zy2)

LECTURE 18. TWO STABILITY PROBLEMS 458

u = Zt / √(1 + Zx2 + Zy2)

and
  
2H = ( (1 + Zy2) Zxx − 2 Zx Zy Zxy + (1 + Zx2) Zyy ) / (1 + Zx2 + Zy2)^3/2

and where H denotes the mean curvature of the surface.

Assuming the phases are immiscible, neither crossing the surface, we have at z = Z (x, y, t):

~n · ~v (1) = u = ~n · ~v (2)

which can be written as

{ vz − Zx vx − Zy vy } (1) = Zt = { vz − Zx vx − Zy vy } (2)

and

−~n~n : T~ (1) + γ2H = −~n~n : T~ (2)

which can be written, in the case of a Darcy fluid, as

p (1) + γ2H = p (2)

where γ denotes the surface tension.

We introduce the notation


∇ = ~k ∂/∂z + ∇H

and

~v = vz~k + ~vH

and we plan to work in terms of p (1), p (2) and Z.

The present domain, bounded by the surface z = Z (x, y, t) is obtained by a displacement of


the reference domain, bounded by the surface z = Z0 .

Hence for a domain variable, say p, at the surface z = Z we will write


 
p (Z) = p0 (Z0) + ε ( p1 (Z0) + Z1 dp0/dz (Z0) ) + · · ·

assuming

Z = Z 0 + εZ 1 + · · ·

where p0 , p1 , . . . are defined on the reference domain.

18.1 The Saffman-Taylor Problem

In this problem the stability of the surface separating two immiscible fluids is of interest, one fluid
displacing the other in a porous rock. The flow is in the z direction at a speed U and gravity is not
important. What is important is that the viscosity of the two fluids differs.

[Figure: a column of circular cross section, radius R; fluid of viscosity µ occupies z > Z, fluid of
viscosity µ∗ occupies z < Z, the surface z = Z (r, θ, t) lies near z = Z0 = 0 and the displacement
speed is U.]

We introduce an observer moving at the velocity U ~k. Then, in the moving frame, we write
the nonlinear equations making up our model for the dynamics of the surface, z = Z (r, θ, t),
separating the two phases. First, we have Darcy’s law, which is not Galilean invariant, above and
below the surface, viz.,

(µ/K) (~v + ~U) = −∇p ,    z > Z

and

(µ∗/K) (~v∗ + ~U) = −∇p∗ ,    z < Z

where ∇ · ~v = 0 = ∇ · ~v∗, where ~U = U~k and where K denotes the permeability of the porous
solid, whereupon

∇2 p = 0 = ∇2 p ∗

At the side walls, r = R, no flow implies

~n · ~v = 0 = ~n · ∇H p, z>Z

and

~n · ~v ∗ = 0 = ~n · ∇H p∗, z<Z

and contact at right angles to the wall implies ~n · ∇H Z = 0. Far from the surface z = Z the
pressures p and p∗ must be bounded.

At the surface, z = Z, we have

vz − ~vH · ∇H Z = Zt = vz∗ − ~vH∗ · ∇H Z

and

p − p∗ = γ2H

All of the nonlinearities in our model appear at the surface.

Our base solution, denoted by the subscript zero, is

~v0 = ~0 = ~v0∗

dp0 µ
=− U
dz K

and

dp∗
0 µ∗
=− U
dz K

where the surface separating the two fluids lies at z = Z0 = 0, defining the base domain.

Imposing a small displacement on our base solution and denoting the perturbation variables by
the subscript 1, viz., Z = Z0 + εZ1 , we obtain the perturbation problem. It is defined on the base
domain and we have

∇2 p1 = 0 ,    z > 0

and

∇2 p1∗ = 0 ,    z < 0

At the side walls we have

vr1 = 0  ∴  ∂p1/∂r = 0 at r = R

vr1∗ = 0  ∴  ∂p1∗/∂r = 0 at r = R

and

∂Z1/∂r = 0 at r = R

And at z = 0 we have
 
( p1 + Z1 dp0/dz ) − ( p1∗ + Z1 dp0∗/dz ) = γ∇H2 Z1

−(K/µ) ∂p1/∂z = vz1 = ∂Z1/∂t = vz1∗ = −(K/µ∗) ∂p1∗/∂z

and

∫0^2π ∫0^R Z1 r dr dθ = 0

where we assume no volume change on perturbation of the base surface.

This is a linear problem in p 1, p 1∗ and Z1, where each of these variables satisfies homogeneous
Neumann conditions at r = R.

Hence, to solve it, we introduce the eigenvalue problem

∇H2 ψ + λ2 ψ = 0

∂ψ/∂r = 0 at r = R

and ψ bounded at r = 0 and we write its solution

ψ = Jm (λr) cos mθ

where λ is a root of

J′m (λR) = 0

and J′m (x) denotes (d/dx) Jm (x).

Our plan is to determine the growth rate of surface displacements in the shape of any of the
allowable eigenfunctions.

To do this we separate variables and write

p1 = pb1 (z) ψ (r, θ) eσt

p1∗ = pb1∗(z) ψ (r, θ) eσt

and

Z1 = Zb1ψ (r, θ) eσt

whereupon we find

d2 pb1/dz2 − λ2 pb1 = 0 ,    z > 0

d2 pb1∗/dz2 − λ2 pb1∗ = 0 ,    z < 0

and at z = 0 we have

−(K/µ) dpb1/dz = σ Zb1 = −(K/µ∗) dpb1∗/dz

( pb1 + Zb1 (−µU/K) ) − ( pb1∗ + Zb1 (−µ∗U/K) ) = −γλ2 Zb1

and

∫0^2π ∫0^R Zb1 ψ eσt r dr dθ = 0

Assuming pb1 is bounded as z → ∞ and pb1∗ is bounded as z → − ∞ we have

pb1 = Ae−λz

and

pb1∗ = A∗eλz

hence at z = 0 we find

(K/µ) λA = σ Zb1 = −(K/µ∗) λA∗

and

( A + Zb1 (−µU/K) ) − ( A∗ + Zb1 (−µ∗U/K) ) = −γλ2 Zb1

The inputs to our problem are U and R, the output is σ and we have three linear homogeneous
equations in A, A∗ and Zb1 which have a non vanishing solution iff the determinant of the matrix
of coefficients vanishes. This determines σ, the growth rate of a surface displacement in the shape
ψ. The readers can work this out.
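Working the determinant out is a small symbolic computation. A sketch in sympy (the variable names and ordering of the rows are mine; the three rows are the conditions at z = 0 derived above):

```python
import sympy as sp

A, As, Z, sigma = sp.symbols('A A_star Z_hat sigma')
K, mu, mus, U, gamma, lam = sp.symbols('K mu mu_star U gamma lam', positive=True)

# the three homogeneous conditions at z = 0, written as expressions equal to zero
eqs = [(K/mu)*lam*A - sigma*Z,                               # kinematic, upper fluid
       (K/mu)*lam*A + (K/mus)*lam*As,                        # kinematic, lower fluid
       (A - Z*mu*U/K) - (As - Z*mus*U/K) + gamma*lam**2*Z]   # normal stress

M = sp.Matrix([[e.expand().coeff(v) for v in (A, As, Z)] for e in eqs])
sigma_sol = sp.solve(sp.det(M), sigma)[0]

expected = lam*(U*(mu - mus) - K*gamma*lam**2)/(mu + mus)
print(sp.simplify(sigma_sol - expected))   # 0: the determinant gives this growth rate
```

Setting σ = 0 in the resulting expression recovers the critical condition (U/K) (µ − µ∗) = γλ2 stated below.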

To determine the critical value of U, we set σ = 0 whereupon A = 0 = A∗ and we find

(U/K) (µ − µ∗) = γλ2

which tells us this: if µ∗ > µ there is no critical condition, i.e., the surface separating a more
viscous fluid displacing a less viscous is stable to any small displacement. A critical value of U is
possible iff µ∗ < µ, i.e., a less viscous fluid displacing a more viscous fluid. A plot of the critical
value of U vs λ2 , then, looks as follows

[Figure: Ucrit vs λ2 — a straight line through the origin; the region above the line is unstable,
the region below it stable.]

and if we mark the allowable values of λ2 on the abscissa we see that the lowest allowable λ2 sets
the pattern of the instability.

The allowable λ2 ’s come from the roots of

J0′ (λR) = 0, J1′ (λR) = 0, J2′ (λR) = 0 etc.

where we have

[Figure: J0 (x), J1 (x) and J2 (x) vs x, with the roots of J′0 (x) = 0, J′1 (x) = 0 and J′2 (x) = 0 marked.]

We look first at the eigenfunctions ψ = J 0 (λr) , where J ′0 (λR) = 0, and observe that
J ′0 (x) = 0 has a root at x = 0, hence we have a solution λ = 0 and ψ = 1. This solution

does not satisfy


∫0^2π ∫0^R ψ r dr dθ = 0

and hence it is not marked on the diagram. Every other root of J ′0 (x) = 0 is allowable because

∫0^R J0 (λr) r dr = (1/λ) [ r J1 (λr) ]0^R

and J1 (x) = −J ′0 (x).

All the eigenfunctions J m (λr) cos mθ, where J ′m (λR) = 0, are allowable due to

∫0^2π cos mθ dθ = 0

and we observe that J ′2 (x) , J ′3 (x), etc., all vanish at x = 0 but in each case the corresponding
eigenfunction is zero.

Hence the lowest allowable value of λ2 corresponds to J′1 (λR) = 0 whence

(Ucrit/Kγ) (µ − µ∗) = λ2

where λR is the first root of J′1 (x) = 0.
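The root is tabulated in scipy, which ships zeros of J′m. A quick sketch (the material parameters below are illustrative assumptions, not values from the text):

```python
from scipy.special import jnp_zeros

x1 = jnp_zeros(1, 1)[0]        # first positive root of J1'(x) = 0
print(round(x1, 4))            # 1.8412

# illustrative inputs (assumed): K [m^2], gamma [N/m], mu, mu_star [Pa s], R [m]
K, gamma, mu, mu_star, R = 1e-12, 0.03, 1e-3, 1e-4, 0.05
lam2 = (x1 / R) ** 2
U_crit = K * gamma * lam2 / (mu - mu_star)
print(U_crit)                  # critical displacement speed for this column
```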

Thus the pattern we should expect to see as we increase U in an experiment to just beyond Ucrit
should have a cos θ dependence, viz.,

[Figure: the cos θ pattern — up (+) on one half of the cross section, down (−) on the other —
along with the radial dependence J1 (λr), 0 ≤ r ≤ R.]

Now we can ask another question: at what value of R does the surface become unstable, given
that U is fixed at a positive value?

From the above we have

(U/Kγ) (µ − µ∗) = λ2 ,    λ2 = x2/R2

where the x’s are roots of J ′1 (x) = 0.



If R is very small the right hand side is very large even for the smallest root of J ′1 (x) = 0.
Hence small diameter columns are stable unless U is very large. Upon increasing R we arrive at
its critical value where

(U/Kγ) (µ − µ∗) = x21 / R2crit

and where x1 is the smallest positive root of J ′1 (x) = 0.

18.2 The Rayleigh-Taylor Problem

This problem does not differ from the Saffman-Taylor problem by much. Here the instability is
caused by gravity and ~g takes the place of ~U. Again we set the problem in a porous rock and use
Darcy’s law.

We have two fluids of different density lying in a gravitational field, the heavy fluid above the
light fluid.

[Figure: heavy fluid, density ρ, above the surface z = Z (r, θ, t); light fluid, density ρ∗, below;
gravity ~g = −g~k.]

We write our equations in the laboratory frame as

(µ/K) ~v = −∇p + ρ~g ,    ∇ · ~v = 0

for both fluids and hence we have

∇2 p = 0, z>Z

and

∇2 p∗ = 0, z<Z

The boundary conditions are as before:

∂p/∂r = 0 ,    at r = R, z > Z

and

∂p∗/∂r = 0 ,    at r = R, z < Z

due to no flow across the side walls, p must be bounded as z → ∞ and p∗ must be bounded as
z → − ∞ and, at the surface z = Z,

~n · ~v = Zt / √(1 + Z2r + (1/r2) Z2θ) = ~n · ~v∗

due to no flow across the surface separating the fluids,

p − p∗ = γ2H

and

∫0^2π ∫0^R Z (r, θ, t) r dr dθ = 0

At this point we do something a little different than before. We are going to change the bound-
ary conditions satisfied by Z and require pinned edges in place of free edges, i.e., Z = 0 at r = R.
Hence the boundary conditions satisfied by p1, p1∗ and Z 1 at the wall in the perturbed problem
differ and will not allow us to separate variables as we did above, viz., p1 and p1∗ will be ask-
ing for one set of ψ’s, corresponding to Neumann conditions, Z 1 will be asking for another set
corresponding to Dirichlet conditions. Therefore we are limited in what we can do easily.

We do not ask for σ, instead we set σ to zero and look for the neutral condition. By setting σ
to zero in the perturbation problem we have

∇2 p1 = 0 = ∇2 p1∗

∂p1/∂r = 0 = ∂p1∗/∂r at r = R

and

∂p1/∂z = 0 = ∂p1∗/∂z at z = 0

where p1 and p1∗ must be bounded. Hence, at the critical value of R, p1 and p1∗ must be constants,
but not necessarily zero as would be the case if the edges of the surface were free instead of pinned.

Then the equation for Z1 is

 
( p1 + Z1 dp0/dz ) − ( p1∗ + Z1 dp0∗/dz ) = γ2H1

where, setting γC = p1 − p1∗ and using

dp0/dz = −ρg ,    dp0∗/dz = −ρ∗g    and    2H1 = ∇H2 Z1

we have

γC − g (ρ − ρ∗) Z1 = γ∇H2 Z1

where

Z1 = 0 at r = R, Z1 bounded at r = 0

and
∫0^2π ∫0^R Z1 r dr dθ = 0

This is a homogeneous problem in Z1 and C and we are looking for the value of g (ρ − ρ∗) / γ
such that Z1 is not zero.

Thus we write

∇H2 Z1 + λ2 Z1 = C ,    Z1 = 0 at r = R

and
∫0^2π ∫0^R Z1 r dr dθ = 0

and we have an eigenvalue problem, the eigenvalue λ2 corresponding to the eigenfunction Z1, C.
The question then is: for what value of R can (g/γ) (ρ − ρ∗) be one of the eigenvalues of this problem?
Writing ψ in place of Z1 we scale the problem and obtain

∇H2 ψ + λ2 ψ = C

ψ = 0 at r = 1

and

∫0^2π ∫0^1 ψ r dr dθ = 0

where C denotes CR2 and where λ2 is now independent of R. Then

R2 (g/γ) (ρ − ρ∗)

must be one of the λ2 's and we can look for critical values of R given (g/γ) (ρ − ρ∗).
First we see that if ρ∗ > ρ the surface is stable to small perturbations for all values of R.
This is the case of a light fluid lying above a heavy fluid. Then for ρ > ρ∗ and R very small,
R2 (g/γ) (ρ − ρ∗) will be less than all λ2 's. And a heavy fluid lying above a light fluid will be stable
to small perturbations.

As we increase R, the critical value of R will be reached when R2 (g/γ) (ρ − ρ∗) becomes equal
to λ21 , the smallest eigenvalue among the set of λ2 's satisfying our eigenvalue problem.

For m = 1, 2, . . . we have solutions

ψ = Jm (λr) cos mθ, C=0

where the λ’s are the positive roots of Jm (x) = 0, viz.,

[Figure: J1 (x) and J2 (x) vs x; the first positive root of J1 (x) = 0 is x1 = 3.83171.]

The least of these is the first positive root of J1 (x) = 0.

This leaves only the case of axisymmetric disturbances, viz., m = 0, where we cannot use
∫0^2π cos mθ dθ = 0, m = 1, 2, . . . , to easily conclude that C = 0. In fact at m = 0, C is not zero.
Indeed at m = 0 we have

ψ = A J0 (λr) + C/λ2

which is the general solution to

d2ψ/dr2 + (1/r) dψ/dr + λ2 ψ = C

assuming ψ is bounded.

Now λ cannot be zero for then C must be zero and we have ψ = A, whereupon A must be
zero. Hence to find the positive λ’s we observe that

ψ (r = 1) = 0

and
∫0^1 ψ r dr = 0

imply that

A J0 (λ) + C/λ2 = 0

and
A ∫0^1 J0 (λr) r dr + (1/2) (C/λ2) = 0

and we have two homogeneous equations in A and C.

Then using

(d/dr) ( r J1 (λr) ) = λ r J0 (λr)

and requiring a solution other than A = 0 = C we obtain

λJ0 (λ) = 2J1 (λ)

whose solutions are the values of λ at m = 0. The lowest solution lies to the right of x1 , the
smallest positive root of J1 (x) = 0. Hence we conclude that the critical value of R is given by

R2 (g/γ) (ρ − ρ∗) = x21

whence the surface will break in a non axisymmetric, viz., m = 1, mode.

We present a graph of x J0 (x) − 2J1 (x) vs x and indicate the first few positive roots.

Roots of x J0 (x) − 2J1 (x) = 0:

0, 5.1355, 8.4172, 11.6195, 14.7960, 17.9598, 21.117, 24.2701, 27.4206, 30.5692, 33.7165,
36.8628, 40.0084, . . .
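These roots are easy to regenerate: bracket the sign changes of x J0 (x) − 2J1 (x) on a grid and refine each bracket. A sketch using scipy (the grid range and resolution are my choices; x = 0 is also a root but is excluded by starting the grid away from it):

```python
import numpy as np
from scipy.optimize import brentq
from scipy.special import j0, j1, jn_zeros

f = lambda x: x * j0(x) - 2.0 * j1(x)

xs = np.linspace(0.5, 41.0, 4000)
idx = np.nonzero(np.diff(np.sign(f(xs))))[0]
roots = [brentq(f, xs[i], xs[i + 1]) for i in idx]
print(np.round(roots, 4))          # compare with the table above

x1 = jn_zeros(1, 1)[0]             # first positive root of J1(x) = 0, x1 = 3.8317
print(roots[0] > x1)               # the lowest m = 0 root lies to its right
```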
We also record the first positive zeros of J1 (x) and J ′1 (x). They are

J1 (x) = 0 : x1 = 3.8317

J ′1 (x) = 0 : x1 = 1.8412

Hence the case of pinned edges is most unstable to a cos θ perturbation and, although both free
and pinned edges are most unstable to a cos θ perturbation, it takes a much larger value of R to
destabilize the case where the edges are pinned.

18.3 Home Problems

1. Convection caused by an adverse temperature gradient.

You have two isothermal horizontal planes bounding a fluid. The lower one, at z = 0,
is hot, the upper one, at z = H, is cold. The density of the fluid depends on its temperature

via


ρ = ρref (1 − α (T − Tref))

Hence the fluid layer is unstably stratified, heavy over light.

The base solution is

~v0 = ~0 ,    dT0/dz = −(TH − TC)/H

Assume the problem is two dimensional, i.e., one horizontal dimension.

Your model is

ρ ∂~v/∂t + ρ~v · ∇~v = −∇p + µ∇2~v + ρ~g ,    ∇ · ~v = 0

and

∂T/∂t + ~v · ∇T = κ∇2 T

where we have no side walls and at z = 0, H we have

vz = 0 ,    ∂vz/∂x + ∂vx/∂z = 0

corresponding to no flow and no shear.

Introduce a small perturbation of the base solution, eliminate p1 by differentiation, eliminate
vx1 by ∇ · ~v1 = 0 and obtain

ρ (∂/∂t) ∇2 vz1 = µ∇2 ∇2 vz1 + ρref α g ∂2T1/∂x2

and

∂T1/∂t + vz1 dT0/dz = κ∇2 T1

Your job is to find dT0/dz at neutral conditions where steady values of vz1 and T1 , not both
zero, prevail. Dropping ∂/∂t and scaling, you can obtain

∇2 ∇2 vz1 − ∆T ∂2T1/∂x2 = 0
∂x2

and

∇2 T1 + vz1 = 0

where T1 = 0 = vz1 at z = 0, 1

∂2vz1/∂x2 − ∂2vz1/∂z2 = 0 at z = 0, 1

and where ∆T is a scaled temperature difference.

Eliminating T1 you get

∇2 ∇2 ∇2 vz1 + ∆T ∂2vz1/∂x2 = 0

and you can solve this by separation of variables, viz.,

vz1 = A sin nπz cos kx

to find the critical value of ∆T as a function of n and k.

Your result should look like this, where σ = 0 curves are plotted as ∆T vs. k 2 .

[Figure: the neutral (σ = 0) curves, ∆T vs k2 , for n = 1, 2, etc.; above the n = 1 curve the layer
is unstable to n = 1, above the n = 2 curve to n = 1, 2, and the minimum of the n = 1 curve gives
∆Tcrit at k2crit , below which all is stable.]


where k2 is an input, telling you the horizontal wave length of the perturbation, 2π/k. You will
notice that very long and very short wave length disturbances are very stable.
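With vz1 = A sin nπz cos kx the sixth-order equation gives a neutral curve |∆T| (k2) = (n2π2 + k2)3 / k2 for each n (up to the sign convention buried in the scaling). A numerical sketch of the minimization over k2 (scipy assumed; the search bounds are my choices):

```python
import numpy as np
from scipy.optimize import minimize_scalar

neutral = lambda k2, n=1: (n**2 * np.pi**2 + k2) ** 3 / k2   # n = 1 branch

res = minimize_scalar(neutral, bounds=(1e-3, 50.0), method='bounded')
k2_crit, dT_crit = res.x, res.fun
print(k2_crit, dT_crit)
# stationarity gives k² = π²/2 and the minimum 27π⁴/4 ≈ 657.5, the classical
# free-free critical value
```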

Observe that

T1 = B sin nπz cos kx

vx1 = C cos nπz sin kx

and

p1 = D cos nπz cos kx

and find B, C and D in terms of A, n2 and k 2 .

Using integration by parts, work out

∫0^1 { a (d2/dz2 − k2)3 b − b (d2/dz2 − k2)3 a } dz

What is going on in this problem is this: An element of fluid in equilibrium, buoyancy
balancing gravity, upon being given an upward displacement, finds itself surrounded
by colder fluid of a higher density. Its increased buoyancy reinforces the displacement and

the density stratification is unstable. This is offset by the fact that our element of fluid is now
hotter than its new surroundings and it cools, its density increasing.

2. You have a porous rock bounded by a cylinder of constant cross section. At the top and
bottom are planes held at constant temperature, viz., Thot at z = 0, Tcold at z = H.

The side wall is an insulated, no-flow surface. The top and bottom planes are isothermal
no-flow surfaces.

Your model is

(µ/K) ~v = −∇p − ρ g ~k ,    Darcy's law

∇ · ~v = 0

and

∂T/∂t + ~v · ∇T = κ ∇2 T

where

~n · ~v = 0 at all surfaces

~n · ∇T = 0 at the side walls

T = Thot at z = 0

and

T = Tcold at z=H

and where ∇ρ = −ρref α ∇T .



The base solution is

T0 = T0 (z) ,

~v0 = ~0 ,

dp0/dz = −ρ (T0) g ,

dT0/dz = −(Thot − Tcold)/H < 0

You may proceed without declaring the shape of the cross section by writing

~v = vz ~k + ~vH

and


∇ = ~k ∂/∂z + ∇H

whereupon

∇2 = ∂2/∂z2 + ∇H2

∇ · ~v = ∂vz/∂z + ∇H · ~vH

(µ/K) vz = −∂p/∂z − ρ g

and

(µ/K) ~vH = −∇H p

Then you can derive

∇2 vz = (K/µ) ρref α g ∇H2 T

and

∂T/∂t + vz ∂T/∂z + ~vH · ∇H T = κ∇2 T

And using ~n · ~vH = 0 = ~n · ∇H T at the walls you have

~n · ∇H vz = 0 = ~n · ∇H T

You introduce a small perturbation of the base solution and derive the perturbation
problem for vz1 and T1 . And, assuming a steady solution at the critical value of dT0/dz, you
solve your problem by separation of variables, viz.,

vz1 = vbz1 (z) ψ (x, y)

T1 = Tb1 (z) ψ (x, y)

where ψ is any solution to the eigenvalue problem on the cross section:

∇H2 ψ + λ2 ψ = 0 on the domain

~n · ∇H ψ = 0 at the edge

Derive the result:


(K/µ) ρref α g (1/κ) (−dT0/dz) = (n2π2/H2 + λ2)2 / λ2 ,    n = 1, 2, . . .

and draw a sketch, LHS vs. λ2 .

So far the cross section has not come into the problem. But at this point it determines
the allowable values of λ2 and these depend on the shape and diameter of the cross section.

Assume the cross section is a circle of radius R0 and deduce the convection pattern seen
at critical as R0 increases from a small value to its critical value, at fixed T hot − T cold.

The dip in the plot of LHS vs. λ2 allows you to see many patterns at the critical value
of T hot − T cold. Set n = 1 and assume the cross section to be one dimensional having side
walls at x = 0 and x = L. Then ψ = cos kx (k = λ) where the allowable values of k are
mπ/L, m = 0, 1, 2, . . . .
For small values of L the most dangerous value of k corresponds to m = 1. As L
increases show that the most dangerous value of k corresponds to increasing values of m
and, therefore, that many patterns can be seen at the critical value of T hot − T cold depending
on the width of the cell.
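The pattern selection described above can be seen numerically: with n = 1 the left-hand side, viewed as a function of λ2 = (mπ/L)2, dips at λ2 = π2/H2, so the winning m tracks L/H as the cell widens. A sketch (H = 1 and the sample widths are my choices):

```python
import numpy as np

H = 1.0
lhs = lambda lam2: (np.pi**2 / H**2 + lam2) ** 2 / lam2   # n = 1 branch

def most_dangerous_m(L, m_max=50):
    m = np.arange(1, m_max)           # m = 0 gives lam = 0 and is excluded
    lam2 = (m * np.pi / L) ** 2
    return m[np.argmin(lhs(lam2))]

for L in (0.5, 1.0, 2.5, 4.0):
    print(L, most_dangerous_m(L))     # the winning m grows with the width L
```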

3. Assume the cross section in Problem 2 to be a thin rectangle of length a and width b, a >> b.
Is it the value of a or b that controls the critical temperature difference?

4. Your job is to look again at the Rayleigh-Taylor problem, assuming Darcy’s law tells you the
velocity. Do this on an arbitrary cross section, writing


∇ = ~k ∂/∂z + ∇H

and

~v = vz ~k + ~vH

and suppose that the surface is not pinned at the edge but contacts the side wall at right
angles, viz.,

~n · ∇H Z = 0

The question is: is there an effect of the fluid depths on the critical diameter of the cross
section?

At infinite depths you have p1(1) = 0 = p1(2) whereas at finite depths you have, instead,
p1(1) = c1 , p1(2) = c2 . And your equation for Z1 is then

c2 − c1 + g ( −ρ(2) + ρ(1) ) Z1 = γ∇H2 Z1

on the cross section.

Show that assuming


∫∫cross section Z1 dx dy = 0

implies c2 − c1 = 0 and therefore the depths are immaterial.

Do you think this would also be true if the edges were pinned, viz., Z1 = 0 at the edges?

5. You are going to try to predict what you might see in a Rayleigh-Taylor experiment, assum-
ing you see the pattern having the greatest growth rate.

You have a cylinder of circular cross section. The radius is denoted R. A heavy fluid,
density ρ, lies above a light fluid, density ρ⋆. The surface separating the fluids is denoted
z = Z (r, θ, t) and at first the two fluids are at rest, being separated by the horizontal surface
z = Z0 = 0. You have ~v0 = ~0 = ~v0⋆, dp0/dz = −ρg and dp0⋆/dz = −ρ⋆g.
The domain equations are

(µ/K) ~v = −∇p − ρg ~k

and

∇ · ~v = 0

and therefore

∇2 p = 0

for both fluids and at the side walls you have ~n · ~v = 0 and therefore ~n · ∇p = 0.

At the surface z = Z (r, θ, t) you have

vz − Zr vr − (Zθ/r) vθ = Zt = vz⋆ − Zr vr⋆ − (Zθ/r) vθ⋆

and

p − p⋆ = γ2H

The surface is given a small perturbation, viz., Z = Z0 +ε Z1 and your perturbation equations
are then

∇2 p1 = 0 = ∇2 p1⋆

∂p1/∂r = 0 = ∂p1⋆/∂r at r = R

and

∂Z1/∂r = 0 at r = R

assuming the surface contacts the side walls at right angles.

At z = 0 you have

−(K/µ) ∂p1/∂z = vz1 = Z1t = vz1⋆ = −(K/µ⋆) ∂p1⋆/∂z

and
 
( p1 + Z1 dp0/dz ) − ( p1⋆ + Z1 dp0⋆/dz ) = γ2H1 = γ∇H2 Z1

Writing

p1 = pb1 (z) Jm (λr) eimθ eσt



p1⋆ = pb1⋆ (z) Jm (λr) eimθ eσt

and

Z1 = Zb1 Jm (λr) eimθ eσt

where

J′m (λR) = 0

you have

pb1 = Ae−λz

and

pb1⋆ = A⋆eλz

where pb1 is assumed to be bounded as z → ∞, likewise pb1⋆ is assumed to be bounded as
z → −∞, leaving A, A⋆ and Zb1 to satisfy the conditions at z = 0, viz.,

(K/µ) Aλ = σ Zb1 = −(K/µ⋆) A⋆λ

and


A − A⋆ = ( −γλ2 + (ρ − ρ⋆) g ) Zb1

whereupon to have a solution A, A⋆ and Zb1 not all zero you find

σ (µ + µ⋆) / K = λ ( −γλ2 + (ρ − ρ⋆) g )

Your first job is to make certain all of the above is correct.



Your second job is to notice that σ vs λ is one curve, you can sketch it and you can observe
that there is a greatest value of σ. The curve rises due to the kinematic condition then falls
due to surface tension, crossing zero at λ2 = (ρ − ρ⋆) g / γ.
Now the allowable λ’s depend on R via


Jm (λR) = 0

Denoting by xm the solutions to J′m (x) = 0 your λ's are λ = xm/R and you have
R

[Figure: J0 (x) and J1 (x) vs x, with the roots of J′0 (x) = 0 and J′1 (x) = 0 marked.]

where x = 0 is ruled out by holding the volume constant on perturbation, viz.,

∫0^R Z1 r dr = 0

For a small value of R all the λ2 's lie to the right of ∆ρg/γ and all perturbations are stable. As
R increases they all move leftward but maintain their order. Soon the lowest moves to the
left of ∆ρg/γ and the problem is unstable to the corresponding perturbation. Upon increasing
R we can make any of the λ2 's the fastest growing and that λ2 determines the pattern you
see.
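The fastest-growing λ follows from stationarity of σ (λ) = (K/(µ + µ⋆)) λ ((ρ − ρ⋆) g − γλ2), which gives λ2 = ∆ρ g / 3γ. A numerical check (all parameter values below are illustrative assumptions, not from the text):

```python
import numpy as np
from scipy.optimize import minimize_scalar

# assumed illustrative values
K, mu, mu_s, g, gamma, drho = 1e-12, 1e-3, 1e-3, 9.81, 0.03, 100.0

sigma = lambda lam: K / (mu + mu_s) * lam * (drho * g - gamma * lam**2)
res = minimize_scalar(lambda lam: -sigma(lam),
                      bounds=(0.0, np.sqrt(drho * g / gamma)), method='bounded')
lam_fast = res.x
print(lam_fast, np.sqrt(drho * g / (3.0 * gamma)))   # the two agree
```

The allowable λ nearest this maximizer, i.e. the nearest xm/R, then sets the observed pattern.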

Your third job is to satisfy yourself that all of the above is true and that first you will see an
m = 1 pattern followed by an m = 0 pattern.

6. A heavy fluid lies above a light fluid. The two fluids are in hydrostatic equilibrium. The
problem is two dimensional. The interface is horizontal, its ends are pinned and the width of
the cell is such that the equilibrium is stable. The volume of the heavy fluid is 2LH.

[Figure: a two-dimensional cell, −L ≤ x ≤ L; the heavy fluid occupies the region up to z = H,
the light fluid down to z = −H∗; the flat interface lies at z = 0 and gravity ~g acts downward.]

Now you add heavy fluid and remove light fluid, resulting in

[Figure: the deflected interface z = Z0 (x), pinned at x = ±L.]

where the surface is denoted z = Z0 (x) and where at first Z0 (x) = 0.

The eigenvalue problem, viz.,


P1 − λ2 ψ = (d/dx) ( (1 + Z02 (x))^−3/2 dψ/dx )

ψ = 0 at x = ±L

and
∫−L^L ψ dx = 0

appears upon investigating the stability of these solutions.

Multiplying by ψ and integrating over −L < x < L you obtain

λ2 = ∫−L^L (1 + Z02 (x))^−3/2 ψx2 dx / ∫−L^L ψ2 dx

This is called a Rayleigh quotient and the facts about Rayleigh quotients are explained in
Weinberger’s book.
Every trial function you put into the RHS, satisfying ψ = 0 at x = ±L and ∫−L^L ψ dx = 0, gives
an estimate of λ21 lying above the true value of λ21 .

Your job is to satisfy yourself that λ21 in the first picture lies above λ21 in the second.
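A numerical illustration: the trial function ψ = sin (πx/L) is admissible (it vanishes at x = ±L and, being odd, integrates to zero), and for a flat interface the quotient returns π2/L2; putting any sag into the weight (1 + Z02)^−3/2 only lowers the estimate. A sketch (numpy and scipy assumed; the sag shape below is an arbitrary assumption, and comparing one-trial upper bounds is an illustration, not a proof):

```python
import numpy as np
from scipy.integrate import trapezoid

L, delta = 1.0, 0.6
x = np.linspace(-L, L, 20001)
psi = np.sin(np.pi * x / L)               # admissible trial function
psix = (np.pi / L) * np.cos(np.pi * x / L)

def rayleigh(Z0x):                         # Z0x = dZ0/dx on the grid
    w = (1.0 + Z0x**2) ** (-1.5)
    return trapezoid(w * psix**2, x) / trapezoid(psi**2, x)

flat = rayleigh(np.zeros_like(x))          # Z0 = 0
sagged = rayleigh(2 * delta * x / L**2)    # Z0 = delta (x²/L² − 1), assumed sag
print(flat, sagged)                        # flat ≈ π²/L²; the sagged bound is smaller
```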

7. You have a cylinder of arbitrary cross section bounded above and below by parallel horizon-
tal planes, one at z = 0, the other at z = H. The cylinder is filled with a porous solid whose
free space is filled with a liquid.

The density of the liquid depends on its temperature, viz.,


ρ = ρref (1 − α (T − T ref))

The lower plane is at temperature T hot, the upper plane is at temperature T cold. And you
need to take into account only the temperature dependence of the density.

All walls are no flow, the vertical side walls are adiabatic and the upper and lower walls are
isothermal.

The density is unstably stratified, heavy over light; and you are to, first, derive the critical
value of T hot at which flow sets in and then you are to increase T hot slightly beyond its
critical value and begin the process of estimating the steady flow above T hot critical.

The model is


→ K →

v = − ∇p − ρ g k , ∇·−

v =0
µ

and

∂T −
+→
v · ∇T = κ ∇2 T
∂t

where

T = T cold at z=H

T = T hot at z=0

vz = 0 at z = 0, H

and



n ·−

v =0=−

n · ∇T at the side walls.

Denoting the base solution by the subscript zero you have


~v0 = ~0

and

−dT0/dz = (T hot 0 − T cold 0) / H

Now set

∇ = ~k ∂/∂z + ∇H

and

~v = vz ~k + ~vH

assume steady conditions and write the model

vz = −(K/µ) ∂p/∂z − ρg

~vH = −(K/µ) ∇H p

∂vz/∂z + ∇H · ~vH = 0

and

vz ∂T/∂z + ~vH · ∇H T = κ ( ∂2/∂z2 + ∇H2 ) T

Then eliminating p derive

( ∂2/∂z2 + ∇H2 ) vz = −g∇H2 ρ = ρref α g ∇H2 T

At the side walls ~n · ~v = 0 implies ~n · ∇p = ~n · ∇H p = 0 and hence ~n · ∇H vz = 0.
And our boundary conditions are

T = T cold 0, vz = 0 at z=H

T = T hot 0 ,    vz = 0 at z = 0



~n · ∇H vz = 0 = ~n · ∇H T at the side walls

To find the critical value of T hot 0, impose a small perturbation on the base solution, denote
the perturbation variables by the subscript one and derive the perturbation problem at zero
growth rate, viz.,
 
( ∂2/∂z2 + ∇H2 ) vz1 − ρref g α ∇H2 T1 = 0

( ∂2/∂z2 + ∇H2 ) T1 + vz1 (1/κ) (−dT0/dz) = 0
where

T1 = 0 = vz1 at z = 0, H

and



~n · ∇H vz1 = 0 = ~n · ∇H T1 at the side walls.

Your job is to find the values of −dT0/dz at which this problem has solutions other than
vz1 = 0 = T1 .

To do this introduce the eigenvalue problem on the cross section

∇H2 ψ + λ2 ψ = 0

and



~n · ∇H ψ = 0 at the edge

and denote its solutions

λ21 , λ22 , . . .

ψ1 , ψ2 , . . .

These solutions depend on what the cross section is and if we denote its diameter by d then
the λ2 ’s are multiples of d−2 . Henceforth by not specifying d you can view λ2 as a continuous
variable.

Assuming we have a perturbation in the shape of the eigenfunction ψ we can separate vari-
ables and write

vz1 = vbz1 (z) ψ

and

T1 = b
T1 (z) ψ

whereupon vbz1 and Tb1 satisfy

( d2/dz2 − λ2 ) vbz1 + ρref g α λ2 Tb1 = 0

( d2/dz2 − λ2 ) Tb1 + vbz1 (1/κ) (−dT0/dz) = 0
and

vbz1 = 0 = Tb1 at z = 0, H

Thus you have

vbz1 = β sin (nπz/H)

and

Tb1 = sin (nπz/H) ,    n = 1, 2, . . .
Hence for each value of n you have the critical value of −dT0/dz as a function of λ2 , viz.,

(ρref α g/κ) (−dT0/dz) = (n2π2/H2 + λ2)2 / λ2
 
The least critical value of −dT0/dz occurs at n = 1, λ2 = π2/H2 , where

(ρref α g/κ) (−dT0/dz)crit = 4 π2/H2

and

β = (2π2/H2) / ( (1/κ) (−dT0/dz) )

Now you may advance T hot from T hot 0 at critical by writing

T hot = T hot 0 + (1/2) ε2 T hot 2

vz = vz0 + ε vz1 + (1/2) ε2 vz2

and

T = T0 + ε T1 + (1/2) ε2 T2

In the Petri dish problem in Lecture 15 you learned that the successful expansion of the
control variable depends on the nonlinearity at hand.

Your job is to find out that the expansion above is the correct expansion by proving that
solvability is satisfied at second order. Thus you have

vz0 = 0

T0 = T hot 0 + (T cold 0 − T hot 0) (z/H)
vz1 = A vbz1 ψ

and

T1 = A Tb1 ψ

where T hot 0 is the critical value of T hot, ψ corresponds to the critical value of λ2 and vbz1
and Tb1 are known from above.

What you are looking for is the value of A as a function of T hot 2 .

This can be found at third order if you can get through the second order problem without
finding that A = 0 which would tell you that your expansion of T hot is not correct.

At second order you should have


 
( ∂2/∂z2 + ∇H2 ) vz2 − ρref α g ∇H2 T2 = 0

( ∂2/∂z2 + ∇H2 ) T2 + vz2 (1/κ) (−dT0/dz) = (2/κ) ~v1 · ∇T1

vz2 = 0 at z = 0, H

T2 = 0 at z=H

and

T2 = T hot 2 at z=0

And

vz2 = 0 ,    T2 = T hot 2 (1 − z/H)

is a solution if (2/κ) ~v1 · ∇T1 is zero, so henceforth we set

T2 = 0 at z = 0

Your problem then is this

L (vz2 , T2)T = (0 , −(2/κ) ~v1 · ∇T1)T

and

vz2 = 0 = T2 at z = 0, H

The corresponding homogeneous problem has a nonzero solution, hence a solvability condition
must be satisfied, viz.,

∫0^H ∫∫A (vz1⋆ , T1⋆) (0 , −(2/κ) ~v1 · ∇T1)T dz dx dy = 0

where

L⋆ (vz1⋆ , T1⋆)T = (0 , 0)T

and

vz1⋆ = 0 = T1⋆ at z = 0, H

whereupon you have

vz1⋆ = β⋆ sin (πz/H) ψ

and

T1⋆ = sin (πz/H) ψ

Now you need to work out ~v1 · ∇T1 to obtain

~v1 · ∇T1 = A2 β (π/H) sin (πz/H) cos (πz/H) ∇H · ( ψ ∇H ψ / λ2 )

And then you can conclude that solvability is satisfied at second order due entirely to the z
integration, viz.,

∫0^H sin (πz/H) sin (πz/H) cos (πz/H) dz = 0

Thus, if we had to, we could go on to third order in the expectation of finding A as a function
of T hot 2. This is the way the Petrie dish problem worked out for the cubic nonlinearity
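The vanishing of that z integral is easy to confirm numerically; the sketch below (grid size illustrative) checks it for two layer depths.

```python
# A numerical check (sketch) that the z integration kills the solvability
# integral: sin^2(pi z/H) cos(pi z/H) integrates to zero over 0 <= z <= H.
import numpy as np

def solvability_integral(H, n=20001):
    z = np.linspace(0.0, H, n)
    vals = np.sin(np.pi * z / H) ** 2 * np.cos(np.pi * z / H)
    # composite trapezoid rule
    return float(np.sum((vals[1:] + vals[:-1]) * np.diff(z)) / 2.0)

print(solvability_integral(H=1.0), solvability_integral(H=2.5))  # both ~ 0
```

The exact antiderivative is $\frac{H}{3\pi}\sin^3\frac{\pi z}{H}$, which vanishes at both limits, so the quadrature result is zero to rounding.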
Lecture 19

Ordinary Differential Equations

19.1 Boundary Value Problems in Ordinary Differential Equations

As we now know, using the method of separation of variables to solve the eigenvalue problem
$(\nabla^2 + \lambda^2)\,\psi = 0$ reduces this problem to the solution of ordinary differential equations. In this
lecture we present the elementary facts about second order, linear, ordinary differential equations.

We denote by L a second order, linear, differential operator acting on a class of functions
defined on a finite interval and smooth enough so that we can use integration by parts in its ordinary
form. We let x denote the independent variable, scale the interval of interest so that 0 ≤ x ≤ 1 and
write

$$Lu = a(x)\,\frac{d^2u}{dx^2} + b(x)\,\frac{du}{dx} + c(x)\,u$$

We introduce the plain vanilla inner product

$$\langle u, v \rangle = \int_0^1 uv\, dx$$

and assume for now that u and v are real valued.

LECTURE 19. ORDINARY DIFFERENTIAL EQUATIONS 498

Integrating $u\,a\,\dfrac{d^2v}{dx^2}$ twice by parts and $u\,b\,\dfrac{dv}{dx}$ once, we can write $\langle u, Lv \rangle$ as

$$\langle u, Lv \rangle = \Big[\,a\,\{uv' - u'v\} - \{a' - b\}\,uv\,\Big]_0^1 + \langle L^{*}u, v \rangle$$

where $a'$ denotes $\dfrac{da}{dx}$, etc. By doing this we introduce the operator L∗, associated to L and called
its adjoint, where

$$L^{*}u = \frac{d^2}{dx^2}(au) - \frac{d}{dx}(bu) + cu$$

The adjoint, L∗, depends on the inner product we use.

A problem in second order, linear differential equations is specified in part by

Lu = f, 0<x<1

where f (x) is an assigned function on the interval (0, 1). To complete the specification of the
problem two boundary conditions must be assigned. Most of the boundary conditions of physical
interest can be taken into account by assigning values to two linear combinations of u and u′ at the
boundary, i. e., to two linear combinations of u (x = 0), u′ (x = 0), u (x = 1), u′ (x = 1). This
includes both initial value and boundary value problems. We limit ourselves to unmixed boundary
value problems and write the boundary conditions

at x = 0 : B0 u = a0 u + b0 u′ = g0

and

at x = 1 : B1 u = a1 u + b1 u′ = g1

where g0 and g1 are assigned real numbers.

Occasionally periodic conditions are imposed and these are of mixed type, viz.,

u (x = 0) − u (x = 1) = 0

and

u′ (x = 0) − u′ (x = 1) = 0

We will deal with these as exceptional cases.

A problem then is defined by the operators L, B0 and B1 and assigned sources f (x), g0 and
g1 . Associated with this is the adjoint problem and to formulate the adjoint problem we need to
identify the adjoint boundary operators B0∗ and B1∗ that go with the adjoint differential operator
L∗. To do this write

$$\langle u, Lv \rangle - \langle L^{*}u, v \rangle = \Big[\,a\left(uv' - u'v\right) - \left(a' - b\right)uv\,\Big]_0^1$$

and define B0∗ and B1∗ such that B0∗u = 0 = B1∗u and B0 v = 0 = B1 v imply

$$\Big[\,a\left(uv' - u'v\right) - \left(a' - b\right)uv\,\Big]_0^1 = 0$$

To illustrate this: if B0 v = v and B1 v = v then B0∗u = u and B1∗u = u. If B0 v = v′ and
B1 v = v′ then $B_0^{*}u = \left\{-(au)' + bu\right\}$ and $B_1^{*}u = \left\{-(au)' + bu\right\}$.

Our job is to decide whether or not the problem

Lu = f, 0<x<1

and

B0 u = g0 , B1 u = g1

has a solution and, if it does, to decide what it is. Naimark’s book “Linear Differential Operators”
deals with this, and more, but we do not need very many of Naimark’s results as we deal only with
second order differential operators and in this case a simplification obtains by which we need deal
only with self adjoint differential operators.

To see why this is so, observe that Lv, L∗u and uLv − (L∗u)v are

$$Lv = av'' + bv' + cv$$

$$L^{*}u = (au)'' - (bu)' + cu = au'' + (2a' - b)\,u' + (a'' - b' + c)\,u$$

and

$$uLv - (L^{*}u)\,v = \Big\{a\left(uv' - u'v\right) - (a' - b)\,uv\Big\}'$$

Then if b = a′ we get

$$L^{*} = L$$

and

$$uLv - (Lu)\,v = \Big\{a\left(uv' - u'v\right)\Big\}'$$

whereupon we have B0∗ = B0 and B1∗ = B1. Thus, if B0 u = 0 = B0 v we have

$$\begin{pmatrix} 0 \\ 0 \end{pmatrix} = \begin{pmatrix} B_0 u \\ B_0 v \end{pmatrix} = \begin{pmatrix} u(x=0) & u'(x=0) \\ v(x=0) & v'(x=0) \end{pmatrix} \begin{pmatrix} a_0 \\ b_0 \end{pmatrix}$$

and because a0 and b0 are not both zero, we conclude that

$$u(x=0)\,v'(x=0) - u'(x=0)\,v(x=0) = 0$$

Likewise if B1 u = 0 = B1 v, then

$$u(x=1)\,v'(x=1) - u'(x=1)\,v(x=1) = 0$$

This tells us that if B0∗ = B0 and B1∗ = B1 then $\big[\,a\left(uv' - u'v\right)\big]_0^1 = 0$.

Hence if b = a′, we get all of the following:

$$Lu = \frac{d}{dx}\!\left(a\,\frac{du}{dx}\right) + cu$$

and

$$L^{*} = L$$

$$B_0^{*} = B_0$$

and

$$B_1^{*} = B_1$$

When this is so, a problem defined by L, B0 and B1 is called self adjoint in the plain vanilla inner
product (L is called self adjoint if L∗ = L). We get all this by requiring only b = a′, but it must
be observed that we have assumed special forms for B0 and B1, yet none of this depends on the
values assigned to a0, b0, a1 and b1.

The condition b = a′ is important for two reasons. The first is that all of the ordinary differential
operators coming from $\nabla^2$ on separation of variables can be written as $\dfrac{1}{w(x)}$ times a self adjoint
operator and hence are themselves self adjoint in the inner product

$$\langle u, v \rangle = \int_0^1 uvw\, dx$$

The second is that any second order linear differential operator, viz.,

$$Lu = au'' + bu' + cu$$

can be written

$$Lu = \frac{a}{d}\left[\frac{d}{dx}\!\left(d\,\frac{du}{dx}\right) + \frac{c}{a}\,d\,u\right]$$

where

$$d(x) = \exp\!\left(\int_{x_0}^{x} \frac{b}{a}\, dy\right)$$

and hence is self adjoint in a weighted inner product with $w = \dfrac{d}{a}$. Henceforth then we will assume
that L is self adjoint in the plain vanilla inner product and write

$$Lu = \frac{d}{dx}\!\left(p\,\frac{du}{dx}\right) - qu$$

and

$$uLv - (Lu)\,v = \frac{d}{dx}\Big\{p\left(uv' - u'v\right)\Big\}$$

The results we get will serve all of our purposes.
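The factorization through $d(x)$ can be checked symbolically; the coefficients below are illustrative choices, not taken from the text.

```python
# A symbolic check (sketch) of the factorization just stated: with
# d(x) = exp(int b/a dx), the form (a/d)[(d u')' + (c/a) d u] recovers
# a u'' + b u' + c u.  The coefficients a, b, c are illustrative.
import sympy as sp

x = sp.symbols('x')
u = sp.Function('u')(x)
a, b, c = 1 + x**2, 2 * x, sp.exp(x)

d = sp.exp(sp.integrate(b / a, x))        # here d = 1 + x^2
L_selfadjoint = (a / d) * (sp.diff(d * sp.diff(u, x), x) + (c / a) * d * u)
L_original = a * sp.diff(u, x, 2) + b * sp.diff(u, x) + c * u

difference = sp.simplify(L_selfadjoint - L_original)
print(difference)  # 0
```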

19.2 The Wronskian of Two Solutions to Lu = 0

The Wronskian of two functions u and v is denoted by W and is defined by

$$W = uv' - u'v = \det\begin{pmatrix} u & v \\ u' & v' \end{pmatrix}$$

Two solutions u and v of Lu = 0 are linearly dependent if and only if their Wronskian vanishes.

Now we can write the problem Lu = 0 as

$$\frac{d}{dx}\begin{pmatrix} u \\ u' \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ \dfrac{q}{p} & -\dfrac{p'}{p} \end{pmatrix} \begin{pmatrix} u \\ u' \end{pmatrix}$$

and then observe, as we discovered in Lecture 2, that if u and v are any two solutions of this
equation their Wronskian satisfies

$$\frac{dW}{dx} = \operatorname{tr}\begin{pmatrix} 0 & 1 \\ \dfrac{q}{p} & -\dfrac{p'}{p} \end{pmatrix} W = -\frac{p'}{p}\, W$$

This tells us that if u and v satisfy Lu = 0 then their Wronskian, multiplied by p, remains constant,
i.e., pW = const. This is also a simple consequence of the formula
$uLv - (Lu)\,v = \left\{p\left(uv' - u'v\right)\right\}' = (pW)'$. So if p is not zero, then W is either always zero or
never zero. And if pW is not zero and p → 0 as x → x0 then W → ∞ as x → x0 and at least one of
u and v does not remain bounded as x → x0.
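The constancy of pW can be watched numerically; the coefficients p and q below are illustrative choices.

```python
# A numerical illustration (sketch): integrate two solutions of
# L u = (p u')' - q u = 0 with p = 1 + x, q = 1 (illustrative choices)
# and watch p W stay constant along the interval.
import numpy as np
from scipy.integrate import solve_ivp

def p(x):
    return 1.0 + x

def rhs(x, y):
    # y = [u, u', v, v']; since p' = 1 and q = 1, u'' = (u - u') / p here
    u, up, v, vp = y
    return [up, (u - up) / p(x), vp, (v - vp) / p(x)]

xs = np.linspace(0.0, 1.0, 101)
sol = solve_ivp(rhs, (0.0, 1.0), [1.0, 0.0, 0.0, 1.0],
                t_eval=xs, rtol=1e-10, atol=1e-12)
u, up, v, vp = sol.y
pW = p(xs) * (u * vp - up * v)
print(pW[0], pW[-1])  # both ~ p(0) W(0) = 1
```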

19.3 The General Solution to Lu = f

We can now write the general solution, i.e., no end conditions, to Lu = f in terms of the solutions
to Lu = 0. As Coddington and Levinson explain, there are always two independent solutions to
Lu = 0 and every other solution can be expressed as a linear combination of any two independent
solutions. We let u1 and u2 denote two independent solutions of Lu = 0; then Lu1 = 0 = Lu2,
W = u1 u2′ − u1′ u2 does not vanish, pW is a nonzero constant and

$$u_0 = \frac{1}{pW}\left[\,-u_1(x)\int_0^x u_2(y)\,f(y)\,dy + u_2(x)\int_0^x u_1(y)\,f(y)\,dy\,\right]$$

satisfies Lu = f. This can be verified by direct calculation. The general solution of Lu = f is
then

$$u = u_0 + c_1 u_1 + c_2 u_2$$

where c1 and c2 are two constants to be determined. We observe that u0(x = 0) = 0 = u0′(x = 0)
and therefore B0 u0 = 0.
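The direct calculation can be delegated to a computer algebra system; the operator and source below are illustrative.

```python
# A symbolic check (sketch) of the particular-solution formula for the
# illustrative operator L u = u'' + u (p = 1, q = -1), with u1 = sin x,
# u2 = cos x, so that pW = u1 u2' - u1' u2 = -1, and source f(y) = y.
import sympy as sp

x, y = sp.symbols('x y')
u1, u2, pW = sp.sin(x), sp.cos(x), -1
f = y

u0 = (-u1 * sp.integrate(u2.subs(x, y) * f, (y, 0, x))
      + u2 * sp.integrate(u1.subs(x, y) * f, (y, 0, x))) / pW

residual = sp.simplify(sp.diff(u0, x, 2) + u0 - x)   # L u0 - f
print(sp.simplify(u0), residual)  # x - sin(x), 0
```

The same run confirms u0(x = 0) = 0 = u0′(x = 0), as claimed above.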

19.4 Solving the Homogeneous Problem f = 0, g0 = 0, g1 = 0

To solve the homogeneous problem

Lu = 0

and

B0 u = 0, B1 u = 0

we must determine c1 and c2 so that

u = c1 u 1 + c2 u 2

satisfies the boundary conditions. This requires


    
$$\begin{pmatrix} B_0u_1 & B_0u_2 \\ B_1u_1 & B_1u_2 \end{pmatrix} \begin{pmatrix} c_1 \\ c_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$$

and we denote by D the determinant of the matrix of coefficients, viz.,

$$D = \det\begin{pmatrix} B_0u_1 & B_0u_2 \\ B_1u_1 & B_1u_2 \end{pmatrix}$$

If D ≠ 0 then c1 = 0 = c2 is the unique solution to this system of equations and u = 0 is the
unique solution to the homogeneous problem. If D = 0 then this system of equations has exactly
one independent solution because not all B0 u1, B0 u2, B1 u1 and B1 u2 can vanish. So too then the
homogeneous problem. The result is this: if D ≠ 0, the homogeneous problem Lu = 0, B0 u = 0,
B1 u = 0 has only the solution u = 0; if D = 0 the homogeneous problem has one independent
solution.

Because u1 and u2 are independent solutions of Lu = 0, we can see that B0 u1, B0 u2, B1 u1
and B1 u2 cannot all be zero. Observe first that u1 and u2 do not depend on B0 and B1, being
determined solely by L and the condition W = u1 u2′ − u1′ u2 ≠ 0. Then to see that B0 u1 and B0 u2
cannot both be zero, write

$$\begin{pmatrix} B_0u_1 \\ B_0u_2 \end{pmatrix} = \begin{pmatrix} u_1(x=0) & u_1'(x=0) \\ u_2(x=0) & u_2'(x=0) \end{pmatrix} \begin{pmatrix} a_0 \\ b_0 \end{pmatrix}$$

and observe that W(x = 0) is not zero and that a0 and b0 cannot both be zero.

Now there is always a non zero solution to Lu = 0, B0 u = 0 for, if neither B0 u1 nor B0 u2
is zero, a linear combination of u1 and u2 can be found which satisfies B0 u = 0. And there is
not another solution independent of this because, if there were, the Wronskian of the two solutions
would vanish at x = 0.

Hence we always have one non zero solution, and it is the only independent solution, to Lu = 0,
B0 u = 0. Likewise, there is one independent solution to Lu = 0, B1 u = 0. And all this is true no
matter the value of D. Then if D is zero, we have one independent solution of Lu = 0, B0 u = 0,
B1 u = 0. If u1 is this solution then neither B0 u2 nor B1 u2 can be zero.

These results depend on the boundary conditions being unmixed as both cos 2πx and sin 2πx
satisfy

$$\left(\frac{d^2}{dx^2} + 4\pi^2\right) u = 0$$

$$u(0) = u(1)$$

and

$$u'(0) = u'(1)$$

19.5 Solving the Inhomogeneous Problem

To solve the problem Lu = f where B0 u = g0 , B1 u = g1 we need to determine the values of the


constants c1 and c2 in the general solution to Lu = f so that the boundary conditions are satisfied.
Substituting u = u0 + c1 u1 + c2 u2 into the boundary conditions results in two equations in the two
unknowns c1 and c2 . Each solution to these equations produces a solution to our problem and by

solving these equations we get every solution to our problem.

Thus we have

$$B_0u_0 + c_1\, B_0u_1 + c_2\, B_0u_2 = g_0$$

and

$$B_1u_0 + c_1\, B_1u_1 + c_2\, B_1u_2 = g_1$$

and hence

$$\begin{pmatrix} B_0u_1 & B_0u_2 \\ B_1u_1 & B_1u_2 \end{pmatrix} \begin{pmatrix} c_1 \\ c_2 \end{pmatrix} = \begin{pmatrix} g_0 - B_0u_0 \\ g_1 - B_1u_0 \end{pmatrix}$$

where if Bu is any linear combination of u and u′, then

$$B(u_0 + c_1u_1 + c_2u_2) = Bu_0 + c_1\, Bu_1 + c_2\, Bu_2$$

and, as

$$u_0 = \frac{1}{pW}\left[\,-u_1\int_0^x u_2 f\, dy + u_2\int_0^x u_1 f\, dy\,\right]$$

and

$$u_0' = \frac{1}{pW}\left[\,-u_1'\int_0^x u_2 f\, dy + u_2'\int_0^x u_1 f\, dy\,\right]$$

we have

$$Bu_0 = \frac{1}{pW}\left[\,-Bu_1\int_0^x u_2 f\, dy + Bu_2\int_0^x u_1 f\, dy\,\right]$$

Now, if D is not zero, the constants c1 and c2 can be determined uniquely and our problem has
a unique solution; otherwise, to have a solution, a solvability condition must be satisfied and if the

solvability condition is satisfied, the solution is not unique. The solvability condition for u is the
solvability condition for c1 and c2 .

Whether D is zero or not depends only on L, B0 and B1 . It does not depend on how we select
the two independent solutions of Lu = 0 denoted u1 and u2 .

19.6 The Case D ≠ 0

Our first result is this. The problem

Lu = f

and

B0 u = g0 , B1 u = g1

has a solution and it is unique iff the problem

Lu = 0

and

B0 u = 0, B1 u = 0

has only the solution u = 0. This is the case D ≠ 0.

To determine this unique solution we first must find two independent solutions of Lu = 0.
Denoting these u1 and u2 where W = u1 u2′ − u1′ u2, W(u1, u2) ≠ 0 and pW = constant ≠ 0, we
evaluate D. Then we write the solution

$$u = u_0 + c_1u_1 + c_2u_2$$

where

$$u_0 = \frac{1}{pW}\left[\,-u_1\int_0^x u_2 f\, dy + u_2\int_0^x u_1 f\, dy\,\right]$$

and where

$$\begin{pmatrix} c_1 \\ c_2 \end{pmatrix} = \frac{1}{D}\begin{pmatrix} B_1u_2 & -B_0u_2 \\ -B_1u_1 & B_0u_1 \end{pmatrix} \begin{pmatrix} g_0 - B_0u_0 \\ g_1 - B_1u_0 \end{pmatrix}$$

and we observe that

$$B_0u_0 = 0$$

and

$$B_1u_0 = \frac{B_1u_2}{pW}\,\langle u_1, f\rangle - \frac{B_1u_1}{pW}\,\langle u_2, f\rangle$$

The simplest result obtains if u1 and u2 are chosen so that

$$B_0u_1 = 0 = B_1u_2$$

for then

$$c_1 = \frac{g_1}{B_1u_1} + \frac{1}{pW}\,\langle u_2, f\rangle$$

and

$$c_2 = \frac{g_0}{B_0u_2}$$
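The recipe can be carried out end to end on a concrete problem; the operator, source and boundary data below are illustrative choices, not from the text.

```python
# A worked instance (sketch) of the recipe above: L u = u'' (p = 1),
# B0 u = u(0), B1 u = u(1), with u1 = x (so B0 u1 = 0) and u2 = x - 1
# (so B1 u2 = 0); then W = u1 u2' - u1' u2 = 1 and pW = 1.  The source
# f and the data g0, g1 are illustrative.
import sympy as sp

x, y = sp.symbols('x y')
u1, u2, pW = x, x - 1, 1
f = sp.exp(y)
g0, g1 = sp.Rational(1, 2), -2

u0 = (-u1 * sp.integrate(u2.subs(x, y) * f, (y, 0, x))
      + u2 * sp.integrate(u1.subs(x, y) * f, (y, 0, x))) / pW

c1 = g1 / u1.subs(x, 1) + sp.integrate(u2.subs(x, y) * f, (y, 0, 1)) / pW
c2 = g0 / u2.subs(x, 0)              # B0 u2 = u2(0) = -1
u = sp.simplify(u0 + c1 * u1 + c2 * u2)

print(sp.simplify(sp.diff(u, x, 2) - sp.exp(x)), u.subs(x, 0), u.subs(x, 1))
# 0, 1/2, -2: u solves L u = f with B0 u = g0 and B1 u = g1
```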
LECTURE 19. ORDINARY DIFFERENTIAL EQUATIONS 509

19.7 The Case D = 0

Our second result corresponds to the case D = 0 whereupon the problem Lu = f, B0 u = g0,
B1 u = g1, may or may not have a solution. To decide whether or not it does we must find out if a
solvability condition is satisfied. It is to this that we now turn.

The solvability condition for u is the solvability condition for c1 and c2. Now when D = 0, the
rank of

$$\begin{pmatrix} B_0u_1 & B_0u_2 \\ B_1u_1 & B_1u_2 \end{pmatrix}$$

is one and the solvability condition for c1 and c2 is simply the requirement that the rank of

$$\begin{pmatrix} B_0u_1 & B_0u_2 & g_0 - B_0u_0 \\ B_1u_1 & B_1u_2 & g_1 - B_1u_0 \end{pmatrix}$$

be one also. This then is the requirement that

$$\det\begin{pmatrix} B_0u_1 & g_0 - B_0u_0 \\ B_1u_1 & g_1 - B_1u_0 \end{pmatrix} = 0 = \det\begin{pmatrix} B_0u_2 & g_0 - B_0u_0 \\ B_1u_2 & g_1 - B_1u_0 \end{pmatrix}$$

If one of these determinants is zero then so is the other as their first columns are dependent due to
D = 0.

Now when D = 0 the homogeneous problem Lu = 0, B0 u = 0, B1 u = 0 has one independent
solution and the solvability condition takes its simplest form if we take u1 to be a nonvanishing
solution to this problem. Then B0 u1 = 0 = B1 u1, B0 u2 ≠ 0, B1 u2 ≠ 0 and the solvability
condition is

$$\det\begin{pmatrix} B_0u_2 & g_0 - B_0u_0 \\ B_1u_2 & g_1 - B_1u_0 \end{pmatrix} = 0$$

where B0 u0 = 0 and $B_1u_0 = \dfrac{B_1u_2}{pW}\,\langle u_1, f\rangle$. Hence we have

$$B_0u_2\, g_1 - B_1u_2\, g_0 = \frac{B_0u_2\, B_1u_2}{pW}\,\langle u_1, f\rangle$$

or

$$\frac{(pW)_1}{B_1u_2}\, g_1 - \frac{(pW)_0}{B_0u_2}\, g_0 = \langle u_1, f\rangle$$

This, the simplest expression of the solvability condition, does not depend on how u2 is selected
once u1 is set, so long as u1 and u2 are independent. Indeed, because W is the Wronskian of u1
and u2 and B1 u1 = 0, $\dfrac{W_1}{B_1u_2}$ is unchanged if u2 is replaced by a linear combination of u1 and u2.
The same is true of $\dfrac{W_0}{B_0u_2}$.
B0 u2

For homogeneous boundary conditions the solvability condition is

$$\langle u_1, f\rangle = 0$$

We can state this as: The problem

$$Lu = f$$

and

$$B_0u = 0, \qquad B_1u = 0$$

is solvable if and only if f is orthogonal to all solutions to

$$Lu = 0$$

and

$$B_0u = 0, \qquad B_1u = 0$$

If the solvability condition is satisfied, we have a solution, otherwise we do not. Assuming
solvability is satisfied and u1 is a solution of the homogeneous problem we have

$$\begin{pmatrix} 0 & B_0u_2 \\ 0 & B_1u_2 \end{pmatrix} \begin{pmatrix} c_1 \\ c_2 \end{pmatrix} = \begin{pmatrix} g_0 \\ g_1 - B_1u_0 \end{pmatrix}$$

whereupon c1 is arbitrary and

$$c_2 = \frac{g_0}{B_0u_2} = \frac{g_1 - B_1u_0}{B_1u_2}$$

and our solution is

$$u_0 + c_2u_2 + c\,u_1$$

where c is arbitrary. Likewise the solution to the homogeneous problem is

$$c\,u_1$$

As an example we may wish to determine whether or not the problem

$$\left(\frac{d^2}{dx^2} + a^2\right) u = f, \qquad 0 < x < 1$$

and

$$u(x=0) = 0, \qquad u(x=1) = 0$$

has a solution. The differential operator $L = \dfrac{d^2}{dx^2} + a^2$ is self adjoint in the plain vanilla inner
product and the functions $u_1 = \dfrac{1}{a}\sin ax$ and $u_2 = \cos ax$ are two independent solutions of Lu = 0.
As B0 u = u and B1 u = u we find that D is

$$D = \frac{-1}{a}\,\sin a$$

and hence that D is not zero unless $a^2 = \pi^2, 2^2\pi^2, \ldots$. So for each value of $a^2$ other than
$\pi^2, 2^2\pi^2, \ldots$ this problem has a solution for all functions f(x). But if $a^2 = n^2\pi^2$, where n is
a positive integer, the problem has a solution if and only if the function f(x) satisfies

$$\int_0^1 \sin n\pi x\, f(x)\, dx = 0$$
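The orthogonality test is easy to exercise; the two sources below are illustrative.

```python
# A check (sketch) of the example: with a = pi the problem is solvable
# for f = sin(2 pi x), which is orthogonal to sin(pi x), but not for
# f = sin(pi x) itself.
import sympy as sp

x = sp.symbols('x')

c_good = sp.integrate(sp.sin(sp.pi * x) * sp.sin(2 * sp.pi * x), (x, 0, 1))
c_bad = sp.integrate(sp.sin(sp.pi * x) * sp.sin(sp.pi * x), (x, 0, 1))
print(c_good, c_bad)  # 0 and 1/2

# when solvability holds a solution exists; one particular solution is
u = -sp.sin(2 * sp.pi * x) / (3 * sp.pi**2)
residual = sp.simplify(sp.diff(u, x, 2) + sp.pi**2 * u - sp.sin(2 * sp.pi * x))
print(residual, u.subs(x, 0), u.subs(x, 1))  # 0 0 0
```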

19.8 The Green’s Function

We suppose D is not zero so that our problem has a unique solution. We denote the two
independent solutions of Lu = 0 by u1 and u2 where W(u1, u2) ≠ 0. The simplest result obtains
if

$$Lu_1 = 0, \qquad B_0u_1 = 0$$

and

$$Lu_2 = 0, \qquad B_1u_2 = 0$$

for then

$$\begin{pmatrix} c_1 \\ c_2 \end{pmatrix} = \frac{1}{D}\begin{pmatrix} B_1u_2 & -B_0u_2 \\ -B_1u_1 & B_0u_1 \end{pmatrix} \begin{pmatrix} g_0 - B_0u_0 \\ g_1 - B_1u_0 \end{pmatrix}$$

where

$$B_0u_0 = 0$$

and

$$B_1u_0 = \frac{1}{pW}\left[\,-B_1u_1\int_0^1 u_2 f\, dy + B_1u_2\int_0^1 u_1 f\, dy\,\right]$$

and $\begin{pmatrix} c_1 \\ c_2 \end{pmatrix}$ is simply

$$\begin{pmatrix} c_1 \\ c_2 \end{pmatrix} = \frac{1}{D}\begin{pmatrix} 0 & -B_0u_2 \\ -B_1u_1 & 0 \end{pmatrix} \begin{pmatrix} g_0 \\ g_1 + \dfrac{B_1u_1}{pW}\displaystyle\int_0^1 u_2 f\, dy \end{pmatrix}$$

where $D = -B_1u_1\, B_0u_2$.

In the case $g_0 = 0 = g_1$ we have

$$c_1 = \frac{1}{pW}\int_0^1 u_2 f\, dy$$

and

$$c_2 = 0$$

and our solution is then

$$u = u_0 + c_1u_1 = \frac{1}{pW}\, u_1\int_x^1 u_2 f\, dy + \frac{1}{pW}\, u_2\int_0^x u_1 f\, dy$$

which we can write

$$u = \int_0^1 g(x, y)\, f(y)\, dy$$

where g is called the Green’s function for our problem and we have

$$g(x, y) = \begin{cases} \dfrac{1}{pW}\, u_1(y)\, u_2(x), & y < x \\[2ex] \dfrac{1}{pW}\, u_1(x)\, u_2(y), & x < y \end{cases}$$

Now at any y we see that

$$Lg = 0, \quad B_0 g = 0, \qquad x < y$$

and

$$Lg = 0, \quad B_1 g = 0, \qquad y < x$$

where

$$\lim_{x \to y^{+}} g(x, y) - \lim_{x \to y^{-}} g(x, y) = 0$$

and

$$\lim_{x \to y^{+}} \frac{\partial g}{\partial x}(x, y) - \lim_{x \to y^{-}} \frac{\partial g}{\partial x}(x, y) = \frac{1}{p(y)}$$

and where we notice that

$$u_1(y)\, u_2'(y) - u_1'(y)\, u_2(y) = W(y)$$

Hence we can find g by solving the foregoing equations but we see that g is not an ordinary
solution to Lg = 0 (as are u1 and u2) because $\dfrac{\partial^2 g}{\partial x^2}$ does not exist at x = y.

As an example, to solve

$$\frac{d^2u}{dx^2} = f, \qquad 0 < x < 1$$

and

$$u(x=0) = 0 = u(x=1)$$

we can write

$$u(x) = \int_0^1 g(x, y)\, f(y)\, dy$$

where g(x, y) satisfies

$$\frac{d^2g}{dx^2} = 0, \qquad 0 < x < y$$

$$g(x=0, y) = 0$$

$$\frac{d^2g}{dx^2} = 0, \qquad y < x < 1$$

$$g(x=1, y) = 0$$

$$g\left(y^{+}, y\right) - g\left(y^{-}, y\right) = 0$$

and

$$\frac{\partial g}{\partial x}\left(y^{+}, y\right) - \frac{\partial g}{\partial x}\left(y^{-}, y\right) = 1$$

whence

$$g(x, y) = \begin{cases} -x\,(1 - y), & 0 < x < y \\ -y\,(1 - x), & y < x < 1 \end{cases}$$

For fixed y the function g(x, y) is not an ordinary (smooth) solution of Lu = 0, B0 u = 0,
B1 u = 0. The first derivative of g has a jump discontinuity at x = y and so Lg is not defined
there in an ordinary sense. The function g is called a generalized solution of Lu = 0, B0 u = 0,
B1 u = 0. The only ordinary solution to this homogeneous equation when D ≠ 0 is u = 0. There
is more on this in §19.9.

The reader can continue and determine how to use g (x, y) to write the solution if g0 and g1 are
not zero. This will complete the introduction to the Green’s function when D is not zero. When
D is zero it is possible to introduce a generalized Green’s function and it is possible to do this in a
way that produces a best approximation when a solution cannot be determined. We do not go into
this but the main ideas can be found back in Lecture 4.
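The Green's function just derived can be put to work by quadrature; the grid sizes below are illustrative.

```python
# Using the Green's function just derived (a sketch): solve u'' = f with
# u(0) = 0 = u(1) by quadrature, and compare with the exact solution
# u = x (x - 1)/2 for f = 1.
import numpy as np

def g(x, y):
    return np.where(x < y, -x * (1.0 - y), -y * (1.0 - x))

def trapezoid(vals, grid):
    return float(np.sum((vals[1:] + vals[:-1]) * np.diff(grid)) / 2.0)

ys = np.linspace(0.0, 1.0, 2001)
xs = np.linspace(0.0, 1.0, 51)
u = np.array([trapezoid(g(xv, ys) * 1.0, ys) for xv in xs])  # f = 1
err = np.max(np.abs(u - xs * (xs - 1.0) / 2.0))
print(err)  # small quadrature error
```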

As a second example suppose we must find a solution to

$$\frac{1}{x}\frac{d}{dx}\!\left(x\,\frac{du}{dx}\right) = f, \qquad 0 < x < 1$$

where B0 and B1 are not as yet specified. Then we denote $\dfrac{d}{dx}\!\left(x\,\dfrac{d}{dx}\right)$ by L and work in the inner
product: $\langle u, v\rangle = \displaystyle\int_0^1 uv\, dx$.

Doing this we find L∗ = L and uLv − Luv = {xW}′. As long as the boundary conditions
remain unspecified we cannot determine g(x, y) but we can write a general solution to our problem.
Indeed u1 = 1 and u2 = ln x satisfy Lu = 0. And, as $W(u_1, u_2) = \dfrac{1}{x}$, u1 and u2 are independent
and pW = 1. A particular solution to $\dfrac{1}{x}\,Lu = f$ is then

$$u_0 = -\int_0^x (\ln y)\, y f(y)\, dy + \ln x \int_0^x y f(y)\, dy$$

and, so long as f is a bounded function, u0 and u0′ remain finite as x → 0.

The general solution is

$$u = u_0 + c_1 + c_2 \ln x$$

and the requirement that u remain bounded as x → 0 replaces the boundary condition B0 u = g0. It
is satisfied if and only if c2 = 0.

Using this the reader can determine the Green’s function for this problem and show that it
satisfies

$$Lg = 0, \qquad 0 < x < y$$

$$g(x=0, y) \ \text{finite}$$

$$Lg = 0, \qquad y < x < 1$$

$$B_1 g = 0$$

$$g\left(y^{+}, y\right) - g\left(y^{-}, y\right) = 0$$

and

$$\frac{\partial g}{\partial x}\left(y^{+}, y\right) - \frac{\partial g}{\partial x}\left(y^{-}, y\right) = \frac{1}{y}$$

The solution to

$$Lu = xf$$

$$u \ \text{finite at} \ x = 0$$

and

$$B_1 u = 0$$

is then

$$u(y) = \int_0^1 x f(x)\, g(x, y)\, dx$$

Note:

In §19.9 we introduce the delta function. We can use it to define the Green’s function via

$$Lg = \delta(x - y)$$

$$g(x, y) \ \text{finite at} \ x = 0$$

and

$$B_1 g = 0$$

Then we can use

$$\int_0^1 \{u\,Lg - Lu\,g\}\, dx = \big[\,xW\,\big]_0^1$$

to write the solution.

19.9 What is δ (x) Doing?

The short answer for us is nothing. This is, more or less, the first place where the symbol δ(x) has
been used. But we have made great use of the integration by parts formula

$$\int_0^1 f\,\frac{dg}{dx}\, dx = \Big[\,fg\,\Big]_0^1 - \int_0^1 g\,\frac{df}{dx}\, dx$$

and we have not inquired whether this use is justified. To see why there might be a question let f
be smooth and suppose first that g is continuous but that its derivative is not. For instance let g and
$\dfrac{dg}{dx}$ be

[Figure: g(x) = 0 for 0 < x < ξ and g(x) = (x − ξ)/(1 − ξ) for ξ < x < 1, rising to 1 at x = 1; dg/dx jumps from 0 to 1/(1 − ξ) at x = ξ.]

Then the left hand side of our integration by parts formula is

$$\text{LHS} = \int_0^{\xi} f\,\frac{dg}{dx}\, dx + \int_{\xi}^1 f\,\frac{dg}{dx}\, dx = \frac{1}{1-\xi}\int_{\xi}^1 f\, dx$$

whereas the right hand side is

$$\text{RHS} = f(1) - \int_{\xi}^1 \frac{df}{dx}\,\frac{x-\xi}{1-\xi}\, dx = \frac{1}{1-\xi}\int_{\xi}^1 f\, dx$$

As LHS = RHS we see that the discontinuity in $\dfrac{dg}{dx}$ does not require us to give up our integration
by parts formula.

But suppose that g and $\dfrac{dg}{dx}$ are now

[Figure: g(x) = 0 for 0 < x < ξ and g(x) = 1 for ξ < x < 1, a unit jump at x = ξ; dg/dx is zero on either side of x = ξ.]

Then the right hand side is

$$\text{RHS} = f(1) - \int_{\xi}^1 \frac{df}{dx}\, dx = f(\xi)$$

whereas the left hand side is ambiguous. If we suppose it to be zero then our integration by parts
formula is incorrect. To account for the jump in g and to enlarge the class of functions for which
our formula is correct we simply let the right hand side define the left hand side whenever the left
hand side is ambiguous.

The easiest way to do this is to introduce the symbol δ(x) where

$$\int_a^b \delta(x)\, dx = 1, \qquad a < 0 < b$$

$$\int_a^b \delta(x)\, dx = 0, \qquad 0 < a < b \ \ \text{or} \ \ a < b < 0$$

Then

$$\int_a^b f(x)\,\delta(x - x_0)\, dx = \begin{cases} f(x_0), & a < x_0 < b \\ 0, & x_0 < a < b \ \ \text{or} \ \ a < b < x_0 \end{cases}$$

and so if we write $\dfrac{dg}{dx} = \delta(x - \xi)$ in the second example we have

$$\text{LHS} = f(\xi)$$

and our integration by parts formula is restored.
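The sifting property can be mimicked numerically by replacing δ with a narrow spike of unit area; the Gaussian and the test function below are illustrative.

```python
# The sifting property mimicked numerically (a sketch): replace delta by
# a narrow Gaussian of unit area and watch the integral approach f(x0).
import numpy as np

def sift(f, x0, eps, a=-1.0, b=2.0, n=200001):
    x = np.linspace(a, b, n)
    delta_eps = np.exp(-0.5 * ((x - x0) / eps) ** 2) / (eps * np.sqrt(2 * np.pi))
    vals = f(x) * delta_eps
    # composite trapezoid rule
    return float(np.sum((vals[1:] + vals[:-1]) * np.diff(x)) / 2.0)

f = lambda t: np.cos(t) + t ** 2
for eps in (0.1, 0.01, 0.001):
    print(sift(f, 0.5, eps))  # -> cos(0.5) + 0.25 as eps shrinks
```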

What is really useful about this notation is that in terms of it the introduction of Green’s
functions can be formalized and simplified because we can now use our integration by parts formula
to evaluate terms such as

$$\int u\,Lg\, dx$$

where g is a Green’s function. Its derivative takes a jump at a point where Lg is not defined. This
integral then is like that in the second example just above.

To see how this goes let u satisfy

$$Lu = f(x), \qquad 0 < x < 1$$

and

$$u(x=0) = 0, \qquad u(x=1) = 0$$

and suppose that g(x, ξ) can be determined so that

$$Lg = \delta(x - \xi)$$

and

$$g(x=0, \xi) = 0, \qquad g(x=1, \xi) = 0$$

Then as

$$\int_0^1 \big\{g\,Lu - u\,Lg\big\}\, dx = \Big[\,p\,\{gu' - ug'\}\,\Big]_0^1 = 0$$

we have

$$u(\xi) = \int_0^1 g(x, \xi)\, f(x)\, dx$$

19.10 Turning a Differential Equation into an Integral Equation: Autothermal Heat Generation

To take a more interesting example, suppose that heat is released in a thin layer of fluid of width
L bounded on either side by large reservoirs to which heat can be rejected. Where the fluid is
in contact with the reservoirs, the fluid temperature is held at the reservoir temperature, T0. Our
problem is to determine whether or not the heat released can be balanced by heat conduction to
the reservoirs. In the simplest case we get an interesting problem by assuming that the local heat
source is autothermal with the temperature dependence of its rate given by the Arrhenius formula.
Then scaling distance by the width of the fluid layer our problem is to determine u so that

$$\frac{d^2u}{dx^2} + \lambda\, e^{\gamma u/(1+u)} = 0$$

and

$$u(x=0) = 0, \qquad u(x=1) = 0$$

are satisfied where $u = \dfrac{T - T_0}{T_0}$, γ is a scaled activation energy and λ is a multiple of $L^2$.

This is not the kind of problem we have been talking about. The source is not specified in
advance but instead is a function of the solution and a non-linear function at that. Nonetheless we
can determine the Green’s function to be

$$g(x, y) = \begin{cases} -x\,(1 - y), & 0 < x < y \\ -y\,(1 - x), & y < x < 1 \end{cases}$$

as above and write

$$u(x) = \int_0^1 g(x, y)\left[-\lambda\, e^{\gamma u(y)/(1+u(y))}\right] dy$$

Now we have not solved our problem, we have simply put it in another form. But this form is
useful as −g(x, y) is a non-negative function and using this integral equation it is easy to construct
both bounds on and approximations to u(x).
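One such approximation is successive substitution on the integral equation; the parameter values below are illustrative (for large enough λ no steady solution exists and the iteration would run away).

```python
# Successive substitution on the integral equation (a sketch): for the
# illustrative values lambda = 0.2, gamma = 5 the map
# u -> int_0^1 g(x, y) [-lambda exp(gamma u/(1+u))] dy is contractive
# and the iterates settle down to an approximate solution.
import numpy as np

lam, gam = 0.2, 5.0
n = 201
x = np.linspace(0.0, 1.0, n)
X, Y = np.meshgrid(x, x, indexing='ij')
G = np.where(X < Y, -X * (1.0 - Y), -Y * (1.0 - X))   # Green's function
w = np.full(n, x[1]); w[0] = w[-1] = x[1] / 2.0       # trapezoid weights

u = np.zeros(n)
for _ in range(200):
    u_new = G @ (w * (-lam * np.exp(gam * u / (1.0 + u))))
    step = np.max(np.abs(u_new - u))
    u = u_new
    if step < 1e-12:
        break

print(u.max(), u[0], u[-1])  # a small positive bump, zero at the walls
```

Because −g ≥ 0 and the source is negative, every iterate is non-negative, which is the monotonicity that makes bounds easy to construct.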

19.11 The Eigenvalue Problem

Writing L, B0 and B1 as

$$Lu = \frac{d}{dx}\!\left(p\,\frac{du}{dx}\right) - qu$$

$$B_0u = a_0u + b_0u' \quad\text{at}\quad x = 0$$

and

$$B_1u = a_1u + b_1u' \quad\text{at}\quad x = 1$$

we have

$$uLv = \frac{d}{dx}\!\left(u\,p\,\frac{dv}{dx}\right) - p\,\frac{du}{dx}\frac{dv}{dx} - quv$$

and

$$uLv - Luv = \frac{d}{dx}\Big\{p\left(uv' - u'v\right)\Big\} = \frac{d}{dx}\{pW\}$$

Now, assuming a0 and b0 are not both zero and a1 and b1 are not both zero, B0 u = 0 = B0 v and
B1 u = 0 = B1 v imply

$$W_0 = \{uv' - u'v\}(x=0) = 0$$

and

$$W_1 = \{uv' - u'v\}(x=1) = 0$$

Hence if u and v satisfy B0 u, B1 u, B0 v, B1 v all zero we have

$$\int_0^1 \{uLv - Luv\}\, dx = \Big[\,pW\,\Big]_0^1 = 0$$

If, say, p(x = 0) = 0 then instead of B0 u = 0 = B0 v, we only need u and v to remain bounded
as x → 0 to obtain the above formula.

Our integration by parts formulas are now

$$\int_0^1 uLv\, dx = \Big[\,puv'\,\Big]_0^1 - \int_0^1 \{pu'v' + quv\}\, dx$$

and

$$\int_0^1 \{uLv - Luv\}\, dx = \Big[\,p\left(uv' - u'v\right)\Big]_0^1$$

Our eigenvalue problem is

$$L\psi + \lambda^2\psi = 0, \qquad 0 < x < 1$$

and

$$B_0\psi = 0 = B_1\psi$$

where the eigenfunctions are the non zero solutions, ψ, and the corresponding values of $\lambda^2$ are the
eigenvalues.

For any fixed value of $\lambda^2$, the equation

$$Lu + \lambda^2 u = 0$$

has two independent solutions denoted $u_1(\lambda^2)$ and $u_2(\lambda^2)$, where $W\!\left(u_1(\lambda^2),\, u_2(\lambda^2)\right)$ does not
vanish. The general solution of $(L + \lambda^2)\,\psi = 0$ is then

$$\psi = c_1 u_1\!\left(\lambda^2\right) + c_2 u_2\!\left(\lambda^2\right)$$

Each value of $\lambda^2$ such that $D(\lambda^2) = 0$ where

$$D\!\left(\lambda^2\right) = B_0u_1\!\left(\lambda^2\right) B_1u_2\!\left(\lambda^2\right) - B_1u_1\!\left(\lambda^2\right) B_0u_2\!\left(\lambda^2\right)$$

is an eigenvalue for then c1 and c2, not both zero, can be determined so that

$$\begin{pmatrix} B_0u_1(\lambda^2) & B_0u_2(\lambda^2) \\ B_1u_1(\lambda^2) & B_1u_2(\lambda^2) \end{pmatrix} \begin{pmatrix} c_1 \\ c_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$$

and hence a solution to the eigenvalue problem other than ψ = 0 can be obtained. To each eigenvalue
there will be one independent eigenfunction as the rank of $\begin{pmatrix} B_0u_1(\lambda^2) & B_0u_2(\lambda^2) \\ B_1u_1(\lambda^2) & B_1u_2(\lambda^2) \end{pmatrix}$ cannot
be zero. The eigenvalues will be isolated as D will ordinarily be an analytic function of $\lambda^2$ and

hence its zeros will be isolated. And D will ordinarily have infinitely many zeros.
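Finding the zeros of D is a practical way to compute eigenvalues; the Robin boundary condition below is an illustrative choice, not from the text.

```python
# A numerical sketch: take L = d^2/dx^2 with B0 psi = psi(0) and the
# illustrative Robin condition B1 psi = psi(1) + psi'(1).  With
# u1 = sin(lambda x)/lambda and u2 = cos(lambda x),
# D = -(sin(lambda)/lambda + cos(lambda)), and the zeros of D, the
# eigenvalues, satisfy tan(lambda) = -lambda.
import numpy as np
from scipy.optimize import brentq

def D(lam):
    return -(np.sin(lam) / lam + np.cos(lam))

# one zero per interval ((k - 1/2) pi, (k + 1/2) pi), k = 1, 2, ...
roots = [brentq(D, (k - 0.5) * np.pi + 1e-9, (k + 0.5) * np.pi - 1e-9)
         for k in range(1, 5)]
print(roots)  # the first is near 2.0288
```

The roots are isolated and march off to infinity, one per interval, just as the text predicts for D in general.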

Assuming p and q to be real valued functions and a0, b0, a1 and b1 to be real constants we
see that if ψ is an eigenfunction corresponding to the eigenvalue $\lambda^2$ then so also is $\overline{\psi}$, corresponding
to $\overline{\lambda^2}$. Putting $u = \psi$, $v = \overline{\psi}$ in our second integration by parts formula and observing that
$W_0(\psi, \overline{\psi}) = 0 = W_1(\psi, \overline{\psi})$, we see that $\lambda^2 = \overline{\lambda^2}$, or that $\lambda^2$ is real. Then putting $u = \psi$, $v = \overline{\psi}$ in
our first formula we get

$$-\lambda^2 \int_0^1 |\psi|^2\, dx = \Big[\, p\,\overline{\psi}\,\psi' \,\Big]_0^1 - \int_0^1 \left\{ p\left|\frac{d\psi}{dx}\right|^2 + q\,|\psi|^2 \right\} dx$$

Now as $B_0\psi = 0 = B_0\overline{\psi}$ we find that

$$p\,\overline{\psi}\,\psi'\,(x=0) = \begin{cases} 0, & b_0 = 0 \\[1ex] -\dfrac{a_0}{b_0}\, p\,|\psi|^2\,(x=0), & \text{otherwise} \end{cases}$$

and likewise

$$p\,\overline{\psi}\,\psi'\,(x=1) = \begin{cases} 0, & b_1 = 0 \\[1ex] -\dfrac{a_1}{b_1}\, p\,|\psi|^2\,(x=1), & \text{otherwise} \end{cases}$$

and hence if $p \ge 0$, $q \ge 0$, $a_1/b_1 \ge 0$ and $a_0/b_0 \le 0$ it is certain that $\lambda^2 \ge 0$.

The second formula then shows that if the eigenfunctions $\psi_1$ and $\psi_2$ correspond to distinct
eigenvalues, $\lambda_1^2$ and $\lambda_2^2$, we have

$$\langle \psi_1, \psi_2 \rangle = \int_0^1 \psi_1 \psi_2\, dx = 0$$

This is the orthogonality condition.

Note: The eigenvalues are real and to each there is one independent eigenfunction. So we take
the eigenfunction to be a real valued function. Under other boundary conditions it may be of
some advantage to introduce complex valued eigenfunctions. Then if ψ is an eigenfunction
so also are Re ψ, Im ψ and $\overline{\psi}$.

The solution of $\nabla^2\psi + \lambda^2\psi = 0$ by separation of variables leads to eigenvalue problems of the
form

$$\frac{1}{w}\, L\psi + \lambda^2\psi = 0$$

where L is self-adjoint in the inner product

$$\langle u, v \rangle = \int_0^1 uv\, dx$$

In using the integration by parts formulas to again determine that the eigenvalues are real and not
negative, we now put $L\psi = -\lambda^2 w\psi$ instead of $L\psi = -\lambda^2\psi$ as above. Then in determining the
sign of $\lambda^2$ we use

$$-\lambda^2 \int_0^1 w\,|\psi|^2\, dx = \Big[\, p\,\overline{\psi}\,\psi' \,\Big]_0^1 - \int_0^1 \left\{ p\left|\frac{d\psi}{dx}\right|^2 + q\,|\psi|^2 \right\} dx$$

and $\lambda^2$ remains non-negative if $w > 0$. The orthogonality condition is

$$\int_0^1 \psi_1 \psi_2\, w\, dx = 0$$

which is not unexpected as $\dfrac{1}{w}\,L$ is self-adjoint in the inner product

$$\langle u, v \rangle = \int_0^1 uvw\, dx$$

19.12 Solvability Conditions

The simplest way to decide whether or not a problem has a solution is to determine if all of the
steps required to write the solution can be carried out.

To see how this goes suppose the eigenvalue problem

$$L\psi + \lambda^2\psi = 0, \qquad 0 < x < 1$$

and

$$B_0\psi = 0, \qquad B_1\psi = 0$$

determines a complete orthogonal set of eigenfunctions and denote them $\psi_1, \psi_2, \ldots$ corresponding
to the eigenvalues $0 \le \lambda_1^2 < \lambda_2^2 < \cdots$. Then to determine a function u satisfying

$$Lu = f, \qquad 0 < x < 1$$

and

$$B_0u = g_0, \qquad B_1u = g_1$$

we write

$$u = \sum c_i \psi_i$$

and try to determine the coefficients $c_1, c_2, \cdots$ in this expansion. To find the equation satisfied by
the coefficient $c_i = \langle \psi_i, u \rangle$, we multiply Lu = f by $\psi_i$ and integrate over 0 ≤ x ≤ 1, getting

$$\int_0^1 \psi_i\, Lu\, dx = \int_0^1 \psi_i\, f\, dx$$

Then as

$$\int_0^1 \psi_i\, Lu\, dx = \Big[\, pW \,\Big]_0^1 + \int_0^1 (L\psi_i)\, u\, dx$$

where W is the Wronskian of $\psi_i$ and u, we have

$$\langle \psi_i, f \rangle = \Big[\, p\,\{\psi_i u' - \psi_i' u\} \,\Big]_0^1 - \lambda_i^2\, c_i$$

This is the result we need. It tells us this: in the expansion of u the coefficient of an eigenfunction
corresponding to a non vanishing eigenvalue has one, and only one, value. But the coefficient
of an eigenfunction corresponding to the eigenvalue zero either cannot be determined, whence a
solution cannot be found, or is not unique, whence a solution can be found but it is not unique. The
question of solvability then comes up only if zero is an eigenvalue. In the first instance

$$\langle \psi_1, f \rangle \ne \Big[\, p\,\{\psi_1 u' - \psi_1' u\} \,\Big]_0^1$$

whereas in the second

$$\langle \psi_1, f \rangle = \Big[\, p\,\{\psi_1 u' - \psi_1' u\} \,\Big]_0^1$$

This is the solvability condition, but it is not in terms of B0, B1, g0 and g1.

To write it in terms of B0 and B1 we first use

$$\begin{pmatrix} 0 \\ g_0 \end{pmatrix} = \begin{pmatrix} B_0\psi_1 \\ B_0u \end{pmatrix} = \begin{pmatrix} \psi_1(x=0) & \psi_1'(x=0) \\ u(x=0) & u'(x=0) \end{pmatrix} \begin{pmatrix} a_0 \\ b_0 \end{pmatrix}$$

to write a0 and b0 in terms of g0 and $W_0 = W(x=0)$. Doing this we get

$$a_0 = \frac{-g_0\,\psi_1'(x=0)}{W_0}$$

and

$$b_0 = \frac{g_0\,\psi_1(x=0)}{W_0}$$

Likewise we get

$$a_1 = \frac{-g_1\,\psi_1'(x=1)}{W_1}$$

and

$$b_1 = \frac{g_1\,\psi_1(x=1)}{W_1}$$

Then if a0 and a1 are not zero we can write the solvability condition as

$$\langle \psi_1, f \rangle = -p(x=1)\,\psi_1'(x=1)\,\frac{g_1}{a_1} + p(x=0)\,\psi_1'(x=0)\,\frac{g_0}{a_0}$$
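The zero-eigenvalue bookkeeping can be tried on a concrete case; the Neumann problem and the source below are illustrative choices, not from the text.

```python
# A concrete instance (sketch): L u = u'' with Neumann conditions
# u'(0) = 0 = u'(1) and g0 = 0 = g1.  Here psi_1 = 1 has eigenvalue
# zero, so solvability requires <1, f> = 0; the remaining eigenfunctions
# are cos(i pi x) with lambda_i^2 = i^2 pi^2 and <psi_i, psi_i> = 1/2,
# so the coefficient of cos(i pi x) in u is -<psi_i, f>/(lambda_i^2 / 2).
import numpy as np

x = np.linspace(0.0, 1.0, 4001)
w = np.full(x.size, x[1]); w[0] = w[-1] = x[1] / 2.0  # trapezoid weights

f = np.cos(2 * np.pi * x) + 3 * np.cos(5 * np.pi * x)  # mean zero: solvable
mean_f = float((f * w).sum())
print(mean_f)  # ~ 0, the solvability condition <1, f> = 0

u = np.zeros_like(x)
for i in range(1, 40):
    psi = np.cos(i * np.pi * x)
    u += -(psi * f * w).sum() / (0.5 * (i * np.pi) ** 2) * psi

# u'' should reproduce f in the interior
d2u = (u[2:] - 2.0 * u[1:-1] + u[:-2]) / x[1] ** 2
err = float(np.max(np.abs(d2u - f[1:-1])))
print(err)  # small
```

Any multiple of the zero-eigenvalue eigenfunction, a constant here, could be added to u, which is the non-uniqueness the text describes.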

The reader may wish to write our problem

$$\begin{pmatrix} L & 0 & 0 \\ 0 & B_0 & 0 \\ 0 & 0 & B_1 \end{pmatrix} \begin{pmatrix} u(x) \\ u(0) \\ u(1) \end{pmatrix} = \begin{pmatrix} f \\ g_0 \\ g_1 \end{pmatrix}$$

and look for an inner product in which

$$\begin{pmatrix} L & 0 & 0 \\ 0 & B_0 & 0 \\ 0 & 0 & B_1 \end{pmatrix}$$

is self adjoint. The solvability condition is then the requirement that

$$\begin{pmatrix} f \\ g_0 \\ g_1 \end{pmatrix}$$

be perpendicular to every vector in

$$\operatorname{Ker}\begin{pmatrix} L & 0 & 0 \\ 0 & B_0 & 0 \\ 0 & 0 & B_1 \end{pmatrix}$$

19.13 Home Problems

1. A reaction A + B → AB takes place in a layer of solvent separating reservoirs of A and B,
one on the left and the other on the right. Neither reservoir is permeable to the other reactant.
Our model is

$$\frac{d^2c_A}{dx^2} - k\, c_A c_B = 0$$

$$\frac{d^2c_B}{dx^2} - k\, c_A c_B = 0$$

where

$$c_A(x=0) = c_A^{\star}, \quad \frac{dc_A}{dx}(x=1) = 0, \quad \frac{dc_B}{dx}(x=0) = 0, \quad c_B(x=1) = c_B^{\star}$$

The constant k is assumed to be small.

To estimate the rate of production of AB write

$$c_A = c_{A0} + k\, c_{A1} + k^2 c_{A2} + \cdots$$

and

$$c_B = c_{B0} + k\, c_{B1} + k^2 c_{B2} + \cdots$$

where $c_{A0} = c_A^{\star}$, $c_{B0} = c_B^{\star}$ and $c_{A1}, c_{A2}, \ldots, c_{B1}, c_{B2}, \ldots$ satisfy

$$\frac{d^2c_{A1}}{dx^2} - c_{A0}\, c_{B0} = 0$$

$$\frac{d^2c_{B1}}{dx^2} - c_{A0}\, c_{B0} = 0$$

where

$$c_{A1}(x=0) = 0, \quad \frac{dc_{A1}}{dx}(x=1) = 0, \quad \frac{dc_{B1}}{dx}(x=0) = 0, \quad c_{B1}(x=1) = 0$$

and

$$\frac{d^2c_{A2}}{dx^2} - c_{A0}\, c_{B1} - c_{A1}\, c_{B0} = 0$$

where

$$c_{A2}(x=0) = 0, \quad \frac{dc_{A2}}{dx}(x=1) = 0$$

etc.

Derive the Green’s function for the $c_{A1}, c_{A2}$, etc. problems and for the $c_{B1}, c_{B2}$, etc.
problems. Use these Green’s functions to find the first few terms in the two expansions.

2. Write the solution to the problem

d2 u
0= + λ2 (1 + u)
dx2

where

u (x = −1) = 0 = u (x = +1)

by adding to the particular solution u0 = −1 the general solution to the homogeneous equa-
tion

A cos λx + B sin λx

and then using the boundary conditions to find A and B.


π π
Sketch the solution for a value of λ lying between 0 and , then for λ lying between
2 2
3π 3π 5π π 3π
and , then and , etc. The problem does not have a solution when λ = , , . . ..
2 2 2 2 2
To see why this is so look at the solvability condition for this problem. The correspond-
ing homogeneous problem

$$0 = \frac{d^2 u}{dx^2} + \lambda^2 u$$

where

u (x = −1) = 0 = u (x = +1)

has solutions that can be written

A cos λx + B sin λx

where A and B satisfy


    
$$\begin{pmatrix} \cos\lambda & -\sin\lambda \\ \cos\lambda & \sin\lambda \end{pmatrix}\begin{pmatrix} A \\ B \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$$

The determinant of the matrix on the LHS is

$$2\cos\lambda\,\sin\lambda$$

For all values of λ such that cos λ sin λ ≠ 0 the constants A and B are both zero and
u = 0 is the only solution to the homogeneous problem. Our problem then has a unique
solution.
For all values of λ such that cos λ = 0, i.e., λ = π/2, 3π/2, . . ., the constant B must be zero but A is indeterminate and the homogeneous problem has solutions A cos λx. Our problem then requires that a solvability condition be satisfied. Show that it is not satisfied.

For all values of λ such that sin λ = 0, i.e., λ = 0, π, . . ., the constant A must be zero but
B is indeterminate and the homogeneous problem has solutions B sin λx. When λ = 0 this
is 0 but otherwise our problem again requires that a solvability condition be satisfied. Show
that it is satisfied.
So when λ = π/2, 3π/2, . . . our problem has no solution; when λ = π, . . . it has a solution but it is not unique.
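The two solvability integrals can be checked symbolically; a minimal sketch assuming sympy, with the usual integral inner product on −1 ≤ x ≤ 1 and the constant forcing coming from the λ²·1 term:

```python
import sympy as sp

x = sp.symbols('x')

# λ = π/2: homogeneous solution cos(λx); the constant forcing must be
# orthogonal to it on [-1, 1] for a solution to exist
print(sp.integrate(sp.cos(sp.pi * x / 2), (x, -1, 1)))   # 4/pi, nonzero: fails
# λ = π: homogeneous solution sin(λx)
print(sp.integrate(sp.sin(sp.pi * x), (x, -1, 1)))       # 0: satisfied
```

The nonzero first integral is why no solution exists at λ = π/2, while the vanishing second integral is why a (nonunique) solution exists at λ = π.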

To see what is going on solve the problem

$$\frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial x^2} + \lambda^2 (1 + u)$$

where

u (x = −1) = 0 = u (x = +1)

and where u(t = 0) > 0 is assigned. Do this when λ = π/2 and λ = π.

3. You have seen how the Green’s function can be used to solve

Lu = f, B0 u = 0, B1 u = 0

use it to solve

Lu = 0, B0 u = g0 , B1 u = 0

4. We have


$$Lu = (p u')' - qu$$

and therefore

$$\frac{1}{p}Lu = u'' + \frac{p'}{p}u' - \frac{q}{p}u$$

Show that (1/p)L is self adjoint in the inner product

$$\langle u, v \rangle = \int_0^1 u\,v\,p\,dx$$

5. We present three frictional heating problems for your enjoyment, all in cylindrical coordi-
nates. In each case the temperature depends only on r as does the only non zero velocity
component. The viscosity depends on temperature via

$$\frac{\mu(T_{wall})}{\mu(T)} = 1 + \beta\,(T - T_{wall})$$

And in each case a stress component is specified. The problems are written in terms of a
scaled temperature

β (T − Twall)

First: Sliding rod, vr = 0 = vθ , vz = vz (r)

[Figure: a rod of radius R0 sliding along the z-axis inside a stationary cylinder of radius R1.]

$$T(r = R_1) = T_{wall}, \quad \frac{dT}{dr}(r = R_0) = 0, \quad T_{rz}(r = R_0) \ \text{input}$$

In terms of the scaled temperature our problem is

$$\frac{d^2 T}{dr^2} + \frac{1}{r}\frac{dT}{dr} + \lambda^2\,\frac{1}{r^2}\,(1 + T) = 0$$

$$\lambda^2 = \frac{\beta}{k\,\mu_{wall}}\,T_{rz}^2(r = R_0)\,R_0^2$$

Second: Poiseuille flow, vr = 0 = vθ , vz = vz (r)

[Figure: a pipe of radius R0 with flow along the z-axis.]

$$T(r = R_0) = T_{wall}, \quad T(r = 0) \ \text{bounded}, \quad \frac{dp}{dz} \ \text{input}$$

Our problem is

$$\frac{d^2 T}{dr^2} + \frac{1}{r}\frac{dT}{dr} + \lambda^2 r^2\,(1 + T) = 0$$

$$\lambda^2 = \frac{\beta}{k\,\mu_{wall}}\,\frac{1}{4}\left(\frac{dp}{dz}\right)^2 R_0^4$$

Third: Spinning rod, vr = 0 = vz , vθ = vθ (r)



[Figure: a rod of radius R0 spinning inside a stationary cylinder of radius R1.]

$$T(r = R_1) = T_{wall}, \quad \frac{dT}{dr}(r = R_0) = 0, \quad T_{r\theta}(r = R_0) \ \text{input}$$

Our problem is

$$\frac{d^2 T}{dr^2} + \frac{1}{r}\frac{dT}{dr} + \lambda^2\,\frac{1}{r^4}\,(1 + T) = 0$$

$$\lambda^2 = \frac{\beta}{k\,\mu_{wall}}\,T_{r\theta}^2(r = R_0)\,R_0^4$$

Your job is to solve these problems for increasing values of λ2 and to find the value of
λ2 at which T becomes unbounded.

In the second and third problems, ψ, where

$$\frac{d^2 \psi}{dr^2} + \frac{1}{r}\frac{d\psi}{dr} + \lambda^2 r^2 \psi = 0$$

or

$$\frac{d^2 \psi}{dr^2} + \frac{1}{r}\frac{d\psi}{dr} + \lambda^2\,\frac{1}{r^4}\,\psi = 0$$

can be found by assuming

ψ = J0 (µr p )

where in the second problem p = 2 and in the third p = −1 and where µ is to be found.

This is not true of the first problem which, however, is a Bernoulli equation.
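The assumed form for the second problem can be checked numerically; a sketch assuming scipy, where the relation between μ and λ used below is the one the substitution forces (the reader is asked to derive it):

```python
import numpy as np
from scipy.special import jv

lam = 1.7                # arbitrary test value of λ
mu = lam / 2             # the value of μ forced by the substitution ψ = J0(μ r²)

def psi(r):
    return jv(0, mu * r**2)

# check ψ'' + ψ'/r + λ² r² ψ = 0 by centered differences
h = 1e-4
for r in (0.3, 0.7, 1.1):
    d1 = (psi(r + h) - psi(r - h)) / (2 * h)
    d2 = (psi(r + h) - 2 * psi(r) + psi(r - h)) / h**2
    print(d2 + d1 / r + lam**2 * r**2 * psi(r))   # ≈ 0 at each r
```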

6. This is a linear heating problem in Cartesian coordinates. Again we have

$$\frac{\mu_{wall}}{\mu(T)} = 1 + \beta\,(T - T_{wall})$$

The fluid lying between two fixed plane walls at x = L and x = −L⋆ is sheared by a plane
wall at x = 0 moving to the right.
The stresses Txz and T⋆xz are constants. But they are not independent. Our problem, in scaled temperature, is

$$\frac{d^2 T}{dx^2} + \lambda^2 (1 + T) = 0, \qquad T(x = L) = 0$$

$$\frac{d^2 T_\star}{dx^2} + \lambda_\star^2 \left(1 + T_\star\right) = 0, \qquad T_\star(x = -L_\star) = 0$$
where

$$\lambda^2 = \frac{\beta}{k\,\mu_{wall}}\,T_{xz}^2, \qquad \lambda_\star^2 = \frac{\beta}{k\,\mu_{wall}}\,T_{\star xz}^2$$

and where

T (x = 0) = T⋆ (x = 0)

and

$$\frac{dT}{dx}(x = 0) = \frac{dT_\star}{dx}(x = 0)$$

Because the speed of the moving wall is common, Txz and T⋆xz are not independent.
Show that Txz and T⋆xz have opposite signs and

$$-T_{xz}\int_0^L (1 + T)\,dx = T_{\star xz}\int_{-L_\star}^0 \left(1 + T_\star\right)dx$$

Assuming Txz to be the input variable, determine the value of λ² at which the temperatures become infinite.
First do the T problem assuming T(x = 0) = 0 and then dT/dx(x = 0) = 0, before solving the T, T⋆ problem.
Lecture 20

Eigenvalues and Eigenfunctions of ∇2 in


Cartesian, Cylindrical and Spherical
Coordinate Systems

20.1 Cartesian Coordinates: Sines, Cosines and Airy Functions

We have seen in some of our earlier examples that the information we can get out of the eigenvalues
and the eigenfunctions is interesting in itself whether or not we plan to use them in an infinite series.

Of the nine one-dimensional problems arising upon separating variables in our three simple coordinate systems only problems (6), (8) and (9) might be unfamiliar; all the others are of the form

$$\frac{d^2 X}{dx^2} + \alpha^2 X = 0$$
dx

and have a solution

X = A sin αx + B cos αx

LECTURE 20. EIGENVALUES AND EIGENFUNCTIONS OF ∇2 540

where what we do next depends on the boundary conditions that must be satisfied.

The solutions of problems (6), (8) and (9) can be obtained by a method due to Fuchs and
Frobenius which we will explain when we come to problem (8).

But first we can present another example of the importance of the eigenvalues themselves by
looking at the simplest quantum mechanical problem.

The rules for writing Schrödinger’s equation and the rules for interpreting its solutions are the
postulates of quantum mechanics. They cannot be proved; they can only be shown to lead to
conclusions that either agree or do not agree with experimental results. We obtain Schrödinger’s
equation in Cartesian coordinates, for a system of particles, if in the classical formula

H =T +V =E

where

$$T = \frac{1}{2m_1}\left\{p_{x_1}^2 + p_{y_1}^2 + p_{z_1}^2\right\} + \frac{1}{2m_2}\left\{p_{x_2}^2 + p_{y_2}^2 + p_{z_2}^2\right\} + \cdots$$

we replace $p_{x_1}, p_{y_1}, \ldots$ by $\dfrac{h}{2\pi i}\dfrac{\partial}{\partial x_1}, \dfrac{h}{2\pi i}\dfrac{\partial}{\partial y_1}, \cdots$ and E by $-\dfrac{h}{2\pi i}\dfrac{\partial}{\partial t}$, and then introduce a
function Ψ for these differential operators to act on. It is easy to see how ∇2 turns up. Indeed
∇2 turns up for each particle and if we have N particles of the same mass we turn up ∇2 in a 3N
dimensional space.

We can restrict a particle having only kinetic energy to a box 0 < x < a, 0 < y < a,
0 < z < a by setting V = 0 inside the box and V = ∞ outside the box. Then Schrödinger’s
equation for this particle is
$$\frac{1}{2m}\left(\frac{h}{2\pi i}\right)^2 \nabla^2 \Psi = -\frac{h}{2\pi i}\frac{\partial \Psi}{\partial t}$$

where Ψ = 0 on the boundary of the box. The corresponding eigenvalue problem is

$$\frac{1}{2m}\left(\frac{h}{2\pi i}\right)^2 \nabla^2 \psi = E\psi$$

where ψ = 0 on the boundary of the box. It determines the allowable values of the energy of the
particle in its stationary states. These values are important for they are the possible outcomes of a
measurement of the energy of the particle. We introduce

$$\lambda^2 = \frac{8\pi^2 m}{h^2}\,E$$

and we write our problem

∇2 ψ + λ 2 ψ = 0

where ψ vanishes at x = 0, x = a, y = 0, y = a, z = 0 and z = a and we solve it by separation of


variables. Writing ψ = X (x) Y (y) Z (z) we find

$$\frac{d^2 X}{dx^2} + k_x^2 X = 0, \qquad X(x=0) = 0 = X(x=a)$$

$$\frac{d^2 Y}{dy^2} + k_y^2 Y = 0, \qquad Y(y=0) = 0 = Y(y=a)$$

and

$$\frac{d^2 Z}{dz^2} + k_z^2 Z = 0, \qquad Z(z=0) = 0 = Z(z=a)$$

where λ² = kx² + ky² + kz², where

$$k_x = n_x\frac{\pi}{a}, \qquad k_y = n_y\frac{\pi}{a}, \qquad k_z = n_z\frac{\pi}{a}$$

and where nx, ny and nz = 1, 2, . . .. Hence we have

$$\lambda^2 = \frac{\pi^2}{a^2}\left\{n_x^2 + n_y^2 + n_z^2\right\}$$

and

$$E = \frac{h^2}{8\pi^2 m}\,\lambda^2 = \frac{h^2}{8a^2 m}\left\{n_x^2 + n_y^2 + n_z^2\right\}$$


The points (nx, ny, nz), where nx, ny and nz = 1, 2, . . ., lie at the nodes of a cubic lattice in the positive octant of quantum number space. To each point of the lattice there corresponds an eigenfunction, i.e., a quantum mechanical state. The energy of the state is p²/2m, where

$$\vec p = \frac{h}{2a}\left(n_x \vec i_x + n_y \vec j_y + n_z \vec k_z\right)$$

and this is h²/8ma² times the squared distance of the lattice point from the origin.
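The lattice picture makes the level spacings and degeneracies easy to tabulate; a minimal Python sketch, counting energies in units of h²/8a²m:

```python
from collections import Counter
from itertools import product

# E = (h²/8a²m)(nx² + ny² + nz²); count how many lattice points share each E
levels = Counter(nx*nx + ny*ny + nz*nz
                 for nx, ny, nz in product(range(1, 7), repeat=3))

for n2 in sorted(levels)[:6]:
    print(n2, levels[n2])   # energy (in the units above) and its degeneracy
# 3 1, 6 3, 9 3, 11 3, 12 1, 14 6
```

The degeneracies simply count the distinct orderings of (nx, ny, nz) giving the same squared distance from the origin.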

We can make this example a little more interesting by assuming that the particle inside the box
is in a uniform gravitational field, $\vec g = -g\vec k$. Then we have V = mgz and our problem is

$$\frac{1}{2m}\left(\frac{h}{2\pi i}\right)^2 \nabla^2 \psi + mgz\,\psi = E\psi$$

or


$$\nabla^2 \psi + \left(\lambda^2 - \frac{2m^2 g}{\hbar^2}\,z\right)\psi = 0$$

where $\lambda^2 = \dfrac{2mE}{\hbar^2}$, $\hbar = \dfrac{h}{2\pi}$ and $\dfrac{2m^2 g}{\hbar^2} = \dfrac{1}{L^3}$. Separating variables leads to the same X and
Y problems as before but the Z problem is now
 
$$\frac{d^2 Z}{dz^2} + \left(k_z^2 - \frac{2m^2 g}{\hbar^2}\,z\right)Z = 0$$

where Z (z = 0) = 0 = Z (z = a). Again λ2 = kx2 + ky2 + kz2 but now the values of kz2 are new.

We can write the solution to our new Z problem in terms of Airy functions, cf., Bender and Orszag, “Advanced Mathematical Methods for Scientists and Engineers.”

In terms of the Fuchs, Frobenius classification, the point x = 0 is an ordinary point of the

homogeneous linear equations

$$\frac{d^2 y}{dx^2} + y = 0$$

and

$$\frac{d^2 y}{dx^2} - xy = 0$$

The second equation is called Airy’s equation and two of its independent solutions are denoted
Ai (x) and Bi (x) where these are the names of Taylor series about x = 0 having infinite radii of
convergence.

[Figure: Ai(x) and Bi(x) plotted for −15 ≤ x ≤ 5.]

A sketch of Ai (x) and Bi (x) is presented above and we see that the Airy functions look like
trigonometric functions to the left, exponential functions to the right.
Introducing a characteristic length, denoted β, where $\beta^3\,\dfrac{2m^2 g}{\hbar^2} = 1$, and replacing z by βz we have
have

$$\frac{d^2 Z}{dz^2} + \left(\beta^2 k_z^2 - z\right)Z = 0$$

whence

 
$$Z = c_1\,\mathrm{Ai}\!\left(z - \beta^2 k_z^2\right) + c_2\,\mathrm{Bi}\!\left(z - \beta^2 k_z^2\right)$$

Thus, in order that the values of c1 and c2 are not both zero, kz² must satisfy

$$\mathrm{Ai}\!\left(-\beta^2 k_z^2\right)\mathrm{Bi}\!\left(\frac{a}{\beta} - \beta^2 k_z^2\right) - \mathrm{Ai}\!\left(\frac{a}{\beta} - \beta^2 k_z^2\right)\mathrm{Bi}\!\left(-\beta^2 k_z^2\right) = 0$$
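The roots of this condition can be located numerically; a sketch assuming scipy, writing s = β²kz² and using an illustrative value of a/β that is not from the text:

```python
import numpy as np
from scipy.optimize import brentq
from scipy.special import airy

L = 5.0   # illustrative value of a/β (an assumption, not from the text)

def W(s):
    # Ai(-s) Bi(L - s) - Ai(L - s) Bi(-s); its roots give s = β²kz²
    Ai0, _, Bi0, _ = airy(-s)
    AiL, _, BiL, _ = airy(L - s)
    return Ai0 * BiL - AiL * Bi0

# bracket sign changes on a grid, then polish each root with brentq
s = np.linspace(0.1, 30.0, 3000)
w = W(s)
roots = [brentq(W, a, b)
         for a, b, wa, wb in zip(s[:-1], s[1:], w[:-1], w[1:]) if wa * wb < 0]
print(roots[:3])
```

scipy's `airy` returns (Ai, Ai′, Bi, Bi′) at once, which is why the unpacking above discards the derivative slots.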

20.2 Cylindrical Coordinates: Bessel Functions

Solute Dispersion

Solute dispersion refers to the longitudinal spreading of a solute introduced into a flowing solvent
stream. We have seen a simple example of this in our model of a chromatographic separation
where we observed that the longitudinal dispersion of solute is due to a transverse variation of the
solvent velocity.

For a solvent in straight line flow in a pipe our problem is at least two dimensional. One
dimensional models are called dispersion equations and the coefficients appearing in these models
are called dispersion coefficients.

Assuming the process takes place in a pipe of circular cross section, we use cylindrical coordi-
nates and ∇2 in cylindrical coordinates is what this lecture is about.

Suppose a solvent is in straight line flow in a long straight pipe of circular cross section. At
time t = 0 a solute is injected into the solvent, its initial concentration being denoted c (t = 0).
Our job is to estimate its concentration some time later.

We align the z-axis with the axis of the pipe and denote the diameter of the pipe by 2a. Then
the solute concentration satisfies

$$\frac{\partial c}{\partial t} + v_z \frac{\partial c}{\partial z} = D\nabla^2 c, \qquad 0 \le r \le a, \quad 0 \le \theta < 2\pi, \quad -\infty < z < \infty$$

where
 
r2
vz = 2 vz 1 − 2
a

where vz denotes the average of v z over the pipe cross section and where
 
$$\nabla^2 = \frac{1}{r}\frac{\partial}{\partial r}\left(r\frac{\partial}{\partial r}\right) + \frac{1}{r^2}\frac{\partial^2}{\partial \theta^2} + \frac{\partial^2}{\partial z^2}$$

Henceforth we write v in place of vz .

Assuming that the wall of the pipe is impermeable to solute and that only a finite amount of
solute is put into the solvent at t = 0, we require c to satisfy

$$\frac{\partial c}{\partial r}(r = a) = 0$$

and we assume c vanishes strongly as z → ± ∞.

Now solvent near the axis of the pipe is moving faster than solvent near the wall; the result is
that solute is carried down stream by the solvent at different rates depending on where it initially
resides on the cross section of the pipe. This convective distortion of the initial solute distribution
creates transverse concentration gradients which tend to move solute from the axis to the wall at
the leading edge of the distribution and in the opposite direction at the trailing edge.

The problem of determining c, or at least something about c, is called the problem of solute
dispersion or Taylor dispersion. The first work leading to an understanding of this problem was
reported by G.I. Taylor in his 1953 paper “Dispersion of Soluble Matter in Solvent Flowing Slowly Through a Tube.” We deal with this problem because it requires us to deal with the eigenfunctions
of ∇2 in cylindrical coordinates and because it requires us to solve the diffusion equation taking
into account homogeneous sources. The latter two reasons are artifacts of our way of doing the
problem and fit well into our sequence of lectures; Taylor did not require any of this.

The convective diffusion equation is a linear equation in c and as such its solution can be
written in terms of the transverse eigenfunctions of ∇2 and a special set of orthogonal functions in
z. The result is not instructive. What invites the construction of models is that v is not constant but

depends on r. As Taylor was able to measure the transverse average solute concentration, i.e., c̄, where

$$\bar c = \frac{1}{\pi a^2}\int_0^a\!\!\int_0^{2\pi} c\,r\,dr\,d\theta$$

determining something about c̄ as it depends on z and t became the goal of his and much subsequent work on this problem.

The Action of the Flow by Itself on the Solute

To begin we inquire as to what the velocity field by itself is doing to the solute distribution in the
pipe. If diffusion is set aside then c satisfies the purely convective equation

$$\frac{\partial c}{\partial t} = -v(r)\frac{\partial c}{\partial z}$$

and this is easy to solve for any assigned c (t = 0) as it is simply the statement that the value
of c at the point r0 , θ0 , z0 at the time t = 0 will be found also at the point r1 = r0 , θ1 = θ0 ,
z1 = z0 + v (r0 ) t1 at the time t1 and hence c (t > 0) can be calculated in terms of c (t = 0).

In the end the information we seek is independent of c (t = 0) and we can get it most easily by
introducing the longitudinal power moments of c. Denoting these by cm, where

$$c_m = \int_{-\infty}^{+\infty} z^m c\,dz, \qquad m = 0, 1, 2, \ldots$$

we can derive the equation for cm by multiplying the purely convective equation by z m, integrating
over z from −∞ to +∞, using integration by parts and discarding the terms evaluated at z = ±∞.
Doing this for m = 0, 1, 2, . . . we get

$$\frac{\partial c_0}{\partial t} = 0$$

$$\frac{\partial c_1}{\partial t} = v\,c_0$$

$$\frac{\partial c_2}{\partial t} = 2\,v\,c_1$$

etc.

The moments of c can be determined recursively as

$$c_0 = c_0(t=0)$$

$$c_1 = c_1(t=0) + v\,c_0(t=0)\,t$$

$$c_2 = c_2(t=0) + 2v\,c_1(t=0)\,t + v^2 c_0(t=0)\,t^2$$

etc.

and their transverse averages calculated as

$$\bar c_0 = \bar c_0(t=0)$$

$$\bar c_1 = \bar c_1(t=0) + \overline{v\,c_0(t=0)}\,t$$

$$\bar c_2 = \bar c_2(t=0) + 2\,\overline{v\,c_1(t=0)}\,t + \overline{v^2 c_0(t=0)}\,t^2$$

etc.

To see how fast the solute is spreading in the axial direction due to nonuniform flow on the
cross section we calculate the longitudinal variance of the transverse average solute concentration
and determine how fast this is growing in time. Denoting the expected value of z^m by

$$\langle z^m \rangle = \frac{\displaystyle\int_{-\infty}^{+\infty} z^m \bar c\,dz}{\displaystyle\int_{-\infty}^{+\infty} \bar c\,dz} = \frac{\bar c_m}{\bar c_0}$$

we have

$$\left\langle (z - \langle z \rangle)^2 \right\rangle = \langle z^2 \rangle - \langle z \rangle^2 = \frac{\bar c_2}{\bar c_0} - \left(\frac{\bar c_1}{\bar c_0}\right)^2$$

which turns out to be


$$\left\{\frac{\bar c_2}{\bar c_0} - \left(\frac{\bar c_1}{\bar c_0}\right)^2\right\}(t=0) + 2\left\{\frac{\overline{v\,c_1(t=0)}}{\bar c_0(t=0)} - \frac{\overline{v\,c_0(t=0)}}{\bar c_0(t=0)}\,\frac{\bar c_1(t=0)}{\bar c_0(t=0)}\right\}t + \left\{\frac{\overline{v^2 c_0(t=0)}}{\bar c_0(t=0)} - \left(\frac{\overline{v\,c_0(t=0)}}{\bar c_0(t=0)}\right)^2\right\}t^2$$

If v were uniform on the cross section the variance would remain at its initial value. But a
non-uniform v causes the variance to grow as t2 . To see how much of this might be due to the
average motion we repeat the calculation in a frame moving at the average speed. To do this let

$$z' = z - \bar v\,t$$

and

t′ = t

Then c satisfies

$$\frac{\partial c}{\partial t'} = -(v - \bar v)\frac{\partial c}{\partial z'}$$

and the new formula for the variance requires only that we put v − v̄ in place of v in the formula we
already have. As this leads to no change, we see that the variance is the same whether we examine
solute spreading in a frame at rest or in a frame moving at the average speed of the solvent.
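As a concrete instance of the quadratic growth term, take scaled variables with uniform initial data c0(t = 0) = 1 and the Poiseuille profile v = 2v̄(1 − r²); the coefficient of t² then reduces to the cross-sectional variance of v. A sketch assuming sympy:

```python
import sympy as sp

r, vbar = sp.symbols('r vbar', positive=True)
v = 2 * vbar * (1 - r**2)          # scaled Poiseuille profile

def avg(f):
    # transverse average with the area weighting 2∫ f r dr on 0 ≤ r ≤ 1
    return sp.integrate(2 * f * r, (r, 0, 1))

coeff = sp.simplify(avg(v**2) - avg(v)**2)   # t² coefficient for uniform c0
print(coeff)   # vbar**2/3
```

A uniform v would make this coefficient vanish, which is exactly the statement in the text that the variance then stays at its initial value.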

Now we know that diffusion and a linear growth of the variance go hand in hand, hence, if
accounting for transverse diffusion can eliminate the quadratic term in the above formula, we can
think about solute dispersion as a longitudinal diffusion process.

Assuming transverse diffusion can cut the growth of the longitudinal variance of the solute

distribution from quadratic in time to linear in time, solute dispersion can be thought to be, on the
average, a longitudinal diffusion process. But as the balance between longitudinal convection and
transverse diffusion, on which elimination of the quadratic growth term depends, may take some
time to be established and as this time may depend on how the solute is initially distributed, we
need to view the representation of solute dispersion in terms of longitudinal diffusion as a long
time representation.

We can arrive at Taylor's result viz.,

$$D_{eff} = \frac{1}{48}\frac{\bar v^2 a^2}{D}$$

where the reader can observe that Deff is smaller as D is larger, by a route that takes us through
familiar territory. To begin observe that if the model for c is

$$\frac{\partial \bar c}{\partial t} = D_{eff}\frac{\partial^2 \bar c}{\partial z^2} - V_{eff}\frac{\partial \bar c}{\partial z}$$

then the moments of c̄, viz.,

$$\bar c_m = \int_{-\infty}^{+\infty} z^m \bar c\,dz$$

satisfy

$$\frac{d\bar c_0}{dt} = 0$$

$$\frac{d\bar c_1}{dt} = V_{eff}\,\bar c_0$$

$$\frac{d\bar c_2}{dt} = 2D_{eff}\,\bar c_0 + 2V_{eff}\,\bar c_1$$

$$\frac{d\bar c_3}{dt} = 6D_{eff}\,\bar c_1 + 3V_{eff}\,\bar c_2$$

etc.

and these equations tell us that we can determine Veff and Deff in terms of c̄0, c̄1 and c̄2 via

$$V_{eff} = \frac{1}{\bar c_0}\frac{d\bar c_1}{dt} = \frac{d}{dt}\left(\frac{\bar c_1}{\bar c_0}\right)$$

and

$$D_{eff} = \frac{1}{2\bar c_0}\left\{\frac{d\bar c_2}{dt} - 2\frac{\bar c_1}{\bar c_0}\frac{d\bar c_1}{dt}\right\} = \frac{1}{2}\frac{d}{dt}\left\{\frac{\bar c_2}{\bar c_0} - \left(\frac{\bar c_1}{\bar c_0}\right)^2\right\}$$

where c̄1/c̄0 and c̄2/c̄0 − (c̄1/c̄0)² are the average and the variance of z distributed according to c̄(z, t)/c̄0 at time t.

Now assuming axial symmetry, the solute distribution is in fact determined by


 
$$\frac{\partial c}{\partial t} = D\,\frac{1}{r}\frac{\partial}{\partial r}\left(r\frac{\partial c}{\partial r}\right) - v\frac{\partial c}{\partial z} + D\frac{\partial^2 c}{\partial z^2}$$

and

$$\frac{\partial c}{\partial r}(r = a) = 0$$

where c(t = 0) is assigned. So to determine Veff and Deff we need only determine c0, c1 and c2 and use their averages in the above formulas. This does not mean that the model is then correct, only that its first three power moments match those of the true solute distribution.

Before we go on we scale length, time and velocity by a, a²/D and D/a, and then Deff is scaled by D and we do not introduce new symbols. In terms of scaled variables the solute concentration satisfies
 
$$\frac{\partial c}{\partial t} = \frac{1}{r}\frac{\partial}{\partial r}\left(r\frac{\partial c}{\partial r}\right) - v\frac{\partial c}{\partial z} + \frac{\partial^2 c}{\partial z^2}$$

and

$$\frac{\partial c}{\partial r}(r = 1) = 0$$

where c(t = 0) is assigned and where $v = 2\bar v\,(1 - r^2)$. The transverse average is now


$$\bar u = 2\int_0^1 u\,r\,dr$$

and if we introduce an inner product via

$$\langle u, v \rangle = 2\int_0^1 u\,v\,r\,dr$$

then

$$\bar u = \langle 1, u \rangle$$

The longitudinal moments of c, denoted by cm and defined by

$$c_m = \int_{-\infty}^{+\infty} z^m c\,dz$$

satisfy
 
$$\frac{\partial c_0}{\partial t} = \frac{1}{r}\frac{\partial}{\partial r}\left(r\frac{\partial c_0}{\partial r}\right)$$

$$\frac{\partial c_1}{\partial t} = \frac{1}{r}\frac{\partial}{\partial r}\left(r\frac{\partial c_1}{\partial r}\right) + v\,c_0$$

$$\frac{\partial c_2}{\partial t} = \frac{1}{r}\frac{\partial}{\partial r}\left(r\frac{\partial c_2}{\partial r}\right) + 2\,v\,c_1 + 2\,c_0$$

etc.

which can be obtained by multiplying the equation satisfied by c by z m, integrating this over z
from −∞ to +∞, using integration by parts and then discarding terms evaluated at z = ±∞. The
boundary conditions to be satisfied are

$$\frac{\partial c_0}{\partial r}(r=1) = 0, \qquad \frac{\partial c_1}{\partial r}(r=1) = 0, \qquad \frac{\partial c_2}{\partial r}(r=1) = 0$$

etc.

and the initial conditions are determined by the moments of c (t = 0).

The moment equations can be solved recursively. The equation for c0 is simply a transverse
diffusion equation in a region where the boundary is impermeable. So also the equation for c1
but now there is a source depending on c0 ; likewise for c2 , the source now depending on c0 and
c1 . Indeed the equation for cm is a transverse diffusion equation having a source depending on
cm − 1 and cm − 2. The moment equations all take the same form. It is that of an unsteady, radial
diffusion equation driven by an assigned source. The equations must be solved in sequence so that
the sources can be determined before they are required. The power moments lead to this useful
structure as the mth moment of ∂c/∂z is a multiple of the (m − 1)st moment of c, etc. This lowering of the order of the moments moves the variable coefficient v(r) into the source term in each equation.

Our eigenvalue problem is


 
$$\frac{1}{r}\frac{\partial}{\partial r}\left(r\frac{\partial \psi}{\partial r}\right) + \lambda^2 \psi = 0, \qquad 0 \le r \le 1$$

and

$$\frac{\partial \psi}{\partial r}(r = 1) = 0$$

The eigenvalues, λ2 , are real and not negative and the eigenfunctions corresponding to different
eigenvalues are orthogonal in the inner product
$$\langle u, v \rangle = 2\int_0^1 u\,v\,r\,dr$$

The eigenfunctions are thus solutions of Bessel’s equation and because they must be bounded

at r = 0, we have

ψ = AJ0 (λr)

where the λ’s must then be the roots of

J0′ (λ) = 0

Again, maybe for the third time, we say: J0 (z) is a power series in z 2 , determined by Frobenius’
method. The coefficients in the series tell us everything about J0 .

To every positive root, λ, there corresponds a negative root, −λ, but as J0 (z) = J0 (−z) this
root adds neither a new eigenvalue nor a new eigenfunction. Denoting the non-negative roots then
as λ0 , λ1 , λ2 , . . . we have the corresponding normalized eigenfunctions ψ0 , ψ1 , ψ2 , . . ., where

$$\psi_i = \frac{J_0(\lambda_i r)}{\left\{2\displaystyle\int_0^1 J_0^2(\lambda_i r)\,r\,dr\right\}^{1/2}}$$

We observe that λ0 = 0 and ψ0 = 1 and hence transverse averages can be written

$$\bar u = \langle \psi_0, u \rangle$$

The first few eigenfunctions are sketched below.

[Figure: the first three eigenfunctions, J0(λ0 r), J0(λ1 r) and J0(λ2 r), on 0 ≤ r ≤ 1.]
where each eigenfunction is a rescaling of J0 (z), viz.,

[Figure: J0(z), with the rescalings cut off at z = λ0, λ1 and λ2.]
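The eigenvalues and the normalization can be checked numerically; a sketch assuming scipy, whose `jnp_zeros` returns the nonzero roots of J0′ (λ0 = 0, ψ0 = 1 is handled separately):

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import j0, jnp_zeros

lam = jnp_zeros(0, 3)   # nonzero roots of J0': 3.8317..., 7.0156..., 10.1735...

def psi(j):
    # normalized eigenfunction ψj(r) = J0(λj r) / {2∫ J0²(λj r) r dr}^(1/2)
    lj = lam[j]
    norm = np.sqrt(2 * quad(lambda r: j0(lj * r)**2 * r, 0, 1)[0])
    return lambda r: j0(lj * r) / norm

def inner(u, v):
    # the inner product <u, v> = 2∫ u v r dr on 0 ≤ r ≤ 1
    return 2 * quad(lambda r: u(r) * v(r) * r, 0, 1)[0]

print(inner(psi(0), psi(0)))          # ≈ 1
print(inner(psi(0), psi(1)))          # ≈ 0
print(inner(lambda r: 1.0, psi(0)))   # ≈ 0: each ψj, j ≥ 1, has zero average
```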

To find c0 is straightforward. We expand c0 as

$$c_0 = \sum_{j=0}^{\infty} \langle \psi_j, c_0 \rangle\,\psi_j$$

and obtain the Fourier coefficients of c0 by multiplying the c0 equation by rψj and integrating over 0 ≤ r ≤ 1, viz.,

$$\left\langle \psi_j, \frac{\partial c_0}{\partial t} \right\rangle = \left\langle \psi_j, \frac{1}{r}\frac{\partial}{\partial r}\left(r\frac{\partial c_0}{\partial r}\right) \right\rangle$$

Then, because ∂ψj/∂r(r = 1) = 0 = ∂c0/∂r(r = 1), we have

$$\frac{d}{dt}\langle \psi_j, c_0 \rangle = -\lambda_j^2\,\langle \psi_j, c_0 \rangle$$

and hence


$$c_0 = \sum_{j=0}^{\infty} \langle \psi_j, c_0(t=0) \rangle\,e^{-\lambda_j^2 t}\,\psi_j$$

whereupon

$$\bar c_0 = \langle \psi_0, c_0 \rangle = \langle \psi_0, c_0(t=0) \rangle = \bar c_0(t=0)$$

due to ⟨ψ0, ψj⟩ = 0, j = 1, 2, . . ., and λ0 = 0. Thus c̄0 remains constant in time.

Turning to c1 , we must solve


 
$$\frac{\partial c_1}{\partial t} = \frac{1}{r}\frac{\partial}{\partial r}\left(r\frac{\partial c_1}{\partial r}\right) + v\,c_0$$

and

$$\frac{\partial c_1}{\partial r}(r = 1) = 0$$

where c1 (t = 0) is assigned and where c0 is now known. This is a diffusion equation driven by
a homogeneous source. Again we expand the solution in the eigenfunctions ψ0 , ψ1 , ψ2 , . . . and

determine the coefficients in the expansion. Here we write


$$c_1 = \sum_{j=0}^{\infty} \langle \psi_j, c_1 \rangle\,\psi_j$$

and determine ⟨ψj, c1⟩ by multiplying the c1 equation by rψj and integrating over 0 ≤ r ≤ 1, viz.,

$$\left\langle \psi_j, \frac{\partial c_1}{\partial t} \right\rangle = \left\langle \psi_j, \frac{1}{r}\frac{\partial}{\partial r}\left(r\frac{\partial c_1}{\partial r}\right) \right\rangle + \langle \psi_j, v c_0 \rangle$$

Then because ∂ψj/∂r(r = 1) = 0 = ∂c1/∂r(r = 1) we have

$$\frac{d}{dt}\langle \psi_j, c_1 \rangle = -\lambda_j^2\,\langle \psi_j, c_1 \rangle + \langle \psi_j, v c_0 \rangle$$

and for each j, j = 0, 1, 2, . . ., this is an equation determining the coefficient ⟨ψj, c1⟩. This equation contains a source on the right hand side but the source is completely known. Hence we find

$$\langle \psi_j, c_1 \rangle = \langle \psi_j, c_1(t=0) \rangle\,e^{-\lambda_j^2 t} + \int_0^t e^{-\lambda_j^2 (t-\tau)}\,\langle \psi_j, v c_0(t=\tau) \rangle\,d\tau$$

and substituting for c0 we get

$$\langle \psi_j, c_1 \rangle = \langle \psi_j, c_1(t=0) \rangle\,e^{-\lambda_j^2 t} + \sum_{k=0}^{\infty} \langle \psi_j, v\psi_k \rangle \langle \psi_k, c_0(t=0) \rangle\,\frac{e^{-\lambda_k^2 t} - e^{-\lambda_j^2 t}}{-\lambda_k^2 + \lambda_j^2}$$

where when k = j we write $t\,e^{-\lambda_j^2 t}$ in place of $\dfrac{e^{-\lambda_k^2 t} - e^{-\lambda_j^2 t}}{-\lambda_k^2 + \lambda_j^2}$. Thus we have determined c1 as

$$c_1 = \sum_{j=0}^{\infty} \langle \psi_j, c_1(t=0) \rangle\,e^{-\lambda_j^2 t}\,\psi_j + \sum_{j=0}^{\infty}\sum_{k=0}^{\infty} \langle \psi_j, v\psi_k \rangle \langle \psi_k, c_0(t=0) \rangle\,\frac{e^{-\lambda_k^2 t} - e^{-\lambda_j^2 t}}{-\lambda_k^2 + \lambda_j^2}\,\psi_j$$

and we can go on and determine c̄1 as

$$\bar c_1 = \langle \psi_0, c_1 \rangle = \langle \psi_0, c_1(t=0) \rangle + \sum_{k=0}^{\infty} \langle \psi_0, v\psi_k \rangle \langle \psi_k, c_0(t=0) \rangle\,\frac{e^{-\lambda_k^2 t} - 1}{-\lambda_k^2}$$

$$= \bar c_1(t=0) + \sum_{k=0}^{\infty} \overline{v\psi_k}\,\langle \psi_k, c_0(t=0) \rangle\,\frac{e^{-\lambda_k^2 t} - 1}{-\lambda_k^2}$$

where when k = 0 the third factor in the summand reduces to t.

We now have c0, c̄0, c1 and c̄1 and the reader can go on and obtain c2, c̄2, . . . in a similar way; to do this simply requires a notational scheme to keep track of what is going on. Everything falls into place once such a scheme is invented.

It turns out that c2 is not needed in order to derive a formula for c̄2 and hence for Deff. To see this we average the equations satisfied by c0, c1 and c2 obtaining

 
$$\frac{d\bar c_0}{dt} = \overline{\frac{1}{r}\frac{\partial}{\partial r}\left(r\frac{\partial c_0}{\partial r}\right)}$$

$$\frac{d\bar c_1}{dt} = \overline{\frac{1}{r}\frac{\partial}{\partial r}\left(r\frac{\partial c_1}{\partial r}\right)} + \overline{v c_1}\Big|_{c_1 \to c_0}$$

and

$$\frac{d\bar c_2}{dt} = \overline{\frac{1}{r}\frac{\partial}{\partial r}\left(r\frac{\partial c_2}{\partial r}\right)} + 2\,\overline{v c_1} + 2\,\bar c_0$$

Now ψ0 and ci satisfy homogeneous Neumann conditions at r = 1 and λ0 is zero. This tells us that

$$\overline{\frac{1}{r}\frac{\partial}{\partial r}\left(r\frac{\partial c_i}{\partial r}\right)} = \left\langle \psi_0, \frac{1}{r}\frac{\partial}{\partial r}\left(r\frac{\partial c_i}{\partial r}\right) \right\rangle = \left\langle \frac{1}{r}\frac{d}{dr}\left(r\frac{d\psi_0}{dr}\right), c_i \right\rangle = 0$$

and hence that

$$\frac{d\bar c_0}{dt} = 0$$

$$\frac{d\bar c_1}{dt} = \overline{v c_0}$$

and

$$\frac{d\bar c_2}{dt} = 2\,\overline{v c_1} + 2\,\bar c_0$$

These formulas simplify the determination of Veff and Deff. Putting them into

$$V_{eff} = \frac{1}{\bar c_0}\frac{d\bar c_1}{dt}$$

and

$$D_{eff} = \frac{1}{2\bar c_0}\left\{\frac{d\bar c_2}{dt} - 2\frac{\bar c_1}{\bar c_0}\frac{d\bar c_1}{dt}\right\}$$

we get

$$V_{eff} = \frac{\overline{v c_0}}{\bar c_0}$$

and

$$D_{eff} = \frac{\overline{v c_1}}{\bar c_0} - \frac{\overline{v c_0}}{\bar c_0}\,\frac{\bar c_1}{\bar c_0} + 1$$

and this tells us that we need only c0, c̄0, c1 and c̄1 to determine Veff and Deff.

Now because

$$c_0 = \sum_{j=0}^{\infty} \langle \psi_j, c_0(t=0) \rangle\,e^{-\lambda_j^2 t}\,\psi_j = \bar c_0(t=0) + \sum_{j=1}^{\infty} \langle \psi_j, c_0(t=0) \rangle\,e^{-\lambda_j^2 t}\,\psi_j$$

and

$$\bar c_0 = \bar c_0(t=0)$$

we find

$$\overline{v c_0} = \sum_{j=0}^{\infty} \langle \psi_j, c_0(t=0) \rangle\,e^{-\lambda_j^2 t}\,\overline{v\psi_j} = \bar c_0(t=0)\,\bar v + O\!\left(e^{-\lambda_1^2 t}\right)$$

and we see that Veff turns out to be time dependent. This results whenever c0(t = 0) differs from its equilibrium value c̄0(t = 0). This difference weights the streamlines more or less heavily than their equilibrium weighting and as the streamlines differ in speed this leads to Veff being other than v̄. But for large enough values of t the equilibrium distribution of c0 is attained and Veff reaches a constant value. It is this that we now denote by Veff and its value is

$$V_{eff} = \bar v$$

To go on and determine Deff we need $\overline{v c_1}$ and c̄1. By using

$$c_1 = \sum_{j=0}^{\infty} \langle \psi_j, c_1(t=0) \rangle\,e^{-\lambda_j^2 t}\,\psi_j + \sum_{j=0}^{\infty}\sum_{k=0}^{\infty} \langle \psi_j, v\psi_k \rangle \langle \psi_k, c_0(t=0) \rangle\,\frac{e^{-\lambda_k^2 t} - e^{-\lambda_j^2 t}}{-\lambda_k^2 + \lambda_j^2}\,\psi_j$$

and

$$\bar c_1 = \bar c_1(t=0) + \sum_{k=0}^{\infty} \overline{v\psi_k}\,\langle \psi_k, c_0(t=0) \rangle\,\frac{e^{-\lambda_k^2 t} - 1}{-\lambda_k^2}$$

$$= \bar c_1(t=0) + \overline{v c_0(t=0)}\,t + \sum_{k=1}^{\infty} \overline{v\psi_k}\,\langle \psi_k, c_0(t=0) \rangle\,\frac{1}{\lambda_k^2} + O\!\left(e^{-\lambda_1^2 t}\right)$$

we find

$$\overline{v c_1} = \sum_{j=0}^{\infty} \langle \psi_j, c_1(t=0) \rangle\,e^{-\lambda_j^2 t}\,\overline{v\psi_j} + \sum_{j=0}^{\infty}\sum_{k=0}^{\infty} \langle \psi_j, v\psi_k \rangle \langle \psi_k, c_0(t=0) \rangle\,\frac{e^{-\lambda_k^2 t} - e^{-\lambda_j^2 t}}{-\lambda_k^2 + \lambda_j^2}\,\overline{v\psi_j}$$

$$= \bar c_1(t=0)\,\bar v + O\!\left(e^{-\lambda_1^2 t}\right) + \overline{v c_0(t=0)}\,\bar v\,t + \sum_{k=1}^{\infty} \overline{v\psi_k}\,\langle \psi_k, c_0(t=0) \rangle\,\bar v\,\frac{e^{-\lambda_k^2 t} - 1}{-\lambda_k^2}$$

$$\quad + \sum_{j=1}^{\infty} \langle \psi_j, v \rangle\,\bar c_0(t=0)\,\frac{1 - e^{-\lambda_j^2 t}}{\lambda_j^2}\,\overline{v\psi_j} + O\!\left(t\,e^{-\lambda_1^2 t}\right)$$

where the first of the last four terms corresponds to j = 0, k = 0, the second to j = 0, k > 0, the third to j > 0, k = 0 and the last to j > 0, k > 0. Using these formulas and writing $\langle \psi_j, v \rangle = \langle 1, \psi_j v \rangle = \overline{\psi_j v}$ we find

$$\overline{v c_1}\,\bar c_0 - \overline{v c_0}\,\bar c_1 = \sum_{j=1}^{\infty} \frac{\overline{v\psi_j}\,\overline{v\psi_j}}{\lambda_j^2}\,\bar c_0\,\bar c_0 + O\!\left(t\,e^{-\lambda_1^2 t}\right)$$

and hence, for large enough values of t, we see that Deff reaches a constant value. It is

$$D_{eff} = 1 + \sum_{j=1}^{\infty} \frac{\overline{v\psi_j}\,\overline{v\psi_j}}{\lambda_j^2}$$

This formula is correct for any v and for $v = 2\bar v\,(1 - r^2)$ it reduces to

$$1 + \frac{1}{48}\bar v^2$$

which is Taylor's formula once the 1, which accounts for longitudinal diffusion, is included.

To see this, we need to evaluate $\overline{v\psi_j} = \langle \psi_0, v\psi_j \rangle = \langle \psi_j, v \rangle$. Now because $v = 2\bar v\,(1 - r^2)$



we find that

$$\frac{1}{r}\frac{d}{dr}\left(r\frac{dv}{dr}\right) = -8\bar v$$

where v(r = 1) = 0 and $\dfrac{dv}{dr}(r = 1) = -4\bar v$. Then, integrating by parts, we have
$$\left\langle \psi_j, \frac{1}{r}\frac{d}{dr}\left(r\frac{dv}{dr}\right) \right\rangle = 2\left[\psi_j\,r\frac{dv}{dr} - r\,v\frac{d\psi_j}{dr}\right]_0^1 + \left\langle \frac{1}{r}\frac{d}{dr}\left(r\frac{d\psi_j}{dr}\right), v \right\rangle$$

and hence, for j = 1, 2, . . .

$$\langle \psi_j, v \rangle = -\frac{8\bar v\,\psi_j(r=1)}{\lambda_j^2}$$

Now, using

$$\psi_j(r=1) = \frac{J_0(\lambda_j)}{\left\{2\displaystyle\int_0^1 J_0^2(\lambda_j r)\,r\,dr\right\}^{1/2}}$$

$$\int_0^1 J_0^2(\lambda_j r)\,r\,dr = \frac{1}{2}\left\{J_0'^2(\lambda_j) + J_0^2(\lambda_j)\right\}$$

and

$$J_0'(\lambda_j) = 0$$

we obtain

$$\psi_j(r=1) = 1$$

(choosing the sign of the normalization so that ψj(r = 1) = +1) and our result is

$$\langle \psi_j, v \rangle = -\frac{8\bar v}{\lambda_j^2}$$
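This result can be confirmed by quadrature; a sketch assuming scipy, comparing magnitudes since the normalization fixes ψj(r = 1) only up to sign:

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import j0, jnp_zeros

vbar = 1.0
for lj in jnp_zeros(0, 5):                       # nonzero roots of J0'
    norm = np.sqrt(2 * quad(lambda r: j0(lj * r)**2 * r, 0, 1)[0])
    # <ψj, v> = 2 ∫ ψj(r) 2 vbar (1 - r²) r dr
    ip = 2 * quad(lambda r: (j0(lj * r) / norm)
                  * 2 * vbar * (1 - r**2) * r, 0, 1)[0]
    print(abs(ip), 8 * vbar / lj**2)             # the two columns agree
```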

whereupon we have

$$\sum_{j=1}^{\infty} \frac{\overline{v\psi_j}\,\overline{v\psi_j}}{\lambda_j^2} = 64\,\bar v^2 \sum_{j=1}^{\infty} \frac{1}{\lambda_j^6}$$

The sum $\sum_{j=1}^{\infty} \dfrac{1}{\lambda_j^6}$ is found, cf., A. E. DeGance and L. E. Johns, “Infinite Sums in the Theory of Dispersion of Chemically Reactive Solute,” SIAM J. Math. Anal. 18, 473 (1987), to be $\dfrac{1}{3072}$, using the fact that $J_0'(z) = -J_1(z)$, and the formula for Deff, when $v = 2\bar v\,(1 - r^2)$, is then

$$D_{eff} = 1 + \frac{1}{48}\bar v^2$$
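The coefficient can be checked directly from the zeros of J1, which coincide with the nonzero roots of J0′ since J0′ = −J1; a sketch assuming scipy:

```python
from scipy.special import jn_zeros

lam = jn_zeros(1, 200)        # first 200 positive zeros of J1
s6 = (1.0 / lam**6).sum()     # rapidly convergent: the first term dominates
print(64 * s6, 1 / 48)        # both ≈ 0.0208333
```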

Having now determined Veff and Deff, we must be careful not to claim too much for these results. Indeed we have not really matched the moments of the solution to the model equation to the moments of the true solute distribution, at least not yet. Of course we have not even looked at the moments for i > 2. But more than this, if we look at c̄1 where

$$\bar c_1 = \bar c_1(t=0) + \sum_{k=0}^{\infty} \langle \psi_0, v\psi_k \rangle \langle \psi_k, c_0(t=0) \rangle\,\frac{e^{-\lambda_k^2 t} - 1}{-\lambda_k^2}$$

we see that it can be written

$$\bar c_1 = \bar c_1(t=0) + \overline{v c_0(t=0)}\,t + \sum_{k=1}^{\infty} \langle \psi_0, v\psi_k \rangle \langle \psi_k, c_0(t=0) \rangle\,\frac{e^{-\lambda_k^2 t} - 1}{-\lambda_k^2}$$

whereas from the model we get, assuming Veff to be constant at its long time value,

$$\bar c_1(t=0) + V_{eff}\,\bar c_0(t=0)\,t$$

This cannot match the true c̄1 for all time but we can determine Veff, c̄0(t = 0) and c̄1(t = 0) to match it to the true c̄1 as t grows large. To do this we let Veff = v̄ and retain c̄0(t = 0) at its true

value. But we cannot take c̄1(t = 0) to be its true value; we must instead take it to be

$$\bar c_1(t=0) + \sum_{k=1}^{\infty} \langle \psi_0, v\psi_k \rangle \langle \psi_k, c_0(t=0) \rangle\,\frac{1}{\lambda_k^2}$$

What we see then is that in using the model we must determine c̄eff(t = 0) as well as Veff and Deff. If we do this we make the difference between the model prediction of c̄1 and the true c̄1, i.e., $\displaystyle\sum_{k=1}^{\infty} \overline{v\psi_k}\,\langle \psi_k, c_0(t=0) \rangle\,\frac{e^{-\lambda_k^2 t}}{-\lambda_k^2}$, to be $O\!\left(e^{-\lambda_1^2 t}\right)$; otherwise the difference will be O(1). A similar observation pertains to the moment c̄2. To have the model get c̄2 right as t grows large, additional conditions are required to be satisfied by c̄eff(t = 0).

The Zeros of J1(z)

The Bessel function J1(z) is defined to be the sum of the infinite series

$$\frac{1}{2}z\left\{1 - \frac{\left(\frac{z}{2}\right)^2}{1!\,2!} + \frac{\left(\frac{z}{2}\right)^4}{2!\,3!} - \cdots\right\} = \frac{1}{2}z\sum_{k=0}^{\infty} \frac{(-1)^k \left(\frac{z}{2}\right)^{2k}}{k!\,(k+1)!}$$

This series converges for all finite values of z as do the series obtained from it by term-by-term differentiation.

The zeros of J1 are real and simple and we denote them 0, ±z1, ±z2, . . .. The function $\dfrac{J_1'(z)}{J_1(z)}$ has simple poles at z = 0, ±z1, ±z2, . . . and its residue at each pole is +1. Thus on any closed contour C the value of the integral
$$\oint_C \frac{1}{z - \zeta}\,\frac{J_1'(z)}{J_1(z)}\,dz$$

is the product of 2πi and the sum of the residues of the integrand at its poles inside C. In the limit as C grows large and encloses the entire complex plane, the integral vanishes and we get

$$0 = 2\pi i\left\{\frac{J_1'(\zeta)}{J_1(\zeta)} + \frac{1}{0 - \zeta} + \frac{1}{z_1 - \zeta} + \frac{1}{-z_1 - \zeta} + \frac{1}{z_2 - \zeta} + \frac{1}{-z_2 - \zeta} + \cdots\right\}$$

and hence writing z in place of ζ, we have

$$\frac{1}{z} - \frac{J_1'(z)}{J_1(z)} = \frac{-2z}{-z_1^2 + z^2} + \frac{-2z}{-z_2^2 + z^2} + \cdots$$

whereupon

$$\frac{1}{2z}\left\{\frac{1}{z} - \frac{J_1'(z)}{J_1(z)}\right\} = \frac{1}{z_1^2\left(1 - \dfrac{z^2}{z_1^2}\right)} + \frac{1}{z_2^2\left(1 - \dfrac{z^2}{z_2^2}\right)} + \cdots$$

For any z such that |z| < |z1|, where |z1| < |z2| < · · ·, we can expand the factors $\dfrac{1}{1 - z^2/z_i^2}$ on the right hand side to get

$$\frac{J_1(z) - zJ_1'(z)}{2z^2 J_1(z)} = \sum_{i=1}^{\infty} \frac{1}{z_i^2} + \left(\sum_{i=1}^{\infty} \frac{1}{z_i^4}\right)z^2 + \left(\sum_{i=1}^{\infty} \frac{1}{z_i^6}\right)z^4 + \cdots$$

and then, using the series for J1 (z) to expand the left hand side in powers of z 2 , we can match the
coefficients of 1, z 2 , z 4 , . . . on the two sides to get

$$\sum_{i=1}^{\infty} \frac{1}{z_i^2} = \frac{1}{8}, \qquad \sum_{i=1}^{\infty} \frac{1}{z_i^4} = \frac{1}{192}, \qquad \sum_{i=1}^{\infty} \frac{1}{z_i^6} = \frac{1}{3072}$$

etc.

These formulas can be used to estimate the zeros of J1(z); the third formula is the one needed, as it stands, in evaluating Deff.
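The three sums, and the kind of zero estimate they permit, can be checked numerically; a sketch assuming scipy:

```python
from scipy.special import jn_zeros

z = jn_zeros(1, 2000)            # first 2000 positive zeros of J1
print((1 / z**2).sum())          # ≈ 1/8      (slowly convergent)
print((1 / z**4).sum())          # ≈ 1/192
print((1 / z**6).sum())          # ≈ 1/3072

# the last sum is dominated by its first term, so z1 ≈ 3072^(1/6)
print(3072 ** (1 / 6), z[0])     # 3.81... vs 3.83...
```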

Functions such as cos z, sin z, J0 (z), J1 (z), etc., defined by infinite sums converging for all
finite values of z are generalizations of polynomials. As such they also have infinite product

expansions which simplify this kind of work. The infinite product expansion of J1 leads directly to the formula

$$\frac{J_1'(z)}{J_1(z)} = \frac{1}{z} + \frac{2z}{z^2 - z_1^2} + \frac{2z}{z^2 - z_2^2} + \cdots$$

which we got by the residue theorem.

20.3 Spherical Coordinates: Spherical Harmonics, the Method of Frobenius

The eigenvalue problem for ∇2 , viz., (∇2 + λ2 ) ψ = 0, when written out in spherical coordinates
leads on separation of variables to three problems. Each of these determines one of the factors
making up ψ (r, θ, φ) as the product R (r) Θ (θ) Φ (φ). Solving for Φ (φ) is easy and will be re-
viewed here. But our main job in this lecture is the determination of Θ (θ). This requires us to
explain Frobenius’ method and we use Θ (θ) as an example of this. The determination of R (r) is
also easy and is not worked out in detail.

To get going we recall the multipole moment expansion of the electrical potential due to a set of point charges. This is of interest because the potential, denoted φ, satisfies ∇2 φ = 0.

To write φ at a point P, whose position is denoted by ~r, we use Coulomb's law (in rationalized MKS units), viz.,

\[
\phi(\vec r) = \sum_{i=1}^{n}\frac{q_i}{4\pi\varepsilon_0\,|\vec r - \vec r_i|}
\]

Then, in terms of r = |~r|, ri = |~ri| and θi, the angle between ~r and ~ri, we have

\[
|\vec r - \vec r_i|^2 = (\vec r - \vec r_i)\cdot(\vec r - \vec r_i) = r^2 - 2r\,r_i\cos\theta_i + r_i^2
\]
and hence we write

\[
\frac{1}{|\vec r - \vec r_i|} = \frac{1}{r}\,\frac{1}{\sqrt{1 - 2\dfrac{r_i}{r}\cos\theta_i + \left(\dfrac{r_i}{r}\right)^2}}
\]

so that if the field point is farther from the origin than any source point we can use the expansion

\[
\frac{1}{\sqrt{1+z}} = 1 + \frac{\left(-\frac{1}{2}\right)}{1!}z + \frac{\left(-\frac{1}{2}\right)\left(-\frac{3}{2}\right)}{2!}z^2 + \frac{\left(-\frac{1}{2}\right)\left(-\frac{3}{2}\right)\left(-\frac{5}{2}\right)}{3!}z^3 + \cdots
\]

which holds whenever |z| < 1, and write

\[
\frac{1}{\sqrt{1 - 2\dfrac{r_i}{r}\cos\theta_i + \left(\dfrac{r_i}{r}\right)^2}}
= 1 + \frac{\left(-\frac{1}{2}\right)}{1!}\left\{-2\frac{r_i}{r}\cos\theta_i + \left(\frac{r_i}{r}\right)^2\right\}
+ \frac{\left(-\frac{1}{2}\right)\left(-\frac{3}{2}\right)}{2!}\left\{-2\frac{r_i}{r}\cos\theta_i + \left(\frac{r_i}{r}\right)^2\right\}^2 + \cdots
\]

This requires −1 < −2(ri/r) cos θi + (ri/r)² < 1.
r r
On rewriting this in ascending powers of ri/r we get

\[
1 + \frac{r_i}{r}\cos\theta_i + \left(\frac{r_i}{r}\right)^2\,\frac{1}{2}\left(3\cos^2\theta_i - 1\right) + \cdots
\]

Our interest in this lies in the functions 1, cos θi, ½(3 cos²θi − 1), ... obtained on expanding 1/{1 + z}^{1/2} using the binomial theorem, writing z = −2xy + y² where x = cos θi and y = ri/r, and then arranging the result in a power series in y. We can write the expansion directly in powers of y as (see Linus Pauling and E. Bright Wilson, Jr., "Introduction to Quantum Mechanics, with Applications to Chemistry")

\[
T(x,y) = \frac{1}{\{1-2xy+y^2\}^{1/2}} = \sum_{\ell=0}^{\infty} P_\ell(x)\,y^\ell
\]
and then determine the sequence of functions Pℓ, ℓ = 0, 1, 2, .... To do this we must have |x| ≤ 1 and |y| small enough that the series converges uniformly in x. The function T(x, y) = 1/{1 − 2xy + y²}^{1/2} is called the generating function for the functions P0(x), P1(x), P2(x), .... It turns up naturally in electrostatics and not only generates P0(x), P1(x), P2(x), ... but also provides an easy way to uncover the important facts about these functions.

To begin we obtain P0 (x) and P1 (x) via

\[
T(x, y=0) = 1 = P_0(x)
\]

and

\[
\frac{\partial T}{\partial y}(x, y=0) = x = P_1(x)
\]

Then we can derive a recursion formula expressing any one function in terms of the previous two.
Indeed using

\[
T(x,y) = \frac{1}{\{1-2xy+y^2\}^{1/2}} = \sum P_\ell(x)\,y^\ell
\]

and

\[
\frac{\partial}{\partial y}T(x,y) = \frac{x-y}{\{1-2xy+y^2\}^{3/2}} = \sum \ell P_\ell(x)\,y^{\ell-1}
\]

we get

\[
(x-y)\sum P_\ell(x)\,y^\ell = \left(1-2xy+y^2\right)\sum \ell P_\ell(x)\,y^{\ell-1}
\]

and requiring the coefficients of y ℓ to agree on the two sides we get

\[
(\ell+1)P_{\ell+1}(x) - (2\ell+1)\,xP_\ell(x) + \ell P_{\ell-1}(x) = 0
\]

As P0(x) = 1 and P1(x) = x, this recursion formula can be used to obtain

\[
P_2(x) = \frac{3}{2}x^2 - \frac{1}{2},\ \ldots
\]

And by doing this we can see that Pℓ(x) is a polynomial of degree ℓ and that it is odd or even as ℓ is odd or even. The polynomials so defined are called Legendre polynomials and they are fundamental to work in spherical coordinate systems.
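The recursion formula is easy to run by machine. A sketch in Python (exact rational arithmetic from the standard library; the coefficient-list representation of the polynomials is our choice):

```python
from fractions import Fraction

def legendre(n):
    """Coefficient lists [c0, c1, ...] of P_0 ... P_n via
    (l+1) P_{l+1} = (2l+1) x P_l - l P_{l-1}."""
    P = [[Fraction(1)], [Fraction(0), Fraction(1)]]   # P0 = 1, P1 = x
    for l in range(1, n):
        xPl = [Fraction(0)] + P[l]                    # multiply P_l by x
        nxt = [(2 * l + 1) * c for c in xPl]
        for i, c in enumerate(P[l - 1]):
            nxt[i] -= l * c
        P.append([c / (l + 1) for c in nxt])
    return P

P = legendre(3)
print(P[2])   # coefficients of P2 = (3x^2 - 1)/2
print(P[3])   # coefficients of P3 = (5x^3 - 3x)/2
```

Running the recursion in exact arithmetic makes the parity of each Pℓ visible at a glance: every other coefficient is zero.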

We can use the generating function to obtain the differential equation satisfied by Pℓ(x). To do this we observe that

\[
\frac{\partial T}{\partial x}(x,y) = \frac{y}{\{1-2xy+y^2\}^{3/2}} = \sum P_\ell'(x)\,y^\ell
\]

and hence that

\[
y\sum P_\ell(x)\,y^\ell = \left(1-2xy+y^2\right)\sum P_\ell'(x)\,y^\ell
\]

and again requiring the coefficients of y^ℓ to agree on the two sides we get

\[
P_{\ell+1}'(x) - 2xP_\ell'(x) + P_{\ell-1}'(x) - P_\ell(x) = 0
\]

Then multiplying this by ℓ + 1 and subtracting the result from the derivative of the recursion formula leads to

\[
xP_\ell'(x) - P_{\ell-1}'(x) - \ell P_\ell(x) = 0
\]

Likewise we find

\[
P_{\ell+1}'(x) - xP_\ell'(x) - (\ell+1)P_\ell(x) = 0
\]

and subtracting the first, multiplied by x, from the second, written with ℓ − 1 in place of ℓ, we obtain

\[
\left(1-x^2\right)P_\ell'(x) + \ell xP_\ell(x) - \ell P_{\ell-1}(x) = 0
\]

which on differentiation results in

\[
\frac{d}{dx}\left\{\left(1-x^2\right)\frac{d}{dx}P_\ell(x)\right\} + \ell(\ell+1)P_\ell(x) = 0
\]

upon replacing ℓxP′ℓ(x) − ℓP′ℓ−1(x) by ℓ²Pℓ(x). This is called Legendre's differential equation and it is satisfied by the Legendre polynomials. These polynomials satisfy the orthogonality condition

\[
\int_{-1}^{+1} P_\ell\,P_{\ell'}\,dz = 0, \qquad \ell \neq \ell', \quad \ell, \ell' = 0, 1, 2, \ldots
\]

To see this we use

\[
\int_{-1}^{+1} P\,\frac{d}{dz}\left\{\left(1-z^2\right)\frac{dQ}{dz}\right\}dz - \int_{-1}^{+1} Q\,\frac{d}{dz}\left\{\left(1-z^2\right)\frac{dP}{dz}\right\}dz
= \left[\left(1-z^2\right)\left\{P\frac{dQ}{dz} - Q\frac{dP}{dz}\right\}\right]_{-1}^{+1} = 0
\]

which holds for any smooth and bounded functions P and Q. Then setting P = Pℓ and Q = Pℓ′ produces the required result.

To make use of the orthogonality condition:

\[
\int_{-1}^{+1} P_\ell(z)\,P_{\ell'}(z)\,dz = \left\{\int_{-1}^{+1} P_\ell^2(z)\,dz\right\}\delta_{\ell\ell'}
\]

we need to evaluate the integrals

\[
\int_{-1}^{+1} P_\ell^2(z)\,dz
\]

To get these we write

\[
P_\ell(z) = \frac{1}{\ell}\left\{(2\ell-1)\,zP_{\ell-1}(z) - (\ell-1)\,P_{\ell-2}(z)\right\}
\]

multiply by Pℓ(z) and integrate from −1 to +1. Due to the orthogonality of Pℓ and Pℓ−2 we get

\[
\int_{-1}^{+1} P_\ell^2(z)\,dz = \frac{2\ell-1}{\ell}\int_{-1}^{+1} P_\ell(z)\,zP_{\ell-1}(z)\,dz
\]

Then as

\[
zP_\ell(z) = \frac{1}{2\ell+1}\left\{(\ell+1)P_{\ell+1}(z) + \ell P_{\ell-1}(z)\right\}
\]

we see that

\[
\int_{-1}^{+1} P_\ell^2(z)\,dz = \frac{2\ell-1}{2\ell+1}\int_{-1}^{+1} P_{\ell-1}^2(z)\,dz
\]

and using this recursively we get

\[
\int_{-1}^{+1} P_\ell^2(z)\,dz = \frac{2}{2\ell+1}
\]
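This normalization can be confirmed by quadrature; a sketch (Simpson's rule and the three-term recursion for evaluating Pℓ are our choices):

```python
def P(l, z):
    # evaluate P_l(z) by the recursion l P_l = (2l-1) z P_{l-1} - (l-1) P_{l-2}
    p0, p1 = 1.0, z
    if l == 0:
        return p0
    for k in range(2, l + 1):
        p0, p1 = p1, ((2 * k - 1) * z * p1 - (k - 1) * p0) / k
    return p1

def simpson(f, a, b, n=2000):
    # composite Simpson's rule, n panels (n even)
    h = (b - a) / n
    s = f(a) + f(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(a + i * h)
    return s * h / 3

# compare the quadrature of P_l^2 over (-1, 1) with 2/(2l+1)
errors = [abs(simpson(lambda z: P(l, z) ** 2, -1.0, 1.0) - 2.0 / (2 * l + 1))
          for l in range(6)]
print(errors)   # all tiny
```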

The Angular Eigenfunctions of ∇2 in Spherical Coordinates: The Use of Frobenius' Method

We will rediscover the functions Pℓ (z) in the course of solving for the eigenfunctions of ∇2 in
spherical coordinates; it is to this that we now turn. Writing ∇2 in spherical coordinates and
substituting ψ (r, θ, φ) = R (r) Θ (θ) Φ (φ) into the eigenvalue problem (∇2 + λ2 ) ψ = 0 we find
that R (r), Θ (θ) and Φ (φ) satisfy

\[
\frac{d^2\Phi}{d\phi^2} + m^2\Phi = 0
\]

\[
\frac{1}{\sin\theta}\frac{d}{d\theta}\left(\sin\theta\,\frac{d\Theta}{d\theta}\right) + \left(\beta^2 - \frac{m^2}{\sin^2\theta}\right)\Theta = 0
\]

and

\[
\frac{1}{r^2}\frac{d}{dr}\left(r^2\frac{dR}{dr}\right) + \left(\lambda^2 - \frac{\beta^2}{r^2}\right)R = 0
\]

These three ordinary differential equations turn up in the order listed and −m2 and −β 2 are
separation constants introduced in the course of carrying out the separation of variables algorithm.
The values of m2 , β 2 and λ2 must be determined as well as expressions for the functions Φ, Θ and
R. The problems must be solved in order, because m2 in the first carries over to the second and
β 2 in the second carries over to the third. Each of these equations is a one-dimensional eigenvalue
problem in its own right but is not a definite problem until boundary conditions are assigned.
The boundary conditions derive from those satisfied by ψ which in turn come from the physical

problem of interest.

The φ and θ equations are common to many problems and we work out their solutions in some
detail.

We suppose that physical boundary conditions are assigned only on surfaces where r is constant. Hence we use symmetry and boundedness conditions to derive the dependence of ψ on θ and φ and therefore Θ on θ and Φ on φ.

Thus Φ must be periodic in φ of period 2π and we write the solution to

\[
\frac{d^2\Phi}{d\phi^2} + m^2\Phi = 0
\]

as

\[
\Phi = A\cos m\phi + B\,\frac{\sin m\phi}{m}
\]

whereupon using Φ(0) = Φ(2π) and Φ′(0) = Φ′(2π) we have, for A, B and m,

\[
\begin{pmatrix} \cos 2\pi m - 1 & \dfrac{1}{m}\sin 2\pi m \\[4pt] -m\sin 2\pi m & \cos 2\pi m - 1 \end{pmatrix}
\begin{pmatrix} A \\ B \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}
\]

Only values of m such that

\[
2 - 2\cos 2\pi m = 0
\]

lead to A and B other than A = 0 = B.

Hence we must have m = 0, ±1, ±2, .... If m = 0, the rank of the matrix on the LHS is 1, otherwise it is 0. At m = 0, we write

\[
\begin{pmatrix} A \\ B \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \end{pmatrix};
\]

at m = +1, +2, ... we write

\[
\begin{pmatrix} A \\ B \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \quad\text{and}\quad \begin{pmatrix} 0 \\ 1 \end{pmatrix}
\]

Corresponding to m = 0, we have m² = 0, Φ = 1; corresponding to m = +1, +2, ..., we have m² = 1, 4, ... and Φ = cos mφ and sin mφ. Corresponding to m = −1, −2, ... there is no new information.

However what we do is this: we replace cos mφ and sin mφ with two linear combinations,
eimφ and e−imφ . Then we assign eimφ to m and let m run through · · · , −2, −1, 0, 1, 2, · · · . So
for m = 0, ±1, ±2, . . . we have

\[
\Phi_m(\phi) = \frac{1}{\sqrt{2\pi}}\,e^{im\phi}
\]

where

\[
\int_0^{2\pi} \overline{\Phi}_m\,\Phi_n\,d\phi = \begin{cases} 1, & m = n \\ 0, & m \neq n \end{cases}
\]

and where the coefficients cm in an expansion

\[
f(\phi) = \sum_{m=-\infty}^{+\infty} c_m\,\Phi_m(\phi)
\]

are given by

\[
c_m = \int_0^{2\pi} \overline{\Phi}_m(\phi)\,f(\phi)\,d\phi = \int_0^{2\pi} \Phi_{-m}(\phi)\,f(\phi)\,d\phi
\]
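Both the orthonormality and the coefficient formula can be checked numerically; a sketch (the trapezoid rule, which on a periodic grid is exact for trigonometric polynomials, and the test function f(φ) = cos 2φ are our choices; for this f only c±2 survive and c2 = √(π/2)):

```python
import cmath, math

N = 256
w = 2 * math.pi / N                 # trapezoid weight on a periodic grid
phis = [w * k for k in range(N)]

def Phi(m, phi):
    return cmath.exp(1j * m * phi) / math.sqrt(2 * math.pi)

def inner(m, n):
    # discrete version of the conjugated inner product over (0, 2 pi)
    return sum(Phi(m, p).conjugate() * Phi(n, p) for p in phis) * w

# coefficient of f(phi) = cos 2 phi with respect to Phi_2
c2 = sum(Phi(2, p).conjugate() * math.cos(2 * p) for p in phis) * w
print(abs(inner(3, 3)), abs(inner(3, 1)), abs(c2))   # ~1, ~0, ~sqrt(pi/2)
```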

Now having obtained the values of m², we turn to the θ equation:

\[
\frac{1}{\sin\theta}\frac{d}{d\theta}\left(\sin\theta\,\frac{d\Theta}{d\theta}\right) + \left(\beta^2 - \frac{m^2}{\sin^2\theta}\right)\Theta = 0
\]

where 0 ≤ θ ≤ π. To solve this equation we introduce z = cos θ so that z runs from +1 to −1 as θ runs from 0 to π. Then writing P(z) in place of Θ(θ) = P(cos θ) we find that P(z) satisfies

\[
\frac{d}{dz}\left\{\left(1-z^2\right)\frac{dP}{dz}\right\} + \left(\beta^2 - \frac{m^2}{1-z^2}\right)P = 0, \qquad -1 \le z \le 1
\]

or

\[
\left(1-z^2\right)\frac{d^2P}{dz^2} - 2z\frac{dP}{dz} + \left(\beta^2 - \frac{m^2}{1-z^2}\right)P = 0, \qquad -1 \le z \le 1
\]

The Method of Frobenius

Our aim is to derive the solution to this equation by obtaining the coefficients in a series expansion
of P (z) in powers of z.

But first we outline the method we use, due to Fuchs and Frobenius, for obtaining solutions to
equations of the form

\[
y^{(m)} + p_{m-1}(x)\,y^{(m-1)} + \cdots + p_1(x)\,y^{(1)} + p_0(x)\,y = 0
\]

A point x0 is called an ordinary point if all pi (x) have Taylor series expansions about x0 . For
example x = 0 is an ordinary point of

\[
\frac{d^2y}{dx^2} + y = 0
\]

and the reader may seek a solution to this equation in the form


\[
y = \sum_{n=0}^{\infty} a_n x^n
\]

producing the series for cos x if a0 = 1, a1 = 0 and for sin x if a0 = 0, a1 = 1. The coefficients in
these series tell us everything we might want to know about the functions denoted sin and cos.

A point x0 is called a regular singular point if

\[
(x-x_0)^m\,p_0(x),\ (x-x_0)^{m-1}\,p_1(x),\ \cdots,\ (x-x_0)\,p_{m-1}(x)
\]

all possess Taylor series expansions about x0. If x0 is a regular singular point, we can find a solution

\[
y = (x-x_0)^{\alpha}\sum a_n\,(x-x_0)^n, \qquad a_0 \neq 0
\]

In the case of second order equations, where x0 is a regular singular point, we write

\[
y'' + \frac{p(x)}{x-x_0}\,y' + \frac{q(x)}{(x-x_0)^2}\,y = 0
\]

where p(x) and q(x) have Taylor series expansions about x = x0, viz.,

\[
p = \sum_{n=0}^{\infty} p_n\,(x-x_0)^n \qquad\text{and}\qquad q = \sum_{n=0}^{\infty} q_n\,(x-x_0)^n
\]

Substituting

\[
y = (x-x_0)^{\alpha}\sum a_n\,(x-x_0)^n = \sum a_n\,(x-x_0)^{n+\alpha}
\]
\[
y' = \alpha\sum a_n\,(x-x_0)^{n+\alpha-1} + \sum n\,a_n\,(x-x_0)^{n+\alpha-1}
\]
\[
y'' = \alpha(\alpha-1)\sum a_n\,(x-x_0)^{n+\alpha-2} + 2\alpha\sum n\,a_n\,(x-x_0)^{n+\alpha-2} + \sum n(n-1)\,a_n\,(x-x_0)^{n+\alpha-2}
\]

into our differential equation we obtain

\[
\alpha(\alpha-1)\sum a_n\,(x-x_0)^{n+\alpha-2} + 2\alpha\sum n\,a_n\,(x-x_0)^{n+\alpha-2} + \sum n(n-1)\,a_n\,(x-x_0)^{n+\alpha-2}
\]
\[
{}+ \sum p_n\,(x-x_0)^n\,\sum(\alpha+n)\,a_n\,(x-x_0)^{n+\alpha-2} + \sum q_n\,(x-x_0)^n\,\sum a_n\,(x-x_0)^{n+\alpha-2} = 0
\]

and equating the coefficients of (x − x0)^{n+α−2}, n = 0, 1, 2, ... to zero we have

\[
n = 0:\qquad \left\{\alpha^2 + (p_0-1)\alpha + q_0\right\}a_0 = 0
\]

\[
n = 1, 2, \ldots:\qquad \left\{(\alpha+n)^2 + (p_0-1)(\alpha+n) + q_0\right\}a_n = -\sum_{k=0}^{n-1}\left\{(\alpha+k)\,p_{n-k} + q_{n-k}\right\}a_k
\]

Hence, at n = 0, a0 ≠ 0 implies P(α) = α² + (p0 − 1)α + q0 = 0 (called the indicial equation), whereas for n = 1, 2, ... we have

\[
P(\alpha+n)\,a_n = -\sum_{k=0}^{n-1}\left\{(\alpha+k)\,p_{n-k} + q_{n-k}\right\}a_k
\]

which leads to a1, a2, ..., an in terms of a0 so long as P(α + n) is not zero.

We denote by α1 and α2 the two roots of P(α) = 0 and assume Re α1 ≥ Re α2. Then for α = α1 we have P(α + n) ≠ 0 for n = 1, 2, ... because α2 is the only root, other than α1, of P(α) = 0 and α2 ≠ α1 + n. Hence we always have one solution to our equation.

As an example, we solve

\[
y'' + \frac{1}{x}y' - \left(1 + \frac{\nu^2}{x^2}\right)y = 0
\]

which has a regular singular point at x = 0. We substitute y = x^α Σ an x^n, obtaining

\[
\sum_{n=0}^{\infty}(\alpha+n)(\alpha+n-1)\,a_n x^{n+\alpha-2} + \sum_{n=0}^{\infty}(\alpha+n)\,a_n x^{n+\alpha-2} - \sum_{n=2}^{\infty} a_{n-2}\,x^{n+\alpha-2} - \nu^2\sum_{n=0}^{\infty} a_n x^{n+\alpha-2} = 0
\]

hence we have

\[
\left(\alpha^2 - \nu^2\right)a_0 = 0
\]
\[
\left\{(\alpha+1)^2 - \nu^2\right\}a_1 = 0
\]
\[
\left\{(\alpha+n)^2 - \nu^2\right\}a_n = a_{n-2}, \qquad n = 2, \ldots
\]

where a0 ≠ 0 implies P(α) = α² − ν² = 0, therefore α = ±ν and α1 = ν, α2 = −ν, assuming Re ν > 0. Thus P(α1 + n) ≠ 0, n = 1, 2, ..., and we have a1, a3, a5, ... all zero and

\[
a_{2n} = \frac{a_{2n-2}}{2^2\,n\,(\nu+n)} = \frac{a_{2n-4}}{2^4\,n(n-1)(\nu+n)(\nu+n-1)}
\]

etc.,

where, choosing a0 = 2^{−ν}/Γ(ν + 1), we have the series

\[
\sum_{n=0}^{\infty}\frac{1}{n!\,\Gamma(\nu+n+1)}\left(\frac{x}{2}\right)^{2n+\nu}, \qquad \Gamma(z+1) = z\,\Gamma(z)
\]

which has an infinite radius of convergence and defines the function Iν(x).

If α1 − α2 is not an integer, we can find a second solution by using α2 in place of α1 . But often
α1 − α2 is an integer and this presents a technical difficulty.

If α1 = α2 (case ν = 0 above) we can solve the recursion formula for an as a function of a0 and α. Then

\[
y(x,\alpha) = (x-x_0)^{\alpha}\sum a_n(\alpha)\,(x-x_0)^n
\]

is not a solution unless α = α1 and we have

\[
\left\{\frac{d^2}{dx^2} + \frac{p(x)}{x-x_0}\frac{d}{dx} + \frac{q(x)}{(x-x_0)^2}\right\}y(x,\alpha) = a_0\,(x-x_0)^{\alpha-2}\,P(\alpha)
\]

where at α = α1, P(α) = 0, the RHS = 0 and y(x, α1) is a solution. Now differentiating with respect to α, the RHS is a linear combination of P(α) and P′(α), and if α1 = α2, P(α) = (α − α1)² and the new RHS is zero at α = α1. Thus our second solution is

\[
\frac{\partial}{\partial\alpha}y(x,\alpha)\bigg|_{\alpha=\alpha_1} = y(x,\alpha_1)\ln(x-x_0) + \sum\frac{da_n}{d\alpha}\bigg|_{\alpha=\alpha_1}(x-x_0)^{n+\alpha_1}
\]

Much more than this sketch can be found in “Advanced Mathematical Methods for Scientists
and Engineers” by Carl M. Bender and Steven A. Orszag and in “An Introduction to the Theory of
Functions of a Complex Variable” by E. T. Copson.

The reader can work out the closely related problem, again Bessel’s equation,
 
\[
\frac{d^2y}{dx^2} + \frac{1}{x}\frac{dy}{dx} + \left(1 - \frac{m^2}{x^2}\right)y = 0
\]

where x = 0 is a regular singular point. Assuming m = ..., −2, −1, 0, 1, 2, ... and writing

\[
y = x^{\alpha}\sum a_n x^n
\]

you should have

\[
y' = \alpha x^{\alpha-1}\sum a_n x^n + x^{\alpha}\sum n\,a_n x^{n-1}
\]

and

\[
y'' = \alpha(\alpha-1)x^{\alpha-2}\sum a_n x^n + 2\alpha x^{\alpha-1}\sum n\,a_n x^{n-1} + x^{\alpha}\sum n(n-1)\,a_n x^{n-2}
\]

and upon substituting and multiplying by x² you should obtain

\[
\left\{\alpha(\alpha-1) + \alpha + x^2 - m^2\right\}x^{\alpha}\sum a_n x^n + \left\{2\alpha+1\right\}x^{\alpha+1}\sum n\,a_n x^{n-1} + x^{\alpha+2}\sum n(n-1)\,a_n x^{n-2} = 0
\]

The coefficient of x^α on the LHS is (α² − m²)a0 and, assuming a0 ≠ 0, you have α = ±m. The coefficient of x^{α+n} is

\[
\left\{n^2 + 2\alpha n\right\}a_n + a_{n-2}
\]

and hence you have the recursion formula

\[
a_n = \frac{-1}{n^2 + 2\alpha n}\,a_{n-2}
\]

and this determines the even numbered coefficients in the series in terms of a0.

The odd numbered coefficients, multiples of a1, are zero. For example, setting m = 0, a0 = 1, you have a power series solution for all finite x. The series is called J0(x) where

\[
J_0(x) = 1 - \frac{1}{2^2}x^2 + \frac{1}{2^2 4^2}x^4 - \frac{1}{2^2 4^2 6^2}x^6 + \cdots
\]

A second independent solution can be obtained. Because the Wronskian of two solutions is 1/x, this second solution will diverge logarithmically as x → 0.
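The recursion can be run directly; a sketch in Python (the case m = 0, a0 = 1 reproduces the J0 series just written; the evaluation points, including a point near the first zero of J0 at about 2.4048, are our choices):

```python
# Frobenius recursion for Bessel's equation with m = 0, alpha = 0:
# a_n = -a_{n-2} / (n^2 + 2*alpha*n), odd-numbered coefficients zero
def J0(x, terms=40):
    a = 1.0            # a_0
    total = a
    for n in range(2, 2 * terms, 2):
        a *= -1.0 / (n * n)        # alpha = 0
        total += a * x ** n
    return total

print(J0(0.0))         # 1.0
print(J0(2.404826))    # nearly 0: x sits at the first zero of J0
```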

Back to the Solution of the θ-Equation

Now we return to our problem

\[
\frac{d}{dz}\left\{\left(1-z^2\right)\frac{dP}{dz}\right\} + \left(\beta^2 - \frac{m^2}{1-z^2}\right)P = 0, \qquad -1 \le z \le 1
\]

where m = ..., −2, −1, 0, 1, 2, ..., and observe that it has regular singular points at z = ±1. Because our aim is to expand P in a power series in z about z = 0 and to make use of the condition that P must be bounded, we need to determine what is going on near z = ±1 by investigating the indicial equation at each of these points. To see what happens at z = +1 we introduce x = 1 − z and write R(x) = P(z), translating the singular point to x = 0. Then our equation is

\[
\frac{d}{dx}\left\{x(2-x)\frac{dR}{dx}\right\} + \left(\beta^2 - \frac{m^2}{x(2-x)}\right)R = 0
\]
and substituting

\[
R = x^{\alpha}\sum a_n x^n, \qquad a_0 \neq 0
\]

into

\[
x(2-x)\left\{x(2-x)\frac{d^2R}{dx^2} + (2-2x)\frac{dR}{dx}\right\} + \left\{x(2-x)\,\beta^2 - m^2\right\}R = 0
\]

we find the coefficient of x^α on the left hand side to be

\[
\left(4\alpha^2 - m^2\right)a_0
\]

and for this to be zero when a0 is not zero, we have

\[
\alpha^2 = \frac{1}{4}m^2 \qquad\text{or}\qquad \alpha = \pm\frac{|m|}{2}
\]

Likewise at z = −1 we find

\[
\alpha = \pm\frac{|m|}{2}
\]

Because we are looking for bounded solutions on −1 ≤ z ≤ 1 we discard the factors (1 − z)^{−|m|/2} and (1 + z)^{−|m|/2} and assume that P(z) can be written

\[
P(z) = (1-z)^{|m|/2}(1+z)^{|m|/2}\,G(z) = \left(1-z^2\right)^{|m|/2}G(z)
\]

where G(z) is a power series in z about z = 0. The differential equation satisfied by G is then

\[
\left(1-z^2\right)G'' - 2\left\{|m|+1\right\}zG' + \left[\beta^2 - |m|\left\{|m|+1\right\}\right]G = 0
\]

The points z = ±1 remain regular singular points but the corresponding indicial equations require α = 0. And so we look for a solution in the form of an ordinary power series about z = 0 where the singular points z = ±1 bound the interval of interest. Writing

\[
G = \sum a_n z^n, \qquad G' = \sum n\,a_n z^{n-1}, \qquad G'' = \sum n(n-1)\,a_n z^{n-2}
\]

and substituting into the equation satisfied by G we get

\[
\sum n(n-1)\,a_n z^{n-2} - \sum n(n-1)\,a_n z^{n} - 2\left\{|m|+1\right\}\sum n\,a_n z^{n} + \left[\beta^2 - |m|\left\{|m|+1\right\}\right]\sum a_n z^{n} = 0
\]

Observing that

\[
\sum_{n=0}^{\infty} n(n-1)\,a_n z^{n-2} = \sum_{n=0}^{\infty}(n+2)(n+1)\,a_{n+2}\,z^{n}
\]

and setting the coefficient of z^n on the left hand side to zero we discover the two term recursion formula

\[
a_{n+2} = -\frac{\beta^2 - |m|\{|m|+1\} - 2n\{|m|+1\} - n(n-1)}{(n+2)(n+1)}\,a_n
= \frac{(n+|m|)(n+|m|+1) - \beta^2}{(n+2)(n+1)}\,a_n
\]

in which a2, a4, ... can be determined sequentially once a0 is assigned and a3, a5, ... can be determined sequentially once a1 is assigned. The simplest way to obtain two independent solutions is to take them to be the even series corresponding to a0 = 1, a1 = 0 and the odd series corresponding to a0 = 0, a1 = 1. So for any fixed values of |m| and β² we have two series, one an even function of z, the other an odd function of z. Now as Copson explains in his book "An Introduction to the Theory of Functions of a Complex Variable," the radius of convergence of a power series Σ an z^n is 1/lim |an|^{1/n}. For both our series this is 1 and so for any fixed values of |m| and β² both our even and our odd series converge for all z such that |z| < 1. But both series diverge when z = ±1.

So, for fixed values of | m| and β 2 we ordinarily do not get a bounded solution on −1 ≤ z ≤ 1.
But for fixed values of | m| there is the possibility that special values of β 2 lead to a bounded
solution. These values of β 2 are those that make one series or the other terminate in a finite number
of terms. For each fixed value of | m| we can select a sequence of values of β 2 that terminate the
even series in 1, 2, . . . terms and a sequence of values of β 2 that terminate the odd series in 1, 2, . . .
terms. To each such value of β 2 one series terminates in a polynomial, while the other series does
not terminate and is discarded. (The boundedness condition works here just like it did in cylindrical
coordinates where we discarded a solution to Bessel’s equation on the same grounds.)

So for each fixed value of |m| we get an infinite sequence of polynomial solutions. The polynomial whose highest power is z^ν, ν = 0, 1, 2, ..., corresponds to

\[
\beta^2 = (\nu+|m|)(\nu+|m|+1)
\]

For each even value of ν the even series terminates in a polynomial and we discard the divergent
odd series; for each odd value of ν the odd series terminates in a polynomial and we discard the
divergent even series.

Again, to each fixed value of | m| there corresponds to each value of β 2 such that

β 2 = (ν + | m|) (ν + | m| + 1) , ν = 0, 1, 2, . . .

a polynomial solution to the equation for G whose highest power is z ν , and it is odd or even as ν is
odd or even. The polynomial is completely determined by the recursion formula up to a constant

factor. If we write

ℓ = ν + | m|

then

β 2 = ℓ (ℓ + 1)

and, due to ν = 0, 1, . . ., we have

ℓ = | m| , (| m| + 1) , (| m| + 2) , . . .

The polynomial is of degree ℓ − | m|.

Going back to the equation for P (z), and hence to the equation for Θ (θ), we see that to each
fixed value of | m|, | m| = 0, 1, 2, . . ., we have determined a sequence of eigenvalues β 2 , where

β 2 = ℓ (ℓ + 1)

and where

ℓ = | m| + ν, ν = 0, 1, 2, . . .

To each such eigenvalue we have one independent eigenfunction and it is of the form

\[
P(z) = \left(1-z^2\right)^{|m|/2}\times\left(\text{an even or odd polynomial in } z \text{ of degree } \nu = \ell - |m|\right)
\]

To each value of |m| and to each ℓ the polynomial is defined by the recursion formula

\[
a_{\nu+2} = \frac{(\nu+|m|)(\nu+|m|+1) - \ell(\ell+1)}{(\nu+1)(\nu+2)}\,a_\nu
\]

ℓ = |m|, (|m| + 1), ..., whence ℓ = |m| leads to a0, ℓ = |m| + 1 leads to a1 z, ℓ = |m| + 2 leads to a0 + a2 z², etc.
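The termination of one series at the special values of β² is easy to watch; a sketch in Python (the choices |m| = 2, ℓ = 4, i.e. β² = 20, and the even series a0 = 1, a1 = 0 are ours) shows every coefficient from a4 on vanishing, leaving the polynomial 1 − 7z²:

```python
def series_coeffs(m_abs, beta2, a0=1.0, a1=0.0, N=12):
    # two-term recursion a_{n+2} = [(n+|m|)(n+|m|+1) - beta^2]/[(n+2)(n+1)] a_n
    a = [a0, a1]
    for n in range(N - 2):
        a.append(((n + m_abs) * (n + m_abs + 1) - beta2) / ((n + 2) * (n + 1)) * a[n])
    return a

l, m_abs = 4, 2                            # even series, nu = l - |m| = 2
print(series_coeffs(m_abs, l * (l + 1)))   # a2 = -7, zeros from a4 on: a polynomial
print(series_coeffs(m_abs, 11.5))          # a generic beta^2 never terminates
```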

We can write out these eigenfunctions using the recursion formula for the coefficients in the polynomials, but there is a better way of presenting our results. Before we do this we establish the orthogonality of two eigenfunctions corresponding to different eigenvalues. Let |m| be fixed and let P(z; ℓ, m) denote the bounded solution of

\[
\frac{d}{dz}\left\{\left(1-z^2\right)\frac{dP}{dz}\right\} + \left\{\ell(\ell+1) - \frac{m^2}{1-z^2}\right\}P = 0
\]

corresponding to ℓ = |m|, (|m| + 1), .... Then as

\[
\int_{-1}^{+1} P\,\frac{d}{dz}\left\{\left(1-z^2\right)\frac{dQ}{dz}\right\}dz - \int_{-1}^{+1} Q\,\frac{d}{dz}\left\{\left(1-z^2\right)\frac{dP}{dz}\right\}dz
= \left[\left(1-z^2\right)\left\{P\frac{dQ}{dz} - Q\frac{dP}{dz}\right\}\right]_{-1}^{+1} = 0
\]

for any smooth functions P and Q bounded on −1 ≤ z ≤ 1, we see that

\[
\int_{-1}^{+1} P(z;\ell,m)\,P(z;\ell',m)\,dz = 0
\]

where ℓ and ℓ′ are distinct values taken from the sequence | m|, (| m| + 1), (| m| + 2) , . . .

We emphasize the fact that to each value of |m| there corresponds an infinite set of polynomials. And the sets of polynomials differ as |m| differs. The orthogonality is a condition satisfied by the polynomials in each one of these sets.

We return to our problem as originally written and put |m| = 0. Then it is

\[
\frac{d}{dz}\left\{\left(1-z^2\right)\frac{dP}{dz}\right\} + \beta^2 P = 0
\]

and its bounded solutions on −1 ≤ z ≤ +1 are even or odd polynomials of degree ℓ, ℓ = 0, 1, 2, ..., corresponding to β² = ℓ(ℓ + 1). Now the Legendre polynomials P0, P1, P2, ..., introduced earlier, are even or odd polynomials of degree ℓ, ℓ = 0, 1, 2, ..., satisfying

\[
\frac{d}{dz}\left\{\left(1-z^2\right)\frac{dP}{dz}\right\} + \ell(\ell+1)P = 0
\]

and as polynomial solutions of a fixed degree of this equation are unique up to a constant factor we can take the solutions to our problem for |m| = 0 to be

\[
P_\ell(z), \qquad \ell = 0, 1, 2, \ldots
\]

This we do henceforth.

We can use the Legendre polynomials to define the associated Legendre functions of degree ℓ
and order | m| via

\[
P_\ell^{|m|} = \left(1-z^2\right)^{|m|/2}\frac{d^{|m|}}{dz^{|m|}}P_\ell(z)
\]

where |m| = 0, 1, 2, ..., where, to each value of |m|, ℓ = |m|, (|m| + 1), (|m| + 2), ..., and where, as Pℓ(z) is an even or odd polynomial of degree ℓ, d^{|m|}Pℓ(z)/dz^{|m|} is an even or odd polynomial of degree ℓ − |m|. Then by differentiating
 
\[
\frac{d}{dz}\left\{\left(1-z^2\right)\frac{d}{dz}P_\ell(z)\right\} + \ell(\ell+1)P_\ell(z) = 0
\]

|m| times we get

\[
\left(1-z^2\right)\frac{d^{|m|+2}}{dz^{|m|+2}}P_\ell(z) - 2\left\{|m|+1\right\}z\,\frac{d^{|m|+1}}{dz^{|m|+1}}P_\ell(z) + \left[\ell(\ell+1) - |m|\left\{|m|+1\right\}\right]\frac{d^{|m|}}{dz^{|m|}}P_\ell(z) = 0
\]

and using Pℓ^{|m|} = (1 − z²)^{|m|/2} d^{|m|}Pℓ(z)/dz^{|m|} we find

\[
\left(1-z^2\right)\frac{d^2P_\ell^{|m|}}{dz^2} - 2z\frac{dP_\ell^{|m|}}{dz} + \left\{\ell(\ell+1) - \frac{m^2}{1-z^2}\right\}P_\ell^{|m|} = 0
\]

and we see that the associated Legendre functions are bounded solutions of our differential equation, viz.,

\[
\frac{1}{\sin\theta}\frac{d}{d\theta}\left(\sin\theta\,\frac{d\Theta}{d\theta}\right) + \left\{\ell(\ell+1) - \frac{m^2}{\sin^2\theta}\right\}\Theta = 0
\]

where z = cos θ and P(z) = Θ(θ). And as Pℓ^{|m|}(z) is (1 − z²)^{|m|/2} times an even or odd polynomial of degree ℓ − |m|, the associated Legendre functions must be constant multiples of the solutions we determined by Frobenius' method.

What we find then is this: bounded solutions of

\[
\frac{d}{dz}\left\{\left(1-z^2\right)\frac{dP}{dz}\right\} + \left(\beta^2 - \frac{m^2}{1-z^2}\right)P = 0
\]

on the interval −1 ≤ z ≤ +1 corresponding to |m| = 0, 1, 2, ... can be found for β² = ℓ(ℓ + 1), ℓ = |m|, (|m| + 1), ..., and can be written

\[
P_\ell^{|m|} = \left(1-z^2\right)^{|m|/2}\frac{d^{|m|}}{dz^{|m|}}P_\ell(z)
\]

where Pℓ (z), ℓ = 0, 1, 2, . . . are the Legendre polynomials. Indeed all that we may wish to know
about the associated Legendre functions can be determined from the Legendre polynomials. For
instance the integrals of their squares,

\[
\int_{-1}^{+1}\left\{P_\ell^{|m|}(z)\right\}^2 dz
\]

required for normalization, can be found, after some work, to be

\[
\frac{2}{2\ell+1}\,\frac{\{\ell+|m|\}!}{\{\ell-|m|\}!}
\]

This establishes the angular eigenfunctions of ∇². So, while the radial parts of the eigenfunctions of ∇² remain to be determined, we now have a complete picture of the angular part. Defining Yℓm(θ, φ) by

\[
Y_{\ell m}(\theta,\phi) = \sqrt{\frac{2\ell+1}{2}\,\frac{\{\ell-|m|\}!}{\{\ell+|m|\}!}}\;P_\ell^{|m|}(\cos\theta)\,\Phi_m(\phi)
\]

\[
m = \ldots, -2, -1, 0, 1, 2, \ldots \qquad \ell = |m|,\ (|m|+1),\ \ldots
\]

we have a complete set of orthogonal functions defined on the surface of a sphere. The orthogonality works like this. Because

\[
\int_0^{2\pi}\!\!\int_0^{\pi}\overline{Y}_{\ell m}\,Y_{\ell'm'}\,\sin\theta\,d\theta\,d\phi
= \sqrt{\ell,m}\,\sqrt{\ell',m'}\int_0^{\pi}P_\ell^{|m|}(\cos\theta)\,P_{\ell'}^{|m'|}(\cos\theta)\,\sin\theta\,d\theta\int_0^{2\pi}\overline{\Phi}_m(\phi)\,\Phi_{m'}(\phi)\,d\phi
\]
\[
= \sqrt{\ell,m}\,\sqrt{\ell',m'}\int_{-1}^{+1}P_\ell^{|m|}(z)\,P_{\ell'}^{|m'|}(z)\,dz\int_0^{2\pi}\overline{\Phi}_m(\phi)\,\Phi_{m'}(\phi)\,d\phi
\]

where

\[
\sqrt{\ell,m} = \sqrt{\frac{2\ell+1}{2}\,\frac{\{\ell-|m|\}!}{\{\ell+|m|\}!}}
\qquad\text{and}\qquad
\sqrt{\ell',m'} = \sqrt{\frac{2\ell'+1}{2}\,\frac{\{\ell'-|m'|\}!}{\{\ell'+|m'|\}!}}
\]

we look at the second integral first. It is either 1 or 0 as m is or is not equal to m′. Only if m is equal to m′ need we look at the first integral. It is either 1 or 0 as ℓ is or is not equal to ℓ′.

The functions Yℓm are called spherical harmonics and as they are eigenfunctions of ∇² on the surface of a sphere, viz.,

\[
\nabla^2 Y_{\ell m} = -\frac{\ell(\ell+1)}{r^2}\,Y_{\ell m}
\]

we can use them to solve problems there.
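The eigenvalue relation on the surface of the sphere can be checked by finite differences; a sketch (the axisymmetric harmonic P2(cos θ), the sample point θ = 1, and the step h are our choices, and for m = 0 the angular operator reduces to (1/sin θ) d/dθ (sin θ d/dθ)):

```python
import math

def f(theta):                      # P2(cos theta), an l = 2, m = 0 harmonic
    c = math.cos(theta)
    return 0.5 * (3 * c * c - 1)

def angular_laplacian(g, theta, h=1e-4):
    # (1/sin) d/dtheta ( sin theta dg/dtheta ), by nested central differences
    def flux(t):
        return math.sin(t) * (g(t + h) - g(t - h)) / (2 * h)
    return (flux(theta + h) - flux(theta - h)) / (2 * h) / math.sin(theta)

theta = 1.0
ratio = angular_laplacian(f, theta) / f(theta)
print(ratio)    # close to -6 = -l(l+1) for l = 2
```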

For instance if a solute is distributed over the surface of a sphere at t = 0 according to c(t = 0) and then comes to equilibrium by diffusion, to determine c(t > 0) we must solve

\[
\frac{\partial c}{\partial t} = \nabla^2 c, \qquad 0 \le \phi \le 2\pi, \quad 0 \le \theta \le \pi
\]

where c(t = 0) is assigned and c is required to be periodic in φ and bounded in θ. Then c(t > 0) is given by

\[
c(\theta,\phi,t) = \sum_{m=-\infty}^{+\infty}\ \sum_{\ell=|m|}^{\infty} c_{\ell m}\,Y_{\ell m}(\theta,\phi)\,e^{-\ell(\ell+1)t}
\]

where

\[
c_{\ell m} = \int_0^{2\pi}\!\!\int_0^{\pi}\overline{Y}_{\ell m}(\theta,\phi)\,c(t=0)\,\sin\theta\,d\theta\,d\phi
\]

and where distance and time are scaled so that the radius of the sphere is one unit of length and the diffusivity is one unit of length²/time. The reader can determine that c(t > 0) is independent of φ for all t > 0 if c(t = 0) is independent of φ and then go on and investigate the long time dependence of c on θ for arbitrary c(t = 0).
The double sum \(\sum_{m=-\infty}^{+\infty}\sum_{\ell=|m|}^{\infty}\) is often written \(\sum_{\ell=0}^{\infty}\sum_{m=-\ell}^{\ell}\), where a given value of ℓ corresponds to 2ℓ + 1 values of m, viz., the values m = −ℓ, −ℓ + 1, ..., ℓ − 1, ℓ. This is sometimes important. Indeed Schrödinger's wave equation for the stationary states of an electron moving relative to a nucleus of charge Ze to which it is bound is

\[
\nabla^2\psi + \frac{8\pi^2\mu}{h^2}\{E - V\}\psi = 0
\]

where E is the energy of the electron, assuming the nucleus is fixed, µ = m1 m2/(m1 + m2) is the reduced mass, m1 and m2 being the masses of the particles, and r, θ and φ are the spherical coordinates of the electron taking the nucleus to be at the origin of our coordinate system. Both particles are
of the electron taking the nucleus to be at the origin of our coordinate system. Both particles are
assumed to be point particles and the potential energy is that due to the Coulombic attraction of
their electric charges, i.e.,

\[
V = -\frac{Ze^2}{r}
\]

This is Coulomb's law in unrationalized electrostatic units.

The solutions to this eigenvalue problem are then of the form

ψ (r, θ, φ) = R (r) Yℓ m (θ, φ)

where R(r) satisfies

\[
\frac{1}{r^2}\frac{d}{dr}\left(r^2\frac{dR}{dr}\right) + \left\{\frac{8\pi^2\mu}{h^2}\left[E - V(r)\right] - \frac{\ell(\ell+1)}{r^2}\right\}R = 0
\]

where m = 0, ±1, ±2, ... and ℓ = |m|, (|m| + 1), (|m| + 2), .... Thus a fundamental problem in atomic and molecular physics is reduced to the solution of a one dimensional eigenvalue problem.

The eigenvalues are independent of m. So, corresponding to ℓ = 0, we have m = 0 and one


eigenfunction, corresponding to ℓ = 1, we have m = −1, 0, 1 and three eigenfunctions, etc.

The Radial Part of the Eigenfunctions

The eigenfunctions of ∇2 in spherical coordinates, periodic in φ and bounded in θ, take the form

Ψ = R (r) Yℓ m (θ, φ)

where R(r) satisfies

\[
\frac{1}{r^2}\frac{d}{dr}\left(r^2\frac{dR}{dr}\right) + \left(\lambda^2 - \frac{\ell(\ell+1)}{r^2}\right)R = 0
\]

and where m = 0, ±1, ±2, ... and ℓ = |m|, (|m| + 1), (|m| + 2), .... This equation can be reduced to Bessel's equation and solved in terms of Bessel functions. We observe that if ℓ = 0 two
reduced to Bessel’s equation and solved in terms of Bessel functions. We observe that if ℓ = 0 two
independent solutions are

\[
\frac{\sin\lambda r}{\lambda r} \qquad\text{and}\qquad \frac{\cos\lambda r}{\lambda r}
\]

where the first, but not the second, is bounded as r → 0. Likewise for each value of ℓ, ℓ = 0, 1, 2, ..., two independent solutions can be written in terms of sin λr, cos λr and powers of λr. These can be used to solve diffusion problems in a sphere when c(t = 0) and all other sources of solute are assigned. The reader can determine that

\[
\frac{\sin\lambda r}{(\lambda r)^2} - \frac{\cos\lambda r}{\lambda r}
\]

is the bounded solution when ℓ = 1.
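That this function really satisfies the radial equation can be verified by finite differences; a sketch (λ = 1, the test radius r = 2, and the step size are our choices):

```python
import math

def R(r):        # candidate bounded solution for l = 1 (lambda = 1)
    return math.sin(r) / r**2 - math.cos(r) / r

def residual(r, h=1e-4):
    # (1/r^2) d/dr (r^2 dR/dr) + (1 - 2/r^2) R, derivatives by central differences
    def flux(s):
        return s * s * (R(s + h) - R(s - h)) / (2 * h)
    lap = (flux(r + h) - flux(r - h)) / (2 * h) / r**2
    return lap + (1 - 2 / r**2) * R(r)

print(residual(2.0))   # nearly zero
```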

Solutions to ∇2c = 0, Laplace's Equation, and ∇2c + Q = 0, Poisson's Equation, in Spherical Coordinates

It may be worth a sentence or two to explain that as

\[
\nabla^2 r^\ell = \frac{1}{r^2}\,\ell(\ell+1)\,r^\ell
\qquad\text{and}\qquad
\nabla^2\frac{1}{r^{\ell+1}} = \frac{1}{r^2}\,\ell(\ell+1)\,\frac{1}{r^{\ell+1}}
\]

the function

\[
\left\{A_{\ell m}\,r^\ell + B_{\ell m}\,\frac{1}{r^{\ell+1}}\right\}Y_{\ell m}(\theta,\phi)
\]

satisfies Laplace's equation.

To solve Laplace’s equation in spherical coordinates we introduce the inner product:


Z 2π Z π
h f, g i = f g sin θ dθdφ
0 0
LECTURE 20. EIGENVALUES AND EIGENFUNCTIONS OF ∇2 590

Then for all f and g periodic in φ and bounded in θ we can derive

h f, ∇2g i = h ∇2f, g i

Hence to solve

\[
\nabla^2 c = 0, \qquad R_1 \le r \le R_2, \quad 0 \le \theta \le \pi, \quad 0 \le \phi \le 2\pi
\]

where physical conditions are assigned on the spheres r = R1 and r = R2 we write

\[
c = \sum\sum c_{\ell m}(r)\,Y_{\ell m}(\theta,\phi)
\]

where the radial part of the solution is denoted cℓm(r) and where

\[
c_{\ell m}(r) = \langle Y_{\ell m}, c\rangle
\]

The equation satisfied by cℓm(r) is then obtained by multiplying Laplace's equation by Ȳℓm and integrating over θ and φ, viz.,

\[
0 = \langle Y_{\ell m}, \nabla^2 c\rangle
= \frac{1}{r^2}\frac{d}{dr}r^2\frac{d}{dr}\langle Y_{\ell m}, c\rangle + \langle Y_{\ell m}, \nabla^2_{\theta,\phi}c\rangle
= \frac{1}{r^2}\frac{d}{dr}r^2\frac{d}{dr}\langle Y_{\ell m}, c\rangle + \langle\nabla^2_{\theta,\phi}Y_{\ell m}, c\rangle
\]

This leads to the differential equation

\[
\frac{1}{r^2}\frac{d}{dr}r^2\frac{d}{dr}\langle Y_{\ell m}, c\rangle - \frac{\ell(\ell+1)}{r^2}\langle Y_{\ell m}, c\rangle = 0
\]

and, using

\[
\langle Y_{\ell m}, c\rangle = A_{\ell m}\,r^\ell + \frac{B_{\ell m}}{r^{\ell+1}}
\]

we get

\[
c = \sum\sum\left\{A_{\ell m}\,r^\ell + \frac{B_{\ell m}}{r^{\ell+1}}\right\}Y_{\ell m}(\theta,\phi)
\]

where Aℓm and Bℓm can be determined using the conditions assigned on r = R1 and r = R2. This
will work in solving Poisson’s equation as well, viz.,

\[
0 = \nabla^2 c + Q(r,\theta,\phi)
\]

The differential equation determining the radial part of the solution is then

\[
\frac{1}{r^2}\frac{d}{dr}r^2\frac{d}{dr}\langle Y_{\ell m}, c\rangle - \frac{\ell(\ell+1)}{r^2}\langle Y_{\ell m}, c\rangle + \langle Y_{\ell m}, Q\rangle = 0
\]

To do any other problem, where, for instance, c is driven by an initial condition or by an initial condition and a volume source, brings us back to

\[
\nabla^2\psi + \lambda^2\psi = 0
\]

where

\[
\left(\nabla^2 + \lambda^2\right)R(r)\,Y_{\ell m} = \left\{\frac{1}{r^2}\frac{d}{dr}\left(r^2\frac{dR}{dr}\right) + \left(\lambda^2 - \frac{\ell(\ell+1)}{r^2}\right)R\right\}Y_{\ell m}
\]

and where a homogeneous condition must be satisfied by ψ and hence by R on the surface of a sphere (or on two spheres) corresponding to a physical condition imposed on c there. The angular parts of this, the Yℓm's, are now known; only the radial part remains to be determined, viz., the solutions to

\[
\frac{1}{r^2}\frac{d}{dr}\left(r^2\frac{dR}{dr}\right) + \left(\lambda^2 - \frac{\ell(\ell+1)}{r^2}\right)R = 0
\]

and this can be done by extending the observations on page 589 to ℓ = 2, 3, . . ..

We notice that λ2 depends on ℓ but not on m.
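In the Laplace problem, for each mode the assigned boundary values give a 2 × 2 linear system for Aℓm and Bℓm; a sketch (the shell R1 = 1, R2 = 2, the mode ℓ = 1, and the boundary values ⟨Yℓm, c⟩ = 1 at r = R1 and 0 at r = R2 are our choices):

```python
# solve A R^l + B / R^(l+1) = assigned value at r = R1 and at r = R2
def shell_coeffs(l, R1, R2, c1, c2):
    # 2x2 system by Cramer's rule
    a11, a12 = R1**l, R1**-(l + 1)
    a21, a22 = R2**l, R2**-(l + 1)
    det = a11 * a22 - a12 * a21
    A = (c1 * a22 - a12 * c2) / det
    B = (a11 * c2 - c1 * a21) / det
    return A, B

A, B = shell_coeffs(1, 1.0, 2.0, 1.0, 0.0)
# the radial profile A r + B / r^2 then interpolates the boundary values
print(A * 1.0 + B / 1.0**2, A * 2.0 + B / 2.0**2)   # recovers 1 and 0, up to rounding
```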



20.4 Small Amplitude Oscillations of an Inviscid Sphere

We wish to find the frequencies of small amplitude oscillations of a sphere of inviscid fluid.

The radius of the sphere is denoted R0 and we introduce a displacement so that its surface is
r = R (θ, φ, t) .

We have

\[
\rho\,\frac{\partial\vec v}{\partial t} = -\nabla p, \qquad \nabla\cdot\vec v = 0
\]

and hence

\[
\nabla^2 p = 0
\]

Ignoring the effect of the outside fluid we have at r = R(θ, φ, t)

\[
p = -\gamma\,2H
\]

and

\[
v_r - \frac{R_\theta}{R}\,v_\theta - \frac{R_\phi}{R\sin\theta}\,v_\phi = \frac{\partial R}{\partial t}
\]

Now introducing a small displacement of the rest state, viz., the state p0 = 2γ/R0, \(\vec v_0 = \vec 0\), we write

\[
R(\theta,\phi,t) = R_0 + \varepsilon R_1(\theta,\phi,t)
\]

Then to order ε we have

\[
v_r = \varepsilon v_{r1}, \qquad p = p_0 + \varepsilon p_1
\]

and

\[
2H = -\frac{2}{R_0} + \varepsilon\left(\frac{2}{R_0^2} + \frac{1}{R_0^2}\nabla^2_{\theta\phi}\right)R_1
\]

where

\[
\nabla^2_{\theta\phi} = \frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\sin\theta\frac{\partial}{\partial\theta} + \frac{1}{\sin^2\theta}\frac{\partial^2}{\partial\phi^2}
\]

Thus our small amplitude equations are

\[
\rho\,\frac{\partial v_{r1}}{\partial t} = -\frac{\partial p_1}{\partial r}
\]
\[
\nabla^2 p_1 = 0
\]
\[
v_{r1} = \frac{\partial R_1}{\partial t} \quad\text{at } r = R_0
\]

and

\[
p_1 = -\frac{\gamma}{R_0^2}\left(2 + \nabla^2_{\theta\phi}\right)R_1 \quad\text{at } r = R_0
\]

and we have solutions

\[
v_{r1} = \hat v_{r1}(r)\,Y_{\ell m}(\theta,\phi)\,e^{\sigma t}, \qquad
p_1 = \hat p_1(r)\,Y_{\ell m}(\theta,\phi)\,e^{\sigma t}, \qquad
R_1 = \hat R_1\,Y_{\ell m}(\theta,\phi)\,e^{\sigma t}
\]

where the amplitudes \(\hat v_{r1}\), \(\hat p_1\) and \(\hat R_1\) satisfy

\[
\rho\,\sigma\,\hat v_{r1} = -\frac{d\hat p_1}{dr}
\]
\[
\frac{1}{r^2}\frac{d}{dr}\left(r^2\frac{d\hat p_1}{dr}\right) - \frac{\ell(\ell+1)}{r^2}\,\hat p_1 = 0
\]
\[
\hat v_{r1} = \sigma\,\hat R_1 \quad\text{at } r = R_0
\]

and

\[
\hat p_1 = -\frac{\gamma}{R_0^2}\left\{2 - \ell(\ell+1)\right\}\hat R_1 \quad\text{at } r = R_0
\]

and we see that

$$\hat{p}_1 = A\, r^{\ell}$$

where

$$\ell = |m|,\ |m|+1,\ \ldots$$

Thus we obtain

$$\sigma^2 = -\frac{\gamma}{\rho R_0^3}\,\ell\,(\ell-1)(\ell+2)$$

Now to order ε the volume of the sphere is


$$V_0 + \varepsilon\, R_0^2\, \hat{R}_1\, e^{\sigma t} \int_0^{2\pi}\!\!\int_0^{\pi} Y_{\ell m}(\theta,\phi)\,\sin\theta\, d\theta\, d\phi$$

and for | m| = 1, 2, . . . , ℓ ≥ | m|, the integral is zero. But m = 0, ℓ = 0 must be ruled out assuming
the volume of the sphere remains fixed on perturbation.
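The frequency formula above is easy to evaluate. A minimal sketch for a water drop (the function name and the water property values are our own, not the text's); with $\sigma = i\omega$ the motion is oscillatory:

```python
import math

def drop_omega(ell, gamma, rho, R0):
    """Oscillation frequency of mode ell of an inviscid drop, from
    sigma^2 = -(gamma / (rho R0^3)) ell (ell - 1)(ell + 2);
    with sigma = i*omega, omega is real and non-negative."""
    omega_sq = gamma / (rho * R0**3) * ell * (ell - 1) * (ell + 2)
    return math.sqrt(omega_sq)

# A water drop of radius 1 mm (gamma ~ 0.072 N/m, rho ~ 1000 kg/m^3):
# ell = 0 and ell = 1 give omega = 0 (volume change is ruled out and
# translation is neutral); the fundamental ell = 2 mode rings at about 121 Hz.
print(drop_omega(2, 0.072, 1000.0, 1.0e-3) / (2.0 * math.pi), "Hz")
```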

But what about ℓ = 1 which is possible at m = 0 and | m| = 1? Suppose we have a sphere of


radius R0 centered at {ε, 0, 0} and we wish to write this

r = R (θ, φ) = R0 + εR1 (θ, φ)



Then, substituting

x = R sin θ cos φ

y = R sin θ sin φ

z = R cos θ

R = R0 + ε R1

into

(x − ε)2 + y 2 + z 2 = R02

we find

R1 = sin θ cos φ

Likewise if a sphere of radius R0 is centered at (0, ε, 0) we have

R1 = sin θ sin φ

and if our sphere is centered at (0, 0, ε) we find R1 = cos θ. Now sin θ cos φ, sin θ sin φ and cos θ
are eigenfunctions of ∇2θ,φ corresponding to | m| = 1, ℓ = 1 and m = 0, ℓ = 1. Hence all of the
small displacements at ℓ = 1 correspond simply to moving the sphere off center and hence lead to
σ = 0.

20.5 The Solution to Poisson’s Equation

Coulomb’s law tells us that the electrostatic potential at the field point ~r due to electric charges qi
concentrated at the source points ~ri is

$$\phi(\vec{r}) = \frac{1}{4\pi\varepsilon_0}\sum_i \frac{q_i}{|\vec{r}-\vec{r}_i|}$$

If the charge is distributed continuously, instead of discretely, due to an assigned charge density ρ,
then the electrostatic potential is
$$\phi(\vec{r}) = \frac{1}{4\pi\varepsilon_0}\iiint \frac{\rho(\vec{r}\,')}{|\vec{r}-\vec{r}\,'|}\, dV'$$

where the source at ~r ′ is ρ (~r ′ ) dV ′ and the sum over point charges is replaced by an integral over
the charge density.
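The discrete Coulomb sum can be evaluated directly. A minimal sketch (the function name is our own; the permittivity value is the standard SI constant):

```python
import math

EPS0 = 8.8541878128e-12  # vacuum permittivity, F/m

def potential(r, charges):
    """Electrostatic potential at field point r:
    phi(r) = (1 / 4 pi eps0) * sum_i q_i / |r - r_i|.
    `charges` is a list of (q_i, r_i) pairs, points are 3-tuples."""
    total = 0.0
    for q, ri in charges:
        total += q / math.dist(r, ri)
    return total / (4.0 * math.pi * EPS0)

# a unit charge at the origin, seen 1 m away:
print(potential((1.0, 0.0, 0.0), [(1.0, (0.0, 0.0, 0.0))]))
```

Because the sum is linear in the charges, the potential of a collection is the sum of the potentials of its members, which is exactly the superposition used to pass to the integral.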

This formula is the solution to Poisson’s equation

∇2 φ = −ρ/ε0

To see this we write our second integration by parts formula in the form

$$\iiint_V \phi\,\nabla^2\psi\, dV = \iint_{S_2} dA\,\vec{n}\cdot\{\phi\nabla\psi - \psi\nabla\phi\} + \iint_{S_1} dA\,\vec{n}\cdot\{\phi\nabla\psi - \psi\nabla\phi\} + \iiint_V \psi\,\nabla^2\phi\, dV$$

where $S_1$ and $S_2$ bound a region $V$. To use this we assume the origin lies inside $S_1$, which in turn lies inside $S_2$. Then setting $\psi = \dfrac{1}{r}$, so that $\nabla\psi = -\dfrac{1}{r^2}\,\vec{i}_r$ and $\nabla^2\psi = 0$, we get

$$0 = -\iint_{S_2} dA\,\vec{n}\cdot\left\{\frac{\phi}{r^2}\,\vec{i}_r + \frac{1}{r}\nabla\phi\right\} - \iint_{S_1} dA\,\vec{n}\cdot\left\{\frac{\phi}{r^2}\,\vec{i}_r + \frac{1}{r}\nabla\phi\right\} + \iiint_V \frac{1}{r}\,\nabla^2\phi\, dV$$

We now let $S_2$ be a sphere of very large radius and we let $S_1$ be a sphere of radius $\varepsilon$ where $\varepsilon \to 0$. Then by requiring that $\phi \to 0$ at least as fast as $\dfrac{1}{r}$ as $r \to \infty$, the first term on the right hand side vanishes, and by requiring that $\phi$ and $\nabla\phi$ remain bounded as $r \to 0$, the second reduces to $4\pi\phi(\vec{0})$. Using this our integration by parts formula simplifies to

$$\phi(\vec{0}) = \frac{1}{4\pi}\iiint_V \frac{-\nabla^2\phi}{r}\, dV$$

and so, if $\phi$ satisfies Poisson's equation, $\nabla^2\phi = -\dfrac{\rho}{\varepsilon_0}$, we get

$$\phi(\vec{0}) = \frac{1}{4\pi\varepsilon_0}\iiint_V \frac{\rho(\vec{r})}{r}\, dV$$

which establishes the formula written earlier.

By doing this we have discovered the Green's function for the operator $\nabla^2$ when its domain is the set of functions defined throughout all space, required to vanish as $r \to \infty$ at least as fast as $\dfrac{1}{r}$.
Indeed we observe that
$$\iint_S dA\,\vec{n}\cdot\nabla\!\left(\frac{1}{r}\right) = \iiint_V \nabla^2\!\left(\frac{1}{r}\right) dV = 0$$

whenever S is the complete surface bounding any region V that does not include the origin. Then
when S is any surface enclosing the origin we conclude that
$$\iint_S dA\,\vec{n}\cdot\nabla\!\left(\frac{1}{r}\right) = -\iint_{S_\varepsilon} dA\,\vec{n}\cdot\nabla\!\left(\frac{1}{r}\right) = -4\pi$$

where $S_\varepsilon$ is a sphere of radius $\varepsilon$ and $\vec{n} = -\vec{i}_r$ thereon.

By using this we see that the function $g$, where

$$g = \frac{1}{D}\,\frac{1}{4\pi}\,\frac{1}{r}\cdot 1$$

and where the factor $1$ is written because it has physical dimensions $\dfrac{M}{T}$, satisfies

$$\nabla^2 g = 0, \quad \forall\, \vec{r} \neq \vec{0}$$

and
$$\iint_S dA\,\vec{n}\cdot\{-D\nabla g\} = 1$$

where $S$ is any surface enclosing the origin. It is therefore the concentration field resulting when a steady point source of unit strength, i.e., of strength $1$ in units $\dfrac{M}{T}$, is established at $\vec{r} = \vec{0}$ for all $t$.
T
This is the Green's function for $\nabla^2$. If a mass source is distributed via a continuous source density so that $Q(\vec{r})\, dV$ units of mass per unit of time is introduced at $\vec{r}$, then by superposition

$$c(\vec{r}) = \frac{1}{4\pi D}\iiint_V \frac{Q(\vec{r}\,')}{|\vec{r}-\vec{r}\,'|}\, dV'$$

satisfies

$$D\,\nabla^2 c + Q = 0$$

20.6 Home Problems

1. A quantum ball of mass m in a uniform gravitational field is illustrated below


[Figure: the potential V = gz rising for z > 0, a ball of mass m under gravity g, and a wall at z = 0.]

where V = gz, z > 0 and where ψ = 0 at z = 0 and ψ → 0 as z → ∞.



You are to find the energies of the ball by solving

$$\frac{d^2\psi}{dz^2} = \frac{2m}{\hbar^2}\,(gz - E)\,\psi$$

Replacing $z$ by $\alpha z$, where $\alpha^3\,\dfrac{2mg}{\hbar^2} = 1$, and setting $E = \varepsilon\,\dfrac{\hbar^2}{2m\alpha^2}$, you have

$$\frac{d^2\psi}{dz^2} = (z - \varepsilon)\,\psi$$

The solution is

ψ = CAi (z − ε) + DBi (z − ε)

where Ai (x) and Bi (x) are sketched early in Lecture 20.
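Since Bi grows without bound it must be discarded, and ψ = 0 at z = 0 then requires Ai(−ε) = 0: the scaled energies are the negated zeros of Ai. A sketch that evaluates Ai from its Maclaurin series and bisects for the first zero (the seed values Ai(0) and −Ai′(0) are standard constants; the function names are ours):

```python
def airy_ai(x, terms=60):
    """Ai(x) from its Maclaurin series Ai(x) = c1 f(x) - c2 g(x),
    where f'' = x f with f = 1 + x^3/6 + ..., and g'' = x g with
    g = x + x^4/12 + ..."""
    c1 = 0.3550280538878172   # Ai(0)
    c2 = 0.2588194037928068   # -Ai'(0)
    f_term, g_term, f, g = 1.0, x, 1.0, x
    for k in range(terms):
        f_term *= x**3 / ((3*k + 2) * (3*k + 3))
        g_term *= x**3 / ((3*k + 3) * (3*k + 4))
        f += f_term
        g += g_term
    return c1 * f - c2 * g

def first_airy_zero(lo=-3.0, hi=-2.0, tol=1e-12):
    """Bisect for the largest zero of Ai, bracketed on [-3, -2]."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if airy_ai(lo) * airy_ai(mid) <= 0.0:
            hi = mid          # the sign change is in [lo, mid]
        else:
            lo = mid
    return 0.5 * (lo + hi)

# ground state: eps_1 = -a_1, with a_1 the first zero of Ai
print(first_airy_zero())  # close to -2.33810741
```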

2. An incompressible fluid is in steady straight line flow in a long straight pipe of rectangular
cross section. Determine vz (x, y) where

$$\nabla^2 v_z = \frac{1}{\mu}\frac{dp}{dz} < 0$$

where $\dfrac{dp}{dz}$ is an input and where $v_z$ is zero at $x = \pm a$ and $y = \pm b$.
Determine Q, the volumetric flow rate, and then determine the values of a and b so that the
volumetric flow rate is greatest at a fixed cross sectional area.

3. Instead of expanding vz in the eigenfunctions of ∇2 as you might have done in Problem 2


solve

$$\nabla^2 v_z = \frac{\partial^2 v_z}{\partial x^2} + \frac{\partial^2 v_z}{\partial y^2} = a^2\,\frac{1}{\mu}\frac{\partial p}{\partial z}$$

where

vz (x = 0, 1) = 0 = vz (y = 0, 1)

by writing


$$v_z = \sum_{n=1}^{\infty} c_n(y)\,\sin n\pi x$$

and determining the functions $c_n(y)$, $n = 1, 2, \ldots$, where

$$c_n(y) = 2\int_0^1 (\sin n\pi x)\, v_z(x, y)\, dx$$

This series is the sum of products of functions of x times functions of y, unlike the
solution in Problem 2.

4. An incompressible fluid is in steady straight line flow in a long straight pipe. The axial
velocity depends on x and y and satisfies

$$\nabla^2 v_z = \frac{1}{\mu}\frac{\partial p}{\partial z}, \quad (x, y) \in A$$

and

vz = 0, (x, y) ∈ P

where $\dfrac{dp}{dz}$ is a negative constant, $A$ denotes the fixed cross section of the pipe and $P$ denotes its perimeter.

Polynomial solutions to this equation can be obtained for a variety of cross sections using

$$\nabla^2\{1,\ x,\ y\} = \{0,\ 0,\ 0\}$$

$$\nabla^2\{x^2,\ xy,\ y^2\} = \{2,\ 0,\ 2\}$$

$$\nabla^2\{x^3,\ x^2y,\ xy^2,\ y^3\} = \{6x,\ 2y,\ 2x,\ 6y\}$$

etc.

Determine the solution to the problem when the cross section of the pipe is bounded
by:

(i) the circle: $x^2 + y^2 = a^2$

(ii) the ellipse: $\dfrac{x^2}{a^2} + \dfrac{y^2}{b^2} = 1$

(iii) the equilateral triangle:

$$y = -\frac{a}{2\sqrt{3}}, \qquad y = \sqrt{3}\,x + \frac{a}{\sqrt{3}}, \qquad y = -\sqrt{3}\,x + \frac{a}{\sqrt{3}}$$

Show that for the same pressure gradient and the same area, the circular cross section
carries the greatest volumetric flow.

Show that polynomial solutions cannot be found when the cross section is a square.

5. This has to do with the nonlinear heat generation problem presented at the end of Lecture 16.

Let the region V in which a heat generating material resides be a rectangular parallelepiped of side lengths a, b and c. Its volume is then abc. Determine the eigenvalues of ∇2
in this region when the eigenfunctions are required to vanish on its boundary. Observe that
 
$$\lambda_1^2 = \pi^2\left(\frac{1}{a^2} + \frac{1}{b^2} + \frac{1}{c^2}\right)$$

Holding the volume of the region fixed show that a cube might be thought to be the most
dangerous shape. Is this result expected on physical grounds?
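A quick numerical check of the claim: among boxes of fixed volume, the cube has the smallest λ₁², so it loses the least through its first eigenmode. A sketch (the helper name and the sample box shapes are ours):

```python
import math

def lambda1_sq(a, b, c):
    """Smallest Dirichlet eigenvalue of -Laplacian on an a x b x c box:
    lambda_1^2 = pi^2 (1/a^2 + 1/b^2 + 1/c^2)."""
    return math.pi**2 * (1.0 / a**2 + 1.0 / b**2 + 1.0 / c**2)

# Boxes of unit volume: the cube minimizes lambda_1^2, making it the
# "most dangerous" shape for the heat generating material.
print(lambda1_sq(1.0, 1.0, 1.0))  # cube: 3 pi^2
print(lambda1_sq(4.0, 0.5, 0.5))  # same volume, flattened: larger
```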

6. A solvent is in straight line flow in a long straight pipe of rectangular cross section. The pipe
is aligned with the axis 0z of a Cartesian coordinate system and −a < x < a, −b < y < b,
−∞ < z < ∞.

The longitudinal velocity of the solvent, denoted vz(x, y), is known from Problem 2.

A solute is introduced at t = 0. Denote its initial distribution by c(t = 0) and determine
$V_{\text{eff}}$ and $D_{\text{eff}}$ as t grows large in terms of vz(x, y). Assume c(t = 0) is independent of x and y.

Define the inner product and the transverse average by


$$\langle u, v \rangle = \frac{1}{A}\iint_A u\,v\, dA$$

and

$$\bar{u} = \frac{1}{A}\iint_A u\, dA$$

then

$$\bar{u} = \langle 1, u \rangle$$

7. Suppose c satisfies ~n · ∇c = 0 on the boundary of a region V . Show that


$$\overline{\nabla^2 c} = \frac{1}{V}\iiint_V \nabla^2 c\, dV = 0$$

8. Two planes, one at x = 0 and the other at x = L, are held at temperature T0 . A plane at
y = 0 is held at temperature T1 > T0 .

Derive a formula for the steady temperature field in the region 0 < x < L, 0 < y < ∞.
To estimate the heat that must be supplied to establish this temperature field, find $\dfrac{\partial T}{\partial y}$ along the hot plane $y = 0$.

First differentiate the formula for T (x, y) term by term, observe that the result is a
series that diverges at y = 0 and conclude that, while the series produces T , it does not
produce everything that we might want to know about the problem.

Devise a way of determining $\dfrac{\partial T}{\partial y}\,(y = 0)$.

9. Solve the differential equation


 
$$\frac{d}{dz}\!\left[(1 - z^2)\,\frac{dP}{dz}\right] + n(n+1)\,P = 0, \quad -1 \le z \le 1$$

where n is a non-negative integer, by expanding P in powers of z:

P = a0 + a1 z + a2 z 2 + · · ·
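Substituting the series into the equation gives $a_{v+2} = a_v\,[v(v+1) - n(n+1)]/[(v+1)(v+2)]$, which terminates at $v = n$, producing a polynomial. A sketch using exact rationals (function name is ours):

```python
from fractions import Fraction

def legendre_series(n, n_terms=12):
    """Coefficients a_0, a_1, ... of a polynomial solution of Legendre's
    equation, from a_{v+2} = a_v (v(v+1) - n(n+1)) / ((v+1)(v+2))."""
    a = [Fraction(0)] * n_terms
    a[n % 2] = Fraction(1)  # even seed for even n, odd seed for odd n
    for v in range(n_terms - 2):
        a[v + 2] = a[v] * Fraction(v * (v + 1) - n * (n + 1),
                                   (v + 1) * (v + 2))
    return a

# n = 2: the series terminates, leaving 1 - 3 z^2, which is
# proportional to P_2(z) = (3 z^2 - 1) / 2.
print(legendre_series(2)[:4])
```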

10. Define spherical coordinates in four dimensions by

z = r cos ω

w = r sin ω cos θ

x = r sin ω sin θ cos φ

y = r sin ω sin θ sin φ

and derive
   
$$\nabla^2 = \frac{1}{r^3}\frac{\partial}{\partial r}\!\left(r^3\frac{\partial}{\partial r}\right) + \frac{1}{r^2\sin^2\omega}\frac{\partial}{\partial\omega}\!\left(\sin^2\omega\,\frac{\partial}{\partial\omega}\right) + \frac{1}{r^2\sin^2\omega\,\sin\theta}\frac{\partial}{\partial\theta}\!\left(\sin\theta\,\frac{\partial}{\partial\theta}\right) + \frac{1}{r^2\sin^2\omega\,\sin^2\theta}\frac{\partial^2}{\partial\phi^2}$$

Separate variables in the four dimensional eigenvalue problem


$$\left(\nabla^2 + \lambda^2\right)\psi = 0$$

and obtain four one dimensional eigenvalue problems. The θ and φ equations are just the θ

and φ equations that come up in three dimensions. The ω equation is new, being

$$\frac{1}{\sin^2\omega}\frac{d}{d\omega}\!\left(\sin^2\omega\,\frac{d\Omega}{d\omega}\right) + \left(\beta^2 - \frac{\ell(\ell+1)}{\sin^2\omega}\right)\Omega = 0$$

where ℓ (ℓ + 1) comes from the θ equation.

Let z = cos ω, P (z) = Ω (ω) and reduce this equation to


 
$$(1 - z^2)\,\frac{d^2P}{dz^2} - 3z\,\frac{dP}{dz} + \left(\beta^2 - \frac{\ell(\ell+1)}{1 - z^2}\right)P = 0$$

Let x = 1 − z (and x = 1 + z) and solve the indicial equation to find

$$s = \frac{1}{2}\,\ell, \quad -\frac{1}{2}\,(\ell + 1)$$

Write $P(z) = (1 - z^2)^{\ell/2}\, G(z)$ and show that G satisfies

$$(1 - z^2)\,\frac{d^2G}{dz^2} - (2\ell + 3)\,z\,\frac{dG}{dz} + \left(\beta^2 - \ell(\ell+2)\right)G = 0$$

Then let G = a0 + a1 z + a2 z 2 + · · · , derive the recursion formula

$$a_{v+2} = a_v\,\frac{(v + \ell)(v + \ell + 2) - \beta^2}{(v + 1)(v + 2)}$$

and conclude that

$$\beta^2 = k(k + 2), \quad k = \ell,\ \ell + 1,\ \ldots$$

There is a pattern in the separation constants in spherical coordinates:

in two dimensions: $m(m + 0)$

in three dimensions: $m(m + 0)$, $\ell(\ell + 1)$

in four dimensions: $m(m + 0)$, $\ell(\ell + 1)$, $k(k + 2)$

Denote the spherical harmonics in four dimensions by Ykℓm. They satisfy

$$\nabla^2_{\omega\theta\phi}\, Y_{k\ell m} = -\frac{k(k+2)}{r^2}\, Y_{k\ell m}$$

Show that

$$r^k\, Y_{k\ell m} \quad \text{and} \quad \frac{1}{r^{k+2}}\, Y_{k\ell m}$$

satisfy Laplace’s equation in four dimensions.

Almost everything that needs to be known, in any number of dimensions, can be inferred by pursuing the pattern that is emerging.

11. A dye drop in the shape of North and South America is absorbed on the surface of a sphere of
water in which it is insoluble. The dye cannot escape into the surrounding air but being sub-
jected to collisions by the water molecules it can diffuse over the surface of the water. Find
a formula for the diffusive homogenization of the dye. Scale length and time so that D = 1
and R = 1 where R is the radius of the sphere of water. Then the surface concentration of
dye satisfies

$$\frac{\partial c}{\partial t} = \nabla^2 c, \quad 0 \le \theta \le \pi,\ 0 \le \phi \le 2\pi$$

where c (t = 0) is assigned.

Show that as t grows large the dye becomes uniformly distributed over the surface of the
sphere. Find the long time limiting concentration of the dye. What is the θ and φ dependence
of the non-uniform terms that die out most slowly? If c (t = 0) depends only on θ, is this
symmetry maintained for all t > 0?

12. In the Debye model for the reorientation of rigid rods by Brownian motion, the direction
of a rod can be specified by a unit vector lying along its axis and hence by a point (θ, φ)
lying on the surface of the unit sphere. If the initial orientation of the rods is specified by the
probability density p (t = 0), then the probability density p (t > 0) satisfies

$$\frac{\partial p}{\partial t} = D\,\nabla^2_{\theta\phi}\, p$$

where $[D] = \dfrac{1}{T}$. Write a formula for $p\,(t > 0)$.
If in the initial orientation, the rods are concentrated at (θ0 , φ0) then

$$p\,(t = 0) = \frac{\delta(\theta - \theta_0)\,\delta(\phi - \phi_0)}{\sin\theta_0}$$

where the delta function is introduced in Lecture 19, Appendix 3. Show that

$$p\,(t > 0) = \sum_{m=-\infty}^{+\infty}\ \sum_{\ell=|m|}^{\infty} Y_{\ell m}(\theta_0, \phi_0)\, Y_{\ell m}(\theta, \phi)\, e^{-\ell(\ell+1)Dt}$$

This is a transition probability. It is the probability that a rod acquires the orientation (θ, φ)
at time t, given its initial orientation is (θ0 , φ0 ).
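By the addition theorem, $\sum_m Y_{\ell m}(\theta_0,\phi_0)\,Y_{\ell m}(\theta,\phi) = \frac{2\ell+1}{4\pi} P_\ell(\cos\gamma)$ where $\gamma$ is the angle between the two orientations, so the double sum collapses to a single sum. A pure-Python sketch (function names are ours):

```python
import math

def legendre(n, x):
    """P_n(x) via Bonnet's recursion (k+1)P_{k+1} = (2k+1)x P_k - k P_{k-1}."""
    if n == 0:
        return 1.0
    p_prev, p = 1.0, x
    for k in range(1, n):
        p_prev, p = p, ((2 * k + 1) * x * p - k * p_prev) / (k + 1)
    return p

def transition_probability(cos_gamma, D, t, lmax=50):
    """Debye transition probability, collapsed by the addition theorem:
    p = sum_l (2l+1)/(4 pi) P_l(cos gamma) exp(-l(l+1) D t)."""
    return sum((2 * l + 1) / (4.0 * math.pi)
               * legendre(l, cos_gamma)
               * math.exp(-l * (l + 1) * D * t)
               for l in range(lmax + 1))

# As t grows large only the l = 0 term survives: the rods forget their
# initial orientation and the density relaxes to the uniform 1/(4 pi).
print(transition_probability(0.3, 1.0, 10.0))
```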

13. The energy of a state, denoted E, is a solution to

$$\nabla^2\psi = -\frac{8\pi^2\mu}{h^2}\,E\,\psi$$

where
$$\iiint_V \bar{\psi}\,\psi\, dV < \infty$$

and ψ vanishes strongly as | ~r| → ∞. Prove that E > 0.

14. This is a 2-dimensional heat conduction problem, based on the 1-dimensional problem presented in Lecture 14.

We have a region 0 < z < H, 0 < x < L in which the temperature is specified at
t = 0. The wall at z = H is at a fixed temperature, the walls at x = 0 and x = L are
insulated. The wall at z = 0 is in perfect contact with a well stirred reservoir. The uniform
temperature of the reservoir is denoted T0. Hence we have T(z = 0) = T0 for all x ∈ (0, L), whereupon T is uniform in x at z = 0 but $\dfrac{\partial T}{\partial z}$ need not be uniform.
At z = 0 we also have
$$\int_0^L \frac{\partial T}{\partial z}\, dx = \text{constant} \times \frac{dT_0}{dt}$$

Our problem is to find T (x, z, t) where

$$\frac{\partial T}{\partial t} = \kappa\,\nabla^2 T$$

on the domain.

First we scale our problem and then introduce an eigenvalue problem to aid us in solving
the scaled problem.

Thus we have

$$\frac{\partial T}{\partial t} = \nabla^2 T, \quad 0 < x < 1,\ 0 < z < H$$

$$T = 0 \quad \text{at } z = H$$

$$\frac{\partial T}{\partial x} = 0 \quad \text{at } x = 0, 1$$

and

$$\int_0^1 \frac{\partial T}{\partial z}\, dx = C\,\frac{\partial T}{\partial t} \quad \text{at } z = 0$$

Our eigenvalue problem is then

$$\nabla^2\psi + \lambda^2\psi = 0$$

$$\psi = 0 \quad \text{at } z = H$$

$$\frac{\partial\psi}{\partial x} = 0 \quad \text{at } x = 0, 1$$

$$\int_0^1 \frac{\partial\psi}{\partial z}\, dx + C\lambda^2\psi = 0 \quad \text{at } z = 0$$

First, the eigenvalue problem must be solved, and we do this by separation of variables, viz., we write

ψ (x, z) = Z (z) cos kx

k = 0, π, 2π, . . .

and obtain the ψ’s and λ’s. Then the ψ’s and λ’s must be used to obtain T (x, z, t) where
T (x, z, t = 0) is specified.

Our integration by parts formulas can be used to light our path, i.e., to tell us the inner
product we ought to be using.

15. The Stokes’ equation for the slow flow of a constant density fluid is

$$\nabla^2\vec{v} = \frac{1}{\mu}\nabla p, \qquad \nabla\cdot\vec{v} = 0$$

Derive the equations

∇2 p = 0

and
 
$$\nabla^2\!\left(\frac{1}{2\mu}\,\vec{r}\,p\right) = \frac{1}{\mu}\nabla p$$

Denote by $p_n$, $\chi_n$ and $\phi_n$ solid spherical harmonics, viz.,

$$\left(A_n r^n + B_n r^{-(n+1)}\right) Y_{nm}$$

Then we have

$$p = \sum p_n$$

and you are to derive the result that

$$\vec{v} = \sum\left[\nabla\times(\vec{r}\,\chi_n) + \nabla\phi_n + \frac{n+3}{2\mu(n+1)(2n+3)}\, r^2\nabla p_n - \frac{n}{\mu(n+1)(2n+3)}\,\vec{r}\, p_n\right]$$

satisfies

$$\nabla^2\vec{v} = \frac{1}{\mu}\nabla p \quad \text{and} \quad \nabla\cdot\vec{v} = 0$$

This result appears in Lamb’s book “Hydrodynamics.”

16. Stokes’ equation for slow flow past a sphere has no solution in two dimensions. This is
Stokes’ paradox. It has a solution in three dimensions, but no first order corrections to
account for non zero Reynolds number. This is Whitehead’s paradox.

Your job is to see what you can find out about slow flow past a four dimensional sphere.

First you ought to derive the symmetry conditions and then ask if a streamfunction can be
found.

17. We have a sphere of radius R centered at the origin lying in an unbounded region. We
impose a constant temperature gradient at a great distance from the sphere. The thermal
conductivities of the sphere and its surroundings are kS and k.

Far away from the sphere we have

$$\frac{\partial T}{\partial z} = \frac{dT_0}{dz}, \qquad \frac{\partial T}{\partial x} = 0 = \frac{\partial T}{\partial y}$$

No temperature is specified at any point in the problem and we have ∇2 T = 0 inside and
outside the sphere. The axisymmetric solutions to this equation are

$$A_0 + \frac{B_0}{r} + \left(A_1 r + \frac{B_1}{r^2}\right)\cos\theta + \left(A_2 r^2 + \frac{B_2}{r^3}\right)\left(\frac{3}{2}\cos^2\theta - \frac{1}{2}\right) + \text{etc.}$$

Dropping the constant A0 , derive the formulas

$$T = \frac{3k}{k_S + 2k}\,\frac{dT_0}{dz}\, r\cos\theta$$

and

$$T = \frac{dT_0}{dz}\, r\cos\theta + \frac{dT_0}{dz}\,\frac{k - k_S}{k_S + 2k}\,\frac{R^3}{r^3}\, r\cos\theta$$

for the temperatures inside and outside the sphere.
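A quick check that these two formulas match both temperature and radial heat flux at r = R (a sketch; function names, sample values, and the finite-difference fluxes are ours):

```python
def T_inside(r, cth, kS, k, G, R):
    # G stands in for the far-field gradient dT0/dz, cth for cos(theta)
    return 3.0 * k / (kS + 2.0 * k) * G * r * cth

def T_outside(r, cth, kS, k, G, R):
    return (G * r * cth
            + G * (k - kS) / (kS + 2.0 * k) * R**3 / r**3 * r * cth)

kS, k, G, R, cth = 5.0, 1.0, 2.0, 0.5, 0.7
h = 1e-6
# the temperature is continuous at r = R ...
t_in = T_inside(R, cth, kS, k, G, R)
t_out = T_outside(R, cth, kS, k, G, R)
# ... and so is the radial heat flux: kS dT/dr inside = k dT/dr outside
flux_in = kS * (t_in - T_inside(R - h, cth, kS, k, G, R)) / h
flux_out = k * (T_outside(R + h, cth, kS, k, G, R) - t_out) / h
print(t_in, t_out, flux_in, flux_out)
```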

18. D. J. Jeffrey, “Conduction through a random suspension of spheres,” Proc. Roy. Soc. Lon-
don, A335 (1973) 355, proposes to determine the effective conductivity of a dilute suspen-
sion of spheres in the following way:

Write

$$-\langle\vec{q}\,\rangle = k_{\text{eff}}\,\langle\nabla T\rangle$$

where

$$\langle(\ )\rangle = \lim_{V\to\infty}\frac{1}{V}\iiint_V (\ )\, dV$$

and then write

$$-\langle\vec{q}\,\rangle = \frac{1}{V}\left\{\iiint_{V_{\text{spheres}}} k_S\,\nabla T\, dV + \iiint_{V - V_{\text{spheres}}} k\,\nabla T\, dV\right\}$$

$$= \frac{1}{V}\left\{\iiint_V k\,\nabla T\, dV + \iiint_{V_{\text{spheres}}} (k_S - k)\,\nabla T\, dV\right\}$$

$$= k\,\langle\nabla T\rangle + \frac{1}{V}\iiint_{V_{\text{spheres}}} (k_S - k)\,\nabla T\, dV$$

Set

$$\vec{\Delta}_i = \iiint_{V_{\text{sphere } i}} (k_S - k)\,\nabla T\, dV$$

whereupon

$$\langle -\vec{q}\,\rangle = k\,\langle\nabla T\rangle + \frac{1}{V}\left\{\vec{\Delta}_1 + \vec{\Delta}_2 + \cdots\right\}$$

and then, assuming all the $\vec{\Delta}$'s to be the same, we have for $n$ spheres in a volume $V$

$$\langle -\vec{q}\,\rangle = k\,\langle\nabla T\rangle + \frac{n}{V}\,\vec{\Delta}_1$$

Assuming the spheres are dilute and do not interact with one another, we can use

$$\nabla T_S = \frac{3k}{k_S + 2k}\,\frac{dT_0}{dz}\,\vec{k}$$

where $z = r\cos\theta$, which we found in an earlier problem.

Setting $\langle\nabla T\rangle = \dfrac{dT_0}{dz}\,\vec{k}$, derive:

$$\frac{k_{\text{eff}}}{k} = 1 + 3\phi\,\frac{k_S - k}{k_S + 2k}$$

where $\phi = \dfrac{n\,\frac{4}{3}\pi R^3}{V}$ is the volume fraction of spheres.
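This is the classical dilute-suspension result; its limiting behavior is easy to sample numerically (a sketch, with our own function name):

```python
def k_eff_ratio(phi, kS, k):
    """Dilute-suspension effective conductivity,
    k_eff / k = 1 + 3 phi (kS - k) / (kS + 2 k)."""
    return 1.0 + 3.0 * phi * (kS - k) / (kS + 2.0 * k)

# No spheres, or spheres that match the matrix, change nothing;
# highly conducting spheres raise the ratio by at most 3 phi.
print(k_eff_ratio(0.0, 10.0, 1.0))   # 1.0
print(k_eff_ratio(0.05, 1.0, 1.0))   # 1.0
print(k_eff_ratio(0.05, 1e12, 1.0))  # close to 1 + 3 * 0.05 = 1.15
```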
V

19. In cylindrical coordinates the Bessel functions $I_m(\lambda r)$ and $K_m(\lambda r)$ satisfy

$$\left(\frac{d^2}{dr^2} + \frac{1}{r}\frac{d}{dr} - \frac{m^2}{r^2} - \lambda^2\right)\{I_m(\lambda r)\} = 0$$

and

$$\left(\frac{d^2}{dr^2} + \frac{1}{r}\frac{d}{dr} - \frac{m^2}{r^2} - \lambda^2\right)\{K_m(\lambda r)\} = 0$$

where $K_m(\lambda r)$ is not bounded at $\lambda r = 0$.

Assume you need to solve

∇2~v = ∇p, ∇ · ~v = 0

Show that

∇2 p = 0 and ∇4~v = ~0

However we would be wise to stay away from ∇4 .

Assume p is bounded at r = 0 and is periodic in θ and show that

$$p = I_m(\lambda r)\, e^{im\theta}\, e^{i\lambda z}$$

satisfies ∇2 p = 0 for any λ and for m = 0, ±1, ±2, . . ..



Now in cylindrical coordinates your problem is

$$\nabla^2 v_z = \frac{\partial p}{\partial z}$$

$$\nabla^2 v_r - \frac{v_r}{r^2} - \frac{2}{r^2}\frac{\partial v_\theta}{\partial\theta} = \frac{\partial p}{\partial r}$$

$$\nabla^2 v_\theta - \frac{v_\theta}{r^2} + \frac{2}{r^2}\frac{\partial v_r}{\partial\theta} = \frac{1}{r}\frac{\partial p}{\partial\theta}$$

and

$$\frac{\partial v_r}{\partial r} + \frac{v_r}{r} + \frac{1}{r}\frac{\partial v_\theta}{\partial\theta} + \frac{\partial v_z}{\partial z} = 0$$

where $\vec{v} = v_r\,\vec{\imath}_r + v_\theta\,\vec{\imath}_\theta + v_z\,\vec{\imath}_z$.

Observe that

$$\psi = I_{\sqrt{m^2+1}}(\lambda r)\, e^{im\theta}\, e^{i\lambda z}$$

satisfies

$$\left(\nabla^2 - \frac{1}{r^2}\right)\psi = 0$$

and, if there is no θ variation, observe that

$$v_z = I_0(\lambda r)\, e^{i\lambda z}$$

$$v_r = I_1(\lambda r)\, e^{i\lambda z}$$

and

$$v_\theta = I_1(\lambda r)\, e^{i\lambda z}$$

satisfy the homogeneous problem $\nabla^2\vec{v} = \vec{0}$.



Assuming

$$p = I_m(\lambda r)\,\cos m\theta\,\sin\lambda z$$

you have

$$\nabla^2 v_z = \lambda\, I_m(\lambda r)\,\cos m\theta\,\cos\lambda z$$

Then write

$$v_z = f(r, m, \lambda)\,\cos m\theta\,\cos\lambda z$$

and show that


 
$$\left(\frac{d^2}{dr^2} + \frac{1}{r}\frac{d}{dr} - \frac{m^2}{r^2} - \lambda^2\right) f = \lambda\, I_m(\lambda r)$$

and that a particular solution is

$$f = \frac{1}{2\lambda}\,\frac{d}{dr}\left[r\, I_m(\lambda r)\right]$$
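For m = 0 this particular solution is easy to verify numerically, evaluating I₀ and I₁ from their series and applying a finite-difference form of the operator (a sketch; helper names and sample values are ours):

```python
import math

def bessel_i(m, x, terms=40):
    """Modified Bessel I_m(x) = sum_k (x/2)^(2k+m) / (k! (k+m)!)."""
    return sum((x / 2.0) ** (2 * k + m)
               / (math.factorial(k) * math.factorial(k + m))
               for k in range(terms))

lam = 1.3

def f(r):
    # claimed particular solution for m = 0:
    # f = (1 / 2 lam) d/dr [ r I_0(lam r) ] = (I_0 + lam r I_1) / (2 lam)
    return (bessel_i(0, lam * r) + lam * r * bessel_i(1, lam * r)) / (2.0 * lam)

# check f'' + f'/r - lam^2 f = lam I_0(lam r) by central differences
r, h = 0.8, 1e-5
lhs = ((f(r + h) - 2.0 * f(r) + f(r - h)) / h**2
       + (f(r + h) - f(r - h)) / (2.0 * h * r)
       - lam**2 * f(r))
print(lhs, lam * bessel_i(0, lam * r))
```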

20. A cylindrical column of inviscid fluid is in rigid body rotation about its axis of symmetry at angular velocity Ω. Thus you have vr = 0, vθ = r Ω and vz = 0. A small axisymmetric perturbation is introduced and your job is to find the frequencies of small amplitude oscillations.

Your equations are

$$\frac{\partial\vec{v}}{\partial t} + \vec{v}\cdot\nabla\vec{v} = -\nabla p, \qquad \nabla\cdot\vec{v} = 0$$

where $\dfrac{p}{\rho}$ has been replaced by $p$.
And you need to derive the equations corresponding to m = 0, viz.,

$$\frac{\partial v_{r1}}{\partial t} - 2\Omega\, v_{\theta 1} = -\frac{\partial p_1}{\partial r}$$

$$\frac{\partial v_{\theta 1}}{\partial t} + 2\Omega\, v_{r1} = 0$$

$$\frac{\partial v_{z1}}{\partial t} = -\frac{\partial p_1}{\partial z}$$

$$\frac{\partial v_{r1}}{\partial r} + \frac{v_{r1}}{r} + \frac{\partial v_{z1}}{\partial z} = 0$$

Then seeking solutions $v_{r1} = \hat{v}_{r1}(r)\, e^{i\omega t} e^{ikz}$, etc., you can eliminate $\hat{v}_{\theta 1}$ in favor of $\hat{v}_{r1}$ and $\hat{v}_{z1}$ in favor of $\hat{p}_1$ and arrive at an equation for $\hat{v}_{r1}$:

$$\frac{d^2\hat{v}_{r1}}{dr^2} + \frac{1}{r}\frac{d\hat{v}_{r1}}{dr} - \frac{1}{r^2}\,\hat{v}_{r1} + k^2\left(\frac{4\Omega^2}{\omega^2} - 1\right)\hat{v}_{r1} = 0$$

where $\hat{v}_{r1} = 0$ at $r = R$ and $\hat{v}_{r1}$ is bounded at $r = 0$.

This eigenvalue problem tells you the values of ω 2 as they depend on k 2 . Photographs
illustrating what you have found are presented by D. Fultz, J. Meteorology, 16 199 (1959).

21. Your problem is to solve

$$\frac{\partial c}{\partial t} = \nabla^2 c, \quad 0 < x < 1,\ 0 < y < a$$

where c (t = 0) is assigned.

Two cases are of interest

a) c = 0 at the edge of the rectangle


and

b) ~n · ∇c = 0 at the edge of the rectangle.

In case a) you are to deduce the fact that if a >> 1, the x variation dies out quickly
leaving the y variation in control of the loss of solute to the surroundings. If a << 1, it is
the reverse, i.e., the long direction is slow.

In case b) there is no solute loss, the initial solute distribution is simply working its way
to uniformity. You have eigenfunctions

• independent of x and y,

• independent of x and dependent on y

• independent of y and dependent on x


and

• dependent on both x and y

For a >> 1 and for a << 1, does the x or y variation control the final approach to uniformity?

22. Suppose we inject a decomposing solute into a solvent in straight line flow in a long pipe of
circular cross section.

The solute concentration decreases due to a first order reaction and we have, in scaled vari-
ables,
 
$$\frac{\partial c}{\partial t} = \frac{1}{r}\frac{\partial}{\partial r}\!\left(r\frac{\partial c}{\partial r}\right) - v\,\frac{\partial c}{\partial z} + \frac{\partial^2 c}{\partial z^2} - kc$$

and

$$\frac{\partial c}{\partial r}\,(r = 1) = 0$$

where $v = 2\bar{v}\,(1 - r^2)$.

Our model is

$$\frac{\partial c}{\partial t} = D_{\text{eff}}\,\frac{\partial^2 c}{\partial z^2} - V_{\text{eff}}\,\frac{\partial c}{\partial z} - K_{\text{eff}}\, c$$

First derive

$$K_{\text{eff}} = -\frac{1}{c_0}\frac{dc_0}{dt}$$

$$V_{\text{eff}} = \frac{d}{dt}\left(\frac{c_1}{c_0}\right)$$

etc.

Then show that

c0 (k > 0) = c0 (k = 0) e−kt

c1 (k > 0) = c1 (k = 0) e−kt

etc.

and conclude that $K_{\text{eff}} = k$ and that $V_{\text{eff}}$ and $D_{\text{eff}}$ are independent of $k$.

Estimate the fraction of solute remaining at the time $V_{\text{eff}}$ and $D_{\text{eff}}$ become nearly constant.

23. Your tennis ball, having a diameter 2R and a wall thickness L is filled with air at pressure
P0 > P atm. The air diffuses across the wall and your tennis ball goes flat. Assume all the
pressure drop is across the wall and diffusion through the wall is steady. Then write


$$\nabla\cdot\left(c\,\vec{v}\,\right) = 0$$

$$c = \frac{P}{RT}$$
and

$$\vec{v} = \frac{K}{\mu}\,\nabla P \quad \text{(Darcy's law)}$$

where K denotes the permeability of the wall, assumed to be a porous solid.

At constant temperature and assuming one dimensional diffusion, derive the equation

$$\frac{d}{dr}\!\left(P\, r^2\,\frac{dP}{dr}\right) = 0$$

solve it and derive a formula for the time at which the pressure in your tennis ball falls to $\dfrac{P_0 + P_{\text{atm}}}{2}$.

24. A solid sphere of radius R0 and density cS dissolves sparingly in a solvent. It is in equilibrium with the solvent at solids concentration $c^*$, where

$$c^* = c^*_\infty - A\gamma\, 2H$$

and where $c^*_\infty$ is the concentration in equilibrium with $c_S$ at a plane surface.

Writing

R = R0 + ε R1

and assuming no φ variation we have

$$2H_0 = -\frac{2}{R_0}$$

and

$$2H_1 = \frac{1}{R_0^2}\left(2R_1 + \frac{\partial^2 R_1}{\partial\theta^2} + \frac{\cos\theta}{\sin\theta}\frac{\partial R_1}{\partial\theta}\right)$$

Introduce a small perturbation to the system at rest at concentration $c_0^*$ corresponding to $R = R_0$, denote the perturbation variables by the subscript 1 and write

$$\frac{\partial c_1}{\partial t} = D\,\nabla^2 c_1$$

$$c_1 = c_1^* = A\gamma\, 2H_1 \quad \text{at } r = R_0$$

and

$$(c_S - c_0^*)\,\frac{\partial R_1}{\partial t} = D\,\frac{\partial c_1}{\partial r} \quad \text{at } r = R_0$$

where $c_1^*$ does not appear in the third equation because $\dfrac{\partial R_0}{\partial t} = 0$.
Your aim is to find out how fast the perturbation dies out.

Assume a solution

$$c_1 = \hat{c}_1(r)\, P_\ell(\cos\theta)\, e^{\sigma t}$$

and

$$R_1 = \hat{R}_1\, P_\ell(\cos\theta)\, e^{\sigma t}$$

and derive the domain equation for $\hat{c}_1(r)$. Its solutions are denoted

$$j_\ell\!\left(\sqrt{\frac{-\sigma}{D}}\; r\right) \quad \text{and} \quad y_\ell\!\left(\sqrt{\frac{-\sigma}{D}}\; r\right)$$

and are called spherical Bessel functions.

The case ℓ = 0 does not maintain the volume of the sphere fixed and the case ℓ = 1 is
neutral, so set ℓ = 2. Then a technical difficulty arises, viz., j2 (r) and y2 (r) do not vanish
as fast as you would like as r → ∞.
To get some idea of what is going on, drop $\sigma$ on the domain on the grounds that $\dfrac{\partial R}{\partial t}$ is controlling the equilibration.

Then

$$\hat{c}_1 = A\, r^2 + \frac{B}{r^3}$$

and you can derive a formula for σ.


25. Your job is to find the shape of a sphere spinning at angular velocity $\vec{\Omega} = \Omega\,\vec{k}$. To do this you need the velocity of the fluid:

$$\vec{v} = r\sin\theta\,\Omega\,\vec{i}_\phi$$

and its pressure:


 
$$p = \gamma\,\frac{2}{R_0} + \left(\frac{1}{2}\,\rho\, r^2\sin^2\theta + C\right)\Omega^2$$

Now writing

$$R = R_0 + \Omega^2 R_1 + \frac{1}{2}\,\Omega^4 R_2 + \cdots$$

and observing that the volume of the sphere is


$$V = \frac{1}{3}\int_0^{2\pi}\!\!\int_0^{\pi} R^3 \sin\theta\, d\theta\, d\phi$$

and must remain constant, independent of Ω2 , you conclude, to first order in Ω2 :


$$\int_0^{2\pi}\!\!\int_0^{\pi} R_1 \sin\theta\, d\theta\, d\phi = 0$$

At r = R (θ, φ) you have p + γ2H = 0 and hence to first order in Ω2 you have, at r = R0 ,

p1 + γ2H1 = 0

where
 
$$2H_1 = \frac{1}{R_0^2}\left(2 + \frac{1}{\sin\theta}\frac{d}{d\theta}\!\left(\sin\theta\,\frac{d}{d\theta}\right)\right) R_1$$

and where you must have m = 0.

Now you can find R1 and hence you will have R to first order in Ω2 .

Answer:
 
$$R = R_0 + \frac{\rho\,\Omega^2 R_0^4}{16\,\gamma}\left(\frac{2}{3} - 2\cos^2\theta\right)$$

At $m = 0$, the $Y_{\ell m}$'s are the $P_\ell$'s: $P_0 = 1$, $P_1 = \cos\theta$, $P_2 = \frac{3}{2}\cos^2\theta - \frac{1}{2}$.
To go to second order in $\Omega^2$, you will need $2H_2$. At $r = R_0$ you will find

$$2R_1\,\frac{\partial p_1}{\partial r} + \gamma\, 2H_2 = 0$$

because $p_2$, $\dfrac{d^2 p_0}{dr^2}$ and $\dfrac{dp_0}{dr}$ are all zero.
dr dr

26. An inviscid fluid confined to a circle of radius R0 by the surface tension acting at its edge is
spinning at a constant angular velocity, Ω.

The velocity of the fluid and its pressure are given by

$$\vec{v}_0 = r\,\Omega\,\vec{i}_\theta$$

and

$$\frac{dp_0}{dr} = \rho\,\Omega^2\, r$$

where $p_0 = \dfrac{\gamma}{R_0}$ at $r = R_0$.

Your job is to find the frequency of oscillation, σ, assuming the surface is given a small
displacement, viz., R = R0 + εR1 and assuming there is no z variation and vz = 0.

You have

$$\frac{\partial\vec{v}}{\partial t} + \vec{v}\cdot\nabla\vec{v} = -\nabla\frac{p}{\rho}, \qquad \nabla\cdot\vec{v} = 0$$

and, at r = R,


$$v_r - \frac{R_\theta}{R}\, v_\theta = R_t$$

and

$$\frac{p}{\rho} + \frac{\gamma}{\rho}\, 2H = 0$$

where

$$2H = \frac{\dfrac{R_{\theta\theta}}{R^2} - \dfrac{1}{R} - \dfrac{2R_\theta^2}{R^3}}{\left(1 + \dfrac{R_\theta^2}{R^2}\right)^{3/2}}$$

Write the perturbation problem and assume a solution

$$v_{r1} = \hat{v}_{r1}(r)\, e^{im\theta} e^{i\sigma t}, \quad v_{\theta 1} = \hat{v}_{\theta 1}(r)\, e^{im\theta} e^{i\sigma t}, \quad p_1 = \hat{p}_1(r)\, e^{im\theta} e^{i\sigma t}, \quad R_1 = \hat{R}_1\, e^{im\theta} e^{i\sigma t}$$

Eliminate $\dfrac{\hat{p}_1}{\rho}$ by differentiation and use $im\,\hat{v}_{\theta 1} = -\dfrac{d}{dr}\left(r\,\hat{v}_{r1}\right)$ to eliminate $\hat{v}_{\theta 1}$, arriving at

$$\left(\frac{d^2}{dr^2} + \frac{3}{r}\frac{d}{dr} + \frac{1 - m^2}{r^2}\right)\hat{v}_{r1} = 0$$

Retain the bounded solution, observing that m = 0 is ruled out if the area of the circle is held fixed on perturbation.

Obtain a formula for $\sigma^2$ and notice that $\sigma^2 = 0$ at $m = 1$. At $\Omega = 0$ your formula is

$$\sigma^2 = \frac{\gamma}{\rho R_0^3}\, m\,(m - 1)(m + 1)$$

Answer:

$$(\sigma + m\Omega)^2 - (\sigma + m\Omega)\, 2\Omega + m\,\Omega^2 - \frac{\gamma}{\rho R_0^3}\, m\,(m - 1)(m + 1) = 0$$

27. The mechanical energy equation may tell you something important about a problem before you try to solve it, e.g., the oscillating drop problem. As a simple example, we may have a fluid occupying a volume V bounded by a surface S where we omit the fluid outside S.

Then in V we have

$$\rho\,\frac{\partial\vec{v}}{\partial t} + \rho\,\vec{v}\cdot\nabla\vec{v} = \nabla\cdot\mathbf{T}, \qquad \nabla\cdot\vec{v} = 0$$

and on S we have



$$\vec{n}\cdot\vec{v} = u$$

$$-\vec{n}\,\vec{n} : \mathbf{T} + \gamma\, 2H = 0$$

$$\vec{t}\,\vec{n} : \mathbf{T} = 0$$
LECTURE 20. EIGENVALUES AND EIGENFUNCTIONS OF ∇2 624

[Figure: a fluid volume V bounded by its surface S.]

The plan is: dot the equation with $\vec{v}$, use

$$\left(\nabla\cdot\mathbf{T}\right)\cdot\vec{v} = \nabla\cdot\left(\mathbf{T}\cdot\vec{v}\right) - \mathbf{T}^{\mathsf{T}} : \nabla\vec{v},$$

integrate over V, use

$$\mathbf{I} : \nabla\vec{v} = \nabla\cdot\vec{v} = 0,$$

use Leibnitz' rule and write

$$\mathbf{T} = -p\,\mathbf{I} + 2\mu\,\mathbf{D},$$

where

$$\nabla\vec{v} = \mathbf{D} + \mathbf{W}, \qquad \mathbf{D} : \mathbf{W} = 0$$

to derive

$$\frac{d}{dt}\int_V \frac{1}{2}\,\rho\,\vec{v}^{\,2}\, dV = \int_S dA\;\gamma\, 2Hu - 2\mu\int_V \mathbf{D} : \mathbf{D}\; dV$$

Now, because

$$\frac{d}{dt}\int_S dA = -\int_S dA\; 2Hu$$

you have

$$\frac{d}{dt}\int_V \frac{1}{2}\,\rho\,\vec{v}^{\,2}\, dV + \gamma\,\frac{d}{dt}\int_S dA = -2\mu\int_V \mathbf{D} : \mathbf{D}\; dV$$

And you have a constraint:

$$\frac{d}{dt}\int_V dV = \int_S dA\; u = 0$$

Now if you wish, you can easily include gravity in your formula by adding $+\rho\,\vec{g}$, $\vec{g} = -\nabla\phi$, to the RHS at the beginning.

Suppose you have a spherical drop at rest in free space, no gravity. Denote its radius by R0 and its volume by V0. You give it a small displacement from rest. Prove that the drop returns to rest.
Index

A block diagonal, 115


activator, 373 Boiling an Azeotrope, 23
adjoint, 68, 89, 498 boiling curve, 156
adjugate, 34 boiling point rise, 14, 25
Airy’s equation, 543 Boiling Point Rise Model, 25
algebraic multiplicity, 87, 114 boundary value problems, 498
algebraically simple eigenvalues, 87 boundedness condition, 581
analytic function, 524
C
Arrhenius formula, 521
Cartesian coordinates, 245, 410, 540
Arrhenius temperature dependence, 183
cell wall, 374
associated Legendre functions, 584
chain of generalized eigenvectors, 136
asymptotically stable, 129
characteristic equation, 88
autocatalytic reaction, 195
characteristic polynomial, 88
autothermal, 182, 521
chemical rate constants, 209
axisymmetric flow, 259
Chemostat, 179
azeotropes, 4
chemostat, 179
B chromatographic separation, 544
backward difference, 234 Chromatographic Separations, 281
balanced reactions, 31 Cocurrent honeycomb heat exchanger, 172
basis columns, 38 cofactor, 34
basis for the space, 47 column formulas, 36
basis minor, 38 column vector, 6
basis minor theorem, 38, 49 complete set, 97
Bessel function, 563 complete set of eigenvectors, 97
Bessel’s equation, 413, 577 complex conjugate, 67, 322
best approximation problem, 77 complex conjugates, 90
binomial theorem, 566 conservation conditions, 31
binormal distribution, 288 conservation of energy equation, 161
biorthogonal sets, 35, 89 contour integral, 428
biorthogonal sets of vectors, 68, 183 convective distortion, 545

627
INDEX 628

convective equation, 546 eigenvalue, 85


correlation coefficient, 17 eigenvalue problem, 320, 524
Coulomb’s law, 307, 596 eigenvalues, 320
counter current separating cascade, 101 eigenvectors, 70, 85
counterflow configuration, 165 electrical potential, 307, 565
Cramer’s rule, 36 electrostatic potential, 307, 596
critical conditions in nonlinear problems, 394 elliptical cylinder coordinates, 432
critical points, 4 equilibrium points, 4
cubic lattice, 289 equilibrium solutions, 128
cylindrical coordinates, 256, 411, 544 equilibrium vapor, 3
equimole counter diffusion, 156
D
error function, 293
Darcy’s law, 300, 457
Euler approximation , 131
delta function, 517
exchange of stability, 184
Derivatives of Determinants, 41
extinction point, 188
determinant of a square matrix, 32
diagonal matrix, 41, 114 F
diagonalizing basis, 114 Flow Reactor Data, 214
differential operator ∇, 246 formula for ∇2 , 254
diffusion equation, 231 formulas for ∇2 in three coordinate systems, 259
dimension of the space, 48 forward difference, 234
dipole, 306 Fourier coefficients, 327, 555
dipole-dipole term, 309 Fourier series, 321
Dirichlet conditions, 469 Fourier series expansion, 347
Dirichlet problem, 386, 415 Frank-Kamenetski approximation, 185
Dirichlet, Neuman and Robin conditions, 386 frictional heating, 348, 534
dispersion coefficients, 544 Frobenius’ method, 565
dispersion equations, 544 Froebenius’ method, 585
Domain Pertubations, 260 Fuchs, Frobenius classification, 542
Domain perturbations, 267 fundamental solution matrix, 146
dominant balance, 378 fundamental solutions, 50
Drude model, 117
G
dynamics of a simple evaporator, 11
generalized coordinates, 159, 169
E generalized eigenvector, 91
eigenfunctions, 320 generalized eigenvectors, 135
eigenspace, 86 generalized Green’s function, 515
INDEX 629

generalized inverse, 81, 82 indicial equation, 575


generalized inverses, 71 infinite series, 321
generalized solution, 515 inhibitor, 373
generating function, 567 inner product, 67, 323
geometric interpretation of the rank, 49 integration by parts, 320, 497, 518
geometric multiplicity, 86, 87, 114 integration by parts formula, 520
Gerschgorin column circles, 209 integration by parts formulas, 387
Gerschgorin’s circle theorem, 129 integration formulas, 322
Gerschgorin’s row circles, 218 invariant subspaces, 85
Getting Going, 3 isotherms, 9
gravitational field, 542
J
greatest number of independent columns, 30
Jacobian matrix, 182
greatest number of independent rows, 30
joint probability density, 16
greatest number of independent vectors, 32
Jordan forms, 115
Green’s first and second theorems, 387
Green’s first theorem, 393 K
Green’s function, 292, 513, 597, 598 kernel, 50
Green’s function for the diffusion equation, 295 Kremser equation, 104
Green’s second theorem, 393
L
H Lagrange’s equations of motion, 158
Hamiltonian, 169 Lagrangian, 158
heat balance, 12 lambda matrices, 236
heat bath, 394 lambda matrix, 140, 141
Hermitian, 67 Laplace transformation, 139
Hermitian positive definite matrix, 68 Laplace’s equation, 589
Hilbert matrix, 15 latent vectors, 159
homogeneous azeotropes, 4 Least Squares Approximations, 76
homogeneous equations, 346 least squares problem, 77
homogeneous problem, 26 Legendre polynomials, 568, 569
honeycomb heat exchanger, 166 Legendre’s differential equation, 569
Hopf bifurcation, 184, 193 limiting nutrient, 179
line source solution in three dimensions, 297
I linear combinations, 29, 47
ignition point, 188
linear dependence, 29
image, 49
linear independence, 29
Independent Chemical Reactions, 30
linear operator, 112
long wave length perturbations, 375
longitudinal spreading, 544
lower triangular matrix, 41

M
magnetic dipole, 271
manifold, 47
Mathieu equation, 148
matrix multiplication, 6
matrix of the cofactors, 34
maximum boiling azeotrope, 9, 24
mean curvature, 252
mean square error, 325
merry-go-round, 168
method of Frobenius, 414, 573
minimum boiling azeotrope, 24
minor, 34
minors, 37
monopole, 306
Multiplication of a Vector by a Matrix, 6
Multipole Expansions, 303
multipole moment expansion, 565

N
Navier-Stokes equation, 268
nearly dependent columns, 20
Neumann conditions, 375, 462, 469, 557
norm convergence, 327
nutrient concentration, 180

O
one-dimensional lattice, 287
ordinary differential equations, 497
ordinary point, 542, 573
orthogonal basis, 246
orthogonal complement, 69
orthogonal coordinate system, 248, 410
orthogonal functions, 323
orthogonality, 322
orthogonality condition, 526
oscillations of a sphere, 592

P
periodic conditions, 375
periodic eigenfunctions, 347
perturbation calculations, 107
Petri Dish Problem, 25
plain vanilla inner product, 68, 497
plane source solution in three dimensions, 298
point source, 292
point source solution, 295, 303
point source solution in one dimension, 298
point source solution in three dimensions, 297
point source solution in two dimensions, 298
Poisson's equation, 305, 591, 596
polynomial differential operator, 53
population of cells, 179
positive cone, 210
positive definite, 67
positive feed back, 395
potential energy, 307
power moments, 273, 546
power series, 414
principal minors, 86
principle of detailed balance, 210
principle of superposition, 10, 390
probability density, 16
projection, 73
projection theorem, 73, 74
prolate and oblate spheroidal coordinates, 270
Pythagorean Theorem, 76

Q
quadrupole, 306
quantum mechanics, 540
R
radius of convergence, 581
random variables, 16
random walk, 287
rank, 37
rank test for solvability, 49
rate of cell multiplication, 179
Rayleigh-Taylor problem, 457
reaction-molecule matrix, 30
receptors, 374
recursion formula, 567
regular singular point, 573
residue, 428
row formulas, 36
runaway condition, 395

S
Saffman-Taylor problem, 457
Schrödinger's equation, 540
second central difference, 232
second order linear differential operator, 501
self adjoint differential operators, 499
self-adjoint, 70, 207
Self-adjoint matrices, 70
self-adjoint operators, 70
separation of variables, 410
set of eigenvectors, 124
set of measure zero, 97
simple evaporator, 11
simple pole, 428
simple roots, 97
solute diffusion, 319
Solute dispersion, 544
solution to the best approximation problem, 79
solvability condition, 27, 49, 108, 398, 506
solvability conditions, 7
spectral representation, 100, 104
spherical coordinates, 257, 412, 565
spherical harmonics, 586
square submatrices, 37
stagnation flow, 276
startup period, 137
state space, 4
steady solute diffusion, 303
steady solutions, 26
steady state points, 4
Stefan-Maxwell equations, 152
stiffness problem, 133
stirred tank reactor, 183
stoichiometric coefficient, 30, 180
Straight Line Paths, 228
stripping cascade, 135
sum of the squares of the errors, 19
superdiagonal, 115
surface gradient, 250
symmetrizable, 71
symmetry breaking, 378
system of particles, 169
system of point sources, 305

T
Taylor dispersion, 287
Taylor's formula, 560
The Trace Formula, 41
thermal explosion, 395
trace of A, 40

U
uniform activator-inhibitor state, 374
unmixed boundary value problems, 498
upper triangular matrix, 41
Use of Flow Reactor Data, 214

V
Vandermonde matrix, 18
variance, 17, 275, 547
vector space, 47

W
washout branch, 182
washout solution, 181
Wronskian, 502
Wronskian of the solutions, 40
20 Lectures on Eigenvectors, Eigenvalues, and Their Applications
Problems in Chemical Engineering
L. E. Johns

Orange Grove Texts Plus seeks to redefine publishing in an electronic world. This imprint of the University Press of Florida provides faculty, students, and researchers worldwide with the latest scholarship and course materials in a twenty-first-century format that is readily discoverable, easily customizable, and consistently affordable.

www.orangegrovetexts.com

ISBN 978-1-61610-165-7 $65.00
