Theoretical Physics For Students by Alexander Fufaev

The document is an introduction to a book by Alexander Fufaev, a freelance theoretical physicist, aimed at providing an intuitive understanding of theoretical physics topics at the undergraduate level. It addresses common struggles faced by students in understanding complex concepts and completing assignments, offering a structured approach to mastering the material. The book covers various mathematical tools and principles essential for theoretical physics, including differential equations, tensors, and Maxwell's equations.

Uploaded by

Shritan Aravalli

Hello there!

My name is Alexander Fufaev. You may know me from YouTube or from my website
"fufaev.org". I'm a freelance theoretical physicist, and in this book I explain
theoretical physics: the stuff you're going to study, are studying, or have
studied in your physics degree, but haven't quite grasped yet. This book aims
to provide an intuitive understanding of the key topics in theoretical physics
at undergraduate level and serves as a perfect entry point for deepening your
understanding. You should be familiar with the basics of vector, differential
and integral calculus. If these concepts are new to you, it's a good idea to
understand them first before delving into this book.

Weekly assignments at university are not easy for beginners and require a lot
of time. The content is rushed through, so it's easy to find yourself not only
failing to understand many topics, but also struggling to keep up with the weekly
assignments. This often leads to not completing the required credits or failing
the exams at the end of the semester.
Having completed my M.Sc. in Physics, I know in retrospect exactly what I
would have needed in the theoretical physics modules to meet the academic
requirements and pass the exams. A fundamental, intuitive understanding of
the topics would have been essential, because often I didn't know what the
assignments were asking. In some cases, I understood the assignment but
didn't know how to approach it. There were also assignments I could solve,
but I wasn't sure what I was calculating or why.
If you work through this book from the first to the last chapter, you'll find
it much easier to master the assignments in theoretical physics and to pass
the exams.
May physics be with you!
What you’ll learn

I Mathematical Tools

1 Differential equations
1.1 What is a differential equation?
1.2 Different notations
1.3 What should I do with a differential equation?
1.4 Recognize differential equations
1.5 Classify a differential equation
1.5.1 Ordinary or partial?
1.5.2 Of what order?
1.5.3 Linear or non-linear?
1.5.4 Homogeneous or inhomogeneous?
1.6 Constraints
1.6.1 Initial conditions
1.6.2 Boundary conditions

2 Tensors
2.1 Tensors of Zeroth and First Order
2.2 Second-order tensors
2.3 Tensors of higher orders
2.4 Symmetric and antisymmetric tensors
2.5 Combine tensors
2.5.1 Addition of tensors
2.5.2 Subtraction of tensors
2.5.3 The outer product of tensors (tensor product)
2.5.4 Contraction of tensors
2.6 Kronecker Delta
2.6.1 Kronecker delta is symmetric
2.6.2 Contracting with Kronecker delta
2.6.3 Kronecker delta sum
2.6.4 Scalar product in index notation
2.7 Levi-Civita symbol
2.7.1 Cross product in index notation

3 Dirac delta
3.1 Dirac delta in the coordinate origin
3.2 Shifted Dirac delta
3.3 Properties of the Dirac delta
3.4 Analogy to the Kronecker delta
3.5 Three-dimensional Dirac delta

4 Vector fields

5 Nabla operator
5.1 Gradient
5.2 Divergence
5.2.1 Positive divergence = source
5.2.2 Negative divergence = sink
5.2.3 Divergence-free vector field
5.3 Curl

6 Gauss Divergence Theorem
6.1 Surface Integral in the Divergence Theorem
6.2 Volume integral in the Divergence Theorem

7 Stokes’ Curl Theorem
7.1 Line integral in the Curl Theorem
7.2 Surface integral in the Curl Theorem

8 Fourier Series
8.1 The concept of Fourier series
8.2 Fourier coefficients
8.3 Fourier basis
8.4 Example: Fourier series for the sawtooth function

II Nature is extreme

9 Action Functional

10 Euler-Lagrange Equation
10.1 Lagrange function
10.2 How To: Euler-Lagrange equation
10.2.1 First step: Set generalized coordinates
10.2.2 Second step: Set up the Lagrange function
10.2.3 Third step: Calculate derivatives
10.2.4 Fourth step: Solve the differential equations
10.2.5 Fifth step: Set boundary conditions

III Electromagnetism

11 The Electric Vector Field

12 The Magnetic Vector Field

13 Maxwell’s Equations
13.1 Integral and Differential Representation
13.2 Electric and Magnetic Flux
13.3 Electric and Magnetic Voltage
13.4 First Maxwell Equation
13.4.1 Macroscopic form
13.4.2 Microscopic form
13.5 Second Maxwell Equation
13.5.1 Macroscopic form
13.5.2 Microscopic form
13.6 Third Maxwell Equation
13.6.1 Macroscopic form
13.6.2 Microscopic form
13.7 Fourth Maxwell Equation
13.7.1 Macroscopic form
13.7.2 Ampere’s Law
13.7.3 Microscopic form

14 Electromagnetic Waves
14.1 Wave equation for the E-field
14.2 Wave equation for the B field
14.3 A few hints

IV The Quantum World

15 Schrödinger equation
15.1 Time-Independent Schrödinger Equation
15.1.1 Energy conservation
15.1.2 Wave-Particle Duality
15.1.3 Plane wave
15.1.4 Wave equation
15.2 Interpretation of the wave function
15.2.1 Probability
15.2.2 |Ψ|² graphically
15.3 Normalization of the wave function
15.3.1 Example: Normalizing a wave function
15.4 Three-dimensional Schrödinger equation
15.5 Time-dependent Schrödinger equation
15.6 Stationary Wave Function
15.7 Hamilton operator
15.8 What you’ve learned

16 Bra-Ket Notation
16.1 Bra- and Ket-State Vectors
16.1.1 Ket vector
16.1.2 Bra Vector
16.2 Scalar and Inner Product
16.3 Continuous Quantum States
16.3.1 Overlap of Quantum States
16.4 Orthonormal Quantum States
16.5 Tensor product in Bra-Ket Notation
16.6 Projection Matrices
16.6.1 Basis Transformation with Projection Matrices
16.7 Schrödinger Equation in Bra-Ket Notation
16.8 Mean Values in Bra-Ket Notation

17 Represent Operators as Matrices

18 Hermitian Operators
18.1 Useful properties of Hermitian operators
18.2 Examples of Hermitian Matrices

19 Angular Momentum
19.1 Can We Measure Angular Momentum?
19.2 Can We Determine ALL Angular Momentum Components?
19.3 Quantum Numbers l and m

I Mathematical Tools

1. Differential equations

More: en.fufaev.org/differential-equations

If you plan to deal with...

• the atomic world,
• the movement of the planets,
• chemical processes,
• electrical circuits,
• weather forecasts
• or the spread of a virus

then you will eventually encounter differential equations. They appear in
every part of theoretical physics, so it is important to understand how to
work with them.

1.1 What is a differential equation?


Let's take a look at Hooke's law as a simple example:

$$ F = -D\, y \tag{1.1} $$

This law describes the restoring force F on a mass attached to a spring. The
mass experiences this force when you deflect it from its rest position by the
distance y. D is a constant coefficient that describes how hard it is to stretch
or compress the spring.

The mass m is hidden in the force. We can write the force using Newton's
second law as m a:

$$ m\, a = -D\, y \tag{1.2} $$

Here a is the acceleration that the mass experiences when it is deflected by y
from its rest position. As soon as you pull on the mass and release it, the spring
will start to oscillate back and forth. Without friction, as in this case, it will
never come to a standstill.

While the mass oscillates, the displacement y naturally changes. The
displacement is therefore dependent on the time t. This means that the
acceleration a also depends on the time. The mass naturally remains the same
at all times, regardless of how much the spring is deflected. To a good
approximation, this also applies to the spring constant D:

$$ m\, a(t) = -D\, y(t) \tag{1.3} $$

If we bring m to the other side, we can use this equation to calculate the
acceleration that the mass experiences for each displacement y:

$$ a(t) = -\frac{D}{m}\, y(t) \tag{1.4} $$

But what if we are interested in the question: at what displacement y will
the spring be after 24 seconds?

To answer such a question about the future, we need to know exactly how y
depends on the time t. So far we only know THAT y depends on time, but not HOW.

Differential equations come into play for such questions about the future. The
acceleration a is the second time derivative of the position, so in our case
it is the second derivative of y with respect to time t:

$$ \frac{d^2 y(t)}{dt^2} = -\frac{D}{m}\, y(t) \tag{1.5} $$

We have now set up a differential equation for the displacement y(t)! You
can recognize a differential equation by the fact that, in addition to the
unknown function y(t), it also contains derivatives of this function; in
this case, the second derivative of y with respect to time t.

We can therefore conclude: A differential equation is an equation that
contains an unknown function and derivatives of this function.

1.2 Different notations


You will certainly encounter many ways of writing a differential equation. We
have written down our equation 1.5 for the spring oscillation of a mass in
Leibniz notation. Here it is again:

$$ \frac{d^2 y(t)}{dt^2} = -\frac{D}{m}\, y(t) \tag{1.6} $$

You will often come across this in physics. We can also write it down a little
more compactly without mentioning the time dependency:

$$ \frac{d^2 y}{dt^2} = -\frac{D}{m}\, y \tag{1.7} $$

If the function y(t) depends only on the time t, we can write the time
derivative even more compactly using Newton notation. One time derivative of
y is written as a dot above the symbol: ẏ. With two derivatives, as in our
case, there are therefore two dots above the function:

$$ \ddot{y} = -\frac{D}{m}\, y \tag{1.8} $$

Obviously, this notation is rather unsuitable for considering the tenth derivative.

Another notation, which you may know better from mathematics, is
Lagrange notation. Here, primes are used for the derivatives, so two primes
stand for the second derivative:

$$ y'' = -\frac{D}{m}\, y \tag{1.9} $$

With Lagrange notation, it should be clear from the context which variable
the function is differentiated with respect to. If this is not clear, the
variable on which y depends must be mentioned explicitly:

$$ y''(t) = -\frac{D}{m}\, y(t) \tag{1.10} $$

Each notation has its advantages and disadvantages. However, it should be
noted that these are only different notations that represent the same physics.
Rearranging and renaming does not change the physics under the hood of a
differential equation. We could also call the displacement x:

$$ \frac{d^2 x(t)}{dt^2} = -\frac{D}{m}\, x(t) \tag{1.11} $$

1.3 What should I do with a differential equation?

In order to answer our original question, "At what displacement will the spring
be after 24 seconds?", we need to solve the differential equation. Solving a
differential equation means finding out exactly how the unknown function
y(t) depends on the variable t: y(t) = ...

For simple differential equations, such as that of the oscillating mass, there are
methods that you can use to get the unknown function y(t). Keep in mind,
however, that there is no general recipe for solving any differential equation.
For some differential equations there is not even an analytical solution!
Non-analytical means that you cannot write down a concrete formula for the
function y(t) = ...

Differential equations that cannot be solved analytically can only be solved
numerically on the computer. The computer then does not spit out a concrete
formula y(t) = ..., but data points y(t1), y(t2), y(t3), ..., which you can plot
in a y-t diagram and use to analyze the numerically solved function y(t).
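
As a minimal sketch of what "solving numerically" can look like, the following Python snippet integrates the spring equation 1.5 step by step with the Störmer-Verlet scheme and returns data points (t, y). The scheme itself, the function name solve_spring, the values D = 1 and m = 1, the step count, and the initial state y(0) = 1 with zero initial velocity are all assumptions chosen for illustration; none of them comes from the text. Because this particular equation also has a known analytical solution for these starting values, y(t) = cos(√(D/m) t), the last data point can be compared with it.

```python
import math

def solve_spring(D, m, y0, v0, t_end, n_steps):
    """Integrate y'' = -(D/m) * y with the Stoermer-Verlet scheme.

    Returns a list of (t, y) data points from t = 0 to t = t_end.
    """
    h = t_end / n_steps          # time step
    k = D / m
    # First step from a Taylor expansion: y(h) ~ y0 + h*v0 + (h^2 / 2) * y''(0)
    y_prev = y0
    y_curr = y0 + h * v0 + 0.5 * h * h * (-k * y0)
    points = [(0.0, y_prev), (h, y_curr)]
    for i in range(2, n_steps + 1):
        # Central difference for y'': (y_next - 2*y_curr + y_prev) / h^2 = -k * y_curr
        y_next = 2.0 * y_curr - y_prev + h * h * (-k * y_curr)
        y_prev, y_curr = y_curr, y_next
        points.append((i * h, y_curr))
    return points

# Assumed illustration values (not from the text).
points = solve_spring(D=1.0, m=1.0, y0=1.0, v0=0.0, t_end=24.0, n_steps=24000)
t_last, y_numeric = points[-1]
y_exact = math.cos(math.sqrt(1.0 / 1.0) * 24.0)
print(f"y({t_last:.1f}): numeric = {y_numeric:.6f}, exact = {y_exact:.6f}")
```

The list points is exactly the kind of output described above: no formula, only values y(t1), y(t2), ... that can be plotted in a y-t diagram.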

1.4 Recognize differential equations


As soon as you come across a differential equation, the first thing to do is to
find out

• what the unknown function is
• and on which variables it depends.

In the differential equation 1.5 of the oscillating mass, the function we are
looking for is called y(t) and it depends on the variable t:

$$ \frac{d^2 y(t)}{dt^2} = -\frac{D}{m}\, y(t) \tag{1.12} $$

As an example, take a look at the three-dimensional wave equation that describes
the electric field E of an electromagnetic wave propagating at the speed of
light c:

$$ \frac{\partial^2 E}{\partial x^2} + \frac{\partial^2 E}{\partial y^2} + \frac{\partial^2 E}{\partial z^2} = \frac{1}{c^2} \frac{\partial^2 E}{\partial t^2} \tag{1.13} $$

What is the unknown function in this equation? It is the function E. Why is
that? Because the differential equation contains the derivatives of E. On which
variables does the function depend? The dependency is not explicitly stated
here, but you can immediately see from the curved derivative symbol ∂ that it
must depend on several variables. The derivatives in the differential equation
show that the unknown electric field depends on x, y, z and t, so on a total
of four variables: E(x, y, z, t).

Let's look at a slightly more complex example. This system of differential
equations describes how a mass moves in a gravitational field:

$$ \frac{d^2 x}{dt^2} = G\, \frac{m}{x^2 + y^2 + z^2} \tag{1.14} $$
$$ \frac{d^2 y}{dt^2} = G\, \frac{m}{x^2 + y^2 + z^2} \tag{1.15} $$
$$ \frac{d^2 z}{dt^2} = G\, \frac{m}{x^2 + y^2 + z^2} \tag{1.16} $$

You have a coupled system of differential equations in front of you. In
this case, a single differential equation is not sufficient to describe the motion
of a mass in the gravitational field. In fact, three functions are required here,
namely the trajectories x(t), y(t) and z(t), which determine a position r(t) =
(x(t), y(t), z(t)) of the mass in three-dimensional space. Each function describes
the movement in one of the three spatial directions.

By coupled we mean that, for example, the first differential equation for the
function x(t) also contains the function y(t). This means that we cannot
simply solve the first differential equation independently of the second,
because the second differential equation tells us how y(t) behaves, and y(t)
appears in the first. The functions we are looking for occur in all three
differential equations, which means we have to solve all three equations
simultaneously. You will probably learn exactly how to do this in your math
lectures.

1.5 Classify a differential equation


There are many different differential equations. But if you look closely, you will
notice that some differential equations are similar to each other.

After you have found out what the unknown function is and on which variables
it depends, you should answer a few more basic questions to choose the
appropriate solution method for the differential equation. We need to
classify the differential equation:

• Is the differential equation ordinary or partial? Partial differential
  equations describe multidimensional problems and are significantly more
  complex.

• Of which order is the differential equation? First-order differential
  equations are usually easy to solve and describe, for example, exponential
  behavior such as radioactive decay or the cooling of a liquid. Second-order
  differential equations, on the other hand, are somewhat more complex and
  also occur frequently in nature. The wave equation of electrodynamics and
  the Schrödinger equation of quantum mechanics (second order in its spatial
  derivatives) are examples.

• Is the differential equation linear or non-linear? The superposition
  principle applies to linear differential equations, which is very useful
  for describing electromagnetic phenomena, for example. Non-linear
  differential equations are much more complex and are used, for example, in
  non-linear electronics to describe superconducting currents. In addition,
  chaos can only occur with non-linear differential equations of third or
  higher order. As soon as you come across a non-linear differential
  equation, you can usually put pen and paper aside and treat the equation
  numerically on the computer. Most non-linear differential equations cannot
  be solved analytically!

• Is the linear differential equation homogeneous or inhomogeneous?
  Homogeneous linear differential equations are simpler than inhomogeneous
  ones and describe, for example, an undisturbed oscillation, while
  inhomogeneous differential equations are also able to describe externally
  disturbed oscillations.

Once you have classified a differential equation, you can use a suitable solution
method to solve the equation. Even if there is no specific solution method, the
classification tells you how complex a differential equation is.

Let's classify the differential equations for the oscillating spring, the wave
equation, the mass in the gravitational field and the decay law.

1.5.1 Ordinary or partial?


Let's take a look at the differential equation for the oscillating mass:

$$ \frac{d^2 y(t)}{dt^2} + \frac{D}{m}\, y(t) = 0 \tag{1.17} $$

This is an ordinary differential equation. Ordinary means that the function
y(t) depends on only one variable, in this case the time t.

What about the wave equation for the electric field?

$$ \frac{\partial^2 E}{\partial x^2} + \frac{\partial^2 E}{\partial y^2} + \frac{\partial^2 E}{\partial z^2} = \frac{1}{c^2} \frac{\partial^2 E}{\partial t^2} \tag{1.18} $$

This is a partial differential equation. "Partial" means that the function you
are looking for depends on at least two variables and that there are derivatives
with respect to these variables. In this case, the function E depends on
four variables: x, y, z and t. And derivatives with respect to these variables also
occur in the differential equation.

1.5.2 Of what order?


The differential equation for the oscillating mass is a differential equation of
2nd order. The order of a differential equation is the order of the highest
derivative of the unknown function that occurs in it:

$$ \frac{d^2 y}{dt^2} + \frac{D}{m}\, y = 0 \tag{1.19} $$

Since the second derivative is the highest (and here the only) derivative in
the differential equation, the equation for the oscillating mass on the spring
is a 2nd order differential equation.
1.5.2.1 How to reduce the order
It is always possible to convert a higher order differential equation into a
system of 1st order differential equations. Sometimes this procedure is
helpful when solving a differential equation.

For example, we can convert the differential equation 1.19 for the oscillating
mass on the spring (second-order differential equation) into two coupled
first-order differential equations. All we have to do is introduce a new
function, let's call it v, and define it as the first time derivative of the
unknown function y:

$$ v = \frac{dy}{dt} \tag{1.20} $$

This is already one of the two coupled first-order differential equations. Now
we just have to express the second derivative in the original differential
equation through the derivative of v, using d²y/dt² = dv/dt. Then we get the
second 1st order differential equation.

The two coupled differential equations are as follows:

$$ v - \frac{dy}{dt} = 0 \tag{1.21} $$
$$ \frac{dv}{dt} + \frac{D}{m}\, y = 0 \tag{1.22} $$

The two differential equations are coupled because both unknown functions, y
and v, occur in each of them. We must therefore solve them simultaneously.

You can always proceed in this way if you want to reduce the order of a
differential equation. The price you have to pay for this is additional coupled
differential equations.
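
One practical reason for this reduction is that many standard numerical methods are formulated for first-order systems. The following sketch applies the classical fourth-order Runge-Kutta method to equations 1.21 and 1.22, each solved for its derivative. The choice of method, the function names rhs and rk4_step, and all numerical values (D = 4, m = 1, y(0) = 1, v(0) = 0, step size 0.01) are assumptions made for illustration and are not taken from the text.

```python
import math

def rhs(y, v, D, m):
    """Right-hand sides of the two coupled first-order equations.

    dy/dt = v            (equation 1.21 solved for dy/dt)
    dv/dt = -(D/m) * y   (equation 1.22 solved for dv/dt)
    """
    return v, -(D / m) * y

def rk4_step(y, v, h, D, m):
    """Advance (y, v) by one time step h with the classical Runge-Kutta method."""
    k1y, k1v = rhs(y, v, D, m)
    k2y, k2v = rhs(y + 0.5 * h * k1y, v + 0.5 * h * k1v, D, m)
    k3y, k3v = rhs(y + 0.5 * h * k2y, v + 0.5 * h * k2v, D, m)
    k4y, k4v = rhs(y + h * k3y, v + h * k3v, D, m)
    y_new = y + h / 6.0 * (k1y + 2.0 * k2y + 2.0 * k3y + k4y)
    v_new = v + h / 6.0 * (k1v + 2.0 * k2v + 2.0 * k3v + k4v)
    return y_new, v_new

# Assumed illustration values (not from the text).
D, m = 4.0, 1.0
y, v = 1.0, 0.0            # y(0) and v(0)
h, n_steps = 0.01, 1000    # integrate from t = 0 to t = 10
for _ in range(n_steps):
    # Both functions are advanced together because the equations are coupled.
    y, v = rk4_step(y, v, h, D, m)

omega = math.sqrt(D / m)
print("y(10):", y, "reference:", math.cos(omega * 10.0))
print("v(10):", v, "reference:", -omega * math.sin(omega * 10.0))
```

Note that rk4_step updates y and v in the same step: neither equation can be advanced without the current value of the other function, which is what "coupled" means in practice.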

Let's continue. Of what order is the differential equation for the decay law?

$$ -\lambda\, N = \frac{dN}{dt} \tag{1.23} $$

This is a 1st order differential equation, because the highest derivative of
the function N(t) that occurs is the first derivative.

1.5.3 Linear or non-linear?


The differential equation for the oscillating mass is linear:

$$ \left( \frac{d^2 y}{dt^2} \right)^{1} + \frac{D}{m}\, y^{1} = 0 \tag{1.24} $$

Linear means that the unknown function and its derivatives only occur to
the power of 1 and that there are no products of the function with itself
or with its derivatives, such as y² or y ÿ. There are also no compositions
of functions, such as sin(y(t)) or √(y(t)). When products, compositions or
higher powers occur, we speak of non-linear differential equations.

Note! The superscript 2 in the Leibniz notation d²y/dt² of the second
derivative is not a power of the derivative, but merely a notation for the
second derivative.
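The decay law is also linear:

$$ -\lambda\, N^{1} = \left( \frac{dN}{dt} \right)^{1} \tag{1.25} $$

What about the wave equation? It is also linear:

$$ \left( \frac{\partial^2 E}{\partial x^2} \right)^{1} + \left( \frac{\partial^2 E}{\partial y^2} \right)^{1} + \left( \frac{\partial^2 E}{\partial z^2} \right)^{1} = \frac{1}{c^2} \left( \frac{\partial^2 E}{\partial t^2} \right)^{1} \tag{1.26} $$

The coupled differential equation system for the motion of a mass in the
gravitational field, on the other hand, is non-linear because higher powers of
the functions x(t), y(t) and z(t) occur there, namely x², y² and z².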
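
A practical consequence of linearity is the superposition principle: the sum of two solutions of a linear homogeneous equation is again a solution. The sketch below checks this for the oscillating mass and contrasts it with a non-linear equation. The non-linear example y′ = y² with its solutions y = 1/(c − t), the value D/m = 2, the evaluation time t = 0.3 and all function names are assumptions introduced only for illustration; they do not appear in the text. A residual of zero means "the equation is satisfied".

```python
import math

OMEGA2 = 2.0  # assumed value of D/m

def linear_residual(y, ypp):
    """Left-hand side of y'' + (D/m) y = 0; zero when y solves the equation."""
    return ypp + OMEGA2 * y

def nonlinear_residual(y, yp):
    """Left-hand side of y' - y^2 = 0 (illustrative non-linear equation)."""
    return yp - y * y

t = 0.3
w = math.sqrt(OMEGA2)

# Two solutions of the linear equation together with their second derivatives.
y1, y1pp = math.cos(w * t), -OMEGA2 * math.cos(w * t)
y2, y2pp = math.sin(w * t), -OMEGA2 * math.sin(w * t)

# Two solutions of y' = y^2: y = 1/(c - t) has the derivative y' = 1/(c - t)^2.
u1, u1p = 1.0 / (1.0 - t), 1.0 / (1.0 - t) ** 2
u2, u2p = 1.0 / (2.0 - t), 1.0 / (2.0 - t) ** 2

# Insert the SUM of two solutions into each equation.
lin_sum = linear_residual(y1 + y2, y1pp + y2pp)
nonlin_sum = nonlinear_residual(u1 + u2, u1p + u2p)

print("linear equation, residual of the sum:    ", lin_sum)
print("non-linear equation, residual of the sum:", nonlin_sum)
```

For the linear equation the residual of the sum vanishes. For the non-linear one it equals −2·u1·u2, which is not zero, so the sum of two solutions is not a solution.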

1.5.4 Homogeneous or inhomogeneous?


For homogeneous and inhomogeneous types of differential equations, the
coefficients multiplied by the unknown function and its derivatives are
important. For some solution methods, it is important to distinguish
between...

• constant coefficients: these do not depend on the variables on which the
  unknown function depends.
• non-constant coefficients: these depend on the variables on which the
  unknown function depends.

A coefficient does not necessarily have to be multiplied by the unknown function
or its derivative. The coefficient can also stand alone! In this case, the
coefficient is referred to as the perturbation function.
Let's take another look at the oscillating mass:

$$ 1 \cdot \frac{d^2 y}{dt^2} + \frac{D}{m}\, y = 0 \tag{1.27} $$

In this differential equation, there is an interesting coefficient that is
multiplied by the unknown function y, namely D/m. Strictly speaking, the second
derivative is also preceded by a coefficient, namely 1, and the stand-alone
coefficient, i.e. the perturbation function, is 0. If the perturbation
function is zero, then we call the linear differential equation homogeneous.

The wave equation also has no perturbation function (no stand-alone coefficient).
The differential equation is therefore homogeneous:

$$ 1 \cdot \frac{\partial^2 E}{\partial x^2} + 1 \cdot \frac{\partial^2 E}{\partial y^2} + 1 \cdot \frac{\partial^2 E}{\partial z^2} = \frac{1}{c^2} \frac{\partial^2 E}{\partial t^2} \tag{1.28} $$

The differential equation for a forced oscillation, on the other hand, is
inhomogeneous:

$$ 1 \cdot \frac{d^2 y}{dt^2} + \mu\, \frac{dy}{dt} + \frac{D}{m}\, y = F(t) \tag{1.29} $$

Here, the external force F(t) corresponds to the perturbation function. As you
can see, it stands alone without being multiplied by the function y(t) or its
derivatives. In addition, the perturbation function is time-dependent, so it is a
non-constant coefficient.

1.6 Constraints
A differential equation alone is not sufficient to describe a physical system
uniquely. The solution of a differential equation describes many possible
systems that exhibit a certain behavior.
For example, the solution N(t) of the decay law describes an exponential
behavior. However, the knowledge of an exponential behavior is not
sufficient to be able to say specifically how many nuclei N(t = 10) are still
undecayed after 10 seconds.
For this very reason, every differential equation usually comes with constraints.
These are additional pieces of information that must be given for a differential
equation in order to determine its unique solution. The number
of necessary constraints depends on the order of the equation.

• Only one constraint is required for a 1st order differential equation:
one function value of the unknown function y(t). For the decay law, for
example, it should be known how many undecayed nuclei N(t = 0) there
were at the time t = 0. For example, 1000 nuclei. Then the constraint is:
N(0) = 1000.

• For a 2nd order differential equation, two constraints are necessary: a
function value of the unknown function y(t) and a function value of its first
derivative y′(t). For the oscillating mass, for example, it should be known
what displacement y(t = 0) the spring had at the time t = 0, e.g. y(0) = 1,
and what the velocity y′(t = 0) of the mass was at this time, e.g. y′(0) = 0.

• For a 3rd order differential equation, three constraints would then be
necessary so that the solution of the differential equation uniquely
describes the system under consideration: a function value y(t) = A of
the unknown function, a function value y′(t) = B of its first derivative
and a function value y′′(t) = C of its second derivative.

• For a 4th order differential equation, four constraints would then be
necessary and so on...
We can therefore state:
In order to uniquely determine the solution of an
nth order differential equation, n constraints are necessary.

Most of the time you will come across initial conditions and boundary
conditions. These are also just names for constraints that tell you what kind
of information you have about the system.
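
To see how a single constraint pins down the solution of a 1st order equation, here is a minimal Python sketch for the decay law. It assumes the general solution N(t) = C · exp(−λt). The value of the decay constant λ and the function name are chosen here purely for illustration.

```python
import math

def decay_solution(n0, decay_constant):
    """Return N(t) = n0 * exp(-decay_constant * t), which satisfies N(0) = n0."""
    def n(t):
        return n0 * math.exp(-decay_constant * t)
    return n

# Assumed decay constant in 1/s (illustrative value, not taken from the text).
decay_constant = 0.1

# The constraint N(0) = 1000 fixes the free constant C of the general solution.
n = decay_solution(1000, decay_constant)

print(n(0))   # number of undecayed nuclei at t = 0
print(n(10))  # number of undecayed nuclei after 10 seconds
```

Without the constraint, every value of C would give a valid solution. Only N(0) = 1000 selects the one curve that answers the question about t = 10.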

1.6.1 Initial conditions


Sometimes, for example, you know what state the system was in at a certain point in
time t = t0. This can be the initial time at which you deflected and released an
oscillating mass. In this case, we speak of initial conditions. You specify
what the displacement y(t0) was at a specific point in time. And since we
need two constraints for the 2nd order equation, you also specify what the
derivative of the displacement (i.e. the velocity) y′(t0) was at time t0.

We call a differential equation together with its initial conditions an
initial value problem. If we solve the initial value problem, we can use the
solution to predict the future behavior of a system.
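
The initial value problem for the oscillating mass can be sketched in Python as follows. The sketch assumes the general solution y(t) = A cos(ωt) + B sin(ωt) with ω = sqrt(D/m). The values of D and m and the function name are illustrative assumptions.

```python
import math

def oscillator_from_initial_conditions(D, m, t0, y0, v0):
    """Solve y'' + (D/m) y = 0 with y(t0) = y0 and y'(t0) = v0."""
    omega = math.sqrt(D / m)
    def y(t):
        # The two initial conditions fix the two free constants y0 and v0/omega.
        return (y0 * math.cos(omega * (t - t0))
                + (v0 / omega) * math.sin(omega * (t - t0)))
    return y

D, m = 4.0, 1.0  # assumed spring constant and mass (illustrative values)

# Initial conditions from the text: y(0) = 1 and y'(0) = 0.
y = oscillator_from_initial_conditions(D, m, t0=0.0, y0=1.0, v0=0.0)

print(y(0.0))  # displacement at the initial time
print(y(2.5))  # predicted displacement at a later time
```

Once both constants are fixed, the function y can be evaluated at any later time, which is exactly what "predicting the future behavior" means here.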

1.6.2 Boundary conditions


Let's take another look at the second-order differential equation for the
oscillating mass. And let's assume that we know the displacement y(t0) = y0
at time t0. Sometimes we are unlucky and do not know what velocity the
oscillating mass had at the initial time t0. We therefore do not know the
derivative y′(t0) at the time t0 at which we know the displacement. However,
we need two constraints, otherwise the solution is not unique and we cannot
use the function y(t) to calculate specific numbers for displacement.
However, we may know that after t = 6 seconds, for example, the oscillating
mass was in the maximum deflected state. We therefore know the displacement
y(6) = y6.

If we know constraints such as y(t1) = y1 and y(t2) = y2, which describe the
system at two different times t1 and t2, then we speak of boundary
conditions: y(t1) = y1 and y(t2) = y2.
We call a differential equation together with two boundary conditions
a boundary value problem. If we solve the boundary value problem, we can
use the solution to predict how the system will behave within these boundary
values.
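
As a hedged illustration of a boundary value problem, the following Python sketch determines the two free constants of the oscillating mass from two displacements. It assumes the general solution y(t) = A cos(ωt) + B sin(ωt) with ω = sqrt(D/m). The numerical values and the function name are made up for this example.

```python
import math

def oscillator_from_boundary_conditions(D, m, t1, y1, t2, y2):
    """Solve y'' + (D/m) y = 0 with y(t1) = y1 and y(t2) = y2."""
    omega = math.sqrt(D / m)
    c1, s1 = math.cos(omega * t1), math.sin(omega * t1)
    c2, s2 = math.cos(omega * t2), math.sin(omega * t2)
    # Determinant of the 2x2 linear system for A and B.
    # It equals sin(omega * (t2 - t1)).
    det = c1 * s2 - s1 * c2
    if abs(det) < 1e-12:
        raise ValueError("these boundary conditions do not fix a unique solution")
    # Cramer's rule for the two unknown constants.
    A = (y1 * s2 - s1 * y2) / det
    B = (c1 * y2 - c2 * y1) / det
    return lambda t: A * math.cos(omega * t) + B * math.sin(omega * t)

# Assumed example: D/m = 1, displacement 1.0 at t = 0 and 0.5 at t = 6.
y = oscillator_from_boundary_conditions(D=1.0, m=1.0, t1=0.0, y1=1.0, t2=6.0, y2=0.5)

print(y(0.0), y(6.0))  # reproduces the two boundary values
print(y(3.0))          # displacement between the boundaries
```

Unlike the initial value problem, the two conditions here come from two different times, and the determinant check shows that not every pair of boundary conditions yields a unique solution.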
The 'function value at two different times' was of course just an example.
Instead of time, it could be any variable that defines the system, usually at the
boundaries: at different times, at different locations, at different angles and
so on.
You have now learned the most important basics of differential equations. This
knowledge will help you as an undergrad.
2. Tensors

More: en.fufaev.org/tensors

Before we fully understand tensors in their general definition, let's first get to
know them from an engineering perspective. As long as you understand
scalars, vectors, and matrices, you'll find it easy to understand tensors from
this perspective because tensors are nothing but a generalization of
scalars, vectors, and matrices. Just as we use scalars, vectors, and
matrices to describe physical laws, we can use tensors to describe physics.
Tensors are an even more powerful tool with which we can describe physics
that cannot be described solely with scalars, vectors, and matrices. In order to
develop the modern theory of gravitation, Albert Einstein had to first
understand the concept of tensors. Only then could he mathematically
formulate the general theory of relativity.

2.1 Tensors of Zeroth and First Order


Let's start with the simplest tensor: The zeroth order tensor. This is a
scalar σ , that is an ordinary number. This tensor has a single component and
represents, for example, the electrical conductivity of a wire. This zero-order
tensor σ indicates how well a wire conducts electricity in this case.

A slightly more complex tensor, let's call it j, is a first-order tensor. This is a
vector with three components j1, j2 and j3 in three-dimensional space:

$$ j = \begin{bmatrix} j_1 \\ j_2 \\ j_3 \end{bmatrix} \tag{2.1} $$

In Eq. 2.1 we have represented the first-order tensor as a column vector. Of
course, we can also represent it as a row vector:

$$ j = [\, j_1,\; j_2,\; j_3 \,] \tag{2.2} $$

At this stage, it doesn't matter how we write down the components. But
remember that it will play a role later!

The notation of first-order tensors as row or column vectors only makes sense if
we are working with concrete numbers, such as in computational physics, where we
use tensors to obtain concrete numbers. In order to calculate theoretically, for
example to derive equations or simply to formulate a physical theory, the tensors
are formulated compactly in index notation. You are probably already familiar
with this from vector calculus. Instead of writing out all three components of
the first-order tensor, we write them with an index k, that is, jk. What we
call the index does not matter. jk stands for the first component j1, the second
component j2 or the third component j3, depending on which value we insert for
the index k. In theoretical physics, we usually do not insert a specific value
because we want to write the physics as generally and compactly as possible.

From this index notation jk it is not clear whether it represents a column or
row vector. This is not good, because later it will be important to distinguish
between column and row vectors. But we can easily introduce this distinction
into our index notation by writing the index at the bottom, j_k, if we mean a
row vector. And we write the index at the top, j^k, if we mean a column vector.
The notation of indices above and below has a deeper meaning, which we will get
to know later. At this stage, we only distinguish the representation of the
first-order tensor.

2.2 Second-order tensors


The next more complex tensor is the second-order tensor. Let us also denote
this tensor by σ, because a second-order tensor can describe electrical
conductivity. Electrical conductivity as a zero-order tensor describes isotropic
materials. The conductivity as a second-order tensor, on the other hand,
describes a non-isotropic material in which the conductivity varies depending
on the direction in which the current flows.

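You have certainly already become familiar with this tensor in mathematics in
the matrix representation. In a three-dimensional space, the second-order
tensor is a 3x3 matrix:

$$ \sigma = \begin{bmatrix} \sigma_{11} & \sigma_{12} & \sigma_{13} \\ \sigma_{21} & \sigma_{22} & \sigma_{23} \\ \sigma_{31} & \sigma_{32} & \sigma_{33} \end{bmatrix} \tag{2.3} $$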

We also use index notation for the second-order tensor and note the components
of the matrix with σmk, for example. The indices m and k can have the values
1, 2 or 3. The index m specifies the row and the index k specifies the column.

2.3 Tensors of higher orders


We can continue the game and consider a third-order tensor. This then has
three indices: σmkn. The fourth-order tensor has four indices: σmkni. And so
on. The indices of a tensor of any order can also be superscripted. For example,
the indices mk of the fourth-order tensor can be at the top and the indices ni
at the bottom: σ^{mk}_{ni}. You will find out exactly what this means in the next
chapters.

The number of components d^r of a tensor depends on the space dimension
d and on the order (rank) r of the tensor. In a three-dimensional space (d = 3),
a second-order tensor (r = 2) therefore has 3^2 = 9 components.
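
The counting rule d^r can be made tangible by listing all index combinations. The following Python sketch is only an illustration; the helper name `index_tuples` is an assumption introduced here.

```python
from itertools import product

def index_tuples(d, r):
    """All index combinations of an order-r tensor in d dimensions (indices 1..d)."""
    return list(product(range(1, d + 1), repeat=r))

print(len(index_tuples(3, 0)))  # 1 component: a scalar
print(len(index_tuples(3, 1)))  # 3 components: a vector
print(len(index_tuples(3, 2)))  # 9 components: a 3x3 matrix
print(index_tuples(3, 2)[:4])   # [(1, 1), (1, 2), (1, 3), (2, 1)]
```

Each tuple corresponds to one component, for example (2, 1) to σ21, so the length of the list is the number of components.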

2.4 Symmetric and antisymmetric tensors


In theoretical physics, especially in the theory of relativity and quantum
mechanics, we will regularly encounter symmetric and antisymmetric tensors.
A symmetric tensor tij remains the same if we swap its indices: tij = tji.
Specifically, swapping the indices of the second-order tensor as a matrix means
that the matrix remains the same if we transpose it, that is, mirror the rows
and columns on the diagonal:

$$ \begin{bmatrix} t_{11} & t_{12} & t_{13} \\ t_{21} & t_{22} & t_{23} \\ t_{31} & t_{32} & t_{33} \end{bmatrix} = \begin{bmatrix} t_{11} & t_{21} & t_{31} \\ t_{12} & t_{22} & t_{32} \\ t_{13} & t_{23} & t_{33} \end{bmatrix} \tag{2.4} $$

This symmetry property of tensors is very useful and simplifies calculations in
computational physics enormously. Moreover, this property is crucial in quantum
mechanics, because real symmetric matrices (and, more generally, Hermitian
matrices) have real eigenvalues. They therefore represent physical quantities
(we call them observables) that we can measure in an experiment. So if you have
a symmetric tensor in front of you, as a theoretical physicist you should
immediately get a dopamine kick. The Kronecker delta δmk, for example, is a
concrete example of a simple symmetric tensor.

We have considered a second-order tensor. What if the tensor is of a higher
order? What about its symmetry property then? For example, if the tensor
has four indices: tmkni, and it remains the same if we swap the first two indices:
tmkni = tkmni, then we are talking about a symmetric tensor in the first two
indices or more precisely: symmetric in the mk indices.
However, we will also encounter tensors that are antisymmetric. An
antisymmetric tensor tij changes sign when we swap its indices: tij = −tji.
If the antisymmetric tensor is represented as a matrix, then it is equal to its
negative transpose:

$$ \begin{bmatrix} t_{11} & t_{12} & t_{13} \\ t_{21} & t_{22} & t_{23} \\ t_{31} & t_{32} & t_{33} \end{bmatrix} = - \begin{bmatrix} t_{11} & t_{21} & t_{31} \\ t_{12} & t_{22} & t_{32} \\ t_{13} & t_{23} & t_{33} \end{bmatrix} \tag{2.5} $$

Unfortunately, most tensors are neither symmetric nor antisymmetric. But the
great thing is: Mathematically, we can split every second-order tensor t into a
symmetric part s and an antisymmetric part a: t = s + a.

Let's take a look at how we can practically decompose a general tensor tij of the
second order into the two parts.
1. The symmetric part sij of the tensor tij is sij = 1/2 (tij + tji). Here we
have swapped the two indices and added the two tensors together. The
factor 1/2 is important because we have counted the tensor twice here.
2. The antisymmetric part aij of the tensor tij is aij = 1/2 (tij − tji). Here
we have swapped the two indices and given the swapped tensor a minus sign.
Then we add the two tensors together. The factor 1/2 is also important here.
3. We then add the symmetric and antisymmetric parts together to
obtain the total tensor:

$$ t_{ij} = \frac{1}{2}\left(t_{ij} + t_{ji}\right) + \frac{1}{2}\left(t_{ij} - t_{ji}\right) \tag{2.6} $$

The first term in Eq. 2.6 is the symmetric part of the tensor tij and the second
term is the antisymmetric part.
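
The three steps above can be sketched in Python for a concrete 3x3 matrix. The sample numbers and the helper names are assumptions for illustration. Tensors are stored as nested lists.

```python
def transpose(t):
    """Swap the two indices: result[i][j] = t[j][i]."""
    n = len(t)
    return [[t[j][i] for j in range(n)] for i in range(n)]

def symmetric_part(t):
    """s_ij = 1/2 (t_ij + t_ji)"""
    n = len(t)
    return [[0.5 * (t[i][j] + t[j][i]) for j in range(n)] for i in range(n)]

def antisymmetric_part(t):
    """a_ij = 1/2 (t_ij - t_ji)"""
    n = len(t)
    return [[0.5 * (t[i][j] - t[j][i]) for j in range(n)] for i in range(n)]

# Sample tensor that is neither symmetric nor antisymmetric.
t = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0],
     [7.0, 8.0, 9.0]]

s = symmetric_part(t)
a = antisymmetric_part(t)

print(s)  # [[1.0, 3.0, 5.0], [3.0, 5.0, 7.0], [5.0, 7.0, 9.0]]
print(a)  # [[0.0, -1.0, -2.0], [1.0, 0.0, -1.0], [2.0, 1.0, 0.0]]
```

Note that the diagonal of the antisymmetric part is zero, as it must be for any tensor with aij = −aji.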

2.5 Combine tensors


We can do little with tensors alone. We need to be able to perform calculations
with them. There are several arithmetic operations that we can use to combine
two tensors a and b into a new tensor c.

2.5.1 Addition of tensors


We can add two tensors aij and bij of the same order:

cij = aij + bij (2.7)

The result is a new tensor cij of the same order. If we represent the tensors a
and b as matrices, then adding tensors is nothing other than adding matrices
component by component. The component a11 of the matrix a in the first row
and first column is added to the component b11 of the matrix b, which is also
in the same row and the same column. This is how matrix addition works. We
proceed in the same way with all other components. The result is the matrix c:

$$ \underbrace{\begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}}_{a} + \underbrace{\begin{bmatrix} b_{11} & b_{12} & b_{13} \\ b_{21} & b_{22} & b_{23} \\ b_{31} & b_{32} & b_{33} \end{bmatrix}}_{b} = \underbrace{\begin{bmatrix} a_{11} + b_{11} & a_{12} + b_{12} & a_{13} + b_{13} \\ a_{21} + b_{21} & a_{22} + b_{22} & a_{23} + b_{23} \\ a_{31} + b_{31} & a_{32} + b_{32} & a_{33} + b_{33} \end{bmatrix}}_{c} \tag{2.8} $$

2.5.2 Subtraction of tensors


We can subtract two tensors aij and bij of the same order:

cij = aij − bij (2.9)

The result is a new tensor cij of the same order. Subtraction works in the same
way as addition. Simply replace the plus signs in Eq. 2.8 with minus signs.

2.5.3 The outer product of tensors (tensor product)


The next operation is probably new to you, namely the outer product ⊗.
Sometimes it is also called the tensor product. Here, components with the same
indices are not combined with each other, as is the case with the addition and
subtraction of tensors. For this operation, the indices of the tensors aij and
bkm must therefore be named differently. This is why the tensor bkm has been
given the indices k and m.

cijkm = aij ⊗ bkm (2.10)

If we form the tensor product 2.10 of second-order tensors, then the result cijkm
is a fourth-order tensor. If, on the other hand, we form the tensor product
of tensors ai and bk of the first order, then the result is a tensor of the second
order:

cik = ai ⊗ bk = ai bk (2.11)

This is how the tensor product works with two tensors of any order: the orders
of the two factors add up. For two zero-order tensors, the result therefore
remains a zero-order tensor. The tensor sign ⊗ is usually omitted in 2.10 and 2.11.

Let's make a concrete example of the tensor product that we can illustrate
well, namely the tensor product of first-order tensors as in Eq. 2.11. They are
represented by the vectors: a = [a1, a2, a3] and b = [b1, b2, b3]. The result is a
second-order tensor represented by a matrix:

$$ \begin{bmatrix} a_1 \\ a_2 \\ a_3 \end{bmatrix} \otimes \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix} = \underbrace{\begin{bmatrix} a_1 b_1 & a_1 b_2 & a_1 b_3 \\ a_2 b_1 & a_2 b_2 & a_2 b_3 \\ a_3 b_1 & a_3 b_2 & a_3 b_3 \end{bmatrix}}_{c} \tag{2.12} $$

The first index, the index i, numbers the rows of the matrix by definition and
the second index, the index k, numbers the columns.
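
The component rule cik = ai bk translates almost literally into code. This is a small Python sketch with made-up numbers. The function name is an assumption.

```python
def outer_product(a, b):
    """c_ik = a_i * b_k: row index i comes from a, column index k from b."""
    return [[a_i * b_k for b_k in b] for a_i in a]

a = [1, 2, 3]
b = [4, 5, 6]

c = outer_product(a, b)
for row in c:
    print(row)
# [4, 5, 6]
# [8, 10, 12]
# [12, 15, 18]
```

Swapping the arguments gives the transposed matrix, which shows that the outer product is in general not commutative.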

The tensor product does not necessarily have to be between two tensors of the
same order. For example, we can also form the tensor product with the
third-order tensor Aijm and the second-order tensor Bkn. The result is a
fifth-order tensor Cijmkn:

Cijmkn = Aijm Bkn (2.13)

As you have probably already noticed, Bkn, for example, specifically represents
the kn-th component of the tensor B. And Aijm is the ijm-th component of
the tensor A. If we form the tensor product as in 2.13, then this is the tensor
product of the components. The result is the ijmkn-th component of the tensor
C. When we write a tensor with indices, we always mean its components.
Nevertheless, we casually say tensor for its component notation.

The tensor product naturally works in the same way with superscript indices,
which we will get to know in the next chapter. If the indices ij are at the top
of the tensor A, then they must also be at the top of the resulting tensor C:

$$ C^{ij}{}_{mkn} = A^{ij}{}_{m}\, B_{kn} \tag{2.14} $$

2.5.4 Contraction of tensors


The next arithmetic operation we can perform is the contraction of a tensor.
Let's take the fourth-order tensor tijmk as an example. The contraction of this
tensor means the following:

1. We select two of its indices. For example, the indices i and m: tijmk.

2. Then we set the two indices equal: i = m. For example, we can call them
both i: tijik.

3. We then sum over the index i:

$$ \sum_{i=1} t_{ijik} $$

In three-dimensional space, the index i ranges from 1 to 3, so the
contraction of the tensor tijmk results in the following sum:

$$ \sum_{i=1}^{3} t_{ijik} = t_{1j1k} + t_{2j2k} + t_{3j3k} $$

If we want to communicate these three steps to another physicist, then we say:
Contraction of the indices i and m of the tensor tijmk. Or: contraction of the
first and third index of the tensor tijmk.
The contraction is very useful because it reduces the order of a tensor. For
example, the contraction of the fourth-order tensor tijmk has reduced its order
by two. The result of the contraction is a second-order tensor: tijik = cjk.
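
The three steps of the contraction can be sketched in Python. The fourth-order tensor below is an artificial example whose components encode their own indices (t_ijmk = 1000 i + 100 j + 10 m + k), so that the sum is easy to check by hand. The function name and the nested-list storage are assumptions.

```python
def contract_first_and_third(t):
    """c_jk = sum over i of t_ijik for a fourth-order tensor stored as nested lists."""
    d = len(t)
    return [[sum(t[i][j][i][k] for i in range(d)) for k in range(d)]
            for j in range(d)]

d = 3
# Component t_ijmk (indices 1..3) is stored at t[i-1][j-1][m-1][k-1].
t = [[[[1000 * i + 100 * j + 10 * m + k
        for k in range(1, d + 1)]
       for m in range(1, d + 1)]
      for j in range(1, d + 1)]
     for i in range(1, d + 1)]

c = contract_first_and_third(t)

# c_11 = t_1111 + t_2121 + t_3131
print(c[0][0])  # 6363
```

The input has 3^4 = 81 components, while the result has only 3^2 = 9, which reflects the reduction of the order by two.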

In physics, we use the Einstein summation convention, which states that we can
omit the sum sign in $\sum_{i=1} t_{ijik}$ to simplify the notation if the same
index appears twice in a term. With the tensor tijik in combination with the
Einstein summation convention, summation is therefore performed over the index i:

tijik = t1j1k + t2j2k + t3j3k (2.15)

If we contract a second-order tensor tii then the contraction is also called trace
of the tensor:

tii = t11 + t22 + t33 = Tr (t) (2.16)

The result is a zero-order tensor, that is, a scalar.

Of course, we can also contract the indices of two different tensors. For example,
let's take a tensor Mij and a tensor vk. The tensor product Mij vk without
contraction results in a third-order tensor. Now we contract the indices j and k.
Then, in the matrix and vector representation, this corresponds exactly to the
multiplication of a matrix M with a vector v. The result ui is a first-order
tensor, that is, a vector:

$$ \underbrace{M_{ij}}_{M}\; \underbrace{v_j}_{v} = \underbrace{u_i}_{u} \tag{2.17} $$
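
Both special cases, the trace and the matrix-vector product, can be written as explicit sums over the repeated index. The following Python sketch uses made-up numbers. The function names are assumptions.

```python
def trace(t):
    """t_ii: contraction of the two indices of a second-order tensor."""
    return sum(t[i][i] for i in range(len(t)))

def contract_matrix_vector(M, v):
    """u_i = M_ij v_j: the repeated index j is summed over."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

M = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]
v = [1, 0, 2]

print(trace(M))                      # 15
print(contract_matrix_vector(M, v))  # [7, 16, 25]
```

In both functions the summed index disappears from the result, which is why the trace is a scalar and the product Mij vj is a vector.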

2.6 Kronecker Delta


More: en.fufaev.org/kronecker-delta
The Kronecker delta δij has become indispensable in theoretical physics. You
will encounter this relatively simple, yet powerful tensor in practically all areas
of theoretical physics. It is used, for example, to make long expressions more
compact and to simplify complicated expressions. In combination with the Levi-
Civita symbol, which you will learn in the next chapter, the two tensors are very
useful!

The Kronecker delta δij is written with a small Greek delta and is either 1 or 0,
depending on the value of the two indices i and j. The Kronecker delta is equal
to 1 if i and j are equal. And the Kronecker delta is equal to 0 if i and j are
not equal:

$$ \delta_{ij} = \begin{cases} 1 & i = j \\ 0 & i \neq j \end{cases} \tag{2.18} $$

Let's take a few examples:

• δ11 = 1, as both indices are equal.

• δ23 = 0, as both indices are different.

• a δ33 = a · 1 = a

• δ23 δ22 = 0 · 1 = 0

Also note that, unless otherwise stated, we use the Einstein summation
convention we learned earlier. The same index is used for summation:

δij δjk = δi1 δ1k + δi2 δ2k + δi3 δ3k (2.19)

However, there are pitfalls when using the Einstein summation convention. For
example, with the differential operator ∂i. Ordinary components are just numbers
and may be reordered, but you are not allowed to swap ∂i and fi if ∂i acts as a
derivative on fi:

∂i fi ≠ fi ∂i

So be careful with operators in index notation!

Let's look at four useful rules with Kronecker delta that you can always use
when summing over double indices.

2.6.1 Kronecker delta is symmetric


Indices, here i and j , may be swapped:

δij = δji (2.20)



2.6.2 Contracting with Kronecker delta


If the product of two or more Kronecker deltas contains a summation index,
here j , then the product can be combined, whereby the summation index j
disappears:

δij δjk = δik (2.21)

An example with two summation indices:

δij δkj δin = δkn (2.22)

This should make it clear that the order of contraction of the Kronecker delta
is irrelevant.

We can also apply this rule to the contraction of the Kronecker delta with
another tensor, here ai :

ai δij = aj (2.23)

Another example: Γjmk δnk = Γjmn.

2.6.3 Kronecker delta sum


If i takes the values from 1 to n, then the following rule applies:

$$ \delta_{ii} = \underbrace{\delta_{11} + \delta_{22} + \dots}_{n} = n \tag{2.24} $$

In the four-dimensional spacetime: δii = 4.
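
The rules above can be checked by brute force, writing out every Einstein sum explicitly. This Python sketch assumes three dimensions and an arbitrary sample vector. The function name `delta` is introduced here for illustration.

```python
def delta(i, j):
    """Kronecker delta: 1 if the indices are equal, otherwise 0."""
    return 1 if i == j else 0

n = 3
idx = range(1, n + 1)

# Symmetry: delta_ij = delta_ji
symmetric = all(delta(i, j) == delta(j, i) for i in idx for j in idx)

# Contraction: delta_ij delta_jk = delta_ik (sum over j)
contracts = all(sum(delta(i, j) * delta(j, k) for j in idx) == delta(i, k)
                for i in idx for k in idx)

# Contraction with another tensor: a_i delta_ij = a_j (sum over i)
a = [5, 7, 9]
renamed = [sum(a[i - 1] * delta(i, j) for i in idx) for j in idx]

# Sum rule: delta_ii = n
trace = sum(delta(i, i) for i in idx)

print(symmetric, contracts, renamed, trace)  # True True [5, 7, 9] 3
```

The third check shows the practical meaning of the contraction rule: the Kronecker delta simply renames the summed index of the tensor it is contracted with.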



2.6.4 Scalar product in index notation


We can easily illustrate how useful the Kronecker delta is in theoretical physics
using the scalar product. Let's look at any three-dimensional vector:

a = [a1 , a2 , a3 ] (2.25)
= a1 ê1 + a2 ê2 + a3 ê3 (2.26)
= ai êi (2.27)

Here, ê1 , ê2 and ê3 are three basis vectors that are normalized and
orthogonal to each other. In this case, they span an orthogonal
three-dimensional coordinate system. For the third equal sign, we have used
the Einstein summation convention and represented the vector a in index
notation: ai êi .
Let's now take another vector b and also represent it in index notation: bj êj.
Note that we have to name the indices of the two vectors differently.
Now we form the scalar product of the two vectors:

a · b = (ai êi ) · (bj êj ) (2.28)

We can sort the objects in index notation in Eq. 2.28 as we like. Let's take
advantage of this and put brackets around the basis vectors to emphasize their
importance when introducing the Kronecker delta:

a · b = ai bj (êi · êj ) (2.29)

Thus we have converted the scalar product a · b of the two vectors to the scalar
product of the basis vectors êi · êj . The basis vectors are orthonormal (i.e.
pairwise orthogonal and normalized). Let's remember what the property of
being orthonormal means for two vectors:
• The scalar product êi · êj equals 1 if i and j are equal. In this case, it is
the same vector.
• The scalar product êi · êj results in 0 if i and j are not equal. In this case,
there are two different basis vectors and they are orthogonal to each other.

The two behaviors can be written compactly in mathematical terms as follows:

$$ \hat{e}_i \cdot \hat{e}_j = \begin{cases} 1 & i = j \\ 0 & i \neq j \end{cases} \tag{2.30} $$

Doesn't this property sound familiar to you? The scalar product 2.30 of two
orthonormal vectors behaves exactly like the definition 2.18 of the Kronecker
delta! Therefore, you may replace the scalar product between two basis vectors
with the Kronecker delta:

êi · êj = δij (2.31)

This allows us to calculate the scalar product 2.29 by using the Kronecker
delta:

a · b = ai bj δij (2.32)

If you remember the rules for the contraction, we can contract one of the
summation indices i or j in 2.32. For example, let's contract (eliminate) the j.
We get the scalar product in index notation:

a · b = ai bi (2.33)

And Eq. 2.33 is exactly the definition of the scalar product, where the vector
components are multiplied component by component and then summed.

Now you know how the scalar product is written in index notation and what role
the Kronecker delta plays. It represents the scalar product of the basis vectors:
êi · êj = δij .
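
The step from the double sum ai bj δij to the single sum ai bi can be verified numerically. In this Python sketch the sample vectors and the function names are assumptions made for illustration.

```python
def delta(i, j):
    """Kronecker delta."""
    return 1 if i == j else 0

def scalar_product_with_delta(a, b):
    """a . b = a_i b_j delta_ij, summed over both i and j."""
    n = len(a)
    return sum(a[i] * b[j] * delta(i, j) for i in range(n) for j in range(n))

def scalar_product(a, b):
    """a . b = a_i b_i, after contracting the index j."""
    return sum(a[i] * b[i] for i in range(len(a)))

a = [1, 2, 3]
b = [4, 5, 6]

print(scalar_product_with_delta(a, b))  # 32
print(scalar_product(a, b))             # 32
```

Of the nine terms in the double sum, the Kronecker delta sets six to zero, and the remaining three are exactly the terms of the familiar scalar product.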

■ Example 2.1: Kronecker delta in quantum mechanics. The spin-up state
|1⟩ and the spin-down state |2⟩ are orthonormal to each other. The word
"orthonormal" should trigger a thought in your mind: Kronecker delta can be
used here! Why again? The scalar product of orthonormal vectors results in
either 1 or 0, just like the Kronecker delta.

The scalar product |i⟩ · |j⟩ in quantum mechanics is represented in Bra-Ket
notation ⟨i |j⟩ (we will learn this notation in chapter 16):

|i⟩ · |j⟩ = ⟨i |j⟩ = δij (2.34)

Here, i and j take the values 1 (spin up) or 2 (spin down). ■

You can keep the following in mind: As soon as you discover an expression
in the index notation of an equation that results in either 0 or 1 depending on
the indices, replace it with the Kronecker delta and use the Kronecker delta rules
above to simplify the equations further or to represent them in index notation.

2.7 Levi Civita symbol


More: en.fufaev.org/levi-civita-symbol
In addition to the Kronecker delta δij, the Levi-Civita symbol εijk is a very
common symbol in theoretical physics that is used in all areas of physics, from
classical mechanics to quantum field theory.

With the Levi-Civita symbol, which is sometimes also called the epsilon tensor,
you can easily transform and simplify complicated vector equations, such as
multiple cross products, and represent equations more compactly.

The Levi-Civita symbol εijk is written with a small Greek epsilon that has three
indices i, j and k. The Levi-Civita symbol can take on three different values:
+1, 0 or -1. When does it take on which value? That depends on how the indices
ijk are arranged in relation to the original order. What do I mean exactly? Let's
take a closer look. You can permute (swap) the indices ijk. We can permute
the indices in two ways.

In the even (cyclic) permutation, all indices ijk are rotated clockwise or
anti-clockwise. With this permutation, all indices change their position.
For example:

• An even permutation of ijk in a clockwise direction results in kij. Can
you see how the indices have been rotated here?

• An even permutation of kij in a clockwise direction results in jki.

• An even permutation of jki in a clockwise direction would again result in
ijk. Remember that an anti-clockwise rotation of the indices is also an even
permutation.

In an odd permutation, two indices are swapped with each other. In this
permutation, only two of the three indices ijk change position. For example:

• An odd permutation of ijk is jik. Here, i and j have been swapped, while
k has remained in the same place.

• Another odd permutation of ijk is kji. Here, i and k have been swapped,
while j has remained in the same place.

• And the last possible odd permutation of ijk is ikj. Here, i has been left
as it is, while j and k have been swapped.

With this knowledge, you will be able to understand the definition of the
Levi-Civita symbol. The permutations refer to a starting position of the indices.
Here we assume (ijk) = (123) as the starting position. Then the Levi-Civita
symbol behaves as follows:

$$ \varepsilon_{ijk} = \begin{cases} +1 & (ijk) \text{ is an even permutation of } (123) \\ -1 & (ijk) \text{ is an odd permutation of } (123) \\ \;\;\,0 & \text{at least two indices are equal} \end{cases} \tag{2.35} $$

Here are a few examples:

• ε112 = 0, since the first two indices are equal.

• ε313 = 0, since the first and third indices are the same.

• ε222 = 0, since all three indices are equal.

• ε123 + ε213 = 1 + (−1) = 0, since the indices of the first epsilon are in the
starting position and the indices of the second epsilon are an odd permutation
of this.

• ε123 ε231 = 1 · 1 = 1, since the indices of the first epsilon are in the starting
position and the indices of the second epsilon have just been permuted
anti-clockwise.
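
The definition and the examples above can be reproduced with a few lines of Python. The sketch lists the even and odd permutations of (1, 2, 3) explicitly. The names `EVEN`, `ODD` and `levi_civita` are assumptions introduced for this illustration.

```python
# Cyclic rotations of the starting position (1, 2, 3).
EVEN = {(1, 2, 3), (2, 3, 1), (3, 1, 2)}
# Permutations obtained by swapping exactly two indices.
ODD = {(2, 1, 3), (3, 2, 1), (1, 3, 2)}

def levi_civita(i, j, k):
    """Levi-Civita symbol for indices taken from 1, 2, 3."""
    if (i, j, k) in EVEN:
        return 1
    if (i, j, k) in ODD:
        return -1
    return 0  # at least two indices are equal

print(levi_civita(1, 1, 2))                         # 0
print(levi_civita(3, 1, 3))                         # 0
print(levi_civita(1, 2, 3) + levi_civita(2, 1, 3))  # 0
print(levi_civita(1, 2, 3) * levi_civita(2, 3, 1))  # 1
```

Of the 27 possible index combinations, only the six permutations of (1, 2, 3) give a non-zero value.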

2.7.1 Cross product in index notation


The enormous benefit of the Levi-Civita symbol can be seen by looking at the
double cross product (a × b) × c or the scalar triple product (a × b) · c
(parallelepipedal product). But even for the simple cross product a × b of two
vectors a and b, we need the Levi-Civita symbol to be able to represent the
cross product compactly in index notation.

The cross product, written using the orthonormal basis vectors ê1, ê2 and ê3
looks like this:

$$ \begin{aligned} a \times b &= \begin{bmatrix} a_2 b_3 - a_3 b_2 \\ a_3 b_1 - a_1 b_3 \\ a_1 b_2 - a_2 b_1 \end{bmatrix} \\ &= (a_2 b_3 - a_3 b_2)\, \hat{e}_1 + (a_3 b_1 - a_1 b_3)\, \hat{e}_2 + (a_1 b_2 - a_2 b_1)\, \hat{e}_3 \end{aligned} \tag{2.36} $$

The cross product a × b, represented in the orthonormal basis, can be written
compactly in index notation as follows:

$$ a \times b = \varepsilon_{ijk}\, \hat{e}_i\, a_j\, b_k \tag{2.37} $$

Take a look at the indices in Eq. 2.37. All three indices i, j and k occur twice.
Here we have used the Einstein summation convention, therefore we sum over
duplicate indices. The i-th component (a × b)i of the cross product is the factor
in front of êi, that is, (a × b)i = εijk aj bk, where only j and k are summed
over. If we insert a concrete value for i, we get exactly the first (i = 1),
second (i = 2) or third (i = 3) component of the cross product. But Eq. 2.37 is
not only a compact notation of the cross product, it is also a clever notation,
with which we can easily derive relations for the scalar triple product and the
double cross product.
For fun, write out the double cross product (a × b) × c in vector notation
and then write it out in index notation. And prove the following rule (a variant
of the BAC-CAB rule) with one and the other method:

(a × b) × c = b (a · c) − a (b · c) (2.38)

You will be grateful to have learned about the Levi-Civita symbol, as you will
encounter it regularly during your undergraduate and master's studies.
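
Before proving such an identity, it can be reassuring to test it numerically. The Python sketch below builds the cross product from the component formula (a × b)i = εijk aj bk and checks the identity (a × b) × c = b (a · c) − a (b · c) for one set of sample vectors. The closed-form expression used for the Levi-Civita symbol is a shortcut that is valid only for indices taken from 1, 2, 3. It and the function names are assumptions of this sketch, and a single numerical check is of course not a proof.

```python
def levi_civita(i, j, k):
    """Closed form of the Levi-Civita symbol, valid for indices in {1, 2, 3}."""
    return (i - j) * (j - k) * (k - i) // 2

def cross(a, b):
    """(a x b)_i = eps_ijk a_j b_k, with explicit sums over j and k."""
    idx = (1, 2, 3)
    return [sum(levi_civita(i, j, k) * a[j - 1] * b[k - 1]
                for j in idx for k in idx)
            for i in idx]

def dot(a, b):
    """a . b = a_i b_i"""
    return sum(x * y for x, y in zip(a, b))

a = [1, 2, 3]
b = [4, 5, 6]
c = [7, 8, 10]

lhs = cross(cross(a, b), c)
rhs = [b_i * dot(a, c) - a_i * dot(b, c) for a_i, b_i in zip(a, b)]

print(cross(a, b))  # [-3, 6, -3]
print(lhs)          # [84, 9, -66]
print(rhs)          # [84, 9, -66]
```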
3. Dirac delta

More: en.fufaev.org/dirac-delta
The Dirac delta δ(x) (sometimes also called the Dirac delta function, although
it is not a function) is a useful mathematical object that is used in many areas of
theoretical physics. Its applications range from electrodynamics, where it
describes electric point charges as a charge density concentrated in a single
point, up to quantum field theory, where it appears in the description of
quantum fields as operators.

Let us consider a one-dimensional electric charge density ρ(x) that depends on
the position x. The charge density is therefore smeared on a line. ρ(x) can also
represent a mass density or any other density function. Here we look at the
charge density as an example.

To calculate how large the total charge Q is on this line, we must integrate
(sum up) the charge density ρ(x) on this line. Let's assume that the charge
density is smeared on the line from x = a to x = b. These are our integration
limits. The total charge is therefore calculated as follows:

$$ \int_a^b \rho(x)\, dx = Q \tag{3.1} $$

But what if Q is not a smeared charge, but a charge localized at a single point?
What if Q is a singularity? The entire charge density ρ(x) is then concentrated
in a single point and zero everywhere else. And this is where the problem arises:
We cannot use the integral 3.1 with an ordinary function here, because an
ordinary function that is zero everywhere except at a single point integrates
to zero and not to Q. But we must somehow be able to mathematically describe
a point charge.

The charge density ρ(x) must fulfill two properties if it is to describe a single
point charge:

• The charge density ρ(x) must disappear at every location x, except at the
location where the point charge is located. Let us assume that the charge
is located at the coordinate origin x = 0, that is: ρ(x) = 0 for x ≠ 0.

• The integral 3.1 over the charge density must give us the value Q if the
point charge lies within the integration limits x = a and x = b.

If we normalize the charge to the value Q = 1 and take both
properties of the charge density of a point charge into account, then we denote
the density with a Greek delta δ(x) and call it the Dirac delta. The Dirac delta
therefore describes a density and has the following properties, which we have
chosen so that we can use it to describe a point charge (Q = 1):

• The Dirac delta is zero everywhere except at the origin:

$$ \delta(x) = 0, \quad x \neq 0 \tag{3.2} $$

• If the integration of the Dirac delta includes the coordinate origin x = 0,
then the integral has the value 1:

$$ \int_a^b \delta(x)\, dx = 1, \quad a < 0 < b \tag{3.3} $$

Even though the name delta function may suggest that it is a function, δ(x)
is mathematically not a function, but another mathematical object that can
be understood as a Dirac distribution or as a Dirac measure. Let us therefore
continue to call δ(x) a Dirac delta so as not to upset the mathematicians.

The Dirac delta is graphically illustrated with an arrow that is located at the
position x = 0 of the unit point charge Q = 1. The height of the arrow is usually
chosen so that it represents the value of the integral, in this case Q = 1.

3.1 Dirac delta in the coordinate origin


Let us now consider an integral of the delta function together with any function
f (x):
Z b
f (x) δ(x) dx = ? (3.4)
a

Such an integral is very easy to calculate, because due to the property 3.3 the
Dirac delta is zero everywhere except at the point x = 0. This means that the
product f (x)δ(x) is also zero everywhere except at the point x = 0. In the
integral 3.4, only the function value f (0) remains. Since f (0) no longer depends
on x, we can move this constant in front of the integral:
Z b Z b
f (x)δ(x) dx = f (0) δ(x) dx (3.5)
a a
Z b
= f (0) δ(x) dx
a

=?

The integral over the Dirac delta results in 1 if x = 0 lies between a and b
(otherwise the integral is zero). This is exactly the property of the Dirac delta.
So we know what the Dirac delta does in the integral 3.4 when multiplied by
a function f (x). The Dirac delta picks the value of the function at the
origin x = 0:
$$\int_a^b f(x)\,\delta(x)\,dx = f(0) \tag{3.6}$$

3.2 Shifted Dirac delta


Of course, we can also move the charge Q = 1 to another position on the x-axis,
for example to the position x = x0. To indicate that the charge is shifted away
from the coordinate origin, we write the argument of the Dirac delta as
δ(x − x0). Why not δ(x + x0)? Because the argument x − x0 vanishes exactly at
x = x0, which is where the charge now sits. The delta function must then be
zero everywhere except at the point x0.

Even with a shifted charge, the integral over the delta function is equal to 1 if
the charge at x = x0 lies between the integration limits. We have only shifted
the Dirac delta to x0 , so the value of the integral with δ(x − x0 ) is the same as
in the case of δ(x):
$$\int_a^b \delta(x - x_0)\,dx = 1, \quad a < x_0 < b \tag{3.7}$$

What happens if the shifted Dirac delta is multiplied by another function f(x)
in the integral? δ(x − x0) is zero everywhere except at the point x0. This means:
The shifted Dirac delta δ(x − x0) in the integral picks out the function value
f(x0) at the point where the Dirac delta is located:

$$\int_a^b f(x)\,\delta(x - x_0)\,dx = f(x_0), \quad a < x_0 < b \tag{3.8}$$
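
If you want to see the picking property 3.8 at work, you can try it numerically. The following is only a minimal sketch: it assumes that we stand in for the Dirac delta with a very narrow, normalized Gaussian of width eps (one common approximation, not something defined in this chapter). The names delta_approx and sift, the test function cos and the values x0 = 0.5, a = −1, b = 2 are chosen purely for illustration.

```python
import math

def delta_approx(x, eps=1e-3):
    """Narrow normalized Gaussian standing in for the Dirac delta (assumed model)."""
    return math.exp(-x * x / (2 * eps * eps)) / (eps * math.sqrt(2 * math.pi))

def sift(f, x0, a, b, eps=1e-3, n=200_000):
    """Midpoint-rule estimate of the integral of f(x) * delta(x - x0) from a to b."""
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * h
        total += f(x) * delta_approx(x - x0, eps)
    return total * h

inside = sift(math.cos, x0=0.5, a=-1.0, b=2.0)   # x0 lies between a and b
outside = sift(math.cos, x0=3.0, a=-1.0, b=2.0)  # x0 lies outside the limits

print(inside, math.cos(0.5))  # the two numbers should nearly agree
print(outside)                # practically zero
```

The smaller eps becomes, the closer the first result gets to f(x0). When x0 lies outside the integration limits, nothing is picked and the integral vanishes.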

3.3 Properties of the Dirac delta


The Dirac delta has two important properties that we will need in theoretical
physics when dealing with equations:

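• The Dirac delta is symmetric:

$$\delta(-x) = \delta(x) \tag{3.9}$$

• A constant factor k ≠ 0 in the argument of the Dirac delta can be pulled out:

$$\delta(k\,x) = \frac{1}{|k|}\,\delta(x) \tag{3.10}$$

For positive k this is simply the factor 1/k. The absolute value keeps the rule
consistent with the symmetry 3.9 when k is negative.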
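
Both properties can be checked with a quick numerical experiment. This is a hedged sketch that again assumes a narrow normalized Gaussian as a stand-in for the Dirac delta; the helper names delta_approx and integrate and the sample factors k = 2, −2, 0.5 are illustrative. If the factor can be pulled out as stated, the integral over δ(k x) should come out as 1/k for positive k and, together with the symmetry 3.9, as 1/|k| for negative k.

```python
import math

def delta_approx(x, eps=1e-3):
    """Narrow normalized Gaussian standing in for the Dirac delta (assumed model)."""
    return math.exp(-x * x / (2 * eps * eps)) / (eps * math.sqrt(2 * math.pi))

def integrate(g, a, b, n=200_000):
    """Midpoint-rule estimate of the integral of g from a to b."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

results = {}
for k in (2.0, -2.0, 0.5):
    # Integral of delta(k x) over an interval that contains x = 0
    results[k] = integrate(lambda x: delta_approx(k * x), -1.0, 1.0)
    print(k, results[k], 1 / abs(k))

# Symmetry 3.9: the model delta has the same value at -x and at x
print(delta_approx(-2e-4) == delta_approx(2e-4))
```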

3.4 Analogy to the Kronecker delta


The defining properties 3.2 and 3.3 of the Dirac delta δ(x − x0) are somewhat
reminiscent of the definition of the Kronecker delta δ_ij, if we rename the
variables: x := i and x0 := j. The Dirac delta then looks like this: δ(i − j).
Recall what the Kronecker delta (with Einstein summation convention) does in
a sum f_i δ_ij with a vector component f_i. It selects the j-th vector component
of the vector f (we have named the vector f to make the analogy clearer):

$$f_i\,\delta_{ij} = f_j \tag{3.11}$$

And now compare that with what the Dirac delta does in the integral:

$$\int_a^b f(i)\,\delta(i - j)\,di = f(j) \tag{3.12}$$

While we can use the Kronecker delta δ_ij to pick a vector component f_j from a
finite number of vector components f_i, we can use the Dirac delta δ(i − j) to
pick a function value f(j) from an infinite number of function values f(i).
• The Kronecker delta δ_ij is used when we are dealing with vectors f and
their finite number of vector components f_i.
• The delta function δ(i − j) is used when we are dealing with functions f
and their infinite number of function values f(i).
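
The discrete half of this analogy fits into a few lines of code. The sketch below writes out the sum over i that the Einstein convention hides in 3.11; the function name kronecker, the sample components in f and the choice j = 2 (counted from zero) are illustrative assumptions.

```python
def kronecker(i, j):
    """Kronecker delta: 1 if the indices agree, otherwise 0."""
    return 1 if i == j else 0

f = [3.0, -1.5, 4.0, 0.25]  # illustrative vector components f_i
j = 2

# Explicit version of the implied sum over i in f_i * delta_ij
picked = sum(f[i] * kronecker(i, j) for i in range(len(f)))
print(picked, f[j])  # both are the j-th component
```

Every term with i ≠ j is multiplied by zero, so only f_j survives. The Dirac delta does the same job under an integral instead of a sum.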

3.5 Three-dimensional Dirac delta


So far, we have only considered a one-dimensional Dirac delta δ(x) that can be
moved back and forth along the x-axis. The charges or other density singularities
such as black holes are usually located in a three-dimensional space with three
spatial axes: (x, y, z). Fortunately, the generalization of the Dirac delta to
three-dimensional space is quite simple.

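If our unit charge Q = 1 is located in the coordinate origin (x, y, z) = (0, 0, 0),
then we can describe the corresponding density, that is, the three-dimensional
Dirac delta, with the product of three one-dimensional Dirac deltas δ(x), δ(y)
and δ(z). This product is zero everywhere except at the origin, and its integral
over a volume V that contains the origin is 1:

$$\delta(x)\,\delta(y)\,\delta(z) = 0, \quad (x, y, z) \neq (0, 0, 0) \tag{3.13}$$

$$\int_V \delta(x)\,\delta(y)\,\delta(z)\,dx\,dy\,dz = 1, \quad (0, 0, 0) \in V \tag{3.14}$$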

To avoid having to write three Dirac deltas, we combine them into one Dirac
delta with a superscript that specifies the spatial dimension. And in the
argument of the Dirac delta, we write the position vector r = (x, y, z):

$$\delta^3(\mathbf{r}) := \delta(x)\,\delta(y)\,\delta(z) \tag{3.15}$$

The Dirac delta shifted to the location r0 = (x0, y0, z0) then looks as follows:

$$\delta^3(\mathbf{r} - \mathbf{r}_0) := \delta(x - x_0)\,\delta(y - y_0)\,\delta(z - z_0) \tag{3.16}$$

If the three-dimensional delta function appears in the integral in a product with
a scalar three-dimensional function f(r) = f(x, y, z), then the three-dimensional
Dirac delta δ³(r − r0) works in the same way as in the one-dimensional case.
The Dirac delta picks the value f(r0) = f(x0, y0, z0) of the function at
the point r0:

$$\int_V f(\mathbf{r})\,\delta^3(\mathbf{r} - \mathbf{r}_0)\,dv = f(\mathbf{r}_0) \tag{3.17}$$

With the knowledge of the Dirac delta, we can theoretically describe density
singularities (for example point charges and black holes).
4. Vector fields

A vector function F (or vector-valued function) is a vector that depends on the
(Cartesian) coordinates (x, y, z) and has three components in three-dimensional
space:

$$\mathbf{F}(x, y, z) = \begin{bmatrix} F_x(x, y, z) \\ F_y(x, y, z) \\ F_z(x, y, z) \end{bmatrix} \tag{4.1}$$

Here, F_x(x, y, z), F_y(x, y, z) and F_z(x, y, z) are three scalar functions and they
represent the three components of the vector function F. Sometimes we also
write briefly: F(r) = F(x, y, z), where r = (x, y, z) is the position vector.

In theoretical physics, we will mainly work with vector fields. A vector
function can depend on any coordinates (x, y, z), such as angles. If we are
talking about a vector field, then (x, y, z) represents the spatial coordinates.
We represent vector functions and vector fields either with an arrow F⃗ above
the symbol or, more compactly, in bold F. The vector field could, for example,
represent the electric field F = E or a magnetic field F = B.

■ Example 4.1 A two-dimensional vector field could look like this:

$$\mathbf{F}(x, y) = \begin{bmatrix} 2x + 5y \\ 5x \end{bmatrix} \tag{4.2}$$

Here, F_x(x, y) = 2x + 5y is the first component and F_y(x) = 5x is the second
component of the vector field. The second component depends only on the
position coordinate x. If we represent 4.2 graphically, the vector field looks like
this:

Each point (x, y) is assigned a vector F(x, y). For example, at the point (x, y) =
(1, 1) the vector looks like this: F(1, 1) = (7, 5). Simply insert x = 1 and
y = 1 into the vector field 4.2 to get this example vector. If you insert a large
number of locations in this way, you will get the graphical representation of the
vector field 4.2. ■
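
Inserting locations into a vector field is exactly what a small function does. Here is a minimal sketch of Example 4.1; the function name F and the 3 × 3 grid of sample points are illustrative choices.

```python
def F(x, y):
    """Two-dimensional vector field 4.2: returns (F_x, F_y) at the point (x, y)."""
    return (2 * x + 5 * y, 5 * x)

print(F(1, 1))  # (7, 5)

# Inserting many locations yields the arrows of a graphical representation.
grid = [(x, y, F(x, y)) for x in (-1, 0, 1) for y in (-1, 0, 1)]
for x, y, vec in grid:
    print((x, y), "->", vec)
```

Each printed pair is one arrow of the plot: the location where the arrow starts and the vector attached to it.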
5. Nabla operator

More: en.fufaev.org/nabla-operator
We will encounter the nabla operator ∇ (an inverted capital delta) in every
branch of theoretical physics when it comes to multidimensional derivatives,
especially in electrodynamics when we get to know Maxwell's equations. The
three-dimensional nabla operator is notationally similar to a vector and
looks like this in three-dimensional space when we express it with Cartesian
coordinates (x, y, z):

$$\nabla = \begin{bmatrix} \partial_x \\ \partial_y \\ \partial_z \end{bmatrix} \tag{5.1}$$

The three components of the nabla operator are partial derivatives with
respect to x, y and z. We have written the partial derivatives more compactly
as ∂_x instead of ∂/∂x. This notation is common in theoretical physics. The
single derivatives ∂_x, ∂_y and ∂_z are called differential operators. You can apply
a differential operator to a function f. The result is the derivative of the function.
For example: ∂_x f = ∂f/∂x.

We can apply the nabla operator in three different ways to a scalar function f
or to a vector field F:

• As multiplication with a scalar function: ∇f. The result ∇f is called the
gradient of the scalar function f.
• As scalar product with a vector field: ∇ · F. The result ∇ · F is called the
divergence of the vector field F.
• As cross product with a vector field: ∇ × F. The result ∇ × F is called the
curl of the vector field F.

5.1 Gradient
More: en.fufaev.org/gradient
Let's take a look at the first application of the nabla operator in the form of
the gradient ∇f of a scalar function f. Here we apply the nabla operator
∇ to the function f. We will encounter the gradient in Maxwell's equations, for
example. Written out, the gradient looks like this:

$$\nabla f(x, y, z) = \begin{bmatrix} \partial_x f(x, y, z) \\ \partial_y f(x, y, z) \\ \partial_z f(x, y, z) \end{bmatrix} \tag{5.2}$$

The result 5.2 is called the gradient and represents a three-dimensional vector
field ∇f with three components:
• The first component contains the derivative ∂_x f, that is, the slope of the
function f(x, y, z) in the x direction.

• The second component contains the derivative ∂_y f, that is, the slope of the
function f(x, y, z) in the y direction.

• The third component contains the derivative ∂_z f, that is, the slope of the
function f(x, y, z) in the z direction.

Of course, we can also use a two-dimensional nabla operator ∇_2d, which
only has two components. A two-dimensional gradient of a function f(x, y) then
looks like this:

$$\nabla_{2\mathrm{d}} f(x, y) = \begin{bmatrix} \partial_x f(x, y) \\ \partial_y f(x, y) \end{bmatrix} \tag{5.3}$$

And the one-dimensional nabla operator ∇_1d has only one component.
Applied to a one-dimensional function f(x), the gradient is simply the
derivative of the function:

$$\nabla_{1\mathrm{d}} f(x) = \partial_x f(x) \tag{5.4}$$

■ Example 5.1: Calculating the gradient of a function. Given is a scalar
function f(x, y, z) = x² + 5xy + z. The gradient of this scalar function is:

$$\nabla f(x, y, z) = \begin{bmatrix} 2x + 5y \\ 5x \\ 1 \end{bmatrix} \tag{5.5}$$

The resulting vector ∇f(x, y, z) points, at each location (x, y, z), in the
direction of the steepest ascent of the function f(x, y, z). For example, look at
the following plot of a two-dimensional scalar function f(x, y):

At each location on the green function, imagine a vector that shows you the
direction of the steepest ascent at that location. The opposite direction is
that of the steepest descent.
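
You can cross-check a gradient that you calculated by hand with a few lines of code. The sketch below assumes a central finite-difference approximation of the partial derivatives, which is a numerical stand-in and not the analytic derivative used in the example. The helper names gradient and gradient_exact, the step size h and the sample point (1, 2, 3) are illustrative.

```python
def f(x, y, z):
    """Scalar function from Example 5.1."""
    return x**2 + 5 * x * y + z

def gradient(func, x, y, z, h=1e-5):
    """Central finite-difference estimate of the three partial derivatives of func."""
    return (
        (func(x + h, y, z) - func(x - h, y, z)) / (2 * h),
        (func(x, y + h, z) - func(x, y - h, z)) / (2 * h),
        (func(x, y, z + h) - func(x, y, z - h)) / (2 * h),
    )

def gradient_exact(x, y, z):
    """Gradient calculated by hand: (2x + 5y, 5x, 1)."""
    return (2 * x + 5 * y, 5 * x, 1)

point = (1.0, 2.0, 3.0)
numeric = gradient(f, *point)
exact = gradient_exact(*point)
print(numeric)  # close to the exact values
print(exact)    # (12.0, 5.0, 1)
```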

5.2 Divergence
More: en.fufaev.org/divergence

Let's look at the second application of the nabla operator, namely the
divergence ∇ · F of a vector field F. Here we apply the nabla operator ∇
to the vector-valued function F(x, y, z). Just like the gradient ∇f, we will also
encounter the divergence in Maxwell's equations, for example.

For the divergence, we form the scalar product ∇ · F between the nabla
operator and the vector field F:

$$\nabla \cdot \mathbf{F}(x, y, z) = \begin{bmatrix} \partial_x \\ \partial_y \\ \partial_z \end{bmatrix} \cdot \begin{bmatrix} F_x(x, y, z) \\ F_y(x, y, z) \\ F_z(x, y, z) \end{bmatrix} = \partial_x F_x + \partial_y F_y + \partial_z F_z \tag{5.6}$$

In the last step, we omitted the arguments for more compactness. The result
∇ · F of the scalar product is a three-dimensional scalar function. By forming
the gradient, a vector field was generated from a scalar function. And by forming
the divergence, a scalar function is generated from a vector field. So the other
way around!

■ Example 5.2: Calculate the divergence of a vector field. The following
three-dimensional vector field is given:

$$\mathbf{F}(x, y, z) = \begin{bmatrix} 2x^3 \\ zy \\ 5xy \end{bmatrix} \tag{5.7}$$

The divergence of this vector field is the following scalar function:

$$\nabla \cdot \mathbf{F}(x, y, z) = \underbrace{\partial_x (2x^3)}_{6x^2} + \underbrace{\partial_y (zy)}_{z} + \underbrace{\partial_z (5xy)}_{0} = 6x^2 + z \tag{5.8}$$

So if a specific location (x, y, z) is inserted into the scalar function
∇ · F(x, y, z), then this function spits out a number. This number is a measure
of the divergence of the vector field at the considered location (x, y, z). The
result can be a positive or negative number or even zero. Depending on
whether the number is positive, negative or zero, it has a different physical
meaning.
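
To see the divergence "spit out a number", you can evaluate it on a computer. This is a hedged sketch that assumes central finite differences for the three derivatives and compares them with the hand-calculated result 6x² + z from Example 5.2. The helper names divergence and divergence_exact and the sample location (1, 0, 3) are illustrative.

```python
def F(x, y, z):
    """Vector field 5.7: returns (F_x, F_y, F_z)."""
    return (2 * x**3, z * y, 5 * x * y)

def divergence(field, x, y, z, h=1e-5):
    """Central finite-difference estimate of d_x F_x + d_y F_y + d_z F_z."""
    dFx_dx = (field(x + h, y, z)[0] - field(x - h, y, z)[0]) / (2 * h)
    dFy_dy = (field(x, y + h, z)[1] - field(x, y - h, z)[1]) / (2 * h)
    dFz_dz = (field(x, y, z + h)[2] - field(x, y, z - h)[2]) / (2 * h)
    return dFx_dx + dFy_dy + dFz_dz

def divergence_exact(x, y, z):
    """Divergence calculated by hand: 6x^2 + z."""
    return 6 * x**2 + z

numeric = divergence(F, 1.0, 0.0, 3.0)
exact = divergence_exact(1.0, 0.0, 3.0)
print(numeric, exact)  # the numeric value should be close to 9.0
```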

5.2.1 Positive divergence = source


We assume that we have inserted a specific location, for example something like
(x, y, z) = (1, 0, 3), into the result ∇ · F(x, y, z) and obtained a positive number:
∇ · F(x, y, z) > 0. Then the location (x, y, z) is a source of the vector field
F.

Why do we call the location a source? If we were to enclose this location point
in an imaginary cube, then the vector field would mainly point out of the cube.

You can visualize the source as a hole from which the water comes and leaves
the surface of the cube. Even though we can use the divergence to describe a
water source, in this book we use the divergence to describe electric charges. In
this case, the vector field F corresponds to the electric field F = E. Then the
source at the location (x, y, z) represents a positive electric charge.

5.2.2 Negative divergence = sink


If, on the other hand, we obtain a negative number after inserting the location
(x, y, z) into ∇ · F(x, y, z), that is, ∇ · F(x, y, z) < 0, then we are talking
about a sink of the vector field F(x, y, z).

If we enclose the location with an imaginary cube, then the vector field flows
into the cube through its surface. We can imagine the sink as a hole into which
the water flows. To do this, the water must flow into the cube. If we assume that
the vector field is an electric field: F = E, then the sink at the location
(x, y, z) corresponds to a negative electric charge.

■ Example 5.3: Sink of a vector field. Let's take a look at the following vector
field:

$$\mathbf{F}(x, y, z) = \begin{bmatrix} -2x \\ y \\ 4 \end{bmatrix} \tag{5.9}$$

Let's calculate the divergence of this vector field:

$$\nabla \cdot \mathbf{F}(x, y, z) = \underbrace{\partial_x (-2x)}_{-2} + \underbrace{\partial_y (y)}_{1} + \underbrace{\partial_z (4)}_{0} = -1 \tag{5.10}$$

The vector field under consideration has a constant, negative divergence at every
location (x, y, z). This means that no matter which location is used for (x, y, z),
each location has a negative divergence with the value −1. Sinks of the vector
field 5.9 are distributed everywhere. If the vector field were an electric field
F = E, then this result would mean that a negative electric charge is smeared
everywhere in space.

5.2.3 Divergence-free vector field


Now assume that we get zero after we have inserted a concrete location (x, y, z)
into the divergence field: ∇ · F(x, y, z) = 0. Then the vector field is
divergence-free at the location (x, y, z).
If we enclose this location with a cube surface, then the vector field neither
flows out nor in. Or just as much of the vector field points into the surface as
points out, so that the two opposite contributions cancel each other out and the
net divergence is zero.

We can imagine this as if the cube enclosed a source (e.g. water source) and
a sink (e.g. drain), so that the amount of water flowing in and water flowing
out cancel each other out. If we interpret the vector field as the electric field
F = E, then there could be an electric dipole at the considered divergence-free
location. It consists of a positive (source) and a negative (sink) charge.

■ Example 5.4: Divergence-free vector field. Let's calculate the divergence
at the location (x, y, z) = (1, 1, 1) of the following vector field:

$$\mathbf{F}(x, y, z) = \begin{bmatrix} -2x \\ 0.5\,y^2 \\ 0.5\,z^2 \end{bmatrix} \tag{5.11}$$

First we calculate the divergence as a function of the location:

$$\nabla \cdot \mathbf{F}(x, y, z) = \underbrace{\partial_x (-2x)}_{-2} + \underbrace{\partial_y (0.5\,y^2)}_{y} + \underbrace{\partial_z (0.5\,z^2)}_{z} = -2 + y + z \tag{5.12}$$

Insert the location (1, 1, 1) into the calculated scalar function:

$$\nabla \cdot \mathbf{F}(1, 1, 1) = -2 + 1 + 1 = 0 \tag{5.13}$$

The divergence of the vector field at this location is zero. At the location
(1, 1, 1) there is therefore neither a source nor a sink (or, in the picture of
an electric field, there could be an ideal electric dipole). ■
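
The three cases (source, sink, divergence-free) can be summarized in a tiny helper. The sketch below uses the hand-calculated divergences 5.10 and 5.12; the function names, the tolerance tol and the extra test location (2, 2, 2) are illustrative assumptions.

```python
def classify(div_value, tol=1e-9):
    """Name the physical meaning of a divergence value at one location."""
    if div_value > tol:
        return "source"
    if div_value < -tol:
        return "sink"
    return "divergence-free"

def div_example_5_3(x, y, z):
    """Divergence 5.10: constant -1 at every location."""
    return -2 + 1 + 0

def div_example_5_4(x, y, z):
    """Divergence 5.12: -2 + y + z."""
    return -2 + y + z

print(classify(div_example_5_3(1, 1, 1)))  # sink
print(classify(div_example_5_4(1, 1, 1)))  # divergence-free
print(classify(div_example_5_4(2, 2, 2)))  # source
```

Note that the same vector field 5.11 is divergence-free at (1, 1, 1) but has a source at (2, 2, 2). The divergence is a property of a location, not of the field as a whole.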

5.3 Curl
As with divergence (scalar product ∇ · F), we also apply the nabla operator to
a vector field F in the case of curl (cross product ∇ × F):

$$\nabla \times \mathbf{F}(x, y, z) = \begin{bmatrix} \partial_x \\ \partial_y \\ \partial_z \end{bmatrix} \times \begin{bmatrix} F_x(x, y, z) \\ F_y(x, y, z) \\ F_z(x, y, z) \end{bmatrix} = \begin{bmatrix} \partial_y F_z - \partial_z F_y \\ \partial_z F_x - \partial_x F_z \\ \partial_x F_y - \partial_y F_x \end{bmatrix} \tag{5.14}$$

In the last step, we omitted the arguments for more compactness. The result
of the cross product is again a vector field with three components. The curl
∇ × F(x, y, z) of a vector field thus gives us another vector field.

As the name suggests, we can visualize the curl ∇ × F(x, y, z) of the vector
field at the location (x, y, z) as the circulation of the vector field around
the location (x, y, z).
■ Example 5.5: Calculate the curl of a vector field. Let us again consider the
following vector field at the location (1, 1, 1):

$$\mathbf{F}(x, y, z) = \begin{bmatrix} 2x^3 \\ zy \\ 5xy \end{bmatrix} \tag{5.15}$$

The vector field F(1, 1, 1) at this location is:

$$\mathbf{F}(1, 1, 1) = \begin{bmatrix} 2 \\ 1 \\ 5 \end{bmatrix} \tag{5.16}$$

The curl of the vector field is:

$$\nabla \times \mathbf{F} = \begin{bmatrix} \partial_y (5xy) - \partial_z (zy) \\ \partial_z (2x^3) - \partial_x (5xy) \\ \partial_x (zy) - \partial_y (2x^3) \end{bmatrix} = \begin{bmatrix} 5x - y \\ -5y \\ 0 \end{bmatrix} \tag{5.17}$$

Inserting the location (1, 1, 1) gives the curl vector:

$$\nabla \times \mathbf{F}(1, 1, 1) = \begin{bmatrix} 4 \\ -5 \\ 0 \end{bmatrix} \tag{5.18}$$

Hopefully you can now imagine what the cross product ∇ × F with the vector
field F means. We will encounter curl in the chapter on Maxwell's equations.
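
The curl from Example 5.5 can be cross-checked numerically as well. This sketch assumes central finite differences for the partial derivatives and follows the component pattern of 5.14; the helper names partial and curl and the step size h are illustrative.

```python
def F(x, y, z):
    """Vector field 5.15: returns (F_x, F_y, F_z)."""
    return (2 * x**3, z * y, 5 * x * y)

def partial(field, comp, axis, p, h=1e-5):
    """Central difference of component `comp` of the field along `axis` at point p."""
    plus, minus = list(p), list(p)
    plus[axis] += h
    minus[axis] -= h
    return (field(*plus)[comp] - field(*minus)[comp]) / (2 * h)

def curl(field, p):
    """(d_y F_z - d_z F_y, d_z F_x - d_x F_z, d_x F_y - d_y F_x) at point p."""
    X, Y, Z = 0, 1, 2
    return (
        partial(field, Z, Y, p) - partial(field, Y, Z, p),
        partial(field, X, Z, p) - partial(field, Z, X, p),
        partial(field, Y, X, p) - partial(field, X, Y, p),
    )

result = curl(F, (1.0, 1.0, 1.0))
print(result)  # close to (4, -5, 0)
```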
6. Gauss Divergence Theorem

We will encounter the Gauss Divergence Theorem in Maxwell's equations. This
theorem states that the sum of the sources and sinks in a volume is equal to the
flux through the surface of the volume. Mathematically speaking:

$$\int_V (\nabla \cdot \mathbf{F})\,dv = \oint_A \mathbf{F} \cdot d\mathbf{a} \tag{6.1}$$

"Excuse me, what?" you are probably asking yourself. Don't worry. We break down
the Divergence Theorem into its components so that you understand it one hundred
percent.

6.1 Surface Integral in the Divergence Theorem


Let's first look at the right-hand side of the Divergence Theorem 6.1, namely
the surface integral:

$$\oint_A \mathbf{F} \cdot d\mathbf{a}$$

The A stands for a surface that encloses any volume, for example the surface
of a cube, a sphere or the surface of any three-dimensional shape you can think
of. The small circle around the integral is intended to indicate that this surface
must fulfill a condition: It must be closed, that is, it must not contain any
holes, so that the equality in 6.1 is mathematically fulfilled. The surface A is
therefore a closed surface.

The F is any vector field: F = F(x, y, z), that is, a vector with three
components F_x(x, y, z), F_y(x, y, z) and F_z(x, y, z), as shown in Eq. 4.1. For
example, the vector field could represent an electric field F = E or a
magnetic field F = B.
The da is an infinitesimal surface element, that is, an infinitely small area
of the surface A under consideration. As you may have noticed, the da element
is shown in bold, so it is a vector with three components da_x, da_y and da_z. The
vector naturally also has a magnitude and a direction. The magnitude |da| = da
indicates the area of this small piece of surface. The da vector is orthogonal to
the surface and points out of the enclosed volume.

The point · in F · da is the scalar product (also called dot product). You
should be familiar with this vector operation from basic mathematics. The
scalar product is a way of multiplying two vectors together. In the Divergence
Theorem, the scalar product is therefore formed between the vector field F and
the surface element da. Written out, the scalar product looks like this:

$$\mathbf{F} \cdot d\mathbf{a} = F_x\,da_x + F_y\,da_y + F_z\,da_z \tag{6.2}$$


The task of this scalar product is to pick out the part of the vector field F at
point (x, y, z) that is perpendicular to the surface, i.e. that points parallel to
the da surface element. How can we understand this? Mathematically, we can
split the vector field F = F_∥ + F_⊥ into two parts:

• Into the component F_∥, which points parallel to the da surface element.

• Into the component F_⊥, which points perpendicular to the da surface
element.

The scalar product F · da at the location (x, y, z) on the surface eliminates
the perpendicular part of the vector field and leaves only the component of the
vector field parallel to the da element:

$$\mathbf{F} \cdot d\mathbf{a} = \left( \mathbf{F}_{\parallel} + \mathbf{F}_{\perp} \right) \cdot d\mathbf{a} = \mathbf{F}_{\parallel} \cdot d\mathbf{a} + \underbrace{\mathbf{F}_{\perp} \cdot d\mathbf{a}}_{0} = \mathbf{F}_{\parallel} \cdot d\mathbf{a} \tag{6.3}$$

Why again is the perpendicular contribution zero? Because the scalar product of
two perpendicular vectors F_⊥ and da is mathematically zero.

The scalar product F · da thus ensures that in the Divergence Theorem we
only take the component F_∥ of the vector field that leaves or enters the surface
perpendicularly. Everything that passes by the surface (by this I mean the
component F_⊥, which runs along the surface) is omitted in the Divergence Theorem.

Then, on the right-hand side of the Divergence Theorem 6.1, the scalar products
F(x, y, z) · da(x, y, z) are summed up for each point (x, y, z) on the surface A
using the integral in 6.1.

Let us briefly denote the right-hand side of the Divergence Theorem by Φ:

$$\Phi = \oint_A \mathbf{F} \cdot d\mathbf{a} \tag{6.4}$$

The surface integral therefore results in a number Φ, which is a measure of how
much of the vector field F flows into or out of the surface A. The surface integral
is the flux Φ of the vector field F through the surface A. In the chapter on
Maxwell's equations, we will get to know the electric and magnetic flux.

6.2 Volume integral in the Divergence Theorem


Let us now look at the left-hand side of the Divergence Theorem 6.1, namely at
the volume integral:

$$\int_V (\nabla \cdot \mathbf{F})\,dv \tag{6.5}$$

The V stands for a volume, but not just any volume: it is the volume that is
enclosed by the surface A. For example, if A is the surface of a cube, then
V is the volume of this cube. The dv is an infinitesimal volume element,
that is, an infinitely small piece of the volume V.

In the integrand ∇ · F of the volume integral, ∇ stands for the nabla operator,
which we got to know in chapter 5. Although this operator is not a vector
from a mathematical point of view, it looks like a vector. An operator such as
the nabla operator is only useful if it is applied to a field. And this also happens
in the integrand ∇ · F. The nabla operator ∇ is applied to the vector field
F by forming the scalar product between the nabla operator and the vector
field. Written out, this scalar product corresponds to the sum of the derivatives
of the vector field with respect to the coordinates x, y and z:

$$\nabla \cdot \mathbf{F} = \partial_x F_x + \partial_y F_y + \partial_z F_z \tag{6.6}$$

The integrand ∇ · F is therefore the divergence of the vector field F. We
learned what divergence is in section 5.2. The result ∇ · F is no longer a
vector, but a scalar that can be positive, negative or zero:

• If the divergence at location (x, y, z) is positive: ∇ · F(x, y, z) > 0, then
there is a source of the vector field at the location. In electrodynamics,
the source corresponds to a positive charge.

• If the divergence at location (x, y, z) is negative: ∇ · F(x, y, z) < 0, then
there is a sink of the vector field at the location. In electrodynamics,
the sink corresponds to a negative charge.

• If the divergence at location (x, y, z) is zero: ∇ · F(x, y, z) = 0, then
this location is neither a sink nor a source of the vector field. The
vector field does not enter or leave the surface or the vector field enters as
much as it leaves, so that the two parts cancel each other out.

Then, in the volume integral 6.5, the divergences ∇ · F (all sources and sinks)
at each location (x, y, z) within the volume V are summed up with an integral.

The volume integral 6.5 in the Divergence Theorem is a number that measures
how many sinks and sources can be found within the volume V.
Let us summarize the statement of the Divergence Theorem 6.1:

• The volume integral on the left-hand side describes the sum of sources
and sinks of the vector field within a volume V:

$$\int_V (\nabla \cdot \mathbf{F})\,dv$$

• The area integral on the right-hand side describes the flux Φ of the
vector field through the surface A of this volume V:

$$\oint_A \mathbf{F} \cdot d\mathbf{a}$$

According to the Divergence Theorem, both integrals are equal.

The Gauss Divergence Theorem therefore states: The sum of the sources
and sinks of a vector field F within a volume V corresponds to the
flux Φ through the surface A of this volume.
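
If you would like to convince yourself that both sides really agree, you can let a computer add up the little pieces. The following is a rough sketch under stated assumptions: it uses the vector field from Example 5.2 with its divergence 6x² + z, takes the unit cube 0 ≤ x, y, z ≤ 1 as the volume V (an illustrative choice) and approximates both integrals with simple midpoint sums. The function names and the resolution n are likewise illustrative.

```python
def F(x, y, z):
    """Vector field from Example 5.2: returns (F_x, F_y, F_z)."""
    return (2 * x**3, z * y, 5 * x * y)

def div_F(x, y, z):
    """Its divergence, calculated by hand: 6x^2 + z."""
    return 6 * x**2 + z

def midpoints(n):
    """Midpoints of n equal cells between 0 and 1."""
    return [(i + 0.5) / n for i in range(n)]

def volume_integral(n=40):
    """Left-hand side of 6.1: sum of div F over small cubes of volume dv."""
    pts = midpoints(n)
    dv = (1.0 / n) ** 3
    return sum(div_F(x, y, z) for x in pts for y in pts for z in pts) * dv

def surface_flux(n=40):
    """Right-hand side of 6.1: F . da summed over the six faces (da points outward)."""
    pts = midpoints(n)
    da = (1.0 / n) ** 2
    flux = 0.0
    for u in pts:
        for v in pts:
            flux += (F(1.0, u, v)[0] - F(0.0, u, v)[0]) * da  # faces x = 1 and x = 0
            flux += (F(u, 1.0, v)[1] - F(u, 0.0, v)[1]) * da  # faces y = 1 and y = 0
            flux += (F(u, v, 1.0)[2] - F(u, v, 0.0)[2]) * da  # faces z = 1 and z = 0
    return flux

lhs = volume_integral()
rhs = surface_flux()
print(lhs, rhs)  # both should be close to 2.5
```

On the faces at x = 0, y = 0 and z = 0 the outward-pointing da vector points in the negative coordinate direction, which is why those contributions enter with a minus sign.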
7. Stokes’ Curl Theorem

More: en.fufaev.org/stokes-curl-theorem

Besides the Divergence Theorem, we will also need Stokes' Curl Theorem
(or shorter: Curl Theorem) in order to understand Maxwell's equations in depth.
The Curl Theorem states that the total curl of a vector field within a surface
is equal to the circulation of the vector field along the edge of this surface.
Expressed mathematically, this theorem looks like this:

$$\int_A (\nabla \times \mathbf{F}) \cdot d\mathbf{a} = \oint_L \mathbf{F} \cdot d\mathbf{l} \tag{7.1}$$

If you have understood the Divergence Theorem, the Curl Theorem should no
longer seem totally cryptic to you. You already know the vector field
F(x, y, z). It depends on three spatial coordinates and has three components
as a vector. The scalar product F · dl, but also the nabla operator ∇ and
the infinitesimal surface element da should be familiar to you if you have read
chapter 6 on the Divergence Theorem.

7.1 Line integral in the Curl Theorem


Let us first consider the line integral on the right-hand side of the Curl Theorem
7.1, namely:

$$\oint_L \mathbf{F} \cdot d\mathbf{l}$$

The symbol L on the integral represents a line in three-dimensional space.
The circle on the integral symbol indicates that this line must be closed, which
means that its beginning and end are connected. We refer to such a closed line
as a loop for short.

The dl is an infinitesimal line element of the line, in other words an infinitely
small piece of the line. You should also notice here that the dl line element is
shown in bold, which means it is a vector with three components: dl_x, dl_y and
dl_z. The magnitude dl of the line element indicates the length of dl, while its
direction points along the line.

Then, on the right-hand side, the scalar product F · dl is formed between the
vector field and the line element. The scalar product looks like this when
written out:

$$\mathbf{F} \cdot d\mathbf{l} = F_x\,dl_x + F_y\,dl_y + F_z\,dl_z \tag{7.2}$$

You have already learned what the purpose of this scalar product is in
chapter 6 on the Divergence Theorem. Here is a quick recap: First, divide the
vector field F into two parts:

• Into the component F_∥, which points parallel to the dl line element.

• Into the component F_⊥, which points perpendicular to the dl line
element.

The scalar product F · dl eliminates the perpendicular part of the vector
field F and leaves only the part F_∥ of the vector field that is parallel to the dl
element. The scalar product of two perpendicular vectors F_⊥ and dl is
mathematically zero:

$$\mathbf{F} \cdot d\mathbf{l} = \left( \mathbf{F}_{\parallel} + \mathbf{F}_{\perp} \right) \cdot d\mathbf{l} = \mathbf{F}_{\parallel} \cdot d\mathbf{l} + \underbrace{\mathbf{F}_{\perp} \cdot d\mathbf{l}}_{0} = \mathbf{F}_{\parallel} \cdot d\mathbf{l} \tag{7.3}$$

Since the line element dl is parallel to the line at every point of the line, in the
scalar product F · dl only the parallel component F_∥ of the vector field remains,
which of course also runs along the line L. All other components of the vector
field drop out.

The scalar products for each point (x, y, z) on the line L are then added up on
the right-hand side of the Curl Theorem using the line integral.

Let us briefly denote the right-hand side of the Curl Theorem by U:

$$U = \oint_L \mathbf{F} \cdot d\mathbf{l} \tag{7.4}$$

The line integral therefore results in a number U, which is a measure of how
much of the vector field runs along the line. Because the line L is closed, the
summation returns to the same point (x, y, z) where the summation began. The
closed line integral U therefore indicates how much of the vector field F
circulates along the closed line L.

7.2 Surface integral in the Curl Theorem


Let us now consider the surface integral on the left-hand side of the Curl
Theorem 7.1, namely:

$$\int_A (\nabla \times \mathbf{F}) \cdot d\mathbf{a}$$

The surface A appears in the area integral. In contrast to the surface integral
with a circle around the integral sign, as in the Divergence Theorem, here we
consider an open surface. It therefore does not enclose a volume. This is
merely a surface that is bounded by the loop L.

The vector da = (da_x, da_y, da_z) represents an infinitely small element of the
surface A and is perpendicular to this surface at every point (x, y, z).

The cross product ∇ × F between the nabla operator and the vector field also
appears in the surface integral. You should already know what the cross product
means from the basics of mathematics. In addition to the scalar product, the
cross product is the second way to multiply vectors with each other. The cross
product ∇ × F is the curl of the vector field F.

In contrast to the scalar product, the result of the cross product ∇ × F is again
a vector field. Be careful here: the cross product of two ordinary vectors is
perpendicular to both of them, but because ∇ is an operator and not a vector,
∇ × F does not have to be perpendicular to F. For the field in Example 5.5,
F = (2, 1, 5) and ∇ × F = (4, −5, 0) at the location (1, 1, 1), and their scalar
product is 3, not zero. If we write out the cross product in
concrete terms, the result vector ∇ × F looks like this:

$$\nabla \times \mathbf{F} = \begin{bmatrix} \partial_y F_z - \partial_z F_y \\ \partial_z F_x - \partial_x F_z \\ \partial_x F_y - \partial_y F_x \end{bmatrix} \tag{7.5}$$

What does curl mean?

The vector ∇ × F(x, y, z) indicates how strongly the vector field F
circulates at the point (x, y, z), which is located within the area A.
Then the scalar product (∇ × F) · da between the curl vector field (∇ × F) and
the infinitesimal surface element da is formed inside the surface integral of the
Curl Theorem. As we already know, the scalar product is only used to pick out
the component (∇ × F)_∥ of the curl vector field that runs parallel to the surface
element da:

$$(\nabla \times \mathbf{F}) \cdot d\mathbf{a} = \left[ (\nabla \times \mathbf{F})_{\parallel} + (\nabla \times \mathbf{F})_{\perp} \right] \cdot d\mathbf{a} = (\nabla \times \mathbf{F})_{\parallel} \cdot d\mathbf{a} + \underbrace{(\nabla \times \mathbf{F})_{\perp} \cdot d\mathbf{a}}_{0} = (\nabla \times \mathbf{F})_{\parallel} \cdot d\mathbf{a} \tag{7.6}$$

Since the surface element da(x, y, z) at a point (x, y, z) is perpendicular to the
respective piece of surface, the scalar product 7.6 only picks out the component of
the curl vector field ∇ × F that is also perpendicular to the surface or, in other
words, the field component that is parallel to the da(x, y, z) vector. Therefore,
only the component (∇ × F)_∥ · da remains in the surface integral.

The scalar products (∇ × F)_∥(x, y, z) · da(x, y, z) are then summed up on the
left-hand side of the Curl Theorem using the surface integral at each point
(x, y, z).

Let us now summarize the statements of the surface integral (left-hand side)
and line integral (right-hand side) of Stokes' Curl Theorem:
• On the left-hand side, the curl (∇ × F) of the vector field F is summed
up at each individual location within the area A:

$$\int_A (\nabla \times \mathbf{F}) \cdot d\mathbf{a}$$

• On the right-hand side, the vector field F is summed up along the
boundary L of the surface A. The right-hand side therefore corresponds
to a number that measures the circulation of the vector field along the
boundary:

$$\oint_L \mathbf{F} \cdot d\mathbf{l}$$

Both integrals are equal according to the Curl Theorem.

Stokes' Curl Theorem thus clearly states: The total curl of a vector
field F within the surface A corresponds to the circulation of the vector field
along the edge L of this area.
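
A small numerical experiment makes the Curl Theorem tangible. The sketch below is built on assumptions that are not taken from the text: a simple rotating field F = (−y, x, 0) with curl (0, 0, 2), the unit square 0 ≤ x, y ≤ 1 in the plane z = 0 as the open surface A, its four edges traversed counterclockwise as the loop L, and midpoint sums instead of exact integrals.

```python
def F(x, y, z):
    """Illustrative rotating vector field (an assumption for this sketch)."""
    return (-y, x, 0.0)

def curl_F(x, y, z):
    """Its curl, calculated by hand with the pattern of 7.5: (0, 0, 2)."""
    return (0.0, 0.0, 2.0)

def surface_integral(n=100):
    """Left-hand side of 7.1: (curl F) . da over the unit square, da along +z."""
    pts = [(i + 0.5) / n for i in range(n)]
    da = (1.0 / n) ** 2
    return sum(curl_F(x, y, 0.0)[2] for x in pts for y in pts) * da

def line_integral(n=100):
    """Right-hand side of 7.1: F . dl along the four edges, counterclockwise."""
    pts = [(i + 0.5) / n for i in range(n)]
    dl = 1.0 / n
    total = 0.0
    for t in pts:
        total += F(t, 0.0, 0.0)[0] * dl    # bottom edge, dl points along +x
        total += F(1.0, t, 0.0)[1] * dl    # right edge, dl points along +y
        total += -F(t, 1.0, 0.0)[0] * dl   # top edge, dl points along -x
        total += -F(0.0, t, 0.0)[1] * dl   # left edge, dl points along -y
    return total

lhs = surface_integral()
rhs = line_integral()
print(lhs, rhs)  # both should be close to 2
```

The direction of da and the direction in which the loop is traversed belong together (right-hand rule): with da pointing along +z, the loop runs counterclockwise when viewed from above.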
8. Fourier Series

More: en.fufaev.org/fourier-series
You are certainly familiar with the Taylor expansion, with which we can
approximate a function f(x) at a point x = x0 using a simpler Taylor series.
Let us denote the approximation of the exact function f as f_taylor. The more
terms we take in the Taylor series, the better the approximation f_taylor in the
vicinity of the selected point x0.
As you can see in the image below, the Taylor series, represented by f_taylor, is a
good approximation of the function f in the immediate vicinity of x0. However,
if we move further away from the point, we see that the Taylor series is not a
good approximation there. The Taylor expansion is therefore a method with
which we can approximate a function only locally.

However, if it is important for us to approximate a function f on a whole
interval, then we need a Fourier series of the function. As we will see,
the Fourier series is a linear combination of simple periodic basis functions
(e_1, e_2, e_3, e_4, ...) such as cosine and sine or complex exponential functions, which
in sum (series) can approximate the function f in a chosen interval. In the
following, we assume an interval of length L.

8.1 The concept of Fourier series


We can represent a vector v that lives in an n-dimensional vector space as a
linear combination of basis vectors {e_k} that span the vector space:

$$\mathbf{v} = v_1 \mathbf{e}_1 + v_2 \mathbf{e}_2 + v_3 \mathbf{e}_3 + \dots + v_n \mathbf{e}_n = \sum_{k=1}^{n} v_k\,\mathbf{e}_k \tag{8.1}$$

You should be familiar with the representation of the vector as a linear
combination from linear algebra! Using a basis {e_k} we can represent every
possible vector v in this vector space. Here v_k are the components of the
vector in the chosen basis. The choice of basis is not unique, so the
components v_k depend on the basis. By choosing a different basis, the vector
has different components! You should already know this.
We can also apply this concept of linear combination to infinite-dimensional
vectors. A function f, for example from the illustration above, can be
interpreted as an infinite-dimensional vector f, which we can represent as a
linear combination. The components v_k of a finite vector become Fourier
coefficients f̂_k if we represent a function and not a finite vector as a linear
combination:

$$f = \hat{f}_1 e_1 + \hat{f}_2 e_2 + \hat{f}_3 e_3 + \dots + \hat{f}_n e_n = \sum_{k=1}^{n} \hat{f}_k\,e_k \tag{8.2}$$

If we represent a function $f$ as a linear combination 8.2 of basis functions $e_k$,
then we call the sum 8.2 the Fourier series of the function $f$. In a linear
combination for a function, the basis vectors $e_k$ are more appropriately called
basis functions. In optics, the basis functions are also called Fourier modes.

When considering the function $f$ as a vector, the function values $f(x_0)$, $f(x_1)$,
$f(x_2)$ and so on until $f(x_n) = f(x_0 + L)$ represent the components of $\mathbf{f}$. We
can imagine the function $f$ as a column vector:

$$ \mathbf{f} = \begin{pmatrix} f(x_0) \\ f(x_1) \\ f(x_2) \\ \vdots \\ f(x_n) \end{pmatrix} \tag{8.3} $$

Of course, the representation is not exact. The argument $x$ of the function $f(x)$
is real, so there are theoretically infinitely many values, even
between $x_0$ and $x_1$.

We have omitted all these values in the representation of the function as a
column vector 8.3. The column vector is therefore only an approximation
of the function $f$. By the way: this is how, as in 8.3, we represent a quantum
mechanical wave function as a state vector in computational physics.

8.2 Fourier coefficients


We can determine the Fourier coefficients in the same way as in linear algebra.
How do we do this again in linear algebra? To get the $k$-th component of a
finite-dimensional vector $\mathbf{v}$, we have to form the scalar product between the
$k$-th basis vector and the vector $\mathbf{v}$:

$$ v_k = \mathbf{e}_k \cdot \mathbf{v} = e_{k0} v_0 + e_{k1} v_1 + \dots + e_{kn} v_n = \sum_{j}^{n} e_{kj} v_j \tag{8.4} $$

In the last step, we have written out the scalar product a little more compactly
with a summation sign and selected the summation index as j . Here, ek0 to ekn
are the components of the basis vector ek = [ek0 , ek1 , ..., ekn ].
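As a small numerical illustration of Eq. 8.4, here is a sketch in Python. The rotated basis `e1`, `e2` and the vector `v` are arbitrary choices made for this example and do not come from the text. The only thing taken from the text is the rule that the k-th component is the scalar product of the k-th basis vector with the vector.

```python
import math

def dot(a, b):
    """Scalar product of two vectors given as lists (Eq. 8.4)."""
    return sum(x * y for x, y in zip(a, b))

# Assumed example: an orthonormal basis of the plane, rotated by 30 degrees.
theta = math.pi / 6
e1 = [math.cos(theta), math.sin(theta)]
e2 = [-math.sin(theta), math.cos(theta)]

v = [2.0, 1.0]  # components of v in the standard basis

# Components of the same vector in the rotated basis: v_k = e_k . v
v1 = dot(e1, v)
v2 = dot(e2, v)

# Rebuild v from the linear combination v1*e1 + v2*e2 (Eq. 8.1).
reconstructed = [v1 * e1[i] + v2 * e2[i] for i in range(2)]

print(v1, v2)         # other numbers than (2.0, 1.0): same vector, other basis
print(reconstructed)  # agrees with v up to rounding
```

The components change with the basis, but the linear combination always gives back the same vector.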

If we are not working with finite-dimensional vectors but with functions, then
we have to form the scalar product between the $k$-th basis function and
the function $f$ to obtain the $k$-th Fourier coefficient of $f$:

$$ \hat{f}_k = \mathbf{e}_k \cdot \mathbf{f} = \langle e_k | f \rangle = \begin{pmatrix} e_k(x_0) \\ e_k(x_1) \\ e_k(x_2) \\ \vdots \\ e_k(x_n) \end{pmatrix} \cdot \begin{pmatrix} f(x_0) \\ f(x_1) \\ f(x_2) \\ \vdots \\ f(x_n) \end{pmatrix} \tag{8.5} $$

To indicate that we may be working with an infinite-dimensional vector space
here, we can call the operation 8.5 an inner product rather than a scalar product
and, as physicists, write it in bra-ket notation $\langle e_k | f \rangle$. We will learn bra-ket
notation in chapter 16.

Note that we write the vectors in 8.5 only up to the $n$-th component because, as
already mentioned, with a Fourier series we can only work with functions in a
certain interval. Our chosen interval $(x_0, x_n) = (x_0, x_0 + L)$ has the length $L$.

Let's write out the inner product 8.5 as a sum:

$$ \hat{f}_k = \langle e_k | f \rangle \approx \sum_{x = x_0}^{x_n} e_k(x) f(x) \tag{8.6} $$

You have certainly noticed the approximation sign in Eq. 8.6. The sum is
therefore only an approximation of the Fourier coefficient $\hat{f}_k$. Can we represent
the Fourier coefficients $\hat{f}_k$ exactly? It's quite simple! Since exact Fourier
coefficients require a continuous summation, we must
replace the summation sign with an integral. So instead of summing
discretely over $x$ as in 8.6, we integrate over $x$:

$$ \hat{f}_k = \langle e_k | f \rangle = \int_{x_0}^{x_n} e_k^*(x) f(x) \, dx \tag{8.7} $$

We have only made a small mathematical upgrade in the integral: the basis
function $e_k^*(x)$ has been complex conjugated. We can omit the asterisk if
we are working with real basis functions, as $e^* = e$ applies to real basis functions.
However, to allow complex basis functions, we must append an asterisk to the
basis function. For complex-valued functions, the asterisk
is important so that the integral 8.7 fulfills the properties of an inner
product.

So, now we know how we can calculate the Fourier coefficients with the integral
8.7 and how the integral formula 8.7 comes about in the first place.
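The step from the sum 8.6 to the integral 8.7 can be tried out numerically. The following Python sketch is only an illustration under assumptions: the real basis function `basis_function`, the test function `f` and the interval (0, 1) are chosen freely for this example. One detail that Eq. 8.6 leaves implicit is made explicit here: each term of the sum has to be weighted with the step width `dx`, otherwise the sum does not converge to the integral.

```python
import math

def inner_product(e, f, x0, xn, n=1000):
    """Midpoint-rule approximation of the integral of e(x)*f(x) from x0 to xn.

    Only real-valued functions are used here, so no complex conjugation is needed.
    """
    dx = (xn - x0) / n
    total = 0.0
    for j in range(n):
        x = x0 + (j + 0.5) * dx
        total += e(x) * f(x) * dx
    return total

# Assumed example on the interval (0, 1): a normalized sine and f(x) = x.
def basis_function(x):
    return math.sqrt(2.0) * math.sin(2.0 * math.pi * x)

def f(x):
    return x

coarse = inner_product(basis_function, f, 0.0, 1.0, n=10)
fine = inner_product(basis_function, f, 0.0, 1.0, n=1000)
exact = -math.sqrt(2.0) / (2.0 * math.pi)  # the same integral solved by hand

print(coarse, fine, exact)
```

The more sample points we use, the closer the discrete sum gets to the value of the integral.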

8.3 Fourier basis


Now let's get to know the basis functions. Which basis functions $e_k$ can we
use in the Fourier series of $f$?

$$ f = \sum_{k}^{n} \hat{f}_k e_k \tag{8.8} $$

All functions that fulfill the properties of a basis! In order for a set of vectors
or, as in our case, a set of functions $\{e_k\}$ to be called a basis, these functions
must fulfill two conditions:

• If we take two basis functions $e_k$ and $e_m$ from the set $\{e_k\}$, then they
must be orthonormal to each other, in other words orthogonal and
normalized. This property can be expressed with the Kronecker delta:
$\langle e_k | e_m \rangle = \delta_{km}$.

• The set $\{e_k\}$ of basis functions must be complete. In other words, they
must span the space in which the functions $f$ live. We must be able to
represent each function $f$ exactly with the set $\{e_k\}$.

Only when these two properties are fulfilled by the functions $\{e_k\}$ can we take
these functions as basis functions and thus represent a function $f$ as a Fourier
series 8.8.

A typical basis $\{e_k\}$ used in physics consists of the complex exponential functions:

$$ e_k = \frac{1}{\sqrt{L}} \, e^{ikx} \tag{8.9} $$

The factor $1/\sqrt{L}$ ensures that the basis functions are normalized, that is, they
fulfill the first of the two necessary properties. In the context of physics, especially
in optics, we refer to $k$ as the wavenumber. And remember that the e in $e^{ikx}$ is
Euler's number and not the label of the basis function $e_k$! I'm just saying...
Depending on what we use for the wavenumber $k$, we get a different basis
function in 8.9. Of course, we can also choose a different basis for the Fourier
series, such as cosine and sine functions. We are free to choose a basis.
Here we have chosen complex exponential functions as a basis because they
can be written in a nice compact way, especially for the explanation of the
Fourier series.
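Whether the exponential functions 8.9 really are orthonormal can be checked numerically. The sketch below is an illustration under assumptions: the interval length `L = 2.0` is arbitrary, and the wavenumbers are taken as k = 2πm/L with integer m, which is the choice made later in the sawtooth example. The integral 8.7 is approximated with a simple midpoint sum.

```python
import cmath
import math

L = 2.0  # assumed interval length

def e(m, x):
    """Exponential basis function (Eq. 8.9) with wavenumber k = 2*pi*m/L."""
    k = 2.0 * math.pi * m / L
    return cmath.exp(1j * k * x) / math.sqrt(L)

def inner(m, n, steps=1000):
    """Midpoint-rule approximation of <e_m|e_n> on the interval (0, L), Eq. 8.7."""
    dx = L / steps
    total = 0j
    for j in range(steps):
        x = (j + 0.5) * dx
        total += e(m, x).conjugate() * e(n, x) * dx
    return total

print(abs(inner(1, 1)))  # close to 1: the basis function is normalized
print(abs(inner(1, 3)))  # close to 0: different basis functions are orthogonal
```

Note the complex conjugation of the first function. Without it, the first result would not come out as 1.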

The Fourier series 8.8 of the function $f$ would look like this in the exponential
basis 8.9:

$$ f = \sum_{k}^{n} \hat{f}_k e_k = \frac{1}{\sqrt{L}} \sum_{k}^{n} \hat{f}_k \, e^{ikx} \tag{8.10} $$

What can we do with the Fourier series 8.10 in the exponential basis? As I said,
we can use it to approximate any function f in an interval. Let's take a look at
a concrete example, then you'll know what I mean.

8.4 Example: Fourier series for the sawtooth function

As an example, let us consider the sawtooth function in the interval (0, 1):

$$ f(x) = \begin{cases} -x & x \in (0, 0.5) \\ 1 - x & x \in (0.5, 1) \end{cases} \tag{8.11} $$

This sawtooth function looks like this:

Let's choose the exponential basis functions as our basis for the Fourier series
of the sawtooth function:

$$ f = \frac{1}{\sqrt{L}} \sum_{k}^{n} \hat{f}_k \, e^{ikx} $$

The total interval length is $L = 1$. This means that the normalization factor for
exponential basis functions is also 1:

$$ f = \sum_{k}^{n} \hat{f}_k \, e^{ikx} \tag{8.12} $$

When determining Fourier series, we always have to do two things:

• Choose a basis and insert it into the Fourier series. We have already
done this in Eq. 8.12.

• Calculate the Fourier coefficients $\hat{f}_k$ with the integral 8.7 and insert them
into the Fourier series 8.12. We determine the $k$-th Fourier coefficient 8.7
with the inner product between the $k$-th basis function and the sawtooth
function $f$:

\begin{align}
\hat{f}_k &= \int_{x_0}^{x_n} e_k^*(x) \, f(x) \, dx \tag{8.13} \\
&= \int_{0}^{1} e^{-ikx} \, f(x) \, dx \tag{8.14}
\end{align}

Note that the exponential basis function must be complex conjugated in the
integral. This is where the minus sign in the exponent of the exponential function
comes from. And the integration limits $x_0 = 0$ and $x_n = 1$ are our free decision.
We want to approximate the sawtooth function in this region.

Now it's up to you to solve the integral 8.14 to determine the Fourier coefficients
concretely. I can't do it, so I'll leave it to you as an exercise.

In any case, here is the solution we need to illustrate it right away:

$$ \hat{f}_k = \frac{1}{ik} \, e^{-ik/2} \tag{8.15} $$

Since we have not entered a specific value for the wavenumber $k$ in 8.15, we have
determined all Fourier coefficients $\hat{f}_k$. For a different $k$ value, we get a different
Fourier coefficient in Eq. 8.15.
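If you do not want to solve the integral 8.14 by hand right away, you can at least check the result 8.15 numerically. The following Python sketch assumes wavenumbers of the form k = 2πm with integer m (for the interval length L = 1) and approximates the integral with a midpoint sum. The function names are made up for this illustration.

```python
import cmath
import math

def sawtooth(x):
    """Sawtooth function from Eq. 8.11 on the interval (0, 1)."""
    return -x if x < 0.5 else 1.0 - x

def coefficient_numeric(m, steps=2000):
    """Midpoint-rule approximation of the integral 8.14 for k = 2*pi*m."""
    k = 2.0 * math.pi * m
    dx = 1.0 / steps
    total = 0j
    for j in range(steps):
        x = (j + 0.5) * dx
        total += cmath.exp(-1j * k * x) * sawtooth(x) * dx
    return total

def coefficient_formula(m):
    """Closed-form result 8.15 for k = 2*pi*m (not defined for m = 0)."""
    k = 2.0 * math.pi * m
    return cmath.exp(-1j * k / 2.0) / (1j * k)

for m in (1, 2, 3):
    print(m, coefficient_numeric(m), coefficient_formula(m))

print(abs(coefficient_numeric(0)))  # close to 0: there is no constant part
```

For each m, the numerical value and the formula should agree to several decimal places.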

Let's just insert the Fourier coefficients 8.15 into the Fourier series 8.12 and
combine the two exponential functions:

\begin{align}
f &= \sum_{k}^{n} \hat{f}_k \, e^{ikx} \tag{8.16} \\
&= \sum_{k}^{n} \frac{1}{ik} \, e^{-ik/2} \, e^{ikx} \tag{8.17} \\
&= \sum_{k}^{n} \frac{1}{ik} \, e^{ik(x - 0.5)} \tag{8.18} \\
&= \sum_{\substack{m = -\infty \\ m \neq 0}}^{\infty} \frac{1}{i 2\pi m} \, e^{i 2\pi m (x - 0.5)} \tag{8.19}
\end{align}

In the last step, we imposed periodic boundary conditions, which restrict the
wavenumber to $k = 2\pi m / L$, where $m = \dots, -2, -1, 1, 2, \dots$ takes integer values.
We therefore sum over both positive and negative $m$. The term $m = 0$ drops out
because the integral of the sawtooth function over the interval is zero.
We can decide up to which $m_{\max}$ we want to sum in the Fourier series 8.19 of the
sawtooth function. The higher we choose $m_{\max}$, the better our approximation
of the function $f$ will be.
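The effect of the cutoff can be tried out with a few lines of Python. This is a minimal sketch of the truncated series 8.19 and not a plotting script. The evaluation point x = 0.25 and the cutoffs are arbitrary choices for this illustration.

```python
import cmath
import math

def sawtooth(x):
    """Sawtooth function from Eq. 8.11 on the interval (0, 1)."""
    return -x if x < 0.5 else 1.0 - x

def partial_sum(x, m_max):
    """Fourier series 8.19 truncated at |m| <= m_max (the m = 0 term is absent)."""
    total = 0j
    for m in range(-m_max, m_max + 1):
        if m == 0:
            continue
        total += cmath.exp(2j * math.pi * m * (x - 0.5)) / (2j * math.pi * m)
    return total.real  # the imaginary parts of the +m and -m terms cancel

x = 0.25
for m_max in (1, 20, 200):
    print(m_max, partial_sum(x, m_max), sawtooth(x))
```

With growing cutoff, the printed value of the partial sum moves towards the function value at the chosen point. Directly at the jump x = 0.5 the convergence is much slower, so it is better to test points away from it.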
Look at the plots for the approximation mmax = 1 and for a better approximation
mmax = 20:

With this Fourier series for the sawtooth function, we have basically gained two
things:

• We can now sum the series up to a certain maximum value $m = m_{\max}$
and thus obtain a good, continuously differentiable approximation of
the sawtooth function.

• Since we have determined the Fourier coefficients, we know which $m$
values are contained in the sawtooth function ($m = 0$, for example, is not
included). We therefore know which building blocks (basis functions) the
sawtooth function is composed of. This breaking down of the function
into individual components is known as Fourier analysis.
II
Nature is extreme
9. Action Functional

More: en.fufaev.org/euler-lagrange-equation

The Euler-Lagrange equation is a powerful tool with which we can set up
differential equations (which you should be familiar with) for a specific
problem. We will encounter the Euler-Lagrange equation not only in
mechanics, but in all areas of theoretical physics. Let's first look at the
motivation for this equation.

Let us consider a particle in the gravitational field that is thrown vertically
upwards from the height $h(t_1) = 0$ at the time $t_1 = 0$ (marked as point A in the
following image). It moves straight along one spatial direction and arrives on the
ground at the same location $h(t_2) = h(t_1) = 0$ at time $t_2$ (marked as point B in
the image):

The connection between A and B, that is, the trajectory $h(t)$, must be a
parabola in this problem. But why is this trajectory a parabola and not some
other trajectory? Why does nature, or the particle, choose this path of all
paths between points A and B, and not any other?
In order to answer this question, we need a physical quantity called action,
which is abbreviated with an $S$ and has the unit Js (joule second).
We can assign an action S[h] to each of the conceivable trajectories h. The
action takes an entire function h in the argument and outputs a number S[h],
namely the value of the action for the corresponding function. For example,
some trajectory h1 could have the value S[h1 ] = 3.5 Js, another trajectory h2
could have the value S[h2 ] = 5.6 Js and the parabolic trajectory h could have
the value S[h] = 2 Js:

So now back to the question: Why a parabola? Experience shows that nature
follows the principle of extreme action. This means that if we calculate
the corresponding action $S$ for all possible trajectories $h(t)$, $h_1(t)$, $h_2(t)$ and so
on between points A and B, then nature takes the trajectory whose action is a
maximum, a minimum or a saddle point.

All other actions are out of the question for nature. Nature chooses one of these
extremal actions. This is exactly what "extreme" means. Which of the extreme
paths (minimum, maximum, saddle point) nature actually takes depends on the
problem under consideration.
So we can answer the question: Why does the particle thrown upwards in the
gravitational field take the path of the parabola in the space-time diagram?
Because the parabolic trajectory $h$ has the smallest action $S[h]$!
But how do we actually calculate the value of the action? For this we
need the Lagrange function $L(t, h, \dot{h})$. It depends on the time $t$, on the function
value (position) $h(t)$ and on the time derivative (velocity) $\dot{h}(t)$ at time $t$. The
Lagrange function has the unit of energy, that is, joule (J).
If we integrate the Lagrange function $L$ over the time $t$ between $t_1$ and $t_2$, we
get a quantity that has the unit joule second. We interpret this as the action
$S$:

$$ S[h] = \int_{t_1}^{t_2} dt \, L(t, h, \dot{h}) \tag{9.1} $$

Usually the letter $q$ is used instead of $h$ and $\dot{q}$ instead of $\dot{h}$. We call $q$ the
generalized coordinate and its derivative $\dot{q}$ the generalized velocity. The
generalized coordinate (the trajectory you are looking for) does not necessarily
have to be the height $h$ above the ground. For example, it can represent an
angle $q = \varphi$ or any other quantity that may depend on the time $t$.

$$ S[q] = \int_{t_1}^{t_2} dt \, L(t, q, \dot{q}) \tag{9.2} $$

With this formula for the action functional we can calculate the value $S[q]$
of the action for every possible trajectory $q$ that the particle can take. We only
need to determine the Lagrange function $L$.
There are an infinite number of possible trajectories that a particle can take from
A to B. Do I really have to calculate an infinite number of action functionals
9.2? No, there is a faster way to find the trajectory with the most extreme
action and for this we need the Euler-Lagrange equation.
10. Euler-Lagrange Equation

Of course, it is totally cumbersome to calculate the action functional 9.2 for all
possible trajectories and to take the trajectory that yields the smallest value of
the integral. To save us this huge task, the Euler-Lagrange equation comes
into play:

$$ \frac{\partial L}{\partial q} - \frac{d}{dt} \frac{\partial L}{\partial \dot{q}} = 0 \tag{10.1} $$

This is one of the most important equations in physics. It is best to
memorize it immediately. The derivation of the Euler-Lagrange equation is
based on the definition of the action functional 9.2 and the principle of extremal
action. In this chapter, we do not want to know how to derive the Euler-Lagrange
equation, but rather how to use it to determine the desired extreme trajectory
$h$.

The Euler-Lagrange equation 10.1 contains the partial derivative $\partial L / \partial \dot{q}$ of the
Lagrange function with respect to the generalized velocity $\dot{q}$. This derivative is
also referred to as generalized momentum and is abbreviated as $p$. You
may also encounter the Euler-Lagrange equation in the following form:

$$ \frac{\partial L}{\partial q} - \frac{dp}{dt} = 0 \tag{10.2} $$

We call $p$ generalized momentum because it does not necessarily have to be
mechanical momentum. $p$ can also represent an angular momentum, for example. The
generalized momentum $p$ is then differentiated with respect to time $t$ in the
Euler-Lagrange equation.

If we solve the Euler-Lagrange equation 10.2 for the time
derivative of the momentum, we can read from it whether the momentum is
conserved:

$$ \frac{dp}{dt} = \frac{\partial L}{\partial q} \tag{10.3} $$

For the momentum to be conserved, the time derivative of the momentum must
vanish. We therefore only have to check whether $\partial L / \partial q$ is zero. If it is, the
generalized momentum is conserved.

Using the form 10.3, we can also read off a possible interpretation of the
Euler-Lagrange equation: it is a condition for the conservation of
generalized momentum.
To be able to use the Euler-Lagrange equation at all, we need to know the
Lagrange function $L$ for a chosen system.

10.1 Lagrange function


The Lagrange function $L$ is a scalar function that, in general, cannot be derived
for a given problem, but can only be guessed or motivated. If you think you have
discovered a suitable Lagrange function for a problem, be it from quantum
mechanics, classical mechanics or relativity, then you can easily use the
Euler-Lagrange equation to check whether the Lagrange function you have
found correctly describes your problem or not. If you want to find the Theory
of Everything formula that unites quantum mechanics with the general
theory of relativity, then you should derive or dream up the corresponding
Lagrange function.
In classical mechanics, the Lagrange function is the difference between the
kinetic energy $W_{\text{kin}}$ and the potential energy $W_{\text{pot}}$ of a particle:

$$ L(t, q, \dot{q}) = W_{\text{kin}}(t, q, \dot{q}) - W_{\text{pot}}(t, q, \dot{q}) \tag{10.4} $$

So if we know the kinetic and potential energy of a particle, we can determine
the Lagrange function 10.4 of mechanics and then use it in the Euler-Lagrange
equation 10.1.

10.2 How To: Euler-Lagrange equation


Let's take a look at our example in the introduction, namely how we can calculate
the parabola using the Lagrange function 10.4 and the Euler-Lagrange equation
10.1. To do this, we must always carry out the following five steps:

10.2.1 First step: Set generalized coordinates


First of all, we need to know what q and q̇ actually represent. In our example,
q = h and q̇ = v , where v is the velocity of the thrown particle. Velocity is
nothing other than the time derivative of the trajectory function, in other words
v = ḣ.

10.2.2 Second step: Set up the Lagrange function


Next, we need to specify the Lagrange function 10.4 by giving the kinetic energy
$W_{\text{kin}}$ and the potential energy $W_{\text{pot}}$ of the particle in the gravitational field as a
function of $h$ and $\dot{h}$. The kinetic energy $W_{\text{kin}}$ of the thrown particle is given by

$$ W_{\text{kin}} = \frac{1}{2} m \dot{h}^2 \tag{10.5} $$

Here $m$ is the mass of the particle. The potential energy $W_{\text{pot}}$ of the particle in
the gravitational field is given by

$$ W_{\text{pot}} = m g h \tag{10.6} $$

The Lagrange function $L$ for our problem, written with $v = \dot{h}$, is thus:

$$ L = \frac{1}{2} m v^2 - m g h \tag{10.7} $$

10.2.3 Third step: Calculate derivatives


Now we can use the Lagrange function 10.7 to calculate the derivatives of the
Lagrange function occurring in the Euler-Lagrange equation 10.1:

$$ \frac{\partial L}{\partial h} - \frac{d}{dt} \frac{\partial L}{\partial v} = 0 \tag{10.8} $$

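Differentiate the Lagrange function 10.7 with respect to $h$:

\begin{align}
\frac{\partial L}{\partial h} &= \frac{\partial}{\partial h} \left( \frac{1}{2} m v^2 - m g h \right) \tag{10.9} \\
&= \frac{\partial}{\partial h} \left( \frac{1}{2} m v^2 \right) - \frac{\partial}{\partial h} (m g h) \tag{10.10} \\
&= -m g \tag{10.11}
\end{align}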

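Differentiate the Lagrange function 10.7 with respect to $v$:

\begin{align}
p = \frac{\partial L}{\partial v} &= \frac{\partial}{\partial v} \left( \frac{1}{2} m v^2 - m g h \right) \tag{10.12} \\
&= \frac{\partial}{\partial v} \left( \frac{1}{2} m v^2 \right) - \frac{\partial}{\partial v} (m g h) \tag{10.13} \\
&= m v \tag{10.14}
\end{align}

Differentiate the calculated momentum $p = \partial L / \partial v$ with respect to time:

\begin{align}
\frac{dp}{dt} &= \frac{d}{dt} (m v) \tag{10.15} \\
&= m \dot{v} \tag{10.16} \\
&= m \ddot{h} \tag{10.17}
\end{align}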

Let's insert the calculated derivatives 10.11 and 10.17 into the Euler-Lagrange
equation:

$$ -m g - m \ddot{h} = 0 \tag{10.18} $$

Let's cancel the mass and bring $\ddot{h}$ to the right-hand side of the equation:

$$ -g = \ddot{h} \tag{10.19} $$

What we have obtained in 10.19 is a differential equation for the desired
trajectory $h(t)$. Hopefully you can see the usefulness of the Euler-Lagrange
equation here: It is there to set up differential equations for the
extremal trajectory $h(t)$.
Note that our example is a one-dimensional problem and therefore we only
got one differential equation. For more complex multidimensional problems,
we get several differential equations.

10.2.4 Fourth step: Solve the differential equations


Now we have to solve the differential equation 10.19 set up using the
Euler-Lagrange equation. We can do this by integrating both sides twice. The solution
is the unknown extremal trajectory:

$$ h(t) = -\frac{1}{2} g t^2 + C_1 t + C_2 \tag{10.20} $$

Here, C1 and C2 are the integration constants.
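A quick plausibility check of the solution 10.20 can be sketched in Python: the second time derivative of h(t) should be equal to −g, whatever the integration constants are. The value of `G` and the constants `c1`, `c2` below are assumptions for this illustration, and the second derivative is estimated with a central finite difference.

```python
G = 9.81  # m/s^2, assumed value of the gravitational acceleration

def h(t, c1, c2):
    """General solution 10.20 of the differential equation 10.19."""
    return -0.5 * G * t**2 + c1 * t + c2

def second_derivative(func, t, dt=1e-3):
    """Central finite-difference estimate of the second time derivative."""
    return (func(t + dt) - 2.0 * func(t) + func(t - dt)) / dt**2

# Arbitrary (assumed) integration constants. The check works for any values.
c1, c2 = 3.0, 1.5
for t in (0.0, 0.4, 1.2):
    print(t, second_derivative(lambda s: h(s, c1, c2), t))  # approximately -9.81
```

The constants drop out of the second derivative, which is why they have to be fixed separately by the boundary conditions.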



10.2.5 Fifth step: Set boundary conditions


The last step is to insert the constraints of the problem under consideration into
the solution of the differential equation and determine the unknown constants
$C_1$ and $C_2$.

In our problem, we have thrown the particle from the height $h_1 = 0$ at the time
$t_1 = 0$. The first boundary condition is therefore: $h(0) = 0$. If we insert it
into the solution 10.20, we get the second constant of integration: $C_2 = 0$. This
simplifies the solution:

$$ h(t) = -\frac{1}{2} g t^2 + C_1 t \tag{10.21} $$

We know that the trajectory $h(t)$ ends at point B. Point B corresponds to
the time $t_2$ at which the particle landed on the ground at $h(t_2) = 0$. This is
the second constraint. If we insert this boundary condition into 10.21, we can
determine the first constant of integration: $C_1 = \frac{1}{2} g t_2$. We are done!
The required extremal trajectory is therefore given by:

$$ h(t) = -\frac{1}{2} g t^2 + \frac{1}{2} g t_2 \, t \tag{10.22} $$

This trajectory has the smallest value S[h] of the action. If we plot the result
10.22 in the space-time diagram, we get a parabola.
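The statement about the smallest action can be tested numerically, at least for a small family of competing trajectories. The following Python sketch is an illustration under assumptions: the mass `M`, the gravitational acceleration `G` and the landing time `T2` are freely chosen, and the competing trajectories are the parabola 10.22 plus a sine-shaped bump that vanishes at both end points. The action 9.1 is approximated with a midpoint sum, using the Lagrange function 10.7.

```python
import math

M = 1.0    # kg, assumed mass of the particle
G = 9.81   # m/s^2, assumed gravitational acceleration
T2 = 2.0   # s, assumed landing time t2

def action(h, h_dot, steps=5000):
    """Midpoint-rule approximation of S[h] (Eq. 9.1) with L from Eq. 10.7."""
    dt = T2 / steps
    total = 0.0
    for j in range(steps):
        t = (j + 0.5) * dt
        lagrangian = 0.5 * M * h_dot(t) ** 2 - M * G * h(t)
        total += lagrangian * dt
    return total

def parabola(t):
    """Extremal trajectory 10.22."""
    return -0.5 * G * t**2 + 0.5 * G * T2 * t

def parabola_dot(t):
    """Time derivative of 10.22."""
    return -G * t + 0.5 * G * T2

def make_perturbed(eps):
    """Parabola plus a bump eps*sin(pi*t/T2), which is zero at t = 0 and t = T2."""
    def h(t):
        return parabola(t) + eps * math.sin(math.pi * t / T2)
    def h_dot(t):
        return parabola_dot(t) + eps * math.pi / T2 * math.cos(math.pi * t / T2)
    return h, h_dot

s_parabola = action(parabola, parabola_dot)
for eps in (-0.5, 0.2, 0.5):
    h, h_dot = make_perturbed(eps)
    print(eps, action(h, h_dot) - s_parabola)  # positive for every eps != 0
```

Every bump, no matter in which direction, increases the action. This is of course not a proof for all conceivable trajectories, only a check for the ones tried here.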
Let us summarize: The Euler-Lagrange equation helps us to set up
differential equations for a desired trajectory between two fixed
points. The solution of these differential equations yields the exact shape of
the trajectory that is allowed by nature.
III
Electromagnetism
11. The Electric Vector Field

Consider an electrically charged sphere with a large source charge $Q$ and a
sphere with a small test charge $q$. At a certain point in time, the test charge is
at a distance $r$ from the source charge. The source charge exerts an electric
force $F_e$ on the test charge, which is given by Coulomb's law:

$$ F_e = \frac{1}{4\pi\varepsilon_0} \frac{Q q}{r^2} \tag{11.1} $$

Here, $1/(4\pi\varepsilon_0)$ is a constant pre-factor containing the vacuum permittivity $\varepsilon_0$, which
ensures the correct unit of force on the right-hand side of Coulomb's law, namely
the unit newton (N).

What if we know the value of the big charge $Q$ and want to know what force this
big charge exerts on another small charge $q$, but we don't know the exact value
of this small charge? Or we deliberately leave this value open and only want to
consider the electric force that would be exerted by the big charge if we placed
the test charge $q$ near it. To do this, $q$ must somehow be eliminated from
Coulomb's law. We achieve this by dividing Coulomb's law on both sides
by $q$ so that the test charge on the right-hand side disappears:

$$ \frac{F_e}{q} = \frac{1}{4\pi\varepsilon_0} \frac{Q}{r^2} \tag{11.2} $$

The quotient between force and charge on the left-hand side is defined as electric
field $E$ of the source charge $Q$:

$$ E = \frac{1}{4\pi\varepsilon_0} \frac{Q}{r^2} \tag{11.3} $$

So what is the electric field? The electric field $E$ indicates the electric force
per unit charge that WOULD act on a test charge $q$ if it were placed at a distance $r$
from the source charge $Q$.
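A few lines of Python make the point of Eq. 11.2 and 11.3 concrete: the force depends on the test charge, the quotient of force and test charge does not. The numerical values of `Q`, `r` and the test charges are assumptions chosen for this illustration.

```python
import math

EPSILON_0 = 8.8541878128e-12  # vacuum permittivity in A s / (V m)

def electric_field(source_charge, distance):
    """Magnitude of the electric field of the source charge, Eq. 11.3."""
    return source_charge / (4.0 * math.pi * EPSILON_0 * distance**2)

def coulomb_force(source_charge, test_charge, distance):
    """Magnitude of the Coulomb force, Eq. 11.1."""
    return source_charge * test_charge / (4.0 * math.pi * EPSILON_0 * distance**2)

Q = 1e-6  # assumed source charge in coulombs
r = 0.1   # assumed distance in metres

E = electric_field(Q, r)
for q in (1e-9, 2e-9):   # two assumed test charges in coulombs
    F = coulomb_force(Q, q, r)
    print(q, F, F / q)   # F changes with q, but F / q always equals E
print(E)
```

Doubling the test charge doubles the force, while the electric field of the source charge stays the same.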

We have called $Q$ the source charge to indicate that it is the source of the
electric field. And so that $Q$ is really the only source, we have chosen the test
charge $q$ to be very small.
The electric field in Eq. 11.3 is only the magnitude, that is, the value of the
electric field. For the cherry on the cake of electrodynamics, Maxwell's equations,
we need the electric field as a vector quantity in order to also take into account
the direction of the electric field at all locations in space. Therefore, the
electric field $E$ must be transformed into a vector $\mathbf{E}$. Vectors are shown in this
book in bold.
The electric field $\mathbf{E}$ as a vector in three-dimensional space has three components
$E_1$, $E_2$ and $E_3$.

$$ \mathbf{E} = \begin{pmatrix} E_1 \\ E_2 \\ E_3 \end{pmatrix} \tag{11.4} $$

The first component $E_1(x, y, z)$ depends on the spatial coordinates $(x, y, z)$ and
it indicates the magnitude of the electric force that would act on a test charge
along the $x$-axis if the test charge were placed at the location $(x, y, z)$. The same
applies to the other two field components $E_2(x, y, z)$ and $E_3(x, y, z)$, which each
determine the electric force on a test charge along the $y$ and $z$ spatial directions.
We can summarize: The electric vector field $\mathbf{E}$ assigns a vector $\mathbf{E}(x, y, z)$
to each point in space $(x, y, z)$, which represents the electric field at
that location. If a test charge is placed there, it is accelerated in the
direction of this vector.
12. The Magnetic Vector Field

Another important fundamental physical quantity that appears in the second
and fourth Maxwell equation is the magnetic field. Experiments show that a
particle with electric charge $q$ moving in a straight line with velocity $v$ in an
external magnetic field experiences a magnetic force $F_m$, which deflects the
particle from its straight-line path.

The magnetic force on the particle increases proportionally to its charge, $F_m \sim q$,
and proportionally to its speed, $F_m \sim v$. This means that if the charge or speed is
doubled, the magnetic force on the particle also doubles.

The force also increases in proportion to the applied magnetic field. To describe
this proportionality of the force and the magnetic field, we introduce the quantity
$B$. Overall, the magnetic force (also called Lorentz force) is given by:

$$ F_m = q \, v \, B \tag{12.1} $$

The unit of the quantity $B$ must be such that the right-hand side of the equation
results in the unit of the force, that is $\mathrm{N} = \mathrm{kg \cdot m / s^2}$. A simple transformation
results in the unit of $B$: $\mathrm{kg / (A \, s^2)}$. We refer to this unit as tesla for short:

$$ \mathrm{T} = \frac{\mathrm{kg}}{\mathrm{A \, s^2}} \tag{12.2} $$

We refer to the proportionality constant $B$ as magnetic flux density or, for short,
magnetic field or, even shorter, $B$-field.
The equation 12.1 only represents the magnitude of the magnetic force $F_m$.
To formulate the magnetic force vectorially, the force, the velocity and the
magnetic field are written as vectors:

$$ \mathbf{F}_m = \begin{pmatrix} F_{m1} \\ F_{m2} \\ F_{m3} \end{pmatrix}, \quad \mathbf{v} = \begin{pmatrix} v_1 \\ v_2 \\ v_3 \end{pmatrix}, \quad \mathbf{B} = \begin{pmatrix} B_1 \\ B_2 \\ B_3 \end{pmatrix} \tag{12.3} $$

Now the three variables are not scalars, but three-dimensional vectors with the
components in the $x$, $y$ and $z$ spatial directions. The question now is: How must
the velocity vector $\mathbf{v}$ be vectorially multiplied by the magnetic vector
field $\mathbf{B}$?
If the deflection of the charge in the magnetic field is investigated more closely
in an experiment, it can be determined that the magnetic force always deflects it
orthogonally, in other words perpendicularly to the direction of velocity and
to the magnetic field lines. This orthogonality can be easily established with
the cross product $\mathbf{v} \times \mathbf{B}$.

The cross product between the velocity vector and the magnetic field vector is
defined in such a way that the result of the cross product, which is a vector, is
always orthogonal to the two vectors $\mathbf{v}$ and $\mathbf{B}$:

$$ \mathbf{v} \times \mathbf{B} = \begin{pmatrix} v_1 \\ v_2 \\ v_3 \end{pmatrix} \times \begin{pmatrix} B_1 \\ B_2 \\ B_3 \end{pmatrix} = \begin{pmatrix} v_2 B_3 - v_3 B_2 \\ v_3 B_1 - v_1 B_3 \\ v_1 B_2 - v_2 B_1 \end{pmatrix} \tag{12.4} $$

So that the magnetic force $\mathbf{F}_m$ is always orthogonal to $\mathbf{v}$ and $\mathbf{B}$, their cross
product must be formed. The magnetic force is therefore given as a vector
by the following equation:

$$ \mathbf{F}_m = q \, \mathbf{v} \times \mathbf{B} \tag{12.5} $$
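The orthogonality can be checked directly with the component formula 12.4. In the following Python sketch the charge, the velocity and the magnetic field are made-up numbers in arbitrary units, chosen only to illustrate Eq. 12.5.

```python
def cross(a, b):
    """Cross product of two 3-vectors, written out as in Eq. 12.4."""
    return [
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    ]

def dot(a, b):
    """Scalar product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))

def magnetic_force(q, v, B):
    """Magnetic force vector from Eq. 12.5."""
    return [q * component for component in cross(v, B)]

# Assumed example values
q = 2.0
v = [1.0, 2.0, 3.0]
B = [0.0, 0.0, 1.0]

F = magnetic_force(q, v, B)
print(F)          # [4.0, -2.0, 0.0]
print(dot(F, v))  # 0.0: the force is perpendicular to the velocity
print(dot(F, B))  # 0.0: the force is perpendicular to the magnetic field
```

Both scalar products vanish, which is exactly the orthogonality that the experiment demands.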

So what is the magnetic vector field $\mathbf{B}$? The magnetic vector field assigns
a vector $\mathbf{B}(x, y, z)$ to each point in space $(x, y, z)$, which determines the
magnitude and direction of the magnetic force $\mathbf{F}_m(x, y, z)$ on a moving
charge $q$.
13. Maxwell’s Equations

More: en.fufaev.org/maxwell-equations
The four Maxwell equations together with the Lorentz force contain the
entire knowledge of electrodynamics. There are so many applications of
this that I can't list them all, but some of them are, for example:

• Electronic devices such as computers and mobile phones. They contain
electrical capacitors, coils and entire circuits that make use of Maxwell's
equations.

• Power generation - whether from nuclear, wind or hydroelectric power
plants, the energy released must first be converted into electrical energy
so that people can use it. This happens with electric generators. These in
turn are based on Maxwell's equations.

• Power supply. AC voltages and transformers are needed to transport
electricity to households with as little energy loss as possible.

• And much more! Electric welding for assembling car bodies, motors
for electric cars, magnetic resonance imaging in medicine, kettles in the
kitchen, the charger for your smartphone, radio, Wi-Fi and so on.

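Isn't that incredible? Every device that utilizes electricity or magnetism is
fundamentally based on Maxwell's equations. Here, take a look:

\begin{align}
\nabla \cdot \mathbf{E} &= \frac{\rho}{\varepsilon_0} \\
\nabla \cdot \mathbf{B} &= 0 \\
\nabla \times \mathbf{E} &= -\frac{\partial \mathbf{B}}{\partial t} \\
\nabla \times \mathbf{B} &= \mu_0 \, \mathbf{j} + \mu_0 \varepsilon_0 \frac{\partial \mathbf{E}}{\partial t}
\end{align}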

They may still seem a little cryptic to you, but after this lesson you will be able
to translate each of these four equations into a picture, which will be easier to
internalize.

As you can see from Maxwell's equations, the electric field $\mathbf{E}$ and the magnetic
field $\mathbf{B}$ appear there. You will hopefully have become familiar with these two
quantities in the chapters 11 and 12. Of course, I also assume that you have
read the chapter 5 on the nabla operator, the Stokes' Curl Theorem 7 and the
Gauss Divergence Theorem 6. If this is the case, then you will have no problem
understanding the following chapters.

13.1 Integral and Differential Representation


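The four Maxwell equations can be represented in two different ways:

• We can represent Maxwell's equations in integral form. Here we express
them with integrals. This allows us to understand Maxwell's equations
macroscopically. This is what the integral form looks like. Let it sink in
briefly:

\begin{align}
\oint_A \mathbf{E} \cdot d\mathbf{a} &= \frac{Q}{\varepsilon_0} \\
\oint_A \mathbf{B} \cdot d\mathbf{a} &= 0 \\
\oint_L \mathbf{E} \cdot d\mathbf{l} &= -\int_A \frac{\partial \mathbf{B}}{\partial t} \cdot d\mathbf{a} \\
\oint_L \mathbf{B} \cdot d\mathbf{l} &= \mu_0 I + \mu_0 \varepsilon_0 \int_A \frac{\partial \mathbf{E}}{\partial t} \cdot d\mathbf{a}
\end{align}

What feelings do these equations generate in you? Maybe fear? Trust me
- soon no more!

• We can represent Maxwell's equations in differential form. Here we
express them with derivatives. This allows us to understand Maxwell's
equations microscopically. This is what the differential form looks like:

\begin{align}
\nabla \cdot \mathbf{E} &= \frac{\rho}{\varepsilon_0} \\
\nabla \cdot \mathbf{B} &= 0 \\
\nabla \times \mathbf{E} &= -\frac{\partial \mathbf{B}}{\partial t} \\
\nabla \times \mathbf{B} &= \mu_0 \, \mathbf{j} + \mu_0 \varepsilon_0 \frac{\partial \mathbf{E}}{\partial t}
\end{align}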

What is the exact difference between these two representations? The two
representations are not physically different, but mathematically they are.
And they are useful for different problems in different ways:

• While the differential form of a Maxwell equation applies to a single
point in space, the integral form applies to an entire spatial region.

• The integral form is well suited for calculating symmetric problems,
such as calculating the electric field of a charged sphere, a charged cylinder
or a charged plane. The differential form is more suitable for calculating
complicated numerical problems or for various derivations, such
as the derivation of electromagnetic waves.

• In addition, the differential representation is much more compact than
the integral form.

Both representations are useful and can be converted into each other with
the help of two mathematical theorems (Gauss and Stokes), which you
learned about in the chapters 6 and 7. Once you have understood the two
theorems, it will be easy for you to convert the integral form into the differential
form and vice versa.

To understand Maxwell's equations, it is helpful to understand the electric and
magnetic flux and the voltage.

13.2 Electric and Magnetic Flux


In the chapter 6 on the Gauss Divergence Theorem, you learned about the flux
$\Phi$ of a vector field $\mathbf{F}$ through the surface $A$ (we do not need the circle at
the integral sign to define the flux):

$$ \Phi = \int_A \mathbf{F} \cdot d\mathbf{a} \tag{13.1} $$

The surface integral over the vector field $\mathbf{F}$ results in a number $\Phi$, which
indicates how much of the vector field $\mathbf{F}$ passes through the surface $A$.

If the vector field $\mathbf{F}$ in the surface integral is an electric field $\mathbf{F} = \mathbf{E}$, then
this surface integral is called electric flux $\Phi_e$ through the surface $A$:

$$ \Phi_e = \int_A \mathbf{E} \cdot d\mathbf{a} \tag{13.2} $$

If, on the other hand, the vector field $\mathbf{F}$ in the surface integral is a magnetic
field $\mathbf{F} = \mathbf{B}$, then this surface integral is referred to as magnetic flux $\Phi_m$
through the surface $A$:

$$ \Phi_m = \int_A \mathbf{B} \cdot d\mathbf{a} \tag{13.3} $$

13.3 Electric and Magnetic Voltage


In the chapter 7 on the Stokes' Curl Theorem, we denoted the following line
integral over the vector field $\mathbf{F}$ by $U$:

$$ U = \int_L \mathbf{F} \cdot d\mathbf{l} \tag{13.4} $$

The number $U$ indicates how much of the vector field circulates along the line
$L$. It was no coincidence that we gave it the same letter as the voltage.

If the vector field $\mathbf{F}$ in the line integral is an electric field $\mathbf{F} = \mathbf{E}$, then this
line integral is referred to as electric voltage $U_e$ along the line $L$:

$$ U_e = \int_L \mathbf{E} \cdot d\mathbf{l} \tag{13.5} $$

The voltage 13.5 in the case of an electric field is proportional to the change in
kinetic energy of a charged particle. For a positive voltage $U_e$:

• A positively charged particle gains energy when it passes through
the line $L$.

• A negatively charged particle loses energy when it passes through
the line $L$.

The line integral 13.5 of the electric field, that is the voltage $U_e$, measures
the kinetic energy gain or energy loss of a charged particle when it passes
through the considered line $L$ in the electric field. Note, however, that this
kinetic energy does not come from nothing, but is withdrawn from or added
to the electric field.
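The line integral 13.5 can be approximated by cutting the line into many short straight pieces. The following Python sketch is a simple illustration under assumptions: the field is uniform with strength `E0` along the x-axis, the two paths `straight` and `curved` are invented for this example, and the energy change of a charge q is taken as q times the voltage.

```python
import math

def line_integral(field, path, steps=1000):
    """Approximate U = integral of field . dl (Eq. 13.5) along path(s), s in [0, 1]."""
    total = 0.0
    for j in range(steps):
        start = path(j / steps)
        end = path((j + 1) / steps)
        midpoint = [(a + b) / 2.0 for a, b in zip(start, end)]
        dl = [b - a for a, b in zip(start, end)]
        total += sum(f * d for f, d in zip(field(midpoint), dl))
    return total

E0 = 100.0  # V/m, assumed field strength
D = 0.5     # m, assumed distance between start and end point along x

def uniform_field(point):
    """Assumed uniform electric field pointing in the x-direction."""
    return [E0, 0.0, 0.0]

def straight(s):
    return [D * s, 0.0, 0.0]

def curved(s):
    return [D * s, 0.1 * math.sin(math.pi * s), 0.0]

U_straight = line_integral(uniform_field, straight)
U_curved = line_integral(uniform_field, curved)

q = 1.602176634e-19  # C, elementary charge as an example of a positive charge
print(U_straight, U_curved)            # both approximately 50 V
print(q * U_straight, -q * U_straight) # energy gain of +q, energy loss of -q
```

In this uniform field both paths give the same voltage, because only the displacement along the field direction contributes to the line integral.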
If the vector field F in the line integral is a magnetic field F = B, then this line integral is referred to as the magnetic voltage Um along the line L:
$$U_m = \int_L \mathbf{B} \cdot d\mathbf{l} \tag{13.6}$$

In contrast to electric voltage, magnetic voltage has no interpretation as energy, because the particle does not change its energy when it passes along the line L. Magnetic fields do not perform any work on moving charges. Nevertheless, defining it in analogy to the electric voltage makes mathematical sense.

We will need this knowledge of electric and magnetic flux and voltage in a
moment to understand Maxwell's equations in integral form.

13.4 First Maxwell Equation


13.4.1 Macroscopic form
Let's take a look at the first Maxwell equation in integral form:
$$\oint_A \mathbf{E} \cdot d\mathbf{a} = \frac{Q}{\varepsilon_0} \tag{13.7}$$

You should be familiar with the left-hand side of Maxwell's equation 13.7. It is the electric flux Φe through an imaginary closed surface A that encloses something. The left-hand side of Maxwell's equation therefore tells you the net amount of electric field E that exits the surface A (what exits minus what enters):
$$\Phi_e = \frac{Q}{\varepsilon_0} \tag{13.8}$$

On the right-hand side of the first Maxwell equation is the total electric charge Q, which is enclosed by the surface A. The vacuum permittivity ε0 is only there to give both sides of the Maxwell equation the same unit, "volt times meter". The interesting thing is: it doesn't matter how this enclosed charge is distributed.

So what does Maxwell's first equation mean in integral form? The electric flux through a closed surface is determined by the electric charge enclosed by that surface.
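The statement that only the enclosed charge matters can be checked numerically. The following is a minimal sketch, not part of the original derivation: it assumes a single point charge with the Coulomb field and a spherical surface around the origin, and the function name `flux_through_sphere`, the example charge, and the grid sizes are illustrative choices.

```python
import math

EPS0 = 8.8541878128e-12  # vacuum permittivity in F/m

def flux_through_sphere(q, charge_pos, radius=1.0, n_theta=200, n_phi=400):
    """Midpoint-rule estimate of the flux of a point charge's field
    through a sphere of the given radius centred at the origin."""
    d_theta = math.pi / n_theta
    d_phi = 2.0 * math.pi / n_phi
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            # Outward unit normal of the sphere at (theta, phi)
            nx = math.sin(theta) * math.cos(phi)
            ny = math.sin(theta) * math.sin(phi)
            nz = math.cos(theta)
            # Vector from the charge to the point on the surface
            rx = radius * nx - charge_pos[0]
            ry = radius * ny - charge_pos[1]
            rz = radius * nz - charge_pos[2]
            dist3 = (rx * rx + ry * ry + rz * rz) ** 1.5
            # Coulomb field projected onto the normal: E . n
            e_dot_n = q * (rx * nx + ry * ny + rz * nz) / (4.0 * math.pi * EPS0 * dist3)
            # Surface element of the sphere: da = r^2 sin(theta) dtheta dphi
            total += e_dot_n * radius ** 2 * math.sin(theta) * d_theta * d_phi
    return total

q = 1.0e-9  # 1 nC, an arbitrary example value
flux_inside = flux_through_sphere(q, (0.3, 0.2, -0.1))  # charge enclosed, off-centre
flux_outside = flux_through_sphere(q, (2.0, 0.0, 0.0))  # charge outside the sphere
print(flux_inside, q / EPS0, flux_outside)
```

With the charge inside the sphere, the estimate should agree with Q/ε0 regardless of where inside it sits; with the charge outside, the flux should be close to zero.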

13.4.2 Microscopic form


This is the macroscopic interpretation of Maxwell's first equation. For a
microscopic interpretation, we need to convert the integral form into a
differential form. How do we do that? We must first convert both sides of
Maxwell's equation 13.7 into a volume integral.

With the Gauss Divergence Theorem 6.1, which links a volume integral with a surface integral, the surface integral on the left-hand side of the first Maxwell equation can be rewritten as a volume integral:
$$\int_V (\nabla \cdot \mathbf{E}) \, dv = \frac{Q}{\varepsilon_0} \tag{13.9}$$

The enclosed charge Q can also be expressed with a volume integral. The charge corresponds to the charge density ρ integrated over the considered volume V, because charge density is by definition charge per volume. This means that the volume integral of the charge density ρ over a volume V corresponds to the charge enclosed in this volume. This transforms the right-hand side of the first Maxwell equation 13.9 into a volume integral:
$$\int_V (\nabla \cdot \mathbf{E}) \, dv = \frac{1}{\varepsilon_0} \int_V \rho \, dv \tag{13.10}$$

On both sides in Eq. 13.10 we integrate over the same volume V. To ensure that this equation is always fulfilled for any chosen volume, the integrands on both sides must be the same (whereby the right integrand is multiplied by the constant 1/ε0). This results in the differential form of the first Maxwell equation:
$$\nabla \cdot \mathbf{E} = \frac{\rho}{\varepsilon_0} \tag{13.11}$$

On the left-hand side of the differential form is the divergence ∇ · E of the electric field. The divergence at the location (x, y, z) can be positive, negative or zero. We learned what this means in chapter 5.2:


• If the divergence ∇ · E(x, y, z) > 0 at the location (x, y, z) is positive, then the charge density ρ(x, y, z) at the location (x, y, z) is also positive. There is therefore a positive electric charge at the location (x, y, z), which is a source of the electric field.

• If the divergence ∇ · E(x, y, z) < 0 at the location (x, y, z) is negative, then the charge density ρ(x, y, z) at the location (x, y, z) is also negative. There is therefore a negative electric charge at the location (x, y, z), which is a sink of the electric field.

• If the divergence ∇ · E(x, y, z) = 0 at the location (x, y, z) is zero, then the charge density ρ(x, y, z) at the location (x, y, z) is also zero. There is therefore neither a negative nor a positive electric charge at the location (x, y, z), or there is just as much positive as negative charge there, so that the total charge at this point (x, y, z) cancels out. In the latter case, there is an ideal electric dipole at this point.

So what does Maxwell's first equation mean in differential form? The electric charges are the sources and sinks of the electric field. Charges generate an electric field.
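The link between divergence and charge density can be made concrete with a small numerical check. The sketch below is an illustration under stated assumptions, not the author's method: it assumes a uniformly charged ball (whose field is the standard textbook result, linear in r inside and Coulomb-like outside) and estimates the divergence with central finite differences. The names `e_field` and `divergence` and all numbers are hypothetical example choices.

```python
EPS0 = 8.8541878128e-12  # vacuum permittivity in F/m
RHO = 1.0e-6             # charge density inside the ball in C/m^3 (example value)
R_BALL = 1.0             # radius of the ball in m (example value)

def e_field(x, y, z):
    """Electric field of a uniformly charged ball centred at the origin."""
    r = (x * x + y * y + z * z) ** 0.5
    if r <= R_BALL:
        scale = RHO / (3.0 * EPS0)                          # inside: E = rho r / (3 eps0)
    else:
        scale = RHO * R_BALL ** 3 / (3.0 * EPS0 * r ** 3)   # outside: point-charge field
    return (scale * x, scale * y, scale * z)

def divergence(field, x, y, z, h=1e-4):
    """Central-difference estimate of div F = dFx/dx + dFy/dy + dFz/dz."""
    dfx = (field(x + h, y, z)[0] - field(x - h, y, z)[0]) / (2.0 * h)
    dfy = (field(x, y + h, z)[1] - field(x, y - h, z)[1]) / (2.0 * h)
    dfz = (field(x, y, z + h)[2] - field(x, y, z - h)[2]) / (2.0 * h)
    return dfx + dfy + dfz

div_inside = divergence(e_field, 0.3, -0.2, 0.1)    # point inside the charged ball
div_outside = divergence(e_field, 1.5, 0.5, -1.0)   # point in charge-free space
print(div_inside, RHO / EPS0, div_outside)
```

Inside the ball the divergence should match ρ/ε0 (a source), while in the charge-free region outside it should vanish up to the finite-difference error.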

13.5 Second Maxwell Equation


13.5.1 Macroscopic form
The second Maxwell equation in integral form looks like this:
$$\oint_A \mathbf{B} \cdot d\mathbf{a} = 0 \tag{13.12}$$

Nothing here should be unfamiliar to you. On the left-hand side is a surface integral. However, it is not integrated over an electric vector field, as in the first Maxwell equation, but over a magnetic vector field B. This surface integral corresponds to the magnetic flux Φm through the closed surface A.

So what does Maxwell's second equation mean in integral form from a macroscopic point of view? The magnetic flux through a closed surface is always zero. There are as many magnetic field vectors pointing into the surface as out.

13.5.2 Microscopic form


To obtain the differential form of the second Maxwell equation, we must convert the surface integral in the second Maxwell equation 13.12 into a volume integral. To do this, we simply replace the surface integral with the volume integral using the Gauss Divergence Theorem 6.1. The second Maxwell equation then looks like this:
$$\int_V (\nabla \cdot \mathbf{B}) \, dv = 0 \tag{13.13}$$

The integral 13.13 is zero for every volume V only if the integrand ∇ · B is zero. This is how the second Maxwell equation in its differential form emerges:
$$\nabla \cdot \mathbf{B} = 0 \tag{13.14}$$

The magnetic counterpart to the electric charge would be the magnetic charge: an isolated magnetic south pole or north pole. We call these magnetic monopoles. They have never been observed experimentally, which is why it is assumed for now that no magnetic monopoles exist. Their non-existence is captured in the second Maxwell equation (the right-hand side is zero).

The differential representation of Maxwell's second equation allows us a microscopic interpretation of the non-existence of magnetic monopoles. The vanishing divergence ∇ · B(x, y, z) of the magnetic field means: no matter which point in space (x, y, z) we look at, there is no magnetic monopole at that point. The divergence also vanishes when there is an ideal magnetic dipole at the point (x, y, z). A magnetic dipole is a combination of a south pole and a north pole, which are inseparably connected to each other.

Since there are no magnetic monopoles, there are no sources and sinks of the magnetic field. Consequently, there are no points in space where magnetic field vectors originate or converge. The magnetic field lines must therefore always be closed.
The second Maxwell equation is, just like the other Maxwell equations, an experimental observation. This means that if at some point a magnetic monopole is found, for example a single north pole without an associated south pole, then Maxwell's second equation must be modified. That would be nice for us, because then Maxwell's equations would be even more symmetrical!

13.6 Third Maxwell Equation


13.6.1 Macroscopic form
The third Maxwell equation in integral form looks like this:
$$\oint_L \mathbf{E} \cdot d\mathbf{l} = -\int_A \frac{\partial \mathbf{B}}{\partial t} \cdot d\mathbf{a} \tag{13.15}$$

On the left-hand side is a line integral of the electric field E over a closed line L that borders the surface A. This line integral sums up all components E|| of the electric field that run along the line L. From section 13.3 we know that this line integral corresponds to the electric voltage Ue along the loop L.

We can also write the third Maxwell equation as follows:
$$U_e = -\int_A \frac{\partial \mathbf{B}}{\partial t} \cdot d\mathbf{a} \tag{13.16}$$

This form of Maxwell's third equation is referred to as the law of induction. On the right-hand side of the law of induction is a surface integral of the time derivative of the magnetic field. If the surface A through which the magnetic field penetrates does not change over time, we can put the time derivative ∂/∂t in front of the integral:
$$U_e = -\frac{\partial}{\partial t} \int_A \mathbf{B} \cdot d\mathbf{a} \tag{13.17}$$

Now we can interpret the surface integral on the right-hand side as the magnetic flux Φm through the surface A:
$$U_e = -\frac{\partial \Phi_m}{\partial t} \tag{13.18}$$

The time derivative still acts on the magnetic flux: in the third Maxwell equation, the magnetic flux is differentiated with respect to time. The time derivative of the magnetic flux indicates how quickly the magnetic flux changes as time passes. The third Maxwell equation therefore tells us two equivalent things:

• The faster the magnetic flux Φm through the enclosed surface A changes, the greater the voltage Ue generated along the edge L of the surface.

• The faster the magnetic flux through the enclosed surface A changes, the stronger the parallel field component E||, which runs along the edge L of the surface. This electric field along the edge is also referred to as the electric vortex field, because this field component swirls around the edge of the surface:
$$\oint_L \mathbf{E} \cdot d\mathbf{l} = -\frac{\partial \Phi_m}{\partial t} \tag{13.19}$$

Of course, we can also interpret Maxwell's third equation 13.19, expressed with the vortex field, the other way round: the stronger the electric vortex field E (or more precisely E||) around the surface boundary L, the faster the magnetic flux Φm (or equivalently the field component B|| parallel to the surface element da) changes through the surface A.

You are probably also wondering what the minus sign before the time derivative means. The minus sign takes into account the circulation direction of the vortex field:

• If the change in magnetic flux is positive, that is, ∂Φm/∂t > 0, then the voltage is negative due to the minus sign, that is, Ue < 0.

• If the change in magnetic flux is negative, that is, ∂Φm/∂t < 0, then the voltage is positive due to the minus sign, that is, Ue > 0.

The vortex component E|| of the electric field E thus swirls around in such a way that the change in magnetic flux is impeded. Nature tries to prevent the change in flux with a vortex field. You probably remember this minus sign in the third Maxwell equation from school as Lenz's rule. The minus sign takes into account the law of conservation of energy.
What would happen if we omitted the minus sign? The electric vortex field (with the field energy We) would generate a magnetic flux change ∂Φm/∂t. This in turn would amplify the electric vortex field. This would increase the field energy We. The increased vortex field leads to an increased change in flux. This in turn leads to an even larger vortex field and thus to greater field energy. This mutual amplification does not stop and the field energy We becomes infinitely large. We could tap into this with a capacitor, for example, and have an inexhaustible source of energy. This violates the law of conservation of energy, that is, the first law of thermodynamics, which states that it is impossible to build such a perpetual motion machine.

If the magnetic flux does not change over time, ∂Φm/∂t = 0, then of course there is also no electric vortex field and no electric voltage. The right-hand side of the third Maxwell equation 13.19 is therefore zero:
$$\oint_L \mathbf{E} \cdot d\mathbf{l} = 0 \tag{13.20}$$

Equation 13.20 states that the line integral over the electric field, that is, the
electric voltage Ue, is always zero along a closed line L. There is therefore
no electric vortex field as long as there is no time-varying magnetic
field! This means: if an electron were to travel once around the closed line L in the
electric field E, the electron would not change its energy.
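The law of induction in the form Ue = −∂Φm/∂t can be sketched numerically. The example below is only an illustration: it assumes a flat loop rotating in a uniform magnetic field, so that the flux is B0·A·cos(ωt), and approximates the time derivative with a central difference. The field strength, loop area, frequency, and function names are assumed example values, not quantities from the text.

```python
import math

B0 = 0.5                     # magnetic field strength in T (example value)
AREA = 0.01                  # loop area in m^2 (example value)
OMEGA = 2.0 * math.pi * 50   # angular frequency of the rotation in rad/s

def rotating_loop_flux(t):
    """Magnetic flux through a loop rotating in a uniform field."""
    return B0 * AREA * math.cos(OMEGA * t)

def constant_flux(t):
    """Magnetic flux through a loop at rest in a static field."""
    return B0 * AREA

def induced_voltage(flux, t, dt=1e-7):
    """U_e = -dPhi_m/dt, with the derivative taken as a central difference."""
    return -(flux(t + dt) - flux(t - dt)) / (2.0 * dt)

t = 0.002  # time in s at which the voltage is evaluated
u_numeric = induced_voltage(rotating_loop_flux, t)
u_analytic = B0 * AREA * OMEGA * math.sin(OMEGA * t)  # -d/dt of B0*A*cos(wt)
u_static = induced_voltage(constant_flux, t)
print(u_numeric, u_analytic, u_static)
```

At the chosen time the flux is decreasing, so the minus sign makes the induced voltage positive; for the constant flux the voltage is zero, which is the static case of Eq. 13.20.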

13.6.2 Microscopic form


Let us now convert the third Maxwell equation 13.15 in integral representation into a differential representation in order to be able to interpret the Maxwell equation microscopically. To do this, we use Stokes' Curl Theorem from Chapter 7, which links a line integral with a surface integral. So let's replace the line integral with a surface integral in 13.15:
$$\int_A (\nabla \times \mathbf{E}) \cdot d\mathbf{a} = -\int_A \frac{\partial \mathbf{B}}{\partial t} \cdot d\mathbf{a} \tag{13.21}$$

This brings the curl ∇ × E into play. Since the equation 13.21 applies to any surface A, the integrands on both sides must be equal. This yields the third Maxwell equation in differential representation:
$$\nabla \times \mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t} \tag{13.22}$$

The differential representation 13.22 states: if the magnetic field B(t, x, y, z) changes in time at the point in space (x, y, z), then an electric vortex field E(t, x, y, z) is generated around this point in space, which attempts to suppress this change in the magnetic field. Of course, the interpretation also works the other way round: an electric vortex field around a point in space generates a magnetic field that changes over time.

If the magnetic field does not change, that is, if it is static, the right-hand side in 13.22 is zero and the third Maxwell equation is simplified to an electrostatic Maxwell equation. Electrostatic here means that the electric field E is time-independent:
$$\nabla \times \mathbf{E} = 0 \tag{13.23}$$

As long as there is no changing magnetic field, the electric field E is always vortex-free. We know this from mathematics: if the curl ∇ × F of a vector field F vanishes, then the vector field is conservative, that is, it conserves energy. The electrostatic electric field E in Eq. 13.23 is therefore conservative. Electric charges that travel once around a closed path in this electric field are, on balance, neither accelerated nor decelerated: they return with the same energy.
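That an electrostatic field has zero voltage around a closed loop can be checked with a short numerical line integral. This is a sketch under assumptions: the field is taken to be the Coulomb field of a single point charge at the origin, and the loop is a circle in the xy-plane that does not pass through the charge. The function names and all numbers are hypothetical example choices.

```python
import math

K = 8.9875517923e9  # Coulomb constant 1/(4 pi eps0) in N m^2 / C^2

def e_field(x, y, q=1.0e-9):
    """Coulomb field (in the xy-plane) of a point charge q at the origin."""
    r3 = (x * x + y * y) ** 1.5
    return K * q * x / r3, K * q * y / r3

def voltage_around_circle(cx, cy, radius, n=2000):
    """Midpoint-rule estimate of the closed line integral of E . dl
    around a circle with centre (cx, cy)."""
    ds = 2.0 * math.pi / n
    total = 0.0
    for i in range(n):
        s = (i + 0.5) * ds
        x = cx + radius * math.cos(s)
        y = cy + radius * math.sin(s)
        ex, ey = e_field(x, y)
        # Line element tangent to the circle: dl = R (-sin s, cos s) ds
        dlx = -radius * math.sin(s) * ds
        dly = radius * math.cos(s) * ds
        total += ex * dlx + ey * dly
    return total

loop_voltage = voltage_around_circle(1.0, 0.5, 0.4)
print(loop_voltage)  # expected to vanish up to rounding error
```

The energy a charge gains on one part of the loop is given back on another part, so the sum over the closed loop is zero, in line with Eq. 13.20.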

13.7 Fourth Maxwell Equation


13.7.1 Macroscopic form
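The fourth Maxwell equation in integral form looks like this:
$$\oint_L \mathbf{B} \cdot d\mathbf{l} = \mu_0 I + \mu_0 \varepsilon_0 \int_A \frac{\partial \mathbf{E}}{\partial t} \cdot d\mathbf{a} \tag{13.24}$$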

On the left-hand side of Maxwell's fourth equation is a line integral of the magnetic field B along the closed line (loop) L. We already know what this line integral means from the line integral over the electric field in Eq. 13.19. It indicates the vortex component B|| of the magnetic field B, which swirls around the loop L.
The right-hand side of Maxwell's fourth equation 13.24 tells us how we can generate this magnetic vortex field:

• We can generate it with an electric current I through the surface A. This current does not have to change in time to generate a magnetic vortex field.

• We can generate it with a time-changing electric field E(t) through the surface A.

• We can generate it with both contributions, I and E(t).

The physical constants ε0 and µ0 in the fourth Maxwell equation are irrelevant for understanding the fourth Maxwell equation. These constants merely ensure that the right-hand side also has the unit "Tesla times meter", like the left-hand side of the equation.
On the right-hand side of 13.24 we can pull the time derivative in front of the integral if the surface A does not change:
$$\oint_L \mathbf{B} \cdot d\mathbf{l} = \mu_0 I + \mu_0 \varepsilon_0 \frac{\partial}{\partial t} \int_A \mathbf{E} \cdot d\mathbf{a} \tag{13.25}$$

Then the surface integral, integrated over the electric field, corresponds exactly to the electric flux Φe through the surface A:
$$\oint_L \mathbf{B} \cdot d\mathbf{l} = \mu_0 I + \mu_0 \varepsilon_0 \frac{\partial \Phi_e}{\partial t} \tag{13.26}$$

13.7.2 Ampere’s Law


An important special case arises if the electric flux does not change over time, that is, ∂Φe/∂t = 0. Maxwell's fourth equation then simplifies to Ampere's Law:
$$\oint_L \mathbf{B} \cdot d\mathbf{l} = \mu_0 I \tag{13.27}$$

According to Ampere's Law, a current-carrying wire generates a magnetic vortex field around itself.
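Ampere's Law can be illustrated with a numerical line integral. The sketch below assumes the standard field of an infinitely long straight wire along the z-axis (magnitude µ0·I/(2πr), circling the wire) and integrates B along a circular loop in the xy-plane. The loop positions, the current of 2 A, and the function names are assumptions made for this example.

```python
import math

MU0 = 1.25663706212e-6  # vacuum permeability in T m / A

def b_field(x, y, current):
    """Field (in the xy-plane) of an infinitely long straight wire along z."""
    factor = MU0 * current / (2.0 * math.pi * (x * x + y * y))
    return -factor * y, factor * x

def circulation(cx, cy, radius, current, n=1000):
    """Midpoint-rule estimate of the closed line integral of B . dl
    around a circle with centre (cx, cy)."""
    ds = 2.0 * math.pi / n
    total = 0.0
    for i in range(n):
        s = (i + 0.5) * ds
        x = cx + radius * math.cos(s)
        y = cy + radius * math.sin(s)
        bx, by = b_field(x, y, current)
        # Line element tangent to the circle: dl = R (-sin s, cos s) ds
        total += bx * (-radius * math.sin(s) * ds) + by * (radius * math.cos(s) * ds)
    return total

current = 2.0  # current in A (example value)
around_wire = circulation(0.01, -0.005, 0.05, current)  # loop encloses the wire
beside_wire = circulation(0.2, 0.0, 0.05, current)      # loop does not enclose it
print(around_wire, MU0 * current, beside_wire)
```

The loop around the wire should yield µ0·I even though it is not centred on the wire, while the loop beside the wire should yield zero because no current passes through its surface.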

13.7.3 Microscopic form


Let us now derive the differential form of the fourth Maxwell equation. Using Stokes' Curl Theorem 7.1, we convert the line integral inside the integral form 13.24 into a surface integral. In this way, the curl of the magnetic field ∇ × B comes into play:
$$\int_A (\nabla \times \mathbf{B}) \cdot d\mathbf{a} = \mu_0 I + \mu_0 \varepsilon_0 \int_A \frac{\partial \mathbf{E}}{\partial t} \cdot d\mathbf{a} \tag{13.28}$$

Now the term with the electric current I must be converted into a surface integral. To do this, we simply have to express the current with the electric current density j. The current density is defined as the current per cross-sectional area. Consequently, the surface integral over the current density corresponds to the current through the cross-sectional area A. The fourth Maxwell equation thus becomes:
$$\int_A (\nabla \times \mathbf{B}) \cdot d\mathbf{a} = \mu_0 \int_A \mathbf{j} \cdot d\mathbf{a} + \mu_0 \varepsilon_0 \int_A \frac{\partial \mathbf{E}}{\partial t} \cdot d\mathbf{a} \tag{13.29}$$

Note that the scalar product of the current density with the surface element da is taken in the integral. The scalar product therefore only picks the component j|| of the current density vector that runs parallel to the surface element da. Only this current density component contributes to the current through the cross-sectional area A.
Now we have a surface integral in each term of the fourth Maxwell equation. We can combine the two surface integrals on the right-hand side into one surface integral because both summands are integrated over the same area:
$$\int_A (\nabla \times \mathbf{B}) \cdot d\mathbf{a} = \int_A \left( \mu_0 \mathbf{j} + \mu_0 \varepsilon_0 \frac{\partial \mathbf{E}}{\partial t} \right) \cdot d\mathbf{a} \tag{13.30}$$

For the Maxwell equation 13.30 to be fulfilled for any surface A, the integrands on both sides must be equal. This results in the fourth Maxwell equation in differential representation:
$$\nabla \times \mathbf{B} = \mu_0 \mathbf{j} + \mu_0 \varepsilon_0 \frac{\partial \mathbf{E}}{\partial t} \tag{13.31}$$

What does Maxwell's fourth equation mean microscopically? If the electric field E(t, x, y, z) changes over time at the point in space (x, y, z) or if the current density j(t, x, y, z) is not zero, then a magnetic vortex field B(t, x, y, z) is generated around this point in space.

You should now have an intuitive understanding of Maxwell's four equations. Next, let's take a look at where exactly the electromagnetic waves are hidden in Maxwell's equations.
14. Electromagnetic Waves

More: en.fufaev.org/electromagnetic-waves
An electromagnetic wave (short: EM wave) consists of an electric field component E(t, x, y, z) and a magnetic field component B(t, x, y, z). The two field components assign an electric and magnetic field strength and its direction to each point (x, y, z) in three-dimensional space at each time t. The two field components are therefore three-dimensional vector fields:
$$\mathbf{E}(t, x, y, z) = \begin{pmatrix} E_1(t, x, y, z) \\ E_2(t, x, y, z) \\ E_3(t, x, y, z) \end{pmatrix} \qquad \mathbf{B}(t, x, y, z) = \begin{pmatrix} B_1(t, x, y, z) \\ B_2(t, x, y, z) \\ B_3(t, x, y, z) \end{pmatrix}$$

• The magnitude (field strength) E(t, x, y, z) (not shown in bold) indicates the electric amplitude of an electromagnetic wave.

• The magnitude (field strength) B(t, x, y, z) indicates the magnetic amplitude of an electromagnetic wave.

The amplitudes are generally not only dependent on location, but they also change with time t. This is the only way to obtain an electromagnetic oscillation in space and time. The following image shows the oscillation of the E and B vectors in space. The wave vector k indicates the propagation direction of the electromagnetic wave:

[Figure: oscillating E and B vectors of an electromagnetic wave with wave vector k]

How an electromagnetic wave changes exactly in space and time, that is, how it moves and propagates in space, is described by the wave equations for the E and B vectors. Let's take a look at how we can extract these wave equations from Maxwell's equations:
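$$\begin{aligned}
\nabla \cdot \mathbf{E} &= \frac{\rho}{\varepsilon_0} \\
\nabla \cdot \mathbf{B} &= 0 \\
\nabla \times \mathbf{E} &= -\frac{\partial \mathbf{B}}{\partial t} \\
\nabla \times \mathbf{B} &= \mu_0 \mathbf{j} + \mu_0 \varepsilon_0 \frac{\partial \mathbf{E}}{\partial t}
\end{aligned}$$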

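We assume that the electromagnetic wave propagates in empty space, without charges (ρ = 0) and currents (j = 0). We therefore set both the charge density ρ and the current density j in Maxwell's equations to zero. This simplifies them to the charge- and current-free Maxwell's equations:
\begin{align}
\nabla \cdot \mathbf{E} &= 0 \tag{14.1} \\
\nabla \cdot \mathbf{B} &= 0 \tag{14.2} \\
\nabla \times \mathbf{E} &= -\frac{\partial \mathbf{B}}{\partial t} \tag{14.3} \\
\nabla \times \mathbf{B} &= \mu_0 \varepsilon_0 \frac{\partial \mathbf{E}}{\partial t} \tag{14.4}
\end{align}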

The general form of a wave equation for a vector field F looks like this:
$$\nabla^2 \mathbf{F} = \frac{1}{v_p^2} \frac{\partial^2 \mathbf{F}}{\partial t^2} \tag{14.5}$$

Here F is an arbitrary vector field that satisfies the wave equation and vp is the phase velocity of the wave. It indicates how fast a point of the wave moves in space. Since we are not considering dispersion (meaning that the wave spreads out as it travels), the phase velocity describes the propagation speed of the wave.

A relation that is necessary for the derivation of the wave equation is the following identity for the curl of the curl of the vector field F (double cross product):
$$\nabla \times \nabla \times \mathbf{F} = \nabla (\nabla \cdot \mathbf{F}) - \nabla^2 \mathbf{F} \tag{14.6}$$

The four Maxwell equations are coupled differential equations. "Coupled" here means that the third and fourth Maxwell equations contain both the E field and the B field. To obtain the wave equations for the E and B field components of an electromagnetic wave, we need to decouple the two coupled Maxwell equations. Let's do that. It's quite simple.

14.1 Wave equation for the E-field


To arrive at the wave equation for the electric field E, we have to decouple the third Maxwell equation:
$$\nabla \times \mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t} \tag{14.7}$$

Let's apply the curl operator ∇× to both sides of the third Maxwell equation:
$$\nabla \times \nabla \times \mathbf{E} = \nabla \times \left( -\frac{\partial \mathbf{B}}{\partial t} \right) \tag{14.8}$$

The time derivative together with the minus sign may be placed in front of the Nabla operator, since the Nabla operator only contains spatial derivatives and thus does not depend on time:
$$\nabla \times \nabla \times \mathbf{E} = -\frac{\partial}{\partial t} (\nabla \times \mathbf{B}) \tag{14.9}$$

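Now we can replace the curl ∇ × B of the magnetic field using the fourth current-free Maxwell equation 14.4:
\begin{align}
\nabla \times \nabla \times \mathbf{E} &= -\frac{\partial}{\partial t} \left( \mu_0 \varepsilon_0 \frac{\partial \mathbf{E}}{\partial t} \right) \tag{14.10} \\
&= -\mu_0 \varepsilon_0 \frac{\partial}{\partial t} \left( \frac{\partial \mathbf{E}}{\partial t} \right) \tag{14.11} \\
&= -\mu_0 \varepsilon_0 \frac{\partial^2 \mathbf{E}}{\partial t^2} \tag{14.12}
\end{align}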

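We are finished with the right-hand side. Apart from the sign, it has the same form as the right-hand side of the general wave equation 14.5. Now we have to replace the double cross product on the left-hand side with the relation 14.6:
$$\nabla (\nabla \cdot \mathbf{E}) - \nabla^2 \mathbf{E} = -\mu_0 \varepsilon_0 \frac{\partial^2 \mathbf{E}}{\partial t^2} \tag{14.13}$$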

On the left-hand side is the divergence ∇ · E of the electric field. According to Maxwell's first equation 14.1, the divergence of the electric field in charge-free space is always zero. This simplifies 14.13 to the wave equation for the electric field component of an electromagnetic wave:
$$\nabla^2 \mathbf{E} = \mu_0 \varepsilon_0 \frac{\partial^2 \mathbf{E}}{\partial t^2} \tag{14.14}$$

The wave equation thus links the spatial derivatives ∇²E of the electric field with the time derivatives ∂²E/∂t² and thus represents a system of three partial differential equations.
If we compare the wave equation 14.14 for the electric field with the general form 14.5 of a wave equation, we find out how the propagation velocity vp is related to the two field constants µ0 and ε0:
$$\frac{1}{v_p^2} = \mu_0 \varepsilon_0 \quad \leftrightarrow \quad v_p = \frac{1}{\sqrt{\mu_0 \varepsilon_0}} \tag{14.15}$$

If we specifically calculate the propagation velocity of an electromagnetic wave, we get the speed of light c:
$$v_p = \frac{1}{\sqrt{\mu_0 \varepsilon_0}} \approx 3 \times 10^8 \, \frac{\mathrm{m}}{\mathrm{s}} = c \tag{14.16}$$

From Maxwell's equations and the derived wave equation for the E field, we can conclude that the electric field component of the electromagnetic wave propagates at the speed of light. We will see that this also applies to the B field component. We can therefore express the E wave equation with the speed of light:
$$\nabla^2 \mathbf{E} = \frac{1}{c^2} \frac{\partial^2 \mathbf{E}}{\partial t^2} \tag{14.17}$$
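The numerical value in Eq. 14.16 can be reproduced in a few lines. This is a small sketch; the constants are the published CODATA 2018 values of µ0 and ε0, rounded as written in the code.

```python
import math

MU0 = 1.25663706212e-6   # vacuum permeability in T m / A (CODATA 2018)
EPS0 = 8.8541878128e-12  # vacuum permittivity in F/m (CODATA 2018)

# Propagation velocity from Eq. 14.15: v_p = 1 / sqrt(mu0 * eps0)
v_p = 1.0 / math.sqrt(MU0 * EPS0)
print(v_p)  # close to 3e8 m/s, the speed of light
```

The result agrees with the defined speed of light of 299 792 458 m/s to within the rounding of the two constants.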

14.2 Wave equation for the B field


To derive the wave equation for the magnetic field B, we have to decouple the fourth Maxwell equation 14.4. Decoupling is done in the same way as we did with the E field.

Apply the curl operator ∇× to both sides of the fourth current-free Maxwell equation:
$$\nabla \times \nabla \times \mathbf{B} = \nabla \times \left( \mu_0 \varepsilon_0 \frac{\partial \mathbf{E}}{\partial t} \right) \tag{14.18}$$

Now let's move the time derivative and the two constants on the right-hand side in front of the Nabla operator:
$$\nabla \times \nabla \times \mathbf{B} = \mu_0 \varepsilon_0 \frac{\partial}{\partial t} (\nabla \times \mathbf{E}) \tag{14.19}$$

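Now we can replace the curl ∇ × E of the electric field using the third Maxwell equation 14.3:
$$\nabla \times \nabla \times \mathbf{B} = \mu_0 \varepsilon_0 \frac{\partial}{\partial t} \left( -\frac{\partial \mathbf{B}}{\partial t} \right) \tag{14.20}$$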

We have now decoupled the fourth Maxwell equation. The time derivatives on the right-hand side are combined and the double cross product on the left-hand side is replaced using the calculation rule 14.6:
$$\nabla (\nabla \cdot \mathbf{B}) - \nabla^2 \mathbf{B} = -\mu_0 \varepsilon_0 \frac{\partial^2 \mathbf{B}}{\partial t^2} \tag{14.21}$$

The divergence ∇ · B = 0 disappears according to the second Maxwell equation (there are no magnetic monopoles). The term ∇(∇ · B) therefore disappears and what remains is the wave equation for the magnetic field of an electromagnetic wave:
$$\nabla^2 \mathbf{B} = \mu_0 \varepsilon_0 \frac{\partial^2 \mathbf{B}}{\partial t^2} \tag{14.22}$$

In order to find concrete electromagnetic waves (spherical waves emanating from a radio tower, for example), we have to solve these wave equations for certain initial or boundary conditions. Mathematicians or Python can do this for us. It is important that you now know how to get wave equations from Maxwell's equations and that their solutions describe electromagnetic waves that propagate at the speed of light.
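As a hedged illustration of what "letting Python do this" could look like, the sketch below solves the wave equation for a single field component in one spatial dimension with a simple explicit finite-difference (leapfrog) scheme. The grid, the Gaussian start pulse, the fixed zero boundary values, and the function names are assumptions chosen for this example; real problems usually call for more careful numerical methods.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def gaussian(x, center=100.0, width=10.0):
    """Initial pulse shape of the field component (arbitrary units)."""
    return math.exp(-((x - center) ** 2) / (2.0 * width ** 2))

def propagate_pulse(n_points=400, dx=1.0, n_steps=150):
    """Leapfrog scheme for d^2E/dt^2 = c^2 d^2E/dx^2 on a uniform grid."""
    dt = dx / C                      # time step chosen so that c*dt = dx
    r2 = (C * dt / dx) ** 2          # squared Courant number
    xs = [i * dx for i in range(n_points)]
    current = [gaussian(x) for x in xs]            # field at t = 0
    previous = [gaussian(x + C * dt) for x in xs]  # field at t = -dt (right-moving pulse)
    for _ in range(n_steps):
        new = [0.0] * n_points       # boundary values stay fixed at zero
        for i in range(1, n_points - 1):
            new[i] = (2.0 * current[i] - previous[i]
                      + r2 * (current[i + 1] - 2.0 * current[i] + current[i - 1]))
        previous, current = current, new
    return xs, current, n_steps * dt

xs, field, elapsed = propagate_pulse()
peak_x = xs[field.index(max(field))]        # position of the pulse maximum
measured_speed = (peak_x - 100.0) / elapsed  # distance travelled per elapsed time
print(peak_x, measured_speed)
```

The pulse keeps its shape and its maximum moves to the right at the speed c that appears in the wave equation, which is the numerical counterpart of the statement above.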

14.3 A few hints


The derived wave equations 14.14 and 14.22 are partial differential equations of the second order. For example, if we look at the wave equation 14.14 for the E field, then, strictly speaking, these are three partial differential equations:
$$\nabla^2 \mathbf{E} = \frac{1}{c^2} \frac{\partial^2 \mathbf{E}}{\partial t^2}$$

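Why? Because the E field is a vector with three components E1, E2 and E3. Let's write out this wave equation to understand what I mean:
$$\begin{pmatrix}
\frac{\partial^2 E_1}{\partial x^2} + \frac{\partial^2 E_1}{\partial y^2} + \frac{\partial^2 E_1}{\partial z^2} \\
\frac{\partial^2 E_2}{\partial x^2} + \frac{\partial^2 E_2}{\partial y^2} + \frac{\partial^2 E_2}{\partial z^2} \\
\frac{\partial^2 E_3}{\partial x^2} + \frac{\partial^2 E_3}{\partial y^2} + \frac{\partial^2 E_3}{\partial z^2}
\end{pmatrix}
= \frac{1}{c^2}
\begin{pmatrix}
\frac{\partial^2 E_1}{\partial t^2} \\
\frac{\partial^2 E_2}{\partial t^2} \\
\frac{\partial^2 E_3}{\partial t^2}
\end{pmatrix} \tag{14.23}$$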

There are three differential equations for the E field that you have to solve.
Fortunately, they are not coupled and can therefore be solved independently
of each other. Physically, non-coupled differential equations mean: the three
field components E1, E2 and E3 oscillate independently of each other.
They do not interfere with each other!

The solution E(t, x, y, z) of the wave equation 14.17 is an electric wave, but it
does not necessarily represent the E field of an electromagnetic wave
just because E(t, x, y, z) solves the wave equation. The solution E(t, x, y, z)
only describes the E field of an electromagnetic wave in a vacuum if the solution
also satisfies all four Maxwell equations.
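A short worked example may make both points concrete. The plane wave below is an assumption chosen for illustration (amplitude E0, wavenumber k, angular frequency ω, propagation along x, field along y); it is not taken from the derivation above.

```latex
\begin{align}
\mathbf{E}(t, x) &= E_0 \cos(kx - \omega t)\, \hat{\mathbf{e}}_y \\
\nabla^2 \mathbf{E} &= \frac{\partial^2 \mathbf{E}}{\partial x^2} = -k^2\, \mathbf{E} \\
\frac{1}{c^2} \frac{\partial^2 \mathbf{E}}{\partial t^2} &= -\frac{\omega^2}{c^2}\, \mathbf{E} \\
\Rightarrow \quad k^2 &= \frac{\omega^2}{c^2} \quad \Leftrightarrow \quad \omega = c\, k \\
\nabla \cdot \mathbf{E} &= \frac{\partial E_y}{\partial y} = 0
\end{align}
```

This wave solves the wave equation 14.17 as long as ω = ck, and it also satisfies the first charge-free Maxwell equation 14.1. If the same cosine were instead directed along the propagation direction x, it would still solve the wave equation, but its divergence ∂Ex/∂x = −k E0 sin(kx − ωt) would not vanish, so it could not be the E field of an electromagnetic wave in a vacuum.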
From the fourth, current-free Maxwell equation 14.4, for example, we can directly read off the orientation of the E and B field components. Here is the Maxwell equation again:
$$\nabla \times \mathbf{B} = \mu_0 \varepsilon_0 \frac{\partial \mathbf{E}}{\partial t}$$

We know from mathematics that the result vector of a cross product is always orthogonal to the vectors between which the cross product is formed. For a plane wave, the curl ∇ × B is therefore orthogonal to the B field vector. In this case, the B field vector is orthogonal to the time derivative of the E field vector. For a linearly polarized wave, the time derivative does not change the line along which a vector points. The E field vector and its derivative therefore point along the same line. Thus the solutions E(t, x, y, z) and B(t, x, y, z) of the wave equation are perpendicular to each other at any time and at any place.

Now you should have an intuitive understanding of electromagnetic waves and the associated wave equations!
IV
The Quantum World

15 Schrödinger equation

16 Bra-Ket Notation

15. Schrödinger equation

More: en.fufaev.org/schrodinger-equation
Most phenomena in our everyday world can be described using classical
mechanics. The goal of classical mechanics is to find out how a body moves
over time. Classical mechanics therefore determines the trajectory r(t),
that is, the path of this body. With the trajectory, we can predict where this
body was, is and will be at any time t. We thus describe the movement of the
body.

Here are some examples of the motion of bodies whose trajectory r(t) can be
predicted using classical mechanics:

• Motion of the Earth around the Sun

• Motion of a satellite around the Earth

• Motion of a rocket

• Motion of a swinging pendulum

• Motion of a thrown stone

These are all classical problems that can be solved with the help of Newton's second law of motion, that is, with the following differential equation:
$$m \mathbf{a} = \mathbf{F} \quad \leftrightarrow \quad m \frac{d^2 \mathbf{r}}{dt^2} = -\nabla W_{\text{pot}} \tag{15.1}$$

Here, Wpot is the potential energy of a body of mass m. For example, this could
be the potential energy in the Earth's gravitational field.
By solving the Newton differential equation 15.1 we can find the
unknown trajectory r(t) of a body. The solution is a position vector
r(t) = [x(t), y(t), z(t)], which specifies the three-dimensional position of the
body at any time t.
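How such a trajectory is obtained in practice can be sketched numerically. The example below is an illustration under assumptions: it takes a thrown stone in a homogeneous gravitational field (Wpot = m·g·z, no air resistance) and integrates Newton's equation with the velocity Verlet method, which is one common choice among many. The initial values, step size, and function names are hypothetical example choices.

```python
G = 9.81  # gravitational acceleration in m/s^2

def acceleration(position):
    """a = F/m = -grad(W_pot)/m for W_pot = m*g*z: constant, pointing down."""
    return (0.0, -G)

def simulate(r0, v0, dt, n_steps):
    """Integrate m d^2r/dt^2 = F with the velocity Verlet method in the (x, z) plane."""
    x, z = r0
    vx, vz = v0
    ax, az = acceleration((x, z))
    for _ in range(n_steps):
        # Position update with the current velocity and acceleration
        x += vx * dt + 0.5 * ax * dt * dt
        z += vz * dt + 0.5 * az * dt * dt
        # Velocity update with the average of old and new acceleration
        ax_new, az_new = acceleration((x, z))
        vx += 0.5 * (ax + ax_new) * dt
        vz += 0.5 * (az + az_new) * dt
        ax, az = ax_new, az_new
    return (x, z), (vx, vz)

# Initial conditions: stone thrown from the origin with 10 m/s in x and z
(x_end, z_end), (vx_end, vz_end) = simulate((0.0, 0.0), (10.0, 10.0), dt=1e-3, n_steps=1000)
print(x_end, z_end, vx_end, vz_end)
```

After one second the numerical result can be compared with the analytic trajectory x(t) = v0x·t and z(t) = v0z·t − g·t²/2, that is, 10 m and 5.095 m for the values above.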

Once we have determined the trajectory r(t) by solving the differential equation,
we can extract all other physical quantities. Here are a few examples of these
quantities:

• Velocity of the body: v(t) = dr/dt

• Momentum of the body: p(t) = m v(t)

• Kinetic energy of the body: Wkin(t) = ½ m v²

In order to be able to specify the solution r(t), the initial conditions that characterize the problem to be solved must also be known. In classical physics, these are the initial position r(t0) and the initial velocity v(t0) of the body. In quantum mechanics, on the other hand, it is not even possible to specify an initial position and an initial velocity simultaneously because of the Heisenberg uncertainty principle. The procedure with Newton's second law and the determination of the trajectory r(t) therefore does not work for a quantum mechanical particle. This is because a quantum particle (such as an electron) behaves like a wave under many conditions. The position r(t) of an electron cannot be determined precisely due to this wave character, because a wave is not concentrated at a single point. And if we try to squeeze the wave to a fixed point, then, according to the Heisenberg uncertainty principle, the momentum p(t) of the electron can no longer be determined precisely.

So we cannot determine the trajectory of a quantum particle as in classical mechanics and then deduce all other physical quantities from this, but must find another way to describe a quantum particle. This other way is quantum mechanics with the Schrödinger equation.

It was only through the novel approach to nature with the help of the
Schrödinger equation that humans succeeded in making part of the microcosm
controllable. This has enabled humans to build lasers, which are now
indispensable in medicine and research. Or scanning tunneling
microscopes, which significantly exceed the resolution of conventional light
microscopes. It was only through the Schrödinger equation that the periodic
table of elements and nuclear fusion in our sun were precisely
understood. But this is only a fraction of the applications that the Schrödinger
equation and quantum mechanics have brought us. So let's get to know this
powerful equation a little better.

Take a look at the following time-dependent Schrödinger equation in one spatial dimension and let it sink in:
$$i\hbar \frac{\partial \Psi}{\partial t} = -\frac{\hbar^2}{2m} \frac{\partial^2 \Psi}{\partial x^2} + W_{\text{pot}} \Psi \tag{15.2}$$

We can already state mathematically that the Schrödinger equation is a partial differential equation of second order:

• The Schrödinger equation is a differential equation. The unknown quantity is a function and derivatives of this function occur in the equation. The unknown function in the Schrödinger equation is the wave function Ψ(x, t). It depends on the spatial coordinate x and the time t and describes a quantum mechanical particle with mass m and potential energy Wpot. Note that in the Schrödinger equation x specifies a spatial coordinate and not, as in classical mechanics, the (one-dimensional) trajectory x(t) of the position vector r(t) = [x(t), y(t), z(t)].

• The Schrödinger equation is a partial differential equation. It therefore contains derivatives of Ψ with respect to different variables, namely the derivative with respect to the spatial variable x and the derivative with respect to the time t.

• The Schrödinger equation is a partial differential equation of second order. By "second order" it is meant that the highest derivative that occurs in the differential equation is the second-order derivative. The wave function is differentiated twice with respect to the spatial coordinate in the Schrödinger equation.

As is the case with any differential equation, our goal is to solve the Schrödinger equation to find the desired wave function Ψ and then apply the initial conditions for a specific quantum mechanical problem (e.g. an electron in a potential well).

However, there is no general recipe for how to solve the Schrödinger differential equation for a given problem. Most quantum problems cannot even be solved analytically (exactly), but require approximate methods or numerical solutions using a computer.

15.1 Time-Independent Schrödinger Equation


Unfortunately, it is not possible to derive the Schrödinger equation from
classical mechanics alone. We still need the wave-particle duality, which
does not occur within classical mechanics. In the following, let us motivate the
Schrödinger equation and thus understand the fundamental principles behind
it.

We make our lives easier by looking at a one-dimensional motion. In one
dimension, a quantum particle can only move along a straight line, namely along
the spatial axis x.

15.1.1 Energy conservation


Let us now take a particle of mass m, which flies with a velocity v in the
x-direction. The particle therefore has a kinetic energy Wkin. It can also be in
a conservative (i.e. energy-conserving) field, for example in a gravitational field
or in the electric field of a plate capacitor. The particle can therefore also have
a potential energy Wpot. The total energy W of the particle is made up of
the kinetic and potential energy and is constant in time:

W = Wkin + Wpot (15.3)

You should be familiar with the total energy and its conservation over time.
You already know this from the basics of classical mechanics. The law of
conservation of energy is a fundamental principle of physics, which is also
fulfilled in quantum mechanics in a modified form in conservative fields.

15.1.2 Wave-Particle Duality


What makes quantum mechanics special is the wave-particle duality.
It allows us to view the particle as a matter wave.

The wave-particle duality links the "particle-like" momentum p with a
"wave-like" quantity, namely the de Broglie wavelength λ:

$$ \lambda = \frac{h}{p} \quad \leftrightarrow \quad p = \frac{h}{\lambda} \tag{15.4} $$

The two quantities are linked with each other by Planck's constant
h = 6.6 · 10^{−34} Js. Because of the tiny value of h, it is understandable why we do
not observe wave-particle duality in our macroscopic everyday life.
In theoretical physics, it is common to express the momentum 15.4 not with the
de Broglie wavelength λ, but with the wavenumber k = 2π/λ. The momentum then
looks like this:

$$ p = \frac{h k}{2\pi} = \hbar k \tag{15.5} $$

Here ℏ = h/(2π) is defined as the reduced Planck's constant and is only used for
shorter notation. Whether we define the particle momentum as in Eq. 15.4 or
15.5 is purely a matter of taste. We simply stick to the usual representation
15.5 in theoretical physics.

The momentum 15.5 is also a measure of whether the particle behaves more
particle-like or wave-like:

ˆ The smaller the wavenumber k (that is, the greater the de Broglie
wavelength λ), the more likely the particle behaves quantum mechanically -
more like a matter wave. In this case, we speak of a quantum mechanical
particle.
ˆ The larger the wavenumber k (that is, the smaller the de Broglie
wavelength λ), the more likely the particle behaves classically - like a real
particle. In this case, we speak of a classical particle.
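
To get a feeling for the orders of magnitude, the relations λ = h/p and k = 2π/λ can be evaluated numerically. The following is only a small sketch: the function names and the example values (a slow electron and a thrown ball) are assumptions chosen for illustration, and the momentum is taken non-relativistically as p = m v.

```python
import math

H = 6.626e-34  # Planck's constant in J s (rounded)

def de_broglie_wavelength(mass_kg, velocity_m_s):
    """Return lambda = h / p with the non-relativistic momentum p = m v."""
    return H / (mass_kg * velocity_m_s)

def wavenumber(wavelength_m):
    """Return k = 2 pi / lambda."""
    return 2 * math.pi / wavelength_m

# Illustrative values (assumptions, not taken from the text):
electron = de_broglie_wavelength(9.109e-31, 1.0e6)  # electron at 10^6 m/s
ball = de_broglie_wavelength(0.15, 30.0)            # 150 g ball at 30 m/s

print(f"electron: lambda = {electron:.3e} m, k = {wavenumber(electron):.3e} 1/m")
print(f"ball:     lambda = {ball:.3e} m, k = {wavenumber(ball):.3e} 1/m")
```

The electron's wavelength comes out at the scale of atomic distances, while the ball's wavelength is many orders of magnitude smaller than any length we could resolve, which is why the ball behaves like a classical particle.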

The particle has a small wavenumber (in other words, a large de Broglie
wavelength) if it has a very small momentum p, that is, a small mass m and a
small velocity v. A perfect candidate for such a quantum mechanical particle is a
free electron. By "free" we mean that it is not in an external field. The
electron behaves like an extended matter wave, which we can describe
mathematically with a plane wave. We denote the plane matter wave here
with the capital Greek letter Ψ(x, t). A plane matter wave generally depends
on the spatial coordinate x and the time t.

15.1.3 Plane wave


We can describe a plane wave that has the wavenumber k , (angular)
frequency ω and amplitude A by a cosine function (or sine function):

Ψ (x, t) = A cos (k x − ω t) (15.6)

As time t progresses, the matter wave moves in the positive x direction, just like
the electron we are looking at.

In order to perform calculations with such plane waves without any trigonometric
addition theorems, we convert the plane wave into a complex exponential function.
This is an equivalent but extremely effective representation of the plane wave.
First, add the imaginary sine function i A sin (k x − ω t) to the cosine
function:

Ψ (x, t) = A cos (k x − ω t) + i A sin (k x − ω t) (15.7)


= A [cos (k x − ω t) + i sin (k x − ω t)] (15.8)

We have thus converted a real function 15.6 into a complex function 15.8. Here,
the imaginary unit i ensures that the plane matter wave becomes complex-
valued and we can immediately represent it as a compact exponential function.
The cosine term is the real part Re(Ψ ) and the sine term is the imaginary
part Im(Ψ ) of the complex-valued function Ψ . The good thing is that we can
exploit the enormous advantages of the complex notation 15.8 and then agree
that we are only interested in the real part (the cosine term) in the experiment.
We can then simply ignore the imaginary part.

However, remember that a complex plane wave 15.8 is also a possible solution
of the Schrödinger equation. Most solutions Ψ (x, t) of the Schrödinger equation
are complex-valued wave functions. Real-valued wave functions, as in Eq.
15.6, are then only a special case.

Next, we use the Euler relation e^{iφ} = cos(φ) + i sin(φ) from mathematics,
which links the complex exponential function with cosine and sine. In our
case, φ = k x − ω t. Let's use this to rewrite our complex plane wave:

$$ \Psi(x, t) = A \, e^{i(k x - \omega t)} \tag{15.9} $$

Whenever you encounter such a complex exponential function 15.9, you know
immediately that it always describes a plane wave - in this case a matter wave.
Our original, real-valued plane wave 15.6 as a cosine function is contained in
the complex exponential function 15.9, namely as the real part Re(Ψ ) of
the wave function.
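
If you want to convince yourself of this numerically, the following short sketch evaluates the complex exponential at one point and compares its real and imaginary parts with the cosine and sine terms. The numerical values of A, k, ω, x and t are arbitrary assumptions for illustration.

```python
import cmath
import math

def plane_wave(x, t, A=1.0, k=2.0, omega=3.0):
    """Complex plane wave A * exp(i (k x - omega t))."""
    return A * cmath.exp(1j * (k * x - omega * t))

# Arbitrary sample point (assumption for illustration)
x, t = 0.7, 0.2
phase = 2.0 * x - 3.0 * t  # k x - omega t with the default k and omega

psi = plane_wave(x, t)
print("real part:", psi.real, " cos(phase):", math.cos(phase))
print("imag part:", psi.imag, " sin(phase):", math.sin(phase))
print("magnitude:", abs(psi))  # equals the amplitude A
```

The real part reproduces the original cosine wave, the imaginary part is the added sine term, and the magnitude is the amplitude A regardless of x and t.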
15.1.3.1 Plane wave in a complex plane
Such a complex-valued wave function 15.9, at a fixed coordinate x, can be
represented in the complex number plane as an arrow Ψ (a complex
vector).

ˆ The amplitude A corresponds to the length of the arrow.

ˆ The argument k x − ω t corresponds to the phase angle φ, which is
enclosed between the real axis and the Ψ arrow. As the angle decreases
with time t, the arrow rotates clockwise. This rotation represents the
temporal propagation of the plane wave along the spatial x-axis.
The complex exponential function 15.9 is a function that describes a plane
wave. This is why it is also called a wave function Ψ(x, t), especially in
the context of quantum mechanics. Sometimes we also say: The particle is
in the state Ψ. By this we mean its equivalent representation as an
infinite-dimensional vector (see chapter 16 on the Bra-Ket Notation).
There are, of course, a wide variety of wave functions that describe a wide variety
of particles under a wide variety of conditions. The plane wave is only a simple
example of a possible wave function.
Next, we multiply the total energy 15.3 by the wave function 15.9. In this way,
we combine the law of conservation of energy and the wave-particle
duality in one equation:

W Ψ = Wkin Ψ + Wpot Ψ (15.10)

15.1.4 Wave equation


But this equation doesn't help us much yet. We still have to convert it into
a differential equation. We regularly encounter a plane wave 15.9 in optics
and electrodynamics when describing electromagnetic waves (see chapter 14).
And from there we know that it is a possible solution to the following
(one-dimensional) wave equation:

$$ \frac{\partial^2 \Psi}{\partial x^2} = \frac{1}{c^2} \frac{\partial^2 \Psi}{\partial t^2} \tag{15.11} $$

In our case, c = ω/k is the phase velocity of the matter wave. On the
left-hand side of the wave equation is the second derivative of the wave function
with respect to the position x. It is therefore reasonable to differentiate the
plane wave function 15.9 twice with respect to x:

$$ \frac{\partial^2 \Psi}{\partial x^2} = A \, \frac{\partial^2}{\partial x^2} e^{i(k x - \omega t)} = -k^2 A \, e^{i(k x - \omega t)} \tag{15.12} $$

Each differentiation with respect to x brings down a factor i k. By taking the
derivative twice, we get i² k² as a factor and, because of i² = −1,
a minus sign. Apart from this factor, the exponential function reproduces itself
when differentiated - as you hopefully know. The second derivative therefore
results in:

$$ \frac{\partial^2 \Psi}{\partial x^2} = -k^2 \Psi \tag{15.13} $$
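
The result can be checked numerically with a finite-difference approximation of the second derivative. This is only a sketch under assumptions: the values of A, k, ω, the sample point and the step size h are chosen arbitrarily, and the central difference is just one common way to approximate a second derivative.

```python
import cmath

A, K, OMEGA = 1.0, 2.0, 3.0  # arbitrary wave parameters (assumptions)

def psi(x, t):
    """Plane wave A * exp(i (k x - omega t))."""
    return A * cmath.exp(1j * (K * x - OMEGA * t))

def second_derivative_x(f, x, t, h=1e-4):
    """Central finite-difference approximation of the second x-derivative."""
    return (f(x + h, t) - 2 * f(x, t) + f(x - h, t)) / h**2

x0, t0 = 0.5, 0.1
lhs = second_derivative_x(psi, x0, t0)  # numerical second derivative
rhs = -K**2 * psi(x0, t0)               # -k^2 * Psi

print("numerical:", lhs)
print("-k^2 Psi :", rhs)
print("difference:", abs(lhs - rhs))
```

Both numbers agree up to the small discretization error of the finite difference.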

Next, we carry out four seemingly arbitrary steps that will ultimately lead us to
the Schrödinger equation. In these steps, we want to link the second derivative
15.13 of the wave function with the constant total energy W of the quantum
particle:
ˆ Let's use the de Broglie relation p = ℏ k and replace k² in the second
derivative 15.13 with k² = p²/ℏ²:

$$ \frac{\partial^2 \Psi}{\partial x^2} = -\frac{p^2}{\hbar^2} \Psi \tag{15.14} $$

ˆ Next, we bring the kinetic energy expressed by the momentum, Wkin = p²/(2m),
into play by substituting p² = 2m Wkin into Eq. 15.14:

$$ \frac{\partial^2 \Psi}{\partial x^2} = -\frac{2m}{\hbar^2} W_{\text{kin}} \Psi \tag{15.15} $$

ˆ If we now look at the total energy 15.10 multiplied by the wave function,
we see that Wkin Ψ occurs there. We therefore rearrange 15.15 for Wkin Ψ :

$$ W_{\text{kin}} \Psi = -\frac{\hbar^2}{2m} \frac{\partial^2 \Psi}{\partial x^2} \tag{15.16} $$

ˆ If we now insert Eq. 15.16 into the total energy 15.10 multiplied by the
wave function, we get the time-independent Schrödinger equation
in one space dimension:

$$ W \Psi = -\frac{\hbar^2}{2m} \frac{\partial^2 \Psi}{\partial x^2} + W_{\text{pot}} \Psi \tag{15.17} $$

We recognize the one-dimensionality of the Schrödinger equation 15.17 by
the fact that only the differentiation with respect to a single spatial
coordinate x occurs here. And we can recognize the time-independence of
the Schrödinger equation by the fact that it contains a constant total energy
W. However, the wave function Ψ(x, t) in the time-independent Schrödinger
equation may of course be time-dependent!

Let's summarize: To derive the time-independent Schrödinger equation
15.17, we needed two fundamental
principles, the law of conservation of energy 15.3 and the
wave-particle duality 15.4, which we introduced with the help of a plane
matter wave 15.9. And since we started from the law of conservation of energy,
the time-independent Schrödinger equation is also referred to as the law of
conservation of energy in quantum mechanics.

15.2 Interpretation of the wave function


Let's assume that we have solved the Schrödinger equation and thus found a
concrete wave function. How exactly we did this doesn't matter at first. The
wave function we have found can also be complex-valued. In general, we must
not discard the imaginary part, as we agreed to do at the beginning for our
plane wave. If we omitted the imaginary part, the result of the Schrödinger
equation would no longer agree with the results of experiments. For an
experimenter, however, such complex functions are unfavorable because they are
not directly measurable. In addition, there is no direct interpretation of the
wave function yet. But how can we still use the calculated wave function in the
experiment, even though the complex wave function cannot be measured
directly?

This is where the statistical interpretation of the wave function, the so-called
Copenhagen interpretation, comes into play. Although it does not say
what the wave function Ψ(x, t) itself means, it interprets its magnitude squared
|Ψ(x, t)|². By forming the magnitude squared, we obtain a real-valued
function |Ψ|² that is accessible to the experimenter.

The statistical interpretation makes use of the mathematical fact that the
magnitude squared is never negative, |Ψ|² ≥ 0, and interprets it as a
probability density. Because as you know: probabilities are never negative.

ˆ In the one-dimensional case, the magnitude squared |Ψ|² is a probability
per length.

ˆ In the three-dimensional case, the magnitude squared |Ψ|² is a
probability per volume.

15.2.1 Probability
Let's stick to the simple one-dimensional case. If we integrate the probability
density |Ψ(x, t)|² over the spatial coordinate x between the
points x = a and x = b, then we get a probability P(t):

$$ P(t) = \int_a^b |\Psi(x, t)|^2 \, dx \tag{15.18} $$

The integral of the probability density |Ψ(x, t)|² indicates with which
probability P(t) the particle is located in the region between a and b
at time t. The probability can generally change over time.
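
The integral 15.18 can also be evaluated numerically. The sketch below uses a normalized Gaussian as an example wave function at a fixed time and a simple trapezoidal rule; the Gaussian, its width sigma and the function names are assumptions for illustration and do not come from a specific physical problem.

```python
import math

def probability(psi, a, b, n=4000):
    """Trapezoidal estimate of the integral of |psi(x)|^2 from a to b."""
    h = (b - a) / n
    total = 0.5 * (abs(psi(a)) ** 2 + abs(psi(b)) ** 2)
    for i in range(1, n):
        total += abs(psi(a + i * h)) ** 2
    return total * h

sigma = 1.0  # width of the example wave function (assumption)

def gaussian(x):
    """Normalized Gaussian wave function at a fixed time."""
    return (math.pi * sigma**2) ** -0.25 * math.exp(-x**2 / (2 * sigma**2))

P = probability(gaussian, -sigma, sigma)   # region between a = -sigma and b = sigma
P_point = probability(gaussian, 0.3, 0.3)  # "region" shrunk to a single point

print("P(-sigma < x < sigma) =", P)
print("P at a single point   =", P_point)
```

For this Gaussian, the exact value of the first integral is erf(1), so the particle is found within one width sigma of the center with a probability of about 84 percent. Shrinking the region to a single point gives zero.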

15.2.2 |Ψ|² graphically

If we plot the magnitude squared |Ψ(x, t)|² as a function of the location x in a
diagram, then we can extract the following information from it:
ˆ The probability P(t) of finding the particle between a and b at time t is
the area under the |Ψ|² curve between these two points.
ˆ The particle is most likely to be found at the maxima of the |Ψ|² curve.
ˆ The particle is least likely to be found at the minima of the |Ψ|² curve.

Note, however, that it is not possible to specify the probability P(t) of finding
the particle at a certain location (for example at x = a), but only in a spatial
region (here between x = a and x = b). In the case of a single point in space,
the integral 15.18 would be zero. That is the mathematical reason. The physical
reason why we cannot specify a probability for a single point is that there are
infinitely many points in space in the region between a and b. If each of these
points in space were assigned a finite, nonzero probability, then the sum (i.e.
the integral 15.18) of all probabilities would be infinite, which would make no
sense at all. Therefore, we always calculate the probability of being in a
spatial region.

15.3 Normalization of the wave function


In order for the statistical interpretation to be compatible with the Schrödinger
equation, the solution of the Schrödinger equation, that is, the wave function
Ψ, must satisfy the normalization condition. This states that the particle
must exist somewhere in space. In the one-dimensional case, it must therefore
be found with one hundred percent probability somewhere on the line between
x = −∞ and x = ∞.

In other words: The normalization condition states that the integral 15.18 for
the probability, integrated over the entire space, must always result in
1:

$$ P = \int_{-\infty}^{\infty} |\Psi(x, t)|^2 \, dx = 1 \tag{15.19} $$

The normalization condition 15.19 is a necessary condition that every
physically possible wave function must fulfill. After solving the
Schrödinger equation, the wave function Ψ(x, t) must be normalized with
the help of the normalization condition. "Normalization" means that we
have to calculate the integral 15.19 and then choose the amplitude of the wave
function so that the normalization condition is fulfilled.
It can be proven that the normalized wave function remains normalized for
all times t. If this were not the case, then the Schrödinger equation and the
statistical interpretation would be incompatible. There are of course solutions
to the Schrödinger equation, such as Ψ(x, t) = 0, which are not
normalizable. Such solutions are unphysical and we ignore them in
quantum mechanics. Wave functions that can be normalized with Eq. 15.19
are called square-integrable functions in mathematics. You will definitely
come across this term in your studies.
If you know with one hundred percent probability that the particle is located
between x = a and x = b, then you may reduce the integration limits in the
normalization condition 15.19 to this spatial region (this can sometimes
be useful to solve the integral):

$$ \int_a^b |\Psi(x, t)|^2 \, dx = 1 \tag{15.20} $$

15.3.1 Example: Normalizing a wave function


An electron moves from the negative electrode to the positive electrode of a plate
capacitor. The two electrodes are at a distance of d from each other. You have
determined the following wave function by solving the Schrödinger equation:

$$ \Psi(x, t) = A \, e^{i(kx - \omega t)} \tag{15.21} $$

Our goal is to determine the factor A so that the integral over the magnitude
squared of this wave function is one.

You know with one hundred percent probability that the electron must be
between the two electrodes. If we place the negative electrode at x = 0 and the
positive electrode at x = d, then the electron is somewhere between these two
points. The normalization condition becomes:
$$ \int_0^d |\Psi(x, t)|^2 \, dx = 1 \tag{15.22} $$

Next, we need to determine the magnitude squared |Ψ(x, t)|². The magnitude
of the wave function is calculated in the same way as the magnitude of a vector.
This is where the power of the complex exponential function becomes apparent
for the first time. The following always applies: |e^{iφ}| = 1. With a real
amplitude A, the magnitude squared is therefore given by:

$$ |\Psi(x, t)|^2 = |A \, e^{i(kx - \omega t)}|^2 = A^2 \, |e^{i(kx - \omega t)}|^2 = A^2 \tag{15.23} $$

Let's insert the calculated magnitude squared into the normalization condition:

$$ \int_0^d A^2 \, dx = 1 \tag{15.24} $$

The amplitude A is independent of x, so it is a constant and we can place it in
front of the integral. The remaining integral simply results in d:

$$ A^2 \int_0^d 1 \, dx = 1 \quad \Rightarrow \quad A^2 \, d = 1 \quad \Rightarrow \quad A = \frac{1}{\sqrt{d}} \tag{15.25} $$

The normalized wave function for the electron is therefore:

$$ \Psi(x, t) = \frac{1}{\sqrt{d}} \, e^{i(kx - \omega t)} \tag{15.26} $$
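
As a quick plausibility check, we can integrate the magnitude squared of this normalized wave function numerically over the region between the electrodes. The values of d, k and ω below are hypothetical numbers picked only for illustration, and the trapezoidal rule is one possible integration method.

```python
import cmath
import math

D = 2.5e-3                 # plate distance in m (hypothetical value)
K, OMEGA = 1.0e3, 1.0e6    # wavenumber and angular frequency (hypothetical values)
A = 1 / math.sqrt(D)       # amplitude from the normalization condition

def psi(x, t):
    """Normalized plane wave between the electrodes."""
    return A * cmath.exp(1j * (K * x - OMEGA * t))

def norm_integral(t, n=1000):
    """Trapezoidal estimate of the integral of |psi|^2 from 0 to D at time t."""
    h = D / n
    total = 0.5 * (abs(psi(0.0, t)) ** 2 + abs(psi(D, t)) ** 2)
    for i in range(1, n):
        total += abs(psi(i * h, t)) ** 2
    return total * h

print(norm_integral(0.0))     # close to 1
print(norm_integral(3.0e-6))  # still close to 1 at a later time
```

Because |Ψ|² = A² does not depend on x or t here, the integral equals A² d = 1 at every time.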

Once we have normalized the wave function of a quantum particle, we can
not only find out the probability P(t) of finding the particle in a region,
but also the mean value ⟨x⟩ of the position and the mean value of many other
observables (physical quantities). For example, the mean value of the
momentum ⟨p⟩, the velocity ⟨v⟩ or the kinetic energy ⟨Wkin⟩ of a quantum
particle.

15.4 Three-dimensional Schrödinger equation


In your physics course, you will not only encounter a one-dimensional
Schrödinger equation, but also a two- or three-dimensional version. The
three-dimensional wave function can depend not only on one spatial
coordinate x, but on three spatial coordinates: Ψ (x, y, z, t). We can
combine the three spatial coordinates into one vector r: Ψ (r, t).

We can generalize the one-dimensional Schrödinger equation 15.17 to a
three-dimensional Schrödinger equation. This is not difficult if you have read
chapter 5 about the Nabla operator. Here is the one-dimensional Schrödinger
equation again:

$$ W \Psi = -\frac{\hbar^2}{2m} \frac{\partial^2 \Psi}{\partial x^2} + W_{\text{pot}} \Psi \tag{15.27} $$

In it, we have to extend the second spatial derivative with respect to x so that
the second spatial derivatives with respect to y and z also appear in the
three-dimensional Schrödinger equation. To do this, we simply add these
derivatives to the spatial derivative with respect to x. Then we get the
three-dimensional, time-independent Schrödinger equation:

$$ W \Psi = -\frac{\hbar^2}{2m} \left( \frac{\partial^2 \Psi}{\partial x^2} + \frac{\partial^2 \Psi}{\partial y^2} + \frac{\partial^2 \Psi}{\partial z^2} \right) + W_{\text{pot}} \Psi \tag{15.28} $$

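We can write Eq. 15.28 a little more compactly with the Nabla operator.
To do this, factor out the wave function from the spatial derivatives:

$$ W \Psi = -\frac{\hbar^2}{2m} \left( \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2} \right) \Psi + W_{\text{pot}} \Psi \tag{15.29} $$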

The sum of the spatial derivatives in the brackets forms the Laplace operator
∇ · ∇ = ∇² (sometimes also written as ∆). This operator is the scalar product
of two nabla operators. This results in the three-dimensional Schrödinger
equation expressed with the Nabla operator:

$$ W \Psi = -\frac{\hbar^2}{2m} \nabla^2 \Psi + W_{\text{pot}} \Psi \tag{15.30} $$

So far, we have only learned about the time-independent Schrödinger
equation. You will have to use this regularly in the quantum mechanics
lecture. For example, in problems such as particle in a potential well, quantum
mechanical harmonic oscillator, tunnel effect and helium atom.

15.5 Time-dependent Schrödinger equation


A quantum particle described by the time-independent Schrödinger equation
has a constant total energy W . We can therefore only use the time-independent
Schrödinger equation to describe quantum particles that do not change
their total energy.

But what if the total energy W of a quantum particle is not constant in time?
This can happen, for example, if the particle interacts with its environment
and its total energy increases or decreases as a result. For such a quantum
system, we need the time-dependent Schrödinger equation. This is what
it looks like in one spatial dimension:

$$ i\hbar \frac{\partial \Psi}{\partial t} = -\frac{\hbar^2}{2m} \frac{\partial^2 \Psi}{\partial x^2} + W_{\text{pot}} \Psi \tag{15.31} $$

The only difference to the time-independent Schrödinger equation is that the
total energy W becomes an operator, W → iℏ ∂/∂t. This operator is often
called the energy operator; it governs the time evolution of the wave function.

15.6 Stationary Wave Function


Solving the time-dependent Schrödinger equation 15.31 is not that easy.
However, you can simplify the solution of this partial differential equation
considerably if you convert it into two ordinary differential equations.
One differential equation then only depends on the time t and the other only
on the spatial coordinate x. We do this separation into two ordinary
differential equations with the method of separation of variables. This is a
very important method in physics to simplify partial differential equations and
make them easier to solve.

The only requirement for the Separation of Variables to work is that the
potential energy Wpot (x) does not depend on the time t (but it may
very well depend on the position x). The wave function itself can, of course,
still depend on both position and time.

First, divide the time-dependent wave function Ψ (x, t) (that is, the total
solution) into two parts:
ˆ Into a partial solution ψ(x), which only depends on the location x.
ˆ Into a partial solution ϕ(t), which only depends on the time t.
This separation ansatz (ansatz is a German word for approach) turns the
total wave function into a product of the two partial solutions:

Ψ (x, t) = ψ(x) ϕ(t) (15.32)

The time-dependent Schrödinger equation thus becomes:

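$$ i\hbar \frac{\partial}{\partial t} \big( \psi(x) \, \phi(t) \big) = -\frac{\hbar^2}{2m} \frac{\partial^2}{\partial x^2} \big( \psi(x) \, \phi(t) \big) + W_{\text{pot}} \, \psi(x) \, \phi(t) \tag{15.33} $$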

Not all wave functions can be separated into two partial solutions as in Eq.
15.32. However, since the Schrödinger equation is linear, we can form a linear
combination of such solutions and thus obtain all wave functions (including
those that cannot be separated). This is what makes variable separation so
powerful.

As you can see from the time-dependent Schrödinger equation 15.31, the time
derivative and the second spatial derivative occur there. Calculate the two
derivatives of the separation ansatz 15.32:

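ˆ Differentiate the separated wave function 15.32 with respect to time t:

$$ \frac{\partial \Psi}{\partial t} = \psi(x) \frac{\partial \phi(t)}{\partial t} \tag{15.34} $$

ˆ Differentiate the separated wave function 15.32 twice with respect to the
position x:

$$ \frac{\partial^2 \Psi}{\partial x^2} = \phi(t) \frac{\partial^2 \psi(x)}{\partial x^2} \tag{15.35} $$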

We can insert the time derivative 15.34 and the spatial derivative 15.35 into the
time-dependent Schrödinger equation 15.33:

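$$ i\hbar \, \psi(x) \frac{\partial \phi(t)}{\partial t} = -\frac{\hbar^2}{2m} \phi(t) \frac{\partial^2 \psi(x)}{\partial x^2} + W_{\text{pot}} \, \psi(x) \, \phi(t) \tag{15.36} $$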

In the following, we omit the position and time dependence in order to be able
to write the Schrödinger equation in a more compact form. Now we have to
reformulate the separated Schrödinger differential equation 15.36 so that its
left-hand side depends only on the time t and its right-hand side only on the
position x. We achieve this by dividing Eq. 15.36 by the product ψ ϕ:

$$ i\hbar \frac{1}{\phi} \frac{\partial \phi}{\partial t} = -\frac{\hbar^2}{2m} \frac{1}{\psi} \frac{\partial^2 \psi}{\partial x^2} + W_{\text{pot}} \tag{15.37} $$

What do we get out of it? Quite a lot! If we change the time t (which only
occurs on the left-hand side), the right-hand side remains unchanged because it
does not contain t. Since both sides are equal, the left-hand side cannot
change with t either, so it must be a constant. This constant is real, as a
complex-valued constant would violate the normalization condition. It
corresponds to the time-constant total energy W:

$$ i\hbar \frac{1}{\phi} \frac{\partial \phi}{\partial t} = W \tag{15.38} $$

This is an ordinary differential equation for the partial solution ϕ(t). We can
even write down the solution of this differential equation. It is easy to solve
with pencil and paper. The time-dependent partial solution is a plane
wave in time:

$$ \phi(t) = e^{-i \frac{W}{\hbar} t} \tag{15.39} $$
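
For completeness, here is a short sketch of how Eq. 15.38 can be solved with pencil and paper by separating ϕ and t and integrating both sides. The integration constant C is a symbol introduced only for this sketch.

```latex
\begin{align*}
i\hbar \, \frac{1}{\phi} \frac{d\phi}{dt} &= W \\
\frac{d\phi}{\phi} &= -\frac{i W}{\hbar} \, dt \\
\ln \phi &= -\frac{i W}{\hbar} \, t + C \\
\phi(t) &= e^{C} \, e^{-i \frac{W}{\hbar} t}
\end{align*}
```

The constant factor e^{C} can be absorbed into the position-dependent partial solution ψ(x), because the overall amplitude of Ψ = ψ ϕ is fixed by the normalization condition anyway. This leaves the pure exponential function as the time-dependent partial solution.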

Now let's look at the right-hand side of Eq. 15.37. If you change the variable
x on the right-hand side, the left-hand side of the equation remains constant
because it is independent of x. Because of the equality, the right-hand side
must therefore be equal to the same constant W:

$$ W = -\frac{\hbar^2}{2m} \frac{1}{\psi} \frac{\partial^2 \psi}{\partial x^2} + W_{\text{pot}} \tag{15.40} $$

If we multiply the differential equation 15.40 by ψ, we get the stationary,
time-independent Schrödinger equation for ψ:

$$ W \psi = -\frac{\hbar^2}{2m} \frac{\partial^2 \psi}{\partial x^2} + W_{\text{pot}} \, \psi \tag{15.41} $$

By stationary we mean that the solution ψ(x) does not depend on time.
Therefore, we refer to the solution ψ(x) of a stationary Schrödinger equation as
stationary wave function ψ(x) or as stationary state.

What have we achieved overall with the separation approach? Instead of having
to solve the more complicated time-dependent Schrödinger equation for
Ψ(x, t) = ψ(x) ϕ(t),

$$ i\hbar \frac{\partial \Psi}{\partial t} = -\frac{\hbar^2}{2m} \frac{\partial^2 \Psi}{\partial x^2} + W_{\text{pot}} \Psi \tag{15.42} $$

we only need to solve the stationary Schrödinger equation 15.41 for ψ(x) and
multiply this position-dependent partial solution by the
time-dependent partial solution 15.39. As a result, we obtain the total
solution of the time-dependent Schrödinger equation:

$$ \Psi(x, t) = \psi(x) \, e^{-i \frac{W}{\hbar} t} \tag{15.43} $$

The solution 15.43 is very special, because its magnitude squared |Ψ (x, t)|2 is
time-independent! All other observables that describe the particle are also time-
independent. For example, a quantum particle described by the wave function
15.43 has a constant mean value of the energy ⟨W ⟩, constant mean value of the
momentum ⟨p⟩ and constant mean value of all other observables.
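
The time-independence of the magnitude squared can be seen in one line. The sketch below only uses the product form of the solution and the fact that the total energy W is real, so that complex conjugation merely flips the sign in the exponent.

```latex
\begin{align*}
|\Psi(x, t)|^2 &= \Psi^*(x, t) \, \Psi(x, t) \\
&= \psi^*(x) \, e^{+i \frac{W}{\hbar} t} \; \psi(x) \, e^{-i \frac{W}{\hbar} t} \\
&= \psi^*(x) \, \psi(x) = |\psi(x)|^2
\end{align*}
```

The two exponential factors cancel, so the probability density of a stationary state is determined by ψ(x) alone.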

15.7 Hamilton operator


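You will not only encounter the time-independent and time-dependent Schrödinger
equations in this form:

$$ W \Psi = -\frac{\hbar^2}{2m} \nabla^2 \Psi + W_{\text{pot}} \Psi \tag{15.44} $$

$$ i\hbar \frac{\partial \Psi}{\partial t} = -\frac{\hbar^2}{2m} \nabla^2 \Psi + W_{\text{pot}} \Psi \tag{15.45} $$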

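If you factor out the wave function, you get:

$$ W \Psi = \left( -\frac{\hbar^2}{2m} \nabla^2 + W_{\text{pot}} \right) \Psi \tag{15.46} $$

$$ i\hbar \frac{\partial \Psi}{\partial t} = \left( -\frac{\hbar^2}{2m} \nabla^2 + W_{\text{pot}} \right) \Psi \tag{15.47} $$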

The operator in the brackets is called the Hamilton operator Ĥ (sometimes also
called the Hamiltonian):

$$ \hat{H} = \underbrace{-\frac{\hbar^2}{2m} \nabla^2}_{W_{\text{kin}}} + W_{\text{pot}} \tag{15.48} $$

The Hamilton operator describes the total energy of a quantum particle.


You will also regularly encounter the representation of the Schrödinger equation
with the Hamilton operator:

$$ \hat{H} \Psi = W \Psi \tag{15.49} $$

$$ \hat{H} \Psi = i\hbar \frac{\partial \Psi}{\partial t} \tag{15.50} $$

With the Hamilton operator, we can interpret the time-independent Schrödinger
equation as an eigenvalue equation. You should know what an eigenvalue
equation is from linear algebra. You apply the Hamilton operator (think of
it as a matrix) to the eigenfunction Ψ (think of it as an eigenvector). Then you
get the same eigenvector Ψ on the right-hand side of the Schrödinger equation,
scaled with the corresponding energy eigenvalue W. The energy eigenvalues
W depend on the Hamilton operator Ĥ used and are discrete for most of the
Hamilton operators you will encounter in your studies. We say: The energy of
the quantum particle is quantized.
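
The matrix picture can be made concrete with a small numerical sketch. Below, the Hamilton operator of a particle in a potential well with infinitely high walls (Wpot = 0 inside, Ψ = 0 at the walls) is applied to sampled wave functions using a finite-difference second derivative. Everything specific here is an assumption for illustration: the units ℏ = m = 1, the well width L = 1, the grid size N, and the standard textbook results ψ_n(x) = sqrt(2/L) sin(nπx/L) and W_n = (nπℏ)²/(2mL²) for this well, which are not derived in this chapter.

```python
import math

HBAR, M, L = 1.0, 1.0, 1.0  # units and well width (assumptions)
N = 1000                    # number of grid intervals (assumption)
DX = L / N

def apply_hamiltonian(psi):
    """Apply H = -(hbar^2 / 2m) d^2/dx^2 to values at the interior grid points.

    The wave function is taken to be zero at both walls.
    """
    result = []
    for j in range(len(psi)):
        left = psi[j - 1] if j > 0 else 0.0
        right = psi[j + 1] if j < len(psi) - 1 else 0.0
        second_derivative = (left - 2 * psi[j] + right) / DX**2
        result.append(-HBAR**2 / (2 * M) * second_derivative)
    return result

def eigenfunction(n):
    """Sampled eigenfunction sqrt(2/L) * sin(n pi x / L) at the interior points."""
    return [math.sqrt(2 / L) * math.sin(n * math.pi * j * DX / L) for j in range(1, N)]

def energy(n):
    """Energy eigenvalue W_n = (n pi hbar)^2 / (2 m L^2) of the infinite well."""
    return (n * math.pi * HBAR) ** 2 / (2 * M * L**2)

def residual(n):
    """Largest deviation between H psi and W psi over all grid points."""
    psi = eigenfunction(n)
    h_psi = apply_hamiltonian(psi)
    return max(abs(hp - energy(n) * p) for hp, p in zip(h_psi, psi))

for n in (1, 2, 3):
    print(f"n = {n}: W = {energy(n):.4f}, max |H psi - W psi| = {residual(n):.2e}")
```

Applying the discretized Hamilton operator returns the same vector scaled by W_n, up to a small discretization error. Only the discrete values W_1, W_2, W_3, ... appear as scaling factors, which is what is meant by quantized energy.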

Thus we have transferred the problem of solving the Schrödinger
differential equation 15.46 to the problem of solving the eigenvalue
equation 15.49.

15.8 What you’ve learned


Let's summarize what you should have learned from the chapter 15:

ˆ You now know how to motivate the time-independent Schrödinger
equation.
ˆ You know what the wave function is and have become familiar with
the plane wave as a simple example of a wave function.
ˆ You know the statistical interpretation of the wave function.
ˆ You can normalize a wave function.
ˆ You know the difference between the time-dependent and
time-independent Schrödinger equation.
ˆ You know the difference between the one-dimensional and
three-dimensional Schrödinger equation.
ˆ You know what the Hamilton operator is.
ˆ You know what a stationary wave function is.
Remember that the Schrödinger equation is a non-relativistic equation. It
fails for quantum particles moving at almost the speed of light. Furthermore, it
does not naturally take into account the spin of a particle. These two problems
are only solved by its relativistic version, namely the Dirac equation. You will
only learn about this in your Master's degree when you take a course on quantum
field theory.
In the following Chapter 16 you will learn the representation of the wave
function as a state vector ("quantum mechanical state"). Advantage: You
can work with the state vector in (almost) the same way as with the usual
vectors that you know from linear algebra.
16. Bra-Ket Notation

More: en.fufaev.org/bra-ket-notation
Consider any one-dimensional wave function Ψ (x) describing a quantum
mechanical particle. We have omitted its time dependence Ψ (t, x) because it
is not relevant in this chapter. The value of the wave function, for instance, at
the location x1 is Ψ (x1 ), at the location x2 the function value is Ψ (x2 ), at the
location x3 the function value is Ψ (x3 ), and so forth. In this manner you can
assign to each point in space x the function value Ψ (x) of the wave function.
The sum of all these function values yields the shape of the wave function.

We can represent all these function values as a list of values. We can interpret
this list of values as a column vector Ψ. The column vector then has the
following components:

$$ \Psi = \begin{bmatrix} \Psi(x_1) \\ \Psi(x_2) \\ \Psi(x_3) \\ \vdots \end{bmatrix} = \begin{bmatrix} \Psi_1 \\ \Psi_2 \\ \Psi_3 \\ \vdots \end{bmatrix} \tag{16.1} $$

At the second equality sign, we have represented the function values more
compactly. Instead of writing the first component as Ψ(x1), we compactly
write it as Ψ1.
We can illustrate the column vector 16.1 as follows:
ˆ The first component Ψ(x1) forms the first coordinate axis.

ˆ The second component Ψ(x2) forms the second coordinate axis.

ˆ The third component Ψ(x3) forms the third coordinate axis.

ˆ and so forth.

We'll stick to only three components because I can't draw a four-dimensional
coordinate system. So, each component is assigned a coordinate axis. In this
way, the three components span a three-dimensional space. Once we add
an additional function value Ψ(x4), the space becomes four-dimensional, and
so on. We call the vector Ψ representing a wave function Ψ(x) a state
vector.

In theory, there are of course infinitely many x-values. Therefore, there are
also infinitely many associated function values Ψ(x) as components of the
column vector. If there are infinitely many function values, then the space in
which the state vector Ψ lives is infinite-dimensional. Remember that this
space is not an infinite-dimensional position space but an abstract space.

This abstract space, where various quantum mechanical state vectors Ψ live,
is called a Hilbert space. In general, this is an infinite-dimensional vector
space. However, it can also be finite-dimensional. For example, spin states
Ψ↑ and Ψ↓, which describe the spin of a particle, reside in a two-dimensional
Hilbert space. That is, state vectors like the spin-up state Ψ↑ have only two
components:

$$ \Psi_{\uparrow} = \begin{bmatrix} \Psi_{\uparrow 1} \\ \Psi_{\uparrow 2} \end{bmatrix} \tag{16.2} $$

However, even approximating an infinite-dimensional state with a column
vector 16.1 is incredibly useful. In numerical computations, we have no other
choice but to approximate the infinite-dimensional state with a finite number
of function values. There's simply no other way since your computer would
need infinite memory for that. The more components we take in numerical
computation, the more accurate the state vector becomes, but the
computations become slower and more memory-intensive.
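
The trade-off can be illustrated with a small sketch that samples a wave function on a grid and stores the function values as a list, our finite stand-in for the state vector. The Gaussian example wave function, the interval from −5 to 5 and the function names are assumptions for illustration. As a simple accuracy measure, we check how well the sampled values reproduce the normalization integral when it is replaced by a sum.

```python
import math

def sample_state_vector(psi, x_min, x_max, n_components):
    """Sample psi at n_components equally spaced points; return (vector, spacing)."""
    dx = (x_max - x_min) / (n_components - 1)
    vector = [psi(x_min + j * dx) for j in range(n_components)]
    return vector, dx

def gaussian(x):
    """Normalized Gaussian wave function (example only)."""
    return math.pi ** -0.25 * math.exp(-x**2 / 2)

def discrete_norm(n_components):
    """Approximate the normalization integral by a sum over the components."""
    vector, dx = sample_state_vector(gaussian, -5.0, 5.0, n_components)
    return sum(abs(component) ** 2 for component in vector) * dx

for n in (5, 50, 500):
    print(f"{n:4d} components: sum of |Psi_j|^2 * dx = {discrete_norm(n):.6f}")
```

With only five components the sum is far from 1, while a few dozen components already reproduce the normalization very well for this smooth example. More components cost more memory and computing time.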

So, we can represent a quantum mechanical particle in two ways:

ˆ as a wave function Ψ (x)

ˆ as a state vector Ψ

16.1 Bra- and Ket-State Vectors


16.1.1 Ket vector
To better distinguish the particle's description as a state vector from its
description as a wave function, we enclose the state vector Ψ in arrow-like
brackets:

$$ |\Psi\rangle = \begin{bmatrix} \Psi_1 \\ \Psi_2 \\ \Psi_3 \\ \vdots \end{bmatrix} \tag{16.3} $$

The wave function Ψ, represented as a column vector 16.3, is called a ket
vector |Ψ⟩, and the arrow-like bracket points to the right. It doesn't matter
what you write inside the bracket. For example, you could have also noted
the ket vector as |Ψ(x)⟩. The only thing to consider is that the notation inside
the bracket clarifies to other readers which quantum mechanical system this ket
vector represents.

ˆ So, when you see a ket |Ψ ⟩, you know that it refers to the representation
of the quantum particle as a state vector.

ˆ On the other hand, if you see Ψ (x) without ket notation, you know that it
refers to the representation of the quantum particle as a wave function.

16.1.2 Bra Vector


The vector |Ψ⟩†, which is the adjoint of the ket vector, is called a bra vector.
The symbol † is pronounced "dagger". For a clever, compact notation, we
write the bra vector with a reversed arrow ⟨Ψ| instead of using the dagger
|Ψ⟩†.

To obtain the bra vector ⟨Ψ| adjoint to the ket vector |Ψ⟩, we need to perform
two operations:

• Transpose the ket vector |Ψ⟩. This turns it into a row vector:

$$|\Psi\rangle^{T} = [\Psi_1, \Psi_2, \Psi_3, \ldots] \tag{16.4}$$

• Complex-conjugate the transposed ket vector |Ψ⟩ᵀ. This operation
adds asterisks to the components to obtain the bra vector:

$$\langle\Psi| = [\Psi_1^*, \Psi_2^*, \Psi_3^*, \ldots] \tag{16.5}$$

In summary: The wave function Ψ(x) corresponds in vector representation to
the ket vector |Ψ⟩, and the row vector adjoint to the ket vector, denoted as ⟨Ψ|,
is the bra vector.
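As a small illustration of the two steps above, here is a sketch in plain Python. It assumes that a state is stored as a flat list of complex components, so the transposition is only implicit and the complex conjugation is the visible step; the function name `bra` and the example components are made up for this sketch.

```python
def bra(ket):
    """Return the components of the bra <psi| belonging to the ket |psi>.

    A flat list does not distinguish rows from columns, so only the
    complex conjugation shows up explicitly here.
    """
    return [component.conjugate() for component in ket]

ket_psi = [1 + 2j, 3j, 4.0]   # hypothetical three-component state
bra_psi = bra(ket_psi)
print(bra_psi)
```

Applying `bra` twice gives back the original components, which reflects that taking the adjoint twice returns the ket.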

Since we've interpreted the wave function Ψ (x) as a ket vector |Ψ ⟩, we can
practically work with the ket vector in much the same way as with ordinary
vectors you're familiar with from linear algebra. For example, we can form
a scalar product or tensor product between the bra or ket vectors.
What may be new to you, however, is that unlike vectors from linear algebra,
the components of the ket vector can be complex, and the number of
components can be infinite.

16.2 Scalar and Inner Product


We can form the scalar product ⟨Φ | · |Ψ ⟩ between a bra vector ⟨Φ | and a ket
vector |Ψ ⟩. Here, we don't need to write the scalar product dot and can omit a
vertical line. We write ⟨Φ |Ψ ⟩ instead of ⟨Φ | · |Ψ ⟩ for brevity.

When the state vectors between which you form the scalar product live in an
infinite-dimensional Hilbert space, we call this operation not a scalar
product but an inner product. However, the notation ⟨Φ|Ψ⟩ for the inner
product remains the same as in the case of the scalar product.

In a finite n-dimensional Hilbert space, the written-out scalar product ⟨Φ|Ψ⟩
between any bra vector ⟨Φ| and ket vector |Ψ⟩ looks like this:

$$\langle\Phi|\Psi\rangle = [\Phi_1^*, \Phi_2^*, \Phi_3^*, \ldots, \Phi_n^*] \begin{bmatrix} \Psi_1 \\ \Psi_2 \\ \Psi_3 \\ \vdots \\ \Psi_n \end{bmatrix} \tag{16.6}$$

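We can multiply the row and column vectors in 16.6 just as we do with the usual
matrix multiplication:

\begin{align}
\langle\Phi|\Psi\rangle &= \Phi_1^*\,\Psi_1 + \Phi_2^*\,\Psi_2 + \Phi_3^*\,\Psi_3 + \ldots + \Phi_n^*\,\Psi_n \tag{16.7} \\
&= \sum_{i=1}^{n} \Phi_i^*\,\Psi_i \tag{16.8}
\end{align}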

In the last step, we abbreviated the scalar product using a sum sign. Here,
n represents the dimension of the Hilbert space, that is, the number of
components of a state vector living in this Hilbert space. The dimension
of the Hilbert space can also be infinite (n = ∞).
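The sum 16.8 translates almost literally into code. The following sketch assumes states stored as lists of numbers; the function name `inner` and the two example states are illustrative choices.

```python
def inner(phi, psi):
    """Scalar product <phi|psi> = sum_i conj(phi_i) * psi_i, as in Eq. 16.8."""
    if len(phi) != len(psi):
        raise ValueError("both states must have the same number of components")
    return sum(p.conjugate() * q for p, q in zip(phi, psi))

phi = [1, 1j]
psi = [1j, 2]

print(inner(phi, psi))   # <phi|psi>
print(inner(psi, phi))   # <psi|phi>, the complex conjugate of <phi|psi>
print(inner(psi, psi))   # squared magnitude of |psi>, a real number
```

Note that the order matters: swapping bra and ket complex-conjugates the result, because the asterisk always sits on the components of the bra.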

16.3 Continuous Quantum States


So far, we have discretized a quantum state |Ψ⟩ by omitting many function
values of the wave function Ψ(x). Just between the positions x1 and x2 alone,
there are infinitely many more values. Why? Because the position
coordinate is a real number. This means there are infinitely many components
between Ψ1 = Ψ(x1) and Ψ2 = Ψ(x2) that we have omitted in the column
vector representation:

$$|\Psi\rangle = \begin{bmatrix} \Psi_1 \\ \vdots \\ \Psi_2 \\ \vdots \\ \Psi_3 \\ \vdots \end{bmatrix} \tag{16.9}$$

This means that representing a wave function Ψ(x) with a real-valued
argument x as a column vector is only an approximation and serves merely
for illustration purposes.

Similarly, the inner product 16.8 with the sum sign is not exact for states with
real-valued arguments. How can we make the inner product exact for these
states? We need to switch to an integral. Therefore, we replace the sum sign
with an integral sign. We now consider the function values Φi and Ψi not at
discrete points xi but at all points x:

$$\langle\Phi|\Psi\rangle = \int \Phi(x)^*\,\Psi(x)\,dx \tag{16.10}$$

So, to calculate the exact inner product of two wave functions Φ(x) and
Ψ(x), we need to evaluate the integral 16.10.
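On a computer, the integral 16.10 is again approximated by a finite sum. The sketch below uses a simple midpoint rule and, as an assumed test case, the two lowest states of a particle in a box of length 1; the function names and the number of sample points are illustrative choices, not part of the book.

```python
import math

def inner_product(phi, psi, x_min, x_max, n=10_000):
    """Approximate <phi|psi> = integral of conj(phi(x)) * psi(x) dx with a midpoint sum."""
    dx = (x_max - x_min) / n
    total = 0j
    for i in range(n):
        x = x_min + (i + 0.5) * dx
        total += complex(phi(x)).conjugate() * psi(x)
    return total * dx

def box_state(k):
    """Hypothetical test state: k-th normalized particle-in-a-box state on [0, 1]."""
    return lambda x: math.sqrt(2.0) * math.sin(k * math.pi * x)

same = inner_product(box_state(1), box_state(1), 0.0, 1.0)
different = inner_product(box_state(1), box_state(2), 0.0, 1.0)
print(abs(same))        # close to 1: normalized state
print(abs(different))   # close to 0: orthogonal states
```

Increasing `n` brings the sum closer to the exact integral, at the cost of more function evaluations.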

16.3.1 Overlap of Quantum States


What does this inner product of two quantum states actually tell us intuitively?
Similar to the scalar product, the inner product is a number that measures how
much two quantum states overlap. Let's consider two normalized quantum
states Φ and Ψ for simplicity:

• If the inner product is ⟨Φ|Ψ⟩ = 1, then the corresponding wave functions
Φ(x) and Ψ(x) completely overlap.

• If the inner product is ⟨Φ|Ψ⟩ = 0, then the wave functions Φ(x) and Ψ(x)
do not overlap at all.

• All values of the inner product ⟨Φ|Ψ⟩ whose magnitude lies between 0 and 1
indicate partial overlap of the two wave functions.

16.4 Orthonormal Quantum States


Let's consider two normalized and orthogonal (orthonormal) states |Ψi⟩ and
|Ψj⟩, denoted by variable indices i and j instead of fixed values. Then, their
scalar product ⟨Ψi|Ψj⟩ yields either 0 or 1. Therefore, they are suitable as basis
states. You should know this property from linear algebra when calculating the
scalar product of two basis vectors:

• The scalar product of two different orthonormal states, i ≠ j, yields:
⟨Ψi|Ψj⟩ = 0.

• The scalar product of two identical orthonormal states, i = j, yields:
⟨Ψi|Ψj⟩ = 1.

These two cases can be combined in a single equation using the Kronecker
delta δij:

$$\langle\Psi_i|\Psi_j\rangle = \delta_{ij} \tag{16.11}$$
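Condition 16.11 is easy to test numerically for a given set of states. The sketch below assumes two example bases of a two-dimensional Hilbert space; the helper `is_orthonormal` and the tolerance are illustrative choices.

```python
import math

def inner(phi, psi):
    """Scalar product <phi|psi> for states stored as lists."""
    return sum(p.conjugate() * q for p, q in zip(phi, psi))

def is_orthonormal(basis, tol=1e-12):
    """Check <psi_i|psi_j> = delta_ij for every pair of states, up to rounding."""
    for i, state_i in enumerate(basis):
        for j, state_j in enumerate(basis):
            delta_ij = 1 if i == j else 0
            if abs(inner(state_i, state_j) - delta_ij) > tol:
                return False
    return True

s = 1 / math.sqrt(2)
standard_basis = [[1, 0], [0, 1]]
complex_basis = [[s, s * 1j], [s, -s * 1j]]   # assumed example with complex components
not_a_basis = [[1, 0], [1, 1]]                # neither normalized nor orthogonal

print(is_orthonormal(standard_basis))
print(is_orthonormal(complex_basis))
print(is_orthonormal(not_a_basis))
```

The second basis shows why the complex conjugation in the bra matters: without it, the two states would not come out as orthogonal.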

16.5 Tensor Product in Bra-Ket Notation


Another important operation between a bra and ket vector is the tensor
product (also called the outer product): |Φ⟩ ⊗ ⟨Ψ|. We can omit the tensor
symbol ⊗, because it is immediately clear from the bra-ket notation that it is
not a scalar or inner product: |Φ⟩⟨Ψ|, since bra and ket vectors are
interchanged here.
The result of the tensor product is a matrix:
• If the states |Φ⟩ and |Ψ⟩ each have two components, then |Φ⟩⟨Ψ| is a
2×2 matrix.
• If the states |Φ⟩ and |Ψ⟩ each have three components, then |Φ⟩⟨Ψ| is
a 3×3 matrix.
• If the states |Φ⟩ and |Ψ⟩ each have n components, then |Φ⟩⟨Ψ| is an
n×n matrix.
As you know from matrix multiplication, in the tensor product, we multiply a
ket vector |Φ⟩, which is a column vector, with a bra vector ⟨Ψ|, which is a row
vector. If the states have three components, then we get a 3×3 matrix:

$$|\Phi\rangle\langle\Psi| = \begin{bmatrix} \Phi_1 \\ \Phi_2 \\ \Phi_3 \end{bmatrix} [\Psi_1^*, \Psi_2^*, \Psi_3^*] = \begin{bmatrix} \Phi_1\Psi_1^* & \Phi_1\Psi_2^* & \Phi_1\Psi_3^* \\ \Phi_2\Psi_1^* & \Phi_2\Psi_2^* & \Phi_2\Psi_3^* \\ \Phi_3\Psi_1^* & \Phi_3\Psi_2^* & \Phi_3\Psi_3^* \end{bmatrix} \tag{16.12}$$

You will encounter such matrices in the form of density matrices very often in
quantum mechanics, for example, when learning about quantum
entanglement.
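The matrix 16.12 can be built with one nested list comprehension. This is a minimal sketch with two made-up two-component states; the function name `outer` is an illustrative choice.

```python
def outer(phi, psi):
    """Return the matrix |phi><psi| with entries phi_i * conj(psi_j), as in Eq. 16.12."""
    return [[p * q.conjugate() for q in psi] for p in phi]

phi = [1, 2j]
psi = [1j, 3]

for row in outer(phi, psi):
    print(row)
```

Row i of the result is the component Φi multiplied by the whole bra ⟨Ψ|, exactly as in the column-times-row multiplication above.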

16.6 Projection Matrices


Let's take a normalized state |Ψ⟩, meaning the magnitude of this vector
is 1, and form the tensor product of this state with itself. We then obtain a
projection matrix |Ψ⟩⟨Ψ| (or projection operator, if specific components
are not considered):

$$|\Psi\rangle\langle\Psi| = \begin{bmatrix} \Psi_1 \\ \Psi_2 \\ \Psi_3 \end{bmatrix} [\Psi_1^*, \Psi_2^*, \Psi_3^*] = \begin{bmatrix} \Psi_1\Psi_1^* & \Psi_1\Psi_2^* & \Psi_1\Psi_3^* \\ \Psi_2\Psi_1^* & \Psi_2\Psi_2^* & \Psi_2\Psi_3^* \\ \Psi_3\Psi_1^* & \Psi_3\Psi_2^* & \Psi_3\Psi_3^* \end{bmatrix} \tag{16.13}$$

If we apply a projection matrix to any ket vector |Φ⟩ (which may not be
normalized), then we multiply a matrix |Ψ⟩⟨Ψ| by a column vector |Φ⟩:

$$|\Psi\rangle\langle\Psi|\,|\Phi\rangle = |\Psi\rangle\langle\Psi|\Phi\rangle = \begin{bmatrix} \Psi_1\Psi_1^* & \Psi_1\Psi_2^* & \Psi_1\Psi_3^* \\ \Psi_2\Psi_1^* & \Psi_2\Psi_2^* & \Psi_2\Psi_3^* \\ \Psi_3\Psi_1^* & \Psi_3\Psi_2^* & \Psi_3\Psi_3^* \end{bmatrix} \begin{bmatrix} \Phi_1 \\ \Phi_2 \\ \Phi_3 \end{bmatrix} \tag{16.14}$$

The special feature of a projection matrix is that it projects the state |Φ⟩
onto the state |Ψ⟩. In simple terms, it yields the part of the quantum state
|Φ⟩ that overlaps with the quantum state |Ψ⟩. The result of the projection is
thus a ket vector |Ψ⟩⟨Ψ|Φ⟩, which describes the overlap of the quantum
states |Φ⟩ and |Ψ⟩.
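Because |Ψ⟩⟨Ψ|Φ⟩ is just the number ⟨Ψ|Φ⟩ times the ket |Ψ⟩, the projection can be computed without ever building the matrix. The following sketch assumes a two-component example; the names `project`, `psi` and `phi` are illustrative.

```python
import math

def inner(phi, psi):
    """Scalar product <phi|psi> for states stored as lists."""
    return sum(p.conjugate() * q for p, q in zip(phi, psi))

def project(psi, phi):
    """Apply the projector |psi><psi| to |phi>: the number <psi|phi> times the ket |psi>."""
    overlap = inner(psi, phi)
    return [overlap * component for component in psi]

s = 1 / math.sqrt(2)
psi = [s, s]    # normalized state onto which we project
phi = [3, 1]    # arbitrary state, not normalized

projected = project(psi, phi)
print(projected)                 # the part of |phi> along |psi>
print(project(psi, projected))   # projecting again changes nothing
```

The second print illustrates the defining property of a projection: applying it twice gives the same result as applying it once.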

16.6.1 Basis Transformation with Projection Matrices


Projection matrices are an important tool in theoretical physics for investigating
the overlap of quantum states. However, perhaps the most important use of
projection matrices is for effortless basis transformation. If we have any
quantum state |Φ⟩ and want to view it from a different perspective, or
mathematically speaking, represent it in a different basis, we first choose the
desired new basis: {|Ψi⟩}. As you hopefully know from linear algebra, this
is a set of orthonormal vectors |Ψ1⟩, |Ψ2⟩, |Ψ3⟩, and so on. Their number
equals the dimension of the Hilbert space in which these quantum states
live. In quantum mechanics, we refer to the basis vectors as basis states. For
describing particle spin, for example, we only need two basis states.

For the sake of illustration, let's assume that our desired basis consists of only
three basis states: {|Ψ1 ⟩, |Ψ2 ⟩, |Ψ3 ⟩}. With each of these basis states, we can
construct projection matrices: |Ψ1 ⟩⟨Ψ1 |, |Ψ2 ⟩⟨Ψ2 |, and |Ψ3 ⟩⟨Ψ3 |.

To represent a quantum state |Φ⟩ in this basis, we first form the sum of the
projection matrices:

$$\sum_{i=1}^{3} |\Psi_i\rangle\langle\Psi_i| = |\Psi_1\rangle\langle\Psi_1| + |\Psi_2\rangle\langle\Psi_2| + |\Psi_3\rangle\langle\Psi_3| = I \tag{16.15}$$

As we know from mathematics, the sum of the projection matrices forming
a basis is an identity matrix I. The fact that the sum yields an identity
matrix is crucial during basis transformation because we do not want to alter
the quantum state |Φ⟩. Multiplying an identity matrix by a column vector |Φ⟩
does not change this vector:

$$|\Phi\rangle = I\,|\Phi\rangle \tag{16.16}$$

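Now, let's substitute the sum of the basis projection matrices 16.15 for the
identity matrix:

\begin{align}
|\Phi\rangle &= \sum_{i=1}^{3} |\Psi_i\rangle\langle\Psi_i|\Phi\rangle \tag{16.17} \\
&= \left(|\Psi_1\rangle\langle\Psi_1| + |\Psi_2\rangle\langle\Psi_2| + |\Psi_3\rangle\langle\Psi_3|\right) |\Phi\rangle \tag{16.18} \\
&= |\Psi_1\rangle\langle\Psi_1|\Phi\rangle + |\Psi_2\rangle\langle\Psi_2|\Phi\rangle + |\Psi_3\rangle\langle\Psi_3|\Phi\rangle \tag{16.19} \\
&= |\Phi\rangle \tag{16.20}
\end{align}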

The resulting state |Φ⟩, although denoted the same as the original state |Φ⟩,
is now represented in the new basis {|Ψi ⟩}. If we want to emphasize the new
basis, we can also assign it an index: |Φ⟩Ψ . I hope you now understand the
usefulness of the concept of projection matrices!

In general, we can represent a quantum state |Φ⟩ with n components using a
basis {|Ψi⟩} consisting of n basis states as follows:

$$|\Phi\rangle_\Psi = \sum_{i=1}^{n} |\Psi_i\rangle\langle\Psi_i|\Phi\rangle \tag{16.21}$$

The basis change with a finite number of basis states is, of course, only exact
when the states |Φ⟩ live in a finite-dimensional Hilbert space. For states
with infinitely many components, Eq. 16.21 is only an approximation of the
old state in the new basis. The approximation becomes more accurate as we
choose n larger. Thus, in computational physics, we can save memory by not
choosing n too large, but large enough to approximate the quantum state well
enough in the new basis.
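The basis change 16.21 consists of two steps: compute the numbers ⟨Ψi|Φ⟩, then sum the basis states weighted by these numbers. The sketch below does this for an assumed orthonormal basis of a two-dimensional Hilbert space; the helper names `expand` and `reconstruct` are illustrative.

```python
import math

def inner(phi, psi):
    """Scalar product <phi|psi> for states stored as lists."""
    return sum(p.conjugate() * q for p, q in zip(phi, psi))

def expand(basis, phi):
    """Coefficients <psi_i|phi> of the state |phi> in the given orthonormal basis."""
    return [inner(state, phi) for state in basis]

def reconstruct(basis, coefficients):
    """Sum of the projections <psi_i|phi> |psi_i>, as in Eq. 16.21."""
    dim = len(basis[0])
    return [sum(c * state[k] for c, state in zip(coefficients, basis))
            for k in range(dim)]

s = 1 / math.sqrt(2)
basis = [[s, s], [s, -s]]   # assumed orthonormal basis
phi = [2, 1j]               # state given in the original basis

coefficients = expand(basis, phi)
print(coefficients)                       # components of |phi> in the new basis
print(reconstruct(basis, coefficients))   # agrees with phi up to rounding
```

That the reconstruction returns the original components is the numerical counterpart of the statement that the projectors of a complete basis sum to the identity matrix.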

Guess how the basis change for states with infinitely many components can be
made exact? With an integral! To do this, replace the discrete summation
(sum sign) with a continuous summation (integral):

$$|\Phi\rangle_\Psi = \int dx\,|\Psi\rangle\langle\Psi|\Phi\rangle \tag{16.22}$$

16.7 Schrödinger Equation in Bra-Ket Notation


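In Chapter 15.7, you learned about the Schrödinger equation as an eigenvalue
equation. Here it is again:

\begin{align}
\hat{H}\,\Psi &= W\,\Psi \\
\hat{H}\,\Psi &= i\hbar\,\frac{\partial \Psi}{\partial t}
\end{align}

You will encounter this eigenvalue equation regularly in Bra-Ket notation. To
do so, replace the wave function Ψ with the ket vector:

\begin{align}
\hat{H}\,|\Psi\rangle &= W\,|\Psi\rangle \tag{16.23} \\
\hat{H}\,|\Psi\rangle &= i\hbar\,\frac{\partial}{\partial t}\,|\Psi\rangle \tag{16.24}
\end{align}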

16.8 Mean Values in Bra-Ket Notation


We can utilize the familiar Bra-Ket notation to represent the mean value ⟨Ĥ⟩
of an operator Ĥ in the quantum state |Ψ⟩. Physicists often call the
mean value the expectation value. However, this terminology is misleading
and should not be used for the mean value ⟨Ĥ⟩. The notation ⟨Ĥ⟩ for the mean
value is a short representation of ⟨Ψ| Ĥ |Ψ⟩.

So, we obtain the mean value of an observable by sandwiching the operator Ĥ
between a Bra vector ⟨Ψ| and a Ket vector:

$$\langle\hat{H}\rangle = \langle\Psi|\,\hat{H}\,|\Psi\rangle = \langle\Psi|\hat{H}\Psi\rangle \tag{16.25}$$

For the last equal sign, we have taken advantage of the fact that Ĥ applied
to the ket vector |Ψ⟩ results in a new ket vector |ĤΨ⟩. As you know, the
notation in | ⟩ is irrelevant as long as it is clear what this ket vector represents.

Now you should have a solid basic knowledge of bra-ket notation. You should
have learned the following from this chapter:

• You know what bra and ket vectors are.

• You know how to form the scalar product and inner product.

• You know how to construct projection matrices in Bra-Ket notation.

• You know how to carry out a change of basis with projection matrices.
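17. Represent Operators as Matrices

If we choose a basis, we can represent an operator Ĥ as a matrix H. For
example, we can take the eigenbasis {|φi⟩} of Ĥ. The eigenbasis is the set of
eigenstates (eigenvectors) |φ1⟩, |φ2⟩, |φ3⟩, ... of Ĥ.
• If the operator Ĥ : H → H is a mapping of a two-dimensional
Hilbert space H onto itself, then the eigenbasis of Ĥ has two eigenstates
{|φ1⟩, |φ2⟩} and Ĥ can be represented by a 2×2 matrix:

$$H = \begin{bmatrix} \langle\varphi_1|\hat{H}|\varphi_1\rangle & \langle\varphi_1|\hat{H}|\varphi_2\rangle \\ \langle\varphi_2|\hat{H}|\varphi_1\rangle & \langle\varphi_2|\hat{H}|\varphi_2\rangle \end{bmatrix}$$

• If the operator Ĥ : H → H is a mapping of a three-dimensional
Hilbert space H onto itself, then the eigenbasis of Ĥ has three
eigenstates {|φ1⟩, |φ2⟩, |φ3⟩} and Ĥ can be represented by a 3×3 matrix:

$$H = \begin{bmatrix} \langle\varphi_1|\hat{H}|\varphi_1\rangle & \langle\varphi_1|\hat{H}|\varphi_2\rangle & \langle\varphi_1|\hat{H}|\varphi_3\rangle \\ \langle\varphi_2|\hat{H}|\varphi_1\rangle & \langle\varphi_2|\hat{H}|\varphi_2\rangle & \langle\varphi_2|\hat{H}|\varphi_3\rangle \\ \langle\varphi_3|\hat{H}|\varphi_1\rangle & \langle\varphi_3|\hat{H}|\varphi_2\rangle & \langle\varphi_3|\hat{H}|\varphi_3\rangle \end{bmatrix}$$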
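The matrix elements ⟨φi| Ĥ |φj⟩ can be computed entry by entry. The sketch below assumes, purely as an example, that the operator is the σx spin matrix given in the standard basis, and that the chosen basis is its normalized eigenbasis; all names are illustrative.

```python
import math

def inner(phi, psi):
    """Scalar product <phi|psi> for states stored as lists."""
    return sum(p.conjugate() * q for p, q in zip(phi, psi))

def apply(matrix, ket):
    """Matrix times column vector."""
    return [sum(m * k for m, k in zip(row, ket)) for row in matrix]

def matrix_in_basis(operator, basis):
    """Matrix elements H_ij = <phi_i| H |phi_j> for an orthonormal basis."""
    return [[inner(bi, apply(operator, bj)) for bj in basis] for bi in basis]

s = 1 / math.sqrt(2)
sigma_x = [[0, 1], [1, 0]]        # example operator in the standard basis
eigenbasis = [[s, s], [s, -s]]    # its eigenstates with eigenvalues +1 and -1

H = matrix_in_basis(sigma_x, eigenbasis)
for row in H:
    print(row)
```

Up to rounding, the result is a diagonal matrix with the eigenvalues +1 and -1 on the diagonal, which is what makes the eigenbasis such a convenient choice.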


18. Hermitian Operators

More: en.fufaev.org/hermitian-operators
Let's take a look at the mean value ⟨Ĥ⟩ of the operator Ĥ in the state |Ψ⟩. If
Ĥ is the Hamilton operator, then ⟨Ĥ⟩ describes the mean value of the total
energy of a quantum particle:

\begin{align}
\langle\hat{H}\rangle &= \langle\Psi|\hat{H}\Psi\rangle \tag{18.1} \\
&= \int \Psi(x,t)^*\,\left(\hat{H}\,\Psi(x,t)\right) dx \tag{18.2}
\end{align}

The mean value of a physical quantity, as we know it from everyday life, is a
real number. In quantum mechanics, however, complex mean values can also
occur for some operators. Imagine that we want to calculate the position mean
value ⟨x̂⟩ or the energy mean value ⟨Ĥ⟩ and obtain a complex value as a result.
This is problematic, because what does a complex position or a complex energy
mean?

In quantum mechanics, we are therefore only interested in mean values ⟨Ĥ⟩
that are real. How can we demand this mathematically? Quite simple! A real
number, for example the number 5, remains unchanged if we complex conjugate
it: 5 = 5∗. A complex number, for example 4 + 2i, does not remain the same if
we complex conjugate it: (4 + 2i)∗ = 4 − 2i.

So if the mean value is real, then it is equal to its complex conjugate
value:

$$\langle\Psi|\hat{H}\Psi\rangle = \langle\Psi|\hat{H}\Psi\rangle^* \tag{18.3}$$

We call an operator Ĥ whose mean value ⟨Ĥ⟩ is real a Hermitian
operator. A Hermitian operator therefore represents a measurable
quantity, such as momentum, position and energy. A quantity that we can
measure in an experiment is called an observable in quantum mechanics.

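What does this mean for the mean value integral 18.2 if Ĥ is a Hermitian
operator? Let's take a look at this by calculating the complex conjugate mean
value ⟨Ψ|ĤΨ⟩∗ by using the mean value integral 18.2:

\begin{align}
\langle\Psi|\hat{H}\Psi\rangle^* &= \left(\int \Psi^*\,(\hat{H}\Psi)\,dx\right)^* \tag{18.4} \\
&= \left(\int (\hat{H}\Psi)\,\Psi^*\,dx\right)^* \\
&= \int (\hat{H}\Psi)^*\,(\Psi^*)^*\,dx \\
&= \int (\hat{H}\Psi)^*\,\Psi\,dx \\
&= \langle\hat{H}\Psi|\Psi\rangle
\end{align}

Together with the requirement 18.3, we have discovered the following important
property of Hermitian operators:

$$\langle\Psi|\hat{H}\Psi\rangle = \langle\hat{H}\Psi|\Psi\rangle \tag{18.5}$$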

In order for our mean-has-to-be-real requirement 18.3 to be fulfilled, the operator
Ĥ must be interchangeable in the scalar product. It must therefore
not matter whether we first apply Ĥ to the Ket or Bra vector in the mean
value calculation. So if you know that an operator is Hermitian, then move the
operator wherever you want in the Bra-Ket notation.
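Property 18.5 can be checked numerically in a finite-dimensional space, where the operator is a matrix. The sketch below uses a made-up Hermitian 2×2 matrix and a made-up state, and contrasts them with a matrix that is not Hermitian; all names and numbers are assumptions for illustration.

```python
def inner(phi, psi):
    """Scalar product <phi|psi> for states stored as lists."""
    return sum(p.conjugate() * q for p, q in zip(phi, psi))

def apply(matrix, ket):
    """Matrix times column vector."""
    return [sum(m * k for m, k in zip(row, ket)) for row in matrix]

H = [[2, 1 - 1j], [1 + 1j, 3]]   # assumed example of a Hermitian matrix
N = [[1, 2], [-3j, 0]]           # assumed example of a non-Hermitian matrix
psi = [1j, 2]

left = inner(psi, apply(H, psi))     # <psi | H psi>
right = inner(apply(H, psi), psi)    # <H psi | psi>
print(left, right)                   # equal, and real

print(inner(psi, apply(N, psi)))     # complex: N does not represent an observable
```

For the Hermitian matrix both orders give the same real number, while the non-Hermitian matrix produces a mean value with a non-zero imaginary part.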

18.1 Useful properties of Hermitian operators


In addition to the useful property 18.5, a Hermitian operator has a bunch of
other useful properties.

Usually, if we want to move an operator Ĥ that acts on a Ket vector to the Bra
vector, we have to take the adjoint Ĥ† of the operator:

$$\langle\Psi|\hat{H}^\dagger\Psi\rangle = \langle\hat{H}\Psi|\Psi\rangle \tag{18.6}$$

As you have learned, in the case of a Hermitian operator we do not have to do
this. A Hermitian operator is a self-adjoint operator:

$$\hat{H} = \hat{H}^\dagger \tag{18.7}$$

To make it clear that Ĥ is a Hermitian operator, the mean value is also noted
as follows:

\begin{align}
\langle\Psi|\hat{H}\Psi\rangle &= \langle\hat{H}\Psi|\Psi\rangle \tag{18.8} \\
&= \langle\Psi|\,\hat{H}\,|\Psi\rangle \\
&= \langle\Psi|\,\hat{H}^\dagger\,|\Psi\rangle
\end{align}

A Hermitian operator has another important property that you must
remember for your quantum mechanics courses: The set of eigenstates
{|φi⟩} (eigenvectors) of a Hermitian operator can be used as a basis. This
property is so important that it has a name, namely the spectral theorem.

Take this to heart: You have a Hermitian operator in front of you. You can take
its eigenstates {|φi⟩} as a basis and thus represent any other state in this basis.
That's incredible and super useful!

18.2 Examples of Hermitian Matrices


Most of the operators that you will encounter in your undergrad courses are
Hermitian operators. These include, for example, the momentum operator p̂,
the position operator x̂, the Hamiltonian operator Ĥ, the kinetic energy operator
Ŵkin and so on.

As you learned in Chapter 17, we can represent an operator as a matrix if we
choose a concrete basis in which to represent the operator. Let's look at a few
concrete examples of Hermitian matrices.
The σy spin matrix is Hermitian. If you transpose and complex-conjugate it,
you get the same matrix:

$$\sigma_y = \begin{bmatrix} 0 & -i \\ i & 0 \end{bmatrix} = \begin{bmatrix} 0 & -i \\ i & 0 \end{bmatrix}^{\dagger} = (\sigma_y)^{\dagger} \tag{18.9}$$

The σx spin matrix is also a Hermitian matrix:

$$\sigma_x = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}^{\dagger} = (\sigma_x)^{\dagger} \tag{18.10}$$

And here is an example of a non-Hermitian matrix. If you transpose and
complex-conjugate it, you get a completely different matrix that is not equal
to the original matrix:

$$\begin{bmatrix} 1 & 2 \\ -3i & 0 \end{bmatrix} \neq \begin{bmatrix} 1 & 2 \\ -3i & 0 \end{bmatrix}^{\dagger} = \begin{bmatrix} 1 & 3i \\ 2 & 0 \end{bmatrix} \tag{18.11}$$
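The three examples above can be checked mechanically: transpose, complex-conjugate, compare. The following sketch stores matrices as nested lists; the function names `dagger` and `is_hermitian` are illustrative choices.

```python
def dagger(matrix):
    """Adjoint of a matrix: transpose it and complex-conjugate every entry."""
    rows, cols = len(matrix), len(matrix[0])
    return [[matrix[r][c].conjugate() for r in range(rows)] for c in range(cols)]

def is_hermitian(matrix):
    """A matrix is Hermitian if it equals its own adjoint."""
    return dagger(matrix) == matrix

sigma_x = [[0, 1], [1, 0]]
sigma_y = [[0, -1j], [1j, 0]]
not_hermitian = [[1, 2], [-3j, 0]]

print(is_hermitian(sigma_x))
print(is_hermitian(sigma_y))
print(is_hermitian(not_hermitian))
print(dagger(not_hermitian))
```

For matrices with floating-point entries, an exact comparison with `==` is too strict; comparing entries up to a small tolerance would be the more robust choice there.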
19. Angular Momentum

More: en.fufaev.org/quantum-angular-momentum

The angular momentum (more precisely: orbital angular momentum) L
of a classical particle is given by the cross product between the position
vector r of the particle, measured from the axis of rotation, and the linear
momentum p = mv of the particle:

$$\boldsymbol{L} = \boldsymbol{r} \times \boldsymbol{p} = \begin{bmatrix} y\,p_z - z\,p_y \\ z\,p_x - x\,p_z \\ x\,p_y - y\,p_x \end{bmatrix} \tag{19.1}$$

The angular momentum L is therefore perpendicular to the vectors r and p due
to the cross product.

If we write out the cross product, we get the individual components Lx, Ly
and Lz of the angular momentum vector, which each indicate the
angular momentum in the x, y and z directions:

• Angular momentum in the x direction is:

$$L_x = y\,p_z - z\,p_y \tag{19.2}$$

• Angular momentum in the y direction is:

$$L_y = z\,p_x - x\,p_z \tag{19.3}$$

• Angular momentum in the z direction is:

$$L_z = x\,p_y - y\,p_x \tag{19.4}$$
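The three components 19.2 to 19.4 can be written down directly in code. The sketch below uses made-up position and momentum values in arbitrary units; the function name `angular_momentum` is an illustrative choice.

```python
def angular_momentum(r, p):
    """Classical angular momentum L = r x p, component by component (Eqs. 19.2-19.4)."""
    x, y, z = r
    px, py, pz = p
    return (y * pz - z * py, z * px - x * pz, x * py - y * px)

def dot(a, b):
    """Ordinary dot product of two 3-vectors."""
    return sum(u * v for u, v in zip(a, b))

r = (1, 2, 3)   # hypothetical position
p = (4, 5, 6)   # hypothetical momentum

L = angular_momentum(r, p)
print(L)
print(dot(L, r), dot(L, p))   # both vanish: L is perpendicular to r and p
```

The vanishing dot products confirm the statement above that L is perpendicular to both r and p.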

How do we turn these classical angular momentum components into quantum
mechanical angular momentum components? By putting little hats on them:
L̂x, L̂y and L̂z, in other words turning the angular momentum components
into operators. The spatial components x = x̂, y = ŷ and z = ẑ remain the
same. And the momentum components are replaced by the following axiomatic
mappings:
• Momentum component px becomes the operator: p̂x = −iℏ ∂x

• Momentum component py becomes the operator: p̂y = −iℏ ∂y

• Momentum component pz becomes the operator: p̂z = −iℏ ∂z

Here, ∂x, ∂y and ∂z are derivative operators. When applied to a function, they
result in the derivative of this function with respect to x, y or z. Of course, a
derivative operator on its own, with nothing to differentiate, makes no sense.
For this reason, a momentum operator only becomes useful when it is applied
to a wave function. The result is a new wave function modified by the
operator. With the axiomatic mappings, we have quantized the classical orbital
angular momentum:
• L̂x = −iℏ y ∂z + iℏ z ∂y = iℏ (z ∂y − y ∂z)

• L̂y = −iℏ z ∂x + iℏ x ∂z = iℏ (x ∂z − z ∂x)

• L̂z = −iℏ x ∂y + iℏ y ∂x = iℏ (y ∂x − x ∂y)
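To see a momentum operator act on a wave function, here is a minimal numerical sketch. It assumes units with ℏ = 1, approximates the derivative by a central difference, and uses a plane wave as an example wave function; the names `momentum_x`, `plane_wave` and the step size `h` are illustrative.

```python
import cmath

HBAR = 1.0  # assumption: natural units with hbar = 1

def momentum_x(psi, h=1e-5):
    """Return the function (p_x psi)(x) = -i*hbar * d psi/dx, using a central difference."""
    def p_psi(x):
        derivative = (psi(x + h) - psi(x - h)) / (2 * h)
        return -1j * HBAR * derivative
    return p_psi

k = 3.0  # hypothetical wave number

def plane_wave(x):
    """Example wave function exp(i*k*x)."""
    return cmath.exp(1j * k * x)

x0 = 0.7
ratio = momentum_x(plane_wave)(x0) / plane_wave(x0)
print(ratio)   # approximately hbar * k
```

The operator returns the plane wave multiplied by ℏk, so the plane wave is an eigenfunction of p̂x with the momentum ℏk as eigenvalue.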
190 Chapter 19. Angular Momentum

19.1 Can We Measure Angular Momentum?


We have constructed the angular momentum operators L̂x, L̂y and L̂z. The
question now is: Do they represent observables? Or to put it another way:
Are they Hermitian operators? This is important because the angular
momentum components can only be measured in an experiment if they are
Hermitian operators.
A Hermitian operator L̂x is equal to its adjoint L̂x†. Let's check that:

$$\hat{L}_x^\dagger = (\hat{y}\,\hat{p}_z - \hat{z}\,\hat{p}_y)^\dagger \tag{19.5}$$

First, we use the fact that the adjoint of a difference is the difference of the
adjoints:

$$\hat{L}_x^\dagger = (\hat{y}\,\hat{p}_z)^\dagger - (\hat{z}\,\hat{p}_y)^\dagger \tag{19.6}$$

In the next step we use the anti-distributivity of the adjoint. This swaps the
two operators in each parenthesis and the parentheses disappear:

$$\hat{L}_x^\dagger = \hat{p}_z^\dagger\,\hat{y}^\dagger - \hat{p}_y^\dagger\,\hat{z}^\dagger \tag{19.7}$$

We know that the position operators ŷ, ẑ and momentum operators p̂z, p̂y are
Hermitian. Hermitian operators are equal to their adjoint. We can therefore
omit †:

$$\hat{L}_x^\dagger = \hat{p}_z\,\hat{y} - \hat{p}_y\,\hat{z} \tag{19.8}$$

The momentum and position operators may be interchanged here, because the
operator p̂z = −iℏ ∂z differentiates with respect to the z-coordinate and ŷ = y
does not depend on z and therefore acts like a constant that can be moved
forward. The argumentation for the term p̂y ẑ is the same:

$$\hat{L}_x^\dagger = \hat{y}\,\hat{p}_z - \hat{z}\,\hat{p}_y = \hat{L}_x \tag{19.9}$$

The expression obtained corresponds exactly to the L̂x operator. Similarly, we
can show that the L̂y and the L̂z operator are also Hermitian. Conclusion: We
can measure the angular momentum components of quantum particles
in an experiment. Perfect!

19.2 Can We Determine ALL Angular Momentum Components?

In classical physics, in our macroscopic world, the values of all three angular
momentum components Lx, Ly and Lz exist, for example for a spinning
particle. All three components can therefore be determined exactly and
simultaneously.

In the quantum world, on the other hand, we have the Heisenberg
uncertainty principle, which makes it impossible to determine certain
observables exactly at the same time because one of the observables does not
inherently have an exact value if the other observable is measured exactly.
Momentum p̂x and position x̂ are an example of two observables that cannot
be determined exactly at the same time.

Formulated mathematically, the Heisenberg uncertainty principle states: If we
first apply the position operator x̂ to the wave function Ψ(x, y, z) and then the
momentum operator, p̂x x̂ Ψ(x, y, z), then we get something different than if
we first apply the momentum operator and then the position operator,
x̂ p̂x Ψ(x, y, z). It matters whether we first measure the position or the
momentum of a quantum particle. As soon as we reverse the order of the
measurement, we get something completely different for the momentum and
position. We say that the momentum and position are subject to the
Heisenberg uncertainty principle. The difference between the two
measurements is provided by the commutator [x̂, p̂x]. To obtain it, we
calculate the difference between the two measurements and factor out the wave
function. The difference of the operators is the commutator of x̂ and p̂x:

\begin{align}
\hat{x}\,\hat{p}_x\,\Psi - \hat{p}_x\,\hat{x}\,\Psi &= (\hat{x}\,\hat{p}_x - \hat{p}_x\,\hat{x})\,\Psi \tag{19.10} \\
&= [\hat{x}, \hat{p}_x]\,\Psi \\
&= i\hbar\,\Psi
\end{align}

In the last step, we used the commutator of the position and momentum operator
[x̂, p̂x] = iℏ.

• If the commutator is zero, then in principle it is possible to determine both
observables exactly at the same time.
• If the commutator is not zero, as in the case of [x̂, p̂x] = iℏ, then it is
impossible to determine both observables exactly at the same time. Only
one of the observables, either p̂x or x̂, can be determined exactly.
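The commutator [x̂, p̂x] = iℏ can be verified exactly on polynomials, because multiplying by x and differentiating are both simple operations on the list of coefficients. The sketch below assumes ℏ = 1 and stores a polynomial c0 + c1 x + c2 x² + ... as the list [c0, c1, c2, ...]; the function names are illustrative.

```python
HBAR = 1.0  # assumption: natural units with hbar = 1

def x_op(coeffs):
    """Position operator: multiply the polynomial by x (shift all powers up by one)."""
    return [0] + list(coeffs)

def p_op(coeffs):
    """Momentum operator: apply -i*hbar*d/dx to the polynomial."""
    return [-1j * HBAR * n * c for n, c in enumerate(coeffs)][1:] or [0]

def subtract(a, b):
    """Difference of two polynomials, padding the shorter one with zeros."""
    length = max(len(a), len(b))
    a = list(a) + [0] * (length - len(a))
    b = list(b) + [0] * (length - len(b))
    return [u - v for u, v in zip(a, b)]

psi = [1, 2, 3]   # hypothetical test function psi(x) = 1 + 2x + 3x^2

# (x p - p x) psi, i.e. the commutator [x, p] applied to psi
commutator_psi = subtract(x_op(p_op(psi)), p_op(x_op(psi)))
print(commutator_psi)   # equals i*hbar times the coefficients of psi
```

The result is the original polynomial multiplied by iℏ, independent of which coefficients are chosen for `psi`.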
With this knowledge, we can now ask: Can we know all the angular
momentum components of a quantum particle exactly at the same
time?
Short answer: No! To see this, we must look at the commutators [L̂x, L̂z], [L̂y, L̂z]
and [L̂x, L̂y] of the angular momentum components. We will find that none of the
commutators is zero. It is therefore impossible to know two angular momentum
components exactly at the same time.
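To demonstrate this, let's look at the commutator [L̂x, L̂z] and show that L̂x
and L̂z are subject to the uncertainty principle. First, we use the definition of
the commutator:

\begin{align}
[\hat{L}_x, \hat{L}_z] &= \hat{L}_x\,\hat{L}_z - \hat{L}_z\,\hat{L}_x \tag{19.11} \\
&= (\hat{y}\,\hat{p}_z - \hat{z}\,\hat{p}_y)\,(\hat{x}\,\hat{p}_y - \hat{y}\,\hat{p}_x) - (\hat{x}\,\hat{p}_y - \hat{y}\,\hat{p}_x)\,(\hat{y}\,\hat{p}_z - \hat{z}\,\hat{p}_y) \\
&= \hat{y}\,\hat{p}_z\,\hat{x}\,\hat{p}_y - \hat{y}\,\hat{p}_z\,\hat{y}\,\hat{p}_x - \hat{z}\,\hat{p}_y\,\hat{x}\,\hat{p}_y + \hat{z}\,\hat{p}_y\,\hat{y}\,\hat{p}_x \\
&\quad - \hat{x}\,\hat{p}_y\,\hat{y}\,\hat{p}_z + \hat{x}\,\hat{p}_y\,\hat{z}\,\hat{p}_y + \hat{y}\,\hat{p}_x\,\hat{y}\,\hat{p}_z - \hat{y}\,\hat{p}_x\,\hat{z}\,\hat{p}_y
\end{align}

For the second equal sign, we have expressed the angular momentum operators
with position and momentum operators. For the third equal sign, we have
multiplied out the brackets.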

Then we swap the operators so that some terms cancel out. In the first
term, ŷ p̂z x̂ p̂y, we can place x̂ at the beginning without problems: x̂ ŷ p̂z p̂y,
because x̂ commutes with both ŷ and p̂z (their commutator is zero, so we can
move them back and forth). We can place the operator p̂y in front of p̂z without
problems: x̂ ŷ p̂y p̂z, but not in front of ŷ, because the commutator [ŷ, p̂y] = iℏ
is not zero. Therefore, we must replace ŷ p̂y with iℏ + p̂y ŷ: x̂ (iℏ + p̂y ŷ) p̂z =
iℏ x̂ p̂z + x̂ p̂y ŷ p̂z. The second part cancels out the term −x̂ p̂y ŷ p̂z:

\begin{align}
[\hat{L}_x, \hat{L}_z] &= i\hbar\,\hat{x}\,\hat{p}_z - \hat{y}\,\hat{p}_z\,\hat{y}\,\hat{p}_x - \hat{z}\,\hat{p}_y\,\hat{x}\,\hat{p}_y + \hat{z}\,\hat{p}_y\,\hat{y}\,\hat{p}_x \tag{19.12} \\
&\quad + \hat{x}\,\hat{p}_y\,\hat{z}\,\hat{p}_y + \hat{y}\,\hat{p}_x\,\hat{y}\,\hat{p}_z - \hat{y}\,\hat{p}_x\,\hat{z}\,\hat{p}_y
\end{align}

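In the expression ŷ p̂z ŷ p̂x we can swap all operators without any problems and
cancel it out with the other term ŷ p̂x ŷ p̂z:

\begin{align}
[\hat{L}_x, \hat{L}_z] &= i\hbar\,\hat{x}\,\hat{p}_z - \hat{z}\,\hat{p}_y\,\hat{x}\,\hat{p}_y + \hat{z}\,\hat{p}_y\,\hat{y}\,\hat{p}_x \tag{19.13} \\
&\quad + \hat{x}\,\hat{p}_y\,\hat{z}\,\hat{p}_y - \hat{y}\,\hat{p}_x\,\hat{z}\,\hat{p}_y
\end{align}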

Also in the expression ẑ p̂y x̂ p̂y the operators can be swapped freely, so that
it cancels out with the term x̂ p̂y ẑ p̂y:

$$[\hat{L}_x, \hat{L}_z] = i\hbar\,\hat{x}\,\hat{p}_z + \hat{z}\,\hat{p}_y\,\hat{y}\,\hat{p}_x - \hat{y}\,\hat{p}_x\,\hat{z}\,\hat{p}_y \tag{19.14}$$

Now we come to the term ŷ p̂x ẑ p̂y, where the swapping is not simply possible.
First, we can swap ŷ and p̂x: p̂x ŷ ẑ p̂y and then ẑ with p̂y, so that we have the
following expression: p̂x ŷ p̂y ẑ. To now swap ŷ with p̂y, we have to replace their
product ŷ p̂y with iℏ + p̂y ŷ because of the non-vanishing commutator [ŷ, p̂y] = iℏ:
p̂x (iℏ + p̂y ŷ) ẑ = iℏ p̂x ẑ + p̂x p̂y ŷ ẑ. This turns the commutator into:

$$[\hat{L}_x, \hat{L}_z] = i\hbar\,\hat{x}\,\hat{p}_z + \hat{z}\,\hat{p}_y\,\hat{y}\,\hat{p}_x - i\hbar\,\hat{p}_x\,\hat{z} - \hat{p}_x\,\hat{p}_y\,\hat{y}\,\hat{z} \tag{19.15}$$

In the expression ẑ p̂y ŷ p̂x we move ẑ to the end and p̂x to the front:
p̂x p̂y ŷ ẑ, and can thus cancel it out with the last term:

\begin{align}
[\hat{L}_x, \hat{L}_z] &= i\hbar\,\hat{x}\,\hat{p}_z - i\hbar\,\hat{p}_x\,\hat{z} \tag{19.16} \\
&= i\hbar\,(\hat{x}\,\hat{p}_z - \hat{p}_x\,\hat{z}) \\
&= -i\hbar\,\hat{L}_y
\end{align}

In the last step we used that p̂x and ẑ commute, so that
x̂ p̂z − p̂x ẑ = −(ẑ p̂x − x̂ p̂z) = −L̂y.

As you can see, the commutator [L̂x, L̂z] is not zero, so it is impossible to know
L̂x and L̂z simultaneously with arbitrary precision. The other two commutators
can be derived in the same way:

\begin{align}
[\hat{L}_y, \hat{L}_z] &= i\hbar\,\hat{L}_x \tag{19.17} \\
[\hat{L}_x, \hat{L}_y] &= i\hbar\,\hat{L}_z
\end{align}
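The commutation relations can also be checked with concrete matrices. The sketch below assumes the standard 3×3 matrices for angular momentum quantum number l = 1 in units of ℏ = 1 (basis ordered as m = 1, 0, −1); these matrices are not derived in this chapter and are used here only as a test case. It checks the relations in cyclic order x → y → z → x.

```python
import math

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutator(a, b):
    """[A, B] = AB - BA."""
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] - ba[i][j] for j in range(n)] for i in range(n)]

def scale(factor, a):
    """Multiply every matrix entry by a number."""
    return [[factor * entry for entry in row] for row in a]

def close(a, b, tol=1e-12):
    """Entry-wise comparison up to rounding errors."""
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

s = 1 / math.sqrt(2)
# Assumed l = 1 matrices with hbar = 1.
Lx = [[0, s, 0], [s, 0, s], [0, s, 0]]
Ly = [[0, -1j * s, 0], [1j * s, 0, -1j * s], [0, 1j * s, 0]]
Lz = [[1, 0, 0], [0, 0, 0], [0, 0, -1]]

print(close(commutator(Lx, Ly), scale(1j, Lz)))   # [Lx, Ly] = i Lz
print(close(commutator(Ly, Lz), scale(1j, Lx)))   # [Ly, Lz] = i Lx
print(close(commutator(Lz, Lx), scale(1j, Ly)))   # [Lz, Lx] = i Ly
```

Swapping the two entries of a commutator flips its sign, so a relation written in non-cyclic order picks up a minus sign.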

We can easily illustrate this fundamental uncertainty of the angular
momentum components. Let's consider a classical particle that moves on a
circular path. It therefore has an angular momentum L. All three angular
momentum components are exactly fixed, so L has a fixed direction.

What if it were a quantum particle? Let's assume that we have measured the
angular momentum component L̂z of the quantum particle. We have thus
exactly determined its angular momentum component Lz. Due to the
non-vanishing commutators, the other two angular momentum components Lx
and Ly have no concrete value. The direction of the total angular momentum
vector L is no longer clearly given, but lies anywhere on the surface of a cone.

Based on the cone in the illustration, we can already guess that although the
direction of L is not unique, the length of the L vector is unique. We
can determine the length of the L vector using the sum of the squares of the
angular momentum operators:

$$\hat{L}^2 = \hat{L}_x^2 + \hat{L}_y^2 + \hat{L}_z^2 \tag{19.18}$$

This sum is briefly denoted as the L̂² operator. This operator is Hermitian, so
it represents an observable, namely the length of the angular momentum
vector squared. And the great thing is: This operator commutes with each
angular momentum component L̂x, L̂y and L̂z. This is very good, because
it allows us to determine not only one of the angular momentum components
of a quantum particle exactly, but also the magnitude of the total angular
momentum:

$$\hat{L} = \sqrt{\hat{L}_x^2 + \hat{L}_y^2 + \hat{L}_z^2} \tag{19.19}$$

It would be very bad for physics if the magnitude of the total angular
momentum did not exist exactly at all times. Without a fixed, exact total
angular momentum, the law of conservation of angular momentum in quantum
mechanics would not work at all.

19.3 Quantum Numbers l and m


Let's use the Bra-Ket notation and treat operators as matrices and wave
functions as Ket vectors (states). A commutator not only tells us whether two
observables can be measured exactly at the same time, but also whether the
associated operators share eigenstates {|Ψi ⟩}.

If the commutator [L̂², L̂z] vanishes (and it does), then we know that there is a
state |Ψ⟩, which is simultaneously both an eigenstate of L̂² and an eigenstate
of L̂z.

• If the L̂² operator is applied to the state |Ψ⟩, which is an eigenstate of this
operator, then the result is the eigenstate scaled by the eigenvalue. In
the case of L̂², the eigenvalue is the magnitude of the total angular
momentum squared:

$$\hat{L}^2\,|\Psi\rangle = L^2\,|\Psi\rangle \tag{19.20}$$

• And if L̂z is applied to the state |Ψ⟩, which is also an eigenstate of L̂z, then
we again get the scaled eigenstate with a different eigenvalue. In the case
of L̂z, this eigenvalue represents the value of the angular momentum
component in the z-direction:

$$\hat{L}_z\,|\Psi\rangle = L_z\,|\Psi\rangle \tag{19.21}$$

Using the ladder operators, we can derive the eigenvalues L² and Lz a little
more precisely. Here I give the famous result that probably every chemist knows.
The eigenvalues L² are a multiple of ℏ²:

\begin{align}
\hat{L}^2\,|\Psi\rangle &= L^2\,|\Psi\rangle \tag{19.22} \\
&= l\,(l + 1)\,\hbar^2\,|\Psi\rangle
\end{align}

The eigenvalue l (l + 1) ℏ² is determined by an integer or half-integer number
l, which we call angular momentum quantum number. This quantum
number can have the following values (for a purely orbital angular momentum
only the integer values occur; the half-integer values are realized by spin):

$$l = 0, \frac{1}{2}, 1, \frac{3}{2}, 2, \ldots \tag{19.23}$$

Consequently, the total angular momentum squared L² cannot take on any
continuous value, but only the following discrete values, which are determined
by the angular momentum quantum number:

$$L^2 = 0,\ 0.75\,\hbar^2,\ 2\,\hbar^2,\ 3.75\,\hbar^2,\ 6\,\hbar^2,\ \ldots \tag{19.24}$$

Let's take the square root of the magnitude square. The magnitude L of the
total angular momentum is quantized:

$$L = 0,\ \sqrt{0.75}\,\hbar,\ \sqrt{2}\,\hbar,\ \sqrt{3.75}\,\hbar,\ \sqrt{6}\,\hbar,\ \ldots \tag{19.25}$$

The eigenvalues Lz of the L̂z angular momentum operator are a
multiple of ℏ:

\begin{align}
\hat{L}_z\,|\Psi\rangle &= L_z\,|\Psi\rangle \tag{19.26} \\
&= m\,\hbar\,|\Psi\rangle
\end{align}

The quantum number m is called the magnetic quantum number and it can
only take on values between m = −l and m = l in steps of 1. The L̂z angular
momentum component can therefore not take on continuous values as in classical
physics: the L̂z angular momentum component is quantized.

Example: If a quantum particle has a total angular momentum squared
L² = 2(2 + 1)ℏ² = 6ℏ², represented by the angular momentum quantum number l = 2,
then its magnetic quantum number can take on the values m = −2, −1, 0, 1, 2 and
no others. The Lz angular momentum component of this quantum particle can
only have 5 possible values: Lz = −2ℏ, −1ℏ, 0, 1ℏ, 2ℏ.
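The allowed values can be listed systematically for any l. The sketch below uses exact fractions so that half-integer quantum numbers are handled without rounding; eigenvalues are given in units of ℏ² and ℏ, and the function names are illustrative.

```python
from fractions import Fraction

def l_squared_eigenvalue(l):
    """Eigenvalue of the L^2 operator in units of hbar^2: l(l + 1)."""
    l = Fraction(l)
    return l * (l + 1)

def magnetic_quantum_numbers(l):
    """All values m = -l, -l + 1, ..., l for an integer or half-integer l."""
    l = Fraction(l)
    if l < 0 or (2 * l).denominator != 1:
        raise ValueError("l must be a non-negative integer or half-integer")
    return [-l + k for k in range(int(2 * l) + 1)]

print(l_squared_eigenvalue(2))                    # L^2 in units of hbar^2
print([int(m) for m in magnetic_quantum_numbers(2)])
print(l_squared_eigenvalue(Fraction(1, 2)))
print(magnetic_quantum_numbers(Fraction(1, 2)))
```

For every l there are 2l + 1 possible values of m, which is why l = 2 gives exactly five possible values of Lz.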

Let's summarize what you should take away from Chapter 19:

• You now know how to quantize classical angular momentum.

• You know how to show that the angular momentum components are
Hermitian.

• You have learned that all angular momentum components are
subject to the uncertainty principle and why only one of the
components can be determined exactly.

• You have learned how to work with angular momentum
commutators.

• You know what the L̂² operator is good for.

• You know the possible eigenvalues of L̂z and L̂².


The End

If you enjoyed the book, it would immensely help me if you could leave a brief
review on Amazon with a rating. Even more important is that you send me
any mistakes, suggestions for improvement, or any unclear sections as
soon as possible to the email [email protected] so that I can address them
immediately.
May physics be with you!
Equations of Physics: Solve Every Physics Problem

• Contains over 500 illustrated formulas on only 140 pages

• Includes reference tables with examples and constants
• Easy to use, because without vectors and integrals
• Perfect for courses with basic physics
Get this book: en.fufaev.org/physics-equations
Independently Published

ISBN for paperback: 9798320488516

ISBN for premium hardcover: 9798320491899

Imprint

Alexander Fufaev
Peiner Straße 86
31137 Hildesheim, Germany
[email protected]
fufaev.org

Copyright © 2024 Alexander Fufaev


Published, March 2024
