Theoretical Physics For Students by Alexander Fufaev
Weekly assignments at university are not easy for beginners and require a lot
of time. The content is rushed through, so it's easy to find yourself not only
failing to understand many topics, but also struggling to keep up with the weekly
assignments. This often leads to not completing the required credits or failing
the exams at the end of the semester.
Having completed my M.Sc. in Physics, I know in retrospect exactly what I
would have needed in the theoretical physics modules to meet the academic
requirements and pass the exams. A fundamental, intuitive understanding of
the topics would have been essential, because often I didn't know what the
assignments were talking about. In some cases, I understood the assignment but
didn't know how to approach it. There were also assignments I could solve,
but I wasn't sure what I was calculating, or why.
If you work through this book from the first to the last chapter, you'll find it
much easier to master the assignments in theoretical physics and to pass
the exams.
May physics be with you!
What you’ll learn
I Mathematical Tools
1 Differential equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
1.1 What is a differential equation? . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
1.2 Different notations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
1.3 What should I do with a differential equation? . . . . . . . . . . . . . . 15
1.4 Recognize differential equations . . . . . . . . . . . . . . . . . . . . . . . . . . 15
1.5 Classify a differential equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
1.5.1 Ordinary or partial? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
1.5.2 Of what order? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
1.5.3 Linear or non-linear? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
1.5.4 Homogeneous or inhomogeneous? . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
1.6 Constraints . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
1.6.1 Initial conditions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
1.6.2 Boundary conditions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
2 Tensors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
2.1 Tensors of Zeroth and First Order . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
2.2 Second-order tensors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
2.3 Tensors of higher orders . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
2.4 Symmetric and antisymmetric tensors . . . . . . . . . . . . . . . . . . . . . 30
2.5 Combine tensors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
2.5.1 Addition of tensors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
2.5.2 Subtraction of tensors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
2.5.3 The outer product of tensors (tensor product) . . . . . . . . . . . . . . . . . . . 32
2.5.4 Contraction of tensors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
3 Dirac delta . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
3.1 Dirac delta in the coordinate origin . . . . . . . . . . . . . . . . . . . . . . . . 47
3.2 Shifted Dirac delta . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
3.3 Properties of the Dirac delta . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
3.4 Analogy to the Kronecker delta . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
3.5 Three-dimensional Dirac delta . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
4 Vector fields . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
5 Nabla operator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
5.1 Gradient . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
5.2 Divergence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
5.2.1 Positive divergence = source . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
5.2.2 Negative divergence = sink . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
5.2.3 Divergence-free vector field . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
5.3 Curl . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
8 Fourier Series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
8.1 The concept of Fourier series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
8.2 Fourier coefficients . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
8.3 Fourier basis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
8.4 Example: Fourier series for the sawtooth function . . . . . . . . . . . 87
II Nature is extreme
9 Action Functional . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
10 Euler-Lagrange Equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
10.1 Lagrange function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
10.2 How To: Euler-Lagrange equation . . . . . . . . . . . . . . . . . . . . . . . . . . 99
10.2.1 First step: Set generalized coordinates . . . . . . . . . . . . . . . . . . . . . . . . . 99
10.2.2 Second step: Set up the Lagrange function . . . . . . . . . . . . . . . . . . . . 99
10.2.3 Third step: Calculate derivatives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
10.2.4 Fourth step: Solve the differential equations . . . . . . . . . . . . . . . . . . . . 101
10.2.5 Fifth step: Set boundary conditions . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
III Electromagnetism
1. Differential equations
More: en.fufaev.org/differential-equations
If you plan to deal with...
the atomic world,
chemical processes,
electrical circuits,
weather forecasts
then you will eventually encounter differential equations. You will encounter
differential equations in every part of theoretical physics, so it is important to
understand how to work with them.
F = −D y (1.1)
This law describes the restoring force F on a mass attached to a spring. The
mass experiences this force when you deflect it from its rest position by the
distance y. D is a constant coefficient that describes how hard it is to stretch
or compress the spring.
The mass m is hidden in the force. We can write the force using Newton's
second law as m a:
m a = −D y (1.2)
If we simply move m to the other side, we can use this equation to calculate the
acceleration that the mass experiences for each displacement y:
a(t) = −(D/m) y(t)    (1.4)
Differential equations come into play for such questions about a system's future
behavior. We can easily show that the acceleration a is the second time derivative
of the distance covered, so in our case it is the second derivative of y with
respect to time t:
d²y(t)/dt² = −(D/m) y(t)    (1.5)
We have now set up a differential equation for the displacement y(t)! You
can recognize a differential equation by the fact that, in addition to the
unknown function y(t), it also contains derivatives of this function; in
this case, the second derivative of y with respect to time t.
You will often come across this in physics. We can also write it down a little
more compactly without mentioning the time dependence:
d²y/dt² = −(D/m) y    (1.7)
If the function y(t) only depends on the time t, then we can write the time
derivative even more compactly using Newton notation. A time derivative of
y corresponds to a dot above it: ẏ. With two derivatives, as in our case, there
are two dots above the function:

ÿ = −(D/m) y    (1.8)
Obviously, this notation is rather unsuitable for, say, the tenth derivative.
Another notation that you are more familiar with from mathematics is
Lagrange notation. Here, primes are used for the derivatives, so two primes
for the second derivative:

y′′ = −(D/m) y    (1.9)
With Lagrange notation, it should be clear from the context with respect to
which variable we are differentiating. If this is not clear, the variable on which
y depends must be mentioned explicitly:

y′′(t) = −(D/m) y(t)    (1.10)

d²x(t)/dt² = −(D/m) x(t)    (1.11)
For simple differential equations, such as that of the oscillating mass, there are
methods that you can use to obtain the unknown function y(t). Keep in mind,
however, that there is no general recipe for solving an arbitrary differential
equation. For some differential equations there is not even an analytical solution!
Non-analytical means that you cannot write down a concrete equation
y(t) = ... for the function.
d²y(t)/dt² = −(D/m) y(t)    (1.12)
c:
∂²E/∂x² + ∂²E/∂y² + ∂²E/∂z² = (1/c²) ∂²E/∂t²    (1.13)
By coupled we mean that, for example, the first differential equation for the
function x(t) also contains the function y(t). This means that we cannot
simply solve the first differential equation independently of the second
differential equation, because the second differential equation tells us how
y(t), which also appears in the first, behaves. The functions we are looking for
occur in all three differential equations, which means we have to solve all three
simultaneously.
After you have found out what the unknown function is and on which variables
it depends, you should answer a few more basic questions to choose the
appropriate solution method for the differential equation. We need to
classify the differential equation:
Is the differential equation ordinary or partial? Partial differential
equations describe multidimensional problems and are significantly more
complex.
Once you have classified a differential equation, you can use a suitable solution
method to solve the equation. Even if there is no specific solution method, the
classification tells you how complex a differential equation is.
Let's classify the differential equations for the oscillating spring, the wave
equation, a mass in the gravitational field, and the decay law.
d²y(t)/dt² + (D/m) y(t) = 0    (1.17)
∂²E/∂x² + ∂²E/∂y² + ∂²E/∂z² = (1/c²) ∂²E/∂t²    (1.18)
This is a partial differential equation. "Partial" means that the function you
are looking for depends on more than one variable.
Since the second derivative is the highest, and in fact the only, derivative in
the differential equation, the equation for the oscillating mass on the spring is
a 2nd-order differential equation.
1.5.2.1 How to reduce the order
It is always possible to convert a higher-order differential equation into a
system of 1st-order differential equations. Sometimes this procedure is
helpful when solving a differential equation.
For example, we can convert the differential equation 1.19 for the oscillating
mass on the spring (a second-order differential equation) into two coupled first-
order differential equations. All we have to do is introduce a new function,
let's call it v, and define it as the first time derivative of the unknown function
y:
v = dy/dt    (1.20)
This is already one of the two coupled first-order differential equations. Now
we just have to express the second derivative in the original differential equation
with the derivative of v. Then we get the second 1st-order differential equation.
v − dy/dt = 0    (1.21)

dv/dt + (D/m) y = 0    (1.22)
The two differential equations are coupled. We must therefore solve them
simultaneously. They are coupled because y occurs in the first equation and v
in the second equation.
You can always proceed in this way if you want to reduce the order of a
differential equation. The price you have to pay is additional coupled
differential equations.
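This reduction to a first-order system is exactly the form that numerical solvers work with. As a sketch (not from the book; D, m, the time step and the initial conditions are illustrative values), we can integrate the coupled system with the explicit Euler method and compare against the known analytic solution y(t) = cos(√(D/m) t) for y(0) = 1, v(0) = 0:

```python
import math

# Sketch: explicit Euler integration of the coupled first-order system
#   dy/dt = v,   dv/dt = -(D/m) * y
# D, m, dt and the initial conditions are illustrative values.
D, m = 4.0, 1.0          # spring constant and mass (assumed)
y, v = 1.0, 0.0          # initial displacement and velocity
dt, t = 1e-4, 0.0        # time step and start time

while t < 1.0:
    # update y and v simultaneously from the old values
    y, v = y + dt * v, v - dt * (D / m) * y
    t += dt

# Compare with the analytic solution y(t) = cos(omega * t), omega = sqrt(D/m)
omega = math.sqrt(D / m)
print(abs(y - math.cos(omega * t)) < 1e-2)   # True: Euler stays close for small dt
```

A smaller time step reduces the error; for serious work one would reach for a higher-order scheme such as Runge-Kutta.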
Let's continue. Of what order is the differential equation for the decay
law?

−λ N = dN/dt    (1.23)
This is a 1st-order differential equation, because the highest occurring
derivative of the function N(t) is the first derivative.
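Because the right-hand side is proportional to N itself, the solution is an exponential, N(t) = N0 e^(−λt). A quick numerical sanity check (λ, N0 and the evaluation point are arbitrary illustrative values, not from the book) confirms that this exponential satisfies −λN = dN/dt:

```python
import math

# Check that N(t) = N0 * exp(-lam * t) satisfies dN/dt = -lam * N(t)
# using a central finite difference. lam, N0, t, h are arbitrary.
lam, N0, t, h = 0.5, 100.0, 2.0, 1e-6

def N(t):
    return N0 * math.exp(-lam * t)

dN_dt = (N(t + h) - N(t - h)) / (2 * h)   # numerical derivative
print(abs(dN_dt - (-lam * N(t))) < 1e-4)  # True
```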
Linear means that the unknown function and its derivatives only occur
to the power of 1, and there are no products of derivatives with the
function, such as y² or y ÿ. There are also no compositions of functions,
such as sin(y(t)) or √(y(t)). For products, compositions and higher powers,
the differential equation is non-linear.
Note! The power of two in the second derivative in the Leibniz notation
d²y/dt² is not a power of the derivative, but merely a notation for the
second derivative.
The decay law is also linear:

−λ N¹ = dN¹/dt    (1.25)
The coupled differential equation system for the motion of a mass in the
gravitational field, on the other hand, is non-linear, because higher powers of
the functions x(t), y(t) and z(t) occur there, namely x², y² and z².
d²y¹/dt² + (D/m) y¹ = 0    (1.27)
∂²E¹/∂x² + ∂²E¹/∂y² + ∂²E¹/∂z² = (1/c²) ∂²E¹/∂t²    (1.28)
Here, the external force F(t) corresponds to the perturbation function. As you
can see, it stands alone, without being multiplied by the function y(t) or its
derivatives. In addition, the perturbation function is time-dependent, so it is a
non-constant coefficient.
1.6 Constraints
A differential equation alone is not sufficient to describe a physical system
uniquely. The solution of a differential equation describes many possible
systems that exhibit a certain behavior.
For example, the solution N(t) of the decay law describes an exponential
behavior. However, the knowledge of an exponential behavior alone is not
sufficient to say specifically how many nuclei N(t = 10) remain after
10 seconds.
For this very reason, every differential equation usually comes with constraints.
These are additional pieces of information that must be given for a differential
equation in order to determine its unique solution. The number of necessary
constraints depends on the order of the equation.
For a 4th-order equation, four constraints would be necessary, and so on.
We can therefore state: in order to uniquely determine the solution of an
nth-order differential equation, n constraints are necessary.
Most of the time you will come across initial conditions and boundary
conditions. These are also just names for constraints that tell you what kind
of information you have about the system.
A differential equation together with initial conditions is called an
initial value problem. If we solve the initial value problem, we can use the
solution to predict the future behavior of a system.
You have now learned the most important basics of differential equations. This
knowledge will help you as an undergrad.
2. Tensors
More: en.fufaev.org/tensors
Before we fully understand tensors in their general definition, let's first get to
know them from an engineering perspective. As long as you understand
scalars, vectors, and matrices, you'll find it easy to understand tensors from
this perspective, because tensors are nothing but a generalization of
scalars, vectors, and matrices. Just as we use scalars, vectors, and
matrices to describe physical laws, we can use tensors to describe physics.
Tensors are an even more powerful tool with which we can describe physics
that cannot be described solely with scalars, vectors, and matrices. In order to
develop the modern theory of gravitation, Albert Einstein first had to
understand the concept of tensors. Only then could he mathematically
formulate the general theory of relativity.
    [ j1 ]
j = [ j2 ]    (2.1)
    [ j3 ]
In Eq. 2.1 we have represented the first-order tensor as a column vector. Of
course, we can also represent it as a row vector:
j = [j1 , j2 , j3 ] (2.2)
At this stage, it doesn't matter how we write down the components. But
remember that it will play a role later!
The notation of first-order tensors as row or column vectors only makes sense if
we are working with concrete numbers, such as in computational physics, where
we use tensors to obtain concrete numbers. To calculate theoretically, for
example to derive equations or simply to formulate a physical theory, tensors
are written compactly in index notation. You are probably already familiar
with this from vector calculus. Instead of writing out all three components of
the first-order tensor, we write them with an index k, that is, jk. What we
call the index does not matter. jk stands for the first component j1, the second
j2 or the third j3, depending on what we actually insert for the index k. In
theoretical physics, we usually do not insert anything specific, because we want
to write the physics as generally and compactly as possible.
You have certainly already encountered this tensor in mathematics in its
matrix representation. In three-dimensional space, the second-order tensor is
a 3x3 matrix:

    [ σ11 σ12 σ13 ]
σ = [ σ21 σ22 σ23 ]    (2.3)
    [ σ31 σ32 σ33 ]
We also use index notation for the second-order tensor and denote the components
of the matrix by σmk, for example. The indices m and k can take the values
1, 2 or 3. The index m specifies the row and the index k specifies the column.
Unfortunately, most tensors are neither symmetric nor antisymmetric. But the
great thing is: mathematically, we can split every tensor t into a symmetric
part s and an antisymmetric part a: t = s + a.
Let's take a look at how we can practically decompose a general second-order
tensor tij into the two parts.
1. The symmetric part sij of the tensor tij is sij = (1/2)(tij + tji). Here we
have swapped the two indices and added the two tensors together. The
factor 1/2 is important because we have counted the tensor twice.
2. The antisymmetric part aij of the tensor tij is aij = (1/2)(tij − tji). Here
we have swapped the two indices, so the swapped tensor gets a minus sign.
Then we add the two tensors together. The factor 1/2 is also important here.
3. We then add the symmetric and antisymmetric parts together to
obtain the total tensor:

tij = (1/2)(tij + tji) + (1/2)(tij − tji)    (2.6)
The first term in Eq. 2.6 is the symmetric part of the tensor tij and the second
term is the antisymmetric part.
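For concrete numbers, this decomposition is a one-liner with matrices. A sketch (not from the book; the sample entries are arbitrary):

```python
import numpy as np

# Decompose a second-order tensor (3x3 matrix) into its symmetric
# and antisymmetric parts, as in Eq. 2.6. Sample entries are arbitrary.
t = np.array([[1., 2., 3.],
              [4., 5., 6.],
              [7., 8., 9.]])

s = 0.5 * (t + t.T)   # symmetric part:      s_ij = (t_ij + t_ji) / 2
a = 0.5 * (t - t.T)   # antisymmetric part:  a_ij = (t_ij - t_ji) / 2

print(np.allclose(s, s.T))     # True: s is symmetric
print(np.allclose(a, -a.T))    # True: a is antisymmetric
print(np.allclose(s + a, t))   # True: the parts sum back to t
```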
The result is a new tensor cij of the same order. If we represent the tensors a
and b as matrices, then adding tensors is nothing other than adding matrices
component by component. The component a11 of the matrix a in the first row
and first column is added to the component b11 of the matrix b, which sits in
the same row and column. This is how matrix addition works. We proceed in
the same way with all other components. The result is the matrix c:
[ a11 a12 a13 ]   [ b11 b12 b13 ]   [ a11 + b11  a12 + b12  a13 + b13 ]
[ a21 a22 a23 ] + [ b21 b22 b23 ] = [ a21 + b21  a22 + b22  a23 + b23 ]    (2.8)
[ a31 a32 a33 ]   [ b31 b32 b33 ]   [ a31 + b31  a32 + b32  a33 + b33 ]
The result is a new tensor cij of the same order. Subtraction works in the same
way as addition: simply replace the plus sign in Eq. 2.9 with a minus sign.
If we form the tensor product 2.10 of second-order tensors, then the result cijkm
is a fourth-order tensor. If, on the other hand, we form the tensor product
of first-order tensors ai and bk, then the result is a second-order tensor:
cik = ai ⊗ bk = ai bk (2.11)
This is how the tensor product works with two tensors of any order. The only
exceptions are zero-order tensors. In this case, the result remains a zero-order
tensor. The tensor sign ⊗ is usually omitted in 2.10 and 2.11.
Let's look at a concrete example of the tensor product that we can illustrate
well, namely the tensor product of first-order tensors as in Eq. 2.11. They are
represented by the vectors a = [a1, a2, a3] and b = [b1, b2, b3]. The result is a
second-order tensor represented by a matrix:
[ a1 ]   [ b1 ]   [ a1 b1  a1 b2  a1 b3 ]
[ a2 ] ⊗ [ b2 ] = [ a2 b1  a2 b2  a2 b3 ]    (2.12)
[ a3 ]   [ b3 ]   [ a3 b1  a3 b2  a3 b3 ]
The first index, i, numbers the rows of the matrix by definition, and the
second index, k, numbers the columns.
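For concrete vectors, the outer product in Eq. 2.12 can be checked directly; np.outer and an explicit index-notation einsum give the same matrix. A sketch (the sample values are arbitrary, not from the book):

```python
import numpy as np

# Tensor (outer) product of two first-order tensors, as in Eq. 2.12:
# c_ik = a_i * b_k gives a second-order tensor (3x3 matrix).
a = np.array([1., 2., 3.])
b = np.array([4., 5., 6.])

c = np.outer(a, b)                          # matrix form of a ⊗ b
c_index = np.einsum('i,k->ik', a, b)        # the same in index notation

print(np.allclose(c, c_index))   # True
print(c[0, 2])                   # a1 * b3 = 1 * 6 = 6.0
```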
The tensor product does not necessarily have to be between two tensors of the
same order. For example, we can also form the tensor product of the third-
order tensor Aijm and the second-order tensor Bkn. The result is a fifth-order
tensor Cijmkn:

Cijmkn = Aijm Bkn    (2.13)
As you have probably already noticed, Bkn specifically represents
the kn-th component of the tensor B, and Aijm is the ijm-th component of
the tensor A. If we form the tensor product as in 2.13, then this is the tensor
product of the components. The result is the ijmkn-th component of the tensor
C. When we write a tensor with indices, we always mean its components.
Nevertheless, we casually say tensor for its component notation.
The tensor product naturally works in the same way with superscript indices,
which we will get to know in the next chapter. If the indices ij are at the top
of the tensor A, then they must also be at the top of the resulting tensor C :
1. We select two of its indices, for example the indices i and m: tijmk.
2. Then we set the two indices equal: i = m. For example, we can call them
both i: tijik.
3. Finally, we sum over the repeated index: Σᵢ tijik = t1j1k + t2j2k + t3j3k,
where i runs from 1 to 3.
In physics, we use the Einstein summation convention, which states that we can
omit the sum sign in Σᵢ tijik to simplify the notation if two identical indices
appear in a tensor. With the tensor tijik in combination with the Einstein
summation convention, summation is therefore performed over the index i:
If we contract a second-order tensor, tii, then the contraction is also called the
trace of the tensor:
Of course, we can also contract the indices of two different tensors. For example,
let's take a tensor Mij and a tensor vk. The tensor product Mij vk without
contraction results in a third-order tensor. Now we contract the indices j and k.
In the matrix and vector representation, this corresponds exactly to the
multiplication of a matrix M with a vector v. The result ui is a first-order
tensor, that is, a vector:
Mij vj = ui    (2.17)
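This contraction can be written out explicitly with einsum, which mirrors Mij vj = ui literally. A sketch (sample values are arbitrary, not from the book):

```python
import numpy as np

# Contraction of the tensor product M_ij v_k over j and k, Eq. 2.17:
# M_ij v_j = u_i, which is just matrix-vector multiplication.
M = np.array([[1., 2., 3.],
              [4., 5., 6.],
              [7., 8., 9.]])
v = np.array([1., 0., 2.])

u_einsum = np.einsum('ij,j->i', M, v)   # explicit index contraction
u_matmul = M @ v                        # the same as a matrix-vector product

print(np.allclose(u_einsum, u_matmul))  # True
print(u_einsum)                         # [ 7. 16. 25.]
```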
The Kronecker delta δij is equal to 1 if the indices i and j are equal, and equal
to 0 if i and j are not equal:

δij = { 1,  i = j
      { 0,  i ≠ j        (2.18)
aδ33 = a · 1 = a
δ23 δ22 = 0
Also note that, unless otherwise stated, we use the Einstein summation
convention we learned earlier. The same index is used for summation:
∂i fi ̸= fi ∂i
Let's look at four useful rules for the Kronecker delta that you can always use
when summing over double indices.
This should make it clear that the order of contraction of the Kronecker delta
is irrelevant.
We can also apply this rule to the contraction of the Kronecker delta with
another tensor, here ai :
ai δij = aj (2.23)
a = [a1 , a2 , a3 ] (2.25)
= a1 ê1 + a2 ê2 + a3 ê3 (2.26)
= ai êi (2.27)
Here, ê1, ê2 and ê3 are three basis vectors that are normalized and
orthogonal to each other. In this case, they span an orthogonal
three-dimensional coordinate system. For the third equals sign, we have used
the Einstein summation convention and represented the vector a in index
notation: ai êi.
Let's now take another vector b and also represent it in index notation: bj êj.
Note that we have to name the indices of the two vectors differently.
Now we form the scalar product of the two vectors:
We can sort the objects in index notation in Eq. 2.28 as we like. Let's take
advantage of this and put brackets around the basis vectors to emphasize their
importance when introducing the Kronecker delta:
Thus we have converted the scalar product a · b of the two vectors into the scalar
product of the basis vectors êi · êj. The basis vectors are orthonormal (i.e.
pairwise orthogonal and normalized). Let's recall what orthonormality means
for two vectors:
The scalar product êi · êj equals 1 if i and j are equal. In this case, it is the
same vector.
The scalar product êi · êj equals 0 if i and j are not equal. In this case,
there are two different basis vectors and they are orthogonal to each other.
Doesn't this property sound familiar? The scalar product 2.30 of two
orthonormal vectors behaves exactly like the definition 2.18 of the Kronecker
delta! Therefore, you may replace the scalar product between two basis vectors
with the Kronecker delta:
a · b = ai bj δij (2.32)
If you remember the rules for contraction, we can contract one of the
summation indices i or j in 2.32. For example, let's contract (eliminate) j.
We get the scalar product in index notation:
a · b = ai bi    (2.33)
And Eq. 2.33 is exactly the definition of the scalar product, where the vector
components are multiplied pairwise and summed.
Now you know how the scalar product is written in index notation and what role
the Kronecker delta plays. It represents the scalar product of the basis vectors:
êi · êj = δij .
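We can verify numerically that contracting with the Kronecker delta (represented by the identity matrix) reproduces the ordinary dot product. A sketch (the sample vectors are arbitrary, not from the book):

```python
import numpy as np

# Scalar product in index notation: a·b = a_i b_j δ_ij = a_i b_i
# (Eqs. 2.32 and 2.33). The identity matrix plays the role of δ_ij.
a = np.array([1., 2., 3.])
b = np.array([4., 5., 6.])
delta = np.eye(3)                          # δ_ij

lhs = np.einsum('i,j,ij->', a, b, delta)   # a_i b_j δ_ij
rhs = np.einsum('i,i->', a, b)             # a_i b_i, i.e. np.dot(a, b)

print(lhs == rhs == 32.0)   # True: 1*4 + 2*5 + 3*6 = 32
```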
Why? Orthonormal vectors result in either 1 or 0, just like
the Kronecker delta.
You can keep the following in mind: as soon as you discover an expression
in the index notation of an equation that results in either 0 or 1 depending on
the indices, replace it with the Kronecker delta and use the Kronecker delta rules
above to simplify the equations further or to represent them in index notation.
With the Levi-Civita symbol, which is sometimes also called the epsilon tensor,
you can easily transform and simplify complicated vector equations, such as
multiple cross products, and represent equations more compactly.
The Levi-Civita symbol εijk is notated with a small Greek epsilon that has three
indices i, j and k. The Levi-Civita symbol can take on three different values:
+1, 0 or −1. When does it take on which value? That depends on how the indices
ijk are arranged in relation to the original order. What does that mean exactly?
Let's take a closer look. You can permute (swap) the indices ijk. We can permute
the indices in two ways.
In an even (cyclic) permutation, all indices ijk are rotated clockwise or
anti-clockwise. With this permutation, all indices change their position.
For example: even permutations of ijk are jki and kij.
In an odd permutation, two indices are swapped with each other. In this
permutation, only two of the three indices ijk change position. For example:
An odd permutation of ijk is jik . Here, i and j have been swapped, while
k has remained in the same place.
Another odd permutation of ijk is kji. Here, i and k have been swapped,
while j has remained in the same place.
And the last possible odd permutation of ijk is ikj . Here, i has been left
as it is, while j and k have been swapped.
With this knowledge, you will be able to understand the definition of the Levi-
Civita symbol. The permutations refer to a starting position of the indices.
Here we assume (ijk) = (123) as the starting position. Then the Levi-Civita
symbol is defined as follows: εijk = +1 if ijk is an even permutation of 123,
εijk = −1 if ijk is an odd permutation of 123, and εijk = 0 if at least two
indices are equal. Some examples:
ε313 = 0, since the first and third indices are the same.
ε123 + ε213 = 1 + (−1) = 0, since the indices of the first epsilon are in the
starting position and the indices of the second epsilon are an odd permutation
of it.
ε123 ε231 = 1 · 1 = 1, since the indices of the first epsilon are in the starting
position and the indices of the second epsilon have been permuted
cyclically.
The cross product, written using the orthonormal basis vectors ê1, ê2 and ê3:

a × b = (a2 b3 − a3 b2) ê1 + (a3 b1 − a1 b3) ê2 + (a1 b2 − a2 b1) ê3
Take a look at the indices in Eq. 2.37. All three indices i, j and k occur twice.
Here we have used the Einstein summation convention, so we sum over
duplicate indices. If we insert concrete values for the indices in 2.37, we get
exactly the first (i = 1), second (i = 2) or third (i = 3) component of the cross
product. But Eq. 2.37 is not only a compact notation for the cross product, it
is also a clever one, with which we can easily derive relations for the scalar
triple product and the double cross product.
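We can build the Levi-Civita symbol as a 3x3x3 array and verify that contracting it with two vectors reproduces the cross product. A sketch (not from the book; the sample vectors are arbitrary):

```python
import numpy as np

# Cross product via the Levi-Civita symbol: (a x b)_k = ε_ijk a_i b_j.
eps = np.zeros((3, 3, 3))
eps[0, 1, 2] = eps[1, 2, 0] = eps[2, 0, 1] = 1    # even permutations of (123)
eps[2, 1, 0] = eps[0, 2, 1] = eps[1, 0, 2] = -1   # odd permutations

a = np.array([1., 2., 3.])
b = np.array([4., 5., 6.])

cross = np.einsum('ijk,i,j->k', eps, a, b)        # ε_ijk a_i b_j
print(np.allclose(cross, np.cross(a, b)))         # True
```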
For fun, write out the double cross product a × (b × c) in vector notation
and then write it out in index notation. And prove the following BAC-CAB
rule with one and the other method:

a × (b × c) = b (a · c) − c (a · b)    (2.38)
You will be grateful to have learned about the Levi-Civita symbol, as you will
encounter it regularly during your undergraduate and master's studies.
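Before proving the rule by hand, a numerical spot-check of the standard BAC-CAB identity a × (b × c) = b(a · c) − c(a · b) can build confidence. A sketch (the sample vectors are arbitrary, not from the book):

```python
import numpy as np

# Numerical check of the BAC-CAB rule: a x (b x c) = b(a·c) - c(a·b).
a = np.array([1., -2., 3.])
b = np.array([0., 4., -1.])
c = np.array([2., 1., 5.])

lhs = np.cross(a, np.cross(b, c))
rhs = b * np.dot(a, c) - c * np.dot(a, b)
print(np.allclose(lhs, rhs))   # True
```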
3. Dirac delta
More: en.fufaev.org/dirac-delta
The Dirac delta δ(x) (sometimes also called the Dirac delta function, although
it is not a function) is a useful mathematical object that is used in many areas
of theoretical physics, from the description of electric point charges as a charge
density concentrated in a single point in electrodynamics, to the description of
quantum fields as operators in quantum field theory.
To calculate how large the total charge Q on this line is, we must integrate
(sum up) the charge density ρ(x) along the line. Let's assume that the charge
density is smeared out on the line from x = a to x = b. These are our integration
limits. The total charge is therefore calculated as follows:

∫_a^b ρ(x) dx = Q    (3.1)
But what if Q is not a smeared charge, but a charge localized at a single point?
The charge density ρ(x) must fulfill two properties if it is to describe a single
point charge:
The integral 3.1 over the charge density must give us the value Q if the
point charge lies within the integration limits x = a and x = b.
The charge density must vanish everywhere except at the location of the
point charge:

δ(x) = 0,  x ≠ 0    (3.2)
Even though the name delta function may suggest that it is a function, δ(x)
is mathematically not a function, but another kind of mathematical object that
can be understood as a distribution or as a measure. Let us therefore continue
to call δ(x) the Dirac delta so as not to upset the mathematicians.
The Dirac delta is graphically illustrated with an arrow located at the
position x = 0 of the unit point charge Q = 1. The height of the arrow is usually
chosen so that it represents the value of the integral, in this case Q = 1.
Such an integral is very easy to calculate, because due to the property 3.3 the
Dirac delta is zero everywhere except at the point x = 0. This means that the
product f(x) δ(x) is also zero everywhere except at the point x = 0. In the
integral 3.4, only the function value f(0) remains. Since f(0) no longer depends
on x, we can move this constant in front of the integral:
∫_a^b f(x) δ(x) dx = f(0) ∫_a^b δ(x) dx    (3.5)
The integral over the Dirac delta results in 1 if x = 0 lies between a and b
(otherwise the integral is zero). This is exactly the property of the Dirac delta.
So we know what the Dirac delta does in the integral 3.4 when multiplied by
a function f(x): the Dirac delta picks out the value of the function at the
origin x = 0:
∫_a^b f(x) δ(x) dx = f(0)    (3.6)
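The sifting property can be illustrated numerically by replacing δ(x) with a narrow normalized Gaussian and integrating it against an arbitrary test function; as the width shrinks, the integral approaches f(0). A sketch (the test function, width and grid are arbitrary choices, not from the book):

```python
import math

def delta_approx(x, eps=1e-3):
    # Normalized Gaussian that tends to the Dirac delta as eps -> 0
    return math.exp(-x**2 / (2 * eps**2)) / (eps * math.sqrt(2 * math.pi))

def f(x):
    return math.cos(x) + x**3          # arbitrary smooth test function

# Midpoint-rule integration of f(x) * delta_approx(x) over [-1, 1]
n, a, b = 40001, -1.0, 1.0
h = (b - a) / n
integral = sum(f(a + (i + 0.5) * h) * delta_approx(a + (i + 0.5) * h)
               for i in range(n)) * h

print(abs(integral - f(0.0)) < 1e-3)   # True: the integral picks out f(0) = 1
```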
Even with a shifted charge, the integral over the Dirac delta is equal to 1 if
the charge at x = x0 lies between the integration limits. We have only shifted
the Dirac delta to x0, so the value of the integral with δ(x − x0) is the same as
in the case of δ(x):
∫_a^b δ(x − x0) dx = 1,  a < x0 < b    (3.7)
What happens if the shifted Dirac delta is multiplied by another function f(x)
in the integral? δ(x − x0) is zero everywhere except at the point x0. This means:
the shifted Dirac delta δ(x − x0) in the integral picks out the function value
f(x0) at the point where the Dirac delta is located:
∫_a^b f(x) δ(x − x0) dx = f(x0),   a < x0 < b   (3.8)
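If you want to check the picking (sifting) property 3.8 numerically, you can approximate the Dirac delta by a very narrow, normalized Gaussian. The following Python sketch is only an illustration; the function f, the position x0 and the width eps are arbitrary example choices:

```python
import math

# Approximate the Dirac delta by a narrow, normalized Gaussian.
def delta_approx(x, eps=1e-3):
    return math.exp(-x**2 / (2 * eps**2)) / (eps * math.sqrt(2 * math.pi))

# Simple midpoint-rule integration
def integrate(g, a, b, n=20000):
    h = (b - a) / n
    return sum(g(a + (i + 0.5) * h) for i in range(n)) * h

f = lambda x: x**2 + 1.0   # arbitrary example function
x0 = 0.5                   # position of the shifted Dirac delta

# Sifting property 3.8: the integral picks out the value f(x0)
val = integrate(lambda x: f(x) * delta_approx(x - x0), 0.0, 1.0)
print(val)  # close to f(0.5) = 1.25
```

The smaller you make eps, the closer the integral comes to the exact value f(x0).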
Σ_i fi δij = fj   (3.11)
And now compare that with what the Dirac delta does in the integral:
∫_a^b f(i) δ(i − j) di = f(j)   (3.12)
While we can use the Kronecker delta δij to pick a vector component fj from a
finite number of vector components fi, we can use the Dirac delta δ(i − j) to
pick a function value f(j) from an infinite number of function values f(i).
The Kronecker delta δij is used when we are dealing with vectors f and
their finite number of vector components fi.
The delta function δ(i − j) is used when we are dealing with functions f
and their infinite number of function values f(i).
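The component-picking in 3.11 can be made concrete in a few lines of Python; the component values below are arbitrary examples:

```python
# Kronecker delta: picks the component f_j from finitely many components f_i (Eq. 3.11)
def kronecker(i, j):
    return 1 if i == j else 0

f_components = [4.0, 7.0, 1.0, 9.0]   # arbitrary example components f_0 ... f_3
j = 2

picked = sum(f_components[i] * kronecker(i, j) for i in range(len(f_components)))
print(picked)  # 1.0, i.e. the component f_2
```

The Dirac delta does the same job in the integral 3.12, only for the infinitely many "components" f(i) of a function.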
spatial axes: (x, y, z). Fortunately, the generalization of the Dirac delta to
three-dimensional space is quite simple.
If our unit charge Q = 1 is located in the coordinate origin (x, y, z) = (0, 0, 0),
then we can describe the corresponding density, that is, the three-dimensional
Dirac delta, with the product of three one-dimensional Dirac deltas δ(x), δ(y)
and δ(z):
∫_V δ(x) δ(y) δ(z) dx dy dz = 1,   (0, 0, 0) ∈ V   (3.14)
To avoid having to write three Dirac deltas, we combine them into one Dirac
delta with a superscript that specifies the spatial dimension. In the
argument of the Dirac delta, we write the position vector r = (x, y, z):

δ³(r) = δ(x) δ(y) δ(z)

The Dirac delta shifted to the location r0 = (x0, y0, z0) then looks as follows:

δ³(r − r0) = δ(x − x0) δ(y − y0) δ(z − z0)
With the knowledge of the Dirac delta, we can theoretically describe density
singularities (for example point charges and black holes).
4. Vector fields
F(x, y, z) = (Fx(x, y, z), Fy(x, y, z), Fz(x, y, z))   (4.1)

Here, Fx(x, y, z), Fy(x, y, z) and Fz(x, y, z) are three scalar functions and they
represent the three components of the vector function F. Sometimes we also
write briefly: F(r) = F(x, y, z), where r = (x, y, z) is the position vector.
Each point (x, y) is assigned a vector F(x, y). For example, at the point (x, y) =
(1, 1) the vector looks like this: F(1, 1) = (7, 5). Simply insert x = 1 and
y = 1 into the vector field 4.2 to get this example vector. If you insert a large
number of locations in this way, you will get the graphical representation of the
vector field 4.2. ■
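If you want to produce such a table of vectors yourself, you can evaluate a vector field on a grid of points. Since the field 4.2 itself is not reproduced here, the following Python sketch uses a hypothetical stand-in field, chosen only so that it also satisfies F(1, 1) = (7, 5):

```python
# Hypothetical stand-in for the vector field 4.2, chosen so that F(1, 1) = (7, 5).
def F(x, y):
    return (7 * x * y, 5 * y)

print(F(1, 1))  # (7, 5)

# Evaluating the field at many locations yields the data for an arrow plot:
samples = {(x, y): F(x, y) for x in range(-2, 3) for y in range(-2, 3)}
print(samples[(2, 1)])  # (14, 5)
```

Each dictionary entry corresponds to one arrow of the graphical representation: the key is the location, the value is the vector attached there.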
5. Nabla operator
More: en.fufaev.org/nabla-operator
We will encounter the nabla operator ∇ (inverted large delta) in every
branch of theoretical physics when it comes to multidimensional derivatives,
especially in electrodynamics when we get to know Maxwell's equations. The
three-dimensional Nabla operator is notationally similar to a vector and
looks like this in three-dimensional space when we express it with Cartesian
coordinates (x, y, z):
∇ = (∂x, ∂y, ∂z)   (5.1)
The three components of the nabla operator are partial derivatives with
respect to x, y and z. We have notated the partial derivatives more compactly
with ∂x instead of ∂/∂x. This notation is common in theoretical physics. The
single derivatives ∂x, ∂y and ∂z are called differential operators. You can apply
a differential operator to a function f. The result is the derivative of the function.
For example: ∂x f = ∂f/∂x.
We can apply the nabla operator in three different ways to a scalar function f
or to a vector field F:
5.1 Gradient
More: en.fufaev.org/gradient
Let's take a look at the first application of the nabla operator in the form of
the gradient ∇f of a scalar function f. Here we apply the nabla operator
∇ to the function f. We will encounter the gradient in Maxwell's equations, for
example:
∇f(x, y, z) = (∂x f(x, y, z), ∂y f(x, y, z), ∂z f(x, y, z))   (5.2)
The one-dimensional nabla operator ∇1d has only one component. Applied
to a one-dimensional function f(x), the gradient is simply the derivative of
the function: ∇1d f(x) = ∂x f(x).

The resulting vector ∇f(x, y, z) points at each location (x, y, z) in the direction
of the steepest ascent of the function f(x, y, z). For example, look at the following
plot of a two-dimensional scalar function f(x, y):
At each location on the green function, imagine a vector that shows you the
direction of the steepest ascent or descent at that location.
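You can approximate such a gradient numerically with central difference quotients. In this Python sketch the scalar function f is an arbitrary example:

```python
# Numerical gradient of an example scalar function f(x, y, z)
def f(x, y, z):
    return x**2 + 3 * y + z

def grad(f, x, y, z, h=1e-6):
    # Central difference quotients approximate (∂x f, ∂y f, ∂z f)
    return (
        (f(x + h, y, z) - f(x - h, y, z)) / (2 * h),
        (f(x, y + h, z) - f(x, y - h, z)) / (2 * h),
        (f(x, y, z + h) - f(x, y, z - h)) / (2 * h),
    )

g = grad(f, 1.0, 2.0, 3.0)
print(g)  # approximately (2, 3, 1): the direction of steepest ascent at (1, 2, 3)
```

The exact gradient of this example is ∇f = (2x, 3, 1), so at (1, 2, 3) the arrow (2, 3, 1) comes out, up to tiny rounding errors.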
5.2 Divergence
More: en.fufaev.org/divergence
Let's look at the second application of the nabla operator, namely the
divergence ∇ · F of a vector field F. Here we apply the nabla operator ∇
to the vector-valued function F(x, y, z). Just like the gradient ∇f, we will also
encounter the divergence in Maxwell's equations, for example.
For the divergence, we form the scalar product ∇ · F between the nabla
operator and the vector field F:
∇ · F(x, y, z) = (∂x, ∂y, ∂z) · (Fx(x, y, z), Fy(x, y, z), Fz(x, y, z))   (5.6)
              = ∂x Fx + ∂y Fy + ∂z Fz
In the last step, we omitted the arguments for more compactness. The result
∇ · F of the scalar product is a scalar function of the three coordinates. By
forming the gradient, a vector field was generated from a scalar function. And
by forming the divergence, a scalar function is generated from a vector
field. So the other way around!
Why do we call the location a source? If we were to enclose this location
in an imaginary cube, then the vector field would mainly point out of the cube.
You can visualize the source as a hole from which the water comes and leaves
the surface of the cube. Even though we can use the divergence to describe a
water source, in this book we use the divergence to describe electric charges. In
this case, the vector field F corresponds to the electric field F = E. Then the
source at the location (x, y, z) represents a positive electric charge.
If we enclose the location with an imaginary cube, then the vector field flows
into the surface. We can imagine the sink as a hole into which the water flows.
To do this, the water must flow into the cube. If we assume that the vector field
is an electric field, F = E, then the sink at the location (x, y, z) corresponds to
a negative electric charge.
■ Example 5.3 — Sink of a vector field. Let's take a look at the following vector
field:

F(x, y, z) = (2x, y, −4z)   (5.9)

The vector field under consideration has a constant, negative divergence at every
location (x, y, z): ∇ · F = 2 + 1 − 4 = −1. This means that no matter which
location is inserted for (x, y, z), each location has a negative divergence with the
value −1. Sinks of the vector field 5.9 are distributed everywhere. If the vector
field were an electric field F = E, then this result would mean that a negative
electric charge is smeared everywhere in space.
■
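You can check a divergence like this numerically with central difference quotients. The field below is an example with constant divergence −1 at every location (its concrete components are one possible choice):

```python
# Example field with constant divergence −1 everywhere:
# ∂x(2x) + ∂y(y) + ∂z(−4z) = 2 + 1 − 4 = −1
def F(x, y, z):
    return (2 * x, y, -4 * z)

def divergence(F, x, y, z, h=1e-6):
    # ∇ · F = ∂x Fx + ∂y Fy + ∂z Fz via central difference quotients
    return ((F(x + h, y, z)[0] - F(x - h, y, z)[0]) / (2 * h)
          + (F(x, y + h, z)[1] - F(x, y - h, z)[1]) / (2 * h)
          + (F(x, y, z - h)[2] - F(x, y, z - h)[2] + F(x, y, z + h)[2] - F(x, y, z - h)[2]) / (2 * h))

d0 = divergence(F, 0.0, 0.0, 0.0)
d1 = divergence(F, 1.0, 2.0, 3.0)
print(d0, d1)  # both approximately −1
```

No matter which location you insert, the value −1 comes out: sinks are smeared over all of space.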
We can imagine this as if the cube enclosed both a source (e.g. a water source)
and a sink (e.g. a drain), so that the amount of water flowing in and the amount
flowing out cancel each other out. If we interpret the vector field as the electric
field F = E, then there could be an electric dipole at the considered divergence-free
location. It consists of a positive (source) and a negative (sink) charge.
■ Example 5.4 — Divergence-free location. Consider the vector field
F(x, y, z) = (−2x, 0.5y², 0.5z²) at the location (1, 1, 1). Its divergence is

∇ · F = −2 + y + z

∇ · F(1, 1, 1) = −2 + 1 + 1 = 0   (5.13)

The divergence of the vector field at this location is zero. There is therefore
neither a source nor a sink nor an ideal electric dipole at the location (1, 1, 1). ■
5.3 Curl
As with the divergence (scalar product ∇ · F), we also apply the nabla operator to
a vector field F in the case of the curl (cross product ∇ × F):
∇ × F(x, y, z) = (∂x, ∂y, ∂z) × (Fx(x, y, z), Fy(x, y, z), Fz(x, y, z))   (5.14)
              = (∂y Fz − ∂z Fy, ∂z Fx − ∂x Fz, ∂x Fy − ∂y Fx)
In the last step, we omitted the arguments for more compactness. The result
of the cross product is again a vector field with three components. The curl
∇ × F(x, y, z) of a vector field thus gives us another vector field.
We can visualize the curl ∇ × F(x, y, z) of the vector field at the location (x, y, z),
as the name tells us, as the circulation of the vector field around
the location (x, y, z).
■ Example 5.5 — Calculate the curl of a vector field. Let us again consider the
following vector field at the location (1, 1, 1):

F(x, y, z) = (2x³, zy, 5xy)   (5.15)

∇ × F = (∂y(5xy) − ∂z(zy), ∂z(2x³) − ∂x(5xy), ∂x(zy) − ∂y(2x³)) = (5x − y, −5y, 0)

At the location (1, 1, 1) the curl is therefore ∇ × F(1, 1, 1) = (4, −5, 0). ■
Hopefully you can now imagine what the cross product ∇ × F with the vector
field F means. We will encounter the curl in the chapter on Maxwell's equations.
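The curl from example 5.5 can also be checked numerically. The following Python sketch differentiates the field with central difference quotients:

```python
# Vector field from Eq. 5.15
def F(x, y, z):
    return (2 * x**3, z * y, 5 * x * y)

def partial(i, axis, x, y, z, h=1e-6):
    # ∂(F_i)/∂(axis) via a central difference quotient
    p = [x, y, z]; q = [x, y, z]
    p[axis] += h; q[axis] -= h
    return (F(*p)[i] - F(*q)[i]) / (2 * h)

def curl(x, y, z):
    return (partial(2, 1, x, y, z) - partial(1, 2, x, y, z),  # ∂y Fz − ∂z Fy
            partial(0, 2, x, y, z) - partial(2, 0, x, y, z),  # ∂z Fx − ∂x Fz
            partial(1, 0, x, y, z) - partial(0, 1, x, y, z))  # ∂x Fy − ∂y Fx

c = curl(1.0, 1.0, 1.0)
print(c)  # approximately (4, −5, 0)
```

The numerical result agrees with the analytic curl (5x − y, −5y, 0) evaluated at (1, 1, 1).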
6. Gauss Divergence Theorem
Please what? You're probably asking yourself. Don't worry. We will break down the
Divergence Theorem into its components so that you understand it one hundred
percent:

∫_V (∇ · F) dv = ∮_A F · da   (6.1)
The A stands for a surface that encloses a volume, for example the surface
of a cube, a sphere or the surface of any three-dimensional shape you can think
of. The small circle around the integral is intended to indicate that this surface
must fulfill a condition: it must be closed, that is, it must not contain any
holes, so that the equality in 6.1 is mathematically fulfilled. The surface A is
therefore a closed surface.
The F is any vector field F = F(x, y, z), that is, a vector with three
components Fx(x, y, z), Fy(x, y, z) and Fz(x, y, z), as shown in Eq. 4.1. For
example, the vector field could represent an electric field F = E or a
magnetic field F = B.
The da is an infinitesimal surface element, that is, an infinitely small area
of the surface A under consideration. As you may have noticed, the da element
is shown in bold, so it is a vector with three components dax, day and daz. The
vector naturally also has a magnitude and a direction. The magnitude |da| = da
indicates the area of this small piece of surface. The da vector is orthogonal to
the surface and points out of the surface.
The point · in F · da is the scalar product (also called dot product). You
should be familiar with this vector operation from basic mathematics. The
scalar product is a way of multiplying two vectors together. In the Divergence
Theorem, the scalar product is therefore formed between the vector field F and
the surface element da. Written out, the scalar product looks like this:

F · da = Fx dax + Fy day + Fz daz   (6.2)
The task of this scalar product is to pick out the part of the vector field F at
the point (x, y, z) that is perpendicular to the surface, that is, the part that
points parallel to the surface element vector da. How can we understand this?
Mathematically, we can split the vector field F = F∥ + F⊥ into two parts:
F · da = (F∥ + F⊥) · da   (6.3)
       = F∥ · da + F⊥ · da = F∥ · da
Why is the term F⊥ · da zero? Because the scalar product of two mutually
perpendicular vectors is mathematically zero. The scalar product F · da therefore
only keeps the part of the vector field that pierces the surface
perpendicularly. Everything that passes by the surface (by this I mean the
component F⊥ parallel to the surface) is omitted in the Divergence Theorem.
Then, on the right-hand side of the Divergence Theorem 6.1, the scalar products
F (x, y, z) · da(x, y, z) are summed up for each point (x, y, z) on the surface A
using the integral in 6.1.
The V stands for a volume, but not just any volume: it is the volume that is
enclosed by the surface A. For example, if A is the surface of a cube, then
V is the volume of this cube. The dv is an infinitesimal volume element,
that is, an infinitely small piece of the volume V.
In the integrand ∇ · F of the volume integral, ∇ stands for the nabla operator,
which we got to know in chapter 5. Although this operator is not a vector
from a mathematical point of view, it looks like one. An operator such as
the nabla operator is only useful if it is applied to a field. And this is what happens
in the integrand ∇ · F. The nabla operator ∇ is applied to the vector field
F by forming the scalar product between the nabla operator and the vector
field. Written out, this scalar product corresponds to the sum of the derivatives
of the components of the vector field with respect to the coordinates x, y and z:
∇ · F = ∂x Fx + ∂y Fy + ∂z Fz (6.6)
Then, in the volume integral 6.5, the divergences ∇ · F (all sources and sinks)
at each location (x, y, z) within the volume V are summed up with an integral.
The volume integral 6.5 in the Divergence Theorem is a number that measures
how many sinks and sources can be found within the volume V.
Let us summarize the statement of the Divergence Theorem 6.1:
The volume integral on the left-hand side describes the sum of the sources
and sinks of the vector field within the volume V:

∫_V (∇ · F) dv
The surface integral on the right-hand side describes the flux Φ of the
vector field through the closed surface A:

∮_A F · da
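A small numerical sanity check makes this equality tangible. The following Python sketch uses the example field F = (x², y², z²) on the unit cube, which is just one convenient choice; both sides of the Divergence Theorem come out as 3:

```python
# Example field F = (x², y², z²) on the unit cube [0, 1]³; div F = 2x + 2y + 2z
def midpoints(n):
    return [(i + 0.5) / n for i in range(n)]

n = 40
h = 1.0 / n

# Left-hand side: volume integral of the divergence (midpoint rule)
vol = sum(2 * x + 2 * y + 2 * z
          for x in midpoints(n) for y in midpoints(n) for z in midpoints(n)) * h**3

# Right-hand side: outward flux through the six faces of the cube.
# On the x = 1 face Fx = 1, on the x = 0 face Fx = 0 (likewise for y and z),
# so each coordinate direction contributes face area times (1 − 0).
flux = 0.0
for u in midpoints(n):
    for v in midpoints(n):
        flux += 3 * (1.0 - 0.0) * h**2  # x-, y- and z-face pairs together

print(vol, flux)  # both 3 (up to rounding)
```

The sources inside the cube (left-hand side) exactly account for the field flowing out through the closed surface (right-hand side).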
More: en.fufaev.org/stokes-curl-theorem
Besides the Divergence Theorem, we will also need the Stokes' Curl Theorem
(or shorter: Curl Theorem) in order to understand Maxwell's equations in depth.
The Curl Theorem states that the summed-up curl of a vector field within a surface
is equal to the circulation of the vector field along the edge of this surface.
Expressed mathematically, this theorem looks like this:
∫_A (∇ × F) · da = ∮_L F · dl   (7.1)
If you have understood the Divergence Theorem, the Curl Theorem should no
longer seem totally cryptic to you. You already know the vector field
F(x, y, z). It depends on three spatial coordinates and has three components
as a vector. The scalar product F · dl, but also the nabla operator ∇ and
the infinitesimal surface element da should be familiar to you if you have read
chapter 6 on the Divergence Theorem.
Then, on the right-hand side, the scalar product F · dl is formed between the vector
field and the line element. Written out, the scalar product looks like this:

F · dl = Fx dlx + Fy dly + Fz dlz   (7.2)

You have already learned what the purpose of this scalar product is in
chapter 6 on the Divergence Theorem. Here is a quick recap: First, divide the
vector field F = F∥ + F⊥ into the component F∥, which points parallel to the dl
line element, and the component F⊥ perpendicular to it. The perpendicular part
then drops out of the scalar product:

F · dl = (F∥ + F⊥) · dl   (7.3)
       = F∥ · dl + F⊥ · dl = F∥ · dl
Since the line element dl is parallel to the line at every point of the line, only
the parallel component F∥ of the vector field remains in the scalar product F · dl,
and it is exactly this component that runs along the line L. All other components
of the vector field are absent.
The scalar products for each point (x, y, z) on the line L are then added up on
the right-hand side of the Curl Theorem using the line integral.
The surface A appears in the surface integral. In contrast to the surface integral
with a circle around the integral sign, as in the Divergence Theorem, here we
consider an open surface. It therefore does not enclose a volume. This is
merely a surface that is bounded by the loop L.
The vector da = (dax, day, daz) represents an infinitely small element of the
surface A and is perpendicular to the surface at every point (x, y, z) on it.
The cross product ∇ × F between the nabla operator and the vector field also
appears in the surface integral. You should already know what the cross product
means from the basics of mathematics. In addition to the scalar product, the
cross product is the second way to multiply vectors with each other. The cross
product ∇ × F is the curl of the vector field F.
In contrast to the scalar product, the result of the cross product ∇ × F is again
a vector field. If we write out the cross product in
concrete terms, the result vector ∇ × F looks like this:
∇ × F = (∂y Fz − ∂z Fy, ∂z Fx − ∂x Fz, ∂x Fy − ∂y Fx)   (7.5)
(∇ × F) · da = ((∇ × F)∥ + (∇ × F)⊥) · da   (7.6)
             = (∇ × F)∥ · da + (∇ × F)⊥ · da
             = (∇ × F)∥ · da
The scalar products (∇ × F)∥(x, y, z) · da(x, y, z) are then summed up on the
left-hand side of the Curl Theorem using the surface integral, at each point
(x, y, z) of the surface A.
Let us now summarize the statements of the surface integral (left-hand side)
and the line integral (right-hand side) of the Stokes' Curl Theorem:
On the left-hand side, the curl (∇ × F) of the vector field F is summed
up at each individual location within the surface A:
∫_A (∇ × F) · da
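Here, too, a small numerical check is instructive. For the example field F = (−y, x, 0), with the unit disk as surface A and the unit circle as edge L, both sides of the Curl Theorem equal 2π:

```python
import math

# Example field F = (−y, x, 0) with curl ∇ × F = (0, 0, 2).
# Surface A: unit disk in the xy-plane; edge L: unit circle.

# Left-hand side: (∇ × F) · da = 2 everywhere on the disk,
# so the surface integral is 2 times the disk area π.
surface = 2 * math.pi

# Right-hand side: circulation along (cos t, sin t) with dl = (−sin t, cos t) dt
n = 100000
h = 2 * math.pi / n
line = 0.0
for i in range(n):
    t = (i + 0.5) * h
    Fx, Fy = -math.sin(t), math.cos(t)
    line += (Fx * (-math.sin(t)) + Fy * math.cos(t)) * h

print(surface, line)  # both approximately 6.2832 (= 2π)
```

The curl summed over the surface equals the circulation of the field along the edge, just as Eq. 7.1 promises.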
More: en.fufaev.org/fourier-series
You are certainly familiar with the Taylor expansion, with which we can
approximate a function f(x) around a point x = x0 using a simpler Taylor series.
Let us denote this approximation of the exact function by ftaylor. The more terms
we take along in the Taylor series, the better the approximation in the vicinity of
the selected point x0.
As you can see in the image below, the Taylor series, represented by ftaylor, is a
good approximation of the function f in the immediate vicinity of x0. However,
if we move further away from this point, we see that the Taylor series is no longer a
good approximation there. The Taylor expansion is therefore a method with
which we can approximate a function only locally.
v = v1 e1 + v2 e2 + v3 e3 + ... + vn en = Σ_{k=1}^{n} vk ek   (8.1)
When considering the function f as a vector, the function values f(x0), f(x1),
f(x2) and so on up to f(xn) = f(x0 + L) represent the components of f. We
can imagine the function f as a column vector:

f = (f(x0), f(x1), f(x2), ..., f(xn))ᵀ   (8.3)

Of course, this representation is not exact. The argument x of the function f(x)
is real, so there are theoretically infinitely many function values, even
between x0 and x1.
vk = ek · v   (8.4)
   = ek0 v0 + ek1 v1 + ... + ekn vn
   = Σ_{j=0}^{n} ekj vj
In the last step, we have written out the scalar product a little more compactly
with a summation sign and chosen j as the summation index. Here, ek0 to ekn
are the components of the basis vector ek = (ek0, ek1, ..., ekn).
If we are not working with finite-dimensional vectors but with functions, then
we have to form the scalar product between the k-th basis function ek and
the function f to obtain the k-th Fourier coefficient of f:

f̂k = ek · f = ⟨ek|f⟩ = (ek(x0), ek(x1), ek(x2), ..., ek(xn)) · (f(x0), f(x1), f(x2), ..., f(xn))   (8.5)
As already mentioned, with a Fourier series we can only work with functions in a
certain interval. Our chosen interval (x0, xn) = (x0, x0 + L) has the length L.
You have certainly seen the approximation sign in Eq. 8.6. The sum is
therefore only an approximation of the Fourier coefficient f̂k. Can we represent
the Fourier coefficients f̂k exactly? It's quite simple! Since we are dealing with
a continuous summation in the case of exact Fourier coefficients, we must
replace the summation sign with an integral. So instead of summing
discretely over x as in 8.6, we integrate over x:
f̂k = ⟨ek|f⟩ = ∫_{x0}^{xn} e*k(x) f(x) dx   (8.7)
We have only made a small mathematical upgrade in the integral: the basis
function e*k(x) has been complex conjugated. We can omit the asterisk if
we are working with real basis functions, since e* = e applies to real basis functions.
However, to allow complex basis functions, we must append an asterisk to the
basis function. The asterisk in the case of complex-valued functions
is important so that the integral 8.7 fulfills the properties of an inner
product.
So, now we know how we can calculate the Fourier coefficients with the integral
8.7 and how the integral formula 8.7 comes about in the first place.
Which functions can we use as basis functions ek? All functions that fulfill the
properties of a basis! In order for a set of vectors
or, as in our case, a set of functions {ek} to be called a basis, these functions
must fulfill two conditions:

1. If we take two basis functions ek and em from the set {ek}, then they
must be orthonormal to each other, in other words orthogonal and
normalized. This property can be expressed with the Kronecker delta:
⟨ek|em⟩ = δkm.

2. The set {ek} of basis functions must be complete. In other words, they
must span the space in which the functions f live. We must be able to
represent each function f exactly with the set {ek}.

Only when these two properties are fulfilled by the functions {ek} can we take
these functions as basis functions and thus represent a function f as a Fourier
series 8.8.
A typical basis {ek } used in physics are the complex exponential functions:
ek(x) = (1/√L) e^(ikx)   (8.9)
The factor 1/√L ensures that the basis functions are normalized, that is, they
exactly fulfill the necessary 1. property. In the context of physics, especially
in optics, we refer to k as the wavenumber. And remember that e in e^(ikx) is
Euler's number and not the label of the basis function ek! I'm just saying...
Depending on what we insert for the wavenumber k, we get a different basis
function from 8.9. Of course, we can also choose a different basis for the Fourier
series, such as cosine and sine functions. We are free to choose a basis.
Here we have chosen complex exponential functions as a basis because they
can be written in a nice compact way, especially for the explanation of the
Fourier series.
The Fourier series 8.8 of the function f would look like this in the exponential
basis 8.9:

f = Σ_{k}^{n} f̂k ek = (1/√L) Σ_{k}^{n} f̂k e^(ikx)   (8.10)
What can we do with the Fourier series 8.10 in the exponential basis? As I said,
we can use it to approximate any function f in an interval. Let's take a look at
a concrete example, then you'll know what I mean.
f(x) = { −x,      x ∈ (0, 0.5)
       { 1 − x,   x ∈ (0.5, 1)     (8.11)
Let's choose the exponential basis functions as our basis for the Fourier series
of the sawtooth function:

f = (1/√L) Σ_{k}^{n} f̂k e^(ikx)   (8.12)

The total interval length is L = 1, so the normalization factor 1/√L is simply 1.
To set up the Fourier series concretely, we proceed in two steps:

1. Choose a basis and insert it into the Fourier series. We have already
done this in Eq. 8.12.
2. Calculate the Fourier coefficients f̂k with the integral 8.7 and insert them
into the Fourier series 8.12. We determine the k-th Fourier coefficient 8.7
with the inner product between the k-th basis function and the sawtooth
function f:
f̂k = ∫_{x0}^{xn} e*k(x) f(x) dx   (8.13)
   = ∫_0^1 e^(−ikx) f(x) dx   (8.14)
Note that the exponential basis function must be complex conjugated in the
integral. This is where the minus sign in the exponent of the exponential function
comes from. And the integration limits x0 = 0 and xn = 1 are our free choice;
we want to approximate the sawtooth function in this region.
Now it's up to you to solve the integral 8.14 to determine the Fourier coefficients
concretely. I won't do everything for you, so I'll leave it to you as an exercise.
Since we have not inserted a specific value for the wavenumber k in 8.15, we have
determined all Fourier coefficients f̂k at once. For a different value of k, we get a
different Fourier coefficient from Eq. 8.15.
Let's now insert the Fourier coefficients 8.14 into the Fourier series 8.12.
In the last step, we selected periodic boundary conditions, k = 2πm/L, where
m = ..., −2, −1, 0, 1, 2, ... takes whole numbers. We therefore sum over both
positive and negative m.
We can decide up to which mmax we want to sum in the Fourier series 8.19 of the
sawtooth function. The higher we choose mmax , the better our approximation
of the function f will be.
Look at the plots for the approximation mmax = 1 and for a better approximation
mmax = 20:
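Since the book leaves the integral 8.14 to you as an exercise, here is a numerical sketch instead: it computes the Fourier coefficients by quadrature and sums the partial series for the sawtooth 8.11, with L = 1 and k = 2πm as in the text:

```python
import cmath

# Sawtooth function 8.11 on the interval (0, 1), L = 1
def f(x):
    return -x if x < 0.5 else 1.0 - x

def fourier_coeff(m, n=4000):
    # f̂_k = ∫₀¹ e^(−ikx) f(x) dx with k = 2πm, midpoint rule
    h = 1.0 / n
    k = 2 * cmath.pi * m
    return sum(cmath.exp(-1j * k * (i + 0.5) * h) * f((i + 0.5) * h)
               for i in range(n)) * h

def partial_series(x, mmax=50):
    # Fourier series 8.12, summed for m = −mmax ... mmax
    total = 0j
    for m in range(-mmax, mmax + 1):
        k = 2 * cmath.pi * m
        total += fourier_coeff(m) * cmath.exp(1j * k * x)
    return total.real

print(f(0.25), partial_series(0.25))  # both approximately −0.25
```

The higher you choose mmax, the closer the partial sum hugs the sawtooth, exactly as in the plots described above.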
With this Fourier series for the sawtooth function, we have basically gained two
things:
We can now sum the series up to a certain maximum value m = mmax
and thus obtain a continuously differentiable, good approximation for
the sawtooth function.
And since we have determined the Fourier coefficients, we know which m,
that is, which wavenumbers k = 2πm/L, contribute to the sawtooth function
and how strongly.
More: en.fufaev.org/euler-lagrange-equation
The connection between A and B, that is, the trajectory h(t), must be a
parabola in this problem. But why is this trajectory a parabola and not some
other curve? Why does nature, or the particle, choose this path of all the paths
between points A and B, and not any other?
In order to answer this question, we need a physical quantity called action,
which is abbreviated with an S and has the unit Js (Joule second).
We can assign an action S[h] to each of the conceivable trajectories h. The
action takes an entire function h in the argument and outputs a number S[h],
namely the value of the action for the corresponding function. For example,
some trajectory h1 could have the value S[h1 ] = 3.5 Js, another trajectory h2
could have the value S[h2 ] = 5.6 Js and the parabolic trajectory h could have
the value S[h] = 2 Js:
So now back to the question: Why a parabola? Experience shows that nature
chooses a trajectory whose action is extremal, that is, a minimum, a maximum
or a saddle point of the action.
All other actions are out of the question for nature. Nature chooses one of these
extremal actions. This is exactly what "extremal" means. Which of the extremal
paths (minimum, maximum, saddle point) nature actually takes depends on the
problem under consideration.
So we can answer the question: Why does the particle thrown upwards in the
gravitational eld take the path of the parabola in the space-time diagram?
Because the parabolic trajectory h has the smallest action S[h]!
But how do we actually calculate the value of the action? For this we
need the Lagrange function L(t, h, ḣ). It depends on the time t, on the function
value (position) h(t) and on the time derivative (velocity) ḣ(t) at time t. The
Lagrange function has the unit of energy, that is, Joule (J).
If we integrate the Lagrange function L over the time t between t1 and t2 , we
get a quantity that has the unit Joule second. We interpret this as the action
S:
S[h] = ∫_{t1}^{t2} dt L(t, h, ḣ)   (9.1)
The generalized coordinate q (the trajectory you are looking for) does not necessarily
have to be the height h above the ground. For example, it can represent an
angle q = φ or any other quantity that may depend on the time t.
S[q] = ∫_{t1}^{t2} dt L(t, q, q̇)   (9.2)
With this formula for the action functional we can calculate the value S[q]
of the action for every possible trajectory q that the particle can take. We only
need to determine the Lagrange function L.
There are an infinite number of possible trajectories that a particle can take from
A to B. Do I really have to calculate an infinite number of action functionals
9.2? No, there is a faster way to find the trajectory with the extremal
action, and for this we need the Euler-Lagrange equation.
10. Euler-Lagrange Equation
Of course, it is totally cumbersome to calculate the action functional 9.2 for all
possible trajectories and to take the trajectory that yields the smallest value of
the integral. To save us this huge task, the Euler-Lagrange equation comes
into play:
∂L/∂q − d/dt(∂L/∂q̇) = 0   (10.1)

The derivative ∂L/∂q̇ of the Lagrange function with respect to the velocity q̇ is
called the generalized momentum p. With p = ∂L/∂q̇, the Euler-Lagrange
equation takes the form:

∂L/∂q − dp/dt = 0   (10.2)

dp/dt = ∂L/∂q   (10.3)
For the momentum to be conserved, the time derivative of the momentum must
vanish. We therefore only have to check whether ∂L/∂q is zero; if it is, the
generalized momentum is conserved.
Using the form 10.3, we can also read off a possible interpretation of the
Euler-Lagrange equation. It is a condition for the conservation of
generalized momentum.
To be able to use the Euler-Lagrange equation at all, we need to know the
Lagrange function L for the chosen system. Only a comparison with experiment
can tell whether the Lagrange function you have found correctly describes your
problem or not. If you want to find the Theory
of Everything formula that unites quantum mechanics with the general
theory of relativity, then you should derive or dream up the corresponding
Lagrange function.
In classical mechanics, the Lagrange function is the difference between the
kinetic energy Wkin and the potential energy Wpot of a particle:

L = Wkin − Wpot = (1/2) m v² − Wpot

Here m is the mass of the particle. The potential energy Wpot of the particle in
the homogeneous gravitational field at height h is:

Wpot = m g h   (10.6)
∂L/∂h − d/dt(∂L/∂v) = 0   (10.8)
dp/dt = d/dt(m v)   (10.15)
      = m v̇   (10.16)
      = m ḧ   (10.17)
Let's insert the calculated derivatives 10.11 and 10.14 into the Euler-Lagrange
equation:
−m g − m ḧ = 0 (10.18)
Let's cancel the mass and bring ḧ to the right-hand side of the equation:
−g = ḧ (10.19)
In our problem, we have thrown the particle from the height h1 = 0 at the time
t1 = 0. The first boundary condition is therefore: h(0) = 0. If we insert it
into the solution 10.20, we get the second constant of integration: C2 = 0. This
simplifies the solution:
h(t) = −(1/2) g t² + C1 t   (10.21)
This trajectory has the smallest value S[h] of the action. If we plot the result
10.22 in the space-time diagram, we get a parabola.
Let us summarize: The Euler-Lagrange equation helps us to set up
differential equations for a desired trajectory between two fixed
points. The solution of these differential equations yields the exact shape of
the trajectory that is allowed by nature.
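The claim that the parabola has the smallest action can also be tested numerically: discretize the action 9.1 with L = (1/2)mḣ² − mgh and compare the parabola against any competitor trajectory with the same endpoints. The values m = 1 kg, g = 9.81 m/s² and T = 1 s are example choices:

```python
import math

# Discretized action S[h] = ∫ (½ m ḣ² − m g h) dt, endpoints h(0) = h(T) = 0
m, g, T, n = 1.0, 9.81, 1.0, 2000
dt = T / n

def action(h):
    S = 0.0
    for i in range(n):
        t = (i + 0.5) * dt
        hdot = (h(t + dt / 2) - h(t - dt / 2)) / dt  # velocity ḣ at the midpoint
        S += (0.5 * m * hdot**2 - m * g * h(t)) * dt
    return S

# Physical trajectory 10.21 with C1 = gT/2, so that h(T) = 0
parabola = lambda t: -0.5 * g * t**2 + 0.5 * g * T * t
# Arbitrary competitor trajectory with the same endpoints
competitor = lambda t: 2.0 * math.sin(math.pi * t / T)

print(action(parabola) < action(competitor))  # True: the parabola has the smaller action
```

Whatever competitor you try (as long as it shares the endpoints), its action comes out larger than that of the parabola; for this Lagrange function the stationary trajectory is in fact a minimum.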
III
Electromagnetism
11. The Electric Vector Field
Fe = (1/(4π ε0)) (Q q / r²)   (11.1)
What if we know the value of the big charge Q and want to know what force this
big charge exerts on another small charge q, but we don't know the exact value
of this small charge? Or we deliberately leave this value open and only want to
consider the electric force that the big charge would exert if we placed
a test charge q near it. To do this, q must somehow be eliminated from
Coulomb's law. We achieve this by dividing Coulomb's law on both sides
by q so that the test charge on the right-hand side disappears:
Fe/q = (1/(4π ε0)) (Q / r²)   (11.2)
The quotient between force and charge on the left-hand side is defined as the
electric field E of the source charge Q:

E = (1/(4π ε0)) (Q / r²)   (11.3)
We have called Q the source charge to indicate that it is the source of the
electric field. And so that Q is really the only source, we have chosen the test
charge q to be very small.
The electric field in Eq. 11.3 is only the magnitude, that is, the value of the
electric field. For the cherry on the cake of electrodynamics, Maxwell's equations,
we need the electric field as a vector quantity in order to also take into account
the direction of the electric field at all locations in space. Therefore, the
electric field E must be turned into a vector E. Vectors are shown in this
book in bold.
The electric field E as a vector in three-dimensional space has three components
E1, E2 and E3:

E = (E1, E2, E3)   (11.4)
The first component E1(x, y, z) depends on the spatial coordinates (x, y, z) and
it indicates the magnitude of the electric force that would act on a test charge
along the x-axis if the test charge were placed at the location (x, y, z). The same
applies to the other two field components E2(x, y, z) and E3(x, y, z), which each
determine the electric force on a test charge along the y and z spatial directions.
We can summarize: The electric vector field E assigns a vector E(x, y, z)
to each point in space (x, y, z), which represents the electric field at
that location. If a test charge is placed there, it is accelerated in the
direction of this vector.
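Here is a quick numerical example of Eq. 11.3; the source charge Q, the distance r and the test charge q are arbitrary example values:

```python
import math

EPS0 = 8.854e-12  # vacuum permittivity ε0 in As/(Vm)

def e_field(Q, r):
    # Magnitude of the electric field of a point charge, Eq. 11.3
    return Q / (4 * math.pi * EPS0 * r**2)

Q = 1e-6   # example source charge: 1 µC
r = 0.1    # example distance: 10 cm
E = e_field(Q, r)
print(E)  # approximately 8.99e5 V/m

q = 1e-9  # example test charge: 1 nC
print(q * E)  # force Fe = q·E on the test charge, approximately 9e-4 N
```

Multiplying the field by a test charge q recovers the Coulomb force 11.1, which is exactly how the field was defined.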
12. The Magnetic Vector Field
The force also increases in proportion to the applied magnetic field. To describe
this proportionality of the force and the magnetic field, we introduce the quantity
B. Overall, the magnetic force (also called the Lorentz force) is given by:
Fm = q v B (12.1)
The unit of the quantity B must be such that the right-hand side of the equation
results in the unit of force, that is N = kg·m/s². A simple transformation
results in the unit of B: kg/(A·s²). We refer to this unit as Tesla for short:

T = kg/(A·s²)   (12.2)
Fm = (Fm1, Fm2, Fm3),   v = (v1, v2, v3),   B = (B1, B2, B3)   (12.3)
Now the three quantities are not scalars, but three-dimensional vectors with
components in the x, y and z spatial directions. The question now is: How must
the velocity vector v be vectorially multiplied by the magnetic vector
field B?
If the deflection of the charge in the magnetic field is investigated more closely
in an experiment, it turns out that the magnetic force always deflects the charge
orthogonally, in other words perpendicularly, to the direction of its velocity and
to the magnetic field lines. This orthogonality can be easily captured with
the cross product v × B.
The cross product between the velocity vector and the magnetic field vector is
defined in such a way that the result of the cross product, which is itself a vector,
is always orthogonal to the two vectors v and B:

v × B = (v2 B3 − v3 B2, v3 B1 − v1 B3, v1 B2 − v2 B1)   (12.4)
Fm = q v × B (12.5)
So what is the magnetic vector field B? The magnetic vector field assigns
a vector B(x, y, z) to each point in space (x, y, z), which determines the
magnitude and direction of the magnetic force Fm(x, y, z) on a moving
charge q.
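The orthogonality of the Lorentz force 12.5 can be verified directly with the written-out cross product 12.4. The charge, velocity and field below are arbitrary example values:

```python
# Written-out cross product 12.4 and Lorentz force 12.5
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

q = 1.6e-19           # example charge (elementary charge) in C
v = (2e5, 0.0, 1e5)   # example velocity in m/s
B = (0.0, 0.5, 0.0)   # example magnetic field in T

F = tuple(q * c for c in cross(v, B))
print(F)
# The force is orthogonal to both v and B:
print(dot(F, v), dot(F, B))  # both 0 up to rounding
```

Because the force is always perpendicular to the velocity, the magnetic field deflects the charge but never changes the magnitude of its speed.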
13. Maxwell’s Equations
More: en.fufaev.org/maxwell-equations
The four Maxwell equations together with the Lorentz force contain the
entire knowledge of electrodynamics. There are so many applications
that I can't list them all, but some of them are, for example: electric welding
for assembling car bodies, motors for electric cars, magnetic resonance imaging
in medicine, kettles in the kitchen, the charger for your smartphone, radio,
Wi-Fi. And much more!
They may still seem a little cryptic to you, but after this lesson you will be able
to translate each of these four equations into a picture, which will be easier to
internalize.

As you can see from Maxwell's equations, the electric field E and the magnetic
field B appear there. You will hopefully have become familiar with these two
quantities in Chapters 11 and 12. Of course, I also assume that you have
read Chapter 5 on the Nabla operator, Stokes' curl theorem (Chapter 7) and the
Gauss divergence theorem (Chapter 6). If this is the case, then you will have no problem
understanding the following chapters.
Here are the four equations briefly:

∮A E · da = Q/ε0

∮A B · da = 0

∮L E · dl = − ∫A (∂B/∂t) · da

∮L B · dl = µ0 I + µ0 ε0 ∫A (∂E/∂t) · da
The surface integral over the vector field F results in a number Φ, which
indicates how much of the vector field F passes through the surface A.
If the vector field F in the surface integral is a magnetic
field F = B, then this surface integral is referred to as the magnetic flux Φm
through the surface A:

Φm = ∫A B · da    (13.3)
The number U indicates how much of the vector field circulates along the line
L. It was no coincidence that we gave it the same letter as the voltage.
If the vector field F in the line integral is an electric field F = E, then this
line integral is referred to as the electric voltage Ue along the line L:

Ue = ∫L E · dl    (13.5)
The voltage 13.5 in the case of an electric field is proportional to the kinetic
energy: a positively charged particle gains energy when it passes through
the line L.

The line integral 13.5 of the electric field, that is the voltage Ue, measures
the kinetic energy gain or loss of a charged particle when it passes
through the considered line L in the electric field. Note, however, that this
kinetic energy does not come from nothing, but is withdrawn from or added
to the electric field.
If the vector field F in the line integral is a magnetic field F = B, then this
line integral is referred to as the magnetic voltage Um along the line L:

Um = ∫L B · dl    (13.6)
We will need this knowledge of electric and magnetic flux and voltage in a
moment if we want to understand Maxwell's equations in integral form.
You should be familiar with the left-hand side of Maxwell's equation 13.7. It is
the electric flux Φe through an imaginary surface A that encloses something.
The left-hand side of the Maxwell equation therefore tells you how much of
the electric field E exits and enters the surface A on balance:

Φe = Q/ε0    (13.8)
On the right-hand side of the first Maxwell equation is the total electric charge
Q, which is enclosed by the surface A. The vacuum permittivity ε0 is
only there to have the correct unit "volt-meter" on both sides of the Maxwell
equation. The interesting thing is: it doesn't matter how this enclosed charge
is distributed within that surface.
With the Gauss divergence theorem 6.1, which links a volume integral with a
surface integral, the surface integral on the left-hand side of the first Maxwell
equation can be rewritten as a volume integral:

∫V (∇ · E) dv = Q/ε0    (13.9)
The enclosed charge Q can also be expressed with a volume integral. The charge
corresponds to the charge density ρ integrated over the considered volume V, because
charge density is by definition charge per volume. This means that the volume
integral of the charge density ρ over a volume V corresponds to the charge
enclosed in this volume. This transforms the right-hand side of the first Maxwell
equation 13.9 into a volume integral:

∫V (∇ · E) dv = (1/ε0) ∫V ρ dv    (13.10)
On both sides of Eq. 13.10 we integrate over the same volume V. To ensure
that this equation is always fulfilled for any chosen volume, the integrands on
both sides must be the same (whereby the right integrand is multiplied by the
constant 1/ε0). This results in the differential form of the first Maxwell
equation:

∇ · E = ρ/ε0    (13.11)
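This relationship between enclosed charge and flux can be checked numerically. The following sketch integrates the field of an assumed point charge over a sphere and compares the result with Q/ε0 (the charge value and the radius are arbitrary example choices):

```python
import numpy as np

# Numerical check of the first Maxwell equation in integral form:
# integrate E·da for an assumed point charge Q over a sphere of radius r.
# The flux should equal Q/eps0, independent of r.
Q = 1e-9           # assumed charge in coulombs
eps0 = 8.854e-12   # vacuum permittivity in As/(Vm)
r = 0.5            # sphere radius in m (any value works)

theta = np.linspace(0.0, np.pi, 400)                     # polar angle
phi = np.linspace(0.0, 2 * np.pi, 400, endpoint=False)   # azimuthal angle
dtheta = theta[1] - theta[0]
dphi = phi[1] - phi[0]

# Radial field of a point charge: E_r = Q / (4 pi eps0 r^2)
E_r = Q / (4 * np.pi * eps0 * r**2)

# Surface element of the sphere: da = r^2 sin(theta) dtheta dphi,
# and E is parallel to da everywhere on the sphere.
T, _ = np.meshgrid(theta, phi)
flux = np.sum(E_r * r**2 * np.sin(T) * dtheta * dphi)

print(flux)       # approximately 112.9
print(Q / eps0)   # approximately 112.9
```

The radius r drops out of the result, exactly as the first Maxwell equation predicts.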
The integral 13.13 over any volume V is always zero only if the integrand ∇ · B
is zero. This is how the second Maxwell equation in its differential form
emerges:

∇ · B = 0    (13.14)
Since there are no magnetic monopoles, there are no sources and sinks of
the magnetic field. Consequently, there are no points in space where magnetic
field vectors originate or converge. The magnetic field lines
must therefore always be closed.
The second Maxwell equation is, just like the other Maxwell equations, an
experimental observation. This means that if at some point a magnetic
monopole is found, for example a single north pole without an associated south
pole, then Maxwell's second equation must be modified. That would be nice
for us, because then Maxwell's equations would be even more symmetrical!
On the left-hand side is a line integral of the electric field E over a closed
line L that borders the surface A. This line integral sums up all components
E|| of the electric field that run along the line L. From Chapter 13.3 we know
that this line integral corresponds to the electric voltage Ue along the loop L.
Now we can interpret the surface integral on the right-hand side as the magnetic
flux Φm through the surface A:

Ue = − ∂Φm/∂t    (13.18)
The time derivative in front of the magnetic flux is also still there. The magnetic
flux is therefore differentiated with respect to time in the third Maxwell equation.
The time derivative of the magnetic flux indicates how quickly the magnetic
flux changes as time passes. The third Maxwell equation therefore tells us
two equivalent things:

The faster the magnetic flux Φm changes through the enclosed surface A,
the greater the voltage Ue generated along the edge L of the surface.

The faster the magnetic flux through the enclosed surface A changes, the
stronger the parallel field component E||, which runs along the edge L
of the surface. This electric field along the edge is also referred to as an
electric vortex field, because this field component swirls around the
edge of the surface:

∮L E · dl = − ∂Φm/∂t    (13.19)
Of course, we can also interpret Maxwell's third equation 13.19, expressed with
the vortex field, the other way round: the stronger the electric vortex
field E (or more precisely E||) around the surface boundary L, the faster the
magnetic flux Φm changes through the surface A.
You are probably also wondering what the minus sign before the time
derivative means. The minus sign takes into account the circulation
direction of the vortex field:

If the change in magnetic flux is positive, that is ∂Φm/∂t > 0, then the
voltage is negative due to the minus sign, that is Ue < 0.

If the change in magnetic flux is negative, that is ∂Φm/∂t < 0, then the
voltage is positive due to the minus sign, that is Ue > 0.
The vortex component E|| of the electric field E thus swirls around in such a
way that the change in magnetic flux is impeded. Nature tries to prevent
the change in flux with a vortex field. You probably remember this minus sign
in the third Maxwell equation from school as Lenz's law. The minus sign
takes into account the law of conservation of energy.
What would happen if we omitted the minus sign? The electric vortex field
(with the field energy We) would generate a magnetic flux change ∂Φm/∂t. This
in turn would amplify the electric vortex field. This would increase the field
energy We. The increased vortex field leads to an increased change in flux. This
in turn leads to an even larger vortex field and thus to greater field energy.
This mutual amplification does not stop and the field energy We becomes
infinitely large. We could tap into this with a capacitor, for example, and would
have built a source of infinite energy, in contradiction to the conservation of energy.
If the magnetic flux does not change over time, ∂Φm/∂t = 0, then of course there is
also no electric vortex field and no electric voltage. The right-hand side of
the third Maxwell equation 13.19 is therefore zero:

∮L E · dl = 0    (13.20)
Now it is stated in 13.20 that the line integral over the electric field, that is the
electric voltage Ue, is always zero along a closed line L. There is therefore
no electric vortex field as long as there is no time-varying magnetic
field! This means: if an electron were to pass through the closed line L in the
electric field E, the electron would not change its energy.
This brings the curl ∇ × E into play. Since the equation 13.21 applies to any
surface A, the integrands on both sides must be equal. This yields the third
Maxwell equation in differential representation:

∇ × E = − ∂B/∂t    (13.22)
If the magnetic field does not change, that is, if it is static, the right-hand side
of 13.22 is zero and the third Maxwell equation simplifies to an electrostatic
Maxwell equation. Electrostatic here means that the electric field E is
time-independent:

∇ × E = 0    (13.23)
The physical constants ε0 and µ0 in the fourth Maxwell equation are irrelevant
for understanding it. These constants merely ensure that the right-hand side
has the same unit, "Tesla times meter", as the left-hand side of the equation.
On the right-hand side of 13.24 we can pull the time derivative in front of the
integral if the surface A does not change:

∮L B · dl = µ0 I + µ0 ε0 (∂/∂t) ∫A E · da    (13.25)
Then the surface integral over the electric field corresponds exactly
to the electric flux Φe through the surface A:

∮L B · dl = µ0 I + µ0 ε0 ∂Φe/∂t    (13.26)
Now the term with the electric current I must be converted into a surface
integral. To do this, we simply have to express the current with the electric
current density j, whose surface integral over the cross-sectional area A yields
the current I. Note that the scalar product of the current density with the surface
element da is taken in the integral. The scalar product therefore only picks out
the component j|| of the current density vector that runs parallel to the surface
element vector da, that is, perpendicular to the surface itself. Only this current
density component contributes to the current through the cross-sectional area A.
Now we have a surface integral in each term of the fourth Maxwell equation.
We can combine the two surface integrals on the right-hand side into one surface
integral because both summands are integrated over the same area:

∫A (∇ × B) · da = ∫A (µ0 j + µ0 ε0 ∂E/∂t) · da    (13.30)
For the Maxwell equation 13.30 to be fulfilled for any surface A, the integrands
on both sides must be equal. This results in the fourth Maxwell equation in
differential representation:

∇ × B = µ0 j + µ0 ε0 ∂E/∂t    (13.31)
14. Electromagnetic Waves

More: en.fufaev.org/electromagnetic-waves
An electromagnetic wave (short: EM wave) consists of an electric field
component E(t, x, y, z) and a magnetic field component B(t, x, y, z). The
two field components assign an electric and a magnetic field strength and
direction to each point (x, y, z) in three-dimensional space at each time t. The
two field components are therefore three-dimensional vector fields:

E(t, x, y, z) = (E1(t, x, y, z), E2(t, x, y, z), E3(t, x, y, z))

B(t, x, y, z) = (B1(t, x, y, z), B2(t, x, y, z), B3(t, x, y, z))
How an electromagnetic wave changes exactly in space and time, that is, how
it moves and propagates in space, is described by the wave equations for the E
and B vectors. Let's take a look at how we can extract these wave equations
from Maxwell's equations:

∇ · E = ρ/ε0

∇ · B = 0

∇ × E = − ∂B/∂t

∇ × B = µ0 j + µ0 ε0 ∂E/∂t

In charge-free and current-free space (ρ = 0 and j = 0), for example in a vacuum,
Maxwell's equations simplify to:

∇ · E = 0    (14.1)

∇ · B = 0    (14.2)

∇ × E = − ∂B/∂t    (14.3)

∇ × B = µ0 ε0 ∂E/∂t    (14.4)
The general form of a wave equation for a vector field F looks like this:

∇²F = (1/vp²) ∂²F/∂t²    (14.5)
Here F is an arbitrary vector field that satisfies the wave equation and vp is
the phase velocity of the wave. It indicates how fast a point of constant phase
of the wave moves in space. Since we are not considering dispersion (which would
make the wave spread apart), the phase velocity describes the propagation speed
of the wave.
A relation that is necessary for the derivation of the wave equation is the
following relationship for the curl of the curl of a vector field F (double
cross product):

∇ × (∇ × F) = ∇(∇ · F) − ∇²F    (14.6)
Let's apply the curl operator ∇× to both sides of the third Maxwell equation:

∇ × (∇ × E) = ∇ × (− ∂B/∂t)    (14.8)
The time derivative together with the minus sign may be placed in front of the
Nabla operator, since the Nabla operator only contains spatial derivatives
and thus does not depend on time:

∇ × (∇ × E) = − (∂/∂t) (∇ × B)    (14.9)
Now we can replace the curl ∇ × B of the magnetic field using the fourth
current-free Maxwell equation 14.4:

∇ × (∇ × E) = − (∂/∂t) (µ0 ε0 ∂E/∂t)    (14.10)
            = − µ0 ε0 (∂/∂t) (∂E/∂t)    (14.11)
            = − µ0 ε0 ∂²E/∂t²    (14.12)
We are finished with the right-hand side. It has the same form as the general
wave equation 14.5. Now we have to replace the double cross product on the
left-hand side with the relation 14.6:

∇(∇ · E) − ∇²E = − µ0 ε0 ∂²E/∂t²    (14.13)
According to Maxwell's first equation 14.1, the divergence of the electric field
in charge-free space is always zero. This simplifies 14.13 to the wave equation
for the electric field component of an electromagnetic wave:

∇²E = µ0 ε0 ∂²E/∂t²    (14.14)

The wave equation thus links the spatial derivatives ∇²E of the electric field
with the second time derivative ∂²E/∂t² and thus represents a system of three
partial differential equations.
If we compare the wave equation 14.14 for the electric field with the general
form 14.5 of a wave equation, we find out how the propagation velocity vp is
related to the two field constants µ0 and ε0:

1/vp² = µ0 ε0  ↔  vp = 1/√(µ0 ε0)    (14.15)
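You can check this numerically by plugging in the SI values of the two field constants:

```python
import numpy as np

# Numerical check of Eq. 14.15: the phase velocity 1/sqrt(mu0 eps0)
# computed from the two field constants is the speed of light.
mu0 = 4 * np.pi * 1e-7      # vacuum permeability in Vs/(Am)
eps0 = 8.854187817e-12      # vacuum permittivity in As/(Vm)

vp = 1 / np.sqrt(mu0 * eps0)
print(vp)  # approximately 2.998e8 m/s, the speed of light c
```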
From Maxwell's equations and the derived wave equation for the E field, we can
conclude that the electric field component of the electromagnetic wave
propagates at the speed of light. We will see that this also applies to the B
field component. We can therefore express the E wave equation with the
speed of light:

∇²E = (1/c²) ∂²E/∂t²    (14.17)
Apply the curl operator ∇× to both sides of the fourth current-free
Maxwell equation:

∇ × (∇ × B) = ∇ × (µ0 ε0 ∂E/∂t)    (14.18)
Now let's move the time derivative and the two constants on the right-hand side
in front of the Nabla operator:

∇ × (∇ × B) = µ0 ε0 (∂/∂t) (∇ × E)    (14.19)
Now we can replace the curl ∇ × E of the electric field with the third Maxwell
equation:

∇ × (∇ × B) = µ0 ε0 (∂/∂t) (− ∂B/∂t)    (14.20)
We have now decoupled the equations: only the B field appears. The time
derivatives on the right-hand side are combined and the double cross product
on the left-hand side is replaced using the calculation rule 14.6:

∇(∇ · B) − ∇²B = − µ0 ε0 ∂²B/∂t²    (14.21)

According to the second Maxwell equation 14.2, the divergence ∇ · B is always
zero, so the first term on the left-hand side vanishes. This yields the wave
equation for the magnetic field component, which, just like the one for the E
field, can be expressed with the speed of light c = 1/√(µ0 ε0):

∇²B = (1/c²) ∂²B/∂t²    (14.22)

Written out in components, the wave equation 14.17 for the electric field is a
system of three partial differential equations:

∂²E1/∂x² + ∂²E1/∂y² + ∂²E1/∂z² = (1/c²) ∂²E1/∂t²

∂²E2/∂x² + ∂²E2/∂y² + ∂²E2/∂z² = (1/c²) ∂²E2/∂t²    (14.23)

∂²E3/∂x² + ∂²E3/∂y² + ∂²E3/∂z² = (1/c²) ∂²E3/∂t²
There are three differential equations for the E field that you have to solve.
Fortunately, they are not coupled and can therefore be solved independently
of each other. Physically, non-coupled differential equations mean: the three
field components E1, E2 and E3 oscillate independently of each other.
They do not interfere with each other!
A solution E(t, x, y, z) of the wave equation 14.17 is an electric wave, but it
does not necessarily represent the E field of an electromagnetic wave
just because it solves the wave equation. The solution E(t, x, y, z)
only describes the E field of an electromagnetic wave in a vacuum if it
also satisfies all four Maxwell equations.
From the fourth, current-free Maxwell equation 14.4, for example, we can
directly read off the orientation of the E and B field components. Here is the
equation again:

∇ × B = µ0 ε0 ∂E/∂t
We know from mathematics that the result vector ∇ × B of the cross product
is always orthogonal to the vectors between which the cross product is formed.
In this case, the B field vector is therefore orthogonal to the time derivative of
the E field vector. However, for a plane wave the time derivative does not change
the direction of the field vector. The E field vector and its derivative therefore
point in the same direction. Thus the solutions E(t, x, y, z) and B(t, x, y, z)
of the wave equation are perpendicular to each other at any time and at any place.
15. Schrödinger Equation

More: en.fufaev.org/schrodinger-equation
Most phenomena in our everyday world can be described using classical
mechanics. The goal of classical mechanics is to find out how a body moves
over time. Classical mechanics therefore determines the trajectory r(t),
that is, the path of this body. With the trajectory, we can predict where this
body was, is and will be at any time t. We thus describe the movement of the
body.
Here are some examples of the motion of bodies whose trajectory r(t) can be
predicted using classical mechanics:

Motion of a rocket

These are all classical problems that can be solved with the help of
Newton's second law of motion, that is, with the following differential
equation:

m a = F  ↔  m d²r/dt² = − ∇Wpot    (15.1)
Here, Wpot is the potential energy of a body of mass m. For example, this could
be the potential energy in the Earth's gravitational field.
By solving the Newton differential equation 15.1 we can find the
unknown trajectory r(t) of a body. The solution is a position vector
r(t) = [x(t), y(t), z(t)], which specifies the three-dimensional position of the
body at any time t.

Once we have determined the trajectory r(t) by solving the differential equation,
we can extract all other physical quantities. Here is an example of these
quantities:

Velocity of the body: v(t) = dr/dt
In order to be able to specify the solution r(t), the initial conditions that
characterize the problem to be solved must also be known. In classical physics,
these are the initial position r(t0) and the initial velocity v(t0) of the body.
In quantum mechanics, on the other hand, it would not even be possible to
specify the position and velocity of a particle exactly at the same time.
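Newton's differential equation 15.1 together with such initial conditions can also be integrated numerically. Here is a minimal sketch, assuming free fall along z (so the force is simply F = −m g) and assumed example initial conditions, using a simple Euler scheme:

```python
# A minimal sketch: integrate Newton's equation m a = F for free fall
# (F = -m g in the z direction) with a simple Euler time-stepping scheme.
# g, dt and the initial conditions are assumed example values.
g = 9.81           # gravitational acceleration in m/s^2
dt = 1e-4          # time step in s
z, v = 100.0, 0.0  # initial position (m) and velocity (m/s) at t0 = 0

t = 0.0
while t < 2.0:         # integrate up to t = 2 s
    v += -g * dt       # dv/dt = F/m = -g
    z += v * dt        # dz/dt = v
    t += dt

# Compare with the analytic trajectory z(t) = z0 - g t^2 / 2
print(z)                           # roughly 80.38
print(100.0 - 9.81 * 2.0**2 / 2)   # 80.38
```

The numerical trajectory matches the analytic solution up to the small discretization error of the Euler method.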
It was only through the novel approach to nature with the help of the
Schrödinger equation that humans succeeded in making part of the microcosm
controllable. This has enabled humans to build lasers, which are now
indispensable in medicine and research, or scanning tunneling
microscopes, which significantly exceed the resolution of conventional light
microscopes. It was only through the Schrödinger equation that the periodic
table of elements and nuclear fusion in our sun were precisely
understood. But this is only a fraction of the applications that the Schrödinger
equation and quantum mechanics have brought us. So let's get to know this
powerful equation a little better.
As is the case with any differential equation, our goal is to solve the Schrödinger
equation to find the desired wave function Ψ and then apply the initial conditions
for a specific quantum mechanical problem (e.g. an electron in a potential well).
You should be familiar with the total energy and its conservation over time.
You already know this from the basics of classical mechanics. The law of
conservation of energy is a fundamental principle of physics, which is also
fulfilled in quantum mechanics in a modified form in conservative fields.
The two quantities are linked with each other by Planck's constant
h = 6.6 · 10⁻³⁴ Js. Because of the tiny value of h, it is understandable why we do
not observe wave-particle duality in our macroscopic everyday life.
In theoretical physics, it is common to express the momentum 15.4 not with the
de Broglie wavelength λ, but with the wavenumber k. The momentum then looks
like this:

p = h k / (2π) = ℏ k    (15.5)

Here ℏ = h/(2π) is defined as the reduced Planck's constant and is only used for
shorter notation. Whether we define the particle momentum as in Eq. 15.4 or
15.5 is purely a matter of taste. We simply stick to the usual representation
15.5 in theoretical physics.
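As a quick numerical illustration with an assumed electron velocity, both representations 15.4 and 15.5 give the same momentum:

```python
import numpy as np

h = 6.626e-34         # Planck's constant in Js
hbar = h / (2 * np.pi)  # reduced Planck's constant

# Assumed example: an electron moving at 1e6 m/s
m = 9.109e-31         # electron mass in kg
v = 1e6               # velocity in m/s
p = m * v             # classical momentum

lam = h / p           # de Broglie wavelength (Eq. 15.4 rearranged)
k = 2 * np.pi / lam   # wavenumber

# Both expressions for the momentum agree:
print(p, hbar * k)    # identical up to floating-point rounding
print(lam)            # about 7.3e-10 m
```

A wavelength of a few ångströms is why such an electron shows pronounced wave behavior, while macroscopic bodies do not.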
The momentum 15.5 is also a measure of whether the particle behaves more
particle-like or wave-like:

The smaller the wavenumber k (that is, the greater the de Broglie
wavelength λ), the more the particle behaves quantum mechanically,
more like a matter wave. In this case, we speak of a quantum mechanical
particle.

The larger the wavenumber k (that is, the smaller the de Broglie
wavelength λ), the more the particle behaves classically, like a real
particle. In this case, we speak of a classical particle.
The particle has a small wavenumber (in other words a large de Broglie
wavelength) if it has a very small momentum p, that is, a small mass m and a
small velocity v. A perfect candidate for such a quantum mechanical particle is
a free electron. By "free" we mean that it is not in an external field.

As time t progresses, the matter wave moves in the positive x direction, just like
the electron we are looking at.
In order to perform calculations with such plane waves without any addition
theorems, we convert the plane wave into a complex exponential function.
This is an equivalent but extremely effective representation of the plane wave.
First, add the complex sine function i A sin(k x − ω t) to the cosine function:
We have thus converted a real function 15.6 into a complex function 15.8. Here,
the imaginary unit i ensures that the plane matter wave becomes complex-valued
and we can immediately represent it as a compact exponential function.
The cosine term is the real part Re(Ψ) and the sine term is the imaginary
part Im(Ψ) of the complex-valued function Ψ. The good thing is that we can
exploit the enormous advantages of the complex notation 15.8 and then agree
that we are only interested in the real part (the cosine term) in the experiment.
We can then simply ignore the imaginary part.
However, remember that a complex plane wave 15.8 is also a possible solution
of the Schrödinger equation. Most solutions Ψ (x, t) of the Schrödinger equation
are complex-valued wave functions. Real-valued wave functions, as in Eq.
15.6, are then only a special case.
Next, we use the Euler relation e^{iφ} = cos(φ) + i sin(φ) from mathematics,
which links the complex exponential function with cosine and sine. In our
case, φ = k x − ω t. Let's use this to rewrite our complex plane wave:
Whenever you encounter such a complex exponential function 15.9, you know
immediately that it always describes a plane wave - in this case a matter wave.
Our original, real-valued plane wave 15.6 as a cosine function is contained in
the complex exponential function 15.9, namely as the real part Re(Ψ ) of
the wave function.
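This relationship between the complex exponential and the real plane wave can be verified numerically (A, k and ω are arbitrary example values):

```python
import numpy as np

# Check numerically that the complex exponential A e^{i(kx - wt)}
# contains the real plane wave A cos(kx - wt) as its real part.
A, k, omega = 2.0, 3.0, 5.0       # assumed example values
x = np.linspace(0, 10, 1000)
t = 0.7

psi = A * np.exp(1j * (k * x - omega * t))   # complex plane wave
cosine = A * np.cos(k * x - omega * t)       # real plane wave
sine = A * np.sin(k * x - omega * t)

print(np.allclose(psi.real, cosine))  # True
print(np.allclose(psi.imag, sine))    # True
```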
15.1.3.1 Plane wave in a complex plane
Such a complex-valued wave function 15.9, at a fixed coordinate x, can be
represented in the complex number plane as an arrow Ψ (a complex
vector).
∂²Ψ/∂x² = (1/c²) ∂²Ψ/∂t²    (15.11)

In our case, c = ω/k is the phase velocity of the matter wave. On the
left-hand side of the wave equation is the second derivative of the wave function
with respect to the space coordinate x. It is therefore reasonable to differentiate
the plane wave twice with respect to x:

∂²Ψ/∂x² = A (∂²/∂x²) e^{i(k x − ω t)} = −k² A e^{i(k x − ω t)}    (15.12)
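The second derivative 15.12 can be checked with a finite-difference approximation (the parameter values are arbitrary example choices):

```python
import numpy as np

# Finite-difference check: the second spatial derivative of the plane wave
# A e^{i(kx - wt)} equals -k^2 times the wave itself.
# A, k, omega and t are assumed example values.
A, k, omega, t = 1.0, 4.0, 2.0, 0.3
dx = 1e-5
x = np.linspace(0, 2, 2001)

psi = lambda x: A * np.exp(1j * (k * x - omega * t))

# Central-difference approximation of the second derivative
d2psi = (psi(x + dx) - 2 * psi(x) + psi(x - dx)) / dx**2

print(np.allclose(d2psi, -k**2 * psi(x), rtol=1e-4))  # True
```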
Next, we carry out four seemingly arbitrary steps that will ultimately lead us to
the Schrödinger equation. In these steps, we want to link the second derivative
15.13 of the wave function with the constant total energy W of the quantum
particle.

Let's use the de Broglie relation p = ℏ k and replace k² in the second
derivative 15.13:

∂²Ψ/∂x² = − (p²/ℏ²) Ψ    (15.14)
With the kinetic energy Wkin = p²/2m, that is p² = 2m Wkin, this becomes:

∂²Ψ/∂x² = − (2m/ℏ²) Wkin Ψ    (15.15)

If we now look at the total energy 15.10 multiplied by the wave function,
we see that Wkin Ψ occurs there. We therefore rearrange 15.15 for Wkin Ψ:

Wkin Ψ = − (ℏ²/2m) ∂²Ψ/∂x²    (15.16)
If we now insert Eq. 15.16 into the total energy 15.10 multiplied by the wave
function, we obtain the time-independent Schrödinger equation.

15.2 Interpretation of the Wave Function
This is where the statistical interpretation of the wave function, the so-called
Copenhagen interpretation, comes into play. Although it does not say
what the wave function Ψ(x, t) itself means, it interprets its magnitude squared
|Ψ(x, t)|². By forming the magnitude squared, we obtain a real-valued
(and thus measurable for the experimenter) function |Ψ|².

The statistical interpretation makes use of the mathematical fact that the
magnitude squared is never negative, |Ψ|² ≥ 0, and interprets it as a
probability density. Because as you know: probabilities are never negative.
15.2.1 Probability
Let's stick to the simple one-dimensional case. If we integrate the probability
density |Ψ(x, t)|² over the spatial coordinate x within the distance between the
points x = a and x = b, then we get a probability P(t):

P(t) = ∫ₐᵇ |Ψ(x, t)|² dx    (15.18)

The integral of the probability density |Ψ(x, t)|² indicates with which
probability P(t) the particle is located in the region between a and b
at time t. The probability can generally change over time.
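As a numerical illustration of Eq. 15.18, assume a Gaussian wave function whose magnitude squared is a standard normal distribution:

```python
import numpy as np

# Numerical version of Eq. 15.18 for an assumed example wave function:
# a Gaussian psi(x) whose magnitude squared is a standard normal density.
x = np.linspace(-10.0, 10.0, 100001)
dx = x[1] - x[0]
psi = (2 * np.pi)**-0.25 * np.exp(-x**2 / 4)   # assumed, normalized example
density = np.abs(psi)**2

# Probability of finding the particle between a = -1 and b = 1
mask = (x >= -1.0) & (x <= 1.0)
P = np.sum(density[mask]) * dx
print(P)  # about 0.683

# Integrated over (effectively) all space, the probability is 1
total = np.sum(density) * dx
print(total)  # about 1.0
```

For this example the particle sits between −1 and 1 with roughly 68% probability, and integrating over all space gives 1, as the normalization condition below demands.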
15.2.2 |Ψ |2 graphically
Note, however, that it is not possible to specify a probability P(t) for the
particle being at one specific location (for example at x = a), but only for a spatial
region (here between x = a and x = b). For a single point in space,
the integral 15.18 would be zero. That is the mathematical reason. The physical
reason why we cannot specify a probability for a single point is that there are
infinitely many points in space in the region between a and b. If each of these
points in space were assigned a finite probability, then the sum (i.e. the integral
15.18) of all probabilities would be infinite, which would make no sense at all.
Therefore, we always calculate the probability of being in a spatial region.
In other words: the normalization condition states that the integral 15.18 for
the probability, integrated over the entire space, must always result in
1:

P = ∫₋∞^∞ |Ψ(x, t)|² dx = 1    (15.19)
If you know for sure that the particle can only be located between x = a and
x = b, then you may reduce the integration limits in the normalization condition
15.19 to this spatial region (this can sometimes be useful to solve the integral):

∫ₐᵇ |Ψ(x, t)|² dx = 1    (15.20)
Our goal is to determine the factor A so that the integral over the magnitude
squared of this wave function is one.
You know with one hundred percent probability that the electron must be
between the two electrodes. If we place the negative electrode at x = 0 and the
positive electrode at x = d, then the electron is somewhere between these two
points. The normalization condition becomes:

∫₀^d |Ψ(x, t)|² dx = 1    (15.22)
Next, we need to determine the magnitude squared |Ψ(x, t)|². The magnitude
of the wave function is calculated in the same way as the magnitude of a vector.
This is where the power of the complex exponential function becomes apparent
for the first time. The following always applies: |e^{iφ}| = 1. The magnitude
squared is therefore given by |Ψ|² = A². Let's insert the calculated magnitude
squared into the normalization condition:
∫₀^d A² dx = 1    (15.24)

A² d = 1  →  A = 1/√d
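The normalization result A = 1/√d can be checked numerically for an assumed electrode spacing d:

```python
import numpy as np

# Check of the normalization result A = 1/sqrt(d): the magnitude squared
# A^2 of the plane wave (|e^{i(kx - wt)}| = 1), integrated from 0 to d,
# gives 1. The spacing d is an assumed example value.
d = 2.5e-3                 # electrode spacing in m
A = 1 / np.sqrt(d)

x = np.linspace(0, d, 10001)
density = np.full_like(x, A**2)   # |Psi|^2 = A^2, constant in x

P = np.sum(density[:-1]) * (x[1] - x[0])  # Riemann sum over [0, d]
print(P)  # 1.0 (up to floating-point rounding)
```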
With the normalization factor determined, let's write down the time-independent
Schrödinger equation again:

W Ψ = − (ℏ²/2m) ∂²Ψ/∂x² + Wpot Ψ    (15.27)
In it, we have to extend the second spatial derivative with respect to x so that
the second spatial derivatives with respect to y and z also appear in the three-
dimensional Schrödinger equation. To do this, we simply add these derivatives
to the spatial derivative with respect to x. Then we get the three-dimensional,
time-independent Schrödinger equation:

W Ψ = − (ℏ²/2m) (∂²Ψ/∂x² + ∂²Ψ/∂y² + ∂²Ψ/∂z²) + Wpot Ψ    (15.28)
We can write Eq. 15.28 a little more compactly with the Nabla operator.
To do this, factor out the wave function from the spatial derivatives:

W Ψ = − (ℏ²/2m) (∂²/∂x² + ∂²/∂y² + ∂²/∂z²) Ψ + Wpot Ψ    (15.29)
The sum of the spatial derivatives in the brackets forms the Laplace operator
∇ · ∇ = ∇² (sometimes also written as ∆). This operator is the scalar product of
two Nabla operators. This results in the three-dimensional Schrödinger equation
expressed with the Nabla operator:

W Ψ = − (ℏ²/2m) ∇²Ψ + Wpot Ψ    (15.30)
But what if the total energy W of a quantum particle is not constant in time?
This can happen, for example, if the particle interacts with its environment
and its total energy increases or decreases as a result. For such a quantum
system, we need the time-dependent Schrödinger equation. This is what
it looks like in one spatial dimension:

iℏ ∂Ψ/∂t = − (ℏ²/2m) ∂²Ψ/∂x² + Wpot Ψ    (15.31)
The only requirement for the Separation of Variables to work is that the
potential energy Wpot (x) does not depend on the time t (but it may
very well depend on the position x). The wave function itself can, of course,
still depend on both position and time.
15.6 Stationary Wave Function
First, split the time-dependent wave function Ψ(x, t) (that is, the total
solution) into two parts:

a partial solution ψ(x), which only depends on the location x, and

a partial solution ϕ(t), which only depends on the time t.

This separation ansatz (ansatz is a German word for approach) turns the
total wave function into a product of the two partial solutions:
Ψ(x, t) = ψ(x) ϕ(t) (15.32). Inserted into the time-dependent Schrödinger
equation, this gives:

iℏ (∂/∂t) (ψ(x) ϕ(t)) = − (ℏ²/2m) (∂²/∂x²) (ψ(x) ϕ(t)) + Wpot ψ(x) ϕ(t)    (15.33)
Not all wave functions can be separated into two partial solutions as in Eq.
15.32. However, since the Schrödinger equation is linear, we can form a linear
combination of such solutions and thus obtain all wave functions (including
those that cannot be separated). This is what makes variable separation so
powerful.
As you can see from the time-dependent Schrödinger equation 15.31, the time
derivative and the second spatial derivative occur there. Calculate the two
derivatives of the separation ansatz 15.32:

∂Ψ/∂t = ψ(x) ∂ϕ(t)/∂t    (15.34)

∂²Ψ/∂x² = ϕ(t) ∂²ψ(x)/∂x²    (15.35)
We can insert the time derivative 15.34 and the spatial derivative 15.35 into the
time-dependent Schrödinger equation 15.33:

iℏ ψ(x) ∂ϕ(t)/∂t = − (ℏ²/2m) ϕ(t) ∂²ψ(x)/∂x² + Wpot ψ(x) ϕ(t)    (15.36)
In the following, we omit the position and time dependence in order to write
the Schrödinger equation in a more compact form. Now we have to
reformulate the separated Schrödinger differential equation 15.36 so that its
left-hand side depends only on the time t and its right-hand side only on the
position x. We achieve this by dividing Eq. 15.36 by the product ψ ϕ:

iℏ (1/ϕ) ∂ϕ/∂t = − (ℏ²/2m) (1/ψ) ∂²ψ/∂x² + Wpot    (15.37)
What do we get out of it? Quite a lot! If we change the time t (which only
occurs on the left-hand side), only the left-hand side of the equation could
change, while the right-hand side remains unchanged. Since both sides are
equal, however, the left-hand side cannot change over time either: it is constant.
This constant is real, as a complex-valued constant would violate the
normalization condition. It corresponds to the time-constant total energy W:

iℏ (1/ϕ) ∂ϕ/∂t = W    (15.38)
This is an ordinary differential equation for the time-dependent partial solution
ϕ(t). We can even write down the solution for this differential equation directly;
it is easy to solve with pencil and paper. The time-dependent partial solution is
a complex oscillation:

ϕ(t) = e^{−i W t / ℏ}    (15.39)
Now let's look at the right-hand side of Eq. 15.37. If you change the variable
x on the right-hand side, the left-hand side of the equation remains constant
because it is independent of x. Because of the equality, the right-hand side must
also be constant and equal to the total energy W:

W = − (ℏ²/2m) (1/ψ) ∂²ψ/∂x² + Wpot    (15.40)
By stationary we mean that the solution ψ(x) does not depend on time.
Therefore, we refer to the solution ψ(x) of a stationary Schrödinger equation as
a stationary wave function ψ(x) or as a stationary state.
What have we achieved overall with the separation ansatz? Instead of having
to solve the more complicated time-dependent Schrödinger equation for Ψ(x, t) =
ψ(x) ϕ(t),

iℏ ∂Ψ/∂t = − (ℏ²/2m) ∂²Ψ/∂x² + Wpot Ψ    (15.42)

we can solve the stationary Schrödinger equation 15.41 for ψ(x) instead and
multiply this position-dependent partial solution with the
time-dependent partial solution 15.39. As a result, we obtain the total
solution of the time-dependent Schrödinger equation.
The solution 15.43 is very special, because its magnitude squared |Ψ (x, t)|2 is
time-independent! All other observables that describe the particle are also time-
independent. For example, a quantum particle described by the wave function
15.43 has a constant mean value of the energy ⟨W ⟩, constant mean value of the
momentum ⟨p⟩ and constant mean value of all other observables.
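We can check this time-independence numerically. The following sketch uses
assumed values (units with ℏ = 1, an arbitrary energy W and a Gaussian spatial
part ψ(x) chosen purely for illustration) and compares the probability density
at several times:

```python
import numpy as np

hbar, W = 1.0, 2.0                      # assumed units and energy eigenvalue
x = np.linspace(-5, 5, 1000)
psi = np.exp(-x**2)                     # assumed spatial partial solution psi(x)

for t in (0.0, 1.3, 7.9):
    Psi = psi * np.exp(-1j * W * t / hbar)   # total solution psi(x) * phi(t)
    # the phase factor has modulus 1, so |Psi|^2 never changes in time
    assert np.allclose(np.abs(Psi)**2, np.abs(psi)**2)
```

Whatever time t we plug in, the phase factor drops out of the magnitude
squared, which is exactly why such states are called stationary.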
164 Chapter 15. Schrödinger equation
In three dimensions, the stationary and the time-dependent Schrödinger
equations read:

W Ψ = −(ℏ²/2m) ∇²Ψ + Wpot Ψ   (15.44)

iℏ ∂Ψ/∂t = −(ℏ²/2m) ∇²Ψ + Wpot Ψ   (15.45)

We can factor out the wave function Ψ on the right-hand side:

W Ψ = ( −(ℏ²/2m) ∇² + Wpot ) Ψ   (15.46)

iℏ ∂Ψ/∂t = ( −(ℏ²/2m) ∇² + Wpot ) Ψ   (15.47)

The operator in the parentheses of 15.47 is called the Hamilton operator Ĥ:

Ĥ = −(ℏ²/2m) ∇² + Wpot   (15.48)

Here the first term, −(ℏ²/2m) ∇², is the kinetic energy operator Wkin. With
the Hamilton operator, the two Schrödinger equations take a compact form:

Ĥ Ψ = W Ψ   (15.49)

Ĥ Ψ = iℏ ∂Ψ/∂t   (15.50)
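As a quick numerical illustration of how the Hamilton operator of Eq. 15.48
works, we can discretize it on a grid and let a computer solve the eigenvalue
problem Ĥ ψ = W ψ of Eq. 15.49 for an infinite potential well. This is only a
sketch: the grid size N, the box width L and the unit choice ℏ = m = 1 are
assumptions for the example:

```python
import numpy as np

# assumed units and parameters (hbar = m = 1, box of width L = 1)
hbar, m, L, N = 1.0, 1.0, 1.0, 500
dx = L / (N + 1)

# second-derivative operator by central differences (walls at x = 0 and x = L)
D2 = (np.diag(np.full(N, -2.0))
      + np.diag(np.ones(N - 1), 1)
      + np.diag(np.ones(N - 1), -1)) / dx**2
Wpot = np.zeros(N)                               # infinite well: Wpot = 0 inside
H = -(hbar**2 / (2 * m)) * D2 + np.diag(Wpot)    # Hamilton operator as a matrix

# solve H psi = W psi numerically; eigh returns energies in ascending order
W, psi = np.linalg.eigh(H)
W_exact = (np.pi * hbar)**2 / (2 * m * L**2)     # known ground-state energy
```

The numerically computed ground-state energy W[0] agrees with the analytic
value π²ℏ²/(2mL²) to a fraction of a percent, which grows better the finer the
grid.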
You will usually only learn about this in your Master's degree, when you take
a course on quantum field theory.
In the following Chapter 16, you will learn about the representation of the
wave function as a state vector ("quantum mechanical state"). The advantage:
you can work with the state vector in (almost) the same way as with the usual
vectors that you know from linear algebra.
16. Bra-Ket Notation
More: en.fufaev.org/bra-ket-notation
Consider any one-dimensional wave function Ψ (x) describing a quantum
mechanical particle. We have omitted its time dependence Ψ (t, x) because it
is not relevant in this chapter. The value of the wave function, for instance, at
the location x1 is Ψ (x1 ), at the location x2 the function value is Ψ (x2 ), at the
location x3 the function value is Ψ (x3 ), and so forth. In this manner you can
assign to each point in space x the function value Ψ (x) of the wave function.
Taken together, all these function values yield the shape of the wave function.
We can represent all these function values as a list of values. We can interpret
this list of values as a column vector Ψ . The column vector then has the
following components:
Ψ = (Ψ(x1), Ψ(x2), Ψ(x3), …)ᵀ = (Ψ1, Ψ2, Ψ3, …)ᵀ   (16.1)

(The superscript ᵀ means transposed: 16.1 is a column vector, written here as
a row to save space.)
At the second equality sign, we have represented the function values more
compactly. Instead of writing the first component as Ψ(x1), we compactly
write it as Ψ1.
We can illustrate the column vector 16.1 as follows: the first component
Ψ(x1) forms the coordinate along the first axis, the second component Ψ(x2)
the coordinate along the second axis, and so forth.
In theory, there are of course infinitely many x-values. Therefore, there are
also infinitely many associated function values Ψ(x) as components of the
column vector. If there are infinitely many function values, then the space in
which the state vector Ψ lives is infinite-dimensional. Remember that this
space is not an infinite-dimensional position space but an abstract space.
This abstract space, where the various quantum mechanical state vectors Ψ
live, is called a Hilbert space. In general, this is an infinite-dimensional
vector space. However, it can also be finite-dimensional. For example, spin
states Ψ↑ and Ψ↓, which describe the spin of a particle, reside in a
two-dimensional Hilbert space. That is, state vectors like the spin-up state
Ψ↑ have only two components:
Ψ↑ = (Ψ↑1, Ψ↑2)ᵀ   (16.2)
To work with the wave function as a state vector Ψ, we write it in Dirac's
ket notation with angle brackets:

|Ψ⟩ = (Ψ1, Ψ2, Ψ3, …)ᵀ   (16.3)
So, when you see a ket |Ψ ⟩, you know that it refers to the representation
of the quantum particle as a state vector.
On the other hand, if you see Ψ (x) without ket notation, you know that it
refers to the representation of the quantum particle as a wave function.
To obtain the bra vector ⟨Ψ| adjoint to the ket vector |Ψ⟩, we need to perform
two operations: transpose the column vector into a row vector, and complex
conjugate each of its components.
Since we've interpreted the wave function Ψ (x) as a ket vector |Ψ ⟩, we can
practically work with the ket vector in much the same way as with ordinary
vectors you're familiar with from linear algebra. For example, we can form
a scalar product or tensor product between the bra or ket vectors.
What may be new to you, however, is that unlike vectors from linear algebra,
the components of the ket vector can be complex, and the number of
components can be infinite.
When the state vectors between which you form the scalar product live in an
innite-dimensional Hilbert space, we call this operation not a scalar
product but an inner product. However, the notation ⟨Φ |Ψ ⟩ for the inner
product remains the same as in the case of the scalar product.
We write the bra as a row vector with complex-conjugated components and the
ket as a column vector:

⟨Φ|Ψ⟩ = (Φ1*, Φ2*, Φ3*, …, Φn*) (Ψ1, Ψ2, Ψ3, …, Ψn)ᵀ   (16.6)
We can multiply the row and column vectors in 16.6 just as in the usual
matrix multiplication:

⟨Φ|Ψ⟩ = Φ1* Ψ1 + Φ2* Ψ2 + Φ3* Ψ3 + … + Φn* Ψn   (16.7)

      = ∑ᵢ₌₁ⁿ Φi* Ψi   (16.8)
In the last step, we abbreviated the scalar product using a sum sign. Here,
n represents the dimension of the Hilbert space, that is, the number of
components of a state vector living in this Hilbert space. The dimension of
the Hilbert space can also be infinite, n = ∞.
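The sum 16.8 translates directly into code. Here is a small sketch with
assumed example states in a four-dimensional Hilbert space; note that NumPy's
`vdot` conjugates its first argument, exactly like the bra vector:

```python
import numpy as np

Phi = np.array([1 + 2j, 0.5j, -1.0, 2 - 1j])   # assumed example bra state
Psi = np.array([0.3, 1j, 1 + 1j, -0.5])        # assumed example ket state

# Eq. 16.7 / 16.8: <Phi|Psi> = sum over i of Phi_i* Psi_i
inner = sum(p.conjugate() * q for p, q in zip(Phi, Psi))

assert np.isclose(inner, np.vdot(Phi, Psi))    # vdot conjugates the bra for us
```

Forgetting the complex conjugation on the bra components is a classic mistake;
with `np.dot` instead of `np.vdot` the result would be wrong for complex
states.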
16.3 Continuous Quantum States

The inner product 16.8 with the sum sign is not exact for states with
continuous arguments, such as wave functions Φ(x) and Ψ(x). How can we make
the inner product exact for these states? With an integral:

⟨Φ|Ψ⟩ = ∫ Φ(x)* Ψ(x) dx   (16.10)

So, to calculate the exact inner product of two wave functions Φ(x) and
Ψ(x), we need to evaluate the integral 16.10.
If the inner product is ⟨Φ|Ψ⟩ = 0, then the wave functions Φ(x) and Ψ(x) do
not overlap at all. If, on the other hand, Φ = Ψ and the state is normalized,
the inner product is ⟨Ψ|Ψ⟩ = 1.
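Both situations can be checked numerically with the integral 16.10. As a
sketch, take the normalized box eigenfunctions √2 sin(nπx) on the interval
[0, 1] (an assumed example) and approximate the integral with a midpoint
Riemann sum:

```python
import numpy as np

N = 200000
x, dx = np.linspace(0.0, 1.0, N, endpoint=False, retstep=True)
x = x + dx / 2                            # midpoints of the grid cells

phi = np.sqrt(2) * np.sin(np.pi * x)      # assumed state Phi(x), n = 1
psi = np.sqrt(2) * np.sin(2 * np.pi * x)  # assumed state Psi(x), n = 2

norm = np.sum(phi * phi) * dx      # <Phi|Phi>: full overlap, close to 1
overlap = np.sum(phi * psi) * dx   # <Phi|Psi>: no overlap, close to 0

assert np.isclose(norm, 1.0, atol=1e-6)
assert abs(overlap) < 1e-6
```

The two wave functions are orthogonal, so their inner product integral
vanishes, while each state's inner product with itself gives 1.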
These two cases can be combined in a single equation for orthonormal basis
states using the Kronecker delta δij:

⟨Ψi |Ψj⟩ = δij
You will encounter such matrices in the form of density matrices very often
in quantum mechanics, for example when learning about quantum entanglement.
If we apply a projection matrix to any ket vector |Φ⟩ (which need not be
normalized), we multiply the matrix |Ψ⟩⟨Ψ| by a column vector |Φ⟩. In a
three-dimensional Hilbert space, for example:

|Ψ⟩⟨Ψ| |Φ⟩ = |Ψ⟩⟨Ψ|Φ⟩ = ⎛Ψ1 Ψ1*  Ψ1 Ψ2*  Ψ1 Ψ3*⎞ ⎛Φ1⎞
                        ⎜Ψ2 Ψ1*  Ψ2 Ψ2*  Ψ2 Ψ3*⎟ ⎜Φ2⎟   (16.14)
                        ⎝Ψ3 Ψ1*  Ψ3 Ψ2*  Ψ3 Ψ3*⎠ ⎝Φ3⎠
The special feature of a projection matrix is that it projects the state |Φ⟩
onto the state |Ψ⟩. In simple terms, it yields the part of the quantum state
|Φ⟩ that overlaps with the quantum state |Ψ⟩. The result of the projection is
thus a ket vector |Ψ⟩⟨Ψ|Φ⟩, which describes the overlap of the quantum
states |Φ⟩ and |Ψ⟩.
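A projection like Eq. 16.14 is a one-liner with an outer product. In this
sketch, |Ψ⟩ is an assumed normalized state and |Φ⟩ an assumed arbitrary, not
normalized state:

```python
import numpy as np

Psi = np.array([1.0, 1j, 0.0]) / np.sqrt(2)   # assumed normalized |Psi>
Phi = np.array([0.5, 2.0, 1j])                # assumed arbitrary |Phi>

P = np.outer(Psi, Psi.conjugate())    # projection matrix |Psi><Psi|
projected = P @ Phi                   # |Psi><Psi|Phi>, as in Eq. 16.14

# the projection is the overlap <Psi|Phi> times the state |Psi> ...
assert np.allclose(projected, np.vdot(Psi, Phi) * Psi)
# ... and projecting twice changes nothing: P is idempotent
assert np.allclose(P @ P, P)
```

The idempotence P² = P captures the intuition that once the overlapping part
has been extracted, projecting again extracts nothing new.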
For the sake of illustration, let's assume that our desired basis consists of only
three basis states: {|Ψ1 ⟩, |Ψ2 ⟩, |Ψ3 ⟩}. With each of these basis states, we can
construct projection matrices: |Ψ1 ⟩⟨Ψ1 |, |Ψ2 ⟩⟨Ψ2 |, and |Ψ3 ⟩⟨Ψ3 |.
To represent a quantum state |Φ⟩ in this basis, we first form the sum of the
projection matrices:

∑ᵢ₌₁³ |Ψi⟩⟨Ψi| = |Ψ1⟩⟨Ψ1| + |Ψ2⟩⟨Ψ2| + |Ψ3⟩⟨Ψ3| = I   (16.15)
Now, let's substitute the sum of the basis projection matrices 16.15 for the
identity matrix in |Φ⟩ = I |Φ⟩:

|Φ⟩ = ∑ᵢ₌₁³ |Ψi⟩⟨Ψi|Φ⟩   (16.17)
The resulting state |Φ⟩, although denoted the same as the original state |Φ⟩,
is now represented in the new basis {|Ψi ⟩}. If we want to emphasize the new
basis, we can also assign it an index: |Φ⟩Ψ . I hope you now understand the
usefulness of the concept of projection matrices!
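The completeness relation 16.15 and the basis expansion 16.17 can be sketched
numerically. Here the orthonormal basis is generated from a random matrix via
a QR decomposition, and the state |Φ⟩ is arbitrary; both are assumptions made
purely for the example:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
Q, _ = np.linalg.qr(A)                  # columns of Q: orthonormal basis states
basis = [Q[:, i] for i in range(3)]
Phi = np.array([1 + 1j, -2.0, 0.5j])    # assumed state to expand

# Eq. 16.15: the basis projection matrices sum to the identity matrix
I = sum(np.outer(b, b.conjugate()) for b in basis)
assert np.allclose(I, np.eye(3))

# Eq. 16.17: |Phi> = sum over i of |Psi_i><Psi_i|Phi>
expansion = sum(np.vdot(b, Phi) * b for b in basis)
assert np.allclose(expansion, Phi)
```

No matter which orthonormal basis the QR decomposition produces, the
projectors sum to the identity and the expansion reproduces |Φ⟩ exactly.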
The basis change with a finite number of basis states is, of course, only
exact when the states |Φ⟩ live in a finite-dimensional Hilbert space. For
states with infinitely many components, Eq. 16.21 is only an approximation of
the old state in the new basis. The approximation becomes more accurate the
larger we choose n. Thus, in computational physics, we can save memory by not
choosing n too large, but large enough to approximate the quantum state well
enough in the new basis.
Guess how the basis change for states with infinitely many components can be
made exact? With an integral! To do this, we replace the discrete summation
with a sum sign by a continuous summation with an integral:

|Φ⟩_Ψ = ∫ dx |Ψ⟩⟨Ψ|Φ⟩   (16.22)
In bra-ket notation, the two Schrödinger equations Ĥ Ψ = W Ψ and
Ĥ Ψ = iℏ ∂Ψ/∂t become:

Ĥ |Ψ⟩ = W |Ψ⟩   (16.23)

Ĥ |Ψ⟩ = iℏ ∂/∂t |Ψ⟩   (16.24)
16.8 Mean Values in Bra-Ket Notation

The mean value of the Hamilton operator Ĥ in the state |Ψ⟩ is written as:

⟨Ĥ⟩ = ⟨Ψ| Ĥ |Ψ⟩ = ⟨Ψ| Ĥ Ψ⟩   (16.25)

For the last equal sign, we have taken advantage of the fact that Ĥ applied
to the ket vector |Ψ⟩ results in a new ket vector |Ĥ Ψ⟩. As you know, the
label inside | ⟩ is arbitrary as long as it is clear which state the ket
vector represents.

Now you should have a solid basic knowledge of bra-ket notation. You should
have learned the following from this chapter:
You know how to form the scalar product and the inner product.
You know how to carry out a change of basis with projection matrices.
17. Represent Operators as
Matrices
More: en.fufaev.org/hermitian-operators
Let's take a look at the mean value ⟨Ĥ⟩ of the operator Ĥ in the state |Ψ ⟩. If
Ĥ is the Hamilton operator, then ⟨Ĥ⟩ describes the mean value of the total
energy of a quantum particle:
⟨Ĥ⟩ = ⟨Ψ| Ĥ Ψ⟩   (18.1)

   = ∫ Ψ(x, t)* Ĥ Ψ(x, t) dx   (18.2)
The mean value of the energy is a measurable quantity, so it must be real. A
real number, for example 5, remains the same if we complex conjugate it:
5 = 5*. A complex number, for example 4 + 2i, does not remain the same if we
complex conjugate it: (4 + 2i)* = 4 − 2i. A real mean value therefore
satisfies:

⟨Ψ| Ĥ Ψ⟩ = ⟨Ψ| Ĥ Ψ⟩*   (18.3)

What does this mean for the mean value integral 18.2 if Ĥ is a Hermitian
operator? Let's take a look by calculating the complex-conjugate mean value
⟨Ψ| Ĥ Ψ⟩* using the mean value integral 18.2:
⟨Ψ| Ĥ Ψ⟩* = ( ∫ Ψ* (Ĥ Ψ) dx )*

         = ∫ (Ĥ Ψ)* (Ψ*)* dx   (18.4)

         = ∫ (Ĥ Ψ)* Ψ dx

         = ⟨Ĥ Ψ| Ψ⟩

Together with 18.3, a real mean value therefore requires:

⟨Ψ| Ĥ Ψ⟩ = ⟨Ĥ Ψ| Ψ⟩   (18.5)
This means: for a Hermitian operator, it makes no difference whether Ĥ acts
on the bra or on the ket in the mean value calculation. So if you know that
an operator is Hermitian, you can move the operator wherever you want in the
bra-ket notation.
Usually, if we want to move an operator Ĥ that acts on a ket vector over to
the bra vector, we have to take the adjoint Ĥ† of the operator:

⟨Ψ| Ĥ† Ψ⟩ = ⟨Ĥ Ψ| Ψ⟩   (18.6)

Comparing Eq. 18.6 with Eq. 18.5 shows that a Hermitian operator is equal to
its adjoint:

Ĥ = Ĥ†   (18.7)
To make it clear that Ĥ is a Hermitian operator, the mean value is also
written as follows:

⟨Ψ| Ĥ Ψ⟩ = ⟨Ĥ Ψ| Ψ⟩   (18.8)

        = ⟨Ψ| Ĥ |Ψ⟩

        = ⟨Ψ| Ĥ† |Ψ⟩
Take this to heart: You have a Hermitian operator in front of you. You can take
its eigenstates {|φi ⟩} as a basis and thus represent any other state in this basis.
That's incredible and super useful!
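Relation 18.5 and the realness of the mean value are easy to verify
numerically. In this sketch, a random Hermitian matrix stands in for Ĥ
(built as A + A†, an assumption for the example) and |Ψ⟩ is a random, not
necessarily normalized state:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
H = A + A.conj().T                         # Hermitian by construction: H = H^dagger
Psi = rng.normal(size=4) + 1j * rng.normal(size=4)

lhs = np.vdot(Psi, H @ Psi)   # <Psi | H Psi>
rhs = np.vdot(H @ Psi, Psi)   # <H Psi | Psi>

assert np.isclose(lhs, rhs)          # Eq. 18.5: H may act on the bra or the ket
assert np.isclose(lhs.imag, 0.0)     # the mean value is real
```

With a non-Hermitian matrix, both assertions would fail: the mean value would
pick up an imaginary part, and moving the operator to the bra would require
the adjoint.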
186 Chapter 18. Hermitian Operators
More: en.fufaev.org/quantum-angular-momentum
188 Chapter 19. Angular Momentum
Lx = y pz − z py (19.2)
Ly = z px − x pz (19.3)
Lz = x py − y px (19.4)
Here, ∂x, ∂y and ∂z are derivative operators. When applied to a function,
they yield the derivative of this function with respect to x, y or z. Of
course, a derivative operator on its own makes no sense. For this reason, a
momentum operator only becomes useful when it is applied to a wave function.
The result is a new wave function modified by the operator. With the
axiomatic mappings, we have quantized the classical orbital angular momentum:
L̂x = −iℏ y ∂z + iℏ z ∂y = iℏ (z ∂y − y ∂z )
L̂y = −iℏ z ∂x + iℏ x ∂z = iℏ (x ∂z − z ∂x )
L̂z = −iℏ x ∂y + iℏ y ∂x = iℏ (y ∂x − x ∂y )
In the next step, we will use the anti-distributivity of the adjoint,
(Â B̂)† = B̂† Â†. This swaps the two operators in the parentheses, and the
parentheses disappear:
We know that the position operators ŷ, ẑ and the momentum operators p̂z, p̂y
are Hermitian. Hermitian operators are equal to their adjoints. We can
therefore omit the †:
In the last step, we used the commutator of the position and momentum operator
[x̂, p̂x ] = i ℏ.
For the second equal sign, we have expressed the angular momentum operators
with position and momentum operators. For the third equal sign, we have
multiplied out the brackets.
Then we swap the operators so that some terms cancel out. In the first term,
we can place x̂ at the beginning without problems: x̂ ŷ p̂z p̂y, because x̂
commutes with both ŷ and p̂z (their commutator is zero, so we can move them
back and forth). We can place the operator p̂y in front of p̂z without
problems: x̂ ŷ p̂y p̂z, but not in front of ŷ, because the commutator
[ŷ, p̂y] = iℏ is not zero. Therefore, we must replace ŷ p̂y with iℏ + p̂y ŷ:
x̂ (iℏ + p̂y ŷ) p̂z = iℏ x̂ p̂z + x̂ p̂y ŷ p̂z. This cancels out the term
x̂ p̂y ŷ p̂z:
In the expression ŷ p̂z ŷ p̂x we can swap all operators without any problems and
cancel it out with the other term ŷ p̂x ŷ p̂z :
And also in the expression ẑ p̂y x̂ p̂y operators can be swapped so that the
operator that is related to the expression x̂ p̂y ẑ p̂y cancels out:
Now we come back to the term ŷ p̂x ẑ p̂y, where swapping is not simply
possible. First, we can swap ŷ and p̂x: p̂x ŷ ẑ p̂y, and then ẑ with p̂y, so
that we have the following expression: p̂x ŷ p̂y ẑ. To now swap ŷ with p̂y, we
have to use the replacement iℏ + p̂y ŷ because of the non-vanishing
commutator [ŷ, p̂y] = iℏ: p̂x (iℏ + p̂y ŷ) ẑ = iℏ p̂x ẑ + p̂x p̂y ŷ ẑ. This
turns the commutator into:

In the expression ẑ p̂y ŷ p̂x, we swap ẑ with p̂x: p̂x p̂y ŷ ẑ, and can thus
cancel it out:
As you can see, the commutator [L̂x, L̂z] is not zero, so it is impossible to
know Lx and Lz simultaneously with arbitrary precision. The other two
commutators can be derived in the same way:

[L̂x, L̂y] = iℏ L̂z,   [L̂y, L̂z] = iℏ L̂x
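We can verify these commutator relations in a concrete finite-dimensional
representation. The following sketch uses the standard 3×3 matrices of the
angular momentum operators for the quantum number l = 1, with the assumed
unit choice ℏ = 1:

```python
import numpy as np

hbar = 1.0
s = hbar / np.sqrt(2)
Lx = s * np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=complex)
Ly = s * np.array([[0, -1j, 0], [1j, 0, -1j], [0, 1j, 0]])
Lz = hbar * np.array([[1, 0, 0], [0, 0, 0], [0, 0, -1]], dtype=complex)

def comm(A, B):
    """Commutator [A, B] = AB - BA."""
    return A @ B - B @ A

# cyclic commutator relations of the angular momentum components
assert np.allclose(comm(Lx, Ly), 1j * hbar * Lz)
assert np.allclose(comm(Ly, Lz), 1j * hbar * Lx)
assert np.allclose(comm(Lz, Lx), 1j * hbar * Ly)

# [Lx, Lz] does not vanish, so Lx and Lz cannot be known simultaneously
assert not np.allclose(comm(Lx, Lz), np.zeros((3, 3)))
```

The same matrices also satisfy L̂x² + L̂y² + L̂z² = l(l+1) ℏ² 𝟙 with l = 1,
which anticipates the eigenvalues discussed below.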
What if it were a quantum particle? Let's assume that we have measured the
angular momentum component L̂z of the quantum particle. We have thus exactly
determined its angular momentum component Lz. Due to the non-vanishing
commutators, the other two angular momentum components Lx and Ly have no
definite value. The direction of the total angular momentum vector L is no
longer uniquely given, but lies somewhere on the surface of a cone.
Based on the cone in the illustration, we can already guess that although the
direction of L is not unique, the length of the L vector is unique. We can
determine the length of the L vector using the sum of the squares of the
angular momentum operators:

L̂x² + L̂y² + L̂z²
This sum is briefly written as the L̂² operator. This operator is Hermitian,
so it represents an observable, namely the squared length of the angular
momentum vector. And the great thing is: this operator commutes with each
angular momentum component L̂x, L̂y and L̂z. This is very good, because it
allows us to determine not only one of the angular momentum components of a
quantum particle exactly, but also the magnitude of the total angular
momentum:

L̂ = √( L̂x² + L̂y² + L̂z² )   (19.19)
It would be very bad for physics if the magnitude of the total angular
momentum were not exactly defined at all times. Without a fixed, exact total
angular momentum, the law of conservation of angular momentum would not work
in quantum mechanics.
If the commutator [L̂², L̂z] vanishes (and it does), then we know that there
is a state |Ψ⟩ which is simultaneously an eigenstate of L̂² and an eigenstate
of L̂z. If L̂² is applied to this state, we get back the eigenstate scaled by
an eigenvalue. This eigenvalue L² represents the magnitude of the angular
momentum squared:

L̂² |Ψ⟩ = L² |Ψ⟩   (19.20)
And if L̂z is applied to the state |Ψ⟩, which is also an eigenstate of L̂²,
then we again get the scaled eigenstate with a different eigenvalue. In the
case of L̂z, this eigenvalue represents the magnitude of the angular momentum
component in the z-direction:

L̂z |Ψ⟩ = Lz |Ψ⟩   (19.21)
Using the ladder operators, we can derive the eigenvalues L² and Lz more
precisely. Here I give the famous result that probably every chemist knows.
The eigenvalues L² are multiples of ℏ²:

L̂² |Ψ⟩ = L² |Ψ⟩ = l (l + 1) ℏ² |Ψ⟩   (19.22)

Here, l is the orbital angular momentum quantum number, which takes the
values l = 0, 1, 2, …
Let's take the square root of the magnitude squared. The magnitude L of the
angular momentum vector is therefore L = √(l (l + 1)) ℏ. The eigenvalue of
L̂z is likewise quantized:

L̂z |Ψ⟩ = Lz |Ψ⟩ = m ℏ |Ψ⟩   (19.26)
The quantum number m is called the magnetic quantum number, and it can only
take on values between m = −l and m = l in integer steps. The L̂z angular
momentum component therefore cannot take on continuous values as in classical
physics: the L̂z angular momentum component is quantized.
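A tiny sketch of these quantization rules for an assumed quantum number
l = 2 (with ℏ = 1): the allowed Lz values are mℏ, and each is strictly
smaller in magnitude than the total length √(l(l+1)) ℏ, which is why the L
vector always lies on a cone and never points exactly along the z-axis:

```python
import numpy as np

hbar, l = 1.0, 2                    # assumed values for the example
m = np.arange(-l, l + 1)            # magnetic quantum number: -l, ..., l
Lz = m * hbar                       # allowed measurement values of L_z
L = np.sqrt(l * (l + 1)) * hbar     # magnitude of the angular momentum vector

assert len(m) == 2 * l + 1          # 5 allowed orientations for l = 2
assert np.all(np.abs(Lz) < L)       # |L_z| is always smaller than |L|
```

Even the maximal component Lz = lℏ stays below L = √(l(l+1)) ℏ, so some
angular momentum always remains "hidden" in the undetermined Lx and Ly.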
Let's summarize what you should take away from Chapter 19.
If you enjoyed the book, it would immensely help me if you could leave a brief
review on Amazon with a rating. Even more important is that you send me
any mistakes, suggestions for improvement, or any unclear sections as
soon as possible to the email [email protected] so that I can address them
immediately.
May physics be with you!
Equations of Physics: Solve Every Physics Problem
Imprint
Alexander Fufaev
Peiner Straße 86
31137 Hildesheim, Germany
[email protected]
fufaev.org