
Advanced Engineering Mathematics I

Part I
Linear Algebra and Differential Equations

Authors: Søren Enemark, Steen Markvorsen, and Karsten Schmidt

Translated by: Jesper Kampmann Larsen

Technical University of Denmark

Version: January 11, 2022



eNote 1

Complex Numbers

In this eNote we introduce and investigate the set of numbers C , the complex numbers. Since
C is considered to be an extension of R , the eNote presumes general knowledge of the real
numbers, including the elementary real functions such as the trigonometric functions and the
natural exponential function. Finally elementary knowledge of vectors in the plane is taken for
granted.

Updated 22.09.21. by David Brander. Version 29.05.16. by Karsten Schmidt.

1.1 Introduction

A simple quadratic equation such as x^2 = 25 has two real solutions, viz.

x = 5 and x = −5 ,

since 5^2 = 25 and (−5)^2 = 25 . Likewise the equation x^2 = 2 has two solutions, viz.

x = √2 and x = −√2 ,

since (√2)^2 = 2 and (−√2)^2 = 2 .

In the two examples above the right-hand sides were positive. When considering the
equation
x^2 = k , k ∈ R

we must be more careful; here everything depends on the sign of k . If k ≥ 0, the
equation has the solutions

x = √k and x = −√k ,

since (√k)^2 = k and (−√k)^2 = k . But if k < 0 the equation has no solutions, since real
numbers with negative squares do not exist.

But now we ask ourselves the question, is it possible to imagine a set of numbers larger
than the set of real numbers; a set that includes all the real numbers but in addition also
includes solutions to an equation like
x^2 = −1 ?

The equation should then in analogy to the equations above have two solutions

x = √−1 and x = −√−1 .

Let us be bold and assume that this is in fact possible. We then choose to call this number
i = √−1 . The equation x^2 = −1 then has two solutions, viz.

x = i and x = −i

since, if we assume that the usual rules of algebra hold,

i^2 = (√−1)^2 = −1 and (−i)^2 = (−1 · √−1)^2 = (−1)^2 (√−1)^2 = −1 .
As we just mentioned, we make the further demand on the hypothetical number i , that
one must be able to use the same algebraic rules that apply to the real numbers. We
must e.g. be able to multiply i by a real number b and add this to another real number
a. In this way a new kind of number z of the type
z = a + ib , (a, b) ∈ R^2
emerges.

Below we describe how these ambitions about a larger set of numbers can be fulfilled.
We look at what the structure of this set of numbers should be and which rules apply.
We call this set of numbers the complex numbers and use the symbol C . R must be a
proper subset of C — that is, C contains all of R together with new numbers that
fulfill the ambitions above, which cannot be met within R. As we have already hinted, C must
be two-dimensional in the sense that a complex number contains two real numbers, a and
b.

1.2 Complex Numbers Introduced as Pairs of Real Numbers

The common way of writing a complex number z is


z = a + ib , (1-1)

where a and b are real numbers and i is the new imaginary number that satisfies i^2 = −1 .
This form is very practical in computation with complex numbers. But we have not re-
ally clarified the meaning of the expression (1-1). For what is the meaning of a product
like ib , and what does the addition a + ib mean?

A satisfactory way of introducing the complex numbers is as the set of pairs of real num-
bers ( a, b). In this section we will show how in this set we can define arithmetic opera-
tions (addition, subtraction, multiplication and division) that fulfill the ordinary arith-
metic rules for real numbers. This will turn out to fully justify the form (1-1).

Definition 1.1 The Complex Numbers


The complex numbers C are defined as the set of ordered pairs of real numbers:

C = {( a, b) | a, b ∈ R} (1-2)

equipped with the arithmetic rules described below.

As the symbol for an arbitrary complex number we will use the letter z .

Example 1.2

Here we show five different complex numbers:

z1 = (2, 7) , z2 = (7, 2) , z3 = (0, 1) , z4 = (−5, 0) , z5 = (0, 0) .

First we introduce the arithmetic rule for the addition of complex numbers. Then sub-
traction as a special form of addition.

Definition 1.3 Addition of Complex Numbers


Let z1 = ( a, b) and z2 = (c, d) be two complex numbers.

The sum z1 + z2 is defined as

z1 + z2 = ( a, b) + (c, d) = ( a + c, b + d) . (1-3)

Example 1.4 Addition

For the two complex numbers z1 = (2, 7) and z2 = (4, −3) we have:

z1 + z2 = (2, 7) + (4, −3) = (2 + 4, 7 + (−3)) = (6, 4) .

The complex number (0, 0) is neutral with respect to addition, since for every complex
number z = ( a, b) we have:

z + (0, 0) = ( a, b) + (0, 0) = ( a + 0, b + 0) = ( a, b) = z .

It is evident that (0, 0) is the only complex number that is neutral with respect to addi-
tion.

For every complex number z there exists an additive inverse (also called an opposite number)
denoted −z, which, when added to z, gives (0, 0). The complex number z = ( a, b) has
the additive inverse −z = (− a, −b), since

( a, b) + (− a, −b) = ( a + (− a), b + (−b)) = ( a − a, b − b) = (0, 0) .

It is clear that (− a, −b) is the only additive inverse for z = ( a, b), so the notation −z
is well-defined. By use of this, subtraction of complex numbers can be introduced as a
special form of addition.

Definition 1.5 Subtraction of Complex Numbers


For the two complex numbers z1 and z2 the difference z1 − z2 is defined as the sum
of z1 and the additive inverse for z2 :

z1 − z2 = z1 + (−z2 ) . (1-4)

Let us for two arbitrary complex numbers z1 = ( a, b) and z2 = (c, d) calculate the
difference z1 − z2 using definition 1.5:
z1 − z2 = ( a, b) + (−c, −d) = ( a + (−c), b + (−d)) = ( a − c, b − d) .
This gives the simple formula
z1 − z2 = ( a − c, b − d) . (1-5)

Example 1.6 Subtraction of Complex Numbers

For the two complex numbers z1 = (5, 2) and z2 = (4, −3) we have:

z1 − z2 = (5 − 4, 2 − (−3)) = (1, 5) .

While addition and subtraction appear simple and natural, multiplication and
division of complex numbers appear more peculiar. Later we shall see that all
four arithmetic operations have geometric equivalents in the so-called complex plane, which
constitutes the graphical representation of the complex numbers. But first we must
accept the definitions at face value. First we give the definition of multiplication.
Then follows the definition of division as a special form of multiplication.

Definition 1.7 Multiplication of Complex Numbers


Let z1 = ( a, b) and z2 = (c, d) be two complex numbers.

The product z1 z2 is defined as

z1 z2 = z1 · z2 = ( ac − bd, ad + bc) . (1-6)



Example 1.8 Multiplication of Complex Numbers

For the two complex numbers z1 = (2, 3) and z2 = (1, −4) we have:

z1 z2 = (2, 3) · (1, −4) = (2 · 1 − (3 · (−4)), 2 · (−4) + 3 · 1) = (14, −5) .

The complex number (1, 0) is neutral with respect to multiplication, since for every com-
plex number z = ( a, b) we have that:

z · (1, 0) = ( a, b) · (1, 0) = ( a · 1 − b · 0 , a · 0 + b · 1) = ( a, b) = z .

It is clear that (1, 0) is the only complex number that is neutral with respect to multipli-
cation.

For every complex number z apart from (0, 0) there exists a unique reciprocal number
that when multiplied by the given number gives (1, 0). It is denoted 1/z . The complex
number (a, b) has the reciprocal number

1/z = ( a/(a^2 + b^2) , −b/(a^2 + b^2) ) , (1-7)

since

(a, b) · ( a/(a^2 + b^2) , −b/(a^2 + b^2) ) = ( a^2/(a^2 + b^2) + b^2/(a^2 + b^2) , −ab/(a^2 + b^2) + ba/(a^2 + b^2) ) = (1, 0) .

Exercise 1.9

Show that every complex number z ≠ (0, 0) has exactly one reciprocal number.

By the use of reciprocal numbers we can now introduce division as a special form of
multiplication.

Definition 1.10 Division of Complex Numbers


Let z1 and z2 be arbitrary complex numbers, where z2 ≠ (0, 0).

The quotient z1/z2 is defined as the product of z1 and the reciprocal number 1/z2 for z2 :

z1/z2 = z1 · (1/z2) . (1-8)

Let us for two arbitrary complex numbers z1 = (a, b) and z2 = (c, d) ≠ (0, 0) compute
the quotient z1/z2 from Definition 1.10:

z1 · (1/z2) = (a, b) · ( c/(c^2 + d^2) , −d/(c^2 + d^2) ) = ( (ac + bd)/(c^2 + d^2) , (bc − ad)/(c^2 + d^2) ) .

From this we get the following formula for division:

z1/z2 = ( (ac + bd)/(c^2 + d^2) , (bc − ad)/(c^2 + d^2) ) . (1-9)

Example 1.11 Division of Complex Numbers

Consider two complex numbers z1 = (1, 2) and z2 = (3, 4).

z1/z2 = ( (1·3 + 2·4)/(3^2 + 4^2) , (2·3 − 1·4)/(3^2 + 4^2) ) = ( 11/25 , 2/25 ) .
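The four operations above lend themselves to a direct implementation. The following Python lines are an illustrative sketch (not part of the original eNote): they implement the pair arithmetic from Definitions 1.3, 1.5, 1.7 and 1.10 and check the result of Example 1.11 against Python's built-in complex type; the function names are chosen for illustration only.

```python
# Pair arithmetic from Definitions 1.3, 1.5, 1.7 and 1.10 (illustrative sketch).

def add(z1, z2):
    (a, b), (c, d) = z1, z2
    return (a + c, b + d)                      # (1-3)

def sub(z1, z2):
    (a, b), (c, d) = z1, z2
    return (a - c, b - d)                      # (1-5)

def mul(z1, z2):
    (a, b), (c, d) = z1, z2
    return (a * c - b * d, a * d + b * c)      # (1-6)

def div(z1, z2):
    (c, d) = z2
    r2 = c * c + d * d                         # requires z2 != (0, 0)
    return mul(z1, (c / r2, -d / r2))          # (1-8), using the reciprocal (1-7)

# Example 1.11: (1, 2) / (3, 4) should give (11/25, 2/25) = (0.44, 0.08)
print(div((1, 2), (3, 4)))
print(complex(1, 2) / complex(3, 4))           # Python's built-in result, (0.44+0.08j)
```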

We end this section by showing that the complex numbers, with the above arithmetic
operations, fulfill the computational rules known from the real numbers.

Theorem 1.12 Properties of Complex Numbers


The complex numbers fulfill the following computational rules:

1. Commutative rule for addition: z1 + z2 = z2 + z1

2. Associative rule for addition: (z1 + z2 ) + z3 = z1 + (z2 + z3 )

3. The number (0, 0) is neutral with respect to addition

4. Every z has an opposite number −z where z + (−z) = (0, 0)

5. Commutative rule for multiplication: z1 z2 = z2 z1

6. Associative rule for multiplication: (z1 z2 ) z3 = z1 (z2 z3 )

7. The number (1, 0) is neutral with respect to multiplication


8. Every z ≠ (0, 0) has a reciprocal number 1/z where z · (1/z) = (1, 0)
9. Distributive rule: z1 (z2 + z3 ) = z1 z2 + z1 z3

Proof

Let us look at property 1, the commutative rule. Given two complex numbers z1 = ( a, b) and
z2 = (c, d). We see that

z1 + z2 = ( a + c, b + d) = (c + a, d + b) = z2 + z1 .

To establish the second equality sign we have used that for both the first and the second
coordinates the commutative rule for addition of real numbers applies. By this it is seen that
the commutative rule also applies to complex numbers.

In the proof of the properties 2, 5, 6 and 9 we similarly use the fact that the corresponding
rules apply to the real numbers. The details are left to the reader. For the properties 3, 4, 7
and 8 we refer to treatment above in this section.



Figure 1.1: Six complex numbers in the (x, y)-plane

1.3 Complex Numbers in Rectangular Form

Since to every ordered pair of real numbers corresponds a unique point in the ( x, y)-
plane and vice versa, C can be considered to be the set of points in the ( x, y)-plane.
Figure 1.1 shows six points in the ( x, y)-plane, i.e. six complex numbers.

In the following we will change our manner of writing complex numbers.

First we identify all complex numbers of the type ( a, 0), i.e. the numbers that lie on the
x-axis, with the corresponding real number a . In particular the number (0, 0) is written
as 0 and the number (1, 0) as 1 . Note that this will not be in conflict with the arithmetic
rules for complex numbers and the ordinary rules for real numbers, since

( a, 0) + (b, 0) = ( a + b, 0 + 0) = ( a + b, 0)

and
( a, 0) · (b, 0) = ( a · b − 0 · 0 , a · 0 + 0 · b) = ( ab, 0) .

In this way the x-axis can be seen as an ordinary real number axis and is called the real
axis. In this way the real numbers can be seen as a subset of the complex numbers. That
the y-axis is called the imaginary axis is connected to the extraordinary properties of the
complex number i which we now introduce and investigate.

Definition 1.13 The Number i


By the complex number i we understand the number (0, 1) .

A decisive motivation for the introduction of complex numbers was the wish
for a set of numbers that contained the solution to the equation

x^2 = −1 .

With the number i we have got such a solution because:

i^2 = i · i = (0, 1) · (0, 1) = (0 · 0 − 1 · 1 , 0 · 1 + 1 · 0) = (−1, 0) = −1 .

Theorem 1.14 Complex Numbers in Rectangular Form


Every complex number z = ( a, b) can be written in the form

z = a + i · b = a + ib . (1-10)

This way of writing the complex number is called the rectangular form of z .

Proof

The proof consists of simple manipulations in which we use the new way of writing numbers
of this type.
( a, b) = ( a, 0) + (0, b) = ( a, 0) + (0, 1) · (b, 0) = a + i b .



Figure 1.2: Six complex numbers in rectangular form in the complex number plane

Since 0 = (0, 0) is neutral with respect to addition, and 1 = (1, 0) is neutral


with respect to multiplication, the following identities apply:

0 + z = z and 1z = z .

Furthermore it is easily seen that

0z = 0 .
Let us now consider all complex numbers of the type (0, b) . Since

(0, b) = 0 + ib = ib ,

i can be understood as the unit of the y-axis, and therefore we refer to i as the imaginary
unit. From this comes the name the imaginary axis for the y-axis.

In Figure 1.2 we see an update of the situation from Figure 1.1, where numbers are given
in their rectangular form.

All real numbers are complex but not all complex numbers are real!

Method 1.15 Computation Using the Rectangular form


A decisive advantage arising from the rectangular form of complex numbers is that
one does not have to remember the formulas for the arithmetic rules for addition,
subtraction, multiplication and division given in the definitions 1.3, 1.5, 1.7 and 1.10.
All computations can be carried out by following the usual arithmetic rules for real
numbers and treating the number i as one would treat a real variable — with the
difference, though, that we replace i^2 by −1 .

In the following example it is shown how multiplication can be carried out through
ordinary computation with the rectangular form of the factors.

Example 1.16 Multiplication Using the Rectangular Form

We compute the product of two complex numbers given in rectangular form z1 = a + ib and
z2 = c + id :

z1 z2 = (a + ib)(c + id) = ac + iad + ibc + i^2 bd = ac + iad + ibc − bd


= ( ac − bd) + i( ad + bc) .

The result corresponds to the definition, see Definition 1.7!

Exercise 1.17

Prove that the following rule for real numbers — the so-called zero rule — also applies to
complex numbers: "A product is 0 if and only if at least one of the factors is 0."

Remark 1.18 Powers of Complex Numbers


The property 6 in Theorem 1.12 gives us the possibility to introduce integer powers
of complex numbers, corresponding to integer powers of real numbers. In the fol-
lowing let n be a natural number.

1. z^1 = z , z^2 = z · z , z^3 = z · z · z etc.

2. By definition z^0 = 1 .

3. Finally we put z^(−n) = 1/z^n .

It is easily shown that the usual rules for computations with integer powers of real
numbers also apply for integer powers of complex numbers:

z^n z^m = z^(n+m) and (z^n)^m = z^(nm) .

We end this section by introducing the concepts real part and imaginary part of complex
numbers.

Definition 1.19 Real Part and Imaginary Part


Given a complex number z in rectangular form z = a + ib . By the real part of z we
understand the real number

Re(z) = Re( a + ib) = a , (1-11)

and by the imaginary part of z we understand the real number

Im(z) = Im( a + ib) = b . (1-12)



The expression rectangular form refers to the position of the number in the com-
plex number plane, where Re(z) is the foot of the perpendicular from the number onto
the real axis, and Im(z) the foot of the perpendicular onto the imaginary axis. In
short the real part is the first coordinate of the number while the imaginary
part is the second coordinate of the number.

Note that every complex number z can be written in rectangular form like this:

z = Re(z) + i Im(z) .

Example 1.20 Real Part and Imaginary Part

Three complex numbers are given by

z1 = 3 − 2i , z2 = i5 , z3 = 25 + i .

Find the real part and the imaginary part of each number.

Re(z1 ) = 3 , Im(z1 ) = −2
Re(z2 ) = 0 , Im(z2 ) = 5
Re(z3 ) = 25 , Im(z3 ) = 1

Two complex numbers in rectangular form are equal if and only if both their
real parts and imaginary parts are equal.

1.4 Conjugation of Complex Numbers

Definition 1.21 Conjugation


Let z be a complex number with the rectangular form z = a + ib . By the conjugated
number corresponding to z we understand the complex number $\overline{z}$ given by

$\overline{z} = a - ib$ . (1-13)

Conjugating a complex number corresponds to reflecting the number in the real axis as
shown in Figure 1.3.

Figure 1.3: Reflection in the real axis

It is obvious that the conjugate number of a conjugate number is the original number:

$\overline{\overline{z}} = z$ . (1-14)

Furthermore the following useful formula for the product of a complex number and its
conjugate applies:

$z \cdot \overline{z} = |z|^2$ (1-15)

which is shown by simple calculation.



In the following method we show a smart way of finding the rectangular form of a
fraction when the denominator is not real: we use the fact that the product of a number
z = a + ib and its conjugate $\overline{z} = a - ib$ is always a real number, cf. (1-15).

Method 1.22 Finding the rectangular form of a complex fraction


The way to remember: Multiply the numerator and the denominator by the conjugate of
the denominator. Here the denominator is written in its rectangular form:

z/(a + ib) = z(a − ib)/((a + ib)(a − ib)) = z(a − ib)/(a^2 + b^2) .

An example:

(2 − i)/(1 + i) = (2 − i)(1 − i)/((1 + i)(1 − i)) = (1 − 3i)/(1^2 + 1^2) = (1 − 3i)/2 = 1/2 − (3/2) i .
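Formula (1-15) and Method 1.22 are easy to verify numerically. The following lines are an illustrative check (not part of the eNote) using Python's built-in complex type, whose conjugate() method and abs() function give the conjugate and the absolute value.

```python
# Illustrative check of (1-15) and of the fraction (2 - i)/(1 + i) from Method 1.22.
z = complex(2, -1)                    # 2 - i
w = complex(1, 1)                     # 1 + i

print(z * z.conjugate(), abs(z)**2)   # (5+0j) and 5.0 : z times its conjugate equals |z|^2
print(z * w.conjugate() / (w * w.conjugate()).real)   # (0.5-1.5j), i.e. 1/2 - (3/2)i
print(z / w)                          # (0.5-1.5j), the same rectangular form
```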

For conjugation in connection with the four ordinary arithmetic operations the following
rules apply.

Theorem 1.23 Arithmetic Rules for Conjugation

1. $\overline{z_1 + z_2} = \overline{z_1} + \overline{z_2}$

2. $\overline{z_1 - z_2} = \overline{z_1} - \overline{z_2}$

3. $\overline{z_1 \cdot z_2} = \overline{z_1} \cdot \overline{z_2}$

4. $\overline{z_1 / z_2} = \overline{z_1} / \overline{z_2}$ , z2 ≠ 0 .



Proof

The proof is carried out by simple transformation using the rectangular form of the numbers.
As an example we show the first formula. Suppose that z1 = a1 + ib1 and z2 = a2 + ib2 . Then:

$\overline{z_1 + z_2} = \overline{(a_1 + ib_1) + (a_2 + ib_2)} = \overline{(a_1 + a_2) + i(b_1 + b_2)}$
$= (a_1 + a_2) - i(b_1 + b_2) = (a_1 - ib_1) + (a_2 - ib_2)$
$= \overline{z_1} + \overline{z_2} .$

Finally we note that all complex numbers on the real axis are identical with their con-
jugate number and that they are the only complex numbers that fulfill this condition.
Therefore we can state a criterion for whether a given number in a set of complex num-
bers is real:

Theorem 1.24 The Real Criterion


Let A be a subset of C , and let A_R denote the subset of A that consists of real num-
bers. Then:
A_R = { z ∈ A | $\overline{z} = z$ } .

Proof

Let z be an arbitrary number in A ⊆ C with rectangular form z = a + ib . Then:

$\overline{z} = z$ ⇔ a − ib = a + ib ⇔ 2ib = 0 ⇔ b = 0 ⇔ z ∈ A_R .



1.5 Polar Coordinates

The obvious way of stating a point (or a position vector) in an ordinary ( x, y)-coordinate
system is by the point’s rectangular, i.e. orthogonal, coordinates ( a, b). In many situations
it is, however, useful to be able to determine a point by its polar coordinates, consisting of
its distance to (0, 0) together with its direction angle from the x-axis to its position vector.
The direction angle is then positive if it is measured counter-clockwise and negative if
measured clockwise.

Analogously, we now introduce polar coordinates for complex numbers. Let us first be
absolutely clear about the orientation of the complex number plane.

Definition 1.25 Orientation of the Complex Number Plane


The orientation of the complex number plane is determined by a circle with its centre
at the origin being traversed counter-clockwise.

The ingredients in the polar coordinates of complex numbers are (as mentioned above)
its distance to (0, 0) called the absolute value, and its direction angle called the argument.
We now introduce these two quantities.

Definition 1.26 Absolute Value and Argument


Given a complex number z .

By the absolute value of z we understand the length of the corresponding position


vector. The absolute value is written | z | and is also called the modulus or numerical
value.

Suppose z ≠ 0 . Every angle from the positive part of the real axis to the position
vector for z is called an argument for z and is denoted arg(z) . The angle is positive
or negative relative to the orientation of the complex number plane.

The pair ( |z| , arg(z) ) of the absolute value of z and an argument for z will collectively
be called the polar coordinates of the number.

Note that the argument for a number z is not unique. If you add 2π to an
arbitrary argument for z, you get a new valid direction angle for z and there-
fore a valid argument. Therefore a complex number has infinitely many argu-
ments corresponding to turning an integer number of times extra clockwise or
counter-clockwise in order to reach the same point again.

You can always choose an argument for z that lies in the interval from −π to π. Tradi-
tionally this argument is given a preferential position. It is called the principal value of
the argument.

Definition 1.27 Principal Value


Given a complex number z that is not 0 . By the principal argument Arg(z) for z we
understand the uniquely determined argument for z that satisfies:

arg(z) ∈ ] − π, π ] .

We denoted the principal value with a capital initial Arg(z) as compared to


arg(z) that denotes an arbitrary argument. All arguments for a complex num-
ber z are then given by

arg(z) = Arg(z) + p · 2π , p ∈ Z . (1-16)

Two complex numbers are equal if and only if both their absolute values and
the principal arguments are equal.

Example 1.28 Principal Arguments

(Figure: the five numbers 2+2i, −2+2i, −2−2i, 2−2i and −2 marked in the complex number plane.)

The figure shows five complex numbers, four of which lie on the lines through (0, 0) bisecting
the four quadrants. We read:
• 2 + 2i has the principal argument π/4 ,

• 2 − 2i has the principal argument −π/4 ,

• −2 + 2i has the principal argument 3π/4 ,

• −2 − 2i has the principal argument −3π/4 , and

• −2 has the principal argument π .

Whether it is advantageous to use the rectangular form of the complex numbers or
their polar form depends on the situation at hand. In Method 1.29 it is demonstrated
how one can shift between the two forms.

Method 1.29 Rectangular and Polar Coordinates


We consider a complex number z 6= 0 that has the rectangular form z = a + ib and
an argument v :

(Figure: z = a + ib drawn with the modulus |z|, the angle v, and the legs |z| cos(v) and i |z| sin(v).)

1. The rectangular form is computed from the polar coordinates like this:

a = |z| cos(v) and b = |z| sin(v) . (1-17)

2. The absolute value is computed from the rectangular form like this:

|z| = √(a^2 + b^2) . (1-18)

3. An argument is computed from the rectangular form by finding an angle v
that satisfies both of the following equations:

cos(v) = a/|z| and sin(v) = b/|z| . (1-19)

When z is drawn in the first quadrant it is evident that the computational rules
(1-17) and (1-19) are derived from the well-known formulas for cosine and sine of
acute angles in right-angled triangles, and (1-18) from the Pythagorean theorem.
By using the same formulas it can be shown that the introduced methods
are valid regardless of the quadrant in which z lies.

Example 1.30 From Rectangular to Polar Form


Find the polar coordinates for the number z = −√3 + i .

(Figure: z = −√3 + i in the complex number plane, with |z| and the angle v.)

We use the rules in Method 1.29. Initially we identify the real and the imaginary parts of z
as

a = −√3 and b = 1 .

First we determine the absolute value:

|z| = √(a^2 + b^2) = √((−√3)^2 + 1^2) = √(3 + 1) = 2 .

Then the argument is determined. From the equation

cos(v) = a/|z| = −√3/2

we get two possible principal arguments for z , viz.

v = 5π/6 and v = −5π/6 .

From the figure it is seen that z lies in the second quadrant, and the correct principal argument
must therefore be the first of these possibilities. But this can also be determined without
inspection of the figure, since also the equation

sin(v) = 1/2

must be fulfilled. From this we also get two possible principal arguments for z , viz.

v = π/6 and v = 5π/6 .

Since only v = 5π/6 satisfies both equations, we see that Arg(z) = 5π/6 .
Thus we have found the set of polar coordinates for z :

( |z|, Arg(z) ) = ( 2 , 5π/6 ) .
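The conversion in Method 1.29 and the result of Example 1.30 can be checked with Python's standard cmath module; the following sketch is illustrative only and not part of the eNote. cmath.polar returns the pair (|z|, Arg(z)) with the principal argument, and cmath.rect converts back.

```python
import cmath, math

z = complex(-math.sqrt(3), 1)          # z = -sqrt(3) + i
r, v = cmath.polar(z)

print(r)                               # approx. 2.0       = |z|
print(v, 5 * math.pi / 6)              # both approx. 2.618 = 5*pi/6 = Arg(z)
print(cmath.rect(r, v))                # back to rectangular form, approx. (-1.732+1j)
```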

We end this section with the important product rules for absolute values and arguments.

Theorem 1.31 The Product Rule for Absolute Values


The absolute value of the product of two complex numbers z1 and z2 is found by

| z1 · z2 | = | z1 | · | z2 | . (1-20)

From Theorem 1.31 we get the corollary

Corollary 1.32
The absolute value of the quotient of two complex numbers z1 and z2 where z2 ≠ 0
is found by

| z1 / z2 | = | z1 | / | z2 | . (1-21)

The absolute value of the nth power of a complex number z is for every n ∈ Z given
by

| z^n | = | z |^n . (1-22)

Exercise 1.33

Write down in words what the formulas (1-20), (1-21) and (1-22) say and prove them.

Theorem 1.34 The Product Rule for Arguments


Given two complex numbers z1 ≠ 0 and z2 ≠ 0 (which also means z1 z2 ≠ 0) .
Then if v1 is an argument for z1 and v2 is an argument for z2 , then v1 + v2 is an
argument for the product z1 z2 .

Corollary 1.35
Given two complex numbers z1 ≠ 0 and z2 ≠ 0 . Then:

1. If v1 is an argument for z1 and v2 is an argument for z2 , then v1 − v2 is an
argument for the fraction z1/z2 .

2. If v is an argument for z , then n · v is an argument for the power z^n .

Exercise 1.36

Prove Theorem 1.34 and Corollary 1.35.

1.6 Geometric Understanding of the Four Computational Operations

We started by introducing addition, subtraction, multiplication and division of complex


numbers as algebraic operations carried out on pairs of real numbers ( a, b), see defini-
tions 1.3, 1.5, 1.7 and 1.10. Then we showed that the rectangular form of the complex
numbers a + ib leads to a more practical way of computation: One can compute with
complex numbers just as with real numbers, as long as the number i is treated as a real
parameter and it is understood that i^2 = −1 . In this section we shall see that the com-
putational operations can also be viewed as geometrical constructs.

The first exact description of the complex numbers was given by the Norwegian sur-
veyor Caspar Wessel in 1796 . Wessel introduced complex numbers as line segments
with given lengths and directions, that is what we now call vectors in the plane. There-
fore computations with complex numbers were geometric operations carried out on

Figure 1.4: Addition by the method of parallelograms

vectors. In the following we recall the ideas in Wessel's definition. It is easy
to see the equivalence between the algebraic and geometric representations of addition
and subtraction — it is more demanding to understand the equivalence when it comes
to multiplication and division.

Theorem 1.37 Geometric Addition


Addition of two complex numbers z1 and z2 can be obtained geometrically in the
following way:

The position vector for z1 + z2 is the sum of the position vectors for z1 and
z2 . (See Figure 1.4).

Proof

Suppose that z1 and z2 are given in rectangular form as z1 = a + ib and z2 = c + id . Then


the position vector for z1 has the coordinates ( a, b) and the position vector for z2 has the
coordinates (c, d) . The sum of the two position vectors is then ( a + c, b + d) , being the coor-
dinates of the position vector for the complex number ( a + c) + i(b + d) . Since we have that
z1 + z2 = ( a + ib) + (c + id) = ( a + c) + i(b + d) , we have proven the theorem.



Figure 1.5: Subtraction by the method of parallelograms

Geometric subtraction is given as a special form of geometric addition: The position


vector for z1 − z2 is the sum of the position vectors for z1 and the opposite vector to the
position vector for z2 . This is illustrated in Figure 1.5

While in the investigation of geometrical addition (and subtraction) we have used the
rectangular form of complex numbers, in the treatment of geometric multiplication (and
division) we shall need their polar coordinates.

Theorem 1.38 Geometrical Multiplication


Given two complex numbers z1 and z2 that are both different from 0 (which also
means that z1 z2 ≠ 0) . Multiplication of z1 and z2 can be obtained geometrically
in the following way:

1. The absolute value of the product z1 z2 is found by multiplication of


the absolute value of z1 by the absolute value of z2 .

2. An argument for the product z1 z2 is found by adding an argument


for z1 and an argument for z2 .

Proof

The first part of the theorem follows from Theorem 1.31 while the second part is evident from
Theorem 1.34.

Figure 1.6: Multiplication

Example 1.39 Multiplication by Use of Polar Coordinates

Two complex numbers z1 and z2 are given by the polar coordinates ( 1/2 , π/4 ) and ( 2 , 2π/3 ),
respectively. (See Figure 1.6.)

We compute the product of z1 and z2 by the use of their absolute values and arguments:

| z1 z2 | = | z1 | | z2 | = (1/2) · 2 = 1

arg(z1 z2) = arg(z1) + arg(z2) = π/4 + 2π/3 = 11π/12 .

Thus the product z1 z2 is the complex number that has the absolute value 1 and the argument
11π/12 .

Note that it is important to observe whether a set of coordinates is given in


rectangular or in polar form.
Figure 1.7: Division

Example 1.40 Division by Use of Polar Coordinates

The numbers z1 and z2 are given by |z1| = 6 with arg(z1) = u and |z2| = 2 with arg(z2) = v ,
respectively. Then z1/z2 can be determined as

| z1/z2 | = 6/2 = 3 and arg( z1/z2 ) = u − v .

1.7 The Complex Exponential Function

The ordinary exponential function x ↦ e^x , x ∈ R has, as is well known, the character-
istic properties,

1. e^0 = 1 ,

2. e^(x1 + x2) = e^(x1) · e^(x2) for all x1 , x2 ∈ R , and

3. (e^x)^n = e^(nx) for all n ∈ Z and x ∈ R .

In this section we will introduce a particularly useful extension of the real exponential
function to a complex exponential function, that turns out to follow the same rules of
computation as its real counterpart.

Definition 1.41 Complex Exponential Function


By the complex exponential function expC we understand a function that to each
number z ∈ C with the rectangular form z = x + iy attaches the number

expC(z) = expC(x + iy) = e^x · (cos(y) + i sin(y)) , (1-23)

where e (about 2.7182818 . . . ) is the base of the real natural exponential function.

Since we for every real number x get

expC(x) = expC(x + i · 0) = e^x (cos(0) + i sin(0)) = e^x ,

we see that the complex exponential function is everywhere on the real axis identical to
the real exponential function. Therefore we do not risk a contradiction when we in the
following allow (and often use) the way of writing

expC(z) = e^z for z ∈ C . (1-24)
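Definition 1.41 can be turned directly into a small function. The sketch below (not part of the eNote) builds expC from the real exponential, cosine and sine, and compares it with Python's standard cmath.exp, which computes the same function.

```python
import cmath, math

def exp_C(z):
    x, y = z.real, z.imag
    return math.exp(x) * complex(math.cos(y), math.sin(y))   # (1-23)

z = complex(1.0, 2.5)
print(exp_C(z))                                  # approx. (-2.178+1.627j)
print(cmath.exp(z))                              # the same value
print(exp_C(complex(1.0, 0)), math.exp(1.0))     # on the real axis both give e^1
```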

Figure 1.8: Geometric Interpretation of e^z

We now consider the complex number e^z where z is an arbitrary complex number with
the rectangular form z = x + iy . Then (by use of Theorem 1.31) we see that

| e^z | = | e^x (cos(y) + i sin(y)) | = | e^x | | cos(y) + i sin(y) | = | e^x | = e^x . (1-25)

Furthermore (by use of Theorem 1.34) we see that

arg( e^z ) = arg( e^x ) + arg( cos(y) + i sin(y) ) = 0 + y = y . (1-26)

The polar coordinates for e^z are then ( e^x , y ) , which is illustrated in Figure 1.8.

For the trigonometric functions cos( x ) and sin( x ) we know that for every integer p
cos( x + p2π ) = cos( x ) and sin( x + p2π ) = sin( x ) . If the graph for cos( x ) or sin( x ) is
displaced by an arbitrary multiple of 2π , it will be mapped onto itself. Therefore the
functions are called periodic having a period of 2π .

A similar phenomenon is seen for the complex exponential function. It has the imaginary
period i 2π . This is closely connected to the periodicity of the trigonometric functions
as can be seen in the proof of the following theorem.

Theorem 1.42 Periodicity of ez


For every complex number z and every integer p:

e^(z + i p 2π) = e^z . (1-27)

Proof

Suppose that z has the rectangular form z = x + iy and p ∈ Z .

Then:

e^(z + i p 2π) = e^(x + i(y + p 2π))
= e^x ( cos(y + p 2π) + i sin(y + p 2π) ) = e^x ( cos(y) + i sin(y) )
= e^z .

By this the theorem is proved.

In the following example the periodicity of the complex exponential function is illus-
trated.

Example 1.43 Exponential Equation

Determine all solutions to the equation



e^z = −√3 + i . (1-28)

First we write z in rectangular form: z = x + iy . In Example 1.30 we found that the right-hand
side in (1-28) has the absolute value 2 and the principal argument v = 5π/6 . Since the left-
hand and the right-hand sides must have the same absolute value and the same argument,
apart from an arbitrary multiple of 2π , we get

| e^z | = | −√3 + i | ⇔ e^x = 2 ⇔ x = ln(2)

arg(e^z) = arg(−√3 + i) ⇔ y = v + p 2π = 5π/6 + p 2π , p ∈ Z .

All solutions for (1-28) are then

z = x + iy = ln(2) + i ( 5π/6 + p 2π ) , p ∈ Z .
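The solutions found in Example 1.43 can be checked numerically; the following Python lines are an illustrative sketch (not part of the eNote) and also illustrate the periodicity from Theorem 1.42.

```python
# Each z = ln(2) + i(5*pi/6 + p*2*pi) should satisfy e^z = -sqrt(3) + i.
import cmath, math

w = complex(-math.sqrt(3), 1)
for p in (-1, 0, 1, 2):
    z = complex(math.log(2), 5 * math.pi / 6 + p * 2 * math.pi)
    print(cmath.exp(z))        # approx. (-1.732+1j) for every p, i.e. the number w
```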

We end this section by stating and proving the rules of computation mentioned in the
introduction and known from the real exponential function.

Theorem 1.44 Complex Exponential Function Computation Rules

1. e^0 = 1

2. e^(z1 + z2) = e^(z1) · e^(z2) for all z1 , z2 ∈ C

3. (e^z)^n = e^(nz) for all n ∈ Z and z ∈ C

Proof

Point 1 in the theorem, that e^0 = 1, follows from the fact that the complex exponential function
is identical with the real exponential function on the real axis, cf. (1-24).

In point 2 we set z1 = x1 + iy1 and z2 = x2 + iy2 . From the sets of polar coordinates and Theorem 1.38
we get:

e^(z1) · e^(z2) = (e^(x1) , y1) · (e^(x2) , y2) = (e^(x1) · e^(x2) , y1 + y2) = (e^(x1 + x2) , y1 + y2)
= e^((x1 + x2) + i(y1 + y2)) = e^((x1 + iy1) + (x2 + iy2))
= e^(z1 + z2) .

In point 3 we set z = x + iy and with the use of sets of polar coordinates and the repeated
use of Theorem 1.38 we get:

(e^z)^n = ((e^x)^n , n · y) = (e^(n x) , n · y) = e^(n x + i n y) = e^(n (x + iy))
= e^(n z) .

By this the Theorem is proved.

Exercise 1.45

Show that e^z ≠ 0 for every z ∈ C .

1.8 The Exponential Form of Complex Numbers

Let v be an arbitrary real number. If we substitute the pure imaginary number iv into
the complex exponential function we get from Definition 1.41:

e^(iv) = e^(0 + iv) = e^0 (cos(v) + i sin(v)) ,

which yields the famous Euler’s formula.

Theorem 1.46 Euler’s Formula


For every v ∈ R :
e^(iv) = cos(v) + i sin(v) . (1-29)

Figure 1.9: The number e^(iv) in the complex number plane

By use of the definition of the complex exponential function, see Definition


1.41, we derived Euler’s formula. In return we can now use Euler’s formula to
write the complex exponential function in the convenient form

e^z = e^x ( cos(y) + i sin(y) ) = e^x e^(iy) . (1-30)

The two most-used ways of writing complex numbers, both in pure and applied math-
ematics, are the rectangular form (as is frequently used above) and the exponential form.
In the exponential form the polar coordinates of the number (absolute value and argu-
ment) are used in connection with the complex exponential function. Since the polar coordinates
appear explicitly in this form, it is also called the polar form.

Theorem 1.47 The Exponential Form of Complex Numbers


Every complex number z 6= 0 can be written in the form

z = |z| e^(iv) , (1-31)

where v is an argument for z . This way of writing is called the exponential form (or
the polar form) of the number.

Proof

Let v be an argument for the complex number z ≠ 0 , and put r = |z| . We show that r e^(iv) has
the same absolute value and argument as z , and thus the two numbers are identical:

1. | r e^(iv) | = | r | | e^(iv) | = r .

2. Since 0 is an argument for r, and v is an argument for e^(iv) , we have that 0 + v = v is an
argument for the product r e^(iv) .

Method 1.48 Computations Using the Exponential Form


A decisive advantage of the exponential form of complex numbers is that one does
not have to think about the rules of computation for multiplication, division and
powers when the polar coordinates are used, see Theorem 1.31, Corollary 1.32 and
Theorem 1.34. All computations can be carried out using the ordinary rules of com-
putation on the exponential form of the numbers.

We now give an example of multiplication following Method 1.48; cf. Example 1.39.

Example 1.49 Multiplication in Exponential Form

Two complex numbers are given in exponential form,

z1 = (1/2) e^(π/4 i) and z2 = 2 e^(3π/2 i) .

The product of the numbers is found in exponential form as

z1 z2 = (1/2) e^(π/4 i) · 2 e^(3π/2 i) = ((1/2) · 2) e^(π/4 i + 3π/2 i) = 1 · e^(i(π/4 + 3π/2)) = e^(7π/4 i) .

Exercise 1.50

Show that Method 1.48 is correct.

In the following we will show how so-called binomial equations can be solved by the use
of the exponential form. A binomial equation is an equation with two terms, of the form

z^n = w , (1-32)

where w ∈ C and n ∈ N . Binomial equations are described in more detail in eNote 2
about polynomials.

First we show an example of the solution of a binomial equation by use of the exponen-
tial form and then we formulate the general method.

Example 1.51 Binomial Equation in Exponential Form

Find all solutions to the binomial equation



z^4 = −8 + 8√3 i . (1-33)

The idea is that we write both z and the right-hand side in exponential form.

If z has the exponential form z = s e^(iu) , then the equation's left-hand side can be computed as

z^4 = (s e^(iu))^4 = s^4 (e^(iu))^4 = s^4 e^(i 4u) . (1-34)

The right-hand side is also written in exponential form. The absolute value r of the right-hand
side is found by

r = | −8 + 8√3 i | = √( (−8)^2 + (8√3)^2 ) = 16 .

The argument v of the right-hand side satisfies

cos(v) = −8/16 = −1/2 and sin(v) = 8√3/16 = √3/2 .

By use of the two equations the principal argument of the right-hand side can be determined
to be

v = arg(−8 + 8√3 i) = 2π/3 ,
and so the exponential form of the right-hand side is

r e^(iv) = 16 e^(2π/3 i) . (1-35)

We now substitute (1-34) and (1-35) into (1-33) in order to replace the right- and left-hand
side with the exponential counterparts

s^4 e^(i 4u) = 16 e^(2π/3 i) .

Since the absolute value of the left-hand side must be equal to the absolute value of the right-
hand side we get

s^4 = 16 ⇔ s = 16^(1/4) = 2 .

The argument 4u of the left-hand side and the argument 2π/3 of the right-hand side must be
equal apart from a multiple of 2π . Thus

4u = 2π/3 + p 2π ⇔ u = π/6 + p π/2 , p ∈ Z .

These infinitely many arguments correspond, as we have seen earlier, to only four half-lines
from (0, 0) determined by the arguments obtained by putting p = 0, p = 1, p = 2 and p =
3 . For any other value of p the corresponding half-line will be identical to one of the four
mentioned above. E.g. the half-line corresponding to p = 4 has the argument

u = π/6 + 4 · π/2 = π/6 + 2π ,

i.e. the same half-line that corresponds to p = 0 , since the difference in argument is a whole
revolution, that is 2π .

Therefore the given equation (1-33) has exactly four solutions that lie on the four mentioned
half-lines and lie at the distance s = 2 from 0 . Stated in exponential form:

z = 2 e^(i(π/6 + p π/2)) , p = 0 , 1 , 2 , 3 .

Or each recomputed to rectangular form by means of Euler's formula (1-29):

z0 = √3 + i , z1 = −1 + i√3 , z2 = −√3 − i , z3 = 1 − i√3 .

All solutions to a binomial equation z^n = w lie on a circle with centre at 0 and radius equal
to the nth root of the absolute value of the right-hand side. The connecting lines between 0 and the
solutions divide the circle into equal angles. This is illustrated in Figure 1.10 which
shows the solutions to the equation of the fourth degree from Example 1.51.
Figure 1.10: The four solutions for z^4 = −8 + 8√3 i

We now generalize the method in Example 1.51 in the following theorem. The theorem
is proved in eNote 2 about polynomials.

Theorem 1.52 Binomial Equation Solved Using the Exponential Form


Given a complex number w that is different from 0 and that has the exponential
form
w = |w| e^(iv) .

The binomial equation

z^n = w , n ∈ N (1-36)

has n solutions that can be found with the formula

z = |w|^(1/n) e^(i( v/n + p 2π/n )) , where p = 0 , 1 , . . . , n − 1 . (1-37)
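Formula (1-37) translates directly into a small computation. The Python sketch below (not part of the eNote, names chosen for illustration) computes the n solutions and checks them on Example 1.51.

```python
import cmath, math

def binomial_roots(w, n):
    """The n solutions of z^n = w according to (1-37)."""
    r, v = abs(w), cmath.phase(w)                 # |w| and an argument v for w
    return [r ** (1 / n) * cmath.exp(1j * (v / n + p * 2 * math.pi / n))
            for p in range(n)]                    # p = 0, 1, ..., n-1

w = complex(-8, 8 * math.sqrt(3))                 # the right-hand side of Example 1.51
for z in binomial_roots(w, 4):
    print(z, z**4)                                # each z**4 is again approx. -8 + 13.856j = w
```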

Exercise 1.53 Binomial Equation with a Negative Right-Hand Side

Let r be an arbitrary positive real number. Show by use of Theorem 1.52 that the binomial
quadratic equation
z^2 = −r

has the two solutions

z0 = i√r and z1 = −i√r .

1.9 Linear and Quadratic Equations

Let a and b be complex numbers with a ≠ 0 . A complex linear equation of the form

az = b

in analogy with the corresponding real linear equation has exactly one solution

z = b/a .
With a and b in rectangular form, the solution is easily found in rectangular form, as
shown in the following example.

Example 1.54 Solution of a Linear Equation

The equation
(1 − i) z = (5 + 2i)
has the solution
z = (5 + 2i)/(1 − i) = (5 + 2i)(1 + i)/((1 − i)(1 + i)) = (3 + 7i)/2 = 3/2 + (7/2) i .

Also in the solution of complex quadratic equations we use a formula that corresponds
to the well-known solution formula for real quadratic equations. This is given in the
following theorem that is proved in eNote 2 about polynomials.

Theorem 1.55 Solution Formula for Complex Quadratic Equations


Let a, b and c be arbitrary complex numbers with a ≠ 0 . We define the discriminant
by D = b^2 − 4ac . The quadratic equation

a z^2 + b z + c = 0 (1-38)

has two solutions

z0 = (−b − w0)/(2a) and z1 = (−b + w0)/(2a) , (1-39)

where w0 is a solution to the binomial quadratic equation w^2 = D .

If in particular D = 0 , we find z0 = z1 = −b/(2a) .

In this eNote we do not introduce square roots of complex numbers. Therefore


the complex solution formula above differs in one detail from the ordinary real
solution formula.

Concrete examples of the application of the theorem can be found in Section 30.5.2 in
eNote 2 about polynomials.
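Formula (1-39) is also easy to try out numerically. The sketch below is illustrative and not part of the eNote; although the eNote does not introduce complex square roots, Python's cmath.sqrt(D) returns one particular solution w0 of w^2 = D, which is all that (1-39) requires.

```python
import cmath

def solve_quadratic(a, b, c):
    D = b * b - 4 * a * c                 # the discriminant
    w0 = cmath.sqrt(D)                    # one solution w0 of w^2 = D
    return (-b - w0) / (2 * a), (-b + w0) / (2 * a)   # (1-39)

# Small check: z^2 + 1 = 0 has the solutions -i and i (cf. Exercise 1.53 with r = 1).
print(solve_quadratic(1, 0, 1))           # approx. (-1j, 1j)
```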

1.10 Complex Functions of a Real Variable

In this section we use the theory of the so-called epsilon functions for the introduction of dif-
ferentiability. The material is a bit more advanced than previously and knowledge about epsilon
functions from eNote 3 (see Section 3.4) may prove advantageous. Furthermore the reader should
be familiar with the rules of differentiation of ordinary real functions.

We will make a special note of functions of the type

f : t ↦ e^(ct) , t ∈ R , (1-40)

where c is a given complex number. This type of function has many uses in pure and
applied mathematics. A main purpose of this section is to give a closer description
of these. They are examples of the so-called complex functions of a real variable. Our
investigation starts off in a wider sense with this broader class of functions. Among other things we
show how concepts such as differentiability and derivatives can be introduced. Then
we give a fuller treatment of functions of the type in (1-40).

Definition 1.56 Complex Functions of a Real Variable


By a complex function of a real variable we understand a function f that for every t ∈ R
attaches exactly one complex number that is denoted f (t) . A short way of writing a
function f of this type is
f : R → C .

The notation f : R → C tells us the function f uses a variable in the real num-
ber space, but ends up with a result in the complex number space. Consider
e.g. the function f(t) = e^(it) . At the real number t = π/4 we get the complex
function value

f(π/4) = e^(i π/4) = cos(π/4) + i sin(π/4) = √2/2 + (√2/2) i .

Let us consider a function f : R → C . We introduce two real functions g and h by

g(t) = Re( f(t) ) and h(t) = Im( f(t) )

for all t ∈ R . By this f can be stated in rectangular form:

f(t) = g(t) + i · h(t) , t ∈ R . (1-41)

When in the following we introduce differentiability of complex functions of one real


variable, we shall need a special kind of complex function, viz. the so-called epsilon
functions. Similar to real epsilon functions they are auxiliary functions, whose functional
expression is of no interest. The two decisive properties for a real epsilon function e :
R → R are that it satisfies e(0) = 0 , and that e(t) → 0 when t → 0 . The complex
epsilon function is introduced in a similar way.

Definition 1.57 Epsilon Function


By a complex epsilon function of a real variable we understand a function e : R → C,
that satisfies:

1. e(0) = 0, and

2. |e(t)| → 0 for t → 0 .

Note that if e is an epsilon function, then it follows directly from the Definition
1.57 that for every t0 ∈ R:

| e(t − t0 ) | → 0 for t → t0 .

In the following example a pair of complex epsilon functions of a real variable are
shown.

Example 1.58 Epsilon Functions

The function
t ↦ i sin(t) , t ∈ R
is an epsilon function. This is true because requirement 1 in definition 1.57 is fulfilled by

i sin(0) = i · 0 = 0

and requirement 2 by

|i sin(t)| = |i| |sin(t)| = |sin(t)| → 0 for t → 0 .

Also the function

t ↦ t + it , t ∈ R

is an epsilon function, since

0 + i · 0 = 0

and

| t + it | = √(t^2 + t^2) = √2 |t| → 0 for t → 0 .

We are now ready to introduce the concept of differentiability for complex functions of a
real variable.

Definition 1.59 Derivative of a C-valued Function of a Real Variable


A function f : R → C is called differentiable at t0 ∈ R, if a constant c ∈ C and an
epsilon function e : R 7→ C exist such that

f (t) = f (t0 ) + c(t − t0 ) + e(t − t0 )(t − t0 ) , t ∈ R . (1-42)

If f is differentiable at t0 then c is called the derivative for f at t0 .

If f is differentiable at every t0 in an open interval I then f is said to be differentiable


on I .

Differentiability for a complex function of a real variable is tightly connected to the
differentiability of the two real functions in its rectangular form. We now show this.

Theorem 1.60
For a function f : R → C with the rectangular form f(t) = g(t) + ih(t) and a
complex number c with the rectangular form c = a + ib :

f is differentiable at t0 ∈ R with

f ′(t0) = c ,

if and only if g and h are differentiable at t0 with

g′(t0) = a and h′(t0) = b .



Proof

First suppose that f is differentiable at t0 and f ′(t0) = a + ib, where a, b ∈ R . Then there
exists an epsilon function e such that f for every t can be written in the form

f (t) = f (t0 ) + ( a + ib)(t − t0 ) + e(t − t0 )(t − t0 ) .

We rewrite both the left- and the right-hand side into their rectangular form:

g(t) + ih(t) =
g(t0 ) + ih(t0 ) + a(t − t0 ) + ib(t − t0 ) + Re(e(t − t0 )(t − t0 )) + iIm(e(t − t0 )(t − t0 )) =
( g(t0 ) + a(t − t0 ) + Re(e(t − t0 ))(t − t0 )) + i(h(t0 ) + b(t − t0 ) + Im(e(t − t0 ))(t − t0 )) .

From this we get

g(t) = g(t0 ) + a(t − t0 ) + Re(e(t − t0 ))(t − t0 ) and h(t) = h(t0 ) + b(t − t0 ) + Im(e(t − t0 ))(t − t0 ) .

In order to conclude that g and h are differentiable at t0 with g′(t0) = a and h′(t0) = b , it
only remains for us to show that Re(e) and Im(e) are real epsilon functions. This follows
from

1. e(0) = Re(e(0)) + iIm(e(0)) = 0 yields Re(e(0)) = 0 and Im(e(0)) = 0 , and


2. |e(t)| = √( |Re(e(t))|^2 + |Im(e(t))|^2 ) → 0 for t → 0 yields that Re(e(t)) → 0 for t → 0
and Im(e(t)) → 0 for t → 0 .

The converse statement in the theorem is similarly proved.

Example 1.61 Derivative of a Complex Function

By the expression

f(t) = t + i t^2

a function f : R → C is defined. Since the real part of f has the derivative 1 and the
imaginary part of f the derivative 2t we obtain from Theorem 1.60:

f ′(t) = 1 + i 2t , t ∈ R .

Example 1.62 Derivative of a Complex-valued Function

Consider the function f : R → C given by

f(t) = e^(it) = cos(t) + i sin(t) , t ∈ R .

Since cos′(t) = − sin(t) and sin′(t) = cos(t) , it is seen from Theorem 1.60 that

f ′(t) = − sin(t) + i cos(t) , t ∈ R .

In the following theorem we consider the so-called linear properties of differentiation.


These are well known from real functions.

Theorem 1.63 Computational Rules for Derivatives


Let f 1 and f 2 be differentiable complex functions of a real variable, and let c be an
arbitrary complex number. Then:

1. The function f1 + f2 is differentiable with the derivative

( f1 + f2 )′(t) = f1′(t) + f2′(t) . (1-43)

2. The function c · f1 is differentiable with the derivative

( c · f1 )′(t) = c · f1′(t) . (1-44)

Proof

Let f 1 (t) = g1 (t) + i h1 (t) and f 2 (t) = g2 (t) + i h2 (t), where g1 , h1 , g2 and h2 are differentiable
real functions. Furthermore let c = a + ib be an arbitrary complex number in rectangular
form.

First part of the theorem:

( f 1 + f 2 )(t) = f 1 (t) + f 2 (t) = g1 (t) + i h1 (t) + g2 (t) + i h2 (t)


= ( g1 (t) + g2 (t)) + i (h1 (t) + h2 (t)) .

We then get from Theorem 1.60 and by the use of computational rules for derivatives for real
functions:

( f1 + f2 )′(t) = ( g1 + g2 )′(t) + i ( h1 + h2 )′(t)
= g1′(t) + g2′(t) + i ( h1′(t) + h2′(t) )
= ( g1′(t) + i h1′(t) ) + ( g2′(t) + i h2′(t) )
= f1′(t) + f2′(t) .

By this the first part of the theorem is proved.

Second part of the theorem:

c · f 1 (t) = ( a + ib) · ( g1 (t) + i h1 (t))


= ( a g1 (t) − b h1 (t)) + i ( a h1 (t) + b g1 (t)) .

We get from Theorem 1.60 and by the use of computational rules for derivatives for real
functions:

( c · f1 )′(t) = ( a g1 − b h1 )′(t) + i ( a h1 + b g1 )′(t)
= a g1′(t) − b h1′(t) + i ( a h1′(t) + b g1′(t) )
= ( a + ib) ( g1′(t) + i h1′(t) )
= c · f1′(t) .

By this the second part of the theorem is proved.

Exercise 1.64

Show that if f 1 and f 2 are differentiable complex functions of a real variable, then the function
f 1 − f 2 is differentiable with the derivative

( f1 − f2 )′(t) = f1′(t) − f2′(t) . (1-45)

We now return to functions of the type (1-40). First we give a useful theorem about their
conjugation.

Theorem 1.65
For an arbitrary complex number c and every real number t:

$\overline{e^{ct}} = e^{\overline{c}\,t}$ . (1-46)

Proof

Let c = a + ib be the rectangular form of c . We then get by the use of Definition 1.41 and the
rules of computation for conjugation in Theorem 1.23:

$\overline{e^{ct}} = \overline{e^{at + ibt}}$
$= \overline{e^{at} (\cos(bt) + i \sin(bt))}$
$= e^{at}\,\overline{\cos(bt) + i \sin(bt)}$
$= e^{at} (\cos(bt) - i \sin(bt))$
$= e^{at} (\cos(-bt) + i \sin(-bt))$
$= e^{at - ibt}$
$= e^{\overline{c}\,t} .$

Thus the theorem is proved.

For ordinary real exponential functions of the type

f : x ↦ e^(kx) , x ∈ R ,

where k is a real constant we have the well-known derivative

f ′(x) = k f(x) = k e^(kx) . (1-47)

We end this eNote by showing that the complex exponential function of a real variable
satisfies a quite similar rule of differentiation.

Theorem 1.66 Differentiation of ect


Consider an arbitrary number c ∈ C . The function f : R → C given by

f(t) = e^(ct) , t ∈ R (1-48)

is differentiable and its derivative is determined by

f ′(t) = c f(t) = c e^(ct) . (1-49)

Proof

Let the rectangular form of c be c = a + ib . We then get

e^(ct) = e^(at + ibt)
= e^(at) (cos(bt) + i sin(bt))
= e^(at) cos(bt) + i e^(at) sin(bt) .

Thus we have

f(t) = g(t) + i h(t) , where g(t) = e^(at) cos(bt) and h(t) = e^(at) sin(bt) .

Since g and h are differentiable, f is also differentiable. Furthermore, since

g′(t) = a e^(at) cos(bt) − e^(at) b sin(bt) and h′(t) = a e^(at) sin(bt) + e^(at) b cos(bt) ,

we now get

f ′(t) = a e^(at) cos(bt) − e^(at) b sin(bt) + i ( a e^(at) sin(bt) + e^(at) b cos(bt) )
= ( a + ib) e^(at) (cos(bt) + i sin(bt))
= ( a + ib) e^(at + ibt)
= c e^(ct) .

Thus the theorem is proved.

If c in Theorem 1.66 is real, (1-49) simply expresses the ordinary rule (1-47) for differentiation
of the real exponential function, as expected.
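The rule (1-49) can also be checked numerically. The following Python lines are an illustrative sketch (not part of the eNote): a central difference quotient of f(t) = e^(ct) is compared with c·e^(ct) at a sample point.

```python
import cmath

c = complex(1.0, 2.0)                  # an arbitrary complex constant
t, h = 0.7, 1e-6

f = lambda t: cmath.exp(c * t)
numerical = (f(t + h) - f(t - h)) / (2 * h)   # approximates f'(t)
exact = c * f(t)                              # the derivative from (1-49)

print(numerical)
print(exact)                                  # agrees with the difference quotient to many digits
```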

eNote 2

Polynomials of One Variable

In this eNote complex polynomials of one variable are introduced. An elementary knowledge of
complex numbers is a prerequisite, and knowledge of real polynomials of one real variable is
recommended.

Updated: 29.05.16. Karsten Schmidt. 11.9.2021. David Brander.

2.1 Introduction

Polynomials are omnipresent in the technical literature about mathematical models of


physical problems. A great advantage of polynomials is the simplicity of computation
since only addition, multiplication and powers are needed. Because of this polynomials
are especially applicable as approximations to more complicated types of functions.

Knowledge about the roots of polynomials is the main road to understanding their prop-
erties and efficient usage, and is therefore a major subject in the following. But first we
introduce some general properties.

Definition 2.1
By a polynomial of degree n we understand a function that can be written in the form

P(z) = a_n z^n + a_(n−1) z^(n−1) + · · · + a_1 z + a_0 (2-1)

where a_0 , a_1 , . . . , a_n are complex constants with a_n ≠ 0 , and z is a complex vari-
able.

a_k is called the coefficient of z^k , k = 0, 1, . . . , n , and a_n is the leading coefficient.


A real polynomial is a polynomial in which all the coefficients are real.
A real polynomial of a real variable is a real polynomial in which we assume z ∈ R .

Polynomials are often denoted by a capital P or similar letter Q, R, S . . . If the


situation requires you to include the variable name, the polynomial is written
as P(z) where it is understood that z is an independent complex variable.

Example 2.2 Examples of Polynomials

P(z) = 2 z^3 + (1 + i) z + 5 is a polynomial of the third degree.

Q(z) = z^2 + 1 is a real quadratic polynomial.
R(z) = 17 is a polynomial of the 0th degree.
S(z) = 0 is called the 0-polynomial and is not assigned any degree.

T(z) = 2 z^3 + 5 z^(−4) is not a polynomial.

If you multiply a polynomial by a constant, or when you add, subtract, multiply and
compose polynomials with each other, you get a new polynomial. This polynomial can
be simplified by gathering terms of the same degree and written in the form (2-1).

Example 2.3 Addition and Multiplication of Polynomials

Two polynomials P and Q are given by P(z) = z^2 − 1 and Q(z) = 2 z^2 − z + 2 . The
polynomials R = P + Q and S = P · Q are determined like this:

R(z) = (z^2 − 1) + (2 z^2 − z + 2) = (z^2 + 2 z^2) + (−z) + (−1 + 2) = 3 z^2 − z + 1 .

S(z) = (z^2 − 1) · (2 z^2 − z + 2) = (2 z^4 − z^3 + 2 z^2) + (−2 z^2 + z − 2)
= 2 z^4 − z^3 + (2 z^2 − 2 z^2) + z − 2 = 2 z^4 − z^3 + z − 2 .

2.2 The Roots of Polynomials

Definition 2.4 Root of a Polynomial


By a root of a polynomial P(z) we understand a number z0 such that P(z0 ) = 0 .

Example 2.5 Whether a Given Number Is a Root of a Polynomial

Show that 3 is a root of P(z) = z^3 − 5 z − 12 , and that 1 is not a root.

Since P(3) = 3^3 − 5 · 3 − 12 = 0 , 3 is a root of P .

Since P(1) = 1^3 − 5 · 1 − 12 = −16 ≠ 0 , 1 is not a root of P .

To develop the theory we shall need the following Lemma.



Lemma 2.6 The Theorem of Descent


A polynomial P of degree n is given by

P(z) = a_n z^n + a_(n−1) z^(n−1) + · · · + a_1 z + a_0 . (2-2)

If z0 is an arbitrary number, and Q is the polynomial of degree n−1 given by
the coefficients

b_(n−1) = a_n (2-3)
b_k = a_(k+1) + z0 · b_(k+1) for k = n−2, . . . , 0 , (2-4)

then P can be written in the factorized form

P(z) = (z − z0) Q(z) (2-5)

if and only if z0 is a root of P .

Proof

Let the polynomial P be given as in the theorem, and let α be an arbitrary number. Consider
an arbitrary (n − 1)-degree polynomial

Q(z) = b_{n−1} z^{n−1} + b_{n−2} z^{n−2} + · · · + b_1 z + b_0 .

By simple calculation we get

(z − α) Q(z) = b_{n−1} z^n + (b_{n−2} − αb_{n−1}) z^{n−1} + · · · + (b_0 − αb_1) z − αb_0 .

It is seen that the polynomials (z − α) Q(z) and P(z) have the same representation if we in
succession choose the b_k-coefficients for Q as given in (2-3) and (2-4) (with z_0 = α), and if at
the same time the following is valid:

−αb_0 = a_0  ⇔  b_0 α = −a_0 .
We investigate whether this condition is satisfied by using (2-3) and (2-4) in the opposite
order:

b_0 α = (a_1 + αb_1) α = b_1 α^2 + a_1 α
      = (a_2 + αb_2) α^2 + a_1 α = b_2 α^3 + a_2 α^2 + a_1 α
      ⋮
      = b_{n−1} α^n + a_{n−1} α^{n−1} + · · · + a_2 α^2 + a_1 α
      = a_n α^n + a_{n−1} α^{n−1} + · · · + a_2 α^2 + a_1 α .

Thus the condition b_0 α = −a_0 is equivalent to

a_n α^n + a_{n−1} α^{n−1} + · · · + a_2 α^2 + a_1 α = −a_0
⇔ P(α) = a_n α^n + a_{n−1} α^{n−1} + · · · + a_2 α^2 + a_1 α + a_0 = 0 .

It is seen that the condition is satisfied if and only if α is a root of P . By this the proof is
complete.

Example 2.7 Descent

Given the polynomial P(z) = 2z^4 − 12z^3 + 19z^2 − 6z + 9 . It is seen that 3 is a root
since P(3) = 0 . Determine a third-degree polynomial Q such that

P(z) = (z − 3) Q(z) .

We set a_4 = 2, a_3 = −12, a_2 = 19, a_1 = −6 and a_0 = 9 and find the coefficients for Q by the
use of (2-3) and (2-4):

b_3 = a_4 = 2
b_2 = a_3 + 3 b_3 = −12 + 3 · 2 = −6
b_1 = a_2 + 3 b_2 = 19 + 3 · (−6) = 1
b_0 = a_1 + 3 b_1 = −6 + 3 · 1 = −3 .

We conclude that
Q(z) = 2z^3 − 6z^2 + z − 3
so
P(z) = (z − 3) (2z^3 − 6z^2 + z − 3) .
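The recursion (2-3)–(2-4) can be carried out mechanically. The following sketch (the function
name descend is an assumption made here, not taken from the eNote) returns the coefficients of
Q, given the coefficients of P and a root z_0:

```python
def descend(a, z0):
    """Coefficients [b_{n-1}, ..., b_0] of Q from (2-3)-(2-4).
    a = [a_n, a_{n-1}, ..., a_0] are the coefficients of P, highest degree first."""
    b = [a[0]]                        # b_{n-1} = a_n
    for coeff in a[1:-1]:             # b_k = a_{k+1} + z0 * b_{k+1}
        b.append(coeff + z0 * b[-1])
    return b                          # note: a_0 + z0*b_0 equals P(z0), i.e. 0 when z0 is a root

# Example 2.7: P(z) = 2z^4 - 12z^3 + 19z^2 - 6z + 9 with the root 3.
print(descend([2, -12, 19, -6, 9], 3))   # [2, -6, 1, -3]  ->  Q(z) = 2z^3 - 6z^2 + z - 3
```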

When a polynomial P with the root z0 is written in the form P(z) = (z − z0 ) Q1 (z) ,
where Q1 is a polynomial, it is possible that z0 is also a root of Q1 . Then Q1 can

similarly be written as Q_1(z) = (z − z_0) Q_2(z) where Q_2 is a polynomial. And in this
way the descent can successively be carried out as P(z) = (z − z_0)^m R(z) where R is
a polynomial of which z_0 is not a root. We will now show that this factorization is
unique.

Theorem 2.8 The Multiplicity of a Root


If z_0 is a root of the polynomial P , then P can in exactly one way be written in the
factorized form
P(z) = (z − z_0)^m R(z)                                                       (2-6)
where R(z) is a polynomial for which z_0 is not a root.

The exponent m is called the algebraic multiplicity of the root z_0 .

Proof

Assume that α is a root of P , and that (contrary to the statement in the theorem) there exist
two different factorizations

P(z) = (z − α)^r R(z) = (z − α)^s S(z)

where r > s , and R(z) and S(z) are polynomials of which α is not a root. We then get

(z − α)^r R(z) − (z − α)^s S(z) = (z − α)^s ( (z − α)^k R(z) − S(z) ) = 0 ,   for all z ∈ C

where k = r − s. This equation is only satisfied if

(z − α)^k R(z) = S(z)   for all z ≠ α .

Since both the left-hand and the right-hand sides are continuous functions, they must have
the same value at z = α . From this we get that

S(α) = (α − α)^k R(α) = 0

which contradicts the assumption that α is not a root of S .



Example 2.9

In Example 2.7 we found that

P(z) = (z − 3) (2z^3 − 6z^2 + z − 3)

where 3 is a root. But 3 is also a root of the factor 2z^3 − 6z^2 + z − 3 . By using the theorem
of descent, Lemma 2.6, on this polynomial we get

P(z) = (z − 3) (z − 3)(2z^2 + 1) = (z − 3)^2 (2z^2 + 1) .

Since 3 is not a root of 2z^2 + 1 , the root 3 in P has the multiplicity 2 .
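Building on the descend sketch above, the algebraic multiplicity can be found by repeating the
descent for as long as z_0 remains a root of the quotient. Again a hedged sketch, not part of the
eNote; a small tolerance is used because floating-point evaluation is rarely exactly zero:

```python
def multiplicity(a, z0, tol=1e-12):
    """Repeat the descent P(z) = (z - z0) Q(z) while z0 is still a root.
    Uses the descend function from the sketch above; a = coefficients, highest degree first."""
    def value(coeffs, z):             # Horner evaluation of the polynomial at z
        v = 0
        for c in coeffs:
            v = v * z + c
        return v

    m = 0
    while len(a) > 1 and abs(value(a, z0)) < tol:
        a = descend(a, z0)            # deflate one factor (z - z0)
        m += 1
    return m

# Example 2.9: z0 = 3 has multiplicity 2 in 2z^4 - 12z^3 + 19z^2 - 6z + 9.
print(multiplicity([2, -12, 19, -6, 9], 3))   # 2
```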

Now we have started a process of descent! How far can we get along this way? To con-
tinue this investigation we will need a fundamental result, viz the Fundamental Theorem.

2.2.1 The Fundamental Theorem of Algebra

A decisive reason for the introduction of complex numbers is that every (complex) poly-
nomial has a root in the set of complex numbers. This result was proven by the math-
ematician Gauss in his Ph.D. dissertation from 1799. The proof of the theorem is de-
manding, and Gauss strove throughout his life to refine it further. Four versions of the
proof by Gauss exist, so there is no doubt that he put a lot of emphasis on this theorem.
Here we take the liberty to state Gauss' result without proof:

Theorem 2.10 The Fundamental Theorem of Algebra


Every polynomial of degree n ≥ 1 has at least one root within the set of complex
numbers.

The polynomial P(z) = z^2 + 1 has no roots within the set of real numbers. But
within the set of complex numbers it has two roots i and −i because

P(i) = i^2 + 1 = −1 + 1 = 0   and   P(−i) = (−i)^2 + 1 = i^2 + 1 = 0 .



The road from the fundamental theorem of algebra to full knowledge of the number
of roots is not long. We only have to develop the ideas put forward in the theorem of
descent further.

We consider a polynomial P of degree n with leading coefficient a_n . If n ≥ 1 , P has
according to the fundamental theorem of algebra a root α_1 , and therefore by the theorem of
descent, Lemma 2.6, it can be written as

P(z) = (z − α_1) Q_1(z)                                                       (2-7)

where Q_1 is a polynomial of degree n−1 with leading coefficient a_n . If n ≥ 2 , then Q_1
has a root α_2 and can be written as

Q_1(z) = (z − α_2) Q_2(z)

where Q_2 is a polynomial of degree n−2 , also with leading coefficient a_n . By substi-
tution we now get
P(z) = (z − α_1)(z − α_2) Q_2(z) .
In this way the construction of polynomials of descent Q_k of degree n − k for k =
1, . . . , n continues until we reach the polynomial Q_n of degree n−n = 0 , which, in
accordance with Example 2.2, is equal to its leading coefficient a_n . Hereafter P can be
written in its completely factorized form:

P(z) = a_n (z − α_1)(z − α_2) · · · (z − α_n) .                               (2-8)

In this expression we should note three things:

• First, all the n numbers α_1 , . . . , α_n that are listed in (2-8) are roots of P , since
  substitution into the formula gives the value 0 .

• The second thing we notice is that P cannot have other roots than the n given
  ones. That there cannot be more roots is easily seen as follows: If an arbitrary
  number α ≠ α_k , k = 1, . . . , n , is inserted in place of z in (2-8), all factors on the
  right-hand side of (2-8) will be different from zero. Hence their product will also
  be different from zero. Therefore P(α) ≠ 0 , and α is not a root of P .

• The last thing we notice in (2-8) is that the roots are not necessarily different. If
  z_1 , z_2 , . . . , z_p are the p different roots of P , and m_k is the multiplicity of z_k , k =
  1, . . . , p , then the completely factorized form (2-8) can be simplified as follows

  P(z) = a_n (z − z_1)^{m_1} (z − z_2)^{m_2} · · · (z − z_p)^{m_p}             (2-9)

  where the following applies:

  m_1 + m_2 + · · · + m_p = n .

According to the preceding arguments we can now present the fundamental theorem of
algebra in the extended form.

Theorem 2.11 The Fundamental Theorem of Algebra — Version 2


Every polynomial of degree n ≥ 1 has within the set of complex numbers exactly
n roots, when the roots are counted with multiplicity.

Example 2.12 Quadratic Polynomial in Completely Factorized Form

An arbitrary quadratic polynomial P(z) = az^2 + bz + c can be written in the form

P(z) = a(z − α)(z − β)

where α and β are roots of P . If α ≠ β , P has two different roots, both with algebraic
multiplicity 1 . If α = β , P has only one root with the algebraic multiplicity 2 . The root is
then called a double root.

Example 2.13 Algebraic Multiplicity

A polynomial P is given in completely factorized form as:

P(z) = 7 (z − 1)^2 (z + 4)^3 (z − 5) .

We see that P has three different roots: 1, −4 and 5 with the algebraic multiplicities 2 , 3
and 1 , respectively.

We notice that the sum of the algebraic multiplicities is 6 , which equals the degree of P in
concordance with the Fundamental Theorem of Algebra — Version 2.

Example 2.14 Algebraic Multiplicity

State the number of roots of P(z) = z3 .

P has only one root, z = 0 . The algebraic multiplicity of this root is 3. One says that 0 is a
triple root of the polynomial.

2.3 Identical Polynomials

Two polynomials P and Q are equal (as functions of z) if P(z) = Q(z) for all z . But
what does it take for two polynomials to be equal? Is it possible that a fourth-degree and
a fifth-degree polynomial take on the same values for all z as long as you choose
the right coefficients? This is not the case, as is seen from the following theorem.

Theorem 2.15 The Identity Theorem for Polynomials


Two polynomials are identical if and only if they are of the same degree, and all
coefficients for corresponding terms of the same degree from the two polynomials
are equal.

Proof

We consider two arbitrary polynomials P and Q . If they are of the same degree, and all
the coefficients for terms of the same degree are equal, they must have the same value for
all values of the variable and hence they are identical. This proves the first direction of the
identity theorem.

Assume hereafter that P and Q are identical as functions of z, but that not all coefficients for
terms of the same degree from the two polynomials are equal. We assume further that P has
the degree n and Q the degree m where n ≥ m . Let a_k be the coefficients for P and let b_k
be the coefficients for Q , and consider the difference polynomial

R(z) = P(z) − Q(z)
     = (a_n − b_n) z^n + (a_{n−1} − b_{n−1}) z^{n−1} + · · · + (a_1 − b_1) z + (a_0 − b_0)     (2-10)

where we for the case n > m put b_k = 0 for m < k ≤ n . We note that the 0-degree
coefficient (a_0 − b_0) cannot be the only coefficient of R(z) that is different from 0, since this
would make P(0) − Q(0) = (a_0 − b_0) ≠ 0 , which contradicts that P and Q are identical as
functions. Therefore the degree of R is greater than or equal to 1. On the other hand (2-10)
shows that the degree of R is at most n . Now let z_k , k = 1, . . . , n + 1 , be n + 1 different

numbers. They are all roots of R since

R(z_k) = P(z_k) − Q(z_k) = 0 ,   k = 1, . . . , n + 1 .

This contradicts the fundamental theorem of algebra – version 2, Theorem 2.11: R cannot
have a number of roots that is higher than its degree. The assumption, that not all coefficients
of terms of the same degree from P and Q are equal, must therefore be wrong. From this
it also follows that P and Q have the same degree. By this the second part of the identity
theorem is proven.

Example 2.16 Two Identical Polynomials

The equation
3 z^2 − z + 4 = a z^2 + b z + c
is satisfied for all z exactly when a = 3, b = −1 and c = 4 .

Exercise 2.17 Two Identical Polynomials

Determine the numbers a, b and c such that

(z − 2)(a z^2 + b z + c) = z^3 − 5 z + 2   for all z .

In the following section we treat methods of finding roots of certain types of polynomi-
als.

2.4 Polynomial Equations

From the fundamental theorem of algebra, Theorem 2.10, we know that every polyno-
mial of degree greater than or equal to 1 has roots. Moreover, in the extended version,
Theorem 2.11, it is maintained that for every polynomial the degree is equal to the num-
ber of roots if the roots are counted with multiplicity. But the theorem is a theoretical
theorem of existence that does not help in finding the roots.

In the following methods for finding the roots of simple polynomials are introduced.

But let us keep the level of ambition (safely) low, because in the beginning of the 19th
century the Norwegian algebraist Abel showed that one cannot establish general meth-
ods for finding the roots of arbitrary polynomials of degree larger than four!

For polynomials of higher degree than four a number of smart tricks exist by which one
can successfully find a single root. Hereafter one descends to a polynomial of lower de-
gree — and successively descends to a polynomial of fourth degree or lower for which
one can find the remaining roots.

Let us at the outset maintain that when you want to find the roots of a polynomial
P(z) , you should solve the corresponding polynomial equation P(z) = 0 . As a simple
illustration we can look at the root of an arbitrary first-degree polynomial:

P(z) = az + b .

To find this we shall solve the equation

az + b = 0 .
This is not difficult: it has the solution z_0 = −b/a , which therefore is a root of P(z) .

Finding the roots of a polynomial P is tantamount to finding the solutions to
the polynomial equation P(z) = 0 .

Example 2.18 The Root of a Linear Polynomial

Find the root of a linear polynomial P given by

P(z) = (1 − i) z − (5 + 2i) .

We shall solve the following equation

(1 − i) z − (5 + 2i) = 0 ⇔ (1 − i) z = (5 + 2i) .

We isolate z on the left-hand side:


z = (5 + 2i)/(1 − i) = ((5 + 2i)(1 + i))/((1 − i)(1 + i)) = (3 + 7i)/2 = 3/2 + (7/2) i .

Hence the equation has the solution z_0 = 3/2 + (7/2) i , which is also the root of P.

2.4.1 Binomial Equations

A binomial equation is an equation of degree n in which only the coefficients a_n
(for the term of highest degree) and a_0 (the constant term) are different from 0 . A given
binomial equation can always be reduced to the following form:

Definition 2.19 Binomial Equation


A binomial equation has the form z^n = w where w ∈ C and n ∈ N .

For binomial equations an explicit solution formula exists, which we present in the fol-
lowing theorem.

Theorem 2.20 Binomial Equations Solved by Use of the Exponential


Form
Let w ≠ 0 be a complex number with the exponential form

w = |w| e^{iv} .

The binomial equation

z^n = w                                                                       (2-11)

has n different solutions given by the formula

z_p = |w|^{1/n} e^{i(v/n + p·2π/n)}   where p = 0 , 1 , . . . , n − 1 .        (2-12)

Proof

For every p ∈ {0, 1, . . . , n − 1} , z_p = |w|^{1/n} e^{i(v/n + p·2π/n)} is a solution to (2-11), since

(z_p)^n = ( |w|^{1/n} e^{i(v/n + p·2π/n)} )^n = |w| e^{i(v + p·2π)} = |w| e^{iv} = w .

It is seen that the n solutions viewed as points in the complex plane all lie on a circle with
centre at z = 0 , radius |w|^{1/n} and a consecutive angular distance of 2π/n . In other words the
connecting lines between z = 0 and the solutions divide the circle into n angles of the same
size.

From this it follows that all n solutions are mutually different. That there are no more solu-
tions is a consequence of the fundamental theorem of algebra – version 2, Theorem 2.11. By
this the theorem is proven.
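Formula (2-12) translates directly into a few lines of code. A small sketch using Python's cmath
module (the name nth_roots is chosen here for illustration):

```python
import cmath

def nth_roots(w, n):
    """The n solutions of z^n = w given by (2-12)."""
    r, v = abs(w), cmath.phase(w)            # |w| and an argument v of w
    return [r ** (1 / n) * cmath.exp(1j * (v / n + p * 2 * cmath.pi / n))
            for p in range(n)]

# The three solutions of z^3 = 8 lie on a circle of radius 2:
for z in nth_roots(8, 3):
    print(z)        # approximately 2, -1+1.732j and -1-1.732j
```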

In the next examples we will consider some important special cases of binomial equa-
tions.

Example 2.21 Binomial Equation of the Second Degree

We consider a complex number in the exponential form w = |w| e^{iv} . It follows from (2-12)
that the quadratic equation
z^2 = w
has two solutions

z_0 = √|w| e^{iv/2}   and   z_1 = − √|w| e^{iv/2} .

Example 2.22 Binomial Equation of the Second Degree with a Negative


Right-Hand Side

Let r be an arbitrary positive real number. By putting v = Arg(−r) = π in Example 2.21 it
is seen that the binomial equation of the second degree

z^2 = −r

has two solutions

z_0 = i √r   and   z_1 = −i √r .

As a concrete example the equation z^2 = −16 has the solutions z = ±4i .

Sometimes the method used in Example 2.21 can be hard to carry out. In the following
example we show an alternative method.

Example 2.23 Binomial Equation of the Second Degree, Method 2

Solve the equation


z^2 = 8 − 6i .                                                                (2-13)

Since we expect the solution to be complex we put z = x + iy where x and y are real numbers.
If we can find x and y, then we have found the solutions for z. Therefore we have z^2 =
(x + iy)^2 = x^2 − y^2 + 2xyi , and we see that (2-13) is equivalent to

x^2 − y^2 + 2xyi = 8 − 6i .

Since a complex equation is true exactly when both the real parts and the imaginary parts of
the right-hand and the left-hand sides of the equation are identical, (2-13) is equivalent to

x^2 − y^2 = 8   and   2xy = −6 .                                              (2-14)

If we put y = −6/(2x) = −3/x in x^2 − y^2 = 8 , and put x^2 = u , we get a quadratic equation
that can be solved:

x^2 − (−3/x)^2 = 8 ⇔ x^2 − 9/x^2 = 8 ⇔
(x^2 − 9/x^2) x^2 = 8x^2 ⇔ x^4 − 9 = 8x^2 ⇔ x^4 − 8x^2 − 9 = 0 ⇔
u^2 − 8u − 9 = 0 ⇔ u = 9 or u = −1 .

The equation x^2 = u = 9 has the solutions x_1 = 3 and x_2 = −3 , while the equation
x^2 = u = −1 has no solution, since x and y are real numbers. If we put x_1 = 3 and, respectively,
x_2 = −3 into (2-14), we get the corresponding y-values y_1 = −1 and y_2 = 1 .

From this we conclude that the given equation (2-13) has the roots

z_1 = x_1 + iy_1 = 3 − i   and   z_2 = x_2 + iy_2 = −3 + i .
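The method of Example 2.23 can also be carried out numerically. The sketch below follows the
same steps (write z = x + iy and solve for x and y); cmath.sqrt is used only as an independent
check, and the function name is an assumption of this note:

```python
import math, cmath

def sqrt_by_parts(w):
    """Both solutions of z^2 = w, found as in Example 2.23 by writing z = x + iy."""
    a, b = w.real, w.imag
    u = (a + math.hypot(a, b)) / 2       # the non-negative solution of u^2 - a*u - b^2/4 = 0
    x = math.sqrt(u)
    y = b / (2 * x) if x != 0 else math.sqrt(-a)   # x = 0 only happens for w = -r, r >= 0
    return complex(x, y), complex(-x, -y)

print(sqrt_by_parts(8 - 6j))             # (3-1j) and (-3+1j), as in Example 2.23
print(cmath.sqrt(8 - 6j))                # library check: (3-1j)
```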



2.4.2 Quadratic Equations

For the solution of quadratic equations we state below the formula that corresponds
to the well-known solution formula for real quadratic equations. There is a single de-
viation, viz. we do not compute the square root of the discriminant, since in this
theorem we do not presuppose knowledge of square roots of complex numbers.

Theorem 2.24 Solution Formula for Quadratic Equation


For the quadratic equation

a z^2 + b z + c = 0 ,   a ≠ 0                                                 (2-15)

we introduce the discriminant D by D = b^2 − 4ac . The equation has two solutions

z_1 = (−b − w_0)/(2a)   and   z_2 = (−b + w_0)/(2a)                            (2-16)

where w_0 is a solution to the binomial equation of the second degree w^2 = D .

If in particular D = 0 , we have that z_1 = z_2 = −b/(2a) .

Proof

Let w_0 be an arbitrary solution to the binomial equation w^2 = D . We then have:

a z^2 + b z + c = a ( z^2 + (b/a) z + c/a )
               = a ( ( z + b/(2a) )^2 − b^2/(4a^2) + c/a )
               = a ( ( z + b/(2a) )^2 − (b^2 − 4ac)/(4a^2) )
               = a ( ( z + b/(2a) )^2 − D/(4a^2) )
               = a ( ( z + b/(2a) )^2 − w_0^2/(4a^2) )
               = a ( z + b/(2a) + w_0/(2a) ) ( z + b/(2a) − w_0/(2a) )
               = a ( z + (b + w_0)/(2a) ) ( z + (b − w_0)/(2a) ) = 0

⇔ z = (−b − w_0)/(2a)   or   z = (−b + w_0)/(2a) .

By this the solution formula (2-16) is derived.

Example 2.25 Real Quadratic Equation with a Positive Value of the Dis-
criminant

Solve the following quadratic equation with real coefficients:

2z2 + 5z − 3 = 0 .

We identify the coefficients: a = 2, b = 5, c = −3 , and find the discriminant as:

D = 5^2 − 4 · 2 · (−3) = 49 .

It is seen that w_0 = 7 is a solution to the binomial equation of the second degree w^2 = D = 49 .
Now the solutions can be computed as:

z_1 = (−5 + 7)/(2 · 2) = 1/2   and   z_2 = (−5 − 7)/(2 · 2) = −3 .             (2-17)

Example 2.26 Real Quadratic Equation with a Negative Discriminant

Solve the following quadratic equation with real coefficients:

z2 − 2z + 5 = 0 .

We identify the coefficients: a = 1, b = −2, c = 5 , and find the discriminant as:

D = (−2)^2 − 4 · 1 · 5 = −16 .

According to Example 2.22 a solution to the binomial equation of the second degree w^2 =
D = −16 is given by w_0 = 4i . Now the solutions can be computed as:

z_1 = (−(−2) + 4i)/(2 · 1) = 1 + 2i   and   z_2 = (−(−2) − 4i)/(2 · 1) = 1 − 2i .     (2-18)

Example 2.27 A Quadratic Equation with Complex Coefficients

Solve the quadratic equation

z2 − (1 + i)z − 2 + 2i = 0 . (2-19)

First we identify the coefficients: a = 1, b = −(1 + i), c = −2 + 2i , and we find the discrimi-
nant:
D = (−(1 + i))^2 − 4 · 1 · (−2 + 2i) = 8 − 6i .
From Example 2.23 we know that a solution to the binomial equation w^2 = D = 8 − 6i is
w_0 = 3 − i . From this we find the solutions to (2-19) as

z_1 = (−(−(1 + i)) + (3 − i))/(2 · 1) = 2   and   z_2 = (−(−(1 + i)) − (3 − i))/(2 · 1) = −1 + i .   (2-20)

2.4.3 Equations of the Third and Fourth Degree

From antiquity geometrical methods for the solution of (real) quadratic equations are
known. But not until around A.D. 800 did algebraic solution formulae become known, through
the work (in Arabic) of the Persian mathematician Muhammad ibn Musa al-Khwarizmi's
famous book al-Jabr. In the West the name al-Khwarizmi became the well-known word
algorithm, while the book title became algebra.

Three centuries later history repeated itself. Around A.D. 1100 another Persian math-
ematician (and poet) Omar Khayyám gave exact methods on how to find solutions to
real equations of the third and fourth degree by use of advanced geometrical methods.
As an example he solved the equation x^3 + 200x = 20x^2 + 2000 by intersecting a circle
with a hyperbola whose equations he could derive from the equation of the third de-
gree.

Omar Khayyám did not think it possible to draw up algebraic formulae for solutions to
equations of degree greater than two. He was proven wrong by the Italian Gerolamo
Cardano, who in the 16th century published formulae for the solution of equations of
the third and fourth degree.

Khayyám's methods and Cardano's formulae are beyond the scope of this eNote. Here
we only give — see the previous Example 2.9 and the following Example 2.28 — a few
examples of the use of the "method of descent", Lemma 2.6, showing how one can find all so-
lutions to equations of degree greater than two if one in advance knows or can guess a
sufficient number of the solutions.

Example 2.28 An Equation of the Third Degree with an Initial Guess

Solve the equation of third degree

z^3 − 3z^2 + 7z − 5 = 0 .

It is easily guessed that 1 is a solution. By use of the algorithm of descent one easily gets the
factorization:
z^3 − 3z^2 + 7z − 5 = (z − 1)(z^2 − 2z + 5) = 0 .
Since we know that 1 is a solution, the remaining solutions are found by solving the quadratic
equation
z^2 − 2z + 5 = 0 ,
which, according to Example 2.26, has the solutions 1 + 2i and 1 − 2i .

Collectively the equation of the third degree has the solutions 1, 1 + 2i and 1 − 2i .

2.5 Real Polynomials

The theory that has been unfolded in the previous section applies to all polynomials
with complex coefficients. In this section we present two theorems that only apply to
polynomials with real coefficients — that is the subset called the real polynomials. The
first theorem shows that non-real roots always appear in pairs.

Theorem 2.29 Roots in Real Polynomials


If the number a + ib is a root of a polynomial that has only real coefficients, then
the conjugate number a − ib is also a root of the polynomial.

Proof

Let
P(z) = a_n z^n + a_{n−1} z^{n−1} + · · · + a_1 z + a_0
be a real polynomial. By use of the arithmetic rules for conjugation of the sum and prod-
uct of complex numbers (see eNote 1 about complex numbers), and using that all
coefficients are real, we get

P(z̄) = a_n z̄^n + a_{n−1} z̄^{n−1} + · · · + a_1 z̄ + a_0
     = the conjugate of ( a_n z^n + a_{n−1} z^{n−1} + · · · + a_1 z + a_0 )
     = the conjugate of P(z) .

If z_0 is a root of P , we get
P(z̄_0) = the conjugate of P(z_0) = the conjugate of 0 = 0 ,
from which it is seen that z̄_0 is also a root. Thus the theorem is proven.



Example 2.30 Conjugated Roots

Given that the polynomial

P(z) = 3z^2 − 12z + 39                                                        (2-21)

has the root 2 − 3i . Determine all roots of P , and write P in completely factorized
form.

We see that all three coefficients in P are real. Therefore the conjugate 2 + 3i of the given root
is also a root of P . Since P is a quadratic polynomial, there are no more roots.

According to Example 2.12 the completely factorized form for P is

P(z) = 3 (z − (2 − 3i))(z − (2 + 3i)) .

In the completely factorized form of a polynomial it is always possible to multiply the
two factors that correspond to a pair of conjugated roots such that the product forms a
real quadratic polynomial in this way:

(z − (a + ib))(z − (a − ib)) = ((z − a) − ib)((z − a) + ib)
                             = (z − a)^2 − (ib)^2
                             = z^2 − 2az + (a^2 + b^2) .
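The identity above is easy to confirm symbolically, for instance with SymPy (a sketch assuming
SymPy is available):

```python
import sympy as sp

z, a, b = sp.symbols('z a b', real=True)
product = sp.expand((z - (a + sp.I*b)) * (z - (a - sp.I*b)))
print(product)      # a**2 + b**2 - 2*a*z + z**2  -- a real quadratic polynomial
```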

From Theorem 2.29 we know that non-real roots always appear in conjugate pairs.
This leads to the following theorem:

Theorem 2.31 Real Factorization


A real polynomial can be written as a product of real polynomials of the first degree
and real quadratic polynomials without any real roots.

Example 2.32 Real Factorization

Given that a real polynomial of seventh degree P has the roots 1, i, 1 + 2i as well
as the double root −2 , and that the coefficient of its term of the highest degree is
a_7 = 5 . Write P as a product of real linear and real quadratic polynomials without
real roots.

We use the fact that the conjugates of the complex roots are also roots and write P in its
completely factorized form:

P(z) = 5 (z − 1)(z − i)(z + i)(z − (1 + 2i))(z − (1 − 2i))(z + 2)^2 .

Two pairs of factors correspond to conjugated roots. When we multiply these we obtain the
form we wanted:
P(z) = 5 (z − 1)(z^2 + 1)(z^2 − 2z + 5)(z + 2)^2 .

By this we end the treatment of polynomials in one variable.



eNote 3

Elementary Functions

In this eNote we will both repeat some of the basic properties for a selection of the (from high
school) well-known functions f ( x ) of one real variable x, and introduce some new functions,
which typically occur in a variety of applications. The basic questions concerning any function
are usually the following: How, and for which values of x, is the function defined? Which
values for f ( x ) do we get when we apply the functions to the x-elements in the domain? Is the
function continuous? What is the derivative f 0 ( x ) of the function – if it exists? As a new
concept, we will introduce a vast class of functions, the epsilon functions, which are denoted
by the common symbol ε( x ) and which we will use generally in order to describe continuity and
differentiability – also of functions of more variables, which we introduce in the following
eNotes.

(Updated: 22.9.2021 David Brander)

3.1 Domain and Range

In the description of a real function f ( x ) both the real numbers x where the function is
defined and the values that are obtained by applying the function on the domain are
stated. The Domain we denote D ( f ) and the range, or image, we denote R( f ).

Note: in higher mathematics, it is usual to define a function by specifying the domain


and codomain, (the set where the function in principle takes values) rather than the im-
age. For example: f : R → R given by f ( x ) = x2 . The codomain is R, but the range is
the set of non-negative numbers [0, ∞[⊂ R.

Example 3.1 Some Domains and Ranges

Here are domains and the corresponding ranges for some well-known functions.

f_1(x) = exp(x) ,            D(f_1) = R = ]−∞, ∞[ ,        R(f_1) = ]0, ∞[
f_2(x) = ln(x) ,             D(f_2) = ]0, ∞[ ,              R(f_2) = R = ]−∞, ∞[
f_3(x) = √x ,                D(f_3) = [0, ∞[ ,              R(f_3) = [0, ∞[
f_4(x) = x^2 ,               D(f_4) = R = ]−∞, ∞[ ,        R(f_4) = [0, ∞[
f_5(x) = x^7 + 8x^3 + x − 1 , D(f_5) = R = ]−∞, ∞[ ,        R(f_5) = R = ]−∞, ∞[      (3-1)
f_6(x) = exp(ln(x)) ,        D(f_6) = ]0, ∞[ ,              R(f_6) = ]0, ∞[
f_7(x) = sin(1/x) ,          D(f_7) = ]−∞, 0[ ∪ ]0, ∞[ ,    R(f_7) = [−1, 1]
f_8(x) = |x|/x ,             D(f_8) = ]−∞, 0[ ∪ ]0, ∞[ ,    R(f_8) = {−1} ∪ {1}

Figure 3.1: The well-known exponential function e x = exp( x ) and the natural loga-
rithmic function ln( x ). The red circles on the negative x-axis and at 0 indicate that the
logarithmic function is not defined on ] − ∞, 0].

The function f_8(x) in Example 3.1 is defined using |x|, which denotes the ab-
solute value of x, i.e.

|x| =  x > 0 ,    for x > 0
       0 ,        for x = 0                                                   (3-2)
       −x > 0 ,   for x < 0 .

From this the domain and range for f_8(x) follow directly.

Example 3.2 Tangent

The function
f(x) = tan(x) = sin(x) / cos(x)                                               (3-3)
has the domain D(f) = R \ A, A denoting those real numbers x for which cos(x) = 0, cos(x)
being the denominator, i.e.

D ( f ) = R \ { x | cos( x ) = 0} = R \ {(π/2) + p · π , p being an integer} . (3-4)

The range R( f ) is all real numbers, see Figure 3.2.

Figure 3.2: The graphs for the functions tan( x ) and cot( x ).

Exercise 3.3

Let g(x) denote the reciprocal function to the function tan(x):

g(x) = cot(x) = cos(x) / sin(x)                                               (3-5)

Determine the domain for g( x ) and state it in the same way as above for tan( x ), see Figure
3.2.

3.1.1 Extension of the Domain to All of R

A function f(x) that is not defined for all real numbers can easily be extended to a func-
tion f̂(x) which has D(f̂) = R. One way of doing this is by the use of a curly bracket
in the following way:

Definition 3.4
Given a function f(x) with D(f) ≠ R. We then define the 0-extension of f(x) by:

f̂(x) =  f(x) ,   for x ∈ D(f)
        0 ,      for x ∈ R \ D(f) .                                           (3-6)

It is evident that depending on the application one can seal and extend the do-
main for f ( x ) in many other ways than choosing the constant 0 as the value for
the extended function at the points where the original function is not defined.

Naturally, the range R(f̂) for the 0-extended function is the original range for
f(x) united with 0, i.e. R(f̂) = R(f) ∪ {0} .

Hereafter we will assume – unless otherwise stated – that the functions we consider are
defined for all R possibly by extension as above.
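Definition 3.4 is straightforward to imitate in code. A sketch (the name zero_extend is an
assumption of this note) that 0-extends a function whose natural domain excludes some points:

```python
import math

def zero_extend(f):
    """Return the 0-extension of f: f(x) where f is defined, 0 elsewhere."""
    def f_hat(x):
        try:
            return f(x)
        except (ValueError, ZeroDivisionError):   # x outside the natural domain
            return 0.0
    return f_hat

ln_hat = zero_extend(math.log)      # the 0-extension of ln(x)
print(ln_hat(math.e), ln_hat(-1.0), ln_hat(0.0))   # 1.0  0.0  0.0
```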

3.2 Epsilon Functions

We introduce a special class of functions, which we will use in order to define the im-
portant concept of continuity.

Definition 3.5 Epsilon Functions


Every function ε(x) that is defined on an open interval that contains 0 , that assumes the
value ε(0) = 0 at x = 0 , and that moreover tends towards 0 when x tends towards 0 ,
is called an epsilon function of x. Thus epsilon functions are characterized by the
properties:
ε(0) = 0   and   ε(x) → 0 for x → 0 .                                         (3-7)
The last condition means that the absolute value of ε(x) can be
made as small as we wish by choosing the numerical value of x sufficiently small.
To be precise the condition means: For every number a > 0 there exists a number
b > 0 such that |ε(x)| < a for all x satisfying |x| < b.

The set of epsilon functions is very large:

Example 3.6 Epsilon Functions

Here are some simple examples of epsilon functions:

ε 1 (x) = x
ε 2 (x) = |x|
(3-8)
ε 3 ( x ) = ln(1 + x )
ε 4 ( x ) = sin( x ) .

The property of being an epsilon function is rather stable: the product of an ep-
silon function and an arbitrary function that only has to be bounded is
also an epsilon function. The sum and the product of two epsilon functions are
again epsilon functions. The absolute value of an epsilon function is an epsilon
function.

Functions that are 0 at some other point than x = 0 can also give rise to epsilon functions:

If a function g( x ) has the properties g( x0 ) = 0 and g( x ) → 0 for x → x0 then


g( x ) is an epsilon function of x − x0 i.e. we can write g( x ) = ε g ( x − x0 ).

Exercise 3.7

Show that the 0-extension f̂_8(x) of the function f_8(x) = |x|/x is not an epsilon function. Hint:
If we choose k = 10 then clearly there does not exist a value of K such that

|f̂_8(x)| = | |x|/x | = 1 < 1/10 ,   for all x with |x| < 1/K .                 (3-9)

Draw the graph for f̂_8(x). This cannot be drawn without 'lifting the pencil from the paper'!

Exercise 3.8

Show that the 0-extension of the function f ( x ) = sin(1/x ) is not an epsilon function.

3.3 Continuous Functions

We can now formulate the concept of continuity by use of epsilon functions:

Definition 3.9 Continuity


A function f ( x ) is continuous at x0 if there exists an epsilon function ε f ( x − x0 ) such
that the following is valid on an open interval that contains x0 :

f ( x ) = f ( x0 ) + ε f ( x − x0 ) . (3-10)

If f ( x ) is continuous at every x0 on a given open interval in D ( f ) we say that f ( x )


is continuous on the interval.
eNote 3 3.4 DIFFERENTIABLE FUNCTIONS 77

Note that even though it is clear what the epsilon function precisely is in
Definition 3.9, viz. f(x) − f(x_0), the only property in which we are in-
terested is the following: ε_f(x − x_0) → 0 for x → x_0 such that f(x) → f(x_0)
for x → x_0 , which is precisely the concept of continuity as we know it from high
school!

Exercise 3.10

According to the above, all epsilon functions are continuous at x0 = 0 (with the value 0 at
x0 = 0). Construct an epsilon function that is not continuous at any of the points x0 = 1/n
where n = 1, 2, 3, 4, · · · .

Even though the concept of epsilon functions is central to the definition of


continuity (and as we shall see below, to the definition of differentiability),
epsilon functions need not be continuous for any other values than x0 = 0.

Exercise 3.11

Show that the 0-extension f̂(x) of the function f(x) = |x − 7|/(x − 7) is not continuous on
R.

3.4 Differentiable Functions



Definition 3.12 Differentiability


A function f ( x ) is differentiable at x0 ∈ D ( f ) if both a constant a and an epsilon
function ε f ( x − x0 ) exist such that

f ( x ) = f ( x0 ) + a · ( x − x0 ) + ( x − x0 ) · ε f ( x − x0 ) . (3-11)

It is the number a that we call f 0 ( x0 ) and it is well-defined in the sense that if f ( x )


can be stated at all in the form above (i.e. if f ( x ) is differentiable at x0 ) then there is
one and only one value for a that makes this formula true. With this definition of
the derivative f 0 ( x0 ) of f ( x ) at x0 we then have:

f ( x ) = f ( x0 ) + f 0 ( x0 ) · ( x − x0 ) + ( x − x0 ) · ε f ( x − x0 ) . (3-12)

If f ( x ) is differentiable for all x0 in a given open interval in D ( f ), we then naturally


say that f ( x ) is differentiable on the interval. We often write the derivative of f ( x )
at x in the following alternative way:

f'(x) = (d/dx) f(x) .                                                         (3-13)

Explanation 3.13 The Derivative is Unique

We will show that there is only one value of a that fulfills Equation (3-11). Assume
that two different values, a1 and a2 both fulfill (3-11) possibly with two different
epsilon functions:

f ( x ) = f ( x0 ) + a1 · ( x − x0 ) + ( x − x0 ) · ε 1 ( x − x0 )
(3-14)
f ( x ) = f ( x0 ) + a2 · ( x − x0 ) + ( x − x0 ) · ε 2 ( x − x0 ) .
By subtracting the lower equation in (3-14) from the upper one we get:

0 = 0 + ( a1 − a2 ) · ( x − x0 ) + ( x − x0 ) · (ε 1 ( x − x0 ) − ε 2 ( x − x0 )) , (3-15)

such that
a2 − a1 = ε 1 ( x − x0 ) − ε 2 ( x − x0 ) (3-16)
for all x ≠ x_0 – and clearly this cannot be true; the right hand side tends towards
0 when x tends towards x_0 ! Therefore the above assumption, i.e. that a_1 ≠ a_2 , is
wrong. The two constants a_1 and a_2 must be equal, and this is what we should
realize.

The definition above is quite equivalent to the one we know from high school.
If we first subtract f ( x0 ) from both sides of the equality sign in Equation (3-12)
and then divide by ( x − x0 ) we get

( f(x) − f(x_0) ) / ( x − x_0 ) = f'(x_0) + ε_f(x − x_0) → f'(x_0)   for x → x_0 ,     (3-17)
i.e. the well-known limit value for the quotient between the increment in the
function f ( x ) − f ( x0 ) and the x-increment x − x0 . The reason why we do not
apply this known definition of f 0 ( x0 ) is simply that for functions of more vari-
ables the quotient does not make sense – but more about this in a later eNote.

Theorem 3.14 Differentiable Implies Continuous


If a function f ( x ) is differentiable at x0 , then f ( x ) is also continuous at x0 .

Proof

We have that

f(x) = f(x_0) + f'(x_0) · (x − x_0) + (x − x_0) ε_f(x − x_0)
     = f(x_0) + [ f'(x_0) · (x − x_0) + (x − x_0) ε_f(x − x_0) ] ,             (3-18)

and since the function in the square brackets on the right hand side is an epsilon function of
(x − x_0), f(x) is continuous at x_0 .

But the opposite is not valid – here is an example that shows this:

Example 3.15 Continuous But Not Differentiable

The function f(x) = |x| is continuous but not differentiable at x_0 = 0. The function is in itself
an epsilon function and therefore f(x) is continuous at 0. But now assume that there exist a
constant a and an epsilon function ε_f(x − x_0) such that

f(x) = f(x_0) + a · (x − x_0) + (x − x_0) ε_f(x − x_0) .                       (3-19)

The following will then apply:

|x| = 0 + a · x + x · ε_f(x)                                                  (3-20)

and hence for all x ≠ 0:

|x| / x = a + ε_f(x) .                                                        (3-21)

But then a would have to be equal both to −1 and to 1, and this is impossible! Therefore the
assumption above that there exists such a constant a is wrong; accordingly f(x) is not
differentiable at 0.

Definition 3.16
The first degree approximating polynomial for f ( x ) expanded about the point x0 is
defined by:
P1,x0 ( x ) = f ( x0 ) + f 0 ( x0 ) · ( x − x0 ) . (3-22)

Note that P1,x0 ( x ) really is a first degree polynomial in x. The graph for the
function P1,x0 ( x ) is the tangent to the graph for f ( x ) at the point ( x0 , f ( x0 )),
see Figure 3.3. The equation for the tangent is y = P1,x0 ( x ), thus y =
f ( x0 ) + f 0 ( x0 ) · ( x − x0 ). The slope of the tangent is clearly α = f 0 ( x0 ) and
the tangent intersects the y-axis at the point (0, f ( x0 ) − x0 · f 0 ( x0 )). Later we
will find out how we can approximate with polynomials of higher degree n,
i.e. polynomials that are then denoted Pn,x0 ( x ).
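As a small numerical illustration of Definition 3.16, the sketch below compares f(x) = exp(x)
with its first-degree approximating polynomial about x_0 = 0; all names are chosen here for
illustration:

```python
import math

def P1(f, df, x0):
    """First-degree approximating polynomial of f about x0, cf. (3-22)."""
    return lambda x: f(x0) + df(x0) * (x - x0)

tangent = P1(math.exp, math.exp, 0.0)      # exp is its own derivative, so P1(x) = 1 + x
for x in (0.5, 0.1, 0.01):
    print(x, math.exp(x) - tangent(x))     # the error shrinks roughly like x^2
```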

3.4.1 Differentiation of a Product



Figure 3.3: Construction of the tangent y = P1,x0 ( x ) = f ( x0 ) + α · ( x − x0 ) with the


slope α = f 0 ( x0 ) for the function f ( x ). To the right the difference between f ( x ) and the
’tangent value’ P1,x0 ( x ).

Theorem 3.17 Differentiation of f ( x ) · g( x )


A product h( x ) = f ( x ) · g( x ) of two differentiable functions f ( x ) and g( x ) is differ-
entiable and its derivative is as follows:
(d/dx) ( f(x) · g(x) ) = f'(x) · g(x) + f(x) · g'(x) .                         (3-23)

Even though this formula is rather well known from high school we shall give a short
sketch of a proof – to illustrate the use of epsilon functions.

Proof

Since f ( x ) and g( x ) are differentiable in x0 , we have:

f ( x ) = f ( x0 ) + f 0 ( x0 ) · ( x − x0 ) + ( x − x0 ) ε f ( x − x0 )
(3-24)
g ( x ) = g ( x0 ) + g 0 ( x0 ) · ( x − x0 ) + ( x − x0 ) ε g ( x − x0 ) ,

resulting in the product of the two right hand sides:

h( x ) = f ( x ) · g( x )
(3-25)
= f ( x0 ) · g( x0 ) + ( f 0 ( x0 ) · g( x0 ) + f ( x0 ) · g0 ( x0 )) · ( x − x0 ) + ( x − x0 )ε h ( x − x0 ) ,

where we have used (x − x_0) ε_h(x − x_0) as shorthand for the remaining part of the product sum.
Each of the addends in this remaining part contains either the factor (x − x_0)^2 or the
product of (x − x_0) with an epsilon function, and it can therefore be written in the stated form.
But then the product formula follows directly from the factor in front of ( x − x0 ) in Equation
(3-25):
h 0 ( x0 ) = f 0 ( x0 ) · g ( x0 ) + f ( x0 ) · g 0 ( x0 ) . (3-26)

3.4.2 Differentiation of a Quotient

The following differentiation rule is also well known from high school:

Theorem 3.18 Differentiation of f ( x )/g( x )


A quotient h( x ) = f ( x )/g( x ) involving two differentiable functions f ( x ) and g( x ),
is differentiable everywhere that g(x) ≠ 0, and the derivative is given in this well-
known fashion:

(d/dx) ( f(x)/g(x) ) = f'(x)/g(x) − ( f(x) · g'(x) ) / g^2(x) = ( f'(x) · g(x) − f(x) · g'(x) ) / g^2(x) .   (3-27)

Exercise 3.19

Use the epsilon function argument in the same way as in the differentiation rule for a product
to prove Theorem 3.18.

3.4.3 Differentiation of Composite Functions

Theorem 3.20 The Chain Rule for Composite Functions


A function h( x ) = f ( g( x )) that is composed of two differentiable functions f ( x ) and
g( x ) is in itself differentiable at every x0 with the derivative

h0 ( x0 ) = f 0 ( g( x0 )) · g0 ( x0 ) (3-28)

Proof

We exploit that the two functions f ( x ) and g( x ) are differentiable. In particular g( x ) is dif-
ferentiable at x0 :
g( x ) = g( x0 ) + g0 ( x0 )( x − x0 ) + ( x − x0 ) · ε g ( x − x0 ) , (3-29)
and the function f (u) is differentiable at u0 = g( x0 ):
f (u) = f (u0 ) + f 0 (u0 )(u − u0 ) + (u − u0 ) · ε f (u − u0 ) . (3-30)
From this we get, setting u = g(x) and u_0 = g(x_0):

h(x) = f(g(x))
     = f(g(x_0)) + f'(g(x_0)) (g(x) − g(x_0)) + (g(x) − g(x_0)) · ε_f(g(x) − g(x_0))
     = h(x_0) + f'(g(x_0)) ( g'(x_0)(x − x_0) + (x − x_0) · ε_g(x − x_0) )
       + ( g'(x_0)(x − x_0) + (x − x_0) · ε_g(x − x_0) ) · ε_f(g(x) − g(x_0))            (3-31)
     = h(x_0) + f'(g(x_0)) g'(x_0) · (x − x_0) + (x − x_0) · ε_h(x − x_0) ,
from which we directly read that h0 ( x0 ) = f 0 ( g( x0 )) g0 ( x0 ) – because this is exactly the unique
coefficient of ( x − x0 ) in the above expression.



Exercise 3.21

Above we have used – at the end of Equation (3-31) – that

f'(g(x_0)) · ε_g(x − x_0) + ( g'(x_0) + ε_g(x − x_0) ) · ε_f(g(x) − g(x_0))              (3-32)

is an epsilon function, which we accordingly can call (and have called) ε h ( x − x0 ). Consider
why this is entirely OK.

Exercise 3.22

Find the derivatives of the following functions for every x-value in their respective domains:

f_1(x) = (x^2 + 1) · sin(x)
f_2(x) = sin(x) / (x^2 + 1)                                                   (3-33)
f_3(x) = sin(x^2 + 1) .

3.5 Inverse Functions

The exponential function exp( x ) and the logarithmic function ln( x ) are inverse func-
tions to each other – as is well known the following is valid:

exp(ln( x )) = x for x ∈ D (ln) = ]0, ∞[ = R(exp)


(3-34)
ln(exp( x )) = x for x ∈ D (exp) = ] − ∞, ∞[ = R(ln) .

Note that even though exp( x ) is defined for all x, the inverse function ln( x ) is
only defined for x > 0 – and vice versa (!).

The function f(x) = x^2 has an inverse function on each of its intervals of monotony,
i.e. where f(x) is either increasing or decreasing: The inverse function on the interval
[0, ∞[ where f(x) is increasing is the well-known function g(x) = √x. Thus the function
f(x) maps the interval A = [0, ∞[ one-to-one onto the interval B = [0, ∞[, and the
inverse function g(x) maps the interval B one-to-one onto the interval A such that:

f(g(x)) = (√x)^2 = x    for x ∈ B = [0, ∞[
g(f(x)) = √(x^2) = x    for x ∈ A = [0, ∞[ .                                   (3-35)

The inverse function to f(x) on the interval ]−∞, 0], where f(x) is decreasing, is the
function h(x) = −√x, which is not defined on the same interval as f(x). The function
f(x) maps the interval C = ]−∞, 0] one-to-one onto the interval D = [0, ∞[, and the
inverse function h(x) maps the interval D one-to-one onto the interval C such that:

f(h(x)) = (−√x)^2 = x    for x ∈ D = [0, ∞[
h(f(x)) = −√(x^2) = x    for x ∈ C = ]−∞, 0] .                                 (3-36)

If f(x) is not monotonic on an interval, it means that we can obtain the same
function value f(x) for several x-values – in the same way as x^2 = 1 both for x = 1
and for x = −1 – and then the function is not one-to-one on the interval. The
functions cos(x) and sin(x) are only monotonic on certain subintervals of the
x-axis, see Figure 3.7. If we wish to define inverse functions to these functions
we must choose the interval with care, see Section 3.8 and Figure 3.8.

Definition 3.23 Notation for Inverse Functions


We denote the inverse function for a given function f ( x ) by f ◦−1 ( x ). The inverse
function is generally defined by the following properties on suitably chosen inter-
vals A and B that are part of D ( f ) and D ( f ◦−1 ), respectively

f ◦−1 ( f ( x )) = x for x ∈ A ⊂ D( f )
◦−1
(3-37)
f(f ( x )) = x for x ∈ B ⊂ D ( f ◦−1 ) .

We use here the symbol f ◦−1 ( x ) in order to avoid confusion with ( f ( x ))−1 =
1/ f ( x ). However the reader should note that the standard notation is simply
f −1 for the inverse function. The graph for the inverse function g( x ) = f ◦−1 ( x )
to a function f ( x ) can be obtained by mirroring the graph for f ( x ) in the diag-
onal in the first quadrant in the ( x, y)-coordinate system – i.e. the line with the
equation y = x – see Figure 3.4.

Figure 3.4: The graph for a function f ( x ) and the graph for the inverse function g( x ).
It is valid that g( x ) = f ◦−1 ( x ) and f ( x ) = g◦−1 ( x ), but they each have their own
definition intervals.

3.5.1 Differentiation of Inverse Functions

Theorem 3.24 Differentiation of Inverse Functions


If a differentiable function f ( x ) has the inverse function f ◦−1 ( x ) and if
f'(f^{◦−1}(x_0)) ≠ 0, then the inverse function f^{◦−1}(x) is itself differentiable at x_0 :

( f^{◦−1} )'(x_0) = 1 / f'(f^{◦−1}(x_0))                                       (3-38)

Proof

From the definition of inverse functions we have

h( x ) = f ( f ◦−1 ( x )) = x , (3-39)

so h0 ( x0 ) = 1, but we also have from the chain rule in (3-28):

h0 ( x0 ) = f 0 ( f ◦−1 ( x0 )) · ( f ◦−1 )0 ( x0 ) = 1 , (3-40)

from which we get the result by dividing by f 0 ( f ◦−1 ( x0 )).
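A quick numerical check of Theorem 3.24 with f = exp and f^{◦−1} = ln (a sketch; a symmetric
difference quotient is used as the numerical reference):

```python
import math

def num_deriv(g, x, h=1e-6):
    """Symmetric difference quotient as a numerical derivative."""
    return (g(x + h) - g(x - h)) / (2 * h)

x0 = 2.5
lhs = num_deriv(math.log, x0)                 # (f^o-1)'(x0) computed numerically
rhs = 1.0 / math.exp(math.log(x0))            # 1 / f'(f^o-1(x0)) = 1 / x0
print(lhs, rhs)                               # both approximately 0.4
```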

3.6 Hyperbolic Functions

Definition 3.25 Hyperbolic Cosine and Hyperbolic Sine


We will define two new functions cosh( x ) and sinh( x ) as the unique solution to the
following system of differential equations with initial conditions. The two solutions
are denoted hyperbolic cosine and hyperbolic sine, respectively:

cosh0 ( x ) = sinh( x ) , cosh(0) = 1


(3-41)
sinh0 ( x ) = cosh( x ) , sinh(0) = 0 .

The names cosh( x ) and sinh( x ) (often spoken as “cosh” and “sinsh”) look like cos( x )
and sin( x ), but the functions are very different, as we shall demonstrate below.

Yet there are also fundamental structural similarities between the two pairs of functions
and this is what motivates the names. In the system of differential equations for cos( x )

and sin( x ) only a single minus sign separates this from (3-41):

cos0 ( x ) = − sin( x ) , cos(0) = 1


0 (3-42)
sin ( x ) = cos( x ) , sin(0) = 0 .

In addition (again with the decisive minus sign as the only difference) the following
simple analogy to the well-known and often used relation cos2 ( x ) + sin2 ( x ) = 1 applies:

Theorem 3.26 Fundamental Relation of cosh( x ) and sinh( x )

cosh^2(x) − sinh^2(x) = 1 .                                                   (3-43)

Proof

Differentiate both sides of equation (3-43) with respect to x and conclude that
cosh^2(x) − sinh^2(x) is a constant. Finally use the initial conditions.

Exercise 3.27

Show directly from the system of differential equations (3-41) that the two ”new” functions
are in fact not so new:
cosh(x) = (e^x + e^{−x}) / 2 ,   D(cosh) = R ,   R(cosh) = [1, ∞[
                                                                              (3-44)
sinh(x) = (e^x − e^{−x}) / 2 ,   D(sinh) = R ,   R(sinh) = ]−∞, ∞[

Figure 3.5: Hyperbolic cosine, cosh( x ), and hyperbolic sine, sinh( x ).

Exercise 3.28

Show directly from the expressions found in Exercise 3.27, that

cosh^2(x) − sinh^2(x) = 1 .                                                   (3-45)

Exercise 3.29

The graph for the function f ( x ) = cosh( x ) looks a lot like a parabola, viz. the graph for the
function g( x ) = 1 + ( x2 /2) when we plot both functions on a suitably small interval around
x0 = 0. Try this! If we instead plot the two graphs in very large x-interval, we learn that
the two functions have very different graphical behaviours. Try this, i.e. try to plot both
functions on the interval [−50, 50]. Comment upon and explain the qualitative differences.
Similarly compare the two functions sinh( x ) and x + ( x3 /6) in the same way.
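The comparison suggested in Exercise 3.29 can be carried out with a few lines of plotting code
(a sketch assuming NumPy and matplotlib are available; it does not replace doing the exercise):

```python
import numpy as np
import matplotlib.pyplot as plt

x_small = np.linspace(-1, 1, 200)
x_large = np.linspace(-50, 50, 2000)

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
ax1.plot(x_small, np.cosh(x_small), label='cosh(x)')
ax1.plot(x_small, 1 + x_small**2 / 2, '--', label='1 + x^2/2')        # nearly identical near 0
ax2.plot(x_large, np.cosh(x_large), label='cosh(x)')                  # grows like e^|x|/2
ax2.plot(x_large, 1 + x_large**2 / 2, '--', label='1 + x^2/2')        # only polynomial growth
ax1.legend(); ax2.legend()
plt.show()
```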

It is natural and useful to define hyperbolic analogies to tan( x ) and cot( x ). This is done
as follows:

Definition 3.30 Hyperbolic Tangent and Hyperbolic Cotangent

tanh(x) = sinh(x)/cosh(x) = (e^{2x} − 1)/(e^{2x} + 1) ,   D(tanh) = R ,   R(tanh) = ]−1, 1[
                                                                              (3-46)
coth(x) = cosh(x)/sinh(x) = (e^{2x} + 1)/(e^{2x} − 1) ,   D(coth) = R − {0} ,
                                                           R(coth) = ]−∞, −1[ ∪ ]1, ∞[ .

Figure 3.6: Hyperbolic tangent, tanh( x ), and hyperbolic cotangent, coth( x ).

The derivatives of cosh( x ) and of sinh( x ) are already given by the defining system in
(3-41).

(d/dx) cosh(x) = sinh(x)
(d/dx) sinh(x) = cosh(x)
(d/dx) tanh(x) = 1 / cosh^2(x) = 1 − tanh^2(x)                                 (3-47)
(d/dx) coth(x) = −1 / sinh^2(x) = 1 − coth^2(x) .

Exercise 3.31

Show the last two expressions for the derivatives for tanh( x ) and coth( x ) in (3-47) by the use
of the differentiation rule in Theorem 3.18.

3.7 The Area Functions

The inverse functions to the hyperbolic functions are called area functions and are
named cosh◦−1 ( x ) = arcosh( x ), sinh◦−1 ( x ) = arsinh( x ), tanh◦−1 ( x ) = artanh( x ), and
coth◦−1 ( x ) = arcoth( x ), respectively.

Since the functions cosh( x ), sinh( x ), tanh( x ), and coth( x ) all can be expressed in terms
of exponential functions it is no surprise that the inverse functions and their derivatives
can be expressed by logarithmic functions. We gather the information here:

arcosh(x) = ln( x + √(x^2 − 1) )           for x ∈ [1, ∞[
arsinh(x) = ln( x + √(x^2 + 1) )           for x ∈ R
artanh(x) = (1/2) ln( (1 + x)/(1 − x) )    for x ∈ ]−1, 1[                     (3-48)
arcoth(x) = (1/2) ln( (x + 1)/(x − 1) )    for x ∈ ]−∞, −1[ ∪ ]1, ∞[ .

(d/dx) arcosh(x) = 1 / √(x^2 − 1)    for x ∈ ]1, ∞[
(d/dx) arsinh(x) = 1 / √(x^2 + 1)    for x ∈ R
(d/dx) artanh(x) = 1 / (1 − x^2)     for x ∈ ]−1, 1[                           (3-49)
(d/dx) arcoth(x) = 1 / (1 − x^2)     for x ∈ ]−∞, −1[ ∪ ]1, ∞[ .

3.8 The Arc Functions

The inverse functions to the trigonometric functions are a bit more complicated. As
mentioned earlier here we must choose for each trigonometric function an interval

Figure 3.7: Cosine and Sine Functions.

where the function in question is monotonic. In return, once we have chosen such an
interval, it is clear how the inverse function should be defined and how it should then
be differentiated. The inverse functions to cos( x ), sin( x ), tan( x ), and cot( x ) are usu-
ally written arccos( x ), arcsin( x ), arctan( x ), and arccot( x ), respectively; their names are
arccosine, arcsine, arctangent, and arccotangent. As above we gather the results here:

cos^{◦−1}(x) = arccos(x) ∈ [0, π]            for x ∈ [−1, 1]
sin^{◦−1}(x) = arcsin(x) ∈ [−π/2, π/2]       for x ∈ [−1, 1]
tan^{◦−1}(x) = arctan(x) ∈ ]−π/2, π/2[       for x ∈ R                         (3-50)
cot^{◦−1}(x) = arccot(x) ∈ ]0, π[            for x ∈ R .

(d/dx) arccos(x) = −1 / √(1 − x^2)    for x ∈ ]−1, 1[
(d/dx) arcsin(x) =  1 / √(1 − x^2)    for x ∈ ]−1, 1[
(d/dx) arctan(x) =  1 / (1 + x^2)     for x ∈ R                                (3-51)
(d/dx) arccot(x) = −1 / (1 + x^2)     for x ∈ R .

Note that the derivatives for arccos( x ) and arcsin( x ) are not defined at x0 = 1
or at x0 = −1. This is partly because, if the function we consider is only defined
on a bounded interval then we cannot say that the function is differentiable
at the end-points of the interval. Moreover the formulas for arccos'(x) and
arcsin'(x) show that they are not defined at x_0 = 1 or x_0 = −1; these values
give 0 in the denominators.

Figure 3.8: The arccosine function is defined here.

Exercise 3.32

Use a suitable modification of arctan( x ) in order to determine a new differentiable (and hence
continuous) function f ( x ) that looks like the 0-extension of | x |/x (which is neither continuous
nor differentiable), i.e. we want a function f ( x ) with the following properties: 1 > f ( x ) >
0.999 for x > 0.001 while −0.999 > f ( x ) > −1 for x < −0.001. See Figure 3.10. Hint: Try to
plot arctan(1000x ).

Figure 3.9: Arccosine and arcsine. Again the red circles indicate that the arc-functions
are not defined outside the interval [−1, 1]. Similarly the green circular disks indicate
that the arc-functions are defined at the end-points x = 1 and x = −1.

Figure 3.10: The arctangent function.



3.9 Summary

We have treated some of the fundamental properties of some well-known and some
not so well-known functions. How are they defined, what are their domains, are they
continuous, are they differentiable, and if so what are their derivatives?

• A function f ( x ) is continuous at x0 if f ( x ) − f ( x0 ) is an epsilon function of ( x −


x0 ), i.e.
f ( x ) = f ( x0 ) + ε f ( x − x0 ) . (3-52)

• A function f ( x ) is differentiable at x0 with the derivative f 0 ( x0 ) if

f ( x ) = f ( x0 ) + f 0 ( x0 )( x − x0 ) + ( x − x0 )ε f ( x − x0 ) .

• If a function is differentiable at x0 , then it is also continuous at x0 . The converse


does not apply.

• The derivative of a product of two functions is

d
( f ( x ) · g( x )) = f 0 ( x ) · g( x ) + f ( x ) · g0 ( x ) . (3-53)
dx

• The derivative of a quotient of two functions is

  (d/dx) ( f(x)/g(x) ) = f'(x)/g(x) − ( f(x) · g'(x) ) / g^2(x) = ( f'(x) · g(x) − f(x) · g'(x) ) / g^2(x) .   (3-54)

• The derivative of a composite function is

d
f ( g( x )) = f 0 ( g( x )) · g0 ( x ) . (3-55)
dx

• The derivative of the inverse function f^{◦−1}(x) is

  ( f^{◦−1} )'(x) = 1 / f'(f^{◦−1}(x)) .                                       (3-56)

eNote 4

Taylor’s Approximation Formulas for


Functions of One Variable

In eNotes ?? and ?? it is shown how functions of one and two variables can be approximated by
first-degree polynomials at every (development) point and that the graphs for the approximating
first-degree polynomial are exactly the tangents and the tangent planes, respectively, for the
corresponding graphs of the functions. In this eNote we will show how the functions can be
approximated even better by polynomials of higher degree, so if the approximation to a function
is sufficiently good then one can use and continue the computations with the approximation
polynomial in place of the function itself and hope for a sufficiently small error. But what does it
mean that the approximation and the error are sufficiently good and sufficiently small? And
how does this depend on the degree of the approximating polynomial? You will find the answers
to these questions in this eNote.
(Updated: 22.09.2021 David Brander).

4.1 Higher Order Derivatives

First we consider functions f ( x ) of one variable x on an open interval of the real num-
bers. We will also assume that the functions can be differentiated an arbitrary number of
times, that is, all the derivatives exist for every x in the interval: f 0 ( x0 ), f 00 ( x0 ), f 000 ( x0 ),
f (4) ( x0 ), f (5) ( x0 ), etc. where f (4) ( x0 ) means the 4th derivative of f ( x ) in x0 . These
higher order derivatives we will use in the construction of (the coefficients to) the ap-
proximating polynomials.

Definition 4.1
If a function f(x) can be differentiated an arbitrary number of times at every point x
in a given open interval I we say that the function is smooth on the interval I.

Example 4.2 Higher-Order Derivatives of Some Elementary Functions

Here are some higher-order derivatives of some well-known smooth functions:

f(x)           f'(x)            f''(x)           f'''(x)          f^{(4)}(x)       f^{(5)}(x)

e^x            e^x              e^x              e^x              e^x              e^x
x^2            2x               2                0                0                0
x^3            3x^2             6x               6                0                0
x^4            4x^3             12x^2            24x              24               0
x^5            5x^4             20x^3            60x^2            120x             120           (4-1)
(x − x_0)^5    5(x − x_0)^4     20(x − x_0)^3    60(x − x_0)^2    120(x − x_0)     120
cos(x)         −sin(x)          −cos(x)          sin(x)           cos(x)           −sin(x)
sin(x)         cos(x)           −sin(x)          −cos(x)          sin(x)           cos(x)
cosh(x)        sinh(x)          cosh(x)          sinh(x)          cosh(x)          sinh(x)
sinh(x)        cosh(x)          sinh(x)          cosh(x)          sinh(x)          cosh(x)

Note that

1. The n'th derivative f^{(n)}(x) of the function f(x) = (x − x_0)^n is

   f^{(n)}(x) = n · (n − 1) · (n − 2) · · · 2 · 1 = n! ,                       (4-2)

   where n! (n factorial) is the short way of writing the product of the natural
   numbers from and including 1 to and including n, cf. the table (4-1) in Example 4.2,
   where n! appears as 2! = 2, 3! = 6, 4! = 24, 5! = 120. Note: by definition 0! = 1,
   so n! is well-defined for non-negative integers.

2. By repeated differentiation of cos( x ) we get the same set of functions


periodically with the period 4: If f ( x ) = cos( x ) then

f ( p ) ( x ) = f ( p +4) ( x ) for all p ≥ 1 . (4-3)

The same applies for f ( x ) = sin( x ).

3. By repeated differentiation of the hyperbolic cosine function cosh( x ) we


again get the same ”set” of functions periodically with the period 2: If
f ( x ) = cosh( x ) we get

f ( p ) ( x ) = f ( p +2) ( x ) for all p ≥ 1 . (4-4)

This applies to the hyperbolic sine function f ( x ) = sinh( x ), too.

Example 4.3 The Derivatives of a Somewhat Less Elementary Function

A function f(x) can e.g. be given as an integral (one that, in this case, cannot be expressed in
terms of the ordinary elementary functions):

f(x) = ∫_0^x e^{−t^2} dt .                                                     (4-5)

But we can easily find the higher order derivatives of the function for every x:

f'(x) = e^{−x^2} ,   f''(x) = −2 · x · e^{−x^2} ,   f'''(x) = −2 · e^{−x^2} + 4 · x^2 · e^{−x^2}   etc.   (4-6)
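Computations like (4-6) can be reproduced symbolically, for instance with SymPy (a sketch; the
derivative of the integral (4-5) with respect to its upper limit is the integrand evaluated there):

```python
import sympy as sp

x, t = sp.symbols('x t')
f = sp.Integral(sp.exp(-t**2), (t, 0, x))     # f(x) as in (4-5)

f1 = sp.diff(f, x)                 # exp(-x**2)
f2 = sp.diff(f, x, 2)              # -2*x*exp(-x**2)
f3 = sp.diff(f, x, 3)              # (4*x**2 - 2)*exp(-x**2)
print(f1, f2, f3, sep='\n')
```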

Example 4.4 The Derivatives of an Unknown Function

We assume that a function f ( x ) is given as a solution to a differential equation with the initial
conditions at x0 :

f 00 ( x ) + 3 f 0 ( x ) + 7 f ( x ) = q( x ) , where f ( x0 ) = 1 , and f 0 ( x0 ) = −3 (4-7)

where q( x ) is a given smooth function of x. Again we can fairly easily find the higher order
derivatives of the function at x0 by using the initial conditions directly and by differentiating
the differential equation. We get the following from the initial conditions and from the differ-
ential equation itself:

f 0 ( x0 ) = −3 , f 00 ( x0 ) = q( x0 ) − 3 f 0 ( x0 ) − 7 f ( x0 ) = q( x0 ) + 2 . (4-8)

The third (and the higher-order) derivatives of f ( x ) we then obtain by differentiating both
sides of the differential equation. E.g. by differentiating once we get:

f 000 ( x ) + 3 f 00 ( x ) + 7 f 0 ( x ) = q0 ( x ) , (4-9)

from which we get:

f 000 ( x0 ) = q0 ( x0 ) − 3 f 00 ( x0 ) − 7 f 0 ( x0 )
= q0 ( x0 ) − 3 · (q( x0 ) + 2) − 7 · (−3) (4-10)
= q0 ( x0 ) − 3q( x0 ) + 15 .
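The bookkeeping in Example 4.4 can also be carried out symbolically. The sketch below (assuming SymPy is available) isolates f'' from the differential equation itself, differentiates once, and inserts the initial conditions; it reproduces (4-8) and (4-10):

```python
# Sketch (SymPy assumed): read off f''(x0) and f'''(x0) from the equation
# f'' + 3 f' + 7 f = q together with f(x0) = 1 and f'(x0) = -3.
import sympy as sp

x = sp.symbols('x')
q = sp.Function('q')
f = sp.Function('f')

# Isolate f'' from the differential equation itself
f2 = q(x) - 3*f(x).diff(x) - 7*f(x)

# Differentiate once and replace the f'' that reappears
f3 = sp.diff(f2, x).subs(f(x).diff(x, 2), f2)

# Insert the initial data f(x0) = 1, f'(x0) = -3 (substituting f' before f)
data = [(f(x).diff(x), -3), (f(x), 1)]
print(sp.expand(f2.subs(data)))   # q(x) + 2, cf. (4-8)
print(sp.expand(f3.subs(data)))   # q'(x) - 3*q(x) + 15, cf. (4-10)
```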

4.2 Approximations by Polynomials

The point of the following is to find the polynomial of degree n (e.g. the second-degree polynomial) that best approximates a given smooth function f(x) at and around a given x0 in the domain of the function D(f).

For the case n = 2, we try to write f ( x ) in the following way:

f ( x ) = a0 + a1 · ( x − x0 ) + a2 · ( x − x0 )2 + R2,x0 ( x ) , (4-11)

where a0 , a1 , and a2 are suitable constants that are to be chosen so that the remainder
function also known as the Lagrange remainder term R2,x0 ( x ) is as small as possible at
and around x0 . The remainder function we can express by f ( x ) and the polynomial we
are testing:
R2,x0 ( x ) = f ( x ) − a0 − a1 · ( x − x0 ) − a2 · ( x − x0 )2 , (4-12)

and it is this function that should be as close as possible to 0 when x is close to x0 such
that the difference between the function f ( x ) and the second-degree polynomial be-
comes as small as possible – at least in the vicinity of x0 .

The first natural requirement is therefore that:

R2,x0 ( x0 ) = 0 corresponding to f ( x0 ) = a0 , (4-13)

by which a0 is now determined.

The next natural requirement is that the graph of the remainder function has a horizontal tangent at x0, such that the tangent to the remainder function then is identical to the x-axis:

R'2,x0(x0) = 0   such that   f'(x0) = a1 ,   (4-14)

by which a1 is determined.

The next requirement on the remainder function is then:

R''2,x0(x0) = 0   corresponding to   f''(x0) = 2 · a2 ,   (4-15)

by which also a2 = (1/2) · f''(x0) then is determined and fixed.

Thus we have found

f(x) = f(x0) + (f'(x0)/1!) · (x − x0) + (f''(x0)/2!) · (x − x0)^2 + R2,x0(x) ,   (4-16)

where the remainder function R2,x0(x) satisfies the following requirement that makes it very small in the neighborhood of x0:

R2,x0(x0) = R'2,x0(x0) = R''2,x0(x0) = 0 .   (4-17)

If similarly we had wished to find an approximating nth-degree polynomial for the same function f(x), we would have found:

f(x) = f(x0) + (f'(x0)/1!) · (x − x0) + · · · + (f^(n)(x0)/n!) · (x − x0)^n + Rn,x0(x) ,   (4-18)

where the remainder function Rn,x0(x) is a smooth function that satisfies all the requirements:

Rn,x0(x0) = R'n,x0(x0) = · · · = R^(n)n,x0(x0) = 0 .   (4-19)

At this point it is reasonable to expect, on one hand, that these requirements on the re-
mainder functions can be satisfied; on the other, that the remainder function itself must
’appear like’ and be as small as a power of ( x − x0 ) close to x0 .

This is precisely the content of the following Lemma:

Lemma 4.5 Remainder Functions


The remainder function Rn,x0 ( x ) can be expressed from f ( x ) in two different ways,
and we will use both in what follows:

Rn,x0(x) = (f^(n+1)(ξ(x)) / (n+1)!) · (x − x0)^(n+1) ,   (4-20)

where ξ ( x ) lies between x and x0 in the interval I.

The other way is the following one, that contains an epsilon function:

Rn,x0 ( x ) = ( x − x0 )n · ε f ( x − x0 ) , (4-21)

where ε f ( x − x0 ) is an epsilon function of ( x − x0 ).

Proof

We will content ourselves by proving the first statement (4-20) in the simplest case, viz. for
n = 0, i.e. the following : On the interval between (a fixed) x and x0 we can always find a
value ξ such that the following applies:
R0,x0(x) = f(x) − f(x0) = (f'(ξ)/1!) · (x − x0) .   (4-22)
But this is only a form of the mean value theorem: If a smooth function has values f ( a) and
f (b), respectively, at the end points of an interval [ a, b], then the graph for f ( x ) has at some
position a tangent that is parallel to the line segment connecting the two points ( a, f ( a)) and
(b, f (b)), see Figure 4.1.

The other statement (4-21) follows from the first (4-20) by observing that (f^(n+1)(ξ)/(n+1)!) · (x − x0)^(n+1) is an epsilon function of (x − x0), since f^(n+1)(ξ)/(n+1)! is bounded and since (x − x0)^(n+1) is itself an epsilon function.

Figure 4.1: Two points on the blue graph curve for a function are connected with a line
segment (red). The mean value theorem then says that at least one position exists (in
the case shown, exactly two positions, marked in green) on the curve between the two
given points where the slope f 0 ( x ) for the tangent (black) to the curve is exactly the
same as the slope of the straight line segment.

Definition 4.6 Approximating Polynomials


Let f ( x ) denote a smooth function on an interval I. The polynomial

Pn,x0(x) = f(x0) + (f'(x0)/1!) · (x − x0) + · · · + (f^(n)(x0)/n!) · (x − x0)^n   (4-23)

is called the approximating polynomial of nth degree for the function f(x) with development point x0.

To sum up:

Theorem 4.7 Taylor’s Formulas


Every smooth function f ( x ) can for every non-negative integer n be divided into an
approximating polynomial of degree n and a remainder function like this:

f ( x ) = Pn,x0 ( x ) + Rn,x0 ( x ) , (4-24)

where the remainder function can be expressed in the following two ways:

Rn,x0(x) = (f^(n+1)(ξ(x)) / (n+1)!) · (x − x0)^(n+1)   for a ξ(x) between x and x0,
and                                                                                   (4-25)
Rn,x0(x) = (x − x0)^n · ε_f(x − x0) .

In particular it is Taylor’s Limit Formula (where the remainder function is expressed by


an epsilon function) that we will make use of in what follows. We mention this version
explicitly:

Theorem 4.8 Taylor’s Limit Formula


Let f ( x ) denote a smooth function on an open interval I that contains a given x0 .
Then for all x in the interval and for every integer n ≥ 0 the following applies

f(x) = f(x0) + (f'(x0)/1!) · (x − x0) + · · · + (f^(n)(x0)/n!) · (x − x0)^n + (x − x0)^n · ε_f(x − x0) ,

where ε_f(x − x0) denotes an epsilon function of (x − x0), i.e. ε_f(x − x0) → 0 for x → x0.

Example 4.9 The Approximating Polynomials of a Polynomial

One might be led to believe that every polynomial is its own approximating polynomial
because every polynomial must be the best approximation to itself. Here is an example that
shows that this is not that simple. We look at the third-degree polynomial

f ( x ) = 1 + x + x2 + x3 . (4-26)

The polynomial f ( x ) has the following quite different approximating polynomials - depen-
dent on the choice of development point x0 and degree of development n:

P7,x0 =0 ( x ) = 1 + x + x2 + x3
P3,x0 =0 ( x ) = 1 + x + x2 + x3
P2,x0 =0 ( x ) = 1 + x + x2
P1,x0 =0 ( x ) = 1 + x
P0,x0 =0 ( x ) = 1
P7,x0 =1 ( x ) = 1 + x + x2 + x3
P3,x0 =1 ( x ) = 1 + x + x2 + x3
P2,x0 =1 ( x ) = 2 − 2 · x + 4 · x2 (4-27)
P1,x0 =1 ( x ) = −2 + 6 · x
P0,x0 =1 ( x ) = 4
P7,x0 =7 ( x ) = 1 + x + x2 + x3
P3,x0 =7 ( x ) = 1 + x + x2 + x3
P2,x0 =7 ( x ) = 344 − 146 · x + 22 · x2
P1,x0 =7 ( x ) = −734 + 162 · x
P0,x0 =7 ( x ) = 400 .
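The polynomials in (4-27) can be computed directly from Definition 4.6. Here is a small sketch in Python with SymPy (assumed available); the helper name approximating_polynomial is our own and not part of any library:

```python
# Sketch: build P_{n,x0} from Definition 4.6 and reproduce entries of (4-27).
import sympy as sp

x = sp.symbols('x')

def approximating_polynomial(f, n, x0_value):
    """Taylor polynomial of degree n for f(x) with development point x0_value."""
    return sum(sp.diff(f, x, k).subs(x, x0_value) / sp.factorial(k)
               * (x - x0_value)**k for k in range(n + 1))

f = 1 + x + x**2 + x**3
print(sp.expand(approximating_polynomial(f, 2, 0)))   # x**2 + x + 1
print(sp.expand(approximating_polynomial(f, 2, 1)))   # 4*x**2 - 2*x + 2
print(sp.expand(approximating_polynomial(f, 1, 7)))   # 162*x - 734
```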

Exercise 4.10 Remainder Functions for Polynomials

For the function f ( x ) = 1 + x + x2 + x3 we consider the following two splittings into approx-
imating polynomials and corresponding remainder functions:

f(x) = P2,x0=1(x) + R2,x0=1(x)   and   f(x) = P1,x0=7(x) + R1,x0=7(x) ,   (4-28)

where the two approximating polynomials P2,x0=1(x) and P1,x0=7(x) are already stated in Example 4.9. Determine the two remainder functions R2,x0=1(x) and R1,x0=7(x), expressed in both of the two ways shown in (4-25): for each of the two remainder functions, state the respective expressions for ξ(x) and for ε(x − x0).

Example 4.11 Taylor’s Limit Formula with the Development Point x0 = 0

Here are some often-used functions with their respective approximating polynomials (and
corresponding remainder functions expressed by epsilon functions) with the common devel-
opment point x0 = 0 and arbitrarily high degree:

e^x = 1 + x + x^2/2! + · · · + x^n/n! + x^n · ε(x)

e^(x^2) = 1 + x^2 + x^4/2! + · · · + x^(2n)/n! + x^(2n) · ε(x)

cos(x) = 1 − x^2/2! + x^4/4! + · · · + (−1)^n · x^(2n)/(2n)! + x^(2n) · ε(x)

sin(x) = x − x^3/3! + x^5/5! + · · · + (−1)^n · x^(2n+1)/(2n+1)! + x^(2n+1) · ε(x)          (4-29)

ln(1 + x) = x − x^2/2 + · · · + (−1)^(n−1) · x^n/n + x^n · ε(x)

ln(1 − x) = −x − x^2/2 − · · · − x^n/n − x^n · ε(x)

1/(1 + x) = 1 − x + x^2 − x^3 + · · · + (−1)^(n−1) · x^(n−1) + x^n · ε(x)

1/(1 − x) = 1 + x + x^2 + x^3 + · · · + x^(n−1) + x^n · ε(x)

Note that in Taylor’s Limit formula we always end with an epsilon function
and with the power of x that is precisely the same as the last power used in the
preceding approximating polynomial.
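The expansions in (4-29) are also easy to check with a computer algebra system. In the following sketch (SymPy assumed available) the O(...) term plays the role of the term with the epsilon function:

```python
# Sketch: reproduce some of the standard expansions in (4-29) with sympy.series.
import sympy as sp

x = sp.symbols('x')
for f in (sp.exp(x), sp.cos(x), sp.sin(x), sp.log(1 + x), 1/(1 - x)):
    print(f, '=', sp.series(f, x, 0, 6))
```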

4.3 Continuous Extensions

The function f(x) = sin(x)/x is not defined at x = 0. We will investigate whether we can extend the function to have a value at 0, such that the extended function is continuous at 0. I.e. we will find a value a such that the a-extension

fe(x) = sin(x)/x   for x ≠ 0 ,     fe(x) = a   for x = 0   (4-30)

is continuous at x = 0, i.e. such that

sin(x)/x → a   for   x → 0 .   (4-31)

A direct application of Taylor's limit formula appears in the determination of limit values for quotients f(x)/g(x) where both the numerator f(x) and the denominator g(x) tend towards 0 as x tends towards 0. What happens to the quotient as x tends towards 0? We illustrate with a number of examples. Note that even though the numerator function and the denominator function are both continuous at 0, the quotient need not have a continuous extension at 0.

Example 4.12 Limit Values for Function Fractions

sin(x)/x = (x + x^1 · ε(x)) / x = 1 + ε(x) → 1   for   x → 0 .   (4-32)

sin(x)/x^2 = (x − (1/3!) · x^3 + x^3 · ε(x)) / x^2 = 1/x − x/3! + x · ε(x) ,   (4-33)

which has no limit value for x → 0. Therefore a continuous extension does not exist in this case.

sin(x^2)/x^2 → 1   for   x → 0 ,   because   sin(u)/u → 1   for   u → 0 .   (4-34)

(sin(x) − x)/x^2 = (x − (1/3!) · x^3 + x^3 · ε(x) − x) / x^2 = −x/3! + x · ε(x) → 0   for   x → 0 .   (4-35)

(sin(x) − x)/x^3 = (x − (1/3!) · x^3 + x^3 · ε(x) − x) / x^3 = −1/3! + ε(x) → −1/6   for   x → 0 .   (4-36)

When determining such limit values, the approximating polynomials in the numerator and the denominator are developed to such a high degree that the limit value ”appears” after dividing both the numerator and the denominator by a suitable power of x.

Here is a somewhat more complicated example:



Figure 4.2: The function f(x) = sin(x)/x (blue) together with the numerator function sin(x) (red) and the denominator function x (also red). The function f(x) is continuous at x = 0 exactly when we use the value f(0) = 1.

Example 4.13 The Limit Value for a Fraction Between Functions

(2 · cos(x) − 2 + x^2) / (x · sin(x) − x^2)
   = (2 · (1 − (1/2!) · x^2 + (1/4!) · x^4 + x^4 · ε1(x)) − 2 + x^2) / (x · (x − (1/3!) · x^3 + (1/5!) · x^5 + x^5 · ε2(x)) − x^2)
   = (2 − x^2 + (1/12) · x^4 + 2 · x^4 · ε1(x) − 2 + x^2) / (x^2 − (1/3!) · x^4 + (1/5!) · x^6 + x^6 · ε2(x) − x^2)
   = ((1/12) · x^4 + 2 · x^4 · ε1(x)) / (−(1/3!) · x^4 + (1/5!) · x^6 + x^6 · ε2(x))            (4-37)
   = (1/12 + 2 · ε1(x)) / (−1/6 + (1/5!) · x^2 + x^2 · ε2(x))
   → −1/2   for   x → 0 ,

since the numerator tends towards 1/12 for x → 0 and the denominator tends towards −1/6 for x → 0.
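For readers who want to double-check such computations, the limit values in Examples 4.12 and 4.13 can be verified with SymPy (assumed available); of course this does not replace the argument with epsilon functions above:

```python
# Sketch: verify the limits from Examples 4.12 and 4.13 with sympy.limit.
import sympy as sp

x = sp.symbols('x')

print(sp.limit(sp.sin(x)/x, x, 0))                                     # 1
print(sp.limit((sp.sin(x) - x)/x**3, x, 0))                            # -1/6
print(sp.limit((2*sp.cos(x) - 2 + x**2)/(x*sp.sin(x) - x**2), x, 0))   # -1/2
```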

4.4 Estimation of the Remainder Functions

How large is the error committed by using the approximating polynomial (which it is
easy to compute) instead of the function itself (that can be difficult to compute) on a
given (typically small) interval around the development point? The remainder function
can of course give the answer to this question. We give here a couple of examples
that show how the remainder function can be used for such error estimations for given
functions.

Figure 4.3: The function f(x) = ln(x) from Example 4.14 (blue), the approximating first-degree polynomial (black) with development point x0 = 1, and the corresponding remainder function (red), illustrated as the difference between f(x) and the approximating polynomial on the interval [3/4, 5/4]. To the right the figure is shown close-up around the point (1, 0).

Example 4.14 Approximation of an Elementary Function

The logarithmic function ln( x ) is defined for positive values of x. We approximate with
the approximating first-degree polynomial with the development point at x0 = 1 and will
estimate the remainder term on a suitably small interval around x0 = 1, i.e. the starting point
is the following:

f(x) = ln(x) ,   x0 = 1 ,   n = 1 ,   x ∈ [3/4, 5/4] .   (4-38)

According to Taylor's formula with the remainder function we have, using the development point x0 = 1 where f(1) = 0 and f'(1) = 1, and using f''(x) = −1/x^2 for all x in the domain:

f(x) = ln(x) = ln(1) + (f'(1)/1!) · (x − 1) + (f''(ξ)/2!) · (x − 1)^2 = x − 1 − (1/(2·ξ^2)) · (x − 1)^2   (4-39)

for a value of ξ between x and 1. Thus we have found:

P1,x0=1(x) = x − 1 ,   and   R1,x0=1(x) = −(1/(2·ξ^2)) · (x − 1)^2 .   (4-40)

The absolute value of the remainder function on the given interval can now be evaluated for all
x in the given interval - even if we do not know very much about the position of ξ in the
interval apart from the fact that ξ lies between x and 1:

We have

|R1,x0=1(x)| = | −(1/(2·ξ^2)) · (x − 1)^2 | ≤ (1/(2·ξ^2)) · (1/4)^2 .   (4-41)

Here the minus sign has been removed because we only look at the absolute value, and we have also used that (x − 1)^2 clearly is largest (with the value (1/4)^2) for x = 3/4 and for x = 5/4 in the interval. In addition ξ is smallest, and thus (1/ξ)^2 largest, on the interval for ξ = 3/4. (Note that here we do not use the fact that ξ lies between x and 1; we simply use the fact that ξ lies in the interval!) I.e.

|R1,x0=1(x)| ≤ 1/(32·ξ^2) ≤ 1/(32·(3/4)^2) = 1/18 ,   (4-42)

thus we have proved that

|ln(x) − (x − 1)| ≤ 1/18   for all   x ∈ [3/4, 5/4] .   (4-43)

One may well wonder why the remainder function estimation of such a simple
function as f ( x ) = ln( x ) in Example 4.14 should be so complicated, when it is
evident to everybody (!) that the red remainder function in that case assumes
its largest numerical (absolute) value at one of the end points of the actual
interval, see Figure 4.3 – a statement, moreover, which we can prove by a quite
ordinary function investigation.

By differentiation of the remainder function we get:

R'1,x0=1(x) = d/dx ( ln(x) − (x − 1) ) = 1/x − 1 ,   (4-44)
that is less than 0 precisely for x > 1 (such that R1,x0 =1 ( x ) to the right of x = 1
is negative and decreasing from the value 0 at x = 1) and greater than 0 for
x < 1 (such that R1,x0 =1 ( x ) to the left of x = 1 is negative and increasing
towards the value 0 at x = 1). But the problem is that we in principle do not
know what the value of ln( x ) in fact is – neither at x = 3/4 nor at x = 5/4
unless we use Maple or some other tool as help. The remainder function
estimate uses only the defined properties of f ( x ) = ln( x ), i.e. f 0 ( x ) = 1/x and
f (1) = 0 and the estimation gives the values (also at the end points of the
interval) with a (numerical) error of at most 1/18 in this case.

If we actually get the information that ln(3/4) = −0.2877 and ln(5/4) = 0.2231, we then of course get a direct estimate of the largest value of |R1,x0=1(x)| in the interval [3/4, 5/4]:

|R1,x0=1(x)| ≤ max{ |−0.2877 + 0.25| , |0.2231 − 0.25| } = 0.0377 < 1/18 = 0.0556 .   (4-45)

With the ordinary function analysis we get a somewhat better estimate of the
remainder function – but only because we beforehand can estimate the func-
tion value at the end points.
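In the spirit of "Maple or some other tool as help", the following small Python sketch (standard library only, the grid size is an arbitrary choice) evaluates |ln(x) − (x − 1)| on a grid in [3/4, 5/4] and confirms both the direct value 0.0377 and that the remainder estimate 1/18 is on the safe side:

```python
# Numerical sanity check of (4-43): the maximum of |ln(x) - (x - 1)| on [3/4, 5/4]
# should not exceed 1/18.
import math

xs = [0.75 + i * 0.5 / 1000 for i in range(1001)]        # grid on [3/4, 5/4]
max_error = max(abs(math.log(x) - (x - 1)) for x in xs)

print(max_error)   # about 0.0377, attained at x = 3/4
print(1 / 18)      # 0.0556..., so the estimate holds with room to spare
```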

Exercise 4.15 Approximation of a Non-Elementary Function

Given the function from Example 4.3:

f(x) = ∫_0^x e^(−t^2) dt .   (4-46)

An estimate of the magnitude of the difference between f(x) and the approximating first-degree polynomial P1,x0=0(x) with the development point at x0 = 0 is wished for. The exercise is about determining the largest absolute value that the remainder function |R1,x0=0(x)| assumes on the interval [−1, 1].

Hint: Use the higher order derivatives of f(x) evaluated at x0 = 0 found earlier, cf. Example 4.3: f(0) = 0, f'(0) = 1, f''(x) = −2 · x · e^(−x^2). See Figure 4.4.

Figure 4.4: The function f(x) from Exercise 4.15 (blue), the approximating first- and third-degree polynomials (black) with development point x0 = 0, and the corresponding remainder functions (red) on the interval [−1, 1].

Example 4.16 Approximation of an Unknown (But Elementary) Function

Given the function from example 4.4, i.e. the function satisfies the following differential
equation with initial conditions:

f 00 ( x ) + 3 f 0 ( x ) + 7 f ( x ) = x2 , where f (0) = 1 , and f 0 (0) = −3 , (4-47)

where we have assumed that the right-hand side of the equation is q( x ) = x2 and that the
development point is x0 = 0. By this we now get:

f 0 (0) = −3 , f 00 (0) = 2 , f 000 (0) = 15 . (4-48)

We have

f(x) = f(0) + f'(0) · x + (f''(0)/2) · x^2 + (f'''(0)/6) · x^3 + x^3 · ε(x)
     = 1 − 3 · x + x^2 + (5/2) · x^3 + x^3 · ε(x) ,   (4-49)

such that the approximating third-degree polynomial for f(x) with development point x0 = 0 is

P3,x0=0(x) = 1 − 3 · x + x^2 + (5/2) · x^3 .   (4-50)

Figure 4.5: The function f ( x ) from Example 4.16 (blue), the approximating first-,
second-, and third-degree polynomials (black) with the development point x0 = 0. The
corresponding respective remainder functions (red) are illustrated as the differences be-
tween f ( x ) and the approximating polynomials.

Note that P3,x0 =0 ( x ) satisfies the initial conditions in (4-47) but the polynomial P3,x0 =0 ( x ) is
not a solution to the differential equation itself!
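As a sanity check of Example 4.16 one can solve the initial value problem (4-47) numerically and compare with P3,x0=0(x) close to 0. The following sketch uses a classical Runge-Kutta step written out by hand (plain Python, no libraries); the step size and the comparison point x = 0.1 are arbitrary choices made only for the illustration:

```python
# Sketch: integrate f'' = x**2 - 3 f' - 7 f with f(0)=1, f'(0)=-3 numerically
# (classical RK4 on the first-order system y = (f, f')) and compare with P3.
def rhs(x, y):
    return (y[1], x**2 - 3*y[1] - 7*y[0])

def rk4(x, y, h):
    k1 = rhs(x, y)
    k2 = rhs(x + h/2, (y[0] + h/2*k1[0], y[1] + h/2*k1[1]))
    k3 = rhs(x + h/2, (y[0] + h/2*k2[0], y[1] + h/2*k2[1]))
    k4 = rhs(x + h,   (y[0] + h*k3[0],  y[1] + h*k3[1]))
    return (y[0] + h/6*(k1[0] + 2*k2[0] + 2*k3[0] + k4[0]),
            y[1] + h/6*(k1[1] + 2*k2[1] + 2*k3[1] + k4[1]))

def P3(x):
    return 1 - 3*x + x**2 + 2.5*x**3

x, y, h = 0.0, (1.0, -3.0), 0.001
while x < 0.1 - 1e-12:
    y = rk4(x, y, h)
    x += h

print(y[0], P3(0.1))   # approx. 0.7123 and 0.7125; the difference is of order x**4
```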

4.5 Functional Investigations

A very important property of continuous functions is the following, which means one
can control how large and how small values a continuous function can assume on an
interval, as long as the interval is sufficiently nice:

Theorem 4.17 Main Theorem for Continuous Functions of One Variable


Let f ( x ) denote a function that is continuous on all of its domain D ( f ) ⊂ R. Let
I = [ a , b ] be a bounded, closed, and connected interval in D ( f ).

Then the range for the function f ( x ) on the interval I is also a bounded, closed and
connected interval [ A , B ] ⊂ R, thus denoted:

R( f | I ) = f ( I ) = { f ( x ) | x ∈ I } = [ A , B ] , (4-51)

where the possibility that A = B is allowed and this happens precisely when f ( x ) is
constant on the whole interval I.

Definition 4.18 Global Minimum and Global Maximum


When a function f ( x ) has the range R( f | I ) = f ( I ) = [ A , B ] on an interval I = [ a, b]
we say that

1. A is the global minimum value for f ( x ) on I, and if f ( x0 ) = A for x0 ∈ I then


x0 is a global minimum point for f ( x ) on I.

2. B is the global maximum value for f ( x ) on I, and if f ( x0 ) = B for x0 ∈ I then


x0 is a global maximum point for f ( x ) on I.

A well-known and important task is to find the global maximum and minimum values
for given functions f ( x ) on given intervals and to determine the x-values for which
these maximum and minimum values are assumed, that is, the minimum and maximum
points. To solve this task the following is an invaluable help – see Figure 4.6:

Lemma 4.19 Maxima and Minima at Stationary Points


Let x0 be a global maximum or minimum point for f(x) on I. Assume that x0 is not an end point of the interval I and that f(x) is differentiable at x0.
Then x0 is a stationary point for f(x), i.e. f'(x0) = 0.

Proof

We outline the argument. Since f ( x ) is assumed differentiable, we have:

f ( x ) = f ( x0 ) + f 0 ( x0 ) · ( x − x0 ) + ( x − x0 ) · ε f ( x − x0 )
(4-52)
= f ( x0 ) + ( x − x0 ) · ( f 0 ( x0 ) + ε f ( x − x0 )) .

Now, if we assume that f'(x0) is positive, then the parenthesis (f'(x0) + ε_f(x − x0)) is also positive for x sufficiently close to x0 (since ε_f(x − x0) → 0 for x → x0). But then the product (x − x0) · (f'(x0) + ε_f(x − x0)) has the same sign as (x − x0) for x sufficiently close to x0, so that f(x) > f(x0) for x > x0 and f(x) < f(x0) for x < x0. Therefore f(x0) can be neither a maximum value nor a minimum value for f(x). A similar conclusion is reached when the assumption is f'(x0) < 0. So if x0 is a global maximum or minimum point for f(x) which is not an end point of I, we must have f'(x0) = 0.

Hereby we have the following investigation method at our disposal:

Method 4.20 Method of Investigation


Let f ( x ) be a continuous function and I = [ a, b] an interval in the domain D ( f ).

Maximum and minimum values for the function f ( x ), x ∈ I, i.e. A and B in the
range [ A, B] for f ( x ) restricted to I, are found by finding and comparing the function
values at the following points:

1. Interval end points (the boundary points a and b for the interval I).

2. Exception points, i.e. the points in the open interval ] a, b[ where the function is
not differentiable.

3. The stationary points, i.e. all the points x0 in the open interval ] a, b[ where
f 0 ( x0 ) = 0.

With this method of investigation we not only find the global maximum and
minimum values but also the x-values in I for which the global maximum and
the global minimum are assumed i.e. maximum and minimum points in the
actual interval.

Example 4.21 A Continuous Function Is Investigated

A continuous function f(x) is defined for all x in the following way:

f(x) = 0.75               for x ≤ −1.5
f(x) = 0.5 + (x + 1)^2    for −1.5 ≤ x ≤ 0
f(x) = 1.5 · (1 − x^3)    for 0 ≤ x ≤ 1            (4-53)
f(x) = x − 1              for 1 ≤ x ≤ 2
f(x) = 1                  for x > 2

See Figure 4.6, where we only consider the function on the interval I = [−1.5, 2.0]. There are
two exception points where the function is not differentiable: x0 = 0 and x0 = 1. There is
one stationary point in ] − 1.5, 2.0[ where f 0 ( x0 ) = 0 viz. x0 = −1. And finally there are two
boundary points (the interval end points x0 = −1.5 and x0 = 2) that need to be investigated.

Therefore we have the following candidates for global maximum and minimum values for f
on I:
x0 = −1.5 −1 0 1 2
(4-54)
f ( x0 ) = 0.75 0.5 1.5 0 1
In conclusion we read from this that the maximum value for f ( x ) is B = 1.5 which is assumed
at the maximum point x0 = 0. The minimum value is A = 0, assumed at the minimum point
x0 = 1. There are no other maximum or minimum points for f on I.

Figure 4.6: The continuous function f ( x ) from example 4.21 (blue). On the graph we
have marked (in red) the 5 points that need to be investigated particularly in order to
determine the range for f in the interval [−1.5, 2], cf. Method 4.20.
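Method 4.20 is easy to mechanize once the candidate points are known. The sketch below (plain Python) goes through the five candidate points from Example 4.21 and reads off the global maximum and minimum:

```python
# Sketch of Method 4.20 applied to the function from Example 4.21 on I = [-1.5, 2].
def f(x):
    if x <= -1.5:
        return 0.75
    if x <= 0:
        return 0.5 + (x + 1)**2
    if x <= 1:
        return 1.5 * (1 - x**3)
    if x <= 2:
        return x - 1
    return 1.0

candidates = [-1.5, 2.0]    # interval end points
candidates += [0.0, 1.0]    # exception points (f is not differentiable there)
candidates += [-1.0]        # stationary point: f'(x) = 2(x + 1) = 0 on (-1.5, 0)

values = {x0: f(x0) for x0 in candidates}
print(values)                        # {-1.5: 0.75, 2.0: 1.0, 0.0: 1.5, 1.0: 0.0, -1.0: 0.5}
print(max(values, key=values.get))   # 0.0  (global maximum point, value 1.5)
print(min(values, key=values.get))   # 1.0  (global minimum point, value 0.0)
```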

Definition 4.22 Local Minima and Local Maxima


Let f ( x ) denote a function on an interval I = [ a, b] containing a given x0 ∈] a, b[.

1. If f ( x ) ≥ f ( x0 ) for all x in a (as small as you like) neighborhood of x0 then


f ( x0 ) is called a local minimum value for f ( x ) in I and x0 is a local minimum
point for f ( x ) in I. If actually f ( x ) > f ( x0 ) for all x in the neighborhood apart
from the point x0 itself then f ( x0 ) is called a proper local minimum value.

2. If f ( x ) ≤ f ( x0 ) for all x in a (as small as you like) neighborhood of x0 then


f ( x0 ) is called a local maximum value for f ( x ) in I and x0 is a local maximum
point for f on I. If actually f ( x ) < f ( x0 ) for all x in the neighborhood apart
from the point x0 itself then f ( x0 ) is called a proper local maximum value.

If the function we want to investigate is smooth at its stationary points then we can
qualify the Method 4.20 even better, since the approximating polynomial of degree 2
with development point at the stationary point can help in the decision whether the
value of f ( x ) at the stationary point is a candidate to be a maximum value or a minimum
value.

Lemma 4.23 Local Analysis at a Stationary Point


Let f ( x ) be a smooth function and assume that x0 is a stationary point for f ( x ) on
an interval I =] a, b[. Then the following applies:

1. If f 00 ( x0 ) > 0 then f ( x0 ) is a proper local minimum value for f ( x ).

2. If f 00 ( x0 ) < 0 then f ( x0 ) is a proper local maximum value for f ( x ).

3. If f 00 ( x0 ) = 0 then this is not sufficient information to decide whether f ( x0 ) is


a local minimum value or a local maximum value or neither.

Exercise 4.24

Prove Lemma 4.23 by using Taylor’s limit formula with the approximating second-degree
polynomial for f ( x ) and with the development point x0 . Remember that x0 is a stationary
point, such that f 0 ( x0 ) = 0.

Example 4.25 Local Maxima and Minima

The continuous function f(x)

f(x) = 0.75               for x ≤ −1.5
f(x) = 0.5 + (x + 1)^2    for −1.5 ≤ x ≤ 0
f(x) = 1.5 · (1 − x^3)    for 0 ≤ x ≤ 1            (4-55)
f(x) = x − 1              for 1 ≤ x ≤ 2
f(x) = 1                  for x ≥ 2

is shown in Figure 4.6. On the interval I = [−1.5, 2.0] the function has the proper local
minimum values 0.5 and 0 in the respective proper local minimum points x0 = −1 and x0 = 1
and the function has a proper local maximum value 1.5 at the proper local maximum point
x0 = 0. If we extend the interval to J = [−7, 7] and note that the function values by definition are constant outside the interval I, we get the new local maximum values 0.75 and 1 for f on J; none of them is a proper local maximum value. All x0 ∈ ]−7, −1.5] and all x0 ∈ [2, 7[ are local maximum points for f on J, but none of them is a proper local maximum point. All x0 in the open interval ]−7, −1.5[ and all x0 in the open interval ]2, 7[ are in addition also local minimum points for f(x) in J, but none of them is a proper local minimum point.

Figure 4.7: Proper local maxima and proper local minima for the function from Example
4.26 are here indicated on the graph for the function. We note: The local maximum and
minimum points for the function are the x-coordinates of the graph points shown in red,
and the local maximum and minimum values for the function are the y-coordinates of
the graph-points shown in red.

Example 4.26 A Non-Elementary Function

The function f(x)

f(x) = ∫_0^x cos(t^2) dt   (4-56)

has stationary points at those values of x0 that satisfy:

f'(x0) = cos(x0^2) = 0 ,   i.e.   x0^2 = π/2 + p · π   where p is an integer .   (4-57)

Since we also have that

f''(x) = −2 · x · sin(x^2) ,   (4-58)

it follows that at the stated stationary points

f''(x0) = −2 · x0 · (−1)^p .   (4-59)

From this it follows – via Lemma 4.23 – that every other stationary point x0 along the x-axis
is a proper local maximum point for f ( x ) and the other points proper local minimum points.
See Figure 4.7. In Figure 4.8 are shown graphs (parabolas) for a pair of the approximat-
ing second-degree polynomials for f ( x ) with the development points at chosen stationary
points.

Figure 4.8: The graph for the function in Example 4.26 and two approximating parabolas
with development points in two stationary points, which are a proper local minimum
point and a proper local maximum point for f ( x ).

Example 4.27 When the Approximation to Degree 2 is Not Good Enough

As stated in Lemma 4.23 one cannot from f 0 ( x0 ) = f 00 ( x0 ) = 0 decide whether the function
has a local maximum or minimum at x0 . This is shown in the three simple functions in Figure
4.9 with all the clarity one could wish for: f 1 ( x ) = x4 , f 2 ( x ) = − x4 and f 3 ( x ) = x3 . All three
functions have a stationary point at x0 = 0 and all have f 00 ( x0 ) = 0, but f 1 ( x ) has a proper
local minimum point at 0, f 2 ( x ) has a proper local maximum point at 0, and f 3 ( x ) has neither
a local minimum point nor a local maximum point at 0.

Figure 4.9: Three elementary functions with approximating second-degree polynomials


P2,x0 =0 ( x ) = 0 for all x. The functions are: f 1 ( x ) = x4 (red), f 2 ( x ) = − x4 (black) and
f 3 ( x ) = x3 (blue).

4.6 Summary

In this eNote we have studied how one can approximate smooth functions using poly-
nomials.

• Every smooth function f(x) on an interval I can be split into an approximating nth-degree polynomial Pn,x0(x) with the development point x0 and a corresponding remainder function Rn,x0(x) like this:

  f(x) = Pn,x0(x) + Rn,x0(x) ,   (4-60)

  where the polynomial and the remainder function in Taylor's limit formula are written like this:

  f(x) = f(x0) + (f'(x0)/1!) · (x − x0) + · · · + (f^(n)(x0)/n!) · (x − x0)^n + (x − x0)^n · ε_f(x − x0) ,

  with ε_f(x − x0) denoting an epsilon function of (x − x0), i.e. ε_f(x − x0) → 0 for x → x0.

• Taylor's limit formula can be used to find the continuous extension of quotients of functions by finding (if possible) their limit values for x → x0, where x0 is a value at which the denominator function is 0 so that the quotient itself is not defined at x0:

  sin(x)/x = (x + x^1 · ε(x)) / x = 1 + ε(x) → 1   for   x → 0 .   (4-61)

• Estimation of the remainder function gives an upper bound for the largest numer-
ical difference between a given function and the approximating polynomial of a
suitable degree and with a suitable development point on a given interval of in-
vestigation. Such an estimation can also be made for functions that are possibly
only ”known” via a differential equation or as a non-elementary integral:

  |ln(x) − (x − 1)| ≤ 1/18   for all   x ∈ [3/4, 5/4] .   (4-62)

• Taylor’s limit formula with approximating second-degree polynomials is used for


efficient functional investigation, including determination of range, global and lo-
cal maxima and minima for given functions.

eNote 5

The Number Spaces Rn and Cn

This eNote is about the real number space Rn and the complex number space Cn , which are
essential building blocks in Linear Algebra.

Update: 23.09.21 David Brander

5.1 Number Spaces

Remark 5.1 The Common Notion L


Definitions and rules in this eNote are valid both for the real numbers R and the complex numbers C. The set of real numbers and the set of complex numbers are examples of fields. Fields share the same elementary arithmetic rules (the same rules as those for C described in Theorem 1.12 in eNote 1). In the following, when we use the symbol L it means that the statement is valid both for the set of real numbers and for the set of complex numbers.

Rn is the symbol for the set of all n-tuples that contain n real elements. For example,
(1, 4, 5) and (1, 5, 4)
are two different 3-tuples that belong to R3 . Similarly Cn is the symbol for the set of all
n-tuples which contains n complex elements, e.g.
(1 + 2i, 0, 3i, 1, 1) and (1, 2, 3, 4, 5)

are two different 5-tuples that belong to C5 . Formally we write Ln in set notation as:
Ln = {( a1 , a2 , ..., an ) | ai ∈ L} . (5-1)

We introduce addition of elements in Ln and multiplication of elements in Ln by an


element of L (a scalar) by the following definition:

Definition 5.2
Let ( a1 , a2 , ..., an ) and (b1 , b2 , ..., bn ) be two elements of Ln and let k be a number in L
(a scalar). The sum of the two n-tuples is defined by

( a1 , a2 , ..., an ) + (b1 , b2 , ..., bn ) = ( a1 + b1 , a2 + b2 , ..., an + bn ), (5-2)

and the product of ( a1 , a2 , ..., an ) by k by

k · ( a1 , a2 , ..., an ) = ( a1 , a2 , ..., an ) · k = (k · a1 , k · a2 , ..., k · an ). (5-3)

Rn with the operations (5-2) and (5-3) is called the n-dimensional real number space.
Similarly, Cn with the operations (5-2) and (5-3), the n-dimensional complex number space.

Example 5.3 Addition

An example of the addition of two 4-tuples in R4 is

(1, 2, 3, 4) + (2, 1, −2, −5) = (3, 3, 1, −1)

Example 5.4 Multiplication

An example of multiplication of a 3-tuple in R3 by a scalar is

5 · (2, 4, 5) = (10, 20, 25) .

An example of multiplication of a 2-tuple in C2 by a scalar is

i · (2 + i, 4) = (−1 + 2i, 4 i ) .
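The operations of Definition 5.2 can be written out directly for Python tuples. The sketch below (plain Python; the function names add and scale are our own) reproduces Examples 5.3 and 5.4, and complex scalars work unchanged:

```python
# Sketch: tuple addition and multiplication by a scalar, cf. Definition 5.2.
def add(a, b):
    return tuple(ai + bi for ai, bi in zip(a, b))

def scale(k, a):
    return tuple(k * ai for ai in a)

print(add((1, 2, 3, 4), (2, 1, -2, -5)))   # (3, 3, 1, -1)
print(scale(5, (2, 4, 5)))                 # (10, 20, 25)
print(scale(1j, (2 + 1j, 4)))              # ((-1+2j), 4j)
```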

As a short notation for n-tuples we often use small bold letters; we write e.g.
a = (3, 2, 1) or b = (b1 , b2 , ..., bn ) .
For the n-tuple (0, 0, ..., 0), which is called the zero element of Ln , we use the notation
0 = (0, 0, ..., 0) .

When more complicated computational exercises in the number spaces are called for,
there is a need for the following arithmetic rules.

Theorem 5.5 Arithmetic Rules in Ln


For all values of n, in the number space Ln the operations introduced in definition
5.2 obey the following eight rules:

1. a + b = b + a (addition is commutative)

2. (a + b) + c = a + (b + c) (addition is associative)

3. For all a: a + 0 = a (i.e. 0 is neutral with respect to addition)

4. For all a there exists an opposite element −a such that a + (−a) = 0

5. k1 (k2 a) = (k1 k2 )a (multiplication by scalars is associative)

6. (k1 + k2 )a = k1 a + k2 a (distributive rule)

7. k1 (a + b) = k1 a + k1 b (distributive rule)

8. 1a = a (the number 1 is neutral in a product with a scalar)

Proof

Concerning rule 4: Given two vectors a = ( a1 , ... an ) and b = (b1 , ..., bn ) . Then

a + b = ( a1 + b1 , ..., an + bn ) = 0 ⇔ b1 = − a1 , ..., bn = − an .

From this we deduce that a has an opposite vector −a given by −a = (− a1 , ..., − an ) . More-
over, this vector is unique.

The other rules are proved by calculating the left and right hand side of the equations and
then comparing the two results.

From the proof of rule 4 in theorem 5.5 it is evident that for an arbitrary n-tuple
a: −a = (−1)a .

Exercise 5.6

Give a formal proof of rule 2 and rule 5 in Theorem 5.5.

Definition 5.7 Subtraction


Given a ∈ Ln and b ∈ Ln . The difference a − b is defined as:

a − b = a + (−b) . (5-4)

Example 5.8 Subtraction

(1 + 2i, 1) − (i, 2) = (1 + 2i, 1) + (−(i, 2)) = (1 + 2i, 1) + (−i, −2) = (1 + i, −1) .

Exercise 5.9 The Zero Rule

Show that the following variant of the zero rule is valid:

ka = 0 ⇔ k = 0 or a = 0 . (5-5)

Remark 5.10 n-Tuples as Vectors


Often an n-tuple is written as a column vector. We have two equivalent ways of writing, here with an example from R4:

v = (1, 2, 3, 4)   and   v =
[ 1 ]
[ 2 ]
[ 3 ]
[ 4 ]

If in a given context the n-tuple is regarded as a row vector, then a transposition is performed. The transpose of a column vector is a row vector (and vice versa); it is marked with the symbol T:

v^T = [ 1  2  3  4 ] .

eNote 6

Systems of Linear Equations

(Updated 24.9.2021 David Brander)

6.1 Linear Equations

Remark 6.1 The Common Notion L


Definitions and rules in this eNote are valid both for the real numbers R and the complex numbers C. The set of real numbers and the set of complex numbers are examples of fields. Fields share the same elementary arithmetic rules (the same rules as those for C described in Theorem 1.12 in eNote 1). In the following, when we use the symbol L it means that the statement is valid both for the set of real numbers and for the set of complex numbers.

A linear equation with n unknowns x1 , x2 , . . . xn is an equation of the form

a1 · x1 + a2 · x2 + . . . + a n · x n = b . (6-1)

The numbers a1 , a2 , . . . , an are called the coefficients and the number b is, in this con-
text, called the right hand side. The coefficients and the right hand side are considered
known in contrast to the unknowns. The equation is called homogeneous if b = 0, else
inhomogeneous.

Definition 6.2 Solution to a Linear Equation


By a solution to the equation

a1 · x1 + a2 · x2 + . . . + a n · x n = b . (6-2)

we shall understand an n-tuple x = ( x1 , x2 , . . . , xn ) ∈ Ln that by substitution into


the equation makes the left hand side of the equation equal to the right hand side.

By the general solution or just the solution set we understand the set of all solutions
to the equation.

Example 6.3 The Equation for a Straight Line in the Plane

An example of a linear equation is the equation for a straight line in the ( x, y)-plane:

y = 2x +5. (6-3)

Here y is isolated on the left hand side and the coefficients 2 and 5 have well known geomet-
rical interpretations. But the equation could also be written

−2 x1 + 1 x2 = 5 (6-4)

where x and y are substituted by the more general names for unknowns, x1 and x2 , and the
equation is of the form (6-1).

The solution set for the equation (6-3) is of course the coordinate set for all points on the line
- by substitution they will satisfy the equation in contrast to all other points!

Example 6.4 Trivial and Inconsistent Equations

The linear equation


0x1 + 0x2 + 0x3 + 0x4 = 0 ⇔ 0 = 0 (6-5)
where all coefficients and the right hand side are 0, is an example of a trivial equation. The
solution set of the equation consists of all x = ( x1 , x2 , x3 , x4 ) ∈ L4 .

If all the coefficients of the equation are 0 but the right hand side is non-zero, the equation is
an inconsistent equation, that is, an equation without a solution. An example is the equation

0x1 + 0x2 + 0x3 + 0x4 = 1 ⇔ 0 = 1 . (6-6)



When you investigate linear equations, you can use the usual rule of conversion for
equations: The set of solutions for the equation is not changed if you add the same
number to both sides of the equality sign, and you do not change the solution set if you
multiply both sides of the equality sign by a non-zero constant.

All linear equations that are not inconsistent and which contain more than one solution,
have infinitely many solutions. The following example shows how the solution set in
this case can be written.

Example 6.5 Infinitely Many Solutions in Standard Parameter Form

We consider an inhomogeneous equation with three unknowns:

2 x1 − x2 + 4 x3 = 5 . (6-7)

By substitution of x1 = 1, x2 = 1 and x3 = 1 into the equation (6-7) we see that x = (1, 1, 1) is


a solution. But by this we have not found the general solution, because x = ( 12 , 0, 1) is also a
solution. How can we describe the complete set of solutions?

First we isolate x1 :

x1 = 5/2 + (1/2) · x2 − 2 · x3 .   (6-8)
To every choice of x2 and x3 corresponds exactly one x1 . For example, if we set x2 = 1 and
x3 = 4, then x1 = −5. This means that the 3-tuple (−5, 1, 4) is a solution. Therefore we
can consider x2 and x3 free parameters that together determine the value of x1 . Therefore we
rename x2 and x3 to the parameter names s and t, respectively: s = x2 and t = x3 . Then x1
can be expressed as:

x1 = 5/2 + (1/2) · x2 − 2 · x3 = 5/2 + (1/2) · s − 2 · t .   (6-9)
Now we can write the general solution to (6-7) in the following standard parameter form:

    [ x1 ]   [ 5/2 ]       [ 1/2 ]       [ −2 ]
x = [ x2 ] = [  0  ] + s · [  1  ] + t · [  0 ]   with s, t ∈ L .   (6-10)
    [ x3 ]   [  0  ]       [  0  ]       [  1 ]

Note that the parameter form of the middle equation x2 = 0 + s · 1 + t · 0 only expresses the
renaming x2 → s. Similarly, the last equation only expresses the renaming x3 → t.
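A quick way to convince oneself that (6-10) really describes solutions is to insert a few parameter values and check the equation (6-7). A small sketch (plain Python; the sample parameter values are arbitrary):

```python
# Sketch: every point of the parametric form (6-10) satisfies 2*x1 - x2 + 4*x3 = 5.
def point(s, t):
    return (5/2 + s/2 - 2*t, s, t)

for (s, t) in [(0, 0), (1, 4), (-3, 2.5), (10, -7)]:
    x1, x2, x3 = point(s, t)
    print(2*x1 - x2 + 4*x3)   # always 5.0
```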

If we consider the equation (6-7) to be an equation for a plane in space, then


the equation (6-10) is a parametric representation for the same plane. The first
column on the right hand side is the initial point in the plane, and the two last
columns are directional vectors for the plane. This is elaborated in the eNote 10
Geometric Vectors.

6.2 A System of Linear Equations

A system of linear equations consisting of m linear equations with n unknowns is writ-


ten in the form
a11 · x1 + a12 · x2 + . . . + a1n · xn = b1
a21 · x1 + a22 · x2 + . . . + a2n · xn = b2
        ⋮                                           (6-11)
am1 · x1 + am2 · x2 + . . . + amn · xn = bm

The system has m rows, each of which contains an equation. The n unknowns, denoted
x1 , x2 , . . . xn , are present in each of the m equations (unless some of the coefficients
are zero, and we choose not to write down the zero terms). The coefficient of x j in the
equation in row number i is denoted aij . The system is termed homogeneous if all the m
right hand sides bi are equal to 0, otherwise inhomogeneous.

Definition 6.6 Solution of System of Linear Equations


By a solution to the system of linear equations

a11 · x1 + a12 · x2 + . . . + a1n · xn = b1
a21 · x1 + a22 · x2 + . . . + a2n · xn = b2
        ⋮                                           (6-12)
am1 · x1 + am2 · x2 + . . . + amn · xn = bm

we understand an n-tuple x = ( x1 , x2 , . . . xn ) ∈ Ln which by substitution into all of


the m linear equations satisfies the equations, i.e. makes the left hand side of each
equal to the right hand side.

By the general solution or just the solution set we understand the set of all solutions
to the system. A single solution is often termed a particular solution.

Example 6.7 A Homogeneous System of Linear Equations

A homogeneous system of linear equations consisting of two equations with four unknowns
is given by:
x1 + x2 + 2 x3 + x4 = 0
(6-13)
2 x1 − x2 − x3 + x4 = 0

We investigate whether the two 4-tuples x = (1, 1, 2, −6) and y = (3, 0, 1, −5) are particular
solutions to the equations (6-13). Substituting x into the left hand side of the system we get

1+1+2·2−6 = 0
(6-14)
2 · 1 − 1 − 2 − 6 = −7
Because the left hand side is equal to the given right hand side 0 in the first of these equations,
x is only a solution to the first of the two equations. Therefore x is not a solution to the system.

Substituting y we get
3+0+2·1−5 = 0
(6-15)
2·3−0−1−5 = 0
Since in both equations the left hand side is equal to the right hand side 0 , y is a solution to
both of the equations. Therefore y is a particular solution to the system.

The solution set to a system of linear equations is the intersection of the solu-
tion sets for all the equations comprising the system.

6.3 The Coefficient Matrix and the Augmented Matrix

When we investigate a system of linear equations it is often convenient to use matrices.


A matrix is a rectangular array consisting of a number of rows and columns. As an
example the matrix M given by
 
M =
[ 1  0  5 ]          (6-16)
[ 8  3  2 ]

has two rows and three columns. The six elements are termed the elements of the ma-
trix. The diagonal of the matrix consists of the elements with equal row and column
numbers. In M the diagonal consists of the elements 1 and 3.

By the coefficient matrix A to the system of linear equations (6-11) we understand the ma-
trix whose first row consists of the coefficients in the first equation, whose second row
consists of the coefficients in the second equation, etc. In short, the following matrix
with m rows and n columns:

A =
[ a11  a12  · · ·  a1n ]
[ a21  a22  · · ·  a2n ]
[  ⋮     ⋮            ⋮  ]          (6-17)
[ am1  am2  · · ·  amn ]
The augmented matrix T of the system is constructed by adding a new column to the
coefficient matrix consisting of the right hand sides bi of the system. Thus T consists of
m rows and n + 1 columns. If we collect the right hand sides bi into a column vector b,
which we denote the right hand side of the system, T is composed as follows, where the
vertical line symbolizes the equality sign of the system:

T = [ A | b ] =
[ a11  a12  · · ·  a1n | b1 ]
[ a21  a22  · · ·  a2n | b2 ]
[  ⋮     ⋮            ⋮  |  ⋮ ]          (6-18)
[ am1  am2  · · ·  amn | bm ]

The vertical line in front of the last column in (6-18) serves only the didactic purpose of creating a clear representation of the augmented matrix. One can choose to leave out the line if in a given context this does not lead to misunderstandings.

Example 6.8 Coefficient Matrix, Right Hand Side and Augmented Matrix

In the following system of linear equations with 3 equations and 3 unknowns

− x2 + x3 = 2
2x1 + 4x2 − 2x3 = 2 (6-19)
3x1 + 4x2 + x3 = 9

we have

A =
[ 0  −1   1 ]
[ 2   4  −2 ] ,
[ 3   4   1 ]

b =
[ 2 ]
[ 2 ] ,
[ 9 ]

and

T =
[ 0  −1   1 | 2 ]
[ 2   4  −2 | 2 ]          (6-20)
[ 3   4   1 | 9 ]
Notice that the 0 that is placed in the top left position in A and T, denotes that the coefficient
of x1 in the uppermost row of the system is 0.

The clever thing about a coefficient matrix (and an augmented matrix) is that
we do not need to write down the unknowns. The unique position of the
coefficients in the matrix means that we are sure of which of the unknowns
any single particular coefficient belongs to. Thus we have removed redundant
symbols!

6.4 Row Reduction of Systems of Linear Equations

Systems of linear equations can be reduced, that is, made simpler using a method called
Gaussian elimination. The method has several versions, and the special variant used in
these eNotes goes by the name Gauss-Jordan elimination . The algebraic basis for all
variants is that you can reshape a system of linear equations by so-called row operations
without thereby changing the solution set for the system. When a system of equations
is reduced as much as possible it is usually easy to read it and to evaluate the solution
set.

Theorem 6.9 Row Operations


The solution set of a system of linear equations is not altered if the system is trans-
formed by any of the following three row operations:

ro1 : Let two of the equations swap rows.

ro2 : Multiply one of the equations by a non-zero constant.

ro3 : To a given equation add one of the other equations multiplied by a constant.

Here we introduce a short notation for each of the three row operations:
ro1 : Ri ↔ R j : The equation in row i is swapped with the equation in row j.
ro2 : k · Ri : The equation in row i is multiplied by k.
ro3 : R j + k · Ri : Add the equation in row i, multiplied by k, to the equation in row j.

In the following example we test the three row operations.



Example 6.10 Row Operations

An example of ro1 : Consider the system of equations below to the left. We swap two equa-
tions in the two rows thus performing R1 ↔ R2 .

x1 + 2x2 = −3 x1 + x2 = 0
→ (6-21)
x1 + x2 = 0 x1 + 2x2 = −3

The system to the right has the same solution set as the system on the left.

An example of ro2 : Consider the system of equations below to the left. We multiply the
equation in the second row by 5, thus performing 5 · R2 :

x1 + 2x2 = −3 x1 + 2x2 = −3
→ (6-22)
x1 + x2 = 0 5 x1 + 5 x2 = 0

The system to the right has the same solution set as the system on the left.

An example of ro3 : Consider the system of equations below to the left. To the equation in the
second row we add the equation in the first row multiplied by 2, thus performing R2 + 2 · R1 :

x1 + 2x2 = −3 x1 + 2x2 = −3
→ (6-23)
x1 + x2 = 0 3x1 + 5x2 = −6
The system to the right has the same solution set as the system on the left.

The arrow, →, which is used in the three examples indicates that one or more row
operations have taken place.

Proof

The first part of the proof of 6.9 is simple: Since the solution set of a system of equations is
equal to the intersection F of the solution sets for the various equations comprising the system,
F is not altered by the order of the equations being changed. Therefore ro1 is allowed.

Since the solution set of a given equation is not altered when the equation is multiplied by
a constant k 6= 0, F will not be altered if one of the equations is replaced by the equation
multiplied by a constant different from 0. Therefore ro2 is allowed.

Finally consider a system of linear equations A with n unknowns x = ( x1 , x2 , . . . xn ). We


write the left hand side of an equation in A as L(x) and the right hand side as b . Now

we perform an arbitrary row operation of the type ro3 in the following way: An arbitrary
equation L1 (x) = b1 is multiplied by an arbitrary number k and is then added to an arbitrary
different equation L2 (x) = b2 . This produces a new equation L3 (x) = b3 where

L3 (x) = L2 (x) + k L1 (x) and b3 = b2 + k b1 .

We now show that the system of equations B that emerges as a result of replacing L2 (x) = b2
in A by L3 (x) = b3 has the same solution set as A, and that ro3 thus is allowed. First, assume
that x0 is an arbitrary solution to A . Then it follows from the transformation rules for a linear
equation that
k L1 (x0 ) = k b1
and further that
L2 (x0 ) + k L1 (x0 ) = b2 + k b1 .
From this it follows that L3 (x0 ) = b3 , and that x0 is a solution to B. Assume vice versa that
x1 is an arbitrary solution to B . Then it follows that

−k L1 (x1 ) = −k b1

and further that


L3 (x1 ) − k L1 (x1 ) = b3 − k b1 .
This means that L2 (x1 ) = b2 , and that x1 also is a solution to A. In sum we have shown that
ro3 is allowed.

From 6.9 follows directly:

Corollary 6.11
The solution set of a system of linear equations is not altered if the system is trans-
formed an arbitrary number of times, in any order, by the three row operations.

We are now ready to use the three row operations for the row reduction of systems
of linear equations. In the following example we follow the principles of Gauss-Jordan
elimination, and a complete description of the method follows in subsection 6.5.

Example 6.12 Gauss-Jordan Elimination

Consider below a system of linear equations consisting of three equations with the three unknowns x1, x2 and x3, followed by the augmented matrix T for the system:

     − x2 +  x3 = 2
2x1 + 4x2 − 2x3 = 2                            (6-24)
3x1 + 4x2 +  x3 = 9

T =
[ 0  −1   1 | 2 ]
[ 2   4  −2 | 2 ]
[ 3   4   1 | 9 ]

The purpose of reduction is to achieve, by means of row operations, the following situation:
x1 is the only remaining part on left hand side of the upper equation , x2 is the only one on the
left hand side of the middle equation and x3 is the only one on the left hand side of the lower
equation. If this is possible then the system of equations is not only reduced but also solved!
This is achieved in a series of steps taken in accordance with the Gauss-Jordan algorithm.
Simultaneously we look at the effect the row operations have on the augmented matrix.

First we aim to have the topmost equation comprise x1, and to have the coefficient of this x1 be 1. This can be achieved in two steps: we swap the two top equations and multiply the equation now in the top row by 1/2. That is,

R1 ↔ R2   and   (1/2) · R1 :

x1 + 2x2 −  x3 = 1
     − x2 +  x3 = 2                            (6-25)
3x1 + 4x2 +  x3 = 9

[ 1   2  −1 | 1 ]
[ 0  −1   1 | 2 ]
[ 3   4   1 | 9 ]

Now we remove all other occurrences of x1 . In this example it is only one occurrence, i.e. in
row 3. This is achieved as follows: we multiply the equation in row 1 by the number −3 and
add the product to the equation in row 3, in short

R3 − 3 · R1 :

x1 + 2x2 −  x3 = 1
     − x2 +  x3 = 2                            (6-26)
    −2x2 + 4x3 = 6

[ 1   2  −1 | 1 ]
[ 0  −1   1 | 2 ]
[ 0  −2   4 | 6 ]

We have now achieved that x1 only appears in row 1 . There it must stay! The work on x1
is finished. This corresponds to the fact that at the top of the first column of the augmented
matrix there is 1 and directly below it only 0’s. This means that work on the first column is
finished !

The next transformations aim at ensuring that the unknown x2 will be represented only in row 2 and nowhere else. First we make sure that the coefficient of x2 in row 2 switches from −1 to 1 by use of the operation

(−1) · R2 :

x1 + 2x2 −  x3 = 1
       x2 −  x3 = −2                           (6-27)
    −2x2 + 4x3 = 6

[ 1   2  −1 |  1 ]
[ 0   1  −1 | −2 ]
[ 0  −2   4 |  6 ]

We now remove the occurrences of x2 from row 1 and row 3 with the operations

R1 − 2 · R2   and   R3 + 2 · R2 :

x1        +  x3 = 5
       x2 −  x3 = −2                           (6-28)
             2x3 = 2

[ 1   0   1 |  5 ]
[ 0   1  −1 | −2 ]
[ 0   0   2 |  2 ]

Now the work with x2 is finished, which corresponds to the fact that in row 2 in the aug-
mented matrix the number in the second column is 1, all the other numbers in the second
column being 0. This column must not be altered by subsequent operations.

Finally we wish that the unknown x3 is represented in row 3 by the coefficient 1 and that x3
is removed from row 1 and row 2. This can be accomplished in two steps. First

(1/2) · R3 :

x1        +  x3 = 5
       x2 −  x3 = −2                           (6-29)
              x3 = 1

[ 1   0   1 |  5 ]
[ 0   1  −1 | −2 ]
[ 0   0   1 |  1 ]

Then

R1 − R3   and   R2 + R3 :

x1              = 4
       x2       = −1                           (6-30)
              x3 = 1

[ 1   0   0 |  4 ]
[ 0   1   0 | −1 ]
[ 0   0   1 |  1 ]

Now x3 only appears in row 3. This corresponds to the fact that in column 3 in the third row
of the augmented matrix we have 1, each of the other elements in the column being 0. We
have now completed a total reduction of the system, and from this we can conclude that there
exists exactly one solution to the system viz :

x = ( x1 , x2 , x3 ) = (4, −1, 1). (6-31)



Let us remember what a solution is: an n-tuple that satisfies all the equations in
the system! Let us prove that formula (6-31) actually is a solution to equation
(6-24):

−(−1) + 1 = 2
2 · 4 + 4 · (−1) − 2 · 1 = 2
3 · 4 + 4 · (−1) + 1 = 9
As expected all three equations are satisfied!
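The row operations of Example 6.12 can also be carried out by a small program. The sketch below (plain Python, exact arithmetic via fractions; the function name rref is our own, chosen to match the notation rref(M) introduced in Theorem 6.16 below) applies the three row operations ro1, ro2, ro3 column by column and reproduces (6-30):

```python
# Sketch: a compact Gauss-Jordan elimination applied to the augmented matrix T
# from (6-24). Only the row operations ro1, ro2, ro3 are used.
from fractions import Fraction

def rref(rows):
    """Return the reduced row echelon form of a matrix given as a list of rows."""
    m = [[Fraction(a) for a in row] for row in rows]
    n_rows, n_cols = len(m), len(m[0])
    pivot_row = 0
    for col in range(n_cols):
        # find a row at or below pivot_row with a non-zero entry in this column
        pr = next((r for r in range(pivot_row, n_rows) if m[r][col] != 0), None)
        if pr is None:
            continue
        m[pivot_row], m[pr] = m[pr], m[pivot_row]              # ro1: swap rows
        pivot = m[pivot_row][col]
        m[pivot_row] = [a / pivot for a in m[pivot_row]]       # ro2: make leading 1
        for r in range(n_rows):                                # ro3: clear the column
            if r != pivot_row and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[pivot_row])]
        pivot_row += 1
        if pivot_row == n_rows:
            break
    return m

T = [[0, -1, 1, 2],
     [2, 4, -2, 2],
     [3, 4, 1, 9]]

for row in rref(T):
    print([str(a) for a in row])   # ['1','0','0','4'], ['0','1','0','-1'], ['0','0','1','1']
```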

In (6-30), after the row operations, the augmented matrix of the system of linear equations has achieved a form of special beauty, with three so-called leading 1's in the diagonal and zeros everywhere else in the coefficient part. We say that the transformed matrix is in reduced row echelon form. It is not always possible to get the simple representation shown in (6-30). Sometimes the leading 1 in the next row is found more than one column to the right as one moves down. The somewhat complex definition follows below.

Definition 6.13 Reduced Row Echelon Form


A system of linear equations is denoted to be in reduced row echelon form, if the
corresponding augmented matrix fulfills the following four conditions:

1. The first number in a row that is not 0, is a 1. This is called the leading 1 or the
pivot of the row.

2. In two consecutive rows which both contain a pivot, the upper row’s leading
1 is further to the left than the leading 1 in the following row.

3. In a column with a leading 1, all other elements are 0.

4. Any rows with only 0’s are placed at the bottom of the matrix.

Example 6.14 Reduced Row Echelon Form

Consider the three matrices

A =
[ 1  0  0 ]
[ 0  1  0 ]
[ 0  0  1 ]

B =
[ 1  2  0 ]
[ 0  0  1 ]
[ 0  0  0 ]

and C =
[ 1  3  1 ]
[ 0  0  0 ]          (6-32)
[ 0  0  0 ]

The three matrices shown are all in reduced row echelon form. In A all the leading 1's are nicely placed in the diagonal. B has only two leading 1's, and you have to move two columns to the right to get from the first to the second. In C there is only one leading 1.

Example 6.15

None of the following four matrices is in reduced row echelon form, because each violates exactly one of the rules in Definition 6.13 – which one is left to the reader to figure out!

A =
[ 1  1  0 ]
[ 0  1  0 ]
[ 0  0  1 ]

B =
[ 0  0  0 ]
[ 1  2  0 ]
[ 0  0  1 ]

C =
[ 1  0  0 ]
[ 0  2  1 ]
[ 0  0  0 ]

and D =
[ 1  0  0 ]
[ 0  0  1 ]          (6-33)
[ 0  1  0 ]

Note the following important theorem about the relationship between a matrix on the
one hand, and the reduced row echelon form of the same matrix produced through the
use of row operations, on the other.

Theorem 6.16 Reduced Row Echelon Form


If a given matrix M is transformed by two different sequences of row operations into
a reduced row echelon form, then the two resulting reduced row echelon forms are
identical.

The unique reduced row echelon form a given matrix M can be transformed into
this way is termed the reduced row echelon form, and given the symbol rref(M).

Proof

We use the following model for the six matrices that are introduced in the course of the proof:

       f1           f2
A   ←−−−   M   −−−→   B
             ↓                        (6-34)
       f1           f2
A1  ←−−−   M1  −−−→   B1
Suppose a matrix M has been transformed, by two different series of row operations f 1 and
f 2 , into two different reduced row echelon forms A and B . Let column number k be the first
column of A and B where the two matrices differ from one another. We form a new matrix M1
from M in the following way. First we remove all the columns in M whose column numbers
are larger than k. Then we remove just the columns in M whose column numbers are less
than k , and have the same column numbers as a column in A (and thus B ) which does not
contain a leading 1.

Now we transform M1 by the series of row operations f 1 and f 2 , and the resulting matrices
formed hereby are called A1 and B1 , respectively. Then A1 necessarily will be the same matrix
that would result if we remove all the columns from A, similar to those we took away from M
to produce M1 . And the same relationship exists between B1 and B. A1 and B1 will therefore
have a leading 1 in the diagonal of all columns apart from the last, which is the first column
where the two matrices are different from one another. In this last column there are two
possibilities: Either one of the matrices has a leading 1 in this column or neither of them has.
An example of how the situation in the first case could be is:

   
A_1 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \qquad B_1 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 2 \\ 0 & 0 & 0 \end{bmatrix}   (6-35)
We now interpret M1 as the augmented matrix for a system of linear equations L . Both A1
and B1 will then represent a totally reduced system of equations with the same solution set
as L . However, this leads to a contradiction since one of the totally reduced systems is seen
to be inconsistent due to one of the equations now being invalid and the other will have just
one solution. We can therefore rule out that one of A1 and B1 contains a leading 1 in the last
column.

We now investigate the other possibility, that neither of A1 and B1 contains a leading 1 in the
last column. The situation could then be like this:
   
A_1 = \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 3 \\ 0 & 0 & 0 \end{bmatrix} \qquad B_1 = \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 2 \\ 0 & 0 & 0 \end{bmatrix}   (6-36)

Both the totally reduced system of equations as represented by A1 , and that which is
represented by B1 , will in this case have exactly one solution. But when the last column

is different in the two matrices the solution for A1 ’s system of equations will be different
from the solution for B1 ’s system of equations, whereby we again have ended up in a
contradiction.

We conclude that the assumption that M might be transformed into two different reduced
row echelon forms cannot be true. Hence, to M corresponds a unique reduced row echelon
form: rref(M).

From Theorem 6.16 it is relatively easy to obtain the next result about matrices that can be
transformed into each other through row operations:

Corollary 6.17
If a matrix M has been transformed by an arbitrary sequence of row operations into
the matrix N, then

rref(N) = rref(M). (6-37)

Proof

Let s be a sequence of row operations that transforms the matrix M to the matrix N, and
let t be a sequence of row operations that transforms the matrix N to rref(N). Then the
sequence of row operations consisting of s followed by t transforms M to rref(N). But since
M, in accordance with Theorem 6.16, has a unique reduced row echelon form, rref(M) must be equal to
rref(N).

If, in the preceding corollary, we interpret M and N as the augmented matrices for two
systems of linear equations, then it follows directly from Definition 6.13 that:

Corollary 6.18
If two systems of linear equations can be transformed into one another by the use of
row operations, then they are identical in the reduced row echelon form (apart from
possible trivial equations).

6.5 Gauss-Jordan Elimination

We are now able to precisely introduce the method of elimination that is applied in these
eNotes.

Definition 6.19 Gauss-Jordan Elimination


A system of linear equations is totally reduced by Gauss-Jordan elimination when
the corresponding augmented matrix after the use of the three row operations (see
theorem 6.9) is brought into the reduced row echelon form by the following proce-
dure:
We proceed from left to right : First we treat the first column of the aug-
mented matrix so that it does not conflict with the reduced row echelon
form, then the second column is treated so as not to conflict with the re-
duced row echelon form, and so on, up to and including the last column
in the augmented matrix.

This is always possible!

When you are in the process of reducing systems of linear equations, you are
free to deviate from the Gauss-Jordan method if it is convenient in the situation
at hand. If you have achieved a reduced row echelon form by using other
sequences of row operations, it is the same form that would have been obtained
by using the Gauss-Jordan method strictly. This follows from corollary 6.18.

In Example 6.12 it was possible to read the solution from the totally reduced system of
linear equations. In the following main example the situation is a bit more complicated
owing to the fact that the system has infinitely many solutions.

Example 6.20 Gauss-Jordan Elimination

We want to reduce the following system of four linear equations in five unknowns:

x1 +  3x2 + 2x3 +  4x4 +  5x5 =  9
2x1 +  6x2 + 4x3 +  3x4 +  5x5 =  3
3x1 +  8x2 + 6x3 +  7x4 +  6x5 =  5   (6-38)
4x1 + 14x2 + 8x3 + 10x4 + 22x5 = 32

We write the augmented matrix for the system:

T = \begin{bmatrix} 1 & 3 & 2 & 4 & 5 & 9 \\ 2 & 6 & 4 & 3 & 5 & 3 \\ 3 & 8 & 6 & 7 & 6 & 5 \\ 4 & 14 & 8 & 10 & 22 & 32 \end{bmatrix}   (6-39)

Below we reduce the system using the three row operations. We will do this by looking only at
the transformations of the augmented matrix!

R2 − 2 · R1 , R3 − 3 · R1 and R4 − 4 · R1 :
\begin{bmatrix} 1 & 3 & 2 & 4 & 5 & 9 \\ 0 & 0 & 0 & -5 & -5 & -15 \\ 0 & -1 & 0 & -5 & -9 & -22 \\ 0 & 2 & 0 & -6 & 2 & -4 \end{bmatrix}   (6-40)

Now we have completed the treatment of the first column, because we have a leading 1 in
the first row and only 0’s on the other entries in the column.

R2 ↔ R3 and (−1) · R2 :
\begin{bmatrix} 1 & 3 & 2 & 4 & 5 & 9 \\ 0 & 1 & 0 & 5 & 9 & 22 \\ 0 & 0 & 0 & -5 & -5 & -15 \\ 0 & 2 & 0 & -6 & 2 & -4 \end{bmatrix}   (6-41)

R1 − 3 · R2 and R4 − 2 · R2 :
\begin{bmatrix} 1 & 0 & 2 & -11 & -22 & -57 \\ 0 & 1 & 0 & 5 & 9 & 22 \\ 0 & 0 & 0 & -5 & -5 & -15 \\ 0 & 0 & 0 & -16 & -16 & -48 \end{bmatrix}   (6-42)
The work on the second column is now completed. Now a deviation from the standard
situation follows, where leading 1’s are established in the diagonal, because it is not possible
to produce a leading 1 as the third element in the third row. We are not allowed to swap

row 1 and row 3, because by doing so the first column would be changed in conflict with
the principle that the treatment of the first column is complete. This means that we have
also completed the treatment of the third column (the number 2 in the top row cannot be
removed). To continue the reduction we move on to the fourth element in row three, where
it is possible to provide a leading 1.

−(1/5) · R3 :

\begin{bmatrix} 1 & 0 & 2 & -11 & -22 & -57 \\ 0 & 1 & 0 & 5 & 9 & 22 \\ 0 & 0 & 0 & 1 & 1 & 3 \\ 0 & 0 & 0 & -16 & -16 & -48 \end{bmatrix}   (6-43)

R1 + 11 · R3 , R2 − 5 · R3 and R4 + 16 · R3 :
\begin{bmatrix} 1 & 0 & 2 & 0 & -11 & -24 \\ 0 & 1 & 0 & 0 & 4 & 7 \\ 0 & 0 & 0 & 1 & 1 & 3 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix}   (6-44)
Now the Gauss-Jordan elimination has ended and we can write the totally reduced system
of equations:
1x1 + 0x2 + 2x3 + 0x4 − 11x5 = −24
0x1 + 1x2 + 0x3 + 0x4 + 4x5 = 7
(6-45)
0x1 + 0x2 + 0x3 + 1x4 + 1x5 = 3
0x1 + 0x2 + 0x3 + 0x4 + 0x5 = 0

First, we note that the original system of equations has actually been reduced (made easier)
by the fact that many of the coefficients of the system are replaced by 0's. But
moreover the system with four equations can now be replaced by a system consisting of
only three equations. The last row is indeed a trivial equation that has the whole of L^5 as
its solution set. Therefore, the solution set of the system will not change if the last
equation is omitted in the reduced system (since the intersection of the solution sets of all
four equations equals that of the solution sets from the first three equations alone). Quite
simply, we can therefore write the totally reduced system of equations as:

x1 + 2x3 − 11x5 = −24


x2 + 4x5 = 7 (6-46)
x4 + x5 = 3
But how do we proceed from the reduced system of equations to writing down the solution
set in a comprehensible form? We shall return to this example later, see Example 6.30. Before
that we need to introduce the concept of rank.
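The reduction just carried out can be checked with software. The following is a small sketch in
Python using the SymPy library (the choice of Python/SymPy is ours and is not part of these
eNotes; any tool with an rref command can be used):

    from sympy import Matrix

    # The augmented matrix T from (6-39)
    T = Matrix([[1,  3, 2,  4,  5,  9],
                [2,  6, 4,  3,  5,  3],
                [3,  8, 6,  7,  6,  5],
                [4, 14, 8, 10, 22, 32]])

    R, pivot_columns = T.rref()   # reduced row echelon form and pivot columns
    print(R)                      # reproduces the matrix in (6-44)
    print(pivot_columns)          # (0, 1, 3): the leading 1's sit in columns 1, 2 and 4

The pivot columns reported by the software correspond exactly to the columns containing leading 1's.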

6.6 The Concept of Rank

In the example 6.20 a system of linear equations consisting of 4 equations with 5 un-
knowns has been totally reduced, see equation (6-46). Only three equations are left,
because the trivial equation 0x1 + 0x2 + 0x3 + 0x4 + 0x5 = 0 has been left out since it
only expresses the fact that 0 = 0. That the reduced system of equations contains a
trivial equation means that the reduced row echelon form of the augmented matrix
contains a 0-row, as in equation (6-44). This leads to the following definition.

Definition 6.21 Rank


By the rank ρ of a matrix we understand the number of rows that are not 0-rows, in
the reduced row echelon form of the matrix. The rank thereby corresponds to the
number of leading 1’s in the reduced row echelon form of the matrix.

From Definition 6.21 together with Corollary 6.17 and Corollary 6.18 we obtain:

Theorem 6.22 Rank and Row Operations


Two matrices that can be transformed into each other by row operations have the
same rank.

The rank gives the least possible number of non-trivial equations that a system
of equations can be transformed into using row operations. You can never
transform a system of linear equations through row operations in such a way
that it will contain fewer non-trivial equations than it does when it is totally
reduced. This is a consequence of theorem 6.22.

Example 6.23 The Rank of Matrices

A matrix M with 3 rows and 4 columns is brought into the reduced row echelon form as
follows:

M = \begin{bmatrix} 3 & 1 & 7 & -2 \\ -1 & -3 & 3 & 1 \\ 2 & 3 & 0 & -3 \end{bmatrix} \;\rightarrow\; \mathrm{rref}(M) = \begin{bmatrix} 1 & 0 & 3 & 0 \\ 0 & 1 & -2 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}   (6-47)

Since rref(M) does not contain 0-rows, ρ(M) = 3.

A matrix N with 5 rows and 3 columns is brought into reduced row echelon form like this:

N = \begin{bmatrix} 2 & 2 & 1 \\ -2 & -5 & -4 \\ 3 & 1 & -7 \\ 2 & -1 & -8 \\ 3 & 1 & -7 \end{bmatrix} \;\rightarrow\; \mathrm{rref}(N) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}   (6-48)

Since rref(N) contains three rows that are not 0-rows, ρ(N) = 3.

If we interpret M and N as augmented matrices for systems of linear equations, we see that for
both coefficient matrices the rank is 2, which is less than the ranks of the augmented matrices.
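The four ranks just mentioned can be checked with a few lines of code; a minimal sketch with
SymPy (assumed available, as above):

    from sympy import Matrix

    M = Matrix([[ 3,  1, 7, -2],
                [-1, -3, 3,  1],
                [ 2,  3, 0, -3]])
    N = Matrix([[ 2,  2,  1],
                [-2, -5, -4],
                [ 3,  1, -7],
                [ 2, -1, -8],
                [ 3,  1, -7]])

    print(M.rank(), N.rank())                 # 3 3
    print(M[:, :3].rank(), N[:, :2].rank())   # coefficient parts only: 2 2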

We now investigate the relationship between rank and the number of rows and columns.
First we notice that from Definition 6.21 it follows that the rank of a matrix can
never be larger than the number of matrix rows.

In Example 6.23 the rank of M equals the number of rows in M, while the rank of N is
less than the number of rows in N.

Analogously the rank of a matrix cannot be larger than the number of columns. The
rank is in fact equal to the number of leading 1's in the reduced row echelon form. And
if the echelon form of the matrix contained more leading 1's than there are columns, then
there would have to be at least one column containing more than one leading 1. But this contra-
dicts condition number 3 in Definition 6.13.

In the example 6.23 the rank of M is less than the number of columns in M, while the
rank of N equals the number of columns in N.

We summarize the above observations in the following theorem:

Theorem 6.24 Rank, Rows and Columns


For a matrix M with m rows and n columns we have that

ρ(M) ≤ min {m, n} . (6-49)



6.7 From Reduced Row Echelon Form to the Solution Set

Sometimes it is possible to write down the solution set for a system of linear equations
immediately when the corresponding augmented matrix is brought into its reduced
echelon form. This applies when the system has no solution or when the system has
exactly one solution. If the system has infinitely many solutions, work is needed in
order to be able to characterize the solution set. This can be achieved by writing the
solution in a standard parametric form. The concept of rank proves well suited to give
an instructive overview of the classes of solution sets.

6.7.1 When ρ(A) < ρ(T)

The augmented matrix T for a system of linear equations has the same number of rows
as the coefficient matrix A but one column more, which contains the right hand sides
of the equations. There are two possibilities. Either ρ(T) = ρ(A), or ρ(T) = ρ(A) +
1, corresponding to the fact that the last column in rref(T) contains a leading 1. The
consequence of the last possibility is investigated in Example 6.25.

Example 6.25 Inconsistent Equation (No Solution)

The augmented matrix for a system of linear equations consisting of three equations in two
unknowns is brought into reduced row echelon form

rref(T) = \begin{bmatrix} 1 & -2 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix}   (6-50)
The system is thereby reduced to
x1 − 2x2 = 0
0x1 + 0x2 = 1 (6-51)
0x1 + 0x2 = 0
Notice that the equation in the second row is inconsistent and thus has no solutions. Because
the solution set for the system is the intersection of the solution sets for all the equations, the
system has no solutions at all.

Let us look at the reduced row echelon form of the coefficient matrix

rref(A) = \begin{bmatrix} 1 & -2 \\ 0 & 0 \\ 0 & 0 \end{bmatrix}   (6-52)

We note that ρ(A) = 1. This is less than ρ(T) = 2, and this is exactly due to the inconsistency
of the equation in the reduced system of equations.
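The rank criterion ρ(A) < ρ(T) is also easy to test numerically. A minimal sketch with SymPy
(assumed available); since row operations preserve rank, we may feed it the reduced matrix from
(6-50) directly:

    from sympy import Matrix

    T = Matrix([[1, -2, 0],
                [0,  0, 1],
                [0,  0, 0]])     # the reduced augmented matrix from (6-50)
    A = T[:, :2]                 # the coefficient part (all columns but the last)

    if A.rank() < T.rank():
        print("rank(A) =", A.rank(), "< rank(T) =", T.rank(), ": no solutions")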

The considerations in example 6.25 can be generalized to the following theorem.

Theorem 6.26 When ρ(A) < ρ(T)


If a system of linear equations with coefficient matrix A and augmented matrix T
has
ρ (A) < ρ (T) , (6-53)
then the totally reduced system has an inconsistent equation. Therefore the system
has no solutions.

 
If rref(T) has a row of the form 0 0 · · · 0 1 , then the system has no solu-
tions.

Exercise 6.27

Determine the reduced row echelon form of the augmented matrix for the following system
of linear equations, and determine the solution set for the system.

x1 + x2 + 2x3 + x4 = 1
(6-54)
−2x1 − 2x2 − 4x3 − 2x4 = 3

6.7.2 When ρ(A) = ρ(T) = Number of Unknowns

Let n denote the number of unknowns in a given system of linear equations. Then by
the way the coefficient matrices are formed there must be n columns in A.

Further we assume that ρ(A) = n. Then rref(A) contains exactly n leading 1’s. There-
fore the leading 1’s must be placed in the diagonal in rref(A), and all other elements of
rref(A) are zero.

Finally we assume that in the given example ρ(A) = ρ(T). Then the solution set can
be read directly from rref(T). The first row in rref(T) will correspond to an equation
where the first unknown has the coefficient 1 while all the other unknowns have the co-
efficient 0. Therefore the value of the first unknown is equal to the last element in the
first row (the right hand side). Similarly with the other rows, row number i corresponds
to an equation where unknown number i is the only unknown, and therefore its value
is equal to the last element in row number i. Since each unknown thus corresponds
to exactly one value, and since ρ(A) = ρ(T) we are certain that there is no inconsis-
tent equation in the given system of equations. Thus the given system of equations has
exactly one solution.

Example 6.28 Exactly One Solution

The augmented matrix for a system of linear equations consisting of three equations in two
unknowns has been brought onto the reduced row echelon form

rref(T) = \begin{bmatrix} 1 & 0 & -3 \\ 0 & 1 & 5 \\ 0 & 0 & 0 \end{bmatrix}   (6-55)

Consider the reduced row echelon form of the coefficient matrix for the system
 
rref(A) = \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{bmatrix}   (6-56)

This has a leading 1 in each column and 0 in all other entries. We further note that ρ(A) =
ρ(T) = 2.

From rref(T) we can write the totally reduced system of equations as

1x1 + 0x2 = −3
0x1 + 1x2 = 5 (6-57)
0x1 + 0x2 = 0

which shows that this system of equations has exactly one solution x = ( x1 , x2 ) = (−3, 5).

The argument given just before the example proves the following theorem:

Theorem 6.29 When ρ(A) = ρ(T) = Number of Unknowns


If a linear system with coefficient matrix A and augmented matrix T has:

ρ(A) = ρ(T) = number of unknowns, (6-58)

then the system has exactly one solution, and this can be immediately read from
rref(T).

6.7.3 When ρ(A) = ρ(T) < the Number of Unknowns

We are now ready to resume the discussion of our main example 6.20, a system of linear
equations with 5 unknowns, for which we found the totally reduced system of equations
consisting of 3 non-trivial equations. Let us now find the solution set and investigate its
properties!

Example 6.30 Infinitely Many Solutions

In the example 6.20 the augmented matrix T for a system of linear equations with 4 equations
in 5 unknowns was reduced to
rref(T) = \begin{bmatrix} 1 & 0 & 2 & 0 & -11 & -24 \\ 0 & 1 & 0 & 0 & 4 & 7 \\ 0 & 0 & 0 & 1 & 1 & 3 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix}   (6-59)
We see that ρ(A) = ρ(T) = 3, i.e. less than 5, the number of unknowns.

From rref(T) we can write the totally reduced system of equations


x1 + 2x3 − 11x5 = −24
x2 + 4x5 = 7 (6-60)
x4 + x5 = 3
The system has infinitely many solutions. For every choice of values for x3 and x5 we can
find exactly one new value for the other unknowns x1 , x2 and x4 . This can be made more
clear by isolating x1 , x2 and x4 in the following way
x1 = −24 − 2x3 + 11x5
x2 = 7 − 4x5 (6-61)
x4 = 3 − x5

If we, for example, choose x3 = 1 and x5 = 2, we find the solution x = ( x1 , x2 , x3 , x4 , x5 ) =


(−4, −1, 1, 1, 2). More generally, any choice of values for x3 and x5 will, in the same way,
produce a solution, whilst the other three variables are uniquely determined by the choice.
Therefore we can consider x3 and x5 as free parameters that determine the value of the three
other unknowns, and on the right hand side we therefore rename x3 and x5 with the parameter
names t1 and t2 , respectively. Then we can write the solution set as:

x1 = −24 − 2t1 + 11t2


x2 = 7 − 4t2
x3 = t1 (6-62)
x4 = 3 − t2
x5 = t2

or more clearly in the standard parameter form:


       
x = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \end{bmatrix} = \begin{bmatrix} -24 \\ 7 \\ 0 \\ 3 \\ 0 \end{bmatrix} + t_1 \begin{bmatrix} -2 \\ 0 \\ 1 \\ 0 \\ 0 \end{bmatrix} + t_2 \begin{bmatrix} 11 \\ -4 \\ 0 \\ -1 \\ 1 \end{bmatrix} \quad \text{where } t_1, t_2 \in L.   (6-63)

With geometry-inspired wording we term the vector (−24, 7, 0, 3, 0) the initial point of the
solution set and the two vectors (−2, 0, 1, 0, 0) and (11, −4, 0, −1, 1) its directional vectors.
Letting x0 , v1 and v2 denote the initial point, and the directional vectors, respectively, we can
write the parametric representation in this way:

x = x0 + t1 v1 + t2 v2 where t1 , t2 ∈ L. (6-64)

Since the solution set has two free parameters corresponding to two directional vectors, we
say that it has a double infinity of solutions.

Line 3 and 5 in (6-63) only express that x3 = t1 and x5 = t2 .
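The parametric solution can also be produced directly from the original system with software.
A small sketch with SymPy's linsolve (an assumption on our part; it is not part of these eNotes).
Note that SymPy uses the free unknowns x3 and x5 themselves as parameters where we introduced
t1 and t2:

    from sympy import Matrix, linsolve, symbols

    x1, x2, x3, x4, x5 = symbols('x1 x2 x3 x4 x5')
    T = Matrix([[1,  3, 2,  4,  5,  9],
                [2,  6, 4,  3,  5,  3],
                [3,  8, 6,  7,  6,  5],
                [4, 14, 8, 10, 22, 32]])   # the augmented matrix (6-39)

    sol = linsolve(T, x1, x2, x3, x4, x5)
    print(sol)   # matches (6-61)/(6-62) with x3 and x5 as the free parameters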

Let us, inspired by Example 6.30, formulate a general method for bringing the solution
set into standard parametric form from the totally reduced system of equations:

Method 6.31 From the Augmented Matrix to the Solution in Standard Parameter Form
We consider a system of linear equations with n unknowns with the coefficient ma-
trix A and the augmented matrix T. In addition we assume

ρ(A) = ρ(T) = k < n. (6-65)

The solution set of the system is brought into standard parametric form in this way:

1. We find rref(T) and from this we write the totally reduced system of equations
(as is done in (6-60)).

2. In each of the k non-trivial equations in the totally reduced system of equations
we isolate its first unknown on the left hand side (as is done in (6-61)).

3. In this way we have isolated k different unknowns on the left hand side of the
total system. The other (n − k) unknowns, which are placed on the right hand
side, are renamed with the parameter names t1 , t2 , . . . , tn−k .

4. We can now write the solution set in standard parametric form:

x = ( x1 , x2 , . . . , xn ) = x0 + t1 v1 + t2 v2 + · · · + tn−k vn−k , (6-66)

where the vector x0 denotes the initial point of the parameter representation,
while v1 , v2 , . . . , vn−k are its directional vectors (as is done in (6-63)).

Notice that the numbers t1 , t2 , . . . , tn−k can be chosen freely. Regardless of the choice
equation (6-66) will be a valid solution. Therefore they are called free parameters.

If the algorithm of the Gauss-Jordan elimination has been followed perfectly,


one arrives at a certain initial point and a certain set of directional vectors for
the solution set, see equation (6-66). But the solution set can be written with
another choice for the initial point (if the system is inhomogeneous), and with
a different choice of directional vectors. However, the number of directional
vectors will always be (n − k ).

Solution sets in which some of the unknowns have definite values are possible. In the
following example the free parameter only influences one of the unknowns. The other
two are locked:

Example 6.32 Infinitely Many Solutions with a Free Parameter

For a given system of linear equations it is found that


 
rref(T) = \begin{bmatrix} 1 & 0 & 0 & -2 \\ 0 & 0 & 1 & 5 \end{bmatrix}   (6-67)

We see that ρ(A) = ρ(T) = 2 < n = 3. Accordingly we have one free parameter. We write
the solution set as:

x = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} -2 \\ 0 \\ 5 \end{bmatrix} + t \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}   (6-68)
where t is a scalar that can be chosen freely.

In general you can prove the following theorem:

Theorem 6.33 When ρ(A) = ρ(T) < Number of Unknowns


If a system of linear equations with n unknowns and with the coefficient matrix A
and augmented matrix T has

ρ (A) = ρ (T) = k < n (6-69)

Then the system has infinitely many solutions that can be written in standard pa-
rameter form with an initial point and (n − k ) directional vectors.

6.8 On the Number of Solutions

Let us consider a system of three linear equations in two unknowns:

a1 · x + b1 · y = c1
a2 · x + b2 · y = c2 (6-70)
a3 · x + b3 · y = c3

We have previously emphasized that the solution set for a system of equations is the
intersection of the solution sets for each of the equations in the system. Let us now in-
terpret the given system of equations as equations for three straight lines in a coordinate

system in the plane. Then the solution set corresponds to a set of points that are common
to all the three lines. In order to answer the question about “number” of solutions we
draw the different situations in Figure 6.1.

Figure 6.1: The six possible structures of the solutions for three linear equations in two unknowns.

In situation 2 two of the lines are parallel, and in situation 3 all three lines are parallel.
Therefore there are no points that are part of all three lines in the situations 1, 2 and 3.
In situation 5 two of the lines are identical
(the blue and the red line coincide in the purple line). Hence there is exactly one com-
mon point in the situations 4 and 5. In the situation 6 all the three lines coincide (giving
the black line). Therefore in this situation there are infinitely many common points.

The example with three equations in two unknowns illustrates the following theorem
which follows from our study of the solution sets in the previous section, see the theo-
rems 6.26, 6.29 and 6.33:

Theorem 6.34 Remark about the Number of Solutions


A system of linear equations either has no, exactly one, or infinitely many solutions.
There are no other possibilities.

6.9 The Linear Structure of the Solution set

In this section we dig a little deeper into the question about the structure of the solution
set for systems of linear equations. It is particularly important to observe the corre-
spondence between the solution set for an inhomogeneous system of equations and the
corresponding homogeneous system of equations. We start by investigating the homogeneous
system.

6.9.1 The Properties of Homogeneous Systems of Equations

A homogeneous system of m linear equations in n unknowns is written in the form
a11 · x1 + a12 · x2 + . . . + a1n · xn = 0
a21 · x1 + a22 · x2 + . . . + a2n · xn = 0
.. (6-71)
.
am1 · x1 + am2 · x2 + . . . + amn · xn = 0

In the following theorem we describe an important property of the structure of the so-
lution set for homogeneous systems of linear equations.

Theorem 6.35 Solutions to a Homogeneous System of Linear Equations

Let Lhom denote the solution set of a homogeneous system of linear equations. Then
there exists at least one solution to the system, namely the zero or trivial solution. If

x = ( x1 , x2 , . . . x n ) and y = ( y1 , y2 , . . . y n ) (6-72)

are two arbitrary solutions, and k is an arbitrary scalar then both the sum

x + y = ( x1 + y1 , x2 + y2 , . . . x n + y n ) (6-73)

and the product


k · x = ( k · x1 , k · x2 , . . . k · x n ) (6-74)
belong to Lhom .

Proof

An obvious property of the system (6-71) is that ρ(A) = ρ(T) (because the right hand side
consists of only zeros). Therefore the system has at least one solution – this follows from
Theorem 6.29 (or from Theorem 6.33 when the rank is less than the number of unknowns). We can
also immediately point out a solution, viz. the zero vector 0 ∈ L^n: if we replace all the
unknowns in the system by the number 0, the system consists of m equations of the form 0 = 0.

Apart from this the theorem comprises two parts that are proved separately:

1. If
ai1 x1 + ai2 x2 + · · · + ain xn = 0 for every i = 1, 2, . . . , m (6-75)
and
ai1 y1 + ai2 y2 + · · · + ain yn = 0 for every i = 1, 2, . . . , m (6-76)
then by adding the two equations and subsequently factoring out with respect to the
coefficients we get

ai1 ( x1 + y1 ) + ai2 ( x2 + y2 ) + · · · + ain ( xn + yn ) = 0 for every i = 1, 2, . . . , m (6-77)

which shows that x + y is a solution.

2. If
ai1 x1 + ai2 x2 + · · · + ain xn = 0 for every i = 1, 2, . . . , m (6-78)
and k is an arbitrary scalar, then by multiplying both sides of the equation by k and
subsequently factoring out with respect to the coefficients we get

ai1 (k · x1 ) + ai2 (k · x2 ) + · · · + ain (k · xn ) = 0 for every i = 1, 2, . . . , m (6-79)

which shows that k · x is a solution.

Remark 6.36
If you take an arbitrary number of solutions from Lhom , multiply these by arbitrary
constants and add the products then the so-called linear combination of solutions
also is a solution. This is a consequence of theorem 6.35.

6.9.2 Structural Theorem

We will now consider a decisive relation between an inhomogeneous system of linear


equations of the form

a11 · x1 + a12 · x2 + . . . + a1n · xn = b1


a21 · x1 + a22 · x2 + . . . + a2n · xn = b2
.. (6-80)
.
am1 · x1 + am2 · x2 + . . . + amn · xn = bm

and the corresponding homogeneous system of linear equations, by which we mean the equa-
tions (6-80) after all the right hand sides bi have been replaced by 0. The solution set
for the inhomogeneous system of equations is called Linhom and the solution set for the
corresponding homogeneous system of equations is called Lhom .

Theorem 6.37 Structural Theorem


If you have found just one solution (a so-called particular solution) x0 to an inho-
mogeneous system of linear equations, then Linhom can be found as the sum of x0 and
Lhom .

In other words

Linhom = { x = x0 + y | y ∈ Lhom } .   (6-81)
or in short
Linhom = x0 + Lhom . (6-82)

Proof

Note that the theorem contains two propositions. One is that the sum of x0 and an arbitrary
vector from Lhom belongs to Linhom . The other is that an arbitrary vector from Linhom can be
written as the sum of x0 and a vector from Lhom . We prove the two propositions separately:

1. Assume y ∈ Lhom . We want to show that

x = x0 + y = ( x01 + y1 , x02 + y2 , . . . , x0n + yn ) ∈ Linhom . (6-83)



Since
ai1 x01 + ai2 x02 + · · · + ain x0n = bi for any i = 1, 2, . . . , m (6-84)
and
ai1 y1 + ai2 y2 + · · · + ain yn = 0 for any i = 1, 2, . . . , m (6-85)
then by adding the two equations and subsequently factoring out with respect to the
coefficients we get

ai1 ( x01 + y1 ) + · · · + ain ( x0n + yn ) = bi for any i = 1, 2, . . . , m (6-86)

which proves the proposition.

2. Assume x ∈ Linhom . We want to show that a vector y ∈ Lhom exists that fulfills

x = x0 + y. (6-87)

Since both x and x0 belong to Linhom we have that

ai1 x1 + ai2 x2 + · · · + ain xn = bi for any i = 1, 2, . . . , m (6-88)

and
ai1 x01 + ai2 x02 + · · · + ain x0n = bi for any i = 1, 2, . . . , m (6-89)
When we subtract the lower equation from the upper, we get after factorization

ai1 ( x1 − x01 ) + · · · + ain ( xn − x0n ) = 0 for any i = 1, 2, . . . , m (6-90)

which shows that the vector y defined by y = x − x0 , belongs to Lhom and satisfies:
x = x0 + y.
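As a small numerical illustration of the Structural Theorem (a sketch in Python with NumPy,
which we assume available; it is not part of these eNotes), take the system from Example 6.20:
the initial point x0 from (6-63) is a particular solution, and adding any linear combination of
the directional vectors – which solve the homogeneous system – gives another solution:

    import numpy as np

    A = np.array([[1,  3, 2,  4,  5],
                  [2,  6, 4,  3,  5],
                  [3,  8, 6,  7,  6],
                  [4, 14, 8, 10, 22]], dtype=float)
    b = np.array([9, 3, 5, 32], dtype=float)

    x0 = np.array([-24, 7, 0, 3, 0], dtype=float)   # particular solution
    v1 = np.array([-2, 0, 1, 0, 0], dtype=float)    # homogeneous solutions
    v2 = np.array([11, -4, 0, -1, 1], dtype=float)

    print(np.allclose(A @ x0, b))                   # True
    print(np.allclose(A @ (3*v1 - 2*v2), 0))        # True: solves the homogeneous system
    print(np.allclose(A @ (x0 + 3*v1 - 2*v2), b))   # True: still solves the inhomogeneous system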



eNote 7

Matrices and Matrix Algebra

This eNote introduces matrices and arithmetic operations for matrices and deduces the relevant
arithmetic rules. Math knowledge comparable to that of a Danish gymnasium (high school)
graduate is the only requirement for benefitting from this note, but it is a good idea to acquaint
oneself with the number space Rn that is described in eNote 5 The Number Spaces.

(Updated: 24.09.2021 David Brander)

A matrix is an array of numbers. Here is an example of a matrix called M:


 
M = \begin{bmatrix} 1 & 4 & 3 \\ -1 & 2 & 7 \end{bmatrix}   (7-1)

A matrix is characterized by the number of rows and columns, and the matrix M above
is therefore called a 2 × 3 matrix. The matrix M is said to contain 2 · 3 = 6 elements. In
addition to rows and columns a number of further concepts are connected. In order to
describe these we write a general matrix, here called A, as:
 
A = \begin{bmatrix} a_{11} & a_{12} & \dots & a_{1n} \\ a_{21} & a_{22} & \dots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \dots & a_{mn} \end{bmatrix}   (7-2)

The matrix A has m rows and n columns, and this can be indicated by writing Am×n or the
m × n matrix A. The matrix A is also said to be of the type m × n.

Two m × n matrices A and B are called equal if all corresponding elements in the two matrices are equal,

and we then write A = B .

A matrix with a single column (n = 1) is called a column matrix. Similarly, a matrix with
only one row (m = 1) is called a row matrix.

A matrix with the same number of rows and columns (m = n) is called a square matrix.
Square matrices are investigated in depth in eNote 8 Square Matrices.

If all the elements in an m × n-matrix are real numbers, the matrix is called a real matrix.
The set of these matrices is denoted Rm×n .

7.1 Matrix Sum and the Product of a Matrix by a Scalar

It is possible to add two matrices if they are of the same type. You then add the elements
with the same row and column numbers and in this way form a new matrix of the same
type. Similarly you can multiply any matrix by a scalar (a number), this is done by
multiplying all the elements by the scalar. The matrix in which all elements are equal to
0 is called the zero matrix regardless of the type, and is denoted 0 or possibly 0m×n . In
these notes, all other matrices are called proper matrices.

Definition 7.1 Matrix Sum and Multiplication by a Scalar


Given a scalar k ∈ R and two real matrices Am×n and Bm×n :
   
A = \begin{bmatrix} a_{11} & a_{12} & \dots & a_{1n} \\ a_{21} & a_{22} & \dots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \dots & a_{mn} \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} b_{11} & b_{12} & \dots & b_{1n} \\ b_{21} & b_{22} & \dots & b_{2n} \\ \vdots & \vdots & & \vdots \\ b_{m1} & b_{m2} & \dots & b_{mn} \end{bmatrix}   (7-3)

The sum of the matrices is defined as:


 
A + B = \begin{bmatrix} a_{11}+b_{11} & a_{12}+b_{12} & \dots & a_{1n}+b_{1n} \\ a_{21}+b_{21} & a_{22}+b_{22} & \dots & a_{2n}+b_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1}+b_{m1} & a_{m2}+b_{m2} & \dots & a_{mn}+b_{mn} \end{bmatrix}   (7-4)

The sum is only defined when the matrices are of the same type.

The product of the matrix A by the scalar k is written kA or Ak and is defined as:
 
kA = Ak = \begin{bmatrix} k \cdot a_{11} & k \cdot a_{12} & \dots & k \cdot a_{1n} \\ k \cdot a_{21} & k \cdot a_{22} & \dots & k \cdot a_{2n} \\ \vdots & \vdots & & \vdots \\ k \cdot a_{m1} & k \cdot a_{m2} & \dots & k \cdot a_{mn} \end{bmatrix}   (7-5)

The opposite matrix −A (additive inverse) to a matrix A is defined by the matrix that
results when all the elements in A are multiplied by −1 . It is seen that −A = (−1)A .

Example 7.2 Simple Matrix Operations

Define two matrices A and B by:

A = \begin{bmatrix} 4 & -1 \\ 8 & 0 \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} -4 & 3 \\ 9 & 12 \end{bmatrix}   (7-6)

The matrices are both of the type 2 × 2. We wish to determine a third and fourth matrix

C = 4A and D = 2A + B. This can be done through the use of Definition 7.1.
     
C = 4A = 4 \cdot \begin{bmatrix} 4 & -1 \\ 8 & 0 \end{bmatrix} = \begin{bmatrix} 4 \cdot 4 & 4 \cdot (-1) \\ 4 \cdot 8 & 4 \cdot 0 \end{bmatrix} = \begin{bmatrix} 16 & -4 \\ 32 & 0 \end{bmatrix}
D = 2A + B = \begin{bmatrix} 8 & -2 \\ 16 & 0 \end{bmatrix} + \begin{bmatrix} -4 & 3 \\ 9 & 12 \end{bmatrix} = \begin{bmatrix} 4 & 1 \\ 25 & 12 \end{bmatrix}   (7-7)
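The computations in this example are easy to reproduce numerically; a small sketch with NumPy
(assumed available, not part of these eNotes):

    import numpy as np

    A = np.array([[4, -1],
                  [8,  0]])
    B = np.array([[-4,  3],
                  [ 9, 12]])

    C = 4 * A        # multiplication by a scalar, cf. Definition 7.1
    D = 2 * A + B    # matrix sum
    print(C)         # [[16 -4] [32  0]]
    print(D)         # [[ 4  1] [25 12]]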

In the following theorem we summarize the arithmetic rules that are valid for sums of
matrices and multiplication by a scalar.

Theorem 7.3 Arithmetic Rules for the Matrix Sum and the Product by a
Scalar
For arbitrary matrices A, B and C in Rm×n and likewise arbitrary real numbers k1
and k2 the following arithmetic rules are valid:
1. A+B = B+A Addition is commutative
2. (A + B) + C = A + (B + C) Addition is associative
3. A+0 = A 0 is a neutral matrix for addition in Rm×n
4. A + (−A) = 0 Every matrix in Rm×n has an opposite matrix
5. k 1 ( k 2 A) = ( k 1 k 2 )A Product of a matrix by scalars is associative

6. ( k 1 + k 2 )A = k 1 A + k 2 A
The distributive rules are valid
7. k 1 (A + B) = k 1 A + k 1 B
8. 1A = A The scalar 1 is neutral in the product by a matrix

The arithmetic rules in Theorem 7.3 can be proved by applying the ordinary arithmetic
rules for real numbers. The method is demonstrated for two of the rules in the following
example.

Example 7.4 Demonstration of Arithmetic Rule

Given the two matrices


   
A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{bmatrix}   (7-8)

plus the constants k1 and k2 . We now try by way of example to show the distributive rules in

Theorem 7.3. First we have:


   
(k_1 + k_2)A = (k_1 + k_2) \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} = \begin{bmatrix} (k_1+k_2)a_{11} & (k_1+k_2)a_{12} \\ (k_1+k_2)a_{21} & (k_1+k_2)a_{22} \end{bmatrix}
k_1 A + k_2 A = \begin{bmatrix} k_1 a_{11} & k_1 a_{12} \\ k_1 a_{21} & k_1 a_{22} \end{bmatrix} + \begin{bmatrix} k_2 a_{11} & k_2 a_{12} \\ k_2 a_{21} & k_2 a_{22} \end{bmatrix} = \begin{bmatrix} k_1 a_{11} + k_2 a_{11} & k_1 a_{12} + k_2 a_{12} \\ k_1 a_{21} + k_2 a_{21} & k_1 a_{22} + k_2 a_{22} \end{bmatrix}   (7-9)

If you take a11 , a12 , a21 and a22 outside the parentheses in each of the elements in the last ex-
pression, it is seen that (k1 + k2 )A = k1 A + k2 A in this case. The operation of taking the
a-elements outside the parentheses is exactly equivalent to using the distributive rule for
the real numbers.

The second distributive rule is demonstrated for given matrices and constants:
   
k_1 (A + B) = k_1 \begin{bmatrix} a_{11}+b_{11} & a_{12}+b_{12} \\ a_{21}+b_{21} & a_{22}+b_{22} \end{bmatrix} = \begin{bmatrix} k_1(a_{11}+b_{11}) & k_1(a_{12}+b_{12}) \\ k_1(a_{21}+b_{21}) & k_1(a_{22}+b_{22}) \end{bmatrix}
k_1 A + k_1 B = \begin{bmatrix} k_1 a_{11} & k_1 a_{12} \\ k_1 a_{21} & k_1 a_{22} \end{bmatrix} + \begin{bmatrix} k_1 b_{11} & k_1 b_{12} \\ k_1 b_{21} & k_1 b_{22} \end{bmatrix} = \begin{bmatrix} k_1 a_{11} + k_1 b_{11} & k_1 a_{12} + k_1 b_{12} \\ k_1 a_{21} + k_1 b_{21} & k_1 a_{22} + k_1 b_{22} \end{bmatrix}   (7-10)

If k1 is taken outside of the parenthesis in each of the elements in the matrix in the last ex-
pression it is seen that the second distributive rule also is valid in this case: k1 (A + B) =
k1 A + k1 B. The distributive rule for real numbers is again used for each element.

Note that the zero matrix in Rm×n is the only matrix Rm×n that is neutral with
respect to addition, and that −A is the only solution to the equation A + X = 0.

Definition 7.5 Difference Between Matrices


The difference A − B between two matrices A and B of the same type is introduced
by:
A − B = A + (−B). (7-11)
In other words B is subtracted from A by subtracting each element in B from the
corresponding element in A.

Example 7.6 Simple Matrix Operation with Difference

With the matrices given in Example 7.2 we get


     
D = 2A - B = 2A + (-1)B = \begin{bmatrix} 8 & -2 \\ 16 & 0 \end{bmatrix} + \begin{bmatrix} 4 & -3 \\ -9 & -12 \end{bmatrix} = \begin{bmatrix} 12 & -5 \\ 7 & -12 \end{bmatrix}   (7-12)

7.2 Matrix-Vector Products and Matrix-Matrix Products

In this subsection we describe the multiplication of a matrix with a vector and then the
multiplication of matrix by another matrix.

A vector v = (v1 , v2 , . . . , vn ) can be written as a column matrix and is then called a


column vector:

v = (v_1, v_2, \dots, v_n) = \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{bmatrix}   (7-13)
Using this concept you can divide a matrix Am×n into its column vectors. This is written
like this:
 
A = \begin{bmatrix} a_1 & a_2 & \dots & a_n \end{bmatrix} = \left[\; \begin{bmatrix} a_{11} \\ a_{21} \\ \vdots \\ a_{m1} \end{bmatrix} \begin{bmatrix} a_{12} \\ a_{22} \\ \vdots \\ a_{m2} \end{bmatrix} \cdots \begin{bmatrix} a_{1n} \\ a_{2n} \\ \vdots \\ a_{mn} \end{bmatrix} \;\right] = \begin{bmatrix} a_{11} & a_{12} & \dots & a_{1n} \\ a_{21} & a_{22} & \dots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \dots & a_{mn} \end{bmatrix}   (7-14)
Accordingly there are n column vectors with m elements each.

Notice that the square brackets around the column vectors can be removed just
like that! This can be done in all dealings with matrices, where double square
brackets occur. It is always the innermost brackets that are removed. In this
way there is no difference between the two expressions. The last expression is
always preferred, because it is the easier to read.

We now define the product of a matrix and a vector, in which the matrix has as many
columns as the vector has elements:

Definition 7.7 Matrix-Vector Product


Let A be an arbitrary matrix in Rm×n , and let v be an arbitrary vector in Rn .

The matrix-vector product of A with v is defined as:


 
Av = \begin{bmatrix} a_1 & a_2 & \dots & a_n \end{bmatrix} \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{bmatrix} = v_1 a_1 + v_2 a_2 + \dots + v_n a_n   (7-15)

The result is a column vector with m elements. It is the sum of the products of
the k’th column in the matrix and the k’th element in the column vector for all k =
1, 2, . . . , n.

It is necessary that there are as many columns in the matrix as there are rows in the
column vector, here n.

Notice the order in the matrix-vector product: first matrix, then vector! It is not
a vector-matrix product so to speak. The number of rows and columns will not
match in the other configuration unless the matrix is of the type 1 × 1.

Example 7.8 Matrix-Vector Product

The following matrix and vector (a column vector) are given:

A = A_{2 \times 3} = \begin{bmatrix} a & b & c \\ d & e & f \end{bmatrix} \quad \text{and} \quad v = \begin{bmatrix} 3 \\ 4 \\ -1 \end{bmatrix}.   (7-16)

We now form the matrix-vector product of A with v by use of Definition 7.7:

Av = \begin{bmatrix} a & b & c \\ d & e & f \end{bmatrix} \begin{bmatrix} 3 \\ 4 \\ -1 \end{bmatrix} = 3 \begin{bmatrix} a \\ d \end{bmatrix} + 4 \begin{bmatrix} b \\ e \end{bmatrix} + (-1) \begin{bmatrix} c \\ f \end{bmatrix} = \begin{bmatrix} 3a + 4b - c \\ 3d + 4e - f \end{bmatrix}   (7-17)

If A is given like this

A = \begin{bmatrix} -1 & 2 & 6 \\ 2 & 1 & 4 \end{bmatrix},   (7-18)

you get the product

Av = \begin{bmatrix} 3 \cdot (-1) + 4 \cdot 2 - 6 \\ 3 \cdot 2 + 4 \cdot 1 - 4 \end{bmatrix} = \begin{bmatrix} -1 \\ 6 \end{bmatrix}   (7-19)

It is seen that the result (in both cases) is a column vector with as many rows as there are
rows in A.
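The last product can be reproduced numerically; a minimal sketch with NumPy (assumed available):

    import numpy as np

    A = np.array([[-1, 2, 6],
                  [ 2, 1, 4]])
    v = np.array([3, 4, -1])

    print(A @ v)   # [-1  6], a vector with as many entries as A has rows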

Exercise 7.9 Matrix-Vector Product

Form the matrix-vector product of A with x in the equation Ax = b, when it is given that

A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}, \quad x = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} \quad \text{and} \quad b = \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix}   (7-20)

Is this something you have seen before? From where does it come?

As we have remarked above a matrix can be viewed as a number of column vectors


aligned after one another. This is used in the following definition of a matrix-matrix
product as a series of matrix-vector products.

Definition 7.10 Matrix-Matrix Product


Let A be an arbitrary matrix in Rm×n , and let B be an arbitrary matrix in Rn× p .

The matrix-matrix product or just the matrix product of A and B is defined like this:
   
AB = A \begin{bmatrix} b_1 & b_2 & \dots & b_p \end{bmatrix} = \begin{bmatrix} Ab_1 & Ab_2 & \dots & Ab_p \end{bmatrix}   (7-21)

The result is a matrix of type m × p. The k'th column in the resulting matrix is a
matrix-vector product of the first matrix (here A) and the k’th column vector in the
last matrix (here B), cf. definition 7.7.

There must be as many columns in the first matrix as there are rows in the last
matrix.

Example 7.11 Matrix-Matrix Product

Given two matrices A2×2 and B2×3 :


   
A = \begin{bmatrix} 4 & 5 \\ 1 & 2 \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} -8 & 3 & 3 \\ 2 & 9 & -9 \end{bmatrix}   (7-22)

We wish to form the matrix-matrix product of A and B. This is done by use of Definition 7.10.

AB = \begin{bmatrix} 4 \cdot (-8) + 5 \cdot 2 & 4 \cdot 3 + 5 \cdot 9 & 4 \cdot 3 + 5 \cdot (-9) \\ -8 + 2 \cdot 2 & 3 + 2 \cdot 9 & 3 + 2 \cdot (-9) \end{bmatrix} = \begin{bmatrix} -22 & 57 & -33 \\ -4 & 21 & -15 \end{bmatrix}   (7-23)

NB: It is not possible to form the matrix-matrix product BA, because there are not as many
columns in B as there are rows in A (3 ≠ 2).
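The same observation can be made with software; a small sketch with NumPy (assumed available):

    import numpy as np

    A = np.array([[4, 5],
                  [1, 2]])
    B = np.array([[-8, 3,  3],
                  [ 2, 9, -9]])

    print(A @ B)     # [[-22  57 -33] [ -4  21 -15]], a 2 x 3 matrix
    # B @ A is rejected with an error, since the shapes (2,3) and (2,2) do not match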

Example 7.12 Matrix-Matrix Product two Ways

Given the two matrices A2×2 and B2×2 :


   
A = \begin{bmatrix} 3 & 2 \\ -5 & 1 \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} 4 & 4 \\ -1 & 0 \end{bmatrix}   (7-24)

Because the two matrices are square matrices of the same type both matrix-matrix products
AB and BA can be calculated. We use Definition 7.10.

AB = \begin{bmatrix} 3 & 2 \\ -5 & 1 \end{bmatrix} \begin{bmatrix} 4 & 4 \\ -1 & 0 \end{bmatrix} = \begin{bmatrix} 3 \cdot 4 + 2 \cdot (-1) & 3 \cdot 4 + 2 \cdot 0 \\ -5 \cdot 4 + 1 \cdot (-1) & -5 \cdot 4 + 1 \cdot 0 \end{bmatrix} = \begin{bmatrix} 10 & 12 \\ -21 & -20 \end{bmatrix}
BA = \begin{bmatrix} 4 & 4 \\ -1 & 0 \end{bmatrix} \begin{bmatrix} 3 & 2 \\ -5 & 1 \end{bmatrix} = \begin{bmatrix} 4 \cdot 3 + 4 \cdot (-5) & 4 \cdot 2 + 4 \cdot 1 \\ -1 \cdot 3 & -1 \cdot 2 \end{bmatrix} = \begin{bmatrix} -8 & 12 \\ -3 & -2 \end{bmatrix}   (7-25)

We see that AB ≠ BA. The factors are not interchangeable!

Here we summarize the arithmetic rules that apply to matrix-matrix products and ma-
trix sums. Because the matrix-vector product is a special case of the matrix-matrix prod-
uct, the rules also apply for matrix-vector products.

Theorem 7.13 Arithmetic Rules for Matrix Sum and Product


For arbitrary matrices A, B and C and likewise an arbitrary real number k the follow-
ing arithmetic rules are valid, in so far as the matrix-matrix products can be formed:
(kA)B = A(kB) = k(AB) Product with a scalar is associative

A(B + C) = AB + AC
the distributive rules apply
(A + B)C = AC + BC
A(BC) = (AB)C Matrix-matrix products are associative

Analogous to the demonstration of the arithmetic rules in Theorem 7.3 we demonstrate


the last arithmetic rule in Theorem 7.13:

Example 7.14 Are Matrix Products Associative?

The last arithmetic rule in Theorem 7.13 is tested on the three matrices:

A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}, \quad B = \begin{bmatrix} -3 & -2 & -1 \\ 0 & 0 & 7 \end{bmatrix} \quad \text{and} \quad C = \begin{bmatrix} 4 & -5 \\ 2 & 1 \\ 1 & -3 \end{bmatrix}   (7-26)

First we calculate AB and BC:

AB = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} \begin{bmatrix} -3 & -2 & -1 \\ 0 & 0 & 7 \end{bmatrix} = \begin{bmatrix} -3 & -2 & 13 \\ -9 & -6 & 25 \end{bmatrix}
BC = \begin{bmatrix} -3 & -2 & -1 \\ 0 & 0 & 7 \end{bmatrix} \begin{bmatrix} 4 & -5 \\ 2 & 1 \\ 1 & -3 \end{bmatrix} = \begin{bmatrix} -17 & 16 \\ 7 & -21 \end{bmatrix}   (7-27)

Then we calculate A(BC) and (AB)C:

A(BC) = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} \begin{bmatrix} -17 & 16 \\ 7 & -21 \end{bmatrix} = \begin{bmatrix} -3 & -26 \\ -23 & -36 \end{bmatrix}
(AB)C = \begin{bmatrix} -3 & -2 & 13 \\ -9 & -6 & 25 \end{bmatrix} \begin{bmatrix} 4 & -5 \\ 2 & 1 \\ 1 & -3 \end{bmatrix} = \begin{bmatrix} -3 & -26 \\ -23 & -36 \end{bmatrix}   (7-28)

We see that A(BC) = (AB)C, and therefore it doesn’t matter which of the matrix products
AB and BC we calculate first. This is valid for all matrices (although not proved here).

As is done in example 7.14 we can demonstrate the other arithmetic rules. By writing

down carefully the formula for each element of a matrix in the final product, in terms of
the elements of the other matrices, one can prove the rules properly.

Exercise 7.15 Demonstration of Arithmetic Rule

Demonstrate the first arithmetic rule in Theorem 7.13 with two real matrices A2×2 and B2×2
and the constant k.

7.3 Transpose of a Matrix

By interchanging rows and columns in a matrix the transpose matrix is formed as in


this example:
 
A = \begin{bmatrix} a & b & c \\ d & e & f \end{bmatrix} \quad \text{has the transpose} \quad A^T = \begin{bmatrix} a & d \\ b & e \\ c & f \end{bmatrix}   (7-29)

A^T is 'A transpose'. In addition you have that (A^T)^T = A. Here is a useful arithmetic
rule for the transpose of a matrix-matrix product.

Theorem 7.16 Transpose of a Matrix


Let there be given two arbitrary matrices Am×n and Bn×p. You form the transposed
matrices A^T and B^T, respectively, by interchanging the columns and rows of each
matrix.

The transpose of a matrix-matrix product AB is equal to the matrix-matrix product
of B^T with A^T (that is, in reverse order):

(AB)^T = B^T A^T   (7-30)

In the following example Theorem 7.16 is tested.



Example 7.17 Demonstration of Theorem 7.16

 
Given the two matrices A = \begin{bmatrix} 0 & 1 & 6 \\ 7 & -3 & 2 \end{bmatrix} and B = \begin{bmatrix} 9 & 1 \\ 1 & 0 \\ -6 & 3 \end{bmatrix}. Then

AB = \begin{bmatrix} 0 & 1 & 6 \\ 7 & -3 & 2 \end{bmatrix} \begin{bmatrix} 9 & 1 \\ 1 & 0 \\ -6 & 3 \end{bmatrix} = \begin{bmatrix} 1 \cdot 1 - 6 \cdot 6 & 6 \cdot 3 \\ 7 \cdot 9 - 3 \cdot 1 - 2 \cdot 6 & 7 \cdot 1 + 2 \cdot 3 \end{bmatrix} = \begin{bmatrix} -35 & 18 \\ 48 & 13 \end{bmatrix}   (7-31)

We now try to form the matrix-matrix product B^T A^T and we find

A^T = \begin{bmatrix} 0 & 7 \\ 1 & -3 \\ 6 & 2 \end{bmatrix} \quad \text{and} \quad B^T = \begin{bmatrix} 9 & 1 & -6 \\ 1 & 0 & 3 \end{bmatrix}   (7-32)

and then

B^T A^T = \begin{bmatrix} 9 & 1 & -6 \\ 1 & 0 & 3 \end{bmatrix} \begin{bmatrix} 0 & 7 \\ 1 & -3 \\ 6 & 2 \end{bmatrix} = \begin{bmatrix} 1 \cdot 1 - 6 \cdot 6 & 9 \cdot 7 - 1 \cdot 3 - 6 \cdot 2 \\ 3 \cdot 6 & 1 \cdot 7 + 3 \cdot 2 \end{bmatrix} = \begin{bmatrix} -35 & 48 \\ 18 & 13 \end{bmatrix}   (7-33)

The two results look identical:

\begin{bmatrix} -35 & 18 \\ 48 & 13 \end{bmatrix}^T = \begin{bmatrix} -35 & 48 \\ 18 & 13 \end{bmatrix} \;\Leftrightarrow\; (AB)^T = B^T A^T ,   (7-34)

in agreement with Theorem 7.16
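The rule can of course also be checked numerically; a minimal sketch with NumPy (assumed
available):

    import numpy as np

    A = np.array([[0,  1, 6],
                  [7, -3, 2]])
    B = np.array([[ 9, 1],
                  [ 1, 0],
                  [-6, 3]])

    print((A @ B).T)                             # [[-35  48] [ 18  13]]
    print(B.T @ A.T)                             # the same matrix
    print(np.array_equal((A @ B).T, B.T @ A.T))  # True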

Exercise 7.18 Matrix Product and the Transpose

Given the matrices

A = \begin{bmatrix} 1 & 1 & 2 \\ 1 & 2 & -1 \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} 0 & -1 & -1 \\ 1 & 2 & 1 \end{bmatrix}   (7-35)

Calculate if possible the following:

a) 2A − 3B,  b) 2A^T − 3B^T,  c) 2A − 3B^T,  d) AB,

e) AB^T,  f) BA^T,  g) B^T A,  h) A^T B.

7.4 Summary

• Matrices are arrays characterized by the number of columns and rows, determining
the type of the matrix. An entry in the matrix is called an element.

• The type of a matrix is denoted as: Am×n . The matrix A has m rows and n columns.

• Matrices can be multiplied by a scalar by multiplying each element in the matrix


by the scalar.

• Matrices can be added if they are of the same type. This is done by adding corre-
sponding elements.

• The matrix-vector product, of Am×n with the vector v with n elements, is defined
as:
 
A_{m \times n} v = \begin{bmatrix} a_1 & a_2 & \dots & a_n \end{bmatrix} \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{bmatrix} = a_1 v_1 + a_2 v_2 + \dots + a_n v_n ,   (7-36)

where a1 , a2 , . . . , an are the column vectors in A.

• The matrix-matrix product (or just the matrix product) is defined as a series of
matrix-vector products:
   
A_{m \times n} B_{n \times p} = A \begin{bmatrix} b_1 & b_2 & \dots & b_p \end{bmatrix} = \begin{bmatrix} Ab_1 & Ab_2 & \dots & Ab_p \end{bmatrix}   (7-37)

• More arithmetic rules for matrix sums, matrix products and matrix-scalar prod-
ucts are found in Theorem 7.3 and Theorem 7.13.

• The transpose A^T of a matrix A is determined by interchanging rows and columns


in the matrix.

eNote 8

Square Matrices

In this eNote we explore the basic characteristics of the set of square matrices and introduce the
notion of the inverse of certain square matrices. We presume that the reader has a knowledge of
basic matrix operations, see e.g. eNote 7, Matrices and Matrix Algebra.

(Updated 24.9.2021 David Brander).

Square matrices are simply matrices with equal number of rows and columns, that is they
are of the type n × n. This note will introduce some of the basic operations that apply
to square matrices.

A square n × n matrix A looks like this:


 
A = \begin{bmatrix} a_{11} & a_{12} & \dots & a_{1n} \\ a_{21} & a_{22} & \dots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \dots & a_{nn} \end{bmatrix}   (8-1)

The elements a11 , a22 , . . . , ann are said to be placed on the main diagonal or just the
diagonal of A.

A square matrix D, the non-zero elements of which lie exclusively on the main diagonal,
is termed a diagonal matrix, and one can write D = diag( a11 , a22 , . . . , ann ).

A symmetric matrix A is a square matrix that is equal to its own transpose, thus A = A^T.

The square matrix with 1’s in the main diagonal and zeroes elsewhere, is called the

identity matrix regardless of the number of rows and columns. The identity matrix is
here denoted E, (more commonly in the literature as I). Accordingly
 
E = E_{n \times n} = \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & 1 \end{bmatrix}   (8-2)

Internationally accepted usage is to denote the identity matrix by I.

Theorem 8.1 Identity Matrix


The identity matrix E in Rn×n is the only matrix in Rn×n that satisfies the following
relations:
AE = EA = A (8-3)
for an arbitrary matrix A ∈ Rn×n .

Proof

Suppose another matrix D satisfies the same relations as E, that is, AD = DA = A for all
n × n matrices A. Taking A = E in these relations gives ED = E, while relation (8-3) for E
applied to the matrix D gives ED = D. Combining the two we get D = ED = E.

Since E = D there is no other matrix than the identity matrix E that can be a neutral element
for the matrix product.

The identity matrix can be regarded as the ”1 for matrices": A scalar is not
altered by multiplication by 1, likewise a matrix is not altered by the matrix
product of the matrix with the identity matrix of the same type.

As is evident from the following it is often crucial for the manipulation of square matri-
ces whether they have full rank or not. Therefore we now introduce a special concept

to express this.

Definition 8.2 Invertible and Singular Matrix


A square matrix is called regular or non-singular (or invertible) if it is of full rank, that
is ρ(An×n ) = n.

A square matrix is called singular if it is not of full rank, that is ρ(An×n ) < n.

8.1 Inverse Matrix

The reciprocal of a scalar a ≠ 0 satisfies the following equation: a · x = 1, where x is


the reciprocal. This can be rewritten as x = a−1 . This idea will now be generalized to
square matrices. Notice that you cannot determine the reciprocal of a scalar a if a = 0.
A similar exception emerges when we generalize to square matrices.

In order to determine the "reciprocal matrix" to a matrix A, termed the inverse matrix,
we set up a matrix equation similar to a · x = 1 for a scalar:

AX = XA = E (8-4)

The unknown X is a matrix. If there is a solution X , it is denoted A−1 and is called the
inverse matrix to A. Hence we wish to find a certain matrix called A−1 , for which the
matrix product of A with this matrix yields the identity matrix.

It is not all square matrices that possess an inverse. This is postulated in the following
theorem.

Theorem 8.3 Inverse Matrix


A square matrix An×n has an inverse matrix A−1 , that satisfies AA−1 = A−1 A = E,
if and only if A is non-singular.

The inverse matrix is uniquely determined by the solution of the matrix equation
AX = E, where X is the unknown.

Note: this is why a non-singular square matrix is also called invertible.



In the following method it is explained, how the matrix equation described above (8-4)
is solved, and thus how the inverse of an invertible matrix is found.

Method 8.4 Determining the Inverse Matrix


You determine the inverse matrix denoted A−1 , for the invertible square matrix A
by use of the matrix equation
AX = E. (8-5)
The equation is solved with respect to the unknown X in the following way:
 
1. The augmented matrix T = [ A | E ] is formed.

2. By ordinary Gauss-Jordan elimination the reduced row echelon form rref(T)
of T is determined.

3. In the elimination process the identity matrix is finally formed on the left hand
side of the vertical line, while the solution (the inverse of A) can be read on the
right hand side: rref(T) = [ E | X ] = [ E | A^{-1} ].
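The three steps are easy to mirror in software. Below is a small sketch with SymPy (assumed
available; not part of these eNotes), using the matrix from Example 8.5 below:

    from sympy import Matrix, eye

    A = Matrix([[-16,  9, -10],
                [  9, -5,   6],
                [  2, -1,   1]])

    T = A.row_join(eye(3))      # the augmented matrix [ A | E ]
    R, _ = T.rref()             # Gauss-Jordan elimination
    A_inv = R[:, 3:]            # right-hand block: the inverse (when A has full rank)
    print(A_inv)                # Matrix([[1, 1, 4], [3, 4, 6], [1, 2, -1]])
    print(A * A_inv)            # the identity matrix E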


Example 8.5 Inverse Matrix

We wish to find the inverse matrix A^{-1} to the matrix A, given in this way:

A = \begin{bmatrix} -16 & 9 & -10 \\ 9 & -5 & 6 \\ 2 & -1 & 1 \end{bmatrix}   (8-6)

This can be done using Method 8.4. First the augmented matrix is formed:

T = \left[\begin{array}{rrr|rrr} -16 & 9 & -10 & 1 & 0 & 0 \\ 9 & -5 & 6 & 0 & 1 & 0 \\ 2 & -1 & 1 & 0 & 0 & 1 \end{array}\right]   (8-7)

Now we form the leading 1 in the first row: First the row operation R1 + R2 and then R1 +
4 · R3 . This yields

\left[\begin{array}{rrr|rrr} -7 & 4 & -4 & 1 & 1 & 0 \\ 9 & -5 & 6 & 0 & 1 & 0 \\ 2 & -1 & 1 & 0 & 0 & 1 \end{array}\right] \rightarrow \left[\begin{array}{rrr|rrr} 1 & 0 & 0 & 1 & 1 & 4 \\ 9 & -5 & 6 & 0 & 1 & 0 \\ 2 & -1 & 1 & 0 & 0 & 1 \end{array}\right]   (8-8)

Then the numbers in the 1st column of the 2nd and 3rd row are eliminated: R2 − 9 · R1 and
R3 − 2 · R1 . Furthermore the 2nd and 3rd rows are swapped: R2 ↔ R3 . We then get

\left[\begin{array}{rrr|rrr} 1 & 0 & 0 & 1 & 1 & 4 \\ 0 & -5 & 6 & -9 & -8 & -36 \\ 0 & -1 & 1 & -2 & -2 & -7 \end{array}\right] \rightarrow \left[\begin{array}{rrr|rrr} 1 & 0 & 0 & 1 & 1 & 4 \\ 0 & -1 & 1 & -2 & -2 & -7 \\ 0 & -5 & 6 & -9 & -8 & -36 \end{array}\right]   (8-9)

Now we change the sign in row 2: (−1) · R2 and then we eliminate the number in the 2nd
column of the 3rd row: R3 + 5 · R2 .

\left[\begin{array}{rrr|rrr} 1 & 0 & 0 & 1 & 1 & 4 \\ 0 & 1 & -1 & 2 & 2 & 7 \\ 0 & -5 & 6 & -9 & -8 & -36 \end{array}\right] \rightarrow \left[\begin{array}{rrr|rrr} 1 & 0 & 0 & 1 & 1 & 4 \\ 0 & 1 & -1 & 2 & 2 & 7 \\ 0 & 0 & 1 & 1 & 2 & -1 \end{array}\right]   (8-10)

The last step is then evident: R2 + R3 :

rref(T) = \left[\begin{array}{rrr|rrr} 1 & 0 & 0 & 1 & 1 & 4 \\ 0 & 1 & 0 & 3 & 4 & 6 \\ 0 & 0 & 1 & 1 & 2 & -1 \end{array}\right]   (8-11)

We see that ρ(A) = ρ(T) = 3, thus A is of full rank, and therefore one can read the inverse to
A on the right hand side of the vertical line:
 
A^{-1} = \begin{bmatrix} 1 & 1 & 4 \\ 3 & 4 & 6 \\ 1 & 2 & -1 \end{bmatrix}   (8-12)

Notice that the left hand side of the augmented matrix is the identity matrix. It is
so to speak ”moved” from the right to the left hand side of the equality signs (the
vertical line).

Finally we check whether A^{-1}, as is expected, satisfies AA^{-1} = E and A^{-1}A = E:

AA^{-1} = \begin{bmatrix} -16 & 9 & -10 \\ 9 & -5 & 6 \\ 2 & -1 & 1 \end{bmatrix} \begin{bmatrix} 1 & 1 & 4 \\ 3 & 4 & 6 \\ 1 & 2 & -1 \end{bmatrix}
= \begin{bmatrix} -16 + 27 - 10 & -16 + 36 - 20 & -64 + 54 + 10 \\ 9 - 15 + 6 & 9 - 20 + 12 & 36 - 30 - 6 \\ 2 - 3 + 1 & 2 - 4 + 2 & 8 - 6 - 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} = E   (8-13)

It is true! By use of the same procedure it is seen that A−1 A = E is also true.

As can be seen in the next example, the inverse can be used in the solution of matrix
equations with square matrices. In matrix equations one can interchange terms and mul-
tiply by scalars in order to isolate the unknown just as one would do in ordinary scalar
equations. Moreover one can multiply all terms by matrices – this can be done either
from right or the left on all terms in the equation, yielding different results.

Example 8.6 Matrix Equation

We wish to solve the matrix equation

AX = B − CX (8-14)

where

A = \begin{bmatrix} -4 & 2 & -1 \\ 9 & 5 & -5 \\ 2 & 0 & 7 \end{bmatrix}, \quad B = \begin{bmatrix} 0 & 1 & 0 \\ 8 & -12 & 5 \\ 5 & 0 & 0 \end{bmatrix} \quad \text{and} \quad C = \begin{bmatrix} -12 & 7 & -9 \\ 0 & -10 & 11 \\ 0 & -1 & -6 \end{bmatrix}   (8-15)

First the equation is reduced as far as possible, see e.g. Theorem 7.13, without using the
values:
AX = B − CX ⇔ AX + CX = B − CX + CX ⇔ (A + C)X = B (8-16)
Since X is the unknown we try to isolate this matrix totally. If (A + C) is an invertible matrix,
one can multiply by the inverse to (A + C) from the left on both sides of the equality sign.
Thus:
(A + C)−1 (A + C)X = (A + C)−1 B ⇔ EX = X = (A + C)−1 B , (8-17)
because (A + C)−1 (A + C) = E according to the definition of inverse matrices. We now form
A + C and determine whether the matrix is invertible:

−4 2 −1 −12 7 −9 −16 9 −10


     

A + C =  9 5 −5 + 0 −10 11  =  9 −5 6 (8-18)


2 0 7 0 −1 −6 2 −1 1

The inverse of this matrix was already determined in Example 8.5, and this part of the procedure
is therefore skipped. X is determined as:

X = (A + C)^{-1} B = \begin{bmatrix} -16 & 9 & -10 \\ 9 & -5 & 6 \\ 2 & -1 & 1 \end{bmatrix}^{-1} \begin{bmatrix} 0 & 1 & 0 \\ 8 & -12 & 5 \\ 5 & 0 & 0 \end{bmatrix} = \begin{bmatrix} 1 & 1 & 4 \\ 3 & 4 & 6 \\ 1 & 2 & -1 \end{bmatrix} \begin{bmatrix} 0 & 1 & 0 \\ 8 & -12 & 5 \\ 5 & 0 & 0 \end{bmatrix} = \begin{bmatrix} 28 & -11 & 5 \\ 62 & -45 & 20 \\ 11 & -23 & 10 \end{bmatrix}   (8-19)
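The result can be checked numerically; a minimal sketch with NumPy (assumed available), which
solves (A + C)X = B directly instead of forming the inverse explicitly:

    import numpy as np

    A = np.array([[-4,  2, -1],
                  [ 9,  5, -5],
                  [ 2,  0,  7]], dtype=float)
    B = np.array([[0,   1, 0],
                  [8, -12, 5],
                  [5,   0, 0]], dtype=float)
    C = np.array([[-12,   7, -9],
                  [  0, -10, 11],
                  [  0,  -1, -6]], dtype=float)

    X = np.linalg.solve(A + C, B)           # X = (A + C)^(-1) B, cf. (8-17)
    print(X)                                # [[28 -11  5] [62 -45 20] [11 -23 10]]
    print(np.allclose(A @ X, B - C @ X))    # True: X satisfies the original equation (8-14)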

In the further investigation of the invertibility of the transpose or the inverse of an in-
vertible matrix plus the invertibility of the product of two or more invertible matrices
we will need the following lemma, which is stated without proof (see eNote 9, in
particular, Theorem 9.20 for one way to prove it).

Lemma 8.7 Inherited Invertibility


1. If A is an invertible square matrix, both A^T and A^{-1} are invertible.

2. The product A B of two square matrices is invertible if and only if both A and B
are invertible.

We can now give arithmetic rules for inverse matrices.

Theorem 8.8 Arithmetic Rules for Inverse Matrices


For the invertible square matrices A, B and C the following arithmetic rules apply:

1. The inverse of the inverse of a matrix is equal to the matrix itself:

(A−1 ) −1 = A (8-20)

2. The transpose of an inverse matrix is equal to the inverse of the transpose of


the matrix:
(A^T)^{-1} = (A^{-1})^T   (8-21)

A^T is invertible if and only if A is invertible.

3. In matrix equations we can multiply all terms by the inverse of a matrix. This
can be done either from the right or the left hand side on both sides of the
equality sign:

AX = B ⇔ X = A−1 B and XC = D ⇔ X = DC−1 (8-22)

4. The inverse of a matrix product of two matrices is equal to the product of the
corresponding inverse matrices in reverse order:

(AB)−1 = B−1 A−1 (8-23)



All the arithmetic rules in theorem 8.8 are easily proven by checking.

Below one of the rules is tested in an example. The arithmetic rule in equation (8-22)
has already been used in example 8.6.

Example 8.9 Checking of Arithmetic Rule for an Inverse Matrix

Two square matrices are given


   
A = \begin{bmatrix} 2 & 4 \\ 6 & 10 \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} 1 & 1 \\ 2 & 3 \end{bmatrix}   (8-24)

We wish to test the last arithmetic rule in Theorem 8.8, viz. that (AB)^{-1} = B^{-1}A^{-1}. First A^{-1}
and B^{-1} are determined by use of Method 8.4.

\left[\, A \mid E \,\right] = \left[\begin{array}{rr|rr} 2 & 4 & 1 & 0 \\ 6 & 10 & 0 & 1 \end{array}\right] \rightarrow \left[\begin{array}{rr|rr} 1 & 2 & \tfrac{1}{2} & 0 \\ 0 & -2 & -3 & 1 \end{array}\right] \rightarrow \left[\begin{array}{rr|rr} 1 & 0 & -\tfrac{5}{2} & 1 \\ 0 & 1 & \tfrac{3}{2} & -\tfrac{1}{2} \end{array}\right]   (8-25)

Similarly with B:

\left[\, B \mid E \,\right] = \left[\begin{array}{rr|rr} 1 & 1 & 1 & 0 \\ 2 & 3 & 0 & 1 \end{array}\right] \rightarrow \left[\begin{array}{rr|rr} 1 & 1 & 1 & 0 \\ 0 & 1 & -2 & 1 \end{array}\right] \rightarrow \left[\begin{array}{rr|rr} 1 & 0 & 3 & -1 \\ 0 & 1 & -2 & 1 \end{array}\right]   (8-26)

Since we have obtained the identity matrix on the left hand side of the vertical line in both
cases, we get

A^{-1} = \begin{bmatrix} -\tfrac{5}{2} & 1 \\ \tfrac{3}{2} & -\tfrac{1}{2} \end{bmatrix} \quad \text{and} \quad B^{-1} = \begin{bmatrix} 3 & -1 \\ -2 & 1 \end{bmatrix}   (8-27)
B^{-1}A^{-1} is determined:

B^{-1} A^{-1} = \begin{bmatrix} 3 & -1 \\ -2 & 1 \end{bmatrix} \begin{bmatrix} -\tfrac{5}{2} & 1 \\ \tfrac{3}{2} & -\tfrac{1}{2} \end{bmatrix} = \begin{bmatrix} -9 & \tfrac{7}{2} \\ \tfrac{13}{2} & -\tfrac{5}{2} \end{bmatrix}   (8-28)

On the other side of the equality sign in the arithmetic rule we first calculate AB:

AB = \begin{bmatrix} 2 & 4 \\ 6 & 10 \end{bmatrix} \begin{bmatrix} 1 & 1 \\ 2 & 3 \end{bmatrix} = \begin{bmatrix} 10 & 14 \\ 26 & 36 \end{bmatrix}   (8-29)

Now the inverse of AB is determined:

\left[\, AB \mid E \,\right] = \left[\begin{array}{rr|rr} 10 & 14 & 1 & 0 \\ 26 & 36 & 0 & 1 \end{array}\right] \rightarrow \left[\begin{array}{rr|rr} 1 & \tfrac{7}{5} & \tfrac{1}{10} & 0 \\ 0 & -\tfrac{2}{5} & -\tfrac{13}{5} & 1 \end{array}\right] \rightarrow \left[\begin{array}{rr|rr} 1 & 0 & -9 & \tfrac{7}{2} \\ 0 & 1 & \tfrac{13}{2} & -\tfrac{5}{2} \end{array}\right]   (8-30)

Finally we arrive at

(AB)^{-1} = \begin{bmatrix} -9 & \tfrac{7}{2} \\ \tfrac{13}{2} & -\tfrac{5}{2} \end{bmatrix} ,   (8-31)
Comparison of equations (8-28) and (8-31) yields the identity: (AB)−1 = B−1 A−1 .
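The same check takes a few lines of code; a minimal sketch with NumPy (assumed available):

    import numpy as np
    from numpy.linalg import inv

    A = np.array([[2,  4],
                  [6, 10]], dtype=float)
    B = np.array([[1, 1],
                  [2, 3]], dtype=float)

    print(inv(A @ B))                                  # [[-9.   3.5] [ 6.5 -2.5]]
    print(inv(B) @ inv(A))                             # the same matrix
    print(np.allclose(inv(A @ B), inv(B) @ inv(A)))    # True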

Exercise 8.10 Inverse Matrix

Given the (not square!) matrices

A = \begin{bmatrix} 0 & 1 \\ 1 & 0 \\ 2 & 3 \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} 3 & 2 & 1 \\ 0 & 1 & 2 \end{bmatrix}   (8-32)

a) Determine (BA)−1 .

b) Show that AB is not invertible and therefore one cannot determine (AB)−1 .

Exercise 8.11 Inverse Matrix

Given the matrices


       
$$A = \begin{bmatrix} 2 & 3 \\ 1 & 1 \end{bmatrix},\quad B = \begin{bmatrix} 1 & 0 \\ 4 & 1 \end{bmatrix},\quad C = \begin{bmatrix} -1 & 3 \\ 1 & -2 \end{bmatrix} \quad\text{and}\quad D = \begin{bmatrix} 1 & 0 \\ -4 & 1 \end{bmatrix} \tag{8-33}$$

a) Calculate AC, BD and DC.

b) Determine, if possible, A−1 , B−1 and (AB)−1 .

c) Is it possible to decide whether (AB)−1 exists after you have tried to determine A−1
and B−1 ? If yes, how?

8.2 Powers of Matrices

We have now seen how the inverse of an invertible matrix is determined and we say that
it has the power −1. Similarly we define arbitrary integer powers of square matrices.

Definition 8.12 Powers of a Matrix


For an arbitrary square matrix A the following natural powers are defined:
$$A^0 = E \quad\text{and}\quad A^n = \underbrace{A\,A \cdots A}_{n\ \text{times}}\,, \quad\text{for } n \in \mathbb{N} \tag{8-34}$$

Furthermore for an arbitrary invertible square matrix B the negative powers are
defined:
$$B^{-n} = (B^{-1})^n = \underbrace{B^{-1}\,B^{-1} \cdots B^{-1}}_{n\ \text{times}}\,, \quad\text{for } n \in \mathbb{N} \tag{8-35}$$

As a consequence of the definition of powers, some arithmetic rules can be given.

Theorem 8.13 Arithmetic Rules for Powers of Matrices


For an arbitrary square matrix A and two arbitrary non-negative integers a and b
the following arithmetic rules for powers are valid

$$A^a A^b = A^{a+b} \quad\text{and}\quad (A^a)^b = A^{ab} \tag{8-36}$$

If A is invertible, these arithmetic rules are also valid for negative integers a and b.

Below is an example of two (simple) matrices that possess some funny characteristics.
The characteristics are not typical for matrices!

Example 8.14 Two Funny Matrices with Respect to Powers

Given the matrices
$$A = \begin{bmatrix} 1 & 0 \\ 2 & -1 \end{bmatrix} \quad\text{and}\quad B = \begin{bmatrix} 2 & 1 \\ -4 & -2 \end{bmatrix} \tag{8-37}$$
By use of both the Definition 8.12 and Theorem 8.13 the following calculations are performed.
$A^2$ is determined:
$$A^2 = AA = \begin{bmatrix} 1 & 0 \\ 2 & -1 \end{bmatrix}\begin{bmatrix} 1 & 0 \\ 2 & -1 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = E \tag{8-38}$$
Following the addendum to the fourth arithmetic rule in Theorem 8.8, A is invertible, and

moreover A = A−1 . This gives

$$\ldots\,,\quad A^{-3} = (AA^2)^{-1} = A^{-1} = A\,,\quad A^{-2} = (A^2)^{-1} = E\,,\quad A^{-1} = A\,,\quad A^{0} = E\,,\quad A^{1} = A\,,\quad A^{2} = E\,,\quad \ldots \tag{8-39}$$

Thus all odd powers of A give A, while even powers give the identity matrix:

A2n = E and A2n+1 = A for n ∈ Z (8-40)

$B^2$ is determined:
$$B^2 = \begin{bmatrix} 2 & 1 \\ -4 & -2 \end{bmatrix}\begin{bmatrix} 2 & 1 \\ -4 & -2 \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} = \mathbf{0} \tag{8-41}$$
According to the same arithmetic rule B is singular. Then it follows

B0 = E , B1 = B , B2 = 0 , Bn = 0 for n ≥ 2 (8-42)
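The patterns in (8-40) and (8-42) invite a quick numerical experiment. The following sketch is only an illustration (Python with NumPy, not part of the original text):

```python
import numpy as np

A = np.array([[1, 0], [2, -1]])
B = np.array([[2, 1], [-4, -2]])
E = np.eye(2, dtype=int)

# Even powers of A give E, odd powers give A
print(np.array_equal(np.linalg.matrix_power(A, 4), E))   # True
print(np.array_equal(np.linalg.matrix_power(A, 5), A))   # True

# All powers of B from 2 and upwards give the zero matrix
Z = np.zeros((2, 2), dtype=int)
print(np.array_equal(np.linalg.matrix_power(B, 2), Z))   # True
print(np.array_equal(np.linalg.matrix_power(B, 7), Z))   # True
```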

8.3 Summary

• Square matrices are matrices where the number of rows equals the number of
columns.

• The unit matrix E is a square matrix with the number one in the diagonal and
zeros elsewhere:
$$E = E_{n\times n} = \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix} \tag{8-43}$$

• If a square matrix has full rank, it is called regular, otherwise it is called singular.

• A square matrix, the entries of which are all zero except for those on the diagonal,
is called a diagonal matrix.

• A square matrix, that is equal to the transpose of itself, is called a symmetric ma-
trix.

• For a regular matrix A there exists a unique inverse, denoted A−1 , satisfying:

AA−1 = A−1 A = E (8-44)

The inverse can be determined by Method 8.4.

• Rules of computation with square and inverse matrices exist, see Theorem 8.8.

• Powers of square matrices are defined, see Definition 8.12. In addition some arithmetic rules exist.

• Inverse matrices are e.g. used in connection with change of basis and the eigen-
value problem. Moreover the determinant of a square matrix is defined in eNote 9,
Determinants.

eNote 9

Determinants

In this eNote we look at square matrices; that is they are of type n × n for n ≥ 2, see eNote 8.
It is an advantage but not indispensable to have knowledge about the concept of a determinant
for (2 × 2)-matrices in advance. The matrix algebra from eNote 7 is assumed known (sum, product, transpose and inverse of matrices), as is the general solution method for systems of linear equations from eNote 6.

Updated: 24.9.21 David Brander.

9.1 Intro to Determinants

The determinant of a real square (n × n)-matrix A is a real number which we denote by


det(A) or sometimes for short by |A|. The determinant of a matrix can, in a way, be
considered as a measure of how much the matrix ’weighs’ - with sign; we will illustrate
this visually and geometrically for (2 × 2)-matrices and for (3 × 3)-matrices in eNote 10.

The determinant is a well-defined function of the $n^2$ numbers that constitute the elements of an (n × n)-matrix.

In order to define – and then calculate – the value of the determinant of an (n × n)-
matrix directly from the n2 elements in each of the matrices we need two things: First the
well-known formula for the determinant of (2 × 2)-matrices (see the definition 9.1 be-
low) and secondly a method to cut up an arbitrary (n × n)-matrix into (2 × 2)-matrices

and thereby define and calculate arbitrary determinants from the determinants of these
(2 × 2)-matrices.

9.2 Determinants of (2 × 2)−Matrices

Definition 9.1 Determinants of (2 × 2)−Matrices


Let A be an arbitrary (2 × 2)-matrix
$$A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}. \tag{9-1}$$

Then the determinant of A is defined by:

det(A) = a11 · a22 − a21 · a12 . (9-2)

Exercise 9.2 Inverse (2 × 2)−Matrix

Remember that the inverse matrix A−1 of an invertible matrix A has the characteristic property that A−1 · A = A · A−1 = E. Show directly from (9-1) and (9-2) that the inverse matrix A−1 of a (2 × 2)-matrix A can be expressed in the following way (when det(A) ≠ 0):
$$A^{-1} = \frac{1}{\det(A)} \begin{bmatrix} a_{22} & -a_{12} \\ -a_{21} & a_{11} \end{bmatrix}. \tag{9-3}$$

Exercise 9.3 Arithmetic Rules for (2 × 2)−Matrices

For all square matrices a number of basic arithmetic rules are valid; they are presented in Theorem 9.20 below. Check the first three equations in Theorem 9.20 for (2 × 2)-matrices A and B, using direct calculation of both sides of the equations with (9-2).

9.3 Submatrices

Definition 9.4 Submatrices


For an (n × n)-matrix A we define the (i, j) submatrix, Â_ij , as the ((n − 1) × (n − 1))-submatrix of A that emerges by deleting the entire row i and the entire column j from the matrix A.

All $n^2$ submatrices Â_ij (where 1 ≤ i ≤ n and 1 ≤ j ≤ n) are smaller than A: they are of the type (n − 1) × (n − 1) and therefore have only $(n-1)^2$ elements.

Example 9.5 Submatrices for a (3 × 3)−Matrix

A (3 × 3)-matrix A has a total of 9 (2 × 2)-submatrices Â_ij . For example, if
$$A = \begin{bmatrix} 0 & 2 & 1 \\ 1 & 3 & 2 \\ 0 & 5 & 1 \end{bmatrix}, \tag{9-4}$$

then the 9 submatrices belonging to A are given by:


     
$$\begin{aligned}
\hat{A}_{11} &= \begin{bmatrix} 3 & 2 \\ 5 & 1 \end{bmatrix}, & \hat{A}_{12} &= \begin{bmatrix} 1 & 2 \\ 0 & 1 \end{bmatrix}, & \hat{A}_{13} &= \begin{bmatrix} 1 & 3 \\ 0 & 5 \end{bmatrix}, \\
\hat{A}_{21} &= \begin{bmatrix} 2 & 1 \\ 5 & 1 \end{bmatrix}, & \hat{A}_{22} &= \begin{bmatrix} 0 & 1 \\ 0 & 1 \end{bmatrix}, & \hat{A}_{23} &= \begin{bmatrix} 0 & 2 \\ 0 & 5 \end{bmatrix}, \\
\hat{A}_{31} &= \begin{bmatrix} 2 & 1 \\ 3 & 2 \end{bmatrix}, & \hat{A}_{32} &= \begin{bmatrix} 0 & 1 \\ 1 & 2 \end{bmatrix}, & \hat{A}_{33} &= \begin{bmatrix} 0 & 2 \\ 1 & 3 \end{bmatrix}.
\end{aligned} \tag{9-5}$$

The corresponding determinants are determinants of (2 × 2)−matrices and each of these can
be calculated directly from the definition 9.1 above:

$$\begin{aligned}
\det(\hat{A}_{11}) &= -7\,, & \det(\hat{A}_{12}) &= 1\,, & \det(\hat{A}_{13}) &= 5\,, \\
\det(\hat{A}_{21}) &= -3\,, & \det(\hat{A}_{22}) &= 0\,, & \det(\hat{A}_{23}) &= 0\,, \\
\det(\hat{A}_{31}) &= 1\,, & \det(\hat{A}_{32}) &= -1\,, & \det(\hat{A}_{33}) &= -2\,.
\end{aligned} \tag{9-6}$$

9.4 Inductive Definition of Determinants

The determinant of a 3 × 3 matrix can now be defined from the determinants of 3 of the
9 submatrices, and generally: The determinant of an n × n matrix is defined by the use
of the determinants of the n submatrices that belong to a (freely chosen) row r in the
following way, which naturally is called expansion along the r-th row:

Definition 9.6 Determinants are Defined by Expansion


For an arbitrary value of the row index r the determinant of a given (n × n)-matrix
A is defined inductively in the following way:
$$\det(A) = \sum_{j=1}^{n} (-1)^{r+j}\, a_{rj}\, \det(\hat{A}_{rj})\,. \tag{9-7}$$

We here and subsequently use the following short notation for the sum and
products of many terms, e.g. n given real numbers c1 , c2 , . . . , cn−1 , cn :
$$c_1 + c_2 + \cdots + c_{n-1} + c_n = \sum_{i=1}^{n} c_i\,, \quad\text{and}\quad c_1 \cdot c_2 \cdots c_{n-1} \cdot c_n = \prod_{i=1}^{n} c_i\,. \tag{9-8}$$

Example 9.7 Expansion of a Determinant along the First Row

We will use Definition 9.6 directly in order to calculate the determinant of the matrix A that is given in Example 9.5. We choose r = 1 and thus need the three determinants of the submatrices, det(Â_11) = −7, det(Â_12) = 1, and det(Â_13) = 5, which we already calculated in Example 9.5 above:
$$\begin{aligned}
\det(A) &= \sum_{j=1}^{n} (-1)^{1+j}\, a_{1j}\, \det(\hat{A}_{1j}) \\
&= (-1)^{1+1}\cdot 0 \cdot \det(\hat{A}_{11}) + (-1)^{1+2}\cdot 2 \cdot \det(\hat{A}_{12}) + (-1)^{1+3}\cdot 1 \cdot \det(\hat{A}_{13}) \\
&= 0 - 2 + 5 = 3\,.
\end{aligned} \tag{9-9}$$
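Definition 9.6 translates almost word for word into a small recursive procedure. The following sketch (Python, not part of the original text) expands along the first row, uses formula (9-2) as the base case, and reproduces det(A) = 3 for the matrix from Example 9.5:

```python
def det(A):
    """Determinant by cofactor expansion along the first row (r = 1)."""
    n = len(A)
    if n == 1:
        return A[0][0]
    if n == 2:                      # base case: Definition 9.1
        return A[0][0] * A[1][1] - A[1][0] * A[0][1]
    total = 0
    for j in range(n):
        # submatrix: delete row 0 and column j
        sub = [row[:j] + row[j + 1:] for row in A[1:]]
        total += (-1) ** j * A[0][j] * det(sub)
    return total

A = [[0, 2, 1],
     [1, 3, 2],
     [0, 5, 1]]
print(det(A))   # 3
```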

Notice that the determinants of the submatrices must be multiplied by the


element in A that is in entry (r, j) and with the sign-factor (−1)r+ j before
they are added. And notice that the determinants of the submatrices them-
selves can be expanded by the use of determinants of even smaller matrices,
such that finally we only need to determine weighted sums of determinants of
(2 × 2)−matrices!

Exercise 9.8 Choice of ’Expansion Row’ Arbitrary

Show by direct calculation that we obtain the same value for the determinant by use of one
of the other two rows for the expansion of the determinant in example 9.5.

Definition 9.9 Alternative: Expansion along a Column


The determinant of a given (n × n)−matrix A can alternatively be defined induc-
tively by expansion along an arbitrary chosen column :
$$\det(A) = \sum_{i=1}^{n} (-1)^{i+s}\, a_{is}\, \det(\hat{A}_{is})\,. \tag{9-10}$$

Here the expansion is expressed as the expansion along column s.

As is already hinted with the definitions and as shown in the concrete case of the matrix
in example 9.5, it doesn’t matter which row (or column) defines the expansion:

Theorem 9.10 Choice of Row or Column for the Expansions Immaterial


The two definitions, 9.6 and 9.9, of the determinant of a square matrix give the same
value and this they do without regard to the choice of row or column in the corre-
sponding expansions.

Exercise 9.11 Choice of Column for the Expansion is Immaterial

Show by direct calculation that we get the same value for the determinant in 9.5 by using
expansion along any of the three columns in A.

It is of course wisest to expand along a row (or a column) that contains many
0’s.

Exercise 9.12 Determinants of Some Larger Matrices

Use the above instructions and results to find the determinants of each of the following matrices:
$$\begin{bmatrix} 0 & 2 & 7 & 1 \\ 1 & 3 & 0 & 2 \\ 0 & 0 & 1 & 0 \\ 0 & 5 & 8 & 1 \end{bmatrix},\quad
\begin{bmatrix} 0 & 2 & 1 & 0 & 5 \\ 1 & 3 & 2 & 0 & 2 \\ 0 & 5 & 1 & 0 & 1 \\ 0 & 0 & 1 & 0 & 0 \\ 5 & 2 & 7 & 1 & 9 \end{bmatrix},\quad
\begin{bmatrix} 0 & 0 & 2 & 1 & 5 & 3 \\ 0 & 1 & 3 & 2 & 2 & 1 \\ 0 & 0 & 5 & 1 & 1 & 4 \\ 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 5 & 2 & 7 & 1 & 9 \end{bmatrix}. \tag{9-11}$$

If there are many 0’s in a matrix then it is much easier to calculate its deter-
minant! Especially if all the elements in a row (or a column) are 0 except one
element then it is clearly wisest to expand along that row (or column). And we
are allowed to ’obtain’ a lot of 0’s by application of the well-known row opera-
tions, if you keep record of the constants used for divisions and how often you
swap rows. See theorem 9.16 and example 9.17 below.

9.5 Computational Properties of Determinants

We collect some of the most important tools that are often used for the calculation and
inspection of the matrix determinants.

It is not difficult to prove the following theorem, e.g. by expanding first along the first column or the first row, after which the pattern becomes clear:

Theorem 9.13 Matrices with 0 above or below the Diagonal


If an (n × n)−matrix has only 0’s above or below the diagonal, then the determinant is given by the product of the elements on the diagonal.

As a special case of this theorem we have:

Theorem 9.14 The Determinant of a Diagonal Matrix


Let Λ denote an (n × n)-diagonal matrix with the elements in the diagonal
λ1 , λ2 , ..., λn and 0’s outside the diagonal:
 
$$\Lambda = \mathrm{diag}(\lambda_1, \lambda_2, \dots, \lambda_n) = \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{bmatrix}. \tag{9-12}$$

Then the determinant is


$$\det(\Lambda) = \lambda_1 \lambda_2 \cdots \lambda_n = \prod_{i=1}^{n} \lambda_i \tag{9-13}$$
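As a small numerical illustration of Theorems 9.13 and 9.14 (a sketch in Python/NumPy, not part of the original text), the determinant of a matrix with only 0's below the diagonal is simply the product of the diagonal elements:

```python
import numpy as np

T = np.array([[2., 5., -1.],
              [0., 3.,  7.],
              [0., 0., -4.]])   # zeros below the diagonal

print(np.linalg.det(T))          # approximately -24
print(np.prod(np.diag(T)))       # -24.0, the product of the diagonal entries
```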

Exercise 9.15 Determinant of a Bi-diagonal Matrix

Determine the determinant of the (n × n)− bi-diagonal matrix with arbitrarily given values
µ1 , . . . , µn in the bi-diagonal and 0’s elsewhere:

$$M = \mathrm{bidiag}(\mu_1, \mu_2, \dots, \mu_n) = \begin{bmatrix} 0 & \cdots & 0 & \mu_1 \\ 0 & \cdots & \mu_2 & 0 \\ \vdots & & \vdots & \vdots \\ \mu_n & \cdots & 0 & 0 \end{bmatrix}. \tag{9-14}$$

General matrices (including square matrices), as known from eNote 6, can be reduced
to reduced row echelon form by the use of row operations. If you keep an eye on what
happens in every step in this reduction then the determinant of the matrix can be read

directly from the process. The determinant of a matrix behaves ’nicely’ even if you
perform row operations on the matrix:

Theorem 9.16 The Influence of Row Operations on the Determinant


The determinant has the following properties:

1. If all the elements in a row in A are 0 then the determinant is 0, det(A) = 0.

2. If two rows are swapped in A, Ri ↔ R j , then the sign of the determinant is


shifted.

3. If all the elements in a row in A are multiplied by a constant k, k · Ri , then the


determinant is multiplied by k.

4. If two rows in a matrix A are equal then det(A) = 0.

5. A row operation of the type R_j + k · R_i , i ≠ j, does not change the determinant.

As indicated above it follows from these properties of the determinant function that the
well-known reduction of a given matrix A to the reduced row echelon form, rref(A),
through row operations as described in eNote 6, in fact comprises a totally explicit calcu-
lation of the determinant of A. We illustrate with a simple example:

Example 9.17 Inspection of Determinant by Reduction to the Reduced


Row Echelon Form

We consider the (3 × 3)−matrix A1 = A from example 9.5:


 
$$A_1 = \begin{bmatrix} 0 & 2 & 1 \\ 1 & 3 & 2 \\ 0 & 5 & 1 \end{bmatrix} \tag{9-15}$$

We reduce A1 to the reduced row echelon form in the usual way by Gauss–Jordan row op-
erations and all the time we keep an eye on what happens to the determinant by using the
rules in 9.16 (and possibly by checking the results by direct calculations):

Operation: Swap row 1 and row 2, R1 ↔ R2 : The determinant changes sign :

det(A2 ) = − det(A1 ) : (9-16)


eNote 9 9.5 COMPUTATIONAL PROPERTIES OF DETERMINANTS 192

 
$$A_2 = \begin{bmatrix} 1 & 3 & 2 \\ 0 & 2 & 1 \\ 0 & 5 & 1 \end{bmatrix} \tag{9-17}$$

Operation: $\tfrac12 R_2$, row 2 is multiplied by $\tfrac12$: The determinant is multiplied by $\tfrac12$:

$$\det(A_3) = \tfrac12 \det(A_2) = -\tfrac12 \det(A_1): \tag{9-18}$$

$$A_3 = \begin{bmatrix} 1 & 3 & 2 \\ 0 & 1 & 1/2 \\ 0 & 5 & 1 \end{bmatrix} \tag{9-19}$$

Operation: $R_1 - 3R_2$: The determinant is unchanged:

$$\det(A_4) = \det(A_3) = \tfrac12 \det(A_2) = -\tfrac12 \det(A_1): \tag{9-20}$$

$$A_4 = \begin{bmatrix} 1 & 0 & 1/2 \\ 0 & 1 & 1/2 \\ 0 & 5 & 1 \end{bmatrix} \tag{9-21}$$

Operation: $R_3 - 5R_2$: The determinant is unchanged:

$$\det(A_5) = \det(A_4) = \det(A_3) = \tfrac12 \det(A_2) = -\tfrac12 \det(A_1): \tag{9-22}$$

$$A_5 = \begin{bmatrix} 1 & 0 & 1/2 \\ 0 & 1 & 1/2 \\ 0 & 0 & -3/2 \end{bmatrix} \tag{9-23}$$

Now the determinant is the product of the elements in the diagonal because all the elements below the diagonal are 0, see Theorem 9.13. All in all we therefore have:

$$-\tfrac32 = \det(A_5) = \det(A_4) = \det(A_3) = \tfrac12 \det(A_2) = -\tfrac12 \det(A_1) \tag{9-24}$$

From this we obtain directly – by reading ’backwards’:

$$-\tfrac12 \det(A_1) = -\tfrac32\,, \tag{9-25}$$
such that
det(A1 ) = 3 . (9-26)
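The bookkeeping in the example can be spot-checked numerically. The sketch below (Python/NumPy, not part of the original text) confirms det(A1) = 3 and illustrates rules 2, 3 and 5 of Theorem 9.16 directly:

```python
import numpy as np

A1 = np.array([[0., 2., 1.],
               [1., 3., 2.],
               [0., 5., 1.]])
print(np.linalg.det(A1))        # approximately 3

# Swapping two rows changes the sign of the determinant (rule 2)
A2 = A1[[1, 0, 2], :]
print(np.linalg.det(A2))        # approximately -3

# Multiplying a row by 1/2 multiplies the determinant by 1/2 (rule 3)
A3 = A2.copy()
A3[1, :] = A3[1, :] / 2
print(np.linalg.det(A3))        # approximately -1.5

# Adding a multiple of one row to another leaves the determinant unchanged (rule 5)
A4 = A3.copy()
A4[0, :] = A4[0, :] - 3 * A4[1, :]
print(np.linalg.det(A4))        # still approximately -1.5
```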

In addition we have the following relation between the rank and the determinant of a

matrix; the determinant reveals whether the matrix is singular or invertible:

Theorem 9.18 Rank versus Determinant


The rank of a square (n × n)-matrix A is less than n if and only if the determinant of
A is 0. In other words, A is singular if and only if det(A) = 0.

If a matrix contains a variable, a parameter, then the determinant of the matrix is a func-
tion of this parameter; in the applications of matrix-algebra it is often crucial to be able
to find the zeroes of this function – exactly because the corresponding matrix is singular
for those values of the parameter, and hence there might not be a (unique) solution to
the corresponding system of linear equations with the matrix as the coefficient matrix.

Exercise 9.19 Determinant of a Matrix with a Variable

Given the matrix


$$A = \begin{bmatrix} 1 & a & a^2 & a^3 \\ 1 & 0 & a^2 & a^3 \\ 1 & a & a & a^3 \\ 1 & a & a^2 & a \end{bmatrix}, \quad\text{where } a \in \mathbb{R}. \tag{9-27}$$

1. Determine the determinant of A as a polynomial in a.

2. Determine the roots of this polynomial.

3. Find the rank of A for a ∈ {−4, −3, −2, −1, 0, 1, 2, 3, 4} . What does the rank have to
do with the roots of the determinant?

4. Find the rank of A for all a.



Theorem 9.20 Arithmetic Rules for Determinants


Let A and B denote two (n × n)−matrices. Then:

1. det(A) = det(A> )

2. det(AB) = det(A) det(B)

3. det(A−1 ) = (det(A))−1 , when A is invertible, that is det(A) ≠ 0

4. det(Ak ) = (det(A))k , for all k ≥ 1.

5. det(B−1 AB) = det(A), when B is invertible, that is det(B) ≠ 0.
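When learning these rules it is useful to test them on a couple of randomly chosen matrices. The snippet below is only such an illustration (Python/NumPy, not part of the original text):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.integers(-5, 6, size=(4, 4)).astype(float)
B = rng.integers(-5, 6, size=(4, 4)).astype(float)

print(np.isclose(np.linalg.det(A), np.linalg.det(A.T)))                        # rule 1
print(np.isclose(np.linalg.det(A @ B), np.linalg.det(A) * np.linalg.det(B)))   # rule 2
if not np.isclose(np.linalg.det(B), 0):
    # rule 5 only applies when B is invertible
    print(np.isclose(np.linalg.det(np.linalg.inv(B) @ A @ B), np.linalg.det(A)))
```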

Exercise 9.21

Prove the last 3 equations in theorem 9.20 by the use of det(AB) = det(A) det(B).

Exercise 9.22 The Determinant of a Sum is not the Sum of the Determi-
nants

Show by the simplest possible example that the determinant function det() is not additive. That is, find two (n × n)-matrices A and B such that
$$\det(A + B) \neq \det(A) + \det(B)\,. \tag{9-28}$$

Exercise 9.23 Use of Arithmetic Rules for Determinants

Let a denote a real number. The following matrices are given:
$$A = \begin{bmatrix} 2 & -1 & 3 \\ 1 & a & 2 \\ 2 & 3 & 3 \end{bmatrix} \quad\text{and}\quad B = \begin{bmatrix} 4 & 4 & 0 \\ -5 & 3 & -1 \\ 0 & 1 & a \end{bmatrix}. \tag{9-29}$$

1. Find det(A) and det(B).


2. Find $\det(A^{\top}B)$ and $\det\big((A^{\top}B)^{4}\big)$.

3. Determine those values of a for which A is invertible and find for these values of a the
expression for det(A−1 ).

9.6 Advanced: Cramer’s Solution Method

If A is an invertible n × n matrix and b = (b1 , ..., bn ) is an arbitrary vector in Rn , then there exists (as is known from eNote 6, for an invertible coefficient matrix) exactly one solution x = ( x1 , ..., xn ) to the system of linear equations Ax = b, and in that eNote we found a method for finding the solution.

Cramer’s method for solving such systems of equations is a direct method. Essentially it
consists of calculating suitable determinants of matrices constructed from A and b and
then writing down the solution directly from the calculated determinants.

Theorem 9.24 Cramer’s Solution Formula


Let A be an invertible n × n matrix and let b = (b1 , ..., bn ) denote an arbitrary vector in Rn . Then there exists (as is known from eNote 6, for an invertible coefficient matrix) exactly one solution x = ( x1 , ..., xn ) to the system of linear equations

Ax = b , (9-30)

and the elements in the solution are given by:

$$x_j = \frac{1}{\det(A)} \det\!\big(A^{b}_{\dagger j}\big)\,, \tag{9-31}$$

where A†bj denotes the (n × n)−matrix that emerges by replacing column j in A


with b.

Explanation 9.25 What † Means

If A is the following matrix (from example 9.5)


 
$$A = \begin{bmatrix} 0 & 2 & 1 \\ 1 & 3 & 2 \\ 0 & 5 & 1 \end{bmatrix}, \tag{9-32}$$

and if we let b = (b1 , b2 , b3 ), then


     
$$A^{b}_{\dagger 1} = \begin{bmatrix} b_1 & 2 & 1 \\ b_2 & 3 & 2 \\ b_3 & 5 & 1 \end{bmatrix},\quad A^{b}_{\dagger 2} = \begin{bmatrix} 0 & b_1 & 1 \\ 1 & b_2 & 2 \\ 0 & b_3 & 1 \end{bmatrix},\quad A^{b}_{\dagger 3} = \begin{bmatrix} 0 & 2 & b_1 \\ 1 & 3 & b_2 \\ 0 & 5 & b_3 \end{bmatrix}. \tag{9-33}$$

Exercise 9.26 Use Cramer’s Solution Formula

If in particular we let A be the same matrix as above and now let b = (1, 3, 2), then we get by
substitution of b in (9-33) and then computing the relevant determinants:
 
$$\begin{aligned}
\det\!\big(A^{b}_{\dagger 1}\big) &= \det \begin{bmatrix} 1 & 2 & 1 \\ 3 & 3 & 2 \\ 2 & 5 & 1 \end{bmatrix} = 4 \\
\det\!\big(A^{b}_{\dagger 2}\big) &= \det \begin{bmatrix} 0 & 1 & 1 \\ 1 & 3 & 2 \\ 0 & 2 & 1 \end{bmatrix} = 1 \\
\det\!\big(A^{b}_{\dagger 3}\big) &= \det \begin{bmatrix} 0 & 2 & 1 \\ 1 & 3 & 3 \\ 0 & 5 & 2 \end{bmatrix} = 1\,.
\end{aligned} \tag{9-34}$$

Since we also know det(A) = 3 we have now constructed the solution to the system of
equations Ax = b, through (9-31):
   
$$x = (x_1, x_2, x_3) = \Big( \tfrac13 \cdot 4,\ \tfrac13 \cdot 1,\ \tfrac13 \cdot 1 \Big) = \Big( \tfrac43,\ \tfrac13,\ \tfrac13 \Big). \tag{9-35}$$

1. Check by direct substitution that x is a solution to Ax = b.

2. Determine A−1 and use it directly for the solution of the system of equations.

3. Solve the system of equations by reduction of the augmented matrix to the reduced
row echelon form as in eNote 2 followed by a reading of the solution.
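Cramer's formula (9-31) is also easy to implement directly. The following sketch (Python/NumPy, not part of the original text; the helper name cramer_solve is our own, purely illustrative choice) reproduces the solution (4/3, 1/3, 1/3) above and compares it with a standard solver:

```python
import numpy as np

def cramer_solve(A, b):
    """Solve Ax = b via Cramer's formula; assumes det(A) != 0."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    d = np.linalg.det(A)
    x = np.empty(len(b))
    for j in range(len(b)):
        Aj = A.copy()
        Aj[:, j] = b                 # replace column j with b
        x[j] = np.linalg.det(Aj) / d
    return x

A = [[0, 2, 1], [1, 3, 2], [0, 5, 1]]
b = [1, 3, 2]
print(cramer_solve(A, b))            # approximately [1.333 0.333 0.333]
print(np.linalg.solve(A, b))         # the same solution
```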

In order to show what is actually going on in Cramer’s solution formula we first define
the adjoint matrix for a matrix A:

Definition 9.27 The Adjoint Matrix


The (classical) adjoint matrix adj(A) (also called the adjugate matrix) is defined by the
elements that are used in the definition 9.6 of the determinant of A :

$$\mathrm{adj}(A) = \begin{bmatrix} (-1)^{1+1}\det(\hat{A}_{11}) & \cdots & (-1)^{1+n}\det(\hat{A}_{1n}) \\ \vdots & \ddots & \vdots \\ (-1)^{n+1}\det(\hat{A}_{n1}) & \cdots & (-1)^{n+n}\det(\hat{A}_{nn}) \end{bmatrix}^{\!\top} \tag{9-36}$$

In other words: The element in entry ( j, i ) in the adjoint matrix adj(A) is the sign-modified determinant of the (i, j) submatrix, that is: $(-1)^{i+j}\det(\hat{A}_{ij})$. Notice the use of the transpose in (9-36).

Example 9.28 An Adjoint Matrix

In example 9.5 we looked at the following matrix


 
$$A = \begin{bmatrix} 0 & 2 & 1 \\ 1 & 3 & 2 \\ 0 & 5 & 1 \end{bmatrix}. \tag{9-37}$$
The matrix A has the following adjoint matrix:
$$\mathrm{adj}(A) = \begin{bmatrix} -7 & 3 & 1 \\ -1 & 0 & 1 \\ 5 & 0 & -2 \end{bmatrix}, \tag{9-38}$$
that is obtained directly from earlier computations of the determinants of the submatrices,
remembering that each element is given a sign that depends on the ’entry’, and that the
expression (9-36) is to be transposed.
$$\begin{aligned}
\det(\hat{A}_{11}) &= -7\,, & \det(\hat{A}_{12}) &= 1\,, & \det(\hat{A}_{13}) &= 5\,, \\
\det(\hat{A}_{21}) &= -3\,, & \det(\hat{A}_{22}) &= 0\,, & \det(\hat{A}_{23}) &= 0\,, \\
\det(\hat{A}_{31}) &= 1\,, & \det(\hat{A}_{32}) &= -1\,, & \det(\hat{A}_{33}) &= -2\,.
\end{aligned} \tag{9-39}$$
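The adjoint matrix, and the identity in Exercise 9.29 below, can also be computed and checked numerically. The sketch below (Python/NumPy, not part of the original text; the helper name adjugate is our own, purely illustrative choice) reproduces (9-38) and verifies A·adj(A) = det(A)·E for this matrix:

```python
import numpy as np

def adjugate(A):
    """Classical adjoint: transposed matrix of signed submatrix determinants."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    C = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            sub = np.delete(np.delete(A, i, axis=0), j, axis=1)
            C[i, j] = (-1) ** (i + j) * np.linalg.det(sub)
    return C.T                                  # note the transpose, cf. (9-36)

A = np.array([[0, 2, 1], [1, 3, 2], [0, 5, 1]])
adjA = adjugate(A)
print(np.round(adjA))                                        # [[-7 3 1], [-1 0 1], [5 0 -2]]
print(np.allclose(A @ adjA, np.linalg.det(A) * np.eye(3)))   # True: A adj(A) = det(A) E
```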

Exercise 9.29 Adjoint Versus Inverse Matrix

Show that all square matrices A fulfil the following

A adj(A) = det(A)E (9-40)

such that the inverse matrix to A (which exists precisely if det(A) ≠ 0) can be found in the following way:
$$A^{-1} = \frac{1}{\det(A)}\,\mathrm{adj}(A)\,. \tag{9-41}$$
Hint: The exercise is not easy. It is recommended to practice on a (2 × 2)-matrix. The zeroes of the identity matrix in equation (9-40) are obtained by using the property that the determinant of a matrix with two identical columns is 0.

The proof of theorem 9.24 is now rather short:

Proof

By multiplying both sides of equation (9-30) with A−1 we get:

$$x = A^{-1}b = \frac{1}{\det(A)}\,\mathrm{adj}(A)\,b\,, \tag{9-42}$$

and thus – if we denote the elements of adj(A) by α_ij:
$$\begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix} = \frac{1}{\det(A)} \begin{bmatrix} \alpha_{11} & \cdots & \alpha_{1n} \\ \vdots & & \vdots \\ \alpha_{n1} & \cdots & \alpha_{nn} \end{bmatrix} \begin{bmatrix} b_1 \\ \vdots \\ b_n \end{bmatrix}. \tag{9-43}$$

From this we read directly


$$\begin{aligned}
x_j &= \frac{1}{\det(A)} \sum_{i=1}^{n} \alpha_{ji}\, b_i \\
    &= \frac{1}{\det(A)} \sum_{i=1}^{n} (-1)^{i+j}\, b_i \det(\hat{A}_{ij}) \\
    &= \frac{1}{\det(A)} \det\!\big(A^{b}_{\dagger j}\big)\,,
\end{aligned} \tag{9-44}$$

where, to establish the last equality sign, we have used that
$$\sum_{i=1}^{n} (-1)^{i+j}\, b_i \det(\hat{A}_{ij}) \tag{9-45}$$

is exactly the expansion of det(A†bj ) along column number j, that is the expansion along the
b-column in det(A†bj ), see the definition in equation (9-10).



9.7 Summary

• The determinant of a square matrix with real elements is a real number that is cal-
culated from the n2 elements in the matrix, either by expansion along a row or
a column or through inspection of the Gauss-Jordan reduction process to the re-
duced row echelon form. When expanding along a matrix row or column the in-
telligent choice for it is a row or column in the matrix with many 0-elements. The
expansion along row number r takes place inductively after the following formula
that expresses the determinant as a sum of ’smaller’ determinants (with suitable
signs), see definition 9.6:
$$\det(A) = \sum_{j=1}^{n} (-1)^{r+j}\, a_{rj}\, \det(\hat{A}_{rj})\,, \tag{9-46}$$

where Â_rj is the submatrix that emerges by deleting row r and column j from the matrix A, see Definition 9.4.

• There exist convenient arithmetic rules for the calculation of determinants of prod-
ucts of matrices, determinants of the inverse matrix, and determinants of the trans-
pose of a matrix. See Theorem 9.20. The most important arithmetic rules are the
product-formula
det(A · B) = det(A) · det(B)
and the transpose-formula

det(A) = det(A> ) .

• The determinant of a square matrix that contains a variable is a function of this variable. The characteristic polynomial is such a very important function: K_A(λ) = det(A − λE) is an n-th degree polynomial in the variable λ, see Definition 9.32.

• Cramer’s solution formula gives the direct way (through computations of determinants) to the solution of an inhomogeneous system of linear equations with an invertible coefficient matrix, see Theorem 9.24. If the system of equations is
$$A\mathbf{x} = \mathbf{b}\,, \tag{9-47}$$
then the solution is given by:
$$x_j = \frac{1}{\det(A)} \det\!\big(A^{b}_{\dagger j}\big)\,, \tag{9-48}$$
where A†bj denotes the matrix that emerges by replacing column j in the matrix A with b.

9.8 Advanced: Characteristic Polynomial

The material in this subsection naturally belongs to this eNote about determinants due to the calculations involved, but it is only later, when solving the so-called eigenvalue problem, that we will find the characteristic polynomials really useful.

For a given square matrix A we define the corresponding characteristic matrix and the corresponding characteristic polynomial in the following way:

Definition 9.30 The Characteristic Matrix


Let A be an (n × n)−matrix. The corresponding characteristic matrix is the following
real matrix-function of the real variable λ:

KA (λ) = A − λ E , in which λ ∈ R , (9-49)

where E = En×n = diag(1, 1, ..., 1) is the (n × n)−identity matrix.

Example 9.31 A Characteristic Matrix

Given a (3 × 3)−matrix A by
$$A = \begin{bmatrix} 3 & -2 & 0 \\ 0 & 1 & 0 \\ 1 & -1 & 2 \end{bmatrix}. \tag{9-50}$$
Then
$$K_A(\lambda) = A - \lambda E = \begin{bmatrix} 3-\lambda & -2 & 0 \\ 0 & 1-\lambda & 0 \\ 1 & -1 & 2-\lambda \end{bmatrix}. \tag{9-51}$$

The corresponding characteristic polynomial for the matrix A is then the following real polynomial in the variable λ:

Definition 9.32 The Characteristic Polynomial


Given the square matrix A then the characteristic polynomial for A is defined like
this:
KA (λ) = det(KA (λ)) = det(A − λ E) , where λ ∈ R . (9-52)

Example 9.33 A Characteristic Polynomial

With A as in example 9.31 we get the following characteristic polynomial for A by expansion
of the characteristic matrix along the last column:

$$\begin{aligned}
K_A(\lambda) &= \det \begin{bmatrix} 3-\lambda & -2 & 0 \\ 0 & 1-\lambda & 0 \\ 1 & -1 & 2-\lambda \end{bmatrix} \\
             &= (-1)^{3+3} (2-\lambda) \det \begin{bmatrix} 3-\lambda & -2 \\ 0 & 1-\lambda \end{bmatrix} \\
             &= (2-\lambda)(3-\lambda)(1-\lambda)\,.
\end{aligned} \tag{9-53}$$

The characteristic polynomial for A thus has the roots 1, 2, and 3.
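The same polynomial and its roots can be obtained numerically. The sketch below is only an illustration (Python/NumPy, not part of the original text); note that np.poly returns the coefficients of det(λE − A), which for this odd n equals −K_A(λ) and in any case has the same roots:

```python
import numpy as np

A = np.array([[3, -2, 0],
              [0,  1, 0],
              [1, -1, 2]], dtype=float)

coeffs = np.poly(A)          # coefficients of det(lambda*E - A), highest power first
print(coeffs)                # [ 1. -6. 11. -6.]  i.e. lambda^3 - 6 lambda^2 + 11 lambda - 6
print(np.roots(coeffs))      # roots 3, 2, 1
print(np.linalg.eigvals(A))  # the same values, computed directly as eigenvalues
```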

The characteristic polynomial KA (λ) – and in particular λ-values for which


KA (λ) = 0, i.e. the roots of the polynomial – will play a decisive role in the un-
derstanding and application of the operative properties of the A-matrix. This
will be described in the eNote about eigenvalues and eigenvectors. The roots
of the characteristic polynomial of a matrix are termed the eigenvalues of the
matrix.

Exercise 9.34 The Degree of the Characteristic Polynomial

Give the reasons why the characteristic polynomial K_A(λ) for an (n × n)-matrix A is a polynomial in λ of degree n.

Exercise 9.35 Some Characteristic Polynomials and their Roots

Determine the characteristic polynomials for the following matrices and find all real roots in
each of the polynomials:
 
$$A_1 = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}\,, \quad A_2 = \mathrm{diag}(a_1, a_2, a_3)\,, \quad A_3 = \mathrm{bidiag}(b_1, b_2, b_3)\,. \tag{9-54}$$

Exercise 9.36 Find Matrices with Given Properties

Construct two (4 × 4)-matrices A and B such that one has only real roots in the corresponding
characteristic polynomial, and such that the other has no real roots in the corresponding
characteristic polynomial.

eNote 10

Geometric Vectors

The purpose of this note is to give an introduction to geometric vectors in the plane and
3-dimensional space, aiming at the introduction of a series of methods that manifest themselves
in the general theory of vector spaces. The key concepts are linear independence and linear
dependence, plus basis and coordinates. The note assumes knowledge of elementary geometry in
the plane and 3-space, of systems of linear equations as described in eNote 6 and of matrix
algebra as described in eNote 7.

Updated 25.09.21 David Brander

By a geometric vector in the plane R2 or Euclidean 3-space, R3 , we understand a con-


nected pair consisting of a length and a direction. Euclidean vectors are written as small
bold letters, e.g. v. A vector can be represented by an arrow with a given initial point
and a terminal point. If the vector v is represented by an arrow with the initial point A

and the terminal point B, we use the representation v = AB. All arrows with the same
length and direction as the arrow from A to B, also represent v.

Example 10.1 Parallel Displacement Using Vectors

Geometric vectors can be applied in parallel displacement in the plane and 3-space. In Figure
10.1 the line segment CD is constructed from the line segment AB as follows: all points of
AB are displaced by the vector u. In the same way the line segment EF emerges from AB by
parallel displacement by the vector v. We have →AB = →CD = →EF, but notice that e.g. →AB ≠ →FE.

In what follows we assume that a unit line segment has been chosen, that is a line segment

u
F

v
A

Figure 10.1: Parallel displacement by a vector

that has the length 1. By |v| we understand the length of the vector v as the proportion-
ality factor with respect to the unit line segment, that is, a real number. All vectors of
the same length as the unit line segment are called unit vectors.

For practical reasons a particular vector that has length 0 and which has no direction
is introduced. It is called the zero vector and is written 0. For every point A we put

AA = 0. Any vector that is not the zero vector is called a proper vector.

For every proper vector v we define the opposite vector −v as the vector that has the same
length as v , but the opposite direction. If v = →AB, then →BA = −v . For the zero vector
we put −0 = 0 .

It is often practical to use a common initial point when different vectors are to be rep-
resented by arrows. We choose a fixed point O which we term the origin, and consider
those representations of the vectors that have O as the initial point. Vectors represented
in this way are called position vectors, because every given vector v has a unique point (position) P that satisfies v = →OP. Conversely, every point Q corresponds to a unique vector u such that →OQ = u.

By the angle between two proper vectors in the plane we understand the unique angle be-
tween their representations radiating from O , in the interval [ 0; π ] . If a vector v in the
plane is turned the angle π/2 counter-clockwise, a new vector emerges that is called
v’s hat vector; it is denoted v̂.

By the angle between two proper vectors in 3-space we understand the angle between
their representations radiating from O in the plane that contains their representations.

It makes good and useful sense “to add vectors”, taking account of the vectors’ lengths
and directions. Therefore in the following we can introduce some arithmetic operations
for geometric vectors. First it concerns two linear operations, addition of vectors and
multiplication of a vector by a scalar (a real number). Later we will consider three ways
of multiplying vectors, viz. the dot product, and for vectors in 3-space the cross product
and the scalar triple product.

10.1 Addition and Multiplication by a Scalar

Definition 10.2 Addition


Given two vectors in the plane or 3-space, u and v. The sum u + v is determined in
the following way:
• We choose the origin O and mark the position vectors u = →OQ and v = →OR.

• By parallel displacement of the line segments OR by u the line segment QP is


constructed.
• →OP is then the position vector for the sum of u and v, in short u + v = →OP.

Figure: Construction of the sum u + v as the diagonal →OP of the parallelogram OQPR

In physics you talk about the ”parallelogram of forces": If the object O is influ-
enced by the forces u and v, the resulting force can be determined as the vector
sum u + v, the direction of which gives the direction of the resulting force, and
the length of which gives the magnitude of the resulting force. If in particular
u and v are of the same length, but have opposite directions, the resulting force
is equal to the 0-vector.

We then introduce multiplication of a vector by a scalar:

Definition 10.3 Multiplication by a Scalar


Given a vector v in the plane or 3-space and a scalar k. If v = 0, we have kv = vk = 0.
Otherwise by the product kv the following is understood:

• If k > 0, then kv = vk is the vector that has the same direction as v and which
is k times as long as v.

• If k = 0, then kv = 0.

• If k < 0, then kv = vk is the vector that has the opposite direction of v and which is −k = | k | times as long as v.

Example 10.4 Multiplication by a Scalar

A given vector v is multiplied by −1 and 2, respectively:

v
(-1)v
2v

Figure: Multiplication of vector by -1 and 2



It follows immediately from Definition 10.3 that multiplication of a vector


by −1 gives the vector’s opposite vector, in short

(−1)u = −u .

Thus we use the following way of writing

(−5)v = −(5v) = −5v .

From the definition 10.3 the zero rule follows immediately for geometric vec-
tors:
kv = 0 ⇔ k = 0 or v = 0 .

In the following example it is shown that multiplication of an arbitrary vector by an


arbitrary scalar can be performed by a genuine compasses and ruler construction.

Example 10.5 Geometrical Multiplication

Given a vector a and a line segment of length k, we wish to construct the vector ka.

Q
ka

O 1 k
Figure: Multiplication of a vector by an arbitrary scalar


First the position vector →OQ = a is marked. Then with O as the initial point we draw a line which is used as a ruler and which is not parallel to a, and where the numbers 1 and k are marked. The triangle OkP is drawn so that it is similar to the triangle O1Q. Since the two triangles are similar it must be true that ka = →OP.

Exercise 10.6

Given two parallel vectors a and b and a ruler line. How can you, using a pair of compasses and the ruler line, construct a line segment of length k, given that b = ka?

Exercise 10.7

Given the proper vector v and a ruler line. Draw the vector (1/|v|) · v.

Parametric representations for straight lines in the plane or 3-space are written using proper
vectors. Below we first give an example of a line through the origin and then an example
of a line not passing through the origin.

Example 10.8 Parametric Representation of a Straight Line

Given a straight line l through the origin, we wish to write the points on the line using a
parametric representation:

tr P
R
r
O

Figure: Parametric representation for a line through the origin


A point R on l different from the origin is chosen. The vector r =OR is called a direction
vector for l . For every point P on l corresponds exactly one real number t that satisfies

OP= tr. Conversely, to every real number t corresponds exactly one point P on l so that

OP= tr . As t traverses the real numbers from -∞ to +∞, P will traverse all of l in the
direction determined by r. Then

{ P | OP= tr where t ∈ R }

is a parametric representation of l .

Example 10.9 Parametric Representation of a Straight Line

The line m does not go through the origin. We wish to describe the points on m by use of a
parametric representation:

tr
r
B
R
m P
b

Figure: Parametric representation of a line


First an initial point B on m is chosen, and we put b =OB. A point R ∈ m different from

B is chosen. The vector r = BR is then a directional vector for m . To every point P on m

corresponds exactly one real number t that fulfils OP= b + tr. Conversely, to every number t

exactly one point P on m corresponds so that OP= b + tr. When t traverses the real numbers
from -∞ to +∞, P will traverse all of m in the direction determined by r. Then

{ P | OP= b + tr where t ∈ R }

is a parametric representation for m .

Parametric representations can also be used for the description of line segments. This is
the subject of the following exercise.

Exercise 10.10

Consider the situation in example 10.9. Draw the oriented line segment with the parametric
representation

{ P | OP= b + tr, where t ∈ [ −1; 2 ] } .

Exercise 10.11

Given two (different) points A and B . Describe with a parametric representation the oriented
line segment from A to B .

We will need more advanced arithmetic rules for addition of geometric vectors and
multiplication of geometric vectors by scalars than the ones we have given in the exam-
ples above. These are sketched in the following theorem and afterwards we will discuss
examples of how they can be justified on the basis of already defined arithmetic opera-
tions and theorems known from elementary geometry.

Theorem 10.12 Arithmetic Rules


For arbitrary geometric vectors u, v and w and for arbitrary real numbers k1 and k2
the following arithmetic rules are valid:
1. u+v = v+u Addition is commutative
2. (u + v) + w = u + (v + w) Addition is associative
3. u+0 = u The zero vector is neutral for addition
4. u + (−u) = 0 The sum of a vector and its opposite is 0
5. k 1 ( k 2 u) = ( k 1 k 2 )u Scalar multiplication is associative

6. ( k 1 + k 2 )u = k 1 u + k 2 u
The distributive rules apply
7. k 1 (u + v) = k 1 u + k 1 v
8. 1u = u The scalar 1 is neutral in the product with vectors

The arithmetic rules in Theorem 10.12 can be illustrated and proven using geometric
constructions. Let us as an example take the first rule, the commutative rule. Here we
just have to look at the figure in the definition 10.2 , where u + v is constructed. If we
construct v + u, we will displace the line segment OQ with v and consider the emerg-
ing line segment RP2 . It must be true that the parallelogram OQPR is identical to the
parallelogram OQP2 R and hence P2 = P and u + v = v + u.

In the following two exercises the reader is asked to explain two of the other arithmetic
rules.

Exercise 10.13

Explain using the diagram the arithmetic rule k (u + v) = ku + kv.

ka
kb

a b

a+b
O
k(a+b)

Exercise 10.14

Draw a figure that illustrates the rule (u + v) + w = u + (v + w).

For a given vector u it is obvious that the opposite vector −u is the only vector that
satisfies the equation u + x = 0 . For two arbitrary vectors u and v it is also obvious
that exactly one vector exists that satisfies the equation u + x = v , viz. the vector
x = v + (−u) which is illustrated in Figure 10.2.

x v

-u u

Figure 10.2: Opposite of a vector

Therefore we can introduce subtraction of vectors as a variation of addition like this:



Definition 10.15 Subtraction


By the difference of two vectors v and u we understand the vector

v − u = v + (−u) . (10-1)

It is not necessary to introduce a formal definition of division of a vector by a


scalar, we consider this as a rewriting of multiplication by a scalar:

Division by a scalar: $\dfrac{v}{k} = \dfrac{1}{k} \cdot v\,; \quad k \neq 0$

10.2 Linear Combinations

A point about the arithmetic rule (u + v) + w = u + (v + w) from the theorem 10.12


is that parentheses can be left out in the process of adding a series of vectors, since it
has no consequences for the resulting vector in what order the vectors are added. This
is the background for linear combinations where sets of vectors are multiplied by scalars
and thereafter written as a sum.

Definition 10.16 Linear Combination


When the real numbers k1 , k2 , . . . , k n are given and in the plane or 3-space the vectors
v1 , v2 , . . . , vn then the sum

k1 v1 + k2 v2 + . . . + k n vn

is called a linear combination of the n given vectors.

If all the coefficients k1 , · · · , k n are equal to 0, the linear combination is called improper, or trivial, but if at least one of the coefficients is different from 0, it is proper, or non-trivial.

Example 10.17 Construction of a Linear Combination

d
3c
c b

O a O 2a
-b

Figure: Construction of a linear combination

In the diagram, to the left the vectors a, b and c are drawn. On the figure to the right we
have constructed the linear combination d = 2a − b + 3c.

Exercise 10.18

There are given in the plane the vectors u, v, s and t, plus the parallelogram A, see diagram.

A
s t

O u

Figure: Linear combinations

1. Write s as a linear combination of u and v.


2. Show that v can be expressed by the linear combination v = (1/3)s + (1/6)t.

3. Draw the linear combination s + 3u − v.

4. Determine real numbers a, b, c and d such that A can be described by the parametric

representation

A = { P | →OP = xu + yv with x ∈ [ a; b ] and y ∈ [ c; d ] } .

10.3 Linear Dependence and Linear Independence

If two vectors have representations on the same straight line, one says that they are
linearly dependent. It is evident that two proper vectors are linearly dependent if they
are parallel; otherwise they are linearly independent. We can formulate it as follows: Two
vectors u and v are linearly dependent if the one can be obtained from the other by
multiplication by a scalar different from 0, if e.g. there exists a number k ≠ 0 such that

v = ku .

We wish to generalize this original meaning of the concepts of linear dependence and
independence such that the concepts can be used for an arbitrary set of vectors.

Definition 10.19 Linear Dependence and Independence


A set of vectors (v1 , v2 , . . . , vn ) where n ≥ 2, is called linearly dependent if at least
one of the vectors can be written as a linear combination of the others.

If none of the vectors can be written as a linear combination of the others, the set is
called linearly independent.

NB: A set that only consists of one vector is called linearly dependent if the vector is
the 0-vector, otherwise linearly independent.

Example 10.20 Linearly Dependent and Linearly Independent Sets of Vec-


tors

In the plane are given three sets of vectors (u, v), (r, s) and (a, b, c) , as shown.

c
u
v

b
a

r
s

The set (u, v) is linearly dependent since for this example we have

u = −2v .

Also the set (a, b, c) is linearly dependent, since e.g.

b = a−c.

Only the set (r, s) is linearly independent.

Exercise 10.21

Explain that three vectors in 3-space are linearly dependent if and only if they have represen-
tations lying in the same plane. What are the conditions three vectors must fulfill in order to
be linearly independent?

Exercise 10.22

Consider (intuitively) what is the maximum number of vectors a set of vectors in the plane
can comprise, if the set is to be linearly independent. The same question in 3-space.

When investigating whether or not a given set of vectors is linearly independent or lin-
early dependent, the definition 10.19 does not give a practical procedure. It might be

easier to use the theorem that follows below. This theorem is based on the fact that a
set of vectors is linearly dependent if and only if the 0-vector can be written as a proper
linear combination of the vectors. Assume – as a prerequisite to the theorem – that the
set (a, b, c) is linearly dependent because

c = 2a − 3b.

Then the 0-vector can be written as the proper linear combination

2a − 3b − c = 0 .

Conversely assume that the 0-vector is a proper linear combination of the vectors u, v and w like this:
2u − 2v + 3w = 0 .
Then we have (e.g.) that
$$\mathbf{w} = -\tfrac23\,\mathbf{u} + \tfrac23\,\mathbf{v}$$
and hence the vectors are linearly dependent.

Theorem 10.23 Linear Independence


Let k1 , k2 , . . . , k n be real numbers. That the set of vectors (v1 , v2 , . . . , vn ) is linearly independent is equivalent to the statement that the equation

k1 v1 + k2 v2 + · · · + k n vn = 0    (10-2)

is only satisfied when all the coefficients k1 , k2 , . . . , k n are equal to 0 .

Proof

Assume that the set (v1 , v2 , . . . , vn ) is linearly dependent, and let vi be a vector that can be
written as a linear combination of the other vectors. We reorder (if necessary) the set, such
that i = 1, following which v1 can be written in the form

v1 = k 2 v2 + · · · + k n v n ⇔ v1 − k 2 v2 − · · · − k n v n = 0 . (10-3)

The 0-vector is hereby written in the form (10-2), in which not all the coefficients are 0 , be-
cause the coefficient to v1 is 1 .

Conversely, assume that the 0-vector is written in the form (10-2) with not all coefficients equal to 0, and let k_i ≠ 0. We reorder (if necessary) the set such that i = 1, following which we have
$$k_1 v_1 = -k_2 v_2 - \cdots - k_n v_n \quad\Leftrightarrow\quad v_1 = -\frac{k_2}{k_1} v_2 - \cdots - \frac{k_n}{k_1} v_n\,. \tag{10-4}$$
From this we see that the set is linearly dependent.

Example 10.24 Linearly Dependent Set

Every set of vectors containing the zero vector is linearly dependent. Consider e.g. the set (u, v, 0, w). It is obvious that the zero vector can be written as a linear combination of the other three vectors:

0 = 0u + 0v + 0w ,

where the zero vector is written as a (trivial) linear combination of the other vectors in the set.

Parametric representations for planes in 3-space are written using two linearly independent vectors. Below we first give an example of a plane through the origin, then an example of a plane that does not contain the origin.

Example 10.25 Parametic Representation for a Plane

Given a plane in 3-space through the origin as shown. We wish to describe the points in the
plane by a parametric representation.

R
v
O

P
u

Figure: A plane in 3-space through the origin



In the given plane we choose two points Q and R, both not the origin, and that do not lie
→ →
on a common line through the origin. The vectors u =OQ and v =OR will then be linearly
independent, and are called direction vectors of the plane. For every point P in the plane we

have exactly one pair of numbers (s, t) such that OP= su + tv . Conversely, for every pair of

real numbers (s, t) exists exactly one point P in the plane that satisfies OP= su + tv . Then

{ P | OP= su + tv ; (s, t) ∈ R2 }

is a parametric representation of the given plane.

Example 10.26 Parametric Representation for a Plane

A plane in 3-space does not contain the origin. We wish to describe the plane using a para-
metric representation.

R
P
v

B u
Q

Figure: A plane in 3-space


First we choose an initial point B in the plane, and we put b =OB. Then we choose two
→ →
linearly independent direction vectors u = BQ and v = BR where Q and R belong to the
plane. To every point P in the plane corresponds exactly one pair of real numbers (s, t), such
that → → →
OP=OB + BP= b + su + tv .
Conversely, to every pair of real numbers (s, t) corresponds exactly one point P in the plane
as given by this vector equation. Then

{ P | OP= b + su + tv ; (s, t) ∈ R2 }

is a parametric representation for the given plane.

Exercise 10.27

Give a parametric representation for the parallelogram A lying in the plane shown:

v
A B
u

10.4 The Standard Bases in the Plane and Space

In analytic geometry one shows how numbers and equations can describe geometric ob-
jects and phenomena including vectors. Here the concept of coordinates is decisive. It
is about how we determine the position of the geometric objects in 3-space and relative
to one another using numbers and tuples of numbers. To do so we need to choose a
number of vectors which we appoint as basis vectors. The basis vectors are ordered,
that is they are given a distinct order, and thus they constitute a basis. When a basis is
given all the vectors can be described using coordinates, which we assemble in so called
coordinate vectors. How this whole procedure takes place we first explain for the stan-
dard bases in the plane and 3-space. Later we show that often it is useful to use other
bases than the standard bases and how the coordinates of a vector in different bases are
related.

Definition 10.28 Standard Basis in the Plane


By a standard basis or an ordinary basis for the geometric vectors in the plane we
understand an ordered set of two vectors (i, j) that satisfies:

• i has the length 1 .

• j = î (that is, j is the hat vector of i ).

By a standard coordinate system in the plane we understand a standard basis (i, j)


together with a chosen origin O . The coordinate system is written (O, i, j) . By the x-axis and the y-axis we understand oriented number axes through O that are parallel to i and j , respectively.
Y

j
i
X
O 1

Figure: Standard coordinate system in the plane

Theorem 10.29 Coordinates of a Vector


If e = (i, j) is a standard basis, then any vector v in the plane can be written in exactly
one way as a linear combination of i and j:

v = xi + yj.

The coefficients x and y in the linear combination are called v’s coordinates with respect
to the basis e, or for short v’s e-coordinates, and they are assembled in a coordinate
vector as follows:  
x
ev = .
y

Definition 10.30 The Coordinates of a Point


Let P be any point in the plane, and let (O, i, j) be a standard coordinate system in
the plane. By the coordinates of P with respect to the coordinate system we under-

stand the coordinates of the position vector OP with respect to the standard basis
(i, j) .
Y
P(x,y)
y

a
yj
j
X
x xi O i

The introduction of a standard basis and the coordinates of a vector in 3-space is a


simple extension of the corresponding coordinates in the plane.

Definition 10.31 Standard Basis in Space


By a standard basis or an ordinary basis for the geometric vectors in 3-space we
understand an ordered set of three vectors (i, j, k) that satisfies:

• i, j and k all have the length 1.

• i, j and k are pairwise orthogonal.

• When i, j and k are drawn from a chosen point, and we view i and j from the endpoint of k, then i turns into j when i is turned by the angle π/2 counter-clockwise.

By a standard coordinate system in 3-space we understand a standard basis (i, j, k) together with a chosen origin O . The coordinate system is written (O, i, j, k) . By the x-axis, the y-axis and the z-axis we understand oriented number axes through the origin that are parallel to i , j and k , respectively.

k
j 1
i O Y
1

Figure: A standard coordinate system in 3-space.



Theorem 10.32 The Coordinates of a Vector


When (i, j, k) is a basis, every vector v in 3-space can be written in exactly one way
as a linear combination of i, j and k:

v = xi + yj + zk.

The coefficients x, y and z in the linear combination are called v’s coordinates with re-
spect to the basis, or in short v’s e-coordinates, and they are assembled in a coordinate
vector as follows:
x
ev =  y  .
z

Definition 10.33 The Coordinates of a Point


Let P be an arbitrary point in 3-space, and let (O, i, j, k) be a standard coordinate
system in 3-space. By the coordinates of P with respect to the coordinate system we

understand the coordinates of the position vector OP with respect to the standard
basis (i, j, k) .

P
k a
y
i j Y
O zk
xi+yj
x
Q
X

10.5 Arbitrary Bases for the Plane and Space

If two linearly independent vectors in the plane are given, it is possible to write every
other vector as a linear combination of the two given vectors. In Figure 10.3 we consider
e.g. the two linearly independent vectors a1 and a2 plus two other vectors u and v in the plane:

v u

a2

O a1

Figure 10.3: Coordinate system in the plane with basis (a1 , a2 )

We see that u = 1a1 + 2a2 and v = −2a1 + 2a2 . These linear combinations are unique
because u and v cannot be written as a linear combination of a1 and a2 using any other
coefficients than those written. Similarly, any other vector in the plane can be written as
a linear combination of a1 and a2 , and our term for this is that the two vectors span the
whole plane.

This makes it possible to generalise the concept of a basis. Instead of a standard basis
we can choose to use the set of vectors (a1 , a2 ) as a basis for the vectors in the plane.
If we call the basis a , we say that the coefficients in the linear combinations above are
coordinates for u and v, respectively, with respect to a basis a, which is written like this:
   
$${}_a u = \begin{bmatrix} 1 \\ 2 \end{bmatrix} \quad\text{and}\quad {}_a v = \begin{bmatrix} -2 \\ 2 \end{bmatrix}. \tag{10-5}$$

For the set of geometric vectors in 3-space we proceed in a similar way. Given three
linearly independent vectors, then every vector in 3-space can be written as a unique-
linear combination of the three given vectors. They span all of 3-space. Therefore we
can choose three vectors as a basis for the vectors in 3-space and express an arbitrary
vector in 3-space by coordinates with respect to this basis. A method for determination
of the coordinates is shown in Figure 10.4, where we are given an a-basis (a1 , a2 , a3 ) plus

a3
a2
O
Q
a1

Figure 10.4: Coordinate system with basis (a1 , a2 , a3 )

an arbitrary vector u. Through the endpoint P for u a line parallel to a3 is drawn, and
the point of intersection of this line and the plane that contains a1 and a2 , is denoted Q.

Two numbers k1 and k2 exist such that OQ= k1 a1 + k2 a2 because (a1 , a2 ) constitutes a
basis in the plane that contains a1 and a2 . Furthermore there exists a number k3 such
→ →
that QP= k3 a3 since QP and a3 are parallel. But then we have
→ →
u =OQ + QP= k1 a1 + k2 a2 + k3 a3 .
u thereby has the coordinate set (k1 , k2 , k3 ) with respect to basis a.

Example 10.34 Coordinates with Respect to an Arbitrary Basis

In 3-space three linearly independent vectors a1 , a2 and a3 are given as shown in the Figure.

a1
a3
O
a2

Figure: Coordinate system with basis (a1 , a2 , a3 )



Since u can be written as a linear combination of a1 , a2 and a3 in the following way

u = 3a1 + a2 + 2a3 , (10-6)

then u has the coordinates (3, 1, 2) with respect to the basis a given by (a1 , a2 , a3 ) which we
write in short as
$${}_a u = \begin{bmatrix} 3 \\ 1 \\ 2 \end{bmatrix}. \tag{10-7}$$

We gather the above considerations about arbitrary bases in the following more formal
definition:

Definition 10.35 The Coordinates of a Vector with Respect to a Basis

• By a basis a for the geometric vectors in the plane we will understand an ar-
bitrary ordered set of two linearly independent vectors (a1 , a2 ). Let an arbitrary
vector u be determined by the linear combination u = xa1 + ya2 . The coeffi-
cients x and y are called u’s coordinates with respect to the basis a, or shorter u’s
a-coordinates, and they are assembled in a coordinate vector as follows:
 
x
au = . (10-8)
y

• By a basis b for the geometric vectors in 3-space we understand an arbitrary


ordered set of three linearly independent vectors (b1 , b2 , b3 ). Let an arbitrary vector v be determined by the linear combination v = xb1 + yb2 + zb3 .
The coefficients x, y and z are called v’s coordinates with respect to the basis b,
or shorter v’s b-coordinates, and they are assembled in a coordinate vector as
follows:  
x
b v =  y . (10-9)
z

The coordinate set of a given vector will change when we change the basis. This crucial
point is the subject of the following exercise.

Exercise 10.36

j a2

O i
a1

Figure: Change of basis

In the diagram, we are given the standard basis e = (i, j) in the plane plus another basis
a = (a1 , a2 ).

1. A vector u has the coordinates (5, −1) with respect to basis e. Determine u’s
a-coordinates.

2. A vector v has the coordinates (−1, −2) with respect to basis a. Determine v’s
e-coordinates.

Exercise 10.37

c
O
a d

1. In the diagram, it is evident that a, b and c are linearly independent. A basis m is


therefore given by (a, b, c). Determine the coordinate vector m d.

2. It is also evident from the figure that (a, b, d) is a basis, let us call it n. Determine the
coordinate vector n c.

3. Draw, with the origin as the initial point, the vector u that has the m-coordinates
 
2
mu = 1  .

1

10.6 Vector Calculations Using Coordinates

When you have chosen a basis for geometric vectors in the plane (or in 3-space), then
all vectors can be described and determined using their coordinates with respect to the
chosen basis. For the two arithmetic operations, addition and multiplication by a scalar,
that were introduced previously in this eNote by geometrical construction, we get a
particularly practical alternative. Instead of geometrical constructions we can carry out
calculations with the coordinates that correspond to the chosen basis.

We illustrate this with an example in the plane with a basis a given by (a1 , a2 ) plus two

vectors u and v drawn from O, see Figure 10.5. The exercise is to determine the vector
b = 2u − v, and we will do this in two different ways.


Figure 10.5: Linear combination determined using coordinates

Method 1 (geometric): First we carry through the arithmetic operations as defined in


10.2 and 10.3, cf. the grey construction vectors in Figure 10.5.

Method 2 (algebraic): We read the coordinates for u and v and carry out the arithmetic
operations directly on the coordinates:
     
    a b = 2 a u − a v = 2 ( 1, 2 ) − ( −2, 2 ) = ( 4, 2 ) .    (10-10)

Now b can be drawn directly from its coordinates (4, 2) with respect to basis a.

That it is allowed to use this method is stated in the following theorem.



Theorem 10.38 Basic Rules for Coordinate Calculations


Two vectors u and v in the plane or in 3-space plus a real number k are given.
Moreover, an arbitrary basis a has been chosen. The two arithmetic operations u + v
and k u can then be carried out using coordinates as follows:

1. a (u + v) = a u + a v

2. a (ku) = k a u

In other words: the coordinates for a vector sum are obtained by adding the coordi-
nates for the summands. And the coordinates for a vector multiplied by a number
are the coordinates of the vector multiplied by that number.

Proof

We carry through the proof for the set of geometric vectors in 3-space. Suppose the coordi-
nates for u and v with respect to the chosen basis a are given by
   
    a u = ( u1 , u2 , u3 )   and   a v = ( v1 , v2 , v3 ) .    (10-11)
We then have
u = u1 a1 + u2 a2 + u3 a3   and   v = v1 a1 + v2 a2 + v3 a3    (10-12)
and accordingly, through the application of the commutative, associative and distributive
arithmetic rules, see Theorem 10.12,
u + v = (u1 a1 + u2 a2 + u3 a3 ) + (v1 a1 + v2 a2 + v3 a3 )
(10-13)
= (u1 + v1 )a1 + (u2 + v2 )a2 + (u3 + v3 )a3
which yields
     
    a (u + v) = ( u1 + v1 , u2 + v2 , u3 + v3 ) = ( u1 , u2 , u3 ) + ( v1 , v2 , v3 ) = a u + a v    (10-14)
so that now the first part of the proof is complete. In the second part of the proof we again
use a distributive arithmetic rule, see Theorem 10.12:

ku = k (u1 a1 + u2 a2 + u3 a3 ) = (k · u1 )a1 + (k · u2 )a2 + (k · u3 )a3 (10-15)


which yields
    a (ku) = ( k · u1 , k · u2 , k · u3 ) = k ( u1 , u2 , u3 ) = k a u    (10-16)

so that now the second part of the proof is complete.

Theorem 10.38 makes it possible to perform more complicated arithmetic operations


using coordinates, as shown in the following example.

Example 10.39 Coordinate Vectors for a Linear Combination

The three plane vectors a, b and c have the following coordinate vectors with respect to a
chosen basis v:      
    v a = ( 1, 2 ) ,   v b = ( 0, 1 )   and   v c = ( 5, −1 ) .    (10-17)
Problem: Determine the coordinate vector d = a − 2b + 3c with respect to basis v.
Solution:

vd = v (a − 2b + 3c)
= v (a + (−2)b + 3c)
= v a + v (−2b) + v (3c)
= va − 2 vb + 3 vc
       
    = ( 1, 2 ) − 2 ( 0, 1 ) + 3 ( 5, −1 ) = ( 16, −3 ) .

Here the third equality sign is obtained using the first part of Theorem 10.38 and the fourth
equality sign from the second part of that theorem.

Example 10.40 The Parametric Representation of a Plane in Coordinates


Figure: A plane in 3-space



In accordance with Example 10.25, the plane through the origin shown in the diagram has
the parametric representation

{ P | OP= su + tv ; (s, t) ∈ R2 }. (10-18)

Suppose that in 3-space we are given a basis a and that


   
    a u = ( u1 , u2 , u3 )   and   a v = ( v1 , v2 , v3 ) .

The parametric representation (10-18) can then be written in coordinate form like this:
     
    ( x, y, z ) = s ( u1 , u2 , u3 ) + t ( v1 , v2 , v3 )    (10-19)

where a OP= ( x, y, z) and (s, t) ∈ R2 .

Example 10.41 The Parametric Representation of a Plane in Coordinates

Figure: A plane through the point B spanned by the vectors u and v

In accordance with Example 10.26, the plane shown in the diagram has the
parametric representation

{ P | OP= b + su + tv ; (s, t) ∈ R2 }. (10-20)
Suppose that in 3-space we are given a basis a and that
     
    a b = ( b1 , b2 , b3 ) ,   a u = ( u1 , u2 , u3 )   and   a v = ( v1 , v2 , v3 ) .

The parametric representation (10-20) can then be written in coordinate form like this:

    ( x, y, z ) = ( b1 , b2 , b3 ) + s ( u1 , u2 , u3 ) + t ( v1 , v2 , v3 )    (10-21)

where a OP= ( x, y, z) and (s, t) ∈ R2 .

10.7 Vector Equations and Matrix Algebra

A large number of vector-related problems are best solved by resorting to vector equa-
tions. If we wish to solve these equations using the vector coordinates in a given ba-
sis, we get systems of linear equations. The problems can then be solved using matrix
methods that follow in eNote 6. This subsection gives examples of this and sums up
this approach by introducing the coordinate matrix concept in the final Exercise 10.45.

Example 10.42 Whether a Vector is a Linear Combination of Other Vectors

In 3-space are given a basis a and three vectors u, v and p which have the coordinates with
respect to the basis a given by:
     
    a u = ( 2, 1, 5 ) ,   a v = ( 1, 4, 3 )   and   a p = ( 0, 7, 1 ) .
Problem: Investigate whether p is a linear combination of u and v.
Solution: We will investigate whether we can find coefficients k1 , k2 , such that
k1 u + k2 v = p .
We arrange the corresponding coordinate vector equation
     
    k1 ( 2, 1, 5 ) + k2 ( 1, 4, 3 ) = ( 0, 7, 1 )
which is equivalent to the following system of equations
2k1 + k2 = 0
k1 + 4k2 = 7 (10-22)
5k1 + 3k2 = 1

We consider the augmented matrix T for the system of equations and give (without details)
the reduced row echelon form of the matrix:

         [ 2  1  0 ]                   [ 1  0  −1 ]
    T =  [ 1  4  7 ]   →   rref(T) =   [ 0  1   2 ]    (10-23)
         [ 5  3  1 ]                   [ 0  0   0 ]

We see that the system of equations has exactly one solution, k1 = −1 and k2 = 2, meaning
that
−1u + 2v = p .
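The row reduction can also be carried out by machine. The following small Python sketch
(using the SymPy library; the numbers are simply those of this example) reproduces it:

    from sympy import Matrix

    T = Matrix([[2, 1, 0],
                [1, 4, 7],
                [5, 3, 1]])       # augmented matrix of the system (10-22)
    print(T.rref()[0])            # reduced row echelon form, cf. (10-23)
    # The last column gives k1 = -1 and k2 = 2, so p = -1*u + 2*v.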

Example 10.43 Whether a Set of Vectors is Linearly Dependent

In 3-space are given a basis v and three vectors a, b and c which with respect to this basis
have the coordinates      
    v a = ( 5, 1, 3 ) ,   v b = ( 1, 0, 4 )   and   v c = ( 2, 3, 1 ) .
Problem: Investigate whether the set of vectors (a, b, c) is linearly dependent.
Solution: Following theorem 10.23 we can investigate whether there exists a proper linear
combination
k1 a + k2 b + k3 c = 0 .
We look at the corresponding coordinate vector equation
       
    k1 ( 5, 1, 3 ) + k2 ( 1, 0, 4 ) + k3 ( 2, 3, 1 ) = ( 0, 0, 0 )

that is equivalent to the following homogeneous system of linear equations

5k1 + k2 + 2k3 = 0
k1 + 3k3 = 0 (10-24)
3k1 + 4k2 + k3 = 0

We arrange the augmented matrix T of the system of equations and give (without details) the
reduced row echelon form of the matrix:
   
         [ 5  1  2  0 ]                   [ 1  0  0  0 ]
    T =  [ 1  0  3  0 ]   →   rref(T) =   [ 0  1  0  0 ]    (10-25)
         [ 3  4  1  0 ]                   [ 0  0  1  0 ]

We see that the system of equations only has the zero solution k1 = 0, k2 = 0 and k3 = 0. The
set of vectors (a, b, c) is therefore linearly independent. Therefore you may choose (a, b, c) as
a new basis for the set of vectors in 3-space.

In the following example we continue the discussion of the relation between coordinates
and change of basis from Exercise 10.36.

Example 10.44 The New Coordinates after Change of Basis


Figure: Change of basis

In the diagram we are given a standard basis e= (i, j) and another basis a= (a1 , a2 ). When the
basis is changed, the coordinates of any given vector are changed. Here we give a systematic
method for expressing the change in coordinates using a matrix-vector product. First we read
the e-coordinates of the vectors in basis a:
   
    e a1 = ( 1, −2 )   and   e a2 = ( 1, 1 ) .    (10-26)

 
1. Problem: Suppose a vector v has the set of coordinates a v = ( v1 , v2 ) . Determine the
e-coordinates of v.

Solution: We have that v = v1 a1 + v2 a2 and therefore following Theorem 10.38:


      
    e v = v1 ( 1, −2 ) + v2 ( 1, 1 ) = [ 1   1 ] [ v1 ]
                                       [ −2  1 ] [ v2 ]

If we put

         [ 1   1 ]
    M =  [ −2  1 ] ,

we express v’s e-coordinates by the matrix-vector product

ev = M · av (10-27)
 
2. Problem: Suppose a vector v has the set of coordinates e v = ( v1 , v2 ) . Determine the
a-coordinates of v.

Solution: We multiply from the left on both sides of (10-27) with the inverse matrix of M
and get the a-coordinates of v expressed by the matrix-vector product:

av = M−1 · e v (10-28)
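As a purely numerical illustration, the two matrix-vector products (10-27) and (10-28) can be
sketched in Python with NumPy; the a-coordinates chosen below are only an arbitrary example:

    import numpy as np

    M = np.array([[1.0, 1.0],
                  [-2.0, 1.0]])          # columns: the e-coordinates of a1 and a2
    a_v = np.array([3.0, 2.0])           # an arbitrary example of a-coordinates of v
    e_v = M @ a_v                        # e-coordinates of v, cf. (10-27)
    a_v_again = np.linalg.solve(M, e_v)  # a-coordinates recovered, cf. (10-28)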

Exercise 10.45

By a coordinate matrix with respect to a given basis a for a set of vectors we mean the matrix
that is formed by combining the vectors’ a-coordinate columns to form a matrix.
Describe the matrix T in Examples 10.42 and 10.43 and the matrix M in Example 10.44 as
coordinate matrices.

10.8 Theorems about Vectors in a Standard Basis

In this subsection we work with standard coordinate systems, both in the plane and
in 3-space. We introduce two different multiplications between vectors, the dot product
which is defined both in the plane and in 3-space, and the cross product that is only
defined in 3-space. We look at geometric applications of these types of multiplication
and at geometrical interpretations of determinants.

10.8.1 The Dot Product of two Vectors

Definition 10.46 The Dot Product in the Plane


   
In the plane are given two vectors e a = ( a1 , a2 ) and e b = ( b1 , b2 ) . By the dot product (or
the scalar product) of a and b we refer to the number

a · b = a1 · b1 + a2 · b2 . (10-29)

Definition 10.47 The Dot Product in Space


   
In 3-space are given two vectors e a = ( a1 , a2 , a3 ) and e b = ( b1 , b2 , b3 ) . By the dot product (or
the scalar product) of a and b we understand the number

a · b = a1 · b1 + a2 · b2 + a3 · b3 . (10-30)

For the dot product between two vectors the following rules of calculation apply.

Theorem 10.48 Arithmetic Rules for the Dot Product


Given three vectors a, b and c in the plane or in 3-space and the number k. Observe:

1. a · b = b · a (commutative rule)

2. a · (b + c) = a · b + a · c (distributive rule)

3. (ka) · b = a · (kb) = k (a · b)

4. a · a = |a|2

5. |a + b|2 = |a|2 + |b|2 + 2a · b.

Proof

The Rules 1, 2, 3 follow from a simple coordinate calculation. Rule 4 follows from the
Pythagorean Theorem, and Rule 5 is a direct consequence of Rules 1, 2 and 4.

In the following three theorems we look at geometric applications of the dot product.

Theorem 10.49 The Length of a Vector


Let v be an arbitrary vector in the plane or in 3-space. The length of v satisfies

    |v| = √( v · v ) .    (10-31)

Proof

The theorem follows immediately from the arithmetic Rule 4 in 10.48




Figure 10.6: Angle between two vectors

Example 10.50 Length of a Vector

Given the vector v in 3-space and e v = (1, 2, 3). We then have


    |v| = √( 1² + 2² + 3² ) = √14 .

The following fact concerns the angle between two vectors, see Figure 10.6.

Theorem 10.51 The Angle between Vectors


In the plane or 3-space we are given two proper vectors a and b. The angle v between
a and b satisfies
    cos(v) = (a · b) / ( |a| |b| )    (10-32)

Proof

The theorem can be proved using the cosine relation. In carrying out the proof one needs
Rule 5 in theorem 10.48. The details are left for the reader.

From this theorem it follows directly:



Corollary 10.52 The Size of Angles


Consider the situation in Figure 10.6. We see

1. a · b = 0 ⇔ angle(a, b) = π/2

2. a · b > 0 ⇔ angle(a, b) < π/2

3. a · b < 0 ⇔ angle(a, b) > π/2
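A small numerical check of Theorem 10.51 and Corollary 10.52 can be sketched in Python
with NumPy (the two vectors are just an arbitrary example):

    import numpy as np

    a = np.array([1.0, 2.0, 2.0])
    b = np.array([2.0, 0.0, 0.0])
    cos_v = (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))
    v = np.arccos(cos_v)          # the angle between a and b in radians
    # a @ b > 0 here, so the angle is smaller than pi/2, cf. Corollary 10.52.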

The following theorems are dedicated to orthogonal projections. In Figure 10.7 two
vectors a and b in the plane or 3-space are drawn from the origin.


Figure 10.7: Orthogonal projection

Consider P, the foot of the perpendicular from b’s endpoint to the line containing a. By

the orthogonal projection of b onto a we mean the vector OP, denoted proj(b, a).

Theorem 10.53 The Length of a Projection


Given two proper vectors a and b in the plane or 3-space. The length of the orthog-
onal projection of b onto a is:

    |proj(b, a)| = |a · b| / |a|    (10-33)

Proof

Using a known theorem about right angled triangles plus Theorem 10.51 we get

    |proj(b, a)| = | cos(v)| |b| = |a · b| / |a| .

Theorem 10.54 Formula for the Projection Vector


Given two proper vectors a and b in the plane or 3-space. The orthogonal projection
of b on a is:
    proj(b, a) = ( (a · b) / |a|² ) a .    (10-34)

Proof

If a and b are orthogonal the theorem is true since the projection in that case is the zero
vector. Otherwise, let sign(a · b) denote the sign of a · b. We have that sign(a · b) is positive
exactly when a and proj(b, a) have the same direction and negative exactly when they have
the opposite direction. Therefore we get

    proj(b, a) = sign(a · b) · |proj(b, a)| · ( a / |a| ) = ( (a · b) / |a|² ) a ,
where we have used Theorem 10.53, and the fact that a / |a| is a unit vector pointing in the
direction of a.
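Formula (10-34) is easy to evaluate numerically; a minimal Python/NumPy sketch with two
arbitrarily chosen vectors could look like this:

    import numpy as np

    a = np.array([2.0, 1.0, 0.0])
    b = np.array([1.0, 3.0, 5.0])
    proj_b_on_a = (a @ b) / (a @ a) * a   # proj(b, a) = (a.b / |a|^2) a
    print(proj_b_on_a)                    # [2. 1. 0.] for these two vectors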



Figure 10.8: A triangle spanned by two vectors in the plane

10.8.2 Geometric Interpretation of the Determinant of a 2 × 2 Matrix

A triangle △ = △( p, a, b) is determined by two vectors drawn from a common initial
point, see the triangle △ = △( p, a, b) in Figure 10.8.

The area of a triangle is known to be half the base times its height. We can choose the
length |a| of a as the base. And the height in the triangle is

    |b| sin(θ) = |b · â| / |â| ,    (10-35)

where θ is the angle between the two vectors a and b, and where â denotes the hat
vector in the plane to a, that is, in coordinates we have â = (− a2 , a1 ). Hence the area is:

    Area(△( p, a, b)) = (1/2) |b · â|
                      = (1/2) | a1 b2 − a2 b1 |
                      = (1/2) | det [ a1  b1 ]    (10-36)
                                    [ a2  b2 ] |
                      = (1/2) | det ( [a b] ) | .

Thus we have proven the theorem:

Theorem 10.55 Area of a Triangle as a Determinant


The area of the triangle △( p, a, b) is the absolute value of half the determinant of
the 2 × 2 matrix that is obtained by insertion of a and b as columns in the matrix.
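A quick numerical illustration of Theorem 10.55 (the two plane vectors are only an example):

    import numpy as np

    a = np.array([3.0, 1.0])
    b = np.array([1.0, 2.0])
    area = 0.5 * abs(np.linalg.det(np.column_stack([a, b])))
    print(area)                   # 2.5 for these two vectors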

10.8.3 The Cross Product and the Scalar Triple Product

The cross product of two vectors and the scalar triple product of three vectors are intro-
duced using determinants:

Definition 10.56 Cross Product


   
In 3-space two vectors are given, e a = ( a1 , a2 , a3 ) and e b = ( b1 , b2 , b3 ) .
By the cross product (or the vector product) a × b of a and b is understood the vector v
given by  
            ( det [ a2  b2 ]  )
            (     [ a3  b3 ]  )
            (                 )
    e v  =  ( det [ a3  b3 ]  )    (10-37)
            (     [ a1  b1 ]  )
            (                 )
            ( det [ a1  b1 ]  )
            (     [ a2  b2 ]  )

The cross product has a geometric significance. Consider Figure 10.9 and the following
theorem:


Figure 10.9: Geometry of the cross-product

Theorem 10.57 The Area of a Triangle by the Cross Product


For two linearly independent vectors a and b that form the angle v with each other,
the cross product a × b satisfies

1. a × b is orthogonal to both a and b .

2. |a × b| = 2 · Area(△( p, a, b)) .

3. The vector set (a, b, a × b) follows the right hand rule: seen from the tip of a × b
the direction from a to b is counter-clockwise.
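The first two parts of the theorem can be checked numerically, e.g. with the following
Python/NumPy sketch (the two vectors are only an example):

    import numpy as np

    a = np.array([1.0, 0.0, 2.0])
    b = np.array([0.0, 3.0, 1.0])
    n = np.cross(a, b)              # the cross product a x b
    print(n @ a, n @ b)             # both are 0: n is orthogonal to a and b
    print(0.5 * np.linalg.norm(n))  # the area of the triangle spanned by a and b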

Definition 10.58 Scalar Triple Product


     
The scalar triple product [a, b, c] of the vectors e a = ( a1 , a2 , a3 ) , e b = ( b1 , b2 , b3 ) and e c = ( c1 , c2 , c3 )
is defined by:

    [a, b, c] = (a × b) · c
              = c1 ( a2 b3 − a3 b2 ) + c2 ( a3 b1 − a1 b3 ) + c3 ( a1 b2 − a2 b1 )
                     [ a1  b1  c1 ]
              = det  [ a2  b2  c2 ]    (10-38)
                     [ a3  b3  c3 ]
              = det ( [ e a  e b  e c ] ) .

10.8.4 Geometric Interpretation of the Determinant of a 3 × 3 Matrix

From elementary Euclidean space geometry we know that the volume of a tetrahedron
is one third of the area of the base times the height. Consider the tetrahedron T = T( p, a, b, c)
spanned by the vectors a, b and c drawn from the point p, in Figure 10.10. The area of the
base, △( p, a, b), has been determined in the second part of Theorem 10.57.

Figure 10.10: A tetrahedron spanned by three vectors in 3-space

The height can then be determined as the scalar product of the third edge vector c with
a unit vector, perpendicular to the base triangle.

But a × b is exactly perpendicular to the base triangle (because the cross product is
perpendicular to the edge vectors of the base triangle, see part 1 of Theorem 10.57), so

we use this:
    Vol(T( p, a, b, c)) = (1/3) · Area(△( p, a, b)) · | (a × b) · c | / |a × b|    (10-39)
                        = (1/6) | (a × b) · c |

where we have used part 2 of Theorem 10.57.

By comparing this to the definition of scalar triple product, see 10.58, we now get the
volume of a tetrahedron written in ’determinant-form’:

Theorem 10.59 Volume of a Tetrahedron as a Scalar Triple Product


The volume of the tetrahedron T = T( p, a, b, c) is:

    Vol(T( p, a, b, c)) = (1/6) | det ( [a b c] ) | .    (10-40)

A tetrahedron has the volume 0, that is, it is collapsed, exactly when the determinant in
(10-40) is 0, and this occurs exactly when one of the vectors can be written as a linear
combination of the other two (why is that?).
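A small Python/NumPy sketch of formula (10-40), with three arbitrarily chosen edge vectors:

    import numpy as np

    a = np.array([1.0, 0.0, 0.0])
    b = np.array([0.0, 2.0, 0.0])
    c = np.array([0.0, 0.0, 3.0])
    vol = abs(np.linalg.det(np.column_stack([a, b, c]))) / 6.0
    print(vol)                    # 1.0 for these three vectors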

Definition 10.60 Regular Tetrahedron


A regular tetrahedron is a tetrahedron that has a proper volume, that is, a volume that is
strictly greater than 0.

Exercise 10.61

Let A denote a (2 × 2)-matrix with the column vectors a and b:

A = [a b] . (10-41)

Show that the determinant of A is 0 if and only if the column vectors a and b are linearly
dependent in R2 .

Exercise 10.62

Let A denote a (3 × 3)−matrix with the column vectors a, b, and c:

A = [a b c] . (10-42)

Show that the determinant of A is 0 if and only if the column vectors a, b and c constitute a
linearly dependent set of vectors in R3 .

Exercise 10.63

Use the geometric interpretations of the determinant above to show the following
Hadamard’s inequality for (2 × 2)−matrices and for (3 × 3)−matrices (in fact the inequality
is true for all square matrices):
    (det(A))² ≤ ∏_{j=1}^{n} ( ∑_{i=1}^{n} a_{ij}² ) .    (10-43)

When is the equality sign valid in (10-43)?



eNote 11

General Vector Spaces

In this eNote a general theory is presented for all mathematical sets where addition and
multiplication by a scalar are defined and which satisfy the same arithmetic rules as geometric
vectors in the plane and in 3-space. Using the concepts of a basis and coordinates, it is shown
how one can simplify and standardize the solution of problems that are common to all these sets,
which are called vector spaces. Knowledge of eNote 10 about geometric vectors is an advantage
as is knowledge about the solution sets for systems of linear equations, see eNote 6. Finally,
elementary matrix algebra and a couple of important results about determinants are required
(see eNotes 7 and 8).

Updated: 4.10.21 David Brander

11.1 Generalization of the Concept of a Vector

The vector concept originates in the geometry of the plane and space where it denotes a
pair consisting of a length and a direction. Vectors can be represented by a line segment
with orientation (an arrow) following which it is possible to define two geometric op-
erations: addition of vectors and multiplication of vectors by numbers (scalars). For use
in more complicated calculations one proves eight arithmetic rules concerning these two
operations.

In many other sets of mathematical objects one has a need for defining addition of the
objects and multiplication of an object by a scalar. The number spaces Rn and Cn and
the set of matrices Rm×n are good examples, see eNote 5 and eNote 6, respectively. The
remarkable thing is, that the arithmetic rules for addition and multiplication by a scalar,

that are possible to prove within each of these sets, are the same as the arithmetic rules
for geometric vectors in the plane and in space! Therefore one says: Let us make a theory
that applies to all the sets where addition and multiplication by a scalar can be defined
and where all the eight arithmetic rules known from geometry are valid. By this one
carries out a generalization of the concept of geometric vectors, and every set that obeys
the conditions of the theory is therefore called a vector space.

In eNote 10 about geometric vectors it is demonstrated how one can introduce a basis
for the vectors following which the vectors are determined by their coordinates with re-
spect to this basis. The advantage of this is that one can replace the geometric vector
calculation by calculation with the coordinates for the vector. It turns out that it is also
possible to transfer the concepts of basis and coordinates to many other sets of mathe-
matical objects that have addition and multiplication by a scalar.

In the following, when we investigate vector spaces in the abstract sense, it means that
we look at which concepts, theorems and methods follow from the common arithmetic
rules, while we ignore the concrete meaning that addition and multiplication by a scalar have
in the sets of concrete objects where they are introduced. By this one obtains general
methods for every set of the kind described above. The application in any particular vec-
tor space then calls for interpretation in the context of the results obtained. The approach
is called the axiomatic method. Concerning all this we now give the abstract definition of
vector spaces.

Definition 11.1 Vector Spaces


Let L denote R or C , and let V be a set of mathematical elements where two arithmetic
operations are defined:

I. Addition that from two elements a and b in V forms the sum a + b that also
belongs to V.

II. Multiplication by a scalar that from any a ∈ V and any scalar k ∈ L forms a
product ka or ak that also belongs to V.

V is called a vector space and the elements of V vectors if the following eight
arithmetic rules are valid:
1. a+b = b+a Addition is commutative
2. (a + b) + c = a + (b + c) Addition is associative
3. a+0 = a In V there exists 0 that is neutral wrt. addition
4. a + (−a) = 0 For every a ∈ V there is an opposite object −a ∈ V
5. k 1 ( k 2 a) = ( k 1 k 2 )a Product by a scalar is associative

6. ( k 1 + k 2 )a = k 1 a + k 2 a
The distributive rule applies
7. k 1 (a + b) = k 1 a + k 1 b
8. 1a = a The scalar 1 is neutral in products with vectors

If L in the definition 11.1 stands for R then V is a vector space over the real
numbers. This means that the scalar k (only) can be an arbitrary real number.
Similarly one talks about V as a vector space over the complex numbers if L
stands for C , where k can be an arbitrary complex number.

The requirements I and II in the definition 11.1, that the results of addition and
of multiplication by a scalar in itself must be an element in V, are called the
stability requirements. In other words V must be stable with respect to the
two arithmetic operations.

The set of geometric vectors in the plane and the set of geometric vectors in space are
naturally the most obvious examples of vector spaces, since the eight arithmetic rules in
the definition 11.1 are constructed from the corresponding rules for geometric vectors.
But let us check the stability requirements. Is the sum of two vectors in the plane itself
a vector in the plane? And is a vector in the plane multiplied by a number in itself a

vector in the plane? From the definition of the two arithmetic operations (see Definition
10.2 and Definition 10.3), the answer is obviously yes; therefore the set of vectors in the
plane is a vector space. Similarly we see that the set of vectors in 3-space is a vector
space.

Theorem 11.2 Uniqueness of the 0-Vector and the Opposite Vector


For every vector space V:

1. V only contains one neutral element with respect to addition.


2. Every vector a ∈ V has only one opposite element.

Proof

First part:
Let 01 and 02 be two elements in V both neutral with respect to addition. Then:

01 = 01 + 02 = 02 + 01 = 02 ,

where we have used the fact that addition is commutative. There is only one 0-vector: 0 .

Second part:
Let a1 , a2 ∈ V be two opposite elements for a ∈ V . Then:

a1 = a1 + 0 = a1 + ( a + a2 ) = ( a + a1 ) + a2 = 0 + a2 = a2 ,

where we have used the fact that addition is both commutative and associative. Hence there
is for a only one opposite vector −a .

Definition 11.3 Subtraction


Let V be a vector space, and let a, b ∈ V . By the difference a − b we understand the
vector
a − b = a + (−b) . (11-1)

Exercise 11.4

Prove that (−1)a = −a .

Exercise 11.5 Zero-Rule

Prove that the following variant of the zero-rule applies to any vector space:

ka = 0 ⇔ k = 0 or a = 0 . (11-2)

Example 11.6 Matrices as Vectors

For two arbitrary natural numbers m and n, Rm×n (that is, the set of real m × n-matrices) is a
vector space. Similarly Cm×n (that is, the set of complex m × n-matrices) is a vector space.

Consider e.g. R2×3 . If we add two matrices of this type we get a new matrix of the same type,
and if we multiply a 2 × 3-matrix by a number, we also get a new 2 × 3-matrix (see Definition
7.1). Thus the stability requirements are satisfied. That R2×3 in addition satisfies the eight
arithmetic rules, is apparent from Theorem 7.3.

Exercise 11.7

Explain that for every natural number n the number space Ln is a vector space. Remember
to think about the case n = 1 !

In the following two examples we shall see that the geometrically inspired vector space
theory, surprisingly, can be applied to well-known sets of functions. Historians of mathematics
have in this connection talked about the geometrization of mathematical analysis!

Example 11.8 Polynomials as Vectors

The set of polynomials P : R → R of at most n’th degree is denoted Pn (R). An element P in


Pn (R) is given by
P ( x ) = a0 + a1 x + · · · + a n x n (11-3)

where the coefficients a0 , a1 , · · · an are arbitrary real numbers. Addition of two polynomials
in Pn (R) is defined by pairwise addition of coefficients belonging to the same degree of the
variable, and multiplication of a polynomial in Pn (R) by a number k is defined as the multi-
plication of every coefficient with k. As an example of the two arithmetic operations we look
at two polynomials from P3 (R):

P( x ) = 1 − 2x + x3 = 1 − 2x + 0x2 + 1x3

and
Q( x ) = 2 + 2x − 4x2 = 2 + 2x − 4x2 + 0x3 .

By the sum of P and Q we understand the polynomial R = P + Q given by

R( x ) = (1 + 2) + (−2 + 2) x + (0 − 4) x2 + (1 + 0) x3 = 3 − 4x2 + x3

and by the multiplication of P by the scalar k = 3 we understand the polynomial S = 3P


given by
S( x ) = (3 · 1) + (3 · (−2)) x + (3 · 0) x2 + (3 · 1) x3 = 3 − 6x + 3x3 .
We will now justify that Pn (R) with the introduced arithmetic operations is a vector space!
That Pn (R) satisfies the stability requirements follows from the fact that the sum of two poly-
nomials of at most n’th degree in itself is a polynomial of at most n’th degree, and that multi-
plication of a polynomial of at most n’th degree by a real number again gives a polynomial of
at most n’th degree. The conditions 1, 2 and 5 - 8 in the definition 11.1 are satisfied, because
the same rules of operation apply to the calculations with coefficients of the polynomials that
are used in the definition of the operations. Finally the conditions 3 and 4 are satisfied since
the polynomial
P( x ) = 0 + 0x + · · · 0x n = 0
constitutes the zero vector, and the opposite vector to P( x ) ∈ Pn (R) is given by the polynomial

− P ( x ) = − a0 − a1 x − · · · − a n x n .

In the same way we show that the set of polynomials P : C → C of at most n’th degree, which
we denote by Pn (C) , is a vector space.

Exercise 11.9

Explain that P(R), that is the set of real polynomials, is a vector space.

Example 11.10 Continuous Functions as Vectors

The set of continuous real functions on a given interval I ⊆ R is denoted C0 ( I ). Addition


m = f + g of two functions f and g in C0 ( I ) is defined by

m( x ) = ( f + g)( x ) = f ( x ) + g( x ) for every x ∈ I

and multiplication n = k · f of the function f by a real number k by

n( x ) = (k · f )( x ) = k · f ( x ) for every x ∈ I .

We will now justify that C0 ( I ), with the introduced operations of calculations, is a vector
space. Since f + g and k · f are continuous functions, we see that C0 ( I ) satisfies the two
stability requirements. Moreover: there exists a function that acts as the zero vector, viz. the
zero function, that is, the function that has the value 0 for all x ∈ I, and the opposite vector
to f ∈ C0 ( I ) is the vector (−1) f that is also written − f , and which for all x ∈ I has the value
− f ( x ). Now it is obvious that C0 ( I ) with the introduced operations of calculation satisfies
all eight rules in definition 11.1, and C0 ( I ) is thus a vector space.

11.2 Linear Combinations and Span

A consequence of arithmetic rules such as u + v = v + u and (u + v) + w = u + (v + w)


from the definition 11.1 is that one can omit parentheses when one adds a series of
vectors: the order of vector addition has no influence on the resulting sum vector. This
is the background for linear combinations where a set of vectors is multiplied by a scalar
and thereafter written as a sum.

Definition 11.11 Linear Combination


When p vectors v1 , v2 , . . . , v p are given in a vector space V, and arbitrary scalars
k1 , k2 , . . . , k p are chosen, then the sum

k1 v1 + k2 v2 + . . . + k p v p

is called a linear combination of the p given vectors.

If all the k1 , · · · , k p are equal to 0, the linear combination is called improper, or trivial,
but if at least one of the scalars is different from 0, it is called proper or non-trivial.

In the definition 11.11 only one linear combination is mentioned. In many circumstances
it is of interest to consider the total set of possible linear combinations of given vectors.
The set is called the span of the vectors. Consider e.g. a plane in space, through the
origin and containing the position vectors for two non-parallel vectors u and v. The
plane can be considered the span of the two vectors since the position vectors

OP= k1 u + k2 v

”run through” all points P in the plane when k1 and k2 take on all conceivable real
values, see Figure 11.1.


Figure 11.1: u and v span a plane in space

Definition 11.12 Span


By the span of a given set of vectors v1 , v2 , . . . , v p in a vector space V we understand
the total set of all possible linear combinations of the vectors. The span of the p
vectors is denoted by
span{v1 , v2 , . . . , v p } .

Example 11.13 Linear Combination and Span

We consider in the vector space R2×3 the three matrices/vectors


     
    A = [ 1  0  3 ] ,   B = [ 2  1  0 ]   and   C = [ −1  −2  9 ] .    (11-4)
        [ 0  2  2 ]         [ 0  3  1 ]             [  0   0  4 ]

An example of a linear combination of the three vectors is


 
    2A + 0 B + (−1)C = [ 3  2  −3 ] .    (11-5)
                       [ 0  4   0 ]

We can then write

    [ 3  2  −3 ]
    [ 0  4   0 ]   ∈   span{A, B, C} .    (11-6)
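The matrix arithmetic in (11-5) is also easy to check by machine; a small Python/NumPy
sketch simply re-using the matrices above:

    import numpy as np

    A = np.array([[1, 0, 3], [0, 2, 2]])
    B = np.array([[2, 1, 0], [0, 3, 1]])
    C = np.array([[-1, -2, 9], [0, 0, 4]])
    print(2*A + 0*B - C)          # [[ 3  2 -3]
                                  #  [ 0  4  0]], cf. (11-5)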

11.3 Linear Dependence and Linear Independence

Two geometric vectors u and v are linearly dependent if they are parallel, that is if there
exists a number k such that v = ku. More generally an arbitrary set of vectors is
linearly dependent if one of the vectors is a linear combination of the others. We wish
to transfer this concept to vector space theory:

Definition 11.14 Linear Dependence and Independence


A set consisting of p vectors {v1 , v2 , . . . , v p } in a vector space V is linearly dependent
if at least one of the vectors can be written as a linear combination of the others: for
example
v1 = k2 v2 + k3 v3 + · · · + k p v p .
If none of the vectors can be written as a linear combination of the others, the set is
called linearly independent.

NB: If the set of vectors only consists of a single vector, the set is called linearly
dependent if it consists of the 0-vector, and otherwise linearly independent.

Example 11.15 Linear Dependence

Any set of vectors containing the zero vector is linearly dependent! Consider e.g. the set
{u, v, 0, w}; here the zero vector can trivially be written as a linear combination of the three
other vectors in the set:

    0 = 0u + 0v + 0w .

Example 11.16 Linear Dependence

Consider in the vector space R2×3 the three matrices/vectors


     
    A = [ 1  0  3 ] ,   B = [ 2  1  0 ]   and   C = [ −1  −2  9 ] .    (11-7)
        [ 0  2  2 ]         [ 0  3  1 ]             [  0   0  4 ]

C can be written as a linear combination of A and B since

C = 3A − 2B .

Therefore A, B and C are linearly dependent.

In contrast the set consisting of A and B is linearly independent, because these two vectors
are not ”parallel”, since a number k obviously does not exist such that B = kA. Similarly with
the sets {A, C} and {B, C} .

When you investigate whether a set of vectors is linearly dependent, use of Definition
11.14 provokes the question of which of the vectors is a linear combination of the others.
Where should we begin the investigation? The dilemma can be avoided if, bypassing the
definition, we instead use the following theorem:

Theorem 11.17 Linear Dependence and Independence


A set of vectors {v1 , v2 , . . . , v p } in a vector space V is linearly dependent if and only
if the zero vector can be written as a proper linear combination of the vectors – that is,
if and only if scalars k1 , k2 , . . . , k p exist that are not all equal to 0, and that satisfy

k1 v1 + k2 v2 + · · · + k p v p = 0 . (11-8)

Otherwise the vectors are linearly independent.

Proof

Assume first that {v1 , v2 , . . . , v p } are linearly dependent, then one can be written as a linear
combination of the others, e.g.

v1 = k 2 v2 + k 3 v3 + · · · + k p v p . (11-9)

But this is equivalent to


v1 − k 2 v2 − k 3 v3 − · · · − k p v p = 0 , (11-10)
whereby the zero-vector is written as a linear combination of the vector set in which at least
one of the coefficients is not 0, since v1 has the coefficient 1.

Conversely, assume that the zero-vector is written as a proper linear combination of the set
of vectors, where one of the coefficients, for example the v1 coefficient k1 , is different from 0
(the same argument works for any other coefficient). Then we have

    k1 v1 + k2 v2 + · · · + k p v p = 0   ⇔   v1 = (−k2 /k1 ) v2 + · · · + (−k p /k1 ) v p .    (11-11)
Thus v1 is written as a linear combination of the other vectors and the proof is complete.

Example 11.18 Linear Dependence

In the number space R4 the vectors a = (1, 3, 0, 2), b = (−1, 9, 0, 4) and c = (2, 0, 0, 1) are
given. Since
3a − b − 2c = 0
the zero vector is written as a non-trivial linear combination of the three vectors. Thus they
are linearly dependent.
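The relation can of course be verified directly; a minimal Python/NumPy check of this example:

    import numpy as np

    a = np.array([1, 3, 0, 2])
    b = np.array([-1, 9, 0, 4])
    c = np.array([2, 0, 0, 1])
    print(3*a - b - 2*c)          # the zero vector, so (a, b, c) is linearly dependent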

11.4 Basis and Dimension of a Vector Space

A compelling argument for the introduction of a basis in a vector space is that all vectors
in the vector space then can be written using coordinates. In a later section it is shown
how problems of calculation can be simplified and standardized with vectors when we
use coordinates. But in this section we will discuss the requirements that a basis should
satisfy and investigate the consequences of these requirements.

A basis for a vector space consists of certain number of vectors, usually written in a
definite order. A decisive task for the basis vectors is that they should span the vector
space, but more precisely we want this task to be done with as few vectors as possible.
In this case it turns out that all vectors in the vector space can be written uniquely as a
linear combination of the basis vectors. And it is exactly the coefficients in the unique
linear combination we will use as coordinates.

Let us start out from some characteristic properties about bases for geometric vectors in
the plane.


Figure 11.2: Coordinate system in the plane with the basis (a1 , a2 )

Consider the vector set {a1 , a2 , a3 } in Figure 11.2. There is no doubt that any other
vector in the plane can be written as a linear combination of the three vectors. But the
linear combination is not unique, for example the vector v can be written in these two
ways:

v = 2a1 + 3a2 − 1a3


v = 1a1 + 2a2 + 0a3 .

The problem is that the a-vectors are not linearly independent, for example a3 = −a1 −
a2 . But if we remove one of the vectors, e.g. a3 , the set is linearly independent, and
there is only one way of writing the linear combination

v = 1a1 + 2a2 .

We can summarize the characteristic properties of a basis for the geometric vectors in
the plane thus:

1. any basis must consist of linearly independent vectors,

2. any basis must contain exactly two vectors (if there are more than two, they are
linearly dependent; if there are fewer than two they do not span the plane), and

3. every set consisting of two linearly independent vectors is a basis.

These properties can be transferred to other vector spaces. We embark on this now, and
we start by the general definition of a basis.

Definition 11.19 Basis


By a basis for a vector space V we understand a set {v1 , v2 , . . . , vn } of vectors from
V that satisfy:

1. {v1 , v2 , . . . , vn } spans V .

2. {v1 , v2 , . . . , vn } is linearly independent.

When we discuss coordinates later, it will be necessary to consider the basis elements
to have a definite order, and so we will write them as an ordered set, denoted by using
parentheses: (v1 , v2 , . . . , vn ).

Here we should stop and check that the definition 11.19 does in fact satisfy our require-
ments of uniqueness of a basis. This is established in the following theorem.

Theorem 11.20 Uniqueness Theorem


If a basis for a vector space V is given, any vector in V can then be written as a unique
linear combination of the basis vectors.

Proof

We give the idea in the proof by looking at a vector space V that has a basis consisting of
three basis vectors (a, b, c) and assume that v is an arbitrary vector in V that in two ways
can be written as a linear combination of the basis vectors. We can then write two equations

v = k1 a + k2 b + k3 c
(11-12)
v = k4 a + k5 b + k6 c

By subtracting the lower equation from the upper equation in (11-12) we get the equation

0 = ( k 1 − k 4 )a + ( k 2 − k 5 )b + ( k 3 − k 6 )c . (11-13)

Since a, b and c are linearly independent, the zero vector can only be written as an improper
linear combination of these; therefore all coefficients in (11-13) are equal to 0, yielding k1 = k4 ,
k2 = k5 and k3 = k6 . But then the two ways in which v has been written as a linear combination
of the basis vectors are in reality the same; there is only one way!

This reasoning is immediately extendable to a basis consisting of an arbitrary number of basis


vectors.

We now return to the fact that every basis for geometric vectors in the plane contains
two linearly independent basis vectors, and that similarly for geometric vectors in space
the basis must consist of three linearly independent basis vectors. It turns out that the
fixed number of basis vectors is a property of all vector spaces with a basis, and this
makes it possible to talk about the dimension of a vector space that has a basis. To prove
the property we need a lemma.

Lemma 11.21
If a vector space V has a basis consisting of n basis vectors then every set from V
that contains more than n vectors will be linearly dependent.

Proof

To get a grasp of the proof’s underlying idea, consider a vector space V that has a basis
consisting of two vectors (a, b), and investigate three arbitrary vectors c, d and e from V. We
prove that the three vectors necessarily must be linearly dependent.

Since (a, b) is a basis for V, we can write three equations

c = c1 a + c2 b
d = d1 a + d2 b (11-14)
e = e1 a + e2 b

Consider further the zero vector written as the following linear combination

x1 c + x2 d + x3 e = 0 , (11-15)

which by substitution of the equations (11-14) into (11-15) is equivalent to

( x 1 c 1 + x 2 d 1 + x 3 e1 ) a + ( x 1 c 2 + x 2 d 2 + x 3 e2 ) b = 0 . (11-16)
Since the zero vector can only be obtained as a linear combination of a and b if every coeffi-
cient is equal to 0, (11-16) is equivalent to the following system of equations

c 1 x 1 + d 1 x 2 + e1 x 3 = 0
(11-17)
c 2 x 1 + d 2 x 2 + e2 x 3 = 0

This is a homogeneous system of linear equations in which the number of equations is less
than the number of unknowns. Therefore the system of equations has infinitely many solu-
tions, which means that (11-15) can be satisfied not only with x1 = 0, x2 = 0 and x3 = 0. Thus
we have shown that the ordered set (c, d, e) is linearly dependent.

In general: Assume that the basis for V consists of n vectors, and that m vectors from V where
m > n are given. By following the same procedure as above a homogeneous system of lin-
ear equations emerges with n equations in m unknowns that, because m > n , similarly has
infinitely many solutions. By this it is shown that the m vectors are linearly dependent.

Then we are ready to give the following important theorem:

Theorem 11.22 The Number of Basis Vectors


If a vector space V has a basis consisting of n basis vectors, then every basis for V
will consist of n basis vectors.

Proof

Assume that V has two bases with different numbers of basis vectors. We denote the basis
with the least number of basis vectors a and the one with the largest number b. According to
Lemma 11.21 the b-basis vectors must be linearly dependent, and this contradicts that they
form a basis. The assumption that V can have two bases with different numbers of basis
vectors, must therefore be untrue.

That the number of basis vectors according to theorem 11.22 is a property of vector spaces
with a basis, motivates the introduction of the concept of dimension:

Definition 11.23 Dimension


By the dimension of a vector space V that has a basis b, we understand the number
of vectors in b. If this number is n, one says that V is n-dimensional and writes

dim(V ) = n . (11-18)

Remark: There are vector spaces that do not have a finite basis, see Section 11.7.2 below.

Example 11.24 Dimension of Geometric Vector Spaces

Luckily the definition 11.23 confirms an intuitive feeling that the set of geometric vectors
in the plane has the dimension two and that the set of geometric vectors in space has the
dimension three!

Example 11.25 The Standard Basis for Number Spaces

An arbitrary vector v = ( a, b, c, d) in R4 or in C4 (that is in L4 ) can in an obvious way be


written as a linear combination of four particular vectors in L4

v = a (1, 0, 0, 0) + b (0, 1, 0, 0) + c (0, 0, 1, 0) + d (0, 0, 0, 1) . (11-19)

We put e1 = (1, 0, 0, 0), e2 = (0, 1, 0, 0), e3 = (0, 0, 1, 0) and e4 = (0, 0, 0, 1) and conclude
using (11-19) that the ordered set (e1 , e2 , e3 , e4 ) spans L4 .

Since we can see that none of the vectors can be written as a linear combination of the others,
the set is linearly independent, and (e1 , e2 , e3 , e4 ) is thereby a basis for L4 . This particular
basis is called standard basis for L4 . Since the number of basis vectors in the standard e-basis
is four, dim(L4 ) = 4 .

This can immediately be generalized to Ln : For every n the set (e1 , e2 , . . . , en ) where

e1 = (1, 0, 0, . . . , 0), e2 = (0, 1, 0, . . . , 0), . . . , en = (0, 0, 0, . . . , 1)

is a basis for Ln . This is called standard basis for Ln . It is noticed that dim(Ln ) = n .

Example 11.26 Standard Basis for Matrix Spaces

By standard basis for the vector space R2×3 or C2×3 , we understand the matrix set
    ( [ 1  0  0 ] , [ 0  1  0 ] , . . . , [ 0  0  0 ] )    (11-20)
      [ 0  0  0 ]   [ 0  0  0 ]           [ 0  0  1 ]

Similarly we define a standard basis for an arbitrary matrix space Rm×n and for an arbitrary
matrix space Cm×n .

Exercise 11.27

Explain that the matrix set, which in Example 11.26 is referred to as the standard basis for
R2×3 , is in fact a basis for this vector space.

Example 11.28 The Monomial Basis for Polynomial Spaces

In the vector space P2 (R) of real polynomials of at most 2nd degree, the ordered set (1, x, x2 )
is a basis. This is demonstrated in the following way.

1. Every polynomial P( x ) ∈ P2 (R) can be written in the form

P ( x ) = a0 · 1 + a1 · x + a2 · x 2 ,

that is as a linear combination of the three vectors in the set.

2. The set {1, x, x2 } is linearly independent, since the equation

a0 · 1 + a1 · x + a2 · x2 = 0 for every x

according to the identity theorem for polynomials is only satisfied if all the coefficients
a0 , a1 and a2 are equal to 0 .

A monomial is a polynomial with only one term. Hence, the ordered set (1, x, x2 ) is called the
monomial basis for P2 (R), and dim( P2 (R)) = 3 .

For every n the ordered set (1, x, x2 , . . . , x n ) is a basis for Pn (R), and is called the monomial
basis for Pn (R). Therefore we have that dim( Pn (R)) = n + 1 .

Similarly the ordered set (1, z, z2 , . . . , zn ) is a basis for Pn (C), it is called monomial basis for
Pn (C). Therefore we have that dim( Pn (C)) = n + 1 .

In the set of plane geometric vectors one can choose any pair of two linearly independent
vectors as a basis. Similarly in 3-space any set of three linearly independent vectors is a
basis. We end the section by transferring this to general n-dimensional vector spaces:

Theorem 11.29 Sufficient Conditions for a Basis


In an n-dimensional vector space V, an arbitrary set of n linearly independent vec-
tors from V constitutes a basis for V .

Proof

Since V is assumed to be n-dimensional, it must have a basis b consisting of n basis vectors.


Let the a-set (a1 , a2 , · · · , an ) be an arbitrary linearly independent set of vectors from V. The
set is then a basis for V if it spans V. Suppose this is not the case, and let v be a vector in V
that does not belong to span{a1 , a2 , · · · , an }. Then (v, a1 , a2 , · · · , an ) must be linearly inde-
pendent, but this contradicts Lemma 11.21 since there are n + 1 vectors in the set. Therefore
the assumption that the a-set does not span V must be untrue, and it must accordingly be a
basis for V.

Exercise 11.30

Two geometric vectors a = (1, −2, 1) and b = (2, −2, 0) in 3-space are given. Determine a
vector c such that the ordered set (a, b, c) is a basis for the set of space vectors.

Exercise 11.31

In the 4-dimensional vector space R2×2 , consider the vectors


     
    A = [ 1  1 ] ,   B = [ 1  1 ]   and   C = [ 1  0 ] .    (11-21)
        [ 1  0 ]         [ 0  1 ]             [ 1  1 ]

Explain why the ordered set (A, B, C) is a linearly independent set, and complement the set
with a 2×2 matrix D such that (A, B, C, D) is a basis for R2×2 .

11.5 Vector Calculations Using Coordinates

Coordinates are closely connected to the concept of a basis. When a basis is chosen
for a vector space, any vector in the vector space can be described with the help of its
coordinates with respect to the chosen basis. By this we get a particularly practical al-
ternative to the calculation operations, addition and multiplication by a scalar, which
originally are defined from the ’anatomy’ of the specific vector space. Instead of car-
rying out these particularly defined operations we can implement number calculations
with the coordinates that correspond to the chosen basis. In addition it turns out that
we can simplify and standardize the solution of typical problems that are common to
all vector spaces. But first we give a formal introduction of coordinates with respect to
a chosen basis.

Definition 11.32 Coordinates with Respect to a Given Basis


In an n-dimensional vector space V the basis a = (a1 , a2 , . . . , an ) and a vector x are
given. We consider the unique linear combination of the basis vectors that according
to 11.20 is a way of writing x:

x = x1 a1 + x2 a2 + · · · + xn an . (11-22)

The coefficients x1 , x2 , . . . , xn in (11-22) are denoted x’s coordinates with respect to the
basis a, or x’s a-coordinates, and they are gathered in a coordinate vector as follows:
 
    a x = ( x1 , x2 , . . . , xn ) .    (11-23)

Example 11.33 Coordinates with Respect to a New Basis

In the number space R3 a basis a is given by ((0, 0, 1), (1, 2, 0), (1, −1, 1)). Furthermore the
vector v = (7, 2, 6) is given. Since

2 · (0, 0, 1) + 3 · (1, 2, 0) + 4 · (1, −1, 1) = (7, 2, 6)

we see that  
    a v = ( 2, 3, 4 ) .
The vector (7, 2, 6) therefore has the a-coordinates (2, 3, 4) .
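The a-coordinates can also be found by solving a linear system; a small Python/NumPy sketch
for this example:

    import numpy as np

    A = np.array([[0.0, 1.0, 1.0],
                  [0.0, 2.0, -1.0],
                  [1.0, 0.0, 1.0]])   # columns: the three a-basis vectors
    v = np.array([7.0, 2.0, 6.0])
    print(np.linalg.solve(A, v))      # [2. 3. 4.], the a-coordinates of v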

In order to be able to manipulate the coordinates of several vectors in various arithmetic


operations we will need the following important theorem.

Theorem 11.34 The Coordinate Theorem


In a vector space V two vectors u and v plus a real number k are given. In addition
an arbitrary basis a is chosen. The two arithmetic operations u + v and k u can then be
carried out using the a-coordinates like this:

1. a (u + v) = a u + a v

2. a (ku) = k a u

In other words: The coordinates for a vector sum are obtained by adding the coor-
dinates for the vectors, and the coordinates for a vector multiplied by a number are
the coordinates of the vector multiplied by the number.

Proof

See the proof for the corresponding theorem for geometric vectors in 3-space, Theorem 10.38.
The proof for the general case is obtained as a simple extension.



Example 11.35 Vector Calculation Using Coordinates

We now carry out a vector calculation using coordinates. The example is not particularly
mathematically interesting, but we carry it out in detail in order to demonstrate the technique
of Theorem 11.34.

There are given three polynomials in the vector space P2 (R):

R( x ) = 2 − 3x − x2 , S( x ) = 1 − x + 3x2 and T ( x ) = x + 2x2 .

The task is now to determine the polynomial P( x ) = 2R( x ) − S( x ) + 3T ( x ) . We choose to


carry this out using coordinates for the polynomials with respect to the monomial basis for
P2 (R).

    m P( x ) = m ( 2R( x ) − S( x ) + 3T( x ) )
             = m (2R( x )) + m (− S( x )) + m (3T( x ))
             = 2 m R( x ) − m S( x ) + 3 m T( x )
             = 2 ( 2, −3, −1 ) − ( 1, −1, 3 ) + 3 ( 0, 1, 2 ) = ( 3, −2, 1 ) .

We translate the resulting coordinate vector to the wanted polynomial:

P( x ) = 3 − 2x + x2 .
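The same coordinate computation, carried out by machine as a Python/NumPy sketch (the
monomial coordinates are listed in the order constant term, x, x2):

    import numpy as np

    R = np.array([2, -3, -1])     # monomial coordinates of R(x)
    S = np.array([1, -1, 3])      # monomial coordinates of S(x)
    T = np.array([0, 1, 2])       # monomial coordinates of T(x)
    print(2*R - S + 3*T)          # [ 3 -2  1], i.e. P(x) = 3 - 2x + x^2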

11.6 On the Use of Coordinate Matrices

When we embark on problems with vectors and use their coordinates with respect to a
given basis it often leads to a system of linear equations which we then solve by matrix
calculations. One matrix is of particular importance, viz. the matrix that is formed by
gathering the coordinate columns of more vectors in a coordinate matrix:

Explanation 11.36 Coordinate Matrix for a Vector Set

If in an n-dimensional vector space V a basis a exists, and a set of m numbered vec-



tors is given, then the a-coordinate matrix is formed by gathering the a-coordinate
columns in the given order to form an n×m matrix.
By way of example consider a set of three vectors in R2 : ((1, 2), (3, 4), (5, 6)) . The
coordinate matrix of the set with respect to the standard e-basis for R2 is the 2×3-
matrix  
    [ 1  3  5 ]
    [ 2  4  6 ] .

We will now show how coordinate matrices emerge in series of examples which we,
for the sake of variation, take from different vector spaces. The methods can directly
be used on other types of vector spaces, and after each example the method is demon-
strated in a concentrated and general form.

It is important for your own understanding of the theory of vector spaces that
you practice and realize how coordinate matrices emerge in reality when you
start on typical problems.

11.6.1 Whether a Vector is a Linear Combination of Other Vectors

In R4 we are given four vectors

a1 = (1, 1, 1, 1)
a2 = (1, 0, 0, 1)
a3 = (2, 3, 1, 4)
b = (2, −2, 0, 1)

Problem: Investigate if b is a linear combination of a1 , a2 and a3 .


Solution: We will investigate whether we can find x1 , x2 , x3 ∈ R such that

x1 a1 + x2 a2 + x3 a3 = b . (11-24)

By theorem 11.34 we can rewrite (11-24) as the e-coordinate vector equation


       
    x1 ( 1, 1, 1, 1 ) + x2 ( 1, 0, 0, 1 ) + x3 ( 2, 3, 1, 4 ) = ( 2, −2, 0, 1 )

which is equivalent to the system of linear equations

x1 + x2 + 2x3 = 2
x1 + 3x3 = −2
x1 + x3 = 0
x1 + x2 + 4x3 = 1

We form the augmented matrix of the system of equations and give (without further
details) its reduced row echelon form
   
         [ 1  1  2   2 ]                   [ 1  0  0  0 ]
    T =  [ 1  0  3  −2 ]   ⇒   rref(T) =   [ 0  1  0  0 ]    (11-25)
         [ 1  0  1   0 ]                   [ 0  0  1  0 ]
         [ 1  1  4   1 ]                   [ 0  0  0  1 ]

From (11-25) it is seen that the rank of the coefficient matrix of the system of equations
is 3, while the rank of the augmented matrix is 4. The system of equations has therefore
no solutions. This means that (11-24) cannot be solved. We conclude

    b ∉ span{a1 , a2 , a3 } .
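The rank argument can be reproduced, for instance, with the following Python/SymPy sketch
(the numbers are those of this example):

    from sympy import Matrix

    T = Matrix([[1, 1, 2, 2],
                [1, 0, 3, -2],
                [1, 0, 1, 0],
                [1, 1, 4, 1]])    # augmented matrix, cf. (11-25)
    A = T[:, :3]                  # coefficient matrix
    print(A.rank(), T.rank())     # 3 and 4: the ranks differ, so no solutions exist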

Method 11.37 Linear Combination


You can decide whether a given vector b is a linear combination of other vectors
a1 , a2 , . . . , a p by solving the system of linear equations which has the augmented
matrix that is equal to the coordinate matrix for (a1 , a2 , . . . , a p , b ) with respect to a
given basis.

NB: In general there can be none, one or infinitely many ways a vector can be written
as a linear combination of the others.

11.6.2 Whether Vectors are Linearly Dependent

We consider in the vector space R2×3 the three matrices:


     
    A = [ 1  0  3 ] ,   B = [ 2  1  0 ]   and   C = [ −1  −2  9 ] .    (11-26)
        [ 0  2  2 ]         [ 0  3  1 ]             [  0   0  4 ]

Problem: Investigate whether the three matrices are linearly dependent.

Solution: We use Theorem 11.17 and try to find three real numbers x1 , x2 and x3 that are
not all equal to 0, but which satisfy
 
    x1 A + x2 B + x3 C = [ 0  0  0 ] .    (11-27)
                         [ 0  0  0 ]

By theorem 11.34 we can rewrite (11-27) as the e-coordinate vector equation

    x1 ( 1, 0, 3, 0, 2, 2 ) + x2 ( 2, 1, 0, 0, 3, 1 ) + x3 ( −1, −2, 9, 0, 0, 4 ) = ( 0, 0, 0, 0, 0, 0 )

That is equivalent to the homogeneous system of linear equations with the augmented
matrix that here is written together with reduced row echelon form (details are omitted):

         [ 1  2  −1  0 ]                   [ 1  0   3  0 ]
         [ 0  1  −2  0 ]                   [ 0  1  −2  0 ]
    T =  [ 3  0   9  0 ]   ⇒   rref(T) =   [ 0  0   0  0 ]    (11-28)
         [ 0  0   0  0 ]                   [ 0  0   0  0 ]
         [ 2  3   0  0 ]                   [ 0  0   0  0 ]
         [ 2  1   4  0 ]                   [ 0  0   0  0 ]
From (11-28) we see that both the coefficient matrix and the augmented matrix have the
rank 2, and since the number of unknowns is larger, viz. 3, we conclude that Equation
(11-27) has infinitely many solutions , see Theorem 6.33. Hence the three matrices are
linearly dependent. For instance, from rref(T) one can derive that
 
    −3A + 2B + C = [ 0  0  0 ] .
                   [ 0  0  0 ]

Method 11.38 Linear Dependence or Independence


One can decide whether the vectors v1 , v2 , . . . , v p are linearly dependent by solving
the homogeneous system of linear equations with the augmented matrix that
is equal to the coordinate matrix for (v1 , v2 , . . . , v p , 0) with respect to a given basis.

NB: Since the system of equations is homogeneous, there will be either one solution
or infinitely many solutions. If the rank of the coordinate matrix is equal to p, there
is one solution, and this solution must be the zero solution, and the p vectors are
therefore linearly independent. If the rank of the coordinate matrix is less than p,
there are infinitely many solutions, including non-zero solutions, and the p vectors
are therefore linearly dependent.

11.6.3 Whether a Set of Vectors is a Basis

In an n-dimensional vector space we require n basis vectors, see theorem 11.22. When
one has asked whether a given set of vectors can be a basis, one can immediately con-
clude that this is not the case if the number of vectors in the set is not equal to n. But
if there are n vectors in the set, then according to Theorem 11.29 we need only investigate
whether the set is linearly independent, and for this we already have Method 11.38. How-
ever we can in an interesting way develop the method further by using the determinant
of the coordinate matrix of the vector set!
Let us e.g. investigate whether the polynomials

P1 ( x ) = 1 + 2x2 , P2 ( x ) = 2 − x + x2 and P3 ( x ) = 2x + x2

form a basis for P2 (R). Since dim( P2 (R)) = 3, the number of polynomials is compatible
with being a basis. In order to investigate whether they also are linearly independent,
we use their coordinate vectors with respect to the monomial basis and consider the
equation:
       
$$
x_1 \begin{bmatrix} 1 \\ 0 \\ 2 \end{bmatrix} + x_2 \begin{bmatrix} 2 \\ -1 \\ 1 \end{bmatrix} + x_3 \begin{bmatrix} 0 \\ 2 \\ 1 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} .
$$
The vectors are linearly independent if and only if the only solution is the trivial so-
lution x1 = x2 = x3 = 0 . The equation is equivalent to a homogeneous system of
linear equations consisting of 3 equations in 3 unknowns. The coefficient matrix and
eNote 11 11.6 ON THE USE OF COORDINATE MATRICES 274

the augmented matrix of the system are:


   
$$
A = \begin{bmatrix} 1 & 2 & 0 \\ 0 & -1 & 2 \\ 2 & 1 & 1 \end{bmatrix}
\quad \text{and} \quad
T = \begin{bmatrix} 1 & 2 & 0 & 0 \\ 0 & -1 & 2 & 0 \\ 2 & 1 & 1 & 0 \end{bmatrix} .
$$

As for every homogeneous system of linear equations the right hand side of the aug-
mented matrix consists of only 0’s, therefore ρ(A) = ρ(T), and thus solutions do exist.
There is one solution exactly when ρ(A) is equal to the number of unknowns, that is
3. And this solution must be the zero solution x1 = x2 = x3 = 0 , since Lhom always
contains the zero solution.

Here we can use that A is a square matrix and thus has a determinant. A has full rank
exactly when it is invertible, that is when det(A) ≠ 0.

Since a calculation shows that det(A) = 5 we conclude that P1 ( x ), P2 ( x ), P3 ( x ) con-
stitutes a basis for P2 (R) .

Method 11.39 Proof of a Basis, given n vectors


Given an n-dimensional vector space V. To determine whether a vector set consist-
ing of n vectors (v1 , v2 , . . . , vn ) is a basis for V, we only need to investigate whether
the set is linearly independent. A particular option for this investigation occurs be-
cause the coordinate matrix of the vector set is a square n× n matrix:
The set constitutes a basis for V exactly when the determinant of the coordinate
matrix of the set with respect to a basis a is non-zero, in short
 
$$
(v_1, v_2, \ldots, v_n) \text{ is a basis} \;\Leftrightarrow\; \det\big(\begin{bmatrix} {}_{a}v_1 & {}_{a}v_2 & \cdots & {}_{a}v_n \end{bmatrix}\big) \neq 0 . \tag{11-29}
$$
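
As a hedged sketch (not from the eNote, assuming SymPy), the determinant criterion (11-29) applied to the polynomial example above, where the columns are the coordinates of P1, P2, P3 with respect to the monomial basis:

```python
# Determinant check of Method 11.39 for P1, P2, P3 in the monomial basis (1, x, x^2).
import sympy as sp

A = sp.Matrix([[1,  2, 0],
               [0, -1, 2],
               [2,  1, 1]])

print(A.det())   # 5 != 0, so (P1, P2, P3) is a basis for P2(R)
```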

11.6.4 To Find New Coordinates when the Basis is Changed

An important technical problem for the advanced use of linear algebra is to be able to
calculate new coordinates for a vector when a new basis is chosen. In this context a
particular change of basis matrix plays an important role. We now demonstrate how basis
matrices emerge.
eNote 11 11.6 ON THE USE OF COORDINATE MATRICES 275

In a 3-dimensional vector space V a basis a is given. We now choose a new basis b that
is determined by the a-coordinates of the basis vectors:
     
$$
{}_{a}b_1 = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}, \quad {}_{a}b_2 = \begin{bmatrix} 1 \\ 0 \\ 2 \end{bmatrix} \quad \text{and} \quad {}_{a}b_3 = \begin{bmatrix} 2 \\ 3 \\ 0 \end{bmatrix} .
$$

Problem 1: Determine the a-coordinates for a vector v given by the b-coordinates as:
 
$$
{}_{b}v = \begin{bmatrix} 5 \\ -4 \\ -1 \end{bmatrix} . \tag{11-30}
$$

Solution: The expression (11-30) corresponds to the vector equation

v = 5b1 − 4b2 − 1b3

which we below first convert to an a-coordinate vector equation, re-writing the right
hand side as a matrix-vector product, before finally computing the result:
     
$$
{}_{a}v = 5\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} - 4\begin{bmatrix} 1 \\ 0 \\ 2 \end{bmatrix} - 1\begin{bmatrix} 2 \\ 3 \\ 0 \end{bmatrix}
= \begin{bmatrix} 1 & 1 & 2 \\ 1 & 0 & 3 \\ 1 & 2 & 0 \end{bmatrix} \begin{bmatrix} 5 \\ -4 \\ -1 \end{bmatrix}
= \begin{bmatrix} -1 \\ 2 \\ -3 \end{bmatrix} .
$$

Notice that the 3×3-matrix in the last equation is the coordinate matrix for the b-basis
vectors with respect to basis a . It plays an important role, since we apparently can
determine the a-coordinates for v by multiplying b-coordinate vector for v on the left by
this matrix! Therefore the matrix is given the name change of basis matrix. The property of
this matrix is that it translates b-coordinates to a-coordinates, and it is given the symbol
a Mb . The coordinate change relation can then be written in this convenient way

av = a Mb b v . (11-31)

Problem 2: Determine the b-coordinates for a vector u that has a-coordinates:


 
$$
{}_{a}u = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} . \tag{11-32}
$$

Solution: Since a Mb is the coordinate matrix for a basis, it is invertible, and thus has an
inverse matrix. We therefore use the coordinate change relation (11-31) as follows:

$$
\begin{aligned}
{}_{a}u &= {}_{a}M_{b}\, {}_{b}u \;\Leftrightarrow\\
({}_{a}M_{b})^{-1}\, {}_{a}u &= ({}_{a}M_{b})^{-1}\, {}_{a}M_{b}\, {}_{b}u \;\Leftrightarrow\\
{}_{b}u &= ({}_{a}M_{b})^{-1}\, {}_{a}u \;\Leftrightarrow\\
{}_{b}u &= \begin{bmatrix} 1 & 1 & 2 \\ 1 & 0 & 3 \\ 1 & 2 & 0 \end{bmatrix}^{-1} \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} = \begin{bmatrix} 11 \\ -4 \\ -3 \end{bmatrix} .
\end{aligned}
$$

Method 11.40 Coordinate Change when the Basis is Changed


When a basis a is given for a vector space, and when a new basis b is known by the
a-coordinates of its basis vectors, the change of basis matrix a Mb is identical to the
a-coordinate matrix for b-basis vectors.

1. If b-coordinates for a vector v are known, these a-coordinates can be found by


the matrix-vector product:
a v = a Mb b v .

2. Conversely, if the a-coordinates for v are known, the b-coordinates can be


found by the matrix-vector product:

b v = (a Mb )−1 a v .

In short the change of basis matrix that translates a-coordinates to b-


coordinates, is the inverse of the change of basis matrix that translate b-
coordinates to a-coordinates:

b Ma = (a Mb )−1 .
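
The coordinate changes in Method 11.40 are just matrix-vector products, so they are easy to check numerically. The following sketch (not from the eNote, assuming NumPy) redoes Problems 1 and 2 above.

```python
# Change of basis as in Method 11.40, with aMb from the example above.
import numpy as np

aMb = np.array([[1.0, 1.0, 2.0],
                [1.0, 0.0, 3.0],
                [1.0, 2.0, 0.0]])    # columns: a-coordinates of b1, b2, b3

bv = np.array([5.0, -4.0, -1.0])
av = aMb @ bv                        # b-coordinates -> a-coordinates
print(av)                            # [-1.  2. -3.]

au = np.array([1.0, 2.0, 3.0])
bu = np.linalg.solve(aMb, au)        # a-coordinates -> b-coordinates, i.e. (aMb)^(-1) au
print(bu)                            # [11. -4. -3.]
```

Using solve instead of forming the inverse explicitly is the standard numerical choice; mathematically both compute (a Mb)⁻¹ a u.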

11.7 Subspaces

Often you encounter that a subset of a vector space is itself a vector space. In Figure 11.3
two position vectors $\overrightarrow{OP}$ and $\overrightarrow{OQ}$ are depicted that span the plane F :

Figure 11.3: A plane through the origin interpreted as a subspace in space

Since span{$\overrightarrow{OP}$, $\overrightarrow{OQ}$} can be considered to be a (2-dimensional) vector space in its own
right, it is named a subspace of the (3-dimensional) vector space of position vectors in
space.

Definition 11.41 Subspace


A subset U of a vector space V is called a subspace of V if U is itself a vector space.

In any vector space V one can immediately point to two subspaces:


1) V is in itself a subspace of V.
2) The set { 0 } is a subspace of V.
These subspaces are called the trivial subspaces in V.

When one must check whether a subset is a subspace, one only has to check whether
the stability requirements are satisfied:

Theorem 11.42 Sufficient Conditions for a Subspace


A non-empty subset U of a vector space V is a subspace of V if U is stable with
respect to addition and multiplication by a scalar. This means

1. The sum of two vectors from U belongs to U .

2. The product of a vector in U with a scalar belongs to U .



Proof

Since U satisfies the two stability requirements in 11.1, it only remains to show that U also
satisfies the eight arithmetic rules in the definition. But this is evident since all vectors in U
are also vectors in V where the rules apply.

Example 11.43 Basis for a Subspace

We consider a subset M1 of R2×2 , consisting of matrices of the type


 
$$
\begin{bmatrix} a & b \\ b & a \end{bmatrix} \tag{11-33}
$$

where a and b are arbitrary real numbers. We try to add two matrices of the type (11-33)
     
$$
\begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix} + \begin{bmatrix} 3 & 4 \\ 4 & 3 \end{bmatrix} = \begin{bmatrix} 4 & 6 \\ 6 & 4 \end{bmatrix}
$$

and we multiply one of type (11-33) by a scalar


   
$$
-3 \begin{bmatrix} 2 & -3 \\ -3 & 2 \end{bmatrix} = \begin{bmatrix} -6 & 9 \\ 9 & -6 \end{bmatrix} .
$$

In both cases the resulting matrix is of type (11-33) and it is obvious that this would also apply
had we used other examples. Therefore M1 satisfies the stability requirements for a vector
space. Thus it follows from theorem 11.42 that M1 is a subspace of R2×2 .

Further remark that M1 is spanned by two linearly independent 2×2 matrices since
$$
\begin{bmatrix} a & b \\ b & a \end{bmatrix} = a \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} + b \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} .
$$

Therefore M1 is a 2-dimensional subspace of R2×2 , and a possible basis for M1 is given by
$$
\left( \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \right) .
$$

Example 11.44 A Subset which is Not a Subspace

The subset M2 of R2×2 consists of all matrices of the type


 
$$
\begin{bmatrix} a & b \\ a \cdot b & 0 \end{bmatrix} \tag{11-34}
$$

where a and b are arbitrary real numbers. We try to add two matrices of the type (11-34)
     
$$
\begin{bmatrix} 1 & 2 \\ 2 & 0 \end{bmatrix} + \begin{bmatrix} 2 & 3 \\ 6 & 0 \end{bmatrix} = \begin{bmatrix} 3 & 5 \\ 8 & 0 \end{bmatrix} .
$$

Since 8 6= 3 · 5, this matrix is not of the type (11-34). Therefore M2 is not stable under linear
combinations, and cannot be a subspace.

11.7.1 About Spannings as Subspaces

Theorem 11.45 Spannings as Subspaces


For arbitrary vectors a1 , a2 , . . . , a p in vector space V, the set span{a1 , a2 , . . . , a p } is a
subspace of V .

Proof

The stability requirements are satisfied because 1) the sum of two linear combinations of the
p vectors in itself is a linear combination of them and 2) a linear combination of the p vectors
multiplied by a scalar in itself is a linear combination of them. The rest follows from Theorem
11.42.

The solution set Lhom for a homogeneous system of linear equations with n unknowns
is always a subspace of the number space Rn and the dimension of the subspace is the
same as the number of free parameters in Lhom . We show an example of this below.

Example 11.46 Lhom is a Subspace

The following homogeneous system of linear equations of 3 equations in 5 unknowns

x1 + 2 x3 − 11 x5 = 0
x2 + 4 x5 = 0
x4 + x5 = 0

has the solution set (details are omitted):


     
$$
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \end{bmatrix}
= t_1 \begin{bmatrix} -2 \\ 0 \\ 1 \\ 0 \\ 0 \end{bmatrix}
+ t_2 \begin{bmatrix} 11 \\ -4 \\ 0 \\ -1 \\ 1 \end{bmatrix}
\quad \text{where } t_1, t_2 \in \mathbb{R} . \tag{11-35}
$$

We see that Lhom is a span of two vectors in R5 . Then it is according to theorem 11.45
a subspace of R5 . Since the two vectors evidently are linearly independent, Lhom is a 2-
dimensional subspace of R5 , with a basis

(−2, 0, 1, 0, 0) , (11, −4, 0, −1, 1) .

In the following example we will establish a method for how one can determine a basis
for a subspace that is spanned by a number of given vectors.

Consider in R3 four vectors

v1 = (1, 2, 1), v2 = (3, 0, −1), v3 = (−1, 4, 3) and v4 = (8, −2, −4) .

We wish to find a basis for the subspace

U = span {v1 , v2 , v3 , v4 } .

Let b = (b1 , b2 , b3 ) be an arbitrary vector in U. We thus assume that the following vector
equation has a solution:

x1 v1 + x2 v2 + x3 v3 + x4 v4 = b . (11-36)

By substitution of the five vectors into (11-36), it is seen that (11-36) is equivalent to an

inhomogeneous system of linear equations with the augmented matrix:


   
$$
T = \begin{bmatrix} 1 & 3 & -1 & 8 & b_1 \\ 2 & 0 & 4 & -2 & b_2 \\ 1 & -1 & 3 & -4 & b_3 \end{bmatrix}
\;\Rightarrow\;
\mathrm{rref}(T) = \begin{bmatrix} 1 & 0 & 2 & -1 & c_1 \\ 0 & 1 & -1 & 3 & c_2 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix} . \tag{11-37}
$$

Here c1 is a placeholder for the number that b1 has been transformed into following the
row operations leading to the reduced row echelon form rref(T). Similarly for c2 . Re-
mark that b3 after the row operations must be transformed into 0, or else ( x1 , x2 , x3 , x4 )
could not be a solution as we have assumed.

But it is in particular the leading 1’s in rref(T) on which we focus! They show that v1
and v2 span all of U, and that v1 and v2 are linearly independent. We can convince our-
selves of both by considering equation (11-36) again.

First: Suppose we had only asked whether v1 and v2 span all of U. Then we should
have omitted the terms with v3 and v4 from (11-36), and then we would have obtained:
 
$$
\mathrm{rref}(T_2) = \begin{bmatrix} 1 & 0 & c_1 \\ 0 & 1 & c_2 \\ 0 & 0 & 0 \end{bmatrix}
$$

that shows that c1 v1 + c2 v2 = b , and that v1 and v2 then span all of U.

Secondly: Suppose we had asked whether v1 and v2 are linearly independent. Then we
should have omitted the terms with v3 and v4 from (11-36), and put b = 0. And then
we would have got:
 
$$
\mathrm{rref}(T_3) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix}
$$
That shows that the zero vector can only be written as a linear combination of v1 and v2
if both of the coefficients x1 and x2 are 0. And thus we show that v1 and v2 are linearly
independent. In total we have shown that (v1 , v2 ) is a basis for U.

The conclusion is that a basis for U can be singled out by the leading 1’s in rref(T), see
(11-37). The right hand side in rref(T) was meant to serve our argument but its con-
tribution is now unnecessary. Therefore we can summarize the result as the following
method:

Method 11.47 About refining a Spanning Set to a Basis


When, in a vector space V, for which a basis a has been chosen, one wishes to find a
basis for the subspace 
U = span v1 , v2 , . . . , v p
everything can be read from
 
$$
\mathrm{rref}\big(\begin{bmatrix} {}_{a}v_1 & {}_{a}v_2 & \cdots & {}_{a}v_p \end{bmatrix}\big) . \tag{11-38}
$$

If in the i’th column in (11-38) there are no leading 1’s, then vi is deleted from the
set (v1 , v2 , . . . , v p ) . The set reduced in this way is a basis for U .

Since the number of leading 1’s in (11-38) is equal to the number of basis vectors in
the chosen basis for U , it follows that
 
$$
\dim(U) = \rho\big(\begin{bmatrix} {}_{a}v_1 & {}_{a}v_2 & \cdots & {}_{a}v_p \end{bmatrix}\big) . \tag{11-39}
$$
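
A possible way to carry out Method 11.47 on a computer (a sketch only, assuming SymPy, whose rref() also returns the pivot columns), applied to the vectors v1, ..., v4 from the example above:

```python
# Refine the spanning set {v1, v2, v3, v4} to a basis for U = span{v1, ..., v4}.
import sympy as sp

vs = [sp.Matrix([1, 2, 1]),
      sp.Matrix([3, 0, -1]),
      sp.Matrix([-1, 4, 3]),
      sp.Matrix([8, -2, -4])]

M = sp.Matrix.hstack(*vs)          # coordinate matrix [v1 v2 v3 v4]
R, pivots = M.rref()               # pivots are the columns with leading 1's
basis = [vs[j] for j in pivots]

print(pivots)                      # (0, 1) -> v1 and v2 form a basis for U
print(M.rank())                    # 2 = dim(U)
```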

11.7.2 Infinite-Dimensional Vector Space

Before we end this eNote, that has cultivated the use of bases and coordinates, we must
admit that not all vector spaces have a basis. Viz. there exist infinite-dimensional vector
spaces.
This we can see through the following example:

Example 11.48 Infinite-Dimensional Vector Spaces

All polynomials in the vector space Pn (R) are continuous functions, therefore Pn (R) is an
n+1 dimensional subspace of the vector space C0 (R) of all real continuous functions. Now
consider P(R) , the set of all real polynomials, that for the same reason is also a subspace of
C0 (R). But P(R) must be infinite-dimensional, since it has Pn (R), for every n, as a subspace.
For the same reason C0 (R) must also be infinite-dimensional.

Exercise 11.49

By C1 (R) is understood the set of all differentiable functions, with R as their domain, and
with continuous derivatives in R.

Explain why C1 (R) is an infinite-dimensional subspace of C0 (R) .



eNote 12

Linear Transformations

This eNote investigates an important type of transformation (or map) between vector spaces,
viz. linear transformations. It is shown that the kernel and the range for linear transformations
are subspaces of the domain and the codomain, respectively. When the domain and the
codomain have finite dimensions and a basis has been chosen for each, questions about linear
maps can be standardized. In that case a linear transformation can be expressed as a product
between a so-called standard matrix for the transformation and the coordinates of the vectors
that we want to map. Since standard matrices depend on the chosen bases, we describe how the
standard matrices are changed when one of the bases or both are replaced. The prerequisite for
the eNote is knowledge about systems of linear equations, see eNote 6, about matrix algebra, see
eNote 7 and about vector spaces, see eNote 10.

Updated: 15.11.21 David Brander

12.1 About Maps

A map (also known as a function) is a rule f that for every element in a set A attaches an
element in a set B, and the rule is written f : A → B . A is called the domain and B the
codomain.

CPR-numbering is a map from the set of citizens in Denmark into R10 . Note that there
is a 10-times infinity of elements in the codomain R10 , so luckily we only need a small
subset, about five million! The elements in R10 that in a given instant are in use are the
range for the CPR-map.

Elementary functions of the type f : R → R are simple maps. The meaning of the
arrow is that f to every real number x attaches another real number y = f ( x ) . Consider
e.g. the continuous function:
$$
y = f(x) = \tfrac{1}{2}x^2 - 2 . \tag{12-1}
$$
Here the function has the form of a calculation procedure: Square the number, multiply
the result by one half and subtract 2. Elementary functions have a great advantage in
that their graph { ( x, y) | y = f ( x ) } can be drawn to give a particular overview of the
map (Figure 12.1).

Figure 12.1: Graph of an elementary function

Typical questions in connection with elementary functions reappear in connection with
more advanced maps. Therefore let us as an introduction consider some of the most
important ones:

1. Determine the zeros of f . This means we must find all x for which f ( x ) = 0. In
the example above the answer is x = −2 and x = 2.
solutions: x = −4 and x = 4 .
3. Determine the range for f . We must find all those b for which the equation f ( x ) =
b has a solution. In the example the range is [ −2; ∞ [.

In this eNote we look at domains, codomains and ranges that are vector spaces. A map
f : V → W attaches to every vector x in the domain V a vector y = f (x) in the codomain
W . All the vectors in W that are images of vectors in V together constitute the range.
eNote 12 12.2 EXAMPLES OF LINEAR MAPS IN THE PLANE 286

Example 12.1 Mapping from a Vector Space to a Vector Space

A map g : R2×3 → R2×2 is given by

Y = g(X) = X Xᵀ . (12-2)

Then e.g.
$$
g\left( \begin{bmatrix} 1 & 0 & 2 \\ 0 & 3 & 0 \end{bmatrix} \right)
= \begin{bmatrix} 1 & 0 & 2 \\ 0 & 3 & 0 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 3 \\ 2 & 0 \end{bmatrix}
= \begin{bmatrix} 5 & 0 \\ 0 & 9 \end{bmatrix} .
$$

12.2 Examples of Linear Maps in the Plane

We investigate in the following a map f that has the geometric vectors in the plane as
both the domain and codomain. For a given geometric vector x we will by x̂ understand
its hat vector, i.e. x rotated π/2 counter-clockwise. Consider the map f given by

y = f (x) = 2 x̂ . (12-3)

To every vector in the plane there is attached its hat vector multiplied (extended) by 2.
In Figure 12.2 two vectors u and v and their images f (u) and f (v) are drawn.

Figure 12.2: Two vectors (blue) and their images (red).

Figure 12.2 gives rise to a couple of interesting questions: How is the vector sum u + v
mapped? More precisely: How is the image vector f (u + v) related to the two image
vectors f (u) and f (v)? And what is the relation between the image vectors f (ku) and
f (u) , when k is a given real number?

Figure 12.3: Construction of f (u + v) and f (ku) .

As indicated in Figure 12.3, f satisfies two very simple rules:

f (u + v) = f (u) + f (v) and f (ku) = k f (u) . (12-4)

Using the well known computational rules for hat vectors

1. $\widehat{u + v} = \hat{u} + \hat{v}$ .
2. $\widehat{k\,u} = k\,\hat{u}$ .

we can now confirm the statement (12-4) :
$$
\begin{aligned}
f(u + v) &= 2\,\widehat{u + v} = 2(\hat{u} + \hat{v}) = 2\hat{u} + 2\hat{v} = f(u) + f(v)\\
f(k u) &= 2\,\widehat{k u} = 2k\hat{u} = k(2\hat{u}) = k\, f(u)
\end{aligned}
$$
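
For readers who like to experiment, the two rules (12-4) can also be spot-checked numerically. The snippet below (an illustration only, assuming NumPy) does this for the map f(x) = 2 x̂; a numerical check of course does not replace the argument above.

```python
# Spot-check of the linearity rules (12-4) for f(x) = 2 * x_hat,
# where x_hat is x rotated by pi/2 counter-clockwise.
import numpy as np

def f(x):
    x1, x2 = x
    return 2 * np.array([-x2, x1])   # 2 * (x rotated 90 degrees)

u, v, k = np.array([1.0, 2.0]), np.array([-3.0, 0.5]), 4.0
print(np.allclose(f(u + v), f(u) + f(v)))   # True
print(np.allclose(f(k * u), k * f(u)))      # True
```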

Exercise 12.2

A map f 1 of plane vectors is given by f 1 (v) = 3v:

Scaling of vectors

Draw a figure that demonstrates that f 1 satisfies the rules (12-4) .

Exercise 12.3

In the plane a line l through the origin is given. A map f 2 reflects vectors drawn from the
origin in l :

Reflection of a vector

Draw a figure that demonstrates that f 2 satisfies the rules (12-4) .



Exercise 12.4

A map f 3 turns vectors drawn from the origin the angle t about the origin counterclockwise:

Rotation of a vector

Draw a figure that demonstrates that f 3 satisfies the rules (12-4) .

All maps mentioned in this section are linear, because they satisfy (12-4) . We now turn
to a general treatment of linear mappings between vector spaces.

12.3 Linear Maps

Definition 12.5 Linear Map


Let V and W be two vector spaces and let L denote either R or C. A map f : V → W
is called linear if for all u, v ∈ V and all scalars k ∈ L it satisfies the following two
linearity requirements:

L1 : f (u + v) = f (u) + f (v) .
L2 : f (ku) = k f (u) .

V is called the domain and W the codomain for f .



By putting k = 0 in the linearity requirement L2 in the definition 12.5 , we see


that
f (0) = 0 . (12-5)
In other words for every linear map f : V → W the zero vector in V is mapped
to the zero vector in W .

The image of a linear combination becomes in a very simple way a linear com-
bination of the images of the vectors that are part of the given linear combina-
tion:

f (k1 v1 + k2 v2 + . . . + k p v p ) = k1 f (v1 ) + k2 f (v2 ) + . . . + k p f (v p ) . (12-6)

This result is obtained by repeated application of L1 and L2 .

Example 12.6 Linear Map

A map f : R2 → R4 is given by the rule

f ( x1 , x2 ) = (0, x1 , x2 , x1 + x2 ) . (12-7)

R2 and R4 are vector spaces and we investigate whether f is a linear map. First we test the
left hand side and the right hand side of L1 with the vectors (1, 2) and (3, 4):

f ( (1, 2) + (3, 4) ) = f (4, 6) = (0, 4, 6, 10) .


f (1, 2) + f (3, 4) = (0, 1, 2, 3) + (0, 3, 4, 7) = (0, 4, 6, 10) .

Then L2 is tested with the vector (2,3) and the scalar 5:

f ( 5 · (2, 3) ) = f (10, 15) = (0, 10, 15, 25) .


5 · f (2, 3) = 5 · (0, 2, 3, 5) = (0, 10, 15, 25) .

The investigation suggests that f is linear. This is now shown generally. First we test L1 :

f ( ( x1 , x2 ) + (y1 , y2 ) ) = f ( x1 + y1 , x2 + y2 ) = (0, x1 + y1 , x2 + y2 , x1 + x2 + y1 + y2 ) .
f ( x1 , x2 ) + f (y1 , y2 ) = (0, x1 , x2 , x1 + x2 ) + (0, y1 , y2 , y1 + y2 )
= (0, x1 + y1 , x2 + y2 , x1 + x2 + y1 + y2 ) .

Then we test L2 :

f ( k · ( x1 , x2 ) ) = f (k · x1 , k · x2 ) = (0, k · x1 , k · x2 , k · x1 + k · x2 ) .
k · f ( x1 , x2 ) = k · (0, x1 , x2 , x1 + x2 ) = (0, k · x1 , k · x2 , k · x1 + k · x2 ) .

It is seen that f satisfies both linearity requirements and therefore is linear.



Example 12.7 A Map that is Not Linear

In the example 12.1 we considered the map g : R2×3 → R2×2 given by

Y = g(X) = X Xᵀ . (12-8)

That this map is not linear can be shown by finding an example where either L1 or L2 is not
valid. Below we give an example of a matrix X that does not satisfy g(2X) = 2 g(X) :
$$
g\left( 2 \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix} \right)
= g\left( \begin{bmatrix} 2 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix} \right)
= \begin{bmatrix} 2 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} 2 & 0 \\ 0 & 0 \\ 0 & 0 \end{bmatrix}
= \begin{bmatrix} 4 & 0 \\ 0 & 0 \end{bmatrix} .
$$
But
$$
2\, g\left( \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix} \right)
= 2 \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 0 \\ 0 & 0 \end{bmatrix}
= 2 \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}
= \begin{bmatrix} 2 & 0 \\ 0 & 0 \end{bmatrix} .
$$
Therefore g does not satisfy the linearity requirement L2 , hence g is not linear.

Example 12.8 Linear Map

A map f : P2 (R) → R is given by the rule

f P ( x ) = P 0 (1) .

(12-9)

For every second degree polynomial the slope of the tangent at x = 1 is attached. An arbi-
trary second degree polynomial P can be written as P( x ) = ax2 + bx + c , where a, b and c are
real constants. Since P0 ( x ) = 2ax + b we have:

f P( x ) = 2a + b .

If we put P1 ( x ) = a1 x2 + b1 x + c1 and P2 ( x ) = a2 x2 + b2 x + c2 , we get

f P1 ( x ) + P2 ( x ) = f ( a1 + a2 ) x2 + (b1 + b2 ) x + (c1 + c2 )
 

= 2( a1 + a2 ) + (b1 + b2 )
= (2a1 + b1 ) + (2a2 + b2 )
 
= f P1 ( x ) + f P2 ( x ) .
Furthermore for every real number k and every second degree polynomial P( x ):

f k · P( x ) = f k · ax2 + k · bx + k · c
 

= (2k · a + k · b) = k · (2a + b)

= k · f P( x ) .

It is hereby shown that f satisfies the linearity conditions L1 and L2 , and that f thus is a linear
map.

Exercise 12.9

By C ∞ (R) we understand the vector space consisting of all functions f : R → R that can be
differentiated an arbitrary number of times. One example (among infinitely many) is the sine
function. Consider the map D : C ∞ (R) → C ∞ (R) that to a function f ( x ) ∈ C ∞ (R) assigns its
derivative:
D( f ( x )) = f ′( x ) .


Show that D is a linear map.

12.4 Kernel and Range

The zeros of an elementary function f : R → R are all the real numbers x that satisfy
f ( x ) = 0 . The corresponding concept for linear maps is called the kernel. The range
of an elementary function f : R → R are all the real numbers b for each of which a
real number x exists such that f ( x ) = b . The corresponding concept for linear maps is
also called the range or image. The kernel is a subspace of the domain and the range is a
subspace of the codomain. This is now shown.

Definition 12.10 Kernel and Range


By the kernel of a linear map f : V → W we understand the set:

ker( f ) = { x ∈ V | f (x) = 0 ∈ W } . (12-10)

By the range or image of f we understand the set:

f (V ) = { b ∈ W | At least one x ∈ V exists with f (x) = b } . (12-11)



Theorem 12.11 The Kernel and the Range are Subspaces


Let f : V → W be a linear map. Then:

1. The kernel of f is a subspace of V .

2. The range f (V ) is a subspace of W .

Proof

1) First, the kernel is not empty, as f (0) = 0 by linearity. So we just need to prove that the
kernel of f satisfies the stability requirements, see Theorem 11.42. Assume that x1 ∈ V and
x2 ∈ V, and that k is an arbitrary scalar. Since (using L1 ):

f (x1 + x2 ) = f (x1 ) + f (x2 ) = 0 + 0 = 0 ,

the kernel of f is stable with respect to addition. Moreover (using L2 ):

f (kx1 ) = k f (x1 ) = k 0 = 0 ,

the kernel of f is also stable with respect to multiplication by a scalar. In total we have shown
that the kernel of f is a subspace of V .

2) The range f (V ) is non-empty, as it contains the zero vector. We now show that it satisfies
the stability requirements. Suppose that b1 ∈ f (V ) and b2 ∈ f (V ), and that k is an arbitrary
scalar. There exist, according to the definition, see (12.10), vectors x1 ∈ V and x2 ∈ V that
satisfy f (x1 ) = b1 and f (x2 ) = b2 . We need to show that there exists an x ∈ V such that
f (x) = b1 + b2 . There is, namely x = x1 + x2 , since

f (x1 + x2 ) = f (x1 ) + f (x2 ) = b1 + b2 .

Hereby it is shown that f (V ) is stable with respect to addition. We will, in a similar way,
show that there exists an x ∈ V such that f (x) = kb1 . Here we choose x = kx1 , then

f (x) = f (kx1 ) = k f (x1 ) = kb1 ,

from which it appears that f (V ) is stable with respect to multiplication by a scalar. In total
we have shown that f (V ) is a subspace of W.



But why is it so interesting that the kernel and the range of a linear map are subspaces?
The answer is that it becomes simpler to describe them when we know that they possess
vector space properties and we thereby in advance know their structure. It is particu-
larly elegant when we can determine the kernel and the range by giving a basis for
them. This we will try in the next two examples.

Example 12.12 Determination of Kernel and Range

A linear map f : R3 → R2 is given by the rule:


f ( x1 , x2 , x3 ) = ( x1 + 2x2 + x3 , − x1 − 2x2 − x3 ) . (12-12)
We wish to determine the kernel of f and the range f (R3 ) (note that it is given that f is linear.
So we omit the proof of that).

Determination of the kernel:


We shall solve the equation
   
$$
f(\mathbf{x}) = \mathbf{0} \;\Leftrightarrow\; \begin{bmatrix} x_1 + 2x_2 + x_3 \\ -x_1 - 2x_2 - x_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} . \tag{12-13}
$$
This is a system of linear equations consisting of two equations in three unknowns. The
corresponding augmented matrix is
   
$$
T = \begin{bmatrix} 1 & 2 & 1 & 0 \\ -1 & -2 & -1 & 0 \end{bmatrix}
\;\rightarrow\;
\mathrm{rref}(T) = \begin{bmatrix} 1 & 2 & 1 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}
$$
We see that the system of equations has the solution set
$$
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = t_1 \begin{bmatrix} -2 \\ 1 \\ 0 \end{bmatrix} + t_2 \begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix} .
$$
The solution set is spanned by two linearly independent vectors. Therefore we can conclude
that the kernel of f is a 2-dimensional subspace of R3 that is precisely characterized by a
basis: 
Basis for the kernel : (−2, 1, 0), (−1, 0, 1) .

There is an entire plane of vectors in the space, that by insertion into the expression
for f give the image 0. This basis yields all of them.

Determination of the range:


We shall find all those b = (b1 , b2 ) for which the following equation has a solution:
   
$$
f(\mathbf{x}) = \mathbf{b} \;\Leftrightarrow\; \begin{bmatrix} x_1 + 2x_2 + x_3 \\ -x_1 - 2x_2 - x_3 \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \end{bmatrix} . \tag{12-14}
$$

Note that here it is not x1 , x2 and x3 we are looking for, as we usually do in such a system
of equations. Rather it is b1 and b2 on the right-hand side that we wish to determine,
namely exactly in those cases where solutions exist. For whenever the system has a
solution for a particular right-hand side, that right-hand side must lie in the image space
that we are looking for.

This is a system of linear equations consisting of two equations in three unknowns. The
corresponding augmented matrix is
   
$$
T = \begin{bmatrix} 1 & 2 & 1 & b_1 \\ -1 & -2 & -1 & b_2 \end{bmatrix}
\;\rightarrow\;
\mathrm{rref}(T) = \begin{bmatrix} 1 & 2 & 1 & b_1 \\ 0 & 0 & 0 & b_1 + b_2 \end{bmatrix}
$$

If b1 + b2 = 0 , that is if b1 = −b2 , the system of equations has infinitely many solutions. If
on the contrary b1 + b2 ≠ 0 there is no solution. All those b = (b1 , b2 ) ∈ R2 that are images
of at least one x ∈ R3 can evidently be written as:
$$
\begin{bmatrix} b_1 \\ b_2 \end{bmatrix} = t \begin{bmatrix} -1 \\ 1 \end{bmatrix} .
$$

We conclude that f (V ) is a 1-dimensional subspace of R2 that can be characterized precisely


by a basis: 
Basis for the range : (−1, 1) .

Figure 12.4: Two vectors in the kernel (Exercise 12.13)



Exercise 12.13 Determination of Kernel and Range

In example 12.8 it was shown that the map f : P2 (R) → R given by the rule

f (P( x )) = P′(1) . (12-15)

is linear. The kernel of f consists of all second degree polynomials that satisfy P0 (1) = 0 .
The graphs for a couple of these are shown in Figure 12.4:

Determine the kernel of f .

In eNote 6 the relation between the solution set for an inhomogeneous system of linear
equations and the corresponding homogeneous linear system of equations is presented
in Theorem 6.37 (the structural theorem). We now show that a corresponding relation
exists for all linear equations.

Theorem 12.14 The Structural Theorem for Linear Equations


Let f : V → W be a linear map and y an arbitrary proper vector in W . Furthermore
let x0 be an arbitrary (so-called particular) solution to the inhomogeneous linear
equation
f (x) = y . (12-16)
Then the general solution Linhom to the linear equation is given by

Linhom = { x = x0 + x1 | x1 ∈ ker( f ) } , (12-17)

or in short
Linhom = x0 + ker( f ) . (12-18)

Proof

The theorem contains two assertions. The one is that the sum of x0 and an arbitrary vector
from the ker( f ) belongs to Linhom . The other is that an arbitrary vector from Linhom can be
written as the sum of x0 and a vector from ker( f ) . We prove the two assertions separately:

1. Assume x1 ∈ ker( f ) . Then it applies using the linearity condition L1 :

f (x1 + x0 ) = f (x1 ) + f (x0 ) = 0 + y = y (12-19)

by which it is also shown that x1 + x0 is a solution to (12-16).

2. Assume that x2 ∈ Linhom . Then it applies using the linearity condition L1 :

f (x2 − x0 ) = f (x2 ) − f (x0 ) = y − y = 0 ⇔ x2 − x0 ∈ ker( f ) . (12-20)

Thus a vector x1 ∈ ker( f ) exists that satisfies

x2 − x0 = x1 ⇔ x2 = x0 + x1 (12-21)

whereby we have stated x2 in the form wanted. The proof is hereby complete.

Exercise 12.15

Consider the map D : C ∞ (R) → C ∞ (R) from Exercise 12.9 that to the function f ∈ C ∞ (R)
relates its derivative:
D( f ( x )) = f ′( x ) .


State the complete solution to the inhomogeneous linear equation

D( f ( x )) = x^2

and interpret this in the light of the structural theorem.

12.5 Mapping Matrix

All linear maps from a finite dimensional domain V to a finite dimensional codomain
W can be described by a mapping matrix. This is the subject of this subsection. The
prerequisite is only that a basis for both V and W is chosen, and that we turn from vector
calculation to calculation using the coordinates with respect to the chosen bases. The
great advantage by this setup is that we can construct general methods of calculation
valid for all linear maps between finite dimensional vector spaces. We return to this
subject, see section 12.6. First we turn to mapping matrix construction.

Let A be a real or complex m × n−matrix. We consider a map f : Ln → Lm that has the


form of a matrix-vector product:

y = f (x) = A x . (12-22)

Using the matrix product computation rules from Theorem 7.13, we obtain for every
choice of x1 , x2 ∈ Ln and every scalar k:

f (x1 + x2 ) = A (x1 + x2 ) = Ax1 + Ax2 = f (x1 ) + f (x2 ) ,

f (kx1 ) = A (kx1 ) = k (A x1 ) = k f (x1 ) .


We see that the map satisfies the linearity requirements L1 and L2 . Therefore every map
of the form (12-22) is linear.

Example 12.16 Matrix-Vector Product as a Linear Map

The formula
$$
\begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix} = \begin{bmatrix} 1 & 2 \\ 3 & 4 \\ 5 & 6 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} x_1 + 2x_2 \\ 3x_1 + 4x_2 \\ 5x_1 + 6x_2 \end{bmatrix}
$$
defines a particular linear map from the vector space R2 to the vector space R3 .

But also the opposite is true: Every linear map between finite-dimensional vector spaces
can be written as a matrix-vector product in the form (12-22) if we replace x and y with
their coordinates with respect to a chosen basis for the domain and codomain, respec-
tively. This we show in the following.

We consider a linear map f : V → W where V is an n-dimensional and W is an m-


dimensional vector space, see Figure 12.5

For V a basis a is chosen and for W a basis c. This means that a given vector x ∈ V
can be written as a unique linear combination of the a-basis vectors and that the image
y = f (x) can be written as a unique linear combination of the c-basis vectors:

x = x1 a1 + x2 a2 + . . . + xn an and y = y1 c1 + y2 c2 + · · · + ym cm .

This means that ( x1 , x2 , . . . , xn ) is the set of coordinates for x with respect to the a-basis,
and that (y1 , y2 , . . . , ym ) is the set of coordinates for y with respect to the c-basis.

Figure 12.5: Linear map f : V → W, where V has dimension n with a-basis (a1, a2, ..., an) and W has dimension m with c-basis (c1, c2, ..., cm).

We now pose the question: How can we describe the relation between the a-coordinate
vector for the vector x ∈ V and the c-coordinate vector for the image vector y? In other
words we are looking for the relation between:

   
$$
{}_{c}y = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_m \end{bmatrix}
\quad \text{and} \quad
{}_{a}x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} .
$$
This we develop through the following rewritings where we first, using L1 and L2 , get
y written as a linear combination of the images of the a-basis vectors.

y = f (x)
= f ( x1 a1 + x2 a2 + · · · + xn an )
= x1 f (a1 ) + x2 f (a2 ) + · · · + xn f (an ) .

Hereafter we can investigate the coordinate vector for y with respect to the c-basis, while
we first use the cooordinate theorem, see Theorem 11.34, and thereafter the definition
on matrix-vector product, see Definition 7.7.

$$
\begin{aligned}
{}_{c}y &= {}_{c}\big( x_1 f(a_1) + x_2 f(a_2) + \cdots + x_n f(a_n) \big)\\
&= x_1\, {}_{c}f(a_1) + x_2\, {}_{c}f(a_2) + \cdots + x_n\, {}_{c}f(a_n)\\
&= \begin{bmatrix} {}_{c}f(a_1) & {}_{c}f(a_2) & \cdots & {}_{c}f(a_n) \end{bmatrix} {}_{a}x .
\end{aligned}
$$

 
The matrix c f (a1 ) c f (a2 ) · · · c f (an ) in the last equation is called the mapping ma-
trix for f with respect to the a-basis for V and the c-basis for W.

Thus we have achieved this important result: The coordinate vector c y can be found
by multiplying the coordinate vector a x on the left by the mapping matrix. We now
summarize the results in the following.

Definition 12.17 Mapping Matrix


Let f : V → W be a linear map from an n-dimensional vector space V to an m-
dimensional vector space W . By the mapping matrix for f with respect to the basis
a of V and basis c of W we understand the m × n-matrix:
 
c Fa = [ c f (a1 )  c f (a2 )  · · ·  c f (an ) ] . (12-23)

The mapping matrix for f thus consists of the coordinate vectors with respect to the
basis c of the images of the n basis vectors in basis a.

The main task for a mapping matrix is of course to determine the images in W of the
vectors in V, and this is justified in the following theorem which summarizes the inves-
tigations above.

Theorem 12.18 Main Theorem of Mapping Matrices


Let V be an n-dimensional vector space with a chosen basis a and W an m-
dimensional vector space with a chosen basis c .

1. For a linear map f : V → W it is valid that if y = f (x) is the image of an


arbitrary vector x ∈ V , then:

cy = c Fa a x (12-24)

where c Fa is the mapping matrix for f with respect to the basis a of V and the
basis c of W .

2. Conversely, assume that the images y = g(x) for a map g : V → W can be


obtained in the coordinate form as

cy = c Ga a x (12-25)

where c Ga ∈ Lm×n . Then g is linear and c Ga is the mapping matrix for g with
respect to the basis a of V and basis c of W .

Below are three examples of the construction and elementary use of mapping matrices.

Example 12.19 Construction and Use of a Mapping Matrix

Figure: Linear rotation about the origin



Rotation of plane vectors drawn from the origin is a simple example of a linear map, see
Exercise 12.4. Let v be an arbitrary angle, and let f be the linear mapping that rotates an
arbitrary vector the angle v about the origin counterclockwise, (see the figure above).

We wish to determine the mapping matrix for f with respect to the standard basis for vectors
in the plane. Therefore we need the images of the basis vectors i and j :

Figure: Determination of the mapping matrix

It is seen that f (i) = (cos(v), sin(v)) and f (j) = (− sin(v), cos(v)). Therefore the mapping
matrix we are looking for is
 
$$
{}_{e}F_{e} = \begin{bmatrix} \cos(v) & -\sin(v) \\ \sin(v) & \cos(v) \end{bmatrix} .
$$
The coordinates for the image y = f (x) of a given vector x are thus given by the formula:
$$
\begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = \begin{bmatrix} \cos(v) & -\sin(v) \\ \sin(v) & \cos(v) \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} .
$$
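
A small numerical illustration (not from the eNote, assuming NumPy) of the rotation mapping matrix from this example, here with the angle v = π/2 so that i is mapped onto j:

```python
# Rotation mapping matrix eFe for the angle v, applied to the basis vector i.
import numpy as np

v = np.pi / 2
eFe = np.array([[np.cos(v), -np.sin(v)],
                [np.sin(v),  np.cos(v)]])

x = np.array([1.0, 0.0])     # the basis vector i
print(eFe @ x)               # approximately [0., 1.], i.e. i is rotated onto j
```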

Example 12.20 Construction and Use of a Mapping Matrix

In a 3-dimensional vector space V, a basis a = (a1 , a2 , a3 ) is chosen, and in 2-dimensional


vector space W a basis c = (c1 , c2 ) is chosen. A linear map f : V → W satisfies:

f (a1 ) = 3c1 + c2 , f (a2 ) = 6c1 − 2c2 and f (a3 ) = −3c1 + c2 . (12-26)

We wish to find the image under f of the vector v = a1 + 2a2 + a3 ∈ V using the mapping
matrix c Fa . The mapping matrix is easily constructed since we already from (12-26) know
the images of the basis vectors in V :
 
$$
{}_{c}F_{a} = \begin{bmatrix} {}_{c}f(a_1) & {}_{c}f(a_2) & {}_{c}f(a_3) \end{bmatrix} = \begin{bmatrix} 3 & 6 & -3 \\ 1 & -2 & 1 \end{bmatrix}
$$

Since v has the set of coordinates (1, 2, 1) with respect to basis a, we find the coordinate vector
for f (v) like this:
 
$$
{}_{c}f(v) = {}_{c}F_{a}\, {}_{a}v = \begin{bmatrix} 3 & 6 & -3 \\ 1 & -2 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix} = \begin{bmatrix} 12 \\ -2 \end{bmatrix} .
$$
Hence we have found f (v) = 12c1 − 2c2 .

Example 12.21 Construction and Use of a Mapping Matrix

A linear map f : R4 → R3 is given by:


 
$$
f(x_1, x_2, x_3, x_4) = \begin{bmatrix} x_1 + 2x_2 + x_4 \\ 2x_1 - x_2 + 2x_3 - x_4 \\ x_1 - 3x_2 + 2x_3 - 2x_4 \end{bmatrix} . \tag{12-27}
$$

Let us determine the mapping matrix for f with respect to the standard basis e of R4 and the
standard basis e of R3 . First we find the images of the four basis vectors in R4 using the rule
(12-27):
$$
f(1,0,0,0) = \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix}, \quad
f(0,1,0,0) = \begin{bmatrix} 2 \\ -1 \\ -3 \end{bmatrix}, \quad
f(0,0,1,0) = \begin{bmatrix} 0 \\ 2 \\ 2 \end{bmatrix}, \quad
f(0,0,0,1) = \begin{bmatrix} 1 \\ -1 \\ -2 \end{bmatrix} .
$$
We can now construct the mapping matrix for f :
$$
{}_{e}F_{e} = \begin{bmatrix} 1 & 2 & 0 & 1 \\ 2 & -1 & 2 & -1 \\ 1 & -3 & 2 & -2 \end{bmatrix} . \tag{12-28}
$$

We wish to find the image y = f (x) of the vector x = (1, 1, 1, 1). At our disposal we have of
course the rule (12-27), but we choose to find the image using the mapping matrix:

 1
 
  
1 2 0 1   4
1
e y = e Fe e x =  2 −1 2 −1   =  2  .
1
1 −3 2 −2 −2
1
Thus we have found that y = f (1, 1, 1, 1) = (4, 2, −2) .
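
The construction in Examples 12.19–12.21 — the columns of the mapping matrix are the images of the basis vectors — can be sketched as follows (an illustration only, assuming NumPy), using the map f from (12-27):

```python
# Build the mapping matrix of f from the images of the standard basis vectors.
import numpy as np

def f(x):
    x1, x2, x3, x4 = x
    return np.array([x1 + 2*x2 + x4,
                     2*x1 - x2 + 2*x3 - x4,
                     x1 - 3*x2 + 2*x3 - 2*x4])

# Each row of np.eye(4) is a standard basis vector; its image becomes a column.
eFe = np.column_stack([f(e) for e in np.eye(4)])
print(eFe)

x = np.array([1, 1, 1, 1])
print(eFe @ x)        # the coordinates (4, 2, -2), the same as f(x)
```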

Exercise 12.22

In the plane is given a customary (O, i, j)-coordinate system. Reflection of position vectors
about the line y = x/2 is a linear map, let us call it s .

Determine s(i) and s(j), construct the mapping matrix e Se for s and determine an expression
for the reflection of an arbitrary position vector v with the coordinates (v1 , v2 ) with respect
to the standard basis. The figure below contains some hints for the determination of s(i).
Proceed similarly with s(j) .
Reflection of the standard basis vectors.

12.6 On the Use of Mapping Matrices

The mapping matrix tool has a wide range of applications. It allows us to translate
questions about linear maps between vector spaces to questions about matrices and
coordinate vectors that allow immediate calculations. The methods only require that
bases in each of the vector spaces be chosen, and that the mapping matrix that belongs
to the two bases has been formed. In this way we can reduce problems as diverse as
that of finding polynomials with certain properties, finding the result of a geometrical
construction and finding the solution of differential equations, to problems that can be
solved through the use of matrix algebra.

As a recurrent example in this section we look at a linear map f : V → W where V


is a 4-dimensional vector space with chosen basis a = (a1 , a2 , a3 , a4 ), and where W is a

3-dimensional vector space with chosen basis c = (c1 , c2 , c3 ) . The mapping matrix for f
is:
$$
{}_{c}F_{a} = \begin{bmatrix} 1 & 3 & -1 & 8 \\ 2 & 0 & 4 & -2 \\ 1 & -1 & 3 & -4 \end{bmatrix} . \tag{12-29}
$$

12.6.1 Finding the Kernel of f

To obtain the kernel of f you must find all the x ∈ V that are mapped to 0 ∈ W. That is
you solve the vector equation
f (x) = 0 .
This equation is according to the Theorem 12.18 equivalent to the matrix equation
$$
{}_{c}F_{a}\, {}_{a}x = {}_{c}\mathbf{0}
\;\Leftrightarrow\;
\begin{bmatrix} 1 & 3 & -1 & 8 \\ 2 & 0 & 4 & -2 \\ 1 & -1 & 3 & -4 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}
$$
that corresponds to the homogeneous system of linear equations with the augmented
matrix:
$$
T = \begin{bmatrix} 1 & 3 & -1 & 8 & 0 \\ 2 & 0 & 4 & -2 & 0 \\ 1 & -1 & 3 & -4 & 0 \end{bmatrix}
\;\rightarrow\;
\mathrm{rref}(T) = \begin{bmatrix} 1 & 0 & 2 & -1 & 0 \\ 0 & 1 & -1 & 3 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix} .
$$
It is seen that the solution set is spanned by two linearly independent vectors: (−2, 1, 1, 0)
and (1, −3, 0, 1). Let v1 and v2 be the two vectors in V that are determined by the a-
coordinates like this:

a v1 = (−2, 1, 1, 0) and a v2 = (1, −3, 0, 1) .


Since the two coordinate vectors are linearly independent, (v1 , v2 ) is a basis for the
kernel of f , and the kernel of f has the dimension 2.

Point: The number n = 4 of unknowns in the solved system of equations is by definition


equal to the number of columns in c Fa that again is equal to dim(V ), see definition 12.17.
Moreover we notice that the coefficient matrix of the system of equations is equal to c Fa .
If the rank of the coefficient matrix is k , we know that the solution set, and therefore the
kernel, will be spanned by (n − k ) linearly independent directional vectors where k is
the rank of the coefficient matrix. Therefore we have:
dim( ker( f ) ) = n − ρ (c Fa ) = 4 − 2 = 2 .

Method 12.23 Determination of the Kernel


In a vector space V a basis a is chosen, and in a vector space W a basis c is chosen.
The kernel of a linear map f : V → W, in coordinate form, can be found as the
solution set for the homogeneous system of linear equations with the augmented
matrix  
T = c Fa c 0 .
The kernel is a subspace of V and its dimension is determined by:

dim( ker( f ) ) = dim(V ) − ρ (c Fa ) . (12-30)
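
Method 12.23 amounts to computing a null space. A sketch (not from the eNote, assuming SymPy) for the mapping matrix (12-29):

```python
# The kernel of f corresponds, in coordinates, to the null space of cFa.
import sympy as sp

cFa = sp.Matrix([[1,  3, -1,  8],
                 [2,  0,  4, -2],
                 [1, -1,  3, -4]])

kernel_basis = cFa.nullspace()
print(kernel_basis)                     # a-coordinates of a basis for ker(f)
print(cFa.shape[1] - cFa.rank())        # dim(ker f) = dim(V) - rank = 4 - 2 = 2
```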

12.6.2 Solving the Vector Equation f(x) = b

How can you decide whether a vector b ∈ W belongs to the image for a given linear
map? The question is whether (at least) one x ∈ V exists that is mapped to b . And
the question can be extended to how to determine all x ∈ V with this property that is
mapped in b .

We consider the linear map f : V → W that is represented by the mapping matrix


(12-29) and choose as our example the vector b ∈ W that has c-coordinates (1, 2, 3) . We
will solve the vector equation
f (x) = b .
If we calculate with coordinates the vector equation corresponds to the following matrix
equation
c Fa a x = c b

that is the matrix equation


 
$$
\begin{bmatrix} 1 & 3 & -1 & 8 \\ 2 & 0 & 4 & -2 \\ 1 & -1 & 3 & -4 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}
$$
that corresponds to an inhomogeneous system of linear equations with the augmented
matrix:
$$
T = \begin{bmatrix} 1 & 3 & -1 & 8 & 1 \\ 2 & 0 & 4 & -2 & 2 \\ 1 & -1 & 3 & -4 & 3 \end{bmatrix}
$$

that by Gauss-Jordan elimination is reduced to


 
$$
\mathrm{rref}(T) = \begin{bmatrix} 1 & 0 & 2 & -1 & 0 \\ 0 & 1 & -1 & 3 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix} .
$$

Since the rank of the augmented matrix is larger than the rank of the coefficient matrix,
the inhomogeneous system of equations has no solutions. We have thus found a vector
in W that has no “original vector” in V.

Method 12.24 Solution of the Vector Equation f(x) = b


In a vector space V a basis a is chosen, and in a vector space W a basis c is chosen.
For a linear map f : V → W, and a proper vector b ∈ W, the equation

f (x) = b

can be solved using the inhomogeneous system of linear equations that has the aug-
mented matrix  
T = c Fa c b
If solutions exist and x0 is one of these solutions the whole solution set can be written
as:
x0 + ker( f ) .

An inhomogeneous system of linear equations consisting of m equations in n


unknowns, with the coefficient matrix A and the right-hand side b can in
matrix form be written as
Ax = b .
The map f : Ln → Lm given by

f (x) = Ax

is linear. The linear equation f (x) = b is thus equivalent to the considered


system of linear equations. Thus we can see that the structural theorem for
systems of linear equations (see eNote 6 Theorem 6.37) is nothing but a partic-
ular case of the general structural theorem for linear maps (Theorem 12.14).
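
A sketch (not from the eNote, assuming SymPy) of Method 12.24 for the right-hand side c b = (1, 2, 3) used above; the empty solution set reflects that the rank of the coefficient matrix differs from the rank of the augmented matrix:

```python
# Solve f(x) = b in coordinates for the mapping matrix (12-29) and cb = (1, 2, 3).
import sympy as sp

cFa = sp.Matrix([[1,  3, -1,  8],
                 [2,  0,  4, -2],
                 [1, -1,  3, -4]])
cb = sp.Matrix([1, 2, 3])

x1, x2, x3, x4 = sp.symbols('x1 x2 x3 x4')
sol = sp.linsolve((cFa, cb), x1, x2, x3, x4)
print(sol)   # EmptySet: this b has no preimage under f
```

For a right-hand side with a solution, linsolve would instead return the full solution set x0 + ker(f) in parametric form, in agreement with Theorem 12.14.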

12.6.3 Determining the Image Space

Above we have found that the image space for a linear map is a subspace of the codomain,
see theorem 12.11. How can this subspace be delimited and characterized?

Again we consider the linear map f : V → W that is represented by the mapping ma-
trix (12-29). Since the basis (a1 , a2 , a3 , a4 ) for V is chosen we can write all the vectors in
V at once:
x = x1 a1 + x2 a2 + x3 a3 + x4 a4 ,
where we imagine that x1 , x2 , x3 and x4 run through all conceivable combinations of real
values. But then all images in W of vectors in V can be written as:
f (x) = f ( x1 a1 + x2 a2 + x3 a3 + x4 a4 )
= x1 f (a1 ) + x2 f (a2 ) + x3 f (a3 ) + x4 f (a4 ) ,
where we have used L1 and L2 , and where we continue to imagine that x1 , x2 , x3 and x4
run through all conceivable combinations of real values. But then:
f (V ) = span { f (a1 ), f (a2 ), f (a3 ), f (a4 ) } .
The image space is thus spanned by the images of the a-basis vectors! But then we can
(according to Method 11.47 in eNote 11) determine a basis for the image space
by finding the leading 1’s in the reduced row echelon form of
 
[ c f (a1 )  c f (a2 )  c f (a3 )  c f (a4 ) ] .

This is the mapping matrix for f with respect to the chosen bases
 
$$
{}_{c}F_{a} = \begin{bmatrix} 1 & 3 & -1 & 8 \\ 2 & 0 & 4 & -2 \\ 1 & -1 & 3 & -4 \end{bmatrix}
$$
that by Gauss-Jordan elimination is reduced to
$$
\mathrm{rref}({}_{c}F_{a}) = \begin{bmatrix} 1 & 0 & 2 & -1 \\ 0 & 1 & -1 & 3 \\ 0 & 0 & 0 & 0 \end{bmatrix} .
$$
To the two leading 1’s in rref(c Fa ) correspond the first two columns in c Fa . We thus
conclude:

Let w1 and w2 be the two vectors in W determined by c-coordinates as:

c w1 = (1, 2, 1) and c w2 = (3, 0, −1) .


Then (w1 , w2 ) is a basis for the image space f (V ) .

Method 12.25 Determination of the Image Space


In a vector space V a basis a is chosen, and in a vector space W a basis c is chosen.
The image space f (V ) for a linear mapping f : V → W can be found from

rref(c Fa ) (12-31)

in the following way: If there is no leading 1 in the i’th column in (12-31) then f (a i )
is removed from the vector set f (a1 ), f (a2 ), . . . , f (a n ) . After this thinning the
vector set constitutes a basis for f (V ) .

Since the number of leading 1’s in (12-31) is equal to the number of basis vectors in
the chosen basis for f (V ) it follows that

dim( f (V )) = ρ (c Fa ) . (12-32)
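
Method 12.25 can likewise be sketched on a computer (an illustration only, assuming SymPy): the pivot columns of the mapping matrix (12-29) give the c-coordinates of a basis for the image space.

```python
# A basis for f(V) read off from the pivot columns of cFa.
import sympy as sp

cFa = sp.Matrix([[1,  3, -1,  8],
                 [2,  0,  4, -2],
                 [1, -1,  3, -4]])

R, pivots = cFa.rref()
image_basis = [cFa.col(j) for j in pivots]
print(pivots)         # (0, 1): the first two columns, i.e. cw1 = (1, 2, 1), cw2 = (3, 0, -1)
print(cFa.rank())     # 2 = dim(f(V))
```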

12.7 The Dimension Theorem

In the method of the preceding section 12.23 we found the following expression for the
dimension of the kernel of a linear map f : V → W :

dim( ker( f ) ) = dim(V ) − ρ (c Fa ) . (12-33)

And in method 12.25 a corresponding expression for the image space f (V ) :

dim( f (V )) = ρ (c Fa ) . (12-34)

By combining (12-33) and (12-34) a remarkably simple relationship between the domain,
the kernel and the image space for a linear map is achieved:

Theorem 12.26 The Dimension Theorem (or Rank-Nullity Theorem)


Let V and W be two finite dimensional vector spaces. For a linear map f : V → W
we have: 
dim(V ) = dim( ker( f ) ) + dim f (V ) .

Here are some direct consequences of Theorem 12.26:

The image space for a linear map can never have a higher dimension than the
domain.

If the kernel only consists of the 0-vector, the image space keeps the dimension
of the domain.

If the kernel has the dimension p > 0, then p dimensions disappear through the
map.

Exercise 12.27

A linear map f : R3 → R3 has, with respect to the standard basis for R3 , the mapping matrix
 
$$
{}_{e}F_{e} = \begin{bmatrix} 1 & 2 & 1 \\ 2 & 4 & 0 \\ 3 & 6 & 0 \end{bmatrix} .
$$

It is stated that the kernel of f has the dimension 1. Find, by mental calculation, a basis for
f (V ) .

Exercise 12.28

In 3-space a standard (O, i, j, k)-coordinate system is given. The map p projects position
vectors down into the ( x, y)-plane in space:

Projection down into the ( X, Y )-plane

Show that p is linear and construct the mapping matrix e Pe for p. Determine a basis for the
kernel and the image space of the projection. Check that the Dimension Theorem is fulfilled.

12.8 Change in the Mapping Matrix when the Basis is


Changed

In eNote 11 it is shown how the coordinates of a vector change when the basis for the
vector space is changed, see method 11.40. We begin this section by repeating the most
important points and showing two examples.

Assume that in V an a-basis (a1 , a2 , . . . , an ) is given, and that a new b-basis (b1 , b2 , . . . , bn )
is chosen in V. If a vector v has the b-coordinate vector b v , then its a-coordinate vector
can be calculated as the matrix vector-product

av = a Mb b v (12-35)

where the change of basis matrix a Mb is given by


 
a Mb = a b1 a b2 . . . a bn . (12-36)

We now show two examples of the use of (12-36). In the first example the “new” coor-
dinates are given following which the “old” are calculated. In the second example it is
vice versa: the “old” are known, and the “new” are determined.

Example 12.29 From New Coordinates to Old

In a 3-dimensional vector space V a basis a = (a1 , a2 , a3 ) is given, following which a new


basis b is chosen consisting of the vectors

b1 = a1 − a3 , b2 = 2a1 − 2a2 + a3 and b3 = −3a1 + 3a2 − a3 .

Problem: Determine the coordinate vector a x for x = b1 + 2b2 + 3b3 .

Solution: First we see that

$$
{}_{b}x = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}
\quad \text{and} \quad
{}_{a}M_{b} = \begin{bmatrix} 1 & 2 & -3 \\ 0 & -2 & 3 \\ -1 & 1 & -1 \end{bmatrix} . \tag{12-37}
$$

Then we get
$$
{}_{a}x = {}_{a}M_{b}\, {}_{b}x = \begin{bmatrix} 1 & 2 & -3 \\ 0 & -2 & 3 \\ -1 & 1 & -1 \end{bmatrix} \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} = \begin{bmatrix} -4 \\ 5 \\ -2 \end{bmatrix} . \tag{12-38}
$$

Example 12.30 From Old Coordinates to New

In a 2-dimensional vector space W a basis c = (c1 , c2 ) is given, following which a new basis
d is chosen consisting of the vectors

d1 = 2c1 + c2 and d2 = c1 + c2 .

Problem: Determine the coordinate vector d y for y = 10c1 + 6c2 .

Solution: First we see that


     
$$
{}_{c}y = \begin{bmatrix} 10 \\ 6 \end{bmatrix}
\quad \text{and} \quad
{}_{c}M_{d} = \begin{bmatrix} 2 & 1 \\ 1 & 1 \end{bmatrix}
\;\Rightarrow\;
{}_{d}M_{c} = ({}_{c}M_{d})^{-1} = \begin{bmatrix} 1 & -1 \\ -1 & 2 \end{bmatrix} . \tag{12-39}
$$

Then we get
    
$$
{}_{d}y = {}_{d}M_{c}\, {}_{c}y = \begin{bmatrix} 1 & -1 \\ -1 & 2 \end{bmatrix} \begin{bmatrix} 10 \\ 6 \end{bmatrix} = \begin{bmatrix} 4 \\ 2 \end{bmatrix} . \tag{12-40}
$$

We now continue to consider how a mapping matrix is changed when the basis for the
domain or the codomain is changed.

For two vector spaces V and W with finite dimension the mapping matrix for a linear
map f : V → W can only be constructed when a basis for V and a basis for W are cho-
sen. By using the mapping matrix symbol c Fa we show the foundation to be the pair of
given bases a of V and c of W.

Often one wishes to change the basis of V or the basis of W. In the first case the co-
ordinates for those vectors x ∈ V will change while the coordinates for their images
y = f (x) are unchanged; in the second case it is the other way round with the x coor-
dinates remaining unchanged while the image coordinates change. If the bases of both
V and W are changed then the coordinates for both x and y = f (x) change.

In this section we construct methods for finding the new mapping matrix for f , when
we change the basis for either the domain, the codomain or both. First we show how
a vector’s coordinates change when the basis for the domain is changed (as in detail in
Method 11.40 in eNote 11.)

12.8.1 Change of Basis in the Domain

Figure 12.6: Linear map f : V → W, where the basis for V is changed from the a-basis (a1, ..., an) to the b-basis (b1, ..., bn), while W keeps the c-basis (c1, ..., cm).



In Figure 12.6 a linear map f : V → W is given that, with respect to basis a of V and
basis c of W, has the mapping matrix c Fa . We change the basis for V from basis a to
basis b. The mapping matrix for f now has the symbol c Fb . Let us find it. The equation

y = f (x)

is translated into coordinates and rewritten as:

cy = c Fa a x = c Fa (a Mb b x) = (c Fa a Mb ) b x .

From this we deduce that the mapping matrix for f with respect to the basis b of V and
basis c of W is formed by a matrix product:

c Fb = c Fa a Mb . (12-41)

Example 12.31 Change of a Mapping Matrix

We consider the 3-dimensional vector space V that is treated in Example 12.29 and the 2-
dimensional vector space W that is treated in Example 12.30. A linear map
f : V → W is given by the mapping matrix:
 
$$
{}_{c}F_{a} = \begin{bmatrix} 9 & 12 & 7 \\ 6 & 8 & 5 \end{bmatrix} .
$$

Problem: Determine y = f (x) where x = b1 + 2b2 + 3b3 .

Solution: We try two different ways. 1) We use a-coordinates for x as found in (12-37):

$$
{}_{c}y = {}_{c}F_{a}\, {}_{a}x = \begin{bmatrix} 9 & 12 & 7 \\ 6 & 8 & 5 \end{bmatrix} \begin{bmatrix} -4 \\ 5 \\ -2 \end{bmatrix} = \begin{bmatrix} 10 \\ 6 \end{bmatrix} .
$$

2) We change the mapping matrix for f :

$$
{}_{c}F_{b} = {}_{c}F_{a}\, {}_{a}M_{b} = \begin{bmatrix} 9 & 12 & 7 \\ 6 & 8 & 5 \end{bmatrix} \begin{bmatrix} 1 & 2 & -3 \\ 0 & -2 & 3 \\ -1 & 1 & -1 \end{bmatrix} = \begin{bmatrix} 2 & 1 & 2 \\ 1 & 1 & 1 \end{bmatrix} .
$$

Then we can directly use the given b-coordinates for x :


 
$$
{}_{c}y = {}_{c}F_{b}\, {}_{b}x = \begin{bmatrix} 2 & 1 & 2 \\ 1 & 1 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} = \begin{bmatrix} 10 \\ 6 \end{bmatrix} .
$$

In either case we get y = 10c1 + 6c2 .



12.8.2 Change of Basis in the Codomain

Figure 12.7: Linear map f : V → W, where V keeps the a-basis (a1, ..., an) while the basis for W is changed from the c-basis (c1, ..., cm) to the d-basis (d1, ..., dm).

In Figure 12.7 a linear map f : V → W is given that, with respect to the basis a of V and
basis c of W has a mapping matrix c Fa . We change the basis for W from basis c to basis
d . The mapping matrix for f now has the symbol d Fa . Let us find it. The equation

y = f (x)

is translated into the matrix equation

cy = c Fa a x

that is equivalent to
d Mc c y = d Mc (c Fa a x)
from which we get that
dy = (d Mc c Fa ) a x .

From this we deduce that the mapping matrix for f with respect to the a-basis for V and
the d-basis for W is formed by a matrix product:

d Fa = d Mc c Fa . (12-42)

Example 12.32 Change of Mapping Matrix

We consider the 3-dimensional vector space V that is treated in Example 12.29 and the 2-
dimensional vector space W that is treated in Example 12.30. A linear map
f : V → W is given by the mapping matrix:
 
$$
{}_{c}F_{a} = \begin{bmatrix} 9 & 12 & 7 \\ 6 & 8 & 5 \end{bmatrix} .
$$

Problem: Given the vector x = −4a1 + 5a2 − 2a3 . Determine the image y = f (x) as a linear
combination of d1 and d2 .

Solution: We try two different ways.


1) We use the given mapping matrix:

$$
{}_{c}y = {}_{c}F_{a}\, {}_{a}x = \begin{bmatrix} 9 & 12 & 7 \\ 6 & 8 & 5 \end{bmatrix} \begin{bmatrix} -4 \\ 5 \\ -2 \end{bmatrix} = \begin{bmatrix} 10 \\ 6 \end{bmatrix} .
$$
And translate the result to d-coordinates using (12-40):
$$
{}_{d}y = {}_{d}M_{c}\, {}_{c}y = \begin{bmatrix} 1 & -1 \\ -1 & 2 \end{bmatrix} \begin{bmatrix} 10 \\ 6 \end{bmatrix} = \begin{bmatrix} 4 \\ 2 \end{bmatrix}
$$

2) We change the mapping matrix for f using (12-39):


    
$$
{}_{d}F_{a} = {}_{d}M_{c}\, {}_{c}F_{a} = \begin{bmatrix} 1 & -1 \\ -1 & 2 \end{bmatrix} \begin{bmatrix} 9 & 12 & 7 \\ 6 & 8 & 5 \end{bmatrix} = \begin{bmatrix} 3 & 4 & 2 \\ 3 & 4 & 3 \end{bmatrix} .
$$
Then we can directly read the d-coordinates:
$$
{}_{d}y = {}_{d}F_{a}\, {}_{a}x = \begin{bmatrix} 3 & 4 & 2 \\ 3 & 4 & 3 \end{bmatrix} \begin{bmatrix} -4 \\ 5 \\ -2 \end{bmatrix} = \begin{bmatrix} 4 \\ 2 \end{bmatrix} .
$$

In both cases we get y = 4d1 + 2d2 .

12.8.3 Change of Basis in both the Domain and Codomain

In Figure 12.8 a linear map f : V → W is given that, with respect to the basis a for V
and basis c for W, has the mapping matrix c Fa . We change the basis for V from basis
a to basis b, and for W from basis c to basis d . The mapping matrix for f now has the

Figure 12.8: Linear map f : V → W, where the basis for V is changed from the a-basis to the b-basis and the basis for W is changed from the c-basis to the d-basis.

symbol d Fb . Let us find it. The equation


y = f (x)
corresponds in coordinates to
cy = c Fa a x
that is equivalent to 
d Mc c y = d Mc c Fa (a Mb b x)

from which we obtain


dy = (d Mc c Fa a Mb ) b x .

From here we deduce that the mapping matrix for f with respect to b-basis of V and
d-basis of W is formed by a matrix product:

d Fb = d Mc c Fa a Mb . (12-43)

Example 12.33 Change of Mapping Matrix

We consider the 3-dimensional vector space V that is treated in example 12.29, and the 2-
dimensional vector space W that is treated in example 12.30. A linear map f : V → W is
given by the mapping matrix:
 
$$
{}_{c}F_{a} = \begin{bmatrix} 9 & 12 & 7 \\ 6 & 8 & 5 \end{bmatrix} .
$$

Problem: Given the vector x = b1 + 2b2 + 3b3 . Determine y = f (x) as a linear combination
of d1 and d2 .

Solution: We change the mapping matrix using (12-39) and (12-37):

$$
{}_{d}F_{b} = {}_{d}M_{c}\, {}_{c}F_{a}\, {}_{a}M_{b}
= \begin{bmatrix} 1 & -1 \\ -1 & 2 \end{bmatrix} \begin{bmatrix} 9 & 12 & 7 \\ 6 & 8 & 5 \end{bmatrix} \begin{bmatrix} 1 & 2 & -3 \\ 0 & -2 & 3 \\ -1 & 1 & -1 \end{bmatrix}
= \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix} .
$$
Then we can directly use the given b-coordinates and directly read the d-coordinates:
$$
{}_{d}y = {}_{d}F_{b}\, {}_{b}x = \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} = \begin{bmatrix} 4 \\ 2 \end{bmatrix} .
$$

Conclusion: y = 4d1 + 2d2 .

The change of basis in this example turns out to be rather practical. With the new mapping
matrix d Fb it is much easier to calculate the image vector: You just add the first and the third
coordinates of the given vector and keep the second coordinate!

12.8.4 Summary Concerning Change of Basis

We gather the results concerning change of basis in the subsections above in the follow-
ing method:

Method 12.34 Change of Mapping Matrix when Changing the Basis


For the vector space V are given a basis a = (a1 , a2 , . . . , an ) and a new basis b =
(b1 , b2 , . . . , bn ) . For the vector space W are given a basis c = (c1 , c2 , . . . , cm ) and a
new basis d = (d1 , d2 , . . . , dm ) .

If f is a linear map f : V → W that, with respect to basis a of V and basis c of W,


has the mapping matrix c Fa , then:

1. The mapping matrix for f with respect to basis b of Vand basis c of W is

c Fb = c Fa a Mb . (12-44)

2. The mapping matrix for f with respect to basis a of V and basis d of W is

d Fa = (c Md )−1 c Fa = d Mc c Fa . (12-45)

3. The mapping matrix for f with respect to basis b of V and basis d of W is

d Fb = (c Md )−1 c Fa a Mb = d Mc c Fa a Mb . (12-46)

In the three formulas we have used the change of basis matrices:


   
a Mb = a b1 a b2 · · · a bn and c Md = c d1 c d2 · · · c dm .

eNote 13

Eigenvalues and Eigenvectors

This note introduces the concepts of eigenvalues and eigenvectors for linear maps in arbitrary
general vector spaces and then delves deeply into eigenvalues and eigenvectors of square
matrices. Therefore the note is based on knowledge about general vector spaces, see eNote 11, on
knowledge about algebra with matrices, see eNote 7 and eNote 8, and on knowledge about linear
maps see eNote 12.

Update: 7.10.21 David Brander.

13.1 The Eigenvalue Problem for Linear Maps

13.1.1 Introduction

In this eNote we consider linear maps of the type

f : V → V, (13-1)

that is, linear maps where the domain and the codomain are the same vector space. This
gives rise to a special phenomenon, that a vector can be equal to its image vector:

f (v) = v . (13-2)

Vectors of this type are called fixed points of the map f . More generally we are looking
for eigenvectors, that is vectors that are proportional to their image vectors. In this

connection one talks about the eigenvalue problem: to find a scalar λ and a proper (i.e.
non-zero) vector v satisfying the vector equation:

f (v) = λv . (13-3)

If λ is a scalar and v a proper vector satisfying 13-3 the proportionality factor λ is called
an eigenvalue of f and v an eigenvector corresponding to λ. Let us, for example, take
a linear map f : G3 → G3 , that is, a linear map of the set of space vectors into itself,
mapping three given vectors as shown in Figure 13.1.

Figure 13.1: Three eigenvectors in space and their image vectors.

As hinted in Figure 13.1 f (a) = 2a . Therefore 2 is an eigenvalue of f with correspond-


ing eigenvector a . Furthermore f (b) = −b , so −1 is also an eigenvalue of f with
corresponding eigenvector b . And since finally f (c) = c , 1 is an eigenvalue of f with
corresponding eigenvector c . More specifically c is a fixed point for f .

To solve eigenvalue problems for linear maps is one of the most critical problems in
engineering applications of linear algebra. This is closely connected to the fact that a
linear map whose mapping matrix with respect to a given basis is a diagonal matrix is
particularly simple to comprehend and work with. And here the following nice rule applies: if one chooses a basis consisting of eigenvectors for the map, then the mapping matrix automatically becomes a diagonal matrix.

In the following example we illustrate these points using linear maps in the plane.

Example 13.1 Eigenvalues and Eigenvectors in the Plane

The vector space of vectors in the plane has the symbol G2 (R) . We consider a linear map

f : G2 (R) → G2 (R) (13-4)

of the set of plane vectors into itself, that with respect to a given basis (a1 , a2 ) has the follow-
ing diagonal matrix as its mapping matrix:
 
2 0
a Fa = . (13-5)
0 3

Since

a f (a1) = \begin{pmatrix} 2 & 0 \\ 0 & 3 \end{pmatrix} \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 2 \\ 0 \end{pmatrix} = 2 · \begin{pmatrix} 1 \\ 0 \end{pmatrix}

and

a f (a2) = \begin{pmatrix} 2 & 0 \\ 0 & 3 \end{pmatrix} \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 0 \\ 3 \end{pmatrix} = 3 · \begin{pmatrix} 0 \\ 1 \end{pmatrix}
we have that f (a1 ) = 2a1 and f (a2 ) = 3a2 . Both basis vectors are thus eigenvectors for f , where a1 corresponds to the eigenvalue 2 and a2 corresponds to the eigenvalue 3 . The eigenvalues are the diagonal elements in a Fa .

We now consider an arbitrary vector x = x1 a1 + x2 a2 and find its image vector:


    
a f (x) = \begin{pmatrix} 2 & 0 \\ 0 & 3 \end{pmatrix} \begin{pmatrix} x1 \\ x2 \end{pmatrix} = \begin{pmatrix} 2x1 \\ 3x2 \end{pmatrix} .

By the map the x1 -coordinate is multiplied by the eigenvalue 2, while the x2 -coordinate is
multiplied by the eigenvalue 3. Geometrically this means that through the map all of the
plane “is stretched” first by the factor 2 in the direction a1 and then by the factor 3 in the
direction a2 , see the effect on an arbitrarily chosen vector x in the figure A:

Figure A: The vector x is stretched horizontally by a factor 2 and vertically by a factor 3.


In Figure B we have chosen the standard basis (i, j) and illustrate how the linear map g that has the mapping matrix

e Ge = \begin{pmatrix} 2 & 0 \\ 0 & 3 \end{pmatrix} ,

maps the “blue house” into the “red house” by stretching all position vectors in the blue house by the factor 2 in the horizontal direction and by the factor 3 in the vertical direction.

Figure B: The blue house is stretched in the horizontal direction by the factor 2 and vertically by the factor 3.
We now investigate another map h , the mapping matrix of which, with respect to the standard basis, is not a diagonal matrix:

e He = \begin{pmatrix} 7/3 & 2/3 \\ 1/3 & 8/3 \end{pmatrix} .

Here it is not possible to decide directly whether the map is composed of two stretchings in two given directions. And the mapping of the blue house by h as shown in the figure below does not give a clue directly:

Figure C: House

But it is actually also possible in the case of h to choose a basis consisting of two linearly
independent eigenvectors for h . Let b1 be given by the e-coordinates (2, −1) and b2 by the
e-coordinates (1, 1) . Then we find that
      
e h (b1) = \begin{pmatrix} 7/3 & 2/3 \\ 1/3 & 8/3 \end{pmatrix} \begin{pmatrix} 2 \\ -1 \end{pmatrix} = \begin{pmatrix} 4 \\ -2 \end{pmatrix} = 2 · \begin{pmatrix} 2 \\ -1 \end{pmatrix}

and

e h (b2) = \begin{pmatrix} 7/3 & 2/3 \\ 1/3 & 8/3 \end{pmatrix} \begin{pmatrix} 1 \\ 1 \end{pmatrix} = \begin{pmatrix} 3 \\ 3 \end{pmatrix} = 3 · \begin{pmatrix} 1 \\ 1 \end{pmatrix} .
In other words, h(b1) = 2b1 and h(b2) = 3b2 . We see that b1 and b2 are eigenvectors for h , and when we choose (b1 , b2 ) as basis, the mapping matrix for h with respect to this basis takes the form:

b Hb = \begin{pmatrix} 2 & 0 \\ 0 & 3 \end{pmatrix} .

Surprisingly it thus shows that the mapping matrix for h also can be written in the form (13-5). The map h is also composed of two stretchings with the factors 2 and 3. Only the stretching directions are now determined by the eigenvectors b1 and b2 . This is more evident if we map a new blue house whose principal lines are parallel to the b-basis vectors:

Figure D: The blue house is stretched by the factor 2 and the factor 3, respectively, in the directions of the eigenvectors.

Thus we have illustrated: If you can find two linearly independent eigenvectors for a linear map in the plane it is possible:

1. to write its mapping matrix in diagonal form by choosing the eigenvectors as basis

2. to describe the map as stretchings in the directions of the eigenvectors with the corresponding eigenvalues as stretching factors.
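
The computations of this example are easy to reproduce numerically. Below is a small sketch in Python/NumPy; the variable names H and B are our own, and numpy.linalg.eig may return the eigenvalues in another order and the eigenvectors normalized to length 1:

    import numpy as np

    H = np.array([[7/3, 2/3],
                  [1/3, 8/3]])          # the mapping matrix e He from above

    print(np.linalg.eig(H)[0])          # eigenvalues 2.0 and 3.0 (order may vary)

    B = np.array([[ 2, 1],
                  [-1, 1]])             # columns are the eigenvectors b1 and b2
    print(np.linalg.inv(B) @ H @ B)     # the diagonal matrix diag(2, 3)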

13.1.2 Eigenvalues and their Corresponding Eigenvectors

The eigenvalue problem for a linear map is briefly about answering the question: do any proper vectors, each with its image vector proportional to the vector itself, exist? The short answer is that this cannot be decided in general; it depends on the particular map. In the following we try to pinpoint what can actually be said generally about the eigenvalue problem.

Definition 13.2 Eigenvalue and Eigenvector


Let f : V → V be a linear map of the vector space V into itself. If a proper vector
v ∈ V and a scalar λ exist such that

f (v) = λv , (13-6)

then the proportionality factor λ is called an eigenvalue of f , while v is called an


eigenvector corresponding to λ.

If, in Definition 13.2, it were not required to find a proper vector that satisfies f (v) = λv , then every scalar λ would be an eigenvalue, since f (0) = λ 0 holds for any scalar λ. On the other hand, for a given eigenvalue, it is a matter
of convention whether or not to say that the zero vector is also a correspond-
ing eigenvector. Most commonly, the zero vector is not considered to be an
eigenvector.

The number 0 can be an eigenvalue. This is so if a proper vector v exists such


that f (v) = 0 , since we then have f (v) = 0v .

If a linear map f has one eigenvector v , then it has infinitely many eigenvectors. This
is a simple consequence of the following theorem.

Theorem 13.3 Eigenspace


If λ is an eigenvalue of a linear map f : V → V , denote by Eλ the set Eλ := {v ∈ V | f (v) = λv}. Then Eλ is a vector subspace of V.

Proof

Let f : V → V be a linear map of the vector space V into itself, and assume that λ is an
eigenvalue of f . Obviously Eλ is not empty, since it contains the zero vector. We shall show
that it satisfies the two stability requirements for subspaces, see Theorem 11.42. Let k be an arbitrary scalar, and let u and v be two arbitrary elements of Eλ . Then the following is valid:
f (u + v) = f (u) + f (v) = λu + λv = λ(u + v) .
Thus the vector sum u + v ∈ Eλ and thus we have shown that Eλ satisfies the stability re-
quirement with respect to addition. Furthermore the following is valid:

f (ku) = k f (u) = k (λu) = λ(ku) .

Thus we have shown stability with respect to multiplication by a scalar. Altogether we have
shown that Eλ is a subspace of the domain.

Theorem 13.3 yields the following definition:

Definition 13.4 Eigenvector Space


Let f : V → V be a linear map of the vector space V to itself, and let λ be an
eigenvalue of f .

By the eigenvector space (or in short the eigenspace) Eλ corresponding to λ we under-


stand the subspace:
Eλ = {v ∈ V | f (v) = λv } .
If Eλ is finite-dimensional, dim( Eλ ) is called the geometric multiplicity of λ , denoted
gm(λ).

In the following example we consider a linear map that has two eigenvalues, both with
the geometric multiplicity 1 .

Example 13.5 Eigenspace for Reflection

In the plane a straight line m through the origin is drawn. By s we denote the linear map that maps a vector v, drawn from the origin, to its reflection s(v) in m :

The eigenvalue problem for the reflection in m .

Let a be an arbitrary proper vector that lies on m. Since

s (a) = a = 1 · a

1 is an eigenvalue of s . The eigenspace E1 is the set of vectors that lie on m .

We now draw a straight line n through the origin, perpendicular to m . Let b be an arbitrary
proper vector lying on n. Since

s(b) = −b = (−1) · b ,

−1 is an eigenvalue of s . The eigenspace E−1 is the set of vectors that lie on n .

That not all linear maps have eigenvalues and thus eigenvectors is evident from the
following example.

Example 13.6

Let us investigate the eigenvalue problem for the linear map f : G2 → G2 that to every proper
vector v in the plane assigns its hat vector:

f (v) = v̂ .

Since a proper vector v never can be proportional (parallel) to its hat vector, then for any
scalar λ we have
v̂ ≠ λv .
Therefore eigenvalues and eigenvectors for f do not exist.

From the following exercise we see that the dimension of an eigenspace can be greater
than 1.

Exercise 13.7

In space an ordinary (O, i, j, k)-coordinate system is given. All vectors are drawn from the
origin. The map p projects vectors down onto the ( X, Y )-plane in space:

Eigenvalue problem for the projection down onto the ( X, Y )-plane.

It is shown in Exercise 12.28 that p is linear. Determine all eigenvalues and the eigenspaces
that correspond to the eigenvalues, solely by mental calculation (ponder).

Example 13.8 The Eigenvalue Problem for Differentiation

We consider the linear map f : C∞(R) → C∞(R) given by

f (x(t)) = x′(t) .

Let λ be an arbitrary scalar. Since

f (e^{λt}) = λ e^{λt} ,

λ is an eigenvalue of f and e^{λt} is an eigenvector that corresponds to λ .

Since all solutions to the differential equation

x′(t) = λx(t)

are given by k · e^{λt} , where k is an arbitrary real number, the eigenspace corresponding to λ is determined by

Eλ = { k · e^{λt} | k ∈ R } .


13.1.3 Theoretical Points

The following corollary gives an important result for linear maps of a vector space into
itself. It is valid even if the vector space considered is of infinite dimension.

Corollary 13.9
Let f : V → V be a linear map of a vector space V into itself, and assume

1. that f has a series of eigenvalues with corresponding eigenspaces,

2. that some of the eigenspaces are chosen, and within each of the chosen
eigenspaces some linearly independent vectors are chosen,

3. and that all the so chosen vectors are consolidated in a single set of vectors v .

Then v is a linearly independent set of vectors.



Proof

Let f : V → V be a linear map, and let v be a set of vectors that are put together according to
points 1. to 3. in Corollary 13.9. We shall prove that v is linearly independent. The flow of
the proof is that we assume the opposite, that is, v is linearly dependent, and show that this
leads to a contradiction.

First we delete vectors from v to get a basis for span{v}. There must be at least one vector in v that does not belong to the basis. We choose one of these, let us call it x. Now we write x as a linear combination of the basis vectors; in doing so we leave out the trivial terms, i.e. those with the coefficient 0:
x = k 1 v1 + · · · + k m v m (13-7)
We denote the eigenvalue that corresponds to x by λ, and the eigenvalue corresponding to vi by λi .
From (13-7) we can obtain an expression for λx in two different ways, partly by multiplying
(13-7) by λ, partly by finding the image by f of the right and left hand side in (13-7):

λx = λk1 v1 + · · · + λk m vm
λx = λ1 k1 v1 + · · · + λm k m vm

Subtracting the lower from the upper equation yields:

0 = k1 (λ − λ1 )v1 + · · · + k m (λ − λm )vm . (13-8)

If all the coefficients of the vectors on the right hand side of (13-8) are equal to zero, then λ = λi for all i = 1, 2, . . . , m. But then x and all the basis vectors vi are chosen from the same eigenspace, and therefore they should collectively be linearly independent, since this is how they are chosen. This contradicts that x is a linear combination of the basis vectors.

Therefore at least one of the coefficients in (13-8) must be different from 0. But then the zero
vector is written as a proper linear combination of the basis vectors. This contradicts the
requirement that a basis is linearly independent.

Conclusion: the assumption that v is a linearly dependent set of vectors necessarily leads to a contradiction. Therefore v is linearly independent.

Example 13.10 The Linear Independence of Eigenvectors

A linear map f : V → V has three eigenvalues λ1 , λ2 and λ3 that have the geometric multi-
plicities 2, 1 and 3 , respectively. The set of vectors (a1 , a2 ) is a basis for Eλ1 , (b) is a basis for

Eλ2 , and (c1 , c2 , c3 ) is a basis for Eλ3 . Then it follows from corollary 13.9 that any selection of
the six basis vectors is a linearly independent set of vectors.

Corollary 13.9 is useful because it leads directly to the following important results:

Theorem 13.11 General Properties


Let V be a vector space with dim(V ) = n , and let f : V → V be a linear map of V
into itself. Then:

1. Proper eigenvectors that correspond to different eigenvalues for f , are linearly


independent.

2. f can at the most have n different eigenvalues.

3. If f has n different eigenvalues, then a basis for V exists consisting of eigen-


vectors for f .

4. The sum of the geometric multiplicities of eigenvalues for f can at the most be
n.

5. If and only if the sum of the geometric multiplicities of the eigenvalues for f is
equal to n, a basis for V exists consisting of eigenvectors for f .

Exercise 13.12

The first point in Theorem 13.11 is a simple special case of Corollary 13.9 and therefore follows directly
from the corollary. The second point can be proved like this:

Assume that a linear map has k different eigenvalues. We choose a proper vector from each of the k
eigenspaces. The set of the k chosen vectors is then (in accordance with the corollary 13.9) linearly
independent, and k must therefore be less than or equal to the dimension of the vector space (see
Corollary 11.21).

Similarly, show how the last three points in Theorem 13.11 follow from Corollary 13.9.

Motivated by Theorem 13.11 we introduce the concept eigenbasis:



Definition 13.13 Eigenvector basis


Let f : V → V be a linear map of a finite-dimensional vector space V into itself.

By an eigenvector basis, or in short eigenbasis, for V with respect to f we understand a


basis consisting of eigenvectors for f .

Now we can present this subsection’s main result:

Theorem 13.14 Main Theorem


Let f : V → V be a linear map of an n-dimensional vector space V into itself, and
let v = (v1 , . . . vn ) be a basis for V . Then:

1. The mapping matrix v Fv for f with respect to v is a diagonal matrix if and only if
v is an eigenbasis for V with respect to f .

2. Assume that v is an eigenbasis for V with respect to f . Let Λ denote the diagonal
matrix that is the mapping matrix for f with respect to v . The order of the diago-
nal elements in Λ is then determined from the basis like this: The basis vector vi
corresponds to the eigenvalue λi that is in the i’th column in Λ .

The proof of this theorem can be found in eNote 14 (see Theorem 14.7).

Example 13.15 Diagonal Matrix for Reflection

Let us again consider the situation in example 13.5, where we considered the map s that
reflects vectors drawn from the origin in the line m :

Reflection about m.

We found that a is an eigenvector that corresponds to the eigenvalue 1 , and that b is an eigenvector that corresponds to the eigenvalue −1 . Since the plane has the dimension 2 it follows from Theorem 13.14 that if we choose the basis (a, b) , then s has the following mapping matrix with respect to this basis:

\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} .

Example 13.16 Linear Maps without Eigenvalues

In Example 13.6 we found that the map, which maps a vector in the plane onto its hat vector, has no eigenvalues. Therefore there is no eigenbasis for the map, and therefore the map cannot be described by a diagonal mapping matrix.

Example 13.17 Diagonalisation of a Complex Map

Let f : C2 → C2 be a linear map that satisfies


f (z1 , z2 ) = (−z2 , z1 ) .

Since:
f (i, 1) = (−1, i ) = i (i, 1) and f (−i, 1) = (−1, −i ) = (−i )(−i, 1) ,
it is seen that i is an eigenvalue of f with a corresponding eigenvector (i, 1), and that −i is an
eigenvalue of f with a corresponding eigenvector (−i, 1) .

Since (i, 1) and (−i, 1) are linearly independent, ((i, 1), (−i, 1)) is an eigenbasis for C2 with respect to f . The mapping matrix for f with respect to this basis is in accordance with Theorem 13.14

\begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix} .

Exercise 13.18

Consider once more the situation in Exercise 13.7. Choose two different eigenbases (bases
consisting of eigenvectors for p ), and determine in each of the two cases the diagonal matrix
that will become the mapping matrix for p with respect to the chosen basis.

13.2 The Eigenvalue Problem for Square Matrices

When a linear map f : V → V maps an n-dimensional vector space V into the vector space itself, the mapping matrix for f with respect to an arbitrarily chosen basis a becomes a square matrix. The eigenvalue problem f (v) = λv is equivalent to the matrix equation:

a Fa · a v = λ · a v . (13-9)
Thus we can formulate an eigenvalue problem for square matrices generally, that is
without necessarily having to think about a square matrix as a mapping matrix. We will
standardize the method, when eigenvalues and eigenvectors for square matrices are to
be determined. At the same time, due to (13-9), we get methods for finding eigenvalues
and eigenvectors for all linear maps of a vector space into itself, that can be described
by mapping matrices.

First we define what is to be understood by the eigenvalue problem for a square matrix.

Definition 13.19 The Eigenvalue Problem for Matrices


Solving the eigenvalue problem for a square real n × n-matrix A means to find a
scalar λ and a proper vector v = (v1 , ... , vn ) satisfying the equation:

A v = λv . (13-10)

If this equation is satisfied for a pair of λ and v ≠ 0, then λ is termed an eigenvalue of A and v an eigenvector of A corresponding to λ.

Example 13.20 The Eigenvalue Problem for a Square Matrix

We wish to investigate whether v1 = (2, 3), v2 = (4, 4) and v3 = (2, −1) are eigenvectors for
A given by
 
A = \begin{pmatrix} 4 & -2 \\ 3 & -1 \end{pmatrix} (13-11)
For this we write the eigenvalue problem, as stated in Definition 13.19.
    
Av1 = \begin{pmatrix} 4 & -2 \\ 3 & -1 \end{pmatrix} \begin{pmatrix} 2 \\ 3 \end{pmatrix} = \begin{pmatrix} 2 \\ 3 \end{pmatrix} = 1 · v1

Av2 = \begin{pmatrix} 4 & -2 \\ 3 & -1 \end{pmatrix} \begin{pmatrix} 4 \\ 4 \end{pmatrix} = \begin{pmatrix} 8 \\ 8 \end{pmatrix} = 2 · v2 (13-12)

Av3 = \begin{pmatrix} 4 & -2 \\ 3 & -1 \end{pmatrix} \begin{pmatrix} 2 \\ -1 \end{pmatrix} = \begin{pmatrix} 10 \\ 7 \end{pmatrix} ≠ λ · v3 .

From this we see that v1 and v2 are eigenvectors for A. v1 corresponding to the eigenvalue 1,
and v2 corresponding to the eigenvalue 2.

Furthermore we see that v3 is not an eigenvector for A.
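
Such a check is easily automated. Here is a small sketch in Python/NumPy; the helper function is_eigenvector is our own and not a library routine:

    import numpy as np

    def is_eigenvector(A, v, tol=1e-12):
        """Return the eigenvalue if v is a proper eigenvector of A, otherwise None."""
        w = A @ v
        i = np.argmax(np.abs(v))          # a coordinate of v that is non-zero
        lam = w[i] / v[i]                 # candidate proportionality factor
        return lam if np.allclose(w, lam * v, atol=tol) else None

    A = np.array([[4, -2],
                  [3, -1]])
    for v in ([2, 3], [4, 4], [2, -1]):
        print(v, is_eigenvector(A, np.array(v, dtype=float)))
    # [2, 3] -> 1.0,  [4, 4] -> 2.0,  [2, -1] -> None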

Example 13.21 The Eigenvalue Problem for a Square Matrix

Given the matrix

A = \begin{pmatrix} 2 & -2 \\ -2 & 2 \end{pmatrix} .

Since

\begin{pmatrix} 2 & -2 \\ -2 & 2 \end{pmatrix} \begin{pmatrix} 1 \\ 1 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix} = 0 · \begin{pmatrix} 1 \\ 1 \end{pmatrix} ,
0 is an eigenvalue of A and (1, 1) an eigenvector for A corresponding to the eigenvalue 0.

Example 13.22 The Eigenvalue Problem for a Square Matrix

Given the matrix

A = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} .

Since

\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} \begin{pmatrix} -i \\ 1 \end{pmatrix} = \begin{pmatrix} 1 \\ i \end{pmatrix} = i \begin{pmatrix} -i \\ 1 \end{pmatrix} ,
i is a complex eigenvalue of A and (−i, 1) is a complex eigenvector for A corresponding to
the eigenvalue i.

For the use in the following investigations we make some important comments to Defi-
nition 13.19 .

First we note that even if the square matrix A in Definition 13.19 is real, one is often
interested not only in the real solutions to (13-10), but more generally complex solutions.
In other words we seek a scalar λ ∈ C and a vector v ∈ Cn , satisfying (13-10).

Therefore it can be convenient to regard the left-hand side of (13-10) as a map f : Cn →


Cn given by:
f (v) = A v .
This map is linear: let u ∈ Cn , v ∈ Cn and k ∈ C ; then according to the usual arithmetic rules for matrices

1. f (u + v) = A (u + v) = A u + A v

2. f ( k u) = A( k u) = k (A u)
By this the linearity is established. Since the eigenvalue problem f (v) = λv in this case
is identical to the eigenvalue problem Av = λv , we can conclude that results obtained
in Section 13.1 for the eigenvalue problem in general can be transferred directly to
the eigenvalue problem for matrices. Thus let us immediately characterize the set of
eigenvectors that correspond to a given eigenvalue of a square, real matrix, compare
with Theorem 13.3.

Theorem 13.23 Subspaces of Eigenvectors


Let λ be a real or complex eigenvalue of a real n × n-matrix A. Then the set of
complex eigenvectors for A corresponding to λ, is a subspace in Cn .

If one is only interested in real solutions to the eigenvalue problem for real square ma-
trices, one can alternatively see the left hand side of (13-10) as a real map f : Rn → Rn
given by:
f (v) = A v .
Of course, this map is linear, too. We get the following version of Theorem 13.23:

Theorem 13.24 Subspaces of Eigenvectors


Let λ be a real eigenvalue of a real n × n-matrix A. Then the set of real eigenvectors
for A corresponding to λ, is a subspace in Rn .

In the light of Theorem 13.23 and Theorem 13.24 we now introduce the concept eigen-
vector space, compare with Definition 13.4.

Definition 13.25 The Eigenvector Space


Let A be a square, real matrix, and let λ be an eigenvalue of A .

The subspace of all the eigenvectors that correspond to λ is termed the eigenvector space (or in short the eigenspace) corresponding to λ and is denoted Eλ .

Now we have sketched the structural framework for the eigenvalue problem for square
matrices, and we continue in the following two subsections by investigating in an ele-
mentary way, how one can begin to find eigenvalues and eigenvectors for square matri-
ces.

13.2.1 To Find the Eigenvalues for a Square Matrix

We wish to determine the eigenvalues that correspond to a real n × n matrix A . The


starting point is the equation
Av = λv . (13-13)

First we move λv to the left-hand side of the equality sign, and then v “is placed outside a pair of brackets”. This is possible because v = E v where E is the identity matrix:

Av = λv ⇔ Av − λ(Ev) = Av − (λE)v = 0 ⇔ (A − λE)v = 0 . (13-14)

The last equation in (13-14) corresponds to a homogeneous system of linear equations


consisting of n equations in the n unknowns v1 , ..., vn , that are the elements in v =
(v1 , ..., vn ) . However, it is not possible to solve the system of equations directly, pre-
cisely because we do not know λ. We have to continue the work with the coefficient
matrix of the system of equations. We give this matrix a special symbol:

KA (λ) = A − λE ,

which is called the characteristic matrix of A.

Since it is a homogeneous system of linear equations that we have to solve we have two
possibilities for the structure of the solution. Either the characteristic matrix is invertible,
and then the only solution is v = 0 . Or the matrix is singular, and then infinitely many
solutions v exist. But since Definition 13.19 requires that v must be a proper vector, that
is a vector different from the zero vector, the characteristic matrix must be singular. To
investigate whether this is true, we take the determinant of the square matrix. This is
zero exactly when the matrix is singular:

det(A − λE) = 0 . (13-15)

Note that the left hand side in (13-15) is a polynomial in the variable λ. The polynomial
is given a special symbol:

KA (λ) = det(A − λE) = det(KA (λ))

and is termed the characteristic polynomial of A .

The equation that results when the characteristic polynomial is set equal to zero

KA (λ) = det(A − λE) = det(KA (λ)) = 0

is termed the characteristic equation of A.

By the use of the method for calculating the determinant we see that the characteristic
polynomial is always an n’th degree polynomial. See also the following examples. The
main point is that the roots of the characteristic polynomial (solutions to the characteristic equation) are the eigenvalues of the matrix, because the eigenvalues precisely satisfy
that the characteristic matrix is singular.

It is also common to define the characteristic matrix as λE − A, since the homogeneous equation for this matrix has the same solutions, and the solutions to the corresponding characteristic equation det(λE − A) = 0 are also the same. But note that det(λE − A) = (−1)^n det(A − λE).

Example 13.26 Eigenvalues for 2 × 2 Matrices

Given two matrices A and B:


   
A = \begin{pmatrix} 4 & -2 \\ 3 & -1 \end{pmatrix} and B = \begin{pmatrix} -1 & 4 \\ -2 & 3 \end{pmatrix} . (13-16)
We wish to determine the eigenvalues for A and B.

First we consider A. Its characteristic matrix reads:


     
KA (λ) = A − λE = \begin{pmatrix} 4 & -2 \\ 3 & -1 \end{pmatrix} − \begin{pmatrix} λ & 0 \\ 0 & λ \end{pmatrix} = \begin{pmatrix} 4−λ & -2 \\ 3 & −1−λ \end{pmatrix} . (13-17)
Now we determine the characteristic polynomial:
 
KA (λ) = det(KA (λ)) = det \begin{pmatrix} 4−λ & -2 \\ 3 & −1−λ \end{pmatrix} = (4 − λ)(−1 − λ) − (−2) · 3 = λ² − 3λ + 2 . (13-18)
The polynomial as expected has the degree 2 . The characteristic equation can be written and
the solutions determined:
KA (λ) = 0 ⇔ λ² − 3λ + 2 = 0 ⇔ λ = 1 or λ = 2 . (13-19)
Thus A has two eigenvalues: λ1 = 1 and λ2 = 2 .

The same technique is used for the determination of possible eigenvalues of B.


     
KB (λ) = B − λE = \begin{pmatrix} -1 & 4 \\ -2 & 3 \end{pmatrix} − \begin{pmatrix} λ & 0 \\ 0 & λ \end{pmatrix} = \begin{pmatrix} −1−λ & 4 \\ -2 & 3−λ \end{pmatrix}

KB (λ) = det(KB (λ)) = det \begin{pmatrix} −1−λ & 4 \\ -2 & 3−λ \end{pmatrix} = (−1 − λ)(3 − λ) − 4 · (−2) = λ² − 2λ + 5 . (13-20)
In this case there are no real solutions to KB (λ) = 0, because the discriminant d = (−2)² − 4 · 1 · 5 = −16 < 0, and therefore B has no real eigenvalues. But it has two complex eigenvalues. We use the complex “toolbox”: The discriminant can be rewritten as d = (4i)² , which gives the two complex solutions

λ = (2 ± 4i)/2 ⇔ λ = 1 + 2i and λ̄ = 1 − 2i (13-21)

Thus B has two complex eigenvalues: λ1 = 1 + 2i and λ2 = 1 − 2i .
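
These calculations can be checked numerically; a sketch in Python/NumPy (numpy.poly returns the coefficients of the characteristic polynomial and numpy.linalg.eigvals its roots, also when they are complex; the printed order may vary):

    import numpy as np

    A = np.array([[4, -2],
                  [3, -1]])
    B = np.array([[-1, 4],
                  [-2, 3]])

    print(np.poly(A))             # [ 1. -3.  2.]   i.e.  lambda^2 - 3 lambda + 2
    print(np.linalg.eigvals(A))   # the real eigenvalues 1 and 2

    print(np.poly(B))             # [ 1. -2.  5.]   i.e.  lambda^2 - 2 lambda + 5
    print(np.linalg.eigvals(B))   # the complex pair 1 + 2i and 1 - 2i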

In the following theorem the conclusions of this subsection are summarized.

Theorem 13.27 The Characteristic Polynomial


For the square real n × n-matrix A consider

1. The characteristic matrix KA (λ) = A − λE .

2. The characteristic polynomial KA (λ) = det(KA (λ)) = det(A − λE) .

3. The characteristic equation KA (λ) = 0 .

Then:

1. The characteristic polynomial is an n’th degree polynomial with the variable λ ,


and similarly the characteristic equation is an n’th degree equation with the un-
known λ .

2. The roots of the characteristic polynomial (the solutions to the characteristic equa-
tion) are all the eigenvalues of A .

13.2.2 To Find the Eigenvectors of a Square Matrix

After the eigenvalues of a real n × n matrix A are determined, it is possible to determine


the corresponding eigenvectors. The procedure starts with the equation

(A − λE)v = 0 , (13-22)

that was achieved in (13-14). Since the eigenvalues are now known, the homogeneous
system of linear equations corresponding to (13-22) can be solved with respect to the n
unknowns v1 , ..., vn that are the elements in v = (v1 , ..., vn ) . We just have to substitute
the eigenvalues one at a time. As mentioned above, the characteristic matrix is singular
when the substituted λ is an eigenvalue. Therefore infinitely many solutions to the
system of equations exist. Finding these corresponds to finding all eigenvectors v that
correspond to λ .

In the following method we summarize the problem of determining eigenvalues and


the corresponding eigenvectors of a square matrix.

Method 13.28 Determination of Eigenvectors


All (real or complex) eigenvalues λ for the square matrix A are found as the solutions
to the characteristic equation of A:

KA (λ) = 0 ⇔ det(A − λE) = 0 . (13-23)

Then the eigenvectors v corresponding to each of the eigenvalues λ can be deter-


mined. They are the solutions to the following system of linear equations

(A − λE)v = 0 , (13-24)

when the eigenvalue λ is substituted. E is the identity matrix.

Method 13.28 is unfolded in the following three examples that also show a way of char-
acterizing the set of eigenvectors corresponding to a given eigenvalue, in the light of
Theorem 13.23 and Theorem 13.24.

Example 13.29 Eigenvectors Belonging to Given Eigenvalues

Given the square matrix


 
A = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix} . (13-25)
We wish to determine eigenvalues and eigenvectors of A and use Method 13.28. First the characteristic matrix is found:

KA (λ) = A − λE = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix} − \begin{pmatrix} λ & 0 \\ 0 & λ \end{pmatrix} = \begin{pmatrix} 2−λ & 1 \\ 1 & 2−λ \end{pmatrix} (13-26)
Then the characteristic polynomial is formed:
KA (λ) = det(A − λE) = det \begin{pmatrix} 2−λ & 1 \\ 1 & 2−λ \end{pmatrix} = (2 − λ)(2 − λ) − 1 · 1 = λ² − 4λ + 3 . (13-27)

The characteristic equation, that is λ² − 4λ + 3 = 0, has the solutions λ1 = 1 and λ2 = 3, which are all the real eigenvalues of A.

In order to determine the eigenvectors corresponding to λ1 , it is substituted into (A − λE)v =


0, and then we solve the system of linear equations with the augmented matrix:
 
T = [A − λ1 E | 0 ] = \begin{pmatrix} 2−1 & 1 & 0 \\ 1 & 2−1 & 0 \end{pmatrix} . (13-28)

By Gauss-Jordan elimination we get


 
rref(T) = \begin{pmatrix} 1 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix} (13-29)
Thus there are infinitely many solutions v = (v1 , v2 ), since there is only one non-trivial equa-
tion: v1 + v2 = 0. If we are only looking for one proper eigenvector corresponding to the
eigenvalue λ1 , we can put v2 equal to 1, and we get the eigenvector v1 = (−1, 1) . All real
eigenvectors corresponding to λ1 can then be written as
 
v = t · \begin{pmatrix} -1 \\ 1 \end{pmatrix} , t ∈ R. (13-30)

This is a one-dimensional subspace in R2 , viz. the eigenspace that corresponds to the eigenvalue 1, which can also be written like this:

E1 = span{(−1, 1)} . (13-31)

Now λ2 is substituted in (A − λE)v = 0, and we then solve the corresponding system of


linear equations that has the augmented matrix
 
T = [A − λ2 E | 0 ] = \begin{pmatrix} 2−3 & 1 & 0 \\ 1 & 2−3 & 0 \end{pmatrix} . (13-32)
By Gauss-Jordan elimination we get
 
rref(T) = \begin{pmatrix} 1 & -1 & 0 \\ 0 & 0 & 0 \end{pmatrix} . (13-33)

From this we see that v2 = (1, 1) is an eigenvector corresponding to the eigenvalue λ2 . All
real eigenvectors corresponding to λ2 can be written as
 
v = t · \begin{pmatrix} 1 \\ 1 \end{pmatrix} , t ∈ R. (13-34)
This is a one-dimensional subspace in R2 that can also be written as:
E3 = span{(1, 1)} . (13-35)
We will now check our understanding: When v1 = (−1, 1) is mapped by A, will the image
vector only be a scaling (change of length) of v1 ?
    
Av1 = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix} \begin{pmatrix} -1 \\ 1 \end{pmatrix} = \begin{pmatrix} -1 \\ 1 \end{pmatrix} = v1 . (13-36)
It is true! It is also obvious that the eigenvalue is 1 .

Now we check v2 :

Av2 = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix} \begin{pmatrix} 1 \\ 1 \end{pmatrix} = \begin{pmatrix} 3 \\ 3 \end{pmatrix} = 3 · v2 . (13-37)
v2 is also as expected an eigenvector and the eigenvalue is 3.
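
The whole procedure of Method 13.28 is what numpy.linalg.eig carries out numerically. A sketch for the matrix of this example (the returned eigenvectors are normalized to length 1, so they are scalar multiples of the eigenvectors (−1, 1) and (1, 1) found above):

    import numpy as np

    A = np.array([[2, 1],
                  [1, 2]])
    vals, vecs = np.linalg.eig(A)
    print(vals)                    # the eigenvalues 3 and 1 (order may vary)
    print(vecs)                    # columns are unit eigenvectors

    # check A v = lambda v for each column
    for lam, v in zip(vals, vecs.T):
        print(np.allclose(A @ v, lam * v))   # True, True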

Example 13.30 Complex Eigenvalues and Eigenvectors

In Example 13.26 a matrix B is given


 
B = \begin{pmatrix} -1 & 4 \\ -2 & 3 \end{pmatrix} (13-38)

that has no real eigenvalues. But we found two complex eigenvalues, λ1 = 1 + 2i and λ2 =
1 − 2i .

We substitute λ1 in (B − λE)v = 0 and then we solve the corresponding system of linear


equations that has the augmented matrix
 
T = [B − λ1 E | 0 ] = \begin{pmatrix} −1−(1+2i) & 4 & 0 \\ -2 & 3−(1+2i) & 0 \end{pmatrix} (13-39)

By Gauss-Jordan elimination we get


 
rref(T) = \begin{pmatrix} 1 & −1+i & 0 \\ 0 & 0 & 0 \end{pmatrix} (13-40)

This corresponds to one non-trivial equation v1 + (−1 + i )v2 = 0, and if we put v2 = s, we


see that all the complex eigenvectors corresponding to λ1 are given by
 
v = s · \begin{pmatrix} 1−i \\ 1 \end{pmatrix} , s ∈ C. (13-41)

This is a one-dimensional subspace in C2 , viz. the eigenspace corresponding to the eigen-


value 1 + 2i which we also can state like this:

E1+2i = span{(1 − i, 1)} . (13-42)

Similarly all complex solutions corresponding to λ2 are given by


 
v = s · \begin{pmatrix} 1+i \\ 1 \end{pmatrix} , s ∈ C. (13-43)

This is a one-dimensional subspace in C2 which we also can state like this:

E1−2i = span{(1 + i, 1)} . (13-44)



In the following example we find eigenvalues and corresponding eigenspaces for a 3 × 3-matrix. It turns out that in this case one of the eigenvalues has a two-dimensional eigenspace.

Example 13.31 Eigenvalue with Multiplicity 2

Given the matrix A:

A = \begin{pmatrix} 6 & 3 & 12 \\ 4 & -5 & 4 \\ -4 & -1 & -10 \end{pmatrix} (13-45)
First we wish to determine the eigenvalues of A and use Method 13.28.
det \begin{pmatrix} 6−λ & 3 & 12 \\ 4 & −5−λ & 4 \\ -4 & -1 & −10−λ \end{pmatrix} = −λ³ − 9λ² + 108 = −(λ − 3)(λ + 6)² = 0 (13-46)
From the last factorization it is seen that A has two different eigenvalues. The eigenvalue
λ1 = −6 is a double root in the characteristic equation, while the eigenvalue λ2 = 3 is a
single root.

Now we determine the eigenspace corresponding to λ1 = −6, see Theorem 13.23:


\begin{pmatrix} 6−(−6) & 3 & 12 & 0 \\ 4 & −5−(−6) & 4 & 0 \\ -4 & -1 & −10−(−6) & 0 \end{pmatrix} →
\begin{pmatrix} 12 & 3 & 12 & 0 \\ 4 & 1 & 4 & 0 \\ -4 & -1 & -4 & 0 \end{pmatrix} → \begin{pmatrix} 4 & 1 & 4 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} (13-47)
Here there is only one non-trivial equation: 4x1 + x2 + 4x3 = 0. If we put x1 and x3 equal to the two free parameters s and t, all real eigenvectors corresponding to the eigenvalue −6 are given by:

x = \begin{pmatrix} x1 \\ x2 \\ x3 \end{pmatrix} = s · \begin{pmatrix} 1 \\ -4 \\ 0 \end{pmatrix} + t · \begin{pmatrix} 0 \\ -4 \\ 1 \end{pmatrix} , s, t ∈ R . (13-48)
This is a two-dimensional subspace in R3 which can also be stated like this:
E−6 = span{(1, −4, 0), (0, −4, 1)} . (13-49)

It is thus possible to find two linearly independent eigenvectors corresponding to λ1 . What


about the number of linearly independent eigenvectors for λ2 = 3?
\begin{pmatrix} 6−3 & 3 & 12 & 0 \\ 4 & −5−3 & 4 & 0 \\ -4 & -1 & −10−3 & 0 \end{pmatrix} →
\begin{pmatrix} 3 & 3 & 12 & 0 \\ 4 & -8 & 4 & 0 \\ -4 & -1 & -13 & 0 \end{pmatrix} → \begin{pmatrix} 1 & 1 & 4 & 0 \\ 0 & -3 & -3 & 0 \\ 0 & 3 & 3 & 0 \end{pmatrix} → \begin{pmatrix} 1 & 1 & 4 & 0 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} (13-50)

Here are two non-trivial equations: x1 + x2 + 4x3 = 0 and x2 + x3 = 0. If we put x3 = s equal


to the free parameter, then all real eigenvectors corresponding to the eigenvalue 3 are given
by
x = \begin{pmatrix} x1 \\ x2 \\ x3 \end{pmatrix} = s · \begin{pmatrix} -3 \\ -1 \\ 1 \end{pmatrix} , s ∈ R . (13-51)
This is a one-dimensional subspace in R3 that can also be stated like this:

E3 = span{(−3, −1, 1)} . (13-52)

Thus it is only possible to find one linearly independent eigenvector corresponding to λ2 .
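
The dimensions of the eigenspaces can also be read off numerically, since dim Eλ = n − rank(A − λE); a sketch in Python/NumPy for the matrix of this example:

    import numpy as np

    A = np.array([[ 6,  3,  12],
                  [ 4, -5,   4],
                  [-4, -1, -10]])

    for lam in (-6, 3):
        dim = 3 - np.linalg.matrix_rank(A - lam * np.eye(3))
        print(lam, dim)            # -6 -> 2  and  3 -> 1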

Exercise 13.32

Given the square matrix


A = \begin{pmatrix} 5 & -4 & 4 \\ 0 & -1 & 6 \\ 0 & 1 & 4 \end{pmatrix} . (13-53)

1. Determine all eigenvalues of A.

2. Determine for each of the eigenvalues the corresponding eigenspace.

3. State at least 3 eigenvectors (not necessarily linearly independent) corresponding to


each eigenvalue.

13.2.3 Algebraic and Geometric Multiplicity

As is evident from Example 13.31 it is important to pay attention to whether an eigen-


value is a single root or a multiple root of the characteristic equation of a square real
matrix and to the dimension of the corresponding eigenspace. In this subsection we
investigate the relation between the two phenomena. This gives rise to the following
definitions.

Definition 13.33 Algebraic and Geometric Multiplicity


Let A be a square, real matrix, and let λ be an eigenvalue of A .

1. λ is said to have the algebraic multiplicity n when λ is an n-fold root of the characteristic equation of the square matrix A. This is termed am(λ) = n.

2. λ is said to have the geometric multiplicity m when the dimension of the eigen-
vector space corresponding to λ is m. This is termed gm(λ) = m. In other words:
dim( Eλ ) = gm(λ) .

We do not always have am(λ) = gm(λ) . This is dealt with in Theorem 13.34.

The following theorem has some important properties concerning algebraic and geo-
metric multiplicity of eigenvalues of square matrices, cf. Theorem 13.11.

Theorem 13.34 Properties of Multiplicities


Given a real n × n-matrix A .

1. A has at the most n different real eigenvalues, and also the sum of algebraic
multiplicities of the real eigenvalues is at the most n .

2. A has at the most n different complex eigenvalues, but the sum of the algebraic
multiplicities of the complex eigenvalues is equal to n .

3. If λ is a real or complex eigenvalue of A, then:

1 ≤ gm(λ) ≤ am(λ) ≤ n (13-54)

That is, the geometric multiplicity of an eigenvalue will at the least be equal to
1, it will be less than or equal to the algebraic multiplicity of the eigenvalue,
which in turn will be less than or equal to the number of rows and columns in
A.

Exercise 13.35

Check that all three points in Theorem 13.34 are valid for the eigenvalues and eigenvectors
in example 13.31.

Let us comment upon Theorem 13.34:

Points 1 and 2 follow directly from the theory of polynomials. The characteristic poly-
nomial for a real n × n-matrix A is an n’th degree polynomial, and it has at the most n
different roots, counting both real and complex ones. Furthermore the sum of the mul-
tiplicities of the real roots is at the most n , whereas the sum of the multiplicities to the
complex roots is equal to n .

We have previously shown that for every linear map of an n-dimensional vector space
into itself the sum of the geometric multiplicities of the eigenvalues for f can at the
most be n , see Theorem 13.11. Note that this can be deduced directly from the state-
ments about multiplicities in Theorem 13.34.

As something new and interesting it is postulated in point 3 that the geometric multi-
plicity of a single eigenvalue can be less than the algebraic multiplicity. This is demon-
strated in the following summarizing Example 13.36. Furthermore the geometric mul-
tiplicity of a single eigenvalue cannot be greater than the algebraic one. The proof of
point 3 in Theorem 13.34 is left out.

Example 13.36 Geometric Multiplicity Less than Algebraic Multiplicity

Given the matrix


A = \begin{pmatrix} -9 & 10 & 0 \\ -3 & 1 & 5 \\ 1 & -4 & 6 \end{pmatrix} (13-55)
The eigenvalues of A are determined:

det \begin{pmatrix} −9−λ & 10 & 0 \\ -3 & 1−λ & 5 \\ 1 & -4 & 6−λ \end{pmatrix} = −λ³ − 2λ² + 7λ − 4 = −(λ + 4)(λ − 1)² = 0 . (13-56)

From the factorization in front of the last equality sign we get that A has two different eigen-
values: λ1 = −4 and λ2 = 1. Moreover am(−4) = 1 and am(1) = 2, as can be seen from the
factorization.

The eigenspace corresponding to λ1 = −4 is determined by solving (A − λ1 E)v = 0:

\begin{pmatrix} −9−(−4) & 10 & 0 & 0 \\ -3 & 1−(−4) & 5 & 0 \\ 1 & -4 & 6−(−4) & 0 \end{pmatrix} →
\begin{pmatrix} 1 & -2 & 0 & 0 \\ 0 & -1 & 5 & 0 \\ 0 & -2 & 10 & 0 \end{pmatrix} → \begin{pmatrix} 1 & -2 & 0 & 0 \\ 0 & 1 & -5 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} (13-57)

There are two non-trivial equations: v1 − 2v2 = 0 and v2 − 5v3 = 0. If we put v3 equal to the
free parameter we see that all real eigenvectors corresponding to λ1 can be stated as

E−4 = { s · (10, 5, 1) | s ∈ R } = span{(10, 5, 1)} . (13-58)

We have that gm(−4) = dim( E−4 ) = 1, and that an eigenvector to λ1 is v1 = (10, 5, 1). It is
seen that gm(−4) = am(−4).

Similarly for λ2 = 1:
\begin{pmatrix} −9−1 & 10 & 0 & 0 \\ -3 & 1−1 & 5 & 0 \\ 1 & -4 & 6−1 & 0 \end{pmatrix} →
\begin{pmatrix} 1 & -1 & 0 & 0 \\ 0 & -3 & 5 & 0 \\ 0 & -3 & 5 & 0 \end{pmatrix} → \begin{pmatrix} 1 & -1 & 0 & 0 \\ 0 & 3 & -5 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} (13-59)
Again we have two non-trivial equations: v1 − v2 = 0 and 3v2 − 5v3 = 0. If we put v3 = 3s we see that all real eigenvectors corresponding to λ2 can be stated as

E1 = { s · (5, 5, 3) | s ∈ R } = span{(5, 5, 3)} . (13-60)

This gives the following results: gm(1) = dim( E1 ) = 1, and an eigenvector corresponding to λ2 is v2 = (5, 5, 3). Furthermore it is seen that gm(1) < am(1).
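
Numerically the two multiplicities can be compared in the same way as above: the geometric multiplicity is n − rank(A − λE), while the algebraic multiplicity is read off from the factorization of the characteristic polynomial. A sketch in Python/NumPy for the matrix of this example:

    import numpy as np

    A = np.array([[-9, 10, 0],
                  [-3,  1, 5],
                  [ 1, -4, 6]])

    for lam in (-4, 1):
        gm = 3 - np.linalg.matrix_rank(A - lam * np.eye(3))
        print(lam, gm)             # gm(-4) = 1  and  gm(1) = 1, although am(1) = 2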

13.2.4 More About the Complex Problem

We will use the matrix

 
B = \begin{pmatrix} -1 & 4 \\ -2 & 3 \end{pmatrix} (13-61)

From Example 13.30 in order to make more precise some special phenomena for square,
real matrices when their eigenvalue problems are studied in a complex framework.

We found that B has the eigenvalues λ1 = 1 + 2i and λ2 = 1 − 2i . We see that the eigenvalues are conjugate numbers. Another remarkable thing in Example 13.30 is that where

v = \begin{pmatrix} 1−i \\ 1 \end{pmatrix}

is an eigenvector corresponding to λ1 = 1 + 2i , then the conjugate vector

v̄ = \begin{pmatrix} 1+i \\ 1 \end{pmatrix}

is an eigenvector for λ2 = 1 − 2i . Both are examples of general rules:

Theorem 13.37 Conjugate Eigenvalues and Eigenvectors


For a square, real matrix A we have:

1. If λ is a complex eigenvalue of A in rectangular form λ = a + ib , then λ̄ = a − ib is also an eigenvalue of A .

2. If v is an eigenvector for A corresponding to the complex eigenvalue λ, then the conjugate vector v̄ is an eigenvector for A corresponding to the conjugate eigenvalue λ̄ .

Proof

The first part of Theorem 13.37 follows from the theory of polynomials. The characteristic polynomial of a square, real matrix is a polynomial with real coefficients. The roots of such a polynomial come in conjugate pairs. The second part follows by conjugating both sides of the equation Av = λv : since A is real, this gives A v̄ = λ̄ v̄ .

By the trace of a square matrix we understand the sum of the diagonal elements. The trace of B is thus −1 + 3 = 2 . Now notice that the sum of the eigenvalues of B is (1 − 2i ) + (1 + 2i ) = 2 , which is equal to the trace of B . This is also a general phenomenon, which we state without proof:

Theorem 13.38 The Trace


For a square, real matrix A, the trace of A, i.e. the sum of the diagonal elements in A, is equal to the sum of all (real and/or complex) eigenvalues of A , where every eigenvalue is counted in the sum the number of times corresponding to the algebraic multiplicity of the eigenvalue.
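
For the matrix B above the theorem is quickly confirmed numerically; a sketch in Python/NumPy (numpy.trace sums the diagonal and numpy.linalg.eigvals returns the eigenvalues counted with algebraic multiplicity):

    import numpy as np

    B = np.array([[-1, 4],
                  [-2, 3]])

    print(np.trace(B))                    # 2
    print(np.sum(np.linalg.eigvals(B)))   # (2+0j)  =  (1+2i) + (1-2i)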

Exercise 13.39

In Example 13.31 we found that the characteristic polynomial for the matrix
 
A = \begin{pmatrix} 6 & 3 & 12 \\ 4 & -5 & 4 \\ -4 & -1 & -10 \end{pmatrix}

has the double root −6 and the single root 3 . Prove that Theorem 13.38 is valid in this case.

eNote 14

Similarity and Diagonalization

In this eNote it is explained how certain square matrices can be diagonalized by the use of
eigenvectors. Therefore it is presumed that you know how to determine eigenvalues and
eigenvectors for a square matrix and furthermore that you know about algebraic and geometric
multiplicity.

Updated: 4.11.21 David Brander

If we consider a linear map f : V → V of an n-dimensional vector space V to itself,


then the mapping matrix for f with respect to an arbitrary basis for f is a square, n × n
matrix. If two bases a and b for V are given, then the relation between the corresponding
mapping matrices a Fa and b Fb are given by

b Fb = (a Mb )−1 · a Fa · a Mb (14-1)
 
where a Mb = a b1 a b2 · · · a bn is the change of basis matrix that shifts from b to a
coordinates.

It is of special interest if a basis v consisting of eigenvectors for f can be found. Viz. let
a be an arbitrary basis for V and a Fa the corresponding mapping matrix for f . Further-
more let v be an eigenvector basis for V with respect to f . From Theorem 13.14 in eNote
13 it appears that the mapping matrix for f with respect to the v-basis is a diagonal ma-
trix Λ in which the diagonal elements are the eigenvalues of f . If V denotes the change
of basis matrix that shifts from v-coordinates to a-coordinates, then according to (14-1), Λ will appear as
Λ = V−1 · a Fa · V . (14-2)

Naturally formula 14-1 and formula 14-2 inspire questions that take their starting point
in square matrices: Which conditions should be satisfied in order for two given square
matrices to be interpreted as mapping matrices for the same linear map with respect to
two different bases? And which conditions should a square matrix satisfy in order to
be a mapping matrix for a linear map that in another basis has a diagonal matrix as a
mapping matrix? First we study these questions in a pure matrix algebra context and
return in the last subsection to the mapping viewpoint. For this purpose we now intro-
duce the concept similar matrices.

14.1 Similar Matrices

Definition 14.1 Similar Matrices


Given the n × n-matrices A and B . One says that A is similar to B if an invertible
matrix M can be found such that

B = M−1 A M . (14-3)

Example 14.2 Similar Matrices

   
Given the matrices A = \begin{pmatrix} 2 & 3 \\ 3 & -4 \end{pmatrix} and B = \begin{pmatrix} 8 & 21 \\ -3 & -10 \end{pmatrix} .

The matrix M = \begin{pmatrix} 2 & 3 \\ 1 & 2 \end{pmatrix} is invertible and has the inverse matrix M−1 = \begin{pmatrix} 2 & -3 \\ -1 & 2 \end{pmatrix} .

Consider the following calculation:


     
\begin{pmatrix} 2 & -3 \\ -1 & 2 \end{pmatrix} \begin{pmatrix} 2 & 3 \\ 3 & -4 \end{pmatrix} \begin{pmatrix} 2 & 3 \\ 1 & 2 \end{pmatrix} = \begin{pmatrix} 8 & 21 \\ -3 & -10 \end{pmatrix} .

This shows that A is similar to B .
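
The similarity transformation of this example can be reproduced numerically; a small sketch in Python/NumPy:

    import numpy as np

    A = np.array([[2,  3],
                  [3, -4]])
    M = np.array([[2, 3],
                  [1, 2]])

    B = np.linalg.inv(M) @ A @ M
    print(B)                                # [[  8.  21.]
                                            #  [ -3. -10.]]
    print(np.linalg.eigvals(A))             # same eigenvalues as ...
    print(np.linalg.eigvals(B))             # ... B, in accordance with Theorem 14.5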



If A is similar to B, then B is also similar to A. If we put N = M−1 then N is


invertible and

B = M−1 A M ⇔ M B M−1 = A ⇔ A = N−1 B N .

Therefore one uses the phrase: A and B are similar matrices .

Theorem 14.3 Similarity Is Transitive


Let A , B and C be n × n-matrices. If A is similar to B and B is similar to C then A is
similar to C .

Exercise 14.4

Prove Theorem 14.3.

Regarding the eigenvalues of similar matrices the following theorem applies.

Theorem 14.5 Similarity and Eigenvalues


If A is similar to B then the two matrices have identical eigenvalues with the same
corresponding algebraic and geometric multiplicities.

Proof

Let M be an invertible matrix that satisfies B = M−1 AM and let, as usual, E denote the
identity matrix of the same size as the three given matrices. Then:

det(B − λE) = det(M−1 AM − λM−1 EM)


= det(M−1 (A − λE)M) (14-4)
= det(A − λE) .

Thus it is shown that the two matrices have the same characteristic polynomial and thus the
same eigenvalues with the same corresponding algebraic multiplicities. Alternatively, that they have the same eigenvalues also appears from Theorem 14.13, which is given below: when A and B represent the same linear map f with respect to different bases, they have identical eigenvalues, viz. the eigenvalues of f .

But the eigenvalues also do have the same geometric multiplicities. This follows from the fact
that the eigenspaces for A and B with respect to any of the eigenvalues can be interpreted as
two different coordinate representations of the same eigenspace, viz. the eigenspace for f
with respect to the said eigenvalue.

Note that Theorem 14.5 says that two similar matrices have the same eigenvalues, but not vice versa: it is not generally true that two matrices with the same eigenvalues are similar.

Two similar matrices A and B have the same eigenvalues, but an eigenvector for the one is not generally an eigenvector for the other. But if v is an eigen-
vector for A corresponding to the eigenvalue λ then M−1 v is an eigenvector
for B corresponding to the eigenvalue λ, where M is the invertible matrix that
satisfies B = M−1 AM . Viz.:

Av = λv ⇔ M−1 Av = M−1 λv ⇔ B (M−1 v ) = λ (M−1 v ) . (14-5)

Exercise 14.6

Explain that two square n × n-matrices are similar, if they have identical eigenvalues with the
same corresponding geometric multiplicities and that the sum of the geometric multiplicities
is n .

14.2 Matrix Diagonalization

Consider a matrix A and an invertible matrix V given by


   
A = \begin{pmatrix} 1 & -2 \\ 1 & 4 \end{pmatrix} and V = \begin{pmatrix} -1 & -2 \\ 1 & 1 \end{pmatrix} . (14-6)
Since

V−1 A V = \begin{pmatrix} 1 & 2 \\ -1 & -1 \end{pmatrix} \begin{pmatrix} 1 & -2 \\ 1 & 4 \end{pmatrix} \begin{pmatrix} -1 & -2 \\ 1 & 1 \end{pmatrix} = \begin{pmatrix} 3 & 0 \\ 0 & 2 \end{pmatrix} ,

A possesses a special property: it is similar to a diagonal matrix, viz. the diagonal matrix

Λ = \begin{pmatrix} 3 & 0 \\ 0 & 2 \end{pmatrix} .

In this context one says that A has been diagonalized by similarity transformation.

Now we will ask the question whether or not an arbitrary square matrix A can be
diagonalized by a similarity transformation. Therefore we form the equation
V−1 A V = Λ,
where V is an invertible matrix and Λ is a diagonal matrix. Below we prove that the equation is satisfied exactly when the columns of V are linearly independent eigenvectors for A , and the diagonal elements in Λ are the eigenvalues of A written such that the i-th column of V is an eigenvector corresponding to the eigenvalue in the i-th column of Λ .

We note that this is in agreement with the example-matrices in (14-6) above:


      
\begin{pmatrix} 1 & -2 \\ 1 & 4 \end{pmatrix} \begin{pmatrix} -1 \\ 1 \end{pmatrix} = \begin{pmatrix} -3 \\ 3 \end{pmatrix} = 3 \begin{pmatrix} -1 \\ 1 \end{pmatrix} (14-7)

and

\begin{pmatrix} 1 & -2 \\ 1 & 4 \end{pmatrix} \begin{pmatrix} -2 \\ 1 \end{pmatrix} = \begin{pmatrix} -4 \\ 2 \end{pmatrix} = 2 \begin{pmatrix} -2 \\ 1 \end{pmatrix} . (14-8)
We see from (14-7) that the first column of V as expected is an eigenvector for A corre-
sponding to the first diagonal element in Λ, and we see in (14-8) that the second column
of V is an eigenvector corresponding to the second diagonal element in Λ.

Theorem 14.7 Diagonalization by Similarity Transformation


If a square n × n-matrix A has n linearly independent eigenvectors v1 , v2 , . . . , vn
corresponding to the n (not necessarily different) eigenvalues λ1 , λ2 , . . . , λn , respec-
tively, it can be diagonalized by the similarity transformation

V−1 AV = Λ ⇔ A = VΛV−1 , (14-9)

where

V = [ v1 v2 · · · vn ] and Λ = diag(λ1 , λ2 , . . . , λn ) . (14-10)

If A does not have n linearly independent eigenvectors, it cannot be diagonalized


by a similarity transformation.

Proof

Suppose that A has n linearly independent eigenvectors v1 , v2 , . . . , vn and vi corresponds to


the eigenvalue λi , for i = 1 . . . n. Then the following equations are valid:

Av1 = λ1 v1 , Av2 = λ2 v2 , ... , Avn = λn vn (14-11)


The n equations can be gathered in a system of equations:
   
[ Av1 Av2 · · · Avn ] = [ λ1 v1 λ2 v2 · · · λn vn ]

⇔ A [ v1 v2 · · · vn ] = [ v1 v2 · · · vn ] diag(λ1 , λ2 , . . . , λn ) (14-12)

⇔ AV = VΛ

Now all the eigenvectors are inserted (as columns, one after the other) in the matrix V in the same order as that of the eigenvalues in the diagonal of the matrix Λ, which outside the diagonal contains only zeroes.
invertible. Therefore the inverse V−1 exists, and we multiply by this from the left on both
sides of the equality sign:

V−1 AV = V−1 VΛ ⇔ V−1 AV = Λ . (14-13)

Thus the first part of the theorem is proved. Suppose conversely that A can be diagonalized by a similarity transformation. Then an invertible matrix V = [ v1 v2 · · · vn ] and a
diagonal matrix Λ = diag(λ1 , λ2 , . . . λn ) exist such that

V−1 AV = Λ . (14-14)

If we now repeat the transformations in the first part of the proof only now in the opposite
order, it is seen that (14-14) is the equivalent of the following n equations:

Av1 = λ1 v1 , Av2 = λ2 v2 , ... , Avn = λn vn (14-15)

from which it appears that vi for i = 1 . . . n is an eigenvector of A corresponding to the


eigenvalue λi .

Therefore diagonalization by similarity transformation can only be obtained by the method


described in the first part of the theorem.



The following theorem can be of great help when one investigates whether matrices can
be diagonalized by similarity in different contexts . The main result is already given in
Theorem 14.7, but here we refine the conditions by drawing upon previously proven
theorems about the eigenvalue problem for linear maps and matrices.

Theorem 14.8 Matrix Diagonalizability


For a given n × n-matrix A we have:

A can be diagonalized by a similarity transformation

1. if n different eigenvalues for A exist.

2. if the sum of the geometric multiplicities of the eigenvalues is n .

A cannot be diagonalized by similarity transformation

3. if the sum of the geometric multiplicities of the eigenvalues is less than n .

4. if an eigenvalue λ with gm(λ) < am(λ) exists.

Proof

Ad. 1. If a proper eigenvector from each of the n eigenspaces is chosen, it follows from Corol-
lary 13.9 that the collected set of n eigenvectors is linearly independent. Therefore, according
to Theorem 14.7, A can be diagonalized by similarity transformation.

Ad. 2: If a basis from each of the eigenspaces is chosen, then the collected set of the chosen
n eigenvectors according to Corollary 13.9 is linearly independent. Therefore, according to
Theorem 14.7 A can be diagonalized by similarity transformation .

Ad. 3: If the sum of the geometric multiplicities is less than n, n linearly independent eigen-
vectors for A do not exist. Therefore, according to Theorem 14.7 A cannot be diagonalized
by similarity transformation.

Ad. 4: Since according to Theorem 13.34 point 1, the sum of the algebraic multiplicities is less than or equal to n, and since according to the same theorem point 3 for every eigenvalue λ we have gm(λ) ≤ am(λ) , the sum of the geometric multiplicities cannot become n if one of the geometric multiplicities is less than its algebraic one. Therefore, according to what has just been proved, A cannot be diagonalized by similarity transformation.



A typical special case is that of a square n × n-matrix with n different eigen-


values. Theorem 14.8 point 1 guarantees that all matrices of this type can be
diagonalized by similarity transformation.

In the following examples we will see how to investigate in practice whether diagonal-
ization by similarity transformation is possible and, if so, carry it through.

Example 14.9

The square matrix


 
A = \begin{pmatrix} 5 & 3 & 2 \\ 2 & 10 & 4 \\ 2 & 6 & 8 \end{pmatrix} (14-16)
has the eigenvalues λ1 = 4 and λ2 = 15 . The vectors v1 = (−2, 0, 1) and v2 = (−3, 1, 0) are linearly independent eigenvectors corresponding to λ1 , and the vector v3 = (1, 2, 2) is a proper eigenvector corresponding to λ2 . The collected set of the three eigenvectors is linearly in-
dependent according to Corollary 13.9. Therefore, according to Theorem 14.7, it is possible
to diagonalize A, because n = 3 linearly independent eigenvectors exist. Therefore we can
write Λ = V−1 AV, where

Λ = \begin{pmatrix} λ1 & 0 & 0 \\ 0 & λ1 & 0 \\ 0 & 0 & λ2 \end{pmatrix} = \begin{pmatrix} 4 & 0 & 0 \\ 0 & 4 & 0 \\ 0 & 0 & 15 \end{pmatrix} and V = [ v1 v2 v3 ] = \begin{pmatrix} -2 & -3 & 1 \\ 0 & 1 & 2 \\ 1 & 0 & 2 \end{pmatrix} . (14-17)
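
The diagonalization of this example is easily verified numerically; a sketch in Python/NumPy where V is built from the eigenvectors stated above:

    import numpy as np

    A = np.array([[5,  3, 2],
                  [2, 10, 4],
                  [2,  6, 8]])
    V = np.array([[-2, -3, 1],
                  [ 0,  1, 2],
                  [ 1,  0, 2]])

    Lam = np.linalg.inv(V) @ A @ V
    print(np.round(Lam, 10))        # diag(4, 4, 15)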

14.3 Complex Diagonalization

What we so far have said about similar matrices is generally valid for square, complex
matrices. Therefore the basic equation for diagonalization by similarity transformation:

V−1 A V = Λ,

will be understood in the broadest sense, where the matrices A, V and Λ are complex
n × n-matrices. Until now we have limited ourselves to real examples, that is examples
where it has been possible to satisfy the basic equation (14.3) with real matrices. We will
in the following look upon a special situation that is typical in technical applications of
diagonalization: For a given real n × n matrix A we seek an invertible matrix M and
a diagonal matrix Λ satisfying the basic equation in a broad context where M and Λ
possibly are complex (not real) n × n matrices.

The following example shows a real 3 × 3 matrix that cannot be diagonalized (with only
non-complex entries in the diagonal) because its characteristic polynomial only has one
real root. On the other hand it can be diagonalized in a complex sense.

Example 14.10 Complex Diagonalization of a Real Matrix

The square matrix


 
A = \begin{pmatrix} 2 & 0 & 5 \\ 0 & 0 & 1 \\ 0 & -1 & 0 \end{pmatrix}   (14-18)
has the eigenvalues λ1 = 2 , λ2 = −i and λ3 = i . v1 = (1, 0, 0) is a proper eigenvector
corresponding to λ1 , v2 = (−2 + i, i, 1) is a proper eigenvector corresponding to λ2 , and
v3 = (−2 − i, −i, 1) is a proper eigenvector belonging to λ3 . The collected set of the three
said eigenvectors is linearly independent according to Corollary 13.9. Therefore, according
to Theorem 14.7, it is possible to diagonalize A. Therefore we can write Λ = V−1 AV, where

Λ = \begin{pmatrix} λ_1 & 0 & 0 \\ 0 & λ_2 & 0 \\ 0 & 0 & λ_3 \end{pmatrix} = \begin{pmatrix} 2 & 0 & 0 \\ 0 & -i & 0 \\ 0 & 0 & i \end{pmatrix} and V = [v1 v2 v3] = \begin{pmatrix} 1 & -2+i & -2-i \\ 0 & i & -i \\ 0 & 1 & 1 \end{pmatrix} .   (14-19)
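The complex diagonalization can likewise be checked numerically with complex arithmetic; the sketch below is only an illustration (assuming NumPy, not part of the original notes):

import numpy as np

A = np.array([[2, 0, 5],
              [0, 0, 1],
              [0, -1, 0]], dtype=complex)
V = np.array([[1, -2 + 1j, -2 - 1j],
              [0, 1j, -1j],
              [0, 1, 1]])            # columns v1, v2, v3

Lam = np.linalg.inv(V) @ A @ V
print(np.round(Lam, 10))             # expected: diag(2, -i, i)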

The next example shows a real, square matrix that cannot be diagonalized either in a
real or in a complex way.

Example 14.11 Non-Diagonalizable Square Matrix

Given the square matrix


A = \begin{pmatrix} 4 & 1 & -2 \\ 1 & 4 & 1 \\ 0 & 0 & 3 \end{pmatrix} ,   (14-20)
which has the eigenvalues λ1 = 3 and λ2 = 5. The eigenvalue 3 has the algebraic multiplicity
2, but only one linearly independent eigenvector can be chosen, e.g. v1 = (1, −1, 0). Thus the
eigenvalue has the geometric multiplicity 1. Therefore, according to Theorem 14.7, it is not
possible to diagonalize A by similarity transformation.

Exercise 14.12

For the matrix
A = \begin{pmatrix} 2 & 9 \\ 1 & -6 \end{pmatrix}   (14-21)
the following should be determined:

1. All eigenvalues and their algebraic multiplicities.

2. All corresponding linearly independent eigenvectors and thus the geometric multiplicities of the eigenvalues.

3. If possible, A is to be diagonalized: Determine a diagonal matrix Λ and an invertible


matrix V for which V−1 AV = Λ. What are the requirements for the diagonalization to
be carried through? Which numbers and vectors are used in Λ and V ?

14.4 Diagonalization of Linear Maps

In the introduction to this eNote we asked the question: What conditions should be
satisfied so that two given square matrices can be interpreted as mapping matrices for
the same linear map with respect to two different bases? The answer is simple:

Theorem 14.13 Similar Matrices as Mapping Matrices


An n-dimensional vector space V is given. Two n × n matrices A and B are map-
ping matrices for the same linear map f : V → V with respect to two different bases
for V if and only if A and B are similar.

Exercise 14.14

Prove Theorem 14.13

In the introduction we also asked the question: Which conditions should a square ma-
trix satisfy in order to be a mapping matrix for a linear map that in another basis has

a diagonal matrix as a mapping matrix? The answer appears from Theorem 14.7 com-
bined with Theorem 14.13: the matrix must have n linearly independent eigenvectors.

We end the eNote by an example on diagonalization of a linear map, that is finding a


suitable basis in which the mapping matrix is diagonal.

Example 14.15 Diagonalization of a Linear Map

A linear map f : P1 (R) → P1 (R) is given by the following mapping matrix with respect to
the standard monomial basis m:
 
m Fm = \begin{pmatrix} -17 & -21 \\ 14 & 18 \end{pmatrix}   (14-22)

This means that f (1) = −17 + 14x and f ( x ) = −21 + 18x. We wish to investigate whether a
(real) eigenbasis for f can be found and if so, how the mapping matrix looks with respect to
this basis, and what the basis vectors are.

The eigenvalues of m Fm are determined:


 
det \begin{pmatrix} -17-λ & -21 \\ 14 & 18-λ \end{pmatrix} = λ² − λ − 12 = (λ + 3)(λ − 4) = 0 .   (14-23)

It is already now possible to confirm that a real eigenbasis for f exists, since there are two different real eigenvalues and 2 = dim( P1 (R)),
viz. λ1 = −3 and λ2 = 4 each with the algebraic multiplicity 1. Eigenvectors corresponding
to λ1 are determined:
\begin{pmatrix} -17+3 & -21 & 0 \\ 14 & 18+3 & 0 \end{pmatrix} → \begin{pmatrix} 1 & 3/2 & 0 \\ 0 & 0 & 0 \end{pmatrix} .   (14-24)
This yields an eigenvector m v1 = (−3, 2), if the free parameter is put equal to 2. Similarly we
get the other eigenvector:
   
\begin{pmatrix} -17-4 & -21 & 0 \\ 14 & 18-4 & 0 \end{pmatrix} → \begin{pmatrix} 1 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix} .   (14-25)

This yields an eigenvector m v2 = (−1, 1), if the free parameter is put equal to 1.

Thus a real eigenbasis v for f , given by the basis vectors m v1 and m v2 , exists. We then get
   
m Mv = \begin{pmatrix} -3 & -1 \\ 2 & 1 \end{pmatrix} and v Fv = \begin{pmatrix} -3 & 0 \\ 0 & 4 \end{pmatrix}   (14-26)

The basis consists of the vectors v1 = −3 + 2x and v2 = −1 + x and the map is “simple” with
respect to this basis.

One can check with the map of v1 :

f (v1 ) = f (−3 + 2x ) = −3 · f (1) + 2 · f ( x )


= −3 · (−17 + 14x ) + 2 · (−21 + 18x ) (14-27)
= 9 − 6x = −3(−3 + 2x ) = −3v1

It is true!

eNote 15

Symmetric Matrices

In this eNote we will consider one of the most used results from linear algebra – the so-called
spectral theorem for symmetric matrices. In short it says that all symmetric matrices can be
diagonalized by a similarity transformation – that is, by change of basis with a suitable
substitution matrix.
The introduction of these concepts and the corresponding method were given in eNotes 10, 13
and 14, which are therefore a necessary basis for the present eNote.
Precisely in eNote 14 it became clear that not all matrices can be diagonalized.
Diagonalization requires a sufficiently large number of eigenvalues (the algebraic multiplicities
add up to be as large as possible) and that the corresponding eigenvector spaces actually span all
of the vector space (the geometric multiplicities add up to be as large as possible). It is these
properties we will consider here, but now for symmetric matrices, which turn out to satisfy the
conditions and actually more: the eigenvectors we use in the resulting substitution matrix can
be chosen pairwise orthogonal, such that the new basis is the result of a rotation of the old
standard basis in Rn .
In order to be able to discuss and apply the spectral theorem most effectively we must first
introduce a natural scalar product for vectors in Rn in such a way that we will be able to
measure angles and lengths in all dimensions. We do this by generalizing the well-known
standard scalar product from R2 and R3 . As indicated above we will in particular use bases
consisting of pairwise orthogonal vectors in order to formulate the spectral theorem, understand
it and what use we can make of this important theorem.

Updated: 20.11.21 David Brander



15.1 Scalar Product

In the vector space Rn we introduce an inner product, i.e. a scalar product that is a
natural generalization of the well-known scalar product from plane geometry and space
geometry, see eNote 10.

Definition 15.1 Scalar Product


Let a and b be two given vectors in Rn with the coordinates ( a1 , ..., an ) and (b1 , ..., bn ),
respectively, with respect to the standard basis e in Rn :

ea = ( a1 , ..., an ) , and eb = (b1 , ..., bn ) . (15-1)

Then we define the scalar product, the inner product, (also called the dot product) of
the two vectors in the following way:
a · b = a1 b1 + a2 b2 + · · · + an bn = \sum_{i=1}^{n} a_i b_i .   (15-2)

When Rn is equipped with this scalar product (Rn , ·) is thereby an example of a


so-called Euclidian vector space , or a vector space with inner product.

The scalar product can be expressed as a matrix product:

a · b = e a^{\top} · e b = \begin{pmatrix} a_1 & \cdots & a_n \end{pmatrix} \cdot \begin{pmatrix} b_1 \\ \vdots \\ b_n \end{pmatrix}   (15-3)

For the scalar product introduced above the following arithmetic rules apply:

Theorem 15.2 Arithmetic Rules for the Scalar Product


If a , b and c are vectors in (Rn , ·) and k is an arbitrary real number then:

a·b = b·a (15-4)


a · (b + c) = a · b + a · c (15-5)
a · (kb) = (ka) · b = k(a · b) (15-6)

A main point about the introduction of a scalar product is that we can now talk about
the lengths of the vectors in (Rn , ·):

Definition 15.3 The Length of a Vector


Let a be a vector in (Rn , ·) with the coordinates ( a1 , ..., an ) with respect to the stan-
dard e-basis in Rn . Then the length of a is defined by
|a| = \sqrt{a · a} = \sqrt{\sum_{i=1}^{n} a_i^2} .   (15-7)

The length of a is also called the norm of a with respect to the scalar product in
(Rn , ·). A vector a is called a proper vector if |a| > 0 .

It follows from Definition 15.1 that


a·a ≥ 0 for all a ∈ (Rn , ·) and
(15-8)
a·a = 0 ⇔ a = 0 .

From this we immediately see that

|a| ≥ 0, for all a ∈ (Rn , ·) and


(15-9)
|a| = 0 ⇔ a = 0 .

Thus a proper vector is a vector that is not the 0-vector.

Finally it follows from Definition 15.1 and Definition 15.3 that for a ∈ (Rn , ·)
and an arbitrary real number k we have that

|ka| = |k| |a| . (15-10)

We can now prove the following important theorem:

Theorem 15.4 Cauchy-Schwarz Inequality


For arbitrary vectors a and b in (Rn , ·)

|a · b| ≤ | a | | b | . (15-11)

Equality holds if and only if a and b are linearly dependent.

Proof

If b = 0 , both sides of (15-11) are equal to 0 and the inequality is thereby satisfied. We now
assume that b is a proper vector.

We put k = b · b and e = (1/√k) b . It then follows from (15-6) that
e · e = ((1/√k) b) · ((1/√k) b) = (1/k)(b · b) = 1

and thereby that |e| = 1 .



By substituting b = √k e in the left hand side and the right hand side of (15-11) we get, using
(15-6) and (15-10):
|a · b| = |a · (√k e)| = √k |a · e|
and
|a| |b| = |a| |√k e| = √k |a| |e| .
Therefore we only have to show that for arbitrary a and e , where |e| = 1 ,

|a · e| ≤ | a | (15-12)

where equality holds if and only if a and e are linearly dependent.

For an arbitrary t ∈ R it follows from (15-8), (15-5) and (15-6) that:

0 ≤ (a − te) · (a − te) = a · a + t2 (e · e) − 2t(a · e) = a · a + t2 − 2t(a · e) .

If in particular we choose t = a · e , we get

0 ≤ a · a − (a · e)² ⇔ |a · e| ≤ √(a · a) = |a| .

Since it follows from (15-8) that (a − te) · (a − te) = 0 if and only if (a − te) = 0 , we see that
|a · e| = | a | if and only if a and e are linearly dependent. The proof is hereby complete.

From the Cauchy-Schwarz inequality follows the triangle inequality that is a general-
ization of the well-known theorem from elementary plane geometry, that a side in a
triangle is always less than or equal to the sum of the other sides:

Corollary 15.5 The Triangle Inequality


For arbitrary vectors a and b in (Rn , ·)

|a + b| ≤ |a| + |b| . (15-13)

Exercise 15.6

Prove Corollary 15.5.



Note that from the Cauchy-Schwarz inequality it follows that:

−1 ≤ (a · b) / (|a| · |b|) ≤ 1 .   (15-14)

Therefore the angle between two vectors in (Rn , ·) can be introduced as follows:

Definition 15.7 The Angle Between Vectors


Let a and b be two given proper vectors in (Rn , ·) with the coordinates ( a1 , ..., an )
and (b1 , ..., bn ) with respect to the standard basis in (Rn , ·). Then the angle between
a and b is defined as the value θ in the interval [0, π ] that satisfies

cos(θ) = (a · b) / (|a| · |b|) .   (15-15)
If a · b = 0 we say that the two proper vectors are orthogonal or perpendicular with
respect to each other. This occurs exactly when cos(θ ) = 0, that is, when θ = π/2.

15.2 Symmetric Matrices and the Scalar Product

We know the symmetry concept from square matrices:

Definition 15.8
A square matrix A is symmetric if it is equal to its own transpose

A = A> , (15-16)

that is if aij = aji for all elements in the matrix.

What is the relation between symmetric matrices and the scalar product? This we con-
sider here:

Theorem 15.9
Let v and w denote two vectors in the vector space (Rn , ·) with scalar product intro-
duced above. If A is an arbitrary (n × n)−matrix then
 
(A v) ·w = v· A> w . (15-17)

Proof

We use the fact that the scalar product can be expressed as a matrix product:

(A v) · w = (A v) > · w
 
= v> A> · w
  (15-18)
= v> · A> w
 
= v · A> w .

This we can now use to characterize symmetric matrices:

Theorem 15.10
A matrix A is a symmetric (n × n)−matrix if and only if

(A v) ·w = v· (A w) (15-19)

for all vectors v and w in (Rn , ·).

Proof

If A is symmetric then we have that A = A> and therefore Equation (15-19) follows directly
from Equation (15-17). Conversely, if we assume that (15-19) applies for all v and w, we

will prove that A is symmetric. But this follows easily just by choosing suitable vectors, e.g.
v = e2 = (0, 1, 0, ..., 0) and w = e3 = (0, 0, 1, ..., 0) and substitute these into (15-19) as seen
below. Note that A ei is the ith column vector in A.

(A e2 ) · e3 = a32
= e2 · (A e3 )
= (A e3 ) · e2     (15-20)
= a23 ,

such that a32 = a23 . Quite similarly for all other choices of indices i and j we get that aij = aji
– and this is what we had set out to prove.

A basis a in (Rn , ·) consists (as is known from eNote 11) of n linearly independent vec-
tors (a1 , ..., an ). If in addition the vectors are pairwise orthogonal and have length 1 with
respect to the scalar product, then (a1 , ..., an ) is an orthonormal basis for (Rn , ·) :

Definition 15.11
A basis a = (a1 , ..., an ) is an orthonormal basis if

ai · aj = 1 for i = j , and ai · aj = 0 for i ≠ j .   (15-21)

Exercise 15.12

Show that if n vectors (a1 , ..., an ) in (Rn , ·) satisfy Equation (15-21) then a = (a1 , ..., an ) is
automatically a basis for (Rn , ·), i.e. the vectors are linearly independent and span all of
(Rn , ·).

Exercise 15.13

Show that the following 3 vectors (a1 , a2 , a3 ) constitute an orthonormal basis for (R3 , ·) for
any given value of θ ∈ R :
a1 = (cos(θ ), 0, − sin(θ ))
a2 = (0, 1, 0) (15-22)
a3 = (sin(θ ), 0, cos(θ )) .

If we put the vectors from an orthonormal basis into a matrix as columns we get an
orthogonal matrix:

Definition 15.14
An (n × n)−matrix A is said to be orthogonal if the column vectors in A constitute an
orthonormal basis for (Rn , ·), that is if the column vectors are pairwise orthogonal
and all have length 1 – as is also expressed in Equation (15-21).

Note that orthogonal matrices alternatively (and maybe also more descriptively)
could be called orthonormal, since the columns in the matrix are not only pair-
wise orthogonal but also normalized such that they all have length 1. We will
follow international tradition and call the matrices orthogonal.

It is easy to check whether a given matrix is orthogonal:

Theorem 15.15
An (n × n)−matrix Q is orthogonal if and only if

Q> · Q = En×n , (15-23)

which is equivalent to
Q> = Q−1 . (15-24)

Proof

See eNote 7 about the computation of the matrix product and then compare with the condi-
tion for orthogonality of the column vectors in Q (Equation (15-21)).
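Theorem 15.15 also gives a convenient numerical test for orthogonality. The sketch below is only an illustration (assuming NumPy; not part of the original notes) and checks the basis from Exercise 15.13 for an arbitrary value of θ:

import numpy as np

theta = 0.7                                       # any value of theta works
Q = np.array([[np.cos(theta), 0.0, np.sin(theta)],
              [0.0, 1.0, 0.0],
              [-np.sin(theta), 0.0, np.cos(theta)]])   # columns a1, a2, a3 from Exercise 15.13

print(np.allclose(Q.T @ Q, np.eye(3)))            # True: Q is orthogonal, cf. (15-23)
print(np.allclose(Q.T, np.linalg.inv(Q)))         # True: Q^T = Q^(-1), cf. (15-24)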

We can now explain the geometric significance of an orthogonal matrix: as a linear


map it preserves lengths of, and angles between, vectors. That is the content of the fol-
lowing theorem, which follows immediately from Theorems 15.9 and Theorem 15.15:

Theorem 15.16
An n × n matrix A is orthogonal if and only if the linear mapping f : Rn → Rn given
by f (x) = Ax preserves the scalar product, i.e.:

(Ax)·(Ay) = x·y, for any x, y ∈ Rn .

Orthogonal matrices are regular and have determinant ±1:

Exercise 15.17

Show that for a matrix A to be orthogonal, it is necessary that

| det(A)| = 1 . (15-25)

Show that this condition is not sufficient, thus matrices exist that satisfy this determinant-
condition but that are not orthogonal.

Definition 15.18
An orthogonal matrix Q is called special orthogonal or positive orthogonal if
det(Q) = 1 and it is called negative orthogonal if det(Q) = −1.

In the literature, orthogonal matrices with determinant 1 are called special orthogonal,

and those with determinant −1 are usually not given a name.

Exercise 15.19

Given the matrix


A = \begin{pmatrix} 0 & -a & 0 & a \\ a & 0 & a & 0 \\ 0 & -a & 0 & -a \\ -a & 0 & a & 0 \end{pmatrix} , with a ∈ R .   (15-26)
Determine the values of a for which A is orthogonal and state in every case whether A is
positive orthogonal or negative orthogonal.

15.3 Gram–Schmidt Orthonormalization

Here we describe a procedure for determining an orthonormal basis for a subspace of


the vector space (Rn , ·). Let U be a p−dimensional subspace of (Rn , ·); we assume
that U is spanned by p given linearly independent vectors (u1 , · · ·, u p ), constituting a
basis u for U. Gram–Schmidt orthonormalization aims at constructing a new basis v
= (v1 , v2 , · · ·, v p ) for the subspace U from the given basis u such that the new vectors
v1 , v2 , · · ·, v p are pairwise orthogonal and have length 1.

Method 15.20 Gram–Schmidt Orthonormalization


Orthonormalization of p linearly independent vectors u1 , · · ·, u p in (Rn , ·):

1. Start by normalizing u1 and call the result v1 , i.e.:


v1 = u1 / |u1 | .   (15-27)

2. The next vector v2 in the basis v is now chosen in span{u1 , u2 } but such that
at the same time v2 is orthogonal to v1 , i.e. v2 ·v1 = 0; finally this vector is
normalized. First we construct an auxiliary vector w2 .

w2 = u2 − (u2 · v1 ) v1 ,   v2 = w2 / |w2 | .   (15-28)
Note that w2 (and therefore also v2 ) is then orthogonal to v1 :

w2 · v1 = (u2 − (u2 · v1 ) v1 ) · v1
= u2 · v1 − (u2 · v1 ) v1 · v1
= u2 · v1 − (u2 · v1 ) |v1 |2 (15-29)
= u2 · v1 − (u2 · v1 )
=0 .

3. We continue in this way

wi = ui − (ui · v1 ) v1 − (ui · v2 ) v2 − · · · − (ui · vi−1 ) vi−1 ,   vi = wi / |wi | .   (15-30)

4. Until the last vector u p is used:


  
wp = up − (up · v1 ) v1 − (up · v2 ) v2 − · · · − (up · vp−1 ) vp−1 ,   vp = wp / |wp | .   (15-31)

The constructed v-vectors span the same subspace U as the given linearly independent u-vectors, U = span{u1 , · · ·, u p } = span{v1 , · · ·, v p }, and v = (v1 , · · ·, v p )
constitutes an orthonormal basis for U.
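Method 15.20 translates almost line by line into code. The following sketch is only an illustration (assuming NumPy; not part of the original notes); it orthonormalizes the columns of a matrix whose columns are the given, linearly independent u-vectors:

import numpy as np

def gram_schmidt(U):
    """Orthonormalize the columns of U (assumed linearly independent), cf. Method 15.20."""
    V = []
    for u in U.T:                          # run through u_1, ..., u_p
        w = u.astype(float)
        for v in V:                        # subtract the projections (u_i . v_j) v_j
            w = w - (u @ v) * v
        V.append(w / np.linalg.norm(w))    # normalize w_i to obtain v_i
    return np.column_stack(V)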

Example 15.21

In (R4 , ·) we will by the use of the Gram–Schmidt orthonormalization method find an or-
thonormal basis v = (v1 , v2 , v3 ) for the 3−dimensional subspace U that is spanned by the
three given linearly independent (!) vectors having the following coordinates with respect to
the standard e-basis in R4 :

u1 = (2, 2, 4, 1) , u2 = (0, 0, −5, −5) , u3 = (5, 3, 3, −3) .

We construct the new basis vectors with respect to the standard e-basis in R4 by working
through the orthonormalization procedure. There are 3 ’steps’ since there are in this example
3 linearly independent vectors in U :

1.
v1 = u1 / |u1 | = (1/5) · (2, 2, 4, 1) .   (15-32)

2.
w2 = u2 − (u2 · v1 ) v1 = u2 + 5 v1 = (2, 2, −1, −4)
v2 = w2 / |w2 | = (1/5) · (2, 2, −1, −4) .   (15-33)

3.
w3 = u3 − (u3 · v1 ) v1 − (u3 · v2 ) v2 = u3 − 5 v1 − 5 v2 = (1, −1, 0, 0)
v3 = w3 / |w3 | = (1/√2) · (1, −1, 0, 0) .   (15-34)

Thus we have constructed an orthonormal basis for the subspace U consisting of those vec-
tors that with respect to the standard basis have the coordinates:

v1 = (1/5) · (2, 2, 4, 1) , v2 = (1/5) · (2, 2, −1, −4) , v3 = (1/√2) · (1, −1, 0, 0) .

We can check that this is really an orthonormal basis by placing the vectors as columns in a
matrix, which then is of the type (4 × 3). Like this:
 √ 
V = \begin{pmatrix} 2/5 & 2/5 & 1/\sqrt{2} \\ 2/5 & 2/5 & -1/\sqrt{2} \\ 4/5 & -1/5 & 0 \\ 1/5 & -4/5 & 0 \end{pmatrix}   (15-35)

The matrix V cannot be an orthogonal matrix (because of the type), but nevertheless V satisfies the following equation, which shows that the three new basis vectors indeed are
pairwise orthogonal and all have length 1 !


 √ 
  2/5 2/5 1/√2
2/5 2/5 4/5 1/5
2/5 2/5 −1/ 2
 
>
V ·V =  2/5 2/5 − 1/5 − 4/5 · 
√ √ 
4/5 −1/5 0

1/ 2 −1/ 2 0 0
 
1/5 −4/5 0 (15-36)
 
1 0 0
= 0 1 0 
 .
0 0 1
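The whole computation in this example, including the check (15-36), can be reproduced with the gram_schmidt sketch given after Method 15.20 (again only an illustration, assuming NumPy and that function):

U = np.array([[2, 0, 5],
              [2, 0, 3],
              [4, -5, 3],
              [1, -5, -3]])              # columns u1, u2, u3
V = gram_schmidt(U)
print(np.round(V, 4))                    # columns v1, v2, v3 as in (15-32)-(15-34)
print(np.allclose(V.T @ V, np.eye(3)))   # True, cf. Equation (15-36)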

Exercise 15.22

In (R4 , ·) the following vectors are given with respect to the standard basis e:

u1 = (1, 1, 1, 1) , u2 = (3, 1, 1, 3) , u3 = (2, 0, −2, 4) , u4 = (1, 1, −1, 3) .

We let U denote the subspace in (R4 , ·) that is spanned by the four given vectors, that is

U = span{u1 , u2 , u3 , u4 } . (15-37)

1. Show that u = (u1 , u2 , u3 ) is a basis for U and find coordinates for u4 with respect to
this basis.

2. State an orthonormal basis for U.

Example 15.23

In (R3 , ·) a given first unit vector v1 is required for the new orthonormal basis v= (v1 , v2 , v3 )
and the task is to find the two other vectors in the basis. Let us assume that the given vector is
v1 = (3, 0, 4)/5. We see immediately that e.g. v2 = (0, 1, 0) is a unit vector that is orthogonal
to v1 . A last vector for the orthonormal basis can then be found directly using the cross
product: v3 = v1 × v2 = 51 · (−4, 0, 3).

15.4 The Orthogonal Complement to a Subspace

Let U be a subspace in (Rn , ·) that is spanned by p given linearly independent vectors,


U = span{u1 , u2 , · · ·, u p }. The set of those vectors in (Rn , ·) that are all orthogonal to
all vectors in U is itself a subspace of (Rn , ·), and it has the dimension n − p:

Definition 15.24
The orthogonal complement to a subspace U of (Rn , ·) is denoted U ⊥ and consists
of all vectors in (Rn , ·) that are orthogonal to all vectors in U:

U ⊥ = {x ∈ Rn | x · u = 0 , for all u ∈ U } . (15-38)

Theorem 15.25
The orthogonal complement U ⊥ to a given p−dimensional subspace U of (Rn , ·) is
itself a subspace in (Rn , ·) and it has dimension dim(U ⊥ ) = n − p .

Proof

It is easy to check all subspace-properties for U ⊥ ; it is clear that if a and b are orthogonal
to all vectors in U and k is a real number, then a + kb is also orthogonal to all vectors in
U. Since the only vector that is orthogonal to itself is 0 this is also the only vector in the
intersection: U ∩ U ⊥ = {0}. If we let v= (v1 , · · ·, v p ) denote an orthonormal basis for U and
w= (w1 , · · ·, wr ) an orthonormal basis for U ⊥ , then (v1 , · · ·, v p , w1 , · · ·, wr ) is an orthonormal
basis for the subspace S = span{v1 , · · ·, v p , w1 , · · ·, wr } in (Rn , ·). If we now assume that S
is not all of (Rn , ·), then the basis for S can be extended with at least one vector such that the
extended system is linearly independent in (Rn , ·); by this we get - through the last step in the
Gram–Schmidt method - a new vector that is orthogonal to all vectors in U but which is not
an element in U ⊥ ; and thus we get a contradiction, since U ⊥ is defined to be all those vectors
in (Rn , ·) that are orthogonal to every vector in U. Therefore the assumption that S is not all
of (Rn , ·) is wrong. I.e. S = Rn and therefore r + p = n, such that dim(U ⊥ ) = r = n − p; and
this is what we had to prove.

Example 15.26

The orthogonal complement to U = span{a, b} in R3 (for linearly independent vectors – and


therefore proper vectors – a and b) is U ⊥ = span{a × b}.

Exercise 15.27

Determine the orthogonal complement to the subspace U = span{u1 , u2 , u3 } in (R4 , ·), when
the spanning vectors are given by their respective coordinates with respect to the standard
basis e in R4 as:

u1 = (1, 1, 1, 1) , u2 = (3, 1, 1, 3) , u3 = (2, 0, −2, 4) . (15-39)

15.5 The Spectral Theorem for Symmetric Matrices

We will now start to formulate the spectral theorem and start with the following non-
trivial observation about symmetric matrices:

Theorem 15.28
Let A denote a symmetric (n × n)−matrix. Then the characteristic polynomial
KA (λ) for A has exactly n real roots (counted with multiplicity):

λ1 ≥ λ2 ≥ · · · ≥ λ n . (15-40)

I.e. A has n real eigenvalues (counted with multiplicity).

If e.g. {7, 3, 3, 2, 2, 2, 1} are the roots of KA (λ) for a (7 × 7)−matrix A, then these
roots must be represented with their respective multiplicity in the eigenvalue-list:

λ1 = 7 ≥ λ2 = 3 ≥ λ3 = 3 ≥ λ4 = 2 ≥ λ5 = 2 ≥ λ6 = 2 ≥ λ7 = 1 .

Since Theorem 15.28 expresses a decisive property about symmetric matrices, we will
here give a proof of the theorem:

Proof

From the fundamental theorem of algebra we know that KA (λ) has exactly n complex roots
- but we do not know whether the roots are real; this is what we will prove. So we let α + i β
be a complex root of KA (λ) and we will then show that β = 0. Note that α and β naturally
both are real numbers.

Therefore we have
det (A − (α + i β)E) = 0 , (15-41)
and thus also that
det (A − (α + i β)E) · det (A − (α − i β)E) = 0 (15-42)
such that
det ((A − (α + i β)E) · (A − (α − i β)E)) = 0
det ((A − α E)² + β² E) = 0 .   (15-43)
 
The last equation yields that the rank of the real matrix (A − α E)2 + β2 E is less than n;
this now means (see eNote 6) that proper real solutions x to the corresponding system of
equations must exist:
((A − α E)² + β² E) x = 0 .   (15-44)

Let us choose such a proper real solution v to (15-44) with |v| > 0. Using the assumption that
A (and therefore A − αE also) is assumed to be symmetric, we have:
  
0 = (((A − α E)² + β² E) v) · v
= ((A − α E)² v) · v + β² (v · v)
= ((A − α E) v) · ((A − α E) v) + β² |v|²   (15-45)
= |(A − α E) v|² + β² |v|² .

Since |v| > 0 we are bound to conclude that β = 0, because all terms in the last expression
are non-negative. And this is what we had to prove.

Exercise 15.29

Where was it exactly that we actually used the symmetry of A in the above proof?

To every eigenvalue λi for a given matrix A is associated an eigenvector space


Eλi , which is subspace of (Rn , ·). If two or more eigenvalues for a given matrix
are equal, i.e. if we have a multiple root (of multiplicity k, say) λi = λi+1 = · · · = λi+k−1
of the characteristic polynomial, then the corresponding eigenvector spaces are
of course also equal: Eλi = Eλi+1 = · · · Eλi+k−1 . We will see below in Theorem
15.31 that for symmetric matrices the dimension of the common eigenvector
space Eλi is exactly equal to the algebraic multiplicity k of the eigenvalue λi .

If two eigenvalues λi and λ j for a symmetric matrix are different, then the two corre-
sponding eigenvector spaces are orthogonal, Eλi ⊥ Eλ j in the following sense:

Theorem 15.30
Let A be a symmetric matrix and let λ1 and λ2 be two different eigenvalues for A
and let v1 and v2 denote two corresponding eigenvectors. Then v1 · v2 = 0, i.e. they
are orthogonal.

Proof

Since A is symmetric we have from (15-19):

0 = (Av1 ) ·v2 − v1 · (Av2 )


= λ1 v1 ·v2 − v1 · (λ2 v2 )
(15-46)
= λ1 v1 ·v2 − λ2 v1 ·v2
= ( λ 1 − λ 2 ) v1 · v2 ,

and since λ1 6= λ2 we therefore get the following conclusion: v1 ·v2 = 0, and this is what we
had to prove.

We can now formulate one of the most widely applied results for symmetric matrices,
the spectral theorem for symmetric matrices that, with good reason, is also called the
theorem about diagonalization of symmetric matrices:

Theorem 15.31
Let A denote a symmetric (n × n)−matrix. Then a special orthogonal matrix Q exists
such that
Λ = Q−1 AQ = Q> AQ is a diagonal matrix . (15-47)
I.e. that a real symmetric matrix can be diagonalized by application of a positive
orthogonal substitution, see eNote 14.
The diagonal matrix can be constructed very simply from the n real eigenvalues
λ1 ≥ λ2 ≥ · · · ≥ λn of A as:

Λ = diag(λ1 , λ2 , ..., λn ) = \begin{pmatrix} λ_1 & 0 & \cdots & 0 \\ 0 & λ_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & λ_n \end{pmatrix} ,   (15-48)

Remember: A symmetric matrix has exactly n real eigenvalues when we count


these with multiplicity.

The special orthogonal matrix Q is next constructed as columns of the matrix by


using the eigenvectors from the corresponding eigenvector-spaces Eλ1 , Eλ2 , · · ·, Eλn
in the corresponding order:

Q = [v1 v2 · · · vn ] , (15-49)

where v1 ∈ Eλ1 , v2 ∈ Eλ2 , · · ·, vn ∈ Eλn , and the choice of eigenvectors in the


respective eigenvector spaces is made so that

1. Any eigenvectors corresponding to the same eigenvalue are chosen orthogonal


(use Gram–Schmidt orthogonalization in every common eigenvector space)

2. The chosen eigenvectors are normalized to have length 1.

3. The resulting matrix Q has determinant 1 (if not then multiply one of the cho-
sen eigenvectors by −1 to flip the sign of the determinant)

That this is so follows from the results and remarks – we go through a series of
enlightening examples below.
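In practice the orthonormal eigenvector basis promised by the spectral theorem can also be computed numerically. The sketch below is only an illustration (assuming NumPy; not part of the original notes) and uses numpy.linalg.eigh, which is intended for symmetric matrices, on the matrix from Example 15.32 below; note that eigh lists the eigenvalues in increasing order, whereas the theorem lists them in decreasing order:

import numpy as np

A = np.array([[2.0, -2.0, 1.0],
              [-2.0, 5.0, -2.0],
              [1.0, -2.0, 2.0]])

lam, Q = np.linalg.eigh(A)                      # real eigenvalues, orthonormal eigenvectors as columns
print(lam)                                      # [1. 1. 7.]
print(np.allclose(Q.T @ Q, np.eye(3)))          # True: Q is orthogonal
print(np.allclose(Q.T @ A @ Q, np.diag(lam)))   # True: Q^T A Q is diagonal

If a positive orthogonal Q is wanted, one may still have to multiply one of the columns by −1, exactly as in Example 15.32 below.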

15.6 Examples of Diagonalization

Here are some typical examples that show how one diagonalizes some small symmetric
matrices, i.e. symmetric matrices of type (2 × 2) or type (3 × 3):

Example 15.32 Diagonalization by Orthogonal Substitution

A symmetric (3 × 3)−matrix A is given as:

A = \begin{pmatrix} 2 & -2 & 1 \\ -2 & 5 & -2 \\ 1 & -2 & 2 \end{pmatrix} .   (15-50)

We will determine a special orthogonal matrix Q such that Q−1 AQ is a diagonal matrix:

Q−1 AQ = Q> AQ = Λ . (15-51)

First we determine the eigenvalues for A: The characteristic polynomial for A is

KA (λ) = det \begin{pmatrix} 2-λ & -2 & 1 \\ -2 & 5-λ & -2 \\ 1 & -2 & 2-λ \end{pmatrix} = (λ − 1)² · (7 − λ) ,   (15-52)
so A has the eigenvalues λ1 = 7, λ2 = 1, and λ3 = 1. Because of this we already know
through Theorem 15.31 that it is possible to construct a positive orthogonal matrix Q such
that  
7 0 0
Q−1 AQ = diag(7, 1, 1) =  0 1 0  . (15-53)
0 0 1
The rest of the problem now consists in finding the eigenvectors for A that can be used as
columns in the orthogonal matrix Q.

Eigenvectors for A corresponding to the eigenvalue λ1 = 7 are found by solving the homo-
geneous system of equations that has the coefficient matrix

KA (7) = A − 7E = \begin{pmatrix} -5 & -2 & 1 \\ -2 & -2 & -2 \\ 1 & -2 & -5 \end{pmatrix} ,   (15-54)
which by suitable row operations is seen to have

rref(KA (7)) = \begin{pmatrix} 1 & 0 & -1 \\ 0 & 1 & 2 \\ 0 & 0 & 0 \end{pmatrix} .   (15-55)

The eigenvector solutions to the corresponding homogeneous system of equations are seen
to be
u = t · (1, −2, 1) , t ∈ R , (15-56)

such that E7 = span{(1, −2, 1)}. The normalized eigenvector v1 = (1/√6) · (1, −2, 1) is
therefore an orthonormal basis for E7 (and it can also be used as the first column vector in the
wanted Q):
Q = \begin{pmatrix} 1/\sqrt{6} & * & * \\ -2/\sqrt{6} & * & * \\ 1/\sqrt{6} & * & * \end{pmatrix} .   (15-57)
We know from Theorem 15.31 that the two last columns are found by similarly determining
all eigenvectors E1 belonging to the eigenvalue λ2 = λ3 = 1 and then choosing two
orthonormal eigenvectors from E1 .

The reduction matrix corresponding to the eigenvalue 1 is


KA (1) = A − E = \begin{pmatrix} 1 & -2 & 1 \\ -2 & 4 & -2 \\ 1 & -2 & 1 \end{pmatrix} ,   (15-58)
which again by suitable row operations is seen to have
rref(KA (1)) = \begin{pmatrix} 1 & -2 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} .   (15-59)
The eigenvector solutions to the corresponding homogeneous system of equations are seen
to be
u = t1 · (2, 1, 0) + t2 · (−1, 0, 1) , t1 ∈ R , t2 ∈ R , (15-60)
such that E1 = span{(−1, 0, 1), (2, 1, 0)}.

We find an orthonormal basis for E1 using the Gram–Schmidt orthonormalization of


span{(−1, 0, 1), (2, 1, 0)} like this: Since we have already defined v1 we put v2 to be
v2 = (−1, 0, 1) / |(−1, 0, 1)| = (1/√2) · (−1, 0, 1) ,   (15-61)
and then as in the Gram–Schmidt process:
w3 = (2, 1, 0) − ((2, 1, 0) · v2 ) · v2 = (1, 1, 1) . (15-62)

By normalization we finally get v3 = (1/ 3) · (1, 1, 1) and then we finally have all the ingre-
dients to the wanted orthogonal matrix Q:
 √ √ √ 
1/√6 −1/ 2 1/√3
Q = [v1 v2 v3 ] =  −2/ 6 √0 1/√3  . (15-63)
 

1/ 6 1/ 2 1/ 3
eNote 15 15.6 EXAMPLES OF DIAGONALIZATION 384

Finally we investigate whether the chosen eigenvectors give a positive orthogonal matrix.
Since
det \begin{pmatrix} 1 & -1 & 1 \\ -2 & 0 & 1 \\ 1 & 1 & 1 \end{pmatrix} = −6 < 0 ,   (15-64)
Q has negative determinant. A special orthogonal matrix is found by multiplying one of the
columns of Q by −1, e.g. the last one. Note that a vector v is an eigenvector for A if and only
if −v is also an eigenvector for A. Therefore we have that
 √ √ √ 
  1/ √ 6 − 1/ 2 − 1/ √3 
Q = v1 v2 (−v3 ) =  −2/ 6 √0 −1/√3  (15-65)

1/ 6 1/ 2 −1/ 3

is a positive orthogonal matrix that diagonalizes A.

This is checked by a direct computation:

Q−1 AQ = Q> AQ
= \begin{pmatrix} 1/\sqrt{6} & -2/\sqrt{6} & 1/\sqrt{6} \\ -1/\sqrt{2} & 0 & 1/\sqrt{2} \\ -1/\sqrt{3} & -1/\sqrt{3} & -1/\sqrt{3} \end{pmatrix} \cdot \begin{pmatrix} 2 & -2 & 1 \\ -2 & 5 & -2 \\ 1 & -2 & 2 \end{pmatrix} \cdot \begin{pmatrix} 1/\sqrt{6} & -1/\sqrt{2} & -1/\sqrt{3} \\ -2/\sqrt{6} & 0 & -1/\sqrt{3} \\ 1/\sqrt{6} & 1/\sqrt{2} & -1/\sqrt{3} \end{pmatrix}
= \begin{pmatrix} 7 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} ,   (15-66)
which we wanted to show.

We should finally remark here that, since we are in three dimensions, instead of using Gram–Schmidt orthonormalization for the determination of v3 we could have used the cross product v1 × v2 (see Example 15.23):

v3 = v1 × v2 = (1/√3) · (−1, −1, −1) .   (15-67)

Example 15.33 Diagonalization by Orthogonal Substitution

A symmetric (2 × 2)−matrix A is given as:


 
A = \begin{pmatrix} 11 & -12 \\ -12 & 4 \end{pmatrix} .   (15-68)

We will determine a special orthogonal matrix Q such that Q−1 AQ is a diagonal matrix:

Q−1 AQ = Q> AQ = Λ . (15-69)

First we determine the eigenvalues for A: The characteristic polynomial for A is


 
KA (λ) = det \begin{pmatrix} 11-λ & -12 \\ -12 & 4-λ \end{pmatrix} = (λ − 20) · (λ + 5) ,   (15-70)
so A has the eigenvalues λ1 = 20 and λ2 = −5. Therefore we now have:
 
Λ = \begin{pmatrix} 20 & 0 \\ 0 & -5 \end{pmatrix} .   (15-71)

The eigenvectors for A corresponding to the eigenvalue λ1 = 20 are found by solving the
homogeneous system of equations having the coefficient matrix
 
KA (20) = A − 20E = \begin{pmatrix} -9 & -12 \\ -12 & -16 \end{pmatrix} ,   (15-72)
which, through suitable row operations, is shown to have the equivalent reduced matrix:
 
rref(KA (20)) = \begin{pmatrix} 3 & 4 \\ 0 & 0 \end{pmatrix} .   (15-73)

The eigenvector solutions to the corresponding homogeneous system of equations are found
to be
u = t · (4, −3) , t ∈ R , (15-74)
such that E20 = span{(4, −3)}. The normalized eigenvector v1 = (1/5) · (4, −3) is therefore
an orthonormal basis for E20 (and it can therefore be used as the first column vector in the
wanted Q):
Q = \begin{pmatrix} 4/5 & * \\ -3/5 & * \end{pmatrix} .   (15-75)
The last column in Q is an eigenvector corresponding to the second eigenvalue λ2 = −5
and can therefore be found from the general solution E−5 to the homogeneous system of
equations having the coefficient matrix
 
KA (−5) = A − (−5) · E = A + 5 E = \begin{pmatrix} 16 & -12 \\ -12 & 9 \end{pmatrix} ,   (15-76)
but since we know that the wanted eigenvector is orthogonal to the eigenvector v1 we can just
use a vector perpendicular to the first eigenvector, v2 = (1/5) · (3, 4), evidently a unit vector,
that is orthogonal to v1 . It is easy to check that v2 is an eigenvector for A corresponding to
the eigenvalue −5 :
     
KA (−5) · v2 = \begin{pmatrix} 16 & -12 \\ -12 & 9 \end{pmatrix} \cdot \frac{1}{5}\begin{pmatrix} 3 \\ 4 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix} .   (15-77)

Therefore we substitute v2 as the second column in Q and get


 
Q = \begin{pmatrix} 4/5 & 3/5 \\ -3/5 & 4/5 \end{pmatrix} .   (15-78)

This matrix has the determinant det (Q) = 1 > 0, so Q is a positive orthogonal substitution
matrix satisfying that Q−1 AQ is a diagonal matrix:

Q−1 AQ = Q> AQ
= \begin{pmatrix} 4/5 & -3/5 \\ 3/5 & 4/5 \end{pmatrix} \cdot \begin{pmatrix} 11 & -12 \\ -12 & 4 \end{pmatrix} \cdot \begin{pmatrix} 4/5 & 3/5 \\ -3/5 & 4/5 \end{pmatrix}   (15-79)
= \begin{pmatrix} 20 & 0 \\ 0 & -5 \end{pmatrix} = diag(20, −5) = Λ .

Example 15.34 Diagonalization by Orthogonal Substitution

A symmetric (3 × 3)−matrix A is given like this:


A = \begin{pmatrix} 7 & -2 & 0 \\ -2 & 6 & -2 \\ 0 & -2 & 5 \end{pmatrix} .   (15-80)
We will determine a positive orthogonal matrix Q such that Q−1 AQ is a diagonal matrix:
Q−1 AQ = Q> AQ = Λ . (15-81)
First we determine the eigenvalues for A: The characteristic polynomial for A is
KA (λ) = det \begin{pmatrix} 7-λ & -2 & 0 \\ -2 & 6-λ & -2 \\ 0 & -2 & 5-λ \end{pmatrix} = −(λ − 3) · (λ − 6) · (λ − 9) ,   (15-82)
from which we read the three different eigenvalues λ1 = 9, λ2 = 6, and λ3 = 3 and then the
diagonal matrix we are on the road to describe as Q−1 AQ :
 
9 0 0
Λ = diag(9, 6, 3) =  0 6 0  (15-83)
0 0 3
The eigenvectors for A corresponding to the eigenvalue λ3 = 3 are found by solving the
homogeneous system of equations having the coefficient matrix
KA (3) = A − 3 · E = \begin{pmatrix} 4 & -2 & 0 \\ -2 & 3 & -2 \\ 0 & -2 & 2 \end{pmatrix} ,   (15-84)

which through suitable row operations is seen to have

rref(KA (3)) = \begin{pmatrix} 2 & 0 & -1 \\ 0 & 1 & -1 \\ 0 & 0 & 0 \end{pmatrix} .   (15-85)

The eigenvector solutions to the corresponding homogeneous system of equations are found
to be
u3 = t · (1, 2, 2) , t ∈ R , (15-86)
such that E3 = span{(1, 2, 2)}. The normalized eigenvector v3 = (1/3) · (1, 2, 2) is therefore
an orthonormal basis for E3, so it can be used as the third column vector in the wanted Q; note
that we have just found the eigenvector space to the third eigenvalue on the list of eigenvalues
for A :
Q = \begin{pmatrix} * & * & 1/3 \\ * & * & 2/3 \\ * & * & 2/3 \end{pmatrix} .   (15-87)
We know from Theorem 15.31 that the two last columns are found by similarly determining
the eigenvector space E6 corresponding to eigenvalue λ2 = 6, and the eigenvector space E9
corresponding to the eigenvalue λ1 = 9.

For λ2 = 6 we have:

KA (6) = A − 6 · E = \begin{pmatrix} 1 & -2 & 0 \\ -2 & 0 & -2 \\ 0 & -2 & -1 \end{pmatrix} ,   (15-88)

which by suitable row operations is found to have the following equivalent reduced matrix:
 
1 0 1
rref(KA (6)) =  0 2 1  . (15-89)
0 0 0

The eigenvector solutions to the corresponding homogeneous system of equations are found
to be
u2 = t · (−2, −1, 2) , t ∈ R , (15-90)
so that E6 = span{(−2, −1, 2)}. The normalized eigenvector v2 = (1/3) · (−2, −1, 2) is
therefore an orthonormal basis for E6 (and it can therefore be used as the second column vector
in the wanted Q):
Q = \begin{pmatrix} * & -2/3 & 1/3 \\ * & -1/3 & 2/3 \\ * & 2/3 & 2/3 \end{pmatrix} .   (15-91)
Instead of determining the eigenvector space E9 for the last eigenvalue λ1 = 9 in the same
way, we use the fact that this eigenvector space is spanned by a vector v1 that is orthogonal
to both v3 and v2 , so we can use v1 = v2 × v3 = (1/3) · (−2, 2, −1), and then we finally get

Q = \begin{pmatrix} -2/3 & -2/3 & 1/3 \\ 2/3 & -1/3 & 2/3 \\ -1/3 & 2/3 & 2/3 \end{pmatrix} .   (15-92)

This matrix is positive orthogonal since det(Q) = 1 > 0, and therefore we have determined
a positive orthogonal matrix Q that diagonalizes A to the diagonal matrix Λ. This is easily
proved by direct computation:

Q−1 AQ = Q> AQ
= \begin{pmatrix} -2/3 & 2/3 & -1/3 \\ -2/3 & -1/3 & 2/3 \\ 1/3 & 2/3 & 2/3 \end{pmatrix} \cdot \begin{pmatrix} 7 & -2 & 0 \\ -2 & 6 & -2 \\ 0 & -2 & 5 \end{pmatrix} \cdot \begin{pmatrix} -2/3 & -2/3 & 1/3 \\ 2/3 & -1/3 & 2/3 \\ -1/3 & 2/3 & 2/3 \end{pmatrix}   (15-93)
= \begin{pmatrix} 9 & 0 & 0 \\ 0 & 6 & 0 \\ 0 & 0 & 3 \end{pmatrix} = diag(9, 6, 3) = Λ .

15.7 Controlled Construction of Symmetric Matrices

In the light of the above examples it is clear that if only we can construct all orthogonal
(2 × 2)- and (3 × 3)-matrices Q (or for that matter (n × n)-matrices), then we can produce
all symmetric (2 × 2)- and (3 × 3)-matrices A as A = Q · Λ · Q> . We only have to choose
the wanted eigenvalues in the diagonal for Λ.

Every special orthogonal 2 × 2-matrix has the following form, which shows that it is a
rotation given by a rotation angle ϕ :
 
Q = \begin{pmatrix} cos(ϕ) & -sin(ϕ) \\ sin(ϕ) & cos(ϕ) \end{pmatrix} ,   (15-94)

where ϕ is an angle in the interval [−π, π ]. Note that the column vectors are orthogonal
and both have length 1. Furthermore the determinant det(Q) = 1, so Q is special
orthogonal.

Exercise 15.35

Prove the statement that every special orthogonal matrix can be stated in the form (15-94) for
a suitable choice of the rotation angle ϕ.

If ϕ > 0 then Q rotates vectors in the positive direction, i.e. counter-clockwise; if ϕ < 0
then Q rotates vectors in the negative direction, i.e. clockwise.

Definition 15.36 Rotation Matrices


Every special orthogonal (2 × 2)-matrix is also called a rotation matrix.

Since every positive orthogonal 3 × 3-matrix similarly can be stated as a product of


rotations about the three coordinate axes – see below – we will extend the naming as
follows:

Definition 15.37 Rotation Matrices


Every special orthogonal (3 × 3)-matrix is called a rotation matrix.

A rotation about a coordinate axis, i.e. a rotation by a given angle about one of the
coordinate axes, is produced with one of the following special orthogonal matrices:
 
Rx (u) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & cos(u) & -sin(u) \\ 0 & sin(u) & cos(u) \end{pmatrix}

Ry (v) = \begin{pmatrix} cos(v) & 0 & sin(v) \\ 0 & 1 & 0 \\ -sin(v) & 0 & cos(v) \end{pmatrix}   (15-95)

Rz (w) = \begin{pmatrix} cos(w) & -sin(w) & 0 \\ sin(w) & cos(w) & 0 \\ 0 & 0 & 1 \end{pmatrix} ,

where the rotation angles are u, v, and w, respectively.

Exercise 15.38

Show by direct calculation that the three axis-rotation matrices and every product of axis-
rotation matrices really are special orthogonal matrices, i.e. they satisfy R−1 = R> and
det(R) = 1.

Exercise 15.39

Find the image vectors of every one of the given vectors a, b, and c by use of the given
mapping matrices Qi :

Q1 = Rx (π/4) , a = (1, 0, 0), b = (0, 1, 0), c = (0, 0, 1)


Q2 = Ry (π/4) , a = (1, 1, 1), b = (0, 1, 0), c = (0, 0, 1)
Q3 = Rz (π/4) , a = (1, 1, 0), b = (0, 1, 0), c = (0, 0, 1) (15-96)
Q4 = Ry (π/4) · Rx (π/4) , a = (1, 0, 0), b = (0, 1, 0), c = (0, 0, 1)
Q5 = Rx (π/4) · Ry (π/4) , a = (1, 0, 0), b = (0, 1, 0), c = (0, 0, 1) .

The combination of rotations about the coordinate axes by given rotation angles u, v,
and w about the x −axis, y−axis, and z−axis is found by computing the matrix product
of the three corresponding rotation matrices.

Here is the complete general expression for the matrix product for all values of u, v and
w:

R(u, v, w) = Rz (w) · Ry (v) · Rx (u)
= \begin{pmatrix} cos(w) cos(v) & cos(w) sin(v) sin(u) − sin(w) cos(u) & cos(w) sin(v) cos(u) + sin(w) sin(u) \\ sin(w) cos(v) & sin(w) sin(v) sin(u) + cos(w) cos(u) & sin(w) sin(v) cos(u) − cos(w) sin(u) \\ −sin(v) & cos(v) sin(u) & cos(v) cos(u) \end{pmatrix} .

As one might suspect, it is possible to prove the following theorem:



Theorem 15.40 Axis Rotation Angles for a Given Rotation Matrix


Every rotation matrix R (i.e. every special orthogonal matrix Q) can be written as
the product of 3 axis-rotation matrices:

R = R(u, v, w) = Rz (w) · Ry (v) · Rx (u) . (15-97)

In other words: the effect of every rotation matrix can be realized by three con-
secutive rotations about the coordinate axes – with the rotation angles u, v, and w,
respectively, as given in the above matrix product.

When a special orthogonal matrix R is given (with its matrix elements rij ), it is not
difficult to find these axis rotation angles. As is evident from the above matrix product
we have e.g. that −sin(v) = r31 , such that v = arcsin(−r31 ) or v = π − arcsin(−r31 ),
and cos(w) cos(v) = r11 , such that w = arccos(r11 / cos(v)) or w = − arccos(r11 / cos(v)),
if only cos(v) ≠ 0, i.e. if only v ≠ ±π/2.
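These relations can be turned into a small reconstruction routine. The sketch below is only an illustration (assuming NumPy and that cos(v) ≠ 0; not part of the original notes); it builds R(u, v, w) from the axis-rotation matrices in (15-95) and recovers the three angles again:

import numpy as np

def R_uvw(u, v, w):
    Rx = np.array([[1, 0, 0], [0, np.cos(u), -np.sin(u)], [0, np.sin(u), np.cos(u)]])
    Ry = np.array([[np.cos(v), 0, np.sin(v)], [0, 1, 0], [-np.sin(v), 0, np.cos(v)]])
    Rz = np.array([[np.cos(w), -np.sin(w), 0], [np.sin(w), np.cos(w), 0], [0, 0, 1]])
    return Rz @ Ry @ Rx                   # R(u, v, w) as in Theorem 15.40

R = R_uvw(0.3, -0.8, 1.1)                 # an arbitrary test rotation
v = -np.arcsin(R[2, 0])                   # r31 = -sin(v)
u = np.arctan2(R[2, 1], R[2, 2])          # r32 = cos(v) sin(u), r33 = cos(v) cos(u)
w = np.arctan2(R[1, 0], R[0, 0])          # r21 = sin(w) cos(v), r11 = cos(w) cos(v)
print(u, v, w)                            # recovers 0.3, -0.8, 1.1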

Exercise 15.41

Show that if v = π/2 or v = −π/2 then there exist many values of u and w giving the same
R(u, v, w). I.e. not all angle values are uniquely determined in the interval ] − π, π ] for
every given rotation matrix R.

Exercise 15.42

Show that if R is a rotation matrix (a positive orthogonal matrix) then R> is also a rotation
matrix, and vice versa: if R> is a rotation matrix then R is also a rotation matrix.

Exercise 15.43

Show that if R1 and R2 are rotation matrices then R1 · R2 and R2 · R1 are also rotation matrices.
Give examples that show that R1 · R2 is not necessarily the same rotation matrix as R2 · R1 .

15.8 Structure of Rotation Matrices

As mentioned above (Exercise 15.35), every 2 × 2 special orthogonal matrix has the
form:
Q = \begin{pmatrix} cos φ & -sin φ \\ sin φ & cos φ \end{pmatrix} .
This is a rotation of the plane anticlockwise by the angle φ. The angle φ is related to the
eigenvalues of Q:

Exercise 15.44

Show that the eigenvalues of the matrix Q above are:

λ1 = eiφ , λ2 = e−iφ .

How about the 3 × 3 case? We already remarked that any 3 × 3 special orthogonal matrix
can be written as a composition of rotations about the three coordinate axes: Q =
Rz (w) · Ry (v) · Rx (u). But is Q itself a rotation about some axis (i.e. some line through
the origin)? We can prove this is so, by examining the eigenvalues and eigenvectors of
Q.

Theorem 15.45
The eigenvalues of any orthogonal matrix all have absolute value 1.

Proof. If λ is an eigenvalue of an orthogonal matrix Q, there is, by definition, a non-zero


complex eigenvector v in Cn \ {0}. Writing v as a column matrix, we then have:

λλ̄ vT · v̄ = (λv)T · (λ̄ v̄)
= (Q · v)T · (Q · v̄)          (Q · v = λv and Q · v̄ = λ̄ v̄, since Q̄ = Q)
= vT · QT · Q · v̄
= vT · E · v̄                  (QT · Q = E)
= vT · v̄.

Since v ≠ 0, it follows that vT · v̄ is a non-zero (real) number:

vT · v̄ = v1 v̄1 + v2 v̄2 + · · · + vn v̄n = |v1 |² + |v2 |² + · · · + |vn |² > 0.

Dividing λλ̄ vT · v̄ = vT · v̄ by this number we get:

|λ|² = λλ̄ = 1.

We can now apply this to the eigenvalues of a 3 × 3 special orthogonal matrix:

Theorem 15.46
Let Q be a 3 × 3 special orthogonal matrix, i.e. QT Q = E, and det Q = 1. Then the
eigenvalues are:
λ1 = 1, λ2 = eiφ , λ3 = e−iφ ,
for some φ ∈] − π, π ].

Proof. Q is a real matrix, so all eigenvalues are either real or come in complex conjugate
pairs. There are 3 of them, because Q is a 3 × 3 matrix, so the characteristic polynomial
has degree 3. Hence there is at least one real eigenvalue:

λ1 ∈ R.

Now there are two possibilities:


Case 1: All roots are real: then, since all eigenvalues have absolute value 1 (by Theorem
15.45), and
1 = det Q = λ1 λ2 λ3
either one or all three of the eigenvalues are equal to 1.

Case 2: λ1 is real and the other two are complex conjugate, λ3 = λ̄2 , so:

1 = det Q = λ1 λ2 λ̄2 = λ1 |λ2 |2 = λ1 ,

where we used that |λ2 | = 1. Any complex number λ with absolute value 1 is of the
form eiφ , where φ = Arg(λ), so this gives the claimed form of λ1 , λ2 and λ3 .

Note that the cases λ2 = λ3 = 1 and λ2 = λ3 = −1 (in Case 1) correspond respectively to φ = 0 and
φ = π in the wording of the theorem.

We can also say something about the eigenvectors corresponding to the eigenvalues.

Theorem 15.47
Let Q be a special orthogonal matrix, and denote the eigenvalues as in Theorem
15.46. If the eigenvalues are not all real, i.e. Im(λ2 ) ≠ 0, then the eigenvectors
corresponding to λ2 and λ3 are necessarily of the form:

v2 = x + iy, v3 = v̄2 = x − iy,

where x and y are respectively the real and imaginary parts of v2 , and

x·y = 0 and |x| = |y|.

If v1 is an eigenvector for λ1 = 1, then:

v1 ·x = v1 ·y = 0

Proof. We have Qv2 = λ2 v2 , and Qv̄2 = λ̄2 v̄2 . So clearly a third eigenvector, corre-
sponding to λ̄2 , is v3 = v̄2 . Using QT Q = E, we have

v2T v2 = v2T QT Qv2 = (Qv2 ) T (Qv2 ) = λ22 v2T v2 .

If v2T v2 ≠ 0, then we can divide by this number to get λ2² = 1. But λ2 = a + bi, with
b ≠ 0, so this would mean: 1 = λ2² = a² − b² + 2iab. The imaginary part is: ab = 0,
which implies that a = 0 and hence λ2² = −b², which cannot be equal to 1. Hence:

v2T v2 = 0.

Writing v2 = x + iy, this is:

0 = (xT + iyT )(x + iy)


= x T x − y T y + i (x T y + y T x).

The real part of this equation is:

xT x − yT y = 0, i.e., x·x = |x|2 = y·y = |y|2 ,

and the imaginary part is:

0 = xT y + yT x = x·y + y·x = 2x·y.



Lastly, if v1 is an eigenvector for λ1 = 1, then, by the same argument as above,

v1T v2 = 1 · λ2 · v1T v2 ,

which must be zero, since λ2 ≠ 1. This is:

0 = v1T (x + iy) = v1 ·x + i v1 ·y.

Since v1 is real, the real and imaginary parts of this give v1 ·x = v1 ·y = 0.

Now we can give a precise description of the geometric effect of a 3 × 3 rotation matrix:

Theorem 15.48
Let Q be a 3 × 3 special orthogonal matrix, and λ1 = 1, λ2 = eiφ , λ3 = e−iφ be its
eigenvalues, with corresponding eigenvectors v1 , v2 and v3 = v̄2 . Then:

1. The map f : R3 → R3 given by f (x) = Qx is a rotation by angle φ around the


line spanned by v1 .

2. If λ2 is not real then an orthonormal basis for R3 is given by:

u1 = v1 / |v1 | , u2 = Im v2 / |Im v2 | , u3 = Re v2 / |Re v2 | ,

where v2 is an eigenvector for λ2 = eiφ . The mapping matrix for f with respect
to this basis is:
u f u = \begin{pmatrix} 1 & 0 & 0 \\ 0 & cos φ & -sin φ \\ 0 & sin φ & cos φ \end{pmatrix} .

3. If λ2 is real then Q is either the identity map (λ2 = λ3 = 1) or a rotation by


angle π (λ2 = λ3 = −1).

Proof. Statement 1 follows from statements 2 and 3, since these represent rotations by
angle φ around the v1 axis.

For statement 2, by Theorem 15.47, if λ2 is not real, then u = (u1 , u2 , u3 ) as defined


above are an orthonormal basis for R3 , since they are mutually orthogonal and of length
1.

To find the mapping matrix, we have f (u1 ) = u1 = 1 · u1 + 0 · u2 + 0 · u3 , which gives


the first column. For u2 and u3 , according to Theorem 15.47, the real and imaginary
parts of v2 have the same length, so we can rescale v2 by dividing by this number to get
w = u3 + i u2 , where u2 = Im v2 / |Im v2 | and u3 = Re v2 / |Re v2 | ,
where w is an eigenvector for f with eigenvalue eiφ . That is:
eiφ w = (cos φ + i sin φ)(u3 + iu2 ) = f (u3 + iu2 )
(cos φu3 − sin φu2 ) + i (sin φu3 + cos φu2 ) = f (u3 ) + i f (u2 ).
The imaginary and real parts of this equation give:
f (u2 ) = cos φu2 + sin φu3
f (u3 ) = − sin φu2 + cos φu3 ,
and this gives us the second and third columns of the mapping matrix.

This mapping matrix is precisely the matrix of a rotation by angle φ around the v1 axis
(compare u f u with the matrix Rx (u) discussed earlier).

For statement 3, the special case that λ2 is real, if λ2 = λ3 = 1, then φ = 0 and Q is the
identity matrix, which can be regarded as a rotation by angle 0 around any axis.

Finally, for the case λ2 = λ3 = −1, briefly: let E1 = span{v1 }. Choose any orthonormal
basis for the orthogonal complement E1⊥ . Using this, one can show that the restriction of
f to E1⊥ is a 2 × 2 rotation matrix with a repeated eigenvalue −1. This means it is minus
the identity matrix on E1⊥ , i.e. a rotation by angle π, from which the claim follows.

Example 15.49

The axis of rotation for a 3 × 3 rotation matrix is sometimes called the Euler axis. Let’s find
the Euler axis, and the rotation angle for the special orthogonal matrix:

Q = \frac{1}{3} \begin{pmatrix} -2 & -2 & 1 \\ 2 & -1 & 2 \\ -1 & 2 & 2 \end{pmatrix} ,

which was used for a change of basis in Example 15.34.

The eigenvalues are:


λ1 = 1 , λ2 = −2/3 + i √5/3 , λ3 = −2/3 − i √5/3 ,

with corresponding eigenvectors:

v1 = \begin{pmatrix} 0 \\ 1 \\ 2 \end{pmatrix} , v2 = \begin{pmatrix} -i\sqrt{5} \\ -2 \\ 1 \end{pmatrix} , v3 = v̄2 .

So the axis of rotation is the line spanned by v1 = (0, 1, 2), and the angle of rotation is:
φ = Arg(λ2 ) = − arctan(√5 / 2) + π

We can set:

u1 = v1 /|v1 | = (1/√5) · (0, 1, 2) , u2 = Im v2 /|Im v2 | = (−1, 0, 0) , u3 = Re v2 /|Re v2 | = (1/√5) · (0, −2, 1) ,

and, setting U = [u1 , u2 , u3 ], the matrix of f in this basis is:

u f u = UT QU = \begin{pmatrix} 1 & 0 & 0 \\ 0 & -2/3 & -\sqrt{5}/3 \\ 0 & \sqrt{5}/3 & -2/3 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & cos φ & -sin φ \\ 0 & sin φ & cos φ \end{pmatrix} .
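Numerically the same axis and angle can be read off from an eigenvalue decomposition; the sketch below (only an illustration, assuming NumPy; the sign of the angle depends on the chosen orientation of the axis) repeats the computation for the matrix Q above:

import numpy as np

Q = np.array([[-2.0, -2.0, 1.0],
              [2.0, -1.0, 2.0],
              [-1.0, 2.0, 2.0]]) / 3.0

lam, V = np.linalg.eig(Q)
k = int(np.argmin(np.abs(lam - 1.0)))   # position of the eigenvalue lambda_1 = 1
axis = np.real(V[:, k])                 # a unit vector along the Euler axis, proportional to (0, 1, 2)
phi = abs(np.angle(lam[(k + 1) % 3]))   # rotation angle |Arg(lambda_2)|
print(axis, phi)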

Exercise 15.50

 √ √ 
0 − 2 2
 √
Find the axis and angle of rotation for the rotation matrix: Q = 12  2 1 1 .


− 2 1 1

Conversely, we can construct a matrix that rotates by any desired angle around any
desired axis:

Example 15.51

Problem: Construct the matrix for the linear map f : R3 → R3 that rotates 3-space around
the axis spanned by the vector a = (1, 1, 0) anti-clockwise by the angle π/2.

Solution: Choose any orthonormal basis (u1 , u2 , u3 ) where u1 points in the direction of a. For
example:
u1 = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix} , u2 = \frac{1}{\sqrt{2}} \begin{pmatrix} -1 \\ 1 \\ 0 \end{pmatrix} , u3 = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} .

We have chosen them such that det([u1 , u2 , u3 ]) = 1. This means that the orientation of
space is preserved by this change of basis, so we know that the rotation from the following
construction will be anti-clockwise around the axis.

The matrix with respect to the u-basis that rotates anti-clockwise around the u1 -axis by the
angle π/2 is:
   
u f u = \begin{pmatrix} 1 & 0 & 0 \\ 0 & cos(π/2) & -sin(π/2) \\ 0 & sin(π/2) & cos(π/2) \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & 1 & 0 \end{pmatrix} .
The change of basis matrix from u to the standard e-basis is:

e Mu = [u1 , u2 , u3 ] = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & -1 & 0 \\ 1 & 1 & 0 \\ 0 & 0 & \sqrt{2} \end{pmatrix} ,

so the matrix of f with respect to the standard basis is:

e f e = e Mu · u f u · e Mu^{\top} = \frac{1}{2} \begin{pmatrix} 1 & 1 & \sqrt{2} \\ 1 & 1 & -\sqrt{2} \\ -\sqrt{2} & \sqrt{2} & 0 \end{pmatrix} .

Note: for the vectors u1 and u2 , it would have made no difference what choice we make as
long as they are orthogonal to a, and orthonormal. If we rotate them in the plane orthogonal
to a, this rotation will cancel in the formula e Mu u f u e MuT .
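The construction generalizes to any axis and angle. The sketch below is only an illustration (assuming NumPy; not part of the original notes): it builds an orthonormal, positively oriented basis with the first vector along the given axis and conjugates Rx by the change of basis, reproducing the matrix found above:

import numpy as np

def rotation_about_axis(a, phi):
    """Rotation by the angle phi about the axis spanned by a, as in Example 15.51."""
    u1 = a / np.linalg.norm(a)
    h = np.array([1.0, 0.0, 0.0]) if abs(u1[0]) < 0.9 else np.array([0.0, 1.0, 0.0])
    u2 = np.cross(u1, h)
    u2 = u2 / np.linalg.norm(u2)
    u3 = np.cross(u1, u2)                  # (u1, u2, u3) is a positively oriented orthonormal basis
    U = np.column_stack([u1, u2, u3])
    Rx = np.array([[1, 0, 0],
                   [0, np.cos(phi), -np.sin(phi)],
                   [0, np.sin(phi), np.cos(phi)]])
    return U @ Rx @ U.T

Q = rotation_about_axis(np.array([1.0, 1.0, 0.0]), np.pi / 2)
print(np.round(Q, 4))                      # compare with the matrix e_f_e found above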

Exercise 15.52

Find an orthogonal matrix Q that, in the standard e-basis for R3 , represents a rotation about
the axis spanned by a = (1, 1, 1) by an angle π/2.

15.9 Reduction of Quadratic Polynomials

A quadratic form in (Rn , ·) is a quadratic polynomial in n variables – but without linear


and constant terms.

Definition 15.53
Let A be a symmetric (n × n)-matrix and let ( x1 , x2 , · · · , xn ) denote the coordinates
for an arbitrary vector x in (Rn , ·) with respect to the standard basis e in Rn .

A quadratic form in (Rn , ·) is a function of the n variables ( x1 , x2 , · · · , xn ) in the fol-


lowing form:

PA (x) = PA ( x1 , x2 , · · · , xn ) = \begin{pmatrix} x_1 & x_2 & \cdots & x_n \end{pmatrix} \cdot A \cdot \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} = \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij} · x_i · x_j ,   (15-98)

aij being the individual elements in A.

Example 15.54 Quadratic Form as Part of a Quadratic Polynomial

Let f ( x, y) be the following quadratic polynomial in the two variables x and y.


f ( x, y) = 11 · x2 + 4 · y2 − 24 · x · y − 20 · x + 40 · y − 60 . (15-99)
Then we can separate the polynomial in two parts:
f ( x, y) = PA ( x, y) + (−20 · x + 40 · y − 60) , (15-100)
where PA ( x, y) is the quadratic form
PA ( x, y) = 11 · x2 + 4 · y2 − 24 · x · y (15-101)
that is represented by the matrix
 
A = \begin{pmatrix} 11 & -12 \\ -12 & 4 \end{pmatrix}   (15-102)

We will now see how the spectral theorem can be used for the description of every
quadratic form by use of the eigenvalues for the matrix that represents the quadratic
form.

Theorem 15.55 Reduction of Quadratic Forms


Let A be a symmetric matrix and let PA ( x1 , · · · , xn ) denote the corresponding
quadratic form in (Rn , ·) with respect to standard coordinates. By a change of ba-
sis to new coordinates x̃1 , · · · , x̃n given by the positive orthogonal change of basis
matrix Q that diagonalizes A, we get the reduced expression for the quadratic form:

PA ( x1 , · · · , xn ) = P̃Λ ( x̃1 , · · · , x̃n ) = λ1 · x̃1² + · · · + λn · x̃n² ,   (15-103)

where λ1 , · · · , λn are the n real eigenvalues for the symmetric matrix A.

The reduction in the theorem means that the new expression does not contain
any product terms of the type xi · xj for i ≠ j.

Proof

Since A is symmetric it can according to the spectral theorem be diagonalized by an orthog-


onal substitution matrix Q. The gathering of column vectors (v1 , · · · , vn ) in Q constitutes a
new basis v in (Rn , ·).

Let x be an arbitrary vector in Rn . Then we have the following set of coordinates for x, partly
with respect to the standard e-basis and partly with respect to the new basis v

e x = ( x1 , · · · , xn ) ,
v x = ( x̃1 , · · · , x̃n ) .   (15-104)

Then

PA (x) = PA ( x1 , · · · , xn )
= \begin{pmatrix} x_1 & \cdots & x_n \end{pmatrix} \cdot A \cdot \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}
= \begin{pmatrix} x_1 & \cdots & x_n \end{pmatrix} \cdot Q \cdot Λ \cdot Q^{-1} \cdot \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}
= \Big( Q^{\top} \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix} \Big)^{\top} \cdot Λ \cdot \Big( Q^{\top} \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix} \Big)   (15-105)
= \begin{pmatrix} x̃_1 & \cdots & x̃_n \end{pmatrix} \cdot \begin{pmatrix} λ_1 & 0 & \cdots & 0 \\ 0 & λ_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & λ_n \end{pmatrix} \cdot \begin{pmatrix} x̃_1 \\ \vdots \\ x̃_n \end{pmatrix}
= P̃Λ ( x̃1 , · · · , x̃n ) = λ1 · x̃1² + · · · + λn · x̃n² ,

where we have used that Q−1 = Q> and that the new coordinates are exactly v x = Q> · e x.

Note that the matrix that represents the quadratic form in Example 15.54,
Equation (15-102), is not much different from the Hessian Matrix H f ( x, y) for
f ( x, y), which is also a constant matrix, because f ( x, y) is a second degree poly-
nomial. See eNote ??. In fact we observe that:
A = (1/2) · H f ( x, y) ,   (15-106)
and this is no coincidence.

Lemma 15.56
Let f ( x1 , x2 , · · · , xn ) denote an arbitrary quadratic polynomial without linear and
constant terms. Then f ( x1 , x2 , · · · , xn ) can be expressed as a quadratic form in ex-
actly one way – i.e. there exists exactly one symmetric matrix A such that:

f (x) = f ( x1 , x2 , · · · , xn ) = PA ( x1 , x2 , · · · , xn ) . (15-107)

The sought matrix is:


A = (1/2) · H f (x) ,   (15-108)
where H f (x) is (the constant) Hessian matrix for the function f (x) =
f ( x1 , x2 , · · · , x n ).

Proof

We limit ourselves to the case n = 2 and refer the analysis to functions of two variables in
eNote ??: If f ( x, y) is a polynomial in two variables without linear (and constant) terms, i.e.
a quadratic form in (R2 , ·), then the wanted A-matrix is exactly the (constant) Hesse-matrix
for f ( x, y).

This applies generally, if we extend the definition of Hessian matrices to functions of


more variables as follows: Let f ( x1 , x2 , · · · , xn ) be an arbitrary smooth function of n
variables in the obvious meaning for functions of more variables (than two). Then the
corresponding Hessian matrices are the following symmetric (n × n)-matrices which
contain all the second-order partial derivatives for the function f (x) evaluated at an
arbitrary point x ∈ Rn :

H f ( x1 , x2 , · · · , xn ) = \begin{pmatrix} f''_{x_1 x_1}(x) & f''_{x_1 x_2}(x) & \cdots & f''_{x_1 x_n}(x) \\ f''_{x_2 x_1}(x) & f''_{x_2 x_2}(x) & \cdots & f''_{x_2 x_n}(x) \\ \vdots & \vdots & \ddots & \vdots \\ f''_{x_n x_1}(x) & f''_{x_n x_2}(x) & \cdots & f''_{x_n x_n}(x) \end{pmatrix} .   (15-109)
In particular if f ( x, y, z) is a smooth function of three variables (as in Example 15.57

below) we get at every point ( x, y, z) ∈ R3 :


 
\[
\mathbf H f(x, y, z) =
\begin{bmatrix}
f''_{xx}(x,y,z) & f''_{xy}(x,y,z) & f''_{xz}(x,y,z) \\
f''_{xy}(x,y,z) & f''_{yy}(x,y,z) & f''_{yz}(x,y,z) \\
f''_{xz}(x,y,z) & f''_{yz}(x,y,z) & f''_{zz}(x,y,z)
\end{bmatrix} ,
\tag{15-110}
\]

where we explicitly have used the symmetry of the Hessian matrix, e.g. f''_{zx}(x, y, z) = f''_{xz}(x, y, z).

Example 15.57 Quadratic Form with a Representing Matrix

Let f ( x, y, z) denote the following function of three variables:

f ( x, y, z) = x2 + 3 · y2 + z2 − 8 · x · y + 4 · y · z . (15-111)

Then f ( x, y, z) is a quadratic form PA ( x, y, z) with


  
\[
\mathbf A = \frac{1}{2}\cdot\mathbf H f(x,y,z)
= \frac{1}{2}\cdot
\begin{bmatrix}
f''_{xx}(x,y,z) & f''_{xy}(x,y,z) & f''_{xz}(x,y,z) \\
f''_{xy}(x,y,z) & f''_{yy}(x,y,z) & f''_{yz}(x,y,z) \\
f''_{xz}(x,y,z) & f''_{yz}(x,y,z) & f''_{zz}(x,y,z)
\end{bmatrix}
= \begin{bmatrix} 1 & -4 & 0 \\ -4 & 3 & 2 \\ 0 & 2 & 1 \end{bmatrix} .
\tag{15-112}
\]
We can verify (15-108) by direct computation:

\[
\begin{aligned}
P_{\mathbf A}(x, y, z)
&= \begin{bmatrix} x & y & z \end{bmatrix}\cdot
   \begin{bmatrix} 1 & -4 & 0 \\ -4 & 3 & 2 \\ 0 & 2 & 1 \end{bmatrix}\cdot
   \begin{bmatrix} x \\ y \\ z \end{bmatrix} \\
&= \begin{bmatrix} x & y & z \end{bmatrix}\cdot
   \begin{bmatrix} x - 4\cdot y \\ 3\cdot y - 4\cdot x + 2\cdot z \\ z + 2\cdot y \end{bmatrix} \\
&= x\cdot(x - 4\cdot y) + y\cdot(3\cdot y - 4\cdot x + 2\cdot z) + z\cdot(z + 2\cdot y) \\
&= x^2 + 3\cdot y^2 + z^2 - 8\cdot x\cdot y + 4\cdot y\cdot z \\
&= f(x, y, z) .
\end{aligned}
\tag{15-113}
\]

As is shown in Section ?? in eNote ?? the signs of the eigenvalues for the Hessian matrix
play a decisive role when we analyse and inspect a smooth function f ( x, y) at and about
a stationary point. And since it is again the very same Hessian matrix that appears in the
present context we will here tie a pair of definitions to this sign-discussion – now for the
general (n × n) Hessian matrices, and thus also for general quadratic forms represented
by symmetric matrices A :

Definition 15.58 Definite and Indefinite Symmetric Matrices


We let A denote a symmetric matrix. Let A have the n real eigenvalues
λ1 , λ2 , · · · , λn . Then we say that

1. A is positive definite if all eigenvalues λi are positive.

2. A is positive semi-definite if all eigenvalues λi are non-negative (every eigen-


value is greater than or equal to 0).

3. A is negative definite if all eigenvalues λi are negative.

4. A is negative semi-definite if all eigenvalues λi are non-positive (every eigen-


value is less than or equal to 0).

5. A is indefinite if A is neither positive semi-definite nor negative semi-definite.

We now formulate an intuitively reasonable result that relates this "definiteness" to the
values that the quadratic form PA (x) assumes for different x ∈ Rn .

Theorem 15.59 The Meaning of Positive Definiteness


If A is a symmetric positive definite matrix then the quadratic form PA (x) is positive
for all x ∈ Rn − 0.

Proof

We refer to Theorem 15.55 and from that we can use the reduced expression for the quadratic
form:
PA ( x1 , · · · , xn ) = PeΛ ( xe1 , · · · , xen ) = λ1 · xe12 + · · · + λn · xen2 , (15-114)
from which it is clear that, since A is positive definite, we get λi > 0 for all i = 1, · · · , n
and thus PA (x) > 0 for all x ≠ 0, corresponding to the fact that neither of the two sets of
coordinates for x can be (0, · · · , 0).

Similar theorems can be formulated for negative definite and indefinite matrices, and

they are obviously useful in investigations of functions, in particular in investigations


of the functional values around stationary points, as shown in eNote ??.
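The sign test in Definition 15.58 is easy to carry out numerically once the eigenvalues are known. The following small Python sketch (assuming NumPy is available; the helper name classify_symmetric is ours and not part of the eNotes) classifies a symmetric matrix by the signs of its eigenvalues and applies it to the representing matrix from Example 15.57, which turns out to be indefinite.

```python
import numpy as np

def classify_symmetric(A, tol=1e-12):
    """Classify a symmetric matrix by the signs of its eigenvalues, cf. Definition 15.58."""
    lam = np.linalg.eigvalsh(A)        # real eigenvalues of a symmetric matrix, ascending order
    if np.all(lam > tol):
        return "positive definite"
    if np.all(lam >= -tol):
        return "positive semi-definite"
    if np.all(lam < -tol):
        return "negative definite"
    if np.all(lam <= tol):
        return "negative semi-definite"
    return "indefinite"

# The representing matrix A from Example 15.57:
A = np.array([[ 1.0, -4.0, 0.0],
              [-4.0,  3.0, 2.0],
              [ 0.0,  2.0, 1.0]])
print(np.linalg.eigvalsh(A))       # one negative and two positive eigenvalues
print(classify_symmetric(A))       # "indefinite"
```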

15.10 Reduction of Quadratic Polynomials

By reducing the quadratic form part of a quadratic polynomial we naturally get an


equivalently simpler quadratic polynomial – now without product terms. We give a
couple of examples.

Example 15.60 Reduction of a Quadratic Polynomial, Two Variables

We consider the following quadratic polynomial in two variables:

f ( x, y) = 11 · x2 + 4 · y2 − 24 · x · y − 20 · x + 40 · y − 60 (15-115)

The part of the polynomial that can be described by a quadratic form is now

PA ( x, y) = 11 · x2 + 4 · y2 − 24 · x · y , (15-116)

where
\[
\mathbf A = \begin{bmatrix} 11 & -12 \\ -12 & 4 \end{bmatrix} .
\tag{15-117}
\]
Exactly this matrix is diagonalized by a positive orthogonal substitution Q in Example 15.32:
The eigenvalues for A are λ1 = 20 and λ2 = −5 and
   
\[
\mathbf Q = \begin{bmatrix} 4/5 & 3/5 \\ -3/5 & 4/5 \end{bmatrix}
= \begin{bmatrix} \cos(\varphi) & -\sin(\varphi) \\ \sin(\varphi) & \cos(\varphi) \end{bmatrix} ,
\quad\text{where } \varphi = -\arcsin(3/5) .
\tag{15-118}
\]

The change of coordinates xe, ye consequently is a rotation of the standard coordinate system
by an angle of − arcsin(3/5).

We use the reduction theorem 15.55 and get that the quadratic form PA ( x, y) in the new
coordinates has the following reduced expression:

PA ( x, y) = PeΛ ( xe, ye) = 20 · xe2 − 5 · ye2 . (15-119)

By introducing the reduced expression for the quadratic form in the polynomial f ( x, y) we
get:
f ( x, y) = 20 · xe2 − 5 · ye2 + (−20 · x + 40 · y − 60) , (15-120)

where all that remains is to express the last parenthesis by using the new coordinates. This
is done using the substitution matrix Q. We have the linear relation between the coordinates
( x, y) and ( xe, ye):
\[
\begin{bmatrix} x \\ y \end{bmatrix}
= \mathbf Q\cdot\begin{bmatrix} \tilde x \\ \tilde y \end{bmatrix}
= \begin{bmatrix} 4/5 & 3/5 \\ -3/5 & 4/5 \end{bmatrix}\cdot\begin{bmatrix} \tilde x \\ \tilde y \end{bmatrix}
\tag{15-121}
\]
so that:
\[
x = \frac{1}{5}\cdot(4\cdot\tilde x + 3\cdot\tilde y) , \qquad
y = \frac{1}{5}\cdot(-3\cdot\tilde x + 4\cdot\tilde y) .
\tag{15-122}
\]
We substitute these rewritings of x and y in (15-120) and get:

f ( x, y) = 20 · xe2 − 5 · ye2 + (−4 · (4 · xe + 3 · ye) + 8 · (−3 · xe + 4 · ye) − 60)


(15-123)
= 20 · xe2 − 5 · ye2 − 40 · xe + 20 · ye − 60 .

Thus we have reduced the expression for f ( x, y) to the following expression in new coordi-
nates xe and ye, that appears by a suitable rotation of the standard coordinate system:

f ( x, y) = 11 · x2 + 4 · y2 − 24 · x · y − 20 · x + 40 · y − 60
= 20 · xe2 − 5 · ye2 − 40 · xe + 20 · ye − 60 (15-124)
= fe( xe, ye) .
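The reduction in Example 15.60 can be reproduced numerically. The sketch below assumes NumPy is available; the reordering and the sign flip are only there to reproduce the particular rotation Q used in the example, and the printed matrix Qᵀ·A·Q should be (numerically) diagonal with the eigenvalues 20 and −5 on the diagonal.

```python
import numpy as np

# Representing matrix of the quadratic form in Example 15.60
A = np.array([[ 11.0, -12.0],
              [-12.0,   4.0]])

lam, V = np.linalg.eigh(A)          # eigenvalues in ascending order: [-5, 20]
order = [1, 0]                      # reorder so that lambda_1 = 20, lambda_2 = -5
Q = V[:, order]
if np.linalg.det(Q) < 0:            # make Q a rotation (special orthogonal)
    Q[:, 1] *= -1

print(lam[order])                   # [20. -5.]
print(np.round(Q.T @ A @ Q, 10))    # diag(20, -5): no cross term in the new coordinates
```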

Note again that the reduction in Example 15.60 results in a reduced quadratic
polynomial fe( xe, ye) that does not contain any product term of the form xe · ye. This
reduction technique, and the payoff from the work it requires, becomes clearer
when we consider quadratic polynomials in three variables.

Example 15.61 Reduction of a Quadratic Polynomial, Three Variables

In Example 15.34 we have diagonalized the matrix A that represents the quadratic form in
the following quadratic polynomial in three variables:

f ( x, y, z) = 7 · x2 + 6 · y2 + 5 · z2 − 4 · x · y − 4 · y · z − 2 · x + 20 · y − 10 · z − 18 . (15-125)

This polynomial is reduced to the following quadratic polynomial in the new variables ob-
tained using the same directives as in Example 15.60:

f ( x, y, z) = fe( xe, ye, ze)
            = 9 · xe² + 6 · ye² + 3 · ze² + 18 · xe − 12 · ye + 6 · ze − 18     (15-126)

with the positive orthogonal substitution

\[
\mathbf Q = \begin{bmatrix} -2/3 & -2/3 & 1/3 \\ 2/3 & -1/3 & 2/3 \\ -1/3 & 2/3 & 2/3 \end{bmatrix} .
\tag{15-127}
\]

The substitution matrix Q can be factorized to a product of axis-rotation matrices like this:

Q = Rz ( w ) · Ry ( v ) · R x ( u ) , (15-128)

where the rotation angles are respectively:


 
u = π/4 ,   v = −arcsin(1/3) ,   and   w = 3·π/4 .     (15-129)

By rotation of the coordinate system and by using the new coordinates xe, ye, and ze we obtain
a reduction of the polynomial f ( x, y, z) such that the reduced polynomial fe( xe, ye, ze) does
not contain any product terms, while f ( x, y, z) contains two product terms, with x · y and y · z,
respectively.

15.11 Summary

The main result in this eNote is that symmetric (n × n)-matrices are precisely those
matrices that can be diagonalized by a special orthogonal change of basis matrix Q.
We have used this theorem for the reduction of quadratic polynomials in n variables –
though particularly for n = 2 and n = 3.

• A symmetric (n × n)-matrix A has precisely n real eigenvalues λ1 , · · · , λn .

• In the vector space Rn a scalar product is introduced by extending the standard


scalar product of R2 and R3 , and we refer to this scalar product when we write
(Rn , ·). If a = ( a1 , · · · , an ) and b = (b1 , · · · , bn ) with respect to the standard basis
e in Rn , then
a · b = ∑_{i=1}^{n} a_i · b_i .     (15-130)

• The length, the norm, of a vector a is given by


|a| = √(a · a) = √( ∑_{i=1}^{n} a_i² ) .     (15-131)

• The Cauchy-Schwarz inequality is valid for all vectors a and b in (Rn , ·)

|a · b| ≤ | a | | b | , (15-132)

and the equality sign applies if and only if a and b are linearly dependent.

• The angle θ ∈ [0, π ] between two proper vectors a and b in (Rn , ·) is determined
by
cos(θ) = (a · b) / (|a| · |b|) .     (15-133)
• Two proper vectors a and b in (Rn , ·) are orthogonal if a · b = 0.

• A matrix Q is orthogonal if the column vectors are pairwise orthogonal and each
has length 1 with respect to the scalar product introduced. This corresponds ex-
actly to
Q> · Q = E (15-134)
or equivalently:
Q−1 = Q> . (15-135)

• The spectral theorem: If A is symmetric, then a special orthogonal change of basis


matrix Q exists such that
A = Q · Λ · Q> , (15-136)
where Λ = diag(λ1 , · · · , λn ).

• Every special orthogonal matrix Q is a change of basis matrix that rotates the coordinate
system. For n = 3 it can be factorized into three axis-rotation matrices:

Q = Rz ( w ) · Ry ( v ) · R x ( u ) , (15-137)

for suitable choices of rotation angles u, v, and w.

• For n = 3: By rotation of the coordinate-system, i.e. by use of a special orthogo-


nal change of basis matrix Q, the quadratic form PA ( x, y, z) (which is a quadratic
polynomial without linear terms and without constant terms) can be expressed by
a quadratic form PeΛ ( xe, ye, ze) in the new coordinates xe, ye, and ze such that

PA ( x, y, z) = PeΛ ( xe, ye, ze)   for all ( x, y, z),     (15-138)

and such that the reduced quadratic form PeΛ ( xe, ye, ze) does not contain any product
term of the type xe · ye, xe · ze, or ye · ze:

PeΛ ( xe, ye, ze) = λ1 · xe² + λ2 · ye² + λ3 · ze² ,     (15-139)

where λ1 , λ2 , and λ3 are the three real eigenvalues for A.



eNote 16

First Order Linear Differential Equations

In this eNote we first give a short introduction to differential equations in general and then the
main subject is a special type of differential equation the so-called first order differential
equations. The eNote is based on knowledge of special functions, differential and integral
calculus and linear maps.

Version 19.10.18 by Karsten Schmidt/Jesper Kampmann Larsen. Updated 2.11.21 David


Brander.

16.1 What Is a Differential Equation?

A differential equation is an equation, in which one or more unknown functions appear


together with one or more of their derivatives. In this eNote we only consider differ-
ential equations that contain one unknown function. Differential equations naturally
occur in the modelling of physical, mechanical, economic, chemical and manifold other
problems, and this is why it is an important subject.

One says that a differential equation has the order n if it contains the nth derivative of
the unknown function, but no derivatives of order higher than n . The unknown func-
tion is in this eNote denoted by x or x (t) , if the name of the independent variable t is
important in the context.

An example of a differential equation is


x 000 (t) − 2x 0 (t) + x (t) = t , t ∈ R. (16-1)

The equation has order 3, since the highest number of times the unknown function x
is differentiated in the equation is 3. A solution to the equation is a function x0 which,
inserted into the equation, makes it true. If we, for example, want to investigate whether
the function
x 0 ( t ) = et + t + 2 , t ∈ R

is a solution to (16-1), we test this by insertion of x0 in place of x in the equation. Since

x0'''(t) − 2x0'(t) + x0(t) = (e^t + t + 2)''' − 2(e^t + t + 2)' + (e^t + t + 2)
                           = e^t − 2(e^t + 1) + e^t + t + 2
                           = t ,

x0 is a solution.

This eNote is about an important type of first order differential equation, the so-called
linear differential equations. In order to be able to investigate these precisely we first
express them in a standard way.

16.2 Introduction to First Order Linear Differential


Equations

Definition 16.1
By a first order linear differential equation we understand a differential equation that
can be brought into the standard form

x 0 (t) + p(t) x (t) = q(t) , t∈I (16-2)

where I is an open interval in R , and p and q are (known) continuous functions


defined on I .

The equation is called homogeneous if q(t) = 0 for all t . Otherwise it is called inho-
mogeneous.

Example 16.2 Standard Form

The first order differential equation

x 0 (t) + 2x (t) = 30 + 8t , t ∈ R. (16-3)

is immediately seen to be in standard form (16-2) with p(t) = 2 and q(t) = 30 + 8t , t ∈ R.


Therefore it is linear.

Example 16.3 Standard form

Let I be an open interval in R . Consider the first order linear differential equation

t · x 0 (t) + 2x (t) − 8t2 = −10 , t ∈ R. (16-4)

In order to bring this into standard form we first have to add 8t2 to both sides of the equation,
since on the left-hand side only terms containing the unknown function x (t) must appear.
Then we divide both sides by t, since the coefficient of x 0 (t) must be 1 in the standard form.
To avoid division by 0, we must assume that t is either greater or less than zero: let us choose
the first:
x'(t) + (2/t)·x(t) = 8t − 10/t ,   t > 0 .     (16-5)

Now the differential equation is in the standard form with p(t) = 2/t and
q(t) = 8t − 10/t ,   t > 0 .

Exercise 16.4 Standard Form

Explain why the first order differential equation

x'(t) + (1/2)·t² = 0 ,   t ∈ R
is not homogeneous.

Exercise 16.5 More Solutions

Given the differential equation

x 0 (t) + 2x (t) = 30 + 8t , t ∈ R.

Show that, for any of c = 1, c = 2 or c = 3 , the function

x0 (t) = 13 + 4t + ce−2t , t∈R

is a solution.

From Exercise 16.5 it appears that a differential equation can have more than one solu-
tion. We will in what follows investigate in more detail the question about the number
of solutions. In order to understand precisely what is meant by a first order differential
equation being linear and what this means for its solution set, we will need the follow-
ing lemma.

Lemma 16.6
Let p be a continuous function defined on an open interval I in R . Then the map
f : C1 ( I ) → C0 ( I ) given by

f ( x (t)) = x 0 (t) + p(t) x (t) (16-6)

is linear.

Proof

We will show that f satisfies the two linearity requirements L1 and L2 . Let x1 , x2 ∈ C1 ( I )
(i.e. the two functions are arbitrary differentiable functions with continuous derivatives on
I ), and let k ∈ R . That f satisfies L1 appears from

f ( x1 (t) + x2 (t)) = ( x1 (t) + x2 (t))0 + p(t)( x1 (t) + x2 (t))


= x10 (t) + x20 (t) + p(t) x1 (t) + p(t) x2 (t)
= ( x10 (t) + p(t) x1 (t)) + ( x20 (t) + p(t) x2 (t))
= f ( x1 (t)) + f ( x2 (t)) .

That f satisfies L2 appears from

f (kx1 (t)) = (kx10 (t)) + p(t)(kx1 (t)) = k ( x10 (t) + p(t) x1 (t))
= k f ( x1 (t)) .

By this the proof is completed.

From Lemma 16.6 we can deduce important properties for the solution set for first order
linear differential equations. First we introduce convenient notations for the solution
sets that we will treat.

Linhom denotes all solutions for a given inhomogeneous differential equation.


Linhom is briefly known as the solution set or the general solution.
Lhom denotes the solution set for a homogeneous differential equation corre-
sponding to an inhomogeneous equation (where the right-hand side q(t) is
replaced by 0 ) .

Theorem 16.7 Three Properties


For a first order linear differential equation x 0 (t) + p(t) x (t) = q(t) , t ∈ I :

1. If the equation is homogeneous (i.e. q(t) is the 0-function), then the solution
set is a vector subspace of C1 ( I ) .

2. Structure Theorem: If the equation is inhomogeneous the general solution


Linhom can be written in the form

Linhom = x0 (t) + Lhom (16-7)

where x0 (t) is a particular solution to the inhomogeneous differential equation,


and Lhom is the solutions set to the corresponding homogeneous differential
equation.

3. Superposition principle: If x1 (t) is a solution when the right-hand side of the


differential equation is replaced by the function q1 (t) , and x2 (t) is a solution
when the right-hand side is replaced by the function q2 (t) , then x1 (t) + x2 (t)
is a solution when the right-hand side is replaced by the function q1 (t) + q2 (t) .

Proof

We consider the map between vector spaces, f : C1 ( I ) → C0 ( I ) given by

f ( x (t)) = x 0 (t) + p(t) x (t) . (16-8)

This is, according to Lemma 16.6, linear. Therefore we have:

1. Lhom is equal to ker( f ) . Since the kernel for every linear map is a subspace of the
domain, Lhom is a subspace of C1 ( I ) .

2. Since the equation f ( x (t)) = x 0 (t) + p(t) x (t) = q(t) is linear, the structure theorem
follows directly from the general structure theorem for linear equations (see eNote 12,
Theorem 12.14).

3. The superposition principle follows from the fact that f satisfies the linearity require-
ment L1 . Assume that f ( x1 (t)) = q1 (t) and f ( x2 (t)) = q2 (t) . Then

f ( x1 (t) + x2 (t)) = f ( x1 (t)) + f ( x2 (t)) = q1 (t) + q2 (t) .

By this the proof is completed.

When we call a first order differential equation of the form (16-2) linear, it is – as shown
above – closely related to the fact that its left-hand side represents a linear map, and that
its solution set therefore has the unique properties of Theorem 16.7. In the following
example we juggle with the properties in order to decide whether a given differential
equation is not linear.

Example 16.8 First Order Differential Equation That Is Nonlinear

We consider a first order differential equation


x 0 (t) − ( x (t))2 = q(t) , t ∈ R . (16-9)
where we in the usual way have isolated the terms that contain the unknown function on the
left-hand side. The left-hand side represents the map f : C1 (R) → C0 (R) given by
f ( x (t)) = x 0 (t) − ( x (t))2 . (16-10)

Here we will show that one, in different ways, can demonstrate that the differential equation
is not linear.

1. We can show directly that f does not satisfy the linearity conditions. To show this we
can test L2 , e.g. with k = 2 . We compute the two sides in L2 :

f (2x (t)) = (2x (t))0 − (2x (t))2 = 2x 0 (t) − 4( x (t))2


2 f ( x (t)) = 2( x 0 (t) − ( x (t))2 ) = 2x 0 (t) − 2( x (t))2 .

By subtraction of the two equations we get:

f (2x (t)) − 2 f ( x (t)) = −2( x (t))2

where the right-hand side is only the 0-function when x (t) is the 0-function. Since L2
applies for all x (t) ∈ C1 (R) , L2 is not satisfied. Therefore the equation is nonlinear.

2. The solution set to the corresponding homogeneous equation is not a subspace. E.g.
it does not satisfy the stability requirement with respect to multiplication by a scalar,
which we can show as follows:
The function x0 (t) = −1/t is a solution to the homogeneous equation because

x0'(t) − ( x0 (t))² = 1/t² − 1/t² = 0 .

But 2 · x0 (t) = −2/t is not, because

(2 · x0 (t))' − (2 · x0 (t))² = 2/t² − 4/t² = −2/t² ≠ 0 .

Therefore the differential equation is not linear.

3. The solution set does not satisfy the superposition principle. E.g. we see that

f(−1/t) = 0   and   f(1/t) = −2/t² ,   while

f(−1/t + 1/t) = 0 ≠ 0 − 2/t² .

Therefore the differential equation is not linear.

It follows from the structure theorem that homogeneous equations play a special role
for linear differential equations. Therefore we treat them separately in the next section.

16.3 Homogeneous First Order Linear Differential


Equations

We now establish a solution formula for homogeneous first order linear equations.

Theorem 16.9 Solution of the Homogeneous Equation


Let p(t) be a continuous function defined on an open real interval I , and let P(t) be
an arbitrary antiderivative for p(t) , i.e., a function satisfying P0 (t) = p(t).
The general solution for the homogeneous first order linear differential equation

x 0 (t) + p(t) x (t) = 0 , t ∈ I . (16-11)

is then given by
x ( t ) = c e− P ( t ) , t ∈ I (16-12)
where c is an arbitrary real number.

Proof

The theorem follows from the fact that the derivative of a function g(t) is zero on an interval
if and only if that function is constant. We apply this to the function g(t) = x (t)e P(t) . Using
the chain rule and the product rule for differentiation we have:

( x(t)·e^{P(t)} )' = e^{P(t)}·x'(t) + p(t)·e^{P(t)}·x(t) = e^{P(t)}·( x'(t) + p(t)·x(t) ) .


Since e^{P(t)} ≠ 0, the above expression is zero if and only if the equation (16-11) holds. That is,
the differential equation (16-11) is equivalent to the equation:

( x(t)·e^{P(t)} )' = 0 .

As mentioned, this is equivalent to the statement:

x(t)·e^{P(t)} = c ,     (16-13)
where c is some real constant, i.e. that x (t) = ce− P(t) . This shows that not only is ce− P(t) a
solution, for any constant c, but that any solution to (16-11) must be of this form, since it must
satisfy Equation (16-13) for some c.



We already know that the solution set is a subspace of C1 ( I ) . From the formula
(16-12) we now know that the subspace is 1-dimensional, and that the function
e− P(t) is a basis for the solution set.

Remark 16.10
Theorem 16.9 is also valid if p is a continuous complex-valued function, with the
slight modification that the arbitrary constant c is now a complex constant. The
proof is exactly the same, because the product rule is the same for complex-valued
functions of t, and, writing P(t) = u(t) + iv(t), one finds that the derivative of e P(t)
is still P0 (t)e P(t) . Finally, by separating the function into real and imaginary parts,
one again finds that the derivative of a complex-valued function is zero if and only
if the function is equal to a complex constant.

Exercise 16.11

In Theorem 16.9 an arbitrary antiderivative P(t) for p(t) is used. Explain why it is immate-
rial to the solution set which antiderivative you use when you apply the theorem.

Example 16.12 Solution of a Homogeneous Equation

A homogeneous first order linear differential equation is given by

x 0 (t) + cos(t) x (t) = 0, t ∈ R. (16-14)

We see that the coefficient function is p(t) = cos(t) . An antiderivative for p(t) is
P(t) = sin(t) . Then the general solution can be written as

x (t) = ce− P(t) = ce− sin(t) , t∈R (16-15)

where c is an arbitrary real number.
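Solutions such as (16-15) can be confirmed with a computer algebra system. A minimal sketch with SymPy (assuming it is installed); note that SymPy names the arbitrary constant C1 rather than c:

```python
import sympy as sp

t = sp.symbols('t')
x = sp.Function('x')

# The homogeneous equation from Example 16.12: x'(t) + cos(t)*x(t) = 0
ode = sp.Eq(x(t).diff(t) + sp.cos(t) * x(t), 0)
print(sp.dsolve(ode, x(t)))   # expected: x(t) = C1*exp(-sin(t))
```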



16.4 Inhomogeneous Equations Solved by the Guess


Method

Now that we know how to find the general solution for homogeneous first order linear
differential equations, it is about time to look at the inhomogeneous ones. If you already
know or can guess a particular solution to the inhomogeneous equation, it is obvious
to use the structure theorem, see Theorem 16.7. This is demonstrated in the following
examples.

Example 16.13 Solution Using a Guess and the Structure Theorem

An inhomogeneous first order linear differential equation is given by

x 0 (t) + tx (t) = t, t ∈ R. (16-16)

It is easily seen that x0 (t) = 1 is a particular solution. Then we solve the corresponding
homogeneous differential equation

x 0 (t) + tx (t) = 0, t ∈ R. (16-17)

Using symbols from Theorem 16.9 we have p(t) = t that has the antiderivative

P(t) = (1/2)·t² .

The general solution therefore consists of the following functions where c is an arbitrary real
number:

x(t) = c·e^{−(1/2)t²} ,   t ∈ R .     (16-18)

In short:

Lhom = { c·e^{−(1/2)t²} , t ∈ R | c ∈ R } .     (16-19)

Now we can establish the general solution to the inhomogeneous differential equation using
the structure theorem as:

Linhom = x0 (t) + Lhom = { 1 + c·e^{−(1/2)t²} , t ∈ R | c ∈ R } .

Example 16.14 Solution Using a Guess and the Structure Theorem

An inhomogeneous first order linear differential equation is given by

x 0 (t) + 2x (t) = 30 + 8t, t ∈ R. (16-20)

First let us try to guess a particular solution. Since the right-hand side is a first degree poly-
nomial, one can – with the given left-hand side, where you only differentiate and multiply
by 2 – assume that a first degree polynomial could be a solution. Therefore we try to insert
an arbitrary first degree polynomial x0 (t) = b + at in the left-hand side of the differential
equation:
x00 (t) + 2x0 (t) = (b + at)0 + 2(b + at) = a + 2b + 2at .

We compare the resulting expression with the given right-hand side:

a + 2b + 2at = 30 + 8t

that is satisfied for all t ∈ R exactly when

a + 2b = 30 and 2a = 8 ⇔ a = 4 and b = 13 .

Thus we have found a particular solution

x0 (t) = 13 + 4t , t ∈ R .

Then we solve the corresponding homogeneous differential equation

x 0 (t) + 2x (t) = 0, t ∈ R. (16-21)

Using symbols from Theorem 16.9 we have p(t) = 2 that has the antiderivative P(t) = 2t .
Therefore the general solution consists of the following functions where c is an arbitrary real
number:
x (t) = ce−2t , t ∈ R . (16-22)
In short:
Lhom = { c·e^{−2t} , t ∈ R | c ∈ R } .     (16-23)

Now it is possible to establish the general solution to the inhomogeneous differential equa-
tion using the structure theorem:

Linhom = x0 (t) + Lhom = { 13 + 4t + c·e^{−2t} , t ∈ R | c ∈ R } .
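The guessed particular solution and the resulting general solution can be checked with SymPy. A small sketch, assuming SymPy is available:

```python
import sympy as sp

t = sp.symbols('t')
x = sp.Function('x')

# The inhomogeneous equation from Example 16.14: x'(t) + 2*x(t) = 30 + 8*t
ode = sp.Eq(x(t).diff(t) + 2 * x(t), 30 + 8 * t)
print(sp.dsolve(ode, x(t)))    # expected: x(t) = C1*exp(-2*t) + 4*t + 13

# The guessed particular solution x0(t) = 13 + 4t can also be checked directly:
x0 = 13 + 4 * t
print(sp.simplify(sp.diff(x0, t) + 2 * x0 - (30 + 8 * t)))   # 0
```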




Example 16.15 Solution Using a Guess and the Structure Theorem

An inhomogeneous first order linear differential equation is given by

x 0 (t) + x (t) = 1 + sin(2t), t ≥ 0. (16-24)

First let us try to guess a particular solution. Since the right-hand side consists of constant
plus a sine function with the angular frequency 2, it is natural to guess a solution of the type

x (t) = k + a cos(2t) + b sin(2t) .

By insertion of this in the differential equation we get:

−2a sin(2t) + 2b cos(2t) + k + a cos(2t) + b sin(2t) = 1 + sin(2t)

⇔ (2b + a) cos(2t) + (b − 2a − 1) sin(2t) + (k − 1) · 1 = 0 .

Since the set (cos(2t), sin(2t), 1) is linearly independent, this equation is satisfied exactly
when

2b + a = 0 ,   b − 2a − 1 = 0   and   k − 1 = 0   ⇔   a = −2/5 ,   b = 1/5   and   k = 1 .

By this we have found a particular solution

x0 (t) = 1 − (2/5) cos(2t) + (1/5) sin(2t) ,   t ∈ R .

Since the corresponding homogeneous differential equation

x 0 (t) + x (t) = 0, t≥0 (16-25)

evidently has the general solution

x (t) = ce−t , t ≥ 0, (16-26)

we get the general solution to the given inhomogeneous differential equation by use of the
structure theorem:

Linhom = x0 (t) + Lhom = { 1 − (2/5) cos(2t) + (1/5) sin(2t) + c·e^{−t} , t ∈ R | c ∈ R } .




As demonstrated in the three previous examples it makes sense to use the guess method
in the inhomogeneous cases, when you already know a particular solution or easily can

find one. It only requires that you can find an antiderivative P(t) for the coefficient
function p(t) .

Otherwise if you do not have an immediate particular solution, you must use the gen-
eral solution formula (see below) instead. Here you get rid of the guesswork, but you
must find two antiderivatives, one is P(t) as above, while the other often is somewhat
more difficult (if not impossible) to find, since you must integrate a product of func-
tions. In the following section we establish the general solution formula and discuss the
said problems.

16.5 The General Solution Formula

Now we consider the general first order linear differential equation in the standard form

x 0 ( t ) + p ( t ) x ( t ) = q ( t ), t ∈ I, (16-27)

We can determine the general solution using the following general formula.

Theorem 16.16 The General Solution Formula


Let p(t) and q(t) be continuous functions on an open real interval I , and let P(t) be
an arbitrary antiderivative to p(t) . The differential equation

x 0 ( t ) + p ( t ) x ( t ) = q ( t ), t∈I (16-28)

then has the general solution


x(t) = e^{−P(t)} ∫ e^{P(t)} q(t) dt + c·e^{−P(t)} ,   t ∈ I     (16-29)

where c is an arbitrary real number.

Proof

The second term in the solution formula (16-29) we identify as Lhom . If we can show that the
first term is a particular solution to the differential equation, then it follows from the struc-
ture theorem that the solution formula is the general solution to the differential equation.

First we must of course ask ourselves whether the indefinite integral that is part of the solu-
tion formula even exists. It does! See a detailed reasoning for this in the proof of the existence
and uniqueness Theorem 16.24. That the first term
x0(t) = e^{−P(t)} ∫ e^{P(t)} q(t) dt

is a particular solution we show by testing. We insert the term in the left-hand side of the differ-
ential equation and see that the result is equal to the right-hand side.

x0'(t) + p(t) x0(t) = ( e^{−P(t)} ∫ e^{P(t)} q(t) dt )' + p(t) e^{−P(t)} ∫ e^{P(t)} q(t) dt
                    = −p(t) e^{−P(t)} ∫ e^{P(t)} q(t) dt + e^{−P(t)} e^{P(t)} q(t) + p(t) e^{−P(t)} ∫ e^{P(t)} q(t) dt
                    = q(t) .

By this the proof is completed.

Remark 16.17
Using Remark 16.10, it is straightforward to show that Theorem 16.16 is also valid
if p(t) and q(t) are continuous complex-valued functions, with the modification that
the arbitrary constant c is a complex constant.

If one inserts q(t) = 0 in the general solution formula (16-29), the first term
disappears, and what is left is the second term, which is the formula (16-12) for the
homogeneous equation. Therefore the formula (16-29) is a "general formula"
that covers both the homogeneous and the inhomogeneous case.

Exercise 16.18

The solution formula (16-29) includes the indefinite integral ∫ e^{P(t)} q(t) dt , that represents an
arbitrary antiderivative of e P(t) q(t) . Explain why it does not matter to the solution set which
antiderivative you choose to use, when you apply the formula.

Now we give a few examples using the general solution formula. Since it contains

an indefinite integral of a product of functions you will often need integration by parts,
which the second example demonstrates.

Example 16.19 Solution Using the General Formula

Given the differential equation

x'(t) + (2/t)·x(t) = 8t − 10/t ,   t > 0 .     (16-30)

With the symbols in the general solution formula we have p(t) = 2/t and q(t) = 8t − 10/t . An
antiderivative for p(t) is given by:

P(t) = 2 ln t .     (16-31)

We then have

e^{−P(t)} = e^{−2 ln t} = e^{ln(t^{−2})} = t^{−2} = 1/t² .     (16-32)

From this it follows that e^{P(t)} = t² . Now we use the general solution formula:

x(t) = e^{−P(t)} ∫ e^{P(t)} q(t) dt + c·e^{−P(t)}
     = (1/t²) ∫ t² · (8t − 10/t) dt + c·(1/t²)
     = (1/t²) ∫ (8t³ − 10t) dt + c·(1/t²)     (16-33)
     = (1/t²) · (2t⁴ − 5t² + c)
x(t) = 2t² − 5 + c/t² ,   t > 0 .

The general solution consists of these functions where c is an arbitrary real number. In short:

Linhom = { x(t) = 2t² − 5 + c/t² , t > 0 | c ∈ R } .     (16-34)
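The general solution formula (16-29) can also be implemented directly in a computer algebra system. The sketch below (assuming SymPy is installed and can evaluate the integrals involved; the function name general_solution is ours) applies the formula to the equation from Example 16.19:

```python
import sympy as sp

t = sp.symbols('t', positive=True)   # t > 0 as in Example 16.19
c = sp.symbols('c')

def general_solution(p, q):
    """The general solution formula (16-29): e^(-P) * Integral(e^P * q) + c*e^(-P)."""
    P = sp.integrate(p, t)                                    # an antiderivative of p
    return sp.simplify(sp.exp(-P) * sp.integrate(sp.exp(P) * q, t) + c * sp.exp(-P))

# Example 16.19: p(t) = 2/t, q(t) = 8t - 10/t
print(general_solution(2 / t, 8 * t - 10 / t))   # expected: 2*t**2 - 5 + c/t**2
```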

Example 16.20 Solution Using the General Formula

We will solve the differential equation


x'(t) − (1/t)·x(t) = t² sin(2t) ,   t > 0 .     (16-35)

With the symbols in the general solution formula we have p(t) = −1/t and q(t) = t² sin(2t).
An antiderivative for p(t) is given by:

P(t) = − ln t .     (16-36)

We then have

e^{−P(t)} = e^{ln t} = t   and   e^{P(t)} = e^{−ln t} = (e^{ln t})^{−1} = 1/t .     (16-37)

Now we use the general solution formula:

x(t) = e^{−P(t)} ∫ e^{P(t)} q(t) dt + c·e^{−P(t)}
     = t ∫ (1/t) · t² sin(2t) dt + ct
     = t ∫ t sin(2t) dt + ct .

Now we perform an intermediate computation where we use integration by parts to find the
antiderivative.

∫ t sin(2t) dt = −(1/2) t cos(2t) − ∫ −(1/2) cos(2t) dt
              = −(1/2) t cos(2t) + (1/2) ∫ cos(2t) dt
              = −(1/2) t cos(2t) + (1/4) sin(2t) .

And return to the computation:

x(t) = t ∫ t sin(2t) dt + ct
     = t · ( −(1/2) t cos(2t) + (1/4) sin(2t) ) + ct
x(t) = −(1/2) t² cos(2t) + (1/4) t sin(2t) + ct ,   t > 0 .

The general solution consists of these functions where c is an arbitrary real number. In short:

Linhom = { x(t) = −(1/2) t² cos(2t) + (1/4) t sin(2t) + ct , t > 0 | c ∈ R } .     (16-38)

Until now we have considered the general solution to the differential equation. Often
one is interested in a particular solution that for a given value of t assumes a desired
functional value, a so-called initial value problem. We treat this in the next section.

16.6 Initial Value Problems

We consider a first order linear differential equation in its standard form


x 0 ( t ) + p ( t ) x ( t ) = q ( t ), t ∈ I. (16-39)

If we need a solution to the equation that for a given value of t assumes a desired
functional value, the following questions arise: 1) Is there even a solution that satisfies
the desired properties and 2) If yes, how many solutions are there? Before we answer
these questions generally, we consider a couple of examples.

Example 16.21 An Initial Value Problem

In the Example 16.13 we found the general solution to the differential equation

x 0 (t) + tx (t) = t, t ∈ R. (16-40)

viz.
x(t) = 1 + c·e^{−(1/2)t²} ,   t ∈ R

where c is an arbitrary real number.

Now we will find the solution x0 (t) that satisfies the initial value condition x0 (0) = 3 . This
is done by insertion of the initial value in the general solution, whereby we determine c :

x0 (0) = 1 + c·e^{−(1/2)·0²} = 1 + c = 3   ⇔   c = 2 .     (16-41)

Therefore the conditioned solution function to the differential equation is given by

x0 (t) = 1 + 2·e^{−(1/2)t²} ,   t ∈ R .     (16-42)

The figure below shows the graphs for the seven solutions that correspond to initial value
conditions x0 (0) = b where b ∈ {−3, −2, −1, 0, 1, 2, 3} . The solution we just found is the
uppermost. The others are found in a similar way.
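Initial value problems of this kind can be handed directly to SymPy, which accepts the initial value condition through the ics argument. A small sketch, assuming SymPy is available:

```python
import sympy as sp

t = sp.symbols('t')
x = sp.Function('x')

# Initial value problem from Example 16.21: x'(t) + t*x(t) = t with x(0) = 3
ode = sp.Eq(x(t).diff(t) + t * x(t), t)
print(sp.dsolve(ode, x(t), ics={x(0): 3}))   # expected: x(t) = 1 + 2*exp(-t**2/2)
```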

Example 16.22 An Initial Value Problem

In Example 16.19 we found the general solution to the differential equation

x'(t) + (2/t)·x(t) = 8t − 10/t ,   t > 0 ,     (16-43)

viz.

x(t) = 2t² − 5 + c/t² ,   t > 0

where c is an arbitrary real number.

Now we will find the particular solution x0 (t) that satisfies the initial value condition
x0 (1) = 2 . It is done by insertion of the initial value in the general solution, whereby we de-
termine c :

x0 (1) = 2 · 1² − 5 + c/1² = 2 − 5 + c = 2   ⇔   c = 5 .     (16-44)

Therefore the conditioned solution function to the differential equation is given by

x0 (t) = 2t² − 5 + 5/t² ,   t > 0 .     (16-45)

The figure below shows the graphs for the seven solutions that correspond to initial value
conditions x0 (1) = b where b ∈ {−4, −3, −2, −1, 0, 1, 2} . The solution we just found is the
uppermost. The others are found in a similar way.

Example 16.23 The Stationary Response

In Example 16.15 we found the general solution to the differential equation

x 0 (t) + x (t) = 1 + sin(2t), t≥0 (16-46)

viz.
x(t) = 1 − (2/5) cos(2t) + (1/5) sin(2t) + c·e^{−t} ,   t ≥ 0 .     (16-47)

Here we show a series of solutions with the initial values from -1 to 3 for t = 0 :

The figure indicates that all solutions approach a periodic oscillation when t → ∞ . That this
is the case is seen from the general solution of the differential equation, where the fourth term
c·e^{−t} , regardless of the choice of c , tends to 0 as t → ∞ . The first three
terms constitute the stationary response.

In the three preceding examples we did not have any difficulties in finding a solution
to the differential equation that satisfied a given initial condition. In fact we saw that,
for each of the initial value conditions considered, exactly one solution that satisfied the
condition exists. That this applies in general we show in the following theorem.

Theorem 16.24 Existence and Uniqueness of Solutions


Given the differential equation

x 0 ( t ) + p ( t ) x ( t ) = q ( t ), t∈I (16-48)

where I is an open interval and p(t) and q(t) are continuous functions on I .

Then: for every number pair (t0 , b) with t0 ∈ I , exactly one (particular) solution x0 (t) to the
differential equation exists that satisfies the initial value condition

x0 ( t0 ) = b .     (16-49)

Proof

From Theorem 16.16 we know that the set of solutions to the differential equation (16-48) is
given by

x(t) = e^{−P(t)} ∫ e^{P(t)} q(t) dt + c·e^{−P(t)}     (16-50)

where c is an arbitrary real number.

Let us first investigate the indefinite integral that is included in the formula. Does it exist?
This is equivalent to asking: does an antiderivative for the function under the integration
sign exist? We must start with p(t) . Since it is continuous, it has an antiderivative which
we call P(t) . Being an antiderivative, P(t) is differentiable and thus continuous. Since the
exponential function is also continuous the composite function eP(t) is continuous. Finally
since q(t) is continuous, the product eP(t) q(t) is continuous.

By this we have shown that the function under the integration sign is continuous. Therefore
it has an antiderivative, in fact infinitely many antiderivatives that only differ from each other
by constants. We choose an arbitrary antiderivative and call it F (t) . Now we can reformulate
the solution formula as
x (t) = e− P(t) F (t) + ceP(t) (16-51)

where c is an arbitrary real number. Then we insert the initial value condition:

x(t0) = e^{−P(t0)} F(t0) + c·e^{−P(t0)} = b   ⇔   c = b·e^{P(t0)} − F(t0)



where we first multiplied by eP(t0 ) on both sides of the equality sign and then isolated c .
Thus in the general solution set exactly one solution exists that satisfies the initial value
condition, viz. the one that emerges when we insert the found value of c into (16-51).

By this the proof is completed.

Exercise 16.25

Again let us consider the linear map f : C1 ( I ) → C0 ( I ) that represents the left-hand side of
a first order linear differential equation:

f ( x (t)) = x 0 (t) + p(t) x (t) (16-52)

We know that ker( f ) is one dimensional and has the basis vector e− P(t) . But what is the
image space (the range) for f ?

We end this section by an example that shows how it is possible to “go backwards” from
a given general solution to the differential equation it solves.

Example 16.26 From Solution to the Differential Equation

The general solution to a first order inhomogeneous differential equation is given by

Linhom = { x(t) = t·e^{−5t} + ct , t > 0 | c ∈ R } .     (16-53)
Determine the corresponding differential equation that has the form

x 0 (t) + p(t) x (t) = q(t) . (16-54)

(That is, determine p(t) and q(t)).

First we consider the corresponding homogeneous differential equation. With the structure
theorem in mind we immediately see that

Lhom = { x(t) = ct , t > 0 | c ∈ R } .

By insertion of x (t) = ct in the homogeneous equation x 0 (t) + p(t) x (t) = 0 we get

c + p(t)ct = 0 , (16-55)

and since this equation must hold for all c

p(t) = −1/t .     (16-56)

Since we now know p(t) , it only remains to determine the right-hand side q(t) . We find this
by insertion of the particular solution x(t) = t·e^{−5t} into the left-hand side of the equation.

e^{−5t} − 5t·e^{−5t} − (1/t) · t·e^{−5t} = −5t·e^{−5t} = q(t) .     (16-57)

Now since both p(t) and q(t) are determined, the whole differential equation is determined
as:

x'(t) − (1/t)·x(t) = −5t·e^{−5t} ,   t > 0 .     (16-58)

16.7 Finite Dimensional Domain

In some cases we know in advance what type of solutions to the differential equation are
of interest. Therefore one can choose to restrict the domain C1 (R) . We end this eNote
with an example where the domain is a finite dimensional subset of C1 (R) which leads
to the introduction of matrix methods.

Example 16.27 Solution by Matrix Computation

Consider the differential equation

x 0 (t) + (1 − 2t) x (t) = 7t − 4t3 . (16-59)

In this example we are only interested in solutions that belong to the polynomial space
P2 (R) , i.e. the subset of C1 (R) that has the monomial base (1, t, t2 ) .

To find the range f ( P2 (R)) of the linear map f that represents the left-hand side of the
differential equation, we first determine the images of the basis vectors:

f (1) = 1 − 2t , f (t) = 1 + t − 2t2 and f (t2 ) = 2t + t2 − 2t3 .



Since P3 (R) has the monomial base (1, t, t2 , t3 ) , and the found images lie in their span, we
see that the range f ( P2 (R)) is a subspace of P3 (R) .

We want to solve the equation


f ( x (t)) = 7t − 4t3 ,
which can be expressed in matrix form as

F x = b,

where F is the mapping matrix for f with respect to the monomial bases in P2 (R) and
P3 (R) , x is the coordinate matrix for the unknown polynomial with respect to the monomial
basis in P2 (R) , and b is the coordinate matrix for the right-hand side of the differential
equation with respect to the monomial basis in P3 (R) .

Thus, when restricted to P2 (R), the differential equation becomes an inhomogeneous system
of linear equations. The first three columns of the augmented matrix T of the system are
given by F , while the fourth column is b :

\[
\mathbf T =
\begin{bmatrix}
1 & 1 & 0 & 0 \\
-2 & 1 & 2 & 7 \\
0 & -2 & 1 & 0 \\
0 & 0 & -2 & -4
\end{bmatrix}
\;\rightarrow\;
\operatorname{rref}(\mathbf T) =
\begin{bmatrix}
1 & 0 & 0 & -1 \\
0 & 1 & 0 & 1 \\
0 & 0 & 1 & 2 \\
0 & 0 & 0 & 0
\end{bmatrix} .
\]

Since the rank of T is seen to be 3, the differential equation has only one solution. Since
the fourth column in rref(T) states the coordinate vector of the solution with respect to the
monomial basis in P2 (R) , the solution can immediately be stated as:

x0 (t) = −1 + t + 2t2 .
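The row reduction in Example 16.27 can of course be delegated to software. A small sketch with SymPy (assuming it is available):

```python
import sympy as sp

# Augmented matrix from Example 16.27: the first three columns are the images of the
# monomial basis (1, t, t^2) under f, written in the basis (1, t, t^2, t^3) of P3(R);
# the fourth column is the right-hand side 7t - 4t^3.
T = sp.Matrix([[ 1,  1,  0,  0],
               [-2,  1,  2,  7],
               [ 0, -2,  1,  0],
               [ 0,  0, -2, -4]])

R, pivots = T.rref()
print(R)         # last column: the coordinates (-1, 1, 2), i.e. x0(t) = -1 + t + 2t^2
print(pivots)    # (0, 1, 2): rank 3, so exactly one solution in P2(R)
```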

Exercise 16.28

1. Solve the differential equation in Example 16.27 by the guess method or the general
solution formula.

2. How does the general solution differ from the one found in the example?

Exercise 16.29

Replace the right-hand side in the differential equation in Example 16.27 by the function
q(t) = 1 .

1. Show, using matrix computation, that the differential equation does not have a solution
in the subspace P2 (R) given in the example.

2. Using Maple (or other software), find the solution x0 (t) to the differential equation
that satisfies the initial value condition x0 (0) = 0 and draw its graph.

eNote 17

Systems of Linear First-Order Differential


Equations

This eNote describes systems of linear first-order differential equations with constant
coefficients and shows how these can be solved. The eNote is based on eNote 16, which describes
linear differential equations in general. Thus it is a good idea to read that eNote first. Moreover
eigenvalues and eigenvectors are used in the solution procedure, see eNotes 13 and 14.
(Updated: 9.11.21 by David Brander).

Here we consider coupled homogeneous linear first-order differential equations with


constant coefficients (see Explanation 17.1). Such a collection of coupled differential
equations is called a system of differential equations. A system of n first-order differ-
ential equations with constant coefficients looks like this:
x10 (t) = a11 x1 (t) + a12 x2 (t) + . . . + a1n xn (t)
x20 (t) = a21 x1 (t) + a22 x2 (t) + . . . + a2n xn (t)
.. .. .. .. (17-1)
. . . .
xn0 (t) = an1 x1 (t) + an2 x2 (t) + . . . + ann xn (t)
On the left hand side of the system the derivatives of the n unknown functions x1 (t),
x2 (t), . . ., xn (t) are written. Every right hand side is a linear combination of the n un-
known functions. The coefficients ( the a’s) are real constants. In matrix form the system
can be written like this:

\[
\begin{bmatrix} x_1'(t) \\ x_2'(t) \\ \vdots \\ x_n'(t) \end{bmatrix}
=
\begin{bmatrix}
a_{11} & a_{12} & \dots & a_{1n} \\
a_{21} & a_{22} & \dots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{n1} & a_{n2} & \dots & a_{nn}
\end{bmatrix}
\begin{bmatrix} x_1(t) \\ x_2(t) \\ \vdots \\ x_n(t) \end{bmatrix}
\tag{17-2}
\]

Even more compactly it can be written like this

x0 (t) = Ax(t) (17-3)

A is called the system matrix. It is now the aim to solve such a system of differential
equations, that is, we wish to determine x(t) = ( x1 (t), x2 (t), . . . , xn (t)).

Explanation 17.1 What Is a System of Differential Equations?

Systems of differential equations are collections of differential equations. The rea-


son we do not consider the differential equations individually, is that they cannot
be solved independently, because the unknown functions appear in several equa-
tions, that is, the equations are coupled.
can e.g. look like this:
x10 (t) = 4x1 (t) − x2 (t) (17-4)
It is not possible to determine either x1 (t) or x2 (t), since there are two unknown
functions, but only one differential equation.

In order to be able to find the full solution to such an equation one should have as
many equations as one has unknown functions (with corresponding derivatives).
Thus the second equation in the system might be:

x20 (t) = −6x1 (t) + 2x2 (t) (17-5)

We now have as many equations (two), as we have unknown functions (two), and it
is now possible to determine both x1 (t) and x2 (t).

For greater clarity we write the system of differential equations in matrix form. The
system above looks like this:

\[
\begin{bmatrix} x_1'(t) \\ x_2'(t) \end{bmatrix}
= \begin{bmatrix} 4 & -1 \\ -6 & 2 \end{bmatrix}
  \begin{bmatrix} x_1(t) \\ x_2(t) \end{bmatrix}
\;\Leftrightarrow\;
\mathbf x'(t) = \begin{bmatrix} 4 & -1 \\ -6 & 2 \end{bmatrix} \mathbf x(t) = \mathbf A\,\mathbf x(t)
\tag{17-6}
\]

Disregarding that we are operating with vectors and matrices, the system of equations
looks like something we have seen before: x'(t) = A · x(t), something we were
able to solve in high school. The solution to that scalar differential equation is simply
x(t) = c·e^{At} , where c is an arbitrary constant. Below we find that the solution to the
corresponding system of differential equations is similar in structure to x(t) = c·e^{At} .

We now solve the system of differential equations in the following Theorem 17.2. The
theorem contains requirements that are not always satisfied. The special cases where
the theorem is not valid are investigated later. The proof uses a well-known method,
the so-called diagonalization method.

Theorem 17.2
Let A ∈ R^{n×n} . A system of linear differential equations consisting of n equations with
a total of n unknown functions is given by

x0 (t) = Ax(t), t ∈ R. (17-7)

If A has n linearly independent eigenvectors v1 , v2 , . . . , vn corresponding to (not nec-


essarily different) eigenvalues, λ1 , λ2 , . . . , λn , then the general solution of the system
is determined by

x(t) = c1 eλ1 t v1 + c2 eλ2 t v2 + . . . + cn eλn t vn , t ∈ R, (17-8)

where c1 , c2 , . . . , cn are arbitrary complex constants.

Note that it is not always possible to find n linearly independent eigenvectors.


Therefore Theorem 17.2 cannot always be applied to the solution of systems of
first-order differential equations.

In the theorem we use the general complex solution for the system of differ-
ential equations. Therefore the general real solution can be found as the real
subset of the complex solution.

Proof

We guess that a solution to the system of differential equations x0 (t) = Ax(t) is a vector v
multiplied by eλt , λ being a constant, such that x(t) = eλt v. We then have the derivative
x0 (t) = λeλt v . (17-9)
If this expression for x0 (t) is substituted into (17-7) together with the expression for x(t) we
get:
λeλt v = Aeλt v ⇔ Av − λv = 0 ⇔ (A − λE)v = 0 (17-10)
eλt is non-zero for every t ∈ R, and can thus be eliminated. The resulting equation is an
eigenvalue problem. λ is an eigenvalue of A and v is the corresponding eigenvector. They

can both be determined. We have now succeeded in finding that eλt v is one solution to the
system of differential equations, when λ is an eigenvalue and v the corresponding eigenvec-
tor of A.

In order to find the general solution we use the so-called diagonalization method:

We suppose that A = An×n has n linearly independent (real or complex) eigenvectors


v1 , v2 , . . . , vn corresponding to the eigenvalues λ1 , λ2 , . . . , λn . We now introduce the invert-
ible matrix V, that contains all the eigenvectors:
 
V = v1 v2 · · · vn (17-11)

Furthermore we introduce the function y with y(t) = (y1 (t), y2 (t), . . . , yn (t)) such that

x(t) = Vy(t) (17-12)

We then get x0 (t) = Vy0 (t). If these expressions for x(t) og x0 (t) are substituted into Equation
(17-7) we get
Vy0 (t) = AVy(t) ⇔ y0 (t) = V−1 AVy(t) = Λy(t), (17-13)
where Λ = V−1 AV = diag(λ1 , λ2 , . . . λn ) is a diagonal matrix with the eigenvalues of A .

We now get the equation y0 (t) = Λy(t), which can be written in the following way:

y10 (t) = λ1 y1 (t)


y20 (t) = λ2 y2 (t)
.. (17-14)
.
y0n (t) = λn yn (t)

since Λ only has non-zero elements in the diagonal. In this system the single equations are
uncoupled: each of the equations only contains one function and its derivative. Therefore
we can solve them independently and the general solution for every equation is y(t) = ceλt
for all c ∈ C. In total this yields the general solution consisting of the functions below for all
c1 , c2 , . . . , cn ∈ C:
\[
\mathbf y(t) =
\begin{bmatrix} y_1(t) \\ y_2(t) \\ \vdots \\ y_n(t) \end{bmatrix}
=
\begin{bmatrix} c_1 e^{\lambda_1 t} \\ c_2 e^{\lambda_2 t} \\ \vdots \\ c_n e^{\lambda_n t} \end{bmatrix}
\tag{17-15}
\]
Since we now have the solution y(t) we can also find the solution x(t) = Vy(t):

\[
\mathbf x(t) = \begin{bmatrix} \mathbf v_1 & \mathbf v_2 & \dots & \mathbf v_n \end{bmatrix}
\begin{bmatrix} c_1 e^{\lambda_1 t} \\ c_2 e^{\lambda_2 t} \\ \vdots \\ c_n e^{\lambda_n t} \end{bmatrix}
= c_1 e^{\lambda_1 t}\mathbf v_1 + c_2 e^{\lambda_2 t}\mathbf v_2 + \dots + c_n e^{\lambda_n t}\mathbf v_n .
\tag{17-16}
\]

Now we have found the general complex solution to the system of equations in Equation
(17-7) consisting of the functions x(t) for all c1 , c2 , . . . , cn ∈ C.

Example 17.3

Given the system of differential equations

x10 (t) = x1 (t) + 2x2 (t)


(17-17)
x20 (t) = 3x1 (t)

which in matrix form is

\[
\mathbf x'(t) = \begin{bmatrix} 1 & 2 \\ 3 & 0 \end{bmatrix} \mathbf x(t) = \mathbf A\,\mathbf x(t) .
\tag{17-18}
\]
It can be shown that A has the eigenvalues λ1 = 3 and λ2 = −2 with the eigenvectors
v1 = (1, 1) and v2 = (2, −3) (try for yourself!). Therefore the general real solution to the
system of differential equations is given by the functions below for all c1 , c2 ∈ R:

\[
\mathbf x(t) = \begin{bmatrix} x_1(t) \\ x_2(t) \end{bmatrix}
= c_1 e^{3t} \begin{bmatrix} 1 \\ 1 \end{bmatrix}
+ c_2 e^{-2t} \begin{bmatrix} 2 \\ -3 \end{bmatrix} , \quad t \in \mathbb R
\tag{17-19}
\]

The solution is found using Theorem 17.2. Another way of writing the solution is to separate
the system of equations so that

x1 (t) = c1 e3t + 2c2 e−2t


(17-20)
x2 (t) = c1 e3t − 3c2 e−2t

constitutes the general solution, where t ∈ R, for all c1 , c2 ∈ R.
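The eigenvalues and eigenvectors of the system matrix, and the fact that (17-19) really solves the system, can be checked with SymPy. A small sketch, assuming SymPy is installed:

```python
import sympy as sp

A = sp.Matrix([[1, 2],
               [3, 0]])

# Eigenvalues and eigenvectors of the system matrix in Example 17.3
for eigval, mult, vecs in A.eigenvects():
    print(eigval, mult, [list(v) for v in vecs])
# eigenvalue 3 has an eigenvector proportional to (1, 1),
# eigenvalue -2 has an eigenvector proportional to (2, -3)

# Verify that (17-19) solves x'(t) = A x(t) for all constants c1, c2:
t, c1, c2 = sp.symbols('t c1 c2')
x = c1 * sp.exp(3 * t) * sp.Matrix([1, 1]) + c2 * sp.exp(-2 * t) * sp.Matrix([2, -3])
print(sp.simplify(x.diff(t) - A * x))   # the zero vector
```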

17.1 Two Coupled Differential Equations

Given a linear homogeneous first order system of differential equations with constant
coefficients with n equations and n unknown functions
x0 (t) = Ax(t) . t∈R (17-21)
If the system matrix A has n linearly independent eigenvectors, the real solution can be
found using Theorem 17.2. If the eigenvalues are real then the real solution can be writ-

ten directly following formula (17-8) in the theorem, where the n corresponding linearly
independent eigenvectors are real and the arbitrary constants are stated as being real. If
the system matrix has eigenvalues that are not real then the real solution can be found
by extracting the real subset of the complex solution. Also in this case the solution can
be written as a linear combination of n linearly independent real solutions to the system
of differential equations.

We are left with the special case in which the system matrix does not have n linearly in-
dependent eigenvectors. Also in this case the real solution will be a linear combination
of n linearly independent real solutions to the system of differential equations. Here
the method of diagonalization obviously cannot be used and one has to resort to other
methods.

In this section we show the three cases above for systems consisting of n = 2 coupled
differential equations with 2 unknown functions.

Method 17.4
The general real solution to the system of differential equations

x0 (t) = Ax(t), t ∈ R, (17-22)

consisting of n = 2 equations with 2 unknown functions can be written as

x(t) = c1 u1 (t) + c2 u2 (t) , t ∈ R, (17-23)

where u1 and u2 are real linearly independent particular solutions and c1 , c2 ∈ R.

First determine the eigenvalues of A. For the roots of the characteristic polynomial
of A there are three possibilities:

• Two real single roots. In this case both of the eigenvalues λ1 and λ2 have the
algebraic multiplicity 1 and geometric multiplicity 1 and we can put

u1 (t) = eλ1 t v1 and u2 (t) = eλ2 t v2 , (17-24)

where v1 and v2 are proper eigenvectors of λ1 and λ2 , respectively.

• Two complex roots. The two eigenvalues λ and λ̄ are then conjugate complex
numbers. We then determine u1 and u2 using Method 17.5.

• One double root. Here the eigenvalue λ has the algebraic multiplicity 2. If the
geometric multiplicity of λ is 1, u1 and u2 are determined using method 17.7.

In the first case in Method 17.4 with two different real eigenvalues, Theorem 17.2 can be
used directly with the arbitrary constants chosen as real, see Example 17.3.

Now follows the method that covers the case with two complex eigenvalues.

Method 17.5
Two linearly independent real solutions to the system of equations

x0 (t) = Ax(t), t ∈ R, (17-25)

where A has the complex pair of eigenvalues λ = α + βi and λ̄ = α − βi with


corresponding eigenvectors v and v̄, are
 
u1(t) = Re( e^{λt} v ) = e^{αt} ( cos(βt) Re(v) − sin(βt) Im(v) )
u2(t) = Im( e^{λt} v ) = e^{αt} ( sin(βt) Re(v) + cos(βt) Im(v) )     (17-26)

Example 17.6

Given the system of differential equations


 
\[
\mathbf x'(t) = \begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix} \mathbf x(t) = \mathbf A\,\mathbf x(t)
\tag{17-27}
\]
We wish to determine the general real solution.

The eigenvalues are determined as λ = 1 + i and λ̄ = 1 − i, respectively, with the correspond-


ing eigenvectors v = (−i, 1) and v̄ = (i, 1), respectively. We see that there are two complex
eigenvalues and their corresponding complex eigenvectors. With λ = 1 + i we get

\[
\mathbf v = \begin{bmatrix} -i \\ 1 \end{bmatrix}
= \begin{bmatrix} 0 \\ 1 \end{bmatrix} + i \begin{bmatrix} -1 \\ 0 \end{bmatrix}
= \operatorname{Re}(\mathbf v) + i\,\operatorname{Im}(\mathbf v)
\tag{17-28}
\]

If we use Method 17.5 we then get the two solutions:

\[
\mathbf u_1(t) = e^{t}\Bigl( \cos(t)\begin{bmatrix} 0 \\ 1 \end{bmatrix} - \sin(t)\begin{bmatrix} -1 \\ 0 \end{bmatrix} \Bigr)
= e^{t}\begin{bmatrix} \sin(t) \\ \cos(t) \end{bmatrix}
\tag{17-29}
\]

\[
\mathbf u_2(t) = e^{t}\Bigl( \sin(t)\begin{bmatrix} 0 \\ 1 \end{bmatrix} + \cos(t)\begin{bmatrix} -1 \\ 0 \end{bmatrix} \Bigr)
= e^{t}\begin{bmatrix} -\cos(t) \\ \sin(t) \end{bmatrix}
\tag{17-30}
\]

The general real solution to the system of differential equations (17-27) is then given by the
following functions for all c1 , c2 ∈ R:

\[
\mathbf x(t) = c_1 \mathbf u_1(t) + c_2 \mathbf u_2(t)
= e^{t}\Bigl( c_1 \begin{bmatrix} \sin(t) \\ \cos(t) \end{bmatrix}
+ c_2 \begin{bmatrix} -\cos(t) \\ \sin(t) \end{bmatrix} \Bigr) , \quad t \in \mathbb R
\tag{17-31}
\]
found using Method 17.4.
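A quick symbolic check of Example 17.6, namely that (λ, v) is an eigenpair and that the two real solutions from Method 17.5 solve the system, can be done as follows (a sketch assuming SymPy is available):

```python
import sympy as sp

t, c1, c2 = sp.symbols('t c1 c2', real=True)
A = sp.Matrix([[ 1, 1],
               [-1, 1]])

# The complex eigenpair lambda = 1 + i, v = (-i, 1) from Example 17.6
lam = 1 + sp.I
v = sp.Matrix([-sp.I, 1])
print(sp.simplify(A * v - lam * v))     # zero vector, so (lam, v) is an eigenpair

# The two real solutions produced by Method 17.5
u1 = sp.exp(t) * sp.Matrix([sp.sin(t), sp.cos(t)])
u2 = sp.exp(t) * sp.Matrix([-sp.cos(t), sp.sin(t)])
x = c1 * u1 + c2 * u2
print(sp.simplify(x.diff(t) - A * x))   # zero vector, so x(t) solves the system
```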

Finally we describe the method that can be used if the system matrix has the eigenvalue
λ with am(λ) = 2 and gm(λ) = 1, that is when diagonalization is not possible.

Method 17.7
If the system matrix A to the system of differential equations

x0 (t) = Ax(t), t ∈ R, (17-32)

has one eigenvalue λ with algebraic multiplicity 2, but the corresponding eigen-
vector space only has geometric multiplicity 1, there are two linearly independent
solutions to the system of differential equations of the form:

u1 (t) = eλt v
(17-33)
u2 (t) = teλt v + eλt b,

where v is the eigenvector corresponding to λ and b is a solution to the following


linear system:
(A − λE)b = v (17-34)

Proof

It is evident that one solution to the system of differential equations is u1 (t) = eλt v. The
difficulty is to find another solution.

We guess at a solution in the form

u2 (t) = teλt v + eλt b = eλt (tv + b), (17-35)

where v is an eigenvector corresponding to λ. We then have

u2 0 (t) = (eλt + λteλt )v + λeλt b = eλt ((1 + λt)v + λb) (17-36)

We check whether u2 (t) is a solution by substitution into x0 (t) = Ax(t):

u2 0 (t) = Au2 (t) ⇔


(1 + λt)v + λb = A(tv + b) ⇔
(17-37)
t(λv − Av) + (v + λb − Ab) = 0 ⇔
λv − Av = 0 ∧ v + λb − Ab = 0

The first equation can easily be transformed into Av = λv, which is seen to be true, since v is

an eigenvector corresponding to λ. The other equation is transformed into:

v + λb − Ab = 0 ⇔
Ab − λb = v ⇔ (17-38)
(A − λE)b = v

If b satisfies the given system of equations, u2 (t) will also be a solution to the system of
differential equations. We now have found two solutions and we have to find out whether
these are linearly independent. This is done by a normal linearity criterion: If the equation
k1 u1 + k2 u2 = 0 only has the solution k1 = k2 = 0 then u1 and u2 are linearly independent.

k1 u1 + k2 u2 = 0 ⇒
k1 eλt v + k2 (teλt v + beλt ) = 0 ⇔
(17-39)
t ( k 2 v) + ( k 1 v + k 2 b) = 0 ⇔
k2 v = 0 ∧ k1 v + k2 b = 0

Since v is an eigenvector, it is not the zero-vector, and hence k2 = 0 according to the first
equation. Thus the other equation is reduced to k1 v = 0, and with the same argument we get
k1 = 0. Therefore the two solutions are linearly independent, and thus the method has been
proved.

Example 17.8

Given the system of differential equations


 
x'(t) = [ 16  −1 ] x(t) = Ax(t).                             (17-40)
        [  4  12 ]

The eigenvalues for A are determined:

det(A − λE) = | 16 − λ    −1     |
              |  4       12 − λ  | = (16 − λ)(12 − λ) + 4     (17-41)
            = λ^2 − 28λ + 196 = (λ − 14)^2 = 0
There is only one eigenvalue, viz. λ = 14, even though it is a 2 × 2-system. The eigenvectors
are determined:
A − 14E = [ 16 − 14    −1     ] = [ 2  −1 ] → [ 1  −1/2 ]     (17-42)
          [  4        12 − 14 ]   [ 4  −2 ]   [ 0    0  ]

We then obtain the eigenvector (1/2, 1), or equivalently v = (1, 2). We can then conclude that the eigen-
value λ has the algebraic multiplicity 2, but that the corresponding eigenvector space has the

geometric multiplicity 1. In order to determine two independent solutions to the system of


differential equations we can use Method 17.7.

First we solve the following system of equations:


  
(A − λE)b = v  ⇒  [ 2  −1 ] b = [ 1 ]                        (17-43)
                  [ 4  −2 ]     [ 2 ]

[ 2  −1 | 1 ] → [ 1  −1/2 | 1/2 ]                            (17-44)
[ 4  −2 | 2 ]   [ 0    0  |  0  ]
This yields b = (1, 1) if the free parameter is set to 1. The two solutions then are
 
u1(t) = e^{14t} (1, 2)
u2(t) = t e^{14t} (1, 2) + e^{14t} (1, 1).                   (17-45)

By use of Method 17.4 the general solution can be determined to the following functions for
all c1 , c2 ∈ R:
      
x(t) = c1 u1(t) + c2 u2(t) = c1 e^{14t} (1, 2) + c2 e^{14t} ( t (1, 2) + (1, 1) ).    (17-46)
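
A small symbolic sketch (sympy is assumed available; the code is only an illustration) can confirm both that (A − λE)b = v holds for b = (1, 1) and that u1(t) and u2(t) in (17-45) solve the system:

import sympy as sp

t = sp.symbols('t')
A = sp.Matrix([[16, -1], [4, 12]])
lam = 14
v = sp.Matrix([1, 2])
b = sp.Matrix([1, 1])

# b solves (A - lam*E) b = v, cf. (17-43)
print((A - lam*sp.eye(2)) * b == v)            # True

u1 = sp.exp(lam*t) * v
u2 = t*sp.exp(lam*t) * v + sp.exp(lam*t) * b

# Both functions should satisfy x'(t) = A x(t).
for u in (u1, u2):
    print(sp.simplify(sp.diff(u, t) - A*u) == sp.zeros(2, 1))   # True, True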

17.2 n-Dimensional Solution Space

In the preceding section we have considered coupled systems consisting of two linear
equations with two unknown functions. The solution space is two-dimensional, since
it can be written as a linear combination of two linearly independent solutions. This
can be generalized to arbitrary systems with n ≥ 2 coupled linear differential equations
with n unknown functions: The solution is a linear combination of exactly n linearly
independent solutions. This is formulated in a general form in the following theorem.

Theorem 17.9
Given the linear homogeneous first order system of differential equations with con-
stant real coefficients
x0 (t) = Ax(t), t ∈ R, (17-47)
consisting of n equations and with n unknown functions. The general real solution
to the system is n-dimensional and can be written as

x(t) = c1 u1 (t) + c2 u2 (t) + · · · + cn un (t), (17-48)

where u1 (t), u2 (t), . . . , un (t) are linearly independent real solutions to the system of
differential equations and c1 , c2 , . . . , cn ∈ R.

Below is an example with a coupled system of three differential equations that exempli-
fies Theorem 17.9.

Example 17.10 Advanced

Given the system of differential equations

x'(t) = [ −9  10  0 ]
        [ −3   1  5 ] x(t) = Ax(t)                           (17-49)
        [  1  −4  6 ]

We wish to determine the general real solution to the system of differential equations. Eigen-
values and eigenvectors can be determined and are as follows:

λ1 = −4 : v1 = (10, 5, 1)
λ2 = 1 : v2 = (5, 5, 3)

Moreover λ2 has the algebraic multiplicity 2, but the corresponding eigenvector space has
the geometric multiplicity 1. Because n = 3 we need 3 linearly independent solutions to
construct the general solution, as stated in Theorem 17.9. The eigenvalues are considered separately:

1) The first eigenvalue, λ1 = −4, has both geometric and algebraic multiplicity equal to 1.
This yields exactly one solution
 
u1(t) = e^{λ1 t} v1 = e^{−4t} (10, 5, 1)                     (17-50)

2) The other eigenvalue, λ2 = 1, has algebraic multiplicity 2, but geometric multiplicity 1.



Therefore we can use method 17.7 in order to find two solutions. First b is determined:

(A − λ2 E) b = v2  ⇒  [ −10  10  0 ]     [ 5 ]
                      [  −3   0  5 ] b = [ 5 ]               (17-51)
                      [   1  −4  5 ]     [ 3 ]

A particular solution to this system of equations is b = (0, 1/2, 1). With this knowledge we
have two additional linearly independent solutions to the system of differential equations:
 
u2(t) = e^{λ2 t} v2 = e^t (5, 5, 3)
u3(t) = t e^{λ2 t} v2 + e^{λ2 t} b = t e^t (5, 5, 3) + e^t (0, 1/2, 1)    (17-52)

We leave it to the reader to show that all three solutions are linearly independent.

According to Theorem 17.9 the general real solution consists of the following linear combina-
tion for all c1 , c2 , c3 ∈ R:
x(t) = c1 u1 (t) + c2 u2 (t) + c3 u3 (t) (17-53)
Thus this yields
        
x(t) = c1 e^{−4t} (10, 5, 1) + c2 e^t (5, 5, 3) + c3 ( t e^t (5, 5, 3) + e^t (0, 1/2, 1) )    (17-54)

where t ∈ R and all c1 , c2 , c3 ∈ R.
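
The linear independence left to the reader can, for instance, be checked by evaluating the three solutions at t = 0 and verifying that the determinant of the resulting matrix is nonzero. The following Python sketch (numpy assumed, not part of the eNote) does exactly that:

import numpy as np

# u1(0) = v1, u2(0) = v2, u3(0) = b from Example 17.10
u1_0 = np.array([10.0, 5.0, 1.0])
u2_0 = np.array([5.0, 5.0, 3.0])
u3_0 = np.array([0.0, 0.5, 1.0])

W = np.column_stack([u1_0, u2_0, u3_0])
print(np.linalg.det(W))   # approximately 12.5, nonzero, hence linearly independent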

17.3 Existence and Uniqueness of Solutions

According to the Structural Theorem 17.9 the general solution to a system of differential
equations with n equations contains n arbitrary constants. If we have n initial condi-
tions, then the constants can be determined, and we then get a unique solution. This is
formulated in the following existence and uniqueness theorem.

Theorem 17.11
A first order system of differential equations consisting of n equations in n unknown
functions with constant coefficients is given by

x0 (t) = Ax(t), t ∈ I. (17-55)

For every t0 ∈ I and every set of numbers y0 = (y1 , y2 , . . . , yn ) there exists exactly one
solution x(t) = ( x1 (t), x2 (t), . . . , xn (t) ) satisfying the initial conditions

x(t0 ) = y0 , (17-56)

that is
x1 ( t0 ) = y1 , x2 ( t0 ) = y2 , . . . , x n ( t0 ) = y n . (17-57)

Example 17.12

In Example 17.3 we found the general solution to the system of differential equations
 
x'(t) = [ 1  2 ] x(t),   t ∈ R,                              (17-58)
        [ 3  0 ]
viz.
x(t) = ( x1(t), x2(t) ) = c1 e^{3t} (1, 1) + c2 e^{−2t} (2, −3),   t ∈ R    (17-59)
Now we wish to determine the unique solution x(t) = ( x1 (t), x2 (t)) that satisfies the initial
condition x(0) = ( x1 (0), x2 (0)) = (6, 6). This yields the system of equations
        
(6, 6) = c1 e^0 (1, 1) + c2 e^0 (2, −3) = [ 1   2 ] [ c1 ]   (17-60)
                                          [ 1  −3 ] [ c2 ]
By ordinary Gauss-Jordan elimination we get
     
[ 1   2 | 6 ] → [ 1   2 | 6 ] → [ 1  0 | 6 ]                 (17-61)
[ 1  −3 | 6 ]   [ 0  −5 | 0 ]   [ 0  1 | 0 ]
Thus we obtain the solution (c1 , c2 ) = (6, 0), and the unique conditional solution is therefore
 
x(t) = 6e^{3t} (1, 1),   t ∈ R,                              (17-62)

which is equivalent to

x1(t) = 6e^{3t},   x2(t) = 6e^{3t}.                          (17-63)
In this particular case the two functions are identical.
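
The system of equations (17-60) can of course also be solved numerically; a minimal Python sketch (numpy assumed, not part of the eNote) reads:

import numpy as np

V = np.array([[1.0, 2.0],
              [1.0, -3.0]])       # columns: the two eigenvectors at t0 = 0
x0 = np.array([6.0, 6.0])         # the initial condition x(0)
print(np.linalg.solve(V, x0))     # approximately [6., 0.]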

17.4 Transformation of Linear n’th Order Homogeneous


Differential Equations to a First Order System of
Differential Equations

With a bit of ingenuity it is possible to transform a homogeneous nth order differen-


tial equation with constant coefficients to a system of differential equations that can be
solved using the methods in this eNote.

Method 17.13
An nth order linear differential equation

x^(n)(t) + an−1 x^(n−1)(t) + an−2 x^(n−2)(t) + · · · + a1 x'(t) + a0 x(t) = 0    (17-64)

for t ∈ R, can be transformed into a first order system of differential equations and
the system will look like this:
 0    
x1 ( t ) 0 1 0 ··· 0 x1 ( t )
 x 0 (t)   0 1 ···
0 0   x2 (t) 
 
 2  
.. . .. .. .. .. ..
 =  .. . (17-65)
    

 0 .   . . . 
 . 

 x n −1 ( t )   0 0 0 ··· 1  xn−1 (t) 
0
xn (t) − a 0 − a 1 − a 2 · · · − a n −1 xn (t)

and x1 (t) = x (t).

The proof of this rewriting is simple but gives a good understanding of the transforma-
tion.

Proof

Given an nth order differential equation as in Equation (17-64). We introduce n functions in


this way:
x1(t) = x(t)
x2(t) = x1'(t) = x'(t)
x3(t) = x2'(t) = x''(t)
   ...                                                       (17-66)
xn−1(t) = xn−2'(t) = x^(n−2)(t)
xn(t) = xn−1'(t) = x^(n−1)(t)

These new expressions are substituted into the differential equation (17-64):

xn'(t) + an−1 xn(t) + an−2 xn−1(t) + . . . + a1 x2(t) + a0 x1(t) = 0    (17-67)

Now this equation can together with equations (17-66) be written in matrix form.

[ x1'(t)   ]   [  0    1    0   ···    0    ] [ x1(t)   ]
[ x2'(t)   ]   [  0    0    1   ···    0    ] [ x2(t)   ]
[   ...    ] = [           ...              ] [   ...   ]    (17-68)
[ xn−1'(t) ]   [  0    0    0   ···    1    ] [ xn−1(t) ]
[ xn'(t)   ]   [ −a0  −a1  −a2  ···  −an−1  ] [ xn(t)   ]

The method is thus proved.

Example 17.14

Given a linear differential equation of third order with constant coefficients:

x'''(t) − 4x''(t) − 7x'(t) + 10x(t) = 0,   t ∈ R.            (17-69)

We wish to determine the general solution. Therefore the following functions are introduced

x1(t) = x(t)
x2(t) = x1'(t) = x'(t)                                       (17-70)
x3(t) = x2'(t) = x''(t)

In this way we can rewrite the differential equation as

x3'(t) − 4x3(t) − 7x2(t) + 10x1(t) = 0                       (17-71)

And we can then gather the last three equations in a system of equations.

x1'(t) = x2(t)
x2'(t) = x3(t)                                               (17-72)
x3'(t) = −10x1(t) + 7x2(t) + 4x3(t)

This is written in matrix form in this way:


 
x'(t) = [   0  1  0 ]
        [   0  0  1 ] x(t)                                   (17-73)
        [ −10  7  4 ]

The eigenvalues are determined to be λ1 = −2, λ2 = 1 and λ3 = 5. The general solution


to the system of differential equations according to Theorem 17.2 is given by the following
functions for all the arbitrary constants c1 , c2 , c3 ∈ R:

x(t) = c1 e−2t v1 + c2 et v2 + c3 e5t v3 , t ∈ R, (17-74)

where v1 , v2 , v3 are the respective eigenvectors.

But we need only the solution of x1 (t) = x (t), and we isolate this from the general solution
to the system. Furthermore we introduce three new arbitrary constants k1 , k2 , k3 ∈ R, that are
equal to the product of the c’s and the first coordinates of the eigenvectors. The result is

x (t) = x1 (t) = k1 e−2t + k2 et + k3 e5t , t∈R (17-75)

This constitutes the general solution to the differential equation (17-69). If the first coordinate
in v1 is 0 , we put k1 = 0 ; otherwise k1 can be an arbitrary real number. Similarly for k2 and
k3 .
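
As a quick sanity check, the companion matrix in (17-73) can be built from the coefficients of (17-69) and its eigenvalues computed numerically. The Python sketch below (numpy assumed, not part of the eNote) reproduces the roots −2, 1 and 5:

import numpy as np

# x'''(t) - 4x''(t) - 7x'(t) + 10x(t) = 0  ->  a2 = -4, a1 = -7, a0 = 10
a0, a1, a2 = 10.0, -7.0, -4.0
A = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [-a0, -a1, -a2]])
print(np.linalg.eigvals(A))   # the roots -2, 1 and 5 in some order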

eNote 18

Linear Second-Order Differential Equations with Constant Coefficients

Following eNotes 16 and 17 about differential equations, we now present this eNote about
second-order differential equations. Parts of the proofs closely follow the preceding notes and a
knowledge of these notes is therefore a prerequisite. In addition, complex numbers are used.

Updated: 15.11.21 David Brander

Linear second-order differential equations with constant coefficients look like this:

x 00 (t) + a1 x 0 (t) + a0 x (t) = q(t), t ∈ I, q : I → R (18-1)

a0 , a1 ∈ R are constant coefficients of x (t) and x 0 (t), respectively. The right hand side
q(t) is a continuous real function, with the domain being an interval I (which could be
all of R ). The equation is called homogeneous if q(t) = 0 for all t ∈ I and otherwise
inhomogeneous.

The left hand side is linear in x, i.e., the map f : C ∞ (R) → C ∞ (R) given by

f ( x (t) ) = x 00 (t) + a1 x 0 (t) + a0 x (t)               (18-2)

satisfies the linearity requirements L1 and L2 . The method used in this eNote for solving
the inhomogeneous equation exploits this linearity.

Method 18.1 Solutions and their structure

1. The general solution Lhom for a homogeneous linear second-order differential


equation
x 00 (t) + a1 x 0 (t) + a0 x (t) = 0, t ∈ I (18-3)
where a0 , a1 ∈ R , can be determined using Theorem 18.2.

2. The general solution set Linhom for an inhomogeneous linear second-order dif-
ferential equation

x 00 (t) + a1 x 0 (t) + a0 x (t) = q(t), t ∈ I , q : I → R, (18-4)

where a0 , a1 ∈ R , can, using Theorem 12.14, be split into two:

(a) First the general solution Lhom to the corresponding homogeneous equation is
determined. This is produced by setting q(t) = 0 in (18-4).
(b) Then a particular solution x0 (t) to (18-4) is determined e.g. by guessing.
Concerning this see section 18.2.

The general solution then has the following structure

Linhom = x0 (t) + Lhom . (18-5)

18.1 The Homogeneous Equation

We now consider the linear homogeneous second-order differential equation

x 00 (t) + a1 x 0 (t) + a0 x (t) = 0, t ∈ R, (18-6)

where a0 and a1 are real constants. We wish to determine the general solution. This can
be accomplished using exact formulas that depend on the appearance of the equation.

Theorem 18.2 Solution to the Homogeneous Equation


The homogeneous differential equation

x 00 (t) + a1 x 0 (t) + a0 x (t) = 0, t ∈ R, (18-7)

has the so-called characteristic equation

λ2 + a1 λ + a0 = 0. (18-8)

The type of roots to this equation determines how the general solution Lhom to the
homogeneous differential equation will appear.

• Two different real roots λ1 and λ2 yield the solution

x ( t ) = c 1 eλ1 t + c 2 eλ2 t , t ∈ R. (18-9)

• Two complex roots λ = α ± βi, with Im(λ) = ±β ≠ 0, yield the real solution

x (t) = c1 eαt cos( βt) + c2 eαt sin( βt), t ∈ R. (18-10)

• The double root λ yields the solution

x (t) = c1 eλt + c2 teλt , t ∈ R. (18-11)

In all three cases the respective functions for all c1 , c2 ∈ R constitute the general
solution Lhom .

In Section 17.4 you find the theory for rewriting this type of differential equa-
tion as a system of first-order differential equations. The same method works here.
The system will then look like this:
[ x1'(t) ]   [  0     1  ] [ x1(t) ]
[ x2'(t) ] = [ −a0   −a1 ] [ x2(t) ]                         (18-12)

where x1 (t) = x (t) and x2 (t) = x10 (t) = x 0 (t). The problem can now be solved
using the theory and methods outlined in that section.

Proof

The homogeneous second-order linear differential equation (18-7) is rewritten as a system of


first-order differential equations:
[ x1'(t) ]   [  0     1  ] [ x1(t) ]     [ x1(t) ]
[ x2'(t) ] = [ −a0   −a1 ] [ x2(t) ] = A [ x2(t) ]           (18-13)

where x1 (t) = x (t) is the function we seek. The proof builds on the theorems and methods
in Section 17.1. For the proof we need the eigenvalues
of the system matrix A:

det(A − λE) = | −λ      1      | = λ2 + a1 λ + a0 = 0,       (18-14)
              | −a0   −a1 − λ  |

which is also the characteristic equation for the differential equation. The type of roots of this
equation determines the solution x (t) = x1 (t), which gives the following three parts of the
proof:

First part
The characteristic equation has two different real roots: λ1 and λ2 . By using Method 17.4 we
obtain two linearly independent solutions u1 (t) = v1 eλ1 t and u2 (t) = v2 eλ2 t , where v1 and v2
are eigenvectors corresponding to the two eigenvalues , respectively. The general solution is
then spanned by:
x(t) = k1 u1 (t) + k2 u2 (t) = k1 eλ1 t v1 + k2 eλ2 t v2 , (18-15)
for all k1 , k2 ∈ R. The first coordinate x1 (t) = x (t) is the solution wanted:

x 1 ( t ) = x ( t ) = c 1 eλ1 t + c 2 eλ2 t , (18-16)

which for all the arbitrary constants c1 , c2 ∈ R constitutes the general solution. c1 and c2
are two new arbitrary constants and they are the products of the k-constants and the first
coordinates of the eigenvectors: c1 = k1 v11 and c2 = k2 v21 .

Second part
The characteristic equation has the complex pair of roots λ = α + βi and λ̄ = α − βi. It is
possible to find the general solution using Method 17.5.

x(t) = k1 u1 (t) + k2 u2 (t)


= k1 eαt (cos( βt)Re(v) − sin( βt)Im(v)) + k2 eαt (sin( βt)Re(v) + cos( βt)Im(v)) (18-17)
= eαt cos( βt) · (k1 Re(v) + k2 Im(v)) + eαt sin( βt) · (−k1 Im(v) + k2 Re(v)).

v is an eigenvector corresponding to λ and k1 and k2 are arbitrary constants. The first coordi-
nate x1 (t) = x (t) is the wanted solution, and is according to the above given by

x1 (t) = x (t) = c1 eαt cos( βt) + c2 eαt sin( βt). (18-18)



For all c1 , c2 ∈ R, x (t) constitutes the general solution. c1 and c2 are two new arbitrary
constants given by c1 = k1 Re(v1 ) + k2 Im(v1 ) and c2 = −k1 Im(v1 ) + k2 Re(v1 ). v1 is the first
coordinate of v.

Third part
The characteristic equation has the double root λ. Because of the appearance of the system
matrix (A − λE always contains the nonzero entry 1 in the upper right corner, so its rank is at
least 1) the geometric multiplicity of the corresponding eigenvector space is 1, and it is then
possible to use Method 17.7 to find the general solution.

x(t) = k1 u1 (t) + k2 u2 (t) = k1 eλt v + k2 (teλt v + eλt b) = eλt (k1 v + k2 b) + k2 teλt v, (18-19)

where v is an eigenvector corresponding to λ, b is the solution to the system of equations


(A − λE)b = v, and k1 , k2 are two arbitrary constants. Taking the first coordinate we get

x (t) = c1 eλt + c2 teλt , (18-20)

which for all c1 , c2 ∈ R constitutes the general solution. c1 and c2 are two new arbitrary
constants, given by c1 = k1 v1 + k2 b1 and c2 = k2 v1 , in which v1 is the first coordinate in v and
b1 is the first coordinate in b.

All three cases of roots of the characteristic equation have now been treated, thus
proving the theorem.

Notice that it is also possible to arrive at the characteristic equation by guessing a


solution to the differential equation of the form x (t) = eλt . One then gets:

x 00 (t) + a1 x 0 (t) + a0 x (t) = 0 ⇒ λ2 eλt + a1 λeλt + a0 eλt = 0 (18-21)

Dividing this equation by eλt , which is non-zero for all values of t, yields the charac-
teristic equation.

Example 18.3 Solution to the Homogeneous Equation

Given the homogeneous differential equation


x 00 (t) + x 0 (t) − 20x (t) = 0, t ∈ R, (18-22)
which has the characteristic equation
λ2 + λ − 20 = 0. (18-23)

We wish to determine the general solution Lhom to this homogeneous equation.

The characteristic equation has the roots λ1 = −5 and λ2 = 4, since −5 · 4 = −20 and −(−5 +
4) = 1 are the coefficients of the characteristic equation. Therefore the general solution to the
homogeneous equation is

Lhom = { c1 e−5t + c2 e4t , t ∈ R | c1 , c2 ∈ R },           (18-24)

which has been found using Theorem 18.2.

Example 18.4 Solution to the Homogeneous Equation

A homogeneous second-order differential equation with constant coefficients is given by:

x 00 (t) − 8x 0 (t) + 16x (t) = 0, t ∈ R. (18-25)

We wish to determine Lhom , the general solution to the homogeneous equation. The charac-
teristic equation is
λ2 − 8λ + 16 = 0 ⇔ (λ − 4)2 = 0 (18-26)
Thus we have the double root λ = 4, and the general solution set is composed of the follow-
ing function for all c1 , c2 ∈ R:

x (t) = c1 e4t + c2 te4t , t ∈ R. (18-27)

The result is determined using Theorem 18.2.
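
The roots of a characteristic equation, and thereby the case of Theorem 18.2 that applies, can also be found numerically. The following Python sketch (numpy assumed, not part of the eNote) treats the characteristic polynomials from Examples 18.3 and 18.4:

import numpy as np

# Example 18.3: lambda^2 + lambda - 20 = 0
print(np.roots([1, 1, -20]))   # the roots 4 and -5  ->  c1 e^{-5t} + c2 e^{4t}

# Example 18.4: lambda^2 - 8 lambda + 16 = 0
print(np.roots([1, -8, 16]))   # approximately [4., 4.] ->  c1 e^{4t} + c2 t e^{4t}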

As can be seen from the two preceding examples it is relatively simple to determine
the solution to the homogeneous equation. In addition it is possible to determine the
differential equation from the solution, that is "go backwards". This is illustrated in the
following example.

Example 18.5 From Solution to Equation

The solution to a differential equation is known:

x (t) = c1 e2t cos(7t) + c2 e2t sin(7t), t ∈ R, (18-28)

which with the arbitrary constants c1 , c2 constitute the general solution.

Since the solution only includes terms with arbitrary constants, the equation must be homo-
geneous. Furthermore it is seen that the solution structure is similar to the solution structure

in equation (18-10) in Theorem 18.2. This means that the characteristic equation of the second-
order differential equation has two complex roots: λ = 2 ± 7i. The characteristic equation
given these roots reads:

(λ − 2 + 7i )(λ − 2 − 7i ) = (λ − 2)2 − (7i )2
                           = λ2 − 4λ + 4 + 49 = λ2 − 4λ + 53 = 0    (18-29)

Directly from coefficients of the characteristic equation we can write the differential equation
as:
x 00 (t) − 4x 0 (t) + 53x (t) = 0, t ∈ R. (18-30)
This can also be seen from Theorem 18.2.

18.2 The Inhomogeneous Equation

In this section we wish to determine a particular solution x0 (t) to the inhomogeneous


differential equation

x 00 (t) + a1 x 0 (t) + a0 x (t) = q(t), t ∈ I, q : I → R. (18-31)

We wish to find a particular solution, because it is part of the general solution Linhom
together with the general solution Lhom to the corresponding homogeneous equation cf.
Method 18.1.

In this eNote we do not use a specific solution formula. Instead we use different meth-
ods depending on the form of q(t). In general one might guess that a particular solu-
tion x0 (t) has a form that somewhat resembles q(t), as will appear from the following
methods. Notice that these methods cover some frequently occurring forms of q(t), but
certainly not all.

Furthermore the concept of superposition will be treated. Superposition is a basic quality


of linear equations and linear differential equations. The point is to split the equation
into more equations in which the left hand sides stay the same while the sum of the
right hand sides is equal to the right hand side of the original equation. If the original
equation has the right hand side q(t) = sin(2t) + 2t2 , it may be a good idea to split the
equation into two, where the right hand sides become q1 (t) = sin(2t) and q2 (t) = 2t2
respectively. It is easier to determine particular solutions to the two equations. A par-
ticular solution to the original equation will then be the sum of the two particular solu-
tions.

Finally we will introduce the complex guess method. The complex guess method can be

used if the right hand side q(t) of the equation is the real part of a simple complex ex-
pression, e.g. q(t) = et sin(3t) that is the real part of −ie(1+3i)t . Solving an equation with
a simple right hand side is easier, and therefore the corresponding complex equation is
solved instead. The solutions to the real equation and to the corresponding complex
equation are closely related.

18.2.1 General Solution Methods

Method 18.6 Polynomial


Given the inhomogeneous differential equation

x 00 (t) + a1 x 0 (t) + a0 x (t) = q(t), t ∈ I, (18-32)

where q is an n-th degree polynomial. If a0 ≠ 0, there exists a polynomial of degree n
that is a particular solution to the equation. In general there exists a polynomial of
degree at most n + 2 that is a particular solution to the equation. A particular solution
of the form mentioned is found by inserting a polynomial of suitable degree with
unknown coefficients into the left-hand side of the equation and matching it to the
right-hand side q, cf. the identity theorem for polynomials, eNote 2, Theorem 2.15.

Example 18.7 Polynomial

Given the inhomogeneous second-order differential equation with constant coefficients

x 00 (t) − 3x 0 (t) + x (t) = 2t2 − 16t + 25, t ∈ R. (18-33)

We wish to determine a particular solution x0 (t) to the inhomogeneous equation. Since the
right hand side is a second degree polynomial we insert an unknown polynomial of second
degree in the left-hand side of the equation and equate this with the right-hand side:

x0 (t) = b2 t2 + b1 t + b0 , t ∈ R. (18-34)

The coefficients are determined by substituting the expression into the differential equation

together with x0'(t) = 2b2 t + b1 and x0''(t) = 2b2 .

2b2 − 3(2b2 t + b1 ) + b2 t2 + b1 t + b0 = 2t2 − 16t + 25 ⇔
(b2 − 2)t2 + (−6b2 + b1 + 16)t + (2b2 − 3b1 + b0 − 25) = 0 ⇔    (18-35)
b2 − 2 = 0  and  −6b2 + b1 + 16 = 0  and  2b2 − 3b1 + b0 − 25 = 0

From the first equation it is evident that b2 = 2, and by substituting this in the second equa-
tion we get b1 = −4. Finally the last equation yields b0 = 9. Therefore a particular solution
to Equation (18-33) is given by

x0 (t) = 2t2 − 4t + 9, t ∈ R. (18-36)

Exercise 18.8 Polynomial

Given the following differential equation where the right-hand side is a first degree polyno-
mial:
x 00 (t) = t + 1 , t ∈ R. (18-37)
Show that you have to go to the third degree in order to find a polynomial that is a particular
solution to the equation.

Method 18.9 Trigonometric


A particular solution x0 (t) to the inhomogeneous differential equation

x 00 (t) + a1 x 0 (t) + a0 x (t) = q(t), t ∈ I, (18-38)

where q(t) = a cos(ωt) + b sin(ωt), is of the same form:

x0 (t) = A sin(ωt) + B cos(ωt), t ∈ I, (18-39)

where A and B are determined by substitution of the expression for x0 (t) as a solu-
tion into the inhomogeneous equation.

It is also possible to determine a particular solution to a differential equation


like the one in Method 18.9 using the complex guess method, cf. e.g. section
18.2.3.

Example 18.10 Trigonometric

Given the differential equation

x 00 (t) + x 0 (t) − x (t) = −20 sin(3t) + 6 cos(3t), t ∈ R. (18-40)

We wish to determine a particular solution x0 (t). By the use of Method 18.9 a particular
solution is
x0 (t) = A sin(ωt) + B cos(ωt) = A sin(3t) + B cos(3t). (18-41)
In addition we have
x00 (t) = 3A cos(3t) − 3B sin(3t)
(18-42)
x000 (t) = −9A sin(3t) − 9B cos(3t)
This is substituted into the equation

(−9A sin(3t) − 9B cos(3t)) + (3A cos(3t) − 3B sin(3t)) − ( A sin(3t) + B cos(3t))
= −20 sin(3t) + 6 cos(3t) ⇔
(−9A − 3B − A + 20) sin(3t) + (−9B + 3A − B − 6) cos(3t) = 0 ⇔    (18-43)
−9A − 3B − A + 20 = 0  and  −9B + 3A − B − 6 = 0
This is two equations in two unknowns. Substituting A = −(3/10)B + 2 from the first equation
into the second yields

−9B + 3( −(3/10)B + 2 ) − B − 6 = 0 ⇔ −10B − (9/10)B = 0 ⇔ B = 0    (18-44)

From this we get that A = 2, and a particular solution to the differential equation is then

x0 (t) = 2 sin(3t), t ∈ R. (18-45)

Note that the number ω = 3 is the same in the arguments of both cosine
and sine in Example 18.10, and this is the only case that Method 18.9 facili-
tates. If two different numbers are present Method 18.9 does not apply, e.g.
q(t) = 3 sin(t) + cos(10t). But either superposition or the complex guess method
can be applied, and they will be described in section 18.2.2 and section 18.2.3,
respectively.

Method 18.11 Exponential Function


A particular solution x0 (t) to the inhomogeneous differential equation

x 00 (t) + a1 x 0 (t) + a0 x (t) = q(t), t ∈ I, (18-46)

where q(t) = βeαt and α, β ∈ R, is also an exponential function:

x0 (t) = γeαt , t ∈ I, (18-47)

where γ is determined by substituting the expression for x0 (t) as a solution into the
inhomogeneous equation. We emphasize that α must not be a root of the character-
istic equation for the differential equation.

As commented by the end of Method 18.11 the exponent α must not be a root of
the characteristic equation. If this is the case the guess will be a solution to the
corresponding homogeneous equation c.f. Theorem 18.2. This is a “problem”
for all orders of differential equations.

Example 18.12 Exponential Function

Given the differential equation

x 00 (t) + 11x 0 (t) + 5x (t) = −20e−t , t ∈ R. (18-48)

We wish to determine a particular solution x0 (t). According to Method 18.11 a particular


solution is given by x0 (t) = γeαt = γe−t . We do not yet know whether α = −1 is a root in the
characteristic equation, but if it is possible to find γ, it is not a root. We have x00 (t) = −γe−t
and x000 (t) = γe−t , and this is substituted into the differential equation:

γe−t + 11(−γe−t ) + 5γe−t = −20e−t ⇔ − 5γ = −20 ⇔ γ = 4 (18-49)

Thus we have succeeded in finding γ, and therefore we have a particular solution to the
differential equation:
x0 (t) = 4e−t , t ∈ R. (18-50)

Method 18.13 Exponential Function Belonging to Lhom


A particular solution x0 (t) to the inhomogeneous differential equation

x 00 (t) + a1 x 0 (t) + a0 x (t) = q(t), t ∈ I, (18-51)

where q(t) = βeλt , β ∈ R and λ is a root in the characteristic equation of the differ-
ential equation, has the following form:

x0 (t) = γteλt , t ∈ I, (18-52)

where γ is determined by substitution of the expression for x0 (t) as a solution into


the inhomogeneous equation.

Example 18.14 Exponential Function Belonging to Lhom

Given the differential equation


x 00 (t) − 7x 0 (t) + 10x (t) = −3e2t , t ∈ R. (18-53)
We wish to determine a particular solution. First we try to use Method 18.11, and guess a
solution of the form x0 (t) = γeαt = γe2t . One then has x00 (t) = 2γe2t and x000 (t) = 4γe2t ,
which by substitution into the equation gives
4γe2t − 7 · 2γe2t + 10γe2t = −3e2t ⇔ 0 = −3 (18-54)
It is seen that γ does not appear in the last equation, and that the equation otherwise is false.
Therefore α = λ must be a root in the characteristic equation. The characteristic equation
looks like this:
λ2 − 7λ + 10 = 0 (18-55)
This second degree equation has the roots 2 and 5, since 2 · 5 = 10 and −(2 + 5) = −7. It is
true that α = 2 is a root.

Consequently we use Method 18.13, and we guess a solution of the form x0 (t) = γteλt =
γte2t . We then have
x00 (t) = γe2t + 2γte2t
(18-56)
x000 (t) = 2γe2t + 2γe2t + 4γte2t = 4γe2t + 4γte2t
This is substituted into the equation in order to determine γ.
4γe2t + 4γte2t − 7(γe2t + 2γte2t ) + 10γte2t = −3e2t ⇔
(4γ − 14γ + 10γ)t + (4γ − 7γ + 3) = 0 ⇔ (18-57)
γ=1

We have now succeeded in finding γ, and therefore a particular solution to the equation is

x0 (t) = te2t , t ∈ R. (18-58)
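
A short symbolic check (sympy assumed, not part of the eNote) confirms that the guess of Method 18.13 handles the resonant right hand side in Example 18.14:

import sympy as sp

t = sp.symbols('t')
x0 = t*sp.exp(2*t)                                  # the guess gamma*t*e^{2t} with gamma = 1
lhs = sp.diff(x0, t, 2) - 7*sp.diff(x0, t) + 10*x0
print(sp.simplify(lhs))                             # -3*exp(2*t), i.e. the right hand side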

18.2.2 Superposition

Within all types of linear equations the concept of superposition exists. We present the
concept here for second-order linear differential equations with constant coefficients.
Superposition is here used in order to determine a particular solution to the inhomo-
geneous equation, when the right hand side q(t) is a sum of several types of functions,
e.g. a sine function added to a polynomial.

Theorem 18.15 Superposition


Let q1 , q2 , . . . , qn be continuous functions on an interval I. If x0i (t) is a particular
solution to the inhomogeneous differential equation

x 00 (t) + a1 x 0 (t) + a0 x (t) = qi (t) (18-59)

for every i = 1, . . . , n, then

x0 (t) = x01 (t) + x02 (t) + . . . + x0n (t) (18-60)

is a particular solution to

x 00 (t) + a1 x 0 (t) + a0 x (t) = q(t) = q1 (t) + q2 (t) + . . . + qn (t), (18-61)

Proof

Superposition is a consequence of the differential equation being linear. We will here give a
general proof for all types of linear differential equations.

The left hand side of a differential equation is called f ( x (t)). We now posit n differential

equations:

f ( x01 (t)) = q1 (t), f ( x02 (t)) = q2 (t), ..., f ( x0n (t)) = qn (t) (18-62)

where x01 , x02 , . . . , x0n are particular solutions to the respective inhomogeneous differential
equations. Define x0 = x01 + x02 + . . . + x0n and substitute this into the left hand side:

f ( x0 (t)) = f ( x01 (t) + x02 (t) + . . . + x0n (t))


= f ( x01 (t)) + f ( x02 (t)) + . . . + f ( x0n (t)) (18-63)
= q1 ( t ) + q2 ( t ) + . . . + q n ( t )

On the right hand side we get the sum of the functions q1 , q2 , . . . , qn , which sum we call q.
The Theorem is thus proven.

Example 18.16 Superposition

Given the inhomogeneous differential equation


x 00 (t) − x 0 (t) − 3x (t) = 9e4t + 3t − 14, t ∈ R. (18-64)
We wish to determine a particular solution x0 (t). It is seen that the right hand side is a
combination of an exponential function (q1 (t) = 9e4t ) and a polynomial (q2 (t) = 3t − 14).
Therefore we use superposition (Theorem 18.15) and the equation is split into two parts.
x 00 (t) − x 0 (t) − 3x (t) = 9e4t = q1 (t) (18-65)
x 00 (t) − x 0 (t) − 3x (t) = 3t − 14 = q2 (t) (18-66)
First we treat (18-65), for which we use Method 18.11. A particular solution then has the form
x01 (t) = γeαt = γe4t . We have x01'(t) = 4γe4t and x01''(t) = 16γe4t . This is inserted into the
equation.
16γe4t − 4γe4t − 3γe4t = 9e4t ⇔ γ = 1 (18-67)
Therefore x01 (t) = e4t .

Now we treat Equation (18-66), where a particular solution is a polynomial of at most
first degree, cf. Method 18.6, thus x02 (t) = b1 t + b0 . Hence x02'(t) = b1 and x02''(t) = 0. This is
substituted into the differential equation.
0 − b1 − 3(b1 t + b0 ) = 3t − 14 ⇔ (−3b1 − 3)t + (−b1 − 3b0 + 14) = 0 (18-68)
Thus we have two equations in two unknowns, and we find that b1 = −1, and therefore that
b0 = 5. Thus a particular solution is x02 (t) = −t + 5. The general solution to (18-64) is then
found as the sum of the already found particular solutions to the two split equations:
x0 (t) = x01 (t) + x02 (t) = e4t − t + 5, t ∈ R. (18-69)
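
That the superposed function x0(t) really solves the combined equation (18-64) can be verified directly; a small sympy sketch (assumed available, not part of the eNote) reads:

import sympy as sp

t = sp.symbols('t')
x0 = sp.exp(4*t) - t + 5
lhs = sp.diff(x0, t, 2) - sp.diff(x0, t) - 3*x0
print(sp.expand(lhs))       # equals 9*exp(4*t) + 3*t - 14, cf. (18-64)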

18.2.3 The Complex Guess Method

The complex guess method is used when it is easy to rewrite the right hand side of the
differential equation as a complex expression, such that the given real right hand side is
the real part of the complex.

If e.g. the original right hand side is 2e2t cos(3t) , adding i (−2e2t sin(3t)) , we get

2e2t (cos(3t) − i sin(3t)) = 2e(2−3i)t . (18-70)

Here it is evident that Re(2e(2−3i)t ) = 2e2t cos(3t) . One then finds a complex particular
solution with complex right hand side. The wanted real particular solution to the origi-
nal equation is then the real part of the found complex solution.

Note that this method can be used because the equation is linear. It is exactly the lin-
earity that secures that the real part of the complex solution found is the wanted real
solution. This is shown by interpreting the left hand side of the equation as linear map
f (z(t)) in the set of complex functions of one real variable and using the following gen-
eral theorem:

Theorem 18.17
Given a linear map f : (C ∞ (R), C) → (C ∞ (R), C) and the equation

f (z(t)) = s(t) .                                            (18-71)

If we state z(t) and s(t) in rectangular form as z(t) = x (t) + i · y(t) and s(t) =
q(t) + i · r (t) , then (18-71) is true if and only if

f ( x (t)) = q(t) and f (y(t)) = r (t) . (18-72)



Proof

Let the linear map f and the functions z(t) and s(t) be given
as in Theorem 18.17. As a consequence of the properties of a linear map, cf. Definition ??, the
following applies:

f (z(t)) = s(t) ⇔
f ( x (t) + i · y(t)) = q(t) + i · r (t) ⇔
(18-73)
f ( x (t)) + i · f (y(t)) = q(t) + i · r (t) ⇔
f ( x (t)) = q(t) and f (y(t)) = r (t) .

Thus the theorem is proven.

Method 18.18 The Complex Guess Method


A particular solution x0 (t) to the real inhomogeneous differential equation

x 00 (t) + a1 x 0 (t) + a0 x (t) = q(t), t ∈ R, (18-74)

where a0 and a1 are real coefficients and


 
q(t) = Re( ( a + bi )e(α+ωi)t ) = aeαt cos(ωt) − beαt sin(ωt),    (18-75)

is initially determined by the corresponding complex particular solution to the fol-


lowing complex equation

z00 (t) + a1 z0 (t) + a0 z(t) = ( a + bi )e(α+ωi)t , t ∈ R, (18-76)

The complex particular solution has the form z0 (t) = (c + di )e(α+ωi)t , where c and d
are determined by substitution of z0 (t) into Equation (18-76).

Then a particular solution to equation (18-74) is given by

x0 (t) = Re(z0 (t)) . (18-77)



A decisive reason for using the complex guess method is that it is so easy to
determine the derivative of the exponential function, even when it is complex.

Example 18.19 The Complex Guess Method

Given a second-order inhomogeneous differential equation:

x 00 (t) − 2x 0 (t) − 2x (t) = 19e4t cos(t) − 35e4t sin(t), t ∈ R. (18-78)

We wish to determine a particular solution. It is evident that we can use the complex guess
method in Method 18.18. Initially the following is true for the right hand side:
 
q(t) = 19e4t cos(t) − 35e4t sin(t) = Re( (19 + 35i )e(4+i)t ).    (18-79)

We shall now instead of the original problem find a complex particular solution to the differ-
ential equation
z00 (t) − 2z0 (t) − 2z(t) = (19 + 35i )e(4+i)t , t ∈ R. (18-80)
by guessing that z0 (t) = (c + di )e(4+i)t is a solution. We also have

z0'(t) = (c + di )(4 + i )e(4+i)t = (4c − d + (c + 4d)i ) e(4+i)t   and
z0''(t) = (4c − d + (c + 4d)i )(4 + i )e(4+i)t = (15c − 8d + (8c + 15d)i )e(4+i)t    (18-81)

These expressions are substituted into the complex equation in order to determine c and d:

(15c − 8d + (8c + 15d)i )e(4+i)t − 2(4c − d + (c + 4d)i )e(4+i)t − 2(c + di )e(4+i)t = (19 + 35i )e(4+i)t ⇔
15c − 8d + (8c + 15d)i − 2(4c − d + (c + 4d)i ) − 2(c + di ) = 19 + 35i ⇔    (18-82)
5c − 6d + (6c + 5d)i = 19 + 35i ⇔
5c − 6d = 19  and  6c + 5d = 35

These are two equations in two unknowns. The augmented matrix of the system of equations
is written:
[ 5  −6 | 19 ] → [ 1  −6/5 | 19/5 ] → [ 1  0 | 5 ]           (18-83)
[ 6   5 | 35 ]   [ 0  61/5 | 61/5 ]   [ 0  1 | 1 ]

Thus we have that c = 5 and d = 1, which yields z0 (t) = (5 + i )e(4+i)t . Therefore a particular
solution to the equation (18-78) is

x0 (t) = Re(z0 (t)) = Re( (5 + i )e(4+i)t ) = 5e4t cos(t) − e4t sin(t),   t ∈ R.    (18-84)
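
The complex guess method lends itself well to symbolic computation. The following Python sketch (sympy assumed; not part of the eNote) repeats Example 18.19: it substitutes z0(t) = (c + di)e^{(4+i)t}, solves for c and d, and takes the real part:

import sympy as sp

t = sp.symbols('t', real=True)
c, d = sp.symbols('c d', real=True)

z0 = (c + d*sp.I) * sp.exp((4 + sp.I)*t)
residual = sp.diff(z0, t, 2) - 2*sp.diff(z0, t) - 2*z0 - (19 + 35*sp.I)*sp.exp((4 + sp.I)*t)

# Divide out the common exponential factor and split into real and imaginary parts.
eq = sp.simplify(residual / sp.exp((4 + sp.I)*t))
sol = sp.solve([sp.re(eq), sp.im(eq)], [c, d])
print(sol)                                          # {c: 5, d: 1}

# The real particular solution is the real part of the complex one.
z_part = (sol[c] + sol[d]*sp.I) * sp.exp((4 + sp.I)*t)
print(sp.re(sp.expand(z_part, complex=True)))       # 5*exp(4*t)*cos(t) - exp(4*t)*sin(t)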

18.3 Existence and Uniqueness

Here we formulate a theorem about existence and uniqueness for differential equations
of the second order with constant coefficients. We need two initial value conditions: The
value of the function and its first derivative at the chosen initial point.

Theorem 18.20 Existence and Uniqueness


For every 3-tuple (t0 , x0 , v0 ) (double initial value condition), there exists exactly one
solution x (t) to the differential equation

x 00 (t) + a1 x 0 (t) + a0 x (t) = q(t), t ∈ I, q : I → R, (18-85)

such that
x ( t0 ) = x0 and x 0 ( t0 ) = v0 , (18-86)
where t0 ∈ I, x0 ∈ R and v0 ∈ R.

Example 18.21 Existence and Uniqueness

Given the differential equation


x 00 (t) − 5x 0 (t) − 36x (t) = 0, t ∈ R. (18-87)
It is seen that the equation is homogeneous. It has the characteristic equation
λ2 − 5λ − 36 = 0. (18-88)
We wish to determine a function x (t) that is a solution to the differential equation and has
the initial value condition (t0 , x0 , v0 ) = (0, 5, 6). The characteristic equation has the roots
λ1 = −4 and λ2 = 9, since −4 · 9 = −36 and −(9 + (−4)) = −5 are the coefficients of
the equation. Therefore the general solution for the homogeneous equation (using Theorem
18.2) is spanned by the following functions for all c1 , c2 ∈ R:
x (t) = c1 e−4t + c2 e9t , t ∈ R. (18-89)
One then has
x 0 (t) = −4c1 e−4t + 9c2 e9t (18-90)
If the initial value condition (x (0) = 5 and x 0 (0) = 6) is substituted into the two equations,
one can solve for (c1 , c2 ):

5 =   c1 + c2                                                (18-91)
6 = −4c1 + 9c2

since e0 = 1. If c2 = 5 − c1 is substituted into the second equation one gets

6 = −4c1 + 9(5 − c1 ) = −13c1 + 45 ⇔ c1 = (6 − 45)/(−13) = 3    (18-92)
Therefore c2 = 5 − 3 = 2 and the conditional solution is

x (t) = 3e−4t + 2e9t , t∈R (18-93)

Note that one can determine a unique conditional solution to a homogeneous
differential equation, as in this case. The right hand side need not be different from
zero. The general solution for the equation is Linhom = Lhom , since x0 (t) = 0.

Below is an example going through the whole solution procedure for an inhomogeneous
equation with a double initial value condition. After that an example is presented where
the purpose is to find the differential equation given the general solution. It is analogous
to example 18.5, but now we have a right hand side different from zero.

Example 18.22 Accumulated Example

Given the differential equation


x 00 (t) + 6x 0 (t) + 5x (t) = 20t2 + 48t + 13, t ∈ R. (18-94)
We determine the general solution Linhom . Then the conditional solution x (t) that satisfies the
initial value condition (t0 , x0 , v0 ) = (0, 5, −8) , will be determined.

First we solve the corresponding homogeneous equation, and the characteristic equation
looks like this:
λ2 + 6λ + 5 = 0 (18-95)
This has the roots λ1 = −5 and λ2 = −1, since (λ + 5)(λ + 1) = λ2 + 6λ + 5. Because these
roots are real and different, cf. Theorem 18.2, the general homogeneous solution set is given
by
Lhom = { c1 e−5t + c2 e−t , t ∈ R | c1 , c2 ∈ R }.           (18-96)
Now we determine a particular solution to the inhomogeneous equation. Since the right hand
side is a second degree polynomial we guess that x0 (t) = b2 t2 + b1 t + b0 , using Method 18.6.
We then have that x00 (t) = 2b2 t + b1 and x000 (t) = 2b2 . This is substituted into the differential
equation.
2b2 + 6(2b2 t + b1 ) + 5(b2 t2 + b1 t + b0 ) = 20t2 + 48t + 13 ⇔
(5b2 − 20)t2 + (12b2 + 5b1 − 48)t + (2b2 + 6b1 + 5b0 − 13) = 0 ⇔ (18-97)
5b2 − 20 = 0  and  12b2 + 5b1 − 48 = 0  and  2b2 + 6b1 + 5b0 − 13 = 0.

The first equation easily yields b2 = 4. If this is substituted into the second equation we get
b1 = 0. Finally in the third equation we get b0 = 1. A particular solution to the inhomoge-
neous equation is therefore
x0 (t) = 4t2 + 1, t ∈ R. (18-98)
Following the structural theorem, e.g. Method 18.1, the general solution to the inhomoge-
neous equation is given by

Linhom = x0 (t) + Lhom = { 4t2 + 1 + c1 e−5t + c2 e−t , t ∈ R | c1 , c2 ∈ R }    (18-99)

We now determine the solution that satisfies the given initial value conditions. An arbitrary
solution has the form

x (t) = 4t2 + 1 + c1 e−5t + c2 e−t , t ∈ R. (18-100)

We now determine the derivative

x 0 (t) = 8t − 5c1 e−5t − c2 e−t , t ∈ R. (18-101)

If x (0) = 5 and x 0 (0) = −8 are substituted we get two equations

 5 =  c1 + c2 + 1                                            (18-102)
−8 = −5c1 − c2

Substituting c1 = 4 − c2 from the first equation into the second we get

−8 = −5(4 − c2 ) − c2 ⇔ −8 + 20 = 4c2 ⇔ c2 = 3. (18-103)

This yields c1 = 1 and therefore the conditional solution is

x (t) = e−5t + 3e−t + 4t2 + 1, t ∈ R. (18-104)
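
The whole procedure of Example 18.22 can be cross-checked with sympy's general-purpose ODE solver (assumed available; not part of the eNote); dsolve accepts the double initial value condition directly:

import sympy as sp

t = sp.symbols('t')
x = sp.Function('x')

ode = sp.Eq(x(t).diff(t, 2) + 6*x(t).diff(t) + 5*x(t), 20*t**2 + 48*t + 13)
sol = sp.dsolve(ode, x(t), ics={x(0): 5, x(t).diff(t).subs(t, 0): -8})
print(sp.expand(sol.rhs))   # 4*t**2 + 3*exp(-t) + exp(-5*t) + 1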

Example 18.23 From the Solution to the Equation

Given the general solution to a linear second-order differential equation with constant coef-
ficients:
Linhom = { c1 e−2t + c2 e2t − (1/2) sin(2t) , t ∈ R | c1 , c2 ∈ R }    (18-105)
It is now the aim to find the differential equation, which in general looks like this:

x 00 (t) + a1 x 0 (t) + a0 x (t) = q(t) (18-106)

Thus we have to determine a1 , a0 and q(t).



First we split the solution into a particular solution and the general homogeneous solution
set:

x0 (t) = −(1/2) sin(2t), t ∈ R   and   Lhom = { c1 e−2t + c2 e2t , t ∈ R | c1 , c2 ∈ R }    (18-107)

Now we consider the general homogeneous solution. Its form matches the first
case in Theorem 18.2. The characteristic equation has two real roots and they are λ1 = −2
and λ2 = 2. Therefore the characteristic equation is

(λ + 2)(λ − 2) = λ2 − 4 = 0 (18-108)

This determines the coefficients on the left hand side of the differential equation: a1 = 0 and
a0 = −4. The differential equation so far looks like this:

x 00 (t) − 4x (t) = q(t), t ∈ R. (18-109)

Since x0 (t) is a particular solution to the inhomogeneous equation, the right hand side q(t)
can be determined by substituting x0 (t). We have that x0''(t) = 2 sin(2t).

x0''(t) − 4x0 (t) = q(t) ⇔
2 sin(2t) − 4(−(1/2) sin(2t)) = q(t) ⇔                       (18-110)
4 sin(2t) = q(t)

Now all unknowns in the differential equation are determined:

x 00 (t) − 4x (t) = 4 sin(2t), t ∈ R. (18-111)

In these eNotes we do not consider systems of second-order homogeneous


linear differential equations with constant coefficients. We should mention,
however, that with the presented theory and a bit of cleverness we can solve
such problems. If we have a system of second-order homogeneous differential
equations then we can consider each equation individually. By use of Section
17.4 such an equation can be rewritten as two equations of first order. If this is
done with all the equations in the system, we end up with double the number
of equations, but these are now first-order equations. We can solve this new
system with the theory presented in eNote 16. Systems of second-order ho-
mogeneous linear differential equations are seen in many places in mechanical
physics, chemistry, electro-magnetism etc.

18.4 Summary

In this note linear second-order differential equations with constant coefficients are writ-
ten as:
x 00 (t) + a1 x 0 (t) + a0 x (t) = q(t) (18-112)

• This equation is solved by first determining the general solution to the correspond-
ing homogeneous equation and then adding this to a particular solution to the
inhomogeneous equation, see Method 18.1.

• The general solution to the corresponding homogeneous differential equation is


determined by finding the roots of the characteristic equation:
λ2 + a1 λ + a0 = 0. (18-113)
There are in principle three cases, see Theorem 18.2.

• A particular solution is determined by “guessing” a solution that has the same


appearance as the right hand side q(t). If e.g. q(t) is a polynomial then x0 (t) is
also a polynomial, of the same degree when a0 ≠ 0. In the note many examples are
given, see Section 18.2.

• In particular we have the complex guess method for the determination of the partic-
ular solution x0 (t). The complex guess method can be used when the right hand
side has this appearance:
 
q(t) = Re( ( a + bi )e(α+ωi)t ) = aeαt cos(ωt) − beαt sin(ωt).    (18-114)

The solution is then determined by rewriting the differential equation in the cor-
responding exponential form, see Method 18.18.

• Furthermore superposition is introduced. Superposition is a general principle that


applies to all types of linear equations. The idea is that two particular solutions can
be added. When they are substituted into the differential equation they will not
influence each other, and hence the right hand side can also be split into two terms,
each corresponding to one of the two solutions. This can be used to determine a
particular solution, when the right hand side is the sum of e.g. a sine function and
a polynomial. See e.g. Example 18.16.

• Furthermore an existence and uniqueness theorem is formulated, see Theorem 18.20.


According to this theorem a unique solution to a second-order differential equation
satisfying two given initial value conditions can be determined.

Index

A−1 , inverse matrix, 174 derivative, 78


E, identity matrix, 173 determinant, 184
det(A), 184 diagonal in a matrix, 131, 148
0-row, 145 diagonal matrix, 172, 190
diagonalization method, 429, 430
diagonal of matrix, 138 diagonalization of symmetric matrices, 380
Gauss-Jordan elimination , 133 diagonalized, 355
row operations , 133 direction vector, 209
A> , 169 direction vectors, 219
adjoint matrix, 197 domain, 71, 284
algebraic multiplicity, 346 eigenspace, 337
angle between two vectors a and b, 368 eigenvalue, 321, 325
approximating polynomial, 102 eigenvector, 320, 325
area funktions, 91 eigenvector space, 337
arrow, 204 element in a matrix, 131
basis, 220, 261 epsilon function, 75
basis vector, 220 epsilon-functions, 71
bi-diagonal-matrix, 190 Euclidian vector space (Rn , ·), 364
expansion along the r-th row, 187
change of basis matrix, 276
characteristic equation, 338, 341, 446 free parametre, 152
characteristic matrix, 201 Gauss-Jordan elimination, 142
characteristic polynomial, 201 general solution, 128, 130
codomain, 284 geometric multiplicity, 346
column vector, 126 geometric vector, 204
concept of continuity, 76 global maximum point, 113
coordinate matrix, 237, 270 global maximum value, 113
Coordinate vector, 224 global minimum point, 113
coordinate vector, 221 global minimum value, 113
coupled differential equations, 427
curly bracket, 74 hat vector, 243

hyperbolic cosine, 87 ordered set, 220


hyperbolic sine, 87 ordinary basis, 221, 223
orthogonal complement, 377
identity matrix, 173
orthogonal matrix, 371
identity theorem for polynomials, 265
orthogonal projection, 241
image, 292
orthogonal vectors, 368
inconsistent equation, 128
orthonormal basis for (Rn , ·), 370
indefinite, 397
infinite-dimensional vector spaces, 282 parallel displacement, 204
initial conditions, 439 parametic representation, 209
intersection, 131, 153 parametric representation, 218
inverse function, 84 particular solution, 130, 157
inverse matrix, 174 perpendicular, 368
invertible matrix, 174 pivot, 138
position vector, 205
kernel, 292
positive definite, 397
leading 1, 138 positive orthogonal matrix, 372
length of a vector a, 365 positive semi-definite, 397
linear, 289 powers of a square matrix, 180
linear combination, 156, 213, 255 proper local maximum value, 116
linear dependence, 215 proper local minimum value, 116
linear equation, 127 proper vector, 365
linear independence, 215
quadratic form, 392
linearly dependent, 257
linearly independent, 257 range, 71, 284, 292
local maximum point, 116 rank, 145
local maximum value, 116 reduced row echelon form, 138, 139
local minimum point, 116 regular tetrahedron, 247
local minimum value, 116 remainder function, 99
rotation matrix, 389
main diagonal, 172
row operations, 133
mapping matrix, 300
row vector, 126
matrix, 131, 159
rule of conversion for equations, 129
matrix product, 166
mean value theorem, 101 second-order differential equations
mirroring the graph, 85 existence and uniqueness, 461
monomial basis, 265, 273 singular matrix, 174
negative definite, 397 smooth function, 97
negative orthogonal matrix, 372 solution set, 128, 130
negative semi-definite, 397 span, 225, 256
norm of a vector a, 365 special orthorgonal, 372

spectral theorem for symmetric matrices,


380
squaree matrices, 184
stability, 251
standard basis, 221, 223, 264, 265
standard coordinate system in 3-space, 223
standard coordinate sytem, 221
standard parameter form, 129
stationary point, 113
subspace, 277
superposition, 456
symmetric matrix, 172, 368
system matrix, 428
system of differential equations, 427
system of differential equations
existence and uniqueness, 439
system of linear equations, 130

the characteristic matrix, 338


the characteristic polynomial, 338
the complex guess method, 458
the inner product, 364
the scalar product, 364
the tangent to the graph, 80
transpose matrix, 169
trivial equation, 128, 144

unit vector, 205

vector, 251
vector space, 251
vector space over the complex numbers, 251
vector space over the real numbersl, 251
vector space with inner product, 364

zero vector, 205


zero-extension of a function, 74
