CUST 2023-2024

Maths for Physics 2

Session 7

Square matrices
Summary

Here we study a few important aspects of square matrices. First, we discuss the
notions of eigenvalues, eigenvectors and diagonalization of a square matrix. We also
discuss a useful way of writing a square matrix, namely in the form of a so-called
LU decomposition.

Contents

1 Eigenvalues and eigenvectors

2 Diagonalization

3 The LU decomposition

Chapter 1

Eigenvalues and eigenvectors

Let M be a real or complex n × n square matrix, with n an integer, n ⩾ 2. Let λ be a scalar (i.e. a real or a complex number), and v be a nonzero vector (to be more
precise, a column vector here and in the sequel). Then λ is said to be an eigenvalue
of M , and v is said to be an eigenvector of M corresponding to, or associated with,
the eigenvalue λ, if we have

M v = λv , (1.1)

that is in matrix form


    
\begin{pmatrix} M_{11} & \cdots & M_{1n} \\ \vdots & \ddots & \vdots \\ M_{n1} & \cdots & M_{nn} \end{pmatrix} \begin{pmatrix} v_1 \\ \vdots \\ v_n \end{pmatrix} = \begin{pmatrix} \lambda v_1 \\ \vdots \\ \lambda v_n \end{pmatrix} .   (1.2)

Note that the zero vector 0 is always a trivial solution of the eigenvalue equa-
tion (1.1). For this reason, the zero vector 0 is never considered to be an eigenvector
of M : an eigenvector of M is a nontrivial (i.e. nonzero) solution of (1.1).
Let’s call I the corresponding identity matrix, i.e. the n × n matrix with only 1
on the diagonal and 0 elsewhere, that is
 
I = \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{pmatrix} .   (1.3)

Note that we can thus write (1.1) as

(M − λI) v = 0 . (1.4)


Now, let’s imagine that the matrix M −λI is invertible: we would then get from (1.4),
upon multiplying both sides by (M − λI)−1 on the left, that

v = (M − λI)−1 0 = 0 ,

which contradicts our assumption of v being nonzero. Therefore, this shows that
the matrix M − λI can actually not be invertible, and must thus be singular, so
that its determinant must be zero, that is

det (M − λI) = 0 . (1.5)

This equation is referred to as the characteristic equation of the matrix M . By definition, det (M − λI) is an n-th order polynomial in λ, whose coefficients are
expressed in terms of the matrix coefficients Mij of M : the polynomial det (M − λI)
is then referred to as the characteristic polynomial of M . Since the latter is an n-th
order polynomial, the fundamental theorem of algebra ensures that det (M − λI)
has n (not necessarily distinct) roots λ1 , . . . , λn , so that¹

det (M − λI) = (λ − λ1 ) · · · (λ − λn ) . (1.6)

In view of (1.5), these roots λ1 , . . . , λn of the characteristic polynomial hence precisely correspond to the eigenvalues of M . The set of numbers

S = {λ1 , . . . , λn } , (1.7)

formed by the eigenvalues of M , is called the spectrum of M .
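
For instance, in the 2 × 2 case the characteristic polynomial can be written out explicitly. The following short computation (a worked illustration using only the definitions above, not spelled out in this form in the notes) shows that the eigenvalues are the roots of a simple quadratic:

\det(M - \lambda I)
  = \det \begin{pmatrix} M_{11} - \lambda & M_{12} \\ M_{21} & M_{22} - \lambda \end{pmatrix}
  = (M_{11} - \lambda)(M_{22} - \lambda) - M_{12} M_{21}
  = \lambda^2 - (M_{11} + M_{22}) \lambda + (M_{11} M_{22} - M_{12} M_{21})
  = \lambda^2 - (\operatorname{tr} M) \lambda + \det M .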

REMARK: the eigenvalue equation (1.1) actually only defines a family of eigen-
vectors αv, with α any nonzero scalar (real or complex): indeed, if v is a solution
of (1.1), so is αv. Therefore, very often, within this whole family of eigenvectors
we’ll pick the one that is normalized, i.e. the one that satisfies
 
v^T v = \begin{pmatrix} v_1 & \cdots & v_n \end{pmatrix} \begin{pmatrix} v_1 \\ \vdots \\ v_n \end{pmatrix} = v_1^2 + \ldots + v_n^2 = 1 .   (1.8)

(For a complex eigenvector, v^T is replaced by the conjugate transpose v^\dagger, and the squares by the squared moduli |v_j|^2.)
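
As a practical aside, numerical routines return exactly such normalized eigenvectors. Here is a minimal sketch, assuming the NumPy library is available (the matrix below is just an illustrative choice):

import numpy as np

# Minimal sketch: compute eigenvalues and (unit-norm) eigenvectors numerically.
M = np.array([[4.0, 1.0],
              [2.0, 3.0]])                      # illustrative 2 x 2 matrix

eigenvalues, eigenvectors = np.linalg.eig(M)    # columns of `eigenvectors` are the v's
print(eigenvalues)                              # spectrum of M: {5, 2}
print(np.linalg.norm(eigenvectors[:, 0]))       # ~1.0: columns come back normalized

v = eigenvectors[:, 0]
print(np.allclose(M @ v, eigenvalues[0] * v))   # True: M v = lambda v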

¹ Since we'll only be interested in the roots λ1 , . . . , λn , the actual value of the coefficient of the highest-order term λ^n is irrelevant, so we fix it to be 1.

REMARK: it may occur that some of the roots λj in (1.6) are actually equal. In this case, we say that this eigenvalue is degenerate, and the number of λj in (1.6) that are equal to this eigenvalue gives the degeneracy of this eigenvalue. Regarding the corresponding eigenvectors, we then keep the ones that are linearly independent:
actually, it often proves to be convenient to keep the ones that are orthogonal. It
may however be the case that an eigenvalue that has degeneracy g ⩾ 2 has less than
g linearly independent eigenvectors. We recall that two vectors v and w are said to
be linearly independent if we have the equivalence, for two scalars α and β,

αv + βw = 0 ⇐⇒ α = β = 0 . (1.9)

EXAMPLE: let’s consider the matrix


M = \begin{pmatrix} 2 & 1 \\ 0 & 2 \end{pmatrix} .   (1.10)

It admits the single eigenvalue λ = 2 that has degeneracy g = 2. The eigenvalue equation M v = 2v then yields eigenvectors v that are of the form
v = \begin{pmatrix} x \\ 0 \end{pmatrix} , \quad x ∈ R .   (1.11)

Note that x must be nonzero in (1.11), otherwise v would be the zero vector. But
of course, choosing two different values of x in (1.11), say x and x′ , does not yield
two linearly independent vectors v and v ′ . Therefore, (1.10) is an example of a
2 × 2 matrix that has an eigenvalue of degeneracy 2 but only a single corresponding
eigenvector.
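
This can also be checked numerically. Here is a small sketch, assuming NumPy (the rank test is just one convenient way to check independence):

import numpy as np

# The defective matrix of the example: eigenvalue 2 with degeneracy 2,
# but only one linearly independent eigenvector.
M = np.array([[2.0, 1.0],
              [0.0, 2.0]])

eigenvalues, eigenvectors = np.linalg.eig(M)
print(eigenvalues)                          # [2. 2.]: the degenerate eigenvalue
print(np.linalg.matrix_rank(eigenvectors))  # typically 1: the returned columns are (numerically) parallel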
Chapter 2

Diagonalization

Here we discuss an important notion, deeply connected to eigenvalues and eigenvectors: the notion of diagonalization of a square matrix.
To introduce this notion, we consider an n × n matrix M , with eigenvalues
λ1 , . . . , λn (not necessarily distinct) and the corresponding eigenvectors V1 , . . . , Vn ,
that is

M Vj = λj Vj , ∀j = 1, . . . , n . (2.1)

We then construct two other n × n matrices, denoted D and V . First, D is the diagonal matrix whose diagonal elements are the n eigenvalues λj (which, again,
are not necessarily distinct) of M , that is
 
D = \begin{pmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{pmatrix} .   (2.2)

Then, the matrix V is defined so that its columns are the eigenvectors Vj of M ,
which we can compactly write as
 
V = \left( V_1 , \, \cdots , \, V_n \right) .   (2.3)

In other words, if we write the eigenvector Vj as the column vector


 
V_j = \begin{pmatrix} V_{1j} \\ \vdots \\ V_{nj} \end{pmatrix} ,   (2.4)


we have
 
V = \begin{pmatrix} V_{11} & \cdots & V_{1n} \\ \vdots & \ddots & \vdots \\ V_{n1} & \cdots & V_{nn} \end{pmatrix} .   (2.5)

Let’s now compute the product M V : from the definition (2.3) of V we can write,
in a compact form,
 
M V = \left( M V_1 , \, \cdots , \, M V_n \right) ,   (2.6)

and thus in view of (2.1)


 
M V = \left( \lambda_1 V_1 , \, \cdots , \, \lambda_n V_n \right) .   (2.7)

Then we compute the product V D. Computing it explicitly from (2.2)-(2.5), we also get
 
V D = \left( \lambda_1 V_1 , \, \cdots , \, \lambda_n V_n \right) .   (2.8)

Therefore, we readily see upon comparing (2.7) and (2.8) that we have the matrix
equality

MV = V D . (2.9)

Now, let’s suppose that the matrix V is invertible: we can thus multiply (2.9)
by V −1 on the left to get

V −1 M V = D . (2.10)

Since the matrix D is by definition diagonal, the result (2.10) is at the heart of the
notion of diagonalization, which is defined as follows:

An n × n matrix M is said to be diagonalizable if there exists an n × n invertible matrix V such that the matrix

V −1 M V = D (2.11)

is diagonal. The matrix V is then said to diagonalize M .

In view of our construction above [see equations (2.1)-(2.10)] we hence see that
the matrix V that diagonalizes M (if it is invertible) is such that its columns are
the eigenvectors of M . The resulting diagonal matrix has thus the eigenvalues of
M as its diagonal elements. We can actually show (which we won’t do here) that
such a matrix V is invertible if and only if the n eigenvectors V1 , . . . , Vn of M are
linearly independent¹. That is, M is diagonalizable if and only if it has n linearly
independent eigenvectors.
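
As a quick numerical check, we can build V from the eigenvectors and verify (2.11) directly. This is a minimal sketch assuming NumPy (the matrix is again just an illustrative choice):

import numpy as np

M = np.array([[4.0, 1.0],
              [2.0, 3.0]])

eigenvalues, V = np.linalg.eig(M)     # columns of V are the eigenvectors V_j
D = np.diag(eigenvalues)              # diagonal matrix of the eigenvalues

print(np.allclose(np.linalg.inv(V) @ M @ V, D))   # True: V^{-1} M V = D
print(np.allclose(M, V @ D @ np.linalg.inv(V)))   # True: equivalently, M = V D V^{-1}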
Diagonalization is very useful in practice. For instance, it allows us to compute the power of an arbitrary matrix in a quite simple way. Indeed, let's rewrite (2.11) so
as to express M : multiplying (2.11) on the left by V and on the right by V −1 yields

M = V DV −1 . (2.12)

Now, let’s first compute M 2 : we have in view of (2.12)

M 2 = V D2 V −1 . (2.13)

But since D is by construction diagonal, it's immediate to compute its square: D^2 is merely the diagonal matrix formed by the squares of the diagonal elements of D,
i.e. in view of (2.2)
 
D^2 = \begin{pmatrix} \lambda_1^2 & 0 & \cdots & 0 \\ 0 & \lambda_2^2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n^2 \end{pmatrix} .

This can then be repeated for an arbitrary power k: computing M^k , k ∈ N, yields in view of (2.12)

M k = V Dk V −1 , (2.14)

with in view of (2.2)


 
D^k = \begin{pmatrix} \lambda_1^k & 0 & \cdots & 0 \\ 0 & \lambda_2^k & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n^k \end{pmatrix} .

The expression (2.14) has a clear practical advantage regarding, for instance, the computational cost: to compute M^k by brute force requires k − 1 matrix multiplications, while to compute V D^k V^{−1} requires only two matrix multiplications, independently of the value of k.

¹ Which, we emphasize, does not require the n eigenvalues λ1 , . . . , λn to be distinct: it may still be the case that some eigenvalues of M are degenerate, but the corresponding eigenvectors are linearly independent.
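
Here is a short numerical sketch of (2.14), assuming NumPy (the matrix and the power k are illustrative choices):

import numpy as np

M = np.array([[4.0, 1.0],
              [2.0, 3.0]])
k = 10

eigenvalues, V = np.linalg.eig(M)
Dk = np.diag(eigenvalues ** k)        # D^k: just the k-th powers of the eigenvalues
Mk = V @ Dk @ np.linalg.inv(V)        # M^k = V D^k V^{-1}

print(np.allclose(Mk, np.linalg.matrix_power(M, k)))   # True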


This is especially important in view of computing functions of matrices. Indeed,
if f (x) is a function of a real variable x, we can define the matrix f (M ) that results
from replacing x by the matrix M in the Taylor series of f (x) around 0. That is,
writing the latter as

f(x) = \sum_{k=0}^{\infty} \frac{f^{(k)}(0)}{k!} \, x^k ,

and replacing x by the matrix M , we define a matrix that we denote by f (M ) and that we call the function f of M , namely

f(M) = \sum_{k=0}^{\infty} \frac{f^{(k)}(0)}{k!} \, M^k .   (2.15)

For instance, we hence have the matrix exponential e^M , which is then defined through
the series

e^M = \sum_{k=0}^{\infty} \frac{1}{k!} \, M^k .   (2.16)
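
Combining (2.14) with the series (2.16), a diagonalizable M satisfies e^M = V e^D V^{-1}, where e^D is simply the diagonal matrix with entries e^{λ_j}. A minimal numerical sketch, assuming NumPy and SciPy (scipy.linalg.expm is SciPy's general-purpose matrix exponential):

import numpy as np
from scipy.linalg import expm

M = np.array([[4.0, 1.0],
              [2.0, 3.0]])

eigenvalues, V = np.linalg.eig(M)
expM = V @ np.diag(np.exp(eigenvalues)) @ np.linalg.inv(V)   # e^M = V e^D V^{-1}

print(np.allclose(expM, expm(M)))   # True: matches SciPy's matrix exponential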
Chapter 3

The LU decomposition

Let’s first discuss the general idea that underlies the notion of LU decomposition.
Let’s assume that we want to solve a system of n ⩾ 2 linear algebraic equations of
the form
A11 x1 + . . . + A1n xn = B1 , (3.1a)
⋮

An1 x1 + . . . + Ann xn = Bn , (3.1b)

where the coefficients Aij are supposed to be known, as well as the numbers Bj ,
and the numbers xj are the unknowns. We can combine these n equations in the
single matrix equation

Ax = B , (3.2)

where A is the n × n matrix


 
A = \begin{pmatrix} A_{11} & \cdots & A_{1n} \\ \vdots & \ddots & \vdots \\ A_{n1} & \cdots & A_{nn} \end{pmatrix} ,   (3.3)

while x and B are the column vectors


   
x = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix} \quad \text{and} \quad B = \begin{pmatrix} B_1 \\ \vdots \\ B_n \end{pmatrix} .   (3.4)

The so-called LU decomposition of the matrix A is then defined as follows:


Suppose that we can find two n × n matrices L and U such that L is lower triangular (i.e., all the elements of L above the diagonal are zero) and U is upper triangular (i.e., all the elements of U below the diagonal are zero), namely
   
L = \begin{pmatrix} L_{11} & 0 & \cdots & 0 \\ L_{21} & L_{22} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ L_{n1} & L_{n2} & \cdots & L_{nn} \end{pmatrix} \quad \text{and} \quad U = \begin{pmatrix} U_{11} & U_{12} & \cdots & U_{1n} \\ 0 & U_{22} & \cdots & U_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & U_{nn} \end{pmatrix} ,   (3.5)

such that A can be written as the product

A = LU . (3.6)

The expression (3.6) of A is then called the LU decomposition of A.

The advantage of the LU decomposition of A arises from the triangular nature of the two matrices L and U . To see this, let's substitute (3.6) into (3.2): we have

Ax = LU x = L (U x) = B . (3.7)

This allows us to break the equation Ax = B that we want to solve into two equations (see the short numerical sketch after step ii below):

i) first, we can solve the equation

Ly = B , (3.8)

and thus obtain the corresponding vector y;

ii) and then, once we have y, we can solve the equation

Ux = y , (3.9)

which hence yields the solution x to our original equation Ax = B.
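
Here is a minimal numerical sketch of steps i) and ii), assuming NumPy and SciPy (L, U and B below are a small hand-made example in which no row permutations are needed):

import numpy as np
from scipy.linalg import solve_triangular

L = np.array([[1.0, 0.0],
              [2.0, 1.0]])                 # lower triangular
U = np.array([[2.0, 1.0],
              [0.0, 3.0]])                 # upper triangular
A = L @ U                                  # A = [[2, 1], [4, 5]]
B = np.array([3.0, 11.0])

y = solve_triangular(L, B, lower=True)     # step i):  L y = B  (forward substitution)
x = solve_triangular(U, y, lower=False)    # step ii): U x = y  (back substitution)

print(np.allclose(A @ x, B))               # True: x solves A x = B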

The advantage of separately solving (3.8) and then (3.9) instead of straight away
solving (3.7) is that (3.8) and (3.9) are actually very easy to solve because L and U
are triangular: indeed, (3.8) for instance reads
    
\begin{pmatrix} L_{11} & 0 & \cdots & 0 \\ L_{21} & L_{22} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ L_{n1} & L_{n2} & \cdots & L_{nn} \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix} = \begin{pmatrix} B_1 \\ B_2 \\ \vdots \\ B_n \end{pmatrix} ,

from which we readily see from the first equation that

y_1 = \frac{B_1}{L_{11}} \,.

We can then substitute this expression of y1 into the second equation to immediately
get y2 , and so on. Similarly, (3.9) reads
    
\begin{pmatrix} U_{11} & U_{12} & \cdots & U_{1n} \\ 0 & U_{22} & \cdots & U_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & U_{nn} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix} ,

from which we readily see from the last equation that

x_n = \frac{y_n}{U_{nn}} \,.

We can then substitute this expression of xn into the second-to-last equation to get xn−1 , and so on. Therefore, the linear equations in (3.8) and (3.9) are already ordered in such a way that they can be solved one after the other, by simple forward and back substitution, with very little effort.
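
These two substitutions can be written in a few lines. The following is a sketch with hypothetical helper functions forward_substitution and back_substitution (assuming NumPy), not an optimized library routine:

import numpy as np

def forward_substitution(L, B):
    # Solve L y = B for lower-triangular L, first equation first.
    n = len(B)
    y = np.zeros(n)
    for i in range(n):
        y[i] = (B[i] - L[i, :i] @ y[:i]) / L[i, i]
    return y

def back_substitution(U, y):
    # Solve U x = y for upper-triangular U, last equation first.
    n = len(y)
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        x[i] = (y[i] - U[i, i + 1:] @ x[i + 1:]) / U[i, i]
    return x

L = np.array([[1.0, 0.0], [2.0, 1.0]])
U = np.array([[2.0, 1.0], [0.0, 3.0]])
B = np.array([3.0, 11.0])
x = back_substitution(U, forward_substitution(L, B))
print(np.allclose((L @ U) @ x, B))   # True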
Another advantage of triangular matrices is that it is very easy to compute their
determinant: indeed, the determinant of a triangular matrix is merely the product
of its diagonal elements. We hence have
\det L = \prod_{j=1}^{n} L_{jj} \quad \text{and} \quad \det U = \prod_{j=1}^{n} U_{jj} .   (3.10)

Therefore, if we have the LU decomposition (3.6) of the matrix A, then it’s really
easy to compute the determinant of A, and we have

\det A = \det L \, \det U = \left( \prod_{j=1}^{n} L_{jj} \right) \left( \prod_{j=1}^{n} U_{jj} \right) .   (3.11)

Even more, since in practice it always proves possible to choose¹

L11 = L22 = . . . = Lnn = 1 ,   (3.12)

we then have the even simpler expression of det A:

\det A = \det U = \prod_{j=1}^{n} U_{jj} .   (3.13)

¹ Indeed, note that the equation (3.6) only yields n × n = n² equations for the n² + n unknowns Lij and Uij . In order to get a unique solution, i.e. in order to uniquely determine the matrices L and U , we must thus impose n additional constraints on the unknowns Lij and Uij : (3.12) is just a (convenient) example of such additional constraints.
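
A quick numerical check of (3.11) and (3.13), assuming NumPy (the small L and U below are the same kind of hand-made factorization as before, with unit diagonal in L):

import numpy as np

L = np.array([[1.0, 0.0], [2.0, 1.0]])   # unit diagonal, as in (3.12)
U = np.array([[2.0, 1.0], [0.0, 3.0]])
A = L @ U

det_from_U = np.prod(np.diag(U))                    # (3.13): product of the U_jj
print(np.isclose(det_from_U, np.linalg.det(A)))     # True (det A = 6)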

While things are nice once we have the LU decomposition (3.6) of the matrix A, the difficult part is of course to actually obtain this decomposition for a given matrix A, and the larger n is, the more work this takes. There are however standard algorithms (essentially variants of Gaussian elimination) designed to accomplish this task, so in practice we can simply use them whenever we need to find an LU decomposition.
As a final remark, note that the matrix A only contains the coefficients of the
linear equations (3.1), and in particular makes no reference to the vector B: this
means that once we have the LU decomposition of A, we can then efficiently solve
the equations (3.1), or equivalently (3.2), for many different vectors B, which is a
distinct advantage of the LU decomposition over some other methods that can be
used to solve the linear equations (3.1).
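
A sketch of this reuse, assuming SciPy (note that scipy.linalg.lu_factor works with a row-permuted LU decomposition, which does not change the idea):

import numpy as np
from scipy.linalg import lu_factor, lu_solve

A = np.array([[2.0, 1.0],
              [4.0, 5.0]])

lu, piv = lu_factor(A)                  # the factorization is computed once

for B in (np.array([3.0, 11.0]), np.array([1.0, 0.0])):
    x = lu_solve((lu, piv), B)          # each new B only needs cheap triangular solves
    print(np.allclose(A @ x, B))        # True, True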
