Jin Ho Kwak
Sungpyo Hong

Linear Algebra

Birkhäuser
Boston · Basel · Berlin
Jin Ho Kwak
Sungpyo Hong
Department of Mathematics
Pohang University of Science and Technology
Pohang, The Republic of Korea
Library of Congress Cataloging-in-Publication Data
Kwak, Jin Ho, 1948-
Linear Algebra / Jin Ho Kwak, Sungpyo Hong.
p. cm.
Includes index.
ISBN 0-8176-3999-3 (alk. paper). ISBN 3-7643-3999-3 (alk. paper)
1. Algebras, Linear. I. Hong, Sungpyo, 1948- . II. Title.
QA188.K94 1997
512.5--dc21 97-9062

Printed on acid-free paper.
© 1997 Birkhäuser Boston

Copyright is not claimed for works of U.S. Government employees.
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system,
or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording,
or otherwise, without prior permission of the copyright owner.

Permission to photocopy for internal or personal use of specific clients is granted by
Birkhäuser Boston for libraries and other users registered with the Copyright Clearance
Center (CCC), provided that the base fee of $6.00 per copy, plus $0.20 per page is paid directly
to CCC, 222 Rosewood Drive, Danvers, MA 01923, U.S.A. Special requests should be
addressed directly to Birkhäuser Boston, 675 Massachusetts Avenue, Cambridge, MA 02139,
U.S.A.

ISBN 0-8176-3999-3
ISBN 3-7643-3999-3

Typeset by the authors in LaTeX.
Printed and bound by Hamilton Printing, Rensselaer, NY.
Printed in the U.S.A.
9 8 7 6 5 4 3 2 1

Preface
Linear algebra is one of the most important subjects in the study of science
and engineering because of its widespread applications in social or natural
science, computer science, physics, or economics. As one of the most useful
courses in undergraduate mathematics, it has provided essential tools for
industrial scientists. The basic concepts of linear algebra are vector spaces,
linear transformations, matrices and determinants, and they serve as an
abstract language for stating ideas and solving problems.
This book is based on the lectures delivered for several years in a sophomore-
level linear algebra course designed for science and engineering students. The
primary purpose of this book is to give a careful presentation of the basic
concepts of linear algebra as a coherent part of mathematics, and to illustrate
its power and usefulness through applications to other disciplines. We have
tried to emphasize the computational skills along with the mathematical
abstractions, which also have an integrity and beauty of their own. The
book includes a variety of interesting applications with many examples not
only to help students understand new concepts but also to practice wide
applications of the subject to such areas as differential equations, statistics,
geometry, and physics. Some of those applications may not be central to
the mathematical development and may be omitted or selected in a syllabus
at the discretion of the instructor. Most basic concepts and introductory
motivations begin with examples in Euclidean space or solving a system of
linear equations, and are gradually examined from different points of view
to derive general principles.
For those students who have completed a year of calculus, linear algebra
may be the first course in which the subject is developed in an abstract way,
and we often find that many students struggle with the abstraction and
miss the applications. Our experience is that, to understand the material,
students should practice with many problems, which are sometimes omitted
because of a lack of time. To encourage the students to do repeated practice,
we placed in the middle of the text not only many examples but also some
carefully selected problems, with answers or helpful hints. We have tried
to make this book as easily accessible and clear as possible, but certainly
there may be some awkward expressions in several ways. Any criticism or
comment from the readers will be appreciated.
We are very grateful to many colleagues in Korea, especially to the faculty
members in the mathematics department at Pohang University of Science
and Technology (POSTECH), who helped us over the years with various
aspects of this book. For their valuable suggestions and comments, we would
like to thank the students at POSTECH, who have used photocopied versions
of the text over the past several years. We would also like to acknowledge the
invaluable assistance we have received from the teaching assistants who have
checked and added some answers or hints for the problems and exercises in
this book. Our thanks also go to Mrs. Kathleen Roush who made this book
much more legible with her grammatical corrections in the final manuscript.
Our thanks finally go to the editing staff of Birkhäuser for gladly accepting
our book for publication.
Jin Ho Kwak
Sungpyo Hong
E-mail:
[email protected]
[email protected]
April 1997, in Pohang, Korea
"Linear algebra is the mathematics of our modern technological world of
complex multivariable systems and computers."
- Alan Tucker -
"We (Halmos and Kaplansky) share a love of linear algebra. I think it
is our conviction that we'll never understand infinite-dimensional operators
properly until we have a decent mastery of finite matrices. And we share a
philosophy about linear algebra: we think basis-free, we write basis-free, but
when the chips are down we close the office door and compute with matrices
like fury."
- Irving Kaplansky -

Contents
Preface v

1 Linear Equations and Matrices 1
  1.1 Introduction 1
  1.2 Gaussian elimination 4
  1.3 Matrices 12
  1.4 Products of matrices 16
  1.5 Block matrices 22
  1.6 Inverse matrices 24
  1.7 Elementary matrices 27
  1.8 LDU factorization 33
  1.9 Application: Linear models 38
  1.10 Exercises 45

2 Determinants 49
  2.1 Basic properties of determinant 49
  2.2 Existence and uniqueness 54
  2.3 Cofactor expansion 60
  2.4 Cramer's rule 65
  2.5 Application: Area and Volume 68
  2.6 Exercises 71

3 Vector Spaces 75
  3.1 Vector spaces and subspaces 75
  3.2 Bases 81
  3.3 Dimensions 88
  3.4 Row and column spaces 94
  3.5 Rank and nullity 100
  3.6 Bases for subspaces 104
  3.7 Invertibility 110
  3.8 Application: Interpolation 113
  3.9 Application: The Wronskian 115
  3.10 Exercises 117

4 Linear Transformations 121
  4.1 Introduction 121
  4.2 Invertible linear transformations 127
  4.3 Application: Computer graphics 132
  4.4 Matrices of linear transformations 135
  4.5 Vector spaces of linear transformations 140
  4.6 Change of bases 143
  4.7 Similarity 146
  4.8 Dual spaces 152
  4.9 Exercises 156

5 Inner Product Spaces 161
  5.1 Inner products 161
  5.2 The lengths and angles of vectors 164
  5.3 Matrix representations of inner products 167
  5.4 Orthogonal projections 171
  5.5 The Gram-Schmidt orthogonalization 177
  5.6 Orthogonal matrices and transformations 181
  5.7 Relations of fundamental subspaces 185
  5.8 Least square solutions 187
  5.9 Application: Polynomial approximations 192
  5.10 Orthogonal projection matrices 196
  5.11 Exercises 204

6 Eigenvectors and Eigenvalues 209
  6.1 Introduction 209
  6.2 Diagonalization of matrices 216
  6.3 Application: Difference equations 221
  6.4 Application: Differential equations I 226
  6.5 Application: Differential equations II 230
  6.6 Exponential matrices 235
  6.7 Application: Differential equations III 240
  6.8 Diagonalization of linear transformations 243
  6.9 Exercises 245

7 Complex Vector Spaces 251
  7.1 Introduction 251
  7.2 Hermitian and unitary matrices 259
  7.3 Unitarily diagonalizable matrices 263
  7.4 Normal matrices 268
  7.5 The spectral theorem 271
  7.6 Exercises 276

8 Quadratic Forms 279
  8.1 Introduction 279
  8.2 Diagonalization of a quadratic form 282
  8.3 Congruence relation 288
  8.4 Extrema of quadratic forms 292
  8.5 Application: Quadratic optimization 298
  8.6 Definite forms 300
  8.7 Bilinear forms 303
  8.8 Exercises 313

9 Jordan Canonical Forms 317
  9.1 Introduction 317
  9.2 Generalized eigenvectors 327
  9.3 Computation of e^A 333
  9.4 Cayley-Hamilton theorem 337
  9.5 Exercises 340

Selected Answers and Hints 343

Index 365

Chapter 1
Linear Equations and
Matrices
1.1 Introduction
One of the central motivations for linear algebra is solving systems of linear
equations. We thus begin with the problem of finding the solutions of a
system of m linear equations in n unknowns of the following form:
\[
\begin{array}{ccccccccc}
a_{11}x_1 &+& a_{12}x_2 &+& \cdots &+& a_{1n}x_n &=& b_1\\
a_{21}x_1 &+& a_{22}x_2 &+& \cdots &+& a_{2n}x_n &=& b_2\\
&&&&\vdots&&&&\\
a_{m1}x_1 &+& a_{m2}x_2 &+& \cdots &+& a_{mn}x_n &=& b_m,
\end{array}
\]
where x_1, x_2, ..., x_n are the unknowns and the a_{ij}'s and b_i's denote constant
(real or complex) numbers.
A sequence of numbers (s_1, s_2, ..., s_n) is called a solution of the
system if x_1 = s_1, x_2 = s_2, ..., x_n = s_n satisfy each equation in the system
simultaneously. When b_1 = b_2 = ... = b_m = 0, we say that the system is
homogeneous.
The central topic of this chapter is to examine whether or not a given
system has a solution, and to find a solution if it has one. For instance,
any homogeneous system always has at least one solution x_1 = x_2 = ... =
x_n = 0, called the trivial solution. A natural question is whether such a
homogeneous system has a nontrivial solution. If so, we would like to have a
systematic method of finding all the solutions. A system of linear equations
is said to be consistent if it has at least one solution, and inconsistent if
it has no solution. The following example gives us an idea how to answer
the above questions.
Example 1.1 When m = n = 2, the system reduces to two equations in
two unknowns x and y:
\[
\begin{array}{rcl}
a_1x + b_1y &=& c_1\\
a_2x + b_2y &=& c_2.
\end{array}
\]
Geometrically, each equation in the system represents a straight line
when we interpret x and y as coordinates in the xy-plane. Therefore, a
point P = (x, y) is a solution if and only if the point P lies on both lines.
Hence there are three possible types of solution set:
(1) the empty set if the lines are parallel,
(2) only one point if they intersect,
(3) a straight line, i.e., infinitely many solutions, if they coincide.
The following examples and diagrams illustrate the three types:

[Figure: three diagrams in the xy-plane, one per case: Case (1) two parallel lines, Case (2) two lines meeting in a single point (e.g., x + y = 1 and x − y = 0), and Case (3) two coincident lines.]
To decide whether the given system has a solution and to find a general
method of solving the system when it has a solution, we repeat here a well-
known elementary method of elimination and substitution.
Suppose first that the system consists of only one equation ax + by = c.
Then the system has either infinitely many solutions (i.e., the points on the
straight line x = −(b/a)y + c/a or y = −(a/b)x + c/b, depending on whether
a ≠ 0 or b ≠ 0), or no solutions when a = b = 0 and c ≠ 0.
We now assume that the system has two equations representing two lines
in the plane. Then clearly the two lines are parallel with the same slopes
if and only if a_2 = λa_1 and b_2 = λb_1 for some λ ≠ 0, or a_1b_2 − a_2b_1 = 0.
Furthermore, the two lines either coincide (infinitely many solutions) or are
distinct and parallel (no solutions) according to whether c_2 = λc_1 holds or
not.
Suppose now that the lines are not parallel, i.e., a_1b_2 − a_2b_1 ≠ 0. In this
case, the two lines cross at a point, and hence there is exactly one solution:
For instance, if the system is homogeneous, then the lines cross at the origin,
so (0,0) is the only solution. For a nonhomogeneous system, we may find
the solution as follows: Express x in terms of y from the first equation, and
then substitute it into the second equation (i.e., eliminate the variable x
from the second equation) to get
\[
(a_1b_2 - a_2b_1)\,y = a_1c_2 - a_2c_1.
\]
Since a_1b_2 − a_2b_1 ≠ 0, this can be solved as
\[
y = \frac{a_1c_2 - a_2c_1}{a_1b_2 - a_2b_1},
\]
which is in turn substituted into one of the equations to find x and give a
complete solution of the system. In detail, the process can be summarized
as follows:
(1) Without loss of generality, we may assume a_1 ≠ 0, since otherwise we can
interchange the two equations. Then the variable x can be eliminated from
the second equation by adding −a_2/a_1 times the first equation to the second,
to get
\[
\begin{array}{rcl}
a_1x + b_1y &=& c_1\\[2pt]
\left(b_2 - \dfrac{a_2}{a_1}b_1\right) y &=& c_2 - \dfrac{a_2}{a_1}c_1.
\end{array}
\]
(2) Since a_1b_2 − a_2b_1 ≠ 0, y can be found by multiplying the second equation
by the nonzero number \frac{a_1}{a_1b_2 - a_2b_1}, to get
\[
\begin{array}{rcl}
a_1x + b_1y &=& c_1\\[2pt]
y &=& \dfrac{a_1c_2 - a_2c_1}{a_1b_2 - a_2b_1}.
\end{array}
\]
(3) Now, x is solved by substituting the value of y into the first equation,
and we obtain the solution to the problem:
\[
x = \frac{b_2c_1 - b_1c_2}{a_1b_2 - a_2b_1}, \qquad
y = \frac{a_1c_2 - a_2c_1}{a_1b_2 - a_2b_1}.
\]
Note that the condition a_1b_2 − a_2b_1 ≠ 0 is necessary for the system to have
only one solution. □
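The closed-form solution above is easy to experiment with on a computer. The following is a minimal Python sketch (the helper name solve_2x2 is ours for illustration, not from the text) implementing the two formulas and reporting the degenerate case a_1b_2 − a_2b_1 = 0:

def solve_2x2(a1, b1, c1, a2, b2, c2):
    """Solve a1*x + b1*y = c1, a2*x + b2*y = c2 by the formulas above."""
    d = a1 * b2 - a2 * b1       # nonzero exactly when the lines cross once
    if d == 0:
        return None             # parallel or coincident lines: no unique solution
    x = (b2 * c1 - b1 * c2) / d
    y = (a1 * c2 - a2 * c1) / d
    return x, y

# The lines x + y = 1 and x - y = 0 cross at the single point (1/2, 1/2).
print(solve_2x2(1, 1, 1, 1, -1, 0))   # (0.5, 0.5)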
In this example, we have changed the original system of equations into a
simpler one using certain operations, from which we can get the solution of
the given system. That is, if (x, y) satisfies the original system of equations,
then x and y must satisfy the above simpler system in (3), and vice versa.
It is suggested that the readers examine a system of three equations in
three unknowns, each equation representing a plane in the 3-dimensional
space R^3, and consider the various possible cases in a similar way.
Problem 1.1 For a system of three equations in three unknowns
\[
\begin{array}{ccccccc}
a_{11}x &+& a_{12}y &+& a_{13}z &=& b_1\\
a_{21}x &+& a_{22}y &+& a_{23}z &=& b_2\\
a_{31}x &+& a_{32}y &+& a_{33}z &=& b_3,
\end{array}
\]
describe all the possible types of the solution set in R^3.
1.2 Gaussian elimination
As we have seen in Example 1.1, a basic idea for solving a system of linear
equations is to change the given system into a simpler system, keeping the
solutions unchanged; the example showed how to change a general system
to a simpler one. In fact, the main operations used in Example 1.1 are the
following three operations, called elementary operations:
(1) multiply a nonzero constant throughout an equation,
(2) interchange two equations,
(3) change an equation by adding a constant multiple of another equation.
After applying a finite sequence of these elementary operations to the
given system, one can obtain a simpler system from which the solution can
be derived directly.
Note also that each of the three elementary operations has its inverse
operation which is also an elementary operation:
(1)' divide the equation by the same nonzero constant,
(2)' interchange the two equations again,
(3)' change the equation by subtracting the same constant multiple of the
other equation.
By applying these inverse operations in reverse order to the simpler system,
one can recover the original system. This means that a solution of the
original system must also be a solution of the simpler one, and vice versa.
These arguments can be formalized in mathematical language. Observe
that in performing any of these basic operations, only the coefficients of the
variables are involved in the calculations and the variables x_1, ..., x_n and
the equal sign "=" are simply repeated. Thus, keeping the order of the
variables and "=" in mind, we just extract the coefficients from the
equations in the given system and make a rectangular array of numbers:
\[
\left[\begin{array}{cccc|c}
a_{11} & a_{12} & \cdots & a_{1n} & b_1\\
a_{21} & a_{22} & \cdots & a_{2n} & b_2\\
\vdots & \vdots & & \vdots & \vdots\\
a_{m1} & a_{m2} & \cdots & a_{mn} & b_m
\end{array}\right].
\]
This matrix is called the augmented matrix for the system. The term
matrix means just any rectangular array of numbers, and the numbers in this
array are called the entries of the matrix. To explain the above operations
in terms of matrices, we first introduce some terminology, even though in the
following sections we shall study matrices in more detail.
Within a matrix, the horizontal and vertical subarrays
\[
[\,a_{i1}\ a_{i2}\ \cdots\ a_{in}\ b_i\,]
\quad\text{and}\quad
\begin{bmatrix} a_{1j}\\ a_{2j}\\ \vdots\\ a_{mj} \end{bmatrix}
\]
are called the i-th row (matrix) and the j-th column (matrix) of the aug-
mented matrix, respectively. Note that the entries in the j-th column are
just the coefficients of the j-th variable x_j, so there is a correspondence between
columns of the matrix and variables of the system.
Since each row of the augmented matrix contains all the information of
the corresponding equation of the system, we may deal with this augmented
matrix instead of handling the whole system of linear equations.
The elementary operations on a system of linear equations are rephrased
as the elementary row operations for the augmented matrix, as follows:
(1) multiply a nonzero constant throughout a row,
(2) interchange two rows,
(3) change a row by adding a constant multiple of another row.
The inverse operations are
(1)' divide the row by the same nonzero constant,
(2)' interchange the two rows again,
(3)' change the row by subtracting the same constant multiple of the other
row.
Definition 1.1 Two augmented matrices (or systems of linear equations)
are said to be row-equivalent if one can be transformed to the other by a
finite sequence of elementary row operations.
If a matrix B can be obtained from a matrix A in this way, then we
can obviously recover A from B by applying the inverse elementary row
operations in reverse order. Note again that an elementary row operation
does not alter the solution of the system, and we can formalize the above
argument in the following theorem:
Theorem 1.1 If two systems of linear equations are row-equivalent, then
they have the same set of solutions.
The general procedure for finding the solutions will be illustrated in the
following example:
Example 1.2 Solve the system of linear equations:
\[
\begin{array}{rcr}
2y + 4z &=& 2\\
x + 2y + 2z &=& 3\\
3x + 4y + 6z &=& -1.
\end{array}
\]
Solution: We could work with the augmented matrix alone. However,
to compare the operations on systems of linear equations with those on the
augmented matrix, we work on the system and the augmented matrix in
parallel. Note that the associated augmented matrix of the system is
\[
\left[\begin{array}{rrr|r} 0 & 2 & 4 & 2\\ 1 & 2 & 2 & 3\\ 3 & 4 & 6 & -1 \end{array}\right].
\]
(1) Since the coefficient of x in the first equation is zero while that in the
second equation is not zero, we interchange these two equations:
\[
\begin{array}{rcr}
x + 2y + 2z &=& 3\\
2y + 4z &=& 2\\
3x + 4y + 6z &=& -1
\end{array}
\qquad
\left[\begin{array}{rrr|r} 1 & 2 & 2 & 3\\ 0 & 2 & 4 & 2\\ 3 & 4 & 6 & -1 \end{array}\right]
\]
(2) Add −3 times the first equation (row) to the third equation (row):
\[
\begin{array}{rcr}
x + 2y + 2z &=& 3\\
2y + 4z &=& 2\\
-2y &=& -10
\end{array}
\qquad
\left[\begin{array}{rrr|r} 1 & 2 & 2 & 3\\ 0 & 2 & 4 & 2\\ 0 & -2 & 0 & -10 \end{array}\right]
\]
The coefficient 1 of the first unknown x in the first equation (row) is called
the pivot in this first elimination step.
Now the second and the third equations involve only the two unknowns
y and z. Leave the first equation (row) alone, and the same elimination
procedure can be applied to the second and the third equations (rows): The
pivot for this step is the coefficient 2 of y in the second equation (row). To
eliminate y from the last equation,
(3) Add 1 times the second equation (row) to the third equation (row):
\[
\begin{array}{rcr}
x + 2y + 2z &=& 3\\
2y + 4z &=& 2\\
4z &=& -8
\end{array}
\qquad
\left[\begin{array}{rrr|r} 1 & 2 & 2 & 3\\ 0 & 2 & 4 & 2\\ 0 & 0 & 4 & -8 \end{array}\right]
\]
The elimination process done so far to obtain this result is called a for-
ward elimination: i.e., elimination of x from the last two equations (rows)
and then elimination of y from the last equation (row).
Now the pivots of the second and third rows are 2 and 4, respectively.
To make these entries 1,
(4) Divide each row by the pivot of the row:
\[
\begin{array}{rcr}
x + 2y + 2z &=& 3\\
y + 2z &=& 1\\
z &=& -2
\end{array}
\qquad
\left[\begin{array}{rrr|r} 1 & 2 & 2 & 3\\ 0 & 1 & 2 & 1\\ 0 & 0 & 1 & -2 \end{array}\right]
\]
The resulting matrix on the right side is called a row-echelon form of the
matrix, and the 1's at the leftmost nonzero entries in each row are called the
leading 1's. The process so far is called a Gaussian elimination.
We now want to eliminate the numbers above the leading 1's:
(5) Add −2 times the third row to the second and the first rows:
\[
\begin{array}{rcr}
x + 2y &=& 7\\
y &=& 5\\
z &=& -2
\end{array}
\qquad
\left[\begin{array}{rrr|r} 1 & 2 & 0 & 7\\ 0 & 1 & 0 & 5\\ 0 & 0 & 1 & -2 \end{array}\right]
\]
(6) Add −2 times the second row to the first row:
\[
\begin{array}{rcr}
x &=& -3\\
y &=& 5\\
z &=& -2
\end{array}
\qquad
\left[\begin{array}{rrr|r} 1 & 0 & 0 & -3\\ 0 & 1 & 0 & 5\\ 0 & 0 & 1 & -2 \end{array}\right]
\]
This matrix is called the reduced row-echelon form. The procedure
to get this reduced row-echelon form from a row-echelon form is called the
back substitution. The whole process to obtain the reduced row-echelon
form is called a Gauss-Jordan elimination.
Notice that the system corresponding to this reduced row-echelon form
is row-equivalent to the original one and is essentially in solved form: i.e.,
the solution is x = −3, y = 5, z = −2. □
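The six steps above mechanize directly. The following Python sketch (numpy assumed; the helper name gauss_jordan is ours for illustration) performs Gauss-Jordan elimination on an augmented matrix using the three elementary row operations, and reproduces the reduced row-echelon form of Example 1.2:

import numpy as np

def gauss_jordan(aug):
    """Reduce an augmented matrix [A | b] to its reduced row-echelon form."""
    A = aug.astype(float)
    m, n = A.shape
    row = 0
    for col in range(n - 1):                  # the last column is b
        nonzero = np.nonzero(A[row:, col])[0]
        if nonzero.size == 0:                 # no pivot available in this column
            continue
        p = row + nonzero[0]
        A[[row, p]] = A[[p, row]]             # (2) interchange two rows
        A[row] = A[row] / A[row, col]         # (1) scale to get a leading 1
        for r in range(m):                    # (3) clear the rest of the column
            if r != row:
                A[r] = A[r] - A[r, col] * A[row]
        row += 1
        if row == m:
            break
    return A

aug = np.array([[0, 2, 4, 2], [1, 2, 2, 3], [3, 4, 6, -1]])
print(gauss_jordan(aug))   # last column reads off x = -3, y = 5, z = -2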
In general, a matrix in row-echelon form satisfies the following prop-
erties:
(1) The first nonzero entry of each row is 1, called a leading 1.
(2) A row containing only 0's should come after all rows with some nonzero
entries.
(3) The leading 1's appear from left to right in successive rows. That
is, the leading 1 in the lower row occurs farther to the right than the
leading 1 in the higher row.
Moreover, a matrix in reduced row-echelon form satisfies
Note that an augmented matrix has only one reduced row-echelon form
while it may have many row-echelon forms. In any case, the number of
nonzero rows containing leading 1’s is equal to the number of columns con-
taining leading 1's. The variables in the system corresponding to columns
with the leading 1’s in a row-echelon form are called the basic variables. In
general, the reduced row-echelon form U may have columns that do not con-
tain leading 1's. The variables in the system corresponding to the columns
without leading 1's are called free variables. Thus the sum of the number
of basic variables and that of free variables is precisely the total number of
variables.
For example, the first two matrices below are in reduced row-echelon
form, and the last two just in row-echelon form:
\[
\begin{bmatrix} 1&0&0&5\\ 0&1&0&2\\ 0&0&1&0 \end{bmatrix},\quad
\begin{bmatrix} 0&1&2&0&1\\ 0&0&0&1&6\\ 0&0&0&0&0 \end{bmatrix},\quad
\begin{bmatrix} 1&2&3&2\\ 0&1&2&0\\ 0&0&1&7 \end{bmatrix},\quad
\begin{bmatrix} 1&1&2&6\\ 0&1&1&0\\ 0&0&1&3 \end{bmatrix}.
\]
Notice that in an augmented matrix [A b], the last column b does not
correspond to any variable. Hence, if we consider the four matrices above
as augmented matrices for some systems, then the systems corresponding
to the first and the last two augmented matrices have only basic variables
but no free variables. In the system corresponding to the second augmented
matrix, the second and the fourth variables, x_2 and x_4, are basic, and the
first and the third variables, x_1 and x_3, are free variables. These ideas will
be used in later chapters.
In summary, by applying a finite sequence of elementary row operations,
the augmented matrix for a system of linear equations can be changed to
its reduced row-echelon form which is row-equivalent to the original one.
From the reduced row-echelon form, we can decide whether the system has
a solution, and find the solution of the given system if it has one.
Example 1.3 Solve the following system of linear equations by Gauss-
Jordan elimination:
\[
\begin{array}{rcr}
x_1 + 3x_2 - 2x_3 &=& 3\\
2x_1 + 6x_2 - 2x_3 + 4x_4 &=& 18\\
x_2 + x_3 + 3x_4 &=& 10.
\end{array}
\]
Solution: The augmented matrix for the system is
\[
\left[\begin{array}{rrrr|r} 1 & 3 & -2 & 0 & 3\\ 2 & 6 & -2 & 4 & 18\\ 0 & 1 & 1 & 3 & 10 \end{array}\right].
\]
The Gaussian elimination begins with:
(1) Adding −2 times the first row to the second produces
\[
\left[\begin{array}{rrrr|r} 1 & 3 & -2 & 0 & 3\\ 0 & 0 & 2 & 4 & 12\\ 0 & 1 & 1 & 3 & 10 \end{array}\right].
\]
(2) Note that the coefficient of x_2 in the second equation is zero and that
in the third equation is not. Thus, interchanging the second and the third
rows produces
\[
\left[\begin{array}{rrrr|r} 1 & 3 & -2 & 0 & 3\\ 0 & 1 & 1 & 3 & 10\\ 0 & 0 & 2 & 4 & 12 \end{array}\right].
\]
(3) The pivot in the third row is 2. Thus, dividing the third row by 2
produces a row-echelon form
\[
\left[\begin{array}{rrrr|r} 1 & 3 & -2 & 0 & 3\\ 0 & 1 & 1 & 3 & 10\\ 0 & 0 & 1 & 2 & 6 \end{array}\right].
\]
This is a row-echelon form, and we now continue the back substitution:
(4) Adding −1 times the third row to the second, and 2 times the third
row to the first, produces
\[
\left[\begin{array}{rrrr|r} 1 & 3 & 0 & 4 & 15\\ 0 & 1 & 0 & 1 & 4\\ 0 & 0 & 1 & 2 & 6 \end{array}\right].
\]
(5) Finally, adding −3 times the second row to the first produces the
reduced row-echelon form:
\[
\left[\begin{array}{rrrr|r} 1 & 0 & 0 & 1 & 3\\ 0 & 1 & 0 & 1 & 4\\ 0 & 0 & 1 & 2 & 6 \end{array}\right].
\]
The corresponding system of equations is
\[
\begin{array}{rcr}
x_1 + x_4 &=& 3\\
x_2 + x_4 &=& 4\\
x_3 + 2x_4 &=& 6.
\end{array}
\]
Since x_1, x_2, and x_3 correspond to the columns containing leading 1's,
they are the basic variables, and x_4 is the free variable. Thus, by solving this
system for the basic variables in terms of the free variable x_4, we have the
system of equations in a solved form:
\[
\begin{array}{rcl}
x_1 &=& 3 - x_4\\
x_2 &=& 4 - x_4\\
x_3 &=& 6 - 2x_4.
\end{array}
\]
By assigning an arbitrary value t to the free variable x_4, the solutions can
be written as
\[
(x_1,\ x_2,\ x_3,\ x_4) = (3 - t,\ 4 - t,\ 6 - 2t,\ t)
\]
for any t ∈ R, where R denotes the set of real numbers. □
Remark: Consider a homogeneous system
\[
\begin{array}{ccccccccc}
a_{11}x_1 &+& a_{12}x_2 &+& \cdots &+& a_{1n}x_n &=& 0\\
a_{21}x_1 &+& a_{22}x_2 &+& \cdots &+& a_{2n}x_n &=& 0\\
&&&&\vdots&&&&\\
a_{m1}x_1 &+& a_{m2}x_2 &+& \cdots &+& a_{mn}x_n &=& 0,
\end{array}
\]
with the number of unknowns greater than the number of equations: that
is, m < n.
[…]

1.4 Products of matrices

With A = \begin{bmatrix} 2 & 3\\ 4 & 0 \end{bmatrix} and B = \begin{bmatrix} 1 & 2 & 0\\ 5 & -1 & 0 \end{bmatrix}, the products of A with the columns of B are
\[
\begin{bmatrix} 2 & 3\\ 4 & 0 \end{bmatrix}\begin{bmatrix} 1\\ 5 \end{bmatrix}
= \begin{bmatrix} 2\cdot 1 + 3\cdot 5\\ 4\cdot 1 + 0\cdot 5 \end{bmatrix}
= \begin{bmatrix} 17\\ 4 \end{bmatrix},\qquad
\begin{bmatrix} 2 & 3\\ 4 & 0 \end{bmatrix}\begin{bmatrix} 2\\ -1 \end{bmatrix}
= \begin{bmatrix} 2\cdot 2 + 3\cdot(-1)\\ 4\cdot 2 + 0\cdot(-1) \end{bmatrix}
= \begin{bmatrix} 1\\ 8 \end{bmatrix},
\]
\[
\begin{bmatrix} 2 & 3\\ 4 & 0 \end{bmatrix}\begin{bmatrix} 0\\ 0 \end{bmatrix}
= \begin{bmatrix} 2\cdot 0 + 3\cdot 0\\ 4\cdot 0 + 0\cdot 0 \end{bmatrix}
= \begin{bmatrix} 0\\ 0 \end{bmatrix}.
\]
Since A is a 2×2 matrix and B is a 2×3 matrix, the product AB is a 2×3
matrix. If we concentrate, for example, on the (2,1)-entry of AB, we single
out the second row from A and the first column from B, and then we multiply
corresponding entries together and add them up, i.e., 4·1 + 0·5 = 4.
Therefore, AB is
\[
AB = \begin{bmatrix} 2 & 3\\ 4 & 0 \end{bmatrix}\begin{bmatrix} 1 & 2 & 0\\ 5 & -1 & 0 \end{bmatrix}
= \begin{bmatrix} 17 & 1 & 0\\ 4 & 8 & 0 \end{bmatrix}. \qquad \square
\]
Note that the product AB of A and B is not defined if the number of
columns of A and the number of rows of B are not equal.
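The same computation can be checked in Python (numpy assumed), where the operator @ denotes the matrix product; the loop exhibits the column-by-column picture used above:

import numpy as np

A = np.array([[2, 3], [4, 0]])
B = np.array([[1, 2, 0], [5, -1, 0]])

print(A @ B)                  # the full 2x3 product [[17, 1, 0], [4, 8, 0]]
for j in range(B.shape[1]):   # the j-th column of AB is A times the j-th column of B
    print(A @ B[:, j])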
Remark: In step (2), we could have defined the product for a 1×n row matrix A and
an n×r matrix B using the same rule defined in step (1). And then in step
(3) an appropriate modification produces the same definition of the product
of matrices. We suggest the readers verify this (see Example 1.6).
The identity matrix of order n, denoted by I_n (or I if the order is clear
from the context), is a diagonal matrix whose diagonal entries are all 1, i.e.,
\[
I_n = \begin{bmatrix} 1 & 0 & \cdots & 0\\ 0 & 1 & \cdots & 0\\ \vdots & \vdots & \ddots & \vdots\\ 0 & 0 & \cdots & 1 \end{bmatrix}.
\]
By a direct computation, one can easily see that AI_n = A = I_nA for any
n×n matrix A.
Many, but not all, of the rules of arithmetic for real or complex numbers
also hold for matrices, with the operations of scalar multiplication, the sum
and the product of matrices. The matrix O_{m×n} plays the role of the number
0, and I_n that of the number 1 in the set of real numbers.
The rule that does not hold for matrices in general is the commutativity
AB = BA of the product, while the commutativity of the matrix sum
A + B = B + A does hold in general. The following example illustrates the
noncommutativity of the product of matrices.
Example 1.5 Let A = \begin{bmatrix} 1 & 0\\ 0 & 0 \end{bmatrix} and B = \begin{bmatrix} 0 & 1\\ 0 & 0 \end{bmatrix}. Then
\[
AB = \begin{bmatrix} 0 & 1\\ 0 & 0 \end{bmatrix}, \qquad
BA = \begin{bmatrix} 0 & 0\\ 0 & 0 \end{bmatrix}.
\]
Thus the matrices A and B in this example satisfy AB ≠ BA. □
The following theorem lists some rules of ordinary arithmetic that do
hold for matrix operations.

Theorem 1.4 Let A, B, C be arbitrary matrices for which the matrix op-
erations below are defined, and let k be an arbitrary scalar. Then
(1) A(BC) = (AB)C, written as ABC (associativity),
(2) A(B + C) = AB + AC, and (A + B)C = AC + BC (distributivity),
(3) IA = A = AI,
(4) k(BC) = (kB)C = B(kC),
(5) (AB)^T = B^T A^T.

Proof: Each equality can be shown by direct calculation of each entry of
both sides of the equalities. We illustrate this by proving (1) only, and leave
the others to the readers.
Assume that A = [a_{ij}] is an m×n matrix, B = [b_{kl}] is an n×p matrix,
and C = [c_{lj}] is a p×r matrix. We now compute the (i, j)-entry of each
side of the equation. Note that BC is an n×r matrix whose (k, j)-entry is
[BC]_{kj} = \sum_{l=1}^{p} b_{kl}c_{lj}. Thus
\[
[A(BC)]_{ij} = \sum_{k=1}^{n} a_{ik}[BC]_{kj}
= \sum_{k=1}^{n} a_{ik}\sum_{l=1}^{p} b_{kl}c_{lj}
= \sum_{k=1}^{n}\sum_{l=1}^{p} a_{ik}b_{kl}c_{lj}.
\]
Similarly, AB is an m×p matrix with the (i, l)-entry [AB]_{il} = \sum_{k=1}^{n} a_{ik}b_{kl},
and
\[
[(AB)C]_{ij} = \sum_{l=1}^{p} [AB]_{il}c_{lj}
= \sum_{l=1}^{p}\sum_{k=1}^{n} a_{ik}b_{kl}c_{lj}.
\]
This clearly shows that [A(BC)]_{ij} = [(AB)C]_{ij} for all i, j, and consequently
A(BC) = (AB)C as desired. □
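Rules such as these can also be spot-checked numerically. A minimal sketch (numpy assumed, random matrices of compatible sizes) verifies (1) and (5) up to floating-point roundoff:

import numpy as np

rng = np.random.default_rng(0)
A = rng.random((2, 3))
B = rng.random((3, 4))
C = rng.random((4, 2))

print(np.allclose(A @ (B @ C), (A @ B) @ C))   # (1) associativity: True
print(np.allclose((A @ B).T, B.T @ A.T))       # (5) transpose rule: True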
Problem 1.8 Prove or disprove: If A is not a zero matrix and AB = AC, then
B = C.
Problem 1.9 Show that any triangular matrix A satisfying AA^T = A^T A is a diag-
onal matrix.1.4. PRODUCTS OF MATRICES 2
Problem 1.10 For a square matrix A, show that
(1) AA^T and A + A^T are symmetric,
(2) A − A^T is skew-symmetric, and
(3) A can be expressed as the sum of a symmetric part B = (1/2)(A + A^T) and a skew-
symmetric part C = (1/2)(A − A^T), so that A = B + C.
As an application of our results on matrix operations, we shall prove the
following important theorem:

Theorem 1.5 Any system of linear equations has either no solution, exactly
one solution, or infinitely many solutions.

Proof: We have already seen that a system of linear equations may be
written as Ax = b, which may have no solution or exactly one solution.
Now assume that the system Ax = b of linear equations has more than one
solution, and let x_1 and x_2 be two different solutions, so that Ax_1 = b and
Ax_2 = b. Let x_0 = x_1 − x_2 ≠ 0. Since Ax is just a particular case of a
matrix product, Theorem 1.4 gives us
\[
A(\mathbf{x}_1 + k\mathbf{x}_0) = A\mathbf{x}_1 + kA\mathbf{x}_0 = \mathbf{b} + k(A\mathbf{x}_1 - A\mathbf{x}_2) = \mathbf{b},
\]
for any real number k. This says that x_1 + kx_0 is also a solution of Ax = b
for any k. Since there are infinitely many choices for k, Ax = b has infinitely
many solutions. □
Problem 1.11 For which values of a does each of the following systems have no
solution, exactly one solution, or infinitely many solutions?
\[
\begin{array}{rcl}
x + 2y &=& \cdots\\
ax - y &=& \cdots
\end{array}
\]
1.5 Block matrices
In this section we introduce some techniques that will often be very helpful
in manipulating matrices. A submatrix of a matrix A is a matrix obtained
from A by deleting certain rows and/or columns of A. Using a system of
horizontal and vertical lines, we can partition a matrix A into submatrices,
called blocks, of A as follows: Consider a matrix
\[
A = \left[\begin{array}{ccc|c}
a_{11} & a_{12} & a_{13} & a_{14}\\
a_{21} & a_{22} & a_{23} & a_{24}\\ \hline
a_{31} & a_{32} & a_{33} & a_{34}
\end{array}\right],
\]
divided up into four blocks by the dotted lines shown. Now, if we write
\[
A_{11} = \begin{bmatrix} a_{11} & a_{12} & a_{13}\\ a_{21} & a_{22} & a_{23} \end{bmatrix},\quad
A_{12} = \begin{bmatrix} a_{14}\\ a_{24} \end{bmatrix},\quad
A_{21} = [\,a_{31}\ a_{32}\ a_{33}\,],\quad
A_{22} = [\,a_{34}\,],
\]
then A can be written as
\[
A = \begin{bmatrix} A_{11} & A_{12}\\ A_{21} & A_{22} \end{bmatrix},
\]
called a block matrix.
The product of matrices partitioned into blocks also follows the matrix
product formula, as if the A_{ij} were numbers:
\[
AB = \begin{bmatrix} A_{11} & A_{12}\\ A_{21} & A_{22} \end{bmatrix}
\begin{bmatrix} B_{11} & B_{12}\\ B_{21} & B_{22} \end{bmatrix}
= \begin{bmatrix} A_{11}B_{11} + A_{12}B_{21} & A_{11}B_{12} + A_{12}B_{22}\\
A_{21}B_{11} + A_{22}B_{21} & A_{21}B_{12} + A_{22}B_{22} \end{bmatrix},
\]
provided that the number of columns in A_{ik} is equal to the number of rows
in B_{kj}. This will be true only if the columns of A are partitioned in the
same way as the rows of B.
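The block product formula is easy to verify numerically. The sketch below (numpy assumed) partitions a random 3×3 matrix A and a 3×2 matrix B as in the discussion that follows, and checks that the blockwise product agrees with the ordinary one:

import numpy as np

rng = np.random.default_rng(1)
A = rng.random((3, 3))
B = rng.random((3, 2))

# Columns of A and rows of B are partitioned the same way: 2 + 1.
A11, A12 = A[:2, :2], A[:2, 2:]
A21, A22 = A[2:, :2], A[2:, 2:]
B11, B21 = B[:2, :], B[2:, :]

C = np.vstack([A11 @ B11 + A12 @ B21,    # top block of AB
               A21 @ B11 + A22 @ B21])   # bottom block of AB
print(np.allclose(C, A @ B))             # True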
It is not hard to see that the matrix product by blocks is correct. Sup-
pose, for example, that we have a 3×3 matrix A and partition it as
\[
A = \left[\begin{array}{cc|c}
a_{11} & a_{12} & a_{13}\\
a_{21} & a_{22} & a_{23}\\ \hline
a_{31} & a_{32} & a_{33}
\end{array}\right]
= \begin{bmatrix} A_{11} & A_{12}\\ A_{21} & A_{22} \end{bmatrix},
\]
and a 3×2 matrix B which we partition as
\[
B = \left[\begin{array}{cc}
b_{11} & b_{12}\\
b_{21} & b_{22}\\ \hline
b_{31} & b_{32}
\end{array}\right]
= \begin{bmatrix} B_{11}\\ B_{21} \end{bmatrix}.
\]
Then the entries of C = [c_{ij}] = AB are
\[
c_{ij} = (a_{i1}b_{1j} + a_{i2}b_{2j}) + a_{i3}b_{3j}.
\]
The quantity a_{i1}b_{1j} + a_{i2}b_{2j} is simply the (i, j)-entry of A_{11}B_{11} if i ≤ 2, and
the (i, j)-entry of A_{21}B_{11} if i = 3. Similarly, a_{i3}b_{3j} is the (i, j)-entry of A_{12}B_{21}
if i ≤ 2, and of A_{22}B_{21} if i = 3. Thus AB can be written as
\[
AB = \begin{bmatrix} C_{11}\\ C_{21} \end{bmatrix}
= \begin{bmatrix} A_{11}B_{11} + A_{12}B_{21}\\ A_{21}B_{11} + A_{22}B_{21} \end{bmatrix}.
\]
In particular, if an m×n matrix A is partitioned into blocks of column
vectors: i.e., A = [a^1 a^2 ··· a^n], where each block a^j is the j-th column,
then the product Ax with x = [x_1 ··· x_n]^T is the sum of the block matrices
(or column vectors) with coefficients x_j:
\[
A\mathbf{x} = [\,\mathbf{a}^1\ \mathbf{a}^2\ \cdots\ \mathbf{a}^n\,]
\begin{bmatrix} x_1\\ \vdots\\ x_n \end{bmatrix}
= x_1\mathbf{a}^1 + x_2\mathbf{a}^2 + \cdots + x_n\mathbf{a}^n,
\]
where x_j a^j = x_j [a_{1j} a_{2j} ··· a_{mj}]^T.
Example 1.6 Let A be an m×n matrix partitioned into the row vectors
a_1, a_2, ..., a_m as its blocks, and let B be an n×r matrix so that their
product AB is well-defined. By considering the matrix B as a block, the
product AB can be written
\[
AB = \begin{bmatrix} \mathbf{a}_1\\ \mathbf{a}_2\\ \vdots\\ \mathbf{a}_m \end{bmatrix} B
= \begin{bmatrix} \mathbf{a}_1 B\\ \mathbf{a}_2 B\\ \vdots\\ \mathbf{a}_m B \end{bmatrix}
= \begin{bmatrix}
\mathbf{a}_1\mathbf{b}^1 & \mathbf{a}_1\mathbf{b}^2 & \cdots & \mathbf{a}_1\mathbf{b}^r\\
\mathbf{a}_2\mathbf{b}^1 & \mathbf{a}_2\mathbf{b}^2 & \cdots & \mathbf{a}_2\mathbf{b}^r\\
\vdots & \vdots & & \vdots\\
\mathbf{a}_m\mathbf{b}^1 & \mathbf{a}_m\mathbf{b}^2 & \cdots & \mathbf{a}_m\mathbf{b}^r
\end{bmatrix},
\]
where b^1, b^2, ..., b^r denote the columns of B. Hence, the row vectors of
AB are the products of the row vectors of A and B.
Problem 1.12 Compute AB using block multiplication, where
\[
A = \begin{bmatrix} \cdots \end{bmatrix}, \qquad B = \begin{bmatrix} \cdots \end{bmatrix}.
\]
1.6 Inverse matrices
As we saw in Section 1.4, a system of linear equations can be written as
Ax = b in matrix form. This form resembles one of the simplest linear
equations in one variable, ax = b, whose solution is simply x = a^{-1}b when
a ≠ 0. Thus it is tempting to write the solution of the system as x = A^{-1}b.
However, in the case of matrices we first have to give a precise meaning to
A^{-1}. To discuss this we begin with the following definition.
Definition 1.7 For an m×n matrix A, an n×m matrix B is called a left
inverse of A if BA = I_n, and an n×m matrix C is called a right inverse
of A if AC = I_m.
Example 1.7 From a direct calculation for the two matrices
\[
A = \begin{bmatrix} 1 & 0 & 0\\ 0 & 1 & 0 \end{bmatrix}
\quad\text{and}\quad
B = \begin{bmatrix} 1 & 0\\ 0 & 1\\ 1 & 1 \end{bmatrix},
\]
we have AB = I_2, and BA = \begin{bmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 1 & 1 & 0 \end{bmatrix} ≠ I_3.
Thus, the matrix B is a right inverse but not a left inverse of A, while A is a
left inverse but not a right inverse of B. Since (AB)^T = B^T A^T and I^T = I,
a matrix A has a right inverse if and only if A^T has a left inverse. □
However, if A is a square matrix and has a left inverse, then we prove
later (Theorem 1.8) that it also has a right inverse, and vice versa. Moreover,
the following lemma shows that the left inverses and the right inverses of a
square matrix are all equal. (This is not true for nonsquare matrices, of
course.)

Lemma 1.6 If an n×n square matrix A has a left inverse B and a right
inverse C, then B and C are equal, i.e., B = C.
Proof: A direct calculation shows that
\[
B = BI = B(AC) = (BA)C = IC = C.
\]
Now any two left inverses must both be equal to a right inverse C, and hence
to each other, and any two right inverses must both be equal to a left inverse
B, and hence to each other. So there exist only one left and only one right
inverse for a square matrix A if it is known that A has both left and right
inverses. Furthermore, the left and right inverses are equal. □
This lemma says that if a matrix A has both a right inverse and a left
inverse, then they must be the same. However, we shall see in Chapter 3
that an m×n matrix A with m ≠ n cannot have both a right inverse and
a left inverse: that is, a nonsquare matrix may have only a left inverse or
only a right inverse. In this case, the matrix may have many left inverses or
many right inverses.
Example 1.8 A nonsquare matrix A = \begin{bmatrix} 1 & 0\\ 0 & 1\\ 0 & 0 \end{bmatrix} can have more than one
left inverse. In fact, for any x, y ∈ R, one can easily check that the matrix
B = \begin{bmatrix} 1 & 0 & x\\ 0 & 1 & y \end{bmatrix} is a left inverse of A. □
Definition 1.8 An n×n square matrix A is said to be invertible (or
nonsingular) if there exists a square matrix B of the same size such that
\[
AB = I = BA.
\]
Such a matrix B is called the inverse of A, and is denoted by A^{-1}. A matrix
A is said to be singular if it is not invertible.

Note that Lemma 1.6 shows that if a square matrix A has both left and
right inverses, then the inverse must be unique. That is why we call B "the" inverse
of A. For instance, consider a 2×2 matrix A = \begin{bmatrix} a & b\\ c & d \end{bmatrix}. If ad − bc ≠ 0,
then it is easy to verify that
\[
A^{-1} = \frac{1}{ad - bc}\begin{bmatrix} d & -b\\ -c & a \end{bmatrix},
\]
since AA^{-1} = I_2 = A^{-1}A. (Check this product of matrices for practice!)
Note that any zero matrix is singular.
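A quick way to experiment with this formula is the short Python function below (the name inverse_2x2 is ours for illustration):

def inverse_2x2(a, b, c, d):
    """Inverse of the 2x2 matrix [[a, b], [c, d]] by the formula above."""
    det = a * d - b * c
    if det == 0:
        raise ValueError("ad - bc = 0: the matrix is singular")
    return [[d / det, -b / det],
            [-c / det, a / det]]

print(inverse_2x2(1, 2, 3, 4))   # [[-2.0, 1.0], [1.5, -0.5]]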
Problem 1.13 Let A be an invertible matrix and k any nonzero scalar. Show that
(1) A^{-1} is invertible and (A^{-1})^{-1} = A;
(2) the matrix kA is invertible and (kA)^{-1} = (1/k)A^{-1};
(3) A^T is invertible and (A^T)^{-1} = (A^{-1})^T.
Theorem 1.7 The product of invertible matrices is also invertible, whose
inverse is the product of the individual inverses in reverse order:
\[
(AB)^{-1} = B^{-1}A^{-1}.
\]
Proof: Suppose that A and B are invertible matrices of the same size. Then
(AB)(B^{-1}A^{-1}) = A(BB^{-1})A^{-1} = AIA^{-1} = AA^{-1} = I, and similarly
(B^{-1}A^{-1})(AB) = I. Thus AB has the inverse B^{-1}A^{-1}. □
We have written the inverse of A as "A to the power −1", so we can
give a meaning to A^k for any integer k: Let A be a square matrix. Define
A^0 = I. Then, for any positive integer k, we define the power A^k of A
inductively as
\[
A^k = A(A^{k-1}).
\]
Moreover, if A is invertible, then the negative integer power is defined as
A^{-k} = (A^{-1})^k for k > 0.
It is easy to check that with these rules we have A^{k+l} = A^k A^l whenever
the right hand side is defined. (If A is not invertible, A^{1+(-1)} = A^0 is defined
but A^{-1} is not.)
Problem 1.14 Prove:
(1) If A has a zero row, so does AB.
(2) If B has a zero column, so does AB.
(3) Any matrix with a zero row or a zero column cannot be invertible.

Problem 1.15 Let A be an invertible matrix. Is it true that (A^k)^T = (A^T)^k for any
integer k? Justify your answer.
1.7 Elementary matrices
We now return to the system of linear equations Ax = b. If A has a right
inverse B such that AB = I_m, then x = Bb is a solution of the system since
\[
A\mathbf{x} = A(B\mathbf{b}) = (AB)\mathbf{b} = \mathbf{b}.
\]
In particular, if A is an invertible square matrix, then it has only one inverse
A^{-1} by Lemma 1.6, and x = A^{-1}b is the only solution of the system. In
this section, we discuss how to compute A^{-1} when A is invertible.
Recall that Gaussian elimination is a process in which the augmented
matrix is transformed into its row-echelon form by a finite number of ele-
mentary row operations. In the following, we will show that each elementary
row operation can be expressed as a nonsingular matrix, called an elementary
matrix, and hence the process of Gaussian elimination is simply multiplying
a finite sequence of corresponding elementary matrices to the augmented
matrix.
Definition 1.9 A matrix E obtained from the identity matrix I_n by exe-
cuting only one elementary row operation is called an elementary matrix.
For example, the following matrices are three elementary matrices cor-
responding to each type of the three elementary row operations:
(1) \begin{bmatrix} 1 & 0\\ 0 & -5 \end{bmatrix}: multiply the second row of I_2 by −5;
(2) \begin{bmatrix} 1 & 0 & 0 & 0\\ 0 & 0 & 0 & 1\\ 0 & 0 & 1 & 0\\ 0 & 1 & 0 & 0 \end{bmatrix}: interchange the second and the fourth rows of I_4;
(3) \begin{bmatrix} 1 & 0 & 3\\ 0 & 1 & 0\\ 0 & 0 & 1 \end{bmatrix}: add 3 times the third row to the first row of I_3.
It is an interesting fact that, if E is an elementary matrix obtained by
executing a certain elementary row operation on the identity matrix I_m,
then for any m×n matrix A, the product EA is exactly the matrix that is
obtained when the same elementary row operation as in E is executed on A.
The following example illustrates this argument. (Note that AE is not what
we want. For this, see Problem 1.17.)
Example 1.9 For simplicity, we work on a 3×1 column matrix b. Suppose
that we want to do the operation "adding (−2) × the first row to the second
row" on the matrix b. Then, we execute this operation on the identity matrix
I_3 first to get an elementary matrix E:
\[
E = \begin{bmatrix} 1 & 0 & 0\\ -2 & 1 & 0\\ 0 & 0 & 1 \end{bmatrix}.
\]
Multiplying the elementary matrix E to b on the left produces the desired
result:
\[
E\mathbf{b} = \begin{bmatrix} 1 & 0 & 0\\ -2 & 1 & 0\\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} b_1\\ b_2\\ b_3 \end{bmatrix}
= \begin{bmatrix} b_1\\ b_2 - 2b_1\\ b_3 \end{bmatrix}.
\]
Similarly, the operation "interchanging the first and third rows" on the
matrix b can be achieved by multiplying a permutation matrix P, which is
an elementary matrix obtained from I_3 by interchanging two rows, to b on
the left:
\[
P\mathbf{b} = \begin{bmatrix} 0 & 0 & 1\\ 0 & 1 & 0\\ 1 & 0 & 0 \end{bmatrix}
\begin{bmatrix} b_1\\ b_2\\ b_3 \end{bmatrix}
= \begin{bmatrix} b_3\\ b_2\\ b_1 \end{bmatrix}. \qquad \square
\]
Recall that each elementary row operation has an inverse operation,
which is also an elementary operation, that brings the matrix back to the
original one. Thus, suppose that E denotes an elementary matrix corre-
sponding to an elementary row operation, and let E' be the elementary
matrix corresponding to its "inverse" elementary row operation. Then:
(1) if E multiplies a row by c ≠ 0, then E' multiplies the same row by 1/c;
(2) if E interchanges two rows, then E' interchanges them again;
(3) if E adds a multiple of one row to another, then E' subtracts it back
from the same row.
Thus, for any m×n matrix A, E'EA = A, and E'E = I = EE'. That is,
every elementary matrix is invertible with E^{-1} = E', which is also an
elementary matrix.
For instance, if
\[
E = \begin{bmatrix} 0 & 1 & 0\\ 1 & 0 & 0\\ 0 & 0 & 1 \end{bmatrix},
\quad\text{then}\quad
E^{-1} = \begin{bmatrix} 0 & 1 & 0\\ 1 & 0 & 0\\ 0 & 0 & 1 \end{bmatrix} = E.
\]
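One can watch an elementary matrix and its inverse act on a concrete matrix in Python (numpy assumed): here E performs "add (−2) × row 1 to row 2" and E' undoes it.

import numpy as np

A = np.array([[1., 2., 3.],
              [4., 5., 6.],
              [7., 8., 9.]])

E = np.eye(3); E[1, 0] = -2.0         # add (-2) x row 1 to row 2
E_inv = np.eye(3); E_inv[1, 0] = 2.0  # inverse operation: add 2 x row 1 back

print(E @ A)                             # row 2 becomes [2, 1, 0]
print(np.allclose(E_inv @ (E @ A), A))   # True: E' E A = A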
Definition 1.10 A permutation matrix is a square matrix obtained from
the identity matrix by permuting the rows.
Problem 1.16 Prove:
(1) A permutation matrix is the product of a finite number of elementary matrices,
each of which corresponds to the "row-interchanging" elementary row
operation.
(2) Any permutation matrix P is invertible and P^{-1} = P^T.
(3) The product of any two permutation matrices is a permutation matrix.
(4) The transpose of a permutation matrix is also a permutation matrix.
Problem 1.17 Define the elementary column operations for a matrix by just
replacing "row" by "column" in the definition of the elementary row operations.
Show that if A is an m×n matrix and if E is an elementary matrix obtained by
executing an elementary column operation on I_n, then AE is exactly the matrix
that is obtained from A when the same column operation is executed on A.
The next theorem establishes some fundamental relationships between
n×n square matrices and systems of n linear equations in n unknowns.

Theorem 1.8 Let A be an n×n matrix. The following are equivalent:
(1) A has a left inverse;
(2) Ax = 0 has only the trivial solution x = 0;
(3) A is row-equivalent to I_n;
(4) A is a product of elementary matrices;
(5) A is invertible;
(6) A has a right inverse.
Proof: (1) ⇒ (2): Let x be a solution of the homogeneous system Ax = 0,
and let B be a left inverse of A. Then
\[
\mathbf{x} = I_n\mathbf{x} = (BA)\mathbf{x} = B(A\mathbf{x}) = B\mathbf{0} = \mathbf{0}.
\]
(2) ⇒ (3): Suppose that the homogeneous system Ax = 0 has only the
trivial solution x = 0.
This means that the augmented matrix [A 0] of the system Ax = 0 is reduced
to the system [I_n 0] by Gauss-Jordan elimination. Hence, A is row-equivalent
to I_n.
(3) ⇒ (4): Assume A is row-equivalent to I_n, so that A can be reduced
to I_n by a finite sequence of elementary row operations. Thus, we can find
elementary matrices E_1, E_2, ..., E_k such that
\[
E_k \cdots E_2E_1A = I_n.
\]
Since E_1, E_2, ..., E_k are invertible, by multiplying both sides of this equation
on the left successively by E_k^{-1}, ..., E_2^{-1}, E_1^{-1}, we obtain
\[
A = E_1^{-1}E_2^{-1}\cdots E_k^{-1}I_n = E_1^{-1}E_2^{-1}\cdots E_k^{-1},
\]
which expresses A as the product of elementary matrices.
(4) ⇒ (5) is trivial, because any elementary matrix is invertible. In fact,
A^{-1} = E_k ··· E_2E_1.
(5) ⇒ (1) and (5) ⇒ (6) are trivial.
(6) ⇒ (5): If B is a right inverse of A, then A is a left inverse of B,
and we can apply (1) ⇒ (2) ⇒ (3) ⇒ (4) ⇒ (5) to B and conclude that B
is invertible, with A as its unique inverse. That is, B is the inverse of A and
so A is invertible. □
This theorem shows that a square matrix is invertible if it has a one-sided
inverse. In particular, if a square matrix A is invertible, then x = A^{-1}b is the
unique solution to the system Ax = b.
Problem 1.18 Find the inverse of the product
\[
\begin{bmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & -c & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ -b & 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0\\ -a & 1 & 0\\ 0 & 0 & 1 \end{bmatrix}.
\]
As an application of the preceding theorem, we give a practical method
for finding the inverse A^{-1} of an invertible n×n matrix A. If A is invertible,
there are elementary matrices E_1, E_2, ..., E_k such that
\[
E_k \cdots E_2E_1A = I_n.
\]
Hence,
\[
A^{-1} = E_k \cdots E_2E_1 = E_k \cdots E_2E_1 I_n.
\]
It follows that the sequence of elementary row operations that reduces an
invertible matrix A to I_n will convert I_n to A^{-1}. In other words, let [A | I] be the aug-
mented matrix with the columns of A on the left half and the columns of I
on the right half. A Gaussian elimination, applied to both halves by some
elementary row operations, reduces the augmented matrix [A | I] to [U | K],
where U is a row-echelon form of A. Next, the back substitution process by
another series of elementary row operations reduces [U | K] to [I | A^{-1}]:
\[
[A \mid I] \;\to\; [E_k \cdots E_1A \mid E_k \cdots E_1I] = [U \mid K]
\;\to\; [F_l \cdots F_1U \mid F_l \cdots F_1K] = [I \mid A^{-1}],
\]
where E_k ··· E_1 represents a Gaussian elimination and F_l ··· F_1 represents
the back substitution. The following example illustrates the computation of
an inverse matrix.
Example 1.10 Find the inverse of
\[
A = \begin{bmatrix} 1 & 2 & 3\\ 2 & 3 & 5\\ 1 & 0 & 2 \end{bmatrix}.
\]
Solution: We apply Gauss-Jordan elimination to
\[
[A \mid I] = \left[\begin{array}{rrr|rrr}
1 & 2 & 3 & 1 & 0 & 0\\
2 & 3 & 5 & 0 & 1 & 0\\
1 & 0 & 2 & 0 & 0 & 1
\end{array}\right]
\quad \begin{array}{l}(-2)\,\text{row 1} + \text{row 2}\\ (-1)\,\text{row 1} + \text{row 3}\end{array}
\]
\[
\to \left[\begin{array}{rrr|rrr}
1 & 2 & 3 & 1 & 0 & 0\\
0 & -1 & -1 & -2 & 1 & 0\\
0 & -2 & -1 & -1 & 0 & 1
\end{array}\right]
\quad (-1)\,\text{row 2}
\]
\[
\to \left[\begin{array}{rrr|rrr}
1 & 2 & 3 & 1 & 0 & 0\\
0 & 1 & 1 & 2 & -1 & 0\\
0 & -2 & -1 & -1 & 0 & 1
\end{array}\right]
\quad (2)\,\text{row 2} + \text{row 3}
\]
\[
\to \left[\begin{array}{rrr|rrr}
1 & 2 & 3 & 1 & 0 & 0\\
0 & 1 & 1 & 2 & -1 & 0\\
0 & 0 & 1 & 3 & -2 & 1
\end{array}\right] = [U \mid K].
\]
This is [U | K] obtained by Gaussian elimination. Now continue the back
substitution to reduce [U | K] to [I | A^{-1}]:
\[
[U \mid K] \to \left[\begin{array}{rrr|rrr}
1 & 2 & 0 & -8 & 6 & -3\\
0 & 1 & 0 & -1 & 1 & -1\\
0 & 0 & 1 & 3 & -2 & 1
\end{array}\right]
\quad \begin{array}{l}(-3)\,\text{row 3} + \text{row 1}\\ (-1)\,\text{row 3} + \text{row 2}\end{array}
\]
\[
\to \left[\begin{array}{rrr|rrr}
1 & 0 & 0 & -6 & 4 & -1\\
0 & 1 & 0 & -1 & 1 & -1\\
0 & 0 & 1 & 3 & -2 & 1
\end{array}\right]
\quad (-2)\,\text{row 2} + \text{row 1} \qquad = [I \mid A^{-1}].
\]
Thus, we get
\[
A^{-1} = \begin{bmatrix} -6 & 4 & -1\\ -1 & 1 & -1\\ 3 & -2 & 1 \end{bmatrix}.
\]
(The reader should verify that AA^{-1} = I = A^{-1}A.) □
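The [A | I] procedure is easy to program. A minimal Python sketch (numpy assumed; the helper name is ours for illustration) reproduces the inverse just computed:

import numpy as np

def inverse_by_gauss_jordan(A):
    """Reduce [A | I] to [I | A^{-1}] by elementary row operations."""
    n = A.shape[0]
    aug = np.hstack([A.astype(float), np.eye(n)])
    for i in range(n):
        p = i + np.argmax(np.abs(aug[i:, i]))    # choose a nonzero pivot
        if aug[p, i] == 0:
            raise ValueError("matrix is singular")
        aug[[i, p]] = aug[[p, i]]                # interchange rows
        aug[i] = aug[i] / aug[i, i]              # make the pivot a leading 1
        for r in range(n):
            if r != i:
                aug[r] = aug[r] - aug[r, i] * aug[i]   # clear column i
    return aug[:, n:]

A = np.array([[1, 2, 3], [2, 3, 5], [1, 0, 2]])
print(inverse_by_gauss_jordan(A))   # [[-6, 4, -1], [-1, 1, -1], [3, -2, 1]]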
Note that if A is not invertible, then, at some step in Gaussian elimina-
tion, a zero row will show up on the left side in [U | K]. For example, the
matrix
\[
A = \begin{bmatrix} 1 & 6 & 4\\ 2 & 4 & -1\\ -1 & 2 & 5 \end{bmatrix}
\quad\text{is row-equivalent to}\quad
\begin{bmatrix} 1 & 6 & 4\\ 0 & -8 & -9\\ 0 & 0 & 0 \end{bmatrix},
\]
which shows that A is a noninvertible matrix.
Problem 1.19 Write A^{-1} as a product of elementary matrices for the matrix A in
Example 1.10, by using Gaussian elimination.
Problem 1.20 Find the inverse of each of the following matrices: …
Problem 1.21 When is a diagonal matrix
\[
D = \begin{bmatrix} d_1 & & 0\\ & \ddots & \\ 0 & & d_n \end{bmatrix}
\]
nonsingular, and what is D^{-1}?
From Theorem 1.8, a square matrix A is nonsingular if and only if Ax = 0
has only the trivial solution. That is, a square matrix A is singular if and
only if Ax = 0 has a nontrivial solution, say x_0. Now, for any column vector
b = [b_1 ··· b_n]^T, if x_1 is a solution of Ax = b for a singular matrix A, then
so is kx_0 + x_1 for any k, since
\[
A(k\mathbf{x}_0 + \mathbf{x}_1) = k(A\mathbf{x}_0) + A\mathbf{x}_1 = k\mathbf{0} + \mathbf{b} = \mathbf{b}.
\]
This argument strengthens Theorem 1.5 as follows when A is a square
matrix:

Theorem 1.9 If A is an invertible n×n matrix, then for any column vector
b = [b_1 ··· b_n]^T, the system Ax = b has exactly one solution x = A^{-1}b.
If A is not invertible, then the system has either no solution or infinitely
many solutions according to whether or not the system is inconsistent. □
Problem 1.22 Write the system of linear equations
\[
\begin{array}{rcr}
x + 2y + 2z &=& 10\\
2x - y + 3z &=& 1\\
4x - 3y + 5z &=& 4
\end{array}
\]
in matrix form Ax = b and solve it by finding A^{-1}b.
1.8 LDU factorization
Recall that a basic method of solving a linear system Ax = b is Gauss-
Jordan elimination. For a fixed matrix A, if we want to solve more than one
system Ax = b for various values of b, then the same Gaussian elimination
on A has to be repeated over and over again. However, this repetition may
be avoided by expressing Gaussian elimination as an invertible matrix which
is a product of elementary matrices.
We first assume that no permutations of rows are necessary throughout
the whole process of Gaussian elimination on [A b]. Then the forward elim-
ination amounts to multiplying finitely many elementary matrices E_1, ..., E_k to
the augmented matrix [A b]: that is,
\[
[E_k \cdots E_1A \quad E_k \cdots E_1\mathbf{b}] = [U \ \mathbf{c}],
\]
where each E_i is a lower triangular elementary matrix whose diagonal entries
are all 1's and [U c] is the augmented matrix of the system obtained after
forward elimination on Ax = b (note that U need not be an upper triangular
matrix if A is not a square matrix). Therefore, if we set L = (E_k ··· E_1)^{-1} =
E_1^{-1} ··· E_k^{-1}, then A = LU and
\[
U\mathbf{x} = E_k \cdots E_1A\mathbf{x} = E_k \cdots E_1\mathbf{b} = L^{-1}\mathbf{b}.
\]
Note that L is a lower triangular matrix whose diagonal entries are all 1's (see
Problem 1.24). Now, for any column matrix b, the system Ax = LUx = b
can be solved in two steps: first compute c = L^{-1}b, which is a forward
elimination, and then solve Ux = c by the back substitution.
This means that, to solve the ℓ systems Ax = b_i for i = 1, ..., ℓ, we
first find the matrices L and U such that A = LU by performing forward
elimination on A, and then compute c_i = L^{-1}b_i for i = 1, ..., ℓ. The
solutions of Ax = b_i are now those of Ux = c_i.
Example 1.11 Consider the system of linear equations
\[
A\mathbf{x} = \begin{bmatrix} 2 & 1 & 1 & 0\\ 4 & 1 & 0 & 1\\ -2 & 2 & 1 & 1 \end{bmatrix}
\begin{bmatrix} x_1\\ x_2\\ x_3\\ x_4 \end{bmatrix}
= \begin{bmatrix} 1\\ -2\\ 7 \end{bmatrix} = \mathbf{b}.
\]
The elementary matrices for Gaussian elimination of A are easily found
to be
\[
E_1 = \begin{bmatrix} 1 & 0 & 0\\ -2 & 1 & 0\\ 0 & 0 & 1 \end{bmatrix},\quad
E_2 = \begin{bmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 1 & 0 & 1 \end{bmatrix},\quad\text{and}\quad
E_3 = \begin{bmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 3 & 1 \end{bmatrix},
\]
so that
\[
E_3E_2E_1A = \begin{bmatrix} 2 & 1 & 1 & 0\\ 0 & -1 & -2 & 1\\ 0 & 0 & -4 & 4 \end{bmatrix} = U.
\]
Note that U is the matrix obtained from A after forward elimination, and
A = LU with
\[
L = E_1^{-1}E_2^{-1}E_3^{-1} = \begin{bmatrix} 1 & 0 & 0\\ 2 & 1 & 0\\ -1 & -3 & 1 \end{bmatrix},
\]
which is a lower triangular matrix with 1's on the diagonal. Now, the system
\[
L\mathbf{c} = \mathbf{b}: \quad
\begin{array}{rcr}
c_1 &=& 1\\
2c_1 + c_2 &=& -2\\
-c_1 - 3c_2 + c_3 &=& 7
\end{array}
\]
resolves to c = (1, −4, −4), and the system
\[
U\mathbf{x} = \mathbf{c}: \quad
\begin{array}{rcr}
2x_1 + x_2 + x_3 &=& 1\\
-x_2 - 2x_3 + x_4 &=& -4\\
-4x_3 + 4x_4 &=& -4
\end{array}
\]
resolves to
\[
\begin{bmatrix} x_1\\ x_2\\ x_3\\ x_4 \end{bmatrix}
= \begin{bmatrix} -1\\ 2 - t\\ 1 + t\\ t \end{bmatrix}
= \begin{bmatrix} -1\\ 2\\ 1\\ 0 \end{bmatrix} + t\begin{bmatrix} 0\\ -1\\ 1\\ 1 \end{bmatrix}
\]
for t ∈ R. It is suggested that the readers find the solutions for various values
of b. □
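For a square matrix, the two-step scheme is easy to program. The Python sketch below (numpy assumed; the helper name lu_no_pivot is ours, and it applies only when no row interchanges are needed, as discussed around Problem 1.24) factors A = LU and then solves Ax = b by one forward and one back substitution:

import numpy as np

def lu_no_pivot(A):
    """A = LU with L unit lower triangular, assuming no row interchanges are needed."""
    n = A.shape[0]
    L, U = np.eye(n), A.astype(float)
    for j in range(n - 1):
        for i in range(j + 1, n):
            m = U[i, j] / U[j, j]   # multiplier of the elementary row operation
            L[i, j] = m             # L records how to undo that operation
            U[i] = U[i] - m * U[j]
    return L, U

A = np.array([[2., -1., 0.], [-1., 2., -1.], [0., -1., 2.]])
L, U = lu_no_pivot(A)
print(np.allclose(L @ U, A))    # True

b = np.array([1., 0., -1.])
c = np.linalg.solve(L, b)       # forward elimination applied to b: c = L^{-1} b
x = np.linalg.solve(U, c)       # back substitution: solve Ux = c
print(np.allclose(A @ x, b))    # True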
Problem 1.23 Determine an LU decomposition of the matrix
\[
A = \begin{bmatrix} 1 & -1 & 0\\ -1 & 2 & -1\\ 0 & -1 & 2 \end{bmatrix},
\]
and then find solutions of Ax = b for (1) b = [1 1 1]^T and (2) b = [2 0 −1]^T.
Problem 1.24 Let A, B be two lower triangular matrices. Prove that
(1) their product is also a lower triangular matrix;
(2) if A is invertible, then its inverse is also a lower triangular matrix;
(3) if the diagonal entries are all 1's, then the same holds for their product and
their inverses.
Note that the same holds for upper triangular matrices, and for the product of more
than two matrices.
Now suppose that A is a nonsingular square matrix with A = LU, in
which no row interchanges were necessary. Then the pivots on the diagonal
of U are all nonzero, and the diagonal entries of L are all 1's. Thus, by dividing
each i-th row of U by its nonzero pivot d_i, the matrix U is factorized into a
diagonal matrix D whose diagonal entries are just the pivots d_1, d_2, ..., d_n and
a new upper triangular matrix, denoted again by U, whose diagonal entries are
all 1's, so that A = LDU. For example,
\[
\begin{bmatrix} d_1 & r & s\\ 0 & d_2 & t\\ 0 & 0 & d_3 \end{bmatrix}
= \begin{bmatrix} d_1 & 0 & 0\\ 0 & d_2 & 0\\ 0 & 0 & d_3 \end{bmatrix}
\begin{bmatrix} 1 & r/d_1 & s/d_1\\ 0 & 1 & t/d_2\\ 0 & 0 & 1 \end{bmatrix}.
\]
This decomposition of A is called the LDU factorization of A. Note that,
in this factorization, U is just a row-echelon form of A (with leading 1's on
the diagonal) after Gaussian elimination and before back substitution.
In Example 1.11, we found a factorization of A as
\[
A = \begin{bmatrix} 1 & 0 & 0\\ 2 & 1 & 0\\ -1 & -3 & 1 \end{bmatrix}
\begin{bmatrix} 2 & 1 & 1 & 0\\ 0 & -1 & -2 & 1\\ 0 & 0 & -4 & 4 \end{bmatrix}.
\]
This can be further factored as A = LDU by taking
\[
\begin{bmatrix} 2 & 1 & 1 & 0\\ 0 & -1 & -2 & 1\\ 0 & 0 & -4 & 4 \end{bmatrix}
= \begin{bmatrix} 2 & 0 & 0\\ 0 & -1 & 0\\ 0 & 0 & -4 \end{bmatrix}
\begin{bmatrix} 1 & 1/2 & 1/2 & 0\\ 0 & 1 & 2 & -1\\ 0 & 0 & 1 & -1 \end{bmatrix} = DU.
\]
Suppose now that during forward elimination row interchanges are nec-
essary. In this case, we can first do all the row interchanges before doing any
other type of elementary row operations, since an interchange of rows can be
done at any time, before or after the other operations, with the same effect
on the solution. Those "row-interchanging" elementary matrices altogether
form a permutation matrix P, so that no more row interchanges are needed
during Gaussian elimination of PA. So PA has an LDU factorization.
Example 1.12 Consider a square matrix A = \begin{bmatrix} 0 & -1 & 2\\ 0 & 1 & 0\\ 1 & 0 & 0 \end{bmatrix}. For Gaussian
elimination, it is clearly necessary to interchange the first row with the third
row; that is, we need to multiply A by the permutation matrix
\[
P = \begin{bmatrix} 0 & 0 & 1\\ 0 & 1 & 0\\ 1 & 0 & 0 \end{bmatrix},
\]
so that
\[
PA = \begin{bmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & -1 & 2 \end{bmatrix}
= \begin{bmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & -1 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 2 \end{bmatrix} = LDU,
\]
with U = I. □
Of course, if we choose a different permutation P', then the LDU fac-
torization of P'A may be different from that of PA, even if there is an-
other permutation matrix P'' that changes P'A to PA. However, if we fix
a permutation matrix P when it is necessary, the uniqueness of the LDU
factorization of PA can be proved.
Theorem 1.10 For an invertible matrix A, the LDU factorization of A is
unique up to a permutation: that is, for a fixed P the expression PA = LDU
is unique.

Proof: Suppose that A = L_1D_1U_1 = L_2D_2U_2, where the L's are lower
triangular, the U's are upper triangular, all with 1's on the diagonal, and
the D's are diagonal matrices with no zeros on the diagonal. We need to
show L_1 = L_2, D_1 = D_2, and U_1 = U_2.
Note that the inverse of a lower (upper) triangular matrix is also a lower
(upper) triangular matrix, and the inverse of a diagonal matrix is also
diagonal. Therefore, by multiplying (L_1D_1)^{-1} = D_1^{-1}L_1^{-1} on the left and
U_2^{-1} on the right, our equation L_1D_1U_1 = L_2D_2U_2 becomes
\[
U_1U_2^{-1} = D_1^{-1}L_1^{-1}L_2D_2.
\]
The left side is an upper triangular matrix, while the right side is a lower
triangular matrix. Hence, both sides must be diagonal. However, since the
diagonal entries of the upper triangular matrix U_1U_2^{-1} are all 1's, it must be
the identity matrix I (see Problem 1.24). Thus U_1U_2^{-1} = I, i.e., U_1 = U_2.
Similarly, L_1^{-1}L_2 = D_1D_2^{-1} implies that L_1 = L_2 and D_1 = D_2. □
In particular, if A is symmetric (i.e., A = A^T), and if it can be factored
into A = LDU without row interchanges, then we have
\[
LDU = A = A^T = (LDU)^T = U^TD^TL^T = U^TDL^T,
\]
and thus, by the uniqueness of factorizations, we have U = L^T and A =
LDL^T.
Problem 1.25 Find the factors L, D, and U for
\[
A = \begin{bmatrix} 2 & -1 & 0\\ -1 & 2 & -1\\ 0 & -1 & 2 \end{bmatrix}.
\]
What is the solution to Ax = b for b = [1 0 −1]^T?
Problem 1.26 For all possible permutation matrices P, find the LDU factorization
of PA for
\[
A = \begin{bmatrix} 1 & 2 & 3\\ 2 & 4 & 2\\ \cdots & \cdots & \cdots \end{bmatrix}.
\]
1.9 Application: Linear models
(1) In an electrical network, a simple current flow may be illustrated by a
diagram like the one below. Such a network involves only voltage sources,
like batteries, and resistors, like bulbs, motors, or refrigerators. The voltage
is measured in volts, the resistance in ohms, and the current flow in amperes
(amps, in short). For such an electrical network, current flow is governed by
the following three laws:

• Ohm's Law: The voltage drop V across a resistor is the product of
the current I and the resistance R: V = IR.

• Kirchhoff's Current Law (KCL): The current flow into a node
equals the current flow out of the node.

• Kirchhoff's Voltage Law (KVL): The algebraic sum of the voltage
drops around a closed loop equals the total voltage sources in the loop.
Example 1.13 Determine the currents in the network given in the following
figure.

[Figure: a two-loop circuit with top node P and bottom node Q. The left branch carries current I_1 through resistors of 2, 3, and 1 ohms; the middle branch carries I_2 through a 2-ohm resistor; the right branch carries I_3 through resistors of 2 and 1 ohms and an 18-volt source.]
Solution: By applying KCL to nodes P and Q, we get equations
\[
\begin{array}{ll}
I_1 + I_3 = I_2 & \text{at } P,\\
I_2 = I_1 + I_3 & \text{at } Q.
\end{array}
\]
Observe that both equations are the same, and one of them is redundant.
By applying KVL to each of the loops in the network in the clockwise direction,
we get
\[
\begin{array}{ll}
6I_1 + 2I_2 = 0 & \text{from the left loop},\\
2I_2 + 3I_3 = 18 & \text{from the right loop}.
\end{array}
\]
Collecting all the equations, we get a system of linear equations:
\[
\begin{array}{rcr}
I_1 - I_2 + I_3 &=& 0\\
6I_1 + 2I_2 &=& 0\\
2I_2 + 3I_3 &=& 18.
\end{array}
\]
By solving it, the currents are I_1 = −1 amp, I_2 = 3 amps, and I_3 =
4 amps. The negative sign for I_1 means that the current I_1 flows in the
direction opposite to that shown in the figure. □
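The 3×3 system can, of course, also be handed to a linear solver; a minimal check in Python (numpy assumed):

import numpy as np

# I1 - I2 + I3 = 0,  6 I1 + 2 I2 = 0,  2 I2 + 3 I3 = 18
A = np.array([[1., -1., 1.],
              [6.,  2., 0.],
              [0.,  2., 3.]])
b = np.array([0., 0., 18.])

print(np.linalg.solve(A, b))   # [-1.  3.  4.] (amps)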
Problem 1.27 Determine the currents in the following networks.

[Figures (a) and (b): two circuit diagrams with the indicated resistors (in ohms), voltage sources (in volts), and unknown branch currents I_1, I_2, I_3.]
(2) Cryptography is the study of sending messages in disguised form
(secret codes) so that only the intended recipients can remove the disguise
and read the message; modern cryptography uses advanced mathematics.
As another application of invertible matrices, we introduce a simple coding.
Suppose we associate a prescribed number with every letter in the alphabet;
for example,
\[
\begin{array}{ccccccccccc}
A & B & C & D & \cdots & X & Y & Z & \text{blank} & ? & !\\
0 & 1 & 2 & 3 & \cdots & 23 & 24 & 25 & 26 & 27 & 28.
\end{array}
\]
Suppose that we want to send the message "GOOD LUCK". Replace
this message by
\[
6,\ 14,\ 14,\ 3,\ 26,\ 11,\ 20,\ 2,\ 10
\]
according to the preceding substitution scheme. A code of this type could be
cracked without difficulty by a number of statistical techniques,
like the analysis of frequency of letters. To make it difficult to crack the code,
we first break the message into three vectors in R^3, each with 3 components,
by adding extra blanks if necessary:
\[
\begin{bmatrix} 6\\ 14\\ 14 \end{bmatrix},\quad
\begin{bmatrix} 3\\ 26\\ 11 \end{bmatrix},\quad
\begin{bmatrix} 20\\ 2\\ 10 \end{bmatrix}.
\]
Next, choose a nonsingular 3×3 matrix A, say
\[
A = \begin{bmatrix} 1 & 0 & 0\\ 2 & 1 & 0\\ 1 & 1 & 1 \end{bmatrix},
\]
which is supposed to be known to both sender and receiver. Then as a linear
transformation A translates our message into
\[
A\begin{bmatrix} 6\\ 14\\ 14 \end{bmatrix} = \begin{bmatrix} 6\\ 26\\ 34 \end{bmatrix},\quad
A\begin{bmatrix} 3\\ 26\\ 11 \end{bmatrix} = \begin{bmatrix} 3\\ 32\\ 40 \end{bmatrix},\quad
A\begin{bmatrix} 20\\ 2\\ 10 \end{bmatrix} = \begin{bmatrix} 20\\ 42\\ 32 \end{bmatrix}.
\]
By putting the components of the resulting vectors consecutively, we trans-
mit
\[
6,\ 26,\ 34,\ 3,\ 32,\ 40,\ 20,\ 42,\ 32.
\]
To decode this message, the receiver may follow the reverse process.
Suppose that we received the following reply from our correspondent:
\[
19,\ 45,\ 26,\ 13,\ 36,\ 41.
\]
To decode it, first break the message into two vectors in R^3 as before:
\[
\begin{bmatrix} 19\\ 45\\ 26 \end{bmatrix},\quad
\begin{bmatrix} 13\\ 36\\ 41 \end{bmatrix}.
\]
We want to find two vectors x_1, x_2 such that Ax_i is the i-th vector of the
above two vectors: i.e.,
\[
A\mathbf{x}_1 = \begin{bmatrix} 19\\ 45\\ 26 \end{bmatrix},\quad
A\mathbf{x}_2 = \begin{bmatrix} 13\\ 36\\ 41 \end{bmatrix}.
\]
Since A is invertible, the vectors x_1, x_2 can be found by multiplying the
inverse of A to the two vectors given in the message. By an easy computation,
one can find
\[
A^{-1} = \begin{bmatrix} 1 & 0 & 0\\ -2 & 1 & 0\\ 1 & -1 & 1 \end{bmatrix}.
\]
Therefore,
\[
\mathbf{x}_1 = \begin{bmatrix} 1 & 0 & 0\\ -2 & 1 & 0\\ 1 & -1 & 1 \end{bmatrix}
\begin{bmatrix} 19\\ 45\\ 26 \end{bmatrix}
= \begin{bmatrix} 19\\ 7\\ 0 \end{bmatrix},\qquad
\mathbf{x}_2 = \begin{bmatrix} 13\\ 10\\ 18 \end{bmatrix}.
\]
The numbers one obtains are
\[
19,\ 7,\ 0,\ 13,\ 10,\ 18.
\]
Using our correspondence between letters and numbers, the message we have
received is "THANKS".
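The whole scheme fits in a few lines of Python (numpy assumed; the helper names encode and decode are ours for illustration):

import numpy as np

A = np.array([[1, 0, 0],
              [2, 1, 0],
              [1, 1, 1]])

def encode(nums):
    """Multiply each consecutive triple of numbers by A."""
    v = np.array(nums).reshape(-1, 3).T      # columns are the message vectors
    return (A @ v).T.ravel()

def decode(nums):
    """Multiply each consecutive triple by A^{-1} (via a linear solve)."""
    v = np.array(nums).reshape(-1, 3).T
    return np.linalg.solve(A, v).T.ravel().round().astype(int)

print(encode([6, 14, 14, 3, 26, 11, 20, 2, 10]))   # 6 26 34 3 32 40 20 42 32
print(decode([19, 45, 26, 13, 36, 41]))            # 19 7 0 13 10 18 -> "THANKS"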
Problem 1.28 Encode “TAKE UFO ” using the same matrix A used in the above
example.
(3) Another significant application of linear algebra is to a mathematical
model in economics. In most nations, an economic society may be divided
into many sectors that produce goods or services, such as the automobile
industry, oil industry, steel industry, communication industry, and so on.
Then a fundamental problem in economics is to find the equilibrium of the
supply and the demand in the economy.
There are two kinds of demands for goods: the intermediate demand
from the industries themselves (or the sectors) that are needed as inputs for
their own production, and the extra demand from the consumer, the gov-
ernmental use, surplus production, or exports. Practically, the interrelation
between the sectors is very complicated, and the connection between the
extra demand and the production is unclear. A natural question is whether
there is a production level such that the total amounts produced (or supply)
will exactly balance the total demand for the production, so that the equality
{Total output} = {Total demand}
= {Intermediate demand} + {Extra demand}
holds. This problem can be described by a system of linear equations, which
is called the Leontief Input-Output Model. To illustrate this, we show a
simple example.
Suppose that a nation's economy consists of three sectors: I_1 = automo-
bile industry, I_2 = steel industry, and I_3 = oil industry.
Let x = [x_1 x_2 x_3]^T denote the production vector (or production level) in
R^3, where each entry x_i denotes the total amount (in a common unit such as
"dollars" rather than quantities such as "tons" or "gallons") of the output
that the industry I_i produces per year.
The intermediate demand may be explained as follows. Suppose that,
for the total output x2 units of the steel industry I2, 20% is contributed by
the output of I1, 40% by that of I2, and 20% by that of I3. Then we can
write this as a column vector, called the unit consumption vector of I2:

    c2 = [ 0.2 ]
         [ 0.4 ]
         [ 0.2 ].
For example, if I2 decides to produce 100 units per year, then it will order (or
demand) 20 units from I1, 40 units from I2, and 20 units from I3: i.e., the
consumption vector of I2 for the production x2 = 100 units can be written as
a column vector: 100c2 = [20 40 20]^T. From the concept of the consumption
vector, it is clear that the sum of the decimal fractions in the column c2 must
be <= 1.
In our example, suppose that the demands (inputs) of the outputs are
given by the following matrix, called an input-output matrix:

                         output
                      I1    I2    I3
               I1  [ 0.3   0.2   0.3 ]
    A = input  I2  [ 0.1   0.4   0.1 ]
               I3  [ 0.3   0.2   0.3 ]
                      ↑     ↑     ↑
                      c1    c2    c3
In this matrix, an industry looks down a column to see how much it needs
from where to produce its total output, and it looks across a row to see how
much of its output goes to where. For example, the second row says that,
out of the total output x2 units of the steel industry I2, as the intermediate
demand, the automobile industry I1 demands 10% of the output x1, the steel
industry I2 demands 40% of the output x2, and the oil industry I3 demands
10% of the output x3. Therefore, it is now easy to see that the intermediate
demand of the economy can be written as
    Ax = [ 0.3 0.2 0.3 ] [ x1 ]   [ 0.3x1 + 0.2x2 + 0.3x3 ]
         [ 0.1 0.4 0.1 ] [ x2 ] = [ 0.1x1 + 0.4x2 + 0.1x3 ].
         [ 0.3 0.2 0.3 ] [ x3 ]   [ 0.3x1 + 0.2x2 + 0.3x3 ]
Suppose that the extra demand in our example is given by d = [d1 d2 d3]^T =
[30 20 10]^T. Then the problem for this economy is to find the production
vector x satisfying the following equation:

    x = Ax + d.
Another form of the equation is (I - A)x = d, where the matrix I - A
is called the Leontief matrix. If I - A is not invertible, then the equation
may have no solution or infinitely many solutions, depending on what d is. If
I - A is invertible, then the equation has the unique solution x = (I - A)^{-1}d.
Now, our example can be written as

    [ x1 ]   [ 0.3 0.2 0.3 ] [ x1 ]   [ 30 ]
    [ x2 ] = [ 0.1 0.4 0.1 ] [ x2 ] + [ 20 ].
    [ x3 ]   [ 0.3 0.2 0.3 ] [ x3 ]   [ 10 ]
In this example, it turns out that the matrix I - A is invertible and

    (I - A)^{-1} = [ 2.0 1.0 1.0 ]
                   [ 0.5 2.0 0.5 ]
                   [ 1.0 1.0 2.0 ].
Therefore,

    x = (I - A)^{-1}d = [ 90 ]
                        [ 60 ]
                        [ 70 ],

which gives the total amount of output xi of each industry Ii for one year
needed to meet the required demand.
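This computation is easy to check by machine. A minimal Python sketch
(numpy assumed; nothing here is specific to the book) solves the Leontief
equation of the example directly:

    import numpy as np

    # Input-output matrix and extra demand from the example above.
    A = np.array([[0.3, 0.2, 0.3],
                  [0.1, 0.4, 0.1],
                  [0.3, 0.2, 0.3]])
    d = np.array([30.0, 20.0, 10.0])

    # Solve (I - A) x = d; solving the system is numerically preferable
    # to forming the inverse explicitly.
    x = np.linalg.solve(np.eye(3) - A, d)
    print(x)   # approximately [90. 60. 70.]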
Remark: (1) Under the usual circumstances, the sum of the entries in a
column of the consumption matrix A is less than one, because a sector should
require less than one unit's worth of inputs to produce one unit of output.
This actually implies that I - A is invertible and that the production vector x
is feasible, in the sense that the entries of x are all nonnegative, as the
following argument shows.
(2) In general, by using induction one can easily verify that for any
k = 1, 2, ...,

    (I - A)(I + A + ... + A^k) = I - A^{k+1}.

If the sums of the column entries of A are all strictly less than one, then
lim_{k→∞} A^k = 0 (see Section 6.6 for the limit of a sequence of matrices).
Thus, we get (I - A)(I + A + ... + A^k + ...) = I; that is,

    (I - A)^{-1} = I + A + ... + A^k + ... .
This also shows a practical way of computing (I - A)^{-1}, since by taking k
sufficiently large the right-hand side may be made very close to (I - A)^{-1}. In
Chapter 6, an easier method of computing A^k will be shown.
In summary, if A and d have nonnegative entries and if the sum of the
entries in each column of A is less than one, then I - A is invertible and its
inverse is given by the above formula. Moreover, as the formula shows, the
entries of the inverse are all nonnegative, and so are those of the production
vector x = (I - A)^{-1}d.
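The series can also be used numerically. A short Python sketch (numpy
assumed) approximates (I - A)^{-1} by the truncated sum I + A + ... + A^k
for the consumption matrix of our example, whose column sums are all
strictly less than one:

    import numpy as np

    A = np.array([[0.3, 0.2, 0.3],
                  [0.1, 0.4, 0.1],
                  [0.3, 0.2, 0.3]])

    # Accumulate S = I + A + A^2 + ... + A^k, building the powers incrementally.
    S = np.eye(3)
    P = np.eye(3)
    for _ in range(200):      # k = 200 terms; the tail A^{k+1} is negligible here
        P = P @ A
        S = S + P

    # The truncated series agrees with the true inverse up to rounding error.
    print(np.max(np.abs(S - np.linalg.inv(np.eye(3) - A))))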
Problem 1.29 Determine the total demand for the industries I1, I2, and I3 for the
input-output matrix A and the extra demand vector d given below:

    A = [ 0.1 0.7 0.2 ]
        [ 0.5 0.1 0.6 ],   with d = 0.
        [ 0.4 0.2 0.2 ]
Problem 1.30 Suppose that an economy is divided into three sectors: I1 = services,
I2 = manufacturing industries, and I3 = agriculture. For each unit of output, I1
demands no services from I1, 0.4 units from I2, and 0.5 units from I3. For each unit
of output, I2 requires 0.1 units from sector I1 of services, 0.7 units from other parts
in sector I2, and no product from sector I3. For each unit of output, I3 demands
0.8 units of services from I1, 0.1 units of manufacturing products from I2, and 0.1
units of its own output from I3. Determine the production level to balance the
economy when 90 units of services, 10 units of manufacturing, and 30 units of
agriculture are required as the extra demand.
1.10 Exercises
1.1. Which of the following matrices are in row-echelon form or in reduced row-
echelon form?

    [matrices garbled in source]
1.2. Find a row-echelon form of each matrix.

    (1) [matrix garbled in source]      (2) [ 1 2 3 4 5 ]
                                            [ 2 3 4 5 1 ]
                                            [ 3 4 5 1 2 ]
                                            [ 4 5 1 2 3 ]
                                            [ 5 1 2 3 4 ]
1.3. Find the reduced row-echelon form of the matrices in Exercise 1.2.
1.4. Solve the systems of equations by Gauss-Jordan elimination.

    (1) [system garbled in source]      (2) [system garbled in source]

What are the pivots in each elimination step?
1.5. Which of the following systems has a nontrivial solution?

    (1) [system garbled in source]      (2) [system garbled in source]
1.6. Determine all values of the bi that make the following system consistent:

    [system garbled in source]
1.7. Determine the condition on the bi so that the following system has no solution:

    [system garbled in source]
1.8. Let A and B be matrices of the same size.
(1) Show that, if Ax = 0 for all x, then A is the zero matrix.
(2) Show that, if Ax = Bx for all x, then A = B.
1.9. Compute ABC and CAB for

    [A and B garbled in source],   C = [ 1 1 ].
1.10. Prove that if A is a 3 x 3 matrix such that AB = BA for every 3 x 3 matrix
B, then A = kI for some constant k.
1.11. Let A = [ 1 2 0 ]
              [ 0 1 3 ]
              [ 0 0 1 ].
Find A^k for all integers k.
1.12. Compute (2A - B)C and CC^T for

    A = [ 1 0 0 ]      B = [ 1 0 0 ]      C = [ 2 1 1 ]
        [ 0 1 0 ],         [ 2 1 0 ],         [ 4 1 0 ].
        [ 1 0 1 ]          [ 0 0 1 ]          [ 2 2 1 ]
1.13. Let f(x) = an x^n + an-1 x^{n-1} + ... + a1 x + a0 be a polynomial. For any
square matrix A, the matrix polynomial f(A) is defined as f(A) = an A^n +
an-1 A^{n-1} + ... + a1 A + a0 I. For f(x) = 3x^3 + x^2 - 2x + 3, find f(A) for

    (1) A = [  2 0 0 ]      (2) [matrix garbled in source]
            [ -3 4 0 ],
            [  0 0 5 ]
1.14. Find the symmetric part and the skew-symmetric part of each of the following
matrices:

    (1) [matrix garbled in source]      (2) A = [ 1 3 4 ]
                                                [ 2 2 3 ]
                                                [ 0 0 3 ].
1.15. Find AA^T and A^T A for the matrix A = [ 1 -1 0 2 ]
                                             [ 2  1 3 1 ]
                                             [ 2  8 4 0 ].
1.16. Let A be the matrix given below:

    [matrix garbled in source]

(1) Find a matrix B such that AB = [right-hand side garbled in source].
(2) Find a matrix C such that AC = A^2 + A.
1.17. Find all possible choices of a, b, and c so that A = [ a b ]
                                                           [ 0 c ]
has an inverse matrix such that A^{-1} = A.
1.18. Decide whether or not each of the following matrices is invertible, and find
the inverses of the invertible ones.

    [matrices garbled in source]
1.19. Suppose A is a 2 x 1 matrix and B is a 1 x 2 matrix. Prove that the product
AB is not invertible.
1.20. Find three matrices which are row equivalent to

    A = [ 2 -1 3  4 ]
        [ 0  1 2 -4 ]
        [ third row garbled in source ].
1.21. Write the following systems of equations as matrix equations Ax = b and
solve them by computing A^{-1}b:

    (1)  x1 -  x2 + 3x3 = 2      (2)  x1 -  x2 +  x3 =   5
         x1 - 4x2       = 5          4x1 +  x2 -  x3 = -11
        2x1 +  x2 -  x3 = 7          4x1 - 8x2 + 2x3 =  -8.
1.22. Find the LDU factorization for each of the following matrices:

    (1) [matrix garbled in source]      (2) [matrix garbled in source]

1.23. Find the LDL^T factorization of the following symmetric matrices:

    (1) [matrix garbled in source]      (2) A = [ 1 2  3 ]
                                                [ 2 6  8 ]
                                                [ 3 8 10 ].
1.24. Solve Ax = b with A = LU, where L and U are given as

    L = [  1  0  0 ]      U = [ 1 -1  0 ]      b = [  2 ]
        [ -1  1  0 ],         [ 0  1 -1 ],         [ -3 ].
        [  0 -1  1 ]          [ 0  0  1 ]          [  4 ]

Forward elimination is the same as Lc = b, and back-substitution is Ux = c.
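A sketch of the two triangular solves described in Exercise 1.24, in Python
(numpy assumed; the function names are ours):

    import numpy as np

    def forward_substitution(L, b):
        # Solve L c = b for lower triangular L, top row first.
        c = np.zeros_like(b, dtype=float)
        for i in range(len(b)):
            c[i] = (b[i] - L[i, :i] @ c[:i]) / L[i, i]
        return c

    def back_substitution(U, c):
        # Solve U x = c for upper triangular U, bottom row first.
        x = np.zeros_like(c, dtype=float)
        for i in reversed(range(len(c))):
            x[i] = (c[i] - U[i, i+1:] @ x[i+1:]) / U[i, i]
        return x

    L = np.array([[1., 0., 0.], [-1., 1., 0.], [0., -1., 1.]])
    U = np.array([[1., -1., 0.], [0., 1., -1.], [0., 0., 1.]])
    b = np.array([2., -3., 4.])

    c = forward_substitution(L, b)   # c = [2, -1, 3]
    x = back_substitution(U, c)      # x = [4, 2, 3]
    print(x, L @ U @ x)              # check: L U x reproduces b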
1.25. For the matrix A and the vector b of Exercise 1.24:
(1) Solve Ax = b by Gauss-Jordan elimination.
(2) Find the LDU factorization of A.
(3) Write A as a product of elementary matrices.
(4) Find the inverse of A.
1.26. A square matrix A is said to be nilpotent if A^k = 0 for some positive integer k.
(1) Show that an invertible matrix is not nilpotent.
(2) Show that any triangular matrix with zero diagonal is nilpotent.
(3) Show that if A is nilpotent with A^k = 0, then I - A is invertible with
its inverse I + A + ... + A^{k-1}.
1.27. A square matrix A is said to be idempotent if A^2 = A.
(1) Find an example of an idempotent matrix other than 0 or I.
(2) Show that, if a matrix A is both idempotent and invertible, then A = I.
1.28. Determine whether the following statements are true or false, in general, and
justify your answers.
(1) Let A and B be row-equivalent square matrices. Then A is invertible if
and only if B is invertible.
(2) Let A be a square matrix such that AA = A. Then A is the identity.
(3) If A and B are invertible matrices such that A^2 = I and B^2 = I, then
(AB)^{-1} = BA.
(4) If A and B are invertible matrices, then A + B is also invertible.
(5) If A, B and AB are symmetric, then AB = BA.
(6) If A and B are symmetric and of the same size, then AB is also symmetric.
(7) Let AB^T = I. Then A is invertible if and only if B is invertible.
(8) If a square matrix A is not invertible, then neither is AB for any B.
(9) If E1 and E2 are elementary matrices, then E1E2 = E2E1.
(10) The inverse of an invertible upper triangular matrix is upper triangular.
(11) Any invertible matrix A can be written as A = LU, where L is lower
triangular and U is upper triangular.
(12) If A is invertible and symmetric, then A^{-1} is also symmetric.

Chapter 2
Determinants
2.1 Basic properties of determinant
Our primary interest in Chapter 1 was in the solvability or solutions of a
system Ax = b of linear equations. For an invertible matrix A, Theorem 1.8
shows that the system has a unique solution x = A^{-1}b for any b.
Now the question is how to decide whether or not a square matrix A
is invertible. In this section, we introduce the notion of determinant as
a real-valued function of square matrices that satisfies certain axiomatic
rules, and then show that a square matrix A is invertible if and only if the
determinant of A is not zero. In fact, we saw in Chapter 1 that a 2 x 2
matrix A = [ a b; c d ] is invertible if and only if ad - bc ≠ 0. This number is
called the determinant of A, and is defined formally as follows:
Definition 2.1 For a 2 x 2 matrix A = [ a b; c d ] ∈ M2×2(R), the deter-
minant of A is defined as det A = ad - bc.
In fact, it turns out that geometrically the determinant of a 2 x 2 matrix
A represents, up to sign, the area of the parallelogram in the xy-plane whose
edges are constructed by the row vectors of A (see Theorem 2.9), so it would
be very nice if we could carry the same idea of determinant over to higher
order matrices. However, the formula itself in Definition 2.1 does not provide
any clue of how to extend this idea of determinant to higher order matrices.
Hence, we first examine some fundamental properties of the determinant
function defined in Definition 2.1.
By a direct computation, one can easily verify that the function det in
Definition 2.1 satisfies the following three fundamental properties:
    (1) det [ 1 0 ] = 1;
            [ 0 1 ]

    (2) det [ c d ] = cb - ad = -(ad - bc) = -det [ a b ];
            [ a b ]                               [ c d ]

    (3) det [ ka + k'a'  kb + k'b' ] = (ka + k'a')d - (kb + k'b')c
            [     c          d     ]
                                     = k(ad - bc) + k'(a'd - b'c)

                                     = k det [ a b ] + k' det [ a' b' ].
                                             [ c d ]          [ c  d  ]
Actually all the important properties of the determinant function can be
derived from these three properties. We will show in Lemma 2.3 that if a
function f : M2×2(R) → R satisfies the properties (1), (2) and (3) above, then
it must be of the form f(A) = ad - bc. An advantage of looking at these
properties of the determinant rather than at the explicit formula
given in Definition 2.1 is that these three properties enable us to define the
determinant function for n x n square matrices of any size.
Definition 2.2 A real-valued function f : Mn×n(R) → R of all n x n square
matrices is called a determinant if it satisfies the following three rules:

(R1) the value of f at the identity matrix is 1, i.e., f(In) = 1;
(R2) the value of f changes sign if any two rows are interchanged;
(R3) f is linear in the first row: that is, by definition,

      [ k r1 + k' r1' ]       [ r1 ]        [ r1' ]
    f [ r2            ] = k f [ r2 ] + k' f [ r2  ],
      [ ...           ]       [ ...]        [ ... ]
      [ rn            ]       [ rn ]        [ rn  ]

where the ri's denote the row vectors of a matrix.
It has already been shown that det on 2 x 2 matrices satisfies these rules.
We will show later that for each positive integer n there always exists such
a function f : Mn×n(R) → R satisfying the three rules in the definition, and,
moreover, it is unique. Therefore, we may speak of "the" determinant and
denote it by "det" in any order.
Let us first derive some direct consequences of the three rules in the
definition (the reader is invited to verify that the det of 2 x 2 matrices also
satisfies the following properties):
Theorem 2.1 The determinant satisfies the following properties.
(1) The determinant is linear in each row; i.e., for each row the rule (R3)
also holds.
(2) If A has either a zero row or two identical rows, then det A = 0.
(3) The elementary row operation that adds a constant multiple of one row
to another row leaves the determinant unchanged.
Proof: (1) Any row can be brought to the first row with a change of sign in
the determinant by rule (R2); then apply rule (R3) and move the row back
by (R2) again.
(2) If A has a zero row, then that row is 0 times itself, so det A = 0 by
the linearity (1). If A has two identical rows, then interchanging these
identical rows changes the sign of the determinant but not the matrix A
itself. Thus we get det A = -det A, so det A = 0.
(3) By a direct computation using (1), we get

      [ ...       ]     [ ...]       [ ... ]
    f [ ri + k rj ] = f [ ri ] + k f [ rj  ],
      [ ...       ]     [ ...]       [ ... ]
      [ rj        ]     [ rj ]       [ rj  ]
      [ ...       ]     [ ...]       [ ... ]

in which the second term on the right-hand side is zero by (2). □
It is now easy to see the effect of elementary row operations on the evaluation
of the determinant. The first elementary row operation, which multiplies a
row by a constant k, changes the determinant to k times the determinant,
by (1) of Theorem 2.1. The rule (R2) in the definition explains the effect
of the elementary row operation that interchanges two rows. The last
elementary row operation, which adds a constant multiple of a row to another,
is explained in (3) of Theorem 2.1.
Example 2.1 Consider the matrix

    A = [ 1    1    1   ]
        [ a    b    c   ]
        [ b+c  c+a  a+b ].

If we add the second row to the third, then the third row becomes

    [ a+b+c  a+b+c  a+b+c ],

which is a scalar multiple of the first row. Thus, det A = 0.
Problem 2.1 Show that, for an n x n matrix A and k ∈ R, det(kA) = k^n det A.
Recall that any square matrix can be transformed to an upper triangular
matrix by forward elimination. Further properties of the determinant are
obtained in the following theorem.
Problem 2.2 Explain why det A = 0 for

    (1) A = [ a+1 a+4 a+7 ]      (2) [matrix garbled in source]
            [ a+2 a+5 a+8 ],
            [ a+3 a+6 a+9 ]
Theorem 2.2 The determinant satisfies the following properties.
(1) The determinant of a triangular matrix is the product of the diagonal
entries.
(2) The matrix A is invertible if and only if det A ≠ 0.
(3) For any two n x n matrices A and B, det(AB) = det A det B.
(4) det A^T = det A.
Proof: (1) If A is a diagonal matrix, then it is clear that det A = a11 ··· ann
by (1) of Theorem 2.1 and rule (R1). Suppose that A is a lower triangular
matrix. Then a forward elimination, which does not change the determinant,
produces a zero row if A has a zero diagonal entry, or makes A row equivalent
to the diagonal matrix D whose diagonal entries are exactly those of A if
the diagonal entries are all nonzero. Thus, in the former case, det A = 0
and the product of the diagonal entries is also zero. In the latter case,
det A = det D = a11 ··· ann. Similar arguments apply when A is an upper
triangular matrix.
(2) Note again that a forward elimination reduces a square matrix A to
an upper triangular matrix, which has a zero row if A is singular and has no
zero row if A is nonsingular (see Theorem 1.8). Since the row operations used
change the determinant only by nonzero factors, (1) shows that det A ≠ 0 if
and only if A is invertible.
(3) If A is not invertible, then AB is not invertible, and so det(AB) =
0 = det A det B. By the properties of the elementary matrices, it is clear
that for any elementary matrix E, det(EB) = det E det B. If A is invertible,
it can be written as a product of elementary matrices, say A = E1E2 ··· Ek.
Then by induction on k, we get

    det(AB) = det(E1E2 ··· EkB)
            = det E1 det E2 ··· det Ek det B
            = det(E1E2 ··· Ek) det B
            = det A det B.
(4) Clearly, A is not invertible if and only if A^T is not. Thus for a
singular matrix A we have det A^T = 0 = det A. If A is invertible, then there
is a factorization PA = LDU for a permutation matrix P. By (3), we get

    det P det A = det L det D det U.

Note that the transpose of PA = LDU is A^T P^T = U^T D^T L^T and that for any
triangular matrix B, det B = det B^T by (1). In particular, since L, U, L^T,
and U^T are triangular with 1's on the diagonal, their determinants are all
equal to 1. Therefore, we have

    det A^T det P^T = det U^T det D^T det L^T
                    = det L det D det U = det P det A.
By the definition, a permutation matrix P is obtained from the identity
matrix by a sequence of row interchanges: that is, P = Ek ··· E1 In for some
k, where each Ei is an elementary matrix obtained from the identity matrix
by interchanging two rows. Thus, det Ei = -1 for each i = 1, ..., k, and
clearly Ei^T = Ei = Ei^{-1}. Therefore, det P = (-1)^k = det P^T by (3), so
det A^T = det A. □
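Properties (3) and (4) are easy to check numerically; a quick Python sanity
check (numpy assumed, with randomly generated matrices):

    import numpy as np

    rng = np.random.default_rng(0)
    A = rng.random((4, 4))
    B = rng.random((4, 4))

    # det(AB) = det A det B and det A^T = det A, up to rounding error.
    print(np.isclose(np.linalg.det(A @ B), np.linalg.det(A) * np.linalg.det(B)))
    print(np.isclose(np.linalg.det(A.T), np.linalg.det(A)))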
Remark: From the equality det A^T = det A, we could define the determi-
nant in terms of columns instead of rows in Definition 2.2, and Theorem 2.1
also remains true with "columns" in place of "rows".
Example 2.2 Evaluate the determinant of the following matrix A:

    A = [  2 -4  0  0 ]
        [  3  0  2  1 ]
        [  0 -1  2  1 ]
        [ -4  8 -1  5 ].

Solution: By using forward elimination, A can be transformed to an
upper triangular matrix U. Since the forward elimination does not change
the determinant, the determinant of A is simply the product of the diagonal
entries of U:

    det A = det U = 2 · 6 · (7/3) · (11/2) = 154.
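The elimination procedure of this solution is itself an efficient algorithm for
determinants. A Python sketch (numpy assumed; it uses the matrix of
Example 2.2 as reconstructed above, and it assumes every pivot it meets is
nonzero):

    import numpy as np

    def det_by_elimination(A):
        # Reduce A to upper triangular form using only the row operation
        # "add a multiple of one row to another", which leaves the
        # determinant unchanged; then multiply the diagonal entries.
        U = A.astype(float).copy()
        n = len(U)
        for j in range(n):
            for i in range(j + 1, n):
                U[i] -= (U[i, j] / U[j, j]) * U[j]
        return U.diagonal().prod()

    A = np.array([[ 2, -4,  0,  0],
                  [ 3,  0,  2,  1],
                  [ 0, -1,  2,  1],
                  [-4,  8, -1,  5]])
    print(det_by_elimination(A), np.linalg.det(A))   # both give 154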
Problem 2.3 Prove that if A is invertible, then det A^{-1} = 1/det A.
Problem 2.4 Evaluate the determinant of each of the following matrices:

    (1) [matrix garbled in source]   (2) [ 11 12 13 14 ]   (3) [matrix garbled in source]
                                         [ 21 22 23 24 ]
                                         [ 31 32 33 34 ]
                                         [ 41 42 43 44 ]
2.2 Existence and uniqueness
Recall that det A = ad - bc defined in the previous section satisfies the
three rules of Definition 2.2. Conversely, the following lemma shows that
any function of M2×2(R) into R satisfying the three rules (R1)-(R3) of
Definition 2.2 must be det, which implies the uniqueness of the determinant
function on M2×2(R).
Lemma 2.3 If f : M2×2(R) → R satisfies the three rules in Definition 2.2,
then f(A) = ad - bc.

Proof: First, note that f([ 0 1; 1 0 ]) = -1 by the rules (R1) and (R2), and
that f of a matrix with two identical rows is zero, since interchanging the
two rows changes the sign of f but not the matrix. Now, by the linearity
(R3) in the first row,

    f [ a b ] = a f [ 1 0 ] + b f [ 0 1 ].
      [ c d ]       [ c d ]       [ c d ]

Expanding each term in the second row in the same way (interchange the
rows by (R2), apply (R3), and interchange back), we get

    f [ 1 0 ] = c f [ 1 0 ] + d f [ 1 0 ] = d,
      [ c d ]       [ 1 0 ]       [ 0 1 ]

    f [ 0 1 ] = c f [ 0 1 ] + d f [ 0 1 ] = -c.
      [ c d ]       [ 1 0 ]       [ 0 1 ]

Therefore, f(A) = ad + 0 + 0 - bc = ad - bc. □
"2.2, EXISTENCE AND UNIQUENESS 55
Therefore, when n = 2 there is only one function f on M2×2(R) which
satisfies the three rules: i.e., f = det.
Now for n = 3, the same calculation as in the case of n = 2 can be
applied. That is, by repeated use of the three rules (R1)-(R3) as in the
proof of Lemma 2.3, we can obtain the explicit formula for the determinant
function on M3×3(R) as follows:
    det [ a11 a12 a13 ]
        [ a21 a22 a23 ]
        [ a31 a32 a33 ]

      = det [ a11  0   0  ]       [ 0   a12  0  ]       [ 0    0  a13 ]
            [ 0   a22  0  ] + det [ 0    0  a23 ] + det [ a21  0   0  ]
            [ 0    0  a33 ]       [ a31  0   0  ]       [ 0   a32  0  ]

      + det [ a11  0   0  ]       [ 0   a12  0  ]       [ 0    0  a13 ]
            [ 0    0  a23 ] + det [ a21  0   0  ] + det [ 0   a22  0  ]
            [ 0   a32  0  ]       [ 0    0  a33 ]       [ a31  0   0  ]

      = a11a22a33 + a12a23a31 + a13a21a32 - a11a23a32 - a12a21a33 - a13a22a31.
This expression of det A for a matrix A ∈ M3×3(R) satisfies the three
rules. Therefore, for n = 3, it shows both the uniqueness and the existence
of the determinant function on M3×3(R).
Problem 2.5 Show that the above formula of the determinant for 3 x 3 matrices
satisfies the three rules in Definition 2.2.
To get the formula of the determinant for matrices of order n > 3, the
same computational process could be repeated using the three rules, but
the computation gets more complicated as the order gets higher.
To derive the explicit formula for det A of order n > 3, we examine the above
case in detail. In the process of deriving the explicit formula for det A of a
3 x 3 matrix A, we can observe the following three steps:

(1st) By using the linearity of the determinant function in each row,
det A of a 3 x 3 matrix A is expanded as the sum of the determinants of
3^3 = 27 matrices. Except for exactly six matrices, all of them have zero
columns, so that their determinants are zero (see the proof of Lemma 2.3).

(2nd) In each of these remaining six matrices, all entries are zero except
for exactly three entries that came from the given matrix A. Indeed, no two
of the three entries came from the same column or from the same row of A.
In other words, in each row there is only one entry that came from A, and
at the same time in each column there is only one entry that came from A.
Actually, in each of the six matrices, the three entries from A, say aij,
akl, and apq, are chosen as follows: if the first entry aij is chosen from the
first row and the third column of A, i.e., a13, then the other entries akl and
apq in the product must be chosen from the second or the third row and
from the first or the second column. Thus, if the second entry akl is taken from
the second row, the column it belongs to must be either the first or the
second, i.e., either a21 or a22. If a21 is taken, then the third entry apq must
be, without option, a32. Thus, the entries from A in the chosen matrix are
a13, a21, and a32. Therefore, the three entries in each of the six remaining
matrices are determined as follows: when the row indices (the first indices i
of aij) are arranged in the order 1, 2, 3, the assignment of the column indices
1, 2, 3 (the second indices j of aij) to the row indices is simply a
rearrangement of 1, 2, 3 without repetitions or omissions. In this way, one
can recognize that the number 6 = 3! is simply the number of ways in which
the three column indices 1, 2, 3 can be rearranged.
(3rd) The determinant of each of the six matrices may be computed by
converting it into a diagonal matrix using suitable "column interchanges"
(see Theorem 2.2 (1)), so its determinant becomes ±aij·akl·apq, where the
sign depends on the number of column interchanges.
For example, for the matrix having entries a13, a22, and a31 from A,
one can convert this matrix into a diagonal matrix in a couple of ways:
for instance, one can take just one interchange, of the first and the third
columns, or take three interchanges: the first and the second, then the
second and the third, and then the first and the second. In any case,

    det [ 0    0  a13 ]        [ a13  0   0  ]
        [ 0  a22   0  ] = -det [ 0   a22  0  ] = -a13a22a31.
        [ a31  0   0  ]        [ 0    0  a31 ]
Note that an interchange of two columns is the same as an interchange of
two corresponding column indices. As mentioned above, there may be sev-
eral ways of column interchanges to convert the given matrix to a diagonal
matrix. However, it is very interesting that, whatever ways of column inter-
changes we take, the parity of the number of column interchanges remains
the same all the time.
In this example, the given arrangement of the columns is expressed by
the arrangement of the column indices, which is 3, 2, 1. Thus, to arrive at the
order 1, 2, 3, which represents the diagonal matrix, we can take either just
one interchange, of 3 and 1, or three interchanges: 3 and 2, then 3 and 1, and
then 2 and 1. In either case the parity is odd, so the "-" sign in the
computation of the determinant comes from (-1)^1 = (-1)^3, where the exponents
are the numbers of interchanges of the column indices.
In summary, in the expansion of det A for A ∈ M3×3(R), the number 6 =
3! of the determinants which contribute to the computation of det A is simply
the number of ways in which the three numbers 1, 2, 3 can be rearranged without
repetitions or omissions. Moreover, the sign of each of the six determinants is
determined by the parity (even or odd) of the number of column interchanges
required to arrive at the order 1, 2, 3 from the given arrangement of the
column indices.
These observations can be used to derive the explicit formula of the deter-
minant for matrices of order n > 3. We begin with the following definition.

Definition 2.3 A permutation of the set of integers Nn = {1, 2, ..., n}
is a one-to-one function from Nn onto itself.
Therefore, a permutation σ of Nn assigns a number σ(i) in Nn to each
number i in Nn, and this permutation σ is commonly denoted by

    σ = ( 1     2     ...   n    )
        ( σ(1)  σ(2)  ...  σ(n) ).

Here, the first row is the usual lay-out of Nn as the domain set, and the
second row is just an arrangement, in a certain order without repetitions or
omissions, of the numbers in Nn as the image set. A permutation that inter-
changes only two numbers in Nn, leaving the rest of the numbers fixed, such
as σ = (3, 2, 1, ..., n), is called a transposition. Note that the composition
of any two permutations is also a permutation. Moreover, the composition
of a transposition with a permutation σ produces an interchange of two num-
bers in the permutation σ. In particular, the composition of a transposition
with itself always produces the identity permutation.
It is not hard to see that if Sn denotes the set of all permutations of Nn,
then Sn has exactly n! permutations.
Once we have listed all the permutations, the next step is to determine
the sign of each permutation. A permutation σ = (j1, j2, ..., jn) is said
to have an inversion if js > jt for s < t (i.e., a larger number precedes
a smaller number). For example, the permutation σ = (3, 1, 2) has two
inversions, since 3 precedes 1 and 2.
An inversion in a permutation can be eliminated by composing it with
a suitable transposition: for example, if σ = (3, 2, 1), with three inversions,
then by composing the transposition (2, 1, 3) with it, we get (2, 3, 1), with two
inversions, which is the same as interchanging the first two numbers 3, 2 in
σ. Therefore, given a permutation σ = (σ(1), σ(2), ..., σ(n)) in Sn, one can
convert it to the identity permutation (1, 2, ..., n), which is the only one
with no inversions, by composing it with a certain number of transpositions.
For example, by composing the three (which is the number of inversions in
σ) transpositions (2,1,3), (1,3,2) and (2,1,3) with σ = (3, 2, 1), we get the
identity permutation. However, the number of transpositions necessary to
convert a given permutation into the identity permutation need not be
unique, as we have seen in the third step. Notice that even if the number
of necessary transpositions is not unique, its parity (even or odd) is always
consistent with the parity of the number of inversions.
Recall that all we need in the computation of the determinant is just
the parity (even or odd) of the number of column interchanges, which is the
same as that of the number of inversions in the permutation of the column
indices.
A permutation is said to be even if it has an even number of inversions,
and it is said to be odd if it has an odd number of inversions. For example,
when n = 3, the permutations (1, 2, 3), (2, 3, 1) and (3, 1, 2) are even,
while the permutations (1, 3, 2), (2, 1, 3) and (3, 2, 1) are odd. In general,
for a permutation σ in Sn, the sign of σ is defined as

    sgn(σ) = {  1   if σ is an even permutation,
             { -1   if σ is an odd permutation.

It is not hard to see that the number of even permutations is equal to that
of odd permutations, so each is n!/2. In the case n = 3, one can notice that there
are 3 terms with + sign and 3 terms with - sign.
Problem 2.6 Show that the number of even permutations and the number of odd
permutations in Sn are equal.
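Counting inversions is also the easiest way to compute signs by machine. A
small Python sketch (the function name sgn is ours):

    from itertools import permutations

    def sgn(p):
        # Sign of a permutation p of (1, ..., n) via its number of inversions.
        n = len(p)
        inversions = sum(1 for s in range(n) for t in range(s + 1, n) if p[s] > p[t])
        return -1 if inversions % 2 else 1

    # The six permutations of {1, 2, 3} split evenly into even and odd ones:
    for p in permutations((1, 2, 3)):
        print(p, sgn(p))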
Now, we repeat the three steps to get an explicit formula for det A of
a square matrix A = [aij] of order n. First, the determinant det A can
be expressed as the sum of the determinants of n! matrices, each of which has
zero entries except for the n entries a1σ(1), a2σ(2), ..., anσ(n) taken from A,
where σ is a permutation of the set {1, 2, ..., n} of column indices. The n
entries a1σ(1), a2σ(2), ..., anσ(n) are chosen from A in such a way that no
two of them come from the same row or the same column. Such a matrix
can be converted to a diagonal matrix. Hence, its determinant is equal to
±a1σ(1)a2σ(2) ··· anσ(n), where the sign ± is determined by the parity of the
number of column interchanges needed to convert the matrix to a diagonal
matrix, which is equal to that of the inversions in σ: sgn(σ). Therefore, the
determinant of the matrix whose entries are all zero except for the aiσ(i)'s
is equal to

    sgn(σ) a1σ(1) a2σ(2) ··· anσ(n),

which is called a signed elementary product of A. Now, our discussions
can be summarized as follows:
Theorem 2.4 For an n x n matrix A,

    det A = Σ_{σ ∈ Sn} sgn(σ) a1σ(1) a2σ(2) ··· anσ(n).

That is, det A is the sum of all signed elementary products of A.
It is not difficult to see that this explicit formula for det A satisfies the
three rules in the definition of the determinant. Therefore, we have both
existence and uniqueness for the determinant function of square matrices of
any order n ≥ 1.
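Theorem 2.4 can be transcribed directly into code. The following Python
sketch evaluates the sum of all signed elementary products; it costs on the
order of n · n! operations, so it is a conceptual tool rather than a practical
algorithm:

    from itertools import permutations
    from math import prod

    def sgn(p):
        n = len(p)
        inversions = sum(1 for s in range(n) for t in range(s + 1, n) if p[s] > p[t])
        return -1 if inversions % 2 else 1

    def det(A):
        # det A = sum over all permutations s of sgn(s) * a_{1 s(1)} ... a_{n s(n)}.
        n = len(A)
        return sum(sgn(p) * prod(A[i][p[i] - 1] for i in range(n))
                   for p in permutations(range(1, n + 1)))

    print(det([[1, 2], [3, 4]]))                   # ad - bc = -2
    print(det([[1, 1, 1], [2, 2, 2], [0, 1, 5]]))  # two proportional rows: 0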
Example 2.3 Consider the permutation σ = (3, 4, 2, 5, 1) ∈ S5: i.e., σ(1) = 3,
σ(2) = 4, ..., σ(5) = 1. Then σ has in total 2 + 4 = 6 inversions: two
inversions caused by the position of σ(1) = 3, which precedes 1 and 2, and
four inversions in the permutation τ = (4, 2, 5, 1), which is a permutation of
the set {1, 2, 4, 5}. Thus,

    sgn(σ) = (-1)^{2+4} = (-1)^2 sgn(τ).

Note that the permutation τ can be considered as a permutation of N4 by
replacing the numbers 4 and 5 by 3 and 4, respectively.
Moreover, σ = (3, 4, 2, 5, 1) can be converted to (1, 3, 4, 2, 5) by shifting
the number 1 to the front by four transpositions, and then (1, 3, 4, 2, 5) can be
converted to the identity permutation (1, 2, 3, 4, 5) by two transpositions.
Hence, σ can be converted to the identity permutation by six transpositions. □
Problem Let A be a square matrix of order n > 1. Show that
(1) det(adj A) = (det A)^{n-1};
(2) adj(adj A) = (det A)^{n-2} A.
‘The next theorem establishes a formula for the solution of a system of n
equations in n unknowns. It is not useful as a practical method but can be
used to study properties of the solution without solving the system.
Theorem 2.8 (Cramer's rule) Let Ax = b be a system of n linear equa-
tions in n unknowns such that det A ≠ 0. Then the system has the unique
solution given by

    xj = det Cj / det A,    j = 1, ..., n,

where Cj is the matrix obtained from A by replacing the j-th column with
the column matrix b = [b1 b2 ··· bn]^T.
Proof: If det A ≠ 0, then A is invertible and x = A^{-1}b is the unique
solution of Ax = b. Since

    x = A^{-1}b = (1/det A)(adj A) b,

it follows that

    xj = (b1A1j + b2A2j + ··· + bnAnj) / det A = det Cj / det A,

where b1A1j + ··· + bnAnj is exactly the cofactor expansion of det Cj along
its j-th column.
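A direct Python transcription of Cramer's rule (numpy assumed; the
function name cramer_solve is ours), useful for checking small examples even
though it is far more expensive than elimination:

    import numpy as np

    def cramer_solve(A, b):
        # x_j = det(C_j) / det(A), where C_j is A with its j-th column
        # replaced by b; requires det(A) != 0.
        dA = np.linalg.det(A)
        x = np.empty(len(b))
        for j in range(len(b)):
            C = A.astype(float).copy()
            C[:, j] = b
            x[j] = np.linalg.det(C) / dA
        return x

    A = np.array([[2., 1.], [1., 3.]])
    b = np.array([3., 5.])
    print(cramer_solve(A, b))      # [0.8 1.4]
    print(np.linalg.solve(A, b))   # same solution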