Jin Ho Kwak · Sungpyo Hong

Linear Algebra

Birkhäuser
Boston · Basel · Berlin

Jin Ho Kwak, Sungpyo Hong
Department of Mathematics
Pohang University of Science and Technology
Pohang, The Republic of Korea

Library of Congress Cataloging-in-Publication Data
Kwak, Jin Ho, 1948-. Linear Algebra / Jin Ho Kwak, Sungpyo Hong. p. cm. Includes index.
ISBN 0-8176-3999-3 (alk. paper). ISBN 3-7643-3999-3 (alk. paper)
1. Algebras, Linear. I. Hong, Sungpyo, 1948-. II. Title.
QA188.K94 1997 512.5 97-9062

Printed on acid-free paper. © 1997 Birkhäuser Boston. Copyright is not claimed for works of U.S. Government employees. All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without prior permission of the copyright owner. Permission to photocopy for internal or personal use of specific clients is granted by Birkhäuser Boston for libraries and other users registered with the Copyright Clearance Center (CCC), provided that the base fee of $6.00 per copy, plus $0.20 per page, is paid directly to CCC, 222 Rosewood Drive, Danvers, MA 01923, U.S.A. Special requests should be addressed directly to Birkhäuser Boston, 675 Massachusetts Avenue, Cambridge, MA 02139, U.S.A.

Typeset by the authors in LaTeX. Printed and bound by Hamilton Printing, Rensselaer, NY. Printed in the U.S.A.

Preface

Linear algebra is one of the most important subjects in the study of science and engineering because of its widespread applications in social or natural science, computer science, physics, or economics. As one of the most useful courses in undergraduate mathematics, it has provided essential tools for industrial scientists. The basic concepts of linear algebra are vector spaces, linear transformations, matrices and determinants, and they serve as an abstract language for stating ideas and solving problems.

This book is based on lectures delivered over several years in a sophomore-level linear algebra course designed for science and engineering students. The primary purpose of this book is to give a careful presentation of the basic concepts of linear algebra as a coherent part of mathematics, and to illustrate its power and usefulness through applications to other disciplines. We have tried to emphasize computational skills along with the mathematical abstractions, which also have an integrity and beauty of their own. The book includes a variety of interesting applications, with many examples, not only to help students understand new concepts but also to practice wide applications of the subject to such areas as differential equations, statistics, geometry, and physics. Some of those applications may not be central to the mathematical development and may be omitted or selected in a syllabus at the discretion of the instructor. Most basic concepts and introductory motivations begin with examples in Euclidean space or solving a system of linear equations, and are gradually examined from different points of view to derive general principles.

For those students who have completed a year of calculus, linear algebra may be the first course in which the subject is developed in an abstract way, and we often find that many students struggle with the abstraction and miss the applications. Our experience is that, to understand the material, students should practice with many problems, which are sometimes omitted because of a lack of time.
To encourage the students to do repeated practice, we placed in the middle of the text not only many examples but also some carefully selected problems, with answers or helpful hints. We have tried to make this book as easily accessible and clear as possible, but certainly there may be some awkward expressions in several ways. Any criticism or comment from the readers will be appreciated.

We are very grateful to many colleagues in Korea, especially to the faculty members in the mathematics department at Pohang University of Science and Technology (POSTECH), who helped us over the years with various aspects of this book. For their valuable suggestions and comments, we would like to thank the students at POSTECH, who have used photocopied versions of the text over the past several years. We would also like to acknowledge the invaluable assistance we have received from the teaching assistants who have checked and added some answers or hints for the problems and exercises in this book. Our thanks also go to Mrs. Kathleen Roush, who made this book much more legible with her grammatical corrections in the final manuscript. Our thanks finally go to the editing staff of Birkhäuser for gladly accepting our book for publication.

Jin Ho Kwak, Sungpyo Hong
E-mail: [email protected], [email protected]
April 1997, in Pohang, Korea

"Linear algebra is the mathematics of our modern technological world of complex multivariable systems and computers."
- Alan Tucker -

"We (Halmos and Kaplansky) share a love of linear algebra. I think it is our conviction that we'll never understand infinite-dimensional operators properly until we have a decent mastery of finite matrices. And we share a philosophy about linear algebra: we think basis-free, we write basis-free, but when the chips are down we close the office door and compute with matrices like fury."
- Irving Kaplansky -

Contents

Preface

1 Linear Equations and Matrices
  1.1 Introduction
  1.2 Gaussian elimination
  1.3 Matrices
  1.4 Products of matrices
  1.5 Block matrices
  1.6 Inverse matrices
  1.7 Elementary matrices
  1.8 LDU factorization
  1.9 Application: Linear models
  1.10 Exercises

2 Determinants
  2.1 Basic properties of determinant
  2.2 Existence and uniqueness
  2.3 Cofactor expansion
  2.4 Cramer's rule
  2.5 Application: Area and Volume
  2.6 Exercises

3 Vector Spaces
  3.1 Vector spaces and subspaces
  3.2 Bases
  3.3 Dimensions
  3.4 Row and column spaces
  3.5 Rank and nullity
  3.6 Bases for subspaces
  3.7 Invertibility
  3.8 Application: Interpolation
  3.9 Application: The Wronskian
  3.10 Exercises

4 Linear Transformations
  4.1 Introduction
  4.2 Invertible linear transformations
  4.3 Application: Computer graphics
  4.4 Matrices of linear transformations
  4.5 Vector spaces of linear transformations
  4.6 Change of bases
  4.7 Similarity
  4.8 Dual spaces
  4.9 Exercises

5 Inner Product Spaces
  5.1 Inner products
  5.2 The lengths and angles of vectors
  5.3 Matrix representations of inner products
  5.4 Orthogonal projections
  5.5 The Gram-Schmidt orthogonalization
  5.6 Orthogonal matrices and transformations
  5.7 Relations of fundamental subspaces
  5.8 Least square solutions
  5.9 Application: Polynomial approximations
  5.10 Orthogonal projection matrices
  5.11 Exercises
6 Eigenvectors and Eigenvalues
  6.1 Introduction
  6.2 Diagonalization of matrices
  6.3 Application: Difference equations
  6.4 Application: Differential equations I
  6.5 Application: Differential equations II
  6.6 Exponential matrices
  6.7 Application: Differential equations III
  6.8 Diagonalization of linear transformations
  6.9 Exercises

7 Complex Vector Spaces
  7.1 Introduction
  7.2 Hermitian and unitary matrices
  7.3 Unitarily diagonalizable matrices
  7.4 Normal matrices
  7.5 The spectral theorem
  7.6 Exercises

8 Quadratic Forms
  8.1 Introduction
  8.2 Diagonalization of a quadratic form
  8.3 Congruence relation
  8.4 Extrema of quadratic forms
  8.5 Application: Quadratic optimization
  8.6 Definite forms
  8.7 Bilinear forms
  8.8 Exercises

9 Jordan Canonical Forms
  9.1 Introduction
  9.2 Generalized eigenvectors
  9.3 Computation of $e^A$
  9.4 Cayley-Hamilton theorem
  9.5 Exercises

Selected Answers and Hints

Index

Chapter 1
Linear Equations and Matrices

1.1 Introduction

One of the central motivations for linear algebra is solving systems of linear equations. We thus begin with the problem of finding the solutions of a system of $m$ linear equations in $n$ unknowns of the following form:

$$\begin{aligned} a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1 \\ a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2 \\ &\;\;\vdots \\ a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n &= b_m, \end{aligned}$$

where $x_1, x_2, \ldots, x_n$ are the unknowns and the $a_{ij}$'s and $b_i$'s denote constant (real or complex) numbers.

A sequence of numbers $(s_1, s_2, \ldots, s_n)$ is called a solution of the system if $x_1 = s_1$, $x_2 = s_2$, $\ldots$, $x_n = s_n$ satisfy each equation in the system simultaneously. When $b_1 = b_2 = \cdots = b_m = 0$, we say that the system is homogeneous.

The central topic of this chapter is to examine whether or not a given system has a solution, and to find a solution if it has one. For instance, any homogeneous system always has at least one solution $x_1 = x_2 = \cdots = x_n = 0$, called the trivial solution. A natural question is whether such a homogeneous system has a nontrivial solution. If so, we would like to have a systematic method of finding all the solutions. A system of linear equations is said to be consistent if it has at least one solution, and inconsistent if it has no solution. The following example gives us an idea of how to answer the above questions.

Example 1.1 When $m = n = 2$, the system reduces to two equations in two unknowns $x$ and $y$:

$$\begin{aligned} a_1x + b_1y &= c_1 \\ a_2x + b_2y &= c_2. \end{aligned}$$

Geometrically, each equation in the system represents a straight line when we interpret $x$ and $y$ as coordinates in the $xy$-plane. Therefore, a point $P = (x, y)$ is a solution if and only if the point $P$ lies on both lines. Hence there are three possible types of solution set:

(1) the empty set if the lines are parallel,
(2) only one point if they intersect,
(3) a straight line, i.e., infinitely many solutions, if they coincide.
The following examples and diagrams illustrate the three types.

[Figure: three diagrams of pairs of lines in the $xy$-plane, one for each case; in Case (3) the two equations, such as $x - y = 0$ and $2x - 2y = 0$, represent the same line.]

To decide whether the given system has a solution, and to find a general method of solving the system when it has a solution, we repeat here a well-known elementary method of elimination and substitution. Suppose first that the system consists of only one equation $ax + by = c$. Then the system has either infinitely many solutions (i.e., the points on the straight line $x = -\frac{b}{a}y + \frac{c}{a}$ or $y = -\frac{a}{b}x + \frac{c}{b}$, depending on whether $a \neq 0$ or $b \neq 0$), or no solutions when $a = b = 0$ and $c \neq 0$.

We now assume that the system has two equations representing two lines in the plane. Then clearly the two lines are parallel with the same slopes if and only if $a_2 = \lambda a_1$ and $b_2 = \lambda b_1$ for some $\lambda \neq 0$, or $a_1b_2 - a_2b_1 = 0$. Furthermore, the two lines either coincide (infinitely many solutions) or are distinct and parallel (no solutions) according to whether $c_2 = \lambda c_1$ holds or not.

Suppose now that the lines are not parallel, or $a_1b_2 - a_2b_1 \neq 0$. In this case, the two lines cross at a point, and hence there is exactly one solution: For instance, if the system is homogeneous, then the lines cross at the origin, so $(0,0)$ is the only solution. For a nonhomogeneous system, we may find the solution as follows: Express $x$ in terms of $y$ from the first equation, and then substitute it into the second equation (i.e., eliminate the variable $x$ from the second equation) to get

$$\left(b_2 - \frac{a_2}{a_1}b_1\right)y = c_2 - \frac{a_2}{a_1}c_1.$$

Since $a_1b_2 - a_2b_1 \neq 0$, this can be solved as

$$y = \frac{a_1c_2 - a_2c_1}{a_1b_2 - a_2b_1},$$

which is in turn substituted into one of the equations to find $x$ and give a complete solution of the system. In detail, the process can be summarized as follows:

(1) Without loss of generality, we may assume $a_1 \neq 0$, since otherwise we can interchange the two equations. Then the variable $x$ can be eliminated from the second equation by adding $-\frac{a_2}{a_1}$ times the first equation to the second, to get

$$\begin{aligned} a_1x + b_1y &= c_1 \\ \left(b_2 - \frac{a_2}{a_1}b_1\right)y &= c_2 - \frac{a_2}{a_1}c_1. \end{aligned}$$

(2) Since $a_1b_2 - a_2b_1 \neq 0$, $y$ can be found by multiplying the second equation by the nonzero number $\dfrac{a_1}{a_1b_2 - a_2b_1}$, to get

$$\begin{aligned} a_1x + b_1y &= c_1 \\ y &= \frac{a_1c_2 - a_2c_1}{a_1b_2 - a_2b_1}. \end{aligned}$$

(3) Now, $x$ is solved by substituting the value of $y$ into the first equation, and we obtain the solution to the problem:

$$x = \frac{c_1b_2 - c_2b_1}{a_1b_2 - a_2b_1}, \qquad y = \frac{a_1c_2 - a_2c_1}{a_1b_2 - a_2b_1}.$$

Note that the condition $a_1b_2 - a_2b_1 \neq 0$ is necessary for the system to have only one solution. □

In this example, we have changed the original system of equations into a simpler one using certain operations, from which we can get the solution of the given system. That is, if $(x, y)$ satisfies the original system of equations, then $x$ and $y$ must satisfy the above simpler system in (3), and vice versa. It is suggested that the readers examine a system of three equations in three unknowns, each equation representing a plane in the 3-dimensional space $\mathbb{R}^3$, and consider the various possible cases in a similar way.

Problem 1.1 For a system of three equations in three unknowns

$$\begin{aligned} a_{11}x + a_{12}y + a_{13}z &= b_1 \\ a_{21}x + a_{22}y + a_{23}z &= b_2 \\ a_{31}x + a_{32}y + a_{33}z &= b_3, \end{aligned}$$

describe all the possible types of the solution set in $\mathbb{R}^3$.
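The closed-form solution derived above is easy to turn into code. The following is a minimal sketch (ours, not the book's) of a solver built directly on the formulas of step (3); the function name and its error handling are our own choices.

```python
# A minimal sketch: solve a1*x + b1*y = c1, a2*x + b2*y = c2 using the
# formulas of step (3). Assumes the common denominator a1*b2 - a2*b1 != 0.

def solve_2x2(a1, b1, c1, a2, b2, c2):
    det = a1 * b2 - a2 * b1          # the quantity a1*b2 - a2*b1
    if det == 0:
        raise ValueError("no unique solution: the lines are parallel or coincide")
    y = (a1 * c2 - a2 * c1) / det    # formula from step (2)
    x = (c1 * b2 - c2 * b1) / det    # back substitution, step (3)
    return x, y

# x + y = 2 and x - y = 0 cross at the single point (1, 1):
print(solve_2x2(1, 1, 2, 1, -1, 0))  # (1.0, 1.0)
```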
In fact, the main operations used in Example 1.1 are the following three operations, called elementary operations: (1) multiply a nonzero constant throughout an equation, (2) interchange two equations, (8) change an equation by adding a constant multiple of another equation. 1.2. GAUSSIAN ELIMINATION 5 After applying a finite sequence of these elementary operations to the given system, one can obtain a simpler system from which the solution can be derived directly. Note also that each of the three elementary operations has its inverse operation which is also an elementary operation: (1)! divide the equation with the same nonzero constant, (2)! interchange two equations again, (8)! change the equation by subtracting the same constant multiple of the same equation. By applying these inverse operations in reverse order to the simpler system, cone can recover the original system. This means that a solution of the original system must also be a solution of the simpler one, and vice versa. ‘These arguments can be formalized in mathematical language. Observe that in performing any of these basic operations, only the coefficients of the variables are involved in the calculations and the variables 21, ..., a and the equal sign “=” are simply repeated. Thus, keeping the order of the variables and “=” in mind, we just extract the coefficients only from the equations in the given system and make a rectangular array of numbers: ay 412 Gin By G2 azn «+ Gan bp mt Gna °° mm Om ‘This matrix is Walled the augmented matrix for the system. The term ‘matriz means just any rectangular array of numbers, and the numbers in this array are called the entries of the matrix. To explain the above operations in terms of matrices, we first introduce some terminology even though in the following sections we shall study matrices in more detail. Within a matrix, the horizontal and vertical subarrays, ay 35 fan aig +++ Gin bi] and mj are called the é-th row (matrix) and the j-th column (matrix) of the aug- mented matrix, respectively. Note that the entries in the j-th column are 6 CHAPTER 1. LINEAR EQUATIONS AND MATRICES just the coefficients of j-th variable 2;, so there is a correspondence between columns of the matrix and variables of the system. Since each row of the augmented matrix contains all the information of the corresponding equation of the system, we may deal with this augmented matrix instead of handling the whole system of linear equations. ‘The elementary operations to a system of linear equations are rephrased as the elementary row operations for the augmented matrix, as follows: (1) multiply a nonzero constant throughout a row, (2) interchange two rows, (8) change a row by adding a constant multiple of another row. ‘The inverse operations are (1)! divide the row by the same constant, (2)! interchange two rows again, (3)' change the row by subtracting the same constant multiple of the other row. Definition 1:1 Two augmented matrices (or systems of linear equations) are said to be row-equivalent if one can be transformed to the'other by a finite sequence of elementary row operations. If a matrix B can be obtained from a matrix A in this way, then we can obviously recover A from B by applying the inverse elementary row operations in reverse order. 
Note again that an elementary row operation does not alter the solutions of the system, and we can formalize the above argument in the following theorem:

Theorem 1.1 If two systems of linear equations are row-equivalent, then they have the same set of solutions.

The general procedure for finding the solutions will be illustrated in the following example:

Example 1.2 Solve the system of linear equations:

$$\begin{aligned} 2y + 4z &= 2 \\ x + 2y + 2z &= 3 \\ 3x + 4y + 6z &= -1. \end{aligned}$$

Solution: We could work with the augmented matrix alone. However, to compare the operations on systems of linear equations with those on the augmented matrix, we work on the system and the augmented matrix in parallel. The associated augmented matrix of the system is

$$\begin{bmatrix} 0 & 2 & 4 & 2 \\ 1 & 2 & 2 & 3 \\ 3 & 4 & 6 & -1 \end{bmatrix}.$$

(1) Since the coefficient of $x$ in the first equation is zero while that in the second equation is not zero, we interchange these two equations:

$$\begin{aligned} x + 2y + 2z &= 3 \\ 2y + 4z &= 2 \\ 3x + 4y + 6z &= -1 \end{aligned} \qquad \begin{bmatrix} 1 & 2 & 2 & 3 \\ 0 & 2 & 4 & 2 \\ 3 & 4 & 6 & -1 \end{bmatrix}$$

(2) Add $-3$ times the first equation (row) to the third equation (row):

$$\begin{aligned} x + 2y + 2z &= 3 \\ 2y + 4z &= 2 \\ -2y &= -10 \end{aligned} \qquad \begin{bmatrix} 1 & 2 & 2 & 3 \\ 0 & 2 & 4 & 2 \\ 0 & -2 & 0 & -10 \end{bmatrix}$$

The coefficient 1 of the first unknown $x$ in the first equation (row) is called the pivot in this first elimination step. Now the second and the third equations involve only the two unknowns $y$ and $z$. Leave the first equation (row) alone, and the same elimination procedure can be applied to the second and the third equations (rows): The pivot for this step is the coefficient 2 of $y$ in the second equation (row). To eliminate $y$ from the last equation,

(3) Add 1 times the second equation (row) to the third equation (row):

$$\begin{aligned} x + 2y + 2z &= 3 \\ 2y + 4z &= 2 \\ 4z &= -8 \end{aligned} \qquad \begin{bmatrix} 1 & 2 & 2 & 3 \\ 0 & 2 & 4 & 2 \\ 0 & 0 & 4 & -8 \end{bmatrix}$$

The elimination process done so far to obtain this result is called a forward elimination: i.e., elimination of $x$ from the last two equations (rows) and then elimination of $y$ from the last equation (row). Now the pivots of the second and third rows are 2 and 4, respectively. To make these entries 1,

(4) Divide each row by the pivot of the row:

$$\begin{aligned} x + 2y + 2z &= 3 \\ y + 2z &= 1 \\ z &= -2 \end{aligned} \qquad \begin{bmatrix} 1 & 2 & 2 & 3 \\ 0 & 1 & 2 & 1 \\ 0 & 0 & 1 & -2 \end{bmatrix}$$

The resulting matrix on the right side is called a row-echelon form of the matrix, and the 1's at the leftmost nonzero entries in each row are called the leading 1's. The process so far is called a Gaussian elimination. We now want to eliminate the numbers above the leading 1's:

(5) Add $-2$ times the third row to the second and the first rows:

$$\begin{aligned} x + 2y &= 7 \\ y &= 5 \\ z &= -2 \end{aligned} \qquad \begin{bmatrix} 1 & 2 & 0 & 7 \\ 0 & 1 & 0 & 5 \\ 0 & 0 & 1 & -2 \end{bmatrix}$$

(6) Add $-2$ times the second row to the first row:

$$\begin{aligned} x &= -3 \\ y &= 5 \\ z &= -2 \end{aligned} \qquad \begin{bmatrix} 1 & 0 & 0 & -3 \\ 0 & 1 & 0 & 5 \\ 0 & 0 & 1 & -2 \end{bmatrix}$$

This matrix is called the reduced row-echelon form. The procedure to get this reduced row-echelon form from a row-echelon form is called the back substitution. The whole process to obtain the reduced row-echelon form is called a Gauss-Jordan elimination. Notice that the system corresponding to this reduced row-echelon form is row-equivalent to the original one and is essentially in solved form: i.e., the solution is $x = -3$, $y = 5$, $z = -2$. □
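The elimination of Example 1.2 can be carried out mechanically. Below is a small sketch of Gauss-Jordan elimination on the augmented matrix, written by us for illustration; it uses exact fractions, interchanges rows only when a pivot position holds a zero (as in the text), and is not a numerically robust routine.

```python
# A sketch of Gauss-Jordan elimination on an augmented matrix, using the
# three elementary row operations of Section 1.2.
from fractions import Fraction

def gauss_jordan(M):
    M = [[Fraction(x) for x in row] for row in M]
    rows, cols = len(M), len(M[0])
    r = 0
    for c in range(cols - 1):
        pivot = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if pivot is None:
            continue                          # no pivot here: free variable
        M[r], M[pivot] = M[pivot], M[r]       # operation (2): interchange
        M[r] = [x / M[r][c] for x in M[r]]    # operation (1): make leading 1
        for i in range(rows):                 # operation (3): clear the column
            if i != r and M[i][c] != 0:
                M[i] = [a - M[i][c] * b for a, b in zip(M[i], M[r])]
        r += 1
    return M

aug = [[0, 2, 4, 2], [1, 2, 2, 3], [3, 4, 6, -1]]
for row in gauss_jordan(aug):
    print([str(x) for x in row])   # reduced row-echelon form: x=-3, y=5, z=-2
```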
In general, a matrix in row-echelon form satisfies the following properties:

(1) The first nonzero entry of each row is 1, called a leading 1.
(2) A row containing only 0's should come after all rows with some nonzero entries.
(3) The leading 1's appear from left to right in successive rows. That is, the leading 1 in the lower row occurs farther to the right than the leading 1 in the higher row.

Moreover, a matrix in reduced row-echelon form satisfies

(4) Each column that contains a leading 1 has zeros everywhere else,

in addition to the above three properties.

Note that an augmented matrix has only one reduced row-echelon form, while it may have many row-echelon forms. In any case, the number of nonzero rows containing leading 1's is equal to the number of columns containing leading 1's. The variables in the system corresponding to columns with the leading 1's in a row-echelon form are called the basic variables. In general, the reduced row-echelon form $U$ may have columns that do not contain leading 1's. The variables in the system corresponding to the columns without leading 1's are called free variables. Thus the sum of the number of basic variables and that of free variables is precisely the total number of variables.

[Four example matrices are displayed here, too garbled in the scan to reproduce: the first two are in reduced row-echelon form, and the last two only in row-echelon form.]

Notice that in an augmented matrix $[A\ \mathbf{b}]$, the last column $\mathbf{b}$ does not correspond to any variable. Hence, if we consider the four matrices above as augmented matrices for some systems, then the systems corresponding to the first and the last two augmented matrices have only basic variables but no free variables. In the system corresponding to the second augmented matrix, the second and the fourth variables, $x_2$ and $x_4$, are basic, and the first and the third variables, $x_1$ and $x_3$, are free variables. These ideas will be used in later chapters.

In summary, by applying a finite sequence of elementary row operations, the augmented matrix for a system of linear equations can be changed to its reduced row-echelon form, which is row-equivalent to the original one. From the reduced row-echelon form, we can decide whether the system has a solution, and find the solution of the given system if it has one.

Example 1.3 Solve the following system of linear equations by Gauss-Jordan elimination.

$$\begin{aligned} x_1 + 3x_2 - 2x_3 &= 3 \\ 2x_1 + 6x_2 - 2x_3 + 4x_4 &= 18 \\ x_2 + x_3 + 3x_4 &= 10. \end{aligned}$$

Solution: The augmented matrix for the system is

$$\begin{bmatrix} 1 & 3 & -2 & 0 & 3 \\ 2 & 6 & -2 & 4 & 18 \\ 0 & 1 & 1 & 3 & 10 \end{bmatrix}.$$

The Gaussian elimination begins with:

(1) Adding $-2$ times the first row to the second produces

$$\begin{bmatrix} 1 & 3 & -2 & 0 & 3 \\ 0 & 0 & 2 & 4 & 12 \\ 0 & 1 & 1 & 3 & 10 \end{bmatrix}.$$

(2) Note that the coefficient of $x_2$ in the second equation is zero and that in the third equation is not. Thus, interchanging the second and the third rows produces

$$\begin{bmatrix} 1 & 3 & -2 & 0 & 3 \\ 0 & 1 & 1 & 3 & 10 \\ 0 & 0 & 2 & 4 & 12 \end{bmatrix}.$$

(3) The pivot in the third row is 2. Thus, dividing the third row by 2 produces a row-echelon form

$$\begin{bmatrix} 1 & 3 & -2 & 0 & 3 \\ 0 & 1 & 1 & 3 & 10 \\ 0 & 0 & 1 & 2 & 6 \end{bmatrix}.$$

This is a row-echelon form, and we now continue the back substitution:

(4) Adding $-1$ times the third row to the second, and 2 times the third row to the first, produces

$$\begin{bmatrix} 1 & 3 & 0 & 4 & 15 \\ 0 & 1 & 0 & 1 & 4 \\ 0 & 0 & 1 & 2 & 6 \end{bmatrix}.$$

(5) Finally, adding $-3$ times the second row to the first produces the reduced row-echelon form:

$$\begin{bmatrix} 1 & 0 & 0 & 1 & 3 \\ 0 & 1 & 0 & 1 & 4 \\ 0 & 0 & 1 & 2 & 6 \end{bmatrix}.$$

The corresponding system of equations is

$$\begin{aligned} x_1 + x_4 &= 3 \\ x_2 + x_4 &= 4 \\ x_3 + 2x_4 &= 6. \end{aligned}$$

Since $x_1$, $x_2$, and $x_3$ correspond to the columns containing leading 1's, they are the basic variables, and $x_4$ is the free variable. Thus, by solving this system for the basic variables in terms of the free variable $x_4$, we have the system of equations in a solved form:

$$\begin{aligned} x_1 &= 3 - x_4 \\ x_2 &= 4 - x_4 \\ x_3 &= 6 - 2x_4. \end{aligned}$$

By assigning an arbitrary value $t$ to the free variable $x_4$, the solutions can be written as

$$(x_1,\ x_2,\ x_3,\ x_4) = (3 - t,\ 4 - t,\ 6 - 2t,\ t)$$

for any $t \in \mathbb{R}$, where $\mathbb{R}$ denotes the set of real numbers. □
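Since the solution set of Example 1.3 is parametrized by the free variable, a quick numerical check (ours, using NumPy, not part of the book) is to verify that the parametric solution satisfies $A\mathbf{x} = \mathbf{b}$ for several values of $t$.

```python
# Verify that (3-t, 4-t, 6-2t, t) solves the system of Example 1.3
# for arbitrary values of the free variable t.
import numpy as np

A = np.array([[1, 3, -2, 0],
              [2, 6, -2, 4],
              [0, 1,  1, 3]], dtype=float)
b = np.array([3, 18, 10], dtype=float)

for t in (0.0, 1.0, -2.5):
    x = np.array([3 - t, 4 - t, 6 - 2 * t, t])
    print(t, np.allclose(A @ x, b))   # prints True for every t
```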
Remark: Consider a homogeneous system

$$\begin{aligned} a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= 0 \\ a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= 0 \\ &\;\;\vdots \\ a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n &= 0, \end{aligned}$$

with the number of unknowns greater than the number of equations: that is, $m < n$. Since a row-echelon form of its augmented matrix has at most $m$ leading 1's, at least one of the $n$ variables must be free, so such a system always has infinitely many solutions; in particular, it has a nontrivial solution.

[Several pages are missing from the scan here, covering the remainder of Section 1.2, Section 1.3 (Matrices), and the beginning of Section 1.4 (Products of matrices). The text resumes in the middle of an example on the matrix product.]

... For

$$A = \begin{bmatrix} 2 & 3 \\ 4 & 0 \end{bmatrix}, \qquad B = \begin{bmatrix} 1 & 2 & 0 \\ 5 & -1 & 0 \end{bmatrix},$$

the products of $A$ with the columns of $B$ are

$$\begin{bmatrix} 2 & 3 \\ 4 & 0 \end{bmatrix}\begin{bmatrix} 1 \\ 5 \end{bmatrix} = \begin{bmatrix} 2\cdot 1 + 3\cdot 5 \\ 4\cdot 1 + 0\cdot 5 \end{bmatrix} = \begin{bmatrix} 17 \\ 4 \end{bmatrix}, \qquad \begin{bmatrix} 2 & 3 \\ 4 & 0 \end{bmatrix}\begin{bmatrix} 2 \\ -1 \end{bmatrix} = \begin{bmatrix} 2\cdot 2 + 3\cdot(-1) \\ 4\cdot 2 + 0\cdot(-1) \end{bmatrix} = \begin{bmatrix} 1 \\ 8 \end{bmatrix},$$

$$\begin{bmatrix} 2 & 3 \\ 4 & 0 \end{bmatrix}\begin{bmatrix} 0 \\ 0 \end{bmatrix} = \begin{bmatrix} 2\cdot 0 + 3\cdot 0 \\ 4\cdot 0 + 0\cdot 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}.$$

Therefore, $AB$ is

$$AB = \begin{bmatrix} 17 & 1 & 0 \\ 4 & 8 & 0 \end{bmatrix}.$$

Since $A$ is a $2\times 2$ matrix and $B$ is a $2\times 3$ matrix, the product $AB$ is a $2\times 3$ matrix. If we concentrate, for example, on the $(2,1)$-entry of $AB$, we single out the second row from $A$ and the first column from $B$, and then we multiply corresponding entries together and add them up, i.e., $4\cdot 1 + 0\cdot 5 = 4$. □

Note that the product $AB$ of $A$ and $B$ is not defined if the number of columns of $A$ and the number of rows of $B$ are not equal.

Remark: In step (2), we could have defined the product for a $1\times n$ row matrix $A$ and an $n\times r$ matrix $B$ using the same rule defined in step (1). And then in step (3) an appropriate modification produces the same definition of the product of matrices. We suggest the readers verify this (see Example 1.6).

The identity matrix of order $n$, denoted by $I_n$ (or $I$ if the order is clear from the context), is a diagonal matrix whose diagonal entries are all 1, i.e.,

$$I_n = \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix}.$$

By a direct computation, one can easily see that $AI_n = A = I_nA$ for any $n\times n$ matrix $A$.

Many, but not all, of the rules of arithmetic for real or complex numbers also hold for matrices, with the operations of scalar multiplication, the sum and the product of matrices. The matrix $0_{m\times n}$ plays the role of the number 0, and $I_n$ that of the number 1 in the set of real numbers. The rule that does not hold for matrices in general is the commutativity $AB = BA$ of the product, while the commutativity of the matrix sum $A + B = B + A$ does hold in general. The following example illustrates the noncommutativity of the product of matrices (the particular matrices in the scan are illegible; any pair such as the following works).

Example 1.5 Let

$$A = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} \quad\text{and}\quad B = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}. \quad\text{Then}\quad AB = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}, \quad BA = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}.$$

Thus the matrices $A$ and $B$ in this example satisfy $AB \neq BA$. □

The following theorem lists some rules of ordinary arithmetic that do hold for matrix operations.

Theorem 1.4 Let $A$, $B$, $C$ be arbitrary matrices for which the matrix operations below are defined, and let $k$ be an arbitrary scalar. Then

(1) $A(BC) = (AB)C$ (written as $ABC$) (associativity),
(2) $A(B + C) = AB + AC$, and $(A + B)C = AC + BC$ (distributivity),
(3) $IA = A = AI$,
(4) $k(BC) = (kB)C = B(kC)$,
(5) $(AB)^T = B^TA^T$.

Proof: Each equality can be shown by direct calculations of each entry of both sides of the equalities. We illustrate this by proving (1) only, and leave the others to the readers. Assume that $A = [a_{ij}]$ is an $m\times n$ matrix, $B = [b_{ij}]$ is an $n\times p$ matrix, and $C = [c_{ij}]$ is a $p\times r$ matrix. We now compute the $(i,j)$-entry of each side of the equation. Note that $BC$ is an $n\times r$ matrix whose $(k,j)$-entry is $[BC]_{kj} = \sum_{l=1}^{p} b_{kl}c_{lj}$. Thus

$$[A(BC)]_{ij} = \sum_{k=1}^{n} a_{ik}[BC]_{kj} = \sum_{k=1}^{n}\sum_{l=1}^{p} a_{ik}b_{kl}c_{lj}.$$

Similarly, $AB$ is an $m\times p$ matrix with the $(i,l)$-entry $[AB]_{il} = \sum_{k=1}^{n} a_{ik}b_{kl}$, and

$$[(AB)C]_{ij} = \sum_{l=1}^{p} [AB]_{il}c_{lj} = \sum_{l=1}^{p}\sum_{k=1}^{n} a_{ik}b_{kl}c_{lj}.$$

This clearly shows that $[A(BC)]_{ij} = [(AB)C]_{ij}$ for all $i$, $j$, and consequently $A(BC) = (AB)C$, as desired. □
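The rules of Theorem 1.4, and the failure of commutativity, are easy to spot-check numerically. The following NumPy snippet is our illustration, not part of the text; the matrices are randomly generated.

```python
# Spot-check associativity, the transpose rule, and noncommutativity.
import numpy as np

rng = np.random.default_rng(0)
A = rng.integers(-5, 5, (2, 3)).astype(float)
B = rng.integers(-5, 5, (3, 4)).astype(float)
C = rng.integers(-5, 5, (4, 2)).astype(float)

print(np.allclose(A @ (B @ C), (A @ B) @ C))   # rule (1): A(BC) = (AB)C
print(np.allclose((A @ B).T, B.T @ A.T))       # rule (5): (AB)^T = B^T A^T

# Commutativity fails in general, as in Example 1.5:
A2 = np.array([[1.0, 0.0], [0.0, 0.0]])
B2 = np.array([[0.0, 1.0], [0.0, 0.0]])
print(np.array_equal(A2 @ B2, B2 @ A2))        # False
```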
Problem 1.8 Prove or disprove: If $A$ is not a zero matrix and $AB = AC$, then $B = C$.

Problem 1.9 Show that any triangular matrix $A$ satisfying $AA^T = A^TA$ is a diagonal matrix.

Problem 1.10 For a square matrix $A$, show that

(1) $AA^T$ and $A + A^T$ are symmetric,
(2) $A - A^T$ is skew-symmetric, and
(3) $A$ can be expressed as the sum of the symmetric part $B = \frac{1}{2}(A + A^T)$ and the skew-symmetric part $C = \frac{1}{2}(A - A^T)$, so that $A = B + C$.

As an application of our results on matrix operations, we shall prove the following important theorem:

Theorem 1.5 Any system of linear equations has either no solution, exactly one solution, or infinitely many solutions.

Proof: We have already seen that a system of linear equations may be written as $A\mathbf{x} = \mathbf{b}$, which may have no solution or exactly one solution. Now assume that the system $A\mathbf{x} = \mathbf{b}$ has more than one solution, and let $\mathbf{x}_1$ and $\mathbf{x}_2$ be two different solutions, so that $A\mathbf{x}_1 = \mathbf{b}$ and $A\mathbf{x}_2 = \mathbf{b}$. Let $\mathbf{x}_0 = \mathbf{x}_1 - \mathbf{x}_2 \neq \mathbf{0}$. Since $A\mathbf{x}$ is just a particular case of a matrix product, Theorem 1.4 gives us

$$A(\mathbf{x}_1 + k\mathbf{x}_0) = A\mathbf{x}_1 + kA\mathbf{x}_0 = \mathbf{b} + k(A\mathbf{x}_1 - A\mathbf{x}_2) = \mathbf{b}$$

for any real number $k$. This says that $\mathbf{x}_1 + k\mathbf{x}_0$ is also a solution of $A\mathbf{x} = \mathbf{b}$ for any $k$. Since there are infinitely many choices for $k$, $A\mathbf{x} = \mathbf{b}$ has infinitely many solutions. □

Problem 1.11 For which values of $a$ does each of the following systems have no solution, exactly one solution, or infinitely many solutions?

…

1.5 Block matrices

In this section we introduce some techniques that will often be very helpful in manipulating matrices. A submatrix of a matrix $A$ is a matrix obtained from $A$ by deleting certain rows and/or columns of $A$. Using a system of horizontal and vertical lines, we can partition a matrix $A$ into submatrices, called blocks, of $A$ as follows: Consider a matrix

$$A = \left[\begin{array}{ccc|c} a_{11} & a_{12} & a_{13} & a_{14} \\ a_{21} & a_{22} & a_{23} & a_{24} \\ \hline a_{31} & a_{32} & a_{33} & a_{34} \end{array}\right],$$

divided up into four blocks by the dotted lines shown. Now, if we write

$$A_{11} = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{bmatrix}, \quad A_{12} = \begin{bmatrix} a_{14} \\ a_{24} \end{bmatrix}, \quad A_{21} = [\,a_{31}\ a_{32}\ a_{33}\,], \quad A_{22} = [\,a_{34}\,],$$

then $A$ can be written as

$$A = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix},$$

called a block matrix.

The product of matrices partitioned into blocks also follows the matrix product formula, as if the $A_{ij}$ were numbers:

$$AB = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}\begin{bmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{bmatrix} = \begin{bmatrix} A_{11}B_{11} + A_{12}B_{21} & A_{11}B_{12} + A_{12}B_{22} \\ A_{21}B_{11} + A_{22}B_{21} & A_{21}B_{12} + A_{22}B_{22} \end{bmatrix},$$

provided that the number of columns in $A_{ik}$ is equal to the number of rows in $B_{kj}$. This will be true only if the columns of $A$ are partitioned in the same way as the rows of $B$.

It is not hard to see that the matrix product by blocks is correct. Suppose, for example, that we have a $3\times 3$ matrix $A$ and partition it as

$$A = \left[\begin{array}{cc|c} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ \hline a_{31} & a_{32} & a_{33} \end{array}\right] = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix},$$
Hence, the row vectors of AB are the products of the row vectors of A and B. 4 CHAPTER 1. LINEAR EQUATIONS AND MATRICES Problem 1.12 Compute AB using block multiplication, where La}1-o vol? ae[3He f) ae[2 us ola 3-2/1 1.6 Inverse matrices ‘As we saw in Section 1.4, a system of linear equations can be written as Ax = b in matrix form. This form resembles one of the simplest linear equation in one variable ax = 6 whose solution is simply « = a~'b when a #0. Thus it is tempting to write the solution of the system as x = A~'b, However, in the case of matrices we first have to have a precise meaning of 71. To discuss this we begin with the following definition. Definition 1.7 For an m xn matrix A, an n x m matrix B is called a left inverse of A if BA = In, and ann x m matrix C is called a right inverse of A if AC = Im. Example 1.7 From a direct calculation for two matrices 1-3 a=[po apes] 8], eee -5 2 -4 we have AB=Ip,andBA=| 9 -2 6/|#4h. 2-4 9 ‘Thus, the matrix B is a right inverse but not a left inverse of A, while A is a left inverse but not a right inverse of B. Since (AB)™ = BT AT and IT =1, a matrix A has a right inverse if and only if AT has a left inverse. a ‘However, if A is a square matrix and has a left inverse, then we prove later (Theorem 1.8) that it has also a right inverse, and vice versa. Moreover, the following lemma shows that the left inverses and the right inverses of square matrix are all equal. (This is not true for nonsquare matrices, of course), Lemma 1.6 If an nx n square matriz A has a left inverse B and a right inverse C, then B and C are equal, i.e., B=C. 1.6. INVERSE MATRICES 25 Proof: A direct calculation shows that B= BI=B(AC)=(BA)C=IC=C. Now any two left inverses must be both equal to a right inverse C, and hence to each other, and any two right inverses must be both equal to a left inverse B, and hence to each other. So there exist only one left and only one right inverse for a square matrix A ifit is known that A has both left and right inverses. Furthermore, the left and right inverses are equal. a ‘This theorem says that if a matrix A has both a right inverse and a left inverse, then they must be the same. However, we shall see in Chapter 3 that any mxn matrix A with m # n cannot have both a right inverse and a left inverse: that is, a nonsquare matrix may have only a left inverse or only a right inverse. In this case, the matrix may have many left inverses or many right inverses. 10 Example 1.8 A nonsquare matzix A= | 0 1 | can have more than one 00 left inverse. In fact, for any 2, y € R, one can essily check that the matrix a={5 ° 5 | ma tense of 4 a oly Definition 1.8 An n x n square matrix A is said to be invertible (or nonsingular) if there exists a square matrix B of the same size such that AB=I=BA. ‘Such a matrix B is called the inverse of A, and is denoted by A“. A matrix Ais said to be singular if it is not invertible. Note that Lemma 1.6 shows that if a square matrix A has both left and right inverses, then it must be unique. That is why we call B “the” inverse of A. For instance, consider a 2x2 matrix A= [ ° Fl Tf ad —be #0, then it is easy to verify that . b ad — be a} A wel ad 26 CHAPTER 1. LINEAR EQUATIONS AND MATRICES since AA~! = Ip = A“1A. (Check this product of matrices for practice!) Note that any zero matrix is singular. Problem 1.19 Let A be an invertible matrix and k any nonzero scalar. Show that (1) A“? is invertible and (A7?)-! = A; (2) the matrix kA is invertible and (kA)? = 247; (3) AP is invertible and (AT)? = (A~¥)?. 
‘Theorem 1.7 The product of invertible matrices is also invertible, whose inverse is the product of the individual inverses in reverse order: (AB)*! = BOA"), Proof: Suppose that A and B are invertible matrices of the same size. ‘Then (AB)(B-1A-!) = A(BB7) A? = ATA“! = AA! = I, and similarly (B-1A-)(AB) = I. Thus AB has the inverse B-1A~}, a ‘We have written the inverse of A as “A to the power —1”, so we can give the meaning of A* for any integer k: Let A be a square matrix. Define ‘A® = I. Then, for any positive integer k, we define the power A* of A inductively as AP = A(AE Moreover, if A is invertible, then the negative integer power is defined as A*= (AF fork >0. It is easy to check that with these rules we have AM! = A‘ A® whenever the right hand side is defined. (If A is not invertible, A°()) is defined but A“ is not.) Problem 1.14 Prove: (1) If A has a zero row, s0 does AB. (2) If B has a zero column, so does AB. (3) Any matrix with a zero row or a zero column cannot be invertible Problem 1.15 Let A be an invertible matrix. Is it true that (A*)" = (AT) for any integer k? Justify your answer. 1.7. ELEMENTARY MATRICES 7 1.7 Elementary matrices ‘We now return to the system of linear equations Ax = b. If A has a right inverse B such that AB = In, then x = Bb is a solution of the system since Ax = A(Bb) = (AB)b = In particular, if A is an invertible square matrix, then it has only one inverse A7! by Lemma 1.6, and x = A~b is the only solution of the system. In this section, we discuss how to compute A~? when A is invertible. Recall that Gaussian elimination is a process in which the augmented matrix is transformed into its row-echelon form by a finite number of ele- mentary row operations. In the following, we will show that each elementary row operation can be expressed as a nonsingular matrix, called an elementary matriz, and hence the process of Gaussian elimination is simply multiplying a finite sequence of corresponding elementary matrices to the augmented matrix. Definition 1.9 A matrix B obtained from the identity matrix In by exe- cuting only one elementary row operation is called an elementary matrix. For example, the following matrices are three elementary matrices cor- responding to each type of the three elementary row operations. () [ é 3 ] : multiply the second row of Ip by —5; @) interchange the second and the fourth rows of Is; (3) add 3 times the third row to the first row of Js. How once It is an interesting fact that, if E is an elementary matrix obtained by executing a certain elementary row operation on the identity matrix*Im, then for any m x n matrix A, the product EA is exactly the matrix that is obtained when the same elementary row operation in E is executed on A. ‘The following example illustrates this argument, (Note that AI is not what we want. For this, see Problem 1.17). 28 CHAPTER 1. LINEAR EQUATIONS AND MATRICES Example 1.9 For simplicity, we work on a 3x1 column matrix b. Suppose ‘that we want to do the operation “adding (—2) x the first row to the second row” on matrix b. Then, we execute this operation on the identity matrix J first to got an clementary matrix E: 100 B=|-21 01]. 
1.7 Elementary matrices

We now return to the system of linear equations $A\mathbf{x} = \mathbf{b}$. If $A$ has a right inverse $B$ such that $AB = I_m$, then $\mathbf{x} = B\mathbf{b}$ is a solution of the system, since

$$A\mathbf{x} = A(B\mathbf{b}) = (AB)\mathbf{b} = \mathbf{b}.$$

In particular, if $A$ is an invertible square matrix, then it has only one inverse $A^{-1}$ by Lemma 1.6, and $\mathbf{x} = A^{-1}\mathbf{b}$ is the only solution of the system. In this section, we discuss how to compute $A^{-1}$ when $A$ is invertible.

Recall that Gaussian elimination is a process in which the augmented matrix is transformed into its row-echelon form by a finite number of elementary row operations. In the following, we will show that each elementary row operation can be expressed as a nonsingular matrix, called an elementary matrix, and hence the process of Gaussian elimination is simply the multiplication of a finite sequence of corresponding elementary matrices into the augmented matrix.

Definition 1.9 A matrix obtained from the identity matrix $I_n$ by executing only one elementary row operation is called an elementary matrix.

For example, the following matrices are three elementary matrices corresponding to each type of the three elementary row operations:

(1) $\begin{bmatrix} 1 & 0 \\ 0 & -5 \end{bmatrix}$: multiply the second row of $I_2$ by $-5$;

(2) $\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \end{bmatrix}$: interchange the second and the fourth rows of $I_4$;

(3) $\begin{bmatrix} 1 & 0 & 3 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$: add 3 times the third row to the first row of $I_3$.

It is an interesting fact that, if $E$ is an elementary matrix obtained by executing a certain elementary row operation on the identity matrix $I_m$, then for any $m\times n$ matrix $A$, the product $EA$ is exactly the matrix that is obtained when the same elementary row operation in $E$ is executed on $A$. The following example illustrates this argument. (Note that $AE$ is not what we want. For this, see Problem 1.17.)

Example 1.9 For simplicity, we work on a $3\times 1$ column matrix $\mathbf{b}$. Suppose that we want to do the operation "adding $(-2)\times$ the first row to the second row" on the matrix $\mathbf{b}$. Then, we execute this operation on the identity matrix $I_3$ first to get an elementary matrix $E$:

$$E = \begin{bmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}.$$

Multiplying the elementary matrix $E$ to $\mathbf{b}$ on the left produces the desired result:

$$E\mathbf{b} = \begin{bmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}\begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 - 2b_1 \\ b_3 \end{bmatrix}.$$

Similarly, the operation "interchanging the first and third rows" on the matrix $\mathbf{b}$ can be achieved by multiplying a permutation matrix $P$, which is an elementary matrix obtained from $I_3$ by interchanging two rows, to $\mathbf{b}$ on the left:

$$P\mathbf{b} = \begin{bmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{bmatrix}\begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix} = \begin{bmatrix} b_3 \\ b_2 \\ b_1 \end{bmatrix}. \qquad\Box$$

Recall that each elementary row operation has an inverse operation, which is also an elementary operation, that brings the matrix back to the original one. Thus, suppose that $E$ denotes an elementary matrix corresponding to an elementary row operation, and let $E'$ be the elementary matrix corresponding to the "inverse" elementary row operation of that in $E$. Then,

(1) if $E$ multiplies a row by $c \neq 0$, then $E'$ multiplies the same row by $\frac{1}{c}$;
(2) if $E$ interchanges two rows, then $E'$ interchanges them again;
(3) if $E$ adds a multiple of one row to another, then $E'$ subtracts it back from the same row.

Thus, for any $m\times n$ matrix $A$, $E'EA = A$, and $E'E = I = EE'$. That is, every elementary matrix is invertible, so that $E^{-1} = E'$, which is also an elementary matrix. For instance, if

$$E = \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \quad\text{then}\quad E^{-1} = \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix} = E.$$

Definition 1.10 A permutation matrix is a square matrix obtained from the identity matrix by permuting the rows.

Problem 1.16 Prove:

(1) A permutation matrix is the product of a finite number of elementary matrices, each of which corresponds to the "row-interchanging" elementary row operation.
(2) Any permutation matrix $P$ is invertible and $P^{-1} = P^T$.
(3) The product of any two permutation matrices is a permutation matrix.
(4) The transpose of a permutation matrix is also a permutation matrix.

Problem 1.17 Define the elementary column operations for a matrix by just replacing "row" by "column" in the definition of the elementary row operations. Show that if $A$ is an $m\times n$ matrix and if $E$ is an elementary matrix obtained by executing an elementary column operation on $I_n$, then $AE$ is exactly the matrix that is obtained from $A$ when the same column operation is executed on $A$.
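The correspondence between elementary row operations and elementary matrices is easy to demonstrate in code. In this sketch (ours), the helper functions build $E$ from the identity matrix, and left multiplication by $E$ performs the operations of Example 1.9; the final line checks that the inverse of $E$ is the elementary matrix of the inverse operation.

```python
# Elementary matrices built from the identity, as in Example 1.9.
import numpy as np

def row_add(n, i, j, c):          # add c times row j to row i (i != j)
    E = np.eye(n); E[i, j] = c
    return E

def row_swap(n, i, j):            # interchange rows i and j
    E = np.eye(n); E[[i, j]] = E[[j, i]]
    return E

b = np.array([[1.0], [2.0], [3.0]])
E = row_add(3, 1, 0, -2.0)        # add (-2) x row 1 to row 2
P = row_swap(3, 0, 2)             # interchange rows 1 and 3
print((E @ b).ravel())            # [1. 0. 3.]
print((P @ b).ravel())            # [3. 2. 1.]
print(np.allclose(np.linalg.inv(E), row_add(3, 1, 0, 2.0)))  # E^{-1} = E'
```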
The next theorem establishes some fundamental relationships between $n\times n$ square matrices and systems of $n$ linear equations in $n$ unknowns.

Theorem 1.8 Let $A$ be an $n\times n$ matrix. The following are equivalent:

(1) $A$ has a left inverse;
(2) $A\mathbf{x} = \mathbf{0}$ has only the trivial solution $\mathbf{x} = \mathbf{0}$;
(3) $A$ is row-equivalent to $I_n$;
(4) $A$ is a product of elementary matrices;
(5) $A$ is invertible;
(6) $A$ has a right inverse.

Proof: (1) ⇒ (2): Let $\mathbf{x}$ be a solution of the homogeneous system $A\mathbf{x} = \mathbf{0}$, and let $B$ be a left inverse of $A$. Then

$$\mathbf{x} = I_n\mathbf{x} = (BA)\mathbf{x} = B(A\mathbf{x}) = B\mathbf{0} = \mathbf{0}.$$

(2) ⇒ (3): Suppose that the homogeneous system $A\mathbf{x} = \mathbf{0}$ has only the trivial solution $\mathbf{x} = \mathbf{0}$. This means that the augmented matrix $[A\ \mathbf{0}]$ of the system $A\mathbf{x} = \mathbf{0}$ is reduced to $[I_n\ \mathbf{0}]$ by Gauss-Jordan elimination. Hence, $A$ is row-equivalent to $I_n$.

(3) ⇒ (4): Assume $A$ is row-equivalent to $I_n$, so that $A$ can be reduced to $I_n$ by a finite sequence of elementary row operations. Thus, we can find elementary matrices $E_1, E_2, \ldots, E_k$ such that

$$E_k\cdots E_2E_1A = I_n.$$

Since $E_1, E_2, \ldots, E_k$ are invertible, by multiplying both sides of this equation on the left successively by $E_k^{-1}, \ldots, E_2^{-1}, E_1^{-1}$, we obtain

$$A = E_1^{-1}E_2^{-1}\cdots E_k^{-1}I_n = E_1^{-1}E_2^{-1}\cdots E_k^{-1},$$

which expresses $A$ as the product of elementary matrices.

(4) ⇒ (5) is trivial, because any elementary matrix is invertible. In fact, $A^{-1} = E_k\cdots E_2E_1$.

(5) ⇒ (1) and (5) ⇒ (6) are trivial.

(6) ⇒ (5): If $B$ is a right inverse of $A$, then $A$ is a left inverse of $B$, and we can apply (1) ⇒ (2) ⇒ (3) ⇒ (4) ⇒ (5) to $B$ and conclude that $B$ is invertible, with $A$ as its unique inverse. That is, $B$ is the inverse of $A$, and so $A$ is invertible. □

This theorem shows that a square matrix is invertible if it has a one-sided inverse. In particular, if a square matrix $A$ is invertible, then $\mathbf{x} = A^{-1}\mathbf{b}$ is the unique solution to the system $A\mathbf{x} = \mathbf{b}$.

Problem 1.18 Find the inverse of the product

$$\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -c & 1 \end{bmatrix}\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ -b & 0 & 1 \end{bmatrix}\begin{bmatrix} 1 & 0 & 0 \\ -a & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}.$$
0 Problem 1.22 Write the system of linear equations t+ dy + 2% = 10 Qe - ty + Be 1 40 - By + be = 4 in matrix form Ax = b and solve it by finding A~*b, 1.8 LDU factorization Recall that a basic method of solving a linear system Ax = b is by Gauss- Jordan elimination, For a fixed matrix A, if we want to solve more than one system Ax = b for various values of b, then the same Gaussian elimination on A has to be repeated over and over again. However, this repetition may be avoided by expressing Gaussian elimination as an invertible matrix which is a product of elementary matrices. ‘We first: assume that no permutations of rows are necessary throughgut the whole process of Gaussian elimination on [A b]. Then the forward elim- ination is just to multiply finitely many elementary matrices Bx, ..., Ey to the augmented matrix [4 b]: that is, [Ek---B,A Bes Eb] =(U e}, 34 CHAPTER 1. LINEAR EQUATIONS AND MATRICES where each J; is a lower triangular elementary matrix whose diagonal entries are all 1’s and [U ¢] is the augmented matrix of the system obtained after forward elimination on Ax = b (Note that U need not be an upper triangular matrix if A is not a square matrix). Therefore, if we set L = (Ex-+- Ex)! = Ej}.--Ej!, then A= LU and Ux = By: ByAx = Ey: Eyb= Lb. Note that L is a lower triangular matrix whose diagonal entries are all 1’s (see Problem 1.24). Now, for any column matrix b, the system Ax = LUx = b ‘can be solved in two steps: first compute ¢ = L-'b which is a forward elimination, and then solve Ux = c by the back substitution. ‘This means that, to solve the Zsystems Ax = b; fori =1, ..., & we first find the matrices L and U such that A = LU by performing forward elimination on A, and then compute ¢; = L~*b; for i = 1,...,€ The solutions of Ax = by are now those of Ux = ¢.. ‘Example 1.11 Consider the system of linear equations 2110][a fl Aéx=| 410 1]|a]=|-2]=b. -2211)|45 7 ‘The elementary matrices for Gaussian elimination of A are easily found to be 100 100 100 By=|-210],@=|010],ad m=|01 0], 001 101 031 so that 2110 EsERA=|0 -1 -2 1) = 0 0-44 Note that U is the matrix obtained from A after forward elimination, and A= LU with 1 00 L=E BB =| 2 10], -1-31 which is a lower triangular matrix with 1’s on the diagonal, Now, the system o 1 Ie=b: 4 2a + @ -2 =a - 3 7 1.8. LDU FACTORIZATION 35 resolves to ¢ = (1,—4,—4) and the system Qe + m2 + Ux=e: =m — ly + = 4g + dy resolves to -1+ #]) fo 1 24 3 2 3 1-¢t aft} aye t 0 1 for t € R. It is suggested that the readers find the solutions for various values of b. a Problem 1.29 Determine an LU decomposition of the matrix 1-1 0 A=|- 2-1], o-1 2 and then find solutions of Ax = b for (1) b= [11 1]? and (2) b= (20 — 1]? Problem 1.24 Let A, B be two lower triangular matrices. Prove that (1) their product is also a lower triangular matrix; (2) if A is invertible, then its inverse is also a lower triangular matrix; (3) ifthe diagonal entries are all 1’s then the same holds for their product and their inverses. Note that the same holds for upper triangular matrices, and for the product of more than two matrices. Now suppose that A is a nonsingular square matrix with A = LU in which no row interchanges were necessary. Then the pivots on the diagonal of U are all nonzero, and the diagonal of L are all 1's. 
Thus, by dividing each i-th row of U by the nonzero pivot dj, the matrix U is factorized into a diagonal matrix D whose diagonals are just the pivots di, dz, ..., dy and ‘anew upper triangular matrix, denoted again by U, whose diagonals are all V's so that A= LDU. For example, aris 4 0 1 t/a s/dy Od t 0d o 1 t/da ce o. ufdaa 0 dy 0 oO 1 36 CHAPTER 1. LINEAR EQUATIONS AND MATRICES ‘This decomposition of A is called the LDU factorization of A. Note that, in this factorization, U is just a row-echelon form of A (with leading 1’s on the diagonal) after Gaussian elimination and before back substitution. In Example 1.11, we found a factorization of A as R OOo 1 A 2 10} {0-1 -2 -1 -3 1] [0 0 -4 ‘This can be further factored as A= LDU by taking 211 2 0 ojf1 12 1/2 0-1 -2}=/0 -1 oj]jo 1 2 |=ou. 0 0 -4 0 o-4Jlo o 1 Suppose now that during forward elimination row interchanges are nec- essary. In this case, we can first do all the row interchanges before doing any other type of elementary row operations, since the interchange of rows can be done at any time, before or after the other operations, with the same effect on the solution. Those “row-interchanging” elementary matrices altogether form a permutation matrix P so that no more row interchanges are needed during Gaussian elimination of PA. So PA has an LDU factorization. o1 Example 1.12 Consider a square matrix A= | 0 1 For Gaussian 10 elimination, it is clearly necessary to interchange the first row with the third o01 row, that is, we need to multiply the permutation matrix P = | o10 100 to A so that 100 10 O}fi00 PA=|010/=|0 1 0|/}o01 0]=u. OH o-11j[oo2 o Of course, if we choose a different permutation P’, then the LDU fac- torization of P’A may be different from that of PA, even if there is an- other permutation matrix P" that changes P’A to PA. However, if we fix a permutation matrix P when it is necessary, the uniqueness of the LDU factorization of A can be proved. 1.8, LDU FACTORIZATION 37 ‘Theorem 1.10 For an invertible matriz A, the LDU factorization of A is unique up to a permutation: that és, for a fized P the expression PA = LDU is unique Proof: Suppose that A = LDU; = [2D2U, where the L’s are lower triangular, the U's are upper triangular, all with 1's on the diagonal, and the D’s are diagonal matrices with no zeros on the diagonal. We need to show Ly = Lz, D; = Dz, and U; = Us, Note that the inverse of a lower (upper) triangular matrix is also a lower (upper) triangular matrix. And the inverse of a diagonal matrix is also diagonal. Therefore, by multiplying (L;D,)~ = Dy*L;! on the left and Uz? on the right, our equation L1D,U; = LyD2U2 becomes UUs} = Dz! Lz1L2D2 ‘The left side is an upper triangular matrix, while the right side is a lower ‘triangular matrix. Hence, both sides must be diagonal. However, since the diagonal entries of the upper triangular matrix U;Uz" are all 1's, it must be the identity matrix I (see Problem 1.24). Thus U,Uz? = I, ie, Ui = Us. Similarly, L7!Zg = D,D3" implies that Ly = Ly and D; = Da, o In particular, if A is symmetric (i.e., A= AT), and if it can be factored into A= LDU without row interchanges, then we have LDU = A= A? = (LDU)? =UTDTLT =UT DL, and thus, by the uniqueness of factorizations, we have U = LT and A = LDI7. 2-10 Problem 1.25 Find the factors L,D, and U for A= [ -l 2-1 ] Ot 2 ‘What is the solution to Ax = b for b= (10 —1]"? Problem 1.26 For all possible permutation matrices P, find the LDU factorizatjon 123 of PAforA=|2 4 2]. ha 38 CHAPTER 1. 
1.9 Application: Linear models

(1) In an electrical network, a simple current flow may be illustrated by a diagram like the one below. Such a network involves only voltage sources, like batteries, and resistors, like bulbs, motors, or refrigerators. The voltage is measured in volts, the resistance in ohms, and the current flow in amperes (amps, in short).

[Figure: a two-loop network with an 18-volt source and branch currents $I_1$, $I_2$, $I_3$ flowing through resistors of 1, 2, and 3 ohms, joined at nodes $P$ and $Q$.]

For such an electrical network, current flow is governed by the following three laws:

Ohm's Law: The voltage drop $V$ across a resistor is the product of the current $I$ and the resistance $R$: $V = IR$.

Kirchhoff's Current Law (KCL): The current flow into a node equals the current flow out of the node.

Kirchhoff's Voltage Law (KVL): The algebraic sum of the voltage drops around a closed loop equals the total voltage sources in the loop.

Example 1.13 Determine the currents in the network given in the figure above.

Solution: By applying KCL to the nodes $P$ and $Q$, we get the equations

$$I_1 + I_3 = I_2 \ \text{ at } P, \qquad I_2 = I_1 + I_3 \ \text{ at } Q.$$

Observe that both equations are the same, and one of them is redundant. By applying KVL to each of the loops in the network in the clockwise direction, we get

$$\begin{aligned} 6I_1 + 2I_2 &= 0 \quad \text{from the left loop,} \\ 2I_2 + 3I_3 &= 18 \quad \text{from the right loop.} \end{aligned}$$

Collecting all the equations, we get a system of linear equations:

$$\begin{aligned} I_1 - I_2 + I_3 &= 0 \\ 6I_1 + 2I_2 &= 0 \\ 2I_2 + 3I_3 &= 18. \end{aligned}$$

By solving it, the currents are $I_1 = -1$ amp, $I_2 = 3$ amps, and $I_3 = 4$ amps. The negative sign for $I_1$ means that the current $I_1$ flows in the direction opposite to that shown in the figure. □

Problem 1.27 Determine the currents in the following networks:

[Figure: two networks, (a) driven by a 20-volt source and (b) by 5-volt and 40-volt sources, with resistors including 2-ohm and 4-ohm elements; some values are illegible in the scan.]
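The $3\times 3$ system of Example 1.13 can of course be handed to a linear solver; the following NumPy check is ours, not the book's.

```python
# Solve the network equations of Example 1.13.
import numpy as np

A = np.array([[1.0, -1.0, 1.0],    # I1 - I2 + I3 = 0   (KCL at node P)
              [6.0,  2.0, 0.0],    # 6*I1 + 2*I2  = 0   (left loop, KVL)
              [0.0,  2.0, 3.0]])   # 2*I2 + 3*I3  = 18  (right loop, KVL)
b = np.array([0.0, 0.0, 18.0])

I1, I2, I3 = np.linalg.solve(A, b)
print(I1, I2, I3)                  # -1.0 3.0 4.0 (amps)
```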
(2) Cryptography is the study of sending messages in disguised form (secret codes) so that only the intended recipients can remove the disguise and read the message; modern cryptography uses advanced mathematics. As another application of invertible matrices, we introduce a simple coding. Suppose we associate a prescribed number with every letter in the alphabet; for example,

A B C D ... X Y Z blank ? !
0 1 2 3 ... 23 24 25 26 27 28

Suppose that we want to send the message "GOOD LUCK". Replace this message by

6, 14, 14, 3, 26, 11, 20, 2, 10

according to the preceding substitution scheme. A code of this type could be cracked without difficulty by a number of statistical techniques, like the analysis of the frequency of letters. To make the code difficult to crack, we first break the message into three vectors in $\mathbb{R}^3$, each with 3 components, by adding extra blanks if necessary:

$$\begin{bmatrix} 6 \\ 14 \\ 14 \end{bmatrix}, \quad \begin{bmatrix} 3 \\ 26 \\ 11 \end{bmatrix}, \quad \begin{bmatrix} 20 \\ 2 \\ 10 \end{bmatrix}.$$

Next, choose a nonsingular $3\times 3$ matrix $A$, say

$$A = \begin{bmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ 1 & 1 & 1 \end{bmatrix},$$

which is supposed to be known to both sender and receiver. Then, as a linear transformation, $A$ translates our message into

$$A\begin{bmatrix} 6 \\ 14 \\ 14 \end{bmatrix} = \begin{bmatrix} 6 \\ 26 \\ 34 \end{bmatrix}, \quad A\begin{bmatrix} 3 \\ 26 \\ 11 \end{bmatrix} = \begin{bmatrix} 3 \\ 32 \\ 40 \end{bmatrix}, \quad A\begin{bmatrix} 20 \\ 2 \\ 10 \end{bmatrix} = \begin{bmatrix} 20 \\ 42 \\ 32 \end{bmatrix}.$$

By putting the components of the resulting vectors consecutively, we transmit

6, 26, 34, 3, 32, 40, 20, 42, 32.

To decode a message, the receiver follows the reverse process. Suppose that we received the following reply from our correspondent:

19, 45, 26, 13, 36, 41.

To decode it, first break the message into two vectors in $\mathbb{R}^3$ as before:

$$\begin{bmatrix} 19 \\ 45 \\ 26 \end{bmatrix}, \quad \begin{bmatrix} 13 \\ 36 \\ 41 \end{bmatrix}.$$

We want to find the two vectors $\mathbf{x}_1$, $\mathbf{x}_2$ such that $A\mathbf{x}_i$ is the $i$-th vector of the above two vectors: i.e.,

$$A\mathbf{x}_1 = \begin{bmatrix} 19 \\ 45 \\ 26 \end{bmatrix}, \qquad A\mathbf{x}_2 = \begin{bmatrix} 13 \\ 36 \\ 41 \end{bmatrix}.$$

Since $A$ is invertible, the vectors $\mathbf{x}_1$, $\mathbf{x}_2$ can be found by multiplying the inverse of $A$ into the two vectors given in the message. By an easy computation, one can find

$$A^{-1} = \begin{bmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ 1 & -1 & 1 \end{bmatrix}.$$

Therefore,

$$\mathbf{x}_1 = \begin{bmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ 1 & -1 & 1 \end{bmatrix}\begin{bmatrix} 19 \\ 45 \\ 26 \end{bmatrix} = \begin{bmatrix} 19 \\ 7 \\ 0 \end{bmatrix}, \qquad \mathbf{x}_2 = \begin{bmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ 1 & -1 & 1 \end{bmatrix}\begin{bmatrix} 13 \\ 36 \\ 41 \end{bmatrix} = \begin{bmatrix} 13 \\ 10 \\ 18 \end{bmatrix}.$$

The numbers one obtains are 19, 7, 0, 13, 10, 18. Using our correspondence between letters and numbers, the message we have received is "THANKS".

Problem 1.28 Encode "TAKE UFO" using the same matrix $A$ used in the above example.
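The encoding and decoding processes are simple matrix-vector products and are easy to automate. The sketch below (ours) implements them for the matrix $A$ of this example, with the letter code $A \leftrightarrow 0, \ldots, Z \leftrightarrow 25$, blank $\leftrightarrow 26$; the helper names are our own.

```python
# Encode and decode messages with the matrix A of the cryptography example.
import numpy as np

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ ?!"
A = np.array([[1, 0, 0], [2, 1, 0], [1, 1, 1]])
A_inv = np.array([[1, 0, 0], [-2, 1, 0], [1, -1, 1]])   # integer inverse of A

def encode(msg):
    nums = [ALPHABET.index(ch) for ch in msg]
    nums += [26] * (-len(nums) % 3)            # pad with blanks to length 3k
    blocks = np.array(nums).reshape(-1, 3).T   # columns of length 3
    return (A @ blocks).T.ravel().tolist()

def decode(code):
    blocks = np.array(code).reshape(-1, 3).T
    nums = (A_inv @ blocks).T.ravel()
    return "".join(ALPHABET[int(n)] for n in nums)

print(encode("GOOD LUCK"))               # [6, 26, 34, 3, 32, 40, 20, 42, 32]
print(decode([19, 45, 26, 13, 36, 41]))  # THANKS
```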
(3) Another significant application of linear algebra is to a mathematical model in economics. In most nations, an economic society may be divided into many sectors that produce goods or services, such as the automobile industry, oil industry, steel industry, communication industry, and so on. Then a fundamental problem in economics is to find the equilibrium of the supply and the demand in the economy.

There are two kinds of demands for goods: the intermediate demand from the industries themselves (or the sectors), which need the goods as inputs for their own production, and the extra demand from the consumer, governmental use, surplus production, or exports. In practice, the interrelation between the sectors is very complicated, and the connection between the extra demand and the production is unclear. A natural question is whether there is a production level such that the total amounts produced (or supply) exactly balance the total demand for the production, so that the equality

    {Total output} = {Total demand} = {Intermediate demand} + {Extra demand}

holds. This problem can be described by a system of linear equations, which is called the Leontief Input-Output Model. To illustrate this, we show a simple example.

Suppose that a nation's economy consists of three sectors: I₁ = automobile industry, I₂ = steel industry, and I₃ = oil industry. Let x = [x₁ x₂ x₃]ᵀ denote the production vector (or production level) in R³, where each entry xᵢ denotes the total amount (in a common unit such as "dollars" rather than quantities such as "tons" or "gallons") of the output that the industry Iᵢ produces per year.

The intermediate demand may be explained as follows. Suppose that, for the total output x₂ units of the steel industry I₂, 20% is contributed by the output of I₁, 40% by that of I₂ and 20% by that of I₃. Then we can write this as a column vector, called the unit consumption vector of I₂:

          [0.2]
    c₂ =  [0.4]
          [0.2]

For example, if I₂ decides to produce 100 units per year, then it will order (or demand) 20 units from I₁, 40 units from I₂, and 20 units from I₃: i.e., the consumption vector of I₂ for the production x₂ = 100 units can be written as the column vector 100c₂ = [20 40 20]ᵀ. From the meaning of the consumption vector, it is clear that the sum of the decimal fractions in the column c₂ must be at most 1.

In our example, suppose that the demands (inputs) of the outputs are given by the following matrix, called an input-output matrix:

                         output
                      I₁    I₂    I₃
               I₁  [ 0.3   0.2   0.3 ]
    A = input  I₂  [ 0.1   0.4   0.1 ]
               I₃  [ 0.3   0.2   0.3 ]
                      ↑     ↑     ↑
                      c₁    c₂    c₃

In this matrix, an industry looks down a column to see how much it needs from where to produce its total output, and it looks across a row to see how much of its output goes to where. For example, the second row says that, out of the total output x₂ units of the steel industry I₂, as the intermediate demand, the automobile industry I₁ demands 10% of the output x₁, the steel industry I₂ demands 40% of the output x₂, and the oil industry I₃ demands 10% of the output x₃. Therefore, it is now easy to see that the intermediate demand of the economy can be written as

         [0.3 0.2 0.3][x₁]   [0.3x₁ + 0.2x₂ + 0.3x₃]
    Ax = [0.1 0.4 0.1][x₂] = [0.1x₁ + 0.4x₂ + 0.1x₃].
         [0.3 0.2 0.3][x₃]   [0.3x₁ + 0.2x₂ + 0.3x₃]

Suppose that the extra demand in our example is given by d = [d₁ d₂ d₃]ᵀ = [30 20 10]ᵀ. Then the problem for this economy is to find the production vector x satisfying the following equation:

    x = Ax + d.

Another form of the equation is (I − A)x = d, where the matrix I − A is called the Leontief matrix. If I − A is not invertible, then the equation may have no solution or infinitely many solutions, depending on what d is. If I − A is invertible, then the equation has the unique solution x = (I − A)⁻¹d. Now, our example can be written as

    [x₁]   [0.3 0.2 0.3][x₁]   [30]
    [x₂] = [0.1 0.4 0.1][x₂] + [20].
    [x₃]   [0.3 0.2 0.3][x₃]   [10]

In this example, it turns out that the matrix I − A is invertible and

                [2.0  1.0  1.0]
    (I − A)⁻¹ = [0.5  2.0  0.5].
                [1.0  1.0  2.0]

Therefore,

                     [90]
    x = (I − A)⁻¹d = [60],
                     [70]

which gives the total amount of the product xᵢ of the industry Iᵢ for one year needed to meet the required demand.

Remark: (1) Under the usual circumstances, the sum of the entries in a column of the consumption matrix A is less than one, because a sector should require less than one unit's worth of inputs to produce one unit of output. This actually implies that I − A is invertible and that the production vector x is feasible, in the sense that the entries of x are all nonnegative, as the following argument shows.

(2) In general, by using induction one can easily verify that for any k = 1, 2, ...,

    (I − A)(I + A + ··· + Aᵏ) = I − Aᵏ⁺¹.

If the sums of the column entries of A are all strictly less than one, then lim_{k→∞} Aᵏ = 0 (see Section 6.6 for the limit of a sequence of matrices). Thus we get (I − A)(I + A + ··· + Aᵏ + ···) = I, that is,

    (I − A)⁻¹ = I + A + ··· + Aᵏ + ···.

This also gives a practical way of computing (I − A)⁻¹, since by taking k sufficiently large the partial sum on the right side can be made very close to (I − A)⁻¹. In Chapter 6, an easier method of computing Aᵏ will be shown.

In summary, if A and d have nonnegative entries and if the sum of the entries in each column of A is less than one, then I − A is invertible and its inverse is given by the formula above. Moreover, as the formula shows, the entries of the inverse are all nonnegative, and so are those of the production vector x = (I − A)⁻¹d.

Problem 1.29 Determine the total demand for the industries I₁, I₂ and I₃ for the input-output matrix A and the extra demand vector d given below:

        [0.1 0.7 0.2]
    A = [0.5 0.1 0.6]    with d = 0.
        [0.4 0.2 0.2]

Problem 1.30 Suppose that an economy is divided into three sectors: I₁ = services, I₂ = manufacturing industries, and I₃ = agriculture. For each unit of output, I₁ demands no services from I₁, 0.4 units from I₂, and 0.5 units from I₃. For each unit of output, I₂ requires 0.1 units from sector I₁ of services, 0.7 units from other parts in sector I₂, and no product from sector I₃. For each unit of output, I₃ demands 0.8 units of services from I₁, 0.1 units of manufacturing products from I₂, and 0.1 units of its own output from I₃. Determine the production level to balance the economy when 90 units of services, 10 units of manufacturing, and 30 units of agriculture are required as the extra demand.
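For readers who want to reproduce the numbers, here is a short Python sketch (with NumPy; our own illustration) that solves the Leontief equation (I − A)x = d directly and also checks the series approximation from the Remark:

    import numpy as np

    A = np.array([[0.3, 0.2, 0.3],
                  [0.1, 0.4, 0.1],
                  [0.3, 0.2, 0.3]])
    d = np.array([30.0, 20.0, 10.0])

    # Exact production level: solve (I - A)x = d
    x = np.linalg.solve(np.eye(3) - A, d)
    print(x)                    # expected: [90. 60. 70.]

    # Series approximation (I + A + A^2 + ... + A^k)d from the Remark;
    # it converges since every column sum of A is less than one
    approx = d.copy()
    term = d.copy()
    for _ in range(60):
        term = A @ term         # term is now A^(k)d
        approx += term
    print(approx)               # approaches [90. 60. 70.]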
1.10 Exercises

1.2. Which of the following matrices are in row-echelon form or in reduced row-echelon form? Find a row-echelon form of each matrix. [matrices illegible]

1.3. Find the reduced row-echelon form of the matrices in Exercise 1.2.

1.4. Solve the systems of equations by Gauss-Jordan elimination. What are the pivots in each elimination step? [systems illegible]

1.5. Which of the following systems has a nontrivial solution? [systems illegible]

1.6. Determine all values of the bᵢ that make the following system consistent: [system illegible]

1.7. Determine the condition on the bᵢ so that the following system has no solution: [system illegible]

1.8. Let A and B be matrices of the same size.
(1) Show that, if Ax = 0 for all x, then A is the zero matrix.
(2) Show that, if Ax = Bx for all x, then A = B.

1.9. Compute ABC and CAB for the given matrices A, B and C. [matrices illegible]

1.10. Prove that if A is a 3×3 matrix such that AB = BA for every 3×3 matrix B, then A = kI for some constant k.

1.11. Let

        [1 2 0]
    A = [0 1 3].
        [0 0 1]

Find Aᵏ for all integers k.

1.12. Compute (2A − B)C and CCᵀ for the given matrices A, B and C. [matrices illegible]

1.13. Let f(x) = aₙxⁿ + aₙ₋₁xⁿ⁻¹ + ··· + a₁x + a₀ be a polynomial. For any square matrix A, the matrix polynomial f(A) is defined as f(A) = aₙAⁿ + aₙ₋₁Aⁿ⁻¹ + ··· + a₁A + a₀I. Find f(A) for the polynomial f and the matrices given there. [data illegible]

1.14. Find the symmetric part and the skew-symmetric part of each of the following matrices: [matrices illegible]

1.15. Find AAᵀ and AᵀA for the given matrix A. [matrix illegible]

1.16. For the given matrix A: (1) find a matrix B such that AB equals the matrix given there; (2) find a matrix C such that AC = A² + A. [matrices illegible]

1.17. Find all possible choices of a, b and c so that the given matrix A has an inverse with A⁻¹ = A. [matrix illegible]

1.18. Decide whether or not each of the following matrices is invertible, and find the inverses of the invertible ones: [matrices illegible]

1.19. Suppose A is a 2×1 matrix and B is a 1×2 matrix. Prove that the product AB is not invertible.

1.20. Find three matrices which are row equivalent to the given 3×4 matrix A. [matrix illegible]

1.21. Write the following systems of equations as matrix equations Ax = b and solve them by computing A⁻¹b. [systems illegible]

1.22. Find the LDU factorization of each of the following matrices: [matrices illegible]

1.23. Find the LDLᵀ factorization of the following symmetric matrices: [matrices illegible]

1.24. Find the LDU factorization of the given matrix. [matrix illegible]

1.25. Solve Ax = b with A = LU, where

        [ 1  0  0]        [1 −1  0]        [ 2]
    L = [−1  1  0],   U = [0  1 −1],   b = [−3].
        [ 0 −1  1]        [0  0  1]        [ 4]

Forward elimination is the same as Lc = b, and back-substitution is Ux = c. Then:
(1) Solve Ax = b by Gauss-Jordan elimination.
(2) Find the LDU factorization of A.
(3) Write A as a product of elementary matrices.
(4) Find the inverse of A.

1.26. A square matrix A is said to be nilpotent if Aᵏ = 0 for some positive integer k.
(1) Show that an invertible matrix is not nilpotent.
(2) Show that any triangular matrix with zero diagonal is nilpotent.
(3) Show that if A is nilpotent with Aᵏ = 0, then I − A is invertible with inverse I + A + ··· + Aᵏ⁻¹.

1.27. A square matrix A is said to be idempotent if A² = A.
(1) Find an example of an idempotent matrix other than O or I.
(2) Show that, if a matrix A is both idempotent and invertible, then A = I.

1.28. Determine whether the following statements are true or false, in general, and justify your answers.
(1) Let A and B be row-equivalent square matrices. Then A is invertible if and only if B is invertible.
(2) Let A be a square matrix such that AA = A. Then A is the identity.
(3) If A and B are invertible matrices such that A² = I and B² = I, then (AB)⁻¹ = BA.
(4) If A and B are invertible matrices, then A + B is also invertible.
(5) If A, B and AB are symmetric, then AB = BA.
(6) If A and B are symmetric and of the same size, then AB is also symmetric.
(7) Let ABᵀ = I. Then A is invertible if and only if B is invertible.
(8) If a square matrix A is not invertible, then neither is AB for any B.
(9) If E₁ and E₂ are elementary matrices, then E₁E₂ = E₂E₁.
(10) The inverse of an invertible upper triangular matrix is upper triangular.
(11) Any invertible matrix A can be written as A = LU, where L is lower triangular and U is upper triangular.
(12) If A is invertible and symmetric, then A⁻¹ is also symmetric.

Chapter 2
Determinants

2.1 Basic properties of determinant

Our primary interest in Chapter 1 was in the solvability and the solutions of a system Ax = b of linear equations. For an invertible matrix A, Theorem 1.8 shows that the system has a unique solution x = A⁻¹b for any b. Now the question is how to decide whether or not a square matrix A is invertible. In this section, we introduce the notion of the determinant as a real-valued function of square matrices that satisfies certain axiomatic rules, and then show that a square matrix A is invertible if and only if the determinant of A is not zero. In fact, we saw in Chapter 1 that a 2×2 matrix

    A = [a b]
        [c d]

is invertible if and only if ad − bc ≠ 0. This number is called the determinant of A, and is defined formally as follows:

Definition 2.1 For a 2×2 matrix A = [a b; c d] ∈ M₂ₓ₂(R), the determinant of A is defined as det A = ad − bc.

In fact, it turns out that geometrically the determinant of a 2×2 matrix A represents, up to sign, the area of the parallelogram in the xy-plane whose edges are constructed from the row vectors of A (see Theorem 2.9), so it would be very nice if we could carry the same idea of the determinant over to higher order matrices. However, the formula itself in Definition 2.1 does not provide any clue of how to extend this idea of the determinant to higher order matrices. Hence, we first examine some fundamental properties of the determinant function defined in Definition 2.1.

By a direct computation, one can easily verify that the function det in Definition 2.1 satisfies the following three fundamental properties:

(1) det [1 0] = 1;
        [0 1]

(2) det [c d] = cb − da = −(ad − bc) = −det [a b];
        [a b]                               [c d]

(3) det [ka + ℓa′  kb + ℓb′] = (ka + ℓa′)d − (kb + ℓb′)c
        [   c         d    ]
      = k(ad − bc) + ℓ(a′d − b′c)
      = k det [a b] + ℓ det [a′ b′].
              [c d]         [c  d ]

Actually, all the important properties of the determinant function can be derived from these three properties. We will show in Lemma 2.3 that if a function f : M₂ₓ₂(R) → R satisfies the properties (1), (2) and (3) above, then it must be of the form f(A) = ad − bc. An advantage of looking at these properties of the determinant, rather than at the explicit formula given in Definition 2.1, is that these three properties enable us to define the determinant function for any n×n square matrices.
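The three properties can also be checked numerically. The following Python sketch (our own illustration; the names and tolerances are arbitrary) tests them for the formula det A = ad − bc on random 2×2 matrices:

    import random

    def det2(a, b, c, d):
        # Determinant of the 2x2 matrix [[a, b], [c, d]]
        return a*d - b*c

    # Property (1): det I = 1
    assert det2(1, 0, 0, 1) == 1

    random.seed(0)
    for _ in range(100):
        a, b, c, d, a2, b2 = (random.uniform(-5, 5) for _ in range(6))
        k, l = random.uniform(-5, 5), random.uniform(-5, 5)
        # Property (2): interchanging the two rows changes the sign
        assert abs(det2(c, d, a, b) + det2(a, b, c, d)) < 1e-9
        # Property (3): linearity in the first row
        lhs = det2(k*a + l*a2, k*b + l*b2, c, d)
        rhs = k*det2(a, b, c, d) + l*det2(a2, b2, c, d)
        assert abs(lhs - rhs) < 1e-9
    print("all three properties hold on the random samples")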
Definition 2.2 A real-valued function f : Mₙₓₙ(R) → R of all n×n square matrices is called a determinant if it satisfies the following three rules:

(R₁) the value of f at the identity matrix is 1, i.e., f(Iₙ) = 1;
(R₂) the value of f changes sign if any two rows are interchanged;
(R₃) f is linear in the first row; that is, by definition,

      [kr₁ + k′r₁′]       [r₁]        [r₁′]
    f [    r₂     ] = k f [r₂] + k′ f [r₂ ],
      [    ⋮      ]       [⋮ ]        [⋮  ]
      [    rₙ     ]       [rₙ]        [rₙ ]

where the rᵢ's denote the row vectors of a matrix.

It was already shown that det on 2×2 matrices satisfies these rules. We will show later that for each positive integer n there always exists such a function f : Mₙₓₙ(R) → R satisfying the three rules in the definition and, moreover, that it is unique. Therefore, we may speak of "the" determinant and denote it by "det" in any order.

Let us first derive some direct consequences of the three rules in the definition (the reader is invited to verify that det of 2×2 matrices also satisfies the following properties):

Theorem 2.1 The determinant satisfies the following properties.
(1) The determinant is linear in each row, i.e., the rule (R₃) also holds for each row.
(2) If A has either a zero row or two identical rows, then det A = 0.
(3) The elementary row operation that adds a constant multiple of one row to another row leaves the determinant unchanged.

Proof: (1) Any row can be brought to the first row with a change of sign in the determinant by rule (R₂); then use rules (R₃) and (R₂) again.

(2) If A has a zero row, then that row is zero times the zero row, so det A = 0 by (1). If A has two identical rows, then interchanging these identical rows changes only the sign of the determinant, but not A itself. Thus we get det A = −det A, so det A = 0.

(3) By a direct computation using (1), we get

      [   ⋮    ]     [⋮ ]       [⋮ ]
    f [rᵢ + krⱼ] = f [rᵢ] + k f [rⱼ],
      [   ⋮    ]     [⋮ ]       [⋮ ]
      [   rⱼ   ]     [rⱼ]       [rⱼ]

in which the second term on the right-hand side is zero by (2). □

It is now easy to see the effect of the elementary row operations on the evaluation of the determinant. The first elementary row operation, which multiplies a row by a constant k, multiplies the determinant by k, by (1) of Theorem 2.1. The rule (R₂) in the definition describes the effect of the elementary row operation that interchanges two rows. The last elementary row operation, which adds a constant multiple of one row to another, is covered by (3) of Theorem 2.1.

Example 2.1 Consider the matrix

        [ 1    1    1 ]
    A = [ a    b    c ].
        [b+c  c+a  a+b]

If we add the second row to the third, then the third row becomes [a+b+c  a+b+c  a+b+c], which is a scalar multiple of the first row. Thus, det A = 0. □

Problem 2.1 Show that, for an n×n matrix A and k ∈ R, det(kA) = kⁿ det A.

Problem 2.2 Explain why det A = 0 for

            [a+1  a+4  a+7]
    (1) A = [a+2  a+5  a+8],    (2) A = [matrix illegible].
            [a+3  a+6  a+9]

Recall that any square matrix can be transformed into an upper triangular matrix by forward elimination. Further properties of the determinant are obtained in the following theorem.
Theorem 2.2 The determinant satisfies the following properties.
(1) The determinant of a triangular matrix is the product of its diagonal entries.
(2) A matrix A is invertible if and only if det A ≠ 0.
(3) For any two n×n matrices A and B, det(AB) = det A det B.
(4) det Aᵀ = det A.

Proof: (1) If A is a diagonal matrix, then it is clear that det A = a₁₁ ··· aₙₙ by (1) of Theorem 2.1 and rule (R₁). Suppose that A is a lower triangular matrix. Then a forward elimination, which does not change the determinant, produces a zero row if A has a zero diagonal entry, or makes A row equivalent to the diagonal matrix D whose diagonal entries are exactly those of A if the diagonal entries are all nonzero. Thus, in the former case, det A = 0 and the product of the diagonal entries is also zero. In the latter case, det A = det D = a₁₁ ··· aₙₙ. Similar arguments apply when A is an upper triangular matrix.

(2) Note again that forward elimination reduces a square matrix A to an upper triangular matrix, which has a zero row if A is singular and has no zero row if A is nonsingular (see Theorem 1.8).

(3) If A is not invertible, then AB is not invertible, and so det(AB) = 0 = det A det B. By the properties of the elementary matrices, it is clear that for any elementary matrix E, det(EB) = det E det B. If A is invertible, it can be written as a product of elementary matrices, say A = E₁E₂ ··· Eₖ. Then, by induction on k, we get

    det(AB) = det(E₁E₂ ··· EₖB) = det E₁ det E₂ ··· det Eₖ det B
            = det(E₁E₂ ··· Eₖ) det B = det A det B.

(4) Clearly, A is not invertible if and only if Aᵀ is not. Thus, for a singular matrix A we have det Aᵀ = 0 = det A. If A is invertible, then there is a factorization PA = LDU for a permutation matrix P. By (3), we get det P det A = det L det D det U. Note that the transpose of PA = LDU is AᵀPᵀ = UᵀDᵀLᵀ, and that for any triangular matrix B, det B = det Bᵀ by (1). In particular, since L, U, Lᵀ and Uᵀ are triangular with 1's on the diagonal, their determinants are all equal to 1. Therefore, we have

    det Aᵀ det Pᵀ = det Uᵀ det Dᵀ det Lᵀ = det L det D det U = det A det P.

By definition, a permutation matrix P is obtained from the identity matrix by a sequence of row interchanges: that is, P = E₁ ··· EₖIₙ for some k, where each Eᵢ is an elementary matrix obtained from the identity matrix by interchanging two rows. Thus, det Eᵢ = −1 for each i = 1, ..., k, and clearly Eᵢᵀ = Eᵢ = Eᵢ⁻¹. Therefore, det P = (−1)ᵏ = det Pᵀ by (3), so det Aᵀ = det A. □

Remark: From the equality det Aᵀ = det A, we could define the determinant in terms of columns instead of rows in Definition 2.2, and Theorem 2.1 remains true with "columns" in place of "rows".

Example 2.2 Evaluate the determinant of the following matrix A: [matrix illegible]

Solution: By forward elimination, A can be transformed into an upper triangular matrix U. Since forward elimination does not change the determinant, det A is simply the product of the diagonal entries of U. [the numerical computation is illegible] □

Problem 2.3 Prove that if A is invertible, then det A⁻¹ = 1/det A.

Problem 2.4 Evaluate the determinant of each of the following matrices:

                               [11 12 13 14]
    (1) [matrix illegible], (2)[21 22 23 24],  (3) [matrix illegible].
                               [31 32 33 34]
                               [41 42 43 44]
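The method of Example 2.2 is easy to mechanize. Below is a minimal Python sketch (the function name and the pivoting tolerance are our own choices): reduce A to upper triangular form, flip the sign at each row interchange as rule (R₂) dictates, and multiply the diagonal entries.

    def det_by_elimination(A):
        """Determinant by forward elimination with row interchanges.
        A row interchange flips the sign (rule R2); adding a multiple of
        one row to another changes nothing (Theorem 2.1(3))."""
        A = [list(map(float, row)) for row in A]   # work on a copy
        n = len(A)
        sign = 1.0
        for j in range(n):
            # choose the entry of largest size in column j as the pivot
            p = max(range(j, n), key=lambda i: abs(A[i][j]))
            if abs(A[p][j]) < 1e-12:
                return 0.0                         # singular matrix
            if p != j:
                A[j], A[p] = A[p], A[j]            # row interchange
                sign = -sign
            for i in range(j + 1, n):
                m = A[i][j] / A[j][j]
                for k in range(j, n):
                    A[i][k] -= m * A[j][k]
        d = sign
        for j in range(n):
            d *= A[j][j]                           # product of the diagonal
        return d

    print(det_by_elimination([[1, 2], [3, 4]]))    # -2.0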
2.2 Existence and uniqueness

Recall that det A = ad − bc defined in the previous section satisfies the three rules of Definition 2.2. Conversely, the following lemma shows that any function from M₂ₓ₂(R) into R satisfying the three rules (R₁)-(R₃) of Definition 2.2 must be det, which gives the uniqueness of the determinant function on M₂ₓ₂(R).

Lemma 2.3 If f : M₂ₓ₂(R) → R satisfies the three rules in Definition 2.2, then f(A) = ad − bc.

Proof: First, note that f([1 0; 0 1]) = 1 by rule (R₁) and f([0 1; 1 0]) = −1 by rule (R₂). Writing the rows as [a b] = a[1 0] + b[0 1] and [c d] = c[1 0] + d[0 1], the linearity in each row (Theorem 2.1(1)) gives

    f [a b] = ac f [1 0] + ad f [1 0] + bc f [0 1] + bd f [0 1]
      [c d]        [1 0]        [0 1]        [1 0]        [0 1]
            = 0 + ad + (−bc) + 0 = ad − bc,

since a matrix with two identical rows has f-value zero by Theorem 2.1(2). □

Therefore, when n = 2 there is only one function f on M₂ₓ₂(R) which satisfies the three rules: i.e., f = det. Now for n = 3, the same calculation as in the case n = 2 can be applied. That is, by repeated use of the three rules (R₁)-(R₃) as in the proof of Lemma 2.3, we can obtain the explicit formula for the determinant function on M₃ₓ₃(R) as follows:

        [a₁₁ a₁₂ a₁₃]
    det [a₂₁ a₂₂ a₂₃]
        [a₃₁ a₃₂ a₃₃]

          [a₁₁  0   0 ]       [a₁₁  0   0 ]       [ 0  a₁₂  0 ]
    = det [ 0  a₂₂  0 ] + det [ 0   0  a₂₃] + det [a₂₁  0   0 ]
          [ 0   0  a₃₃]       [ 0  a₃₂  0 ]       [ 0   0  a₃₃]

          [ 0  a₁₂  0 ]       [ 0   0  a₁₃]       [ 0   0  a₁₃]
    + det [ 0   0  a₂₃] + det [a₂₁  0   0 ] + det [ 0  a₂₂  0 ]
          [a₃₁  0   0 ]       [ 0  a₃₂  0 ]       [a₃₁  0   0 ]

    = a₁₁a₂₂a₃₃ + a₁₂a₂₃a₃₁ + a₁₃a₂₁a₃₂ − a₁₁a₂₃a₃₂ − a₁₂a₂₁a₃₃ − a₁₃a₂₂a₃₁.

This expression of det A for a matrix A ∈ M₃ₓ₃(R) satisfies the three rules. Therefore, for n = 3, this shows both the uniqueness and the existence of the determinant function on M₃ₓ₃(R).

Problem 2.5 Show that the above formula for the determinant of 3×3 matrices satisfies the three rules in Definition 2.2.
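As a quick sanity check, the six-term formula can be compared with a numerical determinant routine. The sketch below (Python with NumPy; our own illustration) tests it on random integer matrices:

    import random
    import numpy as np

    def det3_formula(A):
        # The six signed elementary products of a 3x3 matrix
        (a11, a12, a13), (a21, a22, a23), (a31, a32, a33) = A
        return (a11*a22*a33 + a12*a23*a31 + a13*a21*a32
                - a11*a23*a32 - a12*a21*a33 - a13*a22*a31)

    random.seed(1)
    for _ in range(1000):
        A = [[random.randint(-9, 9) for _ in range(3)] for _ in range(3)]
        assert abs(det3_formula(A) - np.linalg.det(np.array(A))) < 1e-6
    print("six-term formula agrees on 1000 random samples")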
To get the formula for the determinant of matrices of order n > 3, the same computational process can be repeated using the three rules, but the computation becomes more complicated as the order gets higher. To derive the explicit formula for det A of order n > 3, we examine the above case in detail. In the process of deriving the explicit formula for det A of a 3×3 matrix A, we can observe the following three steps:

(1st) By using the linearity of the determinant function in each row, det A of a 3×3 matrix A is expanded as the sum of the determinants of 3³ = 27 matrices. Except for exactly six matrices, all of them have zero columns, so their determinants are zero (see the proof of Lemma 2.3).

(2nd) In each of these remaining six matrices, all entries are zero except for exactly three entries that came from the given matrix A. Indeed, no two of the three entries came from the same column or from the same row of A: in each row there is only one entry that came from A, and at the same time in each column there is only one entry that came from A. Actually, in each of the six matrices, the three entries from A, say aᵢⱼ, aₖₗ and aₚq, are chosen as follows. If the first entry aᵢⱼ is chosen from the first row and the third column of A, say a₁₃, then the other entries aₖₗ and aₚq in the product must be chosen from the second or the third row and the first or the second column. Thus, if the second entry aₖₗ is taken from the second row, the column it belongs to must be either the first or the second: i.e., either a₂₁ or a₂₂. If a₂₁ is taken, then the third entry aₚq must be, without option, a₃₂. Thus, the entries from A in the chosen matrix are a₁₃, a₂₁ and a₃₂. Therefore, the three entries in each of the six remaining matrices are determined as follows: when the row indices (the first indices i of aᵢⱼ) are arranged in the order 1, 2, 3, the assignment of the column indices (the second indices j of aᵢⱼ) to the row indices is simply a rearrangement of 1, 2, 3 without repetitions or omissions. In this way, one can recognize that the number 6 = 3! is simply the number of ways in which the three column indices 1, 2, 3 can be rearranged.

(3rd) The determinant of each of the six matrices may be computed by converting it into a diagonal matrix using suitable "column interchanges" (see Theorem 2.2(1)), so its determinant becomes ±aᵢⱼaₖₗaₚq, where the sign depends on the number of column interchanges. For example, for the matrix having the entries a₁₃, a₂₂ and a₃₁ from A, one can convert this matrix into a diagonal matrix in a couple of ways: for instance, one can take just the one interchange of the first and the third columns, or take three interchanges: the first and the second, then the second and the third, and then the first and the second. In either case,

        [ 0   0  a₁₃]        [a₁₃  0   0 ]
    det [ 0  a₂₂  0 ] = −det [ 0  a₂₂  0 ] = −a₁₃a₂₂a₃₁.
        [a₃₁  0   0 ]        [ 0   0  a₃₁]

Note that an interchange of two columns is the same as an interchange of the two corresponding column indices. As mentioned above, there may be several ways of making column interchanges to convert a given matrix into a diagonal matrix. However, it is quite interesting that, whichever column interchanges we take, the parity of the number of interchanges remains the same. In this example, the given arrangement of the columns is expressed by the arrangement of the column indices, which is 3, 2, 1. Thus, to arrive at the order 1, 2, 3, which represents the diagonal matrix, we can take either just the one interchange of 3 and 1, or the three interchanges: 3 and 2, then 3 and 1, and then 2 and 1. In either case the parity is odd, so the "−" sign in the computation of the determinant comes from (−1)¹ = (−1)³, where the exponents are the numbers of interchanges of the column indices.

In summary, in the expansion of det A for A ∈ M₃ₓ₃(R), the number 6 = 3! of the determinants which contribute to the computation of det A is simply the number of ways in which the three numbers 1, 2, 3 can be rearranged without repetitions or omissions. Moreover, the sign of each of the six determinants is determined by the parity (even or odd) of the number of column interchanges required to arrive at the order 1, 2, 3 from the given arrangement of the column indices.

These observations can be used to derive the explicit formula for the determinant of matrices of order n > 3. We begin with the following definition.
Definition 2.3 A permutation of the set of integers Nₙ = {1, 2, ..., n} is a one-to-one function from Nₙ onto itself.

Therefore, a permutation σ of Nₙ assigns a number σ(i) in Nₙ to each number i in Nₙ, and this permutation σ is commonly denoted by

    σ = (  1    2   ···   n  )
        ( σ(1) σ(2) ··· σ(n) ).

Here, the first row is the usual lay-out of Nₙ as the domain set, and the second row is an arrangement, in a certain order without repetitions or omissions, of the numbers in Nₙ as the image set. A permutation that interchanges only two numbers in Nₙ, leaving the rest of the numbers fixed, such as σ = (3, 2, 1, ..., n), is called a transposition. Note that the composition of any two permutations is again a permutation. Moreover, composing a transposition with a permutation σ produces an interchange of two numbers in the permutation σ. In particular, the composition of a transposition with itself always produces the identity permutation. It is not hard to see that if Sₙ denotes the set of all permutations of Nₙ, then Sₙ has exactly n! permutations.

Once we have listed all the permutations, the next step is to determine the sign of each permutation. A permutation σ = (j₁, j₂, ..., jₙ) is said to have an inversion if jₛ > jₜ for s < t (i.e., a larger number precedes a smaller number). For example, the permutation σ = (3, 1, 2) has two inversions, since 3 precedes both 1 and 2.

An inversion in a permutation can be eliminated by composing it with a suitable transposition: for example, if σ = (3, 2, 1), which has three inversions, then by composing the transposition (2, 1, 3) with it we get (2, 3, 1), which has two inversions; this is the same as interchanging the first two numbers 3, 2 in σ. Therefore, given a permutation σ = (σ(1), σ(2), ..., σ(n)) in Sₙ, one can convert it into the identity permutation (1, 2, ..., n), which is the only permutation with no inversions, by composing it with a certain number of transpositions. For example, by composing the three (which is the number of inversions in σ) transpositions (2,1,3), (1,3,2) and (2,1,3) with σ = (3,2,1), we get the identity permutation. However, the number of transpositions needed to convert a given permutation into the identity permutation need not be unique, as we have seen in the third step. Notice that, even though the number of such transpositions is not unique, its parity (even or odd) is always the same as the parity of the number of inversions. Recall that all we need in the computation of the determinant is just the parity (even or odd) of the number of column interchanges, which is the same as the parity of the number of inversions in the permutation of the column indices.

A permutation is said to be even if it has an even number of inversions, and it is said to be odd if it has an odd number of inversions. For example, when n = 3, the permutations (1, 2, 3), (2, 3, 1) and (3, 1, 2) are even, while the permutations (1, 3, 2), (2, 1, 3) and (3, 2, 1) are odd. In general, for a permutation σ in Sₙ, the sign of σ is defined as

    sgn(σ) = {  1  if σ is an even permutation,
             { −1  if σ is an odd permutation.

It is not hard to see that the number of even permutations equals the number of odd permutations, so each is n!/2. In the case n = 3, one can see that there are 3 terms with + sign and 3 terms with − sign.

Problem 2.6 Show that the number of even permutations and the number of odd permutations in Sₙ are equal.

Now we repeat the three steps to get an explicit formula for det A of a square matrix A = [aᵢⱼ] of order n. First, det A can be expressed as the sum of the determinants of n! matrices (after discarding, as in the case n = 3, the matrices with a zero column), each of which has all zero entries except for the n entries a_{1σ(1)}, a_{2σ(2)}, ..., a_{nσ(n)} taken from A, where σ is a permutation of the set {1, 2, ..., n} of column indices. The n entries a_{1σ(1)}, a_{2σ(2)}, ..., a_{nσ(n)} are chosen from A in such a way that no two of them come from the same row or the same column. Such a matrix can be converted into a diagonal matrix, and hence its determinant is equal to ±a_{1σ(1)}a_{2σ(2)}···a_{nσ(n)}, where the sign ± is determined by the parity of the number of column interchanges needed to convert the matrix into a diagonal matrix, which is equal to the parity of the number of inversions in σ: that is, by sgn(σ). Therefore, the determinant of the matrix whose entries are all zero except for the a_{iσ(i)}'s is equal to

    sgn(σ) a_{1σ(1)} a_{2σ(2)} ··· a_{nσ(n)},

which is called a signed elementary product of A. Now, our discussion can be summarized as follows:

Theorem 2.4 For an n×n matrix A = [aᵢⱼ],

    det A = Σ_{σ ∈ Sₙ} sgn(σ) a_{1σ(1)} a_{2σ(2)} ··· a_{nσ(n)}.

That is, det A is the sum of all signed elementary products of A.

It is not difficult to see that this explicit formula for det A satisfies the three rules in the definition of the determinant. Therefore, we have both existence and uniqueness of the determinant function on square matrices of any order n ≥ 1.
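Counting inversions gives a direct way to compute sgn(σ), and with it the formula of Theorem 2.4 becomes a few lines of code. A Python sketch (our own illustration; practical only for small n, since Sₙ has n! elements):

    from itertools import permutations

    def sgn(sigma):
        # Sign via the number of inversions: pairs s < t with sigma[s] > sigma[t]
        n = len(sigma)
        inversions = sum(1 for s in range(n) for t in range(s + 1, n)
                         if sigma[s] > sigma[t])
        return 1 if inversions % 2 == 0 else -1

    # The six permutations of {1, 2, 3} and their signs
    for p in permutations((1, 2, 3)):
        print(p, sgn(p))

    def det_by_permutations(A):
        """det A as the sum of all signed elementary products (Theorem 2.4);
        here rows and columns are indexed from 0 instead of 1."""
        n = len(A)
        total = 0
        for p in permutations(range(n)):
            product = 1
            for i in range(n):
                product *= A[i][p[i]]
            total += sgn(p) * product
        return total

    print(det_by_permutations([[1, 2], [3, 4]]))   # -2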
Example 2.3 Consider the permutation σ = (3, 4, 2, 5, 1) ∈ S₅: i.e., σ(1) = 3, σ(2) = 4, ..., σ(5) = 1. Then σ has in total 2 + 4 = 6 inversions: two inversions caused by the position of σ(1) = 3, which precedes 1 and 2, and four inversions in the permutation τ = (4, 2, 5, 1), which is a permutation of the set {1, 2, 4, 5}. Thus,

    sgn(σ) = (−1)²⁺⁴ = (−1)² sgn(τ).

Note that the permutation τ may be regarded as a permutation of N₄ by replacing the numbers 4 and 5 by 3 and 4, respectively. Moreover, σ = (3, 4, 2, 5, 1) can be converted into (1, 3, 4, 2, 5) by shifting the number 1 to the front through four transpositions, and then (1, 3, 4, 2, 5) can be converted into the identity permutation (1, 2, 3, 4, 5) by two further transpositions. Hence, σ can be converted into the identity permutation by six transpositions. □

In general, for a fixed j, 1 ≤ j ≤ n, [...]

[Several pages of this copy, covering the cofactor expansion and the adjugate matrix adj A, are not legible; the text resumes below.]

Problem 2.x Let A be an invertible matrix of order n > 1. Show that
(1) det(adj A) = (det A)ⁿ⁻¹;
(2) adj(adj A) = (det A)ⁿ⁻² A.

The next theorem establishes a formula for the solution of a system of n equations in n unknowns. It is not useful as a practical method, but it can be used to study properties of the solution without solving the system.

Theorem 2.8 (Cramer's rule) Let Ax = b be a system of n linear equations in n unknowns such that det A ≠ 0. Then the system has the unique solution given by

    xⱼ = det Cⱼ / det A,   j = 1, 2, ..., n,

where Cⱼ is the matrix obtained from A by replacing the j-th column with the column matrix b = [b₁ b₂ ··· bₙ]ᵀ.

Proof: If det A ≠ 0, then A is invertible and x = A⁻¹b is the unique solution of Ax = b. Since

    x = A⁻¹b = (1/det A)(adj A)b,

it follows that

    xⱼ = (b₁A₁ⱼ + b₂A₂ⱼ + ··· + bₙAₙⱼ)/det A = det Cⱼ/det A,

because b₁A₁ⱼ + ··· + bₙAₙⱼ is exactly the cofactor expansion of det Cⱼ along its j-th column. □
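Cramer's rule is likewise straightforward to state in code. The sketch below (Python with NumPy; our own illustration and, as the text warns, not a practical method for large systems) solves the circuit system of Example 1.18 once more:

    import numpy as np

    def cramer_solve(A, b):
        """Solve Ax = b by Cramer's rule: x_j = det(C_j) / det(A),
        where C_j is A with its j-th column replaced by b."""
        A = np.asarray(A, dtype=float)
        b = np.asarray(b, dtype=float)
        d = np.linalg.det(A)
        if abs(d) < 1e-12:
            raise ValueError("det A = 0: Cramer's rule does not apply")
        x = np.empty(len(b))
        for j in range(len(b)):
            Cj = A.copy()
            Cj[:, j] = b                 # replace the j-th column by b
            x[j] = np.linalg.det(Cj) / d
        return x

    A = [[1, -1, 1], [6, 2, 0], [0, 2, 3]]
    b = [0, 0, 18]
    print(cramer_solve(A, b))            # expected: [-1.  3.  4.]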
