Linear Algebra
Nathaniel Beck, Department of Political Science, University of California, San Diego, La Jolla, CA 92093. [email protected], http://weber.ucsd.edu/nbeck. April 2001.
Vectors
A $k$-vector is a point in Euclidean $k$-dimensional space ($\mathbb{R}^k$).
Thus $\begin{pmatrix}1\\2\end{pmatrix}$ is a point in $\mathbb{R}^2$. Note that, for reasons to become clear later, we always represent vectors as COLUMN vectors.
Define addition of two vectors as the element-by-element sum, so
$$\begin{pmatrix}1\\2\end{pmatrix} + \begin{pmatrix}3\\5\end{pmatrix} = \begin{pmatrix}4\\7\end{pmatrix} \qquad (1)$$
where addition is only defined for two vectors in the same dimensional space (so both must be $k$-vectors). Define scalar multiplication as the element-by-element product of the scalar with each element of the vector:
$$a\begin{pmatrix}2\\3\end{pmatrix} = \begin{pmatrix}2a\\3a\end{pmatrix} \qquad (2)$$
A linear combination of vectors $x$ and $y$ is $ax + by$ for any scalars $a$ and $b$. Consider a set of vectors. The span of this set is all the linear combinations of the vectors. Thus the span of $\{x, y, z\}$ is the set of all vectors of the form $ax + by + cz$ for all scalars (real numbers) $a, b, c$.
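As a small illustration (numbers chosen here for concreteness, not from the original notes): in $\mathbb{R}^2$ the span of the single vector $\begin{pmatrix}1\\2\end{pmatrix}$ is the line $\left\{a\begin{pmatrix}1\\2\end{pmatrix} : a \in \mathbb{R}\right\}$ through the origin, while the span of $\left\{\begin{pmatrix}1\\0\end{pmatrix}, \begin{pmatrix}0\\1\end{pmatrix}\right\}$ is all of $\mathbb{R}^2$, since any $\begin{pmatrix}x_1\\x_2\end{pmatrix} = x_1\begin{pmatrix}1\\0\end{pmatrix} + x_2\begin{pmatrix}0\\1\end{pmatrix}$.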
Basis, dimension
Consider a set of vectors, $X$. Suppose $X$ is the span of some vectors $a_1, \ldots, a_n$. The latter set is a basis for $X$. There are lots of bases for $X$; often one has really nice features and so we use it. We are particularly interested in bases of $\mathbb{R}^n$. Consider $\mathbb{R}^2$. A nice basis is the two vectors
$\begin{pmatrix}1\\0\end{pmatrix}$ and $\begin{pmatrix}0\\1\end{pmatrix}$, since
$$\begin{pmatrix}a_1\\a_2\end{pmatrix} = a_1\begin{pmatrix}1\\0\end{pmatrix} + a_2\begin{pmatrix}0\\1\end{pmatrix},$$
so we have a basis. Another basis is $\begin{pmatrix}1\\1\end{pmatrix}$ and $\begin{pmatrix}1\\-1\end{pmatrix}$, since we can write
$$\begin{pmatrix}a_1\\a_2\end{pmatrix} = c_1\begin{pmatrix}1\\1\end{pmatrix} + c_2\begin{pmatrix}1\\-1\end{pmatrix}.$$
To see we can do this, just solve the two linear equations; note that vectors and solving linear equations are intimately related. Note that if $\{x, y\}$ is a basis, so is $\{x, y, z\}$ for any $z$, since we can just write $a = a_1 x + a_2 y + 0z$ because $\{x, y\}$ is a basis. A set of vectors is linearly dependent if there is some non-trivial linear combination of the vectors that is zero. By non-trivial we mean a combination other than with all scalars being zero, since $0x + 0y = 0$ of course. A set of vectors is linearly independent if it is not linearly dependent. Note that if a set of vectors is linearly dependent, we have, say, $ax + by = 0$ for non-zero $a$ or $b$, so we can write $y = -\frac{a}{b}x$; that is, one vector is a linear combination of other vectors in the set. Since we can always rewrite a vector that is a linear combination of a linearly dependent set of vectors as a new linear combination of a linearly independent subset of those vectors, we can say a nice basis consists only of linearly independent vectors.
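To make the "just solve" step concrete (a worked sketch, not in the original notes): for the second basis above we need $c_1 + c_2 = a_1$ and $c_1 - c_2 = a_2$. Adding and subtracting the two equations gives
$$c_1 = \frac{a_1 + a_2}{2}, \qquad c_2 = \frac{a_1 - a_2}{2},$$
so every $\begin{pmatrix}a_1\\a_2\end{pmatrix}$ can indeed be written in terms of $\begin{pmatrix}1\\1\end{pmatrix}$ and $\begin{pmatrix}1\\-1\end{pmatrix}$.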
We also particularly like the natural basis of $\begin{pmatrix}1\\0\end{pmatrix}$ and $\begin{pmatrix}0\\1\end{pmatrix}$, where each vector is of unit length and the two are orthogonal (at right angles). But, as noted, there are tons of other bases, even if we restrict attention to linearly independent vectors. Let us call the vectors in the natural basis $i_1, i_2$, and similarly for higher dimensional spaces. Note that two vectors form a basis for $\mathbb{R}^2$, and 2 is the dimension of $\mathbb{R}^2$. This is not an accident: the dimension of a vector space is the number of linearly independent vectors needed to span it.
Note also that in $\mathbb{R}^2$ any three vectors must be linearly dependent, since we can write the third vector as a linear combination of the other two. Again, just solve the equations to make the three vectors sum to zero non-trivially. Note that any set of vectors that includes the zero vector $\mathbf{0} = \begin{pmatrix}0\\0\end{pmatrix}$ must be linearly dependent since, say, $0x + a\mathbf{0} = 0$ for $a \neq 0$.
Matrices
A matrix is a rectangular array of numbers or a series of vectors placed next to each other. An example is
$$\begin{pmatrix} 0 & 1 & 3 & 6 \\ 2 & 7 & 3 & 4 \\ 2 & 1 & 2 & 0 \end{pmatrix} \qquad (3)$$
Matrices are called $m \times n$, with $m$ rows and $n$ columns. We can define scalar multiplication in the same way as for vectors: just multiply each element of the matrix by the scalar. We can define addition of matrices as element-by-element addition, though we can only add matrices with the same number of rows and columns. We can understand matrices by the following circuitous route: a linear transform is a special type of transformation of vectors; these transforms map $n$-dimensional vectors into $m$-dimensional vectors, where the dimensions might or might not be equal. A linear transform $T$ has the properties:
$$T(x + y) = T(x) + T(y) \qquad (4)$$
$$T(ax) = aT(x) \qquad (5)$$
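To see what these properties rule in and out (an illustrative example, not from the notes): the map $T\begin{pmatrix}x_1\\x_2\end{pmatrix} = \begin{pmatrix}2x_1\\x_1 + x_2\end{pmatrix}$ satisfies both, since each output element is a fixed linear combination of the inputs; the map $S(x) = x + \begin{pmatrix}1\\1\end{pmatrix}$ does not, since $S(2x) = 2x + \begin{pmatrix}1\\1\end{pmatrix}$ while $2S(x) = 2x + \begin{pmatrix}2\\2\end{pmatrix}$.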
Note that if we know what $T$ does to the basis vectors, we completely know $T$, since $T(x) = x_1 T(i_1) + \cdots + x_n T(i_n)$. Let us then think of an $m \times n$ matrix as defining a linear transform from $\mathbb{R}^n$ to $\mathbb{R}^m$, where the first column of the matrix tells what happens to $i_1$, the second column to $i_2$, and so forth.
Thus the above example matrix defines a transform that takes $i_1$ to the vector $\begin{pmatrix}0\\2\\2\end{pmatrix}$, and so forth.
Thus we can define $Mx$, the product of a matrix and a vector, as the vector that the transform defined by $M$ maps $x$ into. Say
$$M = \begin{pmatrix} 1 & 2 \\ 2 & 1 \\ 0 & 1 \end{pmatrix},$$
which is a linear transform from $\mathbb{R}^2 \to \mathbb{R}^3$. What happens to, say, $\begin{pmatrix}1\\2\end{pmatrix}$? This vector is $1i_1 + 2i_2$, and so
$$T(x) = Mx = \begin{pmatrix}1\\2\\0\end{pmatrix} + 2\begin{pmatrix}2\\1\\1\end{pmatrix} = \begin{pmatrix}5\\4\\2\end{pmatrix} \qquad (6)$$
This is the same thing we get, woodenheadedly, by computing each element $i$ of the product as the sum of the element-by-element products of row $i$ with the vector. We could compute $MN$ by thinking of $N$ as $k$ column vectors, and the above operation would produce $k$ column vectors with the same number of rows as $M$. Note that the number of columns in $M$ (the dimension of the space we are mapping from) must equal the number of elements in each of the column vectors of $N$.

But we can also think of the matrix product $MN$ as the composition of the transforms corresponding to $M$ and $N$. Say $M$ is $n \times m$ and $N$ is $m \times k$. So $N$ takes $k$-vectors into $m$-vectors and $M$ takes $m$-vectors into $n$-vectors, so we can see $MN$ takes $k$-vectors into $n$-vectors, by first taking them into $m$-vectors and then taking those $m$-vectors into $n$-vectors. To understand the resulting product in matrix form, just think of it as telling us what happens to the $k$ basis vectors after the two transformations. It is easy to see that the $i,j$th element of the product is given by taking the sum of the element-by-element products of row $i$ and column $j$, and that we can only multiply matrices where the number of columns in the first one equals the number of rows in the second.

Note that matrix multiplication is not commutative; indeed $NM$ will usually not even be defined, since $M$ takes $m$-vectors into $n$-vectors but $N$ takes $k$-vectors into $m$-vectors, so $N$ cannot operate on the vectors produced by $M$ unless $k = n$. We can also think of $M + N$ as corresponding to the transform which has $(M + N)(x) = M(x) + N(x)$, always assuming the matrices are of the right dimension (in the jargon, conformable).
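A small worked product (numbers chosen here for illustration): with
$$M = \begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix}, \qquad N = \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix},$$
treating the columns of $N$ as two vectors and applying $M$ to each gives
$$MN = \begin{pmatrix} M\begin{pmatrix}1\\1\end{pmatrix} & M\begin{pmatrix}0\\1\end{pmatrix} \end{pmatrix} = \begin{pmatrix} 3 & 2 \\ 1 & 1 \end{pmatrix},$$
and the $(1,1)$ element, for instance, is the row-times-column sum $1\cdot 1 + 2\cdot 1 = 3$. Reversing the order gives $NM = \begin{pmatrix} 1 & 2 \\ 1 & 3 \end{pmatrix} \neq MN$, so even when both products are defined they need not agree.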
Square Matrices
First, define $0_n$ as the $n \times n$ matrix containing all zeros and $I_n$ as the square ($n \times n$) matrix with ones on the diagonal and zeros elsewhere. $I_n$ is the identity matrix, and the subscript for dimension is often suppressed. Note that $A + 0 = A$ (assuming all are $n \times n$) and that $A0 = 0$, so $0$ behaves like a zero should. Note also that $AI = IA = A$. Thus, in terms of transforms, $I$ corresponds to the linear transform from $n$-space to itself which just leaves each vector alone, that is, $L(x) = x$; $0$ corresponds to the transform which maps all $n$-vectors into an $n$-vector of all zeros.

Finally, if $x = 0$ (a column vector of all zeros), then we see that $Mx = 0$. Are there other, non-zero, vectors that get mapped into zero? This turns out to be critical. Note that if $Mx = 0$ for non-zero $x$, then the elements of $x$ give a non-trivial linear combination of the columns of $M$ which adds to zero, so the columns of $M$ are linearly dependent. If no non-zero $x$ gets mapped to zero, then the columns of $M$ are linearly independent. We define the maximal number of linearly independent rows of $M$ as its row rank, and similarly for the column rank. It is easy to prove that the row and column ranks of a matrix are the same, so we can just talk of the rank of a matrix. If the matrix is $n \times n$ its rank can be no bigger than $n$; if the rank is $n$, we say the matrix has full rank, otherwise it is less than full rank.
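A concrete case (constructed here for illustration): the matrix $M = \begin{pmatrix} 1 & 2 \\ 2 & 4 \end{pmatrix}$ maps the non-zero vector $x = \begin{pmatrix} 2 \\ -1 \end{pmatrix}$ to zero, since $1\cdot 2 + 2\cdot(-1) = 0$ and $2\cdot 2 + 4\cdot(-1) = 0$. Equivalently, the second column is twice the first (and the second row is twice the first), so the rank is $1 < 2$ and $M$ is less than full rank.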
We now ask when a transform (and hence a matrix) has an inverse. For an inverse to exist, the transform must be one to one and onto. One to one is needed because an inverse, being a transform itself, can take a vector into only one vector. But if $L(x) = L(y)$ with $x \neq y$, what is the inverse of $L(x)$; is it $x$ or $y$? Onto means that every vector in the range space is mapped to by something, so that there is an $x$ with $y = L(x)$ for every $y$. If the transform is not onto, then some vectors in the range space have no map from the domain, so how do we then invert?

Note that $L(0) = 0$, since $L(0) = L(x - x) = L(x) - L(x) = 0$. If any other points in the domain get mapped into zero, then the transform is not one to one, and so the inverse does not exist. If nothing other than $0$ gets mapped into $0$, then it is easy to show that the transform is one to one. (Suppose not. Then we would have $L(x) = L(z) = y$ with $x \neq z$, and so $L(x - z) = 0$ but $x - z \neq 0$.) Thus to see if an inverse exists, we need merely look at what gets mapped into zero (the kernel of the transform) and see if only zero gets mapped into zero.

Thus consider the transform $T(x_1, x_2) = (x_1)$, which takes $\mathbb{R}^2 \to \mathbb{R}^1$. Clearly all elements $(0, x_2)$ get mapped to $0$, so the transform is not one to one and hence the inverse does not exist. Now consider mapping $\mathbb{R}^2 \to \mathbb{R}^2$ by $L\begin{pmatrix}x_1\\x_2\end{pmatrix} = \begin{pmatrix}x_1 + x_2\\x_1 - x_2\end{pmatrix}$. Note that $L(0) = 0$. Can $L(x) = 0$ with $x \neq 0$? If so, both $x_1 + x_2 = 0$ and $x_1 - x_2 = 0$; adding the two equations gives $2x_1 = 0$, so this holds only for $x_1 = x_2 = 0$. (Again, note the tie between matrices, linear transforms and solving linear equations.)

First, only square matrices can have inverses. If $M$ is not square, it maps $\mathbb{R}^n \to \mathbb{R}^m$. If
$m > n$, then the transform cannot be onto, since the image of $\mathbb{R}^n$ is at most an $n$-dimensional subspace of $\mathbb{R}^m$. If $m < n$, the same argument applied to the would-be inverse (which would have to map $\mathbb{R}^m$ onto $\mathbb{R}^n$) shows the inverse cannot exist.
So only square matrices have inverses. All we have to do is check that only $0$ gets mapped to $0$. But as we have seen, if something else gets mapped to zero, the columns of the matrix are linearly dependent. Thus a square matrix has an inverse if and only if it is of full rank.
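Continuing the earlier examples (worked out here as a check): the map $L$ above corresponds to the full-rank matrix $\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}$, and its inverse is $\begin{pmatrix} \tfrac12 & \tfrac12 \\ \tfrac12 & -\tfrac12 \end{pmatrix}$, since multiplying the two (in either order) gives $I$. In contrast, the rank-one matrix $\begin{pmatrix} 1 & 2 \\ 2 & 4 \end{pmatrix}$ from the earlier sketch has no inverse: it sends a non-zero vector to $0$, and $0$ cannot be mapped back to two different points.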
Determinant
Determinants are usually treated as annoying things to compute. But the determinant of $A$, $|A|$, is simply the (signed) hypervolume of the parallelepiped defined by either the rows or the columns of $A$.
Note that if one row or column is a linear combination of the others, then the parallelepiped is not of full dimension, that is, it has hypervolume of zero. Thus a matrix has an inverse if and only if its determinant is non-zero.
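A two-dimensional check (example values chosen here): for $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ the determinant is $|A| = ad - bc$, the signed area of the parallelogram with the columns as edges. For $\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}$ this is $-2 \neq 0$, so an inverse exists; for $\begin{pmatrix} 1 & 2 \\ 2 & 4 \end{pmatrix}$ it is $4 - 4 = 0$, the parallelogram collapses to a line, and no inverse exists.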
Quadratic forms
Letting $x$ be a vector and $A$ a conformable matrix, we often deal with quadratic forms $x'Ax$, which is a scalar ($x'$ is $1 \times n$ and $Ax$ is $n \times 1$, so the product is $1 \times 1$, that is, a scalar). A matrix is positive definite (PD) if all quadratic forms involving it are positive (other than the trivial $x = 0$); similarly for negative definite. Positive semi-definite means all quadratic forms involving $A$ are non-negative, and similarly for negative semi-definite. Note that quadratic forms are the matrix analogue of scalar quadratic equations. To see this, let $A = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}$. Then
$$x'Ax = x_1^2 + 5x_1x_2 + 4x_2^2 \qquad (7)$$
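For a quick definiteness check (illustrative, not from the notes): the identity matrix is positive definite since $x'Ix = x_1^2 + x_2^2 > 0$ for any $x \neq 0$, while $A = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$ is neither positive nor negative (semi-)definite, since $x'Ax = x_1^2 - x_2^2$ is positive for $x = \begin{pmatrix}1\\0\end{pmatrix}$ and negative for $x = \begin{pmatrix}0\\1\end{pmatrix}$.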
Matrix calculus
We need matrix calculus to find minima and maxima; matrix calculus here means taking derivatives with respect to a vector $x$, defined by taking the derivative with respect to each element of $x$ and then stacking all the derivatives into a vector. Thus this is purely a notation-saving device and the proofs are trivial.
$$\frac{d\,Ax}{dx} = A \qquad (8)$$
$$\frac{d\,x'Ax}{dx} = 2Ax \qquad (9)$$
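As a check on (9) (a worked sketch; note the formula as written presumes $A$ symmetric, otherwise the derivative is $(A + A')x$): take $A = \begin{pmatrix} 2 & 1 \\ 1 & 3 \end{pmatrix}$, so $x'Ax = 2x_1^2 + 2x_1x_2 + 3x_2^2$. Differentiating element by element gives $\begin{pmatrix} 4x_1 + 2x_2 \\ 2x_1 + 6x_2 \end{pmatrix}$, which is exactly $2Ax$.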