
Math Review: Week 2¹

Yoshifumi Konishi
Ph.D. in Applied Economics
COB 337j
E-mail: [email protected]

¹ This lecture note is adapted from the following sources: Simon & Blume (1994), W. Rudin (1976), A. Takayama (1985), M. Wadachi (2000), and Toda & Asano (2000).
6. Implicit Functions

Up to now, we have studied functions of the form:

$$y = f(x_1, \dots, x_n)$$

where $y$ is a unique endogenous variable and all RHS variables are exogenous. In this form, we say that $y$ is an explicit function of the variables $x_1, \dots, x_n$. But we know that we can rewrite this function as follows:

$$y - f(x_1, \dots, x_n) = 0$$

or,

$$g(x_1, \dots, x_n, y) = 0$$

This function is qualitatively very different from the original function. The $g$ function is a function from $\mathbb{R}^{n+1}$ to $\mathbb{R}^1$, while the $f$ function is a function from $\mathbb{R}^n$ to $\mathbb{R}^1$. But the $g$ function no longer gives $y$ explicitly; rather, it determines the relationship between $y$ and $x_1, \dots, x_n$ by holding the value of $g$ constant. In this form, we say that $y$ is an implicit function of the variables $x_1, \dots, x_n$. Implicit functions appear in many economic applications: it is often impossible to isolate the variable of interest from the other variables so as to obtain a nice explicit function like $y = f(x_1, \dots, x_n)$. In this lecture, we will learn how to analyze such functions. Before we proceed, let's look at some examples.
Example 1.

(i) Consider a profit function:

$$\pi = pF(x) - wx$$

where $p$ and $w$ are the prices of output and input, respectively, and $F$ is a production function. To maximize this, we take the first-order condition:

$$pF'(x) - w = 0$$

So now the optimal value $x$ is an implicit function of $p$ and $w$. Suppose $F(x) = x^{1/2}$. (Question: What is this type of function called in economics? Answer: Decreasing returns to scale.) In this case, we can rewrite it as an explicit function:

$$\frac{p}{2}\, x^{-\frac{1}{2}} - w = 0 \implies x = \left(\frac{2w}{p}\right)^{-2}$$
(ii) Consider a production possibility frontier described by the following relationships:

$$y^2 + 4xy + 4x^2 = 0 \quad (1)$$
$$y^3 - 5xy + 4x^2 = 0 \quad (2)$$

In the case of (1), we can explicitly solve for $y$:

$$(y + 2x)^2 = 0 \implies y = -2x$$

But in the case of (2), we cannot explicitly solve for $y$. Clearly, though, there is some relationship between $x$ and $y$. For example, suppose $x = 1$. Then the identity becomes $y^3 - 5y + 4 = 0$, so $y = 1$ is a solution. When $x = 0$, $y^3 = 0$, so that $y = 0$. Question: Suppose that we can define an implicit function of the form $g(x, y) = c$. Does this necessarily mean there is a function from $x$ to $y$? (We do not need to be able to solve explicitly.) Answer: No. Remember the definition of a function: for each point $x \in X$, there must be at most one point $y \in Y$ such that $y = f(x)$. But consider:

$$y^2 - 3xy - 10x^2 = 0$$

We can factorize this as:

$$(y - 5x)(y + 2x) = 0 \implies y = 5x \ \text{or} \ y = -2x$$

So for each $x$, there is more than one value of $y$. But this case is easily dealt with by restricting the range to $y \ge 0$: on that range there exists a smooth (continuous and differentiable) function $y = 5x$. There is a more problematic case in which we cannot derive a smooth function in the neighborhood of some point $x$. This can be seen graphically (see the graphs). More generally, we have the following well-known theorem:
Theorem 6-1 (Implicit Function Theorem): Let $g : X \subseteq \mathbb{R}^2 \to \mathbb{R}$ be continuously differentiable in a neighborhood $U$ of $(x_0, y_0)$. Suppose $g(x_0, y_0) = c$ and $g_y(x_0, y_0) \neq 0$. Then there exists a $C^1$ function $y = f(x)$ defined on $U$ such that:

(i) $g(x, f(x)) = c$ for all $x \in U$;
(ii) $y_0 = f(x_0)$;
(iii) $\displaystyle \frac{dy}{dx}(x_0) = -\frac{g_x(x_0, y_0)}{g_y(x_0, y_0)}$
Proof: The proof of this theorem is a bit involved, so we will simply prove the equality in (iii). From (i), we know that in the neighborhood $U$ we have:

$$g(x, f(x)) = c$$

Because this is an identity in this neighborhood, differentiate both sides with respect to $x$:

$$g_x(x, f(x)) + g_y(x, f(x)) f'(x) = 0$$

Solving for $f'(x)$ and substituting (ii), we obtain:

$$f'(x_0) = \frac{dy}{dx}(x_0) = -\frac{g_x(x_0, y_0)}{g_y(x_0, y_0)}$$

QED.
This theorem can be extended to the case of more than two variables.

Theorem 6-2 (Implicit Function Theorem): Let $g : X \subseteq \mathbb{R}^{n+1} \to \mathbb{R}$ be continuously differentiable in a neighborhood $U$ of $(x_1^*, \dots, x_n^*, y^*)$. Suppose $g(x_1^*, \dots, x_n^*, y^*) = c$ and $g_y(x_1^*, \dots, x_n^*, y^*) \neq 0$. Then there exists a function $y = f(x_1, \dots, x_n)$ defined on $U$ such that:

(i) $g(x_1, \dots, x_n, f(x_1, \dots, x_n)) = c$ for all $(x_1, \dots, x_n) \in U$;
(ii) $y^* = f(x_1^*, \dots, x_n^*)$;
(iii) for each $i$,

$$\frac{dy}{dx_i}(x_1^*, \dots, x_n^*) = -\frac{g_{x_i}(x_1^*, \dots, x_n^*, y^*)}{g_y(x_1^*, \dots, x_n^*, y^*)}$$
Example 2. Consider the level set $g(x, y) = x^2 - 3xy + y^3 = 9$ (note that $(3, 3)$ lies on this level set). Let's find $dy/dx$ at $(x, y) = (3, 3)$.

Check that $g_y(3, 3) \neq 0$: $g_y = -3x + 3y^2$, so $g_y(3, 3) = -3 \cdot 3 + 3 \cdot 3^2 = -9 + 27 = 18$. Now compute $g_x(3, 3)$:

$$g_x(3, 3) = 2x - 3y \,\big|_{(3,3)} = 2 \cdot 3 - 3 \cdot 3 = -3$$

Use the implicit function theorem to get:

$$\frac{dy}{dx}(3, 3) = -\frac{g_x(3, 3)}{g_y(3, 3)} = -\frac{(-3)}{18} = \frac{1}{6}$$
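
To make this concrete, here is a small computational check of Example 2 (a sketch, assuming the SymPy library is available; the variable names are my own):

```python
# Verify dy/dx = -g_x/g_y at (3, 3) symbolically.
import sympy as sp

x, y = sp.symbols("x y")
g = x**2 - 3*x*y + y**3

gx = sp.diff(g, x)                 # partial derivative g_x
gy = sp.diff(g, y)                 # partial derivative g_y

slope = (-gx / gy).subs({x: 3, y: 3})
print(slope)                       # -> 1/6, matching the hand computation
```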
Similarly, we can even consider a system of implicit functions:

$$g_1(x_1, \dots, x_n) = 0$$
$$g_2(x_1, \dots, x_n) = 0$$
$$\vdots$$
$$g_m(x_1, \dots, x_n) = 0$$

Not surprisingly, there is a corresponding implicit function theorem for such a system, but it is beyond the scope of this lecture. You won't see its use unless you attend a real analysis course or more advanced courses. We conclude this section by discussing the graphical properties of implicit functions.
Geometrically, implicit functions in $\mathbb{R}^n$ define level sets in $\mathbb{R}^n$. For example, $g(x, y) = ax + by = c$ defines a line in $\mathbb{R}^2$; $g(x, y) = x^2 + y^2 = c$ defines a circle in $\mathbb{R}^2$; $g(x, y, z) = x^2 + y^2 + z^2 = c$ defines a sphere in $\mathbb{R}^3$. More formally, recall the definition of level sets:

Definition 5-1 (Level Sets): Suppose $f : X \to \mathbb{R}^1$ where $X \subseteq \mathbb{R}^n$. Then the level set for a point $a \in f(X)$ is the set $L \subseteq X$ such that:

$$L(a) = \{x \in X : f(x) = a\}$$

We will prove the following result, which seems very intuitive: the gradient vector $\nabla g$ of the implicit function at a point is perpendicular to the tangent line (or plane) of the level set at that point.
Theorem 6-3: Let $g : X \subseteq \mathbb{R}^2 \to \mathbb{R}$ be continuously differentiable in a neighborhood $U$ of $(x^*, y^*)$. Suppose $\nabla g(x^*, y^*) \neq 0$. Then $\nabla g(x^*, y^*)$ is perpendicular to the level set of $g$ at $(x^*, y^*)$.

Proof: By definition,

$$\nabla g(x^*, y^*) = \begin{pmatrix} g_x(x^*, y^*) \\ g_y(x^*, y^*) \end{pmatrix} \neq 0$$

If $g_y(x^*, y^*) = 0$, then the derivative along the level set at that point, $dy/dx$, is $+\infty$ or $-\infty$. This implies that the tangent line at that point is vertical. On the other hand, the gradient vector is then:

$$\nabla g(x^*, y^*) = \begin{pmatrix} a \\ 0 \end{pmatrix} \quad \text{for some } a \in \mathbb{R}$$

This means that the gradient vector is horizontal (see the graph), so it is perpendicular to the tangent line. Now consider the case where $g_y(x^*, y^*) \neq 0$. In this case, from the Implicit Function Theorem, the slope of the tangent line to the level set is:

$$\frac{dy}{dx}(x^*, y^*) = -\frac{g_x(x^*, y^*)}{g_y(x^*, y^*)}$$

Thus, the direction vector of the tangent line can be written (see the graph again):

$$v = \left(1, \ -\frac{g_x(x^*, y^*)}{g_y(x^*, y^*)}\right)$$

We know that two vectors are perpendicular to each other if and only if $v \cdot \nabla g(x^*, y^*) = 0$. Clearly,

$$v \cdot \nabla g(x^*, y^*) = \left(1, \ -\frac{g_x(x^*, y^*)}{g_y(x^*, y^*)}\right) \cdot \begin{pmatrix} g_x(x^*, y^*) \\ g_y(x^*, y^*) \end{pmatrix} = g_x(x^*, y^*) - \frac{g_x(x^*, y^*)}{g_y(x^*, y^*)}\, g_y(x^*, y^*) = 0$$

QED
7. Linear Algebra: Systems of Linear Equations

Systems of linear equations arise naturally in economic applications. Even when we have a system of nonlinear equations such as:

$$y = f(x, y), \qquad x = g(x, y)$$

it is very common for us to linearize the system (by using a first-order Taylor expansion, for example). Suppose that we have a system of linear equations of the form:

$$\begin{aligned} x_1 - 2x_2 &= 8 \\ 3x_1 + x_2 &= 3 \end{aligned} \quad (1)$$

We can rewrite this system in matrix form:

$$\begin{pmatrix} 1 & -2 \\ 3 & 1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 8 \\ 3 \end{pmatrix}$$

In general, we can represent a system of linear equations in matrix form:

$$A_{(m \times n)}\, x_{(n \times 1)} = b_{(m \times 1)}$$

We know that we could solve the system if $n \le m$ when there are $n$ unknowns and $m$ equations. But is that a sufficient condition? We will learn in this section:
(i) When do we have solutions?
(ii) When can we say the solution is unique?
(iii) Is there an efficient algorithm that computes the solutions?

There are essentially three ways of solving such a system:
(i) Substitution;
(ii) Elimination of variables;
(iii) Matrix methods.

These are probably what you learned in your high school algebra. But there are many fancy algorithms within each solution category. The efficiency of these methods depends on the properties of the system in question.
Example 1. Substitution

Consider the system (1) again. We can easily solve the first equation for $x_1$: $x_1 = 8 + 2x_2$. Substitute this into the other equation:

$$3(8 + 2x_2) + x_2 = 3$$
$$24 + 7x_2 = 3 \quad \text{or} \quad 7x_2 = -21$$
$$\Rightarrow x_2 = -3$$

Substituting this back into the first equation, we have:

$$\Rightarrow x_1 = 8 + 2(-3) = 2$$
Example 2. Elimination of variables

Consider (1) again. Let's first multiply both sides of the first equation by 3:

$$3x_1 - 6x_2 = 24$$

Subtract the second equation from this new equation:

$$(3x_1 - 6x_2) - (3x_1 + x_2) = 24 - 3 \implies -7x_2 = 21$$
$$\Rightarrow x_2 = -3 \Rightarrow x_1 = 2$$

These solution algorithms are easy to implement but can be very inefficient if we have a large number of unknowns and equations. In economic applications, we may have thousands of equations. In such a case, it is more convenient to work with a matrix representation.
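
For comparison, the same $(2 \times 2)$ system can be handed directly to a linear-algebra library; a minimal sketch (NumPy assumed):

```python
# Check Examples 1-2 numerically.
import numpy as np

A = np.array([[1.0, -2.0],
              [3.0,  1.0]])      # coefficient matrix of system (1)
b = np.array([8.0, 3.0])

x = np.linalg.solve(A, b)        # internally uses an LU factorization
print(x)                         # -> [ 2. -3.], i.e. x1 = 2, x2 = -3
```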
$$A_{(m \times n)}\, x_{(n \times 1)} = b_{(m \times 1)}$$

The matrix $A$ is called the coefficient matrix of the system. We often create the augmented matrix $\hat{A}$ by adding the column vector corresponding to the right-hand side:

$$\hat{A} = (A \mid b) = \left(\begin{array}{cc|c} 1 & -2 & 8 \\ 3 & 1 & 3 \end{array}\right)$$

The following elementary row operations do not change the solution set of a linear system (i.e., they result in an equivalent system):

(i) Interchange two rows of the matrix;

$$\left(\begin{array}{cc|c} 1 & -2 & 8 \\ 3 & 1 & 3 \end{array}\right) \Rightarrow \left(\begin{array}{cc|c} 3 & 1 & 3 \\ 1 & -2 & 8 \end{array}\right)$$

(ii) Add one row to another;

$$\left(\begin{array}{cc|c} 1 & -2 & 8 \\ 3 & 1 & 3 \end{array}\right) \Rightarrow \left(\begin{array}{cc|c} 1+3 & -2+1 & 8+3 \\ 3 & 1 & 3 \end{array}\right)$$

(iii) Multiply a row through by a nonzero number.

$$\left(\begin{array}{cc|c} 1 & -2 & 8 \\ 3 & 1 & 3 \end{array}\right) \Rightarrow \left(\begin{array}{cc|c} a & -2a & 8a \\ 3 & 1 & 3 \end{array}\right)$$

Notice that the operations we performed in the elimination of variables correspond to these elementary row operations.
The purpose of performing row operations is to create a matrix of the following form:

$$\left(\begin{array}{ccc|c} a_{11} & a_{12} & a_{13} & b_1 \\ 0 & a_{22} & a_{23} & b_2 \\ 0 & 0 & a_{33} & b_3 \end{array}\right)$$

This matrix form is called row echelon form.

Definition 7-1 (Leading Zeros & Row Echelon Form): A row of a matrix is said to have $k$ leading zeros if the first $k$ elements of the row are all zeros and the $(k+1)$-th element is nonzero. A matrix is said to be in row echelon form if each row of the matrix has more leading zeros than the row preceding it.
Example 3. Back-substitution and Gaussian Elimination

To see why the row echelon form is useful, consider solving the system:

$$\begin{pmatrix} a_{11} & a_{12} & a_{13} \\ 0 & a_{22} & a_{23} \\ 0 & 0 & a_{33} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} b_1 \\ b_2 \\ b_3 \end{pmatrix}$$

First, look at the last row. We can easily solve it: $x_3 = b_3 / a_{33}$. Then substitute this into the second equation, which is $a_{22} x_2 + a_{23} x_3 = b_2$; this gives us the solution for $x_2$. Then we can substitute these into the first equation and get $x_1$. So we can easily solve the system. This method is called back-substitution. We can apply this logic to create an efficient computer algorithm that solves thousands of equations in a second. A more general method is Gaussian elimination, an algorithm that transforms any nonsingular matrix into two triangular matrices, so that $Ax = LUx = b$, and applies back-substitution twice: to $Lz = b$ and then to $Ux = z$.
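
A minimal sketch of back-substitution for an upper-triangular system (the matrix $U$ and vector $b$ below are illustrative, not from the lecture):

```python
# Solve Ux = b from the bottom row up, assuming nonzero diagonal entries.
import numpy as np

def back_substitute(U: np.ndarray, b: np.ndarray) -> np.ndarray:
    n = len(b)
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        known = U[i, i + 1:] @ x[i + 1:]   # contribution of solved unknowns
        x[i] = (b[i] - known) / U[i, i]
    return x

U = np.array([[2.0, 1.0, 1.0],
              [0.0, 3.0, 2.0],
              [0.0, 0.0, 4.0]])
b = np.array([4.0, 5.0, 8.0])
print(back_substitute(U, b))               # matches np.linalg.solve(U, b)
```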
7-1. Rank and Solutions to a Linear System

As we have studied, a linear equation of the form:

$$a_{11} x_1 + a_{12} x_2 = b_1$$

defines a line in two-dimensional space. Thus, the solution to the system of two equations:

$$\begin{aligned} a_{11} x_1 + a_{12} x_2 &= b_1 \\ a_{21} x_1 + a_{22} x_2 &= b_2 \end{aligned}$$

is a point $(x_1^*, x_2^*)$ in two-dimensional space at which the two lines cross each other. However, if one equation is simply a linear transformation of the other, say,

$$\begin{aligned} x_1 + 2x_2 &= 8 \\ 2x_1 + 4x_2 &= 16 \end{aligned}$$

then the two equations represent the same line in this space. Thus, the system has infinitely many solutions, of the form $x_1 = 8 - 2x_2$. On the contrary, consider the system:

$$\begin{aligned} x_1 + 2x_2 &= 3 \\ x_1 + 2x_2 &= 4 \end{aligned}$$

Clearly, these two equations represent parallel lines that do not cross each other. Thus, the system has no solution. In general, whether a system has zero, one, or infinitely many solutions depends on the characteristics of the coefficient matrix $A$ — notably, the rank of $A$.
Definition 7-2 (Rank): The rank of a matrix $A$ is the number of nonzero rows in its (reduced) row echelon form. We write rank($A$).

You probably wonder whether the echelon form is unique. In fact, it is not: a matrix can have many different echelon forms with different entries. However, the number of nonzero rows in its echelon form is unique and does not depend on how we compute the echelon form.

Definition 7-3 (Reduced Echelon Form): A matrix is said to be in reduced echelon form if each nonzero row has a one in its pivot position and each column containing a pivot has no other nonzero entries.

Example 4. The following matrix is in reduced echelon form:

$$\begin{pmatrix} 1 & a & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & a \\ 0 & 0 & 0 & 1 & a \end{pmatrix}$$

Example 5. Consider creating the reduced echelon form of the following:

$$\begin{pmatrix} 0 & 2 \\ 5 & 3 \end{pmatrix}
\xrightarrow{\text{add 2nd to 1st}}
\begin{pmatrix} 5 & 5 \\ 5 & 3 \end{pmatrix}
\xrightarrow{\text{subtract 1st from 2nd}}
\begin{pmatrix} 5 & 5 \\ 0 & -2 \end{pmatrix}
\xrightarrow{\text{divide 1st by 5}}
\begin{pmatrix} 1 & 1 \\ 0 & -2 \end{pmatrix}
\xrightarrow{\text{divide 2nd by } -2}
\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}
\xrightarrow{\text{subtract 2nd from 1st}}
\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$$

So the rank of the matrix is 2. In general, I usually compute the reduced echelon form in order to compute the rank, because we want to be sure that no further row operations can eliminate nonzero rows.
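
In numerical practice one rarely row-reduces by hand; a sketch using NumPy's rank routine (which works via a singular-value decomposition rather than exact row operations):

```python
import numpy as np

A = np.array([[0.0, 2.0],
              [5.0, 3.0]])
print(np.linalg.matrix_rank(A))   # -> 2, as found by hand above

B = np.array([[1.0, 2.0],
              [2.0, 4.0]])        # second row is twice the first
print(np.linalg.matrix_rank(B))   # -> 1
```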
Fact 1. Let $A$ be an $(m \times n)$-matrix. Then,
(i) rank($A$) $\le m$, the number of rows of $A$;
(ii) rank($A$) $\le n$, the number of columns of $A$.

Proof: (i) is obvious from the definition: the rank of $A$ is at most the number of rows of $A$, because the rank of $A$ is the number of nonzero rows in its echelon form. To see (ii), suppose first $m \le n$. Then, from (i), we have rank($A$) $\le m \le n$. Now suppose that $n < m$. Then $(m - n)$ rows of $A$ can be eliminated by elementary row operations. Thus, rank($A$) $\le n$. QED

A corollary to this fact is the following: Let $A$ be an $(n \times m)$-matrix and $B$ an $(m \times l)$-matrix. Then,

$$\operatorname{rank}(AB) \le \min\{\operatorname{rank}(A), \operatorname{rank}(B)\}$$

We will see why this is true in the next section.
Definition 7-4 (Homogeneous System): An equation whose constant term is 0 is called a homogeneous equation. A system of linear equations $Ax = b$ with $b = 0$ is said to be a homogeneous system.

Now we are ready to state the following theorem. The proof is very time-consuming, so I will only discuss the intuitions behind it.

Theorem 7-1: Consider a system of linear equations of the form:

$$A_{(m \times n)}\, x_{(n \times 1)} = b_{(m \times 1)}$$

(a) When $m < n$ (i.e., the number of equations is less than the number of variables):
(i) For every $b$, $Ax = b$ has either 0 or infinitely many solutions.
(ii) If rank($A$) $= m$, then $Ax = b$ has infinitely many solutions for every $b$.
(b) When $m > n$ (i.e., the number of equations is more than the number of variables):
(i) $Ax = 0$ has one or infinitely many solutions.
(ii) For every $b$, $Ax = b$ has either 0, 1, or infinitely many solutions.
(iii) If rank($A$) $= n$, then $Ax = b$ has 0 or 1 solution for every $b$.
(c) When $m = n$ (i.e., the number of equations is the same as that of variables):
(i) $Ax = 0$ has one or infinitely many solutions.
(ii) For every $b$, $Ax = b$ has either 0, 1, or infinitely many solutions.
(iii) If rank($A$) $= n = m$, then $Ax = b$ has exactly 1 solution for every $b$.
Example 6. To see why (a)-(i) and (a)-(ii) hold, consider the cases:

$$\begin{pmatrix} 0 & 0 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = b \implies 0 = b \quad \text{(no solution unless } b = 0\text{; rank}(A) = 0)$$

$$\begin{pmatrix} 1 & 1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = b \implies x_1 + x_2 = b \quad \text{(infinitely many solutions; rank}(A) = 1)$$
Example 7. To see why (b)-(i) holds, consider the case:

$$\begin{pmatrix} a_1 \\ a_2 \end{pmatrix} x = \begin{pmatrix} 0 \\ 0 \end{pmatrix} \implies \begin{cases} x \text{ can be anything} & \text{if } a_1 = a_2 = 0 \\ x = 0 & \text{if } a_1 \neq a_2 \text{, or } a_1 = a_2 \neq 0 \end{cases}$$

To see why (b)-(ii) and (b)-(iii) hold, consider the cases:

$$\begin{pmatrix} 1 \\ 1 \end{pmatrix} x = \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} \implies \begin{cases} x = b_1 = b_2 & \text{if } b_1 = b_2 \\ \text{no solution} & \text{if } b_1 \neq b_2 \end{cases} \quad (\text{rank}(A) = 1)$$

$$\begin{pmatrix} 0 \\ 0 \end{pmatrix} x = \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} \implies x \text{ can be anything (when } b = 0\text{)}.$$
Example 8. To see why (c)-(i) holds, consider the case:

$$\begin{pmatrix} 1 & 1 \\ a & 1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix} \implies \begin{aligned} x_1 + x_2 &= 0 \\ a x_1 + x_2 &= 0 \end{aligned}$$

In this case, if $a = 1$, then there are infinitely many solutions. But if $a \neq 1$, then $x_1 = -x_2$ and $a x_1 - x_1 = 0$, so $x_1 = x_2 = 0$. Note also that if $a = 1$, then

$$\begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix} \xrightarrow{\text{subtract 1st from 2nd}} \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix}$$

Thus rank($A$) $= 1 < 2$. But if $a \neq 1$, then

$$\begin{pmatrix} 1 & 1 \\ a & 1 \end{pmatrix}
\xrightarrow{\text{multiply 1st by } a}
\begin{pmatrix} a & a \\ a & 1 \end{pmatrix}
\xrightarrow{\text{subtract 1st from 2nd}}
\begin{pmatrix} a & a \\ 0 & 1-a \end{pmatrix}
\xrightarrow{\text{divide 1st by } a}
\begin{pmatrix} 1 & 1 \\ 0 & 1-a \end{pmatrix}
\xrightarrow{\text{divide 2nd by } 1-a}
\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}
\xrightarrow{\text{subtract 2nd from 1st}}
\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$$

Thus rank($A$) $= 2$. Therefore, we had exactly one solution when $a \neq 1$.

Note that in economic applications, we would like to have exactly one solution. If we have more than one solution, or no solution, then we are in trouble. Thus, in view of Theorem 7-1, the ideal case is (c)-(iii). So we frequently make assumptions that guarantee (c)-(iii).
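
One way to check which case of Theorem 7-1 applies is the standard consistency test comparing rank($A$) with the rank of the augmented matrix; this test is not stated in the notes, so the sketch below assumes it:

```python
# Classify a system Ax = b by comparing rank(A) with rank([A | b]).
import numpy as np

def classify(A: np.ndarray, b: np.ndarray) -> str:
    n = A.shape[1]
    rA = np.linalg.matrix_rank(A)
    rAb = np.linalg.matrix_rank(np.column_stack([A, b]))
    if rA < rAb:
        return "no solution"
    return "unique solution" if rA == n else "infinitely many solutions"

A = np.array([[1.0, 1.0], [1.0, 1.0]])
print(classify(A, np.array([0.0, 0.0])))  # infinitely many solutions
print(classify(A, np.array([3.0, 4.0])))  # no solution
```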
Suppose, for example, that we have a system of $m$ linear equations in $n$ unknowns, $Ax = b$, and suppose further that $m < n$. As we have seen, in such a system there is no guarantee of a unique solution. However, suppose that some of the variables can be determined exogenously — say, by policy instruments (e.g., interest rates, money supply, etc., in the case of the IS-LM model). We may then be able to create a reduced system in only the $m$ endogenous variables of interest such that the reduced system has exactly one solution. This is what the Linear Implicit Function Theorem states:

Theorem 7-2 (Linear Implicit Function Theorem): Consider a system of linear equations, $Ax = b$ with $m < n$. Consider a partition of the unknown variables:

$$x = \begin{pmatrix} x' \\ x'' \end{pmatrix}, \qquad x' = \begin{pmatrix} x_1 \\ \vdots \\ x_k \end{pmatrix} \ \text{(endogenous variables)}, \qquad x'' = \begin{pmatrix} x_{k+1} \\ \vdots \\ x_n \end{pmatrix} \ \text{(exogenous variables)}$$

Then the system has a unique solution for every choice of $x'' \in \mathbb{R}^{n-k}$ if and only if (i) $k = m$ and (ii) the corresponding coefficient matrix (for the endogenous variables) has full rank:

$$\operatorname{rank} \begin{pmatrix} a_{11} & \cdots & a_{1k} \\ \vdots & \ddots & \vdots \\ a_{k1} & \cdots & a_{kk} \end{pmatrix} = k = m$$
8. Matrix Algebra

In this course, it is assumed that you have sufficient knowledge of matrix algebra. However, in economics and econometrics you are likely to encounter very cumbersome computations. Therefore, it is useful to review elementary as well as advanced rules of matrix algebra. Those who are not familiar with matrix algebra are strongly encouraged to take some time to work on the exercises in Chapter 8 of Simon & Blume.
8-1. Basic Rules

When we speak of an $(m \times n)$-matrix, we have an array of data sorted into $m$ rows and $n$ columns. It is common to index each entry of the data as follows:

$$\begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix}$$

where we say the entry in the $i$-th row and $j$-th column is the $(i, j)$-element of the matrix. We will review the basic matrix operations with examples.
Addition & Subtraction

Addition and subtraction of matrices are defined only when the matrices are of the same size. We simply add and subtract element by element.

$$A = \begin{pmatrix} 2 & 3 \\ 1 & 1 \\ 0 & 2 \end{pmatrix}, \qquad B = \begin{pmatrix} 1 & 0 \\ 2 & 1 \\ 5 & 2 \end{pmatrix}$$

$$A + B = \begin{pmatrix} 3 & 3 \\ 3 & 2 \\ 5 & 4 \end{pmatrix}, \qquad A - B = \begin{pmatrix} 1 & 3 \\ -1 & 0 \\ -5 & 0 \end{pmatrix}$$
Scalar Multiplication

For any $r \in \mathbb{R}$ and any $A \in \mathcal{M}_{m \times n}(\mathbb{R})$, we can define scalar multiplication:

$$rA = r \begin{pmatrix} 2 & 3 \\ 1 & 1 \\ 0 & 2 \end{pmatrix} = \begin{pmatrix} 2r & 3r \\ r & r \\ 0 & 2r \end{pmatrix}$$
Matrix Multiplication

We can define the matrix product $AB$ if and only if the number of columns of $A$ equals the number of rows of $B$. That is, the product is defined for any combination of $k, m, n$ such that:

$$A_{(k \times m)}\, B_{(m \times n)}$$

The vector product is a special case of matrix multiplication. Question: How do we compute the following?

$$\begin{pmatrix} 2 & 1 & 1 \end{pmatrix} \begin{pmatrix} 2 \\ 3 \\ 1 \end{pmatrix} = 2 \cdot 2 + 1 \cdot 3 + 1 \cdot 1 = 4 + 3 + 1 = 8$$

Question: How about the following? Is it well-defined? Answer: Yes. We basically repeat the vector multiplication and arrange the resulting products according to the following rule:

$$i\text{-th row} \cdot j\text{-th column} = (i, j)\text{-element of the resulting matrix}$$

$$\begin{pmatrix} a & b \\ c & d \\ e & f \end{pmatrix} \begin{pmatrix} A & B \\ C & D \end{pmatrix} = \begin{pmatrix} aA + bC & aB + bD \\ cA + dC & cB + dD \\ eA + fC & eB + fD \end{pmatrix}$$

Because we collect the $(i, j)$-elements in the resulting matrix, with the index $i$ coming from the first matrix and $j$ from the second, matrix multiplication yields a $(k \times n)$-matrix:

$$A_{(k \times m)}\, B_{(m \times n)} = C_{(k \times n)}$$

In general, the $(i, j)$-element of the resulting matrix is:

$$c_{ij} = \begin{pmatrix} a_{i1} & \cdots & a_{im} \end{pmatrix} \begin{pmatrix} b_{1j} \\ \vdots \\ b_{mj} \end{pmatrix} = a_{i1} b_{1j} + a_{i2} b_{2j} + \dots + a_{im} b_{mj} = \sum_{h=1}^{m} a_{ih} b_{hj}$$
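
The summation rule translates directly into a triple loop; a sketch (the helper name `matmul` is mine, and in practice one would use the optimized `A @ B`):

```python
# Literal implementation of c_ij = sum_h a_ih * b_hj.
import numpy as np

def matmul(A: np.ndarray, B: np.ndarray) -> np.ndarray:
    k, m = A.shape
    m2, n = B.shape
    assert m == m2, "columns of A must equal rows of B"
    C = np.zeros((k, n))
    for i in range(k):
        for j in range(n):
            for h in range(m):
                C[i, j] += A[i, h] * B[h, j]
    return C

A = np.array([[2.0, 1.0, 1.0]])
B = np.array([[2.0], [3.0], [1.0]])
print(matmul(A, B))   # -> [[8.]], matching the vector product above
```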
Laws of Matrix Algebra

We have the following laws whenever the operations are well-defined:

(i) Associative Laws:
$$(A + B) + C = A + (B + C), \qquad (AB)C = A(BC)$$

(ii) Commutative Law of Addition:
$$A + B = B + A$$

(iii) Distributive Laws:
$$A(B + C) = AB + AC, \qquad (A + B)C = AC + BC$$

These are exactly the same laws as for real numbers. But there is one important law that real numbers satisfy and matrices don't: the commutative law of multiplication,

$$\forall a, b \in \mathbb{R}, \quad ab = ba$$

It is not necessarily true that $AB = BA$, even when both products are well-defined. Consider:

$$AB = \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} 1 & -1 \\ 0 & 2 \end{pmatrix} = \begin{pmatrix} 2 & 0 \\ 1 & 1 \end{pmatrix}, \qquad
BA = \begin{pmatrix} 1 & -1 \\ 0 & 2 \end{pmatrix} \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 2 & 2 \end{pmatrix}$$
But there can be matrices that do satisfy this equality. One example is the identity matrix.

Definition 8-1 (Identity Matrix): The identity matrix is an $(n \times n)$-matrix with entries satisfying:

$$a_{ij} = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{if } i \neq j \end{cases}$$

That is,

$$I = \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{pmatrix}$$

The identity matrix has the following property (check it yourself): for any $(k \times n)$-matrix $A$, $AI = A$, and for any $(n \times k)$-matrix $B$, $IB = B$. Therefore, if $A$ is an $(n \times n)$-matrix, we have:

$$AI = IA = A$$
Definition 8-2 (Transpose): The transpose of an $(m \times n)$-matrix $A$ is the $(n \times m)$-matrix obtained by interchanging the rows and columns of $A$. That is, the transpose, denoted $A^T$, is the matrix such that:

$$a'_{ji} = a_{ij}$$

where $a_{ij}$ is the $(i, j)$-element of $A$ and $a'_{ji}$ the $(j, i)$-element of $A^T$.

We have the following rules for transposes:

Theorem 8-1 (Transpose Rules): Let $A$, $B$ be arbitrary matrices. We have the following rules whenever the operations are well-defined:
(i) $(A + B)^T = A^T + B^T$;
(ii) $(A - B)^T = A^T - B^T$;
(iii) $(A^T)^T = A$;
(iv) $(rA)^T = rA^T$;
(v) $(AB)^T = B^T A^T$.

You are asked to verify these in your HW.
8-2. Special Matrices

We will encounter different kinds of problems in economic applications. It is useful to learn the terminology for special matrices:

Square matrix. Any $(n \times n)$-matrix is called a square matrix.

Diagonal matrix. A diagonal matrix is a square matrix in which all non-diagonal entries are zero.

$$D = \begin{pmatrix} a_{11} & 0 & \cdots & 0 \\ 0 & a_{22} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & a_{nn} \end{pmatrix}$$

Upper-triangular matrix. An upper-triangular matrix is a matrix (usually square) in which all entries below the diagonal are zero.

$$U = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ 0 & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & a_{nn} \end{pmatrix}$$

Lower-triangular matrix. A lower-triangular matrix is a matrix (usually square) in which all entries above the diagonal are zero.

$$L = \begin{pmatrix} a_{11} & 0 & \cdots & 0 \\ a_{21} & a_{22} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{pmatrix}$$

Symmetric matrix. A symmetric matrix is a square matrix $A$ such that $A = A^T$, i.e., $a_{ij} = a_{ji}$ for all $i, j$.

$$S = \begin{pmatrix} a & b & c \\ b & d & e \\ c & e & f \end{pmatrix}$$

Idempotent matrix. An idempotent matrix is a square matrix $A$ such that $AA = A$.

$$MM = \begin{pmatrix} 5 & -5 \\ 4 & -4 \end{pmatrix} \begin{pmatrix} 5 & -5 \\ 4 & -4 \end{pmatrix} = \begin{pmatrix} 5 & -5 \\ 4 & -4 \end{pmatrix}$$

Permutation matrix. A square matrix of zeros and ones in which each row and each column contains exactly one 1. It is called a permutation matrix because it permutes the entries of a matrix.

$$P = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \qquad PA = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} a & b \\ c & d \\ e & f \end{pmatrix} = \begin{pmatrix} c & d \\ a & b \\ e & f \end{pmatrix}$$

Nonsingular matrix. A nonsingular matrix is a square matrix of full rank (i.e., rank($A$) $= n$ when $A$ is an $(n \times n)$-matrix). We will learn more about this matrix later.
8-3. Elementary Matrices

Recall the three elementary row operations:
(i) Interchanging rows;
(ii) Adding a multiple of one row to another;
(iii) Multiplying a row by a nonzero scalar.

These row operations can be performed on a matrix $A$ by premultiplying $A$ by special matrices called elementary matrices.

Theorem 8-2 (Elementary Matrices):
(i) Let $E_{ij}$ denote the permutation matrix obtained by interchanging the $i$-th and $j$-th rows of an $(n \times n)$ identity matrix. Then the left-multiplication $E_{ij} A$ has the effect of interchanging the $i$-th and $j$-th rows of any $(n \times m)$-matrix $A$.
(ii) Let $E_i(r)$ denote the matrix obtained by multiplying the $i$-th row of an $(n \times n)$ identity matrix by $r$. Then the left-multiplication $E_i(r) A$ has the effect of multiplying the $i$-th row of any $(n \times m)$-matrix $A$ by $r$.
(iii) Let $E_{ij}(r)$ denote the matrix obtained by inserting $r$ in the $(j, i)$-position of an $(n \times n)$ identity matrix. Then the left-multiplication $E_{ij}(r) A$ has the effect of adding $r$ times row $i$ to row $j$ of any $(n \times m)$-matrix $A$.

The matrices $E_{ij}$, $E_i(r)$, $E_{ij}(r)$ are called elementary matrices.
Example 1. Consider a $(3 \times 3)$ identity matrix:

$$I = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$

Then,

$$E_{12} = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$

This is simply a permutation matrix. We have seen that this matrix has the effect of interchanging the first and second rows of any matrix $A$ with three rows. How about $E_2(5)$?

$$E_2(5) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 5 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \qquad E_2(5) A = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 5 & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} a & b \\ c & d \\ e & f \end{pmatrix} = \begin{pmatrix} a & b \\ 5c & 5d \\ e & f \end{pmatrix}$$

Finally, $E_{23}(5)$ is:

$$E_{23}(5) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 5 & 1 \end{pmatrix}, \qquad E_{23}(5) A = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 5 & 1 \end{pmatrix} \begin{pmatrix} a & b \\ c & d \\ e & f \end{pmatrix} = \begin{pmatrix} a & b \\ c & d \\ 5c + e & 5d + f \end{pmatrix}$$
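
A quick sketch reproducing Example 1's elementary matrices numerically (the test matrix $A$ is illustrative):

```python
import numpy as np

I = np.eye(3)
E12 = I[[1, 0, 2], :]             # interchange rows 1 and 2 of I
E2_5 = np.diag([1.0, 5.0, 1.0])   # multiply row 2 of I by 5
E23_5 = np.eye(3)
E23_5[2, 1] = 5.0                 # add 5 times row 2 to row 3

A = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])
print(E12 @ A)     # first two rows swapped
print(E2_5 @ A)    # second row scaled by 5
print(E23_5 @ A)   # third row replaced by 5*(row 2) + (row 3)
```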
8-4. Inverse Matrices

Thus far, we have seen addition, subtraction, and multiplication. How about division? Can we define division on matrices just as for numbers, such as $1/a$? Formally, the inverse element of a real number $a$ is defined as a number $b$ such that:

$$ab = ba = 1$$

How about applying the same definition to matrices? Because the identity matrix plays the role of 1 in matrix operations, an appropriate definition would be:

$$AB = BA = I$$

Question: Can we define such matrices for all matrices $A \in \mathcal{M}_{m \times n}(\mathbb{R})$? Answer: No. If we force this definition on an arbitrary matrix $A \in \mathcal{M}_{m \times n}(\mathbb{R})$, then $B$ must be an $(n \times m)$-matrix. But then,

$$A_{(m \times n)}\, B_{(n \times m)} = C_{(m \times m)}, \qquad B_{(n \times m)}\, A_{(m \times n)} = D_{(n \times n)}$$

and we can never have $C_{(m \times m)} = D_{(n \times n)}$ unless $m = n$. Thus, if this definition is to work at all, it must be on $(n \times n)$-matrices (i.e., square matrices).

Definition 8-3 (Inverse Matrix): Let $A$ be an $(n \times n)$ square matrix. The matrix $B \in \mathcal{M}_{n \times n}(\mathbb{R})$ is said to be an inverse of $A$ if:

$$AB = BA = I_{n \times n}$$

When the inverse exists, we say that $A$ is invertible, and we write $B = A^{-1}$.
Theorem 8-3 (Uniqueness of the Inverse): Any square matrix $A$ has at most one inverse.

Proof: Suppose that $B$ and $C$ are both inverses of $A$. Then,

$$C = CI = C(AB) = (CA)B = IB = B$$

So $B$ must necessarily equal $C$. QED
We have the following important theorem:

Theorem 8-4 (Equivalence): For any square matrix $A$, the following statements are equivalent:
(i) $A$ is invertible;
(ii) every system of linear equations $Ax = b$ has a unique solution, for every $b \in \mathbb{R}^n$;
(iii) $A$ is nonsingular;
(iv) $A$ has full rank.

(iii) ⇔ (iv) is just by definition. To see why (i) ⇒ (ii), consider $Ax = b$. Since $A$ is invertible, $A^{-1}$ exists. Premultiply both sides of the equation:

$$A^{-1} A x = I x = x = A^{-1} b$$

$A^{-1}$ is derived from $A$ and does not depend on $x$, so this gives us a unique solution. The argument can be reversed. We have seen (ii) ⇔ (iii) in Theorem 7-1.
We have convenient computational rules for inverse matrices.

Theorem 8-5: Let $A$, $B$ be invertible square matrices. Then,
(i) $(A^{-1})^{-1} = A$;
(ii) $(A^T)^{-1} = (A^{-1})^T$;
(iii) $AB$ is invertible and $(AB)^{-1} = B^{-1} A^{-1}$.

Proof: (i) is obvious. To see (ii), postmultiply both sides by $A^T$:

$$\text{LHS} = (A^T)^{-1} A^T = I$$
$$\text{RHS} = (A^{-1})^T A^T = (A A^{-1})^T = I^T = I$$

To see (iii): by definition, if we find a matrix $C$ such that

$$C(AB) = (AB)C = I$$

then $C$ is an inverse of $AB$ and we can write $C = (AB)^{-1}$. Let $C = B^{-1} A^{-1}$. Then we have:

$$B^{-1} A^{-1} (AB) = (AB) B^{-1} A^{-1} = I$$

QED
Theorem 8-6: If a square matrix $A$ is invertible, then:
(i) $A^m = \underbrace{A \cdot A \cdots A}_{m \text{ times}}$ is invertible for any integer $m$, and

$$(A^m)^{-1} = (A^{-1})^m = A^{-m} = \underbrace{A^{-1} \cdot A^{-1} \cdots A^{-1}}_{m \text{ times}}$$

(ii) for any integers $r$ and $s$, $A^r A^s = A^{r+s}$;
(iii) for any scalar $r \neq 0$, $rA$ is invertible and $(rA)^{-1} = (1/r) A^{-1}$.
8-5. Partitioned Matrices (Optional)

Any matrix $A$ can be partitioned into submatrices. For example, a $(4 \times 6)$-matrix can be partitioned into:

$$A = \left(\begin{array}{ccc|c|cc} a_{11} & a_{12} & a_{13} & a_{14} & a_{15} & a_{16} \\ a_{21} & a_{22} & a_{23} & a_{24} & a_{25} & a_{26} \\ \hline a_{31} & a_{32} & a_{33} & a_{34} & a_{35} & a_{36} \\ a_{41} & a_{42} & a_{43} & a_{44} & a_{45} & a_{46} \end{array}\right)$$

which can be written as a $(2 \times 3)$-matrix of submatrices:

$$A = \begin{pmatrix} A_{11} & A_{12} & A_{13} \\ A_{21} & A_{22} & A_{23} \end{pmatrix}$$

This is called a partitioned matrix. Addition and multiplication work blockwise, as long as the submatrices of the partitioned matrices are of sizes that allow for these operations:

$$A + B = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix} + \begin{pmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{pmatrix} = \begin{pmatrix} A_{11} + B_{11} & A_{12} + B_{12} \\ A_{21} + B_{21} & A_{22} + B_{22} \end{pmatrix}$$

$$AB = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix} \begin{pmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{pmatrix} = \begin{pmatrix} A_{11} B_{11} + A_{12} B_{21} & A_{11} B_{12} + A_{12} B_{22} \\ A_{21} B_{11} + A_{22} B_{21} & A_{21} B_{12} + A_{22} B_{22} \end{pmatrix}$$
Finally, let's discuss how to actually compute the inverse of a matrix $A$. In the next section we will discuss a more efficient way of computing an inverse, but there is a primitive way. Suppose that $A$ is invertible. Create an augmented matrix $\hat{A}$ such that:

$$\hat{A} = \left( A_{n \times n} \mid I_{n \times n} \right)$$

If an inverse exists, then we can premultiply this by the inverse to get:

$$A^{-1} \hat{A} = A^{-1} \left( A_{n \times n} \mid I_{n \times n} \right) = \left( I_{n \times n} \mid A^{-1}_{n \times n} \right)$$

This means that if we can find a matrix (of stacked elementary row operations) that converts $A$ to $I$, we can find the inverse of $A$ by applying the same operations to the identity matrix $I$. In actual computation, we apply elementary row operations consecutively. Consider, for example, a $(2 \times 2)$ matrix and form the augmented matrix:

$$\left(\begin{array}{cc|cc} a & b & 1 & 0 \\ c & d & 0 & 1 \end{array}\right)$$

Apply elementary row operations to this matrix:

$$\left(\begin{array}{cc|cc} a & b & 1 & 0 \\ c & d & 0 & 1 \end{array}\right)
\xrightarrow{\text{divide 1st by } a}
\left(\begin{array}{cc|cc} 1 & b/a & 1/a & 0 \\ c & d & 0 & 1 \end{array}\right)
\xrightarrow{\text{divide 2nd by } c}
\left(\begin{array}{cc|cc} 1 & b/a & 1/a & 0 \\ 1 & d/c & 0 & 1/c \end{array}\right)$$

$$\xrightarrow{\text{subtract 1st from 2nd}}
\left(\begin{array}{cc|cc} 1 & b/a & 1/a & 0 \\ 0 & d/c - b/a & -1/a & 1/c \end{array}\right)
\xrightarrow{\text{multiply 2nd by } c}
\left(\begin{array}{cc|cc} 1 & b/a & 1/a & 0 \\ 0 & (ad - cb)/a & -c/a & 1 \end{array}\right)$$

$$\xrightarrow{\text{multiply 2nd by } a/(ad - cb)}
\left(\begin{array}{cc|cc} 1 & b/a & 1/a & 0 \\ 0 & 1 & -c/(ad - cb) & a/(ad - cb) \end{array}\right)
\xrightarrow{\text{subtract } b/a \text{ times 2nd from 1st}}
\left(\begin{array}{cc|cc} 1 & 0 & d/(ad - cb) & -b/(ad - cb) \\ 0 & 1 & -c/(ad - cb) & a/(ad - cb) \end{array}\right)$$

Thus we have:

$$A^{-1} = \frac{1}{ad - cb} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix}$$
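
The augmented-matrix procedure is easy to mechanize; a minimal Gauss-Jordan sketch (no pivoting, so it assumes no zero pivot is encountered):

```python
# Invert A by row-reducing [A | I] to [I | A^{-1}].
import numpy as np

def invert(A: np.ndarray) -> np.ndarray:
    n = A.shape[0]
    M = np.hstack([A.astype(float), np.eye(n)])  # form [A | I]
    for i in range(n):
        M[i] /= M[i, i]                  # scale pivot row to make pivot 1
        for j in range(n):
            if j != i:
                M[j] -= M[j, i] * M[i]   # eliminate column i in other rows
    return M[:, n:]                      # right block is now A^{-1}

A = np.array([[1.0, -2.0], [3.0, 1.0]])
print(invert(A) @ A)                     # approximately the identity
```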
9. Determinants: An Overview

Undoubtedly, square matrices are the class of matrices one most frequently encounters in economic applications. As we saw previously, a system of $n$ linear equations in $n$ unknowns has a unique solution if and only if the $(n \times n)$ coefficient matrix $A$ is invertible (i.e., nonsingular). By the equivalence theorem, $A$ is invertible if and only if rank($A$) $= n$. But we are all aware that computing the rank of $A$ is troublesome (well, at least it is to me). So we look for more convenient ways of (i) checking whether $A$ is invertible and (ii) computing its inverse. For this purpose, we first note another equivalent definition of nonsingular matrices.

Definition 9-1 (Nonsingular Matrix): A square matrix is nonsingular if and only if its determinant is nonzero.

We have a problem here, because we haven't defined determinants of matrices. (By the way, I think whoever came up with this definition, which is more like a theorem, is a genius. Proving the equivalence is easy, but coming up with the concept of determinants for this equivalence is, I think, not easy.) Now, from our equivalence theorem, we know that $A$ is invertible if and only if its determinant is nonzero. So let's work backward and find determinants from this theorem. Suppose $A$ is a scalar, i.e., $A = a \in \mathcal{M}_{1 \times 1}(\mathbb{R})$. We know that $a$ is invertible (i.e., $1/a$ exists) if and only if $a \neq 0$. From this, we define:

$$\det(a) = a$$

Now consider a $(2 \times 2)$-matrix $A \in \mathcal{M}_{2 \times 2}(\mathbb{R})$:

$$A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$$

We have seen that the inverse of this matrix is:

$$A^{-1} = \frac{1}{ad - bc} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix}$$

From this, we can say that $A^{-1}$ exists if and only if $ad - bc \neq 0$. So we can define:

$$\det(A) = ad - bc$$

In fact, determinants first appeared in the computation of solutions to systems of linear equations, so this way of defining determinants is perfectly natural. For a $(3 \times 3)$-matrix $A \in \mathcal{M}_{3 \times 3}(\mathbb{R})$,

$$A = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}$$

we can show that $A$ is invertible if and only if:

$$a_{11} \det \begin{pmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{pmatrix} - a_{12} \det \begin{pmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{pmatrix} + a_{13} \det \begin{pmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{pmatrix} \neq 0$$

So we define:

$$\det(A) = a_{11} \det \begin{pmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{pmatrix} - a_{12} \det \begin{pmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{pmatrix} + a_{13} \det \begin{pmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{pmatrix}$$
We can generalize the definition of a determinant to any $(n \times n)$-matrix.

Definition 9-2 (Minor, Cofactor, and Determinant): Let $A$ be an $(n \times n)$-matrix, and let $\hat{A}_{ij}$ be the $(n-1) \times (n-1)$ submatrix obtained by deleting row $i$ and column $j$ from $A$. Then the scalar:

$$M_{ij} = \det(\hat{A}_{ij})$$

is called the $(i, j)$-th minor of $A$. The sign-adjusted scalar:

$$C_{ij} = (-1)^{i+j} M_{ij} = (-1)^{i+j} \det(\hat{A}_{ij})$$

is called the $(i, j)$-th cofactor of $A$. Given these definitions, the determinant of an $(n \times n)$-matrix is the scalar:

$$\det(A) = a_{11} C_{11} + a_{12} C_{12} + \dots + a_{1n} C_{1n} = a_{11} M_{11} - a_{12} M_{12} + \dots + (-1)^{1+n} a_{1n} M_{1n}$$

As you probably know, there are many ways of computing a determinant. This definition uses the first row, but we can use any row $k = 1, 2, \dots, n$ to compute it:

$$\det(A) = a_{11} C_{11} + a_{12} C_{12} + \dots + a_{1n} C_{1n} = a_{k1} C_{k1} + a_{k2} C_{k2} + \dots + a_{kn} C_{kn} \quad \text{for any } k = 1, 2, \dots, n$$

In fact, we don't even have to use a row — we can use any column:

$$\det(A) = a_{11} C_{11} + a_{21} C_{21} + \dots + a_{n1} C_{n1} = a_{1k} C_{1k} + a_{2k} C_{2k} + \dots + a_{nk} C_{nk} \quad \text{for any } k = 1, 2, \dots, n$$
Computing the determinant of an actual $(n \times n)$-matrix seems daunting if we stick to the definition. In fact, it is not so bad once we know how to apply a few simple rules — it's just tedious. Let's compute the determinant of a $(3 \times 3)$-matrix:

$$A = \begin{pmatrix} 1 & -1 & 3 \\ 2 & 1 & 0 \\ 3 & 0 & 1 \end{pmatrix}$$

Step 1. Pick a row (or column) to start with. Let's pick the first row.

Step 2. Put alternating +, − signs on the row (or next to the column). Always start with + if you pick the first row or column.

Step 3. Get the minor matrices $\hat{A}_{1j}$:

$$\hat{A}_{11} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad \hat{A}_{12} = \begin{pmatrix} 2 & 0 \\ 3 & 1 \end{pmatrix}, \quad \hat{A}_{13} = \begin{pmatrix} 2 & 1 \\ 3 & 0 \end{pmatrix}$$

Step 4. Apply the last line of the definition of $\det(A)$:

$$\det(A) = 1 \cdot \det \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} - (-1) \cdot \det \begin{pmatrix} 2 & 0 \\ 3 & 1 \end{pmatrix} + 3 \cdot \det \begin{pmatrix} 2 & 1 \\ 3 & 0 \end{pmatrix}$$

Step 5. Compute the determinants of the $(2 \times 2)$ minor matrices first and sum up:

$$\det(A) = 1 \cdot (1 - 0) + 1 \cdot (2 - 0) + 3 \cdot (0 - 3) = 1 + 2 - 9 = -6$$

So this matrix is nonsingular.

We can apply the same computational rules for matrices of any size; we just work our way down until the computation reduces to determinants of $(2 \times 2)$ minor matrices.
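
The cofactor definition translates directly into a recursive function; a sketch (exponential cost, so libraries instead compute determinants via an LU factorization, as `np.linalg.det` does):

```python
# Determinant by cofactor expansion along the first row.
import numpy as np

def det(A: np.ndarray) -> float:
    n = A.shape[0]
    if n == 1:
        return A[0, 0]
    total = 0.0
    for j in range(n):
        minor = np.delete(np.delete(A, 0, axis=0), j, axis=1)
        total += (-1) ** j * A[0, j] * det(minor)
    return total

A = np.array([[1.0, -1.0, 3.0],
              [2.0,  1.0, 0.0],
              [3.0,  0.0, 1.0]])
print(det(A), np.linalg.det(A))   # both -6 (up to rounding)
```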
We have several convenient theorems; we mostly state them without full proofs.

Theorem 9-1: The determinant of a lower-triangular, upper-triangular, or diagonal matrix is simply the product of its diagonal entries. That is,

$$\det \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ 0 & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & a_{nn} \end{pmatrix} = \det \begin{pmatrix} a_{11} & 0 & \cdots & 0 \\ a_{21} & a_{22} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{pmatrix} = \det \begin{pmatrix} a_{11} & 0 & \cdots & 0 \\ 0 & a_{22} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & a_{nn} \end{pmatrix} = a_{11} a_{22} \cdots a_{nn}$$

Proof: First note that it is enough to prove it for upper-triangular and lower-triangular matrices, because diagonal matrices are a special case of triangular matrices. To see how the proof works, I will illustrate it for a $(3 \times 3)$-matrix. Let's compute (expanding along the first column):

$$\det \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ 0 & a_{22} & a_{23} \\ 0 & 0 & a_{33} \end{pmatrix} = a_{11} \det \begin{pmatrix} a_{22} & a_{23} \\ 0 & a_{33} \end{pmatrix} - 0 \cdot \det \begin{pmatrix} a_{12} & a_{13} \\ 0 & a_{33} \end{pmatrix} + 0 \cdot \det \begin{pmatrix} a_{12} & a_{13} \\ a_{22} & a_{23} \end{pmatrix} = a_{11} (a_{22} a_{33} - 0 \cdot a_{23}) = a_{11} a_{22} a_{33}$$

We can easily see that this logic works for any $(n \times n)$ upper-triangular matrix. For a lower-triangular matrix, pick the first row, instead of the first column, to compute the determinant. Can you see how you might extend the proof to an $(n \times n)$-matrix? You would need to use mathematical induction. QED

A corollary to this theorem is that $\det(I) = 1$.
Theorem 9-2: For any $A, B \in \mathcal{M}_{n \times n}(\mathbb{R})$,
(i) $\det(A^T) = \det(A)$;
(ii) $\det(AB) = \det(A) \det(B)$;
(iii) $\det(A + B) \neq \det(A) + \det(B)$, in general.
Now we are almost ready to state the main theorem of this section. We need one more definition to do so.

Definition 9-3 (Adjoint): Recall that, for any $(n \times n)$-matrix $A$, the $(i, j)$-th cofactor of $A$ is:

$$C_{ij} = (-1)^{i+j} \det(\hat{A}_{ij})$$

where $\hat{A}_{ij}$ is obtained by deleting row $i$ and column $j$ of $A$. Then the adjoint of $A$ is defined to be:

$$\operatorname{adj}(A) = \begin{pmatrix} C_{11} & C_{21} & \cdots & C_{n1} \\ C_{12} & C_{22} & \cdots & C_{n2} \\ \vdots & \vdots & \ddots & \vdots \\ C_{1n} & C_{2n} & \cdots & C_{nn} \end{pmatrix}$$

That is, $\operatorname{adj}(A)$ is the $(n \times n)$-matrix whose $(i, j)$-element is $C_{ji}$, the $(j, i)$-th cofactor of $A$.
Theorem 9-3 (Inverse and Cramer's Rule): Let $A \in \mathcal{M}_{n \times n}(\mathbb{R})$ be a nonsingular matrix. Then,

(i)
$$A^{-1} = \frac{1}{\det(A)} \operatorname{adj}(A)$$

(ii) (Cramer's Rule) The unique solution $x$ of the system $Ax = b$ is given by:

$$x_i = \frac{\det(B_i)}{\det(A)}$$

where $B_i$ is the matrix $A$ with the right-hand side $b$ replacing the $i$-th column of $A$; e.g., for a $(3 \times 3)$-matrix, $B_2$ is:

$$B_2 = \begin{pmatrix} a_{11} & b_1 & a_{13} \\ a_{21} & b_2 & a_{23} \\ a_{31} & b_3 & a_{33} \end{pmatrix}$$
Proof: Let's prove Cramer's rule for a $(3 \times 3)$-matrix. Suppose the system is:

$$\begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} b_1 \\ b_2 \\ b_3 \end{pmatrix}$$

The solution to this system is obviously $x = A^{-1} b$, so that:

$$x = \frac{1}{\det(A)} \operatorname{adj}(A) \begin{pmatrix} b_1 \\ b_2 \\ b_3 \end{pmatrix} = \frac{1}{\det(A)} \begin{pmatrix} C_{11} & C_{21} & C_{31} \\ C_{12} & C_{22} & C_{32} \\ C_{13} & C_{23} & C_{33} \end{pmatrix} \begin{pmatrix} b_1 \\ b_2 \\ b_3 \end{pmatrix} = \frac{1}{\det(A)} \begin{pmatrix} C_{11} b_1 + C_{21} b_2 + C_{31} b_3 \\ C_{12} b_1 + C_{22} b_2 + C_{32} b_3 \\ C_{13} b_1 + C_{23} b_2 + C_{33} b_3 \end{pmatrix}$$

Now let $i = 1$. Then:

$$x_1 = \frac{1}{\det(A)} (C_{11} b_1 + C_{21} b_2 + C_{31} b_3) = \frac{1}{\det(A)} \sum_j b_j (-1)^{1+j} \det(\hat{A}_{j1}) = \frac{1}{\det(A)} \det \begin{pmatrix} b_1 & a_{12} & a_{13} \\ b_2 & a_{22} & a_{23} \\ b_3 & a_{32} & a_{33} \end{pmatrix}$$

You can do the same for all $i = 1, 2, 3$. QED
Example 1. Suppose that we have the following system:

$$\begin{pmatrix} 2 & 4 & 5 \\ 0 & 3 & 0 \\ 1 & 0 & 1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix}$$

Let's solve for $x_1, x_2, x_3$. Question: What do we need to do first? Answer: Check whether $A$ is invertible. To do so, we need to compute the determinant:

$$\det \begin{pmatrix} 2 & 4 & 5 \\ 0 & 3 & 0 \\ 1 & 0 & 1 \end{pmatrix}$$

Which looks easier, a row-based or a column-based expansion? Let's pick the first column:

$$\det(A) = 2 \det \begin{pmatrix} 3 & 0 \\ 0 & 1 \end{pmatrix} - 0 \cdot \det \begin{pmatrix} 4 & 5 \\ 0 & 1 \end{pmatrix} + 1 \cdot \det \begin{pmatrix} 4 & 5 \\ 3 & 0 \end{pmatrix} = 2 \cdot 3 + 1 \cdot (-15) = -9 \neq 0$$

So $A$ is invertible. Now use Cramer's rule:

$$x_1 = \frac{\det(B_1)}{\det(A)} = \frac{1}{\det(A)} \det \begin{pmatrix} 1 & 4 & 5 \\ 2 & 3 & 0 \\ 3 & 0 & 1 \end{pmatrix} = \frac{1}{\det(A)} \left[ 1 \cdot \det \begin{pmatrix} 3 & 0 \\ 0 & 1 \end{pmatrix} - 2 \cdot \det \begin{pmatrix} 4 & 5 \\ 0 & 1 \end{pmatrix} + 3 \cdot \det \begin{pmatrix} 4 & 5 \\ 3 & 0 \end{pmatrix} \right] = \frac{1}{-9} \left[ 3 - 2 \cdot 4 + 3 \cdot (-15) \right] = \frac{-50}{-9} = \frac{50}{9}$$

We can do the same for $x_2, x_3$.
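
A sketch of Cramer's rule applied to Example 1 (fine for tiny systems; `np.linalg.solve` is the practical choice for anything larger):

```python
import numpy as np

A = np.array([[2.0, 4.0, 5.0],
              [0.0, 3.0, 0.0],
              [1.0, 0.0, 1.0]])
b = np.array([1.0, 2.0, 3.0])

detA = np.linalg.det(A)          # -9, so A is invertible
x = np.empty(3)
for i in range(3):
    Bi = A.copy()
    Bi[:, i] = b                 # replace the i-th column of A with b
    x[i] = np.linalg.det(Bi) / detA
print(x)                         # x1 = 50/9 ≈ 5.556, then x2, x3
```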
10. Euclidean Spaces

This section reviews Euclidean spaces, denoted $\mathbb{R}^n$. In fact, we have been using this notation quite often without any formal definition. In this section, we will learn how to generalize the notions of points, lines, planes, distances, and angles to $\mathbb{R}^n$. Most of the topics will be left as a reading assignment.

Definition 10-1 (Cartesian Product): Let $A_1$, $A_2$ be any sets. The Cartesian product of the two sets is the set of all pairs $(a_1, a_2)$ such that $a_1 \in A_1$, $a_2 \in A_2$. We represent the product space as $A_1 \times A_2$. When we have more than two sets, we use the notation $\prod_{i=1}^{n} A_i$. When we use a geometric representation of $(x, y)$-coordinates to describe these pairs, that representation is called the Cartesian plane.

Question: What is the difference between $A_1 \times A_2$, $A_1 \cap A_2$, and $A_1 \cup A_2$? Answer: Let $a, b$ be in $A_1$ and $c$ in $A_2$ (with the two sets disjoint). Then $A_1 \times A_2 = \{(a, c), (b, c)\}$, $A_1 \cap A_2 = \varnothing$, and $A_1 \cup A_2 = \{a, b, c\}$.
Definition 10-2 (Euclidean Space): An n-dimensional Euclidean space is the Cartesian product of $n$ copies of the real line, denoted $\mathbb{R}^n = \underbrace{\mathbb{R} \times \mathbb{R} \times \dots \times \mathbb{R}}_{n \text{ times}}$:

$$\mathbb{R}^n = \{(x_1, \dots, x_n) : x_i \in \mathbb{R}, \ \forall i = 1, 2, \dots, n\}$$

So $x = (x_1, \dots, x_n)$ represents a point in the n-dimensional Cartesian plane. This n-tuple may be more generally interpreted as a displacement in $\mathbb{R}^n$. For example, the displacement (2, 3) means: move 2 in the first dimension (horizontally, along the x-axis) and move 3 in the second (vertically, along the y-axis). In this interpretation, the vector does not necessarily start from the origin (0, 0). But more frequently we treat the displacement as representing a move from the origin, so that the displacement (2, 3) and the location (2, 3) coincide. We often call $(x_1, \dots, x_n)$ a vector in $\mathbb{R}^n$, which can ambiguously mean either a location or a displacement in $\mathbb{R}^n$. Euclidean space is often termed a normed vector space, because it is a vector space endowed with a metric — a norm.
Definition 10-3 (Vector Space): A vector space is any set $V$ on which addition (+) and scalar multiplication (·) are well-defined and satisfy the following properties:
(i) (Associative law of addition): $\forall x, y, z \in V$, $x + (y + z) = (x + y) + z$;
(ii) (Neutral element for addition): $\exists\, 0 \in V$ such that $\forall x \in V$, $x + 0 = 0 + x = x$;
(iii) (Inverse element for addition): $\forall x \in V$, $\exists\, {-x} \in V$ such that $x + (-x) = 0$;
(iv) (Associative law of scalar multiplication): $\forall \alpha, \beta \in \mathbb{R}$, $\forall x \in V$, $\alpha(\beta x) = (\alpha \beta) x$;
(v) (Neutral element for scalar multiplication): $\forall x \in V$, $1 \cdot x = x \cdot 1 = x$;
(vi) (Distributive law of addition): $\forall \alpha \in \mathbb{R}$, $\forall x, y \in V$, $\alpha(x + y) = \alpha x + \alpha y$;
(vii) (Distributive law of scalar multiplication): $\forall \alpha, \beta \in \mathbb{R}$, $\forall x \in V$, $(\alpha + \beta) x = \alpha x + \beta x$.

As we know, addition and scalar multiplication are well-defined and satisfy these properties in $\mathbb{R}^n$, so Euclidean space is a vector space. In addition to addition and scalar multiplication, Euclidean space is endowed with another operation, called the (Euclidean) inner product, often denoted $x \cdot y$ or $\langle x, y \rangle$.
Definition 10-4 (Inner Product): Let $x = (x_1, \dots, x_n)$, $y = (y_1, \dots, y_n)$ be two vectors in $\mathbb{R}^n$. The (Euclidean) inner product of $x$ and $y$ is the number:

$$x \cdot y = x_1 y_1 + x_2 y_2 + \dots + x_n y_n$$

which satisfies the following properties:
(i) (Symmetry): $x \cdot y = y \cdot x$;
(ii) (Linearity): $x \cdot (y + z) = x \cdot y + x \cdot z$ and $x \cdot (\alpha y) = \alpha (x \cdot y) = (\alpha x) \cdot y$;
(iii) (Positivity): $x \cdot x \ge 0$, with equality if and only if $x = 0$.
Definition 10-5 (Norm): The norm of a vector $x$ in $\mathbb{R}^n$ is defined as:

$$\|x\| = (x \cdot x)^{\frac{1}{2}} = \left( \sum_{i=1}^{n} x_i^2 \right)^{\frac{1}{2}}$$

Theorem 10-1 (Properties of the Norm): Suppose $x, y, z \in \mathbb{R}^n$ and $\alpha \in \mathbb{R}$. Then,
(i) $\|x\| \ge 0$, with equality if and only if $x = 0$;
(ii) $\|\alpha x\| = |\alpha| \, \|x\|$;
(iii) $|x \cdot y| \le \|x\| \, \|y\|$;
(iv) $\|x + y\| \le \|x\| + \|y\|$;
(v) $\|x - z\| \le \|x - y\| + \|y - z\|$;
(vi) $\big|\, \|x\| - \|y\| \,\big| \le \|x - y\|$.
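
Properties (iii) and (iv) — the Cauchy-Schwarz and triangle inequalities — are easy to spot-check numerically; a sketch with random vectors:

```python
import numpy as np

rng = np.random.default_rng(0)
x, y = rng.normal(size=5), rng.normal(size=5)

inner = x @ y                     # Euclidean inner product
norm = np.linalg.norm             # ||v|| = sqrt(v . v)

print(abs(inner) <= norm(x) * norm(y))     # Cauchy-Schwarz: True
print(norm(x + y) <= norm(x) + norm(y))    # triangle inequality: True
```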
Geometrically, we can represent a line in $\mathbb{R}^2$ and a plane in $\mathbb{R}^3$ (see Figures 10.26 and 10.28):

$$x(a) = u + a v = (u_1, u_2) + a(v_1, v_2) = (u_1 + a v_1, \ u_2 + a v_2) \quad (10\text{-}1)$$

$$x(a, b) = u + a v + b w = (u_1, u_2) + a(v_1, v_2) + b(w_1, w_2) = (u_1 + a v_1 + b w_1, \ u_2 + a v_2 + b w_2)$$

For example, in (10-1), let $u = \begin{pmatrix} 1 \\ 1 \end{pmatrix}$ and $v = \begin{pmatrix} 2 \\ 3 \end{pmatrix}$. Then the line represented by this equation is an arbitrary extension (by a number $a$) of the segment that starts at (1, 1) and moves 2 horizontally and 3 vertically from (1, 1).
Hyperplane. A line in $\mathbb{R}^2$ and a plane in $\mathbb{R}^3$ are examples of sets of points described by a single linear equation. These sets are called hyperplanes. In general, a hyperplane in $\mathbb{R}^n$ is a set of points that has $(n-1)$ dimensions in $\mathbb{R}^n$ and is described by a linear equation:

$$a_1 x_1 + a_2 x_2 + \dots + a_n x_n = c$$

Thus, hyperplanes in $\mathbb{R}^1$, $\mathbb{R}^2$, $\mathbb{R}^3$ are, respectively, a point, a line, and a plane:

$$a_1 x_1 = c$$
$$a_1 x_1 + a_2 x_2 = c$$
$$a_1 x_1 + a_2 x_2 + a_3 x_3 = c$$
11. Linear Independence

Before formally defining linear dependence, let's consider a simple example. Suppose that we have a relationship between $x_1$ and $x_2$ such that:

$$a_1 x_1 + a_2 x_2 = 0 \quad (11\text{-}1)$$

Now suppose that $a_1 \neq 0$. Then we can manipulate this relationship:

$$x_1 = -\frac{a_2}{a_1}\, x_2$$

So there is a natural dependency of $x_1$ on $x_2$, and we say that $x_1$ is linearly dependent on $x_2$, because the relationship is linear. Question: What if $a_2 = 0$? Answer: We still have a uniquely determined value $x_1 = 0$ for each fixed value of $x_2$, so we still say $x_1$ is linearly dependent on $x_2$. On the other hand, suppose $a_1 = 0$ and $a_2 = 0$. Then equation (11-1) gives us no information about the relationship between $x_1$ and $x_2$: for a given value of $x_2$, $x_1$ can take any value. In this case, we say that $x_1$ and $x_2$ are linearly independent. This is the natural definition of linear dependence. As we will see, we can define linear dependence for n-dimensional vectors.

As we saw in Section 10, the set of all scalar multiples of a non-zero vector $v$ is a straight line through the origin.
Definition 11-1 (Span): For a vector $v \in \mathbb{R}^n$, the set $L(v)$ is said to be spanned or generated by $v$ if:

$$L(v) = \{rv : r \in \mathbb{R}\}$$

Definition 11-2 (Linear Combination): A linear combination of $k$ non-zero vectors $v_1, v_2, \dots, v_k$ is:

$$c_1 v_1 + c_2 v_2 + \dots + c_k v_k \quad \text{for some scalars } c_1, c_2, \dots, c_k \in \mathbb{R}$$

Definition 11-3 (Span): Let $v_1, v_2, \dots, v_k$ be non-zero vectors. We say that the set $L(v_1, v_2, \dots, v_k)$ is generated or spanned by $v_1, v_2, \dots, v_k$ if:

$$L(v_1, v_2, \dots, v_k) = \{c_1 v_1 + c_2 v_2 + \dots + c_k v_k : c_1, c_2, \dots, c_k \in \mathbb{R}\}$$

That is, $L(v_1, v_2, \dots, v_k)$ is the set of all possible linear combinations of $v_1, v_2, \dots, v_k$. Moreover, if a set $V$ is a subset of $L(v_1, v_2, \dots, v_k)$, then we say that $v_1, v_2, \dots, v_k$ span $V$.
Definition 11-4 (Linear Dependence): Vectors $v_1, v_2, \dots, v_k \in \mathbb{R}^n$ are linearly dependent if and only if there exist scalars $c_1, c_2, \dots, c_k \in \mathbb{R}$, not all equal to zero, such that:

$$c_1 v_1 + c_2 v_2 + \dots + c_k v_k = 0$$

Definition 11-5 (Linear Independence): Vectors $v_1, v_2, \dots, v_k \in \mathbb{R}^n$ are linearly independent if and only if:

$$c_1 v_1 + c_2 v_2 + \dots + c_k v_k = 0 \implies c_1 = c_2 = \dots = c_k = 0$$

The flip side of this definition is that, as long as we can find scalars $c_1, c_2, \dots, c_k$, not all zero, such that $c_1 v_1 + c_2 v_2 + \dots + c_k v_k = 0$, we can be assured that $v_1, v_2, \dots, v_k$ are linearly dependent. This can be stated more generally:
Theorem 11-1: Vectors $v_1, v_2, \dots, v_k \in \mathbb{R}^n$ are linearly dependent if and only if the linear system:

$$A \begin{pmatrix} c_1 \\ c_2 \\ \vdots \\ c_k \end{pmatrix} = \begin{pmatrix} v_1 & v_2 & \cdots & v_k \end{pmatrix} \begin{pmatrix} c_1 \\ c_2 \\ \vdots \\ c_k \end{pmatrix} = 0$$

has a non-zero solution $c = (c_1, c_2, \dots, c_k)$.

When we have $v_1, v_2, \dots, v_n \in \mathbb{R}^n$, $A = (v_1, v_2, \dots, v_n)$ becomes an $(n \times n)$-matrix, and we can use the equivalence theorem: a square matrix has full rank if and only if its determinant is nonzero.

Theorem 11-2: A set of $n$ vectors $v_1, v_2, \dots, v_n \in \mathbb{R}^n$ is linearly independent if and only if:

$$\det \begin{pmatrix} v_1 & v_2 & \cdots & v_n \end{pmatrix} \neq 0$$
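
Theorem 11-2 gives an immediate computational test; a sketch (the example vectors are mine):

```python
# Stack the vectors as columns and test whether the determinant is zero.
import numpy as np

v1, v2 = np.array([1.0, 2.0]), np.array([3.0, 4.0])
A = np.column_stack([v1, v2])
print(np.linalg.det(A))           # -2.0 != 0 -> linearly independent

w1, w2 = np.array([1.0, 2.0]), np.array([2.0, 4.0])
print(np.linalg.det(np.column_stack([w1, w2])))   # 0.0 -> dependent
```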
Another important fact is that if we have more vectors than the dimension of each vector $v_i$, then they must be linearly dependent.

Theorem 11-3: Any set of $k$ vectors $v_1, v_2, \dots, v_k \in \mathbb{R}^n$ is linearly dependent if $k > n$.

This can be well understood with an example in $\mathbb{R}^2$. Suppose there are 3 non-zero vectors in $\mathbb{R}^2$. Pick one vector. Then this vector can necessarily be described as a linear combination of the other two vectors. (See the graph.)
Let's consider the implication of this for a moment. To describe ANY vector in $\mathbb{R}^2$, we only need two non-zero 2-dimensional vectors (provided neither is a multiple of the other). In the terminology just learned, the set of all vectors in $\mathbb{R}^2$ is simply the span of two such vectors. In general, we might ask: "What is the most efficient spanning set to represent the set of all vectors in an arbitrary vector space $V$?" In other words, if $v_1, v_2, \dots, v_k$ span $V$, what is the smallest possible subset of $v_1, v_2, \dots, v_k$ that spans $V$? This is precisely the role of the concept of linear independence that we considered. If $v_1, v_2, \dots, v_k$ are linearly independent, none of these vectors is a linear combination of the others, and therefore no proper subset of $v_1, v_2, \dots, v_k$ spans $V$. This leads to the concept of a basis of a vector space $V$.

Definition 11-6 (Basis): Let $w_1, w_2, \dots, w_m$ be a collection of vectors in $V$. Then $w_1, w_2, \dots, w_m$ form a basis of $V$ if and only if:
(i) $w_1, w_2, \dots, w_m$ span $V$;
(ii) $w_1, w_2, \dots, w_m$ are linearly independent.
Clearly, three non-zero vectors in $\mathbb{R}^2$ cannot form a basis of $\mathbb{R}^2$, because they must be linearly dependent. Even two non-zero vectors in $\mathbb{R}^2$ cannot form a basis of $\mathbb{R}^2$ if $w_1 = a w_2$. Moreover, two vectors with one of them being a zero vector cannot be a basis of $\mathbb{R}^2$, because they cannot span $\mathbb{R}^2$; note that the zero vector in $\mathbb{R}^2$ is a (trivial) linear combination of any non-zero vector, $0 = 0 \cdot v$. The natural basis of the Euclidean space $\mathbb{R}^n$ is the canonical basis:

$$e_1 = \begin{pmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix}, \quad \dots, \quad e_n = \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{pmatrix}$$
Let's check that this spans $\mathbb{R}^n$ and is linearly independent. To see that it spans $\mathbb{R}^n$, take an arbitrary vector $v \in \mathbb{R}^n$:

$$v = \begin{pmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{pmatrix}$$

Then:

$$v = v_1 e_1 + \dots + v_n e_n$$

So it spans $\mathbb{R}^n$. To see linear independence, consider:

$$c_1 e_1 + \dots + c_n e_n = c_1 \begin{pmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix} + \dots + c_n \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{pmatrix} = \begin{pmatrix} c_1 \\ c_2 \\ \vdots \\ c_n \end{pmatrix}$$

So if $c_1 e_1 + \dots + c_n e_n = 0$, then it must be that $c_1 = \dots = c_n = 0$. Hence the canonical basis is linearly independent.
The following theorems should come naturally as a summary of the discussion above.

Theorem 11-4: If both $v_1, v_2, \dots, v_n$ and $w_1, w_2, \dots, w_m$ are bases of $V$, then we must have $n = m$.

Theorem 11-5: Every basis of $\mathbb{R}^n$ contains $n$ vectors.

Theorem 11-6: Let $v_1, v_2, \dots, v_n$ be a collection of $n$ vectors in $\mathbb{R}^n$. Then the following statements are equivalent:
(i) $v_1, v_2, \dots, v_n$ are linearly independent;
(ii) $v_1, v_2, \dots, v_n$ span $\mathbb{R}^n$;
(iii) $v_1, v_2, \dots, v_n$ form a basis of $\mathbb{R}^n$;
(iv) $\det(v_1 \ v_2 \ \cdots \ v_n) \neq 0$;
(v) $A = (v_1 \ v_2 \ \cdots \ v_n)$ has full rank.

In view of Theorem 11-4, we can talk about the dimension of any vector space unambiguously.

Definition 11-7 (Dimension): The dimension of a vector space $V$ is the number of vectors that can form a basis of $V$.

Thus, the dimension of $\mathbb{R}^n$ is exactly $n$.