Linear Algebra - LA - Part I
Textbook:
“Elementary Linear Algebra,” 12th ed., H. Anton and C. Rorres.
( The content of this course comes from the above book and other
references.)
Reference:
1. “Linear Algebra,” Friedberg, Insel, Spence.
Algebra
1. Elementary algebra
2. Linear Algebra
3. Algebra (Modern algebra, Abstract algebra)
- Abstraction and generalization (Ex. Vector Spaces)
v + w, cv (vectors in R2 or R3)
f (t) + g(t), cf (t) (functions)
M + P, cM (matrices)
Notation
1. R: the set of real numbers.
2. C: the set of complex numbers.
3. A vector v is denoted by a bold little character.
4. R2 and R3 are the sets of 2-D plane and 3-D space.
    [1]                 [2]
v = [3]  ∈ R2 ,    u =  [4]  ∈ R3
                        [1]
5. A vector is denoted as a column vector. For row vector, we use the
notation “T ” to denote “transpose”
uT = [2 4 1]
6. For a complex matrix M , MH denotes its conjugate (Hermitian) transpose.
    [2 + i3   1   1 + i]         [2 − i3   0   0]
M = [  0      4     0  ] ,  MH = [  1      4   0]
    [  0      0     3  ]         [1 − i    0   3]
Preview
1. Vectors in Rn (or Cn):
    [a1]
    [a2]
u = [..] ,   ak ∈ R (or C) (an n-tuple)
    [an]

uT = [a1 a2 · · · an]
uH = ūT = [ā1 ā2 · · · ān] (conjugate transpose)
Vector Space :
A vector space is a set of vectors, which will be defined later.
Inner product :
The inner product of two vectors x and y,
x · y, ⟨x, y⟩, ⟨x | y⟩
When we define an inner product in a vector space, we can use it
to define
1. the length (norm) of a vector ( ∥x∥² = ⟨x, x⟩ ), and
2. the orthogonality between vectors. ( ⟨x, y⟩ = 0 )
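The definitions above can be sketched numerically with the standard dot product on Rn; the particular vectors below are assumptions chosen for illustration.

```python
import math

# Standard inner product on R^n, and the norm and orthogonality
# it induces: ||x||^2 = <x, x>, and x ⟂ y when <x, y> = 0.
def inner(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))

def norm(x):
    return math.sqrt(inner(x, x))

x, y = [1, 2, 2], [2, -1, 0]
print(norm(x))       # sqrt(1 + 4 + 4) = 3.0
print(inner(x, y))   # 2 - 2 + 0 = 0, so x and y are orthogonal
```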
Linear combination: c1v1 + c2v2 + · · · + cnvn
Linear transformation
v ∈ Rn ↦ w ∈ Rm
Contents
1. Systems of Linear Equations and Matrices (Chap. 1)
2. Determinants (Chap. 2)
3. Euclidean Vector Spaces (R2, R3, Rn) (Chap. 3)
4. General Vector Spaces (Chap. 4)
5. Eigenvalues and Eigenvectors (Chap. 5)
6. Inner Product Spaces (Chap. 6)
7. Diagonalization and Quadratic Forms (Chap. 7)
8. Linear Transformations (Chap. 8)
9. Additional Topics
(including Singular Value Decomposition and Jordan Forms.)
Homework :
Chap 1: 1.1): 8, 12 1.2): 38 1.3): 36 1.4): 42, 46 1.5): 31 1.6): 18, 24
1.7): 40(a), 47 1.8): 16, 45 1.9): 12
x1 a 1 + x2 a 2 + · · · + xk a k + · · · + xn a n = b
    [a11 a12 · · · a1n]
A = [a21 a22 · · · a2n]
    [ ..  ..       .. ]
    [am1 am2 · · · amn]
Remarks
1. By taking elementary row operations, we do not affect the solutions
of Ax = b.
2. Each elementary row operation is reversible.
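The two remarks can be sketched concretely; the 2 × 3 augmented matrix below is an assumption for illustration.

```python
# A 'replacement' row operation on [A | b] leaves the solution set of
# Ax = b unchanged, and is reversed by subtracting the same multiple back.
def add_multiple(rows, i, j, c):
    """Return a copy with c times row j added to row i."""
    return [[rows[r][k] + c * rows[j][k] if r == i else rows[r][k]
             for k in range(len(rows[0]))] for r in range(len(rows))]

aug = [[1, 2, 5],    # x1 + 2x2 = 5
       [3, -1, 1]]   # 3x1 - x2 = 1
x = [1, 2]           # the solution of the original system

new = add_multiple(aug, 1, 0, -3)          # R2 <- R2 - 3 R1
for row in new:                            # x still satisfies each equation
    assert row[0]*x[0] + row[1]*x[1] == row[2]
assert add_multiple(new, 1, 0, 3) == aug   # the operation is reversible
```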
Definition A matrix is in (row) echelon form if it has
the following three properties: (Forward Gaussian elimination)
[1 ∗ ∗ ∗ ∗]      [■ ∗ ∗ ∗ ∗]
[0 1 ∗ ∗ ∗]  or  [0 ■ ∗ ∗ ∗]    (■ ̸= 0)
[0 0 0 1 ∗]      [0 0 0 ■ ∗]
[0 0 0 0 0]      [0 0 0 0 0]
1. If there are any rows that consist entirely of zeros, then they are
grouped together at the bottom of the matrix.
2. If a row does not consist entirely of zeros, then the first non-zero
number in the row is a 1. We call this a leading 1 (or pivot).
3. In any two successive non-zero rows, the leading 1 in the lower row
occurs farther to the right than the leading 1 in the row above it.
Definition A matrix is in reduced (row) echelon form
if it is in (row) echelon form and, in addition, each column that
contains a leading 1 has zeros everywhere else.
Example A row echelon form (R1) and a reduced row echelon form (R2).
     [1 ∗ ∗ ∗ ∗]        [1 0 ∗ 0 ∗]
R1 = [0 1 ∗ ∗ ∗]  , R2 = [0 1 ∗ 0 ∗]
     [0 0 0 1 ∗]        [0 0 0 1 ∗]
     [0 0 0 0 0]        [0 0 0 0 0]
After a sequence of row operations, we can bring a matrix to a
row echelon form. A matrix may have many row echelon forms.
(For example, adding the 3rd row of R1 to the 2nd row of R1, we obtain
another matrix in row echelon form.)
     [1 ∗ ∗ ∗ ∗]
R1 = [0 1 ∗ ∗ ∗]
     [0 0 0 1 ∗]
     [0 0 0 0 0]

[1 2 3 5 7]      [1 2 3 5  7]
[0 1 4 6 8]      [0 1 4 7 17]
[0 0 0 1 9]  →   [0 0 0 1  9]
[0 0 0 0 0]      [0 0 0 0  0]
• The pivots are always in the same positions in any echelon form
of A.
• We call these positions pivot positions.
• A pivot column is a column of A that contains a pivot position.
Example Reduced row-echelon forms:
[1 0 0  4]   [1 0 0]   [0 1 −2 0 1]
[0 1 0  7]   [0 1 0]   [0 0  0 1 3]   [0 0]
[0 0 1 −1] , [0 0 1] , [0 0  0 0 0] , [0 0]
                       [0 0  0 0 0]
1. If a11 ̸= 0, perform row operations (adding −a21/a11 times row 1 to
row 2, and so on) to create zeros below a11.

[a11 a12  . . .  a1n  b1 ]
[ 0  a′22 . . .  a′2n b′2]
[ ..  ..          ..  .. ]
[ 0  a′m2 . . .  a′mn b′m]
2. If a11 = 0, interchange the first row with another row (say, the kth
row) for which ak1 ̸= 0. Then perform the operations in step 1 to this
new matrix.

[ 0  . . .  b1]
[ ..        ..]
[ak1 . . .  bk]
[ ..        ..]
           [■ ∗ ∗ ∗ ∗]
[A | b] →  [0 ■ ∗ ∗ ∗]
           [0 0 0 ■ ∗]
           [0 0 0 0 0]
• In this step, we can determine whether the original equation Ax = b
is consistent or not.
• If a non-zero element of the last column (b-column) is a pivot, then
the original equation is inconsistent and no solution exists.
[0 . . . 0 b′k] ,   b′k ̸= 0
[■ ∗ ∗ ∗ ∗]
[0 ■ ∗ ∗ ∗]    1. Make each pivot 1 by a scaling row operation.
[0 0 0 ■ ∗]    2. Create zeros above each pivot.
[0 0 0 0 0]
Finally we obtain a reduced echelon form.
[■ ∗ ∗ ∗ ∗]   [1 ∗ ∗ ∗ ∗]   [1 ∗ ∗ 0 ∗]   [1 0 ∗ 0 ∗]
[0 ■ ∗ ∗ ∗] → [0 1 ∗ ∗ ∗] → [0 1 ∗ 0 ∗] → [0 1 ∗ 0 ∗]
[0 0 0 ■ ∗]   [0 0 0 1 ∗]   [0 0 0 1 ∗]   [0 0 0 1 ∗]
[0 0 0 0 0]   [0 0 0 0 0]   [0 0 0 0 0]   [0 0 0 0 0]
Example Consider a system of linear equations
3x2 − 6x3 + 6x4 + 4x5 = −5
3x1 − 7x2 + 8x3 − 5x4 + 8x5 = 9
3x1 − 9x2 + 12x3 − 9x4 + 6x5 = 15
The augmented matrix
          [0  3 −6  6 4 −5]
[A | b] = [3 −7  8 −5 8  9]
          [3 −9 12 −9 6 15]
    [3 −9 12 −9 6 15]
 →  [3 −7  8 −5 8  9]      (interchange rows 1 and 3)
    [0  3 −6  6 4 −5]

    [3 −9 12 −9 6 15]      [3 −9 12 −9 6 15]
 →  [0  2 −4  4 2 −6]  →   [0  2 −4  4 2 −6]      (pivots: 3, 2, 1)
    [0  3 −6  6 4 −5]      [0  0  0  0 1  4]

=⇒ Consistent.
    [3 −9 12 −9 0  −9]
 →  [0  2 −4  4 0 −14]
    [0  0  0  0 1   4]

    [3 −9 12 −9 0 −9]
 →  [0  1 −2  2 0 −7]
    [0  0  0  0 1  4]

    [3 0 −6 9 0 −72]      [1 0 −2 3 0 −24]
 →  [0 1 −2 2 0  −7]  →   [0 1 −2 2 0  −7]
    [0 0  0 0 1   4]      [0 0  0 0 1   4]
x1 + 2x2 = 5
3x1 −  x2 = 1
4x1 +  x2 = 6

[1  2 5]     [1  2   5]     [1 2 5]
[3 −1 1]  →  [0 −7 −14]  →  [0 1 2]
[4  1 6]     [0 −7 −14]     [0 0 0]
x1 = −24 + 2x3 − 3x4
x2 = −7 + 2x3 − 2x4
x5 = 4
x3 = t1 ∈ R (free variable)
x4 = t2 ∈ R (free variable)
        [−24]       [2]         [−3]
        [ −7]       [2]         [−2]
Ax = A  [  0]  + A  [1] t1 + A  [ 0] t2
        [  0]       [0]         [ 1]
        [  4]       [0]         [ 0]

   = b + 0 + 0
   = b
Summary of the process of solving Ax = b
1. Perform elementary row operations on the augmented matrix
[A | b] to make it into an echelon form.
2. Determine if this system is consistent. If it is inconsistent, no solu-
tions exist. Otherwise, further make it into a reduced echelon form.
3. If there are no free variables, we can immediately obtain the unique
solution of this system from the reduced echelon form.
[1           b′1]
[   1        b′2]
[     . .    .. ]
[         1  b′n]
4. If there are free variables, solve the reduced system of equations for
the basic variables in terms of free variables.
(In this case, there are infinitely many solutions.)
[1 0 −2 3 0 −24]
[0 1 −2 2 0  −7]
[0 0  0 0 1   4]
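Step 4 above can be checked numerically: using the reduced echelon form from this example, every choice of the free parameters gives a solution.

```python
# The reduced system from the slides: x1 = -24 + 2*x3 - 3*x4,
# x2 = -7 + 2*x3 - 2*x4, x5 = 4, with x3 = t1 and x4 = t2 free.
R = [[1, 0, -2, 3, 0], [0, 1, -2, 2, 0], [0, 0, 0, 0, 1]]
b = [-24, -7, 4]

def solution(t1, t2):
    """Parametric solution for free variables x3 = t1, x4 = t2."""
    return [-24 + 2*t1 - 3*t2, -7 + 2*t1 - 2*t2, t1, t2, 4]

# Any choice of (t1, t2) satisfies every equation of the reduced system.
for t1, t2 in [(0, 0), (1, 0), (0, 1), (5, -3)]:
    x = solution(t1, t2)
    for row, bi in zip(R, b):
        assert sum(r * xi for r, xi in zip(row, x)) == bi
```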
Example
x +  y + 2z = 9       [1 1  2 9]    [1 1   2    9]
2x + 4y − 3z = 1  ⇒   [2 4 −3 1] ⇒  [0 2  −7  −17]
3x + 6y − 5z = 0      [3 6 −5 0]    [0 3 −11  −27]

   [1 1    2      9]    [1 1    2      9]
⇒  [0 1 −7/2  −17/2] ⇒  [0 1 −7/2  −17/2]
   [0 3  −11    −27]    [0 0 −1/2   −3/2]

   [1 1    2      9]    [1 1 0 3]    [1 0 0 1]
⇒  [0 1 −7/2  −17/2] ⇒  [0 1 0 2] ⇒  [0 1 0 2]
   [0 0    1      3]    [0 0 1 3]    [0 0 1 3]

   [x]   [1]
⇒  [y] = [2]
   [z]   [3]
Note for Ax = b, A: m × n
          [1 0 −2 3 0 −24]
[A | b] ∼ [0 1 −2 2 0  −7]    (A : 4 × 5)
          [0 0  0 0 1   4]
          [0 0  0 0 0   0]
(#: number)
(# of basic variables) + (# of free variables) = (# of variables) = n
(# of basic variables) = (# of effective equations) ≤ m
    [x1]   [2]      [−3]
    [x2]   [2]      [−2]
x = [x3] = [1] t1 + [ 0] t2 ,   t1, t2 ∈ R
    [x4]   [0]      [ 1]
    [x5]   [0]      [ 0]

with
  [2]           [−3]
  [2]           [−2]
A [1] = 0 ,   A [ 0] = 0
  [0]           [ 1]
  [0]           [ 0]

Therefore, Ax = 0.
Conclusions:
Ax = 0 has free variables.
⇐⇒ Ax = 0 has non-trivial solutions.
⇐⇒ Ax = b has infinitely many solutions, if it is consistent.
The solution set of a consistent system Ax = b, with a particular
solution p (Ap = b), is
{p + vh | Avh = 0}
(See the previous example.)
Proof :
Let S1 = {x | Ax = b} be the solution set of Ax = b, and
S2 = {p + vh | Avh = 0}, Ap = b.
1. For any p + vh ∈ S2,
A(p + vh) = Ap + Avh = b + 0 = b
so p + vh ∈ S1. Therefore, S2 ⊆ S1.
2. On the other hand, let w ∈ S1, so Aw = b. Note
A(w − p) = Aw − Ap = b − b = 0
Let vh = w − p; then Avh = 0 and w = p + vh ∈ S2. Therefore,
S1 ⊆ S2.
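The structure proved above can be checked numerically; the small 2 × 3 matrix below is an assumption for illustration, not from the text.

```python
# Every p + vh with Ap = b and A vh = 0 solves Ax = b.
def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

A = [[1, 0, 2], [0, 1, -1]]
b = [3, 1]
p  = [3, 1, 0]     # a particular solution: Ap = b
vh = [-2, 1, 1]    # a homogeneous solution: A vh = 0

assert matvec(A, p) == b
assert matvec(A, vh) == [0, 0]
w = [pi + vi for pi, vi in zip(p, vh)]
assert matvec(A, w) == b          # p + vh still solves Ax = b
```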
Definition Two matrices A and B are row equivalent, written A ∼ B,
if one can be obtained from the other by a sequence of elementary
row operations:
A ∼ A1 ∼ A2 ∼ · · · ∼ B
Note that
1. A matrix A is row equivalent to all its row echelon forms.
2. A matrix A is row equivalent to its reduced row echelon form.
3. All the row echelon forms of A are row equivalent.
A ∼ R 1 , A ∼ R2 , · · ·
R1 ∼ R2, R2 ∼ R3 · · ·
Recall that a matrix can have many row echelon forms.
     [1 ∗ ∗ ∗ ∗]
R1 = [0 1 ∗ ∗ ∗]
     [0 0 0 1 ∗]
     [0 0 0 0 0]
If a matrix A has two reduced row echelon forms C1 and C2, then
A ∼ C1 and A ∼ C2
(Equivalence relation)
1. A ∼ A. (Reflexivity)
2. If A ∼ B, then B ∼ A. (Symmetry)
3. If A ∼ B, B ∼ C then A ∼ C. (Transitivity)
When there are infinitely many solutions, one may want to find
a solution x̃ that has the minimum “length”.
3. An m × n zero matrix O is an m × n matrix all of whose entries are zero:
    [0 · · · 0]
O = [..      ..]
    [0 · · · 0]
4. A diagonal matrix is a square matrix whose entries are all zero
except possibly the diagonal entries:
    [d11         O ]
D = [    d22       ]
    [        . .   ]
    [ O         dnn]
The identity matrix
     [1       O]
In = [  1      ]  = [e1 e2 · · · en]
     [    . .  ]
     [O       1]
Consider a transformation T from Rn to Rm,
T : Rn → Rm : v ↦ T (v)
which maps v to T (v). The domain of T is Rn and the codomain is
Rm .
     [1]        [0]               [0]
     [0]        [1]               [0]
e1 = [..] , e2 = [..] , · · · , en = [..]
     [0]        [0]               [1]
For T : Rp → Rn : x ↦ y = Bx and S : Rn → Rm : y ↦ z = Ay,
the composition is
S ◦ T : Rp → Rm : x ↦ z = ABx
Now for the m × n matrix A and n × p matrix B,
A = [a1 a2 · · · an] = [aij ] ,  ak ∈ Rm

[AB]ij = Σk=1..n aik bkj

                          [b1j]
        = [ai1 · · · ain] [ .. ]
                          [bnj]
Remarks
1. A = B if size(A) = size(B) = (m, n), and
[A]ij = [B]ij , 1 ≤ i ≤ m, 1 ≤ j ≤ n
4. A − B = A + (−B).
The algebraic properties of matrices,
1. A + B = B + A
2. (A + B) + C = A + (B + C)
3. A + O = A, A + (−A) = O
4. r(A + B) = rA + rB
5. (r + s)A = rA + sA
6. r(sA) = (rs)A, 1A = A
where A, B, and C are of the same sizes, and
r, s are real or complex numbers.
                         [a1j]
[BA]ij = [bi1 · · · bin] [ .. ]
                         [anj]
In addition,
AB = O ⇏ A = O or B = O
Cf. for any a, b ∈ R (or C), we have ab = 0 ⇒ a = 0 or b = 0.
AB = AC ⇏ B = C
Cf.
A+B =A+C ⇒ B =C
We define A0 = I, if A ̸= O.
Note that
(A + B)2 = (A + B)(A + B) = A2 + AB + BA + B2
(A + B)3 = (A + B)(A2 + AB + BA + B2) = · · ·
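The expansion above keeps AB and BA separate because multiplication does not commute; the 2 × 2 matrices below are assumptions for illustration.

```python
# Matrix product and sum on plain nested lists.
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def matadd(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

A = [[1, 1], [0, 1]]
B = [[1, 0], [1, 1]]
assert matmul(A, B) != matmul(B, A)       # AB != BA

S = matadd(A, B)
lhs = matmul(S, S)                        # (A+B)^2
rhs = matadd(matadd(matmul(A, A), matmul(A, B)),
             matadd(matmul(B, A), matmul(B, B)))
assert lhs == rhs                         # A^2 + AB + BA + B^2
```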
    [a11 · · · a1j · · · a1n]
    [ ..        ..       .. ]
A = [ai1 · · · aij · · · ain]  = [a1 a2 · · · an]
    [ ..        ..       .. ]
    [am1 · · · amj · · · amn]

     [a11 · · · ai1 · · · am1]   [a1T]
     [ ..        ..       .. ]   [a2T]
AT = [a1j · · · aij · · · amj] = [ .. ]
     [ ..        ..       .. ]   [anT]
     [a1n · · · ain · · · amn]
[(AB)T ]ij = [AB]ji = (jth row of A) · (ith column of B)
[B T AT ]ij = (ith row of B T ) · (jth column of AT )
           = (ith column of B) · (jth row of A)
so (AB)T = B T AT .
Cf.
(ABCD)T = DT C T B T AT
From the above, since
(ABCD · · · )−1 = · · · D−1C −1B −1A−1
we have (let A = B = C · · · )
(An)−1 = (A−1)n = A−n
We also note
(cA)−1 = c−1A−1
(TA)−1 = TA−1
TA : x 7→ Ax
TA−1 : Ax 7→ x
Example Consider a linear transformation T on R3, defined as
   [a]    [a + b]
T ([b]) = [b + c]
   [c]    [c + a]
c c+a
then the standard matrix of T is
                             [1 1 0]
A = [T (e1) T (e2) T (e3)] = [0 1 1]
                             [1 0 1]
Note
  [a]   [1 1 0] [a]   [a + b]      [a]
A [b] = [0 1 1] [b] = [b + c] = T ([b])
  [c]   [1 0 1] [c]   [c + a]      [c]
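The construction above can be sketched directly: build the standard matrix column-by-column from T (e1), T (e2), T (e3), then check that multiplying by it reproduces T.

```python
# The linear transformation from the example above.
def T(v):
    a, b, c = v
    return (a + b, b + c, c + a)

es = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
cols = [T(e) for e in es]                           # columns T(e_k)
A = [[cols[j][i] for j in range(3)] for i in range(3)]
assert A == [[1, 1, 0], [0, 1, 1], [1, 0, 1]]       # the standard matrix

v = (2, 3, 5)                                        # an arbitrary test vector
Av = tuple(sum(A[i][j] * v[j] for j in range(3)) for i in range(3))
assert Av == T(v)                                    # A v = T(v)
```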
• Elementary matrices
Recall the three elementary row operations on a matrix.
1. (Replacement) Add to one row a multiple of another row.
2. (Interchange) Interchange two rows.
3. (Scaling) Multiply all entries in a row by a non-zero constant.
     [1 0 0]        [1 0 0]        [1 0 0]        [a11 a12 a13]
E1 = [0 1 0] , E2 = [0 0 1] , E3 = [0 c 0] ,  A = [a21 a22 a23]
     [5 0 1]        [0 1 0]        [0 0 1]        [a31 a32 a33]

       [    a11          a12          a13    ]
E1 A = [    a21          a22          a23    ]
       [a31 + 5a11   a32 + 5a12   a33 + 5a13 ]

       [a11 a12 a13]          [ a11  a12  a13]
E2 A = [a31 a32 a33] , E3 A = [ca21 ca22 ca23]
       [a21 a22 a23]          [ a31  a32  a33]
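The effect of left-multiplication by an elementary matrix can be checked on a concrete A (the numeric entries below are an assumption for illustration).

```python
# Left-multiplying by E1 adds 5 times row 1 to row 3, as shown above.
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

E1 = [[1, 0, 0], [0, 1, 0], [5, 0, 1]]
A  = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

E1A = matmul(E1, A)
assert E1A[0] == A[0] and E1A[1] == A[1]             # rows 1, 2 unchanged
assert E1A[2] == [7 + 5*1, 8 + 5*2, 9 + 5*3]         # row 3 + 5 * row 1
```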
Proof : ( (a)⇒(b)⇒(c)⇒(d)⇒(a) )
• Triangular matrices
A square matrix whose entries below (above) the main diagonal are
zero is called upper (lower) triangular.

[∗ ∗ ∗ ∗]                       [∗ 0 0 0]
[0 ∗ ∗ ∗]                       [∗ ∗ 0 0]
[0 0 ∗ ∗] (upper triangular) ,  [∗ ∗ ∗ 0] (lower triangular)
[0 0 0 ∗]                       [∗ ∗ ∗ ∗]
• Symmetric matrices
Definition A square matrix A is called symmetric if AT = A.
(AB)T = B T AT = BA ̸= AB
(BB T )T = (B T )T B T = BB T
• Partitioned matrices
     [A11B1 + A12B2]
AB = [A21B1 + A22B2] ,   CA = [C1A11 + C2A21   C1A12 + C2A22]
2. Similarly,
2. Similarly,
[A]     [AC]        [A]   [DA]
[B] C = [BC] ,    D [B] ≠ [DB]
Let
    [A11 A12]
A = [A21 A22]
then
     [A11T  A21T]
AT = [A12T  A22T]
Exercise
Find the transpose of B and C
    [B1]
B = [B2] ,   C = [C1 C2]
• Column-Row expansion of AB
Note that A is invertible implies that both A11 and A22 are invertible,
and vice versa.
Exercise Check

[A11−1   −A11−1 A12 A22−1] [A11  A12]
[  O            A22−1    ] [ O   A22]  =  I
Exercise Find the inverse of
    [A11   O ]
M = [A21  A22]

Hint
       [A11   O ]T    [A11T  A21T]
M T =  [A21  A22]  =  [  OT  A22T]

(M T )−1 = (M −1)T
• LU decomposition (factorization)

    [1 0 0 0] [• ∗ ∗ ∗ ∗]
A = [∗ 1 0 0] [0 • ∗ ∗ ∗]
    [∗ ∗ 1 0] [0 0 0 • ∗]
    [∗ ∗ ∗ 1] [0 0 0 0 0]
        L          U

A : m × n,  L : m × m,  U : m × n.
Ax = b

    [1 0 0 0] [• ∗ ∗ ∗ ∗]
A = [∗ 1 0 0] [0 • ∗ ∗ ∗]  = LU
    [∗ ∗ 1 0] [0 0 0 • ∗]
    [∗ ∗ ∗ 1] [0 0 0 0 0]
L(U x) = b
Ly = b
Ux = y
− Algorithm for an LU factorization
Assume we can reduce a matrix A to an echelon form U by elementary
row operations, without the row interchange,
              [• ∗ ∗ ∗ ∗]
A ∼ ··· ∼ U = [0 • ∗ ∗ ∗]
              [0 0 0 • ∗]
              [0 0 0 0 0]

Eq · · · E1 A = U

where each Ek is lower triangular, for example,

[ 1 0 0 0]   [1 0 0 0]   [1 0 0 0]
[−2 1 0 0]   [0 1 0 0]   [0 1 0 0]
[ 0 0 1 0] , [3 0 1 0] , [0 0 1 0]
[ 0 0 0 1]   [0 0 0 1]   [0 0 0 2]
In
Eq · · · E 1 A = U
the matrix (Eq · · · E1) is lower triangular.
Then
A = (Eq · · · E1)−1U = LU
where
L = (Eq · · · E1)−1
is lower triangular.
A = (E1−1 · · · Eq−1)U
Example
    [ 2  4 −1  5 −2]
A = [−4 −5  3 −8  1]
    [ 2 −5 −4  1  8]
    [−6  0  7 −3  1]

    [2  4 −1  5 −2]      [2 4 −1 5 −2]      [2 4 −1 5 −2]
∼   [0  3  1  2 −3]  ∼   [0 3  1 2 −3]  ∼   [0 3  1 2 −3]  = U
    [0 −9 −3 −4 10]      [0 0  0 2  1]      [0 0  0 2  1]
    [0 12  4 12 −5]      [0 0  0 4  7]      [0 0  0 0  5]

Recording each column of multipliers (divided by its pivot) builds L
from I:

    [ 1  0 0 0]
L = [−2  1 0 0]
    [ 1 −3 1 0]
    [−3  4 2 1]

Example
    [6 −2 0]
A = [9 −1 1]
    [3  7 5]

    [1 −1/3 0]     [1 −1/3  0 ]     [1 −1/3   0 ]
∼   [9  −1  1]  ∼  [0   2   1 ]  ∼  [0   1   1/2]  = U
    [3   7  5]     [0   8   5 ]     [0   0    1 ]

with the multipliers recorded in

    [6 0 0]
L = [9 2 0] ,   A = LU
    [3 8 1]
Note that
1. LU decomposition doesn’t necessarily exist for every m × n
matrix A.
          [1 2 4]
A ∼ ··· ∼ [0 0 1]
          [0 3 7]
2. In general, if row interchanges are required to reduce A to its row-
echelon form, then there is no LU decomposition of A. In that case
one can factor P ′A = LU with a permutation matrix P ′, i.e.,
A = P LU, where P = (P ′)−1.
Definition For a square matrix A of size n, we define the trace of A as

tr(A) = Σk=1..n akk ,   A = [aij ]

the sum of the entries a11, a22, . . . , ann on the main diagonal of A.

tr(BA) = Σk=1..n [BA]kk = Σk=1..n Σℓ=1..n [B]kℓ [A]ℓk
Exercise
Suppose that A and B are two m × n matrices. Prove that
tr(B T A) = tr(AB T )
B T A is n × n, while AB T is m × m.
tr(B T A) = Σk=1..n [B T A]kk = Σk=1..n Σℓ=1..m [B T ]kℓ [A]ℓk = Σk Σℓ [B]ℓk [A]ℓk

tr(AB T ) = Σk=1..m [AB T ]kk = Σk=1..m Σℓ=1..n [A]kℓ [B T ]ℓk = Σk Σℓ [A]kℓ [B]kℓ

Both are the sum of all products [A]ij [B]ij , so tr(B T A) = tr(AB T ).
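The cyclic trace identity can be verified numerically; the matrices below are assumptions for illustration.

```python
# tr(AB) = tr(BA): both traces sum the same products of entries.
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def tr(M):
    return sum(M[k][k] for k in range(len(M)))

A = [[1, 2], [3, 4]]
B = [[0, 5], [6, 7]]
assert tr(matmul(A, B)) == tr(matmul(B, A))   # even though AB != BA
```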
■ Determinants
A ↦ det(A) ∈ R (or C)
For A = [a1 a2] (2 × 2), the area of the parallelogram spanned by a1
and a2 is |det(A)|; for A = [a1 a2 a3] (3 × 3), the volume of the
parallelepiped spanned by a1, a2, a3 is |det(A)|.
Example:
    [1 −2  5  0]          [1  5  0]
A = [2  0  4 −1] ,  A32 = [2  4 −1]
    [3  1  0  7]          [0 −2  0]
    [0  4 −2  0]
1. For a 1 × 1 matrix A = [a11], we define det(A) = a11.
We call
(1) det(Aij ) : the minor of aij , or the (i, j)-minor.
    [a11 a12 a13]
det [a21 a22 a23]
    [a31 a32 a33]

          [a22 a23]           [a21 a23]           [a21 a22]
= a11 det [a32 a33] − a12 det [a31 a33] + a13 det [a31 a32]
• In fact, it can be proved that we can expand along any row, say
the ith row,
We conclude that
det (Ek A) = (det Ek ) (det A), k = a, b, c (5)
By (5), we have
det A = det (E1E2 · · · · · Ep)
= (det E1) det (E2 · · · · · Ep)
= (det E1) (det E2) · · · · · (det Ep) ̸= 0
• If A is not invertible
Ep′ · · · E1′ A = I˜
A = E1 · · · EpI˜
For example,
     [1 2 0 0]
I˜ = [0 0 1 0] ,   det I˜ = 0
     [0 0 0 1]
     [0 0 0 0]

and
det A = det (E1E2 · · · EpI˜) = det (E1E2 · · · Ep) det I˜ = 0
Theorem det(AB) = (det A)(det B).
If A is invertible, let
A = E 1 E2 · · · E p
By (5),
det AB = det (E1E2 · · · EpB)
= (det E1) [det (E2 · · · EpB)] = · · ·
= (det E1) (det E2) · · · · · (det Ep)(det B)
= (det E1E2 · · · Ep)(det B)
= (det A) (det B)
Therefore, we have
det AB = det BA = (det A) (det B)
although AB ̸= BA in general.
Note that
1. det (A + B) ̸= det A + det B (in general)
2. det (cA) = cn(det A), if A is of size n × n
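Both notes can be checked on 2 × 2 matrices; the entries below are assumptions for illustration.

```python
# det(A+B) != det(A) + det(B) in general, while det(cA) = c^n det(A).
def det2(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
S = [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]
assert det2(S) != det2(A) + det2(B)

cA = [[2 * x for x in row] for row in A]
assert det2(cA) == 2**2 * det2(A)          # n = 2 here
```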
• Cramer’s Rule
For an n × n matrix A, and b ∈ Rn,
A = [a1 · · · ai · · · an]
define
Ai(b) = [a1 · · · ai−1 b ai+1 · · · an]
namely, replace ai by b.
Example
Ii(x) = [e1 · · · x · · · en]
[1 0 · · · x1 · · · 0]
[0 1       x2       0]
[ ..  . .  ..       ..]
[           xi        ]   ⇒ det Ii(x) = xi
[ ..       ..   . .  ..]
[0 0       xn        1]
Theorem (Cramer’s Rule): Let A be an n × n invertible matrix.
For any b in Rn, the unique solution x of Ax = b has entries given by
xi = det Ai(b) / det A ,   i = 1, 2, · · · , n
Proof :
A Ii(x) = A[e1 · · · x · · · en]
        = [Ae1 · · · Ax · · · Aen]
        = [a1 · · · b · · · an]
        = Ai(b)
Taking determinants, (det A)(det Ii(x)) = det Ai(b), and
det Ii(x) = xi, so xi = det Ai(b) / det A.
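Cramer's rule can be sketched for the earlier 2 × 2 system x1 + 2x2 = 5, 3x1 − x2 = 1, whose solution was (1, 2).

```python
from fractions import Fraction as F

# Cramer's rule in R^2: x_i = det(A_i(b)) / det(A), where A_i(b)
# replaces column i of A by b.
def det2(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def cramer2(A, b):
    d = det2(A)
    A1 = [[b[0], A[0][1]], [b[1], A[1][1]]]   # replace column 1 by b
    A2 = [[A[0][0], b[0]], [A[1][0], b[1]]]   # replace column 2 by b
    return [F(det2(A1), d), F(det2(A2), d)]

A = [[1, 2], [3, -1]]
b = [5, 1]
assert cramer2(A, b) == [1, 2]                # x1 = 1, x2 = 2
```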
A = [a1 · · · ai · · · an]
Ref.
det(A) = ai1 Ci1 + ai2 Ci2 + · · · + ain Cin
= a1j C1j + a2j C2j + · · · + anj Cnj
Since [bj ]i = [B]ij , we have

[B]ij = Cji / det A

Hence
             1   [C11 C21 · · · Cn1]
A−1 = B = ------ [C12 C22 · · · Cn2] = (1 / det A) adj(A)      (6)
          det A  [ ..  ..       .. ]
                 [C1n C2n · · · Cnn]
1. It holds if
    [ℓ1  0]        [ℓ1  0   0]
A = [ 0 ℓ2] ,  B = [ 0  ℓ2  0]
                   [ 0  0  ℓ3]

then det A = ℓ1ℓ2 and det B = ℓ1ℓ2ℓ3 are the corresponding area
and volume, respectively.
1. For two vectors u = (u1, u2, . . . , un)T and v = (v1, v2, . . . , vn)T
in Rn, the sum u + v is defined by
u + v = (u1 + v1, u2 + v2, . . . , un + vn)T
3. Mmn(R), or Mmn(C)
M1 + M2 , cM1
4. { f (t) ∈ R, t ∈ R }
f1(t) + f2(t), cf1(t)
Remarks
1. For the above sets, we have similar definitions of addition “+”
and scalar multiplication.
2. How to efficiently study further issues such as the concept of bases,
linear transformation, eigenvalues/eigenvectors, inner products, etc.?
The properties of the above theorem are actually common for the
following sets.
1. Rn (or Cn)
2. p(t) = a0 + a1t + a2t2 + · · · + antn, ak ∈ R or C
3. Mmn(R), or Mmn(C)
4. { f (t) ∈ R, t ∈ R }
5. { {s} = (s1, s2, . . .), sk ∈ R or C }
6. { X : Ω → R }
We use the properties of the vectors in Rn (or Cn) to define
a vector space.
Axiom (or Definition )
A vector space V over a field F (R or C) is a nonempty set V of
vectors on which are defined two operations, called
addition (+) and scalar multiplication,
- addition: u + v
- scalar multiplication: cu
for u, v ∈ V and c ∈ F (R or C), satisfying the following axioms:
1. u + v ∈ V (closure under addition)
2. cu ∈ V (closure under scalar multiplication)
3. u + v = v + u
4. (u + v) + w = u + (v + w)
5. There is a zero vector 0 in V such that u + 0 = u for any u.
6. For each u in V , there exists a vector w such that u + w = 0.
We use −u to denote w.
7. c(u + v) = cu + cv
8. (c + d)u = cu + du
9. c(du) = (cd)u
10. 1u = u
3. Mmn(R), or Mmn(C)
4. {f (t) ∈ R, t ∈ R}
Example The set of real-valued functions defined on (a, b),
F (a, b) = {f | f : (a, b) → R}
is a vector space. The space F (a, b) is not of finite dimension.
(V, F )
Example (Rn, R), (Cn, C), (F n, F ) are vector spaces.
Fact: The zero vector 0 in a vector space V is unique.
Assume we have two zero vectors 01 and 02 in V . Then
01 = 01 + 02 = 02
Fact: The negative of a vector u is unique.
Assume u + w1 = 0 and u + w2 = 0. Then
w1 = w1 + 0 = w1 + (u + w2) = (w1 + u) + w2 = 0 + w2 = w2
Therefore, in a vector space,
1. the zero vector 0,
2. and the negative vector of a vector u, denoted by −u,
are well-defined.
Theorem (Cancellation law for vector addition)
Let V be a vector space and u, v, w are vectors in V . If
u+w =v+w
then u = v.
Proof.
There exists a z ∈ V such that w + z = 0.
u = u + 0 = u + (w + z)
= (u + w) + z
= (v + w) + z
= v + (w + z)
=v+0=v
Theorem Let (V, F ) be a vector space, u a vector in V , and
k a scalar in F . Then
1. 0u = 0
2. k0 = 0
3. (−1)u = −u
4. If ku = 0, then k = 0 or u = 0
Proof
1. 0u + 0u = (0 + 0)u = 0u = 0u + 0
By the cancellation law for vector addition, we have 0u = 0.
2. k0 = k(0 + 0) = k0 + k0
k0 = k0 + 0
⇒ k0 + k0 = k0 + 0 ⇒ k0 = 0 (by the cancellation law)
Example On V = R2 with the usual addition, define the scalar
multiplication
ku = k(u1, u2) = (ku1, 0)
The first nine rules of the axioms are satisfied. However, Axiom 10
fails to hold:
1u = 1(u1, u2) = (u1, 0) ̸= u (whenever u2 ̸= 0)
In the following, we will study the issues of subspaces and the dimension
of a vector space.
• Subspaces
Definition A subspace of a vector space V is a subset H of V
that is also a vector space.
To determine if a subset H of a vector space V is a subspace, we
only need to examine the following three rules, for all u, v ∈ H, and
c ∈ F,
1. 0 ∈ H
2. (Closure) u ∈ H, v ∈ H ⇒ u + v ∈ H
3. (Closure) u ∈ H, c ∈ F ⇒ cu ∈ H ⇒ (−1)u = −u ∈ H
Example A line through the origin, {(x, y) | y = ax}, is a subspace
of R2.
Example Assume m < n, and
W = {c0 + c1t + · · · + cmtm | c0, . . . , cm ∈ R}
V = {c0 + c1t + · · · + cntn | c0, . . . , cn ∈ R}
Then W is a (proper) subspace of V .
Example
P (a, b) ⊂ C ∞(a, b) ⊂ C m(a, b) ⊂ C 1(a, b) ⊂ C(a, b) ⊂ F (a, b)
Exercise Is H ∪ W a subspace of V ?
The subspaces of R3 are {0}, lines through the origin, planes through
the origin, and R3 itself.
For any two subspaces W and U , we have 0 ∈ W ∩ U .
How to define the dimension of a vector space?
1. 0 ∈ W (Let c1 = · · · = cp = 0)
2. Assume u ∈ W , s ∈ W , k ∈ F , with
u = c1v1 + c2v2 + · · · + cpvp
s = d1v1 + d2v2 + · · · + dpvp
then u + s = (c1 + d1)v1 + (c2 + d2)v2 + · · · + (cp + dp)vp ∈ W
and ku = (kc1)v1 + (kc2)v2 + · · · + (kcp)vp ∈ W
c1 v 1 + c2 v 2 + · · · + cp v p = 0
If v1, v2, . . . , vp are linearly dependent, there are c1, c2, . . . , cp,
not all zero,
|c1|2 + |c2|2 + · · · + |cp|2 ̸= 0
such that
c1 v 1 + c2 v 2 + · · · + cp v p = 0
In this case, say, ck ̸= 0; then
vk = −(c1/ck)v1 − · · · − (ck−1/ck)vk−1 − (ck+1/ck)vk+1 − · · · − (cp/ck)vp
that is, vk is a linear combination of the other vectors.
Example The set of vectors {(1, 0, 0)T , (0, 1, 0)T , (0, 0, 1)T } is a
linearly independent set,

   [1]      [0]      [0]   [0]
c1 [0] + c2 [1] + c3 [0] = [0]  ⇒ c1 = c2 = c3 = 0
   [0]      [0]      [1]   [0]

while the set {(1, 0)T , (0, 1)T , (2, 3)T } is not an independent set,

[2]     [1]     [0]
[3] = 2 [0] + 3 [1]
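The dependence relation just quoted can be checked directly: (2, 3) = 2(1, 0) + 3(0, 1) gives a non-trivial solution of c1v1 + c2v2 + c3v3 = 0.

```python
# A non-trivial dependence relation for {(1,0), (0,1), (2,3)}.
v1, v2, v3 = (1, 0), (0, 1), (2, 3)
c = (2, 3, -1)                       # not all zero
combo = tuple(c[0]*a + c[1]*b + c[2]*d for a, b, d in zip(v1, v2, v3))
assert combo == (0, 0)               # c1 v1 + c2 v2 + c3 v3 = 0
assert any(ci != 0 for ci in c)      # so the set is linearly dependent
```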
For a vector space V ,
v1 , . . . , v p ∈ V
x ∈ span{v1, . . . , vp} = W
For example, consider the space V of polynomials of degree smaller
than 3. Let
β = {1, t, t2}
be a basis of V . For a polynomial p(t) = d0 + d1t + d2t2, we have

            [d0]
[ p(t) ]β = [d1] ∈ R3
            [d2]
A vector space V may have more than one basis. A set of vectors
{v1, . . . , vn} in V is a basis of V if it is linearly independent and
it spans V .
For example, in R2, we can choose β = {(1, 0), (0, 1)} as a basis,
or γ = {(1, 1), (1, −1)} as a basis.
Theorem A vector space V can have different bases, say,
β = {b1, . . . , bn}, γ = {f1, . . . , fm}.
However, they must have the same number of vectors, i.e., m = n.
Proof:
1. m > n
Consider the coordinates of f1, . . . , fm relative to β,
        [f11]          [f21]                   [fm1]
[f1]β = [f12] , [f2]β = [f22] , . . . , [fm]β = [fm2]
        [ ..]          [ ..]                   [ ..]
        [f1n]          [f2n]                   [fmn]
This definition is well-defined, since all bases have the same number of
vectors.
Then
v1 ∉ span(β ′)
since β is an independent set. Therefore,
span(β ′) ̸= V
and β ′ cannot span V .
Summary:
For a vector space V , a basis β = {v1, v2, . . . , vn}
1. is a linearly independent set,
2. spans (generates) V , and
3. n = dim V .
Theorem Let W be a subspace of a finite-dimensional vector space V .
Then
dim(W ) ≤ dim(V )
Proof :
• Let βW be a basis of W . Since βW ⊂ W ⊆ V and V is
finite-dimensional, βW is a finite set, and W is finite-dimensional.
• Since βW can be extended, if necessary, to become a basis βV of V ,
we have dim(W ) ≤ dim(V ) for βW ⊆ βV .
• If dim(W ) = dim(V ) = n, then βW is an independent set of n vectors
in V , and βW becomes a basis for V . So V = W = span(βW ).
Note the above result may not be true if the dimensions of V and W
are not finite, dim(V ) = dim(W ) = ∞.
For example, P (a, b) ⊊ C(a, b).
• Change of basis in Rn
Let ϵ = {e1, . . . , en} be the standard basis of Rn (ek has a 1 in the
kth entry and 0 elsewhere), and let β = {u1, . . . , un} and
γ = {v1, . . . , vn} be two other bases. Write
x = c1u1 + · · · + cnun = d1v1 + · · · + dnvn
Then
       [c1]          [d1]
       [c2]          [d2]
[x]β = [..] , [x]γ = [..]
       [cn]          [dn]

and
           [x1]                 [c1]                 [d1]
x = [x]ϵ = [..] = [u1 · · · un] [..] = [v1 · · · vn] [..]
           [xn]                 [cn]                 [dn]

  = Pβ [x]β = Pγ [x]γ

2. [x]γ = Pγ−1Pβ [x]β = Pβγ [x]β , where Pβγ := Pγ−1Pβ .
Example Consider a basis β = {u1, u2} of R2, where
     [1]        [1]
u1 = [0] , u2 = [2]
If
       [−2]
[x]β = [ 3]
then
                     [1]     [1]   [1]
x = −2u1 + 3u2 = (−2)[0] + 3 [2] = [6] = [x]ϵ
Example Consider the same basis β = {u1, u2} of R2, where
     [1]        [1]
u1 = [0] , u2 = [2]
If
           [1]
x = [x]ϵ = [6]
assume
       [c1]
[x]β = [c2]
then
                  [1 1] [c1]   [1]
x = c1u1 + c2u2 = [0 2] [c2] = [6]

         [c1]   [1 1]−1 [1]   [1 −0.5] [1]   [−2]
⇒ [x]β = [c2] = [0 2]   [6] = [0  0.5] [6] = [ 3]
                          [−9]   [−5]
Example Let β = {u1, u2} = { [ 1] , [−1] } and

               [ 1]   [ 3]
γ = {v1, v2} = { [−4] , [−5] } be two bases in R2.

              [2]            [d1]
Assume [x]β = [3] , let [x]γ = [d2] . Then

            [2]           [d1]
x = [u1 u2] [3] = [v1 v2] [d2]

and
[−9 −5] [2]   [ 1  3] [d1]     [d1]   [ 24]
[ 1 −1] [3] = [−4 −5] [d2]  ⇒  [d2] = [−19]
• Change of basis in a general vector space V

⇒ d0 + d1 = 2, d1 + d2 = 3, d2 = 4
⇒ d2 = 4, d1 = −1, d0 = 3, [p]γ = (3, −1, 4)T
Consider an m × n matrix A,

                       [r1T ]
                       [r2T ]
A = [c1 c2 · · · cn] = [ .. ] ,   ck ∈ Rm , rk ∈ Rn
                       [rmT ]
Definition
• The column space of A: Col(A) = span{c1, c2, . . . , cn} ⊆ Rm
• The row space of A: Row(A) = span{r1, r2, . . . , rm} ⊆ Rn
• The null space of A is
Nul(A) = {x | Ax = 0} ⊂ Rn
(Figure: A : Rn → Rm , with Nul(A) ⊆ Rn mapped to 0 and
Col(A) ⊆ Rm the image.)
A : m × n,  x = (x1, x2, . . . , xn)T
Col(A) = {Ax | x ∈ Rn}
       = {x1c1 + x2c2 + · · · + xncn | xk ∈ R}
       = span(c1, c2, · · · , cn)
Nul(A) = {x | Ax = 0}
Theorem Elementary row operations do not change the null space
of a matrix.
Proof
Nul(A) = {x | Ax = 0}
= {x | EAx = 0}
since the elementary row operation is reversible, and each elementary
matrix is invertible.
In general, we have
{x | Ax = 0} ⊆ {x | BAx = 0}
or
Nul(A) ⊆ Nul(BA)
Theorem Elementary row operations do not change the row space
of a matrix.
Row(A) = Row(R)
= span{(1, −3, 4, −2, 5, 4), (0, 0, 1, 3, −2, −6), (0, 0, 0, 0, 1, 5)}
c1 a 1 + c2 a 2 + · · · + cn a n = 0
⇔ c1b1 + c2b2 + · · · + cnbn = 0
(or d 2 a 2 + d 4 a 4 − a 5 = 0 ⇔ d 2 b 2 + d 4 b 4 − b5 = 0 )
How to find a basis of Col(A)?
A = [a1 a2 . . . an]
That is, how to find a basis for span{a1 a2 . . . an}?
Example Find bases for Nul(A), Row(A) and Col(A) of the matrix
−3 6 −1 1 −7
A = 1 −2 2 3 −1
2 −4 5 8 −4
        [1 −2 0 −1  3 0]      x1 − 2x2 − x4 + 3x5 = 0
[A 0] ∼ [0  0 1  2 −2 0]            x3 + 2x4 − 2x5 = 0
        [0  0 0  0  0 0]                          0 = 0
[x1]   [2x2 + x4 − 3x5]      [2]      [ 1]      [−3]
[x2]   [      x2      ]      [1]      [ 0]      [ 0]
[x3] = [  −2x4 + 2x5  ] = x2 [0] + x4 [−2] + x5 [ 2]
[x4]   [      x4      ]      [0]      [ 1]      [ 0]
[x5]   [      x5      ]      [0]      [ 0]      [ 1]

Therefore, the solution set of Ax = 0, or Nul(A), is

Nul(A) = {x | Ax = 0}
       = { t1 (2, 1, 0, 0, 0)T + t2 (1, 0, −2, 1, 0)T
           + t3 (−3, 0, 2, 0, 1)T | t1, t2, t3 ∈ R }
Note that the three vectors
(2, 1, 0, 0, 0)T , (1, 0, −2, 1, 0)T , (−3, 0, 2, 0, 1)T
are linearly independent, so they form a basis of Nul(A).
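Each spanning vector found in this example can be checked against A directly:

```python
# Verify that each basis vector of Nul(A) satisfies A v = 0
# for the matrix A of the example above.
A = [[-3,  6, -1, 1, -7],
     [ 1, -2,  2, 3, -1],
     [ 2, -4,  5, 8, -4]]
basis = [(2, 1, 0, 0, 0), (1, 0, -2, 1, 0), (-3, 0, 2, 0, 1)]

for v in basis:
    assert [sum(a * x for a, x in zip(row, v)) for row in A] == [0, 0, 0]
```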
Theorem For any matrix A, the two spaces Row(A) and Col(A)
have the same dimensions.
Proof
A ∼ R
Since Row(A) = Row(R), we have
dim(Row(A)) = dim(Row(R))
and likewise
dim(Col(A)) = dim(Col(R))
However,
dim(Col(R)) = dim(Row(R)) = number of pivots
        [1 −3 4 −2  5  4]
A ∼ R = [0  0 1  3 −2 −6]
        [0  0 0  0  1  5]
        [0  0 0  0  0  0]
Definition For a matrix A, we define its rank as
rank(A) = dim(Col(A)) = dim(Row(A))
For an m × n matrix A, we have
rank(A) = dim(Col(A)) ≤ min(m, n)
        [1 −2 0 −1  3]
A ∼ R = [0  0 1  2 −2]
        [0  0 0  0  0]
Example As in the previous examples,
    [ 1 −3  4 −2  5  4]        [1 −3 4 −2  5  4]
A = [ 2 −6  9 −1  8  2]  ∼ R = [0  0 1  3 −2 −6]
    [ 2 −6  9 −1  9  7]        [0  0 0  0  1  5]
    [−1  3 −4  2 −5 −4]        [0  0 0  0  0  0]
We have
1. Consistent ⇔ b ∈ Col(A)
Nul(A) = {x | Ax = 0} = {x | x1c1 + · · · + xncn = 0}
Ax = b, A : m × n, x ∈ Rn, b ∈ Rm
rank(A) + nullity(A) = n (A : m × n)
rank(A) ≤ min(m, n)
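The dimension theorem above can be checked on the earlier 3 × 5 example, whose null space had three free variables. The pivot-counting rank routine below is an illustrative sketch.

```python
from fractions import Fraction as F

# Count pivots by Gaussian elimination (exact rational arithmetic),
# then check rank(A) + nullity(A) = n.
def rank(M):
    M = [[F(x) for x in row] for row in M]
    r = 0
    for c in range(len(M[0])):
        piv = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if piv is None:
            continue                       # no pivot in this column
        M[r], M[piv] = M[piv], M[r]
        for i in range(len(M)):
            if i != r and M[i][c] != 0:
                f = M[i][c] / M[r][c]
                M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

A = [[-3, 6, -1, 1, -7], [1, -2, 2, 3, -1], [2, -4, 5, 8, -4]]
n, nullity = 5, 3        # nullity = number of free variables found earlier
assert rank(A) + nullity == n
```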
Overdetermined (m > n) Ax = b
Underdetermined (m < n) Ax = b
Let Ax = b be a consistent linear system of m equations
in n unknowns (A : m × n).
If A has rank r, then dim( Nul(A) ) = n − r = k, the number of
free variables.

A = [c1 c2 . . . cn], ck ∈ Rm
Ax = 0, x = (x1, x2, . . . , xn)T  ⇔  x1c1 + x2c2 + · · · + xncn = 0
Ax = b  ⇔  x1c1 + x2c2 + · · · + xncn = b
Theorem (Equivalent Statements of Matrix Inversion)
If A is an n × n matrix, then the following statements are equivalent.
a. A is invertible.
b. The column vectors of A are linearly independent.
c. The row vectors of A are linearly independent.
d. The column vectors of A span Rn.
e. The row vectors of A span Rn.
f. A has rank n.
g. A has nullity 0.
Proof
Recall that for a square matrix A of size n,
A is invertible
⇔ Ax = 0 has only the trivial solution x = 0.
⇔ The column vectors of A are linearly independent. (b)
(the above is by (1) and (2) of the last Theorem)
⇔ Col(A) = Rn. (d)
⇔ Rank(A) = n. (f)
Furthermore, by the Dimension Theorem, since A: n × n, we have
rank(A) = n ⇔ nullity(A) = 0 (g)
rank(AT ) = rank(A) = n
⇔ the row vectors of A are linearly independent. (c)
⇔ the row vectors of A span Rn. (e)
rank(A) + nullity(A) = n
Theorem Suppose that A : m × p and B : p × n are two matrices. Then
rank(AB) ≤ rank(B)
(Similarly, rank(AB) ≤ rank(A).)
Proof
Nul(AB) ⊇ Nul(B) Bx = 0 ⇒ ABx = 0
⇒ nullity(AB) ≥ nullity(B) (A : m × p, B : p × n)
⇒ rank(AB) ≤ rank(B)
since
rank(AB) + nullity(AB) = rank(B) + nullity(B) = n
rank(AB) = 0, rank(BA) = 1
■ Linear Transformation
V : domain, W : codomain
T (V ): range, T (V ) = {T (v) | v ∈ V } ⊆ W
Definition
A transformation T : V → W is called a linear transformation if for
all vectors u and v in V and all scalars c, we have
1. T (u + v) = T (u) + T (v)
2. T (cu) = cT (u)
Note that T (0V ) = 0W , since T (0V ) = T (0 · 0V ) = 0 T (0V ) = 0W .
For a linear transformation T : V → W , we have
T (c1v1 + c2v2 + · · · + cnvn)
= c1T (v1) + c2T (v2) + · · · + cnT (vn)
Example Let T = I, the identity transformation:
T (v1 + v2) = v1 + v2 = T (v1) + T (v2)
T (cv) = cv = cT (v)
Exercise
Consider the mapping T : V → W such that T (v) = w0 for every
v ∈ V , where w0 is a constant vector in W . Is T a linear transformation?
Example
Consider the differentiation operation D : p(t) ↦ p′(t) on the space
of polynomials
V = { c0 + c1t + c2t2 + · · · + cntn | c0, . . . , cn ∈ R }
    [x1]
    [x2]
x = [ ..] = x1e1 + x2e2 + · · · + xnen
    [xn]
Remarks
1. Every linear transformation T from Rn to Rm corresponds to a matrix
A such that T (x) = Ax, with
A = [T (e1) T (e2) · · · T (en)]
2. Projection operator on R3 – Orthogonal projection onto the
xy-plane, which maps (x1, y1, z1) to (x1, y1, 0):

[1 0 0] [x]   [x]
[0 1 0] [y] = [y]
[0 0 0] [z]   [0]

Note
   [1]    [1]      [0]    [0]      [0]    [0]
T ([0]) = [0] , T ([1]) = [1] , T ([0]) = [0]
   [0]    [0]      [0]    [0]      [1]    [0]
3. Rotation operator
Define a linear operator on R2 that rotates a vector x counter-clockwise
through an angle θ.
        [cos(θ) − sin(θ)] [x]
T (x) = [sin(θ)   cos(θ)] [y]
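The rotation operator above can be sketched numerically: rotating (1, 0) by θ = π/2 should give (0, 1), up to floating-point error.

```python
import math

# Counter-clockwise rotation of x = (x1, x2) in R^2 by angle theta.
def rotate(x, theta):
    c, s = math.cos(theta), math.sin(theta)
    return (c * x[0] - s * x[1], s * x[0] + c * x[1])

x, y = rotate((1.0, 0.0), math.pi / 2)
assert abs(x - 0.0) < 1e-12 and abs(y - 1.0) < 1e-12
```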
• Nullity (T ) = dim (N (T ))
• Rank (T ) = dim (R(T ))
Recall that
Nul(A) = {x | Ax = 0} ⊆ Rn
Col (A) = {Ax | x ∈ Rn} ⊆ Rm
Definition A linear transformation T : V → W is said to be
one-to-one if T maps distinct vectors in V to distinct vectors in W .
That is
x1 ̸= x2 ⇒ T (x1) ̸= T (x2)
or equivalently
T (x1) = T (x2) ⇒ x1 = x2
(⇐)
Suppose that T is not one-to-one. Then there exist x1 and x2 such
that x1 ̸= x2 but T (x1) = T (x2). Then T (x1 − x2) = 0, so
x1 − x2 ∈ N (T ) but x1 − x2 ̸= 0, which indicates N (T ) ̸= {0},
a contradiction.
N (T ) = {x | T (x) = 0}
R(T ) = {T (x) | x ∈ V }
Proof :
Let S = {v1, . . . , vp} be a basis of N (T ), p ≤ n. Then
T (v1) = · · · = T (vp) = 0
Extending S to a basis of V , the images of the n − p added vectors
form a basis of R(T ), so
dim(R(T )) + dim(N (T )) = (n − p) + p = n
When T is one-to-one and onto, each y in W equals T (x) for exactly
one x:
T (x) = y
We can define an inverse of T , T −1 : W → V , by
T −1(y) = x
If T is not both one-to-one and onto, such an inverse T −1 does not
exist.
For invertible T1 : V → W and T2 : W → Z,
(T2 ◦ T1)−1 = T1−1 ◦ T2−1
Proof : We have proved (a) ⇔ (b) ⇔ (c). It is clear that (b) ⇔ (d).
In addition,
(a) ⇔ nullity(A) = 0 ⇔ N (T ) = {0} ⇔ (e)
Recall that when a square matrix A is not invertible, its reduced row
echelon form is like
[1 0 2]      [1 2 0]
[0 1 1]  or  [0 0 1]
[0 0 0]      [0 0 0]
The space of polynomials of degree at most n corresponds to Rn+1
via the coordinate mapping
c0 + c1x + · · · + cnxn ↦ (c0, c1, . . . , cn)T
Example
The mapping
       [ 0 1]     [1]
S(u) = [−1 0] u + [1]
is not a linear transformation, since S(0) ̸= 0.
Appendix
• The development of number systems
1. Natural numbers (1, 2, 3, . . .)
x + 5 = 3, x =?
Example
Find the derivatives of order n = 1, 2, 3, . . . of
1 / (x2 + 1)
using partial fractions over C:
1 / ((x + i)(x − i)) = c1 / (x + i) + c2 / (x − i)
• About the field
In a vector space (V, F ), the scalars are in the field F (Ex. R, C),
c1 v 1 + c2 v 2 + · · · + cn v n
Both R and C are examples of the field.
The following gives the axiom of a field.
Definition (or Axiom) of a field F (Ex. R, C)
1. a + b = b + a and a · b = b · a
2. (a + b) + c = a + (b + c) and (a · b) · c = a · (b · c)
3. There exist identities 0 and 1 with a + 0 = a and a · 1 = a
4. Every a has an additive inverse −a, and every a ̸= 0 has a
multiplicative inverse a−1
5. a · (b + c) = a · b + a · c
Example R, C, Z2 = {0, 1} are examples of fields.
where x1 and x3 are basic variables, and x2, x4, x5 are free variables.