Linear Algebra - LA - Part I
Textbook:
“Elementary Linear Algebra,” 12th ed., H. Anton and C. Rorres.
( The content of this course comes from the above book and other
references.)
Reference:
1. “Linear Algebra,” Friedberg, Insel, Spence.
Algebra
1. Elementary algebra
2. Linear Algebra
3. Algebra (Modern algebra, Abstract algebra)
- Abstraction and generalization (Ex. Vector Spaces)
v + w, cv (vectors in R2 or R3)
f (t) + g(t), cf (t) (functions)
M + P, cM (matrices)
Notation
1. R: the set of real numbers.
2. C: the set of complex numbers.
3. A vector v is denoted by a bold little character.
4. R2 and R3 are the sets of 2-D plane and 3-D space.
    [1]                 [2]
v = [3]  ∈ R2 ,    u =  [4]  ∈ R3
                        [1]
5. A vector is denoted as a column vector. For row vector, we use the
notation “T ” to denote “transpose”
uT = [2 4 1]
6. For a complex matrix M , MH denotes its conjugate (Hermitian) transpose.
    [2 + i3   1   1 + i]         [2 − i3   0   0]
M = [  0      4     0  ] ,  MH = [  1      4   0]
    [  0      0     3  ]         [1 − i    0   3]
Preview
1. Vectors in Rn (or Cn):
    [a1]
    [a2]
u = [..] ,   ak ∈ R (or C) (an n-tuple)
    [an]

uT = [a1 a2 · · · an]
uH = ūT = [ā1 ā2 · · · ān] (conjugate transpose)
Vector Space :
A vector space is a set of vectors, which will be defined later.
Inner product :
The inner product of two vectors x and y,
x · y, ⟨x, y⟩, ⟨x | y⟩
When we define an inner product in a vector space, we can use it
to define
1. the length (norm) of a vector ( ∥x∥² = ⟨x, x⟩ ), and
2. the orthogonality between vectors. ( ⟨x, y⟩ = 0 )
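The definitions above can be sketched numerically with the standard dot product on Rn; the particular vectors below are assumptions chosen for illustration.

```python
import math

# Standard inner product on R^n, and the norm and orthogonality
# it induces: ||x||^2 = <x, x>, and x ⟂ y when <x, y> = 0.
def inner(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))

def norm(x):
    return math.sqrt(inner(x, x))

x, y = [1, 2, 2], [2, -1, 0]
print(norm(x))       # sqrt(1 + 4 + 4) = 3.0
print(inner(x, y))   # 2 - 2 + 0 = 0, so x and y are orthogonal
```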
Linear combination: c1v1 + c2v2 + · · · + cnvn
Linear transformation
v ∈ Rn ↦ w ∈ Rm
Contents
1. Systems of Linear Equations and Matrices (Chap. 1)
2. Determinants (Chap. 2)
3. Euclidean Vector Spaces (R2, R3, Rn) (Chap. 3)
4. General Vector Spaces (Chap. 4)
5. Eigenvalues and Eigenvectors (Chap. 5)
6. Inner Product Spaces (Chap. 6)
7. Diagonalization and Quadratic Forms (Chap. 7)
8. Linear Transformations (Chap. 8)
9. Additional Topics
(including Singular Value Decomposition and Jordan Forms.)
Homework :
Chap 1: 1.1): 8, 12 1.2): 38 1.3): 36 1.4): 42, 46 1.5): 31 1.6): 18, 24
1.7): 40(a), 47 1.8): 16, 45 1.9): 12
x1 a 1 + x2 a 2 + · · · + xk a k + · · · + xn a n = b
    [a11 a12 · · · a1n]
A = [a21 a22 · · · a2n]
    [ ..  ..       .. ]
    [am1 am2 · · · amn]
Remarks
1. By taking elementary row operations, we do not affect the solutions
of Ax = b.
2. Each elementary row operation is reversible.
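The two remarks can be sketched concretely; the 2 × 3 augmented matrix below is an assumption for illustration.

```python
# A 'replacement' row operation on [A | b] leaves the solution set of
# Ax = b unchanged, and is reversed by subtracting the same multiple back.
def add_multiple(rows, i, j, c):
    """Return a copy with c times row j added to row i."""
    return [[rows[r][k] + c * rows[j][k] if r == i else rows[r][k]
             for k in range(len(rows[0]))] for r in range(len(rows))]

aug = [[1, 2, 5],    # x1 + 2x2 = 5
       [3, -1, 1]]   # 3x1 - x2 = 1
x = [1, 2]           # the solution of the original system

new = add_multiple(aug, 1, 0, -3)          # R2 <- R2 - 3 R1
for row in new:                            # x still satisfies each equation
    assert row[0]*x[0] + row[1]*x[1] == row[2]
assert add_multiple(new, 1, 0, 3) == aug   # the operation is reversible
```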
Definition A matrix is in (row) echelon form if it has
the following three properties: (Forward Gaussian elimination)
[1 ∗ ∗ ∗ ∗]      [■ ∗ ∗ ∗ ∗]
[0 1 ∗ ∗ ∗]  or  [0 ■ ∗ ∗ ∗]    (■ ̸= 0)
[0 0 0 1 ∗]      [0 0 0 ■ ∗]
[0 0 0 0 0]      [0 0 0 0 0]
1. If there are any rows that consist entirely of zeros, then they are
grouped together at the bottom of the matrix.
2. If a row does not consist entirely of zeros, then the first non-zero
number in the row is a 1. We call this a leading 1 (or pivot).
3. In any two successive non-zero rows, the leading 1 in the lower row
occurs farther to the right than the leading 1 in the row above it.
Definition A matrix is in reduced (row) echelon form
if it is in (row) echelon form and, in addition, each column that
contains a leading 1 has zeros everywhere else.
Example A row echelon form (R1) and a reduced row echelon form (R2).
     [1 ∗ ∗ ∗ ∗]        [1 0 ∗ 0 ∗]
R1 = [0 1 ∗ ∗ ∗]  , R2 = [0 1 ∗ 0 ∗]
     [0 0 0 1 ∗]        [0 0 0 1 ∗]
     [0 0 0 0 0]        [0 0 0 0 0]
After a sequence of row operations, we can bring a matrix to a
row echelon form. A matrix may have many row echelon forms.
(For example, adding the 3rd row of R1 to the 2nd row of R1, we obtain
another matrix in row echelon form.)
     [1 ∗ ∗ ∗ ∗]
R1 = [0 1 ∗ ∗ ∗]
     [0 0 0 1 ∗]
     [0 0 0 0 0]

[1 2 3 5 7]      [1 2 3 5  7]
[0 1 4 6 8]      [0 1 4 7 17]
[0 0 0 1 9]  →   [0 0 0 1  9]
[0 0 0 0 0]      [0 0 0 0  0]
• The pivots are always in the same positions in any echelon form
of A.
• We call these positions pivot positions.
• A pivot column is a column of A that contains a pivot position.
Example Reduced row-echelon forms:
[1 0 0  4]   [1 0 0]   [0 1 −2 0 1]
[0 1 0  7]   [0 1 0]   [0 0  0 1 3]   [0 0]
[0 0 1 −1] , [0 0 1] , [0 0  0 0 0] , [0 0]
                       [0 0  0 0 0]
1. If a11 ̸= 0, perform row operations (adding −a21/a11 times row 1 to
row 2, and so on) to create zeros below a11.

[a11 a12  . . .  a1n  b1 ]
[ 0  a′22 . . .  a′2n b′2]
[ ..  ..          ..  .. ]
[ 0  a′m2 . . .  a′mn b′m]
2. If a11 = 0, interchange the first row with another row (say, the kth
row) for which ak1 ̸= 0. Then perform the operations in step 1 to this
new matrix.

[ 0  . . .  b1]
[ ..        ..]
[ak1 . . .  bk]
[ ..        ..]
           [■ ∗ ∗ ∗ ∗]
[A | b] →  [0 ■ ∗ ∗ ∗]
           [0 0 0 ■ ∗]
           [0 0 0 0 0]
• In this step, we can determine whether the original equation Ax = b
is consistent or not.
• If a non-zero element of the last column (b-column) is a pivot, then
the original equation is inconsistent and no solution exists.
[0 . . . 0 b′k] ,   b′k ̸= 0
[■ ∗ ∗ ∗ ∗]
[0 ■ ∗ ∗ ∗]    1. Make each pivot 1 by a scaling row operation.
[0 0 0 ■ ∗]    2. Create zeros above each pivot.
[0 0 0 0 0]
Finally we obtain a reduced echelon form.
[■ ∗ ∗ ∗ ∗]   [1 ∗ ∗ ∗ ∗]   [1 ∗ ∗ 0 ∗]   [1 0 ∗ 0 ∗]
[0 ■ ∗ ∗ ∗] → [0 1 ∗ ∗ ∗] → [0 1 ∗ 0 ∗] → [0 1 ∗ 0 ∗]
[0 0 0 ■ ∗]   [0 0 0 1 ∗]   [0 0 0 1 ∗]   [0 0 0 1 ∗]
[0 0 0 0 0]   [0 0 0 0 0]   [0 0 0 0 0]   [0 0 0 0 0]
Example Consider a system of linear equations
3x2 − 6x3 + 6x4 + 4x5 = −5
3x1 − 7x2 + 8x3 − 5x4 + 8x5 = 9
3x1 − 9x2 + 12x3 − 9x4 + 6x5 = 15
The augmented matrix
          [0  3 −6  6 4 −5]
[A | b] = [3 −7  8 −5 8  9]
          [3 −9 12 −9 6 15]
    [3 −9 12 −9 6 15]
 →  [3 −7  8 −5 8  9]      (interchange rows 1 and 3)
    [0  3 −6  6 4 −5]

    [3 −9 12 −9 6 15]      [3 −9 12 −9 6 15]
 →  [0  2 −4  4 2 −6]  →   [0  2 −4  4 2 −6]      (pivots: 3, 2, 1)
    [0  3 −6  6 4 −5]      [0  0  0  0 1  4]

=⇒ Consistent.
    [3 −9 12 −9 0  −9]
 →  [0  2 −4  4 0 −14]
    [0  0  0  0 1   4]

    [3 −9 12 −9 0 −9]
 →  [0  1 −2  2 0 −7]
    [0  0  0  0 1  4]

    [3 0 −6 9 0 −72]      [1 0 −2 3 0 −24]
 →  [0 1 −2 2 0  −7]  →   [0 1 −2 2 0  −7]
    [0 0  0 0 1   4]      [0 0  0 0 1   4]
x1 + 2x2 = 5
3x1 −  x2 = 1
4x1 +  x2 = 6

[1  2 5]     [1  2   5]     [1 2 5]
[3 −1 1]  →  [0 −7 −14]  →  [0 1 2]
[4  1 6]     [0 −7 −14]     [0 0 0]
x1 = −24 + 2x3 − 3x4
x2 = −7 + 2x3 − 2x4
x5 = 4
x3 = t1 ∈ R (free variable)
x4 = t2 ∈ R (free variable)
        [−24]       [2]         [−3]
        [ −7]       [2]         [−2]
Ax = A  [  0]  + A  [1] t1 + A  [ 0] t2
        [  0]       [0]         [ 1]
        [  4]       [0]         [ 0]

   = b + 0 + 0
   = b
Summary of the process of solving Ax = b
1. Perform elementary row operations on the augmented matrix
[A | b] to make it into an echelon form.
2. Determine if this system is consistent. If it is inconsistent, no solu-
tions exist. Otherwise, further make it into a reduced echelon form.
3. If there are no free variables, we can immediately obtain the unique
solution of this system from the reduced echelon form.
[1           b′1]
[   1        b′2]
[     . .    .. ]
[         1  b′n]
4. If there are free variables, solve the reduced system of equations for
the basic variables in terms of free variables.
(In this case, there are infinitely many solutions.)
[1 0 −2 3 0 −24]
[0 1 −2 2 0  −7]
[0 0  0 0 1   4]
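Step 4 above can be checked numerically: using the reduced echelon form from this example, every choice of the free parameters gives a solution.

```python
# The reduced system from the slides: x1 = -24 + 2*x3 - 3*x4,
# x2 = -7 + 2*x3 - 2*x4, x5 = 4, with x3 = t1 and x4 = t2 free.
R = [[1, 0, -2, 3, 0], [0, 1, -2, 2, 0], [0, 0, 0, 0, 1]]
b = [-24, -7, 4]

def solution(t1, t2):
    """Parametric solution for free variables x3 = t1, x4 = t2."""
    return [-24 + 2*t1 - 3*t2, -7 + 2*t1 - 2*t2, t1, t2, 4]

# Any choice of (t1, t2) satisfies every equation of the reduced system.
for t1, t2 in [(0, 0), (1, 0), (0, 1), (5, -3)]:
    x = solution(t1, t2)
    for row, bi in zip(R, b):
        assert sum(r * xi for r, xi in zip(row, x)) == bi
```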
Example
x +  y + 2z = 9       [1 1  2 9]    [1 1   2    9]
2x + 4y − 3z = 1  ⇒   [2 4 −3 1] ⇒  [0 2  −7  −17]
3x + 6y − 5z = 0      [3 6 −5 0]    [0 3 −11  −27]

   [1 1    2      9]    [1 1    2      9]
⇒  [0 1 −7/2  −17/2] ⇒  [0 1 −7/2  −17/2]
   [0 3  −11    −27]    [0 0 −1/2   −3/2]

   [1 1    2      9]    [1 1 0 3]    [1 0 0 1]
⇒  [0 1 −7/2  −17/2] ⇒  [0 1 0 2] ⇒  [0 1 0 2]
   [0 0    1      3]    [0 0 1 3]    [0 0 1 3]

   [x]   [1]
⇒  [y] = [2]
   [z]   [3]
Note for Ax = b, A: m × n
          [1 0 −2 3 0 −24]
[A | b] ∼ [0 1 −2 2 0  −7]    (A : 4 × 5)
          [0 0  0 0 1   4]
          [0 0  0 0 0   0]
(#: number)
(# of basic variables) + (# of free variables) = (# of variables) = n
(# of basic variables) = (# of effective equations) ≤ m
    [x1]   [2]      [−3]
    [x2]   [2]      [−2]
x = [x3] = [1] t1 + [ 0] t2 ,   t1, t2 ∈ R
    [x4]   [0]      [ 1]
    [x5]   [0]      [ 0]

with
  [2]           [−3]
  [2]           [−2]
A [1] = 0 ,   A [ 0] = 0
  [0]           [ 1]
  [0]           [ 0]

Therefore, Ax = 0.
Conclusions:
Ax = 0 has free variables.
⇐⇒ Ax = 0 has non-trivial solutions.
⇐⇒ Ax = b has infinitely many solutions, if it is consistent.
The solution set of a consistent system Ax = b, with a particular
solution p (Ap = b), is
{p + vh | Avh = 0}
(See the previous example.)
Proof :
Let S1 = {x | Ax = b} be the solution set of Ax = b, and
S2 = {p + vh | Avh = 0}, Ap = b.
1. For any p + vh ∈ S2,
A(p + vh) = Ap + Avh = b + 0 = b
so p + vh ∈ S1. Therefore, S2 ⊆ S1.
2. On the other hand, let w ∈ S1, so Aw = b. Note
A(w − p) = Aw − Ap = b − b = 0
Let vh = w − p; then Avh = 0 and w = p + vh ∈ S2. Therefore,
S1 ⊆ S2.
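The structure proved above can be checked numerically; the small 2 × 3 matrix below is an assumption for illustration, not from the text.

```python
# Every p + vh with Ap = b and A vh = 0 solves Ax = b.
def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

A = [[1, 0, 2], [0, 1, -1]]
b = [3, 1]
p  = [3, 1, 0]     # a particular solution: Ap = b
vh = [-2, 1, 1]    # a homogeneous solution: A vh = 0

assert matvec(A, p) == b
assert matvec(A, vh) == [0, 0]
w = [pi + vi for pi, vi in zip(p, vh)]
assert matvec(A, w) == b          # p + vh still solves Ax = b
```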
Definition Two matrices A and B are row equivalent, written A ∼ B,
if one can be obtained from the other by a sequence of elementary
row operations:
A ∼ A1 ∼ A2 ∼ · · · ∼ B
Note that
1. A matrix A is row equivalent to all its row echelon forms.
2. A matrix A is row equivalent to its reduced row echelon form.
3. All the row echelon forms of A are row equivalent.
A ∼ R 1 , A ∼ R2 , · · ·
R1 ∼ R2, R2 ∼ R3 · · ·
Recall that a matrix can have many row echelon forms.
     [1 ∗ ∗ ∗ ∗]
R1 = [0 1 ∗ ∗ ∗]
     [0 0 0 1 ∗]
     [0 0 0 0 0]
If a matrix A has two reduced row echelon forms C1 and C2, then
A ∼ C1 and A ∼ C2
(Equivalence relation)
1. A ∼ A. (Reflexivity)
2. If A ∼ B, then B ∼ A. (Symmetry)
3. If A ∼ B, B ∼ C then A ∼ C. (Transitivity)
When there are infinitely many solutions, one may want to find
a solution x̃ that has the minimum “length”.
3. An m × n zero matrix O is an m × n matrix all of whose entries are zero:
    [0 · · · 0]
O = [..      ..]
    [0 · · · 0]
4. A diagonal matrix is a square matrix whose entries are all zero
except possibly the diagonal entries:
    [d11         O ]
D = [    d22       ]
    [        . .   ]
    [ O         dnn]
The identity matrix
     [1       O]
In = [  1      ]  = [e1 e2 · · · en]
     [    . .  ]
     [O       1]
Consider a transformation T from Rn to Rm,
T : Rn → Rm : v ↦ T (v)
which maps v to T (v). The domain of T is Rn and the codomain is
Rm .
     [1]        [0]               [0]
     [0]        [1]               [0]
e1 = [..] , e2 = [..] , · · · , en = [..]
     [0]        [0]               [1]
For T : Rp → Rn : x ↦ y = Bx and S : Rn → Rm : y ↦ z = Ay,
the composition is
S ◦ T : Rp → Rm : x ↦ z = ABx
Now for the m × n matrix A and n × p matrix B,
A = [a1 a2 · · · an] = [aij ] ,  ak ∈ Rm

[AB]ij = Σk=1..n aik bkj

                          [b1j]
        = [ai1 · · · ain] [ .. ]
                          [bnj]
Remarks
1. A = B if size(A) = size(B) = (m, n), and
[A]ij = [B]ij , 1 ≤ i ≤ m, 1 ≤ j ≤ n
4. A − B = A + (−B).
The algebraic properties of matrices,
1. A + B = B + A
2. (A + B) + C = A + (B + C)
3. A + O = A, A + (−A) = O
4. r(A + B) = rA + rB
5. (r + s)A = rA + sA
6. r(sA) = (rs)A, 1A = A
where A, B, and C are of the same sizes, and
r, s are real or complex numbers.
                         [a1j]
[BA]ij = [bi1 · · · bin] [ .. ]
                         [anj]
In addition,
AB = O ⇏ A = O or B = O
Cf. for any a, b ∈ R (or C), we have ab = 0 ⇒ a = 0 or b = 0.
AB = AC ⇏ B = C
Cf.
A+B =A+C ⇒ B =C
We define A0 = I, if A ̸= O.
Note that
(A + B)2 = (A + B)(A + B) = A2 + AB + BA + B2
(A + B)3 = (A + B)(A2 + AB + BA + B2) = · · ·
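The expansion above keeps AB and BA separate because multiplication does not commute; the 2 × 2 matrices below are assumptions for illustration.

```python
# Matrix product and sum on plain nested lists.
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def matadd(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

A = [[1, 1], [0, 1]]
B = [[1, 0], [1, 1]]
assert matmul(A, B) != matmul(B, A)       # AB != BA

S = matadd(A, B)
lhs = matmul(S, S)                        # (A+B)^2
rhs = matadd(matadd(matmul(A, A), matmul(A, B)),
             matadd(matmul(B, A), matmul(B, B)))
assert lhs == rhs                         # A^2 + AB + BA + B^2
```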
    [a11 · · · a1j · · · a1n]
    [ ..        ..       .. ]
A = [ai1 · · · aij · · · ain]  = [a1 a2 · · · an]
    [ ..        ..       .. ]
    [am1 · · · amj · · · amn]

     [a11 · · · ai1 · · · am1]   [a1T]
     [ ..        ..       .. ]   [a2T]
AT = [a1j · · · aij · · · amj] = [ .. ]
     [ ..        ..       .. ]   [anT]
     [a1n · · · ain · · · amn]
[(AB)T ]ij = [AB]ji = (jth row of A) · (ith column of B)
[B T AT ]ij = (ith row of B T ) · (jth column of AT )
           = (ith column of B) · (jth row of A)
so (AB)T = B T AT .
Cf.
(ABCD)T = DT C T B T AT
From the above, since
(ABCD · · · )−1 = · · · D−1C −1B −1A−1
we have (let A = B = C · · · )
(An)−1 = (A−1)n = A−n
We also note
(cA)−1 = c−1A−1
(TA)−1 = TA−1
TA : x 7→ Ax
TA−1 : Ax 7→ x
Example Consider a linear transformation T on R3, defined as
   [a]    [a + b]
T ([b]) = [b + c]
   [c]    [c + a]
c c+a
then the standard matrix of T is
                             [1 1 0]
A = [T (e1) T (e2) T (e3)] = [0 1 1]
                             [1 0 1]
Note
  [a]   [1 1 0] [a]   [a + b]      [a]
A [b] = [0 1 1] [b] = [b + c] = T ([b])
  [c]   [1 0 1] [c]   [c + a]      [c]
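The construction above can be sketched directly: build the standard matrix column-by-column from T (e1), T (e2), T (e3), then check that multiplying by it reproduces T.

```python
# The linear transformation from the example above.
def T(v):
    a, b, c = v
    return (a + b, b + c, c + a)

es = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
cols = [T(e) for e in es]                           # columns T(e_k)
A = [[cols[j][i] for j in range(3)] for i in range(3)]
assert A == [[1, 1, 0], [0, 1, 1], [1, 0, 1]]       # the standard matrix

v = (2, 3, 5)                                        # an arbitrary test vector
Av = tuple(sum(A[i][j] * v[j] for j in range(3)) for i in range(3))
assert Av == T(v)                                    # A v = T(v)
```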
• Elementary matrices
Recall the three elementary row operations on a matrix.
1. (Replacement) Add to one row a multiple of another row.
2. (Interchange) Interchange two rows.
3. (Scaling) Multiply all entries in a row by a non-zero constant.
     [1 0 0]        [1 0 0]        [1 0 0]        [a11 a12 a13]
E1 = [0 1 0] , E2 = [0 0 1] , E3 = [0 c 0] ,  A = [a21 a22 a23]
     [5 0 1]        [0 1 0]        [0 0 1]        [a31 a32 a33]

       [    a11          a12          a13    ]
E1 A = [    a21          a22          a23    ]
       [a31 + 5a11   a32 + 5a12   a33 + 5a13 ]

       [a11 a12 a13]          [ a11  a12  a13]
E2 A = [a31 a32 a33] , E3 A = [ca21 ca22 ca23]
       [a21 a22 a23]          [ a31  a32  a33]
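The effect of left-multiplication by an elementary matrix can be checked on a concrete A (the numeric entries below are an assumption for illustration).

```python
# Left-multiplying by E1 adds 5 times row 1 to row 3, as shown above.
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

E1 = [[1, 0, 0], [0, 1, 0], [5, 0, 1]]
A  = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

E1A = matmul(E1, A)
assert E1A[0] == A[0] and E1A[1] == A[1]             # rows 1, 2 unchanged
assert E1A[2] == [7 + 5*1, 8 + 5*2, 9 + 5*3]         # row 3 + 5 * row 1
```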
Proof : ( (a)⇒(b)⇒(c)⇒(d)⇒(a) )
• Triangular matrices
A square matrix whose entries below (above) the main diagonal are
zero is called upper (lower) triangular.

[∗ ∗ ∗ ∗]                       [∗ 0 0 0]
[0 ∗ ∗ ∗]                       [∗ ∗ 0 0]
[0 0 ∗ ∗] (upper triangular) ,  [∗ ∗ ∗ 0] (lower triangular)
[0 0 0 ∗]                       [∗ ∗ ∗ ∗]
• Symmetric matrices
Definition A square matrix A is called symmetric if AT = A.
(AB)T = B T AT = BA ̸= AB
(BB T )T = (B T )T B T = BB T
• Partitioned matrices
     [A11B1 + A12B2]
AB = [A21B1 + A22B2] ,   CA = [C1A11 + C2A21   C1A12 + C2A22]
2. Similarly,
2. Similarly,
[A]     [AC]        [A]   [DA]
[B] C = [BC] ,    D [B] ≠ [DB]
Let
    [A11 A12]
A = [A21 A22]
then
     [A11T  A21T]
AT = [A12T  A22T]
Exercise
Find the transpose of B and C
    [B1]
B = [B2] ,   C = [C1 C2]
• Column-Row expansion of AB
Note that A is invertible implies that both A11 and A22 are invertible,
and vice versa.
Exercise Check

[A11−1   −A11−1 A12 A22−1] [A11  A12]
[  O            A22−1    ] [ O   A22]  =  I
Exercise Find the inverse of
    [A11   O ]
M = [A21  A22]

Hint
       [A11   O ]T    [A11T  A21T]
M T =  [A21  A22]  =  [  OT  A22T]

(M T )−1 = (M −1)T
• LU decomposition (factorization)

    [1 0 0 0] [• ∗ ∗ ∗ ∗]
A = [∗ 1 0 0] [0 • ∗ ∗ ∗]
    [∗ ∗ 1 0] [0 0 0 • ∗]
    [∗ ∗ ∗ 1] [0 0 0 0 0]
        L          U

A : m × n,  L : m × m,  U : m × n.
Ax = b

    [1 0 0 0] [• ∗ ∗ ∗ ∗]
A = [∗ 1 0 0] [0 • ∗ ∗ ∗]  = LU
    [∗ ∗ 1 0] [0 0 0 • ∗]
    [∗ ∗ ∗ 1] [0 0 0 0 0]
L(U x) = b
Ly = b
Ux = y
− Algorithm for an LU factorization
Assume we can reduce a matrix A to an echelon form U by elementary
row operations, without the row interchange,
              [• ∗ ∗ ∗ ∗]
A ∼ ··· ∼ U = [0 • ∗ ∗ ∗]
              [0 0 0 • ∗]
              [0 0 0 0 0]

Eq · · · E1 A = U

where each Ek is lower triangular, for example,

[ 1 0 0 0]   [1 0 0 0]   [1 0 0 0]
[−2 1 0 0]   [0 1 0 0]   [0 1 0 0]
[ 0 0 1 0] , [3 0 1 0] , [0 0 1 0]
[ 0 0 0 1]   [0 0 0 1]   [0 0 0 2]
In
Eq · · · E 1 A = U
the matrix (Eq · · · E1) is lower triangular.
Then
A = (Eq · · · E1)−1U = LU
where
L = (Eq · · · E1)−1
is lower triangular.
A = (E1−1 · · · Eq−1)U
Example
    [ 2  4 −1  5 −2]
A = [−4 −5  3 −8  1]
    [ 2 −5 −4  1  8]
    [−6  0  7 −3  1]

    [2  4 −1  5 −2]      [2 4 −1 5 −2]      [2 4 −1 5 −2]
∼   [0  3  1  2 −3]  ∼   [0 3  1 2 −3]  ∼   [0 3  1 2 −3]  = U
    [0 −9 −3 −4 10]      [0 0  0 2  1]      [0 0  0 2  1]
    [0 12  4 12 −5]      [0 0  0 4  7]      [0 0  0 0  5]

Recording each column of multipliers (divided by its pivot) builds L
from I:

    [ 1  0 0 0]
L = [−2  1 0 0]
    [ 1 −3 1 0]
    [−3  4 2 1]

Example
    [6 −2 0]
A = [9 −1 1]
    [3  7 5]

    [1 −1/3 0]     [1 −1/3  0 ]     [1 −1/3   0 ]
∼   [9  −1  1]  ∼  [0   2   1 ]  ∼  [0   1   1/2]  = U
    [3   7  5]     [0   8   5 ]     [0   0    1 ]

with the multipliers recorded in

    [6 0 0]
L = [9 2 0] ,   A = LU
    [3 8 1]
Note that
1. LU decomposition doesn’t necessarily exist for every m × n
matrix A.
          [1 2 4]
A ∼ ··· ∼ [0 0 1]
          [0 3 7]
2. In general, if row interchanges are required to reduce A to its row-
echelon form, then there is no LU decomposition of A. In that case
one can factor P ′A = LU with a permutation matrix P ′, i.e.,
A = P LU, where P = (P ′)−1.
Definition For a square matrix A of size n, we define the trace of A as

tr(A) = Σk=1..n akk ,   A = [aij ]

the sum of the entries a11, a22, . . . , ann on the main diagonal of A.

tr(BA) = Σk=1..n [BA]kk = Σk=1..n Σℓ=1..n [B]kℓ [A]ℓk
Exercise
Suppose that A and B are two m × n matrices. Prove that
tr(B T A) = tr(AB T )
B T A is n × n, while AB T is m × m.
tr(B T A) = Σk=1..n [B T A]kk = Σk=1..n Σℓ=1..m [B T ]kℓ [A]ℓk = Σk Σℓ [B]ℓk [A]ℓk

tr(AB T ) = Σk=1..m [AB T ]kk = Σk=1..m Σℓ=1..n [A]kℓ [B T ]ℓk = Σk Σℓ [A]kℓ [B]kℓ

Both are the sum of all products [A]ij [B]ij , so tr(B T A) = tr(AB T ).
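The cyclic trace identity can be verified numerically; the matrices below are assumptions for illustration.

```python
# tr(AB) = tr(BA): both traces sum the same products of entries.
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def tr(M):
    return sum(M[k][k] for k in range(len(M)))

A = [[1, 2], [3, 4]]
B = [[0, 5], [6, 7]]
assert tr(matmul(A, B)) == tr(matmul(B, A))   # even though AB != BA
```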
■ Determinants
A ↦ det(A) ∈ R (or C)
For A = [a1 a2] (2 × 2), the area of the parallelogram spanned by a1
and a2 is |det(A)|; for A = [a1 a2 a3] (3 × 3), the volume of the
parallelepiped spanned by a1, a2, a3 is |det(A)|.
Example:
    [1 −2  5  0]          [1  5  0]
A = [2  0  4 −1] ,  A32 = [2  4 −1]
    [3  1  0  7]          [0 −2  0]
    [0  4 −2  0]
1. For a 1 × 1 matrix A = [a11], we define det(A) = a11.
We call
(1) det(Aij ) : the minor of aij , or the (i, j)-minor.
    [a11 a12 a13]
det [a21 a22 a23]
    [a31 a32 a33]

          [a22 a23]           [a21 a23]           [a21 a22]
= a11 det [a32 a33] − a12 det [a31 a33] + a13 det [a31 a32]
• In fact, it can be proved that we can expand along any row, say
the ith row,
We conclude that
det (Ek A) = (det Ek ) (det A), k = a, b, c (5)
By (5), we have
det A = det (E1E2 · · · · · Ep)
= (det E1) det (E2 · · · · · Ep)
= (det E1) (det E2) · · · · · (det Ep) ̸= 0
• If A is not invertible
Ep′ · · · E1′ A = I˜
A = E1 · · · EpI˜
For example,
     [1 2 0 0]
I˜ = [0 0 1 0] ,   det I˜ = 0
     [0 0 0 1]
     [0 0 0 0]

and
det A = det (E1E2 · · · EpI˜) = det (E1E2 · · · Ep) det I˜ = 0
Theorem det(AB) = (det A)(det B).
If A is invertible, let
A = E 1 E2 · · · E p
By (5),
det AB = det (E1E2 · · · EpB)
= (det E1) [det (E2 · · · EpB)] = · · ·
= (det E1) (det E2) · · · · · (det Ep)(det B)
= (det E1E2 · · · Ep)(det B)
= (det A) (det B)
Therefore, we have
det AB = det BA = (det A) (det B)
although AB ̸= BA in general.
Note that
1. det (A + B) ̸= det A + det B (in general)
2. det (cA) = cn(det A), if A is of size n × n
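Both notes can be checked on 2 × 2 matrices; the entries below are assumptions for illustration.

```python
# det(A+B) != det(A) + det(B) in general, while det(cA) = c^n det(A).
def det2(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
S = [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]
assert det2(S) != det2(A) + det2(B)

cA = [[2 * x for x in row] for row in A]
assert det2(cA) == 2**2 * det2(A)          # n = 2 here
```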
• Cramer’s Rule
For an n × n matrix A, and b ∈ Rn,
A = [a1 · · · ai · · · an]
define
Ai(b) = [a1 · · · ai−1 b ai+1 · · · an]
namely, replace ai by b.
Example
Ii(x) = [e1 · · · x · · · en]
[1 0 · · · x1 · · · 0]
[0 1       x2       0]
[ ..  . .  ..       ..]
[           xi        ]   ⇒ det Ii(x) = xi
[ ..       ..   . .  ..]
[0 0       xn        1]
Theorem (Cramer’s Rule): Let A be an n × n invertible matrix.
For any b in Rn, the unique solution x of Ax = b has entries given by
xi = det Ai(b) / det A ,   i = 1, 2, · · · , n
Proof :
A Ii(x) = A[e1 · · · x · · · en]
        = [Ae1 · · · Ax · · · Aen]
        = [a1 · · · b · · · an]
        = Ai(b)
Taking determinants, (det A)(det Ii(x)) = det Ai(b), and
det Ii(x) = xi, so xi = det Ai(b) / det A.
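Cramer's rule can be sketched for the earlier 2 × 2 system x1 + 2x2 = 5, 3x1 − x2 = 1, whose solution was (1, 2).

```python
from fractions import Fraction as F

# Cramer's rule in R^2: x_i = det(A_i(b)) / det(A), where A_i(b)
# replaces column i of A by b.
def det2(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def cramer2(A, b):
    d = det2(A)
    A1 = [[b[0], A[0][1]], [b[1], A[1][1]]]   # replace column 1 by b
    A2 = [[A[0][0], b[0]], [A[1][0], b[1]]]   # replace column 2 by b
    return [F(det2(A1), d), F(det2(A2), d)]

A = [[1, 2], [3, -1]]
b = [5, 1]
assert cramer2(A, b) == [1, 2]                # x1 = 1, x2 = 2
```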
A = [a1 · · · ai · · · an]
Ref.
det(A) = ai1 Ci1 + ai2 Ci2 + · · · + ain Cin
= a1j C1j + a2j C2j + · · · + anj Cnj
Since [bj ]i = [B]ij , we have

[B]ij = Cji / det A

Hence
             1   [C11 C21 · · · Cn1]
A−1 = B = ------ [C12 C22 · · · Cn2] = (1 / det A) adj(A)      (6)
          det A  [ ..  ..       .. ]
                 [C1n C2n · · · Cnn]
1. It holds if
    [ℓ1  0]        [ℓ1  0   0]
A = [ 0 ℓ2] ,  B = [ 0  ℓ2  0]
                   [ 0  0  ℓ3]

then det A = ℓ1ℓ2 and det B = ℓ1ℓ2ℓ3 are the corresponding area
and volume, respectively.
1. For two vectors u = (u1, u2, . . . , un)T and v = (v1, v2, . . . , vn)T
in Rn, the sum u + v is defined by
u + v = (u1 + v1, u2 + v2, . . . , un + vn)T
3. Mmn(R), or Mmn(C)
M1 + M2 , cM1
4. { f (t) ∈ R, t ∈ R }
f1(t) + f2(t), cf1(t)
Remarks
1. For the above sets, we have similar definitions of addition “+”
and scalar multiplication.
2. How to efficiently study further issues such as the concept of bases,
linear transformation, eigenvalues/eigenvectors, inner products, etc.?
The properties of the above theorem are actually common for the
following sets.
1. Rn (or Cn)
2. p(t) = a0 + a1t + a2t2 + · · · + antn, ak ∈ R or C
3. Mmn(R), or Mmn(C)
4. { f (t) ∈ R, t ∈ R }
5. { {s} = (s1, s2, . . .), sk ∈ R or C }
6. { X : Ω → R }
We use the properties of the vectors in Rn (or Cn) to define
a vector space.
Axiom (or Definition )
A vector space V over a field F (R or C) is a nonempty set V of
vectors on which are defined two operations, called
addition (+) and scalar multiplication,
- addition: u + v
- scalar multiplication: cu
for u, v ∈ V and c ∈ F (R or C), satisfying the following axioms:
1. u + v ∈ V (closure under addition)
2. cu ∈ V (closure under scalar multiplication)
3. u + v = v + u
4. (u + v) + w = u + (v + w)
5. There is a zero vector 0 in V such that u + 0 = u for any u.
6. For each u in V , there exists a vector w such that u + w = 0.
We use −u to denote w.
7. c(u + v) = cu + cv
8. (c + d)u = cu + du
9. c(du) = (cd)u
10. 1u = u
3. Mmn(R), or Mmn(C)
4. {f (t) ∈ R, t ∈ R}
Example The set of real-valued functions defined on (a, b),
F (a, b) = {f | f : (a, b) → R}
is a vector space. The space F (a, b) is not of finite dimension.
(V, F )
Example (Rn, R), (Cn, C), (F n, F ) are vector spaces.
Fact: The zero vector 0 in a vector space V is unique.
Assume we have two zero vectors 01 and 02 in V . Then
01 = 01 + 02 = 02
Fact: The negative of a vector u is unique.
Assume u + w1 = 0 and u + w2 = 0. Then
w1 = w1 + 0 = w1 + (u + w2) = (w1 + u) + w2 = 0 + w2 = w2
Therefore, in a vector space,
1. the zero vector 0,
2. and the negative vector of a vector u, denoted by −u,
are well-defined.
Theorem (Cancellation law for vector addition)
Let V be a vector space and u, v, w are vectors in V . If
u+w =v+w
then u = v.
Proof.
There exists a z ∈ V such that w + z = 0.
u = u + 0 = u + (w + z)
= (u + w) + z
= (v + w) + z
= v + (w + z)
=v+0=v
Theorem Let (V, F ) be a vector space, u a vector in V , and
k a scalar in F . Then
1. 0u = 0
2. k0 = 0
3. (−1)u = −u
4. If ku = 0, then k = 0 or u = 0
Proof
1. 0u + 0u = (0 + 0)u = 0u = 0u + 0
By the cancellation law for vector addition, we have 0u = 0.
2. k0 = k(0 + 0) = k0 + k0
k0 = k0 + 0
⇒ k0 + k0 = k0 + 0 ⇒ k0 = 0 (by the cancellation law)
Example On V = R2 with the usual addition, define the scalar
multiplication
ku = k(u1, u2) = (ku1, 0)
The first nine rules of the axioms are satisfied. However, Axiom 10
fails to hold:
1u = 1(u1, u2) = (u1, 0) ̸= u (whenever u2 ̸= 0)
In the following, we will study the issues of subspaces and the dimension
of a vector space.
• Subspaces
Definition A subspace of a vector space V is a subset H of V
that is also a vector space.
To determine if a subset H of a vector space V is a subspace, we
only need to examine the following three rules, for all u, v ∈ H, and
c ∈ F,
1. 0 ∈ H
2. (Closure) u ∈ H, v ∈ H ⇒ u + v ∈ H
3. (Closure) u ∈ H, c ∈ F ⇒ cu ∈ H ⇒ (−1)u = −u ∈ H
Example A line through the origin, {(x, y) | y = ax}, is a subspace
of R2.
Example Assume m < n, and
W = {c0 + c1t + · · · + cmtm | c0, . . . , cm ∈ R}
V = {c0 + c1t + · · · + cntn | c0, . . . , cn ∈ R}
Then W is a (proper) subspace of V .
Example
P (a, b) ⊂ C ∞(a, b) ⊂ C m(a, b) ⊂ C 1(a, b) ⊂ C(a, b) ⊂ F (a, b)
Exercise Is H ∪ W a subspace of V ?
The subspaces of R3 are {0}, lines through the origin, planes through
the origin, and R3 itself.
For any two subspaces W and U , we have 0 ∈ W ∩ U .
How to define the dimension of a vector space?
1. 0 ∈ W (Let c1 = · · · = cp = 0)
2. Assume u ∈ W , s ∈ W , k ∈ F , with
u = c1v1 + c2v2 + · · · + cpvp
s = d1v1 + d2v2 + · · · + dpvp
then u + s = (c1 + d1)v1 + (c2 + d2)v2 + · · · + (cp + dp)vp ∈ W
and ku = (kc1)v1 + (kc2)v2 + · · · + (kcp)vp ∈ W
c1 v 1 + c2 v 2 + · · · + cp v p = 0
If v1, v2, . . . , vp are linearly dependent, there are c1, c2, . . . , cp,
not all zero,
|c1|2 + |c2|2 + · · · + |cp|2 ̸= 0
such that
c1 v 1 + c2 v 2 + · · · + cp v p = 0
In this case, say, ck ̸= 0; then
vk = −(c1/ck)v1 − · · · − (ck−1/ck)vk−1 − (ck+1/ck)vk+1 − · · · − (cp/ck)vp
that is, vk is a linear combination of the other vectors.
Example The set of vectors {(1, 0, 0)T , (0, 1, 0)T , (0, 0, 1)T } is a
linearly independent set,

   [1]      [0]      [0]   [0]
c1 [0] + c2 [1] + c3 [0] = [0]  ⇒ c1 = c2 = c3 = 0
   [0]      [0]      [1]   [0]

while the set {(1, 0)T , (0, 1)T , (2, 3)T } is not an independent set,

[2]     [1]     [0]
[3] = 2 [0] + 3 [1]
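The dependence relation just quoted can be checked directly: (2, 3) = 2(1, 0) + 3(0, 1) gives a non-trivial solution of c1v1 + c2v2 + c3v3 = 0.

```python
# A non-trivial dependence relation for {(1,0), (0,1), (2,3)}.
v1, v2, v3 = (1, 0), (0, 1), (2, 3)
c = (2, 3, -1)                       # not all zero
combo = tuple(c[0]*a + c[1]*b + c[2]*d for a, b, d in zip(v1, v2, v3))
assert combo == (0, 0)               # c1 v1 + c2 v2 + c3 v3 = 0
assert any(ci != 0 for ci in c)      # so the set is linearly dependent
```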
For a vector space V ,
v1 , . . . , v p ∈ V
x ∈ span{v1, . . . , vp} = W
For example, consider the space V of polynomials of degree smaller
than 3. Let
β = {1, t, t2}
be a basis of V . For a polynomial p(t) = d0 + d1t + d2t2, we have

            [d0]
[ p(t) ]β = [d1] ∈ R3
            [d2]
A vector space V may have more than one basis. A set of vectors
{v1, . . . , vn} in V is a basis of V if it is linearly independent and
it spans V .
For example, in R2, we can choose β = {(1, 0), (0, 1)} as a basis,
or γ = {(1, 1), (1, −1)} as a basis.
Theorem A vector space V can have different bases, say,
β = {b1, . . . , bn}, γ = {f1, . . . , fm}.
However, they must have the same number of vectors, i.e., m = n.
Proof:
1. m > n
Consider the coordinates of f1, . . . , fm relative to β,
        [f11]          [f21]                   [fm1]
[f1]β = [f12] , [f2]β = [f22] , . . . , [fm]β = [fm2]
        [ ..]          [ ..]                   [ ..]
        [f1n]          [f2n]                   [fmn]
This definition is well-defined, since all bases have the same number of
vectors.
Then
v1 ∉ span(β ′)
since β is an independent set. Therefore,
span(β ′) ̸= V
and β ′ cannot span V .
Summary:
For a vector space V , a basis β = {v1, v2, . . . , vn}
1. is a linearly independent set,
2. spans (generates) V , and
3. n = dim V .
Theorem Let W be a subspace of a finite-dimensional vector space V .
Then
dim(W ) ≤ dim(V )
Proof :
• Let βW be a basis of W . Since βW ⊂ W ⊆ V and V is
finite-dimensional, βW is a finite set, and W is finite-dimensional.
• Since βW can be extended, if necessary, to become a basis βV of V ,
we have dim(W ) ≤ dim(V ) for βW ⊆ βV .
• If dim(W ) = dim(V ) = n, then βW is an independent set of n vectors
in V , and βW becomes a basis for V . So V = W = span(βW ).
Note the above result may not be true if the dimensions of V and W
are not finite, dim(V ) = dim(W ) = ∞.
For example, P (a, b) ⊊ C(a, b).
• Change of basis in Rn
Let ϵ = {e1, . . . , en} be the standard basis of Rn (ek has a 1 in the
kth entry and 0 elsewhere), and let β = {u1, . . . , un} and
γ = {v1, . . . , vn} be two other bases. Write
x = c1u1 + · · · + cnun = d1v1 + · · · + dnvn
Then
       [c1]          [d1]
       [c2]          [d2]
[x]β = [..] , [x]γ = [..]
       [cn]          [dn]

and
           [x1]                 [c1]                 [d1]
x = [x]ϵ = [..] = [u1 · · · un] [..] = [v1 · · · vn] [..]
           [xn]                 [cn]                 [dn]

  = Pβ [x]β = Pγ [x]γ

2. [x]γ = Pγ−1Pβ [x]β = Pβγ [x]β , where Pβγ := Pγ−1Pβ .
Example Consider a basis β = {u1, u2} of R2, where
     [1]        [1]
u1 = [0] , u2 = [2]
If
       [−2]
[x]β = [ 3]
then
                     [1]     [1]   [1]
x = −2u1 + 3u2 = (−2)[0] + 3 [2] = [6] = [x]ϵ
Example Consider the same basis β = {u1, u2} of R2, where
     [1]        [1]
u1 = [0] , u2 = [2]
If
           [1]
x = [x]ϵ = [6]
assume
       [c1]
[x]β = [c2]
then
                  [1 1] [c1]   [1]
x = c1u1 + c2u2 = [0 2] [c2] = [6]

         [c1]   [1 1]−1 [1]   [1 −0.5] [1]   [−2]
⇒ [x]β = [c2] = [0 2]   [6] = [0  0.5] [6] = [ 3]
                          [−9]   [−5]
Example Let β = {u1, u2} = { [ 1] , [−1] } and

               [ 1]   [ 3]
γ = {v1, v2} = { [−4] , [−5] } be two bases in R2.

              [2]            [d1]
Assume [x]β = [3] , let [x]γ = [d2] . Then

            [2]           [d1]
x = [u1 u2] [3] = [v1 v2] [d2]

and
[−9 −5] [2]   [ 1  3] [d1]     [d1]   [ 24]
[ 1 −1] [3] = [−4 −5] [d2]  ⇒  [d2] = [−19]
• Change of basis in a general vector space V

⇒ d0 + d1 = 2, d1 + d2 = 3, d2 = 4
⇒ d2 = 4, d1 = −1, d0 = 3, [p]γ = (3, −1, 4)T
Consider an m × n matrix A,

                       [r1T ]
                       [r2T ]
A = [c1 c2 · · · cn] = [ .. ] ,   ck ∈ Rm , rk ∈ Rn
                       [rmT ]
Definition
• The column space of A: Col(A) = span{c1, c2, . . . , cn} ⊆ Rm
• The row space of A: Row(A) = span{r1, r2, . . . , rm} ⊆ Rn
• The null space of A is
Nul(A) = {x | Ax = 0} ⊂ Rn
(Figure: A : Rn → Rm , with Nul(A) ⊆ Rn mapped to 0 and
Col(A) ⊆ Rm the image.)
A : m × n,  x = (x1, x2, . . . , xn)T
Col(A) = {Ax | x ∈ Rn}
       = {x1c1 + x2c2 + · · · + xncn | xk ∈ R}
       = span(c1, c2, · · · , cn)
Nul(A) = {x | Ax = 0}
Theorem Elementary row operations do not change the null space
of a matrix.
Proof
Nul(A) = {x | Ax = 0}
= {x | EAx = 0}
since the elementary row operation is reversible, and each elementary
matrix is invertible.
In general, we have
{x | Ax = 0} ⊆ {x | BAx = 0}
or
Nul(A) ⊆ Nul(BA)
Theorem Elementary row operations do not change the row space
of a matrix.
Row(A) = Row(R)
= span{(1, −3, 4, −2, 5, 4), (0, 0, 1, 3, −2, −6), (0, 0, 0, 0, 1, 5)}
c1 a 1 + c2 a 2 + · · · + cn a n = 0
⇔ c1b1 + c2b2 + · · · + cnbn = 0
(or d 2 a 2 + d 4 a 4 − a 5 = 0 ⇔ d 2 b 2 + d 4 b 4 − b5 = 0 )
How to find a basis of Col(A)?
A = [a1 a2 . . . an]
That is, how to find a basis for span{a1 a2 . . . an}?
Example Find bases for Nul(A), Row(A) and Col(A) of the matrix
−3 6 −1 1 −7
A = 1 −2 2 3 −1
2 −4 5 8 −4
        [1 −2 0 −1  3 0]      x1 − 2x2 − x4 + 3x5 = 0
[A 0] ∼ [0  0 1  2 −2 0]            x3 + 2x4 − 2x5 = 0
        [0  0 0  0  0 0]                          0 = 0
[x1]   [2x2 + x4 − 3x5]      [2]      [ 1]      [−3]
[x2]   [      x2      ]      [1]      [ 0]      [ 0]
[x3] = [  −2x4 + 2x5  ] = x2 [0] + x4 [−2] + x5 [ 2]
[x4]   [      x4      ]      [0]      [ 1]      [ 0]
[x5]   [      x5      ]      [0]      [ 0]      [ 1]

Therefore, the solution set of Ax = 0, or Nul(A), is

Nul(A) = {x | Ax = 0}
       = { t1 (2, 1, 0, 0, 0)T + t2 (1, 0, −2, 1, 0)T
           + t3 (−3, 0, 2, 0, 1)T | t1, t2, t3 ∈ R }
Note that the three vectors
(2, 1, 0, 0, 0)T , (1, 0, −2, 1, 0)T , (−3, 0, 2, 0, 1)T
are linearly independent, so they form a basis of Nul(A).
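Each spanning vector found in this example can be checked against A directly:

```python
# Verify that each basis vector of Nul(A) satisfies A v = 0
# for the matrix A of the example above.
A = [[-3,  6, -1, 1, -7],
     [ 1, -2,  2, 3, -1],
     [ 2, -4,  5, 8, -4]]
basis = [(2, 1, 0, 0, 0), (1, 0, -2, 1, 0), (-3, 0, 2, 0, 1)]

for v in basis:
    assert [sum(a * x for a, x in zip(row, v)) for row in A] == [0, 0, 0]
```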
Theorem For any matrix A, the two spaces Row(A) and Col(A)
have the same dimensions.
Proof
A ∼ R
Since Row(A) = Row(R), we have
dim(Row(A)) = dim(Row(R))
and likewise
dim(Col(A)) = dim(Col(R))
However,
dim(Col(R)) = dim(Row(R)) = number of pivots
        [1 −3 4 −2  5  4]
A ∼ R = [0  0 1  3 −2 −6]
        [0  0 0  0  1  5]
        [0  0 0  0  0  0]
Definition For a matrix A, we define its rank as
rank(A) = dim(Col(A)) = dim(Row(A))
For an m × n matrix A, we have
rank(A) = dim(Col(A)) ≤ min(m, n)
        [1 −2 0 −1  3]
A ∼ R = [0  0 1  2 −2]
        [0  0 0  0  0]
Example As in the previous examples,
    [ 1 −3  4 −2  5  4]        [1 −3 4 −2  5  4]
A = [ 2 −6  9 −1  8  2]  ∼ R = [0  0 1  3 −2 −6]
    [ 2 −6  9 −1  9  7]        [0  0 0  0  1  5]
    [−1  3 −4  2 −5 −4]        [0  0 0  0  0  0]
We have
1. Consistent ⇔ b ∈ Col(A)
Nul(A) = {x | Ax = 0} = {x | x1c1 + · · · + xncn = 0}
Ax = b, A : m × n, x ∈ Rn, b ∈ Rm
rank(A) + nullity(A) = n (A : m × n)
rank(A) ≤ min(m, n)
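The dimension theorem above can be checked on the earlier 3 × 5 example, whose null space had three free variables. The pivot-counting rank routine below is an illustrative sketch.

```python
from fractions import Fraction as F

# Count pivots by Gaussian elimination (exact rational arithmetic),
# then check rank(A) + nullity(A) = n.
def rank(M):
    M = [[F(x) for x in row] for row in M]
    r = 0
    for c in range(len(M[0])):
        piv = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if piv is None:
            continue                       # no pivot in this column
        M[r], M[piv] = M[piv], M[r]
        for i in range(len(M)):
            if i != r and M[i][c] != 0:
                f = M[i][c] / M[r][c]
                M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

A = [[-3, 6, -1, 1, -7], [1, -2, 2, 3, -1], [2, -4, 5, 8, -4]]
n, nullity = 5, 3        # nullity = number of free variables found earlier
assert rank(A) + nullity == n
```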
Overdetermined (m > n) Ax = b
Underdetermined (m < n) Ax = b
Let Ax = b be a consistent linear system of m equations
in n unknowns (A : m × n).
If A has rank r, then dim( Nul(A) ) = n − r = k, the number of
free variables.

A = [c1 c2 . . . cn], ck ∈ Rm
Ax = 0, x = (x1, x2, . . . , xn)T  ⇔  x1c1 + x2c2 + · · · + xncn = 0
Ax = b  ⇔  x1c1 + x2c2 + · · · + xncn = b
Theorem (Equivalent Statements of Matrix Inversion)
If A is an n × n matrix, then the following statements are equivalent.
a. A is invertible.
b. The column vectors of A are linearly independent.
c. The row vectors of A are linearly independent.
d. The column vectors of A span Rn.
e. The row vectors of A span Rn.
f. A has rank n.
g. A has nullity 0.
Proof
Recall that for a square matrix A of size n,
A is invertible
⇔ Ax = 0 has only the trivial solution x = 0.
⇔ The column vectors of A are linearly independent. (b)
(the above is by (1) and (2) of the last Theorem)
⇔ Col(A) = Rn. (d)
⇔ Rank(A) = n. (f)
Furthermore, by the Dimension Theorem, since A: n × n, we have
rank(A) = n ⇔ nullity(A) = 0 (g)
rank(AT ) = rank(A) = n
⇔ the row vectors of A are linearly independent. (c)
⇔ the row vectors of A span Rn. (e)
rank(A) + nullity(A) = n
Theorem Suppose that A : m × p and B : p × n are two matrices. Then
rank(AB) ≤ rank(B)
(Similarly, rank(AB) ≤ rank(A).)
Proof
Nul(AB) ⊇ Nul(B) Bx = 0 ⇒ ABx = 0
⇒ nullity(AB) ≥ nullity(B) (A : m × p, B : p × n)
⇒ rank(AB) ≤ rank(B)
since
rank(AB) + nullity(AB) = rank(B) + nullity(B) = n
rank(AB) = 0, rank(BA) = 1
■ Linear Transformation
V : domain, W : codomain
T (V ): range, T (V ) = {T (v) | v ∈ V } ⊆ W
Definition
A transformation T : V → W is called a linear transformation if for
all vectors u and v in V and all scalars c, we have
1. T (u + v) = T (u) + T (v)
2. T (cu) = cT (u)
Note that T (0V ) = 0W , since T (0V ) = T (0 · 0V ) = 0 T (0V ) = 0W .
For a linear transformation T : V → W , we have
T (c1v1 + c2v2 + · · · + cnvn)
= c1T (v1) + c2T (v2) + · · · + cnT (vn)
Example Let T = I, the identity transformation:
T (v1 + v2) = v1 + v2 = T (v1) + T (v2)
T (cv) = cv = cT (v)
Exercise
Consider the mapping T : V → W such that T (v) = w0 for every
v ∈ V , where w0 is a constant vector in W . Is T a linear transformation?
Example
Consider the differentiation operation D : p(t) ↦ p′(t) on the space
of polynomials
V = { c0 + c1t + c2t2 + · · · + cntn | c0, . . . , cn ∈ R }
    [x1]
    [x2]
x = [ ..] = x1e1 + x2e2 + · · · + xnen
    [xn]
Remarks
1. Every linear transformation T from Rn to Rm corresponds to a matrix
A such that T (x) = Ax, with
A = [T (e1) T (e2) · · · T (en)]
2. Projection operator on R3 – Orthogonal projection onto the
xy-plane, which maps (x1, y1, z1) to (x1, y1, 0):

[1 0 0] [x]   [x]
[0 1 0] [y] = [y]
[0 0 0] [z]   [0]

Note
   [1]    [1]      [0]    [0]      [0]    [0]
T ([0]) = [0] , T ([1]) = [1] , T ([0]) = [0]
   [0]    [0]      [0]    [0]      [1]    [0]
3. Rotation operator
Define a linear operator on R2 that rotates a vector x counter-clockwise
through an angle θ.
        [cos(θ) − sin(θ)] [x]
T (x) = [sin(θ)   cos(θ)] [y]
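The rotation operator above can be sketched numerically: rotating (1, 0) by θ = π/2 should give (0, 1), up to floating-point error.

```python
import math

# Counter-clockwise rotation of x = (x1, x2) in R^2 by angle theta.
def rotate(x, theta):
    c, s = math.cos(theta), math.sin(theta)
    return (c * x[0] - s * x[1], s * x[0] + c * x[1])

x, y = rotate((1.0, 0.0), math.pi / 2)
assert abs(x - 0.0) < 1e-12 and abs(y - 1.0) < 1e-12
```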
• Nullity (T ) = dim (N (T ))
• Rank (T ) = dim (R(T ))
Recall that
Nul(A) = {x | Ax = 0} ⊆ Rn
Col (A) = {Ax | x ∈ Rn} ⊆ Rm
Definition A linear transformation T : V → W is said to be
one-to-one if T maps distinct vectors in V to distinct vectors in W .
That is
x1 ̸= x2 ⇒ T (x1) ̸= T (x2)
or equivalently
T (x1) = T (x2) ⇒ x1 = x2
(⇐)
Suppose that T is not one-to-one. Then there exist x1 and x2 such
that x1 ̸= x2 but T (x1) = T (x2). Then T (x1 − x2) = 0, so
x1 − x2 ∈ N (T ) but x1 − x2 ̸= 0, which indicates N (T ) ̸= {0},
a contradiction.
N (T ) = {x | T (x) = 0}
R(T ) = {T (x) | x ∈ V }
Proof :
Let S = {v1, . . . , vp} be a basis of N (T ), p ≤ n. Then
T (v1) = · · · = T (vp) = 0
Extending S to a basis of V , the images of the n − p added vectors
form a basis of R(T ), so
dim(R(T )) + dim(N (T )) = (n − p) + p = n
When T is one-to-one and onto, each y in W equals T (x) for exactly
one x:
T (x) = y
We can define an inverse of T , T −1 : W → V , by
T −1(y) = x
If T is not both one-to-one and onto, such an inverse T −1 does not
exist.
For invertible T1 : V → W and T2 : W → Z,
(T2 ◦ T1)−1 = T1−1 ◦ T2−1
Proof : We have proved (a) ⇔ (b) ⇔ (c). It is clear that (b) ⇔ (d).
In addition,
(a) ⇔ nullity(A) = 0 ⇔ N (T ) = {0} ⇔ (e)
Recall that when a square matrix A is not invertible, its reduced row
echelon form is like
[1 0 2]      [1 2 0]
[0 1 1]  or  [0 0 1]
[0 0 0]      [0 0 0]
The space of polynomials of degree at most n corresponds to Rn+1
via the coordinate mapping
c0 + c1x + · · · + cnxn ↦ (c0, c1, . . . , cn)T
Example
The mapping
       [ 0 1]     [1]
S(u) = [−1 0] u + [1]
is not a linear transformation, since S(0) ̸= 0.
Appendix
• The development of number systems
1. Natural numbers (1, 2, 3, . . .)
x + 5 = 3, x =?
Example
Find the derivatives of order n = 1, 2, 3, . . . of
1 / (x2 + 1)
using partial fractions over C:
1 / ((x + i)(x − i)) = c1 / (x + i) + c2 / (x − i)
• About the field
In a vector space (V, F ), the scalars are in the field F (Ex. R, C),
c1 v 1 + c2 v 2 + · · · + cn v n
Both R and C are examples of the field.
The following gives the axiom of a field.
Definition (or Axiom) of a field F (Ex. R, C)
1. a + b = b + a and a · b = b · a
2. (a + b) + c = a + (b + c) and (a · b) · c = a · (b · c)
3. There exist identities 0 and 1 with a + 0 = a and a · 1 = a
4. Every a has an additive inverse −a, and every a ̸= 0 has a
multiplicative inverse a−1
5. a · (b + c) = a · b + a · c
Example R, C, Z2 = {0, 1} are examples of fields.
where x1 and x3 are basic variables, and x2, x4, x5 are free variables.