MAN-001 Unit 1

The MAI-101 Mathematics-I course at IIT Roorkee covers four main units: Matrix Algebra, Differential Calculus, Integral Calculus, and Vector Calculus. The evaluation system includes class work, a mid-term exam, and an end-term exam, with respective weightages of 25%, 25%, and 50%. Preferred textbooks for the course are 'Advanced Engineering Mathematics' by E. Kreyszig, and 'Advanced Engineering Mathematics' by R.K. Jain and S.R.K. Iyengar.


INDIAN INSTITUTE OF TECHNOLOGY ROORKEE

MAI-101: MATHEMATICS-I

UADAY SINGH

DEPARTMENT OF MATHEMATICS
About the course
The course is divided into 4 units:
• Matrix Algebra
• Differential Calculus (more than one variable)
• Integral Calculus (double and triple integrals)
• Vector Calculus

The detailed syllabus and reference books can be seen on the link:
https://iitr.ac.in/Departments/Mathematics%20Department/Academics/Course%20Structure.html
Lecture: Tutorial: Practical :: 3:1:0 Per Week
Preferred Books:
1. E. Kreyszig, Advanced Engineering Mathematics, 9th edition, John Wiley and Sons, Inc., U. K.
2. R.K. Jain and S.R.K. Iyengar, Advanced Engineering Mathematics, 2nd Edition, Narosa
Publishing House.

2
Evaluation System
• Class Work Sessional (CWS): 25%
It will be based on assignments, quizzes, regularity in classes, etc.

• Mid-Term Examination: 25% (Subjective Exam of 1.5 Hours)

• End Term Examination (ETE): 50% (Subjective Exam of 3 Hours)

3
MATRIX ALGEBRA
Notation: An 𝑚 × 𝑛 matrix 𝐴 = (𝑎ᵢⱼ) is usually written as
$$A=\begin{pmatrix}a_{11}&a_{12}&\cdots&a_{1j}&\cdots&a_{1n}\\a_{21}&a_{22}&\cdots&a_{2j}&\cdots&a_{2n}\\\vdots&\vdots&&\vdots&&\vdots\\a_{i1}&a_{i2}&\cdots&a_{ij}&\cdots&a_{in}\\\vdots&\vdots&&\vdots&&\vdots\\a_{m1}&a_{m2}&\cdots&a_{mj}&\cdots&a_{mn}\end{pmatrix},$$
with rows labelled $R_1, R_2, \dots, R_m$ (top to bottom) and columns labelled $C_1, C_2, \dots, C_n$ (left to right).

• The $i^{th}$ row of the matrix $A$ is written as
$$R_i=\left(a_{i1},a_{i2},\dots,a_{ij},\dots,a_{in}\right)_{1\times n}=\begin{pmatrix}a_{i1}\\a_{i2}\\\vdots\\a_{in}\end{pmatrix}_{n\times1}^{T}\quad\text{for } i=1,2,\dots,m.$$
4
• The $j^{th}$ column of the matrix $A$ is written as
$$C_j=\begin{pmatrix}a_{1j}\\a_{2j}\\\vdots\\a_{mj}\end{pmatrix}_{m\times1}=\left(a_{1j},a_{2j},\dots,a_{ij},\dots,a_{mj}\right)^{T}\quad\text{for } j=1,2,\dots,n.$$
• Often the rows and columns of a matrix are also referred as row vectors and
column vectors, respectively.
• The zero matrix or zero vector, denoted by $O$, is a matrix of the specified order all of whose entries are 0. The matrix $O$ takes a suitable form depending on the context, for example,
$$O=\left(0,0,\dots,0\right)_{1\times n}\quad\text{or}\quad O=\begin{pmatrix}0\\0\\\vdots\\0\end{pmatrix}_{n\times1}\quad\text{or}\quad O=\begin{pmatrix}0&0&0\\0&0&0\\0&0&0\end{pmatrix},\ \text{etc.}$$

5
Elementary Row and Column Operations
• The following three operations on a matrix A are called the elementary row
operations:

1. Multiply a row by a non-zero constant for which we use the symbol 𝑐𝑅𝑖
or 𝑅𝑖 → 𝑐𝑅𝑖 (multiply the 𝑖 𝑡ℎ row by a non-zero constant 𝑐)
2. Interchange any two rows for which we use the symbol 𝑅𝑖𝑗 or 𝑅𝑖 ↔ 𝑅𝑗
(interchange row 𝑖 and row 𝑗)
3. Add a non-zero multiple of one row to any other row for which we use
the symbol 𝑐𝑅𝑖 + 𝑅𝑗 or 𝑅𝑗 → 𝑅𝑗 + 𝑐𝑅𝑖 (multiply the 𝑖 𝑡ℎ row by 𝑐 and add
to the 𝑗𝑡ℎ row)

• A matrix A is said to be row equivalent to a matrix B, if the matrix A can be


obtained from the matrix B by using a finite sequence of elementary row
operations. Then we usually write 𝐴~𝐵.
• Similarly, we can have the elementary column operations with similar
notations.
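As a small aside not in the original slides, the three elementary row operations are easy to mimic numerically; the sketch below applies them with NumPy (0-based row indices, unlike the R1, R2, … labels above):

```python
import numpy as np

# The three elementary row operations, applied in place to a NumPy array.
def scale_row(A, i, c):        # R_i -> c R_i  (c != 0)
    A[i] *= c

def swap_rows(A, i, j):        # R_i <-> R_j
    A[[i, j]] = A[[j, i]]

def add_multiple(A, j, c, i):  # R_j -> R_j + c R_i
    A[j] += c * A[i]

A = np.array([[2., 6., 1., 7.],
              [1., 2., -1., -1.],
              [5., 7., -4., 9.]])
swap_rows(A, 0, 1)             # R1 <-> R2
add_multiple(A, 1, -2., 0)     # R2 -> R2 - 2 R1
add_multiple(A, 2, -5., 0)     # R3 -> R3 - 5 R1
```

After these three steps, A is the partially reduced matrix that also appears in Example 1 below.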

6
Example 1:
$$\text{Let us consider the matrix } B=\begin{pmatrix}2&6&1&7\\1&2&-1&-1\\5&7&-4&9\end{pmatrix}.\ \text{Then}$$
$$B\overset{R_1\leftrightarrow R_2}{\sim}\begin{pmatrix}1&2&-1&-1\\2&6&1&7\\5&7&-4&9\end{pmatrix}\overset{\substack{R_2\to R_2-2R_1\\R_3\to R_3-5R_1}}{\sim}\begin{pmatrix}1&2&-1&-1\\0&2&3&9\\0&-3&1&14\end{pmatrix}\overset{R_3\to R_3+\frac{3}{2}R_2}{\sim}\begin{pmatrix}1&2&-1&-1\\0&2&3&9\\0&0&11/2&55/2\end{pmatrix}\overset{R_3\to\frac{2}{11}R_3}{\sim}\begin{pmatrix}1&2&-1&-1\\0&2&3&9\\0&0&1&5\end{pmatrix}=A.$$
Thus $A\sim B$. Later we will see that this particular form of $B$ (or $A$) is important.
Again, if we apply the same row operations to the $3\times3$ identity matrix $I_3$, we get
$$I_3=\begin{pmatrix}1&0&0\\0&1&0\\0&0&1\end{pmatrix}\sim\begin{pmatrix}0&1&0\\1&0&0\\0&0&1\end{pmatrix}\sim\begin{pmatrix}0&1&0\\1&-2&0\\0&-5&1\end{pmatrix}\sim\begin{pmatrix}0&1&0\\1&-2&0\\3/2&-8&1\end{pmatrix}\sim\begin{pmatrix}0&1&0\\1&-2&0\\3/11&-16/11&2/11\end{pmatrix}=I_3'.$$

7
$$\text{Now, } I_3'B=\begin{pmatrix}0&1&0\\1&-2&0\\3/11&-16/11&2/11\end{pmatrix}\begin{pmatrix}2&6&1&7\\1&2&-1&-1\\5&7&-4&9\end{pmatrix}=\begin{pmatrix}1&2&-1&-1\\0&2&3&9\\0&0&1&5\end{pmatrix}=A.$$
We note that the matrix 𝐴 can be obtained in two ways:
1. by applying a sequence of elementary row operations on 𝐵.
2. pre-multiplying 𝐵 with 𝐼3′ , a matrix obtained by using the same elementary
operations on 𝐼3 .

Elementary Matrix: An 𝑚 × 𝑚 matrix 𝐸 is said to be an elementary matrix, if it is


obtained by performing an elementary (row or column) operation on the 𝑚 × 𝑚
identity matrix 𝐼𝑚×𝑚 .
$$\text{Examples: }\begin{pmatrix}0&1&0\\1&0&0\\0&0&1\end{pmatrix}\ (R_1\leftrightarrow R_2);\quad\begin{pmatrix}3&0&0\\0&1&0\\0&0&1\end{pmatrix}\ (R_1\to3R_1);\quad\begin{pmatrix}1&0&0\\0&1&2\\0&0&1\end{pmatrix}\ (R_2\to R_2+2R_3).$$
 It can also be noted that to each elementary operation 𝑒 there corresponds an elementary operation 𝑒′ of the same type (its inverse) such that 𝑒′(𝑒(𝐴)) = 𝑒(𝑒′(𝐴)) = 𝐴.
Again consider the elementary matrices used in Example 1.
8
$$E_1=\begin{pmatrix}0&1&0\\1&0&0\\0&0&1\end{pmatrix},\ E_2=\begin{pmatrix}1&0&0\\-2&1&0\\0&0&1\end{pmatrix},\ E_3=\begin{pmatrix}1&0&0\\0&1&0\\-5&0&1\end{pmatrix},\ E_4=\begin{pmatrix}1&0&0\\0&1&0\\0&3/2&1\end{pmatrix},\ E_5=\begin{pmatrix}1&0&0\\0&1&0\\0&0&2/11\end{pmatrix}.$$
$$\text{Now, } E_1B=\begin{pmatrix}1&2&-1&-1\\2&6&1&7\\5&7&-4&9\end{pmatrix}\quad(R_1\leftrightarrow R_2),$$
$$E_2E_1B=\begin{pmatrix}1&2&-1&-1\\0&2&3&9\\5&7&-4&9\end{pmatrix}\quad(R_2\to R_2-2R_1),$$
$$E_3E_2E_1B=\begin{pmatrix}1&2&-1&-1\\0&2&3&9\\0&-3&1&14\end{pmatrix}\quad(R_3\to R_3-5R_1),$$
$$E_4E_3E_2E_1B=\begin{pmatrix}1&2&-1&-1\\0&2&3&9\\0&0&11/2&55/2\end{pmatrix}\quad(R_3\to R_3+\tfrac{3}{2}R_2),$$
$$E_5E_4E_3E_2E_1B=\begin{pmatrix}1&2&-1&-1\\0&2&3&9\\0&0&1&5\end{pmatrix}=A\quad(R_3\to\tfrac{2}{11}R_3).$$

9
$$\text{Also, } E_5E_4E_3E_2E_1=\begin{pmatrix}0&1&0\\1&-2&0\\3/11&-16/11&2/11\end{pmatrix}=I_3'=P,\ \text{say}.$$
Thus the following results hold.
1. Let e be an elementary row operation and 𝐸 be the corresponding 𝑚 × 𝑚 elementary matrix,
that is, 𝐸 = 𝑒(𝐼𝑚 ). Then for every 𝑚 × 𝑛 matrix 𝐴, we have 𝑒 𝐴 = 𝐸𝐴.
2. Let 𝐴 and 𝐵 be two 𝑚 × 𝑛 matrices. Then 𝐴 is row-equivalent to 𝐵 if and only if there is an
invertible (non-singular) 𝑚 × 𝑚 matrix 𝑃 = 𝐸𝑘 𝐸𝑘−1 … 𝐸1 (for some 𝑘) such that 𝐵 = 𝑃𝐴, where 𝑃
is the product of a finite number of 𝑚 × 𝑚 elementary matrices.
3. In case of column operations or column equivalence there is an invertible (non-singular) 𝑛 × 𝑛
matrix 𝑃 = 𝐸1 𝐸2 … 𝐸𝑘 (for some 𝑘) such that 𝐵 = 𝐴𝑃, where 𝑃 is obtained by applying the same
column operations on 𝐼𝑛 which we want to apply on the columns of 𝐴.
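Results 1 and 2 can be illustrated numerically. The NumPy sketch below (my addition, reusing the matrices of Example 1) builds each elementary matrix by applying one row operation to I₃ and checks that their product P reproduces the row-reduced form:

```python
import numpy as np

B = np.array([[2., 6., 1., 7.],
              [1., 2., -1., -1.],
              [5., 7., -4., 9.]])

# Each elementary matrix E_i = e_i(I3): one row operation applied to I3.
E1 = np.eye(3); E1[[0, 1]] = E1[[1, 0]]   # R1 <-> R2
E2 = np.eye(3); E2[1] += -2 * E2[0]       # R2 -> R2 - 2 R1
E3 = np.eye(3); E3[2] += -5 * E3[0]       # R3 -> R3 - 5 R1
E4 = np.eye(3); E4[2] += 1.5 * E4[1]      # R3 -> R3 + (3/2) R2
E5 = np.eye(3); E5[2] *= 2 / 11           # R3 -> (2/11) R3

P = E5 @ E4 @ E3 @ E2 @ E1                # the matrix I3' of the slides
A = P @ B                                 # B is row-equivalent to A = P B
```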

Row equivalence as a relation:


 𝑨~𝑨: Since there is an invertible matrix 𝐼 such that 𝐴 = 𝐼𝐴, 𝐴~𝐴.
 𝑨~𝑩 implies that 𝑩~𝑨: There is an invertible matrix 𝑃 such that 𝐴 = 𝑃𝐵 or 𝑃−1 𝐴 = 𝐵. Hence
𝐵~𝐴.
 𝑨~𝑩 and 𝑩~𝑪 implies that 𝑨~𝑪: There are two invertible matrices of the same order such that
𝐴 = 𝑃𝐵 and 𝐵 = 𝑄𝐶. Therefore, 𝐴 = 𝑃 𝑄𝐶 = 𝑃𝑄 𝐶, where 𝑃𝑄 is an invertible matrix. Hence
𝐴~𝐶.
10
Row - reduced matrix: An 𝑚 × 𝑛 matrix 𝐴 is said to be row-reduced if:
(a) the first non-zero entry in each non-zero row (called leading non-zero entry) of 𝐴 is
equal to 1.
(b) Each column of 𝐴 which contains the leading non-zero entry of some row has all its
other entries 0. Such columns are called the pivot columns.
$$\text{Examples: 1. } A=\begin{pmatrix}1&0&0&0\\0&1&-1&0\\0&0&1&0\end{pmatrix}\ \text{and}\ B=\begin{pmatrix}0&2&1\\1&0&-3\\0&0&0\end{pmatrix}\ \text{are not in row-reduced form.}$$
$$\text{2. } C=\begin{pmatrix}0&1&-3&0&1/2\\0&0&0&1&2\\0&0&0&0&0\end{pmatrix}\ \text{and}\ D=\begin{pmatrix}1&0&0\\0&1&0\\0&0&1\end{pmatrix}\ \text{are in row-reduced form.}$$

Note that two matrices are row-equivalent if and only if they have the same row-reduced echelon form.
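Row reduction to this form can be automated. The routine below is my own exact-arithmetic sketch (not from the slides) of Gauss-Jordan elimination, applied to the matrix B of Example 1:

```python
from fractions import Fraction

def rref(rows):
    """Row-reduced echelon form by Gauss-Jordan elimination (exact arithmetic)."""
    M = [[Fraction(x) for x in row] for row in rows]
    m, n = len(M), len(M[0])
    pivot_row = 0
    for col in range(n):
        # find a row at or below pivot_row with a non-zero entry in this column
        pr = next((r for r in range(pivot_row, m) if M[r][col] != 0), None)
        if pr is None:
            continue
        M[pivot_row], M[pr] = M[pr], M[pivot_row]          # interchange rows
        piv = M[pivot_row][col]
        M[pivot_row] = [x / piv for x in M[pivot_row]]     # leading entry -> 1
        for r in range(m):                                 # clear the pivot column
            if r != pivot_row and M[r][col] != 0:
                c = M[r][col]
                M[r] = [a - c * b for a, b in zip(M[r], M[pivot_row])]
        pivot_row += 1
    return M

R = rref([[2, 6, 1, 7], [1, 2, -1, -1], [5, 7, -4, 9]])
```

Each column containing a leading 1 comes out as a pivot column with zeros elsewhere, as required by (a) and (b).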
H.W.: Prove that the following two matrices are not row-equivalent.
$$\begin{pmatrix}2&0&0\\a&-1&0\\b&c&3\end{pmatrix},\qquad\begin{pmatrix}1&1&2\\-2&0&-1\\1&3&5\end{pmatrix}$$

11
Row-echelon form: An 𝑚 × 𝑛 matrix 𝐴 is said to be in row-echelon form, if
1) every row of 𝐴 which has all its entries 0 occurs below the non-zero rows
2) the first non-zero entry of a non-zero row is 1
3) in consecutive non-zero rows, the first entry 1 in a lower row appears to the right of
1 in the upper row
Examples: 1. The identity matrix and zero matrix are the trivial examples of row-
echelon form. The matrix 𝐵 in previous example is not in row-echelon form.
$$\text{2. Let us consider the matrix } B=\begin{pmatrix}2&6&1&7\\1&2&-1&-1\\5&7&-4&9\end{pmatrix}\text{ of Example 1 on Slide 7. This was reduced to}$$
$$\begin{pmatrix}1&2&-1&-1\\0&2&3&9\\0&0&1&5\end{pmatrix}\sim\begin{pmatrix}1&2&-1&-1\\0&1&3/2&9/2\\0&0&1&5\end{pmatrix},$$
which is in row-echelon form.

12
 If a matrix 𝐴 (in row-echelon form) is reduced into row-reduced form, then it
is called in row-reduced echelon form. The matrices 𝐶 and 𝐷 in Slide -11
are in row-reduced echelon form.
 The row-echelon form of a matrix can be used to find a solution of a system of
linear equations. This process is also called the Gauss elimination method.
Example 2: Let us find a solution to the system of linear equations:
2𝑥1 − 𝑥2 + 3𝑥3 + 2𝑥4 = 0
𝑥1 + 4𝑥2 − 𝑥4 = 0
2𝑥1 + 6𝑥2 − 𝑥3 + 5𝑥4 = 0
This system can be written in the following matrix form ($AX=B$):
$$\begin{pmatrix}2&-1&3&2\\1&4&0&-1\\2&6&-1&5\end{pmatrix}\begin{pmatrix}x_1\\x_2\\x_3\\x_4\end{pmatrix}=\begin{pmatrix}0\\0\\0\end{pmatrix}.\qquad(1)$$
We know that the elimination process for solving a system of linear equations and the
elementary row operations are the same.

13
Applying $R_1\to R_1-2R_2$ and $R_3\to R_3-2R_2$, we have
$$\begin{pmatrix}2&-1&3&2\\1&4&0&-1\\2&6&-1&5\end{pmatrix}\sim\begin{pmatrix}0&-9&3&4\\1&4&0&-1\\0&-2&-1&7\end{pmatrix}\overset{\substack{R_1\to R_1-\frac{9}{2}R_3\\R_2\to R_2+2R_3}}{\sim}\begin{pmatrix}0&0&15/2&-55/2\\1&0&-2&13\\0&-2&-1&7\end{pmatrix}\overset{\substack{R_1\to\frac{2}{15}R_1\\R_3\to-\frac{1}{2}R_3}}{\sim}\begin{pmatrix}0&0&1&-11/3\\1&0&-2&13\\0&1&1/2&-7/2\end{pmatrix}$$
$$\overset{\substack{R_2\to R_2+2R_1\\R_3\to R_3-\frac{1}{2}R_1}}{\sim}\begin{pmatrix}0&0&1&-11/3\\1&0&0&17/3\\0&1&0&-5/3\end{pmatrix}\overset{\text{row interchange}}{\sim}\begin{pmatrix}1&0&0&17/3\\0&1&0&-5/3\\0&0&1&-11/3\end{pmatrix}=A'.$$
Thus the system of equations (1) and $A'X=O$, that is,
$$x_1+\tfrac{17}{3}x_4=0;\qquad x_2-\tfrac{5}{3}x_4=0;\qquad x_3-\tfrac{11}{3}x_4=0,$$
are exactly the same and have the same solutions.
Finally, if we assign an arbitrary value, say $t$, to $x_4$, then the solution of the system is $x_1=-\tfrac{17}{3}t$, $x_2=\tfrac{5}{3}t$, $x_3=\tfrac{11}{3}t$, $x_4=t$, where $t$ is an arbitrary real number. Hence the given system of equations has infinitely many solutions.
Note that we have reduced the augmented matrix into row-echelon form and used the
back substitution.
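The one-parameter family of solutions found above can be spot-checked numerically; a small NumPy sketch (my addition, not part of the slides):

```python
import numpy as np

# Coefficient matrix of the homogeneous system of Example 2.
A = np.array([[2., -1., 3., 2.],
              [1., 4., 0., -1.],
              [2., 6., -1., 5.]])

def solution(t):
    # one-parameter family obtained by back substitution in Example 2
    return np.array([-17 * t / 3, 5 * t / 3, 11 * t / 3, t])

# residual ||A x(t)|| should vanish for every choice of the parameter t
residuals = [np.linalg.norm(A @ solution(t)) for t in (-2.0, 0.0, 1.0, 3.5)]
```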
14
Example 3: Solve the system of equations
$$2x_1+6x_2+x_3=7;\quad x_1+2x_2-x_3=-1;\quad 5x_1+7x_2-4x_3=9,\quad\text{or}\quad\begin{pmatrix}1&2&-1\\2&6&1\\5&7&-4\end{pmatrix}\begin{pmatrix}x_1\\x_2\\x_3\end{pmatrix}=\begin{pmatrix}-1\\7\\9\end{pmatrix}\ (AX=B).\qquad(1)$$
The matrix [A ⋮ 𝐵] is called the augmented matrix. Here
$$[A\,\vdots\,B]=\begin{pmatrix}1&2&-1&\vdots&-1\\2&6&1&\vdots&7\\5&7&-4&\vdots&9\end{pmatrix}\overset{\substack{R_2\to R_2-2R_1\\R_3\to R_3-5R_1}}{\sim}\begin{pmatrix}1&2&-1&\vdots&-1\\0&2&3&\vdots&9\\0&-3&1&\vdots&14\end{pmatrix}\overset{\substack{R_2\to\frac{1}{2}R_2\\R_3\to R_3+3R_2}}{\sim}\begin{pmatrix}1&2&-1&\vdots&-1\\0&1&3/2&\vdots&9/2\\0&0&11/2&\vdots&55/2\end{pmatrix}\overset{R_3\to\frac{2}{11}R_3}{\sim}\begin{pmatrix}1&2&-1&\vdots&-1\\0&1&3/2&\vdots&9/2\\0&0&1&\vdots&5\end{pmatrix}.$$
Thus the system (1) of equations is equivalent to the system
$$x_1+2x_2-x_3=-1;\qquad x_2+\tfrac{3}{2}x_3=\tfrac{9}{2};\qquad x_3=5,$$
which gives $x_3=5$, $x_2=\tfrac{9}{2}-\tfrac{15}{2}=-3$, and $x_1=-1-2(-3)+5=10$, that is, $(x_1,x_2,x_3)=(10,-3,5)$
(a unique solution; in other words, the three planes intersect in a point).
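Since the coefficient matrix here is non-singular, the same answer comes straight from a linear solver; a NumPy check of my own:

```python
import numpy as np

# Coefficient matrix and right-hand side of Example 3 (AX = B).
A = np.array([[1., 2., -1.],
              [2., 6., 1.],
              [5., 7., -4.]])
b = np.array([-1., 7., 9.])

x = np.linalg.solve(A, b)   # unique solution, since A is non-singular
```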

15
Example 4: Consider the system of equations
$$x_1-2x_2+x_3+2x_4=1;\quad x_1+x_2-x_3+x_4=2;\quad x_1+7x_2-5x_3-x_4=3,\quad\text{or}\quad\begin{pmatrix}1&-2&1&2\\1&1&-1&1\\1&7&-5&-1\end{pmatrix}\begin{pmatrix}x_1\\x_2\\x_3\\x_4\end{pmatrix}=\begin{pmatrix}1\\2\\3\end{pmatrix}.$$
The augmented matrix is
$$\begin{pmatrix}1&-2&1&2&\vdots&1\\1&1&-1&1&\vdots&2\\1&7&-5&-1&\vdots&3\end{pmatrix}\sim\begin{pmatrix}1&-2&1&2&\vdots&1\\0&3&-2&-1&\vdots&1\\0&9&-6&-3&\vdots&2\end{pmatrix}\sim\begin{pmatrix}1&-2&1&2&\vdots&1\\0&1&-2/3&-1/3&\vdots&1/3\\0&0&0&0&\vdots&-1\end{pmatrix}.$$
The given system of equations is equivalent to
𝑥1 − 2𝑥2 + 𝑥3 + 2𝑥4 = 1
0𝑥1 + 3𝑥2 − 2𝑥3 − 𝑥4 = 1 (degenerate case).
0𝑥1 + 0𝑥2 + 0𝑥3 + 0𝑥4 = −1
There cannot be any 𝑥1 , 𝑥2 , 𝑥3 , 𝑥4 which satisfies the last equation. Hence this
system has no solution.
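The inconsistency shows up as a rank mismatch between A and the augmented matrix, which can be confirmed numerically (my own sketch):

```python
import numpy as np

# Example 4: an inconsistent system.
A = np.array([[1., -2., 1., 2.],
              [1., 1., -1., 1.],
              [1., 7., -5., -1.]])
b = np.array([1., 2., 3.])

rank_A = np.linalg.matrix_rank(A)
rank_Ab = np.linalg.matrix_rank(np.column_stack([A, b]))
# rank A = 2 < rank [A : B] = 3, so the system has no solution
```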

16
Linear independence/dependence (Rank of matrix)

A set of vectors {𝑢1 , 𝑢2 , … , 𝑢𝑛 } (row vectors or column vectors) is said to be


linearly independent (LI), if the only scalars/constants satisfying the equation
$$k_1u_1+k_2u_2+\cdots+k_nu_n=O\quad(\text{a linear combination})\qquad(1)$$
are $k_1=k_2=\cdots=k_n=0$.
If the set is not linearly independent, that is, if the homogeneous system of equations in (1) also has a non-zero solution, it is called linearly dependent (LD). [In other words, one vector can be written as a linear combination of the others.]
Example 5: (a) The rows of the 3 × 3 identity matrix, that is, $u_1=(1,0,0)$, $u_2=(0,1,0)$, $u_3=(0,0,1)$, are LI.
Let us write $k_1u_1+k_2u_2+k_3u_3=O$, or $(k_1,0,0)+(0,k_2,0)+(0,0,k_3)=(0,0,0)$.
This gives $(k_1,k_2,k_3)=(0,0,0)$, that is, $k_1=k_2=k_3=0$. Hence the rows of the 3 × 3 identity matrix are LI.
(b) The vectors $(1,1,1)$, $(2,-1,4)$, $(5,2,7)$ are LD.
Here $3(1,1,1)+(2,-1,4)-(5,2,7)=(0,0,0)$, or $3(1,1,1)+(2,-1,4)=(5,2,7)$.
17
RANK OF A MATRIX
If we use the definition, $k_1(1,1,1)+k_2(2,-1,4)+k_3(5,2,7)=(0,0,0)$ gives
$$(k_1+2k_2+5k_3,\ k_1-k_2+2k_3,\ k_1+4k_2+7k_3)=(0,0,0),$$
$$\text{or}\ \begin{aligned}k_1+2k_2+5k_3&=0\\k_1-k_2+2k_3&=0\\k_1+4k_2+7k_3&=0\end{aligned}\quad\text{or}\ \begin{aligned}k_1+2k_2+5k_3&=0\\-3k_2-3k_3&=0\\2k_2+2k_3&=0\end{aligned}\quad\text{or}\ \begin{aligned}k_1+2k_2+5k_3&=0\\k_2+k_3&=0\\0&=0.\end{aligned}\qquad(2)$$
This gives 𝑘2 = −𝑘3 and 𝑘1 = −3𝑘3 . Thus system (2) has infinitely many
solutions. Let us take 𝑘3 = 1, 𝑘2 = −1, 𝑘1 = −3. Hence the given vectors are
LD.

 Subset of a linearly independent set is linearly independent


 Superset of a linearly dependent set is linearly dependent
 A set with zero vector is always linearly dependent

The rank of an 𝑚 × 𝑛 matrix 𝐴 is the maximum number of linearly independent row (or
column) vectors in 𝐴. The rank of the matrix 𝐴 is usually denoted by rank(𝐴) or 𝜌(𝐴).

18
$$\text{Example 6: Let us consider the } 3\times4 \text{ matrix } A=\begin{pmatrix}1&1&-1&3\\2&-2&6&8\\3&5&-7&8\end{pmatrix}.$$
Let us try to find the LI rows of $A$. Let $a,b,c$ be three constants such that
$$a(1,1,-1,3)+b(2,-2,6,8)+c(3,5,-7,8)=(0,0,0,0),$$
$$\text{or}\quad a+2b+3c=0;\quad a-2b+5c=0;\quad -a+6b-7c=0;\quad 3a+8b+8c=0,$$
or $2a+8c=0$ and $4b-2c=0$, or $a=-8b$, $c=2b$. If we set $b=1$, then $a=-8$, $c=2$. Thus we have
$$-8(1,1,-1,3)+1(2,-2,6,8)+2(3,5,-7,8)=O.$$
The three rows of the matrix $A$ are LD. Also, neither of the first two rows is a scalar multiple of the other, that is, the first two rows are LI. Hence the rank of $A$ is 2.
The rank, or the LI rows, can also be determined by using the row-echelon form:
$$A=\begin{pmatrix}1&1&-1&3\\2&-2&6&8\\3&5&-7&8\end{pmatrix}\sim\begin{pmatrix}1&1&-1&3\\0&1&-2&-1/2\\0&0&0&0\end{pmatrix}.$$
The number of non-zero rows in the row-echelon form of $A$ is two. Therefore, the maximum number of LI rows in $A$ is two. Hence rank $A$ = 2.
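Both observations, the dependence relation and the rank, can be confirmed with NumPy (a sketch added here, not part of the slides):

```python
import numpy as np

A = np.array([[1., 1., -1., 3.],
              [2., -2., 6., 8.],
              [3., 5., -7., 8.]])

# -8*row1 + 1*row2 + 2*row3 = 0, so the rows are linearly dependent
combo = -8 * A[0] + A[1] + 2 * A[2]
rank = np.linalg.matrix_rank(A)
```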
19
Thus the rank of 𝐴 is also defined as the number of non-zero rows in the row-echelon form of
𝐴. Also rank 𝐴𝑚×𝑛 = number of LI rows ≤ 𝑚.
 The three row-vectors $u_1=(1,1,-1,3)$, $u_2=(2,-2,6,8)$, $u_3=(3,5,-7,8)$ in the previous Example 6 are LD, as the rank of the matrix whose rows are $u_1,u_2,u_3$ is less than 3.
 In general, the vectors $u_1=(a_{11},a_{12},\dots,a_{1n})$, $u_2=(a_{21},a_{22},\dots,a_{2n})$, …, $u_m=(a_{m1},a_{m2},\dots,a_{mn})$ are LD or LI according as the matrix
$$A=\begin{pmatrix}a_{11}&a_{12}&\cdots&a_{1n}\\a_{21}&a_{22}&\cdots&a_{2n}\\\vdots&\vdots&&\vdots\\a_{m1}&a_{m2}&\cdots&a_{mn}\end{pmatrix}$$
has rank (the number of non-zero rows in the row-echelon form of $A$) less than $m$ or equal to $m$.
For example, in order to determine whether the vectors $u_1=(2,1,1)$, $u_2=(0,3,0)$ and $u_3=(3,1,2)$ are LI, one can consider the matrix $A$ whose rows are $u_1$, $u_2$ and $u_3$:
$$A=\begin{pmatrix}2&1&1\\0&3&0\\3&1&2\end{pmatrix}\sim\begin{pmatrix}1&1/2&1/2\\0&1&0\\3&1&2\end{pmatrix}\sim\begin{pmatrix}1&1/2&1/2\\0&1&0\\0&-1/2&1/2\end{pmatrix}\sim\begin{pmatrix}1&1/2&1/2\\0&1&0\\0&0&1\end{pmatrix}\sim\begin{pmatrix}1&0&0\\0&1&0\\0&0&1\end{pmatrix}.$$
Thus rank $A$ = 3 = number of rows. This implies that the given vectors are LI.
20
ROW RANK vs COLUMN RANK
Let row rank 𝐴𝑚×𝑛 = 𝑟 and column rank 𝐴𝑚×𝑛 = 𝑝. Also let 𝑢𝑗 = (𝑢𝑗1 , 𝑢𝑗2 , … , 𝑢𝑗𝑛 ), 𝑗 =
1, 2, … , 𝑚 be the row vectors of 𝐴. By definition of rank, 𝐴 has 𝑟 linearly independent
rows, which we denote by 𝑣1 , 𝑣2 , … , 𝑣𝑟 (regardless of their position in 𝐴). Also all the rows of
𝐴 are linear combinations of 𝑣1 , 𝑣2 , … , 𝑣𝑟 . Therefore, we can write
𝑢1 = 𝑐11 𝑣1 + 𝑐12 𝑣2 + ⋯ + 𝑐1𝑟 𝑣𝑟 = 𝑐11 𝑣11 , 𝑣12 , … , 𝑣1𝑛 + ⋯ + 𝑐1𝑟 (𝑣𝑟1 , 𝑣𝑟2 , … , 𝑣𝑟𝑛 )
𝑢2 = 𝑐21 𝑣1 + 𝑐22 𝑣2 + ⋯ + 𝑐2𝑟 𝑣𝑟 = 𝑐21 𝑣11 , 𝑣12 , … , 𝑣1𝑛 + ⋯ + 𝑐2𝑟 (𝑣𝑟1 , 𝑣𝑟2 , … , 𝑣𝑟𝑛 )
⋮ ⋮
𝑢𝑚 = 𝑐𝑚1 𝑣1 + 𝑐𝑚2 𝑣2 + ⋯ + 𝑐𝑚𝑟 𝑣𝑟 = 𝑐𝑚1 𝑣11 , 𝑣12 , … , 𝑣1𝑛 + ⋯ + 𝑐𝑚𝑟 (𝑣𝑟1 , 𝑣𝑟2 , … , 𝑣𝑟𝑛 )
In matrix form we can write, for the $(j,k)$ entry of $A$,
$$u_{jk}=c_{j1}v_{1k}+c_{j2}v_{2k}+\cdots+c_{jr}v_{rk},\qquad j=1,2,\dots,m;\ k=1,2,\dots,n.$$

21
Comparing the $k^{th}$ column on both sides, we have
$$\begin{pmatrix}u_{1k}\\u_{2k}\\\vdots\\u_{mk}\end{pmatrix}=\begin{pmatrix}c_{11}v_{1k}+c_{12}v_{2k}+\cdots+c_{1r}v_{rk}\\c_{21}v_{1k}+c_{22}v_{2k}+\cdots+c_{2r}v_{rk}\\\vdots\\c_{m1}v_{1k}+c_{m2}v_{2k}+\cdots+c_{mr}v_{rk}\end{pmatrix}=v_{1k}\begin{pmatrix}c_{11}\\c_{21}\\\vdots\\c_{m1}\end{pmatrix}+v_{2k}\begin{pmatrix}c_{12}\\c_{22}\\\vdots\\c_{m2}\end{pmatrix}+\cdots+v_{rk}\begin{pmatrix}c_{1r}\\c_{2r}\\\vdots\\c_{mr}\end{pmatrix}$$
for $k=1,2,\dots,n$.
Now, the vector on the left is the $k^{th}$ column of $A$. We see that each of these $n$ columns of $A$ is a linear combination of the same $r$ columns on the right. Hence $A$ cannot have more than $r$ LI columns. Therefore, column rank $A\le r$, or $p\le r$.
Also 𝑝 = column rank 𝐴 = row rank 𝐴𝑇 ≥ column rank 𝐴𝑇 = row rank 𝐴 = 𝑟.
Hence 𝑝 = 𝑟.
The following can be easily observed about a matrix 𝐴𝑚×𝑛 :
 Rank 𝐴 ≤ min{𝑚, 𝑛}
 Rank 𝐴 = Rank 𝐴𝑇
 Rank 𝐴 + 𝐵 ≤ Rank 𝐴 + Rank 𝐵 , provided 𝐴 + 𝐵 is possible
 Rank 𝐴𝐵 ≤ min{Rank 𝐴 , Rank 𝐵 }, provided 𝐴𝐵 is possible
 𝐴𝑛×𝑛 is non-singular if and only if Rank 𝐴 = 𝑛
 𝑝 vectors with 𝑛(< 𝑝) components are always LD
22
Minors and rank of a matrix
Let 𝐴 be any 𝑚 × 𝑛 matrix. The determinant of any square submatrix of 𝐴 is
called a minor of 𝐴. The order of the submatrix is called the order of the minor.
 𝐴𝑚×𝑛 has minors of order ≤ min{𝑚, 𝑛}
 A number 𝑟 is said to be rank of 𝐴, if 𝐴 has at least one non-zero minor of
order 𝑟 and every minor of order 𝑟 + 1 or more is zero
$$\text{Example 7: 1. The minors of the matrix } A=\begin{pmatrix}1&2&-1&0\\0&1&3&1\\2&5&1&2\end{pmatrix}\text{ are of order}\le3.$$
$$\text{Let us take the minor }\begin{vmatrix}1&2&-1\\0&1&3\\2&5&1\end{vmatrix}=1(1-15)+2(6+1)=0.$$
$$\text{Next, }\begin{vmatrix}2&-1&0\\1&3&1\\5&1&2\end{vmatrix}=2(6-1)+1(2-5)=10-3=7\neq0.\ \text{Hence rank } A=3.$$
$$\text{2. The matrix } A=\begin{pmatrix}2&1&-1\\1&4&2\\3&5&1\end{pmatrix}\text{ has rank 2, since }\det A=0\text{ (the minor of order 3) and }\begin{vmatrix}2&1\\1&4\end{vmatrix}=7\neq0.$$
23
Normal form and rank of a matrix
The normal form of a matrix $A_{m\times n}$ is a matrix $E$ (obtained by elementary row and column operations) of the form
$$E=\begin{pmatrix}I_r&\vdots&O_{r\times(n-r)}\\\cdots&\vdots&\cdots\\O_{(m-r)\times r}&\vdots&O_{(m-r)\times(n-r)}\end{pmatrix}_{m\times n},$$
where $I_r$ is the identity matrix of order $r$ and the $O$'s are zero matrices. It is trivial to observe that rank $A$ = rank $E$ = $r$.
Example 8: Reduce the following matrix into normal form and find its rank.
$$A=\begin{pmatrix}2&-4&3&1&0\\1&-2&1&-4&2\\0&1&-1&3&1\\4&-7&4&-4&5\end{pmatrix}\overset{\substack{R_1\to R_1-2R_2\\R_4\to R_4-4R_2}}{\sim}\begin{pmatrix}0&0&1&9&-4\\1&-2&1&-4&2\\0&1&-1&3&1\\0&1&0&12&-3\end{pmatrix}\overset{\substack{R_2\to R_2+2R_4\\R_3\to R_3-R_4}}{\sim}\begin{pmatrix}0&0&1&9&-4\\1&0&1&20&-4\\0&0&-1&-9&4\\0&1&0&12&-3\end{pmatrix}$$
$$\overset{\substack{R_1\to R_1+R_3\\R_2\to R_2+R_3}}{\sim}\begin{pmatrix}0&0&0&0&0\\1&0&0&11&0\\0&0&-1&-9&4\\0&1&0&12&-3\end{pmatrix}\overset{\substack{C_4\to C_4-11C_1-12C_2-9C_3\\C_5\to C_5+4C_3+3C_2}}{\sim}\begin{pmatrix}0&0&0&0&0\\1&0&0&0&0\\0&0&-1&0&0\\0&1&0&0&0\end{pmatrix}$$
$$\overset{\substack{R_1\leftrightarrow R_2\leftrightarrow R_4\\R_3\to-R_3}}{\sim}\begin{pmatrix}1&0&0&0&0\\0&1&0&0&0\\0&0&1&0&0\\0&0&0&0&0\end{pmatrix}=\begin{pmatrix}I_3&\vdots&O_{3\times2}\\\cdots&\vdots&\cdots\\O_{1\times3}&\vdots&O_{1\times2}\end{pmatrix}.\ \text{Hence rank } A=3.$$

24
Inverse of a matrix
If an $n\times n$ matrix $A$ can be transformed into $I_n$ by elementary row operations, then $A$ is non-singular, that is, invertible. The sequence of elementary row operations that transforms $A_{n\times n}$ into the identity matrix $I_n$ also transforms $I_n$ into $A^{-1}_{n\times n}$.
If we can convert $A=IA$ into $I=BA$, then $A$ is invertible and $A^{-1}=B$.
$$\text{Example 9: To find the inverse of } A=\begin{pmatrix}2&0&1\\-2&3&4\\-5&5&6\end{pmatrix},\ \text{let us start with } A=IA:$$
$$\begin{pmatrix}2&0&1\\-2&3&4\\-5&5&6\end{pmatrix}=\begin{pmatrix}1&0&0\\0&1&0\\0&0&1\end{pmatrix}A\overset{R_1\to\frac{1}{2}R_1}{\sim}\begin{pmatrix}1&0&1/2\\-2&3&4\\-5&5&6\end{pmatrix}=\begin{pmatrix}1/2&0&0\\0&1&0\\0&0&1\end{pmatrix}A$$
$$\overset{\substack{R_2\to R_2+2R_1\\R_3\to R_3+5R_1}}{\sim}\begin{pmatrix}1&0&1/2\\0&3&5\\0&5&17/2\end{pmatrix}=\begin{pmatrix}1/2&0&0\\1&1&0\\5/2&0&1\end{pmatrix}A\overset{\substack{R_2\to R_2/3\\R_3\to R_3-5R_2}}{\sim}\begin{pmatrix}1&0&1/2\\0&1&5/3\\0&0&1/6\end{pmatrix}=\begin{pmatrix}1/2&0&0\\1/3&1/3&0\\5/6&-5/3&1\end{pmatrix}A$$
$$\overset{\substack{R_1\to R_1-3R_3\\R_2\to R_2-10R_3\\R_3\to6R_3}}{\sim}\begin{pmatrix}1&0&0\\0&1&0\\0&0&1\end{pmatrix}=\begin{pmatrix}-2&5&-3\\-8&17&-10\\5&-10&6\end{pmatrix}A.\quad\text{Hence } A^{-1}\text{ exists, and } A^{-1}=\begin{pmatrix}-2&5&-3\\-8&17&-10\\5&-10&6\end{pmatrix}.$$
25
$$\text{Example 10: The matrix } A=\begin{pmatrix}1&-1&-2\\2&4&5\\6&0&-3\end{pmatrix}\text{ has no inverse, since it cannot be reduced to an identity matrix:}$$
$$\begin{pmatrix}1&-1&-2\\2&4&5\\6&0&-3\end{pmatrix}\sim\begin{pmatrix}1&-1&-2\\0&6&9\\0&6&9\end{pmatrix}\sim\begin{pmatrix}1&-1&-2\\0&6&9\\0&0&0\end{pmatrix}.$$
Also rank 𝐴 = 2 < 3.
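Both examples can be checked numerically; the sketch below (my addition) inverts the matrix of Example 9 and confirms the rank deficiency of Example 10:

```python
import numpy as np

# Example 9: an invertible matrix.
A9 = np.array([[2., 0., 1.],
               [-2., 3., 4.],
               [-5., 5., 6.]])
A9_inv = np.linalg.inv(A9)
expected = np.array([[-2., 5., -3.],
                     [-8., 17., -10.],
                     [5., -10., 6.]])

# Example 10: rank 2 < 3, so no inverse exists.
A10 = np.array([[1., -1., -2.],
                [2., 4., 5.],
                [6., 0., -3.]])
rank_A10 = np.linalg.matrix_rank(A10)
```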

$$\text{H.W.: By using elementary operations, show that the matrix } A=\begin{pmatrix}a&1&1&1\\1&b&2&-1\\2&2&3&1\\-1&1&1&1\end{pmatrix}\text{ is invertible if and only if } a\neq-1\ \text{and}\ b\neq\tfrac{1}{2}.$$

26
RANK AND SYSTEM OF LINEAR EQUATIONS

A system of 𝑚 linear equations in 𝑛 unknowns/variables has the general form


$$\begin{aligned}a_{11}x_1+a_{12}x_2+\cdots+a_{1n}x_n&=b_1\\a_{21}x_1+a_{22}x_2+\cdots+a_{2n}x_n&=b_2\\&\ \,\vdots\\a_{m1}x_1+a_{m2}x_2+\cdots+a_{mn}x_n&=b_m\end{aligned}\quad\text{or}\quad\begin{pmatrix}a_{11}&a_{12}&\cdots&a_{1n}\\a_{21}&a_{22}&\cdots&a_{2n}\\\vdots&\vdots&&\vdots\\a_{m1}&a_{m2}&\cdots&a_{mn}\end{pmatrix}\begin{pmatrix}x_1\\x_2\\\vdots\\x_n\end{pmatrix}=\begin{pmatrix}b_1\\b_2\\\vdots\\b_m\end{pmatrix}\quad\text{or}\quad AX=B.\qquad(1)$$

• If all the constants $b_1,b_2,\dots,b_m$ are zero, that is, $B$ is the zero column, then system (1) reduces to $AX=O$ and is called a homogeneous system. Otherwise it is called a non-homogeneous system.
• The column $X=(x_1,x_2,\dots,x_n)^T$ is called a solution of the system (1) if the numbers $x_1,x_2,\dots,x_n$ satisfy all the equations of the system.
• A linear system of equations is called consistent if it has at least one solution, and inconsistent if it has no solution.
• If a linear system is consistent, it may have a unique solution or infinitely many solutions.
• The homogeneous system $AX=O$ is always consistent, with the zero (trivial) solution.

$$\text{• The matrix } [A\,\vdots\,B]=\begin{pmatrix}a_{11}&a_{12}&\cdots&a_{1n}&\vdots&b_1\\a_{21}&a_{22}&\cdots&a_{2n}&\vdots&b_2\\\cdots&\cdots&\cdots&\cdots&\vdots&\cdots\\a_{m1}&a_{m2}&\cdots&a_{mn}&\vdots&b_m\end{pmatrix}\text{ is called the augmented matrix of the system.}$$

27
Examples:
Let us recall our previous Examples 2, 3, 4.
• In Example 2, the system has infinitely many solutions, and the augmented matrix is
$$\begin{pmatrix}2&-1&3&2&\vdots&0\\1&4&0&-1&\vdots&0\\2&6&-1&5&\vdots&0\end{pmatrix}\sim\begin{pmatrix}1&0&0&17/3&\vdots&0\\0&1&0&-5/3&\vdots&0\\0&0&1&-11/3&\vdots&0\end{pmatrix}.$$
Note that rank $A$ = rank $[A\,\vdots\,B]$ = 3 < number of unknowns (4). Further, the solution has one ($4-3$) free variable ($x_4$), on which the other three variables depend.
• In Example 3, the system has a unique solution, and the augmented matrix is
$$[A\,\vdots\,B]=\begin{pmatrix}1&2&-1&\vdots&-1\\2&6&1&\vdots&7\\5&7&-4&\vdots&9\end{pmatrix}\sim\begin{pmatrix}1&2&-1&\vdots&-1\\0&1&3/2&\vdots&9/2\\0&0&1&\vdots&5\end{pmatrix}.$$
Note that rank $A$ = rank $[A\,\vdots\,B]$ = 3 = number of unknowns.
• In Example 4, the system has no solution, and the augmented matrix is
$$\begin{pmatrix}1&-2&1&2&\vdots&1\\1&1&-1&1&\vdots&2\\1&7&-5&-1&\vdots&3\end{pmatrix}\sim\begin{pmatrix}1&-2&1&2&\vdots&1\\0&1&-2/3&-1/3&\vdots&1/3\\0&0&0&0&\vdots&-1\end{pmatrix}.$$
Note that rank $A$ = 2 ≠ rank $[A\,\vdots\,B]$ = 3.

28
Rank vs Consistency of Linear System
 A system of linear equations $AX=B$ is consistent if and only if the rank of the coefficient matrix $A$ equals the rank of the augmented matrix $[A\,\vdots\,B]$.
 If a linear system $AX=B$ with $m$ equations and $n$ unknowns is consistent with rank $A$ = rank $[A\,\vdots\,B]$ = $r$, then the solution of the system contains $n-r$ parameters or free variables.
Let us discuss some more examples.
Example 11: Discuss the solutions of the system:
𝑥1 + 2𝑥2 + 4𝑥3 + 𝑥4 − 𝑥5 = 1
2𝑥1 + 4𝑥2 + 8𝑥3 + 3𝑥4 − 4𝑥5 = 2
𝑥1 +3𝑥2 + 7𝑥3 + 3𝑥5 = −2
$$\text{The augmented matrix of the system is } [A\,\vdots\,B]=\begin{pmatrix}1&2&4&1&-1&\vdots&1\\2&4&8&3&-4&\vdots&2\\1&3&7&0&3&\vdots&-2\end{pmatrix}$$
$$\overset{\substack{R_2\to R_2-2R_1\\R_3\to R_3-R_1}}{\sim}\begin{pmatrix}1&2&4&1&-1&\vdots&1\\0&0&0&1&-2&\vdots&0\\0&1&3&-1&4&\vdots&-3\end{pmatrix}\overset{R_2\leftrightarrow R_3}{\sim}\begin{pmatrix}1&2&4&1&-1&\vdots&1\\0&1&3&-1&4&\vdots&-3\\0&0&0&1&-2&\vdots&0\end{pmatrix}$$
$$\overset{\substack{R_2\to R_2+R_3\\R_1\to R_1-R_3}}{\sim}\begin{pmatrix}1&2&4&0&1&\vdots&1\\0&1&3&0&2&\vdots&-3\\0&0&0&1&-2&\vdots&0\end{pmatrix}\overset{R_1\to R_1-2R_2}{\sim}\begin{pmatrix}1&0&-2&0&-3&\vdots&7\\0&1&3&0&2&\vdots&-3\\0&0&0&1&-2&\vdots&0\end{pmatrix}.$$

29
The rank $[A\,\vdots\,B]$ = rank $A$ = 3 < 5 (the number of unknowns). Therefore, the system is consistent and has infinitely many solutions.
Thus the given system has the same solution as the system:
𝑥1 − 2𝑥3 − 3𝑥5 = 7 𝑥1 − 2𝑥3 = 7 + 3𝑥5
𝑥2 + 3𝑥3 + 2𝑥5 = −3 which gives 𝑥2 + 3𝑥3 = −3 − 2𝑥5
𝑥4 − 2𝑥5 = 0 𝑥4 = 2𝑥5 .
If we assign an arbitrary value 𝑡 to the unknown 𝑥5 , then 𝑥5 = 𝑡, 𝑥4 = 2𝑡 and
𝑥1 = 7 + 3𝑡 + 2𝑥3 = 7 + 3𝑡 + 2𝑠;
𝑥2 = −3 − 2𝑡 − 3𝑥3 = −3 − 2𝑡 − 3𝑠,
where 𝑠 is an arbitrary value assigned to the variable 𝑥3 . Finally, our solution has the
form 𝑥1 , 𝑥2 , 𝑥3 , 𝑥4 , 𝑥5 = (7 + 3𝑡 + 2𝑠, −3 − 2𝑡 − 3𝑠, 𝑠, 𝑡, 2𝑡), where 𝑠 and 𝑡 are arbitrary
real numbers.
NOTE:
• A variable is a basic variable, if it corresponds to a pivot column. Otherwise, the
variable is known as a free variable.
• In Example 11, we have THREE basic variables, namely, 𝑥1 , 𝑥2 , 𝑥4 ; and TWO free
variables, namely, 𝑥3 , 𝑥5 .

30
Example 12: Investigate the values of 𝜆 and 𝜇 for which the system
2𝑥 + 3𝑦 + 5𝑧 = 9
7𝑥 + 3𝑦 − 2𝑧 = 8
2𝑥 + 3𝑦 + 𝜆𝑧 = 𝜇
has (i) no solution (ii) unique solution (iii) infinitely many solutions.
The augmented matrix of the system is
$$[A\,\vdots\,B]=\begin{pmatrix}2&3&5&\vdots&9\\7&3&-2&\vdots&8\\2&3&\lambda&\vdots&\mu\end{pmatrix}\sim\begin{pmatrix}1&3/2&5/2&\vdots&9/2\\0&-15/2&-39/2&\vdots&-47/2\\0&0&\lambda-5&\vdots&\mu-9\end{pmatrix}$$
(performing the operations $R_1\to\tfrac{1}{2}R_1$, $R_2\to R_2-7R_1$, $R_3\to R_3-2R_1$).
(i) The system has no solution if rank 𝐴 ⋮ 𝐵 ≠ rank(A). In this case rank of none of
the matrices can be less than 2. Therefore, rank 𝐴 ⋮ 𝐵 = 3 and rank A = 2.
Hence, 𝜆 = 5 and 𝜇 ≠ 9.
(ii) The system has a unique solution if rank 𝐴 ⋮ 𝐵 = rank A = 3, i.e., 𝜆 ≠ 5 and 𝜇 is
any real number.
(iii) The system has infinitely many solutions if rank 𝐴 ⋮ 𝐵 = rank A < 3, 𝜆 = 5 and
𝜇 = 9.

31
H.W.:
𝑥2𝑧3 2
8 𝑦 𝑧
3
4 𝑥 𝑦
1. Solve = 𝑒 ; 𝑥 = 𝑒 ; 𝑧4 = 1 for 𝑥, 𝑦, 𝑧 > 0.
𝑦

Quiz:
1. If a homogeneous system of linear equations with 𝑚 equations in 𝑛 unknowns has a
non-trivial solution, then what is the relation between 𝑚 and 𝑛?
2. Let 𝐴 be a 4 × 4 non-invertible matrix. Can we find a 4 × 4 matrix 𝐵 ≠ 𝑂 such that
𝐴𝐵 = 𝑂? Can you generalize your result to general 𝑛 × 𝑛 matrices?

32
SOME SPECIAL TYPES OF REAL MATRICES
(Symmetric and Skew-symmetric)

• An 𝑛 × 𝑛 real matrix 𝐴 = [𝑎𝑖𝑗 ] is said to be symmetric if 𝑎𝑖𝑗 = 𝑎𝑗𝑖 for each 𝑖 and 𝑗,
that is, 𝐴 is symmetric, if 𝐴𝑇 = 𝐴.
• An 𝑛 × 𝑛 real matrix 𝐴 = [𝑎𝑖𝑗 ] is said to be skew-symmetric if 𝑎𝑖𝑗 = −𝑎𝑗𝑖 for each 𝑖
and 𝑗, that is, 𝐴 is skew-symmetric, if 𝐴𝑇 = −𝐴.
• In case of skew-symmetric matrices we note that 𝑎𝑖𝑖 = −𝑎𝑖𝑖 for all 𝑖 or 𝑎𝑖𝑖 =0 for all 𝑖,
that is, the diagonal entries of a skew-symmetric matrix are all 0.
$$\text{Examples: } A=\begin{pmatrix}1&-1&5\\-1&2&3\\5&3&3\end{pmatrix}\text{ is symmetric and } B=\begin{pmatrix}0&-1&5\\1&0&2\\-5&-2&0\end{pmatrix}\text{ is skew-symmetric.}$$
PROPERTIES:
1. For any square matrix 𝐴, 𝐴 + 𝐴𝑇 is symmetric and 𝐴 − 𝐴𝑇 is skew-symmetric.
2. Sum of two symmetric (or skew-symmetric) matrices is also symmetric (or skew-
symmetric).
3. If 𝐴 is symmetric/skew-symmetric, then 𝑘𝐴, where 𝑘 is any non-zero real number, is
symmetric/skew-symmetric.
4. If 𝐴 and 𝐵 are symmetric matrices of the same order, then 𝐴𝐵 is symmetric if and
only if 𝐴𝐵 = 𝐵𝐴. Also 𝐴𝑝 is symmetric for any positive integer 𝑝.

33
5. The matrix 𝐵𝑇 𝐴𝐵 is symmetric or skew-symmetric according as 𝐴 is symmetric or
skew-symmetric.
6. If 𝐴 is skew-symmetric, then 𝐴2𝑘 is symmetric and 𝐴2𝑘−1 is skew-symmetric for any
positive integer 𝑘.
7. If 𝐴 and 𝐵 are symmetric matrices of the same order, then 𝐴𝐵 + 𝐵𝐴 is symmetric
and 𝐴𝐵 − 𝐵𝐴 is skew-symmetric.
8. Any square matrix can be expressed as the sum of a symmetric matrix and a skew-symmetric matrix.
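Property 8 follows from writing A = (A + Aᵀ)/2 + (A − Aᵀ)/2; a quick NumPy check on an arbitrary sample matrix (my addition, not from the slides):

```python
import numpy as np

# Arbitrary sample square matrix (not from the slides).
A = np.array([[1., 4., -2.],
              [0., 3., 7.],
              [5., -1., 2.]])

S = (A + A.T) / 2    # symmetric part (property 1)
K = (A - A.T) / 2    # skew-symmetric part (property 1)
```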

ORTHOGONAL MATRICES:
• The vectors 𝑢1 = (𝑎11 , 𝑎12 , … , 𝑎1𝑛 ) and 𝑢2 = (𝑎21 , 𝑎22 , … , 𝑎2𝑛 ) in ℝ𝑛 are said to be
orthogonal, if 𝑢1 . 𝑢2 = 𝑢1 𝑢2𝑇 = 𝑎11 𝑎21 + 𝑎12 𝑎22 + ⋯ + 𝑎1𝑛 𝑎2𝑛 = 0.
• A set $\{u_1,u_2,\dots,u_m\}$ of vectors in $\mathbb{R}^n$ is called orthogonal if $u_i\cdot u_j=0$ for $i\neq j$.
• A set $\{u_1,u_2,\dots,u_m\}$ of vectors in $\mathbb{R}^n$ is called orthonormal if $u_i\cdot u_j=0$ for $i\neq j$ and $u_i\cdot u_j=1$ for $i=j$.
• The length of a vector $u=(x_1,x_2,\dots,x_n)$ is denoted by $\|u\|=\sqrt{x_1^2+x_2^2+\cdots+x_n^2}$.

34
A real square matrix 𝐴 is called orthogonal, if 𝐴−1 = 𝐴𝑇 or 𝐴𝐴𝑇 = 𝐼 = 𝐴𝑇 𝐴.
$$\text{For example, } A=\begin{pmatrix}1/9&8/9&-4/9\\4/9&-4/9&-7/9\\8/9&1/9&4/9\end{pmatrix}\text{ is orthogonal, since}$$
$$AA^T=\begin{pmatrix}1/9&8/9&-4/9\\4/9&-4/9&-7/9\\8/9&1/9&4/9\end{pmatrix}\begin{pmatrix}1/9&4/9&8/9\\8/9&-4/9&1/9\\-4/9&-7/9&4/9\end{pmatrix}=\begin{pmatrix}1&0&0\\0&1&0\\0&0&1\end{pmatrix}.$$

$$\text{In general, suppose that } A=\begin{pmatrix}u_1\\u_2\\\vdots\\u_n\end{pmatrix}=\begin{pmatrix}a_{11}&a_{12}&\cdots&a_{1n}\\a_{21}&a_{22}&\cdots&a_{2n}\\\vdots&\vdots&&\vdots\\a_{n1}&a_{n2}&\cdots&a_{nn}\end{pmatrix}\text{ is an orthogonal matrix (rows } u_i\text{). Then}$$
$$AA^T=\begin{pmatrix}u_1\\u_2\\\vdots\\u_n\end{pmatrix}\left(u_1^T,u_2^T,\dots,u_n^T\right)=\begin{pmatrix}\|u_1\|^2&u_1\cdot u_2&\cdots&u_1\cdot u_n\\u_2\cdot u_1&\|u_2\|^2&\cdots&u_2\cdot u_n\\\vdots&\vdots&&\vdots\\u_n\cdot u_1&u_n\cdot u_2&\cdots&\|u_n\|^2\end{pmatrix}=\begin{pmatrix}1&0&\cdots&0\\0&1&\cdots&0\\\vdots&\vdots&&\vdots\\0&0&\cdots&1\end{pmatrix}$$
$$\Rightarrow\|u_1\|^2=\|u_2\|^2=\cdots=\|u_n\|^2=1\ \text{and}\ u_i\cdot u_j=0\ \text{for } i\neq j.$$

35
⇒ rows of 𝐴 are orthonormal. Similarly, the columns of 𝐴 are orthonormal.

Thus the following statements are equivalent


• 𝐴 is orthogonal
• Rows of 𝐴 are orthonormal
• Columns of 𝐴 are orthonormal

H.W.: Find an orthogonal matrix whose first two rows are the multiples of
𝑢1 = 1,1,1 , 𝑢2 = 0, −1,1 .
$$\text{Ans. }\begin{pmatrix}1/\sqrt{3}&1/\sqrt{3}&1/\sqrt{3}\\0&-1/\sqrt{2}&1/\sqrt{2}\\2/\sqrt{6}&-1/\sqrt{6}&-1/\sqrt{6}\end{pmatrix}$$
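The answer can be verified by checking QQᵀ = I; a NumPy sketch of my own:

```python
import numpy as np

# Normalized rows built from u1 = (1,1,1) and u2 = (0,-1,1),
# plus the third row (2,-1,-1)/sqrt(6) from the answer above.
s3, s2, s6 = np.sqrt(3), np.sqrt(2), np.sqrt(6)
Q = np.array([[1/s3, 1/s3, 1/s3],
              [0., -1/s2, 1/s2],
              [2/s6, -1/s6, -1/s6]])

check = Q @ Q.T    # should be the identity if Q is orthogonal
```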

36
Normal Matrices: A real square matrix 𝐴 is said to be normal, if 𝐴𝐴𝑇 = 𝐴𝑇 𝐴.
• Every symmetric or orthogonal or skew-symmetric matrix is normal.
$$\text{• Example: } A=\begin{pmatrix}6&-3\\3&6\end{pmatrix}\text{ is normal, as } AA^T=\begin{pmatrix}45&0\\0&45\end{pmatrix}=A^TA.$$
COMPLEX MATRICES:
 A matrix 𝐴 = [𝑎𝑖𝑗 ] is said to be a complex matrix, if entries of the matrix, that is, 𝑎𝑖𝑗
are complex numbers.
• The conjugate of $A$, denoted by $\bar{A}$, is defined as $\bar{A}=(\bar{a}_{ij})$.
• The conjugate transpose of $A$, denoted by $A^*$, is defined by $A^*=(\bar{A})^T=\overline{(A^T)}$.
$$\text{• For example, if } A=\begin{pmatrix}2+8i&5-3i&4-7i\\6i&1-4i&3+2i\end{pmatrix},\ \text{then } A^*=\begin{pmatrix}2-8i&-6i\\5+3i&1+4i\\4+7i&3-2i\end{pmatrix}.$$
Properties:
• $(A^*)^*=A$
• $(A+B)^*=A^*+B^*$
• $(kA)^*=\bar{k}A^*$
• $(AB)^*=B^*A^*$
37
Some Special Complex Matrices:
• A complex square matrix $A$ is said to be Hermitian or skew-Hermitian according as $A^*=(\bar{A})^T=A$ or $A^*=-A$.
• If $A=(a_{ij})$ is Hermitian, then $a_{ij}=\bar{a}_{ji}$ for each $i,j$. In this case $a_{ii}=\bar{a}_{ii}$, so the diagonal elements $a_{ii}$ are real.
• If $A=(a_{ij})$ is skew-Hermitian, then $a_{ij}=-\bar{a}_{ji}$ for each $i,j$, and then $a_{ii}=-\bar{a}_{ii}$, so $\mathrm{Re}(a_{ii})=0$ (the diagonal elements are all purely imaginary).
• A complex matrix 𝐴 is said to be unitary if 𝐴∗ = 𝐴−1 or 𝐴∗ 𝐴 = 𝐴𝐴∗ = 𝐼.
• A complex matrix 𝐴 is said to be normal if 𝐴∗ 𝐴 = 𝐴𝐴∗ .

If $A$ is unitary, then $A^*A=I$, so
$$\det(A^*)\det(A)=1\ \Rightarrow\ \overline{\det(A)}\,\det(A)=1\ \Rightarrow\ |\det(A)|^2=1\ \Rightarrow\ |\det(A)|=1.$$

Hence the determinant of a unitary matrix has absolute value 1.

38
EXAMPLES

$$\text{• } A=\begin{pmatrix}3&1-2i&4+7i\\1+2i&-4&-2i\\4-7i&2i&5\end{pmatrix}\text{ is Hermitian,}$$
$$\text{• } B=\frac{1}{2}\begin{pmatrix}1&-i&-1+i\\i&1&1+i\\1+i&-1+i&0\end{pmatrix}\text{ is unitary,}$$
$$\text{• } C=\begin{pmatrix}2+3i&1\\i&1+2i\end{pmatrix}\text{ is normal,}$$
$$\text{• } D=\begin{pmatrix}i&3&2-i\\-3&-i&2i\\-2-i&2i&-2i\end{pmatrix}\text{ is skew-Hermitian.}$$

39
EIGENVALUES AND EIGENVECTORS
Consider the system of linear equations 𝐴𝑋 = 𝜆𝑋, or (𝐴 − 𝜆𝐼)𝑋 = 𝑂, ...(1)
where 𝐴 is a square matrix.
• The system (1) always has a zero solution.
• The values of 𝜆 for which the system (1) has a non-zero (non-trivial) solution are
called eigenvalues of the matrix 𝐴 and the corresponding non-zero solution vectors
𝑋 are called the eigenvectors of 𝐴.
• The system (1) has a non-zero solution if and only if |𝐴 − 𝜆𝐼| = 0. …(2)
• Equation (2) is a polynomial equation in 𝜆, called the characteristic equation of 𝐴, and the polynomial |𝐴 − 𝜆𝐼| is called the characteristic polynomial of 𝐴.
$$\text{For example, if } A=\begin{pmatrix}1&2\\3&2\end{pmatrix},\ \text{then } A-\lambda I_2=\begin{pmatrix}1&2\\3&2\end{pmatrix}-\lambda\begin{pmatrix}1&0\\0&1\end{pmatrix}=\begin{pmatrix}1-\lambda&2\\3&2-\lambda\end{pmatrix}.$$
$$\text{The characteristic equation of } A \text{ is } |A-\lambda I_2|=\begin{vmatrix}1-\lambda&2\\3&2-\lambda\end{vmatrix}=0,$$
or $\lambda^2-3\lambda+2-6=0$, or $\lambda^2-3\lambda-4=0$, or $(\lambda-4)(\lambda+1)=0\Rightarrow\lambda=4,-1$.
Hence the eigenvalues of $A$ are 4 and $-1$. The eigenvector corresponding to $\lambda=4$ is given by
$$(A-4I)X_1=O\ \Rightarrow\ \begin{pmatrix}-3&2\\3&-2\end{pmatrix}\begin{pmatrix}x_1\\x_2\end{pmatrix}=\begin{pmatrix}0\\0\end{pmatrix}\ \Rightarrow\ 3x_1-2x_2=0\ \Rightarrow\ x_1=2,\ x_2=3\ \Rightarrow\ X_1=\begin{pmatrix}2\\3\end{pmatrix}$$
(there can be many more).
40
$$\text{For }\lambda=-1,\text{ the eigenvector is given by } (A+I)X=O,\ \text{or }\begin{pmatrix}2&2\\3&3\end{pmatrix}\begin{pmatrix}x_1\\x_2\end{pmatrix}=\begin{pmatrix}0\\0\end{pmatrix}\ \Rightarrow\ x_1=1,\ x_2=-1,\ \text{i.e., } X_2=\begin{pmatrix}1\\-1\end{pmatrix}.$$
• For given 𝑖, (𝜆𝑖 , 𝑋𝑖 ) is called an eigen pair of 𝐴.
Property: If 𝑋 is an eigenvector of 𝐴 corresponding to the eigenvalue 𝜆, then 𝑘𝑋(𝑘 ≠ 0)
is also an eigenvector of 𝐴 corresponding to 𝜆.
We have 𝐴𝑋 = 𝜆𝑋 ⇒ 𝐴 𝑘𝑋 = 𝑘 𝐴𝑋 = 𝑘 𝜆𝑋 = 𝜆(𝑘𝑋).
Hence 𝑘𝑋 is also an eigenvector of 𝐴 corresponding to 𝜆.
$$\text{Example 13: Find the eigenvalues and eigenvectors of the matrix } A=\begin{pmatrix}2&1&0\\0&2&1\\0&1&2\end{pmatrix}.$$
The eigenvalues are $\lambda_1=1$, $\lambda_2=2$, $\lambda_3=3$, and the corresponding eigenvectors are
$$X_1=\begin{pmatrix}1\\-1\\1\end{pmatrix},\quad X_2=\begin{pmatrix}1\\0\\0\end{pmatrix},\quad X_3=\begin{pmatrix}1\\1\\1\end{pmatrix}.$$
Note that the eigenvectors are LI, and the eigenvalues are distinct.
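Eigen pairs like these can be cross-checked with a numerical routine; below is a NumPy sketch (my addition) for the matrix of Example 13:

```python
import numpy as np

A = np.array([[2., 1., 0.],
              [0., 2., 1.],
              [0., 1., 2.]])

vals, vecs = np.linalg.eig(A)
idx = np.argsort(vals)
vals, vecs = vals[idx], vecs[:, idx]      # eigenvalues sorted: 1, 2, 3

# A V = V diag(vals): every column of vecs is an eigenvector of A
residual = np.linalg.norm(A @ vecs - vecs * vals)
```

The eigenvectors returned by `eig` are normalized to unit length, so they are scalar multiples of the X₁, X₂, X₃ above, consistent with the property that kX (k ≠ 0) is again an eigenvector.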

41
SOME MORE PROPERTIES
If $\lambda$ is an eigenvalue of $A$, then
(i) $\alpha\lambda$ is an eigenvalue of $\alpha A$, where $\alpha$ is a scalar;
(ii) $\lambda-k$ is an eigenvalue of $A-kI$;
(iii) $1/\lambda$ is an eigenvalue of $A^{-1}$ ($\lambda\neq0$);
(iv) $\lambda^m$ is an eigenvalue of $A^m$, where $m$ is a positive integer;
(v) $\lambda$ is an eigenvalue of $A^T$.
Proof: (iii) Let $X$ be the eigenvector of $A$ corresponding to $\lambda$. Then $AX=\lambda X$
$$\Rightarrow A^{-1}AX=A^{-1}(\lambda X)\Rightarrow IX=\lambda A^{-1}X\Rightarrow A^{-1}X=\tfrac{1}{\lambda}X\Rightarrow\tfrac{1}{\lambda}\text{ is an eigenvalue of } A^{-1}.$$
(iv) We can write
$$A^mX=A^{m-1}(AX)=A^{m-1}(\lambda X)=\lambda A^{m-2}(AX)=\lambda^2A^{m-2}X=\cdots=\lambda^mX\Rightarrow\lambda^m\text{ is an eigenvalue of } A^m.$$
(v) $|A-\lambda I|=0\Rightarrow|(A-\lambda I)^T|=0\Rightarrow|A^T-\lambda I|=0\Rightarrow\lambda$ is an eigenvalue of $A^T$.
(The rest are H.W.)

42
Eigenvalues and Eigenvectors of some special
matrices:
1. The eigenvalues of a symmetric or Hermitian matrix are real.
Let $\lambda$ be an eigenvalue of a Hermitian matrix $A$ with corresponding eigenvector $X$, that is, $AX=\lambda X$. Taking conjugates, $\bar{A}\bar{X}=\bar{\lambda}\bar{X}$; transposing, $\bar{X}^T\bar{A}^T=\bar{\lambda}\bar{X}^T$, i.e., $\bar{X}^TA^*=\bar{\lambda}\bar{X}^T$. Post-multiplying by $X$ and using $A^*=A$,
$$\bar{X}^TAX=\bar{\lambda}\bar{X}^TX\ \Rightarrow\ \bar{X}^T(\lambda X)-\bar{\lambda}\bar{X}^TX=0\ \Rightarrow\ (\lambda-\bar{\lambda})\bar{X}^TX=0.$$
Since $\bar{X}^TX=(\bar{x}_1,\bar{x}_2,\dots,\bar{x}_n)(x_1,x_2,\dots,x_n)^T=|x_1|^2+|x_2|^2+\cdots+|x_n|^2\neq0$, we get $\lambda=\bar{\lambda}$, so $\lambda$ is real.
2. The eigenvectors corresponding to distinct eigenvalues of a symmetric matrix are
orthogonal.
Let 𝐴𝑋1 = 𝜆1𝑋1 and 𝐴𝑋2 = 𝜆2𝑋2. Then 𝑋1ᵀ𝐴𝑋2 = 𝜆2𝑋1ᵀ𝑋2. … (1)
Also (𝐴𝑋1)ᵀ = (𝜆1𝑋1)ᵀ ⇒ 𝑋1ᵀ𝐴ᵀ = 𝜆1𝑋1ᵀ ⇒ 𝑋1ᵀ𝐴 = 𝜆1𝑋1ᵀ (since 𝐴ᵀ = 𝐴) ⇒ 𝑋1ᵀ𝐴𝑋2 = 𝜆1𝑋1ᵀ𝑋2. … (2)
From (1) and (2), (𝜆2 − 𝜆1)𝑋1ᵀ𝑋2 = 0 ⇒ 𝑋1ᵀ𝑋2 = 0, as the eigenvalues are distinct.
⇒ 𝑋1, 𝑋2 are orthogonal.
Example 14: Take the symmetric matrix
𝐴 = [ 0  −1   0]
    [−1  −1   1]
    [ 0   1   0].
The characteristic equation of 𝐴 is
|𝐴 − 𝜆𝐼| = |−𝜆     −1      0 |
           |−1   −1 − 𝜆    1 | = 0
           | 0      1     −𝜆 |
⇒ −𝜆[(−1 − 𝜆)(−𝜆) − 1] + 1(𝜆 − 0) + 0 = 0 ⇒ −𝜆(𝜆² + 𝜆 − 1) + 𝜆 = 0
⇒ −𝜆(𝜆² + 𝜆 − 2) = 0 ⇒ 𝜆(𝜆 − 1)(𝜆 + 2) = 0 ⇒ 𝜆1 = 0, 𝜆2 = 1, 𝜆3 = −2.
The eigenvector 𝑋1 corresponding to 𝜆1 = 0 is given by (𝐴 − 0𝐼)𝑋 = 𝑂, that is,
[ 0  −1  0] [𝑥]   [0]
[−1  −1  1] [𝑦] = [0]
[ 0   1  0] [𝑧]   [0]
⇒ 𝑦 = 0, 𝑥 − 𝑧 = 0 ⇒ 𝑥 = 𝑧, 𝑥 arbitrary. So let 𝑥 = 1 ⇒ 𝑋1 = (1, 0, 1)ᵀ.
The eigenvector 𝑋2 corresponding to 𝜆2 = 1 is given by (𝐴 − 𝐼)𝑋 = 𝑂, that is,
[−1  −1   0] [𝑥]   [0]
[−1  −2   1] [𝑦] = [0]
[ 0   1  −1] [𝑧]   [0]
⇒ 𝑥 + 𝑦 = 0, 𝑦 − 𝑧 = 0 ⇒ 𝑦 = 𝑧, 𝑥 = −𝑧.
Let 𝑧 = 1, so 𝑥 = −1, 𝑦 = 1 and 𝑋2 = (−1, 1, 1)ᵀ.
Similarly, we can obtain 𝑋3 = (1, 2, −1)ᵀ.
Now 𝑋1 · 𝑋2 = 𝑋1ᵀ𝑋2 = (1)(−1) + (0)(1) + (1)(1) = 0,
𝑋2 · 𝑋3 = 𝑋2ᵀ𝑋3 = (−1)(1) + (1)(2) + (1)(−1) = 0,
and 𝑋1 · 𝑋3 = 𝑋1ᵀ𝑋3 = (1)(1) + (0)(2) + (1)(−1) = 0
⇒ 𝑋1, 𝑋2, 𝑋3 are mutually orthogonal.
(Check whether 𝑋1, 𝑋2, 𝑋3 are LI.)
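The three dot products above can be confirmed in a couple of lines of plain Python (`dot` is our helper):

```python
def dot(u, v):
    """Standard dot product of two real vectors."""
    return sum(a * b for a, b in zip(u, v))

X1, X2, X3 = [1, 0, 1], [-1, 1, 1], [1, 2, -1]

# each pair of eigenvectors of the symmetric matrix is orthogonal
assert dot(X1, X2) == dot(X2, X3) == dot(X1, X3) == 0
```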
Sum and product of eigenvalues:
Suppose 𝐴 = [𝑎11  𝑎12]
            [𝑎21  𝑎22]. Then |𝐴 − 𝜆𝐼| = 0 … (1)
⇒ |𝑎11 − 𝜆    𝑎12   |
  |  𝑎21    𝑎22 − 𝜆 | = 0
⇒ (𝑎11 − 𝜆)(𝑎22 − 𝜆) − 𝑎21𝑎12 = 0
⇒ 𝜆² − (𝑎11 + 𝑎22)𝜆 + (𝑎11𝑎22 − 𝑎21𝑎12) = 0
⇒ 𝜆² − (trace 𝐴)𝜆 + det 𝐴 = 0. … (2)
If 𝜆1 and 𝜆2 are the roots of (1) or (2), then 𝜆1 + 𝜆2 = trace 𝐴 and 𝜆1𝜆2 = det 𝐴.
In case of a 3 × 3 matrix 𝐴 = (𝑎𝑖𝑗), the equation |𝐴 − 𝜆𝐼| = 0 becomes
𝜆³ − (trace 𝐴)𝜆² + (𝐴11 + 𝐴22 + 𝐴33)𝜆 − det 𝐴 = 0, where 𝐴𝑖𝑖 is the cofactor of 𝑎𝑖𝑖,
⇒ 𝜆1 + 𝜆2 + 𝜆3 = trace 𝐴,
𝜆1𝜆2 + 𝜆2𝜆3 + 𝜆3𝜆1 = 𝐴11 + 𝐴22 + 𝐴33 = sum of the cofactors of the diagonal elements,
𝜆1𝜆2𝜆3 = det 𝐴.
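These relations can be checked on the symmetric matrix of Example 14, whose eigenvalues are 0, 1, −2. A plain-Python sketch (`det3` is our helper):

```python
def det3(A):
    """Determinant of a 3x3 matrix by cofactor expansion along row 0."""
    a, b, c = A[0]
    return (a * (A[1][1]*A[2][2] - A[1][2]*A[2][1])
            - b * (A[1][0]*A[2][2] - A[1][2]*A[2][0])
            + c * (A[1][0]*A[2][1] - A[1][1]*A[2][0]))

A = [[0, -1, 0], [-1, -1, 1], [0, 1, 0]]   # Example 14
eigs = [0, 1, -2]

trace = A[0][0] + A[1][1] + A[2][2]
assert sum(eigs) == trace == -1                       # sum = trace
assert eigs[0] * eigs[1] * eigs[2] == det3(A) == 0    # product = det

# pairwise products = sum of cofactors of the diagonal elements
cof = ((A[1][1]*A[2][2] - A[1][2]*A[2][1])
       + (A[0][0]*A[2][2] - A[0][2]*A[2][0])
       + (A[0][0]*A[1][1] - A[0][1]*A[1][0]))
assert cof == eigs[0]*eigs[1] + eigs[1]*eigs[2] + eigs[2]*eigs[0] == -2
```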
CAYLEY HAMILTON THEOREM
Polynomials of matrices: An expression of the form
𝑓(𝐴) = 𝑎𝑛𝐴ⁿ + 𝑎𝑛−1𝐴ⁿ⁻¹ + ⋯ + 𝑎1𝐴 + 𝑎0𝐼,
where 𝐴 is a square matrix and 𝐼 is the identity matrix of the same order, is called a
polynomial in 𝐴. In particular, we say that 𝐴 is a root of 𝑓(𝑡) = 0 if 𝑓(𝐴) = 𝑂.
For example, let 𝐴 = [1 2; 3 4], so that 𝐴² = [7 10; 15 22].
Let 𝑓(𝑡) = 2𝑡² − 3𝑡 + 5 and 𝑔(𝑡) = 𝑡² − 5𝑡 − 2 be two polynomials. Then
𝑓(𝐴) = 2𝐴² − 3𝐴 + 5𝐼 = 2[7 10; 15 22] − 3[1 2; 3 4] + 5[1 0; 0 1] = [16 14; 21 37],
𝑔(𝐴) = 𝐴² − 5𝐴 − 2𝐼 = [7 10; 15 22] − 5[1 2; 3 4] − [2 0; 0 2] = [0 0; 0 0] = 𝑂. … (1)
⇒ 𝐴 is a root of the equation 𝑡² − 5𝑡 − 2 = 0 but not of 2𝑡² − 3𝑡 + 5 = 0.
Also we note that 𝐴² − 5𝐴 − 2𝐼 = 𝑂 ⇒ 𝐴 − 5𝐼 − 2𝐴⁻¹ = 𝑂 ⇒ 𝐴⁻¹ = (1/2)(𝐴 − 5𝐼).
Also the characteristic equation of 𝐴 is
|𝐴 − 𝜆𝐼| = |1 − 𝜆    2  |
           |  3    4 − 𝜆| = 0
⇒ 𝜆² − 5𝜆 − 2 = 0. … (2)
From (1), we note that 𝐴 satisfies (2), that is, 𝐴 satisfies its characteristic equation.
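The evaluations 𝑔(𝐴) = 𝑂 and 𝐴⁻¹ = (1/2)(𝐴 − 5𝐼) are quick to reproduce in plain Python (`matmul` is our helper):

```python
def matmul(A, B):
    """Product of two 2x2 matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[1, 2], [3, 4]]
A2 = matmul(A, A)
assert A2 == [[7, 10], [15, 22]]

# g(A) = A^2 - 5A - 2I is the zero matrix
gA = [[A2[i][j] - 5 * A[i][j] - 2 * (i == j) for j in range(2)]
      for i in range(2)]
assert gA == [[0, 0], [0, 0]]

# hence A^{-1} = (A - 5I)/2, so A * A^{-1} = I
Ainv = [[(A[i][j] - 5 * (i == j)) / 2 for j in range(2)] for i in range(2)]
assert matmul(A, Ainv) == [[1.0, 0.0], [0.0, 1.0]]
```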
Cayley-Hamilton Theorem: Every square matrix satisfies its characteristic equation.
Proof: Let 𝑓(𝑥) = |𝑥𝐼 − 𝐴| = 𝑥ⁿ + 𝑎𝑛−1𝑥ⁿ⁻¹ + ⋯ + 𝑎1𝑥 + 𝑎0 be the characteristic
polynomial of the 𝑛 × 𝑛 matrix 𝐴. Let 𝐵(𝑥) be the adjoint of the matrix 𝑥𝐼 − 𝐴. We know that the
elements of 𝐵(𝑥) are the cofactors of 𝑥𝐼 − 𝐴, hence are polynomials of degree at
most 𝑛 − 1 in 𝑥. Therefore, we can write
𝐵(𝑥) = 𝐵𝑛−1𝑥ⁿ⁻¹ + ⋯ + 𝐵1𝑥 + 𝐵0, … (1)
where the 𝐵𝑖 are 𝑛 × 𝑛 matrices. (For instance, for 𝑛 = 3 each entry of 𝐵(𝑥) has the form
𝑎𝑖𝑗𝑥² + 𝑏𝑖𝑗𝑥 + 𝑐𝑖𝑗, so 𝐵(𝑥) = (𝑎𝑖𝑗)𝑥² + (𝑏𝑖𝑗)𝑥 + (𝑐𝑖𝑗).)
By the property of the adjoint of a matrix, we have
(𝑥𝐼 − 𝐴)𝐵(𝑥) = |𝑥𝐼 − 𝐴| 𝐼
or (𝑥𝐼 − 𝐴)(𝐵𝑛−1𝑥ⁿ⁻¹ + ⋯ + 𝐵1𝑥 + 𝐵0) = (𝑥ⁿ + 𝑎𝑛−1𝑥ⁿ⁻¹ + ⋯ + 𝑎1𝑥 + 𝑎0)𝐼.
Equating the coefficients of like powers of 𝑥 on both sides, we have
𝐵𝑛−1 = 𝐼, 𝐵𝑛−2 − 𝐴𝐵𝑛−1 = 𝑎𝑛−1𝐼, …, 𝐵0 − 𝐴𝐵1 = 𝑎1𝐼, −𝐴𝐵0 = 𝑎0𝐼.
Multiplying the above equations by 𝐴ⁿ, 𝐴ⁿ⁻¹, …, 𝐴, 𝐼, respectively, we get
𝐴ⁿ𝐵𝑛−1 = 𝐴ⁿ,
𝐴ⁿ⁻¹𝐵𝑛−2 − 𝐴ⁿ𝐵𝑛−1 = 𝑎𝑛−1𝐴ⁿ⁻¹,
⋮
𝐴𝐵0 − 𝐴²𝐵1 = 𝑎1𝐴,
−𝐴𝐵0 = 𝑎0𝐼.
Adding the above equations, 𝑂 = 𝐴ⁿ + 𝑎𝑛−1𝐴ⁿ⁻¹ + ⋯ + 𝑎1𝐴 + 𝑎0𝐼 = 𝑓(𝐴).
Hence 𝐴 satisfies its characteristic equation 𝑓(𝑥) = 0.
Example 15: Verify the Cayley-Hamilton theorem for 𝐴 = [1 2 0; −1 1 2; 1 2 1] and find 𝐴⁻¹.
The characteristic equation of 𝐴 is given by |𝐴 − 𝜆𝐼3| = 0, that is,
|1 − 𝜆    2      0  |
| −1    1 − 𝜆    2  | = 0
|  1      2    1 − 𝜆|
or, applying 𝑅2 → 𝑅2 + 𝑅3,
|1 − 𝜆     2       0   |
|  0     3 − 𝜆   3 − 𝜆 | = 0
|  1       2     1 − 𝜆 |
or, taking 3 − 𝜆 common from 𝑅2 and then applying 𝑅3 → 𝑅3 − 2𝑅2,
(3 − 𝜆) |1 − 𝜆   2      0   |
        |  0     1      1   | = 0
        |  1     0   −1 − 𝜆 |
or (3 − 𝜆)[(1 − 𝜆)(−1 − 𝜆) − 2(0 − 1)] = 0 or (3 − 𝜆)(𝜆² + 1) = 0
or 𝜆³ − 3𝜆² + 𝜆 − 3 = 0. … (1)
Now 𝐴² = 𝐴 ⋅ 𝐴 = [−1  4  4]
                 [ 0  3  4]
                 [ 0  6  5]
and 𝐴³ = 𝐴 ⋅ 𝐴² = [−1  10  12]
                  [ 1  11  10]
                  [−1  16  17].
Then 𝐴³ − 3𝐴² + 𝐴 − 3𝐼3 = [0 0 0; 0 0 0; 0 0 0] = 𝑂3×3
⇒ 𝐴 satisfies its characteristic equation.
Also 𝐴³ − 3𝐴² + 𝐴 − 3𝐼3 = 𝑂 ⇒ 𝐴² − 3𝐴 + 𝐼 − 3𝐴⁻¹ = 𝑂
⇒ 𝐴⁻¹ = (1/3)(𝐴² − 3𝐴 + 𝐼) = (1/3) [−3  −2   4]
                                    [ 3   1  −2]
                                    [−3   0   3].
(Check why the existence of 𝐴⁻¹ is guaranteed.)
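The arithmetic in Example 15 can be verified in plain Python (`matmul` is our helper):

```python
def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

A = [[1, 2, 0], [-1, 1, 2], [1, 2, 1]]
A2 = matmul(A, A)
A3 = matmul(A, A2)
I = [[float(i == j) for j in range(3)] for i in range(3)]

# Cayley-Hamilton: A^3 - 3A^2 + A - 3I = O
Z = [[A3[i][j] - 3 * A2[i][j] + A[i][j] - 3 * I[i][j] for j in range(3)]
     for i in range(3)]
assert Z == [[0, 0, 0], [0, 0, 0], [0, 0, 0]]

# A^{-1} = (A^2 - 3A + I)/3, so A * A^{-1} should be I (up to rounding)
Ainv = [[(A2[i][j] - 3 * A[i][j] + I[i][j]) / 3 for j in range(3)]
        for i in range(3)]
prod = matmul(A, Ainv)
assert all(abs(prod[i][j] - I[i][j]) < 1e-12 for i in range(3) for j in range(3))
```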
Minimal Polynomial: The minimal polynomial 𝑔(𝑡) of a square matrix 𝐴 is the monic
polynomial of least positive degree for which 𝑔(𝐴) = 𝑂.
• Recall the division algorithm for polynomials: 𝑓(𝑥) = 𝑔(𝑥)𝑞(𝑥) + 𝑟(𝑥).
• The minimal polynomial of 𝐴 is a divisor of the characteristic polynomial of 𝐴.
Examples: The minimal polynomial of [2 0 0; 0 2 0; 0 0 2] is 𝑡 − 2, since 𝐴 − 2𝐼 = 𝑂.
The minimal polynomial of [2 1 1; 0 2 1; 0 0 2] is (𝑡 − 2)³, since (𝐴 − 2𝐼)² and 𝐴 − 2𝐼 are not zero.
Computation of 𝐴ᵐ
We shall discuss it by examples.
1. Let us consider 𝐴 = [−2 4; −1 3], whose characteristic equation is
|−2 − 𝜆    4  |
|  −1    3 − 𝜆| = 0, or 𝜆² − 𝜆 − 2 = 0 ⇒ 𝜆 = −1, 2.
By the Cayley-Hamilton theorem, 𝐴² − 𝐴 − 2𝐼 = 𝑂 ⇒ 𝐴² = 𝐴 + 2𝐼
⇒ 𝐴³ = 𝐴² + 2𝐴 = (𝐴 + 2𝐼) + 2𝐴 = 3𝐴 + 2𝐼
⇒ 𝐴⁴ = 3𝐴² + 2𝐴 = 3(𝐴 + 2𝐼) + 2𝐴 = 5𝐴 + 6𝐼
⇒ 𝐴⁵ = 5𝐴² + 6𝐴 = 5(𝐴 + 2𝐼) + 6𝐴 = 11𝐴 + 10𝐼
⇒ 𝐴⁶ = 11𝐴² + 10𝐴 = 11(𝐴 + 2𝐼) + 10𝐴 = 21𝐴 + 22𝐼.
Hence 𝐴⁶ = 21[−2 4; −1 3] + 22[1 0; 0 1] = [−20 84; −21 85].
We note that every power of 𝐴 obtained above is a linear combination of 𝐴 and 𝐼. So
let us assume 𝐴ᵐ = 𝑐0𝐼 + 𝑐1𝐴, where 𝜆ᵐ = 𝑐0 + 𝑐1𝜆 for each eigenvalue 𝜆. … (1)
⇒ (−1)ᵐ = 𝑐0 − 𝑐1 and 2ᵐ = 𝑐0 + 2𝑐1 ⇒ 2ᵐ − (−1)ᵐ = 3𝑐1
⇒ 𝑐1 = (1/3)[2ᵐ − (−1)ᵐ] and 𝑐0 = (−1)ᵐ + 𝑐1 = (1/3)[2ᵐ + 2(−1)ᵐ].
Hence, from (1), we have
𝐴ᵐ = [𝑐0 − 2𝑐1    4𝑐1     ] = (1/3) [4(−1)ᵐ − 2ᵐ     4(2ᵐ − (−1)ᵐ)]
     [  −𝑐1     𝑐0 + 3𝑐1 ]          [(−1)ᵐ − 2ᵐ     2ᵐ⁺² − (−1)ᵐ ],  𝑚 = 1, 2, 3, …
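The closed form just derived can be compared against direct repeated multiplication, say for 𝑚 = 6. A plain-Python sketch (`matmul` is our helper):

```python
def matmul(A, B):
    """Product of two 2x2 matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[-2, 4], [-1, 3]]
m = 6

# coefficients from lambda^m = c0 + c1*lambda at lambda = -1 and 2
c1 = (2**m - (-1)**m) / 3
c0 = (2**m + 2 * (-1)**m) / 3

closed = [[c1 * A[i][j] + (c0 if i == j else 0) for j in range(2)]
          for i in range(2)]

power = A
for _ in range(m - 1):          # A^6 by repeated multiplication
    power = matmul(power, A)

assert closed == power == [[-20, 84], [-21, 85]]
```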
For an 𝑛 × 𝑛 matrix 𝐴, whose characteristic equation has degree 𝑛, we can write
𝐴ᵐ = 𝑐0𝐼 + 𝑐1𝐴 + 𝑐2𝐴² + ⋯ + 𝑐𝑛−1𝐴ⁿ⁻¹, where 𝜆ᵐ = 𝑐0 + 𝑐1𝜆 + 𝑐2𝜆² + ⋯ + 𝑐𝑛−1𝜆ⁿ⁻¹ for each eigenvalue 𝜆.
H.W.: Compute 𝐴ᵐ for 𝐴 = [1 1 −2; −1 2 1; 0 1 −1].
Hint: First obtain −𝜆³ + 2𝜆² + 𝜆 − 2 = 0, 𝜆 = −1, 1, 2 (the roots are distinct).
Assume 𝐴ᵐ = 𝑐0𝐼 + 𝑐1𝐴 + 𝑐2𝐴² and 𝜆ᵐ = 𝑐0 + 𝑐1𝜆 + 𝑐2𝜆², which gives
(−1)ᵐ = 𝑐0 − 𝑐1 + 𝑐2,
1 = 𝑐0 + 𝑐1 + 𝑐2,
2ᵐ = 𝑐0 + 2𝑐1 + 4𝑐2.
Solve these equations for 𝑐0, 𝑐1, 𝑐2 and use them in 𝑐0𝐼 + 𝑐1𝐴 + 𝑐2𝐴² to get
𝐴ᵐ = [ (1/6)[9 − 2ᵐ⁺¹ − (−1)ᵐ]   (1/3)[2ᵐ − (−1)ᵐ]   (1/6)[−9 + 2ᵐ⁺¹ + 7(−1)ᵐ] ]
     [        1 − 2ᵐ                    2ᵐ                   2ᵐ − 1           ]
     [ (1/6)[3 − 2ᵐ⁺¹ − (−1)ᵐ]   (1/3)[2ᵐ − (−1)ᵐ]   (1/6)[−3 + 2ᵐ⁺¹ + 7(−1)ᵐ] ]
Note: In the case of repeated roots, one can differentiate the relation
𝜆ᵐ = 𝑐0 + 𝑐1𝜆 + 𝑐2𝜆² + ⋯ + 𝑐𝑛−1𝜆ⁿ⁻¹
as many times as required to get the coefficients.
H.W.: If 𝐴 = [4 1 3; 0 2 0; −4 1 −4], then find 𝐴¹⁴ − 3𝐴¹³.
HINT: The characteristic polynomial is (2 − 𝑥)(𝑥 − 2)(𝑥 + 2) = −(𝑥 − 2)²(𝑥 + 2).
By the division algorithm, 𝑥¹⁴ − 3𝑥¹³ = (𝑥 − 2)²(𝑥 + 2)𝑞(𝑥) + 𝑎0 + 𝑎1𝑥 + 𝑎2𝑥². … (1)
The term (𝑥 − 2)²(𝑥 + 2)𝑞(𝑥) is zero for 𝑥 = 2, −2. So at these roots we use only
𝑥¹⁴ − 3𝑥¹³ = 𝑎0 + 𝑎1𝑥 + 𝑎2𝑥²,
so that 2¹⁴ − 3 · 2¹³ = 𝑎0 + 2𝑎1 + 4𝑎2
and 2¹⁴ + 3 · 2¹³ = 𝑎0 − 2𝑎1 + 4𝑎2.
Differentiate (1) and put 𝑥 = 2 (a double root), so that 14 · 2¹³ − 3 · 13 · 2¹² = 𝑎1 + 4𝑎2.
Solving the above equations for 𝑎0, 𝑎1, 𝑎2, we get
𝐴¹⁴ − 3𝐴¹³ = (2¹⁴ + 2¹⁵)𝐼3 − 3 · 2¹²𝐴 − 2¹³𝐴².
DIAGONALIZATION
• Two square matrices 𝐴 and 𝐵 of the same order are said to be similar if there is an
invertible matrix 𝑃 such that 𝐴 = 𝑃𝐵𝑃⁻¹, or equivalently 𝐴𝑃 = 𝑃𝐵, or 𝑃⁻¹𝐴𝑃 = 𝐵.
• A special notation: Suppose 𝐴 = [𝑎11 𝑎12; 𝑎21 𝑎22] and 𝐵 = [𝑏11 𝑏12; 𝑏21 𝑏22] = [𝑋1 𝑋2],
where 𝑋1, 𝑋2 are the columns of 𝐵. Then
𝐴𝐵 = [𝑎11𝑏11 + 𝑎12𝑏21   𝑎11𝑏12 + 𝑎12𝑏22] = [𝐴𝑋1  𝐴𝑋2].
     [𝑎21𝑏11 + 𝑎22𝑏21   𝑎21𝑏12 + 𝑎22𝑏22]
• In general, for 𝑛 × 𝑛 matrices, 𝐴𝐵 = [𝐴𝑋1 𝐴𝑋2 𝐴𝑋3 … 𝐴𝑋𝑛], where 𝑋1, 𝑋2, 𝑋3, …, 𝑋𝑛 are the
columns of 𝐵.
Diagonalization of a matrix: A square matrix 𝐴 (𝑛 × 𝑛) is said to be diagonalizable if there
exists an 𝑛 × 𝑛 non-singular matrix 𝑃 such that 𝑃⁻¹𝐴𝑃 = 𝐷 is a diagonal matrix. Then
we say that 𝑃 diagonalizes 𝐴.
For example, consider a 3 × 3 diagonalizable matrix 𝐴. Then there exists a 3 × 3 non-singular
matrix 𝑃 such that 𝑃⁻¹𝐴𝑃 = 𝐷, or 𝐴𝑃 = 𝑃𝐷, where 𝐷 = [𝑑11 0 0; 0 𝑑22 0; 0 0 𝑑33].
If 𝑃1, 𝑃2, 𝑃3 denote the columns of 𝑃, then 𝐴𝑃 = 𝑃𝐷
⇒ [𝐴𝑃1 𝐴𝑃2 𝐴𝑃3] = [𝑑11𝑃1 𝑑22𝑃2 𝑑33𝑃3] ⇒ 𝐴𝑃1 = 𝑑11𝑃1, 𝐴𝑃2 = 𝑑22𝑃2, 𝐴𝑃3 = 𝑑33𝑃3
⇒ 𝑑11, 𝑑22 and 𝑑33 are the eigenvalues of 𝐴 associated with the eigenvectors 𝑃1, 𝑃2 and
𝑃3, respectively.
• Further these eigenvectors are linearly independent since 𝑃 was assumed to be non-
singular.
• If 𝐴 is diagonalizable, then the columns of the diagonalizing matrix 𝑃 are linearly
independent.
We also have the following sufficient condition:
Theorem: If an 𝑛 × 𝑛 matrix 𝐴 has 𝑛 linearly independent eigenvectors 𝑃1 , 𝑃2 , … , 𝑃𝑛 ,
then 𝐴 is diagonalizable.
Proof: We can realize the proof by proving the result for a 3 × 3 matrix 𝐴. Let 𝑃1, 𝑃2 and
𝑃3 be the three linearly independent eigenvectors of 𝐴 corresponding to the eigenvalues 𝜆1,
𝜆2 and 𝜆3. Then 𝐴𝑃1 = 𝜆1𝑃1, 𝐴𝑃2 = 𝜆2𝑃2, 𝐴𝑃3 = 𝜆3𝑃3. Set 𝑃 = [𝑃1 𝑃2 𝑃3].
Now 𝐴𝑃 = [𝐴𝑃1 𝐴𝑃2 𝐴𝑃3] = [𝜆1𝑃1 𝜆2𝑃2 𝜆3𝑃3] = [𝑃1 𝑃2 𝑃3][𝜆1 0 0; 0 𝜆2 0; 0 0 𝜆3] = 𝑃𝐷
⇒ 𝑃⁻¹𝐴𝑃 = 𝐷.
• Note that the diagonal entries of 𝐷 are the eigenvalues of 𝐴, in the order corresponding
to the order of the eigenvectors placed in 𝑃.
• Thus we have the following criterion: an 𝑛 × 𝑛 matrix 𝐴 is diagonalizable if and only if 𝐴 has 𝑛
linearly independent eigenvectors. (HOME ASSIGNMENT)
Theorem: If an 𝑛 × 𝑛 matrix 𝐴 has 𝑛 distinct eigenvalues, then the eigenvectors of 𝐴 are linearly
independent, that is, 𝐴 is diagonalizable.
Proof: Let 𝑋1, 𝑋2, …, 𝑋𝑛 be the eigenvectors of 𝐴 corresponding to the distinct
eigenvalues 𝜆1, 𝜆2, …, 𝜆𝑛. Then 𝐴𝑋𝑘 = 𝜆𝑘𝑋𝑘 (𝑘 = 1, 2, …, 𝑛). Now consider the linear
combination 𝑐1𝑋1 + 𝑐2𝑋2 + ⋯ + 𝑐𝑛𝑋𝑛 = 𝑂. … (1)
Multiplying (1) by (𝐴 − 𝜆2𝐼)(𝐴 − 𝜆3𝐼)⋯(𝐴 − 𝜆𝑛𝐼) and using the fact that (𝐴 − 𝜆𝑘𝐼)𝑋𝑘 = 𝑂, we
get 𝑐1(𝜆1 − 𝜆2)(𝜆1 − 𝜆3)⋯(𝜆1 − 𝜆𝑛)𝑋1 = 𝑂 ⇒ 𝑐1 = 0.
Similarly, multiplying (1) by (𝐴 − 𝜆1𝐼)(𝐴 − 𝜆2𝐼)⋯(𝐴 − 𝜆𝑖−1𝐼)(𝐴 − 𝜆𝑖+1𝐼)⋯(𝐴 − 𝜆𝑛𝐼), we
can show that 𝑐𝑖 = 0 for each 𝑖. Hence 𝑋1, 𝑋2, …, 𝑋𝑛 are LI.
Example 16: Show that the matrix 𝐴 = [1 2 1; 6 −1 0; −1 −2 −1] is diagonalizable. Also find
an invertible matrix 𝑃 such that 𝑃⁻¹𝐴𝑃 is a diagonal matrix.
The characteristic equation of 𝐴 is |𝐴 − 𝜆𝐼| = 0, that is,
|1 − 𝜆     2       1   |
|  6    −1 − 𝜆     0   | = 0.
| −1      −2    −1 − 𝜆 |
Applying 𝑅1 → 𝑅1 + 𝑅3 and taking −𝜆 common from 𝑅1,
−𝜆 | 1      0       1   |
   | 6   −1 − 𝜆     0   | = 0,
   |−1     −2    −1 − 𝜆 |
and then applying 𝑅3 → 𝑅3 + 𝑅1,
−𝜆 | 1      0      1 |
   | 6   −1 − 𝜆    0 | = 0
   | 0     −2     −𝜆 |
⇒ −𝜆[(−1 − 𝜆)(−𝜆) − 0 + 1(−12 − 0)] = 0 ⇒ −𝜆(𝜆² + 𝜆 − 12) = 0
⇒ 𝜆(𝜆 + 4)(𝜆 − 3) = 0 ⇒ 𝜆1 = 0, 𝜆2 = −4, 𝜆3 = 3.
The eigenvalues are distinct; hence 𝐴 is diagonalizable.
The corresponding eigenvectors are 𝑋1 = (1, 6, −13)ᵀ, 𝑋2 = (−1, 2, 1)ᵀ, 𝑋3 = (2, 3, −2)ᵀ. (H.W.)
Since the eigenvalues are distinct, the eigenvectors are linearly independent.
Hence 𝑃 = [𝑋1 𝑋2 𝑋3] is non-singular. That is,
𝑃 = [  1  −1   2]
    [  6   2   3]
    [−13   1  −2],
with
𝑃⁻¹ = [−1/12    0   −1/12]
      [−9/28   2/7   3/28]   (H.W.)
      [ 8/21   1/7   2/21]
And so 𝑃⁻¹𝐴𝑃 = [0 0 0; 0 −4 0; 0 0 3] = 𝐷.
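𝑃⁻¹𝐴𝑃 = 𝐷 can be confirmed without inverting 𝑃, since it is equivalent to 𝐴𝑃 = 𝑃𝐷. A plain-Python sketch (`matmul` is our helper):

```python
def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

A = [[1, 2, 1], [6, -1, 0], [-1, -2, -1]]
P = [[1, -1, 2], [6, 2, 3], [-13, 1, -2]]   # columns are X1, X2, X3
D = [[0, 0, 0], [0, -4, 0], [0, 0, 3]]

# A P = P D is equivalent to P^{-1} A P = D (P is non-singular)
assert matmul(A, P) == matmul(P, D)
```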
The condition that 𝐴 has 𝑛 distinct eigenvalues is sufficient for diagonalization, but not
necessary. In other words, if the matrix 𝐴 does not have 𝑛 distinct eigenvalues, it may or
may not be diagonalizable.
For example, 𝐴 = [3 4; −1 7] has the characteristic equation (3 − 𝜆)(7 − 𝜆) + 4 = 0,
or 𝜆² − 10𝜆 + 25 = 0 ⇒ 𝜆 = 5, 5 (not distinct).
The corresponding eigenvectors are given by [−2 4; −1 2][𝑥; 𝑦] = [0; 0], or 𝑥 − 2𝑦 = 0; take 𝑥 = 2, 𝑦 = 1
⇒ 𝑋1 = (2, 1)ᵀ. There are infinitely many eigenvectors, but they are all multiples of 𝑋1,
so 𝐴 does not have two LI eigenvectors and hence is not diagonalizable.
Example 17: Check the matrix 𝐴 = [0 1 0; 1 0 0; 0 0 1] for diagonalization.
The characteristic equation of 𝐴 is
|𝐴 − 𝜆𝐼| = |−𝜆   1     0  |
           | 1  −𝜆     0  | = 0
           | 0   0   1 − 𝜆|
or (𝜆² − 1)(1 − 𝜆) = 0 or (𝜆 + 1)(𝜆 − 1)² = 0
⇒ 𝜆1 = −1, 𝜆2 = 𝜆3 = 1 (a repeated root).
The eigenvector corresponding to 𝜆1 = −1 is given by (𝐴 + 𝐼)𝑋1 = 𝑂,
or [1 1 0; 1 1 0; 0 0 2][𝑥; 𝑦; 𝑧] = [0; 0; 0], i.e., 𝑥 + 𝑦 = 0, 2𝑧 = 0
⇒ 𝑧 = 0, 𝑥 = 1, 𝑦 = −1 ⇒ 𝑋1 = (1, −1, 0)ᵀ.
The eigenvectors corresponding to 𝜆2 = 1 are given by (𝐴 − 𝐼)𝑋 = 𝑂,
or [−1 1 0; 1 −1 0; 0 0 0][𝑥; 𝑦; 𝑧] = [0; 0; 0], i.e., −𝑥 + 𝑦 = 0, with 𝑧 arbitrary.
Let us choose two LI eigenvectors corresponding to the eigenvalue 𝜆2 = 1:
take 𝑧 = 0, 𝑥 = 𝑦 = 1, or 𝑧 = 1, 𝑥 = 𝑦 = 0, so that 𝑋2 = (1, 1, 0)ᵀ and 𝑋3 = (0, 0, 1)ᵀ.
To check the linear independence of 𝑋1, 𝑋2 and 𝑋3, let
𝑃 = [1 1 0; −1 1 0; 0 0 1] ~ [1 1 0; 0 2 0; 0 0 1] ~ [1 1 0; 0 1 0; 0 0 1]
⇒ the columns of 𝑃 are linearly independent ⇒ 𝑃 is non-singular.
Hence 𝐴 is diagonalizable, with 𝐷 = [−1 0 0; 0 1 0; 0 0 1], so that 𝑃⁻¹𝐴𝑃 = 𝐷.
NOTE:
• Recall from slides 28-30 that for 𝐴𝑋 = 𝑂, where 𝑋 consists of 𝑛 unknowns and 𝐴
consists of 𝑟 linearly independent rows (rank of 𝐴 = 𝑟), the number of linearly
independent solutions is 𝑛 − 𝑟.
• In Example 17, corresponding to 𝜆2 = 1, the rank of 𝐴 − 𝐼 is 1. Therefore, there
are 2 (= 3 − 1) LI solutions of (𝐴 − 𝐼)𝑋 = 𝑂.
H.W.: If the matrix [4 𝛼 −1; 2 5 𝛽; 1 1 𝛾] is diagonalizable and has the eigenvalues 3, 3, 𝛿 (𝛿 ≠ 3),
then find the values of 𝛼, 𝛽, 𝛾, 𝛿.
TWO MULTIPLICITIES ASSOCIATED WITH EIGENVALUES:
• The algebraic multiplicity of an eigenvalue is the number of times it appears as a root
of the characteristic equation.
• The geometric multiplicity of an eigenvalue is the number of LI eigenvectors
associated with the eigenvalue.
• In Example 16, every eigenvalue has algebraic multiplicity 1 and geometric multiplicity 1.
However, in Example 17, the eigenvalue 𝜆2 = 1 has algebraic multiplicity 2 as well as
geometric multiplicity 2.
Powers of a diagonalizable matrix
If 𝐴 is diagonalizable, then there exists a non-singular matrix 𝑃 such that 𝑃⁻¹𝐴𝑃 = 𝐷
⇒ (𝑃⁻¹𝐴𝑃)(𝑃⁻¹𝐴𝑃) = 𝐷𝐷 ⇒ 𝑃⁻¹𝐴(𝑃𝑃⁻¹)𝐴𝑃 = 𝐷² ⇒ 𝑃⁻¹𝐴²𝑃 = 𝐷².
Continuing in this way, 𝑃⁻¹𝐴ᵐ𝑃 = 𝐷ᵐ ⇒ 𝐴ᵐ = 𝑃𝐷ᵐ𝑃⁻¹.
Also 𝐷² = [𝜆1 0 0; 0 𝜆2 0; 0 0 𝜆3][𝜆1 0 0; 0 𝜆2 0; 0 0 𝜆3] = [𝜆1² 0 0; 0 𝜆2² 0; 0 0 𝜆3²],
so that 𝐷ᵐ = [𝜆1ᵐ 0 0; 0 𝜆2ᵐ 0; 0 0 𝜆3ᵐ].
Thus if 𝑓(𝑡) is any function expressible in powers of 𝑡, then 𝑓(𝐴) = 𝑃𝑓(𝐷)𝑃⁻¹.
For example, if 𝐴 is diagonalizable and 𝑃⁻¹𝐴𝑃 = 𝐷, then 𝑒^𝐴 = 𝑃𝑒^𝐷𝑃⁻¹, where
𝑒^𝐷 = 𝐼 + 𝐷 + 𝐷²/2! + 𝐷³/3! + ⋯
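As an illustration, 𝐴ᵐ = 𝑃𝐷ᵐ𝑃⁻¹ reproduces 𝐴⁵ = 11𝐴 + 10𝐼 for the matrix 𝐴 = [−2 4; −1 3] used earlier. The matrices 𝑃 and 𝑃⁻¹ below were worked out by hand for this illustration (they are not given in the slides):

```python
def matmul(A, B):
    """Product of two 2x2 matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

P    = [[4, 1], [1, 1]]                 # columns: eigenvectors for -1 and 2
Pinv = [[1/3, -1/3], [-1/3, 4/3]]       # inverse of P (det P = 3)
m = 5
Dm = [[(-1)**m, 0], [0, 2**m]]          # D^m for D = diag(-1, 2)

Am = matmul(matmul(P, Dm), Pinv)        # A^m = P D^m P^{-1}

expected = [[-12, 44], [-11, 43]]       # 11*A + 10*I, as derived in the text
assert all(abs(Am[i][j] - expected[i][j]) < 1e-9
           for i in range(2) for j in range(2))
```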