
Quadratic Forms and Definite Matrices

Recall the definition of a quadratic form on ℛ^𝑛.


Definition A quadratic form on ℛ^𝑛 is a real-valued function of the form

    Q(x_1, …, x_n) = \sum_{i \le j} a_{ij} x_i x_j

in which each term is a monomial of degree two.


Let 𝒙 = (x_1, …, x_n)^T, a column vector. Each quadratic form 𝑄 can be represented by
a symmetric matrix 𝐴 such that

    Q(𝒙) = 𝒙^T A 𝒙
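For readers who want to experiment, here is a minimal NumPy sketch (the helper name quadratic_form_matrix is ours, not standard) that builds the symmetric matrix 𝐴 from the coefficients a_{ij} with i ≤ j and evaluates Q(𝒙) = 𝒙^T A 𝒙. The off-diagonal coefficient a_{ij} is split evenly between the (i, j) and (j, i) entries, which is exactly what makes 𝐴 symmetric.

```python
import numpy as np

def quadratic_form_matrix(coeffs, n):
    """Build the symmetric matrix A of Q(x) = sum_{i<=j} a_ij x_i x_j.

    coeffs: dict mapping (i, j) with i <= j (0-based) to the coefficient a_ij.
    Off-diagonal coefficients are split evenly between A[i, j] and A[j, i].
    """
    A = np.zeros((n, n))
    for (i, j), a in coeffs.items():
        if i == j:
            A[i, i] = a
        else:
            A[i, j] = A[j, i] = a / 2.0
    return A

# Q(x1, x2) = x1^2 + 2*x1*x2 + x2^2 = (x1 + x2)^2  ->  A = [[1, 1], [1, 1]]
A = quadratic_form_matrix({(0, 0): 1, (0, 1): 2, (1, 1): 1}, n=2)
x = np.array([1.0, -1.0])
print(x @ A @ x)   # Q(1, -1) = 0, the positive semidefinite example below
```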

Definiteness of Quadratic Forms


For a quadratic form of one variable,

    Q(x) = a x^2

 If 𝑎 > 0, a x^2 is always ≥ 0 and equals zero only when 𝑥 = 0. Such a form is
called positive definite.
 If 𝑎 < 0, a x^2 is always ≤ 0 and equals zero only when 𝑥 = 0. Such a form is
called negative definite.

For a quadratic form of two variables,

    Q(x_1, x_2) = a_{11} x_1^2 + a_{12} x_1 x_2 + a_{22} x_2^2

 If a_{11} = a_{22} = 1, a_{12} = 0, then Q(x_1, x_2) = x_1^2 + x_2^2 is always ≥ 0 and equals
zero only when x_1 = x_2 = 0. Such a form is called positive definite.
 If a_{11} = a_{22} = −1, a_{12} = 0, then Q(x_1, x_2) = −x_1^2 − x_2^2 is always ≤ 0 and equals
zero only when x_1 = x_2 = 0. Such a form is called negative definite.
 If a_{11} = a_{22} = 1, a_{12} = 2, then Q(x_1, x_2) = x_1^2 + 2 x_1 x_2 + x_2^2 = (x_1 + x_2)^2 is
always ≥ 0 but may equal zero at nonzero 𝒙's (Q(1, −1) = 0). Such a form is
called positive semidefinite.
 If a_{11} = a_{22} = −1, a_{12} = −2, then Q(x_1, x_2) = −x_1^2 − 2 x_1 x_2 − x_2^2 = −(x_1 + x_2)^2
is always ≤ 0 but may equal zero at nonzero 𝒙's (Q(1, −1) = 0). Such a form
is called negative semidefinite.
 If a_{11} = 1, a_{22} = −1, a_{12} = 0, then Q(x_1, x_2) = x_1^2 − x_2^2 takes on both
positive and negative values (Q(1, 0) = 1, Q(0, 1) = −1). Such a form is
called indefinite.
The concept of definiteness can be generalized to quadratic forms on ℛ^𝑛.
From the definition, determining the definiteness of a quadratic form is
equivalent to determining whether 𝒙 = 𝟎 is a maximizer, a minimizer, or neither for
the real-valued function 𝑄: 𝑄 achieves its unique global minimum at 𝒙 = 𝟎 if
and only if 𝑄 is positive definite, and 𝑄 achieves its unique global maximum at
𝒙 = 𝟎 if and only if 𝑄 is negative definite.

Definiteness of Symmetric Matrices


We can apply the concept of definiteness to symmetric matrices.
Definition Let 𝐴 be an 𝑛 × 𝑛 symmetric matrix. Then 𝐴 is
 positive definite if 𝒙^T A 𝒙 > 0 for all 𝒙 ≠ 𝟎 in ℛ^𝑛.
 positive semidefinite if 𝒙^T A 𝒙 ≥ 0 for all 𝒙 ≠ 𝟎 in ℛ^𝑛.
 negative definite if 𝒙^T A 𝒙 < 0 for all 𝒙 ≠ 𝟎 in ℛ^𝑛.
 negative semidefinite if 𝒙^T A 𝒙 ≤ 0 for all 𝒙 ≠ 𝟎 in ℛ^𝑛.
 indefinite if 𝒙^T A 𝒙 > 0 for some 𝒙 in ℛ^𝑛 and 𝒙^T A 𝒙 < 0 for some other
𝒙 in ℛ^𝑛.
Remark:
 A symmetric matrix which is positive (negative) definite is automatically
positive (negative) semidefinite.
 Every symmetric matrix falls into one of the above five categories.

Principal Minors of a Matrix


We will describe a simple test for the definiteness of a symmetric matrix.
Definition Let 𝐴 be an 𝑛 × 𝑛 matrix. A 𝑘 × 𝑘 submatrix of 𝐴 formed by deleting
𝑛 − 𝑘 columns, say columns i_1, …, i_{n−k}, and the same 𝑛 − 𝑘 rows, rows
i_1, …, i_{n−k}, from 𝐴 is called a 𝑘th order principal submatrix of 𝐴. The determinant
of a 𝑘 × 𝑘 principal submatrix is called a 𝑘th order principal minor of 𝐴.
Example For a general 3 × 3 matrix

    A = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}

 There is one third order principal minor: det(𝐴).
 There are three second order principal minors:
   ➢ \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix}, formed by deleting column 3 and row 3.
   ➢ \begin{vmatrix} a_{11} & a_{13} \\ a_{31} & a_{33} \end{vmatrix}, formed by deleting column 2 and row 2.
   ➢ \begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix}, formed by deleting column 1 and row 1.
 There are three first order principal minors:
   ➢ |a_{11}|, formed by deleting the last two columns and rows.
   ➢ |a_{22}|, formed by deleting column 1, column 3 and row 1, row 3.
   ➢ |a_{33}|, formed by deleting the first two columns and rows.
Definition Let 𝐴 be an 𝑛 × 𝑛 matrix. The 𝑘𝑡ℎ order principal submatrix of 𝐴
formed by deleting the last 𝑛 − 𝑘 rows and the last 𝑛 − 𝑘 columns from 𝐴 is
called the 𝑘𝑡ℎ order leading principal submatrix of 𝐴. Its determinant is called the
𝑘𝑡ℎ order leading principal minor of 𝐴. We will denote the 𝑘𝑡ℎ order leading
principal submatrix by 𝐴𝑘 and the corresponding leading principal minor by |𝐴𝑘 |.

Example For a general 3 × 3 matrix, the three leading principal minors are

    |A_1| = |a_{11}|,  |A_2| = \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix},  |A_3| = \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}

The leading principal minors can be used to determine the definiteness of a given
matrix.
Theorem Let 𝐴 be an 𝑛 × 𝑛 symmetric matrix. Then
(a) 𝐴 is positive definite if and only if all its 𝑛 leading principal minors are strictly
positive.
(b) 𝐴 is negative definite if and only if its 𝑛 leading principal minors alternate in
sign as follows:

    |A_1| < 0,  |A_2| > 0,  |A_3| < 0,  …

The 𝑘th leading principal minor should have the same sign as (−1)^𝑘.
(c) If some 𝑘th leading principal minor of 𝐴 (or some pair of them) is nonzero but
does not fit either of the above two sign patterns, then 𝐴 is indefinite. This case
occurs when some even-order leading principal minor is negative, or when one
odd-order leading principal minor is negative and another odd-order leading
principal minor is positive.
Remark The leading principal minor test fails when some of the leading principal
minors are zero and all the other leading principal minors fit sign pattern (a) or (b).
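This test is easy to run numerically. The sketch below (helper names are ours) computes the leading principal minors with NumPy and applies the sign patterns (a)-(c), reporting the test as inconclusive when a leading principal minor is zero, as in the remark above. The matrices 𝐴 and 𝐵 from the 2 × 2 example further below serve as a check.

```python
import numpy as np

def leading_principal_minors(A):
    """Return [|A_1|, |A_2|, ..., |A_n|] for a square matrix A."""
    n = A.shape[0]
    return [np.linalg.det(A[:k, :k]) for k in range(1, n + 1)]

def classify_by_leading_minors(A):
    minors = leading_principal_minors(A)
    if any(abs(m) < 1e-12 for m in minors):
        return "test inconclusive (a leading principal minor is zero)"
    if all(m > 0 for m in minors):
        return "positive definite"
    # negative definite: |A_k| has the same sign as (-1)^k
    if all((m < 0) if k % 2 == 1 else (m > 0)
           for k, m in enumerate(minors, start=1)):
        return "negative definite"
    return "indefinite"

A = np.array([[2.0, 3.0], [3.0, 7.0]])   # minors 2, 5  -> positive definite
B = np.array([[2.0, 4.0], [4.0, 7.0]])   # minors 2, -2 -> indefinite
print(classify_by_leading_minors(A), classify_by_leading_minors(B))
```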

Theorem Let 𝐴 be an 𝑛 × 𝑛 symmetric matrix. Then 𝐴 is positive semidefinite if
and only if every principal minor of 𝐴 is ≥ 0. 𝐴 is negative semidefinite if and
only if every principal minor of even order is ≥ 0 and every principal minor of odd
order is ≤ 0.
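Because semidefiniteness involves all principal minors, not just the leading ones, a direct implementation must enumerate every principal submatrix. Here is a hedged sketch (helper names ours; the enumeration is exponential in 𝑛, so it is only practical for small matrices):

```python
import numpy as np
from itertools import combinations

def principal_minors(A):
    """Yield (order k, minor) for every principal submatrix of A."""
    n = A.shape[0]
    for k in range(1, n + 1):
        for rows in combinations(range(n), k):
            idx = np.array(rows)
            yield k, np.linalg.det(A[np.ix_(idx, idx)])

def is_positive_semidefinite(A, tol=1e-12):
    return all(m >= -tol for _, m in principal_minors(A))

def is_negative_semidefinite(A, tol=1e-12):
    # even-order minors >= 0, odd-order minors <= 0
    return all((m >= -tol) if k % 2 == 0 else (m <= tol)
               for k, m in principal_minors(A))

A = np.array([[1.0, 1.0], [1.0, 1.0]])   # Q = (x1 + x2)^2
print(is_positive_semidefinite(A))        # True
print(is_negative_semidefinite(-A))       # True
```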

Example Let 𝐴 be a 4 × 4 symmetric matrix.

 If |A_1| > 0, |A_2| > 0, |A_3| > 0, |A_4| > 0, then 𝐴 is positive definite.
 If |A_1| < 0, |A_2| > 0, |A_3| < 0, |A_4| > 0, then 𝐴 is negative definite.
 If |A_1| > 0, |A_2| > 0, |A_3| = 0, |A_4| < 0, then 𝐴 is indefinite.
 If |A_1| < 0, |A_2| < 0, |A_3| < 0, |A_4| < 0, then 𝐴 is indefinite.
 If |A_1| = 0, |A_2| < 0, |A_3| > 0, |A_4| = 0, then 𝐴 is indefinite.
 If |A_1| > 0, |A_2| = 0, |A_3| > 0, |A_4| > 0, then 𝐴 could be positive semidefinite
or indefinite. We need to check the signs of all the principal minors.
 If |A_1| = 0, |A_2| > 0, |A_3| = 0, |A_4| > 0, then 𝐴 could be positive or negative
semidefinite or indefinite. We need to check the signs of all the principal minors.
Example Consider

    A = \begin{pmatrix} 2 & 3 \\ 3 & 7 \end{pmatrix},   B = \begin{pmatrix} 2 & 4 \\ 4 & 7 \end{pmatrix}

Since |A_1| = 2 > 0 and |A_2| = 5 > 0, 𝐴 is positive definite.
Since |B_1| = 2 > 0 and |B_2| = −2 < 0, 𝐵 is indefinite.

Definiteness of Diagonal Matrices
Let 𝐴 be a diagonal matrix. Then

    (x_1, x_2, …, x_n) \begin{pmatrix} a_{11} & 0 & \cdots & 0 \\ 0 & a_{22} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & a_{nn} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} = a_{11} x_1^2 + a_{22} x_2^2 + \cdots + a_{nn} x_n^2

 If a_{11} > 0, a_{22} > 0, …, a_{nn} > 0, the quadratic form is positive definite. Every
leading principal minor is positive.
 If a_{11} < 0, a_{22} < 0, …, a_{nn} < 0, the quadratic form is negative definite. A leading
principal minor of even order is the product of an even number of negative values,
which is positive; a leading principal minor of odd order is the product of an odd
number of negative values, which is negative. Therefore, the signs of the leading
principal minors alternate.
 If there exist 𝑖, 𝑗 such that a_{ii} > 0 and a_{jj} < 0, then the quadratic form is
indefinite. For example, if a_{11} = 0, a_{22} > 0, a_{33} < 0, the quadratic form is
indefinite while all the leading principal minors are zero. Checking the signs of the
leading principal minors alone is therefore not sufficient to judge the definiteness
of the matrix.

Linear Constraints and Bordered Matrices


Consider the constrained optimization problem for quadratic forms of two variables

    Q(x_1, x_2) = a x_1^2 + 2b x_1 x_2 + c x_2^2 = (x_1 \; x_2) \begin{pmatrix} a & b \\ b & c \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}

on the general linear subspace

    A x_1 + B x_2 = 0

where 𝐴 and 𝐵 are nonzero.
From the linear constraint, x_1 = −(B/A) x_2. Substituting this expression for x_1
into the objective function gives

    Q(−(B/A) x_2, x_2) = a (−(B/A) x_2)^2 + 2b (−(B/A) x_2) x_2 + c x_2^2
                       = \frac{a B^2 − 2bAB + c A^2}{A^2} x_2^2

𝑄 is positive definite on the constraint if and only if a B^2 − 2bAB + c A^2 > 0, and 𝑄
is negative definite on the constraint if and only if a B^2 − 2bAB + c A^2 < 0. There is
a convenient way to write this expression:

    a B^2 − 2bAB + c A^2 = −det \begin{pmatrix} 0 & A & B \\ A & a & b \\ B & b & c \end{pmatrix}
Theorem The quadratic form Q(x_1, x_2) = a x_1^2 + 2b x_1 x_2 + c x_2^2 is positive (negative)
definite on the constraint set A x_1 + B x_2 = 0 if and only if

    det \begin{pmatrix} 0 & A & B \\ A & a & b \\ B & b & c \end{pmatrix}

is negative (positive).
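As a quick numerical companion to this theorem (the helper name is ours), the following sketch forms the 3 × 3 bordered matrix and reads off the sign of its determinant:

```python
import numpy as np

def constrained_definiteness_2var(a, b, c, A, B):
    """Definiteness of Q = a*x1^2 + 2b*x1*x2 + c*x2^2 on A*x1 + B*x2 = 0."""
    H = np.array([[0.0, A, B],
                  [A,   a, b],
                  [B,   b, c]])
    d = np.linalg.det(H)
    if d < 0:
        return "positive definite on the constraint"
    if d > 0:
        return "negative definite on the constraint"
    return "degenerate (determinant is zero)"

# Q = x1^2 - x2^2 is indefinite on R^2 but positive definite on x1 + 2*x2 = 0:
# there x1 = -2*x2, so Q = 4*x2^2 - x2^2 = 3*x2^2 > 0 for x2 != 0.
print(constrained_definiteness_2var(a=1, b=0, c=-1, A=1, B=2))
```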

Now consider the constrained optimization problem for quadratic forms of 𝑛
variables

    Q(𝒙) = 𝒙^T A 𝒙 = (x_1 \; x_2 \; \cdots \; x_n) \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{12} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{1n} & a_{2n} & \cdots & a_{nn} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}

on the linear constraint set

    \begin{pmatrix} B_{11} & B_{12} & \cdots & B_{1n} \\ \vdots & \vdots & \ddots & \vdots \\ B_{m1} & B_{m2} & \cdots & B_{mn} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} = \begin{pmatrix} 0 \\ \vdots \\ 0 \end{pmatrix}

Border the matrix of the quadratic form on the top and on the left by the matrix of the
linear constraint:

    H = \begin{pmatrix} 0 & \cdots & 0 & B_{11} & \cdots & B_{1n} \\ \vdots & \ddots & \vdots & \vdots & \ddots & \vdots \\ 0 & \cdots & 0 & B_{m1} & \cdots & B_{mn} \\ B_{11} & \cdots & B_{m1} & a_{11} & \cdots & a_{1n} \\ \vdots & \ddots & \vdots & \vdots & \ddots & \vdots \\ B_{1n} & \cdots & B_{mn} & a_{1n} & \cdots & a_{nn} \end{pmatrix}

We assume the rank of the constraint matrix is 𝑚. In practice, we would first perform
row operations on the constraint matrix, find the rank of the matrix, and then delete
the equations that are linear combinations of the others. We are left with a set of
effective linear constraints.

Theorem To determine the definiteness of a quadratic form of 𝑛 variables, Q = 𝒙^T A 𝒙,
when restricted to a constraint set given by 𝑚 equations, B𝒙 = 𝟎, construct
the (𝑛 + 𝑚) × (𝑛 + 𝑚) symmetric matrix 𝐻 by bordering the matrix 𝐴 above and
to the left by the coefficients 𝐵 of the linear constraints:

    H = \begin{pmatrix} 0 & B \\ B^T & A \end{pmatrix}

Check the signs of the last 𝑛 − 𝑚 leading principal minors of 𝐻, starting with the
determinant of 𝐻 itself:
 If det 𝐻 has the same sign as (−1)^𝑛 and these last 𝑛 − 𝑚 leading principal
minors alternate in sign, then 𝑄 is negative definite on the constraint set B𝒙 = 𝟎,
and 𝑄 achieves its global maximum at 𝒙 = 𝟎.
 If det 𝐻 and these last 𝑛 − 𝑚 leading principal minors all have the same sign as
(−1)^𝑚, then 𝑄 is positive definite on the constraint set B𝒙 = 𝟎, and 𝑄
achieves its global minimum at 𝒙 = 𝟎.
 If both of these sign conditions are violated by nonzero leading principal
minors, then 𝑄 is indefinite on the constraint set B𝒙 = 𝟎, and 𝒙 = 𝟎 is neither
a maximizer nor a minimizer of 𝑄.
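The construction of 𝐻 and the sign checks translate directly into code. The following is a sketch under the theorem's assumptions (𝐴 symmetric, 𝐵 of full row rank 𝑚); the helper name and tolerance are ours:

```python
import numpy as np

def constrained_definiteness(A, B):
    """Classify Q = x^T A x on Bx = 0 by the bordered-matrix test above.

    A: n x n symmetric matrix; B: m x n constraint matrix with full row rank m.
    """
    m, n = B.shape
    H = np.block([[np.zeros((m, m)), B],
                  [B.T,              A]])
    # The last n - m leading principal minors of H: |H_k| for k = 2m+1, ..., n+m.
    ks = range(2 * m + 1, n + m + 1)
    minors = [np.linalg.det(H[:k, :k]) for k in ks]
    if any(abs(d) < 1e-12 for d in minors):
        return "test inconclusive (a checked minor is zero)"
    if all(np.sign(d) == (-1) ** m for d in minors):
        return "positive definite on Bx = 0"
    # Negative definite: det H has sign (-1)^n and the signs alternate,
    # which means |H_k| has the sign of (-1)^(k - m).
    if all(np.sign(d) == (-1) ** (k - m) for k, d in zip(ks, minors)):
        return "negative definite on Bx = 0"
    return "indefinite on Bx = 0"

# Q = x1^2 - x2^2 on x1 + 2*x2 = 0 (the two-variable example above)
A = np.array([[1.0, 0.0], [0.0, -1.0]])
B = np.array([[1.0, 2.0]])
print(constrained_definiteness(A, B))   # positive definite on Bx = 0
```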
Example Check the definiteness of

    Q(x_1, x_2, x_3, x_4) = x_1^2 − x_2^2 + x_3^2 + x_4^2 + 4 x_2 x_3 − 2 x_1 x_4

on the constraint set

    x_2 + x_3 + x_4 = 0,   x_1 − 9 x_2 + x_4 = 0

Form the bordered matrix

    H_6 = \begin{pmatrix} 0 & 0 & 0 & 1 & 1 & 1 \\ 0 & 0 & 1 & −9 & 0 & 1 \\ 0 & 1 & 1 & 0 & 0 & −1 \\ 1 & −9 & 0 & −1 & 2 & 0 \\ 1 & 0 & 0 & 2 & 1 & 0 \\ 1 & 1 & −1 & 0 & 0 & 1 \end{pmatrix}
Since the problem has 𝑛 = 4 variables and 𝑚 = 2 constraints, we need to check the
last 𝑛 − 𝑚 = 2 leading principal minors, det 𝐻_6 and det 𝐻_5, where

    H_5 = \begin{pmatrix} 0 & 0 & 0 & 1 & 1 \\ 0 & 0 & 1 & −9 & 0 \\ 0 & 1 & 1 & 0 & 0 \\ 1 & −9 & 0 & −1 & 2 \\ 1 & 0 & 0 & 2 & 1 \end{pmatrix}
Since 𝑚 = 2 and (−1)^2 = 1, we need det 𝐻_6 > 0 and det 𝐻_5 > 0 to verify
positive definiteness. Since 𝑛 = 4 and (−1)^4 = 1, we need det 𝐻_6 > 0 and
det 𝐻_5 < 0 to verify negative definiteness. In fact, det 𝐻_6 = 24 > 0 and
det 𝐻_5 = 77 > 0, so 𝑄 is positive definite on the constraint set B𝒙 = 𝟎 and
achieves its global minimum at 𝒙 = 𝟎.
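As a sanity check on the two determinants quoted above, one can rebuild 𝐻_6 from 𝐴 and 𝐵 and evaluate both minors numerically (a sketch):

```python
import numpy as np

A = np.array([[ 1,  0,  0, -1],
              [ 0, -1,  2,  0],
              [ 0,  2,  1,  0],
              [-1,  0,  0,  1]], dtype=float)   # matrix of Q
B = np.array([[0,  1, 1, 1],
              [1, -9, 0, 1]], dtype=float)      # constraint coefficients

H6 = np.block([[np.zeros((2, 2)), B],
               [B.T,              A]])
print(round(np.linalg.det(H6)))          # 24
print(round(np.linalg.det(H6[:5, :5])))  # 77
```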

Eigenvalues and Eigenvectors


Definition Let 𝐴 be a square matrix. An eigenvalue of 𝐴 is a number 𝑟 such
that det(𝐴 − 𝑟𝐼) = 0.

Theorem Let 𝐴 be an 𝑛 × 𝑛 matrix and let 𝑟 be a scalar. Then the following
statements are equivalent:
 Subtracting 𝑟 from each diagonal entry of 𝐴 transforms 𝐴 into a singular
matrix.
 𝐴 − 𝑟𝐼 is a singular matrix
 det(𝐴 − 𝑟𝐼) = 0
 (𝐴 − 𝑟𝐼)𝑉 = 𝟎 for some nonzero vector 𝑉.
 𝐴𝑉 = 𝑟𝑉 for some nonzero vector 𝑉.

Definition When 𝑟 is an eigenvalue of 𝐴, a nonzero vector 𝑉 such that

    (A − rI)V = 𝟎

is called an eigenvector of 𝐴 corresponding to the eigenvalue 𝑟. The polynomial
det(𝐴 − 𝑟𝐼) in 𝑟 is called the characteristic polynomial of 𝐴, and the equation
det(𝐴 − 𝑟𝐼) = 0 is called the characteristic equation.

Theorem Let 𝐴 be a 𝑘 × 𝑘 square matrix with eigenvalues r_1, …, r_k. Then
 r_1 + r_2 + \cdots + r_k = tr(𝐴)
 r_1 · r_2 \cdots r_k = det(𝐴)
Proof:
Consider the case of 2 × 2 matrices. Assume

    A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}

Then

    det(A − rI) = r^2 − (a_{11} + a_{22}) r + (a_{11} a_{22} − a_{12} a_{21}) = β(r − r_1)(r − r_2)
                = β r^2 − β(r_1 + r_2) r + β r_1 r_2

Since this holds for all 𝑟, we must have β = 1. Comparing coefficients,

    r_1 + r_2 = tr(𝐴),   r_1 r_2 = det(𝐴)

The argument extends naturally to higher dimensions.
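These two identities are easy to confirm numerically; here is a quick illustration using the 3 × 3 matrix from the example below:

```python
import numpy as np

B = np.array([[1.0, 0.0, 2.0],
              [0.0, 5.0, 0.0],
              [3.0, 0.0, 2.0]])
r = np.linalg.eigvals(B)                 # eigenvalues 5, 4, -1 in some order
print(r.sum(), np.trace(B))              # both 8.0
print(r.prod().real, np.linalg.det(B))   # both -20.0
```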

Distinct Eigenvalues
Example
 Let's compute the eigenvalues and eigenvectors of the 3 × 3 matrix

    B = \begin{pmatrix} 1 & 0 & 2 \\ 0 & 5 & 0 \\ 3 & 0 & 2 \end{pmatrix}

Its characteristic equation is

    det \begin{pmatrix} 1−r & 0 & 2 \\ 0 & 5−r & 0 \\ 3 & 0 & 2−r \end{pmatrix} = (5 − r)(r − 4)(r + 1) = 0

Therefore, the eigenvalues of 𝐵 are 𝑟 = 5, 4, −1. To compute an eigenvector
corresponding to 𝑟 = 5, we compute the nullspace of (𝐵 − 5𝐼); that is, we
solve the system

    (B − 5I)V = \begin{pmatrix} −4 & 0 & 2 \\ 0 & 0 & 0 \\ 3 & 0 & −3 \end{pmatrix} \begin{pmatrix} v_1 \\ v_2 \\ v_3 \end{pmatrix} = \begin{pmatrix} −4v_1 + 2v_3 \\ 0 \\ 3v_1 − 3v_3 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}

whose solution is v_1 = v_3 = 0 with v_2 arbitrary. So we'll take V_1 = (0, 1, 0)^T as
an eigenvector for 𝑟 = 5.
To find an eigenvector for 𝑟 = 4, solve

    (B − 4I)V = \begin{pmatrix} −3 & 0 & 2 \\ 0 & 1 & 0 \\ 3 & 0 & −2 \end{pmatrix} \begin{pmatrix} v_1 \\ v_2 \\ v_3 \end{pmatrix} = \begin{pmatrix} −3v_1 + 2v_3 \\ v_2 \\ 3v_1 − 2v_3 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}

A simple eigenvector for 𝑟 = 4 is V_2 = (2, 0, 3)^T. The same method yields the
eigenvector V_3 = (1, 0, −1)^T for the eigenvalue 𝑟 = −1.
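The same eigenpairs can be obtained with np.linalg.eig, which returns unit-length eigenvectors (each a scalar multiple of the hand-computed ones):

```python
import numpy as np

B = np.array([[1.0, 0.0, 2.0],
              [0.0, 5.0, 0.0],
              [3.0, 0.0, 2.0]])
vals, vecs = np.linalg.eig(B)    # columns of vecs are the eigenvectors
print(vals)                      # 4., -1., 5. in some order
for r, v in zip(vals, vecs.T):
    print(np.allclose(B @ v, r * v))   # True for each eigenpair
```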

Theorem Let r_1, r_2, …, r_h be ℎ distinct eigenvalues of the 𝑘 × 𝑘 matrix 𝐴, and let
V_1, V_2, …, V_h be corresponding eigenvectors. Then V_1, V_2, …, V_h are linearly
independent; that is, no one of them can be written as a linear combination of the
others.

Definition If there exists an invertible matrix 𝑃 such that 𝑃^{−1}𝐴𝑃 is a
diagonal matrix, we say the matrix 𝐴 is diagonalizable.

Theorem Let 𝐴 be a 𝑘 × 𝑘 matrix. Let r_1, r_2, …, r_k be the eigenvalues of 𝐴 and
V_1, V_2, …, V_k the corresponding eigenvectors. Form the matrix

    P = [V_1, V_2, …, V_k]

whose columns are these 𝑘 eigenvectors. If 𝑃 is invertible, then

    P^{−1} A P = \begin{pmatrix} r_1 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & r_k \end{pmatrix}

Conversely, if 𝑃^{−1}𝐴𝑃 is a diagonal matrix 𝐷, the columns of 𝑃 must be
eigenvectors of 𝐴 and the diagonal entries of 𝐷 must be eigenvalues of 𝐴.

Repeated Eigenvalues
Example
Consider the matrix

    A = \begin{pmatrix} 4 & 1 \\ −1 & 2 \end{pmatrix}

whose eigenvalues are 𝑟 = 3, 3. It has only one independent eigenvector,
V_1 = (1, −1)^T.

Definition A matrix 𝐴 which has an eigenvalue of multiplicity 𝑚 > 1 but does not
have 𝑚 independent eigenvectors corresponding to this eigenvalue is called a
nondiagonalizable, or defective, matrix.

When 𝐴 is nondiagonalizable, we try to bring it as close to diagonal form as
possible. Here we introduce the almost diagonal matrix

    \begin{pmatrix} r^* & 1 \\ 0 & r^* \end{pmatrix}

Can this "almost diagonal" form be achieved as 𝑃^{−1}𝐴𝑃 for a defective matrix 𝐴?
Writing 𝑃 = [V_1, V_2] and unwinding the equation

    P^{−1} A P = \begin{pmatrix} r^* & 1 \\ 0 & r^* \end{pmatrix}

gives

    A [V_1, V_2] = [V_1, V_2] \begin{pmatrix} r^* & 1 \\ 0 & r^* \end{pmatrix}
    [A V_1, A V_2] = [r^* V_1, V_1 + r^* V_2]

so that

    (A − r^* I) V_1 = 0
    (A − r^* I) V_2 = V_1
    (A − r^* I)^2 V_2 = 0

Definition Let r^* be an eigenvalue of the matrix 𝐴. A nonzero vector 𝑉 such that
(A − r^* I)V ≠ 0 but (A − r^* I)^m V = 0 for some integer 𝑚 > 1 is called a
generalized eigenvector of 𝐴 corresponding to r^*.

Example
Consider the matrix again:

    A = \begin{pmatrix} 4 & 1 \\ −1 & 2 \end{pmatrix}

Its generalized eigenvector will be a solution V_2 of (A − 3I)V_2 = V_1, or

    \begin{pmatrix} 1 & 1 \\ −1 & −1 \end{pmatrix} \begin{pmatrix} v_{21} \\ v_{22} \end{pmatrix} = \begin{pmatrix} 1 \\ −1 \end{pmatrix}

Take v_{21} = 1, v_{22} = 0, for example. Then form

    P = [V_1, V_2] = \begin{pmatrix} 1 & 1 \\ −1 & 0 \end{pmatrix}

and check that

    P^{−1} A P = \begin{pmatrix} 0 & −1 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} 4 & 1 \\ −1 & 2 \end{pmatrix} \begin{pmatrix} 1 & 1 \\ −1 & 0 \end{pmatrix} = \begin{pmatrix} 3 & 1 \\ 0 & 3 \end{pmatrix}
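A short numerical check of this computation:

```python
import numpy as np

A = np.array([[4.0, 1.0], [-1.0, 2.0]])
V1 = np.array([1.0, -1.0])        # eigenvector for r = 3
V2 = np.array([1.0, 0.0])         # generalized eigenvector: (A - 3I)V2 = V1
P = np.column_stack([V1, V2])
print(np.linalg.inv(P) @ A @ P)   # [[3. 1.], [0. 3.]]
```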

Theorem Let 𝐴 be a 2 × 2 matrix with equal eigenvalues r = r^*, r^*. Then either
 𝐴 has two independent eigenvectors corresponding to r^*, in which case 𝐴 is
the diagonal matrix r^* I, or
 𝐴 has only one independent eigenvector, say V_1. In this case, there is a
generalized eigenvector V_2 such that (A − r^* I)V_2 = V_1. If P = [V_1, V_2], then

    P^{−1} A P = \begin{pmatrix} r^* & 1 \\ 0 & r^* \end{pmatrix}
Example
The characteristic polynomial of the matrix

    A = \begin{pmatrix} 4 & 2 & −4 \\ 1 & 4 & −3 \\ 1 & 1 & 0 \end{pmatrix}

is p(r) = (r − 3)^2 (2 − r), so its eigenvalues are 𝑟 = 3, 3, 2. For the eigenvalue
𝑟 = 2, the solution space of

    (A − 2I)V = \begin{pmatrix} 2 & 2 & −4 \\ 1 & 2 & −3 \\ 1 & 1 & −2 \end{pmatrix} \begin{pmatrix} v_1 \\ v_2 \\ v_3 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}

is the one-dimensional space spanned by V_1 = (1, 1, 1)^T. For the eigenvalue 𝑟 = 3,
the solution space of

    (A − 3I)V = \begin{pmatrix} 1 & 2 & −4 \\ 1 & 1 & −3 \\ 1 & 1 & −3 \end{pmatrix} \begin{pmatrix} v_1 \\ v_2 \\ v_3 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}

is the one-dimensional space spanned by V_2 = (2, 1, 1)^T. There is only one
independent eigenvector corresponding to the eigenvalue of multiplicity two. We need
one more vector V_3, independent of V_1 and V_2, to form the change of coordinates
matrix P = [V_1, V_2, V_3]. Take V_3 to be a generalized eigenvector for the
eigenvalue 𝑟 = 3, that is, a solution to the system

    (A − 3I)V_3 = V_2, or \begin{pmatrix} 1 & 2 & −4 \\ 1 & 1 & −3 \\ 1 & 1 & −3 \end{pmatrix} \begin{pmatrix} v_1 \\ v_2 \\ v_3 \end{pmatrix} = \begin{pmatrix} 2 \\ 1 \\ 1 \end{pmatrix}

We take V_3 = (0, 1, 0)^T. Let

    P = [V_1, V_2, V_3] = \begin{pmatrix} 1 & 2 & 0 \\ 1 & 1 & 1 \\ 1 & 1 & 0 \end{pmatrix}

Then

    P^{−1} A P = \begin{pmatrix} 2 & 0 & 0 \\ 0 & 3 & 1 \\ 0 & 0 & 3 \end{pmatrix}
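Again, the computation can be checked numerically:

```python
import numpy as np

A = np.array([[4.0, 2.0, -4.0],
              [1.0, 4.0, -3.0],
              [1.0, 1.0,  0.0]])
P = np.array([[1.0, 2.0, 0.0],
              [1.0, 1.0, 1.0],
              [1.0, 1.0, 0.0]])   # columns V1, V2, V3
print(np.linalg.inv(P) @ A @ P)   # [[2,0,0], [0,3,1], [0,0,3]]
```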
The almost diagonal matrices like

    \begin{pmatrix} r^* & 1 \\ 0 & r^* \end{pmatrix}  or  \begin{pmatrix} r^* & 1 & 0 \\ 0 & r^* & 1 \\ 0 & 0 & r^* \end{pmatrix}

are called the Jordan canonical form of the original matrix 𝐴.

Symmetric Matrices
Theorem Let 𝐴 be a 𝑘 × 𝑘 symmetric matrix. Then
 All 𝑘 roots of the characteristic equation det(𝐴 − 𝑟𝐼) = 0 are real numbers.
 Eigenvectors corresponding to distinct eigenvalues are orthogonal; and even if 𝐴
has multiple eigenvalues, there is a nonsingular matrix 𝑃 whose columns
W_1, …, W_k are eigenvectors of 𝐴 such that
   ➢ W_1, …, W_k are mutually orthogonal,
   ➢ P^{−1} = P^T, and
   ➢ P^{−1} A P = P^T A P = \begin{pmatrix} r_1 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & r_k \end{pmatrix}

Definition A matrix 𝑃 which satisfies the condition P^{−1} = P^T, or equivalently
P^T P = I, is called an orthogonal matrix.
Example
 The eigenvalues of the symmetric matrix

    B = \begin{pmatrix} 3 & 1 & −1 \\ 1 & 3 & −1 \\ −1 & −1 & 5 \end{pmatrix}

are r_1 = 2, r_2 = 3, r_3 = 6. Corresponding eigenvectors are

    V_1 = \begin{pmatrix} −1 \\ 1 \\ 0 \end{pmatrix},  V_2 = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix},  V_3 = \begin{pmatrix} 1 \\ 1 \\ −2 \end{pmatrix}

These vectors are perpendicular to each other. Divide each eigenvector by its
length to generate a set of normalized eigenvectors

    U_1 = \frac{1}{\sqrt{2}} \begin{pmatrix} −1 \\ 1 \\ 0 \end{pmatrix},  U_2 = \frac{1}{\sqrt{3}} \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix},  U_3 = \frac{1}{\sqrt{6}} \begin{pmatrix} 1 \\ 1 \\ −2 \end{pmatrix}

and make these three orthonormal vectors (vectors which are orthogonal and have
length 1) the columns of the orthogonal matrix

    Q = \begin{pmatrix} −1/\sqrt{2} & 1/\sqrt{3} & 1/\sqrt{6} \\ 1/\sqrt{2} & 1/\sqrt{3} & 1/\sqrt{6} \\ 0 & 1/\sqrt{3} & −2/\sqrt{6} \end{pmatrix}

Then Q^{−1} = Q^T and

    Q^T B Q = \begin{pmatrix} 2 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 6 \end{pmatrix}

(see the numerical check after this example).
 Let's diagonalize a symmetric matrix with nondistinct eigenvalues. Consider the
4 × 4 symmetric matrix

    C = \begin{pmatrix} 3 & 1 & 1 & 1 \\ 1 & 3 & 1 & 1 \\ 1 & 1 & 3 & 1 \\ 1 & 1 & 1 & 3 \end{pmatrix}

The eigenvalues of 𝐶 are, by inspection, 2, 2, 2 and 6. The set of eigenvectors for
2, the eigenspace for eigenvalue 2, is the three-dimensional nullspace of

    C − 2I = \begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \end{pmatrix}

that is, the space {(u_1, u_2, u_3, u_4) : u_1 + u_2 + u_3 + u_4 = 0}. Three independent
vectors in this eigenspace are

    V_1 = \begin{pmatrix} −1 \\ 1 \\ 0 \\ 0 \end{pmatrix},  V_2 = \begin{pmatrix} −1 \\ 0 \\ 1 \\ 0 \end{pmatrix},  V_3 = \begin{pmatrix} −1 \\ 0 \\ 0 \\ 1 \end{pmatrix}

In order to construct an orthogonal matrix 𝑃 so that the product P^T C P =
P^{−1} C P is diagonal, we need to find three orthogonal vectors W_1, W_2, W_3 which
span the same subspace as the independent vectors V_1, V_2, V_3. The following
procedure, called the Gram-Schmidt Orthogonalization Process, will accomplish
this task (it is implemented in the sketch after this example). Let W_1 = V_1 and
define

    W_2 = V_2 − \frac{W_1 · V_2}{W_1 · W_1} W_1
    W_3 = V_3 − \frac{W_1 · V_3}{W_1 · W_1} W_1 − \frac{W_2 · V_3}{W_2 · W_2} W_2

The W_i's so constructed are mutually orthogonal and, by construction, W_1, W_2, W_3
span the same space as V_1, V_2, V_3. Applying this process to the eigenvectors
V_1, V_2, V_3 yields the orthogonal vectors

    W_1 = \begin{pmatrix} −1 \\ 1 \\ 0 \\ 0 \end{pmatrix},  W_2 = \begin{pmatrix} −1/2 \\ −1/2 \\ 1 \\ 0 \end{pmatrix},  W_3 = \begin{pmatrix} −1/3 \\ −1/3 \\ −1/3 \\ 1 \end{pmatrix}
Finally, normalize these three vectors and make them the first three columns of an
orthogonal matrix whose fourth column is the normalized eigenvector for 𝑟 = 6:

    Q = \begin{pmatrix} −1/\sqrt{2} & −1/\sqrt{6} & −1/\sqrt{12} & 1/2 \\ 1/\sqrt{2} & −1/\sqrt{6} & −1/\sqrt{12} & 1/2 \\ 0 & 2/\sqrt{6} & −1/\sqrt{12} & 1/2 \\ 0 & 0 & 3/\sqrt{12} & 1/2 \end{pmatrix}

Check that Q^T = Q^{−1} and

    Q^T C Q = \begin{pmatrix} 2 & 0 & 0 & 0 \\ 0 & 2 & 0 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 6 \end{pmatrix}
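Both worked examples above can be verified numerically. The sketch below (the gram_schmidt helper is our own minimal implementation of the process described above) orthonormalizes the eigenvectors of 𝐵 directly, then runs Gram-Schmidt on the eigenspace basis of 𝐶 before assembling the orthogonal matrix:

```python
import numpy as np

def gram_schmidt(vectors):
    """Orthogonalize a list of independent vectors (without normalizing)."""
    ws = []
    for v in vectors:
        w = np.array(v, dtype=float)
        for u in ws:
            w = w - (u @ w) / (u @ u) * u   # remove the component of w along u
        ws.append(w)
    return ws

# First example: distinct eigenvalues, eigenvectors already orthogonal.
B = np.array([[3.0, 1.0, -1.0], [1.0, 3.0, -1.0], [-1.0, -1.0, 5.0]])
V = np.column_stack([[-1, 1, 0], [1, 1, 1], [1, 1, -2]]).astype(float)
Q1 = V / np.linalg.norm(V, axis=0)          # normalize each column
print(np.round(Q1.T @ B @ Q1, 10))          # diag(2, 3, 6)

# Second example: repeated eigenvalue 2, eigenspace basis needs Gram-Schmidt.
C = np.ones((4, 4)) + 2.0 * np.eye(4)
Ws = gram_schmidt([[-1, 1, 0, 0], [-1, 0, 1, 0], [-1, 0, 0, 1]])
cols = [w / np.linalg.norm(w) for w in Ws] + [np.full(4, 0.5)]  # last: r = 6
Q2 = np.column_stack(cols)
print(np.allclose(Q2.T @ Q2, np.eye(4)))    # Q2 is orthogonal
print(np.round(Q2.T @ C @ Q2, 10))          # diag(2, 2, 2, 6)
```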
Definiteness of Quadratic Forms
Theorem Let 𝐴 be a symmetric matrix. Then
 𝐴 is positive definite if and only if all the eigenvalues of 𝐴 are > 0.
 𝐴 is negative definite if and only if all the eigenvalues of 𝐴 are < 0.
 𝐴 is positive semidefinite if and only if all the eigenvalues of 𝐴 are ≥ 0.
 𝐴 is negative semidefinite if and only if all the eigenvalues of 𝐴 are ≤ 0.
 𝐴 is indefinite if and only if 𝐴 has a positive eigenvalue and a negative
eigenvalue.
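In practice this eigenvalue criterion is the easiest test to automate. A sketch using np.linalg.eigvalsh, which assumes its argument is symmetric, with a small tolerance (ours) for the semidefinite cases:

```python
import numpy as np

def definiteness(A, tol=1e-12):
    """Classify a symmetric matrix by the signs of its eigenvalues."""
    r = np.linalg.eigvalsh(A)    # eigenvalues of a symmetric matrix, ascending
    if np.all(r > tol):
        return "positive definite"
    if np.all(r < -tol):
        return "negative definite"
    if np.all(r >= -tol):
        return "positive semidefinite"
    if np.all(r <= tol):
        return "negative semidefinite"
    return "indefinite"

print(definiteness(np.array([[2.0, 3.0], [3.0, 7.0]])))   # positive definite
print(definiteness(np.array([[1.0, 1.0], [1.0, 1.0]])))   # positive semidefinite
print(definiteness(np.array([[1.0, 0.0], [0.0, -1.0]])))  # indefinite
```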
