Quadratic Forms and Definite Matrices
A quadratic form on ℝⁿ is a real-valued function
$$Q(x_1, \dots, x_n) = \sum_{i \le j} a_{ij} x_i x_j$$
If 𝑎11 = 1, 𝑎22 = −1, 𝑎12 = 0, then 𝑄(𝑥1, 𝑥2) = 𝑥1² − 𝑥2² takes on both positive and negative values (𝑄(1,0) = 1, 𝑄(0,1) = −1). Such a form is called indefinite.
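A quick numerical illustration of this example (a minimal NumPy sketch; the helper Q simply evaluates 𝑥ᵀ𝐴𝑥 for the symmetric matrix of the form):

```python
import numpy as np

# Matrix of the indefinite form Q(x1, x2) = x1^2 - x2^2
# (a11 = 1, a22 = -1, a12 = 0).
A = np.array([[1.0, 0.0],
              [0.0, -1.0]])

def Q(x):
    """Evaluate the quadratic form x^T A x."""
    x = np.asarray(x, dtype=float)
    return x @ A @ x

print(Q([1, 0]))  #  1.0 -> Q takes a positive value
print(Q([0, 1]))  # -1.0 -> Q takes a negative value, so Q is indefinite
```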
The concept of definiteness can be generalized to quadratic forms on ℝⁿ. By the definition of definiteness, determining the definiteness of a quadratic form is equivalent to determining whether 𝒙 = 𝟎 is a maximizer, a minimizer, or neither for the real-valued function 𝑄: 𝑄 achieves its unique global minimum at 𝒙 = 𝟎 if and only if 𝑄 is positive definite, and 𝑄 achieves its unique global maximum at 𝒙 = 𝟎 if and only if 𝑄 is negative definite.
The determinant of a 𝑘 × 𝑘 principal submatrix is called a 𝑘th order principal minor of 𝐴.
Example For a general 3 × 3 matrix
$$A = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}$$
There is one third order principal minor: det(𝐴).
There are three second order principal minors:
➢ $\begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix}$, formed by deleting column 3 and row 3.
➢ $\begin{vmatrix} a_{11} & a_{13} \\ a_{31} & a_{33} \end{vmatrix}$, formed by deleting column 2 and row 2.
➢ $\begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix}$, formed by deleting column 1 and row 1.
Example For a general 3 × 3 matrix, the three leading principal minors are
$$|A_1| = |a_{11}|, \quad |A_2| = \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix}, \quad |A_3| = \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}$$
The leading principal minors can be used to determine the definiteness of a given matrix.
Theorem Let 𝐴 be an 𝑛 × 𝑛 symmetric matrix. Then
(a) 𝐴 is positive definite if and only if all its 𝑛 leading principal minors are strictly
positive.
(b) 𝐴 is negative definite if and only if its 𝑛 leading principal minors alternate in
sign as follows:
$$|A_1| < 0, \quad |A_2| > 0, \quad |A_3| < 0, \quad \cdots$$
The 𝑘th leading principal minor should have the same sign as $(-1)^k$.
(c) If some 𝑘th leading principal minor of 𝐴 (or some pair of them) is nonzero but does not fit either of the above two sign patterns, then 𝐴 is indefinite. This case occurs when some even-order leading principal minor is negative, or when one odd-order leading principal minor is negative while another odd-order leading principal minor is positive.
Remark The leading principal minor test fails when some of the leading principal
minors are zero and all the other leading principal minors fit sign pattern (a) or (b).
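The theorem translates directly into a computational test. Below is a minimal Python/NumPy sketch, assuming determinants computed with np.linalg.det; the function name and the tolerance are our own choices, and the "inconclusive" branch corresponds to the Remark:

```python
import numpy as np

def classify_by_leading_minors(A, tol=1e-12):
    """Classify a symmetric matrix by the signs of its leading principal minors."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    minors = [np.linalg.det(A[:k, :k]) for k in range(1, n + 1)]
    if any(abs(m) < tol for m in minors):
        return "inconclusive"              # Remark: some leading minors are zero
    if all(m > 0 for m in minors):
        return "positive definite"         # pattern (a)
    if all((m < 0) == (k % 2 == 1) for k, m in enumerate(minors, start=1)):
        return "negative definite"         # pattern (b): kth minor has sign (-1)^k
    return "indefinite"                    # case (c)

print(classify_by_leading_minors([[2, 1], [1, 2]]))    # positive definite
print(classify_by_leading_minors([[-2, 1], [1, -2]]))  # negative definite
print(classify_by_leading_minors([[1, 0], [0, -1]]))   # indefinite
```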
Definiteness of Diagonal Matrices
Let 𝐴 be a diagonal matrix. Then
$$(x_1, x_2, \dots, x_n) \begin{pmatrix} a_{11} & 0 & \cdots & 0 \\ 0 & a_{22} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & a_{nn} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} = a_{11}x_1^2 + a_{22}x_2^2 + \cdots + a_{nn}x_n^2$$
If 𝑎11 > 0, 𝑎22 > 0, …, 𝑎𝑛𝑛 > 0, the quadratic form is positive definite, and every leading principal minor is positive.
If 𝑎11 < 0, 𝑎22 < 0, …, 𝑎𝑛𝑛 < 0, the quadratic form is negative definite. A leading principal minor of even order is the product of an even number of negative values, which is positive; a leading principal minor of odd order is the product of an odd number of negative values, which is negative. Therefore, the signs of the leading principal minors alternate.
If there exist 𝑖, 𝑗 such that 𝑎𝑖𝑖 > 0 and 𝑎𝑗𝑗 < 0, then the quadratic form is indefinite. For example, if 𝑎11 = 0, 𝑎22 > 0, 𝑎33 < 0, the quadratic form is indefinite, while all the leading principal minors are zero. Checking the signs of the leading principal minors alone is therefore not sufficient to determine the definiteness of the matrix.
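The counterexample is easy to verify numerically (a small sketch with 𝑎11 = 0, 𝑎22 = 1, 𝑎33 = −1):

```python
import numpy as np

# Diagonal matrix with a11 = 0, a22 > 0, a33 < 0.
D = np.diag([0.0, 1.0, -1.0])

# All three leading principal minors are zero...
print([np.linalg.det(D[:k, :k]) for k in (1, 2, 3)])  # [0.0, 0.0, 0.0]

# ...yet the form takes both signs, so D is indefinite.
e2 = np.array([0.0, 1.0, 0.0])
e3 = np.array([0.0, 0.0, 1.0])
print(e2 @ D @ e2)  #  1.0
print(e3 @ D @ e3)  # -1.0
```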
Linear Constraints and Bordered Matrices
Consider the quadratic form
$$Q(x_1, x_2) = ax_1^2 + 2bx_1x_2 + cx_2^2 = (x_1 \;\; x_2)\begin{pmatrix} a & b \\ b & c \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}$$
on the general linear subspace
$$Ax_1 + Bx_2 = 0$$
where 𝐴, 𝐵 are nonzero.
From the linear constraint, $x_1 = -\frac{B}{A}x_2$. Substituting this expression for 𝑥1 into 𝑄 gives
$$Q\left(-\frac{B}{A}x_2,\; x_2\right) = \frac{aB^2 - 2bAB + cA^2}{A^2}\, x_2^2$$
𝑄 is positive definite on the constraint if and only if $aB^2 - 2bAB + cA^2 > 0$, and 𝑄 is negative definite on the constraint if and only if $aB^2 - 2bAB + cA^2 < 0$. There is a convenient way to write this expression:
$$aB^2 - 2bAB + cA^2 = -\det\begin{pmatrix} 0 & A & B \\ A & a & b \\ B & b & c \end{pmatrix}$$
Theorem The quadratic form $Q(x_1, x_2) = ax_1^2 + 2bx_1x_2 + cx_2^2$ is positive (negative) definite on the constraint set $Ax_1 + Bx_2 = 0$ if and only if
$$\det\begin{pmatrix} 0 & A & B \\ A & a & b \\ B & b & c \end{pmatrix}$$
is negative (positive).
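A small sanity check of the theorem on a concrete instance of our own choosing, 𝑄(𝑥1, 𝑥2) = 𝑥1² + 𝑥2² restricted to 𝑥1 + 𝑥2 = 0:

```python
import numpy as np

a, b, c = 1.0, 0.0, 1.0   # Q(x1, x2) = x1^2 + x2^2
A, B = 1.0, 1.0           # constraint x1 + x2 = 0

bordered = np.array([[0, A, B],
                     [A, a, b],
                     [B, b, c]])

# Negative determinant -> positive definite on the constraint set.
print(np.linalg.det(bordered))        # -2.0

# Cross-check against the substitution formula aB^2 - 2bAB + cA^2.
print(a*B**2 - 2*b*A*B + c*A**2)      # 2.0 > 0, consistent
```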
Border the matrix of the quadratic form on the top and on the left by the matrix of the linear constraints:
$$H = \begin{pmatrix} 0 & \cdots & 0 & B_{11} & \cdots & B_{1n} \\ \vdots & \ddots & \vdots & \vdots & \ddots & \vdots \\ 0 & \cdots & 0 & B_{m1} & \cdots & B_{mn} \\ B_{11} & \cdots & B_{m1} & a_{11} & \cdots & a_{1n} \\ \vdots & \ddots & \vdots & \vdots & \ddots & \vdots \\ B_{1n} & \cdots & B_{mn} & a_{1n} & \cdots & a_{nn} \end{pmatrix}$$
Here the upper-right block is the 𝑚 × 𝑛 constraint matrix 𝐵, the lower-left block is its transpose, and the lower-right block is the 𝑛 × 𝑛 symmetric matrix of the quadratic form.
We assume the rank of the constraint matrix is 𝑚. In practice, we would first perform row operations on the constraint matrix to find its rank, and then delete the equations that are linear combinations of the others. This finally yields a set of effective linear constraints.
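Assembling the bordered matrix is mechanical; here is a sketch using np.block (the constraint matrix Bc and form matrix A below are illustrative choices, not data from the text; Bc is named to avoid clashing with the scalar 𝐵 used earlier):

```python
import numpy as np

Bc = np.array([[1.0, 2.0, 3.0]])      # m = 1 constraint, n = 3 variables
A = np.array([[2.0, 0.0, 0.0],
              [0.0, 3.0, 0.0],
              [0.0, 0.0, 4.0]])       # symmetric matrix of the quadratic form

m = Bc.shape[0]
H = np.block([[np.zeros((m, m)), Bc],
              [Bc.T,             A]])
print(H)   # 4 x 4 bordered matrix
```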
To check definiteness on the constraint set, we compute the determinants of the largest 𝑛 − 𝑚 = 2 leading principal submatrices, 𝐻6 and 𝐻5:
$$H_5 = \begin{pmatrix} 0 & 0 & 0 & 1 & 1 \\ 0 & 0 & 1 & -9 & 0 \\ 0 & 1 & 1 & 0 & 0 \\ 1 & -9 & 0 & -1 & 2 \\ 1 & 0 & 0 & 2 & 1 \end{pmatrix}$$
Since 𝑚 = 2 and $(-1)^2 = 1$, we need $\det H_6 > 0$ and $\det H_5 > 0$ to verify positive definiteness. Since 𝑛 = 4 and $(-1)^4 = 1$, we need $\det H_6 > 0$ and $\det H_5 < 0$ to verify negative definiteness. In fact, $\det H_6 = 24 > 0$ and $\det H_5 = 77 > 0$. So 𝑄 is positive definite on the constraint set 𝐵𝒙 = 𝟎 and achieves its global minimum at 𝒙 = 𝟎.
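Since 𝐻5 is written out in full above, its determinant can be checked directly (𝐻6 is not shown above, so only det 𝐻5 is verified here):

```python
import numpy as np

H5 = np.array([[0,  0, 0,  1, 1],
               [0,  0, 1, -9, 0],
               [0,  1, 1,  0, 0],
               [1, -9, 0, -1, 2],
               [1,  0, 0,  2, 1]], dtype=float)

print(round(np.linalg.det(H5)))  # 77
```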
Theorem Let 𝐴 be a 𝑘 × 𝑘 square matrix with eigenvalues 𝑟1, …, 𝑟𝑘. Then
$$r_1 + r_2 + \cdots + r_k = \operatorname{tr}(A)$$
$$r_1 \cdot r_2 \cdots r_k = \det(A)$$
Proof: Consider the case of 2 × 2 matrices, and assume
$$A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}$$
Then
$$\det(A - rI) = r^2 - (a_{11} + a_{22})r + (a_{11}a_{22} - a_{12}a_{21}) = \beta(r - r_1)(r - r_2) = \beta r^2 - \beta(r_1 + r_2)r + \beta r_1 r_2$$
Since this holds for all 𝑟, comparing coefficients gives 𝛽 = 1, and therefore
$$r_1 + r_2 = \operatorname{tr}(A), \qquad r_1 r_2 = \det(A)$$
The theorem naturally extends to higher dimensional cases.
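A quick numerical check of the theorem on an arbitrary 2 × 2 matrix of our own choosing:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])

r = np.linalg.eigvals(A)
print(r.sum(), np.trace(A))          # both 5.0
print(r.prod(), np.linalg.det(A))    # both -2.0 (up to rounding)
```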
Distinct Eigenvalues
Example
Let's compute the eigenvalues and eigenvectors of the 3 × 3 matrix
$$B = \begin{pmatrix} 1 & 0 & 2 \\ 0 & 5 & 0 \\ 3 & 0 & 2 \end{pmatrix}$$
Its characteristic equation is
$$\det\begin{pmatrix} 1-r & 0 & 2 \\ 0 & 5-r & 0 \\ 3 & 0 & 2-r \end{pmatrix} = (5-r)(r-4)(r+1) = 0$$
Therefore, the eigenvalues of 𝐵 are 𝑟 = 5, 4, −1. To compute an eigenvector corresponding to 𝑟 = 5, we compute the nullspace of (𝐵 − 5𝐼); that is, we solve the system
$$(B - 5I)V = \begin{pmatrix} -4 & 0 & 2 \\ 0 & 0 & 0 \\ 3 & 0 & -3 \end{pmatrix}\begin{pmatrix} v_1 \\ v_2 \\ v_3 \end{pmatrix} = \begin{pmatrix} -4v_1 + 2v_3 \\ 0 \\ 3v_1 - 3v_3 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}$$
whose solution is 𝑣1 = 𝑣3 = 0, with 𝑣2 arbitrary. So we'll take
$$V_1 = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}$$
as an eigenvector for 𝑟 = 5.
To find an eigenvector for 𝑟 = 4, solve
$$(B - 4I)V = \begin{pmatrix} -3 & 0 & 2 \\ 0 & 1 & 0 \\ 3 & 0 & -2 \end{pmatrix}\begin{pmatrix} v_1 \\ v_2 \\ v_3 \end{pmatrix} = \begin{pmatrix} -3v_1 + 2v_3 \\ v_2 \\ 3v_1 - 2v_3 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}$$
A simple eigenvector for 𝑟 = 4 is $\begin{pmatrix} 2 \\ 0 \\ 3 \end{pmatrix}$. The same method yields the eigenvector $\begin{pmatrix} 1 \\ 0 \\ -1 \end{pmatrix}$ for the eigenvalue 𝑟 = −1.
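These hand computations can be confirmed with NumPy. Note that np.linalg.eig returns unit-length eigenvectors, so each column is a scalar multiple of the eigenvectors found above:

```python
import numpy as np

B = np.array([[1.0, 0.0, 2.0],
              [0.0, 5.0, 0.0],
              [3.0, 0.0, 2.0]])

vals, vecs = np.linalg.eig(B)
print(vals)  # 5, 4, -1 in some order

# Each column of vecs satisfies B v = r v.
for r, v in zip(vals, vecs.T):
    print(r, np.allclose(B @ v, r * v))  # True for every pair
```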
Repeated Eigenvalues
Example
Consider the matrix
$$A = \begin{pmatrix} 4 & 1 \\ -1 & 2 \end{pmatrix}$$
whose eigenvalues are 𝑟 = 3, 3. It has only one independent eigenvector,
$$V_1 = \begin{pmatrix} 1 \\ -1 \end{pmatrix}$$
Definition A matrix 𝐴 which has an eigenvalue of multiplicity 𝑚 > 1 but does not have 𝑚 independent eigenvectors corresponding to this eigenvalue is called a nondiagonalizable matrix, or sometimes a defective matrix.
Example
Consider the matrix again:
$$A = \begin{pmatrix} 4 & 1 \\ -1 & 2 \end{pmatrix}$$
Its generalized eigenvector will be a solution $V_2$ of
$$(A - 3I)V_2 = V_1 \quad \text{or} \quad \begin{pmatrix} 1 & 1 \\ -1 & -1 \end{pmatrix}\begin{pmatrix} v_{21} \\ v_{22} \end{pmatrix} = \begin{pmatrix} 1 \\ -1 \end{pmatrix}$$
Take 𝑣21 = 1, 𝑣22 = 0, for example. Then form
$$P = [V_1, V_2] = \begin{pmatrix} 1 & 1 \\ -1 & 0 \end{pmatrix}$$
and check that
$$P^{-1}AP = \begin{pmatrix} 0 & -1 \\ 1 & 1 \end{pmatrix}\begin{pmatrix} 4 & 1 \\ -1 & 2 \end{pmatrix}\begin{pmatrix} 1 & 1 \\ -1 & 0 \end{pmatrix} = \begin{pmatrix} 3 & 1 \\ 0 & 3 \end{pmatrix}$$
We take
$$V_3 = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}$$
Let
$$P = [V_1, V_2, V_3] = \begin{pmatrix} 1 & 2 & 0 \\ 1 & 1 & 1 \\ 1 & 1 & 0 \end{pmatrix}$$
Then
$$P^{-1}AP = \begin{pmatrix} 2 & 0 & 0 \\ 0 & 3 & 1 \\ 0 & 0 & 3 \end{pmatrix}$$
Almost diagonal matrices like
$$\begin{pmatrix} r^* & 1 \\ 0 & r^* \end{pmatrix} \quad \text{or} \quad \begin{pmatrix} r^* & 1 & 0 \\ 0 & r^* & 1 \\ 0 & 0 & r^* \end{pmatrix}$$
are called the Jordan canonical form of the original matrix 𝐴.
Symmetric Matrices
Theorem Let 𝐴 be a 𝑘 × 𝑘 symmetric matrix. Then all 𝑘 roots of the characteristic equation det(𝐴 − 𝑟𝐼) = 0 are real numbers, and eigenvectors corresponding to distinct eigenvalues are orthogonal. Even if 𝐴 has multiple eigenvalues, there is a nonsingular matrix 𝑃 whose columns 𝑊1, …, 𝑊𝑘 are eigenvectors of 𝐴 such that
➢ 𝑊1, …, 𝑊𝑘 are mutually orthogonal
➢ $P^{-1} = P^T$
➢ $P^{-1}AP = P^T AP = \begin{pmatrix} r_1 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & r_k \end{pmatrix}$
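A short NumPy illustration of this theorem: np.linalg.eigh, which is designed for symmetric matrices, returns real eigenvalues and exactly such an orthogonal matrix 𝑃 (the 3 × 3 matrix below is an illustrative choice of our own):

```python
import numpy as np

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 2.0, 1.0],
              [0.0, 1.0, 2.0]])

r, P = np.linalg.eigh(A)
print(np.allclose(P.T @ P, np.eye(3)))       # True: P^{-1} = P^T
print(np.allclose(P.T @ A @ P, np.diag(r)))  # True: P^T A P is diagonal
```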
For example, consider a symmetric matrix 𝐵 whose eigenvalues are 2, 3, and 6, with corresponding eigenvectors
$$V_1 = \begin{pmatrix} -1 \\ 1 \\ 0 \end{pmatrix}, \quad V_2 = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}, \quad V_3 = \begin{pmatrix} 1 \\ 1 \\ -2 \end{pmatrix}$$
These vectors are perpendicular to each other. Divide each eigenvector by its length to generate a set of normalized eigenvectors
$$U_1 = \frac{1}{\sqrt{2}}\begin{pmatrix} -1 \\ 1 \\ 0 \end{pmatrix}, \quad U_2 = \frac{1}{\sqrt{3}}\begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}, \quad U_3 = \frac{1}{\sqrt{6}}\begin{pmatrix} 1 \\ 1 \\ -2 \end{pmatrix}$$
and make these three orthonormal vectors (vectors which are orthogonal and have length 1) the columns of the orthogonal matrix
$$Q = \begin{pmatrix} -1/\sqrt{2} & 1/\sqrt{3} & 1/\sqrt{6} \\ 1/\sqrt{2} & 1/\sqrt{3} & 1/\sqrt{6} \\ 0 & 1/\sqrt{3} & -2/\sqrt{6} \end{pmatrix}$$
Then $Q^{-1} = Q^T$ and
$$Q^T B Q = \begin{pmatrix} 2 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 6 \end{pmatrix}$$
Let's diagonalize a symmetric matrix with nondistinct eigenvalues. Consider the 4 × 4 symmetric matrix
$$C = \begin{pmatrix} 3 & 1 & 1 & 1 \\ 1 & 3 & 1 & 1 \\ 1 & 1 & 3 & 1 \\ 1 & 1 & 1 & 3 \end{pmatrix}$$
The eigenvalues of 𝐶 are, by inspection, 2, 2, 2, and 6. The set of eigenvectors for 2, the eigenspace of the eigenvalue 2, is the three-dimensional nullspace of
$$C - 2I = \begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \end{pmatrix}$$
that is, the space $\{(u_1, u_2, u_3, u_4) : u_1 + u_2 + u_3 + u_4 = 0\}$. Three independent vectors in this eigenspace are
$$V_1 = \begin{pmatrix} -1 \\ 1 \\ 0 \\ 0 \end{pmatrix}, \quad V_2 = \begin{pmatrix} -1 \\ 0 \\ 1 \\ 0 \end{pmatrix}, \quad V_3 = \begin{pmatrix} -1 \\ 0 \\ 0 \\ 1 \end{pmatrix}$$
In order to construct an orthogonal matrix 𝑃 so that the product 𝑃ᵀ𝐶𝑃 = 𝑃⁻¹𝐶𝑃 is diagonal, we need to find three orthogonal vectors 𝑊1, 𝑊2, 𝑊3 which span the same subspace as the independent vectors 𝑉1, 𝑉2, 𝑉3. The following procedure, called the Gram-Schmidt Orthogonalization Process, will accomplish this task. Let 𝑊1 = 𝑉1. Define
$$W_2 = V_2 - \frac{W_1 \cdot V_2}{W_1 \cdot W_1} W_1$$
$$W_3 = V_3 - \frac{W_1 \cdot V_3}{W_1 \cdot W_1} W_1 - \frac{W_2 \cdot V_3}{W_2 \cdot W_2} W_2$$
The 𝑊𝑖's so constructed are mutually orthogonal, and by construction 𝑊1, 𝑊2, 𝑊3 span the same space as 𝑉1, 𝑉2, 𝑉3. Applying this process to the eigenvectors 𝑉1, 𝑉2, 𝑉3 yields the orthogonal vectors
$$W_1 = \begin{pmatrix} -1 \\ 1 \\ 0 \\ 0 \end{pmatrix}, \quad W_2 = \begin{pmatrix} -1/2 \\ -1/2 \\ 1 \\ 0 \end{pmatrix}, \quad W_3 = \begin{pmatrix} -1/3 \\ -1/3 \\ -1/3 \\ 1 \end{pmatrix}$$
Finally, normalize these three vectors and make them the first three columns of an orthogonal matrix whose fourth column is the normalized eigenvector for 𝑟 = 6:
$$P = \begin{pmatrix} -1/\sqrt{2} & -1/\sqrt{6} & -1/\sqrt{12} & 1/2 \\ 1/\sqrt{2} & -1/\sqrt{6} & -1/\sqrt{12} & 1/2 \\ 0 & 2/\sqrt{6} & -1/\sqrt{12} & 1/2 \\ 0 & 0 & 3/\sqrt{12} & 1/2 \end{pmatrix}$$
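The whole construction can be verified numerically. The sketch below runs the Gram-Schmidt process on 𝑉1, 𝑉2, 𝑉3, appends the normalized eigenvector (1, 1, 1, 1)/2 for 𝑟 = 6 (one can check directly that 𝐶(1,1,1,1)ᵀ = 6(1,1,1,1)ᵀ), and confirms that 𝑃ᵀ𝐶𝑃 is diagonal:

```python
import numpy as np

C = np.ones((4, 4)) + 2 * np.eye(4)   # the 4 x 4 matrix C above

V = [np.array([-1.0, 1.0, 0.0, 0.0]),
     np.array([-1.0, 0.0, 1.0, 0.0]),
     np.array([-1.0, 0.0, 0.0, 1.0])]

# Classic Gram-Schmidt, exactly as in the formulas above.
W = []
for v in V:
    w = v - sum((u @ v) / (u @ u) * u for u in W)
    W.append(w)

cols = [w / np.linalg.norm(w) for w in W]        # normalized W1, W2, W3
cols.append(np.array([1.0, 1.0, 1.0, 1.0]) / 2)  # normalized eigenvector for r = 6
P = np.column_stack(cols)

print(np.allclose(P.T @ C @ P, np.diag([2, 2, 2, 6])))  # True
```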