Notes on the Symmetric QR Algorithm
November 4, 2014
The QR algorithm is a standard method for computing all eigenvalues and eigenvectors of a matrix. In this note, we focus on the real-valued symmetric eigenvalue problem (the case where A ∈ R^{n×n} is symmetric). For this case, recall the Spectral Decomposition Theorem:

Theorem 1. If A ∈ R^{n×n} is symmetric, then there exist a unitary matrix Q and a diagonal matrix Λ such that A = Q Λ Q^T.

We will partition Q = ( q_0 | · · · | q_{n−1} ) and assume that Λ = diag(λ_0, . . . , λ_{n−1}), so that throughout this note q_i and λ_i refer to the ith column of Q and the ith diagonal element of Λ, which means that each tuple (λ_i, q_i) is an eigenpair.
1 Subspace Iteration
We start with a matrix V ∈ R^{n×r} with normalized columns and iterate something like

V^(0) = V
for k = 0, . . . until convergence
  V^(k+1) = A V^(k)
  Normalize the columns to be of unit length.
end for
The problem with this approach is that all columns will (likely) converge to an eigenvector associated with
the dominant eigenvalue, since the Power Method is being applied to all columns simultaneously. We will
now lead the reader through a succession of insights towards a practical algorithm.
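To see the problem concretely, the following minimal NumPy sketch (the test matrix, the number of columns, and the iteration count are made up for illustration) runs the naive iteration; after enough iterations all columns are essentially parallel to the eigenvector associated with the dominant eigenvalue.

import numpy as np

def naive_subspace_iteration(A, V, num_iters=50):
    """Repeatedly apply A to the columns of V, normalizing each column.

    Each column undergoes an independent Power Method iteration, so all
    columns drift toward the eigenvector associated with the dominant
    eigenvalue of A."""
    V = V.copy()
    for _ in range(num_iters):
        V = A @ V
        V = V / np.linalg.norm(V, axis=0)      # normalize each column to unit length
    return V

rng = np.random.default_rng(0)
B = rng.standard_normal((5, 5))
A = (B + B.T) / 2                              # a random symmetric test matrix
V = naive_subspace_iteration(A, rng.standard_normal((5, 3)))
print(np.abs(V.T @ V))                         # pairwise inner products are close to one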
Let us examine what V̂ = AV looks like, for the simple case where V = ( v_0 | v_1 | v_2 ) (three columns). We know that

\[ v_j = Q \underbrace{Q^T v_j}_{y_j} . \]

Hence

\[ v_0 = \sum_{j=0}^{n-1} \gamma_{0,j}\, q_j , \qquad v_1 = \sum_{j=0}^{n-1} \gamma_{1,j}\, q_j , \qquad \text{and} \qquad v_2 = \sum_{j=0}^{n-1} \gamma_{2,j}\, q_j , \]

where γ_{i,j} = q_j^T v_i denotes the jth element of y_i. If we happened to know λ_0, λ_1, and λ_2, then we could divide the columns of V̂ = AV by these, respectively, and get new vectors

\[ \left( \hat v_0 \;\middle|\; \hat v_1 \;\middle|\; \hat v_2 \right) = \left( \sum_{j=0}^{n-1} \gamma_{0,j} \tfrac{\lambda_j}{\lambda_0} q_j \;\middle|\; \sum_{j=0}^{n-1} \gamma_{1,j} \tfrac{\lambda_j}{\lambda_1} q_j \;\middle|\; \sum_{j=0}^{n-1} \gamma_{2,j} \tfrac{\lambda_j}{\lambda_2} q_j \right) \]
\[ = \left( \gamma_{0,0} q_0 + \sum_{j=1}^{n-1} \gamma_{0,j} \tfrac{\lambda_j}{\lambda_0} q_j \;\middle|\; \gamma_{1,0} \tfrac{\lambda_0}{\lambda_1} q_0 + \gamma_{1,1} q_1 + \sum_{j=2}^{n-1} \gamma_{1,j} \tfrac{\lambda_j}{\lambda_1} q_j \;\middle|\; \gamma_{2,0} \tfrac{\lambda_0}{\lambda_2} q_0 + \gamma_{2,1} \tfrac{\lambda_1}{\lambda_2} q_1 + \gamma_{2,2} q_2 + \sum_{j=3}^{n-1} \gamma_{2,j} \tfrac{\lambda_j}{\lambda_2} q_j \right) . \quad (1) \]
Assume that |λ_0| > |λ_1| > |λ_2| > |λ_3| ≥ · · · ≥ |λ_{n−1}|. Then, as for the power method,

• The first column will see components in the direction of {q_1, . . . , q_{n−1}} shrink relative to the component in the direction of q_0.

• The second column will see components in the direction of {q_2, . . . , q_{n−1}} shrink relative to the component in the direction of q_1, but the component in the direction of q_0 increases, relatively, since |λ_0/λ_1| > 1.

• The third column will see components in the direction of {q_3, . . . , q_{n−1}} shrink relative to the component in the direction of q_2, but the components in the directions of q_0 and q_1 increase, relatively, since |λ_0/λ_2| > 1 and |λ_1/λ_2| > 1.
How can we make it so that vj converges to a vector in the direction of qj ?
If we happen to know q_0, then we can subtract out the component of

\[ \hat v_1 = \gamma_{1,0} \tfrac{\lambda_0}{\lambda_1} q_0 + \gamma_{1,1} q_1 + \sum_{j=2}^{n-1} \gamma_{1,j} \tfrac{\lambda_j}{\lambda_1} q_j \]

in the direction of q_0:

\[ \hat v_1 - (q_0^T \hat v_1)\, q_0 = \gamma_{1,1} q_1 + \sum_{j=2}^{n-1} \gamma_{1,j} \tfrac{\lambda_j}{\lambda_1} q_j , \]

so that we are left with the component in the direction of q_1 and components in directions of q_2, . . . , q_{n−1} that are suppressed every time through the loop.

Similarly, if we also know q_1, the components of v̂_2 in the direction of q_0 and q_1 can be subtracted from that vector.

We do not know λ_0, λ_1, and λ_2, but from the discussion about the Power Method we remember that we can just normalize the so updated v̂_0, v̂_1, and v̂_2 to have unit length.
How can we make these insights practical?
We do not know q_0, q_1, and q_2, but we can informally argue that if we keep iterating,

• The vector v̂_0, normalized in each step, will eventually point in the direction of q_0.

• Span(v̂_0, v̂_1) will eventually equal Span(q_0, q_1). In each iteration, we can subtract the component of v̂_1 in the direction of v̂_0 from v̂_1, and then normalize v̂_1, so that the result eventually points in the direction of q_1.

• Span(v̂_0, v̂_1, v̂_2) will eventually equal Span(q_0, q_1, q_2). In each iteration, we can subtract the components of v̂_2 in the directions of v̂_0 and v̂_1 from v̂_2, and then normalize the result, to make v̂_2 eventually point in the direction of q_2.
What we recognize is that normalizing v̂_0, subtracting out the component of v̂_1 in the direction of v̂_0, and then normalizing v̂_1, etc., is exactly what the Gram-Schmidt process does. And thus, we can use any convenient (and stable) QR factorization method. This also shows how the method can be generalized to work with more than three columns and even all columns simultaneously.
The algorithm now becomes:

V^(0) := V
for k := 0, . . . until convergence
  A V^(k) → V^(k+1) R^(k+1)   (QR factorization)
end for
This analysis also shows that, if the components in the direction of q_0 and q_1 are subtracted out, it is the component in the direction of q_3 that is diminished in length the most slowly, dictated by the ratio |λ_3/λ_2|. This, of course, generalizes: the jth column of V^(k), v_j^(k), will have a component in the direction of q_{j+1}, of length |q_{j+1}^T v_j^(k)|, that can be expected to shrink most slowly.
We demonstrate this in Figure 1, which shows the execution of the algorithm with p = n for a 5 × 5 matrix, and shows how |q_{j+1}^T v_j^(k)| converge to zero as a function of k.
Next, we observe that if V ∈ R^{n×n} in the above iteration (which means we are iterating with n vectors at a time), then AV yields a next-to-last column that, upon scaling by 1/λ_{n−2}, takes the form

\[ \sum_{j=0}^{n-3} \gamma_{n-2,j} \tfrac{\lambda_j}{\lambda_{n-2}}\, q_j + \gamma_{n-2,n-2}\, q_{n-2} + \gamma_{n-2,n-1} \tfrac{\lambda_{n-1}}{\lambda_{n-2}}\, q_{n-1} , \]

where γ_{i,j} = q_j^T v_i. Thus, given that the components in the direction of q_j, j = 0, . . . , n−3, can be expected in later iterations to be greatly reduced by the QR factorization that subsequently happens with AV, we notice that it is |λ_{n−1}/λ_{n−2}| that dictates how fast the component in the direction of q_{n−1} disappears from v_{n−2}^(k).
Figure 1: Convergence of the subspace iteration for a 5 × 5 matrix. This graph is mislabeled: x should be labeled with v. The (linear) convergence of v_j to a vector in the direction of q_j is dictated by how quickly the component in the direction of q_{j+1} converges to zero. The line labeled |q_{j+1}^T x_j| plots the length of the component in the direction q_{j+1} as a function of the iteration number.
This is a ratio we also saw in the Inverse Power Method and that we noticed we could accelerate in the Rayleigh Quotient Iteration: at each iteration we should shift the matrix to A − μ_k I, where μ_k ≈ λ_{n−1}. Since the last column of V^(k) is supposed to be converging to q_{n−1}, it seems reasonable to use μ_k = v_{n−1}^{(k)T} A v_{n−1}^{(k)} (recall that v_{n−1}^{(k)} has unit length, so this is the Rayleigh quotient).
The above discussion motivates the iteration
V^(0) := I   (V^(0) ∈ R^{n×n}!)
for k := 0, . . . until convergence
  μ_k := v_{n−1}^{(k)T} A v_{n−1}^{(k)}   (Rayleigh quotient)
  (A − μ_k I) V^(k) → V^(k+1) R^(k+1)   (QR factorization)
end for
Notice that this does not require one to solve with (A − μ_k I), unlike in the Rayleigh Quotient Iteration. However, it does require a QR factorization, which requires more computation than the LU factorization (approximately (4/3)n³ flops).
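A minimal NumPy sketch of this shifted iteration follows (the test matrix and the iteration count are made up for illustration); the shift is the Rayleigh quotient of the last column, and the QR factorization provides the orthonormalization.

import numpy as np

def shifted_subspace_iteration(A, num_iters=40):
    """Subspace iteration with n vectors and a Rayleigh-quotient shift:
    (A - mu_k I) V^(k) -> V^(k+1) R^(k+1) (QR factorization)."""
    n = A.shape[0]
    V = np.eye(n)
    for _ in range(num_iters):
        v_last = V[:, -1]                      # unit length, since V has orthonormal columns
        mu = v_last @ A @ v_last               # Rayleigh quotient
        V, _ = np.linalg.qr((A - mu * np.eye(n)) @ V)
    return V

rng = np.random.default_rng(1)
B = rng.standard_normal((5, 5))
A = (B + B.T) / 2
V = shifted_subspace_iteration(A)
print(np.round(V.T @ A @ V, 6))                # should be close to diagonal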
We demonstrate the convergence in Figure 2, which shows the execution of the algorithm with a 5 × 5 matrix and illustrates how |q_j^T v_{n−1}^(k)| converge to zero as a function of k.
Figure 2: Convergence of the shifted subspace iteration for a 5 × 5 matrix, plotting the lengths |q_0^T x_4^(k)|, |q_1^T x_4^(k)|, |q_2^T x_4^(k)|, and |q_3^T x_4^(k)| of the components of the last column as a function of the iteration number k. This graph is mislabeled: x should be labeled with v. What this graph shows is that the components of v_4 in the directions q_0 through q_3 disappear very quickly. The vector v_4 quickly points in the direction of the eigenvector associated with the smallest (in magnitude) eigenvalue. Just like the Rayleigh-quotient iteration is not guaranteed to converge to the eigenvector associated with the smallest (in magnitude) eigenvalue, the shifted subspace iteration may home in on a different eigenvector than the one associated with the smallest (in magnitude) eigenvalue. Something is wrong in this graph: all curves should quickly drop to (near) zero!
2 The QR Algorithm
The QR algorithm is a classic algorithm for computing all eigenvalues and eigenvectors of a matrix. While
we explain it for the symmetric eigenvalue problem, it generalizes to the nonsymmetric eigenvalue problem
as well.
Subspace iteration:

  Â^(0) := A
  V̂^(0) := I
  for k := 0, . . . until convergence
    A V̂^(k) → V̂^(k+1) R̂^(k+1)   (QR factorization)
    Â^(k+1) := V̂^(k+1)T A V̂^(k+1)
  end for

QR algorithm:

  A^(0) := A
  V^(0) := I
  for k := 0, . . . until convergence
    A^(k) → Q^(k+1) R^(k+1)   (QR factorization)
    A^(k+1) := R^(k+1) Q^(k+1)
    V^(k+1) := V^(k) Q^(k+1)
  end for
We conclude that if V̂^(k) converges to the matrix of orthonormal eigenvectors when the subspace iteration is applied to V^(0) = I, then A^(k) converges to the diagonal matrix with the eigenvalues along its diagonal. Likewise, if V̂^(k) converges to the matrix of orthonormal eigenvectors when the shifted subspace iteration is applied to V^(0) = I, then A^(k) converges to the diagonal matrix with the eigenvalues along its diagonal.
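The QR algorithm above translates directly into NumPy. A minimal sketch (the test matrix and the iteration count are made up for illustration):

import numpy as np

def qr_algorithm(A, num_iters=200):
    """Unshifted QR algorithm: A^(k+1) := R^(k+1) Q^(k+1), with
    V^(k+1) := V^(k) Q^(k+1) accumulating the eigenvectors."""
    Ak = A.copy()
    V = np.eye(A.shape[0])
    for _ in range(num_iters):
        Q, R = np.linalg.qr(Ak)
        Ak = R @ Q
        V = V @ Q
    return Ak, V

rng = np.random.default_rng(2)
B = rng.standard_normal((4, 4))
A = (B + B.T) / 2
Lam, V = qr_algorithm(A)
print(np.round(Lam, 6))                        # nearly diagonal (convergence is only linear without shifts)
print(np.round(V.T @ A @ V - Lam, 6))          # close to zero, since A^(k) = V^(k)T A V^(k)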
The convergence of the basic shifted QR algorithm is illustrated below. Pay particular attention to the
convergence of the last row and column.
[Numerical example omitted: the iterates A^(0), A^(1), . . . of the shifted QR algorithm applied to a 3 × 3 symmetric matrix.]
Once the off-diagonal elements of the last row and column have converged (are sufficiently small), the problem
can be deflated by applying the following theorem:
Theorem 4. Let

\[ A = \begin{pmatrix} A_{0,0} & A_{0,1} & \cdots & A_{0,N-1} \\ 0 & A_{1,1} & \cdots & A_{1,N-1} \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & A_{N-1,N-1} \end{pmatrix} , \]

where the A_{k,k} are all square. Then Λ(A) = ∪_{k=0}^{N−1} Λ(A_{k,k}).
In other words, once the last row and column have converged, the algorithm can continue with the submatrix that consists of the first n − 1 rows and columns.
The problem with the QR algorithm, as stated, is that each iteration requires O(n³) operations, which is too expensive given that many iterations are required to find all eigenvalues and eigenvectors.
3 Reduction to Tridiagonal Form

3.1 Householder transformations (reflectors)

We observe:
• Let z be any vector that is perpendicular to u. Applying a Householder transform H(u) to z leaves the vector unchanged: H(u)z = z.

• Let any vector x be written as x = z + (u^T x) u, where z is perpendicular to u and (u^T x) u is the component of x in the direction of u. Then H(u)x = z − (u^T x) u.
This can be interpreted as follows: The space perpendicular to u acts as a mirror: any vector in that space (along the mirror) is not reflected, while any other vector has the component that is orthogonal to that space (the component outside of and orthogonal to the mirror) reversed in direction. Notice that a reflection preserves the length of the vector. Also, it is easy to verify that:

1. HH = I (reflecting the reflection of a vector results in the original vector);
2. H = H^T (a reflection is a symmetric matrix); and
3. H^T H = HH^T = I (a reflection is an orthogonal matrix).
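These properties are easy to verify numerically. The sketch below applies H(u) = I − 2uu^T/(u^T u) to a vector without ever forming H explicitly; for a unit-length u this coincides with the H(u) discussed above (the general scaling used here is an assumption).

import numpy as np

def householder_apply(u, x):
    """Apply the reflector H(u) = I - 2 u u^T / (u^T u) to x without forming H."""
    return x - 2.0 * (u @ x) / (u @ u) * u

rng = np.random.default_rng(3)
u = rng.standard_normal(6)

# A vector perpendicular to u is left unchanged.
z = rng.standard_normal(6)
z = z - (u @ z) / (u @ u) * u                  # project out the component along u
print(np.allclose(householder_apply(u, z), z))

# A general vector keeps its length, and reflecting twice gives the original vector.
x = rng.standard_normal(6)
Hx = householder_apply(u, x)
print(np.isclose(np.linalg.norm(Hx), np.linalg.norm(x)))
print(np.allclose(householder_apply(u, Hx), x))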
3.2 Algorithm
The first step towards computing the eigenvalue decomposition of a symmetric matrix is to reduce the matrix
to tridiagonal form.
The basic algorithm for reducing a symmetric matrix to tridiagonal form, overwriting the original matrix
with the result, can be explained as follows. We assume that symmetric A is stored only in the lower
triangular part of the matrix and that only the diagonal and subdiagonal of the symmetric tridiagonal
matrix are computed, overwriting those parts of A. Finally, the Householder vectors used to zero out parts of
A overwrite the entries that they annihilate (set to zero).
Partition

\[ A \rightarrow \begin{pmatrix} \alpha_{11} & a_{21}^T \\ a_{21} & A_{22} \end{pmatrix} . \]

Compute [u_{21}, τ, a_{21}] := Hous(a_{21}), the Householder transform that annihilates all but the first element of a_{21}.¹

Update

\[ \begin{pmatrix} \alpha_{11} & a_{21}^T \\ a_{21} & A_{22} \end{pmatrix} := \begin{pmatrix} 1 & 0 \\ 0 & H \end{pmatrix} \begin{pmatrix} \alpha_{11} & a_{21}^T \\ a_{21} & A_{22} \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & H \end{pmatrix} = \begin{pmatrix} \alpha_{11} & a_{21}^T H \\ H a_{21} & H A_{22} H \end{pmatrix} , \]

where H = H(u_{21}). Note that a_{21} := H a_{21} need not be executed since this update was performed by the instance of Hous above.² Also, a_{12}^T is not stored nor updated due to symmetry. Finally, only the lower triangular part of H A_{22} H is computed, overwriting A_{22}. The update of A_{22} warrants closer scrutiny:

\[ A_{22} := \left(I - \tfrac{1}{\tau} u_{21} u_{21}^T\right) A_{22} \left(I - \tfrac{1}{\tau} u_{21} u_{21}^T\right) = \left(A_{22} - \tfrac{1}{\tau} u_{21} \underbrace{u_{21}^T A_{22}}_{y_{21}^T}\right)\left(I - \tfrac{1}{\tau} u_{21} u_{21}^T\right) \]
¹ Note that the semantics here indicate that a_{21} is overwritten by H a_{21}.
² In practice, the zeros below the first element of H a_{21} are not actually written. Instead, the implementation overwrites these elements with the corresponding elements of the vector u_{21}.
Algorithm: [A] := TriRed_unb(A)

Partition A → ( A_{TL} | A_{TR} ; A_{BL} | A_{BR} ), where A_{TL} is 0 × 0
while m(A_{TL}) < m(A) do
  Repartition
    ( A_{TL} | A_{TR} ; A_{BL} | A_{BR} ) → ( A_{00} | a_{01} | A_{02} ; a_{10}^T | α_{11} | a_{12}^T ; A_{20} | a_{21} | A_{22} ),
    where α_{11} is a scalar
  [u_{21}, τ, a_{21}] := Hous(a_{21})
  y_{21} := A_{22} u_{21}
  β := u_{21}^T y_{21} / 2
  w_{21} := (y_{21} − β u_{21}/τ)/τ
  A_{22} := A_{22} − u_{21} w_{21}^T − w_{21} u_{21}^T   (symmetric rank-2 update; lower triangular part only)
  Continue with
    ( A_{TL} | A_{TR} ; A_{BL} | A_{BR} ) ← ( A_{00} | a_{01} | A_{02} ; a_{10}^T | α_{11} | a_{12}^T ; A_{20} | a_{21} | A_{22} )
endwhile
Figure 6: Illustration of reduction of a symmetric matrix to tridiagonal form (panels, left to right: original matrix, first iteration, second iteration, third iteration). The ×'s denote nonzero elements in the matrix. The gray entries above the diagonal are not actually updated.
Continuing,

\[ = A_{22} - \tfrac{1}{\tau} u_{21} y_{21}^T - \tfrac{1}{\tau} \underbrace{A_{22} u_{21}}_{y_{21}} u_{21}^T + \tfrac{1}{\tau^2} u_{21} \underbrace{y_{21}^T u_{21}}_{2\beta} u_{21}^T \]
\[ = A_{22} - \tfrac{1}{\tau} u_{21} y_{21}^T + \tfrac{\beta}{\tau^2} u_{21} u_{21}^T - \tfrac{1}{\tau} y_{21} u_{21}^T + \tfrac{\beta}{\tau^2} u_{21} u_{21}^T \]
\[ = A_{22} - u_{21} \underbrace{\left( \tfrac{1}{\tau} y_{21}^T - \tfrac{\beta}{\tau^2} u_{21}^T \right)}_{w_{21}^T} - \underbrace{\left( \tfrac{1}{\tau} y_{21} - \tfrac{\beta}{\tau^2} u_{21} \right)}_{w_{21}} u_{21}^T \]
\[ = A_{22} - u_{21} w_{21}^T - w_{21} u_{21}^T , \]

a symmetric rank-2 update.
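A dense NumPy sketch of the resulting reduction follows. It stores the full matrix (rather than only the lower triangular part), discards the Householder vectors instead of saving them, and uses one common choice of Householder vector, so it is a simplification of the algorithm above rather than a literal transcription of the Hous-based version.

import numpy as np

def tridiagonalize(A):
    """Reduce a symmetric matrix to tridiagonal form, Q^T A Q = T, using
    Householder reflectors and the symmetric rank-2 update derived above."""
    A = A.astype(float).copy()
    n = A.shape[0]
    for j in range(n - 2):
        x = A[j+1:, j].copy()
        normx = np.linalg.norm(x)
        if normx == 0.0:
            continue                           # nothing to annihilate in this column
        u = x.copy()
        u[0] += np.copysign(normx, x[0])       # u = x + sign(x0) ||x|| e0
        tau = (u @ u) / 2.0                    # H = I - u u^T / tau maps x to -sign(x0) ||x|| e0
        A22 = A[j+1:, j+1:]
        y = A22 @ u
        beta = (u @ y) / 2.0
        w = (y - (beta / tau) * u) / tau
        A[j+1:, j+1:] = A22 - np.outer(u, w) - np.outer(w, u)   # symmetric rank-2 update
        A[j+1:, j] = 0.0
        A[j+1, j] = -np.copysign(normx, x[0])  # the new subdiagonal entry
        A[j, j+1:] = A[j+1:, j]                # keep the stored matrix symmetric
    return A

rng = np.random.default_rng(4)
B = rng.standard_normal((6, 6))
A = (B + B.T) / 2
T = tridiagonalize(A)
print(np.round(T, 6))                                               # tridiagonal
print(np.allclose(np.linalg.eigvalsh(T), np.linalg.eigvalsh(A)))    # eigenvalues preserved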
5 QR Factorization of a Tridiagonal Matrix
Now, consider the 4 × 4 tridiagonal matrix

\[ \begin{pmatrix} \alpha_{0,0} & \alpha_{0,1} & 0 & 0 \\ \alpha_{1,0} & \alpha_{1,1} & \alpha_{1,2} & 0 \\ 0 & \alpha_{2,1} & \alpha_{2,2} & \alpha_{2,3} \\ 0 & 0 & \alpha_{3,2} & \alpha_{3,3} \end{pmatrix} . \]

From \(\begin{pmatrix} \alpha_{0,0} \\ \alpha_{1,0} \end{pmatrix}\) one can compute γ_{1,0} and σ_{1,0} so that

\[ \begin{pmatrix} \gamma_{1,0} & -\sigma_{1,0} \\ \sigma_{1,0} & \gamma_{1,0} \end{pmatrix}^T \begin{pmatrix} \alpha_{0,0} \\ \alpha_{1,0} \end{pmatrix} = \begin{pmatrix} \hat\alpha_{0,0} \\ 0 \end{pmatrix} . \]

Then

\[ \begin{pmatrix} \hat\alpha_{0,0} & \hat\alpha_{0,1} & \hat\alpha_{0,2} & 0 \\ 0 & \hat\alpha_{1,1} & \hat\alpha_{1,2} & 0 \\ 0 & \alpha_{2,1} & \alpha_{2,2} & \alpha_{2,3} \\ 0 & 0 & \alpha_{3,2} & \alpha_{3,3} \end{pmatrix} = \begin{pmatrix} \gamma_{1,0} & -\sigma_{1,0} & 0 & 0 \\ \sigma_{1,0} & \gamma_{1,0} & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}^T \begin{pmatrix} \alpha_{0,0} & \alpha_{0,1} & 0 & 0 \\ \alpha_{1,0} & \alpha_{1,1} & \alpha_{1,2} & 0 \\ 0 & \alpha_{2,1} & \alpha_{2,2} & \alpha_{2,3} \\ 0 & 0 & \alpha_{3,2} & \alpha_{3,3} \end{pmatrix} . \]

Next, from \(\begin{pmatrix} \hat\alpha_{1,1} \\ \alpha_{2,1} \end{pmatrix}\) one can compute γ_{2,1} and σ_{2,1} so that

\[ \begin{pmatrix} \gamma_{2,1} & -\sigma_{2,1} \\ \sigma_{2,1} & \gamma_{2,1} \end{pmatrix}^T \begin{pmatrix} \hat\alpha_{1,1} \\ \alpha_{2,1} \end{pmatrix} = \begin{pmatrix} \hat{\hat\alpha}_{1,1} \\ 0 \end{pmatrix} . \]

Then

\[ \begin{pmatrix} \hat\alpha_{0,0} & \hat\alpha_{0,1} & \hat\alpha_{0,2} & 0 \\ 0 & \hat{\hat\alpha}_{1,1} & \hat{\hat\alpha}_{1,2} & \hat{\hat\alpha}_{1,3} \\ 0 & 0 & \hat\alpha_{2,2} & \hat\alpha_{2,3} \\ 0 & 0 & \alpha_{3,2} & \alpha_{3,3} \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & \gamma_{2,1} & -\sigma_{2,1} & 0 \\ 0 & \sigma_{2,1} & \gamma_{2,1} & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}^T \begin{pmatrix} \hat\alpha_{0,0} & \hat\alpha_{0,1} & \hat\alpha_{0,2} & 0 \\ 0 & \hat\alpha_{1,1} & \hat\alpha_{1,2} & 0 \\ 0 & \alpha_{2,1} & \alpha_{2,2} & \alpha_{2,3} \\ 0 & 0 & \alpha_{3,2} & \alpha_{3,3} \end{pmatrix} . \]

Finally, from \(\begin{pmatrix} \hat\alpha_{2,2} \\ \alpha_{3,2} \end{pmatrix}\) one can compute γ_{3,2} and σ_{3,2} so that

\[ \begin{pmatrix} \gamma_{3,2} & -\sigma_{3,2} \\ \sigma_{3,2} & \gamma_{3,2} \end{pmatrix}^T \begin{pmatrix} \hat\alpha_{2,2} \\ \alpha_{3,2} \end{pmatrix} = \begin{pmatrix} \hat{\hat\alpha}_{2,2} \\ 0 \end{pmatrix} . \]

Then

\[ \begin{pmatrix} \hat\alpha_{0,0} & \hat\alpha_{0,1} & \hat\alpha_{0,2} & 0 \\ 0 & \hat{\hat\alpha}_{1,1} & \hat{\hat\alpha}_{1,2} & \hat{\hat\alpha}_{1,3} \\ 0 & 0 & \hat{\hat\alpha}_{2,2} & \hat{\hat\alpha}_{2,3} \\ 0 & 0 & 0 & \hat\alpha_{3,3} \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & \gamma_{3,2} & -\sigma_{3,2} \\ 0 & 0 & \sigma_{3,2} & \gamma_{3,2} \end{pmatrix}^T \begin{pmatrix} \hat\alpha_{0,0} & \hat\alpha_{0,1} & \hat\alpha_{0,2} & 0 \\ 0 & \hat{\hat\alpha}_{1,1} & \hat{\hat\alpha}_{1,2} & \hat{\hat\alpha}_{1,3} \\ 0 & 0 & \hat\alpha_{2,2} & \hat\alpha_{2,3} \\ 0 & 0 & \alpha_{3,2} & \alpha_{3,3} \end{pmatrix} . \]

The matrix Q is the orthogonal matrix that results from multiplying the different Givens rotations together:

\[ Q = \begin{pmatrix} \gamma_{1,0} & -\sigma_{1,0} & 0 & 0 \\ \sigma_{1,0} & \gamma_{1,0} & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & \gamma_{2,1} & -\sigma_{2,1} & 0 \\ 0 & \sigma_{2,1} & \gamma_{2,1} & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & \gamma_{3,2} & -\sigma_{3,2} \\ 0 & 0 & \sigma_{3,2} & \gamma_{3,2} \end{pmatrix} . \quad (2) \]
However, it is typically not explicitly formed.

The next question is how to compute RQ given the QR factorization of the tridiagonal matrix:

\[ RQ = \begin{pmatrix} \hat\alpha_{0,0} & \hat\alpha_{0,1} & \hat\alpha_{0,2} & 0 \\ 0 & \hat{\hat\alpha}_{1,1} & \hat{\hat\alpha}_{1,2} & \hat{\hat\alpha}_{1,3} \\ 0 & 0 & \hat{\hat\alpha}_{2,2} & \hat{\hat\alpha}_{2,3} \\ 0 & 0 & 0 & \hat\alpha_{3,3} \end{pmatrix} \begin{pmatrix} \gamma_{1,0} & -\sigma_{1,0} & 0 & 0 \\ \sigma_{1,0} & \gamma_{1,0} & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & \gamma_{2,1} & -\sigma_{2,1} & 0 \\ 0 & \sigma_{2,1} & \gamma_{2,1} & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & \gamma_{3,2} & -\sigma_{3,2} \\ 0 & 0 & \sigma_{3,2} & \gamma_{3,2} \end{pmatrix} . \]

Multiplying by the Givens rotations one at a time, from left to right, each rotation combines two adjacent columns and introduces one nonzero below the diagonal. After the first rotation only the first two columns have been updated,

\[ \begin{pmatrix} \beta_{0,0} & \beta_{0,1} & \hat\alpha_{0,2} & 0 \\ \beta_{1,0} & \beta_{1,1} & \hat{\hat\alpha}_{1,2} & \hat{\hat\alpha}_{1,3} \\ 0 & 0 & \hat{\hat\alpha}_{2,2} & \hat{\hat\alpha}_{2,3} \\ 0 & 0 & 0 & \hat\alpha_{3,3} \end{pmatrix} , \]

after the second the second and third columns,

\[ \begin{pmatrix} \beta_{0,0} & \beta_{0,1} & \beta_{0,2} & 0 \\ \beta_{1,0} & \beta_{1,1} & \beta_{1,2} & \hat{\hat\alpha}_{1,3} \\ 0 & \beta_{2,1} & \beta_{2,2} & \hat{\hat\alpha}_{2,3} \\ 0 & 0 & 0 & \hat\alpha_{3,3} \end{pmatrix} , \]

and after the last the third and fourth columns, leaving

\[ RQ = \begin{pmatrix} \beta_{0,0} & \beta_{0,1} & \beta_{0,2} & 0 \\ \beta_{1,0} & \beta_{1,1} & \beta_{1,2} & \beta_{1,3} \\ 0 & \beta_{2,1} & \beta_{2,2} & \beta_{2,3} \\ 0 & 0 & \beta_{3,2} & \beta_{3,3} \end{pmatrix} . \]
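A NumPy sketch of one such step, T → RQ, for a dense-stored tridiagonal matrix follows; it mirrors the description above (Givens rotations applied from the left to produce R, then the same rotations applied from the right, one at a time). The helper names are made up for illustration.

import numpy as np

def givens(a, b):
    """Return (gamma, sigma) such that, with G = [[gamma, -sigma], [sigma, gamma]],
    G^T maps (a, b)^T to (r, 0)^T."""
    r = np.hypot(a, b)
    return (1.0, 0.0) if r == 0.0 else (a / r, b / r)

def tridiagonal_qr_step(T):
    """One unshifted step of the QR algorithm, T -> R Q, for tridiagonal T."""
    n = T.shape[0]
    R = T.astype(float).copy()
    rotations = []
    for i in range(n - 1):
        c, s = givens(R[i, i], R[i + 1, i])
        G = np.array([[c, -s], [s, c]])
        R[i:i+2, :] = G.T @ R[i:i+2, :]        # annihilate the subdiagonal entry R[i+1, i]
        rotations.append((i, G))
    RQ = R                                     # now form R Q = R G_0 G_1 ... G_{n-2}
    for i, G in rotations:
        RQ[:, i:i+2] = RQ[:, i:i+2] @ G        # each rotation fills in one subdiagonal entry
    return RQ

T = np.array([[4., 1., 0., 0.],
              [1., 3., 2., 0.],
              [0., 2., 2., 1.],
              [0., 0., 1., 1.]])
print(np.round(tridiagonal_qr_step(T), 6))     # R Q = Q^T T Q is again symmetric tridiagonal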
Theorem 8 (Implicit Q Theorem). Let A, B ∈ R^{n×n}, where B is upper Hessenberg and has only positive elements on its first subdiagonal, and assume there exists an orthogonal matrix Q such that Q^T A Q = B. Then Q and B are uniquely determined by A and the first column of Q.
Proof: Notice that AQ = QB. Partition Q = ( q_0 | q_1 | Q_2 ) and

\[ B = \begin{pmatrix} \beta_{0,0} & \star & \star \\ \beta_{1,0} & \beta_{1,1} & \star \\ 0 & \beta_{2,1} e_0 & B_{2,2} \end{pmatrix} , \]

and focus on the first column of both sides of AQ = QB:

\[ A q_0 = \begin{pmatrix} q_0 & q_1 & Q_2 \end{pmatrix} \begin{pmatrix} \beta_{0,0} \\ \beta_{1,0} \\ 0 \end{pmatrix} = \beta_{0,0} q_0 + \beta_{1,0} q_1 . \]

By orthogonality of q_0 and q_1 we find that β_{0,0} = q_0^T A q_0 and β_{1,0} q_1 = q̂_1 = A q_0 − β_{0,0} q_0. Since β_{1,0} > 0 we deduce that q̂_1 ≠ 0. Since ‖q_1‖_2 = 1 we conclude that β_{1,0} = ‖q̂_1‖_2 and q_1 = q̂_1/β_{1,0}. The point is that β_{0,0} and β_{1,0} are prescribed, as is q_1. An inductive proof can be constructed to similarly show that the rest of the elements of B and Q are uniquely determined.
Notice the similarity between the above proof and the proof of the existence and uniqueness of the QR
factorization!
From: Gene H Golub <[email protected]>
Date: Sun, 19 Aug 2007 13:54:47 -0700 (PDT)
Subject: John Francis, Co-Inventor of QR
Dear Colleagues,
John Francis was born in 1934 in London and currently lives in Hove, near
Brighton. His residence is about a quarter mile from the sea; he is a
widower. In 1954, he worked at the National Research Development Corp
(NRDC) and attended some lectures given by Christopher Strachey.
In 1955,56 he was a student at Cambridge but did not complete a degree.
He then went back to NRDC as an assistant to Strachey where he got
involved in flutter computations and this led to his work on QR.
After leaving NRDC in 1961, he worked at the Ferranti Corp and then at the
University of Sussex. Subsequently, he had positions with various
industrial organizations and consultancies. He is now retired. His
interests were quite general and included Artificial Intelligence,
computer languages, systems engineering. He has not returned to numerical
computation.
He was surprised to learn there are many references to his work and
that the QR method is considered one of the ten most important
algorithms of the 20th century. He was unaware of such developments as
TeX and Math Lab. Currently he is working on a degree at the Open
University.
John Francis did remarkable work and we are all in his debt. Along with
the conjugate gradient method, it provided us with one of the basic tools
of numerical analysis.
Gene Golub
Figure 7: Posting by the late Gene Golub in NA Digest Sunday, August 19, 2007 Volume 07 : Issue 34. An
article on the ten most important algorithms of the 20th century, published in SIAM News, can be found at
http://www.uta.edu/faculty/rcli/TopTen/topten.pdf.
This similarity transformation again preserves eigenvalues. Finally, from \(\begin{pmatrix} \hat\beta_{2,1} \\ \hat\beta_{3,1} \end{pmatrix}\) one can compute γ_{3,1} and σ_{3,1} so that

\[ \begin{pmatrix} \gamma_{3,1} & -\sigma_{3,1} \\ \sigma_{3,1} & \gamma_{3,1} \end{pmatrix}^T \begin{pmatrix} \hat\beta_{2,1} \\ \hat\beta_{3,1} \end{pmatrix} = \begin{pmatrix} \tilde\beta_{2,1} \\ 0 \end{pmatrix} . \]

Then

\[ \begin{pmatrix} \beta_{0,0} & \beta_{1,0} & 0 & 0 \\ \beta_{1,0} & \beta_{1,1} & \tilde\beta_{2,1} & 0 \\ 0 & \tilde\beta_{2,1} & \tilde\beta_{2,2} & \tilde\beta_{3,2} \\ 0 & 0 & \tilde\beta_{3,2} & \tilde\beta_{3,3} \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & \gamma_{3,1} & -\sigma_{3,1} \\ 0 & 0 & \sigma_{3,1} & \gamma_{3,1} \end{pmatrix}^T \begin{pmatrix} \beta_{0,0} & \beta_{1,0} & 0 & 0 \\ \beta_{1,0} & \beta_{1,1} & \hat\beta_{2,1} & \hat\beta_{3,1} \\ 0 & \hat\beta_{2,1} & \hat\beta_{2,2} & \hat\beta_{3,2} \\ 0 & \hat\beta_{3,1} & \hat\beta_{3,2} & \beta_{3,3} \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & \gamma_{3,1} & -\sigma_{3,1} \\ 0 & 0 & \sigma_{3,1} & \gamma_{3,1} \end{pmatrix} . \]

The matrix Q is the orthogonal matrix that results from multiplying the different Givens rotations together:

\[ Q = \begin{pmatrix} \gamma_{1,0} & -\sigma_{1,0} & 0 & 0 \\ \sigma_{1,0} & \gamma_{1,0} & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & \gamma_{2,0} & -\sigma_{2,0} & 0 \\ 0 & \sigma_{2,0} & \gamma_{2,0} & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & \gamma_{3,1} & -\sigma_{3,1} \\ 0 & 0 & \sigma_{3,1} & \gamma_{3,1} \end{pmatrix} . \]
It is important to note that the first column of Q is given by

\[ \begin{pmatrix} \gamma_{1,0} \\ \sigma_{1,0} \\ 0 \\ 0 \end{pmatrix} , \]

which is exactly the same first column had Q been computed as in Section 5 (Equation 2). Thus, by the Implicit Q Theorem, the tridiagonal matrix that results from this approach is equal to the tridiagonal matrix that would be computed by applying the QR factorization from Section 5 to A − μI, A − μI → QR, followed by the formation of RQ + μI using the algorithm for computing RQ in Section 5.
The successive elimination of elements β̂_{i+1,i} is often referred to as chasing the bulge, while the entire process that introduces the bulge and then chases it is known as a Francis Implicit QR Step. Obviously, the
method generalizes to matrices of arbitrary size, as illustrated in Figure 8. An algorithm for the chasing of
the bulge is given in Figure 9. (Note that in those figures T is used for A, something that needs to be made
consistent in these notes, eventually.) In practice, the tridiagonal matrix is not stored as a matrix. Instead,
its diagonal and subdiagonal are stored as vectors.
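The following NumPy sketch performs one Francis implicit QR step on a dense-stored tridiagonal matrix (the full matrix is stored for clarity rather than just the two vectors, the shift is passed in, and the helper names are made up): the first rotation is determined from the first column of T − μI, and each subsequent rotation is chosen to zero the bulge created by the previous one.

import numpy as np

def givens(a, b):
    r = np.hypot(a, b)
    return (1.0, 0.0) if r == 0.0 else (a / r, b / r)

def francis_step(T, mu):
    """One implicitly shifted QR step on a symmetric tridiagonal matrix T."""
    T = T.astype(float).copy()
    n = T.shape[0]
    a, b = T[0, 0] - mu, T[1, 0]               # first column of T - mu I (two nonzero entries)
    for i in range(n - 1):
        c, s = givens(a, b)
        G = np.array([[c, -s], [s, c]])
        T[i:i+2, :] = G.T @ T[i:i+2, :]        # similarity transformation with the rotation
        T[:, i:i+2] = T[:, i:i+2] @ G          # acting on rows/columns i and i+1
        if i < n - 2:
            a, b = T[i + 1, i], T[i + 2, i]    # the bulge now sits at (i+2, i); zero it next
    return T

T = np.array([[4., 1., 0., 0.],
              [1., 3., 2., 0.],
              [0., 2., 2., 1.],
              [0., 0., 1., 1.]])
T1 = francis_step(T, mu=T[-1, -1])             # a simple choice of shift, for illustration
print(np.round(T1, 6))                         # again symmetric tridiagonal
print(np.allclose(np.linalg.eigvalsh(T1), np.linalg.eigvalsh(T)))   # eigenvalues preserved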
[Figure 8: panels labeled "Beginning of iteration" and "End of iteration" show the partitioned tridiagonal matrix with the bulge marked by +.]
Figure 8: One step of chasing the bulge in the implicitly shifted symmetric QR algorithm.
The Givens rotations must also be applied from the right to the appropriate columns of Q so that upon completion Q is overwritten with the eigenvectors of A. Notice that applying a Givens rotation to a pair of columns of Q requires O(n) computation per Givens rotation. For each Francis implicit QR step O(n) Givens rotations are computed, making the application of Givens rotations to Q of cost O(n²) per iteration of the implicitly shifted QR algorithm. Typically a few (2-3) iterations are needed per eigenvalue that is uncovered (by deflation), meaning that O(n) iterations are needed. Thus, the QR algorithm is roughly of cost O(n³) if the eigenvectors are accumulated (in addition to the cost of forming the Q from the reduction to tridiagonal form, which takes another O(n³) operations).
Algorithm: T := ChaseBulge(T)

Partition T → ( T_{TL} | ? | ? ; T_{ML} | T_{MM} | ? ; 0 | T_{BM} | T_{BR} ),
  where T_{TL} is 0 × 0 and T_{MM} is 3 × 3
while m(T_{BR}) > 0 do
  Repartition
    ( T_{TL} | ? | ? ; T_{ML} | T_{MM} | ? ; 0 | T_{BM} | T_{BR} ) →
    ( T_{00} | ? | 0 | 0 | 0 ; t_{10}^T | τ_{11} | ? | 0 | 0 ; 0 | t_{21} | T_{22} | ? | 0 ; 0 | 0 | t_{32}^T | τ_{33} | ? ; 0 | 0 | 0 | t_{43} | T_{44} ),
    where τ_{11} and τ_{33} are scalars (during the final step, τ_{33} is 0 × 0)
  Compute (γ, σ) s.t. G_{γ,σ}^T t_{21} = ( τ_{21} ; 0 ), and assign t_{21} := ( τ_{21} ; 0 )
  T_{22} := G_{γ,σ}^T T_{22} G_{γ,σ}
  t_{32}^T := t_{32}^T G_{γ,σ}   (not performed during the final step)
  Continue with
    ( T_{TL} | ? | ? ; T_{ML} | T_{MM} | ? ; 0 | T_{BM} | T_{BR} ) ←
    ( T_{00} | ? | 0 | 0 | 0 ; t_{10}^T | τ_{11} | ? | 0 | 0 ; 0 | t_{21} | T_{22} | ? | 0 ; 0 | 0 | t_{32}^T | τ_{33} | ? ; 0 | 0 | 0 | t_{43} | T_{44} )
endwhile
If an element on the subdiagonal becomes zero (or very small), and hence the corresponding element of the superdiagonal, then the problem can be deflated: if

\[ T = \begin{pmatrix} T_{00} & 0 \\ 0 & T_{11} \end{pmatrix} , \]

then

• The computation can continue separately with T_{00} and T_{11}.

• One can pick the shift from the bottom-right of T_{00} as one continues finding the eigenvalues of T_{00}, thus accelerating the computation.

• One can pick the shift from the bottom-right of T_{11} as one continues finding the eigenvalues of T_{11}, thus accelerating the computation.
• One must continue to accumulate the eigenvectors by applying the rotations to the appropriate columns of Q.

Because of the connection between the QR algorithm and the Inverse Power Method, subdiagonal entries near the bottom-right of T are more likely to converge to a zero, so most deflation will happen there.
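As a small illustration of deflation in practice, here is a sketch (the function name and tolerance are made up) that scans the subdiagonal for negligible entries and splits the tridiagonal problem, stored as its diagonal and subdiagonal, into independent blocks:

import numpy as np

def deflate(d, e, tol=1e-14):
    """Split a symmetric tridiagonal matrix with diagonal d and subdiagonal e
    into independent diagonal blocks wherever a subdiagonal entry is
    negligible relative to its neighboring diagonal entries."""
    blocks, start = [], 0
    for i, eps in enumerate(e):
        if abs(eps) <= tol * (abs(d[i]) + abs(d[i + 1])):
            blocks.append((start, i + 1))      # rows/columns start..i form one block
            start = i + 1
    blocks.append((start, len(d)))
    return blocks

d = np.array([4.0, 3.0, 2.0, 1.0])
e = np.array([1.0, 1e-16, 0.5])                # the tiny entry splits the problem
print(deflate(d, e))                           # [(0, 2), (2, 4)]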
7 Further Reading
7.1 More on reduction to tridiagonal form
The reduction to tridiagonal form can only be partially cast in terms of matrix-matrix multiplication [5]. This is a severe hindrance to high performance for that first step towards computing all eigenvalues and eigenvectors of a symmetric matrix. Worse, a considerable fraction of the total cost of the computation is in that first step.

For a detailed discussion on the blocked algorithm that uses FLAME notation, we recommend [8]:

  Field G. Van Zee, Robert A. van de Geijn, Gregorio Quintana-Ortí, G. Joseph Elizondo.
  Families of Algorithms for Reducing a Matrix to Condensed Form.
  ACM Transactions on Mathematical Software (TOMS), Vol. 39, No. 1, 2012.

(Reduction to tridiagonal form is one case of what is more generally referred to as condensed form.)
8 Other Algorithms
8.1 Jacobi's method for the symmetric eigenvalue problem
(Not to be mistaken for the Jacobi iteration for solving linear systems.)
The oldest algorithm for computing the eigenvalues and eigenvectors of a matrix is due to Jacobi and
dates back to 1846 [6]. This is a method that keeps resurfacing, since it parallelizes easily.
The idea is as follows: Given a symmetric 2 × 2 matrix

\[ A_{31} = \begin{pmatrix} \alpha_{11} & \alpha_{31} \\ \alpha_{31} & \alpha_{33} \end{pmatrix} , \]

there exists a rotation J_{31} such that

\[ J_{31}^T A_{31} J_{31} = \begin{pmatrix} \gamma_{11} & -\sigma_{31} \\ \sigma_{31} & \gamma_{33} \end{pmatrix}^T \begin{pmatrix} \alpha_{11} & \alpha_{31} \\ \alpha_{31} & \alpha_{33} \end{pmatrix} \begin{pmatrix} \gamma_{11} & -\sigma_{31} \\ \sigma_{31} & \gamma_{33} \end{pmatrix} = \begin{pmatrix} \hat\alpha_{11} & 0 \\ 0 & \hat\alpha_{33} \end{pmatrix} . \]
We know this exists since the Spectral Decomposition of the 2 × 2 matrix exists. Such a rotation is called a Jacobi rotation. (Notice that it is different from a Givens rotation because it diagonalizes a 2 × 2 matrix when used as a unitary similarity transformation. By contrast, a Givens rotation zeroes an element when applied from one side of a matrix.)
Exercise 10. In the above discussion, show that α_{11}² + 2α_{31}² + α_{33}² = α̂_{11}² + α̂_{33}².
Jacobi rotations can be used to selectively zero off-diagonal elements by observing the following:

\[ J A J^T = \begin{pmatrix} I & 0 & 0 & 0 & 0 \\ 0 & \gamma_{11} & 0 & \sigma_{31} & 0 \\ 0 & 0 & I & 0 & 0 \\ 0 & -\sigma_{31} & 0 & \gamma_{33} & 0 \\ 0 & 0 & 0 & 0 & I \end{pmatrix} \begin{pmatrix} A_{00} & a_{10} & A_{20}^T & a_{30} & A_{40}^T \\ a_{10}^T & \alpha_{11} & a_{21}^T & \alpha_{31} & a_{41}^T \\ A_{20} & a_{21} & A_{22} & a_{32} & A_{42}^T \\ a_{30}^T & \alpha_{31} & a_{32}^T & \alpha_{33} & a_{43}^T \\ A_{40} & a_{41} & A_{42} & a_{43} & A_{44} \end{pmatrix} \begin{pmatrix} I & 0 & 0 & 0 & 0 \\ 0 & \gamma_{11} & 0 & -\sigma_{31} & 0 \\ 0 & 0 & I & 0 & 0 \\ 0 & \sigma_{31} & 0 & \gamma_{33} & 0 \\ 0 & 0 & 0 & 0 & I \end{pmatrix} \]
\[ = \begin{pmatrix} A_{00} & \hat a_{10} & A_{20}^T & \hat a_{30} & A_{40}^T \\ \hat a_{10}^T & \hat\alpha_{11} & \hat a_{21}^T & 0 & \hat a_{41}^T \\ A_{20} & \hat a_{21} & A_{22} & \hat a_{32} & A_{42}^T \\ \hat a_{30}^T & 0 & \hat a_{32}^T & \hat\alpha_{33} & \hat a_{43}^T \\ A_{40} & \hat a_{41} & A_{42} & \hat a_{43} & A_{44} \end{pmatrix} = \hat A , \]

where

\[ \begin{pmatrix} \hat a_{10}^T & \hat a_{21}^T & \hat a_{41}^T \\ \hat a_{30}^T & \hat a_{32}^T & \hat a_{43}^T \end{pmatrix} = \begin{pmatrix} \gamma_{11} & \sigma_{31} \\ -\sigma_{31} & \gamma_{33} \end{pmatrix} \begin{pmatrix} a_{10}^T & a_{21}^T & a_{41}^T \\ a_{30}^T & a_{32}^T & a_{43}^T \end{pmatrix} . \]
Importantly, the similarity transformation changes only rows and columns 1 and 3 and, being orthogonal, preserves the Frobenius norm of the matrix, while (by Exercise 10) α̂_{11}² + α̂_{33}² = α_{11}² + 2α_{31}² + α_{33}². What this means is that if one defines off(A) as the square of the Frobenius norm of the off-diagonal elements of A,

\[ \text{off}(A) = \|A\|_F^2 - \|\text{diag}(A)\|_F^2 , \]

then off(Â) = off(A) − 2α_{31}².
The good news: every time a Jacobi rotation is used to zero an off-diagonal element, off(A) decreases
by twice the square of that element.
The bad news: a previously introduced zero may become nonzero in the process.
The original algorithm developed by Jacobi searched for the largest (in absolute value) off-diagonal element and zeroed it, repeating this process until all off-diagonal elements were small. The algorithm was applied by hand by one of his students, Seidel (of Gauss-Seidel fame). The problem with this is that searching for the largest off-diagonal element requires O(n²) comparisons. Computing and applying one Jacobi rotation as a similarity transformation requires O(n) flops. Thus, for large n this is not practical. Instead, it can be shown that zeroing the off-diagonal elements by columns (or rows) also converges to a diagonal matrix. This is known as the column-cyclic Jacobi algorithm. We illustrate this in Figure 10.

[Figure 10: the first two sweeps (Sweep 1 and Sweep 2) of the column-cyclic Jacobi algorithm, indicating which off-diagonal elements are zeroed in each step.]
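A NumPy sketch of the column-cyclic Jacobi algorithm follows. The formulas used to compute the 2 × 2 rotation are one standard choice, the helper names are made up, and full rotation matrices are formed for clarity rather than efficiency.

import numpy as np

def jacobi_rotation(a_pp, a_pq, a_qq):
    """Return (c, s) so that the similarity transformation with the rotation
    [[c, s], [-s, c]] (in rows/columns p and q) zeroes the (p, q) entry."""
    if a_pq == 0.0:
        return 1.0, 0.0
    tau = (a_qq - a_pp) / (2.0 * a_pq)
    t = np.sign(tau) / (abs(tau) + np.sqrt(1.0 + tau * tau)) if tau != 0.0 else 1.0
    c = 1.0 / np.sqrt(1.0 + t * t)
    return c, t * c

def jacobi_sweep(A):
    """One column-cyclic sweep: zero each off-diagonal element (p, q) once."""
    A = A.copy()
    n = A.shape[0]
    for q in range(1, n):
        for p in range(q):
            c, s = jacobi_rotation(A[p, p], A[p, q], A[q, q])
            J = np.eye(n)
            J[p, p] = J[q, q] = c
            J[p, q], J[q, p] = s, -s
            A = J.T @ A @ J                    # previously zeroed entries may become nonzero again
    return A

def off(A):
    """Square of the Frobenius norm of the off-diagonal part of A."""
    return np.linalg.norm(A, 'fro')**2 - np.linalg.norm(np.diag(A))**2

rng = np.random.default_rng(5)
B = rng.standard_normal((5, 5))
A = (B + B.T) / 2
for sweep in range(4):
    A = jacobi_sweep(A)
    print(off(A))                              # decreases toward zero with each sweep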
References
[1] I. S. Dhillon. A New O(n²) Algorithm for the Symmetric Tridiagonal Eigenvalue/Eigenvector Problem. PhD thesis, Computer Science Division, University of California, Berkeley, California, May 1997. Available as UC Berkeley Technical Report No. UCB//CSD-97-971.
[2] I. S. Dhillon. Reliable computation of the condition number of a tridiagonal matrix in O(n) time. SIAM J. Matrix Anal. Appl., 19(3):776-796, July 1998.
[3] I. S. Dhillon and B. N. Parlett. Multiple representations to compute orthogonal eigenvectors of symmetric tridiagonal matrices. Lin. Alg. Appl., 387:1-28, August 2004.
[4] Inderjit S. Dhillon, Beresford N. Parlett, and Christof Vömel. The design and implementation of the MRRR algorithm. ACM Transactions on Mathematical Software, 32(4):533-560, December 2006.
[5] Jack J. Dongarra, Sven J. Hammarling, and Danny C. Sorensen. Block reduction of matrices to condensed forms for eigenvalue computations. Journal of Computational and Applied Mathematics, 27, 1989.
[6] C. G. J. Jacobi. Über ein leichtes Verfahren, die in der Theorie der Säkularstörungen vorkommenden Gleichungen numerisch aufzulösen. Crelle's Journal, 30:51-94, 1846.
[7] Field G. Van Zee, Robert A. van de Geijn, and Gregorio Quintana-Ortí. Restructuring the tridiagonal and bidiagonal QR algorithms for performance. ACM Transactions on Mathematical Software, 40(3):18:1-18:34, April 2014.
[8] Field G. Van Zee, Robert A. van de Geijn, Gregorio Quintana-Ortí, and G. Joseph Elizondo. Families of algorithms for reducing a matrix to condensed form. ACM Trans. Math. Soft., 39(1), 2012.