
LOSS AND RECAPTURE OF ORTHOGONALITY IN THE
MODIFIED GRAM-SCHMIDT ALGORITHM

Å. BJÖRCK† AND C. C. PAIGE‡

To our close friend and mentor Gene Golub, on his 60th birthday.
This is but one of the many topics on which Gene has generated so
much interest, and shed so much light.
Abstract. This paper arose from a fascinating observation, apparently by Charles Sheffield, and
relayed to us by Gene Golub, that the QR factorization of an $m \times n$ matrix $A$ via MGS is numerically
equivalent to that arising from Householder transformations applied to the matrix $A$ augmented by
an $n \times n$ zero matrix. This is explained in a clear and simple way, and then combined with a
well known rounding error result to show the upper triangular matrix $R$ from MGS is about as
accurate as $R$ from other QR factorizations. The special structure of the product of the Householder
transformations is derived, and then used to explain and bound the loss of orthogonality in MGS.
Finally this numerical equivalence is used to show how orthogonality in MGS can be regained in
general. This is illustrated by deriving a numerically stable algorithm based on MGS for a class
of problems which includes solution of nonsingular linear systems, minimum 2-norm solution of
underdetermined linear systems, and linear least squares problems. A brief discussion on the relative
merits of such algorithms is included.
Key Words. orthogonal matrices, QR factorization, Householder transformations, least
squares, minimum norm solution, numerical stability, Gram-Schmidt, augmented systems.
AMS Subject Classifications: 65F25, 65G05, 65F05, 65F20
1. Introduction. We consider a matrix $A \in \mathbb{R}^{m\times n}$ with rank $n \le m$. The
Modified Gram-Schmidt algorithm (MGS) in theory produces $Q_1$ and $R$ in the QR
factorization

$$(1.1)\qquad A = Q\begin{pmatrix} R \\ 0 \end{pmatrix} = Q_1R,\qquad Q = (\,Q_1\ \ Q_2\,),$$

where $Q$ is orthogonal and $R$ upper triangular. In practice if the condition number
$\kappa \equiv \kappa(A) \equiv \sigma_1/\sigma_n$ is large ($\sigma_1 \ge \cdots \ge \sigma_n$ being the singular values of $A$), then
the columns of $Q_1$ are not accurately orthogonal [3]. If orthogonality is crucial, then
usually either rotations or Householder transformations have been used to compute
the QR factorization. Here we show how MGS can be used just as stably for many
problems requiring this orthogonality.
We derive some important properties of MGS in the presence of rounding errors.
In particular we show that the $R$ obtained from MGS is numerically as good as that
obtained from rotations or Householder transformations. We present new insights
on the loss of orthogonality in $Q_1$ from MGS, and show how this can effectively
be regained in computations which use $Q_1$, without altering the MGS algorithm or
reorthogonalizing the columns of $Q_1$. As a practical example of this, we indicate how
$Q_1$ and $R$ from MGS may be used to solve an important class of problems reliably,
despite the loss of orthogonality in $Q_1$. This new approach seems applicable to most
problems for which MGS is in theory relevant.
* This research was partially supported by NSERC of Canada Grant No. A9236.
† Mathematics, Linköping University, S-581 83, Linköping, Sweden ([email protected]).
‡ Computer Science, McGill University, Montreal, Quebec, Canada, H3A 2A7
([email protected]).
The class of problems we consider is that of solving the symmetric indefinite linear
system involving $A \in \mathbb{R}^{m\times n}$ with rank $n$

$$(1.2)\qquad \begin{pmatrix} I & A \\ A^T & 0 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} b \\ c \end{pmatrix}.$$

In general we call (1.2) the Augmented System Formulation (ASF) of the following
two problems, since it represents the conditions for their solution:

$$(1.3)\qquad \min_x \|b - x\|_2,\qquad A^Tx = c;$$

$$(1.4)\qquad \min_y \{\|b - Ay\|_2^2 + 2c^Ty\}.$$

We examine these problems more fully in [5]. The ASF can be obtained by differentiating
the Lagrangian $\|b - x\|_2^2 + 2y^T(A^Tx - c)$ of (1.3), and equating to zero. Here $y$ is
the vector of Lagrange multipliers. The ASF can also be obtained by differentiating
(1.4) to give $A^T(b - Ay) = c$, and setting $x$ to be the "residual" $x = b - Ay$.
The ASF covers two important special cases. Setting $b = 0$ in (1.3), and so in
(1.2), gives the problem of finding the minimum 2-norm solution of a Linear Underdetermined
System (LUS). Setting $c = 0$ in (1.4) gives the much used Linear Least
Squares (LLS) problem. The ASF also occurs in its full form (1.2) in the iterative
refinement of least squares solutions [2].
Using the QR factorization (1.1) we can transform (1.2) into

$$\begin{pmatrix} I & \begin{pmatrix} R \\ 0 \end{pmatrix} \\ (\,R^T\ \ 0\,) & 0 \end{pmatrix}\begin{pmatrix} Q^Tx \\ y \end{pmatrix} = \begin{pmatrix} Q^Tb \\ c \end{pmatrix}.$$

This gives one method for solving (1.2):

$$(1.5)\qquad z = R^{-T}c;\qquad \begin{pmatrix} d \\ f \end{pmatrix} = Q^Tb;\qquad x = Q\begin{pmatrix} z \\ f \end{pmatrix};\qquad y = R^{-1}(d - z).$$

Using $x = Q_1z + Q_2f = Q_1z + Q_2Q_2^Tb = Q_1z + (I - Q_1Q_1^T)b$ an obvious variant is

$$(1.6)\qquad z = R^{-T}c;\qquad d = Q_1^Tb;\qquad x = b - Q_1(d - z);\qquad y = R^{-1}(d - z).$$

Björck [2] showed (1.5) is backward stable for (1.2) using the Householder QR factorization.
Since (1.5) uses $Q$, (1.6) seems preferable if $x$ is required and only $Q_1$ is
available. However, as we shall see it cannot in general be recommended when $Q_1$ is
obtained by MGS. We will show how to develop more reliable algorithms based on
$Q_1$ from MGS.
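For concreteness, method (1.6) can be transcribed directly once $Q_1$ and $R$ are available. The following is a minimal sketch (routine and variable names are ours, not from the paper; it assumes SciPy's `solve_triangular`) of the variant that, as shown later, cannot in general be recommended when $Q_1$ comes from MGS:

```python
import numpy as np
from scipy.linalg import solve_triangular

def asf_via_1_6(Q1, R, b, c):
    """Method (1.6) for the augmented system (1.2), given A = Q1 R.

    Backward stable when Q1 comes from Householder QR, but not in
    general when Q1 comes from MGS (see Sections 5 and 6).
    """
    z = solve_triangular(R, c, trans='T')   # z = R^{-T} c
    d = Q1.T @ b                            # d = Q1^T b
    x = b - Q1 @ (d - z)                    # x = b - Q1 (d - z)
    y = solve_triangular(R, d - z)          # y = R^{-1} (d - z)
    return x, y
```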
In Section 2 we illustrate the important but not widely appreciated result that
MGS is numerically equivalent to the Householder QR factorization applied to $A$
augmented with a block of zeros. From this we show in Section 3 that the computed
$\bar R$ from MGS is numerically as satisfactory as that obtained using Householder QR
on $A$. The product $P$ of the Householder transformations from the QR factorization
of $\binom{O_n}{A}$ is crucial for a full understanding of MGS. $P$ has a simple and important
structure, and this is derived in the theorem in Section 4. This structure shows exactly
how the computed $\bar Q_1$ from MGS can lose orthogonality. In Section 5 this structure
is used to bound the loss of orthogonality of $\bar Q_1$, while Section 6 shows how the lost
orthogonality can be compensated for just by using $\bar Q_1$ differently, without altering $\bar Q_1$
or MGS. We illustrate this by producing a new backward stable algorithm for (1.2)
using the computed $\bar Q_1$ and $\bar R$ from MGS. In Section 7 we consider when we might
use MGS in preference to the Householder QR factorization of $A$.
2. Modified Gram-Schmidt as a Householder Method. The MGS algorithm
computes a sequence of matrices $A = A^{(1)}, A^{(2)}, \ldots, A^{(n+1)} = Q_1 \in \mathbb{R}^{m\times n}$,
where $A^{(k)} = (q_1, \ldots, q_{k-1}, a_k^{(k)}, \ldots, a_n^{(k)})$. Here the first $k-1$ columns are final
columns in $Q_1$, and $a_k^{(k)}, \ldots, a_n^{(k)}$ have been made orthogonal to $q_1, \ldots, q_{k-1}$. In the
$k$th step we take

$$(2.1)\qquad q_k' = a_k^{(k)},\qquad \rho_{kk} = \|q_k'\|_2,\qquad q_k = q_k'/\rho_{kk},$$

and orthogonalize $a_{k+1}^{(k)}, \ldots, a_n^{(k)}$ against $q_k$ using the orthogonal projector $I - q_kq_k^T$,

$$(2.2)\qquad a_j^{(k+1)} = (I - q_kq_k^T)a_j^{(k)} = a_j^{(k)} - q_k\rho_{kj},\qquad \rho_{kj} = q_k^Ta_j^{(k)},\qquad j = k+1, \ldots, n.$$

We see $A^{(k)} = A^{(k+1)}R_k$ where $R_k$ has the same $k$th row as the upper triangular $R \equiv (\rho_{ij})$,
but is the unit matrix otherwise. After $n$ steps we have obtained the factorization

$$(2.3)\qquad A = A^{(1)} = A^{(2)}R_1 = A^{(3)}R_2R_1 = A^{(n+1)}R_n \cdots R_1 = Q_1R,$$

where in exact arithmetic the columns of $Q_1$ are orthonormal by construction. Note
that in the modified Gram-Schmidt algorithm, as opposed to the classical version,
all the projections $q_k\rho_{kj}$ are subtracted from the $a_j^{(k)}$ sequentially as soon as $q_k$ is
computed. In practice a square root free version is often used, where one computes
$Q_1'$, $R'$, and $D = \mathrm{diag}(\delta_1, \ldots, \delta_n)$ in the scaled factorization, taking $q_k'$ as above,

$$(2.4)\qquad A = Q_1'R',\qquad Q_1' = (q_1', \ldots, q_n'),\qquad \delta_k = (q_k')^Tq_k',\qquad k = 1, \ldots, n,$$

with $R' = (\rho_{kj}')$ unit upper triangular, and $\rho_{kj}' = (q_k')^Ta_j^{(k)}/\delta_k$, $j > k$.
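As a concrete reference for (2.1)-(2.3), here is a minimal NumPy sketch of the MGS recurrence (the routine name and variable names are ours):

```python
import numpy as np

def mgs(A):
    """Modified Gram-Schmidt QR, following (2.1)-(2.2).

    Returns Q1 (m x n, theoretically orthonormal columns) and upper
    triangular R (n x n) with A = Q1 @ R.
    """
    A = np.array(A, dtype=float)        # columns play the role of a_j^(k)
    m, n = A.shape
    Q1 = np.zeros((m, n))
    R = np.zeros((n, n))
    for k in range(n):
        R[k, k] = np.linalg.norm(A[:, k])      # rho_kk = ||q'_k||_2
        Q1[:, k] = A[:, k] / R[k, k]           # q_k = q'_k / rho_kk
        for j in range(k + 1, n):
            R[k, j] = Q1[:, k] @ A[:, j]       # rho_kj = q_k^T a_j^(k)
            A[:, j] -= Q1[:, k] * R[k, j]      # a_j^(k+1): projection removed at once
    return Q1, R
```

The feature distinguishing MGS from classical Gram-Schmidt is visible in the inner loop: each remaining column is updated against $q_k$ immediately, using the already-updated column $a_j^{(k)}$ rather than the original $a_j$.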
It was reported in [4] that the modified Gram-Schmidt algorithm for the QR
factorization can be interpreted as Householder's method applied to the matrix $A$
augmented with a square matrix of zero elements on top. This is not only true in
theory, but in the presence of rounding errors as well. This observation is originally
due to Charles Sheffield, and was communicated to the authors by Gene Golub. Because
it is such an important but unexpected result, we will discuss this relationship
in some detail. First we look at the theoretical result.
Let $A \in \mathbb{R}^{m\times n}$ have rank $n$, and let $O_n \in \mathbb{R}^{n\times n}$ be a zero matrix. Consider
the two QR factorizations (here we use $Q$ for $m \times m$ and $P$ for $(m+n) \times (m+n)$
orthogonal matrices),

$$A = Q\begin{pmatrix} R \\ 0 \end{pmatrix} = (\,Q_1\ \ Q_2\,)\begin{pmatrix} R \\ 0 \end{pmatrix},$$

$$(2.5)\qquad \tilde A \equiv \begin{pmatrix} O_n \\ A \end{pmatrix} = P\begin{pmatrix} \tilde R \\ 0 \end{pmatrix} = \begin{pmatrix} P_{11} & P_{12} \\ P_{21} & P_{22} \end{pmatrix}\begin{pmatrix} \tilde R \\ 0 \end{pmatrix}.$$
Since $A$ has rank $n$, $P_{11}$ is zero, $P_{21}$ is an $m \times n$ matrix of orthonormal columns, and
$A = Q_1R = P_{21}\tilde R$. If the upper triangular $R$ and $\tilde R$ are both chosen to have positive
diagonal elements in $A^TA = R^TR = \tilde R^T\tilde R$, then $R = \tilde R$ by uniqueness, so $P_{21} = Q_1$
can be found from any QR factorization of the augmented matrix. The last $m$ columns
of $P$ are then arbitrary up to an $m \times m$ orthogonal multiplier. The important result
is that the Householder QR factorization of the augmented matrix is numerically
equivalent to MGS applied to $A$.
To see this, remember that with $e_k$ the $k$th column of the unit matrix, the Householder
transformation $Pa = e_1\sigma$ uses $P = I - 2vv^T/\|v\|_2^2$, $v = a - e_1\sigma$, $\sigma = \|a\|_2$. If
(2.5) is obtained using Householder transformations, then

$$(2.6)\qquad P^T = P_n \cdots P_2P_1,\qquad P_k = I - 2\hat v_k\hat v_k^T/\|\hat v_k\|_2^2,\qquad k = 1, \ldots, n,$$

where the vectors $\hat v_k$ are described below. Now from MGS applied to $A^{(1)} = A$,
$\rho_{11} = \|a_1^{(1)}\|_2$ and $a_1^{(1)} = q_1' = q_1\rho_{11}$, so for the first Householder transformation
applied to the augmented matrix

$$\tilde A^{(1)} \equiv \begin{pmatrix} O_n \\ A^{(1)} \end{pmatrix},\qquad \tilde a_1^{(1)} = \begin{pmatrix} 0 \\ a_1^{(1)} \end{pmatrix},$$

$$\hat v_1 \equiv \begin{pmatrix} -e_1\rho_{11} \\ q_1' \end{pmatrix} = \rho_{11}v_1,\qquad v_1 = \begin{pmatrix} -e_1 \\ q_1 \end{pmatrix}$$

(since there can be no cancellation we take $\rho_{kk} \ge 0$). But $\|v_1\|_2^2 = 2$, giving

$$P_1 = I - 2\hat v_1\hat v_1^T/\|\hat v_1\|_2^2 = I - 2v_1v_1^T/\|v_1\|_2^2 = I - v_1v_1^T,$$

and

$$P_1\tilde a_j^{(1)} = \tilde a_j^{(1)} - v_1v_1^T\tilde a_j^{(1)} = \begin{pmatrix} 0 \\ a_j^{(1)} \end{pmatrix} - \begin{pmatrix} -e_1 \\ q_1 \end{pmatrix}q_1^Ta_j^{(1)} = \begin{pmatrix} e_1\rho_{1j} \\ a_j^{(2)} \end{pmatrix},$$

so

$$P_1\tilde A^{(1)} = \begin{pmatrix} \rho_{11} & \rho_{12} & \cdots & \rho_{1n} \\ 0 & 0 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & 0 \\ 0 & a_2^{(2)} & \cdots & a_n^{(2)} \end{pmatrix},$$

where these values are clearly numerically the same as in the first step of MGS on
$A$. We see the next Householder transformation produces the second row of $R$ and
$a_3^{(3)}, \ldots, a_n^{(3)}$, just as in MGS. Carrying on this way we see this Householder QR is
numerically equivalent to MGS applied to $A$, and that every $P_k$ is effectively defined
by $Q_1$, since

$$(2.7)\qquad P_k = I - v_kv_k^T,\qquad v_k = \begin{pmatrix} -e_k \\ q_k \end{pmatrix},\qquad k = 1, \ldots, n.$$

$P$ gives us a key to understanding the numerical behavior of MGS. First note
that in theory $v_i^Tv_j = e_i^Te_j + q_i^Tq_j = 0$ if $i \ne j$, so $P_iP_j = I - v_iv_i^T - v_jv_j^T$, and
$P^T = P_n \cdots P_1 = I - v_1v_1^T - v_2v_2^T - \cdots - v_nv_n^T$ is symmetric, so using Householder
transformations in (2.5),

$$P_{11} = 0,$$
$$P_{12}^T = P_{21} = q_1e_1^T + \cdots + q_ne_n^T = Q_1,$$
$$P_{22} = I - q_1q_1^T - \cdots - q_nq_n^T = I - Q_1Q_1^T = Q_2Q_2^T.$$

This shows such special orthogonal matrices are fully defined by their $(1,2)$ blocks,

$$(2.8)\qquad P = \begin{pmatrix} O_n & Q_1^T \\ Q_1 & I - Q_1Q_1^T \end{pmatrix}.$$
3. Accuracy of R from Modified Gram-Schmidt. A rounding error analysis
of MGS was given in [3]. There it was shown that the computed $\bar Q_1$ and $\bar R$ satisfy

$$(3.1)\qquad A + E = \bar Q_1\bar R,\qquad \|E\|_2 \le c_1u\|A\|_2,\qquad \|I - \bar Q_1^T\bar Q_1\|_2 \le c_2u\kappa,$$

where the $c_i$ are constants depending on $m$, $n$ and the details of the arithmetic, and $u$
is the unit roundoff. Hence $\bar Q_1\bar R$ accurately represents $A$, and the departure from
orthogonality can be bounded in terms of the condition number $\kappa = \sigma_1/\sigma_n$.
From the numerical equivalence shown in the previous section it follows that
the backward error analysis for the Householder QR factorization of the augmented
matrix in (2.5) can also be applied to the modified Gram-Schmidt algorithm on $A$.
Here we will do this, and in this section and Section 5 we will rederive (3.1) as well
as give some new results. This is a simple and unified approach, in that the one
analysis of orthogonal transformations can be used to analyse the QR factorization
via both Householder transformations and MGS. It also deepens our understanding
of the MGS algorithm and its possible uses.
Let $\bar Q_1 = (\bar q_1, \ldots, \bar q_n)$ be the matrix of vectors computed by MGS, and for
$k = 1, \ldots, n$ define

$$(3.2)\qquad v_k = \begin{pmatrix} -e_k \\ \bar q_k \end{pmatrix},\qquad P_k = I - v_kv_k^T,\qquad \bar P = P_1P_2 \cdots P_n;$$

$$\tilde q_k = \bar q_k/\|\bar q_k\|_2,\qquad \tilde v_k = \begin{pmatrix} -e_k \\ \tilde q_k \end{pmatrix},\qquad \tilde P_k = I - \tilde v_k\tilde v_k^T,\qquad \tilde P = \tilde P_1\tilde P_2 \cdots \tilde P_n.$$

Then $P_k$ is the computed version of the Householder matrix applied in the $k$th step
of the Householder QR factorization of $\binom{O_n}{A}$, and $\tilde P_k$ is its orthonormal equivalent,
so that $\tilde P_k^T\tilde P_k = I$. Wilkinson [11, pp. 153-162] has given a general error analysis of
orthogonal transformations of this type. From this it follows that for $\bar R$ computed by
MGS, the equivalent of (2.5) is

$$\begin{pmatrix} E_1 \\ A + E_2 \end{pmatrix} = \tilde P\begin{pmatrix} \bar R \\ 0 \end{pmatrix},\qquad \tilde P = \bar P + E',$$

$$(3.3)\qquad \|E_i\|_2 \le c_iu\|A\|_2,\quad i = 1, 2,\qquad \|E'\|_2 \le c_3u,$$

where again the $c_i$ are constants depending on $m$, $n$ and the details of the arithmetic.
To show this $\bar R$ from MGS, or the Householder QR factorization of the augmented
matrix, is numerically about as good as that from the ordinary Householder QR
factorization of $A$, we use the following general result.
Lemma 3.1. For any matrices satisfying

$$\begin{pmatrix} E_1 \\ A + E_2 \end{pmatrix} = \begin{pmatrix} P_{11} \\ P_{21} \end{pmatrix}R,\qquad P_{11}^TP_{11} + P_{21}^TP_{21} = I,$$

there exist $\hat Q_1$ and $E$ such that

$$A + E = \hat Q_1R,\qquad \hat Q_1^T\hat Q_1 = I,$$

$$(3.4)\qquad \|\hat Q_1 - P_{21}\|_2 \le \|P_{11}\|_2^2,$$

$$(3.5)\qquad \|(\hat Q_1 - P_{21})R\|_2 \le \|P_{11}\|_2\|E_1\|_2,$$

$$(3.6)\qquad \|E\|_2 \le \|P_{11}\|_2\|E_1\|_2 + \|E_2\|_2 \le \|E_1\|_2 + \|E_2\|_2.$$

Proof. Consider the CS decomposition (see for example [7, p. 77]) $P_{11} = U_1CW^T$,
$P_{21} = V_1SW^T$, where $U = (U_1, U_2)$, $V = (V_1, V_2)$ are square orthonormal matrices
and $C$ and $S$ are nonnegative diagonal matrices with $C^2 + S^2 = I$. Define $\hat Q_1 \equiv V_1W^T$,
the closest orthonormal matrix to $P_{21}$ in any unitarily invariant norm; then since
$(I + S)(I - S) = C^2$,

$$\hat Q_1 - P_{21} = V_1(I - S)W^T = V_1(I + S)^{-1}W^TWCU_1^TU_1CW^T = V_1(I + S)^{-1}W^TP_{11}^TP_{11},$$

$$(\hat Q_1 - P_{21})R = V_1(I + S)^{-1}W^TP_{11}^TE_1,$$

from which the first two bounds follow. Next

$$E = \hat Q_1R - A = (\hat Q_1 - P_{21})R + E_2,$$

from which the third bound follows. □
Using these results we see when $\bar R$ is computed using MGS, so $\bar R$ satisfies (3.3),
there exists orthonormal $\hat Q_1$ such that, writing $c = c_1 + c_2$,

$$(3.7)\qquad A + E = \hat Q_1\bar R,\qquad \hat Q_1^T\hat Q_1 = I,\qquad \|E\|_2 \le cu\|A\|_2.$$

This means if $\bar\sigma_1 \ge \cdots \ge \bar\sigma_n$ are the singular values of $\bar R$, and $\sigma_1 \ge \cdots \ge \sigma_n$ are those
of $A$,

$$(3.8)\qquad |\bar\sigma_i - \sigma_i| \le cu\sigma_1,\qquad i = 1, \ldots, n.$$

Thus $\bar R$ from MGS is not only the same as $\bar R$ from the Householder QR factorization
applied to $A$ augmented by a square block of zeros, but (3.7) shows it is comparable
in accuracy to the upper triangular matrix from the Householder or Givens QR factorization
applied to $A$ alone. Also (3.8) shows the singular values of $\bar R$ are very close
to those of $A$, which means we could use MGS as a first step in finding the singular
values of $A$, and justifies an algorithm by Longley in [9, Ch. 9]. Since we have not
required $A$ to be full rank as yet in this section, this fact also ensures $\bar R$ from MGS
can be used in any computation for finding the rank of $A$. Here we will just use this
knowledge to simplify our bounds below.
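A quick numerical illustration of (3.8) (a sketch under our own test setup, using the `mgs` routine from Section 2): even for a severely ill-conditioned $A$, the singular values of the MGS factor $\bar R$ should track those of $A$ to about $cu\sigma_1$:

```python
import numpy as np

# Test matrix with prescribed singular values 10^0, ..., 10^-12, so kappa(A) = 1e12.
m, n = 50, 10
rng = np.random.default_rng(0)
U, _ = np.linalg.qr(rng.standard_normal((m, n)))
W, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = U @ np.diag(np.logspace(0, -12, n)) @ W.T

Q1, R = mgs(A)
# By (3.8), these should differ by only about c * u * sigma_1,
# even though Q1 itself is far from orthonormal here.
print(np.linalg.svd(R, compute_uv=False))
print(np.linalg.svd(A, compute_uv=False))
```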
In fact $\bar R$ is usually even better than (3.7) suggests. We see $\bar R$ is nonsingular if
$cu\sigma_1 < \sigma_n$, that is if $cu\kappa < 1$, so we make the following assumption and definition,

$$(3.9)\qquad cu\kappa < 1,\qquad \bar\kappa \equiv \kappa(1 - cu\kappa)^{-1},$$
where usually $\bar\kappa \approx \kappa$. Then

$$(3.10)\qquad \|A\|_2\|\bar R^{-1}\|_2 = \sigma_1/\bar\sigma_n \le \sigma_1/(\sigma_n - cu\sigma_1) = \bar\kappa,$$

and $E_1 = \tilde P_{11}\bar R$, so

$$(3.11)\qquad \|\tilde P_{11}\|_2 = \|E_1\bar R^{-1}\|_2 \le c_1u\bar\kappa,\qquad \|\bar P_{11}\|_2 \le (c_1\bar\kappa + c_3)u.$$

From (3.6)

$$(3.12)\qquad \|E\|_2 \le \|\tilde P_{11}\|_2\|E_1\|_2 + \|E_2\|_2 \le c_1^2\bar\kappa u^2\|A\|_2 + \|E_2\|_2,$$

showing that the first term on the right will be negligible if $c_1u\bar\kappa \ll 1$, which is
usually true.
We will show how all of $\tilde P$ and $\bar P$ depend crucially on $\tilde P_{11}$ and $\bar P_{11}$ respectively,
so the bounds in (3.11) are important in understanding the loss of orthogonality
in MGS. Since $\bar R$ is numerically about as good as we can hope for, it is clear that
the main drawback of MGS is this lack of orthogonality in $\bar Q_1 = (\bar q_1, \ldots, \bar q_n)$, so we
examine this in the next two sections. (As is mentioned in Section 7, another less
important drawback is that the operation count is slightly higher for MGS than for
the Householder QR factorization.)
4. Structure of $P$, $\bar P$ and $\tilde P$ from the Householder QR factorization of
the augmented matrix. It is well known that the orthogonality of the ideal $Q_1$ is
lost in MGS because of cancellation in the subtractions in (2.2), and that this can
give a severely nonorthogonal computed $\bar Q_1$. In order to understand this loss fully
and later to bound it, the following theorem provides the detailed structures of $\bar P$
and $\tilde P$ in (3.2) as functions of the computed $\bar Q_1$ and the normalized $\tilde Q_1 \equiv (\tilde q_1, \ldots, \tilde q_n)$
respectively. Note the theorem is for general $Q_1 = (q_1, \ldots, q_n)$, and so will apply to
$P$, $\bar P$ and $\tilde P$. The idea is that any matrix $P = P_1P_2 \cdots P_n$ with $P_k = I - v_kv_k^T$ and
$v_k^T = (-e_k^T,\ q_k^T)$ has a very special structure, and the theorem reveals this. In this
structure the whole matrix is seen to depend only on the leading $n \times n$ block $P_{11}$ of
$P$, and on $Q_1$. But we have bounds on our $\tilde P_{11}$ and $\bar P_{11}$ in (3.11), and so will be able
to understand and bound the loss of orthogonality in $\tilde Q_1$ or $\bar Q_1$ from MGS. What is
more, all such $P_{11}$ have special structure too, being strictly upper triangular.
Theorem 4.1. Let $Q_1 = (q_1, \ldots, q_n) \in \mathbb{R}^{m\times n}$, and for $k = 1, \ldots, n$, define

$$M_k \equiv I - q_kq_k^T,\qquad v_k \equiv \begin{pmatrix} -e_k \\ q_k \end{pmatrix} \in \mathbb{R}^{m+n},\qquad P_k \equiv I - v_kv_k^T.$$

Then with the partitioning we use throughout this theorem

$$P \equiv P_1P_2 \cdots P_n \equiv \begin{pmatrix} P_{11} & P_{12} \\ P_{21} & P_{22} \end{pmatrix},\qquad P_{11} \in \mathbb{R}^{n\times n},\quad P_{22} \in \mathbb{R}^{m\times m},$$

$$(4.1)\qquad = \begin{pmatrix}
0 & q_1^Tq_2 & q_1^TM_2q_3 & \cdots & q_1^TM_2M_3\cdots M_{n-1}q_n & q_1^TM_2M_3\cdots M_n \\
0 & 0 & q_2^Tq_3 & \cdots & q_2^TM_3M_4\cdots M_{n-1}q_n & q_2^TM_3M_4\cdots M_n \\
\vdots & \vdots & & \ddots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & q_{n-1}^Tq_n & q_{n-1}^TM_n \\
0 & 0 & 0 & \cdots & 0 & q_n^T \\
q_1 & M_1q_2 & M_1M_2q_3 & \cdots & M_1M_2\cdots M_{n-1}q_n & M_1M_2\cdots M_n
\end{pmatrix}$$

$$(4.2)\qquad = \begin{pmatrix} P_{11} & (I - P_{11})Q_1^T \\ Q_1(I - P_{11}) & I - Q_1(I - P_{11})Q_1^T \end{pmatrix}.$$

$P$ is orthonormal if and only if $\|q_k\|_2 = 1$ for $k = 1, \ldots, n$; and $P_{11} = 0$ if and only
if $Q_1^TQ_1$ is diagonal.
There is a short proof that does not give (4.1), but since (4.1) reveals the detailed
structure of $P$, we give a longer proof. Note if $q_k$ has length 1 then $M_k$ is a projector,
and from (4.1) the second column of $P_{21}$ is that part of $q_2$ orthogonal to $q_1$, the third
is $q_3$ orthogonalized against $q_2$ and the result orthogonalized against $q_1$, and so on.
However this is not the same as reorthogonalizing the $q_k$.
Proof. To determine the first $n$ columns of $P = P_1P_2 \cdots P_n$ note

$$P_k = I - v_kv_k^T = I - \begin{pmatrix} -e_k \\ q_k \end{pmatrix}(\,-e_k^T\ \ q_k^T\,) = \begin{pmatrix} I - e_ke_k^T & e_kq_k^T \\ q_ke_k^T & M_k \end{pmatrix},$$

and let $1 \le j \le n$. If $j \ne k$ then $P_ke_j = e_j$, while $P_je_j = \begin{pmatrix} 0 \\ q_j \end{pmatrix}$, so

$$Pe_j = P_1P_2 \cdots P_ne_j = P_1P_2 \cdots P_je_j = P_1P_2 \cdots P_{j-2}\begin{pmatrix} e_{j-1}q_{j-1}^Tq_j \\ M_{j-1}q_j \end{pmatrix}$$

$$= P_1P_2 \cdots P_{j-3}\begin{pmatrix} e_{j-2}q_{j-2}^TM_{j-1}q_j + e_{j-1}q_{j-1}^Tq_j \\ M_{j-2}M_{j-1}q_j \end{pmatrix}$$

$$(4.3)\qquad = \begin{pmatrix} q_1^TM_2\cdots M_{j-1}q_j \\ q_2^TM_3\cdots M_{j-1}q_j \\ \vdots \\ q_{j-1}^Tq_j \\ 0 \\ \vdots \\ 0 \\ M_1M_2\cdots M_{j-1}q_j \end{pmatrix} = \begin{pmatrix} P_{11} \\ P_{21} \end{pmatrix}e_j \equiv \begin{pmatrix} p_{1j} \\ p_{2j} \end{pmatrix},\qquad p_{1j} = \begin{pmatrix} \pi_{1j} \\ \vdots \\ \pi_{j-1,j} \\ \pi_{jj} \\ \vdots \\ \pi_{nj} \end{pmatrix},$$

say, which gives the $(1,1)$ and $(2,1)$ blocks of (4.1). For the last $m$ columns we have

$$\begin{pmatrix} P_{12} \\ P_{22} \end{pmatrix} = P\begin{pmatrix} 0 \\ I_m \end{pmatrix} = P_1P_2 \cdots P_{n-1}\begin{pmatrix} e_nq_n^T \\ M_n \end{pmatrix}$$

$$(4.4)\qquad = P_1P_2 \cdots P_{n-2}\begin{pmatrix} e_{n-1}q_{n-1}^TM_n + e_nq_n^T \\ M_{n-1}M_n \end{pmatrix} = \begin{pmatrix} q_1^TM_2\cdots M_n \\ q_2^TM_3\cdots M_n \\ \vdots \\ q_{n-1}^TM_n \\ q_n^T \\ M_1\cdots M_n \end{pmatrix},$$
which completes the proof of (4.1). Next from (4.3)

$$P_{21}e_j = (I - q_1q_1^T)M_2M_3 \cdots M_{j-1}q_j$$
$$= M_2M_3 \cdots M_{j-1}q_j - q_1\pi_{1j}$$
$$= M_3M_4 \cdots M_{j-1}q_j - q_1\pi_{1j} - q_2\pi_{2j}$$
$$\vdots$$
$$= M_{j-1}q_j - q_1\pi_{1j} - q_2\pi_{2j} - \cdots - q_{j-2}\pi_{j-2,j}$$
$$= q_j - q_1\pi_{1j} - q_2\pi_{2j} - \cdots - q_{j-1}\pi_{j-1,j}$$
$$= Q_1e_j - Q_1P_{11}e_j,$$

so $P_{21} = Q_1(I - P_{11})$, giving the $(2,1)$ block of (4.2). Next from (4.4)

$$e_i^TP_{12} = q_i^TM_{i+1} \cdots M_{n-1}M_n$$
$$= q_i^TM_{i+1} \cdots M_{n-1} - q_i^TM_{i+1} \cdots M_{n-1}q_nq_n^T$$
$$= q_i^TM_{i+1} \cdots M_{n-1} - \pi_{in}q_n^T$$
$$= q_i^TM_{i+1} \cdots M_{n-2} - \pi_{i,n-1}q_{n-1}^T - \pi_{in}q_n^T$$
$$\vdots$$
$$= q_i^TM_{i+1} - \pi_{i,i+2}q_{i+2}^T - \cdots - \pi_{in}q_n^T$$
$$= q_i^T - \pi_{i,i+1}q_{i+1}^T - \cdots - \pi_{in}q_n^T$$
$$= e_i^TQ_1^T - e_i^TP_{11}Q_1^T,$$

so $P_{12} = (I - P_{11})Q_1^T$, giving the $(1,2)$ block of (4.2). We can now use the structure
of $P_{21}$ in (4.1) to give

$$P_{22} = M_1M_2 \cdots M_n$$
$$= M_1M_2 \cdots M_{n-1} - M_1M_2 \cdots M_{n-1}q_nq_n^T$$
$$= M_1M_2 \cdots M_{n-1} - P_{21}e_nq_n^T$$
$$= M_1M_2 \cdots M_{n-2} - P_{21}e_{n-1}q_{n-1}^T - P_{21}e_nq_n^T$$
$$\vdots$$
$$= I - q_1q_1^T - P_{21}e_2q_2^T - P_{21}e_3q_3^T - \cdots - P_{21}e_nq_n^T$$
$$= I - P_{21}(e_1q_1^T + e_2q_2^T + \cdots + e_nq_n^T)$$
$$= I - P_{21}Q_1^T = I - Q_1(I - P_{11})Q_1^T,$$

completing the proof of (4.2).
Clearly $P_k$ is orthonormal if $q_k^Tq_k = 1$, so if $\|q_k\|_2 = 1$ for $k = 1, \ldots, n$, then $P$
is orthonormal. Now suppose $P$ is orthonormal; then $Pe_1 = P_1e_1 = (0^T,\ q_1^T)^T$ must
have length 1, so $\|q_1\|_2 = 1$, and $P_1$ and so $P_2P_3 \cdots P_n$ is orthonormal. But then
$P_2P_3 \cdots P_ne_2 = P_2e_2 = (0^T,\ q_2^T)^T$ must have length 1, and so on. Finally we see from
(4.1) that the $i$th row of $P_{11}$ is zero if and only if $q_i^Tq_j = 0$ for $j = i+1, \ldots, n$, proving
$P_{11} = 0$ if and only if $Q_1^TQ_1$ is diagonal. □
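Since Theorem 4.1 holds for a completely general $Q_1$ (the columns need not be orthonormal), the block formula (4.2) can be checked directly in a few lines. A sketch follows (names and test setup are ours):

```python
import numpy as np

# Numerical check of (4.2) for an arbitrary Q1 (columns not normalized).
rng = np.random.default_rng(2)
m, n = 7, 4
Q1 = rng.standard_normal((m, n))

P = np.eye(m + n)
for k in range(n):
    v = np.concatenate([-np.eye(n)[:, k], Q1[:, k]])  # v_k = (-e_k; q_k)
    P = P @ (np.eye(m + n) - np.outer(v, v))          # P_k = I - v_k v_k^T

P11, I_n = P[:n, :n], np.eye(n)
assert np.allclose(np.tril(P11), 0)                             # strictly upper triangular
assert np.allclose(P[n:, :n], Q1 @ (I_n - P11))                 # (2,1) block of (4.2)
assert np.allclose(P[:n, n:], (I_n - P11) @ Q1.T)               # (1,2) block
assert np.allclose(P[n:, n:], np.eye(m) - Q1 @ (I_n - P11) @ Q1.T)  # (2,2) block
```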
Since each of $P$ (see (2.6) and (2.7)), $\bar P$ and $\tilde P$ (see (3.2)) has the structure of $P$
in the theorem, $P$ has the form (2.8), and

$$(4.5)\qquad \tilde P = \begin{pmatrix} \tilde P_{11} & (I - \tilde P_{11})\tilde Q_1^T \\ \tilde Q_1(I - \tilde P_{11}) & I - \tilde Q_1(I - \tilde P_{11})\tilde Q_1^T \end{pmatrix},$$

for some strictly upper triangular $\tilde P_{11}$, with $\bar P$ having a similar form. This shows how
$\tilde Q_1$ loses orthogonality when $\tilde P_{11}$ is nonzero. Clearly $P$ and $\tilde P$ are orthogonal matrices,
so their first $n$ columns form orthonormal sets. Since $P_{11}$ is zero, $Q_1$ is clearly an $m \times n$
matrix of orthonormal columns, but all we can say about the size of $\tilde P_{11}$ is
$\|\tilde P_{11}\|_2 \le c_1u\bar\kappa$, from (3.11). If $\bar\kappa$ is not very much greater than 1, then $\tilde P_{11}$ is small, and from
(4.5) $\tilde Q_1$ has nearly orthonormal columns. For larger $\bar\kappa$ (4.5) shows how the columns
of $\tilde Q_1$ can become less and less orthogonal, losing all likelihood of orthogonality when
$c_1u\bar\kappa \approx 1$. Clearly column pivoting would be useful in maintaining orthogonality as
long as possible, and in revealing the rank of rank deficient $A$. Since $\tilde Q_1$ is just $\bar Q_1$
with normalized columns, the same comments on orthogonality apply to $\bar Q_1$. We will
bound these losses of orthogonality in the next section, and show how to avoid them
after that.
5. Loss of Orthogonality in $\bar Q_1$ and $\tilde Q_1$ from MGS. Each column of $\tilde Q_1$ is
just the correctly normalized column of the computed $\bar Q_1$ from MGS, whose columns
already have norm almost one, so what we prove for $\tilde Q_1$ effectively holds for $\bar Q_1$. We
saw from Theorem 4.1 that the first $n$ columns $\tilde P^{(n)}$ of $\tilde P$ are orthonormal and

$$\tilde P^{(n)} = \begin{pmatrix} \tilde P_{11} \\ \tilde Q_1(I - \tilde P_{11}) \end{pmatrix} = \begin{pmatrix} \tilde P_{11} \\ \tilde P_{21} \end{pmatrix},\qquad \tilde P^{(n)T}\tilde P^{(n)} = I,$$

so an easy result is obtained by applying Lemma 3.1 with $R = I$, $A = \tilde Q_1$, $E_1 = \tilde P_{11}$
and $E_2 = -\tilde Q_1\tilde P_{11}$, showing there exist $\hat Q_1$ and $E$ such that $\tilde Q_1 + E = \hat Q_1$ with
$\hat Q_1^T\hat Q_1 = I$ and

$$\|E\|_2 = \|\tilde Q_1 - \hat Q_1\|_2 \le (\|\tilde P_{11}\|_2 + \|\tilde Q_1\|_2)\|\tilde P_{11}\|_2.$$

But then $\|\tilde Q_1\|_2 \le 1 + \|E\|_2$, giving

$$\|E\|_2 \le \|\tilde P_{11}\|_2(1 + \|\tilde P_{11}\|_2)/(1 - \|\tilde P_{11}\|_2),$$

and a bound on the distance of $\tilde Q_1$ from an orthonormal matrix when $c_1u\bar\kappa < 1$,

$$(5.1)\qquad \|\tilde Q_1 - \hat Q_1\|_2 \le c_1u\bar\kappa\,\frac{1 + c_1u\bar\kappa}{1 - c_1u\bar\kappa},$$

which for $c_1u\bar\kappa \ll 1$ is effectively $c_1u\bar\kappa$.
In order to bound the departure of $\tilde Q_1^T\tilde Q_1$ from the unit matrix we could use (5.1)
directly, but a more revealing result follows by noting in (3.3) that $E_1 = \tilde P_{11}\bar R$ is strictly
upper triangular, since $\tilde P_{11}$ is so from Theorem 4.1. Thus

$$\tilde P^{(n)}\bar R = \begin{pmatrix} \tilde P_{11} \\ \tilde Q_1(I - \tilde P_{11}) \end{pmatrix}\bar R = \begin{pmatrix} E_1 \\ \tilde Q_1(\bar R - E_1) \end{pmatrix},$$

so that

$$(\bar R - E_1)^T\tilde Q_1^T\tilde Q_1(\bar R - E_1) = \bar R^T\bar R - E_1^TE_1
= (\bar R - E_1)^T(\bar R - E_1) + (\bar R - E_1)^TE_1 + E_1^T(\bar R - E_1).$$

Since $\bar R$ is nonsingular upper triangular, and $E_1$ is strictly upper triangular, $\bar R - E_1$
is nonsingular upper triangular, and

$$(5.2)\qquad \tilde Q_1^T\tilde Q_1 = I + E_1(\bar R - E_1)^{-1} + (\bar R - E_1)^{-T}E_1^T,$$

with $E_1(\bar R - E_1)^{-1}$ the strictly upper triangular part of $\tilde Q_1^T\tilde Q_1$. This gives a clear
picture of exactly how the loss of orthogonality depends on the computed $\bar R$. Thus
from (3.3) and (3.8)-(3.10), if $(c + c_1)u\kappa < 1$ we obtain the bound

$$(5.3)\qquad \|I - \tilde Q_1^T\tilde Q_1\|_2 \le \frac{2c_1u\kappa}{1 - (c + c_1)u\kappa},$$

and a loss of orthogonality of this magnitude can often be observed in practice.
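The growth predicted by (5.3) is easy to reproduce. A sketch (our own test setup, with `mgs` as in Section 2) in which $\|I - \bar Q_1^T\bar Q_1\|_2$ should be observed to grow roughly like $u\,\kappa(A)$:

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 60, 8
U, _ = np.linalg.qr(rng.standard_normal((m, n)))
W, _ = np.linalg.qr(rng.standard_normal((n, n)))
for p in [2, 6, 10, 14]:                          # kappa(A) = 10^p
    A = U @ np.diag(np.logspace(0, -p, n)) @ W.T
    Q1, R = mgs(A)
    loss = np.linalg.norm(np.eye(n) - Q1.T @ Q1, 2)
    print(f"kappa = 1e{p:2d}:  ||I - Q1^T Q1||_2 = {loss:.2e}")
```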
The bound (5.3) is of similar form to the bound (3.1) given in [3], but here we also
derived the relation of $\tilde Q_1$ to the orthonormal matrix $\tilde P$, and described the relation
between the loss of orthogonality in $\tilde Q_1$ and the deviation of $\tilde P$ from the ideal form
of $P$. We also note here that if the first $k$ columns of $A$ in (3.3) have a small condition
number $\kappa$, then the first $k$ columns of $\tilde P_{11}$ will be small, and the first $k$ columns of
$\tilde Q_1$ will be nearly orthonormal.
Our main purpose is not to show how $\tilde Q_1$ or $\bar Q_1$ may be improved. Instead the
key point of this work is that although the computed $\bar P$ is very close to the exactly
orthogonal $\tilde P$ in (3.3), the columns of $\bar Q_1$ need not be particularly orthonormal. Our
thesis here is that as a result of this, it is usually inadvisable to use $\bar Q_1$ as our set of
orthonormal vectors, but we can use $\bar P$ (as the theoretical product of the computed
$P_k = I - v_kv_k^T$, which is extremely close to $\tilde P$) to make use of the desired orthogonality,
since we have all the necessary information in $\bar Q_1$, that is $v_k^T = (-e_k^T,\ \bar q_k^T)$. Thus
we can solve problems as accurately using MGS as using Householder or Givens QR
factorizations if, instead of using the computed $\bar Q_1$ directly, we formulate the problems
in terms of (2.5), see (3.3), and use the $\bar q_k$ to define $\bar P$. Of course in most cases no
block of $\bar P$ need actually be formed. We illustrate an important use of this idea in
the next section, and discuss the efficiency of such an approach in Section 7.
6. Backward Stable Solution of the ASF using MGS. Björck [2] showed
(1.5) is backward stable for the ASF (1.2) using the Householder QR factorization,
but the same is not true when we use (1.6) with $\bar R$ and $\bar Q_1$ computed by MGS, see
[5]. Here we use our new knowledge of MGS to produce a backward stable algorithm
for the ASF based on $\bar Q_1$ and $\bar R$ from MGS. This new approach can be used to design
good algorithms using MGS in general.
Our original ASF (1.2) is equivalent to the augmented system

$$(6.1)\qquad \begin{pmatrix} I & 0 & 0 \\ 0 & I & A \\ 0 & A^T & 0 \end{pmatrix}\begin{pmatrix} w \\ x \\ y \end{pmatrix} = \begin{pmatrix} 0 \\ b \\ c \end{pmatrix},$$

so applying Householder transformations as in (2.5), the augmented version of the
method (1.5) is

$$(6.2)\qquad z = R^{-T}c;\qquad \begin{pmatrix} d \\ h \end{pmatrix} = P^T\begin{pmatrix} 0 \\ b \end{pmatrix};\qquad \begin{pmatrix} w \\ x \end{pmatrix} = P\begin{pmatrix} z \\ h \end{pmatrix};\qquad y = R^{-1}(d - z).$$

But as we saw in Section 2 we can use the $q_k$ from MGS to produce $P_k = I - v_kv_k^T$,
$v_k^T = (-e_k^T,\ q_k^T)$, and use $P^T = P_n \cdots P_2P_1$ in (6.2). We show in [5] that this algorithm
is strongly stable (see [6]) for (6.1), and also strongly stable for (1.2).
We now show how to take advantage of the structure of the $P_k$; then we will
summarize this numerically stable use of MGS for the ASF. To compute $d$ and $h$ in
(6.2) note that $P^T = P_n \cdots P_1$, and define

$$\begin{pmatrix} d^{(1)} \\ h^{(1)} \end{pmatrix} = \begin{pmatrix} 0 \\ b \end{pmatrix},\qquad \begin{pmatrix} d^{(k+1)} \\ h^{(k+1)} \end{pmatrix} = P_k \cdots P_1\begin{pmatrix} 0 \\ b \end{pmatrix} = P_k\begin{pmatrix} d^{(k)} \\ h^{(k)} \end{pmatrix}.$$

Now using induction we see $d^{(k)}$ has all but its first $k-1$ elements zero, and

$$\begin{pmatrix} d^{(k+1)} \\ h^{(k+1)} \end{pmatrix} = \begin{pmatrix} d^{(k)} \\ h^{(k)} \end{pmatrix} - \begin{pmatrix} -e_k \\ q_k \end{pmatrix}(\,-e_k^T\ \ q_k^T\,)\begin{pmatrix} d^{(k)} \\ h^{(k)} \end{pmatrix} = \begin{pmatrix} d^{(k)} + e_k(q_k^Th^{(k)}) \\ h^{(k)} - q_k(q_k^Th^{(k)}) \end{pmatrix},$$

giving the computation starting with $h^{(1)} := b$,

for $k = 1, \ldots, n$ do $\{\delta_k := q_k^Th^{(k)};\ h^{(k+1)} := h^{(k)} - q_k\delta_k\}$;

so $h = h^{(n+1)}$, $d = d^{(n+1)} = (\delta_1, \ldots, \delta_n)^T$. This costs $2mn$ flops (1 flop = 1 multiplication
and 1 addition in floating point arithmetic), compared with the $mn$ flops
required to form $d = Q_1^Tb$ in (1.6). The computation for $d$ and $h$ is exactly the same
as the one that would arise if the $n$ MGS steps in (2.1)-(2.3) had been applied to
$(A,\ b)$ instead of just $A$, so that $h$ is theoretically the component of $b$ orthogonal to
the columns of $A$. Note that now $d$ has elements $q_k^Th^{(k)}$ instead of $q_k^Tb$ as would be
the case in (1.6).
To compute $x$ in (6.2), define

$$\begin{pmatrix} w^{(n)} \\ x^{(n)} \end{pmatrix} = \begin{pmatrix} z \\ h \end{pmatrix},\qquad \begin{pmatrix} w^{(k-1)} \\ x^{(k-1)} \end{pmatrix} = P_k \cdots P_n\begin{pmatrix} z \\ h \end{pmatrix} = P_k\begin{pmatrix} w^{(k)} \\ x^{(k)} \end{pmatrix},$$

so that

$$\begin{pmatrix} w^{(k-1)} \\ x^{(k-1)} \end{pmatrix} = \begin{pmatrix} w^{(k)} \\ x^{(k)} \end{pmatrix} - \begin{pmatrix} -e_k \\ q_k \end{pmatrix}\bigl(-e_k^Tw^{(k)} + q_k^Tx^{(k)}\bigr),$$

which shows in this step only the $k$th element of $w^{(k)}$ is changed, from $\zeta_k = e_k^Tz$ to
$\omega_k = q_k^Tx^{(k)}$. This gives the computation starting with $x^{(n)} := h = h^{(n+1)}$,

for $k = n, \ldots, 1$ do $\{\omega_k := q_k^Tx^{(k)};\ x^{(k-1)} := x^{(k)} - q_k(\omega_k - \zeta_k)\}$;

so $x = x^{(0)}$, $w = (\omega_1, \ldots, \omega_n)^T$. This costs $2mn$ flops compared with $mn$ flops for
$x = b - Q_1(d - z)$ in (1.6). From (2.8) we see in theory (6.2) gives $x = Q_1z + Q_2Q_2^Th$
where $h = Q_2Q_2^Tb$, so $x = h + Q_1z$. Note that $w = (\omega_1, \ldots, \omega_n)^T$ is ideally zero, see
(6.1), but can be significant when $\kappa(A)$ is large. The computation of $x$ here can be
seen to reorthogonalize each $x^{(k)}$ against the corresponding $q_k$ before adding on $q_k\zeta_k$
to give $x^{(k-1)}$. The complete algorithm is then:
Algorithm 6.1. Backward Stable Algorithm for the ASF based on MGS.
1. Carry out MGS on $A$ to give $Q_1 = (q_1, \ldots, q_n)$ and $R$.
2. Solve $R^Tz = c$ for $z = (\zeta_1, \ldots, \zeta_n)^T$.
3. for $k = 1, \ldots, n$ do $\{\delta_k := q_k^Tb;\ b := b - q_k\delta_k\}$;
4. for $k = n, \ldots, 1$ do $\{\omega_k := q_k^Tb;\ b := b - q_k(\omega_k - \zeta_k)\}$; $x := b$;
5. Solve $Ry = d - z$ for $y$, where $d = (\delta_1, \ldots, \delta_n)^T$.
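A direct transcription of Algorithm 6.1 (a sketch in NumPy, ours; `mgs` is the routine from Section 2 and `solve_triangular` is SciPy's):

```python
import numpy as np
from scipy.linalg import solve_triangular

def asf_mgs(A, b, c):
    """Algorithm 6.1: backward stable solution of the augmented system (1.2)
    using Q1 and R from MGS.  Returns x and y."""
    Q1, R = mgs(A)                             # step 1
    n = R.shape[0]
    z = solve_triangular(R, c, trans='T')      # step 2: R^T z = c
    b = np.array(b, dtype=float)               # work on a copy
    d = np.empty(n)
    for k in range(n):                         # step 3: forward sweep
        d[k] = Q1[:, k] @ b
        b -= Q1[:, k] * d[k]
    for k in range(n - 1, -1, -1):             # step 4: reorthogonalizing sweep
        w_k = Q1[:, k] @ b
        b -= Q1[:, k] * (w_k - z[k])
    x = b
    y = solve_triangular(R, d - z)             # step 5: R y = d - z
    return x, y
```

In line with the specializations discussed below, setting $b = 0$ here gives the minimum 2-norm solution of $A^Tx = c$ (LUS), and setting $c = 0$ gives the least squares problem (LLS).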
A weakness in some other MGS-based algorithms is that the reorthogonalization
in step 4 is not done. This is the case for the two algorithms denoted (3.4) and (3.6)
in [1]. The first is equivalent to (1.6) and the second is the Huang algorithm [8] which,
instead of steps 3 and 4, does (using our notation)

for $k = 1, \ldots, n$ do $\{\delta_k := q_k^Tb;\ b := b - q_k(\delta_k - \zeta_k)\}$; $x := b$.
The following implementation issues and specializations of the algorithm are fairly
obvious. Steps 1, 2 and 3 can be combined, and there is a lot of parallelism inherent
in these. When these are complete, steps 4 and 5 can be carried out independently.
For (1.3), step 5 can be omitted if the vector of Lagrange multipliers y is not needed,
while for (1.4), step 4 can be omitted if the residual x is not needed.
If b = 0, corresponding to LUS, then d = 0 and step 3 will be omitted, and step 5
too if the Lagrange multipliers are not needed. If c = 0, corresponding to LLS, then
z = 0 and step 2 will be omitted, and step 4 too if the LLS residual x is not needed.
Then the algorithm is equivalent to the following variant of MGS:
$$(6.3)\qquad (A,\ b) = (Q_1,\ h)\begin{pmatrix} R & d \\ 0 & 1 \end{pmatrix},\qquad y = R^{-1}d,$$
where d is computed as part of MGS. This is the approach recommended for LLS in [3].
The work here is another way of proving the backward stability of this approach, and
adds insight into why it works. For LUS however, the numerically stable algorithm
made of steps 1,2 and 4 constitutes a new algorithm which is superior to the usual
approach that omits the $\omega_k$ in step 4.
If $A$ is square and nonsingular, (1.3) becomes the solution of $A^Tx = c$, and $x$ is
independent of $b$, so if $y$ is not wanted then $b$ can be taken as zero in the algorithm,
and steps 3 and 5 dropped. Similarly if $A$ is square and nonsingular and $c = 0$ then
(1.4) becomes $Ay = b$, and steps 2 and 4 can be dropped. This gives two different
backward stable algorithms for solving nonsingular systems using MGS. Note that the
first algorithm applies MGS to the rows of the matrix (here $A^T$), and is numerically
invariant under row scalings. The second algorithm applies MGS to the columns of
$A$, and is invariant under column scalings. Hence the first algorithm is to be preferred
if the matrix is badly row scaled, the second if $A$ is badly column scaled.
A square root free version of Algorithm 6.1 is obtained if we instead use the
factorization (2.4), $A = Q_1'R'$, where $R'$ is unit upper triangular:
Algorithm 6.2.
1. Carry out MGS on $A$ to give $Q_1' = (q_1', \ldots, q_n')$, $R'$, and $D = \mathrm{diag}(\delta_1, \ldots, \delta_n)$,
where $\delta_i = \|q_i'\|_2^2$.
2. Solve $(R')^TDz' = c$ for $z' = (\zeta_1', \ldots, \zeta_n')^T$.
3. for $k = 1, \ldots, n$ do $\{\delta_k' := (q_k')^Tb/\delta_k;\ b := b - q_k'\delta_k'\}$;
4. for $k = n, \ldots, 1$ do $\{\omega_k' := (q_k')^Tb/\delta_k;\ b := b - q_k'(\omega_k' - \zeta_k')\}$; $x := b$;
5. Solve $R'y = d' - z'$ for $y$, where $d' = (\delta_1', \ldots, \delta_n')^T$.
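Algorithm 6.2 admits an equally direct transcription; the sketch below (ours) also includes the square root free factorization (2.4) it relies on:

```python
import numpy as np
from scipy.linalg import solve_triangular

def mgs_sqrt_free(A):
    """Square root free MGS (2.4): A = Q1' R' with R' unit upper triangular
    and delta_k = (q_k')^T q_k'; no square roots are taken."""
    A = np.array(A, dtype=float)
    m, n = A.shape
    Rp = np.eye(n)
    delta = np.empty(n)
    for k in range(n):
        delta[k] = A[:, k] @ A[:, k]
        for j in range(k + 1, n):
            Rp[k, j] = (A[:, k] @ A[:, j]) / delta[k]
            A[:, j] -= A[:, k] * Rp[k, j]
    return A, Rp, delta                        # A now holds Q1' = (q_1', ..., q_n')

def asf_mgs_sqrt_free(A, b, c):
    """Algorithm 6.2: square root free variant of Algorithm 6.1."""
    Qp, Rp, delta = mgs_sqrt_free(A)
    n = Rp.shape[0]
    # step 2: (R')^T D z' = c; note (R')^T D = (D R')^T is lower triangular
    zp = solve_triangular(delta[:, None] * Rp, c, trans='T')
    b = np.array(b, dtype=float)
    dp = np.empty(n)
    for k in range(n):                         # step 3
        dp[k] = (Qp[:, k] @ b) / delta[k]
        b -= Qp[:, k] * dp[k]
    for k in range(n - 1, -1, -1):             # step 4
        wk = (Qp[:, k] @ b) / delta[k]
        b -= Qp[:, k] * (wk - zp[k])
    y = solve_triangular(Rp, dp - zp)          # step 5: R' y = d' - z'
    return b, y                                # x = b
```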
This section has not only shown how MGS can be used in a numerically stable
way to solve the very useful linear system (1.2), along with its many specializations,
but it has hopefully shown how MGS can be used more effectively in general.
7. Comparison of MGS and Householder Factorizations. There are four
main approaches we need to compare:
1. MGS on $A$ producing computed $\bar R$ and $\bar Q_1$, and using these.
2. MGS on $A$ producing computed $\bar R$ and $\bar Q_1$, and using $\bar R$ and $P_1, \ldots, P_n$.
3. Householder transformations on $\binom{O_n}{A}$ producing $\bar R$ and $P_1, \ldots, P_n$, and using
these.
4. Householder transformations on $A$ producing $\hat R$ and $\hat P_1, \ldots, \hat P_n$ say, and using
these.
We call these approaches rather than algorithms, since each includes a reduction
algorithm, plus a choice of tools to use in problems that use the reduction. We only
consider the case of a single processor computer, and a dense matrix $A$.
Approaches 2 and 3 are numerically equivalent, but it is clearly more efficient for
computer storage to use approach 2 via (2.1) and (2.2) than to use 3, even though we
may think in terms of 3 to design algorithms which use the $P_1, \ldots, P_n$, these of course
being "stored" as $q_1, \ldots, q_n$. Thus we would use the new approach 2 rather than 3
computationally, while being aware of both their properties theoretically.
The most usual case is where we wish to use the orthogonality computationally,
but cannot rely on $\kappa(A)$ being small. Then the choice is between 2 and 4. For
the initial QR factorization MGS requires $mn^2$ flops compared to $mn^2 - n^3/3$ for
Householder. MGS also needs $n(n-1)/2$ more storage locations. Hence approach
4 has an advantage with respect to both storage and operation count for the initial
factorization, although this is small when $m \gg n$.
If an accurately orthogonal $Q$ or $Q_1$ in (1.1) is required as an entity in itself, then
since such orthogonal matrices are not immediately produced by 2 when $\kappa(A)$ is
large, the obvious choice is 4, where $Q$ (or $Q_1$) is available as the product (or part
of it) of the $\hat P_k$. To produce $Q_1$ doubles the cost using 4. To produce an accurately
orthogonal $Q_1$ with MGS in general, we apparently need to reorthogonalize. This also
approximately doubles the factorization cost, and again the operation count is higher
than for Householder.
For both approaches 2 and 4 we have shown backward stability in the usual
normwise sense. In agreement with this, both these approaches tend to give similar
accuracy, although experience shows that MGS has a small edge here, in particular if
the square root free version is used.
If the matrix $A$ is not well row scaled then row interchanges may be needed in 4
to give accurate solutions for problem LLS, see [10]. In this context it is interesting
to note that MGS is numerically invariant under row permutations of $A$ as long as
inner products are unaltered by the order of accumulation of terms. That is, if $\bar Q_1$
and $\bar R$ are the computed factors for $A$, then $\Pi\bar Q_1$ and $\bar R$ are the computed factors of
$\Pi A$ for a permutation matrix $\Pi$. This shows that 2 is more stable than 4 without row
interchanges. However, if row interchanges are included in 4, this approach is more
accurate for problems where the row norms of $A$ vary widely. In approach 2 a second
order error term of $O((wu)^2)$ appears, where $w$ is the maximum ratio of row norms.
This error term can be eliminated by reorthogonalization, which however increases
the cost of MGS.
We finally mention that sometimes $\bar R$ is used alone to solve our problems, and
then approaches 1 and 2 are identical. We will discuss this case in [5].

REFERENCES

[1] M. Arioli and A. Laratta, Error analysis of algorithms for computing the projection of a
point onto a linear manifold, Linear Algebra Appl., 82 (1986), pp. 1-26.
[2] Å. Björck, Iterative refinement of linear least squares solutions I, BIT, 7 (1967), pp. 257-278.
[3] Å. Björck, Solving linear least squares problems by Gram-Schmidt orthogonalization, BIT, 7
(1967), pp. 1-21.
[4] Å. Björck, Methods for sparse least squares problems, in Sparse Matrix Computations, J. Bunch
and D. J. Rose, eds., Academic Press, New York, 1976, pp. 177-199.
[5] Å. Björck and C. Paige, Solution of augmented linear systems using orthogonal factorizations,
BIT, 34 (1994), pp. 1-24.
[6] J. R. Bunch, The weak and strong stability of algorithms in numerical linear algebra, Linear
Algebra Appl., 88/89 (1987), pp. 49-66.
[7] G. H. Golub and C. F. Van Loan, Matrix Computations, The Johns Hopkins University Press,
Baltimore, Maryland, 2nd ed., 1989.
[8] H. Y. Huang, A direct method for the general solution of a system of linear equations, J.
Optim. Theory Appl., 16 (1975), pp. 429-445.
[9] J. W. Longley, Least Squares Computations Using Orthogonalization Methods, Marcel
Dekker, Inc., New York, 1984.
[10] M. J. D. Powell and J. K. Reid, On applying Householder's method to linear least squares
problems, in Proceedings IFIP Congress, 1968, pp. 122-126.
[11] J. Wilkinson, The Algebraic Eigenvalue Problem, Clarendon Press, Oxford, 1965.
