
Chapter 13
Block LU Factorization

Block algorithms are advantageous for at least two important reasons.
First, they work with blocks of data having $b^2$ elements,
performing $O(b^3)$ operations.
The $O(b)$ ratio of work to storage means that
processing elements with an $O(b)$ ratio of
computing speed to input/output bandwidth can be tolerated.
Second, these algorithms are usually rich in matrix multiplication.
This is an advantage because
nearly every modern parallel machine is good at matrix multiplication.
— ROBERT S. SCHREIBER, Block Algorithms for Parallel Machines (1988)

It should be realized that, with partial pivoting,
any matrix has a triangular factorization.
DECOMP actually works faster when zero pivots occur because they mean that
the corresponding column is already in triangular form.
— GEORGE E. FORSYTHE, MICHAEL A. MALCOLM, and CLEVE B. MOLER,
Computer Methods for Mathematical Computations (1977)

It was quite usual when dealing with very large matrices to
perform an iterative process as follows:
the original matrix would be read from cards and the reduced matrix punched
without more than a single row of the original matrix
being kept in store at any one time;
then the output hopper of the punch would be
transferred to the card reader and the iteration repeated.
— MARTIN CAMPBELL-KELLY, Programming the Pilot ACE (1981)


13.1. Block Versus Partitioned LU Factorization


As we noted in Chapter 9 (Notes and References), Gaussian elimination (GE) comprises three nested loops that can be ordered in six ways, each yielding a different
algorithmic variant of the method. These variants involve different computational
kernels: inner product and saxpy operations (level-1 BLAS), or outer product and
gaxpy operations (level-2 BLAS). To introduce matrix–matrix operations (level-3
BLAS), which are beneficial for high-performance computing, further manipula-
tion beyond loop reordering is needed. We will use the following terminology,
which emphasises an important distinction.
A partitioned algorithm is a scalar (or point) algorithm in which the operations
have been grouped and reordered into matrix operations.
A block algorithm is a generalization of a scalar algorithm in which the basic
scalar operations become matrix operations ($\alpha + \beta$, $\alpha\beta$, and $\alpha/\beta$ become $A + B$,
$AB$, and $AB^{-1}$), and a matrix property based on the nonzero structure becomes
the corresponding property blockwise (in particular, the scalars 0 and 1 become
the zero matrix and the identity matrix, respectively). A block factorization is
defined in a similar way and is usually what a block algorithm computes.
A partitioned version of the outer product form of LU factorization may be
developed as follows. For $A \in \mathbb{R}^{n\times n}$ and a given block size $r$, write
$$\begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix} = \begin{bmatrix} L_{11} & 0 \\ L_{21} & I_{n-r} \end{bmatrix} \begin{bmatrix} I_r & 0 \\ 0 & S \end{bmatrix} \begin{bmatrix} U_{11} & U_{12} \\ 0 & I_{n-r} \end{bmatrix}, \qquad (13.1)$$
where $A_{11}$ is $r \times r$.

Algorithm 13.1 (partitioned LU factorization). This algorithm computes an LU factorization $A = LU \in \mathbb{R}^{n\times n}$ using a partitioned outer product implementation, with block size $r$ and the notation (13.1).
1. Factor $A_{11} = L_{11}U_{11}$.
2. Solve $L_{11}U_{12} = A_{12}$ for $U_{12}$.
3. Solve $L_{21}U_{11} = A_{21}$ for $L_{21}$.
4. Form $S = A_{22} - L_{21}U_{12}$.
5. Repeat steps 1–4 on $S$ to obtain $L_{22}$ and $U_{22}$.
Note that in step 4, $S = A_{22} - A_{21}A_{11}^{-1}A_{12}$ is the Schur complement of $A_{11}$ in $A$. Steps 2 and 3 require the solution of multiple right-hand side triangular systems, so steps 2–4 are all level-3 BLAS operations. This partitioned algorithm does precisely the same arithmetic operations as any other variant of GE, but it does them in an order that permits them to be expressed as matrix operations.
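To make the steps concrete, here is a minimal NumPy sketch of Algorithm 13.1. It is illustrative only: there is no pivoting, so it assumes every pivot block factorizes safely, and the helper names (`point_lu`, `partitioned_lu`) are not from the book.

```python
import numpy as np
from scipy.linalg import solve_triangular

def point_lu(A):
    """Unpivoted point LU; assumes nonsingular leading principal minors."""
    n = A.shape[0]
    L, U = np.eye(n), A.copy()
    for j in range(n - 1):
        L[j+1:, j] = U[j+1:, j] / U[j, j]
        U[j+1:, j:] -= np.outer(L[j+1:, j], U[j, j:])
        U[j+1:, j] = 0.0
    return L, U

def partitioned_lu(A, r):
    """Algorithm 13.1: partitioned outer product LU with block size r."""
    A = np.array(A, dtype=float)
    n = A.shape[0]
    L, U = np.eye(n), np.zeros((n, n))
    for k in range(0, n, r):
        b = min(r, n - k)
        L11, U11 = point_lu(A[k:k+b, k:k+b])              # step 1
        L[k:k+b, k:k+b], U[k:k+b, k:k+b] = L11, U11
        if k + b < n:
            # step 2: solve L11 U12 = A12 (unit lower triangular solve)
            U12 = solve_triangular(L11, A[k:k+b, k+b:],
                                   lower=True, unit_diagonal=True)
            # step 3: solve L21 U11 = A21, i.e. U11^T L21^T = A21^T
            L21 = solve_triangular(U11, A[k+b:, k:k+b].T, trans='T').T
            U[k:k+b, k+b:], L[k+b:, k:k+b] = U12, L21
            # step 4: Schur complement -- the level-3 BLAS update
            A[k+b:, k+b:] -= L21 @ U12
    return L, U
```

For a matrix that is safe to factorize without pivoting, such as `A = rng.standard_normal((8, 8)) + 8*np.eye(8)`, the factors returned by `partitioned_lu(A, 3)` satisfy `np.allclose(L @ U, A)`.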
A genuine block algorithm computes a block LU factorization, which is a factorization $A = LU \in \mathbb{R}^{n\times n}$, where $L$ and $U$ are block triangular and $L$ has identity matrices on the diagonal:
$$L = \begin{bmatrix} I & & & \\ L_{21} & I & & \\ \vdots & \ddots & \ddots & \\ L_{m1} & \dots & L_{m,m-1} & I \end{bmatrix}, \qquad U = \begin{bmatrix} U_{11} & U_{12} & \dots & U_{1m} \\ & U_{22} & & \vdots \\ & & \ddots & U_{m-1,m} \\ & & & U_{mm} \end{bmatrix}.$$

In general, the blocks can be of different dimensions. Note that this factorization is not the same as a standard LU factorization, because $U$ is not triangular. However, the standard and block LU factorizations are related as follows: if $A = LU$ is a block LU factorization and each $U_{ii}$ has an LU factorization $U_{ii} = \bar L_{ii}\bar U_{ii}$, then $A = L\,\mathrm{diag}(\bar L_{ii}) \cdot \mathrm{diag}(\bar U_{ii})\,U$ is an LU factorization. Conditions for the existence of a block LU factorization are easy to state.

Theorem 13.2. The matrix $A = (A_{ij})_{i,j=1}^m \in \mathbb{R}^{n\times n}$ has a unique block LU factorization if and only if the first $m-1$ leading principal block submatrices of $A$ are nonsingular.
Proof. The proof is entirely analogous to the proof of Theorem 9.1.
This theorem makes clear that a block LU factorization may exist when an LU
factorization does not.
If $A_{11} \in \mathbb{R}^{r\times r}$ is nonsingular we can write
$$A = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix} = \begin{bmatrix} I & 0 \\ L_{21} & I \end{bmatrix} \begin{bmatrix} A_{11} & A_{12} \\ 0 & S \end{bmatrix}, \qquad (13.2)$$
which describes one block step of an outer-product-based algorithm for computing a block LU factorization. Here, $S$ is again the Schur complement of $A_{11}$ in $A$. If the (1,1) block of $S$ of appropriate dimension is nonsingular then we can factorize $S$ in a similar manner, and this process can be continued recursively to obtain the complete block LU factorization. The overall algorithm can be expressed as follows.

Algorithm 13.3 (block LU factorization). This algorithm computes a block LU factorization $A = LU \in \mathbb{R}^{n\times n}$, using the notation (13.2).
1. $U_{11} = A_{11}$, $U_{12} = A_{12}$.
2. Solve $L_{21}A_{11} = A_{21}$ for $L_{21}$.
3. $S = A_{22} - L_{21}A_{12}$.
4. Compute the block LU factorization of $S$, recursively.
Given a block LU factorization of $A$, the solution to a system $Ax = b$ can be obtained by solving $Ly = b$ by forward substitution (since $L$ is triangular) and solving $Ux = y$ by block back substitution. There is freedom in how step 2 of Algorithm 13.3 is accomplished, and in how the linear systems with coefficient matrices $U_{ii}$ that arise in the block back substitution are solved. The two main possibilities are as follows.
Implementation 1: $A_{11}$ is factorized by GEPP. Step 2 and the solution of linear systems with $U_{ii}$ are accomplished by substitution with the LU factors of $A_{11}$.
Implementation 2: $A_{11}^{-1}$ is computed explicitly, so that step 2 becomes a matrix multiplication and $Ux = y$ is solved entirely by matrix–vector multiplications. This approach is attractive for parallel machines.
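The following sketch, under the same illustrative conventions, shows Algorithm 13.3 with Implementation 1 (GEPP on each pivot block, triangular solves for step 2) together with the forward substitution and block back substitution just described; `block_lu` and `block_lu_solve` are assumed names.

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve, solve_triangular

def block_lu(A, r):
    """Algorithm 13.3, Implementation 1; returns block triangular L and U."""
    A = np.array(A, dtype=float)
    n = A.shape[0]
    L, U = np.eye(n), np.zeros((n, n))
    for k in range(0, n, r):
        b = min(r, n - k)
        U[k:k+b, k:] = A[k:k+b, k:]            # step 1: U11 = A11, U12 = A12
        if k + b < n:
            fac = lu_factor(A[k:k+b, k:k+b])   # GEPP on the pivot block A11
            # step 2: solve L21 A11 = A21, i.e. A11^T L21^T = A21^T
            L[k+b:, k:k+b] = lu_solve(fac, A[k+b:, k:k+b].T, trans=1).T
            # step 3: Schur complement S = A22 - L21 A12
            A[k+b:, k+b:] -= L[k+b:, k:k+b] @ A[k:k+b, k+b:]
    return L, U

def block_lu_solve(L, U, b, r):
    """Solve Ax = b given A = LU: forward substitution with the (unit lower
    triangular) L, then block back substitution; each diagonal block Uii is
    full, so its systems are solved by GEPP (Implementation 1)."""
    y = solve_triangular(L, b, lower=True, unit_diagonal=True)
    n = len(b)
    x = np.zeros(n)
    for k in reversed(range(0, n, r)):
        j = min(k + r, n)
        x[k:j] = np.linalg.solve(U[k:j, k:j], y[k:j] - U[k:j, j:] @ x[j:])
    return x
```

A call such as `block_lu_solve(*block_lu(A, r), b, r)` should agree with `np.linalg.solve(A, b)` whenever the required leading principal block submatrices are nonsingular.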
A particular case of partitioned LU factorization is recursively partitioned LU factorization. Assuming, for simplicity, that $n$ is even, we write
$$\begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix} = \begin{bmatrix} L_{11} & 0 \\ L_{21} & I_{n/2} \end{bmatrix} \begin{bmatrix} I_{n/2} & 0 \\ 0 & S \end{bmatrix} \begin{bmatrix} U_{11} & U_{12} \\ 0 & I_{n/2} \end{bmatrix}, \qquad (13.3)$$
where each block is $n/2 \times n/2$. The algorithm is as follows.

Algorithm 13.4 (recursively partitioned LU factorization). This algorithm computes an LU factorization $A = LU \in \mathbb{R}^{n\times n}$ using a recursive partitioning, with the notation (13.3).
1. Recursively factorize $\begin{bmatrix} A_{11} \\ A_{21} \end{bmatrix} = \begin{bmatrix} L_{11} \\ L_{21} \end{bmatrix} U_{11}$.
2. Solve $L_{11}U_{12} = A_{12}$ for $U_{12}$.
3. Form $S = A_{22} - L_{21}U_{12}$.
4. Recursively factorize $S = L_{22}U_{22}$.
In contrast with Algorithm 13.1, this recursive algorithm does not require a block size to be chosen. Intuitively, the recursive algorithm maximizes the dimensions of the matrices that are multiplied in step 3: at the top level of the recursion two $n/2 \times n/2$ matrices are multiplied, at the next level two $n/4 \times n/4$ matrices, and so on. Toledo [1145, ] shows that Algorithm 13.4 transfers fewer words of data between primary and secondary computer memory than Algorithm 13.1, and that it outperforms Algorithm 13.1 on a range of computers. He also shows that the large matrix multiplications in Algorithm 13.4 enable it to benefit particularly well from the use of Strassen's fast matrix multiplication method (see §23.1).
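A recursive sketch in the spirit of Algorithm 13.4 follows. One liberty is taken: the tall factorization of step 1 is realized here as a recursive LU of $A_{11}$ followed by a triangular solve for $L_{21}$, which is one natural way to implement it; as before there is no pivoting and the names are illustrative.

```python
import numpy as np
from scipy.linalg import solve_triangular

def recursive_lu(A):
    """Recursively partitioned LU (cf. Algorithm 13.4), no pivoting."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    if n == 1:
        return np.ones((1, 1)), A.copy()
    m = n // 2
    # Step 1: factorize the first block column [A11; A21] = [L11; L21] U11,
    # here via a recursive LU of A11 and a triangular solve for L21.
    L11, U11 = recursive_lu(A[:m, :m])
    L21 = solve_triangular(U11, A[m:, :m].T, trans='T').T
    # Step 2: solve L11 U12 = A12.
    U12 = solve_triangular(L11, A[:m, m:], lower=True, unit_diagonal=True)
    # Step 3: the Schur complement -- the large matrix multiplication.
    S = A[m:, m:] - L21 @ U12
    # Step 4: recurse on S.
    L22, U22 = recursive_lu(S)
    L = np.block([[L11, np.zeros((m, n - m))], [L21, L22]])
    U = np.block([[U11, U12], [np.zeros((n - m, m)), U22]])
    return L, U
```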
What can be said about the numerical stability of partitioned and block LU
factorization? Because the partitioned algorithms are just rearrangements of stan-
dard GE, the standard error analysis applies if the matrix operations are computed
in the conventional way. However, if fast matrix multiplication techniques are used
(for example, Strassen’s method), the standard results are not applicable. Stan-
dard results are, in any case, not applicable to block LU factorization; its stability
can be very different from that of LU factorization. Therefore we need error anal-
ysis for both partitioned and block LU factorization based on general assumptions
that permit the use of fast matrix multiplication.
Unless otherwise stated, in this chapter an unsubscripted norm denotes $\|A\| := \max_{i,j}|a_{ij}|$. We make two assumptions about the underlying level-3 BLAS (matrix–matrix operations).
(1) If $A \in \mathbb{R}^{m\times n}$ and $B \in \mathbb{R}^{n\times p}$ then the computed approximation $\widehat C$ to $C = AB$ satisfies
$$\widehat C = AB + \Delta C, \qquad \|\Delta C\| \le c_1(m,n,p)u\|A\|\,\|B\| + O(u^2), \qquad (13.4)$$
where $c_1(m,n,p)$ denotes a constant depending on $m$, $n$, and $p$.
(2) The computed solution $\widehat X$ to the triangular systems $TX = B$, where $T \in \mathbb{R}^{m\times m}$ and $B \in \mathbb{R}^{m\times p}$, satisfies
$$T\widehat X = B + \Delta B, \qquad \|\Delta B\| \le c_2(m,p)u\|T\|\,\|\widehat X\| + O(u^2). \qquad (13.5)$$
For conventional multiplication and substitution, conditions (13.4) and (13.5) hold with $c_1(m,n,p) = n^2$ and $c_2(m,p) = m^2$. For implementations based on Strassen's method, (13.4) and (13.5) hold with $c_1$ and $c_2$ rather complicated functions of the dimensions $m$, $n$, $p$, and the threshold $n_0$ that determines the level of recursion (see Theorem 23.2 and [592, ]).

13.2. Error Analysis of Partitioned LU Factorization


An error analysis for partitioned LU factorization must answer two questions. The first is whether partitioned LU factorization becomes unstable in some fundamental way when fast matrix multiplication is used. The second is whether the constants in (13.4) and (13.5) are propagated stably into the final error bound (exponential growth of the constants would be disastrous).
We will analyse Algorithm 13.1 and will assume that the block level LU factorization is done in such a way that the computed LU factors of $A_{11} \in \mathbb{R}^{r\times r}$ satisfy
$$\widehat L_{11}\widehat U_{11} = A_{11} + \Delta A_{11}, \qquad \|\Delta A_{11}\| \le c_3(r)u\|\widehat L_{11}\|\,\|\widehat U_{11}\| + O(u^2). \qquad (13.6)$$

Theorem 13.5 (Demmel and Higham). Under the assumptions (13.4)–(13.6), the LU factors of $A \in \mathbb{R}^{n\times n}$ computed using the partitioned outer product form of LU factorization with block size $r$ satisfy $\widehat L\widehat U = A + \Delta A$, where
$$\|\Delta A\| \le u\bigl(\delta(n,r)\|A\| + \theta(n,r)\|\widehat L\|\,\|\widehat U\|\bigr) + O(u^2), \qquad (13.7)$$
and where
$$\delta(n,r) = 1 + \delta(n-r,r), \qquad \delta(r,r) = 0,$$
$$\theta(n,r) = \max\bigl(c_3(r),\; c_2(r,n-r),\; 1 + c_1(n-r,r,n-r) + \delta(n-r,r) + \theta(n-r,r)\bigr), \qquad \theta(r,r) = c_3(r).$$

Proof. The proof is essentially inductive. To save clutter we will omit "$+\,O(u^2)$" from each bound. For $n = r$, the result holds trivially. Consider the first block stage of the factorization, with the partitioning (13.1). The assumptions imply that
$$\widehat L_{11}\widehat U_{12} = A_{12} + \Delta A_{12}, \qquad \|\Delta A_{12}\| \le c_2(r,n-r)u\|\widehat L_{11}\|\,\|\widehat U_{12}\|, \qquad (13.8)$$
$$\widehat L_{21}\widehat U_{11} = A_{21} + \Delta A_{21}, \qquad \|\Delta A_{21}\| \le c_2(r,n-r)u\|\widehat L_{21}\|\,\|\widehat U_{11}\|. \qquad (13.9)$$
To obtain $S = A_{22} - L_{21}U_{12}$ we first compute $C = \widehat L_{21}\widehat U_{12}$, obtaining
$$\widehat C = \widehat L_{21}\widehat U_{12} + \Delta C, \qquad \|\Delta C\| \le c_1(n-r,r,n-r)u\|\widehat L_{21}\|\,\|\widehat U_{12}\|,$$
and then subtract from $A_{22}$, obtaining
$$\widehat S = A_{22} - \widehat C + F, \qquad \|F\| \le u\bigl(\|A_{22}\| + \|\widehat C\|\bigr). \qquad (13.10)$$
It follows that
$$\widehat S = A_{22} - \widehat L_{21}\widehat U_{12} + \Delta S, \qquad (13.11a)$$
$$\|\Delta S\| \le u\bigl(\|A_{22}\| + \|\widehat L_{21}\|\,\|\widehat U_{12}\| + c_1(n-r,r,n-r)\|\widehat L_{21}\|\,\|\widehat U_{12}\|\bigr). \qquad (13.11b)$$
The remainder of the algorithm consists of the computation of the LU factorization of $\widehat S$, and by our inductive assumption (13.7), the computed LU factors satisfy
$$\widehat L_{22}\widehat U_{22} = \widehat S + \Delta\widehat S, \qquad (13.12a)$$
$$\|\Delta\widehat S\| \le \delta(n-r,r)u\|\widehat S\| + \theta(n-r,r)u\|\widehat L_{22}\|\,\|\widehat U_{22}\|. \qquad (13.12b)$$

Combining (13.11) and (13.12), and bounding $\|\widehat S\|$ using (13.10), we obtain
$$\widehat L_{21}\widehat U_{12} + \widehat L_{22}\widehat U_{22} = A_{22} + \Delta A_{22},$$
$$\|\Delta A_{22}\| \le u\bigl([1 + \delta(n-r,r)]\|A_{22}\| + [1 + c_1(n-r,r,n-r) + \delta(n-r,r)]\,\|\widehat L_{21}\|\,\|\widehat U_{12}\| + \theta(n-r,r)\|\widehat L_{22}\|\,\|\widehat U_{22}\|\bigr). \qquad (13.13)$$
Collecting (13.6), (13.8), (13.9), and (13.13) we have $\widehat L\widehat U = A + \Delta A$, where bounds on $\|\Delta A_{ij}\|$ are given in the equations just mentioned. These bounds for the blocks of $\Delta A$ can be weakened slightly and expressed together in the more succinct form (13.7).
These recurrences for $\delta(n,r)$ and $\theta(n,r)$ show that the basic error constants in assumptions (13.4)–(13.6) combine additively at worst. Thus, the backward error analysis for the LU factorization is commensurate with the error analysis for the particular implementation of the BLAS3 employed in the partitioned factorization. In the case of the conventional BLAS3 we obtain a Wilkinson-style result for GE without pivoting, with $\theta(n,r) = O(n^3)$ (the growth factor is hidden in $\widehat L$ and $\widehat U$).
Although the above analysis is phrased in terms of the partitioned outer prod-
uct form of LU factorization, the same result holds for other “ijk” partitioned
forms (with slightly different constants), for example, the gaxpy or sdot forms and
the recursive factorization (Algorithm 13.4). There is no difficulty in extending
the analysis to cover partial pivoting and solution of Ax = b using the computed
LU factorization (see Problem 13.6).

13.3. Error Analysis of Block LU Factorization


Now we turn to block LU factorization. We assume that the computed matrices $\widehat L_{21}$ from step 2 of Algorithm 13.3 satisfy
$$\widehat L_{21}A_{11} = A_{21} + E_{21}, \qquad \|E_{21}\| \le c_4(n,r)u\|\widehat L_{21}\|\,\|A_{11}\| + O(u^2). \qquad (13.14)$$
We also assume that when a system $U_{ii}x_i = d_i$ of order $r$ is solved, the computed solution $\widehat x_i$ satisfies
$$(U_{ii} + \Delta U_{ii})\widehat x_i = d_i, \qquad \|\Delta U_{ii}\| \le c_5(r)u\|U_{ii}\| + O(u^2). \qquad (13.15)$$
The assumptions (13.14) and (13.15) are satisfied for Implementation 1 of Algorithm 13.3 and are sufficient to prove the following result.

Theorem 13.6 (Demmel, Higham, and Schreiber). Let $\widehat L$ and $\widehat U$ be the computed block LU factors of $A \in \mathbb{R}^{n\times n}$ from Algorithm 13.3 (with Implementation 1), and let $\widehat x$ be the computed solution to $Ax = b$. Under the assumptions (13.4), (13.14), and (13.15),
$$\widehat L\widehat U = A + \Delta A_1, \qquad (A + \Delta A_2)\widehat x = b,$$
$$\|\Delta A_i\| \le d_n u\bigl(\|A\| + \|\widehat L\|\,\|\widehat U\|\bigr) + O(u^2), \qquad i = 1\colon 2, \qquad (13.16)$$
where the constant $d_n$ is commensurate with those in the assumptions.


Proof. We omit the proof (see Demmel, Higham, and Schreiber [326, ] for details). It is similar to the proof of Theorem 13.5.
The bounds in Theorem 13.6 are valid also for other versions of block LU factorization obtained by "block loop reordering", such as a block gaxpy based algorithm.
Theorem 13.6 shows that the stability of block LU factorization is determined by the ratio $\|\widehat L\|\,\|\widehat U\|/\|A\|$ (numerical experiments show that the bounds are, in fact, reasonably sharp). If this ratio is bounded by a modest function of $n$, then $\widehat L$ and $\widehat U$ are the true factors of a matrix close to $A$, and $\widehat x$ solves a slightly perturbed system. However, $\|\widehat L\|\,\|\widehat U\|$ can exceed $\|A\|$ by an arbitrary factor, even if $A$ is symmetric positive definite or diagonally dominant by rows. Indeed, $\|L\| \ge \|L_{21}\| = \|A_{21}A_{11}^{-1}\|$, using the partitioning (13.2), and this lower bound for $\|L\|$ can be arbitrarily large. In the following two subsections we investigate this instability more closely and show that $\|L\|\,\|U\|$ can be bounded in a useful way for particular classes of $A$. Without further comment we make the reasonable assumption that $\|L\|\,\|U\| \approx \|\widehat L\|\,\|\widehat U\|$, so that these bounds may be used in Theorem 13.6.
What can be said for Implementation 2? Suppose, for simplicity, that the inverses $A_{11}^{-1}$ (which are used in step 2 of Algorithm 13.3 and in the block back substitution) are computed exactly. Then the best bounds of the forms (13.14) and (13.15) are
$$\widehat L_{21}A_{11} = A_{21} + \Delta A_{21}, \qquad \|\Delta A_{21}\| \le c_4(n,r)u\,\kappa(A_{11})\|A_{21}\| + O(u^2),$$
$$(U_{ii} + \Delta U_{ii})\widehat x_i = d_i, \qquad \|\Delta U_{ii}\| \le c_5(r)u\,\kappa(U_{ii})\|U_{ii}\| + O(u^2).$$
Working from these results, we find that Theorem 13.6 still holds provided the first-order terms in the bounds in (13.16) are multiplied by $\max_i \kappa(\widehat U_{ii})$. This suggests that Implementation 2 of Algorithm 13.3 can be much less stable than Implementation 1 when the diagonal blocks of $U$ are ill conditioned, and this is confirmed by numerical experiments.
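The governing ratio $\|\widehat L\|\,\|\widehat U\|/\|A\|$ is easy to monitor in practice. The sketch below (reusing the hypothetical `block_lu` from §13.1) makes the (1,1) block nearly singular, so that $L_{21} = A_{21}A_{11}^{-1}$, and with it the ratio, blows up.

```python
import numpy as np

def stability_ratio(A, r):
    """Compute ||L|| ||U|| / ||A|| in the max-element norm of the text."""
    L, U = block_lu(A, r)                  # sketch from Section 13.1 above
    norm = lambda M: np.abs(M).max()       # ||A|| := max_{i,j} |a_ij|
    return norm(L) * norm(U) / norm(A)

rng = np.random.default_rng(1)
A = rng.standard_normal((6, 6))
A[:3, :3] *= 1e-6                          # nearly singular (1,1) block
print(stability_ratio(A, r=3))             # typically a very large number
```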

13.3.1. Block Diagonal Dominance


One class of matrices for which block LU factorization has long been known to be stable is block tridiagonal matrices that are diagonally dominant in an appropriate block sense. A general matrix $A \in \mathbb{R}^{n\times n}$ is block diagonally dominant by columns with respect to a given partitioning $A = (A_{ij})$ and a given norm if, for all $j$,
$$\|A_{jj}^{-1}\|^{-1} - \sum_{i\ne j}\|A_{ij}\| =: \gamma_j \ge 0. \qquad (13.17)$$
This definition implicitly requires that the diagonal blocks $A_{jj}$ are all nonsingular. $A$ is block diagonally dominant by rows if $A^T$ is block diagonally dominant by columns. For the block size 1, the usual property of point diagonal dominance is obtained. Note that for the 1- and ∞-norms diagonal dominance does not imply block diagonal dominance, nor does the reverse implication hold (see Problem 13.2). Throughout our analysis of block diagonal dominance we take the norm to be an arbitrary subordinate matrix norm.
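For a uniform block size, definition (13.17) can be tested numerically; a sketch follows, with the 2-norm standing in for the chosen subordinate norm (the function name is illustrative).

```python
import numpy as np

def is_block_diag_dominant_by_cols(A, r, norm=lambda M: np.linalg.norm(M, 2)):
    """Check (13.17): 1/||Ajj^{-1}|| - sum_{i != j} ||Aij|| >= 0 for all j."""
    n = A.shape[0]
    blocks = [(k, min(k + r, n)) for k in range(0, n, r)]
    for jl, jr in blocks:
        Ajj_inv = np.linalg.inv(A[jl:jr, jl:jr])   # diagonal blocks must be nonsingular
        off = sum(norm(A[il:ir, jl:jr])
                  for il, ir in blocks if (il, ir) != (jl, jr))
        if 1.0 / norm(Ajj_inv) - off < 0:          # gamma_j < 0
            return False
    return True
```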
First, we show that for block diagonally dominant matrices a block LU factorization exists, using the key property that block diagonal dominance is inherited by the Schur complements obtained in the course of the factorization. In the analysis we assume that $A$ has $m$ block rows and columns.

Theorem 13.7 (Demmel, Higham, and Schreiber). Suppose $A \in \mathbb{R}^{n\times n}$ is nonsingular and block diagonally dominant by rows or columns with respect to a subordinate matrix norm in (13.17). Then $A$ has a block LU factorization, and all the Schur complements arising in Algorithm 13.3 have the same kind of diagonal dominance as $A$.

Proof. This proof is a generalization of Wilkinson's proof of the corresponding result for point diagonally dominant matrices [1229, , pp. 288–289], [509, , Thm. 3.4.3] (as is the proof of Theorem 13.8 below). We consider the case of block diagonal dominance by columns; the proof for row-wise diagonal dominance is analogous.
The first step of Algorithm 13.3 succeeds, since $A_{11}$ is nonsingular, producing a matrix that we can write as
$$A^{(2)} = \begin{bmatrix} U_{11} & U_{12} \\ 0 & S \end{bmatrix}.$$
For $j = 2\colon m$ we have
$$\sum_{\substack{i=2\\ i\ne j}}^{m} \|A_{ij}^{(2)}\| = \sum_{\substack{i=2\\ i\ne j}}^{m} \|A_{ij} - A_{i1}A_{11}^{-1}A_{1j}\|$$
$$\le \sum_{\substack{i=2\\ i\ne j}}^{m} \|A_{ij}\| + \|A_{1j}\|\,\|A_{11}^{-1}\| \sum_{\substack{i=2\\ i\ne j}}^{m} \|A_{i1}\|$$
$$\le \sum_{\substack{i=2\\ i\ne j}}^{m} \|A_{ij}\| + \|A_{1j}\|\,\|A_{11}^{-1}\| \bigl(\|A_{11}^{-1}\|^{-1} - \|A_{j1}\|\bigr), \quad\text{using (13.17)},$$
$$= \sum_{\substack{i=2\\ i\ne j}}^{m} \|A_{ij}\| + \|A_{1j}\| - \|A_{1j}\|\,\|A_{11}^{-1}\|\,\|A_{j1}\|$$
$$\le \|A_{jj}^{-1}\|^{-1} - \|A_{1j}\|\,\|A_{11}^{-1}\|\,\|A_{j1}\|, \quad\text{using (13.17)},$$
$$= \min_{\|x\|=1}\|A_{jj}x\| - \|A_{1j}\|\,\|A_{11}^{-1}\|\,\|A_{j1}\|$$
$$\le \min_{\|x\|=1}\|(A_{jj} - A_{j1}A_{11}^{-1}A_{1j})x\|$$
$$= \min_{\|x\|=1}\|A_{jj}^{(2)}x\|. \qquad (13.18)$$
Now if $A_{jj}^{(2)}$ is singular it follows that $\sum_{i=2,\,i\ne j}^{m}\|A_{ij}^{(2)}\| = 0$; therefore $A^{(2)}$, and hence also $A$, is singular, which is a contradiction. Thus $A_{jj}^{(2)}$ is nonsingular, and (13.18) can be rewritten as
$$\sum_{\substack{i=2\\ i\ne j}}^{m} \|A_{ij}^{(2)}\| \le \|(A_{jj}^{(2)})^{-1}\|^{-1},$$
showing that $A^{(2)}$ is block diagonally dominant by columns. The result follows by induction.
The next result allows us to bound $\|U\|$ for a block diagonally dominant matrix.

Theorem 13.8 (Demmel, Higham, and Schreiber). Let $A$ satisfy the conditions of Theorem 13.7. If $A^{(k)}$ denotes the matrix obtained after $k-1$ steps of Algorithm 13.3, then
$$\max_{k\le i,j\le m}\|A_{ij}^{(k)}\| \le 2\max_{1\le i,j\le m}\|A_{ij}\|.$$

Proof. Let $A$ be block diagonally dominant by columns (the proof for row diagonal dominance is similar). Then
$$\sum_{i=2}^{m}\|A_{ij}^{(2)}\| = \sum_{i=2}^{m}\|A_{ij} - A_{i1}A_{11}^{-1}A_{1j}\| \le \sum_{i=2}^{m}\|A_{ij}\| + \|A_{1j}\|\,\|A_{11}^{-1}\|\sum_{i=2}^{m}\|A_{i1}\| \le \sum_{i=1}^{m}\|A_{ij}\|,$$
using (13.17). By induction, using Theorem 13.7, it follows that $\sum_{i=k}^{m}\|A_{ij}^{(k)}\| \le \sum_{i=1}^{m}\|A_{ij}\|$. This yields
$$\max_{k\le i,j\le m}\|A_{ij}^{(k)}\| \le \max_{k\le j\le m}\sum_{i=k}^{m}\|A_{ij}^{(k)}\| \le \max_{k\le j\le m}\sum_{i=1}^{m}\|A_{ij}\|.$$
From (13.17), $\sum_{i\ne j}\|A_{ij}\| \le \|A_{jj}^{-1}\|^{-1} \le \|A_{jj}\|$, so
$$\max_{k\le i,j\le m}\|A_{ij}^{(k)}\| \le 2\max_{k\le j\le m}\|A_{jj}\| \le 2\max_{1\le j\le m}\|A_{jj}\| = 2\max_{1\le i,j\le m}\|A_{ij}\|.$$

The implications of Theorems 13.7 and 13.8 for stability are as follows. Suppose $A$ is block diagonally dominant by columns. Also, assume for the moment that the (subordinate) norm has the property that
$$\max_{i,j}\|A_{ij}\| \le \|A\| \le \sum_{i,j}\|A_{ij}\|, \qquad (13.19)$$
which holds for any $p$-norm, for example. The subdiagonal blocks in the first block column of $L$ are given by $L_{i1} = A_{i1}A_{11}^{-1}$, and so $\|[L_{21}^T, \dots, L_{m1}^T]^T\| \le 1$, by (13.17) and (13.19). From Theorem 13.7 it follows that $\|[L_{j+1,j}^T, \dots, L_{mj}^T]^T\| \le 1$ for $j = 2\colon m$. Since $U_{ij} = A_{ij}^{(i)}$ for $j \ge i$, Theorem 13.8 shows that $\|U_{ij}\| \le 2\|A\|$ for each block of $U$ (and $\|U_{ii}\| \le \|A\|$). Therefore $\|L\| \le m$ and $\|U\| \le m^2\|A\|$, and so $\|L\|\,\|U\| \le m^3\|A\|$. For particular norms the bounds on the blocks of $L$ and $U$ yield a smaller bound for $\|L\|$ and $\|U\|$. For example, for the 1-norm we have $\|L\|_1\|U\|_1 \le 2m\|A\|_1$ and for the ∞-norm $\|L\|_\infty\|U\|_\infty \le 2m^2\|A\|_\infty$. We conclude that block LU factorization is stable if $A$ is block diagonally dominant by columns with respect to any subordinate matrix norm satisfying (13.19).
Unfortunately, block LU factorization can be unstable when $A$ is block diagonally dominant by rows, for although Theorem 13.8 guarantees that $\|U_{ij}\| \le 2\|A\|$, $\|L\|$ can be arbitrarily large. This can be seen from the example
$$A = \begin{bmatrix} A_{11} & 0 \\ \tfrac12 I & I \end{bmatrix} = \begin{bmatrix} I & 0 \\ \tfrac12 A_{11}^{-1} & I \end{bmatrix}\begin{bmatrix} A_{11} & 0 \\ 0 & I \end{bmatrix} = LU,$$
where $A$ is block diagonally dominant by rows in any subordinate norm for any nonsingular matrix $A_{11}$. It is easy to confirm numerically that block LU factorization can be unstable on matrices of this form.
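A short numerical confirmation, assuming the example above with $A_{11} = \epsilon I$: for small $\epsilon$ the factor $L$ contains the block $\frac{1}{2}A_{11}^{-1}$, whose entries are of order $1/\epsilon$.

```python
import numpy as np

eps = 1e-8
A11 = eps * np.eye(2)                       # nonsingular but tiny
Z = np.zeros((2, 2))
A = np.block([[A11, Z], [0.5 * np.eye(2), np.eye(2)]])
L = np.block([[np.eye(2), Z], [0.5 * np.linalg.inv(A11), np.eye(2)]])
U = np.block([[A11, Z], [Z, np.eye(2)]])
print(np.allclose(L @ U, A))                # True: A = LU as in the text
print(np.abs(L).max())                      # ~ 0.5/eps, arbitrarily large
```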
Next, we bound $\|L\|\,\|U\|$ for a general matrix and then specialize to point diagonal dominance. From this point on we use the norm $\|A\| := \max_{i,j}|a_{ij}|$. We partition $A$ according to
$$A = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}, \qquad A_{11} \in \mathbb{R}^{r\times r}, \qquad (13.20)$$
and denote by $\rho_n$ the growth factor for GE without pivoting. We assume that GE applied to $A$ succeeds.
To bound $\|L\|$, we note that, under the partitioning (13.20), for the first block stage of Algorithm 13.3 we have $\|L_{21}\| = \|A_{21}A_{11}^{-1}\| \le n\rho_n\kappa(A)$ (see Problem 13.4). Since the algorithm works recursively with the Schur complement $S$, and since every Schur complement satisfies $\kappa(S) \le \rho_n\kappa(A)$ (see Problem 13.4), each subsequently computed subdiagonal block of $L$ has norm at most $n\rho_n^2\kappa(A)$. Since $U$ is composed of elements of $A$ together with elements of Schur complements of $A$,
$$\|U\| \le \rho_n\|A\|. \qquad (13.21)$$
Overall, then, for a general matrix $A \in \mathbb{R}^{n\times n}$,
$$\|L\|\,\|U\| \le n\rho_n^2\kappa(A) \cdot \rho_n\|A\| = n\rho_n^3\kappa(A)\|A\|. \qquad (13.22)$$

Thus, block LU factorization is stable for a general matrix $A$ as long as GE is stable for $A$ (that is, $\rho_n$ is of order 1) and $A$ is well conditioned.
If $A$ is point diagonally dominant by columns then, since every Schur complement enjoys the same property, we have $\|L_{ij}\| \le 1$ for $i > j$, by Problem 13.5. Hence $\|L\| = 1$. Furthermore, $\rho_n \le 2$ (Theorem 9.9 or Theorem 13.8), giving $\|U\| \le 2\|A\|$ by (13.21), and so
$$\|L\|\,\|U\| \le 2\|A\|.$$
Thus block LU factorization is perfectly stable for a matrix point diagonally dominant by columns.
If $A$ is point diagonally dominant by rows then the best we can do is to take $\rho_n \le 2$ in (13.22), obtaining
$$\|L\|\,\|U\| \le 8n\kappa(A)\|A\|. \qquad (13.23)$$
Hence for point row diagonally dominant matrices, stability is guaranteed if $A$ is well conditioned. This in turn is guaranteed if the row diagonal dominance amounts $\gamma_j$ in the analogue of (13.17) for point row diagonal dominance are sufficiently large relative to $\|A\|$, because $\|A^{-1}\|_\infty \le (\min_j \gamma_j)^{-1}$ (see Problem 8.7(a)).

13.3.2. Symmetric Positive Definite Matrices


Further useful results about the stability of block LU factorization can be derived for symmetric positive definite matrices. First, note that the existence of a block LU factorization is immediate for such matrices, since all their leading principal submatrices are nonsingular. Let $A$ be a symmetric positive definite matrix, partitioned as
$$A = \begin{bmatrix} A_{11} & A_{21}^T \\ A_{21} & A_{22} \end{bmatrix}, \qquad A_{11} \in \mathbb{R}^{r\times r}.$$
The definiteness implies certain relations among the submatrices $A_{ij}$ that can be used to obtain a stronger bound for $\|L\|_2$ than can be deduced for a general matrix (cf. Problem 13.4).

Lemma 13.9. If $A$ is symmetric positive definite then $\|A_{21}A_{11}^{-1}\|_2 \le \kappa_2(A)^{1/2}$.
Proof. This lemma is a corollary of Lemma 10.12, but we give a separate proof. Let $A$ have the Cholesky factorization
$$A = \begin{bmatrix} R_{11}^T & 0 \\ R_{12}^T & R_{22}^T \end{bmatrix}\begin{bmatrix} R_{11} & R_{12} \\ 0 & R_{22} \end{bmatrix}, \qquad R_{11} \in \mathbb{R}^{r\times r}.$$
Then $A_{21}A_{11}^{-1} = R_{12}^T R_{11} \cdot R_{11}^{-1}R_{11}^{-T} = R_{12}^T R_{11}^{-T}$, so
$$\|A_{21}A_{11}^{-1}\|_2 \le \|R_{12}\|_2\,\|R_{11}^{-1}\|_2 \le \|R\|_2\,\|R^{-1}\|_2 = \kappa_2(R) = \kappa_2(A)^{1/2}.$$

The following lemma is proved in a way similar to the second inequality in Problem 13.4.

Lemma 13.10. If $A$ is symmetric positive definite then the Schur complement $S = A_{22} - A_{21}A_{11}^{-1}A_{21}^T$ satisfies $\kappa_2(S) \le \kappa_2(A)$.

Using the same reasoning as in the last subsection, we deduce from these two lemmas that each subdiagonal block of $L$ is bounded in 2-norm by $\kappa_2(A)^{1/2}$. Therefore $\|L\|_2 \le 1 + m\kappa_2(A)^{1/2}$, where there are $m$ block stages in the algorithm. Also, it can be shown that $\|U\|_2 \le \sqrt{m}\,\|A\|_2$. Hence
$$\|L\|_2\|U\|_2 \le m\bigl(1 + m\kappa_2(A)^{1/2}\bigr)\|A\|_2. \qquad (13.24)$$
Table 13.1. Stability of block and point LU factorization. $\rho_n$ is the growth factor for GE without pivoting.

    Matrix property                      Block LU               Point LU
    Symmetric positive definite          $\kappa(A)^{1/2}$      1
    Block column diagonally dominant     1                      $\rho_n$
    Point column diagonally dominant     1                      1
    Block row diagonally dominant        $\rho_n^3\kappa(A)$    $\rho_n$
    Point row diagonally dominant        $\kappa(A)$            1
    Arbitrary                            $\rho_n^3\kappa(A)$    $\rho_n$

It follows from Theorem 13.6 that when Algorithm 13.3 is applied to a symmetric positive definite matrix $A$, the backward errors for the LU factorization and the subsequent solution of a linear system are both bounded by
$$c_n m u\|A\|_2\bigl(2 + m\kappa_2(A)^{1/2}\bigr) + O(u^2). \qquad (13.25)$$
Any resulting bound for $\|x - \widehat x\|_2/\|x\|_2$ will be proportional to $\kappa_2(A)^{3/2}$, rather than $\kappa_2(A)$ as for a stable method. This suggests that block LU factorization can lose up to 50% more digits of accuracy in $x$ than a stable method for solving symmetric positive definite linear systems. The positive conclusion to be drawn, however, is that block LU factorization is guaranteed to be stable for a symmetric positive definite matrix that is well conditioned.
The stability results for block LU factorization are summarized in Table 13.1, which tabulates a bound for $\|A - \widehat L\widehat U\|/(c_n u\|A\|)$ for block and point LU factorization for the matrix properties considered in this chapter. The constant $c_n$ incorporates any constants in the bound that depend polynomially on the dimension, so a value of 1 in the table indicates unconditional stability.
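Lemma 13.9 is easy to probe numerically. The sketch below draws a random symmetric positive definite matrix and checks $\|A_{21}A_{11}^{-1}\|_2 \le \kappa_2(A)^{1/2}$; it is a sanity check under assumed names, not from the book.

```python
import numpy as np

rng = np.random.default_rng(2)
n, r = 8, 4
B = rng.standard_normal((n, n))
A = B @ B.T + 0.1 * np.eye(n)               # symmetric positive definite
A11, A21 = A[:r, :r], A[r:, :r]
lhs = np.linalg.norm(A21 @ np.linalg.inv(A11), 2)
rhs = np.sqrt(np.linalg.cond(A, 2))         # kappa_2(A)^{1/2}
print(lhs <= rhs)                           # True, per Lemma 13.9
```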

13.4. Notes and References


The distinction between a partitioned algorithm and a block algorithm is rarely
made in the literature (exceptions include the papers by Schreiber [1021, ] and
Demmel, Higham, and Schreiber [326, ]); the term “block algorithm” is fre-
quently used to describe both types of algorithm. A partitioned algorithm might
also be called a “blocked algorithm” (as is done by Dongarra, Duff, Sorensen,
and van der Vorst [349, ]), but the similarity of this term to “block algo-
rithm” can cause confusion and so we do not recommend this terminology. Note
that in the particular case of matrix multiplication, partitioned and block algo-
rithms are equivalent. Our treatment of partitioned LU factorization has focused
on the stability aspects; for further details, particularly concerning implementa-
tion on high-performance computers, see Dongarra, Duff, Sorensen, and van der
Vorst [349, ] and Golub and Van Loan [509, ].
Recursive LU factorization is now regarded as the most efficient way in which
to implement LU factorization on machines with hierarchical memories [535, ],
[1145, ], but it has not yet been incorporated into LAPACK.

Block LU factorization appears to have first been proposed for block tridiagonal matrices, which frequently arise in the discretization of partial differential equations. References relevant to this application include Isaacson and Keller [667, , p. 59], Varah [1187, ], Bank and Rose [62, ], Mattheij [827, ], [828, ], and Concus, Golub, and Meurant [262, ].
For an application of block LU factorization to linear programming, see Elder-
sveld and Saunders [388, ].
Theorem 13.5 is from Demmel and Higham [324, ]. The results in §13.3 are
from Demmel, Higham, and Schreiber [326, ], which extends earlier analysis
of block LU factorization by Demmel and Higham [324, ].
Block diagonal dominance was introduced by Feingold and Varga [406, ],
and has been used mainly in generalizations of the Gershgorin circle theorem.
Varah [1187, ] obtained bounds on kLk and kU k for block diagonally dominant
block tridiagonal matrices; see Problem 13.1.
Theorem 13.7 is obtained in the case of block diagonal dominance by rows with $\min_j \gamma_j > 0$ by Polman [946, ]; the proof in [946, ] makes use of the corresponding result for point diagonal dominance and thus differs from the proof we have given.
At the cost of a much more difficult proof, Lemma 13.9 can be strengthened to the attainable bound $\|A_{21}A_{11}^{-1}\|_2 \le \bigl(\kappa_2(A)^{1/2} - \kappa_2(A)^{-1/2}\bigr)/2$, as shown by Demmel [307, , Thm. 4], but the weaker bound is sufficient for our purposes.

13.4.1. LAPACK
LAPACK does not implement block LU factorization, but its LU factorization
(and related) routines for full matrices employ partitioned LU factorization in
order to exploit the level-3 BLAS and thereby to be efficient on high-performance
machines.

Problems

13.1. (Varah [1187, ]) Suppose $A$ is block tridiagonal and has the block LU factorization $A = LU$ (so that $L$ and $U$ are block bidiagonal and $U_{i,i+1} = A_{i,i+1}$). Show that if $A$ is block diagonally dominant by columns then
$$\|L_{i,i-1}\| \le 1, \qquad \|U_{ii}\| \le \|A_{ii}\| + \|A_{i-1,i}\|,$$
while if $A$ is block diagonally dominant by rows then
$$\|L_{i,i-1}\| \le \|A_{i,i-1}\|/\|A_{i-1,i}\|, \qquad \|U_{ii}\| \le \|A_{ii}\| + \|A_{i,i-1}\|.$$
What can be deduced about the stability of the factorization for these two classes of matrices?
13.2. Show that for the 1- and ∞-norms diagonal dominance does not imply block diagonal dominance, and vice versa.
13.3. If $A \in \mathbb{R}^{n\times n}$ is symmetric, has positive diagonal elements, and is block diagonally dominant by rows, must it be positive definite?
13.4. Let $A \in \mathbb{R}^{n\times n}$ be partitioned
$$A = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}, \qquad A_{11} \in \mathbb{R}^{r\times r}, \qquad (13.26)$$
with $A_{11}$ nonsingular. Let $\|A\| := \max_{i,j}|a_{ij}|$. Show that $\|A_{21}A_{11}^{-1}\| \le n\rho_n\kappa(A)$, where $\rho_n$ is the growth factor for GE without pivoting on $A$. Show that the Schur complement $S = A_{22} - A_{21}A_{11}^{-1}A_{12}$ satisfies $\kappa(S) \le \rho_n\kappa(A)$.

13.5. Let $A \in \mathbb{R}^{n\times n}$ be partitioned as in (13.26), with $A_{11}$ nonsingular, and suppose that $A$ is point diagonally dominant by columns. Show that $\|A_{21}A_{11}^{-1}\|_1 \le 1$.
13.6. Show that under the conditions of Theorem 13.5 the computed solution to $Ax = b$ satisfies
$$(A + \Delta A)\widehat x = b, \qquad \|\Delta A\| \le c_n u\bigl(\|A\| + \|\widehat L\|\,\|\widehat U\|\bigr) + O(u^2),$$
and the computed solution to the multiple right-hand side system $AX = B$ (where (13.5) is assumed to hold for the multiple right-hand side triangular solves) satisfies
$$\|A\widehat X - B\| \le c_n u\bigl(\|A\| + \|\widehat L\|\,\|\widehat U\|\bigr)\|\widehat X\| + O(u^2).$$
In both cases, $c_n$ is a constant depending on $n$ and the block size.


13.7. Let $X = \begin{bmatrix} A & B \\ C & D \end{bmatrix} \in \mathbb{R}^{n\times n}$, where $A$ is square and nonsingular. Show that
$$\det(X) = \det(A)\det(D - CA^{-1}B).$$
Assuming $A$, $B$, $C$, $D$ are all $m \times m$, give a condition under which $\det(X) = \det(AD - CB)$.
13.8. By using a block LU factorization show that
$$\begin{bmatrix} A & B \\ C & D \end{bmatrix}^{-1} = \begin{bmatrix} A^{-1} + A^{-1}BS^{-1}CA^{-1} & -A^{-1}BS^{-1} \\ -S^{-1}CA^{-1} & S^{-1} \end{bmatrix},$$
where $A$ is assumed to be nonsingular and $S = D - CA^{-1}B$.


13.9. Let $A \in \mathbb{R}^{n\times m}$, $B \in \mathbb{R}^{m\times n}$. Derive the expression
$$(I - AB)^{-1} = I + A(I - BA)^{-1}B$$
by considering block LU and block UL factorizations of $\begin{bmatrix} I & A \\ B & I \end{bmatrix}$. Deduce the Sherman–Morrison–Woodbury formula
$$(T - UW^{-1}V^T)^{-1} = T^{-1} + T^{-1}U(W - V^TT^{-1}U)^{-1}V^TT^{-1},$$
where $T \in \mathbb{R}^{n\times n}$, $U \in \mathbb{R}^{n\times r}$, $W \in \mathbb{R}^{r\times r}$, $V \in \mathbb{R}^{r\times n}$.
