Kwok Thesis
A DISSERTATION
SUBMITTED TO THE PROGRAM IN SCIENTIFIC COMPUTING
AND COMPUTATIONAL MATHEMATICS
AND THE COMMITTEE ON GRADUATE STUDIES
OF STANFORD UNIVERSITY
IN PARTIAL FULFILLMENT OF THE REQUIREMENTS
FOR THE DEGREE OF
DOCTOR OF PHILOSOPHY
I certify that I have read this dissertation and that, in my opinion, it
is fully adequate in scope and quality as a dissertation for the degree
of Doctor of Philosophy.
(Khalid Aziz)
(Michael Saunders)
Abstract
flow configurations.
We also present a rigorous analysis of phase-based upstream discretization, which
is different from the classical Godunov and Engquist-Osher schemes for nonlinear
conservation laws. We show, based on a fully nonlinear analysis, that the fully im-
plicit scheme is well-defined, stable, monotonic and converges to the entropy solution
for arbitrary CFL numbers. Thus, unlike the existing linear stability analysis, our
results provide a rigorous justification for the empirical observation that fully-implicit
solutions are always stable and yield monotonic profiles.
Acknowledgement
I would like to express my utmost gratitude towards my advisor, Prof. Hamdi Tchelepi,
not only for his insights and guidance, but also for his patience and encouragement
when things did not go so well. This research would not have been possible without
his constant input and moral support. In addition, I would like to thank Prof. Khalid
Aziz for making many useful suggestions. I am also indebted to Prof. Margot Ger-
ritsen, who introduced me to porous media flow, and to Philipp Birken, who gave
constructive comments on several chapters of this dissertation.
Much appreciation goes to my office mates, Rami Younis and Marc Hesse, for our
fruitful and entertaining conversations. Our interactions were thoroughly enriching
both on a professional and a personal level, and I will really miss your company. I
also thank Yuanlin Jiang and Huanquan Pan for their help with GPRS-related issues.
Many thanks to Prof. Michael Saunders, who agreed to join the reading committee
at the very last minute and did an amazingly thorough job as a reader. Thanks also
to Indira Choudhury, who helped me tremendously in putting my orals committee
back together when it almost fell apart. Finally, I am grateful for the moral support
from my parents, who always believed in me throughout this rather long journey.
This dissertation is dedicated to the memory of Prof. Gene Golub, who passed
away two weeks before my PhD defense. I will always remember him for the depth
of his knowledge on all aspects of scientific computing, as well as his generosity and
genuine interest in the well-being of every student in SCCM/ICME. He is truly an
irreplaceable figure in our community, and he will be sorely missed.
I would like to thank the SUPRI-B reservoir simulation affiliates program for its
financial support for this research.
Contents
Abstract
Acknowledgement
1 Introduction
1.1 Governing equations
1.1.1 General mass-balance equations
1.1.2 Black-oil model
1.2 Numerical simulation of the reservoir
1.2.1 Spatial discretization
1.2.2 Temporal discretization
1.2.3 Solution of nonlinear equations
1.3 Thesis outline
2.3.5 Well-definedness of implicit monotone schemes
2.3.6 Rate of convergence of the nonlinear processes
2.3.7 Extensions
2.3.8 Maximum principle
2.4 Convergence to the entropy solution
2.5 Accuracy of phase-based upstreamed solutions
2.5.1 Refinement under fixed mesh ratio
2.5.2 Spatial refinement for fixed time steps
2.5.3 Non-uniform grids
3 Potential Ordering
3.1 Methods derived from cell-based ordering
3.2 Phase-based ordering
3.2.1 Cocurrent flow
3.2.2 Countercurrent flow due to gravity
3.2.3 Capillarity
3.2.4 Remarks on implementation
5 Linear Preconditioning
5.1 Structure of the Jacobian matrix
5.2 CPR preconditioning
5.2.1 True-IMPES reduction
5.2.2 Improved second-stage preconditioner via ordering
5.2.3 Spectrum of the preconditioned matrix
5.2.4 Numerical examples
5.3 Schur complement preconditioning
5.3.1 Spectrum and nonzero pattern
5.3.2 Convergence behavior
6 Conclusions
Bibliography
List of Tables
5.1 Convergence behavior for the block ILU(0) and CPR preconditioners.
5.2 Performance of CPR-ILU for the quarter 5-spot problem.
5.3 Performance of CPR-ILU for the upscaled SPE 10 problem.
5.4 Convergence of GMRES in the absence of gravity.
5.5 Convergence of GMRES in the presence of gravity.
List of Figures
5.1 Spectra of matrices for the cocurrent flow problem, ∆t = 1, 5.
5.2 Spectra of matrices for the cocurrent flow problem, ∆t = 20, 100.
5.3 Spectra of matrices for the countercurrent flow problem, ∆t = 1, 5.
5.4 Spectra of matrices for the countercurrent flow problem, ∆t = 20, 100.
5.5 Two configurations of the quarter 5-spot problem.
5.6 Nonzero pattern of S2 for the 1D and 2D reservoirs.
5.7 Spectrum and nonzero profiles of S2 for the 1D reservoir.
5.8 Spectrum and nonzero profiles of S2 for the 2D reservoir.
Chapter 1
Introduction
Petroleum reservoir simulation is the use of numerical techniques to solve the equa-
tions for heat and fluid flow in porous media, given the appropriate initial and bound-
ary conditions. Simulation technology has evolved tremendously since the develop-
ment of the first simulator in the 1950s. Due to the explosion of available computing
power and the ever-increasing sophistication of simulation techniques, simulation has
become an indispensable tool for reservoir engineering. Today, nearly all major
reservoir development decisions are based at least partially on simulation results [83].
Despite the growing speed and storage capacities of today’s computers, there is in-
creasing interest and necessity to simulate larger and more complex reservoir models.
As a result, the efficient simulation of miscible and immiscible fluid displacements in
underground porous media remains an important and challenging problem in reservoir
engineering.
There are several hurdles to an efficient, scalable reservoir simulator. First, the
governing PDEs exhibit a mixed hyperbolic-parabolic character due to the coupling
between the flow (pressure and total velocity) and the transport (phase saturations)
problems. In addition, rock properties such as porosity and permeability are highly
heterogeneous, leading to poor numerical conditioning of the resulting linear systems.
Finally, fluid velocities vary greatly across the domain, with near-well regions experi-
encing fast flows and some far away regions experiencing almost no flow at all. These
characteristics impose severe constraints on the numerical methods used in practical
reservoir simulation. In particular, scalable techniques that work well for specific
classes of problems (e.g., algebraic multigrid for elliptic problems [74]) no longer work
well for reservoir simulation problems.
The simplest and most widely used model in reservoir simulation is the standard
black-oil model [6]. In this model, mass transfer between the hydrocarbon liquid
and vapor phases is represented using pressure-dependent solubilities, and the com-
pressibility effects are represented using normalized densities (the so-called formation
volume factors). These simplifying assumptions on fluid properties are used to elimi-
nate the need for equation of state (EOS) and phase equilibrium calculations, which
can take up to 70% of the total simulation time [87, 16]. Thus, despite the increasing
use of compositional models, black-oil simulation still accounts for the vast major-
ity of simulations in industry. Hence, this thesis will concentrate on improving the
efficiency and robustness of black-oil simulation.
The rest of this chapter is organized as follows. In section 1.1, we derive the
PDEs that describe the black-oil model. In section 1.2, we introduce the finite-volume
discretization, as well as the various time-marching schemes that are used to integrate
the PDEs in time. We also describe the most commonly used methods to solve the
resulting system of nonlinear and linear equations. We outline the remainder of the
thesis and state our contributions in section 1.3.
The governing equations for multiphase flow in porous media are based on the conser-
vation of mass for each component. Here, a component can be either a single chemical
species (e.g., decane C10H22), or a mixture of components that behave similarly, so
that they can be lumped together into a pseudocomponent. When nc components
are present, the system of conservation laws has the form
∂ci/∂t + ∇ · Fi = qi ,   i = 1, . . . , nc ,   (1.1.1)
where ci is the mass concentration of component i, Fi is the mass flux and qi is the
source or sink term. Each component can exist in one or more immiscible fluid phases
that flow inside the pore space; typically, we consider either two-phase (aqueous and
liquid hydrocarbon) or three-phase (aqueous, liquid and vapor hydrocarbon) flow
problems. If Xij is the concentration of component i in phase j (mass per unit
volume), then the concentration of component i can be written as
ci = φ Σ_{j=1}^{np} Xij Sj ,   (1.1.2)
where φ = φ(x) is the porosity of the medium (i.e., the fraction of the bulk volume
that is open to fluid flow), np is the number of phases present, and Sj is the saturation
of phase j (i.e., the fraction of the pore volume occupied by phase j). The mass flux
Fi is the sum of the volumetric fluxes of each phase j, multiplied by the concentration
Xij . In other words,
Fi = Σ_{j=1}^{np} Xij uj ,   (1.1.3)
where uj is the volumetric flux vector of phase j. The volumetric fluxes are given by
the generalized Darcy's law:
uj = −(krj/µj) K(∇pj − γj ∇z),   (1.1.4)
where K is the absolute permeability tensor, z is the depth variable; and for each
phase j, krj = krj (S1 , . . . , Snp ) is the relative permeability of phase j, µj is the
phase viscosity, pj is the phase pressure, and γj is the gravitational force acting
on phase j. The permeability tensor K is highly variable over the domain, even
within short distances; it also exhibits complex correlation patterns over a hierarchy
of spatial scales. For simulation purposes, it is generally necessary to assume K to
be a discontinuous function of x, since it would be impractical (or even impossible)
to simulate on a scale over which K becomes continuous. This has implications for
the choice of spatial discretization, which is described in section 1.2.1.
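The phase velocity (1.1.4) is straightforward to evaluate pointwise once the rock and fluid properties are known. The following is a minimal illustrative sketch (not code from this work), assuming a scalar permeability and entirely hypothetical property values:

```python
import numpy as np

# Evaluate the generalized Darcy velocity (1.1.4) for one phase:
#   u_j = -(k_rj / mu_j) * K * (grad p_j - gamma_j * grad z).
# A scalar K is assumed; all numerical values below are hypothetical.
def darcy_velocity(K, kr, mu, grad_p, gamma, grad_z):
    return -(kr / mu) * K * (np.asarray(grad_p) - gamma * np.asarray(grad_z))

# Water phase with a pressure gradient in x and gravity acting along z.
u_w = darcy_velocity(K=1e-13, kr=0.5, mu=1e-3,
                     grad_p=[1.0e4, 0.0, 9.0e3],
                     gamma=9.8e3, grad_z=[0.0, 0.0, 1.0])
```

Note how the sign of each velocity component depends on the balance between the pressure gradient and the gravitational term, which is the mechanism behind the countercurrent flow discussed in later chapters.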
We also have a few algebraic constraints in addition to the above PDEs. Since
the fluid phases jointly fill the pore space, the saturations must sum to one:
Σ_{j=1}^{np} Sj = 1,   (1.1.5)
and the phase pressures are related by the capillary pressure constraints (1.1.6),
which specify the pressure differences between phase pairs as given functions of
the saturations.
Equations (1.1.1), (1.1.5) and (1.1.6) yield nc + np equations, and we have 2np un-
knowns corresponding to the phase pressures and saturations. In a compositional
model, the concentrations Xij are also treated as unknowns, and additional equa-
tions are needed to close the system (cf. [58]). However, for the black-oil model, we
have nc = np , and Xij are treated as known functions of pj , so that we have the
same number of equations and unknowns. Specifically, the black-oil assumptions are
as follows:
1. There are three components (water, oil, and gas), which partition into at most
three phases (aqueous, liquid hydrocarbon, and vapor hydrocarbon);
2. The water component exists only in the aqueous phase, and the oil component
exists only in the liquid hydrocarbon phase;
3. The gas component can exist in both the liquid and vapor hydrocarbon phases,
but gas solubility in the liquid phase is a pure function of pg (the vapor-phase
pressure).
With these assumptions, the mass-balance equations (1.1.1) take the form
∂(φρp Sp)/∂t + ∇ · (ρp up) = ρp qp   (1.1.7)
for the vapor phase, where Rs = Rs (pg ) is the solubility ratio. The generalized Darcy’s
law (1.1.4), which is valid for p = w, o, g, is used to obtain the phase velocities, up .
In practical simulations, we typically rewrite the PDEs in terms of a set of linearly
independent primary variables (usually Sw , Sg and pg , but one can choose any phase
pressure and any np − 1 saturations), and then use the algebraic relations (1.1.5) and
(1.1.6) to calculate the remaining variables. In addition, it is commonly assumed
that the relative permeabilities krp and capillary pressures Pcpq have the following
dependencies on saturation:
krw = krw (Sw ), kro = kro (Sw , Sg ), krg = krg (Sg ); (1.1.9)
po − pw = Pcow (Sw ), pg − po = Pcgo (Sg ). (1.1.10)
The above functions are all nonlinear with respect to the saturation variables, and
they contribute to the highly nonlinear character of the resulting PDEs [71]. The
parameterization is based on the assumption that water is the most wetting phase
and gas the least wetting phase, which is valid for most reservoirs of interest (see [6]
for more detailed explanations). We also need P′cow ≤ 0 and P′cgo ≥ 0 for the problem
to be well-posed. The resulting system of PDEs is supplemented with the boundary
conditions
pw = pwd on Γd (1.1.11)
ρw uw · ν = gwn on Γn (1.1.12)
ρo uo · ν = gon on Γn (1.1.13)
ρg ug · ν = ggn on Γn (1.1.14)
pw (x, 0) = pw0 (x), Sw (x, 0) = Sw0 (x), Sg (x, 0) = Sg0 (x), (1.1.15)
where the Dirichlet boundary Γd has positive measure, and ν denotes the outward
normal to the boundary.
φ ∂Sp/∂t − ∇ · (λp K∇(pp − γp z)) = qp   (1.1.16)
for p = o, w, and
φ ∂Sg/∂t − ∇ · (λg K∇(pg − γg z)) + φ ∂(So R̄s)/∂t − ∇ · (R̄s λo K∇(po − γo z)) = qg ,   (1.1.17)
for the gas phase, where λp = krp /µp is the (relative) mobility of phase p, and R̄s =
ρo Rs /ρg is the normalized solubility ratio. Sometimes we also consider the two-phase
flow case, which is simply the same PDEs with the gas-related equations removed.
Pressure equation
An important equation that can be derived from the mass balance equations and the
saturation constraint is the pressure equation. It can be obtained by taking a special
linear combination of the mass-balance equations (1.1.7), (1.1.8). Assume there are
no source or sink terms and no buoyancy effects, and suppose Pcow = Pcgo = 0, so
that all the phase pressures are identical. Inclusion of such terms would introduce
additional lower order terms, but would not alter the fundamental character of the
PDE. Let us multiply the water equation by 1/ρw , the gas equation by 1/ρg , and the
oil equation by (1 − R̄s )/ρo . Assuming that the pressure p is differentiable and that
φ, ρp and Rs are smooth functions of pressure, we get (after some algebra):
φcT ∂p/∂t − ∇ · (λT K∇p) − KχT |∇p|² = 0,   (1.1.18)
where
Total compressibility: cT = Sw cw + So co + Sg cg + cr ,
Total mobility: λT = λw + λo + λg ,
Mobility-weighted compressibility: χT = λw cw + λo co + λg cg .
This equation has the same structure as the model problem
∂u/∂t − a∇²u + b|∇u|² = 0,   (x, t) ∈ Rⁿ × (0, ∞),
u(x, 0) = g,
where a > 0 and b are constants [32, §4.4]. When cT ≡ 0 (the incompressible case),
(1.1.18) degenerates to an elliptic equation in p:
∇ · (λT K∇p) = 0.
The pressure equation is important because it dictates the choice of numerical meth-
ods and forms the basis for several widely used methods in reservoir simulation.
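To make the role of the elliptic pressure problem concrete, a minimal 1D sketch is given below. It discretizes ∇ · (λT K∇p) = 0 with two-point fluxes and harmonic face averages on a uniform grid with Dirichlet ends; the grid size and property values are hypothetical, and this is an illustration rather than code from this work:

```python
import numpy as np

# Minimal sketch: solve -d/dx(lam*K dp/dx) = 0 on a 1D grid of n cells with
# Dirichlet pressures at both ends, using harmonic face transmissibilities.
def solve_pressure_1d(K, lam, dx, p_left, p_right):
    n = len(K)
    mob = lam * K                                  # cell-wise lam*K
    # harmonic average at the n-1 interior faces
    t_face = 2.0 * mob[:-1] * mob[1:] / (mob[:-1] + mob[1:]) / dx**2
    # half-cell transmissibilities to the Dirichlet boundaries
    t_l, t_r = 2.0 * mob[0] / dx**2, 2.0 * mob[-1] / dx**2
    A = np.zeros((n, n)); b = np.zeros(n)
    for i in range(n - 1):                         # assemble face contributions
        A[i, i] += t_face[i]; A[i + 1, i + 1] += t_face[i]
        A[i, i + 1] -= t_face[i]; A[i + 1, i] -= t_face[i]
    A[0, 0] += t_l; b[0] += t_l * p_left
    A[-1, -1] += t_r; b[-1] += t_r * p_right
    return np.linalg.solve(A, b)

# Homogeneous medium: the discrete solution is linear between the end pressures.
p = solve_pressure_1d(K=np.ones(10), lam=np.ones(10), dx=0.1,
                      p_left=1.0, p_right=0.0)
```

For a homogeneous medium the scheme reproduces the linear pressure profile exactly at the cell centers; with discontinuous K, the harmonic averaging is what keeps the fluxes continuous across cell interfaces.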
1. Read input data (model grid geometry, permeability, porosity, fluid properties,
etc.);
2. For each time step:
• Set well locations and production/injection rates for the current time step;
• Form the nonlinear algebraic equations that arise from discretizing the
governing equations;
• Solve the resulting equations;
• Increment time;
approximate the PDEs. In this section, we provide the background for the remainder
of the thesis by briefly discussing several common discretizations and solvers; for a
broader survey of discretizations that are used in reservoir simulation, we direct the
reader to [34, 52, 83]. A discussion of time-step control and the treatment of wells is
beyond the scope of this thesis, even though these are very important considerations
in building an accurate and useful simulator (see [6] for details).
∂(φi ρp Sp)/∂t + (1/|Vi|) Σ_{l∈adj(i)} Fp,il = 0,   (1.2.1)
where |Vi | is the volume of the i-th gridblock, and Fp,il is the numerical flux function
of phase p from cell i to cell l:
Fp,il = −|∂Vil| Kil ρp,il λp(Sil) [ (pp,l − pp,i)(xl − xi)/|xl − xi|² − γp,il(zl − zi)/|xl − xi| ] · νil ,   (1.2.2)
where |∂Vil | is the area of the interface between cells i and l, xi is the location of
the center of cell i, zi is the component of xi along the direction of gravity, and ν il
is the unit normal to the cell interface, pointing from cell i to cell l. The above
discretization uses a two-point flux approximation, and we restrict ourselves to the
two-point flux case in this dissertation. One should note, however, that multipoint
flux approximations are also used occasionally in reservoir simulation, especially for
tensorial permeability fields [2, 47, 48].
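The two-point flux (1.2.2) combines interface geometry with interface-averaged properties. The sketch below evaluates it for a single cell pair, treating the bracketed difference as dotted with the interface normal; the function and all inputs are hypothetical scalars used only for illustration:

```python
import numpy as np

# Two-point flux between cells i and l in the spirit of (1.2.2); area, K_il,
# rho_il, lam_il are interface-averaged scalars and nu is the unit normal
# pointing from cell i to cell l. All names and values are hypothetical.
def two_point_flux(area, K_il, rho_il, lam_il,
                   p_i, p_l, x_i, x_l, gamma_il, z_i, z_l, nu):
    x_i, x_l, nu = (np.asarray(v, dtype=float) for v in (x_i, x_l, nu))
    d = x_l - x_i
    term_p = (p_l - p_i) * d / np.dot(d, d)                   # pressure part
    term_g = gamma_il * (z_l - z_i) / np.linalg.norm(d) * nu  # gravity part
    return -area * K_il * rho_il * lam_il * np.dot(term_p - term_g, nu)

# Without gravity, mass flows from the high-pressure cell to the low-pressure one.
F = two_point_flux(area=1.0, K_il=1.0, rho_il=1.0, lam_il=1.0,
                   p_i=2.0, p_l=1.0, x_i=[0, 0, 0], x_l=[1, 0, 0],
                   gamma_il=0.0, z_i=0.0, z_l=0.0, nu=[1, 0, 0])
```

In a full simulator the mobility λp(Sil) would be evaluated at an upstream-weighted saturation, which is the subject of the next paragraph.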
The literature on finite volume methods for multiphase flow is vast [87, 79, 16], and
[6] describes the method in detail for various flow configurations. On the other hand,
the use of finite-element methods for general-purpose simulation in industry is rare.
Finite element methods are more flexible in terms of the treatment of unstructured
grids, irregular boundaries, as well as anisotropic or tensorial permeability fields. As
a result, there is active interest in using finite-element methods to develop finite-
volume discretizations [53]. In this thesis, we restrict our discussion to finite volume
methods, but the reader is referred to [1, 43, 86, 31] for more detailed discussion on
finite element methods.
A peculiar feature of the spatial discretization used in reservoir simulation is the
upstream weighting of saturation-dependent terms. Buoyancy and capillary forces
may introduce sonic points into the hyperbolic flux function (see Figure 2.1), but the exact
location of the sonic point is a strong function of the total velocity and permeability, so
it would be inconvenient to locate the sonic point for every cell interface. In practical
simulations, the upstream direction for phase p is determined by the potential gradient
of phase p. Since different phases can have different upstream directions, the resulting
numerical flux functions are in fact a combination of mobilities, each evaluated at a
different saturation. It can be shown [13] that these numerical flux functions are
different from those used in classical CFD, such as the Godunov and Engquist-Osher
schemes. In Chapter 2, we will study this upstream weighting in detail and discuss
its convergence to the analytical solution under grid refinement.
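In code, phase-based upstream weighting amounts to selecting, for each phase independently, the mobility from the cell whose phase potential is higher. The following sketch with hypothetical values illustrates how countercurrent flow makes different phases upstream from different cells:

```python
# Minimal sketch of phase-based upstream weighting: each phase's interface
# mobility is taken from the cell that is upstream with respect to that
# phase's own potential gradient. All names and values are hypothetical.
def upstream_mobilities(pot_i, pot_l, lam_i, lam_l):
    """pot_*: per-phase potentials in cells i and l; lam_*: per-phase
    mobilities. Returns the interface mobility for each phase."""
    lam_face = []
    for ph in range(len(pot_i)):
        # a phase flows from higher to lower potential; pick the upstream cell
        if pot_i[ph] >= pot_l[ph]:
            lam_face.append(lam_i[ph])
        else:
            lam_face.append(lam_l[ph])
    return lam_face

# Countercurrent example: the first phase's potential drives flow i -> l,
# while buoyancy makes the second phase flow l -> i.
lam = upstream_mobilities(pot_i=[2.0, 1.0], pot_l=[1.0, 3.0],
                          lam_i=[0.8, 0.1], lam_l=[0.2, 0.7])
```

The resulting interface flux thus mixes mobilities evaluated at different saturations, which is precisely why the scheme differs from the Godunov and Engquist-Osher fluxes.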
(3) updating the saturations using the mass-balance equations (1.1.7), (1.1.8)
and a forward difference approximation for ∂/∂t. Because of the explicit treat-
ment of saturation, IMPES is only conditionally stable; the CFL condition for
a 1D two-phase incompressible oil-water problem without gravity is given by
(cf. [20])
∆t < φ / [ 2Kλw λo |dPcow/dSw| / ((λw + λo)∆x²) + vT (dfw/dSw)/∆x ] ,   (1.2.3)
where vT is the total velocity of the oil and water phases, and fw is the fractional
flow of the water phase:
fw = λw/(λw + λo) · [ 1 + (Kλo/vT) ∂Pcow/∂x ] .
In the absence of capillarity, (1.2.3) reduces to the familiar CFL condition
∆t < φ∆x/(vT dfw/dSw) for the hyperbolic conservation law.
3. Adaptive implicit method (AIM): This method changes the level of implicitness
adaptively for each cell, depending on the CFL limit for that cell. For a cell
experiencing fast flows (i.e., the local CFL number is greater than 1), both
the saturation and pressure are taken implicitly; if, on the other hand, the
local CFL number is less than 1, the saturations are taken explicitly, whereas
pressure is taken implicitly. More detailed descriptions and analyses can be
found in [76, 36, 67, 26].
4. Fully implicit method (FIM): Both saturation and pressure variables are taken
implicitly in every cell. A linear stability analysis [6], together with a more
refined analysis based on linearized mobilities [61], strongly indicate (but do not
provide a rigorous proof) that this method is unconditionally stable. However,
it is also generally the most diffusive of the above mentioned schemes.
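For reference, the explicit-saturation stability limit (1.2.3) can be evaluated directly from the cell properties; a minimal sketch, with all property values hypothetical:

```python
# Minimal sketch of the IMPES time-step limit (1.2.3) for a 1D oil-water
# problem. All property values passed below are hypothetical.
def impes_dt_limit(phi, K, lam_w, lam_o, dPc_dSw, vT, dfw_dSw, dx):
    # capillary (diffusive) contribution to the stability denominator
    diff = 2.0 * K * lam_w * lam_o * abs(dPc_dSw) / ((lam_w + lam_o) * dx**2)
    # convective (hyperbolic) contribution
    conv = vT * dfw_dSw / dx
    return phi / (diff + conv)

# Without capillarity the limit reduces to dt < phi*dx / (vT * dfw/dSw).
dt = impes_dt_limit(phi=0.2, K=1.0, lam_w=1.0, lam_o=1.0,
                    dPc_dSw=0.0, vT=1.0, dfw_dSw=2.0, dx=0.1)
```

Because the capillary term scales like 1/∆x², it dominates the limit on fine grids, which is one reason explicit treatment of capillarity is particularly restrictive.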
These methods differ in the level of implicitness of the saturation-dependent quanti-
ties, with IMPES having the least degree of implicitness and FIM having the most.
Note that pressure is treated implicitly in all methods. This is because the pressure
equation is either weakly parabolic (and nearly elliptic) in the compressible case, or
elliptic in the incompressible case. Hence, in the compressible case, explicit pressure
treatment would entail a time-step restriction proportional to ∆x², which is unacceptably
severe. In the incompressible case, the pressure equation degenerates into a
constraint that is required to ensure global conservation, which must be satisfied by
the numerical solution. Thus, it is also necessary to treat pressure implicitly in the
incompressible case.
Clearly, a method with a lower level of implicitness would incur a lower compu-
tational cost per time step. However, the difference in computational cost between
explicit and implicit methods (such as IMPES and FIM) is not as pronounced as
one would expect, since the “explicit” IMPES still needs to solve an implicit pres-
sure equation at every time step. Figure 1.1 shows the amount of time the simulator
spends in each module during a typical black-oil simulation when FIM is used. Even
for FIM, the pressure solve represents almost half of the total running time, and
about 60% of the solver time. So in this case, IMPES would be faster than FIM
only if the FIM time step is chosen such that the maximum CFL number is less than
1.67. In practice, reasonable time steps yield maximum CFL numbers that are much
larger than 1 because of the presence of sources and sinks, as well as spatial variations
in permeability and porosity. However, the impact of these high CFL numbers on
overall accuracy is minimal because they only occur in a few cells. Figure 1.2 shows
the saturation profiles for the FIM and IMPES solutions in a 2D water flood prob-
lem. The maximum saturation difference between the two solutions is 0.036, which is
negligible considering the uncertainty in the reservoir characterization. In this case,
FIM takes only 113 time steps to reach Tfinal , whereas IMPES takes 1318 steps, so
FIM is clearly more efficient.
The above example, in which the high CFL numbers do not significantly affect
solution accuracy, is typical among reservoir models of practical interest. Such models
are generally highly heterogeneous with permeability variations up to several orders
of magnitude. Moreover, wells can be completed anywhere in the reservoir model
and can operate in a wide variety of ways, often resulting in CFL limits that are
unacceptably severe. Thus, reservoir simulators typically use implicit time-stepping
for robustness and efficiency. Consequently, efficient linear and nonlinear solvers for
the fully-implicit problem can be the crucial factor in determining the efficiency of
reservoir simulators.
Figure 1.2: A comparison between FIM (top) and IMPES (bottom) saturation profiles
for a 2D heterogeneous reservoir. The permeability and porosity fields are taken from
the 51st layer of the SPE 10 reservoir [19].
Higher-order methods for reservoir simulation have been an active area of research
in recent years. With the exception of streamline methods, which can take advantage
of high-order 1D integrators readily [52], higher-order methods are still primarily in
the development stage and are not yet routinely used in commercial simulators. A
major impediment to the widespread adoption of higher-order methods is the loss
of positivity, which leads to spurious oscillations as the initial profile is integrated
forward in time. An important result due to Bolley and Crouzeix [12] states that
a method that preserves positivity for all ∆t is at most first-order accurate. An
elaborate discussion on higher-order methods is beyond the scope of this thesis; see
[9, 18, 11, 26, 77] for a detailed discussion.
Since all temporal discretizations contain some level of implicitness, the simulator
needs to solve a large system of nonlinear algebraic equations at each time step. The
size and properties of this system, of course, depend on the number and nature of the
implicit variables. For IMPES, the nonlinear system will be N-by-N, where N is the
number of grid blocks (control volumes) in the domain, and the equations will inherit
the parabolic/elliptic nature of the pressure equation. For FIM, on the other hand, we
would have an np N -by-np N system, where np is the number of fluid phases, and the
equations would be of mixed hyperbolic-parabolic type. As a result, the bulk of the
simulation time (80% to 90%, cf. Figure 1.1) is spent on solving these large systems.
It is therefore crucial, for the sake of efficiency and robustness, that the linear and
nonlinear solvers exploit the structure and properties of these discrete equations.
Nonlinear solvers
The most commonly used nonlinear solvers in reservoir simulation are all variations
on the basic Newton method:
J(x^k) δx = −R(x^k),   x^{k+1} = x^k + δx,   (1.2.4)
where R(x) is the residual function and J(x) = ∂R/∂x is the Jacobian matrix. New-
ton’s method is popular because of its local quadratic convergence and its general
applicability. For residual functions arising from discretized PDEs, the resulting Ja-
cobian is generally sparse and structured, which means the linear systems can be
solved efficiently. Also, quadratic convergence means Newton’s method is very fast
when good initial guesses are available. For time-dependent problems, a natural
initial guess is the saturation and pressure profiles from the previous time step. As-
suming the profiles vary continuously with time (which is always true for pressure,
and true for saturations away from shock fronts), the old time-step values will be
close to the solution provided ∆t is small enough. However, when the time step is
too large, it is possible for Newton’s method to diverge, since the residual functions
are in general non-convex and possibly non-monotonic (see Figure 2.1). When faced
with non-convergence, the simplest approach is to cut the time-step size and rerun
Newton’s method with the smaller time step. Such time-step cuts are very expensive,
since they mean we must throw away the results of all previous iterations and start
over. Thus, one should avoid time-step cuts as much as possible.
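The Newton iteration with time-step cuts described above can be sketched as follows. The toy scalar residual stands in for the discretized conservation equations, and all names and tolerances are hypothetical choices for illustration:

```python
import numpy as np

# Newton's method (x <- x - J^{-1} R) with a fixed iteration budget.
def newton(residual, jacobian, x0, tol=1e-10, max_iter=8):
    x = np.array(x0, dtype=float)
    for _ in range(max_iter):
        R = residual(x)
        if np.linalg.norm(R) < tol:
            return x, True
        x -= np.linalg.solve(jacobian(x), R)
    return x, False

# Retry the time step with dt/2 whenever Newton fails to converge.
def advance_with_cuts(residual, jacobian, x_old, dt, max_cuts=5):
    for _ in range(max_cuts):
        x, ok = newton(lambda x: residual(x, x_old, dt),
                       lambda x: jacobian(x, dt), x_old)
        if ok:
            return x, dt
        dt *= 0.5  # time-step cut: discard all previous iterates and restart
    raise RuntimeError("time step failed after repeated cuts")

# Toy problem: backward Euler for dx/dt = -x^3, one unknown per "cell".
res = lambda x, x_old, dt: x - x_old + dt * x**3
jac = lambda x, dt: np.array([[1.0 + 3.0 * dt * x[0]**2]])
x_new, dt_used = advance_with_cuts(res, jac, np.array([1.0]), dt=0.5)
```

The expensive part of a cut is visible in the structure: every iterate computed before the failure is thrown away, which is why the globalization strategies discussed next aim to avoid cuts altogether.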
One way to avoid time-step cuts is to take small enough time-steps. However, in
practice, one does not want to choose time-step sizes based on the nonlinear solver
for the following reasons:
1. The use of excessively small time steps reduces the benefits of using FIM, since
we would not be taking advantage of its ability to handle long time steps;
3. The time-step size should be chosen based on the desired solution accuracy (e.g.,
bounds on numerical diffusion errors) instead of the ability of the nonlinear
solver to converge a time step.
For these reasons, several modifications to the basic Newton’s method have been pro-
posed to ensure global convergence, or at least to enlarge the region of convergence
to the point that the algorithm will converge for all ∆t of practical interest. Glob-
alization techniques for general nonlinear residual functions, such as line search and
trust-region methods, are discussed in [29]. In our experience, line search methods,
in which the search direction is scaled by a single step-length parameter α, are in-
adequate for reservoir simulation problems because (1) the residual norm is sensitive
to diagonal scaling, and the correct scaling for the phase conservation equations is
not obvious in most problems; (2) α is often very small when flow reversal due to
gravity occurs across several cell interfaces; (3) several backtracking steps are often
needed to guarantee a sufficient decrease in the residual, and function evaluations
are quite expensive, since each evaluation involves calculating fluid properties and
pressure gradients for every cell in the domain.
Another method, which is implemented in the commercial simulator Eclipse, is
the so-called Appleyard chop [37]. It limits, on a cell-by-cell basis, the allowable
saturation and pressure changes within a nonlinear iteration to a fixed (but empirically
determined) threshold. When the threshold parameters are chosen properly, the
method is quite robust and the number of time-step cuts is often small. However,
because large saturation changes are disallowed, the method can lead to unnecessarily
slow convergence, especially in cases where Newton’s method actually works well (such
as problems with convex fractional flow functions).
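The essence of such a chop is a per-cell limit on the Newton update before it is applied. A minimal sketch, where the 0.2 threshold is a hypothetical choice rather than the value used in any particular simulator:

```python
import numpy as np

# Minimal sketch of an Appleyard-style chop: limit each cell's saturation
# update to a fixed threshold before applying the Newton direction, then
# keep the saturations physical. The threshold here is hypothetical.
def chopped_update(S, dS, max_change=0.2):
    dS_limited = np.clip(dS, -max_change, max_change)  # per-cell chop
    return np.clip(S + dS_limited, 0.0, 1.0)           # stay in [0, 1]

# A large proposed update in the first cell is chopped; the small one is not.
S_new = chopped_update(S=np.array([0.3, 0.9]), dS=np.array([0.5, -0.05]))
```

Because the chop is applied cell by cell, a single cell undergoing a large saturation change (e.g., at a shock front) no longer forces a global damping of the Newton direction, but it also caps the step even when the full Newton update would have been safe.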
Other methods for solving general nonlinear systems (e.g., continuation methods)
can be found in [29, 59]. Such methods, however, are not used in general-purpose
simulation in industry.
Linear solvers
To solve the linear system (1.2.4), early reservoir simulators [51, 62, 6] used either
direct methods (Gaussian elimination) or stationary iterative methods such as succes-
sive over-relaxation (SOR), alternating direction implicit method (ADI), or Stone’s
strongly implicit procedure (SIP) [73]. With the advent of Krylov accelerators such as
ORTHOMIN [80] and GMRES [69], iterative methods became more popular, and the
need for efficient preconditioning techniques has increased. In addition to precondi-
tioners derived from stationary methods, other preconditioners have been developed
by the reservoir simulation community to handle the linear equations arising from
fully-implicit simulation. Examples include:
Behie [8] provides a comparison among the three preconditioners above. In Chapter
5, the spectral properties of CPR-preconditioned Jacobians are discussed in detail.
the existing literature [6, 61] in which only stability is established using a linear or
linearized stability analysis.
In Chapter 2, we analyze phase-based upstreaming in detail. We show how the
FIM formulation in 1D, as well as SEQ in multiple dimensions, can be cast as a
monotone implicit scheme. We then extend the work of Rheinboldt on M -functions
and Gauss-Seidel iterations [64] to show that the discretized equations always have
a unique solution, which can be found using the nonlinear Gauss-Seidel process. We
also show that the discrete solution converges to the entropy solution under grid
refinement, and we investigate the accuracy of the discrete solutions for different
time-step sizes and spatial grids. This chapter is of a more theoretical nature, and
practitioners of reservoir engineering who are familiar with the discretizations can go
directly to Chapter 3 for a more algorithms-related discussion.
In Chapter 3, we introduce phase-based potential ordering, which reorders the
equations and variables in the nonlinear system in a way that exploits flow direction
information and eventually allows a partial decoupling of the problem into a sequence
of single-cell problems that are easy to solve. This ordering is valid for both two-phase
and three-phase flow, and it can handle countercurrent flow due to gravity and/or
capillarity.
In Chapter 4, we propose a reduced-order Newton algorithm, which makes use of
the phase-based potential ordering in Chapter 3 to reduce the size of the nonlinear
system. The latter is then solved using Newton’s method. We analyze its convergence
behavior for 1D cocurrent problems, and we show a variety of examples (two- and
three-phase flow, with and without gravity) illustrating its effectiveness in dealing
with large, complex heterogeneous problems.
In Chapter 5, we analyze the two-stage CPR preconditioner in detail and propose
an improved second-stage preconditioner that uses a cell-based potential ordering.
This approach reduces the sensitivity of CPR to flow configurations, and this re-
duction in sensitivity is both justified theoretically and observed from numerical ex-
periments. We also experiment with directly preconditioning the Schur complement
problem that arises from the phase-based potential order reduction.
We present our conclusions and outline future directions in Chapter 6.
Chapter 2
2.1 Background
∂(φρ_j S_j)/∂t + ∇ · (ρ_j u_j) = ρ_j q_j,   j = 1, . . . , n,   (2.1.1)
where φ = φ(x) is the porosity of the medium (with 0 < φ ≤ 1), K = K(x) > 0 is the
absolute permeability, z = z(x) is the depth variable; and for each phase j = 1, . . . , n,
ρj is the density, Sj is the saturation (i.e. the volume fraction occupied by phase j),
uj is the volumetric flux vector, qj is the source or sink term, λj = λj (S1 , . . . , Sn ) is
the phase mobility, pj is the pressure, and γj is the gravitational force. In addition,
where λ_T = Σ_j λ_j is the total mobility. Thus, for a given saturation distribution,
the pressure field satisfies an elliptic PDE. On the other hand, when the total
velocity u_T = Σ_j u_j is constant over the domain (which is the case for flow in a one-
dimensional porous medium), we can rewrite u_j as

u_j = (λ_j/λ_T) (u_T − K∇z Σ_l λ_l(γ_l − γ_j)),   (2.1.6)
φ ∂S_j/∂t + ∇ · u_j(x, S_1, . . . , S_n) = 0,   j = 1, . . . , n − 1.   (2.1.7)
This means saturation behaves like the solution to a system of first-order hyperbolic
PDEs, so one should expect discontinuous saturation profiles. In higher dimensions,
there is generally a strong coupling between pressure and saturation, due to the sat-
uration dependence of λj and λT in (2.1.5) and the dependence of uT on the pressure
field in (2.1.6). In addition, the porosity φ and permeability K are highly oscillatory,
non-smooth functions of x, and K(x) can vary by several orders of magnitude over
the domain Ω. The large variability of φ and K leads to local CFL limits that are
unacceptably severe when explicit schemes are used. As a result, the discretization
of choice for most reservoir simulators is the fully-implicit method (FIM), which uses
finite volume in space and backward Euler in time. The numerical flux functions,
which approximate the uj as defined in (2.1.2), use a two-point finite difference to
approximate ∇p and phase-based upstream weighting to approximate λj (S). In other
words, to approximate uj at the interface of cells a and b (centered at xa and xb ), we
evaluate λj (S) at
S = S(x_a) if −∇(p_j − γ_j z) · ν_ab ≥ 0, and S = S(x_b) otherwise,   (2.1.8)
where νab is the unit vector normal to the interface, pointing from a to b. The
resulting numerical flux functions are different from those used in classical CFD,
such as the Godunov and Engquist-Osher schemes [13]. Despite being only first-
order accurate, phase-based upstreaming is the preferred upwind method in reservoir
simulation because it is physically intuitive, and because it is generally easier to verify
a consistency condition such as (2.1.8) than to identify potential sonic points, which
vary over the domain and are strong functions of permeability and total velocity. This
is especially true for the fully-implicit method because the total velocity at time tn+1
is usually unknown.
Note that in (2.1.8) it is possible for −∇(pj − γj z) · νab to have different signs
for different j, meaning the upstream directions can be different for different phases
when buoyancy forces are significant; this is known as countercurrent flow in reservoir
engineering literature. In one-dimensional porous media, countercurrent flow mani-
fests itself through the presence of sonic points in the flux function uj ; thus, the flux
function for a countercurrent flow problem would typically look like the one shown
in Figure 2.1(b), whereas without countercurrent flow it would look more like Figure
2.1(a). A detailed treatment of phase-based upstreaming is given in [13], in which
the authors showed that, when explicit time-stepping is used on a two-phase flow
problem, phase-based upstreaming leads to a monotone difference scheme, as long as
the appropriate CFL condition is satisfied. This in turn implies that the solution of
Figure 2.1: Flux functions for 1D incompressible two-phase flow: (a) co-current flow
(no buoyancy effects); (b) countercurrent flow due to gravity.
the explicit schemes converges to the entropy solution of the two-phase equations

∂S/∂t + ∂f(S)/∂x = 0,   (2.1.9)

f(S) = u_1 = (λ_1/(λ_1 + λ_2)) (u_T + Kλ_2(γ_1 − γ_2) ∂z/∂x)   (2.1.10)
as ∆t, ∆x → 0 while satisfying the CFL condition. The goal of this chapter is to
extend this result to the fully-implicit case. This leads us to study the more general
problem of implicit monotone schemes, which would then include the multiphase flow
problem as a special case.
The use of implicit time-stepping leads to a (typically large) system of nonlinear
algebraic equations that must be solved for each time step. Moreover, the residual
functions are generally non-differentiable because of upstreaming criteria of the form
(2.1.8); thus, the existence of a unique solution to these systems of equations is
not immediately obvious. For implicit monotone schemes for 1D scalar conservation
laws, Lucier [50] showed that a solution to the discrete problem exists and is unique
whenever the initial data is bounded and has bounded total variation. The proof
of existence, which relies heavily on Crandall-Liggett theory [23], proceeds along the
following lines (see [27, Chapter 3] for more details). First, one shows that the residual
function R for the numerical scheme defines an m-accretive operator in the L^1-norm.
Then, by the Crandall-Liggett theorem, the ODE

du/dt = −Ru,   u(0) = x   (2.1.11)
has a unique solution for t ∈ [0, ∞) for any initial point x. Let u(t; x) denote the
solution of (2.1.11) with starting point x. Then one shows that the Poincaré operator
Pω , which maps the point x to the point u(ω, x), is strictly contractive. Then by
Banach’s fixed point theorem, Pω has a unique fixed point x0 . One then proceeds to
prove that u(t; x0 ) = x0 for all 0 ≤ t ≤ ω; thus, du/dt = 0, which implies Rx0 = 0.
While this argument does prove the existence and uniqueness of a solution to
the discretized problem, the proof does not suggest a practical algorithm for finding
the solution. In section 3, we present an alternate constructive proof of existence by
showing that the classical Gauss-Seidel and Jacobi iterations converge for this class of
problems. In fact, we show that the iterative methods converge whenever the initial
data for the discrete problem is bounded, so the implicit scheme is well-defined even
when the initial data does not have bounded variation in R. The well-definedness of
the numerical scheme, together with the total variation diminishing (TVD) property
and the existence of a discrete entropy inequality, imply that the numerical scheme
converges to the entropy solution as the mesh is refined (i.e., as ∆x → 0). This result
holds for any mesh ratio λ = ∆t/∆x (i.e., for any Courant number).
for p = w, o (water and oil), together with the saturation constraint So + Sw = 1, the
initial condition Sw (x, 0) = S 0 (x) for x ∈ [xL , xR ], and boundary conditions
We assume that the injection velocities q_{w,L} and q_{o,L} are non-negative, and that the
total velocity q_{T,L} := q_{w,L} + q_{o,L} is strictly positive. (These assumptions cover the most
interesting cases, such as oil recovery by water-flooding.) This formulation, which
contains pressure variables, is known as the parabolic form of the problem, since
it represents the incompressible limit of a parabolic problem. We can also derive
the hyperbolic or “fractional flow” form of the problem by eliminating the pressure
variables as follows. The discretized PDEs can be written as
φ_i (S_{w,i} − S_{w,i}^{old})/∆t + (F_{w,i+1/2} − F_{w,i−1/2})/∆x = 0,   (2.2.3a)

φ_i (S_{o,i} − S_{o,i}^{old})/∆t + (F_{o,i+1/2} − F_{o,i−1/2})/∆x = 0,   (2.2.3b)

where

F_{p,i+1/2} = K_{i+1/2} λ_{p,i+1/2} ((p_i − p_{i+1})/∆x + g_p),   p = o, w,   (2.2.4)
F_{p,1/2} = q_p,   p = o, w,   (2.2.5)

p_{N+1} = 2p_R − p_N.   (2.2.6)
For the remainder of this section, we assume without loss of generality that gw ≥ go ;
in the case of gw < go , the same argument would hold by considering the oil phase
instead of the water phase. To eliminate the pressure variables pi , first note that
summing equations (2.2.3a) and (2.2.3b) and rearranging gives
In other words, the total flux is constant across any interface, and this flux is denoted
by q_T, which is equal to q_{T,L}. Summing Equation (2.2.4) over p = o, w, we can
express the pressure gradient (p_i − p_{i+1})/∆x in terms of q_T:

q_T = K_{i+1/2} (λ_{T,i+1/2} (p_i − p_{i+1})/∆x + (λ_{w,i+1/2} g_w + λ_{o,i+1/2} g_o)),

F_{w,i+1/2} = (λ_{w,i+1/2}/λ_{T,i+1/2}) (q_T + K_{i+1/2} λ_{o,i+1/2} ∆g)
            = F_{w,i+1/2}(S_{w,i}, S_{w,i+1}),   (2.2.8)

φ_i (S_{w,i} − S_{w,i}^{old}) + (∆t/∆x)(F_{w,i+1/2} − F_{w,i−1/2}) = 0,   (2.2.9)
leads to a numerical scheme with exactly the same form as (2.3.1), except for the
boundary conditions. Clearly, the treatment of boundary conditions will significantly
affect the stability and accuracy of the numerical scheme. However, in order to
understand the behavior of the numerical scheme at interior points, we will replace
the initial-boundary value problem (2.2.1) with an initial value problem on an infinite
domain with appropriate initial conditions. In particular, we replace the injection
boundary condition with
The modified continuous problem will yield a solution identical to (2.2.1) for 0 < t <
T_BT, where T_BT is the breakthrough time (i.e., the time at which the shock wave
arrives at the pressure boundary). Note that since f is one-to-one over the interval
I = {S : 0 ≤ f(S) < 1} (see Figure 2.1), and since q_{w,L} ≤ q_{T,L} by assumption,
(2.2.10) is well-defined unless q_{o,L} = 0. (If q_{o,L} = 0, we define u_0(x) = inf f^{−1}(1),
where f^{−1} denotes the inverse image.)
Phase-based upstreaming
Recall from section 2.1 (cf. Equation (2.1.8)) that the mobilities λp,i+1/2 are evaluated
using the upstream saturations with respect to the flow direction of phase p:
λ_{p,i+1/2} = λ_p(S_i) if (p_i − p_{i+1})/∆x + g_p ≥ 0, and λ_p(S_{i+1}) otherwise.   (2.2.12)
where the subscript q denotes the phase other than phase p. Even though pressure
dependence has been eliminated, Equation (2.2.13) still does not explicitly define the
upstream direction for λp , since the definition of upstream is in terms of the (yet
undetermined) mobility of the other phase λq,i+1/2 . For explicit numerical schemes,
Brenier and Jaffré have shown in [13] how to explicitly determine the upstream
direction for each phase for a given saturation profile {S_i^n}. In the special case of two-phase
flow, they define the following quantities:
These quantities correspond precisely to the condition in (2.2.13), but the condition
is evaluated at S_i^n for θ_o and at S_{i+1}^n for θ_w. Clearly θ_{w,i+1/2} > 0, since ∆g ≥ 0. The
correct upstream directions are then given by
Thus, for an explicit time-marching scheme, the numerical fluxes are completely
defined by these conditions, and there is no need to go back to the original definition
(2.2.12) involving unknown pressure values. However, this is not the case for an im-
plicit time-marching scheme (such as backward Euler), since the upstream directions
must be consistent with the saturation values at the end of the time step, i.e., with
the saturation profile {S_i^{n+1}}. Because of this consistency requirement, it is not clear
a priori that a solution to the parabolic form of the problem (2.2.3) even exists. Our
approach to proving that a solution exists is to rely on the hyperbolic form (2.2.8)–
(2.2.11). From the above derivation, it is evident that if {(S_i, p_i)}_{i=1}^N is any solution to
the parabolic form (2.2.3)–(2.2.6), then {S_i}_{i=1}^N must be a solution to the hyperbolic
problem. Thus, the key idea is to begin by finding the correct saturation profile {Si }
via (2.2.8)–(2.2.11), with a numerical flux that automatically ensures consistency with
the upstream directions; once the {Si } are known, we can easily solve for the pressure
part because the pressure equation is linear. We distinguish two cases:
A plot of the numerical flux Fw (u, v) in the latter case is shown in Figure 2.2. The
black curve on the surface, which shows the value of F (u, v) along the line u = v,
is identical to the continuous flux function in Figure 2.1(b). Thus, it is evident that
the numerical flux satisfies the consistency condition F (u, u) = f (u). Even though
f (u) itself is non-monotonic, the plot clearly shows that F (u, v) is an increasing
function of u and a decreasing function of v. This monotonicity property is what
makes upstream weighting amenable to a Gauss-Seidel type analysis. Also notice
that the numerical flux is independent of the downstream saturation v inside the
cocurrent region (0 ≤ u ≤ Sc ≈ 0.27), but becomes a function of both variables when
u > Sc . Finally, F (u, v) is Lipschitz continuous, but non-differentiable along the line
u = Sc because of the upstream condition (2.2.14). The following theorem, which
summarizes several results by Brenier and Jaffré [13], shows that upstream-weighted
fluxes generally satisfy the monotonicity property.
Theorem 2.1. Assume that the mobility of phase p is increasing with the saturation of
the same phase and decreasing with the saturation of the other phase, for p = o, w (oil
and water). Then the numerical fluxes obtained from phase-based upstreaming defined
by (2.2.8), (2.2.13) are (1) Lipchitz continuous, (2) consistent with the continuous
30 CHAPTER 2. ANALYSIS OF UPSTREAM WEIGHTING
flux function (i.e., F (u, u) = f (u)), (3) non-decreasing with respect to Sw,i , and (4)
non-increasing with respect to Sw,i+1 .
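The monotonicity properties in Theorem 2.1 can be checked numerically. The sketch below (in Python) uses illustrative quadratic mobilities λ_w(S) = S², λ_o(S) = (1 − S)² and made-up values of q_T, K and ∆g = g_w − g_o (none of these numbers come from the text); the upstream selection follows the discussion above: water is always upstreamed from the left (θ_w > 0), and oil is upstreamed from the left only when θ_o = q_T − Kλ_w(S_i)∆g ≥ 0.

```python
import numpy as np

# Illustrative two-phase data (assumed, not from the text): quadratic
# mobilities, unit total velocity and permeability, buoyancy term dg > 0.
lam_w = lambda S: S**2           # water mobility, increasing in S_w
lam_o = lambda S: (1.0 - S)**2   # oil mobility, decreasing in S_w
qT, K, dg = 1.0, 1.0, 5.0

def F_w(u, v):
    """Phase-based upstream water flux with left state u and right state v.
    Water is upstreamed from the left since theta_w > 0; oil is upstreamed
    from the left only in the cocurrent case theta_o = qT - K*lam_w(u)*dg >= 0."""
    lo = lam_o(u) if qT - K * lam_w(u) * dg >= 0 else lam_o(v)
    lw = lam_w(u)
    return lw / (lw + lo) * (qT + K * lo * dg)

f = lambda S: F_w(S, S)   # consistency: the continuous flux is F on the diagonal

# Numerical check of Theorem 2.1 on a grid: F_w is non-decreasing in u and
# non-increasing in v, even though f itself is non-monotonic.
s = np.linspace(0.0, 1.0, 101)
assert all(F_w(u2, v) >= F_w(u1, v) - 1e-12
           for v in s for u1, u2 in zip(s, s[1:]))
assert all(F_w(u, v2) <= F_w(u, v1) + 1e-12
           for u in s for v1, v2 in zip(s, s[1:]))
```

With these made-up coefficients the sonic point (where θ_o changes sign) lands near S_w ≈ 0.45; below it the flux is independent of the downstream state v, and above it the flux depends on both states, mirroring the behavior described above for Figure 2.2.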
The hypothesis on phase mobilities is physically realistic [6]. These properties are
sufficient to ensure that the hyperbolic problem with implicit time-stepping possesses
a unique solution {S_i^{n+1}}, which must also be the correct saturation profile for the
parabolic problem. To solve for pressure, we use Equation (2.2.7) together with the
boundary condition (2.2.6), p_{N+1} = 2p_R − p_N.
Since {S_i^{n+1}} is now known, the right-hand side of (2.2.7) is also completely determined.
Thus, the vector p of pressures satisfies Ap = b, where A is an N × N upper
triangular matrix with a nonzero diagonal. So A is nonsingular, which means there is
a unique pressure profile {p_i^{n+1}} that satisfies (2.2.7) and (2.2.6). It is easy to see that
this pressure profile is consistent with the upstream condition (2.2.12): because of
(2.2.7), this upstream condition is equivalent to (2.2.13), and the conditions therein
are precisely the ones we use to define the numerical flux function (2.2.14) for the
hyperbolic problem. Hence, we have shown that the parabolic form (2.2.3)–(2.2.6)
has a unique solution, given by the above {(S_i^{n+1}, p_i^{n+1})}.
Figure 2.2: The numerical flux function F (u, v) corresponding to the fractional flow
in Figure 2.1(b). The black curve along the diagonal indicates the value of F (u, u) =
f (u).
φ ∂S_j/∂t + ∇ · u_j(x, S_1, . . . , S_n) = 0,

u_j = (λ_j/λ_T) (u_T^* − K∇z Σ_l λ_l(γ_l − γ_j)).
Essentially, the SEQ method decouples the system into an elliptic and a hyperbolic
subproblem. A finite-volume discretization of (2.1.6) and (2.1.7) gives rise to the
following multidimensional analog of (2.2.9):
φ_i (S_{w,i}^{n+1} − S_{w,i}^n) + Σ_{l∈adj(i)} λ_{il} F_{il}(S_{w,i}^{n+1}, S_{w,l}^{n+1}) = 0.   (2.2.17)

Here, F_{il} is the flux (or velocity) from cell i to cell l, and λ_{il} = ∆t|∂V_{il}|/|V_i|, where
|∂V_{il}| is the area of the surface separating cells i and l, |V_i| is the volume of cell i, and
∆t is the time step. For a conservative scheme we must have
and for monotonicity we require that Fil be non-decreasing with respect to the first
argument and non-increasing with respect to the second. This requirement is satisfied
for two-phase flow problems, since we can reproduce the derivation in section 2.2.1
F_{w,il} = (λ_{w,il}/λ_{T,il}) (q_{il} + K_{il} λ_{o,il} (g_w − g_o))

for p = o, w, where q_{il} = u_T^* · ν_{il} and g_p = γ_p ∇z · ν_{il}. We show that a unique solution
to (2.2.17) exists for any ∆t if the following conditions hold:
1. The number of cells (control volumes) adjacent to cell i, |adj(i)|, is bounded for
all i;
2. The ratio |∂Vil |/|Vi | is bounded for all pairs of adjacent cells (i, l);
3. The quantity φi |Vi | is uniformly bounded away from zero for all i;
4. For any cell i, the total number of cells reachable from i in k steps is O(k^p) for
some fixed p > 0 (i.e., it grows at most polynomially in k).
5. Fil is equicontinuous with the same Lipschitz constant for all pairs of adjacent
cells (i, l).
Assumptions 1–4 are easily satisfied by regular Cartesian grids, and also by most
unstructured grids of practical interest. From (2.2.18) we see that assumption 5 is
satisfied as long as Kil is uniformly bounded over the domain, which is generally true
for problems of practical interest. We justify these assumptions in section 2.3.7.
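For intuition, assumptions 1 and 4 are easy to verify on a uniform 2D Cartesian grid: each cell has at most four neighbors, and the number of cells reachable from a given cell in k steps (a discrete diamond) grows quadratically. A small illustrative count, not tied to any particular simulator grid:

```python
def reachable_2d(k):
    # Number of cells within graph distance k of a fixed cell on an
    # infinite 2D Cartesian grid (a Manhattan ball): 1 + 4*(1 + 2 + ... + k).
    return 2 * k * (k + 1) + 1

print([reachable_2d(k) for k in (0, 1, 2, 3)])  # [1, 5, 13, 25]
```

This is O(k^2), i.e., O(k^p) with p = 2, so assumption 4 holds with room to spare.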
φ_i (u_i^{n+1} − u_i^n) + λ(F_{i+1/2}^{n+1} − F_{i−1/2}^{n+1}) = 0,   λ = ∆t/∆x,   i ∈ Z,   (2.3.1)
which generalizes problem (2.1.9), (2.1.10) to the variable porosity and permeability
case. For simplicity, we assume a three-point scheme
F_{i+1/2}^{n+1} = F_{i+1/2}(u_i^{n+1}, u_{i+1}^{n+1});
thus, the implicit stencil at cell i involves the value at cell i at time t^n, as well as the
values at cells i − 1, i and i + 1 at the future time t^{n+1}. Since we are interested in
handling flux functions of the type shown in Figure 2.1(b), we do not assume that the
flux function f (x, u) is monotonic in u, so that sonic points may be present. Assume
that f and F are both locally Lipschitz continuous (but not necessarily differentiable),
and that the numerical flux function Fi+1/2 is consistent with f in the sense that
For the purpose of this thesis, a 1D implicit scheme is said to be an implicit monotone
scheme if the following assumption is satisfied.
Assumption 1 (Monotonic fluxes). For all i ∈ Z, the numerical flux function Fi+1/2
is non-decreasing in the first argument and non-increasing in the second argument,
i.e. for any w, we have Fi+1/2 (u, w) ≤ Fi+1/2 (v, w) and Fi+1/2 (w, u) ≥ Fi+1/2 (w, v)
whenever u ≤ v.
As shown in section 2.2.1, the fully implicit 1D problem satisfies this assumption.
We show that residual functions corresponding to implicit monotone schemes are in
fact M-functions in the sense of Rheinboldt [64]. This allows us to prove the existence
and uniqueness of solutions via a convergent iterative process.
which is equivalent to diagonal dominance when A is linear (see Appendix B). On the
other hand, M-functions are generalizations of M-matrices, i.e., A is a nonsingular
M-matrix if (1) a_ii > 0, (2) a_ij ≤ 0 for i ≠ j, and (3) A^{−1} has only non-negative
entries. Thus, if

M_1 = [2 1 0; 1 2 1; 0 1 2],   M_2 = [1 0 0; −4 1 0; 0 −4 1],

then the function f_1(x) = M_1 x is m-accretive but not an M-function, and the reverse
is true for f_2(x) = M_2 x. We do not directly use m-accretivity in this work.
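The M-matrix conditions for these two matrices can be checked directly with a few lines of numpy: M_2 satisfies all three conditions (in particular its inverse is entrywise non-negative), while M_1 violates the off-diagonal sign condition and its inverse has negative entries.

```python
import numpy as np

M1 = np.array([[2., 1., 0.], [1., 2., 1.], [0., 1., 2.]])
M2 = np.array([[1., 0., 0.], [-4., 1., 0.], [0., -4., 1.]])

# M2 is an M-matrix: positive diagonal, non-positive off-diagonal entries,
# and an entrywise non-negative inverse.
assert np.all(np.diag(M2) > 0)
assert np.all(M2 - np.diag(np.diag(M2)) <= 0)
assert np.all(np.linalg.inv(M2) >= -1e-12)

# M1 fails both the off-diagonal sign condition and inverse non-negativity.
assert np.any(M1 - np.diag(np.diag(M1)) > 0)
assert np.any(np.linalg.inv(M1) < 0)
```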
Remark. Assumption 1 implies that (2.3.1) is an E-scheme (cf. [60]), so it is at most
first-order accurate.
Solve r_i(x_1^{k+1}, . . . , x_{i−1}^{k+1}, x_i^*, x_{i+1}^k, . . . , x_N^k) = 0 for x_i^*,   (2.3.4)
Set x_i^{k+1} = x_i^*,   i = 1, . . . , N,   k = 1, 2, . . . ,
In other words, the residual functions must come from a compact stencil and must
preserve boundedness. With these assumptions, the nonlinear Gauss-Seidel process
becomes
The only difference between the above processes and (2.3.4)–(2.3.5) is that each
Gauss-Seidel/Jacobi "sweep" now involves infinitely many variables and equations.
These processes are well-defined because each ri is assumed to depend on only finitely
many arguments, so that for any given i ∈ Z, k ∈ N, the value of x_i^{k+1} can be obtained
from a finite number of univariate solves. The main purpose of these assumptions is
to ensure the residual function of the discretized PDE is an M -function. This would
then allow us to prove the convergence of Jacobi and Gauss-Seidel iterations to a
unique bounded solution.
We denote by e_i the unit basis vector with the i-th component one and all others
zero. The following definitions are essentially identical to those in [64], except that the
domain of definition has been changed from R^n to ℓ^∞(N) to handle vectors of infinite
length.
1. R is isotone (or antitone) if, for all x, y ∈ ℓ^∞(N), x ≤ y implies R(x) ≤ R(y) (or
R(x) ≥ R(y)). It is strictly isotone (or antitone) if x < y implies R(x) < R(y)
(or R(x) > R(y)).
are antitone.
2. For any ‖x‖_∞ < B, the function Q(t) = (q_1(t), q_2(t), . . .) defined by

q_i(t) = Σ_{j=1}^∞ w_j^B r_j(x + t e_i)
Then R is an M-function.
Proof. The proof is an adaptation of the proof of Theorem 5.1 in [64], suitably
modified to handle the infinite-dimensional case. Suppose R(x) ≤ R(y) for some
x, y ∈ ℓ^∞(N). Define the sets

N^− = {i ∈ N | y_i < x_i};   N^+ = {i ∈ N | y_i ≥ x_i}.
Define R^k := R(z^k) and R^∞ := R(z). In either case, we have the following properties:

2. For each i, z_i^k = z_i for large enough k, so by Assumption 3, R_j^k → R_j^∞ pointwise
for each j.
Since R_j^k < ζ(B) for all j, k, each R^k is dominated by the constant sequence G =
(ζ(B), ζ(B), . . .). Moreover, Σ_{j=1}^∞ w_j^B G_j < ∞, so by the dominated convergence
theorem (cf. [65]), we have

Σ_{j=1}^∞ w_j^B R_j^k → Σ_{j=1}^∞ w_j^B R_j^∞   as k → ∞.
with at least one strict inequality (since N^− is non-empty). Thus, we must have

Σ_{j=1}^∞ w_j^B r_j(y) = Σ_{j=1}^∞ w_j^B R_j^0 < Σ_{j=1}^∞ w_j^B R_j^∞ = Σ_{j=1}^∞ w_j^B r_j(z).   (2.3.10)
(and invoking the dominated convergence theorem whenever necessary), we can show
similarly that
Σ_{j∈N^−} w_j^B r_j(z) ≤ Σ_{j∈N^−} w_j^B r_j(x),   Σ_{j∈N^+} w_j^B r_j(z) ≤ Σ_{j∈N^+} w_j^B r_j(y),   (2.3.12)

Σ_{j=1}^∞ w_j^B r_j(y) < Σ_{j∈N^−} w_j^B r_j(x) + Σ_{j∈N^+} w_j^B r_j(y),   (2.3.13)

which implies

Σ_{j∈N^−} w_j^B r_j(y) < Σ_{j∈N^−} w_j^B r_j(x).   (2.3.14)
Thus, we must have r_j(y) < r_j(x) for some j ∈ N^−, which contradicts the hypothesis
R(x) ≤ R(y). Hence N − must be empty, so x ≤ y.
Corollary 2.3. Let R satisfy the hypotheses of Theorem 2.2, and let z ∈ ℓ^∞(N). Then
there is at most one bounded solution to the equation R(x) = z.
Remark. In the context of discretized PDEs one normally assumes tacitly that the
solution of interest must be bounded; this can be regarded as a boundary condition
“at infinity”. However, since such boundary conditions are not explicitly stated in the
definition of M-functions, one must be careful to exclude any parasitic unbounded
solutions that may arise. In fact, the solution is not necessarily unique if we allow
unbounded solutions. Consider the linear function R = (r_1, r_2, . . .) defined by r_i(x) =
x_i − αx_{i+1} for |α| < 1. Then for any ‖x‖_∞ < ∞, we have ‖R(x)‖_∞ ≤ (1 + |α|)‖x‖_∞,
so that Assumption 2 is satisfied. Assumption 3 (finitely many dependencies) is also
satisfied because each r_i is non-constant with respect to only two components of x.
Finally, if we let w_j^B = β^j for any |α| < β < 1, then Σ_j β^j < ∞ and
q_i(t) = Σ_{j=1}^∞ β^j r_j(x + t e_i)
       = Σ_{j=1}^∞ β^j (x_j + t δ_ij − α(x_{j+1} + t δ_{i,j+1}))
       = (β − α)β^{i−1} t + βx_1 + (β − α) Σ_{j=2}^∞ β^{j−1} x_j,
so q_i(t) is well-defined and strictly increasing with respect to t whenever ‖x‖_∞ < ∞.
So the hypotheses of Theorem 2.2 are satisfied, and hence x = 0 is the only bounded
solution of R(x) = 0. However, unbounded solutions of the form y = {Kα^{−i}}, K ≠ 0,
also satisfy R(y) = 0, so the theorem does not preclude these possibilities.
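The parasitic solution in this remark is easy to verify numerically over any finite window of indices (illustrative values α = 1/2, K = 1):

```python
import numpy as np

alpha, K = 0.5, 1.0
i = np.arange(1, 30, dtype=float)
y = K * alpha**(-i)            # unbounded sequence y_i = K * alpha^{-i}
r = y[:-1] - alpha * y[1:]     # r_i(y) = y_i - alpha * y_{i+1}
print(np.max(np.abs(r)))       # 0.0: R(y) = 0 although y grows without bound
```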
It turns out that the hypotheses of Theorem 2.2 are enough to ensure convergence of
nonlinear Jacobi and Gauss-Seidel for certain starting points described below. The
following result is essentially Theorem 3.1 in [64], with modified hypotheses to
accommodate ℓ^∞-bounded vectors with infinitely many components. The proof in [64] goes
through verbatim, but is reproduced here for completeness. Note that by Assumption
3, each ri depends on only finitely many arguments, so the standard arguments on
limits, continuity and antitonicity hold without additional complications when they
are used on individual components of R.
Theorem 2.4 (Rheinboldt). Let R : ℓ^∞(N) → ℓ^∞(N) satisfy the hypotheses of
Theorem 2.2. Suppose for some z ∈ ℓ^∞(N) there exist points x^0, y^0 ∈ ℓ^∞(N) such that

x^0 ≤ y^0,   R(x^0) ≤ z ≤ R(y^0).

Then the nonlinear Gauss-Seidel and Jacobi iterates {y^k} and {x^k}, given by (2.3.6)
and (2.3.7) and starting from y^0 and x^0, respectively, are uniquely defined and satisfy
First we need the following lemma (which is part of Theorem 2.10 in [64]).
Proof. Suppose that for some x ∈ ℓ^∞(N), s, t ∈ R with s > t, and some index i, we have
r_i(x + s e_i) ≤ r_i(x + t e_i). The off-diagonal antitonicity then implies that

r_j(x + s e_i) ≤ r_j(x + t e_i),   j ≠ i,

or, altogether, that R(x + s e_i) ≤ R(x + t e_i). By inverse isotonicity this leads to the
contradiction s ≤ t, which shows that R must be strictly diagonally isotone.
Proof of Theorem 2.4. We present only the proof for convergence of Gauss-Seidel; the
proof for Jacobi is similar. We proceed by induction and suppose that for some k ≥ 0
and i ≥ 1,
where for i = 1 the relation (2.3.17b) is vacuous. Clearly, (2.3.17) is valid for k = 0
and i = 1. Define the functions
for s ∈ [x_i^0, y_i^0]. From (2.3.17) and the off-diagonal antitonicity of R, we then find
that

β(s) ≤ α(s),   s ∈ [x_i^0, y_i^0],   (2.3.18)

and

β(x_i^k) ≤ α(x_i^k) ≤ r_i(x^k) ≤ z_i ≤ r_i(y^k) ≤ β(y_i^k) ≤ α(y_i^k).   (2.3.19)
where the relation x̂_i^k ≤ ŷ_i^k is a consequence of (2.3.18). But x_i^{k+1} = x̂_i^k and y_i^{k+1} = ŷ_i^k
by definition, so we have proved (2.3.17b) for j = 1, . . . , i. By induction, (2.3.17b)
holds for all i ∈ N, and hence

x^k ≤ x^{k+1} ≤ y^{k+1} ≤ y^k.
This completes the induction on k and hence the proof of (2.3.15). Applying the
monotone convergence theorem for sequences, we conclude that the pointwise limits
exist for each j, which allows us to define x^* = {x_j^*} and y^* = {y_j^*}. Since each
r_i is continuous and depends on only finitely many arguments, the definition of the
Gauss-Seidel iterates passes to the limit, yielding

R(x^*) = R(y^*) = z.

Since both x^* and y^* are bounded, Corollary 2.3 implies that they are equal,
completing the proof.
Theorem 2.6. Consider the numerical scheme (2.3.1) with the numerical flux given
by
F_{i+1/2}^{n+1} = F(u_i^{n+1}, u_{i+1}^{n+1}),
for all i ∈ Z.
Proof. The strategy is to start by defining an ordering for the Gauss-Seidel sweeps,
i.e., by permuting the equations and variables so that the spatial indices go from 1 to
∞ rather than from −∞ to ∞. After that, it suffices to check that all the hypotheses
Thus, for any ‖v‖_∞ ≤ B, we have |r_j(v)| ≤ ζ(B) for all j, where

q_i(t) := Σ_{j=1}^∞ w_j^B r_j(v + t e_i) = q̃_i(t) + Σ_{j=1}^∞ w_j^B r_j(v),

where

q̃_i(t) = w_i^B t/λ + (w_i^B − w_{τ(σ(i)+1)}^B) (F(v_i + t, v_{τ(σ(i)+1)}) − F(v_i, v_{τ(σ(i)+1)}))

β^{|σ(i)|−1} (βt/λ − 2(1 − β)K_B |t|) ≤ q̃_i(t) ≤ β^{|σ(i)|−1} (βt/λ + 2(1 − β)K_B |t|),

so picking

2λK_B/(1 + 2λK_B) < β < 1   (2.3.22)

ensures isotonicity for q̃_i(t) (and hence q_i(t)) for all i, as required in Theorem 2.2.
(Note that the choice of β depends on B.)
u_i^{n+1} = x_{τ(i)}^*.
Remarks.
In fact, one can show that the nonlinear Jacobi and Gauss-Seidel processes converge
for any starting point {z_i^{(0)}} that is bounded by the initial data {u_i^n}. (In the
sequel, superscripts in brackets indicate iterates within the Gauss-Seidel process, and
superscripts without brackets indicate the time level in the numerical scheme.)
Theorem 2.7. Assume the hypotheses of Theorem 2.6. Suppose the initial guess
{z_i^{(0)}} satisfies

inf_{j∈Z} u_j^n ≤ z_i^{(0)} ≤ sup_{j∈Z} u_j^n   (2.3.23)

for all i ∈ Z. Then the nonlinear Jacobi and Gauss-Seidel processes (2.3.6) and
(2.3.7) are well-defined and converge to the unique bounded solution of (2.3.1).
Proof. Again we only show convergence for the Gauss-Seidel process, since the proof
for Jacobi is similar. Denote u̲ = inf_{j∈Z} u_j^n and ū = sup_{j∈Z} u_j^n. First, we show that
the Gauss-Seidel iterates are well-defined and that u̲ ≤ u_j^{(k)} ≤ ū for all j, k. At each
step we need to solve

where z_{j±1} = z_{j±1}^{(k)} or z_{j±1}^{(k+1)}, depending on the ordering of the Gauss-Seidel sweep,
where the last inequality follows from Assumption 1. Similarly one obtains r_j(ū) ≥ 0,
so by continuity of F (and hence r_j) there must exist a solution z_j^* to (2.3.24), which
by Lemma 2.5 must be unique. Hence, by induction, the Gauss-Seidel iterates are
well-defined and are bounded above and below by ū and u̲, respectively.
Now consider the Gauss-Seidel iterates {x_j^{(k)}} and {y_j^{(k)}} with initial guess x_j^{(0)} = u̲
and y_j^{(0)} = ū for all j. By Theorem 2.6 these iterates converge pointwise to the same
solution {x_j^*}. We show inductively that x^{(k)} ≤ z^{(k)} ≤ y^{(k)} for all k, which would imply
that z_j^{(k)} → x_j^* pointwise. Using the same reordering as in Theorem 2.6, assume that
for some k ≥ 0 and i ≥ 1 we have

which is valid for k = 0 and i = 1. Then by the same boundedness and antitonicity
arguments as in Theorem 2.4, we have

which, together with the strict diagonal isotonicity of r_i, implies that y_i^{(k+1)} ≥ z_i^{(k+1)}.
Similarly it follows that x_i^{(k+1)} ≤ z_i^{(k+1)}. This completes the induction, and hence
z_j^{(k)} → x_j^* pointwise.
In other words, the nonlinear Gauss-Seidel process converges if we use {u_j^n} (i.e.,
the solution from the previous time step) as the initial guess. For small to moderate
time-step sizes, one generally expects the solutions at consecutive time steps to
be close to each other, so in practice using {u_j^n} results in much faster convergence
than using either u̲ or ū as the initial guess.
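As a concrete, deliberately simple illustration, the sketch below performs one backward-Euler step of (2.3.1) by nonlinear Gauss-Seidel with single-variable bisection solves, using the previous time level as the initial guess. The flux is the single-point upwind flux F(u, v) = f(u) for the convex flux f(u) = u²/2 on non-negative data (a monotone flux in the sense of Assumption 1); the grid, time step, and step initial data are illustrative assumptions, not taken from the text.

```python
import numpy as np

def implicit_step_gs(u_old, lam, phi=1.0, sweeps=100, tol=1e-12):
    """One backward-Euler step of (2.3.1) via nonlinear Gauss-Seidel.
    Upwind flux F(u, v) = f(u) with f(u) = u^2/2; non-negative data assumed."""
    f = lambda w: 0.5 * w * w
    u = u_old.copy()                       # initial guess: previous time level
    for _ in range(sweeps):
        delta = 0.0
        for i in range(len(u)):
            # inflow flux frozen at the left boundary value
            F_in = f(u[i - 1]) if i > 0 else f(u_old[0])
            # cell residual r(x) = phi*(x - u_old[i]) + lam*(f(x) - F_in)
            # is increasing in x on x >= 0; bracket the root and bisect.
            r = lambda x: phi * (x - u_old[i]) + lam * (f(x) - F_in)
            lo, hi = 0.0, u_old.max() + 1.0
            for _ in range(60):
                mid = 0.5 * (lo + hi)
                lo, hi = (mid, hi) if r(mid) < 0 else (lo, mid)
            delta = max(delta, abs(u[i] - lo))
            u[i] = lo
        if delta < tol:                    # sweep converged
            break
    return u

u0 = np.where(np.arange(50) < 10, 1.0, 0.0)   # step initial data
u1 = implicit_step_gs(u0, lam=10.0)           # Courant number around 10
```

The computed profile stays in [0, 1] and remains monotone even at a Courant number of about ten, in line with the unconditional stability established above. Note also that for this cocurrent flux the system is effectively lower triangular, so a single forward sweep is already exact; ordering the unknowns along the flow direction is exploited more systematically by the potential ordering of Chapter 3.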
So far we have proven that the nonlinear Gauss-Seidel and Jacobi processes both
converge globally when applied to residual functions arising from implicit monotone
schemes, but we have not investigated how fast these processes converge. For this
purpose, let us reconsider the finite-dimensional case, i.e., when R : R^N → R^N is
given by

r_i(u^{n+1}) = φ_i (u_i^{n+1} − u_i^n) + λ (F_{i+1/2}(u_i^{n+1}, u_{i+1}^{n+1}) − F_{i−1/2}(u_{i−1}^{n+1}, u_i^{n+1}))   (2.3.25)
for i = 1, . . . , N , and the finite versions of the Gauss-Seidel and Jacobi processes
((2.3.4) and (2.3.5)) are used. It is well known [59] that for a convergent fixed-point
iteration x^{n+1} = Gx^n, the asymptotic rate of convergence is given by ρ(G′(x^*)),
the spectral radius of the Jacobian matrix evaluated at the solution x^*. Moreover,
superlinear convergence is obtained when ρ(G′(x^*)) = 0. The following lemma gives
the rate of convergence for the nonlinear Gauss-Seidel and Jacobi processes.
Proof. Let $G$ denote the Gauss-Seidel operator, i.e., $y := x^{k+1} = Gx^k$, where $x^{k+1}$ is
defined implicitly as a function of $x^k$ by (2.3.4). Then implicit differentiation gives,
2.3. EXISTENCE AND UNIQUENESS OF SOLUTIONS 51
for each $j = 1, \ldots, N$,
$$\bigl(D(y) - L(y)\bigr)\frac{\partial y}{\partial x} - U(x) = 0,$$
where $\partial R/\partial x = D - L - U$ is the splitting of the Jacobian into its diagonal, strict
lower-triangular and strict upper-triangular parts. Since $R$ is an $M$-function, $D(y) - L(y)$ is nonsingular
for all $y \in D$. Thus, $G'$ is given by
$$G'(x) = \frac{\partial y}{\partial x} = \bigl(D(y(x)) - L(y(x))\bigr)^{-1} U(x).$$
In other words, the rates of convergence of the nonlinear processes are exactly the
same as the rates for the corresponding linear processes applied to the Jacobian matrix
of the residual function. For the residual function (2.3.25), the Jacobian matrix has
the following tridiagonal form:
$$\frac{\partial R}{\partial u} = \begin{pmatrix} d_1 & f_1 & & \\ e_2 & d_2 & \ddots & \\ & \ddots & \ddots & f_{N-1} \\ & & e_N & d_N \end{pmatrix},$$
52 CHAPTER 2. ANALYSIS OF UPSTREAM WEIGHTING
where
$$d_i = \phi_i + \lambda\left(\frac{\partial F_{i+1/2}}{\partial u_i} - \frac{\partial F_{i-1/2}}{\partial u_i}\right) > 0, \qquad
e_i = -\lambda\,\frac{\partial F_{i-1/2}}{\partial u_{i-1}} \le 0, \qquad
f_i = \lambda\,\frac{\partial F_{i+1/2}}{\partial u_{i+1}} \le 0.$$
Since $-\log \rho_J \approx \phi_{\min}/\max_i d_i$, it follows that $-\log \rho_J$ is roughly inversely proportional
to the mesh ratio $\lambda$, especially when $\lambda$ (and equivalently $\Delta t$) is large. Thus, one
expects Jacobi to take roughly twice as many iterations to converge when the time-step size is doubled
while the spatial grid is fixed (or, equivalently, when the grid is refined
by a factor of two while $\Delta t$ is kept constant).
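This scaling can be checked numerically. The sketch below is a hypothetical setup, not taken from the thesis: it builds the tridiagonal Jacobian for a Lax-Friedrichs-type monotone flux $F(a,b) = \tfrac12(f(a)+f(b)) - \tfrac12(b-a)$ with $f(u) = u^2/2$, linearized about $u = 0$ with $\phi = 1$, so that $d_i = \phi + \lambda$ and $e_i = f_i = -\lambda/2$, and compares $-\log\rho_J$ for two values of $\lambda$.

```python
# Doubling lambda should roughly double the Jacobi iteration count,
# since -log(rho_J) scales like 1/lambda for large lambda.
import numpy as np

def jacobi_radius(lam, N=100, phi=1.0):
    d = phi + lam                       # diagonal entry d_i
    A = d * np.eye(N)
    for i in range(N - 1):              # e_i = f_i = -lam/2 in this setup
        A[i + 1, i] = A[i, i + 1] = -lam / 2.0
    B = np.eye(N) - A / d               # Jacobi iteration matrix D^{-1}(D - A)
    return np.max(np.abs(np.linalg.eigvals(B)))

r1, r2 = jacobi_radius(10.0), jacobi_radius(20.0)
ratio = np.log(r1) / np.log(r2)         # expected to be close to 2
```

In exact arithmetic $\rho_J = \frac{\lambda}{\phi+\lambda}\cos\frac{\pi}{N+1}$ here, so the computed ratio is close to, but slightly below, 2 for these parameters.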
For Gauss-Seidel, we exploit the fact that ∂R/∂u is tridiagonal. For this class
of matrices (and in fact, for any consistently ordered matrices in the sense of Young
[85]), the following theorem holds [68].
Theorem 2.9. Let $A$ be a consistently ordered matrix such that $a_{ii} \ne 0$ for $i =
1, \ldots, N$, and let the SOR parameter $\omega$ be nonzero. Then, if $\lambda$ is a nonzero eigenvalue
of the SOR iteration matrix $G_{SOR}$, any scalar $\mu$ such that
$$(\lambda + \omega - 1)^2 = \lambda \omega^2 \mu^2 \qquad (2.3.26)$$
is an eigenvalue of the Jacobi iteration matrix.
2.3.7 Extensions
In this section we show how to extend the results of Theorems 2.6 and 2.7 to deal
with:
3. problems in which the flux functions are only defined over a closed interval
I ⊂ R, and
$$\phi_i(u_i^{n+1} - u_i^n) + \lambda\bigl(F_{i+1/2}^{n+1} - F_{i-1/2}^{n+1}\bigr) = 0, \qquad \lambda = \Delta t/\Delta x, \quad i \in \mathbb{Z},$$
with a spatially-varying φi and Fi+1/2 . We assume that 0 < φi ≤ 1. Notice that the
non-uniform grid case is automatically included: for any non-uniform discretization
of the form
$$\frac{\tilde\phi_i(u_i^{n+1} - u_i^n)}{\Delta t} + \frac{F_{i+1/2}^{n+1} - F_{i-1/2}^{n+1}}{\Delta x_i} = 0, \qquad (2.3.27)$$
we can multiply (2.3.27) by $\Delta t\,\Delta x_i/\Delta x_{\max}$ to recover the form of (2.3.1) with
$\phi_i = \tilde\phi_i\,\Delta x_i/\Delta x_{\max}$ and $\lambda = \Delta t/\Delta x_{\max}$.
To ensure convergence of the Jacobi and Gauss-Seidel processes, we need the following
assumptions:
1. the discrete flux functions $\{F_{i+1/2}\}$ are equicontinuous, i.e. they share a common Lipschitz constant;
2. $\{\phi_i\}$ is uniformly bounded away from zero, i.e. there exists $\phi_{\min} > 0$ such that
$\phi_i \ge \phi_{\min}$ for all $i \in \mathbb{Z}$.
While the equicontinuity condition may appear severe, it is usually satisfied in practice
because the spatially-varying coefficients (e.g. K(x) in (2.1.10)) tend to be uniformly
bounded, ensuring equicontinuity in the flux functions. With the above assumptions,
we can mimic Theorem 2.6 exactly by replacing $\lambda$ with $\lambda/\phi_i$. Then the proof goes
through verbatim, except for (2.3.22), which must be modified to
$$\frac{2\lambda K_B}{\phi_{\min} + 2\lambda K_B} < \beta < 1. \qquad (2.3.28)$$
Formally, Theorem 2.6 requires the discrete flux function F (ui , ui+1 ) to be defined on
R × R. In practice one may want to solve problems for which the flux function f is
only defined on an interval [umin , umax ] rather than on all of R, so states outside these
physical bounds are not admissible. For instance, in the two-phase flow problem,
we must have Si ∈ [0, 1] for all i, and the flux function f (S) in (2.1.10) is not even
defined outside this range. Fortunately, the estimate (2.3.20) ensures that if the
initial conditions are within physical bounds, the solution remains within them for
all subsequent time steps $n > 0$. Thus, in order to apply Theorem 2.6 to these problems,
one can formally extend the domain of definition of the flux function f to R by
defining, for instance,
$$\tilde f(u) = \begin{cases} f(u_{\min}), & u < u_{\min}, \\ f(u), & u_{\min} \le u \le u_{\max}, \\ f(u_{\max}), & u > u_{\max}, \end{cases}$$
and similarly for the discrete flux F (u, v). Since all the Gauss-Seidel iterates {y k } and
{xk } satisfy the bound x0 ≤ xk ≤ y k ≤ y 0 , the exact manner in which the extension
is defined is unimportant as long as the monotonicity property (Assumption 1) is
satisfied.
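A constant extension of this kind can be written in a few lines. The sketch below is illustrative only; the quadratic fractional-flow function used as the example is a hypothetical stand-in for $f(S)$, and any extension that preserves the monotonicity of $F$ (Assumption 1) would do equally well.

```python
# Extend a flux defined only on [u_min, u_max] to all of R by clamping
# the argument, as in the definition of f~ above.
def extend(f, u_min, u_max):
    def f_tilde(u):
        return f(min(max(u, u_min), u_max))   # clamp, then evaluate
    return f_tilde

# Hypothetical quadratic fractional-flow curve on [0, 1]
frac_flow = extend(lambda s: s**2 / (s**2 + (1 - s)**2), 0.0, 1.0)
```

Since all Gauss-Seidel iterates stay between the initial bounds, the extended values are never actually visited at the converged solution; they matter only for well-posedness of the iteration.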
Multiple dimensions
The M -function analysis above can be extended to scalar conservation laws in multiple
dimensions. Consider once again the conservative, implicit monotone scheme
$$\phi_i(u_i^{n+1} - u_i^n) + \sum_{l \in \mathrm{adj}(i)} \lambda_{il}\, F_{il}(u_i^{n+1}, u_l^{n+1}) = 0, \qquad i \in I, \qquad (2.3.29)$$
of which the SEQ problem is an example. Recall that Fil is the flux from cell i to
cell l, λil = ∆t|∂Vil |/|Vi |, where |∂Vil | is the area of the surface separating cell i and
l, |Vi | is the volume of cell i and ∆t is the time step. In order to mimic Theorem 2.6,
we need the following assumption on the numerical flux:
1. Fil is equicontinuous with the same Lipschitz constant for all pairs of adjacent
cells (i, l),
2. The number of cells (control volumes) adjacent to cell i, |adj(i)|, is bounded for
all i;
3. The ratio |∂Vil |/|Vi | is bounded for all pairs of adjacent cells (i, l);
4. The quantity φi |Vi | is uniformly bounded away from zero for all i;
5. For any cell i, the total number of cells reachable from i in k steps is O(k p ) for
some fixed p > 0 (i.e. grows at most polynomially in k).
Items (1) and (4) are analogous to the conditions stated in the non-uniform grid case,
whereas the other conditions are new. These assumptions ensure that the residual
functions are all bounded and have the same Lipschitz constant over the set $\{u \in
\ell^\infty(\mathbb{N}) \mid \|u\|_\infty < B\}$. The polynomial growth assumption (5) allows us to assign
the weights $\{w_i^B\}$ to each cell $i$ in the following manner: pick any node $i_0$ and let
$w_i^B = \beta^{d(i_0, i)}$, where $d(i, j)$ is the shortest distance between nodes $i$ and $j$ in the graph-theoretic
sense. Since the number of cells within $k$ steps of $i_0$ grows polynomially in
$k$, the series $\sum_i w_i^B$ converges for any $0 < \beta < 1$, so $\beta$ can be chosen the same way as before.
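The weight assignment $w_i^B = \beta^{d(i_0,i)}$ can be computed by a breadth-first search over the cell-adjacency graph. The sketch below is illustrative: the adjacency structure (here a hypothetical 2D structured grid) stands in for a general unstructured mesh.

```python
# Assign weights w_i = beta**d(i0, i), with d the graph distance from a
# chosen root cell i0, via breadth-first search.
from collections import deque

def weights(adj, i0, beta):
    """adj: dict mapping each cell to the list of its neighbours."""
    dist = {i0: 0}
    queue = deque([i0])
    while queue:
        i = queue.popleft()
        for l in adj[i]:
            if l not in dist:
                dist[l] = dist[i] + 1
                queue.append(l)
    return {i: beta ** d for i, d in dist.items()}

# Example adjacency: a 20-by-20 structured grid (hypothetical)
n = 20
adj = {(i, j): [(i + di, j + dj)
                for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1))
                if 0 <= i + di < n and 0 <= j + dj < n]
       for i in range(n) for j in range(n)}
w = weights(adj, (0, 0), beta=0.5)
```

On this grid the number of cells at distance $k$ grows linearly in $k$, so $\sum_i w_i^B$ is bounded (here by the value of the corresponding geometric double sum), as the polynomial growth assumption guarantees.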
$$u^* - u^0 + \lambda\bigl(F(u^*, u^+) - F(u^-, u^*)\bigr) = 0,$$
$$C = \frac{F(u^*, u^*) - F(u^*, u^+)}{u^+ - u^*} \ge 0 \qquad (2.3.31)$$
$$D = \frac{F(u^*, u^*) - F(u^-, u^*)}{u^* - u^-} \ge 0. \qquad (2.3.32)$$
2.4. CONVERGENCE TO THE ENTROPY SOLUTION 57
$$0 = u^* - u^0 + \lambda\bigl(F(u^*, u^+) - F(u^-, u^*)\bigr)$$
i.e. when φ(x) ≡ 1 is constant and the flux function does not vary in space (but is
generally non-convex and/or non-monotonic). Kružkov [45] has shown that (2.4.1)
has a unique entropy-satisfying weak solution, as stated in the following theorem.
for all ψ ∈ C0∞ (R × [0, T ]) and, in addition, satisfies the entropy condition: For all
ψ ∈ C0∞ (R × [0, T ]) with ψ ≥ 0, and for all c ∈ R,
$$\iint_{\mathbb{R} \times [0,T]} \bigl[\,|u - c|\,\psi_t + \mathrm{sgn}(u - c)\bigl(f(u) - f(c)\bigr)\psi_x\,\bigr]\, dx\, dt \ge 0. \qquad (2.4.3)$$
The classical approach for establishing convergence to the unique entropy solution
proceeds as follows (cf. [24, 41, 70]):
1. Show that the numerical approximations are uniformly bounded and of uniformly
bounded total variation, so that by compactness a convergent subsequence exists;
2. Show that the numerical flux is consistent and satisfies a discrete entropy in-
equality. By the Lax-Wendroff theorem [46], this implies the limit u of the
convergent subsequence satisfies (2.4.2) and (2.4.3) in Theorem 2.11;
3. Verify that the entropy-satisfying weak solution is unique. In the 1D scalar case
this is a result of Theorem 2.11. This ensures all subsequences have the same
limit point, so that the finite difference scheme is convergent as ∆x, ∆t → 0.
A detailed argument along the above lines can be found in [24, 50, 70] and will not
be repeated here. Instead we focus on checking the various criteria listed above. The
numerical flux is assumed to be consistent, and by Theorem 2.6, the discrete solution
is uniformly bounded independently of the spatial and temporal grid sizes. Thus, we only need to verify
that the numerical approximations have bounded total variation, and that a discrete
entropy inequality exists. The following two lemmas address these questions.
Lemma 2.12. Assume the hypotheses of Theorem 2.6, and suppose for $n \ge 1$ the
discrete solution $\{u_i^n\}_{i=-\infty}^{\infty}$ is given by the unique bounded solution satisfying (2.3.1).
Assume the initial data $\{u_i^0\}_{i=-\infty}^{\infty}$ has bounded total variation, i.e.
$$TV(u^0) := \sum_{i=-\infty}^{\infty} |u_{i+1}^0 - u_i^0| < \infty.$$
where
Since Ci , Ci+1 , Di , Di+1 are all non-negative, the triangle inequality gives
$$\sum_{i=-N}^{N} |\Delta u_{i+1/2}| \le \sum_{i=-N}^{N} |\Delta v_{i+1/2}| + \lambda C_{N+1}|\Delta u_{N+3/2}| + \lambda D_{-N}|\Delta u_{-N-1/2}| \le TV(v) + 4\lambda K_B \cdot B,$$
where ku0 k∞ < B and KB is the local Lipschitz constant. Since the last expression
is finite and independent of N , the monotone convergence theorem guarantees that if
we let N approach infinity, the series will converge, so we get T V (u) < ∞. Moreover,
it implies that for every ε > 0 there exists N (which depends on ε) such that
$$\sum_{|i| > N} |\Delta u_{i+1/2}| \le \min\left\{\frac{\varepsilon}{2},\; \frac{\varepsilon}{2\lambda K_B}\right\}.$$
Thus, we have
$$\begin{aligned}
TV(u) &\le \frac{\varepsilon}{2} + \sum_{i=-N}^{N} |\Delta u_{i+1/2}| \\
&\le \frac{\varepsilon}{2} + \sum_{i=-N}^{N} |\Delta v_{i+1/2}| + \lambda C_{N+1}|\Delta u_{N+3/2}| + \lambda D_{-N}|\Delta u_{-N-1/2}| \\
&\le \frac{\varepsilon}{2} + TV(v) + \lambda K_B\bigl(|\Delta u_{N+3/2}| + |\Delta u_{-N-1/2}|\bigr) \\
&\le \frac{\varepsilon}{2} + TV(v) + \lambda K_B \sum_{|i| > N} |\Delta u_{i+1/2}| \\
&\le \frac{\varepsilon}{2} + TV(v) + \lambda K_B \cdot \frac{\varepsilon}{2\lambda K_B} \\
&\le TV(v) + \varepsilon.
\end{aligned}$$
in the weak sense, where $\varphi(\cdot)$ is an arbitrary $C^2$ function with $\varphi'' > 0$, and $(\varphi, \psi)$ are
related by $\psi' = \varphi' f'$. Kružkov showed in [45] that this formulation is equivalent to
requiring that condition (2.4.3) be satisfied for all $c \in \mathbb{R}$.
Lemma 2.13. Assume the hypotheses of Theorem 2.6, and let $\{u_i^n\}$ be the unique
bounded solution satisfying (2.3.1). Let $(\varphi, \psi)$ be an entropy/flux pair. Then there exist
functions $\Phi = \Phi(u)$ and $\Psi = \Psi(u^-, u^+)$ that are consistent with $(\varphi, \psi)$ (i.e. $\Phi(u) =
\varphi(u)$ and $\Psi(u, u) = \psi(u)$) such that $(\Phi, \Psi)$ satisfies a discrete entropy inequality:
Proof. The development essentially follows [75]. Define the entropy variables
$$v := \varphi'(u).$$
Since $\varphi'' > 0$, $\varphi'$ is one-to-one, so we can make a change of variables and let $u = u(v)$.
We can then define the potential function
$$q(v) = \int_0^v f(u(\eta))\, d\eta,$$
and set
$$\Phi(u) := \varphi(u), \qquad \Psi(u_i, u_{i+1}) := \tfrac{1}{2}(v_i + v_{i+1})F_{i+1/2} - \tfrac{1}{2}\bigl(q(v_i) + q(v_{i+1})\bigr).$$
It is easily seen that $\Psi$ is consistent with $\psi$ (by showing that $\frac{d}{du}\Psi(u, u) = \varphi' f'$). Then,
since $\varphi = \Phi$ is convex, we have
$$\Phi(u_i^{n+1}) + \Phi'(u_i^{n+1})(u_i^n - u_i^{n+1}) \le \Phi(u_i^n),$$
so that
$$\begin{aligned}
0 &\ge \Phi(u_i^{n+1}) - \Phi(u_i^n) + v_i^{n+1}(u_i^n - u_i^{n+1}) \\
0 &\ge \Phi(u_i^{n+1}) - \Phi(u_i^n) + \lambda v_i^{n+1}\bigl(F_{i+1/2}^{n+1} - F_{i-1/2}^{n+1}\bigr) \\
0 &\ge \Phi(u_i^{n+1}) - \Phi(u_i^n) + \lambda\Bigl[\bigl(v_i^{n+1} F_{i+1/2}^{n+1} - q(v_i^{n+1})\bigr) - \bigl(v_i^{n+1} F_{i-1/2}^{n+1} - q(v_i^{n+1})\bigr)\Bigr]
\end{aligned}$$
where $\eta$ lies between $v_i$ and $v_{i+1}$. If $v_i \le v_{i+1}$, then $u_i \le u(\eta) \le u_{i+1}$, so the
integrand is non-positive. Analogously, $v_i \ge v_{i+1}$ implies that the integrand is
non-negative, so either way the integral cannot be positive, thus proving (2.4.8).
Relation (2.4.9) is proved similarly, and the lemma follows.
$$\frac{\partial S}{\partial t_D} + \frac{\partial f(S)}{\partial x_D} = 0.$$
The flux function $f(S)$ is shown in Figure 1(b), with a sonic point at $S = 0.49$;
countercurrent flow occurs whenever $S \ge 0.49$. The initial saturation profile is a step
function with
$$S^0(x_D) = \begin{cases} 1, & 0 \le x_D < 0.2, \\ 0, & 0.2 < x_D \le 1. \end{cases}$$
The numerical solution is compared with the analytical solution at time tD = 0.15.
Because of the sonic point, the solution contains two shocks connected by a rarefac-
tion; one shock moves to the right with a velocity of 3.9, and the other travels to the
left with a velocity of −1.2. To assess the accuracy of a numerical solution, we report
two error measures:
• The $L^1$-error, which is the difference between the numerical and the analytical
solution in the $L^1$-norm;
• The front dispersion, which is the distance between the analytical shock front and
the leftmost point at which the numerical solution becomes zero.
We also measure the difficulty of the nonlinear problem by showing, for each test case,
the average number of nonlinear Gauss-Seidel iterations required for convergence at each
time step. We remark that this measure is only useful for problems with countercurrent
flow, since Gauss-Seidel always converges in one iteration in the cocurrent case
(cf. section 2.3.6).
analytical solution even though the CFL number is greater than 1, which confirms
our analysis. Moreover, both the $L^1$ error and the front dispersion converge slightly
slower than linearly, decreasing by factors of about 0.61 and 0.58, respectively,
for every refinement by a factor of two. Also note the poor resolution near the left
boundary for $N = 25, 50, 100$: instead of approaching $S = 1$, the solution is closer
to $S_c \approx 0.27$ at the left boundary. On these coarser grids, the numerical solution
has difficulty deciding whether the left-moving wave has reached the boundary,
which is maintained at $S(x = 0) = S_c$ (see Equation (2.2.10)). At higher resolutions
($N = 200, 400$), the artifact disappears and the numerical solution reproduces
the back end of the saturation profile quite accurately. The average number of Gauss-Seidel
iterations required for convergence is similar in all cases, so refining the grid for a fixed
mesh ratio does not increase the difficulty of the problem for the nonlinear solver.
Table 2.2: Accuracy of numerical solutions for a fixed time step size.
N tD /∆t CFL L1 -error Front dispersion Average # iterations
25 20 1.02 0.0673 > 0.215 2.6
50 20 2.05 0.0529 0.156 3.3
100 20 4.10 0.0444 0.116 4.4
200 20 8.20 0.0378 0.101 6.4
400 20 16.40 0.0366 0.094 9.2
error. In addition, the average number of iterations required to attain convergence in-
creases with each refinement: as we refine the grid, we are solving increasingly difficult
problems, even though the improvement in solution accuracy will stagnate beyond a
certain point. Thus, even though the fully-implicit method can tolerate arbitrarily
large CFL numbers, one should not hope to improve solution accuracy indefinitely
by using a finer spatial grid alone, without a corresponding reduction in time-step size.
of iterations required for Gauss-Seidel convergence is roughly the same for both the
uniform and non-uniform case, so the resulting nonlinear equations are not harder to
solve, despite the large CFL numbers.
2.5. ACCURACY OF PHASE-BASED UPSTREAMED SOLUTIONS 67
[Figure: five panels of $S_w$ versus $x_D$; (c) $N = 100$, $T/\Delta t = 20$; (d) $N = 200$, $T/\Delta t = 40$; (e) $N = 400$, $T/\Delta t = 80$; all with CFL = 4.0956.]
Figure 2.3: Numerical solution at different resolutions, CFL = 4.10, tD = 0.15.
[Figure: five panels of $S_w$ versus $x_D$, 20 time steps each; (a) $N = 25$, CFL = 1.0239; (b) $N = 50$, CFL = 2.0478; (c) $N = 100$, CFL = 4.0956; (d) $N = 200$, CFL = 8.1913; (e) $N = 400$, CFL = 16.3826.]
Figure 2.4: Numerical solution for different spatial grids, 20 time steps, tD = 0.15.
[Figure: four panels of $S_w$ versus $x_D$; (a) $N = 50$, $T/\Delta t = 20$, CFL = 105.8195; (b) $N = 50$, $T/\Delta t = 50$, CFL = 42.3278; (c) $N = 50$, $T/\Delta t = 20$, CFL = 2.0478; (d) $N = 50$, $T/\Delta t = 50$, CFL = 0.81913.]
Figure 2.5: Numerical solutions obtained from a non-uniform grid ((a) and (b)) and
their uniform-grid counterparts ((c) and (d)), tD = 0.15.
Chapter 3
Potential Ordering
The main theme of this thesis is the reordering of equations and variables in a way
that allows a partial decoupling of the problem into a sequence of single-cell problems
that are easier to solve. The basic insight is to perform reordering based on flow
direction information, which is provided by the pressure field. This approach is in-
tuitive because saturation information travels from upstream to downstream, so one
expects methods that respect this ordering to be more efficient than methods that
are blind to upstream information. We have already seen in section 2.3.6 that in the
1D cocurrent case, nonlinear Gauss-Seidel converges in exactly one iteration if, and
only if, the cells are ordered from upstream to downstream. Thus, ordering can have
a large impact on the performance of solution algorithms.
For a problem with np phases, there are np equations and unknowns associated
with each block, which means there are multiple ways of ordering these equations
while respecting the direction of flow. We can distinguish between the following two
categories of ordering:
1. Cell-based ordering, in which all the equations and variables aligned with a cell
(control volume) are grouped together as a block, and reordering only applies
at the cell level;
2. Phase-based ordering, in which the equations and unknowns associated with
each phase are ordered separately, so that the ordering need not be aligned with
any single cell ordering.
3.1. METHODS DERIVED FROM CELL-BASED ORDERING 71
The two approaches are useful in different situations and they both contribute to
the various nonlinear solvers and preconditioning algorithms presented in subsequent
chapters.
Cascade method
The Cascade method was proposed by Appleyard and Cheshire [4] as an acceleration
scheme for the basic Newton method. A brief description of the method follows.
Suppose we have an $n_p$-phase model ($n_p = 2$ or $3$) in which we discretize the domain
into $N$ gridblocks. The first step in the Cascade method is the same as in the ordinary
Newton method: namely, we linearize the $n_p N$ conservation equations and solve the
$n_p N$-by-$n_p N$ linear system $J(x^{(\nu)})\,\delta x^{(\nu)} = -R(x^{(\nu)})$ for $\delta x^{(\nu)}$. Next, we apply a
being. Using this new pressure field, we update the potential for each phase, and
then we order the cells from the highest potential to the lowest. This is the order
in which the Cascade sweep should be performed. Note that there is a choice in the
ordering because the potential sequence can be different for each phase. Appleyard
and Cheshire suggest that one Cascade sweep be done for the potential sequence
of each phase, although the method was only demonstrated for a two-phase flow
problem.
72 CHAPTER 3. POTENTIAL ORDERING
1. Form the full Jacobian $J$, evaluated at $(S^k, P_o^k)$;
2. Solve $J \begin{pmatrix} \delta S^k \\ \delta P_o^k \end{pmatrix} = -r^k$;
3. Compute $P_o^{k+1} = P_o^k + \delta P_o^k$;
4. Reorder the cells so that $P_{o,i} \ge P_{o,j}$ whenever $i < j$;
5. For $i = 1, \ldots, N$:
6.    Solve (3.1.1) at cell $i$ for $S_{w,i}$ and $P_{o,i}$;
7.    Update $S_{w,i}^{k+1}$ using the value from line 6;
8.    Compute outward fluxes $FO_p(S_w, P_o)$ for subsequent cells;
9. end for
$$\begin{aligned}
f_o(S_w, P_o) &= \frac{1}{\Delta t}\Delta M_o(S_w, P_o) + FO_o(S_w, P_o) - FI_o - q_o = 0, \\
f_w(S_w, P_o) &= \frac{1}{\Delta t}\Delta M_w(S_w, P_o) + FO_w(S_w, P_o) - FI_w - q_w = 0,
\end{aligned} \qquad (3.1.1)$$
where $\Delta M_p$ is the accumulation of phase $p$, $FO_p$ and $FI_p$ are the outward and inward
fluxes of phase $p$ respectively, and $q_p$ are the well terms. For a three-phase problem,
we would have three such equations. We assume that the inward fluxes are known
and independent of the values of Sw and Po at the cell, which is valid provided
that all neighboring cells at a higher potential have been processed, and there is
no countercurrent flow. We now have a system of two nonlinear equations in two
unknowns, which can be efficiently solved for Sw and Po . The computed Sw are taken
to be the saturation solution for the nonlinear iteration, and the computed F Op are
used as the influx for subsequent single-cell problems. The computed Po , on the other
hand, are discarded, since their only purpose is to ensure local mass conservation for
both phases and do not yield an accurate approximation for the global pressure field.
In other words, the approximate solution $(S^{(\nu)}, P_o^{(\nu)})$ takes its saturation values from
the single-cell problems, but the pressure values are obtained from the linear update.
Figure 3.1 outlines one step of the cascade method.
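The sweep portion of the method (steps 4-9 above) can be sketched as follows. This is a structural skeleton only: `single_cell_residual` is a hypothetical stand-in for the 2-by-2 mass-balance system (3.1.1), `flux_out` is a hypothetical callback returning the computed outward fluxes to downstream cells, and a generic root-finder plays the role of the single-cell solver.

```python
# Skeleton of one Cascade sweep: visit cells in decreasing order of the
# updated potential, solve the local 2x2 system at each cell, keep the
# saturation, and pass the outward fluxes downstream.
import numpy as np
from scipy.optimize import fsolve

def cascade_sweep(Po, Sw, single_cell_residual, flux_out):
    order = np.argsort(-Po)             # highest potential first
    influx = {i: 0.0 for i in range(len(Po))}
    for i in order:
        # solve f_o = f_w = 0 at cell i for (Sw_i, Po_i)
        sol = fsolve(lambda x: single_cell_residual(i, x[0], x[1], influx[i]),
                     [Sw[i], Po[i]])
        Sw[i] = sol[0]                  # keep the saturation ...
        for j, q in flux_out(i, sol[0], sol[1]):
            influx[j] += q              # ... and pass fluxes downstream
    return Sw                           # the local pressures are discarded
```

As in the text, the locally computed pressures serve only to enforce mass balance for both phases within each cell and are not retained.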
• incompressible flow,
It can be shown that the Cascade method converges to the solution in two iterations
for this problem (see Appendix C for a proof). However, this ceases to be true in the
presence of countercurrent flow or in multiple dimensions. Also, the formulation may
break down if the phase potential chosen to order the cells contains local minima; in
this case, the cell whose potential is at a local minimum will lack an outward flux term
F Op , so it would be impossible to satisfy mass balance for both phases no matter
what Sw and Po are. This is an important drawback because in practical applications
it is usually impossible to guarantee the absence of local minima in the pressure field
when the solution has not converged, especially when the initial guess is poor.
Natvig, Lie and Eikemo [56] proposed a cell-based reordering method for solving the
multiphase advection problem in the absence of gravity and capillarity. In [56] the
reordering was applied to equations obtained from a discontinuous Galerkin discretiza-
tion, but it can equally be applied to the standard finite volume methods described
in section 1.2.1. Basically, a topological sort (cf. [22]) is performed on the directed
acyclic graph G = (V, E), whose nodes V are the control volumes, and whose edges E
are the directions of the total velocity across cell interfaces (which coincide with the
flow directions for each phase, since there is no countercurrent flow). The single-cell
problems, each consisting of an np -by-np nonlinear system, are solved in the topologi-
cal order from upstream to downstream by Newton’s method, for example. Since the
pressure and total velocity fields are regarded as part of the data rather than the un-
knowns, this ordering completely decouples the system, just like Gauss-Seidel is exact
for cocurrent 1D flow. In fact, this approach can be regarded as a block nonlinear
Gauss-Seidel method, which is exact as long as the nonlinear system is block lower
depends only on the saturation of the upstream cell. Suppose we reorder the cells such
that they appear in decreasing order of pressure, i.e. pi ≥ pj whenever i < j. Then
for all j, the component conservation equations for cell j depend only on saturations
Si with i ≤ j. Thus, we can rearrange the system of nonlinear equations to the form
$$\begin{aligned}
f_{c1}(S_1, p_1, \ldots, p_N) &= 0 \\
f_{c2}(S_1, S_2, p_1, \ldots, p_N) &= 0 \\
&\;\;\vdots \\
f_{cN}(S_1, S_2, \ldots, S_N, p_1, \ldots, p_N) &= 0,
\end{aligned} \qquad (3.2.2)$$
3.2. PHASE-BASED ORDERING 75
where c = o, w are the oil and water components, respectively. Notice how the
saturation part of the equations becomes “triangular”. Thus, if we have the exact
pressure solution p1 , . . . , pN , we can perform a “forward substitution” and solve a
series of single-variable nonlinear equations to obtain the saturations S1 , . . . , SN . We
remark that the triangularity carries over to the Jacobian matrix, which now has the
form
$$J = \begin{pmatrix} J_{ww} & J_{wp} \\ J_{ow} & J_{op} \end{pmatrix} \begin{matrix} \text{water equations} \\ \text{oil equations} \end{matrix} \qquad (3.2.3)$$
with columns ordered as $(S_w, p)$, where $J_{ww}$ is lower triangular.
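The nonlinear "forward substitution" enabled by a triangular $J_{ww}$ can be sketched as follows. The residual functions below are hypothetical stand-ins for the discretized water equations; each is a scalar equation in its own saturation once the upstream saturations are known, and a bracketing scalar solver suffices.

```python
# Nonlinear forward substitution: equation i is scalar in S_i given the
# previously computed saturations S[0..i-1] (pressures held fixed).
from scipy.optimize import brentq

def forward_substitution(residuals, lo=0.0, hi=1.0):
    """residuals[i](s, S) -> value of equation i when S_i = s,
    given the list S of already-computed upstream saturations."""
    S = []
    for r in residuals:
        S.append(brentq(lambda s: r(s, S), lo, hi))
    return S

# Hypothetical chain: cell 0 is fixed by inflow, each later cell's
# saturation equals half of its upstream neighbour's.
residuals = [lambda s, S, k=i: s - (0.8 if k == 0 else 0.5 * S[k - 1])
             for i in range(4)]
S = forward_substitution(residuals)
```

Each scalar solve touches only data already computed, mirroring how a lower triangular linear system is solved by forward substitution.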
In the three-phase case, we have two saturation variables per cell, which we can
choose as Sw and So without loss of generality. Since the black oil model assumes
that krw depends solely on Sw , the above construction can be used to order the water
equations. Now kro depends on both Sw and So , but we can maintain triangularity
by writing all the water equations first before writing the oil and gas equations. The
nonlinear system then looks like
$$\begin{aligned}
f_{w1}(S_{w1}, p_1, \ldots, p_N) &= 0 \\
f_{w2}(S_{w1}, S_{w2}, p_1, \ldots, p_N) &= 0 \\
&\;\;\vdots \\
f_{wN}(S_{w1}, \ldots, S_{wN}, p_1, \ldots, p_N) &= 0 \\
f_{o1}(S_{w1}, \ldots, S_{wN}, S_{o1}, p_1, \ldots, p_N) &= 0 \\
&\;\;\vdots \\
f_{oN}(S_{w1}, \ldots, S_{wN}, S_{o1}, \ldots, S_{oN}, p_1, \ldots, p_N) &= 0,
\end{aligned} \qquad (3.2.4)$$
together with $f_{gi}(S_{w1}, \ldots, S_{wN}, S_{o1}, \ldots, S_{oN}, p_1, \ldots, p_N) = 0$ for $i = 1, \ldots, N$.
The corresponding Jacobian then has the block form
$$J = \begin{pmatrix} J_{ww} & 0 & J_{wp} \\ J_{ow} & J_{oo} & J_{op} \\ J_{gw} & J_{go} & J_{gp} \end{pmatrix} \begin{matrix} \text{water equations} \\ \text{oil equations} \\ \text{gas equations} \end{matrix} \qquad (3.2.5)$$
with columns ordered as $(S_w, S_o, p)$, and with $J_{ww}$ and $J_{oo}$ lower triangular, which implies
the entire upper-left $2 \times 2$ block is lower triangular.
triangular. Note that Jow will also be lower triangular, since all phases have the same
upstream direction. However, this fact is not needed to justify solving for Sw and So
using forward substitution.
In the presence of gravity, buoyancy forces can cause different phases to flow in
opposite directions. The upstream direction for each phase $p$ is determined by the
sign of $(\Phi_{p,i} - \Phi_{p,l})$, where
$$\Phi_{p,i} = p_i - \gamma_p z_i \qquad (3.2.6)$$
is the phase potential at cell $i$, $z_i$ is the depth of the cell, and $\gamma_p$ is the specific gravity
of phase p. Despite possible differences in upstream directions, we are interested in
maintaining the triangular forms shown in (3.2.2) and (3.2.4) (and equivalently (3.2.3)
and (3.2.5)). For two-phase flow, one can simply use Φw for ordering, since one only
needs Jww (and not Jow ) to be triangular. For three-phase flow, we need both Jww
and Joo to be lower triangular. Clearly, no cell-based ordering can accomplish this; we
need to order the water and oil phases separately. The trick is to exploit the relative
permeability dependencies (1.1.9) in such a way that triangularity is preserved.
Unlike the cocurrent flow case, we can no longer align the ordering of equations
and variables with cell ordering. Thus, in the sequel, subscripts (such as k in Φp,k )
always denote the value of the scalar field (in this case, the potential of phase p) at cell
k in the natural ordering. This is because we concentrate on ordering the equations
and unknowns, rather than the cells themselves.
In other words, if cell $k$ is such that $\Phi_{w,k} > \Phi_{w,l}$ for any other cell $l$, then $\sigma_1 := k$.
Suppose we order first all the water equations and the associated variables Sw using
the σ ordering, and then order the oil equations and the associated variables So using
the τ ordering. The nonlinear system then looks like
$$\begin{aligned}
f_{w,\sigma_1}(S_{w,\sigma_1}, p_1, \ldots, p_N) &= 0 \\
f_{w,\sigma_2}(S_{w,\sigma_1}, S_{w,\sigma_2}, p_1, \ldots, p_N) &= 0 \\
&\;\;\vdots \\
f_{w,\sigma_N}(S_{w,\sigma_1}, \ldots, S_{w,\sigma_N}, p_1, \ldots, p_N) &= 0 \\
f_{o,\tau_1}(S_{w,\sigma_1}, \ldots, S_{w,\sigma_N}, S_{o,\tau_1}, p_1, \ldots, p_N) &= 0 \\
&\;\;\vdots \\
f_{o,\tau_N}(S_{w,\sigma_1}, \ldots, S_{w,\sigma_N}, S_{o,\tau_1}, \ldots, S_{o,\tau_N}, p_1, \ldots, p_N) &= 0,
\end{aligned} \qquad (3.2.9)$$
together with $f_{gi}(S_{w,\sigma_1}, \ldots, S_{w,\sigma_N}, S_{o,\tau_1}, \ldots, S_{o,\tau_N}, p_1, \ldots, p_N) = 0$ for $i = 1, \ldots, N$.
Now consider the pattern of the corresponding Jacobian matrix. Clearly, Jww is
still lower triangular because of (3.2.7), and Joo is lower triangular because of (3.2.8).
The only effect of countercurrent flow is that Jow will no longer be lower triangular,
because the Sw are not arranged in the order of decreasing oil potential, Φo . However,
as long as the upper-left 2 × 2 block in (3.2.5) is lower triangular, we can use forward
substitution to solve for Sw and So once the pressures are known.
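Constructing the two orderings amounts to two independent sorts of the cell potentials (3.2.6). The sketch below is illustrative; the arrays and specific gravities are hypothetical inputs.

```python
# Separate phase orderings: sigma sorts cells by decreasing water
# potential, tau by decreasing oil potential.
import numpy as np

def phase_orderings(p, z, gamma_w, gamma_o):
    phi_w = p - gamma_w * z                      # water potential per cell
    phi_o = p - gamma_o * z                      # oil potential per cell
    sigma = np.argsort(-phi_w, kind="stable")    # sigma_1 = highest phi_w
    tau = np.argsort(-phi_o, kind="stable")      # tau_1 = highest phi_o
    return sigma, tau
```

With gravity, the two orderings generally differ; in the cocurrent case (equal potentials up to a monotone transformation) they coincide and reduce to a single cell ordering.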
3.2.3 Capillarity
So far, in the absence of capillary effects, the saturation dependence in each equation
is purely upstream; thus, for a given phase, saturations downstream from cell i do not
appear in equation i. When capillary effects are present, equation i involves phase
pressures from all neighboring cells, be they upstream or downstream from cell i. In
the standard approach, we can only choose one phase pressure as a primary variable;
the other phase pressures must be expressed through the capillary pressure relations,
$p_q = p_p \pm P_c(S)$, where $p_p$ is the primary phase pressure and $p_q$ is the pressure of another phase. Thus,
when capillarity is present, we must choose our primary variables carefully to avoid
introducing downstream dependence on saturation that cannot be removed by simply
reordering the equations and unknowns. Choosing pw , the water-phase pressure, as
the primary pressure variable allows us to maintain the triangularity in the upper-left
block of (3.2.5). Note that choosing pg causes the water equations to depend on So ,
since pw = pg − Pcog (Sg ) − Pcow (Sw ) and Sg = 1 − Sw − So . This would completely
destroy the triangularity of the block. If we instead choose po , then there would be
no $S_o$ dependence, but there would be both upstream and downstream dependence
on $S_w$ due to $p_w = p_o - P_{cow}(S_w)$, which is undesirable. Thus, the only choice that
leaves the water equation intact (i.e., a triangular $J_{ww}$) is $p_w$.
We need to ensure that Joo is still lower triangular when pw is used. We have
We also remark that in most simulations, the flow directions do not change very
often, so it may not be necessary to compute this ordering at every time step. For
instance, we could compute the potential ordering only at the beginning of a time
step. At each subsequent Newton iteration, we could simply verify the validity of the
ordering, and only recompute it when the submatrix ceases to be triangular.
Chapter 4

Reduced Newton Method
In this chapter, we use the phase-based ordering introduced in section 3.2 to reformu-
late the mass-balance equations into a system of smaller size that involves pressure
variables only. The Implicit Function Theorem [66] plays a central role in the formu-
lation. We first describe the algorithm that arises when Newton’s method is applied
to the reduced system.
4.1. ALGORITHM DESCRIPTION 81
where
$$J_{ss} = \partial F_s/\partial S, \qquad J_{sp} = \partial F_s/\partial p, \qquad J_{gs} = \partial F_g/\partial S, \qquad J_{gp} = \partial F_g/\partial p.$$
It can be shown that $J_{ss}$ is nonsingular as long as the monotonicity condition $dk_{rp}/dS_p \ge 0$
holds for $p = o, w$ (see Appendix D for the proof). For $k_{rw} = k_{rw}(S_w)$ (which
is usually obtained from experimental data), monotonicity is almost always satisfied,
but the situation is less clear for kro = kro (Sw , Sg ), since the latter is usually obtained
by interpolating data from oil-water and oil-gas experiments. Certain methods of
interpolation, such as Stone I and Stone II [6], yield monotonic kro under mild con-
ditions (see Appendix D), but this is not always the case for other methods (e.g.,
the segregation model [37]). In this work it is assumed that kro is a monotonically
increasing function of So when Sw is fixed, which would ensure the nonsingularity of
Jss .
$$F_g(S(p), p) = 0, \qquad (4.1.3)$$
which we need to solve for the pressure p. If we use Newton’s method to solve (4.1.3),
82 CHAPTER 4. REDUCED NEWTON METHOD
the required Jacobian is
$$J_{\text{reduced}} = \frac{\partial F_g}{\partial S}\frac{\partial S}{\partial p} + \frac{\partial F_g}{\partial p} \qquad (4.1.4)$$
$$\phantom{J_{\text{reduced}}} = J_{gs}\frac{\partial S}{\partial p} + J_{gp}. \qquad (4.1.5)$$
Differentiating the constraint $F_s(S(p), p) = 0$ with respect to $p$ gives
$$\frac{\partial F_s}{\partial S}\frac{\partial S}{\partial p} + \frac{\partial F_s}{\partial p} = 0, \qquad (4.1.6)$$
so that $\partial S/\partial p = -J_{ss}^{-1} J_{sp}$ and
$$J_{\text{reduced}} = J_{gp} - J_{gs} J_{ss}^{-1} J_{sp}, \qquad (4.1.8)$$
which is precisely the Schur complement of (4.1.2) with respect to pressure. Figure 4.1
summarizes the algorithm used to solve the reduced system. Notice that the only difference
between the algorithm in Figure 4.1 and Newton's method applied to the full
problem is the way we compute $S^{k+1}$. In the full method, we set $S^{k+1} = S^k + \delta S^k$; in
the reduced method, $S^{k+1}$ is updated nonlinearly by solving the constraint equations
$F_s(S^{k+1}, p^{k+1}) = 0$, in which the special triangular structure of $J_{ss}$ is exploited. Also
note that since this is just the usual Newton method applied to a reduced problem,
convergence is locally quadratic.
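One iteration of the reduced method can be sketched as follows. The callbacks are hypothetical stand-ins: `solve_constraints(p)` returns $S(p)$ from the triangular single-cell solves (so that $F_s(S(p), p) = 0$), `blocks(S, p)` evaluates the four Jacobian blocks, and `Fg` evaluates the remaining residual.

```python
# One reduced Newton iteration: enforce the constraints exactly, then
# take a Newton step on the Schur-complement system (4.1.8) in p only.
import numpy as np

def reduced_newton_step(p, solve_constraints, blocks, Fg):
    S = solve_constraints(p)                    # nonlinear constraint solve
    Jss, Jsp, Jgs, Jgp = blocks(S, p)
    J_red = Jgp - Jgs @ np.linalg.solve(Jss, Jsp)   # Schur complement
    dp = np.linalg.solve(J_red, -Fg(S, p))
    return p + dp, S
```

For a linear model problem this step recovers the exact solution in one iteration, consistent with the local quadratic convergence of Newton's method on the reduced system.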
a time (where $n_p$ is the number of fluid phases). A wide variety of reliable univariate
solvers are available for the nonlinear single-cell problems. One such
choice is the van Wijngaarden-Dekker-Brent method [14], which combines bisection
with inverse quadratic interpolation to obtain superlinear convergence. This is a
derivative-free algorithm, which means only function values are required, although an
initial guess based on the solution of the ordinary Newton step can be used to accel-
erate convergence. In a reasonably efficient implementation, each function evaluation
should only require a few floating-point operations. As shown in section 4.3, the extra
cost of the single-cell nonlinear solves is usually offset by a reduction in the number
of global Newton steps. The nonlinear updates can be performed quite efficiently if
more sophisticated zero-finders are used.
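For instance, a single-cell water-constraint solve can be bracketed on $[0, 1]$ and handed to SciPy's Brent implementation. This is a hedged sketch: the quadratic mobility and all parameter values are made up, not the dissertation's actual implementation.

```python
from scipy.optimize import brentq

def lam_w(S, mu_w=1.0):
    """Made-up quadratic water mobility k_w(S)/mu_w."""
    return S**2 / mu_w

def solve_cell(V, K, pi, influx):
    """Solve the single-cell water constraint
        F_w(S) = V*S + K*lam_w(S)*pi - influx = 0
    for S in [0, 1] with Brent's method (derivative-free)."""
    F = lambda S: V * S + K * lam_w(S) * pi - influx
    return brentq(F, 0.0, 1.0)
```

Because $F_w(0) = -\text{influx} \le 0$ and $F_w$ is increasing in $S$, the bracket $[0, 1]$ is valid whenever the cell problem has a physical solution.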
The first way is to notice that one can solve the equivalent system
$$\begin{bmatrix} J_{ss} & J_{sp} \\ J_{gs} & J_{gp} \end{bmatrix} \begin{bmatrix} \delta S \\ \delta p \end{bmatrix} = \begin{bmatrix} 0 \\ -r \end{bmatrix}. \qquad (4.1.10)$$
Krylov subspace methods (such as GMRES) can be used, and effective preconditioners
(such as the Constrained Pressure Residual method [81]) are available. A second
way is to apply the Krylov method directly to the Schur complement system. In
this approach, matrix-vector multiplication by $J_{\text{reduced}}$ would have the same cost as multiplication by the full matrix, because $J_{ss}$ is lower triangular, so that multiplication by $J_{ss}^{-1}$ is simply a forward substitution. In terms of preconditioning, one can either
precondition Jreduced directly with ILU type methods, or use an induced preconditioner
based on the full system by letting
$$M_{\text{reduced}}^{-1} = R M_{\text{full}}^{-1} R^T, \qquad (4.1.11)$$
−1
where Mfull is the preconditioner for the full system, and R = 0 I is the restriction
operator to the pressure variables. In other words, a preconditioning step for the
−1
reduced system y = Mreduced x consists of the following steps:
1. Pad the vector $x$ with zeros to form $\hat{x} = \begin{pmatrix} 0_{(n_p-1)N} \\ x \end{pmatrix}$.

2. Compute $\hat{y} = M_{\text{full}}^{-1}\hat{x}$.

3. Restrict to the pressure block: $y = R\hat{y}$.
One potential advantage of applying the Krylov method to the Schur complement
system rather than the full system is that the resulting Krylov vectors are only of
length $N$ rather than length $n_p N$, where $n_p$ is the number of fluid phases. This greatly
reduces storage requirements and orthogonalization cost in methods such as GMRES,
so that more Krylov steps can be taken before restarting.
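The pieces above (the triangular Schur-complement matvec and the induced preconditioner) can be sketched with SciPy's matrix-free machinery. This is a toy dense example; the block sizes and the block-diagonal stand-in for $M_{\text{full}}$ are arbitrary choices for illustration, not the CPR preconditioner mentioned in the text.

```python
import numpy as np
from scipy.linalg import solve_triangular
from scipy.sparse.linalg import LinearOperator, gmres

rng = np.random.default_rng(0)
N = 8                                               # pressure unknowns
Jss = np.tril(rng.random((N, N))) + 5 * np.eye(N)   # lower triangular block
Jsp, Jgs = rng.random((N, N)), rng.random((N, N))
Jgp = rng.random((N, N)) + 5 * np.eye(N)

# Matrix-free Schur complement: J_reduced x = Jgp x - Jgs Jss^{-1} Jsp x,
# where Jss^{-1} v is just a forward substitution (Jss is lower triangular).
def apply_reduced(x):
    return Jgp @ x - Jgs @ solve_triangular(Jss, Jsp @ x, lower=True)

J_reduced = LinearOperator((N, N), matvec=apply_reduced, dtype=float)

# Induced preconditioner (4.1.11): pad, apply M_full^{-1}, restrict.
M_full = np.block([[Jss, np.zeros((N, N))],
                   [np.zeros((N, N)), Jgp]])        # arbitrary stand-in
def apply_prec(x):
    xhat = np.concatenate([np.zeros(N), x])         # 1. pad with zeros
    yhat = np.linalg.solve(M_full, xhat)            # 2. apply M_full^{-1}
    return yhat[N:]                                 # 3. restrict to pressure

M = LinearOperator((N, N), matvec=apply_prec, dtype=float)
rhs = rng.random(N)
dp, info = gmres(J_reduced, rhs, M=M)               # Krylov vectors of length N
```

Note that the Krylov vectors here have length $N$, not $n_p N$, which is the storage advantage discussed above.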
In fact, the Schur complement reduction can be used even if the nonlinear con-
straint equations are not exactly satisfied. This could happen if the initial pressure
guess is so poor that some of the residual constraints in the reduced Newton cannot
be satisfied. In that case we would have
$$\begin{bmatrix} J_{ss} & J_{sp} \\ J_{gs} & J_{gp} \end{bmatrix} \begin{bmatrix} \delta S \\ \delta p \end{bmatrix} = -\begin{bmatrix} r_s \\ r_g \end{bmatrix} \qquad (4.1.12)$$
4.2. CONVERGENCE ANALYSIS 85
$$J_{\text{reduced}}\,\delta p = -\left(r_g - J_{gs} J_{ss}^{-1} r_s\right), \qquad (4.1.13)$$
which has the same form as (4.1.9). All these options are evaluated in chapter 5.
Recall that Newton's method for solving $f(x) = 0$ is the fixed-point iteration defined by the map
$$g : x \mapsto x - (f'(x))^{-1} f(x).$$
Theorem 4.1. Let $f$ be a $C^2$ function over some interval $J$, and let $I = (a, b) \subset J$ be an open interval such that $f'(x) \neq 0$ on $I$ and
$$x \in I \implies \frac{|f(x)f''(x)|}{|f'(x)|^2} < 1. \qquad (4.2.1)$$
Let $x^* \in I$ be such that $f(x^*) = 0$, and let $L = \min\{|x^* - a|, |b - x^*|\}$. Then $x^*$ is the unique root of $f$ in $I$, and for any initial guess $x_0 \in (x^* - L, x^* + L)$, the Newton iteration
$$x_{k+1} = x_k - \frac{f(x_k)}{f'(x_k)}$$
converges quadratically to $x^*$.
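As a quick numerical sanity check (my own example, not from the text), take $f(x) = x^2 - 2$ on $I = (1, 2)$: there $|f f''|/|f'|^2 = |x^2 - 2|/(2x^2) < 1$, and the Newton iterates indeed converge quadratically to $\sqrt{2}$.

```python
import math

f  = lambda x: x * x - 2.0
df = lambda x: 2.0 * x

x = 1.9                       # initial guess inside (x* - L, x* + L)
errors = []
for _ in range(6):
    x = x - f(x) / df(x)      # Newton step
    errors.append(abs(x - math.sqrt(2.0)))
# Each error is roughly the square of the previous one (quadratic rate).
```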
Convex functions enjoy many nice properties (such as continuity everywhere and differentiability almost everywhere [59]), but for our purposes we mainly consider $C^2$ functions. The following properties are used repeatedly in our analysis.
$$x_0 \ge x_1 \ge \cdots \ge x_k \ge \cdots \ge x^*.$$
Proof. First, assume that $f(x_k) \ge 0$ for some $k \ge 0$. Then $f$ is convex on the interval $[x^*, x_k]$, so we have
$$x^* \le x_k - \frac{f(x_k)}{f'(x_k)} = x_{k+1},$$
which, together with the fact that $f(x_k) \ge 0$, implies $x^* \le x_{k+1} \le x_k$. So $f(x_{k+1}) \ge 0$ and $f$ is convex on $[x^*, x_{k+1}]$. Induction now shows that
$$x_k \ge x_{k+1} \ge \cdots \ge x^*,$$
so that the sequence decreases monotonically and is bounded below by $x^*$; it therefore converges to some limit $\tilde{x}$, which satisfies
$$\tilde{x} = \tilde{x} - (f'(\tilde{x}))^{-1} f(\tilde{x}),$$
i.e., $f(\tilde{x}) = 0$, and hence $\tilde{x} = x^*$.
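A tiny numerical illustration of this monotone behavior (my own toy example, not from the text): for the convex increasing function $f(x) = x^3 - 1$ with root $x^* = 1$, Newton started above the root descends onto it without ever overshooting below.

```python
f  = lambda x: x ** 3 - 1.0   # convex and increasing for x >= 0
df = lambda x: 3.0 * x ** 2

xs = [3.0]                    # x0 > x* = 1, so f(x0) > 0
for _ in range(12):
    xs.append(xs[-1] - f(xs[-1]) / df(xs[-1]))
# x0 >= x1 >= ... >= x*: the iterates decrease monotonically toward 1.
```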
To exploit this useful connection between convex functions and Newton conver-
gence, we make the following assumptions on the relative permeability functions.
Assumption 4. We assume that the following properties hold for all saturations $0 \le S_w \le 1$:

2. $k_w'(S_w) \ge 0$, $k_o'(S_w) \le 0$ (phase mobilities increasing with phase saturations),
where $V_i = \phi_i \Delta x/\Delta t$ and $K_i$ is the (absolute) permeability between blocks $i$ and $i+1$.
We note that applying Newton’s method to this modified system will yield pressure
profiles that are identical to those obtained from applying Newton’s method to the
original system, since all we did is a linear change of independent variables.
Now consider applying reduced Newton to (4.2.2), i.e., we use the water phase
equations as the constraints required to define the implicit functions
$$S_i = S_i(\pi_1, \ldots, \pi_i).$$
where fwi and foi are influxes from the upwind cell, which do not depend on the
pressure gradient πi . Thus, our approach for proving convergence is as follows: we
show that for fixed π1 , . . . , πi−1 , the objective function Foi is strictly increasing and
convex with respect to πi over a semi-infinite interval containing the root πi∗ . Thus,
Newton’s method converges for any starting point within this interval. Then an
induction argument, together with the continuous dependence of Foi on the influx
foi (π1 , . . . , πi−1 ), will guarantee global convergence of Newton’s method for the whole
system.
Remark. Without loss of generality, we can restrict our attention to how reduced
Newton behaves inside the positive orthant {πi > 0, i = 1, . . . , N }. Let πi∗ denote the
solution of the i-th cell problem (so that Foi (π1∗ , . . . , πi∗ ) = 0). Since flow is cocurrent
and the total velocity is positive, each πi∗ must be positive. Moreover, because of
uniform ellipticity, we have the lower bound
$$\pi_i^* \ge \frac{q}{K_i (\lambda_T)_{\max}} = \frac{q\,\min\{\mu_o, \mu_w\}}{K_i}. \qquad (4.2.4)$$
We are now ready to show that reduced Newton converges when ∆t is large, pro-
vided we make a few additional assumptions that are satisfied by quadratic relative
permeabilities. In the next section, we derive a modified reduced Newton iteration
that is provably convergent for all ∆t without the need of these additional assump-
tions.
Proposition 4.4. Assume $k_w$ and $k_o$ are both uniformly convex, i.e., there exist positive constants $c_w$ and $c_o$ such that $k_w'' \ge c_w$ and $k_o'' \ge c_o$ for all $S \in [0, 1]$. Let $k_w'(0) = 0$. Then there exists $S_c > 0$ such that $\lambda_w' + \lambda_o' \le 0$ for all $0 \le S_w \le S_c$.

Proof. Since $k_o'(S_w = 1) \le 0$ and $k_o''(S_w) \ge c_o > 0$, we must have $k_o'(S_w = 0) \le -c_o$, so that $\lambda_w' + \lambda_o' \le -c_o/\mu_o < 0$ at $S_w = 0$. Thus, by continuity, there exists a non-trivial neighborhood around zero, say $0 \le S_w \le S_c$, on which $\lambda_w' + \lambda_o'$ takes on negative values.
Lemma 4.5 (Monotonicity and convexity with respect to $\pi_i$). Assume the hypotheses of Proposition 4.4. Let $\pi_j > 0$ for all $j$. Then $\partial F_{oi}/\partial \pi_i > 0$, and there exists $\gamma_0 > 0$ such that $\partial^2 F_{oi}/\partial \pi_i^2 \ge 0$ whenever $V_i/(K_i \pi_i) \le \gamma_0$.

Proof. Implicit differentiation of the water constraint gives
$$\frac{\partial S_i}{\partial \pi_i} = -\frac{K_i \lambda_w}{V_i + K_i \lambda_w' \pi_i},$$
implying that
$$\frac{\partial F_{oi}}{\partial \pi_i} = K_i \lambda_o + \frac{K_i \lambda_w \left(V_i - K_i \lambda_o' \pi_i\right)}{V_i + K_i \lambda_w' \pi_i},$$
which is positive for $\pi_i > 0$ if the fluid properties in Assumption 4 hold. Similarly, the second derivative is given by
$$\frac{\partial^2 F_{oi}}{\partial \pi_i^2} = -\frac{K_i \lambda_w}{\left(V_i + K_i \lambda_w' \pi_i\right)^2}\left\{2 V_i K_i \left(\lambda_w' + \lambda_o'\right) - \frac{K_i \lambda_w}{V_i + K_i \lambda_w' \pi_i}\left[K_i^2 \pi_i^2 \left(\lambda_o'' \lambda_w' - \lambda_w'' \lambda_o'\right) + V_i K_i \pi_i \left(\lambda_w'' + \lambda_o''\right)\right]\right\}.$$
The terms inside the square brackets are non-negative by Assumption 4. Thus, if $\lambda_w' + \lambda_o' \le 0$, then $\partial^2 F_{oi}/\partial \pi_i^2 \ge 0$ automatically. If $\lambda_w' + \lambda_o' > 0$, then we need
$$2 V_i K_i \left(\lambda_w' + \lambda_o'\right) \le \frac{K_i \lambda_w}{V_i + K_i \lambda_w' \pi_i}\left[K_i^2 \pi_i^2 \left(\lambda_o'' \lambda_w' - \lambda_w'' \lambda_o'\right) + V_i K_i \pi_i \left(\lambda_w'' + \lambda_o''\right)\right].$$
In terms of $\gamma := V_i/(K_i \pi_i)$, this is equivalent to the quadratic inequality
$$A\gamma^2 + B\gamma + C \le 0 \qquad (4.2.5)$$
with
$$A = 2(\lambda_w' + \lambda_o'), \qquad B = 2\lambda_w'(\lambda_w' + \lambda_o') - \lambda_w(\lambda_w'' + \lambda_o''), \qquad C = -\lambda_w(\lambda_o'' \lambda_w' - \lambda_w'' \lambda_o').$$
This holds whenever $\gamma \le \gamma_0$, where
$$\gamma_0 = \min_{S_c \le S \le 1} \frac{-2C}{B + \sqrt{B^2 - 4AC}}. \qquad (4.2.6)$$
We exclude the interval $[0, S_c)$ from the minimization because $\lambda_w' + \lambda_o' < 0$ there. This implies $\gamma_0 > 0$, since
$$-C \ge \lambda_w(S_c)\,\lambda_w'(S_c)\,\lambda''_{o,\min} \ge \frac{c_w^3 c_o}{2\mu_w^2 \mu_o} > 0.$$
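To make (4.2.6) concrete, the sketch below evaluates $A$, $B$, $C$, and $\gamma_0$ numerically for quadratic relative permeabilities $k_w = S^2$, $k_o = (1-S)^2$ (so $c_w = c_o = 2$) and made-up viscosities; the minimization is restricted to saturations where $\lambda_w' + \lambda_o' > 0$, as in the text.

```python
import numpy as np

mu_w, mu_o = 1.0, 2.0                        # made-up viscosities
S = np.linspace(0.0, 1.0, 1001)
lw,  lo  = S**2 / mu_w, (1.0 - S)**2 / mu_o          # mobilities
dlw, dlo = 2.0 * S / mu_w, -2.0 * (1.0 - S) / mu_o   # first derivatives
d2lw = np.full_like(S, 2.0 / mu_w)                   # second derivatives
d2lo = np.full_like(S, 2.0 / mu_o)

A = 2.0 * (dlw + dlo)
B = 2.0 * dlw * (dlw + dlo) - lw * (d2lw + d2lo)
C = -lw * (d2lo * dlw - d2lw * dlo)

mask = A > 0                                 # lambda_w' + lambda_o' > 0 only
gamma = -2.0 * C[mask] / (B[mask] + np.sqrt(B[mask]**2 - 4.0 * A[mask] * C[mask]))
gamma0 = gamma.min()                         # convexity threshold on V/(K*pi)
```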
Lemma 4.5 implies that if $V_i \le \theta \gamma_0\, q \min\{\mu_o, \mu_w\}$ with $\theta < 1$, then $F_{oi}$ is convex in the interval $\theta q \min\{\mu_o, \mu_w\} \le \pi_i < \infty$, which contains $\pi_i^*$. Hence, by Theorem 4.3, the sequence of Newton iterates $\{\pi_i^{(k)}\}$ converges monotonically to $\pi_i^*$ provided the influx is constant. In particular, the first cell converges if $\Delta t$ is large enough. Global convergence of the whole system then follows from the induction argument on the influxes outlined above.
Lemma 4.6. Let $(\pi_1, \ldots, \pi_N) > 0$ be given. Suppose we define the implicit functions $S_i^{(1)}(\pi_1, \ldots, \pi_i)$ and $S_i^{(2)}(\pi_1, \ldots, \pi_i)$ via the constraints
$$F_{wi}(\pi_1, \ldots, \pi_i) = V_i S_i^{(1)}(\pi_1, \ldots, \pi_i) + K_i \lambda_w\!\left(S_i^{(1)}(\pi_1, \ldots, \pi_i)\right)\pi_i - f_{wi}(\pi_1, \ldots, \pi_{i-1}) \equiv 0,$$
$$F_{oi}(\pi_1, \ldots, \pi_i) = -V_i S_i^{(2)}(\pi_1, \ldots, \pi_i) + K_i \lambda_o\!\left(S_i^{(2)}(\pi_1, \ldots, \pi_i)\right)\pi_i - f_{oi}(\pi_1, \ldots, \pi_{i-1}) \equiv 0, \qquad (4.2.7)$$
and define the swapped residual functions
$$\bar{F}_{oi}(\pi_1, \ldots, \pi_i) = -V_i S_i^{(1)}(\pi_1, \ldots, \pi_i) + K_i \lambda_o\!\left(S_i^{(1)}(\pi_1, \ldots, \pi_i)\right)\pi_i - f_{oi}(\pi_1, \ldots, \pi_{i-1}),$$
$$\bar{F}_{wi}(\pi_1, \ldots, \pi_i) = V_i S_i^{(2)}(\pi_1, \ldots, \pi_i) + K_i \lambda_w\!\left(S_i^{(2)}(\pi_1, \ldots, \pi_i)\right)\pi_i - f_{wi}(\pi_1, \ldots, \pi_{i-1}). \qquad (4.2.8)$$
Then both $\bar{F}_{oi}$ and $\bar{F}_{wi}$ are increasing with respect to $\pi_i$, and at least one of $\bar{F}_{oi}$ and $\bar{F}_{wi}$ must be a convex function over a semi-infinite interval containing the root $\pi_i^*$.
Proof. Implicit differentiation gives
$$\frac{\partial S_i^{(1)}}{\partial \pi_i} = -\frac{K_i \lambda_w}{V_i + K_i \lambda_w' \pi_i},$$
[Figure 4.2 here: plots of $F_o$ versus $\Delta p/\Delta x$ for $\Delta t$ = 1, 2, 5, and 20.]

Figure 4.2: Reduced Newton residual functions for various $\Delta t$. Top: favorable mobility ratio ($\mu_o/\mu_w = 0.1$). Bottom: unfavorable mobility ratio ($\mu_o/\mu_w = 10$).
so that
$$\frac{\partial \bar{F}_{oi}}{\partial \pi_i} = K_i \lambda_o + \frac{K_i \lambda_w \left(V_i - K_i \lambda_o' \pi_i\right)}{V_i + K_i \lambda_w' \pi_i}$$
and
$$\frac{\partial^2 \bar{F}_{oi}}{\partial \pi_i^2} = -\frac{K_i \lambda_w}{\left(V_i + K_i \lambda_w' \pi_i\right)^2}\left\{2 V_i K_i \left(\lambda_w' + \lambda_o'\right) - \frac{K_i \lambda_w}{V_i + K_i \lambda_w' \pi_i}\left[K_i^2 \pi_i^2 \left(\lambda_o'' \lambda_w' - \lambda_w'' \lambda_o'\right) + V_i K_i \pi_i \left(\lambda_w'' + \lambda_o''\right)\right]\right\},$$
where the $\lambda_p$ and their derivatives are evaluated at $S_i^{(1)}(\pi_1, \ldots, \pi_i)$. A similar calculation shows that
$$\frac{\partial S_i^{(2)}}{\partial \pi_i} = \frac{K_i \lambda_o}{V_i - K_i \lambda_o' \pi_i}, \qquad \frac{\partial \bar{F}_{wi}}{\partial \pi_i} = K_i \lambda_w + \frac{K_i \lambda_o \left(V_i + K_i \lambda_w' \pi_i\right)}{V_i - K_i \lambda_o' \pi_i},$$
and
$$\frac{\partial^2 \bar{F}_{wi}}{\partial \pi_i^2} = \frac{K_i \lambda_o}{\left(V_i - K_i \lambda_o' \pi_i\right)^2}\left\{2 V_i K_i \left(\lambda_w' + \lambda_o'\right) + \frac{K_i \lambda_o}{V_i - K_i \lambda_o' \pi_i}\left[K_i^2 \pi_i^2 \left(\lambda_o'' \lambda_w' - \lambda_w'' \lambda_o'\right) + V_i K_i \pi_i \left(\lambda_w'' + \lambda_o''\right)\right]\right\},$$
where the $\lambda_p$ and their derivatives are now evaluated at $S_i^{(2)}(\pi_1, \ldots, \pi_i)$. By definition, at the solution $\pi_i^*$ we must have $S_i^{(1)}(\pi_1, \ldots, \pi_i^*) = S_i^{(2)}(\pi_1, \ldots, \pi_i^*) =: S_i^*$. Moreover, we must have $S_i^{(1)} \le S_i^* \le S_i^{(2)}$ over the interval $[\pi_i^*, \infty)$ because $\partial S_i^{(1)}/\partial \pi_i \le 0$ and $\partial S_i^{(2)}/\partial \pi_i \ge 0$. We now consider two cases:
1. $\lambda_w'(S_i^*) + \lambda_o'(S_i^*) \ge 0$. Then since $S_i^{(2)} \ge S_i^*$ for $\pi_i \ge \pi_i^*$, the convexity of $\lambda_w$ and $\lambda_o$ implies $\lambda_w'(S_i^{(2)}) + \lambda_o'(S_i^{(2)}) \ge 0$. Hence $\partial^2 \bar{F}_{wi}/\partial \pi_i^2 \ge 0$ for all $\pi_i \ge \pi_i^*$.

2. $\lambda_w'(S_i^*) + \lambda_o'(S_i^*) \le 0$. Then since $S_i^{(1)} \le S_i^*$ for $\pi_i \ge \pi_i^*$, the convexity of $\lambda_w$ and $\lambda_o$ implies $\lambda_w'(S_i^{(1)}) + \lambda_o'(S_i^{(1)}) \le 0$. Hence $\partial^2 \bar{F}_{oi}/\partial \pi_i^2 \ge 0$ for all $\pi_i \ge \pi_i^*$.
Thus, at least one of the two reduced functions is convex over a semi-infinite interval
containing πi∗ , as required.
The above lemma tells us that if we knew ahead of time the slope of the total
mobility curve at the solution, we could always pick the correct reduced function (or
equivalently, the correct constraint) for each cell in order to achieve global conver-
gence. Unfortunately, this information is usually not available. However, if we switch
constraints when non-convexity is detected, then we can be certain that the new
reduced function must be convex, so convergence is now guaranteed. The modified
algorithm is shown in Figure 4.3. The convexity test in line 9 is motivated by Theorem
4.3. Assume all cells upstream of $i$ have converged. If the current residual function is convex and $F_{gi}(\pi_i^k) > 0$, then we should have $F_{gi}(\pi_i^{k+1}) > 0$ as well. Thus, if the
latter condition is violated, non-convexity is detected, so we should switch constraints
and work with the other residual function, which must be convex. In practice, we
may not want to swap constraints every time the residual becomes negative for the
following reasons:
• When the nonlinear iterate is close to the solution (but has not yet converged),
the residual can have the wrong sign even when convex objective functions are
used. This is because the linear and nonlinear equations that define the Newton
steps are themselves solved inexactly by inner iterations;
As a result, we should switch constraints only when the overshoot is severe enough
that we are certain no progress has been made. The parameter 0 < θ < 1 in line 9
achieves this purpose: if the new residual changes sign but has a significantly smaller
magnitude, we accept the current constraint and continue; on the other hand, large
overshoots cause the constraint to switch. It is fairly easy to convince oneself that
the modified algorithm converges for all initial guesses inside the positive orthant
{πi > 0, i = 1, . . . , N }.
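The switching idea can be prototyped on a single cell. The following is a self-contained sketch with manufactured data: quadratic mobilities, a manufactured solution $(S^*, \pi^*)$, a numerical derivative, and SciPy's `brentq` for the inner constraint solves; none of this is the dissertation's actual implementation.

```python
from scipy.optimize import brentq

V, K, mu_w, mu_o = 5.0, 1.0, 1.0, 1.0        # made-up cell data
lw = lambda S: S**2 / mu_w                   # water mobility
lo = lambda S: (1.0 - S)**2 / mu_o           # oil mobility

S_true, pi_true = 0.5, 1.0                   # manufactured solution
fw = V * S_true + K * lw(S_true) * pi_true   # consistent water influx term
fo = -V * S_true + K * lo(S_true) * pi_true  # consistent oil influx term

def S1(pi):   # saturation from the water constraint F_w = 0
    return brentq(lambda S: V * S + K * lw(S) * pi - fw, 0.0, 1.0)

def S2(pi):   # saturation from the oil constraint F_o = 0
    return brentq(lambda S: -V * S + K * lo(S) * pi - fo, 0.0, 1.0)

def Fbar_o(pi): return -V * S1(pi) + K * lo(S1(pi)) * pi - fo
def Fbar_w(pi): return  V * S2(pi) + K * lw(S2(pi)) * pi - fw

def switching_newton(pi, theta=0.5, tol=1e-10, h=1e-7):
    F, r = Fbar_o, Fbar_o(pi)                # start from the oil residual
    for _ in range(100):
        if abs(r) < tol:
            return pi
        d = (F(pi + h) - F(pi - h)) / (2.0 * h)   # numerical derivative
        pi_new = max(pi - r / d, 1e-3)       # stay in the positive orthant
        r_new = F(pi_new)
        if r_new * r < 0 and abs(r_new) > theta * abs(r):
            F = Fbar_w if F is Fbar_o else Fbar_o  # severe overshoot: switch
            r_new = F(pi_new)
        pi, r = pi_new, r_new
    raise RuntimeError("no convergence")
```

The parameter `theta` plays the role of the $\theta$ in line 9 of the algorithm: mild sign changes are accepted, while large overshoots trigger a constraint switch.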
Here $\pi_i = (\Phi_{w,i} - \Phi_{w,i+1})/\Delta x$ denotes the gradient for the water potential. The
presence of countercurrent flow introduces several complications in our attempt to
analyze the convergence behavior of the reduced Newton algorithm:
2. The flow direction of the oil phase, which depends on $\pi_i^* - \Delta\rho g \Delta z$, is generally not known until the problem has converged.
Our experiments show that when significant countercurrent flow is present, it is pos-
sible that reduced Newton no longer converges to the solution for every initial guess,
especially when a large time step is taken. It is then natural to try to identify condi-
tions for which the reduced Newton procedure converges.
where $\{S_i^0\}$ denotes the initial saturation profile, i.e., the saturation profile at the beginning of the time step. Thus, the objective function $F_{oi}$ actually depends implicitly on the old saturation values $S_1^0, \ldots, S_i^0$ as well as the pressure gradients $\pi_1, \ldots, \pi_i$.
Since the characteristics of the PDE only travel from left to right in the cocurrent
case, the “domain of dependence” of reduced Newton contains the domain of depen-
dence of the PDE for any ∆t. As a result, one can expect a fairly stable method for
a wide range of initial guesses. On the other hand, in the countercurrent flow case,
$$\begin{aligned} F_{oi} &= -V_i S_i - K_{i-1}\lambda_o(S_i)\left(\pi_{i-1} - \Delta\rho g\Delta z\right) + K_i \lambda_o(S_{i+1})\left(\pi_i - \Delta\rho g\Delta z\right) - q_{oi} \\ &= F_{oi}\left(S_i(\cdots), S_{i+1}(\cdots), \pi_{i-1}, \pi_i\right) \\ &= F_{oi}\left(\pi_1, \ldots, \pi_{i+1};\ S_1^0, \ldots, S_{i+1}^0\right). \end{aligned}$$
Thus, if ∆t is so large that the waves traveling to the left (i.e., countercurrent to
the main flow direction) can cross more than one cell boundary, then the domain of
dependence of reduced Newton will fail to cover the physical domain of dependence.
In such cases, one cannot generally expect global convergence of the reduced Newton
iterations. Since the fastest backward-moving wave travels at the speed $v_{\min} = \min_{S \in [0,1]} q_T f_w'$, where
$$f_w = \frac{\lambda_w}{\lambda_T}\left(1 + \frac{K_i \lambda_o}{q_T}\,\Delta\rho g \Delta z\right), \qquad (4.2.10)$$
we can expect reduced Newton to converge whenever
$$-\Delta t\, q_T f_{w,\min}' \le \phi_i \Delta x. \qquad (4.2.11)$$
Thus, if $f_w' \ge 0$ everywhere (i.e., we have cocurrent flow), we expect reduced Newton to converge for any $\Delta t$. If countercurrent flow is present, then there is a range of $S$ over which $f_w' < 0$, in which case we would have the time-step restriction
$$\Delta t \le \frac{\phi_i \Delta x}{-q_T f_{w,\min}'}. \qquad (4.2.12)$$
A monotonicity argument
We have shown in Lemma 4.5 that in the cocurrent case, the objective function is
monotonically increasing (∂Foi /∂πi > 0). Monotonicity is an important property
if global convergence to a unique solution is to be expected: non-monotonic func-
tions necessarily have local minima or maxima, which cause breakdown in Newton’s
method. Thus, a reasonable criterion for ensuring convergence is one that guarantees
monotonicity of the objective function. We can mimic the proof of Lemma 4.5 and compute the partial derivative $\partial \hat{F}_{oi}/\partial \pi_i$, where $\hat{F}_{oi}$ denotes the objective function obtained by taking the left neighbor as the upstream cell for the oil phase. In other words, we perform the analysis as though the upstream direction is to the left.
Even though this upstream direction may be incorrect, the analysis is still valuable
for the following reason: since the correct upstream direction is generally unknown
before the solution has converged, a robust algorithm should still be able to make
some progress even when the upstream direction is wrong. The algorithm should
produce an answer that would cause a switch in the upstream direction in the next
iteration, but it should not overshoot by so much as to cause the overall algorithm
to fail. These desirable properties are only possible when F̂oi is monotonic, so our
analysis can still provide a useful criterion for convergence.
We have
$$\pi_i = \frac{1}{\lambda_T}\left(q_T/K_i + \lambda_o\,\Delta\rho g\Delta z\right), \qquad \pi_i - \Delta\rho g\Delta z = \frac{1}{\lambda_T}\left(q_T/K_i - \lambda_w\,\Delta\rho g\Delta z\right),$$
from which a direct computation gives
$$\frac{\partial \hat{F}_{oi}}{\partial \pi_i} = \frac{K_i \lambda_T}{V_i + K_i \lambda_w' \pi_i}\left[V_i + q_T f_w'\right],$$
where fw (S) is defined in (4.2.10). Thus, the objective function F̂oi is monotonically
increasing whenever
$$V_i = \frac{\phi_i \Delta x}{\Delta t} \ge -q_T f_w',$$
which is exactly the same as (4.2.11). As shown in Example 4.3.1, criterion
(4.2.11) is usually enough for reduced Newton to converge. For problems of practical
interest, the backward CFL number is usually much smaller than the forward CFL
number, so reduced Newton can generally converge with much larger time steps than
standard Newton even in the countercurrent flow setting. In the next section, we
show a variety of examples that demonstrate the effectiveness of the reduced Newton
algorithm.
completed across the bottom layer and operates at a BHP (bottom hole pressure) of
500 psi. The densities of water and oil at standard conditions are 64 lb/cu.ft. and 49
lb/cu.ft., respectively, and the viscosities are µo = 1.0 cp, µw = 0.3 cp. The fractional
flow curve for this problem is shown in Figure 4.4. We see that flow is cocurrent for
$0 \le S_w \le 0.38$ and countercurrent for $0.38 \le S_w \le 1$. The forward CFL number, $\max_{S\in[0,1]} f_w'$, is 3.73, whereas the backward CFL number, $-\min_{S\in[0,1]} f_w'$, is 0.638.
We test our algorithm for uniform initial water saturations of Swi = 0.0, 0.1, . . . , 0.9.
In each case, the simulation steps through T = 1, 3, 7, 15, 30, 45, 60 days (1 day =
0.002 pore volumes), and afterwards the time-step size is fixed at ∆T = 20 days
until T = 300 days is reached, for a total of 21 steps. Table 4.1 shows the results
for the standard and reduced Newton algorithms. We see that reduced Newton does
not need to cut any time steps to achieve convergence, whereas standard Newton
must cut the time step multiple times in four cases ($S_w$ = 0.0, 0.6, 0.7, 0.8). Time-step cuts are very expensive: each cut means we must throw away the results of all previous iterations and start over. Moreover, the size of the next step following a time-step cut is usually set to the last successfully integrated $\Delta t$, i.e., the one reduced by the cut. This can lead to a significantly smaller average time
step size for a given simulation. Thus, a more stable algorithm that avoids time-step
cuts can significantly outperform one that cuts time steps frequently, especially if
their convergence rates are otherwise comparable. Table 4.1 shows that when neither
algorithm requires time-step cuts, standard Newton converges more quickly some of
the time (Sw = 0.4, 0.5, 0.9), whereas reduced Newton is quicker at other times (Sw =
0.1, 0.2, 0.3, 0.6). Nevertheless, the difference in average iteration count is less than
0.67 iterations per time step in all cases, so the convergence rates for both algorithms
are comparable when no time-step cuts are needed. As we observe in later examples,
the enhanced stability of reduced Newton does translate into gains in the overall run
time for larger problems. The primary goal of this example is to demonstrate the
robustness of reduced Newton, even in the presence of strong countercurrent flow.
This property is essential if the algorithm is to be used in heterogeneous reservoirs
with complicated permeability/porosity fields, especially since countercurrent flow
due to gravity can be important in regions where the total velocity is small.
[Figure 4.4 here: fractional flow curve, $f_w$ versus $S_w$.]
Table 4.1: Convergence history for 1D water floods with different initial water
saturations. For both methods, Time steps = total number of time steps taken
to simulate up to 300 days; Newtons = number of Newton iterations (excluding
iterations wasted due to time-step cuts); Cuts = number of times the algorithm must
cut the time-step size by half due to non-convergence.
Standard Reduced
Swi Time steps Newtons Cuts Time steps Newtons Cuts
0.0 26 140 5 21 61 0
0.1 21 59 0 21 58 0
0.2 21 59 0 21 58 0
0.3 21 50 0 21 49 0
0.4 21 51 0 21 58 0
0.5 21 67 0 21 81 0
0.6 22 88 2 21 85 0
0.7 24 96 6 21 90 0
0.8 23 85 3 21 84 0
0.9 21 51 0 21 65 0
4.3. NUMERICAL EXAMPLES 103
Next we specify an initial time step of 1 day and track the number of Newton
iterations required to converge. Figure 4.6 shows the results. We see that reduced
Newton converges for the first time step in 9 iterations, whereas standard Newton
does not converge and needs to cut the time step twice to converge with an initial
time step of 0.25 days. Beyond the first time step, reduced Newton always takes fewer
iterations to converge than its standard counterpart, and the iteration count does not
exhibit the large variations that standard Newton does at the beginning.
Table 4.2: Convergence history for the upscaled SPE 10 model with an initial time
step of 0.1 days. N = Number of Nonlinear (Newton) iterations; L = Number
of Linear (CPR) solves; CFL = Maximum CFL number in the reservoir; %CC =
Percentage of cell interfaces that experience countercurrent flow.
Standard Reduced
days N L N L CFL %CC
0.1 4 18 4 17 1.8 6.2
0.3 3 17 3 17 1.9 2.4
0.7 3 18 2 12 2.1 1.1
1.5 3 19 2 14 2.5 0.7
3.1 4 26 2 15 4.0 0.5
6.3 5 32 2 16 6.7 0.5
10 4 26 2 15 11.1 0.5
20 6 45 3 27 23.9 0.5
35 4 32 3 27 35.2 0.5
50 3 27 2 19 33.2 0.5
70 4 35 3 27 35.1 0.6
90 4 33 3 28 35.6 0.6
110 4 37 3 30 52.9 0.6
140 4 41 3 34 112.1 0.6
170 4 39 2 21 102.8 0.7
200 4 35 2 21 145.3 0.7
230 3 33 2 22 129.1 0.7
260 3 33 2 22 132.0 0.8
290 3 30 2 21 132.3 0.8
320 3 31 2 21 119.6 0.8
350 3 30 2 19 109.5 0.8
380 3 30 2 20 116.7 0.9
410 3 31 2 20 112.0 0.9
440 3 30 2 19 114.9 0.9
470 3 28 2 19 108.1 1.0
500 3 29 2 19 146.3 1.0
Total 93 785 61 542
Running time (s) 728.6 560.6
Figure 4.5: Permeability field ($\log_{10} K_x$) and well configuration for the upscaled SPE 10 problem [19]. The reservoir is displayed upside down so that the channels in the bottom layers are clearly visible.
[Figure 4.6 here: Newton iterations per time step for the reduced and standard algorithms.]

Figure 4.6: Convergence history for the upscaled SPE 10 model with an initial time step of 1 day.
• Short time steps: T = 0.01, 0.03, 0.07, 0.15, 0.31, 0.63, 1, 3, 7, 15, 31, 63, 90,
120, 150, 180, 220, 260, 300 days. After 300 days, ∆T = 50 days (0.0183 pore
volumes) until T = 2000 days is reached.
• Long time steps: T = 0.01, 0.31, 1, 7, 31, 90, 150, 220, 300 days. After 300
days, ∆T = 100 days (0.0366 pore volumes) until T = 2000 days is reached.
• Huge time steps: T = 0.01, 0.31, 1, 7, 31, 90, 200 days. After 200 days,
∆T = 500 days (0.183 pore volumes) until T = 2000 days is reached.
As before, the time step is cut in half if the global nonlinear solver does not con-
verge within 20 iterations. Table 4.3 summarizes the runs for both the standard and
reduced Newton algorithms, and Figure 4.7 compares the convergence histories of
standard and reduced Newton for the long time step case. We observe that reduced
Newton can easily handle the “long” and “huge” time step cases. Standard Newton,
on the other hand, needs to cut time steps multiple times in order to achieve conver-
gence, and this results in a significant number of wasted linear solves and a serious
degradation in performance. In fact, we were unable to run standard Newton for the
huge time step case because of the large number of time step cuts. Consistent with
the collective experience in the simulation community, taking too large a time step
in standard Newton actually makes the simulation slower. The opposite is true for
Table 4.3: Summary of runs for the full SPE 10 problem. “Wasted Newton steps”
and “wasted linear solves” indicate the number of Newton iterations and linear solves
that are wasted because of time step cuts.
Standard Reduced
Short ∆t Long ∆t Short ∆t Long ∆t Huge ∆t
No. of time steps 58 38 53 26 11
No. of time step cuts 6 17 0 0 0
No. of Newton steps 353 516 128 90 55
− Wasted Newton steps 120 340 0 0 0
No. of linear solves 3818 6257 2271 2399 1805
− Wasted linear solves 860 3934 0 0 0
Total running time (sec) 24053 37388 16558 14727 10275
− Linear solves (sec) 22570 35457 11697 11301 7899
− Single-cell solves (sec) 0 0 4194 2996 2132
reduced Newton. Indeed, reduced Newton with long or huge time steps runs in less
than 60% of the time required by standard Newton with either time-stepping strategy.
Finally, Figure 4.8 shows the oil production rate and water cut for all four simulation
runs. The discrepancy between the solutions is insignificant, with the exception of
the huge time step case, in which the time truncation error becomes so large that the
water cut and production curves noticeably deviate from the other cases. In practice, one
would probably not want to take such a large time step, but it is reassuring to know
that reduced Newton can still converge under such extreme circumstances. In general,
by using reduced Newton with (reasonably) larger time steps, we obtain substantial
speedups with little or no change in solution accuracy.
To show that the reduced formulation is applicable to three-phase flow, the algorithm
is tested on a three-phase model in which gas is injected into a reservoir initially
saturated with a mixture of 50% oil and 50% water in every cell. This saturation
is chosen to ensure that all phases are mobile, and that we have a truly three-phase
[Figure 4.7 here: Newton iterations per time step for the reduced and standard algorithms.]

Figure 4.7: Convergence history for the full SPE 10 problem with long time steps. Tick marks on the x-axis labeled c correspond to intermediate time steps needed by standard Newton to achieve convergence; these steps are skipped by reduced Newton.
[Figure 4.8 here: top, total oil production rate versus time; bottom, water cut for Producer #1, for the reduced (short, long, huge $\Delta t$) and standard (short, long $\Delta t$) runs.]

Figure 4.8: Total oil production rate and water cut for the full SPE 10 problem.
problem. The reservoir is identical to the one used in Example 4.3.1. The PVT data
and relative permeabilities are shown in Tables 4.4 and 4.5, respectively. For simplic-
ity, the gas component is assumed not to dissolve into the oil phase (i.e., Rgo = 0).
The oil relative permeability is interpolated from the oil-gas and oil-water tables using
the Stone I method. Gas is injected into the top layer at a rate of 100 MSCF/day
(0.000768 pore volumes/day at 4000 psi), and a producer in the bottom layer is main-
tained at a constant pressure of 4000 psi. The production curve is shown in Figure
4.9. Even though gas is highly mobile (µw /µg = 11.6, µo /µg = 111.9), breakthrough
occurs relatively late (at T = 521 days or 0.4 pore volumes) because gas preferentially stays in the upper layers due to buoyancy. In addition, since the simulation does
not start from gravity equilibrium, gravity segregation between oil and water must
occur at the initial stages of the simulation. Up to 98% of cell interfaces experi-
ence countercurrent flow at some point before gas breaks through. This accounts for
the rather complicated behavior of the water and oil production curves prior to gas
breakthrough. Even though this is a rather small example, we believe it captures the
essence of the types of nonlinearity present in countercurrent three-phase flow, and
provides a good test case for comparing the convergence behavior of the standard and
reduced Newton algorithms. In this example, two time-stepping strategies are used:
• Long time steps: After an initial time step of 0.1 days, ∆t is automatically
chosen based on saturation and pressure changes, with a minimum of ∆t = 10
days and gradually increasing until ∆t = 100 days (0.0768 pore volumes).
Table 4.6 summarizes the runs for the standard and potential-based reduced New-
ton algorithms. Running times have little meaning because of the small size of the
problem, and are thus omitted. We once again observe that reduced Newton has no
difficulty handling both short and long time steps, whereas standard Newton needs
to cut the time-step size repeatedly throughout the simulation. Thus, the presence of
three phases does not negatively impact the convergence behavior of reduced Newton.
Table 4.4: PVT data for the three-phase examples.
P (psi)   Bo (RB/STB)   µo (cp)   Bw (RB/STB)   µw (cp)   Bg (RB/SCF)   µg (cp)
14.7 1.062 2.200 1.0410 0.31 0.166666 0.0080
264.7 1.061 2.850 1.0430 0.31 0.012093 0.0096
514.7 1.060 2.970 1.0395 0.31 0.006274 0.0112
1014.7 1.059 2.990 1.0380 0.31 0.003197 0.0140
2014.7 1.056 2.992 1.0350 0.31 0.001614 0.0189
2514.7 1.054 2.994 1.0335 0.31 0.001294 0.0208
3014.7 1.053 2.996 1.0320 0.31 0.001080 0.0228
4014.7 1.050 2.998 1.0290 0.31 0.000811 0.0268
5014.7 1.047 3.000 1.0258 0.31 0.000649 0.0309
9014.7 1.033 3.008 1.0130 0.31 0.000386 0.0470
[Figure 4.9 here: production rate versus time for gas, oil, and water.]

Figure 4.9: Production curve for the 1D three-phase example. The units are STB/day for oil and water, and MSCF/day for gas.
Table 4.6: Summary of runs for the 1D three-phase example with gravity. “Wasted
Newton steps” and “wasted linear solves” indicate the number of Newton iterations
and linear solves that are wasted because of time step cuts.
Standard Reduced
Short ∆t Long ∆t Short ∆t Long ∆t
No. of time steps 111 74 103 26
No. of time step cuts 16 36 0 0
No. of Newton steps 888 1223 480 229
− Wasted Newton steps 320 720 0 0
No. of linear solves 1763 2421 973 480
− Wasted linear solves 641 1418 0 0
• Short time steps: T = 1, 3, 7, 15, 31, 63, 100 days. After 100 days, ∆t = 50 days
(0.00125 pore volumes) until T = 500 days is reached.
• Long time steps: T = 10, 30, 60, 100 days. After 100 days, ∆t = 100 days
(0.0025 pore volumes) until T = 500 days is reached.
Table 4.7 shows the performance of the standard and reduced Newton algorithms.
Once again no time step cuts are required by reduced Newton, demonstrating its stability compared with standard Newton's method. This translates to an improvement in running time for the long time step case. This example shows that
the improvement obtained from reduced Newton in three-phase flow is not limited to
simple 1D cases.
Table 4.7: Summary of runs for the 2D heterogeneous three-phase example. “Wasted
Newton steps” and “wasted linear solves” indicate the number of Newton iterations
and linear solves that are wasted because of time step cuts.
Standard Reduced
Short ∆t Long ∆t Short ∆t Long ∆t
No. of time steps 16 10 15 8
No. of time step cuts 1 3 0 0
No. of Newton steps 74 101 58 40
− Wasted Newton steps 20 60 0 0
No. of linear solves 1264 1529 1172 881
− Wasted linear solves 276 698 0 0
Total running time (sec) 63.5 75.6 73.8 53.9
− Linear solves (sec) 53.7 66.1 50.5 37.9
− Single-cell solves (sec) 0 0 18.6 13.0
previous examples (Tables 4.4 and 4.5), and the Stone I model is used to interpolate
the oil-gas and oil-water data. The gas-injection well is completed in cell (1,1,1)
and operates at 100000 MSCF/day (0.000073 pore volumes per day at 9000 psi); a
production well, completed in cell (20,20,3), operates at a bottom-hole pressure of
1000 psi. The simulation is run up to T = 5000 days (0.365 PVI). Because of the
high gas mobility, breakthrough occurs very early (TBT ≈ 100 days or 0.0073 PVI).
Since the oil and water are not in gravity equilibrium at the start of the simulation,
there is significant countercurrent flow in the problem. Two time-stepping strategies
are used:
• Short time steps: T = 30, 100, 200, 250, 400, 600, 900 days. After 900 days, ∆t
= 400 days (0.0292 pore volumes) until T = 5000 days.
• Long time steps: T = 100, 250, 600 days. After 600 days, ∆t = 800 days (0.0584
pore volumes) until T = 5000 days.
Table 4.8 shows the performance of the standard and reduced Newton algorithms.
Again we see that the reduced Newton method requires no time step cuts and fewer
iterations to converge compared to the standard Newton method.
[Figure: schematic of the 3D reservoir model (10000 ft × 10000 ft areal extent, 50 ft thick, k = 200 md), with the production well marked.]
Table 4.8: Summary of runs for the 3D three-phase example. “Wasted Newton steps”
and “wasted linear solves” indicate the number of Newton iterations and linear solves
that are wasted because of time step cuts.

                                  Standard              Reduced
                              Short ∆t  Long ∆t    Short ∆t  Long ∆t
No. of time steps                   18       11         17        9
No. of time step cuts                1        3          0        0
No. of Newton steps                 95      117         74       57
 − Wasted Newton steps              20       60          0        0
No. of linear solves              1083     1624        974      838
 − Wasted linear solves            178      855          0        0
Total running time (sec)           4.9      6.5        6.1      4.9
 − Linear solves (sec)             3.7      5.5        3.4      2.9
 − Single-cell solves (sec)          0        0        2.1      1.6
Chapter 5
Linear Preconditioning
$$\frac{V_i \phi_i}{\Delta t}\left(S_{p,i}^{n+1} - S_{p,i}^{n}\right) + \sum_{l \in \operatorname{adj}(i)} |\partial V_{il}|\, F_{p,il}(S, p) = q_{p,i}, \qquad (5.0.1)$$
It is well known that the ordering of equations and unknowns can have a huge
impact on the quality of various preconditioners [25, 30, 10]. In most of these works
the orderings considered tend to belong to the following categories:
1. Coloring-based orderings, in which the nodes in the adjacency graph are parti-
tioned into a finite number of colors, and nodes with the same color are ordered
within the same block. The red-black ordering is a classical example of such
orderings, which are often motivated by parallelization considerations or in the
context of cyclic reduction.
The above ordering strategies, while having the advantage of being applicable to
general sparse matrices, do not exploit the underlying physics of the problem. For
advection dominated problems, a natural idea is to order the cells according to flow
direction (e.g., from upstream to downstream). Ordering of this type has been
considered in the CFD community (cf. [54]), but its use is limited in reservoir simulation.
The aim of this chapter is to exploit the cell-based and phase-based orderings
introduced in Chapter 3 for preconditioning purposes. In particular, we proceed as
follows:
the Jacobian will be denoted by J, which has the form (cf. (3.2.3), (3.2.5))
$$J = \begin{bmatrix} J_{ss} & J_{sp} \\ J_{ps} & J_{pp} \end{bmatrix} \begin{matrix} \leftarrow \text{water equation} \\ \leftarrow \text{oil equation} \end{matrix} \qquad (5.1.1)$$
with the block columns corresponding to the unknowns $(S_w, p)$.
In addition, we will often use cell-based ordering, in which all the equations and
variables belonging to the same control volume are grouped into a single block. In
this case, the Jacobian is denoted by A, where
$$A = \begin{bmatrix} A_{11} & \cdots & A_{1N} \\ \vdots & \ddots & \vdots \\ A_{N1} & \cdots & A_{NN} \end{bmatrix} \qquad (5.1.2)$$
is a block matrix with np × np blocks. Each block row represents the derivatives of
the conservation equations (oil and water) with respect to the discrete unknowns (Si
and pi ) at the gridblock and its adjacent cells. For example,
$$A_{ii} = \begin{bmatrix} (J_{ss})_{ii} & (J_{sp})_{ii} \\ (J_{ps})_{ii} & (J_{pp})_{ii} \end{bmatrix}.$$
Assumptions 5.
1. The phase mobilities are non-negative and satisfy $\lambda_w' = \partial \lambda_w / \partial S_w > 0$ and
$\lambda_o' = \partial \lambda_o / \partial S_w < 0$;
Assumption 5.1 has already been stated in Theorem 2.1. Assumption 5.2 is similar
to the uniform ellipticity condition in Section 4.2. When the flow is cocurrent, it is
purely an assumption on the fluid mobilities. In the countercurrent flow case, however,
it is also an assumption on the linearization point $(S^\ell, P^\ell)$, since it is possible that
$\lambda_w$ and $\lambda_o$ are evaluated at two different saturations because of upstreaming. Thus,
if there are adjacent cells i and i + 1 such that $S_i^\ell = 0$ and $S_{i+1}^\ell = 1$, then Assumption
5.2 would disallow the possibility that the upstream directions for water and oil are
i and i + 1 respectively, which is essentially a restriction on the set of admissible
pressure profiles $P^\ell$. Assumption 5.3 ensures monotonicity of the discretization (in
the sense of Chapter 2), and Assumption 5.4 is needed for a unique pressure solution.
Lemma 5.1. Assume the hypotheses given in Assumptions 5.1–5.4. Then the sub-blocks
of the Jacobian J have the following properties:

1. $J_{ss} = (1/\Delta t) D + J_{ss}'$ and $J_{ps} = -(1/\Delta t) D + J_{ps}'$, where D is a positive diagonal
matrix, and $J_{ss}'$ and $-J_{ps}'$ are weakly column diagonally-dominant M-matrices;

2. $J_{sp}$ and $J_{pp}$ are weakly diagonally dominant, symmetric, positive semi-definite
matrices.

Moreover, the matrices $J_{ss}'$, $J_{ps}'$, $J_{sp}$ and $J_{pp}$ are independent of $\Delta t$.
Based on the above lemma, the following theorems concerning the rank of J can
be proven. Clearly, we have
$$J = \begin{bmatrix} J_{ss} & J_{sp} \\ J_{ps} & J_{pp} \end{bmatrix} \text{ nonsingular} \iff \tilde{J} = \begin{bmatrix} J_{ss} & J_{sp} \\ J_{ts} & J_{tp} \end{bmatrix} \text{ nonsingular},$$
where Jts = Jss + Jps and Jtp = Jsp + Jpp . That is, Jts and Jtp are the Jacobian
matrices corresponding to the total mass balance equation (1.1.19).
Since Jtp is nonsingular, J˜ is nonsingular if and only if the Schur complement
$$S_1 := J_{ss} - J_{sp} J_{tp}^{-1} J_{ts}$$
is also nonsingular.
Theorem 5.2. There exists T > 0 such that J is nonsingular for $0 < \Delta t < T$.

Proof. First, note that $J_{ts} = J_{ss} + J_{ps} = J_{ss}' + J_{ps}'$ is independent of $\Delta t$. Thus, we can
write
$$S_1 = \frac{1}{\Delta t} D + \left(J_{ss}' - J_{sp} J_{tp}^{-1} J_{ts}\right),$$
where the term in brackets is independent of $\Delta t$. Now $S_1$ is nonsingular if and only if
$$\Delta t\, D^{-1} S_1 = I + \Delta t\, D^{-1} \left(J_{ss}' - J_{sp} J_{tp}^{-1} J_{ts}\right)$$
is nonsingular, and this holds whenever
$$\rho\left(\Delta t\, D^{-1} \left(J_{ss}' - J_{sp} J_{tp}^{-1} J_{ts}\right)\right) < 1,$$
where $\rho(\cdot)$ denotes the spectral radius. Thus, $S_1$ is nonsingular whenever $0 < \Delta t < T$, where
$$T = \frac{1}{\rho\left(D^{-1} \left(J_{ss}' - J_{sp} J_{tp}^{-1} J_{ts}\right)\right)},$$
or $T = \infty$ if $\rho\left(D^{-1} \left(J_{ss}' - J_{sp} J_{tp}^{-1} J_{ts}\right)\right) = 0$.
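Theorem 5.2 can be illustrated with a quick numerical check; the matrices below are random stand-ins for D and for the $\Delta t$-independent term, not an actual discretization:

```python
import numpy as np

# Numerical check of Theorem 5.2 on random stand-ins: with D a positive
# diagonal matrix and K standing in for the Delta-t-independent part
# J'_ss - J_sp J_tp^{-1} J_ts, the Schur complement S1 = (1/dt) D + K is
# nonsingular for every 0 < dt < T, where T = 1 / rho(D^{-1} K).
rng = np.random.default_rng(0)
N = 6
D = np.diag(rng.uniform(1.0, 2.0, N))
K = rng.standard_normal((N, N))

rho = np.max(np.abs(np.linalg.eigvals(np.linalg.inv(D) @ K)))
T = np.inf if rho == 0 else 1.0 / rho

for dt in (0.1 * T, 0.5 * T, 0.9 * T):
    # dt * D^{-1} S1 = I + dt * D^{-1} K is a perturbation of the identity
    # with spectral radius < 1, so S1 has full rank.
    S1 = D / dt + K
    assert np.linalg.matrix_rank(S1) == N
```

Note that the bound is sharp only in the direction the proof uses: for $\Delta t \ge T$ the matrix may or may not be singular.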
Proof. For 1D problems, it is possible to write down the form of the Schur
complement $S_1$ explicitly by eliminating the pressure terms directly. Since the discretization
and linearization steps commute, it is notationally more convenient to manipulate
the PDE itself, although one can also perform the same calculation on the discrete equations.
$$\phi S_t - \frac{\partial}{\partial x}\left[\lambda_w \left(p_x + \rho_w g z_x\right)\right] = 0, \qquad (5.1.3)$$
$$-\phi S_t - \frac{\partial}{\partial x}\left[\lambda_o \left(p_x + \rho_o g z_x\right)\right] = 0. \qquad (5.1.4)$$
In the above equations, all coefficients are evaluated at the linearization point, so
they do not depend on σ and π. We also used the notation $\sigma^w$ and $\sigma^o$ to denote the
upwind directions of the water and oil phases in the finite volume discretization, which
can be different in general. By keeping the terms separate we can easily mimic this
manipulation in the discrete case. If we define
$$F(x,t) := -\phi S_t^\ell + \frac{\partial}{\partial x}\left[\lambda_w \left(P_x^\ell + \rho_w g z_x\right)\right], \qquad (5.1.7)$$
$$G(x,t) := \phi S_t^\ell + \frac{\partial}{\partial x}\left[\lambda_o \left(P_x^\ell + \rho_o g z_x\right)\right], \qquad (5.1.8)$$
then the linearized equations become
$$\phi \sigma_t - \frac{\partial}{\partial x}\left[\lambda_w' \sigma^w \left(P_x^\ell + \rho_w g z_x\right) + \lambda_w \pi_x\right] = F(x,t), \qquad (5.1.9)$$
$$-\phi \sigma_t - \frac{\partial}{\partial x}\left[\lambda_o' \sigma^o \left(P_x^\ell + \rho_o g z_x\right) + \lambda_o \pi_x\right] = G(x,t). \qquad (5.1.10)$$
We can now eliminate $\pi_x$ to obtain a single equation involving σ. Adding (5.1.9) and (5.1.10) eliminates the time-derivative terms; after integrating in x, the pressure gradient can be written as
$$\pi_x = -\frac{1}{\lambda_T}\left[H(x,t) + \lambda_w' \sigma^w \left(P_x^\ell + \rho_w g z_x\right) + \lambda_o' \sigma^o \left(P_x^\ell + \rho_o g z_x\right)\right]. \qquad (5.1.12)$$
where R(x, t) is some combination of F (x, t) and H(x, t) that does not depend on
σ and π, and hence is unimportant for the analysis. To derive the discrete form of
(5.1.14), we need to resolve the upstreamed saturations $\sigma^w$ and $\sigma^o$, which are given by
$$\sigma_{i+1/2}^{w} = \begin{cases} \sigma_{i+1}, & P_x^\ell + \rho_w g z_x \ge 0, \\ \sigma_i, & P_x^\ell + \rho_w g z_x < 0, \end{cases}$$
and similarly for $\sigma^o$. Thus, the discrete algebraic equations that arise from Newton’s
method are of the form
$$\frac{\phi_i \sigma_i}{\Delta t} + \frac{1}{\Delta x_i}\left(\alpha_{i+1/2}\,\sigma_i - \beta_{i+1/2}\,\sigma_{i+1} - \alpha_{i-1/2}\,\sigma_{i-1} + \beta_{i-1/2}\,\sigma_i\right) = R_i, \qquad (5.1.15)$$
where
Thus, with proper scaling, the Schur complement S1 has the form
$$S_1 = \begin{bmatrix} \gamma_1 + \alpha_{3/2} + \beta_{1/2} & -\beta_{3/2} & & \\ -\alpha_{3/2} & \gamma_2 + \alpha_{5/2} + \beta_{3/2} & \ddots & \\ & \ddots & \ddots & -\beta_{N-1/2} \\ & & -\alpha_{N-1/2} & \gamma_N + \alpha_{N+1/2} + \beta_{N-1/2} \end{bmatrix},$$
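The structure of this tridiagonal matrix can be verified with a short sketch; the coefficients below are synthetic nonnegative values rather than upstream-discretization output, but they reproduce the key property, namely that strict column diagonal dominance with nonpositive off-diagonals makes $S_1$ an M-matrix:

```python
import numpy as np

# Build the tridiagonal Schur complement above from synthetic coefficients:
# alpha[i] and beta[i] stand for alpha_{i+1/2} and beta_{i+1/2} (0-based
# cells), and gamma[i] = phi_i V_i / dt > 0 is the accumulation term.
rng = np.random.default_rng(1)
N = 6
alpha = rng.uniform(0.0, 1.0, N + 1)
beta = rng.uniform(0.0, 1.0, N + 1)
gamma = rng.uniform(0.1, 1.0, N)

S1 = np.zeros((N, N))
for i in range(N):
    S1[i, i] = gamma[i] + alpha[i + 1] + beta[i]   # diagonal entry
    if i + 1 < N:
        S1[i, i + 1] = -beta[i + 1]    # superdiagonal: -beta_{i+1/2}
        S1[i + 1, i] = -alpha[i + 1]   # subdiagonal:  -alpha_{i+1/2}

# The gamma_i term supplies strict column diagonal dominance, so S1 is a
# nonsingular M-matrix and its inverse is entrywise nonnegative.
col_offdiag = np.sum(np.abs(S1), axis=0) - np.abs(np.diag(S1))
assert np.all(np.diag(S1) > col_offdiag)
assert np.all(np.linalg.inv(S1) >= -1e-12)
```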
method, in the sense that it first decouples the full problem into an elliptic and
a hyperbolic subproblem; then at each iteration, one would first solve the elliptic
problem to obtain an approximate pressure, and then use this pressure to solve the
transport problem. A more precise description in terms of two-stage preconditioners
follows.
$$M^{-1} = T_2\left(I - J T_1\right) + T_1, \qquad (5.2.1)$$
where T1 and T2 are approximate inverses for J, or for the restriction of J onto
some subspace. When T1 and T2 are both invertible, then M −1 is equivalent to the
preconditioner derived from the two-stage stationary iteration
Examples of this type include ADI preconditioners [68], the symmetric SOR method
[39] and the HSS method [7]. The Ti can be singular as well. The special case of
The CPR preconditioner, which operates on the matrix J of size 2N × 2N , also has
the form (5.2.1):
$$M_{CPR}^{-1} = M_2^{-1}\left(I - J C \left(W^T J C\right)^{-1} W^T\right) + C \left(W^T J C\right)^{-1} W^T, \qquad (5.2.3)$$
$$A_p\, \delta p = -r_p,$$
where $A_p = W^T J C$, which can be solved easily and gives a meaningful approximate
pressure solution $\delta p$. Different choices of $W^T$ and C give rise to different first-stage
preconditioners, which is the subject of study in [44]. One popular choice of the first-
stage preconditioner, called the True-IMPES reduction, uses the IMPES pressure
matrix directly; in this case, Ap is an elliptic operator, so efficient solvers such as
algebraic multigrid [74] can be used to solve the pressure equation. In addition, since
Ap is simply the pressure matrix that arises from a different time discretization, the
solution δp is also a meaningful approximation of the FIM pressure solution, at least
when ∆t is small.
In the general black-oil case with np phases, it is possible to obtain the IMPES
pressure matrix by manipulating J directly. We describe the procedure here
informally (but see [44] for a detailed discussion). Since IMPES treats the transmissibility
derivatives explicitly, one needs to first eliminate these terms from J. This can be done
by performing a column sum (i.e., for each phase p, sum the equations corresponding
to phase p over the whole domain): since mass is conserved, all the flux terms must
cancel, so the transmissibility derivatives will also cancel out. Only accumulation
terms remain, which means that the modified blocks $\hat{J}_{ss}$ and $\hat{J}_{ps}$
are now diagonal matrices. Finally, the pressure equation is obtained by eliminating
the saturation variables, which is equivalent to forming the Schur complement
The resulting pressure matrix $A_p$ will have the same sparsity pattern as $J_{pp}$ and $J_{sp}$,
since the scaling matrix $\hat{J}_{ps} \hat{J}_{ss}^{-1}$ is diagonal and does not modify the sparsity pattern
of $J_{sp}$. In the incompressible two-phase flow case, Lemma 5.1 shows that
$$\hat{J}_{ss} = -\hat{J}_{ps} = \frac{1}{\Delta t} D,$$
so we have the very simple relation $A_p = J_{sp} + J_{pp} = J_{tp}$. Thus, the restriction and
prolongation operators are
$$W^T = \begin{bmatrix} I & I \end{bmatrix}, \qquad C = \begin{bmatrix} 0 \\ I \end{bmatrix},$$
Writing $T_1 = C \left(W^T J C\right)^{-1} W^T$ for the first stage, the preconditioned matrix is
$$M_{CPR}^{-1} J = M_2^{-1} J \left(I - T_1 J\right) + T_1 J.$$
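The construction can be sketched numerically as follows. The Jacobian blocks are random stand-ins, and a plain triangular solve plays the role of the second-stage BILU(0); the point of the check is that the eigenvalue clustering at 1 produced by the first stage holds for any invertible second stage:

```python
import numpy as np

# Sketch of the CPR preconditioner (5.2.3) with the incompressible two-phase
# choice W^T = [I I], C = [0; I]. The blocks are random, diagonally shifted
# stand-ins, not an actual reservoir discretization.
rng = np.random.default_rng(3)
N = 5
Jss = np.eye(N) * 5 + rng.random((N, N))
Jsp = rng.random((N, N))
Jps = rng.random((N, N))
Jpp = np.eye(N) * 5 + rng.random((N, N))
J = np.block([[Jss, Jsp], [Jps, Jpp]])

W = np.vstack([np.eye(N), np.eye(N)])          # W^T = [I I]: sum the equations
C = np.vstack([np.zeros((N, N)), np.eye(N)])   # C = [0; I]: prolong pressure only
Ap = W.T @ J @ C                               # pressure matrix, here J_sp + J_pp = J_tp
T1 = C @ np.linalg.inv(Ap) @ W.T               # first stage: exact pressure solve

M2_inv = np.linalg.inv(np.tril(J))             # second stage: crude stand-in for BILU(0)
Mcpr_inv = M2_inv @ (np.eye(2 * N) - J @ T1) + T1

# The first stage is exact on pressure, so I - Mcpr^{-1} J has rank at most N,
# i.e. lambda = 1 is an eigenvalue of Mcpr^{-1} J with multiplicity at least N.
assert np.linalg.matrix_rank(np.eye(2 * N) - Mcpr_inv @ J) <= N
```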
Recall that $S_1 = J_{ss} - J_{sp} J_{tp}^{-1} J_{ts}$ is the Schur complement with respect to the (1,1)-block.
If we partition $M_2$ into
$$M_2 = \begin{bmatrix} \tilde{J}_{ss} & \tilde{J}_{sp} \\ \tilde{J}_{ps} & \tilde{J}_{pp} \end{bmatrix}$$
and define $\tilde{S}_1 = \tilde{J}_{ss} - \tilde{J}_{sp} \tilde{J}_{tp}^{-1} \tilde{J}_{ts}$, we obtain
$$M_{CPR}^{-1} J = \begin{bmatrix} \tilde{S}_1^{-1} S_1 & 0 \\ J_{tp}^{-1} J_{ts} - \tilde{J}_{tp}^{-1} \tilde{J}_{ts}\, \tilde{S}_1^{-1} S_1 & I \end{bmatrix}. \qquad (5.2.4)$$
As a result, $M_{CPR}^{-1} J$ has $\lambda = 1$ as an eigenvalue with (geometric) multiplicity at
least N, so the first-stage preconditioner clusters all the eigenvalues associated with
the pressure part into the point z = 1. Equation (5.2.4) also implies that GMRES
converges in at most N + 1 iterations in exact arithmetic. To see this, consider any
matrix of the form
$$G := \begin{bmatrix} S & 0 \\ Y & I \end{bmatrix}.$$
Let $q(t) = \sum_{i=0}^{k} \beta_i t^i$ be the minimal polynomial of S, where $\beta_0 \ne 0$ if and only if S is
nonsingular. Then since
$$G^{i+1} - G^i = \begin{bmatrix} S^{i+1} - S^i & 0 \\ Y S^i & 0 \end{bmatrix},$$
we see that
$$\sum_{i=0}^{k} \beta_i \left(G^{i+1} - G^i\right) = 0.$$
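The annihilation identity behind this termination argument is easy to verify numerically; the check below uses the characteristic polynomial, which is a multiple of the minimal polynomial and therefore also annihilates S:

```python
import numpy as np

# Check that sum_i beta_i (G^{i+1} - G^i) = 0 for G = [[S, 0], [Y, I]],
# where the beta_i are the coefficients of a polynomial annihilating S.
# S and Y are arbitrary random stand-ins.
rng = np.random.default_rng(4)
N = 4
S = rng.standard_normal((N, N))
Y = rng.standard_normal((N, N))
G = np.block([[S, np.zeros((N, N))], [Y, np.eye(N)]])

# np.poly gives the characteristic polynomial det(tI - S), highest degree
# first; reversed, beta[i] multiplies t^i. By Cayley-Hamilton it annihilates S.
beta = np.poly(S)[::-1]
acc = np.zeros((2 * N, 2 * N))
for i, b in enumerate(beta):
    acc += b * (np.linalg.matrix_power(G, i + 1) - np.linalg.matrix_power(G, i))
assert np.allclose(acc, 0, atol=1e-8)
```

In exact arithmetic this says $(G - I)\,q(G) = 0$, so the Krylov space for G stalls after at most $\deg(q) + 1$ steps, which is the stated N + 1 bound.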
Note that ILU is only exact if the cell-based block form of the Jacobian is used.
Fill-in necessarily occurs if the partitioned form of the Jacobian is used, since the
Schur complement
$$S_2 := J_{pp} - J_{ps} J_{ss}^{-1} J_{sp}$$
is not tridiagonal. In fact, $J_{ps} J_{ss}^{-1}$ is in general a full lower-triangular matrix, which
means $S_2$ is in general a full lower Hessenberg matrix. Thus, one should expect
BILU(0) to be a better second-stage preconditioner than ILU on the partitioned
matrix J.
It is usually difficult to ascertain a priori that the BILU(0) factorization exists for
the general two-phase flow problem. However, in the special case of cocurrent flow,
we can prove the existence of BILU(0) when a cell-based potential ordering is used.
Theorem 5.5. Let J be the Jacobian corresponding to a cocurrent flow problem
linearized at $(S^\ell, P^\ell)$, and suppose the pressure profile $P^\ell$ satisfies the maximum
principle. Assume the cell-centered grid admits a two-coloring. Let $A = P J P^T$ be the block
form of the Jacobian, in which the cells are arranged in decreasing order of pressure.
Then the block ILU(0) factorization of A exists with nonsingular factors L and U.
Moreover, we have
$$P^T (L U) P = J + E, \qquad (5.2.5)$$
where
$$E = \begin{bmatrix} 0 & E_{sp} \\ 0 & E_{pp} \end{bmatrix}.$$
In other words, BILU(0) is exact on the saturation part. The assumption that P `
satisfies the maximum principle implies that the cell(s) with the lowest pressure must
be on a Dirichlet boundary. Also note that this theorem is applicable to cocurrent
flow problems in any dimension, and not just for 1D flows, as long as the grid is
two-colorable. This applies to many grids of practical interest (Cartesian and other
orthogonal grids, radial grids, etc.). In light of (5.2.2) and the fact that T1 is exact
on the pressure part, Theorem 5.5 indicates that BILU(0) should be an excellent
preconditioner as long as Esp and Epp are not too large.
Proof. Let $A = P J P^T$ and let $A_{ij}$ denote its 2 × 2 blocks. Let $A^{(k)}$ be the block
$(N - k + 1) \times (N - k + 1)$ matrix that remains to be factored at the kth step, i.e., we have
$A = A^{(1)}$,
$$A^{(k)} = \begin{bmatrix} A_{kk}^{(k)} & A_{k,k+1}^{(k)} & \cdots & A_{kN}^{(k)} \\ A_{k+1,k}^{(k)} & \ddots & & \vdots \\ \vdots & & \ddots & \vdots \\ A_{Nk}^{(k)} & \cdots & \cdots & A_{NN}^{(k)} \end{bmatrix},$$
and
$$A_{ij}^{(k+1)} = \begin{cases} 0, & A_{ij}^{(k)} = 0; \\ A_{ij}^{(k)} - A_{ik}^{(k)} \left(A_{kk}^{(k)}\right)^{-1} A_{kj}^{(k)}, & A_{ij}^{(k)} \ne 0, \end{cases}$$
for $i, j \ge k + 1$.
1. We argue that
$$A_{ij}^{(1)} = A_{ij}^{(2)} = \cdots = A_{ij}^{(K)} \qquad (5.2.6)$$
whenever $i \ne j$ and $K \le \min(i, j)$. This is true because a two-coloring exists for
any Cartesian grid, i.e., one can partition the gridblocks $V = \{1, 2, \ldots, N\}$ into
disjoint sets $V_R$ and $V_B$ such that $A_{ij} = 0$ whenever $i \ne j$ and either $i, j \in V_R$
or $i, j \in V_B$. Thus, when $i \ne j$, either $A_{ij}^{(k)} = 0$ or $A_{ik}^{(k)} \left(A_{kk}^{(k)}\right)^{-1} A_{kj}^{(k)} = 0$, which
implies (5.2.6). Hence, the only blocks that change during the elimination are the
diagonal blocks $A_{ii}^{(k)}$.
which means only the second column of $A_{ii}^{(k)}$ gets updated during the elimination.
So we have
$$A_{ii}^{(k)} = \begin{bmatrix} a_{ii} & X_{ii}^{(k)} \\ -b_{ii} & Y_{ii}^{(k)} \end{bmatrix}; \qquad A_{ij}^{(k)} = A_{ij} = \begin{bmatrix} -a_{ij} & -X_{ij} \\ b_{ij} & -Y_{ij} \end{bmatrix} \quad (i \ne j). \qquad (5.2.8)$$
3. Let $\gamma_i = \phi_i V_i / \Delta t \ge \gamma_{\min} > 0$. The following properties hold for $A = A^{(1)}$:
(a) $X_{ii}^{(k)} \ge \sum_{j \ge k,\, j \ne i} X_{ij} \ge 0$;
(b) $Y_{ii}^{(k)} \ge \sum_{j \ge k,\, j \ne i} Y_{ij} \ge 0$;
(c) $X_{ii}^{(k)} + Y_{ii}^{(k)} > \sum_{j \ge k,\, j \ne i} \left(X_{ij} + Y_{ij}\right) \ge 0$ for a cell i on the Dirichlet boundary.
Then (a)–(c) together would imply that $A_{ii}^{(k)}$ is nonsingular for all $k \le i$. Assume
first that cell i is not on the Dirichlet boundary. Then by the maximum principle,
there is at least one cell downstream from i. Thus,
Clearly, conditions (a)–(c) are satisfied for k = 1. For the inductive step, we
compute
$$A_{ik}^{(k)} \left(A_{kk}^{(k)}\right)^{-1} A_{ki}^{(k)} = \frac{1}{\det A_{kk}^{(k)}} \begin{bmatrix} -a_{ik} & -X_{ik} \\ b_{ik} & -Y_{ik} \end{bmatrix} \begin{bmatrix} Y_{kk}^{(k)} & -X_{kk}^{(k)} \\ b_{kk} & a_{kk} \end{bmatrix} \begin{bmatrix} 0 & -X_{ki} \\ 0 & -Y_{ki} \end{bmatrix} = \begin{bmatrix} 0 & dX_{ii}^{(k)} \\ 0 & dY_{ii}^{(k)} \end{bmatrix}.$$
A direct computation then shows that $dX_{ii}^{(k)} \le X_{ik}$, and by a similar calculation, $dY_{ii}^{(k)} \le Y_{ik}$. Hence,
$$X_{ii}^{(k+1)} = X_{ii}^{(k)} - dX_{ii}^{(k)} \ge \sum_{j \ge k,\, j \ne i} X_{ij} - X_{ik} \ge \sum_{j \ge k+1,\, j \ne i} X_{ij},$$
4. The above argument shows that the block ILU(0) factorization of A exists and
has the form
$$L = \begin{bmatrix} I & & & \\ L_{21} & I & & \\ \vdots & \ddots & \ddots & \\ L_{N1} & \cdots & L_{N,N-1} & I \end{bmatrix}, \qquad U = \begin{bmatrix} U_{11} & U_{12} & \cdots & U_{1N} \\ & U_{22} & \ddots & \vdots \\ & & \ddots & U_{N-1,N} \\ & & & U_{NN} \end{bmatrix},$$
where
$$L_{ij} = \begin{cases} A_{ij} \left(A_{jj}^{(j)}\right)^{-1}, & i > j,\ A_{ij} \ne 0, \\ I, & i = j, \\ 0, & \text{otherwise}; \end{cases} \qquad (5.2.9)$$
$$U_{ij} = \begin{cases} A_{ij}, & i < j,\ A_{ij} \ne 0, \\ A_{ii}^{(i)}, & i = j, \\ 0, & \text{otherwise}. \end{cases} \qquad (5.2.10)$$
Clearly, L and U are both nonsingular. Each 2 × 2 block of the factorization error
$P E P^T$ has the form $A_{ik} \left(A_{kk}^{(k)}\right)^{-1} A_{kj}$, which has the pattern shown in (5.2.7). Thus,
after permutation, we get
$$E = \begin{bmatrix} 0 & E_{sp} \\ 0 & E_{pp} \end{bmatrix},$$
as required.
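The elimination rule used in this proof can be sketched as a small dense routine; this is illustrative only (a practical implementation works in place on the sparse block data structure), and the demo exploits the fact that a block tridiagonal matrix produces no dropped fill, so BILU(0) is exact there:

```python
import numpy as np

def block_ilu0(A):
    """Block ILU(0) following the elimination rule in the text.

    A is an N x N list-of-lists of 2x2 numpy blocks; None marks a structurally
    zero block. Blocks that are zero in A stay zero (fill-in is dropped), and
    only existing blocks are updated. Returns (L, U) in the same format, with
    unit diagonal blocks in L. Sketch for small dense problems only.
    """
    N = len(A)
    A = [[None if b is None else b.copy() for b in row] for row in A]
    for k in range(N):
        inv_kk = np.linalg.inv(A[k][k])
        for i in range(k + 1, N):
            if A[i][k] is None:
                continue
            A[i][k] = A[i][k] @ inv_kk                     # L_ik = A_ik (A_kk^(k))^-1
            for j in range(k + 1, N):
                if A[k][j] is not None and A[i][j] is not None:
                    A[i][j] = A[i][j] - A[i][k] @ A[k][j]  # update kept blocks only
    I2 = np.eye(2)
    L = [[(A[i][j] if i > j else I2 if i == j else None) for j in range(N)] for i in range(N)]
    U = [[(A[i][j] if i <= j else None) for j in range(N)] for i in range(N)]
    return L, U

def assemble(B):
    """Flatten a block matrix (None = zero 2x2 block) to a dense array."""
    N = len(B)
    return np.block([[B[i][j] if B[i][j] is not None else np.zeros((2, 2))
                      for j in range(N)] for i in range(N)])

# Demo: for a block tridiagonal matrix no fill-in is dropped, so L U = A exactly.
rng = np.random.default_rng(5)
N = 3
Ab = [[None] * N for _ in range(N)]
for i in range(N):
    Ab[i][i] = np.eye(2) * 4 + rng.random((2, 2))
    if i + 1 < N:
        Ab[i][i + 1] = rng.random((2, 2))
        Ab[i + 1][i] = rng.random((2, 2))
L, U = block_ilu0(Ab)
assert np.allclose(assemble(L) @ assemble(U), assemble(Ab))
```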
Theorem 5.6. Assume the hypotheses of Theorem 5.5. Let G = (V, E) be the
upstream graph, i.e., V is the set of cells in the domain, and $(i, j) \in E$ iff (1) i is
adjacent to j, and (2) either $P_i^\ell > P_j^\ell$, or $P_i^\ell = P_j^\ell$ and $i > j$. Let
$$\sigma_1 : V \to \{1, \ldots, N\} \qquad \text{and} \qquad \sigma_2 : V \to \{1, \ldots, N\}$$
be two orderings of the cells consistent with G (i.e., topological orderings of the upstream graph). Then
$$\Pi_1^T L_1 U_1 \Pi_1 = \Pi_2^T L_2 U_2 \Pi_2,$$
$$\left(U_r\right)_{ii} = \left(A_r^{(i)}\right)_{ii} = (A_r)_{ii} - \sum_{k < i} (A_r)_{ik} \left(\left(A_r^{(k)}\right)_{kk}\right)^{-1} (A_r)_{ki},$$
but since $(A_r)_{ik} = 0$ unless $(\tau_r(k), \tau_r(i)) \in E$, we really have
$$\left(U_r\right)_{ii} = (A_r)_{ii} - \sum_{(\tau_r(k),\, \tau_r(i)) \in E} (A_r)_{ik} \left(U_r\right)_{kk}^{-1} (A_r)_{ki}.$$
Thus, for any $j \in V$, we have $(U_1)_{\sigma_1(j), \sigma_1(j)} = (U_2)_{\sigma_2(j), \sigma_2(j)}$ if and only if
$(U_1)_{\sigma_1(k), \sigma_1(k)} = (U_2)_{\sigma_2(k), \sigma_2(k)}$ for all k such that $(k, j) \in E$,
Theorem 5.6 says that there is essentially only one BILU(0) preconditioner that
respects flow directions.
It is possible to describe the nonzero pattern of the error matrices $E_{sp}$ and $E_{pp}$ in
terms of the upstream graph G. A fill-in entry is created (and subsequently dropped
by BILU(0)) at position (i, j), with $i \ne j$, if there exists $k < i, j$ such that both $A_{ik}$
and $A_{jk}$ are nonzero. In other words, if $E_{sp}$ and $E_{pp}$ are nonzero at position (i, j), then
nodes i and j must be siblings in the upstream graph G, i.e., i and j must share the
same parent k. This immediately provides an upper bound on the number of entries
in Esp and Epp : the number of error entries due to the elimination of node k is given
by dk (dk − 1), where dk is the out-degree of node k (i.e., the number of edges coming
out of k). So the total number of entries in Esp and Epp is bounded by
$$\sum_i d_i \left(d_i - 1\right) = \sum_i d_i^2 - \sum_i d_i \le |V|\, D^2 - |E|,$$
where D denotes the maximum out-degree.
So in either case, the error matrices are sparse, since the number of entries scales
linearly with |V|. Moreover, the values of the entries are given by
$$\begin{bmatrix} 0 & (E_{sp})_{ij} \\ 0 & (E_{pp})_{ij} \end{bmatrix} = -A_{ik} \left(A_{kk}^{(k)}\right)^{-1} A_{kj}.$$
Physically, this corresponds to the flux from cell j to cell i (traveling via k) that is
generated by a change in pressure pj . Since the potential ordering always orders the
cells according to the major flow direction, the fluxes between siblings are generally
much smaller than fluxes along upstream edges. This implies the error matrices
Esp and Epp are small. Contrast this with a lexicographical ordering, where there
is no guarantee that the flux between siblings should be small. Thus, a second-
stage preconditioner that uses potential ordering should be more effective than one
that uses the natural ordering. This is what we observe in our numerical examples
(section 5.2.4).
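The sibling-fill bound can be illustrated on a small Cartesian grid with a synthetic pressure field (the field below is a made-up left-to-right gradient, purely for illustration):

```python
import numpy as np

# Sibling-fill bound on a small Cartesian grid: eliminating node k in
# potential order creates at most d_k (d_k - 1) dropped (sibling) entries,
# where d_k is the out-degree of k in the upstream graph.
nx, ny = 5, 4
cells = [(i, j) for i in range(nx) for j in range(ny)]
P = {(i, j): -(i + 0.01 * j) for (i, j) in cells}   # synthetic, all distinct

out_deg = {c: 0 for c in cells}
for (i, j) in cells:
    for nb in ((i + 1, j), (i - 1, j), (i, j + 1), (i, j - 1)):
        if nb in P and P[(i, j)] > P[nb]:
            out_deg[(i, j)] += 1    # edge (i,j) -> nb in the upstream graph

V, E = len(cells), sum(out_deg.values())
D = max(out_deg.values())
sibling_bound = sum(d * (d - 1) for d in out_deg.values())
# sum d(d-1) = sum d^2 - |E| <= |V| D^2 - |E|
assert sibling_bound <= V * D * D - E
```

For this monotone pressure field the out-degree is at most 2, so the number of potential error entries grows linearly with the number of cells, matching the sparsity claim in the text.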
From (5.2.3), the error propagation operator of the two-stage preconditioner factors as
$$I - M_{CPR}^{-1} J = \left(I - M_2^{-1} J\right)\left(I - C \left(W^T J C\right)^{-1} W^T J\right).$$
Thus,
$$\hat{E}_{sp} J_{tp}^{-1} J_{ts} = \tilde{S}_1^{-1}\left[\tilde{J}_{pp}\tilde{J}_{tp}^{-1}\left(\tilde{J}_{sp} - J_{sp}\right) - \tilde{J}_{sp}\tilde{J}_{tp}^{-1}\left(\tilde{J}_{pp} - J_{pp}\right)\right] J_{tp}^{-1} J_{ts}$$
$$= \tilde{S}_1^{-1}\left[\left(\tilde{J}_{sp} - J_{sp}\right) - \tilde{J}_{sp}\tilde{J}_{tp}^{-1}\left(\tilde{J}_{tp} - J_{tp}\right)\right] J_{tp}^{-1} J_{ts}$$
$$= \tilde{S}_1^{-1}\left(\tilde{J}_{sp}\tilde{J}_{tp}^{-1} - J_{sp} J_{tp}^{-1}\right) J_{ts}.$$
It is interesting to compare the above expressions with the spectrum one would get
with a single-stage BILU(0) preconditioner (i.e., no pressure solve). In that case, we
would have
$$M_2^{-1} J = \begin{bmatrix} I & -\hat{E}_{sp} \\ 0 & I - \hat{E}_{pp} \end{bmatrix}.$$
Therefore, in the single-stage BILU case, we would still have termination within N + 1
steps when flow is cocurrent, but the convergence behavior from iteration 1 to N is
dictated by $I - \hat{E}_{pp}$, instead of $I + \hat{E}_{sp} J_{tp}^{-1} J_{ts}$. We expect the CPR preconditioner to
outperform single-stage BILU(0) based on the following (somewhat heuristic) reasons:

1. If $\|J_{ts}\|$ is small (e.g., when the overall flow (total mass balance equations) is
slowly varying with respect to the time-step size), then $\tilde{S}_1^{-1} S_1$ will be close to
the identity matrix, whereas this is not the case for single-stage BILU. In many
practical applications, the total velocity, which dictates $J_{tp}$, does not vary much
within a time step, so CPR would have a significant advantage over BILU(0).
2. It can be shown (see Appendix E) that both $J_{sp} J_{tp}^{-1}$ and $J_{pp} J_{tp}^{-1}$ are similar to
a symmetric positive semi-definite matrix, with eigenvalues between 0 and 1.
Thus, even though factorization errors are present, one can also expect $\tilde{J}_{sp} \tilde{J}_{tp}^{-1}$,
and hence the term $\tilde{J}_{sp} \tilde{J}_{tp}^{-1} - J_{sp} J_{tp}^{-1}$, to be relatively benign. On the other
hand, one cannot bound the eigenvalues of $J_{ps} J_{ss}^{-1}$: when $\Delta t$ is large, $J_{ps} J_{ss}^{-1}$
can have both very large eigenvalues (when $|\lambda_{o,i}| \gg |\lambda_{w,i}|$) and very small ones
(when $|\lambda_{o,i}| \ll |\lambda_{w,i}|$). So any bound on $\hat{E}_{sp}$ is likely to be much tighter than a
bound on $\hat{E}_{pp}$.
3. Note the factorizations
$$S_1 = J_{ss} - J_{sp} J_{tp}^{-1} J_{ts} = J_{ss}\left(I - J_{ss}^{-1} J_{sp} J_{tp}^{-1} J_{ts}\right),$$
$$S_2 = J_{tp} - J_{ts} J_{ss}^{-1} J_{sp} = J_{tp}\left(I - J_{tp}^{-1} J_{ts} J_{ss}^{-1} J_{sp}\right).$$
Since the two matrices inside the parentheses have the same eigenvalues, it is
evident that S1 behaves more like the transport part Jss , whereas S2 behaves
more like the elliptic part Jtp . In particular, one expects κ(S1 ) to scale like
O(∆t/h), whereas κ(S2 ) would be O(1/h2 ). We also know that fixed-pattern
ILU preconditioners tend to perform poorly on elliptic problems. This indicates
CPR should, in general, outperform single-stage BILU(0).
To illustrate these arguments, we show the spectral plots of J, $M_2^{-1} J$ and $M_{CPR}^{-1} J$,
as well as their condition numbers, for various time-step sizes in Figures 5.1 and 5.2.
For this test case, we have a 2D homogeneous reservoir (with uniform porosity),
discretized on a 20 × 10 grid. A constant injection rate is imposed along the left
edge, and pressure is held constant along the right edge, with no flow boundaries
along the top and bottom. We see that the spectrum of J changes significantly as
∆t varies. The condition number is very large for all cases, and there is no obvious
clustering of eigenvalues, which means GMRES will likely perform poorly without
preconditioning. When BILU(0) is used, the spectrum lies almost completely on the
positive real axis, but the distribution is continuous and no obvious clustering exists;
in fact, the spectrum looks very similar to one belonging to an elliptic operator
(possibly due to Jtp appearing as a multiplicative factor in S2 ). When two-stage CPR
is used, the clustering around z = 1 becomes very obvious, and the high quality of
the clustering is remarkably consistent across time steps.
Figures 5.3 and 5.4 show the same spectral plots when countercurrent flow is
present. The same comments concerning the spectra of J and $M_2^{-1} J$ apply, except
that the condition numbers become much higher. As for the CPR-preconditioned
matrix, we still see a very good clustering of eigenvalues around z = 1, but the cluster
Table 5.1: Convergence behavior for the block ILU(0) and CPR preconditioners.
Each entry is the average number of GMRES iterations per Newton step required for
convergence.
is not as tight, and we start to see more spreading along the positive real axis. This
is probably due to the fact that M2 is no longer exact with respect to saturation, and
this factorization error manifests itself as a spreading of the eigenvalues. Fortunately,
the outlying eigenvalues are well separated from one another, so GMRES should
have little problem eliminating the subspaces associated with them within a few
iterations. Table 5.1 shows the linear iteration counts per Newton step for both the
CPR and block ILU(0) preconditioners on the 20 × 10 grid. For both the cocurrent
and countercurrent flow cases, it is evident that the higher quality clustering produced
by CPR does, in fact, translate into much faster convergence compared with block
ILU(0).
[Figure 5.1: Spectra of the Jacobian (no preconditioning), BILU(0) and CPR preconditioning for the cocurrent flow problem (∆t = 1, 5). Condition numbers: no preconditioning κ ≈ 5961.8 (∆t = 1) and 5414.7 (∆t = 5); ILU only κ ≈ 358.3 and 299.1; CPR κ ≈ 103.6 and 61.6.]
[Figure 5.2: Spectra of the Jacobian (no preconditioning), BILU(0) and CPR preconditioning for the cocurrent flow problem (∆t = 20, 100). Condition numbers: no preconditioning κ ≈ 5683.9 and 13524.7; ILU only κ ≈ 234.7 and 222.3; CPR κ ≈ 131.5 and 150.0.]
[Figure 5.3: Spectra of the Jacobian (no preconditioning), BILU(0) and CPR preconditioning for the countercurrent flow problem (∆t = 1, 5). Condition numbers: no preconditioning κ ≈ 5601.3 and 6741.5; ILU only κ ≈ 534.2 and 5677.5; CPR κ ≈ 20.1 and 9.25.]
[Figure 5.4: Spectra of the Jacobian (no preconditioning), BILU(0) and CPR preconditioning for the countercurrent flow problem (∆t = 20, 100). Condition numbers: no preconditioning κ ≈ 34310.2 and 2306857.9; ILU only κ ≈ 87351.0 and 5176047.3; CPR κ ≈ 349.7 and 49843.2.]
in the first stage.) This example provides experimental confirmation of Theorem 5.6
and illustrates the ability of potential ordering to shield this type of grid orientation
effect from the linear solver.
(a) The production well is located at (1,1,:), so that the major direction of flow is
aligned with the lexicographical ordering;
(b) The production well is located at (110,1,:), so that the major direction of flow
is transverse to the lexicographical ordering.
We run the simulation to T = 100 days (0.02 pore volumes injected). Table 5.3
summarizes the runs for both preconditioners. The number of time steps and Newton
steps are again exactly the same in all cases, indicating that the problems are
In our current implementation, the savings due to the use of potential ordering
are rather modest, even though the GMRES iteration count decreases substantially.
We believe this is due to our inefficient implementation. Currently, we physically
permute the blocks of the Jacobian matrix into the potential ordering before feeding
it into a library routine that computes the block ILU factorization. This simplifies the
implementation, but adds unnecessary cost to the solver, because one can actually
modify the ILU routine to use the unpermuted data structure, and only change the
order of elimination when the factorization is computed. (This is what modern direct
5.3. SCHUR COMPLEMENT PRECONDITIONING
meaning that if $M_2$ is exact on saturation (e.g., BILU(0) for cocurrent flow), then the
two-stage preconditioner would be exact. Note that the “pressure matrix” $W^T J C$ in
this case would be
$$W^T J C = -J_{ps} J_{ss}^{-1} J_{sp} + J_{pp} = S_2,$$
meaning we are actually solving the Schur complement problem with respect to
pressure. $S_2$ is in general a dense matrix; however, as we noted in section 4.1, we can
perform the matrix-vector product $S_2 v$ with exactly the same computational cost as
performing $J v$, since multiplication with $J_{ss}^{-1}$ is simply a forward substitution when
we exploit potential ordering. It is thus worthwhile to attempt to devise effective
preconditioners for the Schur complement problem. In fact, a good preconditioner
for $S_2$ (i.e., one that converges as quickly as two-stage CPR on the full problem)
would eliminate the need for a two-stage preconditioner on J, since one can always
obtain the saturation solution from the pressure solution by a back substitution (i.e.,
a multiplication by $J_{ss}^{-1}$).
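This matrix-free application of $S_2$ can be sketched as follows; the blocks are random stand-ins, and SciPy supplies the triangular solve:

```python
import numpy as np
from scipy.linalg import solve_triangular

# Matrix-vector product with S2 = Jpp - Jps Jss^{-1} Jsp without forming S2:
# under potential ordering Jss is lower triangular, so applying Jss^{-1} is a
# single forward substitution. All blocks here are random stand-ins.
rng = np.random.default_rng(7)
N = 6
Jss = np.tril(rng.random((N, N))) + np.eye(N)   # lower triangular, invertible
Jsp = rng.random((N, N))
Jps = rng.random((N, N))
Jpp = np.eye(N) * 2 + rng.random((N, N))

def s2_matvec(v):
    w = solve_triangular(Jss, Jsp @ v, lower=True)  # forward substitution
    return Jpp @ v - Jps @ w

v = rng.random(N)
S2_dense = Jpp - Jps @ np.linalg.solve(Jss, Jsp)    # reference, never needed in practice
assert np.allclose(s2_matvec(v), S2_dense @ v)
```

Each `s2_matvec` costs two dense matvecs plus one triangular sweep, which is comparable to one product with the full J, exactly as claimed above.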
The spectral and profile plots are shown in Figures 5.7 (for 1D problems) and 5.8. The
profile plots show the magnitudes of the entries along a row of S2 that corresponds
to a gridblock near the center of the reservoir, behind the flood front.
Let us first look at the spectrum of S2 . When the time step is small, the transport
problem contributes little to the spectrum of the Schur complement; the eigenvalue
plot is similar to that of a positive definite elliptic operator. For the medium and large
time step, though, we start to see a rather complicated spectral plot consisting of two
parts: eigenvalues along the positive real axis corresponding to the pressure part, and
complex conjugate pairs that arise from the saturation part of the problem. The plots
for the 2D case are especially revealing, since the complex eigenvalues are roughly in
the shape of a parabola and bear a striking resemblance to the pseudospectra of
convection-diffusion operators [63]. Based on the spectral plots, we conclude that
preconditioners that depend strongly on the matrix being nearly symmetric positive
definite (such as algebraic multigrid) will probably perform poorly on problems with
moderate to large time steps.
As for the nonzero pattern, darker colors in Figure 5.6 indicate larger magnitudes.
In the 1D case, most of the energy (i.e., Frobenius norm) of the matrix lies within the
tridiagonal part; even though the lower triangular part is technically nonzero, most
[Figure 5.6: Nonzero pattern of S2 for the 1D and 2D reservoirs (panels: 1D flow, 2D flow). Darker colors indicate larger magnitudes.]
of the entries outside the tridiagonal region are tiny and can be neglected. Thus,
ILU(0) can potentially be a good preconditioner for the 1D Schur complement. In
contrast, the 2D Jacobian has large entries outside the pentadiagonal region, and the
magnitude of the fill-in entries increases as the time step is increased. Hence, it is
unlikely that a preconditioner with small bandwidth (such as an ILU preconditioner
induced from the partitioned matrix J) would be effective for 2D problems.
[Figures 5.7 and 5.8: Spectra and row-magnitude profiles of S2 for the 1D reservoir (condition numbers κ ≈ 1344.1, 626.6, 385.4) and the 2D reservoir (κ ≈ 2702.7, 2044.8, 1786.6) at three time-step sizes.]
where $\hat{J}_{ss}^{-1}$ is a first-order approximation of $J_{ss}^{-1}$, defined as follows. Suppose we order
$J_{ss}$ so that it is lower triangular. Then $J_{ss} = D - L = \left(I - L D^{-1}\right) D$, where D is
diagonal and L is strictly lower triangular. Then
$$J_{ss}^{-1} = D^{-1}\left(I - L D^{-1}\right)^{-1} = D^{-1}\left(I + L D^{-1} + \cdots + \left(L D^{-1}\right)^{N-1}\right).$$
2. AMG on S2 : the full Schur complement is handed to AMG, and one V-cycle is
used per GMRES iteration;
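The truncated-series construction of $\hat{J}_{ss}^{-1}$ described above can be sketched as follows, with a random lower-triangular stand-in for $J_{ss}$:

```python
import numpy as np

# The finite Neumann series behind the approximation: with Jss = D - L lower
# triangular (D diagonal, L strictly lower), L D^{-1} is nilpotent, so the
# series terminates after N terms, and the first-order truncation
# D^{-1}(I + L D^{-1}) gives the approximate inverse used in the text.
rng = np.random.default_rng(8)
N = 5
Jss = np.tril(rng.random((N, N))) + np.eye(N) * 2
D = np.diag(np.diag(Jss))
L = D - Jss                                   # strictly lower triangular
Dinv = np.diag(1.0 / np.diag(Jss))
M = L @ Dinv                                  # nilpotent: M^N = 0

series = sum(np.linalg.matrix_power(M, k) for k in range(N))
assert np.allclose(Dinv @ series, np.linalg.inv(Jss))   # full series is exact

Jss_hat_inv = Dinv @ (np.eye(N) + M)          # first-order approximation
```

Keeping only the linear term trades accuracy for a cheap, diagonal-plus-one-sweep application, which is the point of the construction.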
The induced preconditioners (items (1) and (4)) are defined as follows. If MJ is a
preconditioner for the full matrix J, then the induced preconditioner MS is defined
as

MS^{-1} = [0  I] MJ^{-1} [0  I]^T.

              AMG                      Exact
∆t            1.6    3.1    7.8       1.6    3.1    7.8
M0            10.7   19.3   24.7      4.0    8.3    10.0
M1            10.0   13.3   15.3      3.7    6.0    7.0
M2            11.7   17.7   21.7      4.3    8.3    11.0
M3            11.0   29.0   45.0      4.3    7.3    9.0
M4            21.3   33.3   41.0      3.0    5.0    5.0
Induced ILU   33.7   38.3   40.7
AMG on S2     7.7    16.3   14.3
CPR on S2     5.7    8.7    11.3
CPR on J      4.0    6.3    5.3
Note that the induced ILU preconditioner can always be applied exactly because if
MA = LA UA, then MS = LS US, where

LS = R^T LA R,   US = R^T UA R,   R^T = [0  I].
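This identity is straightforward to verify numerically: it only depends on LA being lower triangular and UA upper triangular, so triangular inverses preserve the zero block structure. The sketch below uses an exact LU factorization in place of the ILU factors; the matrix and block sizes are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)
n1, n2 = 3, 4                       # leading / trailing block sizes (arbitrary)
n = n1 + n2
A = rng.standard_normal((n, n)) + n * np.eye(n)  # well-conditioned test matrix

# LU factorization without pivoting (the strong diagonal makes this safe here).
L = np.eye(n)
U = A.copy()
for k in range(n - 1):
    m = U[k + 1:, k] / U[k, k]
    L[k + 1:, k] = m
    U[k + 1:, k:] -= np.outer(m, U[k, k:])

R = np.vstack([np.zeros((n1, n2)), np.eye(n2)])   # so that R^T = [0 I]

# Induced preconditioner applied exactly: M_S^{-1} = R^T U^{-1} L^{-1} R ...
MS_inv = R.T @ np.linalg.inv(U) @ np.linalg.inv(L) @ R
# ... equals the inverse of the product of the trailing triangular blocks.
LS, US = R.T @ L @ R, R.T @ U @ R
print(np.allclose(MS_inv, np.linalg.inv(LS @ US)))  # True
```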
1. The convergence rates of all the Schur complement methods have a fairly strong
dependence on the time-step size. CPR on the full matrix, on the other hand,
exhibits a convergence behavior that is nearly independent of ∆t, which is
consistent with the spectral plots of Figures 5.1–5.4.
              AMG                      Exact
∆t            1.6    3.1    7.8       1.6    3.1    7.8
M0            19.5   42.3   63.2      8.5    16.8   22.6
M1            16.8   27.5   86.2      6.8    12.0   32.2
M2            14.2   24.5   55.2      7.3    12.3   22.8
M3            17.5   45.0   >100      7.2    12.3   23.8
M4            44.0   DNC    DNC       6.2    10.0   15.2
Induced ILU   35.0   42.0   63.3
AMG on S2     18.0   DNC    DNC
CPR on S2     8.2    11.5   22.6
CPR on J      5.7    6.8    6.6
3. A more accurate approximation of the Schur complement does not imply faster
convergence when AMG is used. In particular, M4 (which has the most fill
among the M ’s) and the exact Schur complement S2 both do poorly in the
countercurrent flow case when AMG is used. As expected, AMG has trouble
when the matrix is far from being an elliptic operator. This is in contrast with
the exact preconditioner case, where a more accurate preconditioner usually
requires fewer iterations to converge.
CPR applied to the Schur complement S2 requires more iterations to converge than
M_CPR^{-1} applied to the full system, even though it is operating on a smaller system.
One possible explanation is as follows. A direct (but tedious) calculation shows that

where S̃2 = J̃pp − J̃ps J̃ss^{-1} J̃sp is the Schur complement of the second-stage precondi-
tioner M2 with respect to pressure. If we assume cocurrent flow and that BILU(0)
with potential ordering is used, we would have J̃ss = Jss and J̃ps = Jps, so the
preconditioned matrix S2 MS^{-1} would become

Once again, the convergence behavior depends on how close S2 S̃2^{-1} is to the identity,
whereas such a term is absent from M_CPR^{-1} J. This could explain why CPR on the Schur
complement S2 scales less well than CPR on the full Jacobian J.
Chapter 6
Conclusions
The efficient simulation of immiscible multiphase flow in porous media requires the
use of nonlinear solvers and linear preconditioners that can take advantage of the
underlying structure of the problem, such as flow direction information. The phase-
based potential ordering in Chapter 3 exploits the upstream nature of the spatial
discretization in order to triangularize the saturation part of the nonlinear system of
equations. This ordering is valid for any flow configuration, and it can handle
countercurrent flow due to gravity and capillarity. To compute the ordering, one simply
needs to perform a topological sort on the upstream graph, so the time complexity
scales linearly with the size of the grid. Moreover, this cost can be amortized over
several Newton and time steps, since in practice flow directions reverse only sparingly.
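The ordering step can be sketched as a standard topological sort (Kahn's algorithm) of the upstream graph, in which an edge u → v records that cell u is upstream of cell v; the 5-cell graph below is a made-up example:

```python
from collections import deque

def topological_order(n_cells, upstream_edges):
    """Kahn's algorithm: O(V + E) topological sort of the upstream graph."""
    succ = [[] for _ in range(n_cells)]
    indeg = [0] * n_cells
    for u, v in upstream_edges:        # u is upstream of v
        succ[u].append(v)
        indeg[v] += 1
    queue = deque(i for i in range(n_cells) if indeg[i] == 0)
    order = []
    while queue:
        u = queue.popleft()
        order.append(u)
        for v in succ[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(v)
    if len(order) != n_cells:
        raise ValueError("upstream graph has a cycle")
    return order

# Illustrative 5-cell upstream graph (e.g. flow fanning out from cell 2).
edges = [(2, 1), (2, 3), (1, 0), (3, 4)]
order = topological_order(5, edges)
pos = {c: k for k, c in enumerate(order)}
print(all(pos[u] < pos[v] for u, v in edges))  # True
```

With this ordering every cell appears after all of its upstream neighbors, which is exactly what makes the saturation system triangular.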
The proposed phase-based potential ordering allows a partial decoupling of the
transport problem from the flow problem, since the saturations can be computed via
back substitution once the pressures are known. This allows us to derive a
reduced-order Newton algorithm, which is the nonlinear analog of a Schur complement
approach in matrix computations. We have proved that for 1D countercurrent flow, the
reduced Newton method converges unconditionally for large ∆t. In addition, a minor
modification to the method (which can be thought of as pivoting) yields provable
convergence for any time-step size. As demonstrated in various examples, reduced
Newton has a much more robust convergence behavior than the usual Newton method,
which translates into the ability to take larger time steps without risking divergence of
the nonlinear iterations. This, in turn, leads to a more efficient and robust simulator
overall.
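The structure of reduced Newton can be illustrated on a toy two-unknown analogue: the saturation equation is solved by substitution for each trial pressure, and Newton iterates only on the reduced pressure residual. The residuals below are invented for illustration and are not the simulator's equations:

```python
import math

def reduced_newton(Fp, solve_S, p0, tol=1e-12, max_iter=50):
    """Newton on the reduced pressure residual h(p) = Fp(p, S(p)), where S(p)
    comes from (back) substitution in the saturation equations.  The derivative
    is approximated by a finite difference for simplicity."""
    p = p0
    for _ in range(max_iter):
        h = Fp(p, solve_S(p))
        if abs(h) < tol:
            return p
        eps = 1e-7 * max(1.0, abs(p))
        dh = (Fp(p + eps, solve_S(p + eps)) - h) / eps
        p -= h / dh
    return p

# Toy coupled system (made up): saturation eq  S - 0.5*(1 + tanh(p)) = 0,
# pressure eq  p + S - 1.2 = 0.
solve_S = lambda p: 0.5 * (1.0 + math.tanh(p))   # substitution step
Fp = lambda p, S: p + S - 1.2                    # pressure residual

p_star = reduced_newton(Fp, solve_S, p0=0.0)
S_star = solve_S(p_star)
print(abs(p_star + S_star - 1.2) < 1e-10)  # True
```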
Ordering techniques can also lead to improvements in the linear solver. For a
cocurrent flow problem, a block ILU(0) factorization always exists provided the cells
are ordered according to the phase potential, and this factorization is unique over all
topological orderings. Moreover, this factorization is exact on the saturation part of
the Jacobian. Since block ILU is used as the second stage of CPR preconditioning,
exactness on saturation means that the pairing with True-IMPES reduction, which
is exact on pressure, is practically ideal. Moreover, its uniqueness over topological
orderings means CPR is much less sensitive to flow configuration variations if potential
ordering is used. Spectral plots and numerical experiments demonstrate the power
of this combination. Finally, experiments reveal that it is difficult to construct a
preconditioner for the pressure Schur complement S2 that rivals two-stage CPR in
performance. This is likely because S2 is a dense matrix that exhibits both advective
and diffusive characters, as indicated by the spectral plots.
The nonlinear analysis also underscores the principal advantage of the fully-implicit
method over an explicit scheme: the ability to handle the large CFL numbers that
naturally arise from heterogeneity.
Future directions
In this section, we outline several possible future research directions stemming from
our work.
primary variables, the nonlinear algebraic system contains only pressure and
saturations; in other words, the implicit part looks exactly like a black-oil system, so
we can use reduced Newton without modifications. A possible difficulty is that in
a compositional model, heavy hydrocarbon components are allowed to vaporize into
the gas phase, whereas this is not allowed in the standard black-oil model. This could
complicate the triangularization process, as both the oil and gas equations contain
flow terms from both phases. However, since the amount of heavy components in
the vapor phase is generally small for heavy oils, it may still be possible to
triangularize the system by temporarily freezing or linearizing the Sg-dependent terms in
the oil equation. More numerical experiments and theory are needed to verify the
effectiveness of this approach.
Appendix A

Pressure Equation Derivation

Here we derive the pressure equation (1.1.18). Assume no gravity, capillarity or source
terms. Then the phase equations are given by:
Water:  ∂(φρw Sw)/∂t − ∇·(Kλw ρw ∇p) = 0,   (A.1)

Oil:    ∂(φρo So)/∂t − ∇·(Kλo ρo ∇p) = 0,   (A.2)

Gas:    ∂(φρg Sg + φρo Rs So)/∂t − ∇·(Kλg ρg ∇p + Kλo ρo Rs ∇p) = 0.   (A.3)

Dividing the water equation by ρw and expanding the time derivative gives

(1/ρw)(φ′ρw + φρ′w) Sw ∂p/∂t + φ ∂Sw/∂t − (1/ρw) ∇·(Kλw ρw ∇p) = 0.   (A.4)
Similarly, multiplying the oil equation by (ρg − ρo Rs)/(ρo ρg) = 1/ρo − Rs/ρg and
expanding gives

(1/ρo − Rs/ρg)(φ′ρo + φρ′o) So ∂p/∂t + φ(1 − ρo Rs/ρg) ∂So/∂t
    − ((ρg − ρo Rs)/(ρo ρg)) ∇·(Kλo ρo ∇p) = 0,   (A.5)

and dividing the gas equation by ρg gives

(1/ρg)[(φ′ρg + φρ′g) Sg + (φ′ρo Rs + φρ′o Rs + φρo R′s) So] ∂p/∂t
    + φ ∂Sg/∂t + φ(ρo Rs/ρg) ∂So/∂t − (1/ρg) ∇·(Kλg ρg ∇p + Kλo ρo Rs ∇p) = 0.   (A.6)
We now add (A.4)–(A.6) together. First, the sum of the saturation derivatives is

φ ∂Sw/∂t + φ(1 − ρo Rs/ρg) ∂So/∂t + φ ∂Sg/∂t + φ(ρo Rs/ρg) ∂So/∂t
    = φ(∂Sw/∂t + ∂So/∂t + ∂Sg/∂t) = 0,

since the saturations sum to one. Next, the pressure-derivative terms sum to
φ cT ∂p/∂t with cT = cr + cw Sw + co So + cg Sg, where

cr = φ′/φ,   cw = ρ′w/ρw,   co = ρ′o/ρo + ρo R′s/ρg,   cg = ρ′g/ρg.
So the pressure equation is

φ cT ∂p/∂t − (1/ρw) ∇·(Kλw ρw ∇p) − ((ρg − ρo Rs)/(ρo ρg)) ∇·(Kλo ρo ∇p)
    − (1/ρg) ∇·(Kλg ρg ∇p + Kλo ρo Rs ∇p) = 0.   (A.7)
We can simplify (A.7) further by assuming that p is differentiable with respect to the
spatial variable x. We use the identity
∇ · (f v) = ∇f · v + f ∇ · v,
Applying it with f = ρw and v = Kλw ∇p gives

(1/ρw) ∇·(Kλw ρw ∇p) = (1/ρw)[(ρ′w ∇p)·(Kλw ∇p) + ρw ∇·(Kλw ∇p)]
    = Kλw cw |∇p|² + ∇·(Kλw ∇p).
(1/ρg) ∇·(Kλg ρg ∇p + Kλo ρo Rs ∇p)
    = (1/ρg)[(ρ′g ∇p)·(Kλg ∇p) + ρg ∇·(Kλg ∇p)
      + (ρ′o Rs + ρo R′s) ∇p·(Kλo ∇p) + ρo Rs ∇·(Kλo ∇p)]
    = Kλg cg |∇p|² + ∇·(Kλg ∇p) + (ρo Rs/ρg) ∇·(Kλo ∇p)
      + Kλo (ρo Rs/ρg)(ρ′o/ρo) |∇p|² + Kλo (ρo R′s/ρg) |∇p|².
Collecting terms, (A.7) becomes

φ cT ∂p/∂t − ∇·(KλT ∇p) − χT K|∇p|² = 0,   (A.8)
Appendix B

Here we prove the equivalence between column diagonal dominance and m-accretivity
in the L1 -norm for linear maps over Rn . Recall that for the space L1 (Rn ), A is m-
accretive if it is continuous and for any u, v ∈ Rn ,
Σ_{i=1}^{n} (A(u)_i − A(v)_i) sgn(u_i − v_i) ≥ 0.   (B.1)
Proof. Since A is linear, it suffices to show equivalence between condition (B.2) and
Σ_{i=1}^{n} (Au)_i sgn(u_i) ≥ 0   (B.3)
for any u ∈ Rn . Assume (B.3) holds for any vector u. For a given ε > 0, define
the vector u^(j) ∈ R^n by u^(j)_j = 1 and u^(j)_i = −ε sgn(a_ij) for i ≠ j.
Then

Au^(j) = A_j + ε v^(j),

where A_j is the j-th column of A and ‖v^(j)‖₁ ≤ n‖A‖₁. Since sgn(u^(j)_i) = − sgn(a_ij)
for i ≠ j, we obtain
Σ_{i=1}^{n} (Au^(j))_i sgn(u^(j)_i) = a_jj − Σ_{i≠j} |a_ij| + ε Σ_{i=1}^{n} v^(j)_i sgn(u^(j)_i),

so (B.3) gives

a_jj − Σ_{i≠j} |a_ij| ≥ −ε Σ_{i=1}^{n} v^(j)_i sgn(u^(j)_i) ≥ −nε‖A‖₁,
which is true for all j. Letting ε → 0 yields column diagonal dominance, as required.
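The forward direction can also be probed numerically: for a column diagonally dominant matrix, the sum in (B.3) should be non-negative for every vector. A randomized pure-Python check on a made-up matrix:

```python
import random

def sgn(x):
    return (x > 0) - (x < 0)

def accretivity_sum(A, u):
    """Left-hand side of (B.3): sum_i (Au)_i sgn(u_i)."""
    n = len(A)
    return sum(sgn(u[i]) * sum(A[i][j] * u[j] for j in range(n)) for i in range(n))

# Column diagonally dominant: a_jj >= sum_{i != j} |a_ij| for every column j.
A = [[ 3.0, -1.0,  0.5],
     [-1.0,  4.0, -2.0],
     [ 1.0, -0.5,  3.0]]

random.seed(0)
worst = min(accretivity_sum(A, [random.uniform(-1, 1) for _ in range(3)])
            for _ in range(2000))
print(worst > -1e-12)  # True
```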
so (B.3) holds for u. The general case where u has zero entries is similar, except the
double summation will skip over any index i or j for which ui or uj is zero.
Appendix C

Convergence of the Cascade Method

• incompressible flow,

for p = w, o, with

u_p(x) = −K(x) k_rp(Sw(x)) dp(x)/dx

and boundary conditions

u_p(xL) = q_{p,L},   p(xR) = pR.
Proposition C.1. For the above 1D model problem, the Appleyard-Cheshire Cascade
method [4] converges in two iterations, provided the cells are ordered from upstream
to downstream (left to right). In particular, the saturation of each cell will be correct
at the end of the first iteration, and the pressures will be correct at the end of the
second iteration.
Proof. Under the Cascade (left-to-right) ordering, the discretized equations have the
form

φi (Sw,i − Sw,i^old)/∆t + (1/∆x) K_{i,i+1} krw(Sw,i)(pi − pi+1)/∆x − FIw,i = 0,
φi (Sw,i^old − Sw,i)/∆t + (1/∆x) K_{i,i+1} kro(Sw,i)(pi − pi+1)/∆x − FIo,i = 0.   (C.2)
Let the exact solution be S*_{w,i} and p*_i, i = 1, . . . , N, and let the initial guess be S^(0)_{w,i}
and p^(0)_i. Consider the first iteration of the Cascade method. In line 3 in Figure 3.1,
the pressures are updated to p^(1)_i, but this has no impact on convergence in this model
problem. The saturations Sw,i are updated inside the loop from lines 4 to 8. We show
by induction that at the i-th step of the loop, Sw,j and FOp,j are correct for j < i.
which we can solve for Sw,1 and p1 . Since the exact solution also solves the single-cell
problem, the uniqueness of solutions tells us that
S^(1)_{w,1} = S*_{w,1}   and   p^(1)_1 − p^(1)_2 = p*_1 − p*_2.
Hence the outward flux

FOp,1 = K12 krp(S^(1)_{w,1})(p^(1)_1 − p^(1)_2)/∆x = K12 krp(S*_{w,1})(p*_1 − p*_2)/∆x

is exact as well. This proves the base case. For i > 1, note that the outward fluxes for
j = 1, . . . , i − 1 are assumed to be exact. This means the Cascade solution, and the
exact solution at cell i, both solve the same single-cell problem. Hence, Sw,i = S*_{w,i},
and the outward fluxes will match as well. Thus, the induction step goes through,
and we have Sw,i = S*_{w,i} for all i after one iteration. It follows that during the second
iteration of the Cascade method, in which we solve the linearized problem
" #
δS (2)
J (2)
= −r(2) , (C.4)
δp
we get δS (2) = 0, which means (1) the transmissibility coefficients are exact, and (2)
the fully implicit problem and the IMPES problem have the same pressure solution.
But since the residual function is linear (affine) in pressure, solving (C.4) will yield
the exact pressure, i.e.

p*_i = p^(1)_i + δp^(2)_i.
So at the end of the second iteration, both the saturations and the pressures are
correct, and the Cascade method converges to the solution.
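The single-cell structure exploited in this proof can be seen directly in code: with cells ordered from upstream to downstream, each implicit saturation update involves only that cell's unknown and the already-computed inflow from its upstream neighbor, so one left-to-right sweep of scalar solves handles the whole 1D profile, even at a large CFL number. A sketch under simplifying assumptions (incompressible flow, unit total velocity, a made-up fractional-flow curve):

```python
def implicit_upwind_sweep(S_old, dt, dx, f, inflow):
    """One implicit time step of S_t + f(S)_x = 0 on a 1D grid, solved cell by
    cell from upstream to downstream with scalar bisection:
        (S_i - S_i^old)/dt + (f(S_i) - f_in)/dx = 0.
    """
    S_new = []
    f_in = inflow  # flux entering the first (most upstream) cell
    for S0 in S_old:
        g = lambda s: (s - S0) / dt + (f(s) - f_in) / dx
        lo, hi = 0.0, 1.0
        for _ in range(200):          # bisection: g is increasing in s
            mid = 0.5 * (lo + hi)
            if g(mid) > 0.0:
                hi = mid
            else:
                lo = mid
        S_new.append(0.5 * (lo + hi))
        f_in = f(S_new[-1])           # outflow becomes next cell's inflow
    return S_new

f = lambda s: s * s / (s * s + (1.0 - s) ** 2)   # illustrative fractional flow
S = implicit_upwind_sweep([0.0] * 10, dt=0.5, dx=0.1, f=f, inflow=f(1.0))

# Residual of every cell equation is (near) zero after a single sweep.
res = max(abs((S[i] - 0.0) / 0.5 + (f(S[i]) - (f(1.0) if i == 0 else f(S[i - 1]))) / 0.1)
          for i in range(10))
print(res < 1e-8)  # True
```

Note that dt/dx = 5 here, i.e. the sweep succeeds at a CFL number well above the explicit limit.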
Appendix D
Nonsingularity of Jss
Proposition D.1. Let the relative permeability functions krw and kro be such that
dkrw /dSw ≥ 0 and ∂kro /∂So ≥ 0. Then Jss = ∂Fs /∂S is nonsingular.
Proof. Since Jss is a lower triangular matrix, it suffices to show that none of its
diagonal entries is zero. A typical oil conservation equation for cell i is
Foi = φ Soi ρo(pi)/∆t + Σ_{l adjacent to i} Kil Ho,il (Φoi − Φol) + Fcap,   (D.1)

where

Ho,il = kro(Si) ρo(pi)/µo(pi)   if Φoi ≥ Φol,
Ho,il = kro(Sl) ρo(pl)/µo(pl)   if Φoi < Φol.
The accumulation term φρo (pi )/∆t will always be positive. The sign of the flux term
depends on the upstream direction. If Φoi ≥ Φol , then
∂Ho,il/∂Soi · (Φoi − Φol) = (ρo(pi)/µo(pi)) (∂kro/∂So)(Φoi − Φol) ≥ 0

by assumption. On the other hand, if Φoi < Φol, then Ho,il is independent of Soi, so
the derivative is zero. Thus, the flux derivative will always be non-negative, which
means ∂Foi /∂Soi > 0 for all cells i. The argument for the water equations is similar.
Thus, Jss has a positive diagonal, so it is nonsingular.
Under certain mild conditions (to be specified below), the Stone I and II models
(cf. [6]) can be shown to satisfy ∂kro /∂So ≥ 0, as required by Proposition D.1. Note
that we are only concerned with saturations inside the region
where Swc is the connate water saturation and Som is the minimum oil saturation at
which oil is simultaneously displaced by water and gas. Also note that the derivative
∂kro /∂So is taken along the line Sw = constant, so by the relation Sw + So + Sg = 1,
the criterion ∂kro /∂So ≥ 0 is equivalent to ∂kro /∂Sg ≤ 0, which turns out to be more
natural to show.
Proposition D.2. Assume dkrog /dSg ≤ 0. Then for saturations in D, the Stone I
model satisfies ∂kro/∂Sg ≤ 0 provided ∂Som/∂Sg ≥ −1/2.
Proof. The Stone I model is defined as kro(Sw, Sg) = krocw S*_o βw βg, where

S*_w = (Sw − Swc)/(1 − Swc − Som),   S*_o = (So − Som)/(1 − Swc − Som),
S*_g = Sg/(1 − Swc − Som).
Combining all these relations, we see that kro = U (Sw , Sg , Som )/V (Sw , Sg , Som ), where
so the sign of ∂kro /∂Sg is determined by the quantity within the square brackets.
After some manipulation, we get
since k′rog ≤ 0. In addition, we get

R2 = −Sg (Sw − Swc) [(1 − Sw − Som − Sg) + (1 − Swc − Som)] krow krog ≤ 0.
Hence R1 + R2 · ∂Som/∂Sg ≤ 0 if either ∂Som/∂Sg ≥ 0 or
Thus, in order to ensure that ∂kro/∂Sg ≤ 0, it is sufficient to require either
∂Som/∂Sg ≥ 0 or |∂Som/∂Sg| ≤ 1/2, which is equivalent to requiring ∂Som/∂Sg ≥ −1/2.
Note that if the Fayers and Matthews [35] model for Som is used, we would have

∂Som/∂Sg = −(Sorw − Sorg)/(1 − Swc − Sorg),

so the condition in Proposition D.2 would be satisfied as long as Sorw − Sorg is small,
which is usually the case. In particular, the monotonicity condition is always satisfied
whenever Sorw = Sorg .
Proposition D.3. Assume that dkrg /dSg ≥ 0, dkrog /dSg ≤ 0, and that krw and krow
are convex functions of Sw . Then for all saturations in D, the Stone II model satisfies
∂kro /∂Sg ≤ 0.
The second term is clearly non-positive because k′rog ≤ 0. To show that the first term
is also non-positive, first note that k′rg ≥ 0. Next, define g(Sw) = krw + krow/krocw.
Then g(Swc ) = g(1 − Sorw ) = 1. But since g is convex, it must be that g(Sw ) ≤ 1 for
all Swc ≤ Sw ≤ 1−Sorw . So g(Sw )−1 ≤ 0, which implies the first term is non-positive
as well. Hence, we have shown that ∂kro /∂Sg ≤ 0, as required.
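The hypotheses of Proposition D.3 are easy to test for concrete curves. The sketch below plugs made-up quadratic relative permeabilities (with Swc = Sorw = 0 and krocw = 1, so krw and krow are convex, krg is increasing and krog is decreasing) into a normalized Stone II formula and checks ∂kro/∂Sg by finite differences over the saturation triangle:

```python
# Illustrative curves: krw, krow convex in Sw; krg increasing, krog decreasing in Sg.
krw  = lambda Sw: Sw * Sw
krow = lambda Sw: (1.0 - Sw) ** 2
krg  = lambda Sg: Sg * Sg
krog = lambda Sg: (1.0 - Sg) ** 2

def kro(Sw, Sg):
    # Normalized Stone II with krocw = 1 (a common form; see e.g. Aziz & Settari).
    return (krow(Sw) + krw(Sw)) * (krog(Sg) + krg(Sg)) - (krw(Sw) + krg(Sg))

h = 1e-6
worst = max(
    (kro(Sw, Sg + h) - kro(Sw, Sg)) / h     # forward difference in Sg
    for Sw in [0.05 * i for i in range(20)]
    for Sg in [0.05 * j for j in range(20)]
    if Sw + Sg + h <= 1.0
)
print(worst < 0.0)  # True: kro never increases with Sg on this grid
```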
Appendix E

Properties of Pressure Matrices

This appendix deals with the spectral properties of various combinations of the
pressure matrices Jsp, Jpp and Jtp. These properties are useful in evaluating the relative
importance of various terms that appear in the preconditioned matrices in Chapter 5.
Let G = (V, E) be a connected undirected graph with nodes V and edges E.
Suppose the nodes V can be partitioned into V = V int ∪ V bdy, where V bdy ≠ ∅. (For
our purposes, V int consists of the control volumes in the domain; an edge in E is
either an interface separating two cells, or the face of a boundary cell that is subject
to a pressure boundary condition; V bdy consists of “ghost cells” outside the domain
that are used by the finite volume method to deal with pressure boundary conditions.)
Suppose there exists a function σ : E → [0, ∞) that assigns a non-negative weight
(transmissibility) to each edge in E, and let σij denote the weight assigned to edge
(i, j). Then we can define a |V int| × |V int| matrix M^σ by

M^σ_ij = Σ_{(i,l)∈E} σ_il   if i = j,
M^σ_ij = −σ_ij             if i ≠ j, (i, j) ∈ E,   (E.1)
M^σ_ij = 0                 if i ≠ j, (i, j) ∉ E.
Given two weight functions σ and τ we say σ ≤ τ if σij ≤ τij for all edges (i, j).
The following lemma is a slight modification of a theorem by Ostrowski and Reich
(cf. [78]).
Lemma E.1. Let A = M − N , where A = A∗ , A and M are both nonsingular, and
define Q = M + M ∗ − A. If A is positive definite and Q is positive semi-definite, then
ρ(M −1 N ) ≤ 1, where ρ(·) is the spectral radius. In addition, if Q is positive definite,
then ρ(M −1 N ) < 1.
Proof. Define B = M^{-1}N = I − M^{-1}A. It follows that if Bu = λu, u ≠ 0, then

Au = (1 − λ)Mu,
where λ 6= 1 since A is nonsingular. Taking the inner product of both sides with u
yields
u*Au = (1 − λ) u*Mu   and   u*Au = (1 − λ̄) u*M*u,

the second equality because A = A*. Hence

u*(Q + A)u / u*Au = 1 + u*Qu / u*Au = 2 Re(1/(1 − λ)).

Since A is positive definite and Q is positive semi-definite, we must have
2 Re(1/(1 − λ)) ≥ 1,
Writing λ = α + iβ, this is equivalent to

2(1 − α)/((1 − α)² + β²) ≥ 1,

which rearranges to α² + β² ≤ 1, i.e. |λ| ≤ 1. If Q is positive definite the
inequalities are strict, giving |λ| < 1.
Corollary E.2. Let σ and τ be weight functions on the edges E. If τ > 0 and
0 ≤ σ ≤ τ , then ρ((M τ )−1 M σ ) ≤ 1.
Proof. Let M = M τ and N = −M σ in Lemma E.1. Then A = M τ + M σ and
Q = M τ − M σ , which corresponds to matrices with weights τ + σ > 0 and τ − σ ≥ 0,
so that A is symmetric positive definite and Q is symmetric positive semi-definite.
Thus, we have ρ((M τ )−1 M σ ) ≤ 1 by Lemma E.1, as required.
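Corollary E.2 can be checked numerically on a small graph: build M^τ and M^σ from random weights with 0 ≤ σ ≤ τ on a path of cells whose last cell touches a ghost boundary node, and compute ρ((M^τ)^{-1} M^σ). The graph and weights below are made up:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 8  # interior cells on a 1D path; cell n-1 also touches a ghost boundary node

def weighted_matrix(w_edge, w_bdy):
    """M^sigma per (E.1): incident edge weights summed on the diagonal, minus the
    edge weight off-diagonal; the boundary edge only contributes to the diagonal."""
    M = np.zeros((n, n))
    for i in range(n - 1):
        M[i, i] += w_edge[i]
        M[i + 1, i + 1] += w_edge[i]
        M[i, i + 1] -= w_edge[i]
        M[i + 1, i] -= w_edge[i]
    M[n - 1, n - 1] += w_bdy
    return M

tau_e, tau_b = rng.uniform(1.0, 2.0, n - 1), 1.5                   # tau > 0
sig_e, sig_b = tau_e * rng.uniform(0.0, 1.0, n - 1), tau_b * 0.7   # 0 <= sigma <= tau

M_tau = weighted_matrix(tau_e, tau_b)
M_sig = weighted_matrix(sig_e, sig_b)
rho = max(abs(np.linalg.eigvals(np.linalg.solve(M_tau, M_sig))))
print(rho <= 1.0 + 1e-10)  # True
```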
The above corollary immediately implies ρ(Jsp Jtp^{-1}) ≤ 1 and ρ(Jpp Jtp^{-1}) ≤ 1, since
λw, λo are both bounded above by λT. The corollary also leads to a bound on the
condition number of M σ :
Theorem E.3. Let σ and τ be weight functions on the edges E. If there exist
constants 0 < b ≤ B such that 0 < bτ ≤ σ ≤ Bτ, then

κ₂(M^σ) ≤ (B/b) κ₂(M^τ),
so that

(‖(M^σ)^{-1}‖₂ / ‖R^{-1}‖₂²) · (‖M^σ‖₂ / ‖R‖₂²) ≤ B/b.

Finally, by the symmetry of R we have

‖M^σ‖₂ ‖(M^σ)^{-1}‖₂ ≤ (B/b) ‖M^τ‖₂ ‖(M^τ)^{-1}‖₂,

as required.
The Laplacian of a graph G (cf. [68]), denoted by L(G), is the matrix M τ when
τ ≡ 1. Theorem E.3 can yield useful bounds for Jtp when κ₂(L(G)) is known. For a
Cartesian grid, it is well known that κ₂(L(G)) = O(1/h²); as a result, κ₂(Jtp) is also
O(1/h²), provided the absolute permeability K(x) and total mobility λT(S) satisfy
uniform positive lower and upper bounds.
Bibliography

[1] J. E. Aarnes. On the use of a mixed multiscale finite element method for greater
flexibility and increased speed or improved accuracy in reservoir simulation.
Multiscale Model. Simul., 2:421–439, 2004.
[4] J. R. Appleyard and I. M. Cheshire. The cascade method for accelerated conver-
gence in implicit simulators. In European Petroleum Conference, pp. 113–122,
1982.
[6] K. Aziz and A. Settari. Petroleum Reservoir Simulation. Applied Science Pub-
lishers, New York, 1979.
[7] Z.-Z. Bai, G. H. Golub, and M. K. Ng. Hermitian and skew-Hermitian splitting
methods for non-Hermitian positive definite linear systems. SIAM J. Matrix
Anal. Appl., 24(3):603–626, 2002.
[10] M. Benzi, D. B. Szyld, and A. van Duin. Orderings for incomplete factorization
preconditioning of nonsymmetric problems. SIAM J. Sci. Comput., 20:1652–
1670, 1999.
[11] M. Blunt and B. Rubin. Implicit flux limiting schemes for petroleum reservoir
simulation. J. Comput. Phys., 102(1):194–210, 1992.
[13] Y. Brenier and J. Jaffré. Upstream differencing for multiphase flow in reservoir
simulation. SIAM J. Numer. Anal., 28(3):685–696, 1991.
[16] H. Cao. Development of Techniques for General Purpose Simulators. PhD thesis,
Stanford University, Stanford, CA, June 2002.
[20] K. H. Coats. A note on IMPES and some IMPES-based simulation models. SPE
J., 5(3):245–251, Sept. 2000.
[25] E. F. D’Azevedo, P. A. Forsyth, and W.-P. Tang. Ordering methods for pre-
conditioned conjugate gradient methods applied to unstructured grid problems.
SIAM J. Matrix Anal. Appl., 13(3):944–961, 1992.
[29] P. Deuflhard. Newton Methods for Nonlinear Problems: Affine Invariance and
Adaptive Algorithms. Springer-Verlag, Berlin, 2004.
[38] A. George and J. W. Liu. Computer Solution of Large Sparse Positive Definite
Systems. Prentice-Hall, Englewood Cliffs, NJ, 1981.
[41] A. Harten. High resolution schemes for hyperbolic conservation laws. J. Comput.
Phys., 135:260–278, 1997.
[43] T. Y. Hou and X. H. Wu. A multiscale finite element method for elliptic problems
in composite materials and porous media. J. Comput. Phys., 134:169–189, 1997.
[44] Y. Jiang. Tracer flow modeling and efficient solvers for GPRS. Master’s thesis,
Stanford University, June 2004.
[46] P. D. Lax and B. Wendroff. Systems of conservation laws. Comm. Pure Appl.
Math, 13:217–237, 1960.
[54] A. Meister and C. Vömel. Efficient preconditioning of linear systems arising from
the discretization of hyperbolic conservation laws. Adv. Comp. Math., 14:49–73,
2001.
[56] J. R. Natvig, K.-A. Lie, and B. Eikemo. Fast solvers for flow in porous media
based on discontinuous Galerkin methods and optimal reordering. In Computa-
tional Methods in Water Resources XVI, 2006.
[58] F. M. Orr, Jr. Theory of Gas Injection Processes. Tie-Line Publications, 2007.
[60] S. Osher. Riemann solvers, the entropy condition, and difference approximations.
SIAM J. Numer. Anal., 21:217–235, 1984.
[62] H. S. Price and K. H. Coats. Direct methods in reservoir simulation. Trans. SPE
of AIME, 257:295–308, 1974.
[67] T. F. Russell. Stability analysis and switching criteria for adaptive implicit
methods based on the CFL condition. SPE paper 18416, presented at the SPE
Symposium on Reservoir Simulation in Houston, TX, 1989.
[68] Y. Saad. Iterative Methods for Sparse Linear Systems. SIAM, 2nd edition, 2003.
[71] A. Settari and K. Aziz. Treatment of nonlinear terms in the numerical solution
of partial differential solutions for multiphase flow in porous media. Int. J.
Multiphase Flow, 1:817–844, 1975.
[77] B. van Leer. Upwind and high-resolution methods for compressible flow: from
donor cell to residual-distribution schemes. Commun. Comput. Phys., 1:192–206,
2006.
[79] S. Verma and K. Aziz. Control volume scheme for flexible grids in reservoir
simulation. SPE paper 37999, presented at the SPE Symposium on Reservoir
Simulation in Dallas, TX, 1997.
[83] J. W. Watts. Reservoir simulation: past, present and future. SPE Computer
Applications, 12(4):171–176, 1997.
[84] J. W. Watts III. A conjugate gradient truncated direct method for the iterative
solution of the reservoir simulation pressure equation. SPE J., 21:345–353, 1981.
[85] D. M. Young. Iterative Solution of Large Linear Systems. Academic Press, New
York, 1971.
[86] L. C. Young. A finite-element method for reservoir simulation. SPE J., 21(1):115–
128, 1981.