Kwok Thesis

This dissertation presents scalable algorithms for simulating multiphase flow in porous media, addressing challenges such as mixed hyperbolic-parabolic PDEs and highly nonlinear transport problems. Key contributions include a reduced-order Newton method that improves convergence and a rigorous analysis of phase-based upstream discretization, ensuring stability and monotonicity. The work enhances numerical methods for reservoir engineering, particularly in managing large systems of nonlinear equations.


SCALABLE LINEAR AND NONLINEAR ALGORITHMS

FOR MULTIPHASE FLOW IN POROUS MEDIA

A DISSERTATION
SUBMITTED TO THE PROGRAM IN SCIENTIFIC COMPUTING
AND COMPUTATIONAL MATHEMATICS
AND THE COMMITTEE ON GRADUATE STUDIES
OF STANFORD UNIVERSITY
IN PARTIAL FULFILLMENT OF THE REQUIREMENTS
FOR THE DEGREE OF
DOCTOR OF PHILOSOPHY

Wing Hong Felix Kwok


December 2007
© Copyright by Wing Hong Felix Kwok 2008
All Rights Reserved

I certify that I have read this dissertation and that, in my opinion, it
is fully adequate in scope and quality as a dissertation for the degree
of Doctor of Philosophy.

(Hamdi Tchelepi) Principal Adviser

I certify that I have read this dissertation and that, in my opinion, it
is fully adequate in scope and quality as a dissertation for the degree
of Doctor of Philosophy.

(Khalid Aziz)

I certify that I have read this dissertation and that, in my opinion, it
is fully adequate in scope and quality as a dissertation for the degree
of Doctor of Philosophy.

(Michael Saunders)

Approved for the University Committee on Graduate Studies.

Abstract

The efficient simulation of immiscible fluid displacements in underground porous
media remains an important and challenging problem in reservoir engineering. First,
the governing PDEs exhibit a mixed hyperbolic-parabolic character due to the cou-
pling between the global flow and the local transport of the different phases. The
transport problem is highly nonlinear, leading to the formation of shock fronts and
steep gradients in the saturation profile. In addition, rock properties such as porosity
and permeability are highly heterogeneous, leading to poor numerical conditioning of
the resulting linear systems. Finally, fluid velocities vary greatly across the domain,
with near-well regions experiencing fast flows and some far away regions experiencing
almost no flow at all. Consequently, the use of explicit integrators would entail a
time-step restriction that is much more severe than the global reservoir time scales.
For this reason, implicit time-stepping is the preferred temporal discretization in the
reservoir simulation community, but this requires the solution of a very large system
of nonlinear algebraic equations (often on the order of millions of unknowns) at each
time step.
Our main algorithmic contribution is the ordering of equations and unknowns in
such a way that flow directions are exploited. This leads to improvements in both the
linear and nonlinear solvers. In the nonlinear setting, the ordering leads to a reduced-
order Newton method, which numerical experiments have shown to have a much more
robust convergence behavior than the usual Newton’s method. We also prove, for
1D incompressible two-phase flow, that the reduced Newton method converges for
any time-step size. In the linear solver, ordering improves the convergence of the
Constrained Pressure Residual (CPR) preconditioner and reduces its sensitivity to
flow configurations.
We also present a rigorous analysis of phase-based upstream discretization, which
is different from the classical Godunov and Engquist-Osher schemes for nonlinear
conservation laws. We show, based on a fully nonlinear analysis, that the fully im-
plicit scheme is well-defined, stable, monotonic and converges to the entropy solution
for arbitrary CFL numbers. Thus, unlike the existing linear stability analysis, our
results provide a rigorous justification for the empirical observation that fully-implicit
solutions are always stable and yield monotonic profiles.

Acknowledgement

I would like to express my utmost gratitude towards my advisor, Prof. Hamdi Tchelepi,
not only for his insights and guidance, but also for his patience and encouragement
when things did not go so well. This research would not have been possible without
his constant input and moral support. In addition, I would like to thank Prof. Khalid
Aziz for making many useful suggestions. I am also indebted to Prof. Margot Ger-
ritsen, who introduced me to porous media flow, and to Philipp Birken, who gave
constructive comments on several chapters of this dissertation.
Much appreciation goes to my office mates, Rami Younis and Marc Hesse, for our
fruitful and entertaining conversations. Our interactions were thoroughly enriching
both on a professional and a personal level, and I will really miss your company. I
also thank Yuanlin Jiang and Huanquan Pan for their help with GPRS-related issues.
Many thanks to Prof. Michael Saunders, who agreed to join the reading committee
at the very last minute and did an amazingly thorough job as a reader. Thanks also
to Indira Choudhury, who helped me tremendously in putting my orals committee
back together when it almost fell apart. Finally, I am grateful for the moral support
from my parents, who always believed in me throughout this rather long journey.
This dissertation is dedicated to the memory of Prof. Gene Golub, who passed
away two weeks before my PhD defense. I will always remember him for the depth
of his knowledge on all aspects of scientific computing, as well as his generosity and
genuine interest in the well-being of every student in SCCM/ICME. He is truly an
irreplaceable figure in our community, and he will be sorely missed.
I would like to thank the SUPRI-B reservoir simulation affiliates program for its
financial support for this research.

Contents

Abstract

Acknowledgement

1 Introduction
1.1 Governing equations
1.1.1 General mass-balance equations
1.1.2 Black-oil model
1.2 Numerical simulation of the reservoir
1.2.1 Spatial discretization
1.2.2 Temporal discretization
1.2.3 Solution of nonlinear equations
1.3 Thesis outline

2 Analysis of Upstream Weighting
2.1 Background
2.2 Two model problems
2.2.1 FIM for 1D problem with gravity
2.2.2 SEQ for multidimensional problems
2.3 Existence and uniqueness of solutions
2.3.1 Implicit monotone schemes
2.3.2 Nonlinear Jacobi and Gauss-Seidel process
2.3.3 M-function theory
2.3.4 Convergence of nonlinear Jacobi and Gauss-Seidel
2.3.5 Well-definedness of implicit monotone schemes
2.3.6 Rate of convergence of the nonlinear processes
2.3.7 Extensions
2.3.8 Maximum principle
2.4 Convergence to the entropy solution
2.5 Accuracy of phase-based upstreamed solutions
2.5.1 Refinement under fixed mesh ratio
2.5.2 Spatial refinement for fixed time steps
2.5.3 Non-uniform grids

3 Potential Ordering
3.1 Methods derived from cell-based ordering
3.2 Phase-based ordering
3.2.1 Cocurrent flow
3.2.2 Countercurrent flow due to gravity
3.2.3 Capillarity
3.2.4 Remarks on implementation

4 Reduced Newton Method
4.1 Algorithm description
4.2 Convergence analysis
4.2.1 Convex functions and Newton’s method
4.2.2 The cocurrent case: large ∆t
4.2.3 The general cocurrent case
4.2.4 The countercurrent flow case
4.3 Numerical examples
4.3.1 1D example with gravity
4.3.2 Heterogeneous example with gravity
4.3.3 Large heterogeneous example
4.3.4 1D three-phase example with gravity
4.3.5 2D Heterogeneous three-phase example
4.3.6 3D three-phase example

5 Linear Preconditioning
5.1 Structure of the Jacobian matrix
5.2 CPR preconditioning
5.2.1 True-IMPES reduction
5.2.2 Improved second-stage preconditioner via ordering
5.2.3 Spectrum of the preconditioned matrix
5.2.4 Numerical examples
5.3 Schur complement preconditioning
5.3.1 Spectrum and nonzero pattern
5.3.2 Convergence behavior

6 Conclusions

A Pressure Equation Derivation

B Diagonal Dominance and L1-Accretivity

C Convergence of the Cascade Method

D Nonsingularity of $J_{ss}$

E Properties of Pressure Matrices

Bibliography

List of Tables

2.1 Accuracy of numerical solutions for a fixed CFL number.
2.2 Accuracy of numerical solutions for a fixed time step size.
2.3 Accuracy of numerical solutions for a non-uniform grid.

3.1 Ordering strategies for different black-oil models.

4.1 Convergence history for 1D water floods.
4.2 Convergence history for the upscaled SPE 10 model.
4.3 Summary of runs for the full SPE 10 problem.
4.4 PVT relations for all three-phase examples.
4.5 Relative permeabilities for all three-phase examples.
4.6 Summary of runs for the 1D three-phase example with gravity.
4.7 Summary of runs for the 2D heterogeneous three-phase example.
4.8 Summary of runs for the 3D three-phase example.

5.1 Convergence behavior for the block ILU(0) and CPR preconditioners.
5.2 Performance of CPR-ILU for the quarter 5-spot problem.
5.3 Performance of CPR-ILU for the upscaled SPE 10 problem.
5.4 Convergence of GMRES in the absence of gravity.
5.5 Convergence of GMRES in the presence of gravity.

List of Figures

1.1 Timing report for a typical black-oil simulation run.
1.2 FIM and IMPES saturation profiles for a 2D heterogeneous reservoir.

2.1 Flux functions for 1D incompressible two-phase flow.
2.2 Numerical flux function for a countercurrent flow problem.
2.3 Numerical solutions for a fixed CFL number.
2.4 Numerical solutions for a fixed time step size.
2.5 Numerical solutions on a non-uniform grid.

3.1 One iteration of the Cascade method.

4.1 Algorithm for solving the reduced system (4.1.3).
4.2 Reduced Newton residual functions for various ∆t.
4.3 Modified reduced Newton algorithm.
4.4 Fractional flow $f_w$ for the 1D gravity example.
4.5 Permeability field and well configuration for the upscaled SPE 10 problem.
4.6 Convergence history for the upscaled SPE 10 model.
4.7 Convergence history for the full SPE 10 problem with long time steps.
4.8 Total oil production rate and water cut for the full SPE 10 problem.
4.9 Production curve for the 1D three-phase example.
4.10 Gas saturation at T = 500 days in the 2D heterogeneous three-phase example.
4.11 Reservoir description for the 3D three-phase example.

5.1 Spectra of matrices for the cocurrent flow problem, ∆t = 1, 5.
5.2 Spectra of matrices for the cocurrent flow problem, ∆t = 20, 100.
5.3 Spectra of matrices for the countercurrent flow problem, ∆t = 1, 5.
5.4 Spectra of matrices for the countercurrent flow problem, ∆t = 20, 100.
5.5 Two configurations of the quarter 5-spot problem.
5.6 Nonzero pattern of S2 for the 1D and 2D reservoirs.
5.7 Spectrum and nonzero profiles of S2 for the 1D reservoir.
5.8 Spectrum and nonzero profiles of S2 for the 2D reservoir.
Chapter 1

Introduction

Petroleum reservoir simulation is the use of numerical techniques to solve the equa-
tions for heat and fluid flow in porous media, given the appropriate initial and bound-
ary conditions. Simulation technology has evolved tremendously since the develop-
ment of the first simulator in the 1950s. Due to the explosion of available computing
power and the ever-increasing sophistication of simulation techniques, simulation has
become an indispensable tool in reservoir engineering. Today, nearly all major reservoir
development decisions are based at least partially on simulation results [83].
Despite the growing speed and storage capacities of today’s computers, there is in-
creasing interest and necessity to simulate larger and more complex reservoir models.
As a result, the efficient simulation of miscible and immiscible fluid displacements in
underground porous media remains an important and challenging problem in reservoir
engineering.
There are several hurdles to an efficient, scalable reservoir simulator. First, the
governing PDEs exhibit a mixed hyperbolic-parabolic character due to the coupling
between the flow (pressure and total velocity) and the transport (phase saturations)
problems. In addition, rock properties such as porosity and permeability are highly
heterogeneous, leading to poor numerical conditioning of the resulting linear systems.
Finally, fluid velocities vary greatly across the domain, with near-well regions experi-
encing fast flows and some far away regions experiencing almost no flow at all. These
characteristics impose severe constraints on the numerical methods used in practical
reservoir simulation. In particular, scalable techniques that work well for specific
classes of problems (e.g., algebraic multigrid for elliptic problems [74]) no longer work
well for reservoir simulation problems.
The simplest and most widely used model in reservoir simulation is the standard
black-oil model [6]. In this model, mass transfer between the hydrocarbon liquid
and vapor phases is represented using pressure-dependent solubilities, and the com-
pressibility effects are represented using normalized densities (the so-called formation
volume factors). These simplifying assumptions on fluid properties are used to elimi-
nate the need for equation of state (EOS) and phase equilibrium calculations, which
can take up to 70% of the total simulation time [87, 16]. Thus, despite the increasing
use of compositional models, black-oil simulation still accounts for the vast major-
ity of simulations in industry. Hence, this thesis will concentrate on improving the
efficiency and robustness of black-oil simulation.
The rest of this chapter is organized as follows. In section 1.1, we derive the
PDEs that describe the black-oil model. In section 1.2, we introduce the finite-volume
discretization, as well as the various time-marching schemes that are used to integrate
the PDEs in time. We also describe the most commonly used methods to solve the
resulting system of nonlinear and linear equations. We outline the remainder of the
thesis and state our contributions in section 1.3.

1.1 Governing equations

1.1.1 General mass-balance equations

The governing equations for multiphase flow in porous media are based on the conser-
vation of mass for each component. Here, a component can be either a single chemical
species (e.g., decane, $\mathrm{C_{10}H_{22}}$), or a mixture of components that behave similarly, so
that they can be lumped together into a pseudocomponent. When $n_c$ components
are present, the system of conservation laws has the form

\[
\frac{\partial c_i}{\partial t} + \nabla \cdot \mathbf{F}_i = q_i, \qquad i = 1, \ldots, n_c, \tag{1.1.1}
\]

where $c_i$ is the mass concentration of component $i$, $\mathbf{F}_i$ is the mass flux and $q_i$ is the
source or sink term. Each component can exist in one or more immiscible fluid phases
that flow inside the pore space; typically, we consider either two-phase (aqueous and
liquid hydrocarbon) or three-phase (aqueous, liquid and vapor hydrocarbon) flow
problems. If $X_{ij}$ is the concentration of component $i$ in phase $j$ (mass per unit
volume), then the concentration of component $i$ can be written as

\[
c_i = \phi \sum_{j=1}^{n_p} X_{ij} S_j, \tag{1.1.2}
\]

where $\phi = \phi(\mathbf{x})$ is the porosity of the medium (i.e., the fraction of the bulk volume
that is open to fluid flow), $n_p$ is the number of phases present, and $S_j$ is the saturation
of phase $j$ (i.e., the fraction of the pore volume occupied by phase $j$). The mass flux
$\mathbf{F}_i$ is the sum of the volumetric fluxes of each phase $j$, multiplied by the concentration
$X_{ij}$. In other words,

\[
\mathbf{F}_i = \sum_{j=1}^{n_p} X_{ij} \mathbf{u}_j, \tag{1.1.3}
\]

where $\mathbf{u}_j$ is the volumetric flux vector of phase $j$. The volumetric fluxes are given by the
generalized Darcy’s law:

\[
\mathbf{u}_j = -\frac{k_{rj}}{\mu_j} \mathbf{K} \left( \nabla p_j - \gamma_j \nabla z \right), \tag{1.1.4}
\]

where $\mathbf{K}$ is the absolute permeability tensor and $z$ is the depth variable; for each
phase $j$, $k_{rj} = k_{rj}(S_1, \ldots, S_{n_p})$ is the relative permeability of phase $j$, $\mu_j$ is the
phase viscosity, $p_j$ is the phase pressure, and $\gamma_j$ is the gravitational force acting
on phase $j$. The permeability tensor $\mathbf{K}$ is highly variable over the domain, even
within short distances; it also exhibits complex correlation patterns over a hierarchy
of spatial scales. For simulation purposes, it is generally necessary to assume $\mathbf{K}$ to
be a discontinuous function of $\mathbf{x}$, since it would be impractical (or even impossible)
to simulate on a scale over which $\mathbf{K}$ becomes continuous. This has implications for
the choice of spatial discretization, which is described in section 1.2.1.
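
As a minimal sketch of how (1.1.4) is evaluated at a single point, the following Python function computes the flux of one phase from given gradients. The name `darcy_flux`, its argument names and the numerical values are illustrative assumptions rather than anything specified in the thesis; units are assumed consistent, and the depth coordinate is taken to be the third axis.

```python
def darcy_flux(K, k_r, mu, grad_p, gamma, grad_z):
    """Return u = -(k_r / mu) * K (grad_p - gamma * grad_z) for one phase.

    K is a square nested list (permeability tensor); grad_p and grad_z are
    lists of the same length. All quantities use consistent units.
    """
    # Driving force: pressure gradient corrected for gravity.
    driving = [gp - gamma * gz for gp, gz in zip(grad_p, grad_z)]
    mobility = k_r / mu
    # Matrix-vector product K * driving, scaled by -mobility.
    return [-mobility * sum(row[c] * driving[c] for c in range(len(driving)))
            for row in K]


# Illustrative values only: diagonal tensor, depth along the third axis.
K = [[2.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.5]]
u = darcy_flux(K, k_r=0.5, mu=2.0, grad_p=[-4.0, 0.0, 3.0],
               gamma=1.0, grad_z=[0.0, 0.0, 1.0])

# Hydrostatic equilibrium (grad_p = gamma * grad_z) gives no flow.
u_static = darcy_flux(K, k_r=0.5, mu=2.0, grad_p=[0.0, 0.0, 1.0],
                      gamma=1.0, grad_z=[0.0, 0.0, 1.0])
```

The second call illustrates that the flux vanishes when the pressure gradient exactly balances gravity, which is why the potential gradient, not the pressure gradient alone, sets the flow direction.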

We also have a few algebraic constraints in addition to the above PDEs. Since
the pore space is saturated, we have the constraint

\[
\sum_{j=1}^{n_p} S_j = 1, \tag{1.1.5}
\]

and the phase pressures are related by the capillary pressure constraints:

\[
p_j - p_{j+1} = P_{c_{j,j+1}}(S_1, \ldots, S_{n_p}), \qquad j = 1, \ldots, n_p - 1. \tag{1.1.6}
\]

1.1.2 Black-oil model

Equations (1.1.1), (1.1.5) and (1.1.6) yield $n_c + n_p$ equations, and we have $2n_p$ unknowns
corresponding to the phase pressures and saturations. In a compositional
model, the concentrations $X_{ij}$ are also treated as unknowns, and additional equations
are needed to close the system (cf. [58]). However, for the black-oil model, we
have $n_c = n_p$, and $X_{ij}$ are treated as known functions of $p_j$, so that we have the
same number of equations and unknowns. Specifically, the black-oil assumptions are
as follows:

1. The chemical species are represented by three pseudocomponents: water, oil
and gas, which are aligned with the aqueous, liquid and vapor hydrocarbon
phases respectively;

2. The water component exists only in the aqueous phase, and the oil component
exists only in the liquid hydrocarbon phase;

3. The gas component can exist in both the liquid and vapor hydrocarbon phases,
but gas solubility in the liquid phase is a pure function of $p_g$ (the vapor-phase
pressure).

With these assumptions, the mass-balance equations (1.1.1) take the form

\[
\frac{\partial (\phi \rho_p S_p)}{\partial t} + \nabla \cdot (\rho_p \mathbf{u}_p) = \rho_p q_p \tag{1.1.7}
\]

for $p = o, w$ (liquid and aqueous phases), and

\[
\left[ \frac{\partial (\rho_g \phi S_g)}{\partial t} + \nabla \cdot (\rho_g \mathbf{u}_g) \right]
+ \left[ \frac{\partial (\rho_o \phi S_o R_s)}{\partial t} + \nabla \cdot (\rho_o \mathbf{u}_o R_s) \right] = \rho_g q_g \tag{1.1.8}
\]

for the vapor phase, where $R_s = R_s(p_g)$ is the solubility ratio. The generalized Darcy’s
law (1.1.4), which is valid for $p = w, o, g$, is used to obtain the phase velocities $\mathbf{u}_p$.
In practical simulations, we typically rewrite the PDEs in terms of a set of linearly
independent primary variables (usually $S_w$, $S_g$ and $p_g$, but one can choose any phase
pressure and any $n_p - 1$ saturations), and then use the algebraic relations (1.1.5) and
(1.1.6) to calculate the remaining variables. In addition, it is commonly assumed
that the relative permeabilities $k_{rp}$ and capillary pressures $P_{cpq}$ have the following
dependencies on saturation:

\[
k_{rw} = k_{rw}(S_w), \qquad k_{ro} = k_{ro}(S_w, S_g), \qquad k_{rg} = k_{rg}(S_g); \tag{1.1.9}
\]
\[
p_o - p_w = P_{cow}(S_w), \qquad p_g - p_o = P_{cgo}(S_g). \tag{1.1.10}
\]
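
The recovery of the remaining variables from the primary set $(S_w, S_g, p_g)$ can be sketched as follows. This is only an illustration of relations (1.1.5) and (1.1.10): the function name `secondary_variables` and the two linear capillary pressure curves are hypothetical, chosen merely to be decreasing in $S_w$ and increasing in $S_g$.

```python
def secondary_variables(S_w, S_g, p_g, P_cow, P_cgo):
    """Recover (S_o, p_o, p_w) from the primary variables (S_w, S_g, p_g)."""
    S_o = 1.0 - S_w - S_g      # saturation constraint (1.1.5)
    p_o = p_g - P_cgo(S_g)     # from p_g - p_o = P_cgo(S_g) in (1.1.10)
    p_w = p_o - P_cow(S_w)     # from p_o - p_w = P_cow(S_w) in (1.1.10)
    return S_o, p_o, p_w


# Hypothetical capillary pressure curves (not taken from the thesis).
def P_cow(S_w):
    return 2.0 * (1.0 - S_w)   # decreasing in S_w


def P_cgo(S_g):
    return 1.0 * S_g           # increasing in S_g


S_o, p_o, p_w = secondary_variables(0.25, 0.25, 100.0, P_cow, P_cgo)
```

With these curves, water (the most wetting phase) has the lowest pressure and gas the highest, consistent with the ordering implied by (1.1.10).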

The functions in (1.1.9) and (1.1.10) are all nonlinear with respect to the saturation variables, and
they contribute to the highly nonlinear character of the resulting PDEs [71]. The
parameterization is based on the assumption that water is the most wetting phase
and gas the least wetting phase, which is valid for most reservoirs of interest (see [6]
for more detailed explanations). We also need $P'_{cow} \le 0$ and $P'_{cgo} \ge 0$ for the problem
to be well-posed. The resulting system of PDEs is supplemented with the boundary
conditions

\begin{align}
p_w &= p_{wd} && \text{on } \Gamma_d, \tag{1.1.11} \\
\rho_w \mathbf{u}_w \cdot \boldsymbol{\nu} &= g_{wn} && \text{on } \Gamma_n, \tag{1.1.12} \\
\rho_o \mathbf{u}_o \cdot \boldsymbol{\nu} &= g_{on} && \text{on } \Gamma_n, \tag{1.1.13} \\
\rho_g \mathbf{u}_g \cdot \boldsymbol{\nu} &= g_{gn} && \text{on } \Gamma_n, \tag{1.1.14}
\end{align}

and initial conditions

\[
p_w(\mathbf{x}, 0) = p_{w0}(\mathbf{x}), \qquad S_w(\mathbf{x}, 0) = S_{w0}(\mathbf{x}), \qquad S_g(\mathbf{x}, 0) = S_{g0}(\mathbf{x}), \tag{1.1.15}
\]

where the Dirichlet boundary $\Gamma_d$ has positive measure, and $\boldsymbol{\nu}$ denotes the outward
normal to the boundary.

Incompressible flow (and other simplifications)

In subsequent chapters, we often consider the case of incompressible flow, which
implies the phases have constant densities $\rho_p$. For simplicity, we also restrict our
attention to heterogeneous, but pointwise isotropic permeabilities, i.e., $\mathbf{K} = K\mathbf{I}$,
where $\mathbf{I}$ is the identity tensor. In this case, the conservation equations become

\[
\phi \frac{\partial S_p}{\partial t} - \nabla \cdot \left( \lambda_p K \nabla (p_p - \gamma_p z) \right) = q_p \tag{1.1.16}
\]

for $p = o, w$, and

\[
\left[ \phi \frac{\partial S_g}{\partial t} - \nabla \cdot \left( \lambda_g K \nabla (p_g - \gamma_g z) \right) \right]
+ \left[ \phi \frac{\partial (S_o \bar{R}_s)}{\partial t} - \nabla \cdot \left( \bar{R}_s \lambda_o K \nabla (p_o - \gamma_o z) \right) \right] = q_g \tag{1.1.17}
\]

for the gas phase, where $\lambda_p = k_{rp}/\mu_p$ is the (relative) mobility of phase $p$, and
$\bar{R}_s = \rho_o R_s / \rho_g$ is the normalized solubility ratio. Sometimes we also consider the two-phase
flow case, which is simply the same PDEs with the gas-related equations removed.

Pressure equation

An important equation that can be derived from the mass balance equations and the
saturation constraint is the pressure equation. It can be obtained by taking a special
linear combination of the mass-balance equations (1.1.7), (1.1.8). Assume there are
no source or sink terms and no buoyancy effects, and suppose $P_{cow} = P_{cgo} = 0$, so
that all the phase pressures are identical. Inclusion of such terms would introduce
additional lower order terms, but would not alter the fundamental character of the
PDE. Let us multiply the water equation by $1/\rho_w$, the gas equation by $1/\rho_g$, and the
oil equation by $(1 - \bar{R}_s)/\rho_o$. Assuming that the pressure $p$ is differentiable and that
$\phi$, $\rho_p$ and $R_s$ are smooth functions of pressure, we get (after some algebra):

\[
\phi c_T \frac{\partial p}{\partial t} - \nabla \cdot (\lambda_T K \nabla p) - K \chi_T |\nabla p|^2 = 0, \tag{1.1.18}
\]

where the phase and rock compressibilities (primes denote derivatives with respect to pressure) are

\[
c_w = \frac{\rho_w'}{\rho_w}, \qquad
c_o = \frac{\rho_o'}{\rho_o} + \frac{\rho_o R_s'}{\rho_g}, \qquad
c_g = \frac{\rho_g'}{\rho_g}, \qquad
c_r = \frac{\phi'}{\phi},
\]

and the ‘total’ quantities (denoted with the subscript $T$) are

\begin{align*}
\text{Total compressibility:} \quad & c_T = S_w c_w + S_o c_o + S_g c_g + c_r, \\
\text{Total mobility:} \quad & \lambda_T = \lambda_w + \lambda_o + \lambda_g, \\
\text{Mobility-weighted compressibility:} \quad & \chi_T = \lambda_w c_w + \lambda_o c_o + \lambda_g c_g.
\end{align*}
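
For concreteness, the three total quantities can be evaluated as in the short sketch below. The function name `total_quantities`, the dictionary layout keyed by phase and the numerical values are assumptions made for illustration only.

```python
def total_quantities(S, lam, c, c_r):
    """Return (c_T, lam_T, chi_T) from per-phase dictionaries keyed 'w', 'o', 'g'.

    S: saturations, lam: mobilities, c: phase compressibilities,
    c_r: rock compressibility.
    """
    phases = ("w", "o", "g")
    c_T = sum(S[p] * c[p] for p in phases) + c_r     # total compressibility
    lam_T = sum(lam[p] for p in phases)              # total mobility
    chi_T = sum(lam[p] * c[p] for p in phases)       # mobility-weighted compressibility
    return c_T, lam_T, chi_T


# Illustrative values in consistent units (not data from the thesis).
S = {"w": 0.25, "o": 0.5, "g": 0.25}
lam = {"w": 1.0, "o": 0.5, "g": 4.0}
c = {"w": 0.5, "o": 1.0, "g": 8.0}
c_T, lam_T, chi_T = total_quantities(S, lam, c, c_r=0.25)
```

In this made-up example the gas phase dominates both $c_T$ and $\chi_T$, because it combines the largest compressibility with the largest mobility.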

The full derivation is shown in Appendix A. Equation (1.1.18) is a parabolic PDE in
$p$ with an additional quadratic nonlinear term $K \chi_T |\nabla p|^2$. We must have $c_T > 0$ for
the problem to be well posed. (This criterion has been exploited by Coats in [20] to
derive validity checks for PVT data of isothermal black-oil and compositional fluid
systems.) An analytic solution can be found for the constant-coefficient analog of
(1.1.18):

\begin{align*}
\frac{\partial u}{\partial t} - a \nabla^2 u + b |\nabla u|^2 &= 0, \qquad (\mathbf{x}, t) \in \mathbb{R}^n \times (0, \infty), \\
u(\mathbf{x}, 0) &= g,
\end{align*}

where $a > 0$ and $b$ are constants [32, §4.4]. When $c_T \equiv 0$ (the incompressible case),
(1.1.18) degenerates to an elliptic equation in $p$:

\[
-\nabla \cdot (K \lambda_T \nabla p) = 0. \tag{1.1.19}
\]
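
Returning briefly to the constant-coefficient analog of (1.1.18): one standard way to obtain its analytic solution is the Cole-Hopf change of variables, sketched here for $b \neq 0$ and sufficiently regular initial data $g$. The auxiliary function $w$ is notation introduced only for this illustration. Setting $w = e^{-bu/a}$ and differentiating gives

\[
\frac{\partial w}{\partial t} - a \nabla^2 w
= -\frac{b}{a}\, w \left( \frac{\partial u}{\partial t} - a \nabla^2 u + b |\nabla u|^2 \right) = 0,
\]

so $w$ satisfies the linear heat equation with initial data $w(\mathbf{x}, 0) = e^{-b g(\mathbf{x})/a}$. Solving with the heat kernel and inverting the transformation yields

\[
u(\mathbf{x}, t) = -\frac{a}{b} \log \left( \frac{1}{(4 \pi a t)^{n/2}} \int_{\mathbb{R}^n}
\exp\!\left( -\frac{|\mathbf{x} - \mathbf{y}|^2}{4 a t} - \frac{b}{a}\, g(\mathbf{y}) \right) d\mathbf{y} \right).
\]

The quadratic gradient term is therefore removable by a change of variables in the constant-coefficient setting, which is consistent with viewing (1.1.18) as parabolic in character.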



The pressure equation is important because it dictates the choice of numerical meth-
ods and forms the basis for several widely used methods in reservoir simulation.

1.2 Numerical simulation of the reservoir


In order to simulate fluid flow in the reservoir, the above governing equations need
to be discretized in time and space, and the resulting systems of nonlinear algebraic
equations need to be solved at every time step. A reservoir simulator, which integrates
the governing equations up to a final time $T_{\text{final}}$ based on given initial conditions, will
typically follow these steps during the simulation process:

1. Read input data (model grid geometry, permeability, porosity, fluid properties,
etc.);

2. Initialize reservoir (initial conditions, equilibrium calculations);

3. Set boundary conditions;

4. While $T_{\text{final}}$ not reached:

• Compute an appropriate ∆t;

• Set well locations and production/injection rates for the current time step;

• Form the nonlinear algebraic equations that arise from discretizing the
governing equations;

• Solve the nonlinear system;

• Print results (water cut, saturation profile, etc.) if necessary;

• Increment time;

5. End when $T_{\text{final}}$ is reached.
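
The time loop in step 4 can be sketched in a few lines of Python. This is a toy skeleton under stated assumptions, not the structure of any particular simulator: `run_simulation`, `choose_dt`, `solve_time_step` and `report` are hypothetical names, and the "reservoir" is replaced by the scalar model problem $ds/dt = -s^2$, solved implicitly (backward Euler) with Newton's method at each step.

```python
def run_simulation(state, t_final, choose_dt, solve_time_step, report=None):
    """Advance `state` from t = 0 to t_final, one implicit time step at a time."""
    t = 0.0
    while t < t_final:
        dt = choose_dt(state, t)                # compute an appropriate time step
        last_step = t + dt >= t_final
        if last_step:
            dt = t_final - t                    # do not step past the final time
        state = solve_time_step(state, dt)      # form and solve the nonlinear system
        t = t_final if last_step else t + dt    # increment time
        if report is not None:
            report(t, state)                    # print or store results
    return state


def backward_euler_newton(s_old, dt, tol=1e-12, max_iter=50):
    """Toy nonlinear solve: backward Euler for ds/dt = -s**2 via Newton's method."""
    s = s_old                                   # initial guess: previous time level
    for _ in range(max_iter):
        residual = s - s_old + dt * s * s
        if abs(residual) < tol:
            return s
        s -= residual / (1.0 + 2.0 * dt * s)    # Newton update with exact derivative
    raise RuntimeError("Newton iteration did not converge")


history = []
final_state = run_simulation(
    state=1.0,
    t_final=1.0,
    choose_dt=lambda state, t: 0.25,            # fixed step; real simulators adapt it
    solve_time_step=backward_euler_newton,
    report=lambda t, s: history.append((t, s)),
)
```

Passing the step-size rule and the nonlinear solver as callables keeps the loop itself independent of the discretization, which mirrors the separation between time-step control and the nonlinear solve in the list above.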

A robust general-purpose simulator needs to handle a variety of reservoir specifications
(model sizes, property distributions) and flow configurations. It is this
generality requirement that dictates the choice of numerical methods that are used to
approximate the PDEs. In this section, we provide the background for the remainder
of the thesis by briefly discussing several common discretizations and solvers; for a
broader survey of discretizations that are used in reservoir simulation, we direct the
reader to [34, 52, 83]. A discussion of time-step control and the treatment of wells is
beyond the scope of this thesis, even though these are very important considerations
in building an accurate and useful simulator (see [6] for details).

1.2.1 Spatial discretization


Historically, the majority of reservoir simulators used (and still use) finite volume
methods to discretize the multiphase flow equations. This choice is motivated by
the need for exact local conservation, since shocks will generally be present in the
saturation profile in the incompressible case. When compressibility and capillarity
are present, the analytical solution will no longer contain shocks, but steep gradients
will remain in the saturation profile, and it would be computationally costly to use
a grid that is fine enough to resolve these gradients. The discretized component
mass-balance equations are written in conservation form:

\[
\frac{\partial (\phi_i \rho_p S_p)}{\partial t} + \frac{1}{|V_i|} \sum_{l \in \mathrm{adj}(i)} F_{p,il} = 0, \tag{1.2.1}
\]

where $|V_i|$ is the volume of the $i$-th gridblock, and $F_{p,il}$ is the numerical flux function
of phase $p$ from cell $i$ to cell $l$:

\[
F_{p,il} = -|\partial V_{il}|\, K_{il}\, \rho_{p,il}\, \lambda_p(S_{il})
\left[ \frac{(p_{p,l} - p_{p,i})(\mathbf{x}_l - \mathbf{x}_i)}{|\mathbf{x}_l - \mathbf{x}_i|^2}
- \frac{\gamma_{p,il} (z_l - z_i)}{|\mathbf{x}_l - \mathbf{x}_i|} \right] \cdot \boldsymbol{\nu}_{il}, \tag{1.2.2}
\]

where $|\partial V_{il}|$ is the area of the interface between cells $i$ and $l$, $\mathbf{x}_i$ is the location of
the center of cell $i$, $z_i$ is the component of $\mathbf{x}_i$ along the direction of gravity, and $\boldsymbol{\nu}_{il}$
is the unit normal to the cell interface, pointing from cell $i$ to cell $l$. The above
discretization uses a two-point flux approximation, and we restrict ourselves to the
two-point flux case in this dissertation. One should note, however, that multipoint
flux approximations are also used occasionally in reservoir simulation, especially for
tensorial permeability fields [2, 47, 48].
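
A minimal sketch of the two-point flux (1.2.2) is given below, under the simplifying assumption that the line joining the two cell centers is normal to their shared face, so that the dot product with the unit normal reduces to a division by the center-to-center distance. The function name, the convention that the last coordinate is the depth $z$, and the sample values are all illustrative assumptions; the interface quantities ($K_{il}$, $\rho_{p,il}$, $\lambda_p(S_{il})$, $\gamma_{p,il}$) are taken as given inputs.

```python
import math


def two_point_flux(area, K_face, rho_face, lam_face, gamma_face,
                   p_i, p_l, x_i, x_l):
    """Flux of one phase from cell i to cell l on an orthogonal grid.

    x_i, x_l are coordinate tuples whose last entry is the depth z.
    """
    d = math.dist(x_i, x_l)                 # distance between cell centers
    dz = x_l[-1] - x_i[-1]                  # depth difference z_l - z_i
    potential_drop = (p_l - p_i) - gamma_face * dz
    return -area * K_face * rho_face * lam_face * potential_drop / d


# Horizontal neighbors: flow goes from high pressure (cell i) to low (cell l).
F_il = two_point_flux(2.0, 0.5, 1.0, 4.0, 1.0, 10.0, 6.0, (0.0, 0.0), (2.0, 0.0))
# Same interface seen from cell l: equal magnitude, opposite sign.
F_li = two_point_flux(2.0, 0.5, 1.0, 4.0, 1.0, 6.0, 10.0, (2.0, 0.0), (0.0, 0.0))
# Vertical neighbors in hydrostatic equilibrium: no flow.
F_static = two_point_flux(2.0, 0.5, 1.0, 4.0, 1.5, 10.0, 13.0, (0.0, 0.0), (0.0, 2.0))
```

The antisymmetry `F_il == -F_li` is what makes the scheme locally conservative: whatever leaves cell $i$ through the shared face enters cell $l$.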

The literature on finite volume methods for multiphase flow is vast [87, 79, 16], and
[6] describes the method in detail for various flow configurations. On the other hand,
the use of finite-element methods for general-purpose simulation in industry is rare.
Nevertheless, finite element methods are more flexible in their treatment of unstructured
grids, irregular boundaries, and anisotropic or tensorial permeability fields. As
a result, there is active interest in using finite-element methods to develop finite-
volume discretizations [53]. In this thesis, we restrict our discussion to finite volume
methods, but the reader is referred to [1, 43, 86, 31] for more detailed discussion on
finite element methods.
A peculiar feature of the spatial discretization used in reservoir simulation is the
upstream weighting of saturation-dependent terms. Buoyancy and capillary forces
may induce sonic points in the hyperbolic flux function (see Figure 2.1), but the exact
location of the sonic point is a strong function of the total velocity and permeability, so
it would be inconvenient to locate the sonic point for every cell interface. In practical
simulations, the upstream direction for phase p is determined by the potential gradient
of phase p. Since different phases can have different upstream directions, the resulting
numerical flux functions are in fact a combination of mobilities, each evaluated at a
different saturation. It can be shown [13] that these numerical flux functions are
different from those used in classical CFD, such as the Godunov and Engquist-Osher
schemes. In Chapter 2, we will study this upstream weighting in detail and discuss
its convergence to the analytical solution under grid refinement.
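
The phase-by-phase choice of upstream direction can be illustrated with a small two-phase sketch. Everything named here is an assumption for illustration: the function `upstream_mobilities`, the quadratic mobility curves, and the sample values, which are chosen so that gravity produces countercurrent flow across the interface. For each phase, the sign of the potential difference $(p_l - p_i) - \gamma_p (z_l - z_i)$ decides whether the mobility is evaluated at the saturation of cell $i$ or of cell $l$.

```python
def upstream_mobilities(S_i, S_l, dp, dz, gammas, mobilities):
    """Pick the upstream cell and mobility for each phase across interface (i, l).

    S_i, S_l: water saturations in cells i and l.
    dp = p_l - p_i and dz = z_l - z_i (depth increasing downward).
    gammas, mobilities: dictionaries keyed by phase name.
    """
    result = {}
    for phase, gamma in gammas.items():
        d_potential = dp - gamma * dz       # phase potential difference, l minus i
        if d_potential <= 0.0:
            # Potential drops toward l, so the phase flows from i to l.
            result[phase] = ("i", mobilities[phase](S_i))
        else:
            # Potential drops toward i, so the phase flows from l to i.
            result[phase] = ("l", mobilities[phase](S_l))
    return result


# Hypothetical quadratic mobilities as functions of water saturation.
mobilities = {"w": lambda S: S * S, "o": lambda S: (1.0 - S) ** 2}
gammas = {"w": 2.0, "o": 1.0}               # water is the heavier phase

# Cell l lies below cell i; the pressure difference sits between the two
# hydrostatic gradients, so water sinks while oil rises.
choice = upstream_mobilities(S_i=0.75, S_l=0.25, dp=1.5, dz=1.0,
                             gammas=gammas, mobilities=mobilities)
```

In this example the water mobility is evaluated in cell $i$ and the oil mobility in cell $l$, so the resulting flux combines mobilities taken at two different saturations, as described above.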

1.2.2 Temporal discretization


A variety of temporal discretizations are used in black-oil simulation. The
most common are:

1. Implicit pressure, explicit saturation (IMPES): All saturation-dependent
coefficients in the flux terms are evaluated at the beginning of the time step
($t = t^n$), and pressure-dependent terms are evaluated at the end of the time step
($t = t^{n+1}$). In algorithmic terms, this amounts to (1) solving the pressure
equation (1.1.18) for $p^{n+1}$, (2) computing the phase velocities $u_p$ at
$(S^n, p^{n+1})$, and (3) updating the saturations using the mass-balance equations
(1.1.7), (1.1.8) and a forward difference approximation for $\partial/\partial t$.
Because of the explicit treatment of saturation, IMPES is only conditionally stable;
the CFL condition for a 1D two-phase incompressible oil-water problem without
gravity is given by (cf. [20])
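\[
\Delta t < \frac{\phi}{\dfrac{2K\lambda_w\lambda_o\,|dP_{cow}/dS_w|}{(\lambda_w+\lambda_o)\,\Delta x^2} + \dfrac{v_T\,df_w/dS_w}{\Delta x}}, \tag{1.2.3}
\]
where $v_T$ is the total velocity of the oil and water phases, and $f_w$ is the
fractional flow of the water phase:
\[
f_w = \frac{\lambda_w}{\lambda_w+\lambda_o}\left(1 + \frac{K\lambda_o}{v_T}\,\frac{\partial P_{cow}}{\partial x}\right).
\]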

In the absence of capillarity, (1.2.3) reduces to the familiar CFL condition for
the hyperbolic conservation law

\[
\phi S_t + (v_T f_w(S))_x = 0.
\]

Thus, $\Delta t$ is $O(\Delta x^2)$ when capillarity is present, and $O(\Delta x)$ otherwise.

2. Sequential implicit method (SEQ): The sequential implicit method computes
the new pressure $p^{n+1}$ in exactly the same manner as IMPES, but it updates
the saturations by solving the transport problem with implicit time-stepping
[72, 82]. This amounts to an operator splitting method, in which the flow
problem (resolution of the global pressure field) and the transport problem
(advection of individual phases) are decoupled and solved sequentially. A more
detailed description is given in Section 2.2.2. Since the transport problem is
solved implicitly using a frozen total velocity field vT , SEQ is stable for any
time-step size as long as vT is conservative. However, for compressible flow,
mass is generally not conserved for one of the phases; the mass-balance errors
are proportional to the areal variation of ρo /ρw [6, 21] and can be significant
for large time steps.

3. Adaptive implicit method (AIM): This method changes the level of implicitness
adaptively for each cell, depending on the CFL limit for that cell. For a cell
experiencing fast flows (i.e., the local CFL number is greater than 1), both
the saturation and pressure are taken implicitly; if, on the other hand, the
local CFL number is less than 1, the saturations are taken explicitly, whereas
pressure is taken implicitly. More detailed descriptions and analyses can be
found in [76, 36, 67, 26].

4. Fully implicit method (FIM): Both saturation and pressure variables are taken
implicitly in every cell. A linear stability analysis [6], together with a more
refined analysis based on linearized mobilities [61], strongly indicates (but does
not rigorously prove) that this method is unconditionally stable. However,
it is also generally the most diffusive of the above-mentioned schemes.
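
The CFL bound (1.2.3) for IMPES can be evaluated directly. The following is a minimal Python sketch, not taken from any simulator: the function name, the argument names, and all numerical values are illustrative assumptions, and consistent units are assumed. It checks that halving $\Delta x$ halves the bound when capillarity is absent and quarters it when only the capillary term is present ($v_T = 0$).

```python
def impes_cfl_dt(phi, K, lam_w, lam_o, dPc_dSw, v_T, dfw_dSw, dx):
    """Right-hand side of (1.2.3): upper bound on the IMPES time step."""
    # Capillary (diffusive) contribution, proportional to 1/dx^2.
    capillary = 2.0 * K * lam_w * lam_o * abs(dPc_dSw) / ((lam_w + lam_o) * dx**2)
    # Advective contribution, proportional to 1/dx.
    advective = v_T * dfw_dSw / dx
    return phi / (capillary + advective)


# Made-up coefficients, used only to illustrate the scaling in dx.
coeffs = dict(phi=0.2, K=1.0, lam_w=0.5, lam_o=0.5, dfw_dSw=2.0)

# No capillarity: the bound scales like dx.
adv_coarse = impes_cfl_dt(dPc_dSw=0.0, v_T=1.0, dx=0.1, **coeffs)
adv_fine = impes_cfl_dt(dPc_dSw=0.0, v_T=1.0, dx=0.05, **coeffs)

# Capillary term only: the bound scales like dx^2.
cap_coarse = impes_cfl_dt(dPc_dSw=1.0, v_T=0.0, dx=0.1, **coeffs)
cap_fine = impes_cfl_dt(dPc_dSw=1.0, v_T=0.0, dx=0.05, **coeffs)

print(adv_coarse / adv_fine, cap_coarse / cap_fine)
```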
These methods differ in the level of implicitness of the saturation-dependent quanti-
ties, with IMPES having the least degree of implicitness and FIM having the most.
Note that pressure is treated implicitly in all methods. This is because the pressure
equation is either weakly parabolic (and nearly elliptic) in the compressible case, or
elliptic in the incompressible case. Hence, in the compressible case, explicit pressure
treatment would entail a time-step restriction proportional to $\Delta x^2$, which is
unacceptably severe. In the incompressible case, the pressure equation degenerates into a
constraint that is required to ensure global conservation, which must be satisfied by
the numerical solution. Thus, it is also necessary to treat pressure implicitly in the
incompressible case.
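
The cell-by-cell rule used by AIM (item 3 above) can be sketched in a few lines. This is only an illustration under our own assumptions: the function name and the sample CFL numbers are hypothetical, and a cell whose CFL number equals the threshold exactly is treated as explicit, a case the description above does not specify. Pressure is implicit in every cell regardless of the label.

```python
import numpy as np


def aim_saturation_treatment(cfl, threshold=1.0):
    """Label each cell's saturation treatment from its local CFL number."""
    cfl = np.asarray(cfl, dtype=float)
    # Fast cells (CFL > threshold) take saturation implicitly; the rest explicitly.
    return np.where(cfl > threshold, "implicit", "explicit")


# Hypothetical local CFL numbers for four cells.
labels = aim_saturation_treatment([0.2, 0.9, 3.5, 12.0])
print(labels.tolist())
```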
Clearly, a method with a lower level of implicitness would incur a lower compu-
tational cost per time step. However, the difference in computational cost between
explicit and implicit methods (such as IMPES and FIM) is not as pronounced as
one would expect, since the “explicit” IMPES still needs to solve an implicit pres-
sure equation at every time step. Figure 1.1 shows the amount of time the simulator
spends in each module during a typical black-oil simulation when FIM is used. Even
for FIM, the pressure solve represents almost half of the total running time, and
about 60% of the solver time. So in this case, IMPES would be faster than FIM
only if the FIM time step is chosen such that the maximum CFL number is less than

Total running time: ------------- 518.86 sec ( 100 % )


-Initialization time: --------- 1.16 sec ( 0 % )
-Property Calc time: --------- 13.96 sec ( 3 % )
-Linearization time: --------- 31.93 sec ( 6 % )
-Newton Update time: --------- 58.81 sec ( 11 % )
-Solver running time: --------- 412.15 sec ( 79 % )
--(B)ILU Pre fac time: ----- 61.78 sec ( 12 % )
--(B)ILU Pre slv time: ----- 14.32 sec ( 3 % )
--Pres dcpl time: ---------- 12.16 sec ( 2 % )
--Pres slv time: ----------- 247.26 sec ( 48 % )
-Timestep Calc time: --------- 0.13 sec ( 0 % )
-CFL No. Calc time: --------- 0 sec ( 0 % )
Figure 1.1: Timing report for a typical black-oil simulation run. The above simulation
is performed on a 3D, two-phase heterogeneous model with 141900 grid blocks.

1.67. In practice, reasonable time steps yield maximum CFL numbers that are much
larger than 1 because of the presence of sources and sinks, as well as spatial variations
in permeability and porosity. However, the impact of these high CFL numbers on
overall accuracy is minimal because they only occur in a few cells. Figure 1.2 shows
the saturation profiles for the FIM and IMPES solutions in a 2D water flood prob-
lem. The maximum saturation difference between the two solutions is 0.036, which is
negligible considering the uncertainty in the reservoir characterization. In this case,
FIM takes only 113 time steps to reach Tfinal , whereas IMPES takes 1318 steps, so
FIM is clearly more efficient.
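
The break-even CFL number quoted above can be reproduced from the timing report in Figure 1.1. The sketch below rests on our reading of the argument, which is an assumption: an IMPES step is taken to cost roughly the pressure-solve share of an FIM step's solver time, and IMPES is taken to run at a maximum CFL number of 1. Applying the same cost ratio to the step counts of the 2D example is a further rough assumption, since the timing report comes from a different model.

```python
# Timings (in seconds) from the report in Figure 1.1.
solver_time = 412.15
pressure_solve_time = 247.26

# Assumed relative cost of one IMPES step compared with one FIM step.
impes_step_cost = pressure_solve_time / solver_time

# IMPES wins only if FIM takes fewer than 1/impes_step_cost times fewer steps,
# i.e. only if the FIM maximum CFL number stays below this value.
breakeven_cfl = 1.0 / impes_step_cost

# Step counts reported for the 2D water flood example.
fim_steps, impes_steps = 113, 1318
impes_to_fim_cost = impes_steps * impes_step_cost / fim_steps

print(round(impes_step_cost, 2), round(breakeven_cfl, 2), impes_to_fim_cost)
```

Under these assumptions IMPES would cost several times more than FIM on the water flood example, which is consistent with the conclusion drawn in the text.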
The above example, in which the high CFL numbers do not significantly affect
solution accuracy, is typical among reservoir models of practical interest. Such models
are generally highly heterogeneous with permeability variations up to several orders
of magnitude. Moreover, wells can be completed anywhere in the reservoir model
and can operate in a wide variety of ways, often resulting in CFL limits that are
unacceptably severe. Thus, reservoir simulators typically use implicit time-stepping
for robustness and efficiency. Consequently, efficient linear and nonlinear solvers for
the fully-implicit problem can be the crucial factor in determining the efficiency of
reservoir simulators.

Figure 1.2: A comparison between FIM (top) and IMPES (bottom) saturation profiles
for a 2D heterogeneous reservoir. The permeability and porosity fields are taken from
the 51st layer of the SPE 10 reservoir [19].

Higher-order methods for reservoir simulation have been an active area of research
in recent years. With the exception of streamline methods, which can take advantage
of high-order 1D integrators readily [52], higher-order methods are still primarily in
the development stage and are not yet routinely used in commercial simulators. A
major impediment to the widespread adoption of higher-order methods is the loss
of positivity, which leads to spurious oscillations as the initial profile is integrated
forward in time. An important result due to Bolley and Crouzeix [12] states that
a method that preserves positivity for all ∆t is at most first-order accurate. An
elaborate discussion on higher-order methods is beyond the scope of this thesis; see
[9, 18, 11, 26, 77] for a detailed discussion.

1.2.3 Solution of nonlinear equations

Since all temporal discretizations contain some level of implicitness, the simulator
needs to solve a large system of nonlinear algebraic equations at each time step. The
size and properties of this system, of course, depend on the number and nature of the
implicit variables. For IMPES, the nonlinear system will be N-by-N, where N is the
number of grid blocks (control volumes) in the domain, and the equations will inherit
the parabolic/elliptic nature of the pressure equation. For FIM, on the other hand, we
would have an np N -by-np N system, where np is the number of fluid phases, and the
equations would be of mixed hyperbolic-parabolic type. As a result, the bulk of the
simulation time (80% to 90%, cf. Figure 1.1) is spent on solving these large systems.
It is therefore crucial, for the sake of efficiency and robustness, that the linear and
nonlinear solvers exploit the structure and properties of these discrete equations.

Nonlinear solvers

The most commonly used nonlinear solvers in reservoir simulation are all variations
on the basic Newton method:

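\[
\begin{aligned}
&\text{Solve } J(x^{(\nu)})\,\delta x^{(\nu)} = -R(x^{(\nu)}) \text{ for } \delta x^{(\nu)},\\
&\text{Set } x^{(\nu+1)} = x^{(\nu)} + \delta x^{(\nu)}, \qquad \nu = 0, 1, 2, \ldots,
\end{aligned}
\tag{1.2.4}
\]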

where R(x) is the residual function and J(x) = ∂R/∂x is the Jacobian matrix. New-
ton’s method is popular because of its local quadratic convergence and its general
applicability. For residual functions arising from discretized PDEs, the resulting Ja-
cobian is generally sparse and structured, which means the linear systems can be
solved efficiently. Also, quadratic convergence means Newton’s method is very fast
when good initial guesses are available. For time-dependent problems, a natural
initial guess is the saturation and pressure profiles from the previous time step. As-
suming the profiles vary continuously with time (which is always true for pressure,
and true for saturations away from shock fronts), the old time-step values will be
close to the solution provided ∆t is small enough. However, when the time step is
too large, it is possible for Newton’s method to diverge, since the residual functions
are in general non-convex and possibly non-monotonic (see Figure 2.1). When faced
with non-convergence, the simplest approach is to cut the time-step size and rerun
Newton’s method with the smaller time step. Such time-step cuts are very expensive,
since they mean we must throw away the results of all previous iterations and start
over. Thus, one should avoid time-step cuts as much as possible.
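
The time-step cut strategy can be illustrated on a toy problem (this is not a reservoir model). In the sketch below, backward Euler is applied to the scalar ODE $du/dt = -\arctan(u)$, whose residual is non-convex, and the time step is halved whenever Newton's method fails to converge from the old time-step value. The ODE, the function names, the tolerance, the iteration limit, and the halving factor are all illustrative assumptions.

```python
import math


def newton(residual, jacobian, x0, tol=1e-10, max_iter=20):
    """Basic scalar Newton iteration (1.2.4); returns (x, converged)."""
    x = x0
    for _ in range(max_iter):
        r = residual(x)
        if abs(r) < tol:
            return x, True
        x = x - r / jacobian(x)
    return x, abs(residual(x)) < tol


def implicit_step_with_cuts(u_old, dt, max_cuts=30):
    """Backward Euler step for du/dt = -atan(u), halving dt on Newton failure."""
    cuts = 0
    while cuts <= max_cuts:
        def residual(u, dt=dt):
            return u - u_old + dt * math.atan(u)

        def jacobian(u, dt=dt):
            return 1.0 + dt / (1.0 + u * u)

        # Initial guess: the value from the previous time step.
        u_new, converged = newton(residual, jacobian, u_old)
        if converged:
            return u_new, dt, cuts
        # All iterations of the failed attempt are discarded.
        dt *= 0.5
        cuts += 1
    raise RuntimeError("Newton failed even after repeated time-step cuts")


u_new, dt_used, cuts = implicit_step_with_cuts(u_old=10.0, dt=100.0)
print(u_new, dt_used, cuts)
```

With the requested step of 100, the Newton iterates oscillate between large positive and negative values instead of converging, so at least one cut is needed before the step is accepted.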
One way to avoid time-step cuts is to take small enough time-steps. However, in
practice, one does not want to choose time-step sizes based on the nonlinear solver
for the following reasons:

1. The use of excessively small time steps reduces the benefits of using FIM, since
we would not be taking advantage of its ability to handle long time steps;

2. It is difficult to decide whether Newton’s method would converge for a particular


time-step size without actually performing the iterations;

3. The time-step size should be chosen based on the desired solution accuracy (e.g.,
bounds on numerical diffusion errors) instead of the ability of the nonlinear
solver to converge a time step.

For these reasons, several modifications to the basic Newton’s method have been pro-
posed to ensure global convergence, or at least to enlarge the region of convergence
to the point that the algorithm will converge for all ∆t of practical interest. Glob-
alization techniques for general nonlinear residual functions, such as line search and
trust-region methods, are discussed in [29]. In our experience, line search methods,
in which the search direction is scaled by a single step-length parameter α, are in-
adequate for reservoir simulation problems because (1) the residual norm is sensitive
to diagonal scaling, and the correct scaling for the phase conservation equations is
not obvious in most problems; (2) α is often very small when flow reversal due to
gravity occurs across several cell interfaces; (3) a number of backtracking steps is of-
ten needed to guarantee a sufficient decrease in the residual, and function evaluations
are quite expensive, since each evaluation involves calculating fluid properties and
pressure gradients for every cell in the domain.
Another method, which is implemented in the commercial simulator Eclipse, is
the so-called Appleyard chop [37]. It limits, on a cell-by-cell basis, the allowable
saturation and pressure changes within a nonlinear iteration to a fixed (but empirically
determined) threshold. When the threshold parameters are chosen properly, the
method is quite robust and the number of time-step cuts is often small. However,
because large saturation changes are disallowed, the method can lead to unnecessarily
slow convergence, especially in cases where Newton’s method actually works well (such
as problems with convex fractional flow functions).
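
A simplified per-cell limiter in the spirit of this idea might look as follows. This is only a sketch under our own assumptions: the threshold `max_dS = 0.2`, the function name, and the final clipping to the physical range [0, 1] are hypothetical choices, and the code is not the implementation used in Eclipse. A similar limiter could be applied to pressure changes.

```python
import numpy as np


def limited_saturation_update(S, dS, max_dS=0.2):
    """Apply a Newton update dS to saturations S, limiting each cell's change."""
    # Limit the change in every cell independently (no global step length).
    dS_limited = np.clip(dS, -max_dS, max_dS)
    # Keep the result in the physical range (an extra assumption of this sketch).
    return np.clip(S + dS_limited, 0.0, 1.0)


S = np.array([0.1, 0.5, 0.9])
dS = np.array([0.5, -0.05, 0.3])   # hypothetical raw Newton update
S_next = limited_saturation_update(S, dS)
print(S_next)
```

Only the first and third cells are affected by the limiter here; the small update in the second cell passes through unchanged, which is how the method differs from a line search with a single step length.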
Other methods for solving general nonlinear systems (e.g., continuation methods)
can be found in [29, 59]. Such methods, however, are not used in general-purpose
simulation in industry.

Linear solvers

To solve the linear system (1.2.4), early reservoir simulators [51, 62, 6] used either
direct methods (Gaussian elimination) or stationary iterative methods such as succes-
sive over-relaxation (SOR), alternating direction implicit method (ADI), or Stone’s
strongly implicit procedure (SIP) [73]. With the advent of Krylov accelerators such as
ORTHOMIN [80] and GMRES [69], iterative methods became more popular, and the
need for efficient preconditioning techniques has increased. In addition to precondi-
tioners derived from stationary methods, other preconditioners have been developed
by the reservoir simulation community to handle the linear equations arising from
fully-implicit simulation. Examples include:

1. Incomplete factorization (ILU): Originally developed by Watts [84] to handle
Jacobian matrices from structured grids, this technique has been generalized to
handle other sparse matrices. For a thorough discussion of ILU and its variants,
see [68].

2. Nested factorization: This method was introduced by Appleyard et al. in [5]
and subsequently improved by Appleyard and Cheshire in [3]. It exploits the
band structure in three-dimensional problems to produce an approximate
factorization M = LU, such that the error matrix E = M − A has zero column
sums. In physical terms, this means global mass balance is preserved by the
approximate factors, yielding a better preconditioner than ILU.

3. Constrained pressure residual (CPR): Proposed by Wallis et al. [81], CPR is
a two-stage preconditioner in which the residual vector is constrained to lie in
some subspace V via a projection process. The choice of constraint subspace
determines the effectiveness of the preconditioner. With the emergence of fast
elliptic solvers such as algebraic multigrid [74], CPR has become one of the most
attractive preconditioners for reservoir simulation problems [17].

Behie [8] provides a comparison among the three preconditioners above. In Chapter
5, the spectral properties of CPR-preconditioned Jacobians are discussed in detail.
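
The two-stage structure mentioned in item 3 can be illustrated with a generic multiplicative correction: a restricted solve on a subset of unknowns, followed by a second preconditioner applied to the updated residual. The sketch below is an assumption-laden illustration and is not claimed to be the construction of Wallis et al.: the Galerkin restriction $C^T A C$, the choice of "pressure-like" unknowns, the Jacobi second stage, and the small random test matrix are all hypothetical.

```python
import numpy as np


def two_stage_apply(A, r, C, stage2_solve):
    """Return x = x1 + M2^{-1}(r - A x1), where x1 comes from a restricted solve."""
    # Stage 1: solve the restricted system on the subspace spanned by C.
    A_p = C.T @ A @ C
    x1 = C @ np.linalg.solve(A_p, C.T @ r)
    # Stage 2: apply the second-stage preconditioner to the updated residual.
    return x1 + stage2_solve(r - A @ x1)


rng = np.random.default_rng(0)
n = 6
A = 4.0 * np.eye(n) + rng.uniform(-0.1, 0.1, size=(n, n))   # diagonally dominant
r = rng.standard_normal(n)
C = np.eye(n)[:, [0, 2, 4]]   # hypothetical: unknowns 0, 2, 4 act as pressures

# Second stage: Jacobi (diagonal) scaling, a stand-in for ILU.
x_jacobi = two_stage_apply(A, r, C, lambda v: v / np.diag(A))
# Sanity check: an exact second stage reproduces the exact solution.
x_exact_stage2 = two_stage_apply(A, r, C, lambda v: np.linalg.solve(A, v))

print(np.linalg.norm(r - A @ x_jacobi) / np.linalg.norm(r))
```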

1.3 Thesis outline


In this thesis, we make two contributions to the existing literature on reservoir simu-
lation. On the algorithmic side, we present a new ordering scheme for the equations
and unknowns for the discrete mass-balance equations (1.2.1), (1.2.2). This new or-
dering exploits flow direction information and allows us to derive a more efficient
nonlinear solver as well as an improved linear preconditioner. On the theoretical
side, we present a rigorous nonlinear analysis of phase-based upstream discretization.
We show that the discretization yields a well-defined, stable and monotonic method
that converges to the entropy solution for arbitrary CFL numbers. This complements
the existing literature [6, 61] in which only stability is established using a linear or
linearized stability analysis.
In Chapter 2, we analyze phase-based upstreaming in detail. We show how the
FIM formulation in 1D, as well as SEQ in multiple dimensions, can be cast as a
monotone implicit scheme. We then extend the work of Rheinboldt on M -functions
and Gauss-Seidel iterations [64] to show that the discretized equations always have
a unique solution, which can be found using the nonlinear Gauss-Seidel process. We
also show that the discrete solution converges to the entropy solution under grid
refinement, and we investigate the accuracy of the discrete solutions for different
time-step sizes and spatial grids. This chapter is of a more theoretical nature, and
practitioners of reservoir engineering who are familiar with the discretizations can go
directly to Chapter 3 for a more algorithms-related discussion.
In Chapter 3, we introduce phase-based potential ordering, which reorders the
equations and variables in the nonlinear system in a way that exploits flow direction
information and eventually allows a partial decoupling of the problem into a sequence
of single-cell problems that are easy to solve. This ordering is valid for both two-phase
and three-phase flow, and it can handle countercurrent flow due to gravity and/or
capillarity.
In Chapter 4, we propose a reduced-order Newton algorithm, which makes use of
the phase-based potential ordering in Chapter 3 to reduce the size of the nonlinear
system. The latter is then solved using Newton’s method. We analyze its convergence
behavior for 1D cocurrent problems, and we show a variety of examples (two- and
three-phase flow, with and without gravity) illustrating its effectiveness in dealing
with large, complex heterogeneous problems.
In Chapter 5, we analyze the two-stage CPR preconditioner in detail and propose
an improved second-stage preconditioner that uses a cell-based potential ordering.
This approach reduces the sensitivity of CPR to flow configurations, and this re-
duction in sensitivity is both justified theoretically and observed from numerical ex-
periments. We also experiment with directly preconditioning the Schur complement
problem that arises from the phase-based potential order reduction.
We present our conclusions and outline future directions in Chapter 6.
Chapter 2

Analysis of Upstream Weighting

2.1 Background

As mentioned in Chapter 1, the multiphase flow equations give rise to a system of
$n$ conservation laws (where $n$ is the number of immiscible fluid phases), defined over
$\Omega \subset \mathbb{R}^k$ ($1 \le k \le 3$), each of the form

\[
\frac{\partial(\phi\rho_j S_j)}{\partial t} + \nabla\cdot(\rho_j u_j) = \rho_j q_j, \qquad j = 1, \ldots, n, \tag{2.1.1}
\]

and generalized Darcy’s law

\[
u_j = -K\lambda_j \nabla(p_j - \gamma_j z), \tag{2.1.2}
\]

where φ = φ(x) is the porosity of the medium (with 0 < φ ≤ 1), K = K(x) > 0 is the
absolute permeability, z = z(x) is the depth variable; and for each phase j = 1, . . . , n,
ρj is the density, Sj is the saturation (i.e. the volume fraction occupied by phase j),
uj is the volumetric flux vector, qj is the source or sink term, λj = λj (S1 , . . . , Sn ) is
the phase mobility, pj is the pressure, and γj is the gravitational force. In addition,
we have the algebraic relations:

Saturation constraint:
\[
\sum_j S_j = 1, \tag{2.1.3}
\]
Capillary pressure constraint:
\[
p_j - p_{j+1} = P_{cj}(S_1, \ldots, S_n), \qquad j = 1, \ldots, n-1. \tag{2.1.4}
\]

The above system of PDEs exhibits a mixed hyperbolic-parabolic character, which
becomes apparent when we consider the various limiting cases. If we assume constant
densities and neglect capillary pressure relations (i.e. we assume $p_1 = \cdots = p_n \equiv p$),
then we can sum (2.1.1) over $j = 1, \ldots, n$ and invoke the saturation constraint to get

\[
-\nabla\cdot\Big(K\lambda_T\nabla p - K\nabla z \sum_j \gamma_j\lambda_j\Big) = \sum_j q_j, \tag{2.1.5}
\]

where $\lambda_T = \sum_j \lambda_j$ is the total mobility. Thus, for a given saturation distribution,
the pressure field satisfies an elliptic PDE. On the other hand, when the total
velocity $u_T = \sum_j u_j$ is constant over the domain (which is the case for flow in a
one-dimensional porous medium), we can rewrite $u_j$ as

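\[
u_j = \frac{\lambda_j}{\lambda_T}\Big(u_T - K\nabla z \sum_l \lambda_l(\gamma_l - \gamma_j)\Big), \tag{2.1.6}
\]

which is a function of the saturations $S_1, \ldots, S_n$ only. Thus, if we substitute (2.1.6)
into (2.1.1), we get

\[
\phi\,\frac{\partial S_j}{\partial t} + \nabla\cdot u_j(x, S_1, \ldots, S_n) = 0, \qquad j = 1, \ldots, n-1. \tag{2.1.7}
\]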

This means saturation behaves like the solution to a system of first-order hyperbolic
PDEs, so one should expect discontinuous saturation profiles. In higher dimensions,
there is generally a strong coupling between pressure and saturation, due to the sat-
uration dependence of λj and λT in (2.1.5) and the dependence of uT on the pressure
field in (2.1.6). In addition, the porosity φ and permeability K are highly oscillatory,
non-smooth functions of x, and K(x) can vary by several orders of magnitude over
the domain Ω. The large variability of φ and K leads to local CFL limits that are
unacceptably severe when explicit schemes are used. As a result, the discretization
of choice for most reservoir simulators is the fully-implicit method (FIM), which uses
finite volume in space and backward Euler in time. The numerical flux functions,
which approximate the uj as defined in (2.1.2), use a two-point finite difference to
approximate ∇p and phase-based upstream weighting to approximate λj (S). In other
words, to approximate uj at the interface of cells a and b (centered at xa and xb ), we
evaluate $\lambda_j(S)$ at
\[
S = \begin{cases}
S(x_a) & \text{if } -\nabla(p_j - \gamma_j z)\cdot\nu_{ab} \ge 0,\\
S(x_b) & \text{otherwise,}
\end{cases}
\tag{2.1.8}
\]

where νab is the unit vector normal to the interface, pointing from a to b. The
resulting numerical flux functions are different from those used in classical CFD,
such as the Godunov and Engquist-Osher schemes [13]. Despite being only first-
order accurate, phase-based upstreaming is the preferred upwind method in reservoir
simulation because it is physically intuitive, and because it is generally easier to verify
a consistency condition such as (2.1.8) than to identify potential sonic points, which
vary over the domain and are strong functions of permeability and total velocity. This
is especially true for the fully-implicit method because the total velocity at time tn+1
is usually unknown.
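
The upstream rule (2.1.8) can be sketched as follows. The quadratic mobility curves, the function names, and the sample values are illustrative assumptions; `potential_drop` stands for a two-point approximation of $-\nabla(p_j - \gamma_j z)\cdot\nu_{ab}$ for the phase in question. In the example, the two phases see potential drops of opposite sign, so their mobilities are evaluated in different cells.

```python
def lam_w(S_w):
    """Assumed water mobility (quadratic in water saturation)."""
    return S_w**2


def lam_o(S_w):
    """Assumed oil mobility, written as a function of water saturation."""
    return (1.0 - S_w)**2


def upstream_mobility(lam, S_a, S_b, potential_drop):
    """Evaluate lam at the upstream cell of interface a|b, following (2.1.8)."""
    # Flow from a to b (non-negative drop) uses cell a; otherwise cell b.
    return lam(S_a) if potential_drop >= 0.0 else lam(S_b)


S_a, S_b = 0.8, 0.3   # water saturations in cells a and b

# Water flows from a to b, while oil flows from b to a.
lam_w_face = upstream_mobility(lam_w, S_a, S_b, potential_drop=1.5)
lam_o_face = upstream_mobility(lam_o, S_a, S_b, potential_drop=-0.4)

print(lam_w_face, lam_o_face)
```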

Note that in (2.1.8) it is possible for −∇(pj − γj z) · νab to have different signs
for different j, meaning the upstream directions can be different for different phases
when buoyancy forces are significant; this is known as countercurrent flow in reservoir
engineering literature. In one-dimensional porous media, countercurrent flow mani-
fests itself through the presence of sonic points in the flux function uj ; thus, the flux
function for a countercurrent flow problem would typically look like the one shown
in Figure 2.1(b), whereas without countercurrent flow it would look more like Figure
2.1(a). A detailed treatment of phase-based upstreaming is given in [13], in which
the authors showed that, when explicit time-stepping is used on a two-phase flow
problem, phase-based upstreaming leads to a monotone difference scheme, as long as
the appropriate CFL condition is satisfied. This in turn implies that the solutions of

[Plot panels (a) and (b): Fw versus Sw.]
Figure 2.1: Flux functions for 1D incompressible two-phase flow: (a) Co-current flow
(no buoyancy effects), (b) Countercurrent flow due to gravity.

the explicit schemes converge to the entropy solution of the two-phase equations

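\[
\frac{\partial S}{\partial t} + \frac{\partial f(S)}{\partial x} = 0, \tag{2.1.9}
\]
\[
f(S) = u_1 = \frac{\lambda_1}{\lambda_1 + \lambda_2}\left(u_T + K\lambda_2(\gamma_1 - \gamma_2)\frac{\partial z}{\partial x}\right) \tag{2.1.10}
\]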

as ∆t, ∆x → 0 while satisfying the CFL condition. The goal of this chapter is to
extend this result to the fully-implicit case. This leads us to study the more general
problem of implicit monotone schemes, which would then include the multiphase flow
problem as a special case.
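
The qualitative difference between the two flux functions sketched in Figure 2.1 can be reproduced from (2.1.10). The code below is an illustration under assumed data: quadratic mobilities $\lambda_1 = S^2$ and $\lambda_2 = (1-S)^2$, unit total velocity, and a lumped gravity number $G = K(\gamma_1 - \gamma_2)\,\partial z/\partial x$ whose value of 5 is made up. It does not use the parameters behind the actual figure.

```python
import numpy as np


def flux(S, u_T=1.0, G=0.0):
    """f(S) from (2.1.10) with assumed mobilities lam1 = S^2, lam2 = (1-S)^2."""
    lam1 = S**2
    lam2 = (1.0 - S)**2
    return lam1 / (lam1 + lam2) * (u_T + G * lam2)


S = np.linspace(0.0, 1.0, 201)
f_cocurrent = flux(S)              # no buoyancy: monotone, S-shaped
f_countercurrent = flux(S, G=5.0)  # with gravity: overshoots u_T, then decreases

# The location of the maximum is where the flux stops increasing.
S_at_max = S[np.argmax(f_countercurrent)]
print(f_cocurrent.max(), f_countercurrent.max(), S_at_max)
```

Without gravity the flux increases monotonically from 0 to $u_T$; with gravity it exceeds $u_T$ at intermediate saturations and has an interior maximum, which is the sonic point discussed above.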
The use of implicit time-stepping leads to a (typically large) system of nonlinear
algebraic equations that must be solved for each time step. Moreover, the residual
functions are generally non-differentiable because of upstreaming criteria of the form
(2.1.8); thus, the existence of a unique solution to these systems of equations is
not immediately obvious. For implicit monotone schemes for 1D scalar conservation
laws, Lucier [50] showed that a solution to the discrete problem exists and is unique
whenever the initial data is bounded and has bounded total variation. The proof
of existence, which relies heavily on Crandall-Liggett theory [23], proceeds along the
following lines (see [27, Chapter 3] for more details). First, one shows that the residual
function R for the numerical scheme defines an m-accretive operator in the $L^1$-norm.
Then by the Crandall-Liggett theorem, the ODE

\[
\frac{du}{dt} = -Ru, \qquad u(0) = x \tag{2.1.11}
\]

has a unique solution for t ∈ [0, ∞) for any initial point x. Let u(t; x) denote the
solution of (2.1.11) with starting point x. Then one shows that the Poincaré operator
Pω , which maps the point x to the point u(ω; x), is strictly contractive. Then by
Banach’s fixed point theorem, Pω has a unique fixed point x0 . One then proceeds to
prove that u(t; x0 ) = x0 for all 0 ≤ t ≤ ω; thus, du/dt = 0, which implies Rx0 = 0.
While this argument does prove the existence and uniqueness of a solution to
the discretized problem, the proof does not suggest a practical algorithm for finding
the solution. In section 3, we present an alternate constructive proof of existence by
showing that the classical Gauss-Seidel and Jacobi iterations converge for this class of
problems. In fact, we show that the iterative methods converge whenever the initial
data for the discrete problem is bounded, so the implicit scheme is well-defined even
when the initial data does not have bounded variation in R. The well-definedness of
the numerical scheme, together with the total variation diminishing (TVD) property
and the existence of a discrete entropy inequality, imply that the numerical scheme
converges to the entropy solution as the mesh is refined (i.e., as ∆x → 0). This result
holds for any mesh ratio λ = ∆t/∆x (i.e., for any Courant number).
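
As a simple illustration of an implicit monotone scheme at a large Courant number, consider backward Euler with single-point upstream weighting, $S_i - S_i^{old} + \lambda\,(f(S_i) - f(S_{i-1})) = 0$. In this special co-current case each equation involves only the cell itself and its left neighbor, so a single Gauss-Seidel sweep in the flow direction solves the system, one scalar equation at a time. The sketch below makes several assumptions of its own: unit porosity, the flux $f(S) = S^2/(S^2 + (1-S)^2)$, bisection as the scalar solver, and hypothetical function names.

```python
import numpy as np


def f(S):
    """Assumed co-current fractional flow function (monotone, f(0)=0, f(1)=1)."""
    return S**2 / (S**2 + (1.0 - S)**2)


def solve_cell(rhs, lam, tol=1e-12):
    """Solve S + lam*f(S) = rhs on [0, 1] by bisection (left side is increasing)."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mid + lam * f(mid) < rhs:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def implicit_upwind_step(S_old, S_inflow, lam):
    """One backward Euler step, sweeping cell by cell from upstream to downstream."""
    S_new = np.empty_like(S_old)
    S_up = S_inflow
    for i in range(len(S_old)):
        rhs = S_old[i] + lam * f(S_up)
        S_new[i] = solve_cell(rhs, lam)
        S_up = S_new[i]
    return S_new


S_old = np.zeros(50)
S_new = implicit_upwind_step(S_old, S_inflow=1.0, lam=5.0)   # Courant number >> 1
print(S_new[:5])
```

The computed profile stays between 0 and 1 and decreases monotonically away from the inflow, even though the mesh ratio is well above any explicit stability limit.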

2.2 Two model problems


In this section, we present two model problems from porous media flow, both of which
contain a hyperbolic subproblem that can be analyzed using the theory developed in
this chapter.

2.2.1 FIM for 1D problem with gravity


Consider a one-dimensional model problem with:

• incompressible two-phase flow,

• zero capillarity (pw = po ≡ p),

• an injection boundary condition on the left, and

• a pressure boundary condition on the right.

In this case, the continuous problem (2.1.1)–(2.1.2) can be rewritten as

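\[
\phi(x)\frac{\partial S_p(x)}{\partial t} + \frac{\partial u_p(x)}{\partial x} = 0, \qquad x_L < x < x_R, \tag{2.2.1}
\]
\[
u_p(x) = -K(x)\lambda_p(S_w(x))\left(\frac{dp}{dx} - \gamma_p\frac{dz}{dx}\right), \tag{2.2.2}
\]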

for p = w, o (water and oil), together with the saturation constraint So + Sw = 1, the
initial condition $S_w(x, 0) = S^0(x)$ for $x \in [x_L, x_R]$, and boundary conditions

up (xL ) = qp,L , p(xR ) = pR .

We assume that the injection velocities qw,L and qo,L are non-negative, and that the
total velocity qT,L := qw,L +qo,L is strictly positive. (These assumptions cover the most
interesting cases, such as oil recovery by water-flooding.) This formulation, which
contains pressure variables, is known as the parabolic form of the problem, since
it represents the incompressible limit of a parabolic problem. We can also derive
the hyperbolic or “fractional flow” form of the problem by eliminating the pressure
variables as follows. The discretized PDEs can be written as
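\[
\frac{\phi_i\,(S_{w,i} - S_{w,i}^{\mathrm{old}})}{\Delta t} + \frac{F_{w,i+1/2} - F_{w,i-1/2}}{\Delta x} = 0, \tag{2.2.3a}
\]
\[
\frac{\phi_i\,(S_{o,i} - S_{o,i}^{\mathrm{old}})}{\Delta t} + \frac{F_{o,i+1/2} - F_{o,i-1/2}}{\Delta x} = 0, \tag{2.2.3b}
\]

where
\[
F_{p,i+1/2} = K_{i+1/2}\lambda_{p,i+1/2}\left(\frac{p_i - p_{i+1}}{\Delta x} + g_p\right), \qquad p = o, w, \tag{2.2.4}
\]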
with gp = γp ∆z/∆x, i = 1, . . . , N . The numerical boundary conditions become

\[
F_{p,1/2} = q_p, \qquad p = o, w, \tag{2.2.5}
\]
\[
p_{N+1} = 2p_R - p_N. \tag{2.2.6}
\]

For the remainder of this section, we assume without loss of generality that gw ≥ go ;
in the case of gw < go , the same argument would hold by considering the oil phase
instead of the water phase. To eliminate the pressure variables pi , first note that
summing equations (2.2.3a) and (2.2.3b) and rearranging gives

Fw,i+1/2 + Fo,i+1/2 = Fw,i−1/2 + Fo,i−1/2 = qw + qo =: qT .

In other words, the total flux is constant across any interface, and this flux is denoted
by $q_T$, which is equal to $q_{T,L}$. Summing Equation (2.2.4) over $p = o, w$, we can
express the pressure gradient $(p_i - p_{i+1})/\Delta x$ in terms of $q_T$:

\[
q_T = K_{i+1/2}\left(\lambda_{T,i+1/2}\,\frac{p_i - p_{i+1}}{\Delta x} + (\lambda_{w,i+1/2}\,g_w + \lambda_{o,i+1/2}\,g_o)\right),
\]

where $\lambda_{T,i+1/2} = \lambda_{w,i+1/2} + \lambda_{o,i+1/2}$. Thus,

\[
\frac{p_i - p_{i+1}}{\Delta x} = \frac{q_T - K_{i+1/2}(\lambda_{w,i+1/2}\,g_w + \lambda_{o,i+1/2}\,g_o)}{K_{i+1/2}(\lambda_{w,i+1/2} + \lambda_{o,i+1/2})}. \tag{2.2.7}
\]

Substituting into (2.2.4) for the water phase gives

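\[
F_{w,i+1/2} = \frac{\lambda_{w,i+1/2}}{\lambda_{T,i+1/2}}\left(q_T + K_{i+1/2}\lambda_{o,i+1/2}\,\Delta g\right) = F_{w,i+1/2}(S_{w,i}, S_{w,i+1}), \tag{2.2.8}
\]

where $\Delta g = g_w - g_o \ge 0$. This, together with (2.2.3a):

\[
\phi_i\,(S_{w,i} - S_{w,i}^{\mathrm{old}}) + \frac{\Delta t}{\Delta x}\,(F_{w,i+1/2} - F_{w,i-1/2}) = 0, \tag{2.2.9}
\]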
leads to a numerical scheme with exactly the same form as (2.3.1), except for the
boundary conditions. Clearly, the treatment of boundary conditions will significantly
affect the stability and accuracy of the numerical scheme. However, in order to
understand the behavior of the numerical scheme at interior points, we will replace
the initial-boundary value problem (2.2.1) with an initial value problem on an infinite
domain with appropriate initial conditions. In particular, we replace the injection
boundary condition with

\[
S^0(x) = f^{-1}(q_{w,L}/q_{T,L}), \qquad x < x_L, \tag{2.2.10}
\]

and the pressure boundary condition with

\[
S^0(x) = S^0(x_R), \qquad x > x_R. \tag{2.2.11}
\]

The modified continuous problem will yield a solution identical to (2.2.1) for
$0 < t < T_{BT}$, where $T_{BT}$ is the breakthrough time (i.e. the time at which the shock wave
arrives at the pressure boundary). Note that since $f$ is one-to-one over the interval
$I = \{S : 0 \le f(S) < 1\}$ (see Figure 2.1), and since $q_{w,L} \le q_{T,L}$ by assumption,
(2.2.10) is well-defined unless $q_{o,L} = 0$. (If $q_{o,L} = 0$, we define $S^0(x) = \inf f^{-1}(1)$,
where $f^{-1}$ denotes the inverse image.)
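
The elimination of pressure that leads from (2.2.4) to (2.2.8) can be checked numerically at a single interface. In the sketch below, the function name and all input values are arbitrary assumptions chosen for illustration; the code recovers the pressure gradient from (2.2.7), evaluates both phase fluxes from the parabolic form (2.2.4), and compares the water flux with the fractional flow form (2.2.8).

```python
def interface_fluxes(K, lam_w, lam_o, g_w, g_o, q_T):
    """Return (F_w, F_o) from (2.2.4) and F_w from (2.2.8) at one interface."""
    lam_T = lam_w + lam_o
    # Pressure gradient (p_i - p_{i+1})/dx recovered from the total flux, (2.2.7).
    dp_dx = (q_T - K * (lam_w * g_w + lam_o * g_o)) / (K * lam_T)
    # Parabolic form (2.2.4) for each phase.
    F_w_parabolic = K * lam_w * (dp_dx + g_w)
    F_o_parabolic = K * lam_o * (dp_dx + g_o)
    # Hyperbolic (fractional flow) form (2.2.8) for water.
    F_w_hyperbolic = lam_w / lam_T * (q_T + K * lam_o * (g_w - g_o))
    return F_w_parabolic, F_o_parabolic, F_w_hyperbolic


# Arbitrary illustrative values with g_w >= g_o.
F_w_par, F_o_par, F_w_hyp = interface_fluxes(
    K=2.0, lam_w=0.3, lam_o=0.5, g_w=1.2, g_o=0.7, q_T=1.5
)
print(F_w_par, F_w_hyp, F_w_par + F_o_par)
```

The two expressions for the water flux agree, and the phase fluxes sum to $q_T$, as the derivation requires.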

Phase-based upstreaming

Recall from section 2.1 (cf. Equation (2.1.8)) that the mobilities λp,i+1/2 are evaluated
using the upstream saturations with respect to the flow direction of phase p:

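\[
\lambda_{p,i+1/2} = \begin{cases}
\lambda_p(S_i) & \text{if } \dfrac{1}{\Delta x}(p_i - p_{i+1}) + g_p \ge 0,\\[4pt]
\lambda_p(S_{i+1}) & \text{otherwise.}
\end{cases}
\tag{2.2.12}
\]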

In light of (2.2.7), we can rewrite the upstream conditions as



\[
\lambda_{p,i+1/2} = \begin{cases}
\lambda_p(S_i) & \text{if } q_T + K_{i+1/2}(g_p - g_q)\lambda_{q,i+1/2} \ge 0,\\
\lambda_p(S_{i+1}) & \text{otherwise,}
\end{cases}
\tag{2.2.13}
\]
where the subscript q denotes the phase other than phase p. Even though pressure
dependence has been eliminated, Equation (2.2.13) still does not explicitly define the
upstream direction for λp , since the definition of upstream is in terms of the (yet
undetermined) mobility of the other phase λq,i+1/2 . For explicit numerical schemes,
Brenier and Jaffré have shown in [13] how to explicitly determine the upstream
direction for each phase for a given saturation profile $\{S_i^n\}$. In the special case of
two-phase flow, they define the following quantities:

\[
\theta_{o,i+1/2} = q_T - K_{i+1/2}\,\Delta g\,\lambda_w(S_i^n),
\qquad
\theta_{w,i+1/2} = q_T + K_{i+1/2}\,\Delta g\,\lambda_o(S_{i+1}^n).
\]

These quantities correspond precisely to the condition in (2.2.13), but the condition
is evaluated at $S_i^n$ for $\theta_o$ and $S_{i+1}^n$ for $\theta_w$. Clearly $\theta_{w,i+1/2} > 0$, since
$\Delta g \ge 0$. The correct upstream directions are then given by

\[
\begin{aligned}
\lambda^n_{o,i+1/2} &= \lambda_o(S_i^n), & \lambda^n_{w,i+1/2} &= \lambda_w(S_i^n) && \text{if } 0 \le \theta_{o,i+1/2} \le \theta_{w,i+1/2},\\
\lambda^n_{o,i+1/2} &= \lambda_o(S_{i+1}^n), & \lambda^n_{w,i+1/2} &= \lambda_w(S_i^n) && \text{if } \theta_{o,i+1/2} \le 0 \le \theta_{w,i+1/2}.
\end{aligned}
\]

Thus, for an explicit time-marching scheme, the numerical fluxes are completely
defined by these conditions, and there is no need to go back to the original definition
(2.2.12) involving unknown pressure values. However, this is not the case for an im-
plicit time-marching scheme (such as backward Euler), since the upstream directions
must be consistent with the saturation values at the end of the time step, i.e. with
the saturation profile $\{S_i^{n+1}\}$. Because of this consistency requirement, it is not clear
a priori that a solution to the parabolic form of the problem (2.2.3) even exists. Our
approach to proving that a solution exists is to rely on the hyperbolic form (2.2.8)–(2.2.11).
From the above derivation, it is evident that if $\{(S_i, p_i)\}_{i=1}^N$ is any solution to
the parabolic form (2.2.3)–(2.2.6), then $\{S_i\}_{i=1}^N$ must be a solution to the hyperbolic
problem. Thus, the key idea is to begin by finding the correct saturation profile $\{S_i\}$
via (2.2.8)–(2.2.11), with a numerical flux that automatically ensures consistency with
the upstream directions; once the {Si } are known, we can easily solve for the pressure
part because the pressure equation is linear. We distinguish two cases:

1. If $K_{i+1/2}\,\Delta g\,\lambda_{w,\max} \le q_T$, then $\theta_{o,i+1/2} \ge 0$ always, so we revert to a single-point
upstream scheme $F_{i+1/2} = F_{i+1/2}(S_i)$;

2. If $K_{i+1/2}\,\Delta g\,\lambda_{w,\max} > q_T$, then by the monotonicity of $\lambda_w(S)$, there exists a
unique $0 < S_c < 1$ such that

\[
K_{i+1/2}\,\Delta g\,\lambda_w(S_c) = q_T.
\]

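Then the numerical flux, which is to be evaluated at time $t^{n+1}$, is defined as

\[
F_{w,i+1/2}(S_i, S_{i+1}) =
\begin{cases}
\dfrac{\lambda_w(S_i)\,\bigl[q_T + K_{i+1/2}\,\lambda_o(S_i)\,\Delta g\bigr]}{\lambda_w(S_i) + \lambda_o(S_i)} & \text{if } 0 \le S_i \le S_c, \\[2ex]
\dfrac{\lambda_w(S_i)\,\bigl[q_T + K_{i+1/2}\,\lambda_o(S_{i+1})\,\Delta g\bigr]}{\lambda_w(S_i) + \lambda_o(S_{i+1})} & \text{if } S_c < S_i \le 1.
\end{cases}
\tag{2.2.14}
\]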

A plot of the numerical flux Fw (u, v) in the latter case is shown in Figure 2.2. The
black curve on the surface, which shows the value of F (u, v) along the line u = v,
is identical to the continuous flux function in Figure 2.1(b). Thus, it is evident that
the numerical flux satisfies the consistency condition F (u, u) = f (u). Even though
f (u) itself is non-monotonic, the plot clearly shows that F (u, v) is an increasing
function of u and a decreasing function of v. This monotonicity property is what
makes upstream weighting amenable to a Gauss-Seidel type analysis. Also notice
that the numerical flux is independent of the downstream saturation v inside the
cocurrent region (0 ≤ u ≤ Sc ≈ 0.27), but becomes a function of both variables when
u > Sc . Finally, F (u, v) is Lipschitz continuous, but non-differentiable along the line
u = Sc because of the upstream condition (2.2.14). The following theorem, which
summarizes several results by Brenier and Jaffré [13], shows that upstream-weighted
fluxes generally satisfy the monotonicity property.

Theorem 2.1. Assume that the mobility of phase p is increasing with the saturation of
the same phase and decreasing with the saturation of the other phase, for p = o, w (oil
and water). Then the numerical fluxes obtained from phase-based upstreaming defined
by (2.2.8), (2.2.13) are (1) Lipschitz continuous, (2) consistent with the continuous

flux function (i.e., F (u, u) = f (u)), (3) non-decreasing with respect to Sw,i , and (4)
non-increasing with respect to Sw,i+1 .
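The consistency and monotonicity properties stated in Theorem 2.1 can be checked numerically for a flux of the form (2.2.14). The sketch below is a hypothetical instance: the quadratic mobilities and the values `q_T = 1`, `K = 1`, `dg = 5` (which give $S_c \approx 0.447$ rather than the value shown in Figure 2.2) are assumptions made for illustration.

```python
def lam_w(S):
    """Water mobility (assumed quadratic)."""
    return S ** 2


def lam_o(S):
    """Oil mobility in terms of water saturation (assumed quadratic)."""
    return (1.0 - S) ** 2


def flux_w(u, v, q_T=1.0, K=1.0, dg=5.0):
    """Water flux F_w(u, v) in the form of (2.2.14); u is the left cell, v the right."""
    lw = lam_w(u)
    # u <= S_c is equivalent to theta_o >= 0 because lam_w is increasing
    cocurrent = q_T - K * dg * lw >= 0.0
    lo = lam_o(u) if cocurrent else lam_o(v)
    return lw * (q_T + K * lo * dg) / (lw + lo)


def is_monotone_on_grid(n=50, tol=1e-12):
    """Check on an (n+1) x (n+1) grid: non-decreasing in u, non-increasing in v."""
    grid = [i / n for i in range(n + 1)]
    for v in grid:
        col = [flux_w(u, v) for u in grid]
        if any(b < a - tol for a, b in zip(col, col[1:])):
            return False
    for u in grid:
        row = [flux_w(u, v) for v in grid]
        if any(b > a + tol for a, b in zip(row, row[1:])):
            return False
    return True
```

Note that the diagonal values `flux_w(u, u)` are not monotone in `u` under these assumptions (the value at 0.7 exceeds the value at 1.0), even though the two-argument flux passes the grid check.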

The hypothesis on phase mobilities is physically realistic [6]. These properties are
sufficient to ensure that the hyperbolic problem with implicit time-stepping possesses
a unique solution {Sin+1 }, which must also be the correct saturation profile for the
parabolic problem. To solve for pressure, we use Equation (2.2.7):

\[
\frac{p_i - p_{i+1}}{\Delta x} = \frac{q_T - K_{i+1/2}\,(\lambda_{w,i+1/2}\,g_w + \lambda_{o,i+1/2}\,g_o)}{K_{i+1/2}\,(\lambda_{w,i+1/2} + \lambda_{o,i+1/2})}
\]

for i = 1, . . . , N , and the boundary condition (2.2.6):

\[
p_{N+1} = 2p_R - p_N.
\]

Since $\{S_i^{n+1}\}$ is now known, the right-hand side of (2.2.7) is also completely determined.
Thus, the vector p of pressures actually satisfies Ap = b, where A is an N × N upper
triangular matrix with a nonzero diagonal. So A is nonsingular, which means there is
a unique pressure profile $\{p_i^{n+1}\}$ that satisfies (2.2.7) and (2.2.6). It is easy to see that
this pressure profile is consistent with the upstream condition (2.2.12): because of
(2.2.7), this upstream condition is equivalent to (2.2.13), and the conditions therein
are precisely the ones we use to define the numerical flux function (2.2.14) for the
hyperbolic problem. Hence, we have shown that the parabolic form (2.2.3)–(2.2.6)
has a unique solution, given by the above $\{(S_i^{n+1}, p_i^{n+1})\}$.
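Because the pressure system is upper triangular, it can be solved by back-substitution once the interface mobilities are known. The following sketch assumes the upstream mobilities have already been evaluated at each interface; the function name `solve_pressure` and its argument layout are hypothetical.

```python
def solve_pressure(lam_w_face, lam_o_face, K_face, q_T, g_w, g_o, dx, p_R):
    """Back-substitution for p_1..p_N from (2.2.7) with p_{N+1} = 2 p_R - p_N.

    Each list holds one value per interface i+1/2, i = 1..N (0-based here).
    """
    N = len(K_face)
    # d[i] = p_i - p_{i+1}, the pressure drop across interface i+1/2
    d = [
        dx * (q_T - K * (lw * g_w + lo * g_o)) / (K * (lw + lo))
        for lw, lo, K in zip(lam_w_face, lam_o_face, K_face)
    ]
    p = [0.0] * N
    # Last row: p_N - (2 p_R - p_N) = d_N, so p_N = p_R + d_N / 2
    p[N - 1] = p_R + 0.5 * d[N - 1]
    for i in range(N - 2, -1, -1):
        p[i] = p[i + 1] + d[i]
    return p
```

With three cells, unit permeability, no gravity, `q_T = 1`, `dx = 0.1` and `p_R = 2`, each interface carries the same pressure drop of 0.1, and the ghost value $2p_R - p_N$ lies one half-drop below `p_R`.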

2.2.2 SEQ for multidimensional problems


In multiple dimensions, it is no longer possible to eliminate pressure variables, because
the total velocity uT is generally a function of space and time. Thus, the system of
PDEs (2.1.1)–(2.1.2) does not reduce to a purely hyperbolic problem, which means
we cannot directly apply our existence and uniqueness results to the fully-implicit
method in this case. Nonetheless, our analysis does apply to the sequential-implicit
method (see section 1.2.2). In each time step in SEQ, we first solve the discrete version
of the (linear) elliptic equation (2.1.5), in which the saturation-dependent coefficients
are taken at time $t^n$. In other words, we solve for $p^{n+1}$ via

\[
-\nabla \cdot \Bigl[ K \lambda_T(S^n)\,\nabla p^{n+1} - K \nabla z \sum_j \gamma_j \lambda_j(S^n) \Bigr] = \sum_j q_j. \tag{2.2.15}
\]

[Figure 2.2: The numerical flux function F(u, v) corresponding to the fractional flow
in Figure 2.1(b), plotted as a surface over the (u, v) plane. The black curve along the
diagonal indicates the value of F(u, u) = f(u).]

Next, we compute the total velocity


\[
u_T^* = \sum_j u_j^* = -\sum_j K \lambda_j(S^n)\,\nabla(p^{n+1} - \gamma_j z). \tag{2.2.16}
\]

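Finally, we compute the saturations $S_j^{n+1}$ ($j = 1, \ldots, n-1$) by solving the discrete
version of (2.1.6) and (2.1.7) with implicit time-stepping:

\[
\begin{aligned}
\phi\,\frac{\partial S_j}{\partial t} + \nabla \cdot u_j(x, S_1, \ldots, S_n) &= 0, \\
u_j &= \frac{\lambda_j}{\lambda_T} \Bigl( u_T^* - K \nabla z \sum_l \lambda_l (\gamma_l - \gamma_j) \Bigr).
\end{aligned}
\]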

Essentially, the SEQ method decouples the system into an elliptic and a hyperbolic
subproblem. A finite-volume discretization of (2.1.6) and (2.1.7) gives rise to the
following multidimensional analog of (2.2.9):
\[
\phi_i\,(S_{w,i}^{n+1} - S_{w,i}^{n}) + \sum_{l \in \mathrm{adj}(i)} \lambda_{il}\,F_{il}(S_{w,i}^{n+1}, S_{w,l}^{n+1}) = 0. \tag{2.2.17}
\]

Here, $F_{il}$ is the flux (or velocity) from cell i to cell l, and $\lambda_{il} = \Delta t\,|\partial V_{il}|/|V_i|$, where
$|\partial V_{il}|$ is the area of the surface separating cells i and l, $|V_i|$ is the volume of cell i, and
∆t is the time step. For a conservative scheme we must have

\[
F_{il}(u_i, u_l) = -F_{li}(u_l, u_i), \tag{2.2.18}
\]

and for monotonicity we require that Fil be non-decreasing with respect to the first
argument and non-increasing with respect to the second. This requirement is satisfied
for two-phase flow problems, since we can reproduce the derivation in section 2.2.1

to obtain the flux function

\[
F_{w,il} = \frac{\lambda_{w,il}}{\lambda_{T,il}} \bigl[ q_{il} + K_{il}\,\lambda_{o,il}\,(g_w - g_o) \bigr]
\]

and the upstream condition



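\[
\lambda_{p,il} =
\begin{cases}
\lambda_p(S_i) & \text{if } q_{il} + K_{il}(g_p - g_q)\lambda_{q,il} \ge 0, \\
\lambda_p(S_l) & \text{otherwise,}
\end{cases}
\]

for p = o, w, where $q_{il} = u_T^* \cdot \nu_{il}$ and $g_p = \gamma_p \nabla z \cdot \nu_{il}$. We show that a unique solution
to (2.2.17) exists for any ∆t if the following conditions hold: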

1. The number of cells (control volumes) adjacent to cell i, |adj(i)|, is bounded for
all i;

2. The ratio $|\partial V_{il}|/|V_i|$ is bounded for all pairs of adjacent cells (i, l);

3. The quantity $\phi_i |V_i|$ is uniformly bounded away from zero for all i;

4. For any cell i, the total number of cells reachable from i in k steps is $O(k^p)$ for
some fixed p > 0 (i.e. grows at most polynomially in k);

5. $F_{il}$ is equicontinuous with the same Lipschitz constant for all pairs of adjacent
cells (i, l).

Assumptions 1–4 are easily satisfied by regular Cartesian grids, and also by most
unstructured grids of practical interest. From (2.2.18) we see that assumption 5 is
satisfied as long as Kil is uniformly bounded over the domain, which is generally true
for problems of practical interest. We justify these assumptions in section 2.3.7.

2.3 Existence and uniqueness of solutions for the discretized problems

In both model problems, we must solve a system of nonlinear equations ((2.2.9) and
(2.2.17) respectively) for the unknowns $\{u_i^{n+1}\}$. In this section, we show that the
classical nonlinear Jacobi and Gauss-Seidel processes both converge to a unique bounded
solution, which provides an alternate constructive proof of the well-definedness of
implicit monotone schemes. In addition, we show that Jacobi and Gauss-Seidel both
converge for any starting point that is bounded by the initial data, which leads to
a practical algorithm for computing the solution. In the interest of clarity, we first
consider the following one-dimensional problem:

\[
\phi_i\,(u_i^{n+1} - u_i^n) + \lambda\,(F_{i+1/2}^{n+1} - F_{i-1/2}^{n+1}) = 0, \qquad \lambda = \Delta t/\Delta x, \quad i \in \mathbb{Z}. \tag{2.3.1}
\]

We then extend the analysis to problems with spatially-varying coefficients, as well
as problems in multiple dimensions.

2.3.1 Implicit monotone schemes


Consider a numerical scheme of the form (2.3.1), where Fi+1/2 denotes the numerical
flux across the interface between cells i and i + 1. This scheme approximates the 1D
nonlinear conservation law

\[
\phi(x)\,u_t + f(x, u)_x = 0, \qquad (x, t) \in \mathbb{R} \times \mathbb{R}^+, \tag{2.3.2}
\]

which generalizes problem (2.1.9), (2.1.10) to the variable porosity and permeability
case. For simplicity, we assume a three-point scheme

\[
F_{i+1/2}^{n+1} = F_{i+1/2}(u_i^{n+1}, u_{i+1}^{n+1});
\]

thus, the implicit stencil at cell i involves the value at cell i at time $t^n$, as well as the
values at cells i − 1, i and i + 1 at the future time $t^{n+1}$. Since we are interested in
handling flux functions of the type shown in Figure 2.1(b), we do not assume that the
flux function f (x, u) is monotonic in u, so that sonic points may be present. Assume
that f and F are both locally Lipschitz continuous (but not necessarily differentiable),
and that the numerical flux function Fi+1/2 is consistent with f in the sense that

\[
F_{i+1/2}(u, u) = f(x_{i+1/2}, u). \tag{2.3.3}
\]

For the purpose of this thesis, a 1D implicit scheme is said to be an implicit monotone
scheme if the following assumption is satisfied.

Assumption 1 (Monotonic fluxes). For all i ∈ Z, the numerical flux function $F_{i+1/2}$
is non-decreasing in the first argument and non-increasing in the second argument,
i.e. for any w, we have $F_{i+1/2}(u, w) \le F_{i+1/2}(v, w)$ and $F_{i+1/2}(w, u) \ge F_{i+1/2}(w, v)$
whenever u ≤ v.

As shown in section 2.2.1, the fully implicit 1D problem satisfies this assumption.
We show that residual functions corresponding to implicit monotone schemes are in
fact M -functions in the sense of Rheinboldt [64]. This allows us to prove the existence
and uniqueness of solutions via a convergent iterative process.

Remark. Assumption 1 also guarantees that the resulting residual function is an
m-accretive operator in $\ell^1(\mathbb{Z})$ (see [33] for a proof). In general, m-accretive functions
and M-functions are not equivalent concepts. Consider the space $X = L^1(\mathbb{R}^n)$, i.e.,
the (finite) n-dimensional vector space with the $L^1$-norm. Then A is an m-accretive
operator if A is continuous and for any $u, v \in \mathbb{R}^n$,

\[
\sum_{i=1}^{n} \bigl(A(u)_i - A(v)_i\bigr)\,\mathrm{sgn}(u_i - v_i) \ge 0,
\]

which is equivalent to diagonal dominance when A is linear (see Appendix B). On the
other hand, M-functions are generalizations of M-matrices, i.e., A is a nonsingular
M-matrix if (1) $a_{ii} > 0$, (2) $a_{ij} \le 0$ for $i \ne j$, and (3) $A^{-1}$ has only non-negative

entries. Thus, if

\[
M_1 = \begin{pmatrix} 2 & 1 & 0 \\ 1 & 2 & 1 \\ 0 & 1 & 2 \end{pmatrix},
\qquad
M_2 = \begin{pmatrix} 1 & 0 & 0 \\ -4 & 1 & 0 \\ 0 & -4 & 1 \end{pmatrix},
\]

then the function $f_1(x) = M_1 x$ is m-accretive but not an M-function, and the reverse
is true for $f_2(x) = M_2 x$. We do not directly use m-accretivity in this work.
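The two matrices in this remark can be checked directly. The sketch below uses exact rational arithmetic; the helper names are hypothetical, and testing weak column diagonal dominance is an assumption about which form of diagonal dominance is meant (for these two matrices the row and column versions give the same answer).

```python
from fractions import Fraction


def inverse(A):
    """Gauss-Jordan inverse in exact arithmetic (assumes A is nonsingular)."""
    n = len(A)
    M = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for c in range(n):
        piv = next(r for r in range(c, n) if M[r][c] != 0)
        M[c], M[piv] = M[piv], M[c]
        M[c] = [x / M[c][c] for x in M[c]]
        for r in range(n):
            if r != c:
                M[r] = [x - M[r][c] * y for x, y in zip(M[r], M[c])]
    return [row[n:] for row in M]


def is_m_matrix(A):
    """Conditions (1)-(3): positive diagonal, non-positive off-diagonal, A^{-1} >= 0."""
    n = len(A)
    signs_ok = all((A[i][j] > 0) if i == j else (A[i][j] <= 0)
                   for i in range(n) for j in range(n))
    return signs_ok and all(x >= 0 for row in inverse(A) for x in row)


def weakly_column_dominant(A):
    """a_jj >= sum of |a_ij| over i != j, for every column j."""
    n = len(A)
    return all(A[j][j] >= sum(abs(A[i][j]) for i in range(n) if i != j)
               for j in range(n))


M1 = [[2, 1, 0], [1, 2, 1], [0, 1, 2]]
M2 = [[1, 0, 0], [-4, 1, 0], [0, -4, 1]]
```

Here `M1` is diagonally dominant but fails the sign condition on the off-diagonal entries, while `M2` has a non-negative inverse but is not diagonally dominant.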
Remark. Assumption 1 implies that (2.3.1) is an E-scheme (cf. [60]), so it is at most
first-order accurate.

2.3.2 Nonlinear Jacobi and Gauss-Seidel process


Suppose we want to solve a nonlinear system of algebraic equations R(x) = 0 for
x ∈ RN , where R = (r1 , . . . , rN )T : RN → RN . Then we can consider the nonlinear
Gauss-Seidel process:

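\[
\begin{aligned}
&\text{Solve } r_i(x_1^{k+1}, \ldots, x_{i-1}^{k+1}, x_i^*, x_{i+1}^{k}, \ldots, x_N^{k}) = 0 \text{ for } x_i^*, \\
&\text{Set } x_i^{k+1} = x_i^*, \qquad i = 1, \ldots, N, \quad k = 1, 2, \ldots,
\end{aligned}
\tag{2.3.4}
\]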

as well as the nonlinear Jacobi process:

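\[
\begin{aligned}
&\text{Solve } r_i(x_1^{k}, \ldots, x_{i-1}^{k}, x_i^*, x_{i+1}^{k}, \ldots, x_N^{k}) = 0 \text{ for } x_i^*, \\
&\text{Set } x_i^{k+1} = x_i^*, \qquad i = 1, \ldots, N, \quad k = 1, 2, \ldots
\end{aligned}
\tag{2.3.5}
\]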
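A minimal finite-dimensional sketch of the two processes follows. It assumes each residual is increasing in its own variable and that every scalar root is bracketed by `[lo, hi]`, so that plain bisection can serve as the univariate solver; the 2×2 test system and all names are hypothetical.

```python
def bisect(g, lo, hi, iters=60):
    """Root of an increasing scalar function g, assuming g(lo) <= 0 <= g(hi)."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def sweep(residuals, x, lo, hi, gauss_seidel=True):
    """One sweep of (2.3.4) if gauss_seidel is True, otherwise of (2.3.5)."""
    old = list(x)
    new = list(x)
    for i, r_i in enumerate(residuals):
        # Gauss-Seidel reads the partially updated vector, Jacobi the old one
        base = new if gauss_seidel else old

        def g(s):
            trial = list(base)
            trial[i] = s
            return r_i(trial)

        new[i] = bisect(g, lo, hi)
    return new


# Hypothetical test system with solution (1, 1)
residuals = [
    lambda x: 2.0 * x[0] + x[0] ** 3 - x[1] - 2.0,
    lambda x: 2.0 * x[1] + x[1] ** 3 - x[0] - 2.0,
]

x_gs = [0.0, 0.0]
x_jac = [0.0, 0.0]
for k in range(40):
    x_gs = sweep(residuals, x_gs, 0.0, 2.0, gauss_seidel=True)
    x_jac = sweep(residuals, x_jac, 0.0, 2.0, gauss_seidel=False)
```

Both iterations approach (1, 1) from below in this example; the only difference between them is which copy of the iterate supplies the off-diagonal arguments.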

If R is continuous, then we know that whenever Jacobi or Gauss-Seidel converge, they
have to converge to a solution $x^*$ such that $R(x^*) = 0$. We would like to use the tools
in [64] to show that (2.3.1) has a unique solution for any mesh ratio λ. However,
since (2.3.1) is defined for all i ∈ Z, we need to extend Rheinboldt’s results to include
an appropriate class of infinite-dimensional systems in which the residual functions
satisfy the following assumptions.

Assumption 2 (Preservation of bounded sets). $R : \ell^\infty(\mathbb{N}) \to \ell^\infty(\mathbb{N})$ is a mapping
between bounded sequences for which there exists an increasing function $\zeta : [0, \infty) \to [0, \infty)$ such that

\[
\|x\|_\infty \le B \implies \|R(x)\|_\infty \le \zeta(B).
\]

Assumption 3 (Finite number of dependencies). For each i, the residual function
$r_i(x_1, x_2, \ldots)$ is non-constant with respect to a finite number (which can vary with i)
of $x_j$.

In other words, the residual functions must come from a compact stencil and must
preserve boundedness. With these assumptions, the nonlinear Gauss-Seidel process
becomes

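\[
\begin{aligned}
&\text{Solve } r_i(x_1^{k+1}, \ldots, x_{i-1}^{k+1}, x_i^*, x_{i+1}^{k}, x_{i+2}^{k}, \ldots) = 0 \text{ for } x_i^*, \\
&\text{Set } x_i^{k+1} = x_i^*, \qquad i = 1, 2, \ldots, \quad k = 1, 2, \ldots,
\end{aligned}
\tag{2.3.6}
\]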

and the nonlinear Jacobi process becomes

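\[
\begin{aligned}
&\text{Solve } r_i(x_1^{k}, \ldots, x_{i-1}^{k}, x_i^*, x_{i+1}^{k}, x_{i+2}^{k}, \ldots) = 0 \text{ for } x_i^*, \\
&\text{Set } x_i^{k+1} = x_i^*, \qquad i = 1, 2, \ldots, \quad k = 1, 2, \ldots.
\end{aligned}
\tag{2.3.7}
\]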

The only difference between the above processes and (2.3.4)–(2.3.5) is that each
Gauss-Seidel/Jacobi “sweep” now involves infinitely many variables and equations.
These processes are well-defined because each $r_i$ is assumed to depend on only finitely
many arguments, so that for any given i ∈ Z, k ∈ N, the value of $x_i^{k+1}$ can be obtained
from a finite number of univariate solves. The main purpose of these assumptions is
to ensure the residual function of the discretized PDE is an M -function. This would
then allow us to prove the convergence of Jacobi and Gauss-Seidel iterations to a
unique bounded solution.

2.3.3 M-function theory

M-functions are essentially generalizations of M-matrices in linear algebra. In the
linear setting, it is well known (cf. [68]) that the Gauss-Seidel method applied to
Ax = b converges for any right hand side b and starting point $x^0$ if A is an
M-matrix. M-functions have similar properties with respect to the nonlinear
Gauss-Seidel process, which is the subject of investigation in [64]. Here we provide extensions
to the relevant definitions and theorems in [64] that would allow us to prove the
existence and uniqueness of bounded solutions to (2.3.1).
For the remainder of the section, the natural partial ordering on $\ell^\infty(\mathbb{N})$ is written
as x ≤ y, i.e.,

\[
x \le y \iff x_i \le y_i \quad \forall i \in \mathbb{N}.
\]

We denote by $e_i$ the unit basis vectors with the i-th component one and all others
zero. The following definitions are essentially identical to those in [64], except the
domain of definition has been changed from $\mathbb{R}^n$ to $\ell^\infty(\mathbb{N})$ to handle vectors of infinite
length.

Definition 2.1. Let $R : \ell^\infty(\mathbb{N}) \to \ell^\infty(\mathbb{N})$.

1. R is isotone (or antitone) if, for all $x, y \in \ell^\infty(\mathbb{N})$, x ≤ y implies R(x) ≤ R(y) (or
R(x) ≥ R(y)). It is strictly isotone (or antitone) if x < y implies R(x) < R(y)
(or R(x) > R(y)).

2. R is inverse isotone if, for all $x, y \in \ell^\infty(\mathbb{N})$, R(x) ≤ R(y) implies x ≤ y.

3. R is (strictly) diagonally isotone if, for all $x \in \ell^\infty(\mathbb{N})$, the functions

\[
\rho_{ii} : \mathbb{R} \to \mathbb{R}, \qquad \rho_{ii}(t) = r_i(x + t e_i), \qquad i = 1, 2, \ldots \tag{2.3.8}
\]

are (strictly) isotone.

4. R is off-diagonally antitone if, for any $x \in \ell^\infty(\mathbb{N})$, the functions

\[
\rho_{ij} : \mathbb{R} \to \mathbb{R}, \qquad \rho_{ij}(t) = r_i(x + t e_j), \qquad i \ne j, \quad i, j = 1, 2, \ldots \tag{2.3.9}
\]

are antitone.

5. R is an M-function if R is inverse isotone and off-diagonally antitone.



One characterization of M-functions is given by Theorem 2.2, which generalizes
the following result from matrix analysis: a square matrix A is an M-matrix if it has
positive diagonal, non-positive off-diagonal, and is column diagonally dominant.

Theorem 2.2. Suppose $R : \ell^\infty(\mathbb{N}) \to \ell^\infty(\mathbb{N})$ is off-diagonally antitone and satisfies
Assumptions 2 and 3. Suppose, for each B > 0, there exists a positive sequence $\{w_i^B\}$
such that

1. $\sum_{i=1}^{\infty} w_i^B < \infty$,

2. for any $\|x\|_\infty < B$, the function $Q(t) = (q_1(t), q_2(t), \ldots)$ defined by

\[
q_i(t) = \sum_{j=1}^{\infty} w_j^B\, r_j(x + t e_i)
\]

is strictly isotone over the interval $t \in (t_{\min}, t_{\max})$, where

\[
t_{\min} = -B - \inf_i x_i, \qquad t_{\max} = B - \sup_i x_i.
\]

Then R is an M-function.

Proof. The proof is an adaptation of the proof of Theorem 5.1 in [64], suitably
modified to handle the infinite-dimensional case. Suppose R(x) ≤ R(y) for some
$x, y \in \ell^\infty(\mathbb{N})$. Define the sets

\[
N^- = \{ i \in \mathbb{N} \mid y_i < x_i \}; \qquad N^+ = \{ i \in \mathbb{N} \mid y_i \ge x_i \}.
\]

Suppose $N^-$ is non-empty. For each $i \in N^-$, let $\gamma_i = (x_i - y_i) e_i$. We consider two
cases:

1. If $|N^-| < \infty$, let $i_1 < i_2 < \cdots < i_m$ be the elements of $N^-$, and define

\[
z^0 = y, \quad z^1 = y + \gamma_{i_1}, \quad \ldots, \quad z^m = y + \gamma_{i_1} + \cdots + \gamma_{i_m},
\]

and let $z^k = z^m = z$ for all k > m.



2. If $|N^-| = \infty$, let $i_1 < i_2 < \cdots$ be the elements of $N^-$, and define

\[
z^0 = y, \quad z^1 = y + \gamma_{i_1}, \quad \ldots, \quad z^k = y + \gamma_{i_1} + \cdots + \gamma_{i_k}, \quad \ldots
\]

and let $z = \{z_i\}$ be such that $z_i = \max\{x_i, y_i\}$.

Define $R^k := R(z^k)$ and $R^\infty = R(z)$. In either case, we have the following properties:

1. $\|z^k\|_\infty < B$ and $\|z\|_\infty < B$, where $B = \max\{\|x\|_\infty, \|y\|_\infty\}$. Hence, by
Assumption 2, $\|R^k\|_\infty < \zeta(B)$ for all k (similarly for $R^\infty$).

2. For each i, $z_i^k = z_i$ for large enough k, so by Assumption 3, $R_j^k \to R_j^\infty$ pointwise
for each j.

Since $|R_j^k| < \zeta(B)$ for all j, k, each $R^k$ is dominated by the constant sequence
$G = (\zeta(B), \zeta(B), \ldots)$. Moreover $\sum_{j=1}^{\infty} w_j^B G_j < \infty$, so by the dominated convergence
theorem (cf. [65]), we have

\[
\sum_{j=1}^{\infty} w_j^B R_j^k \to \sum_{j=1}^{\infty} w_j^B R_j^\infty \quad \text{as } k \to \infty.
\]

By the strict isotonicity of Q, we have

\[
\sum_{j=1}^{\infty} w_j^B R_j^0 \le \sum_{j=1}^{\infty} w_j^B R_j^1 \le \cdots
\]

with at least one strict inequality (since $N^-$ is non-empty). Thus, we must have

\[
\sum_{j=1}^{\infty} w_j^B r_j(y) = \sum_{j=1}^{\infty} w_j^B R_j^0 < \sum_{j=1}^{\infty} w_j^B R_j^\infty = \sum_{j=1}^{\infty} w_j^B r_j(z). \tag{2.3.10}
\]

Now split the last sum into two parts:

\[
\sum_{j=1}^{\infty} w_j^B r_j(z) = \sum_{j \in N^-} w_j^B r_j(z) + \sum_{j \in N^+} w_j^B r_j(z), \tag{2.3.11}
\]

where the summation over $N^+$ may be empty. Then by off-diagonal antitonicity of R



(and invoking the dominated convergence theorem whenever necessary), we can show
similarly that
\[
\sum_{j \in N^-} w_j^B r_j(z) \le \sum_{j \in N^-} w_j^B r_j(x), \qquad \sum_{j \in N^+} w_j^B r_j(z) \le \sum_{j \in N^+} w_j^B r_j(y), \tag{2.3.12}
\]

using the fact that z − x and z − y vanish on $N^-$ and $N^+$ respectively. Combining
equations (2.3.10)–(2.3.12) gives

\[
\sum_{j=1}^{\infty} w_j^B r_j(y) < \sum_{j \in N^-} w_j^B r_j(x) + \sum_{j \in N^+} w_j^B r_j(y), \tag{2.3.13}
\]

which implies

\[
\sum_{j \in N^-} w_j^B r_j(y) < \sum_{j \in N^-} w_j^B r_j(x). \tag{2.3.14}
\]

Thus, we must have $r_j(y) < r_j(x)$ for some $j \in N^-$, which contradicts the hypothesis
R(x) ≤ R(y). Hence $N^-$ must be empty, so x ≤ y.

The above theorem, together with the definition of M-functions, immediately
implies the following corollary.

Corollary 2.3. Let R satisfy the hypotheses of Theorem 2.2. Let $z \in \ell^\infty(\mathbb{N})$. Then
there is at most one bounded solution to the equation R(x) = z.

Remark. In the context of discretized PDEs one normally assumes tacitly that the
solution of interest must be bounded; this can be regarded as a boundary condition
“at infinity”. However, since such boundary conditions are not explicitly stated in the
definition of M-functions, one must be careful to exclude any parasitic unbounded
solutions that may arise. In fact, the solution is not necessarily unique if we allow
unbounded solutions. Consider the linear function $R = (r_1, r_2, \ldots)$ defined by
$r_i(x) = x_i - \alpha x_{i+1}$ for |α| < 1. Then for any $\|x\|_\infty < \infty$, we have $\|R(x)\|_\infty \le (1 + |\alpha|)\|x\|_\infty$,
so that Assumption 2 is satisfied. Assumption 3 (finitely many dependencies) is also
satisfied because each $r_i$ is only non-constant with respect to two components of x.

Finally, if we let $w_j^B = \beta^j$ for any |α| < β < 1, then $\sum_j \beta^j < \infty$ and

\[
\begin{aligned}
q_i(t) &= \sum_{j=1}^{\infty} \beta^j\, r_j(x + t e_i) \\
&= \sum_{j=1}^{\infty} \beta^j \bigl[ x_j + t\delta_{ij} - \alpha (x_{j+1} + t\delta_{i,j+1}) \bigr] \\
&= (\beta - \alpha)\beta^{i-1} t + \beta x_1 + (\beta - \alpha) \sum_{j=2}^{\infty} \beta^{j-1} x_j,
\end{aligned}
\]

so $q_i(t)$ is well-defined and is strictly increasing with respect to t whenever $\|x\|_\infty < \infty$.
So the hypotheses of Theorem 2.2 are satisfied, and hence x = 0 is the only bounded
solution of R(x) = 0. However, unbounded solutions of the form $y = \{K\alpha^{-i}\}$, $K \ne 0$
also satisfy R(y) = 0, so the theorem does not preclude these possibilities.
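The parasitic solution in this remark is easy to verify on a finite window of indices. The values `alpha = 0.5` and `K = 1.0` below are illustrative choices, picked so that all quantities are exact powers of two in floating point.

```python
alpha, K = 0.5, 1.0   # illustrative values with 0 < |alpha| < 1 and K != 0


def y(i):
    """Unbounded sequence y_i = K * alpha**(-i), i = 1, 2, ..."""
    return K * alpha ** (-i)


def r(i):
    """Residual r_i(y) = y_i - alpha * y_{i+1}."""
    return y(i) - alpha * y(i + 1)


residuals = [r(i) for i in range(1, 31)]   # every entry vanishes
growth = [y(i) for i in (1, 10, 30)]       # the sequence itself grows without bound
```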

2.3.4 Convergence of nonlinear Jacobi and Gauss-Seidel

It turns out that the hypotheses of Theorem 2.2 are enough to ensure convergence of
nonlinear Jacobi and Gauss-Seidel for certain starting points described below. The
following result is essentially Theorem 3.1 in [64], with modified hypotheses to
accommodate $\ell^\infty$-bounded vectors with infinitely many components. The proof in [64] goes
through verbatim, but is reproduced here for completeness. Note that by Assumption
3, each ri depends on only finitely many arguments, so the standard arguments on
limits, continuity and antitonicity hold without additional complications when they
are used on individual components of R.

Theorem 2.4 (Rheinboldt). Let $R : \ell^\infty(\mathbb{N}) \to \ell^\infty(\mathbb{N})$ satisfy the hypotheses of
Theorem 2.2. Suppose for some $z \in \ell^\infty(\mathbb{N})$ there exist points $x^0, y^0 \in \ell^\infty(\mathbb{N})$ such that

\[
x^0 \le y^0, \qquad R(x^0) \le z \le R(y^0).
\]

Then the nonlinear Gauss-Seidel and Jacobi iterates $\{y^k\}$ and $\{x^k\}$, given by (2.3.6)

and (2.3.7) and starting from $y^0$ and $x^0$, respectively, are uniquely defined and satisfy

\[
x^0 \le x^k \le x^{k+1} \le y^{k+1} \le y^k \le y^0, \qquad R(x^k) \le z \le R(y^k) \tag{2.3.15}
\]

for all k ≥ 0. In addition, the pointwise limits

\[
\lim_{k \to \infty} x^k = \lim_{k \to \infty} y^k = x^* \tag{2.3.16}
\]

exist, and $R(x^*) = z$.

First we need the following lemma (which is part of Theorem 2.10 in [64]).

Lemma 2.5. Let $R : \ell^\infty(\mathbb{N}) \to \ell^\infty(\mathbb{N})$ be an M-function. Then R is strictly
diagonally isotone.

Proof. Suppose that for some $x \in \ell^\infty(\mathbb{N})$, s, t ∈ R, s > t and index i we have
$r_i(x + s e_i) \le r_i(x + t e_i)$. The off-diagonal antitonicity then implies that

\[
r_j(x + s e_i) \le r_j(x + t e_i), \qquad j \ne i,
\]

or, altogether, that $R(x + s e_i) \le R(x + t e_i)$. By inverse isotonicity this leads to the
contradiction s ≤ t, which shows that R must be strictly diagonally isotone.

Proof of Theorem 2.4. We present only the proof for convergence of Gauss-Seidel; the
proof for Jacobi is similar. We proceed by induction and suppose that for some k ≥ 0
and i ≥ 1,

\[
x^0 \le x^k \le y^k \le y^0, \qquad R(x^k) \le z \le R(y^k), \tag{2.3.17a}
\]

\[
x_j^k \le x_j^{k+1} \le y_j^{k+1} \le y_j^k, \qquad j = 1, \ldots, i - 1, \tag{2.3.17b}
\]

where for i = 1 the relation (2.3.17b) is vacuous. Clearly, (2.3.17) is valid for k = 0
and i = 1. Define the functions

\[
\begin{aligned}
\alpha(s) &= r_i(x_1^{k+1}, \ldots, x_{i-1}^{k+1}, s, x_{i+1}^{k}, x_{i+2}^{k}, \ldots), \\
\beta(s) &= r_i(y_1^{k+1}, \ldots, y_{i-1}^{k+1}, s, y_{i+1}^{k}, y_{i+2}^{k}, \ldots)
\end{aligned}
\]

for $s \in [x_i^0, y_i^0]$. From (2.3.17) and the off-diagonal antitonicity of R, we then find
that

\[
\beta(s) \le \alpha(s), \qquad s \in [x_i^0, y_i^0], \tag{2.3.18}
\]

and

\[
\beta(x_i^k) \le \alpha(x_i^k) \le r_i(x^k) \le z_i \le r_i(y^k) \le \beta(y_i^k) \le \alpha(y_i^k). \tag{2.3.19}
\]

By the continuity and strict isotonicity of α and β (since R is an M-function and
hence strictly diagonally isotone), (2.3.19) implies the existence of unique $\hat{y}_i^k$ and $\hat{x}_i^k$
for which

\[
\beta(\hat{y}_i^k) = z_i = \alpha(\hat{x}_i^k), \qquad x_i^k \le \hat{x}_i^k \le \hat{y}_i^k \le y_i^k,
\]

where the relation $\hat{x}_i^k \le \hat{y}_i^k$ is a consequence of (2.3.18). But $x_i^{k+1} = \hat{x}_i^k$ and $y_i^{k+1} = \hat{y}_i^k$
by definition, so we have proved (2.3.17b) for j = 1, . . . , i. By induction (2.3.17b)
holds for all i ∈ N, and hence

\[
x^k \le x^{k+1} \le y^{k+1} \le y^k.
\]

From this it follows again from off-diagonal antitonicity that

\[
r_i(y^{k+1}) \ge r_i(y_1^{k+1}, \ldots, y_i^{k+1}, y_{i+1}^{k}, y_{i+2}^{k}, \ldots) = z_i
\]

and similarly that

\[
r_i(x^{k+1}) \le r_i(x_1^{k+1}, \ldots, x_i^{k+1}, x_{i+1}^{k}, x_{i+2}^{k}, \ldots) = z_i.
\]

This completes the induction on k and hence the proof of (2.3.15). Applying the
monotone convergence theorem for sequences, we conclude that the pointwise limits

\[
\lim_{k \to \infty} x_j^k = x_j^* \le y_j^* = \lim_{k \to \infty} y_j^k
\]

exist for each j, which allows us to define $x^* = \{x_j^*\}$ and $y^* = \{y_j^*\}$. Since each
$r_i$ is continuous and depends on only finitely many arguments, the definition of the
Gauss-Seidel process then implies $r_i(x^*) = r_i(y^*) = z_i$ for each i, and hence

\[
R(x^*) = R(y^*) = z.
\]

Since both $x^*$ and $y^*$ are bounded, Corollary 2.3 implies that they are equal,
completing the proof.

2.3.5 Well-definedness of implicit monotone schemes


Using the theory in the last two sections, we can now prove that implicit monotone
schemes (i.e., implicit schemes whose flux functions satisfy Assumption 1) are well-
defined for bounded initial conditions. What we need to show is that the residual
functions satisfy the hypotheses of Theorem 2.4. In the interest of clarity, in this
section we only show convergence of the iterative schemes for problems whose
coefficients do not vary in space (i.e., corresponding to the conservation law $u_t + f(u)_x = 0$,
discretized on a uniform spatial grid). In the next section, we state the additional
assumptions on φi and Fi+1/2 that are required for the spatially-varying case.

Theorem 2.6. Consider the numerical scheme (2.3.1) with the numerical flux given
by

\[
F_{i+1/2}^{n+1} = F(u_i^{n+1}, u_{i+1}^{n+1}),
\]

where F : R × R → R is locally Lipschitz continuous and satisfies Assumption 1, i.e.,
non-decreasing in the first argument and non-increasing in the second. Assume that
the initial condition $\{u_i^0\}_{i=-\infty}^{\infty}$ is bounded. Then (2.3.1) has a unique bounded solution
$\{u_i^{n+1}\}$ for n = 0, 1, 2, . . .. Moreover, this bounded solution satisfies the estimate

\[
\inf_{j \in \mathbb{Z}} u_j^n \le u_i^{n+1} \le \sup_{j \in \mathbb{Z}} u_j^n \tag{2.3.20}
\]

for all i ∈ Z.

Proof. The strategy is to start by defining an ordering for the Gauss-Seidel sweeps,
i.e., by permuting the equations and variables so that the spatial indices go from 1 to
∞ rather than from −∞ to ∞. After that, it suffices to check that all the hypotheses
of Theorem 2.4 are satisfied for this ordering.

1. For j = 1, 2, . . ., define $\sigma(j) = (-1)^j \lfloor j/2 \rfloor$, i.e. σ maps {1, 2, 3, 4, 5, . . .} to
{0, 1, −1, 2, −2, . . .}. Let τ be the inverse map, such that τ (σ(j)) = j. Define
$R : \ell^\infty(\mathbb{N}) \to \ell^\infty(\mathbb{N})$ to be the reordered (and rescaled) set of residual equations,
i.e.,

\[
r_j(v) = \frac{v_j - u^n_{\sigma(j)}}{\lambda} + F(v_j, v_{\tau(\sigma(j)+1)}) - F(v_{\tau(\sigma(j)-1)}, v_j), \tag{2.3.21}
\]

where $v_j = u^{n+1}_{\sigma(j)}$.

2. Since F is locally Lipschitz continuous, it is Lipschitz continuous over any compact
set, so for any B > 0 there exists $K_B$ (which can be chosen to be increasing with
B) such that for any (x, y) ∈ [−B, B] × [−B, B],

\[
|F(x, y) - F(0, 0)| \le K_B (|x| + |y|) \le 2 K_B \cdot B.
\]

Thus, for any $\|v\|_\infty \le B$, we have $|r_j(v)| \le \zeta(B)$ for all j, where

\[
\zeta(B) = (2/\lambda + 4K_B) B.
\]

Hence Assumption 2 is satisfied. Moreover, since each $r_j$ depends only on $v_j$,
$v_{\tau(\sigma(j)-1)}$ and $v_{\tau(\sigma(j)+1)}$, Assumption 3 (finite number of dependencies) is also
satisfied.

3. By Assumption 1 (monotonic fluxes), R is clearly off-diagonally antitone. To
satisfy the remaining hypotheses of Theorem 2.2, let $\{w_j^B\}$ take the form
$w_j^B = \beta^{|\sigma(j)|}$ for some 0 < β < 1, so that $\sum_{j=1}^{\infty} w_j^B < \infty$. An easy calculation shows that

\[
q_i(t) := \sum_{j=1}^{\infty} w_j^B r_j(v + t e_i) = \tilde{q}_i(t) + \sum_{j=1}^{\infty} w_j^B r_j(v),
\]

where

\[
\begin{aligned}
\tilde{q}_i(t) = w_i^B t/\lambda &+ (w_i^B - w^B_{\tau(\sigma(i)+1)}) \bigl[ F(v_i + t, v_{\tau(\sigma(i)+1)}) - F(v_i, v_{\tau(\sigma(i)+1)}) \bigr] \\
&+ (w^B_{\tau(\sigma(i)-1)} - w_i^B) \bigl[ F(v_{\tau(\sigma(i)-1)}, v_i + t) - F(v_{\tau(\sigma(i)-1)}, v_i) \bigr].
\end{aligned}
\]

By the definition of $w_i^B$, we see that

\[
|w_i^B - w^B_{\tau(\sigma(i) \pm 1)}| \le \beta^{|\sigma(i)| - 1} (1 - \beta),
\]

which, when combined with the local Lipschitz continuity of F , gives

\[
\beta^{|\sigma(i)|-1} \bigl[ \beta t/\lambda - 2(1 - \beta) K_B |t| \bigr] \le \tilde{q}_i(t) \le \beta^{|\sigma(i)|-1} \bigl[ \beta t/\lambda + 2(1 - \beta) K_B |t| \bigr].
\]

Hence, $\tilde{q}_i(t)$ is strictly isotone whenever

\[
\beta/\lambda > 2(1 - \beta) K_B,
\]

so picking

\[
\frac{2\lambda K_B}{1 + 2\lambda K_B} < \beta < 1 \tag{2.3.22}
\]

ensures isotonicity for $\tilde{q}_i(t)$ (and hence $q_i(t)$) for all i, as required in Theorem 2.2.
(Note that the choice of β depends on B.)

4. We need to choose starting points $x^0$ and $y^0$ that satisfy the requirements of
Theorem 2.4. Let $x^0$ and $y^0$ both be constant sequences with

\[
x_i^0 = \inf_{j \in \mathbb{Z}} u_j^n, \qquad y_i^0 = \sup_{j \in \mathbb{Z}} u_j^n, \qquad \forall i \in \mathbb{N}.
\]

Then clearly $x^0 \le y^0$, and for all i ∈ N,

\[
\begin{aligned}
r_i(x^0) &= x_i^0 - u^n_{\sigma(i)} = \inf_{j \in \mathbb{Z}} u_j^n - u^n_{\sigma(i)} \le 0, \\
r_i(y^0) &= y_i^0 - u^n_{\sigma(i)} = \sup_{j \in \mathbb{Z}} u_j^n - u^n_{\sigma(i)} \ge 0,
\end{aligned}
\]

so $R(x^0) \le 0 \le R(y^0)$. Thus, by Theorem 2.4, the nonlinear Gauss-Seidel iterates
$\{y^k\}$ and $\{x^k\}$ both converge (pointwise) to the unique solution $x^*$ with $R(x^*) = 0$;
hence, a unique solution to (2.3.1) exists, i.e.,

\[
u_i^{n+1} = x^*_{\tau(i)}.
\]

Moreover, we know that $x^0 \le x^* \le y^0$, which immediately implies (2.3.20).
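The reordering used in step 1 of the proof and the admissible range of β in (2.3.22) can be written out as a short sketch. The closed form used for `tau` is derived here from the definition of σ and is not stated in the text; the function names are hypothetical.

```python
def sigma(j):
    """Map j = 1, 2, 3, 4, 5, ... to the spatial index 0, 1, -1, 2, -2, ..."""
    return (-1) ** j * (j // 2)


def tau(i):
    """Inverse of sigma: spatial index i in Z back to position j >= 1."""
    return 2 * i if i > 0 else 1 - 2 * i


def beta_lower_bound(lam, K_B):
    """Lower end of the admissible interval for beta in (2.3.22)."""
    return 2.0 * lam * K_B / (1.0 + 2.0 * lam * K_B)


def weight(j, beta):
    """Weight w_j = beta**|sigma(j)| used in step 3."""
    return beta ** abs(sigma(j))
```

For example, with `lam = 2.0` and `K_B = 3.0` the bound is 12/13, so `beta = 0.95` satisfies the strict isotonicity condition β/λ > 2(1 − β)K_B.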

Remarks.

1. Note that the initial condition $\{u_i^0\}_{i=-\infty}^{\infty}$ is not assumed to be in $\ell^1$ nor in BV ,
so this result is somewhat more general than classical results that use Crandall-
Liggett theory.

2. Note that the definition of an M-function is invariant under symmetric
permutations, i.e., R(x) is an M-function if and only if σR(σx) is also an M-function
for any permutation σ : N → N. Thus, the Gauss-Seidel process will converge
regardless of the way the ordering is chosen in step 1 of the proof. However, we
show in the next section that the rate of convergence is sensitive to the ordering.

In fact, one can show that the nonlinear Jacobi and Gauss-Seidel processes
converge for any starting point $\{z_i^{(0)}\}$ that is bounded by the initial data $\{u_i^n\}$. (In the
sequel, superscripts in brackets indicate iterates within the Gauss-Seidel process, and
superscripts without brackets indicate the time level in the numerical scheme.)

Theorem 2.7. Assume the hypotheses of Theorem 2.6. Suppose the initial guess
$\{z_i^{(0)}\}$ satisfies

\[
\inf_{j \in \mathbb{Z}} u_j^n \le z_i^{(0)} \le \sup_{j \in \mathbb{Z}} u_j^n \tag{2.3.23}
\]

for all i ∈ Z. Then the nonlinear Jacobi and Gauss-Seidel processes (2.3.6) and
(2.3.7) are well-defined and converge to the unique bounded solution of (2.3.1).

Proof. Again we only show convergence for the Gauss-Seidel process, since the proof
for Jacobi is similar. Denote $\underline{u} = \inf_{j \in \mathbb{Z}} u_j^n$ and $\overline{u} = \sup_{j \in \mathbb{Z}} u_j^n$. First, we show that
the Gauss-Seidel iterates are well-defined and that $\underline{u} \le z_j^{(k)} \le \overline{u}$ for all j, k. At each
step we need to solve

\[
r_j(z_j^*) = z_j^* - u_j^n + \lambda \bigl[ F(z_j^*, z_{j+1}) - F(z_{j-1}, z_j^*) \bigr] = 0, \tag{2.3.24}
\]

where $z_{j\pm1} = z_{j\pm1}^{(k)}$ or $z_{j\pm1}^{(k+1)}$ depending on the ordering of the Gauss-Seidel sweep,

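which by induction must lie between $\underline{u}$ and $\overline{u}$. But

\[
\begin{aligned}
r_j(\underline{u}) &= \underline{u} - u_j^n + \lambda \bigl[ F(\underline{u}, z_{j+1}) - F(z_{j-1}, \underline{u}) \bigr] \\
&\le 0 + \lambda \bigl[ F(\underline{u}, \underline{u}) - F(\underline{u}, \underline{u}) \bigr] \le 0,
\end{aligned}
\]

where the last inequality follows from Assumption 1. Similarly one obtains $r_j(\overline{u}) \ge 0$,
so by continuity of F (and hence $r_j$) there must exist a solution $z_j^*$ to (2.3.24), which
by Lemma 2.5 must be unique. Hence, by induction, the Gauss-Seidel iterates are
well-defined and are bounded above and below by $\overline{u}$ and $\underline{u}$ respectively.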
Now consider the Gauss-Seidel iterates $\{x_j^{(k)}\}$ and $\{y_j^{(k)}\}$ with initial guess $x_j^{(0)} = \underline{u}$
and $y_j^{(0)} = \overline{u}$ for all j. By Theorem 2.6 these iterates converge pointwise to the same
solution $\{x_j^*\}$. We show inductively that $x^{(k)} \le z^{(k)} \le y^{(k)}$ for all k, which would imply
that $z_j^{(k)} \to x_j^*$ pointwise. Using the same reordering as in Theorem 2.6, assume that
for some k ≥ 0 and i ≥ 1 we have

\[
y^{(k)} \ge z^{(k)} \ge x^{(k)}, \qquad y_j^{(k+1)} \ge z_j^{(k+1)} \ge x_j^{(k+1)}, \qquad j = 1, \ldots, i - 1,
\]

which is valid for k = 0 and i = 1. Then by the same boundedness and antitonicity
arguments as in Theorem 2.4, we have

\[
\begin{aligned}
r_i(y_1^{(k+1)}, \ldots, y_{i-1}^{(k+1)}, y_i^{(k+1)}, y_{i+1}^{(k)}, \ldots) = 0 &= r_i(z_1^{(k+1)}, \ldots, z_{i-1}^{(k+1)}, z_i^{(k+1)}, z_{i+1}^{(k)}, \ldots) \\
&\ge r_i(y_1^{(k+1)}, \ldots, y_{i-1}^{(k+1)}, z_i^{(k+1)}, y_{i+1}^{(k)}, \ldots),
\end{aligned}
\]

which, together with the strict diagonal isotonicity of $r_i$, implies that $y_i^{(k+1)} \ge z_i^{(k+1)}$.
Similarly it follows that $z_i^{(k+1)} \ge x_i^{(k+1)}$. This completes the induction, and hence
$z_j^{(k)} \to x_j^*$ pointwise.

In other words, the nonlinear Gauss-Seidel process converges if we use $\{u_j^n\}$ (i.e.,
the solution from the previous time step) as an initial guess. For small to moderate
timestep sizes, one generally expects the solutions between consecutive time steps to
be close to each other, so in practice using $\{u_j^n\}$ results in much faster convergence
than either $\underline{u}$ or $\overline{u}$ as an initial guess.
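The warm-started process can be sketched in a few lines. The example below is only an illustration under stated assumptions: it uses the Burgers flux f(u) = u²/2 with the Engquist-Osher numerical flux (a monotone flux chosen here for convenience, not the phase-based upstream flux of this chapter), a finite grid with fixed boundary states, and bisection for each single-cell problem (2.3.24). All function names are hypothetical.

```python
import numpy as np

def eo_flux(a, b):
    """Engquist-Osher flux for f(u) = u^2/2 (assumed test flux):
    nondecreasing in a, nonincreasing in b, and eo_flux(u, u) = u^2/2."""
    return 0.5 * max(a, 0.0) ** 2 + 0.5 * min(b, 0.0) ** 2

def cell_residual(z, u_old, left, right, lam):
    """Single-cell residual r_j(z) of (2.3.24)."""
    return z - u_old + lam * (eo_flux(z, right) - eo_flux(left, z))

def solve_cell(u_old, left, right, lam, lo, hi, n_bisect=60):
    """Bisection on [lo, hi]; r_j is increasing in z and changes sign on
    the bracket whenever the neighboring values lie in [lo, hi]."""
    for _ in range(n_bisect):
        mid = 0.5 * (lo + hi)
        if cell_residual(mid, u_old, left, right, lam) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def gauss_seidel_step(u_old, lam, u_left, u_right, tol=1e-10, max_sweeps=1000):
    """One implicit time step solved by nonlinear Gauss-Seidel,
    started from the previous time level u_old."""
    lo = min(u_old.min(), u_left, u_right)
    hi = max(u_old.max(), u_left, u_right)
    z = u_old.copy()
    n = len(z)
    for sweep in range(1, max_sweeps + 1):
        change = 0.0
        for j in range(n):
            left = z[j - 1] if j > 0 else u_left        # already updated
            right = z[j + 1] if j < n - 1 else u_right  # from previous sweep
            z_new = solve_cell(u_old[j], left, right, lam, lo, hi)
            change = max(change, abs(z_new - z[j]))
            z[j] = z_new
        if change < tol:
            return z, sweep
    raise RuntimeError("Gauss-Seidel did not converge")

# Step data crossing the sonic point u = 0, so both upwind directions occur.
x = (np.arange(40) + 0.5) / 40
u_n = np.where(x < 0.3, 1.0, -0.5)
u_np1, sweeps = gauss_seidel_step(u_n, lam=2.0, u_left=1.0, u_right=-0.5)
```

The bracket [lo, hi] plays the role of $[\underline{u}, \overline{u}]$ in the proof: every iterate stays inside it, which is what makes bisection safe for each single-cell solve.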

2.3.6 Rate of convergence of the nonlinear processes

So far we have proven that the nonlinear Gauss-Seidel and Jacobi processes both
converge globally when applied to residual functions arising from implicit monotone
schemes, but we have not investigated how fast these processes converge. For this
purpose, let us reconsider the finite-dimensional case, i.e., when
$R : \mathbb{R}^N \to \mathbb{R}^N$ is given by

\[ r_i(u^{n+1}) = \phi_i (u_i^{n+1} - u_i^n) + \lambda\left( F_{i+1/2}(u_i^{n+1}, u_{i+1}^{n+1}) - F_{i-1/2}(u_{i-1}^{n+1}, u_i^{n+1}) \right) \tag{2.3.25} \]

for i = 1, . . . , N , and the finite versions of the Gauss-Seidel and Jacobi processes
((2.3.4) and (2.3.5)) are used. It is well known [59] that for a convergent fixed-point
iteration $x^{n+1} = G x^n$, the asymptotic rate of convergence is given by $\rho(G'(x^*))$,
the spectral radius of the Jacobian matrix evaluated at the solution $x^*$. Moreover,
superlinear convergence is obtained when $\rho(G'(x^*)) = 0$. The following lemma gives
the rate of convergence for the nonlinear Gauss-Seidel and Jacobi processes.

Lemma 2.8. Suppose the residual function $R : D \subset \mathbb{R}^N \to \mathbb{R}^N$ is an
M-function that is continuously differentiable at $x^*$. Let the Jacobian matrix be written
as $R'(x^*) = D - L - U$, where D is a diagonal matrix and L, U are strictly lower and upper
triangular respectively. Then the asymptotic rates of convergence for the nonlinear
Gauss-Seidel and Jacobi processes ((2.3.4) and (2.3.5)) are given by $\rho_{GS}$ and $\rho_J$
respectively, where

\[ \rho_{GS} = \rho\left((D - L)^{-1} U\right), \qquad \rho_J = \rho\left(D^{-1}(L + U)\right). \]

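Proof. Let G denote the Gauss-Seidel operator, i.e., $y := x^{k+1} = G x^k$, where $x^{k+1}$ is
defined implicitly as a function of $x^k$ by (2.3.4). Then implicit differentiation gives,
for each j = 1, . . . , N ,

\begin{align*}
\frac{\partial r_1}{\partial y_1}\frac{\partial y_1}{\partial x_j} + \frac{\partial r_1}{\partial x_j} &= 0 \\
\frac{\partial r_2}{\partial y_1}\frac{\partial y_1}{\partial x_j} + \frac{\partial r_2}{\partial y_2}\frac{\partial y_2}{\partial x_j} + \frac{\partial r_2}{\partial x_j} &= 0 \\
&\;\;\vdots \\
\frac{\partial r_N}{\partial y_1}\frac{\partial y_1}{\partial x_j} + \cdots + \frac{\partial r_N}{\partial y_N}\frac{\partial y_N}{\partial x_j} + \frac{\partial r_N}{\partial x_j} &= 0
\end{align*}

We can rewrite the above in matrix form as

\[ \left( D(y) - L(y) \right) \frac{\partial y}{\partial x} - U(x) = 0, \]

where D, −L and −U are the diagonal, strictly lower-triangular and strictly upper-triangular
parts of ∂R/∂x respectively, so that ∂R/∂x = D − L − U as in the statement of the lemma.
Since R is an M-function, D(y) − L(y) is nonsingular for all y in the domain of R. Thus,
G' is given by

\[ G'(x) = \frac{\partial y}{\partial x} = \left( D(y(x)) - L(y(x)) \right)^{-1} U(x). \]

Since R is continuously differentiable at $x^*$, letting $x, y \to x^*$ shows that the
asymptotic rate of convergence is given by $\rho((D - L)^{-1} U)$, as required. The argument
for the Jacobi process is similar.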

In other words, the rates of convergence of the nonlinear processes are exactly the
same as the rates for the corresponding linear processes applied to the Jacobian matrix
of the residual function. For the residual function (2.3.25), the Jacobian matrix has
the following tridiagonal form:

\[ \frac{\partial R}{\partial u} =
\begin{pmatrix}
d_1 & f_1 & & \\
e_2 & d_2 & \ddots & \\
 & \ddots & \ddots & f_{N-1} \\
 & & e_N & d_N
\end{pmatrix}, \]

where

\begin{align*}
d_i &= \phi_i + \lambda\left( \frac{\partial F_{i+1/2}}{\partial u_i} - \frac{\partial F_{i-1/2}}{\partial u_i} \right) > 0, \\
e_i &= -\lambda \frac{\partial F_{i-1/2}}{\partial u_{i-1}} \le 0, \\
f_i &= \lambda \frac{\partial F_{i+1/2}}{\partial u_{i+1}} \le 0.
\end{align*}

Thus, $d_i = \phi_i - e_{i+1} - f_{i-1}$, so that ∂R/∂u is a column diagonally dominant matrix.
This guarantees that both $\rho_{GS}$ and $\rho_J$ are strictly less than 1. Since ∂R/∂u is
generally not a diagonal matrix, it is clear that nonlinear Jacobi converges at most linearly.
One can compute an upper bound for $\rho_J$ as follows. We have

\begin{align*}
\rho_J = \rho\left(D^{-1}(L + U)\right) &= \rho\left((L + U) D^{-1}\right) \\
&\le \left\| (L + U) D^{-1} \right\|_1 = \max_i \frac{|e_{i+1} + f_{i-1}|}{d_i} \\
&\le 1 - \frac{\min_i \phi_i}{\max_i d_i}.
\end{align*}

Since − log ρJ ≈ φmin / max di , it follows that − log ρJ is roughly inversely proportional
to the mesh ratio λ, especially when λ (and equivalently ∆t) is large. Thus, one
expects Jacobi to take roughly twice as many iterations to converge when one doubles
the time-step size while fixing the spatial grid (or, equivalently, when the grid is refined
by a factor of two while ∆t is kept constant).
For Gauss-Seidel, we exploit the fact that ∂R/∂u is tridiagonal. For this class
of matrices (and in fact, for any consistently ordered matrices in the sense of Young
[85]), the following theorem holds [68].

Theorem 2.9. Let A be a consistently ordered matrix such that $a_{ii} \ne 0$ for i =
1, . . . , N , and let the SOR parameter ω be nonzero. Then, if λ is a nonzero eigenvalue
of the SOR iteration matrix $G_{SOR}$, any scalar µ such that

\[ (\lambda + \omega - 1)^2 = \lambda \omega^2 \mu^2 \tag{2.3.26} \]

is an eigenvalue of the Jacobi iteration matrix $G_J$. Conversely, if µ is an eigenvalue
of $G_J$ and if a scalar λ satisfies (2.3.26), then λ is an eigenvalue of $G_{SOR}$.

Since Gauss-Seidel is simply SOR with ω = 1, it follows that either $\rho_{GS} = 0$
or $\rho_{GS} = \rho_J^2$. The former happens when $f_i = 0$ for i = 1, . . . , N − 1, i.e., when
∂R/∂u is lower triangular. In this case, Gauss-Seidel converges in one iteration (i.e.,
superlinearly), since the nonlinear system is actually decoupled and Gauss-Seidel is
essentially just a forward substitution. For 1D porous media flow, this occurs when
flow is purely cocurrent and the numerical scheme reverts to single-point upstreaming.
When countercurrent flow is present, there is no symmetric permutation that
would render ∂R/∂u lower triangular, so Gauss-Seidel also converges linearly, requiring
about half as many iterations as Jacobi.
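These relations are easy to check numerically. The sketch below builds a tridiagonal Jacobian of the form above from made-up, sign-consistent flux derivatives (the arrays `a` and `b` are hypothetical sample data, not values from any real flux function) and compares the spectral radii of the two iteration matrices with the bound and the squared relation stated in the text.

```python
import numpy as np

def spectral_radius(m):
    return float(np.max(np.abs(np.linalg.eigvals(m))))

def jacobi_and_gs_rates(d, e, f):
    """Spectral radii of the Jacobi and Gauss-Seidel iteration matrices for
    the tridiagonal Jacobian with diagonal d, subdiagonal e and
    superdiagonal f (both of length N-1), split as D - L - U."""
    D = np.diag(d)
    L = -np.diag(e, k=-1)
    U = -np.diag(f, k=1)
    rho_j = spectral_radius(np.linalg.solve(D, L + U))
    rho_gs = spectral_radius(np.linalg.solve(D - L, U))
    return rho_j, rho_gs

def tridiagonal_entries(phi, lam, a, b):
    """a[k] >= 0 and b[k] >= 0 stand for dF/du_left and -dF/du_right at
    interface k = 0..N, where interface k is the left face of cell k."""
    d = phi + lam * (a[1:] + b[:-1])
    e = -lam * a[1:-1]
    f = -lam * b[1:-1]
    return d, e, f

rng = np.random.default_rng(0)
n = 12
phi = np.ones(n)
a = rng.uniform(0.0, 1.0, n + 1)
b = rng.uniform(0.0, 1.0, n + 1)

# Countercurrent-like case: both flux derivatives are nonzero.
d, e, f = tridiagonal_entries(phi, 2.0, a, b)
rho_j, rho_gs = jacobi_and_gs_rates(d, e, f)
bound = 1.0 - phi.min() / d.max()

# Cocurrent-like case: no downstream dependence, so U = 0.
d_c, e_c, f_c = tridiagonal_entries(phi, 2.0, a, np.zeros(n + 1))
_, rho_gs_cocurrent = jacobi_and_gs_rates(d_c, e_c, f_c)
```

In the first case `rho_gs` agrees with `rho_j**2` to rounding error and `rho_j` stays below `bound`; in the second the Gauss-Seidel iteration matrix vanishes, which is the one-iteration (forward substitution) case.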

2.3.7 Extensions
In this section we show how to extend the results of Theorems 2.6 and 2.7 to deal
with:

1. conservation laws with non-uniform spatial grids,

2. spatially-varying flux functions,

3. problems in which the flux functions are only defined over a closed interval
I ⊂ R, and

4. problems in multiple dimensions.

Non-uniform grids and spatially-varying flux functions

Consider again the fully-implicit discretization (2.3.1):

\[ \phi_i (u_i^{n+1} - u_i^n) + \lambda\left( F_{i+1/2}^{n+1} - F_{i-1/2}^{n+1} \right) = 0, \qquad \lambda = \Delta t/\Delta x, \quad i \in \mathbb{Z}, \]

with a spatially-varying $\phi_i$ and $F_{i+1/2}$. We assume that $0 < \phi_i \le 1$. Notice that the
non-uniform grid case is automatically included: for any non-uniform discretization
of the form

\[ \frac{\tilde\phi_i (u_i^{n+1} - u_i^n)}{\Delta t} + \frac{F_{i+1/2}^{n+1} - F_{i-1/2}^{n+1}}{\Delta x_i} = 0, \tag{2.3.27} \]

we can multiply (2.3.27) by $\Delta t\,\Delta x_i/\Delta x_{\max}$ to recover the form of (2.3.1) with

\[ \phi_i = \tilde\phi_i\, \Delta x_i/\Delta x_{\max}, \qquad \lambda = \Delta t/\Delta x_{\max}. \]

To ensure convergence of the Jacobi and Gauss-Seidel processes, we need the following
assumptions:

1. The family of flux functions $\{F_{i+1/2}\}_{i=-\infty}^{\infty}$ is equicontinuous [65] with the same
Lipschitz constant $K_B$;

2. {φi } is uniformly bounded away from zero, i.e. there exists φmin > 0 such that
φi ≥ φmin for all i ∈ Z.

While the equicontinuity condition may appear severe, it is usually satisfied in practice
because the spatially-varying coefficients (e.g. K(x) in (2.1.10)) tend to be uniformly
bounded, ensuring equicontinuity in the flux functions. With the above assumptions,
we can mimic Theorem 2.6 exactly by replacing λ with $\lambda/\phi_i$. Then the proof goes
through verbatim, except for (2.3.22), which must be modified to

\[ \frac{2\lambda K_B}{\phi_{\min} + 2\lambda K_B} < \beta < 1. \tag{2.3.28} \]

Bounded admissible solutions

Formally, Theorem 2.6 requires the discrete flux function $F(u_i, u_{i+1})$ to be defined on
R × R. In practice one may want to solve problems for which the flux function f is
only defined on an interval $[u_{\min}, u_{\max}]$ rather than on all of R, so states outside these
physical bounds are not admissible. For instance, in the two-phase flow problem,
we must have $S_i \in [0, 1]$ for all i, and the flux function f(S) in (2.1.10) is not even
defined outside this range. Fortunately, the estimate (2.3.20) ensures that as long
as the initial conditions are within physical bounds, the solution remains within these
bounds for all subsequent time steps n > 0. Thus, in order to apply Theorem 2.6 to these
problems, one can formally extend the domain of definition of the flux function f to R by
defining, for instance,

\[ \tilde f(u) = \begin{cases}
f(u_{\min}), & u < u_{\min}, \\
f(u), & u_{\min} \le u \le u_{\max}, \\
f(u_{\max}), & u > u_{\max},
\end{cases} \]

and similarly for the discrete flux F(u, v). Since all the Gauss-Seidel iterates $\{y^k\}$ and
$\{x^k\}$ satisfy the bound $x^0 \le x^k \le y^k \le y^0$, the exact manner in which the extension
is defined is unimportant as long as the monotonicity property (Assumption 1) is
satisfied.
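One way to realize such an extension in code is to clamp the arguments before evaluating the flux. The sketch below is only illustrative: `frac_flow` is a hypothetical S-shaped curve standing in for a flux defined on [0, 1], not the flux function of (2.1.10), and the helper names are made up.

```python
import numpy as np

def extend_by_constants(f, u_min, u_max):
    """Extend f from [u_min, u_max] to all of R by constant continuation."""
    def f_ext(u):
        return f(np.clip(u, u_min, u_max))
    return f_ext

def extend_discrete_flux(F, u_min, u_max):
    """Same idea for a two-argument numerical flux F(u, v). Clamping is
    nondecreasing, so monotonicity in each argument is preserved."""
    def F_ext(u, v):
        return F(np.clip(u, u_min, u_max), np.clip(v, u_min, u_max))
    return F_ext

def frac_flow(s):
    """Hypothetical fractional-flow curve, only meaningful on [0, 1]."""
    return s ** 2 / (s ** 2 + (1.0 - s) ** 2)

f_ext = extend_by_constants(frac_flow, 0.0, 1.0)
# Single-point upstream flux F(u, v) = f(u), used here as a simple example.
F_ext = extend_discrete_flux(lambda u, v: frac_flow(u), 0.0, 1.0)
```

Because the iterates never leave the physical range when the initial data lie inside it, the extended branches are never actually exercised at the solution; they only make the theory applicable.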

Multiple dimensions

The M-function analysis above can be extended to scalar conservation laws in multiple
dimensions. Consider once again the conservative, implicit monotone scheme

\[ \phi_i (u_i^{n+1} - u_i^n) + \sum_{l \in \mathrm{adj}(i)} \lambda_{il}\, F_{il}(u_i^{n+1}, u_l^{n+1}) = 0, \qquad i \in I, \tag{2.3.29} \]

of which the SEQ problem is an example. Recall that Fil is the flux from cell i to
cell l, λil = ∆t|∂Vil |/|Vi |, where |∂Vil | is the area of the surface separating cell i and
l, |Vi | is the volume of cell i and ∆t is the time step. In order to mimic Theorem 2.6,
we need the following assumption on the numerical flux:

1. Fil is equicontinuous with the same Lipschitz constant for all pairs of adjacent
cells (i, l),

as well as these assumptions on the grid:

2. The number of cells (control volumes) adjacent to cell i, |adj(i)|, is bounded for
all i;

3. The ratio |∂Vil |/|Vi | is bounded for all pairs of adjacent cells (i, l);

4. The quantity φi |Vi | is uniformly bounded away from zero for all i;

5. For any cell i, the total number of cells reachable from i in k steps is O(k p ) for
some fixed p > 0 (i.e. grows at most polynomially in k).

Items (1) and (4) are analogous to the conditions stated in the non-uniform grid case,
whereas the other conditions are new. These assumptions ensure that the residual
functions are all bounded and have the same Lipschitz constant over the set
$\{u \in \ell^\infty(N) \mid \|u\|_\infty < B\}$. The polynomial growth assumption (5) allows us to assign
the weights $\{w_i^B\}$ to each cell i in the following manner: pick any node $i_0$ and let
$w_i^B = \beta^{d(i_0, i)}$, where d(i, j) is the shortest distance between node i and j in the
graph-theoretic sense. Since the number of cells within k steps of $i_0$ grows polynomially in
k, the series $\sum_i w_i^B$ converges for any 0 < β < 1, so β can be chosen the same way
as in step 3 of Theorem 2.6 and the same argument will hold.
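The weight construction can be sketched with a breadth-first search. The example assumes a finite n-by-n Cartesian grid with four-point connectivity as a stand-in for the cell graph; the function names and the choice β = 0.5 are illustrative only.

```python
from collections import deque

def grid_adjacency(n):
    """Adjacency lists of an n-by-n grid with 4-point connectivity."""
    adj = {}
    for r in range(n):
        for c in range(n):
            nbrs = []
            if r > 0:
                nbrs.append((r - 1, c))
            if r < n - 1:
                nbrs.append((r + 1, c))
            if c > 0:
                nbrs.append((r, c - 1))
            if c < n - 1:
                nbrs.append((r, c + 1))
            adj[(r, c)] = nbrs
    return adj

def graph_distances(adj, source):
    """Shortest graph distance d(source, i) by breadth-first search."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        i = queue.popleft()
        for l in adj[i]:
            if l not in dist:
                dist[l] = dist[i] + 1
                queue.append(l)
    return dist

def cell_weights(adj, source, beta):
    """Weights w_i = beta**d(source, i)."""
    return {i: beta ** d for i, d in graph_distances(adj, source).items()}

weights = cell_weights(grid_adjacency(21), source=(10, 10), beta=0.5)
total_weight = sum(weights.values())
```

On this grid at most 4k cells lie at distance exactly k from the source, so the sum of the weights is bounded by 1 + 4β/(1 − β)², independently of the grid size; this is the polynomial-growth argument with p = 1.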

2.3.8 Maximum principle

We conclude this section by proving a stronger version of (2.3.20) that is satisfied by
implicit monotone schemes, as well as any Gauss-Seidel iterates.

Proposition 2.10 (Maximum principle). Suppose $u^*$ solves the problem

\[ u^* - u^0 + \lambda\left[ F(u^*, u^+) - F(u^-, u^*) \right] = 0, \]

where F satisfies Assumption 1. Then $u^*$ satisfies

\[ \min\{u^0, u^-, u^+\} \le u^* \le \max\{u^0, u^-, u^+\}. \tag{2.3.30} \]

Proof. If $u^*$ is equal to any one of $u^0, u^-, u^+$, there is nothing to prove. So assume
$u^* \notin \{u^0, u^-, u^+\}$. Define

\begin{align}
C &= \frac{F(u^*, u^*) - F(u^*, u^+)}{u^+ - u^*} \ge 0, \tag{2.3.31} \\
D &= \frac{F(u^*, u^*) - F(u^-, u^*)}{u^* - u^-} \ge 0. \tag{2.3.32}
\end{align}

The non-negativity of C and D follows from Assumption 1. Then

\begin{align*}
0 &= u^* - u^0 + \lambda\left[ F(u^*, u^+) - F(u^-, u^*) \right] \\
&= u^* - u^0 + \lambda\left[ F(u^*, u^+) - F(u^*, u^*) + F(u^*, u^*) - F(u^-, u^*) \right] \\
&= (u^* - u^0) + \lambda C (u^* - u^+) + \lambda D (u^* - u^-).
\end{align*}

Thus, if $u^* - u^0$, $u^* - u^-$, $u^* - u^+$ all had the same sign, we would get a contradiction.
Thus, at least two of the three terms must have opposite signs, which implies (2.3.30).
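A simple special case makes the statement concrete. Assume, purely for illustration, the single-point upstream flux F(u, v) = a u with a constant a > 0, which is nondecreasing in its first argument and independent of its second. Then C = 0 and D = a, and the single-cell problem can be solved in closed form:

```latex
\begin{align*}
0 = u^* - u^0 + \lambda a\,(u^* - u^-)
\quad\Longrightarrow\quad
u^* = \frac{1}{1 + \lambda a}\,u^0 + \frac{\lambda a}{1 + \lambda a}\,u^- .
\end{align*}
```

The solution is a convex combination of u^0 and u^-, so it lies between them for every λ > 0, however large. For example, u^0 = 1, u^- = 0 and λa = 4 give u^* = 0.2.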

2.4 Convergence to the entropy solution


In this section, we restrict our attention to implicit monotone discretizations corre-
sponding to the conservation law

ut + f (u)x = 0, (x, t) ∈ R × R+ , (2.4.1)

i.e. when φ(x) ≡ 1 is constant and the flux function does not vary in space (but is
generally non-convex and/or non-monotonic). Kružkov [45] has shown that (2.4.1)
has a unique entropy-satisfying weak solution, as stated in the following theorem.

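Theorem 2.11 (Kružkov). If f is locally Lipschitz continuous, then for any
$u_0 \in BV(\mathbb{R})$ and for any T > 0 there is a unique
$u \in BV(\mathbb{R} \times [0, T]) \cap C^0([0, T], L^1_{loc}(\mathbb{R}))$
such that u is a weak solution, i.e.

\[ \iint_{\mathbb{R} \times [0,T]} \left( u\,\psi_t + f(u)\,\psi_x \right) dx\, dt + \int_{\mathbb{R}} u_0(x)\,\psi(x, 0)\, dx = 0 \tag{2.4.2} \]

for all $\psi \in C_0^\infty(\mathbb{R} \times [0, T])$ and, in addition, satisfies the entropy condition: for all
$\psi \in C_0^\infty(\mathbb{R} \times [0, T])$ with ψ ≥ 0, and for all c ∈ R,

\[ \iint_{\mathbb{R} \times [0,T]} \left[ |u - c|\,\psi_t + \operatorname{sgn}(u - c)\left( f(u) - f(c) \right)\psi_x \right] dx\, dt \ge 0. \tag{2.4.3} \]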

The classical approach for establishing convergence to the unique entropy solution
proceeds as follows (cf. [24, 41, 70]):

1. Show that the sequence of numerical approximations remains uniformly bounded
and has uniformly bounded total variation as ∆x, ∆t → 0. This ensures the set
of numerical approximations is precompact in $L^1_{loc}(\mathbb{R} \times [0, T])$, which allows one
to produce a convergent subsequence;

2. Show that the numerical flux is consistent and satisfies a discrete entropy in-
equality. By the Lax-Wendroff theorem [46], this implies the limit u of the
convergent subsequence satisfies (2.4.2) and (2.4.3) in Theorem 2.11;

3. Verify that the entropy-satisfying weak solution is unique. In the 1D scalar case
this is a result of Theorem 2.11. This ensures all subsequences have the same
limit point, so that the finite difference scheme is convergent as ∆x, ∆t → 0.

A detailed argument along the above lines can be found in [24, 50, 70] and will not
be repeated here. Instead we focus on checking the various criteria listed above. The
numerical flux is assumed to be consistent, and by Theorem 2.6, the discrete solution
is uniformly bounded, independently of the spatial and temporal grid sizes. Thus, we only
need to verify that the numerical approximations have bounded total variation, and that a
discrete entropy inequality exists. The following two lemmas address these questions.

Lemma 2.12. Assume the hypotheses of Theorem 2.6, and suppose for n ≥ 1 the
discrete solution $\{u_i^n\}_{i=-\infty}^{\infty}$ is given by the unique bounded solution satisfying (2.3.1).
Assume the initial data $\{u_i^0\}_{i=-\infty}^{\infty}$ has bounded total variation, i.e.

\[ TV(u^0) := \sum_{i=-\infty}^{\infty} |u_{i+1}^0 - u_i^0| < \infty. \]

Then $TV(u^n) < \infty$ for all n ≥ 1, and

\[ TV(u^{n+1}) \le TV(u^n) \quad \text{for all } n. \]



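Proof. For notational simplicity we write $u_i = u_i^{n+1}$, $v_i = u_i^n$,
$\Delta u_{i+1/2} = u_{i+1} - u_i$, and similarly $\Delta v_{i+1/2} = v_{i+1} - v_i$.
By a manipulation similar to the one in Proposition 2.10, we have, for each i ∈ Z,

\[ u_i - v_i - \lambda C_i \Delta u_{i+1/2} + \lambda D_i \Delta u_{i-1/2} = 0, \tag{2.4.4} \]

where

\[ C_i = \frac{F(u_i, u_i) - F(u_i, u_{i+1})}{u_{i+1} - u_i} \ge 0, \qquad
   D_i = \frac{F(u_i, u_i) - F(u_{i-1}, u_i)}{u_i - u_{i-1}} \ge 0. \]

Writing (2.4.4) for cells i, i + 1 and subtracting gives

\[ \Delta u_{i+1/2} - \Delta v_{i+1/2} - \lambda C_{i+1} \Delta u_{i+3/2} + \lambda D_{i+1} \Delta u_{i+1/2} + \lambda C_i \Delta u_{i+1/2} - \lambda D_i \Delta u_{i-1/2} = 0. \]

Rearranging, we get

\[ (1 + \lambda C_i + \lambda D_{i+1}) \Delta u_{i+1/2} = \Delta v_{i+1/2} + \lambda C_{i+1} \Delta u_{i+3/2} + \lambda D_i \Delta u_{i-1/2}. \]

Since $C_i, C_{i+1}, D_i, D_{i+1}$ are all non-negative, the triangle inequality gives

\[ (1 + \lambda C_i + \lambda D_{i+1}) |\Delta u_{i+1/2}| \le |\Delta v_{i+1/2}| + \lambda C_{i+1} |\Delta u_{i+3/2}| + \lambda D_i |\Delta u_{i-1/2}|. \tag{2.4.5} \]

Summing (2.4.5) for i from −N to N and making some cancellations, we get

\begin{align*}
\sum_{i=-N}^{N} |\Delta u_{i+1/2}|
&\le \sum_{i=-N}^{N} |\Delta v_{i+1/2}| + \lambda C_{N+1} |\Delta u_{N+3/2}| - \lambda C_{-N} |\Delta u_{-N+1/2}| \\
&\qquad + \lambda D_{-N} |\Delta u_{-N-1/2}| - \lambda D_{N+1} |\Delta u_{N+1/2}| \\
&\le \sum_{i=-N}^{N} |\Delta v_{i+1/2}| + \lambda C_{N+1} |\Delta u_{N+3/2}| + \lambda D_{-N} |\Delta u_{-N-1/2}| \\
&\le TV(v) + 4 \lambda K_B \cdot B,
\end{align*}

where $\|u^0\|_\infty < B$ and $K_B$ is the local Lipschitz constant. Since the last expression
is finite and independent of N , the monotone convergence theorem guarantees that if
we let N approach infinity, the series will converge, so we get TV(u) < ∞. Moreover,
it implies that for every ε > 0 there exists N (which depends on ε) such that

\[ \sum_{|i| > N} |\Delta u_{i+1/2}| \le \min\left\{ \frac{\varepsilon}{2}, \frac{\varepsilon}{2 \lambda K_B} \right\}. \]

Thus, we have

\begin{align*}
TV(u) &\le \frac{\varepsilon}{2} + \sum_{i=-N}^{N} |\Delta u_{i+1/2}| \\
&\le \frac{\varepsilon}{2} + \sum_{i=-N}^{N} |\Delta v_{i+1/2}| + \lambda C_{N+1} |\Delta u_{N+3/2}| + \lambda D_{-N} |\Delta u_{-N-1/2}| \\
&\le \frac{\varepsilon}{2} + TV(v) + \lambda K_B \left( |\Delta u_{N+3/2}| + |\Delta u_{-N-1/2}| \right) \\
&\le \frac{\varepsilon}{2} + TV(v) + \lambda K_B \sum_{|i| > N} |\Delta u_{i+1/2}| \\
&\le \frac{\varepsilon}{2} + TV(v) + \lambda K_B \cdot \frac{\varepsilon}{2 \lambda K_B} \\
&\le TV(v) + \varepsilon.
\end{align*}

Finally, letting ε → 0 gives TV(u) ≤ TV(v), as required.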
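The total variation bound can be observed directly in the simplest setting. The sketch below assumes the linear single-point upstream flux F(u, v) = a u with a > 0, for which the implicit scheme reduces to a forward substitution; `cfl` stands for λa, the inflow value is held at the first cell value, and the random initial data are arbitrary.

```python
import numpy as np

def implicit_upwind_step(v, cfl):
    """Solve u_i - v_i + cfl*(u_i - u_{i-1}) = 0 by forward substitution,
    with the inflow value held at v[0]."""
    u = np.empty_like(v)
    upstream = v[0]
    for i in range(len(v)):
        u[i] = (v[i] + cfl * upstream) / (1.0 + cfl)
        upstream = u[i]
    return u

def total_variation(u):
    return float(np.sum(np.abs(np.diff(u))))

rng = np.random.default_rng(1)
u = rng.uniform(0.0, 1.0, 200)   # rough initial data with large variation
tv_history = [total_variation(u)]
for _ in range(20):
    u = implicit_upwind_step(u, cfl=5.0)   # well above the explicit limit
    tv_history.append(total_variation(u))
```

The recorded values in `tv_history` never increase, even though the step uses a CFL number of 5, and the solution stays within the range of the initial data.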

For the next lemma, recall that if u is an entropy-satisfying weak solution to
(2.4.1), it must satisfy an entropy inequality of the form

\[ \varphi(u)_t + \psi(u)_x \le 0 \tag{2.4.6} \]

in the weak sense, where ϕ(·) is an arbitrary $C^2$ function with $\varphi'' > 0$, and (ϕ, ψ) are
related by $\psi' = \varphi' f'$. Kružkov showed in [45] that this formulation is equivalent to
requiring that condition (2.4.3) be satisfied for all c ∈ R.

Lemma 2.13. Assume the hypotheses of Theorem 2.6, and let $\{u_i^n\}$ be the unique
bounded solution satisfying (2.3.1). Let (ϕ, ψ) be an entropy/flux pair. Then there
exist functions Φ = Φ(u) and $\Psi = \Psi(u^-, u^+)$ that are consistent with (ϕ, ψ)
(i.e. Φ(u) = ϕ(u) and Ψ(u, u) = ψ(u)) such that (Φ, Ψ) satisfies a discrete entropy inequality:

\[ \Phi(u_i^{n+1}) - \Phi(u_i^n) + \lambda\left[ \Psi(u_i^{n+1}, u_{i+1}^{n+1}) - \Psi(u_{i-1}^{n+1}, u_i^{n+1}) \right] \le 0. \tag{2.4.7} \]

Proof. The development essentially follows [75]. Define the entropy variables

\[ v := \varphi'(u). \]

Since $\varphi'' > 0$, $\varphi'$ is one-to-one, so we can do a change of variables and let u = u(v).
We can then define the potential function

\[ q(v) = \int_0^v f(u(\eta))\, d\eta, \]

which is used to define the discrete entropy/flux pair:

\begin{align*}
\Phi(u) &:= \varphi(u), \\
\Psi(u_i, u_{i+1}) &:= \frac{1}{2}(v_i + v_{i+1}) F_{i+1/2} - \frac{1}{2}\left( q(v_i) + q(v_{i+1}) \right).
\end{align*}

It is easily seen that Ψ is consistent with ψ (by showing that
$\frac{d}{du}\Psi(u, u) = \varphi' f'$). Then since ϕ = Φ is convex, we have

\[ \Phi(u_i^{n+1}) + \Phi'(u_i^{n+1})(u_i^n - u_i^{n+1}) \le \Phi(u_i^n), \]

so that

\begin{align*}
0 &\ge \Phi(u_i^{n+1}) - \Phi(u_i^n) + v_i^{n+1} (u_i^n - u_i^{n+1}) \\
0 &\ge \Phi(u_i^{n+1}) - \Phi(u_i^n) + \lambda v_i^{n+1} \left( F_{i+1/2}^{n+1} - F_{i-1/2}^{n+1} \right) \\
0 &\ge \Phi(u_i^{n+1}) - \Phi(u_i^n) + \lambda\left[ \left( v_i^{n+1} F_{i+1/2}^{n+1} - q(v_i^{n+1}) \right) - \left( v_i^{n+1} F_{i-1/2}^{n+1} - q(v_i^{n+1}) \right) \right].
\end{align*}

So for (2.4.7) to hold it is sufficient to demonstrate the following inequalities:

\begin{align}
\Psi(u_i, u_{i+1}) &\le v_i F_{i+1/2} - q(v_i), \tag{2.4.8} \\
\Psi(u_{i-1}, u_i) &\ge v_i F_{i-1/2} - q(v_i). \tag{2.4.9}
\end{align}

From the definition of Ψ we get

\begin{align*}
\Psi(u_i, u_{i+1}) &- v_i F_{i+1/2} + q(v_i) \\
&= \frac{1}{2}(v_i + v_{i+1}) F_{i+1/2} - \frac{1}{2}\left( q(v_i) + q(v_{i+1}) \right) - v_i F_{i+1/2} + q(v_i) \\
&= \frac{1}{2}(v_{i+1} - v_i) F_{i+1/2} - \frac{1}{2}\left( q(v_{i+1}) - q(v_i) \right) \\
&= \frac{1}{2} \int_{v_i}^{v_{i+1}} \left[ F_{i+1/2} - f(u(\eta)) \right] d\eta \\
&= \frac{1}{2} \int_{v_i}^{v_{i+1}} \left[ F(u_i, u_{i+1}) - F(u(\eta), u(\eta)) \right] d\eta \\
&= \frac{1}{2} \int_{v_i}^{v_{i+1}} \left[ F(u_i, u_{i+1}) - F(u_i, u(\eta)) + F(u_i, u(\eta)) - F(u(\eta), u(\eta)) \right] d\eta,
\end{align*}

where η lies between $v_i$ and $v_{i+1}$. If $v_i \le v_{i+1}$, then $u_i \le u(\eta) \le u_{i+1}$, so that

\begin{align*}
F(u_i, u_{i+1}) - F(u_i, u(\eta)) &\le 0, \\
F(u_i, u(\eta)) - F(u(\eta), u(\eta)) &\le 0,
\end{align*}

and hence the integrand is non-positive. Analogously, $v_i \ge v_{i+1}$ implies that the
integrand is non-negative, so either way the integral cannot be positive, thus proving
(2.4.8). Relation (2.4.9) is proved similarly, and the lemma follows.
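The two inequalities can be spot-checked numerically for a concrete entropy pair. The sketch below assumes the Burgers flux f(u) = u²/2 with the Engquist-Osher numerical flux and the quadratic entropy ϕ(u) = u²/2, so that v = u, q(v) = v³/6 and ψ(u) = u³/3. These choices are illustrative and are not the flux functions studied in this chapter.

```python
import numpy as np

def eo_flux(a, b):
    """Engquist-Osher flux for f(u) = u^2/2: monotone and consistent."""
    return 0.5 * np.maximum(a, 0.0) ** 2 + 0.5 * np.minimum(b, 0.0) ** 2

def q(v):
    """Potential q(v): integral of f(u(eta)) = eta^2/2 from 0 to v."""
    return v ** 3 / 6.0

def discrete_entropy_flux(a, b):
    """Psi(a, b) = (v_a + v_b)/2 * F(a, b) - (q(v_a) + q(v_b))/2 with v = u."""
    return 0.5 * (a + b) * eo_flux(a, b) - 0.5 * (q(a) + q(b))

rng = np.random.default_rng(2)
a = rng.uniform(-2.0, 2.0, 10000)   # left states
b = rng.uniform(-2.0, 2.0, 10000)   # right states

# (2.4.8): Psi(a, b) - (v_a F - q(v_a)) should be <= 0.
gap_right = discrete_entropy_flux(a, b) - (a * eo_flux(a, b) - q(a))
# (2.4.9): Psi(a, b) - (v_b F - q(v_b)) should be >= 0.
gap_left = discrete_entropy_flux(a, b) - (b * eo_flux(a, b) - q(b))
```

Up to rounding, `gap_right` is non-positive and `gap_left` is non-negative for every sampled pair, and Ψ(u, u) reproduces ψ(u) = u³/3.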

2.5 Accuracy of phase-based upstreamed solutions


In this section, we investigate the accuracy of the numerical solution obtained from
phase-based upstreaming when we vary the spatial and temporal grid. Our test case
consists of the 1D countercurrent flow problem (§2.2.1), with domain Ω = [0, 1].
Water is injected at the boundary xD = 0 and a pressure boundary condition is
maintained at xD = 1. The hyperbolic form of the problem is described by

\[ \frac{\partial S}{\partial t_D} + \frac{\partial f(S)}{\partial x_D} = 0. \]

The flux function f (S) is shown in Figure 1(b), with a sonic point at S = 0.49;
countercurrent flow occurs whenever S ≥ 0.49. The initial saturation profile is a step
function with

\[ S^0(x_D) = \begin{cases} 1, & 0 \le x_D < 0.2, \\ 0, & 0.2 < x_D \le 1. \end{cases} \]

The numerical solution is compared with the analytical solution at time tD = 0.15.
Because of the sonic point, the solution contains two shocks connected by a rarefac-
tion; one shock moves to the right with a velocity of 3.9, and the other travels to the
left with a velocity of −1.2. When considering the accuracy of a numerical solution,
two error measures are shown:

• The L1 -error, which is the difference between the numerical and the analytical
solution in the L1 -norm;

• The front dispersion, which is the distance between the analytical shock front and
the leftmost point at which the numerical solution becomes zero.

We also measure how difficult the nonlinear problem is by showing, for each test case,
the average number of nonlinear Gauss-Seidel iterations required to converge each
time step. We remark that this measure is only useful for problems with countercur-
rent flow, since Gauss-Seidel always converges in one iteration in the cocurrent case
(cf. section 2.3.6).

2.5.1 Refinement under fixed mesh ratio


Here we refine the grid under a fixed mesh ratio ∆x/∆t, which in turn yields a fixed
CFL number of 4.10, well above the CFL limit for explicit schemes. Figure 2.3
shows the plots for N = 25, 50, 100, 200, 400, and Table 2.1 shows the L1-error and
front dispersion data.

Table 2.1: Accuracy of numerical solutions for a fixed CFL number.

  N     tD/∆t   CFL    L1-error   Front dispersion   Average # iterations
  25    5       4.10   0.0889     > 0.215            5.2
  50    10      4.10   0.0665     > 0.215            4.9
  100   20      4.10   0.0444     0.116              4.4
  200   40      4.10   0.0273     0.066              4.2
  400   80      4.10   0.0168     0.039              4.1

The plots show that the numerical solution converges to the
analytical solution even though the CFL number is greater than 1, which confirms
our analysis. Moreover, both the L1-error and the front dispersion converge
slightly slower than linearly: on the finer grids they shrink by a factor of about 0.61
and about 0.58 respectively for every refinement by a factor of two. Also note the poor
resolution near the left boundary for N = 25, 50, 100, where instead of approaching S = 1,
the solution is closer to Sc ≈ 0.27 at the left boundary. On these coarser grids, the
numerical solution does not clearly resolve whether the left-moving wave has reached the
boundary, which is maintained at S(x = 0) = Sc (see Equation (2.2.10)). For higher resolutions
(N = 200, 400), the artifact has disappeared and the numerical solution reproduces
the back end of the saturation profile quite accurately. The average number of Gauss-Seidel
iterations required for convergence is similar on all grids, so refining the grid for a fixed
mesh ratio does not increase the difficulty of the problem for the nonlinear solver.
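The refinement ratios quoted above can be recomputed from the entries of Table 2.1. The short script below does this and also converts each ratio into an observed order of accuracy, −log2(ratio); the helper names are hypothetical, and the front dispersion is only used for N ≥ 100 because the coarser grids report lower bounds.

```python
import math

# Data from Table 2.1 (N = 25, 50, 100, 200, 400).
l1_error = [0.0889, 0.0665, 0.0444, 0.0273, 0.0168]
front_dispersion = [0.116, 0.066, 0.039]   # N = 100, 200, 400 only

def refinement_ratios(values):
    """Ratio of successive entries, i.e. error(2N) / error(N)."""
    return [values[k + 1] / values[k] for k in range(len(values) - 1)]

def observed_orders(values):
    """Observed order p such that error ~ N**(-p) between successive grids."""
    return [-math.log2(r) for r in refinement_ratios(values)]

l1_ratios = refinement_ratios(l1_error)
front_ratios = refinement_ratios(front_dispersion)
l1_orders = observed_orders(l1_error)
```

On the two finest refinements the L1 ratio is about 0.61 to 0.62, corresponding to an observed order of roughly 0.7, while the first refinement (N = 25 to 50) is noticeably slower.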

2.5.2 Spatial refinement for fixed time steps


Here, we refine the spatial grid only while fixing the time-step size. Figure 2.4
and Table 2.2 show the results for N = 25, 50, 100, 200, 400, and a time-step size
of ∆tD = 0.0075, i.e. we use 20 time steps to integrate up to tD = 0.15. We see that
even though the N = 25 case has a CFL number close to 1, the grid is clearly too
coarse, and the shock front is very poorly resolved. The accuracy increases substantially
when the spatial grid is refined to N = 50, 100, even though the CFL number
becomes progressively larger; thus, the CFL number by itself is not a good measure
of solution quality. However, the improvement due to spatial grid refinement becomes
marginal for N > 100, since time discretization is now the dominant source of error.

Table 2.2: Accuracy of numerical solutions for a fixed time step size.

  N     tD/∆t   CFL     L1-error   Front dispersion   Average # iterations
  25    20      1.02    0.0673     > 0.215            2.6
  50    20      2.05    0.0529     0.156              3.3
  100   20      4.10    0.0444     0.116              4.4
  200   20      8.20    0.0378     0.101              6.4
  400   20      16.40   0.0366     0.094              9.2

In addition, the average number of iterations required to attain convergence increases
with each refinement: as we refine the grid, we are solving increasingly difficult
problems, even though the improvement in solution accuracy will stagnate beyond a
certain point. Thus, even though the fully-implicit method can tolerate arbitrarily
large CFL numbers, one should not hope to improve solution accuracy indefinitely
simply by using a finer spatial grid, without making a corresponding reduction of
time-step size.

2.5.3 Non-uniform grids


The real advantage of the fully-implicit method over an explicit scheme lies in its
efficiency when applied to a heterogeneous problem, where the porosity φ(x) and
permeability K(x) can vary by orders of magnitude over the domain. In these problems,
the CFL condition is determined by the minimum porosity in the domain, which
can be much smaller than the average porosity. To illustrate this point, we show an
example in which the spatial grid is non-uniform (which, based on Section 2.3.7, is
equivalent to the spatially-varying porosity case). The non-uniform grid contains 50
gridblocks, with ∆xmax/∆xmin = 96. Figure 2.5 and Table 2.3 compare the numerical
solutions obtained from this grid to the uniform-grid solutions. We see that the
solutions are qualitatively (from the plots) and quantitatively (from the L1-error and
front dispersion) not very different from their uniform counterparts, even though the
CFL number is about 50 times larger in the non-uniform case. Thus, an explicit integrator
would have to take unacceptably small time steps, whereas an implicit method
allows time steps that are much more reasonable.

Table 2.3: Accuracy of numerical solutions for a non-uniform grid.

  Grid          N    tD/∆t   CFL      L1-error   Front dispersion   Avg. # its
  Non-uniform   50   20      105.80   0.0566     0.180              3.2
  Non-uniform   50   50      42.30    0.0475     0.132              2.2
  Uniform       50   20      2.05     0.0529     0.156              3.3
  Uniform       50   50      0.82     0.0435     0.116              2.1

In addition, the average number of iterations required for Gauss-Seidel convergence is
roughly the same for both the uniform and non-uniform case, so the resulting nonlinear
equations are not harder to solve, despite the large CFL numbers.

[Figure 2.3 consists of five panels plotting Sw against xD on [0, 1]: (a) N = 25, T/dt = 5;
(b) N = 50, T/dt = 10; (c) N = 100, T/dt = 20; (d) N = 200, T/dt = 40; (e) N = 400,
T/dt = 80; each with CFL = 4.0956.]
Figure 2.3: Numerical solution at different resolutions, CFL = 4.10, tD = 0.15.

[Figure 2.4 consists of five panels plotting Sw against xD on [0, 1], all with T/dt = 20:
(a) N = 25, CFL = 1.0239; (b) N = 50, CFL = 2.0478; (c) N = 100, CFL = 4.0956;
(d) N = 200, CFL = 8.1913; (e) N = 400, CFL = 16.3826.]
Figure 2.4: Numerical solution for different spatial grids, 20 time steps, tD = 0.15.

[Figure 2.5 consists of four panels plotting Sw against xD on [0, 1], all with N = 50:
(a) T/dt = 20, CFL = 105.8195; (b) T/dt = 50, CFL = 42.3278; (c) T/dt = 20,
CFL = 2.0478; (d) T/dt = 50, CFL = 0.81913.]
Figure 2.5: Numerical solutions obtained from a non-uniform grid ((a) and (b)) and
their uniform-grid counterparts ((c) and (d)), tD = 0.15.
Chapter 3

Potential Ordering

The main theme of this thesis is the reordering of equations and variables in a way
that allows a partial decoupling of the problem into a sequence of single-cell problems
that are easier to solve. The basic insight is to perform reordering based on flow
direction information, which is provided by the pressure field. This approach is in-
tuitive because saturation information travels from upstream to downstream, so one
expects methods that respect this ordering to be more efficient than methods that
are blind to upstream information. We have already seen in section 2.3.6 that in the
1D cocurrent case, nonlinear Gauss-Seidel converges in exactly one iteration if, and
only if, the cells are ordered from upstream to downstream. Thus, ordering can have
a large impact on the performance of solution algorithms.
For a problem with np phases, there are np equations and unknowns associated
with each block, which means there are multiple ways of ordering these equations
while respecting the direction of flow. We can distinguish between the following two
categories of ordering:
1. Cell-based ordering, in which all the equations and variables aligned with a cell
(control volume) are grouped together as a block, and reordering only applies
at the cell level;

2. Phase-based ordering, in which all the equations and variables corresponding
to a particular phase p are grouped together, and a different cell ordering can
be used for each block.

The two approaches are useful in different situations and they both contribute to
the various nonlinear solvers and preconditioning algorithms presented in subsequent
chapters.

3.1 Methods derived from cell-based ordering


In cell-based approaches, the cells are ordered along the flow direction (based on either
the total velocity field or the pressure field of the “dominant” phase). The single-
cell problems thus obtained are generally np -by-np systems of nonlinear equations
corresponding to local mass balances, one per component. Decoupling occurs by
assuming that the inward fluxes from upstream cells are known, and that downstream
dependence is weak enough so that the single-cell solution will not be significantly
affected. These approaches work well for cocurrent flow problems because downstream
dependence is effectively nil in such cases, and there is no ambiguity in the ordering
since all phases flow in the same direction.

Cascade method

The Cascade method was proposed by Appleyard and Cheshire [4] as an acceleration
scheme for the basic Newton method. A brief description of the method follows.
Suppose we have an $n_p$-phase model ($n_p$ = 2 or 3) in which we discretize the domain
into N gridblocks. The first step in the Cascade method is the same as in the ordinary
Newton method: namely, we linearize the $n_p N$ conservation equations and solve the
$n_p N$-by-$n_p N$ linear system $J(x^{(\nu)})\,\delta x^{(\nu)} = -R^{(\nu)}(x^{(\nu)})$ for $\delta x^{(\nu)}$.
Next, we apply a linear update to pressure variables $P_o$ only, leaving the saturations intact
for the time being. Using this new pressure field, we update the potential for each phase, and
then we order the cells from the highest potential to the lowest. This is the order
in which the Cascade sweep should be performed. Note that there is a choice in the
ordering because the potential sequence can be different for each phase. Appleyard
and Cheshire suggest that one Cascade sweep be done for the potential sequence
of each phase, although the method was only demonstrated for a two-phase flow
problem.

1  Form the full Jacobian J, evaluated at $(S^k, P_o^k)$;
2  Solve $J \begin{pmatrix} \delta S^k \\ \delta P_o^k \end{pmatrix} = r^k$;
3  Compute $P_o^{k+1} = P_o^k + \delta P_o^k$;
4  Reorder the cells so that $P_{o,i} \ge P_{o,j}$ whenever i < j;
5  For i = 1, . . . , N :
6      Solve (3.1.1) at cell i for $S_{w,i}$ and $P_{o,i}$;
7      Update $S_{w,i}^{k+1}$ using the value from line 6;
8      Compute outward fluxes $FO_p(S_w, P_o)$ for subsequent i;
9  end for

Figure 3.1: One iteration of the Cascade method [4].

Each Cascade sweep requires the solution of $N$ single-cell problems, where $N$ is
the number of cells in the grid. For a two-phase problem, a single-cell problem has
the form

\[
\begin{aligned}
f_o(S_w, P_o) &= \frac{1}{\Delta t}\,\Delta M_o(S_w, P_o) + FO_o(S_w, P_o) - FI_o - q_o = 0, \\
f_w(S_w, P_o) &= \frac{1}{\Delta t}\,\Delta M_w(S_w, P_o) + FO_w(S_w, P_o) - FI_w - q_w = 0,
\end{aligned}
\tag{3.1.1}
\]

where ∆Mp is the accumulation of phase p, F Op and F Ip are the outward and inward
fluxes of phase p respectively, and qp are the well terms. For a three-phase problem,
we would have three such equations. We assume that the inward fluxes are known
and independent of the values of Sw and Po at the cell, which is valid provided
that all neighboring cells at a higher potential have been processed, and there is
no countercurrent flow. We now have a system of two nonlinear equations in two
unknowns, which can be efficiently solved for Sw and Po . The computed Sw are taken
to be the saturation solution for the nonlinear iteration, and the computed F Op are
used as the influx for subsequent single-cell problems. The computed $P_o$, on the other
hand, are discarded: their only purpose is to ensure local mass conservation for
both phases, and they do not yield an accurate approximation of the global pressure field.
In other words, the approximate solution $(S^{(\nu)}, P_o^{(\nu)})$ takes its saturation values from
the single-cell problems, but its pressure values from the linear update.
Figure 3.1 outlines one step of the Cascade method.
3.1. METHODS DERIVED FROM CELL-BASED ORDERING 73

Consider a one-dimensional model problem with

• incompressible flow,

• an injection boundary condition on the left,

• a pressure boundary condition on the right, and

• no countercurrent flow (e.g., horizontal reservoir with no capillarity).

It can be shown that the Cascade method converges to the solution in two iterations
for this problem (see Appendix C for a proof). However, this ceases to be true in the
presence of countercurrent flow or in multiple dimensions. Also, the formulation may
break down if the phase potential chosen to order the cells contains local minima; in
this case, the cell whose potential is at a local minimum will lack an outward flux term
F Op , so it would be impossible to satisfy mass balance for both phases no matter
what Sw and Po are. This is an important drawback because in practical applications
it is usually impossible to guarantee the absence of local minima in the pressure field
when the solution has not converged, especially when the initial guess is poor.

The Natvig approach

Natvig, Lie and Eikemo [56] proposed a cell-based reordering method for solving the
multiphase advection problem in the absence of gravity and capillarity. In [56] the
reordering was applied to equations obtained from a discontinuous Galerkin discretiza-
tion, but it can equally be applied to the standard finite volume methods described
in section 1.2.1. Basically, a topological sort (cf. [22]) is performed on the directed
acyclic graph G = (V, E), whose nodes V are the control volumes, and whose edges E
are the directions of the total velocity across cell interfaces (which coincide with the
flow directions for each phase, since there is no countercurrent flow). The single-cell
problems, each consisting of an $n_p$-by-$n_p$ nonlinear system, are solved in topological
order from upstream to downstream, for example by Newton's method. Since the
pressure and total velocity fields are regarded as part of the data rather than the
unknowns, this ordering completely decouples the system, just as Gauss-Seidel is exact
for cocurrent 1D flow. In fact, this approach can be regarded as a block nonlinear
Gauss-Seidel method, which is exact as long as the nonlinear system is block lower
triangular. Again, convergence is no longer superlinear when countercurrent flow is
present, and a robust implementation in the block case becomes non-trivial (see [28]
for a discussion).

3.2 Phase-based ordering


In this section, we present an ordering of equations and unknowns that allows us to
solve for saturation one unknown at a time, even in multiple dimensions and/or in
the presence of gravity and countercurrent flow. First, we explain how to construct
this ordering in the absence of countercurrent flow; in this case we recover the Ap-
pleyard and Cheshire Cascade ordering [4]. We then extend the ordering to treat
countercurrent flow due to gravity, and finally we show how to deal with capillarity.

3.2.1 Cocurrent flow


Consider the two-phase model outlined in section 1.1. In the absence of gravity and
capillary forces, all phases will be flowing in the same direction, which is given by the
negative pressure gradient −∇p (i.e., from high to low pressure). Thus, in the finite
volume discretization, the flux term between cells i and l,

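\[
F_{il} =
\begin{cases}
K \cdot \dfrac{k_{rp}(S_l)}{\mu_p} \cdot \dfrac{p_l - p_i}{\Delta x}, & p_l \ge p_i, \\[2ex]
K \cdot \dfrac{k_{rp}(S_i)}{\mu_p} \cdot \dfrac{p_l - p_i}{\Delta x}, & p_l \le p_i,
\end{cases}
\tag{3.2.1}
\]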

depends only on the saturation of the upstream cell. Suppose we reorder the cells such
that they appear in decreasing order of pressure, i.e. pi ≥ pj whenever i < j. Then
for all j, the component conservation equations for cell j depend only on saturations
Si with i ≤ j. Thus, we can rearrange the system of nonlinear equations to the form

\[
\begin{aligned}
f_{c1}(S_1, p_1, \ldots, p_N) &= 0 \\
f_{c2}(S_1, S_2, p_1, \ldots, p_N) &= 0 \\
&\;\;\vdots \\
f_{cN}(S_1, S_2, \ldots, S_N, p_1, \ldots, p_N) &= 0,
\end{aligned}
\tag{3.2.2}
\]
3.2. PHASE-BASED ORDERING 75

where c = o, w are the oil and water components, respectively. Notice how the
saturation part of the equations becomes “triangular”. Thus, if we have the exact
pressure solution p1 , . . . , pN , we can perform a “forward substitution” and solve a
series of single-variable nonlinear equations to obtain the saturations S1 , . . . , SN . We
remark that the triangularity carries over to the Jacobian matrix, which now has the
form
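\[
J =
\begin{array}{c@{\quad}l}
\begin{array}{cc} S_w & \;\;p \end{array} & \\
\left[\begin{array}{cc} J_{ww} & J_{wp} \\ J_{ow} & J_{op} \end{array}\right]
& \begin{array}{l} \text{water equation} \\ \text{oil equation} \end{array}
\end{array}
\tag{3.2.3}
\]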
where Jww is lower triangular.
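
To make the forward substitution concrete, the following Python sketch solves the water equations of a hypothetical 1D cocurrent problem one cell at a time, assuming the pressure drops across the cell faces are already known. The quadratic mobility, the unit values of `V` and `T`, and all function names are illustrative assumptions rather than part of the formulation above, and plain bisection stands in for whatever scalar solver one prefers.

```python
def bisect_increasing(g, lo, hi, tol=1e-12):
    """Root of an increasing scalar function g on [lo, hi] by bisection."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            hi = mid
        else:
            lo = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


def water_mobility(s):
    # Assumed quadratic relative permeability with unit viscosity.
    return s * s


def solve_saturations(dp, s_old, q_in, V=1.0, T=1.0):
    """Forward substitution over cells ordered from high to low pressure.

    dp[i] > 0 is the (known) pressure drop across the downstream face of
    cell i; q_in is the water influx into the first cell.
    """
    s_new = []
    influx = q_in
    for i in range(len(dp)):
        # The water equation of cell i involves only S_i and the known influx.
        def residual(s, i=i, influx=influx):
            return V * (s - s_old[i]) + T * water_mobility(s) * dp[i] - influx

        s_i = bisect_increasing(residual, 0.0, 1.0)
        s_new.append(s_i)
        # The outflux of cell i becomes the influx of the next cell.
        influx = T * water_mobility(s_i) * dp[i]
    return s_new


saturations = solve_saturations([1.0] * 4, [0.0] * 4, 0.5)
```

Each scalar residual is increasing in its own saturation, so a bracketing method on $[0, 1]$ is enough here; the sketch presumes that the influx is small enough for a root to exist in that interval.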

In the three-phase case, we have two saturation variables per cell, which we can
choose as Sw and So without loss of generality. Since the black oil model assumes
that krw depends solely on Sw , the above construction can be used to order the water
equations. Now kro depends on both Sw and So , but we can maintain triangularity
by writing all the water equations first before writing the oil and gas equations. The
nonlinear system then looks like

\[
\begin{aligned}
f_{w1}(S_{w1}, p_1, \ldots, p_N) &= 0 \\
f_{w2}(S_{w1}, S_{w2}, p_1, \ldots, p_N) &= 0 \\
&\;\;\vdots \\
f_{wN}(S_{w1}, \ldots, S_{wN}, p_1, \ldots, p_N) &= 0 \\
f_{o1}(S_{w1}, \ldots, S_{wN}, S_{o1}, p_1, \ldots, p_N) &= 0 \\
&\;\;\vdots \\
f_{oN}(S_{w1}, \ldots, S_{wN}, S_{o1}, \ldots, S_{oN}, p_1, \ldots, p_N) &= 0 \\
\text{and}\quad f_{gi}(S_{w1}, \ldots, S_{wN}, S_{o1}, \ldots, S_{oN}, p_1, \ldots, p_N) &= 0, \quad i = 1, \ldots, N.
\end{aligned}
\tag{3.2.4}
\]

In this case the corresponding Jacobian would have the form


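\[
J =
\begin{array}{c@{\quad}l}
\begin{array}{ccc} S_w & \;\;S_o & \;\;p \end{array} & \\
\left[\begin{array}{ccc}
J_{ww} & 0 & J_{wp} \\
J_{ow} & J_{oo} & J_{op} \\
J_{gw} & J_{go} & J_{gp}
\end{array}\right]
& \begin{array}{l} \text{water equation} \\ \text{oil equation} \\ \text{gas equation} \end{array}
\end{array}
\tag{3.2.5}
\]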

with Jww and Joo lower triangular, which implies the entire upper-left block is lower
triangular. Note that Jow will also be lower triangular, since all phases have the same
upstream direction. However, this fact is not needed to justify solving for Sw and So
using forward substitution.

3.2.2 Countercurrent flow due to gravity

In the presence of gravity, buoyancy forces can cause different phases to flow in
opposite directions. The upstream direction for each phase p is determined by the
sign of (Φp,i − Φp,l ), where
Φp,i = pi − γp zi (3.2.6)

is the phase potential at cell i, zi is the depth of the cell, and γp is the specific gravity
of phase p. Despite possible differences in upstream directions, we are interested in
maintaining the triangular forms shown in (3.2.2) and (3.2.4) (and equivalently (3.2.3)
and (3.2.5)). For two-phase flow, one can simply use Φw for ordering, since one only
needs Jww (and not Jow ) to be triangular. For three-phase flow, we need both Jww
and Joo to be lower triangular. Clearly, no cell-based ordering can accomplish this; we
need to order the water and oil phases separately. The trick is to exploit the relative
permeability dependencies (1.1.9) in such a way that triangularity is preserved.

Unlike the cocurrent flow case, we can no longer align the ordering of equations
and variables with cell ordering. Thus, in the sequel, subscripts (such as k in Φp,k )
always denote the value of the scalar field (in this case, the potential of phase p) at cell
k in the natural ordering. This is because we concentrate on ordering the equations
and unknowns, rather than the cells themselves.
Let σ1 , . . . , σN and τ1 , . . . , τN be permutations such that

Φw,σi ≥ Φw,σj whenever i < j, (3.2.7)


Φo,τi ≥ Φo,τj whenever i < j. (3.2.8)

In other words, if cell k is such that Φw,k > Φw,l for every other cell l, then σ1 := k.
Suppose we order first all the water equations and the associated variables Sw using
the σ ordering, and then order the oil equations and the associated variables So using
the τ ordering. The nonlinear system then looks like

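\[
\begin{aligned}
f_{w,\sigma_1}(S_{w,\sigma_1}, p_1, \ldots, p_N) &= 0 \\
f_{w,\sigma_2}(S_{w,\sigma_1}, S_{w,\sigma_2}, p_1, \ldots, p_N) &= 0 \\
&\;\;\vdots \\
f_{w,\sigma_N}(S_{w,\sigma_1}, \ldots, S_{w,\sigma_N}, p_1, \ldots, p_N) &= 0 \\
f_{o,\tau_1}(S_{w,\sigma_1}, \ldots, S_{w,\sigma_N}, S_{o,\tau_1}, p_1, \ldots, p_N) &= 0 \\
&\;\;\vdots \\
f_{o,\tau_N}(S_{w,\sigma_1}, \ldots, S_{w,\sigma_N}, S_{o,\tau_1}, \ldots, S_{o,\tau_N}, p_1, \ldots, p_N) &= 0 \\
\text{and}\quad f_{gi}(S_{w,\sigma_1}, \ldots, S_{w,\sigma_N}, S_{o,\tau_1}, \ldots, S_{o,\tau_N}, p_1, \ldots, p_N) &= 0, \quad i = 1, \ldots, N.
\end{aligned}
\tag{3.2.9}
\]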

Now consider the pattern of the corresponding Jacobian matrix. Clearly, Jww is
still lower triangular because of (3.2.7), and Joo is lower triangular because of (3.2.8).
The only effect of countercurrent flow is that Jow will no longer be lower triangular,
because the Sw are not arranged in the order of decreasing oil potential, Φo . However,
as long as the upper-left 2 × 2 block in (3.2.5) is lower triangular, we can use forward
substitution to solve for Sw and So once the pressures are known.
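
As a small illustration of how the two permutations can differ, the Python sketch below evaluates the phase potentials (3.2.6) on a hypothetical vertical column of four cells and sorts each phase separately. The pressures, depths, and specific gravities are made-up values chosen so that water flows downward and oil flows upward; the function names are likewise assumptions.

```python
def phase_potentials(p, z, gamma):
    """Phi_{p,i} = p_i - gamma_p * z_i for every cell i, as in (3.2.6)."""
    return [p_i - gamma * z_i for p_i, z_i in zip(p, z)]


def potential_ordering(phi):
    """Cell indices sorted by decreasing potential (ties broken by index)."""
    return sorted(range(len(phi)), key=lambda k: (-phi[k], k))


# Hypothetical column: depth increases with the cell index.
pressure = [10.0, 10.8, 11.6, 12.4]
depth = [0.0, 1.0, 2.0, 3.0]
gamma_w, gamma_o = 1.0, 0.6  # assumed specific gravities

sigma = potential_ordering(phase_potentials(pressure, depth, gamma_w))
tau = potential_ordering(phase_potentials(pressure, depth, gamma_o))
```

With these numbers the water ordering `sigma` runs from the top cell to the bottom cell while the oil ordering `tau` runs the other way, which is exactly the situation in which no single cell-based ordering can make both diagonal blocks triangular.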

3.2.3 Capillarity
So far, in the absence of capillary effects, the saturation dependence in each equation
is purely upstream; thus, for a given phase, saturations downstream from cell i do not
appear in equation i. When capillary effects are present, equation i involves phase
pressures from all neighboring cells, be they upstream or downstream from cell i. In
the standard approach, we can only choose one phase pressure as a primary variable;
the other phase pressures must be expressed as

pq = pp + Pcpq (S), (3.2.10)

where pp is the primary phase pressure and pq is the pressure of another phase. Thus,
when capillarity is present, we must choose our primary variables carefully to avoid
introducing downstream dependence on saturation that cannot be removed by simply
reordering the equations and unknowns. Choosing pw , the water-phase pressure, as
the primary pressure variable allows us to maintain the triangularity in the upper-left
block of (3.2.5). Note that choosing pg causes the water equations to depend on So ,
since pw = pg − Pcog (Sg ) − Pcow (Sw ) and Sg = 1 − Sw − So . This would completely
destroy the triangularity of the block. If we instead choose po , then there would be
no So dependence, but there would be both upstream and downstream dependence
on Sw due to pw = po − Pcow (Sw ), which is undesirable. Thus, the only choice that
leaves the water equation intact (i.e., a triangular Jww ) is pw .

We need to ensure that Joo is still lower triangular when pw is used. We have

po = pw + Pcow (Sw ), (3.2.11)

which means we introduce downstream dependence on Sw , but not on So . Hence, the
Jow block will now contain downstream terms, but the Joo block remains unchanged.
Thus, the upper-left block remains triangular, as before. The same analysis carries
over to the nonlinear equations (3.2.9). Table 3.1 summarizes the ordering strategies
for black oil models with different numbers of phases. Note that the gas equations,
whenever they are present, are always ordered last. This is because the gas compo-
nent exists in both the oil and gas phases, so no ordering can produce the required
triangular form when countercurrent flow is present.
Table 3.1: Ordering strategies for different black-oil models.

                                          Cell ordering
Model                Component ordering   water   oil    Primary pressure
2-phase, oil-water   water/oil            Φw      *      pw
2-phase, gas-water   water/gas            Φw      *      pw
2-phase, oil-gas     oil/gas              *       Φo     po
3-phase              water/oil/gas        Φw      Φo     pw

3.2.4 Remarks on implementation


In order to produce cell orderings that satisfy (3.2.7) and (3.2.8), it is not neces-
sary to sort the potentials in decreasing order. Instead, consider the directed graph
G = (V, E) where the nodes V are the cells and the edges E are such that (i, j) is
an edge whenever i and j are neighbors and either Φi > Φj , or Φi = Φj and i > j. Then
a topological ordering of this graph (cf. [22]) will yield an ordering consistent with
either (3.2.7) or (3.2.8), depending on which potential is used. The running time of
this operation is O(N ), which is asymptotically faster than sorting (O(N log N )).
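
The graph construction above can be sketched in Python as follows. This is a minimal version using Kahn's algorithm, which is one standard way to compute a topological ordering; the function name and the adjacency-list input format are assumptions made for the example.

```python
from collections import deque


def potential_topological_order(phi, neighbors):
    """Order cells so that every cell appears after its upstream neighbors.

    phi[i] is the potential of cell i and neighbors[i] lists the cells
    adjacent to i. An edge i -> j exists when i is upstream of j.
    """
    n = len(phi)

    def upstream(i, j):
        # Tie-break on the cell index so that the graph is acyclic.
        return phi[i] > phi[j] or (phi[i] == phi[j] and i > j)

    indegree = [0] * n
    for j in range(n):
        for i in neighbors[j]:
            if upstream(i, j):
                indegree[j] += 1

    ready = deque(k for k in range(n) if indegree[k] == 0)
    order = []
    while ready:
        i = ready.popleft()
        order.append(i)
        for j in neighbors[i]:
            if upstream(i, j):
                indegree[j] -= 1
                if indegree[j] == 0:
                    ready.append(j)
    return order


# Hypothetical 2-by-2 grid with cells numbered row by row.
grid_neighbors = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2]}
order = potential_topological_order([4.0, 2.0, 3.0, 1.0], grid_neighbors)
```

Note that the result places each cell after all of its upstream neighbors, which is what triangularity requires, but it need not be globally sorted by potential: in the small grid above, cell 1 (potential 2) is listed before cell 2 (potential 3) because the two cells are not neighbors. Each cell and each face is visited a bounded number of times, so the cost is linear in the number of cells when every cell has a bounded number of neighbors.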

We also remark that in most simulations, the flow directions do not change very
often, so it may not be necessary to compute this ordering at every time step. For
instance, we could compute the potential ordering only at the beginning of a time
step. At each subsequent Newton iteration, we could simply verify the validity of the
ordering, and only recompute it when the submatrix ceases to be triangular.
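
One inexpensive way to carry out such a validity check is to test the neighbor relations directly instead of inspecting the matrix. The following Python sketch does this under the assumption that the ordering is stored as a list of cell indices; the function name and data layout are hypothetical.

```python
def ordering_is_valid(order, phi, neighbors):
    """Return True if no cell precedes a strictly upstream neighbor.

    order     : list of cell indices, first entry = most upstream cell
    phi       : current phase potential of each cell
    neighbors : neighbors[i] lists the cells adjacent to cell i
    """
    position = {cell: k for k, cell in enumerate(order)}
    for i in range(len(phi)):
        for j in neighbors[i]:
            # i is upstream of j, so i must come first in the ordering.
            if phi[i] > phi[j] and position[i] > position[j]:
                return False
    return True


# Hypothetical 1D chain of three cells.
chain = {0: [1], 1: [0, 2], 2: [1]}
still_valid = ordering_is_valid([0, 1, 2], [3.0, 2.0, 1.0], chain)
needs_update = not ordering_is_valid([0, 1, 2], [3.0, 1.0, 2.0], chain)
```

The check costs one pass over the cell faces, so it is cheaper than recomputing the ordering whenever the flow directions have not changed.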
Chapter 4

Reduced Newton Method

In this chapter, we use the phase-based ordering introduced in section 3.2 to reformu-
late the mass-balance equations into a system of smaller size that involves pressure
variables only. The Implicit Function Theorem [66] plays a central role in the formu-
lation. We first describe the algorithm that arises when Newton’s method is applied
to the reduced system.

4.1 Algorithm description


For notational convenience, we rewrite (3.2.9) by splitting the equations into two
blocks: the first block Fs = 0 contains all the water and oil equations, and the second
block Fg = 0 contains all the gas equations. Similarly, we denote the vector of all
saturation variables (Swi and Soi , i = 1, . . . , N ) by S, and the vector of pressure
variables by p. Then (3.2.9) becomes

\[
\begin{aligned}
F_s(S, p) &= 0 \\
F_g(S, p) &= 0,
\end{aligned}
\tag{4.1.1}
\]

and the corresponding Jacobian J in (3.2.5) becomes

\[
J =
\begin{bmatrix}
J_{ss} & J_{sp} \\
J_{gs} & J_{gp}
\end{bmatrix},
\tag{4.1.2}
\]

4.1. ALGORITHM DESCRIPTION 81

where

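\[
\begin{aligned}
F_s &= [f_{w1}, \ldots, f_{wN}, f_{o1}, \ldots, f_{oN}]^T, \\
F_g &= [f_{g1}, \ldots, f_{gN}]^T, \\
S &= [S_{w1}, \ldots, S_{wN}, S_{o1}, \ldots, S_{oN}]^T, \\
p &= [p_{w1}, \ldots, p_{wN}]^T,
\end{aligned}
\]

and

\[
J_{ss} = \partial F_s/\partial S, \quad J_{sp} = \partial F_s/\partial p, \quad J_{gs} = \partial F_g/\partial S, \quad J_{gp} = \partial F_g/\partial p.
\]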

It can be shown that $J_{ss}$ is nonsingular as long as the monotonicity condition
$dk_{rp}/dS_p \ge 0$ is valid for $p = o, w$ (see Appendix D for the proof). For $k_{rw} = k_{rw}(S_w)$ (which
is usually obtained from experimental data), monotonicity is almost always satisfied,
but the situation is less clear for $k_{ro} = k_{ro}(S_w, S_g)$, since the latter is usually obtained
by interpolating data from oil-water and oil-gas experiments. Certain methods of
interpolation, such as Stone I and Stone II [6], yield monotonic kro under mild con-
ditions (see Appendix D), but this is not always the case for other methods (e.g.,
the segregation model [37]). In this work it is assumed that kro is a monotonically
increasing function of So when Sw is fixed, which would ensure the nonsingularity of
Jss .

Consequently, since Fs (S, p) has a triangular structure with respect to saturation,
one can solve for S one unknown at a time if p is given. In addition, the implicit
function theorem guarantees that if Fs (S0 , p0 ) = 0 and ∂Fs /∂S is nonsingular at
(S0 , p0 ), then there exists a neighborhood U of p0 and a unique differentiable function
S = S(p) such that S(p0 ) = S0 and Fs (S(p), p) = 0 for all p ∈ U . In other words, we
can use Fs as a constraint to define saturation as a function of pressure, and substitute
it into the remaining equations Fg . Thus, we obtain

Fg (S(p), p) = 0, (4.1.3)

which we need to solve for the pressure p. If we use Newton’s method to solve (4.1.3),
82 CHAPTER 4. REDUCED NEWTON METHOD

the Jacobian becomes

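\[
\begin{aligned}
J_{\text{reduced}} &= \frac{\partial F_g}{\partial S}\frac{\partial S}{\partial p} + \frac{\partial F_g}{\partial p} && (4.1.4) \\
&= J_{gs}\frac{\partial S}{\partial p} + J_{gp}. && (4.1.5)
\end{aligned}
\]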

Now ∂S/∂p is given by the implicit function theorem: Fs (S(p), p) ≡ 0 implies

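\[
\frac{\partial F_s}{\partial S}\frac{\partial S}{\partial p} + \frac{\partial F_s}{\partial p} = 0,
\tag{4.1.6}
\]

which we can write as

\[
J_{ss}\frac{\partial S}{\partial p} + J_{sp} = 0.
\tag{4.1.7}
\]

Thus, substituting $\partial S/\partial p = -J_{ss}^{-1} J_{sp}$ into (4.1.5), the reduced Jacobian matrix is

\[
J_{\text{reduced}} = J_{gp} - J_{gs} J_{ss}^{-1} J_{sp},
\tag{4.1.8}
\]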

which is precisely the Schur complement of (4.1.2) with respect to pressure. Figure 4.1
summarizes the algorithm used to solve the reduced system. Notice that the only dif-
ference between the algorithm in Figure 4.1 and Newton’s method applied to the full
problem is the way we compute $S^{k+1}$. In the full method, we set $S^{k+1} = S^k + \delta S^k$; in
the reduced method, $S^{k+1}$ is updated nonlinearly by solving the constraint equations
$F_s(S^{k+1}, p^{k+1}) = 0$, in which the special triangular structure of $J_{ss}$ is exploited. Also
note that since this is just the usual Newton method applied to a reduced problem,
convergence is locally quadratic.
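
The structure of the reduced iteration can be seen on a toy problem with one saturation-like unknown $S$ and one pressure-like unknown $p$. In the Python sketch below, the constraint $F_s(S, p) = S + S^3 - p$ and the remaining equation $F_g(S, p) = S^2 + p - 3$ are invented purely for illustration (their common root is $S = 1$, $p = 2$); they are not the flow equations, and the function names are assumptions.

```python
def solve_constraint(p):
    """Solve F_s(S, p) = S + S**3 - p = 0 for S; F_s is increasing in S."""
    lo, hi = -(abs(p) + 1.0), abs(p) + 1.0  # bracket that always contains the root
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mid + mid ** 3 - p > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def reduced_newton(p, tol=1e-12, max_iter=50):
    """Newton's method on the reduced residual F_g(S(p), p)."""
    history = []
    for _ in range(max_iter):
        S = solve_constraint(p)       # nonlinear update: F_s(S, p) = 0 holds
        r = S ** 2 + p - 3.0          # reduced residual F_g(S(p), p)
        history.append(abs(r))
        if abs(r) < tol:
            break
        J_ss, J_sp = 1.0 + 3.0 * S ** 2, -1.0
        J_gs, J_gp = 2.0 * S, 1.0
        J_reduced = J_gp - J_gs * J_sp / J_ss  # scalar analogue of (4.1.8)
        p -= r / J_reduced
    return solve_constraint(p), p, history


S_star, p_star, residuals = reduced_newton(5.0)
```

Only the pressure-like unknown is updated linearly; the saturation-like unknown is always recomputed from the constraint, so the constraint residual is zero (to solver tolerance) at every iterate.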

Sequential updating of the saturations

The algorithm in Figure 4.1 requires the solution of $F_s(S^{k+1}, p_w^{k+1}) = 0$ for $S^{k+1}$ at
every step. Using the potential ordering presented in Section 3.2, we can triangularize
the constraint equations to obtain the system (3.2.9). Thus, given the pressure
values $p_1, \ldots, p_N$, we first solve $f_{w1} = 0$ for $S_{w1}$. Then, using this $S_{w1}$ we can now
solve $f_{w2} = 0$ for $S_{w2}$, and so on until we obtain all saturation values. Thus, solving
$F_s(S^{k+1}, p_w^{k+1}) = 0$ reduces to solving $(n_p - 1)N$ nonlinear scalar equations one at
a time (where $n_p$ is the number of fluid phases).

1  while $\|F_g(S(p_w^k), p_w^k)\| > \text{tol}$, do
2      Form the full Jacobian $J = \begin{bmatrix} J_{ss} & J_{sp} \\ J_{gs} & J_{gp} \end{bmatrix}$, evaluated at $(S(p_w^k), p_w^k)$;
3      Solve $(J_{gp} - J_{gs} J_{ss}^{-1} J_{sp})\,\delta p^k = -r^k$;
4      Compute $p_w^{k+1} = p_w^k + \delta p^k$;
5      Update $S^{k+1} = S(p_w^{k+1})$ nonlinearly by solving $F_s(S^{k+1}, p_w^{k+1}) = 0$,
6      one variable at a time in potential ordering;
7      $k := k + 1$
8  end

Figure 4.1: Algorithm for solving the reduced system (4.1.3).

A wide variety of reliable univariate solvers are available to deal with the nonlinear single-cell problems. One such
choice is the van Wijngaarden-Dekker-Brent Method [14], which combines bisection
with inverse quadratic interpolation to obtain superlinear convergence. This is a
derivative-free algorithm, which means only function values are required, although an
initial guess based on the solution of the ordinary Newton step can be used to accel-
erate convergence. In a reasonably efficient implementation, each function evaluation
should only require a few floating-point operations. As shown in section 4.3, the extra
cost of the single-cell nonlinear solves is usually offset by a reduction in the number
of global Newton steps. The nonlinear updates can be performed quite efficiently if
more sophisticated zero-finders are used.

Solving the Schur complement system

There are two ways to solve the Schur complement system

Jreduced δp = −r. (4.1.9)

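The first way is to notice that one can solve the equivalent system

\[
\begin{bmatrix}
J_{ss} & J_{sp} \\
J_{gs} & J_{gp}
\end{bmatrix}
\begin{bmatrix}
\delta S \\
\delta p
\end{bmatrix}
=
\begin{bmatrix}
0 \\
-r
\end{bmatrix}.
\tag{4.1.10}
\]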
Krylov subspace methods (such as GMRES) can be used, and effective preconditioners
(such as the Constrained Pressure Residual method [81]) are available. A second
way is to apply the Krylov method directly to the Schur complement system. In
this approach, matrix-vector multiplication by $J_{\text{reduced}}$ would have the same cost as
multiplication by the full matrix, because $J_{ss}$ is lower triangular, so that multiplication
by $J_{ss}^{-1}$ is simply a forward substitution. In terms of preconditioning, one can either
precondition $J_{\text{reduced}}$ directly with ILU-type methods, or use an induced preconditioner
based on the full system by letting

\[
M_{\text{reduced}}^{-1} = R\, M_{\text{full}}^{-1} R^T,
\tag{4.1.11}
\]

where $M_{\text{full}}^{-1}$ is the preconditioner for the full system, and $R = \begin{bmatrix} 0 & I \end{bmatrix}$ is the restriction
operator to the pressure variables. In other words, a preconditioning step for the
reduced system $y = M_{\text{reduced}}^{-1} x$ consists of the following steps:

1. Pad the vector $x$ with zeros to form $\hat{x} = \begin{pmatrix} 0_{(n_p - 1)N} \\ x \end{pmatrix}$.

2. Compute $\hat{y} = M_{\text{full}}^{-1} \hat{x}$.

3. Let $y$ be the portion of $\hat{y}$ corresponding to pressure variables, i.e., retain only
the last $N$ elements of $\hat{y}$.

One potential advantage of applying the Krylov method to the Schur complement
system rather than the full system is that the resulting Krylov vectors are only of
length N rather than length np N , where np is the number of fluid phases. This greatly
reduces storage requirements and orthogonalization cost in methods such as GMRES,
so that more Krylov steps can be taken before restarting.
In fact, the Schur complement reduction can be used even if the nonlinear con-
straint equations are not exactly satisfied. This could happen if the initial pressure
guess is so poor that some of the residual constraints in the reduced Newton cannot
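be satisfied. In that case we would have

\[
\begin{bmatrix}
J_{ss} & J_{sp} \\
J_{gs} & J_{gp}
\end{bmatrix}
\begin{bmatrix}
\delta S \\
\delta p
\end{bmatrix}
= -
\begin{bmatrix}
r_s \\
r_g
\end{bmatrix}.
\tag{4.1.12}
\]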
4.2. CONVERGENCE ANALYSIS 85

But this is equivalent to solving

\[
J_{\text{reduced}}\,\delta p = -(r_g - J_{gs} J_{ss}^{-1} r_s),
\tag{4.1.13}
\]

which has the same form as (4.1.9). All these options are evaluated in chapter 5.
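
The equivalence between (4.1.12) and (4.1.13) can be checked numerically on a tiny example. The Python sketch below assumes, purely for brevity, two saturation unknowns and a single pressure unknown, so that $J_{sp}$ and $J_{gs}$ are short vectors and $J_{gp}$ is a scalar; all names and numbers are hypothetical. Applying $J_{ss}^{-1}$ is done by forward substitution, as described above.

```python
def forward_substitute(L, b):
    """Solve L y = b for a lower-triangular matrix L given as a list of rows."""
    y = []
    for i, row in enumerate(L):
        acc = b[i] - sum(row[j] * y[j] for j in range(i))
        y.append(acc / row[i])
    return y


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def reduced_newton_step(Jss, Jsp, Jgs, Jgp, rs, rg):
    """Solve the block system (4.1.12) through the Schur complement (4.1.13)."""
    w = forward_substitute(Jss, Jsp)   # Jss^{-1} Jsp
    v = forward_substitute(Jss, rs)    # Jss^{-1} rs
    j_reduced = Jgp - dot(Jgs, w)      # Schur complement (a scalar here)
    dp = -(rg - dot(Jgs, v)) / j_reduced
    # Back out dS from the first block row: Jss dS = -(rs + Jsp dp).
    dS = [-(v_i + w_i * dp) for v_i, w_i in zip(v, w)]
    return dS, dp


dS, dp = reduced_newton_step(
    Jss=[[2.0, 0.0], [1.0, 3.0]],
    Jsp=[1.0, 2.0],
    Jgs=[4.0, 5.0],
    Jgp=10.0,
    rs=[1.0, 2.0],
    rg=3.0,
)
```

Substituting the returned `dS` and `dp` back into the three rows of the full system reproduces the right-hand side $-(r_s, r_g)$, which confirms that the reduced solve and the full solve give the same pressure update.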

4.2 Convergence analysis


This section is devoted to the analysis of the reduced Newton method. For simplicity
we assume we are dealing with the discrete version of the two-phase, 1D model prob-
lem outlined in section 2.2.1. Though simple, this model problem captures the nature
of the nonlinearity of the reduced objective function for transport in porous media. A
physical argument (supported by numerical evidence [82]) suggests that nonlinearity
due to pressure is negligible unless highly compressible components (such as gas) are
present in the system.
There are two basic mechanisms that guarantee the convergence of Newton’s
method. The first mechanism is contraction, i.e., when the Newton mapping

\[
g : x \mapsto x - (f'(x))^{-1} f(x)
\]

is contractive. Classical convergence theorems of this type include the
Newton-Kantorovich and Newton-Mysovskikh theorems [29]. When the objective function
f is a scalar function, the proofs simplify, and the following theorem can be estab-
lished.

Theorem 4.1. Let $f$ be a $C^2$ function over some interval $J$, and let $I = (a, b) \subset J$
be an open interval such that $f'(x) \ne 0$ on $I$ and

\[
x \in I \implies \frac{|f(x) f''(x)|}{|f'(x)|^2} < 1.
\tag{4.2.1}
\]

Let $x^* \in I$ be such that $f(x^*) = 0$, and let $L = \min\{|x^* - a|, |b - x^*|\}$. Then $x^*$ is
the unique root of $f$ in $I$, and for any initial guess $x_0 \in (x^* - L, x^* + L)$, the Newton
iteration
\[
x_{k+1} = x_k - \frac{f(x_k)}{f'(x_k)}
\]
converges quadratically to $x^*$.

The above theorem, while establishing quadratic convergence, is inherently a local
result. This is because the quantity $|f(x) f''(x)/f'(x)^2|$ is typically small only in the
vicinity of a root. Since our goal is to prove global convergence over a very wide
parameter space of relative permeability functional forms and initial guesses, it is
very difficult to ensure that criterion (4.2.1) is satisfied in every case. Thus, we must
exploit the other mechanism for convergence, namely, the convexity of the objective
function.

4.2.1 Convex functions and Newton’s method


Recall that a function f : [a, b] → R is convex if, for all x, y ∈ [a, b] and 0 ≤ t ≤ 1,
we have
f ((1 − t)x + ty) ≤ (1 − t)f (x) + tf (y).

Convex functions enjoy many nice properties (such as continuity everywhere and
differentiability almost everywhere [59]), but for our purposes we mainly consider C 2
functions. The following properties are used repeatedly in our analysis.

Lemma 4.2 (Properties of convex functions [59, §3.4]). Let $f : [a, b] \to \mathbb{R}$ be a $C^2$
function. Then the following are equivalent:

1. $f$ is convex on $[a, b]$;

2. $f'(x)(y - x) \le f(y) - f(x)$ for all $x, y \in [a, b]$;

3. $f''(x) \ge 0$ for all $x \in [a, b]$.

Theorem 4.3 (Monotone convergence of Newton's method). Let $f : \mathbb{R} \to \mathbb{R}$ be a $C^2$
function such that $f'(x) > 0$ everywhere, and let $x^*$ be such that $f(x^*) = 0$. Suppose
there is a semi-infinite interval $I = [c, \infty)$ such that $x^* \in I$ and $f''(x) \ge 0$ for all
$x \in I$. Then Newton's method converges to $x^*$ for an arbitrary initial guess $x_0$. In
addition, if $f(x_0) \ge 0$, then the Newton iterates converge monotonically, i.e.

\[
x_0 \ge x_1 \ge \cdots \ge x_k \ge \cdots \ge x^*.
\]

Proof. First, assume that $f(x_k) \ge 0$ for some $k \ge 0$. Then $x_k \ge x^*$ because $f$ is
increasing, and $f$ is convex on the interval $[x^*, x_k]$, so Lemma 4.2 (with $x = x_k$ and $y = x^*$) gives

\[
f'(x_k)(x_k - x^*) \ge f(x_k) - f(x^*) = f(x_k).
\]

Since $f'(x_k) > 0$, rearranging gives

\[
x^* \le x_k - \frac{f(x_k)}{f'(x_k)} = x_{k+1},
\]

which, together with the fact that $f(x_k) \ge 0$, implies $x^* \le x_{k+1} \le x_k$. So $f(x_{k+1}) \ge 0$
and $f$ is convex on $[x^*, x_{k+1}]$. Induction now shows that

\[
x_k \ge x_{k+1} \ge \cdots \ge x^*,
\]

which means $\{x_k\}_{k=1}^{\infty}$ is a decreasing sequence bounded below by $x^*$; thus, the
sequence converges to a limit $\tilde{x}$. Since the Newton mapping $g : x \mapsto x - (f'(x))^{-1} f(x)$
is continuous, we must have

\[
\tilde{x} = \lim g(x_k) = g(\tilde{x}),
\]

so that
\[
\tilde{x} = \tilde{x} - (f'(\tilde{x}))^{-1} f(\tilde{x}),
\]

which implies $f(\tilde{x}) = 0$. Hence $\tilde{x} = x^*$, and Newton's method converges to $x^*$. The
second statement of the theorem now follows, since it is just the special case $k = 0$.
Now assume, on the contrary, that there is no $k$ such that $f(x_k) \ge 0$. In other
words, we have $f(x_k) < 0$ for every $k \ge 0$. But this implies:

1. $x_k < x^*$ for all $k$, since $f$ is monotonically increasing, and

2. $x_{k+1} > x_k$ for all $k$, since $f(x_k) < 0$.

This means $\{x_k\}_{k=1}^{\infty}$ is an increasing sequence bounded above by $x^*$, so it converges to
some limit, which must then be equal to $x^*$ by the continuity of the Newton mapping.
So in both cases, Newton’s method converges to the root x∗ , as required.
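
The behavior described by Theorem 4.3 is easy to observe numerically. The Python sketch below applies Newton's method to $f(x) = e^x - 2$, an example chosen here only because it is increasing and convex everywhere with root $x^* = \ln 2$; the function names are assumptions.

```python
import math


def newton_iterates(f, df, x0, tol=1e-14, max_iter=100):
    """Return the list of Newton iterates x_0, x_1, ... for f(x) = 0."""
    xs = [x0]
    for _ in range(max_iter):
        step = f(xs[-1]) / df(xs[-1])
        xs.append(xs[-1] - step)
        if abs(step) < tol:
            break
    return xs


def f(x):
    return math.exp(x) - 2.0   # increasing and convex on the whole real line


def df(x):
    return math.exp(x)


from_right = newton_iterates(f, df, 3.0)    # f(x0) >= 0: monotone decrease
from_left = newton_iterates(f, df, -1.0)    # f(x0) < 0: first step overshoots
```

Starting to the right of the root, the iterates decrease monotonically toward $\ln 2$. Starting to the left, the first step lands to the right of the root, after which the same monotone decrease takes over.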

To exploit this useful connection between convex functions and Newton conver-
gence, we make the following assumptions on the relative permeability functions.

Assumption 4. We assume that the following properties hold for all saturations
0 ≤ Sw ≤ 1:

1. $\lambda_T(S_w) = k_w(S_w)/\mu_w + k_o(S_w)/\mu_o > 0$ (Uniform ellipticity),

2. $k_w'(S_w) \ge 0$, $k_o'(S_w) \le 0$ (Phase mobilities increasing with phase saturations),

3. $k_w''(S_w) \ge 0$, $k_o''(S_w) \ge 0$ (Convex relperms).

Uniform ellipticity is an essential assumption that is required for the well-posedness
of the elliptic subproblem. The requirements on $k_{rw}'$ and $k_{ro}'$ are the same as those
in chapter 2, which, as indicated previously, are physically realistic. Convexity of the
relative permeabilities is the only additional assumption, and most commonly used
relative permeability functions, such as those due to Honarpour et al. [42], satisfy
this requirement.

4.2.2 The cocurrent case: large ∆t


Recall that in the cocurrent case, the upstream direction is the same for both phases,
i.e., λp,i+1/2 = λp (Si ) for p = o, w. For analysis purposes, we perform a linear change
of variables by defining πi = (pi − pi+1 )/∆x, the pressure gradient at the cell interface
i + 1/2. Then the mass balance equations become

\[
\begin{aligned}
F_{wi} &= V_i S_i - K_{i-1}\lambda_w(S_{i-1})\pi_{i-1} + K_i\lambda_w(S_i)\pi_i - q_{wi}, \\
F_{oi} &= -V_i S_i - K_{i-1}\lambda_o(S_{i-1})\pi_{i-1} + K_i\lambda_o(S_i)\pi_i - q_{oi},
\end{aligned}
\tag{4.2.2}
\]

where Vi = φi ∆x/∆t and Ki is the (absolute) permeability between blocks i and i+1.
We note that applying Newton’s method to this modified system will yield pressure
profiles that are identical to those obtained from applying Newton’s method to the
original system, since all we did is a linear change of independent variables.
Now consider applying reduced Newton to (4.2.2), i.e., we use the water phase
equations as the constraints required to define the implicit functions

Si = Si (π1 , . . . , πi ).

Then we can rewrite (4.2.2) as

\[
\begin{aligned}
F_{wi}(\pi_1, \ldots, \pi_i) &= V_i S_i(\pi_1, \ldots, \pi_i) + K_i\lambda_w(S_i(\pi_1, \ldots, \pi_i))\pi_i - f_{wi}(\pi_1, \ldots, \pi_{i-1}) \equiv 0, \\
F_{oi}(\pi_1, \ldots, \pi_i) &= -V_i S_i(\pi_1, \ldots, \pi_i) + K_i\lambda_o(S_i(\pi_1, \ldots, \pi_i))\pi_i - f_{oi}(\pi_1, \ldots, \pi_{i-1}),
\end{aligned}
\tag{4.2.3}
\]

where fwi and foi are influxes from the upwind cell, which do not depend on the
pressure gradient πi . Thus, our approach for proving convergence is as follows: we
show that for fixed π1 , . . . , πi−1 , the objective function Foi is strictly increasing and
convex with respect to πi over a semi-infinite interval containing the root πi∗ . Thus,
Newton’s method converges for any starting point within this interval. Then an
induction argument, together with the continuous dependence of Foi on the influx
foi (π1 , . . . , πi−1 ), will guarantee global convergence of Newton’s method for the whole
system.
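
To get a feel for the scalar objective function, the Python sketch below evaluates $F_{oi}(\pi_i)$ for a single cell under several assumptions that are not taken from the text: the cell is the first one, water enters at a fixed rate `Q` with no oil influx, the mobilities are quadratic ($\lambda_w = S^2/\mu_w$, $\lambda_o = (1 - S)^2/\mu_o$), and the numerical values are arbitrary. With quadratic $\lambda_w$ the water constraint is a quadratic equation in $S$, so the implicit function can be written in closed form.

```python
import math

# Hypothetical cell data: V = phi*dx/dt, permeability K, water influx Q.
V, K, Q, MU_W, MU_O = 0.1, 1.0, 0.5, 1.0, 1.0


def saturation(pi):
    """S(pi) from the water constraint V*S + K*(S**2/MU_W)*pi - Q = 0."""
    a = K * pi / MU_W
    return (-V + math.sqrt(V * V + 4.0 * a * Q)) / (2.0 * a)


def oil_residual(pi):
    """Reduced oil residual F_o(pi) = -V*S(pi) + K*lambda_o(S(pi))*pi."""
    s = saturation(pi)
    return -V * s + K * (1.0 - s) ** 2 / MU_O * pi


def find_root(lo, hi, tol=1e-13):
    """Bisection on the increasing function oil_residual."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if oil_residual(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


pi_star = find_root(0.5, 10.0)
s_star = saturation(pi_star)
```

Sampling `oil_residual` on a grid shows that it increases with $\pi_i$, and at the computed root the two equations add up to the total balance $K \lambda_T(S) \pi = Q$, which is a quick consistency check on the setup.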

Remark. Without loss of generality, we can restrict our attention to how reduced
Newton behaves inside the positive orthant {πi > 0, i = 1, . . . , N }. Let πi∗ denote the
solution of the i-th cell problem (so that Foi (π1∗ , . . . , πi∗ ) = 0). Since flow is cocurrent
and the total velocity is positive, each πi∗ must be positive. Moreover, because of
uniform ellipticity, we have the lower bound

\[
\pi_i^* \ge \frac{q}{K_i (\lambda_T)_{\max}} = \frac{q \min\{\mu_o, \mu_w\}}{K_i},
\tag{4.2.4}
\]

where the last equality holds by convexity.


We are now ready to show that reduced Newton converges when ∆t is large, pro-
vided we make a few additional assumptions that are satisfied by quadratic relative
permeabilities. In the next section, we derive a modified reduced Newton iteration
that is provably convergent for all ∆t without the need of these additional assump-
tions.

Proposition 4.4. Assume $k_w$ and $k_o$ are both uniformly convex, i.e., there exist
positive constants $c_w$ and $c_o$ such that $k_w'' \ge c_w$ and $k_o'' \ge c_o$ for all $S \in [0, 1]$. Let
$k_w'(0) = 0$. Then there exists $S_c > 0$ such that $\lambda_w' + \lambda_o' \le 0$ for all $0 \le S_w \le S_c$.

Proof. Since $k_o'(S_w = 1) \le 0$ and $k_o''(S_w) \ge c_o > 0$, we must have $k_o'(S_w = 0) \le -c_o$,
so that $\lambda_w' + \lambda_o' \le -c_o/\mu_o < 0$ at $S_w = 0$. Thus, by continuity, there exists a non-trivial
neighborhood around zero, say $0 \le S_w \le S_c$, such that $\lambda_w' + \lambda_o'$ takes on negative
values.

Lemma 4.5 (Monotonicity and convexity with respect to $\pi_i$). Assume the hypotheses
of Proposition 4.4. Let $\pi_j > 0$ for all $j$. Then $\partial F_{oi}/\partial \pi_i > 0$, and there exists $\gamma_0 > 0$
such that $\partial^2 F_{oi}/\partial \pi_i^2 \ge 0$ whenever $V_i/(K_i \pi_i) \le \gamma_0$.

Proof. The water phase constraint yields

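\[
\frac{\partial S_i}{\partial \pi_i} = -\frac{K_i \lambda_w}{V_i + K_i \lambda_w' \pi_i},
\]

implying that
\[
\frac{\partial F_{oi}}{\partial \pi_i} = K_i \lambda_o + \frac{K_i \lambda_w (V_i - K_i \lambda_o' \pi_i)}{V_i + K_i \lambda_w' \pi_i},
\]
which is positive for $\pi_i > 0$ if the fluid properties in Assumption 4 hold. Similarly,
the second derivative is given by

\[
\frac{\partial^2 F_{oi}}{\partial \pi_i^2} = -\frac{K_i \lambda_w}{(V_i + K_i \lambda_w' \pi_i)^2}
\left\{ 2 V_i K_i (\lambda_w' + \lambda_o')
- \frac{K_i \lambda_w}{V_i + K_i \lambda_w' \pi_i}
\left[ K_i^2 \pi_i^2 (\lambda_o'' \lambda_w' - \lambda_w'' \lambda_o') + V_i K_i \pi_i (\lambda_w'' + \lambda_o'') \right] \right\}.
\]

The terms inside the square brackets are non-negative by Assumption 4. Thus, if
$\lambda_w' + \lambda_o' \le 0$, then $\partial^2 F_{oi}/\partial \pi_i^2 \ge 0$ automatically. If $\lambda_w' + \lambda_o' > 0$, then we need

\[
2 V_i K_i (\lambda_w' + \lambda_o') \le \frac{K_i \lambda_w}{V_i + K_i \lambda_w' \pi_i}
\left[ K_i^2 \pi_i^2 (\lambda_o'' \lambda_w' - \lambda_w'' \lambda_o') + V_i K_i \pi_i (\lambda_w'' + \lambda_o'') \right].
\]

Cross-multiplying and setting $\gamma = V_i/(K_i \pi_i)$ gives

\[
A\gamma^2 + B\gamma + C \le 0
\tag{4.2.5}
\]

with

\[
\begin{aligned}
A &= 2(\lambda_w' + \lambda_o'), \\
B &= 2\lambda_w'(\lambda_w' + \lambda_o') - \lambda_w(\lambda_w'' + \lambda_o''), \\
C &= -\lambda_w(\lambda_o'' \lambda_w' - \lambda_w'' \lambda_o').
\end{aligned}
\]

Since $A > 0$ and $C < 0$, we deduce that (4.2.5) is satisfied iff

\[
\gamma \le \frac{-B + \sqrt{B^2 - 4AC}}{2A} = \frac{-2C}{B + \sqrt{B^2 - 4AC}}.
\]

Thus, $\partial^2 F_{oi}/\partial \pi_i^2 \ge 0$ if $\gamma \le \gamma_0$, where

\[
\gamma_0 = \min_{S_c \le S \le 1} \frac{-2C}{B + \sqrt{B^2 - 4AC}}.
\tag{4.2.6}
\]

We exclude the interval $[0, S_c)$ from the minimization because $\lambda_w' + \lambda_o' < 0$ there. This
implies $\gamma_0 > 0$, since

\[
-C \ge \lambda_w(S_c)\,\lambda_w'(S_c)\,\lambda_{o,\min}'' \ge \frac{c_w^3 c_o}{2\mu_w^2 \mu_o} > 0
\]

and the denominator is bounded.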

Lemma 4.5 implies that if $V_i \le \theta \gamma_0 q \min\{\mu_o, \mu_w\}$ with $\theta < 1$, then $F_{oi}$ is convex
in the interval $\theta q \min\{\mu_o, \mu_w\} \le \pi_i < \infty$, which contains $\pi_i^*$. Hence, by Theorem 4.3,
the sequence of Newton iterates $\{\pi_i^{(k)}\}$ converges monotonically to $\pi_i^*$ provided the
influx is constant. In particular, the first cell converges if $\Delta t$ is large enough. Global
convergence follows by induction and continuity.

4.2.3 The general cocurrent case


The rather weak result on convergence in the previous section is due to our inability
to ascertain convexity of the objective function except under fairly limited circum-
stances. Figure 4.2 plots the objective function Foi (πi ) for different time-step sizes.
We see that for large ∆t, the objective function is indeed convex over a semi-infinite
interval containing the root, but this is not always the case for smaller ∆t, espe-
cially for unfavorable mobility ratios. In practice, our numerical results show that
convergence still occurs, but this is due to contraction rather than convexity. In or-
der to ensure global convergence based on a convexity argument, we need to make a
small modification to the reduced Newton algorithm. The following lemma is the key
observation.

Lemma 4.6. Let $(\pi_1, \dots, \pi_N) > 0$ be given. Suppose we define the implicit functions
$S_i^{(1)}(\pi_1, \dots, \pi_i)$ and $S_i^{(2)}(\pi_1, \dots, \pi_i)$ via the constraints

\[
\begin{aligned}
F_{wi}(\pi_1, \dots, \pi_i) &= V_i S_i^{(1)}(\pi_1, \dots, \pi_i) + K_i\lambda_w\big(S_i^{(1)}(\pi_1, \dots, \pi_i)\big)\pi_i - f_{wi}(\pi_1, \dots, \pi_{i-1}) \equiv 0, \\
F_{oi}(\pi_1, \dots, \pi_i) &= -V_i S_i^{(2)}(\pi_1, \dots, \pi_i) + K_i\lambda_o\big(S_i^{(2)}(\pi_1, \dots, \pi_i)\big)\pi_i - f_{oi}(\pi_1, \dots, \pi_{i-1}) \equiv 0,
\end{aligned}
\tag{4.2.7}
\]

respectively. Now consider the reduced functions

\[
\begin{aligned}
\bar F_{oi}(\pi_1, \dots, \pi_i) &= -V_i S_i^{(1)}(\pi_1, \dots, \pi_i) + K_i\lambda_o\big(S_i^{(1)}(\pi_1, \dots, \pi_i)\big)\pi_i - f_{oi}(\pi_1, \dots, \pi_{i-1}), \\
\bar F_{wi}(\pi_1, \dots, \pi_i) &= V_i S_i^{(2)}(\pi_1, \dots, \pi_i) + K_i\lambda_w\big(S_i^{(2)}(\pi_1, \dots, \pi_i)\big)\pi_i - f_{wi}(\pi_1, \dots, \pi_{i-1}).
\end{aligned}
\tag{4.2.8}
\]

Then both $\bar F_{oi}$ and $\bar F_{wi}$ are increasing with respect to $\pi_i$, and at least one of $\bar F_{oi}$ and
$\bar F_{wi}$ must be a convex function over a semi-infinite interval containing the root $\pi_i^*$.

[Figure 4.2 plots omitted. Both panels show $F_o$ versus $\Delta p/\Delta x$ for $\Delta t = 1, 2, 5, 20$.]

Figure 4.2: Reduced Newton residual functions for various ∆t. Top: favorable mo-
bility ratio (µo /µw = 0.1). Bottom: unfavorable mobility ratio (µo /µw = 10).

Proof. We have shown in Lemma 4.5 that

\[
\frac{\partial S_i^{(1)}}{\partial\pi_i} = -\frac{K_i\lambda_w}{V_i + K_i\lambda_w'\pi_i},
\]

and that the first and second derivatives of $\bar F_{oi}$ are

\[
\frac{\partial\bar F_{oi}}{\partial\pi_i} = K_i\lambda_o + \frac{K_i\lambda_w\,(V_i - K_i\lambda_o'\pi_i)}{V_i + K_i\lambda_w'\pi_i}
\]

and

\[
\frac{\partial^2\bar F_{oi}}{\partial\pi_i^2}
= -\frac{K_i\lambda_w}{(V_i + K_i\lambda_w'\pi_i)^2}
\left\{ 2V_iK_i(\lambda_w' + \lambda_o')
- \frac{K_i\lambda_w}{V_i + K_i\lambda_w'\pi_i}
\left[ K_i^2\pi_i^2(\lambda_o''\lambda_w' - \lambda_w''\lambda_o') + V_iK_i\pi_i(\lambda_w'' + \lambda_o'') \right] \right\},
\]

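where the $\lambda_p$ and their derivatives are evaluated at $S_i^{(1)}(\pi_1, \dots, \pi_i)$. A similar calcu-
lation shows that

\[
\frac{\partial S_i^{(2)}}{\partial\pi_i} = \frac{K_i\lambda_o}{V_i - K_i\lambda_o'\pi_i},
\qquad
\frac{\partial\bar F_{wi}}{\partial\pi_i} = K_i\lambda_w + \frac{K_i\lambda_o\,(V_i + K_i\lambda_w'\pi_i)}{V_i - K_i\lambda_o'\pi_i}
\]

and

\[
\frac{\partial^2\bar F_{wi}}{\partial\pi_i^2}
= \frac{K_i\lambda_o}{(V_i - K_i\lambda_o'\pi_i)^2}
\left\{ 2V_iK_i(\lambda_w' + \lambda_o')
+ \frac{K_i\lambda_o}{V_i - K_i\lambda_o'\pi_i}
\left[ K_i^2\pi_i^2(\lambda_o''\lambda_w' - \lambda_w''\lambda_o') + V_iK_i\pi_i(\lambda_w'' + \lambda_o'') \right] \right\},
\]

where the $\lambda_p$ and their derivatives are now evaluated at $S_i^{(2)}(\pi_1, \dots, \pi_i)$. By definition,
at the solution $\pi_i^*$ we must have $S_i^{(1)}(\pi_1, \dots, \pi_i^*) = S_i^{(2)}(\pi_1, \dots, \pi_i^*) =: S_i^*$. Moreover,
we must have $S_i^{(1)} \le S_i^* \le S_i^{(2)}$ over the interval $[\pi_i^*, \infty)$ because $\partial S_i^{(1)}/\partial\pi_i \le 0$ and
$\partial S_i^{(2)}/\partial\pi_i \ge 0$. We now consider two cases: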

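1. $\lambda_w'(S_i^*) + \lambda_o'(S_i^*) \ge 0$. Then since $S_i^{(2)} \ge S_i^*$ for $\pi_i \ge \pi_i^*$, the convexity of $\lambda_w$ and
   $\lambda_o$ implies $\lambda_w'(S_i^{(2)}) + \lambda_o'(S_i^{(2)}) \ge 0$. Hence $\partial^2\bar F_{wi}/\partial\pi_i^2 \ge 0$ for all $\pi_i \ge \pi_i^*$.

2. $\lambda_w'(S_i^*) + \lambda_o'(S_i^*) \le 0$. Then since $S_i^{(1)} \le S_i^*$ for $\pi_i \ge \pi_i^*$, the convexity of $\lambda_w$ and
   $\lambda_o$ implies $\lambda_w'(S_i^{(1)}) + \lambda_o'(S_i^{(1)}) \le 0$. Hence $\partial^2\bar F_{oi}/\partial\pi_i^2 \ge 0$ for all $\pi_i \ge \pi_i^*$.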

Thus, at least one of the two reduced functions is convex over a semi-infinite interval
containing πi∗ , as required.

The above lemma tells us that if we knew ahead of time the slope of the total
mobility curve at the solution, we could always pick the correct reduced function (or
equivalently, the correct constraint) for each cell in order to achieve global conver-
gence. Unfortunately, this information is usually not available. However, if we switch
constraints when non-convexity is detected, then we can be certain that the new
reduced function must be convex, so convergence is now guaranteed. The modified
algorithm is shown in Figure 4.3. The convexity test in line 9 is motivated by Theorem
4.3. Assume all cells upstream of $i$ have converged. If the current residual function
is convex and $F_{gi}(\pi_i^k) > 0$, then we should have $F_{gi}(\pi_i^{k+1}) > 0$ as well. Thus, if the
latter condition is violated, non-convexity is detected, so we should switch constraints
and work with the other residual function, which must be convex. In practice, we
may not want to swap constraints every time the residual becomes negative for the
following reasons:

• The upstream cells may not have converged;

• When the nonlinear iterate is close to the solution (but has not yet converged),
the residual can have the wrong sign even when convex objective functions are
used. This is because the linear and nonlinear equations that define the Newton
steps are themselves solved inexactly by inner iterations;

• Frequent constraint switches can lead to a deterioration in global convergence.

As a result, we should switch constraints only when the overshoot is severe enough
that we are certain no progress has been made. The parameter 0 < θ < 1 in line 9
achieves this purpose: if the new residual changes sign but has a significantly smaller
magnitude, we accept the current constraint and continue; on the other hand, large
overshoots cause the constraint to switch. It is fairly easy to convince oneself that
the modified algorithm converges for all initial guesses inside the positive orthant
{πi > 0, i = 1, . . . , N }.

 1  Initialize constraint set s := {F_w1, ..., F_wN} and its complement g := s'
 2  while ||F_g(S^k, p^k)|| > tol, do
 3      Form the full Jacobian J = [ J_ss  J_sp ; J_gs  J_gp ], evaluated at (S^k, p^k);
 4      Solve (J_gp − J_gs J_ss^{-1} J_sp) δp^k = −r^k;
 5      Compute p^{k+1} = p^k + δp^k;
 6      Update S^{k+1} nonlinearly by solving F_s(S^{k+1}, p^{k+1}) = 0,
 7          one variable at a time in potential ordering;
 8      for i = 1, ..., n, do
 9          if F_gi(S^{k+1}, p^{k+1}) < −θ F_gi(S^k, p^k), then
10              s := s ∪ {F_gi} \ {F_si}    (Swap constraints)
11              g := g ∪ {F_si} \ {F_gi}
12          end if
13      end
14      k := k + 1
15  end

Figure 4.3: Modified reduced Newton algorithm.
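
The swap test in line 9 of Figure 4.3 can be illustrated in isolation. The sketch below applies the test cell by cell to arrays of old and new residuals; the boolean array `use_water_constraint`, the function name `swap_mask`, and the sample residual values are hypothetical, and the old residuals are assumed positive as in the discussion above.

```python
import numpy as np

def swap_mask(F_new, F_old, theta=0.5):
    """Line 9 of Figure 4.3: flag cells whose residual overshoots past -theta*F_old."""
    return np.asarray(F_new) < -theta * np.asarray(F_old)

# True means the water equation is currently the constraint for that cell.
use_water_constraint = np.array([True, True, True])

F_old = np.array([1.0, 1.0, 1.0])     # residuals at iteration k (assumed positive)
F_new = np.array([0.2, -0.1, -0.9])   # residuals at iteration k + 1

mask = swap_mask(F_new, F_old, theta=0.5)
# Lines 10-11: swap the roles of the two equations in the flagged cells only.
use_water_constraint = np.where(mask, ~use_water_constraint, use_water_constraint)
print(mask, use_water_constraint)
```

In this example the second cell changes sign but with a much smaller magnitude, so its constraint is kept; only the third cell, with a severe overshoot, is switched.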

4.2.4 The countercurrent flow case


When gravity effects are included, countercurrent flow may be present in some parts
of the domain. In a cell experiencing countercurrent flow, the mass balance equations
take the form

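\[
\begin{aligned}
F_{wi} &= V_i S_i - K_{i-1}\lambda_w(S_{i-1})\pi_{i-1} + K_i\lambda_w(S_i)\pi_i - q_{wi}, \\
F_{oi} &= -V_i S_i - K_{i-1}\lambda_o(S_i)(\pi_{i-1} - \Delta\rho g\Delta z) + K_i\lambda_o(S_{i+1})(\pi_i - \Delta\rho g\Delta z) - q_{oi}.
\end{aligned}
\tag{4.2.9}
\]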

Here πi = (Φw,i − Φw,i+1 )/∆x denotes the gradient for the water potential. The
presence of countercurrent flow introduces several complications in our attempt to
analyze the convergence behavior of the reduced Newton algorithm:

1. Convergence can no longer be analyzed by considering a sequence of independent
   single-cell problems. Since the objective functions $F_{oi}$ now contain downstream
   dependencies, all the cells within the countercurrent flow region are fully
   coupled.

2. The flow direction of the oil phase, which depends on $\pi_i^* - \Delta\rho g\Delta z$, is generally
   not known until the problem has converged.

3. The objective function becomes non-differentiable when the upstream direction
   changes.

Our experiments show that when significant countercurrent flow is present, it is pos-
sible that reduced Newton no longer converges to the solution for every initial guess,
especially when a large time step is taken. It is then natural to try to identify condi-
tions for which the reduced Newton procedure converges.

A domain of dependence argument

To derive a criterion that would ensure convergence, we turn to a heuristic argument
based on the domain of dependence. In the theory of numerical methods for hyperbolic
PDEs, the Courant-Friedrichs-Lewy (CFL) condition states that if a numerical
method is stable, then its numerical domain of dependence must be at least as large
as the domain of dependence of the underlying PDE (cf. [49]). In this context, the
superior convergence behavior of reduced Newton for cocurrent flow can be explained
as follows: the implicit function Si , defined by the water phase constraint

\[
F_{wi} = V_i(S_i - S_i^0) - K_{i-1}\lambda_w(S_{i-1})\pi_{i-1} + K_i\lambda_w(S_i)\pi_i - q_{wi} \equiv 0,
\]

is actually a function of the arguments

\[
S_i = S_i(\pi_1, \dots, \pi_i;\ S_1^0, \dots, S_i^0),
\]

where $\{S_i^0\}$ denotes the initial saturation profile, i.e., the saturation profile at the
beginning of the time step. Thus, the objective function $F_{oi}$ actually depends implicitly
on the old saturation values $S_1^0, \dots, S_i^0$ as well as the pressure gradients $\pi_1, \dots, \pi_i$.
Since the characteristics of the PDE only travel from left to right in the cocurrent
case, the “domain of dependence” of reduced Newton contains the domain of depen-
dence of the PDE for any ∆t. As a result, one can expect a fairly stable method for
a wide range of initial guesses. On the other hand, in the countercurrent flow case,
the objective function $F_{oi}$ takes the form

\[
\begin{aligned}
F_{oi} &= -V_i S_i - K_{i-1}\lambda_o(S_i)(\pi_{i-1} - \Delta\rho g\Delta z) + K_i\lambda_o(S_{i+1})(\pi_i - \Delta\rho g\Delta z) - q_{oi} \\
&= F_{oi}(S_i(\cdots), S_{i+1}(\cdots), \pi_{i-1}, \pi_i) \\
&= F_{oi}(\pi_1, \dots, \pi_{i+1};\ S_1^0, \dots, S_{i+1}^0).
\end{aligned}
\]

Thus, if ∆t is so large that the waves traveling to the left (i.e., countercurrent to
the main flow direction) can cross more than one cell boundary, then the domain of
dependence of reduced Newton will fail to cover the physical domain of dependence.
In such cases, one cannot generally expect global convergence of the reduced Newton
iterations. Since the fastest backward-moving wave travels at the speed of
$v_{\min} = \min_{S\in[0,1]} q_T f_w'$, where

\[
f_w = \frac{\lambda_w}{\lambda_T}\left(1 + \frac{K_i\lambda_o}{q_T}\,\Delta\rho g\Delta z\right), \tag{4.2.10}
\]

we can expect reduced Newton to converge whenever

\[
-\Delta t\, q_T f'_{w,\min} \le \phi_i\Delta x. \tag{4.2.11}
\]

Thus, if $f_w' \ge 0$ everywhere (i.e., we have cocurrent flow), we expect reduced Newton
to converge for any $\Delta t$. If countercurrent flow is present, then there is a range of $S$
over which $f_w' < 0$, in which case we would have the time-step restriction

\[
\Delta t \le \frac{\phi_i\Delta x}{q_T\,|f'_{w,\min}|}, \tag{4.2.12}
\]

which is effectively a CFL limit for backward-traveling waves.
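
A minimal sketch of how the bound (4.2.12) might be evaluated is given below. It assumes quadratic mobilities and lumps the gravity term into a single hypothetical parameter `G` $= K_i\Delta\rho g\Delta z/q_T$; the derivative of (4.2.10) is written out analytically under that assumption. The function names and the sample values are illustrative only and do not reproduce the CFL numbers quoted for the examples in this chapter.

```python
import numpy as np

def dfw_dS(S, G, mu_w=0.3, mu_o=1.0):
    """Derivative of f_w = (lambda_w/lambda_T)*(1 + G*lambda_o), cf. (4.2.10).

    Assumes lambda_w = S^2/mu_w and lambda_o = (1 - S)^2/mu_o (illustrative).
    """
    lw, lo = S**2 / mu_w, (1.0 - S) ** 2 / mu_o
    dlw, dlo = 2.0 * S / mu_w, -2.0 * (1.0 - S) / mu_o
    lT = lw + lo
    return ((dlw * lo - lw * dlo) + G * (dlw * lo**2 + dlo * lw**2)) / lT**2

def max_time_step(phi, dx, q_T, G, n=1001):
    """Largest dt allowed by the backward CFL limit (4.2.12); inf if f_w' >= 0."""
    S = np.linspace(0.0, 1.0, n)
    fp_min = dfw_dS(S, G).min()
    if fp_min >= 0.0:
        return np.inf              # cocurrent flow: no restriction
    return phi * dx / (q_T * (-fp_min))

print(max_time_step(phi=0.3, dx=4.0, q_T=1.0, G=0.0))   # no gravity
print(max_time_step(phi=0.3, dx=4.0, q_T=1.0, G=5.0))   # gravity-dominated
```

Without gravity the derivative is non-negative everywhere and the function returns infinity; with a sufficiently large gravity term a finite limit appears.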

A monotonicity argument

We have shown in Lemma 4.5 that in the cocurrent case, the objective function is
monotonically increasing (∂Foi /∂πi > 0). Monotonicity is an important property
if global convergence to a unique solution is to be expected: non-monotonic func-
tions necessarily have local minima or maxima, which cause breakdown in Newton’s
method. Thus, a reasonable criterion for ensuring convergence is one that guarantees
monotonicity of the objective function. We can mimic the proof of Lemma 4.5 and
compute the partial derivative $\partial\hat F_{oi}/\partial\pi_i$, where

\[
\hat F_{oi} = -V_i S_i + K_i\lambda_o(S_i)(\pi_i - \Delta\rho g\Delta z) - f_{oi}(\pi_1, \dots, \pi_{i-1}).
\]

In other words, we perform the analysis as though the upstream direction is to the left.
Even though this upstream direction may be incorrect, the analysis is still valuable
for the following reason: since the correct upstream direction is generally unknown
before the solution has converged, a robust algorithm should still be able to make
some progress even when the upstream direction is wrong. The algorithm should
produce an answer that would cause a switch in the upstream direction in the next
iteration, but it should not overshoot by so much as to cause the overall algorithm
to fail. These desirable properties are only possible when F̂oi is monotonic, so our
analysis can still provide a useful criterion for convergence.

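We have

\[
\begin{aligned}
\frac{\partial\hat F_{oi}}{\partial\pi_i}
&= -V_i\frac{\partial S_i}{\partial\pi_i} + K_i\lambda_o + K_i\lambda_o'(\pi_i - \Delta\rho g\Delta z)\frac{\partial S_i}{\partial\pi_i} \\
&= \frac{K_i}{V_i + K_i\lambda_w'\pi_i}\Big[\lambda_T V_i + K_i\lambda_o\lambda_w'\pi_i - K_i\lambda_w\lambda_o'(\pi_i - \Delta\rho g\Delta z)\Big].
\end{aligned}
\]

Using the relations

\[
\pi_i = \frac{1}{\lambda_T}\Big(q_T/K_i + \lambda_o\Delta\rho g\Delta z\Big),
\qquad
\pi_i - \Delta\rho g\Delta z = \frac{1}{\lambda_T}\Big(q_T/K_i - \lambda_w\Delta\rho g\Delta z\Big),
\]

we can rewrite $\partial\hat F_{oi}/\partial\pi_i$ as

\[
\frac{\partial\hat F_{oi}}{\partial\pi_i} = \frac{K_i\lambda_T}{V_i + K_i\lambda_w'\pi_i}\Big[V_i + q_T f_w'\Big],
\]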

where $f_w(S)$ is defined in (4.2.10). Thus, the objective function $\hat F_{oi}$ is monotonically
increasing whenever

\[
V_i = \frac{\phi_i\Delta x}{\Delta t} \ge -q_T f_w',
\]

which is exactly the same as (4.2.11). As shown in Example 4.3.1, criterion
(4.2.11) is usually enough for reduced Newton to converge. For problems of practical
interest, the backward CFL number is usually much smaller than the forward CFL
number, so reduced Newton can generally converge with much larger time steps than
standard Newton even in the countercurrent flow setting. In the next section, we
show a variety of examples that demonstrate the effectiveness of the reduced Newton
algorithm.
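
For readers who want the substitution step in the monotonicity argument spelled out, the following is one way to carry it out; it uses only the two relations for $\pi_i$ and $\pi_i - \Delta\rho g\Delta z$ and the definition (4.2.10) of $f_w$, with $\lambda_T = \lambda_w + \lambda_o$.

\[
\begin{aligned}
K_i\lambda_o\lambda_w'\pi_i - K_i\lambda_w\lambda_o'(\pi_i - \Delta\rho g\Delta z)
&= \frac{1}{\lambda_T}\Big[ q_T(\lambda_w'\lambda_o - \lambda_w\lambda_o') + K_i\Delta\rho g\Delta z\,(\lambda_w'\lambda_o^2 + \lambda_o'\lambda_w^2) \Big], \\
q_T f_w'
&= \frac{1}{\lambda_T^2}\Big[ q_T(\lambda_w'\lambda_o - \lambda_w\lambda_o') + K_i\Delta\rho g\Delta z\,(\lambda_w'\lambda_o^2 + \lambda_o'\lambda_w^2) \Big].
\end{aligned}
\]

Comparing the two lines shows that the bracketed term in $\partial\hat F_{oi}/\partial\pi_i$ equals $\lambda_T V_i + \lambda_T q_T f_w' = \lambda_T (V_i + q_T f_w')$, which gives the stated form.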

4.3 Numerical examples


To test the efficiency of the potential-based reduced Newton algorithm, we imple-
ment it inside the General Purpose Research Simulator (GPRS) developed by Cao
[16]. GPRS is used by Stanford University’s SUPRI-B and SUPRI-HW research
groups, as well as other research groups and companies for their in-house research.
By implementing our algorithm in GPRS we can guarantee that all the property
calculations and convergence checks are identical for both the standard and reduced
Newton methods. We can also ensure our reference point is indeed the basic Newton
method, rather than a version adorned with various heuristics. Consequently, all our
comparisons between the standard and reduced Newton methods are generated by
GPRS.

4.3.1 1D example with gravity


To demonstrate that the potential-based reduced Newton algorithm does indeed work
in the presence of countercurrent flow, we first test it on a simple pseudo-1D example.
The reservoir is discretized using 10 × 1 × 100 cells in the x, y and z directions
respectively, with Dx = 10 ft, Dy = 50 ft and Dz = 4 ft. A uniform porosity (φ = 0.3)
and permeability (kx = ky = kz = 758 md) are used. Water is injected across the top
layer at a rate of 213.6 bbl/day (0.002 pore volumes per day) and a production well is
completed across the bottom layer and operates at a BHP (bottom hole pressure) of
500 psi. The densities of water and oil at standard conditions are 64 lb/cu.ft. and 49
lb/cu.ft., respectively, and the viscosities are µo = 1.0 cp, µw = 0.3 cp. The fractional
flow curve for this problem is shown in Figure 4.4. We see that flow is cocurrent for
0 ≤ Sw ≤ 0.38 and countercurrent for 0.38 ≤ Sw ≤ 1. The forward CFL number,
$\max_{S\in[0,1]} f_w'$, is 3.73, whereas the backward CFL number, $-\min_{S\in[0,1]} f_w'$, is 0.638.
We test our algorithm for uniform initial water saturations of Swi = 0.0, 0.1, . . . , 0.9.
In each case, the simulation steps through T = 1, 3, 7, 15, 30, 45, 60 days (1 day =
0.002 pore volumes), and afterwards the time-step size is fixed at ∆T = 20 days
until T = 300 days is reached, for a total of 21 steps. Table 4.1 shows the results
for the standard and reduced Newton algorithms. We see that reduced Newton does
not need to cut any time steps to achieve convergence, whereas standard Newton
must cut the time step multiple times in four cases (Sw = 0.0, 0.6, 0.7, 0.8). Time-step
cuts are very expensive, since each cut means that we must throw away the results of
all previous iterations and start over. Moreover, the size of the next step following
a time-step cut is usually set to the last successfully integrated ∆t, i.e., the one
reduced by the time-step cut. This can lead to a significantly smaller average time
step size for a given simulation. Thus, a more stable algorithm that avoids time-step
cuts can significantly outperform one that cuts time steps frequently, especially if
their convergence rates are otherwise comparable. Table 4.1 shows that when neither
algorithm requires time-step cuts, standard Newton converges more quickly some of
the time (Sw = 0.4, 0.5, 0.9), whereas reduced Newton is quicker at other times (Sw =
0.1, 0.2, 0.3). Nevertheless, the difference in average iteration count is less than
0.67 iterations per time step in all cases, so the convergence rates for both algorithms
are comparable when no time-step cuts are needed. As we observe in later examples,
the enhanced stability of reduced Newton does translate into gains in the overall run
time for larger problems. The primary goal of this example is to demonstrate the
robustness of reduced Newton, even in the presence of strong countercurrent flow.
This property is essential if the algorithm is to be used in heterogeneous reservoirs
with complicated permeability/porosity fields, especially since countercurrent flow
due to gravity can be important in regions where the total velocity is small.
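
As a quick arithmetic check on the quoted injection rate, the snippet below converts 213.6 bbl/day into pore volumes per day for the stated grid and porosity. It assumes the rate can be compared directly with the reservoir pore volume (i.e., it ignores any formation volume factor) and uses the standard conversion 1 bbl = 5.614583 cu.ft.; the variable names are illustrative.

```python
FT3_PER_BBL = 5.614583           # cubic feet per barrel

nx, ny, nz = 10, 1, 100          # number of cells in x, y, z
dx, dy, dz = 10.0, 50.0, 4.0     # cell dimensions in ft
phi = 0.3                        # uniform porosity

# Total pore volume of the reservoir in cubic feet.
pore_volume_ft3 = nx * ny * nz * dx * dy * dz * phi

# Injection rate expressed in pore volumes per day.
rate_pv_per_day = 213.6 * FT3_PER_BBL / pore_volume_ft3
print(pore_volume_ft3, rate_pv_per_day)
```

The result agrees with the quoted 0.002 pore volumes per day to within rounding.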

[Plot omitted: $f_w$ (vertical axis, 0 to 1.4) versus $S_w$ (horizontal axis, 0 to 1).]

Figure 4.4: Fractional flow fw for the 1D gravity example.

Table 4.1: Convergence history for 1D water floods with different initial water
saturations. For both methods, Time steps = total number of time steps taken
to simulate up to 300 days; Newtons = number of Newton iterations (excluding
iterations wasted due to time-step cuts); Cuts = number of times the algorithm must
cut the time-step size by half due to non-convergence.

       Standard                         Reduced
Swi    Time steps   Newtons   Cuts     Time steps   Newtons   Cuts
0.0    26           140       5        21           61        0
0.1    21           59        0        21           58        0
0.2    21           59        0        21           58        0
0.3    21           50        0        21           49        0
0.4    21           51        0        21           58        0
0.5    21           67        0        21           81        0
0.6    22           88        2        21           85        0
0.7    24           96        6        21           90        0
0.8    23           85        3        21           84        0
0.9    21           51        0        21           65        0

4.3.2 Heterogeneous example with gravity

To demonstrate the effectiveness of reduced Newton on a large, complex heterogeneous
reservoir, we test it on a water flood problem using a 2 × 2 × 2 upscaling of the
SPE 10 model [19]. This gives rise to a model with 141900 grid blocks (110 × 30 × 43).
The reservoir model is shown in Figure 4.5. The top 18 layers of the reservoir represent
a Tarbert formation with highly variable permeabilities ranging from $4.8 \times 10^{-3}$
to $1.2 \times 10^{3}$ md. The bottom 25 layers consist of an Upper Ness sequence, which is
highly channelized. The porosity is 0.3 throughout the reservoir. Water is injected
at the center of the reservoir at 5000 bbl/day (= 0.0002 pore volumes per day); four
production wells are located in the four corners of the reservoir, operating at a bottom
hole pressure of 4000 psi. Quadratic relative permeabilities are used with a residual
saturation of 0.2 for both phases, and the viscosity ratio is 10. The rest of the parame-
ters are the same as those in the original specification [19]. The simulation is carried
out up to T = 500 days, which corresponds to 0.1 pore volumes injected (PVI). For
any time step, if the global nonlinear solver does not converge within 20 iterations, the
iterations are stopped, and the current time step is cut in half before restarting. Table
4.2 shows the convergence history of the standard and reduced Newton methods for
an initial time step of 0.1 days. Here the time stepping is gentle enough that standard
Newton does not need to cut time steps in order to achieve convergence. We see that
reduced Newton takes fewer iterations than standard Newton to converge, and that
the running time decreases from 728.6 seconds to 560.6 seconds. Thus, the savings
from reducing the number of Newton iterations are more than enough to offset the
cost of univariate solves.
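
The time-step cutting rule described above can be sketched as a small driver loop. This is only an illustration of the bookkeeping (steps, cuts, useful and wasted Newton iterations), not the GPRS logic: the callback `try_step`, the function `run_schedule`, and the toy solver `fake_step` are all hypothetical.

```python
def run_schedule(report_times, try_step, max_iters=20, min_dt=1e-8):
    """Advance through report_times, halving dt whenever a step fails to converge.

    try_step(t, dt, max_iters) is a user-supplied (hypothetical) callback that
    returns (converged, iterations_used).
    """
    t = 0.0
    stats = {"steps": 0, "cuts": 0, "newtons": 0, "wasted": 0}
    for target in report_times:
        dt = target - t
        while t < target - 1e-12:
            dt = min(dt, target - t)
            converged, iters = try_step(t, dt, max_iters)
            if converged:
                t += dt
                stats["steps"] += 1
                stats["newtons"] += iters
            else:
                # Discard the failed iterations and retry with half the step.
                stats["cuts"] += 1
                stats["wasted"] += iters
                dt *= 0.5
                if dt < min_dt:
                    raise RuntimeError("time step became too small")
    return stats

def fake_step(t, dt, max_iters):
    """Toy solver: converges in 3 iterations only when dt <= 10."""
    ok = dt <= 10.0
    return ok, (3 if ok else max_iters)

stats = run_schedule([10.0, 30.0], fake_step)
print(stats)
```

In this toy run the 20-day step fails once and is replaced by two 10-day steps, so a single cut costs 20 wasted iterations and one extra time step.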

Next we specify an initial time step of 1 day and track the number of Newton
iterations required to converge. Figure 4.6 shows the results. We see that reduced
Newton converges for the first time step in 9 iterations, whereas standard Newton
does not converge and needs to cut the time step twice to converge with an initial
time step of 0.25 days. Beyond the first time step, reduced Newton always takes fewer
iterations to converge than its standard counterpart, and the iteration count does not
exhibit the large variations that standard Newton does at the beginning.

Table 4.2: Convergence history for the upscaled SPE 10 model with an initial time
step of 0.1 days. N = Number of Nonlinear (Newton) iterations; L = Number
of Linear (CPR) solves; CFL = Maximum CFL number in the reservoir; %CC =
Percentage of cell interfaces that experience countercurrent flow.

Standard Reduced
days N L N L CFL %CC
0.1 4 18 4 17 1.8 6.2
0.3 3 17 3 17 1.9 2.4
0.7 3 18 2 12 2.1 1.1
1.5 3 19 2 14 2.5 0.7
3.1 4 26 2 15 4.0 0.5
6.3 5 32 2 16 6.7 0.5
10 4 26 2 15 11.1 0.5
20 6 45 3 27 23.9 0.5
35 4 32 3 27 35.2 0.5
50 3 27 2 19 33.2 0.5
70 4 35 3 27 35.1 0.6
90 4 33 3 28 35.6 0.6
110 4 37 3 30 52.9 0.6
140 4 41 3 34 112.1 0.6
170 4 39 2 21 102.8 0.7
200 4 35 2 21 145.3 0.7
230 3 33 2 22 129.1 0.7
260 3 33 2 22 132.0 0.8
290 3 30 2 21 132.3 0.8
320 3 31 2 21 119.6 0.8
350 3 30 2 19 109.5 0.8
380 3 30 2 20 116.7 0.9
410 3 31 2 20 112.0 0.9
440 3 30 2 19 114.9 0.9
470 3 28 2 19 108.1 1.0
500 3 29 2 19 146.3 1.0
Total 93 785 61 542
Running time (s) 728.6 560.6

[Image omitted: permeability field with color scale $\log_{10} K_x$ ranging from −2 to 3.]

Figure 4.5: Permeability field and well configuration for the upscaled SPE 10
problem[19]. The reservoir is displayed upside down so that the channels in the
bottom layers are clearly visible.

[Plot omitted: Newton iterations per time step (vertical axis, 0 to 12) versus time in days
(horizontal axis, 0 to 500), for reduced and standard Newton.]

Figure 4.6: Convergence history for the upscaled SPE 10 model with an initial time
step of 1 day.

4.3.3 Large heterogeneous example


Since the cost of the single cell solves scales linearly with the problem size, we ex-
pect that the savings from the potential-based reduced Newton method will become
even more evident in large heterogeneous examples, where the computational cost
is dominated by the solution of linear systems. We demonstrate this by simulating
the full SPE 10 problem (60 × 220 × 85 = 1.12 million grid blocks) with the
variable porosity field as specified in [19]. The published relative permeabilities and
fluid properties are used, except that the formation volume factor Bo and the density
ρo are taken to be the same as the published Bw and ρw . The injection rate is 5000
bbl/day (0.000366 pore volumes per day). The simulation runs until T = 2000 days
(PVI = 0.732). Three time-stepping strategies are used:

• Short time steps: T = 0.01, 0.03, 0.07, 0.15, 0.31, 0.63, 1, 3, 7, 15, 31, 63, 90,
120, 150, 180, 220, 260, 300 days. After 300 days, ∆T = 50 days (0.0183 pore
volumes) until T = 2000 days is reached.

• Long time steps: T = 0.01, 0.31, 1, 7, 31, 90, 150, 220, 300 days. After 300
days, ∆T = 100 days (0.0366 pore volumes) until T = 2000 days is reached.

• Huge time steps: T = 0.01, 0.31, 1, 7, 31, 90, 200 days. After 200 days,
∆T = 500 days (0.183 pore volumes) until T = 2000 days is reached.

As before, the time step is cut in half if the global nonlinear solver does not con-
verge within 20 iterations. Table 4.3 summarizes the runs for both the standard and
reduced Newton algorithms, and Figure 4.7 compares the convergence histories of
standard and reduced Newton for the long time step case. We observe that reduced
Newton can easily handle the “long” and “huge” time step cases. Standard Newton,
on the other hand, needs to cut time steps multiple times in order to achieve conver-
gence, and this results in a significant number of wasted linear solves and a serious
degradation in performance. In fact, we were unable to run standard Newton for the
huge time step case because of the large number of time step cuts. Consistent with
the collective experience in the simulation community, taking too large a time step
in standard Newton actually makes the simulation slower. The opposite is true for
reduced Newton. Indeed, reduced Newton with long or huge time steps runs in at most
about 61% of the time required by standard Newton with either time-stepping strategy.
Finally, Figure 4.8 shows the oil production rate and water cut for all five simulation
runs. The discrepancy between the solutions is insignificant, with the exception of
the huge time step case, in which the time truncation error becomes so large that the
water cut and production curves noticeably deviate from the other cases. In practice, one
would probably not want to take such a large time step, but it is reassuring to know
that reduced Newton can still converge under such extreme circumstances. In general,
by using reduced Newton with (reasonably) larger time steps, we obtain substantial
speedups with little or no change in solution accuracy.

Table 4.3: Summary of runs for the full SPE 10 problem. “Wasted Newton steps”
and “wasted linear solves” indicate the number of Newton iterations and linear solves
that are wasted because of time step cuts.

                              Standard              Reduced
                              Short ∆t   Long ∆t    Short ∆t   Long ∆t   Huge ∆t
No. of time steps             58         38         53         26        11
No. of time step cuts         6          17         0          0         0
No. of Newton steps           353        516        128        90        55
− Wasted Newton steps         120        340        0          0         0
No. of linear solves          3818       6257       2271       2399      1805
− Wasted linear solves        860        3934       0          0         0
Total running time (sec)      24053      37388      16558      14727     10275
− Linear solves (sec)         22570      35457      11697      11301     7899
− Single-cell solves (sec)    0          0          4194       2996      2132

[Plot omitted: Newton iterations per time step (vertical axis, 0 to 18) versus time in days
(horizontal axis, 0.01 to 2000), for reduced and standard Newton.]

Figure 4.7: Convergence history for the full SPE 10 problem with long time steps.
Tick marks on the x-axis labeled c correspond to intermediate time steps needed by
standard Newton to achieve convergence; these steps are skipped by reduced Newton.

[Plots omitted. Top panel: total oil production rate (STB/day, 0 to 6000) versus time
(days, 0 to 2000). Bottom panel: water cut for producer #1 (0 to 0.8) versus time (days,
0 to 2000). Curves: reduced Newton with short, long and huge ∆t; standard Newton with
short and long ∆t.]

Figure 4.8: Total oil production rate and water cut for the full SPE 10 problem.

4.3.4 1D three-phase example with gravity

To show that the reduced formulation is applicable to three-phase flow, the algorithm
is tested on a three-phase model in which gas is injected into a reservoir initially
saturated with a mixture of 50% oil and 50% water in every cell. This saturation
is chosen to ensure that all phases are mobile, and that we have a truly three-phase
problem. The reservoir is identical to the one used in Example 4.3.1. The PVT data
and relative permeabilities are shown in Tables 4.4 and 4.5, respectively. For simplic-
ity, the gas component is assumed not to dissolve into the oil phase (i.e., Rgo = 0).
The oil relative permeability is interpolated from the oil-gas and oil-water tables using
the Stone I method. Gas is injected into the top layer at a rate of 100 MSCF/day
(0.000768 pore volumes/day at 4000 psi), and a producer in the bottom layer is main-
tained at a constant pressure of 4000 psi. The production curve is shown in Figure
4.9. Even though gas is highly mobile (µw /µg = 11.6, µo /µg = 111.9), breakthrough
occurs relatively late (at T = 521 days or 0.4 pore volumes) because buoyancy keeps
the gas preferentially in the upper layers. In addition, since the simulation does
not start from gravity equilibrium, gravity segregation between oil and water must
occur at the initial stages of the simulation. Up to 98% of cell interfaces experi-
ence countercurrent flow at some point before gas breaks through. This accounts for
the rather complicated behavior of the water and oil production curves prior to gas
breakthrough. Even though this is a rather small example, we believe it captures the
essence of the types of nonlinearity present in countercurrent three-phase flow, and
provides a good test case for comparing the convergence behavior of the standard and
reduced Newton algorithms. In this example, two time-stepping strategies are used:

• Short time steps: T = 0.1, 1, 5, 10 days. After 10 days, ∆t = 10 days (0.00768
pore volumes) until T = 1000 days.

• Long time steps: After an initial time step of 0.1 days, ∆t is automatically
chosen based on saturation and pressure changes, with a minimum of ∆t = 10
days and gradually increasing until ∆t = 100 days (0.0768 pore volumes).

Table 4.6 summarizes the runs for the standard and potential-based reduced New-
ton algorithms. Running times have little meaning because of the small size of the
problem, and are thus omitted. We once again observe that reduced Newton has no
difficulty handling both short and long time steps, whereas standard Newton needs
to cut the time-step size repeatedly throughout the simulation. Thus, the presence of
three phases does not negatively impact the convergence behavior of reduced Newton.

Table 4.4: PVT relations for all three-phase examples.

P Bo µo Bw µw Bg µg
(psi) (RB/STB) (cp) (RB/STB) (cp) (RB/SCF) (cp)
14.7 1.062 2.200 1.0410 0.31 0.166666 0.0080
264.7 1.061 2.850 1.0430 0.31 0.012093 0.0096
514.7 1.060 2.970 1.0395 0.31 0.006274 0.0112
1014.7 1.059 2.990 1.0380 0.31 0.003197 0.0140
2014.7 1.056 2.992 1.0350 0.31 0.001614 0.0189
2514.7 1.054 2.994 1.0335 0.31 0.001294 0.0208
3014.7 1.053 2.996 1.0320 0.31 0.001080 0.0228
4014.7 1.050 2.998 1.0290 0.31 0.000811 0.0268
5014.7 1.047 3.000 1.0258 0.31 0.000649 0.0309
9014.7 1.033 3.008 1.0130 0.31 0.000386 0.0470

Table 4.5: Relative permeabilities for all three-phase examples.

Sw krw krow Sg krg krog


0.12 0 1.00 0 0 1.00
0.121 1.67E-12 1.00 0.001 0.0002 1.00
0.14 2.67E-07 0.997 0.02 0.0033 0.997
0.17 1.04E-05 0.98 0.05 0.0106 0.98
0.24 3.46E-04 0.7 0.12 0.0364 0.70
0.32 2.67E-03 0.35 0.20 0.0919 0.35
0.37 6.51E-03 0.2 0.25 0.1459 0.20
0.42 0.014 0.09 0.30 0.2226 0.09
0.52 0.043 0.021 0.40 0.4588 0.021
0.57 0.068 0.01 0.45 0.6336 0.01
0.62 0.104 0.001 0.50 0.7449 0.001
0.72 0.216 0.0001 0.60 0.8887 0.0001
0.82 0.400 0 0.70 0.9563 0
1.00 1.000 0 0.88 1.0000 0

[Plot omitted: production rate (vertical axis, 0 to 100) versus time in days (horizontal
axis, 0 to 1000) for gas, oil and water.]

Figure 4.9: Production curve for the 1D three-phase example. The units are STB/day
for oil and water, and MSCF/day for gas.

Table 4.6: Summary of runs for the 1D three-phase example with gravity. “Wasted
Newton steps” and “wasted linear solves” indicate the number of Newton iterations
and linear solves that are wasted because of time step cuts.

                          Standard              Reduced
                          Short ∆t   Long ∆t    Short ∆t   Long ∆t
No. of time steps         111        74         103        26
No. of time step cuts     16         36         0          0
No. of Newton steps       888        1223       480        229
− Wasted Newton steps     320        720        0          0
No. of linear solves      1763       2421       973        480
− Wasted linear solves    641        1418       0          0

4.3.5 2D Heterogeneous three-phase example


We now test the reduced Newton algorithm on a three-phase example with hetero-
geneity. The reservoir consists of the 51st layer of the SPE 10 problem, which is a
slice in the Upper Ness formation (see Example 4.3.3). Initially the reservoir contains
a mixture of 50% oil and 50% water, and gas is injected through a well in the center at
a rate of 1000 MSCF/day (0.00005 pore volumes per day). The four production wells
(one in each corner) are each maintained at a bottom hole pressure of 4000 psi. The
PVT and relative permeability data are the same as in Example 4.3.4 and are given
in Tables 4.4 and 4.5. The simulation is run up to T = 500 days (0.025 PVI), which
is much larger than the breakthrough time (TBT ≈ 40 days or 0.002 PVI). Note that
the early breakthrough time is due to the extremely high mobility of the gas. Figure
4.10 shows the gas saturation of the reservoir at T = 500 days. Two time-stepping
strategies are used:

• Short time steps: T = 1, 3, 7, 15, 31, 63, 100 days. After 100 days, ∆t = 50 days
(0.00125 pore volumes) until T = 500 days is reached.

• Long time steps: T = 10, 30, 60, 100 days. After 100 days, ∆t = 100 days
(0.0025 pore volumes) until T = 500 days is reached.

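Table 4.7 shows the performance of the standard and reduced Newton algorithms.
Once again no time step cuts are required by reduced Newton, demonstrating its
stability relative to the standard Newton method. For the long time step case this
translates to an improvement in running time (53.9 sec versus 75.6 sec); for the
short time step case, the cost of the single-cell solves makes reduced Newton
somewhat slower overall (73.8 sec versus 63.5 sec), even though it needs fewer
Newton steps and linear solves. This example shows that the improvement obtained
from reduced Newton in three-phase flow is not limited to simple 1D cases.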

4.3.6 3D three-phase example


Here, the algorithm is tested on a 3D three-phase model in which gas is injected into
a reservoir containing a mixture of 50% oil and 50% water. The reservoir (20 × 20 × 3
cells) is a 2 × 2 areal refinement of the one used in the SPE1 test set [57] and is shown
in Figure 4.11. The PVT data and relative permeabilities are the same as the two

Figure 4.10: Gas saturation at T = 500 days in the 2D heterogeneous three-phase
example. Dark blue indicates 100% gas, whereas dark red indicates a cell consisting
purely of liquid phases.

Table 4.7: Summary of runs for the 2D heterogeneous three-phase example. “Wasted
Newton steps” and “wasted linear solves” indicate the number of Newton iterations
and linear solves that are wasted because of time step cuts.

Method                      Standard  Standard  Reduced   Reduced
Time-step strategy          Short ∆t  Long ∆t   Short ∆t  Long ∆t
No. of time steps           16        10        15        8
No. of time step cuts       1         3         0         0
No. of Newton steps         74        101       58        40
− Wasted Newton steps       20        60        0         0
No. of linear solves        1264      1529      1172      881
− Wasted linear solves      276       698       0         0
Total running time (sec)    63.5      75.6      73.8      53.9
− Linear solves (sec)       53.7      66.1      50.5      37.9
− Single-cell solves (sec)  0         0         18.6      13.0

previous examples (Tables 4.4 and 4.5), and the Stone I model is used to interpolate
the oil-gas and oil-water data. The gas-injection well is completed in cell (1,1,1)
and operates at 100000 MSCF/day (0.000073 pore volumes per day at 9000 psi); a
production well, completed in cell (20,20,3), operates at a bottom-hole pressure of
1000 psi. The simulation is run up to T = 5000 days (0.365 PVI). Because of the
high gas mobility, breakthrough occurs very early (TBT ≈ 100 days or 0.0073 PVI).
Since the oil and water are not in gravity equilibrium at the start of the simulation,
there is significant countercurrent flow in the problem. Two time-stepping strategies
are used:

• Short time steps: T = 30, 100, 200, 250, 400, 600, 900 days. After 900 days, ∆t
= 400 days (0.0292 pore volumes) until T = 5000 days.

• Long time steps: T = 100, 250, 600 days. After 600 days, ∆t = 800 days (0.0584
pore volumes) until T = 5000 days.

Table 4.8 shows the performance of the standard and reduced Newton algorithms.
Again we see that the reduced Newton method requires no time step cuts and fewer
iterations to converge compared to the standard Newton method.

[Schematic: 10000 ft × 10000 ft reservoir with ∆x = ∆y = 500 ft and φ = 0.3; three layers of thickness 20 ft (k = 500 md), 30 ft (k = 50 md) and 50 ft (k = 200 md); the injection well and the production well are marked.]

Figure 4.11: Reservoir description for the 3D three-phase example.

Table 4.8: Summary of runs for 3D three-phase example. “Wasted Newton steps”
and “wasted linear solves” indicate the number of Newton iterations and linear solves
that are wasted because of time step cuts.

Method                      Standard  Standard  Reduced   Reduced
Time-step strategy          Short ∆t  Long ∆t   Short ∆t  Long ∆t
No. of time steps           18        11        17        9
No. of time step cuts       1         3         0         0
No. of Newton steps         95        117       74        57
− Wasted Newton steps       20        60        0         0
No. of linear solves        1083      1624      974       838
− Wasted linear solves      178       855       0         0
Total running time (sec)    4.9       6.5       6.1       4.9
− Linear solves (sec)       3.7       5.5       3.4       2.9
− Single-cell solves (sec)  0         0         2.1       1.6
Chapter 5

Linear Preconditioning

When an immiscible $n_p$-phase flow problem is discretized on a grid containing $N$ cells,
Newton's method requires the solution of a sparse $n_p N \times n_p N$ linear system $Jx = r$
at every iteration. The matrix $J$, which comes from the linearized residual functions

\[
\frac{V_i \phi_i}{\Delta t}\left(S_{p,i}^{n+1} - S_{p,i}^{n}\right)
 + \sum_{l \in \mathrm{adj}(i)} |\partial V_{il}|\, F_{p,il}(S, p) = q_{p,i},
\tag{5.0.1}
\]

inherits the mixed hyperbolic-parabolic character of the underlying PDEs, which
means methods developed for a specific type of discretized PDE (e.g., elliptic PDEs)
will not work well for $J$. For this reason, efficient solution of the linear systems
remains a challenging problem in reservoir simulation. Direct solvers become
prohibitively expensive as the grid is refined; this is especially true for 3D problems,
where LU factorization requires $O(N^2)$ floating-point operations and $O(N^{4/3})$
storage, even when an optimal ordering strategy such as nested dissection is used [38].
On the other hand, when iterative methods are used, standard preconditioners such
as incomplete LU factorizations and multigrid perform poorly because the problem
is neither purely hyperbolic nor purely elliptic.

It is well known that the ordering of equations and unknowns can have a huge
impact on the quality of various preconditioners [25, 30, 10]. In most of these works
the orderings considered tend to belong to the following categories:

5.1. STRUCTURE OF THE JACOBIAN MATRIX 119

1. Coloring-based orderings, in which the nodes in the adjacency graph are parti-
tioned into a finite number of colors, and nodes with the same color are ordered
within the same block. The red-black ordering is a classical example of such
orderings, which are often motivated by parallelization considerations or in the
context of cyclic reduction.

2. Fill-minimizing orderings, which are developed in the context of sparse direct
solvers in order to minimize the number of fill-in entries in the LU factorization.
Examples include the reverse Cuthill-McKee method and minimum degree
ordering [38].

The above ordering strategies, while having the advantage of being applicable to
general sparse matrices, do not exploit the underlying physics of the problem. For
advection dominated problems, a natural idea is to order the cells according to flow
direction (e.g., from upstream to downstream). Ordering of this type has been consid-
ered in the CFD community (cf. [54]), but its use is limited in reservoir simulation.
The aim of this chapter is to exploit the cell-based and phase-based orderings in-
troduced in Chapter 3 for preconditioning purposes. In particular, we proceed as
follows:

1. Propose an improvement to the standard CPR-BILU(0) preconditioner that
exploits cell-based ordering;

2. Use phase-based ordering to derive preconditioned Krylov solvers based on
Schur complement preconditioning.

5.1 Structure of the Jacobian matrix


In this chapter, we mainly consider Jacobians that arise from a fully implicit, five-
point finite-volume discretization of the incompressible black-oil equations, with up-
stream weighting for saturation-dependent terms. When phase-based ordering is used,
120 CHAPTER 5. LINEAR PRECONDITIONING

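the Jacobian will be denoted by $J$, which has the form (cf. (3.2.3), (3.2.5))

\[
J = \begin{bmatrix} J_{ss} & J_{sp} \\ J_{ps} & J_{pp} \end{bmatrix}
\begin{matrix} \text{(water equation)} \\ \text{(oil equation)} \end{matrix}
\tag{5.1.1}
\]

where the two block columns correspond to the unknowns $S_w$ and $p$, respectively.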

In addition, we will often use cell-based ordering, in which all the equations and
variables belonging to the same control volume are grouped into a single block. In
this case, the Jacobian is denoted by $A$, where

\[
A = \begin{bmatrix}
A_{11} & \cdots & A_{1N} \\
\vdots & \ddots & \vdots \\
A_{N1} & \cdots & A_{NN}
\end{bmatrix}
\tag{5.1.2}
\]

is a block matrix with $n_p \times n_p$ blocks. Each block row represents the derivatives of
the conservation equations (oil and water) with respect to the discrete unknowns ($S_i$
and $p_i$) at the gridblock and its adjacent cells. For example,

\[
A_{ii} = \begin{bmatrix} (J_{ss})_{ii} & (J_{sp})_{ii} \\ (J_{ps})_{ii} & (J_{pp})_{ii} \end{bmatrix}.
\]

Clearly, $A = P J P^T$ for some permutation matrix $P$. For simplicity, we assume
two-phase flow throughout this chapter, while noting that many results can be extended
to three-phase flow. We make the following additional assumptions.

Assumptions 5.

1. The phase mobilities are non-negative and satisfy $\lambda_w' = \partial\lambda_w/\partial S_w > 0$ and
$\lambda_o' = \partial\lambda_o/\partial S_w < 0$;

2. The total mobility $\lambda_t = \lambda_o + \lambda_w$ across each cell boundary is strictly positive;

3. Phase-based upstreaming is performed based on the upstream directions given
at the linearization point $(P^\ell, S^\ell)$;

4. A pressure Dirichlet boundary condition is prescribed on a segment of the
boundary with positive measure. (Alternatively, one can assume there exists at
least one production well operating at a fixed bottom-hole pressure.)

Assumption 5.1 has already been stated in Theorem 2.1. Assumption 5.2 is similar
to the uniform ellipticity condition in Section 4.2. When the flow is cocurrent, it is
purely an assumption on the fluid mobilities. In the countercurrent flow case, however,
it is also an assumption on the linearization point $(S^\ell, P^\ell)$, since it is possible that
$\lambda_w$ and $\lambda_o$ are evaluated at two different saturations because of upstreaming. Thus,
if there are adjacent cells $i$ and $i+1$ such that $S_i^\ell = 0$ and $S_{i+1}^\ell = 1$, then Assumption
5.2 would disallow the possibility that the upstream directions for water and oil are
$i$ and $i+1$ respectively, which is essentially a restriction on the set of admissible
pressure profiles $P^\ell$. Assumption 5.3 ensures monotonicity of the discretization (in
the sense of Chapter 2), and Assumption 5.4 is needed for a unique pressure solution.

Lemma 5.1. Suppose Assumptions 5.1–5.4 hold. Then the sub-blocks of the
Jacobian $J$ have the following properties:

1. $J_{ss} = (1/\Delta t)D + J_{ss}'$ and $J_{ps} = -(1/\Delta t)D + J_{ps}'$, where $D$ is a positive diagonal
matrix, and $J_{ss}'$ and $-J_{ps}'$ are weakly column diagonally-dominant M-matrices;

2. $J_{sp}$ and $J_{pp}$ are weakly diagonally dominant, symmetric, positive semi-definite
matrices;

3. $J_{sp} + J_{pp}$ is a symmetric, positive-definite, irreducibly diagonally dominant
M-matrix.

Moreover, the matrices $J_{ss}'$, $J_{ps}'$, $J_{sp}$ and $J_{pp}$ are independent of $\Delta t$.

Based on the above lemma, the following theorems concerning the rank of $J$ can
be proven. Clearly, we have

\[
J = \begin{bmatrix} J_{ss} & J_{sp} \\ J_{ps} & J_{pp} \end{bmatrix} \text{ nonsingular}
\iff
\tilde{J} = \begin{bmatrix} J_{ss} & J_{sp} \\ J_{ts} & J_{tp} \end{bmatrix} \text{ nonsingular},
\]

where $J_{ts} = J_{ss} + J_{ps}$ and $J_{tp} = J_{sp} + J_{pp}$. That is, $J_{ts}$ and $J_{tp}$ are the Jacobian
matrices corresponding to the total mass balance equation (1.1.19).
Since $J_{tp}$ is nonsingular, $\tilde{J}$ is nonsingular if and only if the Schur complement

\[
S_1 := J_{ss} - J_{sp} J_{tp}^{-1} J_{ts}
\]

is also nonsingular.

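Theorem 5.2. There exists $T > 0$ such that $J$ is nonsingular for $0 < \Delta t < T$.

Proof. First, note that $J_{ts} = J_{ss} + J_{ps} = J_{ss}' + J_{ps}'$ is independent of $\Delta t$. Thus, we can
write

\[
S_1 = \frac{1}{\Delta t} D + \left(J_{ss}' - J_{sp} J_{tp}^{-1} J_{ts}\right),
\]

where the terms in brackets are independent of $\Delta t$. Now $S_1$ is nonsingular if and only
if

\[
\Delta t\, D^{-1} S_1 = I + \Delta t\, D^{-1}\left(J_{ss}' - J_{sp} J_{tp}^{-1} J_{ts}\right)
\]

is also nonsingular, which is the case whenever

\[
\rho\!\left(\Delta t\, D^{-1}\left(J_{ss}' - J_{sp} J_{tp}^{-1} J_{ts}\right)\right) < 1,
\]

where $\rho(\cdot)$ denotes the spectral radius. Thus, $S_1$ is nonsingular whenever $0 < \Delta t < T$,
where

\[
T = \frac{1}{\rho\!\left(D^{-1}\left(J_{ss}' - J_{sp} J_{tp}^{-1} J_{ts}\right)\right)},
\]

or $T = \infty$ if $\rho\!\left(D^{-1}\left(J_{ss}' - J_{sp} J_{tp}^{-1} J_{ts}\right)\right) = 0$.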
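As a rough illustration of how this bound behaves, consider a hypothetical scalar case (not taken from the text) in which $D$ reduces to a single number $d > 0$ and the $\Delta t$-independent part of $S_1$ reduces to a single number $m$. Both symbols are assumptions introduced only for this sketch:

```latex
S_1 = \frac{d}{\Delta t} + m, \qquad
\rho\!\left(\Delta t\, d^{-1} m\right) = \frac{\Delta t\,|m|}{d} < 1
\iff \Delta t < T = \frac{d}{|m|}.
% If m < 0, S_1 vanishes exactly at \Delta t = d/|m| = T, so the bound is attained.
% If m >= 0, S_1 > 0 for every \Delta t > 0, so the bound is sufficient but not necessary.
```

In other words, the spectral radius condition is a sufficient condition for nonsingularity, and the scalar case suggests it can be either sharp or pessimistic depending on the sign structure of the $\Delta t$-independent part.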

Theorem 5.3. For 1D flow problems, J is nonsingular for all ∆t > 0.

Proof. For 1D problems, it is possible to write down the form of the Schur
complement $S_1$ explicitly by eliminating the pressure terms directly. Since the discretization
and linearization steps commute, it is notationally more convenient to manipulate
the PDE itself, although one can also perform the same calculation on the discrete
equations. We start with the two-phase conservation law:

\begin{align}
\phi S_t - \frac{\partial}{\partial x}\Big[\lambda_w (p_x + \rho_w g z_x)\Big] &= 0, \tag{5.1.3}\\
-\phi S_t - \frac{\partial}{\partial x}\Big[\lambda_o (p_x + \rho_o g z_x)\Big] &= 0. \tag{5.1.4}
\end{align}

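Linearize around $(S^\ell, P^\ell)$ by letting $S = S^\ell + \sigma$, $p = P^\ell + \pi$:

\begin{align}
\Big[\phi S^\ell_t - \frac{\partial}{\partial x}\big(\lambda_w (P^\ell_x + \rho_w g z_x)\big)\Big]
 + \Big[\phi \sigma_t - \frac{\partial}{\partial x}\big(\lambda_w' \sigma^w (P^\ell_x + \rho_w g z_x) + \lambda_w \pi_x\big)\Big] &= 0, \tag{5.1.5}\\
\Big[-\phi S^\ell_t - \frac{\partial}{\partial x}\big(\lambda_o (P^\ell_x + \rho_o g z_x)\big)\Big]
 + \Big[-\phi \sigma_t - \frac{\partial}{\partial x}\big(\lambda_o' \sigma^o (P^\ell_x + \rho_o g z_x) + \lambda_o \pi_x\big)\Big] &= 0. \tag{5.1.6}
\end{align}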

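In the above equations, all coefficients are evaluated at the linearization point, so
they do not depend on $\sigma$ and $\pi$. We also use $\sigma^w$ and $\sigma^o$ to denote $\sigma$ evaluated in the
upwind direction of the water and oil phase, respectively, in the finite volume
discretization; the two directions can be different in general. By keeping the terms
separate we can easily mimic this manipulation in the discrete case. If we define

\begin{align}
F(x,t) &:= -\phi S^\ell_t + \frac{\partial}{\partial x}\Big[\lambda_w (P^\ell_x + \rho_w g z_x)\Big], \tag{5.1.7}\\
G(x,t) &:= \phi S^\ell_t + \frac{\partial}{\partial x}\Big[\lambda_o (P^\ell_x + \rho_o g z_x)\Big], \tag{5.1.8}
\end{align}

we obtain the linearized PDE

\begin{align}
\phi \sigma_t - \frac{\partial}{\partial x}\Big[\lambda_w' \sigma^w (P^\ell_x + \rho_w g z_x) + \lambda_w \pi_x\Big] &= F(x,t), \tag{5.1.9}\\
-\phi \sigma_t - \frac{\partial}{\partial x}\Big[\lambda_o' \sigma^o (P^\ell_x + \rho_o g z_x) + \lambda_o \pi_x\Big] &= G(x,t). \tag{5.1.10}
\end{align}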

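We can now eliminate $\pi_x$ to obtain a single equation involving $\sigma$. Adding (5.1.9) and
(5.1.10) and integrating gives

\[
\lambda_w' \sigma^w (P^\ell_x + \rho_w g z_x) + \lambda_o' \sigma^o (P^\ell_x + \rho_o g z_x) + \lambda_T \pi_x
 = -\int_0^x \big(F(\xi,t) + G(\xi,t)\big)\, d\xi =: -H(x,t),
\tag{5.1.11}
\]

so that

\[
\pi_x = -\frac{1}{\lambda_T}\Big[H(x,t) + \lambda_w' \sigma^w (P^\ell_x + \rho_w g z_x) + \lambda_o' \sigma^o (P^\ell_x + \rho_o g z_x)\Big].
\tag{5.1.12}
\]

Substituting into (5.1.9) gives

\[
\phi \sigma_t - \frac{\partial}{\partial x}\bigg\{\lambda_w' \sigma^w (P^\ell_x + \rho_w g z_x)
 - \frac{\lambda_w}{\lambda_T}\Big[H(x,t) + \lambda_w' \sigma^w (P^\ell_x + \rho_w g z_x) + \lambda_o' \sigma^o (P^\ell_x + \rho_o g z_x)\Big]\bigg\} = F(x,t).
\tag{5.1.13}
\]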

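Simplify and get

\[
\phi \sigma_t - \frac{\partial}{\partial x}\bigg[\frac{\lambda_o \lambda_w' (P^\ell_x + \rho_w g z_x)}{\lambda_T}\, \sigma^w
 - \frac{\lambda_w \lambda_o' (P^\ell_x + \rho_o g z_x)}{\lambda_T}\, \sigma^o\bigg] = R(x,t),
\tag{5.1.14}
\]

where $R(x,t)$ is some combination of $F(x,t)$ and $H(x,t)$ that does not depend on
$\sigma$ and $\pi$, and hence is unimportant for the analysis. To derive the discrete form of
(5.1.14), we need to resolve the upstreamed saturations $\sigma^w$ and $\sigma^o$, which are given
by

\[
\sigma^w_{i+1/2} =
\begin{cases}
\sigma_{i+1}, & P^\ell_x + \rho_w g z_x \ge 0,\\
\sigma_i, & P^\ell_x + \rho_w g z_x < 0,
\end{cases}
\]

and similarly for $\sigma^o$. Thus, the discrete algebraic equations that arise from Newton's
method are of the form

\[
\frac{\phi_i \sigma_i}{\Delta t} + \frac{1}{\Delta x_i}\Big[\alpha_{i+1/2}\sigma_i - \beta_{i+1/2}\sigma_{i+1} - \alpha_{i-1/2}\sigma_{i-1} + \beta_{i-1/2}\sigma_i\Big] = R_i,
\tag{5.1.15}
\]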
5.2. CPR PRECONDITIONING 125

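where

\begin{align*}
\alpha_{i+1/2} &= \frac{\lambda_{w,i+1/2}\,\lambda'_{o,i}\,\min\{P^\ell_{i+1} - P^\ell_i + \rho_o g \Delta z,\, 0\}}{\Delta x_{i+1/2}\,(\lambda_{o,i+1/2} + \lambda_{w,i+1/2})}
 - \frac{\lambda_{o,i+1/2}\,\lambda'_{w,i}\,\min\{P^\ell_{i+1} - P^\ell_i + \rho_w g \Delta z,\, 0\}}{\Delta x_{i+1/2}\,(\lambda_{o,i+1/2} + \lambda_{w,i+1/2})} \ge 0,\\
\beta_{i+1/2} &= -\frac{\lambda_{w,i+1/2}\,\lambda'_{o,i+1}\,\max\{P^\ell_{i+1} - P^\ell_i + \rho_o g \Delta z,\, 0\}}{\Delta x_{i+1/2}\,(\lambda_{o,i+1/2} + \lambda_{w,i+1/2})}
 + \frac{\lambda_{o,i+1/2}\,\lambda'_{w,i+1}\,\max\{P^\ell_{i+1} - P^\ell_i + \rho_w g \Delta z,\, 0\}}{\Delta x_{i+1/2}\,(\lambda_{o,i+1/2} + \lambda_{w,i+1/2})} \ge 0.
\end{align*}

Thus, with proper scaling, the Schur complement $S_1$ has the form

\[
S_1 = \begin{bmatrix}
\gamma_1 + \alpha_{3/2} + \beta_{1/2} & -\beta_{3/2} & & \\
-\alpha_{3/2} & \gamma_2 + \alpha_{5/2} + \beta_{3/2} & \ddots & \\
 & \ddots & \ddots & -\beta_{N-1/2} \\
 & & -\alpha_{N-1/2} & \gamma_N + \alpha_{N+1/2} + \beta_{N-1/2}
\end{bmatrix},
\]

where $\gamma_i = \Delta x_i \phi_i / \Delta t$. This is a tridiagonal, column diagonally dominant M-matrix,
which means $S_1$ is nonsingular (and in fact positive-stable) whenever $\Delta t > 0$. Hence,
the full Jacobian $J$ is also nonsingular.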
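The structure of this matrix can be checked on a small example. The following is a minimal sketch in plain Python, assuming made-up non-negative values for the coefficients; the function names `assemble_schur_1d` and `column_sums` and the face-indexing convention are illustrative choices, not part of the original text. It assembles the tridiagonal matrix and confirms that every column sum is positive while the off-diagonal entries are non-positive, which is the column diagonal dominance used in the proof.

```python
def assemble_schur_1d(gamma, alpha, beta):
    """Assemble the tridiagonal matrix S1 from the coefficients in the text.

    gamma[i] is the accumulation term of cell i (0-based, i = 0..N-1).
    alpha[f] and beta[f] live on faces: face f separates cells f-1 and f,
    so each list has N+1 entries (faces 0..N, including both boundaries).
    """
    n = len(gamma)
    s1 = [[0] * n for _ in range(n)]
    for i in range(n):
        # Diagonal: gamma_i + alpha_{i+1/2} + beta_{i-1/2}
        s1[i][i] = gamma[i] + alpha[i + 1] + beta[i]
        if i + 1 < n:
            s1[i][i + 1] = -beta[i + 1]   # row i, coupling to sigma_{i+1}
            s1[i + 1][i] = -alpha[i + 1]  # row i+1, coupling to sigma_i
    return s1


def column_sums(matrix):
    """Sum of each column of a square matrix."""
    n = len(matrix)
    return [sum(matrix[i][j] for i in range(n)) for j in range(n)]


# Made-up non-negative coefficients for a three-cell example (assumption).
gamma = [1, 2, 3]
alpha = [5, 1, 2, 7]
beta = [3, 4, 0, 6]
S1 = assemble_schur_1d(gamma, alpha, beta)
print(S1)               # [[5, -4, 0], [-1, 8, 0], [0, -2, 10]]
print(column_sums(S1))  # [4, 2, 10]
```

For an interior column the sum is exactly $\gamma_i$, because the $\alpha$ and $\beta$ contributions cancel; the boundary columns pick up the extra terms $\beta_{1/2}$ and $\alpha_{N+1/2}$.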

We remark that, in most practical reservoir flow simulations, it is extremely rare
to encounter a singular Jacobian unless the linearization point $(S^\ell, P^\ell)$ is so far from
the solution that it is physically inadmissible (e.g., when $P^\ell$ no longer satisfies the
maximum principle).

5.2 CPR preconditioning


One of the most successful approaches for preconditioning fully-implicit Jacobians is
the two-stage constrained pressure residual (CPR) method proposed by Wallis [81].
The method can be viewed as the linear analog of the sequential implicit (SEQ)

method, in the sense that it first decouples the full problem into an elliptic and
a hyperbolic subproblem; then at each iteration, one would first solve the elliptic
problem to obtain an approximate pressure, and then use this pressure to solve the
transport problem. A more precise description in terms of two-stage preconditioners
follows.

For a linear system $Jx = r$, the general two-stage preconditioner is given by

\[
M^{-1} = T_2\,(I - J T_1) + T_1,
\tag{5.2.1}
\]

where $T_1$ and $T_2$ are approximate inverses for $J$, or for the restriction of $J$ onto
some subspace. When $T_1$ and $T_2$ are both invertible, $M^{-1}$ is equivalent to the
preconditioner derived from the two-stage stationary iteration

\begin{align*}
T_1^{-1} x_{k+1/2} &= (T_1^{-1} - J)\, x_k + r,\\
T_2^{-1} x_{k+1} &= (T_2^{-1} - J)\, x_{k+1/2} + r.
\end{align*}

Examples of this type include ADI preconditioners [68], the symmetric SOR method
[39] and the HSS method [7]. The $T_i$ can be singular as well. The special case of

\[
T_i = R_i (R_i^T A R_i)^{-1} R_i^T,
\]

where $R_i^T$ is a restriction operator, corresponds to either a block Gauss-Seidel or a
multiplicative Schwarz method, depending on whether the blocks overlap (cf. [68]).
Since

\[
I - M^{-1} J = (I - T_2 J)(I - T_1 J),
\tag{5.2.2}
\]

one can generally expect $M$ to be a good preconditioner if $T_1$ and $T_2$ complement
each other by closely approximating $J$ on different parts of the spectrum. Other ways
of combining two or more preconditioners to solve a single linear system can be found
in [15].
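The two-stage formula and the factored error operator can be checked numerically. The following is a minimal sketch in plain Python, assuming small dense matrices stored as lists of lists; the helper names (`two_stage_apply`, `error_operator`, `product_form`) and the sample matrices are illustrative assumptions, not taken from the text. It applies $M^{-1}$ as a first-stage correction followed by a second-stage correction on the updated residual, and compares $I - M^{-1}J$ with $(I - T_2 J)(I - T_1 J)$.

```python
def matvec(a, x):
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


def matmul(a, b):
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def subtract(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def two_stage_apply(J, T1, T2, r):
    """Apply M^{-1} = T2 (I - J T1) + T1 to a vector r."""
    y = matvec(T1, r)                                      # first-stage correction
    resid = [ri - zi for ri, zi in zip(r, matvec(J, y))]   # residual after stage 1
    z = matvec(T2, resid)                                  # second-stage correction
    return [yi + zi for yi, zi in zip(y, z)]


def error_operator(J, T1, T2):
    """I - M^{-1} J, built column by column from two_stage_apply."""
    n = len(J)
    cols = [two_stage_apply(J, T1, T2, [row[j] for row in J]) for j in range(n)]
    minv_j = [[cols[j][i] for j in range(n)] for i in range(n)]
    return subtract(identity(n), minv_j)


def product_form(J, T1, T2):
    """(I - T2 J)(I - T1 J), the right-hand side of (5.2.2)."""
    n = len(J)
    return matmul(subtract(identity(n), matmul(T2, J)),
                  subtract(identity(n), matmul(T1, J)))


# Made-up 2 x 2 example; T1 and T2 are singular on purpose (assumption).
J = [[2, 1], [1, 3]]
T1 = [[1, 0], [0, 0]]
T2 = [[0, 0], [0, 1]]
print(error_operator(J, T1, T2))  # [[-1, -1], [1, -1]]
print(product_form(J, T1, T2))    # [[-1, -1], [1, -1]]
```

Note that the sketch never forms $M$ itself, which matters when $T_1$ or $T_2$ is singular.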

5.2.1 True-IMPES reduction

The CPR preconditioner, which operates on the matrix $J$ of size $2N \times 2N$, also has
the form (5.2.1):

\[
M_{CPR}^{-1} = M_2^{-1}\big[I - J C (W^T J C)^{-1} W^T\big] + C (W^T J C)^{-1} W^T,
\tag{5.2.3}
\]

where $W^T$ and $C$, of size $N \times 2N$ and $2N \times N$ respectively, are the restriction and
prolongation operators; $M_2$, of size $2N \times 2N$, is typically a local preconditioner such
as ILU. The goal of the first stage preconditioner is to form a pressure equation

\[
A_p\, \delta p = -r_p,
\]

where $A_p = W^T J C$, that can be solved easily and gives a meaningful approximate
pressure solution $\delta p$. Different choices of $W^T$ and $C$ give rise to different first-stage
preconditioners, which is the subject of study in [44]. One popular choice of the
first-stage preconditioner, called the True-IMPES reduction, uses the IMPES pressure
matrix directly; in this case, $A_p$ is an elliptic operator, so efficient solvers such as
algebraic multigrid [74] can be used to solve the pressure equation. In addition, since
$A_p$ is simply the pressure matrix that arises from a different time discretization, the
solution $\delta p$ is also a meaningful approximation of the FIM pressure solution, at least
when $\Delta t$ is small.

In the general black-oil case with np phases, it is possible to obtain the IMPES
pressure matrix by manipulating J directly. We describe the procedure here infor-
mally (but see [44] for a detailed discussion). Since IMPES treats the transmissibility
derivatives explicitly, one needs to first eliminate these terms from J. This can be done
by performing a column sum (i.e., for each phase p, sum the equations corresponding
to phase p over the whole domain): since mass is conserved, all the flux terms must
cancel, so the transmissibility derivatives will also cancel out. Only accumulation
terms remain, which means that

\[
\hat{J}_{ss} := \mathrm{Colsum}(J_{ss}) \quad\text{and}\quad \hat{J}_{ps} := \mathrm{Colsum}(J_{ps})
\]

are now diagonal matrices. Finally, the pressure equation is obtained by eliminating
the saturation variables, which is equivalent to forming the Schur complement

\[
A_p = J_{pp} - \hat{J}_{ps} \hat{J}_{ss}^{-1} J_{sp}.
\]

The resulting pressure matrix $A_p$ will have the same sparsity pattern as $J_{pp}$ and $J_{sp}$,
since the scaling matrix $\hat{J}_{ps} \hat{J}_{ss}^{-1}$ is diagonal and does not modify the sparsity pattern
of $J_{sp}$. In the incompressible two-phase flow case, Lemma 5.1 shows that

\[
\hat{J}_{ss} = -\hat{J}_{ps} = \frac{1}{\Delta t} D,
\]

so we have the very simple relation $A_p = J_{sp} + J_{pp} = J_{tp}$. Thus, the restriction and
prolongation operators are

\[
W^T = \begin{bmatrix} I & I \end{bmatrix}, \qquad C = \begin{bmatrix} 0 \\ I \end{bmatrix},
\]

and the first-stage preconditioner becomes

\[
T_1 = C (W^T J C)^{-1} W^T = \begin{bmatrix} 0 & 0 \\ J_{tp}^{-1} & J_{tp}^{-1} \end{bmatrix}.
\]

We wish to investigate the effect of the CPR preconditioner on $J$ by computing

\[
M_{CPR}^{-1} J = M_2^{-1} J (I - T_1 J) + T_1 J.
\]

A straightforward calculation shows that

\[
J (I - T_1 J) = \begin{bmatrix}
J_{ss} - J_{sp} J_{tp}^{-1} J_{ts} & 0 \\
J_{ps} - J_{pp} J_{tp}^{-1} J_{ts} & 0
\end{bmatrix}.
\]
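The calculation above can be reproduced on a tiny example. The following is a minimal sketch in plain Python using exact rational arithmetic, assuming $N = 2$ cells and made-up $2 \times 2$ blocks; the block values and helper names are illustrative assumptions, and the hand-written `inverse_2x2` limits the sketch to two cells. It builds $W^T = [I\ I]$ and $C = [0;\ I]$, checks that $W^T J C = J_{sp} + J_{pp}$, and checks that the pressure columns of $J(I - T_1 J)$ vanish.

```python
from fractions import Fraction


def matmul(a, b):
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def subtract(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def hstack(a, b):
    return [ra + rb for ra, rb in zip(a, b)]


def inverse_2x2(m):
    (a, b), (c, d) = m
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]


# Made-up blocks for N = 2 cells (assumption, for illustration only).
Jss = [[3, 0], [-1, 2]]
Jsp = [[1, -1], [-1, 2]]
Jps = [[-2, 0], [1, -1]]
Jpp = [[2, -1], [-1, 1]]

I2 = identity(2)
Z2 = [[0, 0], [0, 0]]
J = hstack(Jss, Jsp) + hstack(Jps, Jpp)   # phase-based ordering, 4 x 4
Wt = hstack(I2, I2)                       # restriction: add the two equations
C = Z2 + I2                               # prolongation: pressure unknowns only

Ap = matmul(matmul(Wt, J), C)             # equals Jsp + Jpp
T1 = matmul(matmul(C, inverse_2x2(Ap)), Wt)
reduced = matmul(J, subtract(identity(4), matmul(T1, J)))

print(Ap)  # [[3, -2], [-2, 3]]
print([row[2:] for row in reduced])  # the pressure block columns, all zero
```

The zero pressure columns are what makes the first stage an exact solve on the pressure part, leaving only the Schur complement of the transport problem for the second stage.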

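Recall that $S_1 = J_{ss} - J_{sp} J_{tp}^{-1} J_{ts}$ is the Schur complement with respect to the
(1,1)-block. If we partition $M_2$ into

\[
M_2 = \begin{bmatrix} \tilde{J}_{ss} & \tilde{J}_{sp} \\ \tilde{J}_{ps} & \tilde{J}_{pp} \end{bmatrix}
\]

and define $\tilde{S}_1 = \tilde{J}_{ss} - \tilde{J}_{sp} \tilde{J}_{tp}^{-1} \tilde{J}_{ts}$ (with $\tilde{J}_{ts} = \tilde{J}_{ss} + \tilde{J}_{ps}$ and
$\tilde{J}_{tp} = \tilde{J}_{sp} + \tilde{J}_{pp}$, by analogy with $J_{ts}$ and $J_{tp}$), we obtain

\[
M_{CPR}^{-1} J = \begin{bmatrix}
\tilde{S}_1^{-1} S_1 & 0 \\
J_{tp}^{-1} J_{ts} - \tilde{J}_{tp}^{-1} \tilde{J}_{ts} \tilde{S}_1^{-1} S_1 & I
\end{bmatrix}.
\tag{5.2.4}
\]

As a result, $M_{CPR}^{-1} J$ has $\lambda = 1$ as an eigenvalue with (geometric) multiplicity at
least $N$, so the first-stage preconditioner clusters all the eigenvalues associated with
the pressure part into the point $z = 1$. Equation (5.2.4) also implies that GMRES
converges in at most $N + 1$ iterations in exact arithmetic. To see this, consider any
matrix of the form

\[
G := \begin{bmatrix} S & 0 \\ Y & I \end{bmatrix}.
\]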
Let $q(t) = \sum_{i=0}^{k} \beta_i t^i$ be the minimal polynomial of $S$, where $\beta_0 \ne 0$ if and only if $S$ is
nonsingular. Then since

\[
G^{i+1} - G^{i} = \begin{bmatrix} S^{i+1} - S^{i} & 0 \\ Y S^{i} & 0 \end{bmatrix},
\]

we see that

\[
\sum_{i=0}^{k} \beta_i \left(G^{i+1} - G^{i}\right) = 0.
\]

So the minimal polynomial of $G$, $\tilde{q}(t)$, has degree at most $k + 1$, and $\tilde{q}(0) \ne 0$ if
and only if $S$ is nonsingular. In the case of $G = M_{CPR}^{-1} J$, $\tilde{q}(t)$ has degree at most
$N + 1$; this implies the convergence of GMRES within $N + 1$ iterations, since the
$m$-th residual $r_m$ of GMRES satisfies

\[
\|r_m\|_2 = \min_{\substack{p_m \in P_m \\ p_m(0) = 1}} \|p_m(G)\, r_0\|_2,
\]

where $P_m$ denotes the set of polynomials with degree at most $m$.
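The step from the vanishing sum to the degree bound can be spelled out. The following is a short derivation sketch; the polynomial $(t - 1)\,q(t)$ used here is one annihilating polynomial of $G$, so the minimal polynomial of $G$ divides it:

```latex
\sum_{i=0}^{k} \beta_i \left(G^{i+1} - G^{i}\right)
  = (G - I)\, q(G)
  = \begin{bmatrix} (S - I)\, q(S) & 0 \\ Y\, q(S) & 0 \end{bmatrix}
  = 0
  \quad\text{since } q(S) = 0 .
% Hence (t - 1) q(t) annihilates G and has degree k + 1.
% Its value at t = 0 is -q(0) = -\beta_0, which is nonzero exactly when S is nonsingular.
```

Since $S$ is $N \times N$, its minimal polynomial has degree $k \le N$, which gives the bound $N + 1$ quoted above.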


Given the role the matrix $\tilde{S}_1^{-1} S_1$ plays, the convergence behavior of CPR is
predominantly dictated by how well the second-stage preconditioner $M_2$ approximates
the Schur complement $S_1$ with respect to the transport problem.

5.2.2 Improved second-stage preconditioner via ordering


The choice of second-stage preconditioners has a significant impact on the effectiveness
of the overall CPR preconditioner. Based on (5.2.4), it is clear that an effective second-
stage preconditioner must perform well on the transport problem. Popular choices
for the second-stage preconditioners include ILU(k) (typically k = 0) as well as block
ILU(k), where the $n_p$-by-$n_p$ blocks correspond to the $n_p$ equations in $n_p$ unknowns
aligned with a given control volume. Even though both pointwise and cell-based
block ILU(k) have similar performance in practice, the block variant is generally
more robust and easier to analyze, since no special procedure is needed to handle
accidental zero entries arising from residual saturations. For instance, the (i, j) entry
in Jsp is generally nonzero if i and j are adjacent gridblocks, but can become zero
occasionally if λw = 0 at the i − j interface. When this happens, pointwise ILU will
drop any fill-in that occurs at the (i, j) position, whereas block ILU will retain the
fill-in entry. For the remainder of this section, we mainly focus on block ILU to avoid
complications of this sort.
The effectiveness of BILU (0) on the transport problem is demonstrated next.
Proposition 5.4. Let A be the Jacobian (in block form) of a 1D flow problem, with
the cells ordered from left to right. Then if the BILU (0) factorization of A exists
(i.e., if no singular diagonal block occurs during factorization), it is exact.
Proof. Since A is block tridiagonal, no fill-in occurs during block Gaussian elimina-
tion, so the block LU and BILU(0) factorizations coincide. As a result, the BILU(0)
factorization is exact.
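The argument can be illustrated with a scalar analogue. The following is a minimal sketch in plain Python, assuming point (not block) entries and made-up matrices; `ilu0` and `multiply_factors` are hypothetical helper names. It runs Gaussian elimination restricted to the nonzero pattern of the input, which reproduces the matrix exactly in the tridiagonal case and fails to do so for a pattern where elimination creates fill-in.

```python
from fractions import Fraction


def ilu0(a):
    """ILU(0): Gaussian elimination restricted to the nonzero pattern of a.

    Returns L (unit lower, strictly lower part) and U (upper part) packed
    into a single matrix.
    """
    n = len(a)
    lu = [[Fraction(v) for v in row] for row in a]
    pattern = [[a[i][j] != 0 for j in range(n)] for i in range(n)]
    for k in range(n - 1):
        for i in range(k + 1, n):
            if not pattern[i][k]:
                continue
            lu[i][k] /= lu[k][k]
            for j in range(k + 1, n):
                if pattern[i][j]:  # updates outside the pattern are dropped
                    lu[i][j] -= lu[i][k] * lu[k][j]
    return lu


def multiply_factors(lu):
    """Form the product L U from the packed factors."""
    n = len(lu)
    lower = [[lu[i][j] if j < i else (1 if i == j else 0) for j in range(n)]
             for i in range(n)]
    upper = [[lu[i][j] if j >= i else 0 for j in range(n)] for i in range(n)]
    return [[sum(lower[i][k] * upper[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


tridiagonal = [[4, -1, 0], [-1, 4, -1], [0, -1, 4]]   # 1D-like coupling
arrow = [[4, -1, -1], [-1, 4, 0], [-1, 0, 4]]         # node 0 couples to 1 and 2

print(multiply_factors(ilu0(tridiagonal)) == tridiagonal)  # True
print(multiply_factors(ilu0(arrow)) == arrow)              # False
```

In the second matrix, eliminating node 0 would couple nodes 1 and 2; ILU(0) drops that entry, so the product of the factors differs from the original matrix at positions (1, 2) and (2, 1).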

Note that ILU is only exact if the cell-based block form of the Jacobian is used.
Fill-in necessarily occurs if the partitioned form of the Jacobian is used, since the
Schur complement

\[
S_2 := J_{pp} - J_{ps} J_{ss}^{-1} J_{sp}
\]

is not tridiagonal. In fact, $J_{ps} J_{ss}^{-1}$ is in general a full lower-triangular matrix, which
means $S_2$ is in general a full lower Hessenberg matrix. Thus, one should expect
BILU(0) to be a better second-stage preconditioner than ILU on the partitioned
matrix $J$.

It is usually difficult to ascertain a priori that the BILU(0) factorization exists for
the general two-phase flow problem. However, in the special case of cocurrent flow,
we can prove the existence of BILU(0) when a cell-based potential ordering is used.

Theorem 5.5. Let $J$ be the Jacobian corresponding to a cocurrent flow problem
linearized at $(S^\ell, P^\ell)$, and suppose the pressure profile $P^\ell$ satisfies the maximum
principle. Assume the cell-centered grid admits a two-coloring. Let $A = P J P^T$ be the block
form of the Jacobian, in which the cells are arranged in decreasing order of pressure.
Then the block ILU(0) factorization of $A$ exists with nonsingular factors $L$ and $U$.
Moreover, we have

\[
P^T (L U) P = J + E,
\tag{5.2.5}
\]

where $E = \begin{bmatrix} 0 & E_{sp} \\ 0 & E_{pp} \end{bmatrix}$.

In other words, BILU(0) is exact on the saturation part. The assumption that $P^\ell$
satisfies the maximum principle implies that the cell(s) with the lowest pressure must
be on a Dirichlet boundary. Also note that this theorem is applicable to cocurrent
flow problems in any dimension, and not just for 1D flows, as long as the grid is
two-colorable. This applies to many grids of practical interest (Cartesian and other
orthogonal grids, radial grids, etc.). In light of (5.2.2) and the fact that $T_1$ is exact
on the pressure part, Theorem 5.5 indicates that BILU(0) should be an excellent
preconditioner as long as $E_{sp}$ and $E_{pp}$ are not too large.

Proof. Let $A = P J P^T$ and let $A_{ij}$ be the $2 \times 2$ blocks. Let $A^{(k)}$ be the block
$(N - k + 1) \times (N - k + 1)$ matrix that remains to be factored at the $k$th step, i.e., we have
$A = A^{(1)}$,

\[
A^{(k)} = \begin{bmatrix}
A^{(k)}_{kk} & A^{(k)}_{k,k+1} & \cdots & A^{(k)}_{kN} \\
A^{(k)}_{k+1,k} & \ddots & & \vdots \\
\vdots & & \ddots & \vdots \\
A^{(k)}_{Nk} & \cdots & \cdots & A^{(k)}_{NN}
\end{bmatrix},
\]

\[
A^{(k+1)}_{ij} =
\begin{cases}
0, & A^{(k)}_{ij} = 0;\\
A^{(k)}_{ij} - A^{(k)}_{ik} \big(A^{(k)}_{kk}\big)^{-1} A^{(k)}_{kj}, & A^{(k)}_{ij} \ne 0
\end{cases}
\]

for $i, j \ge k + 1$.

1. We argue that

\[
A_{ij} = A^{(1)}_{ij} = \cdots = A^{(K)}_{ij}
\tag{5.2.6}
\]

whenever $i \ne j$ and $K \le \min(i, j)$. This is true because a two-coloring exists for
any Cartesian grid, i.e., one can partition the gridblocks $V = \{1, 2, \ldots, N\}$ into
disjoint sets $V_R$ and $V_B$ such that $A_{ij} = 0$ whenever $i \ne j$ and either $i, j \in V_R$
or $i, j \in V_B$. Thus, when $i \ne j$, either $A^{(k)}_{ij} = 0$ or $A^{(k)}_{ik} \big(A^{(k)}_{kk}\big)^{-1} A^{(k)}_{kj} = 0$, which
implies (5.2.6). Thus, the only blocks that change during the elimination are the
diagonal blocks $A^{(k)}_{ii}$.

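2. Because of upstream weighting in the finite-volume discretization, we see that for
$i < j$,

\[
A_{ij} = \begin{bmatrix} 0 & X_{ij} \\ 0 & Y_{ij} \end{bmatrix}.
\]

Thus, $A^{(k)}_{ik} \big(A^{(k)}_{kk}\big)^{-1} A^{(k)}_{kj}$ also has the form

\[
\begin{bmatrix} 0 & * \\ 0 & * \end{bmatrix},
\tag{5.2.7}
\]

which means only the second column of $A^{(k)}_{ii}$ gets updated during the elimination.
So we have

\[
A^{(k)}_{ii} = \begin{bmatrix} a_{ii} & X^{(k)}_{ii} \\ -b_{ii} & Y^{(k)}_{ii} \end{bmatrix};
\qquad
A^{(k)}_{ij} = A_{ij} = \begin{bmatrix} -a_{ij} & -X_{ij} \\ b_{ij} & -Y_{ij} \end{bmatrix} \quad (i \ne j).
\tag{5.2.8}
\]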

3. Let $\gamma_i = \phi_i V_i / \Delta t \ge \gamma_{\min} > 0$. The following properties hold for $A = A^{(1)}$:

• $a_{ij}$, $b_{ij}$, $X_{ij}$, $Y_{ij}$ are all non-negative;

• $X_{ij} = X_{ji}$ and $Y_{ij} = Y_{ji}$ for all $i \ne j$;

• $a_{jj} \ge \gamma_j + \sum_{i > j} a_{ij}$;

• $b_{jj} \ge \gamma_j + \sum_{i > j} b_{ij}$;

• $X_{ii} \ge \sum_{j \ne i} X_{ij}$;

• $Y_{ii} \ge \sum_{j \ne i} Y_{ij}$;

• For a cell $i$ on the Dirichlet boundary, $X_{ii} + Y_{ii} > \sum_{j \ne i} (X_{ij} + Y_{ij})$.

We prove inductively that for any given $k$ and any $i, j \ge k$,

(a) $X^{(k)}_{ii} \ge \sum_{j \ge k,\, j \ne i} X_{ij} \ge 0$;

(b) $Y^{(k)}_{ii} \ge \sum_{j \ge k,\, j \ne i} Y_{ij} \ge 0$;

(c) $X^{(k)}_{ii} + Y^{(k)}_{ii} > \sum_{j \ge k,\, j \ne i} (X_{ij} + Y_{ij}) \ge 0$ for a cell $i$ on the Dirichlet boundary.

Then (a)–(c) together would imply that $A^{(k)}_{ii}$ is nonsingular for all $k \le i$. Assume
first that cell $i$ is not on the Dirichlet boundary. Then by the maximum principle,
there is at least one cell downstream from $i$. Thus,

\[
\det A^{(k)}_{ii} = a_{ii} Y^{(k)}_{ii} + b_{ii} X^{(k)}_{ii} \ge \gamma_{\min} \big(X^{(k)}_{ii} + Y^{(k)}_{ii}\big) \ge \gamma_{\min} \lambda_{t,\min} > 0.
\]

Similarly, if $i$ is on the Dirichlet boundary, then (c) implies

\[
\det A^{(k)}_{ii} \ge \gamma_{\min} \big(X^{(k)}_{ii} + Y^{(k)}_{ii}\big) > 0.
\]

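Clearly, conditions (a)–(c) are satisfied for $k = 1$. For the inductive step, we
compute

\[
A^{(k)}_{ik} \big(A^{(k)}_{kk}\big)^{-1} A^{(k)}_{ki}
 = \frac{1}{\det A^{(k)}_{kk}}
 \begin{bmatrix} -a_{ik} & -X_{ik} \\ b_{ik} & -Y_{ik} \end{bmatrix}
 \begin{bmatrix} Y^{(k)}_{kk} & -X^{(k)}_{kk} \\ b_{kk} & a_{kk} \end{bmatrix}
 \begin{bmatrix} 0 & -X_{ki} \\ 0 & -Y_{ki} \end{bmatrix}
 = \begin{bmatrix} 0 & dX^{(k)}_{ii} \\ 0 & dY^{(k)}_{ii} \end{bmatrix}.
\]

We have

\begin{align*}
\big(\det A^{(k)}_{kk}\big)\, dX^{(k)}_{ii}
 &= a_{ik} Y^{(k)}_{kk} X_{ki} - a_{ik} X^{(k)}_{kk} Y_{ki} + X_{ik} b_{kk} X_{ki} + X_{ik} a_{kk} Y_{ki}\\
 &\le a_{ik} Y^{(k)}_{kk} X_{ki} - a_{ik} X_{ki} Y_{ki} + X_{ik} b_{kk} X_{ki} + X_{ik} a_{kk} Y_{ki}\\
 &= X_{ik} \big(a_{ik} Y^{(k)}_{kk} - a_{ik} Y_{ki} + b_{kk} X_{ki} + a_{kk} Y_{ki}\big)\\
 &= X_{ik} \big(a_{ik} Y^{(k)}_{kk} + (a_{kk} - a_{ik}) Y_{ki} + b_{kk} X_{ki}\big)\\
 &\le X_{ik} \big(a_{ik} Y^{(k)}_{kk} + (a_{kk} - a_{ik}) Y^{(k)}_{kk} + b_{kk} X^{(k)}_{kk}\big)\\
 &= X_{ik} \big(a_{kk} Y^{(k)}_{kk} + b_{kk} X^{(k)}_{kk}\big)\\
 &= \big(\det A^{(k)}_{kk}\big)\, X_{ik}.
\end{align*}

Thus, $dX^{(k)}_{ii} \le X_{ik}$, and by a similar calculation, we get $dY^{(k)}_{ii} \le Y_{ik}$. Hence,

\[
X^{(k+1)}_{ii} = X^{(k)}_{ii} - dX^{(k)}_{ii}
 \ge \Big(\sum_{j \ge k,\, j \ne i} X_{ij}\Big) - X_{ik}
 \ge \sum_{j \ge k+1,\, j \ne i} X_{ij},
\]

proving (a); (b) and (c) are proved similarly.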

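4. The above argument shows that the block ILU(0) factorization of $A$ exists and
has the form

\[
L = \begin{bmatrix}
I & & & \\
L_{21} & I & & \\
\vdots & \ddots & \ddots & \\
L_{N1} & \cdots & L_{N,N-1} & I
\end{bmatrix},
\qquad
U = \begin{bmatrix}
U_{11} & U_{12} & \cdots & U_{1N} \\
 & U_{22} & \ddots & \vdots \\
 & & \ddots & U_{N-1,N} \\
 & & & U_{NN}
\end{bmatrix},
\]

where

\[
L_{ij} =
\begin{cases}
A_{ij} \big(A^{(j)}_{jj}\big)^{-1}, & i > j,\ A_{ij} \ne 0,\\
I, & i = j,\\
0, & \text{otherwise};
\end{cases}
\tag{5.2.9}
\]

\[
U_{ij} =
\begin{cases}
A_{ij}, & i < j,\ A_{ij} \ne 0,\\
A^{(i)}_{ii}, & i = j,\\
0, & \text{otherwise}.
\end{cases}
\tag{5.2.10}
\]

Clearly, $L$ and $U$ are both nonsingular. Each $2 \times 2$ block in the factorization error
$P E P^T$ has the form $A_{ik} \big(A^{(k)}_{kk}\big)^{-1} A_{kj}$, which has the pattern shown in (5.2.7). Thus,
after permutation, we get

\[
E = \begin{bmatrix} 0 & E_{sp} \\ 0 & E_{pp} \end{bmatrix},
\]

as required.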

As we pointed out in Section 3.2.4, it is not necessary to perform an exact sorting on
the cell pressures in order to obtain a potential ordering. Instead, a topological sort,
which can be calculated in $O(N)$ time, suffices. This implies there exist many ways to
order the cells in such a way that $J_{ss}$ and $J_{ps}$ are triangular. Although it is conceivable
that the different topological orderings will lead to different ILU preconditioners, the
next theorem shows that they are in fact identical up to permutation.

Theorem 5.6. Assume the hypotheses of Theorem 5.5. Let $G = (V, E)$ be the
upstream graph, i.e., $V$ is the set of cells in the domain, and $(i, j) \in E$ iff (1) $i$ is
adjacent to $j$, and (2) either $P^\ell_i > P^\ell_j$, or $P^\ell_i = P^\ell_j$ and $i > j$. Let

\[
\sigma_1 : V \to \{1, \ldots, N\}, \qquad \sigma_2 : V \to \{1, \ldots, N\}
\]

be two topological orderings of $G$, and define $A_r = \Pi_r A \Pi_r^T$ ($r = 1, 2$), where the $\Pi_r$
are block $N \times N$ permutation matrices with

\[
(\Pi_r)_{ij} =
\begin{cases}
I, & j = \sigma_r(i),\\
0, & \text{otherwise}.
\end{cases}
\]

Then

\[
\Pi_1^T L_1 U_1 \Pi_1 = \Pi_2^T L_2 U_2 \Pi_2,
\]

where $L_r$ and $U_r$ are the block ILU(0) factors of $A_r$.

Proof. Let $\tau_r : \{1, \ldots, N\} \to V$ be the inverse of $\sigma_r$, $r = 1, 2$. Based on the expressions
for $L_r$ and $U_r$ given by (5.2.9) and (5.2.10), it suffices to show that the diagonal blocks
of $\Pi_1^T U_1 \Pi_1$ and $\Pi_2^T U_2 \Pi_2$ are identical. These diagonal blocks are given by

\[
(U_r)_{ii} = (A_r)^{(i)}_{ii} = (A_r)_{ii} - \sum_{k < i} (A_r)_{ik} \big((A_r)^{(k)}_{kk}\big)^{-1} (A_r)_{ki},
\]

but since $(A_r)_{ik} = 0$ unless $(\tau_r(k), \tau_r(i)) \in E$, we really have

\[
(U_r)_{ii} = (A_r)_{ii} - \sum_{(\tau_r(k),\, \tau_r(i)) \in E} (A_r)_{ik}\, (U_r)_{kk}^{-1}\, (A_r)_{ki}.
\]

Thus, for any $j \in V$, we have $(U_1)_{\sigma_1(j), \sigma_1(j)} = (U_2)_{\sigma_2(j), \sigma_2(j)}$ if and only if

\[
(U_1)_{\sigma_1(k), \sigma_1(k)} = (U_2)_{\sigma_2(k), \sigma_2(k)} \quad \text{for all } k \text{ such that } (k, j) \in E,
\]

which is true by induction (recall that $G$ is a directed acyclic graph).

Theorem 5.6 says that there is essentially only one BILU(0) preconditioner that
respects flow directions.
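The upstream graph and one of its topological orderings can be sketched as follows. This is a minimal illustration in plain Python using Kahn's algorithm, which runs in time linear in the number of cells and edges; the function names, the adjacency dictionary and the pressure values are assumptions made for the example and are not taken from the text.

```python
from collections import deque


def upstream_edges(pressure, neighbors):
    """Edges (i, j) of the upstream graph: i is adjacent to j and upstream of j."""
    edges = []
    for i, nbrs in neighbors.items():
        for j in nbrs:
            # Higher pressure is upstream; ties are broken by cell index.
            if pressure[i] > pressure[j] or (pressure[i] == pressure[j] and i > j):
                edges.append((i, j))
    return edges


def potential_ordering(pressure, neighbors):
    """Kahn's algorithm: every cell appears after all of its upstream neighbors."""
    indegree = {i: 0 for i in neighbors}
    downstream = {i: [] for i in neighbors}
    for i, j in upstream_edges(pressure, neighbors):
        downstream[i].append(j)
        indegree[j] += 1
    ready = deque(i for i in neighbors if indegree[i] == 0)
    order = []
    while ready:
        i = ready.popleft()
        order.append(i)
        for j in downstream[i]:
            indegree[j] -= 1
            if indegree[j] == 0:
                ready.append(j)
    return order


# A 2 x 2 grid with cells numbered 0..3 and made-up pressures (assumption).
neighbors = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2]}
pressure = [4.0, 3.0, 3.0, 1.0]
order = potential_ordering(pressure, neighbors)
print(order)  # [0, 1, 2, 3]
```

In this example the ordering [0, 2, 1, 3] would be equally valid, which is the kind of freedom that Theorem 5.6 shows to be harmless.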

Structure of the factorization error

It is possible to describe the nonzero pattern of the error matrices Esp and Epp in
terms of the upstream graph G. A fill-in entry is created (and subsequently dropped
by BILU(0)) at position (i, j), with i 6= j, if there exists k < i, j such that both Aik
5.2. CPR PRECONDITIONING 137

and Ajk is nonzero. In other words, if Esp and Epp are nonzero at position (i, j), then
nodes i and j must be siblings in the upstream graph G, i.e., i and j must share the
same parent k. This immediately provides an upper bound on the number of entries
in Esp and Epp : the number of error entries due to the elimination of node k is given
by dk (dk − 1), where dk is the out-degree of node k (i.e., the number of edges coming
out of k). So the total number of entries in Esp and Epp is bounded by
X X X
di (di − 1) = d2i − di = |V |D2 − |E|,
i i i

where D is the maximum out-degree of any node in G. On a Cartesian grid, the


parameters are

• 2D problems: D ≤ 4, |E| ≈ 5|V |;

• 3D problems: D ≤ 6, |E| ≈ 7|V |.

So in either case, the error matrices are sparse, since the number of entries scales
linearly with |V |. Moreover, the value of the entries are given by
" #
0 (Esp )ij (k)
= −Aik (Akk )−1 Akj .
0 (Epp )ij

Physically, this corresponds to the flux from cell j to cell i (traveling via k) that is
generated by a change in pressure pj . Since the potential ordering always orders the
cells according to the major flow direction, the fluxes between siblings are generally
much smaller than fluxes along upstream edges. This implies the error matrices
Esp and Epp are small. Contrast this with a lexicographical ordering, where there
is no guarantee that the flux between siblings should be small. Thus, a second-
stage preconditioner that uses potential ordering should be more effective than one
that uses the natural ordering. This is what we observe in our numerical examples
(section 5.2.4).
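
The counting argument for the error entries can be checked on a small graph. The following Python sketch is illustrative only: the adjacency-list encoding and the names `fill_positions` and `fill_bounds` are assumptions made for this example, not part of the thesis implementation. It enumerates the ordered sibling pairs (i, j) and compares their number with the per-node sum of d_k(d_k − 1) and with the coarser bound |V|D² − |E|.

```python
from itertools import permutations


def fill_positions(children):
    """Return the set of ordered sibling pairs (i, j) with i != j.

    children[k] lists the nodes reached by an edge out of node k
    (a hypothetical adjacency-list encoding of the upstream graph).
    """
    pairs = set()
    for kids in children.values():
        pairs.update(permutations(kids, 2))
    return pairs


def fill_bounds(children):
    """Return (sum_k d_k (d_k - 1), |V| D^2 - |E|) for the graph."""
    degrees = [len(kids) for kids in children.values()]
    per_node = sum(d * (d - 1) for d in degrees)
    coarse = len(degrees) * max(degrees) ** 2 - sum(degrees)
    return per_node, coarse


# A 2 x 2 grid with flow from cell 0 towards cell 3 (illustrative only).
example = {0: [1, 2], 1: [3], 2: [3], 3: []}
positions = fill_positions(example)
per_node_bound, coarse_bound = fill_bounds(example)
```

In this example only cells 1 and 2 share a parent, so the dropped fill-in can occur only at positions (1, 2) and (2, 1).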

5.2.3 Spectrum of the preconditioned matrix

To understand the effectiveness of two-stage CPR preconditioning, it is instructive to
examine the spectrum of the preconditioned matrix $M_{CPR}^{-1} J$ and compare it with the
spectrum obtained from other preconditioners. Generally speaking, iterative solvers
such as GMRES perform well when the spectrum of M −1 J consists of a few compact,
well-separated clusters far away from the origin (z = 0 on the complex plane), and
preferably close to z = 1. It is important to note that when M −1 J is non-normal,
its eigenvalues do not completely determine the convergence behavior of GMRES
(cf. [40, 55]); however, spectral plots still have heuristic value, because they allow us
to compare visually the quality of various preconditioners.
We have already seen that, thanks to the first stage exact pressure solve, the
preconditioned matrix has N eigenvalues at z = 1, while the remaining eigenvalues
are given by the spectrum of S̃1−1 S1 . We compute these eigenvalues for the BILU(0)
case. We have

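\[
\begin{aligned}
I - M_{CPR}^{-1} J &= (I - M_2^{-1} J)\,(I - C (W^T J C)^{-1} W^T J) \\
&= M_2^{-1} (M_2 - J)\,(I - C (W^T J C)^{-1} W^T J) \\
&= M_2^{-1} \begin{bmatrix} 0 & E_{sp} \\ 0 & E_{pp} \end{bmatrix}
   \begin{bmatrix} I & 0 \\ -J_{tp}^{-1} J_{ts} & 0 \end{bmatrix} \\
&= \begin{bmatrix} -\hat{E}_{sp} J_{tp}^{-1} J_{ts} & 0 \\ -\hat{E}_{pp} J_{tp}^{-1} J_{ts} & 0 \end{bmatrix}.
\end{aligned}
\]

So $\tilde{S}_1^{-1} S_1 = I + \hat{E}_{sp} J_{tp}^{-1} J_{ts}$, where

\[
\begin{aligned}
\begin{bmatrix} \hat{E}_{sp} \\ \hat{E}_{pp} \end{bmatrix}
&= M_2^{-1} \begin{bmatrix} E_{sp} \\ E_{pp} \end{bmatrix} \\
&= \begin{bmatrix}
     \tilde{S}_1^{-1} \tilde{J}_{pp} \tilde{J}_{tp}^{-1} & -\tilde{S}_1^{-1} \tilde{J}_{sp} \tilde{J}_{tp}^{-1} \\
     -\tilde{S}_2^{-1} J_{ps} J_{ss}^{-1} & \tilde{S}_2^{-1}
   \end{bmatrix}
   \begin{bmatrix} E_{sp} \\ E_{pp} \end{bmatrix} \\
&= \begin{bmatrix}
     \tilde{S}_1^{-1} (\tilde{J}_{pp} \tilde{J}_{tp}^{-1} E_{sp} - \tilde{J}_{sp} \tilde{J}_{tp}^{-1} E_{pp}) \\
     \tilde{S}_2^{-1} (E_{pp} - J_{ps} J_{ss}^{-1} E_{sp})
   \end{bmatrix}.
\end{aligned}
\]

Thus,

\[
\begin{aligned}
\hat{E}_{sp} J_{tp}^{-1} J_{ts}
&= \tilde{S}_1^{-1} \left[ \tilde{J}_{pp} \tilde{J}_{tp}^{-1} (\tilde{J}_{sp} - J_{sp}) - \tilde{J}_{sp} \tilde{J}_{tp}^{-1} (\tilde{J}_{pp} - J_{pp}) \right] J_{tp}^{-1} J_{ts} \\
&= \tilde{S}_1^{-1} \left[ (\tilde{J}_{sp} - J_{sp}) - \tilde{J}_{sp} \tilde{J}_{tp}^{-1} (\tilde{J}_{tp} - J_{tp}) \right] J_{tp}^{-1} J_{ts} \\
&= \tilde{S}_1^{-1} \left[ \tilde{J}_{sp} \tilde{J}_{tp}^{-1} - J_{sp} J_{tp}^{-1} \right] J_{ts}.
\end{aligned}
\]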

It is interesting to compare the above expressions with the spectrum one would get
with a single-stage BILU(0) preconditioner (i.e., no pressure solve). In that case, we
would have

\[ M_2^{-1} J = \begin{bmatrix} I & -\hat{E}_{sp} \\ 0 & I - \hat{E}_{pp} \end{bmatrix}. \]

Therefore, in the single-stage BILU case, we would still have termination within N + 1
steps when flow is cocurrent, but the convergence behavior from iteration 1 to N is
dictated by $I - \hat{E}_{pp}$, instead of $I + \hat{E}_{sp} J_{tp}^{-1} J_{ts}$. We expect the CPR preconditioner to
outperform single-stage BILU(0) based on the following (somewhat heuristic) reasons:

1. If $\|J_{ts}\|$ is small (e.g., when the overall flow (total mass balance equations) is
slowly varying with respect to the time-step size), then $\tilde{S}_1^{-1} S_1$ will be close to
the identity matrix, whereas this is not the case for single-stage BILU. In many
practical applications, the total velocity, which dictates $J_{tp}$, does not vary much
within a time step, so CPR would have a significant advantage over BILU(0).

2. It can be shown (see Appendix E) that both $J_{sp} J_{tp}^{-1}$ and $J_{pp} J_{tp}^{-1}$ are similar to
a symmetric positive semi-definite matrix, with eigenvalues between 0 and 1.
Thus, even though factorization errors are present, one can also expect $\tilde{J}_{sp} \tilde{J}_{tp}^{-1}$,
and hence the term $\tilde{J}_{sp} \tilde{J}_{tp}^{-1} - J_{sp} J_{tp}^{-1}$, to be relatively benign. On the other
hand, one cannot bound the eigenvalues of $J_{ps} J_{ss}^{-1}$: when ∆t is large, $J_{ps} J_{ss}^{-1}$
can have both very large eigenvalues (when $|\lambda_{o,i}| \gg |\lambda_{w,i}|$) and very small ones
(when $|\lambda_{o,i}| \ll |\lambda_{w,i}|$). So any bound on $\hat{E}_{sp}$ is likely to be much tighter than a
bound on $\hat{E}_{pp}$.

3. It is easy to see that $I - \hat{E}_{pp} = \tilde{S}_2^{-1} S_2$, so the use of single-stage BILU(0) is
equivalent to preconditioning the Schur complement with respect to pressure
($S_2$) by a fixed-pattern ILU preconditioner. Contrast this with CPR, which
attempts to precondition $S_1$, the Schur complement with respect to saturation.
The two Schur complements $S_1$ and $S_2$ are related by

\[
\begin{aligned}
S_1 &= J_{ss} - J_{sp} J_{tp}^{-1} J_{ts} = J_{ss} (I - J_{ss}^{-1} J_{sp} J_{tp}^{-1} J_{ts}), \\
S_2 &= J_{tp} - J_{ts} J_{ss}^{-1} J_{sp} = J_{tp} (I - J_{tp}^{-1} J_{ts} J_{ss}^{-1} J_{sp}).
\end{aligned}
\]

Since the two matrices inside the parentheses have the same eigenvalues, it is
evident that $S_1$ behaves more like the transport part $J_{ss}$, whereas $S_2$ behaves
more like the elliptic part $J_{tp}$. In particular, one expects $\kappa(S_1)$ to scale like
$O(\Delta t / h)$, whereas $\kappa(S_2)$ would be $O(1/h^2)$. We also know that fixed-pattern
ILU preconditioners tend to perform poorly on elliptic problems. This indicates
CPR should, in general, outperform single-stage BILU(0).
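
The statement that the two matrices in parentheses share their eigenvalues is an instance of the general fact that AB and BA have the same characteristic polynomial when A and B are square, here with A standing for $J_{ss}^{-1} J_{sp}$ and B for $J_{tp}^{-1} J_{ts}$. The minimal Python check below uses arbitrary 2 × 2 integer matrices chosen purely for illustration (they are not taken from any reservoir Jacobian) and compares trace and determinant, which together fix the characteristic polynomial of a 2 × 2 matrix.

```python
def matmul(a, b):
    """Dense matrix product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def trace_and_det_2x2(m):
    """Return (trace, determinant), which fix the 2x2 characteristic polynomial."""
    return m[0][0] + m[1][1], m[0][0] * m[1][1] - m[0][1] * m[1][0]


# Stand-ins for A = Jss^{-1} Jsp and B = Jtp^{-1} Jts (illustrative values only).
A = [[1, 2], [3, 4]]
B = [[0, 1], [5, 2]]

ab = trace_and_det_2x2(matmul(A, B))
ba = trace_and_det_2x2(matmul(B, A))
```

Both products have trace 21 and determinant 10, so they share the same pair of eigenvalues even though AB and BA are different matrices.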

To illustrate these arguments, we show the spectral plots of $J$, $M_2^{-1} J$ and $M_{CPR}^{-1} J$,
as well as their condition numbers, for various time-step sizes in Figures 5.1 and 5.2.
For this test case, we have a 2D homogeneous reservoir (with uniform porosity),
discretized on a 20 × 10 grid. A constant injection rate is imposed along the left
edge, and pressure is held constant along the right edge, with no flow boundaries
along the top and bottom. We see that the spectrum of J changes significantly as
∆t varies. The condition number is very large for all cases, and there is no obvious
clustering of eigenvalues, which means GMRES will likely perform poorly without
preconditioning. When BILU(0) is used, the spectrum lies almost completely on the
positive real axis, but the distribution is continuous and no obvious clustering exists;
in fact, the spectrum looks very similar to one belonging to an elliptic operator
(possibly due to Jtp appearing as a multiplicative factor in S2 ). When two-stage CPR
is used, the clustering around z = 1 becomes very obvious, and the high quality of
the clustering is remarkably consistent across time steps.
Figures 5.3 and 5.4 show the same spectral plots when countercurrent flow is
present. The same comments concerning the spectra of J and M2−1 J apply, except
that the condition numbers become much higher. As for the CPR-preconditioned
matrix, we still see a very good clustering of eigenvalues around z = 1, but the cluster
is not as tight, and we start to see more spreading along the positive real axis. This
is probably due to the fact that M2 is no longer exact with respect to saturation, and
this factorization error manifests itself as a spreading of the eigenvalues. Fortunately,
the outlying eigenvalues are well separated from one another, so GMRES should
have little problem eliminating the subspaces associated with them within a few
iterations. Table 5.1 shows the linear iteration counts per Newton step for both the
CPR and block ILU(0) preconditioners on the 20 × 10 grid. For both the cocurrent
and countercurrent flow cases, it is evident that the higher quality clustering produced
by CPR does, in fact, translate into much faster convergence compared with block
ILU(0).

Table 5.1: Convergence behavior for the block ILU(0) and CPR preconditioners.
Each figure represents the average number of GMRES iterations per Newton step
required for convergence.

Flow             Preconditioner   ∆t = 1.6   ∆t = 3.1   ∆t = 7.8
Cocurrent        BILU(0)            23.0       22.7       23.0
                 CPR                 3.7        4.3        5.0
Countercurrent   BILU(0)            22.4       22.0       22.0
                 CPR                 5.4        6.0        7.0
[Figure 5.1: Spectra of Jacobian (no preconditioning), BILU(0) and CPR preconditioning
for the cocurrent flow problem (∆t = 1, 5). Each panel plots the eigenvalues in the
complex plane. Condition numbers reported under the panels: no preconditioning,
κ = 5961.8005 (∆t = 1) and κ = 5414.6887 (∆t = 5); ILU only, κ = 358.3443 (∆t = 1)
and κ = 299.1005 (∆t = 5); CPR preconditioned, κ = 103.6347 (∆t = 1) and
κ = 61.642 (∆t = 5).]
[Figure 5.2: Spectra of Jacobian (no preconditioning), BILU(0) and CPR preconditioning
for the cocurrent flow problem (∆t = 20, 100). Each panel plots the eigenvalues in the
complex plane. Condition numbers reported under the panels: no preconditioning,
κ = 5683.9447 (∆t = 20) and κ = 13524.7056 (∆t = 100); ILU only, κ = 234.6865 (∆t = 20)
and κ = 222.3062 (∆t = 100); CPR preconditioned, κ = 131.5138 (∆t = 20) and
κ = 149.9701 (∆t = 100).]
[Figure 5.3: Spectra of Jacobian (no preconditioning), BILU(0) and CPR preconditioning
for the countercurrent flow problem (∆t = 1, 5). Each panel plots the eigenvalues in the
complex plane. Condition numbers reported under the panels: no preconditioning,
κ = 5601.3062 (∆t = 1) and κ = 6741.5014 (∆t = 5); ILU only, κ = 534.1869 (∆t = 1)
and κ = 5677.5048 (∆t = 5); CPR preconditioned, κ = 20.0608 (∆t = 1) and
κ = 9.25 (∆t = 5).]
[Figure 5.4: Spectra of Jacobian (no preconditioning), BILU(0) and CPR preconditioning
for the countercurrent flow problem (∆t = 20, 100). Each panel plots the eigenvalues in
the complex plane. Condition numbers reported under the panels: no preconditioning,
κ = 34310.1894 (∆t = 20) and κ = 2306857.8683 (∆t = 100); ILU only, κ = 87351.0046
(∆t = 20) and κ = 5176047.3408 (∆t = 100); CPR preconditioned, κ = 349.7405 (∆t = 20)
and κ = 49843.1776 (∆t = 100).]

5.2.4 Numerical examples


To illustrate the importance of ordering on the CPR preconditioner, we provide the
following two numerical examples. The first example shows how the performance
of CPR with the standard ordering can vary significantly depending on the flow
configuration, even on the same problem, whereas CPR with potential ordering is
insensitive to flow configurations. The second example shows the effect of ordering
on a 3D complex flow problem.

Quarter 5-spot problem

For this test problem, we have a two-dimensional reservoir that is discretized on a
20 × 20 grid. Water is injected through a well at one corner of the reservoir, at
a constant rate of 0.005 pore volumes per day; a production well, maintained at
a fixed pressure, is located at the opposite corner (see Figure 5.5). The no-flow
condition is imposed on the remaining sections of the boundary. Quadratic relative
permeabilities are used, with a mobility ratio of M = 10. The simulation is run until
T = 100 days (0.5 pore volumes injected). The cells are numbered in lexicographical
order (i.e., from left to right, then from bottom to top). We solve the same problem
under two different configurations: in the first case (a), the injection and production
wells are located at the lower-left and upper-right corners respectively, so that the
lexicographical ordering coincides with the potential ordering. In the second case (b),
the wells are located at the lower-right and upper-left corners instead, so the natural
ordering is no longer a valid potential ordering. Table 5.2 shows the total iteration
counts over the whole simulation, as well as running time information. (Data for
single-stage BILU(0) are omitted, since there are many Newton steps within which
BILU(0) fails to converge within 1500 iterations.) The two configurations require
exactly the same number of time steps and Newton steps, as expected. However,
the number of GMRES iterations differs significantly between the two cases (a 36%
increase) when CPR-BILU(0) with lexicographical ordering is used. This difference
is insignificant when potential ordering is used, which is expected since the upstream
graphs for the two problems are isomorphic, i.e., the two graphs are the same up to
relabeling. (The small remaining discrepancy in iteration counts is probably due to
the inexact solve in the first stage.) This example provides experimental confirmation
of Theorem 5.6 and illustrates the ability of potential ordering to shield the linear
solver from this type of grid orientation effect.

[Figure 5.5: Two configurations of the quarter 5-spot problem. Panels (a) and (b)
each show the water injection corner (q = q_w), the fixed-pressure corner, and the
no-flow condition on the remaining boundary.]

Upscaled SPE 10 problem

In this example, the reservoir is a 2 × 2 × 2 upscaling of the SPE 10 problem, i.e.,
it is identical to the one used in example 4.3.2 in Chapter 4. It is initially saturated
with oil, and we inject water at the center of the reservoir at a rate of 0.0002 pore
volumes per day (or 29 cell pore volumes per day). A production well, maintained at
constant pressure, is completed at one corner of the reservoir. Once again, we test
the solvers on two configurations:

(a) The production well is located at (1,1,:), so that the major direction of flow is
aligned with the lexicographical ordering;

(b) The production well is located at (110,1,:), so that the major direction of flow
is transverse to the lexicographical ordering.

We run the simulation to T = 100 days (0.02 pore volumes injected). Table 5.3
summarizes the runs for both preconditioners. The numbers of time steps and Newton
steps are again exactly the same in all cases, indicating that the problems are
completely equivalent. Even in configuration (a), the number of GMRES iterations
decreases somewhat when potential ordering is used, because the strong spatial
heterogeneity of the permeability field means that the lexicographical ordering is
no longer a valid potential ordering. When configuration (b) is used instead, we observe
an increase in GMRES iterations when the lexicographical ordering is used, since the
major direction of flow is no longer aligned with this ordering. However, the iteration
count is almost exactly the same when potential ordering is used, once again illustrating
the invariance of potentially-ordered BILU(0) with respect to flow configuration
details.

Table 5.2: Performance of CPR-ILU for the quarter 5-spot problem.

                                 Config. (a)             Config. (b)
                              Natural   Potential     Natural   Potential
                              ordering  ordering      ordering  ordering
No. of time steps                21        21            21        21
No. of Newton steps              80        80            80        80
No. of GMRES iterations         254       254           346       246
No. of AMG V-cycles             286       286           368       274
Total running time (sec)       1.37      1.45          1.49      1.43
− Top. sort (sec)                 0      0.02             0      0.02
− Permutation (sec)               0      0.08             0      0.09
− BILU solve (sec)             0.10      0.14          0.15      0.09
− Pressure solve (sec)         0.32      0.27          0.39      0.33

Table 5.3: Performance of CPR-ILU for the upscaled SPE 10 problem.

                                 Config. (a)             Config. (b)
                              Natural   Potential     Natural   Potential
                              ordering  ordering      ordering  ordering
No. of time steps                37        37            37        37
No. of Newton steps             106       106           106       106
No. of GMRES iterations         389       349           447       351
No. of AMG V-cycles             524       502           570       504
Total running time (sec)     595.99    614.37        630.55    616.21
− Top. sort (sec)                 0      4.49             0      4.57
− Permutation (sec)               0     30.79             0     30.93
− BILU solve (sec)            46.82     42.72         52.76     42.34
− Pressure solve (sec)       229.33    224.60        242.93    225.89

In our current implementation, the savings due to the use of potential ordering
are rather modest, even though the GMRES iteration count decreases substantially.
We believe this is due to our inefficient implementation. Currently, we physically
permute the blocks of the Jacobian matrix into the potential ordering before feeding
it into a library routine that computes the block ILU factorization. This simplifies the
implementation, but adds unnecessary cost to the solver, because one can actually
modify the ILU routine to use the unpermuted data structure, and only change the
order of elimination when the factorization is computed. (This is what modern direct
solvers typically do when a symmetric permutation is required.) Since the permutation
step represents about 5% of the total running time, eliminating this step should
lead to a significant performance improvement. Note that the cost of computing the
topological ordering is insignificant, and it can be shared with other modules. For
instance, if reduced Newton is used as a nonlinear solver, then a topological ordering
would already have been calculated, so there would be no need to compute it
again for the linear solver. If all these efficiency measures are taken, the potential-
ordered CPR-ILU preconditioner should outperform lexicographical ordering in most
practical cases.
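
The topological ordering discussed above can be obtained with a standard linear-time sort. The sketch below uses Kahn's algorithm; the function name `potential_ordering` and the encoding of the upstream graph as a list of (upstream cell, downstream cell) pairs are assumptions made for this illustration and do not reproduce the thesis code. The returned list can serve as an elimination order, so a factorization routine could in principle visit cells in this order without physically permuting the matrix.

```python
from collections import deque


def potential_ordering(num_cells, upstream_edges):
    """Kahn's algorithm: every upstream cell precedes its downstream neighbours.

    upstream_edges is a list of (u, v) pairs meaning that flow goes from
    cell u to cell v (hypothetical encoding of the upstream graph).
    """
    downstream = [[] for _ in range(num_cells)]
    indegree = [0] * num_cells
    for u, v in upstream_edges:
        downstream[u].append(v)
        indegree[v] += 1

    ready = deque(c for c in range(num_cells) if indegree[c] == 0)
    order = []
    while ready:
        u = ready.popleft()
        order.append(u)
        for v in downstream[u]:
            indegree[v] -= 1
            if indegree[v] == 0:
                ready.append(v)

    if len(order) != num_cells:
        raise ValueError("upstream graph contains a cycle")
    return order


# 2 x 2 grid numbered lexicographically (0, 1 on the bottom row; 2, 3 on top),
# with injection at the lower-right cell 1 and production at the upper-left cell 2.
edges_b = [(1, 0), (1, 3), (0, 2), (3, 2)]
order_b = potential_ordering(4, edges_b)
```

For this small analogue of configuration (b), the lexicographical order 0, 1, 2, 3 is not a valid potential ordering, while the computed order starts at the injection cell and ends at the production cell. The cost is proportional to the number of cells plus the number of edges.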

5.3 Schur complement preconditioning

Recall that the first-stage True-IMPES preconditioning corresponds to the following
choice of restriction and prolongation operators in the first stage:

\[ W^T = \begin{bmatrix} I & I \end{bmatrix}, \qquad C = \begin{bmatrix} 0 \\ I \end{bmatrix}. \]

Another reasonable choice of $W^T$ and $C$ would be

\[ W^T = \begin{bmatrix} -J_{ps} J_{ss}^{-1} & I \end{bmatrix}, \qquad C = \begin{bmatrix} 0 \\ I \end{bmatrix}, \]

which leads to the first stage preconditioner being

\[ T_1 = C (W^T J C)^{-1} W^T = \begin{bmatrix} 0 & 0 \\ -S_2^{-1} J_{ps} J_{ss}^{-1} & S_2^{-1} \end{bmatrix}. \]

The overall preconditioned matrix $M^{-1} J$ then becomes

\[ M^{-1} J = \begin{bmatrix} \tilde{S}_1^{-1} (J_{ss} - \tilde{J}_{sp} \tilde{J}_{tp}^{-1} J_{ps}) & 0 \\ 0 & I \end{bmatrix}, \]

meaning that if $M_2$ is exact on saturation (e.g., BILU(0) for cocurrent flow), then the
two-stage preconditioner would be exact. Note that the “pressure matrix” $W^T J C$ in
this case would be

\[ W^T J C = -J_{ps} J_{ss}^{-1} J_{sp} + J_{pp} = S_2, \]

meaning we are actually solving the Schur complement problem with respect to pressure.
$S_2$ is in general a dense matrix; however, as we noted in section 4.1, we can
perform the matrix-vector product $S_2 v$ with exactly the same computational cost as
performing $Jv$, since multiplication with $J_{ss}^{-1}$ is simply a forward substitution when
we exploit potential ordering. It is thus worthwhile to attempt to devise effective
preconditioners for the Schur complement problem. In fact, a good preconditioner
for $S_2$ (i.e., one that converges as quickly as two-stage CPR on the full problem)
would eliminate the need for a two-stage preconditioner on $J$, since one can always
obtain the saturation solution from the pressure solution by a back substitution (i.e.,
a multiplication by $J_{ss}^{-1}$).
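
The matrix-free product $S_2 v = J_{pp} v - J_{ps} J_{ss}^{-1} J_{sp} v$ can be sketched as follows. This is a minimal dense illustration under stated assumptions: the blocks are small lists of rows, $J_{ss}$ is already lower triangular (as it would be under potential ordering), and the names `forward_substitution` and `schur_matvec` are hypothetical helpers rather than routines from the thesis. A practical implementation would use sparse block storage instead.

```python
def forward_substitution(lower, rhs):
    """Solve lower @ x = rhs for a dense lower-triangular matrix."""
    x = []
    for i, row in enumerate(lower):
        partial = sum(row[j] * x[j] for j in range(i))
        x.append((rhs[i] - partial) / row[i])
    return x


def matvec(matrix, vector):
    """Dense matrix-vector product."""
    return [sum(a * v for a, v in zip(row, vector)) for row in matrix]


def schur_matvec(j_ss, j_sp, j_ps, j_pp, v):
    """Return S2 v = J_pp v - J_ps (J_ss^{-1} (J_sp v)) without forming S2."""
    w = forward_substitution(j_ss, matvec(j_sp, v))
    return [a - b for a, b in zip(matvec(j_pp, v), matvec(j_ps, w))]


# Illustrative 2 x 2 blocks (values chosen for easy arithmetic only).
j_ss = [[2, 0], [-1, 2]]
j_sp = [[1, 0], [0, 1]]
j_ps = [[1, 0], [0, 1]]
j_pp = [[4, -1], [-1, 4]]
result = schur_matvec(j_ss, j_sp, j_ps, j_pp, [2, 4])
```

The dense matrix $S_2$ is never assembled: each product costs one multiplication with each sparse block plus one triangular solve.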

The structure of this section is as follows. First, we study the properties of S2 by
examining its spectrum, nonzero pattern and relative magnitudes of its entries. Then
we look at different approximations to the Schur complement and how they behave
as preconditioners for cocurrent and countercurrent cases.

5.3.1 Spectrum and nonzero pattern


We consider a pseudo-1D flow situation, where water is injected from one edge of the
reservoir and pressure is held constant at the other edge. The reservoir is initially
filled with water between the injection edge and the middle of the reservoir, while the
remaining part is filled with oil. Flow is cocurrent (i.e. no gravity is present). For
both the 1D (20 × 1) case and the 2D (20 × 20) case, we show the nonzero pattern
of the full matrix in Figure 5.6. We also show the spectrum, as well as the absolute
value of the entries, for the following time steps:

(a) 0.1 cell pore volumes,

(b) 1 cell pore volume,

(c) 10 cell pore volumes.

The spectral and profile plots are shown in Figures 5.7 (for 1D problems) and 5.8 (for 2D problems). The
profile plots show the magnitudes of the entries along a row of S2 that corresponds
to a gridblock near the center of the reservoir, behind the flood front.
Let us first look at the spectrum of S2 . When the time step is small, the transport
problem contributes little to the spectrum of the Schur complement; the eigenvalue
plot is similar to that of a positive definite elliptic operator. For the medium and large
time step, though, we start to see a rather complicated spectral plot consisting of two
parts: eigenvalues along the positive real axis corresponding to the pressure part, and
complex conjugate pairs that arise from the saturation part of the problem. The plots
for the 2D case are especially revealing, since the complex eigenvalues are roughly in
the shape of a parabola and bear a striking resemblance to the pseudospectra of
convection-diffusion operators [63]. Based on the spectral plots, we conclude that
preconditioners that depend strongly on the matrix being nearly symmetric positive
definite (such as algebraic multigrid) will probably perform poorly on problems with
moderate to large time steps.
As for the nonzero pattern, darker colors in Figure 5.6 indicate larger magnitudes.
In the 1D case, most of the energy (i.e., Frobenius norm) of the matrix lies within the
tridiagonal part; even though the lower triangular part is technically nonzero, most
of the entries outside the tridiagonal region are tiny and can be neglected. Thus,
ILU(0) can potentially be a good preconditioner for the 1D Schur complement. In
contrast, the 2D Jacobian has large entries outside the pentadiagonal region, and the
magnitude of the fill-in entries increases as the time step is increased. Hence, it is
unlikely that a preconditioner with small bandwidth (such as an ILU preconditioner
induced from the partitioned matrix J) would be effective for 2D problems.

[Figure 5.6: Nonzero pattern of S2 for the 1D and 2D reservoirs (panels: 1D flow,
2D flow). Darker colors indicate larger magnitudes.]

[Figure 5.7: Spectrum and nonzero profiles of S2 for the 1D reservoir. Panels:
(a) 0.1 cell pore volumes (∆t = 0.1, κ = 1344.0839); (b) 1 cell pore volume
(∆t = 1, κ = 626.5996); (c) 10 cell pore volumes (∆t = 10, κ = 385.3661).]

[Figure 5.8: Spectrum and nonzero profiles of S2 for the 2D reservoir. Panels:
(a) 0.1 cell pore volumes (∆t = 0.1, κ = 2702.7489); (b) 1 cell pore volume
(∆t = 1, κ = 2044.792); (c) 10 cell pore volumes (∆t = 10, κ = 1786.5725).]

5.3.2 Convergence behavior

Recall the partitioned form of the Jacobian matrix J:

\[ J = \begin{bmatrix} J_{ss} & J_{sp} \\ J_{ps} & J_{pp} \end{bmatrix}. \]

We investigate the convergence behavior of the following preconditioners:

• $M_0 = J_{pp} - \mathrm{Colsum}(J_{ps})\,\mathrm{Colsum}(J_{ss})^{-1} J_{sp}$ (True-IMPES),

• $M_1 = J_{pp} - \mathrm{diag}(J_{ps})\,\mathrm{diag}(J_{ss})^{-1} J_{sp}$ (Quasi-IMPES),

• $M_2 = J_{pp} - J_{ps}\,\mathrm{diag}(J_{ss})^{-1} J_{sp}$,

• $M_3 = J_{pp} - \mathrm{diag}(J_{ps})\,\hat{J}_{ss}^{-1} J_{sp}$,

• $M_4 = J_{pp} - J_{ps}\,\hat{J}_{ss}^{-1} J_{sp}$,

where $\hat{J}_{ss}^{-1}$ is a first-order approximation of $J_{ss}^{-1}$, defined as follows. Suppose we order
$J_{ss}$ so that it is lower triangular. Then $J_{ss} = D - L = (I - L D^{-1}) D$, where D is
diagonal and L is strictly lower triangular. Then

\[
\begin{aligned}
J_{ss}^{-1} &= D^{-1} (I - L D^{-1})^{-1} \\
&= D^{-1} (I + L D^{-1} + \cdots + (L D^{-1})^{N-1}).
\end{aligned}
\]

Then the first order approximation is taken to be $\hat{J}_{ss}^{-1} = D^{-1} (D + L) D^{-1}$.
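
The truncated series above is easy to verify on a small example. The Python sketch below uses exact rational arithmetic and a 3 × 3 lower-triangular stand-in for $J_{ss}$; the function name `neumann_inverse` and the test matrix are assumptions for illustration only. With `order=1` it returns $D^{-1}(I + L D^{-1}) = D^{-1}(D + L)D^{-1}$, and with `order=N-1` it reproduces the exact inverse because $L D^{-1}$ is nilpotent.

```python
from fractions import Fraction


def matmul(a, b):
    """Dense matrix product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def neumann_inverse(j_ss, order):
    """Return D^{-1} * sum_{m=0}^{order} (L D^{-1})^m for J_ss = D - L.

    j_ss must be lower triangular with integer entries and nonzero diagonal.
    """
    n = len(j_ss)
    d_inv = [[Fraction(1, j_ss[i][i]) if i == j else Fraction(0)
              for j in range(n)] for i in range(n)]
    # L is minus the strictly lower part, since J_ss = D - L.
    lower = [[Fraction(-j_ss[i][j]) if j < i else Fraction(0)
              for j in range(n)] for i in range(n)]
    ld = matmul(lower, d_inv)
    term = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    total = [row[:] for row in term]
    for _ in range(order):
        term = matmul(term, ld)
        total = [[total[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return matmul(d_inv, total)


j_ss_example = [[2, 0, 0], [-1, 2, 0], [0, -1, 2]]
first_order = neumann_inverse(j_ss_example, 1)
exact = neumann_inverse(j_ss_example, 2)
```

In this example the first-order approximation matches the exact inverse everywhere except in the entry that couples cell 3 to cell 1, which it sets to zero instead of 1/8. In other words, the approximation keeps only the coupling between immediate upstream neighbours.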
Tables 5.4 and 5.5 illustrate the rate of convergence for each of these precondi-
tioners. The flow setting is the same as the 2D case in the previous section, except we
now show results for both cocurrent and countercurrent flow. Each figure represents
the average number of linear iterations per Newton iteration that GMRES takes to
reduce the linear residual by a factor of $10^{-6}$. The ‘AMG’ column corresponds to
the case where the preconditioner is applied using one cycle of AMG, whereas the
‘exact’ column corresponds to the case where the preconditioner is applied using a
direct method. The time step size ∆t is measured in cell pore volumes injected.
‘DNC’ means the linear iteration does not converge within 100 iterations, and the
approximate linear solution is so poor that it causes Newton’s method to diverge.
For comparison purposes we include results for when the following preconditioners
are used:

1. Induced ILU: single-stage preconditioner induced from the ILU(0) factorization
of the partitioned matrix J;

2. AMG on S2: the full Schur complement is handed to AMG, and one V-cycle is
used per GMRES iteration;

3. CPR on J: two-stage preconditioner applied to the full Jacobian;

4. CPR on S2: two-stage preconditioner induced from the CPR method.

Table 5.4: Convergence of GMRES in the absence of gravity.

                        AMG                     Exact
∆t               1.6    3.1    7.8       1.6    3.1    7.8
M0              10.7   19.3   24.7       4.0    8.3   10.0
M1              10.0   13.3   15.3       3.7    6.0    7.0
M2              11.7   17.7   21.7       4.3    8.3   11.0
M3              11.0   29.0   45.0       4.3    7.3    9.0
M4              21.3   33.3   41.0       3.0    5.0    5.0

Comparison preconditioners (one value per ∆t):
∆t               1.6    3.1    7.8
Induced ILU     33.7   38.3   40.7
AMG on S2        7.7   16.3   14.3
CPR on S2        5.7    8.7   11.3
CPR on J         4.0    6.3    5.3

The induced preconditioners (items (1) and (4)) are defined as follows. If $M_J$ is a
preconditioner for the full matrix J, then the induced preconditioner $M_S$ is defined
as

\[ M_S^{-1} = \begin{bmatrix} 0 & I \end{bmatrix} M_J^{-1} \begin{bmatrix} 0 \\ I \end{bmatrix}, \]

meaning the preconditioning step $z = M_S^{-1} r$ is computed via

\[ z = \begin{bmatrix} 0 & I \end{bmatrix} M_J^{-1} \begin{bmatrix} 0 \\ r \end{bmatrix}. \]

Note that the induced ILU preconditioner can always be applied exactly because if
$M_A = L_A U_A$, then $M_S = L_S U_S$, where

\[ L_S = R^T L_A R, \qquad U_S = R^T U_A R, \qquad R^T = \begin{bmatrix} 0 & I \end{bmatrix} \]

(see Chapter 14 in [68] for a proof).
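
The application of an induced preconditioner amounts to padding, applying the full preconditioner, and restricting. The sketch below is a minimal illustration under stated assumptions: `apply_induced` and `apply_mj_inverse` are hypothetical names, the full preconditioner is passed in as a callable, and the example uses a 2 × 2 matrix with one saturation and one pressure unknown whose inverse is written out by hand.

```python
from fractions import Fraction


def apply_induced(apply_mj_inverse, r, num_saturation):
    """Compute z = [0 I] M_J^{-1} [0; r].

    The pressure residual r is padded with zeros in the saturation slots,
    the full preconditioner is applied, and only the pressure part is kept.
    """
    padded = [0] * num_saturation + list(r)
    return apply_mj_inverse(padded)[num_saturation:]


def apply_mj_inverse(x):
    """Exact inverse of the illustrative matrix M_J = [[2, 1], [1, 4]]."""
    det = Fraction(7)
    return [(4 * x[0] - x[1]) / det, (-x[0] + 2 * x[1]) / det]


z = apply_induced(apply_mj_inverse, [Fraction(7)], num_saturation=1)
```

For this example the Schur complement of $M_J$ with respect to pressure is 4 − 1 · (1/2) · 1 = 7/2, and the induced step returns z = 2 for r = 7, i.e., r divided by 7/2. This agrees with the standard block-inverse identity that the pressure block of $M_J^{-1}$ is the inverse of that Schur complement.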


Table 5.5: Convergence of GMRES in the presence of gravity.

                        AMG                     Exact
∆t               1.6    3.1    7.8       1.6    3.1    7.8
M0              19.5   42.3   63.2       8.5   16.8   22.6
M1              16.8   27.5   86.2       6.8   12.0   32.2
M2              14.2   24.5   55.2       7.3   12.3   22.8
M3              17.5   45.0   >100       7.2   12.3   23.8
M4              44.0    DNC    DNC       6.2   10.0   15.2

Comparison preconditioners (one value per ∆t):
∆t               1.6    3.1    7.8
Induced ILU     35.0   42.0   63.3
AMG on S2       18.0    DNC    DNC
CPR on S2        8.2   11.5   22.6
CPR on J         5.7    6.8    6.6

Based on the convergence data, the following observations can be made:

1. The convergence rates of all the Schur complement methods have a fairly strong
dependence on the time-step size. CPR on the full matrix, on the other hand,
exhibits a convergence behavior that is nearly independent of ∆t, which is
consistent with the spectral plots of Figures 5.1–5.4.

2. The “best” preconditioner depends on the flow situation. While Quasi-IMPES
beats True-IMPES in the cocurrent case, the opposite is true for countercurrent
flow at the largest time step (∆t = 7.8).

3. A more accurate approximation of the Schur complement does not imply faster
convergence when AMG is used. In particular, M4 (which has the most fill
among the M's) and the exact Schur complement S2 both do poorly in the
countercurrent flow case when AMG is used. As expected, AMG has trouble
when the matrix is far from being an elliptic operator. This is in contrast with
the exact preconditioner case, where a more accurate preconditioner usually
requires fewer iterations to converge.

4. Induced ILU performs poorly, since S2 contains a large elliptic component.

5. Keeping off-diagonal blocks, or at least treating them properly, is important for
convergence in the countercurrent flow case, as seen in the faster convergence
of M2 relative to the other preconditioners.

Given our comment about the performance of narrow-band preconditioners, it
is not surprising that M0 through M4 have a hard time competing with the CPR
preconditioner. What is surprising, though, is that the induced preconditioner $M_S$
requires more iterations to converge than $M_{CPR}^{-1}$ on the full system, even though it is
operating on a smaller system. One possible explanation is as follows. A direct (but
tedious) calculation shows that

\[ M_S^{-1} = (I + \tilde{S}_2^{-1} \tilde{J}_{ts} \tilde{J}_{ss}^{-1} J_{sp}) J_{tp}^{-1}, \]

where $\tilde{S}_2 = \tilde{J}_{pp} - \tilde{J}_{ps} \tilde{J}_{ss}^{-1} \tilde{J}_{sp}$ is the Schur complement of the second stage
preconditioner $M_2$ with respect to pressure. If we assume cocurrent flow and that BILU(0)
with potential ordering is used, we would have $\tilde{J}_{ss} = J_{ss}$ and $\tilde{J}_{ps} = J_{ps}$, so the
preconditioned matrix $S_2 M_S^{-1}$ would become

\[
\begin{aligned}
S_2 M_S^{-1} &= (J_{tp} - J_{ts} J_{ss}^{-1} J_{sp} + S_2 \tilde{S}_2^{-1} J_{ts} J_{ss}^{-1} J_{sp}) J_{tp}^{-1} \\
&= I + (S_2 \tilde{S}_2^{-1} - I) J_{ts} J_{ss}^{-1} J_{sp} J_{tp}^{-1}.
\end{aligned}
\]

Once again, the convergence behavior depends on how close $S_2 \tilde{S}_2^{-1}$ is to the identity,
whereas no such term appears in $M_{CPR}^{-1} J$. This could explain why CPR on the Schur
complement $S_2$ scales less well than CPR on the full Jacobian $J$.
Chapter 6

Conclusions

The efficient simulation of immiscible multiphase flow in porous media requires the
use of nonlinear solvers and linear preconditioners that can take advantage of the
underlying structure of the problem, such as flow direction information. The phase-
based potential ordering in Chapter 3 exploits the upstream nature of the spatial
discretization in order to triangularize the saturation part of the nonlinear system of
equations. This ordering is valid for any flow configuration, and it can handle coun-
tercurrent flow due to gravity and capillarity. To compute the ordering, one simply
needs to perform a topological sort on the upstream graph, so the time complexity
scales linearly with the size of the grid. Moreover, this cost can be amortized over
several Newton and time steps, since in practice flow directions reverse only sparingly.
The proposed phase-based potential ordering allows a partial decoupling of the
transport problem from the flow problem, since the saturations can be computed via
back substitution once the pressures are known. This allows us to derive a reduced-
order Newton algorithm, which is the nonlinear analog of a Schur complement ap-
proach in matrix computations. We have proved that for 1D countercurrent flow, the
reduced Newton method converges unconditionally for large ∆t. In addition, a minor
modification to the method (which can be thought of as pivoting) yields provable
convergence for any time-step size. As demonstrated in various examples, reduced
Newton has a much more robust convergence behavior than the usual Newton method,
which translates into the ability to take larger time steps without risking divergence of
the nonlinear iterations. This, in turn, leads to a more efficient and robust simulator
overall.

Ordering techniques can also lead to improvements in the linear solver. For a
cocurrent flow problem, a block ILU(0) factorization always exists provided the cells
are ordered according to the phase potential, and this factorization is unique over all
topological orderings. Moreover, this factorization is exact on the saturation part of
the Jacobian. Since block ILU is used as the second stage of CPR preconditioning,
exactness on saturation means that the pairing with True-IMPES reduction, which
is exact on pressure, is practically ideal. Moreover, its uniqueness over topological
orderings means CPR is much less sensitive to flow configuration variations if potential
ordering is used. Spectral plots and numerical experiments demonstrate the power
of this combination. Finally, experiments reveal that it is difficult to construct a
preconditioner for the pressure Schur complement S2 that rivals two-stage CPR in
performance. This is likely because S2 is a dense matrix that exhibits both advective
and diffusive characters, as indicated by the spectral plots.

A rigorous analysis is performed on the phase-based upstream discretization.
This discretization handles sonic points differently from the classical Godunov and
Engquist-Osher schemes, since the upstream directions are obtained from the poten-
tial gradient of each phase, rather than by manipulating the fractional flow curve
directly. Even though the numerical flux function becomes non-differentiable when-
ever the upstream direction changes, our analysis shows that the nonlinear algebraic
system resulting from a fully-implicit time discretization has a unique bounded solu-
tion for any time-step size, and the resulting solution profiles are always monotonic.
Since the analysis is based on the nonlinear Gauss-Seidel process, it also leads to
an implementable algorithm for solving these nonlinear systems. The convergence
rate is generally linear, but can become superlinear when the correct ordering is
used. In addition, the phase-based upstream scheme satisfies an entropy inequality,
so the method converges under mesh refinement. This is verified experimentally for a
countercurrent flow problem. Finally, when a non-uniform grid is used, the solution
accuracy is often comparable to the uniform-grid case, even though the maximum
CFL number is usually much higher for the non-uniform grid. This reveals the real
advantage of the fully-implicit method over an explicit scheme: the ability to handle
the large CFL numbers that naturally arise from heterogeneity.
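
The upstream rule discussed above can be sketched in a few lines of Python. The sketch assumes a single interface between cells i and l, unit density and viscosity (so relative permeability stands in for mobility), and quadratic relative permeability curves; the function names and all numerical values are hypothetical and serve only to show how each phase selects its own upstream cell from its own potential difference.

```python
def upstream_mobility(phi_i, phi_l, mob_i, mob_l):
    """Return the mobility of the cell that is upstream for this phase."""
    return mob_i if phi_i >= phi_l else mob_l

def phase_flux(trans, phi_i, phi_l, mob_i, mob_l):
    """Flux of one phase from cell i to cell l (positive when leaving i)."""
    return trans * upstream_mobility(phi_i, phi_l, mob_i, mob_l) * (phi_i - phi_l)

def krw(sw):
    """Assumed water relative permeability."""
    return sw ** 2

def kro(sw):
    """Assumed oil relative permeability."""
    return (1.0 - sw) ** 2

# Countercurrent example: water potential is higher in cell i,
# oil potential is higher in cell l.
sw_i, sw_l = 0.8, 0.2
water_flux = phase_flux(1.0, 1.0, 0.5, krw(sw_i), krw(sw_l))
oil_flux = phase_flux(1.0, 0.2, 0.6, kro(sw_i), kro(sw_l))
```

In this example water leaves cell i using the saturation of cell i, while oil enters cell i using the saturation of cell l, so the two phases are upwinded in opposite directions across the same interface.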

Future directions
In this section, we outline several possible future research directions stemming from
our work.

Treatment of strong countercurrent flow due to gravity


In section 4.2.4, we showed that the reduced Newton method is expected to converge
when the countercurrent flow due to gravity satisfies a backward CFL condition. On
the other hand, when the backward CFL number is much larger than 1, it is possible
for reduced Newton to cycle or diverge, especially when the initial guess is poor. In
practical simulations, it is generally too restrictive to require a backward CFL number
that is less than 1 everywhere. This is because the flow in regions far away from wells
can be dominated by gravity segregation, since the total velocity is close to zero there.
Thus, the backward CFL numbers in these regions (which are close to the forward
CFL numbers) determine the convergence behavior of reduced Newton. Ongoing
work focuses on hybridizing reduced Newton with a globally convergent scheme (e.g.,
nonlinear Gauss-Seidel) in such a way that reduced Newton handles regions with low
backward CFL numbers, whereas regions with strong countercurrent flow are to be
handled by the globally convergent scheme.

Extending reduced Newton to compositional models


Compositional simulations are even more expensive than black-oil simulations, es-
pecially when a large number of components are present. It would be beneficial to
extend the ordering and reduction paradigm introduced in Chapters 3 and 4 to a
compositional setting. A natural starting point would be the IMPSAT formulation
[16], in which pressure and saturations are treated implicitly, whereas compositions
(mole fractions of each component) are treated explicitly. Since compositions are not
primary variables, the nonlinear algebraic system contains only pressure and saturations;
in other words, the implicit part looks exactly like a black-oil system, so
we can use reduced Newton without modifications. A possible difficulty is that in
a compositional model, heavy hydrocarbon components are allowed to vaporize into
the gas phase, whereas this is not allowed in the standard black-oil model. This could
complicate the triangularization process, as both the oil and gas equations contain
flow terms from both phases. However, since the amount of heavy components in
the vapor phase is generally small for heavy oils, it may still be possible to trian-
gularize the system by temporarily freezing or linearizing the Sg dependent terms in
the oil equation. More numerical experiments and theory are needed to verify the
effectiveness of this approach.

Extensions of stability analysis


The stability and convergence analyses presented in Chapter 2 are generally applicable
to scalar hyperbolic conservation laws. This is adequate for one-dimensional flow,
since the total velocity there is constant and known. However, for multiple dimen-
sions, our existence and stability results apply only if we temporarily fix the total
velocity field and solve the scalar transport problem on this frozen velocity field.
Thus, our approach does not directly apply to the fully-implicit method, which solves
for both the updated flow field and saturations in a coupled fashion. Our next step
is to extend our analysis to handle this coupling properly. Ongoing work also focuses
on extending the analysis to handle three-phase flow.
Appendix A

Pressure Equation Derivation

Here we derive the pressure equation (1.1.18). Assume no gravity, capillarity or source
terms. Then the phase equations are given by:


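\[
\begin{aligned}
\text{Water:}\quad & \frac{\partial}{\partial t}\left(\phi\rho_w S_w\right) - \nabla\cdot\left(K\lambda_w\rho_w\nabla p\right) = 0, && \text{(A.1)}\\
\text{Oil:}\quad & \frac{\partial}{\partial t}\left(\phi\rho_o S_o\right) - \nabla\cdot\left(K\lambda_o\rho_o\nabla p\right) = 0, && \text{(A.2)}\\
\text{Gas:}\quad & \frac{\partial}{\partial t}\left(\phi\rho_g S_g + \phi\rho_o R_s S_o\right) - \nabla\cdot\left(K\lambda_g\rho_g\nabla p + K\lambda_o\rho_o R_s\nabla p\right) = 0. && \text{(A.3)}
\end{aligned}
\]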

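We assume ρw , ρo , ρg , φ and Rs are all smooth functions of pressure p, and p is
differentiable with respect to t. We multiply the water equation by 1/ρw and expand
the time derivative:

\[
\frac{1}{\rho_w}\left(\phi'\rho_w + \phi\rho_w'\right) S_w \frac{\partial p}{\partial t} + \phi\frac{\partial S_w}{\partial t} - \frac{1}{\rho_w}\nabla\cdot\left(K\lambda_w\rho_w\nabla p\right) = 0. \tag{A.4}
\]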

For the oil equation, we multiply by (ρg − ρo Rs )/(ρo ρg ):


     
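\[
\frac{\rho_g - \rho_o R_s}{\rho_o\rho_g}\left(\phi'\rho_o + \phi\rho_o'\right) S_o \frac{\partial p}{\partial t} + \phi\left(1 - \frac{\rho_o R_s}{\rho_g}\right)\frac{\partial S_o}{\partial t} - \frac{\rho_g - \rho_o R_s}{\rho_o\rho_g}\nabla\cdot\left(K\lambda_o\rho_o\nabla p\right) = 0. \tag{A.5}
\]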

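Finally, we multiply the gas equation by 1/ρg :

\[
\begin{aligned}
&\frac{1}{\rho_g}\Big[\left(\phi'\rho_g + \phi\rho_g'\right) S_g + \left(\phi'\rho_o R_s + \phi\rho_o' R_s + \phi\rho_o R_s'\right) S_o\Big]\frac{\partial p}{\partial t}\\
&\quad + \phi\frac{\partial S_g}{\partial t} + \phi\frac{\rho_o R_s}{\rho_g}\frac{\partial S_o}{\partial t} - \frac{1}{\rho_g}\nabla\cdot\left(K\lambda_g\rho_g\nabla p + K\lambda_o\rho_o R_s\nabla p\right) = 0.
\end{aligned}
\tag{A.6}
\]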

We now add (A.4)–(A.6) together. First, the sum of the saturation derivatives is
   
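\[
\phi\frac{\partial S_w}{\partial t} + \phi\left(1 - \frac{\rho_o R_s}{\rho_g}\right)\frac{\partial S_o}{\partial t} + \phi\frac{\partial S_g}{\partial t} + \phi\frac{\rho_o R_s}{\rho_g}\frac{\partial S_o}{\partial t} = \phi\left(\frac{\partial S_w}{\partial t} + \frac{\partial S_o}{\partial t} + \frac{\partial S_g}{\partial t}\right) = 0,
\]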

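since Sw + So + Sg ≡ 1. Next, the coefficient of ∂p/∂t is given by

\[
\begin{aligned}
&\phi'\left[S_w + S_o\left(1 - \frac{\rho_o R_s}{\rho_g}\right) + S_g + S_o\frac{\rho_o R_s}{\rho_g}\right]\\
&\quad + \phi\left[\frac{\rho_w'}{\rho_w} S_w + \frac{\rho_o'}{\rho_o}\left(1 - \frac{\rho_o R_s}{\rho_g}\right) S_o + \frac{\rho_g'}{\rho_g} S_g + \left(\frac{\rho_o' R_s}{\rho_g} + \frac{\rho_o R_s'}{\rho_g}\right) S_o\right]\\
&= \phi'\left(S_w + S_o + S_g\right) + \phi\left[\frac{\rho_w'}{\rho_w} S_w + \left(\frac{\rho_o'}{\rho_o} + \frac{\rho_o R_s'}{\rho_g}\right) S_o + \frac{\rho_g'}{\rho_g} S_g\right]\\
&= \phi\left(c_r + c_w S_w + c_o S_o + c_g S_g\right) =: \phi c_T,
\end{aligned}
\]

where

\[
c_r = \frac{\phi'}{\phi}, \qquad c_w = \frac{\rho_w'}{\rho_w}, \qquad c_o = \frac{\rho_o'}{\rho_o} + \frac{\rho_o R_s'}{\rho_g}, \qquad c_g = \frac{\rho_g'}{\rho_g}.
\]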
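So the pressure equation is

\[
\phi c_T\frac{\partial p}{\partial t} - \left[\frac{1}{\rho_w}\nabla\cdot\left(K\lambda_w\rho_w\nabla p\right) + \frac{\rho_g - \rho_o R_s}{\rho_o\rho_g}\nabla\cdot\left(K\lambda_o\rho_o\nabla p\right) + \frac{1}{\rho_g}\nabla\cdot\left(K\lambda_g\rho_g\nabla p + K\lambda_o\rho_o R_s\nabla p\right)\right] = 0. \tag{A.7}
\]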

We can simplify (A.7) further by assuming that p is differentiable with respect to the
spatial variable x. We use the identity

∇ · (f v) = ∇f · v + f ∇ · v,
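
where $f : \mathbb{R}^n \to \mathbb{R}$ and $v : \mathbb{R}^n \to \mathbb{R}^n$ are differentiable functions. The water term can
be written as

\[
\begin{aligned}
\frac{1}{\rho_w}\nabla\cdot\left(K\lambda_w\rho_w\nabla p\right) &= \frac{1}{\rho_w}\Big[\left(\rho_w'\nabla p\right)\cdot\left(K\lambda_w\nabla p\right) + \rho_w\nabla\cdot\left(K\lambda_w\nabla p\right)\Big]\\
&= K\lambda_w c_w|\nabla p|^2 + \nabla\cdot\left(K\lambda_w\nabla p\right).
\end{aligned}
\]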

Similarly, the oil term becomes


   
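\[
\begin{aligned}
\frac{\rho_g - \rho_o R_s}{\rho_o\rho_g}\nabla\cdot\left(K\lambda_o\rho_o\nabla p\right) &= \frac{1}{\rho_o}\left(1 - \frac{\rho_o R_s}{\rho_g}\right)\Big[\left(\rho_o'\nabla p\right)\cdot\left(K\lambda_o\nabla p\right) + \rho_o\nabla\cdot\left(K\lambda_o\nabla p\right)\Big]\\
&= \left(1 - \frac{\rho_o R_s}{\rho_g}\right)\left[\frac{\rho_o'}{\rho_o} K\lambda_o|\nabla p|^2 + \nabla\cdot\left(K\lambda_o\nabla p\right)\right].
\end{aligned}
\]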

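Finally, the gas term takes the form

\[
\begin{aligned}
&\frac{1}{\rho_g}\nabla\cdot\left(K\lambda_g\rho_g\nabla p + K\lambda_o\rho_o R_s\nabla p\right)\\
&\quad= \frac{1}{\rho_g}\Big[\left(\rho_g'\nabla p\right)\cdot\left(K\lambda_g\nabla p\right) + \rho_g\nabla\cdot\left(K\lambda_g\nabla p\right)\\
&\qquad\qquad + \left(\rho_o' R_s + \rho_o R_s'\right)\nabla p\cdot\left(K\lambda_o\nabla p\right) + \rho_o R_s\nabla\cdot\left(K\lambda_o\nabla p\right)\Big]\\
&\quad= K\lambda_g c_g|\nabla p|^2 + \nabla\cdot\left(K\lambda_g\nabla p\right) + \frac{\rho_o R_s}{\rho_g}\nabla\cdot\left(K\lambda_o\nabla p\right)\\
&\qquad + \frac{\rho_o R_s}{\rho_g}\frac{\rho_o'}{\rho_o} K\lambda_o|\nabla p|^2 + \frac{\rho_o R_s'}{\rho_g} K\lambda_o|\nabla p|^2.
\end{aligned}
\]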

Substituting the above terms into (A.7) gives

\[
\phi c_T\frac{\partial p}{\partial t} - \nabla\cdot\left(K\lambda_T\nabla p\right) - \chi_T K|\nabla p|^2 = 0, \tag{A.8}
\]

where λT := λw + λo + λg is the total mobility and χT := λw cw + λo co + λg cg is the
mobility-weighted compressibility.
Appendix B

Diagonal Dominance and L1-Accretivity

Here we prove the equivalence between column diagonal dominance and m-accretivity
in the L1-norm for linear maps over $\mathbb{R}^n$. Recall that for the space $L^1(\mathbb{R}^n)$, $A$ is
m-accretive if it is continuous and for any $u, v \in \mathbb{R}^n$,

\[
\sum_{i=1}^{n} \left(A(u)_i - A(v)_i\right)\operatorname{sgn}(u_i - v_i) \ge 0. \tag{B.1}
\]

Theorem B.1. Let $A : \mathbb{R}^n \to \mathbb{R}^n$ be a linear map with matrix $A = [a_{ij}]$. Then $A$ is
m-accretive if and only if $A$ is column diagonally dominant, i.e.,

\[
a_{jj} \ge \sum_{i \ne j} |a_{ij}| \quad \text{for } j = 1, \dots, n. \tag{B.2}
\]

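Proof. Since $A$ is linear, it suffices to show equivalence between condition (B.2) and

\[
\sum_{i=1}^{n} (Au)_i \operatorname{sgn}(u_i) \ge 0 \tag{B.3}
\]

for any $u \in \mathbb{R}^n$. Assume (B.3) holds for any vector $u$. For a given $\varepsilon > 0$, define

\[
u^{(j)} = \left(-\varepsilon\operatorname{sgn}(a_{1j}), \dots, -\varepsilon\operatorname{sgn}(a_{j-1,j}),\, 1,\, -\varepsilon\operatorname{sgn}(a_{j+1,j}), \dots, -\varepsilon\operatorname{sgn}(a_{nj})\right)^T.
\]

Then

\[
Au^{(j)} = A_j + \varepsilon v^{(j)},
\]

where $A_j$ is the $j$-th column of $A$ and $\|v^{(j)}\|_1 \le n\|A\|_1$. Since $\operatorname{sgn}(u_i^{(j)}) = -\operatorname{sgn}(a_{ij})$
for $i \ne j$, we obtain

\[
\sum_{i=1}^{n} \left(Au^{(j)}\right)_i \operatorname{sgn}\left(u_i^{(j)}\right) = a_{jj} - \sum_{i \ne j} |a_{ij}| + \varepsilon\sum_{i=1}^{n} v_i^{(j)}\operatorname{sgn}\left(u_i^{(j)}\right),
\]

which must be non-negative by (B.3). Thus, we have

\[
a_{jj} - \sum_{i \ne j} |a_{ij}| \ge -\varepsilon\sum_{i=1}^{n} v_i^{(j)}\operatorname{sgn}\left(u_i^{(j)}\right) \ge -n\varepsilon\|A\|_1,
\]

which is true for all $j$. Letting $\varepsilon \to 0$ yields column diagonal dominance, as required.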

Conversely, assume $A$ is column diagonally dominant. Then so is $AD$, where $D$ is
a diagonal matrix with $d_{ii} > 0$. Now, for any $u \in \mathbb{R}^n$,

\[
\sum_{i=1}^{n} (Au)_i \operatorname{sgn}(u_i) = \sum_{i=1}^{n}\sum_{j=1}^{n} a_{ij} u_j \operatorname{sgn}(u_i) = \sum_{i=1}^{n}\sum_{j=1}^{n} \left(a_{ij}|u_j|\right)\operatorname{sgn}(u_i)\operatorname{sgn}(u_j). \tag{B.4}
\]

If $u$ has no zero entries, the above is equivalent to evaluating $s^T M s$, where $s$ is a
vector of ±1s, and $M = AU$, $U = \operatorname{diag}(|u_1|, \dots, |u_n|) > 0$, so that $M$ is also column
diagonally dominant. Thus,

\[
\begin{aligned}
s^T M s &= \sum_{j=1}^{n}\sum_{i=1}^{n} m_{ij} s_i s_j\\
&= \sum_{j=1}^{n}\left[m_{jj} s_j^2 + \sum_{i \ne j} m_{ij} s_i s_j\right] \ge \sum_{j=1}^{n}\left[m_{jj} - \sum_{i \ne j} |m_{ij}|\right] \ge 0,
\end{aligned}
\]

so (B.3) holds for $u$. The general case where $u$ has zero entries is similar, except the
double summation will skip over any index i or j for which ui or uj is zero.
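
Theorem B.1 can be illustrated numerically with a short Python sketch. The helper names `is_column_diagonally_dominant` and `accretivity_form` and both example matrices are assumptions made here for illustration. The second matrix violates (B.2) in its first column, and the test vector is built exactly like u^(j) in the proof, with a small ε.

```python
import numpy as np

def is_column_diagonally_dominant(A):
    """Check condition (B.2): a_jj >= sum over i != j of |a_ij| for every column j."""
    A = np.asarray(A, dtype=float)
    off_diagonal = np.abs(A).sum(axis=0) - np.abs(np.diag(A))
    return bool(np.all(np.diag(A) >= off_diagonal))

def accretivity_form(A, u):
    """Evaluate the left-hand side of (B.3): sum_i (Au)_i sgn(u_i)."""
    A = np.asarray(A, dtype=float)
    u = np.asarray(u, dtype=float)
    return float(np.sum((A @ u) * np.sign(u)))

A_good = np.array([[2.0, -1.0], [-1.0, 1.0]])  # satisfies (B.2)
A_bad = np.array([[1.0, 0.0], [-2.0, 1.0]])    # column 1 violates (B.2)

# u^(1) from the proof for A_bad with eps = 1e-3: (1, -eps * sgn(a_21)) = (1, eps)
eps = 1e-3
u_bad = np.array([1.0, -eps * np.sign(A_bad[1, 0])])
```

For `A_good` the form in (B.3) stays non-negative for every vector tried, whereas `accretivity_form(A_bad, u_bad)` is negative, as the first half of the proof predicts.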
Appendix C

Convergence of the Cascade Method

Consider a one-dimensional model problem with

• incompressible flow,

• an injection boundary condition on the left,

• a pressure boundary condition on the right, and

• no countercurrent flow (e.g. horizontal reservoir with no capillarity).

The continuous form of the problem is given by the conservation law

\[
\phi(x)\frac{\partial S_p(x)}{\partial t} + \frac{\partial u_p(x)}{\partial x} = 0, \qquad x_L < x < x_R \tag{C.1}
\]

for p = w, o, with

\[
u_p(x) = -K(x)\, k_{rp}(S_w(x))\, \frac{dp(x)}{dx}
\]

and boundary conditions

\[
u_p(x_L) = q_{p,L}, \qquad p(x_R) = p_R.
\]

Proposition C.1. For the above 1D model problem, the Appleyard-Cheshire Cascade
method [4] converges in two iterations, provided the cells are ordered from upstream
to downstream (left to right). In particular, the saturation of each cell will be correct
at the end of the first iteration, and the pressures will be correct at the end of the
second iteration.

Proof. Under the Cascade (left-to-right) ordering, the discretized equations have the
form

\[
\begin{aligned}
\frac{\phi_i\left(S_{w,i} - S_{w,i}^{old}\right)}{\Delta t} + \frac{1}{\Delta x}\left[K_{li}\, k_{rw}(S_{w,i})\, \frac{p_i - p_{i+1}}{\Delta x} - FI_{w,i}\right] &= 0,\\
\frac{\phi_i\left(S_{w,i}^{old} - S_{w,i}\right)}{\Delta t} + \frac{1}{\Delta x}\left[K_{li}\, k_{ro}(S_{w,i})\, \frac{p_i - p_{i+1}}{\Delta x} - FI_{o,i}\right] &= 0.
\end{aligned}
\tag{C.2}
\]

Let the exact solution be $S_{w,i}^*$ and $p_{w,i}^*$, $i = 1, \dots, N$, and let the initial guess be $S_{w,i}^{(0)}$
and $p_{w,i}^{(0)}$. Consider the first iteration of the Cascade method. In line 3 in Figure 3.1,
the pressures are updated to $p_i^{(1)}$, but this has no impact on convergence in this model
problem. The saturations $S_{w,i}$ are updated inside the loop from lines 4 to 8. We show
by induction that at the ith step of the loop, $S_{w,j}$ and $FO_{p,j}$ are correct for j < i.

For the base case, let i = 1. The single-cell problem becomes

\[
\begin{aligned}
\frac{\phi_1\left(S_{w,1} - S_{w,1}^{old}\right)}{\Delta t} + \frac{1}{\Delta x}\left(K_{12}\, k_{rw}(S_{w,1})\, \frac{p_1 - p_2^{(1)}}{\Delta x} - A q_{w,L}\right) &= 0,\\
\frac{\phi_1\left(S_{w,1}^{old} - S_{w,1}\right)}{\Delta t} + \frac{1}{\Delta x}\left(K_{12}\, k_{ro}(S_{w,1})\, \frac{p_1 - p_2^{(1)}}{\Delta x} - A q_{o,L}\right) &= 0,
\end{aligned}
\tag{C.3}
\]

which we can solve for $S_{w,1}$ and $p_1$. Since the exact solution also solves the single-cell
problem, the uniqueness of solutions tells us that

\[
S_{w,1} = S_{w,1}^* \quad\text{and}\quad p_1 - p_2^{(1)} = p_1^* - p_2^*.
\]

Thus, $S_{w,1}$ is exact, and the outward flux

\[
FO_{p,1} = K_{12}\, k_{rp}(S_{w,1})\left(p_1 - p_2^{(1)}\right)/\Delta x = K_{12}\, k_{rp}(S_{w,1}^*)\left(p_1^* - p_2^*\right)/\Delta x
\]

is exact as well. This proves the base case. For i > 1, note that the outward fluxes for
j = 1, . . . , i − 1 are assumed to be exact. This means the Cascade solution and the
exact solution at cell i both solve the same single-cell problem. Hence, $S_{w,i} = S_{w,i}^*$,
and the outward fluxes will match as well. Thus, the induction step goes through,
and we have $S_{w,i} = S_{w,i}^*$ for all i after one iteration. It follows that during the second
iteration of the Cascade method, in which we solve the linearized problem

\[
J^{(2)}\begin{bmatrix} \delta S^{(2)} \\ \delta p^{(2)} \end{bmatrix} = -r^{(2)}, \tag{C.4}
\]

we get $\delta S^{(2)} = 0$, which means (1) the transmissibility coefficients are exact, and (2)
the fully implicit problem and the IMPES problem have the same pressure solution.
But since the residual function is linear (affine) in pressure, solving (C.4) will yield
the exact pressure, i.e.

\[
p_i^* = p_i^{(1)} + \delta p_i^{(2)}.
\]

So at the end of the second iteration, both the saturations and the pressures are
correct, and the Cascade method converges to the solution.
Appendix D

Nonsingularity of Jss

Proposition D.1. Let the relative permeability functions krw and kro be such that
dkrw /dSw ≥ 0 and ∂kro /∂So ≥ 0. Then Jss = ∂Fs /∂S is nonsingular.

Proof. Since Jss is a lower triangular matrix, it suffices to show that none of its
diagonal entries is zero. A typical oil conservation equation for cell i is

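\[
F_{oi} = \frac{\phi S_{oi}\,\rho_o(p_i)}{\Delta t} + \sum_{l \text{ adjacent to } i} K_{il} H_{o,il}\left(\Phi_{oi} - \Phi_{ol}\right) + F_{cap}, \tag{D.1}
\]

where

\[
H_{o,il} =
\begin{cases}
k_{ro}(S_i)\,\rho_o(p_i)/\mu_o(p_i) & \text{if } \Phi_{oi} \ge \Phi_{ol},\\
k_{ro}(S_l)\,\rho_o(p_l)/\mu_o(p_l) & \text{if } \Phi_{oi} < \Phi_{ol},
\end{cases}
\]

and Fcap denotes capillary forces, which are independent of So . Hence

\[
\frac{\partial F_{oi}}{\partial S_{oi}} = \frac{\phi\rho_o(p_i)}{\Delta t} + \sum_{l \text{ adjacent to } i} K_{il}\frac{\partial H_{o,il}}{\partial S_{oi}}\left(\Phi_{oi} - \Phi_{ol}\right). \tag{D.2}
\]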

The accumulation term φρo (pi )/∆t will always be positive. The sign of the flux term
depends on the upstream direction. If Φoi ≥ Φol , then

\[
\frac{\partial H_{o,il}}{\partial S_{oi}} = \frac{\rho_o(p_i)}{\mu_o(p_i)}\frac{\partial k_{ro}}{\partial S_o}(S_{oi}) \ge 0
\]

by assumption. On the other hand, if Φoi < Φol , then Ho,il is independent of Soi , so
the derivative is zero. Thus, the flux derivative will always be non-negative, which
means ∂Foi /∂Soi > 0 for all cells i. The argument for the water equations is similar.
Thus, Jss has a positive diagonal, so it is nonsingular.

Under certain mild conditions (to be specified below), the Stone I and II models
(cf. [6]) can be shown to satisfy ∂kro /∂So ≥ 0, as required by Proposition D.1. Note
that we are only concerned with saturations inside the region

D = {(Sw , So , Sg ) | Sw ≥ Swc , So ≥ Som , Sg ≥ 0, Sw + So + Sg = 1},

where Swc is the connate water saturation and Som is the minimum oil saturation at
which oil is simultaneously displaced by water and gas. Also note that the derivative
∂kro /∂So is taken along the line Sw = constant, so by the relation Sw + So + Sg = 1,
the criterion ∂kro /∂So ≥ 0 is equivalent to ∂kro /∂Sg ≤ 0, which turns out to be more
natural to show.

Proposition D.2. Assume dkrog /dSg ≤ 0. Then for saturations in D, the Stone I
model satisfies ∂kro /∂Sg ≤ 0 provided ∂Som /∂Sg ≥ −1/2.

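Proof. The Stone I model is defined as $k_{ro}(S_w, S_g) = k_{rocw} S_o^* \beta_w \beta_g$, where

\[
\beta_w = \frac{k_{row}(S_w)/k_{rocw}}{1 - S_w^*}, \qquad \beta_g = \frac{k_{rog}(S_g)/k_{rocw}}{1 - S_g^*}, \qquad k_{rocw} = k_{row}\big|_{S_w = S_{wc}},
\]

and the normalized saturations are defined as

\[
S_w^* = \frac{S_w - S_{wc}}{1 - S_{wc} - S_{om}}, \qquad S_o^* = \frac{S_o - S_{om}}{1 - S_{wc} - S_{om}}, \qquad S_g^* = \frac{S_g}{1 - S_{wc} - S_{om}}.
\]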

Combining all these relations, we see that kro = U (Sw , Sg , Som )/V (Sw , Sg , Som ), where

U = (1 − Sw − Som − Sg )(1 − Swc − Som )krow (Sw )krog (Sg ),


V = (1 − Swc − Som − Sg )(1 − Sw − Som ).
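
Remembering that Som = Som (Sw , Sg ), we deduce that

\[
\begin{aligned}
\frac{\partial k_{ro}}{\partial S_g} &= \frac{1}{V^2}\left[\left(V\frac{\partial U}{\partial S_g} - U\frac{\partial V}{\partial S_g}\right) + \frac{\partial S_{om}}{\partial S_g}\left(V\frac{\partial U}{\partial S_{om}} - U\frac{\partial V}{\partial S_{om}}\right)\right]\\
&= \frac{1}{V^2}\left[R_1 + R_2\cdot\frac{\partial S_{om}}{\partial S_g}\right],
\end{aligned}
\]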

so the sign of ∂kro /∂Sg is determined by the quantity within the square brackets.
After some manipulation, we get

\[
\begin{aligned}
R_1 &= -(S_w - S_{wc})(1 - S_{wc} - S_{om})(1 - S_w - S_{om})\, k_{row} k_{rog}\\
&\quad + (1 - S_{wc} - S_{om} - S_g)(1 - S_w - S_{om})(1 - S_{wc} - S_{om})(1 - S_w - S_{om} - S_g)\, k_{row} k_{rog}'\\
&\le -(S_w - S_{wc})(1 - S_{wc} - S_{om})(1 - S_w - S_{om})\, k_{row} k_{rog}\\
&\le 0,
\end{aligned}
\]

since $k_{rog}' \le 0$. In addition, we get

\[
R_2 = -S_g (S_w - S_{wc})\Big[(1 - S_w - S_{om} - S_g) + (1 - S_{wc} - S_{om})\Big] k_{row} k_{rog} \le 0.
\]

Hence $R_1 + R_2\cdot\partial S_{om}/\partial S_g \le 0$ if either ∂Som /∂Sg ≥ 0 or

\[
\left|\frac{\partial S_{om}}{\partial S_g}\right| \le \frac{(S_w - S_{wc})(1 - S_{wc} - S_{om})(1 - S_w - S_{om})}{S_g (S_w - S_{wc})\Big[(1 - S_w - S_{om} - S_g) + (1 - S_{wc} - S_{om})\Big]}. \tag{D.3}
\]

But since Sg ≤ 1 − Sw − Som and 1 − Sw − Som − Sg ≤ 1 − Swc − Som , we see that

\[
\frac{(S_w - S_{wc})(1 - S_{wc} - S_{om})(1 - S_w - S_{om})}{S_g (S_w - S_{wc})\Big[(1 - S_w - S_{om} - S_g) + (1 - S_{wc} - S_{om})\Big]} \ge \frac{1}{2}.
\]

Thus, in order to ensure that ∂kro /∂Sg ≤ 0, it is sufficient to require either ∂Som /∂Sg ≥ 0
or |∂Som /∂Sg | ≤ 1/2, which is equivalent to requiring ∂Som /∂Sg ≥ −1/2.

Note that if the Fayers and Matthews [35] model for Som is used, we would have

\[
\frac{\partial S_{om}}{\partial S_g} = -\frac{S_{orw} - S_{org}}{1 - S_{wc} - S_{org}},
\]

so the condition in Proposition D.2 would be satisfied as long as Sorw − Sorg is small,
which is usually the case. In particular, the monotonicity condition is always satisfied
whenever Sorw = Sorg .

Proposition D.3. Assume that dkrg /dSg ≥ 0, dkrog /dSg ≤ 0, and that krw and krow
are convex functions of Sw . Then for all saturations in D, the Stone II model satisfies
∂kro /∂Sg ≤ 0.

Proof. The Stone II model is defined as

\[
k_{ro}(S_w, S_g) = k_{rocw}\left[\left(\frac{k_{row}}{k_{rocw}} + k_{rw}\right)\left(\frac{k_{rog}}{k_{rocw}} + k_{rg}\right) - \left(k_{rw} + k_{rg}\right)\right].
\]

Differentiating with respect to Sg gives

\[
\frac{\partial k_{ro}}{\partial S_g} = k_{rocw}\left[\left(\frac{k_{row}}{k_{rocw}} + k_{rw} - 1\right) k_{rg}' + \frac{k_{rog}'}{k_{rocw}}\left(\frac{k_{row}}{k_{rocw}} + k_{rw}\right)\right].
\]

The second term is clearly non-positive because $k_{rog}' \le 0$ and $k_{row}/k_{rocw} + k_{rw} \ge 0$. To show
that the first term is also non-positive, first note that $k_{rg}' \ge 0$. Next, define g(Sw ) = krw + krow /krocw .
Then g(Swc ) = g(1 − Sorw ) = 1. But since g is convex, it must be that g(Sw ) ≤ 1 for
all Swc ≤ Sw ≤ 1−Sorw . So g(Sw )−1 ≤ 0, which implies the first term is non-positive
as well. Hence, we have shown that ∂kro /∂Sg ≤ 0, as required.
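
Proposition D.3 can be checked numerically with a small Python sketch. The endpoint saturations, the value of krocw, and the quadratic two-phase curves below are hypothetical choices that satisfy the hypotheses of the proposition (krg non-decreasing, krog non-increasing, krw and krow convex with g(Swc) = g(1 − Sorw) = 1); they are not taken from any data set in this work. The sketch evaluates the Stone II formula and reports the largest forward difference of kro in the Sg direction over a grid of saturations.

```python
# Hypothetical endpoint values chosen only for illustration.
SWC, SORW, SORG, KROCW = 0.2, 0.2, 0.1, 0.8

def water_oil_curves(sw):
    """Assumed convex two-phase curves; returns (krw, krow)."""
    s = (sw - SWC) / (1.0 - SWC - SORW)
    return s ** 2, KROCW * (1.0 - s) ** 2

def gas_oil_curves(sg):
    """Assumed two-phase curves; returns (krg, krog), valid for sg <= 1 - SWC - SORG."""
    s = sg / (1.0 - SWC - SORG)
    return s ** 2, KROCW * (1.0 - s) ** 2

def stone2_kro(sw, sg):
    """Stone II three-phase oil relative permeability."""
    krw, krow = water_oil_curves(sw)
    krg, krog = gas_oil_curves(sg)
    return KROCW * ((krow / KROCW + krw) * (krog / KROCW + krg) - (krw + krg))

def max_forward_difference(n=25, h=1e-4):
    """Largest value of kro(sw, sg + h) - kro(sw, sg) over a saturation grid."""
    worst = -float("inf")
    for i in range(n + 1):
        sw = SWC + (1.0 - SORW - SWC) * i / n
        sg_max = min(1.0 - sw, 1.0 - SWC - SORG) - h
        for j in range(n + 1):
            sg = sg_max * j / n
            worst = max(worst, stone2_kro(sw, sg + h) - stone2_kro(sw, sg))
    return worst
```

Under these assumed curves the largest forward difference is non-positive up to rounding, consistent with ∂kro/∂Sg ≤ 0. Note that the Stone II value itself may become negative at high gas saturation; the monotonicity statement is unaffected by this.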
Appendix E

Properties of Pressure Matrices

This appendix deals with the spectral properties of various combinations of the pres-
sure matrices Jsp , Jpp and Jtp . These properties are useful in evaluating the relative
importance of various terms that appear in the preconditioned matrices in Chapter
5.
Let G = (V, E) be a connected undirected graph with nodes V and edges E.
Suppose the nodes V can be partitioned into $V = V^{int} \cup V^{bdy}$, where $V^{bdy} \ne \emptyset$. (For
our purposes, $V^{int}$ consists of the control volumes in the domain; an edge in E is
either an interface separating two cells, or the face of a boundary cell that is subject
to a pressure boundary condition; $V^{bdy}$ consists of “ghost cells” outside the domain
that are used by the finite volume method to deal with pressure boundary conditions.)
Suppose there exists a function σ : E → [0, ∞) that assigns a non-negative weight
(transmissibility) to each edge in E, and let $\sigma_{ij}$ denote the weight assigned to edge
(i, j). Then we can define a $|V^{int}| \times |V^{int}|$ matrix $M^\sigma$ by

\[
M^\sigma_{ij} =
\begin{cases}
\sum_{(i,l) \in E} \sigma_{il} & i = j,\\
-\sigma_{ij} & i \ne j,\ (i, j) \in E,\\
0 & i \ne j,\ (i, j) \notin E.
\end{cases}
\tag{E.1}
\]

Then $M^\sigma$ is a symmetric M-matrix, and by the Gershgorin circle theorem its eigenvalues are
non-negative, so that $M^\sigma$ is positive semi-definite. If in addition σ > 0 then $M^\sigma$ is
irreducible, so by the Perron-Frobenius theorem it is also nonsingular, i.e. symmetric
positive definite. Moreover, for any constant c > 0 we have $M^{c\sigma} = cM^\sigma$.

Given two weight functions σ and τ we say σ ≤ τ if $\sigma_{ij} \le \tau_{ij}$ for all edges (i, j).
The following lemma is a slight modification of a theorem by Ostrowski and Reich
(cf. [78]).

Lemma E.1. Let $A = M - N$, where $A = A^*$, $A$ and $M$ are both nonsingular, and
define $Q = M + M^* - A$. If $A$ is positive definite and $Q$ is positive semi-definite, then
$\rho(M^{-1}N) \le 1$, where ρ(·) is the spectral radius. In addition, if $Q$ is positive definite,
then $\rho(M^{-1}N) < 1$.
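Proof. Define $B = M^{-1}N = I - M^{-1}A$. It follows that if $Bu = \lambda u$, $u \ne 0$, then

\[
Au = (1 - \lambda)Mu,
\]

where $\lambda \ne 1$ since $A$ is nonsingular. Taking the inner product of both sides with $u$
yields

\[
u^*Au = (1 - \lambda)\,u^*Mu,
\]

but since $A$ is symmetric positive definite, we also have

\[
u^*Au = (1 - \bar{\lambda})\,u^*M^*u.
\]

Dividing these relations by $1 - \lambda$ and $1 - \bar{\lambda}$ respectively and adding yields

\[
u^*(M + M^*)u = \left(\frac{1}{1 - \lambda} + \frac{1}{1 - \bar{\lambda}}\right) u^*Au = 2\,\operatorname{Re}\left(\frac{1}{1 - \lambda}\right) u^*Au,
\]

which can be rewritten as

\[
\frac{u^*(Q + A)u}{u^*Au} = 1 + \frac{u^*Qu}{u^*Au} = 2\,\operatorname{Re}\left(\frac{1}{1 - \lambda}\right).
\]

Since $A$ is positive definite and $Q$ is positive semi-definite, we must have $2\,\operatorname{Re}\left(\frac{1}{1 - \lambda}\right) \ge 1$,
with strict inequality if $Q$ is positive definite. If we write $\lambda = \alpha + i\beta$, it follows that

\[
\frac{2(1 - \alpha)}{(1 - \alpha)^2 + \beta^2} \ge 1,
\]

which yields $\alpha^2 + \beta^2 = |\lambda|^2 \le 1$ (again with strict inequality if $Q$ is positive definite).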
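
A familiar instance of Lemma E.1 is the Gauss-Seidel splitting of a symmetric positive definite matrix, for which Q reduces to the diagonal of A. The Python sketch below checks this on a small one-dimensional Laplacian; the matrix, its size, and the helper names `laplacian_1d` and `iteration_spectral_radius` are assumptions made here for illustration only.

```python
import numpy as np

def laplacian_1d(n):
    """Tridiagonal matrix with 2 on the diagonal and -1 on the off-diagonals."""
    return 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

def iteration_spectral_radius(A, M):
    """Spectral radius of M^{-1} N for the splitting A = M - N."""
    N = M - A
    B = np.linalg.solve(M, N)
    return float(np.max(np.abs(np.linalg.eigvals(B))))

A = laplacian_1d(6)
M = np.tril(A)        # Gauss-Seidel: lower triangle of A, diagonal included
Q = M + M.T - A       # reduces to diag(A), which is positive definite
rho = iteration_spectral_radius(A, M)
```

Since Q is positive definite here, the lemma gives the strict bound ρ(M⁻¹N) < 1, which the computed `rho` satisfies.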

Corollary E.2. Let σ and τ be weight functions on the edges E. If τ > 0 and
0 ≤ σ ≤ τ , then $\rho((M^\tau)^{-1}M^\sigma) \le 1$.

Proof. Let $M = M^\tau$ and $N = -M^\sigma$ in Lemma E.1. Then $A = M^\tau + M^\sigma$ and
$Q = M^\tau - M^\sigma$, which correspond to matrices with weights τ + σ > 0 and τ − σ ≥ 0,
so that $A$ is symmetric positive definite and $Q$ is symmetric positive semi-definite.
Thus, we have $\rho((M^\tau)^{-1}M^\sigma) \le 1$ by Lemma E.1, as required.

The above corollary immediately implies $\rho(J_{sp}J_{tp}^{-1}) \le 1$ and $\rho(J_{pp}J_{tp}^{-1}) \le 1$, since
λw , λo are both bounded above by λT . The corollary also leads to a bound on the
condition number of $M^\sigma$:

Theorem E.3. Let σ and τ be weight functions on the edges E. If there exist constants
0 < b ≤ B such that 0 < bτ ≤ σ ≤ Bτ , then

\[
\kappa_2(M^\sigma) \le \frac{B}{b}\,\kappa_2(M^\tau),
\]

where $\kappa_2(A) = \|A\|_2\|A^{-1}\|_2$ is the 2-norm condition number.


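Proof. Let $M^\tau = R^2$, where $R$ is the symmetric square root of $M^\tau$. In other words,
$R = U\Lambda^{1/2}U^T$, where $M^\tau = U\Lambda U^T$ is the spectral decomposition of the symmetric
positive definite matrix $M^\tau$. Then by the above corollary we must have

\[
\begin{aligned}
b\,\rho\left(R(M^\sigma)^{-1}R\right) &= b\,\rho\left((M^\sigma)^{-1}R^2\right) = \rho\left((M^\sigma)^{-1}(bM^\tau)\right) \le 1,\\
\frac{1}{B}\,\rho\left(R^{-1}M^\sigma R^{-1}\right) &= \frac{1}{B}\,\rho\left(R^{-2}M^\sigma\right) = \rho\left((BM^\tau)^{-1}M^\sigma\right) \le 1.
\end{aligned}
\]

But since $\rho(\cdot) = \|\cdot\|_2$ for symmetric matrices, this implies

\[
\left\|R(M^\sigma)^{-1}R\right\|_2 \left\|R^{-1}M^\sigma R^{-1}\right\|_2 \le B/b,
\]

so that

\[
\frac{\|(M^\sigma)^{-1}\|_2}{\|R^{-1}\|_2^2}\cdot\frac{\|M^\sigma\|_2}{\|R\|_2^2} \le \frac{B}{b}.
\]

Finally, by the symmetry of $R$ we have

\[
\|R\|_2^2 = \rho^2(R) = \rho(R^2) = \rho(M^\tau) = \|M^\tau\|_2,
\]

and similarly $\|R^{-1}\|_2^2 = \|(M^\tau)^{-1}\|_2$, so we must have

\[
\|M^\sigma\|_2\,\|(M^\sigma)^{-1}\|_2 \le \frac{B}{b}\,\|M^\tau\|_2\,\|(M^\tau)^{-1}\|_2,
\]

as required.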
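
Corollary E.2 and Theorem E.3 can be checked on a small example. The Python sketch below assumes the simplest possible graph, a one-dimensional chain of interior cells with one ghost cell at each end, and random edge weights; the helper names `chain_matrix`, `spectral_radius` and `check_bounds` and the weight ranges are hypothetical choices made for illustration.

```python
import numpy as np

def chain_matrix(weights):
    """Assemble the matrix of (E.1) for a 1D chain of n interior cells.

    weights has n + 1 entries: weights[0] and weights[n] belong to the two
    boundary edges (ghost cells), the rest to interior interfaces.
    """
    w = np.asarray(weights, dtype=float)
    n = len(w) - 1
    M = np.zeros((n, n))
    for i in range(n):
        M[i, i] = w[i] + w[i + 1]          # sum of weights of incident edges
        if i + 1 < n:
            M[i, i + 1] = M[i + 1, i] = -w[i + 1]
    return M

def spectral_radius(X):
    """Largest eigenvalue modulus of X."""
    return float(np.max(np.abs(np.linalg.eigvals(X))))

def check_bounds(tau, sigma, b, B):
    """Return rho((M^tau)^{-1} M^sigma), kappa_2(M^sigma) and (B/b) kappa_2(M^tau)."""
    M_tau, M_sigma = chain_matrix(tau), chain_matrix(sigma)
    rho = spectral_radius(np.linalg.solve(M_tau, M_sigma))
    kappa_sigma = float(np.linalg.cond(M_sigma, 2))
    kappa_tau = float(np.linalg.cond(M_tau, 2))
    return rho, kappa_sigma, (B / b) * kappa_tau
```

For instance, drawing τ uniformly from [0.5, 2] and setting σ = rτ with r in [0.25, 1] on each edge gives bτ ≤ σ ≤ Bτ with b = 0.25 and B = 1, so both the spectral radius bound of the corollary and the condition number bound of the theorem apply.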

The Laplacian of a graph G (cf. [68]), denoted by L(G), is the matrix $M^\tau$ when
τ ≡ 1. Theorem E.3 can yield useful bounds for $J_{tp}$ when $\kappa_2(L(G))$ is known. For a
Cartesian grid, it is well known that $\kappa_2(L(G)) = O(1/h^2)$; as a result, $\kappa_2(J_{tp})$ is also
$O(1/h^2)$, provided the absolute permeability K(x) and total mobility λT (S) satisfy

0 < kmin ≤ K(x) ≤ kmax ,


0 < λT,min ≤ λT (S) ≤ λT,max .
Bibliography

[1] J. E. Aarnes. On the use of a mixed multiscale finite element method for greater
flexibility and increased speed or improved accuracy in reservoir simulation. Mul-
tiscale Model. Simul., 2:421–439, 2004.

[2] I. Aavatsmark. An introduction to multipoint flux approximations for quadrilateral
grids. Computat. Geosci., 6:405–432, 2002.

[3] J. R. Appleyard and I. M. Cheshire. Nested factorization. SPE paper 12264,


presented at the SPE Symposium on Reservoir Simulation in San Francisco, CA,
1983.

[4] J. R. Appleyard and I. M. Cheshire. The cascade method for accelerated conver-
gence in implicit simulators. In European Petroleum Conference, pp. 113–122,
1982.

[5] J. R. Appleyard, I. M. Cheshire, and R. K. Pollard. Special techniques for fully


implicit simulators. In Proc. European Symposium on Enhanced Oil Recovery,
pp. 395–408, Bournemouth, England, 1981.

[6] K. Aziz and A. Settari. Petroleum Reservoir Simulation. Applied Science Pub-
lishers, New York, 1979.

[7] Z.-Z. Bai, G. H. Golub, and M. K. Ng. Hermitian and skew-Hermitian splitting
methods for non-Hermitian positive definite linear systems. SIAM J. Matrix
Anal. Appl., 24(3):603–626, 2002.


[8] A. Behie. Comparison of nested factorization, constrained pressure residual, and


incomplete factorization preconditionings. SPE paper 13531, presented at the
SPE Reservoir Simulation Symposium in Dallas, TX, 1985.

[9] J. B. Bell, C. N. Dawson, and G. R. Shubin. An unsplit, higher order Godunov


method for scalar conservation laws in multiple dimensions. J. Comput. Phys.,
74:1–24, 1988.

[10] M. Benzi, D. B. Szyld, and A. van Duin. Orderings for incomplete factorization
preconditioning of nonsymmetric problems. SIAM J. Sci. Comput., 20:1652–
1670, 1999.

[11] M. Blunt and B. Rubin. Implicit flux limiting schemes for petroleum reservoir
simulation. J. Comput. Phys., 102(1):194–210, 1992.

[12] C. Bolley and M. Crouzeix. Conservation de la positivité lors de la discrétisation


des problèmes d’évolution paraboliques. RAIRO Anal. Numer., 12(3):237–245,
1978.

[13] Y. Brenier and J. Jaffré. Upstream differencing for multiphase flow in reservoir
simulation. SIAM J. Numer. Anal., 28(3):685–696, 1991.

[14] R. P. Brent. Algorithms for Minimization Without Derivatives, chapter 3–4.


Prentice-Hall, Englewood Cliffs, NJ, 1973.

[15] R. Bridson and C. Greif. A multipreconditioned conjugate gradient algorithm.


SIAM J. Matrix Anal. Appl., 27:1056–1068, 2006.

[16] H. Cao. Development of Techniques for General Purpose Simulators. PhD thesis,
Stanford University, Stanford, CA, June 2002.

[17] H. Cao, H. A. Tchelepi, J. Wallis, and H. Yardumian. Parallel scalable unstructured
CPR-type linear solver for reservoir simulation. SPE Paper 96809,
presented at the SPE Annual Technical Conference and Exhibition in Dallas,
TX, 2005.

[18] W. H. Chen, L. J. Durlofsky, B. Engquist, and S. Osher. Minimization of grid


orientation effects through the use of higher order finite difference methods. SPE
Advanced Technology Series, 1(2):43–52, 1991.

[19] M. A. Christie and M. J. Blunt. Tenth SPE comparative solution project: A


comparison of upscaling techniques. SPE Reservoir Eval. Eng., 4(4):308–317,
2001.

[20] K. H. Coats. A note on IMPES and some IMPES-based simulation models. SPE
J., 5(3):245–251, Sept. 2000.

[21] K. H. Coats, W. D. George, and B. E. Marcum. Three-dimensional simulation


of steamflooding. Trans. SPE of AIME, 257:573–592, 1974.

[22] T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. D. Stein. Introduction to


Algorithms. MIT Press, Cambridge, MA, 2nd edition, 2001.

[23] M. G. Crandall and T. M. Liggett. Generation of semi-groups of nonlinear


transformations on general Banach spaces. Amer. J. Math., 93:265–298, 1971.

[24] M. G. Crandall and A. Majda. Monotone difference approximations for scalar


conservation laws. Math Comp., 34:1–21, 1980.

[25] E. F. D’Azevedo, P. A. Forsyth, and W.-P. Tang. Ordering methods for pre-
conditioned conjugate gradient methods applied to unstructured grid problems.
SIAM J. Matrix Anal. Appl., 13(3):944–961, 1992.

[26] R. de Loubens. Construction of high-order adaptive implicit methods for reser-


voir simulation. Master’s thesis, Stanford University, June 2007.

[27] K. Deimling. Nonlinear Functional Analysis. Springer-Verlag, 1985.

[28] J. E. Dennis, Jr., J. M. Martínez, and X. Zhang. Triangular decomposition methods
for solving reducible nonlinear systems of equations. SIAM J. Optimization,
4:358–382, 1994.

[29] P. Deuflhard. Newton Methods for Nonlinear Problems: Affine Invariance and
Adaptive Algorithms. Springer-Verlag, Berlin, 2004.

[30] I. S. Duff and G. A. Meurant. The effect of ordering on preconditioned conjugate


gradients. BIT, 29:635–657, 1989.

[31] L. J. Durlofsky and M. C. H. Chien. Development of a mixed finite-element-based


compositional reservoir simulator. SPE Paper 25253, presented at the 12th SPE
Symposium on Reservoir Simulation in New Orleans, LA, 1993.

[32] L. C. Evans. Partial Differential Equations. American Mathematical Society,


1998.

[33] S. Evje and K. H. Karlsen. Degenerate convection-diffusion equations and implicit
monotone difference schemes. In M. Fey and R. Jeltsch, editors, Hyperbolic
problems: Theory, Numerics, Applications, volume 129, pp. 285–294. Birkhäuser
Verlag, 1999.

[34] R. E. Ewing. Simulation of multiphase flows in porous media. Transport Porous


Med., 6:479–499, 1991.

[35] F. J. Fayers and J. D. Matthews. Evaluation of normalized Stone’s methods for


estimating three-phase relative permeabilities. SPE J., 24:224–232, 1984.

[36] P. A. Forsyth and P. H. Sammon. Practical considerations for adaptive implicit


methods in reservoir simulation. J. Comput. Phys., 62:265–281, 1986.

[37] Geoquest. Eclipse Technical Description 2005A. Schlumberger, 2005.

[38] A. George and J. W. Liu. Computer Solution of Large Sparse Positive Definite
Systems. Prentice-Hall, Englewood Cliffs, NJ, 1981.

[39] G. H. Golub and C. F. Van Loan. Matrix Computations. Johns Hopkins University
Press, 3rd edition, 1996.

[40] A. Greenbaum, V. Pták, and Z. Strakoš. Any nonincreasing convergence curve


is possible for GMRES. SIAM J. Matrix Anal. Appl., 17:465–469, 1996.

[41] A. Harten. High resolution schemes for hyperbolic conservation laws. J. Comput.
Phys., 135:260–278, 1997.

[42] M. Honarpour, L. F. Koederitz, and H. A. Harvey. Empirical equations for estimating
two-phase relative permeability in consolidated rock. J. Petrol. Technol.,
34:2905–2908, 1982.

[43] T. Y. Hou and X. H. Wu. A multiscale finite element method for elliptic problems
in composite materials and porous media. J. Comput. Phys., 134:169–189, 1997.

[44] Y. Jiang. Tracer flow modeling and efficient solvers for GPRS. Master’s thesis,
Stanford University, June 2004.

[45] S. N. Kružkov. First order quasilinear equations in several independent variables.


Math. USSR Sbornik, 10(2):217–243, 1970.

[46] P. D. Lax and B. Wendroff. Systems of conservation laws. Comm. Pure Appl.
Math, 13:217–237, 1960.

[47] S. H. Lee, L. J. Durlofsky, M. F. Lough, and W. H. Chen. Finite difference


simulation of geologically complex reservoirs with tensor permeabilities. SPE
Reserv. Eval. Eng., 1(6):567–574, 1998.

[48] S. H. Lee, H. A. Tchelepi, P. Jenny, and L. J. DeChant. Implementation of a


flux-continuous finite-difference method for stratigraphic, hexahedron grids. SPE
J., 7(3):267–277, 2002.

[49] R. J. LeVeque. Numerical Methods for Conservation Laws. Birkhäuser, 2nd


edition, 1992.

[50] B. J. Lucier. On nonlocal monotone difference schemes for scalar conservation


laws. Math. Comp., 47(175):19–36, 1986.

[51] R. C. MacDonald and K. H. Coats. Methods for numerical simulation of water


and gas coning. Trans. SPE of AIME, 249:425–436, 1970.

[52] B. T. Mallison. Streamline-based Simulation of Two-phase, Multicomponent Flow


in Porous Media. PhD thesis, Stanford University, 2004.

[53] S. F. Matringe, R. Juanes, and H. A. Tchelepi. Mixed-finite-element and related-control-volume
discretizations for reservoir simulation on three-dimensional unstructured
grids. SPE Paper 106117, presented at the SPE Reservoir Simulation
Symposium in Houston, TX, 2007.

[54] A. Meister and C. Vömel. Efficient preconditioning of linear systems arising from
the discretization of hyperbolic conservation laws. Adv. Comp. Math., 14:49–73,
2001.

[55] N. M. Nachtigal, S. C. Reddy, and L. N. Trefethen. How fast are nonsymmetric


matrix iterations? SIAM J. Matrix Anal. Appl., 13:778–795, 1992.

[56] J. R. Natvig, K.-A. Lie, and B. Eikemo. Fast solvers for flow in porous media
based on discontinuous Galerkin methods and optimal reordering. In Computa-
tional Methods in Water Resources XVI, 2006.

[57] A. S. Odeh. Comparison of solutions to a three-dimensional black-oil reservoir


simulation problem. J. Petrol Technol., 33(1):13–25, 1981.

[58] F. M. Orr, Jr. Theory of Gas Injection Processes. Tie-Line Publications, 2007.

[59] J. M. Ortega and W. Rheinboldt. Iterative Solution of Nonlinear Equations in


Several Variables. Academic Press, 1970.

[60] S. Osher. Riemann solvers, the entropy condition, and difference approximations.
SIAM J. Numer. Anal., 21:217–235, 1984.

[61] D. W. Peaceman. A nonlinear stability analysis for difference equations using
semi-implicit mobility. SPE J., 17:79–91, 1977.

[62] H. S. Price and K. H. Coats. Direct methods in reservoir simulation. Trans. SPE
of AIME, 257:295–308, 1974.

[63] S. C. Reddy and L. N. Trefethen. Pseudospectra of the convection-diffusion
operator. SIAM J. Appl. Math., 54:1634–1649, 1994.

[64] W. C. Rheinboldt. On M-functions and their application to nonlinear
Gauss-Seidel iterations and to network flows. J. Math. Anal. Appl., 32:274–307,
1970.

[65] H. L. Royden. Real Analysis. Prentice-Hall, 1988.

[66] W. Rudin. Principles of Mathematical Analysis. McGraw-Hill, 3rd edition, 1976.

[67] T. F. Russell. Stability analysis and switching criteria for adaptive implicit
methods based on the CFL condition. SPE paper 18416, presented at the SPE
Symposium on Reservoir Simulation in Houston, TX, 1989.

[68] Y. Saad. Iterative Methods for Sparse Linear Systems. SIAM, 2nd edition, 2003.

[69] Y. Saad and M. H. Schultz. GMRES: a generalized minimal residual algorithm
for solving nonsymmetric linear systems. SIAM J. Sci. Stat. Comp.,
7(3):856–869, July 1986.

[70] R. Sanders. On convergence of monotone finite difference schemes with variable
spatial differencing. Math. Comp., 40:91–106, 1983.

[71] A. Settari and K. Aziz. Treatment of nonlinear terms in the numerical solution
of partial differential equations for multiphase flow in porous media. Int. J.
Multiphase Flow, 1:817–844, 1975.

[72] A. G. Spillette, J. G. Hillestad, and H. L. Stone. A high-stability sequential
solution approach to reservoir simulation. SPE paper 4542, presented at the Fall
Meeting of the Society of Petroleum Engineers of AIME in Las Vegas, NV, 1973.

[73] H. L. Stone. Iterative solution of implicit approximations of multidimensional
partial differential equations. SIAM J. Numer. Anal., 5:530–568, 1968.

[74] K. Stueben. Algebraic multigrid (AMG): experiences and comparisons. In
Proceedings of the International Multigrid Conference, Copper Mountain, CO, 1983.

[75] E. Tadmor. Entropy stability theory for difference approximations of nonlinear
conservation laws and related time dependent problems. Acta Numerica,
12:451–512, 2003.

[76] G. W. Thomas and D. H. Thurnau. Reservoir simulation using an adaptive implicit
method. SPE J., 23:759–768, 1983.

[77] B. van Leer. Upwind and high-resolution methods for compressible flow: from
donor cell to residual-distribution schemes. Commun. Comput. Phys., 1:192–206,
2006.

[78] R. S. Varga. Matrix Iterative Analysis. Prentice-Hall, 1962.

[79] S. Verma and K. Aziz. Control volume scheme for flexible grids in reservoir
simulation. SPE paper 37999, presented at the SPE Symposium on Reservoir
Simulation in Dallas, TX, 1997.

[80] P. K. W. Vinsome. ORTHOMIN, an iterative method for solving sparse banded
sets of simultaneous linear equations. SPE paper 5729, presented at the SPE
Symposium on Numerical Simulation of Reservoir Performance in Los Angeles,
CA, 1976.

[81] J. R. Wallis, R. P. Kendall, and T. E. Little. Constrained residual acceleration of
conjugate residual methods. SPE paper 13536, presented at the SPE Reservoir
Simulation Symposium in Dallas, TX, 1985.

[82] J. W. Watts. A compositional formulation of the pressure and saturation
equations. SPE Reservoir Eng., 1(3):243–252, 1986.

[83] J. W. Watts. Reservoir simulation: past, present and future. SPE Computer
Applications, 12(4):171–176, 1997.

[84] J. W. Watts III. A conjugate gradient truncated direct method for the iterative
solution of the reservoir simulation pressure equation. SPE J., 21:345–353, 1981.

[85] D. M. Young. Iterative Solution of Large Linear Systems. Academic Press, New
York, 1971.

[86] L. C. Young. A finite-element method for reservoir simulation. SPE J.,
21(1):115–128, 1981.

[87] L. C. Young and R. E. Stephenson. A generalized compositional approach for
reservoir simulation. SPE J., 23(5):727–742, 1983.
