An Introduction To Probabilistic Seismic Hazard Analysis (PSHA)
Jack W. Baker
Version 1.3
October 1st, 2008
Acknowledgements
This document is based in part on an excerpt from a report originally written for the US Nuclear
Regulatory Commission. Financial support for the writing of that material is greatly appreciated.
Thanks to Marcello Bianchini, Tom Hanks, Ting Ling, Nirmal Jayaram and Seok Goo Song for
pointing out errors and making suggestions to improve this document. Any remaining errors are the
sole responsibility of the author.
Contents
Acknowledgements 2
Contents 3
Section 1 An overview of PSHA 5
1.1 Introduction 5
1.2 Deterministic versus probabilistic approaches 6
1.2.1 Variability in the design event 6
1.2.2 Variability of ground motion intensity for a given earthquake event 8
1.2.3 Can we use a deterministic approach, given these uncertainties? 9
1.3 Probabilistic seismic hazard analysis calculations 10
1.3.1 Identify earthquake sources 13
1.3.2 Identify earthquake magnitudes 13
1.3.3 Identify earthquake distances 17
1.3.4 Ground motion intensity 21
1.3.5 Combine all information 25
1.4 Example PSHA calculations 27
Section 2 Extensions of PSHA 37
2.1 Deaggregation 37
2.2 Impact of bounds on considered magnitudes and distances 43
2.3 Probabilities, rates, and return periods 44
2.4 Summarizing PSHA output: the uniform hazard spectrum 46
2.5 Joint distributions of two intensity measures 47
Section 3 Conclusions 52
Section 4 A review of probability 53
4.1 Random events 53
4.2 Conditional probabilities 55
4.3 Random variables 58
4.4 Expectations and moments 66
Section 5 Further study 68
5.1 Origins and development of PSHA 68
5.2 Books and summary papers 68
References 71
Section 1 An overview of PSHA
“The language of probability allows us to speak quantitatively about some situation which
may be highly variable, but which does have some consistent average behavior. Our most
precise description of nature must be in terms of probabilities.”
−Richard Feynman
1.1 Introduction
The goal of many earthquake engineering analyses is to ensure that a structure can withstand a given
level of ground shaking while maintaining a desired level of performance. But what level of ground
shaking should be used to perform this analysis? There is a great deal of uncertainty about the
location, size, and resulting shaking intensity of future earthquakes. Probabilistic Seismic Hazard
Analysis (PSHA) aims to quantify these uncertainties, and combine them to produce an explicit
description of the distribution of future shaking that may occur at a site.
In order to assess risk to a structure from earthquake shaking, we must first determine the annual
probability (or rate) of exceeding some level of earthquake ground shaking at a site, for a range of
intensity levels. Information of this type could be summarized as shown in Figure 1.1, which shows
that low levels of intensity are exceeded relatively often, while high intensities are rare. If one were
willing to observe earthquake shaking at a site for thousands of years, it would be possible to obtain
this entire curve experimentally. That is the approach often used for assessing flood risk, but for
seismic risk this is not possible because we do not have enough observations to extrapolate to the low
rates of interest. In addition, we have to consider uncertainties in the size, location, and resulting
shaking intensity caused by an earthquake, unlike the case of floods where we typically only worry
about the size of the flood event. Because of these challenges, our seismic hazard data must be
obtained by mathematically combining models for the location and size of potential future earthquakes
with predictions of the potential shaking intensity caused by these future earthquakes. The
mathematical approach for performing this calculation is known as Probabilistic Seismic Hazard
Analysis, or PSHA.
The purpose of this document is to discuss the calculations involved in PSHA, and the motivation
for using this approach. Because many models and data sources are combined to create results like
those shown in Figure 1.1, the PSHA approach can seem opaque. But when examined more carefully,
the approach is actually rather intuitive. Once understood and properly implemented, PSHA is flexible
enough to accommodate a variety of users’ needs, and quantitative so that it can incorporate all
knowledge about seismic activity and resulting ground shaking at a site.
Probability calculations are a critical part of the procedures described here, so a basic knowledge
of probability and its associated notation is required to study this topic. A review of the concepts and
notation used in this document is provided for interested readers in Section 4.
1.2 Deterministic versus probabilistic approaches

The somewhat complicated probabilistic evaluation could be avoided if it were possible to identify a
“worst-case” ground motion and evaluate the facility of interest under that ground motion. This line of
thinking motivates an approach known as deterministic hazard analysis, but we will see that
conceptual problems arise quickly and are difficult to overcome.
1.2.1 Variability in the design event
A designer looking to choose a worst-case ground motion would first want to look for the maximum
magnitude event that could occur on the closest possible fault. This is simple to state in theory, but
several difficulties arise in practice. Consider first the hypothetical site shown in Figure 1.2a, which is
located 10 km from a fault capable of producing an earthquake with a maximum magnitude of 6.5. It
is also located 30 km from a fault capable of producing a magnitude 7.5 earthquake. The median
predicted response spectra from those two events are shown in Figure 1.2b. As seen in that figure, the
small-magnitude nearby event produces larger spectral acceleration amplitudes at short periods, but
the larger-magnitude event produces larger amplitudes at long periods. So, while one could take the
envelope of the two spectra, there is not a single “worst-case” event that produces the maximum
spectral acceleration amplitudes at all periods.
Figure 1.2: (a) Map view of an illustrative site, with two nearby sources capable of producing
earthquakes. (b) Predicted median response spectra from the two earthquake events, illustrating that
the event producing the maximum response spectra may vary depending upon the period of interest
(prediction obtained from the model of Campbell and Bozorgnia 2008).
While the site shown in Figure 1.2a produces some challenges in terms of identifying a worst-case
event, an even greater challenge arises when faults near a site are not obvious and so the seismic
source is quantified as an areal source capable of producing earthquakes at any location, as shown in
Figure 1.3. In this case, the worst-case event has to be the one with the maximum conceivable
magnitude, at a location directly below the site of interest (i.e., with a distance of 0 km). This is clearly
the maximum event, no matter how unlikely its occurrence may be. For example, in parts of the
Eastern United States, especially near the past Charleston or New Madrid earthquakes, one can quite
feasibly hypothesize the occurrence of magnitude 7.5 or 8 earthquakes immediately below a site,
although that event may occur very rarely.
Figure 1.3: Example site at the center of an area source, with potential earthquakes at zero distance
from the site.
1.2.2 Variability of ground motion intensity for a given earthquake event

While the choice of a “worst-case” earthquake can be difficult and subjective, as discussed in the
previous section, an even greater problem with deterministic hazard analysis is the choice of worst-
case ground motion intensity associated with that earthquake. The response spectra plotted in Figure
1.2 are the median[1] spectra predicted by empirical models calibrated to recorded ground motions. But
recorded ground motions show a very large amount of scatter around those median predictions. By
definition, the median predictions shown in Figure 1.2b are exceeded in 50% of observed ground
motions having the given magnitudes and distances.
An example of the large scatter around those ground motion prediction models is seen in Figure
1.4, which shows spectral acceleration values at 1 second that were observed in a past earthquake
(1999 Chi-Chi, Taiwan), plotted versus the closest distance from the earthquake rupture to the
recording site. Note that observations at distances between 1 and 3 km vary between 0.15g and more
than 1g—nearly an order of magnitude. Also plotted are the mean predicted lnSA values, along with
bounds illustrating one standard deviation above and below that mean. The scatter of the log of
[1] There is considerable opportunity for confusion when referring to means and medians of predicted ground
motion intensity. Ground motion prediction models, such as the one used to make Figure 1.4, provide the mean
and standard deviation of the natural logarithm of spectral acceleration (lnSA) or peak ground acceleration
(lnPGA). These lnSA values are normally distributed, which means that the non-logarithmic SA values are
lognormally distributed. The exponential of the mean lnSa value can be shown to equal the median SA value. It
is easiest to work with lnSA values in the calculations that follow, so this text will often refer to mean lnSA
values rather than median SA values. Plots such as Figure 1.4 will show non-logarithmic SA, because the units
are more intuitive, but the axis will always be in logarithmic scale so that the visual effect is identical to
viewing a plot of lnSA.
spectral accelerations around the mean prediction is well-represented by a normal distribution (leading
to symmetric scatter in Figure 1.4, which is plotted in logarithmic scale).
The one-standard-deviation bounds should enclose about 2/3 of the observed values if the
variations are normally distributed, and that is the case here. To account for this scatter, deterministic
hazard analyses sometimes specify a “mean plus one standard deviation” response spectrum, but even
that will be exceeded 16% of the time[2]. Because the scatter is normally distributed, there is no
theoretical upper bound on the amplitude of ground motion that might be produced at a given
magnitude and distance[3].
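The 16% exceedance figure can be checked directly from the standard normal CDF. A minimal sketch, using only the Python standard library:

```python
import math

def std_normal_cdf(x):
    """Standard normal CDF, Phi(x), computed via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

# Probability that a normal variable exceeds its mean plus one
# standard deviation: 1 - Phi(1).
p_exceed = 1.0 - std_normal_cdf(1.0)
print(round(p_exceed, 2))  # 0.16
```

The same calculation with Phi(2) or Phi(3) shows why even "mean plus two or three standard deviations" spectra still have nonzero exceedance probabilities.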
Figure 1.4: Observed spectral acceleration values from the 1999 Chi-Chi, Taiwan earthquake,
illustrating variability in ground motion intensity. The predicted distribution comes from the model
of Campbell and Bozorgnia (2008).
1.2.3 Can we use a deterministic approach, given these uncertainties?

Given these challenges, it is clear that whatever deterministic design earthquake and ground motion
intensity is eventually selected, it is not a true “worst-case” event, as a larger earthquake or ground
motion can always plausibly be proposed. Without a true worst-case event to consider, we are left to
[2] This number comes from considering normally-distributed residuals. As seen in Table 4.1, the probability of a
normal random variable being more than one standard deviation greater than its mean (i.e., 1 − Φ(1)) is 0.16.
[3] There is almost certainly some true physical upper bound on ground motion intensity caused by an inability of
the earth to carry more intense seismic waves without shattering or otherwise failing. Current research suggests
that this limit may be important to structures designed for extremely intense ground motions, such as nuclear
waste repositories, but it almost certainly has no practical impact on more common structures such as buildings
or bridges, which are analyzed for ground motion intensities that are exceeded once every few thousand years.
Thus, the assumption of no theoretical upper bound is reasonable and appropriate in most cases.
identify a “reasonably large” event. That is often done by choosing a nearby large-magnitude event,
and then identifying some level of reasonable intensity associated with this event. While it is possible
to proceed using this type of approach, two issues should be made clear. 1) The resulting ground
motion is not a “worst-case” ground motion. 2) The result may be very sensitive to decisions made
with regard to the chosen scenario magnitude and ground motion intensity. An event chosen in this
manner was historically described as a “Maximum Credible Earthquake,” or MCE. More recently,
however, the acronym has been retained but taken to mean “Maximum Considered Earthquake,” in
recognition of the fact that larger earthquakes (and larger ground motion intensities) are likely to be
credible as well. This “worst-case” thinking will be abandoned for the remainder of the document,
although the problems identified here will serve as a useful motivation for thinking about probability-
based alternatives.
1.3 Probabilistic seismic hazard analysis calculations

In this section, we will describe a probability-based framework capable of addressing the concerns
identified above. Rather than ignoring the uncertainties present in the problem, this approach
incorporates them into calculations of potential ground motion intensity. While incorporation of
uncertainties adds some complexity to the procedure, the resulting calculations are much more
defensible for use in engineering decision-making for reducing risks.
With PSHA, we are no longer searching for an elusive worst-case ground motion intensity. Rather,
we will consider all possible earthquake events and resulting ground motions, along with their
associated probabilities of occurrence, in order to find the level of ground motion intensity exceeded
with some tolerably low rate. At its most basic level, PSHA is composed of five steps.
1. Identify all earthquake sources capable of producing damaging ground motions.
2. Characterize the distribution of earthquake magnitudes (the rates at which earthquakes of
various magnitudes are expected to occur).
3. Characterize the distribution of source-to-site distances associated with potential earthquakes.
4. Predict the resulting distribution of ground motion intensity as a function of earthquake
magnitude, distance, etc.
5. Combine uncertainties in earthquake size, location and ground motion intensity, using a
calculation known as the total probability theorem.
The end result of these calculations will be a full distribution of levels of ground shaking intensity,
and their associated rates of exceedance. The illusion of a worst-case ground motion will be removed,
and replaced by identification of occurrence frequencies for the full range of ground motion intensities
of potential interest. These results can then be used to identify a ground motion intensity having an
acceptably small probability of being exceeded.
Figure 1.5: Schematic illustration of the basic five steps in probabilistic seismic hazard analysis. (a)
Identify earthquake sources. (b) Characterize the distribution of earthquake magnitudes from each
source. (c) Characterize the distribution of source-to-site distances from each source. (d) Predict the
resulting distribution of ground motion intensity. (e) Combine information from parts a-d to
compute the annual rate of exceeding a given ground motion intensity.
1.3.1 Identify earthquake sources
In contrast to the deterministic thinking above, which focused only on the largest possible earthquake
event, here we are interested in all earthquake sources capable of producing damaging ground motions
at a site. These sources could be faults, which are typically planar surfaces identified through various
means such as observations of past earthquake locations and geological evidence. If individual faults
are not identifiable (as in the less seismically active regions of the eastern United States), then
earthquake sources may be described by areal regions in which earthquakes may occur anywhere.
Once all possible sources are identified, we can identify the distribution of magnitudes and source-to-
site distances associated with earthquakes from each source.
1.3.2 Identify earthquake magnitudes
Tectonic faults are capable of producing earthquakes of various sizes (i.e., magnitudes). Gutenberg
and Richter (1944) first studied observations of earthquake magnitudes, and noted that the distribution
of these earthquake sizes in a region generally follows a particular distribution, given as follows
log λm = a − bm (1.1)
where λm is the rate of earthquakes with magnitudes greater than m, and a and b are constants. This
equation is called the Gutenberg-Richter recurrence law. Figure 1.6 illustrates typical observations
from a fault or region, along with the Gutenberg-Richter recurrence law given by equation 1.1.
The a and b constants from equation 1.1 are estimated using statistical analysis of historical
observations, with additional constraining data provided by other types of geological evidence[4]. The a
value indicates the overall rate of earthquakes in a region, and the b value indicates the relative ratio of
small and large magnitudes (typical b values are approximately equal to 1).
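As a concrete illustration of equation 1.1, the following sketch evaluates the Gutenberg-Richter recurrence law; the parameter values a = 4 and b = 1 are hypothetical, chosen only for illustration:

```python
def gr_rate(m, a=4.0, b=1.0):
    """Annual rate of earthquakes with magnitude greater than m, from the
    Gutenberg-Richter recurrence law: log10(lambda_m) = a - b*m."""
    return 10 ** (a - b * m)

# With b = 1, each unit increase in magnitude reduces the rate tenfold.
rate_m5 = gr_rate(5.0)  # 0.1 events per year exceeding magnitude 5
rate_m6 = gr_rate(6.0)  # 0.01 events per year exceeding magnitude 6
print(rate_m5, rate_m6)
```

The reciprocal of a rate is a return period: under these assumed parameters, magnitude > 5 events recur on average every 10 years, and magnitude > 6 events every 100 years.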
Equation 1.1 can also be used to compute a cumulative distribution function[5] (CDF) for the
magnitudes of earthquakes that are larger than some minimum magnitude mmin (this conditioning is
used because earthquakes smaller than mmin will be ignored in later calculations due to their lack of
engineering importance).
[4] Note that some care is needed during this process to ensure that no problems are caused by using historical data
that underestimates the rate of small earthquakes due to the use of less sensitive instruments in the past. Methods
have been developed to address this issue (e.g., Weichert 1980), but are not considered further in this document.
[5] Probability tools such as cumulative distribution functions and probability density functions are necessary for
much of the analysis that follows. See Section 4 for a review of this material.
FM(m) = P(M ≤ m | M > mmin)
      = (Rate of earthquakes with mmin < M ≤ m) / (Rate of earthquakes with mmin < M)
      = (λmmin − λm) / λmmin
      = 1 − 10^(−b(m − mmin)),  m > mmin     (1.2)

where FM(m) denotes the cumulative distribution function for M, and the final step substitutes
λm = 10^(a − bm) from equation 1.1. One can compute the probability
density function (PDF) for M by taking the derivative of the CDF
fM(m) = (d/dm) FM(m)
      = (d/dm) [1 − 10^(−b(m − mmin))]
      = b ln(10) 10^(−b(m − mmin)),  m > mmin     (1.3)
Note that the PDF given in equation 1.3 relies on the Gutenberg-Richter law of equation 1.1,
which theoretically predicts magnitudes with no upper limit, although physical constraints make this
unrealistic. There is generally some limit on the upper bound of earthquake magnitudes in a region,
due to the finite size of the source faults (earthquake magnitude is related to the area of the seismic
rupture). If a maximum magnitude can be determined, then equation 1.2 becomes
1 − 10−b( m − mmin )
FM (m) = − b ( mmax − mmin )
, mmin < m < mmax (1.4)
1 − 10
b ln(10)10−b( m − mmin )
f M ( m) = − b ( mmax − mmin )
, mmin < m < mmax (1.5)
1 − 10
where mmax is the maximum earthquake that a given source can produce. This limited magnitude
distribution is termed a bounded Gutenberg-Richter recurrence law. Example observations of
earthquake magnitudes are shown in Figure 1.6, along with Gutenberg-Richter and bounded
Gutenberg-Richter recurrence laws fit to the data.
Figure 1.6: Typical distribution of observed earthquake magnitudes, along with Gutenberg-Richter
and bounded Gutenberg-Richter recurrence laws fit to the observations.
For our later PSHA equations, we will convert the continuous distribution of magnitudes into a
discrete set of magnitudes. For example, consider a source with a minimum considered magnitude of
5, a maximum magnitude of 8, and a b parameter equal to 1. Table 1.1 lists probabilities of interest for
this example source. The first column lists 13 magnitude values between 5 and 8. The second column
lists the cumulative distribution function, as computed using equation 1.4. The third column lists
probabilities of occurrence of these discrete set of magnitudes, assuming that they are the only
possible magnitudes; they are computed as follows
P ( M = m j ) = FM (m j +1 ) − FM ( m j ) (1.6)
where mj are the discrete set of magnitudes, ordered so that mj < mj+1. This calculation assigns the
probabilities associated with all magnitudes between mj and mj+1 to the discrete value mj. As long as
the discrete magnitudes are closely spaced, the approximation will not affect numerical results.
Magnitudes are spaced at intervals of 0.25 for illustration in Table 1.1 so that the table is not too
lengthy, but a practical PSHA analysis might use a magnitude spacing of 0.1 or less.
Table 1.1: Magnitude probabilities for a source with a truncated Gutenberg-Richter distribution, a
minimum considered magnitude of 5, a maximum magnitude of 8, and a b parameter of 1. The
numbers in this table were computed using equations 1.4 and 1.6.
mj      FM(mj)    P(M = mj)
5.00 0.0000 0.4381
5.25 0.4381 0.2464
5.50 0.6845 0.1385
5.75 0.8230 0.0779
6.00 0.9009 0.0438
6.25 0.9447 0.0246
6.50 0.9693 0.0139
6.75 0.9832 0.0078
7.00 0.9910 0.0044
7.25 0.9954 0.0024
7.50 0.9978 0.0014
7.75 0.9992 0.0008
8.00 1.0000 0.0000
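The values in Table 1.1 can be reproduced with a short script implementing equations 1.4 and 1.6. This is a sketch; the magnitude grid and parameters match the example above:

```python
def bounded_gr_cdf(m, m_min=5.0, m_max=8.0, b=1.0):
    """Bounded Gutenberg-Richter CDF, equation 1.4."""
    num = 1.0 - 10 ** (-b * (m - m_min))
    den = 1.0 - 10 ** (-b * (m_max - m_min))
    return num / den

# Discretize magnitudes at 0.25 intervals, as in Table 1.1.
mags = [5.0 + 0.25 * j for j in range(13)]
cdf = [bounded_gr_cdf(m) for m in mags]

# Equation 1.6 assigns each interval's probability to the lower discrete
# magnitude; the final magnitude (m_max) gets zero, matching the table.
probs = [cdf[j + 1] - cdf[j] for j in range(12)] + [0.0]

for m, F, p in zip(mags, cdf, probs):
    print(f"{m:5.2f}  {F:.4f}  {p:.4f}")
```

Shrinking the spacing from 0.25 to 0.1 or less, as suggested in the text, only requires changing the grid; the probabilities still sum to one.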
Figure 1.7: Illustration of discretization of a continuous magnitude distribution for a source with a
truncated Gutenberg-Richter distribution, a minimum considered magnitude of 5, a maximum
magnitude of 8, and a b parameter of 1. (a) Continuous probability density function from equation
1.5. (b) Discrete probabilities from equation 1.6.
An aside: The Gutenberg-Richter models above are not the only models proposed for describing
the distribution of earthquake magnitudes. One common alternative is the Characteristic Earthquake
model, which proposes that some faults have repeated occurrences of a characteristic earthquake with
a reasonably consistent magnitude (Schwartz and Coppersmith 1984). This characteristic magnitude
occurs more often than predicted by the Gutenberg-Richter models proposed above. All that is
required to adopt an alternative recurrence model is to replace equation 1.5 with a suitably modified
2.5 Joint distributions of two intensity measures
Logarithms of pairs of spectral acceleration values (and presumably also PGA values) have been
shown to have a joint normal distribution (Jayaram and Baker 2007), so calculations of joint
distributions of two intensity measures become reasonably simple in this special case. In the case of a
joint normal distribution, conditional distributions of one intensity measure parameter, given the other,
are also normally distributed, and can be computed using only a linear correlation coefficient between
the two parameters (see Section 4.3 for a few further details, and Benjamin and Cornell, 1970, for a
more complete discussion).
Let us consider joint predictions of PGA and spectral acceleration at a period of 0.5 seconds
(SA(0.5s)), given a magnitude 5 earthquake at a distance of 10 km. Abrahamson and Silva (1997)
provide the following predictions for the mean and standard deviation of lnSA
ln SA = −2.7207 (2.12)
σ ln SA = 0.80 (2.13)
Note that a few more parameters than just magnitude and distance are needed to obtain this lnSA
prediction; here we have also assumed a rock site and a strike-slip mechanism for the earthquake. The
median of (non-log) SA is simply the exponential of this number, which in this case is 0.065 g.
Looking back to Section 1.3.4, we recall that the mean and standard deviation of lnPGA for this
event were

ln PGA = −2.2673 (2.14)

σ ln PGA = 0.57 (2.15)
The only thing needed further to compute the joint distribution of PGA and SA is the correlation
coefficient between the two (typically referred to using the Greek letter ρ). These correlation
coefficients have been determined in a manner similar to the way that ground motion prediction
models are calibrated; several documents provide these coefficients (e.g., Baker and Cornell 2006;
Baker and Jayaram 2008), and estimate a ρ of approximately 0.7 for this case.
Now let us consider a prediction of the distribution of PGA, given knowledge of the SA(0.5s)
value for a ground motion coming from the specified earthquake event. Because of the joint normality
of lnPGA and lnSA, we can write the conditional mean and standard deviation of lnPGA as

ln( PGA | SA) = ln( PGA) + ρ ε SA σ ln PGA (2.16)

σ ln PGA|SA = σ ln PGA √(1 − ρ²) (2.17)
where all parameters have been defined above except ε SA . That parameter is the number of standard
deviations by which a given lnSA value differs from its mean predicted value. Mathematically, it can
be written
ln x − ln( SA)
ε SA = (2.18)
σ ln SA
where x is the observed SA value, and the other terms are the mean and standard deviation from the
original ground motion prediction model.
Now imagine that we have observed an SA value of 0.2g from the magnitude 5 earthquake at a
distance of 10 km. Using equation 2.18, we find

ε SA = (ln(0.2) − (−2.7207)) / 0.80 = 1.4 (2.19)
That is, the observed spectral acceleration is 1.4 standard deviations larger than the mean predicted
value associated with this earthquake.
If SA was larger than its mean, and SA and PGA are correlated, then knowledge of this large SA
value should increase our predictions of PGA for the given ground motion. We make this increased
prediction using equation 2.16
ln( PGA | SA) = ln( PGA) + ρε SAσ ln PGA = −2.2673 + 0.7(1.4)(0.57) = −1.7124 (2.20)
Taking an exponential of this number tells us that the median conditional PGA is 0.18g (a significant
increase from the median prediction of 0.104g we made before we had observed SA=0.2g).
Knowledge of SA should also decrease our uncertainty in PGA, and this is reflected in equation
2.17

σ ln PGA|SA = 0.57 √(1 − 0.7²) = 0.41 (2.21)
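The conditional calculations above can be collected into a short script. This is a sketch using the example's numbers; the means, standard deviations, and ρ = 0.7 are the values quoted in the text:

```python
import math

# Ground motion model predictions for the M=5, R=10 km scenario (from the text).
mean_ln_sa, sigma_ln_sa = -2.7207, 0.80    # lnSA(0.5s) prediction
mean_ln_pga, sigma_ln_pga = -2.2673, 0.57  # lnPGA prediction
rho = 0.7                                  # correlation of lnSA and lnPGA

# Equation 2.18: number of standard deviations by which the observed
# SA = 0.2g exceeds its mean prediction.
eps_sa = (math.log(0.2) - mean_ln_sa) / sigma_ln_sa  # about 1.4

# Equations 2.16 and 2.17: conditional mean and standard deviation of lnPGA,
# given the observed SA, from the bivariate normal distribution.
cond_mean = mean_ln_pga + rho * eps_sa * sigma_ln_pga
cond_sigma = sigma_ln_pga * math.sqrt(1.0 - rho ** 2)

print(round(eps_sa, 1))               # 1.4
print(round(math.exp(cond_mean), 2))  # 0.18 (median conditional PGA, in g)
```

Note that the conditional standard deviation depends only on ρ, not on the observed SA value: any observation of a correlated intensity measure reduces uncertainty in lnPGA by the same factor √(1 − ρ²).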
Using the updated conditional mean and standard deviation of PGA, we can now predict the
probability of exceeding different PGA values conditional upon SA(0.5s) = 0.2g, by using equation
1.15 with our updated conditional mean and standard deviation. Some sample results are summarized
in Table 2.3. The first column lists a series of PGA values of potential interest. The second column
lists the probability of exceeding those PGA values, given a magnitude 5 earthquake at a distance of
10 km, but not yet conditioned on any observed SA value. That is, the second column was computed
using the original mean and standard deviation from equations 2.14 and 2.15. Note that this calculation
is identical to the one from Table 1.2. In the third column of Table 2.3, we compute probabilities of
exceeding the same PGA values, but this time conditioned upon knowledge that SA(0.5s) = 0.2g. That
is, we evaluate equation 1.15 with our new conditional mean and standard deviation. Examining the
second and third columns of this table, two interesting features are apparent. First, the probability of
exceeding low PGA values has increased significantly, because we now know that the correlated
parameter SA(0.5s) is larger than usual for this event. Second, we see that the probability of exceeding
very large PGA values has actually decreased. The decrease is because knowledge of SA has reduced
our uncertainty in PGA. Although we know that SA is larger than its mean prediction, we have also
eliminated the possibility that SA is even more extreme than the observed value, so the most extreme
PGA values actually become less likely. Finally, in the fourth column of the table, we compute the
conditional probability of PGA equaling the various values of interest, using equation 1.19
Table 2.3: PGA probabilities associated with a magnitude 5 earthquake at 10 km, and an SA(0.5s)
value of 0.2g.
To aid in intuitive understanding of these calculations, Figure 2.7 shows a schematic illustration of
the joint distribution referred to above. The horizontal axes represent the range of (log) PGA and SA
values that might result from earthquakes with a given magnitude and distance. The contour lines
illustrate the contours of the joint distribution of PGA and SA. The centroid and spread of these
contours with respect to each horizontal axis are specified by the mean and standard deviation from
ground motion prediction models. The correlation between the two parameters is reflected by the
elliptical shape of the contours, which means that large values of lnSA are likely to be associated with
large values of lnPGA. What we are interested in here is the distribution of PGA, given that we have
observed some SA value. This is represented by cuts through the joint distribution. The conditional
distributions at two cuts (i.e., two potential SA values) are shown on the vertical axis of the figure. The
probability of exceeding some PGA value x1 is represented by the area of the conditional distribution
to the right of x1. We see from the two cuts that as the observed lnSA value gets larger, the probability
of exceeding x1 also gets larger. That is the effect we also saw in the third column of Table 2.3.
Figure 2.7: Schematic illustration of the joint distribution of PGA and SA.
Section 3 Conclusions
We have now completed an overview of probabilistic seismic hazard analysis (PSHA), and several
extensions of the basic methodology. Example calculations have been presented to illustrate how the
computations are performed in practice. With these tools, one can quantify the risk of ground motion
shaking at a site, given knowledge about seismic sources surrounding the site.
Having now considered the many sources of uncertainty present when predicting future shaking at
a site, it is hopefully clear to the reader why deterministic approaches to seismic hazard analysis can
be unsatisfying. It should be clear that there is no such thing as a deterministic “worst-case” ground
motion, and that attempts to identify an alternate deterministic ground motion necessitate making
decisions that may be arbitrary and hard to justify.
PSHA is fundamentally an accounting method that lets one combine diverse sources of data
regarding occurrence rates of earthquakes, the size of future earthquakes, and propagation of seismic
shaking through the earth. It would be impossible to model the distribution of future earthquake
shaking at a site through direct observation, because one would have to wait thousands or millions of
years to collect enough observations to make a reasonable inference regarding rare ground motions.
But, by incorporating many sources of data into the calculations, it becomes possible to project out to
these low probabilities with scientifically-defensible and reproducible models.
The basic PSHA calculation, and its required inputs, was discussed in Section 1. In Section 2,
several extensions were presented, such as deaggregation and uniform hazard spectra. There is also a
vast literature regarding the accurate estimation of the many inputs, such as occurrence rates of
earthquakes and their magnitude distributions, which was not discussed here. References for further
study on these topics are provided in Section 5 for the interested reader.
Section 4 A review of probability
Probability is so fundamental to probabilistic seismic hazard analysis that the word appears in its title.
The calculations in this document thus rely heavily on the use of probability concepts and notation.
These probabilistic tools allow us to move through calculations without having to stop and derive
intermediate steps. The notational conventions allow us to easily describe the behavior of uncertain
quantities. It is recognized that these concepts and notations are not familiar to all readers, however, so
this section is intended to provide a brief overview of the needed material. Readers desiring more
details may benefit from reviewing a textbook dedicated specifically to practical applications of
probability concepts (e.g., Ang and Tang 2007; Benjamin and Cornell 1970; Ross 2004).
The most basic building block of probability calculations is the random event: an event having more
than one possible outcome. The sample space (denoted S) is the collection of all possible outcomes of
a random event. Any subset of the sample space is called an event, and denoted E. Sample spaces and
events are often illustrated graphically using Venn diagrams, as illustrated in Figure 4.1.
For example, the number obtained from rolling a die is a random event. The sample space for
this example is S = {1, 2, 3, 4, 5, 6}. The outcomes in the event that the number is odd are E1 = {1, 3, 5}.
The outcomes in the event that the number is greater than three are E2 = {4, 5, 6}.
We are commonly interested in two operations on events. The first is the union of E1 and E2,
denoted by the symbol ∪. E1 ∪ E2 is the event that contains all outcomes in either E1 or E2. The
second is the intersection. E1E2 (also denoted E1 ∩ E2) is the event that contains all outcomes in both
E1 and E2. For example, continuing the die illustration from above, E1 ∪ E2 = {1, 3, 4, 5, 6} and
E1 ∩ E2 = {5}.
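As a quick check, these set operations can be reproduced with Python's built-in set type (a minimal sketch; the variable names are ours):

```python
# Sample space and events for the die-rolling example.
S = {1, 2, 3, 4, 5, 6}   # sample space
E1 = {1, 3, 5}           # event: the number is odd
E2 = {4, 5, 6}           # event: the number is greater than three

print(E1 | E2)  # union: {1, 3, 4, 5, 6}
print(E1 & E2)  # intersection: {5}
```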
Special events
There are a few special terms and special events that are often useful for probability calculations:
The certain event is an event that contains all possible outcomes in the sample space. The sample
space S is the certain event.
Events E1 and E2 are mutually exclusive when they have no common outcomes. E1E2 = ∅ if E1 and
E2 are mutually exclusive.
Events E1, E2…En are collectively exhaustive when their union contains every possible outcome of
the random event (i.e., E1 ∪ E2 ∪ ... ∪ En = S ).
The complementary event, Ē1, of an event E1 contains all outcomes in the sample space that are
not in event E1. It should be clear that, by this definition, E1 ∪ Ē1 = S and E1Ē1 = ∅. That is, E1 and
Ē1 are mutually exclusive and collectively exhaustive.
Axioms of probability
We will be interested in the probabilities of occurrence of various events. These probabilities must
follow three axioms of probability:
0 ≤ P( E ) ≤ 1 (4.1)
P( S ) = 1 (4.2)
P ( E1 ∪ E2 ) = P ( E1 ) + P ( E2 ) , for mutually exclusive events E1 and E2 (4.3)
These axioms form the building blocks of all other probability calculations. It is very easy to derive
additional laws using these axioms, and the previously-defined events. For example,
P(Ē) = 1 − P(E) (4.4)
P(∅) = 0 (4.5)
P ( E1 ∪ E2 ) = P ( E1 ) + P ( E2 ) − P ( E1 E2 ) (4.6)
The probability of the event E1 may depend upon the occurrence of another event E2. The conditional
probability P(E1|E2) is defined as the probability that event E1 has occurred, given that event E2 has
occurred. That is, we are computing the probability of E1, if we restrict our sample space to only those
outcomes in event E2. Figure 4.2 may be useful as the reader thinks about this concept.
Figure 4.2: Schematic illustration of the events E1 and E2. The shaded region depicts the area
corresponding to event E1E2.
P(E1 | E2) = P(E1E2) / P(E2), if P(E2) > 0 (4.7)
P(E1 | E2) = 0, if P(E2) = 0
P ( E1 E2 ) = P ( E1 | E2 ) P ( E2 ) (4.8)
Independence
Conditional probabilities give rise to the concept of independence. We say that two events are
independent if they are not related probabilistically in any way. More precisely, we say that events E1
and E2 are independent if
P ( E1 | E2 ) = P ( E1 ) (4.9)
That is, the probability of E1 is not in any way affected by knowledge of the occurrence of E2.
Substituting equation 4.9 into equation 4.8 gives
P ( E1 E2 ) = P ( E1 ) P ( E2 ) (4.10)
which is an equivalent way of stating that E1 and E2 are independent. Note that equations 4.9 and 4.10
are true if and only if E1 and E2 are independent.
Total Probability Theorem
Consider an event A and a set of mutually exclusive, collectively exhaustive events E1, E2 … En. The
total probability theorem states that
P(A) = ∑_{i=1}^{n} P(A | Ei) P(Ei) (4.11)
In words, this tells us that we can compute the probability of A if we know the probability of the Ei’s,
and know the probability of A, given each of these Ei’s. The schematic illustration in Figure 4.3 may
help to understand what is being computed. At first glance, the utility of this calculation may not be
obvious, but it is critical to many engineering calculations where the probability of A is difficult to
determine directly, but where the problem can be broken down into several pieces whose probabilities
can be computed.
Consider the following example, to illustrate the value of this calculation. You have been asked to
compute the probability that Building X collapses during the next earthquake in the region. You do not
know with certainty if the next earthquake will be strong, medium or weak, but seismologists have
estimated the following probabilities:
P(strong) = 0.01
P(medium) = 0.1
P(weak) = 0.89
Additionally, structural engineers have performed analyses and estimated the following:
P(collapse|strong) = 0.9
P(collapse|medium) = 0.2
P(collapse|weak) = 0.01
Referring to equation 4.11, the “A” in that equation is the event that the building collapses, and the Ei’s
are the events that the earthquake is strong, medium or weak. We can therefore compute the
probability of collapse as
P(collapse) = 0.9(0.01) + 0.2(0.1) + 0.01(0.89) = 0.0379 (4.12)
The total probability theorem allows one to break the problem into two parts (size of the earthquake
and capacity of the building), compute probabilities for those parts, and then re-combine them to
answer the original question. This not only facilitates solution of the problem in pieces, but it allows
different specialists (e.g., seismologists and engineers) to work on different aspects of the problem.
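The arithmetic of the total probability theorem can be sketched in a few lines of Python (the dictionary names are illustrative, not part of any PSHA library):

```python
# Total probability theorem (equation 4.11) for the building-collapse example:
# P(collapse) = sum over earthquake sizes of P(collapse | size) * P(size).
p_size = {'strong': 0.01, 'medium': 0.1, 'weak': 0.89}
p_collapse_given = {'strong': 0.9, 'medium': 0.2, 'weak': 0.01}

p_collapse = sum(p_collapse_given[s] * p_size[s] for s in p_size)
print(p_collapse)  # ≈ 0.0379
```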
Probabilistic seismic hazard analysis is a direct application of the total probability theorem (except
that it uses random variables, discussed below, instead of random events). The distribution of
earthquake magnitudes and distances are studied independently of the conditional distribution of
ground motion intensity, and this probabilistic framework allows us to re-combine the various sources
of information in a rigorous manner.
Bayes’ Rule
Consider an event A and a set of mutually exclusive, collectively exhaustive events E1, E2 … En. From
our conditional probability equations above, we can write
P(AEj) = P(Ej | A) P(A) = P(A | Ej) P(Ej) (4.13)
P(Ej | A) = P(A | Ej) P(Ej) / P(A) (4.14)
This equation is known as Bayes’ Rule. An alternate form is based on substituting equation 4.11 for
the total probability theorem in place of P(A) in the denominator of equation 4.14.
P(Ej | A) = P(A | Ej) P(Ej) / ∑_{i=1}^{n} P(A | Ei) P(Ei) (4.15)
The utility of these equations lies in their ability to compute conditional probabilities, when you
only know probabilities related to conditioning in the reverse order of what is desired. That is, you
would like to compute P(A|B) but only know P(B|A). This is exactly the type of calculation used in the
deaggregation calculations of Section 2.1.
To provide a simple illustration of how this equation works, consider again the example used to
illustrate the total probability theorem. Suppose you have just learned that an earthquake occurred and
building X collapsed. You don’t yet know the size of the earthquake, and would like to compute the
probability that it was a strong earthquake. Using equation 4.14, you can write
P(strong | collapse) = 0.9(0.01) / 0.0379 = 0.24 (4.17)
It is not obvious intuitively how large that probability would be, because strong earthquakes have a
high probability of causing collapse, but they are also extremely unlikely to occur. Like the Total
Probability Theorem, Bayes’ Rule provides a valuable calculation approach for combining pieces of
information to compute a probability that may be difficult to determine directly.
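Continuing the same example, Bayes’ Rule is a one-line computation (the value P(collapse) = 0.0379 comes from the total probability theorem example above):

```python
# Bayes' Rule (equation 4.14) applied to the collapse example:
# P(strong | collapse) = P(collapse | strong) * P(strong) / P(collapse)
p_strong = 0.01
p_collapse_given_strong = 0.9
p_collapse = 0.0379  # from the total probability theorem

p_strong_given_collapse = p_collapse_given_strong * p_strong / p_collapse
print(round(p_strong_given_collapse, 2))  # 0.24
```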
Random variables
Here we will introduce an important concept and an associated important notation. A random variable
is a numerical variable whose specific value cannot be predicted with certainty before the occurrence
of an “event” (in our context, this might be the magnitude of a future earthquake). Examples of
random variables relevant to PSHA are the time to the next earthquake in a region, the magnitude of a
future earthquake, the distance from a future earthquake to a site, ground shaking intensity at a site,
etc.
We need a notation to refer to both the random variable itself, and to refer to specific numerical
values which that random variable might take. Standard convention is to denote a random variable
with an uppercase letter, and denote the values it can take on by the same letter in lowercase. That is,
x1, x2, x3, … denote possible numerical outcomes of the random variable X. We can then talk about
probabilities of the random variable taking those outcomes (i.e., P ( X = x1 ) ).
Discrete and continuous random variables
We can in general treat all random variables using the same tools, with the exception of distinguishing
between discrete and continuous random variables. If the number of values a random variable can take
is countable, the random variable is called discrete. An example of a discrete random variable is the
number of earthquakes occurring in a region in some specified period of time. The probability
distribution for a discrete random variable can be quantified by a probability mass function (PMF),
defined as
p X ( x) = P ( X = x) (4.18)
The cumulative distribution function (CDF) is defined as the probability of the event that the random
variable takes a value less than or equal to the value of the argument
FX ( x) = P ( X ≤ x) (4.19)
The probability mass function and cumulative distribution function have a one-to-one relationship
FX(a) = ∑_{all xi ≤ a} pX(xi) (4.20)
Examples of the PMF and CDF of a discrete random variable are shown in Figure 4.4.
Figure 4.4: Example descriptions of a discrete random variable. (a) Probability mass function. (b)
Cumulative distribution function.
In many cases we are interested in the probability of X > x, rather than the X ≤ x addressed by the
CDF. But noting that those two outcomes are mutually exclusive and collectively exhaustive events,
we can use the previous axioms of probability to see that P ( X > x ) = 1 − P ( X ≤ x ) .
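For a concrete case, the PMF and CDF of a fair six-sided die can be sketched in Python (the function and variable names are ours):

```python
# PMF (equation 4.18) and CDF (equations 4.19-4.20) of a fair six-sided die.
pmf = {x: 1 / 6 for x in range(1, 7)}

def cdf(a):
    """F_X(a) = sum of the PMF over all outcomes x_i <= a (equation 4.20)."""
    return sum(p for x, p in pmf.items() if x <= a)

print(cdf(3))      # P(X <= 3) ≈ 0.5
print(1 - cdf(3))  # P(X > 3) ≈ 0.5, using the complement rule
```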
In contrast to discrete random variables, continuous random variables can take any value on the
real axis (although they don’t have to). Because there are an infinite number of possible realizations,
the probability that a continuous random variable X will take on any single value x is zero. This forces
us to use an alternate approach for calculating probabilities. We define the probability density function
(PDF) using the following
f X ( x ) dx = P ( x < X ≤ x + dx ) (4.21)
where dx is a differential element of infinitesimal length. An illustration of the PDF and related
probability calculation is given in Figure 4.5. We can compute the probability that the outcome of X is
between a and b by “summing” (integrating) the probability density function over the interval of
interest
P(a < X ≤ b) = ∫_a^b fX(x) dx (4.22)
Figure 4.5: Plot of a continuous probability density function. The area of the shaded rectangle,
f X ( x) dx , represents the probability of the random variable X taking values between x and x + dx.
Note that in many of the PSHA equations above, we have approximated continuous random
variables by discrete random variables, for ease of numerical integration. In those cases, we have
replaced the infinitesimal dx by a finite Δx, so that equation 4.21 becomes
pX(x) = P(x < X ≤ x + Δx) ≈ fX(x) Δx (4.23)
where pX ( x) is the probability mass function for X , the discretized version of the continuous random
variable X. Reference to Figure 4.5 should help the reader understand that the probabilities of any
outcome between x and x + Δx have been assigned to the discrete value x.
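The discretization idea can be illustrated numerically; here we discretize a standard normal density as an example (the grid limits and step size Δx = 0.01 are arbitrary choices, not values from the text):

```python
import math

# Discretizing a continuous PDF (equation 4.21 with dx replaced by a finite Δx):
# p_X(x) ≈ f_X(x) * Δx, illustrated with a standard normal density.
def f(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2 * math.pi)

dx = 0.01
xs = [i * dx for i in range(-500, 500)]   # grid from -5 to just under 5
pmf = {x: f(x) * dx for x in xs}          # discretized probabilities

total = sum(pmf.values())
prob_0_1 = sum(p for x, p in pmf.items() if 0 < x <= 1)
print(total)     # ≈ 1.0 (probabilities of the discretized variable sum to one)
print(prob_0_1)  # ≈ 0.34, approximating P(0 < X ≤ 1)
```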
Another way to describe a continuous random variable is with a cumulative distribution function
(CDF)
FX ( x) = P ( X ≤ x) (4.24)
FX(x) = P(X ≤ x) = ∫_{−∞}^{x} fX(u) du (4.25)
fX(x) = d FX(x) / dx (4.26)
Note that the CDF of continuous and discrete random variables has the same definition. This is
because probabilities of outcomes within an interval are identically defined for discrete and continuous
outcomes. Plots of continuous cumulative distribution functions are seen in the body of the document
(e.g., Figure 1.9b and Figure 1.11b).
Comments on notation
This PMF/PDF/CDF notation allows us to compactly and precisely describe probabilities of outcomes
of random variables. Note that the following conventions have been used:
1. The initial letter indicates the type of probability being described (i.e., “p” for PMFs,
“f” for PDFs, and “F” for CDFs).
2. The subscript denotes the random variable (e.g., “X”), and thus is always a capital letter.
3. The argument in parentheses indicates the numerical value being considered (e.g., “x”),
and is thus either a lower-case letter or a numeric value (e.g., FX (2) = P ( X ≤ 2) ).
It is worth noting that these conventions are not chosen arbitrarily here or unique to PSHA. They are
used almost universally in all probability papers and books, regardless of the field of application.
Conditional distributions
A final note on random variables: we are often interested in conditional probability distributions of
random variables. We can adopt all of the results from Section 4.2 if we recognize that the random
variable X exceeding some value x is an event. So we can adapt equation 4.7, for example, to write
fX|Y(x | y) = fX,Y(x, y) / fY(y) (4.29)
where the notation fX|Y(x | y) denotes the conditional probability density function of X, given that the
random variable Y has taken value y, and fX,Y(x, y) denotes the joint probability density function of X
and Y.
Similarly, equation 4.10 can be adapted to show that random variables X and Y are independent if
and only if
f X ,Y ( x, y ) = f X ( x) fY ( y ) (4.30)
Another example is the PSHA equations above that use integrals over the random variables for
magnitude and distance; these are the random-variable analog of the total probability theorem
introduced earlier for events.
These types of manipulations, which are only briefly introduced here, are very useful for
computing probabilities of outcomes of random variables, conditional upon knowledge of other
probabilistically-dependent random variables.
The normal distribution
One particular type of random variable will play an important role in the calculations above, so we
will treat it carefully here. A random variable is said to have a “normal” (or “Gaussian”) distribution if
it has the following PDF
fX(x) = (1 / (σX √(2π))) exp( −(1/2) ((x − μX) / σX)² ) (4.31)
where μX and σX denote the mean value and standard deviation, respectively, of X. This PDF forms the
familiar “bell curve” seen above in Figure 4.5. This is one of the most common distributions in
engineering, and has been found to describe very accurately the distribution of the logarithm of ground
motion intensity associated with a given earthquake magnitude and distance. Because of that, we often
want to find the probability that a normally-distributed random variable X takes a value less than x.
From above, we know that we can find this probability by integrating the PDF over the region of
interest
P(X ≤ x) = ∫_{−∞}^{x} fX(u) du = ∫_{−∞}^{x} (1 / (σX √(2π))) exp( −(1/2) ((u − μX) / σX)² ) du (4.32)
Unfortunately, there is no analytic solution to this integral. But because it is so commonly used,
we tabulate its values, as shown in Table 4.1. To keep this table reasonably small in size, we use two
tricks. First, we summarize values for the “standard” normal distribution, where standard denotes that
the random variable has a mean value (μx) of 0 and a standard deviation (σx) of 1. So the CDF becomes
P(X ≤ x) = ∫_{−∞}^{x} (1 / √(2π)) exp( −u²/2 ) du (4.33)
Because the CDF of the standard normal random variable is so common, we give it the special
notation P ( X ≤ x) = Φ ( x ) .
If the random variable of interest, X, is normal but not standard normal, then we can transform it
into a standard normal random variable as follows
U = (X − μX) / σX (4.34)
where U is a standard normal random variable. This means that we can use the right-hand side of
equation 4.34 as an argument for the standard normal CDF table. That is
P(X ≤ x) = Φ( (x − μX) / σX ) (4.35)
A second trick used to manage the size of the standard normal CDF table is to note that the normal
PDF is symmetric about zero. This means that
Φ(−u ) = 1 − Φ (u ) (4.36)
so the CDF value for a negative number can be determined from the CDF value for the corresponding
positive number. Thus, the table is tabulated for only positive values of u. The identity of equation
4.36 might be intuitively clear if one views the standard normal PDF illustrated at the top of Table 4.1.
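Although the integral has no analytic solution, Φ can also be evaluated numerically; one standard route uses the error function in Python's math module, via the identity Φ(x) = ½(1 + erf(x/√2)):

```python
import math

# Standard normal CDF Φ (equation 4.33), evaluated via the error function.
def phi(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

# Symmetry property of equation 4.36: Φ(-u) = 1 - Φ(u)
print(phi(-1.0))     # ≈ 0.1587
print(1 - phi(1.0))  # ≈ 0.1587, the same value

# Standardizing a non-standard normal X (equations 4.34-4.35);
# the values of mu, sigma, and x here are illustrative only.
mu, sigma = 5.0, 2.0
x = 7.0
print(phi((x - mu) / sigma))  # P(X ≤ 7) = Φ(1) ≈ 0.8413
```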
Table 4.1: Standard normal cumulative distribution function.
The normal distribution can be generalized to the case of more than one random variable. Two
random variables are said to have a joint normal distribution if they have the following joint PDF
fX,Y(x, y) = (1 / (2π σX σY √(1 − ρ²))) exp( −z / (2(1 − ρ²)) ) (4.37)
z = (x − μX)²/σX² − 2ρ(x − μX)(y − μY)/(σX σY) + (y − μY)²/σY² (4.38)
An important property of random variables having this distribution is that if X and Y are joint
normal, then their marginal distributions ( f X ( x ) and fY ( y ) ) are normal, and their conditional
distributions are also normal. Specifically, the distribution of X given Y = y is normal, with conditional
mean
μX|Y=y = μX + ρ σX ( (y − μY) / σY ) (4.39)
and conditional standard deviation
σX|Y=y = σX √(1 − ρ²) (4.40)
These properties are convenient when computing joint distributions of ground motion parameters.
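A short sketch of equations 4.39 and 4.40 in Python (the parameter values are illustrative only, not taken from any ground motion model):

```python
import math

# Conditional distribution of X given Y = y for jointly normal X and Y
# (equations 4.39 and 4.40). Illustrative parameter values.
mu_x, sigma_x = 0.0, 1.0
mu_y, sigma_y = 0.0, 1.0
rho = 0.7  # correlation coefficient between X and Y

def conditional_params(y):
    mean = mu_x + rho * sigma_x * (y - mu_y) / sigma_y  # equation 4.39
    std = sigma_x * math.sqrt(1 - rho ** 2)             # equation 4.40
    return mean, std

mean, std = conditional_params(1.5)
print(mean)  # ≈ 1.05 (shifted toward the observed y = 1.5)
print(std)   # ≈ 0.714 (reduced from the marginal value of 1.0)
```

Note that observing Y shifts the mean of X toward the observation and always shrinks its standard deviation by the factor √(1 − ρ²).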
4.4 Expectations and moments
A random variable is completely defined by its PMF or PDF (for discrete and continuous random
variables, respectively). Sometimes, however, it is convenient to use measures that describe general
features of the distribution, such as its “average” value, breadth of feasible values, and whether the
distribution has a heavier tail to the left or right. We can measure these properties using moments of a
random variable, and they are often more convenient to work with for engineering applications.
The mean is the most common moment, and is used to describe the central location of a random
variable. The mean of X is denoted μX or E[X]. It can be calculated for a discrete random variable as
μX = ∑_{all i} xi pX(xi) (4.41)
and for a continuous random variable as
μX = ∫_{all x} x fX(x) dx (4.42)
Note that this is equal to the center of gravity of the PMF or PDF. The equations may be recognizable
to some readers as being very similar to centroid calculations.
The variation of values to be expected from a random variable can be measured using the
variance, denoted σ X2 or Var[X]. It is calculated for a discrete random variable as
σX² = ∑_{all i} (xi − μX)² pX(xi) (4.43)
and for a continuous random variable as
σX² = ∫_{all x} (x − μX)² fX(x) dx (4.44)
This is the moment of inertia of the PDF (or PMF) about the mean.
The square root of the variance is known as the standard deviation, and is denoted σX. It is often
preferred to the variance when reporting a description of a random variable, because it has the same
units as the random variable itself (unlike the variance, whose units are the original units squared).
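A short numerical check of equations 4.41 and 4.43, again using the fair die as an example:

```python
# Mean (equation 4.41) and variance (equation 4.43) of a discrete random
# variable, computed from the PMF of a fair six-sided die.
pmf = {x: 1 / 6 for x in range(1, 7)}

mean = sum(x * p for x, p in pmf.items())               # equation 4.41
var = sum((x - mean) ** 2 * p for x, p in pmf.items())  # equation 4.43
std = var ** 0.5                                        # standard deviation

print(mean)  # ≈ 3.5
print(var)   # ≈ 2.917 (i.e., 35/12)
print(std)   # ≈ 1.708, in the same units as the outcomes
```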
Means and variances are special cases of the expectation operation. The expectation of g(X) is
defined for discrete random variables as
E[g(X)] = ∑_{all i} g(xi) pX(xi) (4.45)
and for continuous random variables as
E[g(X)] = ∫_{all x} g(x) fX(x) dx (4.46)
The mean value is the special case of expectation where g(X) = X, and the variance is the special case
where g ( X ) = ( X − μ X )2 . These are special cases of what are called moments of random variables, but
we will restrict the discussion here to those two special cases.
Finally, note that the normal random variable described above uses the mean and standard
deviation explicitly as parameters in its PDF. So given knowledge of the mean and standard deviation
of a normal random variable, one knows its complete distribution. This is not the case for random
variables in general, but it is one of the reasons why the normal random variable is convenient to work
with.
Section 5 Further study
Below is a list of important papers and summary books that would be valuable for those interested in
further study. References are grouped by type, and followed by a short description of their value.
McGuire, R. K. (2004). Seismic Hazard and Risk Analysis, Earthquake Engineering Research Institute,
Berkeley.
Reiter, L. (1990). Earthquake hazard analysis: issues and insights, Columbia University Press, New
York.
Another book describing PSHA. The author was an employee of the Nuclear Regulatory
Commission, so his insights are particularly focused on nuclear applications. This book also
makes the greatest effort to compare deterministic and probabilistic seismic hazard analysis
methods.
Abrahamson, N. A. (2006). "Seismic hazard assessment: problems with current practice and future
developments." First European Conference on Earthquake Engineering and Seismology, Geneva,
Switzerland, 17p. http://www.ecees.org/keynotes.html
References
Abrahamson, N. A., and Silva, W. J. (1997). "Empirical response spectral attenuation relations for
shallow crustal earthquakes." Seismological Research Letters, 68(1), 94-126.
Anagnos, T., and Kiremidjian, A. S. (1984). "Stochastic time-predictable model for earthquake
occurrences." Bulletin of the Seismological Society of America, 74(6), 2593-2611.
Ang, A. H.-S., and Tang, W. H. (2007). Probability concepts in engineering emphasis on applications
in civil & environmental engineering, Wiley, New York.
Baker, J. W., and Cornell, C. A. (2006). "Correlation of response spectral values for multi-component
ground motions." Bulletin of the Seismological Society of America, 96(1), 215-227.
Baker, J. W., and Jayaram, N. (2008). "Correlation of spectral acceleration values from NGA ground
motion models." Earthquake Spectra, 24(1), 299-317
Bazzurro, P., and Cornell, C. A. (1999). "Disaggregation of Seismic Hazard." Bulletin of the
Seismological Society of America, 89(2), 501-520.
Benjamin, J. R., and Cornell, C. A. (1970). Probability, Statistics, and Decision for Civil Engineers,
McGraw-Hill, New York.
Campbell, K. W., and Bozorgnia, Y. (2008). "NGA Ground Motion Model for the Geometric Mean
Horizontal Component of PGA, PGV, PGD and 5% Damped Linear Elastic Response Spectra for
Periods Ranging from 0.01 to 10 s." Earthquake Spectra, 24(1), 139-171.
Cornell, C. A., Banon, H., and Shakal, A. F. (1979). "Seismic motion and response prediction
alternatives." Earthquake Engineering & Structural Dynamics, 7(4), 295-315.
Cornell, C. A., and Winterstein, S. R. (1988). "Temporal and magnitude dependence in earthquake
recurrence models." Bulletin of the Seismological Society of America, 78(4), 1522-1537.
Gutenberg, B., and Richter, C. F. (1944). "Frequency of earthquakes in California." Bulletin of the
Seismological Society of America, 34(4), 185-188.
Jayaram, N., and Baker, J. W. (2007). "Statistical Tests of the Joint Distribution of Spectral
Acceleration Values." Bulletin of the Seismological Society of America (in press).
71
Lomnitz-Adler, J., and Lomnitz, C. (1979). "A modified form of the Gutenberg-Richter magnitude-
frequency relation." Bulletin of the Seismological Society of America, 69(4), 1209-1214.
McGuire, R. K. (1995). "Probabilistic Seismic Hazard Analysis and Design Earthquakes: Closing the
Loop." Bulletin of the Seismological Society of America, 85(5), 1275-1284.
Nuclear Regulatory Commission. (1997). Identification and Characterization of Seismic Sources and
Determination of Safe Shutdown Earthquake Ground Motion, Regulatory Guide 1.165.
Reiter, L. (1990). Earthquake hazard analysis: Issues and insights, Columbia University Press, New
York.
Ross, S. M. (2004). Introduction to probability and statistics for engineers and scientists, 3rd Ed.,
Elsevier Academic Press, Amsterdam ; Boston.
Schwartz, D. P., and Coppersmith, K. J. (1984). "Fault behavior and characteristic earthquakes:
Examples from the Wasatch and San Andreas fault zones." Journal of Geophysical Research,
89(B7), 5681-5698.
Weichert, D. H. (1980). "Estimation of the earthquake recurrence parameters for unequal observation
periods for different magnitudes." Bulletin of the Seismological Society of America, 70(4), 1337-
1346.
Youngs, R. R., and Coppersmith, K. J. (1985). "Implications of fault slip rates and earthquake
recurrence models to probabilistic seismic hazard analysis." Bulletin of the Seismological Society
of America, 75(4), 939-964.
Logarithms of pairs of spectral acceleration values (and presumably also PGA values) have been
shown to have a joint normal distribution (Jayaram and Baker 2007), so calculations of joint
distributions of two intensity measures becomes reasonably simple in this special case. In the case of a
joint normal distribution, conditional distributions of one intensity measure parameter, given the other,
are also normally distributed, and can be computed using only a linear correlation coefficient between
the two parameters (see Section 4.3 for a few further details, and Benjamin and Cornell, 1970, for a
more complete discussion).
Let us consider joint predictions of PGA and spectral acceleration at a period of 0.5 seconds
(SA(0.5s)), given a magnitude 5 earthquake at a distance of 10 km. Abrahamson and Silva (1997)
provide the following predictions for the mean and standard deviation of lnSA
ln SA = −2.7207 (2.12)
σ ln SA = 0.80 (2.13)
Note that a few more parameters than just magnitude and distance are needed to obtain this lnSA
prediction; here we have also assumed a rock site and a strike-slip mechanism for the earthquake. The
median of (non-log) SA is simply the exponential of this number, which in this case is 0.065 g.
Looking back to section 1.3.4, we recall that the mean and standard deviation of lnPGA for this
event were
ln PGA = −2.2673 (2.14)
σlnPGA = 0.57 (2.15)
The only thing needed further to compute the joint distribution of PGA and SA is the correlation
coefficient between the two (typically referred to using the Greek letter ρ). These correlation
coefficients have been determined in a manner similar to the way that ground motion prediction
models are calibrated; several documents provide these coefficients (e.g., Baker and Cornell 2006;
Baker and Jayaram 2008), and estimate a ρ of approximately 0.7 for this case.
Now let us consider a prediction of the distribution of PGA, given knowledge of the SA(0.5s)
value for a ground motion coming from the specified earthquake event. Because of the joint normality
of lnPGA and lnSA, we can write the conditional mean and standard deviation of lnPGA as simply
ln(PGA | SA) = ln(PGA) + ρ εSA σlnPGA (2.16)
σlnPGA|SA = σlnPGA √(1 − ρ²) (2.17)
where all parameters have been defined above except ε SA . That parameter is the number of standard
deviations by which a given lnSA value differs from its mean predicted value. Mathematically, it can
be written
εSA = (ln x − ln SA) / σlnSA (2.18)
where x is the observed SA value, and the other terms are the mean and standard deviation from the
original ground motion prediction model.
Now imagine that we have observed an SA value of 0.2g from the magnitude 5 earthquake at a
distance of 10 km. Using equation 2.18, we find
εSA = (ln(0.2) − (−2.7207)) / 0.80 = 1.4 (2.19)
That is, the observed spectral acceleration is 1.4 standard deviations larger than the mean predicted
value associated with this earthquake.
If SA was larger than its mean, and SA and PGA are correlated, then knowledge of this large SA
value should increase our predictions of PGA for the given ground motion. We make this increased
prediction using equation 2.16
ln(PGA | SA) = ln(PGA) + ρ εSA σlnPGA = −2.2673 + 0.7(1.4)(0.57) = −1.7124 (2.20)
Taking an exponential of this number tells us that the median conditional PGA is 0.18g (a significant
increase from the median prediction of 0.104g we made before we had observed SA=0.2g).
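The conditional-prediction arithmetic above can be reproduced in a few lines of Python, using the numerical values from this example (this is a sketch of the calculation only, not an implementation of the Abrahamson and Silva model):

```python
import math

# Conditional prediction of lnPGA given an observed SA(0.5s), following
# equations 2.16-2.20. Numerical values are those of the example:
mean_ln_sa, sigma_ln_sa = -2.7207, 0.80    # equations 2.12-2.13
mean_ln_pga, sigma_ln_pga = -2.2673, 0.57  # mean and sigma of lnPGA
rho = 0.7                                  # correlation of lnPGA and lnSA

sa_observed = 0.2  # observed SA(0.5s), in g
eps_sa = (math.log(sa_observed) - mean_ln_sa) / sigma_ln_sa  # equation 2.18

# Conditional mean and standard deviation of lnPGA (joint normality):
cond_mean = mean_ln_pga + rho * eps_sa * sigma_ln_pga
cond_sigma = sigma_ln_pga * math.sqrt(1 - rho ** 2)

print(round(eps_sa, 2))               # ≈ 1.39 standard deviations above the mean
print(round(math.exp(cond_mean), 2))  # median conditional PGA ≈ 0.18 g
print(round(cond_sigma, 2))           # ≈ 0.41, reduced from 0.57
```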
Knowledge of SA should also decrease our uncertainty in PGA, and this is reflected in equation
2.17
σlnPGA|SA = σlnPGA √(1 − ρ²) = 0.57 √(1 − 0.7²) = 0.41
Using the updated conditional mean and standard deviation of PGA, we can now predict the
probability of exceeding different PGA values conditional upon SA(0.5s) = 0.2g, by using equation
1.15 with our updated conditional mean and standard deviation. Some sample results are summarized
in Table 2.3. The first column lists a series of PGA values of potential interest. The second column
lists the probability of exceeding those PGA values, given a magnitude 5 earthquake at a distance of
10 km, but not yet conditioned on any observed SA value. That is, the second column was computed
using the original mean and standard deviation from equations 2.14 and 2.15. Note that this calculation
is identical to the one from Table 1.2. In the third column of Table 2.3, we compute probabilities of
exceeding the same PGA values, but this time conditioned upon knowledge that SA(0.5s) = 0.2g. That
is, we evaluate equation 1.15 with our new conditional mean and standard deviation. Examining the
second and third columns of this table, two interesting features are apparent. First, the probability of
exceeding low PGA values has increased significantly, because we now know that the correlated
parameter SA(0.5s) is larger than usual for this event. Second, we see that the probability of exceeding
very large PGA values has actually decreased. The decrease is because knowledge of SA has reduced
our uncertainty in PGA. Although we know that SA is larger than its mean prediction, we have also
eliminated the possibility that SA is even more extreme than the observed value, so the most extreme
PGA values actually become less likely. Finally, in the fourth column of the table, we compute the
conditional probability of PGA equaling the various values of interest, using equation 1.19
Table 2.3: PGA probabilities associated with a magnitude 5 earthquake at 10 km, and an SA(0.5s)
value of 0.2g.
To aid in intuitive understanding of these calculations, Figure 2.7 shows a schematic illustration of
the joint distribution referred to above. The horizontal axes represent the range of (log) PGA and SA
values that might result from earthquakes with a given magnitude and distance. The contour lines
illustrate the contours of the joint distribution of PGA and SA. The centroid and spread of these
contours with respect to each horizontal axis are specified by the mean and standard deviation from
ground motion prediction models. The correlation between the two parameters is reflected by the
elliptical shape of the contours, which means that large values of lnSA are likely to be associated with
large values of lnPGA. What we are interested in here is the distribution of PGA, given that we have
observed some SA value. This is represented by cuts through the joint distribution. The conditional
distributions at two cuts (i.e., two potential SA values) are shown on the vertical axis of the figure. The
probability of exceeding some PGA value x1 is represented by the area of the conditional distribution
to the right of x1. We see from the two cuts that as the observed lnSA value gets larger, the probability
of exceeding x1 also gets larger. That is the effect we also saw in the third column of Table 2.3.
Figure 2.7: Schematic illustration of the joint distribution of PGA and SA.
Section 3 Conclusions
We have now completed an overview of probabilistic seismic hazard analysis (PSHA), and several
extensions of the basic methodology. Example calculations have been presented to illustrate how the
computations are performed in practice. With these tools, one can quantify the risk of ground motion
shaking at a site, given knowledge about seismic sources surrounding the site.
Having now considered the many sources of uncertainty present when predicting future shaking at
a site, it is hopefully clear to the reader why deterministic approaches to seismic hazard analysis can
be unsatisfying. It should be clear that there is no such thing as a deterministic “worst-case” ground
motion, and that attempts to identify an alternate deterministic ground motion necessitate making
decisions that may be arbitrary and hard to justify.
PSHA is fundamentally an accounting method that lets one combine diverse sources of data
regarding occurrence rates of earthquakes, the size of future earthquakes, and propagation of seismic
shaking through the earth. It would be impossible to model the distribution of future earthquake
shaking at a site through direct observation, because one would have to wait thousands or millions of
years to collect enough observations to make a reasonable inference regarding rare ground motions.
But, by incorporating many sources of data into the calculations, it becomes possible to project out to
these low probabilities with scientifically-defensible and reproducible models.
The basic PSHA calculation, and its required inputs, was discussed in Section 1. In Section 2,
several extensions were presented, such as deaggregation and uniform hazard spectra. There is also a
vast literature regarding the accurate estimation of the many inputs, such as occurrence rates of
earthquakes and their magnitude distributions, which was not discussed here. References for further
study on these topics are provided in Section 5 for the interested reader.
Section 4 A review of probability
Probability is so fundamental to probabilistic seismic hazard analysis that the word appears in its title.
The calculations in this document thus rely heavily on the use of probability concepts and notation.
These probabilistic tools allow us to move through calculations without having to stop and derive
intermediate steps. The notational conventions allow us to easily describe the behavior of uncertain
quantities. It is recognized that these concepts and notations are not familiar to all readers, however, so
this section is intended to provide a brief overview of the needed material. Readers desiring more
details may benefit from reviewing a textbook dedicated specifically to practical applications of
probability concepts (e.g., Ang and Tang 2007; Benjamin and Cornell 1970; Ross 2004).
The most basic building block of probability calculations is the random event: an event having more
than one possible outcome. The sample space (denoted S) is the collection of all possible outcomes of
a random event. Any subset of the sample space is called an event, and denoted E. Sample spaces and
events are often illustrated graphically using Venn diagrams, as illustrated in Figure 4.1.
For example, the number obtained from rolling a die is a random event. The sample space for
this example is S = {1, 2, 3, 4, 5, 6}. The outcomes in the event that the number is odd are E1 = {1, 3, 5}.
The outcomes in the event that the number is greater than three are E2 = {4, 5, 6}.
We are commonly interested in two operations on events. The first is the union of E1 and E2,
denoted by the symbol ∪ . E1 ∪ E2 is the event that contains all outcomes in either E1 or E2. The
second is the intersection. E1E2 (also denoted E1 ∩ E2 ) is the event that contains all outcomes in both
E1 and E2. For example, continuing the die illustration from above, E1 ∪ E2 = {1, 3, 4, 5, 6} and
E1 ∩ E2 = {5}.
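These operations map directly onto, for example, Python's built-in set type; a quick check of the die example:

```python
# Sample space and events for the die-rolling example
S = {1, 2, 3, 4, 5, 6}
E1 = {1, 3, 5}   # the number is odd
E2 = {4, 5, 6}   # the number is greater than three

union = E1 | E2          # all outcomes in either event
intersection = E1 & E2   # all outcomes in both events

print(union)         # {1, 3, 4, 5, 6}
print(intersection)  # {5}
```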
Special events
There are a few special terms and special events that are often useful for probability calculations:
The certain event is an event that contains all possible outcomes in the sample space. The sample
space S is the certain event.
Events E1 and E2 are mutually exclusive when they have no common outcomes. E1E2 = φ if E1 and
E2 are mutually exclusive.
Events E1, E2…En are collectively exhaustive when their union contains every possible outcome of
the random event (i.e., E1 ∪ E2 ∪ ... ∪ En = S ).
The complementary event, Ē1, of an event E1, contains all outcomes in the sample space that are
not in event E1. It should be clear that, by this definition, E1 ∪ Ē1 = S and E1Ē1 = ϕ. That is, E1 and
Ē1 are mutually exclusive and collectively exhaustive.
Axioms of probability
We will be interested in the probabilities of occurrence of various events. These probabilities must
follow three axioms of probability:
0 ≤ P( E ) ≤ 1 (4.1)
P( S ) = 1 (4.2)
P ( E1 ∪ E2 ) = P ( E1 ) + P ( E2 ) , for mutually exclusive events E1 and E2 (4.3)
These axioms form the building blocks of all other probability calculations. It is very easy to derive
additional laws using these axioms, and the previously-defined events. For example,
P(Ē) = 1 − P(E) (4.4)
P (ϕ ) = 0 (4.5)
P ( E1 ∪ E2 ) = P ( E1 ) + P ( E2 ) − P ( E1 E2 ) (4.6)
The probability of the event E1 may depend upon the occurrence of another event E2. The conditional
probability P(E1|E2) is defined as the probability that event E1 has occurred, given that event E2 has
occurred. That is, we are computing the probability of E1, if we restrict our sample space to only those
outcomes in event E2. Figure 4.2 may be useful as the reader thinks about this concept.
Figure 4.2: Schematic illustration of the events E1 and E2. The shaded region depicts the area
corresponding to event E1E2.
P(E1 | E2) = P(E1E2) / P(E2)   if P(E2) > 0
P(E1 | E2) = 0                 if P(E2) = 0   (4.7)
P ( E1 E2 ) = P ( E1 | E2 ) P ( E2 ) (4.8)
Independence
Conditional probabilities give rise to the concept of independence. We say that two events are
independent if they are not related probabilistically in any way. More precisely, we say that events E1
and E2 are independent if
P ( E1 | E2 ) = P ( E1 ) (4.9)
That is, the probability of E1 is not in any way affected by knowledge of the occurrence of E2.
Substituting equation 4.9 into equation 4.8 gives
P ( E1 E2 ) = P ( E1 ) P ( E2 ) (4.10)
which is an equivalent way of stating that E1 and E2 are independent. Note that equations 4.9 and 4.10
are true if and only if E1 and E2 are independent.
Total Probability Theorem
Consider an event A and a set of mutually exclusive, collectively exhaustive events E1, E2 … En. The
total probability theorem states that
P(A) = ∑_{i=1}^{n} P(A | Ei) P(Ei) (4.11)
In words, this tells us that we can compute the probability of A if we know the probability of the Ei’s,
and know the probability of A, given each of these Ei’s. The schematic illustration in Figure 4.3 may
help to understand what is being computed. At first glance, the utility of this calculation may not be
obvious, but it is critical to many engineering calculations where the probability of A is difficult to
determine directly, but where the problem can be broken down into several pieces whose probabilities
can be computed.
Consider the following example, to illustrate the value of this calculation. You have been asked to
compute the probability that Building X collapses during the next earthquake in the region. You do not
know with certainty if the next earthquake will be strong, medium or weak, but seismologists have
estimated the following probabilities:
P(strong) = 0.01
P(medium) = 0.1
P(weak) = 0.89
Additionally, structural engineers have performed analyses and estimated the following:
P(collapse|strong) = 0.9
P(collapse|medium) = 0.2
P(collapse|weak) = 0.01
Referring to equation 4.11, the “A” in that equation is the event that the building collapses, and the Ei’s
are the events that the earthquake is strong, medium or weak. We can therefore compute the
probability of collapse as
P(collapse) = P(collapse | strong) P(strong) + P(collapse | medium) P(medium) + P(collapse | weak) P(weak)
            = 0.9(0.01) + 0.2(0.1) + 0.01(0.89) = 0.0379 (4.12)
The total probability theorem allows one to break the problem into two parts (size of the earthquake
and capacity of the building), compute probabilities for those parts, and then re-combine them to
answer the original question. This not only facilitates solution of the problem in pieces, but it allows
different specialists (e.g., seismologists and engineers) to work on different aspects of the problem.
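The arithmetic of this example takes only a few lines; the sketch below simply encodes the probabilities listed above:

```python
# Total probability theorem (equation 4.11):
# P(collapse) = sum over i of P(collapse | Ei) * P(Ei)
p_event = {'strong': 0.01, 'medium': 0.1, 'weak': 0.89}          # P(Ei), from seismologists
p_collapse_given = {'strong': 0.9, 'medium': 0.2, 'weak': 0.01}  # P(collapse | Ei), from engineers

p_collapse = sum(p_collapse_given[e] * p_event[e] for e in p_event)
print(round(p_collapse, 4))  # 0.0379
```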
Probabilistic seismic hazard analysis is a direct application of the total probability theorem (except
that it uses random variables, discussed below, instead of random events). The distribution of
earthquake magnitudes and distances are studied independently of the conditional distribution of
ground motion intensity, and this probabilistic framework allows us to re-combine the various sources
of information in a rigorous manner.
Bayes’ Rule
Consider an event A and a set of mutually exclusive, collectively exhaustive events E1, E2 … En. From
our conditional probability equations above, we can write
P(AEj) = P(Ej | A) P(A) = P(A | Ej) P(Ej) (4.13)
P(Ej | A) = P(A | Ej) P(Ej) / P(A) (4.14)
This equation is known as Bayes’ Rule. An alternate form is based on substituting equation 4.11 for
the total probability theorem in place of P(A) in the denominator of equation 4.14.
P(Ej | A) = P(A | Ej) P(Ej) / ∑_{i=1}^{n} P(A | Ei) P(Ei) (4.15)
The utility of these equations lies in their ability to compute conditional probabilities, when you
only know probabilities related to conditioning in the reverse order of what is desired. That is, you
would like to compute P(A|B) but only know P(B|A). This is exactly the type of calculation used in the
deaggregation calculations of Section 2.1.
To provide a simple illustration of how this equation works, consider again the example used to
illustrate the total probability theorem. Suppose you have just learned that an earthquake occurred and
building X collapsed. You don’t yet know the size of the earthquake, and would like to compute the
probability that it was a strong earthquake. Using equation 4.14, you can write
P(strong | collapse) = 0.9(0.01) / 0.0379 = 0.24 (4.17)
It is not obvious intuitively how large that probability would be, because strong earthquakes have a
high probability of causing collapse, but they are also extremely unlikely to occur. Like the Total
Probability Theorem, Bayes’ Rule provides a valuable calculation approach for combining pieces of
information to compute a probability that may be difficult to determine directly.
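As a sketch, the same numbers can be run through Bayes' Rule programmatically:

```python
# Bayes' Rule (equation 4.15): P(strong | collapse) is the strong-earthquake term
# divided by the total probability of collapse
p_event = {'strong': 0.01, 'medium': 0.1, 'weak': 0.89}
p_collapse_given = {'strong': 0.9, 'medium': 0.2, 'weak': 0.01}

p_collapse = sum(p_collapse_given[e] * p_event[e] for e in p_event)  # denominator, 0.0379
p_strong_given_collapse = p_collapse_given['strong'] * p_event['strong'] / p_collapse
print(round(p_strong_given_collapse, 2))  # 0.24
```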
4.3 Random variables
Here we will introduce an important concept and an associated important notation. A random variable
is a numerical variable whose specific value cannot be predicted with certainty before the occurrence
of an “event” (in our context, this might be the magnitude of a future earthquake). Examples of
random variables relevant to PSHA are the time to the next earthquake in a region, the magnitude of a
future earthquake, the distance from a future earthquake to a site, ground shaking intensity at a site,
etc.
We need a notation to refer to both the random variable itself, and to refer to specific numerical
values which that random variable might take. Standard convention is to denote a random variable
with an uppercase letter, and denote the values it can take on by the same letter in lowercase. That is,
x1, x2, x3, … denote possible numerical outcomes of the random variable X. We can then talk about
probabilities of the random variable taking those outcomes (i.e., P ( X = x1 ) ).
Discrete and continuous random variables
We can in general treat all random variables using the same tools, with the exception of distinguishing
between discrete and continuous random variables. If the number of values a random variable can take
is countable, the random variable is called discrete. An example of a discrete random variable is the
number of earthquakes occurring in a region in some specified period of time. The probability
distribution for a discrete random variable can be quantified by a probability mass function (PMF),
defined as
p X ( x) = P ( X = x) (4.18)
The cumulative distribution function (CDF) is defined as the probability of the event that the random
variable takes a value less than or equal to the value of the argument
FX ( x) = P ( X ≤ x) (4.19)
The probability mass function and cumulative distribution function have a one-to-one relationship
FX(a) = ∑_{all xi ≤ a} pX(xi) (4.20)
Examples of the PMF and CDF of a discrete random variable are shown in Figure 4.4.
Figure 4.4: Example descriptions of a discrete random variable. (a) Probability mass function. (b)
Cumulative distribution function.
In many cases we are interested in the probability of X > x, rather than the X ≤ x addressed by the
CDF. But noting that those two outcomes are mutually exclusive and collectively exhaustive events,
we can use the previous axioms of probability to see that P ( X > x ) = 1 − P ( X ≤ x ) .
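A short sketch, using a hypothetical PMF (the probabilities below are illustrative, not from any real model):

```python
# A hypothetical PMF for a discrete random variable X
# (e.g., the number of earthquakes in a region in some time period)
pmf = {0: 0.5, 1: 0.3, 2: 0.15, 3: 0.05}

def cdf(a):
    """F_X(a) = P(X <= a): sum the PMF over all x_i <= a (equation 4.20)."""
    return sum(p for x, p in pmf.items() if x <= a)

print(round(cdf(1), 2))      # 0.8
print(round(1 - cdf(1), 2))  # P(X > 1) = 0.2, using the complement
```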
In contrast to discrete random variables, continuous random variables can take any value on the
real axis (although they don’t have to). Because there are an infinite number of possible realizations,
the probability that a continuous random variable X will take on any single value x is zero. This forces
us to use an alternate approach for calculating probabilities. We define the probability density function
(PDF) using the following
f X ( x ) dx = P ( x < X ≤ x + dx ) (4.21)
where dx is a differential element of infinitesimal length. An illustration of the PDF and related
probability calculation is given in Figure 4.5. We can compute the probability that the outcome of X is
between a and b by “summing” (integrating) the probability density function over the interval of
interest
P(a < X ≤ b) = ∫_{a}^{b} fX(x) dx (4.22)
Figure 4.5: Plot of a continuous probability density function. The area of the shaded rectangle,
f X ( x) dx , represents the probability of the random variable X taking values between x and x + dx.
Note that in many of the PSHA equations above, we have approximated continuous random
variables by discrete random variables, for ease of numerical integration. In those cases, we have
replaced the infinitesimal dx by a finite Δx, so that equation 4.21 becomes
pX̂(x) = P(x < X ≤ x + Δx) ≈ fX(x) Δx (4.23)
where pX̂(x) is the probability mass function for X̂, the discretized version of the continuous random
variable X. Reference to Figure 4.5 should help the reader understand that the probabilities of any
outcome between x and x + Δx have been assigned to the discrete value x.
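This discretization is easy to experiment with numerically. The sketch below uses an exponential PDF purely as a convenient stand-in (a real application would use, e.g., a ground motion distribution):

```python
import math

# Discretize a continuous PDF into a PMF of width dx, using p(x) ≈ f_X(x) * dx
# (the finite-interval version of equation 4.21). The exponential PDF
# f_X(x) = exp(-x), x >= 0, is just a convenient stand-in distribution.
f = lambda x: math.exp(-x)
dx = 0.01
xs = [(i + 0.5) * dx for i in range(2000)]  # interval midpoints covering [0, 20]
pmf = {x: f(x) * dx for x in xs}

total = sum(pmf.values())
print(round(total, 3))  # the discrete probabilities sum to ~1

# P(0.5 < X <= 1.5) by summing the PMF, compared with the exact integral
p_interval = sum(p for x, p in pmf.items() if 0.5 < x <= 1.5)
print(round(p_interval, 3))
print(round(math.exp(-0.5) - math.exp(-1.5), 3))  # exact answer for comparison
```

Shrinking Δx (here dx) makes the discrete approximation converge to the continuous integral of equation 4.22.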
Another way to describe a continuous random variable is with a cumulative distribution function
(CDF)
FX ( x) = P ( X ≤ x) (4.24)
FX(x) = P(X ≤ x) = ∫_{−∞}^{x} fX(u) du (4.25)
fX(x) = (d/dx) FX(x) (4.26)
Note that the CDF of continuous and discrete random variables has the same definition. This is
because probabilities of outcomes within an interval are identically defined for discrete and continuous
outcomes. Plots of continuous cumulative distribution functions are seen in the body of the document
(e.g., Figure 1.9b and Figure 1.11b).
Comments on notation
This PMF/PDF/CDF notation allows us to compactly and precisely describe probabilities of outcomes
of random variables. Note that the following conventions have been used:
1. The initial letter indicates the type of probability being described (i.e., “p” for PMFs,
“f” for PDFs, and “F” for CDFs).
2. The subscript denotes the random variable (e.g., “X”), and thus is always a capital letter.
3. The argument in parentheses indicates the numerical value being considered (e.g., “x”),
and is thus either a lower-case letter or a numeric value (e.g., FX (2) = P ( X ≤ 2) ).
It is worth noting that these conventions are not chosen arbitrarily here or unique to PSHA. They are
used almost universally in all probability papers and books, regardless of the field of application.
Conditional distributions
A final note on random variables: we are often interested in conditional probability distributions of
random variables. We can adopt all of the results from Section 4.2 if we recognize that the random
variable X exceeding some value x is an event. So we can adapt equation 4.7, for example, to write a
conditional distribution, where the notation fX|Y(x | y) denotes the conditional probability density
function of X, given that the random variable Y has taken value y. If we further introduce the notation
fX,Y(x, y) for the joint probability density function of X and Y, then
fX|Y(x | y) = fX,Y(x, y) / fY(y) (4.29)
Similarly, following equation 4.10, random variables X and Y are independent if and only if
fX,Y(x, y) = fX(x) fY(y) (4.30)
Another example is the PSHA equations above that use integrals over the random variables for
magnitude and distance; these are the random-variable analog of the total probability theorem
introduced earlier for events.
These types of manipulations, which are only briefly introduced here, are very useful for
computing probabilities of outcomes of random variables, conditional upon knowledge of other
probabilistically-dependent random variables.
The normal distribution
One particular type of random variable will play an important role in the calculations above, so we
will treat it carefully here. A random variable is said to have a “normal” (or “Gaussian”) distribution if
it has the following PDF
fX(x) = (1 / (σX √(2π))) exp( −(1/2) ((x − μX) / σX)² ) (4.31)
where μX and σX denote the mean value and standard deviation, respectively, of X. This PDF forms the
familiar “bell curve” seen above in Figure 4.5. This is one of the most common distributions in
engineering, and has been found to describe very accurately the distribution of the logarithm of ground
motion intensity associated with a given earthquake magnitude and distance. Because of that, we often
want to find the probability that a normally-distributed random variable X takes a value less than x.
From above, we know that we can find this probability by integrating the PDF over the region of
interest
P(X ≤ x) = ∫_{−∞}^{x} fX(u) du = ∫_{−∞}^{x} (1 / (σX √(2π))) exp( −(1/2) ((u − μX) / σX)² ) du (4.32)
Unfortunately, there is no analytic solution to this integral. But because it is so commonly used,
we tabulate its values, as shown in Table 4.1. To keep this table reasonably small in size, we use two
tricks. First, we summarize values for the “standard” normal distribution, where standard denotes that
the random variable has a mean value (μx) of 0 and a standard deviation (σx) of 1. So the CDF becomes
P(X ≤ x) = ∫_{−∞}^{x} (1 / √(2π)) exp( −u²/2 ) du (4.33)
Because the CDF of the standard normal random variable is so common, we give it the special
notation P ( X ≤ x) = Φ ( x ) .
If the random variable of interest, X, is normal but not standard normal, then we can transform it
into a standard normal random variable as follows
U = (X − μX) / σX (4.34)
where U is a standard normal random variable. This means that we can use the right-hand side of
equation 4.34 as an argument for the standard normal CDF table. That is
P(X ≤ x) = Φ( (x − μX) / σX ) (4.35)
A second trick used to manage the size of the standard normal CDF table is to note that the normal
PDF is symmetric about zero. This means that
Φ(−u ) = 1 − Φ (u ) (4.36)
so the CDF value for a negative number can be determined from the CDF value for the corresponding
positive number. Thus, the table is tabulated for only positive values of u. The identity of equation
4.36 might be intuitively clear if one views the standard normal PDF illustrated at the top of Table 4.1.
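For readers who prefer computation to table lookup, Φ can also be evaluated through the closely related error function, via the identity Φ(x) = ½(1 + erf(x/√2)):

```python
import math

def phi(x):
    """Standard normal CDF: Phi(x) = 0.5 * (1 + erf(x / sqrt(2)))."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

print(round(phi(1.0), 4))   # 0.8413
print(round(phi(-1.0), 4))  # 0.1587 = 1 - 0.8413, the symmetry of equation 4.36

# A non-standard normal variable, via equation 4.35: P(X <= x) = Phi((x - mu) / sigma)
mu_x, sigma_x = 5.0, 2.0   # hypothetical mean and standard deviation
print(round(phi((7.0 - mu_x) / sigma_x), 4))  # 0.8413 again, since (7 - 5)/2 = 1
```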
Table 4.1: Standard normal cumulative distribution function.
The normal distribution can be generalized to the case of more than one random variable. Two
random variables are said to have a joint normal distribution if they have the following joint PDF
fX,Y(x, y) = (1 / (2π σX σY √(1 − ρ²))) exp( −z / (2(1 − ρ²)) ) (4.37)
where ρ is the correlation coefficient of X and Y, and
z = (x − μX)² / σX² − 2ρ(x − μX)(y − μY) / (σX σY) + (y − μY)² / σY² (4.38)
An important property of random variables having this distribution is that if X and Y are joint
normal, then their marginal distributions ( f X ( x ) and fY ( y ) ) are normal, and their conditional
distributions are also normal. Specifically, the distribution of X given Y = y is normal, with conditional
mean
μX|Y=y = μX + ρ σX ( (y − μY) / σY ) (4.39)
and conditional standard deviation
σX|Y=y = σX √(1 − ρ²) (4.40)
These properties are convenient when computing joint distributions of ground motion parameters.
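As a numerical sketch of equations 4.39 and 4.40 (the marginal parameters and correlation below are hypothetical, not taken from any ground motion model):

```python
import math

# Conditional mean and standard deviation of X given Y = y (equations 4.39 and 4.40).
# All parameter values below are hypothetical illustrations.
mu_x, sigma_x = 0.0, 0.6   # marginal mean and standard deviation of X
mu_y, sigma_y = 0.5, 0.5   # marginal mean and standard deviation of Y
rho = 0.7                  # correlation coefficient of X and Y

y = 1.0  # observed value of Y
mu_x_given_y = mu_x + rho * sigma_x * (y - mu_y) / sigma_y  # equation 4.39
sigma_x_given_y = sigma_x * math.sqrt(1.0 - rho ** 2)       # equation 4.40

print(round(mu_x_given_y, 2))     # 0.42 -- shifted toward the observation
print(round(sigma_x_given_y, 3))  # 0.428 -- reduced from the marginal 0.6
```

Note the two effects described in Section 2: observing a larger-than-average Y shifts the conditional mean of X upward, while the conditional standard deviation shrinks by the factor √(1 − ρ²) regardless of the observed value.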
4.4 Expectations and moments
A random variable is completely defined by its PMF or PDF (for discrete and continuous random
variables, respectively). Sometimes, however, it is convenient to use measures that describe general
features of the distribution, such as its “average” value, breadth of feasible values, and whether the
distribution has a heavier tail to the left or right. We can measure these properties using moments of a
random variable, and they are often more convenient to work with for engineering applications.
The mean is the most common moment, and is used to describe the central location of a random
variable. The mean of X is denoted μX or E[X]. It can be calculated for a discrete random variable as
μX = ∑_{all i} xi pX(xi) (4.41)
and for a continuous random variable as
μX = ∫_{all x} x fX(x) dx (4.42)
Note that this is equal to the center of gravity of the PMF or PDF. The equations may be recognizable
to some readers as being very similar to centroid calculations.
The variation of values to be expected from a random variable can be measured using the
variance, denoted σX² or Var[X]. It is calculated for a discrete random variable as
σX² = ∑_{all i} (xi − μX)² pX(xi) (4.43)
and for a continuous random variable as
σX² = ∫_{all x} (x − μX)² fX(x) dx (4.44)
This is the moment of inertia of the PDF (or PMF) about the mean.
The square root of the variance is known as the standard deviation, and is denoted σX. It is often
preferred to the variance when reporting a description of a random variable, because it has the same
units as the random variable itself (unlike the variance, whose units are the original units squared).
Means and variances are special cases of the expectation operation. The expectation of g(X) is
defined for discrete random variables as
E[g(X)] = ∑_{all i} g(xi) pX(xi) (4.45)
and for continuous random variables as
E[g(X)] = ∫_{all x} g(x) fX(x) dx (4.46)
The mean value is the special case of expectation where g(X) = X, and the variance is the special case
where g ( X ) = ( X − μ X )2 . These are special cases of what are called moments of random variables, but
we will restrict the discussion here to those two special cases.
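The expectation operation is straightforward to compute for a discrete random variable; the PMF below is hypothetical:

```python
# Mean and variance as special cases of the expectation operation
# (equations 4.41, 4.43 and 4.45), for a hypothetical PMF
pmf = {0: 0.5, 1: 0.3, 2: 0.15, 3: 0.05}

def expectation(g):
    """E[g(X)] = sum of g(x_i) * p_X(x_i) over all i (equation 4.45)."""
    return sum(g(x) * p for x, p in pmf.items())

mean = expectation(lambda x: x)                    # g(X) = X
variance = expectation(lambda x: (x - mean) ** 2)  # g(X) = (X - mean)^2

print(round(mean, 2))      # 0.75
print(round(variance, 4))  # 0.7875
```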
Finally, note that the normal random variable described above uses the mean and standard
deviation explicitly as parameters in its PDF. So given knowledge of the mean and standard deviation
of a normal random variable, one knows its complete distribution. This is not the case for random
variables in general, but it is one of the reasons why the normal random variable is convenient to work
with.
Section 5 Further study
Below is a list of important papers and summary books that would be valuable for those interested in
further study. References are grouped by type, and followed by a short description of their value.
McGuire, R. K. (2004). Seismic Hazard and Risk Analysis, Earthquake Engineering Research Institute,
Berkeley.
Reiter, L. (1990). Earthquake hazard analysis: issues and insights, Columbia University Press, New
York.
Another book describing PSHA. The author was an employee of the Nuclear Regulatory
Commission, so his insights are particularly focused on nuclear applications. This book also
makes the greatest effort to compare deterministic and probabilistic seismic hazard analysis
methods.
Abrahamson, N. A. (2006). "Seismic hazard assessment: problems with current practice and future
developments." First European Conference on Earthquake Engineering and Seismology, Geneva,
Switzerland, 17p. http://www.ecees.org/keynotes.html
References
Abrahamson, N. A., and Silva, W. J. (1997). "Empirical response spectral attenuation relations for
shallow crustal earthquakes." Seismological Research Letters, 68(1), 94-126.
Anagnos, T., and Kiremidjian, A. S. (1984). "Stochastic time-predictable model for earthquake
occurrences." Bulletin of the Seismological Society of America, 74(6), 2593-2611.
Ang, A. H.-S., and Tang, W. H. (2007). Probability concepts in engineering emphasis on applications
in civil & environmental engineering, Wiley, New York.
Baker, J. W., and Cornell, C. A. (2006). "Correlation of response spectral values for multi-component
ground motions." Bulletin of the Seismological Society of America, 96(1), 215-227.
Baker, J. W., and Jayaram, N. (2008). "Correlation of spectral acceleration values from NGA ground
motion models." Earthquake Spectra, 24(1), 299-317.
Bazzurro, P., and Cornell, C. A. (1999). "Disaggregation of Seismic Hazard." Bulletin of the
Seismological Society of America, 89(2), 501-520.
Benjamin, J. R., and Cornell, C. A. (1970). Probability, Statistics, and Decision for Civil Engineers,
McGraw-Hill, New York.
Campbell, K. W., and Bozorgnia, Y. (2008). "NGA Ground Motion Model for the Geometric Mean
Horizontal Component of PGA, PGV, PGD and 5% Damped Linear Elastic Response Spectra for
Periods Ranging from 0.01 to 10 s." Earthquake Spectra, 24(1), 139-171.
Cornell, C. A., Banon, H., and Shakal, A. F. (1979). "Seismic motion and response prediction
alternatives." Earthquake Engineering & Structural Dynamics, 7(4), 295-315.
Cornell, C. A., and Winterstein, S. R. (1988). "Temporal and magnitude dependence in earthquake
recurrence models." Bulletin of the Seismological Society of America, 78(4), 1522-1537.
Gutenberg, B., and Richter, C. F. (1944). "Frequency of earthquakes in California." Bulletin of the
Seismological Society of America, 34(4), 185-188.
Jayaram, N., and Baker, J. W. (2007). "Statistical Tests of the Joint Distribution of Spectral
Acceleration Values." Bulletin of the Seismological Society of America, (in press).
Lomnitz-Adler, J., and Lomnitz, C. (1979). "A modified form of the Gutenberg-Richter magnitude-
frequency relation." Bulletin of the Seismological Society of America, 69(4), 1209-1214.
McGuire, R. K. (1995). "Probabilistic Seismic Hazard Analysis and Design Earthquakes: Closing the
Loop." Bulletin of the Seismological Society of America, 85(5), 1275-1284.
Nuclear Regulatory Commission. (1997). Identification and Characterization of Seismic Sources and
Determination of Safe Shutdown Earthquake Ground Motion, Regulatory Guide 1.165.
Reiter, L. (1990). Earthquake hazard analysis: Issues and insights, Columbia University Press, New
York.
Ross, S. M. (2004). Introduction to probability and statistics for engineers and scientists, 3rd Ed.,
Elsevier Academic Press, Amsterdam; Boston.
Schwartz, D. P., and Coppersmith, K. J. (1984). "Fault behavior and characteristic earthquakes:
Examples from the Wasatch and San Andreas fault zones." Journal of Geophysical Research,
89(B7), 5681-5698.
Weichert, D. H. (1980). "Estimation of the earthquake recurrence parameters for unequal observation
periods for different magnitudes." Bulletin of the Seismological Society of America, 70(4), 1337-
1346.
Youngs, R. R., and Coppersmith, K. J. (1985). "Implications of fault slip rates and earthquake
recurrence models to probabilistic seismic hazard analysis." Bulletin of the Seismological Society
of America, 75(4), 939-964.