Analyzing Seasonal To Interannual Extreme Weather and Climate Variability With The Extremes Toolkit

This document discusses analyzing seasonal and interannual extreme weather and climate variability using extreme value analysis. It provides background on extreme value theory and distributions. The document then summarizes an extreme temperature application and provides analysis of temperature data using the R package extRemes.


P2.15
ANALYZING SEASONAL TO INTERANNUAL EXTREME WEATHER AND
CLIMATE VARIABILITY WITH THE EXTREMES TOOLKIT

Eric Gilleland∗ and Richard W. Katz

Research Applications Laboratory, National Center for Atmospheric Research

1 INTRODUCTION

Statistical analyses in weather and climate variability studies have often been concerned with averages of a random variable, such as mean precipitation or temperature. However, the extremes of random variables are important to consider, and have become increasingly studied in recent years (e.g., Wettstein and Mearns (2002) (hereafter, WM), Brown and Katz (1995), Zwiers and Kharin (1998), Kharin and Zwiers (2000, 2005), Jagger et al. (2001), Ekström et al. (2005) and Fowler et al. (2005)). When studying changes in the average of a distribution, the Central Limit Theorem (CLT) indicates that the averages are asymptotically normally distributed, so normality is an appropriate assumption for modeling and inference. For extremes, there is a theorem analogous to the CLT, called the Extremal Types Theorem, which gives asymptotic justification for assuming that extreme data (e.g., maxima or minima) follow one of three types of distributions: Gumbel, Fréchet or Weibull (see, e.g., Beirlant et al. (2004), Coles (2001), Embrechts et al. (1997), Reiss and Thomas (2001) and Leadbetter et al. (1983)). Furthermore, these three distributions can be written in a single expression as a family of distributions referred to as the generalised extreme value (GEV) distribution.

Figure 1 shows an example in which means and maxima were simulated from 1,000 samples, each of size 1,000, drawn from a standard normal distribution. The resulting histograms for the means and maxima of these samples are displayed along with the best fit normal and GEV distributions. In each case, either distribution appears to be reasonable, but the normal distribution (solid line) clearly provides a better fit for the means and the GEV (dashed line) a better fit for the maxima, as the theory suggests. Figure 2 is similar to Figure 1, but the simulations are from a uniform distribution on the range -1 to 1. It is much easier to see that the GEV distribution is a better fit in the case of maxima from the uniform distribution, though for the means the two are still similar to each other, with the normal capturing the overall shape slightly better than the GEV.

Figure 1: One thousand random samples each of size 1,000 simulated from a normal distribution with mean zero and unit standard deviation. The histograms shown here are for the means (left) and maxima (right) for each of these samples. The solid lines show the best fit normal pdf and the dashed lines show the best fit GEV (using maximum-likelihood estimation).

Figure 2: One thousand random samples each of size 1,000 simulated from a uniform distribution over the range -1 to 1. The histograms shown here are for the means (left) and maxima (right) for each of these samples. The solid lines show the best fit normal pdf and the dashed lines show the best fit GEV (using maximum-likelihood estimation).

In this paper, it will be demonstrated how extreme-value statistical analysis can be employed for studying extreme weather and climate variability, incorporating seasonality and other covariates. Although several software packages exist for performing such analyses (Stephenson and Gilleland (2005)), the R (R Development Core Team (2004)) package extRemes (Gilleland and Katz (2005)) is used here because it is open source (as is R) and particularly well suited for weather and climate applications, owing to its extensive tutorial aimed at such applications and its ability to incorporate covariate information into parameter estimates.

The present paper analyzes the data of WM, who also employ extreme-value analysis and relate the behavior of the extreme events to the mean and standard deviation of the quantities of interest. It is demonstrated here how such relations can be formalized directly in the extreme-value analysis using extRemes.

First, some background for extreme-value statistics is given in Section 2. Section 3 summarizes the extreme temperature application analyzed in WM. Section 4 provides analysis of these data. Some discussion is presented in Section 5, and finally, some tutorial instruction for using extRemes to perform the analysis described in Section 4 is given in the appendix.

∗ Corresponding author address: Eric Gilleland, National Center for Atmospheric Research (NCAR), Research Applications Laboratory, 3450 Mitchell Ln, Boulder, CO 80301; email: ericg@[Link]

2 BRIEF BACKGROUND FOR EXTREME-VALUE ANALYSIS

Because much has been written about extreme-value analysis, this section will be brief. For further reading on the subject, Coles (2001) is a good introductory text that is heavy on application but still gives some theoretical development, and Beirlant et al. (2004) give a more thorough discussion. Stephenson and Tawn (2004) give a short, but particularly insightful, introduction. For a more in-depth theoretical discussion, see Embrechts et al. (1997) and Leadbetter et al. (1983). For a basic introduction, see Gilleland and Katz (2005). Smith (2002) and Katz et al. (2002) give a more terse applied introduction with enough detail to satisfy a novice to extreme-value analysis.

There are two primary methods for analyzing extreme values statistically. The first is to fit data to a model using traditional statistical techniques, and then look at the extreme quantiles, typically by simulating from the model (see, e.g., Gilleland and Nychka (2005) and Chandler (2005)); the second method is to fit data to an extreme-value distribution. The former method will not be discussed further here. The latter method is carried out by two alternative approaches: block maxima and peaks over threshold (POT). The distributional theory is equivalent for either approach, though the distributional forms may, at first glance, appear to be different. The extreme-value distributions for each of these two approaches are introduced in Section 2.1, and return levels (quantiles) are discussed in Section 2.2, followed by an introduction to parameter estimation in Section 2.3. As is usual in the literature, only the upper tail extremes (e.g., maxima) are discussed because the lower tail extremes can be handled by taking the negative transformation and simply applying the same techniques as for maxima. Finally, a brief discussion is given on extending these univariate analyses to a spatial setting in Section 2.4.

2.1 Extreme-Value Distributions

When data are taken to be the maxima (or minima) over certain blocks of time (such as annual maximum precipitation or monthly maximum/minimum temperature), then it is appropriate to use the GEV distribution (1).

$$G(z; \mu, \sigma, \xi) = \exp\left[-\left\{1 + \xi\,\frac{z - \mu}{\sigma}\right\}_{+}^{-1/\xi}\right], \tag{1}$$

where −∞ < µ < ∞, σ > 0 and −∞ < ξ < ∞ are the location, scale and shape parameters respectively, and x+ = max(x, 0). The three extremal types are determined by the sign of ξ: the Weibull distribution is obtained for ξ < 0, the Gumbel distribution in the limit as ξ → 0, and the Fréchet distribution for ξ > 0. Each of the three types of distributions has a distinct form of behavior in the tails. The Weibull is bounded above, meaning that there is a finite value which the maximum cannot exceed. The Gumbel distribution yields a light tail, meaning that although the maximum can take on infinitely high values, the probability of obtaining such levels becomes small exponentially. The Fréchet, a heavy tailed distribution, decays polynomially, so that higher values of the maximum are obtained with greater probability than would be the case with a lighter tail (see, e.g., Figure 3).

Note that some literature uses κ = −ξ in (1). With such a parameterisation, positive and negative κ would yield the Weibull and Fréchet respectively.

The approach leading to distribution (1) assumes data are maxima from blocks, say of time. Arguably, for some problems, taking maxima from large blocks (e.g., annual maximum precipitation) discards too much data. The POT approach allows for more data to inform the analysis, but also increases the complexity of the problem.
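The role of the shape parameter in (1) can be illustrated with a few lines of base R. The following is only a minimal sketch: the function name pgev and the parameter values are chosen here for illustration and are not part of extRemes. It evaluates the GEV distribution function, using the Gumbel limit when ξ is numerically zero, and compares the probability of exceeding the same level under the three extremal types.

```r
# GEV distribution function, Eq. (1); xi is a single number.
# pgev is a hypothetical helper name used only for this illustration.
pgev <- function(z, mu = 0, sigma = 1, xi = 0) {
  stopifnot(sigma > 0)
  s <- (z - mu) / sigma
  if (abs(xi) < 1e-8) {
    return(exp(-exp(-s)))          # Gumbel limit as xi -> 0
  }
  y <- pmax(1 + xi * s, 0)         # the {.}_+ operation in Eq. (1)
  exp(-y^(-1 / xi))
}

# Probability of exceeding z = 5 for mu = 0, sigma = 1 under
# Weibull (xi < 0), Gumbel (xi = 0) and Frechet (xi > 0) types.
shapes <- c(weibull = -0.2, gumbel = 0, frechet = 0.2)
tail_prob <- sapply(shapes, function(xi) 1 - pgev(5, xi = xi))
print(tail_prob)
```

With ξ = −0.2 the upper end point of the distribution is µ − σ/ξ = 5, so the exceedance probability at that level is exactly zero, whereas the Fréchet case assigns the largest probability to the same level.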
Figure 3: Example histograms of data simulated from a GEV distribution with ξ < 0 (left), ξ = 0 (center) and ξ > 0 (right).

For the POT approach, a threshold is first determined, and data above that threshold are fit to the generalised Pareto distribution (GPD).

$$G(x; \tilde{\sigma}, \xi, u) = 1 - \left[1 + \frac{\xi (x - u)}{\tilde{\sigma}}\right]^{-1/\xi}, \tag{2}$$

where x − u > 0, 1 + ξ(x − u)/σ̃ > 0 and σ̃ = σ + ξ(u − µ). The GPD (2) describes the distribution of a random variable given that it already exceeds a high threshold, say u; that is, (2) is Pr[X ≤ x | X > u], so the probability of exceeding a high value x given that the threshold is exceeded is Pr[X > x | X > u] = 1 − G(x; σ̃, ξ, u). Of course, the theoretical results that show the GPD to be the asymptotic distribution appropriate for exceedances over a high threshold require these exceedances to be independent and identically distributed random variables.

Choice of threshold is critical to any POT analysis. Too high a threshold could discard too much data, leading to high variance of the estimate, but too low a threshold can lead to bias because (2) is an asymptotic result requiring a high threshold. Practitioners often use graphical tools in determining an appropriate threshold. One of the more popular approaches is to fit the GPD using a range of thresholds, and then graph the parameter estimates along with their variability; an appropriate threshold is the lowest possible choice such that any higher threshold would result in similar estimates. It should be noted that the distribution (2) is equivalent to (1) under an appropriate transformation (see, e.g., Katz et al. (2002) and Coles (2001)).

Apart from threshold selection, an important assumption for the GPD is that the threshold exceedances are independent. Such an assumption is often unreasonable for weather and climate data because high values of meteorological and climatological quantities are often succeeded by high quantities (e.g., a high temperature day is likely to be followed by another high temperature day). One measure of dependency that is frequently used is the extremal index, θ (e.g., Ferro and Segers (2003), Coles (2001)). The case of complete independence will yield a value of θ = 1, but it is also possible to have dependent data where θ = 1.

An approach frequently employed to handle such dependency is to decluster the data by identifying clusters and utilizing only a summary of each cluster (e.g., the cluster maximum). Several such methods have been devised for determining clusters (see, e.g., Ferro and Segers (2003) and the references therein), and one of the simplest and most widely used is runs declustering. With runs declustering, a new cluster is formed once the value of the quantity of interest exceeds the threshold after having fallen below the threshold for a certain length of time, called the run length (denoted here by r).

There is also a point-process characterisation for the POT approach that allows for simultaneous fitting of the rate at which values exceed the threshold and the intensity of the exceedances (see, e.g., Coles (2001) Chapter 7 and Smith (2002)). Again, the same assumptions of a high threshold, common distribution, and independence are necessary for the model to be theoretically valid. Furthermore, the point-process model can be shown to be equivalent through appropriate transformations (see, e.g., Katz et al. (2002) and Coles (2001) Section 7.4) to the GEV (1) and GPD (2), and unlike the GPD explicitly includes the location parameter, µ.

2.2 Return Levels (Quantiles)

Typically, when considering extreme values of a random variable, one is interested in the return level of an extreme event, defined as the value, z_p, such that there is a probability of p that z_p is exceeded in any given year, or alternatively, the level that is expected to be exceeded on average once every 1/p years (1/p is often referred to as the return period). For example, if the 100-year return level for precipitation at a given location is found to be 1.5 cm, then the probability of precipitation exceeding 1.5 cm in any given year is 1/100 = 0.01.

The return level is derived from the distribution (either (1) or (2)) by setting the cumulative distribution function equal to the desired probability/quantile, 1 − p, and then solving for the return level. For example, for the GEV distribution given in (1), the return level, z_p, is given by the following equation.

$$z_p = \begin{cases} \mu - \dfrac{\sigma}{\xi}\left[1 - \{-\log(1 - p)\}^{-\xi}\right], & \text{for } \xi \neq 0, \\[1ex] \mu - \sigma \log\{-\log(1 - p)\}, & \text{for } \xi = 0. \end{cases}$$
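As a check on the return-level formula, the sketch below computes z_p in base R and confirms that substituting it back into the GEV distribution function (1) returns 1 − p. The function names gev_return_level and pgev, and the parameter values (µ = 10, σ = 2, ξ = −0.1), are assumptions made purely for illustration; they are not estimates from the data analyzed in this paper.

```r
# GEV distribution function, Eq. (1) (hypothetical helper for the check).
pgev <- function(z, mu, sigma, xi) {
  s <- (z - mu) / sigma
  if (abs(xi) < 1e-8) return(exp(-exp(-s)))
  exp(-pmax(1 + xi * s, 0)^(-1 / xi))
}

# Return level z_p: the level exceeded with probability p in any one block.
gev_return_level <- function(p, mu, sigma, xi) {
  stopifnot(p > 0, p < 1, sigma > 0)
  yp <- -log(1 - p)
  if (abs(xi) < 1e-8) {
    mu - sigma * log(yp)                    # Gumbel case, xi = 0
  } else {
    mu - (sigma / xi) * (1 - yp^(-xi))      # xi != 0
  }
}

periods <- c(10, 50, 100)                   # return periods 1/p
rl <- gev_return_level(1 / periods, mu = 10, sigma = 2, xi = -0.1)

# Round trip: G(z_p) should equal 1 - p for every return period.
check <- pgev(rl, mu = 10, sigma = 2, xi = -0.1)
print(cbind(period = periods, level = rl, cdf = check))
```

Because ξ < 0 in this illustration, the return levels increase with the return period but remain below the upper end point µ − σ/ξ = 30.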
2.3 Estimation

There are different methods available for performing parameter estimation, including Method of Moments Estimation (MME), Probability Weighted Moments (PWM) or equivalently L-Moments (LM), Maximum Likelihood Estimation (MLE), and Bayesian methods. For smaller sample sizes (n < 50), the MLE is unstable and can give unrealistic estimates for the shape parameter (e.g., Hosking and Wallis (1997), Coles and Dixon (1999), and Martins and Stedinger (2000, 2001)). Madsen et al. (1997) argue that the MME quantile estimators have smaller root mean square error when the true value of the shape parameter is within a narrow range around zero. For weather and climate applications, enough data are typically available to expect that MLE would be comparable in performance, especially when blocks smaller than years are used. Additionally, MLE allows one to easily incorporate covariate information into parameter estimates. Furthermore, it is more straightforward to obtain error bounds for parameter estimates with MLE compared with most alternative methods. Although work on Bayesian estimation with respect to extreme-value analysis has been sparse in the literature, good examples are available (see, e.g., Stephenson and Tawn (2004) and the references therein, Coles (2001, Section 9.1), and Cooley et al. (2005a, 2005b)).

Obviously, one will never select the Gumbel when fitting data to a GEV because the Gumbel is reduced to a single point in a continuous parameter space. A common approach is to perform an initial hypothesis test to determine which of the three extremal types (e.g., the Gumbel) is appropriate, and then fit data only to that type. However, this approach does not account for the uncertainty of the choice of extremal type on the subsequent inference, which can be quite large. Stephenson and Tawn (2004) suggest a Bayesian approach to estimating these parameters that allows for the Gumbel to be achieved with positive probability, though results can be highly sensitive to the choice of prior distributions.

2.4 Spatial Extensions

So far, only univariate data have been considered. In fact, incorporating spatial structure in the analysis of extremes is an area of active research in statistics (e.g., Schlather and Tawn (2002, 2003), Heffernan and Tawn (2004), Cooley et al. (2005a), Gilleland and Nychka (2005), Gilleland et al. (2006)). Work is currently underway to add spatial tools to extRemes, and the beta version already has the capability of fitting data to a GPD at several sites, and smoothing parameter estimates by way of a thin plate spline. For simplicity, the focus here is only on univariate data.

3 EXTREME TEMPERATURE DATA IN THE NORTHEAST UNITED STATES AND CANADA

Data to be analyzed here are a subset from the study carried out in WM; here a summary description is given, but for more detail please refer to WM and the R help files for SEPTsp and PORTw included with the package extRemes. Here the focus centers on data from the two locations given special attention in WM: Port Jervis, New York and Sept-Iles, Québec. The Port Jervis data cover the winters from 1927 through 1995, and the Sept-Iles data cover the spring seasons 1945 through 1995, consisting of 68 and 51 monthly minima (i.e., block minima) derived from daily data respectively. The subsets are chosen not so much to compare findings with WM, but rather to demonstrate the statistical techniques available in using extRemes for performing such an analysis. Nevertheless, these two locations were treated with special attention in WM largely because results tended to be more significant in these two areas, and they therefore make for a reasonable pedagogical example.

Each dataset contains measurements of monthly minimum (Figure 4) and maximum temperatures. Covariate information is also available, including: the associated Arctic Oscillation (AO) index, mean daily minima (maxima) over the one-month period, and standard deviations of daily minima (maxima) for each period.

Figure 4: Minimum temperatures (degrees Celsius) at Port Jervis, New York (top) and Sept-Iles, Québec (bottom).
4 ANALYSIS OF SEASONAL EXTREME TEMPERATURE

In WM, it was found that for Sept-Iles in the spring, lower mean minimum temperatures are coupled with significant increases in the standard deviation of daily minimum temperature, and that these minimum temperature extremes become more severe as the AO index increases. For Port Jervis in the winter, it was also found that increasing temperatures are associated with increases in the AO index. Here, these features are examined by modeling the extreme temperature data without any covariates, and then making comparisons with more complex extreme-value models that incorporate covariate information such as the AO index.

To fit the monthly minimum temperature to a GEV distribution, the usual methods for maxima apply by realising that min{X1, . . . , Xn} = − max{−X1, . . . , −Xn}. That is, the negative transformation of the data (Y1 = −X1, . . . , Yn = −Xn) is fit to the GEV distribution. Here, results are presented in terms of the untransformed minima except where noted.

Table 1: GEV parameter estimates from fitting monthly minimum temperatures at (a) Sept-Iles, Québec (spring) and (b) Port Jervis, New York (winter).

       Parameter        Estimate    Std. Error
(a)    Location (µ)       26.822       0.655
       Scale (σ)           4.030       0.483
       Shape (ξ)          -0.096       0.127
       Negative log-likelihood: 149.12
(b)    Location (µ)      -20.770       0.455
       Scale (σ)           3.346       0.324
       Shape (ξ)          -0.264       0.092
       Negative log-likelihood: 179.56

Maximum-likelihood fitted parameter values and other information are summarized in Table 1 (a). In WM, it is argued that the Gumbel (ξ = 0) distribution is reasonable, and that fitting the data to the other two possible extremal types would likely not be better. Indeed, the estimate found here from fitting all three extremal types simultaneously yields a value of ξ that is close to zero (−0.096), lying within one standard error of zero. However, as noted in Section 2.3, fitting only to the Gumbel ignores the uncertainty associated with the choice of extremal types. For the monthly minimum temperature data in the winter at Port Jervis, New York, the estimated shape parameter is much farther (about three standard errors) from zero at about −0.26 (Table 1 (b)), making a much weaker case for constraining the fit a priori to the Gumbel. The 100-year return level for minimum temperature at this location is about −29.68 with 95% confidence bounds (profile-likelihood) of about (−33.333, −28.245), whereas the 100-year return level assuming the Gumbel is estimated to be approximately −35.23, which is beyond the lower limit of the 95% confidence bounds for the bounded-tail Weibull. Although this difference for such a long return level is not considerable, the difference could have been much greater had the estimate for ξ been positive instead (i.e., the heavier tailed Fréchet distribution).

Figure 5: Diagnostic plots from fitting the minimum monthly temperature data at Sept-Iles, Québec to a GEV distribution. Quantile and return-level graphs are for the negative transformed minima. From upper left to lower right: probability, quantile, return level, and histogram with fitted GEV density.

An initial glance at the probability and quantile graphs of Figure 5 suggests that the underlying assumptions for the GEV distribution are reasonable for these data. The return level plot is shown in the lower left corner along with point-wise 95% confidence bounds estimated by the delta method (the default). The delta method assumes that the sampling distributions of the parameter estimates are symmetric, which is typically not the case for the shape parameter or extreme return levels. For example, Figure 6 shows the profile likelihoods for the 100-year return level (estimated to be about 11.83 degrees Celsius with 95% confidence interval of about (-2.566, 15.620)) and the shape parameter. In each case, there is clearly asymmetry, especially for the 100-year return level.
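The negative-transformation fit described in this section can also be sketched from the command line using base R only. The example below is a minimal illustration under stated assumptions: the series is simulated rather than taken from the Sept-Iles or Port Jervis records, the helper names rgev and gev_nll and all parameter values are hypothetical, and optim with its default Nelder-Mead method stands in for whichever optimizer a given package uses. Standard errors are taken from the inverse of the numerically estimated Hessian of the negative log-likelihood.

```r
set.seed(42)

# Simulate GEV values by inverting Eq. (1); valid for xi != 0.
rgev <- function(n, mu, sigma, xi) {
  mu + sigma * ((-log(runif(n)))^(-xi) - 1) / xi
}

# GEV negative log-likelihood; theta = c(mu, sigma, xi).
gev_nll <- function(theta, z) {
  mu <- theta[1]; sigma <- theta[2]; xi <- theta[3]
  if (sigma <= 0) return(1e10)              # outside the parameter space
  s <- (z - mu) / sigma
  if (abs(xi) < 1e-8) {                     # Gumbel limit
    return(length(z) * log(sigma) + sum(s) + sum(exp(-s)))
  }
  y <- 1 + xi * s
  if (any(y <= 0)) return(1e10)             # data outside the support
  length(z) * log(sigma) + (1 + 1 / xi) * sum(log(y)) + sum(y^(-1 / xi))
}

# Hypothetical block minima: the negative of simulated GEV maxima.
x_min <- -rgev(2000, mu = 20, sigma = 4, xi = -0.2)

# Negative transformation, then fit as for maxima.
y <- -x_min
sigma0 <- sqrt(6 * var(y)) / pi             # moment-based starting values
start <- c(mean(y) - 0.5772 * sigma0, sigma0, 0.1)
fit <- optim(start, gev_nll, z = y, hessian = TRUE)

est <- fit$par                              # (mu, sigma, xi) for -X
se <- sqrt(diag(solve(fit$hessian)))        # approximate standard errors
print(rbind(estimate = est, std.error = se))
```

The fitted location refers to the negated series, so the corresponding location for the untransformed minima is −est[1]; the scale and shape estimates are unchanged by the transformation.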
Figure 6: Profile likelihoods for the negative transformed 100-year return level (top) and shape parameter (bottom) from having fit monthly minimum temperature at Sept-Iles, Québec (spring) to the GEV distribution.

Table 2: Estimated return levels and 95% confidence intervals for several return periods from having fit monthly minimum temperatures (degrees Celsius) at Sept-Iles, Québec (spring) to the GEV distribution.

Return period    Return level    Lower bound    Upper bound
       5             21.19          19.142         22.836
      10             18.66          15.227         20.568
      15             17.32          12.654         19.437
      25             15.72           9.100         18.188
      50             13.70           3.666         16.769
      75             12.59           0.124         16.067
     100             11.83          -2.566         15.620
     110             11.58          -3.492         15.481
     125             11.26          -4.762         15.300
     150             10.80          -6.629         15.055
     200             10.09          -9.716         14.696
     500              7.95         -20.809         13.751
    1000              6.46         -30.647         13.204

Figure 7: Return-level graph of negative transformed monthly minimum temperature (degrees Celsius) calculated from associated GEV distribution (solid line) with 95% confidence intervals calculated from the delta and profile-likelihood methods respectively for Sept-Iles, Québec (spring).

It can be seen from both Table 2 and Figure 7 that the return levels for minimum temperature gradually decrease for higher and higher return periods. Clearly, at least for these data, the delta method bounds favor lower values of return levels whereas for return periods beyond about 10 years, the profile-likelihood has a much tighter lower bound, and slightly higher upper bound. Beyond about 100 years, both bounds are very wide, reflecting the inherent uncertainty associated with making inferences far beyond the range of the data; but the profile-likelihood method gives a more accurate picture for such longer return periods because it accounts for the skewness in the parameter distributions.

Inclusion of the AO index as a covariate in the location parameter (µ) of the GEV yields a significant (at the 5% level) improvement over the fit without AO index (likelihood ratio test statistic is about 12, which is much greater than the $\chi^2_{1,\,1-0.05}$ critical value of about 4). Specifically, the model obtained is summarized in Table 3, where the location parameter is modeled as a linear regression of the following form.

$$\mu(x) = \mu_0 + \mu_1 x, \tag{3}$$

where x is the AO index.

Note that as the AO index increases, the values of the location parameter become more and more negative, indicating that the minimum temperature extremes
become more severe as the AO index increases. This result is consistent with the findings of WM. It might also be worth noting that the fitted distribution with AO index as a covariate now has a more strongly negative shape parameter (≈ −0.4) that is about three standard errors away from zero (Gumbel case).

Table 3: GEV parameter estimates from fitting monthly minimum temperature (degrees Celsius) recorded at Sept-Iles, Québec (spring) with AO index incorporated as a covariate in the location parameter as in Eq. (3).

Parameter         Estimate    Std. Error
Location (µ0)      -23.42       0.668
Location (µ1)       -2.09       0.686
Scale (σ)            4.19       0.502
Shape (ξ)           -0.368      0.117
Negative log-likelihood: 143.04

It is also found in WM that this intensification of the extreme temperatures is coupled with increases in the standard deviation of daily minimum temperature for this location in spring. Performing a similar fit with a covariate in the location parameter as in Eq. (3), but with x the standard deviation of daily minimum temperatures, also results in a significant (5% level) improvement over the no-covariate model. Furthermore, adding the standard deviation to the model with AO index (i.e., µ(x) = µ0 + µ1 x1 + µ2 x2, where x1 is the AO index and x2 is the standard deviation) results in a significant improvement over the model with just AO index (likelihood ratio test statistic of about 62). On the other hand, the addition of AO index to the model with only the standard deviation as a covariate is not found to be significant. This result, however, should not be surprising because the standard deviation of daily minimum temperatures is derived from the same data as the block maxima. Additionally, it should be expected that data with greater variability should also have more intense extremes. For these reasons, using such a covariate in the model is misleading. Nevertheless, the AO index is not derived from the same data as the dependent variable (i.e., monthly minimum temperature), and the model that includes it is certainly an improvement over the model without a covariate.

It is also possible to incorporate covariates into the other parameters. In the case of the scale parameter, it is important to ensure that σ > 0 for all possible covariate values. This is easily attained by using the log link function instead of the identity, these being the two choices provided by extRemes. Results from trying such fits were not found to be as significant as for the location parameter for the Sept-Iles data. Typically, the difficulty in estimating the shape parameter (with wildly differing tail behavior for the three types of distributions) is enough to deter researchers from utilizing covariates in conjunction with this parameter, although such analysis is allowed with extRemes.

In WM, it is found that minimum temperature extremes at Port Jervis during winter are not significantly influenced by the AO index; and this is corroborated when performing an analysis analogous to that for the Sept-Iles data.

5 DISCUSSION

Results found from analyzing two subsets of data from WM show agreement with the results in WM, but using a slightly more sophisticated analysis. A primary intent of the paper is to demonstrate how easily such an analysis can be performed using extRemes. It should be noted that extRemes has much more capability than is detailed here (see Gilleland and Katz (2005) for a more thorough description of the capabilities of extRemes). The inclusion of spatial tools is a much needed addition for a software package intended for weather and climate applications, and such tools will be available in the near future.

Naturally, a GUI-based software package will have limitations as far as analyzing cutting-edge research problems. The intention is for extRemes to serve as an aid in shortening the learning curve associated with using a possibly very new methodology.

APPENDIX

The Extremes Toolkit: Weather and Climate Applications of Extreme Value Statistics

The extremes toolkit (extRemes) was developed at NCAR to assist scientists who are not familiar with extreme value statistical techniques, especially scientists interested in trends in weather and climate extremes or in societal and ecological impacts of severe weather and climate change. The package is written in the open source¹ statistical computing language called R (R Development Core Team (2004)), but can be used without any knowledge of R. extRemes provides a graphical user interface (GUI) to another R package called ismev, itself an R port of the S-Plus package written by Stuart Coles (Coles (2001)). However, extRemes does provide some additional functionality (Stephenson and Gilleland (2005)). This section is only a short tutorial on using extRemes to perform the analyses described herein, but see Gilleland and Katz (2005) for a full tutorial on the package.

¹ All of the packages available from the R-CRAN website are also open source, including extRemes and ismev.
It is assumed here that extRemes (version Data Transformations
1.51 or greater) is already installed and loaded
into an R session (see the toolkit’s home page
at [Link]
for installation and loading instructions). To load Several data transformations can be made using
the datasets used here into extRemes one sim- extRemes. Although most are easy to compute from
ply needs to click on File followed by Read the command line, it is preferable to use the toolkit
Data, and then search for the files (one at a dialog in order that the resulting transformations are
time) PORTw.R and SEPTsp.R (if these placed where subsequent dialogs can include them.
files are not found within the extRemes data di-
rectory, they can be obtained from the web at To perform the negative transformation for the Sept-
[Link] Iles springtime minimum temperature data, choose
After double-clicking (or selecting and clicking Open) Negative from the Transform Data menu under
these files, a new window will appear. Select R File. Select SeptIlesSpring from the Data Ob-
source and enter a name in the Save As field (e.g., ject listbox, and then select the variable TMN0 from
the names PortJervisWinter and SeptIlesSpring the Variables to Transform field, and finally click
are used here), then click OK. Summary information OK. A message is displayed in the R console inform-
on the data should appear in the R console window, ing you that a new column has been added to the data
and the working directory is saved to memory. with the heading [Link].

Graphical Tools

It is always a good idea to graph data before analyzing it, and this can be readily performed from the toolkit dialogs. For example, to create the top graph in Figure 4, simply select Plot and then Scatter Plot, then make the selections as shown below.

Other graphs, such as AO index against monthly mean daily minima (maxima), can also be easily drawn using the extRemes GUI dialogs.2

2 For more advanced graphs, a good start might be an inspection of the code executed by the GUI windows found in the [Link] file located in the path of the current R working directory (use the R command getwd() to find this path) and the R help files for the function plot (i.e., use help(plot)) and associated parameters (use help(par)).

Parameter Estimation

After taking the negative transformation, the monthly minima can be fit to a GEV distribution by selecting Generalized Extreme Value (GEV) Distribution under Analyze. Again, select SeptIlesSpring from the Data Object field followed by [Link] in the Response field. Check the Plot diagnostics checkbutton to also obtain the graphs of Figure 5, but do not make any other selections (yet); simply click OK. For the example below, the same steps are taken, but AOindex is selected from the Location parameter (mu) field.

The estimation for Table 3 can be performed in extRemes by making the selections (shown below) from the Fit Generalized Extreme Value Distribution dialog window.
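For readers who want to see what a GEV fit involves outside the dialogs, the following is a rough base-R sketch of maximum-likelihood estimation with the R function optim, once with a constant location parameter and once with the location depending linearly on a covariate. It is not the code run by the toolkit: the function gev_nll, the simulated data, the covariate ao and all starting values are assumptions made for illustration only.

```r
# Negative log-likelihood of the GEV distribution.
# With covar = NULL, par = c(mu, sigma, xi);
# otherwise par = c(mu0, mu1, sigma, xi) and mu = mu0 + mu1 * covar.
gev_nll <- function(par, x, covar = NULL) {
  if (is.null(covar)) {
    mu <- par[1]; sigma <- par[2]; xi <- par[3]
  } else {
    mu <- par[1] + par[2] * covar; sigma <- par[3]; xi <- par[4]
  }
  if (sigma <= 0) return(1e10)            # penalize an invalid scale
  z <- (x - mu) / sigma
  if (abs(xi) < 1e-8) {                   # Gumbel limit as xi -> 0
    return(length(x) * log(sigma) + sum(z) + sum(exp(-z)))
  }
  w <- 1 + xi * z
  if (any(w <= 0)) return(1e10)           # observation outside the support
  length(x) * log(sigma) + (1 + 1 / xi) * sum(log(w)) + sum(w^(-1 / xi))
}

# Simulated (made-up) data: GEV with location 2 + 1.5 * ao, scale 3, shape -0.2.
set.seed(1)
n <- 200
ao <- rnorm(n)
xi_true <- -0.2
x <- (2 + 1.5 * ao) + 3 * ((-log(runif(n)))^(-xi_true) - 1) / xi_true

# Fit without and with the covariate in the location parameter.
fit0 <- optim(c(mean(x), sd(x), 0.1), gev_nll, x = x,
              control = list(maxit = 2000))
fit1 <- optim(c(mean(x), 0, sd(x), 0.1), gev_nll, x = x, covar = ao,
              control = list(maxit = 2000))

# Deviance between the nested fits, compared with a chi-square on 1 df.
dev <- 2 * (fit0$value - fit1$value)
p_value <- pchisq(dev, df = 1, lower.tail = FALSE)
print(list(par0 = fit0$par, par1 = fit1$par, deviance = dev, p = p_value))
```

The deviance computed in the last lines is the quantity that a likelihood-ratio comparison of two nested fits is based on; the default Nelder-Mead method of optim is used here only because it needs no derivatives.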
There are several choices available for the numerical optimization method. See the help file for the R function optim and the references therein for more information. Press et al. (1989) provide Fortran algorithms for all of these methods.

Profile Likelihood

Figure 6 is graphed using extRemes by selecting GEV fit from the Parameter Confidence Intervals under Analyze. In the window that opens, select SeptIlesSpring from the Data Object field followed by gev.fit1 in the Select a fit field. Enter −16 and 5 in the Lower and Upper limit fields for the Return Level Search Range, respectively, and −0.5 to 0.5 for the Shape parameter (xi) Search Range.3 Finally, check the Plot profile likelihoods checkbutton, and then click OK. Note that the parameter estimate along with its confidence interval is reported to the R console. Here, the 100-year return level is estimated to be about 11.83 degrees Celsius (recall that the model is fit to the negative minimum temperatures so that the −11.83 reported actually refers to the negative transformed minima) with a 95% confidence interval of about (-2.566, 15.620).

Although somewhat tedious, it is possible to use extRemes to create a graph similar to that of Figure 5 using results from the profile-likelihood estimation. First, obtain confidence bounds for each of several return levels using the profile-likelihood method as done for the 100-year return level above.4 Next, graph the return levels by selecting Return Level Plot from the Plot menu, and make the selections as shown below.

Adding the profile-likelihood derived confidence bounds to the graph requires use of the command line. First, for convenience, set up a 13 × 3 matrix of the return year and negative transformed values from the second and third columns of Table 2 by using, for example, the following commands (here, the resulting matrix is assigned to the object ci).

    # columns: return period (years), lower bound, upper bound
    ci <- rbind( c(    5, -22.836, -19.142),
                 c(   10, -20.568, -15.227),
                 c(   15, -19.437, -12.654),
                 c(   25, -18.188,  -9.100),
                 c(   50, -16.769,  -3.666),
                 c(   75, -16.067,  -0.124),
                 c(  100, -15.620,   2.566),
                 c(  110, -15.481,   3.492),
                 c(  125, -15.300,   4.762),
                 c(  150, -15.055,   6.629),
                 c(  200, -14.696,   9.716),
                 c(  500, -13.751,  20.809),
                 c( 1000, -13.204,  30.647))

Finally, add the values to the graph using the lines command as follows (coloring (col) and line type (lty) are optional to the user's preference).

    lines( ci[,1], ci[,2], lty=2, col="red")
    lines( ci[,1], ci[,3], lty=2, col="red")

3 Choice of search ranges and number of intervals is a matter of trial and error. Different ranges are possible so long as the range includes the points where the profile likelihood crosses the lower horizontal line (see Coles (2001) Section 2.7 for the meaning of this line).

4 The delta method is a reasonable approximation for the shorter return levels, and therefore these are not estimated via the more time-consuming profile-likelihood method.
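The return levels discussed in this section are quantiles of the fitted GEV distribution. The sketch below assumes the standard GEV quantile formula (see Coles (2001)) and evaluates it for made-up parameter values; mu, sigma and xi here are not the estimates obtained in this paper, and gev_return_level is a hypothetical helper name. It also shows that negating an interval flips the signs of its endpoints and swaps them, which, applied to the 100-year interval quoted above, reproduces the 100-year row of the matrix ci.

```r
# GEV return level for a return period of T blocks: the 1 - 1/T quantile.
gev_return_level <- function(T, mu, sigma, xi) {
  y <- -log(1 - 1 / T)
  if (abs(xi) < 1e-8) {
    mu - sigma * log(y)                   # Gumbel limit as xi -> 0
  } else {
    mu - (sigma / xi) * (1 - y^(-xi))
  }
}

# Made-up parameter values, for illustration only.
periods <- c(5, 10, 25, 50, 100, 500, 1000)
rl <- gev_return_level(periods, mu = 20, sigma = 4, xi = -0.2)
print(cbind(period = periods, return_level = rl))

# Negating an interval (a, b) gives (-b, -a).
interval_quoted <- c(-2.566, 15.620)      # 100-year interval from the text
interval_negated <- rev(-interval_quoted)
print(interval_negated)
```

With a negative shape parameter, as in this illustration, the return levels increase with the return period but stay below the upper end point mu - sigma / xi of the distribution.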
Likelihood-Ratio Test

The likelihood-ratio test is performed with extRemes by selecting Likelihood-ratio test from the Analyze menu. Simply select the data object (in this case SeptIlesSpring), and then choose the model fit associated with the base model (M0) and the more complicated model (M1), being sure that M0 is nested in M1 (extRemes will automatically switch the two if M0 has more parameter estimates than M1), and finally click OK.

ACKNOWLEDGEMENTS

This work is funded by the NCAR Weather and Climate Impacts Assessment Science (WCIAS) Program. NCAR is sponsored by the National Science Foundation (NSF). I would like to thank everyone from the "Statistical Analysis of EXTREMES in GEOPHYSICS" ([Link]) reading group who helped to edit this paper.

REFERENCES

Beirlant, J., Goegebeur, Y., Segers, J., and Teugels, J., 2004: Statistics of Extremes, Wiley, Chichester, England.

Brown, B.G. and Katz, R.W., 1995: Regional analysis of temperature extremes: Spatial analog for climate change?, J. of Climate 8:108–119.

Chandler, R.E., 2005: On the use of generalized linear models for interpreting climate variability, Environmetrics 16(7):699–715.

Coles, S.G., 2001: An Introduction to Statistical Modeling of Extreme Values, Springer-Verlag, London.

Coles, S.G. and Dixon, M.J., 1999: Likelihood-based inference for extreme value models, Extremes 2(1):5–23.

Cooley D., Nychka D., Naveau P., 2005a: Bayesian Spatial Modeling of Extreme Precipitation Return Levels, (submitted).

Cooley D., Naveau P., and Jomelli, V., 2005b: A Bayesian Hierarchical Extreme Value Model for Lichenometry. Environmetrics, (in press).

Ekström, M., Fowler, H.J., Kilsby, C.G., and Jones, P.D., 2005: New Estimates of future changes in extreme rainfall across the UK using regional climate model integrations. 2. Future estimates and use in impact studies. J. Hydrology 300:234–251.

Embrechts, P., Klüppelberg, C., and Mikosch, T., 1997: Modelling Extremal Events, Springer-Verlag, New York.

Ferro, C.A.T. and Segers, J., 2003: Inference for clusters of extreme values, J. R. Stat. Society B 65:545–556.

Fowler, H.J., Ekström, M., Kilsby, C.G., and Jones, P.D., 2005: New Estimates of future changes in extreme rainfall across the UK using regional climate model integrations. 1. Assessment of control climate. J. Hydrology 300:212–233.

Gilleland, E., Nychka, D., and Schneider, U., April 27, 2006: Spatial models for the distribution of extremes, Computational Statistics: Hierarchical Bayes and MCMC Methods in the Environmental Sciences, Edited by J.S. Clark and A. Gelfand. Oxford University Press. (in press)

Gilleland, E. and Katz R.W., 2005: Tutorial for The Extremes Toolkit: Weather and Climate Applications of Extreme Value Statistics, [Link]

Gilleland, E. and Nychka, D., 2005: Statistical models for monitoring and regulating ground-level ozone, Environmetrics 16:535–546.

Heffernan, J.E. and Tawn, J.A., 2004: A conditional approach for multivariate extreme values, J. R. Stat. Society B 66(3):497–530.

Hosking, J.R.M. and Wallis, J.R., 1997: Regional Frequency Analysis: An Approach Based on L-Moments, Cambridge University Press, New York.

Jagger, T., Elsner, J.B., and Xufeng, N., 2001: A dynamic model of hurricane winds in coastal counties of the United States, J. Appl. Meteor. 40(5):853–863.

Katz, R.W., Parlange, M.B., and Naveau, P., 2002: Statistics of extremes in hydrology, Advances in Water Resources, 25:1287–1304.

Kharin, V.V. and Zwiers, F.W., 2005: Estimating extremes in transient climate change simulations, J. Climate 18:1156–1173.

Kharin, V.V. and Zwiers F.W., 2000: Changes in the extremes in an ensemble of transient climate simulations with a coupled atmosphere-ocean GCM, J. Climate 13:3760–3788.

Leadbetter, M.R., Lindgren, G., and Rootzén, H., 1983: Extremes and Related Properties of Random Sequences and Series, Springer-Verlag, New York.

Madsen, H., Rasmussen, P.F., and Rosbjerg, D., 1997: Comparison of annual maximum series and partial duration series methods for modeling extreme hydrologic events, 1, At-site modeling, Water Resour. Res. 33(4):747–758.

Martins, E.S. and Stedinger, J.R., 2000: Generalized maximum-likelihood generalized extreme-value quantile estimators for hydrologic data, Water Resour. Res. 36(3):737–744.

Martins, E.S. and Stedinger, J.R., 2001: Generalized maximum likelihood Pareto-Poisson estimators for partial duration series, Water Resour. Res. 37(10):2551–2557.

Press, W.H., Flannery, B.P., Teukolsky, S.A., and Vetterling, W.T., 1989: Numerical Recipes (Fortran), Cambridge University Press, New York.

Reiss, R.D. and Thomas, M., 2001: Statistical Analysis of Extreme Values from Insurance, Finance, Hydrology and Other Fields, Birkhauser, New York.

R Development Core Team, 2004: R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0, [Link]

Schlather, M. and Tawn, J.A., 2003: A dependence measure for multivariate and spatial extreme values: Properties and inference. Biometrika 90(1):139–156.

Schlather, M. and Tawn, J.A., 2002: Inequalities for the extremal coefficients of multivariate extreme value distributions. Extremes 5(1):87–102.

Smith, R.L., 2002: Statistics of extremes with applications in environment, insurance and finance, [Link]

Stephenson, A. and Gilleland, E., 2005: Software for the analysis of extreme events: the current state and future directions, Extremes (submitted).

Stephenson, A. and Tawn, J.A., 2004: Bayesian inference for extremes: Accounting for the three extremal types, Extremes 7:291–307.

Wettstein, J.J. and Mearns, L.O., December 2002: The influence of the North Atlantic-Arctic Oscillation on mean, variance and extremes of temperature in the northeastern United States and Canada, J. of Climate 15:3586–3600.

Zwiers, F.W. and Kharin, V.V., 1998: Changes in the extremes of the climate simulated by CCC GCM2 under CO2 doubling, J. Climate 11:2200–2222.
