WATER RESOURCES RESEARCH, VOL. 48, W01535, doi:10.1029/2011WR010489, 2012

Continuous rainfall simulation: 1. A regionalized subdaily disaggregation approach
Seth Westra,1 Rajeshwar Mehrotra,2 Ashish Sharma,2 and Ratnasingham Srikanthan3
Received 28 January 2011; revised 5 December 2011; accepted 13 December 2011; published 25 January 2012.
[1] This paper is the first of two in the current issue that presents a framework for
generating continuous (uninterrupted) rainfall sequences at both gaged and ungaged point
locations. The ultimate objective is to present a methodology for stochastically generating
continuous subdaily rainfall sequences at any location such that the statistics at a range of
aggregation scales are preserved. This first paper presents a regionalized nonparametric
daily disaggregation model in which, conditional on a daily rainfall amount and previous-
and next-day wetness states at the location of interest, subdaily fragments are resampled
using continuous records at nearby locations. The second paper then focuses on a
regionalized daily rainfall generation model. To enable the substitution of subdaily rainfall
at nearby locations for subdaily rainfall at the location of interest, it is necessary to identify
locations with “similar” daily to subdaily scaling characteristics. We use a two-sample,
two-dimensional Kolmogorov-Smirnov (K-S) test to identify whether the daily to subdaily
scaling relationships are statistically similar between all possible station pairs sampled from
232 gages located throughout Australia. This step is followed by a logistic regression to
determine the influence of the covariates of latitude, longitude, elevation, and distance to
the coast on the probability that the scaling at any two locations will be similar. The model
is tested at five locations, where recorded subdaily data was available for comparison, and
results indicate good model performance, particularly in preserving the probability
distribution of extremes and the antecedent rainfall prior to the storm event.
Citation: Westra, S., R. Mehrotra, A. Sharma, and R. Srikanthan (2012), Continuous rainfall simulation: 1. A regionalized subdaily
disaggregation approach, Water Resour. Res., 48, W01535, doi:10.1029/2011WR010489.

1 School of Civil, Environmental, and Mining Engineering, University of Adelaide, South Australia, Australia.
2 School of Civil and Environmental Engineering, University of New South Wales, Sydney, New South Wales, Australia.
3 Water Division, Australian Bureau of Meteorology, Melbourne, Victoria, Australia.

Copyright 2012 by the American Geophysical Union
0043-1397/12/2011WR010489

1. Introduction

[2] Continuous (uninterrupted) sequences of subdaily rainfall are an important source of information for many hydrological applications, with fine-timescale rainfall often used as an input in the design of urban storm water systems, the simulation of environmental flows in small catchments, and the modeling of short-duration floods. In particular, the use of continuous sequences for this latter application has been the subject of much research [Blazkova and Beven, 2002; Boughton and Droop, 2003; Cameron et al., 2000; Lamb and Kay, 2004], as it provides a viable means of accounting both for the “flood-producing” rainfall event itself and for the antecedent rainfall in the hours, days, weeks, and months prior to the event, with both factors potentially having a significant bearing on the resulting flood estimates [Kuczera et al., 2006; Pui et al., 2011a].

[3] Compared to daily rainfall records, however, historical records of subdaily rainfall are usually more sparsely sampled in space, of shorter duration, and also often contain a greater percentage of missing data. To address the paucity of recorded subdaily records, a range of approaches has been developed for synthetically generating continuous subdaily rainfall sequences. These include multiscaling models, such as the canonical and microcanonical cascades family of models, which are based on the observation that rainfall patterns exhibit “self-similarity” at a range of timescales, enabling information on coarse-scale rainfall to be used to describe behavior at finer timescales [Gupta and Waymire, 1993; Lovejoy and Schertzer, 1990; Marshak et al., 1994; Menabde et al., 1997; Schertzer and Lovejoy, 1987]. An alternative is the Poisson cluster suite of models, which simulate rainfall using a Poisson process of storm arrivals [Cowpertwait et al., 2007, 1996; Koutsoyiannis and Onof, 2001; Rodriguez-Iturbe et al., 1987, 1988; Verhoest et al., 1997]. Two popular implementations of this class of model are the Bartlett-Lewis and Neyman-Scott rectangular pulse models, with both having been used widely in research and engineering practice [e.g., Frost et al., 2004]. Finally, nonparametric resampling models have been developed which avoid making strong assumptions as to the underlying distribution of the rainfall [e.g., Lall and Sharma, 1996; Nowak et al., 2010; Sharma et al., 1997; Snavidze, 1977; Tarboton et al., 1998], by drawing “fragments”

W01535 WESTRA ET AL.: CONTINUOUS RAINFALL: REGIONALIZED DISAGGREGATION W01535

from instrumental data to form new stochastic rainfall sequences.

[4] A limitation of many of these approaches is the need for long, high-quality subdaily rainfall records as the basis for parameter estimation in the case of the multifractal and Poisson class of models, or for drawing the subdaily “fragments” in the case of the nonparametric algorithms described above. This is particularly unfortunate given that the absence of long continuous rainfall records provides one of the principal justifications for continuous simulation, with the solution to this problem usually involving the development of “regionalized” approaches that make use of subdaily data within a broader spatial domain in the vicinity of the location of interest.

[5] The majority of work on such regionalized approaches has focused on the Poisson cluster family of models. For example, Cowpertwait et al. [1996] and Cowpertwait and O’Connell [1997] developed a regionalized Neyman-Scott Rectangular Pulse (NSRP) model for generating sequences of hourly rainfall data across the UK, by regressing the NSRP parameters on site variables obtained from a relief map of the UK (including elevation, north-south distance, east-west effect, and distance to coast). Cowpertwait et al. [1996] also developed a disaggregation model that allows historical or generated hourly data to be disaggregated into totals for shorter time intervals. An alternative approach was proposed by Gyasi-Agyei [1999], who developed a regionalized version of the Gyasi-Agyei and Willgoose hybrid model based on the nonrandomized Bartlett-Lewis rectangular pulse and an autoregressive jitter [Gyasi-Agyei and Willgoose, 1997, 1999]. This approach uses observed daily statistics (namely dry probability, mean, and variance) and two regionalized subdaily parameter estimates, with promising results found in simulating subdaily rainfall in central Queensland, Australia. This model was extended to Australia-wide data by Gyasi-Agyei and Parvez Bin Mahbub [2007], and was found to be successful in simulating a range of statistics including extreme rainfall.

[6] In our two articles, we present an alternative regionalized framework for generating continuous subdaily rainfall sequences, drawing on the nonparametric resampling approaches developed by Lall and Sharma [1996] and a novel approach to defining regional similarity. Specifically, in this paper a nonparametric disaggregation approach will be described, in which subdaily rainfall “fragments” are randomly sampled from nearby pluviograph stations conditional on daily rainfall amounts at the location of interest. This is one of the first regionalized extensions to the method of fragments logic, and substantially expands the applicability of the method of fragments due to the relative abundance of high-quality daily rainfall data compared to subdaily rainfall records. We also modified the method of fragments logic to consider previous- and next-day wetness states. This modification improves the continuity of the resampled subdaily fragments, and is made possible by the greater sample size brought about by using multiple nearby records.

[7] In the second paper, an algorithm is developed for generating daily rainfall sequences at ungaged locations, once again being informed by data from nearby gaged locations. The combination of these two algorithms allows for a complete regionalized framework for generating point-based continuous rainfall sequences at any desired location. Detailed testing of these algorithms is conducted with an emphasis on evaluating the extent to which the methods capture both the distribution of extreme rainfall and the antecedent rainfall leading up to the extreme event, reflecting the likely applicability of these techniques for flood estimation practice.

[8] The remainder of this paper is structured as follows. In section 2 we provide an overview of Australia’s continuous rainfall record. This is followed in section 3 by a description of the proposed methodology, including the statistics used to determine the similarity between daily/subdaily rainfall relationships at any two locations. Results are presented in section 4, including a preliminary analysis of the viability of the method at Sydney Airport, Australia, as well as more detailed results for five case study locations distributed throughout Australia. Finally, a discussion and conclusions are provided in section 5.

2. Data

[9] Continuous subdaily rainfall data were obtained from the Australian Bureau of Meteorology (www.bom.gov.au) at 1397 stations, in increments of 6 minutes. The location of each gaging station is shown in Figure 1, together with an indication of the length of record. The median record length of all stations was 9 yr, with only 101 stations having records longer than 40 yr, and an additional 331 stations having records of between 20 and 40 yr. Furthermore, the spatial distribution of the gaging stations is not homogeneous, with a high density of gages in the populated regions, particularly along the eastern coastal fringe of Australia, and lower density elsewhere. In contrast, there are 17,451 daily-read gaging stations in Australia, of which 2708 stations have records longer than 20 yr, and 1768 stations have more than 40 yr of record. This asymmetry in data availability between daily and subdaily records highlights the potential benefits of developing a regionalized disaggregation approach using the conditional relationship between daily and subdaily rainfall.

[10] The number of gaging stations with continuous rainfall records is plotted against the year of record in Figure 2. As can be seen, only a small number of gaging stations were available in the early twentieth century (the longest available record in Australia being from Melbourne Regional Office, gage number 086071, with data from 1873), with significant increases in recording density apparent in the 1960s. To limit the effects of possible temporal variability in the daily/subdaily characteristics, the remainder of the paper only considers records between 1970 and 2005 with less than 20% of the record classified as “missing,” with a total of 232 stations meeting this criterion. “Missing” data was defined as data which was flagged as either missing or presented as an accumulation over previous time steps, and in these cases the full day of record was removed from the analysis. As will be discussed further below, the proposed method is relatively insensitive to missing data.

3. Methodology

3.1. Regionalized Method of Fragments Algorithm

[11] The method of fragments is a well-known resampling algorithm for generating continuous rainfall sequences


Figure 1. Spatial coverage and record length of the Australian subdaily pluviograph record.

[Lall and Sharma, 1996; Nowak et al., 2010; Sharma and Srikanthan, 2006; Sharma et al., 1997; Snavidze, 1977; Tarboton et al., 1998]. In this paper, we make two modifications to enable the method to be applied in a regionalized setting. The first and most important modification involves the development of a regionalized version in which, conditional on daily rainfall at the location of interest, fragments are sampled from a range of “nearby” locations. The second modification involves including a “state-based” logic in which fragments are drawn not only conditional on daily rainfall amounts, but also on whether the previous and next day are wet or dry. As will be discussed later, this second modification partially overcomes an issue with the conventional method of fragments related to continuity of the resampled subdaily rainfall fragments when there are successive wet days, and is made possible here because of the greater sample size due to use of a larger number of nearby stations.

[12] The algorithm for the adjusted method of fragments is presented here. The approach is also illustrated in Figure 3, with the steps in the algorithm matching the steps highlighted in the figure. The algorithm is as follows:

Step 1: Obtain a sequence of daily rainfall $R^{o}_{i}$ at the location of interest, where subscript $i$ indexes time and the superscript “o” refers to the target location (the location at which the continuous rainfall sequences are sought). The daily rainfall sequence can be obtained either from a historical record of daily rainfall at the target location, or alternatively from a daily stochastic generation algorithm such as the one described in our second paper.

Step 2: Obtain daily rainfall sequences at a range of “nearby” subdaily rainfall gages, via:

$$R^{s}_{i} = \sum_{m} X^{s}_{i,m}, \qquad (1)$$

where $X^{s}_{i,m}$ represents the rainfall depth on day $i$ and at subdaily time step $m$, at nearby station $s$. For the present study we have subdaily rainfall available in increments of 6 min, such that $m \in \{1, \ldots, 240\}$. We also obtain the subdaily fragments given by $fr^{s}_{i,m} = X^{s}_{i,m} / R^{s}_{i}$, which is a dimensionless version of the subdaily rainfall record.
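Steps 1 and 2 amount to summing each day's 6-min depths and dividing by the daily total. The following is a minimal Python sketch of equation (1) and the fragment definition. The function name and the list-of-lists data layout are assumptions made for illustration, and leaving the fragments of a dry day undefined (`None`) is a choice made here rather than one stated in the text.

```python
def daily_totals_and_fragments(subdaily):
    """Aggregate subdaily depths to daily totals and dimensionless fragments.

    subdaily: list of days, each a list of 6-min rainfall depths (mm),
              i.e. X[i][m] with m running over the 240 steps of day i.
    Returns (totals, fragments), where fragments[i][m] = X[i][m] / totals[i].
    """
    totals = []
    fragments = []
    for day in subdaily:
        total = sum(day)  # equation (1): R_i = sum over m of X_{i,m}
        totals.append(total)
        if total > 0:
            fragments.append([x / total for x in day])
        else:
            # Assumption: fragments are undefined on dry days, so store None.
            fragments.append(None)
    return totals, fragments
```

By construction the fragments of each wet day sum to one, so multiplying them by any daily total redistributes that total over the day without changing it.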
Figure 2. Number of Australia-wide pluviograph records against year of record, plotted from 1900.

Step 3: For each wet day $R^{o}_{t} > 0$, search for days with similar daily rainfall depth across all nearby stations $s = 1, \ldots, S$, where $S$ represents the total number of nearby stations, across every year for which subdaily data is


Figure 3. Illustration of the state-based method of fragments algorithm. The indicated steps correspond
to the steps of the algorithm in section 3.1.
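The state-based resampling illustrated in Figure 3 (Steps 3 to 5 of section 3.1) can be sketched in Python as follows. This is a hedged illustration, not the authors' implementation: the function names, the dictionary layout of the candidate pool, the 365-day wrap-around of the seasonal window, and the fallback to the single nearest day when no candidate lies within the 10% tolerance are all assumptions.

```python
import random


def rank_probabilities(k):
    """Equation (2): P(j) = (1/j) / sum_{i=1..k} (1/i) for ranks j = 1..k."""
    norm = sum(1.0 / i for i in range(1, k + 1))
    return [(1.0 / j) / norm for j in range(1, k + 1)]


def disaggregate_day(target_depth, prev_wet, next_wet, target_doy, pool,
                     half_window=15, max_k=10, tolerance=0.10, rng=random):
    """Resample one set of fragments for a wet day at the target location.

    pool: list of dicts describing days at nearby stations, each with keys
          'depth' (daily total), 'prev_wet', 'next_wet' (bool wetness states),
          'doy' (day of year) and 'fragments' (list summing to 1).
    Returns the disaggregated subdaily depths, or None if no day qualifies.
    """
    def in_window(doy):
        # Assumption: the seasonal window wraps around a 365-day year.
        diff = abs(doy - target_doy)
        return min(diff, 365 - diff) <= half_window

    # Step 3: wet days in the moving window with matching wetness states.
    candidates = [
        c for c in pool
        if c["depth"] > 0
        and c["prev_wet"] == prev_wet
        and c["next_wet"] == next_wet
        and in_window(c["doy"])
    ]
    if not candidates:
        return None

    # Step 4: rank by absolute deviation and keep up to max_k close neighbors.
    candidates.sort(key=lambda c: abs(c["depth"] - target_depth))
    neighbours = [c for c in candidates[:max_k]
                  if abs(c["depth"] - target_depth) < tolerance * target_depth]
    if not neighbours:
        # Assumption: fall back to the single nearest day.
        neighbours = candidates[:1]

    # Step 5: draw a neighbor with rank-based probability, then rescale.
    weights = rank_probabilities(len(neighbours))
    chosen = rng.choices(neighbours, weights=weights, k=1)[0]
    return [f * target_depth for f in chosen["fragments"]]
```

Because the fragments sum to one, the returned subdaily depths always add up to the daily total at the target location, whichever neighbor is drawn.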

available. For example, if we consider 20 nearby stations which each have an average length of record of 9 yr, this would amount to a total of 180 yr of record. To preserve seasonality, we only look within a moving window of ±15 days centered on day $t$. In other words, if $t = 45$ (14 February), we search for days from $t = 30$ to $t = 60$. Furthermore, to account for continuity across the boundaries we only look at wet days with the same previous- and next-day wetness state (i.e., $I[R^{s}_{i-1}] = I[R^{o}_{t-1}]$ and $I[R^{s}_{i+1}] = I[R^{o}_{t+1}]$), where $I(\cdot)$ represents a binary indicator function defined as $I(R) = 1$ for a wet day and $I(R) = 0$ for a dry day.

Step 4: We use an index $j = 1, \ldots, n$ to refer to days which are within the moving window and have the same previous- and next-day wetness state, with the total number of days $n$ being calculated across all the nearby stations $S$ and across all years of record at each station. These days are ranked by absolute deviation in rainfall depth $|R^{s}_{j} - R^{o}_{t}|$, to construct a sorted series $R^{s}_{(j)}$ from the smallest absolute deviation to the largest, where the use of parentheses indicates that the data has been sorted. We find the $k$ nearest neighbors $(j) = 1, \ldots, (k)$, with the value of $k$ selected to ensure all the neighbors have an absolute deviation in rainfall depth of less than 10% of the at-site rainfall, up to a maximum of 10 nearest neighbors.

Step 5: Randomly draw from $R^{s}_{(j)}$ with probability:

$$P(j) = \frac{1/(j)}{\sum_{i=1}^{k} 1/i}, \qquad (2)$$

where $P(j)$ represents the probability of selecting neighbor $(j)$ [Lall and Sharma, 1996; Mehrotra and Sharma, 2006]. The selected fragment is then inserted into day $R^{o}_{t}$ via $X^{o}_{t,m} = fr^{s}_{(j),m} \times R^{o}_{t}$.

[13] This completes the algorithm for the regionalized method of fragments. In total, there are three “tuning” parameters: the number of nearby stations $S$ to include in the model, the number of nearest neighbors $k$, and the width of the moving window. Given the large amount of variability in record length from one subdaily rainfall station to the next, we let $S$ vary such that sufficient stations were selected to have at least 250 yr of record from which to sample. The value of $k = 10$ was chosen as this ensured the daily total rainfall for each fragment was within a relatively small tolerance of $R^{o}_{i}$, while still ensuring a significant amount of induced sampling variability. Sensitivity to each of these tuning parameters was evaluated and found to be fairly limited. Finally, the width of the moving window was selected so as to ensure that samples were all drawn from the same time of year.

[14] Although the overall approach is conceptually simple, the challenge is to define the neighborhood from which to sample the $S$ pluviograph records. The basis for identifying whether the daily-to-subdaily scaling at two locations is similar and thus substitutable is described below.

3.2. Daily-to-Subdaily Scaling

[15] To enable substitution of subdaily fragments from one station to another, one needs to ensure that for any day $t$, the conditional relationship between the daily rainfall amount $R_{t}$ and the full sequence of subdaily rainfall $X_{t,m}$ is statistically similar at both the target station and the nearby stations. This can be expressed as

$$f(X^{s}_{t,m} \mid R^{s}_{t}) = f(X^{o}_{t,m} \mid R^{o}_{t}) \qquad (3)$$

for all $m$ and $t$, where $f(\cdot \mid \cdot)$ is used to express a conditional probability density function. Given the difficulty of constructing


separate conditional density functions for 240 separate increments of subdaily rainfall, as well as the fact that for any wet day $R_{t}$ there is a high probability that any subdaily rainfall increment $X_{t,m}$ has no rainfall, we modify equation (3) as follows:

$$f(Y^{s}_{t} \mid R^{s}_{t}) = f(Y^{o}_{t} \mid R^{o}_{t}), \qquad (4)$$

where $Y^{s}_{t}$ and $Y^{o}_{t}$ represent scalar attributes of $X^{s}_{t,m}$ and $X^{o}_{t,m}$ for each day of record, respectively. The attributes to be considered include:

[16] Maximum intensity: for each wet day, what is the maximum 6-, 12-, 30-, 60-, 120-, 180-, and 360-min duration storm burst expressed as a fraction of the total rainfall amount for that day?

[17] Fraction of zeros: for each day, what is the fraction of 6-min time steps with no rainfall?

[18] Maximum intensity timing: for each wet day, what is the time of day when the maximum 6-, 12-, 30-, 60-, 120-, 180-, and 360-min duration storm burst occurs?

[19] In combination, these scalar attributes are expected to cover most of the information on the scaling and timing behavior between daily rainfall and the fragments.

[20] To illustrate these concepts, we present in Figure 4 the joint probability plot of daily rainfall and the maximum 12-min storm burst at three locations in Australia: Hobart, Sydney, and Darwin. These locations were selected as they have distinctly different climatology, with Hobart located in the south of Tasmania being one of the most southerly pluviograph records, Darwin in the Northern Territory being one of the most northerly pluviograph records, and Sydney being situated along the Australian east coast.

[21] As can be seen in the daily rainfall histogram (Figure 4, lower panel), the marginal probabilities of daily rainfall at each station are distinctly different. For example, Darwin has a high probability of high daily rainfall amounts (the majority of rain days having >10 mm rainfall), whereas Hobart has a large number of rain days with relatively little rainfall, with most days having significantly less than 10 mm over the entire day. It should be emphasized, however, that our interest here is not on this marginal

Figure 4. Scatterplot with daily rainfall and an attribute of subdaily rainfall (the maximum 12-min storm burst expressed as a fraction of the total daily rainfall) at three locations in Australia: Hobart (blue), Sydney (green), and Darwin (red). Histograms of daily rainfall and the maximum 12-min storm burst are provided in the bottom and left figure panels, respectively, for each of the three locations. The solid lines are loess smoothers of the observations, and are provided for visualization purposes only.
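The subdaily attribute plotted in Figure 4, together with the fraction of zeros, could be computed from one day of 6-min data along the following lines. This is a sketch under stated assumptions: the function names are hypothetical, the burst timing is returned as the index of the first 6-min step of the burst (the text does not state how the time of day is referenced), and ties are resolved in favor of the earliest burst.

```python
def max_burst_fraction(day, duration_min, step_min=6):
    """Maximum storm burst of a given duration as a fraction of the daily total.

    day: list of subdaily depths for one day (240 values at 6-min resolution).
    Returns (fraction, start_index), or None for a dry day.
    """
    total = sum(day)
    if total <= 0:
        return None
    width = duration_min // step_min  # number of time steps in the burst
    best = -1.0
    best_start = 0
    for start in range(0, len(day) - width + 1):
        burst = sum(day[start:start + width])
        if burst > best:  # strict comparison keeps the earliest maximum
            best = burst
            best_start = start
    return best / total, best_start


def fraction_of_zeros(day):
    """Fraction of subdaily time steps in the day with no rainfall."""
    return sum(1 for x in day if x == 0) / len(day)
```

For example, `max_burst_fraction(day, 12)` gives the quantity on the vertical axis of Figure 4 for a single day, and calling it with durations of 6, 30, 60, 120, 180, and 360 min gives the remaining maximum-intensity attributes.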


distribution; rather, we wish to know, conditional on some daily rainfall amount, whether the subdaily rainfall properties are the same at any two locations. To determine whether this is the case, we started by plotting a loess smoother [Hastie et al., 2009] with support of 25% of the sample to represent the conditional expected value of the maximum 12-min storm burst as a function of daily rainfall.

[22] It is evident that the fraction of daily rainfall contained in the maximum 12-min storm burst varies as a function of the daily rainfall amount. This is unsurprising, as intuitively one would expect that for small daily rainfall amounts, a smaller percentage of the day would be wet, and therefore there is a greater chance that the maximum 12-min storm burst contains a large portion of the daily rainfall. Interestingly, however, the loess smoother highlights that the relationship between daily rainfall and subdaily rainfall is on average very different at the three locations, with Darwin typically having a greater fraction of the daily rainfall contained within the maximum 12-min storm burst than Hobart. This suggests that even if both stations have the same daily total rainfall amount, Darwin is more likely to have that rainfall distributed over a number of short-duration, high-intensity rainfall events compared with Hobart, whose rainfall is more likely to be spread evenly over the day, with these results appearing sensible given the tropical nature of Darwin's climate. Although figures are not provided here, consistent conclusions can be drawn from considering other durations, as well as the fraction of each wet day that does not experience rainfall.

3.3. Defining Similarity

[23] We now wish to devise a metric to determine whether the conditional distributions in equation (4) and illustrated in Figure 4 are, in fact, statistically equivalent. To simplify the analysis, rather than focus on the conditional distribution we consider whether the joint distribution of $Y$ and $R$ at any two stations is equivalent, given by

$$f(Y^{s}, R^{s}) = f(Y^{o}, R^{o}). \qquad (5)$$

[24] This is a stricter criterion compared to the conditional distribution in equation (4), because two locations having equivalent joint distributions implies that the conditional distributions must also be equivalent, although the opposite is not necessarily true. (One can easily imagine two samples having an equivalent distribution of subdaily rainfall conditional on daily rainfall amount, but different marginal distribution for the daily rainfall amount, and therefore different joint distributions.)

[25] To test the hypothesis that the joint distribution between daily rainfall and some attribute of subdaily rainfall at any two locations is statistically similar, we use a two-dimensional, two-sample Kolmogorov-Smirnov (K-S) test. This represents a generalization of the better known one-dimensional K-S test [Press et al., 1992], and was developed by Fasano and Franceschini [1987]. The basis of the two-dimensional generalization is that although a cumulative distribution function is not well defined over more than one dimension, the integrated probability in each of four quadrants around some point $(x_i, y_i)$ in some arbitrary x- and y-dimensions provides a reasonable approximation. The two-dimensional K-S statistic $D$ is the maximum difference (ranging over both data points and quadrants) of the integrated probabilities, with its significance level given by [Press et al., 1992]:

$$\Pr(D > \text{observed}) = Q_{KS}\left(\frac{\sqrt{N}\,D}{1 + \sqrt{1 - r^{2}}\,\left(0.25 - 0.75/\sqrt{N}\right)}\right), \qquad (6)$$

where

$$N = \frac{N_{1} N_{2}}{N_{1} + N_{2}}, \qquad (7)$$

with $N_{1}$ and $N_{2}$ representing the size of samples 1 and 2, respectively. In calculating the probability that the K-S statistic, as described in equation (6), is above some defined level under the null hypothesis that the two samples are from the same population, it is necessary to evaluate the function:

$$Q_{KS}(\lambda) = 2 \sum_{j=1}^{\infty} (-1)^{j-1} e^{-2 j^{2} \lambda^{2}}. \qquad (8)$$

[26] This allows for the estimation of the probability that the joint distributions of two different data sets are statistically similar, with further details on this statistic provided by Press et al. [1992].

3.4. Predictive Model for Statistical Similarity

[27] In section 3.3 we described a metric for determining whether the joint distribution between daily rainfall amounts and attributes of subdaily rainfall, as illustrated in Figure 4, is statistically similar. As discussed earlier, to use this information to extend the continuous simulation approach to locations where pluviograph data is unavailable, it is necessary to draw subdaily fragments from nearby stations conditional on daily rainfall at the target location. As such, we now wish to determine: What are the factors that will influence whether the daily-to-subdaily scaling at two stations will be similar?

[28] To answer this question, we consider each possible pairing of the 232 pluviograph stations with at least 30 years of data, totaling 26,796 station pairs, and calculate the two-sample, two-dimensional K-S statistic for each pair of stations and subdaily rainfall attributes. We use a 5% significance level to evaluate whether two stations are similar, and then consider how the probability that any two stations are similar varies as a function of a range of possible covariates, including difference in latitude, longitude, distance to coast, and elevation between each station pair. These predictors, summarized in Table 1, comprise a range of easily measurable physiographic characteristics, which might be expected to influence the similarity between two stations. Seasonal variations in the daily-to-subdaily rainfall relationship are accommodated by formulating a separate model for each season of the year.

[29] Thus, we have a set of continuous predictors represented by $V$ (dimension 26,796 × 5) which we wish to model against a binomial response represented by $u$ of


Table 1. Predictors Used for the Logistic Regression Model Described in Equations (9) and (10)(a)

Predictor | Units | Description/Comments
Diff_lat | Degrees (expressed as a decimal) | Difference in latitude between each station pair, calculated as abs(Lat1 - Lat2).
Diff_lon | Degrees (expressed as a decimal) | Difference in longitude between each station pair, calculated as abs(Lon1 - Lon2).
Diff_lat × Diff_lon | Degrees (expressed as a decimal) | Interaction term, which would be greater than zero if it is the distance between stations, rather than the sum of the latitude and longitude, which is the dominant predictor.
Diff_dist_coast (normalized) | Dimensionless | Difference in distance to coast between each station pair, normalized by the average distance to coast for the station pair, calculated as abs(dist1 - dist2)/mean(dist1, dist2).
Diff_elev | Meters | Difference in elevation between each station pair, calculated as abs(Elev1 - Elev2).

(a) The prefix “Diff_” emphasizes that it is the difference in each of the predictors between stations that is considered, rather than the absolute value.
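The predictors in Table 1 follow directly from the coordinates, distance to coast, and elevation of the two stations in a pair. A minimal Python sketch is given below. The function name and the dictionary keys are hypothetical, and returning zero for the normalized distance when both stations lie on the coast is an assumption added to avoid division by zero.

```python
def pair_predictors(station1, station2):
    """Compute the Table 1 predictors for one station pair.

    Each station is a dict with keys 'lat' and 'lon' (decimal degrees),
    'dist_coast' (distance to coast, any consistent unit) and 'elev' (m).
    """
    diff_lat = abs(station1["lat"] - station2["lat"])
    diff_lon = abs(station1["lon"] - station2["lon"])
    mean_dist = (station1["dist_coast"] + station2["dist_coast"]) / 2.0
    if mean_dist > 0:
        diff_dist_coast = abs(station1["dist_coast"] - station2["dist_coast"]) / mean_dist
    else:
        # Assumption: both stations on the coast, so no relative difference.
        diff_dist_coast = 0.0
    return {
        "Diff_lat": diff_lat,
        "Diff_lon": diff_lon,
        "Diff_lat_x_Diff_lon": diff_lat * diff_lon,  # interaction term
        "Diff_dist_coast": diff_dist_coast,
        "Diff_elev": abs(station1["elev"] - station2["elev"]),
    }
```

Applying this function to every unordered pair of the 232 stations would give the 26,796 rows of predictors described in section 3.4.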

length 26,796 (where $u \in \{0, 1\}$ represents the cases where the scaling between daily and subdaily rainfall at two stations is statistically different and similar, respectively, as calculated by the Kolmogorov-Smirnov test described in section 3.3). This relationship can be modeled using a logistic regression, in which

$$\Pr(u = 1) = \operatorname{logit}(z) = \frac{e^{z}}{e^{z} + 1} \qquad (9)$$

transforms the continuous predictor variables to the range [0,1] as required when modeling a binomial response. In this equation, $z$ is defined as

$$z = \beta_{0} + \beta_{1} v_{1} + \ldots + \beta_{5} v_{5}, \qquad (10)$$

with $\beta$ representing the regression coefficients. The results of the logistic regression model are shown in Figure 5, plotted against the difference in latitude. The results are presented for four attributes of subdaily rainfall: 6-min maximum storm burst, 1-h maximum storm burst, fraction of day with no rainfall, and time of day with the maximum 6-min storm burst. Note that for the time attribute, we are only considering the marginal distribution of the time of day when the maximum 6-min storm burst occurs, rather than a joint density.

[30] As can be seen in Figure 5, with the exception of the fraction of zeros measured by the K-S statistic, there is a chance between 40% and 60% that the joint distribution of daily rainfall and each of the attributes is statistically similar provided that the difference in latitude is small, with the probability decreasing rapidly with increasing difference in latitude. This is interesting, as no account is made of any other physiographic information, so that stations may be located on opposite sides of the continent, or at very different elevations, and yet still have close to a 50% chance of having the same scaling between daily and subdaily rainfall, provided the latitude is the same. The joint distribution of daily rainfall and fraction of zeros has the lowest probability of being statistically similar between station pairs, with a chance of approximately 22% that two stations will have the same joint dependence, assuming they are at the same latitude.

[31] Consideration of just a single covariate, difference in latitude, as the only factor influencing the similarity between stations ignores other physiographic information which may be important. As such, we extend this model to a multivariate logistic regression setting to consider the influence of each of the plausible predictors mentioned above. The conceptual basis for this approach is illustrated in Figure 6. Given a target location of interest, we wish to define a zone for which the probability that daily-to-subdaily scaling at two stations is statistically similar is greater than a predefined threshold. This zone is described by contours of equal probability, with the probability decreasing linearly (in the logistic transformed space) in each of the dimensions of the regression model. The shapes of the contours are defined by the logistic regression coefficients. In the idealized example in Figure 6, we represent the case where the probability of two stations being statistically similar decreases at a faster rate in the latitude dimension compared to the longitude dimension. Furthermore, the location of the target station is slightly offset from the center of the contours, this being governed by the influence of the relative difference in distance to the coast.

[32] The results of this multivariate regression are presented in Table 2, and once again plotted for the summer

Figure 5. Logistic regression results against a single predictor (difference in latitude) and four responses representing different subdaily attributes. The responses have been calculated using the two-sample two-dimension Kolmogorov-Smirnov test statistic.


Figure 6. Diagrammatic representation of logistic regression results. The response is the probability
that the joint distribution of daily rainfall amount and some attribute of subdaily rainfall at a ‘‘nearby’’
station is statistically similar to the target station. The predictors are the difference in latitude, longitude,
latitude × longitude, elevation, and a normalized distance to coast, with the logistic regression coefficients
determining the relative decrease in the probability that two stations are similar in each of these
dimensions.

months against latitude in Figure 7, with the remaining predictors held at zero. As can be seen, the results in Figure 7 show notable improvements in the probability that two stations are equal compared to Figure 5, because we are now plotting the influence of latitude assuming that differences in longitude, elevation, and relative distance to coast are all zero. In fact, with the exception of the fraction of zeros, the results show that for small values of each of the predictors there is between a 60% and 70% probability that the daily-to-subdaily joint probability distributions are statistically similar. Once again, the fraction of zeros is the most challenging statistic in terms of maintaining similarity, with only a chance of 40% that two stations have the same scaling, assuming all the predictors are zero.
[33] It should be emphasized that this is in many ways a conservative estimate as we consider the subdaily attributes (e.g., fraction of zeros, 6-min rainfall intensity), which are the most challenging to capture from daily data alone. Even more importantly, as can be seen in the example of Figure 4, the number of samples in each bivariate distribution is large (30 years of data, 90 days per season, and 30% of days being wet days yields approximately 800 wet days), such that the 95% confidence intervals are very narrow (as the width of the confidence intervals is governed by sample size).

Table 2. Logistic Regression Coefficients (a)

Season  Subdaily Rainfall Attribute  Intercept  Latitude  Longitude  Latitude × Longitude  Distance Coast  Elevation
DJF     6 min intensity              0.426      0.345     0.0377     0.0064                0.186           0.00089
DJF     1 h intensity                0.823      0.333     0.0425     0.0093                0.231           0.00075
DJF     Fraction of zeros            0.375      0.253     0.0318     0.0075                0.242           0.00065
DJF     6 min time                   0.979      0.137     0.0099     0.0022                0.453           0.00141
MAM     6 min intensity              0.067      0.192     0.0065     NS                    0.218           0.00130
MAM     1 h intensity                0.308      0.178     0.0074     NS                    0.107           0.00098
MAM     Fraction of zeros            0.806      0.157     0.0105     0.0025                0.165           0.00060
MAM     6 min time                   1.256      0.140     0.0226     0.0034                0.227           0.00092
JJA     6 min intensity              0.197      0.097     0.0110     0.0034                0.096           0.00198
JJA     1 h intensity                0.471      0.102     0.0204     0.0033                NS              0.00335
JJA     Fraction of zeros            0.365      0.073     0.0171     0.0031                0.101           0.00116
JJA     6 min time                   2.078      0.098     0.0321     0.0037                0.156           0.00069
SON     6 min intensity              0.474      0.387     0.0722     0.0129                NS              0.00146
SON     1 h intensity                0.824      0.325     0.0835     0.0135                NS              0.00132
SON     Fraction of zeros            0.382      0.239     0.0623     0.0104                0.087           0.00095
SON     6 min time                   1.028      0.162     0.0287     0.0042                0.317           NS

(a) All predictors were found to be statistically significant (usually with a p-value <0.001 level), with the exception of several predictors labeled as NS (not significant). Seasons include December-January-February (DJF), March-April-May (MAM), June-July-August (JJA) and September-October-November (SON).
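
As an illustration of how a set of coefficients of the kind tabulated above translates into a probability of similarity, the following minimal Python sketch evaluates a linear predictor of the form of equation (10) and passes it through a logistic function. It assumes that the transform referred to in the text is the standard logistic function 1/(1 + exp(-z)). The function name and the coefficient values are hypothetical and chosen only for illustration; they are not taken from Table 2.

```python
import math


def similarity_probability(coeffs, predictors):
    """Return Pr(u = 1) = 1 / (1 + exp(-z)), with z = b0 + sum(b_i * v_i).

    coeffs     -- (intercept, slope_1, ..., slope_n); hypothetical values
    predictors -- (v_1, ..., v_n), e.g. differences between two stations
    """
    intercept, *slopes = coeffs
    if len(slopes) != len(predictors):
        raise ValueError("need exactly one slope per predictor")
    z = intercept + sum(b * v for b, v in zip(slopes, predictors))
    return 1.0 / (1.0 + math.exp(-z))


# Illustrative (assumed) coefficients, ordered as: intercept, latitude,
# longitude, latitude x longitude, distance to coast, elevation.
example_coeffs = (0.4, -0.3, -0.04, -0.006, -0.2, -0.001)

# All predictor differences equal to zero: only the intercept matters.
p_same = similarity_probability(example_coeffs, (0, 0, 0, 0, 0))

# A more distant station pair: the probability of similarity drops.
p_far = similarity_probability(example_coeffs, (5.0, 2.0, 10.0, 0.5, 100.0))
```

With all predictors set to zero the probability depends on the intercept alone, which is the situation plotted at zero latitude difference in Figure 7.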


Figure 7. As per Figure 5, except the results represent the outcomes of the full multivariate regression. The probability that daily to subdaily scaling is statistically similar is once again plotted against difference in latitude; however, all the remaining predictors are now held at zero.

4. Results

4.1. Identifying ‘‘Nearby’’ Stations: Application to Sydney Airport
[34] We start by demonstrating a single application of the approach at one location: Sydney Airport (gage number 066037). This location represents a relatively long-record pluviograph station, and therefore provides a useful record for verification of the method.
[35] The approach to identifying ‘‘nearby’’ stations is as follows:
[36] (1) For all 1396 pluviograph stations in Australia (excluding the Sydney Airport gage), calculate each of the regression predictors identified in Table 1; namely, difference in latitude, longitude, latitude times longitude, elevation, and normalized distance to the coast, relative to the Sydney Airport station.
[37] (2) Having developed the 1396 × 5 predictor matrix, apply the regression model presented in equations (9) and (10) using the regression coefficients shown in Table 2 for each season and attribute to calculate the probability Pr(u = 1).
[38] (3) For each season and attribute, separately rank the probabilities from highest to lowest.
[39] (4) For each season calculate the average rank for each station across all attributes.
[40] (5) Select the S lowest-ranked stations (i.e., those with the smallest average rank, and hence the highest probability of similarity) for inclusion in the disaggregation model.
[41] This algorithm yields different choices of stations for each season, as physiographic influences may vary depending on the dominant synoptic systems occurring at different times of the year. It is noted that the selection of the size of S represents a somewhat subjective decision, as larger values of S increase the probability of selecting stations which are statistically different to the target station, whereas smaller values of S will result in small sample sizes. For this case, we selected S = 13, resulting in a total of 250 yr of data distributed over the 13 stations.
[42] The 13 lowest-ranked stations for the summer season are shown in Figure 8. As expected, the lowest-ranked stations (i.e., those with the greatest chance of being similar

Figure 8. Sydney Airport (large red dot) and nearby pluviograph stations (blue and brown dots). The
13 pluviograph stations ranked as most similar to the target (totaling 250 yr of pluviograph data) based on the full logistic
regression model are shown as brown dots, with the associated ranking.
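
The station-selection steps (1) to (5) of section 4.1 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions rather than the authors' implementation: the probabilities Pr(u = 1) are assumed to have been computed already for one season, the function name and input layout are hypothetical, and ties in average rank are broken by station identifier, which the paper does not specify.

```python
def select_nearby_stations(prob_by_attribute, n_select):
    """Pick the n_select stations with the smallest average rank.

    prob_by_attribute -- {attribute: {station_id: Pr(u = 1)}} for one season
                         (hypothetical layout; every attribute must cover
                         the same set of stations)
    n_select          -- the number S of stations to retain
    """
    stations = sorted(next(iter(prob_by_attribute.values())))
    rank_sum = {s: 0.0 for s in stations}

    # Step (3): rank the probabilities from highest to lowest per attribute,
    # so that rank 1 is the station most likely to be similar.
    for probs in prob_by_attribute.values():
        ordered = sorted(stations, key=lambda s: probs[s], reverse=True)
        for rank, station in enumerate(ordered, start=1):
            rank_sum[station] += rank

    # Step (4): average the ranks across all attributes.
    n_attributes = len(prob_by_attribute)
    avg_rank = {s: rank_sum[s] / n_attributes for s in stations}

    # Step (5): keep the S stations with the smallest average rank
    # (ties broken by station identifier; an assumption).
    return sorted(stations, key=lambda s: (avg_rank[s], s))[:n_select]


# Toy example with three stations and two attributes (made-up numbers).
toy_probs = {
    "6 min intensity": {"A": 0.7, "B": 0.5, "C": 0.2},
    "fraction of zeros": {"A": 0.3, "B": 0.4, "C": 0.1},
}
selected = select_nearby_stations(toy_probs, 2)
```

Repeating the call once per season would reproduce the season-specific station pools described in the text.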


to Sydney Airport) are those which are most proximate to this station, generally within a small distance to coast, and all are at low coastal elevations. In this case, therefore, the stations appear to be selected over a wide range of latitudes, which is probably due to the strong increases in elevation and relative distance to the coast with changing longitude.

4.2. Model Evaluation
[43] We now repeat the process of identifying nearby stations at five locations across Australia each having more than 50 yr of pluviograph data, representing a diversity of climate zones. These stations are shown in Table 3. Having identified the pool of nearby stations from which to draw the fragments, we apply the approach described in algorithm 1 to draw subdaily rainfall fragments from nearby stations conditional on at-site daily rainfall, and compare these sequences to the at-site pluviograph records. For comparison purposes, we also generated results using the algorithm but with at-site data only, and presented these alongside the regionalized results.
[44] It is emphasized that the use of a disaggregation model derived using observed daily rainfall sequences implies that the daily- and longer-timescale statistics will be identical to the observational data set. As such, the selection of evaluation statistics should focus on the capacity of the model to simulate rainfall at subdaily timescales. Reflecting the likely application of this model for flood estimation, the statistics considered here are based on: whether the model is capable of reproducing the extreme rainfall intensity; and whether the model captures the antecedent rainfall prior to the flood-producing rainfall event. In addition, several statistics have been calculated to determine the connectivity of rainfall events between successive wet days.
[45] Considering first the annual maxima statistics, we present in Figure 9 a plot of the annual maximum 6-min rainfall against the exceedance probability for both the observed data at the target location, as well as the results of 100 simulation runs with the same length of series as the original target pluviograph time series. The left column represents the results using at-site data as the basis for resampling, while the right column represents results using data from nearby records. The median and the 5th and 95th percentiles are calculated empirically from these 100 simulation runs, with the 5th and 95th percentile values measuring the degree of sampling variability induced by the stochastic generation algorithm.
[46] As can be seen, the observed data is generally within the sampling interval for most of the stations, with the exception of Alice Springs, for which the generated sequences tend to overestimate rainfall for all exceedance probabilities, and for Hobart in which the simulated sequences underestimate the low exceedance probability rainfall events. Interestingly, this is observed for both results using at-site data and nearby station data, highlighting that the issue is unlikely to be related to the regionalization procedure. In fact, a more thorough examination indicates that the annual maxima of the daily rainfall obtained using the daily rainfall record are on average slightly higher than the annual maxima of daily rainfall obtained from the subdaily rainfall record, due to the daily record being more complete (i.e., with fewer missing days) than the subdaily record. The issue is particularly notable for Alice Springs and Hobart, which both have a significant percentage of the pluviograph record classified as missing, such that resampling the subdaily fragments conditional on the daily rainfall record would be expected to yield simulated series which on average have higher annual maximum rainfall at both daily and subdaily durations.
[47] In addition to this issue, it was noted that the largest 6-min value for the Hobart Airport record was well in excess of the simulated results, with this being noticeable for both the at-site and regionalized results. In particular, the maximum-recorded 6-min storm burst was 23.14 mm occurring on 24 April 1972, representing a very intense storm burst for such a high latitude. Aggregating the pluviograph record for that full day showed 192.2 mm falling, which contrasted with the daily station at the same location recording only 42.2 mm for that day. We also examined the nearest pluviograph and daily-read station pairing, namely gage number 94029 located 15.6 km from the Hobart Airport gage, and found the aggregated daily rainfall from the pluviograph to be 27.94 mm, compared with 27.9 mm from the daily rain gage at that same location. Furthermore, the maximum 6-min increment rainfall intensity was found to be 1.74 mm, substantially smaller than that recorded at Hobart Airport. This therefore indicates that a recording error probably occurred at the pluviograph gage at Hobart Airport.
[48] For both reasons, we suggest that the simulated results in this case may be more likely to reflect the precipitation patterns at each location compared with the observed subdaily record at those same locations, although this conclusion is unlikely to apply everywhere. Comparing the sampling intervals for the at-site and regionalized results, it can be seen that the regionalized intervals are generally smoother, and tend to widen for higher events. In contrast, the at-site results tend to have narrower sampling intervals for the events with the lowest exceedance probabilities, reflecting the small sample size from which to draw subdaily fragments for these large events. Therefore, rather

Table 3. Data Used to Test Continuous Simulation Model (a)

Airport Station  Gage Number  Start Year  Years of Observed Data  Latitude/Longitude  Köppen Climate Classification
Sydney           066037       1961        45                      −33.9411/151.1725   Temperate (warm summer)
Perth            009021       1960        46                      −31.9275/115.9764   Subtropical (dry summer)
Alice Springs    015590       1950        57                      −23.7951/133.8890   Desert/grassland (hot, persistently dry)
Cairns           031011       1941        66                      −16.8736/145.7458   Tropical (monsoonal)
Hobart           094008       1959        47                      −42.8339/147.5033   Temperate (mild summer)

(a) All stations continue until 2009.
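
The resampling idea evaluated at these stations, namely drawing subdaily fragments conditional on the daily rainfall amount and on the previous- and next-day wetness state, can be illustrated with a short Python sketch. This is not a reproduction of algorithm 1, which is defined elsewhere in the paper. The donor-pool layout, the function name, and the choice of sampling uniformly among the k donor days closest in daily total are all assumptions made for illustration.

```python
import random


def disaggregate_day(daily_total, state, pool, k=5, rng=random):
    """Split one daily total into subdaily amounts using a donor day.

    daily_total -- rainfall on the target day (same units as the pool)
    state       -- (previous_day_wet, next_day_wet) at the target station
    pool        -- donor days from nearby pluviographs; each is a dict with
                   'total', 'state' and 'fragments' (subdaily amounts that
                   sum to 'total'); a hypothetical layout
    k           -- number of closest donor days to sample from (assumed)
    """
    # Keep only wet donor days whose wetness state matches the target day.
    candidates = [d for d in pool if d["state"] == state and d["total"] > 0]
    if not candidates:
        raise ValueError("no donor days with a matching wetness state")

    # Prefer donor days whose daily total is close to the target total.
    candidates.sort(key=lambda d: abs(d["total"] - daily_total))
    donor = rng.choice(candidates[:k])

    # Rescale the donor's within-day pattern so it sums to the target total.
    return [daily_total * x / donor["total"] for x in donor["fragments"]]


# Toy donor pool with two days and two subdaily intervals per day.
toy_pool = [
    {"total": 4.0, "state": (True, False), "fragments": [1.0, 3.0]},
    {"total": 10.0, "state": (False, False), "fragments": [10.0, 0.0]},
]
pattern = disaggregate_day(8.0, (True, False), toy_pool, k=3,
                           rng=random.Random(0))
```

Because the donor pattern is rescaled, the daily total at the target station is preserved exactly, which is consistent with the statement that daily and longer-timescale statistics match the observed daily record.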


Figure 9. Six-minute annual maximum rainfall against exceedance probability for (a) Sydney,
(b) Perth, (c) Alice Springs, (d) Cairns, and (e) Hobart. Black dots represent observed data, black solid
line represents the median of 100 simulations, and black dotted lines represent the 5th- and 95th-percentile
simulated values.
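
The median and the 5th- and 95th-percentile curves in a plot of this kind can be derived empirically from the simulated annual maxima, as in the following Python sketch. It is an illustration only: the function names are hypothetical, the Weibull plotting position i/(n + 1) is an assumption (the paper does not state which plotting position was used), and percentiles are obtained by linear interpolation between order statistics.

```python
def percentile(sorted_vals, q):
    """Linear-interpolation percentile (q in [0, 100]) of an ascending list."""
    pos = (len(sorted_vals) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (pos - lo)


def simulation_envelope(sim_annual_maxima):
    """Summarize simulated annual maxima by exceedance probability.

    sim_annual_maxima -- list of simulations, each a list of annual maxima
                         of equal length (one value per year)
    Returns (exceedance_prob, p5, median, p95) tuples, largest event first.
    """
    n_years = len(sim_annual_maxima[0])
    # Rank each simulation from its largest to its smallest annual maximum.
    ranked = [sorted(sim, reverse=True) for sim in sim_annual_maxima]
    envelope = []
    for i in range(n_years):
        # Values of the i-th largest event across all simulations.
        vals = sorted(r[i] for r in ranked)
        prob = (i + 1) / (n_years + 1)  # Weibull plotting position (assumed)
        envelope.append((prob, percentile(vals, 5),
                         percentile(vals, 50), percentile(vals, 95)))
    return envelope


# Three toy "simulations" of three years each (made-up numbers).
toy_env = simulation_envelope([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
```

The observed annual maxima can be ranked in the same way and overlaid on the envelope to check whether they fall inside the 5th- to 95th-percentile band.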


than resulting in a deterioration in performance, it is likely that the regionalized version actually provides a better representation of the sampling intervals for these very large events.
[49] In addition to the results presented in Figure 9 for the 6-min duration annual maxima, we also tabulated the results for other durations up to 12 h, presented in Table 4. Once again, the observed and simulated sequences are generally similar, with the median-sampled value within 10% of the observed value, and no obvious systematic under- or overestimation biases. The exception here is for Hobart, in which the annual maxima are typically undersimulated by 10%–20%. It should be noted, however, that the observed rainfall often falls outside of the 5th- and 95th-percentile simulation bounds, highlighting that the simulation bounds may underestimate the true level of variance. Finally, although the results are only presented in Table 4 for the regionalized method of fragments, the results from the at-site implementation are comparable, again highlighting that the regionalized method of fragments does not result in any notable deterioration in model performance.
[50] We next consider the antecedent rainfall prior to the design storm burst event, plotted in Figure 10. The justification for focusing on the antecedent rainfall exceedance probability plot is the often important relationship between the ‘‘flood-producing’’ rainfall event and the catchment wetness prior to the event [Kuczera et al., 2006]. We only focus on the 6-h antecedent rainfall depth, as antecedent conditions for longer durations (particularly, multiday antecedent rainfall depth) will be correctly captured as we are using observed daily rainfall data at the location of interest.
[51] As can be seen, the simulated data appear to follow the observed data reasonably well, although there are several points outside the 90% sampling interval. Importantly, no systematic biases could be identified, with performance varying depending on the location. This is also shown in the lower half of Table 4 with the antecedent rainfall of different durations prior to the 1-h storm burst. The observed antecedent rainfall is generally within the 90% sampling interval, with the exception of Cairns, in which antecedent rainfall is underestimated for the 6-h depth prior to the 1-h storm burst and overestimated for longer durations. Once again, the main outlier is Hobart Airport; however, as discussed in the context of the annual maxima, this is likely due to a recording error for the pluviograph record.
[52] Finally, we present results addressing the connectivity in rainfall events between successive wet days. This is a potential issue with the conventional method of fragments logic described in the literature [Lall and Sharma, 1996; Nowak et al., 2010; Sharma et al., 1997; Snavidze, 1977; Tarboton et al., 1998], as the subdaily fragments are, in effect, randomly reordered such that by definition all of the within-day rainfall characteristics will be preserved, but the between-day characteristics will be lost other than ensuring that the daily total rainfalls are maintained. This was one of the primary justifications for using the state-based method of fragments, in which the fragments are selected conditional on both current day wetness and the previous and next-day wetness state.
[53] The results are presented in Table 5 for all five test locations. For each location, the first four rows represent the probability that the last hour of day t (represented as $X_{t,24}$) is wet or dry given that the next day is wet or dry.

Table 4. Comparison of Observed and Simulated Results for Median Annual Maxima for Different Storm Burst Durations and Antecedent Rainfall Prior to 1 h Storm Burst (a)

For each station: Observed, then Simulated (5% and 95% Bounds)

Duration  Sydney                      Perth                       Alice Springs               Cairns                         Hobart

Annual Maxima
6 min     8.9   8.8 (8.14–9.32)       6.2   6.2 (5.77–6.81)       5.5   6.8 (6.32–7.2)        11.6   11.8 (11.15–12.65)      4.5   3.8 (3.4–4.12)
30 min    25.7  23.7 (21.95–25.92)    14.7  14.0 (13.08–15.26)    16.7  18.2 (17.09–19.45)    34.9   35.3 (33.96–37.05)      11.3  8.9 (8.22–9.55)
1 h       35.4  32.6 (30.04–35.45)    18.8  18.4 (16.95–19.72)    22.1  24.2 (22.5–25.75)     51.7   51.9 (49.76–54.79)      14.6  12.0 (11.26–12.85)
3 h       55.4  49.4 (46.46–52.47)    29.0  27.9 (26.18–29.89)    32.6  33.6 (31.42–35.16)    83.5   85.1 (81.33–89)         22.9  19.5 (18.54–20.56)
6 h       72.3  64.0 (61.08–67.09)    36.3  35.4 (34.09–37.53)    39.6  39.8 (37.65–41.42)    113.0  110.8 (106.12–114.22)   30.3  26.5 (25.69–27.75)
12 h      91.8  84.8 (81.92–87.2)     45.4  44.5 (43.46–45.63)    48.2  46.5 (45.42–47.65)    147.4  140.7 (137.24–144.39)   39.6  35.3 (34.58–36.26)

Antecedent Moisture Prior to 1-h Burst (mm)
6 h       15.4  13.2 (9.86–17.13)     6.8   8.5 (6.26–10.06)      6.1   5.3 (4.14–6.91)       25.4   21.5 (17.97–26.25)      6.3   5.2 (4.21–6.52)
12 h      22.7  18.8 (14.61–23.34)    9.7   10.9 (8.17–12.93)     8.0   7.8 (5.97–9.73)       32.3   31.0 (25.15–35.9)       9.1   6.8 (5.52–8.59)
24 h      31.4  28.1 (22.51–35.5)     12.8  13.6 (11.12–16.64)    10.7  11.5 (8.7–13.5)       42.0   49.1 (40.99–56.76)      10.2  9.0 (7.1–11.55)
48 h      43.0  37.2 (29.05–46.59)    15.5  17.2 (14.01–20.82)    15.5  16.8 (12.99–19.96)    58.6   74.3 (63.77–86.02)      11.4  11.1 (8.33–14.19)

(a) The simulated median annual maxima represent the median of all 100 simulations.
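
The two kinds of statistic tabulated here, the maximum storm burst of a given duration and the rainfall depth immediately preceding that burst, can be computed from a subdaily series with a moving-window sum. The Python sketch below is illustrative only: the function names are hypothetical, the series is assumed to contain one year of equally spaced increments with no missing values, and the antecedent window is simply truncated at the start of the series.

```python
def max_burst(series, window):
    """Return (start_index, depth) of the largest total over `window` steps."""
    if window < 1 or window > len(series):
        raise ValueError("window must be between 1 and len(series)")
    total = sum(series[:window])
    best_start, best = 0, total
    for start in range(1, len(series) - window + 1):
        # Slide the window forward by one step: add the new, drop the old.
        total += series[start + window - 1] - series[start - 1]
        if total > best:
            best_start, best = start, total
    return best_start, best


def antecedent_depth(series, burst_start, n_steps):
    """Rainfall in the n_steps immediately before the burst starts."""
    return sum(series[max(0, burst_start - n_steps):burst_start])


# Toy series of eight increments (made-up depths).
toy_series = [0, 1, 0, 2, 5, 3, 0, 0]
burst_start, burst_depth = max_burst(toy_series, 2)
before_burst = antecedent_depth(toy_series, burst_start, 3)
```

For 6-min data, a 1-h burst corresponds to a window of 10 increments and a 6-h antecedent period to 60 increments; taking the median of the yearly values over the record gives statistics comparable in form to those in the table.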


Figure 10. Six hour antecedent rainfall prior to the 6-min annual maximum storm burst plotted against
exceedance probability for (a) Sydney, (b) Perth, (c) Alice Springs, (d) Cairns, and (e) Hobart. Black
dots represent observed data, black solid line represents the median of 100 simulations, and black dotted
lines represent the 5th- and 95th-percentile simulated values.


Table 5. The Connectivity of Rainfall Spells Between Successive Wet Days (a)

Location / Statistic                                           Observed  Conventional MoF (b)  State-Based MoF

Sydney Airport
Pr(X_{t,24} > 0, R_{t+1} > 0 | R_t > 0)                        19.9%     13.1%                 18.3%
Pr(X_{t,24} > 0, R_{t+1} = 0 | R_t > 0)                        5.2%      11.2%                 6.4%
Pr(X_{t,24} = 0, R_{t+1} > 0 | R_t > 0)                        28.0%     34.9%                 29.7%
Pr(X_{t,24} = 0, R_{t+1} = 0 | R_t > 0)                        46.9%     40.8%                 45.6%
Pr(X_{t,24} > 0, X_{t+1,1} > 0 | R_t > 0, R_{t+1} > 0)         31.7%     6.7%                  14.7%

Perth Airport
Pr(X_{t,24} > 0, R_{t+1} > 0 | R_t > 0)                        21.6%     15.9%                 21.3%
Pr(X_{t,24} > 0, R_{t+1} = 0 | R_t > 0)                        5.2%      11.7%                 7.4%
Pr(X_{t,24} = 0, R_{t+1} > 0 | R_t > 0)                        30.9%     36.5%                 31.1%
Pr(X_{t,24} = 0, R_{t+1} = 0 | R_t > 0)                        42.3%     35.9%                 40.2%
Pr(X_{t,24} > 0, X_{t+1,1} > 0 | R_t > 0, R_{t+1} > 0)         26.7%     6.8%                  14.3%

Alice Springs (gage 015590)
Pr(X_{t,24} > 0, R_{t+1} > 0 | R_t > 0)                        16.3%     6.5%                  11.3%
Pr(X_{t,24} > 0, R_{t+1} = 0 | R_t > 0)                        6.7%      7.9%                  4.0%
Pr(X_{t,24} = 0, R_{t+1} > 0 | R_t > 0)                        24.8%     34.6%                 29.7%
Pr(X_{t,24} = 0, R_{t+1} = 0 | R_t > 0)                        52.2%     51.0%                 55.0%
Pr(X_{t,24} > 0, X_{t+1,1} > 0 | R_t > 0, R_{t+1} > 0)         28.6%     2.4%                  8.3%

Cairns (gage 031011)
Pr(X_{t,24} > 0, R_{t+1} > 0 | R_t > 0)                        21.2%     15.4%                 19.8%
Pr(X_{t,24} > 0, R_{t+1} = 0 | R_t > 0)                        3.6%      6.4%                  4.3%
Pr(X_{t,24} = 0, R_{t+1} > 0 | R_t > 0)                        44.0%     49.9%                 45.4%
Pr(X_{t,24} = 0, R_{t+1} = 0 | R_t > 0)                        31.2%     28.4%                 30.5%
Pr(X_{t,24} > 0, X_{t+1,1} > 0 | R_t > 0, R_{t+1} > 0)         20.9%     5.3%                  8.9%

Hobart (gage 094008)
Pr(X_{t,24} > 0, R_{t+1} > 0 | R_t > 0)                        14.0%     9.6%                  14.5%
Pr(X_{t,24} > 0, R_{t+1} = 0 | R_t > 0)                        4.4%      10.4%                 5.6%
Pr(X_{t,24} = 0, R_{t+1} > 0 | R_t > 0)                        30.1%     34.5%                 29.6%
Pr(X_{t,24} = 0, R_{t+1} = 0 | R_t > 0)                        51.4%     45.5%                 50.2%
Pr(X_{t,24} > 0, X_{t+1,1} > 0 | R_t > 0, R_{t+1} > 0)         23.6%     4.5%                  10.9%

(a) The first four rows provide the probability that the last hour of day t (X_{t,24}) is wet/dry given that the next day t + 1 is wet/dry, with the probabilities summing to 100%. The fifth row is the probability that the last hour of day t and the first hour of day t + 1 are both wet.
(b) MoF, Method of Fragments.
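
The connectivity statistics in this table can be estimated by simple counting over consecutive days of hourly rainfall. The following Python sketch is a minimal illustration, assuming the record has already been aggregated to 24 hourly depths per day with no missing days; the function name and return layout are hypothetical.

```python
def connectivity(hourly_by_day):
    """Estimate between-day connectivity from consecutive days of hourly data.

    hourly_by_day -- list of days, each a list of 24 hourly rainfall depths
    Returns (joint, boundary):
      joint    -- {(last_hour_wet, next_day_wet): probability}, conditional
                  on day t being wet (rows 1 to 4 of the table)
      boundary -- probability that the last hour of day t and the first hour
                  of day t + 1 are both wet, given both days are wet (row 5)
    """
    counts = {(lw, nw): 0 for lw in (True, False) for nw in (True, False)}
    n_wet = n_wet_pairs = n_joined = 0
    for today, tomorrow in zip(hourly_by_day, hourly_by_day[1:]):
        if sum(today) <= 0:
            continue  # condition on day t being wet (R_t > 0)
        n_wet += 1
        last_wet = today[-1] > 0      # X_{t,24} > 0
        next_wet = sum(tomorrow) > 0  # R_{t+1} > 0
        counts[(last_wet, next_wet)] += 1
        if next_wet:
            n_wet_pairs += 1
            if last_wet and tomorrow[0] > 0:  # X_{t+1,1} > 0
                n_joined += 1
    joint = {key: c / n_wet for key, c in counts.items()} if n_wet else {}
    boundary = n_joined / n_wet_pairs if n_wet_pairs else float("nan")
    return joint, boundary
```

Applying the same function to the observed record and to each simulated sequence would give values comparable in form to the three data columns of the table.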

This has been calculated for the observed record as well as for the conventional and state-based implementations of the method of fragments logic. As can be seen, the state-based logic yields a significant improvement compared with the conventional method of fragments. In particular, the probability that the last hour of the day is wet is underestimated by the conventional method of fragments when the next day is wet, and overestimated when the next day is dry, for all five locations.
[54] The fifth row then summarizes the probability that the last hour of day t, and the first hour of day t + 1, are both wet for successive wet days. As can be seen, this is dramatically underestimated for both the conventional and state-based method of fragments, highlighting that the temporal patterns on the boundary between wet days are likely to be less continuous for the simulated data compared with the observations. Nevertheless, the state-based method of fragments provides a significant improvement compared with the conventional algorithm, highlighting the advantages of moving to the state-based logic.

5. Discussion and Conclusions
[55] In this paper, a framework was described where continuous (6-min increment) rainfall can be generated at any location of interest provided that daily data is either available or can be synthetically generated. The basis of this approach is to randomly draw subdaily fragments from nearby pluviograph stations conditional on the daily rainfall amount and the previous- and next-day wetness state at the target station. The identification of nearby stations is based on a distance metric which considers latitude and longitude as well as elevation and distance to coast, with the relative importance of each variable determined by looking at the similarity in the daily-to-subdaily scaling at 232 long-record pluviograph stations across Australia.
[56] The approach sought to address several important limitations associated with the Australian pluviograph record. First, compared to daily rainfall data, there are approximately one order of magnitude fewer pluviograph stations, and the records at each station are usually much shorter than their daily read counterparts. Thus, by combining longer, more abundant, and more reliable daily data at the target location with the information contained in a number of pluviograph records in the neighborhood of the target location, it is possible to make the best use of both types of data. Second, by drawing records from multiple nearby pluviograph records rather than relying on a single record, it is also possible to consider information from records only several years long, which would usually be discarded as being too short for meaningful analysis. Finally, pluviograph data flagged as missing or unreliable can simply be discarded from the analysis, even for cases where there is a systematic bias in the missing data (e.g., pluviograph recording tends to


fail during major storm events). This is because, provided the daily rainfall data are reliable, and there are sufficient data at other pluviograph stations to capture a diversity of rainfall events across a range of magnitudes, such possible systematic pluviograph recording biases are unlikely to be translated into the final synthetically generated sequences.
[57] The evaluation of the method on a range of statistics which are relevant for flood estimation, notably the annual maximum statistics and the antecedent rainfall prior to the flood-producing storm burst, suggests that the method compares reasonably well with at-site data for the five test locations considered. In particular, no significant deterioration in the results could be observed when moving from the at-site method of fragments to the regionalized version, suggesting that the regionalized version properly represents the at-site variability. Furthermore, the sampling intervals for the regionalized version are likely to more reasonably reflect the true variability of the data, with widening sampling intervals for lower exceedance probability (and thus higher magnitude) events; although, as discussed in the context of the results of Table 4, this variability may still be underestimated. This also highlights that the regionalized method is able to provide a much greater diversity of extreme rainfall sequences (and associated temporal patterns) than what has been observed at any one point location, with this in turn likely to yield more robust flood-frequency results when the continuous rainfall sequences are run through a continuous rainfall-runoff model.
[58] We also looked at the connectivity in the temporal patterns between successive wet days, which represents one of the most obvious limitations of the method of fragments logic. In general, the state-based logic proposed here results in a notable improvement in connectivity, although it is clear that the method is unable to reproduce observed connectivity exactly. Nevertheless, the implications for applications such as flood estimation are unclear. For example, if the method is able to reproduce within-day temporal patterns, preserves annual maximum rainfall and associated antecedent conditions, and maintains the daily total rainfall depths, then the effect of some discontinuities on flood estimates is unlikely to be large. The use of these generated sequences as an input for continuous rainfall-runoff modeling would be one way to test this issue, and is an area which we plan to investigate further.
[59] We note that like most continuous simulation algorithms, the objective of our method is to preserve various statistics of historical rainfall variability. We have addressed nonstationarity in the daily-to-subdaily scaling as a result of seasonal fluctuations by selecting fragments from within the same season. We do not expect that nonstationarity issues due to inter-annual variability of rainfall are likely to result in major distortions to the fidelity of the generated continuous sequences, since much of this variability results in changes to wet day occurrences and daily rainfall amounts rather than sub-daily temporal patterns [Pui et al., 2011b]. Thus, this should be accounted for by the daily rainfall simulation algorithm rather than the daily to subdaily disaggregation approach. Finally, there is an increased interest in nonstationarity of rainfall (and other hydroclimatic sequences) as a result of anthropogenic climate change [e.g., Milly et al., 2008]. In particular, the associated increases in temperature may be expected to yield more intense rainfall bursts for a given daily rainfall amount [e.g., see Hardwick-Jones et al., 2010; Lenderink and van Meijgaard, 2008; Lenderink et al., 2011; Westra and Sisson, 2011]; however, explicitly addressing this issue is reserved for future research.
[60] Although daily data is much more abundant than pluviograph data across Australia, in many regions the length or reliability of daily rainfall may not be sufficient for the stochastic generation of rainfall sequences. This is the subject of the next paper, in which the approach presented here is generalized to any location in Australia, regardless of the availability of daily or pluviograph data.

[61] Acknowledgments. This study was supported by an Australian Research Council Discovery grant as well as a research grant from the Institution of Engineers, Australia to help develop continuous rainfall sequences for design flood estimation. The daily and continuous rainfall records used were obtained from the Australian Bureau of Meteorology. Finally, we wish to thank Geoff Pegram and two anonymous reviewers, whose comments and suggestions have greatly improved the quality of the manuscript.

References
Blazkova, S., and K. Beven (2002), Flood frequency estimation by continuous simulation for a catchment treated as ungauged (with uncertainty), Water Resour. Res., 38(8), 1139, doi:10.1029/2001WR000500.
Boughton, W., and O. Droop (2003), Continuous simulation for design flood estimation: a review, Environ. Model. Software, 18(4), 309–318.
Cameron, D., K. Beven, J. Tawn, and P. Naden (2000), Flood frequency estimation by continuous simulation (with likelihood based uncertainty estimation), Hydrol. Earth Syst. Sci., 4(1), 23–34.
Cowpertwait, P. S. P., and P. E. O’Connell (1997), A regionalised Neyman-Scott model of rainfall with convective and stratiform cells, Hydrol. Earth Syst. Sci., 1, 71–80.
Cowpertwait, P. S. P., P. E. O’Connell, A. V. Metcalfe, and J. A. Mawdsley (1996), Stochastic point process modelling of rainfall. II. Regionalisation and disaggregation, J. Hydrol., 175, 47–65.
Cowpertwait, P. S. P., V. Isham, and C. Onof (2007), Point process models of rainfall: Developments for fine-scale structure, Proc. R. Soc. A and B, 463(2086), 2569–2588.
Fasano, G., and A. Franceschini (1987), A multidimensional version of the Kolmogorov-Smirnov test, Monthly Notices of the Royal Astronomical Society, 225, 155–170.
Frost, A. J., R. Srikanthan, and P. S. P. Cowpertwait (2004), Stochastic generation of rainfall data at subdaily timescales: A comparison of DRIP and NSRP, Rep. 04/9, pp. 1813–1819, CRC, Salisbury South, Australia.
Gupta, V. K., and E. C. Waymire (1993), A statistical analysis of mesoscale rainfall as a random cascade, J. Appl. Meteorol., 32, 251–267.
Gyasi-Agyei, Y. (1999), Identification of regional parameters of a stochastic model for rainfall disaggregation, J. Hydrol., 223, 148–163.
Gyasi-Agyei, Y., and S. M. Parvez Bin Mahbub (2007), A stochastic model for daily rainfall disaggregation into fine time scale for a large region, J. Hydrol., 347, 358–370.
Gyasi-Agyei, Y., and G. R. Willgoose (1997), A hybrid model for point rainfall modelling, Water Resour. Res., 33(7), 1699–1706.
Gyasi-Agyei, Y., and G. R. Willgoose (1999), Generalisation of a hybrid model for point rainfall, J. Hydrol., 219(3–4), 218–224.
Hardwick-Jones, R., S. Westra, and A. Sharma (2010), Observed relationships between extreme sub-daily precipitation, surface temperature and relative humidity, Geophys. Res. Lett., 37, L22805, doi:10.1029/2010GL045081.
Hastie, T., R. Tibshirani, and J. Friedman (2009), The Elements of Statistical Learning: Data Mining, Inference and Prediction, 763 pp., Springer, Berlin, Germany.
Koutsoyiannis, D., and C. Onof (2001), Rainfall disaggregation using adjusting procedures on a Poisson cluster model, J. Hydrol., 246(1–4), 109–122.
Kuczera, G., M. Lambert, T. M. Heneker, S. Jennings, A. J. Frost, and P. J. Coombes (2006), Joint probability and design storms at the crossroads, Aust. J. Water Resour., 10(1), 63–80.
Lall, U., and A. Sharma (1996), A nearest neighbour bootstrap for resampling hydrological timeseries, Water Resour. Res., 32, 679–693.


Lamb, R., and A. L. Kay (2004), Confidence intervals for a spatially generalized, continuous simulation flood frequency model for Great Britain, Water Resour. Res., 40(7), W07501, doi:10.1029/2003WR002428.
Lenderink, G., and E. van Meijgaard (2008), Increase in hourly precipitation extremes beyond expectations from temperature changes, Nat. Geosci., 1, 511–514.
Lenderink, G., H. Y. Mok, T. C. Lee, and G. J. Van Oldenborgh (2011), Scaling and trends of hourly precipitation extremes in two different climate zones – Hong Kong and the Netherlands, Hydrol. Earth Syst. Sci., 8, 4701–4719.
Lovejoy, S., and D. Schertzer (1990), Multifractals, universality classes, and satellite and radar measurements of cloud and rain fields, J. Geophys. Res., 95, 2021–2031.
Marshak, A., A. Davis, R. Cahalan, and W. Wiscombe (1994), Bounded cascade models as nonstationary multifractals, Phys. Rev. E, 49(1), 55–69.
Mehrotra, R., and A. Sharma (2006), Conditional resampling of hydrologic time series using multiple predictor variables: A K-nearest neighbour approach, Adv. Water Resour., 29, 987–999.
Menabde, M., D. Harris, A. W. Seed, G. Austin, and D. Stow (1997), Multiscaling properties of rainfall and bounded random cascades, Water Resour. Res., 33(12), 2823–2830.
Milly, P. C. D., J. Betancourt, M. Falkenmark, R. M. Hirsch, W. Zbigniew, Z. W. Kundzewicz, D. P. Lettenmaier, and R. J. Stouffer (2008), Stationarity is dead: Whither water management?, Science, 319, 573–574.
Nowak, K., J. Prarie, B. Rajagopalan, and U. Lall (2010), A nonparametric stochastic approach for multisite disaggregation of annual to daily streamflow, Water Resour. Res., 46, W08529, doi:10.1029/2009WR008530.
Press, W. H., S. A. Teukolsky, W. T. Vetterling, and B. P. Flannery (1992), Numerical Recipes in Fortran: The Art of Scientific Computing, 2nd ed., 963 pp., Cambridge Univ. Press, Cambridge, Mass.
Pui, A., A. Lall, and A. Sharma (2011a), How does the Interdecadal Pacific Oscillation affect design floods in Australia?, Water Resour. Res., 47, W05554, doi:10.1029/2010WR009420.
Pui, A., S. Westra, A. Santoso, and A. Sharma (2011b), Impact of the El Niño Southern Oscillation, Indian Ocean Dipole, and Southern Annular Mode on daily to sub-daily rainfall characteristics in East Australia, Monthly Weather Review, in press.
Rodriguez-Iturbe, I., D. Cox, and V. Isham (1987), Some models for rainfall based on stochastic point processes, Proc. R. Soc. A and B, 410, 269–288.
Rodriguez-Iturbe, I., D. Cox, and V. Isham (1988), A point process model for rainfall: Further developments, Proc. R. Soc. A and B, 417, 283–298.
Schertzer, D., and S. Lovejoy (1987), Physical modelling and analysis of rain and clouds by anisotropic scaling multiplicative processes, J. Geophys. Res., 92, 9693–9714.
Sharma, A., and R. Srikanthan (2006), Continuous rainfall simulation: A nonparametric alternative, in 30th Hydrology and Water Resour. Symp., Launceston, Tasmania.
Sharma, A., D. G. Tarboton, and U. Lall (1997), Streamflow simulation: a nonparametric approach, Water Resour. Res., 33(2), 291–308.
Snavidze, G. G. (1977), Mathematical Modeling of Hydrologic Series, 314 pp., Water Resour. Publ., Littleton, Colo.
Tarboton, D. G., A. Sharma, and A. Lall (1998), Disaggregation procedures for stochastic hydrology based on nonparametric density estimation, Water Resour. Res., 34(1), 107–119.
Verhoest, N., P. Troch, and F. D. Troch (1997), On the applicability of Bartlett-Lewis rectangular pulse models in the modelling of design storms at a point, J. Hydrol., 202, 108–120.
Westra, S., and S. A. Sisson (2011), Detection of non-stationarity in precipitation extremes using a max-stable process model, J. Hydrol., 406, 119–128.

R. Mehrotra and A. Sharma, School of Civil and Environmental Engineering, University of New South Wales, Sydney, NSW 2052, Australia. ([email protected])
R. Srikanthan, Water Division, Australian Bureau of Meteorology, G.P.O. Box 1289, Melbourne, Victoria 3001, Australia.
S. Westra, School of Civil, Environmental, and Mining Engineering, University of Adelaide, SA 5005, Australia.
