
Kavish Daya and Darmikah Pather Final IP Report

This report investigates how sensitive the performance of a daily variable length bootstrap stochastic rainfall generator is to variations in two subjective parameters: the number of nearest neighbor pairs selected and the number of years used for disaggregation. Daily rainfall data from 1957-2013 from two stations in Gauteng and two in Kwa-Zulu Natal were analyzed. 100 stochastic series were generated for different parameter pairs and evaluated based on percentage deviation of statistics from historic values, deviation from the median, and average composite deviation. Results found that no parameter pair was optimal for a specific climate and the generator was not highly sensitive to parameter changes, though it may not be suitable for arid regions with low rainfall occurrence.

Uploaded by

Kavish Daya

CIVN4005A-Investigational Project Report

CIVN4005A-Investigational Project 2023 Report

Parameter selection for a daily stochastic rainfall generator

Kavish Daya - 2307722
Darmikah Pather - 2308785

Supervisor: John Ndiritu


Table of Contents

Abstract
List of figures
1. Introduction
1.2 Aim and objectives
1.3 Research Questions
1.4 Organization of the report
2. Literature Review
2.1 Introduction
2.2 Hydrological Stochastic generator applications
2.3 Non-Parametric and Parametric models
2.4 Climate change and variability modelling
2.5 Variable Length Bootstrap method
2.6 Daily stochastic rainfall data
2.7 Daily VLB stochastic generator
2.8 Limitations of subjective parameters in the daily stochastic rainfall VLB generator
3. Research Method
3.1 Introduction
3.2 Tasks that will be carried out
3.3 Daily VLB stochastic generation procedure
3.4 Analysis of historical observed data
3.4.1 Gauteng rainfall stations
3.4.2 Kwa-Zulu Natal rainfall stations
4. Performance Evaluation
Gauteng Evaluation
Kwa-Zulu Natal Evaluation
5. Conclusions and Recommendations
6. References


Abstract
This research project investigates how sensitive the performance of the daily VLB
stochastic generator is to variations in two parameters that are subjectively selected to
disaggregate stochastic rainfalls from an annual to a daily timestep. The two subjective
parameters are the number of pairs of nearest neighbours to select and the number of
years out of these to use for the disaggregation. Fifty-seven years of daily rainfall data
from the South African Weather Service were used, covering two stations in Gauteng
(Johannesburg Leeukop and Johannesburg Turffontein) and two stations in Kwa-Zulu Natal
(Durban Heights and Ngome-Bos). Using the daily VLB stochastic generator, 100 stochastic
series were generated for each of a number of parameter pairs, where a pair consists of the
number of pairs of nearest neighbours selected and the number of years out of these used
for the disaggregation. These pairs are the subjective parameter values on which this
research is based. Three main performance evaluations were used. The first is the
percentage deviation of median values from historic values for statistical measures such as
the mean, standard deviation and skewness. The second is based on the deviation of the
historic statistic from the median of the 100 stochastic values. The third is an average
composite deviation. In addition, boxplots were used to evaluate the performance for
certain statistical measures: when the historic statistic lies within the interquartile range
of the stochastic data, the replication was deemed acceptable. It was found that no pair of
the disaggregation parameters is optimal for a specific climate and that the daily VLB
stochastic generator is not sensitive to changes in the subjective parameters. The largest
caveat of the daily VLB stochastic generator is that it cannot be used to generate stochastic
data for an arid region where the occurrence of rainfall is too low for some of the
computational steps of the method.
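
The evaluation measures named above can be illustrated with a minimal Python sketch. The
exact formulas used in the study are not given at this point in the report, so the
definitions below (a signed percentage deviation relative to the historic value, an
interquartile-range check, and a mean of absolute deviations as the composite) are
assumptions, and all function names are hypothetical.

```python
import statistics


def percentage_deviation(stochastic_stats, historic_stat):
    """Signed percentage deviation of the median stochastic statistic
    from the historic statistic (assumed definition)."""
    median = statistics.median(stochastic_stats)
    return 100.0 * (median - historic_stat) / historic_stat


def within_interquartile_range(stochastic_stats, historic_stat):
    """True when the historic statistic lies between the 25th and 75th
    percentiles of the stochastic values (the boxplot acceptance check)."""
    q1, _, q3 = statistics.quantiles(stochastic_stats, n=4, method="inclusive")
    return q1 <= historic_stat <= q3


def composite_deviation(deviations):
    """Mean of the absolute percentage deviations over several statistics
    (assumed form of the average composite deviation)."""
    return sum(abs(d) for d in deviations) / len(deviations)
```

In use, `stochastic_stats` would hold one value of a statistic (for example the mean daily
rainfall) for each of the 100 generated series, and `historic_stat` the same statistic
computed from the observed record.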


List of figures
Table 1: Rainfall station characteristics
Figure 1: Historic daily rainfall.txt
Figure 2: VLBSettingsandfiles.txt file
Figure 3: RainAndRainDays.txt
Figure 4: Stochastic monthly generated rainfall
Figure 5: Stochastic daily generated rainfall
Figure 6: Stochastic daily generated rainfall with updated neighbouring pairs
Figure 7: Summary of VLB generator process
Figure 8: Leeukop Station obtained from Google Earth Pro
Figure 9: Average and daily rainfall for Leeukop
Figure 10: Daily rainfalls for the 57-year period in Leeukop
Figure 11: Average daily rainfall per month in Leeukop
Figure 12: Turffontein Station obtained from Google Earth Pro
Figure 13: Average daily rainfall depths observed from the historic data for Turffontein
Figure 14: Daily rainfalls for the 57-year period in Turffontein
Figure 15: Average daily rainfall per month in Turffontein
Figure 16: Ngome-Bos station in Kwa-Zulu Natal province
Figure 17: Mean historic daily rainfall per year
Figure 18: Historic daily rainfall
Figure 19: Monthly Mean daily rainfall box plots over a 57 year period for Ngome-Bos
Figure 20: Google Earth Pro image of Durban Heights station
Figure 21: Mean historic daily rainfall per year
Figure 22: Daily historic rainfalls over a 57 year period
Figure 23: Mean daily rainfall per month over the 57-year period in Durban Heights
Figure 24: Leeukop performance based on historic value
Figure 25: Leeukop performance based on deviation from median
Figure 26: Skewness boxplot for Leeukop
Figure 27: Composite average deviation for Leeukop
Figure 28: Turffontein performance based on historic value
Figure 29: Turffontein performance based on deviation from median
Figure 30: Mean boxplot for Turffontein
Figure 31: Composite average deviation for Turffontein
Figure 32: Durban Heights performance based on historic deviation
Figure 33: Durban Heights performance based on deviation from median
Figure 34: Proportion of rainfall days boxplot for Durban Heights
Figure 35: Composite average deviation for Durban Heights
Figure 36: Ngome Bos performance based on historic deviation
Figure 37: Ngome Bos performance based on deviation from median
Figure 38: Standard deviation boxplot for Ngome Bos
Figure 39: Composite average deviation for Ngome Bos


1. Introduction
Rainfall is the main input to the hydrological cycle and the primary source of water.
The hydrological cycle allows for the continuous movement of water across the Earth.
Because rainfall is such an important part of the hydrological system, several studies have
sought to predict rainfall patterns (Chapman, 1994). Variability in rainfall patterns
affects human activities such as farming and mining. The unpredictability of rainfall
patterns on a global scale due to climate change has made it crucial to incorporate
uncertainty into the prediction of future rainfall patterns for use in hydrological
evaluations and planning.

In hydrology, it is common to simulate rainfall events using stochastic data generation.
A stochastic rainfall model creates synthetic rainfall data based on the statistical
properties of past rainfall data (Ye, 2018). To generate stochastic data, a probability
distribution function (PDF) based on the historical rainfall data is required. The PDF
ensures that the synthetic data produced is closely related to the observed rainfall data,
and it helps to incorporate the uncertainty of rainfall patterns and climate change into
the analysis.

Stochastic data generation allows historical data records to be extended beyond the
current data records and is particularly useful in areas where monitoring networks are
sparse (Benoit, 2022). Furthermore, stochastically generated data can be used to study the
impact of climate change on rainfall patterns, which are expected to vary in duration and
intensity in different regions worldwide (Seifossadat, 2022). By simulating these changes,
hydrologists can assess the potential effects on local and regional water resources, which
affect infrastructure planning and water management decisions. It is therefore important to
incorporate stochastic data generation into a hydrological rainfall model, as it helps with
the simulation of rainfall events. Probability distribution functions allow the synthetic
data to reflect uncertainty while remaining accurate. Despite the vast amount of research
conducted globally on stochastic rainfall generation, there has been limited use of
observed rainfall data in African regions for such development (Pegram and Clothier, 2002).
Within South Africa, climate change is a significant issue that is currently overlooked by
the Department of Water Affairs.

One model used in South Africa that accounts for climate change is the SERGE tool, which is
designed for a semi-arid environment (Wiegand, 2008). SERGE is a parametric daily rainfall
generator that employs the Zucchini rainfall model (Zucchini et al., 1992) and focuses on
examining the impact of spatial rainfall variability using several randomly located rain
clouds. It has 16 parameters and has been calibrated on daily rainfall data from across
South Africa. The model has been shown to depict temporal rainfall patterns accurately
(Wiegand, 2008).

Southern Africa experiences large temporal and spatial variability in rainfall. This
variability influences water resources, agriculture, and the economy, and leads to extreme
floods and droughts (Tjebane, 2022). A few daily stochastic rainfall generation models have
been used to study Southern African rainfall data. Three comparable models that utilize
Southern African rainfall data are the WGEN (Weather Generator) method, the Transition
Probability Matrix (TPM) method and the Variable Length Bootstrap (VLB) method.

The WGEN method was developed by Richardson and Wright (1984). It is a parametric model
that provides daily values for maximum and minimum temperature, solar radiation, and
precipitation (Tjebane, 2022). It considers each variable's persistence, dependency, and
seasonality. WGEN's precipitation model is a Markov chain-gamma model, in which a
first-order Markov chain produces the occurrence of rainy or dry days. This method tends to
overestimate daily values when compared to historical data, while the monthly average
precipitation values are relatively accurate. WGEN produced lower maximum values for daily,
monthly, and annual averages than the historic data. Because WGEN requires a long time
series of daily rainfall data, it tends to either over- or underestimate daily
precipitation values; however, it is much more accurate for monthly rainfall generation.
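
The Markov chain-gamma idea can be shown with a minimal Python sketch, assuming a
two-state (wet/dry) first-order chain and gamma-distributed rainfall depths on wet days.
This is not WGEN itself: the function name, the fixed (non-seasonal) parameters and the
assumption that the series starts after a dry day are all illustrative.

```python
import random


def generate_daily_rainfall(n_days, p_wet_given_dry, p_wet_given_wet,
                            gamma_shape, gamma_scale, seed=None):
    """Generate daily rainfall depths with a first-order Markov chain for
    wet/dry occurrence and a gamma distribution for wet-day amounts."""
    rng = random.Random(seed)
    rainfall = []
    wet = False  # assumption: the day before the series starts was dry
    for _ in range(n_days):
        # The chance of rain today depends only on yesterday's state.
        p_wet = p_wet_given_wet if wet else p_wet_given_dry
        wet = rng.random() < p_wet
        depth = rng.gammavariate(gamma_shape, gamma_scale) if wet else 0.0
        rainfall.append(depth)
    return rainfall
```

For example, `generate_daily_rainfall(365, 0.2, 0.6, 2.0, 4.0, seed=1)` returns one
synthetic year in which a wet day is three times as likely after a wet day as after a dry
one. A seasonal model would use a separate parameter set for each month.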

The Variable Length Bootstrap (VLB) is a non-parametric method for producing stochastic
data (Ndiritu, 2011). It has shown equivalent, and occasionally superior, performance to the
STOMSA parametric model (Phakula et al., 2018; Pegram and Clothier, 2002), which is
commonly used in South African water resource planning and yield analysis. Professor John
Ndiritu created the VLB method in 2011 for generating monthly streamflow data (Ndiritu,
2011), and Ndiritu and Nyaga adapted it in 2014 to generate monthly stochastic rainfall
data. The effectiveness of this approach is assessed by comparing the upper and lower bounds
of the VLB-generated annual flows. When VLB was compared with STOMSA, both approaches were
found to reproduce the mean monthly flows and standard deviations accurately, although
STOMSA tends to underestimate the standard deviation, whereas VLB does not (Tjebane, 2022).
The VLB stochastic generation model has enabled a more comprehensive and dependable
incorporation of risk and uncertainty in hydrological modelling and analysis. The data
produced by VLB is generally overestimated when compared to the historic data and has a
higher variability than WGEN. VLB did not generate maximum values greater than those in the
historic data, implying that it is rather conservative. During dry and wet seasons, the
synthetic data produced was representative of the historic data.

The Transition Probability Matrix (TPM) method is a multi-state Markov chain model that
generates synthetic sequences of daily rainfalls. The model takes the rainfall state on one
day and determines the probabilities of each state on the next day. Data is collected for
several days and then collated into a transition probability matrix. For annual data, 12
TPMs are required, one for each month (Tjebane, 2022). TPM tends to underestimate large
variances in annual data and is therefore not suitable for areas prone to drought or
flooding (Boughton, 1999). The TPM method generates daily rainfall depths that are closer to
historic values and has a higher variability than the Weather Generator (WGEN) method.
Tjebane (2022) recommends that, for areas prone to floods and droughts, VLB or TPM is better
suited, as the data produced by these models is more conservative.
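
How a transition probability matrix is collated from daily data can be sketched in Python
as follows. The state definition (state 0 for a dry day, then wet-day classes separated by
depth bounds in mm) and the function names are assumptions made for illustration; the
class limits used in published TPM models may differ.

```python
import bisect


def rainfall_state(depth_mm, upper_bounds):
    """Map a daily depth to a state: 0 is dry, 1..len(upper_bounds) are the
    bounded wet classes, and the last state holds depths above all bounds."""
    if depth_mm <= 0.0:
        return 0
    return 1 + bisect.bisect_left(upper_bounds, depth_mm)


def transition_probability_matrix(daily_rainfall, upper_bounds):
    """Count day-to-day state transitions and normalise each row so that
    entry [i][j] estimates P(state j tomorrow | state i today)."""
    n_states = len(upper_bounds) + 2
    counts = [[0] * n_states for _ in range(n_states)]
    states = [rainfall_state(d, upper_bounds) for d in daily_rainfall]
    for today, tomorrow in zip(states, states[1:]):
        counts[today][tomorrow] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # States never observed keep a row of zeros.
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix
```

Calling `transition_probability_matrix` on the daily depths of a single calendar month
(pooled over all years of record) gives one matrix; repeating this for each month would
give the twelve matrices mentioned above.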

The VLB daily stochastic generator developed by Professor Ndiritu in 2022 generates daily
stochastic rainfall data by disaggregating the annual stochastic rainfalls generated by the
VLB stochastic rainfall generator of Ndiritu and Nyaga (2014) (Ndiritu, 2022). The daily VLB
stochastic generator identifies nearest neighbours in the historic rainfall sequence based
on annual rainfall magnitude. The nearest neighbours are historic annual rainfalls whose
values are close to the stochastic annual rainfall. The number of neighbours selected is a
subjective modelling parameter. From the selected neighbours, a subset is chosen; this is
the number of years selected to supply the initial daily rainfall amounts for set parts of
the year, and it is also a subjective parameter. A more systematic determination of these
subjective parameters is crucial to ensuring the reliability and accuracy of the generated
daily stochastic rainfalls for hydrological modelling applications.
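
The two subjective parameters can be made concrete with a minimal Python sketch. It ranks
historic years by how close their annual totals are to a stochastic annual total, keeps the
nearest ones, and then draws a subset of years from them. The function names and data are
hypothetical, drawing the subset at random is an assumption, and the pairing of neighbours
used by the actual daily VLB generator is not reproduced here.

```python
import random


def nearest_neighbour_years(stochastic_annual, historic_annual, n_neighbours):
    """Return the n_neighbours years whose historic annual rainfall is
    closest to the stochastic annual rainfall, nearest first."""
    ranked = sorted(
        historic_annual,
        key=lambda year: abs(historic_annual[year] - stochastic_annual),
    )
    return ranked[:n_neighbours]


def select_disaggregation_years(neighbours, n_years, seed=None):
    """Pick n_years of the neighbours to supply daily rainfall patterns
    (random choice is an assumption of this sketch)."""
    rng = random.Random(seed)
    return rng.sample(neighbours, n_years)


# Hypothetical annual totals in mm, keyed by year.
historic = {1957: 600.0, 1958: 720.0, 1959: 810.0, 1960: 690.0, 1961: 550.0}
neighbours = nearest_neighbour_years(700.0, historic, n_neighbours=3)
chosen = select_disaggregation_years(neighbours, n_years=2, seed=0)
```

Here `n_neighbours` and `n_years` play the role of the two parameters whose influence on
generator performance this report investigates.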

1.2 Aim and objectives

This research project aims to investigate how sensitive the performance of the daily VLB
stochastic generator is to variations in the two subjective parameters, namely the number
of pairs of nearest neighbours and the number of years selected out of these pairs to be
used in the disaggregation from an annual to a daily time step.

1.3 Research Questions

1. How does subjective parameter selection influence the performance of the VLB daily
stochastic generator?

2. Do optimal parameter values of the VLB daily stochastic generator exist?

3. Do optimal parameter values exist for the range of climatic conditions assessed by the
daily VLB generator?

1.4 Organization of the report


This research report is presented as follows:

Chapter 1 introduces stochastic data generation and the different types of generators
available. It also discusses the generator used to conduct this research.

Chapter 2 presents a literature review on stochastic daily rainfall generation. It covers
the uses and importance of daily stochastic rainfall generation, as well as parametric and
non-parametric stochastic generators.


Chapter 3 provides the VLB daily stochastic generator procedure followed by the
methodology used to carry out the research. The methodology is split into steps that were
followed during this research and the reasoning behind each step.

Chapter 4 presents the performance evaluation for all stations in Gauteng and Kwa-Zulu
Natal, including graphs produced in Microsoft Excel for each subjective parameter pair.
This chapter also analyses and discusses the findings from the performance evaluation.

Chapter 5 provides the conclusions and recommendations based on the findings presented
in chapter 4.

2. Literature Review


2.1 Introduction

Rainfall is a driving force for many hydrologic processes (Zhao & Nearing, 2019). However,
the lack of reliable rainfall records hinders the development of hydrologic research and
applications. Developments in stochastic hydrological generation can improve our
understanding of water resources and support decision-making in areas such as water
management, agriculture, and mining (Hughes, 2004). Stochastic
generation involves the simulation of the random variability of hydrological variables, such
as rainfall and streamflow, using statistical models and probability theory (Ndiritu & Nyaga,
2014). The stochastic generation of hydrological data is essential for a wide range of
applications, including water resource planning, flood management, and environmental
protection (Ndiritu & Nyaga, 2014). In recent years, advancements in computing power and
data analysis techniques have led to significant improvements in hydrological stochastic
generation methods. These developments have enabled a more comprehensive and
dependable incorporation of risk and uncertainty in hydrological modelling and analysis
(Beven, 2012). However, improvements can be made to make stochastic generators more
robust and reliable.

2.2 Hydrological Stochastic generator applications


A stochastic generator is a tool that produces synthetic weather data using a computer
model. The synthetic data can be produced for any specified location, provided that certain
assumptions and parameters are defined; these parameters may include precipitation,
humidity, air pressure and radiation. Synthetic data is produced by analysing the
statistical properties of previously collected data for the specified location. The main
purpose of a stochastic generator is therefore to produce a series of climatic data values
that match the statistics of historical observations in order to forecast future
hydrological conditions (Cowden, 2008). The idea of a stochastic generator was based on a
Poisson distribution method used to generate random events. To work efficiently, this model
required parameters including, but not limited to, the period of rainfall, the intensity of
rainfall and temperature. The early versions of these models were subpar and did not
accurately predict forthcoming weather patterns, temperatures and amounts of rainfall,
since they were unable to capture nonlinear weather patterns related to climate change,
topography, and vegetation. However, with the advancement of technology, the accuracy of
these stochastic models has greatly improved. To achieve better accuracy, new generators use
much wider search criteria and focus on key areas such as spatial topography and the
temporal variability of climate conditions (Huang, Wang, Xiao, 2018).

An accurate and reliable rainfall-runoff model was developed by Keith Beven and Mike
Kirkby (Beven, 2012). Beven and Kirkby produced a surface rainfall-runoff model called
TOPMODEL, also known as the topography-based hydrological model. The model is capable of
simulating rainfall inputs from small catchment areas using a stochastic rainfall
representation. It consists of relationships between the catchment topography and the
rainfall input. The catchment is assumed to behave as a large number of interconnected soil
columns arranged in series. The model can therefore calculate the water balance of each soil
column by accounting for the infiltration or runoff of rainfall on that particular column
(Beven, 2012).

Another accurate rainfall generator is the Long Ashton Research Station Weather Generator
(LARS-WG) (Chisanga, 2017). This generator is used to obtain weather data for a single
location and must be calibrated each time before temperature and rainfall data are
obtained. It simulates three variables: maximum temperature, minimum temperature and
precipitation. LARS-WG can simulate present and future climatic conditions because it draws
on General Circulation Models (GCMs) (Chisanga, 2017). GCMs use historic geographical data
to produce different climatic responses for scenarios involving, for example, the ocean,
the global atmosphere, and greenhouse gas emissions (Sellwood, Valdes, 2005). The LARS-WG
model and several other stochastic rainfall models can be used for hydrological design
purposes, such as evaluating the likelihood of peak discharges at a dam with a basin area of
100 km², and for evaluating the probability and risk of flooding. The rainfall model is
calibrated to obtain hourly precipitation over the basin area. The model then analyses the
hourly precipitation data to identify storm events that occurred in a period of 100 years.
This data is then used to predict future dam water levels in order to anticipate flooding
and overflowing of the dam (Campo-Bescos, Sordo-Ward, 2009). A stochastic rainfall model can
also be used to determine whether a biomass environment is a niche environment, using the
variances in rainfall patterns, water availability, carbon storage and vegetation
competition within the environment (Coletti, 2013).

The use of a stochastic model depends on its type. There are two main types: parametric
and non-parametric stochastic models. A parametric stochastic model follows a parametric
probability distribution. These include one-parameter distributions such as the exponential
distribution; two-parameter distributions such as the log-normal, Weibull, gamma and
Gaussian distributions; and three-parameter distributions consisting of mixtures of
distributions such as the exponential, hybrid exponential and skewed normal distributions
(Pratikasiwi, 2022). The parametric model is used in several hydrological analyses. In
particular, it can simulate extreme rainfall characteristics within the 95th percentile
using the Weibull distribution. Another application is determining the variability of flow
distribution across a catchment area. It can also estimate minimum, maximum and average
temperatures for a tropical area on an hourly time scale.
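
To make the distinction between one- and two-parameter distributions concrete, the
following Python sketch draws synthetic wet-day rainfall depths from some of the
distributions listed above using the standard library. The function name, the parameter
names in `params` and the example values are assumptions for illustration, not fitted
values from any station.

```python
import random


def sample_wet_day_depths(distribution, params, n, seed=None):
    """Draw n synthetic wet-day depths (mm) from a named parametric
    distribution. 'exponential' needs one parameter, the others two."""
    rng = random.Random(seed)
    samplers = {
        # expovariate takes the rate, i.e. 1 / mean.
        "exponential": lambda: rng.expovariate(1.0 / params["mean"]),
        "gamma": lambda: rng.gammavariate(params["shape"], params["scale"]),
        # weibullvariate takes the scale first, then the shape.
        "weibull": lambda: rng.weibullvariate(params["scale"], params["shape"]),
        "lognormal": lambda: rng.lognormvariate(params["mu"], params["sigma"]),
    }
    draw = samplers[distribution]
    return [draw() for _ in range(n)]
```

For example, `sample_wet_day_depths("weibull", {"scale": 6.0, "shape": 0.8}, 1000)` gives a
heavy-tailed sample of the kind used to represent extreme rainfall, whereas the exponential
case is fully described by its mean.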

The Markov chain stochastic model is a non-parametric model and is further classified as a
comparison type of model; for example, the model can determine whether a day is considered
dry or wet based on the relationship between the current and previous day's data. This data
includes parameters such as temperature, humidity, air pressure and air density. A
first-order Markov chain is used in the WGEN model to describe the probability of a wet or
dry day given whether the previous day was wet or dry (Zhang, Singh, Gagnon, 2013). This
model was written in Fortran; it generates rainfall data and then simulates the minimum and
maximum precipitation and air temperature.

Another model is the variable length block (VLB) generator (Ndiritu, Nyaga, 2014). The VLB
is a non-parametric stochastic model that was initially developed as a streamflow generator
(Ndiritu, 2011) and later adapted for stochastic rainfall generation. The model is designed
to incorporate historic spatial and temporal characteristics of rainfall. The VLB method
allows for the disaggregation of fragments using the weighted method and perturbing
bootstrapped annual flows. Ndiritu (2011) found that the VLB generator overestimates
minimum flows, and that an effort to enhance the reproduction of these flows resulted in a
significant underestimation.

Furthermore, the synthetic streamflow generator is a stochastic streamflow generator
developed by R.S. Mckenzie in 1991. It was applied to the Vaal River System, and several
tests were carried out to verify its performance. The tests indicated that, when dealing
with flow data that has a significant skew, it is necessary to use resilient techniques for
fitting probability models. The streamflow model is a multisite model built in three
stages, an idea based on Pegram and James (Pegram, 1986). In the first stage, each station
is analysed separately to determine the marginal distribution of the yearly totals and to
find a suitable transformation to a normal distribution. The second stage determines the
time-series structure of the normalized data and extracts the residual series that remains
once the correlation structure of each station has been identified. The final stage
determines the covariance matrix of the normal residuals extracted from each record in stage
two to identify the connections between the stations (Pegram, 1986). The monthly flows are
generated using cross-correlation. Two investigations were conducted during the development
of this model: the effect of sample length on the validation tests, and the choice of the
marginal distribution function (Pegram, 1986). An important aspect of the modelling
procedure is the stage of testing and confirming the model's accuracy, which may prompt
suggestions for enhancing the model's performance even after the verification and
validation stage. Each region being analysed is subsequently broken down into monthly totals
using the "key station" approach, which determines the year with the closest historical
match (Pegram, 1986).

2.3 Non-Parametric and Parametric models


All stochastic rainfall models are based primarily on either parametric or non-parametric
methods. A non-parametric stochastic rainfall model is a statistical model used to reproduce
and predict rainfall patterns from observed historic rainfall data. A core feature of a
non-parametric model is that it requires no assumptions about the statistical distribution
of the data and only a limited number of parameters. The bootstrap method, for example, may
be used to generate synthetic rainfall data from the observed historic rainfall data. The
parameters selected for a non-parametric model must be statistically sound and unbiased, as
biased parameters will severely affect the generated estimates, resulting in a lack of
accuracy and poor performance of the model (Chapman, 1997). Synthetic rainfall data from a
non-parametric model is applied in assessing flood risk and the overflowing of dam walls,
predicting future weather patterns, and assessing hydrological events in mining. A
non-parametric model can also be used for water resource management, which includes
determining the behaviour of streamflow and river channels and generating rainfall time
series at annual, monthly, daily, or hourly time steps.

A parametric model can also reproduce and simulate rainfall data, but it requires
assumptions about the probability distribution of the data, and hence a fixed number of
parameters must be defined for the model to function. These assumptions, including the
choice of parameters, are made on the basis of the rainfall data and the chosen parametric
distribution. Data input is necessary for the model to function and simulate any rainfall
data (Tjebane, 2022).

Each of the two types of stochastic model has its respective strengths and weaknesses. A
non-parametric model based on the Markov chain is highly efficient in simulating monthly and
annual rainfall events and is generally used in sub-tropical and tropical climate regions
such as South Africa, Mozambique, Angola, and Madagascar (Nix, 2019). A non-parametric model
is more flexible and efficient than a parametric model because its parameters do not need to
be defined, which allows for a larger range of variability and rainfall patterns. Since the
model can also use kernel density estimation techniques, several assumptions are avoided,
which helps to increase accuracy (Harrold, 2003). A non-parametric model can clearly
identify outliers in the rainfall data because it does not require defined parameters, while
a parametric model is unable to do so. Because it requires fewer assumptions, a
non-parametric model is also more widely applied than a parametric model. However, since
fewer assumptions are made, non-parametric models have less statistical power and can be
highly resource intensive: they require large amounts of input data, which results in longer
processing times to simulate rainfall data. Having fewer assumptions can also limit the
model and result in uncertain output, which is evident in the tendency of the model to
disregard variation in low-frequency rainfall (Pratikasiwi, 2022).

Parametric stochastic models are best used to generate rainfall amounts, as they use
parametric probability distributions. Using these distributions gives more accurate results
than non-parametric models, which do not use distributions of this type (Huth, 2004). For
example, the mixed exponential parametric model is extremely accurate in reproducing daily
rainfall occurrences in subtropical areas, which a non-parametric model is incapable of
reproducing. In a tropical climate region, a parametric model is better suited to generating
maximum and average rainfall values over a 1-hour time frame (Pratikasiwi, 2022). A
parametric model requires less effort to run and is less resource intensive, so it generates
rainfall data much faster than a non-parametric model. Both parametric and non-parametric
models may underestimate extreme values, so outliers remain apparent and are not reduced
unless care is taken in specifying the model (Chapman, 1998). Parametric models are unable
to automatically generate a weather time series without introducing bias. The non-parametric
VLB method can capture long-term rainfall data accurately: between 82% and 90% of the
historic statistics fall within the interquartile range of the box plots of the statistics
generated by VLB generators (Ndiritu and Nyaga, 2014). The VLB generator reproduces annual
statistics more accurately than a parametric model and is therefore the more suitable model
for annual and monthly generation (Ndiritu and Nyaga, 2014).


As such, the choice of an appropriate type of model requires careful consideration, and many
factors are involved in the decision. These factors include the climate of the region,
whether the model is required for annual, monthly, or daily rainfall, and whether the model
is required for time series generation (Pratikasiwi, 2022). Based on these factors, the most
logical model to use is the variable length block (VLB) generator, a non-parametric model
used for rainfall generation. The method has no known limitations and can be used for
groundwater level, streamflow, precipitation, and radiation (Tjebane, 2022).

2.4 Climate change and variability modelling

Climate change holds significant importance for South Africa, particularly given the
variability in its rainfall patterns. These factors must be considered when using a rainfall
model. Few models can account for climate change and its long-term effects because of the
future uncertainties involved (Ghil et al., 2002; Hughes, 2012); however, general
circulation models (GCMs) attempt to incorporate long-term climatic change into the
analysis. The GCM is currently the only physically based approach to accounting for climate
change. It can assess the impact of greenhouse gases on a global scale, which may help to
slow down that impact. Several other models take climate change into account, but they
produce data with a large degree of variability and uncertainty (Mujumdar and Ghosh, 2008).
Since the GCM is one of a kind, its output cannot be validated against another data set
that has been analysed, and therefore its accuracy cannot be validated (Koutsoyiannis et
al., 2009). However, the testing that has been carried out shows that the GCM cannot
replicate inter-annual rainfall data. This is supported by Koutsoyiannis (2011), who stated
that long-term persistence needs to be incorporated into hydro-climatic time series because
the GCM underestimates the uncertainties. Kundzewicz and Stakhiv (2010) concluded that GCMs
are not yet ready to be used for practical water resource planning, as the downscaled data
produced by GCMs is uncertain and not validated.


2.5 Variable Length Bootstrap method

Variable length bootstrapping is a statistical resampling technique that resamples data
using a variable number of samples (Ndiritu, 2011). The samples are obtained randomly from
the original data set. The traditional bootstrap technique only resamples data in the
historic record and consequently produces no new data, which is a significant limitation
for stochastic data generation (Ndiritu, 2011). The Variable Length Block (VLB) bootstrap
was developed to overcome this limitation and to replicate multi-annual flow variability
using blocks of variable length (Ndiritu, 2011). The variable length bootstrapping
streamflow generator developed by Ndiritu (2011) was later adapted for rainfall generation
(Ndiritu & Nyaga, 2014) by considering the spatial and temporal dependence characteristics
of rainfall rather than those of streamflow. The rainfall generator produces monthly and
annual stochastic rainfalls by disaggregating stochastic annual values and perturbing the
data, and thereafter updates the stochastic annual values (Ndiritu & Nyaga, 2014).
VLB is a non-parametric method that (Ndiritu, 2011):
i) Produces stochastic values beyond the range of historical values, which is
beneficial in simulating extreme rainfall events that differ from the historical data.
ii) Includes a method for preserving the correlations between monthly values at the
end and beginning of each year, thus ensuring a realistic generated time series.

The main steps of the VLB generator include (Ndiritu & Nyaga, 2014):

● Segmentation of the historical rainfall data into variable length blocks.

● Generation of stochastic annual time series by randomly resampling the variable length blocks.

● Matching of each year of the stochastic time series with a pair of different years from the historical data.

● Disaggregation of the annual stochastic values into monthly distributions, with perturbation incorporated.

● Updating of the stochastic annual values after the disaggregation.

Refer to Ndiritu & Nyaga (2014) for a detailed description of the method.
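
The block resampling step in the list above can be illustrated with a minimal Python sketch. It is not the algorithm of Ndiritu (2011): the block-length range, the uniform choice of block start and the function name `resample_variable_blocks` are assumptions made for illustration, and the matching, disaggregation and perturbation steps are omitted.

```python
import random

def resample_variable_blocks(annual, n_years, min_len=1, max_len=5, seed=None):
    """Build a synthetic annual series by joining randomly chosen blocks of
    consecutive historic years whose lengths vary randomly (illustrative only).
    Assumes max_len does not exceed the length of the historic record."""
    rng = random.Random(seed)
    series = []
    while len(series) < n_years:
        length = rng.randint(min_len, max_len)        # random block length
        start = rng.randint(0, len(annual) - length)  # random block start
        series.extend(annual[start:start + length])
    return series[:n_years]

# Hypothetical annual rainfall totals in mm, not data from this report
historic = [650.0, 720.5, 810.2, 590.3, 700.1, 905.7, 640.0, 770.4]
synthetic = resample_variable_blocks(historic, n_years=8, seed=1)
```

Every value in `synthetic` still comes from the historic record, which illustrates why block resampling on its own produces no new values and why the VLB generator adds a perturbation step.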

2.6 Daily stochastic rainfall data


Stochastic rainfall data is of utmost importance to the agricultural sector, as it assists
farmers in planning their planting and harvesting schedules and in making decisions about
irrigation and fertilization (Srikanthan & McMahon, 2001). Rainfall is a critical input to
crop growth, so the amount and duration of rainfall can significantly affect crop yield
(Srikanthan & McMahon, 2001). Daily stochastic rainfall data is also important in the
management of surface water and groundwater resources (Beven, 2012). Water is managed by
predicting water availability and estimating water demand, and this information is crucial
in allocating water to different domestic and industrial uses (Beven, 2012). Stochastic
rainfall data is used in flood forecasting models to predict flood levels and issue flood
warnings (Srikanthan & McMahon, 2001), which allows sufficient time to allocate resources
and prepare for such events. In addition, the mining industry relies heavily on stochastic
daily rainfall data, as rainfall can affect slope stability and the quality of the water
used in the mining process. High levels of rainfall can cause landslides or erosion, which
can damage mining infrastructure and equipment and disrupt operations (LePan, 2023). Low
levels of rainfall can affect the quality of water used in mining, leading to issues such as
increased mineral concentrations and reduced water availability (LePan, 2023). Through the
analysis of stochastic rainfall data, mining companies can plan for and mitigate the impacts
of rainfall on their operations. Overall, daily stochastic rainfall data plays a pivotal
role in planning and decision-making related to agriculture, water resource management,
emergency management, and climate change.

Long sequences of daily rainfall are required for hydrological purposes and to provide
inputs for models of crop growth, landfills, tailings dams and other environmentally
sensitive projects (Srikanthan & McMahon, 2001). It is thus imperative that annual data can
be used to obtain daily rainfall data. The newly developed daily VLB stochastic rainfall
generator (Ndiritu, 2022) disaggregates the annual stochastic rainfalls obtained using the
VLB stochastic rainfall generator (Ndiritu and Nyaga, 2014). Annual to daily disaggregation
is preferred over monthly to daily disaggregation, as it leads to fewer discontinuities in
resampling and therefore less distortion of the historic rainfall data, which is continuous.

2.7 Daily VLB stochastic generator


The following procedure is followed to obtain daily stochastic rainfalls from the annual VLB
stochastic rainfalls (Ndiritu, 2022):

(i) An annual stochastic rainfall magnitude (ASi) is selected, and thereafter its nearest
neighbours are identified in the historic rainfall sequence. Nearest neighbours are
defined as the historic annual rainfall values that are closest to the selected
annual stochastic rainfall value. An equal number of neighbours with lower and
higher rainfalls is identified, so the number of neighbours is a multiple of two.

(ii) After the nearest neighbours have been identified, a random number of them is selected
to supply the initial daily rainfall amounts for set parts of the year. For example, if 8
nearest neighbours are identified, a random number between 1 and 8 is selected, such
as 5. The randomly selected neighbours each fill an equal proportion of the year to
create the initial daily rainfall sequence for the year. If 5 neighbours are
selected, each neighbour supplies daily rainfalls for a continuous period of
365/5 = 73 days to fill up the year.

(iii) The sum of the rainfalls obtained in step (ii) is unlikely to equal the stochastic
annual rainfall ASi selected in step (i). Therefore, the daily stochastic rainfalls from
step (ii) are scaled proportionally so that their total equals the annual
stochastic rainfall ASi. This adjustment also ensures that new daily rainfall data is
generated.

(iv) To ensure the replication of cross-correlations where multiple rainfall
stations are involved, the contemporaneous approach described by Ndiritu (2011) and
Ndiritu & Nyaga (2014) is applied.
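
Steps (i) to (iii) can be sketched in Python as follows. This is an illustrative reading of the procedure, not the Fortran implementation of Ndiritu (2022). The function name `disaggregate_annual`, the rule that each donor year supplies its own rainfall for the same calendar days it fills, and the use of whatever neighbours are available when fewer than the requested number exist on one side are all assumptions. Step (iv) is omitted.

```python
import random

def disaggregate_annual(as_i, historic_daily, n_pairs, seed=None):
    """Disaggregate one annual stochastic rainfall as_i (mm) into 365 daily values.
    historic_daily maps a year to its list of 365 daily rainfalls (illustrative)."""
    rng = random.Random(seed)
    totals = {yr: sum(days) for yr, days in historic_daily.items()}
    # (i) n_pairs closest historic years below as_i and n_pairs closest above it
    lower = sorted((y for y in totals if totals[y] <= as_i),
                   key=lambda y: as_i - totals[y])[:n_pairs]
    upper = sorted((y for y in totals if totals[y] > as_i),
                   key=lambda y: totals[y] - as_i)[:n_pairs]
    neighbours = lower + upper
    # (ii) a random number k of neighbours each supplies one continuous period
    k = rng.randint(1, len(neighbours))
    donors = rng.sample(neighbours, k)
    bounds = [round(j * 365 / k) for j in range(k + 1)]  # e.g. k=5 gives 73-day periods
    daily = []
    for donor, lo, hi in zip(donors, bounds, bounds[1:]):
        daily.extend(historic_daily[donor][lo:hi])
    # (iii) scale proportionally so that the daily values sum to as_i
    total = sum(daily)
    if total == 0:
        raise ValueError("selected periods contain no rainfall to scale")
    return [d * as_i / total for d in daily]

# Hypothetical 20-year daily record, not data from this report
_rng = random.Random(0)
historic_daily = {1965 + y: [_rng.choice([0.0, 0.0, 0.0, 5.0, 12.0]) for _ in range(365)]
                  for y in range(20)}
daily = disaggregate_annual(1250.0, historic_daily, n_pairs=4, seed=2)
```

Because of the scaling in step (iii), the daily values generally differ from any historic day even though the sequence of wet and dry days is borrowed from the donor years.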


2.8 Limitations of subjective parameters in the daily stochastic rainfall VLB generator

Two parameters of the daily generator have been set subjectively: the number of pairs of
nearest neighbours, and the number of years selected out of the pairs for use in the
disaggregation from an annual to a daily time step. Subjective parameters in stochastic
generation could lead to sub-optimal generation performance. This investigation therefore
seeks to find out how sensitive the performance of the daily stochastic generator is to
variations in the two subjective parameters, and whether optimal parameter values exist for
the different climatic conditions in which the daily VLB generator could be used.


3. Research Method

3.1 Introduction

The first aim of this investigation is to determine how sensitive the performance of the
daily VLB stochastic generator is to variations in the two subjective parameters. These
parameters are the number of pairs of nearest neighbours and the number of years out of the
pairs that are selected to disaggregate the annual rainfalls to daily rainfalls. The second
aim is to determine whether optimal parameter values exist for the different climatic
conditions in which the daily VLB generator can be used. The following tasks will be carried
out to meet these aims.

3.2 Tasks that will be carried out

- Task 1: Obtain historic rainfall data for different climatic conditions within South
Africa, namely arid, semi-arid and humid conditions. Gauteng is the semi-arid region
and represents a normal climate, Kwa-Zulu Natal represents the humid region, and the
Northern Cape represents the arid region. This is done to determine whether optimal
values for the subjective parameters exist for different climatic conditions. At least
50 years of historic rainfall data will be obtained from the South African Weather
Service to account for climatic variability and to ensure the analysis is thorough.
Once the historic data is obtained, quality checks will be carried out to ensure the
data is continuous, and basic statistics of the data will be obtained. These will
include the mean daily rainfall, skewness, standard deviation, and proportion of
rainfall days.

- Task 2: A range of values will be defined for the two subjective parameters. For the
first parameter, the number of pairs of nearest neighbours, a minimum, middle, and
maximum value will be set. The number of pairs of nearest neighbours can only range
from 1 to 50% of the number of years of historic data obtained. For example, for 57
years of historic data, the minimum number of pairs of nearest neighbours selected
will be 1, the middle number will be 14 and the maximum will be 28. The second
parameter is the number of years selected to disaggregate the annual rainfalls to
daily rainfalls, which takes a maximum value of twice the number of pairs of nearest
neighbours. For each number of pairs of nearest neighbours selected, a lower and an
upper bound are defined for the years selected to be used in the disaggregation. For
example, if the number of pairs of nearest neighbours selected is 9, the numbers of
years selected out of these neighbours will be 1 and 18, and the pairs referred to in
this report will be (9,1) and (9,18). A pair is therefore defined as the number of
pairs of nearest neighbours selected and the number of years out of these to use in
the disaggregation from an annual to a daily time step. Choosing a minimum, middle,
and maximum number of nearest neighbour pairs allows a sensitivity analysis to be
carried out on the performance of the daily VLB stochastic generator under variations
in the subjectively selected values of the two parameters.

- Task 3: The daily VLB stochastic generator will be run while varying the number of
pairs of nearest neighbours and the years selected, using the minimum, middle and
maximum numbers of pairs of nearest neighbours. This will be done for each station in
the respective regions. Since there are six pairs per station and four stations will
be analysed, a total of twenty-four sets of 100 stochastic series will be obtained.

- Task 4: Once the daily stochastic data has been generated by varying the subjective
parameters for each climatic condition, the performance of the daily VLB stochastic
generator will be analysed. The analysis will be carried out across the different
climatic conditions to determine whether an optimal subjective parameter value exists
in each condition, and to determine how sensitive the generator is to variation in the
two subjective parameters. It will use statistical measures such as the mean,
standard deviation and skewness. The mean and standard deviation provide information
about the central tendency and variability of the data, respectively. The proportion
of rainfall days for each set of stochastic generated data will also be determined, as
this is important for rainfall-based analysis and for practical purposes such as flood
analysis and water resource management.

- Task 5: To determine whether optimal values exist for different climatic conditions,
and how sensitive the VLB daily generator is to changes in the parameter values, three
performance evaluations will be carried out.
1) The first performance indicator is the percentage deviation from the historic value.
From the historic data, one value will be obtained for each of the mean, skewness,
standard deviation and proportion of rainfall days. From the stochastic data, the
median value over all 100 stochastic series will be obtained for each of these
statistical measures. The deviation from the historic value is then obtained by
applying the following formula to each statistical measure:
$$\%\ \text{deviation from historic value} = \frac{\text{stochastic median} - \text{historic value}}{\text{historic value}} \qquad \text{(Equation 1)}$$

For a pair (number of pairs of nearest neighbours, years selected from these pairs)
to perform well, the deviation of the stochastic median from the historic value should
be low for each statistical measure. Ideally the median of the stochastic data is close
to the historic value, which indicates that the stochastically generated data is
representative of the historic data.
2) The second performance evaluator is based on the deviation of the historic statistic
from the median of the 100 stochastic values. The historic value for a particular
statistical measure will be included with the 100 stochastic values for that measure,
and the deviation of the historic value from the stochastic median value (the 50th
value) will be determined. If the historic value is near the stochastic median value,
the stochastic data is representative of the historic data, whereas a large deviation
from the stochastic median value indicates lower performance. A pair therefore
performs well when this deviation is low.
3) The overall performance evaluator is the composite average deviation. It is obtained
by multiplying the deviation of the historic statistic from the stochastic median
statistic by the percentage deviation from the historic value for each statistical
measure of a pair, and thereafter averaging these products for each pair. A low
average composite deviation indicates that the pair performs well. These three
performance evaluators will determine whether optimal parameter values exist for the
applied rainfall data sets and will be used to assess the sensitivity of the VLB daily
generator to changes in the parameter values.
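
The three evaluators can be sketched in Python as follows. The sketch rests on several assumptions that the text does not fix: Equation 1 is multiplied by 100 to express it as a percentage, the second evaluator is read as the number of rank positions between the historic value and the middle of the pooled, sorted values, and the absolute value of the percentage deviation is used in the composite. All function names are hypothetical.

```python
import statistics

def percent_deviation(stochastic_values, historic_value):
    """Equation 1, expressed as a percentage (the factor of 100 is an assumption)."""
    median = statistics.median(stochastic_values)
    return 100.0 * (median - historic_value) / historic_value

def rank_deviation(stochastic_values, historic_value):
    """Rank positions between the historic value and the middle of the pooled,
    sorted values (one possible reading of the second evaluator)."""
    pooled = sorted(list(stochastic_values) + [historic_value])
    rank = pooled.index(historic_value)  # 0-based position of the historic value
    return abs(rank - (len(pooled) - 1) / 2)

def composite_deviation(stochastic_by_stat, historic_by_stat):
    """Average over the statistics of rank deviation times absolute % deviation."""
    products = [
        rank_deviation(stochastic_by_stat[s], historic_by_stat[s])
        * abs(percent_deviation(stochastic_by_stat[s], historic_by_stat[s]))
        for s in historic_by_stat
    ]
    return sum(products) / len(products)

# Hypothetical values for one parameter pair, not results from this report
stochastic = {"mean": [float(v) for v in range(1, 101)],
              "std": [float(v) for v in range(1, 101)]}
historic = {"mean": 40.5, "std": 50.5}
score = composite_deviation(stochastic, historic)
```

In this example the historic standard deviation coincides with the stochastic median, so it contributes nothing to the composite, while the historic mean lies ten rank positions from the middle and dominates the score.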

The daily VLB generator is useful for generating daily rainfall data; however, the
performance of the VLB daily generator is dependent on two subjective parameters which
can affect its reliability. Therefore, these tasks will help determine whether optimal values
exist for certain climatic conditions and how the two subjective parameters will influence
the performance of the VLB generator.

3.3 Daily VLB stochastic generation procedure

The VLB method was implemented as an application in the Fortran programming language.
Fortran is a capable and versatile programming language designed for numeric calculations
and scientific computing, which makes it well suited to generating rainfall data. The
compiled VLB application consists of executable (.exe) files that work alongside text
(.txt) files containing the relevant rainfall data. The rainfall data is captured in a
Microsoft Excel file, which is then saved as a delimited text file to be read by the
executable application. Because the application is run from the compiled executables, a
Fortran compiler is not required.
Four files are used, namely DailyToMonthlyRainfallCreator.exe,
VLBMonthlyGenerator.exe, AnnualToDailyDisaggregator.exe and
AnnualtoDailyDisaggregationParameters.txt. The executables convert the daily historic
rainfall into monthly rainfall, produce stochastic monthly data, and then produce daily
stochastic rainfall. This last set is produced by AnnualToDailyDisaggregator.exe, in which
VLB disaggregates the annual stochastic rainfalls (aggregated from the stochastic monthly
rainfalls) to daily stochastic rainfalls. A detailed description of the VLB monthly
stochastic generator is provided by Ndiritu and Nyaga (2014).


A summary of the steps is provided below. Although the VLB model created by Ndiritu and
Nyaga (2014) can produce stochastic rainfalls on both monthly and annual scales, it is
deemed suitable to disaggregate rainfalls directly from an annual to a daily time step
without involving the intermediate monthly time step. Annual to daily disaggregation is far
less complex than monthly to daily disaggregation, which is why the VLB generator carries
out this computation instead.
The following files were used and produced when running the VLB daily stochastic generator
to generate stochastic daily rainfalls.
Step 1:
Two text files are required to run the application:
a) Historic daily rainfall.txt, which consists of the daily historic rainfall data from the
rainfall stations. The format of the data is shown in Figure 1. The VLB application is
capable of producing stochastic data for multiple regions simultaneously, but for
simplicity the software was run for each station in each region individually. The
software was therefore executed 4 times to obtain stochastic monthly data for
Gauteng (Leeukop and Turffontein) and Kwa-Zulu Natal (Durban Heights and Ngome Bos)
respectively.

Figure 1: Historic daily rainfall.txt


b) VLBSettingsandfiles.txt, which consists of the length of the historic data in years, the
length of the stochastic sequences to be generated in years, and the minimum block length.
These are altered according to each station, as shown in Figure 2.

Figure 2: VLBSettingsandfiles.txt file

Step 2:
Execute the DailyToMonthlyRainfallCreator.exe program to generate the monthly rainfall data
files used by VLBMonthlyGenerator.exe. This program also analyses the correlation between
annual rainfalls and the number of rainy days per year, and saves the results as
RainAndRainDays.txt, as seen in Figure 3. VLB also produces RainfilesNames.txt, which
contains the names of the rainfall stations and the number of stations analysed.

Figure 3: RainAndRainDays.txt
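
The kind of analysis described in Step 2 can be sketched in Python as follows. The sketch does not reproduce the output of DailyToMonthlyRainfallCreator.exe: the use of the Pearson correlation coefficient, the definition of a rain day as any day with rainfall above `wet_threshold`, and the function names are assumptions.

```python
from math import sqrt

def annual_rain_and_rain_days(daily_by_year, wet_threshold=0.0):
    """Return (annual totals, annual rain-day counts), both ordered by year."""
    years = sorted(daily_by_year)
    totals = [sum(daily_by_year[y]) for y in years]
    rain_days = [sum(1 for d in daily_by_year[y] if d > wet_threshold) for y in years]
    return totals, rain_days

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical shortened "years" of daily rainfall (mm), not data from this report
daily_by_year = {
    2001: [0.0, 5.0, 0.0, 10.0],
    2002: [2.0, 3.0, 4.0, 11.0],
    2003: [0.0, 0.0, 0.0, 6.0],
}
totals, rain_days = annual_rain_and_rain_days(daily_by_year)
r = pearson(totals, rain_days)
```

A value of `r` close to 1 would indicate that wetter years tend to have more rain days.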


Step 3:
Execute the VLBMonthlyGenerator.exe program, which generates text files containing
various analyses of the stations, including historical annual rainfalls, historical monthly
statistics, block lengths and averages, and cross-correlations, among others. The only
files required are those containing the monthly stochastic rainfall data for each station;
these files are named with the station name followed by "generated". Each monthly
stochastic generated file consists of 100 stochastic rainfall series, with each series
consisting of 57 years of monthly rainfall data, as seen in Figure 4 below. The columns are
arranged from the months of least rainfall to the months of most rainfall.

Figure 4: Stochastic monthly generated rainfall

Step 4:
Execute the AnnualToDailyDisaggregator.exe program, which generates text files containing
the annual stochastic data disaggregated into stochastic daily rainfall data for 100
stochastic rainfall series, as seen in Figure 5 below. This data is then used in the
performance evaluations discussed in the methodology above, which determine how well the
daily stochastic rainfall data compares to the historic rainfall.


Figure 5: Stochastic daily generated rainfall

Step 5:
The last step in the VLB procedure is changing the parameters of the generator to improve
the accuracy of the generated stochastic rainfall. This is done by editing the parameter
text file and executing AnnualToDailyDisaggregator.exe. The parameters are set in the text
file in the form of a pair (X,Y). X represents the number of pairs of nearest neighbours,
which should not exceed 50% of the number of years of historic data. Y represents the
number of nearest neighbours used to disaggregate the annual rainfall to daily rainfall.
Each pair produces a different set of stochastic data consisting of 100 stochastic series.

Figure 6: Stochastic daily generated rainfall with updated neighbouring pairs
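
The (X,Y) pairs used in Step 5 follow the ranges defined in Task 2, and can be listed with a short Python sketch. The function name `parameter_pairs` and the rule that the middle value is half of the maximum are assumptions. The rule was chosen because it reproduces the values 1, 14 and 28 quoted for 57 years of historic data.

```python
def parameter_pairs(n_historic_years):
    """List the (X, Y) pairs: X is the number of pairs of nearest neighbours
    (minimum, middle, maximum) and Y is the number of years used in the
    disaggregation (lower bound 1, upper bound 2 * X)."""
    max_pairs = n_historic_years // 2  # X may not exceed 50% of the historic years
    mid_pairs = max_pairs // 2         # assumed rule for the middle value
    pairs = []
    for x in (1, mid_pairs, max_pairs):
        pairs.append((x, 1))
        pairs.append((x, 2 * x))
    return pairs

pairs = parameter_pairs(57)
```

For 57 years this gives six pairs per station, which matches the twenty-four sets of 100 stochastic series obtained for the four stations.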

Figure 7 shows the summary flow chart of the VLB process followed.

Figure 7: Summary of VLB generator process

3.4 Analysis of historical observed data

The historical data was obtained from the South African Weather Service. Two stations were
selected for each province, and 57 years of daily historical data was obtained for each
station. The stations were chosen by the South African Weather Service based on the given
provinces and the required record length, such that the records are long and reliable. Two
stations in the Northern Cape were also obtained; however, due to the very low rainfalls,
the daily VLB stochastic generator could not produce stochastic data for them, and they are
not considered in the analysis.
The Gauteng stations include:
- Johannesburg Turffontein (Station number- 0476044 0)
- Johannesburg Leeukop (Station number- 0476031 8).
The Kwa-Zulu Natal stations include:
- Durban Heights station (Station number- 0240738 1)
- Ngome Bos (Station number- 0373680A1).
Once the data was obtained, quality checks and checks for the continuity of the data were
carried out. These checks include basic statistical analysis, such as the average daily and
annual rainfall and the number of rain days per year, and checking for gaps in the data.
The data had missing values, with a reliability of 96%, which was deemed acceptable. The 4%
of missing values were manually filled in with 0 values to provide a continuous data set.
This is a common practice in hydrology, as most stochastic generators require a continuous
data set.
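
The gap filling and the basic statistics listed in Task 1 can be sketched in Python as follows. The sketch is illustrative only: marking gaps with `None`, using the population (rather than sample) formulas for standard deviation and skewness, and counting any day above `wet_threshold` as a rain day are assumptions, and the function names are hypothetical.

```python
from math import sqrt

def fill_missing(daily, fill_value=0.0):
    """Return (filled series, completeness in percent); None marks a gap."""
    present = sum(1 for d in daily if d is not None)
    completeness = 100.0 * present / len(daily)
    filled = [fill_value if d is None else d for d in daily]
    return filled, completeness

def basic_statistics(daily, wet_threshold=0.0):
    """Mean, standard deviation, skewness and proportion of rain days."""
    n = len(daily)
    mean = sum(daily) / n
    std = sqrt(sum((d - mean) ** 2 for d in daily) / n)        # population form
    skew = sum((d - mean) ** 3 for d in daily) / n / std ** 3  # population form
    wet = sum(1 for d in daily if d > wet_threshold) / n
    return {"mean": mean, "std": std, "skewness": skew, "rain_day_proportion": wet}

# Hypothetical short record in mm with two gaps, not data from this report
record = [0.0, 4.0, None, 0.0, 12.0, 0.0, None, 0.0, 0.0, 4.0]
filled, completeness = fill_missing(record)
stats = basic_statistics(filled)
```

Filling a gap with zero treats that day as dry, so if rain fell on a missing day the filled record will understate the mean rainfall and the proportion of rain days.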
Table 1 summarises the characteristics of each station for the two provinces.

| Province | Representative climatic region | Station | Latitude | Longitude | Record of data | Number of years of data | Mean annual precipitation, MAP (mm) |
|---|---|---|---|---|---|---|---|
| Gauteng | Semi-arid | Johannesburg Turffontein | -26.23 | 28.04 | 01/01/1965-31/12/2022 | 57 | 759.5 |
| Gauteng | Semi-arid | Johannesburg Leeukop | -26.00 | 28.05 | 01/01/1965-31/12/2022 | 57 | 693.2 |
| Kwa-Zulu Natal | Humid | Durban Heights | -29.80 | 30.93 | 01/01/1965-31/12/2022 | 57 | 790.1 |
| Kwa-Zulu Natal | Humid | Ngome Bos | -27.82 | 31.42 | 01/01/1965-31/12/2022 | 57 | 1450.4 |

Table 1: Rainfall station characteristics

3.4.1 Gauteng rainfall stations

Johannesburg Leeukop
Johannesburg Leeukop is situated near Leeukop dam, around 1.2 km from the Jukskei River. The
area is largely farmland with agricultural activity. It is near the Kyalami grand prix
circuit and is one of the less built-up areas of Johannesburg. Figure 8 is an image of the
Leeukop rainfall station obtained from Google Earth Pro.

Figure 8: Leeukop Station obtained from Google Earth Pro


The mean daily historic rainfall is calculated to be 1.9 mm/day over the 57-year historic
period, as illustrated in figure 9. Figure 9 also shows that the wettest year was 1997,
with a mean daily rainfall of 2.9 mm/day, and the driest year was 1965, with a mean daily
rainfall of 1 mm/day. The maximum observed daily rainfall, 122 mm, occurred on 5 February
2022, as shown in figure 10. The wet months are October to March, as seen in figure 11,
which shows the mean historic rainfall per day in each month for the 57-year period.
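
The summary values quoted for each station (mean daily rainfall per year, wettest and driest year, maximum daily rainfall) could be derived from a daily record along the following lines. The sketch is in Python; the function names and the short sample record are hypothetical and are not the station data.

```python
from collections import defaultdict
from datetime import date


def annual_mean_daily(series):
    """Map each year to its mean daily rainfall (mm/day).

    series is a list of (date, depth_mm) tuples (assumed layout).
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for day, depth in series:
        totals[day.year] += depth
        counts[day.year] += 1
    return {year: totals[year] / counts[year] for year in totals}


def summarise(series):
    """Return the wettest year, driest year and maximum daily rainfall."""
    per_year = annual_mean_daily(series)
    wettest = max(per_year, key=per_year.get)
    driest = min(per_year, key=per_year.get)
    max_day, max_depth = max(series, key=lambda item: item[1])
    return {
        "wettest_year": wettest,
        "driest_year": driest,
        "max_daily_date": max_day,
        "max_daily_mm": max_depth,
    }


# Hypothetical record covering a few days in two years
record = [
    (date(2000, 1, 1), 0.0),
    (date(2000, 1, 2), 10.0),
    (date(2001, 1, 1), 2.0),
    (date(2001, 1, 2), 4.0),
    (date(2001, 1, 3), 0.0),
]
summary = summarise(record)
```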

Figure 9: Average annual daily rainfall for Leeukop for the historic period from 1965-2022

Figure 10: Daily rainfall (mm) for Leeukop from 1965-2022


Figure 11: Average daily rainfall per month in Leeukop


Johannesburg Turffontein

The Johannesburg Turffontein rainfall station is located next to the Robinson Deep gold
mine and near the Robinson Deep landfill. Figure 12 shows the Turffontein area where the
station is located. Turffontein is developing into a residential and commercial area, and
mining activity in the area is decreasing. The Turffontein station is around 30 km from
the Leeukop station.

Figure 12: Turffontein station obtained from Google Earth Pro


The mean daily historic rainfall is calculated to be 2.1 mm/day over the 57-year historic
period, as illustrated in figure 13. The wettest year was 2010, with a mean daily rainfall
of 3.2 mm/day, and the driest year was 2003, with a mean daily rainfall of 1.2 mm/day. The
maximum observed daily rainfall, 162 mm, occurred on 20 January 1972, as shown in figure 14.
The wet months are October to March, as seen in figure 15, which shows the mean historic
rainfall per day in each month for the 57-year period.

Figure 13: Average daily rainfall depths observed from the historic data for Turffontein

Figure 14: Daily rainfalls for the 57-year period in Turffontein


Figure 15: Box plots of the average daily rainfall per month in Turffontein (mm/day/month) from 1965-2022

3.4.2 Kwa-Zulu Natal rainfall stations

Ngome-Bos
This station is located in the Ngome Forest, situated 70 km east of Vryheid, KwaZulu-
Natal, South Africa. The forest is transitional between Mistbelt Forest and Coastal Scarp
Forest. It has been a protected area since 1905 and forms part of the Ntendeka Wilderness
Area. The Ngome Forest lies on the southern slopes of high-altitude mist-belt grasslands
and contains a unique combination of coastal and upland plant and bird species. Below is
an image of the Ngome-Bos area.

Figure 16: Google Earth view of Ngome-Bos station in Kwa-Zulu Natal


The mean daily historic rainfall is calculated to be 4 mm/day over the 57-year historic
period, as illustrated in figure 17. The wettest year was 1984, with a mean daily rainfall
of 6.1 mm/day, and the driest year was 2015, with a mean daily rainfall of 2.2 mm/day. The
maximum observed daily rainfall, 320 mm, occurred on 30 December 1993, as shown in figure
18. The wet months are October to March, as seen in figure 19, which shows the mean
historic rainfall per day in each month for the 57-year period.

Figure 17: Mean historic daily rainfall per year from 1965-2022

Figure 18: Historic daily rainfall for Ngome Bos from 1965-2022


Figure 19: Box plots of monthly mean daily rainfall over a 57-year period for Ngome-Bos

Durban Heights

The Durban Heights rainfall station is situated around 400 km from the Ngome-Bos station.
It is near Umgeni Water in a suburb called Reservoir Hills. This area is a few kilometres
from the Durban CBD, in the western part of Durban, and is characterised by its hilly
terrain.

Figure 20: Google Earth Pro image of Durban Heights station

The mean daily historic rainfall is calculated to be 2.2 mm/day over the 57-year historic
period, as illustrated in figure 21. The wettest year was 1994, with a mean daily rainfall
of 4 mm/day, and the driest year was 2016, with a mean daily rainfall of 0.6 mm/day. The
maximum observed daily rainfall, 281 mm, occurred on 9 April 2022, as shown in figure 22.
The wet months are October to March, as seen in figure 23, which shows the mean historic
rainfall per day in each month for the 57-year period. The mean daily historic rainfall at
Ngome-Bos (4 mm/day) is 82% higher than at Durban Heights (2.2 mm/day); this is attributed
to the difference in environmental conditions, as Ngome-Bos is situated in a forest and
Durban Heights is a residential area.

Figure 21: Mean historic daily rainfall per year

Figure 22: Daily historic rainfalls over a 57-year period


Figure 23: Box plots of mean daily rainfall per month over the 57-year period in Durban Heights

4. Performance Evaluation
This chapter assesses the performance of different subjective parameter pairings. Pairs
(1,1) and (1,2) are the minimum pairs, (14,1) and (14,28) are the middle pairs, and
(28,1) and (28,56) are the maximum pairs. The first number of each pair is the number of
pairs of nearest neighbours selected, and the second is the number of years used to
disaggregate from an annual to a daily time step.

The pairs are evaluated using the three performance evaluations stated in Chapter 3
(research method). The evaluation covers Gauteng and Kwa-Zulu Natal and provides an
analysis based on the graphs obtained from Excel for each performance evaluator.

Gauteng Evaluation:

Percentage deviation from historic median:


Altering the pairs did not influence the statistical mean, as can be seen in figures 24
and 28. The percentage deviation of the stochastic mean from the daily historic mean
rainfall ranged from 0.58% to 0.64% for both stations. This is attributed to the fact that
the nearest neighbours are determined by the mean rainfall. Since the deviation is less
than 1%, the stochastically generated daily rainfalls reproduce the historic mean rainfall
closely. The deviations for standard deviation and proportion of rainfall days do not vary
significantly for either station when the pairs are altered. For skewness, there was a
larger variation in the deviation from the daily historic skewness, ranging from 2.59% to
8.6% for Leeukop and from 0.11% to 3.74% for Turffontein. It can be concluded that altering
the pairs influences skewness, as seen by the greater deviations, but does not influence
the mean.
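
As an illustration of how the four statistical measures and the percentage deviation could be computed, a minimal Python sketch is given below. It rests on assumptions not stated in this chapter: a rainfall day is taken as any day with more than 0 mm, skewness is taken as the moment coefficient of skewness, and the percentage deviation is taken as the absolute difference relative to the historic value. The function names are hypothetical.

```python
import math


def daily_statistics(depths, wet_threshold=0.0):
    """Mean, standard deviation, skewness and proportion of rainfall days.

    depths is a list of daily rainfall depths in mm. Population moments
    are used; the study may have used sample estimators instead.
    """
    n = len(depths)
    mean = sum(depths) / n
    m2 = sum((x - mean) ** 2 for x in depths) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in depths) / n  # third central moment
    return {
        "mean": mean,
        "std": math.sqrt(m2),
        "skewness": m3 / m2 ** 1.5,
        "wet_fraction": sum(1 for x in depths if x > wet_threshold) / n,
    }


def percentage_deviation(stochastic_value, historic_value):
    """Absolute deviation of a stochastic statistic from the historic one, in %."""
    return abs(stochastic_value - historic_value) / abs(historic_value) * 100.0


# Hypothetical four-day series: three dry days and one 4 mm day
stats = daily_statistics([0.0, 0.0, 0.0, 4.0])

# Rounded Turffontein means quoted in this chapter (2.06 stochastic, 2.08 historic)
mean_deviation = percentage_deviation(2.06, 2.08)
```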

Deviation of the historic statistic from the median of the 100 stochastic values

For Gauteng, the historic mean lies 9 to 11 values away from the stochastic median (i.e.,
the 50th of the 100 ranked stochastic values), while its percentage deviation is less than
1%, as seen in figures 24 and 28. Changing the pairs therefore has no effect on how far the
historic mean lies from the stochastic median. For the other statistics, changing the pairs
did affect the performance of the VLB generator: figures 25 and 29 show the historic value
lying further from the stochastic median. The percentage deviations from the historic value
(figures 24 and 28) are nevertheless relatively small compared to the large deviations from
the stochastic median (figures 25 and 29). Thus, the VLB generator is sensitive to changes
in the pairs with respect to standard deviation and skewness.
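
The second evaluator, the number of values separating the historic statistic from the stochastic median, can be sketched as below. The exact ranking convention is not given in this chapter. The sketch assumes the historic value is placed among the 100 sorted stochastic values and its position is compared with the 50th value, a convention chosen because it yields the deviation of 51 reported for Durban Heights when the historic value exceeds every stochastic value. The function name is hypothetical.

```python
def deviation_from_median(historic_value, stochastic_values, median_position=50):
    """Number of ranked positions between the historic value and the median.

    The historic value is given the position it would occupy if inserted
    into the sorted stochastic values (1-based), and that position is
    compared with the assumed median position.
    """
    below = sum(1 for v in stochastic_values if v < historic_value)
    position = below + 1
    return abs(position - median_position)


# Hypothetical statistic from 100 stochastic series: the values 1.0 to 100.0
stochastic = [float(v) for v in range(1, 101)]

above_all = deviation_from_median(200.0, stochastic)  # historic exceeds every value
at_median = deviation_from_median(49.5, stochastic)   # historic sits at position 50
```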

As per figure 25 for Leeukop, all subjective parameter pairs show large deviations of the
historic skewness from the stochastic median, the largest being 44 values away for pair
28,1. On average the historic value is 3.66% away from the interquartile range, the region
deemed acceptable (figure 26). Pair 14,28 was the only subjective parameter pair for which
the historic value fell within the interquartile range. This implies that the VLB daily
stochastic generator may be biased in producing these stochastic rainfalls, as it tends to
overestimate the rainfall, resulting in higher rainfalls.
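
A check of whether the historic statistic falls inside the interquartile range of the 100 stochastic values, as read off the box plot in figure 26, might look as follows. The sketch is in Python. How the reported percentage distance to the interquartile range was calculated is not stated, so the gap measure used here (distance to the nearest quartile relative to the historic value) is an assumption, as is the function name.

```python
import statistics


def iqr_check(historic_value, stochastic_values):
    """Report whether the historic value lies within the interquartile range."""
    q1, _median, q3 = statistics.quantiles(stochastic_values, n=4)
    if historic_value < q1:
        gap_pct = (q1 - historic_value) / abs(historic_value) * 100.0
    elif historic_value > q3:
        gap_pct = (historic_value - q3) / abs(historic_value) * 100.0
    else:
        gap_pct = 0.0  # inside the interquartile range
    return {
        "q1": q1,
        "q3": q3,
        "inside": q1 <= historic_value <= q3,
        "gap_pct": gap_pct,
    }


# Hypothetical statistic from 100 stochastic series: the values 1.0 to 100.0
values = [float(v) for v in range(1, 101)]
inside_case = iqr_check(50.0, values)
outside_case = iqr_check(100.0, values)
```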

Across all pairs for Turffontein, the stochastic mean remained consistent at 2.06 mm/day,
as seen in figure 30. This compares well with the historic mean of 2.08 mm/day. Figure 29
shows the same consistency: for every subjective parameter pair the historic mean lay 11
places away from the median of the stochastic means. For all subjective pairs, the first
quartile and the median lie below the historic mean, indicating some form of systematic
bias whereby the daily VLB stochastic generator underestimates rainfall values. This
ultimately means that the generator produces lower daily rainfalls in the stochastic series.

Composite average deviation


For the Leeukop station, pair 14,28 performed the best, as seen in figures 24 and 25. The
average composite deviation is the lowest for this pair (figure 27), and it performs the
best in three of the four statistical measures in terms of percentage deviation as well as
deviation from the stochastic median.

For the Turffontein station it can be concluded that pair 28,56 performs the best, as it
has a significantly lower average composite deviation than the other pairs. In addition,
pair 28,56 performs the best in three of the four statistical measures for the percentage
deviation from the historic value.
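
The composite average deviation is not defined in this chapter. Going by the axis label of figure 31 (average of deviation from historic value x deviation from median), one plausible reading is the mean, over the four statistical measures, of the product of the two evaluators. The Python sketch below implements that reading only; the function name and all numbers are hypothetical and are not taken from the figures.

```python
def composite_average_deviation(pct_deviation, rank_deviation):
    """Mean over the measures of (percentage deviation x deviation from median).

    Both arguments map a measure name to its evaluator value; this
    combination rule is an assumed reading of the figure 31 axis label.
    """
    products = [pct_deviation[m] * rank_deviation[m] for m in pct_deviation]
    return sum(products) / len(products)


# Hypothetical evaluator values for two subjective parameter pairs
evaluations = {
    (14, 28): (
        {"mean": 0.5, "std": 1.0, "skewness": 4.0, "wet_fraction": 2.0},
        {"mean": 10, "std": 5, "skewness": 20, "wet_fraction": 2},
    ),
    (28, 56): (
        {"mean": 0.5, "std": 0.5, "skewness": 1.0, "wet_fraction": 1.0},
        {"mean": 10, "std": 2, "skewness": 6, "wet_fraction": 4},
    ),
}
scores = {
    pair: composite_average_deviation(pct, rank)
    for pair, (pct, rank) in evaluations.items()
}
# The best-performing pair is the one with the lowest composite score
best_pair = min(scores, key=scores.get)
```

Multiplying the two evaluators means a pair scores well only if it is close to the historic value on both measures at once, which matches how the lowest composite deviation is used above to pick the best pair.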

Figure 24: Leeukop performance based on historic value


Figure 25: Leeukop performance based on deviation from median

Figure 26: Skewness boxplot for Leeukop


Figure 27: Composite average deviation for Leeukop

[Bar chart: Turffontein performance evaluation based on deviation away from historic value. Vertical axis from 0.00 to 4.00; groups: mean, standard deviation, skewness, proportion of rainfall days; bars for subjective pairs 1,2; 1,1; 14,1; 14,28; 28,1; 28,56.]

Figure 28: Turffontein performance based on historic deviation


[Bar chart: Turffontein performance evaluation based on deviation from stochastic median. Vertical axis: deviation from median, 0 to 40; horizontal axis: statistical measures (mean, standard deviation, skewness, proportion of rainfall days); bars for subjective pairs 1,2; 1,1; 14,1; 14,28; 28,1; 28,56.]

Figure 29: Turffontein performance based on deviation from median


Figure 30: Mean boxplot for Turffontein


[Bar chart: Turffontein composite average deviation. Vertical axis: average of deviation from historic value x deviation from median, 0.00 to 35.00; horizontal axis: subjective pairs. Bar labels: 32.25, 24.29, 17.78, 16.17, 14.17, 5.17.]

Figure 31: Composite average deviation for Turffontein



Kwa-Zulu Natal Evaluation

Percentage deviation from historic median:


Altering the pairs did not have an impact on the statistical mean, as seen in figures 32
and 36. Percentage deviations of less than 5% indicate that the stochastic rainfall data
reproduce the historic data comparably. Altering the pairs did have an impact on the
standard deviation for Durban Heights: pair 14,1 had the highest deviation from the
historic standard deviation, which suggests a larger variation in the stochastic rainfall
relative to the historic record, while pair 28,1 had the lowest deviation. Changing the
pairs also had a large impact on skewness, as seen in figure 36 for Ngome Bos, where pair
1,1 had the largest deviation from the historic skewness across all regions.

For Ngome Bos, all deviations in skewness were substantial, with every pair exhibiting a
deviation greater than 5%. The same can be said for Durban Heights, where all pairs except
1,1 had deviations larger than 5%. It should be noted that pair 1,1 fared the worst for
Ngome Bos but performed the best for Durban Heights.

As for the proportion of rainfall days, the pairs used by the VLB produced stochastic
rainfall data for Durban Heights whose proportion of rainfall to non-rainfall days differed
markedly from the historic proportion, as is evident in figure 32. This implies that the
alteration of pairs did have an impact on the proportion of rainfall days, as all
percentage deviations were much larger than 5%. For Ngome Bos, altering the pairs did not
have a significant impact on the data, with pair 14,28 having fared the best.

Deviation of the historic statistic from the median of the 100 stochastic values

In Kwa-Zulu Natal the deviation of the historic statistic from the stochastic median value
varies slightly. For both stations, Ngome-Bos and Durban Heights, the deviation of the
historic mean from the stochastic median does not change when the pairs are altered,
indicating that changing the parameter pair does not influence the mean of the stochastic
data. Skewness is the most sensitive measure when the parameter pair values are altered,
as seen in the large deviations from the stochastic median. For Durban Heights, the
historic value for pair 1,1 is 1 value away from the stochastic median, which is near the
ideal and thus the most representative of the historic data, whereas for pair 14,1 the
historic statistic deviates by 37 values, indicating that this pair is not representative
of the historic data. For Durban Heights the historic proportion of rainfall days deviates
by 51 values from the stochastic median regardless of the pair, whereas for Ngome-Bos this
deviation varies when the pairs are altered. For standard deviation, the historic value
lies 2 to 24 values away from the stochastic median across both stations. For Durban
Heights most pairs deviate by 2 values, with the exceptions of pairs 14,1 and 14,28. For
Ngome-Bos the deviations range from 2 to 19 values, with no noticeable trend.

The consistent deviation of the historic proportion of rainfall days by 51 values from the
stochastic median (figure 33) across the subjective parameter pairs indicates a systematic
discrepancy between the observed historic data and the central tendency represented by the
stochastic median. This divergence could reflect a systematic bias or an unaccounted
influencing factor that consistently places the historic value away from the expected
median. Figure 34 shows that the historic proportion of rainfall days is greater than the
stochastic median proportion of rainfall days for all pairs, indicating that the daily VLB
stochastic generator may be biased towards generating fewer rainfall days in the stochastic
series.

For Ngome-Bos the historic standard deviation lies between 2 and 19 values away from the
median stochastic standard deviation across the different subjective parameter pairs
(figure 37). As illustrated in figure 38, the historic value closely aligns with the
stochastic median values, with the largest disparity, 1.7%, observed for pair 14,28. This
implies that despite a substantial deviation of 19 values from the stochastic median, the
disparity between the stochastic median and the historic value remains minimal at 1.7%.
Consequently, the stochastic data remain representative of the historic data.

Composite average deviation


The composite average deviations are significant for all pairs for Durban Heights. This is
due to the large deviations from the stochastic median for proportion of rainfall days, as
well as the large deviations from the historic value for proportion of rainfall days, as
seen in figure 32. For Durban Heights pair 1,1 performs the best, having the lowest average
composite deviation; pair 28,1 performs comparably, differing by 9%. Pair 1,1 performs well
because of its deviation from the stochastic median, as seen in figure 33: it lies 1 value
away from the stochastic median, while pair 28,1 lies 14 values away. Overall, pair 1,1 has
deviations of less than 5% from the historic value for most statistical measures, making it
the best performing pair for Durban Heights. For Ngome-Bos pair 28,56 performs the best, as
it has the lowest composite average deviation. Pair 28,56 performs well in all statistical
measures for both the percentage deviation from the historic value and the deviation of the
historic statistic from the median of the 100 stochastic values.

Overall, the performance evaluation for all stations in Gauteng and Kwa-Zulu Natal found
that no specific pair of the disaggregation parameters consistently performed the best, in
the sense of having the lowest percentage deviation from the historic value for the
statistical measures and the lowest deviations from the stochastic median.



Figure 32: Durban Heights performance based on historic deviation

Figure 33: Durban Heights performance based on deviation from median


Figure 34: Proportion of rainfall days boxplot for Durban Heights

Figure 35: Composite average deviation for Durban Heights


Figure 36: Ngome Bos performance based on historic deviation

Figure 37: Ngome Bos performance based on deviation from median


Figure 38: Skewness box plot for Ngome Bos

Figure 39: Composite average deviation for Ngome Bos


5. Conclusions and Recommendations


This research aimed to evaluate the sensitivity of the performance of the daily VLB
stochastic generator to changes in the values of the two disaggregation parameters of the
model. The study achieved this by conducting a sensitivity analysis using data from two
pairs of rainfall stations located in Gauteng and Kwa-Zulu Natal. It was found that
changing the subjective parameters, namely the number of pairs of nearest neighbours and
the number of years selected in disaggregating annual rainfalls to daily rainfalls, does
not influence the mean, since the percentage deviation for the mean remained constant as
the pairs were altered. However, skewness and standard deviation were found to be sensitive
to changes in the parameter values, as seen in the variations in the graphs of percentage
deviation from the historic median and of deviation of the historic statistic from the
median of the 100 stochastic values. Overall, from the performance evaluations carried out,
it is concluded that the daily VLB stochastic generator is not highly sensitive to the two
subjective parameters.

For Gauteng, no optimal parameter values exist, because no trends were identified in the
analysis. When an arid region, namely the Northern Cape, was analysed, it was found that
the VLB stochastic generator could not produce stochastic data, as the rainfall occurrences
were too few for the methodology of the VLB to function. As such, the arid region (Northern
Cape) was excluded from the research. For the humid region (Kwa-Zulu Natal), the stochastic
generator was able to produce stochastic data, yet no optimal parameter values were
identified. A key limitation of the VLB daily stochastic generator was its inability to
generate stochastic data for arid regions. Moreover, where historical data include missing
values, the data need to be in-filled for the generator to operate. This is a common
practice in hydrological analysis, as a continuous data set is required for most stochastic
generators to operate.

From this research it can be concluded that no optimal pair exists for the regions
analysed. However, if additional subjective parameter pairings are examined beyond the
minimum, middle, and maximum pairs, an optimal pair of subjective parameters may exist.


For future research, it is advised that a minimum of three weather stations with comparable
climatic conditions per region be examined, to identify whether an optimal pair of
subjective parameters exists. Adaptations to the VLB daily stochastic generator are
necessary to accommodate the analysis of arid regions. In addition, the performance of the
daily VLB stochastic generator can be compared to other daily stochastic generators such as
the Weather Generator Method (WGEN) and the Transition Probability Matrix (TPM) method.
This would assess how well each model captures the statistical characteristics of observed
rainfall patterns.


6. References
–Acharya, S. (2021, May 14). What are RMSE and MAE? Retrieved from Towards Data
science: https://towardsdatascience.com/what-are-rmse-and-mae-e405ce230383

-Anagnostopoulos G, Koutsoyiannis D, Christofides A, Efstratiadis A and Mamassis N (2010).


A comparison of local and aggregated climate model outputs with observed data. Hydrol.
Sci. J. 55(7), 1094-1110.

–Benoit, L., Sichoix, L., Nugent, A. D., Lucas, M. P., & Giambelluca, (2022, April 27).
Stochastic daily rainfall generation on tropical islands with complex topography. Retrieved
from European Geosciences Union: https://doi.org/10.5194/hess-26-2113-2022

-Beven, K. (2012). Rainfall-Runoff Modelling. Sussex: John Wiley & Sons, Ltd.

–Boughton, W. C. (1999). A daily rainfall generating model for water yield and flood studies.
Australia: Cooperative Research Centre for Catchment Hydrology (99/9).

-Brissette, F. (2014, January 2). Comparison of five stochastic weather generators in


simulating daily precipitation and temperature for the Loess Plateau of China Retrieved
from RMetS: https://rmets.onlinelibrary.wiley.com/doi/full/10.1002/joc.3896.

-Campo, M. (2009, March 8). Application of a stochastic rainfall model in flood risk
assessment. Retrieved from Research Gate:
https://www.researchgate.net/publication/234517790_Application_of_a_stochastic_rainfall
_model_in_flood_risk_assessment.

-Chapman, T. (1999, January 5). Stochastic modelling of daily rainfall: the impact of adjoining
wet days on the distribution of rainfall amounts Retrieved from Science Direct:
https://www.sciencedirect.com/science/article/pii/S136481529800036X?via%3Dihub.

-Chisanga, C. (2017, September 3). Statistical Downscaling of Precipitation and Temperature


Using Long Ashton Research Station Weather Generator in Zambia: A Case of Mount Makulu
Agriculture Research Station. Retrieved from Scientific Research:
https://www.scirp.org/journal/paperinformation.aspx?paperid=78684.

-Cowden, J. (2007, September 24). Stochastic rainfall modeling in West Africa:Parsimonious


approaches for domestic rainwater harvesting assessment. Retrieved from Elsevier:
https://sci-hub.se/https:/doi.org/10.1016/j.jhydrol.2008.07.025

-Edwin, A. I., & Martins, O. Y. (2014). Stochastic Characteristics and Modelling of Monthly
Rainfall Time Series of Ilorin, Nigeria. Modern Hydrology, 67-79.

-Ghil M (2002). Natural Climate Variability. Vol. 1, The Earth system: physical and chemical
dimensions of global environmental change, Eds., MacCracken M C and Perry J S., in
Encyclopedia of Global Environmental Change, Ed. -in-Chief, Munn T. John Wiley. P 544-549.

53
CIVN4005A-Investigational Project Report

-Harold, T. (2003, December 12). A nonparametric model for stochastic generation of daily
rainfall amounts. Retrieved from Research Gate:
https://www.researchgate.net/publication/248808540_A_nonparametric_model_for_stoch
astic_generation_of_daily_rainfall_amounts.

-Helton, J. (1995). Uncertainty and sensitivity analysis in the presence of stochastic and
subjective uncertainty. Journal of Statistical Computation and Simulation , 3-76.
-Huang, Y. (2018, March 11). Spatial and Temporal Variability in the Precipitation
Concentration in the Upper Reaches of the Hongshui River Basin, Southwestern China.
Retrieved from Hinawi: https://www.hindawi.com/journals/amete/2018/4329757/.

-Hughes, D. (2004). Three decades of hydrological modelling research in South Africa. South
African Journal of Science, 638-642.

-Koutsoyiannis D, Montanari A, Lins H F and Cohn TA (2009). Climate, hydrology and


freshwater: towards an interactive incorporation of hydrological experience into climate
research Discussion of “The implications of projected climate change for freshwater
resources and their management, Hydrol.Sci. J., 54(2). 394-405.

-Kundzewicz ZW, Mata lJ, Arnell NW, Döll P, Jimenez B, Miller K, Oki T, Şen Z and
Shiklomanov I (2008). The implications of projected climate change for freshwater resources
and their management, Hydrol.Sci. J., 53:1, 3-10.

-Kundzewicz, Z. (2010, October 13). Are climate models “ready for prime time” in water
resources management applications, or is more research needed? Retrieved from Taylor
Francis Online: https://www.tandfonline.com/doi/full/10.1080/02626667.2010.513211.

-LePan, N. (2023, March 22). The world’s wettest mines: Measuring precipitation at mine
sites. Retrieved from Datamine: https://www.mining.com/the-worlds-wettest-mines-
measuring-precipitation-at-mine-sites/

-Mirzaei, M. (2022, August 9). A Novel Framework for Nonparametric Rainfall Generator
Based on Deep Convolutional Wasserstein Generative Networks (DC-WGANs). Research
Square, 6-9.

-Mujumdar PP and Ghosh S (2008). Modeling GCM and scenario uncertainty using a
possibilistic approach: Application to the Mahanadi River, India, Water Resour. Res., 44,
W06407, doi:10.1029/2007WR006137.

-Ndiritu, J. (2011). A variable-length block bootstrap method for multi-site synthetic


streamflow generation. Hydrological Sciences Journal, 362-379.

-Ndiritu, J. (2011). Effect of fragment-based perturbations on the disaggregation


performance of a VLB streamflow generator. Elsevier, 823-839.

54
CIVN4005A-Investigational Project Report

-Ndiritu, J., & Nyaga, J. (2014). A NON-PARAMETRIC MULTI-SITE STOCHASTIC RAINFALL


MODEL WITH APPLICATIONS TO CLIMATE CHANGE. Johannesburg: Water Research
Commission.

-Nix, S. (2019, July 8). The Territory and Current Status of the African Rainforest. Retrieved
from ThoughtCo.: https://www.thoughtco.com/african-rainforest-1341794.

-Pegram, G. , & Clothier, A. N. (2002). Space time modelling of rainfall using the string of
beads model: Integration of radar and rain gauge data. Durban, South Africa: Water
Research Commission Report No 1010/1/02.

-Pratikasiwi, H. (2022, July 14). Stochastics Modelling of Rainfall Process in Asia Region: A
Systematics Review. Retrieved from environmental sciences proceedings:
file:///D:/Downloads/environsciproc-19-00022.pdf.

–Richardson, C. W., & Wright, D. A. (1984). WGEN: A model for generating daily weather
variables. U. S. Department of Agriculture, 8 (83p).

–Seifossadat, E. Sameti, H. (2020, June 5). Retrieved from ScienceDirect:


https://www.sciencedirect.com/science/article/abs/pii/S0885230822000274

-Sellwood, B. W., & Valdes, P. J. (2005). PALAEOCLIMATES. Encyclopedia of Geology, 131–


140

-Srikanthan, R., & McMahon, T. (2001). Stochastic generation of annual, monthly and daily
climate data: A review. Hydrology and Earth System Sciences, 653–670.

Tjebane,W (2022). ASSESSMENT OF DAILY STOCHASTIC RAINFALL GENERATION IN


SOUTHERN AFRICA 4-8.

-Wiegand, K.(2008, February 2). SERGE: a spatially explicit generator of local rainfall in
southern Africa. Retrieved from SciELO: http://www.scielo.org.za/scielo.php?
script=sci_arttext&pid=S0038-23532008000100010.

–Ye, Lei. (2018, December 17). The probability distribution of daily precipitation at the point
and catchment scales in the United States. Retrieved from European Geosciences Union:
https://hess.copernicus.org/articles/22/6519/2018/

-Zhao, Y., & Nearing, M. A. (2019). A daily spatially explicit stochastic rainfall generator for a
semi-arid climate. Elsevier, 181-192.

-Zhao, Y., Nearing, M. A., & Guertin, D. P. (2019). A daily spatially explicit stochastic rainfall
generator for a semi-arid climate. Journal of Hydrology, 181-192.

-Zhou, Y., Yu, Z. J., Li, J., Huang, Y., & Zhang, G. (2017). The Effect of Temporal Resolution on
the Accuracy of Predicting Building Occupant Behaviour based on Markov Chain Models.
ScienceDirect: 10th International Symposium on Heating, Ventilation and Air Conditioning,
ISHVAC2017 , Procedia Engineering 205 (2017) 1698 1704.

55
CIVN4005A-Investigational Project Report

-Zucchini, W., Adamson, P. T., & McNeill, L. (1992). A model of southern African Rainfall. S.
Afr. J. Sci, 103-109.
