
Fuller Correction Factor and Chi-Squared Test

The document discusses the Fuller Correction Factor for estimating maximum instantaneous flow based on annual flow data and details the Chi-Squared goodness of fit test for assessing the fit of flow records to a probability distribution. It provides a method for calculating class intervals, expected values, and applying the Smirnov-Kolmogorov test for ungrouped data. Examples illustrate the application of these statistical tests to flow data to determine adherence to a Normal distribution at a specified significance level.

FULLER CORRECTION FACTOR: Ff

This factor allows the maximum instantaneous flow to be estimated from the maximum annual flow:

$F_f = 1 + \dfrac{2.66}{A^{0.3}}$, $Q_{MI} = F_f \cdot Q_{MA}$

Where:
A: Basin area, km2
QMI: Maximum instantaneous flow
QMA: Maximum annual flow
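As a quick numerical check, the factor can be evaluated in a few lines of Python. This is a minimal sketch that assumes the commonly quoted form of Fuller's relation, Ff = 1 + 2.66/A^0.3 with A in km2; the function names `fuller_factor` and `instantaneous_flow` are illustrative and do not come from the source.

```python
def fuller_factor(area_km2: float) -> float:
    """Fuller correction factor, assuming Ff = 1 + 2.66 / A**0.3 (A in km2)."""
    if area_km2 <= 0:
        raise ValueError("basin area must be positive")
    return 1.0 + 2.66 / area_km2 ** 0.3


def instantaneous_flow(q_annual: float, area_km2: float) -> float:
    """Maximum instantaneous flow, QMI = Ff * QMA."""
    return fuller_factor(area_km2) * q_annual


ff = fuller_factor(250.0)
print(f"Ff = {ff:.2f}")
```

For a 250 km2 basin this gives a factor of about 1.51, which is the value used in the worked example.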

GOODNESS OF FIT TEST
CHI SQUARED: χ²

The general expression of Chi Squared is given by:

$\chi^2_C = \sum_{i=1}^{k} \dfrac{(\theta_i - e_i)^2}{e_i}$

Where:
$\chi^2_C$: Calculated value of Chi Squared from the record of maximum instantaneous flows.
$\theta_i$: Number of observed values in class interval 'i'
$e_i$: Number of expected values in class interval 'i'
k: Number of class intervals

The theoretical value of Chi Squared, $\chi^2_t$, is obtained from the table of the Chi Squared distribution, considering:
Significance level: α, generally α = 5%
Degrees of freedom: k – h – 1, where:
h = 2 for the Normal, LogNormal and Gumbel distributions
h = 3 for the Pearson III and LogPearson III distributions

Decision Criterion:

If $\chi^2_C \le \chi^2_t$: the flow record fits the probability distribution at the significance level considered.
If $\chi^2_C > \chi^2_t$: the flow record does not fit the probability distribution at the significance level considered. Another distribution should be tried.

This method requires grouping the data into class intervals. It is best suited to fitting the Normal distribution, since it was developed for normal and independent data. In practice, it is used to verify the fit to any probability distribution.

Example:
For the following record of maximum annual flows, in m3/s, registered at a station whose basin area is 250 km2, apply the Chi-Squared test and determine whether the data fit the Normal distribution at a 5% significance level.
Year QMA Year QMA Year QMA Year QMA Year QMA
1951 156.49 1961 71.59 1971 145.96 1981 115.63 1991 182.65
1952 146.89 1962 100.33 1972 218.08 1982 144.57 1992 93.31
1953 106.82 1963 182.38 1973 249.93 1983 173.91 1993 156.29
1954 121.46 1964 80.73 76.23 1994 108.61
136.62 149.4
1956 216.82 1966 137.22 1976 96.49 1986 157.55 1996 136.56
93.97 92.12
1958 170.86 1968 107.88 1978 174.04 1988 143.71 1998 125.03
73.84
1960 133.25 1970 139.8 1980 130.6 1990 146.16 2000 150.13

Applying the Fuller factor to the record of maximum annual flows, we determine the maximum instantaneous flows for a basin area of 250 km2:

Ff = 1.51

The maximum instantaneous flows are then as follows:
Year QMI Year QMI Year QMI Year QMI Year QMI
236.30
221.80
161.30
183.40
1955 206.30 1965 233.51 1975 284.39 1985 344.30 1995 225.59
327.40
141.89
258.00
111.50
201.21

Our goal is to calculate $\chi^2_C$ and compare it with $\chi^2_t$.

Calculation of the number of class intervals: k

According to Yevjevich:

$k = 1 + 1.33 \ln N$

N: Sample size, in this case N = 50, so k = 1 + 1.33 ln(50) = 6.20 ≈ 6

Calculation of the amplitude of the class interval: ΔQ

$\Delta Q = \dfrac{Q_{max} - Q_{min}}{k - 1}$

In this case:
Qmax = 377.39 m3/s, Qmin = 108.10 m3/s
ΔQ = (377.39 – 108.10)/(6 – 1) = 53.86 m3/s

Calculation of the first class interval:
Lower Limit: Qmin – ΔQ/2
Upper Limit: Qmin + ΔQ/2
Replacing:
Lower Limit: 108.10 – 53.86/2 = 81.17
Upper Limit: 108.10 + 53.86/2 = 135.03
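The grouping steps above can be sketched in Python. The sketch assumes that the number of classes follows k = 1 + 1.33 ln(N), rounded to the nearest integer, and that the interval width is (Qmax – Qmin)/(k – 1); both are consistent with the k = 6 and ΔQ = 53.86 used in this example. The function name `class_intervals` is illustrative only.

```python
import math


def class_intervals(q_min: float, q_max: float, n: int):
    """Return (k, dq, limits) for grouping a flow record.

    Assumes k = 1 + 1.33 ln(N), rounded to the nearest integer,
    and an interval width dq = (q_max - q_min) / (k - 1).
    """
    k = round(1 + 1.33 * math.log(n))
    dq = (q_max - q_min) / (k - 1)
    lower = q_min - dq / 2  # the first interval is centred on q_min
    limits = [(lower + i * dq, lower + (i + 1) * dq) for i in range(k)]
    return k, dq, limits


k, dq, limits = class_intervals(108.10, 377.39, 50)
print(k, round(dq, 2))
for lo, hi in limits:
    print(f"{lo:8.2f} {hi:8.2f}")
```

Because the worked example rounds ΔQ to 53.86 before building the limits, the later limits printed here may differ from the tabulated ones by 0.01.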

We can now calculate the 6 class intervals, the class marks (the average of the upper and lower limits of each interval) and the number of values observed in each class interval: θ

k  Linf    Lsup    Mclass  θ
1  81.17   135.03  108.10  5
2  135.03  188.89  161.96  14
3  188.89  242.75  215.82  19
4  242.75  296.61  269.68  6
5  296.61  350.47  323.54  4
6  350.47  404.33  377.40  2

To find the number of expected values in each class interval (e), we apply the following relationship:

$e_i = N \left[ F(S_i) - F(I_i) \right]$

Where:
F(Si): Probability distribution function at the upper limit of class interval 'i'.
F(Ii): Probability distribution function at the lower limit of class interval 'i'.

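To determine the Normal distribution function, we must first determine the standardized variable (Z):

$Z_i = \dfrac{Q_i - Q_m}{S}$

Zi: Standardized variable for the flow Qi (here, each class limit)
Qm: Average of the grouped values
S: Standard deviation of the grouped values

$Q_m = \dfrac{1}{N} \sum_{i=1}^{k} \theta_i Q_i$, $S = \sqrt{\dfrac{\sum_{i=1}^{k} \theta_i (Q_i - Q_m)^2}{N - 1}}$

Where:
Qi: Class mark of each interval
Replacing values we obtain:
Qm = 211.51 m3/s, S = 66.05 m3/s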
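The grouped statistics can be reproduced with a short Python sketch. It assumes a frequency-weighted mean of the class marks and a sample standard deviation with divisor N – 1, which is the choice that matches the quoted S = 66.05; the variable and function names are illustrative.

```python
import math

# Class marks and observed counts from the example's frequency table.
class_marks = [108.10, 161.96, 215.82, 269.68, 323.54, 377.40]
observed = [5, 14, 19, 6, 4, 2]

n = sum(observed)
# Frequency-weighted mean of the class marks.
q_mean = sum(f * q for f, q in zip(observed, class_marks)) / n
# Sample standard deviation (divisor N - 1) of the grouped values.
q_std = math.sqrt(
    sum(f * (q - q_mean) ** 2 for f, q in zip(observed, class_marks)) / (n - 1)
)


def standardize(q: float) -> float:
    """Standardized variable Z for a flow q."""
    return (q - q_mean) / q_std


print(round(q_mean, 2), round(q_std, 2), round(standardize(81.17), 4))
```

Standardizing the first lower limit, 81.17, gives a Z close to the –1.9735 listed for that class.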
The calculations are shown in the following table:

Linf    Lsup    Mclass  θ   Zinf     Zsup
81.17   135.03  108.10  5   -1.9735  -1.1580
135.03  188.89  161.96  14  -1.1580  -0.3425
188.89  242.75  215.82  19  -0.3425  0.4730
242.75  296.61  269.68  6   0.4730   1.2885
296.61  350.47  323.54  4   1.2885   2.1040
350.47  404.33  377.40  2   2.1040   2.9195

From the normal distribution table we find the values of F(Z) for the lower and upper limits of each class interval and complete the calculations:

Linf    Lsup    Mclass  θ   Zinf     Zsup     F(Zinf)  F(Zsup)  ei     (θi – ei)²/ei
81.17   135.03  108.10  5   -1.9735  -1.1580  0.0242   0.1234   4.96   0.00
135.03  188.89  161.96  14  -1.1580  -0.3425  0.1234   0.3660   12.13  0.29
188.89  242.75  215.82  19  -0.3425  0.4730   0.3660   0.6819   15.80  0.65
242.75  296.61  269.68  6   0.4730   1.2885   0.6819   0.9012   10.97  2.25
296.61  350.47  323.54  4   1.2885   2.1040   0.9012   0.9823   4.06   0.00
350.47  404.33  377.40  2   2.1040   2.9195   0.9823   0.9982   0.80   1.80

Adding the last column we find: $\chi^2_C = 4.99$

From the Chi-Squared distribution table with:
Degrees of freedom: k – h – 1 = 6 – 2 – 1 = 3
Significance level: α = 5%
We obtain: $\chi^2_t = 7.81$

Comparing, we see that $\chi^2_C < \chi^2_t$; that is, the flow record fits the Normal probability distribution at a significance level of 5%.
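The whole Chi-Squared calculation of this example can be sketched in Python. The class limits, observed counts, Qm, S and the tabulated value 7.81 are taken from the example; the Normal distribution function is evaluated with `math.erf` instead of a printed table, so the result may differ from the hand calculation in the second decimal. Variable names are illustrative.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal distribution function F(Z), via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Values from the worked example.
limits = [81.17, 135.03, 188.89, 242.75, 296.61, 350.47, 404.33]
observed = [5, 14, 19, 6, 4, 2]
q_mean, q_std = 211.51, 66.05
chi2_table = 7.81  # tabulated value for 3 degrees of freedom, alpha = 5%

n = sum(observed)
chi2_calc = 0.0
for i, theta in enumerate(observed):
    z_inf = (limits[i] - q_mean) / q_std
    z_sup = (limits[i + 1] - q_mean) / q_std
    # Expected count: e_i = N [F(S_i) - F(I_i)]
    expected = n * (normal_cdf(z_sup) - normal_cdf(z_inf))
    chi2_calc += (theta - expected) ** 2 / expected

fits = chi2_calc <= chi2_table
print(f"chi2_calc = {chi2_calc:.2f}, fits Normal: {fits}")
```

The computed statistic comes out close to the 4.99 obtained by hand and stays below the tabulated 7.81.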
GOODNESS OF FIT TEST
SMIRNOV - KOLMOGOROV

This test consists of comparing the maximum absolute value of the difference between the observed and fitted probability distribution functions, ΔC, with a theoretical value Δt that depends on the number of data and the significance level α.

$\Delta_C = \max \left| F(Q) - P(Q) \right|$

Where:
ΔC: Calculated Smirnov statistic.
F(Q): Fitted probability distribution function.
P(Q): Observed probability distribution function (probability of non-exceedance):

$P(Q) = 1 - \dfrac{m}{N + 1}$

where m is the order number of the flow in the record sorted in descending order and N is the number of data.

Table 1: Values of Δt as a function of the sample size N and the significance level α
N  α=0.20  α=0.10  α=0.05  α=0.01
5 0.45 0.51 0.56 0.67
10 0.32 0.37 0.41 0.49
15 0.27 0.30 0.34 0.40
20 0.23 0.26 0.29 0.36
25 0.21 0.24 0.27 0.32
30 0.19 0.22 0.24 0.29
35 0.18 0.20 0.23 0.27
40 0.17 0.19 0.21 0.25
45 0.16 0.18 0.20 0.24
50 0.15 0.17 0.19 0.23

> 50
Decision Criterion:
If ΔC ≤ Δt: the flow record fits the probability distribution at the significance level considered.
If ΔC > Δt: the flow record does not fit the probability distribution at the significance level considered. Another distribution should be tried.
This test is applicable to ungrouped data; that is, it does not require creating class intervals, and it is applicable to any probability distribution.
It is not an exact test but rather an approximate one.

Example:
For the flow record of the previous example, apply the Smirnov–Kolmogorov test to determine whether the data fit the Normal distribution at a significance level of 5%.
Based on the record of maximum instantaneous flows:
Year QMI Year QMI Year QMI Year QMI Year QMI
236.30
221.80
161.30
183.40
1955 206.30 1965 233.51 1975 284.39 1985 344.30 1995 225.59
1956 327.40 1966 207.20 1976 145.70 1986 237.90 1996 206.21
141.89
1958 258.00 1968 162.90 1978 262.80 1988 217.00 1998 188.80
111.50
201.21
We carry out the following calculations:

Columns 1 and 2: The flows are arranged in descending order.

Column 3: The empirical probability of the flows is calculated, applying:

$P(Q) = 1 - \dfrac{m}{N + 1}$

Column 4: The standardized variable (Z) is calculated, as a preliminary step for the calculation of the Normal probability distribution function:

$Z = \dfrac{Q - Q_m}{S}$

Column 5: The Normal probability distribution function F(Z) is obtained from the table.

Column 6: The Smirnov statistic is calculated, applying:

$\Delta = \left| F(Z) - P(Q) \right|$

That is, the absolute value of the difference between the results of columns 5 and 3.

The following table shows these calculations:

1 2 3 4 5 6
m Q P(Q) Z F(Z) Abs(F - P)
1 377.4 0.9804 2.453 0.993 0.013
2 357.3 0.9608 2.152 0.984 0.024
3 344.3 0.9412 1.957 0.975 0.034
4 329.3 0.9216 1.732 0.958 0.037
5 329.3 0.9020 1.732 0.958 0.056
6 327.4 0.8824 1.703 0.956 0.073
7 284.4 0.8627 1.059 0.855 0.008
8 275.8 0.8431 0.930 0.824 0.019
9 275.4 0.8235 0.924 0.822 0.001
10 262.8 0.8039 0.735 0.769 0.035
11 262.6 0.7843 0.732 0.768 0.016
12 258.0 0.7647 0.663 0.746 0.018
13 237.9 0.7451 0.362 0.641 0.104
14 236.3 0.7255 0.338 0.632 0.093
15 236.0 0.7059 0.333 0.630 0.075
16 233.8 0.6863 0.300 0.618 0.068
17 233.5 0.6667 0.296 0.616 0.050
18 226.7 0.6471 0.194 0.577 0.070
19 225.6 0.6275 0.177 0.570 0.057
20 221.8 0.6078 0.120 0.548 0.060
21 220.7 0.5882 0.104 0.541 0.047
22 220.4 0.5686 0.099 0.540 0.029
23 218.3 0.5490 0.068 0.527 0.022
24 217.0 0.5294 0.048 0.519 0.010
25 211.1 0.5098 -0.040 0.484 0.026
26 208.3 0.4902 -0.082 0.467 0.023
27 207.2 0.4706 -0.099 0.461 0.010
28 206.3 0.4510 -0.112 0.455 0.004
29 206.2 0.4314 -0.114 0.455 0.023
30 201.2 0.4118 -0.189 0.425 0.013
31 197.2 0.3922 -0.249 0.402 0.010
32 188.8 0.3725 -0.375 0.354 0.019
33 184.0 0.3529 -0.446 0.328 0.025
34 183.4 0.3333 -0.455 0.324 0.009
35 174.6 0.3137 -0.587 0.278 0.035
36 174.6 0.2941 -0.587 0.278 0.016
37 164.0 0.2745 -0.746 0.228 0.047
38 162.9 0.2549 -0.763 0.223 0.032
39 161.3 0.2353 -0.787 0.216 0.020
40 158.0 0.2157 -0.836 0.202 0.014
41 151.5 0.1961 -0.934 0.175 0.021
42 145.7 0.1765 -1.021 0.154 0.023
43 141.9 0.1569 -1.078 0.141 0.016
44 140.9 0.1373 -1.093 0.137 0.000
45 139.1 0.1176 -1.120 0.131 0.014
46 121.9 0.0980 -1.377 0.084 0.014
47 115.1 0.0784 -1.479 0.070 0.009
48 112.2 0.0588 -1.523 0.064 0.005
49 111.5 0.0392 -1.533 0.063 0.023
50 108.1 0.0196 -1.584 0.057 0.037

Where:
N = 50, Qm = 213.78 m3/s, S = 66.70 m3/s

From column 6, we see that the maximum value is 0.104; that is:
ΔC = 0.104

From Table 1, for N = 50 and α = 5%, we find:
Δt = 0.190

Comparing, we see that ΔC < Δt; that is, the flow record fits the Normal probability distribution at a significance level of 5%.
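The six-column calculation can be sketched in Python. The flows are those listed in column 2 of the table, and Qm, S and Δt are the values quoted in the example. The empirical probability is taken as P = 1 – m/(N + 1), which reproduces column 3, and F(Z) is evaluated with `math.erf` rather than read from a table. Variable names are illustrative.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal distribution function F(Z), via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Column 2 of the table: maximum instantaneous flows (m3/s).
flows = [
    377.4, 357.3, 344.3, 329.3, 329.3, 327.4, 284.4, 275.8, 275.4, 262.8,
    262.6, 258.0, 237.9, 236.3, 236.0, 233.8, 233.5, 226.7, 225.6, 221.8,
    220.7, 220.4, 218.3, 217.0, 211.1, 208.3, 207.2, 206.3, 206.2, 201.2,
    197.2, 188.8, 184.0, 183.4, 174.6, 174.6, 164.0, 162.9, 161.3, 158.0,
    151.5, 145.7, 141.9, 140.9, 139.1, 121.9, 115.1, 112.2, 111.5, 108.1,
]
q_mean, q_std = 213.78, 66.70  # values quoted in the example
delta_table = 0.19             # Table 1, N = 50, alpha = 0.05

n = len(flows)
delta_calc = 0.0
for m, q in enumerate(sorted(flows, reverse=True), start=1):
    p_obs = 1.0 - m / (n + 1)                  # column 3
    f_fit = normal_cdf((q - q_mean) / q_std)   # columns 4 and 5
    delta_calc = max(delta_calc, abs(f_fit - p_obs))  # column 6

print(f"delta_calc = {delta_calc:.3f}, fits Normal: {delta_calc <= delta_table}")
```

The largest difference occurs at m = 13 (Q = 237.9 m3/s), in agreement with the table.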
CALCULATION OF FLOOD FLOWS
Ven Te Chow found that these flows can be calculated by the expression:

$Q_T = Q_m + K_T \, S$

Where:
QT: Flood flow for a return period 'T'
Qm: Average of the maximum instantaneous flows.
S: Standard deviation of the maximum instantaneous flows.
KT: Frequency factor, which depends on the probability distribution that best fits the record of maximum instantaneous flows.

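NORMAL DISTRIBUTION
The normal probability density function is defined as:

$f(Q) = \dfrac{1}{S \sqrt{2\pi}} \exp\left[ -\dfrac{(Q - Q_m)^2}{2 S^2} \right]$

and the normal probability distribution function as:

$F(Z) = \dfrac{1}{\sqrt{2\pi}} \int_{-\infty}^{Z} e^{-z^2/2} \, dz$

Where Z is the standardized variable, calculated as:

$Z = \dfrac{Q - Q_m}{S}$

Q: Flow
Qm: Average of the maximum instantaneous flows.
S: Standard deviation of the maximum instantaneous flows.
Solving for Q:

$Q = Q_m + Z \, S$

Comparing with Ven Te Chow's expression:

$Q_T = Q_m + K_T \, S$

we see that the frequency factor of the Normal distribution is given by the standardized variable Z.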

F(a) = P(Z≤a) → Tables


F(-a) = P(Z≤-a) = 1–P(Z≤a)

The value of Z corresponding to an exceedance probability P (P = 1/T) can be calculated by finding the value of an intermediate variable 'w':

$w = \left[ \ln\left( \dfrac{1}{P^2} \right) \right]^{1/2}$  (1)

Where: 0 < P ≤ 0.5

and then applying expression (2):

$Z = w - \dfrac{2.515517 + 0.802853\,w + 0.010328\,w^2}{1 + 1.432788\,w + 0.189269\,w^2 + 0.001308\,w^3}$  (2)

If P > 0.5, (1 – P) is substituted for P in equation (1), and a negative sign is assigned to the value of Z found.
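The two-step calculation can be sketched in Python. The sketch assumes that equation (1) is w = [ln(1/P²)]^(1/2) and that expression (2) is the rational approximation usually given with this method, with the coefficients written out in the code; the function name `z_from_exceedance` is illustrative.

```python
import math


def z_from_exceedance(p: float) -> float:
    """Approximate Z for an exceedance probability p (p = 1/T).

    Assumes w = sqrt(ln(1 / p**2)) and the rational approximation
    Z = w - (c0 + c1 w + c2 w^2) / (1 + d1 w + d2 w^2 + d3 w^3).
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    q = p if p <= 0.5 else 1.0 - p  # use 1 - p when p > 0.5
    w = math.sqrt(math.log(1.0 / q ** 2))
    z = w - (2.515517 + 0.802853 * w + 0.010328 * w ** 2) / (
        1.0 + 1.432788 * w + 0.189269 * w ** 2 + 0.001308 * w ** 3
    )
    return z if p <= 0.5 else -z  # negative sign when p > 0.5


for t in (60, 100):
    print(t, round(z_from_exceedance(1.0 / t), 2))
```

For return periods of 60 and 100 years this gives Z values that round to 2.13 and 2.33, the same as those read from the normal distribution table.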

Example 1:
Given the maximum instantaneous flows, in m3/s, at a flow measurement station, determine:
a. The probability that in any given year the flow is greater than or equal to 7460 m3/s. Also calculate its return period.
b. The flood flow for return periods of 60 years and 100 years.

Consider that the flow record fits the Normal probability distribution.
Year  Q (m3/s)  Year  Q (m3/s)  Year  Q (m3/s)
1965 3706 1975 2367 1985 4240
1966 4060 1976 4819 1986 2849
1967 2350 1977 3919 1987 6267
1968 6000 1978 6900 1988 2246
1969 4744 1979 3505 1989 7430
1970 6388 1980 7061 1990 5971
1971 2675 1981 3220 1991 3747
1972 3130 1982 2737 1992 5468
1973 2298 1983 5565 1993 3682
1974 4972 1984 2414 1994 2230

a. P(Q ≥ 7460 m3/s) = ?

We are asked to apply the Normal probability distribution, so we must standardize this flow. First we calculate:
Qm = 4232.00 m3/s, S = 1629.96 m3/s
Then:

$Z = \dfrac{7460 - 4232.00}{1629.96} = 1.98$

With this value, we enter the Normal distribution table and find:
F(Z) = F(1.98) = P(Z<1.98) = P(Q<7460) = 0.9762
Therefore, P(Q≥7460) = 0.0238 = 2.38%
Then: T = 1/P = 1/0.0238 = 42 years

b. Q60 = ? Q100 = ?
We know that:

$Z = \dfrac{Q - Q_m}{S}$

Solving for Q:

$Q_T = Q_m + Z_T \, S$

For T = 60 years: P = 1/60 = 0.0167, so F(Z60) = 1 – P = 0.9833

From the normal distribution table: Z60 = 2.13
Replacing:

$Q_{60} = 4232.00 + 2.13 \times 1629.96$

We find: Q60 = 7703.8 m3/s

Similarly, for T = 100 years: P = 1/100 = 0.0100, so F(Z100) = 1 – P = 0.9900

From the normal distribution table: Z100 = 2.33
Finally: Q100 = 8029.8 m3/s
Z could also be found by applying equation (1) with P = 0.0167:
w = 2.8609; replacing in equation (2):
Z = 2.13
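Example 1 can be reproduced with a short Python sketch. The flows are those of the table; the mean and the sample standard deviation are computed with the standard `statistics` module, F(Z) is evaluated with `math.erf` instead of a table, and the frequency factors 2.13 and 2.33 are the tabulated values used above. Variable names are illustrative.

```python
import math
import statistics


def normal_cdf(z: float) -> float:
    """Standard normal distribution function F(Z), via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Maximum instantaneous flows 1965-1994 (m3/s), from the example table.
flows = [
    3706, 4060, 2350, 6000, 4744, 6388, 2675, 3130, 2298, 4972,
    2367, 4819, 3919, 6900, 3505, 7061, 3220, 2737, 5565, 2414,
    4240, 2849, 6267, 2246, 7430, 5971, 3747, 5468, 3682, 2230,
]
q_mean = statistics.mean(flows)
q_std = statistics.stdev(flows)  # sample standard deviation (N - 1)

# Part a: exceedance probability and return period of 7460 m3/s.
z = (7460 - q_mean) / q_std
p_exceed = 1.0 - normal_cdf(z)
t_return = 1.0 / p_exceed

# Part b: flood flows, with Z read from the normal table in the example.
z_table = {60: 2.13, 100: 2.33}
q_design = {t: q_mean + z_t * q_std for t, z_t in z_table.items()}

print(f"Qm = {q_mean:.2f}, S = {q_std:.2f}, Z = {z:.2f}")
print(f"P = {p_exceed:.4f}, T = {t_return:.0f} years")
for t, q in q_design.items():
    print(f"Q{t} = {q:.1f} m3/s")
```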

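LOG NORMAL DISTRIBUTION

The log-normal probability density function is defined as:

$f(Q) = \dfrac{1}{Q \, S_Y \sqrt{2\pi}} \exp\left[ -\dfrac{(Y - Y_m)^2}{2 S_Y^2} \right]$

and the probability distribution function as:

$F(Z) = \dfrac{1}{\sqrt{2\pi}} \int_{-\infty}^{Z} e^{-z^2/2} \, dz$

Where Z is the standardized variable, calculated as:

$Z = \dfrac{Y - Y_m}{S_Y}$

Y: Logarithm of the maximum instantaneous flow, Y = Ln Q
Ym: Average of the Y values.
SY: Standard deviation of the Y values.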
Example 2:
Solve Example 1 considering that the flow record fits the Log Normal probability distribution.
From the original flow record:
Year  Q (m3/s)  Year  Q (m3/s)  Year  Q (m3/s)
1965 3706 1975 2367 1985 4240
1966 4060 1976 4819 1986 2849
1967 2350 1977 3919 1987 6267
1968 6000 1978 6900 1988 2246
1969 4744 1979 3505 1989 7430
6388
1971 2675 1981 3220 1991 3747
1972 3130 1982 2737 1992 5468
1973 2298 1983 5565 1993 3682
1974 4972 1984 2414 1994 2230

We calculate the variable Y = Ln Q:

Year  Y  Year  Y  Year  Y
8.2177
1966 8.3089 1976 8.4803 1986 7.9547
7.7622
8.6995
8.4646
8.7622
7.8917
8.0488
1973 7.7398 1983 8.6243 1993 8.2112
8.5116

Furthermore:
a. P(Q ≥ 7460 m3/s) = ?

Y = Ln 7460 = 8.92, Z = 1.63

From the normal distribution table we obtain:
F(Z) = F(1.63) = P(Z<1.63) = P(Y<8.92) = P(Q<7460) = 0.9489
Therefore, P(Q≥7460) = 0.0511 = 5.11%
Then: T = 1/P = 1/0.0511 ≈ 20 years

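b. Q60 = ? Q100 = ?
We know:

$Y_T = Y_m + Z_T \, S_Y$, $Q_T = e^{Y_T}$

where Ym and SY are the average and the standard deviation of the Y values.

For T = 60 years: F(Z60) = 1 – 1/60 = 0.9833

From the normal distribution table: Z60 = 2.13

Replacing, we find: Y60 = 9.11, Q60 = 9045.3 m3/s

In the same way, for T = 100 years: F(Z100) = 1 – 1/100 = 0.9900

From the normal distribution table: Z100 = 2.33

Finally: Y100 = 9.19, Q100 = 9798.7 m3/s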
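Example 2 can be sketched in Python in the same way, working with Y = Ln Q. The flows are those of Example 1; the mean and the sample standard deviation of Y are computed from the data, since they are not quoted in the text, and the frequency factors 2.13 and 2.33 are the tabulated values used above. Variable names are illustrative.

```python
import math
import statistics


def normal_cdf(z: float) -> float:
    """Standard normal distribution function F(Z), via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Maximum instantaneous flows 1965-1994 (m3/s), from Example 1.
flows = [
    3706, 4060, 2350, 6000, 4744, 6388, 2675, 3130, 2298, 4972,
    2367, 4819, 3919, 6900, 3505, 7061, 3220, 2737, 5565, 2414,
    4240, 2849, 6267, 2246, 7430, 5971, 3747, 5468, 3682, 2230,
]
y = [math.log(q) for q in flows]  # Y = Ln Q
y_mean = statistics.mean(y)
y_std = statistics.stdev(y)  # sample standard deviation (N - 1), an assumption

# Part a: exceedance probability and return period of 7460 m3/s.
z = (math.log(7460) - y_mean) / y_std
p_exceed = 1.0 - normal_cdf(z)
t_return = 1.0 / p_exceed

# Part b: flood flows, with Z read from the normal table in the example.
z_table = {60: 2.13, 100: 2.33}
q_design = {t: math.exp(y_mean + z_t * y_std) for t, z_t in z_table.items()}

print(f"Ym = {y_mean:.4f}, SY = {y_std:.4f}, Z = {z:.2f}")
print(f"P = {p_exceed:.4f}, T = {t_return:.1f} years")
for t, q in q_design.items():
    print(f"Q{t} = {q:.1f} m3/s")
```

The flows obtained this way differ slightly from the hand calculation because the example rounds Y to two decimals before taking the exponential.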
