MAT 442 TIME SERIES ANALYSIS
WHAT IS A TIME SERIES?
A time series is a set of data collected and recorded over time. To observe the pattern of the
recorded data within a span of time, such data can be graphed. The graph is called a historigram,
meaning a history of the record over time.
In all the sciences and social sciences, and particularly in economics and business, the problem of
how conditions change with the passage of time is of utmost importance. For the study of such
problems, the appropriate kind of statistical information consists of data in the form of time series:
figures which show the magnitude of a phenomenon month after month or year after year. The
proper methods for treating such data, and thus summarizing the experience which they represent,
are an indispensable part of the practicing statistician's equipment.
A time series may be organized and plotted at a fixed time interval: daily, weekly, monthly,
quarterly, yearly or any other time interval. The graph is usually plotted with the horizontal axis
chosen as the time axis. The record of data needs to be kept for a lengthy period of time, usually 10
to 15 years or even more, in order to reveal the pattern that contains much of the information
about the data. Examples of time series data include:
(i). Monthly sales of a company's product.
(ii). Quarterly values
(iii). Total annual production of a country for a given number of years.
(iv). Annual student enrolment of a given institution.
(v). Annual records of criminal cases.
(vi). Monthly records of the number of HIV patients.
(vii). Annual record of rainfall distribution in a given area.
(viii). Annual records of staff appraisal of a given organization.
(ix). Weekly sales of bread by a confectionery.
(x). Annual records of a bank's net profit.
Since the observations are recorded over time, we can say that the observations depend on time,
which can be stated mathematically as: the observations are a function of time. If we represent
each observation by $Y_i$ and the corresponding time by $t_i$ ($i = 1, 2, 3, \ldots, n$), then the
relationship is generally written as
$$Y_i = f(t_i)$$
Components of time series
A time series is composed of four components (elements or factors) that usually influence the
behaviour of the data over time. They are:
1). Secular trend, denoted by T.
2). Seasonal variation, denoted by S.
3). Cyclical fluctuation, denoted by C.
4). Irregular or residual variation, denoted by I.
Secular Trend
This refers to the general direction in which the graph of a time series appears to be going over a
long period of time; it explains the growth or decline of the series over that period. A time
series is said to contain a trend if the mean of the series changes systematically with time.
It is the long-term tendency of the whole data series to rise or fall. In other words, the value of
the variable tends to increase or decrease over a long period of time. An example is the
steady increase in the cost of living recorded by the consumer price index.
There are basically three reasons why we study trend. They are:
1). It describes the historical pattern. There are many instances when we can use a past trend to
evaluate the success of a previous policy. For example, a university may evaluate the
effectiveness of a recruiting programme by examining its past enrolment trends.
[Figure: trend plot]
2). It permits us to project past patterns, or trends, into the future. Knowledge of the past can
tell us a great deal about the future.
3). It allows us to eliminate the trend component from the time series, which makes it easier to
study the other components of the series.
Trends can be linear or curvilinear.
Seasonal variation
This refers to short-term fluctuations or changes that occur at regular intervals of less than a year.
It is usually brought about by climatic and social factors, typically an event occurring at a
particular period of the year. Examples are the sale of cards during the Valentine period and the
sale of chicken during Christmas, New Year or any other festive period.
This component is defined as a repetitive and predictable movement around the trend line in one
year or less. In order to detect seasonal variation, time intervals need to be measured in small
units, such as days, weeks, months or quarters. The reasons for studying seasonal variation include:
1). To establish the pattern of past changes. This gives us a framework for comparing two time
intervals that would otherwise be too dissimilar.
2). To project past patterns into the future. The ability to predict seasonal fluctuations is
often essential for short-run decisions.
3). To eliminate its effects from the time series once its existing pattern is established. This
adjustment allows us to calculate the cyclical variation that takes place each year. When the
effect of seasonal variation is eliminated from the time series, the series is said to be
deseasonalized.
Cyclical variation
This component of a time series tends to oscillate above and below the secular trend line for
periods longer than one year. It is a wave-like, recurrent up-and-down movement in the
observed data due to economic causes such as recession or boom. To identify this component, we
will use the residual method.
It refers to long-term variations about the trend, usually caused by disruption in services or
socio-economic activities. Cyclical variations are commonly associated with economic cycles,
that is, successive booms and slumps in the economy. A good example is the business cycle.
An annual time series is considered to have only trend, cyclical and irregular variations as
components. This is because seasonal variation makes a complete, regular cycle within each year
and thus does not affect one year any more than another, which is why seasonal variation is
disregarded for annual data. Since the trend can be described by a trend line, the cyclical and
irregular components can be isolated from it. We then assume that the cyclical component
explains most of the variation left unexplained by the trend component. (Many real-life time
series do not satisfy this assumption, and methods such as Fourier and spectral analysis can then
be used to analyze the cyclical variation.)
Irregular variation
This refers to movements in a time series that follow no definite pattern. They are usually caused
by unusual, unexpected and unpredictable events such as strikes, wars, floods and other disasters.
These are random fluctuations in the data which cannot be explained or ascribed to any systematic
cause and cannot be predicted in advance. Because of this random error, time series forecasts
cannot be 100 percent accurate. Irregular variation occurs over short intervals of time and follows
a random pattern. One factor that allows decision makers to cope with irregular variation is that,
over time, these random movements tend to counteract each other.
Note: The emergence of any one of the components is not dependent on the emergence or
non-emergence of another. The components are therefore assumed to be independent.
Decomposition of Time series
Decomposition is the breaking of a time series into its basic components.
Reasons for decomposition of time series
We decompose a time series so that we can make predictions.
In decomposing, there is a need to assume a model. Conventionally, one of two models is
assumed: (a) the additive model, (b) the multiplicative model.
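Additive model
$$Y_t = T_t + S_t + C_t + I_t$$
Multiplicative model
$$Y_t = T_t \times S_t \times C_t \times I_t$$
Taking logarithms of both sides,
$$\log Y_t = \log (T_t \times S_t \times C_t \times I_t)$$
$$\log Y_t = \log T_t + \log S_t + \log C_t + \log I_t$$
$$Y'_t = T'_t + S'_t + C'_t + I'_t$$
where a prime denotes the logarithm of the corresponding term. The multiplicative model is
therefore an additive model in which the components of the model are the logarithms of the
components of the time series.
Using natural logarithm
$$Y_t = T_t \times S_t \times C_t \times I_t$$
$$\ln Y_t = \ln (T_t \times S_t \times C_t \times I_t)$$
$$\ln Y_t = \ln T_t + \ln S_t + \ln C_t + \ln I_t$$
$$Y^*_t = T^*_t + S^*_t + C^*_t + I^*_t$$
where an asterisk denotes the natural logarithm of the corresponding term.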
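The link between the two models can be checked numerically. The sketch below is only an illustration: the component values are hypothetical numbers chosen for the demonstration, not data from these notes. It builds a series from the multiplicative model and confirms that its natural logarithm equals the sum of the logged components.

```python
import math

# Hypothetical component values for four periods (illustrative only).
trend = [100.0, 102.0, 104.0, 106.0]
seasonal = [0.9, 1.1, 1.2, 0.8]
cyclical = [1.02, 1.01, 0.99, 0.98]
irregular = [1.01, 0.99, 1.00, 1.02]

# Multiplicative model: Y_t = T_t * S_t * C_t * I_t
y = [t * s * c * i for t, s, c, i in zip(trend, seasonal, cyclical, irregular)]

# After taking logs, the same series is a sum of the logged components.
log_y = [math.log(value) for value in y]
log_sum = [
    math.log(t) + math.log(s) + math.log(c) + math.log(i)
    for t, s, c, i in zip(trend, seasonal, cyclical, irregular)
]

# The two lists should agree up to floating-point rounding.
max_gap = max(abs(a - b) for a, b in zip(log_y, log_sum))
print(f"largest gap between ln(Y_t) and the sum of logged components: {max_gap:.2e}")
```

The logarithm requires every component to be positive, which is why the seasonal, cyclical and irregular values in the sketch are written as factors around 1.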
Measurement of Trend
Basically, the trend values of a time series can be estimated by any of the following methods:
1). Free hand method
2). Least square method
3). Moving average method
4). Semi-average method
1). Free Hand Method
This method involves drawing a scatter diagram of the values, with time as the independent
variable on the x-axis, and then drawing the trend line by eye. The method is criticized because
it is a subjective and inaccurate way of obtaining a trend line.
Example 1
Make a free hand sketch using the given data.
Years Sales
2010 85
2011 94
2012 110
2013 123
2014 98
2015 117
2016 125
Solution
[Figure: plot of Sales (vertical axis, 0 to 140) against Years (horizontal axis, 2010 to 2016)]
2). Least Square Method
A time series trend can be modelled as a linear function when the original data suggest that the
series tends to increase or decrease at a constant rate. The series can then be described by a
first-order polynomial, or straight-line, equation as follows:
$$y = a + bt$$
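Applying regression analysis,
$$\bar{y} = \frac{\sum y}{n}, \qquad \bar{t} = \frac{\sum t}{n}$$
$$\hat{a} = \bar{y} - \hat{b}\,\bar{t}$$
$$\hat{b} = \frac{n \sum yt - \sum y \sum t}{n \sum t^2 - \left( \sum t \right)^2}$$
Thus, the estimated trend equation is
$$\hat{y} = \hat{a} + \hat{b} t$$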
By substituting the corresponding values of $t$ in the estimated trend equation, the required trend
values are obtained. However, since the observations are ordered by time, we can simplify the
solution of the normal equations by choosing the middle of the series as the origin, which results
in $\sum t = 0$. For an odd number of periods the middle period is coded 0; for an even number of
periods the two middle periods are coded $-1$ and $1$, as in Example 3.
When $\sum t = 0$, which also implies that $\bar{t} = \frac{\sum t}{n} = 0$, the trend equations
reduce to
$$\hat{a} = \bar{y}$$
$$\hat{b} = \frac{\sum yt}{\sum t^2}$$
Thus, in computing the trend by this method, it is more convenient to use the middle of the series
as the origin.
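As a check on this simplification, the following sketch applies both the general formulas and the shortcut formulas to the sales figures of Example 1. The function name `linear_trend` and the variable names are illustrative choices, not part of these notes. The slope is the same whichever origin is used; only the intercept changes, and with the origin at the middle year it equals the mean of y.

```python
def linear_trend(t, y):
    """Least-squares estimates (a_hat, b_hat) for the trend line y = a + b*t."""
    n = len(t)
    sum_t = sum(t)
    sum_y = sum(y)
    sum_ty = sum(ti * yi for ti, yi in zip(t, y))
    sum_t2 = sum(ti * ti for ti in t)
    b_hat = (n * sum_ty - sum_y * sum_t) / (n * sum_t2 - sum_t ** 2)
    a_hat = sum_y / n - b_hat * sum_t / n
    return a_hat, b_hat


# Sales figures from Example 1 (seven years, so the middle year is coded 0).
years = [2010, 2011, 2012, 2013, 2014, 2015, 2016]
sales = [85, 94, 110, 123, 98, 117, 125]

middle = years[len(years) // 2]
coded = [yr - middle for yr in years]        # -3, -2, ..., 3, so sum(coded) == 0

a_raw, b_raw = linear_trend(years, sales)    # general formulas, t = calendar year
a_mid, b_mid = linear_trend(coded, sales)    # general formulas, t = coded time

# Shortcut formulas, valid because sum(coded) == 0.
a_short = sum(sales) / len(sales)
b_short = sum(c * s for c, s in zip(coded, sales)) / sum(c * c for c in coded)

# Fitted trend values do not depend on the choice of origin.
trend_raw = [a_raw + b_raw * yr for yr in years]
trend_mid = [a_mid + b_mid * c for c in coded]

print(f"slope: calendar years {b_raw:.4f}, coded {b_mid:.4f}, shortcut {b_short:.4f}")
print(f"intercept with coded time {a_mid:.4f}, mean of y {a_short:.4f}")
```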
Example 2
Obtain the trend of the given data using the least square method. Hence compute the trend for the
next three years.
Years        2005  2006  2007  2008  2009  2010  2011
Production     84    92    88    76    85    94    98
Solution
Years   Prod. (y)   Code for t   ty     $t^2$
2005    84          -3           -252   9
2006    92          -2           -184   4
2007    88          -1           -88    1
2008    76          0            0      0
2009    85          1            85     1
2010    94          2            188    4
2011    98          3            294    9
Total   617         0            43     28
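$$\bar{y} = \frac{\sum y}{n} = \frac{617}{7} = 88.1429$$
$$\hat{a} = \bar{y} = 88.1429$$
$$\hat{b} = \frac{\sum yt}{\sum t^2} = \frac{43}{28} = 1.5357$$
Using the trend equation
$$\hat{y} = \hat{a} + \hat{b} t$$
i). The estimated trend equation is
$$\hat{y} = 88.1429 + 1.5357t$$
ii). Next 1 year (2012), t = 4
$$\hat{y} = 88.1429 + 1.5357(4) = 94.2857$$
iii). Next 2 years (2013), t = 5
$$\hat{y} = 88.1429 + 1.5357(5) = 95.8214$$
iv). Next 3 years (2014), t = 6
$$\hat{y} = 88.1429 + 1.5357(6) = 97.3571$$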
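The arithmetic of Example 2 can be reproduced with a few lines of Python. This is a sketch that uses the same coding as the solution table (origin at 2008, the middle year); the variable names are illustrative.

```python
years = [2005, 2006, 2007, 2008, 2009, 2010, 2011]
production = [84, 92, 88, 76, 85, 94, 98]

n = len(years)
origin = years[n // 2]                      # 2008, the middle year
t = [yr - origin for yr in years]           # -3, -2, ..., 3

# Shortcut formulas, valid because sum(t) == 0.
a_hat = sum(production) / n
b_hat = sum(ti * yi for ti, yi in zip(t, production)) / sum(ti * ti for ti in t)

# Trend values for the next three years (t = 4, 5, 6).
forecast = {yr: a_hat + b_hat * (yr - origin) for yr in (2012, 2013, 2014)}

print(f"a_hat = {a_hat:.4f}, b_hat = {b_hat:.4f}")
for yr, value in forecast.items():
    print(yr, round(value, 4))
```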
Assignment 1
Obtain the trend of the given data using the least square method. Hence compute the trend for the
next three years.
Years 2011 2012 2013 2014 2015 2016 2017 2018
Demand 14 17 8 10 7 5 8 4
Model diagnostics for the least square method (trend model)
Testing the reliability of the fitted trend model tells us how good the model we have constructed
is. To do this, we calculate the residuals, or trend errors, and examine them. Since this is a form
of regression model, we can use the measure called the coefficient of determination. This measure
is denoted by $r^2$ and is given by the formula:
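$$r^2 = \left[ \frac{n \sum ty - \sum t \sum y}{\sqrt{\left( n \sum t^2 - \left( \sum t \right)^2 \right)\left( n \sum y^2 - \left( \sum y \right)^2 \right)}} \right]^2$$
For time series with the origin at the middle of the series, $\sum t = 0$, which implies that
$$r^2 = \left[ \frac{n \sum ty}{\sqrt{\left( n \sum t^2 \right)\left( n \sum y^2 - \left( \sum y \right)^2 \right)}} \right]^2$$
Also, it should be noted that $0 \le r^2 \le 1$.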
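A minimal sketch of the general formula is given below. The helper name `r_squared` and the two short series are hypothetical, chosen so that one lies exactly on a straight line and the other has no linear trend at all, which illustrates the two ends of the range of $r^2$.

```python
import math


def r_squared(t, y):
    """Coefficient of determination for a straight-line trend of y on t."""
    n = len(t)
    sum_t, sum_y = sum(t), sum(y)
    sum_ty = sum(ti * yi for ti, yi in zip(t, y))
    sum_t2 = sum(ti * ti for ti in t)
    sum_y2 = sum(yi * yi for yi in y)
    numerator = n * sum_ty - sum_t * sum_y
    denominator = math.sqrt((n * sum_t2 - sum_t ** 2) * (n * sum_y2 - sum_y ** 2))
    return (numerator / denominator) ** 2


t = [-2, -1, 0, 1, 2]                     # coded time, sum(t) == 0

on_a_line = [10 + 3 * ti for ti in t]     # hypothetical series lying exactly on a line
no_trend = [5, 1, 9, 1, 5]                # hypothetical series symmetric about the origin

r2_line = r_squared(t, on_a_line)
r2_none = r_squared(t, no_trend)

print(f"series on a line: r^2 = {r2_line:.3f}")
print(f"series with no linear trend: r^2 = {r2_none:.3f}")
```

The formula is undefined when every y value is the same, because the denominator is then zero; the sketch does not guard against that case.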
Example 3
Obtain the trend of the given data using the least square method. Calculate the coefficient of
determination and explain its significance.
Years 2011 2012 2013 2014 2015 2016 2017 2018
Citations 3 5 5 4 7 7 16 17
Solution
Years (t)   Citations (y)   Code for t   ty    $t^2$   $y^2$
2011        3               -4           -12   16      9
2012        5               -3           -15   9       25
2013        5               -2           -10   4       25
2014        4               -1           -4    1       16
2015        7               1            7     1       49
2016        7               2            14    4       49
2017        16              3            48    9       256
2018        17              4            68    16      289
Total       64              0            96    60      718
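$$\bar{y} = \frac{\sum y}{n} = \frac{64}{8} = 8$$
$$\hat{a} = \bar{y} = 8$$
$$\hat{b} = \frac{\sum yt}{\sum t^2} = \frac{96}{60} = 1.6$$
Using the trend equation $\hat{y} = \hat{a} + \hat{b} t$,
$$\hat{y} = 8 + 1.6t$$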
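Test for the significance of the trend.
$n = 8$; $\sum y = 64$; $\sum ty = 96$; $\sum t^2 = 60$; $\sum y^2 = 718$
$$r^2 = \left[ \frac{n \sum ty}{\sqrt{\left( n \sum t^2 \right)\left( n \sum y^2 - \left( \sum y \right)^2 \right)}} \right]^2$$
$$r^2 = \left[ \frac{8(96)}{\sqrt{(8 \times 60)\left( 8(718) - (64)^2 \right)}} \right]^2$$
$$r^2 = \left[ \frac{768}{\sqrt{(480)(5744 - 4096)}} \right]^2$$
$$r^2 = \left[ \frac{768}{\sqrt{(480)(1648)}} \right]^2$$
$$= \left[ \frac{768}{889.404295} \right]^2$$
$$= 0.863499^2$$
$$= 0.746$$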
On the basis of this information, it appears that our trend model would be of some use for
forecasting future citations from these data, because the model explains 74.6% of the variability
in the time series. We can therefore have some degree of confidence in its forecasting power.
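Since the diagnostic is described in terms of residuals, the same figure can also be reached from the trend errors themselves. The sketch below assumes the standard regression identity $r^2 = 1 - SSE/SST$ (sum of squared residuals divided by the total sum of squares about the mean), which is not stated in these notes but holds for a least-squares line with an intercept. The codes for t are copied from the Example 3 table; the variable names are illustrative.

```python
# Codes for t as used in the Example 3 table (the two middle years are -1 and 1).
t = [-4, -3, -2, -1, 1, 2, 3, 4]
citations = [3, 5, 5, 4, 7, 7, 16, 17]

n = len(t)
a_hat = sum(citations) / n
b_hat = sum(ti * yi for ti, yi in zip(t, citations)) / sum(ti * ti for ti in t)

# Residuals (trend errors): observed value minus fitted trend value.
fitted = [a_hat + b_hat * ti for ti in t]
residuals = [yi - fi for yi, fi in zip(citations, fitted)]

sse = sum(e * e for e in residuals)                 # unexplained variation
sst = sum((yi - a_hat) ** 2 for yi in citations)    # total variation about the mean
r2 = 1 - sse / sst

for ti, yi, fi, e in zip(t, citations, fitted, residuals):
    print(f"t={ti:+d}  y={yi:2d}  trend={fi:5.1f}  residual={e:+.1f}")
print(f"SSE = {sse:.1f}, SST = {sst:.1f}, r^2 = {r2:.3f}")
```

Listing the residuals also shows where the straight line fits poorly: the largest errors fall in the last three years, where citations rise much faster than earlier in the series.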
3). Simple Moving Average (SMA) method
A moving average is a simple arithmetic mean. We select a group of figures at the start of the
series (for example the first 3, 4, 5 or 7 figures) and average them to obtain the first trend
figure. We then drop the first figure and include the next item in the series to obtain a new
group; the average of this group gives the second trend figure. We continue in this way until all
the figures in the series are exhausted.
The moving average eliminates the large-scale fluctuations found in the original series. Moving
average smoothing is thus a technique used to make the long-term trend of a time series clearer.
This method is usually used to make forecasts based on averages. To make a forecast, say for next
month's sales, we obtain the average of sales for several preceding months. This allows the random
fluctuations to cancel each other out (to be smoothed away), which is better than using the actual
sales for a single preceding month, say November, to make a forecast for December sales.
The characteristics of the Simple Moving Average method
1). Different moving averages produce different forecasts.
2). The greater the number of periods in the moving average, the greater the smoothing effect.
3). If the underlying trend of the past data is thought to be fairly constant with substantial
randomness, then a greater number of periods should be chosen.
4). Alternatively, if there seems to be some change in the underlying state of the data, more
responsiveness is needed, and therefore fewer periods should be included in the moving average.
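Characteristic 2 can be illustrated on the monthly sales figures used in Example 4. The sketch below is an illustration on one data set, not a proof. The helper names, and the choice of the mean absolute change between successive values as a rough measure of smoothness, are assumptions made for this example.

```python
def moving_average(values, window):
    """Simple moving averages of `values` over `window` consecutive periods."""
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


def mean_abs_change(values):
    """Average absolute change between successive values (smaller = smoother)."""
    steps = [abs(b - a) for a, b in zip(values, values[1:])]
    return sum(steps) / len(steps)


# Monthly sales, January to December, from Example 4.
sales = [350, 340, 360, 310, 280, 300, 270, 260, 310, 350, 370, 390]

ma3 = moving_average(sales, 3)
ma5 = moving_average(sales, 5)

print(f"original series: {mean_abs_change(sales):.2f}")
print(f"3-month average: {mean_abs_change(ma3):.2f}")
print(f"5-month average: {mean_abs_change(ma5):.2f}")
```

The longer window gives the smoother series here, but it also yields fewer trend figures (8 instead of 10) and reacts more slowly to a change in the data, which is the trade-off described in characteristics 3 and 4.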
The limitations of SMA
1). Equal weighting is given to each of the values used in the moving average calculation,
whereas it is reasonable to suppose that the most recent data are more relevant to current
conditions.
2). The moving average calculation does not take into account data outside the averaging period.
3). The use of an unadjusted moving average as a forecast can give misleading results when there
is an underlying seasonal variation.
Example 4
Prepare a three-month moving average using the data.
Months Sales
January 350
February 340
March 360
April 310
May 280
June 300
July 270
August 260
September 310
October 350
November 370
December 390
Solution
Months      Sales   3-month total   3-month average
January     350
February    340     1050            350
March       360     1010            336.7
April       310     950             316.7
May         280     890             296.7
June        300     850             283.3
July        270     830             276.7
August      260     840             280
September   310     920             306.7
October     350     1030            343.3
November    370     1110            370
December    390
Column 3 is obtained by adding the sales figures in threes, i.e.
Jan + Feb + Mar = 1050
Feb + Mar + April = 1010
Mar + April + May = 950
Column 4 is obtained by dividing column 3 by n, the number of periods in the moving average
(here n = 3). The fourth column is the trend, and it is plotted.
[Figure: plot of the 3-month moving average (vertical axis, 0 to 400) against period (horizontal axis, 1 to 10)]
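The moving totals and averages of Example 4 can be reproduced as follows. This is a sketch with illustrative variable names; each average is placed against the middle month of its group of three, as in the solution table.

```python
months = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]
sales = [350, 340, 360, 310, 280, 300, 270, 260, 310, 350, 370, 390]

window = 3
half = window // 2      # one month on either side of the middle month

rows = []
for i in range(half, len(sales) - half):
    total = sum(sales[i - half:i + half + 1])
    rows.append((months[i], total, round(total / window, 1)))

for month, total, average in rows:
    print(f"{month:<10} {total:>5} {average:>7}")
```

January and December get no trend figure because neither month has a neighbour on both sides.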
Assignment 2
Using the data given below, prepare a three-year moving average and plot it against time.
Year Sales
2001 130
2002 110
2003 120
2004 140
2005 210
2006 220
2007 160
2008 150
2009 200
2010 250
2011 160
2012 140
Example 7
Prepare a four-quarter centred moving average for the sales data of a company.
Quarter
Years 1 2 3 4
2015 600 820 400 720
2016 630 840 420 740
2017 670 900 430 760
Solution
Year   Quarter   Sales   4-quarter total   4-quarter average   2-point centred total   Moving average (trend)
2015   Q1        600
       Q2        820
                         2540              635
       Q3        400                                           1277.5                  638.75
                         2570              642.5
       Q4        720                                           1290                    645
                         2590              647.5
2016   Q1        630                                           1300                    650
                         2610              652.5
       Q2        840                                           1310                    655
                         2630              657.5
       Q3        420                                           1325                    662.5
                         2670              667.5
       Q4        740                                           1350                    675
                         2730              682.5
2017   Q1        670                                           1367.5                  683.75
                         2740              685
       Q2        900                                           1375                    687.5
                         2760              690
       Q3        430
       Q4        760
To obtain the 2-point centred total, add each pair of successive four-quarter averages; for example,
635 + 642.5 = 1277.5; 642.5 + 647.5 = 1290; 647.5 + 652.5 = 1300
To obtain the last column, divide these totals by 2; for example,
1277.5/2 = 638.75; 1290/2 = 645; 1300/2 = 650
You can then plot the trend.
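The centring steps can be sketched in Python as follows; the variable names are illustrative. Because a four-quarter average falls between two quarters, successive averages are paired so that each trend figure lines up with an actual quarter, starting at Q3 of 2015.

```python
# Quarterly sales for 2015, 2016 and 2017 (Q1 to Q4 of each year).
sales = [600, 820, 400, 720, 630, 840, 420, 740, 670, 900, 430, 760]
quarters = [f"{year} Q{q}" for year in (2015, 2016, 2017) for q in (1, 2, 3, 4)]

window = 4

# Step 1: moving totals and averages over four consecutive quarters.
totals = [sum(sales[i:i + window]) for i in range(len(sales) - window + 1)]
averages = [total / window for total in totals]

# Step 2: add each pair of successive averages (2-point centred total).
centred_totals = [a + b for a, b in zip(averages, averages[1:])]

# Step 3: divide by 2 to obtain the centred moving average (trend).
trend = [total / 2 for total in centred_totals]

# The first trend figure belongs to the third quarter, the last to the
# third quarter from the end.
for label, value in zip(quarters[2:-2], trend):
    print(label, value)
```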