Basic Properties of Time Series Data
Rishman Jot Kaur Chahal
Assistant Professor
Department of Humanities and Social Sciences
Joint Faculty at the Mehta Family School of DS & AI
Indian Institute of Technology Roorkee
July 29, 2025
Rishman Jot Kaur Chahal (IIT Roorkee) July 29, 2025 1 / 32
Time Series - Visualization
Autocorrelation
Measures the linear relationship between lagged values of a time series yt .
Each graph shows yt plotted against yt−k for different values of k.
Time Series - Visualization
ACF
We denote the sample autocovariance at lag k by ck and the sample
autocorrelation at lag k by rk . Then define
ck = (1/T) Σt=k+1..T (yt − ȳ)(yt−k − ȳ)
and rk = ck /c0
r1 indicates how successive values of y relate to each other
r2 indicates how y values two periods apart relate to each other
rk is almost the same as the sample correlation between yt and yt−k
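As an illustrative sketch (plain Python, no library assumed; the function name sample_acf is ours), the definition rk = ck /c0 can be computed directly:

```python
# A minimal sketch of the sample autocorrelation r_k = c_k / c_0 defined above.
def sample_acf(y, k):
    """r_k = c_k / c_0 with c_k = (1/T) * sum_{t=k+1}^{T} (y_t - ybar)(y_{t-k} - ybar)."""
    T = len(y)
    ybar = sum(y) / T
    c0 = sum((v - ybar) ** 2 for v in y) / T
    ck = sum((y[t] - ybar) * (y[t - k] - ybar) for t in range(k, T)) / T
    return ck / c0

# A trending series is strongly autocorrelated at lag 1:
print(round(sample_acf(list(range(50)), 1), 3))  # 0.94
```

An alternating series, by contrast, gives a negative r1 , which is the kind of sign pattern the correlogram makes visible at a glance.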
Time Series - Visualization
ACF
Results for 9 lags are as follows:
r4 is higher than for the other lags. This is due to the seasonal pattern
in the data: the peaks tend to be 4 quarters apart and the troughs
tend to be 2 quarters behind the peaks.
Together, the autocorrelations at lags 1, 2, . . . , make up the
autocorrelation function (ACF).
The plot is known as a correlogram
Time Series - Visualization
Strong Autocorrelation in Python
Figure: Google Stock Data (Strong Autocorrelation)
Some of the Simple time series forecasting methods
Exponential smoothing is the most widely used class of procedures
for smoothing discrete time series in order to forecast the immediate
future.
It is a simple and pragmatic approach to forecasting, whereby the
forecast is constructed from an exponentially weighted average of
past observations.
The largest weight is given to the present observation, less weight to
the immediately preceding observation, even less weight to the
observation before that, and so on (exponential decay of the influence
of past data).
Some of the Simple time series forecasting methods
Mathematically:
ŷT +1|T = αyT + α(1 − α)yT −1 + α(1 − α)2 yT −2 + ...,
where 0 ≤ α ≤ 1 is the smoothing parameter.
The one-step-ahead forecast for time T + 1 is a weighted average of
all of the observations in the series y1 , ..., yT . The rate at which the
weights decrease is controlled by the parameter α.
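A minimal sketch of this weighted average in Python (the truncation to a finite sample and the renormalisation of the weights so they sum to one are assumptions of the sketch, not part of the formula above):

```python
def ses_forecast(y, alpha):
    """One-step-ahead simple exponential smoothing forecast:
    yhat_{T+1|T} = alpha*y_T + alpha(1-alpha)*y_{T-1} + alpha(1-alpha)^2*y_{T-2} + ...
    For a finite sample the geometric weights are renormalised to sum to one."""
    T = len(y)
    weights = [alpha * (1 - alpha) ** i for i in range(T)]  # weight for y_T, y_{T-1}, ...
    total = sum(weights)
    return sum(w * obs for w, obs in zip(weights, reversed(y))) / total

# A constant series forecasts its own level (approximately 5.0 here):
print(ses_forecast([5.0, 5.0, 5.0], 0.3))
```

A larger α shifts the weight toward the most recent observation; with α near 1 the forecast is essentially the last observed value.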
Double Exponential Smoothing - Holt’s Trend Method
Under the assumption of no trend in the data, simple exponential
smoothing yields good results, but it fails when a trend is present.
Double exponential smoothing is used when there is a linear trend
in the data.
The basic idea behind double exponential smoothing is to introduce a
term to take into account the possibility of a series exhibiting some
form of trend.
Double Exponential Smoothing - Holt’s Trend Method
Holt (1957) extended simple exponential smoothing to allow the
forecasting of data with a trend.
This method involves a forecast equation and two smoothing
equations (one for the level and one for the trend):
Double Exponential Smoothing - Holt’s Trend Method
Forecast Equation
ŷt+h|t = lt + hbt
Level Equation
lt = αyt + (1 − α)(lt−1 + bt−1 )
Trend Equation
bt = β ∗ (lt − lt−1 ) + (1 − β ∗ )bt−1
where lt denotes an estimate of the level of the series at time t, bt
denotes an estimate of the trend (slope) of the series at time t, α is
the smoothing parameter for the level, 0 ≤ α ≤ 1, and β ∗ is the
smoothing parameter for the trend, 0 ≤ β ∗ ≤ 1. h = 1, 2, 3,...
Double Exponential Smoothing - Holt’s Trend Method
As with simple exponential smoothing, the level equation here shows
that lt is a weighted average of observation yt and the one-step-ahead
training forecast for time t, here given by lt−1 + bt−1 .
The trend equation shows that bt is a weighted average of the
estimated trend at time t based on lt − lt−1 and bt−1 , the previous
estimate of the trend.
The forecast function is no longer flat but trending. The h-step-ahead
forecast is equal to the last estimated level plus h times the last
estimated trend value. Hence the forecasts are a linear function of h.
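The level and trend recursions above can be sketched directly; the initialisation l0 = y0 , b0 = y1 − y0 used below is one common choice, not the only one:

```python
def holt_forecast(y, alpha, beta, h):
    """Holt's linear trend method:
    l_t = alpha*y_t + (1-alpha)*(l_{t-1} + b_{t-1})
    b_t = beta*(l_t - l_{t-1}) + (1-beta)*b_{t-1}
    yhat_{T+h|T} = l_T + h*b_T
    Initialised with l_0 = y_0 and b_0 = y_1 - y_0 (one common choice)."""
    level = y[0]
    trend = y[1] - y[0]
    for obs in y[1:]:
        prev_level = level
        level = alpha * obs + (1 - alpha) * (level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
    return level + h * trend

# On an exactly linear series the recursions recover the line,
# so the 3-step-ahead forecast continues it (approximately 15.0 here):
print(holt_forecast([1.0, 3.0, 5.0, 7.0, 9.0], 0.8, 0.2, 3))
```

This makes the point in the last bullet concrete: the forecast l_T + h*b_T is a linear function of h.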
Holt-Winters’ Seasonal Trend Method
Triple Exponential Smoothing
Holt (1957) and Winters (1960) extended Holt’s method to capture
seasonality.
The Holt-Winters seasonal method comprises the forecast equation
and three smoothing equations - one for the level lt , one for the trend
bt and one for the seasonal component st , with corresponding
smoothing parameters α, β ∗ , γ .
m is used to denote the frequency of the seasonality, i.e., the number
of seasons in a year. For example, for quarterly data m=4, and for
monthly data m=12.
Holt-Winters’ Seasonal Trend Method
Forecast Equation
ŷt+h|t = lt + hbt + st+h−m(k+1) , where k is the integer part of (h − 1)/m
Level Equation
lt = α(yt − st−m ) + (1 − α)(lt−1 + bt−1 )
Trend Equation
bt = β ∗ (lt − lt−1 ) + (1 − β ∗ )bt−1
Seasonal Equation
st = γ(yt − lt−1 − bt−1 ) + (1 − γ)st−m
where lt denotes an estimate of the level of the series at time t, bt
an estimate of the trend (slope) at time t, and st an estimate of the
seasonal component at time t; α, β ∗ and γ are the smoothing
parameters for the level, trend and seasonal component respectively,
with 0 ≤ α ≤ 1, 0 ≤ β ∗ ≤ 1 and 0 ≤ γ ≤ 1. h = 1, 2, 3, ...
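A minimal sketch of the additive Holt-Winters recursions, assuming a simple first-season initialisation (a production implementation, e.g. statsmodels' ExponentialSmoothing, estimates the starting values more carefully):

```python
def holt_winters_additive(y, m, alpha, beta, gamma, h):
    """Additive Holt-Winters: level, trend and seasonal recursions as above;
    forecast yhat_{t+h|t} = l_t + h*b_t + seasonal index from the last
    estimated season. Requires len(y) >= 2*m for the simple initialisation."""
    level = sum(y[:m]) / m                            # mean of the first season
    trend = (sum(y[m:2 * m]) - sum(y[:m])) / (m * m)  # average per-period change
    season = [v - level for v in y[:m]]
    for t in range(m, len(y)):
        prev_level, prev_trend = level, trend
        level = alpha * (y[t] - season[t - m]) + (1 - alpha) * (prev_level + prev_trend)
        trend = beta * (level - prev_level) + (1 - beta) * prev_trend
        season.append(gamma * (y[t] - prev_level - prev_trend) + (1 - gamma) * season[t - m])
    # seasonal index repeats the most recently estimated season
    return level + h * trend + season[len(y) - m + (h - 1) % m]

# A purely seasonal series (m = 2, no trend) is forecast exactly:
print(holt_winters_additive([10.0, 20.0] * 8, 2, 0.5, 0.3, 0.4, 1))  # 10.0
```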
Univariate Time Series Models
These models are a class of specifications in which one attempts to
model and predict a (dependent) variable, say yt , using only the
information contained in its own past values (yt−1 , yt−2 , ...).
These models may further include the current and past values of the
error term.
Thus, these models are different from structured models.
What do we mean by structured models?
Univariate Time Series Models
Structured models are multivariate in nature and explain the changes
in the dependent variable based on movements in the current or past
values of other (explanatory) variables.
Time series models are more a-theoretical, indicating that their
construction is not based upon any underlying theoretical model of
the behaviour of a variable.
But they are an attempt to capture the empirically relevant features
of the observed data that may have arisen from a variety of different
(but unspecified) structured models.
An important class of time series models is ARIMA models (Box and
Jenkins, 1976).
Before that, let us first understand the basic important concepts.
Univariate Time Series Models
A Weakly Stationary Process
If a series satisfies the following three conditions, it is said to be
weakly or covariance stationary:
E(yt ) = µ, t = 1, 2, ...
E(yt − µ)² = σ² < ∞
E(yt1 − µ)(yt2 − µ) = γt2−t1 , ∀ t1 , t2
Weakly Stationary Series
A stationary process or series has the following properties:
-constant mean
-constant variance
-constant autocovariance structure
The latter refers to the covariance between yt−1 and yt−2 being the
same as that between yt−5 and yt−6 .
Univariate Time Series Models (cont’d)
So if the process is stationary, all the variances are the same and all
the covariances depend only on the difference between t1 and t2 . The
moments
E(yt − E(yt ))(yt+s − E(yt+s )) = γs , s = 0, 1, 2, ...
are known as the covariance function.
The covariances, γs , are also known as autocovariances.
However, the value of the autocovariances depend on the units of
measurement of yt .
It is thus more convenient to use the autocorrelations which are the
autocovariances normalised by dividing by the variance:
τs = γs /γ0 , s = 0, 1, 2, ...
If we plot τs against s = 0, 1, 2, ..., then we obtain the autocorrelation
function (acf) or correlogram as seen in the earlier slides as well.
Stationary Series
Remember, mathematically
E(yt ) = µ
E(yt − µ)² = σ²
E(yt1 − µ)(yt2 − µ) = γt2−t1 , ∀ t1 , t2
A White Noise Process
A white noise process is one with (virtually) no noticeable structure.
A definition of a white noise process is
E(yt ) = µ
Var(yt ) = σ²
γt−r = σ² if t = r, and 0 otherwise
Thus, a white noise process has constant mean and variance, and zero
autocovariances, except at lag zero.
Hence, the autocorrelation function will be zero apart from a single
peak of 1 at s = 0.
If it is further assumed that yt is distributed normally, then the
sample autocorrelation coefficients are also approximately normally
distributed.
τ̂s is approximately distributed as N(0, 1/T), where T = sample size
and τ̂s denotes the autocorrelation coefficient at lag s estimated from
a sample.
A White Noise Process
We can use this to do significance tests for the autocorrelation
coefficients by constructing a confidence interval.
For example, a 95% confidence interval would be given by
±1.96 × (1/√T). If the sample autocorrelation coefficient, τ̂s , falls
outside this region for any given value of s, then we reject the null
hypothesis that the true value of the coefficient at lag s is zero.
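As a sketch, this band check can be coded in a few lines (the helper name significant_acf_lags is illustrative):

```python
import math

def significant_acf_lags(r_hats, T, z=1.96):
    """Under the white-noise null each sample autocorrelation is approximately
    N(0, 1/T), so a coefficient outside +/- z/sqrt(T) is significant at ~5%."""
    bound = z / math.sqrt(T)
    return [abs(r) > bound for r in r_hats]

# With T = 100 the band is +/- 0.196; only the first coefficient of the
# worked example later in the slides falls outside it:
print(significant_acf_lags([0.207, -0.013, 0.086, 0.005, -0.022], 100))
# [True, False, False, False, False]
```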
Joint Hypothesis Tests
We can also test the joint hypothesis that all m of the τk correlation
coefficients are simultaneously equal to zero using the Q-statistic
developed by Box and Pierce (1970):
Q = T Σk=1..m τ̂k²
where T = sample size, m = maximum lag length.
The Q-statistic is asymptotically distributed as a χ²m .
However, the Box–Pierce test has poor small-sample properties, so a
variant has been developed, called the Ljung–Box statistic:
Q∗ = T (T + 2) Σk=1..m τ̂k²/(T − k) ∼ χ²m
This statistic is very useful as a portmanteau (general) test of linear
dependence in time series.
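Both statistics are one-liners in Python (illustrative helper names; statsmodels provides a tested implementation in acorr_ljungbox):

```python
def box_pierce_q(r, T):
    """Box-Pierce: Q = T * sum_{k=1}^{m} r_k^2, asymptotically chi^2 with m df."""
    return T * sum(rk ** 2 for rk in r)

def ljung_box_q(r, T):
    """Ljung-Box: Q* = T(T+2) * sum_{k=1}^{m} r_k^2 / (T - k)."""
    return T * (T + 2) * sum(rk ** 2 / (T - k) for k, rk in enumerate(r, start=1))

# The worked example from the next slides: five autocorrelations, T = 100:
r = [0.207, -0.013, 0.086, 0.005, -0.022]
print(round(box_pierce_q(r, 100), 2), round(ljung_box_q(r, 100), 2))  # 5.09 5.26
```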
An ACF Example
Question:
Suppose that a researcher had estimated the first 5 autocorrelation
coefficients using a series of length 100 observations, and found them
to be (from 1 to 5): 0.207, -0.013, 0.086, 0.005, -0.022. Test each of
the individual coefficients for significance, and use both the Box–Pierce
and Ljung–Box tests to establish whether they are jointly significant.
An ACF Example
Solution:
A coefficient would be significant (rejecting the null hypothesis) if it
lies outside (-0.196,+0.196) at the 5% level, so only the first
autocorrelation coefficient is significant.
For joint hypothesis test, H0 : τ1 = 0, τ2 = 0, τ3 = 0, τ4 = 0, τ5 = 0.
The test statistics for the Box–Pierce and Ljung–Box tests are given,
respectively, as
Q = 100 × (0.207² + (−0.013)² + 0.086² + 0.005² + (−0.022)²)
Q∗ = 100 × 102 × (0.207²/99 + (−0.013)²/98 + 0.086²/97 + 0.005²/96 + (−0.022)²/95)
Thus, Q = 5.09 and Q∗ = 5.26. Compared with a tabulated χ²(5) = 11.1
at the 5% level, the 5 coefficients are jointly insignificant.
Moving Average Processes
The moving average (MA) model is the simplest class of time series models.
Let ut (t = 1, 2, 3, ...) be a sequence of independently and identically
distributed (iid) random variables, i.e., a white noise process, with
E(ut ) = 0 and Var(ut ) = σ². Then
yt = µ + ut + θ1 ut−1 + θ2 ut−2 + ... + θq ut−q
is a q th-order moving average model, MA(q).
Its properties are
E(yt ) = µ; Var(yt ) = γ0 = (1 + θ1² + θ2² + ... + θq²)σ²
Covariances:
γs = (θs + θs+1 θ1 + θs+2 θ2 + ... + θq θq−s )σ² , for s = 1, 2, ..., q
γs = 0, for s > q
Thus, MA has a constant mean, constant variance and
autocovariances which may be non-zero to lag q and will always be
zero thereafter.
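The autocovariance formula above can be checked numerically with a short sketch (pure Python, illustrative name; θ0 = 1 is the coefficient on ut itself):

```python
def ma_autocovariances(theta, sigma2=1.0):
    """Theoretical autocovariances of an MA(q) process
    y_t = mu + u_t + theta_1*u_{t-1} + ... + theta_q*u_{t-q}:
    gamma_s = (theta_s + theta_{s+1}*theta_1 + ... + theta_q*theta_{q-s}) * sigma2
    for s = 0..q (with theta_0 = 1), and gamma_s = 0 for s > q."""
    psi = [1.0] + list(theta)                # psi_0 = 1 multiplies u_t itself
    q = len(theta)
    return [sigma2 * sum(psi[s + j] * psi[j] for j in range(q + 1 - s))
            for s in range(q + 1)]

# MA(2) with theta1 = -0.5, theta2 = 0.25 (the worked example on the next slides):
print(ma_autocovariances([-0.5, 0.25]))  # [1.3125, -0.625, 0.25]
```

The returned list stops at lag q: every autocovariance beyond that is identically zero, which is the cutoff property stated in the last bullet.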
Example of an MA Problem
Consider the following MA(2) process:
yt = ut + θ1 ut−1 + θ2 ut−2
where ut is a zero mean white noise process with variance σ 2 .
Calculate the mean and variance of yt
Derive the autocorrelation function for this process (i.e. express the
autocorrelations, τ1 , τ2 , ... as functions of the parameters θ1 and θ2 ).
If θ1 = −0.5 and θ2 = 0.25, sketch the acf of yt .
Solution
If E(ut ) = 0, then E(ut−i ) = 0 ∀ i.
Now, taking expectations on both sides of the equation above:
E (yt ) = E (ut +θ1 ut−1 +θ2 ut−2 ) = E (ut )+θ1 E (ut−1 )+θ2 E (ut−2 ) = 0
Now, the variance: Var(yt ) = E[yt − E(yt )][yt − E(yt )], but E(yt ) = 0, so
Var(yt ) = E[(yt )(yt )]
= E[(ut + θ1 ut−1 + θ2 ut−2 )(ut + θ1 ut−1 + θ2 ut−2 )]
= E[ut² + θ1² ut−1² + θ2² ut−2² + cross-products]
But E[cross-products] = 0, since cov(ut , ut−s ) = 0 for s ≠ 0.
Solution (cont’d)
So Var(yt ) = γ0 = E[ut² + θ1² ut−1² + θ2² ut−2² ]
= σ² + θ1² σ² + θ2² σ²
= (1 + θ1² + θ2² )σ²
(ii) The acf of yt .
γ1 = E[yt − E(yt )][yt−1 − E(yt−1 )]
= E[yt ][yt−1 ]
= E[(ut + θ1 ut−1 + θ2 ut−2 )(ut−1 + θ1 ut−2 + θ2 ut−3 )]
= E[θ1 ut−1² + θ1 θ2 ut−2² ]
= θ1 σ² + θ1 θ2 σ²
= (θ1 + θ1 θ2 )σ²
Solution (cont’d)
γ2 = E[yt − E(yt )][yt−2 − E(yt−2 )]
= E[yt ][yt−2 ]
= E[(ut + θ1 ut−1 + θ2 ut−2 )(ut−2 + θ1 ut−3 + θ2 ut−4 )]
= E[θ2 ut−2² ]
= θ2 σ²
γ3 = E[yt − E(yt )][yt−3 − E(yt−3 )]
= E[yt ][yt−3 ]
= E[(ut + θ1 ut−1 + θ2 ut−2 )(ut−3 + θ1 ut−4 + θ2 ut−5 )]
= 0, since the two brackets share no common ut terms
So γs = 0 for s > 2. All autocovariances for the MA(2) process will be
zero for any lag length, s, greater than 2.
Solution (cont’d)
We have the autocovariances; now calculate the autocorrelations:
Autocorrelation at lag 0: τ0 = γ0 /γ0 = 1
Autocorrelation at lag 1: τ1 = γ1 /γ0 = (θ1 + θ1 θ2 )σ² / (1 + θ1² + θ2² )σ² = (θ1 + θ1 θ2 )/(1 + θ1² + θ2² )
Autocorrelation at lag 2: τ2 = γ2 /γ0 = θ2 σ² / (1 + θ1² + θ2² )σ² = θ2 /(1 + θ1² + θ2² )
Thus, τ3 = γ3 /γ0 = 0, and τs = γs /γ0 = 0 ∀ s > 2.
(iii) For θ1 = −0.5 and θ2 = 0.25, substituting these into the
formulae above gives τ1 = −0.476, τ2 = 0.190.
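As a quick arithmetic check of part (iii):

```python
theta1, theta2 = -0.5, 0.25
denom = 1 + theta1 ** 2 + theta2 ** 2        # 1 + 0.25 + 0.0625 = 1.3125
tau1 = (theta1 + theta1 * theta2) / denom    # -0.625 / 1.3125
tau2 = theta2 / denom                        # 0.25 / 1.3125
print(round(tau1, 3), round(tau2, 3))  # -0.476 0.19
```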
ACF Plot
Thus the acf plot will appear as follows:
Thank you!!