1
INTRODUCTION TO FORECASTING & TIME SERIES STRUCTURE
Dr. Susan Simmons
Institute for Advanced Analytics
2
Source: xkcd.com/2620
3
TIME SERIES DATA
4
Time Series Data
• A time series is an ordered sequence of observations.
• Ordering is typically through equally spaced time
intervals.
• Possibly through space as well.
• Used in a variety of fields:
• Agriculture: Crop Production
• Economics: Stock Prices
• Engineering: Electric Signals
• Meteorology: Wind Speeds
• Social Sciences: Crime Rates
5
Time Series Data
• We will begin our time series discussions with univariate time series (only one time series, i.e., one variable, which we will call Y).
• Multivariate time series will be covered in Fall 2.
6
Time Series Data
Date            Y     Notation
January 2000    23    Y1
February 2000   18    Y2
March 2000      20    Y3
April 2000      25    Y4
May 2000        21    Y5
• In general, Yt denotes the observation at time t.
CAREFUL: Since we are assuming equally spaced observations, you will need to take care of missing values!
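A minimal sketch of how a gap in a monthly series might be exposed and filled before modeling. The data frame `obs` is hypothetical (it reuses the table's values with March 2000 deliberately dropped), and linear interpolation is only one assumed way to handle the gap, not a method prescribed here.

```r
# Hypothetical example: the table's values with March 2000 deliberately missing
obs <- data.frame(
  date = as.Date(c("2000-01-01", "2000-02-01", "2000-04-01", "2000-05-01")),
  y    = c(23, 18, 25, 21)
)

# Build the complete, equally spaced monthly grid and expose the gap as NA
grid   <- data.frame(date = seq(min(obs$date), max(obs$date), by = "month"))
filled <- merge(grid, obs, by = "date", all.x = TRUE)

# A ts object assumes equal spacing, so the NA must sit in its proper slot
Y <- ts(filled$y, start = c(2000, 1), frequency = 12)
missing_idx <- which(is.na(Y))

# One simple (assumed) fix: linear interpolation between the neighbors
Y_interp <- ts(approx(seq_along(Y), as.numeric(Y), xout = seq_along(Y))$y,
               start = c(2000, 1), frequency = 12)
```

Building the full date grid first matters: passing the four observed values straight to `ts()` would silently shift April and May into the March and April slots.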
11
Example 1: Iron and Steel Exports
12
Example 2: Amazon.com Stock
Time series can have a trend: an overall pattern in the data (linear or quadratic; positive or negative).
14
Example 3: Airlines Passengers
Time series can have a seasonal pattern: a systematic up and down pattern.
16
Example 5: Airline Passengers Again
17
Temperature over the past century for
Tuscaloosa, Alabama
Figure annotations:
• Station relocated and instrumentation was changed
• Thermometer was changed
• Station was relocated
Source: Dr. Robert Lund
19
Time Series to Forecast
20
Forecasting Process
1. Data Cleaning
2. Propose Model
3. Fit Model
4. Diagnose Model (repeat steps 2 to 4 as needed)
5. Forecasts
21
SIGNAL AND NOISE
22
Statistical Forecasting
Time Series = Signal + Noise
• Signal: explained variation
• Trend/Cycle
• Seasonality
• Dependence Structure (this will be discussed in ARIMA)
• Noise: unexplained variation (error)
• Forecasts extrapolate the signal portion of the model.
• Confidence intervals account for uncertainty.
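As a rough illustration of the signal and noise idea, the sketch below simulates a series from a known trend, a known seasonal pattern, and random noise, then extrapolates the fitted signal with intervals. All numbers are made up, and the regression on time and month is only an assumed stand-in for a forecasting model, not the approach used later in the course.

```r
set.seed(1)
n  <- 120                                   # 10 years of monthly data (made up)
tt <- 1:n

trend    <- 50 + 0.5 * tt                   # signal: linear trend
seasonal <- 10 * sin(2 * pi * tt / 12)      # signal: repeats every 12 observations
noise    <- rnorm(n, sd = 3)                # noise: unexplained variation
signal   <- trend + seasonal

Y <- ts(signal + noise, frequency = 12)

# Stand-in model for the signal: linear trend plus monthly indicators
dat <- data.frame(y = as.numeric(Y), tt = tt,
                  month = factor(as.integer(cycle(Y))))
fit <- lm(y ~ tt + month, data = dat)

# Forecast 12 steps ahead: point forecasts extrapolate the signal,
# prediction intervals reflect the uncertainty from the noise
new <- data.frame(tt = (n + 1):(n + 12),
                  month = factor(1:12, levels = levels(dat$month)))
fc <- predict(fit, newdata = new, interval = "prediction", level = 0.95)
head(fc)
```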
27
Time Series Decomposition
• If a time series only has trend/cycle patterns, there is no
need to decompose
• If a time series has both trend/cycle patterns AND
seasonal variation, we can decompose series into these
individual parts:
• Trend/Cycle patterns
• Seasonal variation
• Error
28
Time Series Decomposition
• The signal part of the time series can typically be broken
down into two components:
Time Series = Signal + Noise
• Signal: Trend / Cycle and Seasonal
• Noise: Error / Remainder / Irregular
29
Time Series Decomposition
• The whole time series can now be thought of like the
equations below.
• Additive:
𝑌𝑡 = 𝑇𝑡 + 𝑆𝑡 + 𝐸𝑡
• Multiplicative:
𝑌𝑡 = 𝑇𝑡 × 𝑆𝑡 × 𝐸𝑡
30
Time Series Decomposition
• The terms in the additive and multiplicative equations are:
• 𝑇𝑡: Trend / Cycle
• 𝑆𝑡: Seasonal
• 𝐸𝑡: Error
• For a positive-valued series, the multiplicative model becomes additive on the log scale:
𝑌𝑡 = 𝑇𝑡 × 𝑆𝑡 × 𝐸𝑡
OR
log(𝑌𝑡) = log(𝑇𝑡) + log(𝑆𝑡) + log(𝐸𝑡)
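The equivalence between the multiplicative form and the additive form on the log scale can be checked numerically. This sketch uses R's built-in `AirPassengers` series as an assumed stand-in for the course data and `decompose()` from base R; the end points are skipped because the moving-average trend is undefined there.

```r
# Stand-in data: built-in monthly airline passenger counts (not the course data)
y <- AirPassengers

# Classical multiplicative decomposition: Y = T * S * E
dec <- decompose(y, type = "multiplicative")

# The moving-average trend is NA at both ends, so compare only complete cases
ok <- !is.na(dec$trend)

# Product of the components reproduces the series
prod_form <- dec$trend * dec$seasonal * dec$random
check_product <- all.equal(as.numeric(prod_form[ok]), as.numeric(y[ok]))

# Sum of the logged components reproduces the logged series
log_form <- log(dec$trend) + log(dec$seasonal) + log(dec$random)
check_log <- all.equal(as.numeric(log_form[ok]), as.numeric(log(y)[ok]))

c(product = isTRUE(check_product), log = isTRUE(check_log))
```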
34
Additive vs. Multiplicative
• Additive: the magnitude of the variation around the trend / cycle remains constant.
• Multiplicative: the magnitude of the variation around the trend / cycle changes in proportion to the level of the series.
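One informal way to judge which form fits is to compare the within-year swing of the series to its yearly level. The sketch below does this for the built-in `AirPassengers` series (an assumed stand-in, not the course data); using the yearly range as the measure of "magnitude of variation" is an illustrative choice.

```r
y <- AirPassengers

# Year label for every observation (series starts in January, 12 per year)
yr <- start(y)[1] + (seq_along(y) - 1) %/% frequency(y)

# Within-year swing (max minus min) on the raw and log scales
amp_raw <- tapply(as.numeric(y),      yr, function(v) diff(range(v)))
amp_log <- tapply(as.numeric(log(y)), yr, function(v) diff(range(v)))
level   <- tapply(as.numeric(y),      yr, mean)

# How much the swing grows from the first year to the last year
growth_raw <- unname(amp_raw[length(amp_raw)] / amp_raw[1])
growth_log <- unname(amp_log[length(amp_log)] / amp_log[1])

data.frame(year = names(level), level = round(level, 1),
           amp_raw = amp_raw, amp_log = round(amp_log, 3))
```

If the raw-scale swing grows along with the level while the log-scale swing grows far less, a multiplicative model (or an additive model on the logged series) is the more natural choice.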
35
Seasonally Adjusted Data
One advantage of time series decomposition is that we are able to create seasonally adjusted data (i.e., remove the “effect of seasonality”).
This allows analysts to understand the trend of the series.
𝑌𝑡 = 𝑇𝑡 + 𝑆𝑡 + 𝐸𝑡
𝑌𝑡 − 𝑆𝑡 = 𝑇𝑡 + 𝐸𝑡
The seasonal length of the time series is the length of one season (how long until the series repeats the “pattern”).
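The seasonal adjustment identity can be sketched with base R alone. Here the built-in `AirPassengers` series, logged so that an additive model is reasonable, stands in for the course data, and `s.window = "periodic"` is an assumed setting chosen only to keep the example simple.

```r
# Stand-in data on the log scale so the additive form Y = T + S + E applies
y <- log(AirPassengers)

# Seasonal length: number of observations before the pattern repeats
season_length <- frequency(y)

# STL decomposition; columns are named seasonal, trend, remainder
fit       <- stl(y, s.window = "periodic")
seasonal  <- fit$time.series[, "seasonal"]
trend     <- fit$time.series[, "trend"]
remainder <- fit$time.series[, "remainder"]

# Seasonally adjusted series: Y - S, which should equal T + E
seas_adj <- y - seasonal
check <- all.equal(as.numeric(seas_adj), as.numeric(trend + remainder))

# Plot the original series with the seasonally adjusted series on top
plot(y, col = "grey40", ylab = "log(passengers)")
lines(seas_adj, col = "orange")
```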
36
37
Airline data set
• The data contain the number of US airline passengers from January 1990 to March 2008.
• The data are monthly (the length of a season is 12; the pattern repeats every 12 observations).
38
Time Series Decomposition
𝑌𝑡 = 𝑆𝑡 + 𝑇𝑡 + 𝐸𝑡
Decomposition plot panels:
• 𝑌𝑡: Original data
• 𝑆𝑡: Seasonal component
• 𝑇𝑡: Trend component
• 𝐸𝑡: Remainder (what’s left over)
43
Time Series Decomposition
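# Time Series Decomposition ... STL
library(forecast)   # autoplot() methods for ts and stl objects
library(ggplot2)
# USAirlines is the course data set and is assumed to be loaded already
Passenger <- ts(USAirlines$Passengers, start = 1990, frequency = 12)
decomp_stl <- stl(Passenger, s.window = 7)
# Plot the individual components of the time series
plot(decomp_stl)       # base R graphics
autoplot(decomp_stl)   # ggplot2 graphics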
44
45
Time Series Decomposition
# Overlay the trend component (column 2 of the STL output) on the original series
autoplot(Passenger) +
  geom_line(aes(y = decomp_stl$time.series[, 2]), color = "blue")
46
47
Time Series Decomposition
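# Seasonally adjusted series: subtract the seasonal component (column 1)
seas_adj <- Passenger - decomp_stl$time.series[, 1]
# Each "+" must end its line; a line that starts with "+" breaks the plot chain
autoplot(Passenger) +
  geom_line(aes(y = decomp_stl$time.series[, 2]), color = "blue") +  # overlay the trend component
  geom_line(aes(y = seas_adj), color = "orange")                     # overlay the seasonally adjusted series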
48
Time Series Decomposition
• Blue line: 𝑇𝑡 (trend component)
• Orange line: 𝑌𝑡 − 𝑆𝑡 = 𝑇𝑡 + 𝐸𝑡 (seasonally adjusted series)
49
Time Series Decomposition – R
ggsubseriesplot(Passenger)
50
Time Series Decomposition
51
Decomposition Techniques
• There are many different ways to calculate the trend/cycle and seasonal effects inside time series data.
• Here are 3 common techniques:
1. Classical Decomposition
a. Default in SAS (Can be done in R)
b. Trend – Uses Moving / Rolling Average Smoothing
c. Seasonal – Average De-trended Values Across Seasons
2. X-13 ARIMA Decomposition (self study)
a. Trend – Uses Moving / Rolling Average Smoothing
b. Seasonal – Uses Moving / Rolling Average Smoothing
c. Iteratively Repeats Above Methods and ARIMA Modeling
d. Can Handle Outliers
3. STL (Seasonal and Trend using LOESS estimation) Decomposition
a. Performed by the stl Function in R (Not available in SAS)
b. Uses LOcal regrESSion Techniques to Estimate Trend and Seasonality
c. Allows Changing Effects for Trend and Season
d. Adapted to Handle Outliers
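The two classical decomposition steps (moving average trend, then averaged de-trended values per season) can be written out by hand. This is a sketch on the built-in `AirPassengers` series, an assumed stand-in for the course data, with base R's `decompose()` used only as a cross-check.

```r
y <- AirPassengers
f <- frequency(y)                      # 12 for monthly data

# Trend: centered moving average (half weights at the ends for an even period)
w     <- c(0.5, rep(1, f - 1), 0.5) / f
trend <- stats::filter(y, w, sides = 2)

# Seasonal: average the de-trended values within each month,
# then center the 12 effects so that they sum to zero
detrended <- as.numeric(y - trend)
month     <- as.integer(cycle(y))
effects   <- as.numeric(tapply(detrended, month, mean, na.rm = TRUE))
effects   <- effects - mean(effects)
seasonal  <- ts(effects[month], start = start(y), frequency = f)

# Remainder: what is left after removing trend and seasonal
remainder <- y - trend - seasonal

# Cross-check against base R's classical decomposition
ref <- decompose(y, type = "additive")
same_trend    <- all.equal(as.numeric(trend),    as.numeric(ref$trend))
same_seasonal <- all.equal(as.numeric(seasonal), as.numeric(ref$seasonal))
```

Because the seasonal effects are a single average per month, the classical seasonal component is forced to be identical in every year.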
55
Comparison of Classical versus STL seasonal decomposition
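One difference between the two methods is easy to see numerically: the classical seasonal component repeats exactly from year to year, while STL lets it evolve. The sketch below uses the logged built-in `AirPassengers` series as an assumed stand-in for the course data, with `s.window = 7` chosen only to mirror the earlier STL call.

```r
y <- log(AirPassengers)

# Seasonal components from each method
classical_seas <- as.numeric(decompose(y, type = "additive")$seasonal)
stl_seas       <- as.numeric(stl(y, s.window = 7)$time.series[, "seasonal"])

# Look at the January effect in every year
jan <- as.integer(cycle(y)) == 1
jan_classical <- classical_seas[jan]
jan_stl       <- stl_seas[jan]

# Spread of the January effect across years under each method
spread_classical <- diff(range(jan_classical))   # fixed seasonal effect
spread_stl       <- diff(range(jan_stl))         # seasonal effect may change

data.frame(year = start(y)[1] + seq_along(jan_classical) - 1,
           classical = round(jan_classical, 4),
           stl = round(jan_stl, 4))
```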