Concept of Stationarity and the Dickey-Fuller Test
A stationary process has the property that the mean, variance and autocorrelation structure
do not change over time. Stationarity can be defined in precise mathematical terms, but for
our purpose we mean a flat-looking series, without trend, with constant variance over time, a
constant autocorrelation structure over time, and no periodic fluctuations (seasonality).
If the time series is not stationary, we can often transform it to stationarity with one of the
following techniques.
1. We can difference the data. That is, given the series Zt, we create the new series
Yt = Zt − Zt−1.
The differenced data will contain one less point than the original data. Although you can
difference the data more than once, one difference is usually sufficient.
2. If the data contain a trend, we can fit some type of curve to the data and then model
the residuals from that fit. Since the purpose of the fit is to simply remove long term
trend, a simple fit, such as a straight line, is typically used.
3. For non-constant variance, taking the logarithm or square root of the series may
stabilize the variance. For negative data, you can add a suitable constant to make all
the data positive before applying the transformation. This constant can then be
subtracted from the model to obtain predicted (i.e., the fitted) values and forecasts
for future points.
The above techniques are intended to generate series with constant location and scale.
Although seasonality also violates stationarity, this is usually explicitly incorporated into the
time series model.
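The differencing and variance-stabilizing transformations above can be sketched in a few lines. This is a minimal illustration on a small made-up series:

```python
# Differencing and variance stabilization, illustrated on a short made-up series.
import math

z = [10.0, 12.0, 15.0, 19.0, 24.0]           # original series Z_t with an upward trend

# 1. First difference: Y_t = Z_t - Z_{t-1} (one fewer point than the original)
y = [z[t] - z[t - 1] for t in range(1, len(z))]
print(y)                                      # [2.0, 3.0, 4.0, 5.0]

# 3. Log transform to stabilize a growing variance (data must be positive;
#    for negative data, add a suitable constant first)
logz = [math.log(v) for v in z]
```

Differencing this trending series once already yields a much flatter sequence; a second difference of `y` would yield a constant series here.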
Unit Root Test
A unit root test examines whether a time series is non-stationary because it contains a unit
root. The presence of a unit root in the time series defines the null hypothesis, and the
alternative hypothesis is that the time series is stationary.
Mathematically, the series examined by the unit root test can be represented as
yt = Dt + zt + ɛt
Where,
• Dt is the deterministic component.
• zt is the stochastic component.
• ɛt is the stationary error process.
The unit root test’s basic concept is to determine whether zt (the stochastic component)
contains a unit root or not.
Explanation of the Dickey-Fuller test
A simple AR(1) model can be represented as:
yt = ρyt−1 + ut
where
• yt is the variable of interest at time t
• ρ is the coefficient whose value determines whether a unit root is present
• ut is the noise, or error, term.
If ρ = 1, the unit root is present in a time series, and the time series is non-stationary.
The regression model can be rewritten in differenced form as
Δyt = ẟyt−1 + ut
Where
• Δ is the difference operator (Δyt = yt − yt−1).
• ẟ = ρ − 1.
So here, if ρ = 1 (that is, ẟ = 0), differencing leaves only the error term, while a coefficient
smaller or larger than one makes the change in yt depend on the past observation.
There are three versions of the test:
• test for a unit root
• test for a unit root with a constant
• test for a unit root with a constant and a deterministic time trend
Intuitively, a stationary series tends to revert toward its mean: a large value tends to be
followed by a smaller value, and a small value tends to be followed by a larger value. In a
non-stationary time series there is no such pull; large and small values occur with
probabilities that do not depend on the current value of the time series.
The augmented Dickey-Fuller (ADF) test is an extension of the Dickey-Fuller test. It adds
lagged difference terms to remove autocorrelation from the series and then tests following
the same procedure as the Dickey-Fuller test.
The ADF test statistic is a negative number, and rejection of the null hypothesis depends on
how negative it is: the more negative the statistic, the stronger the rejection, at some level
of confidence, of the hypothesis that a unit root is present in the time series.
We apply the ADF test to a model that can be represented mathematically as
Δyt = α + βt + γyt−1 + ẟ1Δyt−1 + ⋯ + ẟp−1Δyt−p+1 + ɛt
Where
• ɑ is a constant.
• β is the coefficient on the time trend.
• γ is the coefficient on the lagged level yt−1.
• p is the lag order of the autoregressive process.
Here in the mathematical representation of the ADF test, we have added the lagged
differencing terms, which is what distinguishes the ADF test from the Dickey-Fuller test.
The unit root test is then carried out under the null hypothesis γ = 0 against the
alternative hypothesis γ < 0. Once a value for the test statistic is computed,
it can be compared to the relevant critical value for the Dickey-Fuller test. The test has a
specific distribution simply known as the Dickey–Fuller table for critical values.
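As a sketch of the mechanics only (a real application would use a library implementation such as `statsmodels.tsa.stattools.adfuller`, which supplies the proper critical values), the simple Dickey-Fuller regression Δyt = ẟyt−1 + ut and the t-ratio of ẟ can be computed directly:

```python
# A minimal Dickey-Fuller t-statistic: regress dy_t on y_{t-1} (no constant)
# and compute the t-ratio of the slope. Illustration only: a real test
# compares this statistic against Dickey-Fuller critical values.
import math
import random

def df_tstat(y):
    x = y[:-1]                                   # lagged level y_{t-1}
    dy = [y[t] - y[t - 1] for t in range(1, len(y))]
    delta = sum(a * b for a, b in zip(x, dy)) / sum(a * a for a in x)
    resid = [d - delta * a for a, d in zip(x, dy)]
    s2 = sum(e * e for e in resid) / (len(dy) - 1)
    se = math.sqrt(s2 / sum(a * a for a in x))
    return delta / se

rng = random.Random(0)
stationary, walk = [0.0], [0.0]
for _ in range(500):
    stationary.append(0.5 * stationary[-1] + rng.gauss(0, 1))  # AR(1), rho = 0.5
    walk.append(walk[-1] + rng.gauss(0, 1))                    # rho = 1: unit root

print(df_tstat(stationary))   # typically strongly negative: reject the unit root
print(df_tstat(walk))         # typically much closer to zero: cannot reject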
ACF
Autocorrelation is the correlation between two observations at different points in a time
series. For example, values that are separated by an interval might have a strong positive or
negative correlation. When these correlations are present, they indicate that past values
influence the current value. In mathematical terms, the observations yt and yt–k are
separated by k time units; k is the lag. This lag can be days, quarters, or years depending on
the nature of the data. When k = 1, you’re assessing adjacent observations. For each lag,
there is a correlation.
UTILITY
Use the autocorrelation function (ACF) to identify which lags have significant correlations,
understand the patterns and properties of the time series, and then use that information to
model the time series data. From the ACF, you can assess the randomness and stationarity
of a time series. You can also determine whether trends and seasonal patterns are present.
In an ACF plot, each bar represents the size and direction of the correlation. Bars that
extend across the red line are statistically significant.
For random data, autocorrelations should be near zero for all lags. Analysts also refer to this
condition as white noise. Non-random data have at least one significant lag. When the data
are not random, it’s a good indication that you need to use a time series analysis or
incorporate lags into a regression analysis to model the data appropriately.
An ACF plot with no significant autocorrelations at any lag indicates that the time series data are random.
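The sample autocorrelation behind such a plot is simple to compute directly; a sketch (for white noise, every non-zero lag should be near zero):

```python
# Sample autocorrelation function (ACF) for a series, computed from scratch.
import random

def acf(series, max_lag):
    n = len(series)
    mean = sum(series) / n
    c0 = sum((v - mean) ** 2 for v in series) / n        # lag-0 autocovariance
    out = []
    for k in range(max_lag + 1):
        ck = sum((series[t] - mean) * (series[t - k] - mean)
                 for t in range(k, n)) / n
        out.append(ck / c0)                              # normalize by c0
    return out

rng = random.Random(42)
noise = [rng.gauss(0, 1) for _ in range(1000)]           # white noise
r = acf(noise, 10)
# r[0] is exactly 1; for white noise the other lags should stay inside the
# approximate 95% significance band of 2/sqrt(n), about 0.063 here.
```

The significance band 2/√n is the "red line" referred to above: bars extending past it are statistically significant.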
Stationarity
Stationarity means that the time series does not have a trend, has a constant variance, a
constant autocorrelation pattern, and no seasonal pattern. The autocorrelation function
declines to near zero rapidly for a stationary time series. In contrast, the ACF drops slowly
for a non-stationary time series.
In this chart for a stationary time series, notice how the autocorrelations decline to non-
significant levels quickly.
Trends
When trends are present in a time series, shorter lags typically have large positive
correlations because observations closer in time tend to have similar values. The
correlations taper off slowly as the lags increase.
In this ACF plot for metal sales, the autocorrelations decline slowly. The first five lags are
significant.
Seasonality
When seasonal patterns are present, the autocorrelations are larger for lags at multiples of
the seasonal frequency than for other lags.
When a time series has both a trend and seasonality, the ACF plot displays a mixture of both
effects.
Autocorrelation and Partial Autocorrelation in Time Series Data
Autocorrelation is the correlation between two observations at different points in a time
series. For example, values that are separated by an interval might have a strong positive or
negative correlation. When these correlations are present, they indicate that past values
influence the current value. Analysts use the autocorrelation and partial autocorrelation
functions to understand the properties of time series data, fit the appropriate models, and
make forecasts.
In this post, I cover both the autocorrelation function and partial autocorrelation function.
You’ll learn about the differences between these functions and what they can tell you about
your data. In later posts, I’ll show you how to incorporate this information
in regression models of time series data and other time-series analyses.
Autocorrelation and Partial Autocorrelation Basics
Autocorrelation is the correlation between two values in a time series. In other words, the
time series data correlate with themselves—hence, the name. We talk about these
correlations using the term “lags.” Analysts record time-series data by measuring a
characteristic at evenly spaced intervals—such as daily, monthly, or yearly. The number of
intervals between the two observations is the lag. For example, the lag between the current
and previous observation is one. If you go back one more interval, the lag is two, and so on.
In mathematical terms, the observations at yt and yt–k are separated by k time units. K is the
lag. This lag can be days, quarters, or years depending on the nature of the data. When k=1,
you’re assessing adjacent observations. For each lag, there is a correlation.
The autocorrelation function (ACF) assesses the correlation between observations in a time
series for a set of lags. The ACF for time series y is given by: Corr (yt,yt−k), k=1,2,….
Analysts typically use graphs to display this function.
Partial Autocorrelation Function (PACF)
The partial autocorrelation function is similar to the ACF except that it displays only
the correlation between two observations that the shorter lags between those observations
do not explain. For example, the partial autocorrelation for lag 3 is only the correlation that
lags 1 and 2 do not explain. In other words, the partial correlation for each lag is the
unique correlation between those two observations after partialling out the intervening
correlations.
As you saw, the autocorrelation function helps assess the properties of a time series. In
contrast, the partial autocorrelation function (PACF) is more useful during the specification
process for an autoregressive model.
On the graph, the partial autocorrelations for lags 1 and 2 are statistically significant. The
subsequent lags are nearly significant. Consequently, this PACF suggests fitting either a
second or third-order autoregressive model.
By assessing the autocorrelation and partial autocorrelation patterns in your data, you can
understand the nature of your time series and model it.
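The partial autocorrelations can be obtained from the ordinary autocorrelations via the Durbin-Levinson recursion. A sketch (for an AR(1) process, only the lag-1 partial autocorrelation should be large; the ACF helper is redefined here so the sketch is self-contained):

```python
# Partial autocorrelation (PACF) via the Durbin-Levinson recursion,
# applied to a simulated AR(1) series with coefficient 0.7.
import random

def acf(series, max_lag):
    n = len(series)
    m = sum(series) / n
    c0 = sum((v - m) ** 2 for v in series) / n
    return [sum((series[t] - m) * (series[t - k] - m)
                for t in range(k, n)) / n / c0
            for k in range(max_lag + 1)]

def pacf(series, max_lag):
    r = acf(series, max_lag)
    phi = [[0.0] * (max_lag + 1) for _ in range(max_lag + 1)]
    out = [1.0]
    phi[1][1] = r[1]                     # lag-1 PACF equals the lag-1 ACF
    out.append(r[1])
    for k in range(2, max_lag + 1):
        num = r[k] - sum(phi[k - 1][j] * r[k - j] for j in range(1, k))
        den = 1.0 - sum(phi[k - 1][j] * r[j] for j in range(1, k))
        phi[k][k] = num / den            # correlation left after shorter lags
        for j in range(1, k):
            phi[k][j] = phi[k - 1][j] - phi[k][k] * phi[k - 1][k - j]
        out.append(phi[k][k])
    return out

rng = random.Random(1)
y = [0.0]
for _ in range(2000):
    y.append(0.7 * y[-1] + rng.gauss(0, 1))
p = pacf(y, 5)
# p[1] should be close to 0.7; p[2], p[3], ... should be near zero,
# which is exactly the cutoff pattern used to pick the AR order.
```

This cutoff behavior is why the PACF is the natural tool for specifying an autoregressive model's order.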
Properties of a random walk
• An AR(1) time series with b0 = 0 and b1 = 1 is a random walk.
• This is because the best prediction for tomorrow is the value today plus a
random error term.
• The expected value of the error term ϵt is equal to zero.
• The variance of the residuals is constant.
• Random walk with drift: a time series follows a random walk with drift if it has a
non-zero constant intercept term.
• A random walk has an undefined mean-reversion level. If a series xt = b0 + b1xt−1 has
a mean-reverting level, it is xt = b0 / (1 − b1). However, in a random walk b0 = 0 and
b1 = 1, so xt = 0/0, which is undefined.
• A random walk is not covariance stationary. Covariance stationarity requires that the
mean and variance of a time series remain constant over time.
• However, the variance of a random walk process does not have an upper bound.
As t increases, the variance grows with no upper bound. This implies that we cannot
use standard regression analysis on a time series that appears to be a random walk.
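The unbounded variance is easy to see by simulation; a sketch comparing the cross-sectional spread of many random-walk paths at two points in time:

```python
# Simulate many random-walk paths and compare the cross-sectional variance
# of the process at two horizons: Var(x_t) = t * sigma^2 grows with t.
import random

rng = random.Random(7)
n_paths, horizon = 2000, 400
at_100, at_400 = [], []
for _ in range(n_paths):
    x = 0.0
    for t in range(1, horizon + 1):
        x += rng.gauss(0, 1)          # b0 = 0, b1 = 1: pure random walk
        if t == 100:
            at_100.append(x)
    at_400.append(x)

def var(v):
    m = sum(v) / len(v)
    return sum((u - m) ** 2 for u in v) / (len(v) - 1)

print(var(at_100), var(at_400))       # roughly 100 and roughly 400
```

With unit shock variance, the variance at time t is t itself, so the spread at t = 400 is about four times the spread at t = 100, with no ceiling as t grows.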
Vector Autoregressive Modeling
The vector autoregression (VAR) model is one of the most commonly employed multivariate
regression time series analytic techniques. The VAR model is advantageous because it can
explain past and causal relationships among multiple variables over time, as well as predict
future observations. Explanation and prediction of future observations in a time series is
dependent upon correctly postulating a VAR model and estimating its parameters.
Granger causality is a concept of causality derived from the notion that causes may not
occur after effects and that if one variable is the cause of another, knowing the status on
the cause at an earlier point in time can enhance prediction of the effect at a later point in
time. The VAR model has been widely employed in econometric analyses and in
neurobiology to elucidate underlying mechanisms using Granger causality. To the best of
our knowledge, VAR has never been implemented in studying causality in physiologic time
series variables of HR, RR and SpO2 as multivariate inputs to the development of CR
A VAR(p) model for a multivariate time series is a regression model for outcomes at a
specified time t and time lagged predictors, with p indicating the lag (e.g., p = 1 refers to the
observation previous to t; p = 2 refers to two observations prior to t, and so on ). Key terms
used in VAR modeling are defined in Table 1. The VAR model for vital sign time series data
was developed using the following steps:
1. Stationarity of the individual VSTS (HR, RR, SpO2) is tested.
2. Lag is determined using lag-length selection criteria.
3. A VAR model with appropriate lags is built.
4. Residual autocorrelation is assessed with the Lagrange Multiplier (LM) test.
5. Stability of the VAR system is assessed with the autoregressive (AR) roots graph.
6. The Granger causality test is performed.
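As a sketch of the model-building step, a VAR(1) in two variables can be estimated by ordinary least squares, one equation per variable. This illustration uses numpy directly; in practice one would use a package such as statsmodels, which also provides the lag-selection, LM, stability, and Granger-causality diagnostics listed above:

```python
# Estimate a bivariate VAR(1), y_t = A @ y_{t-1} + e_t, by per-equation OLS.
import numpy as np

rng = np.random.default_rng(0)
A_true = np.array([[0.5, 0.1],
                   [0.2, 0.4]])                # true coefficient matrix (stable)
n = 3000
y = np.zeros((n, 2))
for t in range(1, n):
    y[t] = A_true @ y[t - 1] + rng.normal(0, 1, 2)

X = y[:-1]                                     # lagged values as regressors
Y = y[1:]                                      # current values as outcomes
A_hat = np.linalg.lstsq(X, Y, rcond=None)[0].T # OLS for both equations at once
print(A_hat.round(2))                          # close to A_true
```

Each row of `A_hat` is one regression equation: the first row predicts variable 1 from both lagged variables, the second predicts variable 2, which is exactly the "regression model for outcomes at time t on time-lagged predictors" described above.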
Structural vector autoregressions (SVARs)
Structural Vector Autoregressions represent a prominent class of time series models used
for macroeconomic analysis. The model consists of a set of multivariate linear
autoregressive equations characterizing the joint dynamics of economic variables. The
residuals of these equations are combinations of the underlying structural economic shocks,
assumed to be orthogonal to each other. Using a minimal set of restrictions, these relations
can be estimated—the so-called shock identification—and the variables can be expressed as
linear functions of current and past structural shocks. The coefficients of these equations,
called impulse response functions, represent the dynamic response of model variables to
shocks. Several ways of identifying structural shocks have been proposed in the literature:
short-run restrictions, long-run restrictions, and sign restrictions, to mention a few.
SVAR models have been extensively employed to study the transmission mechanisms of
macroeconomic shocks and test economic theories. Special attention has been paid to
monetary and fiscal policy shocks as well as other nonpolicy shocks like technology and
financial shocks.
In recent years, many advances have been made both in terms of theory and empirical
strategies. Several works have contributed to extend the standard model in order to
incorporate new features like large information sets, nonlinearities, and time-varying
coefficients. New strategies to identify structural shocks have been designed, and new
methods to do inference have been introduced.
Impulse response functions trace the dynamic impact of a “shock”, or change to an input,
on a system. While impulse response functions are used in many fields, they are particularly
useful in economics and finance for a number of reasons:
• They are consistent with how we use theoretical economic and finance models.
Theoretical economists develop a model, then ask how outcomes change in the face
of exogenous changes.
• They can be used to predict the implications of policy changes in a macroeconomic
framework.
• They employ structural restrictions which allow us to model our believed theoretical
relationships in the economy.
In stationary systems, we expect that the shocks to the system are not persistent and over
time the system converges. When the system converges, it may or may not converge to the
original state, depending on the restrictions imposed on our structural VAR model.
For example, Blanchard and Quah (1989) famously demonstrated the use of long-run
restrictions in a structural VAR to trace the impact of aggregate supply and aggregate
demand shocks on output and unemployment. In their model:
• They allow aggregate supply shocks to have lasting effects on output.
• They assume that aggregate demand shocks do not have lasting effects on long-run
output.
As a result, when a positive aggregate supply shock occurs, output converges to a higher
level than before the shock.
There is a clear modeling procedure to obtaining the impulse response functions:
• Determine appropriate restrictions based on theory and/or previous empirical
models.
• Estimate the structural VAR model.
• Predict the impulse response functions for a specified time horizon along with their
confidence bands.
• Plot the predicted IRF and their confidence bands.
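Once the coefficients are estimated, the reduced-form impulse responses of a VAR(1) are simply powers of the coefficient matrix. A sketch under an assumed stable coefficient matrix (confidence bands, which require the estimation error, are omitted here):

```python
# Impulse responses of a stable VAR(1): the response at horizon h to a unit
# shock in variable j is column j of A**h. For a stable system (all
# eigenvalues inside the unit circle) the responses converge back to zero.
import numpy as np

A = np.array([[0.5, 0.1],
              [0.2, 0.4]])                   # assumed estimated VAR(1) matrix

def irf(A, horizons):
    out = [np.eye(A.shape[0])]               # impact response at h = 0
    for _ in range(horizons):
        out.append(A @ out[-1])              # Psi_h = A @ Psi_{h-1} = A**h
    return out

responses = irf(A, 20)
# Response of variable 0 to a unit shock in itself, horizons 0..20:
path = [psi[0, 0] for psi in responses]
print(path[:3])                              # starts at 1.0 and decays
```

Plotting `path` against the horizon reproduces the kind of IRF grid graph described in the interpretation section below; here the eigenvalues of A are 0.6 and 0.3, so the shock dies out quickly.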
INTERPRETATION
Impulse responses are most often interpreted through grid graphs of the individual
responses of each variable to an implemented shock over a specified time horizon.
Example: VAR(2) Model of Consumption, Investment, and Income
The graph above shows the impulse response functions for a VAR(2) of income,
consumption, and investment. These IRFs show the impact of a one standard deviation
shock to income.
In order to estimate the structural VAR, short-run restrictions on the model were employed.
These restrictions are such that:
• Income shocks cannot contemporaneously (i.e. immediately) impact investment.
• Consumption shocks cannot contemporaneously impact income or investment.
Let’s look first at the IRF tracing the impact of the shock to income on income itself. In this
graph, we see:
• The initial shock to income in the first period.
• This shock quickly dies as the impact returns to almost zero in the second period.
• A slight increase in income in periods 2-4, with a post-shock peak in period 4.
• The impact converges back to zero after period 4.
The consumption graph shows:
• A quick jump in consumption at the time of the income increase -- this is consistent
with the economic theory that consumption is a normal good (it increases with
increases in income).
• A second spike in consumption occurs around the third period -- this is likely a
lagging response to the increase in investment.
In the investment response to the income shock, we note that there:
• Is no first-period impact of the income shock on investment. This is by design and
results directly from the restrictions implemented in order to estimate the SVAR.
• Is a short period of positive impact in periods 2-4, which then converges back to zero.
GARCH MODEL
Generalized AutoRegressive Conditional Heteroskedasticity (GARCH) is a statistical model
used in analyzing time-series data where the variance error is believed to be serially
autocorrelated. GARCH models assume that the variance of the error term follows an
autoregressive moving average process.
Although GARCH models can be used in the analysis of a number of different types of
financial data, such as macroeconomic data, financial institutions typically use them to
estimate the volatility of returns for stocks, bonds, and market indices. They use the
resulting information to help determine pricing and judge which assets will potentially
provide higher returns, as well as to forecast the returns of current investments to help in
their asset allocation, hedging, risk management, and portfolio optimization decisions.
GARCH models are used when the variance of the error term is not constant. That is, the
error term is heteroskedastic. Heteroskedasticity describes the irregular pattern of
variation of an error term, or variable, in a statistical model.
Essentially, wherever there is heteroskedasticity, observations do not conform to a linear
pattern. Instead, they tend to cluster. Therefore, if statistical models that assume constant
variance are used on this data, then the conclusions and predictive value one can draw
from the model will not be reliable.
The variance of the error term in GARCH models is assumed to vary systematically,
conditional on the average size of the error terms in previous periods. In other words, it
has conditional heteroskedasticity, and the reason for the heteroskedasticity is that the
error term is following an autoregressive moving average pattern. This means that it is a
function of an average of its own past values.
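The recursion is easy to simulate. A sketch of a GARCH(1,1), whose unconditional variance is ω / (1 − α − β):

```python
# Simulate a GARCH(1,1): sigma2_t = omega + alpha*e_{t-1}^2 + beta*sigma2_{t-1}.
import random

omega, alpha, beta = 0.1, 0.1, 0.8            # alpha + beta < 1: stationary
rng = random.Random(3)
sigma2 = omega / (1 - alpha - beta)           # start at the unconditional variance
eps = []
for _ in range(20000):
    e = rng.gauss(0, 1) * sigma2 ** 0.5       # e_t = sigma_t * z_t
    eps.append(e)
    sigma2 = omega + alpha * e * e + beta * sigma2   # conditional variance update

sample_var = sum(e * e for e in eps) / len(eps)
print(sample_var)                             # near omega/(1-alpha-beta) = 1.0
```

A large shock raises the next period's conditional variance, which is exactly the volatility clustering GARCH models are built to capture: calm periods stay calm and turbulent periods stay turbulent.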
EGARCH MODEL
The exponential GARCH (EGARCH) may generally be specified as
εt = σtzt ;  ln σt² = ω + Σ_{i=1}^{p} αi ε²t−i + Σ_{j=1}^{q} βj ln σ²t−j .
This model differs from the GARCH variance structure because of the log of the variance.
The following specification also has been used in the financial literature (Dhamija and
Bhalla):
εt = σtzt ;  ln σt² = ω + Σ_{i=1}^{p} αi ε²t−i + Σ_{j=1}^{q} λj ln σ²t−j + Σ_{i=1}^{p} γi ( |εt−i| / σt−i − √(2/π) ) .
GJR GARCH
The GJR GARCH model is represented by the expression
σt² = ω + Σ_{i=1}^{p} αi ε²t−i + Σ_{j=1}^{q} βj σ²t−j + Σ_{i=1}^{p} γi It−i ε²t−i .
where: It−i = 1 if εt−i < 0, and It−i = 0 if εt−i ≥ 0.
IGARCH
IGARCH models apply both an autoregressive and a moving average structure to the
variance σ². The integrated GARCH (IGARCH) is specified as
εt = σtzt ;  σt² = ω + Σ_{i=1}^{p} αi ε²t−i + Σ_{j=1}^{q} βj σ²t−j ,
with the persistence parameters constrained to sum to one (Σ αi + Σ βj = 1), which is what
makes the process integrated.
AGARCH and AVGARCH
An asymmetric GARCH (AGARCH) is simply
εt = σtzt ;  σt² = ω + Σ_{i=1}^{p} αi (εt−i − b)² + Σ_{j=1}^{q} βj σ²t−j .
The absolute value generalized autoregressive conditional heteroskedastic (AVGARCH)
model is specified as
εt = σtzt ;  σt² = ω + Σ_{i=1}^{p} αi ( |εt−i + b| − c (εt−i + b) )² + Σ_{j=1}^{q} βj σ²t−j .
COINTEGRATION AND ERROR CORRECTION MODEL
Cointegration is a statistical property of a collection (X1, X2, ..., Xk) of time series variables.
First, all of the series must be integrated of order d (see Order of integration). Next, if
a linear combination of this collection is integrated of order less than d, then the collection
is said to be co-integrated. Formally, if (X,Y,Z) are each integrated of order d, and there exist
coefficients a,b,c such that aX + bY + cZ is integrated of order less than d, then X, Y, and Z are
cointegrated. Cointegration has become an important property in contemporary time series
analysis. Time series often have trends—either deterministic or stochastic. In an influential
paper, Charles Nelson and Charles Plosser (1982) provided statistical evidence that many US
macroeconomic time series (like GNP, wages, employment, etc.) have stochastic trends.
Cointegration occurs when two or more nonstationary time series:
• Have a long-run equilibrium.
• Move together in such a way that their linear combination results in a stationary
time series.
• Share an underlying common stochastic trend.
Eg- Consumption and income.
Nominal exchange rates and domestic and foreign prices.
Stock prices and stock dividends/earnings.
Male and female mortality rates.
Occurrence rates of different types of cancer.
What is the Error Correction Model?
Cointegration implies that the time series will be connected through an error correction model.
The error correction model is important in time series analysis because it allows us to better
understand long-run dynamics. Additionally, failing to properly model cointegrated variables
can result in biased estimates.
The error correction model:
• Reflects the long-run equilibrium relationships of variables.
• Includes a short-run dynamic adjustment mechanism that describes how variables
adjust when they are out of equilibrium.
• Uses adjustment coefficients to measure the forces that push the relationship
towards long-run equilibrium.
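A sketch of the idea behind the Engle-Granger approach to these concepts: two series sharing a common stochastic trend are individually non-stationary, yet the residual from regressing one on the other is stationary, and it is this residual that enters the error correction model as the disequilibrium term:

```python
# Two series built from one common random-walk trend are cointegrated:
# y_t = 2 * x_t + stationary noise. Regressing y on x removes the trend.
import random

rng = random.Random(5)
x, y = [], []
level = 0.0
for _ in range(1000):
    level += rng.gauss(0, 1)                  # common stochastic trend (random walk)
    x.append(level)
    y.append(2.0 * level + rng.gauss(0, 1))   # cointegrated with x

# OLS slope of y on x (no intercept, since both series start at zero)
b = sum(a * c for a, c in zip(x, y)) / sum(a * a for a in x)
resid = [c - b * a for a, c in zip(x, y)]     # the "error correction" term

def var(v):
    m = sum(v) / len(v)
    return sum((u - m) ** 2 for u in v) / len(v)

print(round(b, 2))            # close to the true long-run coefficient 2.0
print(var(resid), var(y))     # residual variance is small; var(y) is large
```

Both `x` and `y` wander without bound, but the linear combination `y − b·x` stays near zero, which is exactly the long-run equilibrium relationship the error correction model pulls the variables back toward.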
FIXED EFFECT MODEL
In statistics, a fixed effects model is a statistical model in which the model parameters are
fixed or non-random quantities. This is in contrast to random effects models and mixed
models in which all or some of the model parameters are random variables. In many
applications including econometrics[1] and biostatistics[2][3][4][5][6] a fixed effects model
refers to a regression model in which the group means are fixed (non-random) as opposed
to a random effects model in which the group means are a random sample from a
population.[7][6] Generally, data can be grouped according to several observed factors. The
group means could be modeled as fixed or random effects for each grouping. In a fixed
effects model each group mean is a group-specific fixed quantity.
In panel data where longitudinal observations exist for the same subject, fixed effects
represent the subject-specific means. In panel data analysis the term fixed effects
estimator (also known as the within estimator) is used to refer to an estimator for
the coefficients in the regression model including those fixed effects
RANDOM EFFECT MODEL
In statistics, a random effects model, also called a variance components model, is
a statistical model where the model parameters are random variables. It is a kind
of hierarchical linear model, which assumes that the data being analysed are drawn from a
hierarchy of different populations whose differences relate to that hierarchy. A random
effects model is a special case of a mixed model.
Contrast this to the biostatistics definitions,[1][2][3][4][5] as biostatisticians use "fixed" and
"random" effects to respectively refer to the population-average and subject-specific effects
(and where the latter are generally assumed to be unknown, latent variables).
COMPONENTS OF TIME SERIES ANALYSIS
Components for Time Series Analysis
The various reasons or the forces which affect the values of an observation in a time series are
the components of a time series. The four categories of the components of time series are
• Trend
• Seasonal Variations
• Cyclic Variations
• Random or Irregular movements
Seasonal and Cyclic Variations are the periodic changes or short-term fluctuations.
Trend
The trend shows the general tendency of the data to increase or decrease during a long period
of time. A trend is a smooth, general, long-term, average tendency. It is not always necessary
that the increase or decrease is in the same direction throughout the given period of time.
It is observable that the tendencies may increase, decrease or are stable in different sections of
time. But the overall trend must be upward, downward or stable. The population, agricultural
production, items manufactured, the number of births and deaths, and the number of
industries, factories, schools or colleges are some examples showing such tendencies of
movement.
Linear and Non-Linear Trend
If we plot the time series values on a graph against time t, the pattern of the data clustering
shows the type of trend. If the data cluster more or less around a straight line, then the
trend is linear; otherwise it is non-linear (curvilinear).
Periodic Fluctuations
There are some components in a time series which tend to repeat themselves over a certain
period of time. They act in a regular spasmodic manner.
Seasonal Variations
These are the rhythmic forces which operate in a regular and periodic manner over a span of
less than a year. They have the same or almost the same pattern during a period of 12 months.
This variation will be present in a time series if the data are recorded hourly, daily, weekly,
quarterly, or monthly.
These variations come into play either because of the natural forces or man-made conventions.
The various seasons or climatic conditions play an important role in seasonal variations. For
example, the production of crops depends on the seasons, the sale of umbrellas and raincoats
rises in the rainy season, and the sale of electric fans and air conditioners shoots up in summer.
The effect of man-made conventions such as some festivals, customs, habits, fashions, and
some occasions like marriage is easily noticeable. They recur themselves year after year. An
upswing in a season should not be taken as an indicator of better business conditions.
Cyclic Variations
The variations in a time series which operate themselves over a span of more than one year
are the cyclic variations. This oscillatory movement has a period of oscillation of more than a
year. One complete period is a cycle. This cyclic movement is sometimes called the ‘Business
Cycle’.
It is a four-phase cycle comprising of the phases of prosperity, recession, depression, and
recovery. The cyclic variation may be regular but is not necessarily periodic. The upswings and the
downswings in business depend upon the joint nature of the economic forces and the
interaction between them.
Random or Irregular Movements
There is another factor which causes the variation in the variable under study. They are not
regular variations and are purely random or irregular. These fluctuations are unforeseen,
uncontrollable, unpredictable, and are erratic. These forces are earthquakes, wars, flood,
famines, and any other disasters.
Mathematical Model for Time Series Analysis
Mathematically, a time series is given as
yt = f (t)
Here, yt is the value of the variable under study at time t. If the population is the variable
under study at the various time periods t1, t2, t3, … , tn, then the time series is
t: t1, t2, t3, … , tn
yt: yt1, yt2, yt3, …, ytn
or, t: t1, t2, t3, … , tn
yt: y1, y2, y3, … , yn
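The four components described earlier are usually combined into one of two standard decomposition models (notation: T trend, S seasonal, C cyclic, I irregular):

```latex
% Additive model: the components are assumed independent and simply sum
Y_t = T_t + S_t + C_t + I_t
% Multiplicative model: the components scale one another, which is common
% when seasonal swings grow with the level of the series
Y_t = T_t \times S_t \times C_t \times I_t
```

Taking logarithms of the multiplicative model turns it into an additive one, which is one reason the log transform discussed under stationarity is so widely used.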