Innovations Algorithm for Periodically Stationary Time Series

Paul L. Anderson¹
Department of Mathematics
University of Nevada

Mark M. Meerschaert
Department of Mathematics
University of Nevada

Aldo V. Vecchia
Water Resources Division
U.S. Geological Survey

April 20, 2004

AMS 1991 subject classification: Primary 62M10, 62E20; Secondary 60E07, 60F05.
Key words and phrases: time series, periodically stationary, Yule-Walker estimates, innovations algorithm, heavy tails, regular variation.

¹On leave from Department of Mathematics, Albion College, Albion MI 49224.
Abstract

Periodic ARMA, or PARMA, time series are used to model periodically stationary time series. In this paper we develop the innovations algorithm for periodically stationary processes. We then show how the algorithm can be used to obtain parameter estimates for the PARMA model. These estimates are proven to be weakly consistent for PARMA processes whose underlying noise sequence has either finite or infinite fourth moment. Since many time series from the fields of economics and hydrology exhibit heavy tails, the results regarding the infinite fourth moment case are of particular interest.
1 Introduction

The innovations algorithm yields parameter estimates for nonstationary time series models. In this paper we show that these estimates are consistent for periodically stationary time series. A stochastic process $X_t$ is called periodically stationary if $\mu_t = EX_t$ and $\gamma_t(h) = EX_tX_{t+h}$ for $h = 0, 1, 2, \ldots$ are all periodic functions of time $t$ with the same period $\nu$. Periodically stationary processes manifest themselves in such fields as economics, hydrology, and geophysics, where the observed time series are characterized by seasonal variations in both the mean and covariance structure. An important class of stochastic models for describing periodically stationary time series are the periodic ARMA models, in which the model parameters are allowed to vary with the season. Periodic ARMA models are developed in Jones and Brelsford (1967), Pagano (1978), Troutman (1979), Tjostheim and Paulsen (1982), Salas, Tabios, and Bartolini (1985), Vecchia and Ballerini (1991), Anderson and Vecchia (1993), Ula (1993), Adams and Goodwin (1995), and Anderson and Meerschaert (1997).

This paper provides a parameter estimation technique that covers two types of periodic time series models: those with finite fourth moment, and those with finite variance but infinite fourth moment. In the latter case we make the technical assumption that the innovations have regularly varying probability tails. The estimation procedure adapts the well-known innovations algorithm (see for example Brockwell and Davis (1991), p. 172) to the case of periodically stationary time series. We show that the estimates from the algorithm are weakly consistent. A more formal treatment of the asymptotic behavior of the innovations algorithm will be given in a forthcoming paper, Anderson, Meerschaert, and Vecchia (1998).
Brockwell and Davis (1988) discuss asymptotics of the innovations algorithm for stationary time series, using results of Berk (1974) and Bhansali (1978). Our results reduce to theirs when the period $\nu = 1$ and the process has finite fourth moments. For infinite fourth moment time series, our results are new even in the stationary case. Davis and Resnick (1986) establish the consistency of Yule-Walker estimates for a stationary autoregressive process of finite order with finite variance and infinite fourth moments. We extend their result to periodic ARMA processes. However, the Durbin-Levinson algorithm to compute the Yule-Walker estimates does not extend to nonstationary processes, and so these results are primarily of theoretical interest. Mikosch, Gadrich, Klüppelberg and Adler (1995) investigate parameter estimation for ARMA models with infinite variance innovations, but they do not consider the case of finite variance and infinite fourth moment. Time series with infinite fourth moment and finite variance are common in finance and hydrology; see for example Jansen and de Vries (1991), Loretan and Phillips (1994), and Anderson and Meerschaert (1998). The results in this paper provide the first practical method for time series parameter estimation in this important special case.
2 The Innovations Algorithm for Periodically Correlated Processes

Let $\tilde X_t$ be a time series with finite second moments, and define its mean function $\mu_t = E(\tilde X_t)$ and its autocovariance function $\gamma_t(\ell) = \mathrm{cov}(\tilde X_t, \tilde X_{t+\ell})$. $\tilde X_t$ is said to be periodically correlated with period $\nu$ if, for some positive integer $\nu$ and for all integers $k$ and $\ell$, (i) $\mu_t = \mu_{t+k\nu}$ and (ii) $\gamma_t(\ell) = \gamma_{t+k\nu}(\ell)$. For a monthly periodic time series it is typical that $\nu = 12$. In this paper we are especially interested in the periodic ARMA process due to its importance in modeling periodically correlated processes. The periodic ARMA process $\tilde X_t$ with period $\nu$ (PARMA$_\nu(p,q)$) has representation

$$X_t - \sum_{j=1}^{p} \phi_t(j) X_{t-j} = \varepsilon_t - \sum_{j=1}^{q} \theta_t(j)\,\varepsilon_{t-j} \qquad (1)$$

where $X_t = \tilde X_t - \mu_t$ and $\varepsilon_t$ is a sequence of random variables with mean zero and standard deviation $\sigma_t$ such that $\delta_t = \sigma_t^{-1}\varepsilon_t$ is i.i.d. The model parameters $\phi_t(j)$, $\theta_t(j)$, and $\sigma_t$ are respectively the periodic autoregressive, periodic moving average, and periodic residual standard deviation parameters. In this paper we will consider models where $E\delta_t^4 < \infty$, and also models in which $E\delta_t^4 = \infty$. We will say that the i.i.d. sequence $\delta_t$ is RV($\alpha$) if $P[|\delta_t| > x]$ varies regularly with index $-\alpha$ and $P[\delta_t > x]/P[|\delta_t| > x] \to p$ for some $p \in [0,1]$. In the case where the noise sequence has infinite fourth moment, we assume that the sequence is RV($\alpha$) with $\alpha > 2$. This assumption implies that $E|\delta_t|^{\beta} < \infty$ if $0 < \beta < \alpha$; in particular, the variance of $\delta_t$ exists. With this technical condition, Anderson and Meerschaert (1997) show that the sample autocovariance is a consistent estimator of the autocovariance, and asymptotically stable with tail index $\alpha/2$. Stable laws and processes are comprehensively treated in Samorodnitsky and Taqqu (1994).
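To make the model (1) concrete, here is a minimal simulation sketch for the first-order case PARMA$_\nu(1,1)$. It is our own illustration, not part of the paper: the function name and parameterization are hypothetical, and the noise is taken i.i.d. standard normal (so $E\delta_t^4 < \infty$); a heavy-tailed RV($\alpha$) law could be substituted for $\delta_t$ to produce the infinite fourth moment case.

```python
import numpy as np

def simulate_parma11(phi, theta, sigma, n_years, burn=200, seed=None):
    """Simulate X_t - phi_t X_{t-1} = eps_t - theta_t eps_{t-1}, a PARMA_nu(1,1)
    process, where phi, theta, sigma are length-nu arrays of seasonal parameters
    and eps_t = sigma_t * delta_t with delta_t i.i.d. standard normal."""
    rng = np.random.default_rng(seed)
    nu = len(phi)
    n = (n_years + burn) * nu
    delta = rng.standard_normal(n)        # substitute a Pareto law for RV(alpha) noise
    eps = np.array([sigma[t % nu] for t in range(n)]) * delta
    X = np.zeros(n)
    for t in range(1, n):
        s = t % nu                        # season of time t
        X[t] = phi[s] * X[t - 1] + eps[t] - theta[s] * eps[t - 1]
    return X[burn * nu:]                  # discard burn-in so the start-up is forgotten

# Example: a quarterly (nu = 4) series of N = 50 years
X = simulate_parma11(phi=[0.5, -0.3, 0.2, 0.6], theta=[0.1, 0.4, -0.2, 0.3],
                     sigma=[1.0, 2.0, 0.5, 1.5], n_years=50, seed=1)
```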
There are some restrictions that need to be placed on the parameter space of (1). The first restriction is that the model admits a causal representation

$$X_t = \sum_{j=0}^{\infty} \psi_t(j)\,\varepsilon_{t-j} \qquad (2)$$

where $\psi_t(0) = 1$ and $\sum_{j=0}^{\infty} |\psi_t(j)| < \infty$ for all $t$. The absolute summability of the $\psi$-weights ensures that (2) converges almost surely for all $t$, and in mean square to the same limit. The causality condition places constraints on the autoregressive parameters (see for example Tiao and Grupe (1980)), but these constraints are not the focus of this paper. It should be noted that $\psi_t(j) = \psi_{t+k\nu}(j)$ for all $j$. Another restriction on the parameter space of (1) is the invertibility condition,

$$\varepsilon_t = \sum_{j=0}^{\infty} \pi_t(j)\,X_{t-j} \qquad (3)$$

where $\pi_t(0) = 1$ and $\sum_{j=0}^{\infty} |\pi_t(j)| < \infty$ for all $t$. The invertibility condition places constraints on the moving average parameters in the same way that (2) places constraints on the autoregressive parameters. Again, $\pi_t(j) = \pi_{t+k\nu}(j)$ for all $j$.
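For the first-order case PARMA$_\nu(1,1)$, matching coefficients in (1) and (2) gives the recursion $\psi_t(0) = 1$, $\psi_t(1) = \phi_t - \theta_t$, and $\psi_t(j) = \phi_t\,\psi_{t-1}(j-1)$ for $j \ge 2$. The sketch below (our own illustration, not from the paper) tabulates these weights season by season; by the periodicity just noted, only the $\nu$ seasonal rows are needed.

```python
import numpy as np

def psi_weights_parma11(phi, theta, J=50):
    """psi[s, j] approximates psi_s(j) in the causal representation (2) for a
    PARMA_nu(1,1) model, via psi_t(0) = 1, psi_t(1) = phi_t - theta_t, and
    psi_t(j) = phi_t * psi_{t-1}(j-1) for j >= 2 (seasons taken mod nu)."""
    nu = len(phi)
    psi = np.zeros((nu, J + 1))
    psi[:, 0] = 1.0
    for s in range(nu):
        psi[s, 1] = phi[s] - theta[s]
    for j in range(2, J + 1):
        for s in range(nu):
            psi[s, j] = phi[s] * psi[(s - 1) % nu, j - 1]
    return psi  # rows decay geometrically when the model is causal

# The pi-weights of (3) satisfy the analogous recursion with the roles of
# phi and theta interchanged.
```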
Given $N$ years of data with $\nu$ seasons per year, the innovations algorithm allows us to forecast future values of $X_t$ for $t \ge N\nu$ in terms of the observed values $X_0, \ldots, X_{N\nu-1}$. Toward this end, we would like to find the best linear combination of $X_0, \ldots, X_{N\nu-1}$ for predicting $X_{N\nu}$, in the sense that the mean-square distance from $X_{N\nu}$ is minimized. For a periodic time series, the one-step predictors must be calculated for each season $i$, $i = 0, 1, \ldots, \nu - 1$. The remainder of this section develops the innovations algorithm for periodic time series models. We adapt the development of Brockwell and Davis (1991) to this special case, and introduce the notation which will be used throughout the rest of the paper.
2.1 Equations for the One-Step Predictors

Let $H_{n,i}$ denote the closed linear subspace $\overline{\mathrm{sp}}\{X_i, \ldots, X_{i+n-1}\}$, $n \ge 1$, and let $\hat X^{(i)}_{i+n}$, $n \ge 0$, denote the one-step predictors, which are defined by

$$\hat X^{(i)}_{i+n} = \begin{cases} 0 & \text{if } n = 0 \\ P_{H_{n,i}} X_{i+n} & \text{if } n \ge 1. \end{cases} \qquad (4)$$

We call $P_{H_{n,i}} X_{i+n}$ the projection mapping of $X_{i+n}$ onto the space $H_{n,i}$. Also, define

$$v_{n,i} = \|X_{i+n} - \hat X^{(i)}_{i+n}\|^2 = E\big(X_{i+n} - \hat X^{(i)}_{i+n}\big)^2.$$

There are two representations of $P_{H_{n,i}} X_{i+n}$ pertinent to the goals of this paper. The first one relates directly to the innovations algorithm and depends on writing $H_{n,i}$ as a span of
orthogonal components, viz.,
$$H_{n,i} = \overline{\mathrm{sp}}\big\{X_i - \hat X^{(i)}_i,\ X_{i+1} - \hat X^{(i)}_{i+1},\ \ldots,\ X_{i+n-1} - \hat X^{(i)}_{i+n-1}\big\}, \quad n \ge 1,$$

so that

$$\hat X^{(i)}_{i+n} = \sum_{j=1}^{n} \theta^{(i)}_{n,j}\big(X_{i+n-j} - \hat X^{(i)}_{i+n-j}\big). \qquad (5)$$
The second representation of $P_{H_{n,i}} X_{i+n}$ is given by

$$\hat X^{(i)}_{i+n} = \phi^{(i)}_{n,1} X_{i+n-1} + \cdots + \phi^{(i)}_{n,n} X_i, \quad n \ge 1. \qquad (6)$$

The vector of coefficients, $\phi^{(i)}_n = (\phi^{(i)}_{n,1}, \ldots, \phi^{(i)}_{n,n})'$, appears in the prediction equations

$$\Gamma_{n,i}\,\phi^{(i)}_n = \gamma^{(i)}_n \qquad (7)$$

where $\gamma^{(i)}_n = (\gamma_{i+n-1}(1), \gamma_{i+n-2}(2), \ldots, \gamma_i(n))'$ and

$$\Gamma_{n,i} = \big[\gamma_{i+n-1-\ell}(\ell - m)\big]_{\ell,m=0,\ldots,n-1}, \quad i = 0, \ldots, \nu - 1, \qquad (8)$$

is the covariance matrix of $(X_{i+n-1}, \ldots, X_i)'$. A condition sufficient for $\Gamma_{n,i}$ to be invertible for all $n \ge 1$ and each $i = 0, 1, \ldots, \nu - 1$ is given in the following proposition. Only the causality condition is required for the proposition to be valid.
Proposition 2.1.1. If $\sigma^2_i > 0$ for $i = 0, \ldots, \nu - 1$, then for a causal PARMA$_\nu(p,q)$ process the covariance matrix $\Gamma_{n,i}$ in (7) is nonsingular for every $n \ge 1$ and each $i$.

Proof. See Proposition 4.1 of Lund and Basawa (1999) for a proof.
Remark. Proposition 5.1.1 of Brockwell and Davis (1991) does not extend to general periodically stationary processes. By Proposition 2.1.1, however, if our periodic process is a PARMA process, then we are guaranteed that the covariance matrix $\Gamma_{n,i}$ is nonsingular for every $n$ and each $i$. To establish this remark, consider the periodically stationary process $X_t$ of period $\nu = 2$ given by

$$X_{2t} = Z_{2t}, \qquad X_{2t+1} = (X_{2t-1} + X_{2t-2})/\sqrt{2},$$

where $Z_t$ is an i.i.d. sequence of standard normal variables. It is easy to show that $\gamma_0(0) = \gamma_1(0) = 1$ and $\gamma_0(1) = 0$. Also, for $n \ge 1$, $\gamma_0(2n) = 0$, $\gamma_0(2n+1) = (\tfrac{1}{2})^{n/2}$, $\gamma_1(2n-1) = 0$, and $\gamma_1(2n) = (\tfrac{1}{2})^{n/2}$. The process $X_t$ is, by definition, periodically stationary of period $\nu = 2$. Using (8) we let $\Gamma_{n,0} = [\gamma_{n-1-\ell}(\ell - m)]_{\ell,m=0,\ldots,n-1}$ be the covariance matrix of $(X_{n-1}, \ldots, X_0)'$. Again, it is easy to show that $\Gamma_{2,0}$ and $\Gamma_{3,0}$ are identity matrices, hence nonsingular. However, $\Gamma_{4,0}$ is a singular matrix, so that $\Gamma_{n,0}$ is singular for $n \ge 4$. Thus, the process is such that $\Gamma_{2,0}$ is invertible and $\gamma_i(h) \to 0$ as $h \to \infty$, but $\Gamma_{n,0}$ is singular for $n \ge 4$. Note that this process is not a PARMA$_2$ process.
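The singularity of $\Gamma_{4,0}$ is easy to verify numerically from the autocovariances listed above; the check below is our own illustration. The rank deficiency reflects the exact linear relation $X_3 = (X_1 + X_0)/\sqrt{2}$.

```python
import numpy as np

r = 2 ** -0.5  # gamma_0(3) = gamma_1(2) = (1/2)^{1/2}

# Covariance matrix of (X_3, X_2, X_1, X_0)' built from the autocovariances above
Gamma_40 = np.array([[1, 0, r, r],
                     [0, 1, 0, 0],
                     [r, 0, 1, 0],
                     [r, 0, 0, 1]])

print(np.linalg.det(Gamma_40))            # ~0: Gamma_{4,0} is singular
print(np.linalg.matrix_rank(Gamma_40))    # 3, i.e. one exact linear dependency
```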
2.2 The Innovations Algorithm

The proposition that follows is the innovations algorithm for periodically stationary processes. For a proof, see Proposition 5.2.2 in Brockwell and Davis (1991).

Proposition 2.2.1. If $X_t$ has zero mean and $E(X_\ell X_m) = \gamma_\ell(m - \ell)$, where the matrix $\Gamma_{n,i} = [\gamma_{i+n-1-\ell}(\ell - m)]_{\ell,m=0,\ldots,n-1}$, $i = 0, \ldots, \nu - 1$, is nonsingular for each $n \ge 1$, then the one-step predictors $\hat X_{i+n}$, $n \ge 0$, and their mean-square errors $v_{n,i}$, $n \ge 1$, are given by

$$\hat X_{i+n} = \begin{cases} 0 & \text{if } n = 0 \\ \sum_{j=1}^{n} \theta^{(i)}_{n,j}\big(X_{i+n-j} - \hat X_{i+n-j}\big) & \text{if } n \ge 1 \end{cases} \qquad (9)$$

and, for $k = 0, 1, \ldots, n - 1$,

$$\begin{aligned} v_{0,i} &= \gamma_i(0) \\ \theta^{(i)}_{n,n-k} &= (v_{k,i})^{-1}\Big[\gamma_{i+k}(n-k) - \sum_{j=0}^{k-1} \theta^{(i)}_{k,k-j}\,\theta^{(i)}_{n,n-j}\,v_{j,i}\Big] \\ v_{n,i} &= \gamma_{i+n}(0) - \sum_{j=0}^{n-1}\big(\theta^{(i)}_{n,n-j}\big)^2 v_{j,i}. \end{aligned} \qquad (10)$$

We solve (10) recursively in the order $v_{0,i}$; $\theta^{(i)}_{1,1}$, $v_{1,i}$; $\theta^{(i)}_{2,2}$, $\theta^{(i)}_{2,1}$, $v_{2,i}$; $\theta^{(i)}_{3,3}$, $\theta^{(i)}_{3,2}$, $\theta^{(i)}_{3,1}$, $v_{3,i}$; $\ldots$.
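As a concrete rendering of (10), the following sketch (our own, with hypothetical names) runs the recursions for a given season $i$, taking a function `gamma(t, h)` that returns $\gamma_t(h) = E(X_t X_{t+h})$; the solve order is exactly the one just described.

```python
import numpy as np

def innovations(gamma, i, n_max):
    """Recursions (10) for season i. gamma(t, h) must return gamma_t(h) = E(X_t X_{t+h}).
    Returns theta, v with theta[n][j] playing the role of theta^{(i)}_{n,j} (1 <= j <= n)
    and v[n] the mean-square error v_{n,i}."""
    theta = [None] * (n_max + 1)
    v = np.zeros(n_max + 1)
    v[0] = gamma(i, 0)                              # v_{0,i} = gamma_i(0)
    for n in range(1, n_max + 1):
        th = np.zeros(n + 1)                        # th[j] = theta^{(i)}_{n,j}
        for k in range(n):                          # fills th[n], th[n-1], ..., th[1]
            s = gamma(i + k, n - k)
            s -= sum(theta[k][k - j] * th[n - j] * v[j] for j in range(k))
            th[n - k] = s / v[k]
        theta[n] = th
        v[n] = gamma(i + n, 0) - sum(th[n - j] ** 2 * v[j] for j in range(n))
    return theta, v
```

The one-step predictors of (9) are then built up from $n = 1$ via $\hat X_{i+n} = \sum_{j=1}^{n}\theta^{(i)}_{n,j}(X_{i+n-j} - \hat X_{i+n-j})$.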
The corollaries which follow in this section require the invertibility condition (3). The first corollary shows that the innovations algorithm provides consistent estimates of the seasonal standard deviations, and the proof also provides the rate of convergence.

Corollary 2.2.1. In the innovations algorithm, for each $i = 0, 1, \ldots, \nu - 1$ we have

$$v_{m,\langle i-m\rangle} \to \sigma^2_i \quad \text{as } m \to \infty,$$

where

$$\langle k\rangle = \begin{cases} k - \nu[k/\nu] & \text{if } k = 0, 1, \ldots, \\ \nu + k - \nu[k/\nu + 1] & \text{if } k = -1, -2, \ldots, \end{cases}$$

and $[\,\cdot\,]$ is the greatest integer function. Note that $\langle k\rangle$ denotes the season associated with time $k$.
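In integer arithmetic with floor division, the season map reduces to a single modulus; the following check (our own) confirms that the two-case definition agrees with `k % nu` in Python.

```python
def season(k, nu):
    """The season <k> of Corollary 2.2.1; for integers this is just k mod nu."""
    return k % nu   # Python's % already returns a value in {0, ..., nu-1}

assert season(25, 12) == 1 and season(-1, 12) == 11 and season(-13, 12) == 11
```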
Proof. Let $H_{i+n-1} = \overline{\mathrm{sp}}\{X_j,\ -\infty < j \le i+n-1\}$. Then

$$\sigma^2_{i+m} = E(\varepsilon^2_{i+m}) = E\Big(X_{i+m} + \sum_{j=1}^{\infty} \pi_{i+m}(j) X_{i+m-j}\Big)^2 = E\big(X_{i+m} - P_{H_{i+m-1}} X_{i+m}\big)^2$$

where

$$-\sum_{j=1}^{\infty} \pi_{i+m}(j) X_{i+m-j} = P_{H_{i+m-1}}(X_{i+m} - \varepsilon_{i+m}) = P_{H_{i+m-1}} X_{i+m}$$

since $\varepsilon_{i+m} \perp H_{i+m-1}$. Thus we have

$$\begin{aligned} \sigma^2_{i+m} &= E\big(X_{i+m} - P_{H_{i+m-1}} X_{i+m}\big)^2 \\ &\le E\big(X_{i+m} - P_{H_{m,i}} X_{i+m}\big)^2 = v_{m,i} \\ &\le E\Big(X_{i+m} + \sum_{j=1}^{m} \pi_{i+m}(j) X_{i+m-j}\Big)^2 = E\Big(\varepsilon_{i+m} - \sum_{j>m} \pi_{i+m}(j) X_{i+m-j}\Big)^2 \\ &= E(\varepsilon_{i+m})^2 + E\Big(\sum_{j>m} \pi_{i+m}(j) X_{i+m-j}\Big)^2 \\ &= \sigma^2_{i+m} + E\Big(\sum_{j>m} \pi_{i+m}(j) X_{i+m-j}\sum_{k>m} \pi_{i+m}(k) X_{i+m-k}\Big) \\ &\le \sigma^2_{i+m} + \sum_{j,k>m} |\pi_{i+m}(j)|\,|\pi_{i+m}(k)|\,E|X_{i+m-j} X_{i+m-k}| \\ &\le \sigma^2_{i+m} + \sum_{j,k>m} |\pi_{i+m}(j)|\,|\pi_{i+m}(k)|\sqrt{\gamma_{i+m-j}(0)\,\gamma_{i+m-k}(0)} \\ &\le \sigma^2_{i+m} + \Big(\sum_{j>m} |\pi_{i+m}(j)|\Big)^2 M, \end{aligned}$$

where $M = \max\{\gamma_i(0) : i = 0, 1, \ldots, \nu - 1\}$. Since $\langle i - m\rangle + m = i + k\nu$ for all $m$ and some $k$, we write

$$\sigma^2_{\langle i-m\rangle+m} \le v_{m,\langle i-m\rangle} \le \sigma^2_{\langle i-m\rangle+m} + M\Big(\sum_{j>m} |\pi_i(j)|\Big)^2$$

yielding

$$\sigma^2_i \le v_{m,\langle i-m\rangle} \le \sigma^2_i + M\Big(\sum_{j>m} |\pi_i(j)|\Big)^2$$

where $v_{m,\langle i-m\rangle} = E(X_{n\nu+i} - P_{\mathcal M} X_{n\nu+i})^2$ and $\mathcal M = \overline{\mathrm{sp}}\{X_{n\nu+i-1}, \ldots, X_{n\nu+i-m}\}$, $n$ arbitrary. Hence, as $m \to \infty$, $v_{m,\langle i-m\rangle} \to \sigma^2_i$.
Corollary 2.2.2. $\lim_{m\to\infty} \|X_{i+m} - \hat X^{(i)}_{i+m} - \varepsilon_{i+m}\| = 0$.

Proof.

$$\begin{aligned} E\big(X_{i+m} - \hat X^{(i)}_{i+m} - \varepsilon_{i+m}\big)^2 &= E\big(X_{i+m} - \hat X^{(i)}_{i+m}\big)^2 - 2E\big[\varepsilon_{i+m}\big(X_{i+m} - \hat X^{(i)}_{i+m}\big)\big] + E\big(\varepsilon^2_{i+m}\big) \\ &= v_{m,i} - 2\sigma^2_{i+m} + \sigma^2_{i+m} = v_{m,i} - \sigma^2_{i+m}, \end{aligned}$$

where the last expression approaches 0 as $m \to \infty$ by Corollary 2.2.1.
Corollary 2.2.3. $\theta^{(\langle i-m\rangle)}_{m,k} \to \psi_i(k)$ as $m \to \infty$ for all $i = 0, 1, \ldots, \nu - 1$ and all $k = 1, 2, \ldots$.

Proof. We know that

$$\theta^{(i)}_{m,k} = v^{-1}_{m-k,i}\,E\big[X_{i+m}\big(X_{i+m-k} - \hat X^{(i)}_{i+m-k}\big)\big]$$

and

$$\psi_{i+m}(k) = \sigma^{-2}_{i+m-k}\,E\big(X_{i+m}\,\varepsilon_{i+m-k}\big).$$

By the triangle inequality,

$$\begin{aligned} \big|\theta^{(i)}_{m,k} - \psi_{i+m}(k)\big| &\le \Big|\theta^{(i)}_{m,k} - \sigma^{-2}_{i+m-k}\,E\big[X_{i+m}\big(X_{i+m-k} - \hat X^{(i)}_{i+m-k}\big)\big]\Big| \\ &\quad + \sigma^{-2}_{i+m-k}\,\Big|E\big[X_{i+m}\big(X_{i+m-k} - \hat X^{(i)}_{i+m-k} - \varepsilon_{i+m-k}\big)\big]\Big| \\ &\le \big|\theta^{(i)}_{m,k} - \sigma^{-2}_{i+m-k}\,\theta^{(i)}_{m,k}\,v_{m-k,i}\big| + \sigma^{-2}_{i+m-k}\sqrt{\gamma_{i+m}(0)}\,\big\|X_{i+m-k} - \hat X^{(i)}_{i+m-k} - \varepsilon_{i+m-k}\big\|. \end{aligned}$$

As $m \to \infty$, the first term on the right-hand side approaches 0 by Corollary 2.2.1 and the fact that $\theta^{(i)}_{m,k}$ is bounded in $m$. Also, as $m \to \infty$, the second term on the right-hand side approaches 0 by Corollary 2.2.2 and the fact that $\sigma^{-2}_{i+m-k}\sqrt{\gamma_{i+m}(0)}$ is bounded in $m$. Thus $|\theta^{(i)}_{m,k} - \psi_{i+m}(k)| \to 0$ as $m \to \infty$, and consequently $|\theta^{(\langle i-m\rangle)}_{m,k} - \psi_i(k)| \to 0$ as $m \to \infty$, $k$ arbitrary but fixed.
Corollary 2.2.4. $\phi^{(\langle i-m\rangle)}_{m,k} \to -\pi_i(k)$ as $m \to \infty$ for all $i = 0, 1, \ldots, \nu - 1$ and $k = 1, 2, \ldots$.

Proof. Define $\phi^{(i)}_m = (\phi^{(i)}_{m,1}, \ldots, \phi^{(i)}_{m,m})'$ and $\pi^{(i)}_m = (\pi_{i+m}(1), \ldots, \pi_{i+m}(m))'$. We show that $(\phi^{(i)}_m + \pi^{(i)}_m) \to 0$ as $m \to \infty$. From Theorem A.1 in the Appendix we have

$$\begin{aligned} \sum_{j=1}^{m}\big(\phi^{(i)}_{m,j} + \pi_{i+m}(j)\big)^2 &\le \frac{1}{2\pi c}\big(\phi^{(i)}_m + \pi^{(i)}_m\big)'\,\Gamma_{m,i}\,\big(\phi^{(i)}_m + \pi^{(i)}_m\big) \\ &= \frac{1}{2\pi c}\,\mathrm{Var}\Big(\sum_{j=1}^{m}\big(\phi^{(i)}_{m,j} + \pi_{i+m}(j)\big) X_{i+m-j}\Big) \\ &= \frac{1}{2\pi c}\,\mathrm{Var}\Big(\varepsilon_{i+m} - \big(X_{i+m} - \hat X^{(i)}_{i+m}\big) - \sum_{j>m}\pi_{i+m}(j) X_{i+m-j}\Big) \end{aligned}$$

since

$$\varepsilon_{i+m} - \big(X_{i+m} - \hat X^{(i)}_{i+m}\big) = \sum_{j=1}^{m}\big(\phi^{(i)}_{m,j} + \pi_{i+m}(j)\big) X_{i+m-j} + \sum_{j=m+1}^{\infty}\pi_{i+m}(j) X_{i+m-j}.$$

Now,

$$\begin{aligned} \frac{1}{2\pi c}\,\mathrm{Var}\Big(\varepsilon_{i+m} - \big(X_{i+m} - \hat X^{(i)}_{i+m}\big) - \sum_{j>m}\pi_{i+m}(j) X_{i+m-j}\Big) &\le \frac{1}{2\pi c}\cdot 2\Big[\mathrm{Var}\big(\varepsilon_{i+m} - (X_{i+m} - \hat X^{(i)}_{i+m})\big) + \mathrm{Var}\Big(\sum_{j>m}\pi_{i+m}(j) X_{i+m-j}\Big)\Big] \\ &= \frac{1}{\pi c}\Big[v_{m,i} - \sigma^2_{i+m} + \mathrm{Var}\Big(\sum_{j>m}\pi_{i+m}(j) X_{i+m-j}\Big)\Big] \\ &\le \frac{1}{\pi c}\Big[\Big(\sum_{j>m}|\pi_{i+m}(j)|\Big)^2 M + \Big(\sum_{j>m}|\pi_{i+m}(j)|\Big)^2 M\Big], \end{aligned}$$

where the first inequality is a result of the fact that $\mathrm{Var}(X - Y) \le 2\mathrm{Var}(X) + 2\mathrm{Var}(Y)$, and the last inequality follows from the proof of Corollary 2.2.1, recalling that $M = \max\{\gamma_i(0) : i = 0, 1, \ldots, \nu - 1\}$. The right-hand side of the last inequality approaches 0 as $m$ approaches $\infty$ since $\sum_j |\pi_i(j)| < \infty$ for all $i = 0, 1, \ldots, \nu - 1$. We have shown, for fixed but arbitrary $k$, that $|\phi^{(i)}_{m,k} + \pi_{i+m}(k)| \to 0$ as $m \to \infty$. Using the notation of Corollary 2.2.1, our corollary is established.
3 Weak Consistency of Innovation Estimates

Given $N$ years of data $\tilde X_0, \tilde X_1, \ldots, \tilde X_{N\nu-1}$, where $\nu$ is the number of seasons per year, the estimated periodic autocovariance at season $i$ and lag $\ell$ is defined by

$$\hat\gamma_i(\ell) = N^{-1}\sum_{j=0}^{N-1}\big(\tilde X_{j\nu+i} - \hat\mu_i\big)\big(\tilde X_{j\nu+i+\ell} - \hat\mu_{i+\ell}\big)$$

where

$$\hat\mu_i = N^{-1}\sum_{j=0}^{N-1}\tilde X_{j\nu+i}$$

and any terms involving $\tilde X_t$ are set equal to zero whenever $t > N\nu - 1$. For what follows, it is simplest to work with the function

$$\hat\gamma_i(\ell) = N^{-1}\sum_{j=0}^{N-1} X_{j\nu+i}\,X_{j\nu+i+\ell} \qquad (11)$$

where $X_t = \tilde X_t - \mu_t$. Since the two versions of $\hat\gamma_i(\ell)$ have the same asymptotic properties, we use (11) as our estimate of $\gamma_i(\ell)$. If we replace the autocovariances in the innovations algorithm with their corresponding sample autocovariances, we obtain the estimator $\hat\theta^{(\langle i-k\rangle)}_{k,j}$ of $\theta^{(\langle i-k\rangle)}_{k,j}$. We prove in this section that the innovations estimates are weakly consistent in the sense that

$$\big(\hat\theta^{(\langle i-k\rangle)}_{k,1} - \psi_i(1),\ \ldots,\ \hat\theta^{(\langle i-k\rangle)}_{k,k} - \psi_i(k),\ 0, 0, \ldots\big) \xrightarrow{P} 0 \quad \text{in } \mathbb{R}^{\infty},$$

where $\xrightarrow{P}$ denotes convergence in probability.
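In code, the plug-in procedure is direct: form the sample autocovariances (11) and feed them to the recursions (10) in place of the true autocovariances. The sketch below is our own illustration and reuses the hypothetical `innovations` function from Section 2.2.

```python
import numpy as np

def gamma_hat(X, nu):
    """Sample autocovariances per (11): ghat[i, l] = N^{-1} sum_j X_{j nu + i} X_{j nu + i + l},
    where X is the mean-corrected series of length N*nu and terms running past the
    end of the sample are set to zero."""
    N = len(X) // nu
    L = len(X) - 1
    ghat = np.zeros((nu, L + 1))
    for i in range(nu):
        for l in range(L + 1):
            ghat[i, l] = sum(X[j * nu + i] * X[j * nu + i + l]
                             for j in range(N) if j * nu + i + l < N * nu) / N
    return ghat

# Innovations estimates at order k, exploiting periodicity in the first argument:
# ghat = gamma_hat(X, nu)
# theta_hat, v_hat = innovations(lambda t, h: ghat[t % nu, h], i, k)
```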
Results are presented for both the finite and infinite fourth moment cases. Theorems 3.1 through 3.4 below relate to the case where we assume the underlying noise sequence has finite fourth moment. Analogously, Theorems 3.5 through 3.7 relate to the infinite fourth moment case, where we assume the underlying noise sequence is RV($\alpha$) with $2 < \alpha < 4$ (see the first paragraph of Section 2). The latter set of theorems requires regular variation theory for proof and is therefore treated separately from the first set of theorems. We assume throughout this section that the associated PARMA process is causal and invertible. With this assumption it can be shown that the spectral density matrix of the corresponding vector process (see Anderson and Meerschaert (1997), p. 778) is positive definite. We emphasize this fact in the statements of each of the theorems in this section, since it is essential in their proofs. Replacing the autocovariances given in (8) with their corresponding sample autocovariances yields the sample covariance matrix $\hat\Gamma_{n,i}$ for season $i = 0, \ldots, \nu - 1$. In Theorems 3.1 and 3.2 we make use of the matrix 2-norm given by

$$\|A\|_2 = \max_{\|x\|_2 = 1}\|Ax\|_2 \qquad (12)$$

where $\|x\|_2 = (x'x)^{1/2}$ (see Golub and Van Loan (1989), p. 56).
Theorem 3.1. Let $X_t$ be the mean zero PARMA process with period $\nu$ given by (1) with $E(\delta^4_t) < \infty$. Assume that the spectral density matrix $f(\lambda)$ of its equivalent vector ARMA process (see Anderson and Meerschaert (1997), p. 778) is such that $mz'z \le z'f(\lambda)z \le Mz'z$, $-\pi \le \lambda \le \pi$, for some $m$ and $M$ such that $0 < m \le M < \infty$ and for all $z$ in $\mathbb{R}^{\nu}$. If $k$ is chosen as a function of the sample size $N$ so that $k^2/N \to 0$ as $N \to \infty$ and $k \to \infty$, then $\|\hat\Gamma^{-1}_{k,i} - \Gamma^{-1}_{k,i}\|_2 \xrightarrow{P} 0$.
Proof. The proof of this theorem is patterned after that of Lemma 3 in Berk (1974). Let $p_{k,i} = \|\Gamma^{-1}_{k,i}\|_2$, $q_{k,i} = \|\hat\Gamma^{-1}_{k,i} - \Gamma^{-1}_{k,i}\|_2$, and $Q_{k,i} = \|\hat\Gamma_{k,i} - \Gamma_{k,i}\|_2$. Then

$$\begin{aligned} q_{k,i} &= \|\hat\Gamma^{-1}_{k,i} - \Gamma^{-1}_{k,i}\|_2 = \|\hat\Gamma^{-1}_{k,i}(\Gamma_{k,i} - \hat\Gamma_{k,i})\Gamma^{-1}_{k,i}\|_2 \\ &\le \|\hat\Gamma^{-1}_{k,i}\|_2\,\|\hat\Gamma_{k,i} - \Gamma_{k,i}\|_2\,\|\Gamma^{-1}_{k,i}\|_2 \\ &= \|\hat\Gamma^{-1}_{k,i} - \Gamma^{-1}_{k,i} + \Gamma^{-1}_{k,i}\|_2\,\|\hat\Gamma_{k,i} - \Gamma_{k,i}\|_2\,\|\Gamma^{-1}_{k,i}\|_2 \\ &\le \big(\|\hat\Gamma^{-1}_{k,i} - \Gamma^{-1}_{k,i}\|_2 + \|\Gamma^{-1}_{k,i}\|_2\big)\|\hat\Gamma_{k,i} - \Gamma_{k,i}\|_2\,\|\Gamma^{-1}_{k,i}\|_2 \\ &= (q_{k,i} + p_{k,i})\,Q_{k,i}\,p_{k,i}. \end{aligned} \qquad (13)$$

Now,

$$Q^2_{k,i} = \|\hat\Gamma_{k,i} - \Gamma_{k,i}\|^2_2 \le \sum_{\ell,m=0}^{k-1}\big(\hat\gamma_{i+k-1-\ell}(\ell - m) - \gamma_{i+k-1-\ell}(\ell - m)\big)^2.$$

Multiplying the above equation by $N$ and taking expectations yields

$$N E(Q^2_{k,i}) \le N\sum_{\ell,m=0}^{k-1}\mathrm{Var}\big(\hat\gamma_{i+k-1-\ell}(\ell - m)\big).$$

Anderson (1989) shows that $N\,\mathrm{Var}(\hat\gamma_{i+k-1-\ell}(\ell - m))$ is bounded above by

$$|\eta - 3|\Big(\sum_{m_1=0}^{\infty}\sum_{m_2=0}^{\infty}|\psi_{i+k-1-\ell}(m_1)|\,|\psi_{i+k-1-m}(m_2)|\Big)^2 < \infty$$

where $\eta = E(\delta^4_t)$. Define

$$C = \max\Big\{|\eta - 3|\Big(\sum_{m_1=0}^{\infty}\sum_{m_2=0}^{\infty}|\psi_i(m_1)|\,|\psi_j(m_2)|\Big)^2 : 0 \le i, j \le \nu - 1\Big\},$$

which is independent of $N$ and $k$, so that we can write

$$N E(Q^2_{k,i}) \le k^2 C,$$

which holds for all $i$. Thus $E(Q^2_{k,i}) \le k^2 C/N \to 0$ as $k \to \infty$, since $k^2/N \to 0$ as $N \to \infty$. It follows that $Q_{k,i} \xrightarrow{P} 0$, and since $p_{k,i}$ is bounded for all $i$ and $k$ (see Appendix, Theorem A.1), we also have $p_{k,i} Q_{k,i} \xrightarrow{P} 0$. From (13) we can write

$$q_{k,i} \le \frac{p^2_{k,i} Q_{k,i}}{1 - p_{k,i} Q_{k,i}}$$

if $1 - p_{k,i} Q_{k,i} > 0$, i.e., if $p_{k,i} Q_{k,i} < 1$. Now,

$$\begin{aligned} P(q_{k,i} > \epsilon) &= P(q_{k,i} > \epsilon \mid p_{k,i} Q_{k,i} < 1)\,P(p_{k,i} Q_{k,i} < 1) + P(q_{k,i} > \epsilon \mid p_{k,i} Q_{k,i} \ge 1)\,P(p_{k,i} Q_{k,i} \ge 1) \\ &\le P\Big(\frac{p^2_{k,i} Q_{k,i}}{1 - p_{k,i} Q_{k,i}} > \epsilon\Big) + P(q_{k,i} > \epsilon \mid p_{k,i} Q_{k,i} \ge 1)\,P(p_{k,i} Q_{k,i} \ge 1). \end{aligned}$$

Since $p^2_{k,i} Q_{k,i} \xrightarrow{P} 0$ and $(1 - p_{k,i} Q_{k,i}) \xrightarrow{P} 1$, then by Theorem 5.1, Corollary 2 of Billingsley (1968), $\frac{p^2_{k,i} Q_{k,i}}{1 - p_{k,i} Q_{k,i}} \xrightarrow{P} 0$. Also, we know that $\lim_{k\to\infty} P(p_{k,i} Q_{k,i} \ge 1) = 0$, so

$$\lim_{k\to\infty} P(q_{k,i} > \epsilon) \le \lim_{k\to\infty} P\Big(\frac{p^2_{k,i} Q_{k,i}}{1 - p_{k,i} Q_{k,i}} > \epsilon\Big) + 1\cdot\lim_{k\to\infty} P(p_{k,i} Q_{k,i} \ge 1) = 0 + 1\cdot 0 = 0,$$

and it follows that $q_{k,i} \xrightarrow{P} 0$. This proves the theorem.
Substituting sample autocovariances for autocovariances in (7) yields the Yule-Walker estimators

$$\hat\phi^{(i)}_k = \hat\Gamma^{-1}_{k,i}\,\hat\gamma^{(i)}_k \qquad (14)$$

assuming $\hat\Gamma^{-1}_{k,i}$ exists. The next theorem shows that $\hat\phi^{(i)}_k$ is consistent for $\phi^{(i)}_k$.
Theorem 3.2. If the hypotheses of Theorem 3.1 hold, then $(\hat\phi^{(i)}_k - \phi^{(i)}_k) \xrightarrow{P} 0$.

Proof. Write

$$\begin{aligned} \hat\phi^{(i)}_k - \phi^{(i)}_k &= \hat\Gamma^{-1}_{k,i}\hat\gamma^{(i)}_k - \Gamma^{-1}_{k,i}\gamma^{(i)}_k \\ &= \hat\Gamma^{-1}_{k,i}\hat\gamma^{(i)}_k - \hat\Gamma^{-1}_{k,i}\gamma^{(i)}_k + \hat\Gamma^{-1}_{k,i}\gamma^{(i)}_k - \Gamma^{-1}_{k,i}\gamma^{(i)}_k \\ &= \hat\Gamma^{-1}_{k,i}\big(\hat\gamma^{(i)}_k - \gamma^{(i)}_k\big) + \big(\hat\Gamma^{-1}_{k,i} - \Gamma^{-1}_{k,i}\big)\gamma^{(i)}_k. \end{aligned}$$

Then,

$$\begin{aligned} \|\hat\phi^{(i)}_k - \phi^{(i)}_k\|_2 &\le \|\hat\Gamma^{-1}_{k,i}\|_2\,\|\hat\gamma^{(i)}_k - \gamma^{(i)}_k\|_2 + \|\hat\Gamma^{-1}_{k,i} - \Gamma^{-1}_{k,i}\|_2\,\|\gamma^{(i)}_k\|_2 \\ &= \|\hat\Gamma^{-1}_{k,i} - \Gamma^{-1}_{k,i} + \Gamma^{-1}_{k,i}\|_2\,\|\hat\gamma^{(i)}_k - \gamma^{(i)}_k\|_2 + q_{k,i}\,\|\gamma^{(i)}_k\|_2 \\ &\le \big(\|\hat\Gamma^{-1}_{k,i} - \Gamma^{-1}_{k,i}\|_2 + \|\Gamma^{-1}_{k,i}\|_2\big)\|\hat\gamma^{(i)}_k - \gamma^{(i)}_k\|_2 + q_{k,i}\,\|\gamma^{(i)}_k\|_2 \\ &= (q_{k,i} + p_{k,i})\,\|\hat\gamma^{(i)}_k - \gamma^{(i)}_k\|_2 + q_{k,i}\,\|\gamma^{(i)}_k\|_2. \end{aligned}$$

The last term on the right-hand side of the inequality goes to 0 in probability by Theorem 3.1 and the fact that

$$\|\gamma^{(i)}_k\|^2_2 = \sum_{j=0}^{k-1}\big(\gamma_{i+j}(k - j)\big)^2 \le \sum_{i=0}^{\nu-1}\sum_{j=0}^{\infty}\gamma^2_i(j) < \infty$$

by the absolute summability of $\gamma_i(k)$ for each $i = 0, 1, \ldots, \nu - 1$. The first term on the right-hand side of the inequality goes to 0 in probability if we can show that $\|\hat\gamma^{(i)}_k - \gamma^{(i)}_k\|_2 \xrightarrow{P} 0$, by Theorem 3.1 and the fact that $p_{k,i}$ is uniformly bounded. Write

$$\|\hat\gamma^{(i)}_k - \gamma^{(i)}_k\|^2_2 = \sum_{j=0}^{k-1}\big(\hat\gamma_{i+j}(k - j) - \gamma_{i+j}(k - j)\big)^2$$

which leads to

$$E\|\hat\gamma^{(i)}_k - \gamma^{(i)}_k\|^2_2 = \sum_{j=0}^{k-1} E\big(\hat\gamma_{i+j}(k - j) - \gamma_{i+j}(k - j)\big)^2 \le \sum_{j=0}^{k-1} C/N = kC/N,$$

where $kC/N \to 0$ by hypothesis and where $C$ is as in the proof of Theorem 3.1. It follows that $\|\hat\gamma^{(i)}_k - \gamma^{(i)}_k\|_2 \xrightarrow{P} 0$ and hence $(\hat\phi^{(i)}_k - \phi^{(i)}_k) \xrightarrow{P} 0$.
Theorem 3.3. Under the conditions of Theorem 3.1, we have that $\hat\phi^{(\langle i-k\rangle)}_{k,j} \xrightarrow{P} -\pi_i(j)$ for all $j$.

Proof. From Corollary 2.2.4 we know that $\phi^{(i)}_{k,j} + \pi_{i+k}(j) \to 0$ for all $j$ as $k \to \infty$. From Theorem 3.2 we have $\hat\phi^{(i)}_{k,j} - \phi^{(i)}_{k,j} \xrightarrow{P} 0$ for all $j$, so that

$$\big|\hat\phi^{(i)}_{k,j} + \pi_{i+k}(j)\big| = \big|\hat\phi^{(i)}_{k,j} - \phi^{(i)}_{k,j} + \phi^{(i)}_{k,j} + \pi_{i+k}(j)\big| \le \big|\hat\phi^{(i)}_{k,j} - \phi^{(i)}_{k,j}\big| + \big|\phi^{(i)}_{k,j} + \pi_{i+k}(j)\big| \xrightarrow{P} 0$$

as $k \to \infty$ for all fixed but arbitrary $j$, by the continuous mapping theorem. Another application of the continuous mapping theorem yields

$$\hat\phi^{(\langle i-k\rangle)}_{k,j} = \hat\phi^{(\langle i-k\rangle)}_{k,j} + \pi_i(j) - \pi_i(j) \xrightarrow{P} 0 - \pi_i(j) = -\pi_i(j),$$

using the notation of Corollary 2.2.1. This proves the theorem.
Theorem 3.4. Under the conditions in Theorem 3.1, we have that $\hat\theta^{(\langle i-k\rangle)}_{k,j} \xrightarrow{P} \psi_i(j)$ for all $j$.

Proof. From the representations of $\hat X_{i+k}$ given by (5) and (6), and the invertibility of $\Gamma_{k,i}$ for all $k$ and $i$, one can check that

$$\theta^{(i)}_{k,j} = \sum_{\ell=1}^{j}\phi^{(i)}_{k,\ell}\,\theta^{(i)}_{k-\ell,j-\ell}, \quad j = 1, \ldots, k,$$

if we define $\theta^{(i)}_{k-j,0} = 1$. Also, because of the way the estimates $\hat\phi^{(i)}_{k,j}$ and $\hat\theta^{(i)}_{k,j}$ are defined, we have

$$\hat\theta^{(i)}_{k,j} = \sum_{\ell=1}^{j}\hat\phi^{(i)}_{k,\ell}\,\hat\theta^{(i)}_{k-\ell,j-\ell}, \quad j = 1, \ldots, k,$$

if we define $\hat\theta^{(i)}_{k-j,0} = 1$. We propose that, for every $n$, $\hat\theta^{(\langle i-k\rangle)}_{k,\ell} \xrightarrow{P} \psi_i(\ell)$, $\ell = 1, \ldots, n$, as $k \to \infty$ and $N \to \infty$ according to the hypotheses of the theorem. We use strong induction on $n$. The proposition is true for $n = 1$ since

$$\hat\theta^{(\langle i-k\rangle)}_{k,1} = \hat\phi^{(\langle i-k\rangle)}_{k,1} \xrightarrow{P} -\pi_i(1) = \psi_i(1).$$

Now, assume the proposition is true for $n = j - 1$, i.e., $\hat\theta^{(\langle i-k\rangle)}_{k,\ell} \xrightarrow{P} \psi_i(\ell)$, $\ell = 1, \ldots, j - 1$. Note that $\hat\theta^{(\langle i-k\rangle)}_{k-\ell,j-\ell} \xrightarrow{P} \psi_{\langle i-\ell\rangle}(j - \ell)$ as $N \to \infty$ and $k \to \infty$ according to $k^2/N \to 0$, since $(k - \ell)^2/N \to 0$ also. Additionally, $\hat\phi^{(\langle i-k\rangle)}_{k,\ell} \xrightarrow{P} -\pi_i(\ell)$, so by the continuous mapping theorem,

$$\hat\theta^{(\langle i-k\rangle)}_{k,j} \xrightarrow{P} -\sum_{\ell=1}^{j}\pi_i(\ell)\,\psi_{\langle i-\ell\rangle}(j - \ell) = \psi_i(j),$$

hence the theorem follows.
Corollary 3.4. $\hat v_{k,\langle i-k\rangle} \xrightarrow{P} \sigma^2_i$, where

$$\hat v_{k,\langle i-k\rangle} = \hat\gamma_i(0) - \sum_{j=0}^{k-1}\big(\hat\theta^{(\langle i-k\rangle)}_{k,k-j}\big)^2\,\hat v_{j,\langle i-k\rangle}.$$

Proof. Using a strong induction argument similar to that in Theorem 3.4 yields the result.
In Theorems 3.5 and 3.6 the matrix 1-norm is used to obtain the required bounds on the appropriate statistics, since these theorems deal with the infinite fourth moment case. The matrix 1-norm is given by

$$\|A\|_1 = \max_{\|x\|_1 = 1}\|Ax\|_1$$

where $\|x\|_1 = |x_1| + \cdots + |x_k|$ (see Golub and Van Loan (1989), p. 57). We also need to define

$$a_N = \inf\{x : P(|\delta_t| > x) < 1/N\}$$

where

$$a^{-2}_N\sum_{t=0}^{N-1}\big(\delta^2_{t\nu+i} - 1\big) \Rightarrow S^{(i)},$$

$S^{(i)}$ is an $\alpha/2$-stable law, and $\Rightarrow$ denotes convergence in distribution.
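For orientation (this is our own illustration, not an assumption of the paper): if one takes a symmetric Pareto tail $P(|\delta_t| > x) = x^{-\alpha}$ for $x \ge 1$, the definition gives $a_N = N^{1/\alpha}$ exactly, which makes the rate condition in Theorem 3.5 below easy to read off.

```python
def a_N(N, alpha):
    """a_N under the illustrative Pareto tail P(|delta| > x) = x**(-alpha), x >= 1."""
    return N ** (1.0 / alpha)

# Rate condition of Theorem 3.5: k**2.5 * a_N(N, alpha)**2 / N -> 0.  With
# k ~ N**beta this requires 2.5*beta + 2/alpha - 1 < 0, i.e. beta < (1 - 2/alpha)*2/5.
```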
Theorem 3.5. Let $X_t$ be the mean zero PARMA process with period $\nu$ given by (1) with $2 < \alpha < 4$. Assume that the spectral density matrix $f(\lambda)$ of its equivalent vector ARMA process is such that $mz'z \le z'f(\lambda)z \le Mz'z$, $-\pi \le \lambda \le \pi$, for some $m$ and $M$ such that $0 < m \le M < \infty$ and for all $z$ in $\mathbb{R}^{\nu}$. If $k$ is chosen as a function of the sample size $N$ so that $k^{5/2} a^2_N/N \to 0$ as $N \to \infty$ and $k \to \infty$, then $\|\hat\Gamma^{-1}_{k,i} - \Gamma^{-1}_{k,i}\|_1 \xrightarrow{P} 0$.
Proof. Define $p_{k,i}$, $q_{k,i}$, and $Q_{k,i}$ as in Theorem 3.1, with the 1-norm replacing the 2-norm. Starting with the equations (13), we want to show that $Q_{k,i} \xrightarrow{P} 0$. Toward this end, it is shown in the Appendix, Theorem A.2, that there exists a constant $C$ such that

$$E\big|N a^{-2}_N\big(\hat\gamma_i(\ell) - \gamma_i(\ell)\big)\big| \le C$$

for all $i = 0, 1, \ldots, \nu - 1$, for all $\ell = 0, 1, 2, \ldots$, and for all $N = 1, 2, \ldots$. If we have a random $k \times k$ matrix $A$ with $E|a_{ij}| \le C$ for all $i$ and $j$, then

$$E\|A\|_1 = E\Big(\max_{1 \le j \le k}\sum_{i=1}^{k}|a_{ij}|\Big) \le E\sum_{i,j=1}^{k}|a_{ij}| \le k^2 C.$$

Thus

$$E(Q_{k,i}) = E\|\hat\Gamma_{k,i} - \Gamma_{k,i}\|_1 \le k^2 a^2_N C/N$$

for all $i$, $k$, and $N$. We therefore have that $Q_{k,i} \xrightarrow{P} 0$, and since

$$p_{k,i} = \|\Gamma^{-1}_{k,i}\|_1 \le k^{1/2}\,\|\Gamma^{-1}_{k,i}\|_2,$$

then $p_{k,i} Q_{k,i} \xrightarrow{P} 0$ if $k^{5/2} a^2_N C/N \to 0$ as $k \to \infty$ and $N \to \infty$. To show that $q_{k,i} \xrightarrow{P} 0$ we follow exactly the proof given in Theorem 3.1, and this concludes the proof of our theorem.
Theorem 3.6. Given the hypotheses set forth in Theorem 3.5, we have that $(\hat\phi^{(i)}_k - \phi^{(i)}_k) \xrightarrow{P} 0$.

Proof. From the proof of Theorem 3.2, with the 1-norm replacing the 2-norm, we start with the inequality

$$\|\hat\phi^{(i)}_k - \phi^{(i)}_k\|_1 \le (q_{k,i} + p_{k,i})\,\|\hat\gamma^{(i)}_k - \gamma^{(i)}_k\|_1 + q_{k,i}\,\|\gamma^{(i)}_k\|_1.$$

The last term on the right-hand side of the inequality goes to 0 in probability by Theorem 3.5 and the fact that

$$\|\gamma^{(i)}_k\|_1 = \sum_{j=0}^{k-1}|\gamma_{i+j}(k - j)| \le \sum_{i=0}^{\nu-1}\sum_{j=0}^{\infty}|\gamma_i(j)| < \infty$$

by the absolute summability of $\gamma_i(k)$ for each $i = 0, 1, \ldots, \nu - 1$. The first term on the right-hand side of the inequality goes to 0 in probability if we can show that $\|\hat\gamma^{(i)}_k - \gamma^{(i)}_k\|_1 \xrightarrow{P} 0$, since we know that $k^{-1/2} p_{k,i}$ is uniformly bounded. By Theorem A.2 in the Appendix,

$$E\|\hat\gamma^{(i)}_k - \gamma^{(i)}_k\|_1 = \sum_{j=0}^{k-1} E\big|\hat\gamma_{i+j}(k - j) - \gamma_{i+j}(k - j)\big| \le kC a^2_N/N,$$

where the last term approaches 0 by hypothesis. It follows that $\|\hat\gamma^{(i)}_k - \gamma^{(i)}_k\|_1 \xrightarrow{P} 0$ and hence $(\hat\phi^{(i)}_k - \phi^{(i)}_k) \xrightarrow{P} 0$.
Theorem 3.7. Let $X_t$ be the mean zero PARMA process with period $\nu$ given by (1) with $2 < \alpha < 4$. Assume that the spectral density matrix $f(\lambda)$ of its equivalent vector ARMA process is such that $mz'z \le z'f(\lambda)z \le Mz'z$, $-\pi \le \lambda \le \pi$, for some $m$ and $M$ such that $0 < m \le M < \infty$ and for all $z$ in $\mathbb{R}^{\nu}$. If $k$ is chosen as a function of the sample size $N$ so that $k^{5/2} a^2_N/N \to 0$ as $N \to \infty$ and $k \to \infty$, then $\hat\theta^{(\langle i-k\rangle)}_{k,j} \xrightarrow{P} \psi_i(j)$ for all $j$ and for every $i = 0, 1, \ldots, \nu - 1$.

Proof. The result follows by mimicking the proofs given in Theorems 3.3 and 3.4.
We state the next corollary without proof since it is completely analogous to Corollary 3.4.

Corollary 3.7. $\hat v_{k,\langle i-k\rangle} \xrightarrow{P} \sigma^2_i$, where

$$\hat v_{k,\langle i-k\rangle} = \hat\gamma_i(0) - \sum_{j=0}^{k-1}\big(\hat\theta^{(\langle i-k\rangle)}_{k,k-j}\big)^2\,\hat v_{j,\langle i-k\rangle}.$$

Remarks.

1. All of the results in this section hold true for second-order stationary ARMA models, since they are a special case of the periodic ARMA models with $\nu = 1$.

2. In Theorem 3.2 and Theorem 3.6, we not only have that $(\hat\phi^{(i)}_k - \phi^{(i)}_k) \xrightarrow{P} 0$ in $\mathbb{R}^{\infty}$ but also in $\ell^2$.
APPENDIX

Theorem A.1. Let $X_t$ be a mean zero periodically stationary time series with period $\nu \ge 1$. Also, let $Y_t = (X_{t\nu+\nu-1}, \ldots, X_{t\nu})'$ be the corresponding $\nu$-variate stationary vector process with spectral density matrix $f(\lambda)$. If there exist constants $c$ and $C$ such that $cz'z \le z'f(\lambda)z \le Cz'z$ for all $-\pi \le \lambda \le \pi$ and all $z \in \mathbb{R}^{\nu}$, where $0 < c \le C < \infty$, then $\|\Gamma_{k,i}\|_2 \le 2\pi C$ and $\|\Gamma^{-1}_{k,i}\|_2 \le 1/(2\pi c)$ for all $k$ and $i$. Note that $\|A\|_2$ is the matrix 2-norm defined by (12).

Proof. Let $\Gamma(h) = \mathrm{Cov}(Y_t, Y_{t+h})$, $Y = (Y'_{n-1}, \ldots, Y'_0)'$, and $\Gamma = \mathrm{Cov}(Y, Y) = [\Gamma(i - j)]^{n-1}_{i,j=0}$, where $Y_t$ is as stated in the theorem. In the notation of (8) we see that $\Gamma = \Gamma_{n\nu,0} = \mathrm{Cov}\big((X_{n\nu-1}, \ldots, X_0)'\big)$. For fixed $i$ and $k$ let $n = [\frac{k+i}{\nu}] + 1$. Then $\Gamma_{k,i} = \mathrm{Cov}\big((X_{i+k-1}, \ldots, X_i)'\big)$ is a submatrix of $\Gamma = \Gamma_{n\nu,0}$. It is clear that $\|\Gamma^{-1}\|_2 \ge \|\Gamma^{-1}_{k,i}\|_2$ and $\|\Gamma_{k,i}\|_2 \le \|\Gamma\|_2$, since $\Gamma_{k,i}$ is the restriction of $\Gamma$ onto a lower dimensional subspace. The spectral density matrix of $Y_t$ is $f(\lambda) = \frac{1}{2\pi}\sum_{h=-\infty}^{\infty} e^{-ih\lambda}\,\Gamma(h)$, so that $\Gamma(h) = \int_{-\pi}^{\pi} e^{ih\lambda} f(\lambda)\,d\lambda$. Define the fixed but arbitrary
vector $y \in \mathbb{R}^{n\nu}$ such that $y = (y'_0, y'_1, \ldots, y'_{n-1})'$ where $y_j = (y_{j\nu}, y_{j\nu+1}, \ldots, y_{j\nu+\nu-1})'$. Then

$$\begin{aligned} y'\Gamma y &= \sum_{j=0}^{n-1}\sum_{k=0}^{n-1} y'_j\,\Gamma(j - k)\,y_k = \sum_{j=0}^{n-1}\sum_{k=0}^{n-1} y'_j\Big(\int_{-\pi}^{\pi} e^{i(j-k)\lambda} f(\lambda)\,d\lambda\Big) y_k \\ &= \int_{-\pi}^{\pi}\Big(\sum_{j=0}^{n-1} e^{ij\lambda} y_j\Big)^{*} f(\lambda)\Big(\sum_{k=0}^{n-1} e^{ik\lambda} y_k\Big)\,d\lambda \\ &\le C\int_{-\pi}^{\pi}\Big(\sum_{j=0}^{n-1} e^{ij\lambda} y_j\Big)^{*}\Big(\sum_{k=0}^{n-1} e^{ik\lambda} y_k\Big)\,d\lambda \\ &= C\sum_{j=0}^{n-1}\sum_{k=0}^{n-1} y'_j y_k\int_{-\pi}^{\pi} e^{i(k-j)\lambda}\,d\lambda = 2\pi C\sum_{j=0}^{n-1} y'_j y_j = 2\pi C\,y'y. \end{aligned}$$
Similarly, $y'\Gamma y \ge 2\pi c\,y'y$. If $\Gamma y = \lambda y$ then $y'\Gamma y = \lambda\,y'y$, so $2\pi c\,y'y \le \lambda\,y'y \le 2\pi C\,y'y$, which shows that every eigenvalue of $\Gamma$ lies between $2\pi c$ and $2\pi C$, for all $n$. If we write $\lambda_1 \le \cdots \le \lambda_{n\nu}$ for the eigenvalues of $\Gamma$, then since $\lambda^{-1}_1 = \|\Gamma^{-1}\|_2$ and $\lambda_{n\nu} = \|\Gamma\|_2$ we have

$$\|\Gamma_{k,i}\|_2 \le \|\Gamma\|_2 = \lambda_{n\nu} \le 2\pi C$$

and

$$\|\Gamma^{-1}_{k,i}\|_2 \le \|\Gamma^{-1}\|_2 = \lambda^{-1}_1 \le \frac{1}{2\pi c}.$$
The next result given in the Appendix affirms that $E|N a^{-2}_N(\hat\gamma_i(\ell) - \gamma_i(\ell))|$ is uniformly bounded for all $i = 0, 1, \ldots, \nu - 1$, for all $\ell = 0, 1, 2, \ldots$, and for all $N = 1, 2, \ldots$. We assume that (1) and (2) hold and that the i.i.d. sequence $\delta_t$ is RV($\alpha$) with $2 < \alpha < 4$. Then the squared noise terms $Z_t = \delta^2_t$ belong to the domain of attraction of an $\alpha/2$-stable law. We also have that

$$\gamma_i(\ell) = E(X_{n\nu+i} X_{n\nu+i+\ell}) = \sum_{j=-\infty}^{\infty}\psi_i(j)\,\psi_{i+\ell}(j + \ell),$$

assuming $E(\delta_t) = 0$ and $E(\delta^2_t) = 1$. In preparation for the following two lemmas we define the quantities

$$V_{\beta}(y) = E|Z_1|^{\beta} I(|Z_1| > y), \qquad U_{\beta}(y) = E|Z_1|^{\beta} I(|Z_1| \le y),$$

and recall that $a_N = \inf\{x : P(|\delta_t| > x) < 1/N\}$.
Lemma A.1. Let the i.i.d. sequence $Z_t$ be in the domain of attraction of an $\alpha$-stable law, where $1 < \alpha < 2$ and $E(Z_t) = 0$. For all $\epsilon > 0$, there exists some constant $K$ such that

$$P\Big(\Big|\sum_{i=1}^{N} Z_i\Big| > d_N t\Big) \le K t^{-\alpha+\epsilon}$$

for all $t > 0$ and $N \ge 1$, where $d_N = a^2_N$ and $N V_0(d_N) \to 1$.
Proof. For fixed but arbitrary $t > 0$ define

$$T_N = \sum_{i=1}^{N} Z_i, \qquad \tilde T_{NN} = \sum_{i=1}^{N} Z_i I(|Z_i| \le d_N t), \qquad E_N = \bigcup_{i=1}^{N}\{|Z_i| > d_N t\}, \qquad G_N = \{|\tilde T_{NN}| > d_N t\}.$$

Then $P(|T_N| > d_N t) \le P(E_N) + P(G_N)$. Also,

$$P(E_N) \le N P(|Z_1| > d_N t) = N V_0(d_N t) \le C_1 t^{-\alpha+\epsilon}$$

for all $t$ greater than or equal to some $t_0$, where the last inequality follows from Potter's Theorem (see Bingham, Goldie, and Teugels (1987), p. 25). Now, by Chebychev's inequality, $P(G_N) \le E(\tilde T^2_{NN})/(d^2_N t^2)$, where

$$E(\tilde T^2_{NN}) = N E\big[Z^2_1 I(|Z_1| \le d_N t)\big] + N(N - 1)\,E\big[Z_1 I(|Z_1| \le d_N t)\big]\,E\big[Z_2 I(|Z_2| \le d_N t)\big] = I_N + J_N.$$

Note that

$$\frac{I_N}{d^2_N t^2} = \frac{N U_2(d_N t)}{d^2_N t^2} = N V_0(d_N t)\,\frac{U_2(d_N t)}{(d_N t)^2 V_0(d_N t)} \le C_2 t^{-\alpha+\epsilon}$$

for all $t \ge t_0$ by Karamata's Theorem (see Feller (1971), p. 283). Also, for all $t \ge t_0$, since $E(Z_1) = 0$ gives $E[Z_1 I(|Z_1| \le d_N t)] = -E[Z_1 I(|Z_1| > d_N t)]$,

$$\frac{J_N}{d^2_N t^2} \le \frac{N^2}{d^2_N t^2}\big(E\big[|Z_1| I(|Z_1| > d_N t)\big]\big)^2 = \Big(\frac{N V_1(d_N t)}{d_N t}\Big)^2 = \Big(N V_0(d_N t)\,\frac{V_1(d_N t)}{(d_N t)\,V_0(d_N t)}\Big)^2 \le C_3 t^{-\alpha+\epsilon}$$

by Karamata's Theorem. Hence $P(|T_N| > d_N t) \le K t^{-\alpha+\epsilon}$ for all $t \ge t_0$, with $K = C_1 + C_2 + C_3$. Now enlarge $K$ if necessary so that $K t_0^{-\alpha+\epsilon} > 1$. Then

$$P\Big(\Big|\sum_{i=1}^{N} Z_i\Big| > d_N t\Big) \le K t^{-\alpha+\epsilon}$$

holds for all $t > 0$ because $P(|\sum_{i=1}^{N} Z_i| > d_N t) \le 1$.
Lemma A.2. Under the conditions of Lemma A.1,

$$E\Big|d^{-1}_N\sum_{i=1}^{N} Z_i\Big| \to E|Y|$$

where $d^{-1}_N\sum_{i=1}^{N} Z_i \Rightarrow Y$.

Proof. By Billingsley (1995), p. 338, it suffices to show that $E|d^{-1}_N T_N|^{1+\epsilon} < \infty$ for all $N$, where $T_N = \sum_{i=1}^{N} Z_i$. By Lemma A.1,

$$\begin{aligned} E|d^{-1}_N T_N|^{1+\epsilon} &= \int_0^{\infty} P\big(|d^{-1}_N T_N|^{1+\epsilon} > t\big)\,dt \\ &= \int_0^{1} P\big(|d^{-1}_N T_N|^{1+\epsilon} > t\big)\,dt + \int_1^{\infty} P\big(|d^{-1}_N T_N|^{1+\epsilon} > t\big)\,dt \\ &\le 1 + \int_1^{\infty} K\big(t^{\frac{1}{1+\epsilon}}\big)^{-\alpha+\epsilon}\,dt, \end{aligned}$$

where the last term is finite (taking $\epsilon$ small enough that $(\alpha - \epsilon)/(1 + \epsilon) > 1$, which is possible since $\alpha > 1$).
Theorem A.2. There exists a constant $C > 0$ such that

$$E\big|N a^{-2}_N\big(\hat\gamma_i(\ell) - \gamma_i(\ell)\big)\big| \le C$$

for all $i = 0, 1, \ldots, \nu - 1$, for all $\ell = 0, 1, 2, \ldots$, and for all $N = 1, 2, \ldots$.

Proof. By the proof of Lemma 2.1 of Anderson and Meerschaert (1997) we have

$$N a^{-2}_N\Big(\hat\gamma_i(\ell) - N^{-1}\sum_{t=0}^{N-1}\sum_{j=-\infty}^{\infty}\psi_i(j)\,\psi_{i+\ell}(j + \ell)\,\delta^2_{t\nu+i-j}\Big) = A_1 + A_2 + A_3 + A_4$$

where

$$\mathrm{Var}(A_1) \le N a^{-4}_N\,\sigma^4_N\,K, \qquad E|A_2| \le N a^{-2}_N\,|\mu_N|\,K, \qquad E|A_3| \le N a^{-2}_N\,V_N\,K, \qquad |A_4| \le N a^{-2}_N\,\mu^2_N\,K,$$

with

$$\mu_N = E\big[\delta_1 I(|\delta_1| \le a_N)\big], \qquad \sigma^2_N = E\big[\delta^2_1 I(|\delta_1| \le a_N)\big], \qquad V_N = E\big[|\delta_1| I(|\delta_1| > a_N)\big],$$

$$K = \sum_{j=-\infty}^{\infty}|\psi_i(j)|\,\sum_{j=-\infty}^{\infty}|\psi_{i+\ell}(j)|,$$

and $\psi_i(j) = 0$ for $j < 0$. For $\alpha > 2$ we have $\sigma^2_N \to 1$, $|\mu_N| \le V_N$, and $V_N \sim \frac{\alpha}{\alpha-1}\,a_N N^{-1}$. Then $N a^{-1}_N V_N \to \frac{\alpha}{\alpha-1}$ implies $N a^{-1}_N V_N \le K$ for all $N$ (enlarging $K$ if necessary), so $N a^{-1}_N |\mu_N| \le K$ for all $N$. Therefore

$$E|A_2| \le a^{-1}_N K^2, \qquad E|A_3| \le a^{-1}_N K^2, \qquad |A_4| \le \big(N a^{-1}_N|\mu_N|\big)^2 N^{-1} K \le N^{-1} K^3$$

for all $N$, and finally, since

$$\big(E|A_1|\big)^2 \le E\big(|A_1|^2\big) = E(A^2_1) = \mathrm{Var}(A_1),$$

we have

$$E|A_1| \le \sqrt{\mathrm{Var}(A_1)} \le N^{1/2} a^{-2}_N\,\sigma^2_N\,K^{1/2}$$

for all $N$. Thus, for all $N$, $i$, and $\ell$, we have

$$E\Big|N a^{-2}_N\Big(\hat\gamma_i(\ell) - N^{-1}\sum_{t=0}^{N-1}\sum_{j=-\infty}^{\infty}\psi_i(j)\psi_{i+\ell}(j+\ell)\,\delta^2_{t\nu+i-j}\Big)\Big| \le E|A_1| + E|A_2| + E|A_3| + |A_4| \le N^{1/2} a^{-2}_N\,\sigma^2_N K^{1/2} + a^{-1}_N K^2 + a^{-1}_N K^2 + N^{-1} K^3 \le K_0\,N^{1/2} a^{-2}_N$$

for all $N$, $i$, and $\ell$. Next write

$$N^{-1}\sum_{t=0}^{N-1}\sum_{j=-\infty}^{\infty}\psi_i(j)\psi_{i+\ell}(j+\ell)\,\delta^2_{t\nu+i-j} - \gamma_i(\ell) = N^{-1}\sum_{t=0}^{N-1}\sum_{j=-\infty}^{\infty}\psi_i(j)\psi_{i+\ell}(j+\ell)\big(\delta^2_{t\nu+i-j} - 1\big)$$

and apply Lemma A.2 with $d_N = a^2_N$ and $Z_t = \delta^2_t - 1$ to see that

$$E\Big|a^{-2}_N\sum_{t=0}^{N-1}\big(\delta^2_{t\nu+i-j} - 1\big)\Big| \to E|S_{\langle i-j\rangle}|$$

where, as in Anderson and Meerschaert (1997), we have the corresponding weak convergence result

$$a^{-2}_N\sum_{t=0}^{N-1}\big(\delta^2_{t\nu+r} - 1\big) \Rightarrow S_r$$

for all $r = 0, 1, \ldots, \nu - 1$, where $S_0, \ldots, S_{\nu-1}$ are i.i.d. $\alpha/2$-stable laws. Then we have $E|a^{-2}_N\sum_{t=0}^{N-1}(\delta^2_{t\nu+r} - 1)| < C^{(r)}$ for $r = 0, \ldots, \nu - 1$, since this sequence is convergent, hence bounded. Let $B_0 = \max_r C^{(r)}$ and write

$$\begin{aligned} E\Big|N^{-1}\sum_{t=0}^{N-1}\sum_{j=-\infty}^{\infty}\psi_i(j)\psi_{i+\ell}(j+\ell)\,\delta^2_{t\nu+i-j} - \gamma_i(\ell)\Big| &= E\Big|\sum_{j=-\infty}^{\infty}\psi_i(j)\psi_{i+\ell}(j+\ell)\Big(N^{-1}\sum_{t=0}^{N-1}\big(\delta^2_{t\nu+i-j} - 1\big)\Big)\Big| \\ &\le \sum_{j=-\infty}^{\infty}\big|\psi_i(j)\,\psi_{i+\ell}(j+\ell)\big|\,E\Big|N^{-1}\sum_{t=0}^{N-1}\big(\delta^2_{t\nu+i-j} - 1\big)\Big| \\ &\le \Big(\sum_{j=-\infty}^{\infty}|\psi_i(j)|\Big)\Big(\sum_{j=-\infty}^{\infty}|\psi_{i+\ell}(j)|\Big) B_0\,a^2_N/N = B\,a^2_N/N. \end{aligned}$$

Finally, we have

$$\begin{aligned} E\big|\hat\gamma_i(\ell) - \gamma_i(\ell)\big| &\le E\Big|\hat\gamma_i(\ell) - N^{-1}\sum_{t=0}^{N-1}\sum_{j=-\infty}^{\infty}\psi_i(j)\psi_{i+\ell}(j+\ell)\,\delta^2_{t\nu+i-j}\Big| + E\Big|N^{-1}\sum_{t=0}^{N-1}\sum_{j=-\infty}^{\infty}\psi_i(j)\psi_{i+\ell}(j+\ell)\,\delta^2_{t\nu+i-j} - \gamma_i(\ell)\Big| \\ &\le K_0\,N^{1/2} a^{-2}_N\cdot N^{-1} a^2_N + B\,a^2_N N^{-1} = K_0\,N^{-1/2} + B\,a^2_N N^{-1}, \end{aligned}$$

where $a^2_N/N$ is regularly varying with index $\frac{2}{\alpha} - 1$. For $2 < \alpha < 4$, $N^{-1/2} = o(a^2_N/N)$. Hence $E|\hat\gamma_i(\ell) - \gamma_i(\ell)| \le C\,a^2_N/N$.
Acknowledgements
The authors thank the referees for many helpful suggestions which greatly improved the
manuscript.
REFERENCES

Adams, G. and C. Goodwin (1995) Parameter estimation for periodic ARMA models, J. Time Series Anal., 16, 127-145.

Anderson, P. (1989) Asymptotic Results and Identification for Cyclostationary Time Series, Doctoral Dissertation, Colorado School of Mines, Golden, Colorado.

Anderson, P. and M. Meerschaert (1998) Modeling river flows with heavy tails, Water Resources Res., 34, 2271-2280.

Anderson, P. and M. Meerschaert (1997) Periodic moving averages of random variables with regularly varying tails, Ann. Statist., 25, 771-785.

Anderson, P., Meerschaert, M. and A. Vecchia (1998) Asymptotics of the innovations algorithm for periodic time series, in preparation.

Anderson, P. and A. Vecchia (1993) Asymptotic results for periodic autoregressive moving average processes, J. Time Series Anal., 14, 1-18.

Berk, K. (1974) Consistent autoregressive spectral estimates, Ann. Statist., 2, 489-502.

Bhansali, R. (1978) Linear prediction by autoregressive model fitting in the time domain, Ann. Statist., 6, 224-231.

Billingsley, P. (1968) Convergence of Probability Measures, Wiley, New York.

Billingsley, P. (1995) Probability and Measure, 3rd Ed., Wiley, New York.

Bingham, N., C. Goldie, and J. Teugels (1987) Regular Variation, Encyclopedia of Mathematics and its Applications, 27, Cambridge University Press.

Brockwell, P. and R. Davis (1988) Simple consistent estimation of the coefficients of a linear filter, Stoch. Proc. Appl., 28, 47-59.

Brockwell, P. and R. Davis (1991) Time Series: Theory and Methods, 2nd Ed., Springer-Verlag, New York.

Davis, R. and S. Resnick (1986) Limit theory for the sample covariance and correlation functions of moving averages, Ann. Statist., 14, 533-558.

Feller, W. (1971) An Introduction to Probability Theory and Its Applications, Vol. II, 2nd Ed., Wiley, New York.

Golub, G. and C. Van Loan (1989) Matrix Computations, 2nd Ed., Johns Hopkins University Press.

Jansen, D. and C. de Vries (1991) On the frequency of large stock market returns: Putting booms and busts into perspective, Review of Econ. and Statist., 73, 18-24.

Jones, R. and W. Brelsford (1967) Time series with periodic structure, Biometrika, 54, 403-408.

Loretan, M. and P. Phillips (1994) Testing the covariance stationarity of heavy-tailed time series, J. Empirical Finance, 211-248.

Lund, R. and I. Basawa (1999) Recursive prediction and likelihood evaluation for periodic ARMA models, J. Time Series Anal., to appear.

Mikosch, T., T. Gadrich, C. Klüppelberg and R. Adler (1995) Parameter estimation for ARMA models with infinite variance innovations, Ann. Statist., 23, 305-326.

Pagano, M. (1978) On periodic and multiple autoregressions, Ann. Statist., 6, 1310-1317.

Salas, J., G. Tabios and P. Bartolini (1985) Approaches to multivariate modeling of water resources time series, Water Res. Bull., 21, 683-708.

Samorodnitsky, G. and M. Taqqu (1994) Stable non-Gaussian Random Processes: Stochastic Models with Infinite Variance, Chapman and Hall, London.

Tiao, G. and M. Grupe (1980) Hidden periodic autoregressive-moving average models in time series data, Biometrika, 67, 365-373.

Tjostheim, D. and J. Paulsen (1982) Empirical identification of multiple time series, J. Time Series Anal., 3, 265-282.

Troutman, B. (1979) Some results in periodic autoregression, Biometrika, 66, 219-228.

Ula, T. (1993) Forecasting of multivariate periodic autoregressive moving average processes, J. Time Series Anal., 14, 645.

Vecchia, A. and R. Ballerini (1991) Testing for periodic autocorrelations in seasonal time series data, Biometrika, 78, 53-63.