The VLDB Journal manuscript No.
(will be inserted by the editor)
Time Series Indexing By Dynamic Covering with
Cross-Range Constraints
Tao Sun · Hongbo Liu · Seán McLoone · Shaoxiong Ji · Xindong Wu
Received: date / Accepted: date
Abstract Time series indexing plays an important role in querying and pattern mining of big data. This paper proposes a novel structure for tightly covering a given set of time series under the dynamic time warping similarity measurement. The structure, referred to as Dynamic Covering with cross-Range Constraints (DCRC), enables more efficient and scalable indexing to be developed than current hypercube based partitioning approaches. In particular, a lower bound of the DTW distance from a given query time series to a DCRC-based cover set is introduced. By virtue of its tightness, which is proven theoretically, the lower bound can be used for pruning when querying on an indexing tree. If the DCRC based Lower Bound (LB_DCRC) of an upper node in an index tree is larger than a given threshold, all child nodes can be pruned, yielding a significant reduction in computational time. A Hierarchical DCRC (HDCRC) structure is proposed to generate the DCRC-tree based indexing and used to develop time series indexing and insertion algorithms. Experimental results for a selection of benchmark time series datasets are presented to illustrate the tightness of LB_DCRC, as well as the pruning efficiency on the DCRC-tree, especially when the time series have large deformations.

Keywords Time Series · Dynamic Time Warping · Indexing · R-Tree · Dynamic Covering · Cross-Range Constraints

T. Sun
School of Innovation and Entrepreneurship, Dalian University of Technology, Dalian 116023, China
E-mail: [email protected]

H. Liu (corresponding author)
Institute of Cognitive Information Technology, Dalian Maritime University, Dalian 116026, China
E-mail: [email protected]

S. McLoone
School of Electronics, Electrical Engineering and Computer Science, Queen's University Belfast, Northern Ireland, BT9 5AH, UK
E-mail: [email protected]

S. Ji
Department of Computer Science, Aalto University, Espoo 02150, Finland
E-mail: shaoxiong.ji@aalto.fi

X. Wu
School of Computing and Informatics, University of Louisiana at Lafayette, Lafayette, LA 70504-3694, USA
E-mail: [email protected]

1 Introduction

With the dramatic growth in the volume of data, and the opportunities for data driven decision making afforded by such data, particularly when it comes to social networks and e-commerce [18,40], it is vital to have algorithms that are able to efficiently mine big data [2,36]. In many practical applications, mining of data that is in the form of time series [5,10] is of interest, and this has led to the development of bespoke approaches for tasks such as pattern discovery and clustering [37,21,9], classification [7,20], rule discovery [30,34], and summarisation [13]. As with standard data mining, indexing is a fundamental technique for efficiently accessing and querying data when performing these tasks [6,4]. However, when indexing time series data the choice of similarity measurement is a key consideration [23], particularly when the series are not aligned temporally. In these circumstances, the classical Euclidean distance, as introduced in [1], can result in large differences between two time series even when they are quite similar in shape [14]. Consequently, dynamic time warping (DTW), which addresses this deficiency, has become a popular method of measuring the similarity between time series [25,22,35,24].
When indexing big time series datasets, performing a direct linear scan of all the time series is generally computationally intractable and a more considered approach is needed. This usually involves mapping the data to a tree-like structure with partitions, and then extracting a small number of time series from these partitions for linear scanning [26,39]. A partition is defined as a low-complexity structure covering a set of relatively similar time series. For a given query time series, a lower bound with respect to each partition can then be employed during indexing instead of directly measuring the similarity between the query time series and each element of the partitions. Using this approach, efficient pruning procedures can be implemented, substantially reducing the computational complexity of indexing and enabling fast data access and querying [14]. The speed-ups achievable using time series partitioning very much depend on how the partitions are defined, the approach used to generate tree-like indexing using these partitions, and the complexity of the lower bound calculation; hence improving on each of these remains an important area of research, and is the focus of this paper.

In the classical methods [14,39,15], when computing the lower bound of DTW from a query time series q to a set S of time series, the range [L_i, U_i] is computed for each dimension i. The set of dimensional ranges [L_i, U_i], i = 1, ..., m, defines a hyper rectangular area, denoted by C, which can serve as a partition in an indexing structure. In fact, the lower bound of DTW is exactly the Hausdorff distance from q to C. However, a partition represented by a hyper rectangle is often not optimal in terms of DTW distance. With the deformation of the time axis during DTW matching, the volume of partition C can be so large that C might still include quite dissimilar time series, even if the elements of S are similar, which results in inefficient indexing.

As an alternative to hyper rectangles, we propose the use of Dynamic Covering with Cross-Range Constraints (DCRC) to partition time series for indexing. For a given set S, let an approximately central element in terms of the DTW distance be the "reference" time series, denoted c. DCRC is defined as a series of sets V1, V2, ..., Vm. Each element of set Vi is a 3-tuple (l, u, p), where p is a dimensional subscript of reference c, and [l, u] denotes a dimensional range. A tuple (v1, v2, ...) over the Cartesian product V1 × V2 × ... corresponds to an m-dimensional hyper rectangle. We only consider tuples satisfying the "Alignment", "Continuity" and "Monotonicity" conditions, called the ACM-relationship. These tuples under the ACM-relationship correspond to multiple m-dimensional hyper rectangles.

In contrast to the classic method, DCRC proposes a "tight" structure composed of multiple hyper rectangles. For a given set S of similar time series in terms of DTW distance, any element of the corresponding DCRC must be similar to the elements of S. The tightness makes it possible to efficiently prune unnecessary samples when partitioning for DTW indexing.

We determine the lower bound of the DTW between a given query time series and the cover set of a given DCRC structure, denoted as LB_DCRC, and then introduce the hierarchical DCRC (HDCRC) structure. This is composed of multiple layers, with the upper DCRC structure covering all the elements covered by the DCRC structures of its sub-layers. Based on the DCRC and HDCRC structures, we further present a novel tree-like indexing and its insertion and node splitting algorithms. Given a time series set S and a query time series q, from the root down to its sub-layers in the indexing tree, if the LB_DCRC (DCRC based Lower Bound of DTW) of an upper layer is larger than a given acceptable range query tolerance, then all of its sub-layers are accordingly pruned, with the result that only a few remaining leaves on the indexing tree need to be sequentially scanned using the DTW distance. This leads to significant reductions in computational time.

In summary, the novel contributions of the paper are as follows:

(a) We develop the theory of DCRC-based covering of a given set of time series, and prove that a DCRC-based covering has significantly lower volume than other methods, that is, if all the elements are similar to the reference c, any element of the corresponding DCRC-based cover set is also similar to c.
(b) The corresponding lower bound of the DTW between a given query time series and a given time series set, namely LB_DCRC, is proposed. This bound outperforms other lower bounds in terms of tightness.
(c) Since the number of feasible ACM-relationships for a given DCRC usually grows exponentially, we propose a novel polynomial time algorithm to compute the lower bound of the DTW between a given query time series and the cover set of a given DCRC structure.
(d) We then present the hierarchical DCRC (HDCRC) structure, HDCRC-based tree indexing and its insertion and node splitting algorithms, and demonstrate with extensive numerical studies that the proposed DCRC based indexing method performs efficient pruning for range querying, and outperforms linear scanning and other indexing methods in terms of computational time.
The remainder of the paper is organized as follows. Related work is reviewed in section 2. The key DCRC concepts and algorithms are introduced in section 3. Then the HDCRC structure and the indexing approach based on the DCRC-tree are developed in section 4. The relevant theorems on DCRC and HDCRC are presented in section 5. Using benchmark datasets from the UCR Time Series Classification Archive, experimental results are provided in section 6 to demonstrate the efficiency of our approaches. Finally, conclusions are provided in section 7.

2 Related Work

DTW is a more robust measure of the similarity between two time series than the Euclidean distance as it takes account of time axis shifting between time series. Generally, the warping path of DTW is defined by a number of global and/or local constraints. Two of the most popular global constraints are the Itakura parallelogram [12] and the Sakoe-Chiba band [28]. In contrast to the traditional form of DTW, this paper adopts the form DTW_p [16,32] to denote the Lp norm of the monotonic DTW distance (p = 2).

Despite its limitation with respect to scalability to high dimensional data sets, in recent years DTW has been widely applied, particularly for high-dimensional data indexing [33] and stream matching [19,11]. However, since DTW does not obey the triangle inequality, and therefore is not suitable for indexing with a metric access method, researchers have switched their attention to developing indexing approaches that work with suitably defined DTW lower bounds, rather than DTW itself. In recent years, much research has focused on the DTW lower bound.

The idea of using a lower bound function was first proposed by Yi et al. [38]. In their lower bound, denoted as LB_Yi, the maximum and minimum elements of a sequence are used to represent the sequence.

Keogh et al. proposed a lower bound function (denoted as LB_Keogh) [14], together with an exact indexing method based on their lower bound function. For two given time series x and y, let Y be a range series, each entry Y_i of which denotes the i-th envelope, i.e. the range between the minimum and the maximum of the warping window with center y_i. In fact, LB_Keogh corresponds to the Hausdorff distance from x to Y. Lemire proposed the LB_IMPROVED lower bound [16], which imports an additional time series x′ from x and Y, and the lower bound is represented by LB_Keogh(x, y) + LB_Keogh(y, x′).

Based on the common features of LB_Kim, LB_Yi and LB_Keogh, Zhou and Wong [39] proposed several boundary-based lower bound functions, including a non-elaborate version (denoted as LB_Corner) and an elaborate version (denoted as LB_ECorner). Li and Yang [17] proposed two extensions of LB_Kim and LB_Keogh (denoted respectively as LB_NKim and LB_NKeogh). In 2018 Shen et al. proposed a new lower bound (LB_NEW) [29]. In contrast to LB_Keogh, LB_NEW defines Y_i as all the elements of the warping window with center y_i, instead of the i-th envelope Y_i in LB_Keogh. Therefore, LB_NEW is usually tighter than LB_Keogh. Tan et al. [32] proposed the LB_ENHANCED lower bound. In this algorithm, Y_i is represented by left bands or right bands, assuring a relatively tight lower bound.

In the traditional time series indexing methods [14], the dataset S of sample time series is stored in an R-tree like structure, each tree node of which corresponds to a minimal boundary rectangle (MBR) containing a subset of S. Given a query time series q, retrieving the subset {s ∈ S | DTW(q, s) ≤ ε} involves two steps:

(1) Search the nodes based on the lower bound between q and the MBR in a top-down approach.
(2) All the feasible time series are linearly scanned using an efficient method [27].

3 Dynamic Covering with Cross-Range Constraints (DCRC)

3.1 DTW

Given a time series x represented by [x1, x2, ..., xn], let x(i) denote the i-th entry of x, x_i, and let x(i1 : i2) denote the subsequence [x_{i1}, x_{i1+1}, ..., x_{i2}]. Here, n is the length of the time series, also referred to as its "dimension".

DTW measures the similarity between two time series [31]. For two given time series x = [x1, x2, ..., xm] and y = [y1, y2, ..., yn], let W denote a warping path from x to y. Let (i_k, j_k) be the k-th element of W and K be the length of W (1 ≤ k ≤ K). The warping path in DTW is required to satisfy a set of constraints, referred to as alignment, continuity and monotonicity constraints. These are defined as follows:

(a) (i_1, j_1) = (1, 1) and (i_K, j_K) = (m, n);
(b) i_{k+1} − i_k ≤ 1 and j_{k+1} − j_k ≤ 1, k = 1, 2, ..., K − 1;
(c) i_{k+1} − i_k ≥ 0 and j_{k+1} − j_k ≥ 0, k = 1, 2, ..., K − 1.
The ratio of the width of the Sakoe-Chiba Band to the length of the time series, denoted by λ (0 < λ ≤ 1), imposes an additional constraint which is defined as follows:

(d) |(n/m)·i_k − j_k| ≤ λn, k = 1, 2, ..., K.

The DTW path distance is obtained subject to these constraints by solving the dynamic programming problem given in Equ. (1), where δ(i, j) = (x_i − y_j)^2, √µ(i, j) represents the DTW distance between x(1 : i) and y(1 : j), and DTW(x, y) = √µ(m, n).

µ(i, j) = min{ δ(i, j) + µ(i−1, j−1), δ(i, j) + µ(i−1, j), δ(i, j) + µ(i, j−1) }   (1)
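For concreteness, the recursion in Equ. (1), combined with the band constraint (d), maps directly onto a small dynamic program. The following C++ sketch is our own illustration (function name, container choices and the equal treatment of both series are our assumptions); the squared costs with a final square root follow the definitions above.

#include <algorithm>
#include <cmath>
#include <limits>
#include <vector>

// Constrained DTW following Equ. (1) with the band of constraint (d).
// x has length m, y has length n; lambda is the band ratio (0 < lambda <= 1).
double dtw(const std::vector<double>& x, const std::vector<double>& y, double lambda) {
    const int m = (int)x.size(), n = (int)y.size();
    const double INF = std::numeric_limits<double>::infinity();
    // mu[i][j] accumulates squared costs for x(1:i) and y(1:j); mu(0,0) = 0.
    std::vector<std::vector<double>> mu(m + 1, std::vector<double>(n + 1, INF));
    mu[0][0] = 0.0;
    for (int i = 1; i <= m; ++i) {
        for (int j = 1; j <= n; ++j) {
            // Band constraint (d): |(n/m)*i - j| <= lambda * n.
            if (std::fabs((double)n / m * i - j) > lambda * n) continue;
            double delta = (x[i - 1] - y[j - 1]) * (x[i - 1] - y[j - 1]);
            double best = std::min({mu[i - 1][j - 1], mu[i - 1][j], mu[i][j - 1]});
            if (best < INF) mu[i][j] = delta + best;
        }
    }
    return std::sqrt(mu[m][n]);  // DTW(x, y) = sqrt(mu(m, n))
}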
3.2 ACM-Relationship

Definition 1 (ACM-Relationship) Consider the Cartesian product P1 × P2 × ... × Pm, where Pi = {1, 2, ..., n} for i = 1, 2, ..., m. Let R(m, n) denote the relationship on the Cartesian product, each element r[r1, r2, ..., rm] of which satisfies the Alignment, Continuity and Monotonicity (ACM) conditions as follows.

(a) Alignment. r1 = 1, rm = n;
(b) Continuity. r_{i+1} − r_i ≤ 1 for i = 1, 2, ..., m − 1;
(c) Monotonicity. r_{i+1} − r_i ≥ 0 for i = 1, 2, ..., m − 1.

Given a time series x[x1, x2, ..., xn] of length n, and a relationship r[r1, r2, ..., rm] ∈ R(m, n), let

τ(x, r) = [x_{r1}, x_{r2}, ..., x_{rm}]   (2)

Given a time series x[x1, x2, ..., xm] of length m, and a time series y[y1, y2, ..., yn] of length n, let

R(x, y) = argmin_{r ∈ R(m,n)} ||x, τ(y, r)||
D(x, y) = min_{r ∈ R(m,n)} ||x, τ(y, r)||   (3)

In Equ. (3), r is an ACM-relationship, τ(y, r) is a time series of length m while |y| = n < m, and ||x, τ(y, r)|| is the Euclidean distance between the two m-length time series x and τ(y, r). D(x, y) is the minimum Euclidean distance with respect to relationship r, and R(x, y) is the corresponding value of r.

Fig. 1 shows examples of the ACM-relationship. In each sub-figure of Fig. 1, 5 columns correspond to the 5 sets P1, P2, ..., P5, and the black dots correspond to the elements of Pi. The black dots on the black path represent elements of the Cartesian product P1 × P2 × ... × P5. The two series represented by Figs. 1(a) and 1(b) satisfy the ACM-relationships. However, the two series in Figs. 1(c) and 1(d) do not satisfy the ACM-relationships.

Fig. 1 Examples of legal and illegal ACM-relationships: (a) a legal ACM-relationship; (b) a legal ACM-relationship; (c) an illegal ACM-relationship as it violates "Monotonicity"; (d) an illegal ACM-relationship as it violates "Continuity"

3.3 Approximate Subsequence

Let A(i1 : i2) denote the mean of the entries of x(i1 : i2) and let E(i1 : i2) denote the sum of squares of deviations from the mean of the entries of x(i1 : i2), as defined in Equ. (4).

A(i1 : i2) = (Σ_{j=i1}^{i2} x_j) / (i2 − i1 + 1),   E(i1 : i2) = Σ_{j=i1}^{i2} (x_j − A(i1 : i2))^2   (4)
Algorithm 1 Minimization for ACM-Relationship
Input: A given time series x[x1, x2, ..., xm] of length m, and a time series y[y1, y2, ..., yn] of length n (n < m).
Output: r[r1, r2, ..., rm] = R(x, y) and d = D(x, y).
1: Let µ_00 = 0, let µ_i0 = ∞ for i = 1, 2, ..., m, and let µ_0j = ∞ for j = 1, 2, ..., n;
2: for i = 1 to m, j = 1 to n do
3:   Let p = argmin_{q ∈ {j−1, j}} µ(i − 1, q);
4:   Let r_{i−1} = p;
5:   Let µ_{ij} = δ(i, j) + µ_{i−1,p};
6: end for
7: Let r_m = n;
8: return r = [r1, r2, ..., rm], and d = √µ_{mn};

Definition 2 (Approximate Subsequence) For a given m-length time series x and a given integer n (0 < n < m), the n-length Approximate Subsequence of x, denoted by AS(x, n), is defined as

AS(x, n) = argmin_{|y|=n} D(x, y)   (5)

From Definition 2, the approximate subsequence of x is the approximate time series of x. The optimal solution to Equ. (5), and hence AS(x, n), is obtained by solving the dynamic program:

ν(i, j) = min_k (ν(k − 1, j − 1) + E(k : i))   (6)

where k ∈ {j, j + 1, ..., i} and ν(i, j) = D^2(x(1 : i), AS(x(1 : i), j)). The procedure for computing AS(x, n) is given in Algorithm 2.

Algorithm 2 Approximate subsequence for a given time series
Input: m-length time series x.
Output: n-length approximate subsequence.
1: Initialise an n-length time series y;
2: Let i = m;
3: for j = n down to 1 do
4:   Let p = i;
5:   Let i = argmin_k (ν(k − 1, j − 1) + E(k : i));
6:   Let y(j) = A(i : p);
7:   Let i = i − 1;
8: end for
9: return y
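As an illustration of the minimisation behind Algorithm 1, the following C++ sketch computes D(x, y) and the corresponding ACM-relationship r = R(x, y) with the same kind of dynamic program; the naming, the explicit backtracking step and the assumption n ≤ m are our own, not part of the paper.

#include <algorithm>
#include <cmath>
#include <limits>
#include <utility>
#include <vector>

// Find r (r1 = 1, rm = n, 0 <= r_{i+1} - r_i <= 1) minimising ||x - tau(y, r)||,
// returning both D(x, y) and r. Assumes y is shorter than or equal to x.
std::pair<double, std::vector<int>> acm_min(const std::vector<double>& x,
                                            const std::vector<double>& y) {
    const int m = (int)x.size(), n = (int)y.size();
    const double INF = std::numeric_limits<double>::infinity();
    std::vector<std::vector<double>> mu(m + 1, std::vector<double>(n + 1, INF));
    mu[0][0] = 0.0;
    for (int i = 1; i <= m; ++i)
        for (int j = 1; j <= std::min(i, n); ++j) {
            double best = std::min(mu[i - 1][j], mu[i - 1][j - 1]);
            if (best < INF)
                mu[i][j] = best + (x[i - 1] - y[j - 1]) * (x[i - 1] - y[j - 1]);
        }
    // Backtrack r from (m, n): each step either keeps j or decrements it by one.
    std::vector<int> r(m);
    for (int i = m, j = n; i >= 1; --i) {
        r[i - 1] = j;
        if (i > 1 && mu[i - 1][j - 1] <= mu[i - 1][j]) --j;
    }
    return {std::sqrt(mu[m][n]), r};
}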
3.4 Covering Set

Consider a given set of m-length time series S = {s1, s2, ..., s_{|S|}}, where s_i consists of [s_{i1}, s_{i2}, ..., s_{im}]. In this section, we focus on defining a structure that can tightly cover set S using the DTW distance.

Given a positive integer n (n < m), we define a new structure V to store different dimensional ranges. Assume V = [V1, V2, ..., Vm], where each element v of Vi is represented by v(p, l, u). The component p ∈ {1, 2, ..., n} denotes a dimensional subscript, and [l, u] denotes an interval on the real line of the p-th dimension. We stipulate that for any given v1, v2 ∈ Vi, v1.p = v2.p if and only if v1 = v2.

Fig. 2 shows an example of the structure. As shown in Fig. 2(a), the structure is composed of 5 sets V1, V2, ..., V5, with each set containing a number of 3-tuples. Take V1 for example in Fig. 2(b). There are two rectangles representing the two 3-tuples. The upper edge and the lower edge of each rectangle denote the range [v.l, v.u], and the number in the rectangle denotes a subscript of the reference time series.

Fig. 2 Illustration of a DCRC structure: (a) A DCRC structure with 5 tuple sets; (b) A tuple v(p, l, u) in set V1

For the sake of convenience, we introduce the following notation.

V.n = max{v.p | v ∈ Vm}
V.P_i = {v.p | v ∈ V_i}
V.v_i^j = v(p, l, u) s.t. (v ∈ V_i ∧ v.p = j)   (7)
V.L_i^j = V.v_i^j.l
V.U_i^j = V.v_i^j.u

For a given r ∈ R(m, n), let Rect_r(V, r), as defined in Equ. (8), be an m-dimensional hyper rectangular range.

Rect_r(V, r) = { [x1, ..., xm] | x_i ∈ [L_i^{r_i}, U_i^{r_i}] }   (8)

Fig. 3 illustrates the set in a hyper dimensional rectangle defined in Equ. (8). The first row denotes a matching path of DTW, the second row illustrates a DCRC structure, and the third row illustrates a 5-dimensional hyper rectangle. The lower and upper edges of each rectangle denote the corresponding range of each dimension. Take the third column for example. The value of the first row is 2, and then in the second row the rectangle with label 2 is selected as the range corresponding to the third row.

Fig. 3 Illustration of hyper rectangle Rect_r for 5-dimensional DCRC V

Rect_r(V, r) corresponds to an m-dimensional cube for a given tuple r, which covers a set of time series. In fact, not all tuples are permitted; a "legal" tuple r must obey the so-called ACM-relationships.

volume(V) = Π_{j=1}^{n} ( max{ v.u − v.l | v.p = j ∧ v ∈ V_i ∈ V } )   (9)

The structure V stores different dimensional ranges from the given set of time series, from which we can dynamically obtain a "legal" and "tight" cover of the given set. The "Cover" function is defined by Equ. (10).

Cover(V) = { x | x ∈ Rect_r(V, r), r ∈ R(m, n) }   (10)

where Cover is a dynamic combination of Rect_r(V, r), with r subject to the ACM-relationship. Hence, we refer to the covering structure as the "Dynamic Covering with Cross-Range Constraints" (DCRC for short).
3.5 DCRC of Time Series

In this section, a feasible and optimal algorithm for computing the DCRC for a given general set of time series is proposed. The required steps are set out in detail in Algorithm 3.

For a given set S of similar time series, the number of possible values that can be assigned to a DCRC structure grows exponentially with the dimension. In our method, the creation of the DCRC depends on a so-called reference time series c, which is understood to be a lower dimensional contour of all the samples of S. The ACM-relationship r_t is just a many-to-one function from s_t to c. In fact, the greater the similarity between the reference and the samples, the tighter the DCRC structure. The relevant theory is established in Theorem 3.

At Line 1, c is the reference time series for set S. To simplify the computation, c is assigned to the n-length approximate subsequence of an s_k randomly selected from set S. At Line 4, r_t is an ACM-relationship computed by Algorithm 1. For each dimension i, the tuple set V_i of V is created or updated by the steps at Lines 5-13.

Table 1 illustrates a sample DCRC structure building procedure. r_t corresponds to the matching from s_t to c satisfying Equ. (3). X_i is the set of matchings (r_{ti}, s_{ti}). Y_i represents the merged set {(r, G_r)} of X_i such that r ∈ {r_{ti}} and G_r = {s | (r, s) ∈ X_i}, and V_i denotes the i-th entry of the DCRC structure.

Algorithm 3 DCRC Structure for a Given Set of Time Series
Input: A given reference time series c of n-length;
Input: A given set of m-length time series S = {s1, s2, ..., sT}, with each element, s_t, represented by s_t = [s_{t1}, s_{t2}, ..., s_{tm}], where t = 1, 2, ..., T.
Output: DCRC structure V.
1: If c = nil, let c = AS(s_k, n) (n < m) by Algorithm 2;
2: Initialise series V = [{}, {}, ..., {}] of m-length;
3: for t = 1 to T do
4:   Let r_t = R(s_t, c) by Algorithm 1;
5:   for i = 1 to m do
6:     if (r_{ti} ∈ V.P_i) then
7:       Let v = V.v_i^{r_{ti}};
8:       Let v.l = min(s_{ti}, v.l);
9:       Let v.u = max(s_{ti}, v.u);
10:    else
11:      Let V_i = V_i ∪ {(r_{ti}, s_{ti}, s_{ti})};
12:    end if
13:  end for
14: end for
15: return V
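To make the construction in Algorithm 3 concrete, the following C++ sketch stores each V_i as a map from the reference subscript p to the interval [l, u], updates the structure per sample, and evaluates the volume of Equ. (9). The container choices and helper names are our own assumptions, not code from the paper; the relationship r = R(s, c) is computed with the same dynamic program as Algorithm 1.

#include <algorithm>
#include <cmath>
#include <limits>
#include <map>
#include <utility>
#include <vector>

// A DCRC structure: V[i] maps a reference subscript p to the interval [l, u].
struct DCRC {
    int n = 0;                                                        // length of the reference c
    std::vector<std::map<int, std::pair<double, double>>> V;          // one map per dimension
};

// ACM-relationship r = R(s, c), as in Algorithm 1 (dynamic program + backtracking).
static std::vector<int> acm_relationship(const std::vector<double>& s,
                                         const std::vector<double>& c) {
    const int m = (int)s.size(), n = (int)c.size();
    const double INF = std::numeric_limits<double>::infinity();
    std::vector<std::vector<double>> mu(m + 1, std::vector<double>(n + 1, INF));
    mu[0][0] = 0.0;
    for (int i = 1; i <= m; ++i)
        for (int j = 1; j <= std::min(i, n); ++j) {
            double best = std::min(mu[i - 1][j], mu[i - 1][j - 1]);
            if (best < INF) mu[i][j] = best + (s[i - 1] - c[j - 1]) * (s[i - 1] - c[j - 1]);
        }
    std::vector<int> r(m);
    for (int i = m, j = n; i >= 1; --i) {
        r[i - 1] = j;
        if (i > 1 && mu[i - 1][j - 1] <= mu[i - 1][j]) --j;
    }
    return r;
}

// Insert one sample s (Algorithm 3, Lines 4-13): widen the interval keyed by r_i,
// or create a degenerate interval [s_i, s_i] when the subscript is new.
void dcrc_insert(DCRC& D, const std::vector<double>& c, const std::vector<double>& s) {
    if (D.V.empty()) { D.V.resize(s.size()); D.n = (int)c.size(); }
    std::vector<int> r = acm_relationship(s, c);
    for (size_t i = 0; i < s.size(); ++i) {
        auto it = D.V[i].find(r[i]);
        if (it == D.V[i].end()) D.V[i][r[i]] = {s[i], s[i]};
        else {
            it->second.first  = std::min(it->second.first,  s[i]);
            it->second.second = std::max(it->second.second, s[i]);
        }
    }
}

// volume(V) of Equ. (9): product over reference subscripts of the widest interval.
double dcrc_volume(const DCRC& D) {
    std::vector<double> widest(D.n + 1, 0.0);
    for (const auto& Vi : D.V)
        for (const auto& kv : Vi)
            widest[kv.first] = std::max(widest[kv.first], kv.second.second - kv.second.first);
    double vol = 1.0;
    for (int j = 1; j <= D.n; ++j) vol *= widest[j];
    return vol;
}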
Table 1 Example of computing a DCRC structure from three 5-length time series and a 3-length reference time series

Dimension i | 1 | 2 | 3 | 4 | 5
c | 4.00 | 6.00 | 5.00 | |
s1 | 4.11 | 4.12 | 4.13 | 6.14 | 5.15
s2 | 4.21 | 6.22 | 6.23 | 5.24 | 5.25
s3 | 4.31 | 6.32 | 6.33 | 6.34 | 5.35
r1 | 1 (s11→c1) | 1 (s12→c1) | 1 (s13→c1) | 2 (s14→c2) | 3 (s15→c3)
r2 | 1 (s21→c1) | 2 (s22→c2) | 2 (s23→c2) | 3 (s24→c3) | 3 (s25→c3)
r3 | 1 (s31→c1) | 2 (s32→c2) | 2 (s33→c2) | 2 (s34→c2) | 3 (s35→c3)
[Xi] | {(1,s11),(1,s21),(1,s31)} | {(1,s12),(2,s22),(2,s32)} | {(1,s13),(2,s23),(2,s33)} | {(2,s14),(3,s24),(2,s34)} | {(3,s15),(3,s25),(3,s35)}
[Yi] | {Y1^1{s11,s21,s31}} | {Y2^1{s12}, Y2^2{s22,s32}} | {Y3^1{s13}, Y3^2{s23,s33}} | {Y4^2{s14,s34}, Y4^3{s24}} | {Y5^3{s15,s25,s35}}
[Vi] | {(1,minY1^1,maxY1^1)} | {(1,minY2^1,maxY2^1),(2,minY2^2,maxY2^2)} | {(1,minY3^1,maxY3^1),(2,minY3^2,maxY3^2)} | {(2,minY4^2,maxY4^2),(3,minY4^3,maxY4^3)} | {(3,minY5^3,maxY5^3)}
[Vi] | {(1,4.11,4.31)} | {(1,4.12,4.12),(2,6.22,6.32)} | {(1,4.13,4.13),(2,6.23,6.33)} | {(2,6.14,6.34),(3,5.24,5.24)} | {(3,5.15,5.35)}

4 Time Series Indexing with DCRC

4.1 DCRC based DTW Lower Bound (LB_DCRC)

Given a set S of m-length time series and a DCRC structure V determined by Equ. (10), a lower bound of DTW from a given time series q to the elements of S can be defined as the minimal DTW distance from q to the elements of Cover(V), as defined in Equ. (11).

LB_DCRC(q, S) = min_{x ∈ Cover(V)} DTW(q, x)   (11)

The DCRC based lower bound of classic DTW, namely LB_DCRC, is summarized in Algorithm 4. Given the ratio of the width of the Sakoe-Chiba Band to the length of the time series, denoted by λ, the time complexity of the algorithm is O(λm²n).

Note that, in a given DCRC structure V, the number of feasible relationships grows with the power of m and n, i.e. is O(ϕ^{mn}), where ϕ is a positive constant. However, the computation of LB_DCRC does not directly enumerate all the relationships, and achieves polynomial complexity by using dynamic programming.

In Algorithm 4, √a_{ijk} represents the lower bound of DTW from the i-length time series q(1 : i) to the j-length DCRC V′(V1′, V2′, ..., Vj′), satisfying V_l′ = {v ∈ V_l | v.p ≤ k} for l = 1, 2, ..., j. Then a_{ijk} is computed by the recursive formula at Line 16.

We will prove that Algorithm 4 satisfies Equ. (11) by Theorem 4 in Sec. 5. In Fig. 4, the dotted line shows a solution for LB_DCRC. The computation of point (i, j, k) depends on the five points (i−1, j−1, k), (i−1, j−1, k−1), (i−1, j, k), (i, j−1, k) and (i, j−1, k−1). Let (i_1, j_1, k_1), (i_2, j_2, k_2), ..., (i_L, j_L, k_L) be an optimized path and g[g_1, g_2, ..., g_m] the optimized time series in Equ. (11). As j_p = j_q ⇒ k_p = k_q (p ≠ q), assume r = [r_1, r_2, ..., r_m] satisfies ∀p ∈ {1, 2, ..., m} ∃q (j_q = p ∧ k_q = r_p); then we have g_p ∈ [L_p^{r_p}, U_p^{r_p}] for p = 1, 2, ..., m. Furthermore, (i_1, j_1), (i_2, j_2), ..., (i_L, j_L) is exactly the DTW path between the query time series q and the optimal solution g.

Fig. 4 A feasible matching path from (0, 0, 0) to (i, j, k) for the computation of LB_DCRC

4.2 Hierarchical DCRC (HDCRC)

Consider a given series of sets S1, S2, ..., ST, where S_t (t = 1, 2, ..., T) is a set of m-length time series, and a given series of DCRC structures V1, V2, ..., VT, where V_t.n = n and S_t ⊆ Cover(V_t) for t = 1, 2, ..., T.

The problem is how to obtain a DCRC structure V satisfying ∪_{t=1}^{T} S_t ⊆ Cover(V) and V.n = n′ (n′ ≤ n) according to V1, V2, ..., VT only, and not the entire set of elements of S1, S2, ..., ST. The hierarchical structure is illustrated in Fig. 5. Algorithm 5 sets out the procedure for determining the DCRC structure.

Fig. 5 Illustration of Hierarchical DCRC with two layers

At Line 3, the reference time series c of length n′ is converted from the reference x1 of V1 by Algorithm 2. The components of V are built by the steps at Lines 9-19. For the i-th set in V, if j ∈ V_t.P_i, we have r_j ∈ V.P_i.
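A minimal C++ sketch of the merge step described above (Lines 9-19 of Algorithm 5) is given below; it assumes the relationship r = R(x_t, c) between the child reference and the parent reference has already been computed (e.g. with Algorithm 1), and it uses our own map-based representation of a DCRC rather than anything prescribed by the paper.

#include <algorithm>
#include <map>
#include <utility>
#include <vector>

// V[i] maps a reference subscript to an interval [l, u].
using Dcrc = std::vector<std::map<int, std::pair<double, double>>>;

// Merge a child DCRC into the parent: every child tuple keyed by subscript j of
// the child reference x_t is re-keyed to k = r_j and its interval is widened in.
void hdcrc_merge(Dcrc& parent, const Dcrc& child, const std::vector<int>& r) {
    if (parent.empty()) parent.resize(child.size());
    for (size_t i = 0; i < child.size(); ++i) {
        for (const auto& kv : child[i]) {
            int k = r[kv.first - 1];                 // child subscript j -> parent subscript k = r_j
            auto it = parent[i].find(k);
            if (it == parent[i].end()) parent[i][k] = kv.second;
            else {
                it->second.first  = std::min(it->second.first,  kv.second.first);
                it->second.second = std::max(it->second.second, kv.second.second);
            }
        }
    }
}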
Algorithm 4 DCRC based lower bound of DTW (LB_DCRC)
Input: Set S of m-length time series;
Input: DCRC structure V = [V1, V2, ..., Vm] satisfying S ⊂ Cover(V);
Input: Ratio λ (0 < λ ≤ 1) of band width to m; an m-length query time series q = [q1, q2, ..., qm].
Output: LB_DCRC(q, S).
1: Let A = [a_{ijk}] be an m×m×n-size array, with each a_{ijk} = +∞ initially;
2: Let B be an empty set;
3: for i = 1 to m, j = 1 to m do
4:   if |i − j| ≤ λm then
5:     for each k in V.P_j do
6:       B = B ∪ {(i, j, k)};
7:     end for
8:   end if
9: end for
10: for each (i, j, k) in B do
11:   Let η1 = α(i − 1, j − 1, k);
12:   Let η2 = α(i − 1, j − 1, k − 1);
13:   Let η3 = α(i − 1, j, k);
14:   Let η4 = α(i, j − 1, k);
15:   Let η5 = α(i, j − 1, k − 1);
16:   Let a_{ijk} = min(η1, η2, η3, η4, η5) + γ(i, j, k);
17: end for
18: return √a_{mmn}
19:
20: function α(i, j, k)
21:   if i = j = k = 0 then return 0;
22:   else if (i, j, k) ∈ B then return a_{ijk};
23:   else return +∞;
24:   end if
25: end function
26:
27: function γ(i, j, k)
28:   Let x = q_i;
29:   Let y0 = V.L_j^k;
30:   Let y1 = V.U_j^k;
31:   if x < y0 then return (y0 − x)^2;
32:   else if x > y1 then return (x − y1)^2;
33:   else return 0;
34:   end if
35: end function

Algorithm 5 Hierarchical DCRC
Input: Time series c of n′-length;
Input: Set (V1, V2, ..., VT) of m-length DCRC structures, where V_t.n = n (n′ ≤ n) for t = 1, 2, ..., T.
Output: A DCRC structure V satisfying ∪_{t=1}^{T} Cover(V_t) ⊆ Cover(V) and V.n = n′.
1: if c = nil then
2:   Let x1 be the reference time series of V1;
3:   Let c = AS(x1, n′) by Algorithm 2;
4: end if
5: Initialise V = [V1, V2, ..., Vm] such that V_i = ϕ for i = 1, 2, ..., m;
6: for t = 1 to T do
7:   Let x_t be the reference time series of V_t;
8:   Let r[r1, r2, ..., rn] = R(x_t, c);
9:   for i = 1 to m do
10:    for each j in V_t.P_i do
11:      Let k = r_j;
12:      if (k ∈ V.P_i) then
13:        Let V.L_i^k = min(V.L_i^k, V_t.L_i^j);
14:        Let V.U_i^k = max(V.U_i^k, V_t.U_i^j);
15:      else
16:        Let V_i = V_i ∪ {(k, V_t.L_i^j, V_t.U_i^j)};
17:      end if
18:    end for
19:  end for
20: end for
21: return V
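The dynamic program of Algorithm 4 can be sketched in C++ as follows. The state array a[i][j][k], the band test and the γ term mirror the listing above, while the container types and names are our own assumptions (and the O(m²n) memory used here could be reduced in a tuned implementation).

#include <algorithm>
#include <cmath>
#include <limits>
#include <map>
#include <utility>
#include <vector>

// V[j] maps a reference subscript k to the interval [l, u] of tuple v(k, l, u);
// V must hold one map per covering dimension, i.e. m entries in total.
using Dcrc = std::vector<std::map<int, std::pair<double, double>>>;

double lb_dcrc(const std::vector<double>& q, const Dcrc& V, int n, double lambda) {
    const int m = (int)q.size();
    const double INF = std::numeric_limits<double>::infinity();
    // a[i][j][k]; a[0][0][0] = 0 plays the role of alpha(0, 0, 0).
    std::vector<std::vector<std::vector<double>>> a(
        m + 1, std::vector<std::vector<double>>(m + 1, std::vector<double>(n + 1, INF)));
    a[0][0][0] = 0.0;
    for (int i = 1; i <= m; ++i)
        for (int j = 1; j <= m; ++j) {
            if ((i > j ? i - j : j - i) > lambda * m) continue;   // band |i - j| <= lambda*m
            for (const auto& kv : V[j - 1]) {                     // k runs over V.P_j
                int k = kv.first;
                double lo = kv.second.first, hi = kv.second.second;
                double x = q[i - 1], gamma = 0.0;                 // gamma(i, j, k) of Algorithm 4
                if (x < lo) gamma = (lo - x) * (lo - x);
                else if (x > hi) gamma = (x - hi) * (x - hi);
                double best = std::min({a[i - 1][j - 1][k], a[i - 1][j - 1][k - 1],
                                        a[i - 1][j][k], a[i][j - 1][k], a[i][j - 1][k - 1]});
                if (best < INF) a[i][j][k] = best + gamma;
            }
        }
    return std::sqrt(a[m][m][n]);    // Line 18: sqrt(a_mmn)
}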
The implementation of insert node( N , N ′ ) is as
follows:
4.3 DCRC-Tree and Relevant Functions
(a) If N .c = nil, then let N .c = AS(N ′ .c, |N .c|);
Based on the HDCRC structure, an R-tree [8] like in- (b) Let N .Children = N .Children ∪ {N ′ };
dexing tree, named DCRC-tree, is proposed for efficient (c) Let N .V = update hdcrc(N .V, N .c, N ′ .V);
querying. Each node in a DCRC-tree corresponds to a (d) Let N ′ .Parent = N .
DCRC structure V (See Sec. 3.5), rather than a minimal
boundary rectangle (MBR) as used in R-trees. When 4.4 Node Splitting and Insertion in a DCRC-Tree
searching a time series from the DCRC-tree, we still
adopt the classic DTW (with global constraints). Motivated by the idea of node splitting in R-trees, we
A tree node of the DCRC-tree is represented by tu- develop a node splitting algorithm for DCRC-trees. Let
ple N (d, V, c, Parent, Children, Series), where the M be the maximal number of child nodes (not including
components are as defined in Table 2. The relevant ba- leaves) of each tree node. There are two cases of node
sic operators of the DCRC-Tree are given in Table 3. splitting.
The implementation of function create dcrc(c, S) u- The first case is when node N is a leaf node satis-
tilizes Algorithm 3 with c, S as input parameters. The fying |N .Series| = M , then it is split into nodes N1
4.4 Node Splitting and Insertion in a DCRC-Tree

Motivated by the idea of node splitting in R-trees, we develop a node splitting algorithm for DCRC-trees. Let M be the maximal number of child nodes (not including leaves) of each tree node. There are two cases of node splitting.

The first case is when node N is a leaf node satisfying |N.Series| = M; it is then split into nodes N1 and N2, with both N1.Series and N2.Series containing M/2 time series. Algorithm 6 details the node splitting algorithm.

The second case is when N is a non-leaf node satisfying |N.Children| = M; it is then split into nodes N1 and N2, such that N1.Children and N2.Children respectively contain M/2 tree nodes. The corresponding node splitting algorithm for the set of tree nodes is similar to Algorithm 6.

Algorithm 7 summarizes the steps for inserting a time series into a given DCRC-tree. These are similar to the steps used with R-trees. From the root, the child node with the minimal increase in volume is selected recursively, until the current node is a leaf. Then, the time series is inserted into the leaf node, and, from bottom to top, the parent node is split if the number of its children exceeds a pre-given maximal limit and the depth of the tree is less than a pre-given maximal limit. Therefore the leaf nodes might contain a huge number of time series, which are relatively similar to each other in terms of DTW distance.

For the R-Tree and the DCRC-Tree, consider the tree node covering a set of time series. In a tree node of an R-Tree:

(1) The covering set is an MBR; each i-th component is a range interval derived from the bands centered on the i-th entry.
(2) The volume is the product of the i-th range intervals. When the elements are similar, but have large time axis deformation, we have a relatively large volume.
(3) The lower bound of DTW to a given query time series is computed by different Hausdorff-distance-like methods, including LB_Keogh [14], LB_NEW [29], LB_ENHANCED [32], etc.

In a tree node of a DCRC-Tree:

(1) The covering set is a DCRC structure; each i-th component is a set of tuples, and each tuple is a range interval and a subscript.
(2) The volume is computed as defined in Equ. (9). When the elements are similar, but have large time axis deformation, as long as the reference time series is similar to these elements, we have a relatively small volume.
(3) The lower bound of DTW to a given query time series is computed by LB_DCRC using a dynamic programming method.

Hence, the DCRC-Tree based on HDCRC is a tighter structure for covering time series samples than an R-Tree like structure. Consequently, this leads to more efficient pruning when performing a query.

5 Theorems for DCRC

For the algorithms in Secs. 3 and 4.2, we will prove their correctness and efficiency in this section. Theorem 1 assures that the DCRC structure can cover a given set. Theorems 2 and 3 prove the tightness of the DCRC covering. Considering the lower bound of DTW between a given query time series and a given DCRC structure by Algorithm 4, Theorem 4 proves its correctness, and Theorem 5 proves that the hierarchical structure generated by Algorithm 5 is still a DCRC structure, which is used to generate an indexing tree.

Theorem 1 For a given set S of m-length time series, let V be the return value of Algorithm 3; then S ⊆ Cover(V).

Proof Given s_t = [s_{t1}, s_{t2}, ..., s_{tm}] ∈ S where t ∈ {1, 2, ..., T}, let r[r1, r2, ..., rm] be the ACM-relationship at Line 4 in Algorithm 3. From the loop at Lines 5 to 13, we have s_{ti} ∈ [L_i^{r_i}, U_i^{r_i}] for i = 1, 2, ..., m. From r ∈ R(m, n) (defined in Definition 1) and the definition of Rect_r(V, r), s_t ∈ Rect_r(V, r), i.e., s_t ∈ Cover(V) from Equ. (10).
Algorithm 6 Node Splitting for a Time Series Set
Input: DCRC-Tree node N (|N.Series| = M, N.d < dmax).
Output: The updated DCRC-Tree nodes N and N′ after splitting.
1: Let vol = volume(N);
2: Let X_i = create_dcrc(N.c, {N.Series[i]}), for i = 1, 2, ..., M;
3: Let j1 = argmin_i volume(X_i), j2 = argmax_i volume(X_i);
4: Let x1 = N.Series[j1], and let x2 = N.Series[j2];
5: Let S′ = N.Series − {x1} − {x2};
6: Let N.Series = ϕ;
7: Create a new tree node N′, let |N′.c| = |N.c|, and N′.d = N.d;
8: insert_series(N, x1);
9: if vol < ε then
10:   Let N′.c = N.c;
11: end if
12: insert_series(N′, x2);
13: for each s in S′ do
14:   Let v1(s) = volume(create_dcrc(x1, {x1, s}));
15:   Let v2(s) = volume(create_dcrc(x2, {x2, s}));
16:   Denote ω(s) = v1(s) − v2(s);
17: end for
18: Let y1, y2, ..., y_{M−2} be the permutation of the elements of S′ satisfying |ω(y_i)| ≥ |ω(y_{i+1})| for i = 1, 2, ..., M − 3;
19: for i = 1 to M − 2 do
20:   if |N.Series| = M/2 then
21:     insert_series(N′, y_i);
22:   else if |N′.Series| = M/2 then
23:     insert_series(N, y_i);
24:   else
25:     if ω(y_i) < 0 then
26:       insert_series(N, y_i);
27:     else
28:       insert_series(N′, y_i);
29:     end if
30:   end if
31: end for
32: return N, N′

Algorithm 7 Insertion into a DCRC-tree
Input: Time series s of m-length;
Input: Root T of the DCRC-tree.
Output: The updated root T after insertion.
1: if T = nil then
2:   Create a new DCRC-tree node T;
3:   insert_series(T, s);
4:   return T;
5: end if
6: Let N = T;
7: while N.Children ≠ ϕ do
8:   for each N_i in N.Children do
9:     Copy N_i.V to X_i;
10:    Let Y_i = update_dcrc(X_i, N_i.c, s);
11:  end for
12:  Let N = N.Children[k], k = argmin_i volume(Y_i);
13: end while
14: insert_series(N, s);
15: Let N′ = nil;
16: if T.d < dmax and |N.Series| = M then
17:   Split node N into N and N′;
18: end if
19: while true do
20:   Let N_t = N.Parent;
21:   if N_t = nil then
22:     if N′ ≠ nil then
23:       Create a new node T, let T.Parent = nil;
24:       Let T.d = N.d + 1;
25:       Let |T.c| = |N.c|/2;
26:       insert_node(T, N);
27:       insert_node(T, N′);
28:     end if
29:     return T;
30:   else
31:     if N′ = nil then
32:       Update N_t.V with N_t.Children by Algorithm 5;
33:     else
34:       insert_node(N_t, N′);
35:       if |N_t.Children| = M then
36:         Split node N_t into N_t and N′;
37:       end if
38:     end if
39:     Let N = N_t;
40:   end if
41: end while

Lemma 1 Let x1 = [x_{11}, x_{12}, ..., x_{1m1}] and x2 = [x_{21}, x_{22}, ..., x_{2m2}] be two given time series of length m1 and m2 (m1 ≤ m2), and let y be a constant. If α = √(Σ_{i=1}^{m1} (x_{1i} − y)²) and β = √(Σ_{i=1}^{m2} (x_{2i} − y)²), we have DTW²(x1, x2) ≤ 2⌈m2/m1⌉(α² + β²).

Proof Denote d = DTW(x1, x2). Consider a matching path W of length m2 (which might not be a DTW warping path) from x2 to x1, such as (1, i_1), (2, i_2), ..., (m2, i_{m2}), where i_k = ⌈k·m1/m2⌉. We have d² ≤ Σ_{k=1}^{m2} (x_{1 i_k} − x_{2k})² ≤ Σ_{k=1}^{m2} 2((x_{1 i_k} − y)² + (x_{2k} − y)²). As i_k = ⌈k·m1/m2⌉, we have d² ≤ 2 Σ_{k=1}^{m2} (x_{2k} − y)² + 2⌈m2/m1⌉ Σ_{k=1}^{m1} (x_{1k} − y)² ≤ 2⌈m2/m1⌉(α² + β²). Therefore, DTW²(x1, x2) ≤ 2⌈m2/m1⌉(α² + β²).

Consider three time series x1 = [x_{11}, x_{12}, ..., x_{1m1}], x2 = [x_{21}, x_{22}, ..., x_{2m2}] and y = [y_1, y_2, ..., y_n] of length m1, m2 and n, respectively, with n < m1, m2.

Theorem 2 If D(x1, y) = α and D(x2, y) = β (where function D is defined in Equ. (3)), we have DTW(x1, x2) ≤ √(2(m2 − n)(α² + β²)).

Proof Let r1[r_{11}, r_{12}, ..., r_{1m1}] = R(x1, y), let r2[r_{21}, r_{22}, ..., r_{2m2}] = R(x2, y), and let W denote a matching path from x1 to x2, which is divided into n segments. Let the t-th segment correspond to the set X_{pt} = {k | r_{pk} = t}, and let a_{pt}, b_{pt} denote the minimum and maximum of X_{pt}, respectively, where p = 1, 2. Let α_t² = Σ_{k=a_{1t}}^{b_{1t}} (x_{1k} − y_t)² and β_t² = Σ_{k=a_{2t}}^{b_{2t}} (x_{2k} − y_t)².

We have 1 ≤ |X_{1t}|, |X_{2t}| ≤ m2 − n. From Lemma 1, we have DTW²(x1(a_{1t} : b_{1t}), x2(a_{2t} : b_{2t})) ≤ 2⌈|X_{2t}|/|X_{1t}|⌉(α_t² + β_t²). Then DTW²(x1, x2) ≤ Σ_{t=1}^{n} DTW²(x1(a_{1t} : b_{1t}), x2(a_{2t} : b_{2t})) ≤ 2(m2 − n) Σ_{t=1}^{n} (α_t² + β_t²) = 2(m2 − n)(α² + β²). Then DTW(x1, x2) ≤ √(2(m2 − n)(α² + β²)).
Theorem 3 Considering Algorithm 3, let the set S = {s1, s2, ..., sT} of m-length time series and the time series c of n-length be the input parameters, and let V be the output DCRC structure. Assume D(s_t, c) ≤ α for all t ∈ {1, 2, ..., T}. If x = [x1, x2, ..., xm] ∈ Cover(V), then D(x, c) ≤ √m · α. D is defined in Equ. (3).

Proof Firstly, we will prove that for every V.v_i^j (denoted v), we have v.l ≥ c_j − α and v.u ≤ c_j + α, where c_j is the j-th entry of c. From the computation of the warping path W at Line 4, and the assumption D²(s_t, c) = Σ_{i=1}^{m} (s_{ti} − c_{r_i})² ≤ α², we have c_{r_i} − α ≤ s_{ti} ≤ c_{r_i} + α. For each v (V.v_i^j), from the assignments at Lines 7-9 and 11, we have v.l ≥ c_j − α and v.u ≤ c_j + α.

If x ∈ Cover(V), there exists a series r = [r1, r2, ..., rm] ∈ R(m, n) satisfying x ∈ Rect_r(V, r). From the definition of D, D²(x, c) ≤ Σ_{i=1}^{m} (x_i − c_{r_i})². From Equ. (8), we have x_i ∈ [v.l, v.u]. As v.l ≥ c_{r_i} − α and v.u ≤ c_{r_i} + α, then D²(x, c) ≤ Σ_{i=1}^{m} (x_i − c_{r_i})² ≤ mα². Then D(x, c) ≤ √m · α.

From Theorems 2, 3, and Algorithm 3, we can conclude that if the elements of the DCRC are all similar to the reference c as measured by function D, the elements are also similar to each other in terms of the DTW distance.

Theorem 4 The return value of Algorithm 4 is LB_DCRC(q, S) as defined in Equ. (11).

Proof Firstly, we will prove that √a_{ijk} = min_x DTW(q(1 : i), x(1 : j)) s.t. x(1 : j) ∈ Cover(V_j) and x_j ∈ [L_j^k, U_j^k], where V_j = [V1, V2, ..., Vj] and x(1 : j) = [x1, x2, ..., xj].

Mathematical induction. Assume a_{i′j′k′} satisfies the above equation for all (i′, j′, k′) with (i′ ≤ i ∧ j′ ≤ j ∧ k′ ≤ k) ∧ ((i′, j′, k′) ≠ (i, j, k)). We will prove that a_{ijk} also satisfies the equation.

At Line 16, a_{ijk} is recursively represented by the sum of γ(i, j, k) and a_{i′j′k′}. Considering the subscript pair (i′, j′) of a_{i′j′k′}, there are three cases: (i−1, j−1), (i−1, j) and (i, j−1). In the case of (i′, j′) = (i−1, j−1), from the definition of Cover in Equ. (10) and the ACM-relationships in Definition 1, we have (k−1) ∈ V.P_{j−1} or k ∈ V.P_{j−1}. The two cases correspond to η1 and η2, respectively. Similarly, η3, η4 and η5 correspond to the other cases.

Note that η6 = α(i−1, j, k−1) and η7 = α(i, j, k−1) are excluded. Consider that a_{ijk} is the lower bound of DTW from q(1 : i) to x(1 : j). As the optimum x ∈ Cover(V), there exists r = [r1, r2, ..., rj] ∈ R(j, k) satisfying x_t ∈ V.v_t^{r_t} for t = 1, 2, ..., j. If η6 or η7 were adopted in the computation of a_{ijk}, then (j, k−1) and (j, k) would appear in r1, r2, ..., rj at the same time, which contradicts the definition of the ACM-relationship.

Using dynamic programming, a_{ijk} also satisfies the minimal assumption. Finally, √a_{mmn} at Line 18 is the minimum of Equ. (11).

Theorem 5 The return value V of Algorithm 5 satisfies ∪_{t=1}^{T} Cover(V_t) ⊆ Cover(V).

Proof For any given s = [s1, s2, ..., sm] ∈ Cover(V_t), there exists b = [b1, b2, ..., bm] satisfying s ∈ Rect_r(V_t, b), i.e., s_i ∈ V_t.[L_i^{b_i}, U_i^{b_i}] for i = 1, 2, ..., m. In addition, b satisfies the ACM-relationships.

Consider Line 8 in Algorithm 5, and let r = [r1, r2, ..., rn]. From Definition 1, we have that r satisfies the ACM-relationships.

The series [r_{b1}, r_{b2}, ..., r_{bm}] can be shown to satisfy the ACM-relationships as follows. As r1 = 1, b1 = 1, rn = n′ and bm = n, then r_{b1} = 1 and r_{bm} = n′, i.e., "Alignment" is satisfied. As 0 ≤ r_{i+1} − r_i ≤ 1 for i = 1, 2, ..., n − 1 and 0 ≤ b_{i+1} − b_i ≤ 1 for i = 1, 2, ..., m − 1, then 0 ≤ r_{b_{i+1}} − r_{b_i} ≤ 1, i.e., "Continuity" and "Monotonicity" are satisfied. From the assignments at Lines 12-17, we have s_i ∈ V.[L_i^{r_{b_i}}, U_i^{r_{b_i}}], i.e., s ∈ Cover(V).

According to Theorem 5, if the LB_DCRC of an upper layer is larger than a given acceptable range query tolerance, then all of its sub-layers can be pruned to reduce the computational load.

6 Experiments

In order to illustrate the effectiveness of our algorithms and indexing structure, experiments are carried out in this section. We use LB_NEW [29] and LB_ENHANCED [32] for comparisons. The experiments are divided into two parts. The first part, presented in Sec. 6.2, provides a comparison of the different DTW lower bounds. In addition, we also perform experiments to analyze the impact of parameters, including the length of the time series, the ratio λ of the width of the Sakoe-Chiba Band to the length of the time series, and the acceptable query tolerance ε. The second part, presented in Sec. 6.3, shows the performance of the different index trees.

6.1 Setup
The datasets selected for our experiments are from the UCR Time Series Classification Archive [3]. Firstly, we compute the average of the LB_DCRC distances from the query time series to the DCRC structure using Algorithm 4. Then we compute the average DTW from the query time series to all the samples in the dataset S.

The computed LB_DCRC and actual DTW values for different λ are shown in Table 4. The average lower bound distance of LB_DCRC is lower than DTW for the 20 datasets. The time series have different lengths m_i. The dimension of V of the DCRC is set to m_i and the dimension of the reference of the DCRC is set to m_i/2.

6.2 Distance and Tightness

In terms of distance, we compute the average distance between the query time series and the candidate set of time series using four methods: DTW, LB_NEW, LB_ENHANCED and LB_DCRC. Table 5, which shows the results of the average distance when λ = 0.2, demonstrates that LB_DCRC achieves better performance than LB_NEW and LB_ENHANCED for all datasets.

Definition 3 (Tightness of the DTW Lower Bound) Given a method LB of obtaining a lower bound of DTW, a set S of time series, and a query time series q, let the tightness of LB for q and S be defined as LB(q, S) / min_{s∈S} DTW(s, q).

Using this definition, Fig. 6 shows the average tightness of LB_NEW, LB_ENHANCED, and LB_DCRC for different λ (i.e., 0.2, 0.4, 0.6). From the charts, it is clear that LB_DCRC is superior to LB_ENHANCED and LB_NEW on all datasets. When λ increases, the tightness of LB_NEW and LB_ENHANCED decreases significantly. In contrast, the width of the Sakoe-Chiba band has little impact on LB_DCRC, i.e. when the time series have relatively large deformations, LB_DCRC is still a tight lower bound of DTW. The dimensions of these datasets are distributed in the range 60 to 637, but this variation in dimension does not impact the performance of LB_DCRC relative to the other methods.

Fig. 6 Comparison of lower bound tightness under different ratios λ of warping windows over the 20 datasets: (a) λ = 0.2; (b) λ = 0.4; (c) λ = 0.6

Definition 4 (Pruning Power for a Query Set) Given a candidate data set S of time series, and a query set of time series Q, the pruning power of LB for set Q is defined as |{q ∈ Q | LB(q, S) > ε}| / |Q|, where ε is a predefined tolerance.
Given a tolerance ε, higher pruning power means more query time series can be directly excluded after the computation of the DTW lower bound. Fig. 7 shows a comparison of the pruning power of each approach with increasing ε. The pruning power of LB_NEW and LB_ENHANCED decreases dramatically, while the decline in LB_DCRC is much more gradual. Fig. 8 shows the average pruning power as a function of ε and the average tightness as a function of λ computed over the datasets.

Fig. 9 shows how the tightness changes with the ratio of the Sakoe-Chiba Band for the first 4 datasets employed in our experiments, while Fig. 10 shows the corresponding variation in pruning power as a function of query tolerance. In all cases the curves in Figs. 9 and 10 decrease monotonically and LB_DCRC substantially outperforms its counterparts.
Table 4 Average LB DCRC / DTW values for different λ
Dataset Dimension λ=0.2 λ=0.6 λ=1.0
synthetic control 60 2.189/5.757 1.987/5.603 1.987/5.603
Gun Point 150 0.317/0.845 0.287/0.820 0.287/0.820
CBF 128 2.497/4.715 2.413/4.645 2.413/4.645
FaceAll 131 2.196/5.743 1.917/5.616 1.906/5.616
OSULeaf 427 1.416/5.543 1.326/5.411 1.326/5.411
SwedishLeaf 128 0.322/1.306 0.319/1.305 0.319/1.305
50Words 270 7.139/8.897 5.002/6.793 4.866/6.686
Trace 275 10.281/10.864 10.051/10.656 10.051/10.656
MedicalImages 99 1.696/3.545 1.379/3.209 1.363/3.204
ShapeletSim 500 8.148/13.396 8.135/13.396 8.135/13.396
FaceFour 350 4.951/7.250 4.794/7.237 4.794/7.237
Lighting2 637 4.858/8.804 3.828/7.842 3.828/7.842
Lighting7 319 6.770/9.794 5.223/8.268 5.195/8.268
FacesUCR 131 3.696/6.407 3.522/6.361 3.494/6.355
Adiac 176 0.899/1.179 0.899/1.179 0.899/1.179
MoteStrain 84 1.560/4.222 1.478/4.048 1.478/4.048
Fish 463 0.416/0.992 0.416/0.992 0.416/0.992
Plane 144 2.762/3.543 2.698/3.485 2.698/3.485
Car 577 0.663/1.243 0.663/1.243 0.663/1.243
Beef 470 3.112/3.908 3.099/3.894 3.099/3.894
Table 5 Average distance for λ = 0.2
Dataset Dimension DTW LB DCRC LB NEW LB ENHANCED
synthetic control 60 5.757 2.189 0.771 0.942
Gun Point 150 0.845 0.317 0.148 0.154
CBF 128 4.715 2.497 0.168 0.218
FaceAll 131 5.743 2.196 1.112 1.127
OSULeaf 427 5.543 1.416 0.033 0.042
SwedishLeaf 128 1.306 0.322 0.079 0.070
50Words 270 8.897 7.139 0.957 0.406
Trace 275 10.864 10.281 3.951 7.755
MedicalImages 99 3.545 1.696 0.807 0.769
ShapeletSim 500 13.396 8.148 0.075 0.082
FaceFour 350 7.250 4.951 0.340 0.368
Lighting2 637 8.804 4.858 0.014 0.070
Lighting7 319 9.794 6.770 0.749 1.068
FacesUCR 131 6.407 3.696 1.689 1.885
Adiac 176 1.179 0.899 0.710 0.386
MoteStrain 84 4.222 1.560 0.416 0.544
Fish 463 0.992 0.416 0.056 0.060
Plane 144 3.543 2.762 1.088 1.197
Car 577 1.243 0.663 0.020 0.021
Beef 470 3.908 3.112 0.659 1.346
6.3 Indexing Tree Comparisons

By default, the length of each leaf is reduced to 20 by PAA [14]. Let the maximum number of child nodes be M = 20 and let the maximal depth of the tree be dmax = 3 in Algorithm 7. For each R-Tree node, the maximum number of child nodes M is set to 20. The time series for our experiments are randomly selected from the UCR Archive by the random walk method until the resulting dataset reaches 1 GB. All experiments were optimised and implemented in ANSI C++ and conducted on a 64-bit Windows 10 operating system with a 2.4 GHz main frequency, 8 CPUs, 64 GB RAM and a 4 TB hard disk.

For LB_NEW and LB_ENHANCED, we construct the corresponding index structures as R-trees, while for LB_DCRC we use a DCRC-tree. If the depth of the DCRC-tree in Algorithm 7 reaches the given maximum, the time series contained in the leaf nodes will not be split, i.e., a leaf node might contain a huge number of time series which need to be stored in the same hard disk file.

Fig. 11 compares the performance of the indexing trees as a function of query tolerance. In plots (a) and (b), the horizontal axis is the query tolerance and the vertical axis is the pruning power, where the ratio of the warping window λ is 0.1 in plot (a) and 1.0 in plot (b). For all the algorithms considered, pruning power decreases with increasing query tolerance because more samples are accepted.
From the two plots, the pruning power of the DCRC-Tree is higher than the others, i.e. LB_DCRC has a tighter lower bound. After querying in the indexing tree (R-Tree or DCRC-Tree), the remaining unpruned time series are sequentially scanned using the UCR suite method [27].

While searching for a given query time series on the DCRC-tree, visiting the non-leaf nodes only costs about 800 milliseconds of computation time. Therefore, the querying time cost of linear scanning is determined by the pruning power; more pruning power leads to a lower time cost. Plots (c) (λ = 0.1) and (d) (λ = 1.0) provide a comparison of the computation time for the different algorithms. Again, the DCRC-Tree outperforms the other methods. The curves are all monotonically increasing, which reflects the fact that as ε increases, more candidate data are retrieved.

Fig. 7 Comparison of pruning power under different acceptable tolerances over the 20 datasets: (a) ε = 0.1, λ = 0.2; (b) ε = 0.5, λ = 0.2; (c) ε = 1.0, λ = 0.2

Fig. 8 Pruning power as a function of ε and tightness as a function of λ averaged over the 20 datasets: (a) pruning power as a function of ε (λ = 1.0); (b) tightness as a function of λ

Fig. 12 shows the pruning power with varying λ for the different indexing structures. In plots (a) and (b), the horizontal axis is the ratio of the warping window λ and the vertical axis is the pruning power, where the tolerance ε is 0.1 and 1.0 in plots (a) and (b), respectively. The pruning power decreases with increasing λ because, as the warping window λ increases, the lower bound becomes lower so that more candidates are accepted. From the two plots, it is evident that the pruning power of the DCRC-Tree is greater than that of the other methods.

In plots (c) and (d) (the tolerance ε is set to 0.1 and 1.0, respectively), the DCRC-Tree significantly reduces the number of candidates, which greatly reduces the time complexity of indexing as only a small part of the dataset needs to be linearly scanned.
Fig. 9 The relationship between tightness and warping window for 4 selected datasets: (a) Synthetic Control; (b) Gun Point; (c) CBF; (d) Face All

Fig. 10 The relationship between pruning power and the query tolerance for 4 selected datasets (λ = 1.0): (a) Synthetic Control; (b) Gun Point; (c) CBF; (d) Face All
120
R-Ttree(LB NEW) 104 R-Ttree(LB NEW)
R-Tree(LB ENHANCED) R-Tree(LB ENHANCED)
110 DCRC-Tree DCRC-Tree
P runingP ower(%)
P runingP ower (%)
102
100
100
98
90
96
80
0.2 0.4 0.6 0.8 1 0.2 0.4 0.6 0.8 1
ε λ
(a) Pruning power for varying of query toler- (a) Pruning power for varying of warping win-
ance ε (λ=0.1) dow λ (ε = 0.1)
120
R-Ttree(LB NEW) R-Ttree(LB NEW)
100
R-Tree(LB ENHANCED) R-Tree(LB ENHANCED)
110
DCRC-Tree DCRC-Tree
P runingP ower(%)
P runingP ower(%)
100 90
90
80
80
70 70
0.2 0.4 0.6 0.8 1 0.2 0.4 0.6 0.8 1
ε λ
(b) Pruning power for varying of query toler- (b) Pruning power for varying of warping win-
ance ε (λ=1.0) dow λ (ε = 1.0)
15
12 R-Ttree(LB NEW) R-Ttree(LB NEW)
R-Tree(LB ENHANCED) R-Tree(LB ENHANCED)
Time Consumption (sec)
Time Consumption (sec)
10 DCRC-Tree DCRC-Tree
10
8
6
4 5
2
0 0
0.2 0.4 0.6 0.8 1 0.2 0.4 0.6 0.8 1
ε
λ
(c) Computation Time for varying of query
(c) Computation Time for varying of warping
tolerance ε (λ=0.1)
window λ (ε = 0.1)
R-Ttree(LB NEW) 80
60 R-Tree(LB ENHANCED) R-Ttree(LB NEW)
Time Consumption (sec)
DCRC-Tree R-Tree(LB ENHANCED)
Time Consumption (sec)
60 DCRC-Tree
40
40
20
20
0
0.2 0.4 0.6 0.8 1 0
ε 0.2 0.4 0.6 0.8 1
(d) Computation Time for varying of query λ
tolerance ε (λ=1.0) (d) Time Consumption for varying of warping
window λ (ε = 1.0)
Fig. 11 The impact of query tolerance on indexing perfor-
mance for λ = 0.1 and λ = 1.0 Fig. 12 Indexing performance comparisons with different
warping windows when ε = 0.1 and ε = 1.0
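Throughout these experiments, λ denotes the ratio of the warping window to the series length. For reference, the sketch below shows DTW restricted to a Sakoe-Chiba band of half-width ⌈λn⌉ [28]; it is only an illustrative implementation under that assumption, not the constrained distance used by the authors. It makes the trend in Fig. 12 concrete: widening the band admits more warping paths, so the resulting distance can only decrease, the corresponding lower bounds loosen, and fewer candidates are pruned.

    import numpy as np

    # Illustrative DTW with a Sakoe-Chiba band; cell (i, j) is admissible only
    # when |i - j| <= w, where w = ceil(lam * n) for the window ratio lam.
    def dtw_band(x, y, lam):
        x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
        n = len(x)                          # assumes equal-length series
        w = max(1, int(np.ceil(lam * n)))
        D = np.full((n + 1, n + 1), np.inf)
        D[0, 0] = 0.0
        for i in range(1, n + 1):
            for j in range(max(1, i - w), min(n, i + w) + 1):
                cost = (x[i - 1] - y[j - 1]) ** 2
                D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
        return np.sqrt(D[n, n])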
Due to the computational cost and the “curse of dimensionality” [8, 14], the node dimension of tree-like indexing structures is usually set to 20. In Fig. 13 we compare the influence of the tree node dimension on pruning power. The horizontal axis is the node dimension, which varies from 10 to 30, and the vertical axis is the pruning power, with ε = 1.0 and with λ = 0.2 in plot (a) and λ = 1.0 in plot (b). The dimensionality reduction adopts the PAA algorithm [8, 14], and the unpruned results are linearly scanned [27]. The results show that, as expected, the time consumption increases with increasing dimension, and that the LB DCRC tree substantially outperforms the other methods across the full range of dimensions considered.

Fig. 13 Indexing performance comparisons with varying tree node dimension: (a) pruning power for varying dimension (λ = 0.1), (b) pruning power for varying dimension (λ = 1.0), (c) computation time for varying dimension (λ = 0.1), (d) time consumption for varying dimension (λ = 1.0)
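The fixed node dimension referred to above is obtained by reducing each series with the PAA representation [8, 14]. A minimal sketch of PAA under the usual equal-width-segment assumption (again illustrative, not the authors' code) is given below.

    import numpy as np

    # Piecewise Aggregate Approximation: reduce a length-n series to d segment
    # means (assumes d <= n; segments are nearly equal-width when d does not
    # divide n).
    def paa(series, d):
        x = np.asarray(series, dtype=float)
        n = len(x)
        segment = (np.arange(n) * d) // n   # map each point to a segment 0..d-1
        return np.array([x[segment == k].mean() for k in range(d)])

For example, paa(x, 20) maps a length-128 series to the 20-dimensional representation used as the tree node dimension in Fig. 13.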
7 Conclusion

Dynamic time warping has become a popular approach for measuring the similarity of time series, with lower-bound-based techniques used to speed up its application by pruning series during search. This paper has presented DCRC as a novel structure for tightly covering a given set of time series under the DTW distance and, based on this structure, proposed the Hierarchical DCRC (HDCRC) to generate the DCRC-tree indexing. We have also introduced a lower bound on the DTW distance from a query time series to a given DCRC-based cover set. The tightness of the lower bound, which we have proven theoretically, makes it highly suited to pruning when querying on indexing trees. With the aid of extensive experimental studies we have illustrated that LB DCRC has more stable performance than competing methods for time series indexing.

Our future research will focus on multivariate time series, an increasingly important topic in time series data mining, with a view to extending the DCRC structure to cover sets of multivariate time series. Since multivariate time series have both variable-based and time-based dimensions, we will endeavor to explore a new way to represent them appropriately.

Acknowledgements The authors sincerely thank the editors and the anonymous reviewers for the very helpful and kind comments that have enhanced the presentation of our paper. The authors would also like to thank the UCR time series classification archive and Prof. Keogh for providing the datasets used in the study. This work is supported in part by the National Natural Science Foundation of China (Grant Nos. 61751205, 61572540, 61772102), and in part by the U.S. National Science Foundation (Grant No. IIS-1613950).

References

1. Agrawal, R., Faloutsos, C., Swami, A.: Efficient similarity search in sequence databases. In: Proceedings of International Conference on Foundations of Data Organization and Algorithms, pp. 69–84. Springer, Boston, MA (1993)
2. Chen, C.L.P., Zhang, C.Y.: Data-intensive applications, challenges, techniques and technologies: A survey on big data. Information Sciences 275, 314–347 (2014)
3. Dau, H.A., Keogh, E., Kamgar, K., Yeh, C.C.M., Zhu, Y., Gharghabi, S., Ratanamahatana, C.A., Chen, Y., Hu, B., Begum, N., Bagnall, A., Mueen, A., Batista, G.: The UCR time series classification archive (2018). URL www.cs.ucr.edu/~eamonn/time_series_data_2018
4. Edstrom, J., Chen, D., Gong, Y., Wang, J., Gong, N.: Data-pattern enabled self-recovery low-power storage system for big video data. IEEE Transactions on Big Data 5(1), 95–105 (2019)
5. Esling, P., Agon, C.: Time-series data mining. ACM Computing Surveys 45(1), 12:1–34 (2012)
6. Fu, T.C.: A review on time series data mining. Engineering Applications of Artificial Intelligence 24(1), 164–181 (2011)
7. Grabocka, J., Wistuba, M., Schmidt-Thieme, L.: Fast classification of univariate and multivariate time series through shapelet discovery. Knowledge and Information Systems 49(2), 429–454 (2016)
8. Guttman, A.: R-trees: A dynamic index structure for spatial searching. In: ACM SIGMOD International Conference on Management of Data, pp. 47–57. ACM, New York, NY (1984)
9. He, H., Tan, Y.: Unsupervised classification of multivariate time series using VPCA and fuzzy clustering with spatial weighted matrix distance. IEEE Transactions on Cybernetics 50(3), 1096–1105 (2020)
10. Hu, J., Yang, B., Guo, C., Jensen, C.S.: Risk-aware path selection with time-varying, uncertain travel costs: A time series approach. VLDB Journal 27(2), 179–200 (2018)
11. Ignatov, A.: Real-time human activity recognition from accelerometer data using convolutional neural networks. Applied Soft Computing 62, 915–922 (2018)
12. Itakura, F.: Minimum prediction residual principle applied to speech recognition. IEEE Transactions on Acoustics, Speech, and Signal Processing 23(1), 67–72 (1975)
13. Kacprzyk, J., Wilbik, A., Zadrożny, S.: Linguistic summarization of time series using a fuzzy quantifier driven aggregation. Fuzzy Sets and Systems 159(12), 1485–1499 (2008)
14. Keogh, E., Ratanamahatana, C.A.: Exact indexing of dynamic time warping. Knowledge and Information Systems 7(3), 358–386 (2005)
15. Keogh, E., Wei, L., Xi, X., Vlachos, M., Lee, S.H., Protopapas, P.: Supporting exact indexing of arbitrarily rotated shapes and periodic time series under Euclidean and warping distance measures. VLDB Journal 18(3), 611–630 (2009)
16. Lemire, D.: Faster retrieval with a two-pass dynamic-time-warping lower bound. Pattern Recognition 42, 2169–2180 (2009)
17. Li, H., Yang, L.: Extensions and relationships of some existing lower-bound functions for dynamic time warping. Journal of Intelligent Information Systems 43(1), 59–79 (2014)
18. Li, Q., Chen, Y., Wang, J., Chen, Y., Chen, H.C.: Web media and stock markets: A survey and future directions from a big data perspective. IEEE Transactions on Knowledge and Data Engineering 30(2), 381–399 (2018)
19. Lin, S.C., Yeh, M.Y., Chen, M.S.: Non-overlapping subsequence matching of stream synopses. IEEE Transactions on Knowledge and Data Engineering 30(1), 101–114 (2018)
20. Liu, M., Zhang, X., Xu, G.: Continuous motion classification and segmentation based on improved dynamic time warping algorithm. International Journal of Pattern Recognition and Artificial Intelligence 32(2), 1850002 (2018)
21. Mikalsen, K.Ø., Bianchi, F.M., Soguero-Ruiz, C., Jenssen, R.: Time series cluster kernel for learning similarities between multivariate time series with missing data. Pattern Recognition 76, 569–581 (2018)
22. Mondal, T., Ragot, N., Ramel, J.Y., Pal, U.: Comparative study of conventional time series matching techniques for word spotting. Pattern Recognition 73, 47–64 (2018)
23. Mori, U., Mendiburu, A., Lozano, J.A.: Similarity measure selection for clustering time series databases. IEEE Transactions on Knowledge and Data Engineering 28(1), 181–195 (2016)
24. Mueen, A., Chavoshi, N., Abu-El-Rub, N., Hamooni, H., Minnich, A., MacCarthy, J.: Speeding up dynamic time warping distance for sparse time series data. Knowledge and Information Systems 54(1), 237–263 (2018)
25. Mueen, A., Keogh, E.: Extracting optimal performance from dynamic time warping. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 2129–2130. ACM, New York, NY (2016)
26. Park, S., Lee, D., Chu, W.W.: Fast retrieval of similar subsequences in long sequence databases. In: Proceedings of 1999 Workshop on Knowledge and Data Engineering Exchange, pp. 60–67. IEEE, Chicago, IL (1999)
27. Rakthanmanon, T., Campana, B., Mueen, A., Batista, G., Westover, B., Zhu, Q., Zakaria, J., Keogh, E.: Searching and mining trillions of time series subsequences under dynamic time warping. In: Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 262–270. ACM, New York, NY (2012)
28. Sakoe, H., Chiba, S.: Dynamic programming algorithm optimization for spoken word recognition. IEEE Transactions on Acoustics, Speech, and Signal Processing 26(1), 43–49 (1978)
29. Shen, Y., Chen, Y., Keogh, E., Jin, H.: Accelerating time series searching with large uniform scaling. In: Proceedings of the 2018 SIAM International Conference on Data Mining, pp. 234–242. SIAM, Bologna, Italy (2018)
30. Son, N.T., Anh, D.T.: Discovery of time series k-motifs based on multidimensional index. Knowledge and Information Systems 46(1), 59–86 (2016)
31. Sun, T., Liu, H., Yu, H., Chen, C.L.P.: Degree-pruning dynamic planning approaches to central time series through minimizing dynamic time warping distance. IEEE Transactions on Cybernetics 47(7), 1719–1729 (2017)
32. Tan, C.W., Petitjean, F., Webb, G.: Elastic bands across the path: A new framework and method to lower bound DTW. In: Proceedings of the 2019 SIAM International Conference on Data Mining, pp. 522–530. SIAM, Alberta, Canada (2019)
33. Tan, C.W., Webb, G.I., Petitjean, F.: Indexing and classifying gigabytes of time series under time warping. In: Proceedings of the 2017 SIAM International Conference on Data Mining, pp. 282–290. SIAM, Houston, TX (2017)
34. Tan, Z., Wang, Y., Zhang, Y., Zhou, J.: A novel time series approach for predicting the long-term popularity of online videos. IEEE Transactions on Broadcasting 62(2), 436–445 (2016)
35. Tang, J., Cheng, H., Zhao, Y., Guo, H.: Structured dynamic time warping for continuous hand trajectory gesture recognition. Pattern Recognition 80, 21–31 (2018)
36. Wu, X., Zhu, X., Wu, G., Ding, W.: Data mining with big data. IEEE Transactions on Knowledge and Data Engineering 26(1), 97–107 (2014)
37. Wu, Y., Tong, Y., Zhu, X., Wu, X.: NOSEP: Nonoverlapping sequence pattern mining with gap constraints. IEEE Transactions on Cybernetics 48(10), 2809–2822 (2018)
38. Yi, B.K., Jagadish, H.V., Faloutsos, C.: Efficient retrieval of similar time sequences under time warping. In: Proceedings of the 14th International Conference on Data Engineering, pp. 201–208. IEEE, Orlando, FL (1998)
39. Zhou, M., Wong, M.H.: Boundary-based lower-bound functions for dynamic time warping and their indexing. Information Sciences 181(19), 4175–4196 (2011)
40. Zoumpatianos, K., Lou, Y., Ileana, I., Palpanas, T., Gehrke, J.: Generating data series query workloads. VLDB Journal 27(6), 823–846 (2018)