A Method For Detecting Abnormal Behavior of Ships
DOI: 10.3934/mbe.2023620
Received: 07 March 2023
Revised: 07 May 2023
Accepted: 29 May 2023
Published: 20 June 2023
http://www.aimspress.com/journal/MBE
Research article
Lixiang Zhang1, Yian Zhu1,*, Jie Ren1, Wei Lu2 and Ye Yao1
1 School of Computer Science, Northwestern Polytechnical University, Xi’an 710072, China
2 School of Information, Xi’an University of Finance and Economics, Xi’an 710100, China
Abstract: Abnormal ship behavior detection is essential for maritime navigation safety. Most
existing abnormal ship behavior detection methods only build a ship trajectory position outlier
detection model; however, a ship speed outlier detection model is equally important
for maritime navigation safety. In addition, most existing methods detect a ship’s abnormal
behavior by using an anomaly threshold, and an unsuitable threshold means that the risk to the ship
is not minimized. In this paper, we propose an abnormal ship behavior
detection method based on distance measurement and an isolation mechanism. First, to address the
problem of traditional trajectory compression methods and density clustering methods only using
ship position information, the minimum description length principle based on acceleration (AMDL)
algorithm and the Multi-Dimensional Density Clustering (MDDBSCAN) algorithm are used in this study.
These algorithms consider not only the position information of the ship but also its speed
information. Second, because the anomaly threshold is difficult to determine, a
method for determining the anomaly threshold based on the relationship between the velocity
weights and noise points of the MDDBSCAN algorithm is introduced. Finally, to address the
randomness of the segmentation value selected in iForest, a strategy of selectively constructing
isolated trees is proposed, which further improves the efficiency of abnormal ship behavior
detection. The experimental results on the historical automatic identification system data set of
Xiamen port demonstrate the practicality and effectiveness of the proposed method. Our experimental results
show that the proposed method achieves an improvement of about 10% over the trajectory outlier
detection based on the local outlier fraction method, about 14% over the isolation-based online
anomalous trajectory method in terms of the accuracy of ship position information anomaly detection,
and about 3% over the feature fusion method in terms of the accuracy of ship speed anomaly
detection. This method improves algorithm efficiency by about 5% compared to the traditional
isolation forest anomaly detection algorithm.
1. Introduction
Maritime safety has always been the focus of naval navigation, and with the rapid growth
of marine traffic it has become imperative [1]. To ensure the safety of ships during navigation,
we need to monitor the navigation information of ships in real-time, such as position and speed
information. At present, the automatic identification system (AIS) [2,3] installed in most ships can
record the navigation information of ships in real-time. This navigation information includes the
ship’s unique identification number, i.e., the Maritime Mobile Service Identity, longitude, latitude,
speed, course, etc. Using this information can help us to analyze the navigation state of the ship and
detect abnormal behavior of the ship [4].
With the development of big data and artificial intelligence technologies in recent years [5,6],
the issue of trajectory outlier detection has been well-studied in trajectory data mining. At the same
time, there are many offline trajectory outlier detection methods, such as the density-based method,
isolation-based anomalous trajectory detection (iBAT) [7], time-dependent widespread routes-based
trajectory outlier detection (TPRO) [8], etc. Meanwhile, there are also some online trajectory outlier
detection methods, such as isolation-based online anomalous trajectory detection (iBOAT) [9],
time-dependent widespread routes-based real-time trajectory outlier detection (TPRRO) [10], driving
behavior-based trajectory outlier detection [11] and gravity vector [12]. Among these methods, iBAT and
iBOAT are based on abnormal isolation mechanisms. TPRO and TPRRO are based on the
time-dependent popular route. The gravity vector is based on the distance measurement. Most of the
above methods, whether offline detection, online detection or anything else, only consider the
position anomaly information of ship behavior and ignore other anomaly information. At the same
time, their abnormality thresholds were shown to be difficult to determine during abnormality
detection, which led to inconsistent abnormality detection results. To solve the above problems, a
ship outlier detection method based on distance measurement and an isolation mechanism is
proposed in this paper. This method provides a reasonable basis for determining
the threshold for judging a speed as abnormal. It is also suitable for online outlier detection and can
detect outliers in both ship position information and velocity information.
First, the method uses the minimum description length principle based on acceleration (AMDL)
algorithm to compress ship trajectories. The reason for choosing this algorithm is that the algorithm
is based on the minimum description length algorithm and has strong applicability. Meanwhile, the
shape of the trajectory output by the other trajectory compression algorithms, such as the
Douglas-Peucker algorithm, depends on the choice of a threshold. However, the state of
motion and direction of ships may change at any time, so trajectory compression
methods that rely on a fixed threshold are inefficient and may not achieve good results when
compressing ship trajectories.
Secondly, to accurately extract the normal behavior model of ships, it is necessary to preprocess
AIS data (i.e., identify trajectory clusters and remove noise points). At the initial stage of AIS data
processing, the distribution characteristics of unprocessed AIS data are unknown. Therefore, we used
the Multi-Dimensional Density Clustering (MDDBSCAN) algorithm (based on Density-Based Spatial
Clustering of Applications with Noise) to identify ship trajectory clusters, removed noise points from
the original AIS data and extracted the ship’s normal behavior model to detect ship position outliers.
The DBSCAN algorithm does not require prior knowledge of the number of ship trajectory clusters to
be formed, and it can discover any shape of ship trajectory cluster classes.
Thirdly, the method adopts a strategy of selectively constructing isolated trees (iTrees) to build the
iForest. This algorithm is highly efficient and is suitable for the online detection of
abnormal behavior of ships. At the same time, in the process of extracting the correct speed set for
ships and removing speed outliers, the algorithm does not need to consider the distribution of the
original data. Finally, establishing the relationship between velocity weights in MDDBSCAN and
anomaly thresholds will provide one suitable anomaly threshold for detecting ship speed outliers.
The main contributions of this paper are as follows:
1) Regarding the issue of the MDL algorithm only considering trajectory position information,
this paper presents the AMDL algorithm, which preserves not only position information but also
speed information, unlike the MDL algorithm. The AMDL algorithm, based on the MDL algorithm,
forcibly retains the points where the acceleration changes from positive to negative or the
acceleration changes from negative to positive.
2) In response to the problem of traditional density clustering only using ship position
information as a similarity measure, the MDDBSCAN algorithm was developed in this study.
Compared with the traditional density clustering-based ship anomaly detection algorithm, it takes
into account the ship’s speed factor in the similarity measure of the trajectory cluster, so the ship
behavior modeling in the trajectory cluster is more accurate, thus improving the detection of
abnormal ship behavior.
3) Due to the randomness issue of the selected segmentation values in iForest, we propose a
strategy of selectively constructing isolated trees, which improves the detection efficiency of the
isolation forest algorithm for abnormal data compared with the traditional isolation forest algorithm.
The strategy can maximize the difference between the number of nodes in the left sub-tree and the
right sub-tree to improve the convergence speed for the iForest algorithm.
4) In response to the difficulty in determining the anomaly threshold, by analyzing the
relationship between the velocity weights and noise points of the MDDBSCAN algorithm, we have
established the connection between velocity weights and anomaly thresholds, which provides a
reasonable basis for determining anomaly thresholds. Compared with using grid search or
determining the anomaly threshold by experience, this method is more efficient, and the anomaly
threshold selection is more explanatory.
The remainder of the paper is organized as follows. Section 2 discusses related work. Section 3
describes basic concepts about the sub-trajectories similarity measurement and trajectory
compression. Section 4 presents the abnormal ship behavior detection algorithm. Section 5 discusses
the experimental setup and results. Section 6 concludes the article and outlines future work.
2. Related works
This section first introduces the general definition of abnormal ship behavior and the
current methods of abnormal ship behavior detection, followed by the main contributions of this paper.
There are many definitions of abnormal ship behavior, but no unified definition
exists. Martineau and Roy [13] were among the first to define and classify abnormal ship behavior,
but their method only divides ship behavior into two categories: motion anomaly and position anomaly.
Portnoy et al. [14] defined abnormal behavior according to the difference between the mathematical
model of normal ship behavior and the ship's data to be detected. Zhang and Tang [15] defined the
abnormal behavior of a ship as the ship’s motion that did not conform to the normal navigation activity
law. Lane et al. [16] classified abnormal ship behavior into five categories according to AIS data:
deviation from the normal route, abnormal activity of the ship AIS, abnormal arrival of the ship,
abnormal distance among ships and an abnormal navigation zone. Laxhammar [17] defined abnormal
behavior of ships as the abnormal deviation of ships from the channel and course, sudden acceleration,
sudden deceleration, and appearance in areas that should not be entered. It can be seen from the
above that different experts place different emphases on the definition of abnormal ship behavior.
The detection of abnormal behavior of ships is the detection of the abnormal trajectory and speed of
ships, and the trajectory of a ship is composed of its trajectory points. Therefore, we detect the
abnormal behavior of ships by examining the position and speed of the ship trajectory points. Combining
the above definitions and analysis, we define abnormal ship behavior as the occurrence of a
position outlier or velocity outlier in the trajectory points of ships. The position outlier refers to the
deviation of the ship trajectory from the historical channel, while the speed outlier refers to a ship
traveling in a particular area at a speed that does not conform to the general speed of ships in that area.
In recent years, there has been much research on abnormal ship behavior detection, including
collaborative computing and distributed methods, deep learning methods, statistical methods,
distance measurement methods, outlier isolation methods, knowledge-based and data-driven
integrating approaches and so on [18]. In the method based on distance measurement, some
trajectories that are far away from most normal trajectories are regarded as outliers. Aiming at the
problem of the possible skewness in the distribution of raw AIS data, Bao and Du [12] extracted the
mathematical model from its trajectory clusters based on density clustering (DBSCAN) to detect the
abnormal behavior of ships. Aiming at the problem that the traditional Trajectory Outlier Detection
(TRAOD) algorithm cannot detect outliers from locally dense trajectories, Luan et al. [19] combined
the Local Outlier Factor algorithm with the traditional TRAOD algorithm to detect trajectory
anomalies. Due to a lack of serious studies on outlier detection for trajectory data, Liang et al. [20]
used the trajectory outlier detection based on the local outlier fraction (TODLOF) algorithm to detect
outliers in the trajectory dataset. Using an approach based on deep learning, Belhadi et al. [21]
compared traditional deep learning methods with data mining, machine learning and other
methods, and demonstrated the advantages of using the Convolutional
Neural Network algorithm and the Region Convolutional Neural Network algorithm for outlier
detection. Among the statistical, collaborative computing and distributed methods,
Szarmach and Czarnowski [22] proposed a method of using a wavelet transform to detect incorrect
AIS data. Chen et al. [23] adopted Spark technology to improve the detection efficiency of outliers.
Among the methods based on isolating outliers, Hu et al. [24] used the idea of an isolation forest
to isolate outliers, thereby avoiding tricky parameters in their trajectory outlier
detection model. In other related
research on ship anomaly detection, Belhadi et al. [25] compared the current outlier detection
methods and analyzed various trajectory outlier detection methods in depth, so that these
methods can be well understood. Riveiro et al. [26] provided an
overview of the state-of-the-art research about maritime anomaly detection from the perspective of
data, methods, systems and user aspects.
The above methods of abnormal ship behavior detection are generally designed for
offline detection; that is, they cannot detect abnormal ship behavior in real time. Although they
detect outliers well in experiments, they cannot be applied in practical engineering.
The methods based on distance measurement and mathematical modeling have been widely used for
online abnormal ship behavior detection. Both judge whether the detected object is an
outlier by measuring the distance between the object to be detected and the correct object. However,
the selection of the distance threshold significantly impacts the judgment of whether the
object to be detected is an outlier. In addition, abnormal ship behavior detection
based on mathematical modeling has poor scalability, and most other online abnormal
ship behavior detection methods only mine the position outlier, not the speed outlier. Moreover, the
anomaly threshold is challenging to determine in these methods, leading to unstable anomaly
detection. Finally, approaches based on deep learning lack explanatory power
for detecting abnormal behavior of ships.
All in all, the above traditional methods for detecting abnormal behavior of ships have certain
problems, such as being unable to perform online detection, relying heavily on the selection of
thresholds for detection results, low scalability, only mining abnormal position information of ship
trajectories, lack of interpretability, etc.
2.3. Advantage of the method based on distance measurement and an isolation mechanism
Based on the above problems, we propose an abnormal ship behavior detection method based
on distance measurement and an isolation mechanism, which can detect both the position outliers
and the speed outliers of ship trajectory points in real time. Meanwhile,
this method improves the MDL algorithm to obtain more accurate compressed ship trajectories, and
it provides a reasonable basis for the determination of abnormal speed judgment thresholds. Finally,
a strategy of selectively constructing isolated trees is proposed to improve the efficiency of detecting
abnormal behavior in ships.
For the outlier detection of the ship position information, the AIS data are first processed and
compressed by using the minimum description length criterion [27] algorithm based on
acceleration (AMDL), which reflects the real navigation information of ships with as little AIS data as
possible. Then the ship position information model is extracted from the trajectory cluster
after multi-dimensional density clustering (MDDBSCAN) [28,29]. By comparing the differences
between ship trajectory points and the ship position information model, the position outliers of the
ship can be detected in real time.
For the outlier detection of the speed information, this method uses an isolation forest algorithm [30].
First, a functional relationship between the speed weights and the abnormal speed judgment threshold is
established. Second, the speed outliers are detected and eliminated by applying the isolation forest
algorithm to obtain the correct ship speed set in a given area. Finally, the speed to be
detected is added to the correct ship speed set, and its score is calculated. The
score can be used to judge whether the speed value is an outlier. The advantage of using an isolation
forest algorithm is that it has good efficiency and can meet the needs of online detection. Meanwhile,
the algorithm has strong expansibility for outlier detection for ship behavior. On this basis, the
efficiency of the traditional isolation forest algorithm is effectively improved by selectively
constructing isolated trees, resulting in faster detection of abnormal ship speeds. The method proposed
is not only applicable to online anomalous behavior detection for ships, but it can also provide a
theoretical reference basis for the establishment of anomalous behavior detection models of other
moving targets.
In this section, some related terms and formal expressions are defined first, which mainly
include the relevant definitions of sub-trajectory similarity measurement [31] and trajectory
compression.
There are three types of distances between trajectory segments: vertical distance (𝑑⊥ ), parallel
distance (𝑑|| ), and angular distance (𝑑𝜃 ). These three types of distances are used to measure the
similarity of trajectory segments. Figure 1 shows these three distances via a formal method.
It is assumed that there are two trajectory segments in space, namely 𝐿𝑗 = 𝑠𝑗 𝑒𝑗 and 𝐿𝑖 = 𝑠𝑖 𝑒𝑖 ,
where 𝑠𝑖 and 𝑒𝑖 represent the two endpoints of the segment 𝐿𝑖 , and 𝑠𝑗 and 𝑒𝑗
represent the two endpoints of the segment 𝐿𝑗 . Here, it is assumed that the
segment 𝐿𝑗 is shorter than 𝐿𝑖 .
The vertical distance of 𝐿𝑖 and 𝐿𝑗 is defined as Formula (1), where the two endpoints (𝑠𝑗 and
𝑒𝑗 ) of segment 𝐿𝑗 are projected as 𝑝𝑠 and 𝑝𝑒 on the segment 𝐿𝑖 . At the same time, the Euclidean
distance from the point 𝑠𝑗 to 𝑝𝑠 is 𝑙⊥1 , and the Euclidean distance from the point 𝑒𝑗 to 𝑝𝑒 is 𝑙⊥2 .
\[
d_{\perp}(l_i, l_j) = \frac{l_{\perp 1}^{2} + l_{\perp 2}^{2}}{l_{\perp 1} + l_{\perp 2}} \tag{1}
\]
The parallel distance of 𝐿𝑖 and 𝐿𝑗 is defined as Formulas (2)–(4), where the two endpoints (𝑠𝑗
and 𝑒𝑗 ) of segment 𝐿𝑗 are projected as 𝑝𝑠 and 𝑝𝑒 on the segment 𝐿𝑖 . Here, 𝑙||1 is the smaller of the
Euclidean distances from 𝑝𝑠 to the endpoints 𝑠𝑖 and 𝑒𝑖 , and 𝑙||2 is the smaller of the Euclidean
distances from 𝑝𝑒 to the endpoints 𝑒𝑖 and 𝑠𝑖 .
𝑑|| (𝑙𝑖 , 𝑙𝑗 ) = 𝑚𝑖𝑛( 𝑙||1 , 𝑙||2 ) (2)
𝑙||1 = 𝑚𝑖𝑛( 𝑑(𝑠𝑖 , 𝑝𝑠 ), 𝑑(𝑒𝑖 , 𝑝𝑠 )) (3)
𝑙||2 = 𝑚𝑖𝑛( 𝑑(𝑒𝑖 , 𝑝𝑒 ), 𝑑(𝑠𝑖 , 𝑝𝑒 )) (4)
The angular distance is defined as Formula (5). The angle of 𝐿𝑖 and 𝐿𝑗 is 𝜃(0 ≤ 𝜃 ≤ 𝜋).
Generally, angle 𝜃 selects the smaller angle between 𝐿𝑖 and 𝐿𝑗 . |𝑙𝑗 | represents the length of the
line segment 𝐿𝑗 .
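\[
d_{\theta} =
\begin{cases}
|l_j| \times \sin\theta, & 0 \le \theta \le \frac{\pi}{2} \\
|l_j|, & \frac{\pi}{2} \le \theta \le \pi
\end{cases} \tag{5}
\]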
The angular distance is usually used for the trajectory segment with direction. When dealing
with the trajectory segment without direction, the angular distance can be simply defined as
|𝑙𝑗 | × 𝑠𝑖𝑛 𝜃.
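The three distances in Formulas (1)–(5) can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: points are planar `(x, y)` tuples, the endpoints of 𝐿𝑗 are projected onto the infinite line through 𝐿𝑖, and the function names are hypothetical rather than taken from the paper.

```python
import math

def _project(p, s, e):
    """Project point p onto the line through s and e (assumption: not clamped to the segment)."""
    dx, dy = e[0] - s[0], e[1] - s[1]
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        return s
    t = ((p[0] - s[0]) * dx + (p[1] - s[1]) * dy) / seg_len_sq
    return (s[0] + t * dx, s[1] + t * dy)

def perpendicular_distance(si, ei, sj, ej):
    """Formula (1): (l1^2 + l2^2) / (l1 + l2) for the two projection offsets."""
    ps, pe = _project(sj, si, ei), _project(ej, si, ei)
    l1, l2 = math.dist(sj, ps), math.dist(ej, pe)
    if l1 + l2 == 0:
        return 0.0
    return (l1 ** 2 + l2 ** 2) / (l1 + l2)

def parallel_distance(si, ei, sj, ej):
    """Formulas (2)-(4): smaller distance from each projection to the endpoints of Li."""
    ps, pe = _project(sj, si, ei), _project(ej, si, ei)
    l1 = min(math.dist(si, ps), math.dist(ei, ps))
    l2 = min(math.dist(ei, pe), math.dist(si, pe))
    return min(l1, l2)

def angular_distance(si, ei, sj, ej):
    """Formula (5): |Lj| * sin(theta) up to pi/2, otherwise |Lj|."""
    vi = (ei[0] - si[0], ei[1] - si[1])
    vj = (ej[0] - sj[0], ej[1] - sj[1])
    len_i, len_j = math.hypot(*vi), math.hypot(*vj)
    if len_i == 0 or len_j == 0:
        return 0.0
    cos_t = (vi[0] * vj[0] + vi[1] * vj[1]) / (len_i * len_j)
    theta = math.acos(max(-1.0, min(1.0, cos_t)))  # clamp against rounding
    return len_j * math.sin(theta) if theta <= math.pi / 2 else len_j
```

For example, with 𝐿𝑖 from (0, 0) to (10, 0) and 𝐿𝑗 from (2, 1) to (6, 3), the projection offsets are 1 and 3, so the vertical distance is (1 + 9) / (1 + 3) = 2.5.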
the AMDL algorithm; additionally, 𝑙𝑒𝑛(𝑝𝑖 , 𝑝𝑗 ) represents the Euclidean distance between two
trajectory points.
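\[
L(H) = \sum_{i=1}^{par_i - 1} \log_2\big(len(p_{c_i}, p_{c_{i+1}})\big) \tag{6}
\]
\[
L(D|H) = \sum_{i=1}^{par_i - 1} \sum_{k=c_i}^{c_{i+1}-1} \Big\{ \log_2\big(d_{\perp}(p_{c_i}p_{c_{i+1}}, p_k p_{k+1})\big) + \log_2\big(d_{\theta}(p_{c_i}p_{c_{i+1}}, p_k p_{k+1})\big) \Big\} \tag{7}
\]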
The no-segmentation cost, i.e., 𝐴𝑀𝐷𝐿𝑛𝑜𝑝𝑎𝑟, is the total length of the trajectory from point 𝑝𝑖 to
point 𝑝𝑗 , and it is given by Formula (8).
\[
AMDL_{nopar} = \sum_{i=1}^{j-1} len(p_i, p_{i+1}) \tag{8}
\]
This paper presents an abnormal ship behavior detection method based on multi-dimensional
density clustering and an abnormal isolation mechanism. First, massive AIS data must be
compressed, so this study adopts the AMDL algorithm. Second, the MDDBSCAN algorithm is
applied to the compressed data, and the trajectory cluster is divided into 10 grids.
Then, a position information model of the ship trajectory is extracted on each grid. By measuring the
distance difference between the point to be detected and the correct model, the method can judge
whether the ship’s position is abnormal. Thirdly, in each grid, the isolation forest algorithm based on
selectively constructing isolated trees is used to remove the abnormal speed points of ships to extract
the correct speed set. By calculating the abnormal score value of the speed to be detected in the
speed set, the method can judge whether the ship speed is abnormal. During the process, the
connection between the velocity weights in MDDBSCAN and anomaly thresholds is established,
providing a reasonable basis for determining speed anomaly thresholds. The detection flow chart is
shown in Figure 2.
This paper’s data compression method is based on the AMDL algorithm. The core idea of the
AMDL algorithm is to extract feature points from a trajectory. Trajectory
compression by this method has two desirable properties: accuracy and simplicity. Accuracy means that the
trajectory after segmentation retains the characteristics of the trajectory before segmentation
as much as possible, while simplicity means that the feature points extracted from
the original trajectory should be as few as possible. Therefore, the AMDL algorithm process mainly
includes two parts: judging whether there are trajectory points with the positive and negative
transformations of front and rear acceleration in the segmented trajectory point set and judging
whether to segment the trajectory. In the AMDL algorithm, 𝑝i represents one trajectory point, 𝑝vi
represents trajectory points of positive or negative transformation of front and rear acceleration, and
𝑝𝑐i represents characteristic points of the ship trajectory selected by the AMDL algorithm.
Algorithm 1 AMDL
Input: One trajectory (𝑝1, 𝑝2, …, 𝑝𝑛)
Output: All feature points of the trajectory (𝑝𝑐1, 𝑝𝑐2, …, 𝑝𝑐𝑝𝑎𝑟𝑖)
1: Add P1 into the set CP; /*the start point*/
2: startIndex = 1, length = 1;
3: while startIndex + length ≤ n do
4:   currIndex = startIndex + length;
5:   Add all points from startIndex to currIndex to the set temp
6:   Check if 𝑝vi exists in the temp set
7:   if non-existent then
8:     costpar = AMDLpar(PstartIndex, PcurrIndex);
9:     costnopar = AMDLnopar(PstartIndex, PcurrIndex);
10:    if costpar > costnopar then
11:      Add the PcurrIndex-1 point to the set CP;
12:      startIndex = currIndex - 1, length = 1;
13:    else
14:      length = length + 1;
15:  else
16:    mark the point as PcurrIndex
17:    Add the PcurrIndex point to the set CP;
18:    startIndex = currIndex;
19:    length = 1;
20: Add Pn to the set CP
In Algorithm 1, the first two lines are the initialization operations of the algorithm. The process
from the third line to the end aims to find characteristic points of the ship trajectory. During the
process, the AMDL algorithm calculates the partition cost (costpar) and the no partition cost
(costnopar) for each trajectory point. If costpar is greater than costnopar, then the previous point of
that point will be selected as a characteristic point. Meanwhile, 𝑝vi will also be selected as the
characteristic point.
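The control flow of Algorithm 1 can be sketched in Python as follows. This is a minimal sketch, not the authors' implementation: it uses 0-based indices, assumes uniformly sampled speeds so that acceleration is approximated by successive speed differences, takes the two cost functions as hypothetical callables `cost_par` and `cost_nopar`, and adds a `length > 1` guard (an assumption) so that the loop always makes progress.

```python
from typing import Callable, List, Sequence, Set

def acceleration_sign_changes(speeds: Sequence[float]) -> Set[int]:
    """Indices k where the acceleration before and after point k has opposite sign.

    Assumption: samples are uniformly spaced, so acceleration ~ speed difference.
    """
    acc = [b - a for a, b in zip(speeds, speeds[1:])]
    return {k for k in range(1, len(speeds) - 1) if acc[k - 1] * acc[k] < 0}

def amdl_feature_points(
    n: int,
    turning: Set[int],
    cost_par: Callable[[int, int], float],
    cost_nopar: Callable[[int, int], float],
) -> List[int]:
    """Return indices of feature points for a trajectory of n points."""
    if n == 0:
        return []
    cp = [0]                      # the start point is always kept
    start, length = 0, 1
    while start + length < n:
        curr = start + length
        if curr in turning:
            # An acceleration sign change is forcibly retained.
            cp.append(curr)
            start, length = curr, 1
        elif length > 1 and cost_par(start, curr) > cost_nopar(start, curr):
            # Partitioning is more expensive: keep the previous point.
            cp.append(curr - 1)
            start, length = curr - 1, 1
        else:
            length += 1
    if cp[-1] != n - 1:
        cp.append(n - 1)          # the end point is always kept
    return cp
```

In practice the two callables would wrap Formulas (6)–(8); they are left abstract here so that the loop itself can be checked in isolation.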
For the detection of ship trajectory position point outliers, it is necessary to carry out
multi-dimensional density clustering on the compressed AIS data. The MDDBSCAN algorithm is
proposed to solve the above problem. At the same time, the trajectory cluster needs to be meshed,
and then a correct model of ship position information is extracted on each grid.
The traditional DBSCAN algorithm only considers the Euclidean distance between points. The
clustering object of the MDDBSCAN algorithm in this paper is the sub-trajectory, and the clustering
process adopts the idea of DBSCAN [28,29]. In this process, the clustering objects are
sub-trajectories, that is, $Sub_i = (pc_i\, pc_j)$, and the sub-trajectory velocity can be represented by
$V_{Sub_i} = \frac{1}{2}(V_{pc_i} + V_{pc_j})$. At the same time, the similarity distance of sub-trajectories is calculated
by $Dist(Sub_i, Sub_j) = \omega_{\perp} d_{\perp}(Sub_i, Sub_j) + \omega_{||} d_{||}(Sub_i, Sub_j) + \omega_{\theta} d_{\theta}(Sub_i, Sub_j) + \omega_v V_{Sub_i}$.
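As a small illustration, the weighted similarity distance can be written as a plain function. The sketch below follows the expression as it is printed in the text, including the velocity term on 𝑆𝑢𝑏𝑖 alone; the function names, the default weights of 1.0 and the precomputed distance arguments are assumptions made for the example.

```python
def sub_trajectory_speed(v_start: float, v_end: float) -> float:
    """Sub-trajectory velocity: mean of the speeds at its two feature points."""
    return 0.5 * (v_start + v_end)

def similarity_distance(
    d_perp: float,
    d_par: float,
    d_theta: float,
    v_sub_i: float,
    w_perp: float = 1.0,   # assumed default weights, not values from the paper
    w_par: float = 1.0,
    w_theta: float = 1.0,
    w_v: float = 1.0,
) -> float:
    """Weighted sum of the three segment distances and the velocity term."""
    return w_perp * d_perp + w_par * d_par + w_theta * d_theta + w_v * v_sub_i
```

For instance, with distances 2.5, 2.0 and 2.0, endpoint speeds 4 and 6, and a velocity weight of 0.5, the result is 6.5 + 0.5 × 5 = 9.0.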
Algorithm 2 MDDBSCAN
Input: (1) Sub-trajectory set 𝐷 = {Sub1, Sub2, …, Sub𝑛}
       (2) Neighborhood radius ε, minimum number of entities (𝑀𝑖𝑛𝑆𝑢𝑏𝑠)
Output: Clusters set 𝑆 = {𝑠1, 𝑠2, …, 𝑠𝑛}
/*STEP 1*/
1: clusterID = 0; /*one initial id*/
2: Mark all sub-trajectories as unclassified
3: for each (𝑆𝑢𝑏𝑖 ∈ 𝐷) do
4:   if 𝑆𝑢𝑏𝑖 is not classified then
5:     Compute 𝑁ε(𝑆𝑢𝑏𝑖) /*find sub-trajectory 𝑆𝑢𝑏𝑖 neighborhood*/
6:     if |𝑁ε(𝑆𝑢𝑏𝑖)| ≥ 𝑀𝑖𝑛𝑆𝑢𝑏𝑠 then
7:       allocate clusterID to ∀𝑆𝑢𝑏 ∈ 𝑁ε(𝑆𝑢𝑏𝑖);
8:       put 𝑁ε(𝑆𝑢𝑏𝑖) − {𝑆𝑢𝑏𝑖} into queue Q;
/* STEP 2 */
9:       ExpandCluster (Q, clusterID, ε, 𝑀𝑖𝑛𝑆𝑢𝑏𝑠)
10:      clusterID = clusterID + 1;
11:    else
12:      mark 𝑆𝑢𝑏𝑖 as noised sub-trajectory;
/*STEP 3*/
13: for each (𝑠𝑖 ∈ 𝑆) do
14:   if |𝑠𝑖| < 𝑀𝑖𝑛𝑆𝑢𝑏𝑠 then
15:     remove 𝑠𝑖 from 𝑆;
/*STEP 2 find density connection set*/
16: ExpandCluster (Q, clusterID, ε, 𝑀𝑖𝑛𝑆𝑢𝑏𝑠) {
17:   while Q ≠ ∅ do
18:     Define 𝑀 as the first sub-trajectory to be checked in Q;
19:     Compute 𝑁ε(𝑀);
20:     if |𝑁ε(𝑀)| ≥ 𝑀𝑖𝑛𝑆𝑢𝑏𝑠 then
21:       for each (X ∈ 𝑁ε(𝑀)) do
22:         if X is not classified or is noised then
23:           allocate clusterID to X;
24:         if X is not classified then
25:           put X into queue Q;
26:     remove 𝑀 from queue Q;
27: }
In Algorithm 2, STEP 1 includes two parts: algorithm initialization and searching for the
neighborhood of sub-trajectories. STEP 2 aims to find the density connection set and STEP 3 aims to
form clusters and remove noise data.
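The three steps can be sketched as a generic density clustering routine in Python. This is a simplified sketch under stated assumptions: `items` can be any objects, `dist` is a caller-supplied distance function (for MDDBSCAN it would be the weighted sub-trajectory distance), neighborhoods are found by brute force, and items already assigned to an earlier cluster are not reassigned. The name `md_dbscan` and the label constants are hypothetical.

```python
from collections import Counter, deque

UNCLASSIFIED, NOISE = -2, -1

def md_dbscan(items, dist, eps, min_subs):
    """Return one label per item: a cluster id >= 0, or NOISE (-1)."""
    n = len(items)
    labels = [UNCLASSIFIED] * n

    def neighborhood(i):
        # Brute-force eps-neighborhood (includes i itself).
        return [j for j in range(n) if dist(items[i], items[j]) <= eps]

    cluster_id = 0
    for i in range(n):                       # STEP 1
        if labels[i] != UNCLASSIFIED:
            continue
        neigh = neighborhood(i)
        if len(neigh) < min_subs:
            labels[i] = NOISE
            continue
        queue = deque()
        for j in neigh:
            if labels[j] == UNCLASSIFIED and j != i:
                queue.append(j)
            if labels[j] in (UNCLASSIFIED, NOISE):
                labels[j] = cluster_id
        while queue:                         # STEP 2: expand the cluster
            m = queue.popleft()
            m_neigh = neighborhood(m)
            if len(m_neigh) >= min_subs:
                for x in m_neigh:
                    if labels[x] == UNCLASSIFIED:
                        queue.append(x)
                    if labels[x] in (UNCLASSIFIED, NOISE):
                        labels[x] = cluster_id
        cluster_id += 1

    # STEP 3: discard clusters with fewer than min_subs members.
    sizes = Counter(label for label in labels if label >= 0)
    return [l if l >= 0 and sizes[l] >= min_subs else NOISE for l in labels]
```

A quick check on one-dimensional values with the absolute difference as the distance yields two clusters and one noise item.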
After completing the multi-dimensional density clustering of sub-trajectories, it is necessary to
establish a correct model of ship position information in each cluster. First, the trajectory cluster
needs to be meshed. In this work, the sub-trajectory cluster is divided into 10 grids, that is, 10
detection models are generated. Then, the center vector is established in each grid, and the center
vector will be used as the detection benchmark to judge whether the position information is abnormal.
The center vector is defined as follows: 𝐶𝑉 = (𝑎𝑣𝑔𝑋, 𝑎𝑣𝑔𝑌, 𝑚𝑒𝑑𝑖𝑢𝑚𝐷).
𝑎𝑣𝑔𝑋 denotes the average X coordinate of all trajectory points in some grid. 𝑎𝑣𝑔𝑌 denotes
the average Y coordinate of all trajectory points in some grid. 𝑚𝑒𝑑𝑖𝑢𝑚𝐷 denotes the medium
distance, i.e., the mean distance from the trajectory points in the grid to (𝑎𝑣𝑔𝑋, 𝑎𝑣𝑔𝑌). Let all
sub-trajectories in the grid be represented as the set {sub1, sub2, …, sub𝑛}, and let the set
{𝑝𝑐1, 𝑝𝑐2, …, 𝑝𝑐𝑛} denote all trajectory points. Then the components of the
CV are calculated as follows.
1) average X coordinate: $avgX = \frac{\sum_{i=1}^{n} pc_i.x}{n}$;
2) average Y coordinate: $avgY = \frac{\sum_{i=1}^{n} pc_i.y}{n}$;
3) medium distance: $mediumD = \frac{\sum_{i=1}^{n} len(pc_i, (avgX, avgY))}{n}$, where 𝑙𝑒𝑛 denotes the Euclidean
distance between two points.
After the center vector is determined in the grid of each sub-trajectory, the abnormal position of
ship trajectory points can be judged. The detection idea is to measure the relative distance between
the point to be detected and the center vector. If the relative distance exceeds the threshold range, it is
considered that the position of the trajectory point is abnormal. If the relative distance is within the
threshold, the position of the current point is considered normal. The formula of the relative distance
between the point to be detected and the center vector is given by Formula (9).
\[
CRD(p, CV) = \frac{len(p, (CV.avgX, CV.avgY))}{CV.mediumD} \tag{9}
\]
In Formula (9), p is the point to be detected. When CRD > 1, the distance from the point to be
detected to the center vector is greater than the average distance from all points in the grid area to the
center vector. When CRD = 1, it is explained that the distance from the point to be detected to the
center vector is equal to the average of the distance from all points in the grid area to the center
vector. When CRD < 1, the distance from the point to be detected to the center vector is less than the
average distance from all points in the grid area to the center vector.
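The center vector and the relative distance of Formula (9) can be sketched in a few lines of Python. The sketch assumes planar `(x, y)` tuples and at least two distinct points per grid, so that the mean distance is non-zero; the function names are illustrative only.

```python
import math

def center_vector(points):
    """Return (avgX, avgY, mediumD) for the trajectory points of one grid."""
    n = len(points)
    avg_x = sum(p[0] for p in points) / n
    avg_y = sum(p[1] for p in points) / n
    # mediumD is computed as the mean distance to the center, as in the formula above.
    medium_d = sum(math.dist(p, (avg_x, avg_y)) for p in points) / n
    return avg_x, avg_y, medium_d

def crd(p, cv):
    """Relative distance of point p to the center vector (Formula (9))."""
    avg_x, avg_y, medium_d = cv
    return math.dist(p, (avg_x, avg_y)) / medium_d
```

For the four corners of a square with side 2, the center is (1, 1) and every corner has a relative distance of about 1, whereas a point farther away such as (3, 1) has a relative distance greater than 1.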
In determining the threshold for CRD(p, CV), we assume that the distances from all points in the
sub-trajectory grid to the center vector approximately follow a normal distribution; the
three-standard-deviations criterion is then used to determine the threshold, as given by
Formula (10). When 𝑦𝑝𝑖 = 0, it indicates that the position of the point to be detected is normal.
When 𝑦𝑝𝑖 = 1, it indicates that the position of the point to be detected is abnormal.
In the MDDBSCAN algorithm, the method takes into account the ship's speed factor in the
similarity measure of the trajectory cluster. So, the extracted ship behavior modeling in the
sub-trajectory cluster is more accurate, and the iForest algorithm which is used to detect abnormal
ship speed can achieve better results after removing noise data that considers the speed factor.
For speed outlier detection for ships, the isolation forest algorithm based on selectively
constructing isolated trees is used to extract the correct ship speed set. The algorithm has high
efficiency and is suitable for the online detection of abnormal behavior of ships. At the same time, in
the process of extracting the correct speed set for ships and removing speed outliers, the algorithm
does not need to consider the distribution of the original data. The isolation forest algorithm is more
suitable for smaller data sets [32]. As described in the previous section, the trajectory cluster has
been divided into ten grids. To help the isolation forest algorithm achieve better results, each of
these grids is further divided into four grids. In this way, we can remove the
abnormal speeds of ships in each grid to obtain the correct speed set for ships in each grid. When it is
necessary to detect whether the speed of the ship is abnormal, the method will add the speed of the
ship to be detected to the correct set of ship speed in the grid at the corresponding position, as well as
determine whether the speed is abnormal by calculating the abnormal score value of the speed to be
detected.
iForest is similar to a decision tree and random forest. iForest is composed of isolated trees
(iTree). iTree uses random binary trees. Each node connects two child nodes or directly connects a
leaf node. Randomly sampling partial data to construct an isolated tree can ensure a difference
between different trees. To build an isolated tree, we need to select a feature (speed is selected here)
and randomly select a segmentation value to recursively segment the data set until the maximum
height limit of the tree is met or the number of samples of the tree nodes is only one. The maximum
height limit (h) of the tree is related to the number of sub-samples (φ): ℎ = 𝑐𝑒𝑖𝑙𝑖𝑛𝑔(𝑙𝑜𝑔2(𝜙)).
Generally, when dividing left and right sub-trees, the isolation forest algorithm randomly selects
a number between the minimum and maximum values from the data set as the segmentation value.
Samples smaller than the segmentation value will be divided into the left sub-tree, and samples larger
than the segmentation value will be divided into the right sub-tree. Due to the randomness of the
selected segmentation value, there will be differences in the ability of each isolated tree to
distinguish outliers. For the identification of abnormal data, it is expected that the segmentation value
can maximize the difference between the number of nodes in the left sub-tree and the right sub-tree,
to improve the convergence speed of the algorithm. Therefore, we propose an algorithm for
selectively building an isolated tree. The algorithm flow for constructing the isolated tree of ship
speed is shown in Algorithm 3. In Algorithm 3, Ratio represents the ratio of the number of samples
divided into the left (right) sub-tree to the number of samples divided into the right (left)
sub-tree during the first division, X represents the input data set, e denotes the current height of
the tree and l denotes the maximum height limit of the tree.
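To make the idea concrete, the following Python sketch shows one possible reading of the selective strategy; it is not a transcription of Algorithm 3. It assumes that a tree is kept only when the first division is sufficiently unbalanced (Ratio of at least `min_ratio`) and that otherwise construction of that candidate stops and a new segmentation value is drawn, up to `max_tries` times. The names `Node`, `grow`, `build_selective_itree`, `min_ratio` and `max_tries` are hypothetical.

```python
import math
import random
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    size: int                        # samples that reached this node
    split: Optional[float] = None    # segmentation value; None marks a leaf
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def grow(X: List[float], e: int, l: int, rng: random.Random) -> Node:
    """Ordinary recursive iTree growth on a single feature (speed)."""
    if e >= l or len(X) <= 1 or min(X) == max(X):
        return Node(size=len(X))
    q = rng.uniform(min(X), max(X))
    left = [x for x in X if x < q]
    right = [x for x in X if x >= q]
    return Node(len(X), q, grow(left, e + 1, l, rng), grow(right, e + 1, l, rng))


def build_selective_itree(X: List[float], min_ratio: float,
                          rng: random.Random, max_tries: int = 50) -> Optional[Node]:
    """Keep a tree only if its first division is unbalanced enough (assumed rule)."""
    if len(X) < 2 or min(X) == max(X):
        return Node(size=len(X))
    l = math.ceil(math.log2(len(X)))  # maximum height limit
    for _ in range(max_tries):
        q = rng.uniform(min(X), max(X))
        left = [x for x in X if x < q]
        right = [x for x in X if x >= q]
        small, large = sorted((len(left), len(right)))
        if small == 0:
            continue  # degenerate split: nothing was separated
        if large / small >= min_ratio:
            return Node(len(X), q, grow(left, 1, l, rng), grow(right, 1, l, rng))
        # Ratio too small: stop building this candidate and try another value.
    return None


speeds = [10.1, 10.3, 9.8, 10.0, 10.2, 9.9, 10.4, 25.0]  # toy speeds, one outlier
tree = build_selective_itree(speeds, min_ratio=3.0, rng=random.Random(0))
```

With the toy speeds above, almost every segmentation value falls between the bulk of the data and the value 25.0, so the first division is highly unbalanced and the tree is accepted quickly.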
It is usually necessary to build 100 such isolated trees to construct an isolated forest (iForest).
Outliers in iForest are judged by the average height (i.e., path length) of a sample over the 100
trees, and the average height of outliers is usually low. For iForest, given a data set
containing n samples, the average path length of the tree is as given by Formula (11).
$$c(n) = 2H(n-1) - \frac{2(n-1)}{n} \tag{11}$$
H(i) is the harmonic number, which can be estimated as ln(i) + 0.5772156649. c(n) is the average
path length for a given number of samples n, which is used to standardize the path length h(x) of
sample x. The path length h(x) of sample point x is the number of edges from the root node to
the leaf node of iTree. The algorithm flow for calculating h(x) is as follows.
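As a hedged illustration (not the authors' listing), Formula (11) and the edge-counting definition of h(x) can be sketched in Python. The dictionary-based tree layout, the toy tree and the function names are assumptions made for this example. The standard iForest additionally adds a c(size) correction at leaves that still hold several samples; it is omitted here to match the definition given in the text.

```python
import math

EULER_GAMMA = 0.5772156649  # constant quoted in the text


def harmonic(i: int) -> float:
    """H(i), estimated as ln(i) + 0.5772156649."""
    return math.log(i) + EULER_GAMMA


def c(n: int) -> float:
    """Formula (11): average path length for n samples."""
    if n <= 1:
        return 0.0  # assumption: a single sample needs no split
    return 2.0 * harmonic(n - 1) - 2.0 * (n - 1) / n


def path_length(x: float, node: dict) -> int:
    """Count the edges from the root to the leaf that x falls into."""
    edges = 0
    while node["split"] is not None:
        node = node["left"] if x < node["split"] else node["right"]
        edges += 1
    return edges


# Hand-built toy tree: the first split isolates speeds of 15.0 and above.
LEAF = {"split": None}
toy_tree = {
    "split": 15.0,
    "left": {"split": 10.0, "left": LEAF, "right": LEAF},
    "right": LEAF,
}

print(round(c(256), 4))             # normalizer for 256 samples
print(path_length(25.0, toy_tree))  # isolated after one edge
print(path_length(9.5, toy_tree))   # needs two edges
```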
Outlier detection using iForest is performed by calculating the score of sample x. The score of
sample x is defined as Formula (12).
$$s(x, n) = 2^{-\frac{E(h(x))}{c(n)}} \tag{12}$$
E(h(x)) represents the average path length of sample x over all iTrees in the isolated forest, and
c(n), the average path length for a given number of samples n, standardizes the path length h(x) of
sample x. The relationship between the score s and E(h(x)) is shown in Figure 3.
It can be seen from Figure 3 that when E(h(x)) → c(n), s → 0.5; that is, when the average
path length of sample x is close to the average path length of the iTrees, it is difficult to
distinguish whether it is an outlier or not. When E(h(x)) → 0, s → 1; that is, when the score of x
is close to 1, it is determined to be abnormal. When E(h(x)) → n − 1, s → 0, and x is determined to
be normal.
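The three limiting cases can be checked numerically with a short Python sketch of Formula (12). The function name `anomaly_score` and the choice of n = 256 (for which c(n) is approximately 10.2448 under the estimate of H given above) are assumptions made for illustration.

```python
from statistics import mean
from typing import Sequence


def anomaly_score(path_lengths: Sequence[float], c_n: float) -> float:
    """Formula (12): s = 2 ** (-E(h(x)) / c(n)), with E taken as the mean."""
    return 2.0 ** (-mean(path_lengths) / c_n)


C_N = 10.2448  # approximate c(256); an illustrative value

print(anomaly_score([C_N, C_N], C_N))      # E(h(x)) = c(n): ambiguous case
print(anomaly_score([0, 0], C_N))          # E(h(x)) -> 0: abnormal
print(anomaly_score([255, 255], C_N))      # E(h(x)) -> n - 1: normal
print(anomaly_score([2, 3, 2], C_N))       # short paths give a high score
```

The first three calls reproduce the limits s = 0.5, s = 1 and s close to 0, and the last shows that a sample isolated after two or three edges scores well above 0.5.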
Whether a data point is judged to be an outlier depends on the threshold applied to its score. If
the threshold is too high, some speed outliers in the data set will go undetected. If the threshold
is too low, normal data may be misjudged as abnormal. Here, the threshold is determined from the
relationship between the speed weight and the noise points in multi-dimensional density clustering.
Usually, multi-dimensional density clustering is mainly used to measure the position differences
of ship sub-trajectories, and its speed factor has little effect on the positions of trajectory points.
Therefore, the value of its velocity weight should generally not exceed the reciprocal of the number
of dimensions. If the number of noise points decreases markedly as the speed weight increases,
the ship speeds in the data set are relatively uniform. In that case, the speed weight should be
given a smaller value, and the threshold for judging a speed as an outlier should not
be too large. If the ship speed changes significantly in the data set, the speed weight can be
increased appropriately, but it should not exceed the reciprocal of the number of dimensions of
multi-dimensional density clustering. After selecting the appropriate speed weight, the relationship
between the threshold and the speed weight is defined as shown in Formula (13).
$$\mathrm{score}(\omega) = \frac{k}{1 + e^{-\omega}} \tag{13}$$
Here, ω (written as w in the text) is the velocity weight and k is the harmonic number. When the
score of the ship speed at a given moment is greater than score(w), the ship speed is abnormal;
otherwise, it is normal.
When $y_{v_i} = 1$, the speed is abnormal. When $y_{v_i} = 0$, the speed is normal. After removing
the abnormal speed points of ships and extracting the correct speed set in each grid, Formula (14)
judges whether a ship speed is abnormal by calculating the anomaly score of the speed to be detected
within the correct speed set.
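The decision rule around Formulas (13) and (14) can be sketched in Python as follows. This is an illustration under stated assumptions: the indicator is taken to equal 1 when the anomaly score exceeds score(w) and 0 otherwise, as the surrounding sentences describe, and the value of k is not given in this section, so it is left as a parameter. The function names are hypothetical.

```python
import math


def score_threshold(w: float, k: float) -> float:
    """Formula (13): score(w) = k / (1 + exp(-w))."""
    return k / (1.0 + math.exp(-w))


def speed_label(s: float, w: float, k: float) -> int:
    """Assumed form of the indicator y: 1 = abnormal, 0 = normal."""
    return 1 if s > score_threshold(w, k) else 0


# k = 1.0 is a placeholder, not a value taken from the paper.
for w in (0.15, 0.2, 0.25):
    print(w, round(score_threshold(w, k=1.0), 4))
```

Because the logistic term increases with w, a larger speed weight yields a higher threshold for a fixed positive k, which matches the guidance that a small speed weight should go together with a threshold that is not too large.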
In summary, Formula (13) provides a reasonable basis for choosing the threshold used to judge a
speed as abnormal. Meanwhile, the strategy of selectively constructing iTrees accelerates the
detection of abnormal ship speed.
During the experiments, the original AIS data were first preprocessed (data quality improvement,
data compression, etc.), and then multiple detection methods for abnormal ship behavior (ship
position outliers and speed outliers) were compared on four metrics (recall, precision, F1
score and accuracy). Finally, we carried out ablation experiments. The experimental hardware
environment in this study was an Intel® Core™ i7-8700 octa-core CPU (3.20 GHz) with 8 GB RAM; the
software experimental environment was Windows 10, Python 3.8 and JDK 1.8.
The data [33] selected in the experiment were the AIS data of a passenger ship near Xiamen
port (the whole journey is about 18 km), a total of 40011 data points. The spatial area spans
longitude 117.77 to 118.63 and latitude 24.09 to 24.69, and the time range is from November 29, 2018
to January 3, 2019. However, unprocessed raw AIS data may have quality issues that affect the
construction of abnormal ship behavior detection models [34]. Therefore, we studied and processed
the raw AIS data, including interpolating trajectory breakpoints using the cubic spline method and
identifying and removing abnormal AIS data (abnormal stop points, abnormal acceleration points,
abnormal drift points and abnormal turning points), to enhance the continuity and integrity of the
AIS data and improve its quality [3]. Figure 4 shows the number of ship trajectory points after data
preprocessing, MDL compression and AMDL compression.
As seen in Figure 4, the number of data points differs greatly before and after data preprocessing.
If the abnormal data caused by AIS equipment faults are not removed, the analysis of abnormal ship
behavior will be significantly affected. The AMDL compression algorithm is an improvement of the MDL
algorithm: on top of MDL, it forcibly retains the points where the acceleration changes between
positive and negative. Therefore, the AMDL algorithm can better reflect the real characteristics of
ship motion. The comparison of trajectory points before and after compression is shown in Figure 5.
It can be seen that the compressed trajectory points strike a good balance between accuracy and
simplicity.
Figure 4. Comparison of the number of points before and after data processing.
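The extra retention rule of AMDL described above can be illustrated with a small Python sketch. It covers only the detection of points where the acceleration changes sign, not the MDL compression itself, and it assumes that acceleration is estimated by finite differences of speed over strictly increasing timestamps. The function name and the sample values are hypothetical.

```python
from typing import List


def acceleration_sign_changes(times: List[float], speeds: List[float]) -> List[int]:
    """Indices of points where acceleration flips between positive and negative."""
    # Acceleration on each segment between consecutive trajectory points.
    acc = [
        (speeds[i + 1] - speeds[i]) / (times[i + 1] - times[i])
        for i in range(len(speeds) - 1)
    ]
    keep = []
    for i in range(1, len(acc)):
        # Point i joins segment i-1 and segment i; a negative product means
        # the sign flipped there. Zero acceleration is not treated as a flip.
        if acc[i - 1] * acc[i] < 0:
            keep.append(i)
    return keep


times = [0, 10, 20, 30, 40, 50]    # seconds (toy values)
speeds = [10, 11, 12, 11, 10, 12]  # knots (toy values)
print(acceleration_sign_changes(times, speeds))  # [2, 4]
```

In a compression pass, the returned indices would be added to the set of points that the compressor is not allowed to drop.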
Ship behavior anomaly detection was evaluated in terms of recall, precision, F1 score and accuracy.
Since there is no official standard AIS data set, to label the data as correctly as possible, the
noise points of multi-dimensional density clustering were marked as abnormal trajectory points and
the rest were marked as normal trajectory points. There were 935 abnormal trajectory points and
14798 normal trajectory points. Here, the trajectory outlier detection method (MDDBSCAN) in this
paper is compared with the TODLOF trajectory outlier detection method [20] and the isolation-based
trajectory outlier detection (IBTOD) method [24] in terms of detection rate and false alarm rate.
The confusion matrices for the detection results for normal and abnormal trajectory points are shown
in Figures 6–8. Meanwhile, the MDDBSCAN method was compared with TODLOF, IBTOD, graph attention
network (GAT) [35], long short-term memory (LSTM) and feature fusion [36] methods, and the results
are shown in Figure 9 and Table 1.
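For reference, the four metrics can be computed from a confusion matrix as in the Python sketch below, which uses the standard definitions with abnormal points as the positive class. The function name and the toy counts are illustrative and are not taken from the experiments.

```python
from typing import Dict


def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Recall, precision, F1 and accuracy, with 'abnormal' as the positive class."""
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {"recall": recall, "precision": precision, "f1": f1, "accuracy": accuracy}


# Toy counts: 10 abnormal points (8 found, 2 missed), 90 normal points (2 false alarms).
print(classification_metrics(tp=8, fp=2, fn=2, tn=88))
```

With heavily imbalanced labels such as 935 abnormal against 14798 normal points, accuracy alone can look high even when many outliers are missed, which is why recall, precision and F1 are reported alongside it.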
Table 1. The ship position anomaly detection results for different methods.
[Figure 9. Bar chart: Value (0.6 to 0.85) by Recall, Precision, F1 and Accuracy for each method.]
Table 1 and Figure 9 show that the MDDBSCAN method outperforms the other methods. The reasons are as
follows. TODLOF is based on the local outlier factor algorithm, which requires the detected data to
exhibit an obvious density difference. However, for a ship trajectory with a fixed round-trip
destination, such a density difference cannot always be ensured, which limits the application
scenarios of the algorithm. IBTOD is based on the isolation forest algorithm, which works best on
small data sets: with a large number of samples, normal samples interfere with the isolation process
and reduce the ability to isolate outliers. The algorithm also assumes that abnormal samples make up
a tiny fraction of the data, so its application scenarios are likewise relatively limited. The GAT,
LSTM and feature fusion methods are all based on deep learning. Their detection capability depends
on the quality of the training data set and on appropriate hyper-parameters, and their detection
effect is unstable.
For the detection and analysis of ship speed outliers, 246 ship speed values in a grid area were
selected in this study. Five of the values were marked as outliers, and the rest were marked as normal
values. Figure 10(a)–(d) all describe the variation of the number of noise points with four different
weight values. For example, in the experiment in Figure 10(a), when the velocity weight was set to
0.2, the other three weights were set to one-third of 0.8 and when the velocity weight was set to 0.25,
Four speed weight values were selected to measure the relationship between speed weight and
noise points. As seen from Figure 10, compared with increases in the weights of the other
three dimensions, the number of noise points shows a clear downward trend as the speed weight
increases. This indicates that the ship speeds in the data are relatively uniform, so the
speed weight should not exceed the weights of the other three dimensions. Next, a different
threshold for judging a speed as abnormal was calculated for each speed weight
value. The confusion matrices for the detection results are shown in Figures 11–13.
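The weight allocation used in these experiments can be sketched in Python. The sketch assumes four clustering dimensions (speed plus the "other three"), weights that sum to one, and a hard check against the reciprocal-of-dimensions bound discussed earlier; the function name and the choice to raise an error are illustrative.

```python
from typing import List


def dimension_weights(speed_weight: float, n_dims: int = 4) -> List[float]:
    """Speed weight first, then the remaining weight shared equally."""
    if not 0.0 <= speed_weight <= 1.0 / n_dims:
        raise ValueError("speed weight should not exceed 1 / n_dims")
    other = (1.0 - speed_weight) / (n_dims - 1)
    return [speed_weight] + [other] * (n_dims - 1)


print(dimension_weights(0.2))   # speed 0.2, the other three one-third of 0.8 each
print(dimension_weights(0.25))  # the upper bound: all four weights equal
```

With four dimensions the bound is 0.25, at which point the speed weight equals each of the other three weights; any smaller value leaves speed with less influence than position-related dimensions.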
Figures 11–13 show that score(w) with w ≤ 0.25 provides a more appropriate anomaly threshold for
judging whether the speed is abnormal. From Figures 11–13, the recall, precision, accuracy and F1
values of the model can be calculated under different anomaly thresholds; the method was also
compared with feature fusion, and the results are shown in Figure 14 and Table 2.
Table 2. The ship position anomaly detection results for different methods.
[Bar chart: Value (0.9 to 1) by Recall, Precision, F1 and Accuracy.]
Figure 14. Ship position anomaly detection results for different methods.
Table 2 and Figure 14 show that when w is less than 0.25, the threshold given by the score(w)
formula allows the model to detect abnormal ship speed more reliably. Thus, score(w) can be used to
determine a more appropriate anomaly threshold for judging whether the speed is abnormal. When the
anomaly threshold exceeds 0.75, the detection capability begins to deteriorate, because the speed of
ships at sea is relatively uniform and a high anomaly threshold struggles to identify the anomalous
data in a data set with a low degree of dispersion. The feature fusion method is based on a deep
learning algorithm; its detection capability depends on appropriate hyper-parameters and the quality
of the training set, and is not stable enough. Next, the improved iForest algorithm (which adopts
the strategy of selectively constructing isolated trees) was compared with the traditional iForest
algorithm in terms of algorithm efficiency; the results are shown in Table 3.
As can be seen from Table 3, the iForest algorithm takes 7.11 ms to detect a single data point,
while the improved iForest algorithm takes 6.76 ms, an efficiency improvement of about 5% over the
traditional iForest algorithm. The reason is that the improved iForest algorithm selectively
constructs isolated trees: when the ratio between the number of samples divided into the left
sub-tree and the number divided into the right sub-tree is not large, it stops construction, so it
is more efficient than the traditional iForest algorithm.
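The quoted improvement follows directly from the two per-point detection times reported above:

```latex
\frac{7.11 - 6.76}{7.11} = \frac{0.35}{7.11} \approx 0.049 \approx 5\%
```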
Finally, we carried out ablation experiments to verify the high accuracy of the iForest
algorithm in detecting the abnormal speed of ships after noise removal by the MDDBSCAN
algorithm. The results are shown in Figures 15 and 16 and Table 4. In this experiment,
the threshold used to detect whether the ship speed is abnormal was set to score(0.15). It can be
seen from Figures 15 and 16 and Table 4 that the detection capability of MDDBSCAN-Improved iForest
is better than that of Improved iForest. The reason is that the MDDBSCAN algorithm
provides a global anomaly detection scenario for the iForest algorithm, which has high accuracy on
such data sets.
6. Conclusions
This method separates the detection of ship behavior outliers into three steps. The first step is
data preprocessing and data compression, which yields an accurate and simple description of
ship trajectories. Second, the position information modeling scheme detects the ship position
outliers; in the comparison with the five trajectory outlier detection methods, the method in this
paper had a better detection effect. Finally, the isolation forest algorithm is used to detect the
ship's speed outliers, and the functional relationship between the speed weight of multi-dimensional
density clustering and the threshold for judging a speed as abnormal has been established.
Experiments showed that the threshold selected by score(w) gave good results for detecting ship
speed outliers. The abnormal ship behavior detection method in this paper is suitable for online
detection and can also mine abnormal ship information other than speed, such as ship acceleration
and heading. Meanwhile, as computing power continues to grow, establishing a more efficient and
accurate abnormal ship behavior detection model is a promising direction.
The authors declare they have not used Artificial Intelligence (AI) tools in the creation of this
article.
Acknowledgments
This research was funded by the Key Research and Development Program of China, grant
number 2021YFC2802503; Key Research and Development Program of Shaanxi Province, grant
numbers 2021ZDLGY05-05 and 2019ZDLGY12G07.
Conflict of interest
References
20. B. Liang, S. Wu, W. Chen, Z. Zhu, Trajectory outlier detection based on partition-and-detection
framework, in 2017 13th International Conference on Natural Computation, Fuzzy Systems and
Knowledge Discovery (ICNC-FSKD), 2017. https://doi.org/10.1109/FSKD.2017.8393071
21. A. Belhadi, Y. Djenouri, D. Djenouri, T. Michalak, J. C. Lin, Deep learning versus traditional
solutions for group trajectory outliers, IEEE Trans. Cybernetics, 6 (2020), 1–12.
https://doi.org/10.1109/TCYB.2020.3029338
22. M. Szarmach, I. Czarnowski, Multi-Label classification for AIS data anomaly detection using
wavelet transform, IEEE Access, 10 (2022), 109119–109131.
https://doi.org/10.1109/ACCESS.2022.3214217
23. Y. Chen, J. Yu, G. Yong, Detecting trajectory outliers based on spark, in 2017 25th
International Conference on Geoinformatics, (2017), 1–5.
https://doi.org/10.1109/GEOINFORMATICS.2017.8090919
24. K. Hu, P. Duan, B. Hu, Q. Duan, IBTOD: An isolation-based method to detect outlying
sub-trajectories on multi-factors, in IEEE Advanced Information Management, Communicates,
Electronic and Automation Control Conference, 2018.
https://doi.org/10.1109/IMCEC.2018.8469416
25. A. Belhadi, Y. Djenouri, C. Lin, Comparative study on trajectory outlier detection algorithms, in
2019 International Conference on Data Mining Workshops (ICDMW), (2019), 415–423.
https://doi.org/10.1109/ICDMW.2019.00067
26. R. Maria, P. Giuliana, V. Michele, Maritime anomaly detection: A review, Wiley Interdiscip.
Rev. Data Mining Knowl. Discovery, 8 (2018), 8. https://doi.org/10.1002/widm.1266
27. S. Papadimitriou, H. Kitagawa, P. Gibbons, C. Faloutsos, LOCI: fast outlier detection using the
local correlation integral, in Proceedings 19th International Conference on Data Engineering,
2003, 315–326. https://doi.org/10.1109/ICDE.2003.1260802
28. G. Pallotta, M. Vespe, K. Bryan, Vessel pattern knowledge discovery from AIS data: A
framework for anomaly detection and route prediction, Entropy, 15 (2013), 2218–2245.
https://doi.org/10.3390/e15062218
29. W. Dai, C. Zhang, X. Su, S. Cao, Trajectory Outlier Detection Based on DBSCAN and Velocity
Entropy, in 2020 International Conferences on Internet of Things (iThings) and IEEE Green
Computing and Communications (GreenCom) and IEEE Cyber, Physical and Social Computing
(CPSCom) and IEEE Smart Data (SmartData) and IEEE Congress on Cybermatics
(Cybermatics), (2020), 550–557.
https://doi.org/10.1109/iThings-GreenCom-CPSCom-SmartData-Cybermatics50389.2020.00097
30. Z. Cheng, C. Zou, J. Dong, Outlier detection using isolation forest and local outlier factor, in
Proceedings of the Conference on Research in Adaptive and Convergent Systems, (2019), 161–
168. https://doi.org/10.1145/3338840.3355641
31. F. Luan, Y. Zhang, K. Cao, Q. Li, Based local density trajectory outlier detection with
partition-and-detect framework, in 2017 13th International Conference on Natural Computation,
Fuzzy Systems and Knowledge Discovery (ICNC-FSKD), (2017), 1708–1714.
https://doi.org/10.1109/FSKD.2017.8393023
32. F. T. Liu, K. M. Ting, Z. H. Zhou, Isolation forest, in Proceedings of the 2008 Eighth IEEE
International Conference on Data Mining, (2008), 413–422. https://doi.org/10.1109/ICDM.2008.17
33. Historical AIS Data Services (accessed on 10 December 2018). Available from:
http://www.vtexplorer.com/
34. C. Iphar, C. Ray, A. Napoli, Data integrity assessment for maritime anomaly detection, Expert
Syst. Appl., 147 (2020), 3. https://doi.org/10.1016/j.eswa.2020.113219
35. H. Liu, Y. Liu, Z. Zong, Research on ship abnormal behavior detection method based on graph
neural network, in 2022 IEEE International Conference on Mechatronics and Automation
(ICMA), (2022), 834–838. https://doi.org/10.1109/ICMA54519.2022.9856198
36. G. Huang, S. Lai, C. Ye, H. Zhou, Ship trajectory anomaly detection based on multi-feature
fusion, in 2021 IEEE International Conference on Smart Data Services (SMDS), (2021), 72–81.
https://doi.org/10.1109/SMDS53860.2021.00020