Multi-Source Information Fusion Based DLaaS For Traffic Flow Prediction
4, APRIL 2024
Abstract—Traffic flow prediction is key to transportation safety and efficiency. Advances in machine learning and deep learning have promoted the development of intelligent transportation systems. For example, the emergence of Deep Learning as a Service (DLaaS) has greatly benefited researchers dealing with large-scale datasets and complex deep learning algorithms. In traffic forecasting, despite the success of deep learning-based models, shortcomings remain, such as inadequate use of temporal and spatial traffic information and indirect modeling of dependencies in traffic data. To address these challenges, we learn the transportation network in the form of a graph and use the graph wavelet as a key component to extract well-positioned features from the graph based on the transportation network. Compared with graph convolution, graph wavelets are very flexible and do not need to specify adjacent regions in the topological graph structure for feature extraction. At the same time, we propose a multi-information fusion collaborative neural network for traffic control and guidance. Comparisons with several baseline methods show that the proposed method outperforms all of them.

Index Terms—Deep learning as a service, graph wavelet gated recurrent neural network, secure transportation, traffic Big Data, traffic prediction.

Manuscript received 15 November 2020; revised 18 April 2021; accepted 25 April 2021. Date of publication 13 January 2023; date of current version 13 March 2024. This work was supported in part by the National Key R & D Program of China under Grant 2018YFC0407904, and in part by the Key Research Projects of Tibet Autonomous Region for Innovation and Entrepreneur under Grant Z2016D01G01/01. Recommended for acceptance by S. Guest Editors. (Corresponding author: Ye Zhang.)
He-xuan Hu is with the College of Computer and Information, Hohai University, Nanjing 211100, China, and also with the Electric Engineering College, Tibet Agriculture & Animal Husbandry University, Tibet 860000, China (e-mail: [email protected]).
Zhen-zhou Lin is with the College of Computer and Information, Hohai University, Nanjing 211100, China, and also with the Office of Teaching Affairs, Nanjing University of Finance and Economics, Nanjing 210023, China (e-mail: [email protected]).
Qiang Hu and Ye Zhang are with the College of Computer and Information, Hohai University, Nanjing 211100, China (e-mail: [email protected]; [email protected]).
Wei Wei is with the School of Computer Science and Engineering, Xi’an University of Technology, Xi’an 710048, China (e-mail: [email protected]).
Wei Wang is with the School of Intelligent Systems Engineering, Sun Yat-sen University, Shenzhen 510275, China (e-mail: [email protected]).
Digital Object Identifier 10.1109/TC.2023.3236902

I. INTRODUCTION

WITH the development of microelectronics technology, communication technology, and artificial intelligence, intelligent applications in life scenarios have developed rapidly. In the field of transportation, the intelligent transportation system [1] is one of the intelligent integrated application systems. An intelligent transportation system refers to the effective integration and application of sensor technology, information technology, communication transmission technology, and computer control technology on top of a complete infrastructure to establish a large-scale, all-around, real-time, and efficient integrated transportation system. It aims to combine advanced information technologies such as digital communications and computer networks to improve the management level of traffic management departments and provide citizens with more accurate travel guidance information [2], [3].

Advances in machine learning and deep learning have promoted the development of intelligent transportation systems. For example, the emergence of Deep Learning as a Service (DLaaS) has greatly benefited researchers dealing with large-scale datasets and complex deep learning algorithms. With this service, users can train neural networks with popular frameworks, such as TensorFlow, PyTorch, and Caffe, without the need to purchase and maintain expensive hardware. Users can choose suitable services based on a set of supported deep learning frameworks, neural network models, training data, cost constraints, and other conditions. DLaaS completes the rest and provides them with an interactive and iterative training experience. In short, users only need to prepare their data, upload it to DLaaS, start training, and then download the training results. Such a service will play a significant role in the construction of intelligent transportation systems by saving training time. Owing to these benefits, DLaaS has been applied in many practical scenarios, such as intelligent transportation systems, adversarial attacks [4], [5], [6], and smart healthcare [7].

Traffic flow prediction is key to transportation safety and efficiency. In the transportation system, safety issues are of the greatest concern to the public. Traffic flow prediction has a strong enabling effect on road safety and self-driving cars. Thus, traffic flow prediction is an important research topic in intelligent transportation systems, and it is also a foundation and an important part of the construction of an intelligent transportation system. By comprehensively collecting and managing the massive data of the urban transportation system, and conducting research and analysis on complex traffic operation conditions, we can provide detailed and accurate data services to urban transportation participants. Traffic flow prediction is an important research task in the data service function. Only when the intelligent transportation system realizes the accurate prediction of the traffic flow can the traffic information service, travel route planning, public transportation system, traffic management, and
0018-9340 © 2023 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See https://www.ieee.org/publications/rights/index.html for more information.
Authorized licensed use limited to: Universita degli Studi di Napoli Federico II. Downloaded on August 25,2024 at 14:39:30 UTC from IEEE Xplore. Restrictions apply.
HU et al.: MULTI-SOURCE INFORMATION FUSION BASED DLAAS FOR TRAFFIC FLOW PREDICTION 995
other functional subsystems in the system work effectively. The intelligent evaluation of intelligent transportation systems is, to a large extent, closely related to the accurate prediction of traffic flow. A variety of traffic service level evaluation indicators are used for traffic prediction, such as queue length, incident severity, traffic volume, traffic time, and average speed. Traffic time indicators are very intuitive and easy for travelers to understand, so they are widely accepted [8].

The road traffic system is a large-scale, time-varying, complex, and non-linear system with human participation [9]. On the one hand, it is an overall traffic cycle and evolution function composed of several traffic elements that interact and depend on each other on the road. On the other hand, it is affected by the irresistible forces of nature and by different degrees of human production activities, thus forming a complex evolutionary law of the transportation system [10], [11]. One of its distinctive features is its high degree of uncertainty. This uncertainty comes not only from natural factors such as season and climate, but also from human factors such as traffic accidents, emergencies, and the psychological state of drivers. These factors make traffic flow prediction difficult. Short-term traffic flow prediction in particular is more affected by random interference factors: its uncertainty is stronger, its regularity is less obvious, and it is more difficult than medium- and long-term prediction [12], [13].

Due to the high complexity, randomness, and uncertainty of the operation of the transportation system, grasping the characteristics of the traffic flow only on a long-term scale makes it difficult to meet the increased demand for real-time traffic information in traffic management. The effectiveness of traffic information release and the accuracy of abnormal event detection both depend on research into traffic characteristics at short time scales [14], [15]. For example, suppose there is a bottleneck on a certain part of the highway; then, over time, the end of the queue will move downstream from the bottleneck. Sometimes, the head and tail of the bottleneck move downstream together over time. Therefore, a prediction method that can directly reflect and capture the complex nonlinear spatiotemporal relationship between each historical data point and the prediction information is very useful [16], [17], [18], [19].

To address these challenges, in this paper, we propose a multi-source information fusion-based deep learning framework for traffic flow prediction. The overall framework is presented in Fig. 1. Our method mainly contains three parts: traffic information modeling, external factor modeling, and a graph wavelet gated recurrent neural network. Considering the significance of the speed information for traffic prediction, we propose to model the speed information from three perspectives: vehicle speed, traffic volume, and occupancy rate. Moreover, we consider various external information, such as weather information and human mobility patterns. Then, we propose a multi-information fusion collaborative neural network for traffic control and guidance, whose results are better than those of the benchmark algorithms. Finally, we learn the transportation network in the form of a graph and use the graph wavelet as a key component to extract well-positioned features from the graph based on the transportation network. Compared with graph convolution, graph wavelets are very flexible and do not need to specify adjacent regions in the topological graph structure for feature extraction.

Our main contributions are as follows:
• We propose a graph wavelet gated recurrent neural network to learn from spatiotemporal traffic network data, where graph wavelet operators act as filters in the gates of the recurrent neural network.
• Compared with many existing models, the proposed graph wavelet gated recurrent neural network can achieve excellent prediction performance with fewer weight parameters and higher training efficiency.
• The learned weights of the graph wavelet terms are very small. This sparsity of the learned weights can greatly enhance the interpretability of the model and help identify key lanes in the traffic network.
• We propose to adopt various traffic and external information to improve the accuracy of the proposed method, and the experiments show that the features we use are beneficial for traffic flow prediction.
996 IEEE TRANSACTIONS ON COMPUTERS, VOL. 73, NO. 4, APRIL 2024
The rest of the paper is organized as follows. We present our proposed graph wavelet-based deep learning method in Section II. Experimental settings are introduced in Section III and experimental results are presented in Section IV. Section V reviews the related works on traffic flow prediction. Finally, Section VI summarizes this paper.

II. PROPOSED METHOD

In the task of traffic flow prediction, the goal is to utilize the traffic information and the relationships within the road network to predict future traffic flow. The input is a set of traffic information and a set of road network information. To this end, we propose a multi-source information fusion-based convolutional network framework that mainly contains three parts: traffic information modeling, external factor modeling, and a graph wavelet gated recurrent neural network. We introduce these parts in the following sections. The definitions of the used items are presented in Table I.

A. Traffic Information

The characteristic parameters of traffic flow used to describe the traffic state are divided into macro parameters and micro parameters. The macro parameters describe the operating status and characteristics of the traffic flow as a whole, mainly including traffic volume, vehicle speed, traffic density, occupancy rate, and queuing length. Micro parameters describe the characteristics of the running state between related vehicles in the traffic flow, such as the distance between vehicles. In this paper, we mainly consider three basic characteristic parameters: vehicle speed, traffic volume, and occupancy rate.

1) Vehicle Speed: The basic meaning of vehicle speed is the distance the vehicle travels per unit time. It can be roughly divided into location speed, travel speed, design speed, and running speed. Since the urban road network traffic flow is a complex system composed of multiple vehicles, the average speed of the traffic flow is described by a time average speed and an interval average speed.

Time average speed refers to the arithmetic average of the speed of all vehicles passing through a certain place on the road.

The interval average speed is the ratio of the length of the observed road section to the average travel time of all vehicles on the road section in a certain period of time. Mathematically, the interval average speed is the harmonic average of the travel speeds of all vehicles passing through the road section, which can be calculated by the following formula

$$\zeta_H = w_{i,j} \log p(v_i \mid v_j), \tag{2}$$

where $\log p(v_i \mid v_j)$ denotes the conditional probability of $v_i$ generated by $v_j$, which can be calculated as:

$$\log p(v_i \mid v_j) = \frac{\exp(v_i \cdot v_j)}{\sum_{z \in V} \exp(v_i \cdot v_j)}. \tag{3}$$

The time average speed reflects the operating conditions of the traffic flow at a specific observation location, and the interval average speed reflects the spatial operating conditions of the traffic flow on a specific road section. Both the time average speed and the interval average speed are important indicators in traffic management and control. They are of great significance to traffic planning, road design, vehicle operation efficiency, and traffic management and control.

2) Traffic Volume: Traffic volume, also known as flow, is the total number of vehicles passing through a lane or a certain point or section of a road in a unit time interval. It can be divided into annual traffic volume, daily traffic volume, hourly traffic volume, or traffic volume during periods of less than one hour. The traffic flow rate, also called the flow rate, is the equivalent hourly rate of vehicles passing through a lane, a designated point, or a designated section of the road in a given time interval of less than one hour.

Traffic volume and flow rate are important parameters for describing the characteristics of traffic flow. Both are variables that reflect traffic demand, but they differ in important ways, both conceptually and essentially. Traffic volume is obtained through actual observation or prediction, whereas the flow rate is the equivalent value obtained by converting the traffic volume of a period shorter than one hour. Traffic volume is not a static quantity; it changes with time and space. One way to measure the characteristics of urban traffic is to observe the traffic volume in time and space at a series of positions in the road system. In traffic analysis, the peak hours are often divided into shorter periods to show the changing characteristics of the traffic flow in each period. There is no uniform standard for the minimum observation interval used in the analysis of road traffic volume. The peak flow rate is an important factor in the analysis of road capacity. Using the peak hour coefficient, the peak flow rate is defined as the ratio of the entire-hour traffic volume to the maximum flow rate within the hour. The calculation method is as follows:

$$\zeta_H = \sum_{(i,j) \in e} w_{ij} \lVert x_i - x_j \rVert^2, \tag{4}$$
where $x_i$ and $x_j$ are the representations of vehicles $i$ and $j$, and $w_{ij}$ denotes the link weight between them.

3) Occupancy Rate: Traffic flow density refers to the number of vehicles per unit length on the road at a certain moment. Since the density is an instantaneous value, it changes with the time of observation or the length of the interval, which makes it difficult to observe. For this reason, in traffic engineering, the occupancy rate, which is easier to measure, is often used instead of density. The occupancy rate is divided into a space occupancy rate and a time occupancy rate. The space occupancy rate is the ratio of the total length of the vehicles on the observed road section to the total length of the road section within a certain time. The space occupancy rate directly reflects the level of traffic density and can better indicate the actual occupation of the road, but this traffic parameter is difficult to obtain directly. The calculation method is

$$\min_{X} \zeta = \lVert A - XX^T \rVert_F^2 + \delta \sum_{(i,j) \in e} w_{ij} \lVert x_i - x_j \rVert^2, \tag{5}$$

where $\delta$ denotes a trade-off between traffic information and traffic network.

B. Fusion of External Factors

Based on previous research, external factors such as emergencies, holiday arrangements, and weather conditions have a certain impact on traffic flow. Therefore, we should consider these external factors when making traffic flow forecasts.

The weather information used in this experiment is extracted from historical weather data of the city. In addition, we extract other external factor features from residents' living and travel habits, including holidays, working days, and weekends. After the original data is obtained, the unit information is first removed through the de-unit process. Meanwhile, the various weather features are normalized to the range [0,1] through the Max-Min operation. Other external features such as holidays, weekends, and working days are also taken into account. The one-hot operation is used to encode all these factors into a two-dimensional vector, and the normalized two-dimensional feature vectors obtained after these data preprocessing operations are used as the external factor feature of the target area $i$ at time $t$, denoted $\mathrm{external}_{i,t}$. For the time factor, an interval-level short-term function is used to capture detailed information with a shorter interval. Specifically, the external factors we consider include weekdays, weekends, peak period, non-peak period, sunny day, and rainy day. We will explore the influence of these factors with our proposed method.

C. Gated Graph Wavelet Recurrent Neural Network

We introduce our method from the perspectives of graph convolution, wavelet transform, graph wavelet transform, and graph wavelet gated recurrent neural network.

1) Graph Convolution: The Fourier transform of a graph convolution can be regarded as the element-wise product of the Fourier transforms of the two signals. We can generalize the convolution operator $\otimes_y$ on a traffic graph $y$ as follows:

$$f \otimes_y x = I\left(\hat{f} \odot \hat{x}\right) = I\left(\left(I^T f\right) \odot \left(I^T x\right)\right), \tag{6}$$

where $f$ is the convolutional kernel filter function and $\odot$ denotes the element-wise multiplication operator. At the same time, we can replace $I^T f$ with a diagonal matrix $\hat{f}(\Lambda_\theta)$. Then, we get the following equation

$$f \otimes_y x = I \hat{f}(\Lambda_\theta) I^T x, \tag{7}$$

where $\hat{f}$ denotes the filter function and $\Lambda_\theta$ denotes the diagonal parameter matrix. Because such a graph convolution is not well-localized in the vertex domain, we use a polynomial filter to construct the graph convolution on one vertex based on its neighbors. Specifically, the filter is defined as

$$h_\theta = \sum_{k=0}^{K-1} \theta_k \Lambda^k, \tag{8}$$

where $\theta \in \mathbb{R}^K$ denotes the polynomial coefficients and $\Lambda^k$ denotes the $k$-order Laplacian diagonal eigenvalue matrix. Based on this transformation, we can make the graph convolution localized on the Laplacian. Meanwhile, we utilize a free flow reachability matrix to take the physical properties of the roadway into consideration.

2) Wavelet Transform: In this work, we use the wavelet transform to cut up signal data into different frequency components and investigate each component with a resolution matched to its scale, based on a wavelet prototype function. Specifically, the wavelet $\Phi_{s,a}$ can be constructed as follows:

$$\Phi_{s,a}(x) = \frac{1}{s} \Phi\left(\frac{x - a}{s}\right), \tag{9}$$

where $s$ denotes the scale and $a$ denotes the location. Meanwhile, we can calculate the wavelet coefficients $W_f(s, a)$ based on the convolution with the input $f(x)$

$$W_f(s, a) = \int_{-\infty}^{\infty} \frac{1}{s} \Phi^{*}\left(\frac{x - a}{s}\right) f(x)\, dx, \tag{10}$$

where $*$ denotes the complex conjugate. Assuming that $\Phi_s(x)$ is $\frac{1}{s} \Phi^{*}\left(\frac{-x}{s}\right)$, we can get the operator $T^s$ as

$$(T^s f)(a) = (\Phi_s \otimes f)(a), \tag{11}$$

where $\otimes$ denotes convolution. Thus, by the convolution theorem, we get

$$\widehat{T^s f}(w) = \hat{\Phi}_s(w) \hat{f}(w), \tag{12}$$

where $w$ denotes the frequency component. Then, we get the classical wavelet transform by inverting the Fourier transform as

$$(T^s f)(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{iwx}\, \hat{\Phi}^{*}(sw) \hat{f}(w)\, dw. \tag{13}$$

Finally, applying the operator to an impulse function $\delta_a(x) = \delta(x - a)$, we get the localized wavelet as

$$(T^s \delta_a)(x) = \frac{1}{s} \Phi^{*}\left(\frac{a - x}{s}\right) = \Phi_{s,a}(x). \tag{14}$$
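As a numerical illustration of Eqs. (9) and (10), the following minimal sketch approximates the wavelet coefficient $W_f(s, a)$ with a Riemann sum on a uniform grid. The Mexican-hat mother wavelet, the function names, and the toy signal are assumptions made for illustration only; the text above does not fix a particular wavelet prototype.

```python
import numpy as np

def mexican_hat(x):
    """Real-valued, zero-mean mother wavelet (an assumed choice for this sketch)."""
    return (1.0 - x**2) * np.exp(-0.5 * x**2)

def wavelet_coefficient(f_vals, x, s, a, mother=mexican_hat):
    """Riemann-sum approximation of Eq. (10) on a uniform grid x."""
    dx = x[1] - x[0]
    # (1/s) * conj(Phi((x - a) / s)); the conjugate is a no-op for a real wavelet
    kernel = np.conj(mother((x - a) / s)) / s
    return np.sum(kernel * f_vals) * dx

x = np.linspace(-20.0, 20.0, 4001)
bump = np.exp(-0.5 * (x - 3.0) ** 2)  # toy signal localized around x = 3

near = wavelet_coefficient(bump, x, s=1.0, a=3.0)    # wavelet centered on the bump
far = wavelet_coefficient(bump, x, s=1.0, a=-10.0)   # wavelet far from the bump
flat = wavelet_coefficient(np.ones_like(x), x, s=1.0, a=0.0)  # constant signal
```

In this toy setting the coefficient is large only when the location $a$ sits on the bump, and a constant signal yields a coefficient close to zero because the assumed wavelet has zero mean. This is the localization property that the graph wavelet transform carries over to the vertex domain.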
3) Graph Wavelet Transform: Similarly, the graph wavelet transform aims to convert a graph signal from the vertex domain to the spectral domain. It can be defined as

$$T_g^s = g(sG), \tag{15}$$

where $g$ denotes a kernel and $s$ denotes the scale. Here, $g$ can be the mother wavelet of the classical wavelet transform. The wavelet operator can be calculated by modulating each Fourier mode as

$$\widehat{T_g^s f}(i) = g(s\sigma_i) \hat{f}(i), \tag{16}$$

where $\sigma_i$ denotes the $i$-th eigenvalue of the network $G$. After applying the inverse Fourier transform, we get

$$(T_g^s f)(m) = \sum_{i=0}^{N-1} g(s\sigma_i) \hat{f}(i) u_i(m), \tag{17}$$

where $m$ denotes the $m$-th node in the graph. Finally, we can define the graph convolution based on the graph wavelet transform as

$$f \otimes_g x = \Phi_s\left(\left(\Phi_s^{-1} f\right) \odot \left(\Phi_s^{-1} x\right)\right). \tag{18}$$

4) Graph Wavelet Gated Recurrent Neural Network: Finally, in order to extract the spatial-temporal features of the transportation networks, we propose to use a graph wavelet gated recurrent (GWGR) neural network, which has several gate units to filter out or add information to the cell state.

In the model, we have defined the graph wavelet coefficient matrix in Eq. (18). Thus, the GWGR can be defined by the following equations

$$f_t = \lambda_g\left(\Phi_s \Lambda_f^x \Phi_s^{-1} x_{t-1} + \Phi_s \Lambda_f^h \Phi_s^{-1} h_{t-1} + b_f\right), \tag{19}$$

$$i_t = \lambda_g\left(\Phi_s \Lambda_i^x \Phi_s^{-1} x_{t-1} + \Phi_s \Lambda_i^h \Phi_s^{-1} h_{t-1} + b_i\right), \tag{20}$$

$$o_t = \lambda_g\left(\Phi_s \Lambda_o^x \Phi_s^{-1} x_{t-1} + \Phi_s \Lambda_o^h \Phi_s^{-1} h_{t-1} + b_o\right), \tag{21}$$

$$\tilde{C}_t = \tanh\left(\Phi_s \Lambda_C^x \Phi_s^{-1} x_{t-1} + \Phi_s \Lambda_C^h \Phi_s^{-1} h_{t-1} + b_C\right). \tag{22}$$

Here, we can calculate the cell state $C_t$ and the hidden state $h_t$ at time $t$ as

$$C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t, \tag{23}$$

where

$$h_t = o_t \odot \tanh(C_t). \tag{24}$$

The loss function of the whole method is

$$\mathrm{Loss} = \mathrm{Loss}\left(\hat{X}_T - x_T\right). \tag{25}$$

III. EXPERIMENTAL SETUPS

In this section, we introduce our experimental setups in detail, including datasets, evaluation metrics, and comparison methods.

A. Datasets

In traffic flow forecasting research, the three major attributes of traffic flow (traffic flow, average speed, and lane occupancy) all have practical research value. For passenger travel, the average speed has a more intuitive evaluation significance. Passengers are more concerned about the road speed and travel time in the future than about the number of vehicles on the road and the lane occupancy rate at the next moment. Therefore, in our experiments, the average vehicle speed of the road sensor is selected as the main research parameter. The experiments are carried out on two regional data sets, which are named PeMS01 and PeMS02, respectively. The PeMS01 data set is the historical traffic data of 9 highways and urban arterial roads in the Los Angeles area. The data set includes the detection data of 207 road sensors on highways such as I405, I5, SR101, and SR134 within 4 months. It contains more than 6.5 million data records spanning March 1, 2017 to June 30, 2017.

The PeMS02 data set is the historical traffic data of 7 highways and urban arterial roads in the San Francisco area. The data set includes the detection data of 352 road sensors on main roads such as I280, I880, SR101, and SR85 within 5 months, recorded from January 1, 2018 to May 31, 2018. It contains more than 18 million historical records from all road detection sensors on the 7 roads. In addition to the traffic flow data set, external data are also needed, especially the number of traffic accident records.

Meanwhile, we also collect weather data, which include sunny or rainy information, from https://darksky.net. For the raw data, we first divide a day into 48 intervals of half an hour each. Second, the Max-Min method is used to scale the training set to [0,1] in the training process. In the evaluation process, the result is scaled back to the normal range and compared with the true value. In addition, a sliding window method is used to obtain samples on the training and test data sets. For the feature data of external factors, the de-unit process is used to remove the unit information, and each feature is then scaled to the range [0,1] through the Max-Min method. In addition, we use the one-hot method to convert holiday, weekday, and weekend data into binary vectors.

B. Comparison Methods

In order to evaluate the performance of our proposed method, we compare our method with several baseline methods.
• ConvLSTM: ConvLSTM [20] mainly uses convolutional neural networks to obtain spatial dependencies and long short-term memory networks to obtain temporal dependencies, and then predicts traffic flow.
• DMVST-Net: DMVST-Net [21] is a deep multi-view space-time network. This method uses a unified multi-view model, which jointly considers space, time, and semantic information for predicting traffic flow.
• STDN: STDN [22] is a spatio-temporal dynamic network. It uses a local spatial dependency convolutional neural network to extract the spatial characteristics of the traffic data and uses the attention mechanism combined with a long short-term memory network to obtain the temporal characteristics for prediction.

For simplicity, we name our method Proposed. Moreover, to account for the effectiveness of the multiple factors, we compare our method with its variations:
• Proposed-T: Proposed-T does not consider the traffic speed information.
• Proposed-W: Proposed-W does not consider the weather information.
• Proposed-P: Proposed-P does not consider the peak and non-peak time information.

All the methods are run on the DLaaS platform, on which the deep learning framework is constructed.

C. Evaluation Metrics

We use two widely adopted evaluation metrics in traffic flow prediction to compare our method with the baseline methods. The two metrics are MAPE (Mean Absolute Percentage Error) and RMSE (Root Mean Square Error). Specifically, the MAPE is defined as follows:

$$\mathrm{MAPE} = \frac{1}{n} \sum_{i=1}^{n} \left| \frac{\hat{f}_{i,t+1} - f_{i,t+1}}{f_{i,t+1}} \right|, \tag{26}$$

where $n$ denotes the total number of samples, $f_{i,t+1}$ denotes the ground truth, and $\hat{f}_{i,t+1}$ denotes the predicted value. Based on the equation, we can see that MAPE considers not only the deviation between the predicted value and the true value, but also the ratio between the error and the true value.

The RMSE is defined as

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left(\hat{f}_{i,t+1} - f_{i,t+1}\right)^2}. \tag{27}$$

We can see that RMSE is the square root of the ratio of the sum of squared deviations between the predicted results and the true values to the number of samples. It measures the deviation of the predicted value from the true value, and it is very sensitive to outliers in the predicted results. If the prediction result at a certain point during the test is an outlier, the value of RMSE will fluctuate greatly.

IV. RESULTS

In this section, we present the experimental results from the perspectives of accuracy comparison, factor contribution analysis, and performance over different factors.

A. Accuracy Comparison

In this section, we compare our method with three baseline methods on two datasets.

We first compare our method with the three baselines on the PeMS01 dataset using MAPE and RMSE over different fractions of the training dataset. The comparison results are shown in Table II. We can see from this table that:
• The overall performance of our proposed method is better than that of the three baseline methods on both metrics. Take the fraction of the training set as 0.8 for example. Our method achieves an 8.28% improvement in MAPE and a 15.12% improvement in RMSE compared with STDN, the best baseline method, on the PeMS01 dataset. This indicates that it is beneficial to consider various external factors when designing a traffic flow prediction method. It also demonstrates that our method with the graph wavelet gated recurrent neural network is effective.
• ConvLSTM has the worst performance among all baseline methods. This indicates that external factors may have a positive influence on traffic flow prediction, because this method merely considers the traffic network information for prediction and does not utilize other factors. STDN has the second-best performance because it can capture the dynamics of the traffic information.
• Another observation we can clearly see in this table is the steady decrease of MAPE and RMSE as the fraction of the training set increases. This is because of the definition of these two metrics. When more data are utilized for training, all the methods have a better performance. This indicates that it is better to acquire more data for model training and learning.

TABLE II
ACCURACY COMPARISON ON PEMS01 DATASET

Table III shows the comparison results on the PeMS02 dataset. We can see from this table that our proposed method also has the best performance compared with all the baseline methods. The improvements compared with the second-best approach, STDN, are 9.73% in MAPE and 17.68% in RMSE when the fraction of the training set is 0.8. Meanwhile, hybrid prediction methods such as STDN and DMVST-Net have a better performance than simple approaches such as ConvLSTM. These observations also demonstrate the effectiveness of our proposed method.

TABLE III
ACCURACY COMPARISON ON PEMS02 DATASET

Based on these two tables, by comparing the results on the two datasets, we can see that almost all prediction methods have
a better performance on the PeMS02 dataset than on the PeMS01 dataset. The potential reason is that the PeMS02 dataset has more records than the PeMS01 dataset.

B. Factor Contribution Analysis

In our method, we have proposed to use several kinds of traffic and external information for prediction, including traffic speed, weather information, and peak information. To investigate the contribution of each feature to the traffic prediction of our proposed method, we utilize the “Jackknife” approach with one case: removing one factor and predicting with the remaining factors, so that we have three variants, namely Proposed-T, Proposed-W, and Proposed-P. Proposed-T does not consider the traffic speed information, Proposed-W does not consider the weather information, and Proposed-P does not consider the peak and non-peak time information.

Fig. 2 depicts the performance of the proposed method and the three “Jackknife” approaches on traffic flow prediction in terms of different fractions of the training set on the PeMS01 dataset. From this figure, we can see that the MAPE and RMSE of all methods decline as the fraction of the training set increases, and that our proposed method always has a better performance than the other three approaches. This indicates that the factors we utilize are beneficial. At the same time, we can see that Proposed-T has the worst performance, which indicates that traffic speed information has a significant influence on traffic flow prediction.

Fig. 3 shows the performance of the proposed method and the three “Jackknife” approaches on traffic flow prediction in terms of different fractions of the training set on the PeMS02 dataset. Similar results can be seen from this figure. The MAPE and RMSE of all methods decline as the fraction of the training set increases. Our proposed method always has the best performance. Proposed-T still has the worst performance.

C. Performance Over Different Factors

Traffic prediction is sensitive to environmental information and human mobility patterns. In this work, we consider weekday, weekend, peak time, and non-peak time. In this section, we show the performance of our proposed method in these situations.
The comparison results are shown in Fig. 4. This figure shows
methods decline with the increasing of factions of the training
the MAPE and RMSE of our proposed method on weekdays,
set, which means that the prediction task will benefit from a large
weekends, peak times, and none-peak times. We can see from
amount of training data. When the fraction of the training set
this figure that: 1) our method will have a better prediction
goes up from 20% to 90%, the MAPE of our method decreases
performance on a weekday than on weekend; 2) our method will
from 36.61% to 22.1% and the RMSE of our method decreases
have a better prediction performance on none-peak time than on
from 25.29 to 13.22. On the other hand, we can notice that our
peak time; 3) our method has a better performance on the PeMS02 dataset than on the PeMS01 dataset.

V. RELATED WORKS

In this section, we review the related works from the perspectives of traditional methods and deep learning-based methods.

A. Traditional Methods for Traffic Flow Prediction

The research of traffic flow forecasting has a history of decades, and its methods are mainly divided into two categories: knowledge-driven methods and data-driven methods. In the research of transportation and operations research, knowledge-driven methods usually apply queueing theory to simulate user behavior in transportation. In terms of time series, some data-driven methods are still popular, such as the Autoregressive Integrated Moving Average (ARIMA) model, which proposes an algorithm to construct an abnormal causal tree based on the spatiotemporal characteristics of detected outliers. The structure of these causal trees not only reveals the repeated interaction between temporal and spatial outliers but also reveals potential flaws in the existing transportation network design. There is also the Kalman filtering method, which proposes two new support vector regression models. However, simple time series models usually rely on stationarity assumptions, whereas traffic data are non-stationary [23], [24].

The speed, volume, density, and other indicators collected by various sensors reflect the road traffic conditions. Therefore, these measured values are usually selected as the target of traffic flow prediction [25], [26], [27]. According to the length of the forecast time, traffic forecasting can be divided into three scales: short-term (5-30 minutes), medium-term (30-60 minutes), and long-term (over 1 hour). Most popular methods can perform well in the short-term prediction interval. However, due to the uncertainty and complexity of traffic flow, these methods cannot meet the requirements of long-term time series forecasting. For simulation methods, traffic flow prediction requires comprehensive and detailed system modeling based on physical theory and prior knowledge. Nevertheless, simulation systems and simulation tools still need a lot of computing power and skilled parameter settings to reach a stable state. At present, with the rapid development of real-time traffic data collection methods and forms, researchers are turning their attention to data mining methods based on a large number of historical traffic records [28], [29], [30].

B. Deep Learning Based Methods for Traffic Flow Prediction

Some scholars use Convolutional Neural Networks (CNN) to model spatial correlation. Mihaita et al. [31] proposed a novel deep architecture that combines CNN and LSTM to predict future traffic flow (CLTFP). They use one-dimensional CNN to obtain the spatial characteristics of traffic flow and use LSTMs to mine the short-term variability and periodicity of traffic flow. Considering these meaningful features, feature-level fusion is performed to achieve short-term prediction. Yang et al. [32] proposed a traffic learning method based on CNN, which uses traffic as an image to learn and conducts large-scale, network-wide traffic speed high-precision prediction, transforming spatiotemporal traffic dynamics into a two-dimensional spatiotemporal
matrix image describing the temporal and spatial relationship of traffic flow, and applies CNN to the image through two consecutive steps, including abstract traffic feature extraction and network-wide traffic speed prediction. But the spatial structure considered by the above model is in Euclidean space. Some scholars have also studied graph convolutional neural networks. Tang et al. [33] proposed that CNNs can be extended to signals defined in more general domains. In particular, two structures are proposed, including domain hierarchical clustering and graph Laplacian spectrum. Chen et al. [34] proposed a convolutional network that extends traditional CNN to non-Euclidean space.

RNN performs well in dealing with sequence problems. Whether the sequence is longer or shorter, it can be processed and analyzed. The processing results are different, so theoretically the RNN neural network structure can handle sequences of any length. After a large number of experiments, it has been shown that when the processed sequence is too long, the gradient vanishes during the optimization process. This is because, at this time when the neuron receives information feedback, it wants to obtain the influence of the element information at a far distance. When the maximum length is exceeded, the signal will be truncated, resulting in loss of information. Therefore, to improve the memory loss defect of RNN, scholars have made improvements, i.e., developing the long short-term memory (LSTM) neural network, which adds a memory cell to the network structure to retain information on the past or distant unit.

More recently, the emergence of DLaaS enables users to deal with large-scale traffic datasets with its popular frameworks to train neural networks, such as TensorFlow, PyTorch, and Caffe. With the help of DLaaS, there is no need to purchase and maintain expensive hardware. Users can choose suitable DLaaS services based on supported deep learning frameworks, neural network models, training data, cost constraints, and other conditions. Thus, more scholars are using DLaaS to implement their deep learning frameworks for traffic flow prediction.

However, recent research found that many existing traffic flow prediction systems have the following four shortcomings: 1) Because neural networks with multiple hidden layers are difficult to train successfully, most models have very shallow structures, and some even have only one hidden layer. It is difficult to capture the deep associations hidden in large data sets; 2) Some systems require cumbersome and error-prone manual features, which require prior knowledge of specific fields for feature extraction and selection; 3) Many systems predict the traffic flow characteristics of each road separately, ignoring the correlation between roads. For example, the ARIMA model predicts the future traffic flow of a road based on the data observed on this road in the past and completely ignores the historical data of other related roads. The traffic road network is highly correlated. It is obvious that the traffic flow is affected by the relevance of the road network. Therefore, it is best to consider the temporal and spatial correlation characteristics of traffic flow data, so that the prediction makes use of the correlation between roads; 4) Existing research rarely considers the very serious problem of incomplete data in practical applications. To tackle these challenges, we proposed a multi-source information fusion-based deep learning framework for traffic flow prediction.

VI. CONCLUSION

In this article, we propose a multi-source information fusion-based deep learning framework based on DLaaS for traffic flow prediction. Our method mainly contains three parts, including traffic information modeling, external factor modeling, and graph wavelet gated recurrent neural network. Considering the significance of the speed information for traffic prediction, we propose to model the speed information from three perspectives, including vehicle speed, traffic volume, and occupancy rate. Moreover, we consider various external information, such as weather information and human mobility pattern. Then, we propose to combine the multi-information fusion traffic control and guidance collaborative neural network, and the results obtained are better than the benchmark algorithms. Finally, we learn the transportation network in the form of a graph and use graph wavelet as a key component to extract well-positioned features from the graph based on the transportation network. Compared with graph convolution, graph wavelets are very flexible and do not need to specify adjacent regions in the topological graph structure for feature extraction. The comparison with several baseline methods shows that our proposed method can outperform all the baseline methods.

REFERENCES

[1] X. Luo, D. Li, Y. Yang, and S. Zhang, “Spatiotemporal traffic flow prediction with KNN and LSTM,” J. Adv. Transp., vol. 2019, no. 5, pp. 1–10, 2019.
[2] Z. Zheng, Y. Yang, J. Liu, H.-N. Dai, and Y. Zhang, “Deep and embedded learning approach for traffic flow prediction in urban informatics,” IEEE Trans. Intell. Transp. Syst., vol. 20, no. 10, pp. 3927–3939, Oct. 2019.
[3] P. Sun, N. AlJeri, and A. Boukerche, “A fast vehicular traffic flow prediction scheme based on fourier and wavelet analysis,” in Proc. IEEE Glob. Commun. Conf., 2018, pp. 1–6.
[4] Y. Zeng, H. Qiu, G. Memmi, and M. Qiu, “A data augmentation-based defense method against adversarial attacks in neural networks,” in Proc. Int. Conf. Algorithms Architect. Parallel Process., 2020, pp. 274–289.
[5] Y. Li, Y. Song, L. Jia, S. Gao, Q. Li, and M. Qiu, “Intelligent fault diagnosis by fusing domain adversarial training and maximum mean discrepancy via ensemble learning,” IEEE Trans. Ind. Informat., vol. 17, no. 4, pp. 2833–2841, Apr. 2021.
[6] H. Qiu, T. Dong, T. Zhang, J. Lu, G. Memmi, and M. Qiu, “Adversarial attacks against network intrusion detection in IoT systems,” IEEE Internet Things J., vol. 8, no. 13, pp. 10327–10335, Jul. 2020.
[7] P. Bhattacharya, S. Tanwar, U. Bodke, S. Tyagi, and N. Kumar, “BinDaaS: Blockchain-based deep-learning as-a-service in healthcare 4.0 applications,” IEEE Trans. Netw. Sci. Eng., vol. 8, no. 2, pp. 1242–1255, Apr.–Jun. 2019.
[8] H. Duan, X. Xiao, and Q. Xiao, “An inertia grey discrete model and its application in short-term traffic flow prediction and state determination,” Neural Comput. Appl., vol. 32, no. 12, pp. 8617–8633, 2020.
[9] A. Miglani and N. Kumar, “Deep learning models for traffic flow prediction in autonomous vehicles: A review, solutions, and challenges,” Veh. Commun., vol. 20, 2019, Art. no. 100184.
[10] D. Kang, Y. Lv, and Y.-Y. Chen, “Short-term traffic flow prediction with lstm recurrent neural network,” in Proc. IEEE 20th Int. Conf. Intell. Transp. Syst., 2017, pp. 1–6.
[11] X. Feng, X. Ling, H. Zheng, Z. Chen, and Y. Xu, “Adaptive multi-kernel SVM with spatial–temporal correlation for short-term traffic flow prediction,” IEEE Trans. Intell. Transp. Syst., vol. 20, no. 6, pp. 2001–2013, Jun. 2019.
[12] S. Zhu, Y. Zhao, Y. Zhang, Q. Li, W. Wang, and S. Yang, “Short-term traffic flow prediction with wavelet and multi-dimensional taylor network model,” IEEE Trans. Intell. Transp. Syst., vol. 22, no. 5, pp. 3203–3208, May 2021.
[13] Z. Zhene et al., “Deep convolutional mesh rnn for urban traffic passenger flows prediction,” in Proc. IEEE SmartWorld, Ubiquitous Intell. Comput., Adv. Trusted Comput., Scalable Comput. Commun., Cloud Big Data Comput., Internet People Smart City Innov., 2018, pp. 1305–1310.
[14] Y. Wu, H. Tan, L. Qin, B. Ran, and Z. Jiang, “A hybrid deep learning based traffic flow prediction method and its understanding,” Transp. Res. Part C: Emerg. Technol., vol. 90, pp. 166–180, 2018.
[15] D. Chen, “Research on traffic flow prediction in the Big Data environment based on the improved RBF neural network,” IEEE Trans. Ind. Informat., vol. 13, no. 4, pp. 2000–2008, Aug. 2017.
[16] P. Wang, W. Hao, and Y. Jin, “Fine-grained traffic flow prediction of various vehicle types via fusion of multisource data and deep learning approaches,” IEEE Trans. Intell. Transp. Syst., vol. 22, no. 11, pp. 6921–6930, Nov. 2021.
[17] H. Zheng, F. Lin, X. Feng, and Y. Chen, “A hybrid deep learning model with attention-based conv-lstm networks for short-term traffic flow prediction,” IEEE Trans. Intell. Transp. Syst., vol. 22, no. 11, pp. 6910–6920, Nov. 2021.
[18] C. Zheng, X. Fan, C. Wen, L. Chen, C. Wang, and J. Li, “DeepStd: Mining spatio-temporal disturbances of multiple context factors for citywide traffic flow prediction,” IEEE Trans. Intell. Transp. Syst., vol. 21, no. 9, pp. 3744–3755, Sep. 2020.
[19] S. Fang, Q. Zhang, G. Meng, S. Xiang, and C. Pan, “Gstnet: Global spatial-temporal network for traffic flow prediction,” in Proc. Int. Joint Conf. Artif. Intell., 2019, pp. 2286–2293.
[20] X. Shi, Z. Chen, H. Wang, D. Y. Yeung, W. K. Wong, and W. C. Woo, “Convolutional LSTM network: A machine learning approach for precipitation nowcasting,” in Proc. 28th Int. Conf. Neural Inf. Process. Syst., 2015, pp. 802–810.
[21] H. Yao et al., “Deep multi-view spatial-temporal network for taxi demand prediction,” in Proc. AAAI Conf. Artif. Intell., 2018, pp. 2588–2595.
[22] H. Yao, X. Tang, H. Wei, G. Zheng, and Z. Li, “Revisiting spatial-temporal similarity: A deep learning framework for traffic prediction,” in Proc. AAAI Conf. Artif. Intell., 2019, pp. 5668–5675.
[23] J. Zhou, H.-N. Dai, H. Wang, and T. Wang, “Wide-attention and deep-composite model for traffic flow prediction in transportation cyber-physical systems,” IEEE Trans. Ind. Informat., vol. 17, no. 5, pp. 3431–3440, May 2020.
[24] N. G. Polson and V. O. Sokolov, “Deep learning for short-term traffic flow prediction,” Transp. Res. Part C: Emerg. Technol., vol. 79, pp. 1–17, 2017.
[25] A. Koesdwiady, R. Soua, and F. Karray, “Improving traffic flow prediction with weather information in connected cars: A deep learning approach,” IEEE Trans. Veh. Technol., vol. 65, no. 12, pp. 9508–9517, Dec. 2016.
[26] Y. Chen, L. Shu, and L. Wang, “Traffic flow prediction with Big Data: A deep learning based time series model,” in Proc. IEEE Conf. Comput. Commun. Workshops, 2017, pp. 1010–1011.
[27] H. Tan, Y. Wu, B. Shen, P. J. Jin, and B. Ran, “Short-term traffic prediction based on dynamic tensor completion,” IEEE Trans. Intell. Transp. Syst., vol. 17, no. 8, pp. 2123–2133, Aug. 2016.
[28] Y. Tao, P. Sun, and A. Boukerche, “A delay-based deep learning approach for urban traffic volume prediction,” in Proc. IEEE Int. Conf. Commun., 2020, pp. 1–6.
[29] S. Guo, Y. Lin, N. Feng, C. Song, and H. Wan, “Attention based spatial-temporal graph convolutional networks for traffic flow forecasting,” in Proc. AAAI Conf. Artif. Intell., 2019, pp. 922–929.
[30] H. Nguyen, L.-M. Kieu, T. Wen, and C. Cai, “Deep learning methods in transportation domain: A review,” IET Intell. Transport Syst., vol. 12, no. 9, pp. 998–1004, 2018.
[31] A.-S. Mihaita, H. Li, Z. He, and M.-A. Rizoiu, “Motorway traffic flow prediction using advanced deep learning,” in Proc. IEEE Intell. Transp. Syst. Conf., 2019, pp. 1683–1690.
[32] H.-F. Yang, T. S. Dillon, and Y.-P. P. Chen, “Optimized structure of the traffic flow forecasting model with a deep learning approach,” IEEE Trans. Neural Netw. Learn. Syst., vol. 28, no. 10, pp. 2371–2381, Oct. 2017.
[33] K. Tang, S. Chen, and A. J. Khattak, “A spatial–temporal multitask collaborative learning model for multistep traffic flow prediction,” Transp. Res. Rec., vol. 2672, no. 45, pp. 1–13, 2018.
[34] L. Chen, K. Han, Q. Yin, and Z. Cao, “GDCRN: Global diffusion convolutional residual network for traffic flow prediction,” in Proc. Int. Conf. Knowl. Sci., Eng. Manage., 2020, pp. 438–449.

He-xuan Hu received the BS degree in thermal energy engineering from South-East University, Nanjing, China, in 1998, and the MSc and PhD degrees in automation science, computer engineering and signal processing from the University of Science and Technology of Lille, Villeneuve-d’Ascq, France, in 2003 and 2009, respectively. He is currently a professor with the College of Computer and Information, Hohai University, Nanjing, where he develops and applies models and algorithms in the fields of artificial intelligence, Internet of Things, intelligent control theory, reconfigurable control, automated planning, and model checking.

Zhen-zhou Lin received the MS degree in project management from the Nanjing University of Posts and Telecommunications, Nanjing, China, in 2011. He is currently working toward the PhD degree in computer science and technology with the College of Computer and Information, Hohai University, Nanjing. He is currently a senior engineer with the office of educational administration, Nanjing University of Finance and Economics, Nanjing. His current research interests include application software developing, artificial intelligence, Internet of Things, image processing and computer vision, and educational administration informatization.

Qiang Hu received the bachelor’s degree in renewable energy science and engineering from Hohai University, Nanjing, China, in 2015, and the master’s degree in fluid machinery and engineering from the College of Energy and Electrical Engineering, Hohai University, in 2018. He is currently working toward the PhD degree in computer science and technology with the College of Computer and Information, Hohai University. His current research interests include artificial intelligence, Internet of Things, image processing and computer vision.

Ye Zhang received the BS degree from the Faculty of Computer and Information, South-East University, Nanjing, China, and the MSc and PhD degrees in automatique, genie informatique et traitement du Signal from the University of Science and Technology of Lille, France. She is a lecturer with the College of Computer and Information, Hohai University, Nanjing, where she develops and applies models and algorithms in the fields of automated planning, model checking, cloud computing, parallel computing, and big data.

Wei Wei (Senior Member, IEEE) received the MS and PhD degrees from Xi’an Jiaotong University, in 2011 and 2005, respectively. He is currently an associate professor with the School of Computer Science and Engineering, Xi’an University of Technology, Xi’an, China. His research interest is in the area of wireless networks, wireless sensor networks application, image processing, mobile computing, distributed computing, and pervasive computing, Internet of Things, sensor data clouds, and so on. He is a senior member of CCF.

Wei Wang received the PhD degree in software engineering from the Dalian University of Technology, in 2018. He is currently an associate professor with the School of Intelligent Systems Engineering, Sun Yat-sen University, China. Before this, he had been the UM Macao research fellow with the University of Macau, Macau SAR. His research interests include computational social science, data mining, Internet of Things, and artificial intelligence.
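
As a supplementary illustration of the RMSE definition in (27) and of its sensitivity to outliers, the following Python sketch may be helpful. It is not the authors' evaluation code: the function names and the sample flow values are hypothetical, and the MAPE function uses the standard definition of the metric, which is assumed here rather than taken from this section.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error, following the form of (27)."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))


def mape(predicted, actual):
    """Mean absolute percentage error in percent (standard definition, assumed)."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    ratios = [abs((p - a) / a) for p, a in zip(predicted, actual)]
    return 100.0 * sum(ratios) / len(ratios)


# Hypothetical traffic flow values, for illustration only.
actual = [100.0, 120.0, 110.0, 130.0]
close = [102.0, 118.0, 112.0, 128.0]    # every prediction is off by 2
outlier = [102.0, 118.0, 112.0, 190.0]  # the last prediction is off by 60

rmse_close = rmse(close, actual)
rmse_outlier = rmse(outlier, actual)
mape_close = mape(close, actual)
mape_outlier = mape(outlier, actual)

print(f"RMSE without outlier: {rmse_close:.2f}, with outlier: {rmse_outlier:.2f}")
print(f"MAPE without outlier: {mape_close:.2f}%, with outlier: {mape_outlier:.2f}%")
```

In this toy setting a single outlying prediction raises the RMSE by a larger factor than the MAPE, because the deviation is squared before averaging. This is consistent with the remark that RMSE is very sensitive to outliers.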