2018 IEEE International Conference on Big Data (Big Data)

Reputation-Aware Data Fusion and Malicious Participant Detection in Mobile Crowdsensing
Yujian ‘Charles’ Tang*+ , Samia Tasnim*, Niki Pissinou, S. S. Iyengar, Abdur Shahid
School of Computing and Information Sciences
Florida International University
Miami, Florida 33199
Email: {stasn002, pissinou, ashah044}@fiu.edu, [email protected]
+ University of North Carolina, Chapel Hill, NC 27599
Email: {[email protected]}
* Both authors contributed equally to this work

Abstract- Mobile crowdsensing, an emerging sensing paradigm, promotes scalability and a reduction in the deployment of specialized sensing devices for large-scale data collection in a decentralized fashion. However, its open structure allows malicious entities to interrupt a system by reporting fabricated or erroneous data, making trust evaluation a highly important issue in mobile crowdsensing applications. The goal of this research is to show that introducing a reputation system into the process of correlated sensor-based data fusion will enhance the overall quality of the sensed data. To do so, we design a reputation-aware data fusion mechanism to ensure data integrity. We use the Gompertz function in our reputation method to rate the trustworthiness of the data reported by a crowdsensing participant. The proposed mechanism, on one hand, is capable of defending against a data corruption attack and identifying malicious or honest participants based on their reported data in real time. On the other hand, this mechanism yields more accurate data prediction in terms of lower data prediction error. We conducted experiments using two different real-world datasets. We compare our correlated data and reputation-aware data prediction (CDR) method with other popular methods, and the results show that our method incurs lower data prediction error.

Key words- Big data analytics; anomaly detection; data fusion; mobile crowdsensing; spatial-temporal data analysis

I. INTRODUCTION

With the advent of better wireless technology and an increase in smartphone usage, a new mode of data collection named mobile crowdsensing (MCS) has emerged. Mobile crowdsensing has a number of practical applications: traffic monitoring, epidemic disease monitoring, reporting from disaster situations and environment monitoring [1], [2], [3]. For example, an environmental air quality sensing system was deployed on street sweeping vehicles to monitor air quality in San Francisco [4]. These applications are usually open to the public and receive sensor data from multiple participants. This reduces data sparsity at lower cost in comparison with traditional sensor networks. Despite these advantages, MCS’s people-centric architecture also admits more inaccurate and corrupted data [5]. Malicious participants can manipulate the MCS data collection process with ease. These entities can interrupt a system by reporting fabricated or erroneous data, making trust evaluation a highly important issue in MCS applications. Therefore, validating the accuracy of contributions is essential to ensure the reliability of the application system.

In this paper, we consider the data corruption attack behavior of a malicious participant. By malicious we mean a participant who sends incorrect data either intentionally or unintentionally. The unintentional error can arise because a participant carelessly performed the sensing task, or due to a sensor error. In contrast, a malicious participant can deliberately fabricate the sensed data to infiltrate the system. For example, in air quality monitoring, a malicious participant may hold the sensor beside a burning cigarette or place it over sand instead of facing it toward the air. Thus, the reported data will not represent the actual air quality. In the related contemporary works [6], [7], [8], [9], the authors did not consider the participants’ malicious behavior. Thus, these works were not able to distinguish the sensing data reported by malicious or careless users. This limitation of the existing works motivates us to design reputation-aware real-time data fusion algorithms for MCS to ensure data integrity. Our method can detect malicious participants and prevent them from infiltrating the system in real time.

We develop an online method for data quality prediction in MCS considering the heterogeneous trust level of the participants. We take into account spatio-temporal and inter sensor-category correlations. We consider the users who are willing to participate in sensing at the same time. The terms participant or node are used to denote a user with sensing capability.

We implement our Correlated Data and Reputation-aware Data Prediction (CDR) method on two real-world datasets [10], [11]. The sensing was performed for four days, and there are 289 taxi values in the first real dataset. The taxis move around different parts of Rome sensing temperature. The second data set consists of Beijing’s air quality data. One hundred and forty nine taxis with four types of sensors collect PM2.5, PM1.0, NO2 and humidity data from Beijing over seven days.

The main contributions of this paper are as follows.

978-1-5386-5035-6/18/$31.00 ©2018 IEEE 4820

Authorized licensed use limited to: University of Pittsburgh. Downloaded on January 02,2021 at 20:34:54 UTC from IEEE Xplore. Restrictions apply.
• We propose a novel reputation-aware, correlated sensor-based, lightweight real-time data fusion and malicious participant detection mechanism for mobile crowdsensing data streams.
• Extensive experiments using two real-world data sets demonstrate the efficiency and accuracy of our data prediction mechanism over state-of-the-art techniques.

We organize our paper as follows. In Section II, we discuss the related work. In Section III, we discuss the different modules of our overall system. In Section IV, we discuss our performance evaluation. Finally, concluding remarks and future work are offered in Section V.

Fig. 1. Three-dimensional Tensor

II. RELATED WORK

In this section, we discuss the most pertinent works. Zhang et al. [12] proposed a data cleaning method for environmental sensing that was based on the incrementally adjusted reliability of individual sensors. Over time, they incrementally adjusted the reliability of each sensor depending on the accuracy of its sensing data. Trustworthiness has been considered as a measure for data quality estimation [13]. Huang et al. [14] showed that using a reputation framework helped to weed out non-colluding malicious attackers. Their reputation framework produced more accurate results than not using a reputation framework. However, the authors assumed that data arrives from every discrete block of space-time, which is not practical in real-world scenarios. In contrast, Peng et al. [15] used unsupervised learning for data quality estimation. Because this method works only after historical data has been collected from all the users, it is not an online method.

Nowadays, instead of traditional static wireless sensor networks, the sensing is distributed among a crowd of people. This brings heterogeneity to the sensor networks and makes the computation more complex. The most recent work on data quality estimation in mobile crowdsensing is by Liu et al. [16]. The authors addressed real-time data estimation in mobile crowdsensing and proposed a context-aware method for data quality estimation. The limitation of this work is that the authors considered the presence of exactly one mobile user at each point of interest (PoI). Kishino et al. [17] mounted sensor nodes on garbage trucks that drive around the city. Their motivation was to detect target events by analyzing vehicle-mounted sensor data streams. The authors used machine learning methods to do so. On the other hand, the author of [7] introduced a sampling method named stratified sampling for calculating the mean temperature of a linear area. In that paper, only the random waypoint mobility model was considered for the movement of the sensing devices.

There are several works focusing on the cleaning of data streams. Most of the previous works on sensor data cleaning focused on reducing energy consumption. To achieve this reduction, the authors of [18], [19], [20] tried to reduce inter-node communication. In these works, it was assumed that sensor data are always aggregated during submission. There have been significant works on using compressive sensing for data reconstruction in static sensor networks [21], [22].

In recent years, researchers [23], [24], [25], [26], [16], [27] have been designing frameworks to deal with big data services. In the past, data sizes were not as large as they are today, which motivates researchers to design and develop scalable mechanisms to correct any kind of inaccuracy in data streams. Liu et al. [26] designed a framework for big data cleaning. This paper gives direction on how to achieve a reliable database in big data applications. They used context to find similarity between data items. Moreover, the authors exploited usage patterns to classify and group data items that are not related contextually. One of the challenging tasks in dealing with big data is to shrink the data size by filtering out the irrelevant subset. Dong et al. [28], in contrast, argued that having more data does not always provide more information. During data integration, proper selection of reliable sources among all available sources results in higher data accuracy.

Another line of literature focuses on finding outliers in sensor data streams. To find global outliers in the data, Branch et al. [18] proposed a distance-based ranking method. Other existing methods for finding outliers in sensor data are geometry-based [29], polygon-based spatial outlier detection [30], clustering-based [31], kernel density-based [8] and histogram-based [32] approaches. Bosman et al. [33] tried to answer the question of whether adding more neighbors makes anomaly detection perform better. That paper considered static sensor nodes and varied the neighborhood size by changing the communication range of the sensors.

However, to the best of our knowledge, we are the first to develop a reputation-aware, correlated sensor-based real-time data fusion and malicious participant detection mechanism for mobile crowdsensing data streams.

III. METHODOLOGY

In this section we first present an overview of the proposed mechanism, Correlated Data and Reputation-aware Data Prediction (CDR), then a detailed description of the components, and finally how we fit them together to create our full structure.

A. Overview

CDR consists of two parts: a reputation calculation method and correlated data [6]. The reputation method considers two types of trust for each sensor, cooperation and reputation, and both parameters are calculated at the application server level. The reputation calculation method is applied to multiple types of sensor data streams. These varied sensors are correlated with each other. It is important for our mechanism to take the granularity of time and space into account. We discretize time into epochs, and space into equal-sized grids. The framework is applied only to data from sensors within the same region and the same epoch. CDR is applied to each different type of data and then the final, discretized space-time blocks are used to produce a least-squares regression on the target data type. This regression can be used to predict both future data and missing data. We borrow the concept of three-dimensional tensors shown in Fig. 1 from [6]. The authors considered temporal interpolation for the sparse regions. However, Kang et al. [6] assumed that all incoming data from sensors was accurate.

B. Cooperation

Cooperation scores of sensors are measured per epoch; they measure the proportion of the inverse square root error of the data from the sensor over the sum of the proportion of the inverse square root error from all sensors. For our cooperation parameter, we used an inverse proportion of the square root of the absolute error so as not to punish small deviations from the average as much. In the data sets we tested, temperature data and air quality data, small variations from the average are common. The equation for the cooperation score is shown in Eq. 1.

$$ p_i = \frac{\dfrac{1}{\sqrt{|x_i - r|} \Big/ \sum_{m=1}^{n} \sqrt{|x_m - r|}}}{\sum_{j=1}^{n} \dfrac{1}{\sqrt{|x_j - r|} \Big/ \sum_{m=1}^{n} \sqrt{|x_m - r|}}} \tag{1} $$

where $r$ is the robust average of the data in that epoch and $x_i$ is the measurement from sensor $i$. The robust average of the data provides an idea of where the data clusters, and this increases the accuracy of the data by assigning more weight to values that occur more frequently. We calculate the robust average using Eq. 2.

$$ r = \sum_{i=1}^{n} p_i x_i \tag{2} $$

C. Reputation

Reputation scores are updated at the end of each epoch; they measure how accurate the crowdsensing participant has been over time. To calculate reputation from cooperation scores, first the cooperation scores are normalized [14] using Eq. 3. Here, $p_i$ is the cooperation score of participant $i$, and $\min(p)$ and $\max(p)$ denote the minimum and maximum cooperation score among all the participants during that epoch. After normalization, the cooperation scores belong to the range [−1, 1].

$$ p_i^{norm} = \frac{2\,(p_i - \min(p))}{\max(p) - \min(p)} - 1 \tag{3} $$

We want to maximize the impact of the most recent epochs and minimize the impact of the least recent ones. To make the aging effective, we age the normalized cooperation scores with Eq. 4.

$$ p'_{i,k} = \sum_{k'=1}^{k} \lambda^{k-k'} \, p^{norm}_{i,k'} \tag{4} $$

Here, $k$ denotes the current epoch and $k'$ takes values from 1 to the current epoch. The aging parameter $\lambda$ takes a value in [0, 1]. Finally, reputation is calculated using the Gompertz function [14], shown in Eq. 5.

$$ R_{i,k} = a\,e^{b\,e^{c\,p'_{i,k}}} \tag{5} $$

Here, $a$, $b$ and $c$ are function parameters. The parameter $a$ denotes the upper asymptote, displacement along the x-axis is controlled by $b$ and the growth rate is controlled by parameter $c$.

D. Full Structure

We discretize the space into regions and the time into epochs, then we run CDR on every discrete block of space-time.

First we run an Expectation Maximization (EM) algorithm, shown in Algorithm 1, on the “reputable” sensors. To be classified as reputable, a participant must have a reputation higher than the threshold. This threshold is application dependent. Initially, all sensors are classified as reputable with equal cooperation scores.

Algorithm 1 Expectation Maximization on Cooperation Scores for Robust Average
Input: Robust Average ($r$), Cooperation Scores ($p_i$)
Output: Robust Average ($r$)
Initialize: all $p_i$ to $1/n$, where $n$ is the number of sensors, and $l = 0$, where $l$ is the iteration
while $p_i^{l}$ and $p_i^{l+1}$ don’t converge do
    Compute $r^{l+1}$ from $p_i^{l}$’s using Eq. 2
    Compute $p_i^{l+1}$’s from $r^{l+1}$ using Eq. 1
    $l = l + 1$
end
return $r^{l+1}$

After running the EM algorithm once on only the reputable sensors, we then check the reported values from “disreputable” sensors, or sensors with a reputation lower than the threshold. If the reported value from any of these sensors is within an acceptable error range of the robust average calculated from the reputable sensors’ reported data, then it is added as a faux reputable sensor in that block of space-time. After

finding all the sensors from the set of disreputable sensors that contributed acceptable data in the block of space-time, EM is then run again on the new set of reputable sensors. The reason that we run EM twice is to give sensors in the disreputable set a chance to move into the reputable set if they consistently contribute accurate data, because only sensors with a cooperation score for the epoch will have their reputations updated. The second EM run yields a new robust average as well as the cooperation scores used to update the reputation of each sensor.
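The two EM passes described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the `eps` guard against a zero error, the convergence tolerance, the iteration cap, and the fallback used when no sensor is reputable are all choices made for this sketch. Each block is assumed to contain at least one reading.

```python
import math

def cooperation_scores(values, r, eps=1e-9):
    """Eq. 1: inverse share of the square-root absolute error, normalized."""
    # eps is an assumption of this sketch; it guards against a zero error.
    errs = [math.sqrt(abs(x - r)) + eps for x in values]
    total = sum(errs)
    inverse = [total / e for e in errs]          # 1 / (err_i / sum of errs)
    norm = sum(inverse)
    return [w / norm for w in inverse]

def robust_average(values, tol=1e-6, max_iter=100):
    """Algorithm 1: alternate Eq. 2 and Eq. 1 until the scores settle."""
    n = len(values)
    p = [1.0 / n] * n                            # equal initial scores
    r = sum(values) / n
    for _ in range(max_iter):
        r = sum(pi * x for pi, x in zip(p, values))      # Eq. 2
        p_new = cooperation_scores(values, r)            # Eq. 1
        delta = max(abs(a - b) for a, b in zip(p, p_new))
        p = p_new
        if delta < tol:
            break
    return r, p

def fuse_block(readings, reputation, rep_threshold, accept_range):
    """Two EM passes over one space-time block.

    readings and reputation are dicts keyed by a (hypothetical) sensor id;
    sensors without a reputation yet are treated as reputable.
    """
    reputable = [s for s in readings
                 if reputation.get(s, rep_threshold) >= rep_threshold]
    if not reputable:                            # assumption: fall back to all
        reputable = list(readings)
    r_first, _ = robust_average([readings[s] for s in reputable])
    # Disreputable sensors close to the first robust average are re-admitted.
    faux = [s for s in readings if s not in reputable
            and abs(readings[s] - r_first) <= accept_range]
    members = reputable + faux
    r, p = robust_average([readings[s] for s in members])
    return r, dict(zip(members, p))
```

Only the sensors returned in the score dictionary took part in the second pass, so only they would have their reputations updated for the epoch.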
Fig. 2. Prediction results for test set 1: out of 612 predictions, CDR performed better in 466 and was within 5% of the true value in 290 cases

The new cooperation scores are then normalized to the range [−1, 1] using Eq. 3. The normalized cooperation scores are then aged based on their cooperation rating. Sensors with a cooperation score above a certain threshold are labeled as “cooperative” and sensors with a cooperation score below that threshold are labeled as “uncooperative”. Depending on the sensor’s classification for the latest block, the normalized cooperation is multiplied by a different aging parameter, λ. Cooperative sensors are multiplied by a lower aging parameter than uncooperative sensors. This means that the growth and decay rates of reputation will be different; the decay rate will be higher, and this provides higher punishment for bad data and thus helps quickly detect malicious users. Finally, the aged cooperation score is fed into Eq. 5.
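A minimal sketch of the reputation update (Eq. 3, Eq. 4 and Eq. 5) follows. The numeric defaults for `a`, `b`, `c`, the two aging parameters and the cooperation threshold are illustrative assumptions and are not taken from this paper. Choosing the aging parameter per epoch from that epoch's own normalized score is one possible reading of the text, and all function names are hypothetical.

```python
import math

def normalize(scores):
    """Eq. 3: map one epoch's cooperation scores onto [-1, 1]."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:                      # assumption: identical scores map to 0
        return {s: 0.0 for s in scores}
    return {s: 2.0 * (p - lo) / (hi - lo) - 1.0 for s, p in scores.items()}

def aged_score(history, lam_coop=0.7, lam_uncoop=0.8, coop_threshold=0.0):
    """Eq. 4: sum of lam**(k - k') * p_norm over epochs k' = 1..k.

    history holds one sensor's normalized scores, oldest first. The aging
    parameter depends on whether that epoch was cooperative (lower value)
    or uncooperative (higher value), so bad epochs fade more slowly.
    """
    k = len(history)
    total = 0.0
    for k_prime, p_norm in enumerate(history, start=1):
        lam = lam_coop if p_norm >= coop_threshold else lam_uncoop
        total += lam ** (k - k_prime) * p_norm
    return total

def gompertz(p_aged, a=1.0, b=-2.5, c=-0.85):
    """Eq. 5: R = a * exp(b * exp(c * p')). With negative b and c the
    reputation grows with the aged score and saturates at a."""
    return a * math.exp(b * math.exp(c * p_aged))

def reputation(history):
    """Reputation of one sensor from its normalized score history."""
    return gompertz(aged_score(history))
```

With these illustrative parameters a sensor that is consistently at the top of the normalized range moves toward the upper asymptote `a`, while a sensor that is consistently at the bottom is driven close to zero within a few epochs.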
Once all the blocks are processed for each data type, we use the processed data to create a least-squares fit with the non-target data as the coefficient matrix, $A$, and the target data type as the dependent vector, $b$, as shown in Eq. 6.

$$ A\hat{x} = b \tag{6} $$

The regression coefficients, $\hat{x}$, are then used to predict the target value given knowledge of all the other data values.
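The least-squares step of Eq. 6 can be illustrated with a small, dependency-free sketch. It solves the normal equations by Gaussian elimination, which is a choice made here for self-containment rather than a method stated in the paper; the function names and the sample numbers in the usage lines are hypothetical.

```python
def least_squares(A, b):
    """Return x_hat minimizing ||A x - b||^2 via (A^T A) x = A^T b."""
    m, n = len(A), len(A[0])
    # Augmented normal-equation system [A^T A | A^T b].
    M = [[sum(A[r][i] * A[r][j] for r in range(m)) for j in range(n)]
         + [sum(A[r][i] * b[r] for r in range(m))] for i in range(n)]
    for col in range(n):
        # Partial pivoting keeps the elimination numerically stable.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("correlated inputs are linearly dependent")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):                 # back substitution
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def predict(x_hat, features):
    """Predict the target value from the non-target values of one block."""
    return sum(c * f for c, f in zip(x_hat, features))

# Each row of A holds the fused non-target values of one space-time block,
# and b holds the fused target value of the same block (made-up numbers).
A = [[1.0, 2.0], [2.0, 1.0], [3.0, 5.0], [4.0, 3.0]]
b = [8.0, 7.0, 21.0, 17.0]
x_hat = least_squares(A, b)
estimate = predict(x_hat, [5.0, 5.0])
```

For larger or ill-conditioned systems a QR- or SVD-based solver from a numerical library is usually preferable to the normal equations.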
IV. PERFORMANCE EVALUATION

We used percentage absolute difference and Root Mean Square Error (RMSE) as performance metrics of data prediction accuracy. We compared the performance of our CDR method against mean-based and temporal linear regression-based data prediction models. We tested using two real-world data sets. In the first data set, our target type is temperature and we use two types of simulated correlated data. In the second data set, our target type is particulate matter with a diameter under 2.5 µm (PM2.5) and we use three types of real correlated data (PM1.0, NO2 and humidity).

A. Temperature

The temperature data was from an area of roughly 22 km by 23 km and was taken over four days. The experimental area was split into 25 regions using a 5x5 equal-sized grid. We split the execution time into 96 epochs with each epoch being one hour long. We tested the performance of our CDR method against the existing mean-based method on three test data sets. To imitate data impurity, continuous or random errors were applied to the temperature data streams. The data error from malicious participants ranged from 25% to 75%. Figures 2 through 4 show CDR’s percentage improvement over the mean-based method, and each figure shows 612 predictions.

Fig. 3. Prediction results for test set 2: out of 612 predictions, CDR performed better in 453 and was within 5% of the true value in 261 cases

Fig. 4. Prediction results for test set 3: out of 612 predictions, CDR performed better in 498 and was within 5% of the true value in 213 cases

Fig. 5. Prediction results for test set 1: out of 640 predictions, CDR performed better in 379 cases

Fig. 6. Prediction results for test set 2: out of 640 predictions, CDR performed better in 445 cases

On average CDR was 16% more accurate and performed better in 77 percent of cases. Our CDR method incurred a cumulative percentage error of 9.3%.
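The space-time discretization used for the temperature data (a 5x5 grid over roughly 22 km by 23 km, with one-hour epochs) can be sketched as below. The sketch assumes readings are already expressed in kilometres from the south-west corner of the area and in seconds from the start of sensing; converting the dataset's GPS coordinates to that form is outside its scope, and the function names and tuple layout are hypothetical.

```python
def block_index(x_km, y_km, t_seconds,
                width_km=22.0, height_km=23.0, grid=5, epoch_seconds=3600):
    """Map a reading to its (region, epoch) block.

    Regions are numbered row by row from 0 to grid*grid - 1; readings on
    the far edge of the area are clamped into the last row or column.
    """
    col = min(int(x_km / (width_km / grid)), grid - 1)
    row = min(int(y_km / (height_km / grid)), grid - 1)
    region = row * grid + col
    epoch = int(t_seconds // epoch_seconds)
    return region, epoch

def group_by_block(readings):
    """Group (sensor_id, x_km, y_km, t_seconds, value) tuples by block.

    If a sensor reports more than once inside a block, this sketch keeps
    only its latest value, which is an assumption made here.
    """
    blocks = {}
    for sensor_id, x_km, y_km, t_seconds, value in readings:
        key = block_index(x_km, y_km, t_seconds)
        blocks.setdefault(key, {})[sensor_id] = value
    return blocks
```

Each resulting block is the unit on which the fusion and reputation steps operate, since the framework is applied only to sensors within the same region and the same epoch.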

B. PM2.5
The air quality data was collected from an area of roughly
120km by 150km. The duration was seven days (149 hours).
CDR was tested against the existing mean-based and temporal
linear regression-based data prediction methods on five test
data sets. To imitate the data impurity, continuous or random
errors were applied on the crowdsensing data streams. The
data error ranged from 25% to 75%.
Fig. 7. Prediction results for test set 3: out of 640 predictions, CDR performed better in 442 cases

We tested the performance of our algorithm for different levels of erroneous data from malicious users. We also varied the knowledge level of the participants with regard to the experimental environment to imitate sophisticated data manipulation by a malicious crowdsensing participant. Test set 1 (Fig. 5, Fig. 10, Fig. 15) was used for missing data prediction. We tested with sequential and random data loss patterns. In the first experiment with erroneous data from malicious users (Fig. 6, Fig. 8, Fig. 11, Fig. 13, Fig. 16, Fig. 18), we assumed the participants did not have any prior knowledge about the experimental environment. The data error ranged from 25% to 75%. One group of malicious participants reported a fixed percentage of error throughout the experiment. In the second experiment, we considered that the malicious participant has extended knowledge about the sensing area (Fig. 7, Fig. 9, Fig. 12, Fig. 14, Fig. 17, Fig. 19). Thus, these participants try to change the sensing data by adding noise to the air quality data of that particular spatio-temporal unit.
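The exact error model behind the two attacker types is not given in the text, so the following is only a hypothetical illustration of how such test data could be generated. The multiplicative form of the error, the uniform noise distribution and both function names are assumptions of this sketch.

```python
import random

def fixed_percentage_attack(true_value, error_fraction):
    """Attacker without knowledge of the environment: reports the true
    reading inflated by a fixed fraction (for example 0.25 to 0.75)."""
    return true_value * (1.0 + error_fraction)

def informed_noise_attack(block_average, error_fraction, rng):
    """Attacker who knows the sensing area: perturbs the typical value of
    the spatio-temporal unit with bounded random noise."""
    noise = rng.uniform(-error_fraction, error_fraction)
    return block_average * (1.0 + noise)

# A seeded generator keeps the corrupted stream reproducible.
rng = random.Random(0)
fixed_report = fixed_percentage_attack(80.0, 0.25)
noisy_report = informed_noise_attack(80.0, 0.5, rng)
```

Under this reading, the fixed-percentage attacker is consistently far from the robust average, while the informed attacker produces values that are harder to separate from honest variation.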
1) Percent Error per Prediction: Figures 5 through 9 show
CDR’s percentage improvement over the mean-based method,
and each figure shows 640 predictions. On average CDR
performed better in 70% of cases and is 70% more accurate.
Fig. 8. Prediction results for test set 4: out of 640 predictions, CDR performed better in 454 cases

Fig. 9. Prediction results for test set 5: out of 640 predictions, CDR performed better in 533 cases

Fig. 10. Prediction results for test set 1: out of 149 epochs, CDR performed better in 88 epochs

Fig. 11. Prediction results for test set 2: out of 149 epochs, CDR performed better in 88 epochs

Fig. 12. Prediction results for test set 3: out of 149 epochs, CDR performed better in 90 epochs

2) Root Mean Square Error by Epoch: Figures 10 through 19 show CDR’s improvement of the root mean square error (RMSE) normalized by epoch. We calculated RMSE and used it as a performance measurement criterion for our algorithm. RMSE is a standard metric to evaluate the accuracy of the prediction model [12].

$$ RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (\hat{V}_i - V_i)^2}, \tag{7} $$

where $\hat{V}_i$ is the predicted value, $V_i$ is the original value and $n$ is the number of epochs.
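Eq. 7 translates directly into code. The following is a minimal sketch, with a hypothetical function name, that takes the predicted and original values as equally long sequences.

```python
import math

def rmse(predicted, actual):
    """Eq. 7: root of the mean squared difference between the predicted
    and the original values."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two equally sized, non-empty sequences")
    squared = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared / len(actual))
```

Because the errors are squared before averaging, a few large misses raise the RMSE more than many small ones, which is why it complements the per-prediction percentage comparison.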
On average CDR had a lower RMSE than the mean-based method in 64 percent of the epochs, and its RMSE was lower by 25%. CDR’s average RMSE was 0.66; the average value of the target data type, PM2.5, was 79 with a range of [4, 244]. Figures 15 through 19 show CDR’s improvement in RMSE over a temporal linear regression-based data prediction model. On average CDR incurred a lower RMSE than the linear regression model by 59%, and performed better in 71 percent of epochs.

Fig. 13. Prediction results for test set 4: out of 149 epochs, CDR performed better in 96 epochs

Fig. 14. Prediction results for test set 5: out of 149 epochs, CDR performed better in 115 epochs

Fig. 15. Prediction results for test set 1: out of 149 epochs, CDR performed better in 119 epochs

Fig. 16. Prediction results for test set 2: out of 149 epochs, CDR performed better in 119 epochs

Fig. 17. Prediction results for test set 3: out of 149 epochs, CDR performed better in 105 epochs

Fig. 18. Prediction results for test set 4: out of 149 epochs, CDR performed better in 93 epochs

Fig. 19. Prediction results for test set 5: out of 149 epochs, CDR performed better in 95 epochs

V. CONCLUSION & FUTURE WORK

In this paper, we proposed a novel method, named CDR, for reputation-aware data fusion for mobile crowdsensing data streams. We showed that the proposed mechanism outperforms the existing mean-based and temporal linear regression-based data prediction models. We evaluated the approaches on two datasets, the Rome crowdsensing temperature dataset and the Beijing air quality dataset, to demonstrate CDR’s efficacy in different scenarios. For the Rome crowdsensing dataset, we achieved 16% better accuracy. Specifically, the 9.3% prediction error in temperature measurements of our approach equates to roughly 1 degree difference, which is negligible in real-life applications. With this in mind, we can say that our mechanism predicts temperature values with high accuracy. In the case of the air quality dataset, our CDR method incurred on average 25% and 59% less RMSE than the mean-based and temporal linear regression models, respectively. Our data fusion method incurred an average RMSE of 0.66 per epoch, which indicates higher data prediction accuracy. The success of our approach lies in the integration of dynamic trust evaluation of the sensed data, which allows us to defend against data corruption attacks and identify malicious or honest participants based on their reported data in real time. In the future, we will extend this work to consider collusion attacks by malicious participants.

ACKNOWLEDGMENT

The work was supported by the National Science Foundation Grant number CNS-1560134 for the Research Experience for Undergraduates, Advanced Secured Sensor Enabling Technologies, and the Dissertation Year Fellowship support provided by Florida International University’s Graduate School. The authors would like to thank Eric Xu for his contribution.

REFERENCES

[1] F. Chen, P. Deng, J. Wan, D. Zhang, A. V. Vasilakos, and X. Rong, “Data mining for the internet of things: literature review and challenges,” International Journal of Distributed Sensor Networks, vol. 11, no. 8, p. 431047, 2015.
[2] Z. Feng and Y. Zhu, “A survey on trajectory data mining: techniques and applications,” IEEE Access, vol. 4, pp. 2056–2067, 2016.
[3] F. Restuccia, N. Ghosh, S. Bhattacharjee, S. K. Das, and T. Melodia, “Quality of information in mobile crowdsensing: Survey and research challenges,” ACM Transactions on Sensor Networks (TOSN), vol. 13, no. 4, p. 34, 2017.
[4] P. M. Aoki, R. Honicky, A. Mainwaring, C. Myers, E. Paulos, S. Subramanian, and A. Woodruff, “A vehicle for research: using street sweepers to explore the landscape of environmental community action,” in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. ACM, 2009, pp. 375–384.
[5] H. Mousa, S. B. Mokhtar, O. Hasan, O. Younes, M. Hadhoud, and L. Brunie, “Trust management and reputation systems in mobile participatory sensing applications: A survey,” Computer Networks, vol. 90, pp. 49–73, 2015.
[6] X. Kang, L. Liu, and H. Ma, “Data correlation based crowdsensing enhancement for environment monitoring,” in Communications (ICC), 2016 IEEE International Conference on. IEEE, 2016, pp. 1–6.
[7] I. Koukoutsidis, “Estimating spatial averages of environmental parameters based on mobile crowdsensing,” ACM Transactions on Sensor Networks (TOSN), vol. 14, no. 1, p. 2, 2018.
[8] S. Subramaniam, T. Palpanas, D. Papadopoulos, V. Kalogeraki, and D. Gunopulos, “Online outlier detection in sensor data using non-parametric models,” in Proceedings of the 32nd international conference on Very large data bases. VLDB Endowment, 2006, pp. 187–198.
[9] S. Tasnim, N. Pissinou, and S. Iyengar, “A novel cleaning approach of environmental sensing data streams,” in Consumer Communications & Networking Conference (CCNC), 2017 14th IEEE Annual. IEEE, 2017, pp. 632–633.
[10] L. Bracciale, M. Bonola, P. Loreti, G. Bianchi, R. Amici, and A. Rabuffi, “CRAWDAD dataset roma/taxi (v. 2014-07-17),” Downloaded from url http://crawdad.org/roma/taxi/20140717, Jul. 2014.
[11] Y. Zheng, F. Liu, and H.-P. Hsieh, “U-air: When urban air quality inference meets big data,” in Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 2013, pp. 1436–1444.
[12] Y. Zhang, C. Szabo, and Q. Z. Sheng, “Cleaning environmental sensing data streams based on individual sensor reliability,” in International Conference on Web Information Systems Engineering. Springer, 2014, pp. 405–414.
[13] H.-S. Lim, Y.-S. Moon, and E. Bertino, “Provenance-based trustworthiness assessment in sensor networks,” in Proceedings of the Seventh International Workshop on Data Management for Sensor Networks. ACM, 2010, pp. 2–7.
[14] K. L. Huang, S. S. Kanhere, and W. Hu, “On the need for a reputation system in mobile phone based sensing,” Ad Hoc Networks, vol. 12, pp. 130–149, 2014.
[15] D. Peng, F. Wu, and G. Chen, “Pay as how well you do: A quality based incentive mechanism for crowdsensing,” in Proceedings of the 16th ACM International Symposium on Mobile Ad Hoc Networking and Computing. ACM, 2015, pp. 177–186.
[16] S. Liu, Z. Zheng, F. Wu, S. Tang, and G. Chen, “Context-aware data quality estimation in mobile crowdsensing,” in INFOCOM 2017-IEEE Conference on Computer Communications, IEEE. IEEE, 2017, pp. 1–9.
[17] Y. Kishino, K. Takeuchi, Y. Shirai, F. Naya, and N. Ueda, “Datafying city: Detecting and accumulating spatio-temporal events by vehicle-mounted sensors,” in Big Data (Big Data), 2017 IEEE International Conference on. IEEE, 2017, pp. 4098–4104.
[18] J. W. Branch, C. Giannella, B. Szymanski, R. Wolff, and H. Kargupta, “In-network outlier detection in wireless sensor networks,” Knowledge and information systems, vol. 34, no. 1, pp. 23–54, 2013.
[19] A. Deligiannakis, Y. Kotidis, V. Vassalos, V. Stoumpos, and A. Delis, “Another outlier bites the dust: Computing meaningful aggregates in sensor networks,” in Data Engineering, 2009. ICDE’09. IEEE 25th International Conference on. IEEE, 2009, pp. 988–999.
[20] N. Giatrakos, Y. Kotidis, A. Deligiannakis, V. Vassalos, and Y. Theodoridis, “Taco: tunable approximate computation of outliers in wireless sensor networks,” in Proceedings of the 2010 ACM SIGMOD International Conference on Management of data. ACM, 2010, pp. 279–290.
[21] G. Chen, X.-Y. Liu, L. Kong, J.-L. Lu, Y. Gu, W. Shu, and M.-Y. Wu, “Multiple attributes-based data recovery in wireless sensor networks,” in Global Communications Conference (GLOBECOM), 2013 IEEE. IEEE, 2013, pp. 103–108.
[22] L. Kong, M. Xia, X.-Y. Liu, M.-Y. Wu, and X. Liu, “Data loss and reconstruction in sensor networks,” in INFOCOM, 2013 Proceedings IEEE. IEEE, 2013, pp. 1654–1662.
[23] S. Gill, B. Lee, and E. Neto, “Context aware model-based cleaning of data streams,” in Signals and Systems Conference (ISSC), 2015 26th Irish. IEEE, 2015, pp. 1–6.
[24] S. Krishnan, J. Wang, E. Wu, M. J. Franklin, and K. Goldberg, “Activeclean: interactive data cleaning for statistical modeling,” Proceedings of the VLDB Endowment, vol. 9, no. 12, pp. 948–959, 2016.
[25] A. Lazar, L. Jin, C. A. Spurlock, K. Wu, and A. Sim, “Data quality challenges with missing values and mixed types in joint sequence analysis,” in Big Data (Big Data), 2017 IEEE International Conference on. IEEE, 2017, pp. 2620–2627.
[26] H. Liu, A. K. Tk, J. P. Thomas, and X. Hou, “Cleaning framework for bigdata: An interactive approach for data cleaning,” in Big Data Computing Service and Applications (BigDataService), 2016 IEEE Second International Conference on. IEEE, 2016, pp. 174–181.
[27] S. Tasnim, J. Caldas, N. Pissinou, S. Iyengar, and Z. Ding, “Semantic-aware clustering-based approach of trajectory data stream mining,” in 2018 International Conference on Computing, Networking and Communications (ICNC). IEEE, 2018, pp. 88–92.
[28] X. L. Dong, B. Saha, and D. Srivastava, “Less is more: Selecting sources wisely for integration,” in Proceedings of the VLDB Endowment, vol. 6, no. 2. VLDB Endowment, 2012, pp. 37–48.

[29] S. Burdakis and A. Deligiannakis, “Detecting outliers in sensor networks using the geometric approach,” in Data Engineering (ICDE), 2012 IEEE 28th International Conference on. IEEE, 2012, pp. 1108–1119.
[30] C. Franke and M. Gertz, “Orden: Outlier region detection and exploration in sensor networks,” in Proceedings of the 2009 ACM SIGMOD International Conference on Management of data. ACM, 2009, pp. 1075–1078.
[31] M. Keally, G. Zhou, and G. Xing, “Watchdog: Confident event detection in heterogeneous sensor networks,” in Real-Time and Embedded Technology and Applications Symposium (RTAS), 2010 16th IEEE. IEEE, 2010, pp. 279–288.
[32] B. Sheng, Q. Li, W. Mao, and W. Jin, “Outlier detection in sensor networks,” in Proceedings of the 8th ACM international symposium on Mobile ad hoc networking and computing. ACM, 2007, pp. 219–228.
[33] H. H. Bosman, G. Iacca, A. Tejada, H. J. Wörtche, and A. Liotta, “Spatial anomaly detection in sensor networks using neighborhood information,” Information Fusion, vol. 33, pp. 41–56, 2017.


