International Journal of Scientific & Engineering Research Volume 10, Issue 9, September-2019 1334

ISSN 2229-5518

A Walkthrough on Clone Profile Resolution in Social Networks
Liyanage C.R, Premarathne S.C

Abstract— People use Online Social Networks to build social connections with others who share similar personal interests or come from the same backgrounds. These social platforms improve people's lives but also create many problems for society. Some attackers perform profile cloning to harvest sensitive data from a targeted person on social media. These attacks damage the reputation of the legitimate user. Hence, detecting such duplicate fake accounts has become a critical necessity in today's online social networks. Many researchers have tried to solve the problem of fake profile detection in online social networks; however, more robust solutions are still needed. This paper presents a review of approaches in the literature for detecting clone profiles.

Index Terms— Classification, Fake profiles, Identity Clone Attack, Online Social Networks, Profile Attributes, Similarity Measures, Social
Graph Analysis.

————————————————

• Liyanage C.R, Department of IT, Faculty of IT, University of Moratuwa, Sri Lanka. E-mail: [email protected]
• Premarathne S.C, Department of IT, Faculty of IT, University of Moratuwa, Sri Lanka. E-mail: [email protected]

1 INTRODUCTION

An Online Social Network (OSN) is a web of users connected through user profiles in order to interact with friends, find news and updates from around the world, gain business opportunities, and share information and knowledge. Because of these many benefits, OSNs have become a significant part of people's lives: 2.46 billion people worldwide use them, and this number is expected to reach around 2.95 billion in 2020 [1]. There are different types of social platforms, such as Viber, YouTube, WhatsApp, Facebook, Instagram, Twitter, Google+ and LinkedIn, and these networks have changed the way people interact with each other.

Due to the social nature and extensive usage of OSNs, users tend to expose a vast amount of personal detail to the public, and this sensitive data can easily be used by malicious users through fake profiles for different purposes [2]. OSN wrongdoers create these fake profiles, which do not belong to genuine users, either by duplicating an existing user name or by presenting a non-existent user identity on social media [3],[4]. According to statistical estimates, 81 million Facebook accounts and 5 percent of Twitter accounts are fake [5].

On most social platforms, user identification is based mainly on a limited set of displayed user details, which weakens user authentication, since more than one account can have the same name and many other similar details [6]. In this context, the Identity Clone Attack (ICA) is one of the most severe security threats in social networks: scammers create profiles identical to existing ones and pose as someone else in order to steal private information or to damage victims' reputations by publishing inappropriate content [6],[7],[8]. Hence, detecting these clone profiles with fake identities has become one of the crucial tasks in handling social media security and privacy. Researchers have introduced various methodologies to verify duplicate profiles in online social networks. This review study investigates approaches that have been introduced to solve the problem of detecting fake clone profiles on single and multiple OSN platforms.

2 REVIEW OF LITERATURE

Researchers have addressed fake profiles in two forms: as a duplicate of a specific existing account (profile cloning) or as a new profile with random details. Profile cloning has in turn been tested across different platforms, which has made social network security more robust. Researchers have selected different social networks, and the most common choices were Facebook [7], Twitter, Google+ [9] and LinkedIn, where user profile attributes and behaviors differ significantly.

The study [7] proposes a three-step model to match two different profiles from different social media platforms. It uses a binary classifier for feature extraction based on users' information regarding friend requests and friend lists. This method presents an influential model that uses a string-matching similarity algorithm to find profile similarities. However, the algorithm was not tested on a real dataset, so the accuracy and effectiveness of its output are questionable. The authors in [10] compared the impact of different parameters on verifying the results. First, they selected the victim and then found a list of potential clone profiles. By comparing the clones with the victims, they finally verified which profiles were clones.

The study [11] tried to find clones in social media; the concept was evaluated on users' original profile data to catch similar accounts across OSNs. From the detected profile similarities, a similarity score was calculated based on shared values of the information fields and the profile picture. Another study [9] on detecting duplicate profiles in OSNs followed steps similar to those of [11]. First, it extracts information from the user's profile, such as birthday, age, education and workplace, and then extracts information from profiles with the same name. Finally, it calculates a similarity index for all the profiles found. Most of the studies have built their approaches on attribute similarity models. Paper [12] does the same but additionally considers a friend network similarity value.

3 CURRENT STATUS

Research on detecting duplicate profiles in online social networks has evolved recently, and most of the research findings were published after 2010. Since the research approaches differ depending on the OSN, selecting the platform of interest is the first and most important step. After that, finding data sets with the features of interest, applying suitable methodologies and evaluating the results must be done accordingly. The current state of this research area is discussed in this section.

3.1 Platform Selection

Single-site and cross-site profile cloning are the two types of cloning attack. In the first, the attacker creates an account for the victim in the same social network and sends friend requests to the victim's friends, whereas in cross-site cloning the attacker creates an account for the victim in a new network and sends requests to friends who are in both networks [6],[13],[9]. Following these two types, researchers have developed their fake profile detection algorithms either for a specific network or across multiple networks [14]. As Facebook is currently the most popular OSN, many studies have selected it as the platform for their research [3],[15],[16]. In addition, some authors have used multiple platforms, such as Google+ and Twitter along with Facebook, as their social environments [4],[17].

3.2 Data Collection

Each OSN profile provides a lot of qualitative and quantitative information, such as gender, location, education, work, age, number of friends, comments and likes. However, this information is accessible to different audiences to different degrees, since some of it is public and some is private [3]. Many studies have used public data because of the limitations on gathering private profile data [14],[18]. However, the author of [7] did not use a real data set for the implementation.

Data gathering has been carried out in several ways. Creating experimental fake profiles, called “honey profiles”, was done by [16]; this method was better than gathering data via APIs, since researchers can collect data while controlling the conditions as they wish. They created several honey profiles with different features and collected data once a day for one month. However, this method has limitations when a vast amount of data must be collected.

Some researchers have collected real profile information using the Facebook Graph API along with Python [4],[3], and a fake profile dataset was provided by Barracuda Labs [3]. Some data was scraped from friend accounts, and for that an anti-scrape-detection technique was implemented to prevent Facebook from detecting the scraping [3]. Paper [4] used a fixed number of around 3000 profiles, which were downloaded from the Stanford Network Analysis Platform (SNAP library). The dataset was divided into two parts, one half as real profiles and the other as fake profiles. Another study [15] collected its initial data set of 4.4 million public posts using Facebook's post search API. The Social Snapshot tool developed by Huber is one of the tools used in [16] to collect Facebook user data.

3.3 Approaches

3.3.1 Using Classification Algorithms

Some studies have tried to solve the problem of identifying fake OSN profiles using classification approaches. In [3] the authors used three classification algorithms, Support Vector Machine (SVM), Naïve Bayes and Decision Trees, and compared their efficiency. After selecting the profile to be tested, they extracted the required features (gender, number of friends, education and work, relationship status, number of photos tagged, number of uploaded photos, etc.) and then used the classifier to determine whether the profile was fake. The result was then used to train the classifier again in order to obtain more accurate predictions. According to the results, SVM was selected as the best classification model, whereas Naïve Bayes gave the lowest performance. Another study [15] was conducted to find malicious Facebook pages using Artificial Neural Networks. The set of words in published content was used to differentiate malicious pages from genuine ones.

Some approaches aimed to find user profiles that belong to the same user across different social networks [18]. They generated similarity vectors from a known dataset of paired accounts belonging to the same user across multiple networks. These vectors were then used as the training dataset for supervised classifiers such as KNN, Naïve Bayes, Decision Trees and SVM. However, this approach relies on mostly static attributes (name, location, description, profile image and number of connections) for the similarity vector, whereas some approaches, such as [17], use more dynamic behavioral features and have shown more robust and accurate results.

3.3.2 Social Graph-Based Approach

In paper [4] the authors introduce a detection mechanism called Fake Profiles Recognizer (FPR), which authenticates and recognizes a user's trusted friends as well as detecting fake ones by modeling the online social network graph after representing the identity of each user as a Friend Pattern. A profile is considered fake with respect to a selected profile if it is indicated by a fake instance that comes from another friend pattern and is not accepted by the friend pattern processor. This friend pattern was used to distinguish duplicate profiles in the OSN. This approach achieved higher accuracy than SVM [3] and lower F-Measure values than the Naïve Bayes approaches [3]. However, when the number of fake profiles is small, the algorithm is unable to recognize them. A case study [16] illustrated its friendship network using graphs in which nodes represented profiles and edges represented friendships among profiles. It presented concepts such as network density, degree of nodes and the correlation between nodes for identifying fake nodes. Finally, it concluded that profiles with fewer activities and a high number of friends are more likely to be fake.

The approach [13] evaluated the identity of clone profiles in the same network using two concepts, the second of which is based on the strength of relationships. For this, social network data were modeled using a weighted graph, and user interactions were considered not only through friend requests but also through further links between profiles, such as active friends, page likes, URLs, the friendship graph and the mutual friends graph.

In [19] a novel social graph topology called “Trusted Social Graph (TSG)” was introduced, using a special type of graph called a “DeBruijn graph” to visualize the trusted instances within the social network. The social profiles were analyzed by evaluating their friend patterns using mathematical expressions. Finally, incoming instances were checked against the model to decide whether a profile is fake or real.

Some algorithms, such as [20], presented a method to detect clone profiles using a graph and network-based approach that analyzes the structural similarity of the social network. The authors first selected a node from the analyzed network and obtained the nearest neighbors of that node. After measuring the similarity of nodes, the method flags as duplicates the profiles with the highest frequency of attribute similarities. Furthermore, because it uses the k-nearest neighbor algorithm, this approach was able to recover hidden attribute values of user profiles.

3.3.3 Matching Similarity Attributes

In study [14] the similarity of two profiles was checked based on their HTML structures. The study applied exact matching of attributes, matching usernames by string comparison, and partial matching of related attributes, matching parts of profile attributes such as location and address. The paper [7] also used a similarity matching algorithm, but it showed better results due to its recursive matching technique. As mentioned under the graph-based approach, the study [13] evaluated profile identity using two concepts, the first of which was based on calculating profile similarity using selected attributes: first name, family name and location. After filtering suspicious accounts based on these attribute similarities, it evaluates the strength of relations and finally identifies the fakes. Another approach [10] detects profile clones by comparing five different similarity measures, which include two attributes, gender and education details, in addition to those used in study [13]. However, this study used a limited dataset.

Methods such as [9] calculate a similarity index after comparing the original profile with other accounts found by searching. They assume that if the similarity index is high, the profiles may be cloned. However, their other assumption, that fake profiles will give the lowest similarities, is not acceptable, since profiles can have little similarity to each other and still be real. The approach [12] introduced a weighted Dice similarity measure to calculate the similarity of selected attributes. Weights are assigned according to the importance of each attribute for each person. This method can give more reliable results, since the importance of attributes may vary from person to person. Some algorithms [11] directly match the strings in information fields to measure the similarity between profiles. However, when information is typed incorrectly, this method gives inaccurate results. Like most of the approaches, the paper [21] also discusses an attribute similarity and friend network similarity approach. It considers three types of friend network features for analysis: the friend list, the recommended friend list and the excluded friend list. Furthermore, the study [16] focused on analyzing location-based attributes, such as work and educational places and current locations, and found that these are stronger factors in identifying fakes. In paper [22], the researchers used 17 profile features to evaluate the similarities between profiles, which is a very high number compared with other existing studies. In addition, they used 12 classifiers for the task of detecting fake profiles.

3.3.4 Analyzing User Behavior Changes

According to [6], the features of interest can be categorized into behavioral and non-behavioral attributes. Because of their anomalous behavior, fake profiles are easy to identify by analyzing behavioral patterns [17]. Paper [15] used a bag-of-words collected from recent activities of Facebook pages and extracted patterns from them. It also analyzed behavior changes in such pages. The approach [17] used a combination of statistical models and sudden behavioral changes in user profiles to detect fakes. Its algorithms consider only malicious behavioral changes, since users can experience sudden changes in their behavior for many other legitimate reasons as well. In [23] the authors used a text mining approach to measure the similarity between text information, such as posts and comments, on two types of social media public pages.

3.3.5 Matching User Profiles Across Multiple OSNs

Since people tend to use different social network platforms, many studies have focused on detecting fake profiles across different types of platforms [7]. This kind of detection is more difficult than single-site detection because different networks, and the different features of their profiles, must be analyzed [21].

4 DISCUSSION

Among the security issues in online social networks, fake profile identification has gained importance since fake profiles can lead to severe security and user privacy threats. The Identity Clone Attack is one of the fake profile problems and has been considered the most dangerous threat in OSNs. Hence, the detection of clone profiles has become an important area of computer science research all over the world, and 75 percent of the existing solutions were found after 2010.

The detection of clone profiles in social networks is a currently active research problem, and most of the investigations are done using Facebook, as it is the most popular social network platform. Besides that, Twitter is also widely used, since there are fewer privacy concerns when creating user profiles. Looking at the platforms selected in past research, networks with a less complex process for creating user profiles and weak user authentication mechanisms have mostly been subject to the fake profile issue.

Some researchers [7] have used synthetic data sets for their investigations, and these may not give the most realistic solutions, since social networks are highly diverse environments and this complex diversity can be captured effectively only by using real data. However, most studies still based their assumptions on a very limited amount of real data, since it is difficult to obtain personal user data through an API due to restricted access.

There were different methodologies for detecting clone profiles in OSNs, but the most common was to match profiles using similarity measurements, for which study [22] used a large number of attributes. Moreover, graph-based approaches have been used to analyze friend networks in OSNs, considering the compactness and strength of the networks to predict clone profiles. Classification in data mining was another common technique for analyzing user data on the platforms, with Decision Trees and Naïve Bayes the most used classifiers.

However, due to the diverse characteristics and rapidly changing nature of social networks, fake clone profile detection is still not fully solved by existing approaches and remains open for future work.

5 FUTURE DIRECTIONS

Some proposed techniques were found while investigating current approaches. The study [14] suggested a biometric authentication method that uses fingerprints, voice and signatures to verify the identity of a user on a social network platform. This may yield more accurate solutions, since biometric characteristics are unique to each person. The authors of [20] proposed a user relationship prediction model to forecast future clone profiles. This would be more useful, since preventing an attack is better than detecting it afterwards.

6 CONCLUSION

The Identity Clone Attack is a severe threat in Online Social Networks that has spread over recent years, and it damages legitimate users in the network through misuse of their personal information. Several studies have attempted to solve this problem by detecting clone profiles on different social platforms, and their experimental techniques are mostly based on statistical estimation, data mining techniques and behavioral analysis methodologies. However, due to the difficulty of finding real datasets for research and the high diversity of profiles in these networks, a fully satisfactory solution is still to be found.

REFERENCES

[1] Statista, “Social Media Statistics & Facts,” 2017. [Online]. Available: https://www.statista.com/topics/1164/social-networks/. [Accessed: 30-Oct-2017].
[2] M. Fire, D. Kagan, A. Elishar, and Y. Elovici, “Social Privacy Protector - Protecting Users' Privacy in Social Networks,” no. c, pp. 46–50, 2012.
[3] N. Kumar and R. N. Reddy, “Automatic Detection of Fake Profiles in Online Social Networks,” National Institute of Technology Rourkela, Rourkela-769 008, Orissa, India, 2012.
[4] M. Torky, A. Meligy, and H. Ibrahim, “Recognizing Fake Identities in Online Social Networks based on a Finite Automaton Approach,” 2016 12th Int. Comput. Eng. Conf. ICENCO 2016 Boundless Smart Soc., pp. 1–7, 2017.
[5] WordStream, “40 Essential Social Media Marketing Statistics for 2017,” 2017. [Online]. Available: http://www.wordstream.com/blog/ws/2017/01/05/social-media-marketing-statistics. [Accessed: 10-Nov-2017].
[6] M. A. Wani and S. Jabin, “A Sneak into the Devil’s Colony - Fake Profiles in Online Social Networks,” 2017.
[7] G. A. Kamhoua et al., “Preventing Colluding Identity Clone Attacks in Online Social Networks,” in 2017 IEEE 37th International Conference on Distributed Computing Systems Workshops (ICDCSW), 2017, pp. 187–192.
[8] M. Fire, R. Goldschmidt, and Y. Elovici, “Online Social Networks: Threats and Solutions Survey,” IEEE Commun. Surv. TUTORIALS Online, vol. 16, no. 4, pp. 1–20, 2013.
[9] M. A. Devmane and N. K. Rana, “Detection and Prevention of Profile Cloning in Online Social Networks,” Int. Conf. Recent Adv. Innov. Eng. ICRAIE 2014, pp. 9–13, 2014.
[10] P. Bródka, M. Sobas, and H. Johnson, “Profile Cloning Detection in Social Networks,” Proc. - 2014 Eur. Netw. Intell. Conf. ENIC 2014, pp. 63–68, 2014.
[11] G. Kontaxis, I. Polakis, S. Ioannidis, and E. P. Markatos, “Detecting Social Network Profile Cloning,” 2011 IEEE Int. Conf. Pervasive Comput. Commun. Work. PERCOM Work. 2011, pp. 295–300, 2011.
[12] M. R. Khayyambashi and F. S. Rizi, “An Approach for Detecting Profile Cloning in Online Social Networks,” 2013 7th International Conf. e-Commerce Dev. Ctries. With Focus e-Security, ECDC 2013, pp. 1–12, 2013.
[13] F. Rizi, M. Khayyambashi, and M. Kharaji, “A New Approach for Finding Cloned Profiles in Online Social Networks,” Int. J. Netw. Secur., vol. 6, no. April, pp. 25–37, 2014.
[14] B. B. Das, “Profile Similarity Technique for Detection of Duplicate Profiles in Online Social Network,” vol. 7, no. 2, pp. 507–512, 2016.
[15] P. Dewan, S. Bagroy, and P. Kumaraguru, “Hiding in Plain Sight: Characterizing and Detecting Malicious Facebook Pages,” pp. 193–196, 2016.
[16] K. Krombholz, D. Merkl, and E. Weippl, “Fake Identities in Social Media: A Case Study on the Sustainability of the Facebook Business Model,” J. Serv. Sci. Res., vol. 4, no. 2, pp. 175–212, 2012.
[17] M. Egele, C. Kruegel, and G. Vigna, “COMPA: Detecting Compromised Accounts on Social Networks.”
[18] A. Malhotra, L. Totti, W. Meira, P. Kumaraguru, and V. Almeida, “Studying User Footprints in Different Online Social Networks,” Proc. 2012 IEEE/ACM Int. Conf. Adv. Soc. Networks Anal. Mining, ASONAM 2012, pp. 1065–1070, 2013.
[19] A. M. Meligy, “A Framework for Detecting Cloning Attacks in OSN Based on a Novel Social Graph Topology,” no. February, pp. 13–20, 2015.
[20] M. Zabielski, R. Kasprzyk, Z. Tarapata, and K. Szkółka, “Methods of Profile Cloning Detection in Online Social Networks,” MATEC Web Conf., vol. 76, 2016.
[21] F. S. Rizi and M. R. Khayyambashi, “Profile Cloning in Online Social Networks,” Int. J. Comput. Sci. Inf. Secur., vol. 11, no. 8, pp. 82–86, 2013.
[22] A. Gupta and R. Kaushal, “Towards Detecting Fake User Accounts in Facebook,” ISEA Asia Secur. Priv. Conf. 2017, ISEASP 2017, vol. 1, pp. 1–6, 2017.
[23] H. Agrawal and R. Kaushal, “Analysis of Text Mining Techniques over Public Pages of Facebook,” in Proceedings - 6th International Advanced Computing Conference, IACC 2016, 2016, pp. 9–14.
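The combination of an attribute similarity model with a friend network similarity value, as the review attributes to [12], can be illustrated with a minimal Python sketch. It is not the implementation of [9], [11] or [12]. It assumes that profiles are plain dictionaries, that attributes are scored with a weighted Dice coefficient over word tokens, and that friend lists are compared with Jaccard overlap. The attribute names, the weights and all function names are hypothetical.

```python
def dice(a: str, b: str) -> float:
    """Dice coefficient over lowercase word tokens of two attribute values."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 0.0
    return 2 * len(ta & tb) / (len(ta) + len(tb))


def attribute_similarity(p: dict, q: dict, weights: dict) -> float:
    """Weighted average of per-attribute Dice scores (weights are assumed)."""
    total = sum(weights.values())
    score = sum(w * dice(p.get(k, ""), q.get(k, "")) for k, w in weights.items())
    return score / total


def friend_similarity(friends_p, friends_q) -> float:
    """Jaccard overlap of two friend lists."""
    fp, fq = set(friends_p), set(friends_q)
    union = fp | fq
    return len(fp & fq) / len(union) if union else 0.0


# Hypothetical profiles and weights, for illustration only.
victim = {"name": "Jane Doe", "education": "Example University", "workplace": "Acme Ltd"}
suspect = {"name": "Jane Doe", "education": "Example University", "workplace": "Acme"}
weights = {"name": 0.5, "education": 0.3, "workplace": 0.2}

attr_score = attribute_similarity(victim, suspect, weights)
friend_score = friend_similarity(["a", "b", "c", "d"], ["b", "c", "d", "e"])
```

A suspect account that scores high on both values would be a candidate clone. How the two scores are thresholded or combined is left open here, because the review does not specify it.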
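As a rough illustration of the classification approach described for [3], the sketch below trains one of the three classifier families mentioned there, Naïve Bayes, on numeric profile features. It is a from-scratch Gaussian Naïve Bayes written for this example, not the code or configuration used in [3]. The three features (number of friends, photos tagged, photos uploaded) and every data point are invented.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(rows, labels):
    """Estimate a prior and per-feature (mean, variance) for each class."""
    grouped = defaultdict(list)
    for row, label in zip(rows, labels):
        grouped[label].append(row)
    model = {}
    for label, members in grouped.items():
        stats = []
        for column in zip(*members):
            mean = sum(column) / len(column)
            # Small constant keeps the variance strictly positive.
            var = sum((x - mean) ** 2 for x in column) / len(column) + 1e-6
            stats.append((mean, var))
        model[label] = (len(members) / len(rows), stats)
    return model


def predict(model, row):
    """Return the class label with the highest log-posterior for one profile."""
    def log_posterior(prior, stats):
        score = math.log(prior)
        for x, (mean, var) in zip(row, stats):
            score += -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)
        return score

    return max(model, key=lambda label: log_posterior(*model[label]))


# Invented toy data: (number of friends, photos tagged, photos uploaded).
rows = [(150, 40, 60), (220, 55, 80), (180, 35, 50),
        (900, 1, 2), (1200, 0, 3), (1000, 2, 1)]
labels = ["real", "real", "real", "fake", "fake", "fake"]

model = fit_gaussian_nb(rows, labels)
prediction = predict(model, (1100, 1, 2))
```

The toy data simply encodes the pattern of many friends and little activity. A real comparison of classifiers would need a labelled dataset and held-out evaluation.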
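The similarity-vector idea described for [18] can be sketched as follows: each pair of accounts is turned into a vector of per-attribute similarity scores, and labelled vectors train a supervised classifier. This is an assumption-laden illustration, not the method of [18]. It uses `difflib.SequenceMatcher` for text fields, a simple ratio for connection counts, omits the profile image, and uses a tiny k-nearest-neighbor vote as the classifier. All accounts and training vectors are invented.

```python
import math
from difflib import SequenceMatcher


def text_sim(a: str, b: str) -> float:
    """Character-level similarity ratio in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def similarity_vector(acc_a: dict, acc_b: dict) -> list:
    """One similarity score per static attribute of an account pair."""
    ca, cb = acc_a["connections"], acc_b["connections"]
    return [
        text_sim(acc_a["name"], acc_b["name"]),
        text_sim(acc_a["location"], acc_b["location"]),
        text_sim(acc_a["description"], acc_b["description"]),
        min(ca, cb) / max(ca, cb) if max(ca, cb) else 1.0,
    ]


def knn_predict(train, query, k=3):
    """Majority vote among the k nearest labelled vectors (Euclidean distance)."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = [label for _, label in nearest]
    return max(set(votes), key=votes.count)


# Invented labelled similarity vectors for account pairs.
train = [
    ([0.95, 1.0, 0.8, 0.9], "same"), ([0.9, 0.9, 0.7, 0.8], "same"),
    ([1.0, 0.8, 0.9, 0.7], "same"), ([0.3, 0.1, 0.2, 0.4], "different"),
    ([0.4, 0.2, 0.1, 0.1], "different"), ([0.2, 0.3, 0.3, 0.5], "different"),
]

a = {"name": "Jane Doe", "location": "Colombo",
     "description": "Data analyst and hiker", "connections": 200}
b = {"name": "jane doe", "location": "Colombo",
     "description": "Data analyst, hiker", "connections": 180}

vec = similarity_vector(a, b)
label = knn_predict(train, vec)
```

Because the vector uses only static attributes, it shares the limitation the review notes for [18]. Behavioral features would need additional inputs that are not modelled here.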
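Measuring the similarity between posts or comments, as described for [23], can be illustrated with a bag-of-words representation of the kind mentioned for [15]. The review does not say which similarity measure [23] uses, so cosine similarity over raw word counts is an assumption here. The tokenizer, the function names and the sample posts are made up for the example.

```python
import math
import re
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Count lowercase alphanumeric tokens in a post or comment."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Invented sample posts.
post_1 = bag_of_words("Win a free phone click here")
post_2 = bag_of_words("Click here to win a free phone now")
post_3 = bag_of_words("Weekly hiking photos from the hills")

sim_12 = cosine_similarity(post_1, post_2)
sim_13 = cosine_similarity(post_1, post_3)
```

Two pages whose posts repeatedly score close to 1 share most of their wording, while a score of 0 means no words in common. Stop-word removal and term weighting would likely matter on real data and are omitted.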
IJSER © 2019
http://www.ijser.org