
Proceedings of the Third International Conference on Intelligent Sustainable Systems [ICISS 2020]

IEEE Xplore Part Number: CFP20M19-ART; ISBN: 978-1-7281-7089-3

Phishing Attack Detection using Machine Learning Classification Techniques
Noor Faisal Abedin (faisal.a@[Link])
Rosemary Bawm (161000112@[Link])
Tawsif Sarwar (161000412@[Link])
Mohammed Saifuddin (161001912@[Link])
Mohammd Azizur Rahman (143000112@[Link])
Sohrab Hossain (sohrab.h@[Link])

All authors: Department of Computer Science and Engineering, East Delta University, Chittagong, Bangladesh

2020 3rd International Conference on Intelligent Sustainable Systems (ICISS) | 978-1-7281-7089-3/20/$31.00 ©2020 IEEE | DOI: 10.1109/ICISS49785.2020.9315895

Abstract- Phishing attacks are the most common form of attack carried out over the internet. In a phishing attack, the attacker attempts to collect a user's data without his/her consent through emails, URLs, or any other link that leads to a deceptive page, where the user is persuaded to take specific actions that complete the attack. Such attacks can give an attacker vital information about the user, which often allows the attacker to impersonate the victim and do things that only the victim should be able to do, such as carrying out transactions, messaging someone else, or simply accessing the victim's data. Many studies have discussed possible approaches to prevent such attacks. This research work applies three machine learning algorithms to predict the phishing status of a website. In the experiments, the models are trained on URL-based features and are intended to prevent zero-day attacks: the proposed software distinguishes legitimate websites from phishing websites by analyzing the website's URL. In our observations, the random forest classifier achieved a precision of 97%, a recall of 99%, and an F1 score of 97%. The proposed model is fast and efficient because it works only from the URL and does not use other resources for analysis, as past studies did.

Keywords—Phishing Attack; Phishing Attack Detection; Artificial Intelligence; Machine Learning; Decision Tree

I. INTRODUCTION
Today, every individual is connected to others through the internet. These connections are established using different hardware and software, and over time everything is becoming connected. Today, 16% of the world's population uses the internet. Despite the benefits the internet provides, there are dire consequences to using it without proper knowledge of cyber security [1, 25, 32]. Cyber attackers lurk on the internet, deceive users into trusting their fake websites, and lead them through actions that leak their information to the attackers. The solution is of course not to avoid the internet, but to learn about these attacks and take care not to fall victim to them [2, 3, 30].

Cyber attacks are improving along with the technological improvements around us [4, 5]. Attackers can now create fake copies of actual websites that are more and more difficult to distinguish from the originals [6]. People are deceived by these fake pages quite quickly, and they are not really to blame if their knowledge of cyber security is limited [7-9]. Expecting users to tell these sites apart from visual cues alone would be unfair. Yet this innocent gap in one's knowledge can lead him/her to become a victim of social or economic damage someday [10-12].

Considering the magnitude of these consequences as a challenge, this research work aims to build a solution that classifies phishing and legitimate websites concretely and saves users from being exploited [13-15]. Phishing is now common in almost every sector, including online banking, e-commerce, HR and finance, and social networking [16, 17]. While many current methods, such as blacklist- and whitelist-based techniques, can help against these attacks, they are not capable of detecting zero-day attacks [18-21, 31].

II. BACKGROUND AND RELATED WORK
Supervised machine learning approaches are well suited for this type of classification problem. To train these classifiers, the features of both phishing and legitimate websites need to be extracted and used with machine learning
algorithms to train a model that can predict a website's phishing status concretely. While phishers improve their attack skills day after day, machine learning can be used to train updated models that keep up with the times and prevent phishing scams. Using supervised machine learning methods that analyze the URLs, website structure, and other features that differ between phishing and legitimate websites, the proposed work aims to predict whether a website is phishing or not. This study mainly focuses on classifying phishing and legitimate websites using several supervised machine learning methods. Their performance is then evaluated to determine which of the discussed methods serves this purpose best.

Fig. 1 Proposed Architecture for phishing attack detection
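
To make the idea of extracting features from a URL more concrete, the following is a minimal sketch and not the authors' extraction code. The feature names follow the attribute names used for the dataset in this paper; the shortener host list, the length threshold of 100 characters, the naive subdomain count, and the 0/1 encoding are all assumptions made for illustration.

```python
import ipaddress
from urllib.parse import urlparse

# Hypothetical values chosen for illustration; neither comes from the paper.
SHORTENER_HOSTS = {"bit.ly", "tinyurl.com", "t.co", "goo.gl"}
LONG_URL_THRESHOLD = 100


def host_is_ip(host: str) -> bool:
    """Return True when the host part of a URL is a literal IP address."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def extract_url_features(url: str) -> dict:
    """Map a URL to a few simple features (1 = trait present, 0 = absent)."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    is_ip = host_is_ip(host)
    return {
        "UsingIP": int(is_ip),
        "LongURL": int(len(url) > LONG_URL_THRESHOLD),
        "ShortURL": int(host in SHORTENER_HOSTS),
        "Symbol@": int("@" in url),
        # A second "//" after the scheme separator hints at an embedded redirect.
        "Redirecting//": int("//" in url.split("://", 1)[-1]),
        "PrefixSuffix-": int("-" in host),
        # Naive count: every label beyond "name.tld" is treated as a subdomain.
        "SubDomains": 0 if is_ip else max(host.count(".") - 1, 0),
        "HTTPS": int(parsed.scheme == "https"),
    }


suspicious = extract_url_features("http://192.168.0.1/login@evil//x")
ordinary = extract_url_features("https://www.example.com/index.html")
```

Only the standard library is used here, which mirrors the paper's point that URL-only analysis needs no external resources. Attributes such as PageRank or WebsiteTraffic cannot be derived this way and are left out of the sketch.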
III. METHODOLOGY
Machine learning is the study of algorithms that use mathematical modelling and probability theory to make decisions about a problem based on previous data or scenarios of that problem. It builds mathematical models that output the value of a target variable from a set of independent variables. By analyzing data on phishing and legitimate websites and their different characteristics, a machine learning model can predict whether a new, unknown website is phishing or legitimate.

Supervised learning builds a predictive model from known outcomes: in the training dataset, every instance has a label referring to a class. Real-world classification problems such as phishing detection and spam mail detection are solved using supervised learning methods. Random Forest, Classification and Regression Tree, K Nearest Neighbors, Support Vector Machine, and Logistic Regression are some of the popular supervised machine learning methods used for classification problems.

A. Dataset
The dataset is one of the most critical parts of our study. A dataset is a table containing information about phishing and legitimate websites. The dataset for our proposed model was obtained from Kaggle, one of the most popular public repositories, with a large collection of datasets that can be used for training machine learning models. The dataset we used contains 32 attributes and 11504 instances. The attributes of this dataset are:

Index, UsingIP, GoogleIndex, LongURL, ShortURL, Symbol@, Redirecting, PrefixSuffix-, DNSRecording, SubDomains, HTTPS, DomainRegLen, Favicon, NonStdPort, HTTPSDomainURL, RequestURL, AnchorURL, LinksInScriptTags, ServerFormHandler, InfoEmail, AbnormalURL, WebsiteForwarding, StatusBarCust, DisableRightClick, UsingPopupWindow, AgeofDomain, WebsiteTraffic, PageRank, LinksPointingToPage, IframeRedirection, StatsReport, and class.

We do not need the "Index" attribute, as it is just the index number of the instances in the dataset. The "class" attribute is our target variable, which we are going to predict.

TABLE 1. ATTRIBUTES OF THE PHISHING DATASET
| Attribute Name | Attribute explanation |
|---|---|
| Index | Used for displaying the webpage through a search engine |
| Using IP | An IP address is used instead of a DNS name by phishing websites |
| LongURL | The URL holds more than a hundred characters |
| ShortURL | A URL reduced by a URL shortener, like [Link] |
| Symbol@ | Used to recognize the remarkable characters of a phishing URL |
| Redirecting// | Used to track a proposed endpoint that deviates from the current connection |
| PrefixSuffix | A prefix denotes the letters added before an original word to change its meaning, e.g. re-play, co-operative. A suffix denotes the letter(s) added after an original word, e.g. computer, creative |
| SubDomains | The subdomain is the domain extension added before the main domain to navigate different sections of a website, e.g. [Link] and [Link] are two subdomains of [Link] |
| HTTPS | Hypertext Transfer Protocol Secure (HTTPS) is used for secure communication with websites |
| Domain Reg Len | Denotes the year(s) for which a website is registered to a domain |
| Favicon | The 16x16 pixel icon used to brand the website |
| NonStdPort | Non-standard ports are ports used for a purpose other than their default assignment |
| HTTPSDomainURL | Secure HTTP, used with the TLS/SSL protocol |
| RequestURL | Used by the client to request resources from the server |
| AnchorURL | Clickable content in text form used as a hyperlink |
| LinksInScriptTags | Used to link at a script tag to manipulate the image |
| ServerFormHandler | Used to process on the server the contents sent by the client |
| InfoEmail | Email used with the domain or business website |
| AbnormalURL | The reverse of a normal URL; unlikely to occur usually |
| WebsiteForwarding | Used to redirect multiple sources to a single web address |
| StatusBarCust | Used to show system information at the bottom of the screen |
| DisableRightClick | Used to prevent the website's contents from being saved |
| UsingPopupWindow | A menu that pops up on the screen and disappears immediately after a click |
| IframeRedirection | Used to inspect a website and redirect later on |
| AgeofDomain | The length of time the domain has existed |
| DNSRecording | Used to get information such as the IP address |
| WebsiteTraffic | Used to log the users who visit a website |
| PageRank | The web page ranking tool used by the Google search engine |
| GoogleIndex | An indexing tool that adds webpages to Google search |
| LinksPointingToPage | Used to rank the website |
| StatsReport | Used to get information about all transferred files |
| class | Defines the features and behaviours; 1 means phishing and 0 means legitimate |

B. Data Preprocessing
Feature scaling is the process of normalizing or standardizing the independent variables of the training dataset to a fixed range, in order to handle the variance in values among the different independent variables. Splitting the dataset into two portions, one for training and one for testing, is very important: a model must be trained on a subset of the full dataset and tested on the rest to evaluate its performance satisfactorily. We split the dataset in an 80:20 ratio, with 80% used for training and 20% for testing, using a stratified sampling technique. We performed the train-test split using the Scikit-Learn library in the Python programming language.

C. Machine Learning Classifiers
Three machine learning classifiers are applied in this research: KNN, logistic regression, and random forest. The k-nearest neighbours (KNN) classifier is a simple supervised machine learning classifier that is used for both classification and regression problems. It relies on labelled data to acquire a function that predicts the outcome when new unlabelled data is given. In this research, the KNN algorithm uses the 80% labelled data to acquire a function that predicts whether a website is legitimate or phishing. The second classifier is logistic regression, a statistical model that uses a logistic function to model a binary dependent variable. In our analysis, it uses the 80% labelled data to fit a logistic function that predicts whether a website is legitimate or phishing. The third classifier is the random forest, a supervised learning algorithm. It is an ensemble of decision trees, which together form the forest, usually trained with the "bagging" technique. The main idea of bagging is that a combination of learning models improves the overall result.

IV. RESULTS AND DISCUSSIONS
In our study, we used confusion matrices, ROC curves, precision, recall, and F1 score to evaluate the performance of the three machine learning classifiers.

Fig. 2 Classification report for KNN

Fig. 2, Fig. 3, and Fig. 4 show the performance of the KNN algorithm. Fig. 2 shows the precision, recall, and F1 score for the KNN algorithm. The precision is 91% for phishing websites and 86% for legitimate websites. The recall and F1 score are 94% and 93%, respectively, for phishing websites, and 79% and 82%, respectively, for legitimate websites.
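
The preprocessing and classifier setup described in the methodology (feature scaling, an 80:20 stratified split, and the three classifiers) can be sketched with Scikit-Learn as follows. This is an illustrative sketch under stated assumptions, not the authors' code: the data is a synthetic stand-in generated with `make_classification` (30 predictors, matching the 32 attributes minus "Index" and "class"), `StandardScaler` is one possible choice of scaler, and the hyperparameters (k = 5, 100 trees, `max_iter=1000`, the random seeds) are assumed values that the paper does not report.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the Kaggle data: 30 predictors and a binary target
# (1 = phishing, 0 = legitimate). Replace with the real table when available.
X, y = make_classification(
    n_samples=2000, n_features=30, n_informative=10, random_state=0
)

# 80:20 split; stratify=y keeps the class proportions equal in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Scaling sits inside each pipeline so it is fitted on the training part only.
# Random forest is left unscaled because tree splits ignore feature scale.
models = {
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
    "Logistic Regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

reports = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    # Per-class precision, recall and F1, the same quantities as in Fig. 2.
    reports[name] = classification_report(y_test, y_pred, output_dict=True)
    print(name)
    print(classification_report(
        y_test, y_pred, target_names=["legitimate", "phishing"]
    ))
```

Because the data here is synthetic, the printed scores will not match the figures reported in this section; the sketch only shows how such a classification report is produced.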

Fig. 3 Confusion Matrix for KNN

Fig. 3 shows the confusion matrix for KNN. The values on the main diagonal (correct predictions) are higher than the off-diagonal values (misclassifications), which means our proposed system successfully detects phishing websites.

Fig. 4 ROC curves for KNN

Fig. 4 shows the ROC curve of class 0, the ROC curve of class 1, and the micro-average ROC curve. The micro-average ROC curve is computed by pooling the true-positive and false-positive counts of all classes before calculating the rates. The area under the curve (AUC) measures the entire two-dimensional area under the ROC curve from (0, 0) to (1, 1). The AUC score is 0.93, which is excellent.

Fig. 4 Classification report Logistic Regression

Fig. 4, Fig. 5, and Fig. 6 show the performance of the logistic regression algorithm. The classification report in Fig. 4 shows the precision, recall, and F1 score for the logistic regression algorithm. The precision is 83% for phishing websites and 88% for legitimate websites. The recall and F1 score are 97% and 90%, respectively, for phishing websites, and 57% and 69%, respectively, for legitimate websites.

Fig. 5 Confusion Matrix for Logistic Regression

Fig. 5 shows the confusion matrix for logistic regression. The values on the main diagonal (correct predictions) are higher than the off-diagonal values (misclassifications), which means our proposed system successfully detects phishing websites.

Fig. 6 ROC curves for Logistic Regression

Fig. 6 shows the ROC curve of class 0, the ROC curve of class 1, and the micro-average ROC curve for logistic regression. The micro-average ROC curve is computed by pooling the true-positive and false-positive counts of all classes before calculating the rates. The area under the curve (AUC) measures the entire two-dimensional area under the ROC curve from (0, 0) to (1, 1). The AUC score is 0.93, which is excellent.
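
As a worked illustration of the metrics used in this section, the sketch below derives precision, recall, and F1 score from confusion-matrix counts and computes the AUC from predicted scores. It is written in plain Python under stated assumptions: the eight labels and scores are made-up toy values, the 0.5 decision threshold is assumed, and the AUC is computed through its rank interpretation (the probability that a randomly chosen positive is scored above a randomly chosen negative), which equals the area under the ROC curve.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count (tp, fp, fn, tn), treating `positive` as the class of interest."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def precision_recall_f1(tp, fp, fn):
    """Precision = tp/(tp+fp), recall = tp/(tp+fn), F1 = their harmonic mean."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def roc_auc(y_true, scores, positive=1):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("both classes must be present to compute the AUC")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def micro_average_auc(y_true, prob_phishing):
    """Pool the one-vs-rest decisions of both classes, then compute one AUC."""
    pooled_true, pooled_score = [], []
    for truth, prob in zip(y_true, prob_phishing):
        pooled_true += [int(truth == 0), int(truth == 1)]
        pooled_score += [1.0 - prob, prob]
    return roc_auc(pooled_true, pooled_score)


# Toy data: 1 = phishing, 0 = legitimate; scores are assumed probabilities.
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
prob_phishing = [0.9, 0.8, 0.35, 0.4, 0.1, 0.7, 0.2, 0.6]
y_pred = [int(p >= 0.5) for p in prob_phishing]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
precision, recall, f1 = precision_recall_f1(tp, fp, fn)
auc = roc_auc(y_true, prob_phishing)
micro_auc = micro_average_auc(y_true, prob_phishing)
```

Calling `confusion_counts` with `positive=0` gives the corresponding counts for the legitimate class, which is how a classification report arrives at separate per-class values.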

Fig. 7 Classification report for Random Forest

Fig. 7, Fig. 8, and Fig. 9 show the performance of the random forest algorithm. Fig. 7 shows the precision, recall, and F1 score for the random forest algorithm. The precision is 97% for phishing websites and 98% for legitimate websites. The recall and F1 score are 99% and 98%, respectively, for phishing websites, and 93% and 96%, respectively, for legitimate websites.

Fig. 8 Confusion Matrix for Random Forest

Fig. 8 shows the confusion matrix for the random forest. The values on the main diagonal (correct predictions) are higher than the off-diagonal values (misclassifications), which means our proposed system successfully detects phishing websites.

Fig. 9 ROC curves for Random Forest

Fig. 9 shows the ROC curve of class 0, the ROC curve of class 1, and the micro-average ROC curve for the random forest algorithm. The micro-average ROC curve is computed by pooling the true-positive and false-positive counts of all classes before calculating the rates. The area under the curve (AUC) measures the entire two-dimensional area under the ROC curve from (0, 0) to (1, 1). The AUC score is 1.0, which is excellent.

V. CONCLUSION
In this paper, the performance of three widely used machine learning classifiers is compared. Among these three classifiers, random forest performs best, with a precision of 97%. The AUC of the random forest is 1.0, which means our system can detect phishing websites with high accuracy. In future, accuracy will be improved by changing the features. Systems based on efficient neural network and deep learning models that handle large data can be developed to detect phishing attacks from a logged dataset. Incorporating feature reduction techniques will also be considered in future work to improve the accuracy of the system.

References
[1] M. Humayun, M. Niazi, N. Z. Jhanjhi, M. Alshayeb, and S. Mahmood, "Cyber Security Threats and Vulnerabilities: A Systematic Mapping Study," Arabian Journal for Science and Engineering, vol. 45, no. 4, pp. 3171-3189, Apr 2020.
[2] E. D. Frauenstein and S. Flowerday, "Susceptibility to phishing on social network sites: A personality information processing model," Computers & Security, vol. 94, p. 18, Jul 2020, Art. no. 101862.
[3] A. Kulkarni and L. L. Brown, "Phishing Websites Detection using Machine Learning," International Journal of Advanced Computer Science and Applications, vol. 10, no. 7, pp. 8-13, Jul 2019.
[4] M. Botacin, F. Ceschin, P. de Geus, and A. Gregio, "We need to talk about antiviruses: challenges & pitfalls of AV evaluations," Computers & Security, vol. 95, p. 15, Aug 2020, Art. no. 101859.
[5] E. S. Gualberto, R. T. De Sousa, T. P. D. Vieira, J. Da Costa, and C. G. Duque, "From Feature Engineering and Topics Models to Enhanced Prediction Rates in Phishing Detection," IEEE Access, vol. 8, pp. 76368-76385, 2020.
[6] "General Practice and the Community: Research on health service, quality improvements and training. Selected abstracts from the EGPRN Meeting in Vigo, Spain, 17-20 October 2019 Abstracts," European Journal of General Practice, vol. 26, no. 1, pp. 42-50, Dec 2020.
[7] H. Alqahtani, I. H. Sarker, A. Kalim, S. M. Minhaz Hossain, S. Ikhlaq, and S. Hossain, "Cyber Intrusion Detection Using Machine Learning Classification Techniques," in Computing Science, Communication and Security, Singapore: Springer Singapore, 2020, pp. 121-131.
[8] J. A. Bland, M. D. Petty, T. S. Whitaker, K. P. Maxwell, and W. A. Cantrell, "Machine Learning Cyberattack and Defense Strategies," Computers & Security, vol. 92, p. 23, May 2020, Art. no. 101738.
[9] S. C. Sethuraman, V. Vijayakumar, and S. Walczak, "Cyber Attacks on Healthcare Devices Using Unmanned Aerial Vehicles," Journal of Medical Systems, vol. 44, no. 1, p. 10, Jan 2020, Art. no. 29.
[10] M. A. Kosan, O. Yildiz, and H. Karacan, "Comparative analysis of machine learning algorithms in detection of phishing websites," (in Turkish), Pamukkale University Journal of Engineering Sciences-Pamukkale Universitesi Muhendislik Bilimleri Dergisi, vol. 24, no. 2, pp. 276-282, 2018.
[11] O. S. Lih et al., "Comprehensive electrocardiographic diagnosis based on deep learning," Artificial Intelligence in Medicine, vol. 103, p. 8, Mar 2020, Art. no. 101789.
[12] D. Zhang et al., "Automatic corneal nerve fiber segmentation and geometric biomarker quantification," European Physical Journal Plus, vol. 135, no. 2, p. 16, Feb 2020, Art. no. 266.
[13] A. Cuzzocrea, F. Martinelli, and F. Mercaldo, Applying Machine Learning Techniques to Detect and Analyze Web Phishing Attacks (Iiwas2018: The 20th International Conference on Information Integration and Web-Based Applications & Services). New York: Assoc Computing Machinery, 2014, pp. 355-359.
[14] X. W. Liu and J. M. Fu, "SPWalk: Similar Property Oriented Feature Learning for Phishing Detection," IEEE Access, vol. 8, pp. 87031-87045, 2020.
[15] J. Mao et al., "Detecting Phishing Websites via Aggregation Analysis of Page Layouts," in 2017 International Conference on Identification, Information and Knowledge in the Internet of Things, vol. 129, R. Bie, Y. Sun, and J. Yu, Eds. (Procedia Computer Science). Amsterdam: Elsevier Science Bv, 2018, pp. 224-230.
[16] N. Shulzhenko and S. Romashkin, "Internet fraud and transnational organized crime," Juridical Tribune-Tribuna Juridica, vol. 10, no. 1, pp. 162-172, Mar 2020.
[17] A. Zamir et al., "Phishing web site detection using diverse machine learning algorithms," Electronic Library, vol. 38, no. 1, pp. 65-80, Jan 2020.
[18] A. Belabed, E. Aimeur, and A. Chikh, A personalized whitelist approach for phishing webpage detection (2012 Seventh International Conference on Availability, Reliability and Security). Los Alamitos: IEEE Computer Soc, 2012, pp. 249-254.
[19] E. Buber, O. Demir, and O. K. Sahingoz, Feature Selections for the Machine Learning based Detection of Phishing Websites (2017 International Artificial Intelligence and Data Processing Symposium). New York: IEEE, 2017.
[20] V. Patil, P. Thakkar, C. Shah, T. Bhat, and S. P. Godse, Detection and Prevention of Phishing Websites using Machine Learning Approach (2018 Fourth International Conference on Computing Communication Control and Automation). New York: IEEE, 2018.
[21] G. Sonowal and K. S. Kuppusamy, "PhiDMA - A phishing detection model with multi-filter approach," Journal of King Saud University-Computer and Information Sciences, vol. 32, no. 1, pp. 99-112, Jan 2020.
[22] D. Sarma, W. Alam, I. Saha, M. N. Alam, M. J. Alam and S. Hossain, "Bank Fraud Detection using Community Detection Algorithm," 2020 Second International Conference on Inventive Research in Computing Applications (ICIRCA), Coimbatore, India, 2020, pp. 642-646.
[23] S. Hossain, D. Sarma, T. Mittra, M. N. Alam, I. Saha and F. T. Johora, "Bengali Hand Sign Gestures Recognition using Convolutional Neural Network," 2020 Second International Conference on Inventive Research in Computing Applications (ICIRCA), Coimbatore, India, 2020, pp. 636-641.
[24] S. Hossain, A. Abtahee, I. Kashem, M. M. Hoque, and I. H. Sarker, "Crime Prediction Using Spatio-Temporal Data," in Computing Science, Communication and Security, Singapore: Springer Singapore, 2020, pp. 277-289.
[25] H. Alqahtani, I. H. Sarker, A. Kalim, S. M. M. Hossain, S. Ikhlaq and S. Hossain, "Cyber Intrusion Detection Using Machine Learning Classification Techniques," in Computing Science, Communication and Security, Singapore: Springer Singapore, 2020, pp. 121-131.
[26] S. Hossain, F. Islam, R. Karim and K. N. Siddique, "A Critical Comparison between Distributed Database Approach and Data Warehousing Approach," International Journal of Scientific & Engineering Research, vol. 5, no. 1, pp. 196-201, 2014.
[27] S. Hossain, D. Sarma, F. Tuj-Johora, J. Bushra, S. Sen and M. Taher, "A Belief Rule Based Expert System to Predict Student Performance under Uncertainty," in 2019 22nd International Conference on Computer and Information Technology (ICCIT), 2019, pp. 1-6.
[28] F. Ahmed, Fatema-Tuj-Johora, R. J. Chakma, S. Hossain and D. Sarma, "A Combined Belief Rule based Expert System to Predict Coronary Artery Disease," in 2020 International Conference on Inventive Computation Technologies (ICICT), 2020, pp. 252-257.
[29] S. Hossain, D. Sarma, R. J. Chakma, W. Alam, M. M. Hoque, and I. H. Sarker, "A Rule-Based Expert System to Assess Coronary Artery Disease Under Uncertainty," in Computing Science, Communication and Security, Singapore: Springer Singapore, 2020, pp. 143-159.
[30] M. N. Alam, D. Sarma, F. F. Lima, I. Saha, R. -E. -. Ulfath and S. Hossain, "Phishing Attacks Detection using Machine Learning Approach," 2020 Third International Conference on Smart Systems and Inventive Technology (ICSSIT), Tirunelveli, India, 2020, pp. 1173-1179, doi: 10.1109/ICSSIT48917.2020.9214225.
[31] I. Saha, D. Sarma, R. J. Chakma, M. N. Alam, A. Sultana and S. Hossain, "Phishing Attacks Detection using Deep Learning Approach," 2020 Third International Conference on Smart Systems and Inventive Technology (ICSSIT), Tirunelveli, India, 2020, pp. 1180-1185, doi: 10.1109/ICSSIT48917.2020.9214132.
[32] S. Hossain, D. Sarma and R. J. Chakma, "Machine Learning-Based Phishing Attack Detection," International Journal of Advanced Computer Science and Applications (IJACSA), vol. 11, no. 9, 2020, pp. 378-388, doi: 10.14569/IJACSA.2020.0110945.