Appl. Sci. 2022, 12, 12070
Article
A Heterogeneous Machine Learning Ensemble Framework for
Malicious Webpage Detection
Sam-Shin Shin 1 , Seung-Goo Ji 1 and Sung-Sam Hong 2, *
1 Internet Incident Response Technology Team, Korea Internet & Security Agency,
Naju 58324, Republic of Korea
2 Department of Multimedia Contents, Jangan University, Hwaseong 18331, Republic of Korea
* Correspondence: [email protected]
Abstract: The growing dependence on digital systems has heightened the risks posed by
cybersecurity threats. This paper proposes a new method for detecting malicious webpages among
several adversary activities. As shown in previous studies, malicious URL detection performance is
significantly affected by the learning dataset features. The overall performance of different
machine learning models varies depending on the data features, and using a particular model alone
is not always desirable in any given environment. To address these limitations, we propose an
ensemble approach using different machine learning models. Our proposed method outperforms the
existing single model by 6%, allowing for the detection of an additional 141 malicious URLs. In
this study, repetitive tasks are automated, improving the performance of different machine
learning models. In addition, the proposed framework builds an advanced feature set based on URL
and web content and includes the most optimized detection model structure. The proposed
technology can contribute to research on automated technology for the detection of malicious
websites, such as phishing websites and malicious code distribution websites.
for malicious web detection and establishing a pipeline for data access, collection, and
processing. In this study, model performance was improved by performing dimensionality
reduction by applying principal component analysis (PCA) and Chi-square testing to the
generated features. One general approach to deter such activities is a URL blacklist, which
is a collection of websites that have engaged in malicious or suspicious behaviors. Because
of human feedback, this safeguarding technique is highly accurate. However, it is still
unable to cover all categories in a constantly changing online environment [9]. To address
the shortcomings of the URL blacklist approach, cybersecurity experts have suggested
machine learning for malicious URL detection, which is known as a classification model.
This model is based on discriminative rules or features. This approach can distinguish
malicious from benign URLs by extracting features, thereby allowing the machine to learn
them. In this process, discriminative rules or feature selection play a crucial role in machine
learning, helping to identify effective features that can characterize malicious websites.
Most of the existing studies simply develop a feature based on the URL or web content
and detect it with a machine learning model. These studies are mainly related to phishing
site detection [10]. Invernizzi et al. [11] proposed an effective system to detect infections,
called Nazca, which detects web requests used for malicious code binary downloads and is
designed to work with large-scale networks such as internet service providers (ISP).
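As a rough illustration of the Chi-square testing mentioned above for feature selection, the sketch below scores binary features against a binary label using a 2x2 contingency table. The feature names and the toy data are hypothetical and are not taken from the dataset used in this study; the scoring function is a generic textbook statistic, not necessarily the exact procedure applied here.

```python
def chi_square_score(feature, labels):
    """Chi-square statistic of a 2x2 contingency table (binary feature vs. binary label)."""
    n = len(labels)
    # observed[f][y] counts samples with feature value f and label y.
    observed = [[0, 0], [0, 0]]
    for f, y in zip(feature, labels):
        observed[f][y] += 1
    score = 0.0
    for f in (0, 1):
        for y in (0, 1):
            row_total = observed[f][0] + observed[f][1]
            col_total = observed[0][y] + observed[1][y]
            expected = row_total * col_total / n
            if expected > 0:
                score += (observed[f][y] - expected) ** 2 / expected
    return score


# Hypothetical toy data: 1 = malicious, 0 = benign.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
has_ip_in_url = [1, 1, 1, 0, 0, 0, 0, 0]  # associated with the label
has_https = [1, 0, 1, 0, 1, 0, 1, 0]      # independent of the label

scores = {
    "has_ip_in_url": chi_square_score(has_ip_in_url, labels),
    "has_https": chi_square_score(has_https, labels),
}
# Features with a higher score are more strongly associated with the label.
ranked = sorted(scores, key=scores.get, reverse=True)
```

A feature whose score is close to zero carries little information about the label and is a candidate for removal.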
In this paper, we present an extensive survey of malicious website features and an
ensemble machine learning model for malicious web detection. The motivation of this
research study includes the following:
• To propose an advanced machine learning model that can detect and predict the
distribution of malicious codes based on URLs and web content without network data
analysis.
• To predict the risk of distribution by detecting malicious code distribution websites to
predict cyber threats in advance, rather than simple malicious web detection.
Because their patterns change over time, the information from malicious websites is
complex and can be utilized by combining different features. However, the performance
of machine learning models differs according to the feature selection techniques. We
propose an improved feature set to improve the web detection performance using the
collected dataset and feature analysis used in previous studies. Additionally, a module that
automatically generates this feature set is included in the framework. In numerous studies, machine
learning models, such as support vector machines (SVM), decision trees, and random
forests, have already proved their merit. However, machine learning models may provide
different results based on generalization and other feature combinations. Thus, we propose
an ensemble machine learning method that offers the best performance. In particular, the
contributions of this research study include the following:
• Defining an advanced feature set based on URL and web content and including the
most optimized detection model structure.
• Providing an improved malicious web detection framework with high accuracy
through ensemble techniques based on six different machine learning sub-models.
• Providing the performance comparison results of various machine learning models
for malicious web detection.
• Providing an automated technology for the detection of malicious webpages, such as
phishing and malicious code distribution via the web, in the cybersecurity field.
• Reducing cybersecurity damage by predicting the location of malicious code distribution
in advance.
The remainder of this paper is organized as follows: In Section 1, we briefly discuss
issues related to malicious webpages. Section 2 presents a survey of related works. In
Section 3, we describe the methodology for designing ensemble machine learning techniques,
including functions and processes. In Section 4, we present the findings of ensemble
machine learning testing, including the datasets, as well as a comparison of single and
ensemble models. In Section 5, we provide our conclusions and discuss the future scope of the
research.
2. Related Work
2.1. Heuristic-Based Malicious Website Detection
Heuristic-based techniques use an algorithm to generate matches from a database
after scanning suspicious webpages. In this approach, blacklisting is the most widely
employed practice. The database contains profiles of known malicious websites such as
URLs, IP addresses, and domains. If a newly added URL matches a known malicious URL
listed on the blacklist, it is deemed malicious [12]. The heuristic method is implemented
using a webpage execution dynamics analysis, identifying any signature of malicious
activities, such as abnormal process generation and repetitive redirection. However, this
tedious mechanism requires everyday access to and analysis of each website. Because a
large number of new URLs are generated each day, it is impracticable to maintain a valid
blacklist. Nevertheless, its simplicity and efficiency are sufficient to overcome its inherent
limitations, and the heuristic method is widely used in antivirus systems.
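The blacklist matching described above can be sketched as a simple set lookup. This is only a minimal illustration; the blacklist entries are hypothetical placeholders (reserved example domains and documentation IP ranges), and a production blacklist would be far larger and continuously updated.

```python
from urllib.parse import urlsplit

# Hypothetical blacklist profiles (URLs, domains, IP addresses); placeholders only.
BLACKLIST_URLS = {"http://malicious.example/download.exe"}
BLACKLIST_DOMAINS = {"malicious.example"}
BLACKLIST_IPS = {"203.0.113.7"}


def is_blacklisted(url):
    """Return True if the full URL, its hostname, or its literal IP is blacklisted."""
    if url in BLACKLIST_URLS:
        return True
    host = urlsplit(url).hostname or ""
    return host in BLACKLIST_DOMAINS or host in BLACKLIST_IPS
```

The lookup is fast, but a URL that is not yet listed is never flagged, which is the limitation discussed above.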
Phishing websites often include branding that appears legitimate and may even use
the same logo as the actual company. It is very difficult to determine which page belongs
to each website, but this can be partially supplemented by heuristic approaches. For this
purpose, a tentative blacklist is generated in XML format by analyzing all webpages under
the same hostname. When a particular business name is typed on Google, it shows a valid
URL at the top of the search result page; if this URL is blocked, it is deemed as phishing,
and the address is automatically updated and included in a blacklist [13].
Heuristic-based approaches have the advantages of simplicity, high efficiency, and
general performance; however, they suffer from nondeterministic polynomial (NP)-hard
problems [14], computational complexity owing to several iterations, and local optimization
problems [15].
random forest, gradient boosting, and neural networks. They compared the performance
of each feature based on the similarity between known and new malicious web data. Vara
et al. [20] proposed a malicious web detection model using a SVM classifier. The features
they considered include the IP address, ‘@’ symbol, ‘.’ (dot) symbol, domain separation
using ‘–’ (underscore or hyphen) symbol, URL redirection, HTTPS token, email subject line,
short URL service, hostname length, sensitive words, the number of slashes, Unicode, SSL
certificate validity, anchor, iframe, and website ranking. They emphasized that selecting an
effective feature is the key to improving the performance of machine learning models.
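Several of the URL-based features listed above can be computed directly from the URL string. The following is a minimal sketch under stated assumptions: the function name, the feature names, and the sensitive-word list are hypothetical, and only a subset of the listed features is covered (content-based features such as anchors and iframes would require the HTML source).

```python
import re
from urllib.parse import urlsplit

# Hypothetical word list; the sensitive words actually used are not enumerated in the text.
SENSITIVE_WORDS = ("login", "verify", "secure", "account", "update")
IPV4_PATTERN = re.compile(r"^\d{1,3}(\.\d{1,3}){3}$")


def url_lexical_features(url):
    """Extract a few lexical features from a URL string."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    lowered = url.lower()
    return {
        "has_ip_address": bool(IPV4_PATTERN.match(host)),
        "has_at_symbol": "@" in url,
        "dot_count": host.count("."),
        "has_hyphen_in_domain": "-" in host,
        "uses_https": parts.scheme == "https",
        "hostname_length": len(host),
        "slash_count": url.count("/"),
        "sensitive_word_count": sum(word in lowered for word in SENSITIVE_WORDS),
    }


features = url_lexical_features("http://192.0.2.10/secure-login/verify.php")
```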
Machine learning researchers agree that the success of a model or algorithm depends
on the manner in which the features are selected and extracted from the web. Classifying
whether a particular website is malicious or benign depends on the performance of the machine
learning model. It is therefore imperative to study the features and machine learning
models that are most effective in accurately predicting malicious URLs. Accordingly, in this
study, a method for extracting a feature set optimized for a model is proposed.
3.1. Applying Machine Learning Algorithm for MCDWDS
A decision tree is an algorithm used to classify the labeled data. It analyzes associations,
patterns, and rules between and from a large dataset and designs a model for classification and
prediction [23]. Because it has a flowchart-like tree structure, a decision tree is the easiest
classification and prediction algorithm to interpret, particularly the input data and target
variables. Random forest is a popular ensemble machine learning method that is mainly used in
classification and regression analysis, and it produces a classification or average prediction
from multiple decision trees configured during the learning process [24]. Its wide application
includes detection, classification, and regression. In this algorithm, each tree has slightly
different features, owing to its inherent randomness. This makes tree predictions decorrelated
and consequently improves the level of generalization. Logistic regression is a very popular
model used to find the conditional probability, and it is also one of the most frequently used
learning methods for malicious URL detection. Similar to ordinary regression analysis, the
primary goal of logistic regression is to express relations between dependent variables and
independent variables in specific functions so that they can be used as prediction models [25].
XGBoost is a machine learning algorithm based on a decision tree using a gradient boosting
framework. This ensemble tree method uses the gradient descent architecture of gradient boosting
machines to train weak learners (generally classification and regression tree (CART)) [26]. An
SVM is used to create a boundary that separates data points belonging to different classes. An
SVM finds the best line or the best decision boundary, called the hyperplane, that can segregate
a multidimensional space into classes. The data points closest to the hyperplane are called
support vectors. Selecting a hyperplane that maximizes the margin between the hyperplane and
learning dataset improves classification accuracy [27]. A convolutional neural network (CNN) is
a multilayer artificial neural network in which multiple convolutional layers precede the layers
of a traditional neural network. We employed the CNN-LSTM architecture, combining convolutional
neural network layers for input data feature extractions and a long short-term memory (LSTM) for
sequence predictions.
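For reference, two of the models described in this section can be written compactly in standard textbook form. The notation is an assumption introduced here for illustration (feature vector $\mathbf{x}$, weight vector $\mathbf{w}$, bias $b$, and labels $y_i \in \{-1, +1\}$ in the SVM case) and does not come from the original text.

```latex
% Logistic regression: conditional probability of the malicious class (y = 1)
\[
P(y = 1 \mid \mathbf{x}) = \frac{1}{1 + e^{-(\mathbf{w}^{\top}\mathbf{x} + b)}}
\]

% Hard-margin SVM: the hyperplane w^T x + b = 0 that maximizes the margin 2 / ||w||
\[
\min_{\mathbf{w},\, b} \ \frac{1}{2}\lVert \mathbf{w} \rVert^{2}
\quad \text{subject to} \quad
y_i\left(\mathbf{w}^{\top}\mathbf{x}_i + b\right) \ge 1, \qquad i = 1, \dots, n
\]
```

Minimizing $\lVert \mathbf{w} \rVert$ is equivalent to maximizing the margin, and the training points for which the constraint holds with equality are the support vectors.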
3.2. Extraction and Pre-Processing of Malicious Code Distribution Web Features
In this section, we define malicious web features and perform a process to extract them, as
shown in Figure 2. We employed an improved feature set consisting of 26 features to enhance web
detection performance based on the collected dataset. In this framework, when the data are
input, the feature set is extracted from the feature extraction module. The process comprises
three modules: data loading, URL-based extraction, and content-based extraction. The collected
URLs are registered in the database and the feature data for machine learning (URL length,
domain data, and HTML contents) are extracted in raw data form, which is then forwarded to the
training data pre-processing module. The feature data consist of ‘URL-based feature data’
extracted from the URL itself and ‘content-based feature data’ extracted from the HTML source
code upon URL requests. The 26 features that we defined and extracted are listed in Table 1.
Because the extracted feature data are not appropriate for machine learning, raw data
require a vectorization process that converts feature data into numbers. Depending on the
machine learning algorithm used, this process involves two modules, URL and content. The
former performs word tokenization of the lexical features of URLs, converting domain
character data into vectors appropriate for the CNN deep learning algorithm, while the
latter converts the 26 sets of feature data extracted through HTML parsing into vectors
appropriate for machine learning.
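The URL vectorization step can be illustrated with a minimal sketch. The tokenization rule, the character alphabet, and the fixed input length below are assumptions chosen for illustration; the actual encoding used by the framework is not specified in the text.

```python
import re

MAX_LEN = 20  # hypothetical fixed input length expected by the CNN
ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789-._~:/?#[]@!$&'()*+,;=%"
# Id 0 is reserved for padding and for characters outside the alphabet.
CHAR_TO_ID = {ch: i + 1 for i, ch in enumerate(ALPHABET)}


def tokenize_url(url):
    """Split a URL into lexical word tokens on non-alphanumeric delimiters."""
    return [tok for tok in re.split(r"[^A-Za-z0-9]+", url.lower()) if tok]


def encode_chars(text, max_len=MAX_LEN):
    """Map characters to integer ids, then truncate or zero-pad to max_len."""
    ids = [CHAR_TO_ID.get(ch, 0) for ch in text.lower()[:max_len]]
    return ids + [0] * (max_len - len(ids))


tokens = tokenize_url("http://login.example.com/index.php")
vector = encode_chars("example.com")
```

A fixed-length integer vector of this kind is a typical input for an embedding layer followed by one-dimensional convolutions.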
3.3. Machine Learning Model Selection in MCDWDS
Figure 3 describes how to select better-performing models and how to replace the
inferior models with their superior counterparts. The model selection submodule evaluates
the classification performance of each model by machine learning parameters and, if
appropriate, replaces the existing model with new models.
Figure 3. Machine Learning Model Selection and Replace Process.
... be adaptively input to the contents or URLs according to the algorithm type.
... which covers both validation results and performance data;
• Load a ‘Model Information’ of the previously used models;
• Replace the existing models with better models by comparing the previous and updated
models.
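The comparison and replacement of previous and updated models can be sketched as follows. The `ModelInfo` record, the use of the F1-score as the comparison criterion, and the score values are hypothetical; the text does not state which performance value drives the replacement.

```python
from dataclasses import dataclass


@dataclass
class ModelInfo:
    """Hypothetical 'Model Information' record: a model name and its validation score."""
    name: str
    f1_score: float


def select_models(previous, updated):
    """Keep, for each model name, whichever version has the higher validation F1-score."""
    selected = dict(previous)
    for name, candidate in updated.items():
        current = selected.get(name)
        if current is None or candidate.f1_score > current.f1_score:
            selected[name] = candidate  # replace the inferior model
    return selected


# Hypothetical scores for illustration only.
previous = {"rf": ModelInfo("rf", 0.94), "svm": ModelInfo("svm", 0.61)}
updated = {"rf": ModelInfo("rf", 0.95), "svm": ModelInfo("svm", 0.58)}
selected = select_models(previous, updated)
```

In this toy run, the updated random forest replaces the previous one, while the previous SVM is kept because the updated version scored lower.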
3.5. Ensemble Machine Learning Prediction Results Analysis
Figure 5 shows the ensemble machine learning prediction analysis process:
① Input the prediction results for each model.
② Dynamically calculate weights (model reliability) using the performance values of
each model.
③ Calculate the ensemble result of soft voting (using the weighted average) with the
weights and performance values of each model.
④ The result of determining whether the ensemble is benign (green in Figure 5) or
malicious (red in Figure 5) is delivered by comparing the results of ensemble calculations.
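The four steps above can be sketched in a few lines. This is a minimal illustration, assuming that the dynamic weights are obtained by normalizing each model's performance value and that a threshold of 0.5 separates the two classes; the exact weight formula is not given in the text, and the model names and numbers are hypothetical.

```python
def dynamic_weights(performance):
    """Normalize per-model performance values so that the weights sum to 1."""
    total = sum(performance.values())
    return {name: value / total for name, value in performance.items()}


def weighted_soft_vote(probabilities, weights, threshold=0.5):
    """Weighted average of per-model malicious probabilities, followed by a threshold."""
    score = sum(weights[name] * p for name, p in probabilities.items())
    label = "malicious" if score >= threshold else "benign"
    return label, score


# Hypothetical inputs: validation performance and predicted P(malicious) for one URL.
performance = {"lgbm": 0.95, "random_forest": 0.95, "svm": 0.60}
probabilities = {"lgbm": 0.90, "random_forest": 0.80, "svm": 0.10}

weights = dynamic_weights(performance)
label, score = weighted_soft_vote(probabilities, weights)
```

Because the weak model receives a smaller weight, its dissenting probability lowers the ensemble score less than it would in an unweighted average (0.67 here versus 0.60 for plain soft voting).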
Table 3 above shows the precision (expressed as a fraction between 0 and 1) of each machine
learning model in descending order: the light gradient boosting machine (0.9546) is in first
place, followed by random forest (0.9511), XGBoost (0.9499), logistic regression (0.9337),
decision tree (0.9162), 1D-CNN (0.804), and SVM (0.6096).
The largest difference between precision and accuracy was for logistic regression,
which recorded a value of 0.03. This was mainly attributable to the parameter properties
due to alternating model selections. In contrast, the SVM recorded a very low level of
performance (approximately 60%). This poor performance was due to sparse features,
which made it difficult for the kernel SVM to find a separating hyperplane. The malicious
web detection dataset structure is not compatible with the SVM kernel. In addition, an SVM
can show good performance regardless of the data dimension when the scale or format of
feature values is the same and the number of data samples is small (10,000 or less) [28].
The dataset used in this study consists of heterogeneous features, and the number of samples
is approximately 100,000, which likely explains the low SVM result.
In addition, we performed experiments with the high-performance extreme learning
machines (HP-ELM) model [29] and the light gradient boosting machine (LGBM) [30] to
compare against recent models. HP-ELM has previously been applied in the field of malware
detection [31]. In our experiment, its F1-score was measured as 0.6946. Because the type
and scale of the features in the dataset are diverse, the dataset is not well suited to
HP-ELM, which is thought to explain the low performance. The LGBM is a tree-based ensemble model.
Table 4. Performance of the Voting Ensemble Techniques based on Dynamic Model Weight.
When we applied the ensemble techniques, the highest precision was observed in
weighted soft voting. Because both precision and recall were high, the F1-score was
also high. In terms of accuracy, we obtained the best results for hard voting. However,
the difference in F1-score and accuracy between weighted soft voting and hard voting was
only 0.0115 and 0.0139, respectively.
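For readers less familiar with these indicators, the sketch below computes precision, recall, F1-score, and accuracy from confusion-matrix counts using their standard definitions. The counts are hypothetical and are unrelated to the values in Table 4.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


# Hypothetical counts for illustration only.
metrics = classification_metrics(tp=90, fp=10, tn=85, fn=15)
```

Because the F1-score is the harmonic mean of precision and recall, it is high only when both are high, which is consistent with the observation above.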
The poorest accuracy was found in soft voting, with a difference of 0.1403
compared with weighted soft voting. In terms of precision, the difference was 0.14. The poor
performance of soft voting was mainly due to a substantial gap between the prediction
performance of the different machine learning models. As shown in Table 3, the
prediction performance of the SVM was approximately 60%, which pulled down the unweighted
average probability used in soft voting.
Figure 6 shows the ROC curve of the weighted soft voting model. The AUC value was
0.924, indicating that the TP and FP ratios were not biased and high detection accuracy was
confirmed. The red-dot line in Figure 6 represents the worst case.
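The AUC can be interpreted as the probability that a randomly chosen malicious sample receives a higher score than a randomly chosen benign one. The following sketch computes it by direct pairwise comparison; the labels and scores are hypothetical toy values and do not reproduce the reported AUC of 0.924.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: 1 = malicious, 0 = benign.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
auc = roc_auc(labels, scores)
```

A classifier that assigns scores at random yields an AUC near 0.5, which corresponds to the diagonal reference line of an ROC plot.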
An experiment was performed with AdaBoost, a different ensemble technique, to
compare the performance of the proposed ensemble model. Table 5 presents the results
of the AdaBoost experiment. In summary, the hard voting method achieved the highest
accuracy (0.9548), whereas the weighted soft voting method showed high values for the
remaining performance indicators, with a precision of 0.9587, recall of 0.9587, and F1-score
of 0.9587. The proposed method dynamically adjusts weights by reflecting the characteristics
of each model and thus exhibits high accuracy compared to other methods. However, as
mentioned above, it has the disadvantage that its result depends on the performance of the
sub-models. Overall, the proposed soft voting ensemble technique based on dynamic
weighting is suitable for malicious web detection.
The classification results from ensemble machine learning were validated using actual
malicious and benign webpage classifications. For validation purposes, we compared the
actual false positive rate with malicious website data, referring to the reputation scores
provided by malicious website search engines such as Google Safe Browsing and VirusTotal.
Table 6 shows the URL prediction results per reputation score ranging from 3 to 15.
The ensemble predictions for the 11,665 URLs that were actually benign showed that
the total number of false positives was 633, and the total number of true positives was
11,032. In other words, the false positive rate was as low as 5.43%, meaning that 94.57% of
the predictions were true. The ensemble predictions for URLs that were actually malicious
showed that the total number of false positives was 11,337, accounting for 4.35%. This
means that the ratio of true negatives was 95.65%, which was slightly higher than that of
the benign URL predictions by 1.08 percentage points. When the reputation score was 8 or
more, all predictions were true-positive. The ensemble predictions for benign URLs with
a reputation score of 8 or more found an additional 141 false positives. This additional
detection of malicious websites can increase prediction performance by 22.27%.
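The percentages reported for the benign URLs can be reproduced from the counts given above. Reading the 22.27% figure as the share of the 633 false positives that correspond to the 141 additionally detected URLs is an assumption made here because it matches the reported value; the text does not state the formula explicitly.

```python
# Counts taken from the validation paragraph above.
benign_total = 11_665
false_positives = 633
correct_benign = 11_032
additional_detections = 141

false_positive_rate = false_positives / benign_total   # about 5.43%
correct_rate = correct_benign / benign_total            # about 94.57%
# Assumed reading: additional detections as a share of the false positives.
improvement = additional_detections / false_positives   # about 22.27%

summary = {
    "false_positive_rate_pct": round(false_positive_rate * 100, 2),
    "correct_rate_pct": round(correct_rate * 100, 2),
    "improvement_pct": round(improvement * 100, 2),
}
```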
5. Conclusions
In this study, we proposed a new ensemble machine learning method for malicious
webpage detection. Various machine learning techniques have been studied to detect
malicious URLs. A higher performance inevitably requires greater data processing.
Recent studies have shown that SVMs, decision trees, random forests, and other popular
machine learning models can deliver impressive performance. However, malicious webpages
are constantly evolving, and their detection requires repetitive model evaluations,
improved generalization performance, and different combinations of features. To this end,
we conducted an extensive analysis of ensemble machine learning techniques to automate
repetitive tasks.
We found that while single-model detection performance averaged 86%, the ensemble
framework with weighted soft voting yielded better detection performance by 9 percentage
points. Validating the classification results shows that the ensemble approach
detected an additional 141 malicious webpages, thus improving the detection performance
by 22.27%. We intend to add an analysis system function that focuses on feature dataset
management of malicious webpages and their importance for intuitive technologies and
cybersecurity standards for addressing ever-changing malicious web URLs.
In our future work, we intend to study malicious attack web detection technology that
can detect malicious websites and malicious code distribution sites as well as waypoints and
domains containing C&C. In addition, as a sub-model, we want to study a model that can
detect whether a website is malicious through deep learning using website binary imaging.
References
1. Kang, H.K.; Shin, S.S.; Kim, D.Y.; Park, S.T. Design and Implementation of Malicious URL Prediction System based on Multiple
Machine Learning Algorithms. J. Korea Multimed. Soc. 2020, 23, 1396–1405. [CrossRef]
2. Le, H.; Pham, Q.; Sahoo, D.; Hoi, S.C.H. URLNet: Learning a URL Representation with Deep Learning for Malicious URL
Detection. arXiv preprint 2018, arXiv:1802.03162. [CrossRef]
3. Patil, D.R.; Patil, J.B. Survey on Malicious Web Pages Detection Techniques. Int. J. u-e-Serv. Sci. Technol. 2015, 8, 195–206.
[CrossRef]
4. Baykara, M.; Gürel, Z.Z. Detection of Phishing Attacks. In Proceedings of the 2018 6th International Symposium on Digital
Forensic and Security (ISDFS), Antalya, Turkey, 22–25 March 2018; pp. 1–5. [CrossRef]
5. Cova, M.; Kruegel, C.; Vigna, G. Detection and Analysis of Drive-by-Download Attacks and Malicious JavaScript Code. In
Proceedings of the 19th International Conference on World Wide Web, Raleigh, NC, USA, 26–30 April 2010; pp. 281–290.
[CrossRef]
6. Singhal, S.; Chawla, U.; Shorey, R. Machine Learning & Concept Drift Based Approach for Malicious Website Detection. In
Proceedings of the 2020 International Conference on Communication Systems & Networks (COMSNETS), Bengaluru, India, 7–11
January 2020; pp. 582–585. [CrossRef]
7. Bhoj, N.; Tripathi, A.; Bisht, G.S.; Dwivedi, A.R.; Pandey, B.; Chhimwal, N. Comparative Analysis of Feature Selection Techniques
for Malicious Website Detection in SMOTE Balanced Data. RS Open J. Innov. Commun. Technol. 2021, 2, 1–10. [CrossRef]
8. Chaiban, A.; Sovilj, D.; Soliman, H.; Salmon, G.; Lin, X. Investigating the Influence of Feature Sources for Malicious Website
Detection. Appl. Sci. 2022, 12, 2806. [CrossRef]
9. Altay, B.; Dokeroglu, T.; Cosar, A. Context-Sensitive and Keyword Density-Based Supervised Machine Learning Techniques for
Malicious Webpage Detection. Soft Comput. 2019, 23, 4177–4191. [CrossRef]
10. Zhuang, W.; Jiang, Q.; Xiong, T. An intelligent anti-phishing strategy model for phishing website detection. In Proceedings of the
2012 32nd International Conference on Distributed Computing Systems Workshops, Macau, China, 18–21 June 2012. [CrossRef]
11. Invernizzi, L.; Miskovic, S.; Torres, R.; Saha, S.; Lee, S.-J.; Mellia, M.; Kruegel, C.; Vigna, G. Nazca: Detecting Malware Distribution
in Large-Scale Networks. NDSS 2014, 14, 23–26. [CrossRef]
12. Eshete, B.; Kessler, F.B. Effective Analysis, Characterization, and Detection of Malicious Web Pages. In Proceedings of the 22nd
International Conference on World Wide Web, Rio de Janeiro, Brazil, 13–17 May 2013; pp. 355–359. [CrossRef]
13. Tretyakov, K. Machine Learning Techniques in Spam Filtering. In Data Mining Problem-Oriented Seminar; MTAT: Beauvallon,
France, 2004; pp. 60–79. Available online: https://courses.cs.ut.ee/2004/dm-seminarspring/uploads/Main/P06.pdf (accessed
on 16 January 2022).
14. Knuth, D.E. Postscript about NP-hard problems. ACM SIGACT News. 1974, 6, 15–16. [CrossRef]
15. Beheshti, Z.; Shamsuddin, S.M. A review of population-based meta-heuristic algorithms. Int. J. Adv. Soft Comput. Appl 2013, 5,
1–35.
16. Aljabri, M.; Alhaidari, F.; Mohammad, R.M.A.; Samiha, M.; Alhamed, D.H.; Altamimi, H.S.; Chrouf, S.M.B. An Assessment
of Lexical, Network, and Content-Based Features for Detecting Malicious URLs Using Machine Learning and Deep Learning
Models. Comput. Intell. Neurosci. 2022, 2022, 3241216. [CrossRef] [PubMed]
17. Wang, H.-H.; Yu, L.; Tian, S.-W.; Peng, Y.-F.; Pei, X.-J. Bidirectional LSTM Malicious Webpages Detection Algorithm Based on
Convolutional Neural Network and Independent Recurrent Neural Network. Appl. Intell. 2019, 49, 3016–3026. [CrossRef]
18. Ozker, U.; Sahingoz, O.K. Content Based Phishing Detection with Machine Learning. In Proceedings of the 2020 International
Conference on Electrical Engineering (ICEE), Istanbul, Turkey, 25–27 September 2020; pp. 27–32. [CrossRef]
19. Chatterjee, M.; Namin, A.S. Detecting Phishing Websites through Deep Reinforcement Learning. In Proceedings of the 2019 IEEE
43rd Annual Computer Software and Applications Conference (COMPSAC), Milwaukee, WI, USA, 15–19 July 2019; Volume 2,
pp. 227–232. [CrossRef]
20. Vara, K.D.; Dimble, V.S.; Yadav, M.M.; Thorat, A.A. Based on URL Feature Extraction Identify Malicious Website Using Machine
Learning Techniques. Int. Res. J. Innov. Eng. Technol. 2022, 6, 144–148. [CrossRef]
21. Choi, S.Y.; Lim, C.G.; Kim, Y.M. Automated Link Tracing for Classification of Malicious Websites in Malware Distribution
Networks. J. Inf. Process. Syst. 2019, 15, 100–115. [CrossRef]
22. Wang, G.; Stokes, J.W.; Herley, C.; Felstead, D. Detecting Malicious Landing Pages in Malware Distribution Networks. In
Proceedings of the 2013 43rd Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN),
Budapest, Hungary, 24–27 June 2013. [CrossRef]
23. Salami, H.O.; Ibrahim, R.S.; Yahaya, M.O. Detecting Anomalies in Students' Results Using Decision Trees. Int. J. Mod. Educ.
Comput. Sci. 2016, 8, 31–40. [CrossRef]
24. Desai, A.; Jatakia, J.; Naik, R.; Raul, N. Malicious Web Content Detection Using Machine Leaning. In Proceedings of the 2017 2nd
IEEE International Conference on Recent Trends in Electronics, Information & Communication Technology (RTEICT), Bangalore,
India, 19–20 May 2017; pp. 1432–1436. [CrossRef]
25. Chiramdasu, R.; Srivastava, G.; Bhattacharya, S.; Reddy, P.K.; Reddy Gadekallu, T. Malicious Url Detection Using Logistic
Regression. In Proceedings of the 2021 IEEE International Conference on Omni-Layer Intelligent Systems (COINS), Barcelona,
Spain, 23–25 August 2021; Volume 2021, pp. 11–16. [CrossRef]
26. Mokbal, F.M.M.; Dan, W.; Xiaoxi, W.; Wenbin, Z.; Lihua, F. XGBXSS: An Extreme Gradient Boosting Detection Framework for
Cross-Site Scripting Attacks Based on Hybrid Feature Selection Approach and Parameters Optimization. J. Inf. Secur. Appl. 2021,
58, 102813. [CrossRef]
27. Brintha, N.C.; Preethi, C.; Winowlin Jappes, J.T. Exploring Malicious Webpages Using Machine Learning Concept. In Proceedings
of the 2021 2nd International Conference for Emerging Technology (INCET), Belagavi, India, 21–23 May 2021; pp. 1–5. [CrossRef]
28. Sheykhmousa, M.; Mahdianpari, M.; Ghanbari, H.; Mohammadimanesh, F.; Ghamisi, P.; Homayouni, S. Support vector machine
versus random forest for remote sensing image classification: A meta-analysis and systematic review. IEEE J. Sel. Top. Appl. Earth
Obs. Remote Sens. 2020, 13, 6308–6325. [CrossRef]
29. Akusok, A.; Bjork, K.-M.; Miche, Y.; Lendasse, A. High-Performance Extreme Learning Machines: A Complete toolbox for Big
Data Applications. IEEE Access 2015, 3, 1011–1025. [CrossRef]
30. Ke, G.; Meng, Q.; Finley, T.; Wang, T.; Chen, W.; Ma, W.; Liu, T.Y. Lightgbm: A highly efficient gradient boosting decision tree.
Adv. Neural Inf. Process. Syst. 2017, 30, 3149–3157. Available online: https://dl.acm.org/doi/10.5555/3294996.3295074 (accessed
on 16 January 2022).
31. Shamshirband, S.; Chronopoulos, A.T. A new malware detection system using a high performance-ELM method. In Proceedings
of the 23rd International Database Applications & Engineering Symposium, Athens, Greece, 10–12 June 2019; pp. 1–10. [CrossRef]