
Next-Generation Cyber Attack Prediction For Iot Systems: Leveraging Multi-Class SVM and Optimized Chaid Decision Tree


Dalal et al. Journal of Cloud Computing (2023) 12:137
https://doi.org/10.1186/s13677-023-00517-4
Journal of Cloud Computing: Advances, Systems and Applications

RESEARCH Open Access

Next-generation cyber attack prediction for IoT systems: leveraging multi-class SVM and optimized CHAID decision tree
Surjeet Dalal1, Umesh Kumar Lilhore2*, Neetu Faujdar3, Sarita Simaiya2, Manel Ayadi4, Nouf A. Almujally4 and
Amel Ksibi4

Abstract
Billions of gadgets are already online, making the IoT an essential aspect of daily life. However, the interconnected
nature of IoT devices also leaves them open to cyber threats. The quantity and sophistication of cyber assaults aimed
against Internet of Things (IoT) systems have skyrocketed in recent years. This paper proposes a next-generation
cyber attack prediction framework for IoT systems. The framework uses the multi-class support vector machine (SVM)
and the improved CHAID decision tree machine learning methods. IoT traffic is classified using a multi-class support
vector machine to identify various types of attacks. The SVM model is then optimized with the help of the CHAID
decision tree, which prioritizes the attributes most relevant to the categorization of attacks. The proposed framework
was evaluated on a real-world dataset of IoT traffic. The findings demonstrate the framework’s ability to categorize
attacks accurately. The framework can determine which attributes are most crucial for attack categorization
to enhance the SVM model’s precision. The proposed technique focuses on network traffic characteristics that can be
signs of cybersecurity threats on IoT networks and affected network nodes. Selected feature vectors were also created
utilizing the elements acquired on every IoT console. The evaluation results on the Multistep Cyber-Attack
Dataset (MSCAD) show that the proposed CHAID decision tree can significantly predict the multi-stage cyber attack
with 99.72% accuracy. Such accurate prediction is essential in managing cyber attacks in real-time communication.
Because of its efficiency and scalability, the model may be used to forecast cyber attacks in real time, even in massive
IoT installations. Because of its computing efficiency, it can make accurate predictions rapidly, allowing for prompt
detection and action. By locating possible entry points for attacks and mitigating them, the framework helps
strengthen the safety of IoT systems.
Keywords Anomaly detection, CHAID decision tree, IoT cyber attacks, Multistep attack, Security investigation, Zero-day attack

*Correspondence:
Umesh Kumar Lilhore
[email protected]
Full list of author information is available at the end of the article

© The Author(s) 2023, corrected publication 2023. Open Access This article is licensed under a Creative Commons Attribution 4.0
International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you
give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes
were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated
otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not
permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To
view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Introduction

The term "Internet of Things" (IoT) refers to a broad category of technology solutions and meaningful objects that interact with one another online, in addition to the big data that all objects produce. Automation, intelligent equipment in home automation, and essential infrastructure are all examples of IoT device equipment with various uses and complexity. IoT devices were created to improve safety and convenience in many facets of a person’s life. In addition to greater comfort, the IoT introduces new cybersecurity-related issues and difficulties. The characteristics of a setting affect the security problems underlying the IoT infrastructure. An IoT framework is a potential IoT ecosystem component consisting of collections of advanced technologies with the same or equivalent technical specifications. If a specific device is vulnerable, such homogeneity magnifies the consequence.

A multi-stage cyber-attack is precisely what its name suggests: a cyber-attack that takes place in steps instead of as a single instantaneous attack. When a resource’s integrity, confidentiality, or availability is compromised by an incursion [1], it is considered an intrusion. Intrusion detection systems are the first line of defence in crucial IoT networks. Anomalies in network traffic or signatures help them identify known threats. Security alarms are growing at an exponential rate as network traffic increases. However, sophisticated attacks elude IoT security systems by carrying out each attack step individually and dividing the attack into many consequential segments. As a result, modern cyberattacks are becoming more accurate, distributed and large-scale. Undetected cyberattacks can have devastating consequences. To secure vital resources now or in the future, a description and projection of the attack and documentation of the attacker’s behaviour are helpful.

Similarly, a multi-stage cyber attack on an organization may involve a rogue employee who first probes the network defences for weaknesses and then uses his position within the organization to drop a malware payload that is activated at a suitable time. The use of risky servers, such as telnet servers and File Transfer Protocol (FTP) servers, along with security flaws in devices and access control lists, is a critical problem. Security flaws in the policies and procedures employed by the communications infrastructure are also an issue. Even highly specialized, resource-constrained IoT equipment can, if vulnerable, be leveraged to track and collect information on the IoT. As a result, the entire IoT infrastructure may be severely harmed by flaws in the protocols used by the IoT application. The effects vary depending on the ecosystems in which the vulnerable connected systems operate.

SVM is a supervised learning framework with better classification performance than many other classification algorithms, but its application is restricted by the long training times required for massive data sets. Various feature selection methods are combined with SVM to reduce the dimensionality of the data. As a result, the classification model needs less training time. An ideal set of characteristics is chosen using feature selection before constructing a model. A particular feature selection algorithm is used in the feature selection phase to assess the ranking of each possible characteristic, and the best "k" characteristics are then determined. This process involves creating a ranked list of features from which a subset can be chosen using various specific requirements. One of the most prevalent statistical methods is CHAID, which measures the deviation from the distribution that would be expected if the feature occurrence were independent of the categorical variable.

The behaviour of smart IoT devices can be altered by device manufacturers, even without the customer’s consent, by changing the device’s custom firmware, which is a significant IoT cyber threat hazard. This adds new security vulnerabilities that could enable the IoT device to perform unwanted activities on the client device, like secretly capturing confidential user information and even improperly altering capabilities.

This work proposes an IoT cybersecurity threat detection model that utilizes a multi-class SVM algorithm and CHAID feature screening for high precision, lower false positives, and optimistic factors. The proposed model optimizes a kernel parameter by calculating the variance for each attribute feature and determining the highest attribute variance. A high variance will lead to a better kernel parameter if the kernel and variance are inversely related. This method is known as the variance optimization technique. The critical contributions of this research are:

a) The primary goal of the research was to investigate the potential for multi-stage cyber threat detection in IoT devices using load flow and more in-depth network monitoring that considers IoT security protocol characteristics.
b) This research developed a model for early and automatic recognition of cyber security threats for IoT infrastructure based on the CHAID method, which locates and creates new secure paths.
c) This research attempts to improve cyber attack detection effectiveness through an SVM ML algorithm.
d) The proposed technique focuses on network activity characteristics that can be signs of cybercrime in the network ecosystem and vulnerable smart systems.
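The variance optimization idea described in the Introduction can be illustrated with a minimal sketch. It rests on assumptions that the paper does not state: the kernel is taken to be an RBF kernel, exp(-gamma * ||x - y||^2), and the kernel parameter is set to gamma = 1 / (2 * max feature variance) so that it is inversely related to the highest attribute variance. The function names and the toy data are hypothetical.

```python
import math
from statistics import pvariance


def max_feature_variance(rows):
    """Return the largest per-column (population) variance of numeric rows."""
    columns = list(zip(*rows))
    return max(pvariance(col) for col in columns)


def variance_based_gamma(rows):
    """Assumed rule: gamma is inversely related to the highest feature variance."""
    sigma_sq = max_feature_variance(rows)
    if sigma_sq == 0:
        raise ValueError("all features are constant; gamma is undefined")
    return 1.0 / (2.0 * sigma_sq)


def rbf_kernel(x, y, gamma):
    """RBF kernel value exp(-gamma * squared Euclidean distance)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


# Toy, already normalized flow features (hypothetical values).
X = [[0.0, 0.2], [1.0, 0.4], [0.0, 0.6], [1.0, 0.8]]
gamma = variance_based_gamma(X)
similarity = rbf_kernel(X[0], X[1], gamma)
```

A larger maximum variance gives a smaller gamma and therefore a smoother kernel. The authors’ exact rule may differ from this one.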

The complete article is organized as follows: the Related work section covers related work in the field of IoT cyber security attack detection, the Materials and methods section covers the materials and methods, the Results and discussion section covers the results and analysis, including experimental details and performance metrics, and the Conclusion section covers the conclusion and future scope of the IoT cyber security research.

Related work

Fundamentally, circumstances would gain from a method and language for exhibiting IoT cyber security threats [2] in the direction of automated detection and identification of multistep cyber attacks. The IoT architecture is an example of attack patterns built by reusing generic modules. A prototype implementation of a scenario recognition engine using Categorical Abstract Machine Language (CAML) was developed, which incrementally consumed first-level security alerts and generated reports that identify multistep attack scenarios in the alert stream.

Protecting IoT devices from top to bottom is described in [3], contributing to a greater capacity for mission-driven digital situational awareness. The IoT Cauldron plotted out all potential network vulnerabilities by linking, summarizing, standardizing, and interweaving data from diverse sources. It allowed for a more nuanced understanding of potential attack vectors, leading to mitigation suggestions. A flexible demonstration supported a multi-stage analysis of firewall rules and host-to-host vulnerabilities, including attack routes into the organization from the outside. The authors portrayed a prepared relationship based on Cauldron attack graphs and analyzed the impact of attacks on missions.

In [4], the authors examine a cyber security threat detection model for IoT devices based on a Hidden Markov Model (HMM). IoT devices relied on information mining to deal with warnings and generate input for the HMM. Given the acquired knowledge, their architecture could continuously process streaming Snort warnings and anticipate intrusions. With enough data, their approach might infer patterns in the multi-stage attack and rank attackers accordingly. This allows their system to accurately predict attackers’ behaviour and assess the relative danger of different groups of attackers.

In [5], the authors present a multistep signature language model for IoT device communication that can aid in attacking predetermined sites based on standardized log events collected from various applications and devices. In addition, the wordy language helps integrate an understanding of external threats and reference up-to-date warning signs. Using this technology, they would be able to manufacture generic sleep-boosting markings. Using this vocabulary, they could tell between various login brute-force attempts across multiple apps using a single, generic pattern.

The researchers in [6] presented a review of previous research and a rigorous examination of several machine learning methods. The paper also includes statistical data to compare the methods’ effectiveness in recognizing suspicious activities in IoT network systems. According to the research observations, the random forest method generates the most detailed findings for the feature sets.

A cyber security model with an Intrusion Detection System (IDS) is discussed in IoT architecture, which utilizes alarms relating to unusual traffic to connect IoT devices. Because there are many possible permutations of attack time, risk assessment, and attack hub data in the IoT, this study presented a method to mimic multistep attack circumstances within the company. The results of the trials proved that the suggested technique could accurately reproduce multistep attack situations and trace them back to the original host. It might help senior leaders better communicate safety actions to employees, helping to make the workplace safer for everyone.

The IoT network’s attack recognition strategy is discussed in [7]. Its foundation is the application of advanced systems. A sequence of network architectures is used to create IoT solutions. The method uses information gain, a random forest classifier, correlation analysis, and global feature ranking to decrease the number of features. The additional investigation is based on three feature sets coupled using the suggested method to generate an optimized part and functionality.

Research [8] presents a method for IoT threat detection that relies on cloud technology software-defined networks (SDN). It uses a decentralized multiple SDN to prevent attacks within low-power wireless IoT equipment. The predetermined neighbourhood DNS server of the designated sector was used to carry out network activity dominion for each network interface field. The central component of the strategy is a unique regulator installed in a cloud infrastructure and linked to a base station.

Research [9] evaluates cyber attack detection using a holistic strategy proposed to address the challenge of pinpointing novel, nuanced threats and the best ways to neutralize them. Particularly illustrative of this issue are zero-day attacks and multistep attacks, which consist of several steps, some malevolent and others not. To identify the multi-stage attack scenario, the authors present a substantially Boosted Neural Network. The outcomes of running many machine learning methods were displayed, and a greatly enhanced neural network was shown.

Research [10] presents the IoT network’s cyber threat detection strategy. It is founded on the application of advanced techniques. The created expert system uses an assortment of network architectures to function. The method uses the correlation matrix, the random forest method, and information gain to score the features and decrease their number. Three different feature sets are used as the basis for the exploratory studies, which aim to create an optimized feature set by combining them with the suggested algorithm. The researchers utilized the random forest, XG-Boost and K-nearest neighbour ML algorithms to analyze the data.

Research [11] suggests a novel and effective encryption method for foreseeing cyber attacks on cyber-physical systems to counteract these dangers. The recommended approach uses Bayesian optimization strategies to tune the hyper-parameters of the LightGBM algorithm. The authors tried out the proposed method on the University of Nevada, Reno intrusion detection dataset (UNR-IDD). An accuracy of 0.9918, a precision of 0.9922, and a recall of 0.9922 were attained. The technique improves the cyber-physical system’s security, as the empirical assessment shows that it boosts accuracy and AUC value. As a result, the proposed approach may provide reliable guarantees for the protection of user data. Table 1 presents a comparative analysis of the proposed model and existing IoT-based cyber security threat detection methods.

Materials and methods

Dataset
The dataset contains the following attack scenarios.

1st attack scenario
The attacker’s goal here is to crack any password on any host in the target network via a brute force attack. The attack may be broken down into three distinct phases. All of the ports were scanned at once [21]. HTTrack Website Copier was then used to save a copy of web pages for use outside of the cloud-based service. A total of 470 guesses were made, and the brute-force script eventually succeeded. Figure 1 depicts the frequencies of the attacks.

2nd attack scenario
Scenario B uses an HTTP Slowloris Distributed Denial of Service (DDoS) attack as the initial DDoS attack on the application [22]. The attackers finally began their volumetric DDoS attack using the Radware tool. Figure 2 depicts the conditional box plot of the attacks.
Three hosts (192.168.159.131, 192.168.159.14, and 192.168.159.16) were compromised after an hour of the volume-based DDoS attack. The nature of the attacks is represented as a heat map in Fig. 3.

Table 1 Comparison of various existing methods in the field of IoT cyber threats
Ref | Technique | Dataset | Model type | Evaluation parameter | Limitations
[12] | Contextual information, Light probe model | Synthetic data | Binary classification | Accuracy 86.15% | Inefficient in the reading of the sensor
[13] | Binary visualization, Convolutional Neural Network Model | KDD dataset | Multi-class | Accuracy 92.82% | Not capable of predicting all types of attacks
[1] | Random forest model, Logistic Regression | DS20S-traffic | Binary classification | Accuracy 94.31% | Requires high computational facilities
[14] | Naïve Bayes with Long Short-Term Memory (LSTM) | NSL dataset | Multi-class | Accuracy 94.31% | Not dynamic
[15] | Logistic Regression | IDS dataset | Binary classification | Accuracy 90.27% | Failed in real-time scenarios
[16] | Neural Network Model | Synthetic data | Multi-class | Accuracy 91.47% | Slow on large datasets
[17] | Regression Model with SVM model | Synthetic data | Binary classification | Accuracy 90.47% | Slow speed
[18] | K-nearest neighbour with Xgboost | Online IoT dataset | Multi-class | Accuracy 89.97% | Limited training data
[19] | AdaBoost and Decision Tree | Synthetic data | Binary classification | Accuracy 93.77% | Data imbalance
[20] | Gradient boosted machine and ANN model | Online IoT dataset | Binary classification | Accuracy 90.07% | Less effective if not updated regularly
Proposed | Optimized CHAID Decision Tree and Multi Class SVM fusion Model | Online IoT attack dataset | Multi-class | Accuracy 99.97% | -

Fig. 1 Frequencies of attacks

Development stages
The following steps were taken to create a framework for predicting cyber attacks using multi-class support vector machines and the CHAID decision tree:

Step 1. Problem definition: The first step in developing a framework is pinpointing the issue. Here, the issue is foreseeing cyber attacks on the Internet of Things (IoT) infrastructure.
Step 2. Data collection: The second step is to gather data for the framework’s training and assessment processes. For the framework to accurately anticipate future cyber attacks, the data must be representative of such attacks in the real world.
Step 3. Data preprocessing: The third step is to prepare the data before using it for training or testing. This may involve eliminating anomalies, standardizing the data, and filling in any gaps.
Step 4. Feature selection: The fourth step is to choose the features used to train and evaluate the framework. The correctness of the framework depends heavily on the features selected. Therefore, this is an essential stage.
Step 5. Model training: The fifth step is to train the framework using the features chosen in the previous step. Several machine learning techniques can be used for this, including multi-class SVM and the CHAID decision tree.
Step 6. Model evaluation: The sixth step is to evaluate the framework on the test set and determine how well it works. This is useful in evaluating the framework’s ability to handle data it has never seen before.
Step 7. Deployment: Putting the framework into production is the seventh step. The framework might be made accessible as a web service or used with existing security solutions.

Improving the multi-class support vector machine (SVM) and CHAID decision tree cyber attack prediction system is a continuous process. The accuracy and performance of the proposed model may be tuned using new data and updated machine learning algorithms.

Fig. 2 Conditional Box Plot of attacks

Decision tree
A decision tree is a supervised learning algorithm with a predetermined target variable that is mostly used for classification. It works with both categorical and continuous input and output variables [23–27]. Based on the most significant splitter/differentiator among the input variables, the population or sample is divided into two or more homogeneous groups (or subpopulations). Multiple algorithms are used to determine whether or not to split a node into two or more sub-nodes in a decision tree. Creating sub-nodes increases the homogeneity of the resulting sub-nodes [28–32].
In other words, the purity of the node increases with respect to the target variable. A decision tree splits the nodes on all of the relevant variables and then selects the split that results in the most homogeneous sub-nodes. The type of target variable is also considered when choosing an algorithm.

Optimized CHAID decision tree
The Optimized CHAID Decision Tree-based Model is a variant of the traditional CHAID decision tree algorithm that incorporates optimization techniques to improve its performance and effectiveness [33]. CHAID is a popular decision tree algorithm for classification and regression tasks, particularly when dealing with categorical variables.
The optimization of the CHAID decision tree involves several key steps:

• Feature Selection: The optimization process includes identifying and selecting the most relevant features for building the decision tree. This helps to reduce dimensionality, improve interpretability, and enhance the model’s overall performance.
• Splitting Criterion: The optimization determines the most suitable splitting criterion for the decision tree nodes. The splitting criterion measures the association between the predictor variables and the target variable, allowing for the creation of informative and predictive splits.
• Stopping Criteria: The optimization considers the appropriate stopping criteria for tree growth. This prevents overfitting and ensures that the decision tree does not become too complex, which would lead to poor generalization and performance on unseen data.
• Pruning: Pruning techniques are applied to the decision tree to eliminate unnecessary branches and nodes that do not contribute significantly to its predictive power. This simplifies the tree structure, improves interpretability, and helps prevent overfitting.

Fig. 3 Heat map of attacks

By optimizing the CHAID decision tree, the model can effectively handle complex datasets, identify important features, and provide accurate predictions. The optimization process enhances the interpretability of the decision tree and improves its generalization capabilities [34–41].
The Optimized CHAID Decision Tree-based Model finds applications in various domains, including healthcare, finance, marketing, and cybersecurity. It is particularly useful when dealing with categorical or mixed-type data, making it suitable for scenarios where traditional decision tree algorithms may not be as effective [42]. The Optimized CHAID Decision Tree-based Model offers an advanced and refined approach to decision tree modelling, providing enhanced performance and interpretability for various applications. Algorithm 1 shows the CHAID algorithm steps.

Algorithm 1: The CHAID algorithm

As a result of using this technique, CHAID is very efficient at searching through enormous datasets [43]. Still, it is not guaranteed to find the best split at any given step. Algorithm 2 shows the CHAID decision tree construction method. It performs multi-level splits when computing classification trees.
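The splitting criterion at the heart of CHAID, a chi-square test of independence between each categorical predictor and the target, can be sketched as follows. This is a simplified illustration and not the authors’ Algorithm 1: it omits the category-merging step and the Bonferroni adjustment of full CHAID. The function names, the 0.05 significance level, and the toy data are assumptions.

```python
import math
from collections import Counter


def chi_square(feature, target):
    """Pearson chi-square statistic and degrees of freedom for two categorical lists."""
    n = len(feature)
    row_totals = Counter(feature)
    col_totals = Counter(target)
    observed = Counter(zip(feature, target))
    stat = 0.0
    for r, r_total in row_totals.items():
        for c, c_total in col_totals.items():
            expected = r_total * c_total / n
            stat += (observed[(r, c)] - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, dof


def chi2_sf(x, dof):
    """Upper-tail probability of the chi-square distribution (integer dof)."""
    if dof == 0 or x <= 0:
        return 1.0
    half = x / 2.0
    if dof % 2 == 0:
        # Closed form for even dof: exp(-x/2) * sum_{i < dof/2} (x/2)^i / i!
        term = total = 1.0
        for i in range(1, dof // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # Odd dof: normal tail plus a finite series (Abramowitz and Stegun 26.4.4).
    root = math.sqrt(x)
    tail = math.erfc(root / math.sqrt(2.0))
    term, total = root, 0.0
    for r in range(1, (dof - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    density = math.exp(-half) / math.sqrt(2.0 * math.pi)
    return tail + 2.0 * density * total


def best_split(predictors, target, alpha=0.05):
    """Return (name, p_value) of the most significant predictor, or None to stop."""
    scored = []
    for name, values in predictors.items():
        stat, dof = chi_square(values, target)
        scored.append((chi2_sf(stat, dof), name))
    p_value, name = min(scored)
    return (name, p_value) if p_value < alpha else None


# Toy data: "protocol" separates the classes perfectly, "flag" carries no signal.
target = ["Normal"] * 6 + ["Port_Scan"] * 6
predictors = {
    "protocol": ["HTTP"] * 6 + ["TCP"] * 6,
    "flag": ["A", "B"] * 6,
}
choice = best_split(predictors, target)
```

Returning `None` when no predictor reaches the significance level corresponds to a stopping criterion: the node is not split any further.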

Algorithm 2: CHAID decision tree algorithm

The integration of the Support Vector Machine (SVM) and CHAID (Chi-squared Automatic Interaction Detection) models involves combining the predictions of both models to leverage their respective strengths and improve overall prediction performance [34, 36–38]. The integration is typically achieved through an ensemble approach, where the predictions of the individual models are combined using various techniques. A general outline of how SVM and CHAID can be integrated is as follows:

• Train Individual Models: The SVM and CHAID models are trained individually on the same dataset. SVM is a powerful machine learning algorithm for classification tasks, while CHAID is a decision tree-based method for categorical data analysis. Each model learns from the dataset and creates its own decision boundaries or rules to make predictions.
• Obtain Model Predictions: After training, the individual SVM and CHAID models make predictions on the same test data or new instances. The predictions are typically in the form of class labels or probabilities.
• Combine Predictions: The predictions from SVM and CHAID can be combined using various ensemble techniques. Some common methods include:

a Majority Voting: In majority voting, the final prediction is determined by selecting the class label that receives the most votes from SVM and CHAID. For example, if SVM predicts Class A, CHAID predicts Class B, and another SVM indicates Class A, the majority vote would favour Class A.
b Weighted Averaging: In weighted averaging, each model’s prediction is given a weight, and the final prediction is obtained by calculating the weighted average of the individual model predictions. The weights can be determined based on the individual model’s performance or other criteria.
c Stacking: Stacking is a more sophisticated ensemble method where the predictions of the individual models are used as input to a meta-model, which learns to combine the predictions optimally.

• Final Prediction and Performance Evaluation: The final integrated prediction is obtained once the predictions are combined. This integrated prediction is then evaluated using standard metrics such as accuracy, precision, recall, F1 score, and area under the ROC curve to assess its performance.
• Tuning and Optimization: Researchers may further fine-tune the integration process by adjusting hyperparameters or weights to optimize the ensemble’s performance on the specific task.

The integration of SVM and CHAID can be particularly useful because the two models complement each other’s strengths [43–47]. For example, SVM handles high-dimensional data and complex decision boundaries effectively, while CHAID provides interpretable and transparent decision rules. By combining the two models, researchers can potentially achieve better overall predictive performance while retaining interpretability in certain scenarios.

Multi-class SVM model
Multi-class SVM addresses the problem of labelling records when the labels are drawn from a finite set. Multi-class learning characterizes the whole method [39–42]. Many multi-class learning methods are built from classifiers for elementary binary problems. Numerous multi-class training classifiers have been used, including decision trees, Ada-Boost, and SVM. Among the most popular methods for solving the multi-class problem is the SVM, which divides a single problem into numerous binary sub-problems.
A collection of binary classifiers (B1, B2, …, Bn) is created for the classes 1 to s, each trained to distinguish its class from the others. Combining them according to the maximal output before applying the sgn function yields a multi-class classifier. Sg_k(y) is the distance from a point y to the hyperplane, which can be calculated as in Eq. (1).

Sg_k(y) = \sum_{j=1}^{n} x_i^{j} \ast \beta_k(y, y_i) + a_k \qquad (1)

Proposed model
The proposed model is based on an Optimized CHAID Decision Tree and multi-class SVM fusion for cyber threat detection in IoT infrastructure. Figure 4 shows the working of the proposed model.

Fig. 4 The working of the proposed model
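Two of the fusion techniques named in the integration outline, majority voting and weighted averaging, can be sketched in a few lines. The sketch assumes that each model already outputs either a class label or a dictionary of per-class probabilities. The function names, the tie-breaking rule, the 0.6/0.4 weights, and the probability values are hypothetical and are not taken from the paper.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the most frequent label; ties go to the earliest model in the list."""
    counts = Counter(predictions)
    best = max(counts.values())
    return next(label for label in predictions if counts[label] == best)


def weighted_average(prob_dicts, weights):
    """Blend per-class probabilities; return the winning label and the blend."""
    total = sum(weights)
    classes = set().union(*prob_dicts)
    blended = {
        c: sum(w * p.get(c, 0.0) for p, w in zip(prob_dicts, weights)) / total
        for c in classes
    }
    # Sorting first makes ties resolve alphabetically, which keeps runs reproducible.
    return max(sorted(blended), key=blended.get), blended


# Hypothetical outputs of a multi-class SVM and a CHAID tree for one traffic flow.
svm_probs = {"Normal": 0.2, "Port_Scan": 0.7, "Brute_Force": 0.1}
chaid_probs = {"Normal": 0.5, "Port_Scan": 0.4, "Brute_Force": 0.1}

voted = majority_vote(["Port_Scan", "Normal", "Port_Scan"])
fused_label, fused_probs = weighted_average([svm_probs, chaid_probs], [0.6, 0.4])
```

In practice the weights could be taken from each model’s validation accuracy. Stacking would replace the fixed weights with a trained meta-model.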

The first data preprocessing step involves normalization, accompanied by chi-square-based feature extraction. The proposed model includes two phases. In the first phase, low-ranked features are eliminated and the best possible subset of all characteristics is selected using chi-square-based feature extraction. Finding the highest-priority features essential for the classifier depends largely on ranking the features. In the second phase, the data are separated into training, validation, and testing sets. The optimized kernel attribute is obtained using tenfold cross-validation.

Algorithm 3: The proposed Model

Results and discussion
The proposed model and the existing models, namely Contextual information, Cyber Security Game (CSG), the multistep attack alert correlation system, and the Systematic and coherent approach, were implemented and tested to detect IoT cyber security attacks.

Experimental setup
The proposed model can be run on any computer with a minimum of 2 GB of RAM and a 1 GHz processor. The framework requires the following software:

• Python 3.6 or higher
• NumPy
• Pandas
• Scikit-learn

Evaluation
The following parameters are used in the performance evaluation:

• Precision: Precision Pre can be formulated as described in Eq. (6).

Pre = \frac{TP}{TP + FP} \qquad (6)

• Recall: Recall Rec can be formulated as described in Eq. (7).

Rec = \frac{TP}{FN + TP} \qquad (7)

• F-Measure: An F-measure Fme can be formulated as described in Eq. (8).

$$\mathrm{FMe} = \frac{2 \cdot \mathrm{Pre} \cdot \mathrm{Rec}}{\mathrm{Pre} + \mathrm{Rec}} \tag{8}$$

• Accuracy: Accuracy, Acc, can be formulated as described in Eq. (9).

$$\mathrm{Acc} = \frac{TP + TN}{(FN + TP) + (FP + TN)} \tag{9}$$

Scenario 1
Most importantly, regarding ML models, the CHAID model performed better than SVM in experiments. TCP, UDP, HTTP GET, and DNS tunnelling attacks were all detected at roughly the same level due to the inclusion of several IoT multi-vector cyberattack characteristics based on flow analysis and features based on the most widely used IoT protocols. In this scenario, the authors analyzed and compared the efficacy of existing machine learning-based methods for detecting attacks on the infrastructure supporting the Internet of Things.
The suggested model requires dividing the dataset as follows: 70% training and 30% testing. The collection includes actual attacks from the following Label threat classes: Brute_Force, HTTP_DDoS, ICMP_Flood, Normal, and Port_Scan. Category pairs are merged repeatedly, and the search for a new pair to merge continues until every remaining pair has a p-value below the significance level. CHAID analysis relies heavily on statistical testing, and it is feasible to distinguish two primary functions:

• Combination of individual values and categorizations of predictor variables
• Selection of predictor variables according to the statistical significance of their relationship with the dependent variable.

Table 2 contains this model's top decision rules for 'Label'. The table indicates the rule confidence for each rule. One of the most widely used statistically-based supervised learning methods for creating decision trees is the CHAID method. Table 3 displays the CHAID model designed for the current problem.
One of the multivariate dependency methods, the CHAID algorithm, is used to find correlations between a categorical dependent variable and several categorical or metric independent variables (in which case, their coding and transformation into categorical variables must be done beforehand). Figure 2 displays three nodes among the 77 nodes created in the CHAID model. Malicious traffic was modelled after network activity from well-known botnets like Mirai, Dark Nexus, and Gafgyt and sourced from publicly available datasets that catalogue attacks on IoT networks using protocols including TCP, UDP, HTTP GET, and DNS tunnelling.
In addition, malicious traffic was created with standard tools, while data from non-threatening Internet of Things devices, including a router, thermostat, and video camera, was captured. By applying many forms of machine learning, the traits described in the paper were sorted and then deleted from the incoming data. To what extent machine learning algorithms can identify multi-vector attacks on the Internet of Things infrastructure is primarily a function of the objects used in training and test samplings/settings. More investigation is being put into this crucial component.
Figure 5 shows the automated, iterative tree building that CHAID performs using Pearson's chi-square statistic and the corresponding p-value. In Fig. 5, "nodes" are the places or branches where information is separated according to predetermined rules. Each node stands for a group of similar records inside the dataset, characterized by attributes such as attack category, percentage of attacks encountered, and number of attacks. Figure 6 displays predictor importance in the CHAID model. As shown in Fig. 6, each node in a decision tree constructed with the CHAID method has a set of predictors applied to it, and these predictors are chosen for their ability to partition the

Table 2 Top decision rules for 'Label'

Rule ID  Mode category  Record count  Record percentage  Rule confidence  Rule
76       Brute_Force    24,204        26.8               100.0            Bwd IAT Std >= 2.000 & BwdPkt Len Std <= 3.000 & InitBwd Win Byts >= 2 & InitBwd Win Byts < 3 & SYN Flag Cnt <= 2 & Tot FwdPkts <= 3
57       Brute_Force    16,841        18.7               99.7             Flow IAT Min < 4 & Bwd IAT Min < 1 & Pkt Len Std < 1.000 & Flow Duration < 2 & Tot FwdPkts < 2
56       Brute_Force    8,348         9.3                95.3             Flow IAT Min < 3 & Bwd IAT Min < 1 & Pkt Len Std < 1.000 & Flow Duration < 2 & Tot FwdPkts < 2
5        Port_Scan      7,107         7.9                99.9             SYN Flag Cnt <= 2 & Tot FwdPkts < 1
72       Normal         6,427         7.1                99.9             Fwd IAT Min >= 2 & TotLenFwdPkts <= 3 & Flow IAT Mean >= 5.000 & SYN Flag Cnt < 1 & Tot FwdPkts <= 3
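
The predictor-selection step that CHAID relies on, choosing the predictor whose relationship with the target is most statistically significant, can be illustrated with a Pearson chi-square test of independence. The following is a minimal sketch, not the authors' implementation: the toy data, the column names `syn_flag_bin` and `noise_bin`, the helper `rank_predictors`, and the use of SciPy (which is not in the paper's software list) are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd
from scipy.stats import chi2_contingency


def rank_predictors(frame, target, alpha=0.05):
    """Rank categorical predictors by the p-value of a Pearson chi-square
    test of independence against the target (smallest p-value first)."""
    results = []
    for column in frame.columns:
        if column == target:
            continue
        table = pd.crosstab(frame[column], frame[target])
        chi2_stat, p_value, dof, _expected = chi2_contingency(table)
        results.append((column, chi2_stat, p_value, p_value < alpha))
    return sorted(results, key=lambda row: row[2])


# Hypothetical toy data: one binned flow feature tied to the label, one random.
rng = np.random.default_rng(0)
label = rng.choice(["Brute_Force", "Normal", "Port_Scan"], size=600)
toy = pd.DataFrame({
    "syn_flag_bin": np.where(label == "Port_Scan", "high", "low"),  # informative
    "noise_bin": rng.choice(["a", "b", "c"], size=600),             # uninformative
    "Label": label,
})

ranking = rank_predictors(toy, "Label")
best_predictor = ranking[0][0]
for name, chi2_stat, p_value, significant in ranking:
    print(f"{name}: chi2={chi2_stat:.1f}, p={p_value:.3g}, significant={significant}")
```

A full CHAID implementation would additionally merge predictor categories whose pairwise p-value exceeds the merge threshold and apply a Bonferroni correction before splitting; this sketch covers only the ranking step.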

data into useful categories. Predictor importance is a tool for figuring out which variables truly matter for the tree's final decision. By locating these powerful predictors, insights may be gained into which factors have a greater influence on the result being predicted.

Table 3 CHAID model information

Model Information
Target Field         Label
Model Type           Multi-class Decision Tree
Algorithm Name       CHAID
Number of Features   20
Tree Depth           6
Number of Nodes      77

Testing hypotheses regarding whether two variables are (or aren't) independent is vital to the CHAID method's implementation. The authors gained insight into the model's performance in forecasting cyber attacks for IoT devices by analyzing the values in the confusion matrix and computing the evaluation metrics. This reveals the model's discriminatory abilities and exposes problems such as false positives and false negatives. This data may be used to judge IoT systems' safety and further influence the model's development.
Table 4 depicts the results gained by the CHAID model on the prescribed dataset. It shows the accuracy level achieved in both the training and testing phases. This model earns a 90.17% accuracy level overall.

Scenario 2
Next, the support vector machine has been implemented to evaluate the various detection methods. The dataset

Fig. 5 Nodes in the CHAID model



Fig. 6 Predictor importance in the CHAID model

was used to train and test the algorithm, with 75% of the data being used for training and 25% for testing.
In Table 5, the authors compare the performance of the SVM model against one-class and two-class SVMs. While a two-class SVM may be more accurate in most cases, the authors could save time and effort by creating a powerful one-class SVM to classify the datasets offline. Regular traffic can be used as a training dataset for a one-class SVM. Therefore, the objective of this phase is twofold.
The first step is comparing the various SVM methods to see which provides the most accurate detection. Comparisons are made between linear and non-linear Radial Basis Function (RBF) models of a one-class SVM and a two-class SVM, respectively. Second, the authors want to see how well the various SVM approaches perform on intrusion detection tasks compared to the unsupervised anomaly-based IDS. Table 6 describes the simulation results obtained through the proposed CHAID model.
Compared to prior research, the proposed method can generate a significantly more accurate label, as presented in Fig. 7. Table 7 presents the detailed performance of the proposed model.
The information displayed in Table 6 is graphically represented in Fig. 8, which shows that the proposed model achieved the maximum level of accuracy (99.78%).
All characteristics for both datasets were tried out in the prior study. However, the suggested model uses a feature selection method based on information gain and, in the end, employs just 25 of the essential characteristics, as shown in Fig. 9.
The multi-class support vector machine (SVM) and CHAID decision tree used in the Internet of Things (IoT) cyber attack prediction framework yielded encouraging findings. The framework not only distinguished the most crucial criteria for attack classification but also achieved excellent accuracy while classifying attacks. The framework's 99.72% accuracy is a big step forward over earlier approaches. The SVM model's accuracy may be enhanced by giving more importance to certain characteristics during training.
Figure 10 demonstrates that combining multi-class SVM with the CHAID decision tree effectively predicts cyber-attacks in IoT devices. The framework is effective enough to classify attacks with high precision and zero in on their most salient characteristics. This data may strengthen the defences protecting IoT infrastructure by pinpointing possible attack entry points. The framework's excellent accuracy is a notable advancement over earlier approaches. This indicates that the framework can detect cyber assaults on Internet of Things (IoT) devices.
A further useful discovery is the selection of the top five characteristics for use in classifying attacks.
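
The one-class SVM idea mentioned for Scenario 2, training on regular traffic only and flagging anything that deviates from it, can be sketched with scikit-learn's `OneClassSVM`. This is a hedged illustration rather than the authors' experiment: the four synthetic "flow features", the shift used to simulate attack traffic, and the `nu` value are all assumptions.

```python
import numpy as np
from sklearn.svm import OneClassSVM

# Synthetic stand-ins (assumption): four numeric flow features per record.
rng = np.random.default_rng(0)
normal_train = rng.normal(loc=0.0, scale=1.0, size=(500, 4))  # regular traffic only
normal_test = rng.normal(loc=0.0, scale=1.0, size=(100, 4))
attack_test = rng.normal(loc=6.0, scale=1.0, size=(100, 4))   # clearly shifted traffic

# Linear and RBF one-class models, mirroring the linear vs. RBF comparison.
detectors = {
    "linear": OneClassSVM(kernel="linear", nu=0.05),
    "rbf": OneClassSVM(kernel="rbf", gamma="scale", nu=0.05),
}

rates = {}
for name, detector in detectors.items():
    detector.fit(normal_train)  # no attack records are used for training
    # predict() returns +1 for inliers (regular) and -1 for outliers (suspicious).
    false_alarm_rate = np.mean(detector.predict(normal_test) == -1)
    detection_rate = np.mean(detector.predict(attack_test) == -1)
    rates[name] = (false_alarm_rate, detection_rate)
    print(f"{name}: false alarms={false_alarm_rate:.2f}, detected={detection_rate:.2f}")
```

On data like this the RBF model is the one expected to separate the shifted traffic; a linear one-class SVM only separates the training data from the origin, so its results depend heavily on how the features are centred.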

Table 4 Results gained by the CHAID model


Results for Output Field Label

Comparing $R-Label with Label


Partition 1_Training 2_Testing
Correct 89977 99.78% 38512 99.72%
Wrong 202 0.22% 108 0.28%
Total 90179 38620
Confidence Values Report for $RC-Label
“Partition”=1_Training
  Range 0.452-1.0
  Mean Correct 0.861
  Mean Incorrect 0.759
  Always Correct Above 0.972 (0.34% of the cases)
  Always Incorrect Below 0.452 (0% of the cases)
  99.78% Accuracy Above 0.0
  2.0 Fold Correct Above 0.737 (99.89% of the cases)
“Partition” = 2_Testing
  Range 0.452-1.0
  Mean Correct 0.861
  Mean Incorrect 0.759
  Always Correct Above 0.961 (0.76% of the cases)
  Always Incorrect Below 0.452 (0% of the cases)
  99.78% Accuracy Above 0.0
  2.0 Fold Correct Above 0.779 (99.86% of the cases)
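
The evaluation measures and the accuracy figures in Table 4 can be checked with a few lines of arithmetic. The sketch below assumes the standard harmonic-mean definition of the F-measure, chosen because it reproduces the f1-score column of Table 6; the function names are illustrative, not from the paper.

```python
def precision(tp, fp):
    """Precision: share of predicted positives that are true positives."""
    return tp / (tp + fp)


def recall(tp, fn):
    """Recall: share of actual positives that are detected."""
    return tp / (tp + fn)


def f_measure(pre, rec):
    """Harmonic mean of precision and recall (assumed F-measure definition)."""
    return 2 * pre * rec / (pre + rec)


def accuracy(correct, total):
    """Accuracy: correctly classified records over all records."""
    return correct / total


# Correct/total counts reported in Table 4 for the CHAID model.
train_acc = accuracy(89977, 90179)
test_acc = accuracy(38512, 38620)
print(f"training accuracy: {train_acc:.2%}")  # 99.78%
print(f"testing accuracy:  {test_acc:.2%}")   # 99.72%

# DDoS_HTTP row of Table 6: precision 0.94, recall 0.61.
print(f"DDoS_HTTP F-measure: {f_measure(0.94, 0.61):.2f}")  # 0.74
```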

Table 5 Results gained by the SVM model


Results for Output Field Label

Comparing $R-Label with Label


Partition 1_Training 2_Testing
Correct 74496 82.61% 38512 99.72%
Wrong 15683 17.39% 108 0.28%
Total 90179 38620
Confidence Values Report for $RC-Label
“Partition” = 1_Training
  Range 0.452–1.0
  Mean Correct 0.741
  Mean Incorrect 0.459
   Always Correct Above 0.907 (0.34% of the cases)
   Always Incorrect Below 0.252 (0% of the cases)
   99.78% Accuracy Above 0.0
   2.0 Fold Correct Above 0.637 (82.61% of the cases)
“Partition” = 2_Testing
  Range 0.452–1.0
  Mean Correct 0.952
  Mean Incorrect 0.649
   Always Correct Above 0.789 (0.76% of the cases)
   Always Incorrect Below 0.368 (0% of the cases)
   99.78% Accuracy Above 0.0
   2.0 Fold Correct Above 0.779 (99.72% of the cases)
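
The SVM workflow described in the text (normalization, chi-square feature ranking, a 75%/25% split, and tenfold cross-validation to tune the kernel parameters) can be sketched with scikit-learn as follows. This is an illustrative sketch under stated assumptions: the data are synthetic (`make_classification`), and the number of kept features (`k=20`) and the parameter grid are hypothetical choices, not values confirmed by the paper.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

# Synthetic five-class data standing in for labelled IoT flow records (assumption).
X, y = make_classification(n_samples=500, n_features=30, n_informative=10,
                           n_classes=5, random_state=0)

# 75% training / 25% testing, stratified so every class appears in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

pipeline = Pipeline([
    ("scale", MinMaxScaler()),            # normalization; chi2 needs non-negative inputs
    ("select", SelectKBest(chi2, k=20)),  # chi-square feature ranking (k is hypothetical)
    ("svc", SVC(kernel="rbf")),           # SVC handles multi-class via one-vs-one
])

# Tenfold cross-validation over a small, hypothetical kernel-parameter grid.
param_grid = {"svc__C": [1, 10], "svc__gamma": ["scale", 0.1]}
search = GridSearchCV(pipeline, param_grid, cv=10)
search.fit(X_train, y_train)

test_accuracy = search.score(X_test, y_test)
n_selected = int(search.best_estimator_.named_steps["select"].get_support().sum())
print("best parameters:", search.best_params_)
print(f"held-out accuracy: {test_accuracy:.3f} using {n_selected} features")
```

Placing the scaler and selector inside the pipeline means they are refit on each training fold, which keeps the cross-validation estimate free of information from the validation fold.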

Table 6 Simulation results for the proposed CHAID model

Attack                 precision  recall  f1-score  support
Backdoor               1.00       0.93    0.97      4805
DDoS_HTTP              0.94       0.61    0.74      9709
DDoS_ICMP              1.00       0.99    1.00      13588
DDoS_TCP               0.74       0.57    0.65      10012
DDoS_UDP               1.00       1.00    1.00      24314
Fingerprinting         0.25       0.88    0.39      171
MITM                   1.00       1.00    1.00      72
Normal                 1.00       1.00    1.00      272800
Password               0.49       0.31    0.38      9987
Port_Scanning          0.31       0.49    0.38      3995
Ransomware             0.96       0.92    0.94      1938
SQL_injection          0.43       0.71    0.54      10165
Uploading              0.62       0.37    0.47      7361
Vulnerability_scanner  0.93       0.84    0.88      1005
XSS                    0.31       0.77    0.44      3013
accuracy               NA         NA      0.97      381935
macro avg              0.73       0.76    0.72      381935
weighted avg           0.98       0.965   0.968     381935

Table 7 Accuracy results for the proposed and existing model

S. No  Technique                                      Accuracy %
1      Contextual information [12]                    86.45%
2      Cyber Security Game (CSG) [2]                  88.67%
3      Multistep attack alert correlation system [6]  90.78%
4      Systematic & coherent approach [7]             96.97%
5      Proposed Model                                 98.28%

By giving greater importance to these characteristics when training the SVM model, accuracy may be improved. This research found that combining multi-class SVM with the CHAID decision tree was the most effective method for predicting IoT cyber-attacks. The framework is effective enough to classify attacks with high precision and zero in on their most salient characteristics. This data may strengthen the defences protecting IoT infrastructure by pinpointing possible attack entry points.
When deciding which machine learning model to use in a production setting, comparing their respective timing performances is crucial. Compared to the CHAID decision tree, multi-class SVM is a more time-consuming

Fig. 7 Confusion Matrix for the proposed model



Fig. 8 Comparison of accuracy % for existing Vs. Proposed Method

and resource-intensive technique. This is because the CHAID decision tree is a greedy method, while multi-class SVM needs to solve a quadratic optimization problem. The time complexities of multi-class SVM and the CHAID decision tree are compared in Table 8, where n is the number of training samples and C is the hyperparameter of the multi-class SVM algorithm.
The time complexity of CHAID decision trees grows as O(n log n) as n increases, whereas that of multi-class SVMs grows cubically. This means multi-class SVM will be less efficient than the CHAID decision tree for big datasets. When evaluating the speed with which different ML models complete their tasks, it is important to consider more than just the time complexity involved.

Fig. 9 TPR Validation



Fig. 10 TPR Validation

Table 8 Time complexity of existing model

Algorithm            Time Complexity
Multi-class SVM      O(n^3 * log(C))
CHAID decision tree  O(n * log(n))

• The time needed to train and forecast scales linearly with the model's size.
• The hardware platform in use may also impact model performance. For training and forecasting with deep learning models, for instance, a GPU will outperform a CPU.

Here are a few things to keep in mind while picking an ML model for continuous forecasting:

• Multi-class SVM may be the best option when working with a limited dataset.
• The CHAID decision tree may be the best option for a huge dataset.
• Training and prediction using a GPU is recommended if the model is big.
• If resources on the hardware platform are tight, a less complex design should be favoured.

Discussion
Significant progress has been made in IoT security with a revolutionary multi-class SVM and an improved CHAID decision tree-based model for cyber attack prediction for IoT devices. In this section, the authors explore the research's main conclusions and ramifications while elaborating on the model's advantages and disadvantages. Compared to more conventional prediction methods, our innovative model, which combines multi-class SVM with an improved CHAID decision tree, shows substantial improvement. Combining the best features of both algorithms, this model may successfully defend Internet of Things (IoT) systems from more sophisticated and varied cyberattacks. The model can handle numerous attack classes according to the multi-class SVM algorithm, and it is optimized for speed and accuracy in classification thanks to the CHAID decision tree.
Feature selection methods are incorporated into the model to determine which features are most important and informative for cyber attack prediction. The model can enhance its prediction abilities by lowering the number of characteristics included in the analysis and avoiding the negative effects of the curse of dimensionality. To improve the precision of predictions, the CHAID

decision tree algorithm may be tuned to zero in on the most discriminatory characteristics. The CHAID decision tree technique makes the model easier to understand and interpret. The decision tree format simplifies the analysis of alternatives. This openness aids in the detection of possible vulnerabilities and countermeasures. It allows security analysts and system managers to better understand the elements contributing to cyber assaults on IoT devices. Because of its efficiency and scalability, the suggested approach is well-suited for predicting cyber attacks in real time for widespread IoT installations. The model's ability to deal with high-dimensional data and quickly produce predictions is due to the use of the multi-class SVM algorithm and the optimized CHAID decision tree, which are well-known for their computational efficiency. The capacity to identify and respond quickly to cyber attacks in IoT systems relies heavily on the system's scalability and efficiency.
Combining Multi-Class SVM with an Optimized CHAID Decision Tree for cyber attack detection is a potent technique to boost detection systems' precision and recall. Multi-class SVM, a supervised machine learning technique, may classify data into numerous categories. It is an effective algorithm that can reach very high levels of precision. However, it is sensitive to the choice of hyperparameters and can be computationally expensive to train. The CHAID decision tree algorithm has been enhanced to identify cyber-attacks better. The algorithm is easily understood and interpreted.
On the other hand, it may not be as precise as multi-class SVM. The advantages of each method may be obtained by combining them. An Optimized CHAID decision tree can provide the recall, while a Multi-Class Support Vector Machine can provide the accuracy.
Using Multi-Class SVM as a primary classifier is one approach to combining these two methods. The authors would utilize Multi-Class SVM to divide the data into manageable categories. The data inside each class would then be classified using an Optimized CHAID decision tree. Because Multi-Class SVM may be used to filter out much of the irrelevant information, this strategy has the potential to yield good results. With this information, the Optimized CHAID decision tree can zero in on the cyber threats that are most likely to occur. Parallel execution is yet another method for combining these two algorithms. This would involve employing both algorithms to classify the data. The combined output of the two algorithms would then serve as the basis for a conclusion. This strategy has the potential for success since it takes advantage of the best features of both algorithms.
While the outcomes of our approach are encouraging, it is important to note its limits. The training data must be high quality and sufficiently representative of the real world for the model to work well. Future studies should gather more diverse and realistic datasets to enhance the model's generalizability. Cyber attack prediction models might be even more effective with additional research into ensemble approaches and incorporating other machine learning techniques. Improved accuracy, interpretability, and efficiency are some of the benefits that the unique multi-class SVM and optimized CHAID decision tree-based model bring to the problem of cyber attack prediction for IoT devices. By working together, these algorithms improve the handling of multi-class situations, feature optimization, and decision clarity. Future studies should aim to develop and improve this model to increase its usefulness and the security of IoT systems against cyber threats.
There is scepticism about the added complexity introduced by employing many classifiers in an ensemble model. As time has progressed, however, processing units like mobile devices have become progressively quicker, and memory resources have become increasingly inexpensive; this has led to the possibility of a wide range of algorithms, including ensemble approaches, being used for fog computing. Efficient resource allocation in fog computing is another area of study. Moreover, studies have developed fog system designs that may use ensemble learning without significantly increasing latency. It is argued that the design and efficient resource allocation method explored in this article may be used to implement the stacking strategy. Since missing a cyberattack is associated with a high cost, the discovery that stacking can beat single classifiers for cyberattack detection in IoT Smart city applications has significant value despite modest increases in complexity.

Conclusions
To forecast cyber-attacks in IoT systems, the authors provide a unique multi-class support vector machine (SVM) and improved CHAID decision tree-based model. In addition to enhanced prediction accuracy, this model boasts enhanced interpretability, scalability, efficiency, and optimized feature selection. The proposed model attains the highest accuracy level (98.28%), the maximum accuracy achieved in both the training and testing phases. Using multi-class support vector machines (SVMs) and improved CHAID decision tree algorithms, various attack classes may be handled efficiently and with complete clarity. The model incorporates feature selection approaches to zero in on the most important aspects

for cyber attack prediction, lowering the dimensionality and increasing the efficiency with which the model operates. By improving interpretability, the CHAID decision tree method gives security analysts a deeper understanding of attack vectors and weak spots. A potential topic of study is the combination of Multi-Class SVM and Optimized CHAID decision tree for detecting cyber attacks. Combining the best features of these two algorithms allows for the creation of more effective and trustworthy systems for detecting cyber attacks. The study found the following additional results:

• Accuracy and recall in detecting cyber attacks can be enhanced by combining Multi-Class SVM with an Optimized CHAID decision tree.
• Multi-Class SVM may be used as a first-stage classifier in integrating these two techniques, or the two can be used simultaneously.
• Organizational requirements should guide the selection of an integration strategy.

Improving the accuracy and reliability of cyber attack detection systems by integrating Multi-Class SVM and Optimized CHAID decision tree is a promising field of research.
Due to its efficiency and scalability, the model may be used for real-time prediction in massive IoT rollouts. Its computing performance allows for rapid forecasts and faster cyber threat detection and mitigation. Our model has potential, but it is not without caveats. Training data is crucial to the model's success; thus, it is important to use a wide variety of data that accurately represents the target domain. Investigating ensemble approaches and incorporating additional machine learning techniques in future studies might improve the resilience and accuracy of the model.
Our unique multi-class support vector machine (SVM) and improved CHAID decision tree-based model adds to the development of cyber attack prediction in IoT systems. It is a helpful resource for countering online dangers and protecting sensitive data. More work will improve and broaden the model, leading to stronger defences for Internet of Things devices. Thus, a CHAID-based paradigm is proposed for predicting multi-stage cyber threat detection for IoT communication. In this research, the authors investigate whether the proposed CHAID method can be used to detect cyberattacks in IoT-based Smart city applications. Through testing with the most up-to-date IoT attack database, we have found that this technique, mainly stacking, outperforms individual models in distinguishing malicious from benign data. Using a feature selection method informed by information gain, the authors can zero in on the data that will impact the model's performance most. Additionally, our proposed technique with the SVM technique leads to higher performance than the single or other models employed in recent publications in categorizing attack types in terms of accuracy, precision, recall, and F1-score metrics. In the future, the authors want to investigate deep learning strategies that might significantly improve the effectiveness of IoT threat detection.
Finally, as automated systems and Smart cities gain popularity, they will also face increased cyber attacks. If citizens are denied access to an automated system or have their privacy invaded within it, the consequences can be severe for them as individuals and expensive for the government to fix. System failures in managing emergencies (such as accidents and fires) can potentially endanger people's health. Our findings that stacking classifiers can improve the detection of cyberattacks in smart city networks have ramifications beyond technological contributions, including economic and societal ones.
More information will be gained in this regard from studies to be conducted in the future. To better identify cyber-attacks, new machine learning algorithms may be created. Because they will be customized to the unique traits of cyber assaults, these algorithms may be more accurate and trustworthy than their predecessors. Cyber attack detection systems may be more effective using additional data sources like network traffic data and system logs. This information can be utilized to spot trends in cyber assaults that are not picked up by currently available databases. It is possible to build automated reaction systems that respond to cyber threats. These technologies may be used to quarantine compromised machines, stave off harmful traffic, and roll back to a prior configuration.

Acknowledgements
The authors thank Princess Nourah bint Abdulrahman University Researchers Supporting Project number (PNURSP2023R410), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia.

Authors' contributions
UKL & S.D. were responsible for Validation, Software, Data Curation, and Writing - Original Draft. N.F. & S.S. were responsible for Conceptualization, Writing - Original Drafts. MA & NA were responsible for Writing - Original Draft, Visualization. NF & NA were responsible for Writing - Review & Editing. SD & MA were responsible for Formal Analysis. A.K. was responsible for Writing - Original Draft, Resources, and Supervision. The author(s) read and approved the final manuscript.

Funding
The funding of this work was provided by Princess Nourah bint Abdulrahman University Researchers Supporting Project number (PNURSP2023R410), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia.

Availability of data and materials
Publicly available datasets were analyzed in this study.

Declarations

Ethics approval and consent to participate
Not applicable.

Consent for publication
Not applicable.

Competing interests
The authors declare no competing interests.

Author details
1 Department of Computer Science and Engineering, Amity University Haryana, Gurugram, India. 2 Department of Computer Science and Engineering, Chandigarh University, Mohali, Punjab 1404133, India. 3 Department of Computer Engineering and Applications, GLA University, Mathura (UP)‑281406, India. 4 Department of Information Systems, College of Computer and Information Sciences, Princess Nourah Bint Abdulrahman University, P.O. Box 84428, 11671 Riyadh, Saudi Arabia.

Received: 12 June 2023 Accepted: 10 September 2023

References
1. Abdullahi M, Baashar Y, Alhussian H, Alwadain A, Aziz N, Capretz LF, Abdulkadir SJ (2022) Detecting cybersecurity attacks in internet of things using artificial intelligence methods: a systematic literature review. Electronics 11(2):198
2. Chukwudi AE, Udoka E, Charles E (2017) Game theory basics and its application in cyber security. Adv Wireless Commun Net 3(4):45–49
3. Abu Al-Haija Q, Krichen M, Abu Elhaija W (2022) Machine-learning-based darknet traffic detection system for IoT applications. Electronics 11(4):556
4. Lombardi M, Pascale F, Santaniello D (2022) Two-step algorithm to detect cyber-attack over the can-bus: a preliminary case study in connected vehicles. ASCE-ASME J Risk and Uncert in Engrg Sys Part B Mech Engrg 8(3):031105
5. Rawat R, Mahor V, Garg B, Chouhan M, Pachlasiya K, Telang S (2022) Modeling of cyber threat analysis and vulnerability in IoT-based healthcare systems during COVID. In Lessons from COVID-19. Academic Press, pp. 405–425
6. Wang X, Gong X, Yu L, Liu J (2021) MAAC: Novel alert correlation method to detect multi-step attack. In 2021 IEEE 20th International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom). IEEE, pp. 726–733
7. Kimani K, Oduol V, Langat K (2019) Cyber security challenges for IoT-based smart grid networks. Int J Crit Infrastruct Prot 25:36–49
8. Pacheco J, Hariri S (2016) IoT security framework for smart cyber infrastructures. In 2016 IEEE 1st International workshops on Foundations and Applications of self* systems (fas* w). IEEE, pp. 242–247
9. Dalal S, Manoharan P, Lilhore UK, Seth B, Simaiya S, Hamdi M, Raahemifar K (2023) Extremely boosted neural network for more accurate multi-stage Cyber attack prediction in cloud computing environment. J Cloud Computing 12(1):1–22
10. Sontowski S, Gupta M, Chukkapalli SSL, Abdelsalam M, Mittal S, Joshi A, Sandhu R (2020) Cyber attacks on smart farming infrastructure. In 2020 IEEE 6th International Conference on Collaboration and Internet Computing (CIC). IEEE, pp. 135–143
11. Dalal S, Poongodi M, Lilhore UK, Dahan F, Vaiyapuri T, Keshta I, Aldossary SM, Mahmoud A, Simaiya S (2023) Optimized LightGBM model for security and privacy issues in cyber-physical systems. Trans Emerging Telecommun Technol 25:e4771
12. Tran MQ, Elsisi M, Liu MK, Vu VQ, Mahmoud K, Darwish MM, Abdelaziz AY, Lehtonen M (2022) Reliable deep learning and iot-based monitoring system for secure computer numerical control machines against cyber-attacks with experimental verification. IEEE Access 10:23186–23197
13. ÖZALP AN, ALBAYRAK Z, ÇAKMAK M, ÖZDOĞAN E (2022) Layer-based examination of cyber-attacks in IoT. In 2022 International Congress on Human-Computer Interaction, Optimization and Robotic Applications (HORA). IEEE, pp. 1–10
14. Shahin M, Chen FF, Hosseinzadeh A, Bouzary H, Rashidifar R (2022) A deep hybrid learning model for detecting cyber attacks in industrial IoT devices. The Int J Adv Manuf Technol 123(5):1973–1983
15. Yazdinejad A, Kazemi M, Parizi RM, Dehghantanha A, Karimipour H (2023) An ensemble deep learning model for cyber threat hunting in industrial internet of things. Digit Commun Networks 9(1):101–110
16. Ismail S, Reza H (2022) Evaluation of Naïve Bayesian Algorithms for Cyber-Attacks Detection in Wireless Sensor Networks. In 2022 IEEE World AI IoT Congress (AIIoT). IEEE, pp. 283–289
17. Ahmad T, Zhang D (2021) Using the Internet of things in smart energy systems and networks. Sustain Cities Soc 68:102783
18. Le K-H, Nguyen M-H, Tran T-D, Tran N-D (2022) IMIDS: An Smart intrusion detection system against cyber threats in IoT. Electronics 11(4):524
19. Semwal P, Handa A (2022) Cyber-attack detection in cyber-physical systems using supervised machine learning. In Handbook of Big Data Analytics and Forensics. Cham, Springer, pp. 131–140
20. Raimundo RJ, Rosário AT (2022) Cybersecurity in the internet of things in industrial management. Appl Sci 12(3):1598
21. Chakrabarty S, Engels DW (2016) A secure IoT architecture for smart cities. In 2016 13th IEEE annual consumer communications & networking conference (CCNC). IEEE, pp. 812–813
22. Koroniotis N, Moustafa N, Schiliro F, Gauravaram P, Janicke H (2020) A holistic review of cybersecurity and reliability perspectives in smart airports. IEEE Access 8:209802–209834
23. Ansere JA, Han G, Wang H, Choi C, Wu C (2019) A reliable energy efficient dynamic spectrum sensing for cognitive radio IoT networks. IEEE Internet Things J 6(4):6748–6759
24. Onyema EM, Dalal S, Romero CAT, Seth B, Young P, Wajid MA (2022) Design of intrusion detection system based on cyborg intelligence for security of cloud network traffic of smart cities. J Cloud Computing 11(1):1–20
25. Dalal S, Seth B, Jaglan V, Surbhi MM, Dahiya N, Rani U, Le DN, Hu YC (2022) An adaptive traffic routing approach toward load balancing and congestion control in Cloud–MANET ad hoc networks. Soft Computing 26(11):5377–5388
26. Krundyshev V, Kalinin M (2019) Hybrid neural network framework for detection of cyber attacks at smart infrastructures. In Proceedings of the 12th International Conference on Security of Information and Networks, pp. 1–7
27. Saheed YK, Arowolo MO (2021) Efficient cyber attack detection on the internet of medical things-smart environment based on deep recurrent neural network and machine learning algorithms. IEEE Access 9:161546–161554
28. Seth B, Dalal S, Jaglan V, Le D-N, Mohan S, Srivastava G (2022) Integrating encryption techniques for secure data storage in the cloud. Trans Emerging Telecommun Technol 33(4):e4108
29. Shafiq M, Tian Z, Sun Y, Xiaojiang Du, Guizani M (2020) Selection of effective machine learning algorithm and Bot-IoT attacks traffic identification for internet of things in smart city. Futur Gener Comput Syst 107:433–442
30. Masud RM (2019) IoT-based electric vehicle state estimation and control algorithms under cyber attacks. IEEE Internet Things J 7(2):874–881
31. Seth B, Dalal S, Le DN, Jaglan V, Dahiya N, Agrawal A, Sharma MM, Prakash D, Verma KD (2021) Secure cloud data storage system using hybrid paillier–blowfish algorithm. Computers Materials Continua 67:1
32. Gochhayat SP, Lal C, Sharma L, Sharma DP, Gupta D, Saucedo JAM, Kose U (2020) Reliable and secure data transfer in IoT networks. Wireless Net 26(8):5689–5702
33. Liu PY, Wu KR, Liang JM, Chen JJ, Tseng YC (2018) Energy-efficient uplink scheduling for ultra-reliable communications in NB-IoT networks. In 2018 IEEE 29th Annual International Symposium on Personal, Indoor and Mobile Radio Communications (PIMRC). IEEE, pp. 1–5
34. Ghosh S, Dagiuklas T, Iqbal M, Wang X (2022) A cognitive routing framework for reliable communication in iot for industry 5.0. IEEE Trans Industr Inf 18(8):5446–5457
35. Rathore MS, Poongodi M, Saurabh P, Lilhore UK, Bourouis S, Alhakami W, Osamor J, Hamdi M (2022) A novel trust-based security and privacy model for internet of vehicles using encryption and steganography. Comput Electr Engi 102:108205
36. Conti M, Kaliyar P, Lal C (2017) REMI: a reliable and secure multicast routing protocol for IoT networks. In Proceedings of the 12th International Conference on Availability, Reliability and Security, pp. 1–8
37. Maddikunta PKR, Pham QB, Prabadevi B, Deepa N, Dev K, Gadekallu TR, Ruby R, Liyanage M (2022) Industry 5.0: A survey on enabling technologies and potential applications. J Industrial Inform Integ 26:100257
38. Khan WU, Ihsan A, Nguyen TN, Ali Z, Javed MA (2022) NOMA-enabled backscatter communications for green transportation in automotive-industry 5.0. IEEE Transact Industrial Inform 18(11):7862–7874
39. Hassan A, Prasad D, Khurana M, Lilhore UK, Simaiya S (2021) Integration of internet of things (IoT) in health care industry: an overview of benefits, challenges, and applications. Data Sci Innovations Smart Syst 30:165–180
40. Liu Y, Wu H, Rezaee K, Khosravi MR, Khalaf OI, Khan AA, Ramesh D, Qi L (2022) Interaction-enhanced and time-aware graph convolutional network for successive point-of-interest recommendation in traveling enterprises. IEEE Transact Industrial Inform 19(1):635–643
41. Qi L, Liu Y, Zhang Y, Xiaolong Xu, Bilal M, Song H (2022) Privacy-aware point-of-interest category recommendation in internet of things. IEEE Internet Things J 9(21):21398–21408
42. Liu Y, Li D, Wan S, Wang F, Dou W, Xiaolong Xu, Li S, Ma R, Qi L (2022) A long short-term memory-based model for greenhouse climate prediction. Int J Intell Syst 37(1):135–151
43. Abu Al-Haija Q, Al-Fayoumi M (2023) An intelligent identification and classification system for malicious uniform resource locators (URLs). Neural Computing and Applications, pp. 1–17
44. Al-Haija QA, McCurry CD, Zein-Sabatto S (2021) Intelligent self-reliant cyber-attacks detection and classification system for IoT communication using deep convolutional neural network. In Selected Papers from the 12th International Networking Conference: INC 2020 12. Springer International Publishing
45. Abu Al-Haija Q, Badawi AA, Bojja GR (2022) Boost-defence for resilient IoT networks: a head-to-toe approach. Expert Syst 39(10):e12934
46. Abu Al-Haija Q, Alohaly M, Odeh A (2023) A lightweight double-stage scheme to identify malicious DNS over HTTPS traffic using a hybrid learning approach. Sensors 23(7):3489
47. Al-Haija QA (2023) Cost-effective detection system of cross-site scripting attacks using hybrid learning approach. Results Eng 19:101266

Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
