Next-Generation Cyber Attack Prediction for IoT Systems: Leveraging Multi-Class SVM and Optimized CHAID Decision Tree
Abstract
Billions of devices are already online, making the Internet of Things (IoT) an essential aspect of daily life. However, the interconnected nature of IoT devices also leaves them open to cyber threats. The quantity and sophistication of cyber attacks aimed at IoT systems have risen sharply in recent years. This paper proposes a next-generation cyber attack prediction framework for IoT systems. The framework uses two machine learning methods: the multi-class support vector machine (SVM) and the improved CHAID decision tree. IoT traffic is classified using a multi-class SVM to identify various types of attacks. The SVM model is then optimized with the help of the CHAID decision tree, which prioritizes the attributes most relevant to the categorization of attacks. The proposed framework was evaluated on a real-world dataset of IoT traffic. The findings demonstrate the framework’s ability to categorize attacks accurately. The framework can also determine which attributes are most crucial for attack categorization, which helps enhance the SVM model’s precision. The proposed technique focuses on network traffic characteristics that can be signs of cybersecurity threats on IoT networks and affected network nodes. Selected feature vectors were also created using the elements acquired on every IoT console. The evaluation results on the Multistep Cyber-Attack Dataset (MSCAD) show that the proposed CHAID decision tree can predict multi-stage cyber attacks with 99.72% accuracy. Such accurate prediction is essential in managing cyber attacks in real-time communication. Because of its efficiency and scalability, the model may be used to forecast cyber attacks in real time, even in massive IoT installations. Because of its computing efficiency, it can make accurate predictions rapidly, allowing for prompt detection and action. By locating possible entry points for attacks and mitigating them, the framework helps strengthen the safety of IoT systems.
Keywords Anomaly detection, CHAID decision tree, IoT cyber attacks, Multistep attack, Security investigation, Zero-day attack
*Correspondence:
Umesh Kumar Lilhore
[email protected]
Full list of author information is available at the end of the article
© The Author(s) 2023, corrected publication 2023. Open Access This article is licensed under a Creative Commons Attribution 4.0
International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you
give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes
were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated
otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not
permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To
view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
Dalal et al. Journal of Cloud Computing (2023) 12:137 Page 2 of 20
The complete article is organized as follows: the Related work section covers related work in the field of IoT cyber security attack detection; the Materials and methods section covers the materials and methods; the Results and discussion section covers the results and analysis, including experimental details and performance metrics; and the Conclusion section covers the conclusion and future scope of IoT cyber security research.
Related work
Fundamentally, the field would gain from a method and language for describing IoT cyber security threats [2], aimed at the automated detection and identification of multistep cyber attacks. IoT architecture is an example of attack designs familiar with reusing generic modules in the attack. A prototype implementation of a scenario recognition engine using Categorical Abstract Machine Language (CAML) was developed, which incrementally consumed first-level security warnings and generated reports that identify multistep attack scenarios in the alarm stream.
Protecting IoT devices from top to bottom is described in [3], contributing to a greater capacity for mission-driven digital situational awareness. The Cauldron tool plotted out all potential network vulnerabilities by linking, summarizing, standardizing, and interweaving data from diverse sources. It allowed for a more nuanced understanding of potential attack vectors, leading to mitigation suggestions. A flexible demonstration supported a multi-stage analysis of firewall rules and host-to-host vulnerabilities, including attack routes into the organization from the outside. The authors portrayed a prepared relationship based on Cauldron attack graphs and analyzed the impact of attacks on missions.
In [4], the authors examine a cyber security threat detection model for IoT devices based on a Hidden Markov Model (HMM). The model relied on data mining to process warnings and generate input for the HMM. Given the acquired knowledge, their architecture could continuously stream Snort warnings and anticipate intrusions. With enough data, the approach might infer patterns in the multi-stage attack and rank attackers accordingly. This allows the system to accurately predict attackers’ behaviour and assess the relative danger of different groups of attackers.
In [5], the authors present a multistep signature language model for IoT device communication that can aid in attacking predetermined sites based on standardized log events collected from various applications and devices. In addition, the wordy language helps integrate an understanding of external threats and reference up-to-date warning signs. Using this technology, they would be able to manufacture generic sleep-boosting markings. Using this vocabulary, they could tell between various login animal power initiatives across multiple apps using a single, generic pattern. The researchers presented a review of previous research and a rigorous examination of several machine learning methods [6]. The paper also includes statistical data to compare how effectively the methods recognize suspicious activities in IoT network systems. According to the research observations, the random forest method generates the most detailed findings for the feature sets.
A cyber security model with an Intrusion Detection System (IDS) is discussed in IoT architecture, which utilizes alarms relating to unusual traffic to connect IoT devices. Because there are many possible permutations of attack time, risk assessment, and attack hub data in the IoT, this study presented a method to mimic multistep attack circumstances within the company. The results of the trials showed that the suggested technique could accurately reproduce multistep attack situations and trace them back to the original host. It might help senior leaders better express safety actions to employees, helping to make the workplace safer for everyone. The IoT network’s attack recognition strategy is discussed in [7]. Its foundation is the application of advanced systems. A sequence of network architectures is used to create IoT solutions. The method uses information gain, a random forest classifier, correlation analysis, and global feature ranking to decrease the number of features. The additional investigation is based on three feature sets coupled using the suggested method to generate an optimized part and functionality.
Research [8] presents a method for IoT threat detection that relies on cloud-based software-defined networks (SDN). It uses multiple decentralized SDNs to prevent attacks within low-power wireless IoT equipment. The predetermined neighbourhood DNS server of the designated sector was used to carry out network activity dominion for each network interface field. The central component of the strategy is a unique regulator installed in a cloud infrastructure and linked to a base station.
Research [9] proposes a holistic strategy for evaluating cyber attack detection, to address the challenge of pinpointing novel, nuanced threats and the best ways to neutralize them. Particularly illustrative of this issue are zero-day attacks and multistep attacks, which consist of several steps, some malevolent and others not. To identify the multi-stage attack scenario, they present a substantially Boosted Neural Network in this study. The outcomes of running many machine learning methods were displayed, and a greatly enhanced neural network was shown.
Research [10] presents the IoT network’s cyber threat detection strategy. It is founded on the application of advanced techniques. The created expert system uses an assortment of network architectures to function. The method uses the correlation matrix, the random forest method, and information gain to score the features and decrease their number. Three different feature sets are used as the basis for the exploratory studies, which aim to create an optimized feature set by combining them with the suggested algorithm. The researchers utilized the random forest, XGBoost, and K-nearest neighbour ML algorithms to analyze the data.
Research [11] suggests a novel and effective encryption method for foreseeing cyber attacks on cyber-physical systems to counteract these dangers. The recommended approach uses Bayesian optimization strategies to tune the hyper-parameters of the LightGBM algorithm. The technique was evaluated on the University of Nevada, Reno Intrusion Detection Dataset (UNR-IDD). It attained an accuracy of 0.9918, a precision of 0.9922, and a recall of 0.9922. The technique improves the cyber-physical system’s security, as the empirical assessment shows that it boosts accuracy and AUC value. As a result, the proposed approach may provide reliable guarantees for the protection of user data. Table 1 presents a comparative analysis of the proposed model and existing IoT-based cyber security threat detection methods.
Materials and methods
Dataset
The prescribed dataset contains the following attack scenarios.
1st attack scenario
The attacker’s goal here is to crack any password on any host in the target network via a brute force attack. The attack may be broken down into three distinct phases. First, all of the ports were scanned at once [21]. Next, HTTrack Website Copier was used as a backup method to save a copy of web pages for use outside of the cloud-based service. Finally, a script employing brute force was executed; a total of 470 guesses were made, with favourable results. Figure 1 depicts the occurrence of the attacks.
2nd attack scenario
Scenario B utilizes HTTP Slowloris Distributed Denial of Service (DDoS) to launch the initial DDoS attack on the APP [22]. The attackers then began their volumetric DDoS attack using the Radware tool. Figure 2 depicts the box plots of the attacks.
Three hosts (192.168.159.131, 192.168.159.14, and 192.168.159.16) were compromised after an hour of the volume-based DDoS attack. The heatmap in Fig. 3 represents the nature of the attacks.
Table 1 Comparison of various existing methods in the field of IoT cyber threats
Ref | Technique | Dataset | Model type | Accuracy (%) | Limitations
[12] | Contextual information, Light probe model | Synthetic data | Binary classification | 86.15 | Inefficient in the reading of the sensor
[13] | Binary visualization, Convolutional Neural Network Model | KDD dataset | Multi-class | 92.82 | Not capable of predicting all types of attacks
[1] | Random forest model, Logistic Regression | DS20S-traffic | Binary classification | 94.31 | Requires high computational facilities
[14] | Naïve Bayes with Long Short-Term Memory (LSTM) | NSL dataset | Multi-class | 94.31 | Not dynamic
[15] | Logistic Regression | IDS dataset | Binary classification | 90.27 | Failed in real-time scenarios
[16] | Neural Network Model | Synthetic data | Multi-class | 91.47 | Slow on large datasets
[17] | Regression Model with SVM model | Synthetic data | Binary classification | 90.47 | Slow speed
[18] | K-nearest neighbour with XGBoost | Online IoT dataset | Multi-class | 89.97 | Limited training data
[19] | AdaBoost and Decision Tree | Synthetic data | Binary classification | 93.77 | Data imbalance
[20] | Gradient boosted machine and ANN model | Online IoT dataset | Binary classification | 90.07 | Less effective if not updated regularly
Proposed | Optimized CHAID Decision Tree and Multi-Class SVM fusion model | Online IoT attack dataset | Multi-class | 99.97 |
The first data preprocessing step involves normalization, accompanied by chi-square-based feature extraction. The proposed model includes two phases. In the first phase, low-rank matrix features are eliminated and the best possible subset of all characteristics is selected using chi-square-based feature extraction. Finding the highest-priority features essential for the classifier depends largely on ranking the features. During the second phase, the data are separated into training, validation, and testing sets. The optimized kernel parameter is obtained using tenfold cross-validation.
coherent approach were implemented and tested to detect IoT cyber security attacks.
Experimental setup
The proposed model can be run on any computer with a minimum of 2 GB of RAM and a 1 GHz processor. The framework requires the following software:
• Python 3.6 or higher
• NumPy
• Pandas
• Scikit-learn
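The preprocessing and tuning steps described in this section can be sketched with the libraries listed above. This is a minimal illustration under stated assumptions, not the authors’ implementation: the synthetic data from `make_classification`, the choice of `k=10` retained features, and the small `gamma` grid are all introduced here for the example, and min-max scaling stands in for the unspecified normalization.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

# Synthetic stand-in for labelled IoT flow records (assumption: three traffic classes).
X, y = make_classification(n_samples=600, n_features=20, n_informative=8,
                           n_classes=3, n_clusters_per_class=1,
                           class_sep=2.0, random_state=0)

# Hold out a test set; the remaining data is used for training and validation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

pipeline = Pipeline([
    ("scale", MinMaxScaler()),            # normalization; chi2 needs non-negative inputs
    ("select", SelectKBest(chi2, k=10)),  # keep the 10 features with the highest chi-square score
    ("svm", SVC(kernel="rbf")),           # multi-class SVM (one-vs-one internally)
])

# Tenfold cross-validation over a small, hypothetical grid of kernel parameters.
search = GridSearchCV(pipeline, {"svm__gamma": [0.01, 0.1, 1.0]}, cv=10)
search.fit(X_train, y_train)

best_gamma = search.best_params_["svm__gamma"]
test_accuracy = search.score(X_test, y_test)
n_selected = int(search.best_estimator_.named_steps["select"].get_support().sum())
print(best_gamma, round(test_accuracy, 3), n_selected)
```

Wrapping the scaler and selector in a `Pipeline` keeps feature ranking inside each cross-validation fold, so the held-out fold does not influence which features are kept.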
Evaluation
The following parameters are used in the performance evaluation:
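Accuracy, precision, recall, and F1-score, the metrics this article reports, can all be derived from a confusion matrix. The sketch below uses small, made-up label vectors purely for illustration; the class names are borrowed from the rule table in this article, and the values are not results from the study.

```python
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix

labels = ["Normal", "Port_Scan", "Brute_Force"]

# Hypothetical ground truth and predictions (not data from the study).
y_true = ["Normal", "Normal", "Port_Scan", "Port_Scan",
          "Brute_Force", "Brute_Force", "Brute_Force", "Normal"]
y_pred = ["Normal", "Port_Scan", "Port_Scan", "Port_Scan",
          "Brute_Force", "Brute_Force", "Normal", "Normal"]

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_true, y_pred, labels=labels)

true_positives = np.diag(cm)
precision = true_positives / cm.sum(axis=0)  # TP / (TP + FP), per class
recall = true_positives / cm.sum(axis=1)     # TP / (TP + FN), per class
f1 = 2 * precision * recall / (precision + recall)
accuracy = accuracy_score(y_true, y_pred)

for name, p, r, f in zip(labels, precision, recall, f1):
    print(f"{name}: precision={p:.2f} recall={r:.2f} f1={f:.2f}")
print(f"accuracy={accuracy:.2f}")
```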
76 | Bwd IAT Std >= 2.000 & Bwd Pkt Len Std <= 3.000 & Init Bwd Win Byts >= 2 & Init Bwd Win Byts < 3 & SYN Flag Cnt <= 2 & Tot Fwd Pkts <= 3 | Brute_Force | 24,204 | 26.8 | 100.0
57 | Flow IAT Min < 4 & Bwd IAT Min < 1 & Pkt Len Std < 1.000 & Flow Duration < 2 & Tot Fwd Pkts < 2 | Brute_Force | 16,841 | 18.7 | 99.7
56 | Flow IAT Min < 3 & Bwd IAT Min < 1 & Pkt Len Std < 1.000 & Flow Duration < 2 & Tot Fwd Pkts < 2 | Brute_Force | 8,348 | 9.3 | 95.3
5 | SYN Flag Cnt <= 2 & Tot Fwd Pkts < 1 | Port_Scan | 7,107 | 7.9 | 99.9
72 | Fwd IAT Min >= 2 & TotLen Fwd Pkts <= 3 & Flow IAT Mean >= 5.000 & SYN Flag Cnt < 1 & Tot Fwd Pkts <= 3 | Normal | 6,427 | 7.1 | 99.9
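Rules such as those tabulated above come from CHAID, which chooses each split by testing whether a binned predictor and the target label are statistically independent, typically with a chi-square test. The following sketch assumes SciPy is available in addition to the software listed in the experimental setup. The column names, bin labels, and counts are invented for illustration and are not taken from the dataset.

```python
import pandas as pd
from scipy.stats import chi2_contingency

# Hypothetical flows: a binned SYN-flag count and an attack label.
flows = pd.DataFrame({
    "syn_flag_bin": ["low"] * 50 + ["high"] * 50,
    "label": (["Normal"] * 45 + ["Port_Scan"] * 5
              + ["Normal"] * 10 + ["Port_Scan"] * 40),
})

# Contingency table of predictor category versus class label.
table = pd.crosstab(flows["syn_flag_bin"], flows["label"])

# Chi-square test of independence; a small p-value suggests the predictor
# is informative about the label and is therefore a candidate split.
chi2_stat, p_value, dof, expected = chi2_contingency(table)
is_candidate_split = p_value < 0.05

print(table)
print(f"chi2={chi2_stat:.1f}, dof={dof}, candidate split: {is_candidate_split}")
```

A full CHAID implementation would also merge categories that are not significantly different and apply a Bonferroni-style adjustment to the p-values; both steps are omitted here for brevity.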
Table 3 CHAID model information
Model Information
Target Field | Label
Model Type | Multi-class Decision Tree
Algorithm Name | CHAID
Number of Features | 20
Tree Depth | 6
Number of Nodes | 77
data into useful categories. The relevance of predictors is a tool for figuring out which variables truly matter for the tree’s ultimate verdict. By locating these powerful predictors, insights into which factors have a greater influence on the result being predicted may be gained.
Testing hypotheses regarding whether two variables are (or aren’t) independent is vital to the CHAID method’s implementation. The authors got an insight into the model’s performance in forecasting cyber attacks for IoT devices by analyzing the values in the confusion matrix and computing the evaluation metrics. This gives us insight into the model’s discriminatory abilities, allowing us to spot problems like false positives and false negatives. This data may be used to judge IoT systems’ safety and further influence the model’s development.
Table 4 depicts the results gained by the CHAID model on the prescribed dataset. It shows the accuracy level achieved at both the training and testing phases. This model earns a 90.17% accuracy level overall.
Scenario 2
Next, the support vector machine has been implemented to evaluate the various detection methods. The dataset
was used to train and test the algorithm, with 75% of the data being used for training and 25% for testing.
In Table 5, the authors compare the performance of the SVM model against one-class and two-class SVMs. While a two-class SVM may be more accurate in most cases, the authors could save time and effort by creating a powerful one-class SVM to classify the datasets offline. Regular traffic can be used as a training dataset for a one-class SVM. Therefore, the objective of this phase is twofold.
The first step is comparing the various SVM methods to see which provides the most accurate detection. Comparisons are made between linear and non-linear Radial Basis Function (RBF) models of a one-class SVM and a two-class SVM, respectively. Second, the authors want to see how well the various SVM approaches perform on intrusion detection tasks compared to our unsupervised anomaly-based IDS. Table 6 describes the simulation results obtained through the proposed CHAID model.
Compared to prior research, the proposed method can generate a significantly more accurate label, as presented in Fig. 7. Table 7 presents the detailed performance of the proposed model.
The information displayed in Table 6 is represented graphically in Fig. 8, which shows that the proposed model achieved the maximum level of accuracy (99.78%).
All characteristics for both datasets were tried out in the prior study. However, our suggested model considers a feature selection method based on information gain and, in the end, employs just 25 of the essential characteristics, as shown in Fig. 9.
The multi-class support vector machine (SVM) and CHAID decision tree used in the IoT cyber attack prediction framework yielded encouraging findings. The framework not only distinguished the most crucial criteria for attack classification but also achieved excellent accuracy while classifying attacks. The framework’s 99.72% accuracy is a big step forward over earlier approaches. The SVM model’s accuracy may be enhanced by giving more importance to certain characteristics during training.
Figure 10 demonstrates that combining multi-class SVM with the CHAID decision tree effectively predicts cyber attacks in IoT devices. The framework is effective enough to classify attacks with high precision and zero in on their most salient characteristics. This data may strengthen the defences protecting IoT infrastructure by pinpointing possible attack entry points. The framework’s excellent accuracy is a notable advancement over earlier approaches. This indicates that the framework can detect cyber attacks on IoT devices.
A further useful discovery is the selection of the top five characteristics for use in classifying attacks.
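The SVM experiments and the information-gain feature selection described in this section can be sketched as follows. The example is a minimal illustration under stated assumptions: the data are synthetic, scikit-learn’s `mutual_info_classif` is used as the information-gain estimate, class 0 is treated as regular traffic for the one-class model, and `nu=0.05` is a hypothetical setting. Only the 75%/25% split, the 25 retained features, and the linear and RBF kernels come from the text.

```python
from functools import partial

import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC, OneClassSVM

# Synthetic stand-in for labelled flows (assumption: class 0 is regular traffic).
X, y = make_classification(n_samples=800, n_features=40, n_informative=10,
                           n_classes=3, n_clusters_per_class=1,
                           class_sep=2.0, random_state=42)

# 75% of the data for training, 25% for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42)

# Rank features by mutual information with the label and keep the top 25.
score_func = partial(mutual_info_classif, random_state=42)
selector = SelectKBest(score_func, k=25).fit(X_train, y_train)
X_train_sel = selector.transform(X_train)
X_test_sel = selector.transform(X_test)

# Supervised multi-class SVMs with a linear and an RBF kernel.
accuracies = {}
for kernel in ("linear", "rbf"):
    clf = SVC(kernel=kernel).fit(X_train_sel, y_train)
    accuracies[kernel] = clf.score(X_test_sel, y_test)

# One-class SVM trained on regular traffic only; predict() returns +1 for
# inliers and -1 for flows flagged as anomalous.
normal_train = X_train_sel[y_train == 0]
detector = OneClassSVM(kernel="rbf", gamma="scale", nu=0.05).fit(normal_train)
flags = detector.predict(X_test_sel)
flagged_fraction = float(np.mean(flags == -1))

print(accuracies, round(flagged_fraction, 3))
```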
Table 6 Simulation results for the proposed CHAID model (columns: Attack, precision, recall, f1-score, support)
Table 7 Accuracy results for the proposed and existing model (columns: S. No, Technique, Accuracy %)
and resource-intensive technique. This is because the CHAID decision tree is a greedy method, while multi-class SVM needs to solve a quadratic optimization problem. The time complexities of multi-class SVM and the CHAID decision tree are compared in the following table (Table 8).
Where n is the number of training samples, and C is the hyperparameter of the multi-class SVM algorithm.
Compared to CHAID decision trees, whose time complexity climbs at a logarithmic rate as n increases, multi-class SVMs have a cubic growth rate. This means multi-class SVM will be less efficient for big datasets than the CHAID decision tree. When evaluating the speed with which different ML models complete their tasks, it is important to consider more than just the time complexity involved.
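One way to look beyond asymptotic complexity is to time each model directly, alone and in combination. The sketch below fits an SVM, a decision tree, and a stacked ensemble of the two on synthetic data and records fit time and test accuracy. Several assumptions apply: scikit-learn does not provide CHAID, so its CART-based `DecisionTreeClassifier` is swapped in as a stand-in, with `max_depth=6` taken from Table 3. The logistic-regression meta-learner and the dataset are choices made for this example, not details from the article.

```python
import time

from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for labelled IoT traffic (assumption).
X, y = make_classification(n_samples=1000, n_features=20, n_informative=8,
                           n_classes=3, n_clusters_per_class=1,
                           class_sep=2.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

def make_svm():
    return SVC(kernel="rbf")

def make_tree():
    # CART stand-in for CHAID; depth 6 mirrors the tree depth in Table 3.
    return DecisionTreeClassifier(max_depth=6, random_state=0)

models = {
    "svm": make_svm(),
    "tree": make_tree(),
    # Stacking: both base models run on the same input and a meta-learner
    # combines their outputs into the final prediction.
    "stacked": StackingClassifier(
        estimators=[("svm", make_svm()), ("tree", make_tree())],
        final_estimator=LogisticRegression(max_iter=1000),
        cv=5,
    ),
}

results = {}
for name, model in models.items():
    start = time.perf_counter()
    model.fit(X_train, y_train)
    fit_seconds = time.perf_counter() - start
    results[name] = (fit_seconds, model.score(X_test, y_test))

for name, (seconds, accuracy) in results.items():
    print(f"{name}: fit {seconds:.3f}s, accuracy {accuracy:.3f}")
```

Timings from a run like this depend on hardware and data size, so they indicate a trend rather than a benchmark. Repeating the loop for increasing `n_samples` would show how each model scales.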
decision tree algorithm may be tweaked to zero in on the most discriminatory characteristics. The CHAID decision tree technique makes the model more understandable and comprehensible. The decision tree format simplifies the analysis of alternatives. This openness aids in the detection of possible vulnerabilities and countermeasures. It allows security analysts and system managers to better understand the elements contributing to cyber attacks on IoT devices. Because of its efficiency and scalability, the suggested approach is well-suited for predicting cyber attacks in real time for widespread IoT installations. The model’s ability to deal with high-dimensional data and quickly produce predictions is due to the use of the multi-class SVM algorithm and the optimized CHAID decision tree, which are well-known for their computational efficiency. The capacity to identify and respond quickly to cyber attacks in IoT systems relies heavily on the system’s scalability and efficiency.
Combining Multi-Class SVM with an Optimized CHAID Decision Tree for cyber attack detection is a potent technique to boost detection systems’ precision and recall. Multi-class SVM, a supervised machine learning technique, can classify data into numerous categories. It is an effective algorithm that can reach very high levels of precision. However, it is sensitive to the choice of hyperparameters and can be computationally expensive to train. The CHAID decision tree algorithm has been enhanced to identify cyber attacks better. The algorithm is easily understood and interpreted.
On the other hand, it may not be as precise as multi-class SVM. The advantages of each method may be obtained by combining them. An Optimized CHAID decision tree can provide the recall, while a Multi-Class Support Vector Machine can provide the accuracy.
Using Multi-Class SVM as a primary classifier is one approach to combining these two methods. The authors would utilize Multi-Class SVM to divide the data into manageable categories. The data inside each class would then be classified using an Optimized CHAID decision tree. Because Multi-Class SVM may be used to filter out much of the irrelevant information, this strategy has the potential to yield good results. With this information, the Optimized CHAID decision tree can zero in on the cyber threats that are most likely to occur. Parallel execution is another method for combining these two algorithms. This would involve employing both algorithms to classify the data. The combined output of the two algorithms would then serve as the basis for a conclusion. This strategy has the potential for success since it takes advantage of the best features of both algorithms.
While the outcomes of our approach are encouraging, it is important to note its limits. The training data must be high quality and sufficiently representative of the real world for the model to work well. Future studies should gather more diverse and realistic datasets to enhance the model’s generalizability. Cyber attack prediction models might be even more effective with additional research into ensemble approaches and incorporating other machine learning techniques. Improved accuracy, interpretability, and efficiency are some of the benefits that the unique multi-class SVM and optimized CHAID decision tree-based model brings to the problem of cyber attack prediction for IoT devices. By working together, these algorithms improve the handling of multi-class situations, feature optimization, and decision clarity. Future studies should aim to develop and improve this model to increase its usefulness and the security of IoT systems against cyber threats.
There is scepticism about the added complexity introduced by employing many classifiers in an ensemble model. As time has progressed, however, processing units like mobile devices have become progressively quicker, and memory resources have become increasingly inexpensive; this has led to the possibility of a wide range of algorithms, including ensemble approaches, being used for fog computing. Efficient resource allocation in fog computing is another area of study. Moreover, studies have developed fog system designs that may use ensemble learning without significantly increasing latency. It is argued that the design and efficient resource allocation method explored in this article may be used to implement the stacking strategy. Since missing a cyberattack is associated with a high cost, the discovery that stacking can beat single classifiers for cyberattack detection in IoT Smart city applications has significant value despite modest increases in complexity.
Conclusions
To forecast cyber attacks in IoT systems, the authors provide a unique multi-class support vector machine (SVM) and improved CHAID decision tree-based model. In addition to enhanced prediction accuracy, this model offers enhanced interpretability, scalability, efficiency, and optimized feature selection. The proposed model attains the highest accuracy level (98.28%). This is the maximum accuracy achieved in both the training and testing phases. Using multi-class SVMs and improved CHAID decision tree algorithms, various attack classes may be handled efficiently and with complete clarity. The model incorporates feature selection approaches to zero in on the most important aspects
for cyber attack prediction, lowering the dimensionality and increasing the efficiency with which the model operates. By improving interpretability, the CHAID decision tree method gives security analysts a deeper understanding of attack vectors and weak spots. A potential topic of study is the combination of Multi-Class SVM and Optimized CHAID decision tree for detecting cyber attacks. Combining the best features of these two algorithms allows for the creation of more effective and trustworthy systems for detecting cyber attacks. The study found the following additional results:
• Accuracy and recall in detecting cyber attacks can be enhanced by combining Multi-Class SVM with an Optimized CHAID decision tree.
• Multi-Class SVM may be used as a first-stage classifier in integrating these two techniques, or the two can be used simultaneously.
• Organizational requirements should guide the selection of an integration strategy.
Improving the accuracy and reliability of cyber attack detection systems by integrating Multi-Class SVM and Optimized CHAID decision tree is a promising field of research.
Due to its efficiency and scalability, the model may be used for real-time prediction in massive IoT rollouts. Its computing performance allows for rapid forecasts and faster cyber threat detection and mitigation. Our model has potential, but it is not without caveats. Training data is crucial to the model’s success; thus, it is important to use a wide variety of data that accurately represents the target domain. Investigating ensemble approaches and incorporating additional machine learning techniques in future studies might improve the resilience and accuracy of the model.
Our unique multi-class support vector machine (SVM) and improved CHAID decision tree-based model adds to the development of cyber attack prediction in IoT systems. It is a helpful resource for countering online dangers and protecting sensitive data. More work will improve and broaden the model, leading to stronger defences for Internet of Things devices. Thus, a CHAID-based paradigm is proposed for predicting multi-stage cyber threats in IoT communication. In this research, the authors investigate whether the proposed CHAID method can be used to detect cyberattacks in IoT-based Smart city applications. Through testing with the most up-to-date IoT attack database, we have found that this technique, mainly stacking, outperforms individual models in distinguishing malicious from benign data. Using a feature selection method informed by information gain, the authors can zero in on the data that will impact the model’s performance most. Additionally, our proposed technique combined with the SVM technique leads to higher performance than the single or other models employed in recent publications in categorizing attack types in terms of accuracy, precision, recall, and F1-score metrics. In the future, the authors want to investigate deep learning strategies that might significantly improve the effectiveness of IoT threat detection.
Finally, as automated systems and Smart cities gain popularity, they will also face increased cyber attacks. If citizens are denied access to an automated system, or have their privacy invaded within it, the consequences can be severe for them as individuals and expensive for the government to fix. System failures in managing emergencies (such as accidents and fires) can potentially endanger people’s health. Our findings that stacking classifiers can improve the detection of cyberattacks in smart city networks have ramifications beyond technological contributions, including economic and societal ones.
More information will be gained in this regard from studies to be conducted in the future. To better identify cyber attacks, new machine learning algorithms may be created. Because they will be customized to the unique traits of cyber attacks, these algorithms may be more accurate and trustworthy than their predecessors. Cyber attack detection systems may be more effective using additional data sources like network traffic data and system logs. This information can be utilized to spot trends in cyber attacks that are not picked up by currently available databases. It is possible to build automated reaction systems that respond to cyber threats. These technologies may be used to quarantine compromised machines, block harmful traffic, and roll back to a prior configuration.
Acknowledgements
The authors thank Princess Nourah bint Abdulrahman University Researchers Supporting Project number (PNURSP2023R410), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia.
Authors’ contributions
UKL & S.D. were responsible for Validation, Software, Data Curation, and Writing - Original Draft. N.F. & S.S. were responsible for Conceptualization, Writing - Original Draft. MA & NA were responsible for Writing - Original Draft, Visualization. NF & NA were responsible for Writing - Review & Editing. SD & MA were responsible for Formal Analysis. A.K. was responsible for Writing - Original Draft, Resources, and Supervision. The author(s) read and approved the final manuscript.
Funding
The funding of this work was provided by Princess Nourah bint Abdulrahman University Researchers Supporting Project number (PNURSP2023R410), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia.
Availability of data and materials
Publicly available datasets were analyzed in this study.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in pub-
lished maps and institutional affiliations.