Discover Internet of Things: Maximizing Intrusion Detection Efficiency For Iot Networks Using Extreme Learning Machine
Research
Abstract
Intrusion Detection Systems (IDSs) are crucial for safeguarding modern IoT communication networks against cyberattacks. IDSs must exhibit exceptional performance, low false-positive rates, and significant flexibility in constructing attack patterns to efficiently identify and neutralize these attacks. This research paper discusses the use of an Extreme Learning
Machine (ELM) as a new technique to enhance the performance of IDSs. The study utilizes two standard IoT network IDS datasets: NSL-KDD 2009 and Distilled-Kitsune 2021. Both datasets are used to assess the effectiveness of ELM
in a conventional supervised learning setting. The study investigates the capacity of the ELM algorithm to handle high-dimensional and unbalanced data, indicating its potential to enhance IDS accuracy and efficiency. The research also examines the setup of ELM for both NSL_KDD and Kitsune, using Python and Google COLAB to perform binary and multi-class classification. The experimental evaluation revealed the proficient performance of the proposed ELM-based IDS relative to other implemented supervised-learning-based IDSs and state-of-the-art models in the same study area.
Keywords Extreme Learning Machine (ELM) · Intrusion detection system (IDS) · Machine learning · Internet of Things
(IoT) · Cyber-attack
1 Introduction
Huang et al. first proposed ELM in 2004. Because ELM trains faster than conventional ML models, the authors of [1] used it to build and deploy a model for a digital twin platform. Machine Learning (ML) has successfully exploited the properties of the Multilayer Feedforward Neural Network (MLFN). Researchers found that a single-layer feedforward neural network with N hidden neurons and any activation function can be trained effectively with tolerable error. Neural Network (NN) methods hold enormous potential across domains such as pattern recognition, graphical interpretation, risk estimation, management systems, forecasting, and classification.
As NN research progressed, studies on the estimating abilities of NNs stimulated their incorporation into qualitative approaches for problems involving differential equations [2, 3]. Furthermore, many scientists regard ANN and ML as subclasses of Artificial Intelligence (AI); for example, the authors of [4] employ AI applications to address nonlinear autoregressive neural networks, enabling more precise forecasts for prospective needs, markets, and technology areas. In [5], researchers provide a precise NN model for identifying and categorizing network activities in an IoT system. This research study dives deeper into the application of ELM. The NSL-KDD dataset
* Qasem Abu Al‑Haija, [email protected] | 1Department of Cybersecurity, King Hussein School of Computing Sciences, Prince
Sumaya University for Technology, P.O. Box 1438, Amman 11941, Jordan. 2Department of Cybersecurity, Faculty of Computer & Information
Technology, Jordan University of Science and Technology, P.O. Box 3030, Irbid 22110, Jordan.
Research Discover Internet of Things (2024) 4:5 | https://doi.org/10.1007/s43926-024-00060-x
is employed to test IDSs in network security. With the threat landscape of cyberattacks evolving, the NSL-KDD dataset offers an opportunity to examine how ELM could enhance the accuracy and efficiency of IDSs in the ongoing battle against cyber threats. Kitsune, a newer and more dynamic network traffic analysis dataset, presents equally interesting prospects: its focus on real-time anomaly detection complements ELM's quick learning and flexibility. By combining ELM with Kitsune, researchers and practitioners can examine novel techniques for identifying anomalies and potential threats in dynamic network settings. The study therefore uses both NSL-KDD and Kitsune to examine whether this prospective ELM approach can aid network security and the dynamic detection of anomalies. This work offers substantial insight into the potential of ELM to transform the field of ANNs and its immediate applications in real-world data processing, since IDSs can still be improved in simplicity, efficacy, and efficiency [6, 7]. Figure 1 represents the direction of this research study, which applies our approach to NSL_KDD and Kitsune for binary and multi-class classification to detect attacks in various ways.
1.1 Research objectives
This research focuses on using ELM to identify attacks. The researchers work through the suggested ELM-IDS approach using a variety of datasets that contain both unusual behaviors and attacks. The study was implemented in Python on Google COLAB. Its objectives may be summarized as follows:
• Analyze in depth the types of attacks within the datasets, and describe previous studies with comparable detection approaches to gauge their effectiveness.
• Develop an IDS to identify (binary classification: Normal vs. Attack) and categorize (multi-class classification: attack types) intrusions.
• Measure the performance of ELM (using standard evaluation metrics) against different ML algorithms (KNN, Decision Trees, and Random Forests), and compare the binary and multi-class findings for both datasets.
• Contrast our findings with relevant studies from previous years.
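The standard evaluation metrics invoked in these objectives (accuracy, precision, recall, F1-score, and false-positive rate) all derive from a confusion matrix. The sketch below is an illustrative NumPy implementation, not the paper's evaluation code; the label convention (1 = attack, 0 = normal) and the toy predictions are assumptions for demonstration.

```python
import numpy as np

def binary_metrics(y_true, y_pred):
    """Confusion-matrix-based IDS metrics (convention: 1 = attack, 0 = normal)."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = int(np.sum((y_true == 1) & (y_pred == 1)))  # attacks caught
    tn = int(np.sum((y_true == 0) & (y_pred == 0)))  # normal traffic passed
    fp = int(np.sum((y_true == 0) & (y_pred == 1)))  # false alarms
    fn = int(np.sum((y_true == 1) & (y_pred == 0)))  # missed attacks
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall) if precision + recall else 0.0,
        "fpr": fp / (fp + tn) if fp + tn else 0.0,  # false-positive rate
    }

# Toy labels: one missed attack and one false alarm out of eight flows
m = binary_metrics([1, 1, 1, 1, 0, 0, 0, 0],
                   [1, 1, 1, 0, 1, 0, 0, 0])
print(m)  # accuracy 0.75, precision 0.75, recall 0.75, f1 0.75, fpr 0.25
```

The same counts extend to the multi-class case by computing one-vs-rest metrics per attack type and averaging.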
1.2 Research contribution
Considering the various ways ELM has been implemented and experimented with across sectors, our research study applies ELM in Python using Google COLAB. This work is timely because it demonstrates the ability of a general ELM-based IDS to identify attacks and harmful activities. The research is organized as follows: first, an overview of ANNs and ELM with a detailed taxonomy, followed by an overview of research carried out within the last 7 years. In total, the study examines 85 historical studies published between 2011 and 2024, arranged chronologically in the Reference section. Section IV details the algorithm used to apply ELM; Section V presents and discusses the experimental results; and Section VI concludes the study. Figure 1 displays the individual parts of the study and its overall direction.
1.3 Problem statement
Although there is growing interest in using ELM for IDSs, a knowledge gap remains in adequately understanding real-world operation, performance benchmarks, and possible enhancements over traditional IDSs. Unfortunately, previous research has not produced a dataset, detection, or classification system in a suitable setting and context with unambiguous categorization. This motivated us to adapt this research to additional datasets, covering other types of normal and attack traffic and various classification approaches, leading to a higher accuracy rate and increased security in intrusion detection technologies.
1.4 Research motivation
ELMs have been found to outperform traditional NNs in a variety of tasks, including classification, regression, and clustering, and they are more resistant to noise and overfitting. ELM may be implemented in various environments, including Python, MATLAB, Microsoft Azure, and RStudio, and ELM or ELM-within-ANN applications are used across industries. Identifying attacks and abnormal traffic requires designing and deploying ELM against numerous datasets; the present research project integrates ELM using Google COLAB and Python. Our motivation and contributions are characterized as follows: the introduction presents the idea of implementing an ELM from NNs, along with the ANN taxonomy; the background section summarizes the structure of ANNs, SLFNs, and ELM; the literature review analyzes roughly 80 studies from the last 8 years, organized chronologically; the ELM Implementation section provides ELM results using Python and Google COLAB; and finally, the results and discussion section provides the overall outcomes of the experiment for our ELM-IDS model.
2 Background
ELM has come into use in cybersecurity, notably in detecting and mitigating threats using ML. ELM’s unique design,
which involves randomly generated input-to-hidden layer weights, has proved advantageous in dealing with large-scale
datasets and high-dimensional feature spaces. In the context of attack detection, ELM’s computational effectiveness is
very useful, allowing for quick model training and inference [8]. Its capacity to generalize effectively helps to identify
patterns suggestive of attacks, making it a vital tool for improving security measures. Researchers and practitioners want
to construct strong ML models capable of identifying and reacting to many types of cyber threats, such as adversarial
assaults, data poisoning, and evasion strategies, by exploiting the benefits of ELM. As cybersecurity evolves, ELM stands
out as a viable strategy for enhancing NN systems’ capabilities to protect against evolving and intricate security threats.
To employ ELM, one must understand how it is used in NNs. The study therefore traces the history of ELM within NNs and how different scientists have classified ELM architectures. ELM is regarded as a kind of NN, and NNs are employed to examine and apply ELM in this project.
ANNs are numerical models, inspired by biological neural systems, that process data [9]. The ANN taxonomy is shown in Fig. 2; this work employs a single-hidden-layer feedforward network (SLFN). Figure 3 shows an example of a generalized ELM employed in
any learning system. As Fig. 3 indicates, ELM may be portrayed in various architectural styles; our study focuses on ELM's general architecture. Although ELM is usually built with NNs, it can also be realized with other ML approaches, using single-layer or multilayer perceptrons. Because ELM is an SLFN, it is most naturally built as an NN [10, 11].
Figure 4 shows an ELM as a single-hidden-layer NN, with an input layer, one hidden layer, and an output layer. A single-hidden-layer NN is an ANN with just one hidden layer between its input and output layers; deep models, by contrast, include several hidden layers. Extended ELM variants may stack hidden layers, but the basic ELM uses a single hidden layer.
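Because the basic ELM is an SLFN whose input-to-hidden weights stay random and fixed while only the output weights are solved analytically, its entire training step collapses to one linear solve. The following is a minimal NumPy sketch of that idea; the hidden-layer size, sigmoid activation, and synthetic two-blob data are illustrative assumptions, not the configuration used in this study.

```python
import numpy as np

rng = np.random.default_rng(0)

def elm_train(X, y, n_hidden=64, n_classes=2):
    """Train a basic ELM: random fixed hidden layer, analytic output weights."""
    W = rng.normal(size=(X.shape[1], n_hidden))  # input-to-hidden weights, never updated
    b = rng.normal(size=n_hidden)                # random hidden biases
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))       # sigmoid hidden-layer activations
    T = np.eye(n_classes)[y]                     # one-hot targets
    beta = np.linalg.pinv(H) @ T                 # Moore-Penrose solution for output weights
    return W, b, beta

def elm_predict(X, W, b, beta):
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    return np.argmax(H @ beta, axis=1)

# Toy two-class data: two Gaussian blobs standing in for normal vs. attack traffic
X = np.vstack([rng.normal(0, 1, (100, 5)), rng.normal(3, 1, (100, 5))])
y = np.array([0] * 100 + [1] * 100)
W, b, beta = elm_train(X, y)
acc = np.mean(elm_predict(X, W, b, beta) == y)
print(f"training accuracy: {acc:.2f}")
```

Multi-class classification needs no structural change: only the width of the one-hot target matrix T grows with the number of attack types.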
SLFNs are illustrated in Fig. 4: an SLFN has an input layer, a hidden layer, and an output layer. Since ELM is an SLFN, implementing an SLFN implements ELM. In addition, ML and NNs fall under AI; as previously stated, ANNs are a type of ML and AI. An ANN applies nonlinear functions to weighted summations of its input variables. The ANN approach may be depicted as a graph with nodes (neurons) and edges (connections between them). Weights are allocated to neurons and edges, shaping the learning mechanism; each weight varies with the strength of the transmitted signal at its connection [12]. A neuron transmits a signal only when the aggregate incoming signal passes its threshold. Neurons are frequently arranged in layers, and different layers may modify their inputs in various ways. Signals may travel several times between
the first layer (the input layer) and the final layer (the output layer) [13, 14]. The weights, biases, weighted summation, and activation functions together produce the output. ANNs mimic the knowledge-gathering process by using a large number of neurons connected in a specific manner, together with powerful learning algorithms [15]. Figure 5 depicts an ANN structure with starting values, biases, an activation function, and an output; evaluating metric measurements over such a network is a multi-step process. Regarding ANN and ELM, the authors of [16] compared the two for forecasting the number of dengue occurrences from environmental parameters (rainfall, temperature, and humidity), contrasting the outcomes of ELM and ANN (backpropagation).
FNNs and RNNs are two distinct NN topologies. FNNs pass input data through their layers in a single forward pass without internal memory, making them highly effective for tasks with independent inputs, such as image classification. In contrast, RNNs employ loops to preserve information over time, enabling them to evaluate sequential data, such as text and time series, by capturing dependencies between inputs. Whereas FNNs are easier to train and handle fixed-size inputs quickly, RNNs excel at temporal comprehension and context preservation, although they suffer from vanishing gradients. Both approaches contribute to a wide range of areas, spanning natural language processing to image recognition, by leveraging their respective strengths with particular data structures and temporal dependencies. Figure 6 shows the differences between (a) feedback NNs and (b) feedforward neural networks (FNNs). The main difference is that feedback NNs have recurrent connections, allowing them to process sequential data with memory, while FNNs process inputs without internal state or loops. Feedback NNs are also known as RNNs [17].
Furthermore, Table 1 details the differences between FNNs and feedback NNs in terms of signal flow, time operation, structural complexity, memory, and range of application.
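The contrast between stateless feedforward processing and stateful recurrent processing can be made concrete with a toy example. The snippet below is purely illustrative (the random weights and tanh activations are assumptions): the feedforward step maps the same input to the same output every time, while the recurrent step's output for the same input depends on the accumulated hidden state.

```python
import numpy as np

def feedforward_step(x, W):
    """FNN step: each input is mapped independently, no carried state."""
    return np.tanh(W @ x)

def recurrent_step(x, h, W_x, W_h):
    """RNN step: the hidden state h folds in everything seen so far."""
    return np.tanh(W_x @ x + W_h @ h)

rng = np.random.default_rng(1)
W   = rng.normal(size=(3, 2))
W_x = rng.normal(size=(3, 2))
W_h = rng.normal(size=(3, 3))
seq = [rng.normal(size=2) for _ in range(4)]

# Feedforward: repeating the same input yields an identical output
out1 = feedforward_step(seq[0], W)
out2 = feedforward_step(seq[0], W)

# Recurrent: run the whole sequence, then feed seq[0] again
h = np.zeros(3)
for x in seq:
    h = recurrent_step(x, h, W_x, W_h)
h_fresh   = recurrent_step(seq[0], np.zeros(3), W_x, W_h)  # seq[0] with empty history
h_history = recurrent_step(seq[0], h, W_x, W_h)            # seq[0] after seeing the sequence
print(np.allclose(out1, out2), np.allclose(h_fresh, h_history))  # True False
```

This statefulness is exactly what makes RNNs suited to time series and text, and what FNNs (and hence the basic ELM, an SLFN) deliberately omit.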
3 Literature review
This section analyzes past studies related to our approach, experiment, and concept. ELM plays a vital role in classification performance [66]. While ELM is applied in different sectors, we analyzed ANN-within-ELM approaches on various datasets used to detect attacks and malicious activities. The current study analyzes studies that relate ELM to NNs, reviewing 85 related works from 2011 to 2024; Fig. 7 details the advancement of ELM over the years. For organizational clarity, references are classified in chronological order from 2024 to 2016. Table 2 summarizes the related studies by year, dataset or method, approach, and results.
To begin, we represent related works from past years. The authors of [18] designed their own ELM sales-forecasting model using a sales dataset from SAP to investigate data quality that cannot be measured quantitatively, reporting an accuracy of 111% and a predicted value of 132%. The authors of [19] experimentally assessed ML-based XSS detection systems and found that a hybrid learning-based XSS detection system dominated, achieving the highest performance metrics, 99.8% detection accuracy and detection precision, along with a quick detection time of just 103.1 s.
In [20], the authors used expectation maximization (EM) and iteratively reweighted least squares (IRLS) to solve the altered objective function. They assess the algorithm's performance through numerical modeling and experiments on benchmark datasets, revealing its edge in resilience and generalization over existing state-of-the-art ML algorithms. In [21], researchers present two effective methods for solving high-dimensional PDEs based on randomized NNs: a randomized NN expresses the response field of the high-dimensional PDE problem, with its hidden-layer coefficients set to random values and fixed while the output-layer coefficients are learned. TELM [22], which serves as an efficient classifier, ignores the numerical knowledge buried in the data; in that research, the authors created a Fisher-regularized TELM (FTELM) by adding Fisher regularization to the TELM learning architecture, fully utilizing the statistical information gathered from sample data. In [23], researchers investigate their suggested malware detection method and create a highly scalable structure for detecting, classifying, and categorizing zero-day malware.
The authors of [24] discussed developing, training, validating, evaluating, and discussing a novel recognition and classification methodology for identifying and categorizing harmful uniform resource locators (URLs). The system that is
Table 2 Analysis of past studies (Ref/Year, Dataset, Model, Classes, Benefits, Limitation)
[18] 2023. Dataset: sales dataset from SAP. Model: ELM for sales forecasting. Classes: multi-class. Benefits: error score of 0.02716. Limitation: ELM sales forecasting still requires a thorough understanding of NNs and statistics.
[19] 2023. Dataset: XSS-Attacks-2019. Model: hybrid model. Classes: two-class. Benefits: average detection time of 103.1 s. Limitation: none reported.
[20] 2023. Dataset: EAGLE. Model: Mixture-ELM. Classes: multi-class. Benefits: MAPE attaining a value of 0.187%; best predictions. Limitation: the framework's hyperparameter …
[21] 2023. Dataset: the authors' own dataset. Model: ELM-PDE. Classes: two-class. Benefits: using the ELM method, they solve system (42) with an NN architecture. Limitation: none reported.
[22] 2023. Dataset: UCI datasets. Model: T-ELM. Classes: two-class. Benefits: FTELM outperforms CL1-FTELM. Limitation: the authors attempt to transition FTELM and CL1-FTELM from a supervised to a semi-supervised learning paradigm.
[23] 2023. Dataset: MALIMG. Model: ELM-Net model within different algorithms. Classes: two-class. Benefits: adding several additional layers to the current designs allows them to analyze even bigger malware. Limitation: they will need to experiment with these modifications and add new features to the current information.
[24] 2023. Dataset: ISCX-URL2016. Model: high-performance NN-based IDS. Classes: two-class and five-class. Benefits: the En_Bag strategy topped all others, with accuracy rates of 99.3% and 97.92% for the 2-class and 5-class classifications, respectively. Limitation: none reported.
[25] 2023. Dataset: diabetes dataset. Model: Annealing-ELM. Classes: two-class. Benefits: simulated annealing applied to the golden-section result produces accelerations of 3.5 and 38.3. Limitation: none reported.
[26] 2023. Dataset: UNSW-NB15. Model: AS-ELM & OI-SVDD. Classes: multi-class. Benefits: the proposed AI-ELM model is capable of a 99% accuracy rate. Limitation: false-positive rates tend to occur in anomaly detection structures, and their system is no exception; the suggested OI-SVDD model is likely vulnerable to these anomaly-identification issues.
[27] 2023. Dataset: Broken Bars Dataset. Model: broken-rotor-bars IMS model. Classes: four-class. Benefits: detects and classifies BRBs with 99.8% accuracy. Limitation: the system employs self-configurable optimizable NNs (ONNs), which alter their hyperparameters and specifications to reach the best design for every problem statement.
[28] 2023. Dataset: CWRU. Model: MSCNN-ELM. Classes: multi-class and two-class. Benefits: uses Microsoft Azure, a new, strong environment. Limitation: none reported.
[29] 2023. Dataset: CIRA-CICDoHBrw-2020. Model: DoH IDS model. Classes: two-class. Benefits: predictive accuracy of 99.4% and 100%. Limitation: the demand for labeled data restricts the investigation.
[30] 2022. Dataset: DDoS Dataset 2022. Model: LR-DDoS model. Classes: two-class. Benefits: outperformed the other models in LR-DDoS detection, reaching an overall detection rate of 99.9975%. Limitation: none reported.
[31] 2022. Dataset: UAV-IDS-2020. Model: UAV-IDS-ConvNet. Classes: two-class. Benefits: experiments demonstrated a successful IDS accuracy of 99.50%. Limitation: the assessment depends only on simulation data, with no real deployment, evaluation, or performance analysis; testing the method in a real-world context could be fascinating and instructive.
[33] 2022. Dataset: 29 datasets from the UCI repository. Model: BAB-ELM. Classes: multi-class. Benefits: uses MATLAB to handle this large number of datasets. Limitation: the present feature-learning techniques are extremely tough and complex to distribute, since they generate a huge number of intermediate outcomes with complex interconnections and multiple repetitive procedures.
[34] 2022. Dataset: NSL_KDD. Model: IoT-IDCS-CNN. Classes: multi-class (5 classes) and two-class. Benefits: the collaboration of the three sectors produced a systematic engineering method that may be executed with great performance and precision. Limitation: none reported.
[38] 2021. Dataset: rainfall and six other datasets. Model: BBO-ELM. Classes: multi-class. Benefits: compares BBO-ELM with three models across various datasets. Limitation: none reported.
[39] 2021. Dataset: CMB. Model: CNN-ELM-BA. Classes: multi-class and two-class. Benefits: the suggested algorithm achieved an overall accuracy score of 95.25%. Limitation: the approach solves the two-class case; multi-class still needs to be solved.
[40] 2021. Dataset: NSL_KDD. Model: ELM & FLN models. Classes: multi-class. Benefits: the average accuracy of the overall ELM model is 0.9865, and the max accuracy is 0.9255. Limitation: none reported.
[42] 2020. Dataset: MNIST. Model: AE-ELM. Classes: multi-class. Benefits: ELM was selected because it offers analytical weight calculation and efficient training. Limitation: none reported.
[55] 2024. Dataset: NSL-KDD. Model: WELM. Classes: multi-class (five classes). Benefits: effective in addressing classification issues with unbalanced datasets. Limitation: relatively low recall rates for U2R and R2L due to small sample sizes and the unbalanced dataset.
suggested employs four ensembles of supervised NN approaches: the ensemble of bagging trees (En_Bag), the ensemble of k-nearest neighbors (En_kNN), the ensemble of boosted decision trees (En_Bos), and the ensemble of subspace discriminators (En_Dsc). Furthermore, in [25], the authors employ golden-section search and simulated annealing as heuristic approaches for determining the proper number of neurons in an ELM hidden layer. Contrasted with a sequential technique on the highest-dimensional database, the outcomes demonstrate that simulated annealing speeds up the search for the correct number of neurons by up to 4.5 times, and the golden-section search by up to 95.7 times. Moreover, the authors of [26] studied using SVDD and ELM to create a NIDS with online learning capability to defend IIoT devices from assaults using MEC.
Furthermore, according to the authors of [27], the suggested technique successfully identifies and categorizes BRBs with 99.8% accuracy and a prediction time of 1.64 microseconds. Subsequently, [28] suggested MSCNN-ELM to handle the issues of difficult defect-feature extraction and poor diagnostic accuracy in manufacturing processes. They extracted features using several convolution layers with different branches and confirmed that the overall feature-extraction capacity was superior to that of a single-scale model. The MSCNN parameters are produced randomly according to a Gaussian probability distribution, which speeds up the calculation. The targets consisted of a 10% FPR decrease, an FPR of less than 0.6%, and a better FPR than J48 DT or MLP classifiers utilizing similar datasets.
Moreover, in [29], the study's authors utilize a hybrid learning method, offering a lightweight, two-stage strategy for identifying malicious DoH traffic; the structure is divided into two tiers. The detailed experimental assessment revealed a high-performance system with prediction accuracies of 99.4% and 100% and predictive overheads of 0.83 s and 2.27 s for layers one and two, respectively. The authors of [30] suggested a new lightweight IDS for detecting LR-DDoS attacks on SD-IoT networks; they split the data 50%/50% into test and train sets and achieve 99.9975% accuracy.
Regarding [31], a Legendre wavelet NN model integrated into ELM is presented as a numerical solution to the time-fractional Black–Scholes approach. The fractional-derivative operational matrix, based on the two-dimensional Legendre wavelet, is produced and used to solve the European options pricing problem by reducing it to a series of algebraic equations. The experimental findings show that this strategy provides a reasonable numerical solution compared with the benchmark method. In [32], the authors offer a self-learning IDS (UAV-IDS-ConvNet) that can detect malicious attackers infiltrating UAVs employing deep convolutional NNs; the suggested strategy especially considers protected Wi-Fi traffic records from three types of regularly used UAVs.
Moreover, the objective of [33] is to describe Bayesian attribute bagging based on ELM (BAB-ELM) for dealing with high-dimensional classification and regression issues. In IoT research, the authors of [34] suggested an architecture that may be tweaked and utilized for intrusion detection and classification with any IoT cyber-attack dataset. The suggested system is divided into three subsystems: feature engineering, feature learning (FL), and detection and classification, each extensively explained and examined in that article. The architecture leverages DL models to identify minimally modified IoT networking assaults with excellent detection and classification accuracy over IoT traffic derived either from a real-time system or from a pre-collected dataset. Because this study combines systems engineering (SE) approaches, ML technology, and information security in the IoT systems sector, the cooperation of the three domains has effectively generated a systematically designed system that might be deployed with high performance. Consequently, in [35], the authors propose an innovative FDI attack detection technique to address the high computational complexity generated by high-dimensional data: they first map high-dimensional data to a low-dimensional space via the LLE technique and then train an ELM classification model.
In [36], the researchers note that the loss function does not require discretization and is therefore highly parallelizable; the ELM technique is employed to solve a system of linear equations for the network parameters. In example studies, they proved the performance of ELMNET when solving the advection-diffusion PDE (AD-PDE). The experimental results were contrasted with those of an effective deep NN and revealed that ELMNET achieves considerable gains in accuracy and training time. The authors of [37] present, implement, and assess a detection system for a remotely controlled automobile's remote keyless entry (RKE). Their transfer-learning technique uses the ResNet50 deep convolutional NN (DCNN) pre-trained on the ImageNet dataset; the in-depth evaluation demonstrated remarkable performance, including a classification accuracy of 99.71% at an extremely low detection time. Next, [38] suggested BBO-ELM and DNN models for multi-step-ahead rainfall prediction across India and homogeneous areas, contrasting their
performance with the results of three approaches: GA-ELM, PSO-ELM, and ELM. As a result, the suggested wavelet-based models might be utilized to forecast monthly rainfall in the Indian area. Also, the authors of [39] determine the best number of layers to substitute; a heuristic technique known as the bat algorithm was used to optimize the parameters of ELM. Their technique was evaluated using holdout validation, and the most accurate predictions were derived by averaging the outcomes of five experiments. In [40], the authors examined ELM- and FLN-based IDSs with varying numbers of neurons to assess the influence of the algorithm's layout on IDS accuracy. Additionally, the recommendations of [41] outperformed and demonstrated robustness when contrasted with the various models employed, demonstrating the effective utilization of EO-ELM and DNN in R-R modeling, which could potentially be employed in the hydrological modeling sector. In [42], the authors' approach achieves equivalent reconstruction performance on real datasets while training in substantially less time than previous deep NN approaches: with only 8.23 s of training, an NN containing almost 8000 hidden units produced a normalized mean-squared error of 1.28 × 10⁻³ at an average compression ratio of 10:1. In [43], the authors present a hybrid NN model for Android ransomware detection that combines a modified swarm intelligence method, the SSA algorithm, with Kernel-ELM; the SSA is altered at both the swarm-structure and computational-binarization levels.
In [44], the authors suggest the intrusion detection method ILECA, presenting an enhanced LDA-based ELM classification, and compare the performance and efficiency of ILECA with five other methods on the NSL-KDD dataset. ILECA has the highest accuracy and detection rate (92.35% and 91.53%, respectively), while its runtime is just 0.1632 s; it thus offers strong generalization and real-time behavior, and its overall performance exceeds the five usual methods. In [46], the basic ELM technique, which uses the Moore-Penrose inverse, loses precision on extremely noisy data; by incorporating the SA and PSO heuristic optimization algorithms, ELM's resilience to noisy data was achieved. Whenever noisy data is unavoidable, other investigations can therefore utilize this technique to generate good models.
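The precision loss of the plain Moore-Penrose solution on noisy, ill-conditioned data can be seen directly: tiny singular values of the hidden-layer matrix are inverted into huge output weights. The sketch below illustrates this effect and contrasts it with a ridge-regularized solve; note that ridge regularization is a common stabilizer used here purely for illustration, not the SA/PSO optimization applied in [46], and the matrix sizes and noise levels are assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hidden-layer output matrix H with two nearly collinear columns (ill-conditioned)
H = rng.normal(size=(50, 2))
H = np.hstack([H, H[:, :1] + 1e-6 * rng.normal(size=(50, 1))])
t = rng.normal(size=50)  # noisy targets

beta_pinv = np.linalg.pinv(H) @ t  # plain Moore-Penrose least-squares solution
# Ridge-stabilized solve: (H^T H + lambda*I) beta = H^T t
beta_ridge = np.linalg.solve(H.T @ H + 1e-2 * np.eye(3), H.T @ t)

# The near-singular direction inflates the pinv weights by roughly 1/sigma_min
print(f"max |beta|: pinv={np.abs(beta_pinv).max():.3g}, ridge={np.abs(beta_ridge).max():.3g}")
```

The inflated pinv weights amplify input noise at prediction time, which is the instability that robust ELM variants aim to suppress.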
Within [47], the authors propose CNN-RELM, evaluated on the ORL and NUST datasets as a powerful classifier. CNN-RELM uses the gradient-descent approach to train the CNN until the target accuracy is reached; the fully connected layer of the CNN is then replaced with an RELM optimized by a genetic algorithm, while the remaining layers of the CNN stay intact. Researchers within [48] suggest a novel distribution-based weighted ELM (D-WELM), a new weighting strategy for ELM proposed for binary and multi-class classification problems involving unbalanced data distributions. The purpose of the research in [49] is to determine the influence of characteristics extracted from the source code of an .apk file, which represents the internal structure of the software, on the prediction of malware versus conventional Android applications. The authors empirically calculated, analyzed, and compared the performance of seven classifier strategies, three data sampling techniques, and three feature selection techniques to create Android malware prediction models.
The authors of [50] suggest an ELM model to detect DDoS attacks using a Tsallis-entropy dataset within an IoT environment. Furthermore, researchers in [51] suggested the HGWO-ELM approach, which uses the grey wolf optimizer to find the globally best variables and feature selection, enhancing its search strategy, food-source model, and updating equations. The approach laid out in that research outperforms previous classification methods in simulated experiments and offers guidelines that are important for fault diagnostics of hydropower-producing units. The article [52] proposes a two-hidden-layer ELM, referred to as T-ELM, which introduces an innovative approach to determine the parameter values of the second hidden layer (the connection weights between the first and second hidden layers, as well as the bias of the second hidden layer), bringing the actual hidden-layer output closer to the expected hidden-layer output within a two-hidden-layer feedforward network. Researchers within [53] focus on applying ELM
technology to enhance the efficiency and precision of IDSs within network security research. The research systematically analyzes studies from the last decade, categorizing them into S-ELM, U-ELM, and Semi-ELM. It highlights the importance of ELM in AI and the increasing interest in it, emphasizing the need for further exploration, especially in handling high-dimensional data. The study also proposes hybrid models, such as ELM with Deep Learning (DL), to improve interpretability and scalability, suggesting promising advancements in ELM-IDS techniques. The study [54] focuses on improving agricultural practices through multispectral and hyperspectral information-processing technology for enhanced crop management and productivity. Finally, the authors of [55] evaluate the Weighted Extreme Learning Machine (WELM) against other ML algorithms for unbalanced data classification, demonstrating superior accuracy and stability.
Research Discover Internet of Things (2024) 4:5 | https://doi.org/10.1007/s43926-024-00060-x
A feedforward NN in ELM comprises a single hidden layer with randomly generated weights and biases. This section will
be divided into three subsections: ELM environments, approaches and techniques, and datasets.
4.1 ELM environment
ELM trains substantially faster than typical NNs because it uses the Moore–Penrose pseudo-inverse of the hidden-layer output matrix. This subsection starts by identifying an environment in which ELM can be implemented easily and clearly. The study finds various ways to implement ELM within ML or ANN frameworks; see Fig. 8, which summarizes them. Implementation options include ML libraries, MATLAB, Microsoft Azure, and NN approaches that utilize randomized weights and biases within the hidden layer [56].
The established least-squares method employs the Moore–Penrose inverse matrix to determine the weights of the output layer. The benefits of this method include its training speed, adaptation ability, and resilience; its weakness lies in addressing extremely nonlinear situations.
Google COLAB is used (12 GB of RAM, a 107.7 GB disk, and a GPU) with Python, and the NSL-KDD and Distilled-Kitsune 2021 datasets are chosen. Our algorithm achieves good results on both datasets.
Figure 9 provides the steps for implementing ELM in Google COLAB and Python. In [57], the authors used Microsoft Azure to implement ELM and classified ELM as a class of AI. Their simulation outcomes reveal that a mixture of ELM and under-sampling produced the best results, with an average F1-score of 0.9541 for binary classification and 0.9555 for multi-class classification. Table 3 below presents ELM packages and libraries with various choices depending on the model and algorithm to be implemented.
4.2 Approaches and techniques

Cyberattacks deliberately abuse a system's capabilities to compromise required data and use the equipment in an unauthorized manner. They frequently target critical security services such as confidentiality, integrity, and availability, and they currently come in various forms, including ransomware, malware, adware, surveillance, and denial of service. This study implements the suggested ELM-IDS model to detect such attacks using ML.
4.2.1 Data selection
Firstly, two datasets were selected: NSL-KDD and Distilled-Kitsune, both based on a broad range of IoT threats and attacks. The ELM model was used to identify attacks in the NSL-KDD and Distilled-Kitsune datasets, which have different characteristics and sample numbers. The study was carried out with Python and Google COLAB. Each dataset is distinctive due to the characteristics listed in Table 4. Both datasets were chosen to evaluate and measure the performance of IDSs in cybersecurity: the NSL-KDD dataset contains structured network intrusion logs and provides a balanced and improved benchmark for IDSs, while the Distilled-Kitsune dataset was chosen for its realistic representation of network traffic anomalies.
4.2.2 Data preprocessing
Data preprocessing is any technique applied to raw data to prepare it for a further data-processing activity; it is a part of data preparation and has traditionally been a vital first stage in data mining. Data preparation approaches have been greatly enhanced to design, train, and perform inference against AI and ML models. Preprocessing puts data into a format that can be studied and handled more quickly and efficiently in data mining, machine learning, and other data science activities, and such techniques are often employed at the beginning of the ML and AI pipeline to provide correct results. The datasets are publicly available as CSV files, structured data files generated by a computer. The data was normalized using scikit-learn's StandardScaler(), which standardizes the features by subtracting the mean and scaling to unit variance. After the datasets are uploaded into Google COLAB in CSV format, missing, duplicated, and NA values are removed and the data is balanced.
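As a concrete illustration of this normalization step, a minimal sketch using a toy feature matrix (not the actual NSL-KDD or Kitsune columns):

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# toy numeric features standing in for dataset columns
X = np.array([[1.0, 200.0],
              [2.0, 400.0],
              [3.0, 600.0]])

# StandardScaler subtracts each column's mean and divides by its
# standard deviation, so every feature ends up with mean 0 and unit variance
X_scaled = StandardScaler().fit_transform(X)
```

After this transform, each feature column has zero mean and unit variance, so no single large-scale feature (e.g., byte counts) dominates the random hidden-layer projections.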
In addition, the one-hot encoding approach is employed to transform categorical values into numerical representations for ML algorithms. This approach converts each category in a categorical feature into a separate binary feature column, i.e., each category becomes a distinct binary vector. The main reason for employing one-hot encoding is that most ML algorithms cannot operate directly on categorical values.
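A minimal sketch of this encoding with pandas; the column name "protocol_type" is illustrative, borrowed from NSL-KDD's protocol field:

```python
import pandas as pd

# toy categorical column mimicking an NSL-KDD feature
df = pd.DataFrame({"protocol_type": ["tcp", "udp", "icmp", "tcp"]})

# each category becomes its own binary column,
# e.g. protocol_type_tcp, protocol_type_udp, protocol_type_icmp
encoded = pd.get_dummies(df, columns=["protocol_type"])
```

Exactly one of the generated binary columns is set per row, which is the "distinct binary vector" described above.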
4.2.3 Data balancing
A further preprocessing step balances the dataset using the SMOTE technique. SMOTE is one of the most extensively utilized oversampling techniques for dealing with imbalanced data problems: it equalizes the class distribution by generating synthetic minority-class samples, interpolating between existing minority instances rather than simply duplicating them. An imbalanced dataset is one in which the observed frequencies differ substantially among the possible outcomes of a categorical variable; essentially, there are many observations of one kind and few of another. The dataset can be balanced as follows:
• Over-sampling: random oversampling entails drawing random samples (with replacement) from the minority class and adding them to the training sample.
• Re-sampling/under-sampling: the inverse of random oversampling; it involves removing randomly chosen samples of the majority class from the training dataset.
• SMOTE algorithm: A strategy employed with unbalanced datasets to assist ML classification. It involves creating new
data from existing minority data and subsequently using it to supplement the dataset.
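In practice SMOTE is usually taken from the imbalanced-learn library; the interpolation idea itself can be sketched in a few lines of NumPy, as below. This is a simplified illustration of the technique, not the library implementation:

```python
import numpy as np

def smote_sample(X_min, n_new, k=5, rng=None):
    """Create n_new synthetic minority samples by interpolating between
    each chosen minority point and one of its k nearest minority
    neighbours -- the core idea behind SMOTE."""
    rng = np.random.default_rng(rng)
    # pairwise distances within the minority class
    d = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)                       # ignore self-distance
    neigh = np.argsort(d, axis=1)[:, :k]              # k nearest neighbours
    base = rng.integers(0, len(X_min), size=n_new)    # random base points
    nb = neigh[base, rng.integers(0, k, size=n_new)]  # one neighbour each
    gap = rng.random((n_new, 1))                      # interpolation factor
    return X_min[base] + gap * (X_min[nb] - X_min[base])
```

Because every synthetic point lies on a segment between two real minority points, the new samples stay inside the minority region rather than being exact duplicates.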
Data preparation is an important step in ML because data quality directly influences the efficacy of ML models. The
dataset must be updated and purified to fit the ML model during data preparation.
4.2.4 Data distribution
The chosen datasets are divided into 80% for training and 20% for testing for both binary and multi-class classification. The study then tries the following splits:
Vol:.(1234567890)
Discover Internet of Things (2024) 4:5 | https://doi.org/10.1007/s43926-024-00060-x Research
• 90/10 Split: Using only 90% of the dataset as training data fed to the model to learn from before it begins the predic-
tion process, the testing dataset consists of the remainder of the data.
• 80/20 Split: Using only 80% of the dataset as training data fed to the model to learn from before it begins the predic-
tion process, the testing dataset consists of the remainder of the data.
• 70/30 Split: Using only 70% of the dataset as training data fed to the model to learn from before it begins the predic-
tion process, the testing dataset consists of the remainder of the data.
• 60/40 Split: Using 60% of the dataset as training data fed to the model to learn from before it begins the prediction
process, the testing dataset consists of the remainder of the data.
• 50/50 Split: Using half of the dataset as training data fed to the model to learn from before it begins the prediction
process, the testing dataset consists of the remainder of the data.
• 40/60 Split: Using 40% of the dataset as training data fed to the model to learn from before it begins the prediction
process, the testing dataset consists of the remainder of the data.
• 30/70 Split: Using only 30% of the dataset as training data fed to the model to learn from before it begins the predic-
tion process, the testing dataset consists of the remainder of the data.
Employing various split percentages when partitioning both datasets, such as 90/10 and 80/20, enables a thorough assessment of ELM-IDS performance and generalizability. For instance, the 90/10 split maximizes training data, possibly improving model learning, while the smaller test set assesses its performance on previously
ously unknown data. The 80%/20% split, on the other hand, provides a larger test set for more rigorous performance
evaluation, guaranteeing that the ELM-IDS model doesn’t overfit and can generalize successfully. This method facilitates
hyperparameter adjustment, tests the model across many data subsets, and ensures that it remains effective in real-world
scenarios with varying data distributions.
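The splits above can be produced with scikit-learn's train_test_split; a sketch on placeholder arrays, where stratify keeps the class ratio identical in the train and test partitions:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# placeholder feature matrix and binary labels (500 samples)
X = np.arange(1000).reshape(500, 2)
y = np.array([0, 1] * 250)

splits = {}
for test_size in (0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7):
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=test_size, random_state=42, stratify=y)
    splits[test_size] = (len(X_tr), len(X_te))   # e.g. 0.2 -> (400, 100)
```

Fixing random_state makes each split reproducible, so metric differences across splits reflect the partition size rather than sampling noise.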
4.3 ELM-IDS model

Here, we define the ELM model, train it, and then compare its accuracy with six other models. Figure 10 represents the suggested ELM-IDS model for detecting IoT attacks in traffic. ELM contains a single layer of hidden nodes, where the weights between inputs and hidden nodes are randomly assigned, and the hidden layer applies an activation function. Table 5 compares common activation functions in ML; the ReLU activation function is used here. Referring to Table 6 and Fig. 10, the dataset is the input (X), which goes through training and pattern matching. Next, the IDS is employed to detect attacks, and the data passes through the ELM model. The ELM model evaluates the dataset within the ELM-IDS hidden layer and sends results to the testing process, which verifies the suggested model's effectiveness, performance, and accuracy. The testing results are compared with the cross-validation accuracy results: if they agree, the record is classified as attack or normal and sent to the output (Y); otherwise, the data is returned to the training phase. The ELM model is compared with the most common models using 80% training and 20% testing, and ELM-IDS is also evaluated under different splitting distributions to measure the metrics across distributions.
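The core of such an ELM (random, never-trained input weights, a ReLU hidden layer, and output weights solved in closed form with the Moore–Penrose pseudo-inverse) can be sketched as follows. This is a minimal illustration of the technique, not the authors' exact implementation:

```python
import numpy as np

class ELM:
    """Minimal single-hidden-layer ELM: random input weights and biases,
    ReLU activation, output weights solved by least squares."""

    def __init__(self, n_hidden=100, seed=0):
        self.n_hidden = n_hidden
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # ReLU activation on the randomly projected inputs
        return np.maximum(0.0, X @ self.W + self.b)

    def fit(self, X, y):
        # random input-to-hidden weights; these are never trained
        self.W = self.rng.normal(size=(X.shape[1], self.n_hidden))
        self.b = self.rng.normal(size=self.n_hidden)
        Y = np.eye(int(y.max()) + 1)[y]     # one-hot class targets
        H = self._hidden(X)
        # Moore-Penrose pseudo-inverse gives least-squares output weights
        self.beta = np.linalg.pinv(H) @ Y
        return self

    def predict(self, X):
        return (self._hidden(X) @ self.beta).argmax(axis=1)
```

Because the only "training" is a single pseudo-inverse, fitting avoids iterative backpropagation entirely, which is the speed property the paper exploits.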
Table 5 List of activation functions

• Sigmoid — Formula: 1 / (1 + e⁻ˣ). The sigmoid function reduces input values to an interval between 0 and 1 and is frequently employed in the output layer of binary classification models. Benefits: clean and unique; since the outputs fall within the range (0, 1), it is ideal for binary classification problems.
• SoftMax — Formula: exp(zᵢ) / Σⱼ exp(zⱼ). The SoftMax function is defined as a sigmoidal combination. Benefits: can handle a variety of classes; it normalizes the results between 0 and 1 across the classes.
• ReLU — Formula: max(0, x). The ReLU function passes positive inputs through unchanged and outputs 0 for negative values, providing the model with non-linearity. Benefits: effective in terms of computation; reduces the vanishing-gradient issue.
• Tanh — Formula: (eˣ − e⁻ˣ) / (eˣ + e⁻ˣ). The Tanh function is extremely similar to the sigmoid/logistic activation function, with the same S-shape but a −1 to 1 output range. Benefits: Tanh produces a zero-centered output, allowing outputs to be mapped as highly unfavorable, neutral, or extremely favorable; it is typically employed in neural-network hidden layers since its values range from −1 to 1, so the mean of the hidden layer is 0 or very near it, which aids data centering and makes learning in the following layer much easier.
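The four activation functions of Table 5 map directly to one-liners in NumPy; a small sketch:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))   # squashes input into (0, 1)

def softmax(z):
    e = np.exp(z - np.max(z))         # shift for numerical stability
    return e / e.sum()                # normalizes into a distribution

def relu(x):
    return np.maximum(0.0, x)         # identity for x > 0, else 0

def tanh(x):
    return np.tanh(x)                 # (e^x - e^-x) / (e^x + e^-x)
```

ReLU, used in the proposed model, is the cheapest of the four: a single elementwise maximum with no exponentials.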
Step 1 Input (X), where (X) is the dataset fed to the ELM model
Step 2 Train pattern matching, evaluate the dataset, and detect any attacks as well as missing or duplicated values
Step 3 After training, the model goes through the IDS to detect attacks and malicious traffic, followed by the ELM model and testing of the model, results, and performance
Step 4 After testing, the optimal step decides the behavior: attack or normal. Normal goes to the output (Y); else, return to the ELM model
Step 5 Repeat Steps 1–4 until the results from the ELM model are balanced, clean, and show normal behavior
Step 6 Output (Y)
4.4 ELM evaluation
In this stage, the performance of the proposed ELM-IDS model on the NSL-KDD and Distilled-Kitsune datasets is measured, and the performance evaluation findings for the suggested model based on various metrics are shown. A confusion matrix (Fig. 11) is utilized to validate the performance of the proposed ELM model. The confusion matrix evaluates the model's predictions using the following criteria: True Positives (TP): the record is an attack (Injected) and the model predicted it as an attack; True Negatives (TN): the record is normal (Authentic) and the model predicted it as normal; False Positives (FP): the record is normal but the model predicted it as an attack; and False Negatives (FN): the record is an attack but the model predicted it as normal.
Accordingly, several evaluation metrics can be calculated as follows:
• Accuracy represents the percentage of true positives and true negatives properly detected by the model:

Accuracy = (TP + TN) / (TP + FP + TN + FN)   (1)
• Precision: the proportion of true positives among all of the model's positive predictions; it assesses the likelihood of a procedure producing correct results:

Precision = TP / (TP + FP)   (2)
• Recall: estimates the number of right predictions made over the whole dataset, including positives the model missed; a high recall in an ELM-IDS model is therefore preferred:

Recall = TP / (TP + FN)   (3)
• F1-score calculates the percentage of correctly predicted events. It is the weighted harmonic mean of recall and precision:

F1-score = 2 × (Precision × Recall) / (Precision + Recall)   (4)
• False Positive Rate (FPR):

FPR = FP / (TN + FP)   (5)
• True Negative Rate (TNR):

TNR = TN / (TN + FP)   (6)
• Matthew's correlation coefficient (MCC): provides the most suitable single-value classification measure for summarizing a confusion or error matrix:

MCC = (TP × TN − FP × FN) / √((TP + FP)(TP + FN)(TN + FP)(TN + FN))   (7)
• Cohen's Kappa: examines the agreement, beyond coincidence, between two raters (real-world observations and the classification model) for evaluating the effectiveness of ML classification models:

K = 2 × (TP × TN − FN × FP) / ((TP + FP)(FP + TN) + (TP + FN)(FN + TN))   (8)
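Equations (1)–(8) translate directly into code. A small helper computing all of them from raw confusion-matrix counts — a sketch, assuming non-degenerate counts (no zero denominators) and taking TNR as TN/(TN + FP):

```python
import math

def confusion_metrics(tp, tn, fp, fn):
    """Classification metrics computed from confusion-matrix counts."""
    acc = (tp + tn) / (tp + tn + fp + fn)           # accuracy
    prec = tp / (tp + fp)                           # precision
    rec = tp / (tp + fn)                            # recall
    f1 = 2 * prec * rec / (prec + rec)              # harmonic mean
    fpr = fp / (tn + fp)                            # false positive rate
    tnr = tn / (tn + fp)                            # true negative rate
    mcc = (tp * tn - fp * fn) / math.sqrt(          # Matthews corr. coeff.
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    kappa = 2 * (tp * tn - fn * fp) / (             # Cohen's kappa
        (tp + fp) * (fp + tn) + (tp + fn) * (fn + tn))
    return {"accuracy": acc, "precision": prec, "recall": rec, "f1": f1,
            "fpr": fpr, "tnr": tnr, "mcc": mcc, "kappa": kappa}
```

For example, confusion_metrics(tp=50, tn=40, fp=5, fn=5) yields an accuracy of 0.9.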
4.5 Classification process
In the classification process, traffic records are classified into normal or attack for binary classification. The multi-class process assigns each record to a dedicated traffic category: in the Kitsune dataset {normal, OS Scan attack, Fuzzing attack, Video Inj attack, ARP attack, Wiretap attack, SSDP F attack, SYN DoS attack, SSLR attack, and Mirai attack} and in the NSL-KDD dataset {normal, DoS attacks, probe attacks, R2L attacks, and U2R attacks}. Afterward, the data goes through the ELM model and the steps repeat until the process is complete and the goal is met. Cross-validation serves as a statistical approach to estimating the ELM model's performance; it prevents overfitting in a prediction model, especially when the available data is limited.
Cross-validation methods are either exhaustive or non-exhaustive; the non-exhaustive family contains the holdout method and the K-fold approach, and the non-exhaustive K-fold approach is employed here. Figure 12 illustrates the five-fold cross-validation concept and how cross-validation works. K-fold cross-validation improves on the holdout method by guaranteeing that the model's score does not depend on how the train and test sets are chosen: the dataset is partitioned into k subsets and the holdout procedure is performed k times. Step by step, randomly divide the whole dataset into k folds (subsets), then:
1. Build the ELM model on k−1 folds of the dataset.
2. The model is then tested to see how effective it is for the kth fold.
3. Repeat until all k-folds are utilized as the test set.
4. The cross-validation accuracy is the average of all recorded accuracies.
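The steps above can be sketched with scikit-learn's KFold. The data and classifier here are toy stand-ins (the paper's actual model is the ELM):

```python
import numpy as np
from sklearn.model_selection import KFold
from sklearn.neighbors import KNeighborsClassifier

# toy data standing in for the preprocessed IDS features and labels
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (100, 5)),
               rng.normal(3.0, 1.0, (100, 5))])
y = np.array([0] * 100 + [1] * 100)

scores = []
for train_idx, test_idx in KFold(n_splits=5, shuffle=True,
                                 random_state=0).split(X):
    # build the model on k-1 folds, test on the held-out fold
    model = KNeighborsClassifier().fit(X[train_idx], y[train_idx])
    scores.append(model.score(X[test_idx], y[test_idx]))

# cross-validation accuracy: the average of all recorded accuracies
cv_accuracy = float(np.mean(scores))
```

Every sample is used for testing exactly once, which is why the averaged score is less sensitive to a lucky or unlucky single split.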
4.6 Datasets
This subsection covers the datasets' characteristics, records, and content; both datasets are open and can be used freely. They contain no personally identifiable data; however, they are required to tackle specific problems such as categorization. The study works with the NSL-KDD 2009 and Distilled-Kitsune 2021 datasets and applies them within ELM, using Google COLAB and Python, to detect attacks in both datasets. Binary and multi-class classification determine the various attacks within both datasets: binary classification separates normal from attack across the two datasets, and the total accuracy is then computed from the binary and multi-class confusion matrices.
4.6.1 NSL‑KDD dataset
First, the NSL-KDD 2009 dataset, which features five classes, is selected. NSL-KDD is an updated and improved version of the KDD'99 (KDD CUP 99) dataset and a useful benchmark for comparing different intrusion detection algorithms. NSL-KDD includes 42 characteristics, listed in Table 7; these 42 attributes can be fed into the characteristic generator (selected sample) to acquire any key feature. Multiple attacks occur within NSL-KDD; they are classified in Table 8 into five classes (Normal, Probe, U2R, R2L, and DoS) covering 40 types of attacks, which should be detectable through the NSL-KDD dataset. Furthermore, the binary distribution for NSL-KDD is 48% attack and 52% normal.
4.6.2 Distilled‑Kitsune dataset

The Kitsune dataset has been uploaded; the CSV file contains 116 features and nine attack classes from the Kitsune dataset, and it is used to detect attacks. Table 9 illustrates the classes and numbers of malicious records within the Kitsune dataset. Furthermore, the binary distribution for Distilled-Kitsune is 8% attack and 92% normal.
With ELM, most researchers measure accuracy or other metrics to achieve a high accuracy rate; a difference in [58] is that the authors propose and investigate an analysis of the errors obtained through least-squares estimation using the G-ELM model.

5 Results and discussion

For this central part of the study, the Kitsune and NSL-KDD datasets were chosen to detect attacks and abnormal activities, and the section is categorized into subsections to show clear results of the ELM-IDS experiments. It starts with the NSL-KDD results subsection, where all results and brief commentary are provided; the Distilled-Kitsune subsection then provides its results; lastly, the comparison subsection presents a thorough comparison between the NSL-KDD dataset, the Distilled-Kitsune dataset, and other related research.
5.1 NSL‑KDD results
Table 8 represents the multi-class data distribution for the five classes, containing 40 attacks ordered ascending; extraction, attack detection, and metric evaluation are needed to test and train the ELM-IDS model and achieve the goals of this study. Table 10 provides a metric-evaluation comparison for train and test, binary and multi-class; nine evaluation metrics (Accuracy, F1-score, Recall, Precision, FPR, TNR, MCC, Cohen's Kappa, and Log Loss) are measured. Figure 13 illustrates balancing the dataset with the SMOTE algorithm, where (a) shows balancing for the multi-class case (random oversampling, random undersampling, and the SMOTE algorithm) and (b) balancing for the binary class.
Furthermore, Fig. 14 illustrates the train and test classification reports for multi-class cross-validation, and Fig. 15 illustrates the multi-class cross-validation confusion matrices for the proposed ELM-IDS model. As mentioned before, the NSL-KDD multi-class setting contains the five classes of Table 8, labeled DoS, Probe, R2L, U2R, and Normal, the attack types in the context of the NSL-KDD dataset. In Fig. 15, (a) provides the test multi-class cross-validation confusion matrix and (b) the train multi-class cross-validation confusion matrix. Also, Fig. 16 shows the binary cross-validation confusion matrices for the model suggested by the ELM-IDS: (a) provides the train binary matrix, while (b) represents the test binary matrix; both contain two classes (normal and attack).
5.2 Distilled‑Kitsune results
Moreover, Fig. 17 shows balancing the Distilled-Kitsune dataset using the SMOTE algorithm, where (a) balances the multi-class case and (b) the binary (two-class) case; random over-sampling and random under-sampling are initialized first, then the SMOTE algorithm is applied for both (a) and (b). Cross-validation evaluation metrics for train and test, binary and multi-class, are compared in Table 11, which shows the calculation of the nine evaluation metrics (Accuracy, F1-score, Recall, Precision, FPR, TNR, MCC, Cohen's Kappa, and Log Loss). Furthermore, Fig. 18 illustrates the train and test classification reports for multi-class cross-validation. Figure 19 illustrates the Distilled-Kitsune multi-class confusion matrices for the suggested ELM-IDS model, where (a) refers to the test multi-class cross-validation confusion matrix and (b) to the train matrix; the multi-class Distilled-Kitsune setting contains ten classes, as mentioned in Table 10. Figure 20 provides the Distilled-Kitsune binary cross-validation confusion matrices, where (a) is the train binary matrix and (b) the test binary matrix.
5.3 Comparison results

This subsection compares the results for the NSL-KDD and Distilled-Kitsune datasets. The epochs for ELM model testing and cross-validation on both datasets are compared, then the ELM model is shown under various splits; a further comparison of the ELM model against three common models checks whether ELM's performance is better. Table 12 shows the results of testing and cross-validation for the ELM model.
In ML, especially when training neural networks, an epoch is one full run over the training dataset: an essential notion in the iterative optimization techniques used to update model parameters to decrease the specified loss function. Each epoch represents a complete cycle through the training dataset, allowing the model to acquire knowledge and adjust its parameters. The total number of epochs plays a vital role in model performance and adaptation; balancing fit and avoiding overfitting was important in choosing the optimal number of epochs during model training.
Figure 21 describes the ELM-IDS model within various splits for NSL-KDD and Distilled-Kitsune.
Figure 22 illustrates the comparison of ELM with various ML models for the 80% training / 20% testing split, comparing accuracy, F1-score, recall, precision, MCC, and Cohen's Kappa. Across this comparison, the ELM model is faster than the other ML models while offering good generalization performance and improved efficiency and accuracy. ELM is compared with DT, RF, and KNN (B means binary class and M means multi-class); the results are shown in Table 13, which gives the overall comparison of ELM with the other models on both datasets for binary and multi-class classification.
Furthermore, Fig. 23 provides an overall statistical comparison of the missing values and duplicated records within the NSL-KDD and Distilled-Kitsune datasets: 0 and 19 missing values, and 629 and 6001 duplicated records, respectively.
Figure 24 illustrates the time complexity for both datasets, where (a) and (b) are for the Distilled-Kitsune dataset (multi-class and binary class, respectively), and (c) and (d) are the multi-class and binary results for the NSL-KDD dataset. The time complexity is measured in seconds.
Fig. 21 ELM-IDS within various splits of data, where a 90/10 split, b 80/20 split, c 70/30 split, d 60/40 split, e 50/50 split, and finally f 40/60 split
ELM-IDS is compared with 33 existing models from past years, as Table 14 represents. In [78], the authors aim to enhance the accuracy of IDS systems: a method for feature selection for use in IDS was proposed and applied to 10% of the KDD dataset, and the algorithm proposed in that paper was shown to be more efficient than those proposed in previous papers. Furthermore, the researchers in [79] provide an effective approach for executing Generalized Discriminant Analysis (GDA) on big datasets with many training samples; GDA outperforms PCA regarding detection rate, false positives, and training/testing time, and when two classifiers were compared, C4.5 performed better in all classes (Normal, DoS, R2L, U2R, and Probe). Moreover, in [81], the authors suggest DoH-IDS models on the CIRA-CIC-DoHBrw-2020 dataset; the model achieved 100% accuracy and F1-score, offering a novel hybrid learning-based double-stage strategy to detect malicious DoH traffic, and the authors plan to develop an unsupervised machine-learning-based DoH IDS model to alleviate the limitation of requiring labeled data. Finally, in [84], the authors utilize ELM in IoT networks for cybersecurity, offering the Fusion Extreme Learning Machine with Improved Multi-Feature Integration (FELM-EMFI) for protecting communication channels; with a high accuracy of 92.13%, the methodology outperforms traditional approaches such as BP, SVM, and ELM, indicating potential for practical cybersecurity applications and providing an improved solution to strengthen network security and IDSs.
In short, the study demonstrates the potential of ensemble approaches in improving accuracy and efficiency
within IDS employing the ELM-IDS model.
Fig. 22 ELM-IDS vs. other ML models (RF, KNN, and DT), where a shows the F1-score, b accuracy, c recall, d precision, e the MCC metric, and f Cohen's kappa
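All six evaluation metrics used in the comparison (accuracy, recall, F1-score, MCC, precision, and Cohen's kappa) are available in scikit-learn. A minimal sketch with hypothetical label vectors (1 = attack, 0 = benign):

```python
from sklearn.metrics import (accuracy_score, cohen_kappa_score, f1_score,
                             matthews_corrcoef, precision_score, recall_score)

# Hypothetical ground-truth and predicted labels for illustration.
y_true = [0, 0, 0, 1, 1, 1, 1, 0]
y_pred = [0, 0, 1, 1, 1, 1, 0, 0]

metrics = {
    "Accuracy":      accuracy_score(y_true, y_pred),
    "Recall":        recall_score(y_true, y_pred),
    "F1-score":      f1_score(y_true, y_pred),
    "MCC":           matthews_corrcoef(y_true, y_pred),
    "Precision":     precision_score(y_true, y_pred),
    "Cohen's kappa": cohen_kappa_score(y_true, y_pred),
}
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")  # e.g. Accuracy: 0.7500
```

MCC and Cohen's kappa are particularly informative for unbalanced intrusion datasets, since both account for chance agreement rather than raw hit rate.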
6 Conclusions
This paper examines the design, implementation, and assessment of ELM, a single-hidden-layer neural network belonging to the ML/AI family. An ELM-based IDS (ELM-IDS) is proposed to identify attacks and is implemented using Python and Google COLAB. Before data distribution, the study verifies that all data is accurate and quantitative, then trains and tests the ELM model. Our findings show that ensemble approaches deliver greater accuracy
Accuracy
KNN      88.60%  99.36%  99.33%  99.47%
DT       96.71%  99.38%  99.42%  99.86%
RF       98.70%  99.55%  99.50%  99.50%
ELM-IDS  92.50%  99.63%  99.98%  99.90%

Recall
KNN      88.60%  98.35%  99.30%  99.40%
DT       96.70%  99.35%  99.42%  99.80%
RF       98.69%  99.54%  99.50%  99.72%
ELM-IDS  92.50%  99.62%  99.80%  99.98%

F1-score
KNN      87.90%  98.35%  99.30%  99.40%
DT       98.69%  99.54%  99.50%  99.72%
RF       99.68%  99.36%  99.41%  99.68%
ELM-IDS  92.45%  99.62%  99.98%  99.90%

MCC
KNN      85.30%  96.75%  98.25%  98.96%
DT       95.91%  98.77%  99.30%  99.73%
RF       98.37%  99.19%  99.47%  99.44%
ELM-IDS  90.65%  99.25%  99.87%  99.85%

Precision
KNN      98.38%  98.38%  99.29%  99.46%
DT       99.38%  99.38%  99.41%  99.79%
RF       98.69%  99.55%  99.50%  99.50%
ELM-IDS  92.50%  99.62%  99.90%  99.90%

Cohen's kappa
KNN      85.08%  96.72%  99.26%  98.95%
DT       95.88%  98.76%  99.36%  99.72%
RF       98.36%  99.08%  99.40%  99.42%
ELM-IDS  90.63%  99.25%  99.85%  99.80%

Bold values are the results corresponding to the proposed ELM-based IDS
percentages while requiring less time. In addition, data is trained with the ELM model and then contrasted with seven common models; ELM was tested against six algorithms in this experiment for improved accuracy. The outcomes of the binary and multi-class classification stages were then compared: ELM obtained 99.78% accuracy in binary classification and 99.90% in multi-class classification. Finally, ELM-IDS provides a potentially novel approach to addressing the growing cybersecurity challenges of our interconnected world and has proven capable of providing customizable and efficient intrusion detection. In future work, we may test more IoT datasets, compare ELM with additional models, and work to enhance accuracy on the NSL-KDD multi-class task.
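As background for the conclusion, the core ELM training step (a frozen random hidden layer followed by a least-squares output layer) can be sketched in a few lines of NumPy. This is a generic single-hidden-layer ELM with hypothetical synthetic data, not the paper's exact implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

def elm_fit(X, y_onehot, n_hidden=64):
    """Train a basic ELM: random input weights, least-squares output weights."""
    W = rng.normal(size=(X.shape[1], n_hidden))   # random input weights (frozen)
    b = rng.normal(size=n_hidden)                 # random biases (frozen)
    H = np.tanh(X @ W + b)                        # hidden-layer activations
    beta = np.linalg.pinv(H) @ y_onehot           # Moore-Penrose pseudo-inverse
    return W, b, beta

def elm_predict(X, W, b, beta):
    H = np.tanh(X @ W + b)
    return (H @ beta).argmax(axis=1)              # class with highest score

# Tiny synthetic two-class example (stand-in for IDS features).
X = rng.normal(size=(200, 10))
y = (X[:, 0] + X[:, 1] > 0).astype(int)
Y = np.eye(2)[y]                                  # one-hot targets
W, b, beta = elm_fit(X, Y)
acc = (elm_predict(X, W, b, beta) == y).mean()
print(f"training accuracy: {acc:.2f}")
```

Because only the output weights are solved for (in closed form via the pseudo-inverse), training requires no iterative backpropagation, which is the source of ELM's speed advantage highlighted throughout the paper.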
Table 14 Comparison with other existing ELM-based models

| Ref | Year | Model | # of classes | Dataset | Measurement values | Remarks |
|---|---|---|---|---|---|---|
| [18] | 2023 | ELM forecasting method | Multi-class | Sales dataset digitized from SAP | Accuracy 111% | Forty-three rows and six parameter features |
| [20] | 2023 | Mixture-ELM | Multi-class | EAGLE dataset | MAPE attaining a value of 0.187% | The dataset comprises 1,184 simulations of 990 time steps (33 s at 30 fps) |
| [23] | 2023 | ELM-Net | Two-class | CICDDoS2019 | ELM-Net achieved 99.62% accuracy | They constructed the intangible behavior of 25 users employing the HTTP, HTTPS, FTP, SSH, and email protocols |
| [25] | 2023 | Annealing-ELM | Two-class | Diabetes dataset | Acceleration of 3.48× and 38.33× is achieved | The dataset contains 442 samples with ten features |
| [57] | 2023 | MSCNN-ELM | Multi-class & two-class | CWRU dataset | F1-score reached 0.9541 for binary classification and 0.9555 for multi-class classification | The dataset comprises 25 characteristics, most of which are lag features |
| [60] | 2022 | OS-ELM | Two-class | CICDDoS2019 dataset | OS-ELM achieved 97.19% accuracy, 2.81% error rate, 0.985 AUC, 5.4 s time, and 0.971 F1-score | The CICDDoS2019 dataset contains evidence of application-, transport-, and network-layer attacks |
| [61] | 2022 | RA-ELM | Two-class | DDoS attack dataset | Given a significance value of 0.03 (p < 0.05), the RA-ELM method is 84.6% and ELM is 90.7% | The balanced dataset includes 200,000 normal and 200,000 DDoS network traffic incidents |
| [62] | 2022 | ELM & HMM models | Multi-class | UNSW-NB15 dataset | Accuracy rate of 0.9754 and FPR of 0.0276 | UNSW-NB15 contains nine types of attacks, 49 features, and 2,540,044 samples |
| [63] | 2022 | Kernel-ELM | Two-class | KDD99 dataset | The average test accuracy is 98.3165%, and the testing time is 0.7791 s | The KDD99 dataset contains 41 features and five classes |
| [64] | 2021 | ELM model | Three classes | Heart Rate Variability dataset | ELM accuracy 0.878 | The authors used a dataset containing 5000 samples, 35 features, and three classes |
| [65] | 2022 | ELM-RNN model | Two-class | NSL-KDD dataset | The accuracy of detection of DDoS attacks is 99% | The NSL-KDD dataset includes 42 features and five classes |
| [66] | 2021 | DL-ELM model | Two-class | NSL-KDD dataset | The accuracy of the ELM model is 85.25% | |
| [67] | 2020 | RTS-DELM-CSIDS model | Four classes | NSL-KDD dataset | The data was randomly separated into 70% training (103,962 samples) and 30% validation (44,554 samples), with an accuracy of 96.22% | |
| [68] | 2021 | DL-ELM model | Two-class | NSL-KDD dataset | The researchers divided the data into 70% train and 30% test, reaching an overall accuracy of 92.04%; the recommended DL-ELM model attained 91.23% accuracy | |
| [69] | 2020 | FRCG-KELM model | Two-class | NSL-KDD dataset | The FRCG-KELM model outperformed the others | |
| [70] | 2019 | ELM-IDS model | Multi-class | NSL-KDD dataset | Accuracy is 94%, and it took 28.48 s to complete | |
| [77] | 2024 | Fusion Extreme Learning Machine with enhanced multi-feature integration (FELM-EMFI) | Multi-class | A single unified dataset | ELM accuracy is 90.04%; FELM-EMFI is 92.13% | Features are fused using concatenation, weighted averaging, or other fusion strategies; Principal Component Analysis (PCA) is used for feature extraction |
| [80] | 2024 | ELM model | Two-class | Twitter Spam Detection dataset | The proposed ELM model achieved 98.84% accuracy, 83.3% precision, 96.15% recall, and 88.001% F1-score | Spam refers to unwanted content posted by known fake Twitter accounts, including politically motivated messages, automatically generated content, meaningless posts, and clickbait |
| [82] | 2024 | ELM model & PCA-KELM model | Multi-class | Fault-type classification and faulty-line identification in power systems | The ELM model achieves an accuracy of 96.36%, and the PCA-KELM approach achieves an accuracy of 99.46% | The combined PCA-KELM results show better accuracy than the individually employed ELM model |
| [83] | 2024 | KHO-ELM and W-ELM | Multi-class | NSL-KDD dataset | The accuracy is 87%, and the false positive rate equals 2 | The research contributes valuable insights into the application of meta-heuristic algorithms for real-world problem-solving |
Author contributions Shahad Altamimi: Methodology, Data Curation, Visualization, Investigation, Software, Resources, Writing—Original Draft Preparation, Writing—Reviewing and Editing. Qasem Abu Al-Haija: Conceptualization, Methodology, Validation, Supervision, Project Administration, Funding Acquisition, Writing—Original Draft Preparation, Writing—Reviewing and Editing.
Data availability The data associated with this research is publicly available and can be retrieved via: (1) NSL-KDD Dataset: Canadian Institute for Cybersecurity (CIC). Available online: https://www.unb.ca/cic/datasets/nsl.html (2) Distilled-Kitsune Dataset: Mendeley Data. 2018. Available online: https://data.mendeley.com/datasets/zvsk3k9cf2/1.
Declarations
Competing interests The authors declare no competing interests.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adapta-
tion, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source,
provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article
are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in
the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will
need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
Appendix A
References
1. Cheng M, Tang H, Khan A, Syam MS, Assam M, Li D, Uzair AB. A digital twin-based visual servoing with extreme learning machine and
differential evolution. Int J Intell Syst. 2023. https://doi.org/10.1155/2023/6639963.
2. Li X, Li X, Wang F, Zhu D. A detection algorithm of malicious code variants based on extreme learning. In: 2022 IEEE international
conference on advances in electrical engineering and computer applications (AEECA), Dalian, China, 2022, pp. 262–266, https://doi.org/10.1109/AEECA55500.2022.9919022
3. Alsulami AA, Abu Al-Haija Q, Tayeb A, Alqahtani A. An Intrusion detection and classification system for IoT traffic with improved data
engineering. Appl Sci. 2022. https://doi.org/10.3390/app122312336.
4. Wang J, Lu S, Wang SH, et al. A review on extreme learning machine. Multimed Tools Appl. 2022;81:41611–60. https://doi.org/10.
1007/S11042-021-11007-7.
5. Hariprasad S, Deepa T, Bharathiraja N. Detection of DDoS attacks in IoT networks using sample selected RNN-ELM. Intell Autom Soft
Comput. 2022;34(3):1425.
6. Wang X, Tu S, Zhao W, Shi C. A novel energy-based online sequential extreme learning machine to detect anomalies over real-time
data streams. Neural Comp and Appl. 2022. https://doi.org/10.1007/s00521-021-05731-2.
7. Droos Q, Al-Haija A, Alnabhan M. Lightweight detection system for low-rate DDoS attack on software-defined-IoT. In: 6th Smart Cities
Symposium (SCS 2022), Hybrid Conference, Bahrain, 2022, pp. 157-162. https://doi.org/10.1049/icp.2023.0388
8. Gao J, Chai S, Zhang C, Zhang B, Cui L. A novel intrusion detection system based on extreme machine learning and multi-voting
technology. In proceedings of 2019 Chinese control conference (CCC). IEEE, 2019. 8909–8914.
9. Zhang X, Yang J, Zhao Y. Numerical solution of time fractional black-scholes model based on legendre wavelet neural network with
extreme learning machine. Fractal Fract. 2022;6:401. https://doi.org/10.3390/fractalfract6070401.
10. Abu Al-Haija Q, Al Badawi A. High-performance intrusion detection system for networked UAVs via deep learning. Neural Comput
Appl. 2022. https://doi.org/10.1007/s00521-022-07015-9.
11. Kim M. Theoretical bounds to generalization error for generalized extreme learning machine. SSRN 4178565. 2022.
12. He Y, Ye X, Huang JZ, Viger PF. Bayesian attribute bagging-based extreme learning machine for high-dimensional classification and
regression. ACM Trans Intell Syst Technol. 2022;13(2):30.
13. Phoophiwfa T, Laosuwan T, Volodin A, Papukdee N, Suraphee S, Busababodhin P. Adaptive parameter estimation of the generalized
extreme value distribution using an artificial neural network approach. Atmosphere. 2023. https://doi.org/10.3390/atmos14081197.
14. He C, Kang H, Yao T, Li X. An effective classifier based on convolutional neural network and regularized extreme learning machine.
Math Biosci Eng. 2019;16(6):8309–21. https://doi.org/10.3934/mbe.2019420.
15. Kurniawan H, Triloka J, Ardhan Y. The artificial neural network approach is analyzed using the extreme learning machine method for
mining sales forecasting development. Int J Adv Comput Sci Appl. 2023. https://doi.org/10.14569/IJACSA.2023.0140179.
16. Abu Al-Haija Q. Top-down machine learning-based architecture for cyberattacks identification and classification in IoT communica-
tion networks. Front Big Data. 2022. https://doi.org/10.3389/fdata.2021.782902.
17. Cao Bo, Zhang K, Wei Bo, Chen L. Status quo and prospects of artificial neural network from the perspective of gastroenterologists.
World J Gastroenterol. 2021;27:2681–709. https://doi.org/10.3748/wjg.v27.i21.2681.
18. Vásquez J, Mora M, Vilches K. A Review of multilayer extreme learning machine neural networks. Artif Intell Rev. 2023. https://doi.
org/10.1007/s10462-023-10478-4.
19. Kumar L, Hota C, Mahindru A, Neti LBM. Android malware prediction uses an extreme learning machine with different kernel func-
tions, In: Proceedings of the 15th Asian Internet Engineering Conference. 2019. pp. 33–40.
20. Wang Y, Dong S. An extreme learning machine-based method for computational PDEs in higher dimensions. Amsterdam: Elsevier;
2023.
21. Kale P, Sonavane S. PF-FELM: a robust PCA feature selection for fuzzy extreme learning machine. IEEE J Sel Top Signal Process.
2018;12(6):1303–12. https://doi.org/10.1109/JSTSP.2018.2873988.
22. Sun H, Hou M, Yang Y, Zhang T, Weng F, Han F. Solving partial differential equation based on bernstein neural network and extreme learn-
ing machine algorithm. Neural Process Lett. 2019. https://doi.org/10.1007/s11063-018-9911-8.
23. Dharanidharana E, Parthipanb V. A novel analysis of network traffic in distributed denial of service (DDoS) attack to improve accuracy
using extreme learning machine algorithm over regression algorithm. Adv Parallel Comput Algorithms, Tools Paradig. 2022;41:265.
24. Quan H, Huynh H. Solving partial differential equations based on extreme learning machine. Math Comput Simul. 2022. https://doi.org/
10.1016/j.matcom.2022.10.018.
25. Wang K, Li J. An intrusion detection method integrating KNN and transfer extreme learning machine. In: 2022 2nd Asia-Pacific confer-
ence on communications technology and computer science, Shenyang, China, 2022. pp. 221–226, https://doi.org/10.1109/ACCTCS53867.2022.00053.
26. Li M, Sun Q, Liu X. Data distribution based weighted extreme learning machine. In: 2019 4th international conference on machine
learning technologies (ICMLT ’19). ACM, NY, USA, 1–6. https://doi.org/10.1145/3340997.3340998.
27. Shen Y, Zheng K, Wu C. A hybrid PSO-BPSO based kernel extreme learning machine model for intrusion detection. J Inf Process Syst.
2022;18(1):146–58.
28. Liu X, Zhou Y, Meng W, Luo Q. Functional extreme learning machine for regression and classification. Math Biosci Eng. 2022;20:3768–
92. https://doi.org/10.3934/mbe.2023177.
29. Abu Al-Haija Q, Alsulami AA. Detection of fake replay attack signals on remote keyless controlled vehicles using pre-trained deep
neural network. Electronics. 2022;11:3376. https://doi.org/10.3390/electronics11203376.
30. Li Z et al. Research on DDoS attack detection based on ELM in IoT environment. In: 2019 IEEE 10th International conference on soft-
ware engineering and service science (ICSESS), Beijing, China, 2019. pp. 144–148, https://doi.org/10.1109/ICSESS47205.2019.9040855.
31. Kumar R, Singh MP, Roy B, Shahid AH. A comparative assessment of metaheuristic optimized extreme learning machine and deep
neural network in multi-step-ahead long-term rainfall prediction for all-Indian regions. Water Resour Manage. 2021;35(6):1927–60.
32. Zheng X, Ye Z, Su J, Chen H, Wang R. Network intrusion detection based on hybrid rice algorithm optimized extreme learning machine.
In: 2018 IEEE 4th International symposium on wireless systems within the international conferences on intelligent data acquisition
and advanced computing systems (IDAACS-SWS), Lviv, Ukraine, 2018. pp. 149–153. https://doi.org/10.1109/IDAACS-SWS.2018.8525587
33. Yang Y, Hou M, Luo J. A novel improved extreme learning machine algorithm for solving ordinary differential equations using legendre
neural network methods. Adv Differ Equ. 2018;2018(1):1–24.
34. Abu Al-Haija Q, Mohamed O, Abu Elhaija W. Predicting global energy demand for the next decade: a time-series model using nonlinear
autoregressive neural networks. Energy Explor Exploit. 2023;41:1–15. https://doi.org/10.1177/01445987231181919.
35. Tiffany S, Sarwinda D, Handari B, Hertono G. The comparison between extreme learning machine and artificial neural network-back
propagation for predicting the number of dengue incidences in DKI Jakarta. J Phys: Conf Ser. 2021;1821:012025. https://doi.org/10.
1088/1742-6596/1821/1/012025.
36. Perangin-Angin DJ, Bachtiar FA. Classification of stress in office work activities using extreme learning machine algorithm and one-way
ANOVA F-test feature Selection. In: 2021 4th international seminar on research of information technology and intelligent systems,
Indonesia, 2021. pp. 503-508 https://doi.org/10.1109/ISRITI54043.2021.9702802
37. Zhang W, Li J, Huang S, Wu Q, Liu S, Li B. Application of multi-scale convolutional neural networks and extreme learning machines
in mechanical fault diagnosis. Machines. 2023;11:515. https://doi.org/10.3390/machines11050515.
38. Alade O, Selamat A, Sallehuddin R. A review of advances in extreme learning machine techniques and its applications. Berlin: Springer;
2018. https://doi.org/10.1007/978-3-319-59427-9_91.
39. Siyuan L, Shuaiqi L, Shui-hua W, Yu-dong Z. Cerebral microbleed detection via convolutional neural network and extreme learning
machine. Front Comput Neurosci. 2021;15:738885.
40. Xiao J, Zou G, Xie J, Qiao L, Huang B. Identification of shaft orbit based on the grey wolf optimizer and extreme learning machine. In:
2018 2nd IEEE advanced information management, communicates, electronic and automation control Conf., Xi’an, China, 2018. pp.
1147-1150. https://doi.org/10.1109/IMCEC.2018.8469198
41. Kumari MTM, Karimy AU. Intelligent intrusion detection system using deep learning and extreme machine learning algorithms. 2021.
42. Albadra MAA, Tiuna S. Extreme learning machine: a review. Int J Appl Eng Res. 2017;12(14):4610–23.
43. Ali MH, Jaber MM. Comparison between extreme learning machine and fast learning network based on intrusion detection system
(No. 5103). 2021.
44. Abu Al-Haija Q, Al-Fayoumi M. An intelligent identification and classification system for malicious uniform resource locators (URLs).
Neural Comput & Applic. 2023;35:16995–7011. https://doi.org/10.1007/s00521-023-08592-z.
45. Khan MA, et al. Enhance intrusion detection in computer networks based on deep extreme learning machine. Comput Mater Contin.
2021;66(1):467–80.
46. Cömert Z, Kocamaz AF, Güngör S. Cardiotocography signals with artificial neural network and extreme learning machine. In: 2016
24th Signal processing and communication application conference, Zonguldak, Turkey, 2016. pp. 1493–1496. https://doi.org/10.1109/SIU.2016.7496034
47. Sun Y, Xie Y, Qiu Z, Pan Y, Weng J, Guo S, Detecting android malware based on extreme learning machine. In: 2017 IEEE 15th intl conf
on dependable, autonomic and secure computing, 15th intl conf on pervasive intelligence and computing, 3rd intl conf on big data
intelligence and computing and cyber science and technology.
48. Roy B, Singh MP, Kaloop MR, Kumar D, Hu J-W, Kumar R, Hwang W-S. Data-driven approach for rainfall-runoff modelling using equi-
librium optimizer coupled extreme learning machine and deep neural network. Appl Sci. 2021;11(13):6238. https://doi.org/10.3390/
app11136238.
49. Nilesh R, Sunil W. Improving extreme learning machine through optimization a review. In: 2021 7th international conference on
advanced computing and communication systems (ICACCS), Coimbatore, India, 2021. pp. 906–912, https://doi.org/10.1109/ICACCS51430.2021.9442007.
50. Nuha HH, Balghonaim A, Liu B, Mohandes M, Deriche M, Fekri F. Deep neural networks with extreme learning machines for seismic
data compression. Arab J Sci Eng. 2020;45(3):1367–77.
51. Almeida EG, Mora M, Huerfano Y, Jurado JS, Jeraldo NM, Yavina RL, Moreno YB, Tobar L. Estimating the optimal number of neurons
in an extreme learning machine using simulated annealing and the golden section. J Phys: Conf Ser. 2023;2515:012003. https://doi.
org/10.1088/1742-6596/2515/1/012003.
52. Zhang Z, Hou R, Yang J. Detection of social network spam based on improved extreme learning machine. IEEE Access.
2020;8:112003–14.
53. Al-Haija QA. Cost-effective detection system of cross-site scripting attacks using hybrid learning approach. Results Eng.
2023;19:101266.
54. Aljawarneh S, Aldwairi M, Bani Yassein M. Anomaly-based intrusion detection system through feature selection analysis and building
a hybrid efficient model. J Comput Sci. 2017. https://doi.org/10.1016/j.jocs.2017.04.009.
55. Liu JF, Liu Y, Lu Y. The field terrain recognition based on extreme learning machine using wavelet features. In: 2017 IEEE international conference on mechatronics and automation, Takamatsu, Japan, 2017, pp. 1947–1951, https://doi.org/10.1109/ICMA.2017.8016116.
56. Qu BY, Lang BF, Liang JJ, Qin AK, Crisalle OD. Two-hidden-layer extreme learning machine for regression and classification. Neuro-
computing. 2016;175:826–34.
57. Haider A, Adnan Khan M, Rehman A, Rahman M, Seok KH. A real-time sequential deep extreme learning machine cybersecurity
intrusion detection system. Comput Mater Contin. 2020;66:1785–98.
58. Faris H, Habib M, Almomani I, Eshtay M, Aljarah I. Optimizing extreme learning machines using chains of salps for efficient Android
ransomware detection. Appl Sci. 2020;10(11):3706.
59. Shahid H, Singh MP, Roy B, Aadarsh A. Coronary artery disease diagnosis using feature selection based hybrid extreme learning
machine. In: 2020 3rd international conference on information and computer technologies (ICICT), San Jose, CA, USA, 2020. pp.
341–346, https://doi.org/10.1109/ICICT50521.2020.00060.
60. Surantha N, Gozali ID. Evaluation of the improved extreme learning machine for machine failure multi-class classification. Electronics.
2023;12(16):3501. https://doi.org/10.3390/electronics12163501.
61. Zheng D, Hong Z, Wang N, Chen P. An improved LDA-based ELM classification for intrusion detection algorithm in IoT application.
Sensors. 2020;20(6):1706.
62. Al-Haija QA, Altamimi S, AlWadi M. Analysis of extreme learning machines (ELMs) for intelligent intrusion detection systems: a survey.
Expert SystAppl. 2024;253:124317. https://doi.org/10.1016/j.eswa.2024.124317.
63. Gyamfi E, Jurcut AD. Novel online network intrusion detection system for industrial IoT based on OI-SVDD and AS-ELM. IEEE Internet
Things J. 2023;10(5):3827–39. https://doi.org/10.1109/JIOT.2022.3172393.
64. Zhao S, Chen XA, Wu J, Wang YG. Mixture extreme learning machine algorithm for robust regression. Knowl-Based Syst.
2023;280:111033.
65. Zhang J, Li Y, Xiao W, et al. Non-iterative and fast deep learning: multilayer extreme learning machines. J Franklin Inst.
2020;357(13):8925–55.
66. Xue Z, Cai L. Robust fisher-regularized twin extreme learning machine with capped L1-norm for classification. Axioms. 2023;12(7):717.
https://doi.org/10.3390/axioms12070717.
67. Sree VK, Shravani P, Sravani V, Devendar P. Intelligent malware detection using extreme learning machine. Turk J Comput Math Educ
(TURCOMAT). 2023;14(2):50–63.
68. Abbas A, Khan MA, Latif S, Ajaz M, Shah AA, Ahmad J. A new ensemble-based intrusion detection system for the Internet of Things.
Arab J Sci Eng. 2022;47:1–15.
69. Yang Z, Baraldi P, Zio E. A comparison between extreme learning machine and artificial neural network for remaining useful life pre-
diction. In: 2016 Prognostics and System Health Management Conference, Chengdu, China, 2016, pp. 1-7. https://doi.org/10.1109/
PHM.2016.7819794
70. Elhaija WA, Al-Haija QA. A novel dataset and lightweight detection system for broken bars induction motors using optimizable neural
networks. Intell Syst Appl. 2023;17:200167. https://doi.org/10.1016/j.iswa.2022.200167.
71. Shojafar M, Taheri R, Pooranian Z, Javidan R, Miri A, Jararweh Y. Automatic Clustering of Attacks in Intrusion Detection Systems. In:
2019 IEEE/ACS 16th International Conference on Computer Systems and Applications (AICCSA), Abu Dhabi, United Arab Emirates,
2019. pp. 1–8, https://doi.org/10.1109/AICCSA47632.2019.9035238.
72. Jayakumar N, Gokulnath G, Jayarekha CT. Extreme learning machine enhanced remote sensing image classification system. In: 2024
2nd international conference on artificial intelligence and machine learning applications theme: healthcare and internet of things
(AIMLA), Namakkal, India, 2024. pp. 1–5, https://doi.org/10.1109/AIMLA59606.2024.10531446.
73. Salazar E, Mora M, Vásquez J, Gelvez E, Almeida EG. Conditioning of extreme learning machine for noisy data using heuristic optimi-
zation. J Phys: Conf Ser. 2020;1514:012007. https://doi.org/10.1088/1742-6596/1514/1/012007.
74. Yang Y, Hou M, Sun H, Zhang T, Weng F, Luo J. Neural network algorithm based on Legendre improved extreme learning machine for
solving elliptic partial differential equations. Soft Comput. 2020. https://doi.org/10.1007/s00500-019-03944-1.
75. Gao J, Li J, Jiang H, Li Y. A novel cyber-attack detection approach based on kernel extreme learning machine using fr-conjugate gra-
dient. In: 2020 39th Chinese Control Conference, Shenyang, China, 2020, pp. 7637–7642, https://doi.org/10.23919/CCC50068.2020.9188985.
76. Abirami MS, Pandita S, Rustagi T. Improving intrusion detection system using an extreme learning machine algorithm. Int J Recent
Technol Eng. 2019;8(24):234–9. https://doi.org/10.35940/ijrte.b1043.0782s419.
77. Kaliraj P. Intrusion detection using whale optimization based weighted extreme learning machine in applied nonlinear analysis.
Comm Appl Nonlinear Anal. 2024;31(2s):186–203.
78. Taheri R, Ahmadzadeh M, Kharazmi MR. A new approach for feature selection in the intrusion detection system. Fen Bilimleri Dergisi
(CFD), 2015. 36(6).
79. Singh S, Silakari S, Patel R. An efficient feature reduction technique for the intrusion detection system. In: 2009 international confer-
ence on machine learning and computing. Singapore. 2011. pp. 147–153.
80. Modadugu SY, Rao SSN, Reddy DV. Extreme learning machine for spammer detection and fake user identification from twitter. In:
2024 IEEE 13th international conference on communication systems and network technologies (CSNT), Jabalpur, India, 2024. pp.
1141–1146, https://doi.org/10.1109/CSNT60213.2024.10545989.
81. Abu Al-Haija Q, Alohaly M, Odeh A. A lightweight double-stage scheme to identify malicious DNS over HTTPS traffic using a hybrid
learning approach. Sensors. 2023. https://doi.org/10.3390/s23073489.
82. Chothani N. Combined PCA and kernel-based extreme learning machine technique for classification of faults in IEEE 9- bus system.
In: 2024 Third International Conference on Power, Control and Computing Technologies (ICPC2T), Raipur, India, 2024, pp. 380–385,
https://doi.org/10.1109/ICPC2T60072.2024.10474888.
83. Kaliraj P, Subramani B. Intrusion detection using krill herd optimization based weighted extreme learning machine. J Adv Inf Technol.
2024;15(1):147–54.
84. Parvathi S, Sushma T, Anusree K, Talari VS, Dasari SN, Krishnan VG. Implementation of extreme learning model for cyber security in
IoT networks. 2024. https://doi.org/10.1109/ICSCNA58489.2023.10370596.
85. Dai X, Yi X, Zhou D, Guo F, Liu D. False data injection attack detection based on local linear embedding and extreme learning machine. In:
2022 IEEE 17th International Conference on Control & Automation, Naples, Italy, 2022, https://doi.org/10.1109/ICCA54724.2022.9831851.
Publisher’s Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.