Digital Communications and Networks 10 (2024) 205–216
Contents lists available at ScienceDirect. Digital Communications and Networks. Journal homepage: www.keaipublishing.com/dcan

Feature extraction for machine learning-based intrusion detection in IoT networks

Mohanad Sarhan a,*, Siamak Layeghy a, Nour Moustafa b, Marcus Gallagher a, Marius Portmann a
a University of Queensland, Brisbane, QLD 4072, Australia
b University of New South Wales, Canberra, ACT 2612, Australia
* Corresponding author. E-mail address: sarhan@uq.net.au (M. Sarhan).

Received 12 March 2021; Received in revised form 16 July 2022; Accepted 31 August 2022; Available online 7 September 2022.
https://doi.org/10.1016/j.dcan.2022.08.012
2352-8648/© 2024 Chongqing University of Posts and Telecommunications. Publishing services by Elsevier B.V. on behalf of KeAi Communications Co. Ltd. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

Abstract

A large number of network security breaches in IoT networks have demonstrated the unreliability of current Network Intrusion Detection Systems (NIDSs). Consequently, network interruptions and losses of sensitive data have occurred. It was observed that most researchers aim to obtain better classification results by using a set of untried combinations of Feature Reduction (FR) and Machine Learning (ML) techniques on NIDS datasets. However, these datasets differ in feature sets, attack types, and network design. Therefore, this paper aims to discover whether such combinations generalise across datasets by evaluating six ML models: Deep Feed Forward (DFF), Convolutional Neural Network (CNN), Recurrent Neural Network (RNN), Decision Tree (DT), Logistic Regression (LR), and Naive Bayes (NB). The impact of three Feature Extraction (FE) algorithms is evaluated: Principal Component Analysis (PCA), Auto-encoder (AE), and Linear Discriminant Analysis (LDA). Although these algorithms have been widely used, the determination of their optimal number of extracted dimensions has been overlooked. The results indicate that no clear FE method or ML model can achieve the best scores for all datasets. The optimal number of extracted dimensions has been identified for each dataset, and LDA degrades the performance compared with PCA. Finally, this paper concludes that the choice of dataset significantly alters the performance of the applied techniques. We believe that a universal (benchmark) feature set is needed to facilitate further advancement and progress of research in this field.

1. Introduction

Cybersecurity attacks and their associated risks have significantly increased since the rapid growth of the interconnected digital world [1], e.g., the Internet of Things (IoT) and Software-Defined Networks (SDN) [2]. IoT is an ecosystem of interrelated digital devices and objects known as "things" [3]. They are embedded with sensors, computing chips and other technologies to collect and exchange data over the internet. IoT networks aim to increase the productivity of the hosting environment, such as industrial systems and "smart" buildings. IoT devices are growing significantly, with an expected number of 50 billion devices by the end of 2020 [3]. This growth has led to an increase in cyber attacks and the risks associated with them. Consequently, businesses and governments are proactively looking for new ways to protect their personal and organisational data stored on networked devices. Unfortunately, current security measures in IoT networks have proven unreliable against unprecedented attacks [4]. For instance, in 2017, attackers compromised a casino's sensitive database through an IoT fish tank's thermometer. According to the Nozomi Networks report, new and modified IoT botnet attacks increased rapidly in the first half of 2020, with 57% of IoT devices vulnerable to attacks [5]. According to the Symantec Internet Security Threat Report, more than 2.4 million new malware variants were created in 2018 [6]. This has led to growing interest in improving the capabilities of NIDSs to detect unprecedented attacks.
Therefore, new innovative approaches are required to enhance the attack detection performance of Network Intrusion Detection Systems (NIDSs).

An NIDS is implemented in a network to analyse traffic flows, detect security threats and protect digital assets [7]. It is designed to provide high cyber-security protection in operational infrastructures and aims to preserve the three principles of information systems security: confidentiality, integrity, and availability [7]. Detecting cyber-attacks and threats has been the primary goal of NIDSs for a long time. There are two main types of NIDSs. Signature-based NIDSs aim to match and compare the signatures of incoming traffic with a database of predetermined signatures of previously known attacks [8]. Although they usually provide a high level of detection accuracy for precedented attacks, they fail to detect zero-day or modified threats that do not exist in the database. As attackers constantly change their techniques and strategies for conducting attacks to evade current security measures, NIDSs must adapt their detection approaches accordingly. However, the current method of tuning signatures to keep up with changing attack vectors is unreliable. Anomaly-based NIDSs aim to overcome the limitations faced by signature NIDSs by using advanced statistical methods, which have enabled researchers to determine the behavioural patterns of network traffic. Various methods are used for anomaly detection, such as statistical, knowledge- and Machine Learning (ML)-based techniques [8]. Generally, they can achieve higher accuracy and Detection Rate (DR) levels for zero-day attacks, as they focus on matching attack patterns and behaviours rather than signatures [9]. However, anomaly NIDSs suffer from high False Alarm Rates (FARs), as they can identify any unique benign traffic that deviates from secure behaviour as an anomaly.

Current signature NIDSs have proven unreliable for detecting zero-day attack signatures [10] as they pass through IoT networks. This is due to the lack of known attack signatures in the system's database. To prevent these incidents from recurring, many techniques, including ML, have been developed and applied with some success. ML is an emerging technology with new capabilities to learn and extract harmful patterns from network traffic, which can be beneficial for detecting security threats [11]. Deep Learning (DL) is an emerging branch of ML that has proven very successful in detecting sophisticated data patterns [12]. Its models are inspired by biological neural systems in which a network of interconnected nodes transmits data signals. Each node contains a mathematical activation function that converts input to output. These models consist of hidden layers that can further extract complex patterns in network traffic. These patterns are learnt through network attack vectors, which can be obtained from various features transmitted through network traffic, such as packet counts/sizes, protocols, services and flags.
Each attack type has a different identifying pattern, known as a set of events that may compromise the security principles of networks if undetected.

Researchers have developed and applied various ML models, which are often combined with Feature Reduction (FR) algorithms to potentially improve their performance. Using a set of evaluation metrics, promising results for the detection capabilities of ML have been obtained, but these models are not yet reliable for real production IoT networks. The trend in this field is to outperform state-of-the-art results for a specific dataset rather than to gain insights into an ML-based NIDS application [13]. Therefore, the extensive amount of academic research conducted outweighs the number of actual deployments in the real operational world. Although this could be due to the high cost of errors compared with those in other domains [13], it may also be that these techniques are unreliable in a real environment. This is because they are often evaluated on a single dataset consisting of a list of features that might not be feasible to collect or store in a live IoT network feed. Moreover, due to the nature of ML, there is often room for improvement in its hyper-parameters when implemented on a specific dataset. Therefore, this paper aims to measure the generalisability of combinations of Feature Extraction (FE) algorithms and ML models on different NIDS datasets.

In this paper, the effectiveness of three DL models in detecting attack vectors has been measured and compared with three Shallow Learning (SL) models, i.e., Deep Feed Forward (DFF), Convolutional Neural Network (CNN), Recurrent Neural Network (RNN), Decision Tree (DT), Logistic Regression (LR) and Naive Bayes (NB). Three FE algorithms, namely Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA) and Auto-encoder (AE), have been explored, and their effects on three benchmark datasets, UNSW-NB15, ToN-IoT and CSE-CIC-IDS2018, have been studied. The results on the complete (full) datasets, without any FE algorithm applied, are also calculated for comparison. The extracted outputs of PCA and LDA are analysed by calculating their respective variance scores. The optimal numbers of dimensions when applying the AE and PCA algorithms are found by experimenting with 1, 2, 3, 4, 5, 10, 20, and 30 dimensions. This paper is structured as follows: in Section 2, related works conducted in this field are explained. It is followed by the methodology section, where the data processing, FE algorithms and ML classifiers used, as well as their architectures and parameters, are described. In Section 4, the datasets used and their importance in research are discussed, the evaluation metrics used are defined, and the results achieved are listed and explained. In summary, the key contributions of the paper are:

• Experimental evaluation of 18 combinations of FE algorithms and ML classifiers across three NIDS datasets.
• Exploration of the number of feature dimensions and their impact on the classification performance.
• Analysis of feature variance and its correlation to the detection accuracy.

2. Related works

This section provides an overview of related papers and studies in this area. Due to the rapidly evolving nature of networks, new attack scenarios appear daily, and the age of a dataset is critical. As old datasets contain outdated patterns of benign and attack traffic, they are considered obsolete and have limited significance.
Therefore, datasets released within the last five years are selected, as they represent up-to-date network traffic. An updated version of CSE-CIC-IDS2017, known as CSE-CIC-IDS2018, was released publicly by the University of New Brunswick. Although the University of New South Wales released another dataset known as ToN-IoT in late 2019, limited papers using it were found at the time of writing. Therefore, examining this dataset and its performance against those very well-known and widely used datasets is another contribution of this paper. Researchers have widely used the UNSW-NB15 dataset due to its various features and attack types. Papers in which the UNSW-NB15, ToN-IoT and CSE-CIC-IDS2018 datasets were used are analysed in the following paragraphs.

In [14], the authors implemented a CNN model and evaluated it on the UNSW-NB15 dataset. The CNN uses max-pooling, and a complete list of its hyperparameters is provided. Experiments were conducted with different numbers of hidden layers and the addition of a Long Short-Term Memory (LSTM) layer. The three-layer network performed best on the balanced and unbalanced datasets, achieving an accuracy of 85.86% and 91.2%, respectively, with the minority class oversampled to balance the label classes. The authors also compared three activation functions (sigmoid, relu, and tanh), with sigmoid obtaining the best accuracy of 91.2%. Although they claimed to have built a reliable NIDS model, a DR of 96.17% and an FAR of 14% are not ideal. They also did not evaluate their best model on various datasets to determine its stability or performance for different attack types or packet features. Khan et al. explored the five algorithms DT, RF, Gradient Boosting (GB), AdaBoost, and NB on the UNSW-NB15 dataset with an extra tree classifier for FE. The extracted features could have been heavily influenced by identifying features such as IPs and ports, which are biased towards attacking/victim nodes. The results showed that RF (98.60%) achieved the best score, followed by AdaBoost (97.82%) and DT (97.85%). However, in terms of prediction times, DT performed the best with 0.75 s, while RF and AdaBoost took 6.97 s and 21.22 s, respectively [15].

In [16], the authors investigated various activation functions (relu, sigmoid, tanh, and softsign) and optimisers (adam, sgd, adagrad, nadam, adamax and RMSProp) with different numbers of nodes in the hidden layers. They aimed to find the optimal set of hyper-parameters for potential use in an NIDS. The experiment was conducted using DFF and LSTM architectures on the UNSW-NB15 dataset.
There was no substantial improvement using LSTM rather than DFF, with the relu activation function outperforming the others. Most optimisers performed similarly well, except for SGD, which was less accurate. They claimed that the best setting for the hyper-parameters was relu, adam, and a number of nodes following the rule 0.75 × input + output. Their best accuracy results were 98.8% for DFF and 98% for LSTM. However, in the paper, neither were the flow identifier features dropped, nor was their best-claimed set of hyper-parameters evaluated on another dataset. In Ref. [17], the authors proposed an AE neural architecture consisting of LSTM and dense layers as an FE tool. The extracted outputs are then fed into an RF classifier to perform the attack detection. Three datasets, UNSW-NB15, ToN-IoT and NSL-KDD, were used to evaluate the performance of the proposed methodology. The results indicate that the chosen classifier achieves higher detection performance without using compression methods. However, the training time has been significantly reduced by using lower dimensions.

In [18], the authors visually explored the effects of applying PCA and AE on the UNSW-NB15 and NSL-KDD datasets. They also experimented with different dimensions (ranging between 2 and 30) using the classifiers K-Nearest Neighbour (KNN), DFF, and DT in binary and multi-class classification scenarios. The study found that AE performed better than PCA for KNN and DFF, but both were similar for DT. An optimal number of dimensions (20) was found for the UNSW-NB15 dataset but not for the NSL-KDD one. In Ref. [19], a CNN and an RNN model were designed to detect attacks in the CSE-CIC-IDS2018 dataset. The authors followed a supervised binary classification approach, where CNN outperformed RNN in detecting each attack type. The authors omitted some benign packets to balance the attack and benign classes and improve classification performance. A significant increase in performance was obtained in the detection of minority samples of attacks. Belouch et al. explored DT, NB, SVM and RF models on the UNSW-NB15 dataset. They used accuracy as the defining metric, where RF achieved 97.49%, followed by a DT score of 95.82%, while SVM and NB led to poor results. They applied no FR techniques, so the full dataset's features were utilised. Training and testing times were also recorded, where NB achieved the fastest time [20].

In [21], the CSE-CIC-IDS2018 dataset has been utilised to explore seven different DL models, i.e., supervised (DFF, RNN and CNN) and unsupervised (restricted Boltzmann machine, DBN, deep Boltzmann machine and deep AE). The experiments also included a comparison of different learning rates and numbers of hidden nodes. However, no data pre-processing phase, including FR, was mentioned. Moreover, the flow identifiers were not dropped, which would have caused a bias towards attacking/victim nodes or applications. All models performed similarly, with slight variations in the DRs of their attack types. In terms of overall accuracy, CNN had the highest at 97.38% when using 100 hidden nodes with 0.5 as the learning rate. Increasing the number of hidden nodes and the learning rate improved the accuracy, but also increased the training time. In Ref. [22], the authors compared two FE techniques, namely PCA and LDA, and proposed a linear discriminative PCA by feeding the discriminant information output from the LDA into the PCA. Although the ML model they used in their experiments was not identified, their method was evaluated on the UNSW-NB15 dataset. As their technique did not perform well for detecting fuzzers and exploit attacks, they decided to eliminate them from some of their results, which is not ideal in a realistic network environment. Nevertheless, their results were still poor, with the best one for binary classification having a DR of 92.35%. One of their stated future works is to determine the optimal number of principal components, i.e., the number of dimensions in a PCA.

Most of the works found in the literature still adopted the negative habits addressed in Ref. [23], with researchers aiming to create new FR methods and build new ML models to outperform the state-of-the-art results.
However, due to the nature of the domain, researchers can always find a combination or variation that results in slightly better numerical results. This can also be achieved by modifying any hyper-parameters used, which often have room for improvement when applied to a certain dataset. In most papers, experiments were conducted using a single dataset, which questions the conclusion that the proposed techniques could be generalised across datasets. As each dataset contains its own private set of features, there are variations in the information presented. Consequently, these proposed techniques may perform differently, strongly influenced by the chosen dataset. The experimental issues mentioned above create a gap between the extensive academic research conducted on ML-based NIDSs and the actual deployments of ML-based NIDSs in the operational world. However, compared with other applications, the same ML tools have been deployed in commercial scenarios with great success. We believe this is due to the high cost of errors in the NIDS domain, making it critical to design an optimal ML model before deployment. Therefore, as gaining insights into the ML-based NIDS application is crucial, this paper explores the performance of combinations of FE algorithms and ML models on different datasets. This will help determine whether the best combination can be generalised for all chosen datasets. Also, although applying the PCA and AE algorithms has been common in recent papers, finding the optimal number of dimensions to be used has been overlooked. The extracted dimensions of PCA and LDA are analysed by computing the variance and its correlation with the detection accuracy.

3. Methodology

This paper explores the effects of applying three FE techniques (PCA, LDA and AE) on three DL models (DFF, CNN and RNN) and three SL classifiers (DT, LR and NB). For PCA and AE, several dimensions (1, 2, 3, 4, 5, 10, 20 and 30) are selected to potentially find the optimal number. Three publicly released NIDS datasets that reflect modern network behaviour are utilised to conduct our experiments, with an overall representation provided in Fig. 1. The datasets are processed for efficient FE and ML procedures. Then, the predictions made by the classifiers are collected, and certain evaluation metrics are statistically calculated. The Python programming language is used to design and conduct the experiments, and the TensorFlow and Scikit-Learn libraries are used to build the DL and SL models, respectively.

Fig. 1. System architecture.

3.1. Data processing

Data processing is an essential first step in enhancing the training process of ML models. All datasets are publicly available to download for research purposes. The duplicate samples (flows) are removed to reduce the storage size and avoid redundancy. Moreover, the flow identifiers, such as source/destination IPs, ports and timestamps, are removed to prevent prediction bias towards the attacker's or a victim's end nodes/applications. Then, the strings and non-numeric features are mapped to numerical values using a categorical encoding technique. These datasets contain features such as protocols and services, which are collected in their native string values, while the ML models are designed to operate efficiently with numerical values.
There are two main techniques for encoding the features: one-hot encoding and label encoding. The former transforms a feature into X categories by adding X features, using 1 to represent the presence of a category and 0 to represent its absence. However, this increases the number of dimensions of a dataset, which might affect the performance and efficiency of the ML models. Therefore, the label encoding technique, which maps each category to an integer, is adopted. The nan, dash, and infinity values are replaced with 0 to generate a numerical-only dataset for use in the following steps. Any Boolean feature is replaced by 1 when it is true and 0 when it is false. Furthermore, min-max feature scaling is applied to reduce complexity by bringing all feature values between 0 and 1. It also allows all features to have equal weights; due to the nature of network traffic features, some values are larger than others, which can cause an ML model to pay more attention to them by assigning heavier weights. The min-max scaler computes all values of each feature by Eq. (1), where X_new is the new feature value ranging from 0 to 1, X is the original feature value, and X_max and X_min are the maximum and minimum values of the feature, respectively:

X_new = (X − X_min) / (X_max − X_min)    (1)

The dataset is split into two portions for training and testing, and they are stratified based on the label feature, which is essential due to the class imbalances of the datasets.
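The pre-processing pipeline described above can be summarised in a short sketch. This is a minimal illustration rather than the authors' code: the DataFrame, the column names and the 70/30 split ratio are assumptions (the paper reports results using stratified five-fold cross-validation).

```python
# Minimal pre-processing sketch; df, flow_id_cols and label_col are illustrative names.
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

def preprocess(df: pd.DataFrame, flow_id_cols, label_col="label"):
    # Remove duplicate flows and drop flow identifiers (IPs, ports, timestamps)
    # to avoid prediction bias towards attacker/victim end nodes.
    df = df.drop_duplicates().drop(columns=flow_id_cols)

    # Label-encode string features as integers and map Booleans to 0/1.
    for col in df.columns:
        if col != label_col and df[col].dtype == object:
            df[col] = df[col].astype("category").cat.codes
        elif df[col].dtype == bool:
            df[col] = df[col].astype(int)

    # Replace nan/inf values with 0 to obtain a numerical-only dataset.
    df = df.replace([np.inf, -np.inf], np.nan).fillna(0)

    X = df.drop(columns=[label_col]).astype("float32")
    y = df[label_col].astype(int)

    # Min-max scaling as in Eq. (1): X_new = (X - X_min) / (X_max - X_min).
    # Constant columns are guarded against division by zero.
    X = (X - X.min()) / (X.max() - X.min()).replace(0, 1)

    # Stratified split preserves the benign/attack ratio (70/30 assumed here).
    return train_test_split(X, y, test_size=0.3, stratify=y, random_state=1)
```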
3.2. Feature extraction

FE is the process of reducing the number of dimensions or features in a dataset. It aims to extract the valuable and relevant information spread among the raw input features and project it into a reduced number of features while minimising information loss. The three FE algorithms used, PCA, LDA and AE, are described in the following paragraphs.

• Principal Component Analysis (PCA): An unsupervised linear transformation algorithm that extracts features based on statistical procedures. It finds the eigenvectors with the highest eigenvalues in a covariance matrix and projects the dataset into a lower-dimensional space with a specified number of dimensions (features). These extracted features are an uncorrelated set called principal components. Although PCA is sensitive to outliers and missing values, it aims to reduce dimensionality without losing too much important or valuable information. The Singular Value Decomposition (SVD) solver is used in the PCA algorithm implemented in this paper. Different dimensions are explored to determine the effect of altering the input dimensions and find the optimal number of extracted features.

• Linear Discriminant Analysis (LDA): A supervised linear transformation algorithm that projects the features onto a straight line. It uses the class labels to maximise the distance between the means of different classes (inter-class) and minimise the distance between the means of the same class (intra-class). It aims to produce features that are more distinguishable from each other. Similar to PCA, it aims to find linear combinations of features that help explain the dataset using a lower number of dimensions. However, unlike PCA, its number of extracted features needs to be equal to one less than its number of classes, which is one in our case because there are two classes: attack and benign. LDA can also be utilised as a classification algorithm; however, in this paper, it is utilised as an FE technique, where an SVD solver is implemented.

• Auto-encoder (AE): An artificial neural network designed to learn and rebuild feature representations. It contains two symmetrical components, an encoder and a decoder, with the former extracting a certain number of features from the dataset and the latter reconstructing them. When the number of nodes in the hidden layer is designed to be less than the number of input nodes, the model can compress the data. Therefore, during training, the model will learn to produce a lower-dimensional representation of the original input with the least loss of information. A dense AE architecture is used in these experiments because of the nature of the data. The number of nodes in the encoder block decreases in the order of 30, 20 and 10, and the decoder block increases in the reverse order. The number of nodes in the middle layer is set to the number of output dimensions required. All the layers use the relu activation function, the adam optimiser and the binary cross-entropy loss function.
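For concreteness, the three FE techniques can be instantiated roughly as follows with scikit-learn and TensorFlow. This is a sketch under stated assumptions: the training epochs, the batch size and the sigmoid output of the decoder are illustrative choices, not values given in the paper, and the function names are our own.

```python
# Sketch of PCA, LDA and AE feature extraction on inputs scaled to [0, 1].
import tensorflow as tf
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def pca_features(X_train, X_test, n_dims):
    # PCA with an SVD-based solver; n_dims in {1, 2, 3, 4, 5, 10, 20, 30}.
    pca = PCA(n_components=n_dims, svd_solver="full")
    return pca.fit_transform(X_train), pca.transform(X_test)

def lda_features(X_train, y_train, X_test):
    # LDA extracts (number of classes - 1) = 1 dimension for binary labels.
    lda = LinearDiscriminantAnalysis(solver="svd")
    return lda.fit_transform(X_train, y_train), lda.transform(X_test)

def ae_features(X_train, X_test, n_dims, epochs=5, batch_size=1024):
    # Dense auto-encoder: encoder 30-20-10 with a bottleneck of n_dims units,
    # mirrored decoder, relu activations, adam optimiser, binary cross-entropy.
    n_in = X_train.shape[1]
    encoder = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(n_in,)),
        tf.keras.layers.Dense(30, activation="relu"),
        tf.keras.layers.Dense(20, activation="relu"),
        tf.keras.layers.Dense(10, activation="relu"),
        tf.keras.layers.Dense(n_dims, activation="relu"),  # bottleneck layer
    ])
    decoder = tf.keras.Sequential([
        tf.keras.layers.Dense(10, activation="relu"),
        tf.keras.layers.Dense(20, activation="relu"),
        tf.keras.layers.Dense(30, activation="relu"),
        # Sigmoid keeps reconstructions in [0, 1] for the cross-entropy loss;
        # this is a sketch-level choice (the paper states relu for all layers).
        tf.keras.layers.Dense(n_in, activation="sigmoid"),
    ])
    autoencoder = tf.keras.Sequential([encoder, decoder])
    autoencoder.compile(optimizer="adam", loss="binary_crossentropy")
    autoencoder.fit(X_train, X_train, epochs=epochs, batch_size=batch_size, verbose=0)
    # Only the trained encoder is kept to produce the extracted dimensions.
    return encoder.predict(X_train, verbose=0), encoder.predict(X_test, verbose=0)
```

With this kind of interface, each FE technique yields transformed training and testing arrays that any of the six classifiers below can consume, which is how the 18 combinations evaluated in Section 4 can share a single pipeline.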
3.3. Machine learning

ML is a subset of Artificial Intelligence (AI) that uses certain algorithms to learn and extract complex patterns from data. In the context of ML-based NIDSs, ML models can learn harmful patterns from network traffic, which can be beneficial in the detection of security threats. DL is an emerging ML branch that has proven capable of detecting sophisticated data patterns. Its models are inspired by biological neural systems, in which a network of interconnected nodes transmits data signals. Building an ML model following a supervised classification method involves two processes: training and testing. During the first phase, the model is trained using labelled malicious and benign network packets from the training dataset to extract patterns and fit the corresponding model's parameters. Then, the testing phase evaluates the model's reliability by measuring its performance in classifying unseen attack and benign traffic on the testing set of unlabelled network packets. These predictions are compared with the actual labels of the testing dataset to evaluate the model using the metrics explained in Section 3.4. The hyper-parameters used in the DL models are listed in Table 1.

Table 1. Hyper-parameters of the DL models.

All three datasets used in the experiments suffer from a class imbalance in terms of the frequency of benign and attack samples, which usually causes a model to predict the dominant class over the others. As the learning phase of an ML model is often biased towards the class with the majority of samples, the minority class may not be well fitted or trained in the final model [24]. Due to the nature of the experiments, in two of the datasets the minority class is the attack one, namely class 1, which is critical for the model to be able to detect and classify. To deal with the datasets' imbalanced classes, weights are assigned to each class, with the minority having a "heavier" weight than the majority. Therefore, the model emphasises or gives priority to that class in the training phase [25]. The class weights are calculated using Eq. (2):

Weight_class = TotalSamplesCount / (2 × ClassSamplesCount)    (2)
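A small helper makes Eq. (2) concrete. The dictionary form shown here is what the Keras class_weight argument and the scikit-learn class_weight parameter expect; how the weights are actually wired into training is an assumption on our part, not a detail given in the paper.

```python
# Class weights per Eq. (2): the minority class receives the heavier weight.
import numpy as np

def class_weights(y_train):
    total = len(y_train)
    counts = np.bincount(np.asarray(y_train, dtype=int), minlength=2)
    # weight_c = TotalSamplesCount / (2 * ClassSamplesCount_c), c in {0, 1}
    return {c: total / (2.0 * counts[c]) for c in (0, 1)}

# Example: 90% benign (0) and 10% attack (1) flows.
# class_weights([0] * 90 + [1] * 10)  ->  {0: 0.556, 1: 5.0}
```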
• Deep Feed Forward (DFF): A class of Multi-Layer Perceptrons (MLPs) that is usually constructed of three or more hidden layers. In this model, the data is fed forward through the input layer and predictions are obtained at the output. Each layer consists of several nodes with weighted connections mapping the high-level features as input to the desired output. The weights are randomly initialised and then optimised in the learning phase through a process known as backpropagation. The input is a row (flow) of the CSV file fed into the input layer, which consists of a number of nodes equal to the number of input dimensions. It is then passed through three hidden layers consisting of 20 dense nodes each, with a relu activation function. The weights and biases are optimised using the adam algorithm with the binary cross-entropy loss function. Finally, due to our number of classes, the output layer is a single sigmoidal unit. A dropout rate of 0.2 is used to remove 20% of the nodes' information to avoid overfitting the training dataset. Fig. 2 presents the DFF architecture.

• Convolutional Neural Network (CNN): A model originally designed to map images to outputs, which has proven effective when applied to many prediction scenarios. Its hidden layers are typically convolutional and pooling ones, and a fully connected CNN includes an additional dense layer. Convolutional layers extract features from the input with kernels, and the pooling layers can enhance these features. The input is converted to a 2-dimensional shape to be compatible with the Conv1D layer. All layers have 20 filters, with kernel sizes of 3 in the input layer and 2 and 1 in the first and second hidden layers, respectively. All activation functions used in the convolutional layers are relu, and the average pooling size is 2 between each set of two convolutional layers. The result is passed to a dropout layer with a value of 0.2 and then to the final dense sigmoid classifier. Fig. 3 presents the mapping and pooling of the input by the convolutional layers until a prediction is made by the dense output layer. The hidden layers are removed for each input with fewer than 10 features, and the kernel size is reduced to 1.

• Recurrent Neural Network (RNN): A model that can capture the sequential information present in the input data while making predictions through an internal memory that stores a sequence of inputs; it has been successful in language-processing scenarios. Although there are various types of RNNs, LSTM is the most commonly used. Each LSTM node contains three gates: forget, input, and output. The input is converted to a 3-dimensional shape to be compatible with the requirements of the LSTM layer. The number of nodes in the input layer is equal to the number of input dimensions. The input is then passed through a single hidden layer consisting of 10 nodes with relu activation functions. The weights and biases of each feature and layer are optimised using the adam algorithm based on the binary cross-entropy loss function. The output layer is a single sigmoidal output unit. A dropout rate of 0.2 is used to remove 20% of the model's information to avoid overfitting the training dataset. Fig. 4 presents the mapping of an input to its output through LSTM layers.

• Logistic Regression (LR): A linear classification model used for predictive analysis. It uses the logistic function, also known as the sigmoid function, to classify a binary output. It calculates the probability of an output class as a value between 0 and 1. It is easy to implement and requires few computational resources, but may not work well in non-linear scenarios. The lbfgs optimisation algorithm is selected with an l2 regularisation technique to specify the penalisation strategy and avoid over-fitting. The tolerance value of the stopping criterion is set to 1e-4, the value of the regularisation strength to 1, and the maximum number of iterations to 100.

• Decision Tree (DT): A model that follows a tree structure in which each internal node represents a high-level feature, the branches represent the outputs and the leaves represent the label classes. It uses a supervised learning method mainly for classification and regression purposes, aiming to map features and values to their desired outcomes. It is widely used because it is easy to build and understand, but it can create an over-complex tree that overfits the training data. The DT's Classification and Regression Trees (CART) algorithm is used due to its capability to construct binary trees using the input features [26]. The Gini impurity function is selected to measure the quality of a split.

• Naive Bayes (NB): A supervised algorithm that performs classification via the Bayes rule and models the class-conditional distribution of each feature independently. Although it is known to be efficient in terms of time consumption, it follows the "naive" assumption of independence between each pair of input features. The Gaussian NB algorithm is chosen for its classification capabilities, and the default value for variance smoothing of 1e-9 is retained.
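The six classifiers described above can be sketched as follows. The layer arrangement follows the textual description, but details the paper leaves open (exact dropout placement, the flatten layer in the CNN, the LSTM input shaping, training epochs and batch size) are assumptions, so this is an approximation rather than the authors' exact implementation.

```python
# Sketch of the three DL models (TensorFlow/Keras) and three SL models (scikit-learn).
import tensorflow as tf
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

def _compile(model):
    # All DL models use the adam optimiser and binary cross-entropy loss.
    model.compile(optimizer="adam", loss="binary_crossentropy")
    return model

def build_dff(n_dims):
    # Three hidden layers of 20 relu nodes, 0.2 dropout, single sigmoid output.
    return _compile(tf.keras.Sequential([
        tf.keras.layers.Input(shape=(n_dims,)),
        tf.keras.layers.Dense(20, activation="relu"),
        tf.keras.layers.Dense(20, activation="relu"),
        tf.keras.layers.Dense(20, activation="relu"),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ]))

def build_cnn(n_dims):
    # Conv1D layers with 20 filters and kernel sizes 3, 2 and 1, average pooling
    # of size 2, dropout 0.2 and a dense sigmoid classifier. Assumes n_dims >= 10;
    # the paper reduces the architecture for smaller inputs.
    return _compile(tf.keras.Sequential([
        tf.keras.layers.Input(shape=(n_dims, 1)),
        tf.keras.layers.Conv1D(20, 3, activation="relu"),
        tf.keras.layers.Conv1D(20, 2, activation="relu"),
        tf.keras.layers.AveragePooling1D(2),
        tf.keras.layers.Conv1D(20, 1, activation="relu"),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ]))

def build_rnn(n_dims):
    # Single LSTM hidden layer of 10 nodes with relu activation on a 3-D input.
    return _compile(tf.keras.Sequential([
        tf.keras.layers.Input(shape=(n_dims, 1)),
        tf.keras.layers.LSTM(10, activation="relu"),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ]))

# Shallow learning classifiers with the stated hyper-parameters.
sl_models = {
    "LR": LogisticRegression(solver="lbfgs", penalty="l2", tol=1e-4, C=1.0, max_iter=100),
    "DT": DecisionTreeClassifier(criterion="gini"),   # CART with the Gini impurity
    "NB": GaussianNB(var_smoothing=1e-9),             # default variance smoothing
}

# Illustrative training call; inputs must be reshaped to (samples, n_dims, 1) for
# the CNN and RNN, and class_weights(y_train) comes from the Eq. (2) sketch above.
# dff = build_dff(X_train.shape[1])
# dff.fit(X_train, y_train, epochs=5, batch_size=1024,
#         class_weight=class_weights(y_train), verbose=0)
```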
The early comparison of the models and FE algorithns is conducted using AUC as the comparison metric. For each dataset, the effects of applying the FE algorithms using different dimensions for exch MI. model are presented separately. Also, the best combination of an ML model and FE algorithm Is selected to measure its performance for detecting each attack type statsuealy 41, Datasets Data selection is crucial for determining the eeliabilty of Ml, models, and the credibility of their evaluation phases. Obtaining labelled network data is challenging due to generation, privacy and security issues. Also, production networks do not generate labelled flows, which is mandatory when following 2 supervised learing methodology. Therefore, re Searchers have created publily available benchmark datases for raining and evaluating ML. models. They are generated through virtual network testbed set up in a lab, where normal network traffic is mixed with synthedie attack traf. The packets ate then processed by extracting certain features using particular tols and procedures, An additional abel feature is created to indicate whether flow is malicious or benign, Fach sample is defined by a network flow, with a flow considered a unidi rectional data log between two end nodes where all the transmitted packets share specific characteristics such as IP addresses and port numbers. The following three datasets have been use: + UNSW.NBIS: A commonly adopted dataset released in 2018 by the (Cyber Range Lab ofthe Australian Centre for Cyber Security (ACCS) L271 The dataset originally contains 49 features extracted by Argus And Bro-IDS, now called Zeek tools, Although pre-selected training and testing datasets were created, the fll dataset hasbeen utilised. It has 2,218,761 (87.35%) benign Rows and 321,283 (12.65%) attack ‘ones, that is, 2,540,044 flows. ts low identifier features are: rip, ‘tip, spor, dort, stime and lime. The dataset contains non-integer features, such as proto, service and state. The dataset contains nine attack types known a5 fuzzers, analysis, backdoor, Denial of Service (bos), expos, generic, reconnaissance, shelleode and worms. ToN-o: A recent hewrogencous dataset released in 2019 by ACCS [23]. ts network traffic portion collected over an ToT ecosystem as Deen utilised, and itis made up of mainly atack sample with aration of 796,380 (3.56%) benign flows to 21,542,641 (96.44%) attack ‘ones, that is, 22,399,021 flows in total, I contains 44 original Features,‘extracted by Bro-DS tool. The flow identifier features are named: 5, ‘sip, stip, sport and dst port. Iccontains non-integer features, such as prow, service and conn state, ssL.ersion, sLcipher,sLsubjet, sis suer, ds. query, hp method, hp version, hap resp mime ype, htp or ‘gmime.opes, pri, hp_user agent, weird add and weird name. Is Boolean features include ds AA, das RD, dis RA, dis rejected, slr sumed, ssLesablshed and weird notice. The dataset includes multiple tock settings, such as backdoor, DoS, Distributed DoS (DOS), i Jection, Man In The Middle (MITM), password, ransomware, scanning and Gross-Site Sriptng (XS). CSE-CIC-IDS2O18: A dataset released by a collaborative project be tweeen the Communications Security Establishment (CSE) and Cana: dian institute for Cybersecurity (CIC) in 2018 [23]. Their developed tool called CICHlowMeter-V3 was used to extract 75 network data features. 
The full dataset has been used, which has 13,484,708, (83.0796) benign flows and 2,748,235 (16,938) attack ones, thats, 16,222,948 flows, ls low identifier features are called Dit, Flow I, ‘re IP, Sre Port, Dst Port and Timestamp. Several attack settings were condiicted, such as brute-force, bot, DoS, DDoS, inflation, and web aitacks. 42. UNSW.NBTS “The results achieved on the UNSW-NBIS dataset by the MI. models are similar in terms of their best performance, with DT obtaining the ‘worst, as indicated in Fi. 6. The DFF and RN models perform similerly Where AE and PCA exponentially inerease until dimension 3, when they start to fairy stabilise, while CNN requires 10 dimensions. AE improves ‘the performances of CNN and RNN when using a low number of di ‘mensions, whereas PCA with 2 dimensions reduces them significant. All DL models perform equally well, achieving @ generally higher AUC score than the SL classifiers. Although the NB and LR models achieve poor results when using 2 low number of dimensions for both AE and PCA, they improve rapidly with higher dimensions and obtain promising AUC scores. The performance of DI onthe full dataset when using any of the FE algorithms is poor. PCA degrades the performance when the number of dimensions increases. LDA using a single dimension has achieved an excellent detection performance, similar to that achieved using the full dataset in most classifiers, where ic achieved a higher score with NB but a lower one with DT. Dil Carmona Neb 10 (202) 205-215 ‘able LUNSW.NDIS clasifcntion metres. om Acoma) ARON _AUE RA 3 Pa ass he 20 S86 Bis 0s gam tat eae sa isa ‘The best results obtained by each FE algorithm using each ML model are listed in Table 2. When using the AF technique, the CNN performs best among other classifiers, with a high AUC store of 0.8960. AE im- proves the performance of CNN and RN models compared tothe other FE techniques, LR and DT achieve their bet performances when applied to the full dataset without using any FE algorithm, LDA and PCA sign icantly degrade the performance of DT by decreasing the DR of attacks. However, they improve the NB classifier DR and lower its FAR. Inter estingly, LDA performs better than PCA in all ML models except DPF, indicating an extreme correlation between one of the datase’ features and labels, Overal, the optimal number of dimensions for PCA and AE appears to be 20, which matches the findings in Ref. [18] n Table 3, the best-performing ML model has been applied, which is CNN, when using the AE technique with 20 dimensions to measure the DR ofeach attack type It is confirmed that each attack type inthe test dataset is almost fully detected, with backdoor ané DoS attacks obtaining the lowest DRs. 7 (@) DEF (b) CNN I / : ll - pr (NB lg. 6. UNSWNBIS resultsDil Carmo and Nebr 10 (202) 205-215 eben Tables SVNTIS tack detain Te dian metic oo nc or CC 7 ais aie on nla aa Tear has aon? Bate mo ara on kn sas lose tore 43 Taher mt thos Se ae tao Using the Too dataset the ress chewed by each Elgon te Toes mo Sear ante and ML del re significantly diferent, as delayed in ig. 7. Over, mak eae yes Stam Dr obtains the best possible results when its applied to fll dataset and me ae ses estos ‘Abts tmed. The DEF model acleves i best eulison he comple fal" FMF ye Me ava dnaset, performing obviously poorly when sing AE but better when rea komme Sie mane Using LDA and PCA as it stable ater 4 dimension. 
For any dimension tbe Sen seas atom les than10,CNNperformsinetently withAEandPCA-LMeDRE.RNN XH Rh tn ear matty performs port wen sig AE but wel when ising PCA sistas 0 m1 a7 om sae ara ablse with 2dimensons- DT achieves great reals when applied ote mame ge gy aan fall datasets, similar to AE, with dimensions greater than or equal to 2. However, when using LDA or PCA, it will generate defective results. LR ‘and NB do not perform efficiently on this dataset using any of the FE algorithms. LDA improves the performances of RNN and NB, but reduces those of DFF, CNN, and DT applied to the full dataset, “The full metrics of the best results obtained by each FE method wsing all ML models on the ToN-loT dataset are listed In Table &. The FAR values ae considerably large because there are more attack samples than benign samplesin the dataset. DFF performs best when applied tothe fll dataset, achieving alow PAR, Le, 1.26%, and a low DR of 76.6794. AE ‘decreases the performance of DFF even after using the maximum number ‘of dimensions provided. FE algorithms, especially PCA, significantly Improve the performances of RNN and NB applied to the fll dataset, DT ‘obtains the highest scares when applied to the full dataset, and AE ‘extracted dimensions, The best isto use AF with 10 dimensions asthe DR ‘of 98.28% and FAR of 3.21% are recorded, but inetfective when using PCA and LDA. LR and NB achieve the worst performances ofthe six MI. ‘models. LDA proves unreliable compared to PCA and AE for al earning models except RNN and NB, DFP, RNN, LR aod NB obtain their best results for PCA using 5 di ‘mensions, making it she best number of dimensions, while AE requires a higher mumber of 20, Table 5 displays the types of attacks inthis dataset ‘tables “ToN.e attacks detection, os Teal Protiecd Deo eeton 52559 a7 m8 Pawerd 265088 1ss07a masa (b) CNN (©) RNN (NB Fig. 7. ToN or eels‘and thelr actual numberof samples compared with the numberof dla sified ones. The best-performing combination of FE and MI methods has been used for prediction, and DT is applied to an AE of 20 dimensions having 2 98.28% DR. This table shows that each attack type is almost fully detected excep: for MITM and ransomware because there are few samples of each ofthe models to train on, Seanning and injection attacks have 97.69% and 97.68% DRs, respectively, despite their suficient samples, indicating that thelr patters are mare complex 44, CSECICIDS2018 Asillustrated in Fig. 8 the DL models perform equally well in terms of their best AUC scores. DEF is applied to the fll dataset, and good detection performance is achieved when PCA is used. The effects ofthe ‘AB’ and PCA's changing dimensions ace very sill for CNN as i also has difficulty in classification using a lower numberof dimensions. RNN performs equally using all FE algorithms, with AB slighty better than ‘others. DT performs well with AE and when applied to the full dataset, bot performs very poorly with LDA and PCA. Using AE requires only 3 ‘dimensions to stabilise snd reach its maximum AUC. NB obtains its best ‘results using LDA and PCA, peaking at dimension 20, and LR performs ‘equally using the three FE algorithms, Moreover, AE and PCA have similar impacts on all Ml models except D1, for which AE significantly ‘outperforms PCA. LR and NB perform poorly throughout the ‘experiments “Table 6 displays the bes score obtained by the FE algorithms fr each [ML mode! applied tothe CSF-CIC-1D52018 dataset. 
DFFand CNN achieve thelr best peeformances when applied tothe full dataset, while the FE algorithms improve the classification capability of RNNS. LDA performs worse than AF and PCA forall models except NB, However, LR and NB are ineffective in detecting attacks present in this dataset. The optima ‘numbers of PCA and AE dimensions are 20 and 10, respectively, due to thelr requirement in most ML, classifiers. In Table 7, attack types in the dataset and their actual numbers compared with their correct predictions are presented. The bestperforming combination of the model and FE algorithm has been used for prediction; that, the DT classifier is applied to 10 extracted dimensions using AE, This table shows that ezch attack type is almost fully detected, except Brute Force -Web, Brute Force -XSS, ‘and SQL injection, due to thele low number of sample counts in the dataset, which matches the findings in Ref. [50]. However, infiltration attack are more diffieult ro detect despite their majority in the dataset “his could be due to the similarity of is statistical distribution with another clas type, which leads to confusion of the detection model Puther analysis is requieed, such a6 Lets, to measure the difference between the distributions of each class 45. Discusion According to the evaluation results, it has been observed that a relatively small number of feature dimensions can achieve the class ‘allon performance close to the maximum. In addition, the marginal income of more dimensions is very small. The outputs of LDA and PCA ‘are analysed using ther respective variance to understand and explain this behaviour, The variance i the distribution ofthe squared deviations ‘ofthe output fom its respective mean. The variance ofeach dimension fextcacted from all the datasets using PGA and LDA is discussed ‘Measuring the variance ofthe dimensions being fed into the ML las fiers is necessary for this field, It will aid in understanding hove FE techniques perform on NIDS datasets Fig. 9 shows the variance of each dimension extracted in PCA forthe three datasets, As observed, the first 10 feature dimensions account for the bulk of the variance, with a minor contribution of additional di ‘mensions. This is consistent with and explains the result in Figs. 6-8, where a higher number of features beyond 10 does not provide any further increase in classification accuracy. Fig. 10 displays the variance of the single LDA feature for each of the three considered datasets. The Digit Carmi and Nek 10 (202) 205-216 extracted LDA feature of the UNSW-NBIS dataset has a significantly higher variance compared tothe other two datasets. Tis might indicate that one or avery small number of features in the UNSW-NBI5 dataset strongly correlate 1o the labels. This is consistent with the results observed in Figs. 6-8, where the LDA forthe UNSW-NBIS dataset ach ieves a significantly higher classification accuracy than the other two datasets. The clasificetion accuracy of LDA for UNSW-NBIS is close to that achieved with the full datas, ie, with the complete set of features The results for the datasets have been grouped based on the ML models, as shown in Fig. I. The best dimensions of PCA and AE are selected fora fair comparison It is clear that patterns form the effets of applying FE algorithms. In Fig. (a), DEF isthe best when applied to the full dataser due to the ability of a dense network to assign weights to relevant features, while AE lowers the detection accuracy of the DFF ‘model. Figure Fig. 
11(b) shows that applying CNN tothe full dataset or using PCA or AF does not significantly alter its performance, but using UDA, the outcome deteriorates. In Fig 11(0), the necessity of applying an FE algorithm before using RNN is obvios, withthe best being AE, fo Towed by PCA, and lay, LDA. Hg. 11@) proves the unreliability of using LDA or PCA for a DT model, whereas this model works efficiently ‘when applied to the full dataset or using AE. In Fig. 11), applying a linear FE algorithm, namely, LDA or PCA, improves the performance of the NB model. LDA achieves the best results, while the NB has the worst results among the six ML models without an FE algorithm. Fig. 11(0 shows that applying LR tothe fll dataset or using FE methods leas to the same results where AEimproves the models performance onthe ToN- Jot dataset while LDA decreases it on the CSE-CICDS2018. Overall, there is a clear pater ofthe effects of the FE methods and classification capabilities of ML models forthe thre datasets. Models such as RNN and NB benefit from applying FE algorithms, whereas DFF does not. LDAs general performance is negative for the ToNIoT end CSE-CIC-1DS2018 datasets when using all ML madels except NB, This i explained by the low variance scores achieved by the two datasets compared tothe UNSW- INBIS dataset, However, LR and NB do not perform well for detecting tacks in the three datasets, with the best scores atained by diferent set of techniques The experimental evaluation of 18 diferent combinations of FE and ML techniques has assisted in finding the optimal combination foreach dataset used, On the UNSW-NBIS dataset, the CNN classifier obtains the best score when applied tothe AE dimensions. On the ToN-(oT and CSE €1C-1052018 ones, DT outperforms the other models and achieves the best scores using the AB technique, However, no single method works best across the utilised NIDS datasets. This is caused by the vast differ ence in the feature sets that make up the utilised datasets. Therefore, iis very necessary {0 create a universal set of features for future NIDS datasets is essential. The universal set needs to be easily generated from live network trafic headers as they do not require deep packet inspec Lon, which is challenging in enerypted trafic. The features should also not be biased towards providing information on limited protocols or attack types but rather on all network trafic and attack scenarios. The features will be required to be small in number to enable a feasible deployment, but contain an adequate number of security events toad in the successful detection of network attacks. The optimal numberof di mensions has been identified for all three datasets, which is 20 df ‘mensions. This is indicated in Fig, 9, where further dimensions gain no additional informational variance. After analysing the DR of each attack type based on the best-performing models, it ean be concluded that in @ perfect dataset, the numberof attack sampes needs to e balanced to be efficient in binary classification scenarios. 5. Conelusions Im this paper, PCA, atoencoder and LDA have been investigated and evaluated regarding thelr impact on the classification performance ach ieved in conjunction with a range of machine learning models. Variance isused to analyse their performance, particularly the correlation ber weenDia Cormac ard Nebo 1 (2024 205-215, (b) CNN pr ‘Tables (Src 1ps2008 Fea 20 Mo br R76 ‘Table? 
(CSE-CICADS2OIA attacks deteson ‘ninck De tntiaen SQL necten Stewie (NB Fig, 8, CSF.CICID52018 ress De FARO m7 ra e187 ma 197 redid Variance Fig. 9. Variance ofthe extracted PCA dimensions. UNSW-NB15 ToN-loT CSE-CIC-IDS2018 Fig. 10, Variance ofthe extracted LDA dimension the number of dimensions and detection accuracy, Three deep learning models (DFF, CNN and RNN) and three shallow learning classificationDig Carman and Nebo 1 (2024 205-218, j@pr (NB (LR Fig. 11. Performance of Mt clasifers across three NDS datasets, algorithms (LR, DT and NB) have been applied to three recent benchmark [NIDS datasets, ie, UNSW-NBLS, ToN-oT and CSE-CICDS2018. In this paper, the optimal combination foreach dataset has ben mentioned. The ‘optimal numberof extracted feature dimensions has been identified for ‘each dataset through an analysis of variance and their impact on the classification performance. However, among the 18 tried combinations ‘of PE algorithm and MIL classifiers, no single combination performs best ‘across all throe NIDS datasets. Therefor, itis important to note that Finding a combination of an FR algorithm and MI. classifier that performs well acrost a wide range of datasets and in practical application scenarios is far from trivial and needs further investigation. While research which ‘aims {0 improve the intrusion detection and attack classification per {formance fora particular data and feature st by a few percentage points is valuable, we believe a stronger focus should be placed on the gen cerallsabilty of the proposed algorithms, especialy thele performance in ‘more practical network scenarios. In particular, we believe it is erucial to ‘work towards defining generic feature sets that are applicable and ef Gent across a wide range of NDS dataset and practical network settings. ‘Such a benchmark feature set would allow a broader comparison of different ML classifiers and would signifteantly benefit the research ‘community. Finally, explaining the internal operations of ML models would attract the benefits of Explainable Al (KAI) in the NIDS domain Declaration of competing interest “The authors declare that they have no known competing fnanetal {interests or personal relationships that could have appeared to influence the work reported in this paper. References (1st, P.Kosntlaos, Park C Alaa, Lape, uve af i ‘iusConman, av. Terras 20 (0) (2018) 313-495, Satan Rhian, Wong, R-Anadd Srey a0 aed ewok Inesin deco tem sing ache ern aptates, eerste NW ‘Aol 12 (2) 19) 495-50. SEA an, Sl oer review, lek ston, nd open ‘henge: are Gener Compe Sa 2 (208) 385-41 ACNGwte A mW Yani, O8 Lynn ltr np 0 tsonony of secs a 206 de tteveaton Coulee on tectonic Dg (CED), IEEE, 2016, 9p 321-326, ‘A Pato, Oot sus repe ig ot oe siting ransom ae ‘Serpe ty Un ip’ omsntmioeacscn/sog ba iene non oases -2020/, 2000 a o a st 161 Spots, net Sy Threat Report vl 24, 2019, UR i/o Meomeonsiocntrae soto, 171 Yost, erating non ction ie nd ata ining, 2008 Inernaoeal Syponion n Vitus MUlinedaComotng 08 9156298, on /Sor/TON AMC 00859. 101 cari ead, Die Vero, Machen Ving, Atay. {bond moor sro eto chien aed henge mot Secu an (cy Cop tea, Reps ag ej 2008 Pr. Al Hanlin, Gad Zatti iaobannad Crp newer son deo tens fr tea spend ‘tacks eo tant JOCTA Ie big Cats Ag 10 2) GO) Hae, 6. 
cack Neer, Towards evahatn of ln adver ting Paces oe ed ACR CONERT Wahoo Bi DN, Macibe teaming vod hail tence for Da Conmuneston Newt, 2015, Est Men, tn Aa pein of ahi ing i Conference (NAC), EE, 199, pp S717 — ‘Canad Ny We Sun, MN A dep amin proc for news inion detente, Pong fhe 0 a rain! Cnfeence nein nina Common Tein oe Some, Pason, Ove the aed work on ng hie cin fo two nixon Stn: 210 RE Syponan o Sect and ey, Aeon Aus WK, 1c ed ewok ion dtcon with Trecigence in vrmatinané Carcaon GEANONOL TO T7097 "than © roman 8 Hoel, Peformance eration of abanced ‘ean ning sri for ete usa dtr sh Freeing feral Contre cv ie (E201) NETTR, ana di 202, pp S19 peor 1007 7898-15308, EA Lavo Mega, VA iM, Sasi, ahi of “stn tng hrc smat, 10 fem 8 O20 90059014, ‘Anal, Yall A Nove Dimeasan Redon Sch onion ‘Zoe Gay, WS Disco eto ol etn of ‘ewer intron Seaton it nfrmaton Secsty and Pry 201), aes, hepato 10107 are ssntowe 2 Wein, Whang CH, A Reo nrason tetn Mal Bed on Conolooal Neral Netor, Secu wth igen Compung od Beta Seer 01. pp. 71-70, hh ong/10 1087-30 Tom, beck, $1 Hada. snmod, Peromance vletion of enon ton bed on machine ringing Apache ar Prceia Compe 5 127 con) 14, hee, ogo toon pros IA eran ic Malorne, X Smith Dep Leon Yeigus for ber Sky ttaion Beton Deed Asanti aiareieis9 Te 1 09) an oa 3) oa os os) on os 9) p01 en[22] Holo, JO. ech H. chen, A male Jag tas Intusion deeton sppeuch rind ett, 2020 [EE Iterations Conference ous ‘Tehsology cyt 0.1109 cess 200 067255, ‘erwin detect, 2010 IEF! Span on Soar an Pee ‘Femina Hrowayh, Gri M Gala, F. Herr, RC. Prt, Leming en Tatulnand Bata Sm ened, Sage 2018 Ga. Yin € Dong. Yang C205 On the ss imlnce ole, 2008 feoe 2008503 ‘TK Ho, Rando eson forest: reeds of Sed nteatndl Confrence (e Dosunan Ans uadRecogun, a 1, IEE, 195, pp. 278-282, ra ro st (25) Dig Carmina Nebo 1 (2024) 205-218, (27) N, Moss, J. ay, swe compchcsive daa fr network aeuson {cacti ste esis network dla se, 2015 ry Cammunitors ‘se inrmtion Stone Coserenes (MIS 10.1208 en 2015 7348912, N sa, Tonio Dtvty 8 ie ons 10.2122 a, fantaz seins? UR [Staal A Heb! Lash, AA, Govan, Toward gent @ ew lnrson detection datet and tntrson fi chance, rocteings fhe Iasazdoasonl aptly canon UW. Chen, zhang: Wo, Baling ato-ncoer ntson detection stem tes on radi fret etre ean, Compt Seu 95 (2020) 2018S, ge aey10a016/jeme 2000103851 28) ws) (30)