DDOS Detection on Internet of Things Using Unsupervised Algorithms
Abstract
The increasing deployment of IOT networks has improved the productivity
of people and organisations. However, IOT networks are increasingly
becoming platforms for launching DDOS attacks because of the inherently weaker
security and resource-constrained nature of IOT devices. This paper focuses
on detecting DDOS attacks in IOT networks by classifying incoming network
packets on the transport layer as either “Suspicious” or “Benign” using
unsupervised machine learning algorithms. In this work, two deep learning
algorithms and two clustering algorithms were independently trained for
mitigating DDOS attacks. Emphasis was laid on exploitation-based DDOS
attacks, which include Transmission Control Protocol (TCP) SYN-Flood attacks and
UDP-Lag attacks. The Mirai, BASHLITE and CICDDOS2019 datasets were used
to train the algorithms during the experimentation phase. The accuracy
score and the normalized mutual information (NMI) score are used to quantify the
classification performance of the four algorithms. Our results show that the
autoencoder performed best overall, with the highest accuracy across all the
datasets.
1 Introduction
The growth in sensors and computing devices has made life easier and
more convenient through the fast and accurate computation of our information.
Nevertheless, the rapid increase in the deployment and combination of
connected devices has exposed essential resources to DDOS threats [1]. In
2016, the Mirai attack that brought down several notable websites exposed
the weakness of IOT devices [2]. Over 100,000 inadequately protected players,
cameras, digital video recorders and other IOT devices were turned into
bots. The subsequent release of the Mirai source code resulted in
frequent additional IOT attacks. Given the magnitude of the attacks that have been
launched, securing IOT devices is a problem: host-centric IT security
solutions cannot be fully relied upon because most appliance manufacturers
prioritize functionality and cost over security. Besides, unlike
servers, which can undergo software updates, IOT software is rarely or never
updated, making the devices more vulnerable to attackers. In view of these
security problems and the resource-constrained nature of IOT devices, greater
focus should be placed on packet security within the IOT network.
Traditional network-centered security has relied on predefined signatures
or system representations for known threats [3]. Recently, awareness of
using machine learning to secure networks has increased rapidly. However,
many machine learning solutions use supervised learning, i.e. they
create attack classifiers by training on identified anomalies [4], which makes
them ineffective against new threats. The main aim of this work is to determine the
performance of unsupervised learning algorithms in accurately classifying
network packets as either benign or malicious. We achieve this by training the
algorithms on modern DDOS datasets and performing rigorous testing while
benchmarking the performance of the algorithms using standard performance
metrics.
The rest of the paper is organized as follows: Section 2 presents related
work; Section 3 describes the datasets and the methodology followed
in the research; Section 4 details the experimentation procedures, the
results obtained and the observations drawn from them; and Section 5
concludes the paper and highlights future work.
2 Related Work
According to [5], network intrusion detection systems have traditionally
been rule-based. Nevertheless, machine learning and statistical approaches
DDOS Detection on Internet of Things Using Unsupervised Algorithms 571
have also made major contributions [5]. Machine learning has also proven
to be effective in two main ways of securing networks: feature
engineering (i.e. the ability to extract the most important structures from network
data to assist model learning) [6] and classification. In a security environment,
classification tasks usually involve training on both suspicious and benign data
to build models that can detect known attacks [7].
The authors in [8] pointed out that steps such as the collection of network
information, feature extraction and analysis, and classification detection
provide a means for building efficient software-based tools that can detect
anomalies in environments such as software-defined networking (SDN). Another study [9]
provides a thorough classification of DDOS attacks in terms of detection
technology. The study also emphasizes how the characteristics of the network
security of an SDN define the possible approaches to setting up a defense
against DDOS attacks. Similarly, [10] explores this area. Among other
approaches to DDOS defense, [4] proposes a scheduling-based SDN controller
architecture to effectively limit attacks and protect networks under DoS attacks.
The growth of cloud computing and IOT has inevitably led to the
migration of denial-of-service attacks on cloud computing devices as well.
Thus, cloud computing devices must implement efficient DDOS detection
systems in order to avoid loss of control and breach of security [11]. Studies
such as [12] aimed at tackling this problem by determining the source
of a DDOS attack using PTrace (powerful trace) source control methods.
PTrace controlled such attack sources from two aspects, packet filtering
and malware tracing, to prevent the cloud from becoming a tool for DDOS
attacks. Other studies such as [13] approach the problem of filtering by
using a set of security services called filter trees. In the study, XML and
HTTP based DDOS attacks are filtered out using five filters for detection
and resolution. Detection based on classification has also been proposed and
a classifier system for detection against DDOS TCP flooding attacks was
created [14]. These classifiers work by taking in an incoming packet as input
and then classifying the packet as either suspicious or otherwise. The nature
of an IP network is often susceptible to changes such as the flow rate on
the network and in order to deal with such changes, self-learning systems
have been proposed that learn to detect and adapt to such changes in the
network [15].
Many of the existing models for DDOS detection have focused primarily
on SYN-flood attacks and have not been trained to detect botnet attributes.
More studies are thus needed in which models are trained to detect botnets, as
botnets are becoming the main technology for organizing and executing DDOS attacks [16].
572 V. Odumuyiwa and R. Alabi
3 Methodology
A DDOS attack temporarily or indefinitely constrains the availability of a
network resource to its intended users. The challenge then for the network
administrator is to deploy DDOS detection systems that are capable of
analysing incoming packets to the transport layer. These detection systems
may then determine if these incoming packets are suspicious or benign. In
the following subsections, we present the design methodology for a DDOS
detection system that uses unsupervised machine learning algorithms. The
problem is therefore to design and train four efficient unsupervised machine
learning systems that are capable of detecting a DDOS attack on the transport
layer.
3.1 Datasets
In order to train the unsupervised machine learning algorithms, the following
DDOS attack datasets were sourced.
1. The first dataset is the DDOS evaluation dataset (CICDDOS2019) [27].
The full dataset consists of both reflection and exploitation based DDOS
attacks in the form of both suspicious and benign network packets. The
dataset is further grouped into TCP and UDP based attacks.
2. The second dataset is the Mirai dataset created by [28]. Mirai is a specific
type of botnet malware that takes over networked Linux devices and
successfully turns them into bots used for distributed attacks such as
DDOS. The Mirai dataset consists of 80,000 SYN-flood instances and
65,000 UDP-lag attacks on security camera IOT devices.
3. Finally, the third dataset is the BASHLITE botnet attack dataset
on a webcam IOT device and is also provided by [28]. Similar to
Mirai, BASHLITE is a botnet malware for distributed attacks on net-
worked devices. The BASHLITE dataset consists of 110,000 SYN-flood
instances and 100,000 UDP-lag attacks. Both Mirai and BASHLITE are
open-source malware that can be used for academic research purposes.
subspace that is lower and a decoder that recreate the input from this latent
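subspace”. The encoder and the decoder can be defined as transitions φ and
ψ such that:
$$\phi : \mathcal{X} \rightarrow \mathcal{F}$$
$$\psi : \mathcal{F} \rightarrow \mathcal{X}$$
$$\phi, \psi = \underset{\phi, \psi}{\arg\min} \, \lVert X - (\psi \circ \phi) X \rVert^{2} \qquad (2)$$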
Figure 2 Restricted Boltzmann machine showing the visible and hidden units.
Each neuron in the hidden layers of the encoder and decoder makes use
of the Rectified Linear Unit (ReLU) activation function. The hyperparameters
selected for the autoencoder model are outlined as follows:
• Batch size: A batch size of 2048 is selected.
• Number of epochs: Between 10 and 20 training epochs are used.
• Loss: The mean squared error loss function is used.
• Optimizer: Adam, an adaptive algorithm, is selected. It is a state-of-the-art
optimizer for deep neural networks (Schneider et al., 2019).
• Betas: These are the Adam optimizer coefficients used for computing
running averages of the gradient and its square, set to (0.5, 0.999).
• Learning rate: 0.0002.
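To make the optimizer settings concrete, the sketch below applies the standard Adam update rule to a single scalar parameter, using the betas (0.5, 0.999) and learning rate 0.0002 listed above. The function name `adam_step`, the epsilon value of 1e-8 and the toy quadratic loss are illustrative assumptions, not details taken from the experiments.

```python
import math

def adam_step(theta, grad, m, v, t, lr=0.0002, betas=(0.5, 0.999), eps=1e-8):
    """Apply one Adam update to a scalar parameter (illustrative helper)."""
    beta1, beta2 = betas
    m = beta1 * m + (1.0 - beta1) * grad          # running average of the gradient
    v = beta2 * v + (1.0 - beta2) * grad * grad   # running average of its square
    m_hat = m / (1.0 - beta1 ** t)                # bias correction for step t >= 1
    v_hat = v / (1.0 - beta2 ** t)
    theta = theta - lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v

# Toy example (assumption): minimise the loss (theta - 1)^2 starting from 0.
theta, m, v = 0.0, 0.0, 0.0
for t in range(1, 1001):
    grad = 2.0 * (theta - 1.0)   # derivative of the toy loss
    theta, m, v = adam_step(theta, grad, m, v, t)
```

Because Adam normalizes the step by the running root-mean-square of the gradient, each update moves the parameter by roughly the learning rate, which is why a small value such as 0.0002 yields slow but stable progress.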
In Equation (3), w_ij are real-valued weights associated with v_j and
h_i, and b_j and c_i are real-valued bias terms associated with the visible
unit j and the hidden unit i respectively. The contrastive divergence learning
algorithm is one of the successful training algorithms used to approximate the
gradient of the log-likelihood and perform gradient ascent to maximize the likelihood [24].
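Equation (3) itself does not appear in this section. For reference, the standard RBM energy function that matches the symbols defined above is given below. It should be read as an assumed reconstruction in the usual textbook form, with m visible units and n hidden units, rather than as the authors' exact notation.

```latex
E(v, h) = -\sum_{i=1}^{n} \sum_{j=1}^{m} w_{ij} h_i v_j
          - \sum_{j=1}^{m} b_j v_j
          - \sum_{i=1}^{n} c_i h_i ,
\qquad
p(v, h) = \frac{1}{Z} e^{-E(v, h)}
```

Here Z is the normalizing constant (partition function). Lower energy corresponds to higher probability, which is why maximizing the likelihood of the training data lowers the energy of configurations resembling that data.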
The RBM used for this project is a two-layer RBM with the architecture
shown in Figure 2. The dataset consists of continuous variables scaled
between 0 and 1, so we model the RBM as a continuous-variable
model in which the hidden and visible units take on values
(v, h) ∈ [0, 1]^m, where m is the number of visible units. Similar to the
autoencoder algorithm, we use the reconstruction error to define the classification
task for the RBM. The parameters selected for the RBM include:
• Number of units: The number of hidden and visible units are set to
be the same value according to the number of features present in the
training data. That is, for instance for the CICDDOS2019, the number
of hidden and visible units for the RBM is 77.
• Training algorithm: The k-step contrastive divergence algorithm with
Gibbs sampling is used for training the algorithm with k = 10.
• Training epochs: Ten epochs are used; experimental results show
that increasing the number of epochs beyond 10 does not improve training results.
3.3.3 K-Means
The K-means algorithm takes the full dataset consisting of multiclass data
points and clusters the data points into separate clusters to the best of its
ability. Classification occurs when an input is fed in and the model
assigns it to one of the computed clusters. Given a set of
observations (x_1, x_2, x_3, ..., x_n), where each observation is a d-dimensional real
vector, the k-means objective is to partition the n observations into k (≤ n)
sets S = {s_1, s_2, ..., s_k} so as to minimize the within-cluster sum of squares
(WCSS), i.e. the variance.
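The partitioning objective described above can be sketched with a plain implementation of Lloyd's algorithm, which alternates between assigning points to the nearest centroid and moving each centroid to the mean of its members. The function names, the explicit initial centroids, the iteration cap of 300 and the toy two-dimensional data are assumptions made for illustration.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, init_centroids, iterations=300):
    """Lloyd's algorithm; returns centroids, labels and the final WCSS."""
    centroids = [tuple(c) for c in init_centroids]
    k = len(centroids)
    labels = None
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j]))
                      for p in points]
        if new_labels == labels:   # assignments stopped changing
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:            # keep the old centroid if a cluster is empty
                centroids[j] = tuple(sum(col) / len(members) for col in zip(*members))
    wcss = sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))
    return centroids, labels, wcss

# Hypothetical data: two well-separated groups of scaled feature vectors.
points = [(0.0, 0.1), (0.1, 0.0), (0.05, 0.05),
          (0.9, 1.0), (1.0, 0.9), (0.95, 0.95)]
centroids, labels, wcss = kmeans(points, [points[0], points[3]])
```

With k = 2 the two clusters play the role of the “Suspicious” and “Benign” groups, but the cluster indices themselves are arbitrary and must be mapped to class names before an accuracy can be computed.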
The K-Means clustering algorithm has relatively fewer parameters to
select. The default “pure” version of the K-means algorithm is used as
as [22, 26] and the parameters for the expectation maximization algorithm
include:
• Number of components: This is the number of clusters to be estimated
and is set to two because of the binary classification task of suspicious
or benign.
• Number of iterations: The number of iterations is like the epoch of the
autoencoder where they both define the number of training iterations to
run the algorithm. A default value of 300 is used.
• Covariance type: The covariance parameter defines the structure of the
covariance matrix with respect to each component or cluster. The “full”
covariance is chosen where each cluster has its own covariance matrix
and has been shown to achieve the best results.
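The parameter list above maps naturally onto a small expectation-maximization loop. The sketch below fits a two-component Gaussian mixture to one-dimensional data, with each component keeping its own variance (the one-dimensional analogue of the “full” covariance setting) and a cap of 300 iterations. The function names, the initial variance of 0.05, the convergence tolerance and the toy data are assumptions for illustration; the experiments themselves operate on multi-dimensional packet features.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def em_two_gaussians(data, init_means, init_var=0.05, iterations=300, tol=1e-9):
    """EM for a 1-D mixture of two Gaussians, each with its own variance."""
    weights = [0.5, 0.5]
    means = list(init_means)
    variances = [init_var, init_var]
    for _ in range(iterations):
        # E-step: responsibility of each component for each observation.
        resp = []
        for x in data:
            dens = [w * gaussian_pdf(x, mu, var)
                    for w, mu, var in zip(weights, means, variances)]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: re-estimate weight, mean and variance of each component.
        old_means = means[:]
        for k in range(2):
            nk = sum(r[k] for r in resp)
            weights[k] = nk / len(data)
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            spread = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / nk
            variances[k] = max(spread, 1e-6)   # floor avoids a collapsing variance
        if max(abs(a - b) for a, b in zip(means, old_means)) < tol:
            break
    # Hard assignment: pick the component with the larger responsibility.
    labels = [0 if r[0] >= r[1] else 1 for r in resp]
    return weights, means, variances, labels

# Hypothetical 1-D feature values forming two groups.
data = [0.1, 0.2, 0.15, 0.9, 1.0, 0.95]
weights, means, variances, labels = em_two_gaussians(data, init_means=[0.0, 1.0])
```

Unlike K-means, the mixture model produces soft responsibilities for each component, and the hard label is only taken at the end by picking the more probable component.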
where y_i is the original input vector and ŷ_i is the reconstructed output
vector. The mean squared error is computed over all the outputs of the model.
Ideally, the mean squared error should be close to zero. However,
depending on the size of the values in the predicted output, a mean squared
error whose first non-zero digit falls between the second and fifth decimal
place (roughly 10^-2 to 10^-5) is acceptable. Representing the
reconstruction error as the mean squared error makes it possible to tell when the
model is presented with an input that is very different from what was contained
in the training set. Thus, if for instance the autoencoder is trained on a dataset
comprising only benign packets, then whenever a benign packet is presented
to the autoencoder, we expect the reconstructed output to be quite
similar to the input and the reconstruction error to be low. However, if this
same model is presented with a suspicious packet whose features differ markedly
from those of a benign packet, we should expect the reconstruction error
to be quite high. The same logic can be applied to the restricted Boltzmann
machine.
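The reasoning above can be illustrated with a short sketch that computes the mean squared reconstruction error and compares it with a cut-off. It follows the benign-trained example of the preceding paragraph; the function names, the sample vectors, the made-up reconstructions and the threshold of 0.01 are all assumptions, since this section does not state how the cut-off is chosen.

```python
def reconstruction_error(original, reconstructed):
    """Mean squared error between an input vector and its reconstruction."""
    squared = [(y - y_hat) ** 2 for y, y_hat in zip(original, reconstructed)]
    return sum(squared) / len(original)

def classify(original, reconstructed, threshold):
    """Label a packet by comparing its reconstruction error with a threshold."""
    error = reconstruction_error(original, reconstructed)
    return "Suspicious" if error > threshold else "Benign"

# Hypothetical feature vectors scaled to [0, 1]. The reconstructions are made up
# to mimic a model that was trained only on benign traffic.
benign_packet, benign_recon = [0.10, 0.20, 0.30], [0.11, 0.19, 0.30]
odd_packet, odd_recon = [0.90, 0.05, 0.80], [0.20, 0.25, 0.35]
THRESHOLD = 0.01  # assumed value; in practice it would be tuned on held-out data

benign_label = classify(benign_packet, benign_recon, THRESHOLD)
odd_label = classify(odd_packet, odd_recon, THRESHOLD)
```

If the model is instead trained on attack traffic, the direction of the comparison flips: a low error then indicates a suspicious packet.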
With this formulation established, it is easier to frame the classification
problem using the autoencoder and RBM. In our example, a low
reconstruction error means the packet is benign, while a high reconstruction
error means the packet is suspicious. Using these predictions, we can then
compute the accuracy much like we did with the K-Means model. The
accuracy score is simply the ratio of the number of correctly classified
packets to the total number of classified packets.
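Both evaluation metrics can be computed directly from the true and predicted labels, as the sketch below shows. It assumes NMI is normalized by the arithmetic mean of the two label entropies, which is one common convention; this section does not say which normalization was used, and the helper names and sample labels are illustrative.

```python
import math
from collections import Counter

def accuracy(y_true, y_pred):
    """Fraction of packets whose predicted label matches the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

def entropy(labels):
    """Shannon entropy (natural log) of a label sequence."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())

def nmi(y_true, y_pred):
    """Mutual information normalised by the mean of the two label entropies."""
    n = len(y_true)
    joint = Counter(zip(y_true, y_pred))
    true_counts, pred_counts = Counter(y_true), Counter(y_pred)
    mi = sum((c / n) * math.log(c * n / (true_counts[t] * pred_counts[p]))
             for (t, p), c in joint.items())
    denom = (entropy(y_true) + entropy(y_pred)) / 2.0
    return mi / denom if denom > 0 else 1.0

# Hypothetical labels: 0 = benign, 1 = suspicious, with one misclassified packet.
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 1, 1, 1, 1, 1]
acc = accuracy(y_true, y_pred)
score = nmi(y_true, y_pred)
```

NMI does not change when the predicted label names are swapped, which suits clustering models whose cluster indices are arbitrary. Accuracy, in contrast, requires the clusters to be mapped to class labels first.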
4 Experimental Results
In this section, we present the experimental results for each model across all
datasets. The results are presented in subsections, with each subsection
dedicated to a model. For the Autoencoder and Restricted Boltzmann Machine,
the subsections consist of plots showing the training and test loss, a table
summarizing the performance across the datasets and a detailed discussion of
the results. For the remaining models, which do not optimize a loss function,
only the summary tables and a detailed discussion of the results are
presented. Performance evaluations are carried out using the accuracy
and Normalized Mutual Information (NMI) scores. Tables 5 and 6 summarize
the results of all the models.
Figure 3 Autoencoder training and validation loss on the CICDDOS2019 SYN-Flood (a)
and UDP-Lag (b) data.
Figure 4 Autoencoder training and validation loss on the Mirai SYN-flood (a) and UDP-Lag
data (b).
the training epoch increases. It is important to point out that the autoencoder
is trained to reconstruct SYN-Flood data, meaning it should be unable to
reconstruct benign data. We chose the SYN-Flood data for training because
there were more instances than the benign data. The same choice is made for
the UDP-Autoencoder model, where we train it on the UDP-Lag data instead
of on benign UDP data.
In Figures 4(a) and 4(b), the training and validation loss of the Autoencoder
model on the Mirai SYN-Flood and UDP-Lag data reduces steadily as the
training epochs increase. By epoch 10, the loss starts to
converge quickly to zero. Therefore, 20 training epochs are sufficient, and
training beyond that point is avoided to prevent overfitting.
Table 1 Test accuracy and Normalized Mutual Information (NMI) score for the autoencoder
models on the SYN-Flood and UDP-Lag data across all datasets
Data                      Accuracy   NMI
CICDDOS2019 SYN-Flood     0.8945     0.5363
CICDDOS2019 UDP-Lag       0.8617     0.4216
Mirai SYN-Flood           0.9744     0.6211
Mirai UDP-Lag             0.9621     0.5733
BASHLITE SYN-Flood        0.9933     0.9927
BASHLITE UDP-Lag          0.9921     0.9822
The training and validation loss on the BASHLITE data, shown in Figure 5,
is less steep than that on the Mirai data. Table 1 shows the accuracy of the
autoencoder model across all datasets. The model has the
highest accuracy on the BASHLITE dataset, which is consistent with the loss being
less steep and flattening out quickly, by epoch 10 and epoch 14 respectively.
In Table 1, we present the accuracy and NMI scores for the autoencoder
model. These scores were determined based on the formulation described in
Section 3.4. The results show that the autoencoder model performs best on
the BASHLITE SYN-Flood data, with the highest accuracy of 99%. In general,
the autoencoder performs better on the Mirai and BASHLITE datasets than
on the CICDDOS2019 dataset. Does this mean the model is better
suited to detecting botnet attacks? We suspect that this difference is due to the feature
selection process for the datasets: the Mirai and BASHLITE datasets show
less variance across the features than the CICDDOS2019
dataset. The NMI scores are higher where the accuracy is
higher, indicating stronger agreement between the target labels and the
predicted labels.
Figure 5 Autoencoder training and validation loss on the BASHLITE SYN-flood (a) and
UDP-Lag data (b).
generative models using batch sampling because the batch sampling prevents
the loss function from settling in a local optimum. The oscillations in the loss
indicate that the training algorithm continues to explore the search space not
settling for a local optimum. The test set results of the RBM on the Mirai
dataset show improvement over the CICDDOS2019 dataset with its highest
accuracy score being recorded against the Mirai UDP-Lag (Table 2).
The RBM's performance on the BASHLITE dataset is similar to its
performance on the Mirai data; still, the overall performance is much lower
Figure 6 Restricted Boltzmann machine training loss on the CICDDOS2019 SYN-flood (a)
and UDP-Lag data (b).
than that of the Autoencoder. The results indicate that the RBM is less suited
for the kind of precise reconstruction of the continuous input value that is
easily achieved by the autoencoder. The stochastic property of the RBM's
hidden units makes it difficult to accurately reconstruct the continuous input
from a large unknown latent space. The autoencoder solves this problem by
first encoding the input vector into a lower dimensional space thus reducing
the dimensionality of the latent space, making the resampling process more
Table 2 Test accuracy and Normalized Mutual Information (NMI) score for the Restricted
Boltzmann machine model on the SYN-Flood and UDP-Lag data across all datasets
Data                      Accuracy   NMI
CICDDOS2019 SYN-Flood     0.5651     0.1919
CICDDOS2019 UDP-Lag       0.5089     0.1103
Mirai SYN-Flood           0.6067     0.1639
Mirai UDP-Lag             0.7797     0.3895
BASHLITE SYN-Flood        0.6709     0.2506
BASHLITE UDP-Lag          0.6210     0.1007
Table 3 Test accuracy and Normalized Mutual Information (NMI) score for the K-Means model
on the SYN-Flood and UDP-Lag training and validation data
Data                      Accuracy   NMI
CICDDOS2019 SYN-Flood     0.7538     0.1949
CICDDOS2019 UDP-Lag       0.7139     0.1427
Mirai SYN-Flood           0.7636     0.0912
Mirai UDP-Lag             0.7478     0.1387
BASHLITE SYN-Flood        0.6451     0.1059
BASHLITE UDP-Lag          0.6823     0.1306
accurate and less computationally intensive. The NMI scores for the RBM
are low across all datasets, showing poor agreement between the target and
predicted values.
Figure 7 Restricted Boltzmann machine training loss on the Mirai SYN-flood (a) and UDP-
Lag data (b).
The NMI scores for the K-Means model are also relatively low, indicating
weak agreement between the target and predicted labels. Although one should
interpret the accuracy and NMI scores independently, the low NMI scores for
the K-Means model discourage one from being too optimistic about the model’s performance.
Figure 8 Restricted Boltzmann machine training loss on the BASHLITE SYN-flood (a) and
UDP-Lag data (b).
Table 4 Test accuracy and Normalized Mutual Information (NMI) score for the EM model on the
SYN-Flood and UDP-Lag training and validation data
Data                      Accuracy   NMI
CICDDOS2019 SYN-Flood     0.7096     0.1144
CICDDOS2019 UDP-Lag       0.6759     0.1446
Mirai SYN-Flood           0.7030     0.2771
Mirai UDP-Lag             0.8051     0.2901
BASHLITE SYN-Flood        0.7636     0.3074
BASHLITE UDP-Lag          0.7575     0.2678
Table 5 Summary of the accuracies across all datasets and all models
Dataset/Model             Autoencoder   Restricted Boltzmann Machine   K-Means   Expectation-Maximization
CICDDOS2019 SYN-Flood     0.8945        0.5651                         0.7538    0.7096
CICDDOS2019 UDP-Lag       0.8617        0.5089                         0.7139    0.6759
Mirai SYN-Flood           0.9744        0.6067                         0.7636    0.7030
Mirai UDP-Lag             0.9621        0.7797                         0.7478    0.8051
BASHLITE SYN-Flood        0.9933        0.6709                         0.6451    0.7636
BASHLITE UDP-Lag          0.9921        0.6210                         0.6823    0.7575
Table 6 Summary of the Normalized Mutual Information score across all datasets and all
models
Dataset/Model             Autoencoder   Restricted Boltzmann Machine   K-Means   Expectation-Maximization
CICDDOS2019 SYN-Flood     0.5363        0.1919                         0.1949    0.1144
CICDDOS2019 UDP-Lag       0.4216        0.1103                         0.1427    0.1446
Mirai SYN-Flood           0.6211        0.1639                         0.0912    0.2771
Mirai UDP-Lag             0.5733        0.3895                         0.1387    0.2901
BASHLITE SYN-Flood        0.9927        0.2506                         0.1059    0.3074
BASHLITE UDP-Lag          0.9822        0.1007                         0.1306    0.2678
5 Conclusion
The unsupervised machine learning models were trained on both SYN-Flood
and UDP-Lag DDOS datasets. The training and test results both show that
References
[1] Shirazi, “Evaluation of anomaly detection techniques for SCADA
communication resilience,” IEEE Resilience Week, 2016.
[2] N. Mirai, “mirai-botnet,” 2016. [Online]. Available:
https://www.cyber.nj.gov/threat-profiles/botnetvariants/mirai-botnet.
[Accessed 31 December 2019].
[3] H. Zhou, B. Liu and D. Wang, “Design and research of urban intelligent
transportation system based on the Internet of Things,” Internet of
Things, pp. 572–580, 2012.
[4] S. Lim, S. Yang and Y. Kim, “Controller scheduling for continued
SDN operation under DDOS attacks,” Electronics Letters, pp. 1259–1261,
2015.
[5] A. Buczak and E. Guven, “A survey of data mining and machine learning
methods for cyber security intrusion detection,” IEEE Communications
Surveys & Tutorials, vol. 18, no. 2, 2016.
Biographies