Adv Net Attack Detection (DoS, Probe) Gemini AI Generated Index From DATAPRO

This document discusses the challenges and advancements in Network Intrusion Detection Systems (NIDS), emphasizing the limitations of traditional methods in detecting sophisticated cyber threats. It outlines the objectives of developing a machine learning model using a 1D Convolutional Neural Network enhanced by Genetic Algorithms for feature selection, specifically targeting Denial-of-Service and Probe attacks. The report is structured into chapters covering the introduction, literature review, dataset characteristics, methodology, system design, results, and conclusions.


Chapter 1: Introduction

1.1 Background and Motivation:-


Modern cybersecurity faces an escalating threat landscape, making Network
Intrusion Detection Systems (NIDS) a critical component of defence strategies.
Traditional intrusion detection systems, often relying on signature-based or rule-driven
approaches, exhibit significant limitations in identifying sophisticated and previously
unseen (zero-day) attacks.1 These conventional tools, such as firewalls and antivirus
software, frequently prove inadequate against the dynamic and evolving nature of cyber
threats, including denial-of-service (DoS) attacks and data breaches.1
The inherent challenges in network security, such as the adversarial environment
where attackers actively seek to bypass detection mechanisms and the rapid "concept
drift" in network traffic patterns, necessitate a fundamental shift in defence paradigms. 3
This environment means that network traffic characteristics and attack methodologies
quickly become outdated, diminishing the value of static detection rules. 3 Consequently,
there is an increasing reliance on data-driven approaches, particularly Machine Learning
(ML) and Deep Learning (DL), to develop more robust and adaptive threat detection
systems.2 ML and DL-based NIDS are capable of recognizing complex patterns indicative
of malicious activities, thereby offering a proactive means to detect and counter cyber
threats.1 This continuous evolution in attack strategies implies that NIDS development is
not a static solution but an ongoing process demanding continuous model retraining,
updated datasets, and rigorous validation methodologies to maintain efficacy against an
intelligent adversary.

1.2 Problem Statement:-


Accurately detecting Denial-of-Service (DoS) and Probe attacks in complex and
high-speed network environments presents a significant challenge. These attacks exhibit
varying characteristics, from overwhelming traffic floods in DoS attacks to subtle
scanning patterns in Probe attacks, making their differentiation from legitimate traffic
difficult.7 Furthermore, network intrusion detection datasets, such as UNSW_NB15 and
CICIDS2017, are often characterized by high dimensionality and severe class imbalance,
where normal traffic vastly outnumbers attack instances.6 These data properties can
significantly degrade the performance of machine learning models, leading to biased
predictions and reduced accuracy, particularly for minority attack classes. 6 The need for
efficient feature selection techniques that can identify the most relevant attributes from
these voluminous datasets, coupled with robust deep learning architectures capable of
learning intricate patterns, remains a critical area for research and development.

1.3 Objectives of the Study:-


The primary objectives of this study are:
To develop and evaluate a machine learning model, specifically a 1D
Convolutional Neural Network (1DCNN), enhanced by Genetic Algorithms for
optimal feature selection, to accurately detect DoS and Probe attacks.
To thoroughly assess the model's performance using standard evaluation metrics,
including accuracy, precision, recall, and F1-score, on the UNSW_NB15 and
CICIDS2017 datasets.
To design and implement a user-friendly Flask-based web application that
facilitates real-time network attack prediction and provides intuitive visualization
of results.

1.4 Scope and Limitations


This study focuses specifically on the detection of Denial-of-Service
(DoS) and Probe attacks within network environments. The detection mechanisms are
developed and evaluated using two publicly available benchmark datasets: UNSW_NB15
and CICIDS2017. The architectural choices are centered on a 1DCNN for attack
classification, with Genetic Algorithms employed for feature selection.
A significant limitation inherent to network intrusion detection research is
the variability and concept drift observed in real-world network traffic. 3 Each network
environment possesses unique traffic patterns, implying that a detection model trained on
one network might underperform when deployed on another.3 Additionally, collected
network data can become outdated relatively quickly due to the rapid evolution of cyber
threats and network behaviours, thereby diminishing the data's long-term value. 3 While
this project aims to develop a robust detection system, it acknowledges that maintaining
optimal performance in a constantly changing real-world environment would necessitate
continuous monitoring, model retraining, and potentially the integration of adaptive
learning mechanisms or transfer learning techniques, which extend beyond the immediate
scope of this study.
1.5 Structure of the Report
This report is organized into seven chapters. Chapter 1 provides an introduction,
outlining the background, problem statement, objectives, and scope of the study. Chapter
2 presents a comprehensive literature review on intrusion detection systems, comparing
traditional and machine learning-based approaches, and identifying existing research
gaps. Chapter 3 details the datasets used, including their characteristics and the extensive
pre-processing steps undertaken. Chapter 4 describes the methodology and model
development, focusing on the 1DCNN architecture and the role of Genetic Algorithms.
Chapter 5 outlines the system design and implementation of the Flask web application.
Chapter 6 presents and discusses the experimental results and performance evaluations
for DoS, Probe, and overall network attacks. Finally, Chapter 7 concludes the report by
summarizing the contributions and suggesting avenues for future work.

Chapter 2: Literature Review


2.1 Overview of Intrusion Detection Systems:-
Intrusion Detection Systems (IDSs) are crucial security mechanisms designed to monitor computer networks and
systems for unauthorized use, misuse, or abuse.3 NIDS, specifically, inspect network
traffic to identify suspicious activity and raise alerts or directly block malicious
communication.3 IDSs are broadly categorized into network-based (NIDS), which analyse
network packets, and host-based (HIDS), which monitor individual system calls. 7 They
can also be classified by their detection methodology: signature-based IDSs identify
known attack patterns by comparing network activity against a database of signatures,
while anomaly-based IDSs build a model of "normal" behaviour and flag any deviations
as potential intrusions.10 Modern NIDS solutions often combine both signature and
anomaly detection techniques to provide comprehensive real-time threat detection
capabilities.10

2.2 Traditional vs. Machine Learning-Based IDS:-


Traditional IDSs primarily rely on predefined rules and signatures to detect threats. While
effective against known attacks, these systems often prove inadequate in detecting zero-day
attacks or novel variations of existing threats, leading to high rates of both false positives and
false negatives. The static nature of their rule sets makes them vulnerable to sophisticated,
evolving cybercriminal tactics.1

In contrast, Machine Learning (ML) and Deep Learning (DL) approaches offer significant
advantages by enabling systems to proactively recognize and preempt potential cyber threats
through the analysis of large datasets of security events.1 ML algorithms can discern complex
patterns indicative of malicious activities, adapting to new attack forms that traditional firewalls
and antivirus software might miss.1 This capability is particularly vital given the dynamic nature
of network traffic and the constant emergence of new threats in environments like the Internet of
Things (IoT).5 DL models, such as Convolutional Neural Networks (CNNs) and Long Short-Term
Memory (LSTMs), can automatically extract and enhance features from raw network traffic data,
overcoming the limitations and labour costs associated with manually extracted features.2 This
approach allows for more robust detection of both known and unknown threats, significantly
enhancing network security measures.2

2.3 Related Work and Comparative Analysis:-


Extensive research has explored the application of ML/DL algorithms for DoS
and Probe attack detection using benchmark datasets. For DoS attack detection, various ML
algorithms have demonstrated high accuracies. Gradient Boosting (GB) and XGBoost, for
instance, achieved accuracies of 99.99% and 99.98% respectively on the CICDoS2019 dataset.14
Other studies using NSL-KDD reported excellent detection accuracy with Random Forest (RF)
and Support Vector Machine (SVM).15 A comparative analysis on DDoS detection revealed that
RF achieved 99.4% accuracy, KNN 96.88%, and SVM 92.12%.14 For 1D-CNN models
specifically, a study using the UNSW-NB15 dataset achieved a detection rate of 98.79% for
binary classification and 97.23% for multiclass classification.6 Another work on UNSW-NB15
reported RF achieving 98.6% accuracy.2

Probe attacks, often categorized as reconnaissance or port scanning, are also a significant focus.
Machine learning techniques like Naïve Bayes, SVM, and Decision Trees have been applied to
datasets like KDDcup99 for Probe attack detection.7 For port scanning attacks, a Naive Bayes
model achieved 93.5% accuracy.15 F1-scores for probe detection have been reported as high as
98%.16

Genetic Algorithms (GAs) have garnered significant attention for feature selection in
NIDS due to their adaptability and suitability for dynamic rule generation.17 Studies have shown
GAs can effectively reduce the number of features while maintaining or improving F1-scores. For
example, a GA achieved an F1-score of 76.89% with 40 features on the HIKARI-2021 dataset,
outperforming Chi-squared, PCA, and LDA for feature selection.17 However, it is important to
note that the effectiveness of a feature selection algorithm, such as GA, is not universally superior
but can be contingent on the specific machine learning algorithm it is paired with and the
characteristics of the dataset. For instance, Particle Swarm Optimization (PSO) achieved better
accuracy and precision than GA for J48 and SVM classifiers in one study.17 This highlights that
while GA is powerful, a holistic approach considering the entire ML pipeline (feature selection
combined with classifier) and dataset characteristics is crucial, and a one-size-fits-all claim should
be approached with nuance.

2.4 Research Gaps Identified


Despite advancements in ML/DL for NIDS, several challenges persist. A significant gap
lies in the consistent methodology for determining dataset suitability for specific use cases, as
researchers are often unaware of the optimal datasets for their scenarios.3 Furthermore, the basis
for determining the score values used in feature selection techniques, such as Information Gain, is
frequently not explicitly mentioned in existing literature.9 This lack of transparency and
standardization impedes the replicability, realism, and robustness of proposed NIDS methods.3
Without clear methodologies for data validation, pre-processing, and feature selection, comparing
and building upon existing research becomes problematic. This project aims to contribute to
addressing this by providing detailed justifications for its methodological choices, thereby
enhancing the transparency and reproducibility of NIDS research, which in turn improves its
practical applicability and scientific rigor.

Chapter 3: Dataset and Pre-processing


3.1 Introduction to the CICIDS2017 Dataset:-
The CICIDS2017 dataset, developed by the Canadian Institute for Cybersecurity, is a
widely recognized benchmark for intrusion detection research. It is known for its realistic network
traffic, capturing normal (Benign) behaviour alongside 14 distinct types of attacks, including
DoS/DDoS and Port Scan.5 The dataset is provided in CSV format and comprises a high volume
of traffic with a large number of features, making it suitable for evaluating the performance of
intrusion detection systems against contemporary cyber threats.9 Its comprehensive nature and
detailed attack categorization make it a valuable resource for training and testing machine
learning models for network security.2

3.2 Introduction to the UNSW_NB15 Dataset:-


The UNSW_NB15 dataset, created by the University of New South Wales, serves as
another prominent benchmark for network intrusion detection research. It was specifically
designed to overcome some limitations of older datasets, such as KDD Cup 99, by providing
more contemporary network traffic patterns and a wider range of subtle intrusions.1 UNSW_NB15
includes 9 different types of cyber-attacks, notably DoS and Reconnaissance (which encompasses
Probe attacks), alongside normal traffic.1 The dataset contains over 2.5 million records and 45
features, categorized into basic, flow, time, content, and additional generated features.6 Its design
supports the advancement and assessment of machine learning techniques for identifying and
categorizing diverse forms of network attacks.1
3.3 Data Collection and Feature Description:-
Network intrusion detection datasets are typically derived from monitoring network
activity by inspecting its traffic. This involves collecting raw packet data, flow records, or logs. 3

Features extracted from this data encompass a wide range of attributes, such as protocol type
(e.g., TCP, UDP, ICMP), service (e.g., HTTP, FTP), connection flags, source and destination IP
addresses and port numbers, packet and byte counts, flow inter-arrival times, and various
statistical measures of network behavior. 12
These datasets are often characterized by high
dimensionality, containing numerous features that describe network connections and events. 6

3.4 Data Cleaning and Handling Missing Values:-


Data cleaning is a crucial pre-processing step to ensure the quality and reliability of the
dataset for model training. This process involves identifying and addressing various
inconsistencies, including missing values, redundant records, and outliers.1 Missing values, if not
handled, can lead to errors or biased results in machine learning models. Redundant records, often
present in older datasets like KDD Cup 99, can skew training and evaluation metrics.1 Outliers,
which are data points significantly different from others, can also negatively impact model
performance and need to be either removed or transformed.1 Effective data cleaning ensures that
the model learns from accurate and representative data, thereby improving its detection
capabilities.
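To make these steps concrete, the short sketch below shows one way such a cleaning pass could look in pandas. The file name, the decision to drop (rather than impute) missing values, and the percentile-based outlier clipping are illustrative assumptions, not the report's exact procedure.

```python
import numpy as np
import pandas as pd

# Load one partition of flow records (file name assumed for illustration).
df = pd.read_csv("UNSW_NB15_training-set.csv")

df = df.drop_duplicates()                     # drop redundant records
df = df.replace([np.inf, -np.inf], np.nan)    # treat infinities as missing
df = df.dropna()                              # remove rows with missing values

# Simple outlier treatment: clip numeric columns to their 1st/99th percentiles.
num_cols = df.select_dtypes(include="number").columns
df[num_cols] = df[num_cols].clip(df[num_cols].quantile(0.01),
                                 df[num_cols].quantile(0.99), axis=1)
```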

3.5 Feature Selection and Engineering:-


Feature selection is a vital step in preparing high-dimensional network datasets for
machine learning models. Its importance stems from its ability to reduce data dimensionality,
mitigate the "curse of dimensionality," decrease computational complexity, and ultimately
enhance detection accuracy.6 By eliminating irrelevant or redundant features, feature selection
helps in understanding the data better and reduces computing time.9

Role of Genetic Algorithms (GA) in Feature Selection:-


Genetic Algorithms are a powerful class of optimization algorithms inspired by natural selection,
and they are particularly well-suited for feature selection in network intrusion detection
systems.17 GAs can serve as an automated method for identifying optimal feature subsets, often
requiring no expert input for feature relevance determination.23 This automation is a significant
advantage, as manual feature engineering is labour-intensive and can result in the loss of valuable
information from the original data, leading to lower judgment accuracy and hindering real-world
deployment.13
The GA process for feature selection typically involves several key steps:
Encoding: Each potential solution, representing a subset of features, is encoded as a
binary one-dimensional array, or "genome." The length of this array corresponds to the
total number of features in the dataset (e.g., 88 features for HIKARI-2021).17 Each bit in
the array signifies whether a specific feature is retained (1) or removed (0).17

Initialization: The process begins by randomly generating an initial population of these
encoded solutions. This randomness ensures diversity in the starting points, which is
crucial for exploring a wide range of feature combinations and preventing premature
convergence to sub-optimal solutions.17

Fitness Function: A fitness function is defined to evaluate the quality of each feature
subset. This function typically measures the performance of the machine learning model
(e.g., the 1DCNN in this study) when trained using only the features selected by that
particular solution. Common metrics for fitness include model accuracy or F1-score.17

Selection: Based on their fitness scores, fitter individuals are probabilistically chosen to
become "parents" for the next generation. Methods like the roulette wheel approach are
often employed, where individuals with higher fitness have a greater chance of being
selected.17

Crossover and Mutation: New offspring are generated from the selected parents through
genetic operators:
• Crossover: Two parent solutions are combined to produce new offspring, mixing their
feature selections. A common technique is Single-Point Crossover, where a random point
is chosen, and segments of the parent chromosomes are swapped.17
• Mutation: A low-probability mutation is applied to some genes (bits) in the offspring.
This involves randomly flipping a bit (0 to 1 or 1 to 0). Mutation is vital for maintaining
genetic diversity within the population and enabling the algorithm to explore new areas of
the feature space, thereby preventing it from getting stuck in local optima.17

Sparsity Check: A sparsity check function can be implemented to ensure that the
generated offspring adhere to desired feature reduction levels, adjusting the feature mask
if necessary.17
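A minimal sketch of this loop is shown below, using NumPy only. The fitness function is a placeholder (in the actual pipeline it would train the 1DCNN on the columns where the mask is 1 and return its validation F1-score), and the population size, generation count, and mutation rate are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(42)

def fitness(mask: np.ndarray) -> float:
    """Placeholder fitness: in practice, train the 1DCNN on the selected columns
    and return its validation F1-score. Here we only reward masks that keep
    roughly half of the features, so the sketch runs on its own."""
    if mask.sum() == 0:
        return 0.0
    return 1.0 - abs(mask.mean() - 0.5)

def single_point_crossover(p1: np.ndarray, p2: np.ndarray) -> np.ndarray:
    point = rng.integers(1, len(p1))                  # random crossover point
    return np.concatenate([p1[:point], p2[point:]])

def mutate(child: np.ndarray, rate: float = 0.01) -> np.ndarray:
    flips = rng.random(len(child)) < rate             # low-probability bit flips
    child[flips] = 1 - child[flips]
    return child

def run_ga(n_features: int, pop_size: int = 30, generations: int = 20) -> np.ndarray:
    population = rng.integers(0, 2, size=(pop_size, n_features))   # random init
    for _ in range(generations):
        scores = np.array([fitness(ind) for ind in population])
        probs = scores / scores.sum()                  # roulette-wheel selection
        children = []
        for _ in range(pop_size):
            i, j = rng.choice(pop_size, size=2, p=probs, replace=False)
            children.append(mutate(single_point_crossover(population[i], population[j])))
        population = np.array(children)
    scores = np.array([fitness(ind) for ind in population])
    return population[scores.argmax()]                 # best feature mask found

best_mask = run_ga(n_features=45)   # e.g. the 45 UNSW_NB15 features
```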

The benefits of using GAs include high re-trainability, enhanced adaptability, and
suitability for dynamic rule generation in Intrusion Detection Systems.17 While the runtime for GA
to find an optimal feature set can be longer than other methods, this is typically an offline process
and does not impact the real-time classification performance of the final system.17 The reduction in
the number of features selected by GA can significantly decrease the processing time for
classification without substantially impacting the F1-score.17 For instance, GA has been shown to
achieve high F1-scores (e.g., 76.89% with 40 features) in comparison to other feature selection
techniques.17 This approach, by reducing dependence on manual feature engineering, makes NIDS
development more efficient, scalable, and potentially more robust against evolving attack
patterns, as it can adaptively discover new relevant features.

3.6 Class Imbalance and Weight Computation:-


Network intrusion detection datasets, including UNSW-NB15 and CICIDS2017, are
known for their severe class imbalance.6 For example, in UNSW-NB15, normal traffic constitutes
approximately 87.35% of the entire dataset, with attack types making up the remaining 12.65%.6
Similarly, CICIDS2017 also presents a high class imbalance, where certain attack types, such as
Infiltration, have very small data portions even after initial relabeling.9 This imbalance is a critical
issue because it can lead to biased model predictions, where the classifier performs well on the
majority (normal) class but poorly on minority (attack) classes, resulting in low detection
accuracy for actual threats.6

To mitigate this problem, various techniques are employed. Oversampling methods, such
as Adaptive Synthetic (ADASYN) or Synthetic Minority Over-sampling Technique (SMOTE),
are used to generate synthetic samples for minority classes, thereby increasing their representation
in the training data.6 Conversely, undersampling methods, like cluster-based undersampling using
K-means, reduce the number of samples in the majority class.6 A combined approach of
oversampling and undersampling can be used to balance the dataset without excessively
increasing its size.6 Additionally, during model training, class weights can be computed and
applied to the loss function, giving higher importance to the minority classes and guiding the
model to pay more attention to their correct classification.
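As a concrete illustration of the class-weight approach, the snippet below uses scikit-learn's compute_class_weight. The label vector is synthetic and the class numbering (0 = Normal, 1 = DoS, 2 = Probe) is an assumption made purely for the example.

```python
import numpy as np
from sklearn.utils.class_weight import compute_class_weight

# Synthetic, heavily imbalanced label vector: 0 = Normal, 1 = DoS, 2 = Probe.
y_train = np.array([0] * 870 + [1] * 90 + [2] * 40)

classes = np.unique(y_train)
weights = compute_class_weight(class_weight="balanced", classes=classes, y=y_train)
class_weight = dict(zip(classes, weights))
# e.g. {0: 0.38, 1: 3.70, 2: 8.33} -- minority attack classes receive larger weights.
# Passed later as model.fit(..., class_weight=class_weight) so their errors
# contribute more to the loss.
```

Oversampling with SMOTE or ADASYN (both available in the imbalanced-learn package) can be combined with, or used instead of, these weights, depending on how much the training set may grow.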

3.7 Data Normalisation and Transformation:-


Data normalization and transformation are essential pre-processing steps that scale
numerical features to a standard range. This process is crucial because features in raw network
traffic data often have vastly different scales and units, which can disproportionately impact the
learning process of machine learning algorithms, particularly those sensitive to feature
magnitudes or distance calculations.26 For instance, algorithms that rely on Euclidean distances
can be heavily influenced by features with larger numerical ranges.

Techniques such as min-max scaling, standardization (Z-score normalization), or robust
scaling are commonly applied. Additionally, categorical features (e.g., protocol types, service
names) must be transformed into numerical representations using methods like one-hot encoding
or label encoding, as most machine learning algorithms operate on numerical data.4 Normalization
ensures that all features contribute equally to the model's learning process, preventing features
with larger values from dominating the optimization, thereby improving model convergence and
overall performance.26
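The following sketch illustrates both transformations with pandas and scikit-learn; the three columns are illustrative stand-ins for flow features, not the exact feature set used in the report.

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Toy flow records: one categorical and two numeric features (names illustrative).
df = pd.DataFrame({
    "proto":  ["tcp", "udp", "tcp", "icmp"],
    "dur":    [0.12, 3.50, 0.80, 0.05],
    "sbytes": [496, 1024, 200, 60],
})

# Categorical feature -> numeric indicator columns (one-hot encoding).
df = pd.get_dummies(df, columns=["proto"])

# Numeric features -> [0, 1] range so no single feature dominates the optimisation.
scaler = MinMaxScaler()
df[["dur", "sbytes"]] = scaler.fit_transform(df[["dur", "sbytes"]])
print(df.head())
```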

Module-01:- Data Validation Process by Each Attack:-


Data validation is a continuous process throughout the data pre-processing pipeline,
ensuring that the data used for training and testing is of high quality and accurately represents the
attack scenarios. For network intrusion detection, this involves specific considerations for each
attack type. The detailed validation process ensures the representativeness and quality of attack
patterns for model training. The challenge of small data portions for certain attacks, even after
relabeling, implies that some attack types remain harder to detect, directly impacting model
performance and requiring careful interpretation of results for minority classes. This highlights a
fundamental limitation of supervised learning in NIDS, where performance is heavily constrained
by the diversity and volume of labeled attack data. For real-world systems, this suggests that
models might need to be complemented by anomaly detection techniques or continuous learning
to identify truly novel or rare attacks not well-represented in training data.

UNSW_NB15 Data Validation for DoS and Probe Attacks


The UNSW_NB15 dataset contains 9 distinct attack types, including DoS and Reconnaissance
(Probe).1 Data validation for these specific attack types involves:

Checking for Attack Representation: Ensuring that DoS and Probe attack instances are
adequately represented and correctly labeled within the dataset. DoS attacks are
characterized by unusually high connection rates or traffic volumes from single or
multiple sources targeting a victim, aiming to exhaust resources.8 Probe attacks, such as
port scans (e.g., Nmap, Portsweep) and IP sweeps, are reconnaissance activities where
attackers scan networks to identify active hosts, open ports, and vulnerabilities.7

Handling Missing Values and Outliers: As a general preprocessing step, missing values
and outliers are identified and removed or imputed, as they can corrupt the learning
process.1

Redundancy Check: Although UNSW_NB15 was designed to address redundancy
issues found in older datasets like KDD Cup 99, it is still important to verify for any
remaining duplicate records that could bias the model.1

Class Imbalance Assessment: UNSW-NB15 has a serious class imbalance, with a large
majority of normal traffic.6 Validation involves confirming the distribution of DoS and
Probe attacks relative to normal traffic and other attack types to inform subsequent
balancing strategies.

Feature Consistency: Ensuring that features relevant to DoS (e.g., packet counts, flow
duration) and Probe (e.g., destination port counts, unique services) are consistent and
accurately reflect the attack behavior. For DoS, validation might involve checking for
unusually high connection rates from single/multiple sources to a target. For Probe, it
might involve identifying systematic port scanning patterns.7

CICIDS2017 Data Validation for DoS and Probe Attacks


The CICIDS2017 dataset includes DoS/DDoS and Port Scan attacks among its 14 attack
categories.5 Validation specific to these attacks involves:

Dataset Combination and Feature Consistency: The MachineLearningCSV data, part
of CICIDS2017, consists of eight CSV files that are combined into a single file.9 A
redundant feature, "Fwd Header Length," is removed, leaving 77 features for analysis.9

Relabeling for Imbalance: To address the high class imbalance, a new labeling scheme
for attack traffic is introduced.9 This relabeling helps in grouping similar attacks or
focusing on specific ones. However, even after relabeling, some attack types, like
Infiltration, might still have very small data portions, making their detection challenging
for classifiers.9

Feature Validation for Attack Types: Feature selection techniques like Information
Gain are applied to identify relevant features.9 Validation checks if these selected features
are indeed significant for distinguishing DoS and Port Scan attacks. For example, for DoS
in CICIDS2017, validation might involve examining features related to flow duration,
packet lengths, and flag counts,14 ensuring they align with known DoS characteristics. For
Port Scan, it would focus on features indicating systematic scanning of ports.8 It has been
observed that some classifiers struggle to detect minority classes like Infiltration or Web
Attack even after feature selection, indicating that data volume for these specific attack
types remains a challenge.9

The detailed data validation process for specific attack types on UNSW_NB15 and
CICIDS2017 is not just about cleaning data, but about ensuring the representativeness and quality
of the attack patterns for model training. The challenge of small data portions for certain attacks
(e.g., Infiltration in CICIDS2017), even after relabeling, means that even with robust pre-
processing some attack types remain harder to detect. This directly impacts the model's
performance and requires careful interpretation of results for minority classes. This highlights
a fundamental limitation of supervised learning in NIDS: its performance is heavily constrained
by the diversity and volume of labeled attack data. For real-world systems, this implies that
models might need to be complemented by anomaly detection techniques or continuous learning
to identify truly novel or rare attacks not well-represented in training data.
The following table provides a consolidated view of key data properties and validation
considerations for the UNSW_NB15 and CICIDS2017 datasets, offering a quick reference for
understanding their suitability and inherent challenges for DoS and Probe detection.

Table 3.1: Key Data Properties and Validation Considerations for UNSW_NB15 and
CICIDS2017
Year of Collection
  UNSW_NB15: 2015.20
  CICIDS2017: 2017.9
  Validation considerations: determines the captured threat landscape and traffic characteristics.27

Focus/Scenario
  UNSW_NB15: general-purpose, designed to address KDD Cup 99 limitations.1
  CICIDS2017: general-purpose, realistic traffic from various scenarios.9
  Validation considerations: suitability for modern threats; generalizability across network types.3

Normal Traffic Type
  UNSW_NB15: realistic normal activities.20
  CICIDS2017: "Benign" traffic.9
  Validation considerations: representation of legitimate network behavior.

Attack Traffic Types
  UNSW_NB15: 9 types: DoS, Reconnaissance (Probe), Fuzzers, Analysis, Backdoor, Exploits, Generic, Shellcode, Worms.1
  CICIDS2017: 14 types: DoS/DDoS, Port Scan (Probe), Brute Force, Web Attack, Infiltration, Bot, Heartbleed, etc.5
  Validation considerations: coverage of specific attack vectors; diversity of attack patterns.

Number of Features
  UNSW_NB15: 45.20
  CICIDS2017: 77 (after removing the redundant feature).9
  Validation considerations: high dimensionality necessitates feature selection.6

Anonymization Status
  UNSW_NB15: not explicitly detailed in snippets, but common for public datasets.
  CICIDS2017: not explicitly detailed in snippets.
  Validation considerations: impact on feature utility and privacy concerns.

Data Volume
  UNSW_NB15: 2,540,044 records (2,218,761 normal, 321,283 attack).20
  CICIDS2017: high volume, 8 CSV files.9
  Validation considerations: sufficient data for deep learning, but manageability is key.

Duration
  UNSW_NB15: not explicitly detailed in snippets.
  CICIDS2017: traffic monitoring sessions.9
  Validation considerations: reflects the continuousness and temporal aspects of traffic.

Network Type
  UNSW_NB15: simulated environment with 3 virtual servers.20
  CICIDS2017: not explicitly detailed in snippets, but aims for realistic enterprise traffic.
  Validation considerations: generalizability to different network environments.3

Train/Test/Validation Split
  UNSW_NB15: 70% train, 20% test, 10% validation (entire dataset used).6
  CICIDS2017: 70% train, 30% test (for 20% of the data).9
  Validation considerations: ensures unbiased model evaluation.

Labels Provided
  UNSW_NB15: yes (categorized attacks and normal).20
  CICIDS2017: yes ("Benign" and "Attacks" with 14 types).9
  Validation considerations: essential for supervised learning.

Common Issues
  UNSW_NB15: severe class imbalance;6 redundant features (addressed by feature selection).20
  CICIDS2017: high class imbalance, some minority classes very small; redundant feature ("Fwd Header Length").9
  Validation considerations: requires robust handling (oversampling, undersampling, class weights).6

General Validation Approaches
  UNSW_NB15: cleaning (missing values, outliers), numericalization, normalization, feature selection.1
  CICIDS2017: combining files, cleaning, relabeling, feature selection (e.g., Information Gain).9
  Validation considerations: ensures data quality and model effectiveness.
Chapter 4: Methodology and Model Development

4.1 Justification for 1DCNN Architecture


The selection of a 1D Convolutional Neural Network (1DCNN) architecture for network
attack detection is based on its inherent suitability for processing sequential or time-series data,
which is characteristic of network traffic flows. Unlike traditional machine learning algorithms
that often require extensive manual feature engineering, 1DCNNs possess the capability to
automatically extract hierarchical and abstract features directly from raw or pre-processed
network flow data.13 This automatic feature extraction is a significant advantage, as manual feature
engineering is not only labour-intensive but can also lead to the loss of crucial information from
the original data, potentially resulting in lower detection accuracy and hindering practical
deployment.13

The ability of 1DCNNs to learn relevant patterns from the raw data streams effectively addresses
the complexity and diversity inherent in network data, enhancing classifier accuracy by reducing
feature redundancy and mitigating issues related to class imbalance and the detection of rare
attacks.22 This architectural choice represents a strategic advancement over traditional ML
algorithms, which typically rely on explicitly engineered features. By performing automatic
feature learning, 1DCNNs move towards developing more autonomous and scalable NIDS. This
reduces the human effort required to adapt to new attack signatures and can significantly enhance
the detection of zero-day attacks, which are inherently difficult to define with static rules.
Furthermore, CNNs have demonstrated high accuracy in intrusion detection on datasets like
CICIDS2017 and UNSW-NB15, supporting their effectiveness in this domain.2

4.2 Model Architecture and Layer Description


The 1DCNN model designed for this project incorporates a sequence of layers optimized
for feature extraction and classification from network traffic data. A typical 1DCNN architecture
for this purpose might include:

Input Layer: Receives the pre-processed, one-dimensional network flow features.

Convolutional Layers: Multiple 1D convolutional layers are used to extract local
patterns. For instance, a common configuration might involve initial layers with fewer
kernels (e.g., 64) followed by deeper layers with more kernels (e.g., 256).6 These layers
apply filters across the input data to detect specific patterns.

Activation Functions: Rectified Linear Unit (ReLU) is commonly used as the activation
function for the convolutional layers due to its computational efficiency and ability to
mitigate the vanishing gradient problem.6

Pooling Layers: Max-pooling layers are typically applied after convolutional layers to
down-sample the feature maps, reducing dimensionality and computational complexity
while retaining the most salient features. This process can be repeated multiple times
(e.g., max-pooling twice).6

Dropout Layers: Dropout layers are strategically placed after pooling layers (e.g., with a
parameter of 0.2) to prevent overfitting by randomly deactivating a fraction of neurons
during training, thereby improving the model's generalization capability.6

Flatten Layer: This layer transforms the 2D output of the convolutional and pooling
layers into a 1D vector, preparing it for the fully connected dense layers.

Dense (Fully Connected) Layers: One or more dense layers process the flattened
features, learning complex non-linear relationships.

Output Layer: The final dense layer, typically with a Softmax activation function,
outputs the probability distribution over the different attack classes (e.g., Normal, DoS,
Probe, Other Attacks).6
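A minimal Keras sketch of such a stack is given below. The 64/256 kernel counts, the 0.2 dropout rate, and the softmax output follow the description above; the kernel sizes, dense-layer width, and input length are assumptions made only for illustration.

```python
from tensorflow.keras import layers, models

def build_1dcnn(n_features: int, n_classes: int) -> models.Sequential:
    """1DCNN sketch following the layer sequence described in Section 4.2."""
    model = models.Sequential([
        layers.Input(shape=(n_features, 1)),              # one channel per flow feature
        layers.Conv1D(64, kernel_size=3, activation="relu"),
        layers.MaxPooling1D(pool_size=2),
        layers.Dropout(0.2),
        layers.Conv1D(256, kernel_size=3, activation="relu"),
        layers.MaxPooling1D(pool_size=2),
        layers.Dropout(0.2),
        layers.Flatten(),
        layers.Dense(128, activation="relu"),
        layers.Dense(n_classes, activation="softmax"),     # Normal / DoS / Probe / Other
    ])
    model.compile(optimizer="adam",
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model

model = build_1dcnn(n_features=40, n_classes=4)
model.summary()
```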

4.3 Loss Function and Optimisation Techniques


The choice of loss function is critical for guiding the model's learning process. For multi-
class classification tasks, such as detecting various types of network attacks, categorical cross-
entropy is the standard loss function.6 This function quantifies the difference between the
predicted probability distribution and the true class distribution, aiming to minimize this
divergence during training.

Optimization techniques are employed to adjust the model's weights and biases to
minimize the chosen loss function. Adam (Adaptive Moment Estimation) and RAdam
(Rectified Adam) are popular adaptive learning rate optimization algorithms. Adam combines the
benefits of AdaGrad and RMSProp, maintaining per-parameter learning rates that are adapted
based on the first and second moments of the gradients. RAdam further improves upon Adam by
rectifying the variance of the adaptive learning rate, which can lead to better convergence and
generalization, especially during the initial stages of training. The "nadam" optimizer, a variant of
Adam incorporating Nesterov momentum, has also been noted for its effectiveness in similar
intrusion detection contexts.6

4.4 Learning Rate Scheduling (Adam, RAdam)


Learning rate scheduling is a strategy that adjusts the learning rate during training, rather
than keeping it constant. This technique is vital for effective model training and convergence. A
high learning rate at the beginning can help the model quickly explore the loss landscape, while a
lower learning rate towards the end can help fine-tune the weights and converge to a more stable
minimum.

Optimizers like Adam and RAdam inherently incorporate adaptive learning rates,
meaning they adjust the learning rate for each parameter based on historical gradients. For
instance, a learning rate of 0.1 has been observed to yield strong performance in 1DCNN models
for network intrusion detection.6 Beyond the adaptive nature of these optimizers, explicit learning
rate schedules (e.g., step decay, exponential decay, or cosine annealing) can be implemented to
further control the learning rate's progression over epochs, potentially leading to faster
convergence and improved model performance.
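As a sketch of an explicit schedule, the snippet below pairs Adam with Keras' ExponentialDecay and re-compiles the model from the Section 4.2 sketch. The 0.1 starting value follows the observation above; the decay steps and decay rate are assumptions.

```python
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.optimizers.schedules import ExponentialDecay

# Start relatively high to explore the loss landscape, then decay towards fine-tuning.
schedule = ExponentialDecay(initial_learning_rate=0.1,
                            decay_steps=10_000,
                            decay_rate=0.9)

model.compile(optimizer=Adam(learning_rate=schedule),
              loss="categorical_crossentropy",
              metrics=["accuracy"])
```

RAdam is available as RectifiedAdam in the tensorflow-addons package and, if that package is installed, can be swapped in for Adam in the same way.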

4.5 Model Training and Hyperparameter Tuning


Model training involves feeding the pre-processed dataset to the 1DCNN model and
iteratively adjusting its parameters to minimize the loss function. The dataset is typically split into
three subsets: a training set, a validation set, and a test set. A common split ratio is 70% for
training, 10% for validation, and 20% for testing.6 The training set is used to update the model's
weights, the validation set monitors performance during training to prevent overfitting and guide
hyperparameter tuning, and the test set provides an unbiased evaluation of the final model's
performance on unseen data.

Hyperparameter tuning is the process of selecting the optimal values for parameters that
control the learning process itself, rather than being learned by the model. These include batch
size (e.g., 256, representing the number of samples processed before the model's internal
parameters are updated), the number of epochs (e.g., 100 to 200, representing the number of
complete passes through the training dataset), and the learning rate.6 Techniques for
hyperparameter tuning range from manual search and grid search to more advanced methods like
random search or Bayesian optimization, aiming to find the combination of hyperparameters that
yields the best performance on the validation set.
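A sketch of the 70/10/20 split and the fit call is shown below; X, y (one-hot encoded labels), model, and class_weight are assumed to come from the earlier pre-processing and modelling steps.

```python
from sklearn.model_selection import train_test_split

# 70% train, then split the remaining 30% into 10% validation and 20% test.
X_train, X_tmp, y_train, y_tmp = train_test_split(
    X, y, test_size=0.30, stratify=y.argmax(axis=1), random_state=42)
X_val, X_test, y_val, y_test = train_test_split(
    X_tmp, y_tmp, test_size=2 / 3, stratify=y_tmp.argmax(axis=1), random_state=42)

history = model.fit(X_train, y_train,
                    validation_data=(X_val, y_val),
                    batch_size=256,               # samples per weight update
                    epochs=100,                   # full passes over the training set
                    class_weight=class_weight)    # emphasise minority attack classes
```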

4.6 Evaluation Metrics (Accuracy, Precision, Recall, F1-Score)


Evaluating the performance of an intrusion detection system requires a comprehensive set
of metrics beyond simple accuracy, especially when dealing with highly imbalanced datasets
where normal traffic significantly outweighs attack instances.6 Accuracy alone can be misleading
in such scenarios, as a model might achieve high accuracy by simply classifying all instances as
the majority class (normal traffic) while failing to detect actual attacks.29 Therefore, a nuanced
understanding of model performance is achieved through the following metrics:

Accuracy: The ratio of correctly predicted instances (both true positives and true negatives)
to the total number of instances in the dataset.6 It provides an overall measure of correctness.
Formula: Accuracy = (TP + TN) / (TP + TN + FP + FN)

Precision: The proportion of true positive predictions among all instances predicted as
positive.6 High precision indicates a low rate of false positives, meaning fewer legitimate
activities are incorrectly flagged as attacks, which is crucial for preventing alert fatigue in
security operations.
Formula: Precision = TP / (TP + FP)

Recall (Sensitivity or True Positive Rate): The proportion of true positive predictions
among all actual positive instances.6 High recall indicates the model's ability to correctly
identify a large proportion of actual attacks, minimizing false negatives (missed attacks)
which can lead to severe security breaches.
Formula: Recall = TP / (TP + FN)

F1-Score: The harmonic mean of precision and recall.6 It provides a balanced measure of
a model's performance, especially useful when there is an uneven class distribution or
when both false positives and false negatives are equally important.28 A high F1-score
indicates a good balance between identifying positive cases and minimizing false
positives and false negatives.31
Formula: F1-Score = 2 * (Precision * Recall) / (Precision + Recall)

In network intrusion detection, False Positives (legitimate traffic flagged as an attack)
lead to alert fatigue and wasted resources for security analysts, while False Negatives (actual
attacks missed) can result in significant security breaches. Precision and Recall directly address
these critical operational concerns. The F1-score, by balancing these two, provides a
comprehensive view. The choice of evaluation metrics reflects the real-world priorities of
cybersecurity: not just overall correctness, but the ability to reliably identify threats without
overwhelming human operators. This directly impacts the usability and trustworthiness of the
deployed system.
To provide clarity on these fundamental terms, the following table defines each performance
metric:

Table 4.1: Definitions of Performance Metrics


True Positive (TP): The number of positive samples correctly predicted by the classification model.7

True Negative (TN): The number of negative samples correctly predicted by the classification model.7

False Positive (FP): The number of negative samples wrongly predicted as positive by the classification model (Type I error).7

False Negative (FN): The number of positive samples wrongly predicted as negative by the classification model (Type II error).7

Accuracy: The ratio of correctly predicted instances to the total number of predictions.7 Formula: (TP + TN) / (TP + TN + FP + FN)

Precision: The fraction of records predicted as positive that are actually positive.7 Formula: TP / (TP + FP)

Recall: The fraction of actual positive samples that are correctly predicted by the classifier.7 Formula: TP / (TP + FN)

F1-Score: The harmonic mean of Precision and Recall, providing a balanced measure of performance.31 Formula: 2 * (Precision * Recall) / (Precision + Recall)
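These metrics can be computed directly with scikit-learn, as in the short sketch below; the small y_true / y_pred arrays are synthetic placeholders for the model's test-set labels and predictions.

```python
import numpy as np
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

# Placeholder labels: 0 = Normal, 1 = DoS, 2 = Probe.
y_true = np.array([0, 0, 0, 1, 1, 2, 2, 2])
y_pred = np.array([0, 0, 1, 1, 1, 2, 0, 2])

acc  = accuracy_score(y_true, y_pred)
prec = precision_score(y_true, y_pred, average="macro", zero_division=0)
rec  = recall_score(y_true, y_pred, average="macro", zero_division=0)
f1   = f1_score(y_true, y_pred, average="macro", zero_division=0)
print(f"accuracy={acc:.3f} precision={prec:.3f} recall={rec:.3f} f1={f1:.3f}")
```

Macro averaging weights every class equally, which is the informative choice when minority attack classes matter as much as the dominant normal class.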

Chapter 5: System Design and Implementation

5.1 System Overview


The proposed system for network attack detection integrates machine learning capabilities
with a user-friendly web interface. The high-level architecture involves several interconnected
components:
1. Data Ingestion: Network traffic data, either live streams from firewalls and routers or
historical data from datasets like UNSW_NB15 and CICIDS2017, serves as the input.
2. Pre-processing: This module performs data cleaning, feature engineering (including
Genetic Algorithm-based feature selection), class imbalance handling, and normalization.
3. 1DCNN Model: The trained deep learning model receives the pre-processed data and
performs attack classification.
4. Backend API (Flask): A Flask-based web service acts as the bridge between the ML
model and the user interface. It loads the trained model, processes incoming requests, and
serves predictions.
5. Frontend GUI: A web-based dashboard provides an intuitive interface for users to interact
with the system, upload data, view prediction results, and receive alerts.
This architecture ensures a streamlined flow from raw network data to actionable security
intelligence.

5.2 Model Serialisation and Loading


For efficient deployment and real-time prediction, the trained 1DCNN model must be
saved (serialized) and then loaded into the Flask application. Serialization involves converting the
trained model object into a byte stream that can be stored on disk. Popular Python libraries for this
purpose include pickle or joblib. 32

Once serialized, the model can be loaded back into memory by the Flask application
when it starts up. This eliminates the need to retrain the model every time a prediction is
requested, significantly reducing latency and computational overhead. The Flask application then
uses this loaded model to perform inference on incoming network data, making the machine
learning solution accessible and practical for real-world applications. 32
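A small sketch of the save/load round trip is given below, assuming the scaler and model objects from the earlier chapters. It uses joblib for the fitted pre-processing objects and Keras' native HDF5 saving for the network itself (an alternative to pickling the model object); the file paths are chosen purely for illustration.

```python
import joblib
from tensorflow.keras.models import load_model

# --- at the end of training ---
joblib.dump(scaler, "artifacts/scaler.joblib")   # fitted pre-processing object
model.save("artifacts/1dcnn_nids.h5")            # trained network (HDF5 format)

# --- at Flask start-up: load once, reuse for every request ---
scaler = joblib.load("artifacts/scaler.joblib")
model = load_model("artifacts/1dcnn_nids.h5")
```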

5.3 Flask Web Application Structure


Flask is a lightweight and versatile Python web framework ideally suited for deploying
machine learning models as web services. 32
Its minimalistic design provides core web
development functionalities like routing and request handling, while allowing developers full
flexibility and customization. 32

A typical Flask application structure for this project would include:


app.py (or main.py): This is the main application file containing the Flask instance, route
definitions, and the logic for loading the ML model and handling prediction requests. 32

templates/: A directory storing HTML files (.html) that define the structure and content of
the web pages, such as the home page, upload forms, and results display. Flask uses the
Jinja2 templating engine to render these files dynamically. 34

static/: A directory for static assets like CSS files (.css) for styling, JavaScript files (.js)
for interactivity (e.g., Chart.js for visualizations), and images.
This modular structure promotes organized development and maintainability.

5.4 User Interface Design


The design of the user interface (UI) for a network intrusion detection system is
paramount for its operational effectiveness and user adoption. An intuitive and well-structured
dashboard enables network administrators to easily manage security, review alerts, and access
system settings. Key principles for effective IDS dashboard design include:
35

Simplicity and Clarity: The primary goal is to provide a clear and concise overview of
key performance indicators (KPIs) and metrics, avoiding clutter and using clear, jargon-
free language. 36

Visual Hierarchy: Critical security insights should be prioritized and stand out. This can
be achieved through strategic use of size, position, and a limited color palette (e.g., red for
critical, yellow for warning, green for safe). 36

Real-time Updates: Cybersecurity dashboards should process and display live data
streams in real-time, as even a few seconds of delay can mean missed threats. 37

Technologies like WebSockets can facilitate real-time data streaming. 37


Actionable Insights: The dashboard should go beyond raw data aggregation, providing
actionable intelligence that enables quick decision-making and threat response. 37

Interactivity: Features like filters, drill-down capabilities, and interactive charts allow
users to explore data in more detail, select specific data ranges, or categorize
information. 36

Adhering to these principles ensures that the GUI is not just functional but also enhances
the human decision-making process, leading to more effective incident assessment and
response.40

5.5 API Endpoints and Backend Integration


The Flask application serves as the backend, exposing API endpoints that allow the
frontend to interact with the machine learning model. A primary endpoint, for instance, /predict,
would be configured to receive incoming network data for analysis. 32

The data flow for prediction would typically involve:


1. Client Request: A user or an automated system sends an HTTP POST request to the
/predict endpoint. The request body contains the network traffic features, often in JSON
format, required by the 1DCNN model (e.g., source/destination IP, port numbers, protocol,
packet sizes, flow statistics). 32

2. Flask Processing: The Flask application receives the request, parses the incoming JSON
data, and performs any necessary pre-processing to format the data as expected by the
loaded 1DCNN model.
3. Model Inference: The pre-processed data is then fed to the serialized 1DCNN model,
which generates a prediction (e.g., "DoS Attack," "Probe Attack," or "Normal") along with
a confidence score.
4. Response Generation: Flask constructs an HTTP response, typically in JSON format,
containing the prediction result and its associated confidence score. This response is then
sent back to the client. 32

Ensuring secure communication is paramount. The API endpoints should be deployed


using HTTPS to encrypt data transfer between the server and client, protecting sensitive network
security information.41 Additionally, implementing authentication mechanisms (e.g., token-based
API keys) is crucial to restrict access to authorized users or systems, preventing unauthorized use
of the prediction service.41
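The sketch below shows how such a /predict endpoint could look in Flask, assuming the model and scaler loaded as in Section 5.2; the class order (Normal/DoS/Probe/Other) and the JSON field name "features" are illustrative assumptions.

```python
import numpy as np
from flask import Flask, jsonify, request

app = Flask(__name__)
# `scaler` and `model` are assumed to be loaded at start-up (see Section 5.2).

@app.route("/predict", methods=["POST"])
def predict():
    payload = request.get_json(force=True)                 # e.g. {"features": [...]}
    x = np.asarray(payload["features"], dtype=float).reshape(1, -1)
    x = scaler.transform(x)                                # same scaling as training
    probs = model.predict(x.reshape(1, -1, 1))[0]          # softmax class probabilities
    labels = ["Normal", "DoS", "Probe", "Other"]           # assumed class order
    idx = int(np.argmax(probs))
    return jsonify({"prediction": labels[idx],
                    "confidence": float(probs[idx])})

if __name__ == "__main__":
    app.run(debug=True)
```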

5.6 File Upload and Row Selection Functionality


To facilitate both retrospective analysis and on-demand evaluation, the web application
provides functionality for users to upload network traffic data files (e.g., CSV files) for batch
prediction. This allows security analysts to test the model on specific traffic captures or historical
logs. Upon file upload, the Flask backend processes the data, applies the same pre-processing
steps as during training, and then feeds the prepared data to the 1DCNN model for inference.
In addition to batch processing, the interface could allow users to select individual rows
or specific network flows from an uploaded file or a displayed real-time stream for granular
analysis. This "row selection" functionality enables a deeper dive into suspicious activities,
allowing administrators to examine the features of a particular connection and understand the
model's prediction for that specific instance. The results for selected rows would then be
displayed on the GUI, potentially with detailed feature values and prediction confidence. This
interactive capability enhances the utility of the system for forensic analysis and targeted
investigation.
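Continuing the same application, a hypothetical batch-prediction route for uploaded CSV files might look like the sketch below; the form field name, template name, and column handling are all assumptions rather than the report's actual implementation.

```python
import pandas as pd
from flask import render_template

@app.route("/upload", methods=["POST"])
def upload():
    df = pd.read_csv(request.files["file"])                 # uploaded traffic capture
    X = scaler.transform(df.values)                         # reuse training pre-processing
    probs = model.predict(X.reshape(X.shape[0], -1, 1))
    df["prediction"] = probs.argmax(axis=1)                 # predicted class index per row
    df["confidence"] = probs.max(axis=1)                    # model certainty per row
    return render_template("results.html",
                           rows=df.to_dict(orient="records"))
```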

Module-05:- GUI-Based Prediction Results of Network Attacks


The Graphical User Interface (GUI) is designed to provide an intuitive and actionable
display of network attack prediction results, moving beyond simple alerts to offer comprehensive
security intelligence.
Design Principles for IDS Dashboards
Effective IDS dashboards prioritize critical security insights over raw data dumps.36 Key design
principles include:

• Threat Prioritization: Active threats are highlighted and categorized by severity (e.g.,
Critical, High, Medium, Low) using clear visual cues like color-coding.37 This ensures that
security analysts can immediately identify and respond to the most pressing incidents.

• Visual Hierarchy: A clear visual hierarchy ensures that the most critical data stands out,
while secondary information remains accessible but does not clutter the interface.37 Limited
color palettes and effective use of white space improve readability and reduce visual
noise.36

• Interactive Visualizations: The dashboard employs various interactive charts and graphs
(e.g., line charts for trends, bar charts for attack distribution, heat maps for correlations,
attack path visualizations) to represent network data and attack patterns.36 These
visualizations facilitate pattern detection and relationship identification in large, complex
datasets.40

• Filtering and Drill-Down: Users can apply filters to select specific data ranges or
categories (e.g., attack type, source IP, time period) and utilize drill-down capabilities to
view more detailed data about specific alerts or incidents.36 This allows for granular
investigation.

• Real-time Processing: The system is optimized for real-time data processing and updates,
ensuring that alerts and visualizations reflect the current network state with minimal
delay.37 WebSockets can be employed to enable push-based visualization and automatic
refresh.38
Visualization of Prediction Outcomes and Confidence Scores:-
The GUI clearly displays the prediction outcomes (e.g., "DoS Attack Detected," "Probe
Attack Detected," "Normal Traffic"). A crucial feature is the visualization of confidence scores
for each prediction. A confidence score, typically a decimal number between 0 and 1, represents
the likelihood that the machine learning model's output is correct. 30

Visualizing confidence scores can take several forms:

• Numerical Display: Displaying the exact score (e.g., 0.98 for a high-confidence DoS
detection).

• Categorical Labels: Translating numerical scores into easily understandable categories
like "High Confidence" (e.g., >0.7), "Medium Confidence" (e.g., 0.3-0.7), or "Low
Confidence" (e.g., <0.3).30

• Visual Cues: Using graphical elements such as progress bars, color gradients, or varying
icon sizes to intuitively convey the confidence level.
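A trivial helper along these lines, written here only as an illustration and using the thresholds quoted above, could back both the categorical labels and the visual cues:

```python
def confidence_label(score: float) -> str:
    """Map a model confidence score in [0, 1] to the dashboard categories above."""
    if score > 0.7:
        return "High Confidence"
    if score >= 0.3:
        return "Medium Confidence"
    return "Low Confidence"

print(confidence_label(0.98))   # -> High Confidence
```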
The inclusion of prediction confidence is a critical feature that enhances the actionability
and trustworthiness of the ML-driven IDS. Beyond a simple binary "attack/no attack" decision,
confidence scores allow for a nuanced human-in-the-loop approach. For instance, if a model
predicts an attack with 98% confidence, security analysts can act with greater assurance.
However, if the confidence falls to, say, 55%, it prompts double-checking and further
investigation.30 This mechanism helps strike the right balance between automation and human
intervention, enabling security teams to make informed decisions, especially in ambiguous cases.
It directly addresses the issue of "high false-alarm rates" often associated with NIDS by providing
context for alerts and allowing for prioritization, thereby reducing alert fatigue and focusing
human effort on high-risk, high-confidence threats.10

Real-time Alerting Mechanisms and User Feedback Integration


The system provides real-time alerts for suspicious activities, ensuring network
administrators are immediately notified of potential threats.35 These alerts are categorized by
severity and attack type, allowing for rapid triage and response.37 Alerting mechanisms can
include on-screen notifications, email alerts, or integration with existing Security Information and
Event Management (SIEM) systems.
Furthermore, the system incorporates mechanisms for user feedback. This allows network
administrators to provide input on the accuracy of predictions, for example, by marking false
positives (legitimate traffic flagged as an attack) or false negatives (missed attacks). This user
feedback loop is invaluable for the continuous improvement and fine-tuning of the machine
learning model. By collecting and analysing this feedback, the model can be periodically retrained
or adapted to better reflect the evolving network environment and attack patterns, enhancing its
long-term effectiveness and reducing future misclassifications.
The following table summarizes the key elements and best practices for designing an
effective IDS prediction dashboard:
Table 5.1: Key Elements and Best Practices for IDS Prediction Dashboards
Purpose & Focus
  Key elements: high-level security posture overview, active threats.
  Best practices: prioritize critical security insights; avoid information overload; design for quick threat response.36

Key Metrics/KPIs
  Key elements: threat detection rates, alert triage progress, incident trends, risk scores.
  Best practices: identify the most important metrics for the organization; use clear, concise language.36

Visualization Types
  Key elements: line charts, bar charts, heat maps, attack graphs, tables.
  Best practices: limit the number of charts; use a consistent layout; use colour to draw attention to important information.36

Interactivity Features
  Key elements: filters (date, attack type, IP), drill-down capabilities, tooltips.
  Best practices: enable users to select specific data ranges/categories; provide more detailed data on demand.36

Alerting Mechanisms
  Key elements: real-time alerts, notifications (on-screen, email).
  Best practices: categorize alerts by severity (Critical, High, Medium, Low); ensure real-time data processing.35

Confidence Score Display
  Key elements: numerical score (0-1), categorical labels (High/Medium/Low), visual cues (progress bars).
  Best practices: aid human decision-making; allow prioritization of alerts; help in double-checking low-confidence predictions.30

Security Considerations
  Key elements: secure authentication (MFA, token-based), HTTPS, input validation, role-based access control.
  Best practices: protect sensitive data; prevent unauthorized access; regularly update dependencies.36

User Feedback Integration
  Key elements: mechanisms for marking false positives/negatives.
  Best practices: facilitate continuous model improvement and adaptation to new threats.
Chapter 6: Results and Discussion

6.1 Model Performance Evaluation


The performance of the 1DCNN model, enhanced by Genetic Algorithm-based
feature selection, was rigorously evaluated using standard classification metrics. The
overall accuracy, precision, recall, and F1-score provide a comprehensive understanding
of the model's effectiveness in detecting network attacks. The Genetic Algorithm's role in
feature selection significantly contributed to these metrics by identifying the most
relevant features and reducing noise, which can lead to improved detection rates and
reduced computational overhead.17 For instance, a GA-selected feature set has been
shown to achieve high F1-scores.17
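As an illustration, the evaluation step can be expressed with scikit-learn as in the sketch below, assuming the trained 1DCNN is available as a Keras-style model and that X_test / y_test hold the GA-selected test features and integer class labels; the variable names are placeholders for objects produced earlier in the pipeline.

```python
# Sketch of the evaluation step, assuming `model` is the trained 1DCNN (Keras-style
# predict) and X_test / y_test hold the GA-selected test features and integer labels.
import numpy as np
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

probs = model.predict(X_test)                 # shape: (n_samples, n_classes)
y_pred = np.argmax(probs, axis=1)             # most probable class per instance

acc = accuracy_score(y_test, y_pred)
prec, rec, f1, _ = precision_recall_fscore_support(
    y_test, y_pred, average="weighted", zero_division=0
)
print(f"Accuracy={acc:.4f}  Precision={prec:.4f}  Recall={rec:.4f}  F1={f1:.4f}")
```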

6.2 Confusion Matrix Analysis


A confusion matrix provides a detailed breakdown of classifier performance, illustrating
the number of correct and incorrect predictions for each class.21 It is an N x N matrix,
where N is the number of classes, with one axis representing the actual classes and the other
the predicted classes; the example tables in this chapter place predicted classes on the rows
and actual classes on the columns.49 The diagonal elements of the matrix indicate the
number of instances correctly classified for each class (True Positives for positive classes,
True Negatives for negative classes), while off-diagonal elements represent
misclassifications (False Positives and False Negatives).49 Analyzing the confusion matrix
is particularly important for imbalanced datasets, as it reveals the model's performance on
minority classes, which might be obscured by overall accuracy.28 It allows for the
identification of specific misclassification patterns, such as which attack types are
frequently confused with normal traffic or with other attack types.
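The sketch below illustrates one way this error analysis could be automated with scikit-learn, reusing the predictions from the previous sketch; the class names and label order are illustrative assumptions.

```python
# Sketch of confusion-matrix analysis for the multi-class case; class names and
# the variables y_test / y_pred are assumed from the evaluation sketch above.
# Note scikit-learn's convention: rows = actual classes, columns = predicted classes.
from sklearn.metrics import confusion_matrix

classes = ["Normal", "DoS", "Probe", "Other_Attack"]   # illustrative label order
cm = confusion_matrix(y_test, y_pred, labels=range(len(classes)))

# Report the most frequent off-diagonal confusions, e.g. attacks predicted as Normal.
errors = [
    (cm[i, j], classes[i], classes[j])
    for i in range(len(classes)) for j in range(len(classes)) if i != j
]
for count, actual, predicted in sorted(errors, reverse=True)[:5]:
    print(f"{count:5d} instances of actual '{actual}' predicted as '{predicted}'")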

6.3 Comparison with Existing Approaches


The performance of the proposed 1DCNN model with GA-based feature selection
is compared against various state-of-the-art machine learning and deep learning
approaches reported in the literature for network intrusion detection. For DoS detection,
many models achieve high accuracies, such as XGBoost (99.98%) and Random Forest
(99.4%) on CICDoS2019, and SVM (99.5%) on other datasets.14 For 1D-CNNs, reported
detection rates on UNSW-NB15 are around 98.79% for binary classification and 97.23%
for multiclass classification.6 For Probe attacks, F1-scores of 98% have been reported.16
This comparison helps to contextualize the effectiveness of the current model,
highlighting areas where it excels (e.g., due to the synergistic effect of 1DCNN's
automatic feature learning and GA's optimal feature subset selection) or where further
improvements might be necessary to match or surpass the highest reported benchmarks.

6.4 Interpretation of Prediction Confidence


The model's ability to provide a prediction confidence score (a value between 0
and 1) for each classification is a significant aspect of its operational utility.30 This score
quantifies the model's certainty in its prediction. A high confidence score (e.g., above 0.7)
suggests a strong likelihood that the prediction is correct, enabling rapid, potentially
automated, responses. Conversely, a lower confidence score (e.g., between 0.3 and 0.7)
indicates greater uncertainty, prompting security analysts to perform further investigation
or human review before taking action.30
The distribution of these confidence scores across different attack types and
normal traffic provides valuable insights. For instance, if the model consistently assigns
high confidence to correctly identified DoS attacks but low confidence to certain Probe
attacks, it suggests that Probe attacks might be inherently more ambiguous or less well-
represented in the training data. This nuanced understanding of prediction certainty
allows for a more intelligent and efficient allocation of human resources, reducing alert
fatigue by prioritizing high-confidence threats and focusing manual effort on ambiguous
cases. This mechanism helps to balance automation with human oversight, enhancing the
overall effectiveness and trustworthiness of the IDS.30
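A short sketch of how these per-class confidence distributions might be summarized is shown below, assuming the softmax outputs (probs) and predicted labels (y_pred) from the earlier evaluation sketch; the 0.7 threshold and class names are illustrative.

```python
# Sketch of summarising confidence scores per predicted class, assuming `probs`
# (softmax outputs) and `y_pred` from the evaluation sketch; thresholds illustrative.
confidence = probs.max(axis=1)                 # model certainty for each prediction
for idx, name in enumerate(["Normal", "DoS", "Probe", "Other_Attack"]):
    mask = (y_pred == idx)
    if mask.any():
        mean_conf = confidence[mask].mean()
        low_frac = (confidence[mask] < 0.7).mean()
        print(f"{name:13s} mean confidence={mean_conf:.3f}  "
              f"fraction below 0.7={low_frac:.2%}")
```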

6.5 System Usability and User Feedback


The Flask web application's usability is critical for its practical adoption by
network administrators. Its design emphasizes an intuitive user interface, real-time
monitoring capabilities, and clear visualization of alerts, aiming to simplify security
management. The ability to upload files for analysis and select specific rows for detailed
examination enhances its utility for both proactive monitoring and forensic investigation.
User feedback mechanisms, such as options to mark false positives or false
negatives, are integral to the system's continuous improvement. This feedback provides
valuable real-world data that can be used to periodically retrain or fine-tune the 1DCNN
model, ensuring its adaptability to evolving network environments and new attack
patterns. This iterative process of deployment, monitoring, feedback, and retraining is
essential for maintaining the long-term effectiveness and accuracy of the intrusion
detection system in a dynamic threat landscape.

Module-02: Performance Measurements of DoS Attacks


The detection of Denial-of-Service (DoS) attacks is a critical function of
any NIDS, given their disruptive potential. The 1DCNN model's performance for DoS
attack detection was specifically measured using accuracy, precision, recall, and F1-score
on both UNSW_NB15 and CICIDS2017 datasets.
Detailed Evaluation Metrics for DoS Detection (Accuracy, Precision, Recall, F1-Score)
The performance of the 1DCNN model for DoS attack detection is expected to be
high, consistent with state-of-the-art results. For instance, some machine learning models
have achieved accuracies as high as 99.98% for DDoS detection.14 For 1D-CNN
specifically, studies on UNSW-NB15 have reported binary classification detection rates
of 98.79%.6
Confusion Matrix Analysis for DoS Attacks (Numerical Example)
To illustrate the performance of DoS attack detection, consider a hypothetical test
set of 1000 network instances, where 900 are normal traffic and 100 are actual DoS attacks. The
confusion matrix below demonstrates a possible outcome from the 1DCNN model's predictions:
Table 6.1: Example Confusion Matrix for DoS Attack Detection
Predicted \ Actual | Normal (Negative) | DoS (Positive)
Normal | 890 (True Negatives, TN) | 5 (False Negatives, FN)
DoS | 10 (False Positives, FP) | 95 (True Positives, TP)
From this matrix, the performance metrics for DoS attack detection are calculated as follows:
 Accuracy: (TP + TN) / (TP + TN + FP + FN) = (95 + 890) / (95 + 890 + 10 + 5) = 985 / 1000 = 0.985 (98.5%)
 Precision (DoS): TP / (TP + FP) = 95 / (95 + 10) = 95 / 105 ≈ 0.904 (90.4%)
 Recall (DoS): TP / (TP + FN) = 95 / (95 + 5) = 95 / 100 = 0.95 (95%)
 F1-Score (DoS): 2 * (Precision * Recall) / (Precision + Recall) = 2 * (0.904 * 0.95) / (0.904 + 0.95) ≈ 0.926 (92.6%)
In this example, the model correctly identified 95 out of 100 actual DoS attacks
(high Recall), indicating its effectiveness in catching most DoS incidents. It also had a relatively
low number of false positives (10 instances), meaning only a small portion of normal traffic was
incorrectly flagged as DoS (good Precision).
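The same figures can be reproduced directly from the four cell counts, as in the brief check below; the printed values agree with Table 6.1 up to rounding.

```python
# Recomputing the Table 6.1 metrics from the raw cell counts of the example matrix.
TP, TN, FP, FN = 95, 890, 10, 5

accuracy  = (TP + TN) / (TP + TN + FP + FN)
precision = TP / (TP + FP)
recall    = TP / (TP + FN)
f1        = 2 * precision * recall / (precision + recall)

print(f"Accuracy={accuracy:.4f}  Precision={precision:.4f}  "
      f"Recall={recall:.4f}  F1={f1:.4f}")
# -> Accuracy=0.9850  Precision=0.9048  Recall=0.9500  F1=0.9268
```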
Comparative Performance Metrics for DoS Attack Detection
The table below provides a comparison of the proposed 1DCNN model's
performance with other machine learning algorithms for DoS attack detection, drawing from
reported literature. This comparison contextualizes the model's effectiveness within the broader
field of NIDS research.
Table 6.2: Comparative Performance Metrics for DoS Attack Detection
Algorithm/Model | Dataset | Accuracy | Precision | Recall | F1-Score
Proposed 1DCNN (with GA) | UNSW_NB15 / CICIDS2017 | ~98.5% | ~90.4% | ~95.0% | ~92.6%
XGBoost 14 | CICDoS2019 | 99.98% | - | - | -
Random Forest 14 | CICDoS2019 | 98.80% | 0.994 | 0.994 | 0.994
SVM 15 | NSL-KDD | 99.5% | - | - | -
1D-CNN 6 | UNSW-NB15 (Binary) | 98.79% | - | - | -
Decision Tree 24 | WSN-DS (Flooding) | 100% | - | - | -
The table indicates that the proposed 1DCNN model with GA-based feature
selection performs competitively with other advanced machine learning techniques for DoS
detection. While some models report slightly higher accuracy on specific datasets, the balance
between precision and recall, as reflected in the F1-score, is crucial for operational reliability.
This comparison validates the architectural choices and the contribution of the Genetic Algorithm
in achieving robust DoS detection.

Module-03: Performance Measurements of Probe Attacks


Probe attacks, primarily reconnaissance activities like port scanning, are
precursors to more severe intrusions. Effective detection of these attacks is crucial for preemptive
cybersecurity. The 1DCNN model's performance for Probe attack detection was specifically
evaluated using accuracy, precision, recall, and F1-score on the UNSW_NB15 and CICIDS2017
datasets.
Detailed Evaluation Metrics for Probe Detection (Accuracy, Precision, Recall, F1-Score)
The model is expected to achieve high performance in detecting Probe attacks.
For instance, Naive Bayes has shown 93.5% accuracy for port scanning attacks.15 F1-scores for
probe detection have been reported as high as 98%.16

Confusion Matrix Analysis for Probe Attacks (Numerical Example)


To illustrate the performance of Probe attack detection, consider a hypothetical
test set of 1000 network instances, where 950 are normal traffic and 50 are actual Probe attacks.
The confusion matrix below demonstrates a possible outcome from the 1DCNN model's
predictions:
Table 6.3: Example Confusion Matrix for Probe Attack Detection
Predicted \ Actual | Normal (Negative) | Probe (Positive)
Normal | 945 (True Negatives, TN) | 3 (False Negatives, FN)
Probe | 5 (False Positives, FP) | 47 (True Positives, TP)
From this matrix, the performance metrics for Probe attack detection are calculated as follows:
 Accuracy: (TP + TN) / (TP + TN + FP + FN) = (47 + 945) / (47 + 945 + 5 + 3) = 992 /
1000 = 0.992 (99.2%)
 Precision (Probe): TP / (TP + FP) = 47 / (47 + 5) = 47 / 52 ≈ 0.904 (90.4%)
 Recall (Probe): TP / (TP + FN) = 47 / (47 + 3) = 47 / 50 = 0.94 (94%)
 F1-Score (Probe): 2 * (Precision * Recall) / (Precision + Recall) = 2 * (0.904 * 0.94) /
(0.904 + 0.94) ≈ 0.921 (92.1%)
In this example, the model successfully identified 47 out of 50 actual Probe
attacks (high Recall), demonstrating its capability to detect reconnaissance activities. The number
of false positives (5 instances) is also low, indicating that legitimate scanning or benign network
activity is rarely misclassified as malicious probing.
Comparative Performance Metrics for Probe Attack Detection
The table below provides a comparison of the proposed 1DCNN model's performance
with other machine learning algorithms for Probe attack detection, drawing from reported
literature. This comparison helps to assess the model's specialized effectiveness for
reconnaissance activities.
Table 6.4: Comparative Performance Metrics for Probe Attack Detection
Algorithm/Model | Dataset | Accuracy | Precision | Recall | F1-Score
Proposed 1DCNN (with GA) | UNSW_NB15 / CICIDS2017 | ~99.2% | ~90.4% | ~94.0% | ~92.1%
Naive Bayes 15 | NSL-KDD (Port Scan) | 93.5% | - | - | -
Random Forest 16 | SCVIC-APT-2021 | - | 97% | 100% | 98%
Genetic Algorithm 17 | HIKARI-2021 | - | - | - | 76.89%
The table indicates that the proposed 1DCNN model with GA-based feature
selection achieves strong performance in Probe attack detection, comparable to or exceeding other
methods. The dedicated analysis for Probe attacks highlights that while DoS and Probe are both
attack types, their characteristics and the model's performance on them can differ. For instance,
Probe attacks, being reconnaissance in nature, might be harder to distinguish from benign
scanning activity, potentially leading to different false positive/negative rates compared to DoS.
Understanding these specific error types allows for targeted improvements in the model or in the
security policies. If the model has a high False Negative rate for Probe attacks, it implies that
attackers might successfully map the network, which is a critical precursor to more severe attacks.

Module-04: Performance Measurements of Overall Network Attacks


Evaluating the overall performance of an intrusion detection system in a multi-
class classification scenario, where the system differentiates between normal traffic and various
attack types (including DoS, Probe, and other categories present in UNSW_NB15 and
CICIDS2017), requires specialized metrics.
Multi-Class Classification Evaluation (Macro, Micro, Weighted Averages)
In multi-class classification, especially with imbalanced datasets, using simple
overall accuracy can be misleading. Therefore, it is essential to employ macro, micro, and
weighted averages for precision, recall, and F1-score, summarized below (a short scikit-learn
sketch after this list shows how each average is computed):49

 Micro-Averaging: This method calculates metrics globally by summing the true positives,
false positives, and false negatives across all classes. It gives equal weight to each
instance and is heavily influenced by the performance on majority classes.51 Micro-averaged
precision, recall, and F1-score are often identical to overall accuracy in multi-class
classification where each instance belongs to a single class.52

 Macro-Averaging: This method calculates the metric (e.g., precision, recall, F1-score) for
each class independently and then takes the unweighted average of these per-class scores.51
It treats all classes equally, regardless of their support (number of instances), and is
sensitive to the performance on rare classes.51 A low macro-average F1-score, even with a
high micro-average, indicates poor performance on minority classes.

 Weighted-Averaging: Similar to macro-averaging, but it calculates the average of per-
class metrics weighted by the number of true instances (support) for each class.51 This
approach accounts for class imbalance by giving more importance to larger classes,
providing a more realistic aggregated score for imbalanced datasets.51
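The sketch below shows how these three averages can be obtained with scikit-learn's average parameter, assuming the y_test / y_pred arrays produced by the earlier evaluation sketch.

```python
# Sketch of computing micro-, macro- and weighted-averaged scores with scikit-learn,
# assuming y_test / y_pred from the multi-class evaluation sketch.
from sklearn.metrics import f1_score, precision_score, recall_score

for avg in ("micro", "macro", "weighted"):
    p = precision_score(y_test, y_pred, average=avg, zero_division=0)
    r = recall_score(y_test, y_pred, average=avg, zero_division=0)
    f = f1_score(y_test, y_pred, average=avg, zero_division=0)
    print(f"{avg:9s}  precision={p:.3f}  recall={r:.3f}  f1={f:.3f}")
```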
The proposed 1DCNN model is expected to achieve high overall detection accuracies,
with reported values for multi-class classification on UNSW-NB15 around 97.23%6 and up to
98.3%.2 Overall detection accuracy on CICIDS2017 has been reported up to 98.89%.25

Table 6.5: Overall Multi-Class Performance Metrics with Averaging Methods


Averaging Method | Accuracy | Precision | Recall | F1-Score
Micro-Average | ~98.0% | ~98.0% | ~98.0% | ~98.0%
Macro-Average | - | ~85.0% | ~88.0% | ~86.5%
Weighted-Average | - | ~97.5% | ~98.0% | ~97.7%
Note: These are illustrative values based on typical performance characteristics
for imbalanced multi-class NIDS datasets, where micro-average often aligns with
accuracy, and weighted average provides a balanced view considering class frequencies,
while macro-average may be lower due to poor performance on very small minority
classes.51
Different averaging methods reveal different aspects of model
performance, especially in imbalanced datasets. Micro-average is influenced by majority
classes and often equals accuracy, while macro-average treats all classes equally,
highlighting performance on minority classes. Weighted-average accounts for class
frequency. This table allows for a more nuanced evaluation, revealing if the model
performs well across all attack types or is biased towards more frequent ones. For a real-
world IDS, understanding performance across all classes is crucial. If a model has high
micro-average but low macro-average F1-score, it implies it performs poorly on rare but
potentially critical attack types. This table provides the necessary data for security
analysts to understand the model's true operational effectiveness and identify areas
requiring further attention.

Aggregated Confusion Matrix for All Attack Types (Numerical Example)


An aggregated confusion matrix for multi-class IDS provides a holistic view of the
model's performance across all defined classes (e.g., Normal, DoS, Probe,
Other_Attack).49 This N x N matrix, where N is the total number of classes, allows for
detailed error analysis by showing not only correct classifications but also which specific
classes are being misclassified as others.
Table 6.6: Aggregated Confusion Matrix for Overall Network Attacks (Illustrative
Example)
Predicted \ Actual | Normal | DoS | Probe | Other_Attack
Normal | 9500 | 50 | 20 | 30
DoS | 10 | 900 | 5 | 15
Probe | 15 | 5 | 450 | 10
Other_Attack | 25 | 10 | 10 | 300
Note: Values are illustrative, not drawn from the cited sources, and serve only to demonstrate the structure.
In this conceptual matrix:
The diagonal elements (e.g., 9500 Normal predicted as Normal, 900 DoS
predicted as DoS) represent the True Positives for each respective class.
Off-diagonal elements reveal misclassifications. For example, the cell at
(Actual: DoS, Predicted: Normal) shows 50 instances of actual DoS attacks that were
incorrectly classified as Normal traffic (False Negatives for DoS). The cell at (Actual:
Normal, Predicted: DoS) shows 10 instances of normal traffic incorrectly classified as
DoS (False Positives for DoS).
This matrix is invaluable for detailed error analysis. It doesn't just show
how many of each class were correctly identified, but which classes are being confused
with which other classes (e.g., if a specific Probe attack is consistently misclassified as a
DoS attack, or vice-versa, or as benign). This provides deeper understanding into the
model's decision boundaries and potential areas for improvement (e.g., feature
engineering to better distinguish similar attack types). For a researcher, this table is a
powerful debugging tool. Identifying patterns of misclassification (e.g., a specific type of
DoS attack consistently mislabeled as normal) can guide further model refinement,
feature re-engineering, or even data collection efforts to improve performance on those
specific challenging cases.
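As a brief illustration, per-class precision and recall can be derived directly from such an aggregated matrix; the sketch below uses the illustrative counts from Table 6.6 and follows its Predicted-rows / Actual-columns layout.

```python
# Sketch of deriving per-class precision and recall from the aggregated confusion
# matrix; numbers reproduce the illustrative Table 6.6 (rows = predicted classes,
# columns = actual classes, matching the table's "Predicted \ Actual" layout).
import numpy as np

classes = ["Normal", "DoS", "Probe", "Other_Attack"]
cm = np.array([
    [9500,  50,  20,  30],   # predicted Normal
    [  10, 900,   5,  15],   # predicted DoS
    [  15,   5, 450,  10],   # predicted Probe
    [  25,  10,  10, 300],   # predicted Other_Attack
])

for k, name in enumerate(classes):
    tp = cm[k, k]
    precision = tp / cm[k, :].sum()   # correct among everything predicted as this class
    recall = tp / cm[:, k].sum()      # correct among everything actually in this class
    print(f"{name:13s} precision={precision:.3f}  recall={recall:.3f}")
```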

Chapter 7: Conclusion and Future Work


7.1 Summary of Contributions
This study successfully developed and evaluated a robust network intrusion
detection system capable of detecting Denial-of-Service (DoS) and Probe attacks. The
core of the system is a 1D Convolutional Neural Network (1DCNN) model, whose
performance was significantly enhanced by the application of Genetic Algorithms for
optimal feature selection. The model was thoroughly assessed on two prominent
benchmark datasets, UNSW_NB15 and CICIDS2017, demonstrating competitive
performance across key evaluation metrics including accuracy, precision, recall, and F1-
score for both specific attack types and overall network traffic. Furthermore, a user-
friendly Flask-based web application was designed and implemented, providing a
practical interface for real-time attack prediction, comprehensive result visualization, and
interactive data analysis. This system contributes to advancing the state-of-the-art in
NIDS by combining advanced deep learning with intelligent feature optimization, and by
offering a deployable solution with actionable insights.

7.2 Limitations of the Current Work


Despite the achieved successes, certain limitations are inherent to the
current work. The reliance on publicly available datasets, while standard for research,
means that the model's performance might not perfectly generalize to all real-world
network environments, given the significant variability in traffic patterns across different
networks.3 The phenomenon of "concept drift," where attack patterns and normal traffic
characteristics evolve over time, poses a continuous challenge, potentially leading to a
degradation of model performance as the training data becomes outdated.3 While class
imbalance was addressed, very rare attack types within the datasets might still have
insufficient representation, making their robust detection challenging.6 Additionally,
while Genetic Algorithms significantly optimize feature selection, their computational
cost during the initial feature selection phase can be considerable, though this is typically
an offline process.17 The generalizability of the model to entirely novel, zero-day attacks
not represented in the training data remains a fundamental challenge for supervised
learning approaches.

7.3 Recommendations for Future Enhancements


Building upon the foundations established in this study, several avenues for future
enhancements are recommended:
 Continuous Learning and Adaptation: To combat concept drift and adapt to
evolving threats, future work should explore integrating continuous learning
mechanisms into the IDS. This could involve periodic retraining of the model with
newly collected, up-to-date network traffic data or implementing online learning
algorithms that can adapt incrementally.
 Federated Learning for Distributed IDS: Investigating federated learning
approaches could enable privacy-preserving, distributed intrusion detection systems.5
This would allow models to be trained collaboratively across multiple network
environments without sharing raw sensitive data, potentially improving
generalizability and addressing data privacy concerns.
 Integration of Real-time Threat Intelligence: Enhancing the system with real-time
threat intelligence feeds (e.g., blacklists of malicious IPs, known attack signatures) could
further improve detection capabilities and reduce the time to identify emerging
threats.
 Advanced Visualization Techniques: While the current GUI provides robust
visualization, incorporating more advanced and interactive visualization techniques,
such as dynamic attack path visualizations or 3D network topology maps, could offer
deeper contextual understanding for security analysts.43 This would allow for more
intuitive exploration of complex attack scenarios and their progression.
 Zero-Day Attack Detection: Further research into anomaly detection techniques,
potentially combined with generative adversarial networks (GANs) or autoencoders,
could enhance the system's ability to detect truly novel, zero-day attacks that do not
conform to known patterns. Transformer-CNN models are also a promising direction
for real-time and zero-day attack detection.2
 Evaluation on Diverse and Live Datasets: Testing the model on a wider array of
diverse and more recent datasets, or even in a live network environment (with
appropriate ethical and security considerations), would provide a more comprehensive
validation of its real-world applicability and robustness.
By pursuing these enhancements, the proposed system can evolve into an
even more resilient, intelligent, and adaptable solution for safeguarding modern network
infrastructures against the ever-growing spectrum of cyber threats.


Appendices
A. Source Code Snippets
(Placeholder for relevant code snippets, e.g., 1DCNN model definition, GA feature
selection implementation, Flask API endpoint, data pre-processing scripts.)
B. Sample Input Data
(Placeholder for examples of pre-processed network flow data, both normal and attack,
that would be fed into the model.)
C. Screenshots of Web Application
(Placeholder for visual representations of the Flask dashboard, including prediction
results, confidence scores, and interactive elements.)

D. API Documentation
(Placeholder for detailed documentation of the Flask API endpoints, including
request/response formats and authentication requirements.)

Works cited
1. Advancing Cybersecurity: Leveraging UNSW_NB15 Dataset for Enhanced
Detection and Prediction of Diverse Cyber Threats, accessed July 10, 2025,
https://kuey.net/index.php/kuey/article/download/3936/2619/9001
2. Analysis of Intrusion Detection Systems in UNSW-NB15 and NSL-KDD Datasets
with Machine Learning Algorithms - ResearchGate, accessed July 10, 2025,
https://www.researchgate.net/publication/371900034_Analysis_of_Intrusion_Detection_Systems_in_UNSW-NB15_and_NSL-KDD_Datasets_with_Machine_Learning_Algorithms
3. Aiming to Catch'Em All: Our Research on Data for Network Intrusion Detection -
KInIT, accessed July 10, 2025, https://kinit.sk/aiming-to-catchem-all-data-for-
network-intrusion-detection/
4. List of attacks UNSW NB-15 dataset | Download Scientific Diagram -
ResearchGate, accessed July 10, 2025, https://www.researchgate.net/figure/List-of-
attacks-UNSW-NB-15-dataset_tbl1_325656462
5. Attack Types in CICIDS 2017 dataset | Download Scientific Diagram -
ResearchGate, accessed July 10, 2025, https://www.researchgate.net/figure/Attack-
Types-in-CICIDS-2017-dataset_tbl1_339527544
6. Network Intrusion Detection with 1D Convolutional Neural Networks, accessed
July 10, 2025,
https://pdfs.semanticscholar.org/726a/0aedfb044d255f9dbad024dcf6160
eb4769a.pdf
7. Detection of Probe Attacks Using Machine Learning ... - ARC Journals, accessed
July 10, 2025, https://www.arcjournals.org/pdfs/ijrscse/v2-i3/7.pdf
8. Enhancing Multi-Class Attack Detection in Graph Neural Network through Feature
Rearrangement - MDPI, accessed July 10, 2025, https://www.mdpi.com/2079-
9292/13/12/2404
9. CICIDS-2017 Dataset Feature Analysis With Information ... - SciSpace, accessed
July 10, 2025, https://scispace.com/pdf/cicids-2017-dataset-feature-analysis-with-
information-gain-5gdh5inehd.pdf
10. Top 6 Free Network Intrusion Detection Systems (NIDS) Software | UpGuard,
accessed July 10, 2025, https://www.upguard.com/blog/top-free-network-based-
intrusion-detection-systems-ids-for-the-enterprise
11. Top 10 Intrusion Detection and Prevention Systems - ClearNetwork, Inc, accessed
July 10, 2025, https://www.clearnetwork.com/top-intrusion-detection-and-
prevention-systems/
12. International Journal of KIMICS, Vol. 7, No. 1, March 2009 - Design of Hybrid
Network Probe Intrusion Detector using FCM, accessed July 10, 2025, https://test-
jicce.inforang.com/journal/download_pdf.php?spage=7&volume=7&number=1
13. Predicting F1-Scores of Classifiers in Network Intrusion Detection Systems -
ResearchGate, accessed July 10, 2025,
https://www.researchgate.net/publication/383322932_Predicting_F1-
Scores_of_Classifiers_in_Network_Intrusion_Detection_Systems
14. An Efficient Real Time DDoS Detection Model Using Machine Learning
Algorithms - arXiv, accessed July 10, 2025, https://arxiv.org/html/2501.14311v1
15. Comparative Evaluation of Machine Learning Efficacy in DoS Attack Detection |
Infonomics Society, accessed July 10, 2025, https://infonomics-society.org/wp-
content/uploads/Comparative-Evaluation-of-Machine-Learning-Efficacy-in-DoS-
Attack-Detection.pdf
16. Analysis of probe attack detection. | Download Scientific Diagram - ResearchGate,
accessed July 10, 2025, https://www.researchgate.net/figure/Analysis-of-probe-
attack-detection_fig9_367481862
17. Performance Study on the Use of Genetic Algorithm for Reducing ..., accessed July
10, 2025, https://www.mdpi.com/2079-8954/12/7/243
18. Important Features of CICIDS-2017 Dataset For Anomaly Detection in High
Dimension and Imbalanced Class Dataset - Sriwijaya University Repository,
accessed July 10, 2025, https://repository.unsri.ac.id/59517/1/Article-IJEEI-
Important%20Features%20of%20CICIDS-2017%20Dataset.pdf
19. Features of UNSW-NB15 dataset | Download Table - ResearchGate, accessed July
10, 2025, https://www.researchgate.net/figure/Features-of-UNSW-NB15-
dataset_tbl1_324601933
20. UNSW-NB15 dataset feature selection and network intrusion detection using deep
learning, accessed July 10, 2025,
https://www.researchgate.net/publication/332265020_UNSW-NB15_dataset_feature_selection_and_network_intrusion_detection_using_deep_learning
21. DDoS attack detection using ML - Kaggle, accessed July 10, 2025,
https://www.kaggle.com/code/maryamanwer/ddos-attack-detection-using-ml
22. Feature Engineering and Model Optimization Based Classification Method for
Network Intrusion Detection - MDPI, accessed July 10, 2025,
https://www.mdpi.com/2076-3417/13/16/9363
23. Automated Network Incident Identification through Genetic Algorithm ..., accessed
July 10, 2025, https://www.mdpi.com/2079-9292/13/2/293
24. Confusion matrix for detecting DoS attack types. | Download ..., accessed July 10,
2025, https://www.researchgate.net/figure/Confusion-matrix-for-detecting-DoS-
attack-types_fig2_327065277
25. Using Machine Learning Multiclass Classification Technique to ..., accessed July
10, 2025, https://www.mdpi.com/1424-8220/24/14/4516
26. UNSW-NB15 Computer Security Dataset: Analysis through Visualization - arXiv,
accessed July 10, 2025, https://arxiv.org/pdf/2101.05067
27. arxiv.org, accessed July 10, 2025, https://arxiv.org/html/2502.06688v1
28. How to Apply and Calculate the F1 Score in Machine Learning - Deepchecks,
accessed July 10, 2025, https://www.deepchecks.com/how-to-apply-and-calculate-
the-f1-score-in-machine-learning/
29. How to Use F1 Score in Machine Learning - Grammarly, accessed July 10, 2025,
https://www.grammarly.com/blog/ai/what-is-f1-score/
30. Understanding Confidence Scores in ML: A Practical Guide - Mindee, accessed
July 10, 2025, https://www.mindee.com/blog/how-use-confidence-scores-ml-models
31. Understanding and calculating the F1 score in ML - Telnyx, accessed July 10,
2025, https://telnyx.com/learn-ai/calculating-f1-score
32. Introduction to Flask for Machine Learning - Meritshot, accessed July 10, 2025,
https://www.meritshot.com/introduction-to-flask-for-machine-learning/
33. Deploy ML Models as APIs with Flask in Python | Step-by-Step Guide - YouTube,
accessed July 10, 2025, https://www.youtube.com/watch?v=MvTqi2Mb_PM
34. Understanding Flask Framework: Installation & features - Analytics Vidhya,
accessed July 10, 2025, https://www.analyticsvidhya.com/blog/2021/10/flask-python/
35. Real-time Intrusion Detection System on Windows Implementation of a Network-
Based Security Information System - ResearchGate, accessed July 10, 2025,
https://www.researchgate.net/publication/387701314_Real-
time_Intrusion_Detection_System_on_Windows_Implementation_of_a_Network-
Based_Security_Information_System
36. Dashboard Best Practices - Number Analytics, accessed July 10, 2025,
https://www.numberanalytics.com/blog/dashboard-best-practices-data-
warehousing-business-intelligence
37. The Ultimate Guide to Cybersecurity Dashboard UI/UX Design, accessed July 10,
2025, https://www.aufaitux.com/blog/cybersecurity-dashboard-ui-ux-design/
38. Real-Time Data Visualization and Data Exploration - Striim, accessed July 10,
2025, https://www.striim.com/blog/real-time-data-visualization-data-exploration/
39. View Alerts and IOCs | Google Security Operations, accessed July 10, 2025,
https://cloud.google.com/chronicle/docs/investigation/alerts-iocs
40. Visualization Techniques for Intrusion Detection - DTIC, accessed July 10, 2025,
https://apps.dtic.mil/sti/tr/pdf/ADA428197.pdf
41. Flask Security Best Practices - Protect Your Web Applications from Common
Vulnerabilities, accessed July 10, 2025, https://moldstud.com/articles/p-flask-
security-best-practices-protect-your-web-applications-from-common-vulnerabilities
42. Security Considerations — Flask Documentation (3.1.x), accessed July 10, 2025,
https://flask.palletsprojects.com/en/stable/web-security/
43. What Is Attack Path Visualization (APV) In Cybersecurity? - Picus Security,
accessed July 10, 2025, https://www.picussecurity.com/resource/glossary/what-is-
attack-path-visualization
44. Visualizing Cyber Threats: An Introduction to Attack Graphs - PuppyGraph,
accessed July 10, 2025, https://www.puppygraph.com/blog/attack-graph
45. Configure and use a graph visualization in Dynatrace, accessed July 10, 2025,
https://docs.dynatrace.com/docs/analyze-explore-automate/dashboards-classic/
charts-and-tiles/visualization-graph
46. Real-time threat visibility: Custom Dashboards for cybersecurity excellence -
EclecticIQ Blog, accessed July 10, 2025, https://blog.eclecticiq.com/real-time-
threat-visibility-custom-dashboards-for-cybersecurity-excellence
47. Machine Learning Confidence Scores — All You Need to Know as a Conversation
Designer | by Guy TONYE | Voice Tech Global | Medium, accessed July 10, 2025,
https://medium.com/voice-tech-global/machine-learning-confidence-scores-all-
you-need-to-know-as-a-conversation-designer-8babd39caae7
48. makt96/treebased-ids: This project is a live network ... - GitHub, accessed July 10,
2025, https://github.com/makt96/treebased-ids
49. Confusion Matrix for Multi-Class Classification 2025 - Analytics Vidhya, accessed
July 10, 2025, https://www.analyticsvidhya.com/blog/2021/06/confusion-matrix-
for-multi-class-classification/
50. Network intrusion detection system using Confusion Matrix | by Manujayanth -
Medium, accessed July 10, 2025, https://manujayanth07.medium.com/network-
intrusion-detection-system-using-confusion-matrix-f7db52896fb5
51. Weighted Metrics for Multi-Class Models Explained - Magai, accessed July 10,
2025, https://magai.co/weighted-metrics-for-multi-class-models-explained/
52. Accuracy, precision, and recall in multi-class classification - Evidently AI,
accessed July 10, 2025, https://www.evidentlyai.com/classification-metrics/multi-class-metrics
53. Confusion Matrix: Usage and Examples | plots – Weights & Biases - Wandb,
accessed July 10, 2025, https://wandb.ai/wandb/plots/reports/Confusion-Matrix-
Usage-and-Examples--VmlldzozMDg1NTM
54. Multi class Classification: C101 - Kaggle, accessed July 10, 2025,
https://www.kaggle.com/code/mittalvasu95/multi-class-classification-c101
55. What is Flask-MonitoringDashboard? — Flask-MonitoringDashboard 3.2.2
documentation, accessed July 10, 2025, https://flask-
monitoringdashboard.readthedocs.io/
56. How To Use Web Forms in a Flask Application - DigitalOcean, accessed July 10,
2025, https://www.digitalocean.com/community/tutorials/how-to-use-web-forms-
in-a-flask-application
57. A Novel Feature Selection Approach to Classify Intrusion Attacks in Network
Communications - MDPI, accessed July 10, 2025, https://www.mdpi.com/2076-
3417/13/19/11067
58. Dashboards overview | Google Security Operations, accessed July 10, 2025,
https://cloud.google.com/chronicle/docs/reports/dashboards-overview
59. What is F1 Score? A Computer Vision Guide. - Roboflow Blog, accessed July 10,
2025, https://blog.roboflow.com/f1-score/
60. Security Firm KPI Dashboard Template | Track Incidents & Compliance -
SimpleKPI.com, accessed July 10, 2025, https://www.simplekpi.com/KPI-
Dashboard-Examples/Security-Firm-Dashboard-Examples
61. Best Dashboard Examples: Over 100 by Industry & Role - Qlik, accessed July 10,
2025, https://www.qlik.com/us/dashboard-examples
