Cyber Attack Prediction

The project aims to develop a web-based Cyber Attack Prediction System using Machine Learning, Deep Learning, and Generative Adversarial Networks to enhance real-time intrusion detection and address class imbalance in cybersecurity datasets. The system integrates multiple AI approaches, achieving consistent accuracy and focusing on practical applicability in production environments. Future improvements may explore hybrid strategies combining GANs with SMOTE techniques for better data quality and training performance.

Uploaded by

Yug Raithatha

Cyber Attack Prediction
BCSE355L CLOUD ARCHITECTURE
Sakethram Sathish 23BCE0934
Yug Raithatha 23BCE0964
Divya Juliet 23BCE2297
AIM of the project
To develop an advanced, web-based Cyber Attack Prediction
System that leverages Machine Learning (ML), Deep Learning (DL),
and Generative Adversarial Networks (GANs) for:

● Detecting network intrusions in real time using multiple AI
models
● Improving detection accuracy through ensemble methods and
hybrid modeling
● Addressing class imbalance in cybersecurity datasets using
GAN-based synthetic data augmentation
● Providing an interactive and user-friendly interface for data
analysis, model training, evaluation, and live prediction
MOTIVATION behind the project
We chose a fundamental cybersecurity problem
and approached it with practical elegance.
Instead of chasing accuracy metrics alone, we
focused on real-world applicability:
specifically, handling class imbalance and
keeping the false positive rate low, so that the
system remains usable in production
environments.
Project novelties compared to the existing methods
● Existing methods use one or two approaches in their pipeline; we integrate three different approaches
(ML + DL + generative AI) in a single pipeline.
● Existing models focus on theoretical data generation; our GAN model addresses a real-world problem, namely
data imbalance.
● The predictions of the first model are used to augment the training data of the next model, so that the next model
uses both the prediction data and the dataset of its [Link]. Existing models treat models as separate entities
without knowledge transfer.
● Our model achieves a consistent 88% accuracy across all three configurations. Other papers report higher theoretical
accuracy but lack practical consistency. Our models show excellent normal-traffic identification (98% accuracy) and
minimal false alarms (2-14 mistakes per 1000 normal samples).
● Practical problem-solving focus: our model specifically addresses class imbalance through GAN-based augmentation,
the vanishing gradient problem using LeakyReLU activation, and overfitting through strategic dropout layers (0.1, 0.2, 0.3).
● Unlike other papers that focus only on accuracy, our work acknowledges the critical issue of missing 25% of
actual attacks, the trade-off between model complexity and practical performance, and the importance of false positive
rates in real deployments.
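The LeakyReLU and dropout items above can be illustrated with a minimal, standard-library-only sketch. The slides do not show the network code or the LeakyReLU slope, so the slope alpha = 0.01, the fixed random seed, and the function names are assumptions made for illustration; only the dropout rates (0.1, 0.2, 0.3) come from the slide.

```python
import random


def leaky_relu(x: float, alpha: float = 0.01) -> float:
    """Return x for positive inputs and alpha * x otherwise (alpha is assumed)."""
    return x if x > 0 else alpha * x


def leaky_relu_grad(x: float, alpha: float = 0.01) -> float:
    """Gradient of LeakyReLU: 1 for positive inputs, alpha otherwise (never 0)."""
    return 1.0 if x > 0 else alpha


def dropout(values, rate, rng):
    """Inverted dropout: zero each value with probability `rate` and scale the
    survivors by 1 / (1 - rate) so the expected sum stays the same."""
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    keep = 1.0 - rate
    return [v / keep if rng.random() >= rate else 0.0 for v in values]


rng = random.Random(0)  # fixed seed: an assumption, for reproducibility only
activations = [leaky_relu(x) for x in (-2.0, -0.5, 0.0, 1.5)]
for rate in (0.1, 0.2, 0.3):  # the dropout rates quoted in the slide
    print(rate, dropout(activations, rate, rng))
```

Because the LeakyReLU gradient is alpha rather than 0 for negative inputs, some gradient still flows through inactive units, which is the property that helps against vanishing gradients.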
Additional Findings
For the same pipeline, we replaced
the GAN model with a SMOTE +
Tomek model.
Why was this change implemented?
The SMOTE + Tomek model is faster,
predictable and hence more reliable,
simpler, and well suited to
imbalanced data.
This model showed better accuracy
than our previous model and an
exceptional increase in the
out-of-bag (OOB) score.
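The idea behind SMOTE + Tomek can be sketched in a few lines of standard-library Python. The slides do not show how the step was implemented (a library such as imbalanced-learn is a plausible choice, but that is an assumption), so the code below is a simplified illustration: it uses a single nearest neighbour for SMOTE, drops both points of each Tomek link, and runs on hypothetical toy data.

```python
import math
import random


def nearest(idx, points):
    """Index of the point closest to points[idx] (needs at least two points)."""
    return min((j for j in range(len(points)) if j != idx),
               key=lambda j: math.dist(points[idx], points[j]))


def smote(minority, n_new, rng):
    """Create n_new synthetic points, each on the segment between a random
    minority sample and its nearest minority neighbour (k = 1 for brevity)."""
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        j = nearest(i, minority)
        t = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(tuple(a + t * (b - a)
                               for a, b in zip(minority[i], minority[j])))
    return synthetic


def remove_tomek_links(X, y):
    """Drop every sample in a Tomek link: a pair of opposite-class points
    that are each other's nearest neighbour."""
    nn = [nearest(i, X) for i in range(len(X))]
    linked = {i for i in range(len(X)) if nn[nn[i]] == i and y[i] != y[nn[i]]}
    keep = [i for i in range(len(X)) if i not in linked]
    return [X[i] for i in keep], [y[i] for i in keep]


rng = random.Random(42)  # fixed seed: an assumption, for reproducibility only
# Hypothetical toy data, not the project dataset: 0 = Normal, 1 = Attack.
X = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (0.3, 0.2),
     (0.9, 0.9), (1.0, 1.0), (1.2, 1.1)]
y = [0, 0, 0, 0, 0, 1, 1]

minority = [p for p, label in zip(X, y) if label == 1]
n_new = y.count(0) - y.count(1)  # oversample until the class counts match
synthetic = smote(minority, n_new, rng)
X_bal, y_bal = remove_tomek_links(X + synthetic, y + [1] * n_new)
print(len(X), "->", len(X_bal), "samples; attacks:", sum(y_bal))
```

Oversampling first and cleaning afterwards means the Tomek step can also remove synthetic points that landed too close to the other class.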
Proposed Architecture
Screenshots

Figure 1: Home Screen of the Cyber Attack Prediction System. This interface highlights the three core technological components of
the solution: Machine Learning (Random Forest), Deep Learning (Neural Network), and GAN Augmentation, along with key dataset
information and instructions for system usage.
Screenshots

Figure 2: Streamlit Application Control Panel and Data Analysis Interface. This view shows the main navigation sidebar and the
Data Analysis page, where the user can load the dataset before proceeding to model training.
Screenshots

Figure 3: Dataset Load Confirmation and Sample Data View. The application successfully loads the dataset, displaying the total
number of samples and a head-view of the raw features, including session_id, various network metrics, and the binary target
variable, attack_detected.
Screenshots

Figure 4: Attack Detection Distribution of the Loaded Dataset. This bar chart displays the class imbalance, showing the count of
'Normal' instances (blue bar, class 0) versus 'Attack' instances (red bar, class 1) in the cybersecurity dataset.
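The counts behind a class-distribution chart like the one described above can be computed as follows. This is a minimal sketch: the label values are hypothetical, not the project's data, and the function names are made up for illustration; only the coding 0 = Normal, 1 = Attack comes from the caption.

```python
from collections import Counter


def class_distribution(labels):
    """Map each class to (count, share of all samples)."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {cls: (n, n / total) for cls, n in sorted(counts.items())}


def imbalance_ratio(labels):
    """Size of the largest class divided by the size of the smallest."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


labels = [0] * 6 + [1] * 4  # hypothetical attack_detected values
for cls, (count, share) in class_distribution(labels).items():
    print(f"class {cls}: {count} samples ({share:.0%})")
print("imbalance ratio:", imbalance_ratio(labels))
```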
Screenshots

Figure 5: Feature Correlation Matrix Visualization. This heatmap, displayed in the Data Analysis section, illustrates
the linear relationship between the numeric features. The darker red shades (e.g., along the diagonal) indicate strong
positive correlations, while lighter shades or low absolute values (like the -0.01) indicate weak or no linear correlation
between features.
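Each cell of such a heatmap is a Pearson correlation coefficient. A minimal sketch of the computation is shown below; apart from attack_detected, the column names and all values are hypothetical, chosen only to make the example runnable.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equally long numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return float("nan")  # undefined for a constant column
    return cov / (sx * sy)


def correlation_matrix(columns):
    """Nested dict of pairwise correlations between named columns."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names}
            for a in names}


# Hypothetical feature values, not taken from the project dataset.
columns = {
    "failed_logins": [0, 1, 4, 2, 5],
    "session_duration": [30.0, 42.0, 12.0, 25.0, 8.0],
    "attack_detected": [0, 0, 1, 0, 1],
}
for name, row in correlation_matrix(columns).items():
    print(name, [round(v, 2) for v in row.values()])
```

The diagonal is always 1 because every feature is perfectly correlated with itself, which is why the diagonal of the heatmap shows the darkest shade.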
Screenshots

Figure 6: Descriptive Feature Statistics (Numerical). This table provides a statistical summary of the dataset's numerical
columns, including the count, mean, standard deviation, minimum, quartiles (25%, 50%, 75%), and maximum values for each
feature.
Screenshots

Figure 7: Completed Model Training Pipeline. This view summarizes the sequential execution of the training steps, including
data preprocessing, training of the Random Forest and Deep Learning models, and the final training of the Hybrid RF + DL model
using GAN Augmented data, along with their preliminary performance metrics.
Screenshots

Figure 8: Model Performance Comparison on Test Data (Metrics Tab). This view presents a side-by-side comparison of the three
models' performance, highlighting Accuracy and ROC AUC scores. The RF + DL (GAN Augmented) model is shown achieving a
competitive accuracy, demonstrated visually by the accompanying bar chart.
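For readers unfamiliar with ROC AUC, the metric can be read as the probability that a randomly chosen attack sample receives a higher score than a randomly chosen normal sample. The sketch below computes it by direct pairwise comparison; the labels and scores are hypothetical, and this is not necessarily how the application computes the value.

```python
def roc_auc(y_true, scores):
    """ROC AUC as the share of (attack, normal) pairs ranked correctly;
    tied scores count as half a correct pair."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = Attack) and predicted attack probabilities.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print("ROC AUC:", roc_auc(y_true, scores))
```

The pairwise form is quadratic in the number of samples, so it is suitable for explanation rather than for large test sets.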
Screenshots

Figure 9: Visual Comparison of Model Accuracies. This bar chart clearly contrasts the performance of the three models:
Random Forest (Baseline), Deep Learning + Random Forest (Ensemble), and the final Hybrid RF + DL with GAN
Augmentation, showing that the most advanced model maintains the highest performance on the test set.
Screenshots

Figure 10: Detailed Classification Reports for Baseline and GAN Augmented Models. This table provides a comprehensive
breakdown of model performance using metrics such as precision, recall, and F1-score for each class (0=Normal, 1=Attack),
highlighting the effectiveness of the augmentation strategy in balancing performance across classes.
Screenshots

Figure 11: Confusion Matrices for Random Forest, Deep Learning + Random Forest, and RF + DL (GAN Augmented) Models.
This side-by-side comparison shows the number of True Positives (Attack correctly identified), True Negatives (Normal
correctly identified), False Positives, and False Negatives for each model, demonstrating the classification performance on
both Normal and Attack classes.
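The four confusion-matrix cells determine the other metrics quoted in this deck, including false alarms per 1000 normal samples and the share of missed attacks. The sketch below shows the arithmetic; the counts are hypothetical and were chosen only so that the missed-attack rate comes out at 25%, they are not the project's results.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels with 1 = Attack, 0 = Normal."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def summarize(tp, tn, fp, fn):
    """Derive the usual metrics from the four confusion-matrix cells."""
    def ratio(num, den):
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "precision": precision,
        "recall": recall,
        "f1": ratio(2 * precision * recall, precision + recall),
        "false_alarms_per_1000_normal": 1000 * ratio(fp, fp + tn),
        "missed_attack_rate": ratio(fn, fn + tp),
    }


# Hypothetical counts for illustration only.
for name, value in summarize(tp=75, tn=990, fp=10, fn=25).items():
    print(f"{name}: {value:.3f}")
```

Reporting false alarms per 1000 normal samples alongside the missed-attack rate makes the trade-off between the two error types visible, which accuracy alone hides.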
Screenshots

Figure 12: Deep Learning Training History (Learning Curves Tab). These plots show the model's performance over epochs,
detailing the Area Under the Curve (AUC) and Loss for both the training and validation datasets, confirming the
convergence and stability of the Deep Learning component of the hybrid system.
Screenshots

Figure 13: Real-time Predictions Interface. This view demonstrates the final capability of the system by generating a random
scaled input sample and simultaneously displaying the prediction (Normal/Attack) and the confidence score from each of the
three trained models: Random Forest, Deep Learning, and the Augmented RF (Hybrid) model.
Screenshots

Figure 14: Model Comparison and Final Decision Logic. This section of the predictions interface combines the
individual model results (Model Votes) to arrive at a definitive Final Decision on the network traffic. In this case, a
'Perfect Agreement' among all three models leads to the classification of NORMAL TRAFFIC.
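One way to combine the three model votes into a final decision is a simple majority vote. The slides only show the 'Perfect Agreement' case, so the majority rule for disagreements, the 'Majority Vote' label, and the confidence values below are assumptions; the sketch also assumes an odd number of models so that binary votes cannot tie.

```python
from collections import Counter


def final_decision(votes):
    """Combine per-model votes.

    `votes` maps a model name to (label, confidence). Returns the winning
    label, an agreement description, and the most confident model.
    """
    tally = Counter(label for label, _ in votes.values())
    label, count = tally.most_common(1)[0]
    agreement = "Perfect Agreement" if count == len(votes) else "Majority Vote"
    most_confident = max(votes, key=lambda model: votes[model][1])
    return label, agreement, most_confident


# Hypothetical confidences; model names follow the prediction interface.
votes = {
    "Random Forest": ("Normal", 0.91),
    "Deep Learning": ("Normal", 0.97),
    "Augmented RF (Hybrid)": ("Normal", 0.88),
}
label, agreement, top_model = final_decision(votes)
print(label, "|", agreement, "| most confident:", top_model)
```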
Screenshots

Figure 15: Detailed Prediction Analysis and Confidence Insights. This final prediction view provides a tabular
summary of each model's prediction, confidence, and the probability breakdown for both Normal and Attack
classes. The Insights section confirms the perfect agreement and highlights the model with the highest
prediction confidence (Deep Learning).
Screenshots

Figure 1: The Amazon Web Services (AWS) login interface, illustrating the choice between logging in as the high-privilege Root
user or a managed-privilege IAM user for accessing the AWS Management Console.
Screenshots

Figure 2: The subsequent stage of the AWS Root user sign-in process, requiring the entry of the Root user password to
authenticate access to the AWS Management Console. This screen follows the email entry step (as shown in Figure 1).
Screenshots

Figure 3: The enforcement of Multi-Factor Authentication (MFA) during the AWS sign-in process. This security layer requires the
user to input a unique, time-sensitive code from a designated MFA device, significantly enhancing account protection against
unauthorized access.
Screenshots

Figure 4: The AWS Management Console Home dashboard displayed upon successful login. This interface provides the user with an
overview of recently accessed services (e.g., EC2), regional settings (e.g., Europe (Stockholm)), account identification, and current
status information (AWS Health, Cost and usage).
Screenshots

Figure 5: The Amazon EC2 (Elastic Compute Cloud) Instances dashboard within the AWS Management Console, showing the details of
a single virtual machine instance. The view highlights key operational data relevant to cloud resource management, including the
instance ID, state (Stopped), type ([Link]), and associated network details (Public and Private IPv4 addresses).
Screenshots

Figure 6: The Actions dropdown menu for an EC2 instance, illustrating the administrative capabilities available to the user. This
menu includes critical operational controls such as Launch Instances, Start/Stop/Reboot/Terminate instance, and options for
managing networking and security configurations.
Screenshots

Figure 7: Confirmation of an action execution within the EC2 dashboard. The notification banner confirms the successful
initiation of the instance start operation. The instance state is transitioning from Stopped (Figure 5) to Running, demonstrating
the real-time feedback provided by the AWS Management Console during resource provisioning and operational changes.
Conclusions
This project introduced a novel hybrid model for
cyber attack prediction by sequentially
combining Random Forest, Deep Learning, and
GANs. The key finding was that while the
advanced DL and GAN models provided slight
enhancements, the Random Forest classifier
remained a robust and interpretable core for
detection. Feature analysis confirmed that failed
login attempts are critical predictors.
Future discussions
While our current implementation successfully leverages
GAN-based augmentation to address class imbalance in
cybersecurity data, we recognize the proven effectiveness of
SMOTE variants demonstrated in alternative approaches. Future
iterations could explore hybrid strategies that combine the realistic
sample generation of GANs with the computational efficiency of
SMOTE+ techniques. This would allow us to balance the trade-off
between synthetic data quality and training performance, potentially
further optimizing our false positive ratios while maintaining the
practical elegance that defines our approach to this fundamental
cybersecurity challenge.
Outcomes Achieved
1. Successful AWS Deployment:
The Cyber Attack Prediction System was successfully hosted on AWS,
enabling global accessibility and demonstrating cloud deployment skills such
as EC2 configuration, environment setup, and security management.

2. End-to-End ML Pipeline Implementation:

Implemented a complete data pipeline from preprocessing, feature scaling,
model training, and evaluation to live prediction, showcasing practical
understanding of machine learning workflow in a cloud environment.

3. Integration of Multiple AI Techniques:

Combined Machine Learning (Random Forest), Deep Learning (Neural
Networks), and Generative Adversarial Networks (GANs) to enhance
accuracy and detect anomalous or unseen cyber-attack patterns effectively.
4. Interactive Web Application Development:
Designed and deployed a Streamlit-based interactive web interface that
allows real-time input and prediction, improving usability and demonstrating
frontend-backend integration with ML models.

5. Model Optimization and Evaluation:

Achieved measurable accuracy improvements through hyperparameter tuning,
normalization, and evaluation metrics like confusion matrix, ROC-AUC, and
classification reports.
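The feature scaling step of the pipeline listed above can be sketched as follows. The slides mention scaling and normalization without naming the method, so z-score standardization, the function names, and the sample values are all assumptions. The key point the sketch illustrates is that the statistics are fitted on training data only and then reused for test data and live inputs.

```python
import statistics


def fit_scaler(rows):
    """Compute per-column mean and standard deviation from training rows."""
    cols = list(zip(*rows))
    means = [statistics.fmean(c) for c in cols]
    # A constant column has zero deviation; fall back to 1.0 to avoid
    # dividing by zero.
    stds = [statistics.pstdev(c) or 1.0 for c in cols]
    return means, stds


def transform(rows, means, stds):
    """Standardize rows using previously fitted statistics."""
    return [[(v - m) / s for v, m, s in zip(row, means, stds)]
            for row in rows]


# Hypothetical feature rows, not taken from the project dataset.
train = [[1.0, 200.0], [3.0, 400.0], [5.0, 600.0]]
live_sample = [[4.0, 500.0]]

means, stds = fit_scaler(train)
print(transform(train, means, stds))
print(transform(live_sample, means, stds))
```

Reusing the training statistics for live prediction keeps the inputs on the same scale the models saw during training.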