MGM's College of Engineering and Technology Kamothe, Navi Mumbai – 410 209
Department of Information Technology
A.Y.2023-24
SPAM EMAIL CLASSIFIER
Roll no. Name of student
45 Patil Tanmay
49 Rane Pururaj
50 Ratate Sayoj
51 Rothe Rudra
Under the guidance of
Prof. Sanjay Waykar
MGMCET
Outline
Introduction
Literature Survey
Proposed system
Software & Hardware Requirements
Algorithm
Flowchart
Conclusion
References
1. Introduction
Email communication is essential, but spam emails clutter inboxes and pose security risks. The Spam Email Classifier project aims to automatically classify emails as spam or non-spam using machine learning and natural language processing. This helps users maintain a clean inbox, improve productivity, and reduce security threats. The system preprocesses the email text and trains classification models to recognize patterns in email content.
2. Literature survey
Various techniques for spam classification include:
Naive Bayes classifier
Support Vector Machines (SVM)
Decision Trees
Most common method: text classification using machine learning.
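To make the three techniques above concrete, the following is a minimal sketch that fits each of them on the same bag-of-words features using scikit-learn. The six emails, their labels, and the test message are made up for illustration and are not the project's dataset; the choice of `CountVectorizer` and of a linear SVM is also an assumption.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.svm import LinearSVC
from sklearn.tree import DecisionTreeClassifier

# Tiny made-up corpus for illustration only (1 = spam, 0 = non-spam).
emails = [
    "win a free prize now",
    "claim your free cash reward",
    "cheap loans click now",
    "meeting agenda for monday",
    "please review the project report",
    "lunch with the team tomorrow",
]
labels = [1, 1, 1, 0, 0, 0]

# Bag-of-words features: one column per word, values are word counts.
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(emails)

# The three classifier families named in the survey.
models = {
    "Naive Bayes": MultinomialNB(),
    "SVM (linear)": LinearSVC(),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
}

# A hypothetical unseen message, encoded with the same vocabulary.
X_new = vectorizer.transform(["free cash prize"])

predictions = {}
for name, model in models.items():
    model.fit(X, labels)
    predictions[name] = int(model.predict(X_new)[0])
    print(f"{name}: {'spam' if predictions[name] == 1 else 'non-spam'}")
```

With a corpus this small the models may disagree, so the output says little about which technique is better; a real comparison would need a held-out test set.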
3. Proposed System
Goal: Build an efficient spam email classifier trained on a labeled email dataset.
Approach: Use natural language processing (NLP) and machine learning algorithms for classification.
4. Software & Hardware Requirements
Software:
Python (with libraries such as scikit-learn and pandas)
Jupyter Notebook or any IDE
Hardware:
Minimum 4 GB RAM
Processor: Intel i3 or above
5. Algorithm
Step 1: Collect a labeled email dataset (spam and non-spam).
Step 2: Preprocess the data (tokenization, stop-word removal, stemming).
Step 3: Split the dataset into training and test sets.
Step 4: Train a classification model (e.g., Naive Bayes) on the training set.
Step 5: Test the model on the test set and evaluate performance.
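The five steps can be sketched end to end with only the Python standard library. This is an illustrative sketch, not the project's implementation: the eight emails, the short stop-word list, the crude "strip a trailing s" stemmer, and the helper names `preprocess`, `split`, `train`, and `predict` are all assumptions. The classifier is a multinomial Naive Bayes with add-one (Laplace) smoothing.

```python
import math
import random
import re
from collections import Counter

# Step 1: a tiny made-up labeled dataset (illustrative only).
DATA = [
    ("win a free prize now", "spam"),
    ("claim your free cash reward today", "spam"),
    ("cheap loans click here now", "spam"),
    ("free entry win cash prizes", "spam"),
    ("meeting agenda for monday", "ham"),
    ("please review the project report", "ham"),
    ("lunch with the team tomorrow", "ham"),
    ("notes from the monday meeting", "ham"),
]

# Hypothetical, deliberately short stop-word list.
STOP_WORDS = {"a", "the", "for", "with", "your", "from", "here"}

def preprocess(text):
    """Step 2: tokenize, drop stop words, apply a crude stemmer."""
    tokens = re.findall(r"[a-z]+", text.lower())
    tokens = [t for t in tokens if t not in STOP_WORDS]
    # Crude stemming assumption: strip a trailing "s" from longer words.
    return [t[:-1] if t.endswith("s") and len(t) > 3 else t for t in tokens]

def split(examples, test_fraction=0.25, seed=0):
    """Step 3: stratified split so both classes appear in each set."""
    rng = random.Random(seed)
    train_set, test_set = [], []
    for label in ("spam", "ham"):
        group = [ex for ex in examples if ex[1] == label]
        rng.shuffle(group)
        n_test = max(1, int(len(group) * test_fraction))
        test_set += group[:n_test]
        train_set += group[n_test:]
    return train_set, test_set

def train(examples):
    """Step 4: count words per class for multinomial Naive Bayes."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    class_counts = Counter()
    for text, label in examples:
        class_counts[label] += 1
        word_counts[label].update(preprocess(text))
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    return word_counts, class_counts, vocab

def predict(model, text):
    """Return the class with the higher log-probability."""
    word_counts, class_counts, vocab = model
    total_docs = sum(class_counts.values())
    scores = {}
    for label, counts in word_counts.items():
        score = math.log(class_counts[label] / total_docs)  # log prior
        total_words = sum(counts.values())
        for token in preprocess(text):
            # Add-one smoothing keeps unseen words from zeroing the score.
            score += math.log((counts[token] + 1) / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

train_set, test_set = split(DATA)
model = train(train_set)

# Step 5: accuracy on the held-out emails.
correct = sum(predict(model, text) == label for text, label in test_set)
accuracy = correct / len(test_set)
print(f"Test accuracy: {accuracy:.2f} on {len(test_set)} emails")
print(predict(model, "free cash prize"))
```

In practice a library stemmer (for example from NLTK) and a much larger dataset would replace the toy pieces here; the structure of the steps stays the same.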
6. Flowchart
Start
Collect Data
Preprocess Data
Split Dataset
Train Model
Test Model
Evaluate Performance
Classify Email
End
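The flowchart does not say which metrics the "Evaluate Performance" stage reports. As one possible reading, the sketch below computes accuracy, precision, recall, and F1 from true and predicted labels, treating spam as the positive class. The function name `evaluate` and the example labels are hypothetical.

```python
def evaluate(y_true, y_pred, positive="spam"):
    """Compute basic classification metrics for a binary spam filter."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)  # spam caught
    fp = sum(t != positive and p == positive for t, p in pairs)  # ham flagged as spam
    fn = sum(t == positive and p != positive for t, p in pairs)  # spam missed
    tn = sum(t != positive and p != positive for t, p in pairs)  # ham passed through

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Made-up labels: 3 spam and 5 non-spam emails, with one miss and one false alarm.
y_true = ["spam", "spam", "spam", "ham", "ham", "ham", "ham", "ham"]
y_pred = ["spam", "spam", "ham", "ham", "ham", "ham", "spam", "ham"]

metrics = evaluate(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Precision matters here because a false positive sends a legitimate email to the spam folder, while recall measures how much spam still reaches the inbox.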
7. Conclusion
Spam email classification helps improve email security.
Machine learning algorithms provide effective solutions.
Model accuracy can often be improved with more training data.
8. References
Natural Language Processing (NLP) Techniques – https://www.nltk.org/
Python Libraries for Machine Learning – https://scikit-learn.org/
Spam Classification Research Paper – https://arxiv.org/abs/1802.03682