sqr da 2
DIGITAL ASSIGNMENT – 2
Submitted By
Gokulraj M - 21MIS0458
Slot: A1
TITLE: CREDIT CARD FRAUD DETECTION
1. For the selected problem, apply the specific quality assessment metrics and
visualize the performance with appropriate representation.
Logistic Regression:
Support Vector Machine (SVM):
Random Forest (RF) Classifier:
Extra Tree Classifier:
Performance Charts (ROC Curve)
The Receiver Operating Characteristic (ROC) curve is used to evaluate and compare
the performance of the classification models we have chosen. It plots the true
positive rate against the false positive rate across all decision thresholds. To
generate the ROC curves, we used the scikit-learn and matplotlib libraries.
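The plotting step described above might look like the following minimal sketch. It is not the code used for this assignment: the data comes from `make_classification` as a synthetic, imbalanced stand-in for the credit card dataset, and the hyperparameters and the output file name `roc_curves.png` are assumptions chosen for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score, roc_curve
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# ASSUMPTION: synthetic imbalanced data standing in for the real transactions.
X, y = make_classification(
    n_samples=2000, n_features=10, weights=[0.95, 0.05], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# The four model families named in this report, with illustrative settings.
models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "SVM": SVC(probability=True, random_state=0),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "Extra Trees": ExtraTreesClassifier(n_estimators=100, random_state=0),
}

auc_scores = {}
fig, ax = plt.subplots()
for name, model in models.items():
    model.fit(X_train, y_train)
    # Probability of the positive (fraud) class for each test sample.
    fraud_scores = model.predict_proba(X_test)[:, 1]
    fpr, tpr, _ = roc_curve(y_test, fraud_scores)
    auc_scores[name] = roc_auc_score(y_test, fraud_scores)
    ax.plot(fpr, tpr, label=f"{name} (AUC = {auc_scores[name]:.3f})")

# Diagonal reference line: the expected curve of a random classifier.
ax.plot([0, 1], [0, 1], linestyle="--", color="gray", label="Chance")
ax.set_xlabel("False Positive Rate")
ax.set_ylabel("True Positive Rate")
ax.set_title("ROC Curves")
ax.legend(loc="lower right")
fig.savefig("roc_curves.png")
```

Drawing all models on one set of axes makes the comparison direct: the curve closest to the top-left corner belongs to the model with the best trade-off between detected fraud and false alarms.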
SVM
The Random Forest classifier is the top-performing model, with the highest accuracy
(98.76%), recall (84.61%), F1 score (34.54%), precision (21.70%), and ROC score
(91.71%). The Extra Trees (ensemble) model follows closely, with an accuracy of
98.24%, a recall of 81.91%, and a ROC score of 90.10%. Note that a precision of
21.70% means only about one in five transactions flagged by the model is actually
fraudulent.
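As a quick consistency check, the F1 score can be recomputed from the reported precision and recall, since F1 is their harmonic mean. The helper name `f1_from` is hypothetical. The inputs are the Random Forest figures quoted above.

```python
def f1_from(precision: float, recall: float) -> float:
    """Return the F1 score, the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Random Forest values reported above: precision 21.70%, recall 84.61%.
rf_f1 = f1_from(0.2170, 0.8461)
print(f"{rf_f1:.2%}")  # prints 34.54%
```

The result matches the reported F1 score of 34.54%, which shows how the low precision pulls F1 down despite the high recall.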
Statistical Analysis for Credit Card Fraud Detection
To assess the quality of our fraud detection model, we use precision, recall, and
F1-score, which measure how well the model identifies fraudulent transactions. High
precision means few false positives (legitimate transactions flagged as fraud),
while high recall means few missed fraudulent transactions (false negatives).
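The link between these metrics and the two error types can be illustrated with a small, hand-made example. The labels below are invented for illustration only (1 = fraud, 0 = legitimate) and are not taken from the project data.

```python
from sklearn.metrics import (
    confusion_matrix,
    f1_score,
    precision_score,
    recall_score,
)

# ASSUMPTION: toy labels for illustration, not the assignment's data.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 0, 1, 1, 1, 1, 1, 0]

# For binary labels, ravel() returns the counts in the order tn, fp, fn, tp.
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

precision = precision_score(y_true, y_pred)  # tp / (tp + fp)
recall = recall_score(y_true, y_pred)        # tp / (tp + fn)
f1 = f1_score(y_true, y_pred)                # harmonic mean of the two

print(f"TP={tp} FP={fp} FN={fn} TN={tn}")
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Here the two false positives lower precision to 0.60, while the single missed fraud case lowers recall to 0.75.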
T-test (Comparing Two Models)
A t-test was performed to compare the precision of the Extra Trees and Random Forest
models. The result showed that Extra Trees had significantly higher precision,
suggesting that a larger share of the transactions it flags are truly fraudulent.
import numpy as np
from scipy.stats import ttest_ind
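The rest of the original code is not available here, so the following is only a sketch of how such a comparison could be run with `ttest_ind`. The per-fold precision values are hypothetical placeholders, not the results of this assignment, and the 0.05 significance level is likewise an assumption.

```python
import numpy as np
from scipy.stats import ttest_ind

# HYPOTHETICAL per-fold precision scores, for illustration only.
extra_trees_precision = np.array([0.24, 0.26, 0.25, 0.27, 0.23])
random_forest_precision = np.array([0.21, 0.22, 0.20, 0.23, 0.22])

# Welch's t-test: does not assume the two samples have equal variance.
t_stat, p_value = ttest_ind(
    extra_trees_precision, random_forest_precision, equal_var=False
)

alpha = 0.05  # assumed significance level
significant = p_value < alpha
print(f"t = {t_stat:.3f}, p = {p_value:.4f}, significant = {significant}")
```

If both models are scored on the same cross-validation folds, the samples are paired, and `scipy.stats.ttest_rel` may be the more appropriate test.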
Results: