
Credit Card Fraud Detection - ML

Last Updated : 30 May, 2025

The goal of this project is to develop a machine learning model that can accurately detect fraudulent credit card transactions using historical data. By analyzing transaction patterns, the model should be able to distinguish between normal and fraudulent activity, helping financial institutions flag suspicious behavior early and reduce potential risks.

Challenges include:

  • Handling imbalanced datasets where fraud cases are a small fraction of total transactions.
  • Ensuring high precision to minimize false positives (flagging a valid transaction as fraud).
  • Ensuring high recall to detect as many fraud cases as possible (a small worked example of these two metrics follows this list).
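
To make the precision/recall trade-off concrete, here is a minimal sketch with made-up confusion-matrix counts (the numbers are illustrative, not results from this project):

Python
# Illustrative toy counts only -- not real results from this dataset
tp, fp, fn = 80, 2, 20  # true positives, false positives, false negatives

precision = tp / (tp + fp)  # of all "fraud" predictions, how many were correct
recall = tp / (tp + fn)     # of all actual frauds, how many were caught

print(f"Precision: {precision:.2f}")  # 0.98 -> very few false alarms
print(f"Recall: {recall:.2f}")        # 0.80 -> some frauds were missed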

Step 1: Importing Necessary Libraries

We begin by importing the necessary Python libraries: numpy and pandas for data handling, and matplotlib (with gridspec) and seaborn for visualization. The model-building tools from scikit-learn are imported later, in the steps where they are used.

Python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from matplotlib import gridspec

Step 2: Loading the Data

Load the dataset into a pandas DataFrame and examine its structure. You can download the dataset from here. The dataset contains 284,807 transactions and 31 columns, including:

  • Time: Seconds elapsed between each transaction and the first transaction in the dataset.
  • V1-V28: Anonymized features produced by a PCA transformation to protect the confidentiality of the original data.
  • Amount: Transaction amount.
  • Class: Target variable (0 for normal transactions, 1 for fraudulent transactions).
Python
data = pd.read_csv("creditcard.csv")
print(data.head())

Output:

[First five rows of the dataset]

Now, let's explore the dataset further using the data.describe() method.

Python
print(data.describe())

Output:

[Summary statistics for all columns of the dataset]

Step 3: Analyzing Class Distribution

The next step is to check the distribution of fraudulent vs. normal transactions.

  • We separate the dataset into two groups: fraudulent transactions (Class == 1) and valid transactions (Class == 0).
  • We calculate the ratio of fraud cases to valid cases to measure how imbalanced the dataset is.
  • We then print this outlier fraction along with the number of fraud and valid transactions.
  • This analysis is crucial in fraud detection, as it reveals how rare fraud cases are and whether techniques like resampling or special evaluation metrics are needed.
Python
fraud = data[data['Class'] == 1]
valid = data[data['Class'] == 0]
outlierFraction = len(fraud)/float(len(valid))
print(outlierFraction)
print('Fraud Cases: {}'.format(len(data[data['Class'] == 1])))
print('Valid Transactions: {}'.format(len(data[data['Class'] == 0])))

Output:

[Outlier fraction ≈ 0.0017; 492 fraud cases vs. 284,315 valid transactions]

Since the dataset is highly imbalanced, with only about 0.17% fraudulent transactions, we'll first try to build a model without balancing the dataset. If we don't get satisfactory results, we will explore ways to handle the imbalance.
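
A quick bar chart makes this imbalance visible at a glance. This is a small optional sketch (not part of the original walkthrough) using seaborn's countplot; the log scale is needed because otherwise the fraud bar is too small to see:

Python
# Visualize the class imbalance (0 = valid, 1 = fraud)
sns.countplot(x='Class', data=data)
plt.yscale('log')  # log scale so the tiny fraud bar stays visible
plt.title("Class Distribution (log scale)")
plt.show()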

Step 4: Exploring Transaction Amounts

Let's compare the transaction amounts for fraudulent and normal transactions. This will help us understand if there are any significant differences in the monetary value of fraudulent transactions.

Python
print("Amount details of the fraudulent transaction")
fraud.Amount.describe()

Output:

[Summary statistics of fraudulent transaction amounts]

Python
print("Amount details of the valid transactions")
print(valid.Amount.describe())

Output:

[Summary statistics of valid transaction amounts]

From the output we observe that fraudulent transactions tend to have a higher average amount than valid ones, which is a useful signal for fraud detection.
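
The gridspec module imported in Step 1 can be used to place the two Amount distributions side by side for a visual comparison. This is an optional sketch; the bin count and figure size are arbitrary choices:

Python
# Plot the Amount distributions of fraudulent vs. valid transactions side by side
fig = plt.figure(figsize=(12, 4))
gs = gridspec.GridSpec(1, 2)

ax0 = fig.add_subplot(gs[0])
ax0.hist(fraud.Amount, bins=50)
ax0.set_title("Fraudulent transaction amounts")

ax1 = fig.add_subplot(gs[1])
ax1.hist(valid.Amount, bins=50)
ax1.set_title("Valid transaction amounts")

plt.show()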

Step 5: Plotting Correlation Matrix

We can visualize the correlations between features by plotting the correlation matrix as a heatmap. This gives us an idea of how the features relate to one another and which ones may be more relevant for prediction.

Python
corrmat = data.corr()
fig = plt.figure(figsize = (12, 9))
sns.heatmap(corrmat, vmax = .8, square = True)
plt.show()

Output:

[Correlation matrix heatmap]

Most features do not correlate strongly with one another, but a few, such as V2 and V5, show a negative correlation with the Amount feature. This provides useful insight into how the features relate to transaction amounts (see the numeric check below).
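
To verify such relationships numerically rather than reading them off the heatmap, you can sort each feature's correlation with Amount; a small sketch:

Python
# Numeric view: correlation of every feature with the Amount column
amount_corr = corrmat['Amount'].drop('Amount').sort_values()
print(amount_corr.head())  # strongest negative correlations (V2 and V5 appear here)
print(amount_corr.tail())  # strongest positive correlations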

Step 6: Preparing Data

Separate the input features (X) from the target variable (Y), then split the data into training and testing sets.

  • X = data.drop(['Class'], axis = 1) removes the target column (Class) from the dataset to keep only the input features.
  • Y = data["Class"] selects the Class column as the target variable (fraud or not).
  • X.shape and Y.shape print the number of rows and columns in the feature set and the target set.
  • xData = X.values and yData = Y.values convert the Pandas DataFrame or Series to NumPy arrays for faster processing.
  • train_test_split(...) splits the data into training and testing sets: 80% for training, 20% for testing.
  • random_state=42 ensures reproducibility (same split every time you run it).
Python
X = data.drop(['Class'], axis = 1)
Y = data["Class"]
print(X.shape)
print(Y.shape)

xData = X.values
yData = Y.values

from sklearn.model_selection import train_test_split
xTrain, xTest, yTrain, yTest = train_test_split(
        xData, yData, test_size = 0.2, random_state = 42)

Output:

(284807, 30)
(284807,)
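
Because fraud cases are so rare, a plain random split can leave the test set with a slightly different fraud ratio than the training set. An optional variant (not what the code above does) is to stratify the split on the labels:

Python
# Optional variant: stratify so train and test keep the same fraud ratio
xTrain, xTest, yTrain, yTest = train_test_split(
        xData, yData, test_size=0.2, random_state=42, stratify=yData)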

Step 7: Building and Training the Model

Train a Random Forest Classifier to predict fraudulent transactions.

  • from sklearn.ensemble import RandomForestClassifier: This imports the RandomForestClassifier from sklearn.ensemble, which is used to create a random forest model for classification tasks.
  • rfc = RandomForestClassifier(): Initializes a new instance of the RandomForestClassifier.
  • rfc.fit(xTrain, yTrain): Trains the RandomForestClassifier model on the training data (xTrain for features and yTrain for the target labels).
  • yPred = rfc.predict(xTest): Uses the trained model to predict the target labels for the test data (xTest), storing the results in yPred.
Python
from sklearn.ensemble import RandomForestClassifier

rfc = RandomForestClassifier()
rfc.fit(xTrain, yTrain)

yPred = rfc.predict(xTest)
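
RandomForestClassifier also accepts a class_weight parameter that penalizes mistakes on the rare fraud class more heavily. This is an optional variant worth experimenting with, not part of the original setup:

Python
# Optional variant: weight classes inversely to their frequency
rfc_weighted = RandomForestClassifier(class_weight='balanced',
                                      random_state=42, n_jobs=-1)
rfc_weighted.fit(xTrain, yTrain)
yPredWeighted = rfc_weighted.predict(xTest)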

Step 8: Evaluating the Model

After training the model, we need to evaluate its performance using metrics such as accuracy, precision, recall, F1-score and the Matthews correlation coefficient.

Python
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score, matthews_corrcoef, confusion_matrix 
accuracy = accuracy_score(yTest, yPred)
precision = precision_score(yTest, yPred)
recall = recall_score(yTest, yPred)
f1 = f1_score(yTest, yPred)
mcc = matthews_corrcoef(yTest, yPred)

print("Model Evaluation Metrics:")
print(f"Accuracy: {accuracy:.4f}")
print(f"Precision: {precision:.4f}")
print(f"Recall: {recall:.4f}")
print(f"F1-Score: {f1:.4f}")
print(f"Matthews Correlation Coefficient: {mcc:.4f}")

conf_matrix = confusion_matrix(yTest, yPred)
plt.figure(figsize=(8, 6))
sns.heatmap(conf_matrix, annot=True, fmt="d", cmap="Blues",
            xticklabels=['Normal', 'Fraud'], yticklabels=['Normal', 'Fraud'])
plt.title("Confusion Matrix")
plt.xlabel("Predicted Class")
plt.ylabel("True Class")
plt.show()

Output:

[Printed evaluation metrics and confusion matrix heatmap for the test set]

The model's accuracy is high because of the class imbalance, so we also computed precision, recall and F1-score to get a more meaningful picture (a precision-recall-based check is sketched after this list). We observe:

  • Accuracy: 0.9996: Out of all predictions, 99.96% were correct. However, in imbalanced datasets (like fraud detection), accuracy can be misleading i.e. a model that predicts everything as "not fraud" will still have high accuracy.
  • Precision: 0.9873: When the model predicted "fraud", it was correct 98.73% of the time. High precision means very few false alarms (false positives).
  • Recall: 0.7959: Out of all actual fraud cases, the model detected 79.59%. This shows how well it catches real frauds. A lower recall means some frauds were missed (false negatives).
  • F1-Score: 0.8814: A balance between precision and recall. 88.14% is strong and shows the model handles both catching fraud and avoiding false alarms well.
  • Matthews Correlation Coefficient (MCC): 0.8863: A more balanced score (ranging from -1 to +1) even when classes are imbalanced. A value of 0.8863 is very good; it means the model is making strong, balanced predictions overall.
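
For imbalanced problems like this one, the area under the precision-recall curve is often a more informative single number than accuracy. A sketch of that check, using the model's predicted probabilities:

Python
from sklearn.metrics import average_precision_score, classification_report

# Probability of the fraud class for each test transaction
yProb = rfc.predict_proba(xTest)[:, 1]

print(f"Average precision (PR-AUC): {average_precision_score(yTest, yProb):.4f}")
print(classification_report(yTest, yPred, target_names=['Normal', 'Fraud']))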

We can balance the dataset by oversampling the minority class or by undersampling the majority class, which can help the model detect more fraud cases (improving recall).
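
One common oversampling technique is SMOTE, which synthesizes new minority-class samples. A minimal sketch, assuming the imbalanced-learn package is installed (pip install imbalanced-learn); note that resampling is applied only to the training split so no synthetic samples leak into the test set:

Python
from imblearn.over_sampling import SMOTE

# Oversample only the training data; the test set must stay untouched
smote = SMOTE(random_state=42)
xTrainRes, yTrainRes = smote.fit_resample(xTrain, yTrain)

print("Class counts before resampling:", np.bincount(yTrain))
print("Class counts after resampling: ", np.bincount(yTrainRes))

After resampling, the model from Step 7 can simply be retrained on xTrainRes and yTrainRes.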

Get the complete notebook here: click here.

