Smart Hand Gesture Recognition Report

The internship report details the development of a Smart Hand Gesture Recognition System using computer vision and machine learning, submitted by V. Deepak for a Bachelor of Technology in Computer Science & Engineering. It covers the fundamentals of machine learning, including types, algorithms, and workflows, as well as practical applications such as linear regression for house price prediction and K-means clustering for customer segmentation. The report emphasizes the importance of data preprocessing, model evaluation, and the use of Python and relevant libraries in implementing machine learning solutions.

Uploaded by

Yaseer
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
23 views34 pages

Smart Hand Gesture Recognition Report

The internship report details the development of a Smart Hand Gesture Recognition System using computer vision and machine learning, submitted by V. Deepak for a Bachelor of Technology in Computer Science & Engineering. It covers the fundamentals of machine learning, including types, algorithms, and workflows, as well as practical applications such as linear regression for house price prediction and K-means clustering for customer segmentation. The report emphasizes the importance of data preprocessing, model evaluation, and the use of Python and relevant libraries in implementing machine learning solutions.

Uploaded by

Yaseer
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd

Internship Report

on

SMART HAND GESTURE RECOGNITION SYSTEM USING


COMPUTER VISION AND MACHINE LEARNING
Submitted by
[Link]
22F11A05J4
In partial fulfillment of the requirements for the award of the Degree of
Bachelor of Technology
In

COMPUTER SCIENCE & ENGINEERING

Under the esteemed guidance of

Dr. P.K. Venkateswara Lal, Ph.D.
Professor
DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING

NARAYANA ENGINEERING COLLEGE :: GUDUR


(AUTONOMOUS)
(Recognized by UGC under 2(f) and 12(B), an ISO 9001:2015 Certified Institution, Approved by AICTE New Delhi & Permanently Affiliated to JNTUA, Ananthapuramu)

Dhurjati Nagar, Gudur - 524101, Tirupathi Dt., A.P., India

WEBSITE: [Link]

DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING

BONAFIDE CERTIFICATE

This is to certify that the internship report entitled “SMART HAND GESTURE
RECOGNITION SYSTEM USING COMPUTER VISION AND MACHINE LEARNING”,
being submitted by [Link] (22F11A05J4) in partial fulfilment of the requirements for the award
of the Degree of Bachelor of Technology in the Computer Science & Engineering Department
of Narayana Engineering College, Gudur, is a record of bonafide work carried out by him/her
under my guidance and supervision.

INTERNSHIP GUIDE
Dr. P.K. Venkateswara Lal, Ph.D.
Professor

HEAD OF THE DEPARTMENT
Dr. P. Venkateswara Rao, Ph.D.
Professor
ACKNOWLEDGEMENT

I am extremely thankful to Dr. P. Narayana, the Founder Chairman of the Narayana
Group, for his initiative in starting a technical institution in a rural area like Gudur to help
economically disadvantaged students. I am also thankful to [Link], the Chairman of the
Narayana Group, for providing the infrastructural facilities to work in, without which this work
would not have been possible.

I would like to express my deep sense of gratitude to Dr. V. Ravi Prasad,
Principal, Narayana Engineering College, Gudur, for his continuous effort in creating a
competitive environment in our college and for his encouragement throughout this course.

I would like to convey my heartfelt thanks to Dr. P. Venkateswara Rao, Ph.D., Professor
& HOD of Computer Science and Engineering, for giving me the opportunity to embark upon
this topic and for his continuous encouragement throughout the preparation of the project.

I would like to thank our guide, Dr. P.K. Venkateswara Lal, Ph.D., Professor, Department
of CSE, for his valuable guidance, constant assistance, support, endurance, and
constructive suggestions for the betterment of the project.

I also wish to thank all the staff members of the Department of Computer Science &
Engineering for helping me directly or indirectly in completing this project successfully.

Finally, I am thankful to my parents and friends for their continued moral and material
support throughout the course and for helping me finalize the report.

[Link]

22F11A05J4
1. Introduction to Machine Learning

1.1. What is Machine Learning?

Machine Learning is a branch of Artificial Intelligence (AI) that allows computers to learn from data
and make decisions or predictions without being explicitly programmed.

Instead of writing detailed instructions for every possible scenario, we train a model using data, and
the model learns patterns to perform tasks.

1.2. Types of Machine Learning

 Supervised Learning

 The model is trained on labeled data (input with correct output).


 Learns a function that maps inputs to outputs.
 Examples:
o Predicting house prices (Regression)
o Spam email detection (Classification)

 Unsupervised Learning

 The model works on unlabeled data.


 Finds hidden patterns or groupings.
 Examples:
o Customer segmentation (Clustering)
o Topic modeling (Dimensionality Reduction)

 Semi-Supervised Learning

 Uses a small amount of labeled and a large amount of unlabeled data.


 Useful when labeling data is expensive or time-consuming.

 Reinforcement Learning

 A model (agent) learns by interacting with an environment and receiving rewards or


penalties.

 Examples:
o Self-driving cars
o Game-playing AIs (e.g., AlphaGo)

1.3 Common Algorithms

Type           Algorithm Examples
Supervised     Linear Regression, Decision Trees, SVM
Unsupervised   K-Means, PCA, Hierarchical Clustering
Reinforcement  Q-Learning, Deep Q-Networks (DQN)

1.4 Typical Machine Learning Workflow

1. Data Collection
2. Data Cleaning & Preprocessing
3. Feature Engineering
4. Model Selection
5. Training the Model
6. Model Evaluation
7. Prediction/Deployment
8. Monitoring & Improvement
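The steps above can be sketched end-to-end with scikit-learn. The bundled Iris dataset and logistic regression below are illustrative stand-ins, not datasets or models from the internship tasks.

```python
# A minimal sketch of workflow steps 1-7 on scikit-learn's bundled Iris dataset.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# 1. Data collection
X, y = load_iris(return_X_y=True)

# 2-3. Data cleaning / feature scaling
X = StandardScaler().fit_transform(X)

# 4-5. Model selection and training
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
model = LogisticRegression(max_iter=200)
model.fit(X_train, y_train)

# 6-7. Evaluation and prediction
print("Accuracy:", accuracy_score(y_test, model.predict(X_test)))
```

Steps 8 (monitoring and improvement) would then repeat this loop as new data arrives.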

1.5 Goals of Machine Learning

1. Automation: Enable machines to perform tasks without human intervention.

2. Prediction: Forecast future trends and behaviors based on data.

3. Pattern Recognition: Identify hidden patterns in data.

4. Learning from Experience: Improve system performance with more data.

5. Decision Making: Support or automate data-driven decisions.

6. Data Understanding: Analyze large datasets to extract insights.

7. Adaptability: Adjust to new or evolving data.

8. Personalization: Deliver customized user experiences.

1.6 History of Machine Learning

Year   Milestone                      Description
1950   Turing Test                    Alan Turing proposed a test to evaluate a machine's ability to exhibit intelligent behavior.
1952   First ML Program               Arthur Samuel created a self-learning checkers game, one of the first ML applications.
1957   Perceptron                     Frank Rosenblatt introduced the Perceptron, an early type of neural network.
1967   Nearest Neighbor Algorithm     Used for pattern recognition; laid the foundation for basic classification.
1981   Explanation-Based Learning     Systems could analyze training data and form general rules.
1986   Backpropagation                Popularized for training multi-layer neural networks; a huge breakthrough for deep learning.
1997   IBM's Deep Blue                Defeated world chess champion Garry Kasparov, combining AI & ML concepts.
2006   "Deep Learning" Coined         Geoffrey Hinton popularized the term "Deep Learning", advancing neural networks.
2012   ImageNet Breakthrough          A deep neural network (AlexNet) achieved record accuracy in image recognition.
2016   AlphaGo Victory                Google DeepMind's AlphaGo beat the world champion in the game of Go.
2020+  Generative AI & Transformers   Models like GPT, BERT, and DALL·E transformed natural language processing and generation.

2. Objectives
2.1 Understand the Fundamentals of Machine Learning

The primary goal was to build a strong foundation in the core principles of Machine Learning. This
included understanding different types of ML (supervised, unsupervised, reinforcement), various
algorithms, and how they work conceptually and mathematically. The objective also included
learning the real-world relevance and applications of ML in industries like healthcare, finance,
retail, and technology.

2.2 Learn Data Preprocessing and Model Building Techniques

A critical step in any ML project is cleaning and preparing the data. This objective focused on
techniques such as handling missing values, encoding categorical data, normalizing features, and
splitting data into training and test sets. It also included selecting appropriate models, configuring
parameters, and understanding how to structure a complete machine learning pipeline.
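The preprocessing steps described above can be sketched as follows; the toy DataFrame and its column names are hypothetical, chosen only to illustrate each technique.

```python
# Illustrative preprocessing: imputation, encoding, scaling, and a train/test split.
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split

df = pd.DataFrame({
    'area': [1200, 1500, np.nan, 2000],     # numeric feature with a missing value
    'city': ['A', 'B', 'A', 'B'],           # categorical feature
    'price': [100, 150, 120, 210],          # target
})

# Handle missing values (mean imputation)
df[['area']] = SimpleImputer(strategy='mean').fit_transform(df[['area']])

# Encode categorical data (one-hot encoding)
df = pd.get_dummies(df, columns=['city'])

# Normalize numeric features
df[['area']] = StandardScaler().fit_transform(df[['area']])

# Split into training and test sets
X_train, X_test, y_train, y_test = train_test_split(
    df.drop(columns='price'), df['price'], test_size=0.25, random_state=42)
print(X_train.shape, X_test.shape)
```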

2.3 Apply ML Algorithms to Real-World Datasets

The internship involved hands-on work with real-world datasets from platforms like Kaggle and
UCI. The objective was to implement and apply various machine learning algorithms—such as
Linear Regression, K-Means Clustering, Support Vector Machines, and Convolutional Neural
Networks— to practical problems like house price prediction, customer segmentation, and image
classification.

2.4 Gain Experience in Training, Testing, and Evaluating ML Models

Beyond building models, an essential goal was to understand how to measure model performance
and improve accuracy. This included learning to use metrics such as accuracy, precision, recall, F1-
score, and confusion matrix. It also covered techniques like cross-validation, overfitting prevention,
and hyperparameter tuning to enhance model performance and generalization.
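The metrics named above are all available in scikit-learn; the label vectors in this sketch are made up purely for illustration.

```python
# Computing common classification metrics on hypothetical predictions.
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, confusion_matrix)

y_true = [1, 0, 1, 1, 0, 1, 0, 0]   # ground-truth labels (hypothetical)
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]   # model predictions (hypothetical)

print("Accuracy :", accuracy_score(y_true, y_pred))
print("Precision:", precision_score(y_true, y_pred))
print("Recall   :", recall_score(y_true, y_pred))
print("F1-score :", f1_score(y_true, y_pred))
print("Confusion matrix:\n", confusion_matrix(y_true, y_pred))
```

Here one positive is missed (a false negative) and one negative is flagged (a false positive), so accuracy, precision, and recall all land at 0.75.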

3. Tools and Technologies Used
3.1 Programming Language: Python

Python was the primary language used throughout the internship due to its simplicity, readability,
and extensive support for data science and machine learning. Python provides powerful libraries
and frameworks that make ML model development faster and more efficient.

3.2 Libraries and Frameworks

 scikit-learn: A widely used ML library that provides simple and efficient tools for data
mining, classification, regression, clustering, and model evaluation.
 pandas: Essential for data manipulation and analysis, pandas offers fast, flexible data
structures such as DataFrames that make working with structured data intuitive.
 numpy: Used for numerical computing in Python, especially for handling arrays and
matrices, which are foundational in ML.
 matplotlib and seaborn: These libraries were used for data visualization, enabling the
creation of plots, histograms, correlation heatmaps, and more to better understand data
distributions and patterns.
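A small illustration of these libraries working together (the numbers are made up):

```python
# pandas DataFrame built on NumPy arrays, summarized the way a seaborn heatmap would use it.
import numpy as np
import pandas as pd

data = pd.DataFrame({
    'income': np.array([15, 40, 85, 120]),  # a NumPy array backing a pandas column
    'spending': [39, 81, 6, 77],
})
print(data.describe())  # summary statistics via pandas
print(data.corr())      # correlation matrix (what a heatmap would visualize)
```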

3.3 Development Environments

 Jupyter Notebook: Provided an interactive environment for writing and executing code,
documenting processes, and visualizing outputs—all in one place.
 Google Colab: A cloud-based Jupyter environment that offers free access to GPUs and
allows seamless collaboration and execution of ML projects without local setup.

3.4 Dataset Sources

 Kaggle: A popular platform that provides a large repository of public datasets along with
machine learning competitions. Many practical projects in the internship were based on
Kaggle datasets.
 UCI Machine Learning Repository: A well-established source of benchmark datasets used
for testing and comparing machine learning algorithms in academic research and practical
experiments.

4. Tasks Undertaken

Task 1: Linear Regression for House Price Prediction

1. Objective

The goal of this task was to develop a Linear Regression model to predict house prices based on various
numerical features such as lot area, number of bathrooms, garage size, and other structural
attributes. The model was trained on historical housing data to identify patterns and estimate future
property prices.

2. Tools & Libraries Used

 Python – Primary programming language for implementation

 pandas – For data loading, cleaning, and manipulation

 scikit-learn – For model building, training, evaluation, and preprocessing

 matplotlib & seaborn – For data visualization and result interpretation

3. Dataset

 Training Dataset: [Link]


Contained past house sales, including the sale price and other numerical features.

 Test Dataset: [Link]


A separate dataset used for generating predictions after training the model.

4. Workflow

 Data Loading

Loaded the training dataset using pandas.read_csv().

 Feature Selection
Selected only numerical features for simplicity, excluding the target (SalePrice) from input
features.
 Missing Value Imputation
Used SimpleImputer with a mean strategy to fill in missing values in the numerical columns.
 Data Splitting
Divided the data into training (80%) and testing (20%) sets using train_test_split().
 Model Training
Trained a Linear Regression model using the training data.
 Model Evaluation

 Predicted on test data


 Evaluated using Mean Squared Error (MSE) and R-squared Score (R²)

 External Testing

 Made predictions on a separate test dataset ([Link])


 Saved the predicted results to a file named [Link]

 Visualization

 Scatter Plot: Actual vs Predicted sale prices


 Residual Plot: Distribution of errors between predicted and actual prices

5. Code Implementation

# Importing necessary libraries
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.impute import SimpleImputer
from sklearn.metrics import mean_squared_error, r2_score

# Load the training dataset
train_df = pd.read_csv("[Link]")

# Separate numerical and categorical columns
numerical_cols = train_df.select_dtypes(include=['number']).columns
categorical_cols = train_df.select_dtypes(exclude=['number']).columns

# Remove 'SalePrice' from the list of numerical features
numerical_features = [col for col in numerical_cols if col != 'SalePrice']

# Impute missing values in numerical columns using the mean
imputer = SimpleImputer(strategy='mean')
train_df[numerical_features] = imputer.fit_transform(train_df[numerical_features])

# Final dataframe for modeling
train_df = train_df[numerical_features + ['SalePrice']]

# Features (X) and Target (y)
X = train_df[numerical_features]
y = train_df['SalePrice']

# Train-test split (80/20)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Linear Regression Model
model = LinearRegression()
model.fit(X_train, y_train)

# Predict on test split
y_pred = model.predict(X_test)

# Evaluate model
mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)

print("Evaluation on [Link] test split:")
print("Mean Squared Error: {:.2f}".format(mse))
print("R-squared: {:.4f}".format(r2))

# Predict on external [Link]
external_test_df = pd.read_csv("[Link]")
external_test_data = external_test_df[numerical_features]
external_test_data = imputer.transform(external_test_data)

# Predict
external_predictions = model.predict(external_test_data)

# Save predictions to CSV
output_df = external_test_df.copy()
output_df['Predicted_SalePrice'] = external_predictions
output_df.to_csv("[Link]", index=False)
print("\nPredictions saved to [Link]")

# Actual vs Predicted Plot
plt.figure(figsize=(8, 6))
sns.scatterplot(x=y_test, y=y_pred)
plt.xlabel("Actual Sale Price")
plt.ylabel("Predicted Sale Price")
plt.title("Actual vs Predicted Sale Prices")
plt.grid(True)
plt.tight_layout()
plt.show()

# Residual Plot
residuals = y_test - y_pred
plt.figure(figsize=(8, 5))
sns.histplot(residuals, kde=True, bins=30)
plt.title("Distribution of Residuals")
plt.xlabel("Residual (Actual - Predicted)")
plt.ylabel("Frequency")
plt.tight_layout()
plt.show()

6. Results & Conclusion

Model Evaluation Results

After training the linear regression model and evaluating it on the test set (from [Link]), the
following metrics were observed:

 Mean Squared Error (MSE): This metric measures the average squared difference
between actual and predicted values. A lower MSE indicates a better fit.

 R-squared Score (R²): Indicates how well the model explains the variance in the target
variable. Values closer to 1.0 represent a stronger model fit.
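For a concrete sense of the two metrics, here is a tiny worked example on hypothetical prices (not the actual model output from the task above):

```python
# A worked example of MSE and R² on three hypothetical house prices.
from sklearn.metrics import mean_squared_error, r2_score

y_actual = [200000, 150000, 300000]
y_predicted = [210000, 140000, 290000]   # each prediction off by 10,000

mse = mean_squared_error(y_actual, y_predicted)  # mean of the squared errors
r2 = r2_score(y_actual, y_predicted)             # 1 - SS_res / SS_tot
print(f"MSE: {mse:.0f}, R2: {r2:.4f}")
```

With every error equal to 10,000, the MSE is 10,000² = 100,000,000, and R² stays near 1.0 because the residuals are small relative to the spread of the actual prices.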

Output:

Task 2: K-Means Clustering for Customer Segmentation

1. Objective

The main objective of this task was to apply unsupervised learning techniques to segment mall
customers into different groups based on their annual income and spending behavior. This helps
businesses to better understand customer behavior and develop targeted marketing strategies.

2. Tools & Libraries Used

 Python – Programming language used for implementation


 pandas – For data manipulation and analysis
 scikit-learn – For clustering algorithms and data preprocessing
 matplotlib – For visualizing the elbow curve and results
 StandardScaler – For normalizing the data before clustering

3. Dataset

 Dataset Name: Mall_Customers.csv

 Attributes:

 CustomerID – Unique ID for each customer (dropped during preprocessing)


 Gender – Categorical column (not used for clustering in this task)
 Age, Annual Income (k$), Spending Score (1-100) – Used for segmentation

4. Workflow

1. Data Loading

 Loaded the dataset using pandas.read_csv().

2. Preprocessing

 Dropped irrelevant columns: CustomerID and Gender.


 Applied feature scaling using StandardScaler to normalize the data.

3. Finding Optimal Clusters

 Used the Elbow Method by plotting inertia for k = 1 to 10 clusters.

 Identified k = 5 as the optimal number of clusters.

4. Applying K-Means Clustering

 Used KMeans(n_clusters=5) to fit the scaled data.


 Assigned each customer a cluster label (0 to 4).

5. Cluster Analysis

 Calculated average age, income, and spending score per cluster.


 Interpreted behavioral patterns and purchasing trends within each group.

5. Code Implementation

# Set environment variable to avoid a memory leak warning on Windows + MKL
import os
os.environ["OMP_NUM_THREADS"] = "1"

# Import necessary libraries
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler
import matplotlib.pyplot as plt

# Load the dataset
mall_customers_df = pd.read_csv("Mall_Customers.csv")

# Preprocessing: drop irrelevant columns
mall_customers_df.drop(columns=['CustomerID', 'Gender'], inplace=True)

# Feature scaling
scaler = StandardScaler()
mall_customers_scaled = scaler.fit_transform(mall_customers_df)

# Determine the optimal number of clusters using the Elbow Method
inertia = []
for k in range(1, 11):
    kmeans = KMeans(n_clusters=k, random_state=42)
    kmeans.fit(mall_customers_scaled)
    inertia.append(kmeans.inertia_)

# Plot the Elbow Method graph
plt.plot(range(1, 11), inertia, marker='o')
plt.title('Elbow Method')
plt.xlabel('Number of Clusters')
plt.ylabel('Inertia')
plt.grid(True)
plt.show()

# Based on the elbow method, choose the optimal number of clusters
optimal_k = 5

# Apply KMeans clustering
kmeans = KMeans(n_clusters=optimal_k, random_state=42)
kmeans.fit(mall_customers_scaled)

# Add cluster labels to the original dataset
mall_customers_df['Cluster'] = kmeans.labels_

# Analyze the characteristics of customers in each cluster
cluster_means = mall_customers_df.groupby('Cluster').mean()
print(cluster_means)

6. Results & Conclusion

 Cluster Analysis Results

After applying K-Means Clustering with the optimal value of k = 5, customers were grouped
into five distinct segments based on their:

 Annual Income
 Spending Score

 Age

Each cluster displayed unique characteristics:

 One cluster contained high-income, high-spending customers (ideal targets for premium
services).
 Another cluster had low-income but high-spending behavior (potentially risk-prone but
active buyers).
 Some clusters were moderate in both spending and income—ideal for value-based offerings.

Output:

Cluster  Average Age  Average Annual Income (k$)  Average Spending Score (1-100)
0        55.28        47.62                       41.71
1        32.88        86.10                       81.53
2        25.77        26.12                       74.85
3        26.73        54.31                       40.91
4        44.39        89.77                       18.48

Interpretation of Clusters

 Cluster 1: Young, high-income, and high-spending customers — best suited for premium
products and loyalty programs.

 Cluster 2: Very young customers with low income but high spending — may represent impulse
buyers or students.

 Cluster 4: Older, high-income but low-spending customers — may need targeted engagement
strategies.

 Cluster 0 & 3: Moderate spending and income — good targets for general-purpose or value-
driven promotions.

Conclusion:

This task demonstrated how unsupervised learning (specifically K-Means Clustering) can be used
for customer segmentation. Through this project, I learned to:

 Preprocess real-world datasets


 Normalize features for clustering
 Apply and interpret K-Means clustering
 Derive business insights from data patterns

This knowledge is valuable in domains such as marketing analytics, retail strategy, and customer
relationship management (CRM).

Task 3: SVM for Image Classification (Cats vs Dogs)

1. Objective

The objective of this task was to build a binary image classification model using Support Vector
Machine (SVM) to distinguish between images of cats and dogs. This project aimed to apply
supervised learning in a computer vision context.

2. Tools & Libraries Used

 Python – Programming language for development


 OpenCV (cv2) – For reading and preprocessing image data
 NumPy – For numerical operations and array manipulation
 pandas – For managing and saving prediction results
 scikit-learn – For implementing the SVM model and evaluation

3. Dataset

 Training Folder: train1/ – Contained grayscale images of cats and dogs with filenames like
[Link], [Link], etc.
 Test Folder: test1/ – Contained unlabelled images used for prediction.
 Label Mapping:

 cat → 0
 dog → 1

4. Workflow

1. Data Loading

 Custom functions were written using OpenCV to load and resize images to 64x64 pixels.
 Images were normalized (scaled between 0 and 1) and flattened into 1D vectors.

2. Data Preparation

 Created X (features) and y (labels) from 5,000 images (subset for faster training).
 Split the data into training and validation sets (80/20 split).

3. Model Training

 Trained a Support Vector Machine (SVM) model using a linear kernel.


 Fitted the model on the training data.

4. Model Evaluation

 Calculated validation accuracy on the test split using .score().

5. Prediction on New Data

 Loaded test images from test1/, preprocessed them in the same way, and predicted their labels.
 Mapped the results to either "Cat 🐱" or "Dog 🐶".

6. Saving Results

 Created a CSV file ([Link]) with image IDs and predicted labels for submission or
external analysis.

5. Code Implementation

import os
import cv2
import numpy as np
import pandas as pd
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

# Load training data
def load_data(data_dir, image_size=(64, 64)):
    images = []
    labels = []
    for file in os.listdir(data_dir):
        if file.startswith('cat'):
            label = 0
        elif file.startswith('dog'):
            label = 1
        else:
            continue
        path = os.path.join(data_dir, file)
        image = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
        if image is not None:
            image = cv2.resize(image, image_size)
            image = image / 255.0
            images.append(image.flatten())
            labels.append(label)
    return np.array(images), np.array(labels)

# Load test data
def load_test_images(test_dir, image_size=(64, 64)):
    test_images = []
    image_names = []
    for image_file in os.listdir(test_dir):
        image_path = os.path.join(test_dir, image_file)
        image = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
        if image is not None:
            image = cv2.resize(image, image_size)
            image = image / 255.0
            test_images.append(image.flatten())
            image_names.append(image_file)
    return np.array(test_images), image_names

# Paths
train_dir = "train1"
test_dir = "test1"

# Load and preprocess training data
print("📦 Loading training data...")
X, y = load_data(train_dir, image_size=(64, 64))

# Optional: use a subset for faster training
subset_size = 5000
X = X[:subset_size]
y = y[:subset_size]

print(f"✅ Loaded {len(X)} images. 🐱 Cats: {np.sum(y == 0)}, 🐶 Dogs: {np.sum(y == 1)}")

# Train/validation split
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=42)

# Train SVM
print("⚙️ Training SVM model...")
svm_classifier = SVC(kernel='linear')
svm_classifier.fit(X_train, y_train)
print("✅ SVM model trained.")

# Evaluate
accuracy = svm_classifier.score(X_val, y_val)
print(f"📊 Validation Accuracy: {accuracy * 100:.2f}%")

# Load and predict on test data
print("📦 Loading test data...")
X_test, test_image_names = load_test_images(test_dir, image_size=(64, 64))
print(f"✅ Loaded {len(X_test)} test images.")

# Predict
print("🔍 Predicting test images...")
y_pred = svm_classifier.predict(X_test)

# Show sample predictions
print("\nSample Predictions:")
for name, pred in zip(test_image_names[:10], y_pred[:10]):
    label = "Dog 🐶" if pred == 1 else "Cat 🐱"
    print(f"{name}: {label}")

# Save to CSV
df = pd.DataFrame({
    'id': [int(name.split('.')[0]) for name in test_image_names],
    'label': y_pred
})
df = df.sort_values(by='id')
df.to_csv("[Link]", index=False)
print("\n📝 Saved predictions to [Link]")

6. Results & Conclusion

Model Evaluation

 The Support Vector Machine (SVM) model was trained on a dataset of 5,000 labeled cat
and dog images.
 It achieved a validation accuracy of approximately [fill from output]%, indicating reliable
classification performance for a simple linear model.

Note: Validation accuracy may vary based on data size, quality, and random splits. The model can
be improved by using a non-linear kernel (e.g., RBF) or applying more advanced feature extraction
techniques.

Example Output:

Image File   Predicted Label
[Link]       Dog 🐶
[Link]       Cat 🐱
[Link]       Dog 🐶
[Link]       Cat 🐱
[Link]       Dog 🐶
[Link]       Cat 🐱
[Link]       Cat 🐱
[Link]       Dog 🐶
[Link]       Dog 🐶
[Link]       Cat 🐱

Output File

 A CSV file named [Link] was generated.


 It contains predictions for each image in the test dataset in the following format:

id,label
1,1
2,0
3,1
4,0
...

Where:

 label = 0 → Cat
 label = 1 → Dog

Conclusion:

This task demonstrated the practical application of Support Vector Machines (SVM) in image
classification, particularly in binary classification scenarios like cat vs. dog detection.

Through this task, I learned:

 How to preprocess and normalize image data.


 How to apply a machine learning classifier to image vectors.
 How to evaluate and improve model performance.
 How to use OpenCV and scikit-learn together in a vision pipeline.

Task 4: Hand Gesture Recognition Using CNN

1. Objective

The goal of this task was to build a Convolutional Neural Network (CNN) model capable of
recognizing hand gestures in real-time or through uploaded images. The system predicts gestures
like "palm", "thumbs up", "ok", etc., and overlays the result visually with an emoji and confidence
score.

This task applies deep learning in computer vision to enhance Human-Computer Interaction (HCI).

2. Tools & Libraries Used

 Python – Programming language used for the implementation


 TensorFlow / Keras – For loading and running the pre-trained CNN model
 OpenCV – For image capture, processing, and display
 NumPy – For numerical operations and array handling
 Tkinter – For graphical file upload dialog (optional image selection)

3. Dataset & Classes

The model was trained on a dataset of labeled hand gesture images, each corresponding to one of
10 gesture classes.

Class Label   Meaning          Emoji
palm          Open hand
fist          Closed fist      ✊
thumb         Thumbs up        👍
ok            OK gesture       👌
index         Pointing index   ☝️
l             L sign           □
fist_moved    Moving fist      ✊🏽
palm_moved    Moving palm
c             C sign           🤙
down          Pointing down    👇

4. Workflow

The following steps were followed to implement the Hand Gesture Recognition system using a
trained CNN model:

1. Model Loading

 A pre-trained Keras model was loaded using:

model = load_model('my_model.keras')

2. Image Preprocessing

Before predictions, every image (live or uploaded) undergoes the following steps:

 Convert to grayscale (if in color)


 Resize to 64 x 64 pixels
 Normalize pixel values to [0, 1] range
 Add dimensions to match CNN input shape (batch size, height, width, channels)

img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
img = cv2.resize(img, (64, 64))
img = img.astype('float32') / 255.0
img = np.expand_dims(img, axis=-1)  # add channel dimension
img = np.expand_dims(img, axis=0)   # add batch dimension

3. Prediction

 The preprocessed image is passed to the model:

predictions = model.predict(processed)
predicted_class = class_labels[np.argmax(predictions)]
confidence = np.max(predictions)

4. Output Display

 The predicted gesture name, emoji, and confidence (e.g., “thumb 👍 (95.3%)”) are overlayed
on the image.
 This is displayed using OpenCV's cv2.imshow().

5. User Interaction Options

 Option 1: Upload Image

 User selects an image from their system.

 Option 2: Live Camera

 Opens webcam and runs continuous prediction on the video feed in real time.
 Pressing q stops the camera session.

5. Code Implementation

The following Python script loads a trained CNN model and provides two modes of
interaction: real-time prediction via webcam and prediction on an uploaded image.

import cv2
import numpy as np
from tensorflow.keras.models import load_model
from tkinter import Tk
from tkinter.filedialog import askopenfilename

# Load the trained model
model = load_model('my_model.keras')

# Define class labels and emojis
class_labels = ['palm', 'fist', 'thumb', 'ok', 'index', 'l', 'fist_moved', 'palm_moved', 'c', 'down']
emoji_map = {
    'palm': '',
    'fist': '✊',
    'thumb': '👍',
    'ok': '👌',
    'index': '☝️',
    'l': '□',
    'fist_moved': '✊🏽',
    'palm_moved': '',
    'c': '🤙',
    'down': '👇'
}

IMAGE_SIZE = 64

# Preprocessing function
def preprocess_image(img):
    if len(img.shape) == 3 and img.shape[2] == 3:
        img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    img = cv2.resize(img, (IMAGE_SIZE, IMAGE_SIZE))
    img = img.astype('float32') / 255.0
    img = np.expand_dims(img, axis=-1)
    img = np.expand_dims(img, axis=0)
    return img

# Prediction function
def predict_image(img):
    processed = preprocess_image(img)
    predictions = model.predict(processed)
    predicted_class = class_labels[np.argmax(predictions)]
    confidence = np.max(predictions)
    return predicted_class, confidence

# Result display
def display_result(img, predicted_class, confidence):
    emoji = emoji_map.get(predicted_class, '')
    text = f"{predicted_class} {emoji} ({confidence*100:.1f}%)"
    cv2.putText(img, text, (10, 30), cv2.FONT_HERSHEY_SIMPLEX,
                1, (0, 255, 0), 2, cv2.LINE_AA)
    cv2.imshow("Prediction", img)

# Real-time camera prediction
def live_camera():
    cap = cv2.VideoCapture(0)
    if not cap.isOpened():
        print("Cannot open camera")
        return
    while True:
        ret, frame = cap.read()
        if not ret:
            print("Can't receive frame (stream end?). Exiting ...")
            break
        predicted_class, confidence = predict_image(frame)
        display_result(frame, predicted_class, confidence)
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break
    cap.release()
    cv2.destroyAllWindows()

# Upload and predict from file
def upload_and_predict():
    Tk().withdraw()
    file_path = askopenfilename(filetypes=[("Image files", "*.jpg *.jpeg *.png")])
    if not file_path:
        print("No file selected")
        return
    img = cv2.imread(file_path)
    if img is None:
        print("Failed to load image")
        return
    predicted_class, confidence = predict_image(img)
    display_result(img, predicted_class, confidence)
    cv2.waitKey(0)
    cv2.destroyAllWindows()

# Main interaction
def main():
    choice = input("Choose option:\n1 - Upload Image\n2 - Live Camera\nEnter choice: ")
    if choice == '1':
        upload_and_predict()
    elif choice == '2':
        live_camera()
    else:
        print("Invalid choice")

if __name__ == "__main__":
    main()

6. Results & Conclusion

Model Prediction Results

Once the model is loaded and the image is provided (via webcam or file), it identifies the most likely
gesture and displays:

 Gesture label
 Emoji
 Confidence score

Sample Output Table

Input Type       Predicted Gesture   Emoji   Confidence (%)
Webcam           palm                        93.6
Image: [Link]    ok                  👌      87.4
Image: [Link]    fist                ✊      91.1
Webcam           thumb               👍      95.0

Output Display Example:

On screen (via OpenCV), the model overlays:

ok 👌 (87.4%)
Conclusion

This project demonstrated the use of deep learning and computer vision to interpret hand gestures
for Human-Computer Interaction (HCI). Through this task, I learned to:

 Load and use a pre-trained CNN for image classification.


 Apply image preprocessing and prediction pipelines.
 Use OpenCV for real-time prediction and visual feedback.
 Combine user input interfaces (camera, file upload) in a complete ML application.

This task is a practical stepping stone towards building more advanced gesture-controlled interfaces
and assistive technologies.

[Link] COMPLETION CERTIFICATE
