Abstract— One of the non-verbal communication techniques utilized in sign language is the hand gesture. People with speech or hearing impairments use it most frequently to converse with one another and with non-disabled people. Several makers around the world have created various sign language systems, but they are neither adaptable nor cost-effective for end users. The "Hand Sign and Gesture Recognition System" software presented in this proposal therefore offers a system prototype that can automatically understand sign language, helping deaf and mute individuals communicate with each other and with other people more efficiently. This technique can also be employed on platforms like YouTube, where there is currently no feature for auto-text generation from gestures and sign language. The software aims to present a real-time system for hand gesture and sign recognition based on the detection of shape-based features such as orientation, centre-of-mass centroid, finger status, and thumb position (raised or folded fingers), although this feature-extraction step is resolved entirely by a Convolutional Neural Network (CNN). We used CNNs for training and testing, and our test accuracy was 99.89%. One of our model's benefits is its independence from external hardware or devices.

Keywords- Convolutional Neural Network (CNN), Haar Cascade, Orientation, Region of Interest (ROI), Auto-text Generation.

1. INTRODUCTION

Hand gesture is one of the methods used in sign language for non-verbal communication. It is most commonly used by people who have hearing or speech impairments to communicate among themselves or with others. Various sign language systems have been developed by many makers around the world, but they are neither flexible nor cost-effective for end users. Hence, the proposed "Hand Sign and Gesture Recognition System" software presents a system prototype that is able to automatically recognize sign language, helping deaf and mute people communicate more effectively with each other and with others. This system can also be used on platforms like YouTube, where there is currently no feature for auto-text generation on the basis of gestures and sign languages. Gesture recognition is a developing field of research. As a significant part of non-verbal communication, hand gestures play a key role in our daily life. Keeping in mind the shape of the human hand, with four fingers and one thumb, the software aims to present a real-time system for recognizing hand gestures and signs based on the detection of shape-based features such as orientation, centre-of-mass centroid, finger status, and thumb position (raised or folded fingers), although this feature extraction is resolved entirely by a Convolutional Neural Network (CNN). We capture every video frame, and each frame is used to locate the hands and crop them out; the cropped region serves as the input to our CNN.

One of the key challenges in hand and sign gesture detection is dealing with the variability in hand shapes and orientations. To address this challenge, researchers have developed a range of CNN architectures designed to be robust to variations in hand pose and appearance; the CNN is trained to learn discriminative features from the input images. For example, some architectures incorporate recurrent neural networks (RNNs) to capture the temporal dynamics of hand movements, while others use attention mechanisms to focus on specific parts of the hand.

Another important consideration in hand and sign gesture detection is real-time performance. CNNs can be computationally intensive and may require significant processing power to run in real time on a mobile or embedded device. To address this issue, researchers have explored techniques such as model compression, pruning, and quantization to reduce the computational requirements of CNNs.

2. Methodology

Raw video footage acts as the input to our software. The footage is first broken down into frames. These frames are then sent to a Haar cascade model to filter out the Region of Interest (ROI), which in our case is the hands. The ROIs are then cropped out of the frames and sent to a CNN model
which will classify these images. In a traditional system, a separate model is used to extract the region of interest, but in our system we use the Haar cascade algorithm for this purpose. The cascaded image is then sent to the CNN model, where it is classified. Our project uses the OpenCV Python module to record live footage from a device camera, and all processing runs without any additional devices. This makes it a video-based project, which increases scalability. An image is taken frame by frame from the video input and a section of it is cropped out, which is then passed to the next stage. In this stage the image is converted to greyscale, which reduces the processing load on the system and in turn improves the latency of our output. This greyscale image is then sent to the next stage for feature outlining, after which it is passed to the model. The feature outlining is done with a Gaussian blur; the image after applying the Gaussian blur is shown below.

Fig 4. Real Hand Figure[4]    Fig 5. Real Hand Figure with keypoints[4]

List of Steps along with description
1. Data Acquisition
Data acquisition is the process of sampling signals that measure real-world physical phenomena and converting them into a digital form that can be manipulated by a computer and software.
Figure 1: Image pre-processing for easier recognition and feature outlining. a) actual image b) Region of Interest [5]

3. Feature Extraction
Feature extraction plays a crucial role in sign language recognition, enabling accurate interpretation of the manual gestures and movements used in sign languages.

4. Classification
Classification involves the identification and categorization of different sign language gestures.

5. Gesture Recognition
Gesture recognition facilitates effective communication between individuals who are deaf or hard of hearing and the hearing community.

We'll first use MediaPipe to recognize the hand and the hand key points. MediaPipe returns a total of 21 key points for each detected hand.
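From the 21 key points, the shape-based features mentioned earlier (centre-of-mass centroid, orientation, finger status) can be computed directly. This sketch uses MediaPipe's standard landmark ordering; the raised-finger test (tip above the joint below it in image coordinates) is a common heuristic rather than the paper's exact method, and it is unreliable for the thumb, which folds sideways.

```python
import numpy as np

# Assumed input: 21 (x, y) landmarks in MediaPipe's ordering,
# in image coordinates where y increases downwards.
TIP_IDS = [4, 8, 12, 16, 20]   # thumb, index, middle, ring, pinky tips
PIP_IDS = [2, 6, 10, 14, 18]   # the joint two links below each tip

def hand_features(landmarks):
    """Return (centroid, orientation in degrees, per-finger raised status)."""
    pts = np.asarray(landmarks, dtype=float)    # shape (21, 2)
    centroid = pts.mean(axis=0)                 # centre-of-mass centroid
    # Orientation: angle of the wrist -> middle-finger-base vector.
    v = pts[9] - pts[0]
    orientation = np.degrees(np.arctan2(v[1], v[0]))
    # Finger status: raised when the tip is above (smaller y than) its lower joint.
    status = [pts[t][1] < pts[p][1] for t, p in zip(TIP_IDS, PIP_IDS)]
    return centroid, orientation, status
```

Such hand-crafted features are what the CNN in our system learns automatically; they remain useful as a lightweight cross-check.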
Fig4: American Sign Language[8]    Fig5: Feature Extraction[4]

I. A. Haar Cascade Classifiers

The creation and execution of a Hand Sign and Gesture Recognition System that employs sophisticated deep learning methodologies has revealed considerable promise in the realm of human-computer interaction and assistive technologies. This system utilizes Convolutional Neural Networks (CNNs) along with effective preprocessing techniques to attain a high level of accuracy in identifying both static and dynamic gestures. The findings highlight the system's capability to accurately classify a wide array of gestures, encompassing more than 3000 distinct variations, with exceptional precision and dependability.

The primary contributions of this study are as follows:

Improved Recognition Precision: The CNN-driven model has outperformed traditional approaches, greatly enhancing the classification of intricate gestures.

Automated Feature Extraction: The system streamlines essential processes such as histogram-based feature extraction.

Real-Time Performance: Designed for minimal latency, the system guarantees immediate gesture recognition, making it ideal for both interactive and assistive applications.

Flexibility and Scalability: The architecture is adaptable and can accommodate a broad spectrum of gesture types, making it applicable for various uses, including sign language interpretation and gesture-controlled systems.

This system is also cost-efficient, requiring only standard imaging devices for data collection and minimal specialized hardware. Its capacity for integration with existing frameworks and compatibility across platforms facilitates smooth deployment in diverse settings. By merging accuracy, scalability, and user-friendliness, this gesture recognition system signifies a major advancement in gesture-based interaction technologies, improving accessibility and communication for a wide range of users.
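The paper does not spell out the CNN architecture behind these results. Purely as an illustration of the kind of classifier described, a small gesture CNN might look like the following in PyTorch; the layer sizes, the 64x64 greyscale input, and the class count are assumptions, not the authors' configuration.

```python
import torch
import torch.nn as nn

class GestureCNN(nn.Module):
    """Illustrative gesture classifier for 64x64 single-channel hand crops."""

    def __init__(self, num_classes: int):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),   # 64x64 -> 32x32
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # 32x32 -> 16x16
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(32 * 16 * 16, 128), nn.ReLU(),
            nn.Linear(128, num_classes),              # one logit per gesture class
        )

    def forward(self, x):
        return self.classifier(self.features(x))
```

A real system recognizing thousands of gesture variations would need a deeper network and far more training data, but the interface (preprocessed ROI in, class scores out) is the same.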
5. FUTURE GOAL