Speech to sign - An application to convert
speech to American Sign Language
Semester 2022-23
ABSTRACT
The deaf and hearing-impaired make up a sizable community with specific needs
that operators and technology have only recently begun to target. There is no
freely available software, let alone one at a reasonable price, to translate
uttered speech into sign language in real time. Language plays a vital role in
communicating ideas, thoughts, and information to others. Hearing-impaired
people understand our thoughts using a language known as sign language.
Every country has a different sign language based on its native language. In
this project, our major focus is on American Sign Language, which is widely
used by hearing- and speaking-impaired communities around the world. While
communicating our thoughts and views with others, one of the most essential
factors is listening. What if the other party is not able to hear or grasp what you
are talking about? This situation is faced by nearly every hearing-impaired person
in our society. This led to the idea of introducing an audio to American Sign
Language translation system which can erase this gap in communication between
hearing-impaired people and the rest of society. The system accepts audio as input
and converts it to American Sign Language as output. The heart of the system is
natural language processing, which equips the system with tokenization, parsing,
lemmatization, and part-of-speech tagging.
Keywords: American Sign Language translator, natural language processing,
tokenization, part-of-speech tagging
1. INTRODUCTION
Every country has a different sign language based on its native language. It is
not easy for us to speak when we know the other person is not listening, let alone
hearing-impaired. Even those with sufficient hearing tend to ignore or avoid
communication with those who cannot hear, and for those who cannot hear it
becomes even more difficult. Having the skill to talk to those who cannot hear
can not only bridge the gap between the two but also help in the exchange of
many ideas and new thoughts, which could encourage these people to contribute
to the development of technology. Every mind can contribute to turning unknowns
into knowns and the impossible into the possible.
1.1 Overview
It is said that sign language is the mother language of deaf people. It combines
hand movements, movements of the arms or body, and facial expressions. There
are about 135 sign languages all over the world, including American Sign
Language (ASL), Indian Sign Language (ISL), British Sign Language (BSL),
Australian Sign Language (Auslan), and many more. We use American Sign
Language in this project. The system allows the deaf community to enjoy all the
things that hearing people do, from daily interaction to accessing information.
Sign language is a communication language used by deaf people through the face,
hands, or eyes instead of the vocal tract. A sign language recognizer is a tool used
for recognizing the sign language of deaf and mute people. Gesture recognition is
an important topic because segmenting a foreground object from a cluttered
background is a challenging problem. There is a difference between a human
looking at an image and a computer looking at an image: for humans it is easy to
work out what is in an image, but not for a computer, and it is because of this that
computer vision problems remain a challenge. Sign language is a language that
consists of signs made with the hands and other movements, facial expressions,
and postures of the body, used primarily by people who are deaf or hard of hearing
so that they can easily express their thoughts and communicate with other people.
Sign language is very important for deaf people's emotional, social, and linguistic
growth. The first language of deaf people is sign language, and their education
proceeds bilingually with the national sign language as well as the national written
or spoken language. There are different communities of deaf people all around the
world, and the sign languages of these communities differ: America uses American
Sign Language, Britain uses British Sign Language, and similarly India uses Indian
Sign Language to express thoughts and communicate with each other.
According to the 2011 census of India, there are 63 million people, about 6.3%
of the total population, who suffer from hearing problems. Of these, 76-89% have
no knowledge of language, whether signed, spoken, or written. The reason behind
this low literacy rate is the lack of sign language interpreters, the unavailability of
sign language tools, or the lack of research on sign language. Sign language is a
natural way of communication for people challenged with speaking and hearing
disabilities. Various tools have been available to recognize sign language and
convert it to text, but text to sign language conversion systems have rarely been
developed; this is due to the scarcity of any sign language corpus. In the proposed
system, the input sentence is first reordered according to sign language grammar,
and stopwords are then eliminated from the reordered sentence. Stemming is
applied to convert the words to their root form, as American Sign Language does
not support inflections of words. All words of the sentence are then checked
against the words in a dictionary containing videos representing each of the words.
If a word is not found in the dictionary, its corresponding synonym is used to
replace it. The proposed system is innovative, as existing systems are limited to
the direct conversion of words into American Sign Language, whereas our system
is capable of translating complete sentences.
1.2 SIGN LANGUAGE
It is a language that includes gestures made with the hands and other body parts,
including facial expressions and postures of the body. It is used primarily by
people who are deaf and mute. There are many different sign languages, such as
British, Indian, and American Sign Language. British Sign Language (BSL) is not
easily intelligible to users of American Sign Language (ASL) and vice versa. A
functioning sign recognition system could provide a chance for deaf people to
communicate with non-signing people without the need for an interpreter. It
might be used to generate speech or text, making the deaf more independent.
Unfortunately, there has not been any system with these capabilities so far. In this
project our aim is to develop a system which can classify sign language
accurately. American Sign Language (ASL) is a complete, natural language that
has the same linguistic properties as spoken languages, with grammar that differs
from English. ASL is expressed by movements of the hands and face. It is the
primary language of many North Americans who are deaf and hard of hearing,
and is used by many hearing people as well.
1.3 PROBLEM STATEMENT
The main purpose of the project is to take user input and convert it to sign
language. Natural Language Processing (NLP) is implemented to break the
text/speech into small parts, the words/letters are then searched in the database,
and at the end the appropriate signs or gestures are displayed to the user. In this
project, the problems we have considered are:
1. Recognizing speech and converting it into text.
2. Converting the whole statement into sign language.
3. Handling words that are not found in the database/dataset.
Sign language is a language that uses manual communication methods such as
facial expressions, hand gestures and bodily movements to convey information.
This project makes use of videos for specific words, combined to translate the
text language into sign language. Speech-impaired people use hand signs and
gestures to communicate, and hearing people face difficulty in understanding
their language. Hence there is a need for a system which recognizes the different
signs and gestures and conveys the information between hearing people and deaf
people, bridging the gap between physically challenged people and others. Our
approach provides the result in a minimum time span with maximum precision
and accuracy in comparison to other existing approaches.
1.4 MOTIVATION
1.4.1 To make it easy for people who do not know ASL (American Sign
Language) to communicate with deaf people.
1.4.2 To make a working application that converts any speech from a
microphone to American Sign Language in real time, bridging the
communication gap between deaf people and hearing people.
1.5 AIM AND OBJECTIVES
1.5.1 Aim
This project intends to:
1. Create a translation system that consists of a parsing module which parses the
input English sentence into a phrase structure grammar representation on which
American Sign Language grammar rules are applied.
2. Convert these sentences into American Sign Language grammar in the real
domain.
3. Develop a communication system for deaf people.
1.5.2 Objective
The main aim of this project is to help deaf and mute people communicate easily
in society with people who do not know sign language. This web application
converts speech into sign language; it will be open source and freely available,
which in turn will benefit the deaf community, and it will increase opportunities
for advancement and success in education, employment, personal relationships,
and public access venues.
The application listens to a user and displays the American Sign Language signs
for the words spoken. The user can then use the signs to communicate with a deaf
person.
2. LITERATURE SURVEY
2.1. Sign Language in English
A two-way communication system is suggested in the paper [4], but the authors
are only able to convert 26 alphabets and three characters with an accuracy rate
of 99.78% using CNN models. The authors only suggest that future work must be
conducted in the field of natural language processing to convert speech into sign
language. In the paper [5], the authors
propose a system that converts sign language into English and Malayalam. The
authors of the paper suggest using an Arduino Uno, which uses a pair of gloves
to recognize the signs and translates the signs from ISL to the preferred language.
The system is useful, as it recognizes two-hand and motion signs. The American
Sign Language interpreter presented in the paper [6] uses hybrid CNN models to
detect multiple sign gestures and then goes on to predict the sentence that the user
is trying to gesture by using natural language processing techniques. The system
is able to achieve 80–95% accuracy under various conditions. In another study,
the HSR model is used by the authors in converting ISL signs into text. The HSR
model gives an advantage over RGB-based models, but this system has an
accuracy ranging from 30 to 100% depending upon the illumination, hand
position, finger position, etc. [7]. The authors of paper [8] propose a system that
recognizes 26 ASL signs and converts them into English text. They use principal
component analysis to detect the signs in MATLAB. The text to sign language
synthesis tool in [9] uses VRML avatars and plays them using a BAP player. The
major problem with that system is that many complex movements are not possible
using the current VRML avatars; for example, touching the hand to any part of
the body is not possible in the current system [9]. In another study mentioned in
[10], a video-based sign language translation system converts signs from ISL,
BSL, and ASL with an overall accuracy of 92.4%. The software utilizes CNN and
RNN for the real-time recognition of dynamic signs. The system then converts the
signs into text and then uses a text-to-speech API to give an audio output to the user.
authors of another paper first use the Microsoft Kinect 360 camera to capture the
movement of the ISL signs. A unity engine is used to display the Blender 3D
animation created by the authors. Although the system can successfully convert
words into sign language, it is not able to convert phrases/multiple words into
ISL.
2.2. Sign Language in Other Languages
The work presented by the authors of [11] is another bidirectional sign language
system. The system is able to achieve 97% accuracy when translating sign
languages to text or audio. The authors use the Google API to convert speech to
text, and the system then produces a 3D figure using the Unity engine after
extracting keywords from the input [12]. Another system proposed by the authors
in the paper [13] converts Malayalam text and gives a 3D animated avatar as the
sign language output. The system uses HamNoSys notation, as it is the main
structure of the signs [14]. A unique Russian text to Russian sign language [15]
system utilizes semantic analysis algorithms to convert text to sign language,
focusing on the lexical meanings of the words. Although the system can reduce
the sentence into gestures, the authors observe that the sentence proposition can
be improved by making the algorithm more efficient.
Speech Recognition
Set device ID to the selected microphone: In this step, we specify the device ID
of the microphone that we wish to use, in order to avoid ambiguity in case there
are multiple microphones. This also helps with debugging: while running the
program, we will know whether the specified microphone is being recognized. In
the program, we specify a parameter device_id; the program will report that
device_id could not be found if the microphone is not recognized.
Adjust for ambient noise: Since the surrounding noise varies, we must allow the
program a second or two to adjust the energy threshold of recording so that it is
set according to the external noise level.
Speech-to-text translation: This is done with the help of Google Speech
Recognition, which requires an active internet connection to work. There are
offline recognition systems such as PocketSphinx, but they have a very rigorous
installation process that requires several dependencies. Google Speech
Recognition is one of the easiest to use.
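The steps above are commonly implemented with the Python speech_recognition package; the following is a minimal sketch under that assumption (the package is not named in this report, and the device index shown is only a placeholder).

# Sketch of the microphone setup described above (Python speech_recognition package assumed).
import speech_recognition as sr

recognizer = sr.Recognizer()

# List available microphones so the correct device index can be chosen explicitly.
for index, name in enumerate(sr.Microphone.list_microphone_names()):
    print(index, name)

# device_index=1 is a placeholder; use one of the indices printed above.
with sr.Microphone(device_index=1) as source:
    # Sample ambient noise for about a second and adjust the energy threshold.
    recognizer.adjust_for_ambient_noise(source, duration=1)
    audio = recognizer.listen(source)

try:
    # Google Speech Recognition needs an active internet connection.
    print("You said:", recognizer.recognize_google(audio))
except sr.UnknownValueError:
    print("Could not understand the audio.")
except sr.RequestError as error:
    print("Recognition request failed:", error)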
Basic Concept
First, we use the webkitSpeechRecognition API to capture audio as input. We
then use the Chrome/Google Speech API to transform the audio into text. Next,
we use NLP (natural language processing) to break the text down into smaller,
more easily processed chunks. A dependency parser analyses the sentence's
grammatical structure and builds up the word connections. Finally, the audio/text
is converted into sign language, and the user receives videos/clips as the sign
language output for the given input.
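The report does not name the dependency parser used to analyse grammatical structure; purely as an illustration, the sketch below uses spaCy (an assumption, not the project's actual parser) to show the word-to-word connections such a parser produces.

# Illustrative dependency parse; spaCy is used here only as a stand-in parser.
import spacy

nlp = spacy.load("en_core_web_sm")  # small English model, installed separately
doc = nlp("What is your hobby?")

for token in doc:
    # Each token, its dependency relation, and the head word it attaches to.
    print(token.text, token.dep_, token.head.text)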
Pre-processing of text
The filler words which are used to fill gaps in a sentence carry little meaning and
provide little context. There are around 30+ filler words in the English language
which hardly add meaning to a sentence. The system therefore removes the filler
words from the sentence, making it more meaningful. By removing these words,
the system also saves time.
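The exact filler-word list is not given in this report; the sketch below uses NLTK's English stopword list as a stand-in to illustrate the removal step.

# Filler/stop-word removal sketch; NLTK's stopword list stands in for the
# system's own filler-word list, which is not specified in the report.
import nltk
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize

nltk.download("punkt", quiet=True)
nltk.download("stopwords", quiet=True)

def remove_fillers(sentence):
    stop_words = set(stopwords.words("english"))
    tokens = word_tokenize(sentence.lower())
    # Keep only alphabetic tokens that are not stopwords.
    return [t for t in tokens if t.isalpha() and t not in stop_words]

print(remove_fillers("I am going to the market"))  # e.g. ['going', 'market']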
Google Speech API
A Speech-to-Text API synchronous recognition request is the simplest method
for performing recognition on speech audio data. Speech-to-Text can process up
to 1 minute of speech audio data sent in a synchronous request. After
Speech-to-Text processes and recognizes all of the audio, it returns a response.
A synchronous request is blocking, meaning that Speech-to-Text must return a
response before processing the next request. Speech-to-Text typically processes
audio faster than real time, processing 30 seconds of audio in 15 seconds on
average. In cases of poor audio quality, the recognition request can take
significantly longer.
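A sketch of such a synchronous request is given below; it assumes the google-cloud-speech Python client library, an authenticated environment, and a 16 kHz LINEAR16 file named audio.wav, none of which are specified in this report.

# Synchronous Speech-to-Text request sketch (google-cloud-speech client assumed).
from google.cloud import speech

client = speech.SpeechClient()

with open("audio.wav", "rb") as f:   # placeholder file name
    content = f.read()

audio = speech.RecognitionAudio(content=content)
config = speech.RecognitionConfig(
    encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
    sample_rate_hertz=16000,
    language_code="en-US",
)

# recognize() blocks until the full response is available, so it is
# limited to roughly one minute of audio.
response = client.recognize(config=config, audio=audio)
for result in response.results:
    print(result.alternatives[0].transcript)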
Noise removal is the process of removing unwanted or absurd noise from the
input speech data. Noise removal techniques include filtering, spectral
restoration, and many more; modulation detection and synchrony detection are
two further noise removal techniques. Since the speech from the user is captured
using the microphone of a computer or a cellular phone, clarity of sound cannot
be guaranteed, so the input is first sent through noise removal.
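As one illustration of the filtering technique mentioned above, the sketch below applies a simple band-pass filter over the speech band using SciPy; the library, cut-off frequencies, and filter order are example choices, not the noise-removal method actually used by the system.

# Illustrative band-pass filtering of a speech signal (SciPy assumed);
# the 300-3400 Hz speech band and filter order are example choices only.
import numpy as np
from scipy.signal import butter, lfilter

def bandpass_filter(signal, sample_rate, low=300.0, high=3400.0, order=5):
    # Design a Butterworth band-pass filter and apply it to the signal.
    b, a = butter(order, [low, high], btype="band", fs=sample_rate)
    return lfilter(b, a, signal)

# Example: filter one second of synthetic audio sampled at 16 kHz.
sample_rate = 16000
t = np.linspace(0, 1, sample_rate, endpoint=False)
noisy = np.sin(2 * np.pi * 440 * t) + 0.3 * np.random.randn(sample_rate)
clean = bandpass_filter(noisy, sample_rate)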
Hardware and Software Requirements
The system will require the basic facilities needed to develop a system and
implement it. Major requirements are as
follows.
Hardware requirements
1. Disk Space of around 50 Gigabyte.
2. Intel Atom processor or Intel Core i3 processor.
3. A GPU (1GB) is recommended for training and for inference speed, but is not
mandatory.
4. RAM (4GB)
5. A microphone
6. A keyboard
Software requirements
1. Both Windows and Linux are supported.
2. Chrome or other browsers
3. Internet connectivity
Advantages and Disadvantages
Advantages
1. It can be applied to higher level applications: The system extracts the input
audio's features, and this output can be applied to higher level applications.
2. Takes less time: It applies clustering by extracting audio features, which tends
to consume less time than comparing separately with every other sign-language
entry in the dataset.
3. More accurate results: As it uses speech features instead of metadata, the sign
language comparison is more accurate and thus we can achieve higher precision
of results.
4. Easy user interface: The system is implemented with a simple and easy-to-use
user interface in mind. Any user can easily retrieve results without any hindrance.
Disadvantages
1. Limited training dataset: The project is currently implemented on a finite
dataset stored in a folder/personal system.
Although it can be expanded there are certain storage constraints to which the
project is limited.
2. Size and format constraints: The project can be applied only to .mp4 files, as
feature extraction is easier for such files. Moreover, larger video clips that exceed
the limit are hard to analyze, as they require more space to store and process.
SYSTEM DESIGN DETAILS
A data flow diagram (DFD) is a graphical representation of the flow of data
through an information system, modelling its process aspects. A DFD is often
used as a preliminary step to create an overview of the system without going into
great detail, which can later be elaborated. DFDs can also be used for
visualization of data processing.
Fig UML diagram for system
American Sign Language alphabet
The sign language alphabet is a manual alphabet. A manual alphabet is a system
of representing all the letters of an alphabet using only the hands, and making
words with a manual alphabet is called fingerspelling. Manual alphabets are a part
of sign languages. For ASL, the one-handed manual alphabet is used.
Fingerspelling is used to complement the vocabulary of ASL when spelling
individual letters of a word is the preferred or only option, such as with proper
names or the titles of works. Letters should be signed with the dominant hand
and, in most cases, with the palm facing the viewer. The American manual
alphabet is depicted in the figure.
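The letter-to-image lookup used for fingerspelling can be sketched as follows; the Alphabets/ folder and default.png fallback mirror the front-end code shown later in this report.

# Fingerspelling lookup sketch: each letter maps to its hand-shape image.
def fingerspell(word):
    images = []
    for ch in word:
        if ch.isalpha():
            images.append("Alphabets/" + ch.lower() + ".png")
        else:
            # Non-letter characters fall back to a placeholder image.
            images.append("Alphabets/default.png")
    return images

print(fingerspell("Hi!"))
# ['Alphabets/h.png', 'Alphabets/i.png', 'Alphabets/default.png']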
Algorithm :-
1. Open Web Application.
2. Input the text or click on the microphone to speak.
3. Click on submit.
4. Input is processed by the system.
5. Start button for display of animation.
6. Shows the Required result.
7. Close.
Fig Block diagram of proposed system
Tools and Technologies Used-
WebKit API -
WebKit's C++ application programming interface (API) provides a set of classes
to display Web content in windows, and implements browser features such as
following links when clicked by the user, managing a back-forward list, and
managing a history of pages recently visited. WebKit is a layout engine designed
to allow web browsers to render web pages.
JavaScript -
JavaScript, often abbreviated as JS, is a programming language that is one of the
core technologies of the World Wide Web, alongside HTML and CSS. As of
2022, 98% of websites use JavaScript on the client side for webpage behavior,
often incorporating third-party libraries.
JavaScript is the most popular programming language in the world, which makes
it a great choice for programmers. Once you have learnt JavaScript, it helps you
develop great front-end as well as back-end software using different
JavaScript-based frameworks like jQuery, Node.js, etc.
JavaScript is everywhere; it comes installed on every modern web browser, so to
learn JavaScript you really do not need any special environment setup. For
example, Chrome, Mozilla Firefox, Safari and every other browser you know of
today supports JavaScript. JavaScript helps you create beautiful and fast websites.
You can develop your website with a console-like look and feel and give your
users a great graphical user experience.
JavaScript usage has now extended to mobile app development, desktop app
development, and game development. This opens many opportunities for you as
a JavaScript programmer.
Html (Hyper Text Markup Language)
The Hypertext Markup Language or HTML is the standard markup language for
documents designed to be displayed in a web browser. It can be assisted by
technologies such as Cascading Style Sheets (CSS) and scripting languages such
as JavaScript. Web browsers receive HTML documents from a web server or
from local storage and render the documents into multimedia web pages. HTML
describes the structure of a web page semantically and originally included cues
for the appearance of the document.
HTML elements are the building blocks of HTML pages. With HTML constructs,
images and other objects such as interactive forms may be embedded into the
rendered page. HTML provides a means to create structured documents by
denoting structural semantics for text such as headings, paragraphs, lists, links,
quotes and other items. HTML elements are delineated by tags, written using
angle brackets.
Tags such as <img> and <input> directly introduce content into the page. Other
tags such as <p> surround and provide information about document text and may
include other tags as sub-elements. Browsers do not display the HTML tags but
use them to interpret the content of the page.
Css (Cascading Style Sheets)
Cascading Style Sheets (CSS) is a style sheet language used for describing the
presentation of a document written in a markup language such as HTML. CSS is
a cornerstone technology of the World Wide Web, alongside HTML and
JavaScript. CSS is designed to enable the separation of presentation and content,
including layout, colors, and fonts.
This separation can improve content accessibility; provide more flexibility and
control in the specification of presentation characteristics; enable multiple web
pages to share formatting by specifying the relevant CSS in a separate .css file,
which reduces complexity and repetition in the structural content; and enable the
.css file to be cached to improve the page load speed between the pages that share
the file and its formatting. Separation of formatting and content also makes it
feasible to present the same markup page in different styles for different rendering
methods, such as on-screen, in print, by voice (via a speech-based browser or
screen reader), and on Braille-based tactile devices. CSS also has rules for
alternate formatting if the content is accessed on a mobile device.
VSCode -
Visual Studio Code, also commonly referred to as VS Code, is a source-code
editor made by Microsoft with the Electron Framework, for Windows, Linux and
macOS. Features include support for debugging, syntax highlighting, intelligent
code completion, snippets, code refactoring, and embedded Git.
Visual Studio Code is a streamlined code editor with support for development
operations like debugging, task running, and version control. It aims to provide
just the tools a developer needs for a quick code-build-debug cycle and leaves
more complex workflows to fuller featured IDEs, such as Visual Studio IDE.
NLTK Library
NLTK has been called “a wonderful tool for teaching, and working in,
computational linguistics using Python,” and “an amazing library to play with
natural language.”
Word Tokenization
We use the method word_tokenize() to split a sentence into words. The output
of word tokenization can be converted to Data Frame for better text understanding
in machine learning applications. It can also be provided as input for further text
cleaning steps such as punctuation removal, numeric character removal or
stemming. Machine learning models need numeric data to be trained and make a
prediction. Word tokenization becomes a crucial part of the text (string) to
numeric data conversion. Please read about Bag of Words or CountVectorizer.
Please refer to the NLTK word_tokenize example below to understand the theory
better.
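The NLTK word_tokenize example referred to above is sketched below.

# word_tokenize splits a sentence into word and punctuation tokens.
import nltk
from nltk.tokenize import word_tokenize

nltk.download("punkt", quiet=True)

sentence = "God is Great! I won a lottery."
print(word_tokenize(sentence))
# ['God', 'is', 'Great', '!', 'I', 'won', 'a', 'lottery', '.']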
Elimination of Stop Words
Since ASL deals with words associated with some meaning, unwanted words are
removed. These include various parts of speech such as TO, POS (possessive
ending), MD (modals), FW (foreign words), CC (coordinating conjunctions),
some DT (determiners like a, an, the), JJR and JJS (comparative and superlative
adjectives), NNS and NNPS (plural and proper plural nouns), RP (particles),
SYM (symbols), interjections, and non-root verbs.
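A sketch of this elimination step using nltk.pos_tag with the Penn Treebank tags listed above is given below; the exact tag set dropped by the system may differ.

# POS-based stop-word elimination sketch; the tag set mirrors the list above.
import nltk
from nltk import pos_tag, word_tokenize

nltk.download("punkt", quiet=True)
nltk.download("averaged_perceptron_tagger", quiet=True)

REMOVE_TAGS = {"TO", "POS", "MD", "FW", "CC", "DT",
               "JJR", "JJS", "NNS", "NNPS", "RP", "SYM", "UH"}

def eliminate_stop_words(sentence):
    tagged = pos_tag(word_tokenize(sentence))
    # Keep only the words whose tags are not in the removal set.
    return [word for word, tag in tagged if tag not in REMOVE_TAGS]

print(eliminate_stop_words("I am going to the market"))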
Lemmatization and Synonym replacement
American Sign Language uses root words in its sentences, so we convert words
to their root form using Porter Stemmer rules. Along with this, each word is
checked against the bilingual dictionary; if a word does not exist there, it is
replaced by a synonym with the same part of speech.
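A sketch of this step using NLTK's PorterStemmer and WordNet is given below; sign_dictionary is a hypothetical stand-in for the system's word-to-video dictionary, and the part-of-speech check is omitted for brevity.

# Stemming plus synonym substitution sketch; sign_dictionary is hypothetical.
import nltk
from nltk.stem import PorterStemmer
from nltk.corpus import wordnet

nltk.download("wordnet", quiet=True)

stemmer = PorterStemmer()
sign_dictionary = {"happi", "go", "home"}   # placeholder entries

def to_sign_word(word):
    root = stemmer.stem(word.lower())
    if root in sign_dictionary:
        return root
    # Otherwise try a WordNet synonym whose stem is in the dictionary.
    for synset in wordnet.synsets(word):
        for lemma in synset.lemma_names():
            candidate = stemmer.stem(lemma.lower())
            if candidate in sign_dictionary:
                return candidate
    return root   # no replacement found; keep the stemmed word

print([to_sign_word(w) for w in ["happiness", "going", "house"]])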
Code -
Index.html
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Sign Language</title>
<link rel="stylesheet"
href="https://cdn.jsdelivr.net/npm/[email protected]/dist/css/bootstrap.min.css"
integrity="sha384-
TX8t27EcRE3e/ihU7zmQxVncDAy5uIKz4rEkgIXeMed4M0jlfIDPvg6uqKI2x
Xr2" crossorigin="anonymous">
<script src="https://code.jquery.com/jquery-3.5.1.slim.min.js"
integrity="sha384-
DfXdz2htPH0lsSSs5nCTpuj/zy4C+OGpamoFVy38MVBnE+IbbVYUew+OrC
XaRkfj" crossorigin="anonymous">
</script>
<script src="https://cdn.jsdelivr.net/npm/[email protected]/dist/js/bootstrap.bundle.min.js" integrity="sha384-ho+j7jyWK8fNQe+A12Hb8AhRq26LrZ/JpcUGGOn+Y7RsweNrtN/tE3MoK7ZeZDyx" crossorigin="anonymous">
</script>
<link rel="stylesheet" href="css/style.css" type="text/css">
</head>
<body>
<div class="container">
<header>
<h1>Speech to sign language</h1>
</header>
<main class="px-3 py-3">
<div id="alert-placeholder">
</div>
<button type="button" class="btn btn-info" id="start-listening">Start
Listening</button>
<button type="button" class="btn btn-info" id="stop-listening">Stop
Listening</button>
</main>
<div id="output">
</div>
</div>
<script src="js/script.js"></script>
</body>
</html>
Style.css
*{
box-sizing: border-box;
}
html,
body {
min-height: 100vh;
margin: 0;
padding: 0;
}
#output img {
max-width: 100%;
max-height: 100%;
}
body {
font-family: Helvetica, Arial, sans-serif;
color: #0d122b;
display: flex;
flex-direction: column;
padding-left: 1em;
padding-right: 1em;
}
header {
border-bottom: 1px solid #0d122b;
margin-bottom: 2em;
}
Script.js
$(document).ready(function () {
let recognition;
if ('SpeechRecognition' in window || 'webkitSpeechRecognition' in window) {
// new speech recognition object (standard or WebKit-prefixed)
const SpeechRecognition = window.SpeechRecognition ||
window.webkitSpeechRecognition;
recognition = new SpeechRecognition();
// Recognition start event handler
recognition.onstart = () => {
// Show alert
$("#alert-placeholder").html(`
<div class="alert alert-success alert-dismissible fade show" role="alert">
Voice recognition started. Try speaking into the microphone.
<button type="button" class="close" data-dismiss="alert" aria-
label="Close">
<span aria-hidden="true">×</span>
</button>
</div>
`);
console.log('Voice recognition started. Try speaking into the microphone.');
// Add animation
$('main').addClass('speaking');
}
recognition.onresult = function (event) {
// Get the text spoken
const transcript = event.results[0][0].transcript;
console.log(transcript);
recognition.stop();
$('#stop-listening').prop('disabled', true);
$('#start-listening').prop('disabled', true);
$('main').removeClass('speaking');
// Delay required before starting to listen again
setTimeout(function () {
$('#start-listening').prop('disabled', false);
}, 400);
$('#output').empty();
$("#alert-placeholder").empty();
$('#output').html(`<h1 class="display-4">${transcript}</h1>`);
transcript.split(' ').forEach(element => {
let html = `
<div class="row">
<h1>${element}</h1>
`;
$('#output').append('');
for (let i = 0; i < element.length; i++) {
const character = element.charAt(i);
let image;
if (character.toLowerCase() != character.toUpperCase()) {
image = `<img src="Alphabets/${character.toLowerCase()}.png"
alt="${character}">`
} else {
image = `<img src="Alphabets/default.png" alt="${character}">`
}
html += '<div class="col-md-2">' + image + '</div>';
}
html += '</div>';
$('#output').append(html);
});
};
} else {
console.log('Speech recognition not supported 😢');
alert('Speech recognition not supported 😢');
$('#start-listening').prop('disabled', true);
}
$('#stop-listening').prop('disabled', true);
$('#start-listening').click(function () {
// start recognition
recognition.start();
$('#stop-listening').prop('disabled', false);
});
$('#stop-listening').click(function () {
// stop recognition
recognition.stop();
$('main').removeClass('speaking');
$('#stop-listening').prop('disabled', true);
$("#alert-placeholder").empty();
$("#alert-placeholder").html(`
<div class="alert alert-warning alert-dismissible fade show" role="alert">
No words detected. Please try again.
<button type="button" class="close" data-dismiss="alert" aria-
label="Close">
<span aria-hidden="true">×</span>
</button>
</div>
`);
});
});
Results and Discussion
Initial user interface: the user can press the Start Listening button before
speaking a sentence and press the Stop Listening button to convert the spoken
sentence into American Sign Language.
After pressing the Start Listening button, the application asks for permission to
use the microphone.
1) Input audio in microphone - Hi, my name is shruti Nice to meet you
Output Screenshot -
2) Input audio in microphone - What is your hobby?
Output Screenshot -
3) Input audio in microphone - How are you?
Output Screenshot -
Conclusions
In this project, we have presented a user-friendly audio to American Sign
Language translation system specially developed for the hearing- and speaking-
impaired community of India. The main aim of the system is to bring a feeling of
inclusion to the hearing-impaired community in society. The system not only
helps people who have a hearing disability but is also beneficial for hearing
people who want to understand the sign language of a hearing-impaired person
so that they can communicate in their language. The core of the system is based
on natural language processing and American Sign Language grammar rules. The
integration of this system in areas such as hospitals, buses, railway stations, post
offices, and even video conferencing applications could soon prove to be a boon
for the hearing-impaired community, not only in India but in other countries as
well. In the future, the features of the system could be enhanced by integrating
reverse functionality, i.e., an American Sign Language to audio/text translation
system, which could open the path to a two-way communication system.
Future Scope
1) In future, the proposed approach will be tested against unseen sentences.
Furthermore, the machine translation approach will be studied and
implemented on the parallel corpora of English and ASL sentences. The
ASL corpus can be used for testing ASL words and sentences and the
performance will be evaluated with evaluation parameters.
2) This could enable sign language users to access personal assistants, to use
text- based systems, to search sign language video content and to use
automated real-time translation when human interpreters are not available.
With the help of AI, automated sign language translation systems could
help break down communication barriers for deaf individuals.
3) Various front-end options are available, such as a .NET or Android app, that
can be used to make the system cross-platform and increase its availability.
4) The system can be extended to incorporate the knowledge of facial
expressions and body language too so that there is a complete
understanding of the context and tone of the input speech.
5) A mobile and web based version of the application will increase the reach
to more people.
6) Integrating a hand gesture recognition system using computer vision to
establish a two-way communication system.
7) We can develop a complete product that will help the speech and hearing
impaired people, and thereby reduce the communication gap.
REFERENCES
[1] American Sign Language Video Dictionary and Inflection Guide [CD-ROM]. New York, US: National Technical Institute for the Deaf, Rochester Institute of Technology, 2000. ISBN: 0-9720942-0-2.
[2] ASL University. Fingerspelling: Introduction. http://www.lifeprint.com/asl101/fingerspelling/fingerspelling.html
[3] M. Elmezain, A. Al-Hamadi, J. Appenrodt and B. Michaelis, "A Hidden Markov Model-based Continuous Gesture Recognition System for Hand Motion Trajectory," 19th International Conference on Pattern Recognition (ICPR 2008), IEEE, pp. 1-4, 2008.
[4] P. Morguet and M. Lang, "Comparison of Approaches to Continuous Hand Gesture Recognition for a Visual Dialog System," IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 1999), vol. 6, pp. 3549-3552, 15-19 March 1999.
[5] T. Starner, "Visual Recognition of American Sign Language Using Hidden Markov Models," Master's thesis, MIT Media Laboratory, Feb. 1995.
[6] Neha Poddar, Shrushti Rao, Shruti Sawant, Vrushali Somavanshi, Prof. Sumita Chandak, "Study of Sign Language Translation using Gesture Recognition," International Journal of Advanced Research in Computer and Communication Engineering, vol. 4, issue 2, February 2015.
[7] Anbarasi Rajamohan, Hemavathy R., Dhanalakshmi M., "Deaf Mute Communication Interpreter" (ISSN: 2277-1581), vol. 2, issue 5, pp. 336-341, 1 May 2013.
[8] Zouhour Tmar, Achraf Othman and Mohamed Jemni, "A rule-based approach for building an artificial English-ASL corpus." http://ieeexplore.ieee.org/document/6578458/
[9] Dictionary | Indian Sign Language. (n.d.). Retrieved July 15, 2016, from http://indiansignlanguage.org/dictionary
[10] P. Kar, M. Reddy, A. Mukherjee and A. M. Raina, "INGIT: Limited Domain Formulaic Translation from Hindi Strings to Indian Sign Language," ICON, 2017.
[11] M. Vasishta, J. Woodward and S. DeSantis, "An Introduction to Indian Sign Language," All India Federation of the Deaf (Third Edition), 2011.
[12] V. López-Ludeña, C. González-Morcillo, J. C. López, E. Ferreiro, J. Ferreiros and R. San-Segundo, "Methodology for developing an advanced communications system for the deaf in a new domain," Knowledge-Based Systems, 56:240-252, 2014.