A Framework For Speech Recognition Development

This document proposes a framework for developing speech applications that addresses integrity issues. Currently, there are many different speech recognition apps developed independently, which can cause conflicts when running multiple apps on the same machine. The proposed framework would allow developers to still build from scratch while providing a platform for apps to work together. This would resolve integrity problems and make it easier for developers to create speech applications.


A Framework For Speech Application Development

By Jason Elroy Martis, NMAMIT, Nitte

Agenda
Introduction
Applications
Working and Types
FSM
Problems
Proposed Solution
Results
Conclusion
References

Normal Ways of Interaction

Normal interaction works in two basic forms:
Language
Meta-language (body language)
Both forms occur simultaneously, which makes the interaction richer.

How Language Is Communicated

Language is communicated in the form of speech.
What is speech?
Speech is the vocalized form of human communication. It is based on the syntactic combination of lexical items and names drawn from vocabularies. It is the most natural way we interact.
Example: Hey! How are you?

Hence Speech Recognition (SR)

Speech recognition is the process of converting a speech signal into a sequence of lexical items by means of an algorithm. That is, we give an instruction by voice and the computer recognizes it.
Is this necessary? Yes: it makes our communication with the electronic or virtual world more natural.

Applications of SR
There are innumerable applications. Some are:

Military uses
Remote command and control centers (planes, satellites, etc.)

Health care
Automated medical prescriptions

Educational uses
Helps both teachers and students

So how does SR work?

A very simple model demonstrates how SR works

Approaches of SR
SR approaches are basically divided into three:
Acoustic phonetic approach (works on phonemes)
Pattern recognition approach (works on patterns)
Artificial intelligence approach (advanced functionality)

Acoustic Phonetic Approach

We need to know phonetics (the language of enunciation).
Recognize the phonemes, convert them to lexical items, and match those to words.
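
The matching step above can be sketched as a lookup from a phoneme sequence to candidate words. This is only a toy illustration: the dictionary, the ARPAbet-style phoneme labels, and the function name `match_words` are assumptions and do not come from the slides.

```python
# Toy pronunciation dictionary: phoneme sequence -> candidate words.
# The phoneme labels are ARPAbet-style and chosen for illustration only.
PRONUNCIATIONS = {
    ("HH", "EY"): ["hey"],
    ("HH", "AW"): ["how"],
    ("AA", "R"): ["are"],
    ("Y", "UW"): ["you"],
    ("N", "OW"): ["know", "no"],  # homophones share one phoneme sequence
}


def match_words(phonemes):
    """Return the words whose stored pronunciation equals the phoneme sequence."""
    return PRONUNCIATIONS.get(tuple(phonemes), [])


print(match_words(["HH", "AW"]))  # ['how']
print(match_words(["N", "OW"]))   # ['know', 'no']
print(match_words(["Z", "Z"]))    # []
```

Note that one phoneme sequence can map to more than one word, so a lookup alone cannot always decide which word was spoken.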

Pattern Recognition
Pattern recognition works in two phases:
Pattern training
Comparison

Pattern training is modeled by an FSM (finite state machine). In simple terms, speech templates are created and stored.
In the comparison phase, the speaker's words are compared against the stored templates and verified:
If matched: accept
If not matched: reject
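
The two phases can be sketched as follows. This is a minimal sketch under stated assumptions: the slides model training with an FSM, whereas this example substitutes dynamic time warping (DTW) over one-dimensional feature sequences as the comparison measure. The class name `TemplateRecognizer`, the threshold, and the feature values are all made up for illustration.

```python
import math


def dtw_distance(a, b):
    """Dynamic time warping distance between two 1-D feature sequences."""
    # cost[i][j] = best alignment cost of a[:i] against b[:j]
    cost = [[math.inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[len(a)][len(b)]


class TemplateRecognizer:
    def __init__(self, threshold):
        self.threshold = threshold  # hypothetical rejection threshold
        self.templates = {}         # word -> stored feature sequence

    def train(self, word, features):
        """Training phase: store one template per word."""
        self.templates[word] = list(features)

    def recognize(self, features):
        """Comparison phase: return the best-matching word, or None to reject."""
        if not self.templates:
            return None
        word, dist = min(
            ((w, dtw_distance(features, t)) for w, t in self.templates.items()),
            key=lambda pair: pair[1],
        )
        return word if dist <= self.threshold else None


rec = TemplateRecognizer(threshold=2.0)
rec.train("yes", [1.0, 3.0, 2.0])
rec.train("no", [5.0, 5.0, 1.0])
print(rec.recognize([1.0, 1.0, 3.0, 2.0]))  # yes (a slower "yes" still matches)
print(rec.recognize([9.0, 9.0, 9.0]))       # None (no template is close enough)
```

DTW is used here because it tolerates differences in speaking speed; a fixed threshold like this one is exactly where differing accents would start to cause rejections.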

Pattern Recognition (contd.)

Model:

Problems: different accents can cause recognition errors.

Artificial Intelligence Approach

This approach overcomes some disadvantages of the template-based approach:
It maintains a knowledge base.
It automatically corrects words.
E.g. "What your name?" (error)

It also overcomes some problems of speaker variance and other constraints of speech,

e.g. culture, accent, etc.

Speech Recognition Model

Finite State Machine Based SR Model

It is a very simple approach. Two main stages are present:
The acceptor
The transducer

The acceptor is used for accepting or rejecting lexical items.

The transducer handles the transition from one set of words to another as the input grows.
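
A minimal sketch of the two stages, assuming a tiny command grammar ("open file", "open folder", "close file"). The states, words, and output symbols are hypothetical. The transducer is written in the textbook form, emitting one output symbol per transition, which is one possible reading of the description above rather than the deck's own design.

```python
# (state, input word) -> (next state, output symbol). All names are hypothetical.
TRANSITIONS = {
    ("start", "open"): ("verb_open", "OPEN"),
    ("start", "close"): ("verb_close", "CLOSE"),
    ("verb_open", "file"): ("done", "FILE"),
    ("verb_open", "folder"): ("done", "FOLDER"),
    ("verb_close", "file"): ("done", "FILE"),
}
FINAL_STATES = {"done"}


def transduce(words):
    """Transducer: follow the transitions and collect one output per word.

    Returns None if a word has no transition or the input ends too early.
    """
    state, outputs = "start", []
    for word in words:
        step = TRANSITIONS.get((state, word))
        if step is None:
            return None  # no transition for this word: reject
        state, symbol = step
        outputs.append(symbol)
    return outputs if state in FINAL_STATES else None


def accepts(words):
    """Acceptor: the same machine with the outputs ignored."""
    return transduce(words) is not None


print(accepts(["open", "file"]))       # True
print(accepts(["close", "folder"]))    # False
print(transduce(["open", "folder"]))   # ['OPEN', 'FOLDER']
```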

FSM-Based SR Model (contd.)

What if a match is ambiguous because two words sound the same?
"Know" and "no" sound the same. How do we overcome this problem?

Solution: we can attach weights to the candidates to improve recognition.
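
One way to picture the weights: score each candidate by how plausible it is after the previous word, and pick the highest. The weight table, the default value, and the function name `pick_candidate` are invented for this sketch; a real system would estimate such weights from data.

```python
# Hypothetical weights: how plausible each candidate is after the previous word.
CONTEXT_WEIGHTS = {
    ("i", "know"): 0.9,
    ("i", "no"): 0.1,
    ("say", "know"): 0.2,
    ("say", "no"): 0.8,
}
DEFAULT_WEIGHT = 0.5  # used when the pair is not in the table


def pick_candidate(previous_word, candidates):
    """Choose the homophone with the highest weight given the previous word."""
    return max(
        candidates,
        key=lambda word: CONTEXT_WEIGHTS.get((previous_word, word), DEFAULT_WEIGHT),
    )


print(pick_candidate("i", ["know", "no"]))    # know
print(pick_candidate("say", ["know", "no"]))  # no
```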

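Performance of Speech-Based Systems

The performance of speech-based systems is measured on two main bases:
WER (word error rate)
WRR (word recognition rate)

WER indicates how often words are recognized incorrectly: the number of substituted, deleted, and inserted words divided by the number of words actually spoken.

WRR is the word recognition rate, the complement of WER (WRR = 1 - WER).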
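
Both measures can be computed from a word-level edit distance, using the standard definition WER = (S + D + I) / N, where S, D and I count substitutions, deletions and insertions and N is the number of words in the reference. The function names below are chosen for this sketch.

```python
def word_errors(reference, hypothesis):
    """Minimum number of word substitutions, deletions and insertions."""
    ref, hyp = reference.split(), hypothesis.split()
    # dist[i][j] = edits needed to turn ref[:i] into hyp[:j]
    dist = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dist[i][0] = i  # delete every reference word
    for j in range(len(hyp) + 1):
        dist[0][j] = j  # insert every hypothesis word
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            substitution = 0 if ref[i - 1] == hyp[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j] + 1,                 # deletion
                dist[i][j - 1] + 1,                 # insertion
                dist[i - 1][j - 1] + substitution,  # match or substitution
            )
    return dist[len(ref)][len(hyp)]


def wer(reference, hypothesis):
    """Word error rate: errors divided by the number of reference words."""
    n = len(reference.split())
    if n == 0:
        raise ValueError("reference must contain at least one word")
    return word_errors(reference, hypothesis) / n


def wrr(reference, hypothesis):
    """Word recognition rate, the complement of WER."""
    return 1.0 - wer(reference, hypothesis)


print(wer("hey how are you", "hey how you"))  # 0.25 (one deleted word out of four)
print(wrr("hey how are you", "hey how you"))  # 0.75
```

Because insertions count as errors, WER can exceed 1 and WRR can become negative when the recognizer outputs many extra words.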

So what is new in this?

There is nothing new in speech recognition itself: it has developed from almost nothing to almost everything. Everyone is attracted to it and is developing lots of apps on it. This causes an integrity issue:
All apps are built from scratch.
There can be app conflicts (two different apps on the same computer): both apps wait for the same word and conflict on the same machine.
Licenses on these machines: an ordinary developer can do nothing but wait until an SDK arrives.

How can we solve this?

We combine both of these approaches:
Allow developers to build from scratch (this keeps them independent).
Provide a platform where their apps can work together.
So why not build a framework where users can build things easily, and from scratch as well? We do not lose anything, and we resolve the integrity issues.
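
To make the idea concrete, here is one possible reading of such a platform: a shared registry where independently built apps claim the phrases they listen for, so that a conflict is caught at registration time instead of at run time. The slides do not specify an API, so every name here (`SpeechFramework`, `register`, `dispatch`, `PhraseConflict`) is hypothetical.

```python
class PhraseConflict(Exception):
    """Raised when two apps try to claim the same phrase."""


class SpeechFramework:
    """Hypothetical shared registry sitting between the recognizer and the apps."""

    def __init__(self):
        self._handlers = {}  # phrase -> (app name, callback)

    def register(self, app_name, phrase, callback):
        """Let an app claim a phrase; refuse if another app already owns it."""
        key = phrase.lower()
        if key in self._handlers:
            owner = self._handlers[key][0]
            raise PhraseConflict(f"{phrase!r} is already claimed by {owner}")
        self._handlers[key] = (app_name, callback)

    def dispatch(self, recognized_phrase):
        """Route a recognized phrase to the single app that owns it."""
        entry = self._handlers.get(recognized_phrase.lower())
        if entry is None:
            return None  # no app is waiting for this phrase
        _, callback = entry
        return callback()


framework = SpeechFramework()
framework.register("player", "play", lambda: "player: playing")
print(framework.dispatch("Play"))  # player: playing

try:
    framework.register("game", "play", lambda: "game: starting")
except PhraseConflict as err:
    print(err)  # 'play' is already claimed by player
```

Each app is still written from scratch; only the claim on a phrase goes through the shared registry.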

What does this framework look like?

Notice how the integrity issue is resolved and apps are developed easily.

Results
Notice how the type of speech affects the accuracy.

Type of speech              Accuracy
Normal dictionary speech    50-90%
Choices (customized)        90%
Choices (general)           80%
Individual letters          30%
Customized phonetics        70%

Conclusion
Speech is a natural way of communication.
Numerous applications of speech are present.
There are various approaches, and each has its own pros and cons.
FSMs are one way to make the job easier and better.

There are lots of problems:
Recognition problems
Integrity issues
So we need a platform-independent framework that can solve these issues and make the life of speech developers easier.

References
[1] Weinstein, C.J. Military and government applications of human-machine communication by voice. Proceedings of the Natl. Acad. Sci. USA, Volume 92, pp. 10011-10016, October 1995.
[2] Dat Tat Tran. Fuzzy Approaches to Speech and Speaker Recognition. A thesis submitted for the degree of Doctor of Philosophy of the University of Canberra.
[3] R.K. Moore. Twenty things we still don't know about speech. Proc. CRIM/FORWISS Workshop on Progress and Prospects of Speech Research and Technology, 1994.
[4] Sadaoki Furui. 50 years of Progress in Speech and Speaker Recognition Research. ECTI Transactions on Computer and Information Technology, Vol. 1, No. 2, November 2005.
[5] Willie Walker et al. Sphinx-4: A Flexible Open Source Framework for Speech Recognition. http://cmusphinx.sourceforge.net/sphinx4
[6] M.A. Anusuya. Speech Recognition by Machine: A Review. (IJCSIS) International Journal of Computer Science and Information Security, Vol. 6, No. 3, 2009. http://arxiv.org/ftp/arxiv/papers/1001/1001.2267.pdf
[7] Neann Mathai. A Literature Survey of Speech Recognition and Hidden Markov Models. http://shenzi.cs.uct.ac.za/~honsproj/cgibin/view/2009/katz_mathai_sobey.zip/Speech_Katz_Mathai_Sobey/Downloads/NeannMathaiLiteratureSurvey.pdf
[8] Pavel Stemberk. Speech recognition based on FSM and HTK toolkits. http://stembep.wz.cz/!papers/Zilina-dt04/zildt04.pdf
[9] Steve Renals. Speech recognition. http://dsp-book.narod.ru/rec-notes.pdf
