Speech Recognition
By
Chhatbar Jay (14mecc03)
Lokender Sekhawat (14mecc08)
What is Speech Recognition?
Speech recognition is the ability of a
machine or program to identify words and
phrases in spoken language and convert
them to a machine-readable format.
Also known as automatic speech recognition (ASR) or computer speech recognition, it allows a computer to understand voice input and perform the required task.
Speech Recognition (SR) is the ability to translate dictation or spoken words into text.
Where can it be used?
Dictation
System control/navigation
Commercial/Industrial applications
Voice dialing
Block diagram of speech
recognition
Speech Modeling
Acoustic Model
An acoustic model is created by taking audio recordings of speech and their text transcriptions, and using software to create statistical representations of the sounds that make up each word. It is used by a speech recognition engine to recognize speech.
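As a rough illustration (not from the slides), the sketch below assumes MFCC-style feature frames and their aligned phone labels are already available, and fits one Gaussian mixture per phone as a stand-in for those statistical representations.

```python
# Minimal sketch: per-phone Gaussian mixtures as the "statistical
# representations" of sounds. Assumes feature frames (e.g. MFCCs) have
# already been extracted and aligned to phone labels from transcriptions.
import numpy as np
from sklearn.mixture import GaussianMixture

def train_acoustic_model(frames, phone_labels, n_components=4):
    """frames: (N, D) feature array; phone_labels: N phone names."""
    labels = np.asarray(phone_labels)
    model = {}
    for phone in set(phone_labels):
        gmm = GaussianMixture(n_components=n_components, covariance_type="diag")
        model[phone] = gmm.fit(frames[labels == phone])
    return model

def most_likely_phone(model, frame):
    """Score one feature frame against every phone's mixture."""
    return max(model, key=lambda p: model[p].score(frame.reshape(1, -1)))
```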
Language Model
Language modeling is used in many natural language processing applications, such as speech recognition. A language model tries to capture the properties of a language and to predict the next word in a speech sequence.
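A toy illustration of next-word prediction (the corpus and phrases below are hypothetical), using simple bigram counts:

```python
# Minimal bigram language model: count word pairs in a corpus and use
# the counts to predict the most likely next word in a spoken sequence.
from collections import Counter, defaultdict

def train_bigram_lm(sentences):
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = ["<s>"] + sentence.lower().split()
        for prev, word in zip(words, words[1:]):
            counts[prev][word] += 1
    return counts

def predict_next(counts, word):
    """Return the most frequently seen follower of `word`, if any."""
    followers = counts.get(word.lower())
    return followers.most_common(1)[0][0] if followers else None

lm = train_bigram_lm(["switch on the tv", "switch off the lamp"])
print(predict_next(lm, "switch"))   # -> "on" (first of the top-count followers)
```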
TYPES OF VOICE
RECOGNITION
There are two types of speech recognition. One is called speaker-dependent and the other is speaker-independent. Speaker-dependent software is commonly used for dictation, while speaker-independent software is more commonly found in telephone applications.
Speaker-dependent software works by learning the
unique characteristics of a single person's voice, in a
way similar to voice recognition. New users must first
train the software by speaking to it, so the computer
can analyze how the person talks. This often means
users have to read a few pages of text to the computer
before they can use the speech recognition software.
TYPES OF VOICE RECOGNITION (CONTD.)
Speaker-independent software is designed to
recognize anyone's voice, so no training is involved.
This means it is the only real option for applications
such as interactive voice response systems where
businesses can't ask callers to read pages of text
before using the system. The downside is that
speaker-independent software is generally less
accurate than speaker-dependent software.
Speech recognition engines that are speaker
independent generally deal with this fact by limiting
the grammars they use. By using a smaller list of
recognized words, the speech engine is more likely
to correctly recognize what a speaker said.
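The sketch below illustrates this grammar-limiting idea with a hypothetical phrase list: a raw (possibly misrecognized) hypothesis is snapped to the closest allowed phrase.

```python
# Minimal sketch of a restricted grammar: pick the closest entry from a
# short list of allowed phrases instead of searching an open vocabulary.
import difflib

GRAMMAR = ["switch on channel nine", "turn off the tv", "dim the lamp"]

def recognize(raw_hypothesis, grammar=GRAMMAR, cutoff=0.6):
    """Return the allowed phrase closest to the raw hypothesis, if any."""
    matches = difflib.get_close_matches(raw_hypothesis.lower(), grammar,
                                        n=1, cutoff=cutoff)
    return matches[0] if matches else None

print(recognize("switch on channel nein"))   # -> "switch on channel nine"
```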
How do humans do it?
Articulation produces
sound waves which
the ear conveys to the brain
for processing
How might computers do it?
Acoustic waveform
Acoustic signal
Digitization
Acoustic analysis of the speech
signal
Language interpretation
Speech recognition
DIFFERENT PROCESSES
INVOLVED
Digitization
Converting analogue signal into digital
representation
Signal processing
Separating speech from background noise
Phonetics
Variability in human speech
Phonology
Recognizing individual sound distinctions
(similar phonemes)
Phonology is the systematic use of sound to encode
meaning in any spoken human language
DIFFERENT PROCESSES
INVOLVED(CONTD.)
Lexicology and syntax
Lexicology is the part of linguistics which
studies words: their nature and meaning,
their elements, relations between words,
word groups, and the whole lexicon.
Semantics and pragmatics
Semantics deals with the meaning of words and sentences
Pragmatics is concerned with bridging the
explanatory gap between sentence meaning
and speaker's meaning
Digitization
Analogue to digital conversion
Sampling and quantizing
Sampling is converting a continuous signal into a discrete signal
Quantizing is the process of approximating a continuous range of
values by a finite set of discrete levels (see the code sketch below)
Use filters to measure energy levels for various
points on the frequency spectrum
Knowing the relative importance of different
frequency bands (for speech) makes this
process more efficient
E.g. high frequency sounds are less informative,
so can be sampled using a broader bandwidth
(log scale)
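A small sketch of sampling and quantizing (the sample rate and bit depth below are illustrative choices, not taken from the slides):

```python
# Minimal sketch of digitization: sample a continuous signal at a fixed
# rate and quantize each sample to a fixed number of bits (numpy only).
import numpy as np

def digitize(signal_fn, duration_s, sample_rate=8000, bits=8):
    """signal_fn: maps time in seconds to amplitude in [-1, 1]."""
    times = np.arange(0, duration_s, 1.0 / sample_rate)      # sampling
    samples = np.array([signal_fn(t) for t in times])
    levels = 2 ** bits
    quantized = np.round((samples + 1) / 2 * (levels - 1))   # quantizing
    return quantized.astype(np.int32)

# Example: a 440 Hz tone sampled at 8 kHz with 8-bit resolution.
tone = digitize(lambda t: np.sin(2 * np.pi * 440 * t), duration_s=0.01)
print(len(tone), tone[:5])   # 80 samples, each an integer in 0..255
```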
Separating speech from background
noise
Noise cancelling microphones
Two mics, one facing speaker, the other facing
away
Ambient noise is roughly the same for both mics
Knowing which bits of the signal relate to
speech
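A minimal sketch of the two-microphone idea, assuming the rear microphone hears mostly the ambient noise (the gain factor is a hypothetical calibration value):

```python
# Minimal sketch of two-microphone noise cancellation: the rear-facing
# microphone picks up mostly ambient noise, so subtracting it from the
# front (speaker-facing) signal suppresses the shared background.
import numpy as np

def cancel_noise(front_mic, rear_mic, noise_gain=1.0):
    """Both inputs are equal-length sample arrays; noise_gain is an
    assumed calibration for how loudly the rear mic hears the noise."""
    front = np.asarray(front_mic, dtype=float)
    rear = np.asarray(rear_mic, dtype=float)
    return front - noise_gain * rear
```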
Process of speech recognition
[Block diagram, repeated over several slides: utterances from speakers S1, S2, ..., SK, ..., SN (e.g. "Switch on Channel 9") pass through Speaker Recognition and Speech Recognition blocks into parsing and arbitration.]
Who is speaking? Speaker recognition handles authentication, identifying the user (e.g. Annie, David, Cathy).
What is he saying? Speech recognition handles understanding, spotting command words such as On, Off, TV, Fridge, Door.
What is he talking about? Parsing and arbitration handle inferring and execution, mapping the recognized words (e.g. "Switch, to, channel, nine") to a target device with rules such as Channel -> TV, Dim -> Lamp, On -> TV/Lamp.
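A minimal sketch of that inferring/arbitration step, using keyword rules that mirror the illustrative examples in the diagram (Channel -> TV, Dim -> Lamp):

```python
# Minimal sketch: map recognized words to a target device using simple
# keyword rules (the rules echo the examples in the diagram above).
RULES = {"channel": "TV", "dim": "Lamp"}

def arbitrate(words):
    """Return the inferred device and the full command string."""
    words = [w.lower() for w in words]
    device = next((RULES[w] for w in words if w in RULES), None)
    return device, " ".join(words)

print(arbitrate("Switch to channel nine".split()))
# -> ("TV", "switch to channel nine")
```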
Framework of Voice Recognition
[Block diagram: the same framework generalizes to inputs S1, S2, ..., SK, ..., SN processed by Face Recognition and Gesture Recognition blocks feeding parsing and arbitration, again covering authentication, understanding, and inferring and execution.]
Speaker Recognition
Definition
It is the method of recognizing a person based on his voice
It is one of the forms of biometric identification
Depends on speaker-specific characteristics.
Generic Speaker Recognition System
[Block diagram: the speech signal is preprocessed and split into analysis frames; feature extraction produces feature vectors; pattern matching against a stored speaker model yields a score.]
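A minimal sketch of the pattern-matching/score step, assuming each enrolled speaker is stored as a single averaged feature template (cosine similarity stands in for the scoring block in the diagram):

```python
# Minimal sketch: compare an utterance's averaged feature vector against
# one stored template per enrolled speaker and return the best match.
import numpy as np

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def identify_speaker(utterance_frames, speaker_models):
    """utterance_frames: (N, D) features; speaker_models: name -> (D,) template."""
    test_vector = np.mean(utterance_frames, axis=0)
    scores = {name: cosine(test_vector, template)
              for name, template in speaker_models.items()}
    return max(scores, key=scores.get), scores
```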
ADVANTAGES
Helps people with disabilities use computers and devices
Organizations - Increases productivity, reduces costs and
errors.
Lower operational costs
Advances in technology will allow consumers and
businesses to implement speech recognition systems at
a relatively low cost.
Cell-phone users can dial pre-programmed numbers by voice
command.
Users can trade stocks through a voice-activated trading
system.
Speech recognition technology can also replace touch-tone
dialing, resulting in the ability to serve customers who speak
different languages
DISADVANTAGES
Difficult to build a perfect system.
Conversations involve more than just words (non-verbal
communication, stutters, etc.)
Every human being differs in voice, mouth shape, and
speaking style.
Filtering background noise is a task that can even be
difficult for humans to accomplish.
Future of Speech Recognition
Accuracy will become better and better.
Dictation speech recognition will gradually become
accepted.
Small hand-held writing tablets for computer speech
recognition dictation and data entry will be
developed, as faster processors and more memory
become available.
Greater use will be made of "intelligent systems"
which will attempt to guess what the speaker
intended to say, rather than what was actually said,
as people often misspeak and make unintentional
mistakes.
Microphone and sound systems will be designed to
adapt more quickly to changing background noise
levels and different environments, and to better
recognize and discard extraneous material.
THANK YOU