Language Models
Session-19
Dr. Subhra Rani Patra
SCOPE, VIT Chennai
Context Sensitive Spelling Correction
Probabilistic Language Modeling
Applications
• In OCR (Optical Character Recognition)
Can predict a word that is not easily readable in the given image
• Correcting a sentence
If we write “Deer Sir” instead of “Dear Sir”
• Speech recognition
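The "Deer Sir" correction above can be sketched with a bigram language model: score each candidate sentence and keep the one the model prefers. The tiny corpus below is an illustrative assumption, not real data.

```python
# Hedged sketch: context-sensitive spelling correction by comparing
# bigram probabilities of candidate phrases under a toy corpus.
# For simplicity, sentence boundaries are ignored when counting.
from collections import Counter

corpus = [
    "dear sir i am writing to you",
    "dear madam thank you",
    "a deer ran across the road",
]
tokens = " ".join(corpus).split()
unigrams = Counter(tokens)
bigrams = Counter(zip(tokens, tokens[1:]))

def bigram_prob(w1, w2):
    # MLE estimate P(w2 | w1) = C(w1 w2) / C(w1); 0 if unseen
    return bigrams[(w1, w2)] / unigrams[w1] if unigrams[w1] else 0.0

# "dear sir" should outscore "deer sir" given this corpus
print(bigram_prob("dear", "sir"), bigram_prob("deer", "sir"))
```

Here the model prefers "dear sir" because that bigram was observed, while "deer sir" never was.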
Language Model
• Language Model (LM)
• A language model is a probability distribution over entire sentences or texts
• N-gram: unigrams, bigrams, trigrams,…
• In a simple n-gram language model, the probability of a word is conditioned
on some fixed number of previous words
• The items modeled can be:
• Phonemes
• Syllables
• Letters
• Words
• Anything else depending on the application.
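Whatever the item type, an n-gram is just a sliding window of n consecutive items. A minimal sketch over word tokens, assuming plain whitespace tokenization (real systems use proper tokenizers):

```python
# Minimal sketch: extracting word n-grams from a token sequence.
def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) over a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

tokens = "I want to eat Chinese food".split()
print(ngrams(tokens, 1))  # unigrams
print(ngrams(tokens, 2))  # bigrams
print(ngrams(tokens, 3))  # trigrams
```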
How do we train these models?
Very large corpora
• Corpora are online collections of text and speech
• Brown Corpus
• Wall Street Journal
• AP newswire
• Hansards
• TIMIT
• DARPA/NIST text/speech corpora (Call Home, Call Friend, ATIS, Switchboard,
Broadcast News, Broadcast Conversation, TDT, Communicator)
• TRAINS, Boston Radio News Corpus
Computing P(W)
The Chain Rule
Probability of words in sentences
Estimating these probability values
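The chain rule decomposes a sentence probability into a product of conditional probabilities, P(w1..wn) = ∏ P(wi | w1..wi-1). A sketch with made-up illustrative conditionals (not estimates from any corpus):

```python
# Chain rule sketch: P(w1..wn) = prod over i of P(wi | w1..wi-1).
# Each dictionary entry maps a prefix to the probability of its last
# word given the preceding words; the numbers are assumptions.
probs = {
    ("its",): 0.25,                # P(its)
    ("its", "water"): 0.10,        # P(water | its)
    ("its", "water", "is"): 0.50,  # P(is | its water)
}
sentence = ["its", "water", "is"]
p = 1.0
for i in range(len(sentence)):
    p *= probs[tuple(sentence[:i + 1])]
print(p)  # 0.25 * 0.10 * 0.50
```

In practice these full-history conditionals cannot be estimated reliably, which motivates the Markov assumption below.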
Markov Assumption
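The Markov assumption replaces the full history with only its last k words: a bigram model uses k = 1, a trigram model k = 2. A minimal sketch of the context truncation:

```python
# Markov assumption sketch: instead of conditioning on the full
# history w1..wi-1, condition only on the last k words.
def markov_context(history, k=1):
    """Truncate the history to its last k words (the Markov assumption)."""
    return tuple(history[-k:])

history = ["I", "want", "to", "eat"]
print(markov_context(history, 1))  # bigram context: ("eat",)
print(markov_context(history, 2))  # trigram context: ("to", "eat")
```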
N-gram Models
Estimating N-gram Probabilities
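Bigram probabilities are estimated by maximum likelihood from counts: P(w2 | w1) = C(w1 w2) / C(w1). A sketch over a toy corpus (an assumption for illustration):

```python
# MLE estimation of bigram probabilities from raw counts:
# P(w2 | w1) = C(w1 w2) / C(w1). The two sentences are toy data.
from collections import Counter

sentences = [["<s>", "i", "want", "food", "</s>"],
             ["<s>", "i", "want", "tea", "</s>"]]
unigram = Counter(w for s in sentences for w in s)
bigram = Counter(b for s in sentences for b in zip(s, s[1:]))

def p_mle(w1, w2):
    """Maximum-likelihood bigram probability."""
    return bigram[(w1, w2)] / unigram[w1]

print(p_mle("i", "want"))     # C(i want) / C(i) = 2/2 = 1.0
print(p_mle("want", "food"))  # C(want food) / C(want) = 1/2 = 0.5
```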
An Example
Bigram Counts for 9222 Restaurant Sentences
Computing Bigram Probabilities
Computing Sentence Probabilities
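Under a bigram model, a sentence probability is the product of P(wi | wi-1) over the sentence, with boundary markers `<s>` and `</s>` added. The probability table below is illustrative (in the style of restaurant-corpus estimates), not actual counts:

```python
# Sentence probability under a bigram model: multiply bigram
# probabilities across the padded sentence. The table entries
# are illustrative assumptions, not real corpus estimates.
bigram_p = {
    ("<s>", "i"): 0.25,
    ("i", "want"): 0.33,
    ("want", "chinese"): 0.0065,
    ("chinese", "food"): 0.52,
    ("food", "</s>"): 0.68,
}

def sentence_prob(words):
    padded = ["<s>"] + words + ["</s>"]
    p = 1.0
    for w1, w2 in zip(padded, padded[1:]):
        p *= bigram_p.get((w1, w2), 0.0)  # unseen bigram -> probability 0
    return p

print(sentence_prob(["i", "want", "chinese", "food"]))
```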
What Knowledge Does an N-gram Represent?
Practical Issues
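One key practical issue: multiplying many small probabilities underflows floating point, so implementations sum log probabilities instead. A sketch (the probabilities are the illustrative values used above):

```python
# Practical issue: products of many small probabilities underflow,
# so work in log space and sum instead of multiplying.
import math

probs = [0.25, 0.33, 0.0065, 0.52, 0.68]  # illustrative bigram probs
log_p = sum(math.log(p) for p in probs)
print(log_p)            # sentence log-probability
print(math.exp(log_p))  # equals the direct product
```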
Google N-grams
Example from the 4-gram data
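The released Google n-gram data is distributed as plain text, one n-gram per line with a tab-separated count. A hedged sketch of parsing lines in that format (the sample lines and counts are from the commonly quoted 4-gram excerpt and are assumptions here):

```python
# Sketch: parsing n-gram data in the "w1 w2 ... wn<TAB>count" format
# used by the Google n-gram releases. Sample lines are illustrative.
lines = [
    "serve as the incoming\t92",
    "serve as the index\t223",
]
ngram_counts = {}
for line in lines:
    ngram, count = line.rsplit("\t", 1)
    ngram_counts[tuple(ngram.split())] = int(count)
print(ngram_counts[("serve", "as", "the", "incoming")])
```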