
Machine Learning

Exercises: language models (n-grams)


Laura Kallmeyer

Summer 2016, Heinrich-Heine-Universität Düsseldorf

Exercise 1 Consider the following toy example (similar to the one from Jurafsky & Martin (2015)):
Training data:
<s> I am Sam </s>
<s> Sam I am </s>
<s> Sam I like </s>
<s> Sam I do like </s>
<s> do I like Sam </s>

Assume that we use a bigram language model based on the above training data.

1. What is the most probable next word predicted by the model for the following word sequences?

(1) <s> Sam . . .


(2) <s> Sam I do . . .
(3) <s> Sam I am Sam . . .
(4) <s> do I like . . .

2. Which of the following sentences is better, i.e., gets a higher probability with this model?

(5) <s> Sam I do I like </s>


(6) <s> Sam I am </s>
(7) <s> I do like Sam I am </s>

Solution:
Bigram probabilities:
P(Sam|<s>) = 3/5      P(I|<s>) = 1/5
P(I|Sam) = 3/5        P(</s>|Sam) = 2/5
P(Sam|am) = 1/2       P(</s>|am) = 1/2
P(am|I) = 2/5         P(like|I) = 2/5      P(do|I) = 1/5
P(Sam|like) = 1/3     P(</s>|like) = 2/3
P(like|do) = 1/2      P(I|do) = 1/2

1. (1) and (3): “I”.


(2): “I” and “like” are equally probable.
(4): </s>
2. Probabilities:
(5): 3/5 · 3/5 · 1/5 · 1/2 · 2/5 · 2/3 = 36/3750 = 0.0096
(6): 3/5 · 3/5 · 2/5 · 1/2 = 18/250 = 0.072
(7): 1/5 · 1/5 · 1/2 · 1/3 · 3/5 · 2/5 · 1/2 = 6/7500 = 0.0008
(6) is the most probable sentence according to our language model.
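
The solution above can also be checked mechanically. The following minimal Python sketch (not part of the original exercise; helper names such as bigram_prob and sentence_prob are my own) derives the maximum-likelihood bigram estimates from the training data and scores the three candidate sentences:

from collections import Counter

# Training data from Exercise 1; <s> and </s> are treated as ordinary tokens.
corpus = [
    "<s> I am Sam </s>",
    "<s> Sam I am </s>",
    "<s> Sam I like </s>",
    "<s> Sam I do like </s>",
    "<s> do I like Sam </s>",
]

unigrams, bigrams = Counter(), Counter()
for sentence in corpus:
    tokens = sentence.split()
    unigrams.update(tokens)
    bigrams.update(zip(tokens, tokens[1:]))

def bigram_prob(prev, word):
    # Maximum-likelihood estimate: P(word | prev) = C(prev word) / C(prev).
    return bigrams[(prev, word)] / unigrams[prev]

def sentence_prob(sentence):
    # Probability of a sequence = product of its bigram probabilities.
    tokens = sentence.split()
    prob = 1.0
    for prev, word in zip(tokens, tokens[1:]):
        prob *= bigram_prob(prev, word)
    return prob

for s in ("<s> Sam I do I like </s>",       # (5) -> 0.0096
          "<s> Sam I am </s>",              # (6) -> 0.072
          "<s> I do like Sam I am </s>"):   # (7) -> 0.0008
    print(s, sentence_prob(s))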
Exercise 2 Consider again the same training data and the same bigram model. Compute the perplexity
of

<s> I do like Sam

Solution:
The probability of this sequence is 1/5 · 1/5 · 1/2 · 1/3 = 1/150.

The perplexity is then (1/150)^(-1/4) = 150^(1/4) ≈ 3.5.
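
This continues the sketch from Exercise 1 (reusing sentence_prob and the counters defined there); it assumes, as in the solution above, that the perplexity is normalized over the N = 4 bigram transitions:

# Perplexity PP(W) = P(W)^(-1/N), with N the number of bigram transitions.
seq = "<s> I do like Sam"
N = len(seq.split()) - 1     # 4 bigrams
p = sentence_prob(seq)       # 1/5 * 1/5 * 1/2 * 1/3 = 1/150
print(p ** (-1 / N))         # fourth root of 150, ~3.5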

Exercise 3 Take again the same training data. This time, we use a bigram LM with Laplace smoothing.

1. Give the following bigram probabilities estimated by this model:


P (do|<s>) P (do|Sam) P (Sam|<s>) P (Sam|do)
P (I|Sam) P (I|do) P (like|I)
Note that for each word w_{n-1}, we count an additional bigram for each possible continuation w_n. Consequently, we have to take into consideration not only the words but also the symbol </s>.
2. Calculate the probabilities of the following sequences according to this model:

(8) <s> do Sam I like


(9) <s> Sam do I like

Which of the two sequences is more probable according to our LM?

Solution:

1. If we include </s> (which can also appear as the second element of a bigram), we get |V| = 6 for our
vocabulary.
P(do|<s>) = 2/11     P(do|Sam) = 1/11     P(Sam|<s>) = 4/11     P(Sam|do) = 1/8
P(I|Sam) = 4/11      P(I|do) = 2/8        P(like|I) = 3/11
2. (8): 2/11 · 1/8 · 4/11 · 3/11 = 24/(8 · 11^3)
(9): 4/11 · 1/11 · 2/8 · 3/11 = 24/(8 · 11^3)
The two sequences are equally probable.
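
The smoothed model fits into the same sketch. The extension below (again reusing the unigram and bigram counters from Exercise 1; laplace_prob is an illustrative name of my own) reproduces the probabilities above:

# Laplace smoothing: P(word | prev) = (C(prev word) + 1) / (C(prev) + |V|).
# <s> is excluded from V because it never occurs as the second element
# of a bigram, so |V| = 6: {I, am, Sam, like, do, </s>}.
vocab = set(unigrams) - {"<s>"}
V = len(vocab)

def laplace_prob(prev, word):
    return (bigrams[(prev, word)] + 1) / (unigrams[prev] + V)

def laplace_sentence_prob(sentence):
    tokens = sentence.split()
    prob = 1.0
    for prev, word in zip(tokens, tokens[1:]):
        prob *= laplace_prob(prev, word)
    return prob

print(laplace_prob("<s>", "do"))                    # 2/11
print(laplace_prob("do", "Sam"))                    # 1/8
print(laplace_sentence_prob("<s> do Sam I like"))   # 24 / (8 * 11**3)
print(laplace_sentence_prob("<s> Sam do I like"))   # same value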

References
Jurafsky, Daniel & James H. Martin. 2015. Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition. Draft of the 3rd edition.
