DSAI 514 – Statistical Inference
Point Estimation
Instructor: Ş. Betül Özateş
Boğaziçi University
19/02/2025
Devore, Jay L., Kenneth N. Berk, and Matthew A. Carlton (2012). Modern Mathematical Statistics with Applications. Vol. 285. New York: Springer,
and https://medium.com/@hosamedwee/kullback-leibler-kl-divergence-with-examples-part-2-9123bff5dc10
Last Lecture: KL Divergence
• The KL Divergence of Q from P, denoted DKL(P||Q), is a measure of the information lost when Q is used to approximate P.
• P is the true distribution, and Q is the estimated distribution.
• P(x) is the probability of event x according to the true distribution.
• This term is used as a weighting factor, meaning events that are more
probable in the first distribution have a larger impact on the divergence.
• If an event is highly probable in P but not in Q, we want that to
contribute more to our divergence measure.
• If an event is not very likely in P, we don’t want it to contribute much to
our divergence measure, even if Q assigns it a high probability.
• This is because we’re measuring the divergence from P to Q, not the other way
around.
Last Lecture: KL Divergence
• log(P(x)/Q(x)) is the ratio of the probabilities assigned to event x by P and Q.
• If P and Q assign the same probability to x, then this ratio is 1, and the
logarithm of 1 is 0, so events that P and Q agree on don’t contribute to
the divergence.
• If P assigns more probability to x than Q does, then this ratio is greater
than 1, and the logarithm is positive, so this event contributes to the
divergence.
• If P assigns less probability to x than Q does, then this ratio is less than
1, and the logarithm is negative, but remember that we’re multiplying
this by P(x), so events that P assigns low probability to don’t contribute
much to the divergence.
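A minimal numeric sketch of these three cases in Python (the probabilities below are invented purely for illustration) shows the sign of the log ratio and how the P(x) weight damps the contribution of events that are unlikely under P:

import math

# Hypothetical per-event probabilities, chosen only to illustrate the three cases.
cases = [
    ("P and Q agree",    0.40, 0.40),  # ratio = 1  -> log ratio is 0
    ("P larger than Q",  0.40, 0.10),  # ratio > 1  -> log ratio is positive
    ("P smaller than Q", 0.05, 0.40),  # ratio < 1  -> log ratio is negative, but P(x) is small
]

for label, p, q in cases:
    log_ratio = math.log(p / q)   # sign of the per-event log ratio
    term = p * log_ratio          # contribution to the divergence after weighting by P(x)
    print(f"{label:18s}  log(P/Q) = {log_ratio:+.3f}   P(x)*log(P/Q) = {term:+.3f}")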
Last Lecture: KL Divergence
• P(x) log(P(x)/Q(x)):
For each outcome, we calculate how much probability P assigns to it,
and then multiply it by the log of the ratio of the probabilities P and Q
assign to it.
- This ratio tells us how much P and Q differ on this particular outcome.
• Finally, we sum over all possible outcomes. This gives us a single
number that represents the total difference between P and Q.
• This means that the KL Divergence is a weighted sum of the log difference in
probabilities, where the weights are the probabilities according to the first
distribution.
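A minimal Python sketch of this weighted sum, assuming two discrete distributions given as dictionaries over the same outcomes (with Q(x) > 0 wherever P(x) > 0), could look like this:

import math

def kl_divergence(p, q):
    """D_KL(P || Q) for discrete distributions given as {outcome: probability} dicts."""
    total = 0.0
    for x, px in p.items():
        if px == 0:
            continue  # by convention, 0 * log(0 / q) contributes nothing
        total += px * math.log(px / q[x])  # weight the log ratio by P(x)
    return total

# Example: a fair coin (P) approximated by a heavily biased coin (Q).
P = {"heads": 0.5, "tails": 0.5}
Q = {"heads": 0.9, "tails": 0.1}
print(kl_divergence(P, Q))  # ≈ 0.511

Using the natural log gives the divergence in nats; using log base 2 would give bits instead.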
KL Divergence: Movie Recommendation System – Toy Example
• Let’s say we have a user who has rated 5 movies in the past. The ratings are on a scale
of 1 to 5, with 5 being the highest. Here are the ratings:
• Movie A: 5
• Movie B: 4
• Movie C: 5
• Movie D: 1
• Movie E: 2
• We can consider these ratings as the “true” distribution of the user’s preferences.
• Now, let’s say our recommendation system has some features for each movie (like
genre, director, actors, etc.). Based on these features, it predicts the following ratings
for the user’s preferences:
• Movie A: 4
• Movie B: 3
• Movie C: 5
• Movie D: 2
• Movie E: 3
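One way to make these ratings comparable as distributions, purely as an assumption of this sketch, is to normalize each set of ratings so it sums to 1 and then compute the divergence of the predicted distribution from the true one:

import math

true_ratings      = {"A": 5, "B": 4, "C": 5, "D": 1, "E": 2}  # user's actual ratings
predicted_ratings = {"A": 4, "B": 3, "C": 5, "D": 2, "E": 3}  # system's predicted ratings

def normalize(ratings):
    """Turn raw ratings into a probability distribution over the movies."""
    total = sum(ratings.values())
    return {movie: r / total for movie, r in ratings.items()}

P = normalize(true_ratings)       # "true" preference distribution
Q = normalize(predicted_ratings)  # estimated preference distribution

d_kl = sum(px * math.log(px / Q[x]) for x, px in P.items())
print(f"D_KL(P || Q) = {d_kl:.3f} nats")  # ≈ 0.045 for this toy example

A small value here would indicate that the predicted ratings lose little information about the user's true preferences; a larger value would signal a worse approximation.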