Introduction to Probabilistic Inference
Probabilistic inference is the process of deducing the probabilities of certain outcomes or
parameters given observed data, within the framework of probability theory. It forms the
cornerstone of many fields such as statistics, machine learning, artificial intelligence, and data
science. By leveraging probabilistic models, we can make informed decisions, predict future
events, and understand underlying patterns in data.
Importance of Probabilistic Inference
In many real-world scenarios, we are faced with uncertainty due to incomplete or noisy data.
Probabilistic inference allows us to quantify this uncertainty and make predictions or decisions
accordingly. It provides a principled way to update our beliefs in light of new evidence, ensuring
that our conclusions are grounded in both prior knowledge and observed data.
Applications of Probabilistic Inference
Probabilistic inference is widely used across real-world problems that involve estimating
unobserved variables from observed data. Examples include:
Application | Observed variable | Unobserved variable
Climate science | earth observations | climate forecast
Autonomous driving | image pixel values | pedestrians and vehicles present
Movie recommendation | ratings of watched films | ratings of unwatched films
Medicine | genome (DNA) | susceptibility to genetic diseases
Fundamental Concepts
Sum Rule
The sum rule, also known as the marginalization rule, allows us to compute the marginal
probability of a random variable by summing (or integrating) over all possible values of another
variable:
For discrete variables:
p(x) = ∑_y p(x, y)
For continuous variables:
p(x) = ∫ p(x, y) dy
Product Rule
The product rule expresses the joint probability of two events as the product of a conditional
probability and a marginal probability:
p(x, y) = p(x) p(y|x) = p(y) p(x|y)
These two rules form the basis for all probabilistic reasoning and inference.
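As a minimal numerical sketch of both rules, consider a hypothetical joint distribution over two binary variables (the numbers below are arbitrary illustrative values):

```python
import numpy as np

# Hypothetical joint distribution p(x, y) over two binary variables.
# Rows index x, columns index y; the entries are arbitrary but sum to one.
p_xy = np.array([[0.3, 0.1],
                 [0.2, 0.4]])

# Sum rule: marginalize out y to obtain p(x).
p_x = p_xy.sum(axis=1)                      # [0.4, 0.6]

# Product rule: p(x, y) = p(x) p(y|x), recovered from the marginal and conditional.
p_y_given_x = p_xy / p_x[:, None]           # conditional table p(y|x)
assert np.allclose(p_x[:, None] * p_y_given_x, p_xy)

print("p(x) =", p_x)
```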
Bayes' Theorem
Bayes' theorem is derived from the product rule and provides a way to update our beliefs about
the parameters or hypotheses in light of new data:
p(θ|D) = p(D|θ) p(θ) / p(D)
Posterior: p(θ ∣ D): Probability of parameters θ given data D.
Likelihood: p(D ∣ θ): Probability of data D given parameters θ.
Prior: p(θ): Initial probability of parameters θ.
Marginal Likelihood: p(D): Probability of data D.
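As a small sketch of how these four quantities relate, the snippet below applies Bayes' theorem to a toy problem with two candidate parameter values; the prior and likelihood numbers are assumed purely for illustration.

```python
import numpy as np

# Two candidate parameter values theta_0, theta_1 and a single observed dataset D.
prior = np.array([0.5, 0.5])          # p(theta): assumed uniform prior
likelihood = np.array([0.8, 0.2])     # p(D | theta): assumed likelihood of D under each value

evidence = np.sum(likelihood * prior)         # p(D), the marginal likelihood
posterior = likelihood * prior / evidence     # p(theta | D) via Bayes' theorem

print("p(D) =", evidence)             # 0.5
print("p(theta | D) =", posterior)    # [0.8, 0.2]
```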
Bayesian Inference
Bayesian inference is a statistical method that applies Bayes' theorem to update the probability
for a hypothesis as more evidence or information becomes available.
Learning: Parameter Estimation
In Bayesian learning, we estimate the parameters θ of a model m given observed data D:
p(θ|D, m) = p(D|θ, m) p(θ|m) / p(D|m)
Posterior: p(θ ∣ D, m): Updated belief about parameters after observing data.
Likelihood: p(D ∣ θ, m): Probability of observing data D given parameters θ.
Prior: p(θ ∣ m): Initial belief about parameters before observing data.
Evidence: p(D ∣ m): Probability of observing data under model m.
Explanation
Posterior represents what we know about the parameters after seeing the data.
Likelihood encapsulates what the data tells us about the parameters.
Prior reflects what we knew (or assumed) before observing the data.
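A concrete (if standard) instance of this update is the Beta-Bernoulli model, used here only as an illustration; the prior parameters and data below are made up.

```python
import numpy as np
from scipy import stats

# Model m: coin flips with unknown bias theta, prior p(theta | m) = Beta(a0, b0).
a0, b0 = 1.0, 1.0                        # uniform prior over theta (assumed)
data = np.array([1, 0, 1, 1, 0, 1])      # hypothetical observations D

# By conjugacy, the posterior p(theta | D, m) is Beta(a0 + #heads, b0 + #tails).
a_post = a0 + data.sum()
b_post = b0 + len(data) - data.sum()
posterior = stats.beta(a_post, b_post)

print("posterior mean of theta:", posterior.mean())   # 5/8 = 0.625
```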
Prediction: Predictive Distribution
Once we have the posterior distribution of the parameters, we can make predictions about new
data x*:
p(x*|D, m) = ∫ p(x*|θ, m) p(θ|D, m) dθ
Predictive Distribution: p(x* ∣ D, m): Probability of future observations given past data.
Integrating Over Parameters: We average over all possible parameter values, weighted
by their posterior probability.
Interpretation
We average all possible predictions p(x* ∣ θ, m), weighting each by how plausible θ is given the
observed data D. This approach naturally incorporates uncertainty in the parameter estimates
into our predictions.
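Continuing the hypothetical Beta-Bernoulli sketch above, the predictive probability of the next observation can be approximated by numerically averaging the likelihood over the posterior:

```python
import numpy as np
from scipy import stats

a_post, b_post = 5.0, 3.0                       # posterior parameters from the previous sketch
posterior = stats.beta(a_post, b_post)

# p(x* = 1 | D, m) = ∫ p(x* = 1 | theta) p(theta | D, m) dtheta, approximated on a grid.
theta = np.linspace(1e-6, 1 - 1e-6, 10_000)
integrand = theta * posterior.pdf(theta)
p_next_is_one = np.sum(integrand) * (theta[1] - theta[0])

print(p_next_is_one)                            # ≈ a_post / (a_post + b_post) = 0.625
```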
Model Comparison
In Bayesian inference, we can compare different models to see which one explains the data
best:
p(m|D) = p(D|m) p(m) / p(D)
Posterior Probability of Model p(m ∣ D): Probability that model m is the correct model
given the data.
Model Evidence p(D ∣ m): Probability of the data under model m.
Prior Probability of Model p(m): Initial belief about the plausibility of model m.
Bayes Factors
The ratio of the posterior probabilities of two models is known as the Bayes factor, which
quantifies the evidence in favor of one model over another.
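The sketch below computes a Bayes factor for a toy comparison of two coin-flip models; the data (n flips, k heads) and both models are assumptions made for illustration.

```python
import numpy as np
from scipy.special import betaln

# m1: fair coin (theta fixed at 0.5); m2: unknown bias with a uniform Beta(1, 1) prior.
n, k = 20, 15                                   # assumed data: 15 heads in 20 flips

# Evidence for a particular sequence with k heads under each model.
log_evidence_m1 = n * np.log(0.5)               # p(D | m1) = 0.5^n
log_evidence_m2 = betaln(k + 1, n - k + 1)      # ∫ theta^k (1 - theta)^(n - k) dtheta

bayes_factor = np.exp(log_evidence_m2 - log_evidence_m1)
print("Bayes factor in favor of m2:", bayes_factor)   # ≈ 3.2
```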
Bayesian Decision Theory
Bayesian decision theory provides a framework for making optimal decisions under uncertainty
by maximizing expected utility (or reward).
Expected Reward
The expected reward R(a) for taking action a is calculated as:
R(a) = ∑_x R(a, x) p(x|D)
Reward R(a, x): Reward for taking action a when the true state of the world is x.
Posterior Probability p(x ∣ D): Probability of state x given data D.
Explanation
We choose the action a with the highest expected reward, considering all possible states of the
world. This approach separates inference from decision-making, allowing us to first infer
probabilities and then make decisions based on those probabilities.
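A minimal sketch of this procedure, with a made-up reward table and a made-up posterior over two world states:

```python
import numpy as np

# R(a, x): rows are actions, columns are world states (hypothetical values).
reward = np.array([[10.0, -5.0],
                   [ 0.0,  2.0]])
p_x_given_D = np.array([0.3, 0.7])        # assumed posterior p(x | D) over the two states

# R(a) = sum_x R(a, x) p(x | D): expected reward of each action.
expected_reward = reward @ p_x_given_D
best_action = int(np.argmax(expected_reward))

print("expected rewards:", expected_reward)   # [-0.5, 1.4]
print("chosen action:", best_action)          # action 1
```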
Flavours of Inference and Decision Problems
Machine learning and inference problems can generally be categorized into three main types:
1. Supervised Learning
Objective: Learn a mapping from inputs x to outputs y based on observed pairs (x_i, y_i).
Applications: Regression, classification, time series prediction.
2. Unsupervised Learning
Objective: Model the underlying structure or distribution in data without explicit
output labels.
Applications: Clustering, dimensionality reduction, density estimation.
3. Reinforcement Learning
Objective: Learn to make decisions by performing actions a_t in an environment to
maximize cumulative rewards r_t.
Applications: Robotics, game playing, adaptive control systems.
Example: The Radioactive Decay Problem
To illustrate the concepts of probabilistic inference, let's consider a classic problem in statistical
estimation: estimating the decay constant of a radioactive substance.
Problem Setup
Unstable particles decay at distances x from a source, following an exponential distribution
characterized by a decay constant λ:
p(x|λ) = (1/Z(λ)) exp(−x/λ)
Normalization Constant Z(λ): Ensures the probability density integrates to one over the
observed range.
We observe N decay events within a specific range (x_min, x_max). Our goal is to infer the value
of λ based on these observations.
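A small simulation of this setup is sketched below, assuming an illustrative window (x_min, x_max) = (1, 20) and a true decay constant of 5; for the truncated exponential the normalization constant works out to Z(λ) = λ (exp(−x_min/λ) − exp(−x_max/λ)).

```python
import numpy as np

rng = np.random.default_rng(0)
x_min, x_max = 1.0, 20.0          # assumed observation window
true_lambda = 5.0                 # assumed true decay constant

def Z(lam):
    """Normalization constant of exp(-x/lambda) over the window (x_min, x_max)."""
    return lam * (np.exp(-x_min / lam) - np.exp(-x_max / lam))

def sample_decays(lam, n):
    """Inverse-transform sampling of the truncated exponential on (x_min, x_max)."""
    u = rng.uniform(size=n)
    cdf_min, cdf_max = 1.0 - np.exp(-x_min / lam), 1.0 - np.exp(-x_max / lam)
    return -lam * np.log(1.0 - (cdf_min + u * (cdf_max - cdf_min)))

x = sample_decays(true_lambda, n=500)
print(len(x), x.min(), x.max())   # all N observations lie inside the window
```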
Heuristic Approaches
Before delving into Bayesian inference, let's explore two heuristic methods for estimating λ; a short numerical sketch of both appears at the end of this subsection.
1. Histogram-Based Approach
Method: Bin the observed decay distances into a histogram and perform linear
regression on the logarithm of the bin counts.
Assumption: The logarithm of the counts should decrease linearly with distance for
an exponential distribution.
Issues:
Bin Size Sensitivity: The choice of bin size can significantly affect the estimate.
Uncertainty Estimation: Does not provide a measure of uncertainty for λ.
Justification: Linear regression may not be the most appropriate method due to
the discrete nature of histogram counts.
2. Statistic-Based Approach
Method: Use the sample mean of the observed distances to estimate λ.
Formula: the mean of the truncated exponential is
μ = λ + [x_min exp(−x_min/λ) − x_max exp(−x_max/λ)] / [exp(−x_min/λ) − exp(−x_max/λ)]
Setting this equal to the sample mean and solving for λ gives the estimate.
Issues:
Sample Mean Limitations: The sample mean can exceed the largest mean attainable under
the truncated model (as λ → ∞, μ approaches (x_min + x_max)/2), in which case no value of
λ solves the equation.
Arbitrariness: The choice of using the mean is somewhat arbitrary and may not fully
utilize the information in the data.
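The sketch below tries both heuristics on simulated data (same assumed window and true λ = 5 as above); for brevity the statistic-based estimate is reported as the raw sample mean rather than the value of λ obtained by solving the corrected equation.

```python
import numpy as np

rng = np.random.default_rng(1)
x_min, x_max = 1.0, 20.0

# Simulated decay distances: draw from an exponential with lambda = 5 and keep
# only those that fall inside the observation window.
raw = rng.exponential(scale=5.0, size=5000)
x = raw[(raw > x_min) & (raw < x_max)]

# Histogram heuristic: for an exponential, log bin counts fall off with slope -1/lambda.
counts, edges = np.histogram(x, bins=10)
centers = 0.5 * (edges[:-1] + edges[1:])
keep = counts > 0                                  # avoid taking the log of empty bins
slope, _ = np.polyfit(centers[keep], np.log(counts[keep]), deg=1)
lam_hist = -1.0 / slope

# Statistic heuristic (uncorrected): the raw sample mean, which is biased by the truncation.
lam_mean = x.mean()

print("histogram estimate:", lam_hist)
print("raw sample-mean estimate:", lam_mean)
```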
Bayesian Inference Approach
A more principled method is to apply Bayesian inference to estimate λ.
Steps in Bayesian Inference
1. Specify the Likelihood Function:
The likelihood of observing the data {x_n}_{n=1}^N given λ is:
p({x_n}_{n=1}^N | λ) = ∏_{n=1}^N p(x_n|λ)
2. Choose a Prior Distribution:
We select a prior distribution p(λ) that reflects our initial beliefs about λ. For example, a
uniform prior over a reasonable range:
p(λ) = U(λ; λ_min, λ_max)
3. Compute the Posterior Distribution:
Applying Bayes' theorem:
p(λ|{x_n}_{n=1}^N) = p({x_n}_{n=1}^N | λ) p(λ) / p({x_n}_{n=1}^N)
Since p({x_n}_{n=1}^N) does not depend on λ, we can write:
p(λ|{x_n}_{n=1}^N) ∝ p(λ) ∏_{n=1}^N p(x_n|λ)
4. Simplify the Posterior Expression:
Substituting the exponential likelihood:
p(λ|{x_n}_{n=1}^N) ∝ p(λ) (1/Z(λ))^N exp(−(1/λ) ∑_{n=1}^N x_n)
The posterior depends on λ through the normalization constant Z(λ) and the exponential term.
5. Compute Sufficient Statistics:
Note that the data enter the posterior only through the sum S = ∑_{n=1}^N x_n and the number
of observations N. These are known as sufficient statistics; the numerical sketch below uses
only S and N.
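A grid-based numerical sketch of steps 1-5, again assuming the window (1, 20), simulated data with true λ = 5, and a uniform prior over the grid:

```python
import numpy as np

rng = np.random.default_rng(2)
x_min, x_max = 1.0, 20.0

# Simulated observations (true lambda = 5), as in the earlier sketches.
raw = rng.exponential(scale=5.0, size=2000)
x = raw[(raw > x_min) & (raw < x_max)]
S, N = x.sum(), len(x)                           # sufficient statistics

lam = np.linspace(0.5, 50.0, 2000)               # grid of candidate lambda values
log_prior = np.zeros_like(lam)                   # uniform prior on the grid

# log posterior up to a constant: log p(lambda) - N log Z(lambda) - S / lambda
log_Z = np.log(lam * (np.exp(-x_min / lam) - np.exp(-x_max / lam)))
log_post = log_prior - N * log_Z - S / lam
log_post -= log_post.max()                       # subtract the max for numerical stability

post = np.exp(log_post)
post /= np.sum(post) * (lam[1] - lam[0])         # normalize so the grid density integrates to one

print("posterior mode:", lam[np.argmax(post)])
```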
Understanding the Likelihood
The likelihood function p({x_n}_{n=1}^N ∣ λ) represents how probable the observed data are for
different values of λ. It typically peaks at the value of λ that makes the observed data most
probable.
Posterior Visualization
By plotting the posterior distribution p(λ ∣ {x_n}_{n=1}^N), we can visualize our updated beliefs
about λ after observing the data. The shape of the posterior reflects both the data and the prior.
Summarizing the Posterior
We can compute summaries of the posterior distribution, such as:
Mean: Expected value of λ under the posterior.
Variance: Measures the uncertainty in our estimate of λ.
Credible Intervals: Ranges within which λ lies with a certain probability (e.g., 95%
credible interval).
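These summaries are straightforward to compute from a posterior evaluated on a grid, such as the arrays lam and post in the sketch above; the helper below is checked on a discretized Gaussian purely so that it runs standalone.

```python
import numpy as np

def summarize(lam, post):
    """Posterior mean, variance, and central 95% credible interval from a grid posterior."""
    d = lam[1] - lam[0]
    mean = np.sum(lam * post) * d
    var = np.sum((lam - mean) ** 2 * post) * d
    cdf = np.cumsum(post) * d
    lo = lam[np.searchsorted(cdf, 0.025)]
    hi = lam[np.searchsorted(cdf, 0.975)]
    return mean, var, (lo, hi)

# Standalone check on a discretized Gaussian centred at 5 with standard deviation 0.5.
lam = np.linspace(0.0, 10.0, 4001)
post = np.exp(-0.5 * ((lam - 5.0) / 0.5) ** 2) / (0.5 * np.sqrt(2.0 * np.pi))
print(summarize(lam, post))    # roughly (5.0, 0.25, (4.02, 5.98))
```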
Predictive Distribution
With the posterior distribution in hand, we can make predictions about future decay events.
Computing the Predictive Distribution
The predictive distribution for a new observation x* is:
p(x*|{x_n}_{n=1}^N) = ∫ p(x*|λ) p(λ|{x_n}_{n=1}^N) dλ
This integral averages over all possible values of λ, weighted by their posterior probabilities.
Interpretation
The predictive distribution incorporates both the uncertainty in λ and the inherent randomness
of the decay process. It provides a full probabilistic description of where we expect future decay
events to occur.
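The integral can be approximated by summing over a grid posterior. The sketch below uses a stand-in Gaussian posterior over λ purely so that it runs on its own (in practice one would plug in the grid posterior computed earlier) and the same assumed window (1, 20).

```python
import numpy as np

x_min, x_max = 1.0, 20.0                          # assumed observation window

# Stand-in posterior over lambda on a grid (replace with the grid posterior computed above).
lam = np.linspace(0.5, 50.0, 2000)
post = np.exp(-0.5 * ((lam - 5.0) / 0.4) ** 2)
post /= np.sum(post) * (lam[1] - lam[0])

def likelihood(x_star, lam):
    """Truncated exponential density p(x* | lambda) on (x_min, x_max)."""
    Z = lam * (np.exp(-x_min / lam) - np.exp(-x_max / lam))
    return np.exp(-x_star / lam) / Z

# p(x* | data) = ∫ p(x* | lambda) p(lambda | data) dlambda, approximated as a grid sum.
x_star = np.linspace(x_min, x_max, 500)
d_lam = lam[1] - lam[0]
predictive = np.array([np.sum(likelihood(xs, lam) * post) * d_lam for xs in x_star])

print(np.sum(predictive) * (x_star[1] - x_star[0]))   # ≈ 1: the predictive density normalizes
```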
Maximum Likelihood Estimation (MLE)
As an alternative to Bayesian inference, we can use maximum likelihood estimation to find the
value of λ that maximizes the likelihood of the observed data.
MLE Formulation
λ_ML = argmax_λ p({x_n}_{n=1}^N | λ)
Comparison with Bayesian Approach
Point Estimate: MLE provides a single estimate of λ, whereas Bayesian inference
provides a full posterior distribution.
Uncertainty Quantification: Bayesian inference naturally accounts for uncertainty in λ,
while MLE does not.
Prior Information: Bayesian inference incorporates prior beliefs, which can be beneficial
when data are scarce.
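For comparison, the MLE for this model can be found by numerically minimizing the negative log-likelihood, again on simulated data with the same assumed window and true λ = 5:

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(3)
x_min, x_max = 1.0, 20.0

# Simulated observations, as in the earlier sketches.
raw = rng.exponential(scale=5.0, size=2000)
x = raw[(raw > x_min) & (raw < x_max)]
S, N = x.sum(), len(x)

def neg_log_likelihood(lam):
    """Negative log-likelihood of the truncated exponential: N log Z(lambda) + S / lambda."""
    Z = lam * (np.exp(-x_min / lam) - np.exp(-x_max / lam))
    return N * np.log(Z) + S / lam

result = minimize_scalar(neg_log_likelihood, bounds=(0.5, 50.0), method="bounded")
print("lambda_ML:", result.x)
```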
Summary of the Radioactive Decay Problem
The Bayesian approach to the radioactive decay problem involves:
1. Model Specification: Assuming an exponential decay model p(x ∣ λ).
2. Prior Selection: Choosing a prior p(λ) that reflects prior beliefs.
3. Posterior Computation: Applying Bayes' theorem to compute p(λ ∣ {x_n}_{n=1}^N).
4. Prediction: Calculating the predictive distribution for future observations.
This approach provides a principled and coherent method for parameter estimation and
prediction, fully accounting for uncertainty.
Conclusion
Probabilistic inference is a powerful framework for reasoning under uncertainty. By leveraging
the fundamental rules of probability and Bayesian principles, we can:
1. Update our beliefs in light of new data.
2. Make predictions that account for parameter uncertainty.
3. Compare models in a principled way.
4. Make optimal decisions based on expected rewards.