Lecture 2

This lecture discusses analyzing relationships between two or more variables simultaneously. It covers joint and conditional distributions, covariance and correlation, and linear prediction. Key points include: 1) The joint distribution of two random variables (X, Y) is described by their simultaneous probability density f(x, y); marginal densities are obtained by integrating f(x, y) over one of the variables. 2) Dependence between X and Y is summarized by their covariance and correlation coefficient; correlation measures the strength and direction of any linear relationship between the variables. 3) The conditional density f2(y|x) describes the distribution of Y given a value of X; it can be used to find the conditional mean E(Y|x), which is the optimal predictor of Y.


Lecture 2 Program

1. Introduction
2. Simultaneous distributions
3. Covariance and correlation
4. Conditional distributions
5. Prediction
Basic ideas
We will often consider two (or more) variables simultaneously.

Examples (B&S, page 15)
There are two typical ways this can be done:

(1) The data $(x_1, y_1), \ldots, (x_n, y_n)$ are considered as independent replications of a pair of random variables $(X, Y)$.

(2) The data are described by a linear regression model
$$y_i = a + b x_i + \epsilon_i, \qquad i = 1, \ldots, n$$
Here $y_1, \ldots, y_n$ are the responses, which are considered to be realizations of random variables, while $x_1, \ldots, x_n$ are considered to be fixed (i.e. non-random) and the $\epsilon_i$ are random errors (noise).
Situation (1) occurs for observational studies, while situation (2) occurs for planned experiments (where the values of the $x_i$ are under the control of the experimenter).

In situation (1) we will often condition on the observed values of the $x_i$ and analyse the data as if they came from situation (2).

In this lecture we focus on situation (1).
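As a concrete illustration (a minimal Python sketch of my own, assuming NumPy; all parameter values are made up), data can be generated under both situations as follows:

```python
import numpy as np

rng = np.random.default_rng(seed=1)
n = 100

# Situation (1): (x_i, y_i) are independent replications of a random pair (X, Y),
# here drawn from a bivariate normal with correlation 0.6 (illustrative values).
mean = [0.0, 0.0]
cov = [[1.0, 0.6],
       [0.6, 1.0]]
x1, y1 = rng.multivariate_normal(mean, cov, size=n).T

# Situation (2): the x_i are fixed design points chosen by the experimenter,
# and y_i = a + b * x_i + eps_i with random errors eps_i.
a, b, sigma = 1.0, 2.0, 0.5          # illustrative parameter values
x2 = np.linspace(0, 1, n)            # fixed (non-random) design points
y2 = a + b * x2 + rng.normal(scale=sigma, size=n)
```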
Joint or simultaneous distributions
The most common way to describe the simultaneous distribution of a pair of random variables $(X, Y)$ is through their simultaneous probability density $f(x, y)$.

This is defined so that
$$P((X, Y) \in A) = \iint_A f(x, y)\, dx\, dy$$

The marginal density of $X$ is obtained by integrating over all possible values of $Y$:
$$f_1(x) = \int f(x, y)\, dy$$
and similarly for the marginal density $f_2(y)$ of $Y$.

If $f(x, y) = f_1(x)\, f_2(y)$, the random variables $X$ and $Y$ are independent.

Otherwise they are dependent, which means that there is a relationship between $X$ and $Y$, so that certain realizations of $X$ tend to occur more often together with certain realizations of $Y$ than with others.
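To make the definitions concrete, here is a small symbolic sketch (my own illustrative example, assuming SymPy): the joint density f(x, y) = x + y on the unit square, its marginals, and a check that X and Y are dependent.

```python
import sympy as sp

x, y = sp.symbols('x y')

# Illustrative joint density on the unit square [0, 1] x [0, 1]
f = x + y

# Marginal densities: integrate the joint density over the other variable
f1 = sp.integrate(f, (y, 0, 1))    # f1(x) = x + 1/2
f2 = sp.integrate(f, (x, 0, 1))    # f2(y) = y + 1/2

# Sanity check: the joint density integrates to 1
print(sp.integrate(f, (x, 0, 1), (y, 0, 1)))   # 1

# Independence would require f(x, y) = f1(x) * f2(y); here the difference
# is not identically zero, so X and Y are dependent.
print(sp.simplify(f - f1 * f2))
```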
Covariance and correlation
The dependence between $X$ and $Y$ is often summarized by the covariance:
$$\sigma_{12} = \mathrm{Cov}(X, Y) = \mathrm{E}[(X - \mu_1)(Y - \mu_2)]$$
and the correlation coefficient:
$$\rho = \mathrm{corr}(X, Y) = \frac{\mathrm{Cov}(X, Y)}{\mathrm{sd}(X)\, \mathrm{sd}(Y)}$$

The following are important properties of the correlation coefficient:

- $\mathrm{corr}(X, Y)$ takes values in the interval $[-1, 1]$
- $\mathrm{corr}(X, Y)$ describes the linear relationship between $Y$ and $X$
- If $X$ and $Y$ are independent, then $\mathrm{corr}(X, Y) = 0$, but not (necessarily) the other way around (see the sketch below)
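As a quick numerical illustration of the last point (an example of my own, assuming NumPy): take X standard normal and Y = X². The two are clearly dependent, yet their correlation is zero.

```python
import numpy as np

rng = np.random.default_rng(seed=2)

# X standard normal, Y = X^2: strongly dependent but uncorrelated,
# since Cov(X, X^2) = E[X^3] - E[X] E[X^2] = 0 for a symmetric distribution.
x = rng.standard_normal(100_000)
y = x**2

print(np.corrcoef(x, y)[0, 1])   # close to 0
```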
Correlation: correlated data
[Figure: four scatter plots of (x, y) data illustrating correlations of 0.9, 0.5, −0.5 and −0.9.]
Correlation: uncorrelated data
[Figure: scatter plot of uncorrelated (x, y) data; correlation 0.0.]
Correlation: uncorrelated data
[Figure: another scatter plot of (x, y) data with correlation 0.0.]
Transformations
Sometimes a transformation of the variables may improve the linear relationship.
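For instance (an illustrative sketch of my own, assuming NumPy; the model and parameter values are made up): if Y depends on X roughly exponentially, the correlation between X and log Y is much closer to 1 than the correlation between X and Y.

```python
import numpy as np

rng = np.random.default_rng(seed=3)

# Illustrative data with a multiplicative (exponential) relationship
x = rng.uniform(0, 3, size=500)
y = np.exp(1.0 + 2.0 * x + rng.normal(scale=0.3, size=500))

print(np.corrcoef(x, y)[0, 1])           # moderate: the relationship is curved
print(np.corrcoef(x, np.log(y))[0, 1])   # close to 1: the log transform linearizes it
```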
Sample versions of covariance and correlation

Data $(x_1, y_1), \ldots, (x_n, y_n)$ are independent replicates of $(X, Y)$.

Empirical analogues to the population concepts and basic results:

Empirical covariance:
$$\hat\sigma_{12} = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar x_n)(y_i - \bar y_n)$$

Empirical correlation coefficient:
$$r = \frac{\hat\sigma_{12}}{s_{1n}\, s_{2n}}$$

When $n$ increases:
$$\hat\sigma_{12} \to \sigma_{12}, \qquad r \to \rho$$
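A small NumPy sketch (with illustrative data of my own) that computes the empirical covariance and correlation directly from these formulas and checks them against the built-in functions:

```python
import numpy as np

rng = np.random.default_rng(seed=4)
n = 200
x = rng.normal(size=n)
y = 0.5 * x + rng.normal(scale=0.8, size=n)    # illustrative dependent data

# Empirical covariance: 1/(n-1) * sum of (x_i - xbar)(y_i - ybar)
cov_hat = np.sum((x - x.mean()) * (y - y.mean())) / (n - 1)

# Empirical correlation: covariance divided by the two sample standard deviations
r = cov_hat / (np.std(x, ddof=1) * np.std(y, ddof=1))

# Agreement with NumPy's built-ins
print(np.isclose(cov_hat, np.cov(x, y)[0, 1]))     # True
print(np.isclose(r, np.corrcoef(x, y)[0, 1]))      # True
```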
Conditional distributions
The conditional density of $Y$ given $X = x$ is given by
$$f_2(y \mid x) = \frac{f(x, y)}{f_1(x)}$$

If $X$ and $Y$ are independent, so that $f(x, y) = f_1(x) f_2(y)$, we see that $f_2(y \mid x) = f_2(y)$. This is reasonable, and corresponds to the fact that a realization of $X$ carries no information about the distribution of $Y$.

Using the conditional density, one may find the conditional mean and the conditional variance:

Conditional mean: $\mu_{2|x} = \mathrm{E}(Y \mid x)$

Conditional variance: $\sigma^2_{2|x} = \mathrm{Var}(Y \mid x)$

When $(X, Y)$ is bivariate normally distributed, $\mu_{2|x}$ is linear in $x$, and is known as the regression of $Y$ on $X = x$ (cf. below).
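Continuing the illustrative joint density f(x, y) = x + y on the unit square from the earlier sketch (again my own example, assuming SymPy), the conditional density, mean and variance can be computed symbolically:

```python
import sympy as sp

x, y = sp.symbols('x y')

f = x + y                                   # illustrative joint density on [0, 1]^2
f1 = sp.integrate(f, (y, 0, 1))             # marginal of X: x + 1/2

f2_given_x = f / f1                         # conditional density f2(y | x)
cond_mean = sp.integrate(y * f2_given_x, (y, 0, 1))                   # E(Y | x)
cond_var = sp.integrate((y - cond_mean)**2 * f2_given_x, (y, 0, 1))   # Var(Y | x)

print(sp.simplify(cond_mean))   # equals (3*x + 2)/(6*x + 3)
print(sp.simplify(cond_var))
```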
Prediction
When $X$ and $Y$ are dependent, it is reasonable that knowledge of the value of $X$ can be used to improve the prediction of the corresponding realization of $Y$.

Let $\hat Y(x)$ be such a predictor. Then:

- $\hat Y(x) - Y$ is the prediction error
- $\hat Y_{\mathrm{opt}}(x) = \mathrm{E}(Y \mid x)$ minimizes the mean squared prediction error $\mathrm{E}[(\hat Y(x) - Y)^2]$ (see the sketch below)
- $\mathrm{E}(Y \mid x)$ will often depend on unknown parameters, and it may be complicated to compute
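A quick Monte Carlo check of the optimality claim (a sketch of my own, assuming NumPy, with a made-up correlation value): for a standardized bivariate normal pair, E(Y | X = x) = ρx, and this predictor has a smaller mean squared error than, say, using x itself.

```python
import numpy as np

rng = np.random.default_rng(seed=5)

# Standardized bivariate normal pair with correlation rho, so E(Y | X = x) = rho * x
rho = 0.7
n = 1_000_000
x = rng.standard_normal(n)
y = rho * x + np.sqrt(1 - rho**2) * rng.standard_normal(n)

mse_opt   = np.mean((rho * x - y) ** 2)   # optimal predictor E(Y | x)
mse_other = np.mean((x - y) ** 2)         # some other predictor, Yhat(x) = x

print(mse_opt)     # close to 1 - rho^2 = 0.51
print(mse_other)   # larger (close to 2 - 2*rho = 0.6)
```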
Linear prediction
It is convenient to consider linear predictors, i.e. predictors of the form:
$$\hat Y_{\mathrm{lin}}(x) = a + b x$$

Minimizing $\mathrm{E}[(a + bX - Y)^2]$ with respect to $a$ and $b$ yields:
$$b = \rho\, \frac{\sigma_2}{\sigma_1} \qquad \text{and} \qquad a = \mu_2 - b\, \mu_1$$

The minimum is $\mathrm{E}[(\hat Y_{\mathrm{lin}}(x) - Y)^2] = \sigma_2^2 (1 - \rho^2)$.

Note that if $\rho^2$ increases, the mean squared error decreases.
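A numerical check of these formulas (a sketch of my own with made-up parameter values, assuming NumPy):

```python
import numpy as np

rng = np.random.default_rng(seed=6)

# Illustrative parameters for (X, Y)
mu1, mu2 = 1.0, 3.0
sigma1, sigma2 = 2.0, 1.5
rho = 0.6

n = 1_000_000
x = mu1 + sigma1 * rng.standard_normal(n)
y = (mu2 + rho * sigma2 * (x - mu1) / sigma1
     + sigma2 * np.sqrt(1 - rho**2) * rng.standard_normal(n))

# Best linear predictor
b = rho * sigma2 / sigma1
a = mu2 - b * mu1

print(np.mean((a + b * x - y) ** 2))   # close to the theoretical minimum
print(sigma2**2 * (1 - rho**2))        # sigma_2^2 * (1 - rho^2) = 1.44
```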
Linear prediction, contd.
Without knowledge of the value of $X$, the best predictor is the unconditional mean of $Y$, i.e. $\hat Y_0 = \mu_2$.

This has mean squared error $\mathrm{E}[(\hat Y_0 - Y)^2] = \sigma_2^2$.

Hence, a sensible measure of the quality of a prediction is the ratio
$$\frac{\mathrm{E}[(\hat Y_{\mathrm{lin}}(x) - Y)^2]}{\mathrm{E}[(\hat Y_0 - Y)^2]} = 1 - \rho^2$$

For judging a prediction, the squared correlation coefficient is therefore the appropriate measure.

When $a$ and $b$ are unknown, we plug in the empirical counterparts:
$$\hat b = r\, \frac{s_{2n}}{s_{1n}} \qquad \text{and} \qquad \hat a = \bar y_n - \hat b\, \bar x_n$$
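These plug-in estimates coincide with the ordinary least-squares fit. A sketch (my own illustrative data, assuming NumPy) comparing them with the coefficients returned by np.polyfit:

```python
import numpy as np

rng = np.random.default_rng(seed=7)
n = 500
x = rng.normal(loc=1.0, scale=2.0, size=n)
y = 3.0 + 0.5 * x + rng.normal(scale=1.0, size=n)   # illustrative data

# Plug-in estimates: b_hat = r * s_2n / s_1n, a_hat = ybar - b_hat * xbar
r = np.corrcoef(x, y)[0, 1]
b_hat = r * np.std(y, ddof=1) / np.std(x, ddof=1)
a_hat = y.mean() - b_hat * x.mean()

# The same coefficients from an ordinary least-squares fit (slope first)
b_ls, a_ls = np.polyfit(x, y, deg=1)

print(np.allclose([a_hat, b_hat], [a_ls, b_ls]))   # True
```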
The bivariate normal distribution
When $(X, Y)$ is bivariate normal:

- The distribution is described by the five parameters $\mu_1$, $\mu_2$, $\sigma_1^2$, $\sigma_2^2$ and $\rho$
- The marginal distributions of $X$ and $Y$ are normal: $X \sim N(\mu_1, \sigma_1^2)$, $Y \sim N(\mu_2, \sigma_2^2)$
- $\mathrm{corr}(X, Y) = \rho$ and $\mathrm{Cov}(X, Y) = \rho\, \sigma_1 \sigma_2$
- The conditional distributions are normal, with
  $$\mathrm{E}(Y \mid x) = \mu_2 + \rho\, \frac{\sigma_2}{\sigma_1} (x - \mu_1)$$
  $$\mathrm{Var}(Y \mid x) = \sigma_2^2 (1 - \rho^2)$$
- $b = \rho\, \dfrac{\sigma_2}{\sigma_1} = \dfrac{\sigma_{12}}{\sigma_1^2}$ and $a = \mu_2 - b\, \mu_1$
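A simulation sketch (my own, with made-up parameter values, assuming NumPy) that checks the conditional-mean and conditional-variance formulas by conditioning approximately on X ≈ x0:

```python
import numpy as np

rng = np.random.default_rng(seed=8)

# Illustrative parameters
mu1, mu2 = 0.0, 2.0
sigma1, sigma2 = 1.0, 1.5
rho = 0.8

n = 2_000_000
mean = [mu1, mu2]
cov = [[sigma1**2,             rho * sigma1 * sigma2],
       [rho * sigma1 * sigma2, sigma2**2            ]]
x, y = rng.multivariate_normal(mean, cov, size=n).T

# Condition (approximately) on X = x0 by keeping observations with X close to x0
x0 = 1.0
sel = np.abs(x - x0) < 0.01
print(y[sel].mean())   # close to mu2 + rho * sigma2 / sigma1 * (x0 - mu1) = 3.2
print(y[sel].var())    # close to sigma2**2 * (1 - rho**2) = 0.81
```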