Week 4 - Probability Descriptive Statistics Cont (Post-Class)

The document outlines the learning objectives for Week 4 of AFM 323, focusing on bivariate probability distributions, including joint, marginal, and conditional distributions, as well as covariance and correlation. It discusses the extension of probability concepts from univariate to bivariate distributions, providing examples and properties of joint and marginal distributions. Additionally, it covers the importance of exploratory data analysis and descriptive statistics in the context of financial asset returns.

Uploaded by

yukttha.s

Welcome to AFM 323: Quantitative Foundations for Finance
Week 4
Learning Objectives
 Bivariate discrete probability distributions
 Bivariate continuous probability distributions
 Joint, marginal and conditional distributions
 Covariance and correlation
 Univariate to multivariate framework – portfolio of assets
 Linear combinations of normally distributed random variables
 Exploratory Data Analysis for Financial Asset Returns
 Descriptive Statistics
 Univariate Descriptive Statistics continued
 Empirical CDF
 Value-at-Risk
 Additional Measures of Dispersion
 QQ Plots
 Bivariate descriptive statistics
 Covariance and correlation

2
Bivariate Distributions
• So far we have considered the probability distributions of univariate random variables, such as the Suncor stock return or the S&P TSX Composite index return.
• Now we extend the definitions and concepts to two random variables, say X and Y: X = Suncor stock return and Y = S&P TSX Composite index return.
• First, we discuss how to extend the concept of a probability distribution of a single random variable X to a joint probability distribution of two random variables X and Y.
• We want to know whether there is any relation between X and Y.
• In particular, we want to know the probability that X takes on a particular value x and Y takes on a particular value y.
• That is, we want to determine p(x, y) = Pr(X = x, Y = y).
• This joint probability distribution function gives the likelihood that the rv's X and Y take on values in the joint sample space for X and Y.

3
Bivariate Distribution - Example
• Consider two discrete random variables: the monthly return on Suncor stock (in percent), labelled X, and the monthly return on Encana, labelled Y.

• For simplicity we assume that the sample spaces for X and Y are S_X = {0, 1, 2, 3} and S_Y = {0, 1} respectively, so that the random variables X and Y are discrete.

• The joint sample space for X and Y is a two-dimensional grid: S_XY = {(0,0), (0,1), (1,0), (1,1), (2,0), (2,1), (3,0), (3,1)}

4
Joint Distribution - Example
• The joint distribution for X and Y is given by the following table:

             Y = 0    Y = 1    Pr(X = x)
  X = 0       1/8      0        1/8
  X = 1       2/8      1/8      3/8
  X = 2       1/8      2/8      3/8
  X = 3       0        1/8      1/8
  Pr(Y = y)   4/8      4/8      1

• Now we can determine the probability that X takes on a particular value x and Y takes on a particular value y, i.e., p(x, y) = Pr(X = x, Y = y), from the values in the table.
• Example: p(0, 0) = Pr(X = 0, Y = 0) = 1/8; p(1, 1) = Pr(X = 1, Y = 1) = 1/8
• This is a joint probability distribution function because it makes a statement about the probability of two events occurring together.
• The bivariate distribution can be illustrated graphically as a 3-dimensional bar chart.
[Figure: 3-dimensional bar chart of the joint distribution p(x, y)]

5
Properties of a Joint pdf p(x, y)
• The joint sample space for X and Y: S_XY = {(0,0), (0,1), (1,0), (1,1), (2,0), (2,1), (3,0), (3,1)}
• The joint probability distribution function for X and Y is nonnegative for all x and y in the joint sample space for X and Y: p(x, y) ≥ 0 for all (x, y) in S_XY
– p(0,0) = 1/8, p(1,0) = 2/8, p(2,0) = 1/8, p(3,0) = 0
– p(0,1) = 0, p(1,1) = 1/8, p(2,1) = 2/8, p(3,1) = 1/8
• The joint probability distribution function for X and Y is zero for all x and y not in the joint sample space for X and Y: p(x, y) = 0 for all (x, y) not in S_XY
• The joint probabilities sum to 1 over all x and y in the joint sample space for X and Y:

p(0,0) + p(0,1) + p(1,0) + p(1,1) + p(2,0) + p(2,1) + p(3,0) + p(3,1) =

1/8 + 0 + 2/8 + 1/8 + 1/8 + 2/8 + 0 + 1/8 = 1

6
Marginal Distribution
• The joint probability distribution gives the probability of X and Y occurring together.
• What if we only want to know the probability of X occurring or the probability of Y occurring?
• Suppose that we want to find Pr(X=0) and Pr(Y=1) from a given joint distribution.
• Consider the joint distribution table for X and Y.
• What is Pr(X = 0) regardless of the value of Y?
• Now X = 0 can occur if Y = 0 or if Y = 1, and since these two events are mutually exclusive we have that:
Pr(X=0) = Pr(X=0,Y=0) + Pr(X=0,Y=1) = 1/8 + 0 = 1/8

• Notice that this probability is equal to the horizontal (row) sum of the probabilities in the table at X=0.

• Similarly, Pr(X=1) = 3/8, Pr(X=2) = 3/8 and Pr(X=3) = 1/8.

7
Marginal Probability

• Consider the joint distribution table for X and Y again.

• What is Pr(Y = 0) regardless of the value of X?
• Now Y = 0 can occur if X = 0, 1, 2 or 3. Since these events are mutually exclusive, we have that
Pr(Y=0) = Pr(X=0,Y=0) + Pr(X=1,Y=0) + Pr(X=2,Y=0) + Pr(X=3,Y=0) = 1/8 + 2/8 + 1/8 + 0 = 4/8

• Notice that this probability is equal to the vertical (column) sum of the probabilities in the table at Y=0.

• Similarly, Pr(Y=1) = 0 + 1/8 + 2/8 + 1/8 = 4/8.

8
Marginal Probability
• The probability Pr(X=x) is the marginal probability distribution function of X and is in general given by
$p(x) = \Pr(X = x) = \sum_{y \in S_Y} p(x, y)$

• Similarly, the probability Pr(Y=y) is the marginal probability distribution function of Y and is in general given by
$p(y) = \Pr(Y = y) = \sum_{x \in S_X} p(x, y)$

• It is called a marginal probability distribution function because it depends only on totals found in the margins of the table.

• The marginal probabilities of X=x are given in the last column of the joint distribution table.
• The marginal probabilities of Y=y are given in the last row of the table.
• Notice that each set of marginal probabilities sums to 1.
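
The row and column sums can be checked with a short Python sketch. It uses the eight joint probabilities listed on the "Properties of a Joint pdf" slide; the names `joint` and `marginal` are illustrative only and are not part of the course material.

```python
from fractions import Fraction as F

# Joint pmf p(x, y) from the slides; keys are (x, y) pairs.
joint = {
    (0, 0): F(1, 8), (0, 1): F(0),
    (1, 0): F(2, 8), (1, 1): F(1, 8),
    (2, 0): F(1, 8), (2, 1): F(2, 8),
    (3, 0): F(0),    (3, 1): F(1, 8),
}

def marginal(joint, axis):
    """Sum the joint pmf over the other variable (axis 0 -> X, axis 1 -> Y)."""
    out = {}
    for pair, p in joint.items():
        out[pair[axis]] = out.get(pair[axis], F(0)) + p
    return out

p_x = marginal(joint, 0)  # row sums
p_y = marginal(joint, 1)  # column sums

# Properties of a joint pmf: nonnegative and sums to one.
assert all(p >= 0 for p in joint.values())
assert sum(joint.values()) == 1

print({x: str(p) for x, p in p_x.items()})
print({y: str(p) for y, p in p_y.items()})
```

Exact fractions are used so that the results can be compared directly with the 1/8, 3/8 and 4/8 values in the table.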

9
Conditional Probability
• Suppose that we know that Y=0.
• How does this knowledge affect the probability that X=0, 1, 2, or 3? In other words, how can we use this information to update the probabilities for X?
• i.e., what are Pr(X=0|Y=0), Pr(X=1|Y=0), Pr(X=2|Y=0), and Pr(X=3|Y=0) equal to?

• Similarly, suppose that we know Y=1

• How does this knowledge affect the probability that X=0, 1, 2, or 3?
• i.e., what are Pr(X=0|Y=1), Pr(X=1|Y=1), Pr(X=2|Y=1), and Pr(X=3|Y=1) equal to?

• The answer is conditional probability.

• Suppose that we know that Y=0.
• Using Bayes' Law: Pr(X=0|Y=0) = Pr(X=0, Y=0) / Pr(Y=0) = (1/8) / (4/8) = 1/4

• Pr(X=0|Y=0) means the probability that X=0 given that Y=0.

• Pr(X=0|Y=0) = 1/4. Similarly, Pr(X=1|Y=0) = 2/4, Pr(X=2|Y=0) = 1/4 and Pr(X=3|Y=0) = 0.

10
Conditional Probability
• Pr(X=0|Y=0) = 1/4 > Pr(X=0) = 1/8
• Hence, knowledge that Y=0 does increase the likelihood that X=0
• Clearly, X depends on Y: knowing that Y=0 gives us a higher probability that X=0 (1/4) than not knowing it, in which case the probability that X=0 is 1/8
• In contrast, the marginal probability Pr(X=0) ignores information about Y.
• Now suppose that we know that X=0
• How does this knowledge affect the probability that Y=0?
• To find out we compute Pr(Y=0|X=0) = Pr(X=0, Y=0) / Pr(X=0) = (1/8) / (1/8) = 1

• Notice that Pr(Y=0|X=0) = 1 > Pr(Y=0) = 1/2

• That is, knowledge that X=0 makes it certain that Y=0.

11
Conditional Probability
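• Similarly, we can calculate: Pr(X=0|Y=1) = 0, Pr(X=1|Y=1) = 1/4, Pr(X=2|Y=1) = 2/4 and Pr(X=3|Y=1) = 1/4

• In general, the conditional probability that X = x given that Y = y (provided that Pr(Y = y) ≠ 0) is
$p(x \mid y) = \Pr(X = x \mid Y = y) = \frac{\Pr(X = x, Y = y)}{\Pr(Y = y)}$

• The conditional probability that Y = y given that X = x (provided that Pr(X = x) ≠ 0) is
$p(y \mid x) = \Pr(Y = y \mid X = x) = \frac{\Pr(X = x, Y = y)}{\Pr(X = x)}$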
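
As a rough illustration, the conditional probabilities can be computed by dividing each joint probability by the matching marginal. The sketch below reuses the joint probabilities from the slides; the helper name `conditional` is hypothetical.

```python
from fractions import Fraction as F

# Joint pmf p(x, y) from the slides; keys are (x, y) pairs.
joint = {
    (0, 0): F(1, 8), (0, 1): F(0),
    (1, 0): F(2, 8), (1, 1): F(1, 8),
    (2, 0): F(1, 8), (2, 1): F(2, 8),
    (3, 0): F(0),    (3, 1): F(1, 8),
}

def conditional(joint, given_axis, given_value):
    """Conditional pmf of the other variable, given that the variable on
    `given_axis` (0 -> X, 1 -> Y) equals `given_value`."""
    other = 1 - given_axis
    # Marginal probability of the conditioning event.
    p_given = sum(p for pair, p in joint.items() if pair[given_axis] == given_value)
    if p_given == 0:
        raise ValueError("conditioning event has probability zero")
    return {pair[other]: p / p_given
            for pair, p in joint.items() if pair[given_axis] == given_value}

x_given_y0 = conditional(joint, given_axis=1, given_value=0)  # Pr(X = x | Y = 0)
y_given_x0 = conditional(joint, given_axis=0, given_value=0)  # Pr(Y = y | X = 0)

print({x: str(p) for x, p in x_given_y0.items()})
print({y: str(p) for y, p in y_given_x0.items()})
```

Each conditional distribution sums to 1, which is a quick sanity check on the division by the marginal.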

12
Independence
• Let X and Y be two discrete random variables with:
– pdfs: p(x), p(y)

– sample spaces: S_X, S_Y

– joint pdf: p(x, y)

• Then X and Y are (statistically) independent random variables if and only if the joint pdf of X and Y is the product of the marginal pdfs: p(x, y) = p(x) · p(y) for all x in S_X and y in S_Y.
• If X and Y are independent random variables, then the conditional pdf of X given Y (or Y given X) is equal to its respective marginal pdf: p(x | y) = p(x) and p(y | x) = p(y)

• Intuition
– X and Y are independent if knowledge of X does not influence probabilities associated with Y and knowledge of Y does not influence probabilities associated with X.
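
A minimal sketch of the independence check, assuming the joint probabilities from the earlier Suncor/Encana example: it compares every p(x, y) with the product of the marginals. The function names are illustrative.

```python
from fractions import Fraction as F

def marginals(joint):
    """Return the marginal pmfs (p_x, p_y) of a joint pmf keyed by (x, y)."""
    p_x, p_y = {}, {}
    for (x, y), p in joint.items():
        p_x[x] = p_x.get(x, F(0)) + p
        p_y[y] = p_y.get(y, F(0)) + p
    return p_x, p_y

def is_independent(joint):
    """True iff p(x, y) == p(x) * p(y) for every pair in the grid."""
    p_x, p_y = marginals(joint)
    return all(joint.get((x, y), F(0)) == p_x[x] * p_y[y]
               for x in p_x for y in p_y)

# Joint pmf from the slides.
slide_joint = {
    (0, 0): F(1, 8), (0, 1): F(0),
    (1, 0): F(2, 8), (1, 1): F(1, 8),
    (2, 0): F(1, 8), (2, 1): F(2, 8),
    (3, 0): F(0),    (3, 1): F(1, 8),
}

# A joint pmf built as the product of the same marginals (independent by construction).
p_x, p_y = marginals(slide_joint)
product_joint = {(x, y): p_x[x] * p_y[y] for x in p_x for y in p_y}

print(is_independent(slide_joint))    # False: p(0,0) = 1/8 but p(0) * p(0) = 1/16
print(is_independent(product_joint))  # True
```

The slide example fails the check, which agrees with the earlier observation that knowing Y changes the probabilities for X.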

13
Bivariate Distributions for Continuous RV
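• The joint pdf of continuous rv's X and Y is a non-negative function f(x, y) such that
$\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x, y)\,dx\,dy = 1$
• The three-dimensional plot of the joint probability distribution gives a probability surface whose total volume is unity.
• Let [x1, x2] and [y1, y2] be intervals on the real line. Then
$\Pr(x_1 \le X \le x_2,\ y_1 \le Y \le y_2) = \int_{x_1}^{x_2}\int_{y_1}^{y_2} f(x, y)\,dy\,dx$
• Example of a bivariate standard normal distribution:
$f(x, y) = \frac{1}{2\pi}\, e^{-\frac{1}{2}(x^2 + y^2)}, \quad -\infty < x, y < \infty$

• It has the shape of a symmetric bell centered at x = 0 and y = 0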

14
Bivariate Standard Normal Distribution
• To find Pr(−1 < X < 1, −1 < Y < 1), we need to solve
$\int_{-1}^{1}\int_{-1}^{1} \frac{1}{2\pi}\, e^{-\frac{1}{2}(x^2 + y^2)}\,dx\,dy$
which does not have an analytical solution.

• Numerical approximation methods (available in Excel) are required to evaluate the above integral.
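
As an illustration of the numerical approach, the sketch below evaluates the probability in two ways. It assumes the bivariate standard normal has zero correlation, so that the density is exp(-(x² + y²)/2) / (2π) and the double integral factors into a product of two univariate normal probabilities. The function names are illustrative and the midpoint rule is just one simple approximation method.

```python
import math

def std_normal_cdf(z):
    """Standard normal CDF, evaluated numerically through the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def rect_prob_factorized(a, b):
    """Pr(a < X < b, a < Y < b) for independent N(0,1) variables X and Y."""
    p = std_normal_cdf(b) - std_normal_cdf(a)
    return p * p

def rect_prob_midpoint(a, b, n=400):
    """Midpoint-rule approximation of the double integral of f(x, y) over [a, b] x [a, b]."""
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * h
        for j in range(n):
            y = a + (j + 0.5) * h
            total += math.exp(-0.5 * (x * x + y * y)) / (2.0 * math.pi)
    return total * h * h

print(round(rect_prob_factorized(-1.0, 1.0), 4))
print(round(rect_prob_midpoint(-1.0, 1.0), 4))
```

Both approaches should agree to several decimal places, at roughly 0.466. With nonzero correlation the integral no longer factors and only the grid approach (or a dedicated routine) applies.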

15
Covariance and Correlation
• In panel (a) we see no relationship between X and Y
• In panel (b) we see a perfect positive linear relationship between X and Y
• In panel (c) we see a perfect negative linear relationship
• In panel (d) we see a positive, but less than perfect, linear relationship.

• Let X and Y be two random variables

• The covariance between X and Y measures the direction of the linear relationship between the two random variables.
• The correlation between X and Y measures both the direction and the strength of the linear relationship between the two random variables.
• Note the adjective "linear" in the above sentences.

16
Covariance
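• Definition:

• The covariance between two random variables X and Y is given by
$\sigma_{XY} = \text{Cov}(X, Y) = E[(X - \mu_X)(Y - \mu_Y)] = E[XY] - \mu_X \mu_Y$
For discrete random variables, $E[(X - \mu_X)(Y - \mu_Y)] = \sum_{x \in S_X}\sum_{y \in S_Y} (x - \mu_X)(y - \mu_Y)\, p(x, y)$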
17
Covariance - Example
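• Example: For the joint distribution of X and Y in the table:

– Mean(X) = 3/2 and Mean(Y) = 1/2
– E[XY] = (1)(1)(1/8) + (2)(1)(2/8) + (3)(1)(1/8) = 1
– Cov(X, Y) = E[XY] − Mean(X) · Mean(Y) = 1 − (3/2)(1/2) = 1/4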

18
Properties of Covariances
 Let X and Y be random variables and let a and b be constants.
 Some important properties of Cov(X, Y) are
 Cov(X, X) = Var(X)

 Cov(X, Y) = Cov(Y, X)

 Cov(aX, bY) = a ∙ b ∙ Cov(X, Y)

 If X and Y are independent then Cov(X, Y) = 0 (i.e. no association implies no linear association)

• However, if Cov(X, Y) = 0, then X and Y are not necessarily independent (no linear association does not necessarily imply no association – could have nonlinear association)

 If X and Y are jointly normally distributed, then Cov(X, Y) = 0 implies that X and Y are independent.

19
Correlation
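• Correlation: measures both the direction and the strength of the linear relationship between any two random variables
• The correlation between two random variables X and Y is given by
$\rho_{XY} = \text{Cor}(X, Y) = \frac{\text{Cov}(X, Y)}{\sqrt{\text{Var}(X)\,\text{Var}(Y)}} = \frac{\sigma_{XY}}{\sigma_X \sigma_Y}$
• i.e. the correlation coefficient is a scaled/normalized covariance
• Example: For the joint distribution in the table, Var(X) = 3/4, Var(Y) = 1/4 and Cov(X, Y) = 1/4, so we have

$\rho_{XY} = \frac{1/4}{\sqrt{3/4}\,\sqrt{1/4}} = 0.577$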
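
The covariance and correlation for the discrete example can be reproduced with a small Python sketch. It takes expectations directly over the joint probabilities from the slides; the helper `expect` is an illustrative name.

```python
import math
from fractions import Fraction as F

# Joint pmf p(x, y) from the slides; keys are (x, y) pairs.
joint = {
    (0, 0): F(1, 8), (0, 1): F(0),
    (1, 0): F(2, 8), (1, 1): F(1, 8),
    (2, 0): F(1, 8), (2, 1): F(2, 8),
    (3, 0): F(0),    (3, 1): F(1, 8),
}

def expect(joint, g):
    """E[g(X, Y)]: probability-weighted sum of g over the joint sample space."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

mu_x = expect(joint, lambda x, y: x)
mu_y = expect(joint, lambda x, y: y)
var_x = expect(joint, lambda x, y: (x - mu_x) ** 2)
var_y = expect(joint, lambda x, y: (y - mu_y) ** 2)
cov_xy = expect(joint, lambda x, y: (x - mu_x) * (y - mu_y))

# Correlation is the covariance scaled by the two standard deviations.
rho_xy = float(cov_xy) / math.sqrt(float(var_x) * float(var_y))

print(mu_x, mu_y, cov_xy, round(rho_xy, 3))
```

The means, the covariance of 1/4 and the correlation of about 0.577 match the values on the slides.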

20
Correlation
Properties of Correlations:

21
Linear Combinations of Two RV (Review)
• Let X and Y be random variables
• Define a new random variable Z that is a linear combination of X and Y: Z = aX + bY, where a and b are constants

• Then $E[Z] = a\,E[X] + b\,E[Y] = a\mu_X + b\mu_Y$

• And $\text{Var}(Z) = a^2 \sigma_X^2 + b^2 \sigma_Y^2 + 2ab\,\sigma_{XY}$

• Result: A linear combination of two jointly normally distributed random variables is itself a normally distributed random variable.

22
Portfolio Returns (Review)
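• $R_A$ = return on asset A with $E[R_A] = \mu_A$ and $\text{Var}(R_A) = \sigma_A^2$

• $R_B$ = return on asset B with $E[R_B] = \mu_B$ and $\text{Var}(R_B) = \sigma_B^2$

• $\text{Cov}(R_A, R_B) = \sigma_{AB}$

• $\text{Cor}(R_A, R_B) = \rho_{AB} = \frac{\sigma_{AB}}{\sigma_A \sigma_B}$

• Portfolio
– $x_A$ = share of wealth invested in asset A
– $x_B$ = share of wealth invested in asset B
– $x_A + x_B = 1$
• The portfolio return is $R_p = x_A R_A + x_B R_B$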
23
Portfolio Returns and Risk
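• How much wealth should be invested in assets A and B?
• Portfolio expected return (this is the gain from investing): $\mu_p = E[R_p] = x_A \mu_A + x_B \mu_B$

• Portfolio variance / SD (this is the risk from investing): $\sigma_p^2 = x_A^2 \sigma_A^2 + x_B^2 \sigma_B^2 + 2 x_A x_B \sigma_{AB}$ and $\sigma_p = \sqrt{\sigma_p^2}$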
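
A minimal sketch of the two-asset portfolio calculation, applying the linear-combination results with weights that sum to one. All input numbers below are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import math

def portfolio_moments(x_a, mu_a, mu_b, sigma_a, sigma_b, rho_ab):
    """Expected return and SD of a two-asset portfolio with weights x_a and 1 - x_a."""
    x_b = 1.0 - x_a
    sigma_ab = rho_ab * sigma_a * sigma_b  # covariance from correlation
    mu_p = x_a * mu_a + x_b * mu_b
    var_p = (x_a ** 2) * sigma_a ** 2 + (x_b ** 2) * sigma_b ** 2 + 2.0 * x_a * x_b * sigma_ab
    return mu_p, math.sqrt(var_p)

# Hypothetical inputs: equal weights, uncorrelated assets.
mu_p, sd_p = portfolio_moments(x_a=0.5, mu_a=0.10, mu_b=0.05,
                               sigma_a=0.20, sigma_b=0.10, rho_ab=0.0)
print(round(mu_p, 4), round(sd_p, 4))

# Same weights with perfectly correlated assets: no diversification benefit.
_, sd_p_rho1 = portfolio_moments(0.5, 0.10, 0.05, 0.20, 0.10, rho_ab=1.0)
print(round(sd_p_rho1, 4))
```

With a correlation below one, the portfolio SD is smaller than the weighted average of the two asset SDs, which is the diversification effect.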

24
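Multi-Period Continuously Compounded Return
• Let $r_t = \ln(1 + R_t)$ be the monthly continuously compounded (cc) return.
• Assume that $r_t \sim N(\mu, \sigma^2)$ for all t and that the monthly returns are independent, so that $\text{Cov}(r_t, r_s) = 0$ for $t \neq s$
• Then the annual cc return is equal to the sum of twelve monthly cc returns: $r_t(12) = \sum_{j=0}^{11} r_{t-j}$
• Since each monthly return is normally distributed, the annual return is also normally distributed.
• Then the expected annual return: $E[r_t(12)] = \sum_{j=0}^{11} E[r_{t-j}] = 12\mu$

• Hence, the expected 12-month (annual) return is equal to 12 times the expected monthly return.
• The variance of the annual return: $\text{Var}(r_t(12)) = \sum_{j=0}^{11} \text{Var}(r_{t-j}) = 12\sigma^2$, so that the annual variance is also equal to 12 times the monthly variance.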

25
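Multi-Period Continuously Compounded Return
• The SD of the annual return: $\text{SD}(r_t(12)) = \sqrt{12\sigma^2} = \sqrt{12}\,\sigma$, where $\sigma$ is the monthly standard deviation

• Hence, the annual standard deviation is $\sqrt{12}$ times the monthly standard deviation (this result is famously known as the square root of time rule)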
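
The scaling rules can be sketched in a few lines of Python, assuming independent monthly cc returns with a constant mean and variance. The monthly inputs below are hypothetical.

```python
import math

def annualize(mu_monthly, sigma_monthly, months=12):
    """Scale monthly cc-return moments to a `months`-period horizon,
    assuming uncorrelated returns with constant mean and variance."""
    mu_annual = months * mu_monthly
    var_annual = months * sigma_monthly ** 2
    sd_annual = math.sqrt(months) * sigma_monthly  # square root of time rule
    return mu_annual, var_annual, sd_annual

# Hypothetical monthly mean of 1% and monthly SD of 5%.
mu_a, var_a, sd_a = annualize(0.01, 0.05)
print(round(mu_a, 4), round(var_a, 4), round(sd_a, 4))
```

The mean and variance grow linearly with the horizon, while the standard deviation grows only with its square root.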
Data Analysis – Excel Add-in
We will be using the Data Analysis ToolPak Add-In for Excel in this course extensively!
To activate it:
 File -> Options -> Add-Ins on the Left Sidebar
 Highlight Analysis ToolPak & hit GO (not OK)
 Check the Analysis ToolPak Option and hit OK

To see the Data Analysis tab -> go to DATA tab and at the far right end banner you should see Data Analysis under the
Analysis section

Refer to the Excel Primer pdf on the Learn site throughout the course if you ever need to go back and remember how we
use various Data Analysis features to calculate statistics and figures

27
Population & Samples
• A population is defined as all members of a specified group
– A descriptive measure of a population characteristic (mean, variance) is a parameter

• A sample is a subset of the population
– A descriptive measure of a sample characteristic is a sample statistic.

• Nominal scale
– weakest level of measurement
– categorize data – do not rank them
– example: hedge fund classifications

• Ordinal scale
– sort data into categories
– example: Standard & Poor's bond ratings

• Interval scale
– provide ranking
– differences between scale values are equal – can be added or subtracted
– example: Celsius and Fahrenheit temperature scales

• Ratio scale
– strongest level of measurement
– all characteristics of interval scale and zero as origin
– apply widest range of statistical tools to data that are on a ratio scale
– example: returns, earnings per share.

28
Concept of Random Sampling
• A random sample is a sequence of (usually an infinite number of) independently and identically distributed (i.i.d.) random variables with an unknown pdf, p(x)

• An observed sample (which we call data) is a (usually finite) set of observations generated by the random sample

 Descriptive Statistics are data summaries used to

 describe certain features of the observed sample (or data)

 learn about the unknown pdf, p(x), and

 capture observed dependencies, if any, in the data.

29
Histogram
• A frequency distribution is a tabular display of data summarized in a relatively small number of intervals
• A histogram is the graphical equivalent of a frequency distribution
• A histogram is used to describe the shape of the distribution of the observed sample (or data).
• How to construct a histogram?
– Order data from smallest to largest values; min = smallest value, max = largest value, range = max – min

– Bin width (Scott's normal reference rule) = 3.5 × standard deviation / (number of observations)^(1/3)

– Number of bins = (number of observations)^(1/2)

– Divide the range into N equally spaced bins

– Count the number of observations in each bin

– Create a bar chart.
• Excel – Under Data Analysis, go to Histogram -> Let's Try it Out
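
The construction steps can be sketched in Python as follows. This is only an illustration with a small hypothetical return sample; it uses the square-root rule for the number of bins and reports the Scott bin width separately, and the function names are not from the course material.

```python
import math
import statistics

def scott_bin_width(data):
    """Bin width from Scott's normal reference rule: 3.5 * SD / n^(1/3)."""
    return 3.5 * statistics.stdev(data) / len(data) ** (1.0 / 3.0)

def histogram_counts(data, n_bins=None):
    """Split [min, max] into equally spaced bins and count observations per bin.
    Defaults to the square-root rule for the number of bins.
    Assumes the data are not all identical (so the range is positive)."""
    n = len(data)
    lo, hi = min(data), max(data)
    if n_bins is None:
        n_bins = max(1, round(math.sqrt(n)))
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in data:
        # The maximum value is assigned to the last bin.
        k = min(int((v - lo) / width), n_bins - 1)
        counts[k] += 1
    return lo, width, counts

# Hypothetical monthly returns (not course data).
returns = [-0.12, -0.05, -0.03, -0.01, 0.005, 0.01, 0.02, 0.04, 0.06]
lo, width, counts = histogram_counts(returns)
print(lo, round(width, 4), counts)
print(round(scott_bin_width(returns), 4))
```

The two rules generally give different bin choices; which one reads better depends on the sample size and how heavy the tails are.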

30
Monthly CC Returns - Histogram
[Figure: Suncor Monthly CC Returns Histogram; x-axis: return bins from -45% to 35% in 5% steps, plus "More"; y-axis: Frequency, 0 to 80]
• The histogram has a bell shape like the normal distribution and is centered around values slightly more than zero
• The bulk of the Suncor returns are between -5% and 15%.
• The histogram for Suncor is slightly skewed left (long left tail) because the large negative returns are bigger in magnitude than the large positive returns
• Note: When comparing two or more return distributions, try using the same bins for each histogram – this makes the distributions easier to compare visually
• Eliminating gaps between bars in a histogram (Excel primer pp. 4-8)
– Right-click the column bar. Click Format. Hit Format Selection on the left side.
– On the right side under Format Data Series, change Gap Width from 150% to 0%. Hit ENTER.

Take 5 minutes to open "Descriptive Statistics – In-Class Problems" and attempt the S&P TSX Returns Histogram tab

31
Monthly Price Data Time Plot
[Figure: Suncor Adjusted Closing Price (CAD), 2001-2023; monthly observations from 2000/12 to 2023/08]
[Figure: S&P TSX Composite Index, 2001-2023; monthly observations from 2000/12 to 2023/06]
• What do you observe about asset prices in the plots shown?
• The prices exhibit random-walk-like behavior with no tendency for the observations on the prices to revert to a constant (or time independent) mean and, thus, appear to be non-stationary
• Both the Suncor stock price and the S&P TSX Composite index show the run-up to the global financial crisis of 2008 and then the sharp drop and the subsequent recovery after the financial crisis. In 2022, the two series differ: the index has passed the highs of January 2020, while Suncor's price dropped sharply at the start of the health crisis and has since recovered.
• There is a common trend observed between the two price series.

32
Monthly CC Returns Time Plot
[Figure: SUNCOR Monthly CC Return, 2001-2023]
[Figure: S&P TSX Composite Monthly CC Returns, 2001-2023]
• What do you observe about asset returns in the plots shown?
• In contrast to asset prices, asset returns are mean-reverting and the common monthly mean values seem close to zero
• The constant mean value assumption of stationarity looks to hold.
• However, the volatility (i.e., the fluctuation of returns about the mean) of both series appears to change over time
• Both series show higher volatility during the 2008 financial crisis and the 2020 health crisis
• This is an indication of time-varying conditional volatility (which is a form of non-stationarity in volatility).
• There does not appear to be any evidence of systematic time dependence in the returns
• Later on we will see that the estimated autocorrelation coefficients (a new concept to be discussed) are very close to zero
• The returns for Suncor and the S&P TSX index tend to move together, suggesting a positive correlation.

33
Monthly CC Returns Time Plot (Another Perspective)
[Figure: Monthly CC Returns 2001-2023; series: Suncor cc return (SU_Ret) and S&P TSX cc return (S&P_TSX_Ret)]
• Suncor is more volatile than the S&P TSX index
• In general, the lower volatility of the S&P TSX index represents the reduced risk of a large diversified portfolio.

34
Empirical Quantiles
• Mean and variance describe the shape characteristics of a distribution of data such as continuously compounded returns

• Often we are also interested in describing the relative location of a particular measurement within a given data set

• One such measure is a percentile

• When a company XYZ reports that its yearly sales are in the 90th percentile of all companies in the industry – what does it mean?
– It means that 90% of all companies in this industry have yearly sales less than XYZ's, and only 10% have yearly sales exceeding XYZ's

35
Percentiles
• Empirical percentiles that partition a data set into 4 segments, with each segment containing 25% of the measurements, are known as quartiles.
– The lower (or first) quartile is the 25th percentile,
– The middle (or second) quartile is the median or 50th percentile,
– The upper (or third) quartile is the 75th percentile,
• The second empirical quartile is the sample median and is the data point such that half of the data is less than or equal to its value.
• The distance between the upper (3rd) and lower (1st) quartiles is known as the interquartile range (IQR): IQR = 75th percentile − 25th percentile
• IQR shows the spread of the middle 50% of the distribution of the data
• Quartiles are useful in finding unusual observations in a data set.
• Use [Link] (representing inclusive) to calculate percentiles of dataset
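
For readers working outside Excel, the quartiles and IQR can be sketched with Python's standard library. The assumption here is that `statistics.quantiles` with `method="inclusive"` (linear interpolation between order statistics, treating the sample minimum and maximum as the 0th and 100th percentiles) is the convention intended by the inclusive percentile calculation above. The sample is hypothetical.

```python
import statistics

def quartiles_and_iqr(data):
    """Return (Q1, median, Q3, IQR) using the inclusive interpolation method."""
    q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return q1, q2, q3, q3 - q1

# Hypothetical sample (not course data).
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9]
q1, median, q3, iqr = quartiles_and_iqr(sample)
print(q1, median, q3, iqr)
```

Different software packages use slightly different interpolation rules, so small discrepancies in quartile values across tools are normal.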

36
Sample Statistics
• To calculate sample quantities for the mean, variance (or standard deviation), skewness and kurtosis of our financial data, two critical assumptions about the data must be met:
1. The data must be covariance (or weakly) stationary, so that the population quantities for the mean, variance (or standard deviation), skewness and kurtosis of the data are constants and not functions of time. This allows the sample quantities to be calculated as sample averages.
2. Over the sample of observations (t = 1, ..., T), there must be only one regime/process generating the data, so that sample quantities can be calculated as one sample average for each moment.
• Under these two assumptions, we calculate the sample mean, variance (or standard deviation), skewness and kurtosis as follows:
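
The four sample moments can be sketched as below. The exact formulas on the original slide are not recoverable here, so this sketch assumes one common convention: every sum of powered deviations is divided by T − 1, and skewness and kurtosis are scaled by powers of the sample SD. Excel's SKEW and KURT functions apply extra small-sample adjustments (and KURT reports excess kurtosis), so their output will differ somewhat.

```python
import math

def sample_moments(data):
    """Sample mean, variance, SD, skewness and kurtosis as sample averages.
    Assumed convention: divide each sum of powered deviations by T - 1."""
    t = len(data)
    mean = sum(data) / t
    dev = [x - mean for x in data]
    var = sum(d ** 2 for d in dev) / (t - 1)
    sd = math.sqrt(var)
    skew = (sum(d ** 3 for d in dev) / (t - 1)) / sd ** 3
    kurt = (sum(d ** 4 for d in dev) / (t - 1)) / sd ** 4
    return {"mean": mean, "var": var, "sd": sd, "skew": skew, "kurt": kurt}

# Hypothetical symmetric sample, so the skewness is zero.
stats = sample_moments([1, 2, 3, 4, 5])
print({k: round(v, 4) for k, v in stats.items()})
```

For a normal distribution the population kurtosis is 3, so values well above 3 on real return data point to fat tails.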

37
Outliers
• Extremely large or small values are called "outliers"
• Outliers can be thought of in two ways:
– First, an outlier can be the result of a data entry error - the outlier is not a valid observation and should be removed from the data sample

– Second, an outlier can be a valid data point whose behavior is seemingly unlike the other data points - the outlier provides important information and should not be removed from the data sample
• For financial market data, outliers are typically extremely large or small values that could be the result of a data entry error (e.g. price entered as 1 instead of 10) or a valid outcome associated with some unexpected news.
• Outliers are problematic for data analysis because they can greatly influence the value of sample statistics: the sample mean, variance, standard deviation, skewness and kurtosis
• Percentile measures are more robust to outliers; outliers do not greatly influence these measures (e.g. median instead of mean; IQR instead of SD)
• IQR (interquartile range) – outlier robust measure of spread

• Moderate outlier: $\hat q_{.75} + 1.5\,\text{IQR} < x < \hat q_{.75} + 3\,\text{IQR}$ or $\hat q_{.25} - 3\,\text{IQR} < x < \hat q_{.25} - 1.5\,\text{IQR}$
• Extreme outlier: $x > \hat q_{.75} + 3\,\text{IQR}$ or $x < \hat q_{.25} - 3\,\text{IQR}$
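
A short sketch of IQR-based outlier flagging follows. It assumes the widely used fences of 1.5 × IQR (moderate) and 3 × IQR (extreme) beyond the quartiles, together with inclusive quartile interpolation; both choices are assumptions for illustration, and the sample is hypothetical.

```python
import statistics

def classify_outliers(data):
    """Split observations into moderate and extreme outliers using IQR fences.
    Assumed fences: 1.5 * IQR (moderate) and 3 * IQR (extreme) beyond Q1 / Q3."""
    q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    inner_lo, inner_hi = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outer_lo, outer_hi = q1 - 3.0 * iqr, q3 + 3.0 * iqr
    extreme = [x for x in data if x < outer_lo or x > outer_hi]
    moderate = [x for x in data
                if (x < inner_lo or x > inner_hi) and outer_lo <= x <= outer_hi]
    return moderate, extreme

# Hypothetical sample with one moderate and one extreme value.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 18, 30]
moderate, extreme = classify_outliers(sample)
print(moderate, extreme)
```

Flagged points still need a judgment call: a data-entry error should be removed, while a valid extreme return should stay in the sample.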

38
Outliers

• To illustrate the impact of outliers on sample statistics, simulated data (i.i.d. N(0,1)) is polluted by a single large negative outlier.
• The table compares the sample statistics of the unpolluted and polluted data.
• The sample statistics are influenced by the outlier:
 mean

 skewness

 kurtosis

 standard deviation
Sample Statistics - Example
Excel – Under Data Analysis, go to Descriptive Statistics -> Let’s Try it Out

Take 5 minutes to open "Descriptive Statistics – In-Class Problems" and attempt the S&P TSX Returns DStats tab

Calculate:
– Descriptive statistics
– 1st, 5th, 10th, 25th, 50th, 75th, 90th, 95th and 99th percentiles
– Interquartile range
– Moderate outliers
– Extreme outliers

40
Additional Measures of Dispersion
• Relative Dispersion: Coefficient of Variation = standard deviation / mean
– Free of scale – allows comparison of dispersion across datasets - how much dispersion exists relative to the mean of the distribution

– Amount of risk per unit of return

• Sharpe Ratio: Amount of excess return per unit of risk

– (Mean portfolio return - mean risk-free return) / standard deviation of portfolio return

– Common measure of portfolio performance.

• Chebyshev's inequality using standard deviation as a measure of dispersion

• Let k be any constant greater than 1. The proportion of the observations within k standard deviations of the mean is at least 1 − 1/k²
– The inequality holds for samples and populations and for discrete and continuous data regardless of the shape of the distribution
– k = 2: at least 75% of the observations lie within +/- 2 standard deviations of the mean
– k = 5: at least 96% of the observations lie within +/- 5 standard deviations of the mean
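
These three measures are simple enough to sketch directly. The function names and the numeric inputs for the coefficient of variation and Sharpe ratio are hypothetical; only the k = 2 and k = 5 cases come from the slide.

```python
def chebyshev_lower_bound(k):
    """Minimum proportion of observations within k SDs of the mean (k > 1)."""
    if k <= 1:
        raise ValueError("k must be greater than 1")
    return 1.0 - 1.0 / k ** 2

def coefficient_of_variation(mean, sd):
    """Relative dispersion: amount of risk per unit of return."""
    return sd / mean

def sharpe_ratio(mean_portfolio, mean_risk_free, sd_portfolio):
    """Excess return per unit of risk."""
    return (mean_portfolio - mean_risk_free) / sd_portfolio

print(chebyshev_lower_bound(2))            # 0.75
print(round(chebyshev_lower_bound(5), 2))  # 0.96

# Hypothetical annual figures: 8% mean return, 2% risk-free rate, 12% SD.
print(round(coefficient_of_variation(0.08, 0.12), 2))
print(round(sharpe_ratio(0.08, 0.02, 0.12), 2))
```

Chebyshev's bound is deliberately conservative: for normally distributed data about 95% of observations fall within two standard deviations, well above the guaranteed 75%.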

41
Empirical CDF
• Recall that the CDF of a rv X is $F_X(x) = \Pr(X \le x)$
• Then the empirical CDF of a random sample $x_1, \dots, x_n$ is $\hat F_X(x) = \frac{1}{n}\,\#\{x_i \le x\}$, i.e. the fraction of observations less than or equal to x
• How to compute and plot the empirical CDF for a sample of data?

– Sort data from smallest to largest values in the form of order statistics: $x_{(1)} \le x_{(2)} \le \dots \le x_{(n)}$

– Plot $\hat F_X(x_{(i)})$ against the sorted data

– $x_{(1)}, \dots, x_{(n)}$ are known as order statistics; in particular, $x_{(1)}$ is the sample minimum and $x_{(n)}$ is the sample maximum

• Why are we interested in computing an empirical CDF?
– This is a simple way to assess whether a given empirical distribution of asset returns is normal, or close to normal, as is often assumed
– Next, we compare the empirical CDF of a random variable (which is our asset return) to the CDF of a N(0,1) distribution.

42
Calculating the Empirical CDF
• Question: Does the observed data come from a normal distribution? Let's see what this looks like in Excel
• To answer this question, we follow the steps given below:
– Step 1. Standardize data to have a zero mean and a variance equal to one

– Step 2. Sort standardized data from smallest to largest values

– Step 3. Compute standard normal (also known as Gaussian White Noise – GWN) CDF at each sorted value

– Plot the empirical CDF and the standard normal CDF against the sorted data.

• We can interpret the Empirical CDF as follows:
[Figure: empirical CDF of standardized Suncor returns, CDF_SU_Ret (EMPIRICAL), and normal CDF, CDF_SU_Ret (Normal); x-axis from -8.00 to 6.00; y-axis from 0.00 to 1.20]
– If the red curve is close to the blue curve, which is the normal reference distribution, then our conjecture that the empirical distribution of Suncor's cc returns is normal is appropriate
– The closer the two curves, the more plausible it is that the data is sampled from a normally distributed population
– We notice deviations from a normal distribution, especially around the tails (positive and negative), for Suncor's returns.
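
The three steps can be sketched in Python as follows, using `statistics.NormalDist` from the standard library for the normal CDF. The empirical CDF at the i-th sorted value is taken to be i/n, which assumes no tied observations; the return sample is hypothetical.

```python
import statistics

def ecdf_vs_normal(data):
    """Return rows (z, empirical CDF, standard normal CDF) for sorted standardized data."""
    mean = statistics.mean(data)
    sd = statistics.stdev(data)
    # Steps 1 and 2: standardize, then sort from smallest to largest.
    z_sorted = sorted((x - mean) / sd for x in data)
    n = len(z_sorted)
    std_normal = statistics.NormalDist()  # N(0, 1)
    # Step 3: empirical CDF i/n next to the normal CDF at each sorted value.
    return [(z, (i + 1) / n, std_normal.cdf(z)) for i, z in enumerate(z_sorted)]

# Hypothetical monthly cc returns (not course data).
rows = ecdf_vs_normal([0.02, -0.01, 0.03, -0.04, 0.00])
for z, emp, norm in rows:
    print(round(z, 3), round(emp, 2), round(norm, 3))
```

Plotting the second and third columns against the first gives the two curves that are compared on the slide.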

43
Value at Risk (Review)
• Let {R_t}, t = 1, ..., T, denote a sample of T simple monthly returns on an investment.
• Let W0 be the initial value of an investment
• For α in (0, 1), the historical VaRα is W0 × qα(R) for simple returns, where qα(R) is the empirical α-quantile of the simple returns
• Note: For cc returns r_t, we use W0 × (exp(qα(r)) − 1), where qα(r) is the empirical α-quantile of the cc returns
• Consider investing $10,000 in Suncor for a month. Calculating the VaR at 1% gives VaR0.01 = 10,000*(exp(q0.01) - 1), a loss of $1,854. So we say that a $10,000 one-month investment in Suncor will lose $1,854 or more with 1% probability -> recall from last week!
• If the corresponding VaR at 1% for the S&P TSX is $858, then since this is considerably smaller than Suncor's 1% VaR, we can say that investing in Suncor is riskier than investing in the S&P TSX index.
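
A minimal sketch of the historical VaR calculation for cc returns is given below. The quantile is computed by linear interpolation between order statistics, which is an assumed convention, and the return sample is an artificial grid rather than Suncor data. The last lines back out the quantile implied by the $1,854 loss quoted above and confirm the dollar figure.

```python
import math

def empirical_quantile(data, alpha):
    """Empirical alpha-quantile by linear interpolation between order statistics."""
    xs = sorted(data)
    pos = alpha * (len(xs) - 1)
    lo = math.floor(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (pos - lo) * (xs[hi] - xs[lo])

def var_from_cc_quantile(q_alpha, w0):
    """VaR for cc returns: W0 * (exp(q_alpha) - 1); a negative value is a loss."""
    return w0 * (math.exp(q_alpha) - 1.0)

# Artificial sample: 101 evenly spaced cc returns from -50% to +50%.
cc_returns = [(i - 50) / 100 for i in range(101)]
q01 = empirical_quantile(cc_returns, 0.01)
print(round(q01, 4), round(var_from_cc_quantile(q01, 10_000), 2))

# Quantile implied by the slide's figure: a loss of $1,854 on $10,000.
implied_q = math.log(1 - 0.1854)
print(round(implied_q, 4), round(var_from_cc_quantile(implied_q, 10_000), 2))
```

The sign convention matters: the formula returns a negative dollar amount, which is usually reported as a positive loss.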

44
Quantile-Quantile (Q-Q) Plot
Let's see what this looks like in Excel

• A normal probability or Quantile-Quantile (QQ) plot is useful for comparing the quantiles of the data with the quantiles of a specified reference distribution (usually a normal distribution) that we think is appropriate for the return data, i.e. if we believe the distribution is normal and want to check it
• The QQ plot is an XY plot with the reference distribution quantiles (normal distribution quantiles) on the x-axis and the empirical quantiles (Suncor empirical quantiles) on the y-axis.
• How to construct a QQ Plot
1. Column C is rank, i ranging from 1 to n (n is the number of observations in the data series)
2. Column D is the sorted Suncor returns
3. Column E is the cumulative relative frequency: i/n
4. Column F lists the standard normal quantiles: NORMINV(E2,0,1)
5. Column F values are copied and pasted as values in column G
6. Column H is the standardized Suncor returns
7. Highlight columns F, G and H and draw a scatter XY plot.
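
The same construction can be sketched in Python. One deliberate change from the spreadsheet steps: the sketch uses the plotting position (i − 0.5)/n instead of i/n, because the normal quantile at a cumulative frequency of exactly 1 (the last row when using i/n) is not finite. The return sample is hypothetical.

```python
import statistics

def qq_points(data):
    """Return (theoretical quantile, standardized sorted value) pairs for a normal QQ plot."""
    n = len(data)
    mean = statistics.mean(data)
    sd = statistics.stdev(data)
    z_sorted = sorted((x - mean) / sd for x in data)
    std_normal = statistics.NormalDist()  # N(0, 1)
    # Assumed plotting position (i - 0.5) / n, so every probability lies strictly in (0, 1).
    theoretical = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, z_sorted))

# Hypothetical monthly cc returns (not course data).
points = qq_points([0.02, -0.01, 0.03, -0.04, 0.00])
for q, z in points:
    print(round(q, 3), round(z, 3))
```

If the data were close to normal, the pairs would lie near the 45-degree line; large gaps at either end indicate heavier or lighter tails than the normal distribution.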

45
Q-Q Plot Interpretation
[Figure: Q-Q Plot, Suncor Monthly CC returns; series: standard normal quantiles and standardized SU returns]
• We can interpret the QQ plot in the following way:
– If all of the points are close to a straight line, then the reference distribution we conjecture is appropriate
– If the points do not fall close to a straight line, then the reference distribution we conjecture is not appropriate and we should consider a different distribution instead
– The closer the red dots are to the blue dots, the more plausible it is that the data is sampled from a normally distributed population.
– The QQ plot for Suncor's returns indicates that there are outliers, indicating deviation from a normal distribution.

• Standard normal quantiles are plotted against themselves
• Standardized SU returns (Y axis) are plotted against standard normal quantiles (X axis)

46
Bivariate Descriptive Statistics
[Figure: scatter plot, SUNCOR CC Ret vs. S&P TSX CC Ret]
• Sample covariance: $\hat\sigma_{xy} = \frac{1}{T-1}\sum_{t=1}^{T}(x_t - \bar x)(y_t - \bar y)$
• Sample correlation: $\hat\rho_{xy} = \frac{\hat\sigma_{xy}}{\hat\sigma_x\,\hat\sigma_y}$
• Sample covariance and correlation between Suncor and S&P TSX cc returns
– (Use Data/Data Analysis/Analysis Tools/Covariance)
• Suncor's returns appear to be positively correlated (moderately high positive correlation) with the S&P TSX index returns. The correlation is 0.63.
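
For reference, sample covariance and correlation can be sketched in plain Python. The sketch divides by T − 1, which is an assumed convention; a tool that divides by T instead gives a slightly smaller covariance, while the correlation is unaffected. The two series below are hypothetical, not the Suncor and S&P TSX data.

```python
import math

def sample_cov(x, y):
    """Sample covariance with a T - 1 divisor (assumed convention)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)

def sample_corr(x, y):
    """Sample correlation: covariance scaled by the two sample SDs."""
    return sample_cov(x, y) / math.sqrt(sample_cov(x, x) * sample_cov(y, y))

# Hypothetical series with an exact linear relationship y = 2x.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 6.0, 8.0]
print(round(sample_cov(x, y), 4), round(sample_corr(x, y), 4))
```

Because correlation is scale-free, it is easier to interpret than covariance when comparing pairs of assets with different volatilities.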

47
Wrap Up & Next Class

48
Market/Economics Graphics Report &
Presentation
Goal: To link economics & finance to a topic of your choice. Find an article or research a topic that
interests you and your group. Link that topic or article to the concept of economics or finance.

You must cover the following elements:

• Motivation for choosing the topic (why is it relevant or important)
• Description of your findings
• Intuitive economic explanations for your finding
• Main takeaways
• Suggestions for additional research/analysis that could provide additional insights to the findings

The selection of your topics is pretty open ended – you can really discuss anything as long as it has a relation to
economics and enough content to create a report, graphic and presentation.

Format:
• Max 2-page report (excluding references) + 1-page graphic (graphic not included in the 2-page count)
• The graphic is meant to be an infographic from which the reader can pick up the key concepts of your report

Presentation:
• 10-minute presentation to the class with a 5-minute Q&A session
• The Q&A team will ask questions to the presenting team and, if time permits, we will open up Q&A to the entire class
• First presentations will take place on October 22 – your report & slide deck are due to the dropbox by 12PM on the day of your presentation
49
Now What for Week 5?
Week 5 Focus: Constant Expected Return (CER) Model
• What does the CER mean?
• How can we define error terms?
• Estimating regression parameters of the CER model
• Statistical properties of estimators

Problem Sets:
• Problem Set 3 – Descriptive Statistics now available on Learn (attempt to complete it to test your
understanding)
• Review the Probability Review (Part V), Descriptive Stats (Part I) and Descriptive Stats (Part II) excel files
on Learn for the sample calculations

Assignments Due:
• Assignment 2 – Random Variables & Descriptive Statistics due 7PM on October 8
• Please review the assigned stock information (under the Admin folder on Learn) to see what stock you
have been assigned. Note that you will stick with this stock to complete all assignments in the course

Projects & Presentations:

• None

