SRCASW, University of Delhi
"Fooled" by Statistics & ML
Dr. Tanujit Chakraborty
Ph.D. from Indian Statistical Institute, Kolkata, India.
Assistant Professor of Statistics at Sorbonne University
[email protected] | https://www.ctanujit.org
NORMALITY IS A MYTH!
CORRELATION DOES NOT IMPLY CAUSATION!
ALL MODELS ARE WRONG, BUT SOME ARE USEFUL!
NORMALITY IS A MYTH!
Normality & Beyond Normality
Normality is a paved road. It is easy to walk but no flowers
grow on it. — Vincent Van Gogh.
By Dr. Saul McLeod (2019)
A Few Famous Quotations
Normality is a myth; there never was, and there never will be a normal
distribution — Roy C. Geary (1947; Biometrika, vol. 34, 248).
Everybody believes in the exponential law of errors (the normal
distribution), the experimenters, because they think it can be proved by
mathematicians; and the mathematicians, because they believe that it has
been established by observations — E.T. Whittaker and G. Robinson
(1967).
... the statistician knows ... that in nature there never was a normal
distribution, there never was a straight line, yet with normal and linear
assumptions, known to be false, he can often derive results which match,
to a useful approximation, those found in the real world — George E. P. Box
(1976, Journal of the American Statistical Association, vol. 71, 791-799).
Normal Distribution
A random variable X is said to be normally distributed with mean µ and
variance σ², if the probability density function of X is the following
(for −∞ < µ < ∞ and σ > 0):

$$f(x;\mu,\sigma) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}, \qquad -\infty < x < \infty.$$
Probability Density Function of Normals
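As a quick sanity check, the density above can be coded directly; a minimal Python sketch (the function name is mine), verifying numerically that the density integrates to 1:

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of N(mu, sigma^2) at x, straight from the formula above."""
    coeff = 1.0 / (math.sqrt(2.0 * math.pi) * sigma)
    return coeff * math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))

# The density should integrate to 1; check with a Riemann sum on [-10, 10]
step = 0.001
total = sum(normal_pdf(-10.0 + k * step) * step for k in range(20_000))
print(round(total, 4))  # ≈ 1.0
```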
Galton Board
• Sir Francis Galton, Charles
Darwin’s half-cousin, invented
the ’Galton Board’ in 1874 to
demonstrate that the normal
distribution is a natural
phenomenon.
• It specifically shows that the
binomial distribution
approximates a normal
distribution when the number of
trials is large enough.
Picture of Galton Board
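The board is easy to simulate: each ball takes a fixed number of independent left/right bounces, so its final bin count is Binomial(n_rows, 1/2), and the histogram of bins comes out bell-shaped. A Python sketch (all names are mine):

```python
import random
from collections import Counter

def galton_board(n_balls=50_000, n_rows=12, seed=0):
    """Simulate a Galton board: each ball bounces right with probability 1/2
    at each of n_rows pegs; its final bin is the number of right bounces."""
    rng = random.Random(seed)
    return Counter(sum(rng.random() < 0.5 for _ in range(n_rows))
                   for _ in range(n_balls))

bins = galton_board()
for k in range(13):
    print(f"{k:2d} {'#' * (bins[k] // 500)}")  # bell-shaped bar chart
```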
How Did It All Start?
Gambling Question: A 17th-century gambler, the Chevalier de Méré, asked
Pascal for an explanation of his unexpected losses in gambling.
The famous correspondence between Pascal and Fermat was instigated in
1654, and they were mainly interested in calculating the following
binomial sum:

$$\sum_{k=i}^{j} \binom{n}{k}\, p^k (1-p)^{n-k}$$

The problem is not difficult when n is small.
A Brief History
Within a few years a problem arose in a sociological study where the
following computation was necessary, with n = 11,429, i = 5745, j = 6128:

$$\sum_{k=i}^{j} \binom{n}{k}\, p^k (1-p)^{n-k}$$

Original Problem: test the hypothesis that male and female births are
equally likely against the actual births in London over the 82 years from
1629 to 1710. The relative number of male births varies from a low of
7765/15,448 = 0.5027 in 1703 to a high of 4748/8855 = 0.5362 in 1661.
Here 11,429 is the average number of births in London over the 82 years,
and 5745 and 6128 are the two limits.
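With modern software, the sum that defeated 18th-century calculators is immediate. A Python sketch evaluating the exact binomial probability in log space to avoid overflow (function name is mine); the exact sum comes out around 0.29, close to the historically reported ≈ 0.292:

```python
import math

def log_binom(n, k):
    """log of the binomial coefficient C(n, k), via log-gamma."""
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)

n, i, j = 11_429, 5_745, 6_128
log_half = math.log(0.5)
# P(i <= X <= j) for X ~ Binomial(n, 1/2): sum each term exp(log C(n,k) + n log(1/2))
prob = sum(math.exp(log_binom(n, k) + n * log_half) for k in range(i, j + 1))
print(round(prob, 3))
```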
Solution
Using the recurrence relation

$$\binom{n}{x+1} = \binom{n}{x}\,\frac{n-x}{x+1}$$

and some involved rational approximation, it was obtained that

$$P(5745 \le X \le 6128 \mid p = 1/2) = \sum_{i=5745}^{6128} \binom{11429}{i} \left(\tfrac{1}{2}\right)^{11429} \approx 0.292.$$
The Breakthrough
De Moivre began the search for this approximation in 1721,
and in 1733 he proved that

$$\binom{n}{\frac{n}{2}+x}\frac{1}{2^n} \approx \frac{2}{\sqrt{2\pi n}}\, e^{-2x^2/n}$$

and

$$\sum_{|x-n/2|\le a} \binom{n}{x}\frac{1}{2^n} \approx \frac{4}{\sqrt{2\pi}} \int_0^{a/\sqrt{n}} e^{-2y^2}\, dy.$$
Normal Approximation
Eventually, using the second approximation, one gets

$$\sum_{k=i}^{j} \binom{n}{k}\, p^k (1-p)^{n-k} \approx \Phi\!\left(\frac{j-np}{\sqrt{np(1-p)}}\right) - \Phi\!\left(\frac{i-np}{\sqrt{np(1-p)}}\right),$$

where

$$\Phi(z) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{z} e^{-x^2/2}\, dx$$

is the cumulative distribution function (CDF) of the
standard normal distribution.
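The approximation is easy to check numerically: Φ is available through the error function as Φ(z) = (1 + erf(z/√2))/2. A Python sketch (function names are mine) applied to the 1733 birth-ratio problem:

```python
import math

def Phi(z):
    """Standard normal CDF, computed via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def binom_normal_approx(n, p, i, j):
    """Normal approximation to P(i <= X <= j) for X ~ Binomial(n, p)."""
    mu, sd = n * p, math.sqrt(n * p * (1 - p))
    return Phi((j - mu) / sd) - Phi((i - mu) / sd)

# n = 11429, p = 1/2, limits 5745 and 6128 — the answer lands near 0.29
print(round(binom_normal_approx(11_429, 0.5, 5_745, 6_128), 3))
```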
Error Modeling
Gauss (1809) made the following assumptions and deduced the
normal distribution as an error distribution:
1 Small errors are more likely than large errors.
2 For any real number ε, errors of magnitudes ε and −ε are
equally likely.
3 Given several measurements of the same quantity, the most
likely value of the quantity being measured is their average.
To read more about the evolution of normal distribution: Saul Stahl (2006), “The
evolution of normal distribution”, Mathematics Magazine, vol. 79, no. 2, 96 - 113.
Central Limit Theorem
Lindeberg-Lévy CLT:
Suppose {X₁, X₂, . . .} is a sequence of independent, identically
distributed random variables with mean µ and variance σ² < ∞.
Then, as n → ∞,

$$\frac{\sqrt{n}}{\sigma}\left(\frac{1}{n}\sum_{i=1}^{n} X_i - \mu\right) \xrightarrow{d} N(0,1).$$
CLT in Practice
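The statement is easy to see empirically: standardize the mean of n non-normal draws and the resulting values look approximately N(0, 1). A Python sketch using Exp(1) draws, which have mean 1 and variance 1 (function names are mine):

```python
import math
import random
import statistics

def standardized_mean(n, rng):
    """sqrt(n)/sigma * (sample mean - mu) for n Exp(1) draws (mu = sigma = 1)."""
    xs = [rng.expovariate(1.0) for _ in range(n)]
    return math.sqrt(n) * (statistics.fmean(xs) - 1.0)

rng = random.Random(42)
zs = [standardized_mean(200, rng) for _ in range(5_000)]

# Despite the heavily skewed Exp(1) parent, these are close to N(0, 1):
print(round(statistics.fmean(zs), 2), round(statistics.stdev(zs), 2))
```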
Some Drawbacks
What will happen if the data indicate that the parent
distribution
1 is not symmetric?
2 is heavy-tailed?
3 is not unimodal?
What will happen if the error distribution is not normal in
regression modeling?
Possible Remedy
In Distribution Theory:
1 Skew Normal Distribution (A Azzalini, Scandinavian
Journal of Statistics 1985)
2 Power Normal Distribution (RD Gupta, Test 2008)
3 Geometric Skew-Normal Distribution (D Kundu, Sankhya
2014), etc.
In Regression Theory:
1 Box-Cox Transformation (Box, Cox, JRSS Series-B 1964)
2 Generalized linear model (Nelder, Wedderburn, JRSS
Series-A 1972)
3 Semiparametric and Nonparametric Approaches (see
ESLR/ISLR Book), etc.
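As an illustration of the first regression-side remedy, the Box-Cox transform y(λ) = (y^λ − 1)/λ (with log y at λ = 0) can pull a right-skewed sample back toward symmetry. A stdlib-only Python sketch on log-normal data, where λ = 0 is the correct choice (in practice λ is estimated from the data, e.g. with scipy.stats.boxcox; all function names here are mine):

```python
import math
import random
import statistics

def boxcox(y, lam):
    """Box-Cox power transform: (y^lam - 1)/lam, or log(y) when lam == 0."""
    return math.log(y) if lam == 0 else (y ** lam - 1.0) / lam

def skewness(xs):
    """Sample skewness: mean of standardized cubes."""
    m, s = statistics.fmean(xs), statistics.pstdev(xs)
    return statistics.fmean([((x - m) / s) ** 3 for x in xs])

rng = random.Random(0)
data = [math.exp(rng.gauss(0.0, 1.0)) for _ in range(10_000)]  # log-normal: right-skewed

print(round(skewness(data), 2))                           # strongly positive
print(round(skewness([boxcox(y, 0) for y in data]), 2))   # near 0 after transforming
```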
CORRELATION DOES NOT IMPLY CAUSATION!
Correlation
Correlation may indicate any type of association. Correlation implies
association, but not causation. Conversely, causation implies
association, but not correlation.¹

¹ Altman & Krzywinski, "Association, correlation and causation", Nature Methods (2015).
Causality: What is it?
Causality is a central notion in science, decision-making, and daily life.
Causal inference ≈ causal language/model + statistical inference.
Question: How do you define cause and effect?
Causality in Philosophy
“. . . Thus we remember to have seen that species of
object we call flame, and to have felt that species of
sensation we call heat. We likewise call to mind
their constant conjunction in all past instances.
Without any farther ceremony, we call the one
cause and the other effect, and infer the existence
of the one from that of the other.”
- David Hume, A Treatise of Human Nature
(1738).
But: Does the stork really
bring babies?
Causality in Statistics
A Paradigm Shift: Basic Contributions
• The modeling of the underlying structures provides a language to
encode causal relationships – the basis of a causality theory.
• Causality theory helps to decide when, and how, causation can be
inferred from domain knowledge and data.
ML techniques are impacting our life
A day in our life with Machine Learning techniques...
Now we are stepping into risk-sensitive areas
Shifting from Performance Driven to Risk Sensitive...
Problems of today’s ML - Explainability
Most machine learning models are black-box models...
Problems of today’s ML - Stability
Most ML methods are developed under the i.i.d. hypothesis...
A plausible reason: Correlation
Correlation is the very basis of machine learning...
Correlation is not explainable
Correlation is "unstable"
Correlation vs. Causation
It’s not the fault of correlation, but the way we use it...
A Practical Definition of Causality
Benefits of bringing causality into learning
More Explainable and More Stable...
Causality everywhere
Correlation does not imply causation
“Correlation = Causation” is a cognitive bias
Then, what does imply causation?
Source: https://www.bradyneal.com/causal-inference-course
Languages for Causality
Using graphs:
• 1921 Wright (genetics);
• 1988 Pearl (computer science “AI”);
• 1993 Spirtes, Glymour, Scheines (philosophy);
• 2000 Pearl (computer science).

Using structural equations:
• 1921 Wright (genetics);
• 1943 Haavelmo (econometrics);
• 1975 Duncan (social sciences).

Using potential outcomes / counterfactuals:
• 1923 Neyman (statistics);
• 1973 Lewis (philosophy);
• 1974 Rubin (statistics);
• 1986 Robins (epidemiology).
Reference: The Book of Why: The New Science of Cause and Effect by Judea Pearl and
Dana Mackenzie (2019).
ALL MODELS ARE WRONG, BUT SOME ARE USEFUL!
The Science of Forecasting
Forecasting is estimating how the sequence of observations will continue into
the future. Whether it is the rise/fall in exchange rates, the outcome of
elections, or winners at the Oscars, there is sure to be something you want to
know.
Random futures
Mathematical/Statistical models are simplifications of reality – and life is
sometimes too complex to model accurately.
Which is easiest to forecast? (Easy to Tough)
1 Time of sunrise this day next year.
2 Maximum temperature tomorrow.
3 Daily electricity demand in 3 days' time.
4 Google stock price tomorrow.
5 Exchange rate of USD/INR next week.
How do we measure “easiest”?
What makes something easy/difficult to forecast?
Forecastability factors
Something is easier to forecast if:
• We have a good understanding of the factors that contribute to it, and
can measure them (for stock prices and exchange rates the causes are
mostly unknown).
• There is lots of data available.
• The future is somewhat similar to the past.
• The forecasts cannot affect the thing we are trying to forecast (say,
Warren Buffett, CEO of Berkshire Hathaway, makes a comment and the
stock price changes!).
• When should we give up? When there is insufficient data? When the
models give implausible forecasts?
Various Forecasting Models
A recently published survey paper: Nowcasting of COVID-19 confirmed cases:
Foundations, trends, and challenges (Chakraborty et al., Modelling, Control and Drug
Development for COVID-19 Outbreak Prevention, 2021)
Forecasts can go very wrong
“Prediction is very difficult, especially if it’s about the future!”
- Niels Bohr, Danish Physicist & Nobel laureate in Physics.
Textbook and References