You're reading from Causal Inference and Discovery in Python: Unlock the secrets of modern causal machine learning with DoWhy, EconML, PyTorch and more

Product type Paperback

Published in May 2023

Publisher Packt

ISBN-13 9781804612989

Length 466 pages

Edition 1st Edition

Languages

Python

Tools

PyTorch

Concepts

Data Science

Author (1):

Aleksander Molak


The basics II – propensity scores

In this section, we discuss propensity scores and how they are sometimes used to address the challenges of matching on many variables at once. We'll then demonstrate why you should not use propensity scores for matching, even if your favorite econometrician does so.

Matching in the wild

Let’s start with a thought experiment. Imagine that you've received a new dataset to analyze. The dataset contains 1,000 observations and 18 variables. What are the chances that you’ll find at least one exact match for each row?

The answer obviously depends on a number of factors. How many variables are binary? How many are continuous? How many are categorical? What’s the number of levels for categorical variables? Are variables independent or correlated with each other?
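One rough way to build intuition for these questions is a small simulation. The sketch below is illustrative only and is not taken from the book. It assumes all variables are independent and uniformly distributed over a fixed number of levels, which real data rarely satisfies, and the function name `exact_match_share` is hypothetical. It reports the share of rows that have at least one other identical row in the same dataset.

```python
from collections import Counter

import numpy as np


def exact_match_share(n_rows=1000, n_vars=18, n_levels=2, seed=0):
    """Share of rows that have at least one other identical row.

    Assumption (not from the source): variables are independent and
    uniformly distributed over `n_levels` discrete levels.
    """
    rng = np.random.default_rng(seed)
    # Each row is one observation; each column is one discrete variable.
    data = rng.integers(0, n_levels, size=(n_rows, n_vars))
    rows = [tuple(row) for row in data.tolist()]
    # Count how many times each distinct row pattern occurs.
    counts = Counter(rows)
    # A row has an exact match if its pattern occurs more than once.
    return sum(counts[row] > 1 for row in rows) / n_rows


if __name__ == "__main__":
    for n_vars in (3, 6, 10, 14, 18):
        share = exact_match_share(n_vars=n_vars)
        print(f"{n_vars:>2} binary variables: {share:.1%} of rows have an exact match")
```

Under these assumptions, the number of possible row patterns grows as `n_levels ** n_vars`, so with 18 binary variables there are 262,144 distinct patterns competing for only 1,000 rows. Continuous variables or variables with more levels make exact matches rarer still, while strongly correlated variables make them more common.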

To get an idea of what the answer can be, let’s take a look at Figure 9.5: