Chapter 2

Introduction to relational plots and subplots
INTRODUCTION TO DATA VISUALIZATION WITH SEABORN

Erin Case
Data Scientist
Questions about quantitative variables
Relational plots

Height vs. weight

Number of school absences vs. final grade


GDP vs. percent literate



1 Waskom, M. L. (2021). seaborn: statistical data visualization. https://seaborn.pydata.org/


Introducing relplot()
Create "relational plots": scatter plots or line plots
Why use relplot() instead of scatterplot()?

relplot() lets you create subplots in a single figure
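
One way to see the difference in practice: scatterplot() draws onto a single matplotlib Axes and returns it, while relplot() builds and returns a seaborn FacetGrid that manages a whole figure, which is what makes subplots possible. The sketch below assumes a small made-up DataFrame named tips_demo (hypothetical; its column names mirror the tips data used in these slides, but the values are invented).

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; remove this line to view plots interactively
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical stand-in for the tips data (values invented for illustration)
tips_demo = pd.DataFrame({
    "total_bill": [10.5, 20.0, 15.2, 30.1],
    "tip": [1.5, 3.0, 2.2, 5.0],
})

# Axes-level function: draws on one matplotlib Axes and returns it
ax = sns.scatterplot(x="total_bill", y="tip", data=tips_demo)
print(type(ax).__name__)

# Figure-level function: creates its own figure and returns a FacetGrid
g = sns.relplot(x="total_bill", y="tip", data=tips_demo, kind="scatter")
print(type(g).__name__)
```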



scatterplot() vs. relplot()
Using scatterplot()

import seaborn as sns
import matplotlib.pyplot as plt

sns.scatterplot(x="total_bill",
                y="tip",
                data=tips)
plt.show()

Using relplot()

import seaborn as sns
import matplotlib.pyplot as plt

sns.relplot(x="total_bill",
            y="tip",
            data=tips,
            kind="scatter")
plt.show()



Subplots in columns
import seaborn as sns
import matplotlib.pyplot as plt

sns.relplot(x="total_bill",
            y="tip",
            data=tips,
            kind="scatter",
            col="smoker")
plt.show()



Subplots in rows
import seaborn as sns
import matplotlib.pyplot as plt

sns.relplot(x="total_bill",
            y="tip",
            data=tips,
            kind="scatter",
            row="smoker")
plt.show()



Subplots in rows and columns
import seaborn as sns
import matplotlib.pyplot as plt

sns.relplot(x="total_bill",
            y="tip",
            data=tips,
            kind="scatter",
            col="smoker",
            row="time")
plt.show()
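
To check how col and row combine, the grid that relplot() returns can be inspected directly. This is a minimal sketch that assumes an invented tips_demo DataFrame (hypothetical values, same column names as the slides) with two levels each for smoker and time, so a 2 x 2 grid is expected.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; remove this line to view plots interactively
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical data: every smoker/time combination has at least one row
tips_demo = pd.DataFrame({
    "total_bill": [10.5, 20.0, 15.2, 30.1, 12.3, 25.4, 18.9, 22.7],
    "tip": [1.5, 3.0, 2.2, 5.0, 2.0, 4.1, 3.3, 3.5],
    "smoker": ["Yes", "No", "Yes", "No", "Yes", "No", "Yes", "No"],
    "time": ["Lunch", "Lunch", "Dinner", "Dinner",
             "Lunch", "Lunch", "Dinner", "Dinner"],
})

g = sns.relplot(x="total_bill", y="tip", data=tips_demo,
                kind="scatter", col="smoker", row="time")

# g.axes is a 2-D array of Axes laid out as (rows, columns)
print(g.axes.shape)
```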



Subgroups for days of the week



Wrapping columns
import seaborn as sns
import matplotlib.pyplot as plt

sns.relplot(x="total_bill",
            y="tip",
            data=tips,
            kind="scatter",
            col="day",
            col_wrap=2)
plt.show()



Ordering columns
import seaborn as sns
import matplotlib.pyplot as plt

sns.relplot(x="total_bill",
            y="tip",
            data=tips,
            kind="scatter",
            col="day",
            col_wrap=2,
            col_order=["Thur", "Fri", "Sat", "Sun"])
plt.show()
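
The effect of col_order can be confirmed by reading the subplot titles back from the grid. The sketch below assumes an invented tips_demo DataFrame (hypothetical) whose day values are plain strings that first appear in the order Sun, Sat, Fri, Thur, so that any weekday ordering in the result must come from col_order.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; remove this line to view plots interactively
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical data; note the days deliberately appear out of weekday order
tips_demo = pd.DataFrame({
    "total_bill": [10.5, 20.0, 15.2, 30.1, 12.3, 25.4, 18.9, 22.7],
    "tip": [1.5, 3.0, 2.2, 5.0, 2.0, 4.1, 3.3, 3.5],
    "day": ["Sun", "Sat", "Fri", "Thur", "Sun", "Sat", "Fri", "Thur"],
})

order = ["Thur", "Fri", "Sat", "Sun"]
g = sns.relplot(x="total_bill", y="tip", data=tips_demo,
                kind="scatter", col="day", col_wrap=2, col_order=order)

# Titles are read left to right, top to bottom across the wrapped grid
titles = [ax.get_title() for ax in g.axes.flat]
print(titles)
```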



Let's practice!

Customizing scatter plots

Erin Case
Data Scientist
Scatter plot overview
Show relationship between two quantitative variables

We've seen:

Subplots (col and row)

Subgroups with color (hue)

New customizations:

Subgroups with point size and style

Changing point transparency

Use with both scatterplot() and relplot()
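
As a quick illustration that these options also apply to scatterplot(), the sketch below passes size, style, and alpha to the axes-level function. It assumes an invented tips_demo DataFrame (hypothetical values; column names follow the slides).

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; remove this line to view plots interactively
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical data with a numeric "size" column and a categorical "smoker" column
tips_demo = pd.DataFrame({
    "total_bill": [10.5, 20.0, 15.2, 30.1, 12.3, 25.4],
    "tip": [1.5, 3.0, 2.2, 5.0, 2.0, 4.1],
    "size": [2, 3, 2, 4, 1, 4],
    "smoker": ["Yes", "No", "Yes", "No", "Yes", "No"],
})

ax = sns.scatterplot(x="total_bill", y="tip", data=tips_demo,
                     size="size",      # point area varies with party size
                     style="smoker",   # marker shape varies with smoker status
                     alpha=0.4)        # 0 is fully transparent, 1 is fully opaque

# The first collection on the Axes holds the plotted points
points = ax.collections[0]
print(points.get_alpha())
print(sorted(set(points.get_sizes().tolist())))
```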



Subgroups with point size
import seaborn as sns
import matplotlib.pyplot as plt

sns.relplot(x="total_bill",
            y="tip",
            data=tips,
            kind="scatter",
            size="size")
plt.show()



Point size and hue
import seaborn as sns
import matplotlib.pyplot as plt

sns.relplot(x="total_bill",
            y="tip",
            data=tips,
            kind="scatter",
            size="size",
            hue="size")
plt.show()



Subgroups with point style
import seaborn as sns
import matplotlib.pyplot as plt

sns.relplot(x="total_bill",
            y="tip",
            data=tips,
            kind="scatter",
            hue="smoker",
            style="smoker")
plt.show()



Changing point transparency
import seaborn as sns
import matplotlib.pyplot as plt

# alpha ranges from 0 (fully transparent) to 1 (fully opaque)
sns.relplot(x="total_bill",
            y="tip",
            data=tips,
            kind="scatter",
            alpha=0.4)
plt.show()



Let's practice!

Introduction to line plots

Erin Case
Data Scientist
What are line plots?
Two types of relational plots: scatter plots and line plots

Scatter plots

Each plot point is an independent observation

Line plots

Each plot point represents the same "thing", typically tracked over time



Air pollution data
Collection stations throughout city
Air samples of nitrogen dioxide levels
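
The slides plot three tables (air_df, air_df_mean, and air_df_loc_mean) without showing how they are built. One plausible construction is sketched below. It assumes the raw table has columns hour, location, and NO_2, which is inferred from the column names passed to relplot() in the slides; the station names and readings are invented.

```python
import pandas as pd

# Hypothetical raw readings: several stations report at each hour
air_df = pd.DataFrame({
    "hour": [0, 0, 0, 0, 1, 1, 1, 1],
    "location": ["East", "East", "West", "West",
                 "East", "East", "West", "West"],
    "NO_2": [20.0, 24.0, 30.0, 34.0, 18.0, 22.0, 28.0, 36.0],
})

# One mean value per hour, averaged across all stations
air_df_mean = (
    air_df.groupby("hour", as_index=False)["NO_2"].mean()
          .rename(columns={"NO_2": "NO_2_mean"})
)

# One mean value per hour and location
air_df_loc_mean = (
    air_df.groupby(["hour", "location"], as_index=False)["NO_2"].mean()
          .rename(columns={"NO_2": "NO_2_mean"})
)

print(air_df_mean)
print(air_df_loc_mean)
```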



Scatter plot
import matplotlib.pyplot as plt
import seaborn as sns

sns.relplot(x="hour", y="NO_2_mean",
            data=air_df_mean,
            kind="scatter")

plt.show()



Line plot
import matplotlib.pyplot as plt
import seaborn as sns

sns.relplot(x="hour", y="NO_2_mean",
            data=air_df_mean,
            kind="line")

plt.show()





Subgroups by location
import matplotlib.pyplot as plt
import seaborn as sns

sns.relplot(x="hour", y="NO_2_mean",
            data=air_df_loc_mean,
            kind="line",
            style="location",
            hue="location")
plt.show()



Adding markers
import matplotlib.pyplot as plt
import seaborn as sns

sns.relplot(x="hour", y="NO_2_mean",
            data=air_df_loc_mean,
            kind="line",
            style="location",
            hue="location",
            markers=True)
plt.show()



Turning off line style
import matplotlib.pyplot as plt
import seaborn as sns

sns.relplot(x="hour", y="NO_2_mean",
            data=air_df_loc_mean,
            kind="line",
            style="location",
            hue="location",
            markers=True,
            dashes=False)
plt.show()





Multiple observations per x-value
Scatter plot

import matplotlib.pyplot as plt
import seaborn as sns

sns.relplot(x="hour", y="NO_2",
            data=air_df,
            kind="scatter")

plt.show()



Multiple observations per x-value
Line plot

import matplotlib.pyplot as plt
import seaborn as sns

sns.relplot(x="hour", y="NO_2",
            data=air_df,
            kind="line")

plt.show()



Multiple observations per x-value
Line shows the mean of the observations at each x-value

Shaded region is the confidence interval for the mean

Assumes dataset is a random sample

95% confident that the mean is within this interval

Indicates uncertainty in our estimate
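
For intuition about where such an interval can come from, here is a from-scratch percentile bootstrap for the mean of the readings at a single x-value. Seaborn's documentation describes its default interval as bootstrap-based, but this sketch is not seaborn's internal code. The function name bootstrap_ci and the sample values are invented for illustration.

```python
import random
import statistics

def bootstrap_ci(values, n_boot=1000, level=95, seed=0):
    """Percentile bootstrap confidence interval for the mean of `values`."""
    rng = random.Random(seed)
    n = len(values)
    # Resample with replacement and record the mean of each resample
    boot_means = sorted(
        statistics.fmean(rng.choices(values, k=n)) for _ in range(n_boot)
    )
    # Trim (100 - level) / 2 percent of the resampled means from each end
    tail = (100 - level) / 200
    low = boot_means[int(tail * n_boot)]
    high = boot_means[int((1 - tail) * n_boot) - 1]
    return low, high

# Hypothetical NO_2 readings from several stations at the same hour
no2_at_noon = [21.0, 25.5, 19.8, 30.2, 27.1, 23.4, 26.0, 22.9]

mean = statistics.fmean(no2_at_noon)
low, high = bootstrap_ci(no2_at_noon)
print(f"mean={mean:.2f}, 95% CI=({low:.2f}, {high:.2f})")
```

A wider interval means the mean is less certain, for example because there are few readings at that hour or because they vary a lot.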



Replacing confidence interval with standard deviation
import matplotlib.pyplot as plt
import seaborn as sns

sns.relplot(x="hour", y="NO_2",
            data=air_df,
            kind="line",
            ci="sd")
plt.show()



Turning off confidence interval
import matplotlib.pyplot as plt
import seaborn as sns

sns.relplot(x="hour", y="NO_2",
            data=air_df,
            kind="line",
            ci=None)
plt.show()



Let's practice!