seaborn

The document provides detailed information on the seaborn library's boxplot and pairplot functions, including their parameters and usage examples. It explains how boxplots visualize data distributions across categories and how pairplots display pairwise relationships in datasets. Additionally, it highlights customization options for both plotting functions to enhance visual representation.

Uploaded by

Mbusa André
Copyright
© All Rights Reserved

seaborn.boxplot

seaborn.boxplot(data=None, *, x=None, y=None, hue=None, order=None, hue_order=None, orient=None, color=None, palette=None, saturation=0.75, fill=True, dodge='auto', width=0.8, gap=0, whis=1.5, linecolor='auto', linewidth=None, fliersize=None, hue_norm=None, native_scale=False, log_scale=None, formatter=None, legend='auto', ax=None, **kwargs)

Draw a box plot to show distributions with respect to categories.

A box plot (or box-and-whisker plot) shows the distribution of quantitative data in a way that
facilitates comparisons between variables or across levels of a categorical variable. The box shows
the quartiles of the dataset while the whiskers extend to show the rest of the distribution, except for
points that are determined to be “outliers” using a method that is a function of the inter-quartile
range.
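
As a rough illustration of the whisker rule described above, the following sketch computes the quartiles, the inter-quartile range (IQR), and the outliers for a small made-up sample using only the standard library. It assumes the default whis=1.5 and linearly interpolated quartiles; the function name box_stats and the sample values are hypothetical, and the statistics that seaborn actually draws may differ in edge cases.

```python
import statistics

def box_stats(values, whis=1.5):
    """Return quartiles, whisker ends, and outliers for a 1-D sample."""
    data = sorted(values)
    # method="inclusive" interpolates linearly between neighbouring data points
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_limit, high_limit = q1 - whis * iqr, q3 + whis * iqr
    # Whiskers end at the farthest data points that are still inside the limits
    inside = [v for v in data if low_limit <= v <= high_limit]
    outliers = [v for v in data if v < low_limit or v > high_limit]
    return {
        "q1": q1, "median": median, "q3": q3,
        "whisker_low": min(inside), "whisker_high": max(inside),
        "outliers": outliers,
    }

stats = box_stats([1, 2, 3, 4, 5, 6, 7, 8, 9, 30])
print(stats)
```

In this sample the IQR is 4.5, so the upper limit is 7.75 + 1.5 * 4.5 = 14.5: the upper whisker stops at 9 and the value 30 is reported as an outlier.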

See the tutorial for more information.

Note

By default, this function treats one of the variables as categorical and draws data at ordinal positions
(0, 1, … n) on the relevant axis. As of version 0.13.0, this can be disabled by setting native_scale=True.
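
The difference between ordinal positions and the native scale can be sketched in a few lines of plain Python. This is only a conceptual model for numeric grouping values, not seaborn code; the helper name box_positions is hypothetical.

```python
def box_positions(levels, native_scale=False):
    """Return the axis position of each distinct numeric level."""
    ordered = sorted(set(levels))
    if native_scale:
        # Each box sits at its own numeric value
        return {level: float(level) for level in ordered}
    # Default: boxes sit at 0, 1, 2, ... regardless of the values
    return {level: float(i) for i, level in enumerate(ordered)}

ages = [20, 30, 30, 50, 20]
print(box_positions(ages))                     # {20: 0.0, 30: 1.0, 50: 2.0}
print(box_positions(ages, native_scale=True))  # {20: 20.0, 30: 30.0, 50: 50.0}
```

With the default behaviour the gap between 30 and 50 looks the same as the gap between 20 and 30; with native_scale=True the spacing reflects the actual values.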

Parameters:

data : DataFrame, Series, dict, array, or list of arrays

Dataset for plotting. If x and y are absent, this is interpreted as wide-form. Otherwise it is expected to
be long-form.

x, y, hue : names of variables in data or vector data

Inputs for plotting long-form data. See examples for interpretation.

order, hue_order : lists of strings

Order to plot the categorical levels in; otherwise the levels are inferred from the data objects.

orient : “v” | “h” | “x” | “y”

Orientation of the plot (vertical or horizontal). This is usually inferred based on the type of the input
variables, but it can be used to resolve ambiguity when both x and y are numeric or when plotting
wide-form data.

Changed in version v0.13.0: Added ‘x’/‘y’ as options, equivalent to ‘v’/‘h’.

color : matplotlib color

Single color for the elements in the plot.

palette : palette name, list, or dict

Colors to use for the different levels of the hue variable. Should be something that can be interpreted
by color_palette(), or a dictionary mapping hue levels to matplotlib colors.

saturation : float

Proportion of the original saturation to draw fill colors in. Large patches often look better with
desaturated colors, but set this to 1 if you want the colors to perfectly match the input values.

fill : bool

If True, use a solid patch. Otherwise, draw as line art.

New in version v0.13.0.

dodge : “auto” or bool

When hue mapping is used, whether elements should be narrowed and shifted along the orient axis
to eliminate overlap. If "auto", set to True when the orient variable is crossed with the categorical
variable or False otherwise.

Changed in version 0.13.0: Added "auto" mode as a new default.

width : float

Width allotted to each element on the orient axis. When native_scale=True, it is relative to the
minimum distance between two values in the native scale.

gap : float

Shrink on the orient axis by this factor to add a gap between dodged elements.

New in version 0.13.0.

whis : float or pair of floats

Parameter that controls whisker length. If scalar, whiskers are drawn to the farthest datapoint
within whis * IQR from the nearest hinge. If a tuple, it is interpreted as percentiles that whiskers
represent.

linecolor : color

Color to use for line elements, when fill is True.

New in version v0.13.0.

linewidth : float

Width of the lines that frame the plot elements.

fliersize : float

Size of the markers used to indicate outlier observations.

hue_norm : tuple or matplotlib.colors.Normalize object

Normalization in data units for colormap applied to the hue variable when it is numeric. Not relevant
if hue is categorical.

New in version v0.12.0.

log_scale : bool or number, or pair of bools or numbers

Set axis scale(s) to log. A single value sets the data axis for any numeric axes in the plot. A pair of
values sets each axis independently. Numeric values are interpreted as the desired base (default 10).
When None or False, seaborn defers to the existing Axes scale.

New in version v0.13.0.


native_scale : bool

When True, numeric or datetime values on the categorical axis will maintain their original scaling
rather than being converted to fixed indices.

New in version v0.13.0.

formatter : callable

Function for converting categorical data into strings. Affects both grouping and tick labels.

New in version v0.13.0.

legend : “auto”, “brief”, “full”, or False

How to draw the legend. If “brief”, numeric hue and size variables will be represented with a sample
of evenly spaced values. If “full”, every group will get an entry in the legend. If “auto”, choose
between brief or full representation based on number of levels. If False, no legend data is added and
no legend is drawn.

New in version v0.13.0.

ax : matplotlib Axes

Axes object to draw the plot onto, otherwise uses the current Axes.

kwargs : key, value mappings

Other keyword arguments are passed through to matplotlib.axes.Axes.boxplot().
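
The dodge, width, and gap parameters described above interact when a hue variable splits each category into several boxes. The sketch below is a simplified model of that arithmetic, written from the parameter descriptions rather than copied from seaborn; the function name dodged_boxes is hypothetical and the exact internal computation is an assumption.

```python
def dodged_boxes(center, n_hue, width=0.8, gap=0.0):
    """Return (position, drawn_width) for each hue level at one category."""
    slot = width / n_hue  # share of the total width allotted to each hue level
    boxes = []
    for i in range(n_hue):
        # Shift each box so that the group stays centred on the category position
        offset = slot * i + slot / 2 - width / 2
        # gap shrinks the drawn box inside its slot
        boxes.append((center + offset, slot * (1 - gap)))
    return boxes

for pos, drawn in dodged_boxes(center=0, n_hue=2, gap=0.1):
    print(f"position={pos:+.2f} width={drawn:.2f}")
```

Under these assumptions, two hue levels at category position 0 are drawn at -0.2 and +0.2, each slightly narrower than its 0.4-wide slot.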

Returns:

ax : matplotlib Axes

Returns the Axes object with the plot drawn onto it.

See also

violinplot

A combination of boxplot and kernel density estimation.

stripplot

A scatterplot where one variable is categorical. Can be used in conjunction with other plots to show
each observation.

swarmplot

A categorical scatterplot where the points do not overlap. Can be used with other plots to show each
observation.

catplot

Combine a categorical plot with a FacetGrid.

Examples

Draw a single horizontal boxplot, assigning the data directly to the coordinate variable:

import seaborn as sns

titanic = sns.load_dataset("titanic")

sns.boxplot(x=titanic["age"])

Group by a categorical variable, referencing columns in a dataframe:

sns.boxplot(data=titanic, x="age", y="class")

Draw a vertical boxplot with nested grouping by two variables:

sns.boxplot(data=titanic, x="class", y="age", hue="alive")


Draw the boxes as line art and add a small gap between them:

sns.boxplot(data=titanic, x="class", y="age", hue="alive", fill=False, gap=.1)

Cover the full range of the data with the whiskers:

sns.boxplot(data=titanic, x="age", y="deck", whis=(0, 100))


Draw narrower boxes:

sns.boxplot(data=titanic, x="age", y="deck", width=.5)

Modify the color and width of all the line artists:

sns.boxplot(data=titanic, x="age", y="deck", color=".8", linecolor="#137", linewidth=.75)


Group by a numeric variable and preserve its native scaling:

ax = sns.boxplot(x=titanic["age"].round(-1), y=titanic["fare"], native_scale=True)

ax.axvline(25, color=".3", dashes=(2, 2))

Customize the plot using parameters of the underlying matplotlib function:

sns.boxplot(
    data=titanic, x="age", y="class",
    notch=True, showcaps=False,
    flierprops={"marker": "x"},
    boxprops={"facecolor": (.3, .5, .7, .5)},
    medianprops={"color": "r", "linewidth": 2},
)

seaborn.pairplot

seaborn.pairplot(data, *, hue=None, hue_order=None, palette=None, vars=None, x_vars=None, y_vars=None, kind='scatter', diag_kind='auto', markers=None, height=2.5, aspect=1, corner=False, dropna=False, plot_kws=None, diag_kws=None, grid_kws=None, size=None)

Plot pairwise relationships in a dataset.

By default, this function will create a grid of Axes such that each numeric variable in data will be
shared across the y-axes across a single row and the x-axes across a single column. The diagonal plots
are treated differently: a univariate distribution plot is drawn to show the marginal distribution of the
data in each column.

It is also possible to show a subset of variables or plot different variables on the rows and columns.
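
To make the grid structure concrete, the following sketch enumerates the cells of such a grid in plain Python and labels each one as a diagonal (univariate) or off-diagonal (bivariate) plot. It is a conceptual model only; the function name pair_layout is hypothetical, and the handling of corner follows the description of that parameter in the list below.

```python
def pair_layout(x_vars, y_vars=None, corner=False):
    """Map each (row, col) cell to 'diag', 'offdiag', or None (hidden)."""
    y_vars = list(x_vars) if y_vars is None else list(y_vars)
    layout = {}
    for row, y in enumerate(y_vars):
        for col, x in enumerate(x_vars):
            if corner and col > row:
                layout[(row, col)] = None       # upper triangle is left empty
            elif x == y:
                layout[(row, col)] = "diag"     # univariate plot of one variable
            else:
                layout[(row, col)] = "offdiag"  # bivariate plot of x against y
    return layout

cells = pair_layout(
    ["bill_length_mm", "bill_depth_mm", "flipper_length_mm"], corner=True
)
for (row, col), kind in sorted(cells.items()):
    print(row, col, kind)
```

Passing different lists for x_vars and y_vars gives a non-square layout in which only cells with matching variable names count as diagonal.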

This is a high-level interface for PairGrid that is intended to make it easy to draw a few common
styles. You should use PairGrid directly if you need more flexibility.

Parameters:

data : pandas.DataFrame

Tidy (long-form) dataframe where each column is a variable and each row is an observation.

hue : name of variable in data

Variable in data to map plot aspects to different colors.

hue_order : list of strings

Order for the levels of the hue variable in the palette.

palette : dict or seaborn color palette

Set of colors for mapping the hue variable. If a dict, keys should be values in the hue variable.

vars : list of variable names

Variables within data to use, otherwise use every column with a numeric datatype.

{x, y}_vars : lists of variable names

Variables within data to use separately for the rows and columns of the figure; i.e. to make a
non-square plot.

kind : {‘scatter’, ‘kde’, ‘hist’, ‘reg’}

Kind of plot to make.

diag_kind : {‘auto’, ‘hist’, ‘kde’, None}

Kind of plot for the diagonal subplots. If ‘auto’, choose based on whether or not hue is used.

markers : single matplotlib marker code or list

Either the marker to use for all scatterplot points or a list of markers with a length the same as the
number of levels in the hue variable so that differently colored points will also have different
scatterplot markers.

height : scalar

Height (in inches) of each facet.

aspect : scalar

Aspect * height gives the width (in inches) of each facet.

corner : bool

If True, don’t add axes to the upper (off-diagonal) triangle of the grid, making this a “corner” plot.

dropna : boolean

Drop missing values from the data before plotting.

{plot, diag, grid}_kws : dicts


Dictionaries of keyword arguments. plot_kws are passed to the bivariate plotting
function, diag_kws are passed to the univariate plotting function, and grid_kws are passed to
the PairGrid constructor.
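
Since height and aspect are given per facet, the overall figure size follows from the number of rows and columns. The small sketch below works that out under the assumption that the figure is simply the facets placed side by side, ignoring any extra room taken by a legend; the helper name pairgrid_figsize is hypothetical.

```python
def pairgrid_figsize(n_cols, n_rows, height=2.5, aspect=1):
    """Approximate figure size (width, height) in inches, before any legend."""
    return (n_cols * height * aspect, n_rows * height)

print(pairgrid_figsize(4, 4))              # (10.0, 10.0)
print(pairgrid_figsize(4, 4, height=1.5))  # (6.0, 6.0)
print(pairgrid_figsize(3, 2, aspect=1.2))  # wider-than-tall facets, non-square grid
```

For a dataset with four numeric columns, the default settings therefore give roughly a 10 by 10 inch figure, and height=1.5 shrinks it to about 6 by 6 inches.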

Returns:

grid : PairGrid

Returns the underlying PairGrid instance for further tweaking.

See also

PairGrid

Subplot grid for more flexible plotting of pairwise relationships.

JointGrid

Grid for plotting joint and marginal distributions of two variables.

Examples

The simplest invocation uses scatterplot() for each pairing of the variables and histplot() for the
marginal plots along the diagonal:

penguins = sns.load_dataset("penguins")

sns.pairplot(penguins)

Assigning a hue variable adds a semantic mapping and changes the default marginal plot to a layered
kernel density estimate (KDE):

sns.pairplot(penguins, hue="species")

It’s possible to force marginal histograms:

sns.pairplot(penguins, hue="species", diag_kind="hist")


The kind parameter determines both the diagonal and off-diagonal plotting style. Several options are
available, including using kdeplot() to draw KDEs:

sns.pairplot(penguins, kind="kde")

Or histplot() to draw both bivariate and univariate histograms:

sns.pairplot(penguins, kind="hist")

The markers parameter applies a style mapping on the off-diagonal axes. Currently, it will be
redundant with the hue variable:

sns.pairplot(penguins, hue="species", markers=["o", "s", "D"])


As with other figure-level functions, the size of the figure is controlled by setting the height of each
individual subplot:

sns.pairplot(penguins, height=1.5)

Use vars or x_vars and y_vars to select the variables to plot:

sns.pairplot(
    penguins,
    x_vars=["bill_length_mm", "bill_depth_mm", "flipper_length_mm"],
    y_vars=["bill_length_mm", "bill_depth_mm"],
)

Set corner=True to plot only the lower triangle:

sns.pairplot(penguins, corner=True)

The plot_kws and diag_kws parameters accept dicts of keyword arguments to customize the
off-diagonal and diagonal plots, respectively:

sns.pairplot(
    penguins,
    plot_kws=dict(marker="+", linewidth=1),
    diag_kws=dict(fill=False),
)

The return object is the underlying PairGrid, which can be used to further customize the plot:

g = sns.pairplot(penguins, diag_kind="kde")

g.map_lower(sns.kdeplot, levels=4, color=".2")



seaborn.PairGrid

class seaborn.PairGrid(data, *, hue=None, vars=None, x_vars=None, y_vars=None, hue_order=None, palette=None, hue_kws=None, corner=False, diag_sharey=True, height=2.5, aspect=1, layout_pad=0.5, despine=True, dropna=False)

Subplot grid for plotting pairwise relationships in a dataset.

This object maps each variable in a dataset onto a column and row in a grid of multiple axes.
Different axes-level plotting functions can be used to draw bivariate plots in the upper and lower
triangles, and the marginal distribution of each variable can be shown on the diagonal.

Several different common plots can be generated in a single line using pairplot(). Use PairGrid when
you need more flexibility.

See the tutorial for more information.


__init__(data, *, hue=None, vars=None, x_vars=None, y_vars=None, hue_order=None, palette=None, hue_kws=None, corner=False, diag_sharey=True, height=2.5, aspect=1, layout_pad=0.5, despine=True, dropna=False)

Initialize the plot figure and PairGrid object.

Parameters:

data : DataFrame

Tidy (long-form) dataframe where each column is a variable and each row is an observation.

hue : string (variable name)

Variable in data to map plot aspects to different colors. This variable will be excluded from the
default x and y variables.

vars : list of variable names

Variables within data to use, otherwise use every column with a numeric datatype.

{x, y}_vars : lists of variable names

Variables within data to use separately for the rows and columns of the figure; i.e. to make a
non-square plot.

hue_order : list of strings

Order for the levels of the hue variable in the palette.

palette : dict or seaborn color palette

Set of colors for mapping the hue variable. If a dict, keys should be values in the hue variable.

hue_kws : dictionary of param -> list of values mapping

Other keyword arguments to insert into the plotting call to let other plot attributes vary across levels
of the hue variable (e.g. the markers in a scatterplot).

corner : bool

If True, don’t add axes to the upper (off-diagonal) triangle of the grid, making this a “corner” plot.

height : scalar

Height (in inches) of each facet.

aspect : scalar

Aspect * height gives the width (in inches) of each facet.

layout_pad : scalar

Padding between axes; passed to fig.tight_layout.

despine : boolean

Remove the top and right spines from the plots.

dropna : boolean

Drop missing values from the data before plotting.

See also

pairplot

Easily drawing common uses of PairGrid.

FacetGrid

Subplot grid for plotting conditional relationships.

Examples

Calling the constructor sets up a blank grid of subplots, with one row and one column for each
numeric variable in the dataset:

import matplotlib.pyplot as plt

import seaborn as sns

penguins = sns.load_dataset("penguins")

g = sns.PairGrid(penguins)

Passing a bivariate function to PairGrid.map() will draw a bivariate plot on every axes:

g = sns.PairGrid(penguins)

g.map(sns.scatterplot)

Passing separate functions to PairGrid.map_diag() and PairGrid.map_offdiag() will show each
variable’s marginal distribution on the diagonal:

g = sns.PairGrid(penguins)

g.map_diag(sns.histplot)

g.map_offdiag(sns.scatterplot)

It’s also possible to use different functions on the upper and lower triangles of the plot (which are
otherwise redundant):

g = sns.PairGrid(penguins, diag_sharey=False)

g.map_upper(sns.scatterplot)

g.map_lower(sns.kdeplot)

g.map_diag(sns.kdeplot)

Or to avoid the redundancy altogether:

g = sns.PairGrid(penguins, diag_sharey=False, corner=True)

g.map_lower(sns.scatterplot)

g.map_diag(sns.kdeplot)

The PairGrid constructor accepts a hue variable. This variable is passed directly to functions that
understand it:

g = sns.PairGrid(penguins, hue="species")

g.map_diag(sns.histplot)

g.map_offdiag(sns.scatterplot)

g.add_legend()

But you can also pass matplotlib functions, in which case a groupby is performed internally and a
separate plot is drawn for each level:

g = sns.PairGrid(penguins, hue="species")

g.map_diag(plt.hist)

g.map_offdiag(plt.scatter)

g.add_legend()

Additional semantic variables can be assigned by passing data vectors directly while mapping the
function:

g = sns.PairGrid(penguins, hue="species")

g.map_diag(sns.histplot)

g.map_offdiag(sns.scatterplot, size=penguins["sex"])

g.add_legend(title="", adjust_subtitles=True)

When using seaborn functions that can implement a numeric hue mapping, you will want to disable
mapping of the variable on the diagonal axes. Note that the hue variable is excluded from the list of
variables shown by default:

g = sns.PairGrid(penguins, hue="body_mass_g")

g.map_diag(sns.histplot, hue=None, color=".3")

g.map_offdiag(sns.scatterplot)

g.add_legend()

The vars parameter can be used to control exactly which variables are used:

variables = ["body_mass_g", "bill_length_mm", "flipper_length_mm"]

g = sns.PairGrid(penguins, hue="body_mass_g", vars=variables)

g.map_diag(sns.histplot, hue=None, color=".3")

g.map_offdiag(sns.scatterplot)

g.add_legend()

The plot need not be square: separate variables can be used to define the rows and columns:

x_vars = ["body_mass_g", "bill_length_mm", "bill_depth_mm", "flipper_length_mm"]

y_vars = ["body_mass_g"]

g = sns.PairGrid(penguins, hue="species", x_vars=x_vars, y_vars=y_vars)

g.map_diag(sns.histplot, color=".3")

g.map_offdiag(sns.scatterplot)

g.add_legend()

It can be useful to explore different approaches to resolving multiple distributions on the diagonal
axes:

g = sns.PairGrid(penguins, hue="species")

g.map_diag(sns.histplot, multiple="stack", element="step")


g.map_offdiag(sns.scatterplot)

g.add_legend()

Methods

__init__(data, *[, hue, vars, x_vars, ...])   Initialize the plot figure and PairGrid object.
add_legend([legend_data, title, ...])         Draw a legend, maybe placing it outside axes and resizing the figure.
apply(func, *args, **kwargs)                  Pass the grid to a user-supplied function and return self.
map(func, **kwargs)                           Plot with the same function in every subplot.
map_diag(func, **kwargs)                      Plot with a univariate function on each diagonal subplot.
map_lower(func, **kwargs)                     Plot with a bivariate function on the lower diagonal subplots.
map_offdiag(func, **kwargs)                   Plot with a bivariate function on the off-diagonal subplots.
map_upper(func, **kwargs)                     Plot with a bivariate function on the upper diagonal subplots.
pipe(func, *args, **kwargs)                   Pass the grid to a user-supplied function and return its value.
savefig(*args, **kwargs)                      Save an image of the plot.
set(**kwargs)                                 Set attributes on each subplot Axes.
tick_params([axis])                           Modify the ticks, tick labels, and gridlines.
tight_layout(*args, **kwargs)                 Call fig.tight_layout within a rect that excludes the legend.

Attributes

fig DEPRECATED: prefer the figure property.

figure Access the matplotlib.figure.Figure object underlying the grid.

legend The matplotlib.legend.Legend object, if present.
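
The apply and pipe methods listed above differ only in what they return. The following sketch mimics the two call patterns with a hypothetical stand-in class, TinyGrid, so that it runs without seaborn; it illustrates the documented return values and is not the library's implementation.

```python
class TinyGrid:
    """Hypothetical stand-in for a grid object, used only to show the call patterns."""

    def __init__(self):
        self.title = None

    def apply(self, func, *args, **kwargs):
        func(self, *args, **kwargs)
        return self  # returns the grid itself, so calls can be chained

    def pipe(self, func, *args, **kwargs):
        return func(self, *args, **kwargs)  # returns whatever func returns

def set_title(grid, text):
    """Example user-supplied function: store a title and return its length."""
    grid.title = text
    return len(text)

grid = TinyGrid()
chained = grid.apply(set_title, "Penguins")  # return value of set_title is discarded
length = grid.pipe(set_title, "Pairs")       # return value of set_title is passed through
print(chained is grid, length)               # True 5
```

In practice, apply suits functions that tweak the grid as part of a method chain, while pipe suits functions whose result is what you want to keep.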
