seaborn.boxplot
A box plot (or box-and-whisker plot) shows the distribution of quantitative data in a way that
facilitates comparisons between variables or across levels of a categorical variable. The box shows
the quartiles of the dataset while the whiskers extend to show the rest of the distribution, except for
points that are determined to be “outliers” using a method that is a function of the inter-quartile
range.
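As a rough illustration of the description above, the sketch below computes the quartiles, whisker ends, and flagged points for a small sample using the common 1.5 × IQR rule. The helper name box_stats is hypothetical and the choice of quantile interpolation is an assumption; the statistics actually drawn are computed by matplotlib and may differ in detail.

```python
from statistics import quantiles


def box_stats(values, whis=1.5):
    """Return quartiles, whisker ends, and outliers for a 1-D sample (illustrative only)."""
    data = sorted(values)
    # "inclusive" quantiles interpolate linearly between neighbouring data points.
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - whis * iqr
    high_fence = q3 + whis * iqr
    # Whiskers stop at the farthest data points that are still inside the fences.
    inside = [v for v in data if low_fence <= v <= high_fence]
    outliers = [v for v in data if v < low_fence or v > high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": outliers,
    }


stats = box_stats([1, 2, 3, 4, 5, 6, 7, 8, 9, 30])
print(stats)
```

In this sample the value 30 lies beyond the upper fence, so it would be drawn as an individual point rather than covered by the whisker.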
Note
By default, this function treats one of the variables as categorical and draws data at ordinal positions
(0, 1, … n) on the relevant axis. As of version 0.13.0, this can be disabled by setting native_scale=True.
Parameters:
data : DataFrame, Series, dict, array, or list of arrays
Dataset for plotting. If x and y are absent, this is interpreted as wide-form. Otherwise it is expected to
be long-form.
order, hue_order : lists of strings
Order to plot the categorical levels in; otherwise the levels are inferred from the data objects.
orient : "v" | "h" | "x" | "y"
Orientation of the plot (vertical or horizontal). This is usually inferred based on the type of the input
variables, but it can be used to resolve ambiguity when both x and y are numeric or when plotting
wide-form data.
color : matplotlib color
palette : palette name, list, or dict
Colors to use for the different levels of the hue variable. Should be something that can be interpreted
by color_palette(), or a dictionary mapping hue levels to matplotlib colors.
saturation : float
Proportion of the original saturation to draw fill colors in. Large patches often look better with
desaturated colors, but set this to 1 if you want the colors to perfectly match the input values.
fill : bool
dodge : “auto” or bool
When hue mapping is used, whether elements should be narrowed and shifted along the orient axis
to eliminate overlap. If "auto", set to True when the orient variable is crossed with the categorical
variable or False otherwise.
width : float
Width allotted to each element on the orient axis. When native_scale=True, it is relative to the
minimum distance between two values in the native scale.
gap : float
Shrink on the orient axis by this factor to add a gap between dodged elements.
whis : float or pair of floats
Parameter that controls whisker length. If scalar, whiskers are drawn to the farthest datapoint
within whis * IQR from the nearest hinge. If a tuple, it is interpreted as percentiles that whiskers
represent.
linecolor : color
linewidth : float
fliersize : float
hue_norm : tuple or matplotlib.colors.Normalize object
Normalization in data units for colormap applied to the hue variable when it is numeric. Not relevant
if hue is categorical.
log_scale : bool or number, or pair of bools or numbers
Set axis scale(s) to log. A single value sets the data axis for any numeric axes in the plot. A pair of
values sets each axis independently. Numeric values are interpreted as the desired base (default 10).
When None or False, seaborn defers to the existing Axes scale.
native_scale : bool
When True, numeric or datetime values on the categorical axis will maintain their original scaling
rather than being converted to fixed indices.
formatter : callable
Function for converting categorical data into strings. Affects both grouping and tick labels.
legend : “auto”, “brief”, “full”, or False
How to draw the legend. If “brief”, numeric hue and size variables will be represented with a sample
of evenly spaced values. If “full”, every group will get an entry in the legend. If “auto”, choose
between brief or full representation based on number of levels. If False, no legend data is added and
no legend is drawn.
ax : matplotlib Axes
Axes object to draw the plot onto, otherwise uses the current Axes.
Returns:
ax : matplotlib Axes
Returns the Axes object with the plot drawn onto it.
See also
violinplot
stripplot
A scatterplot where one variable is categorical. Can be used in conjunction with other plots to show
each observation.
swarmplot
A categorical scatterplot where the points do not overlap. Can be used with other plots to show each
observation.
catplot
Examples
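Draw a single horizontal boxplot, assigning the data directly to the coordinate variable:
import seaborn as sns
titanic = sns.load_dataset("titanic")
sns.boxplot(x=titanic["age"])
Customize the plot using parameters of the underlying matplotlib function:
sns.boxplot(
    data=titanic, x="age", y="class",
    notch=True, showcaps=False,
    flierprops={"marker": "x"},
)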
seaborn.pairplot
By default, this function will create a grid of Axes such that each numeric variable in data will be
shared across the y-axes across a single row and the x-axes across a single column. The diagonal plots
are treated differently: a univariate distribution plot is drawn to show the marginal distribution of the
data in each column.
It is also possible to show a subset of variables or plot different variables on the rows and columns.
This is a high-level interface for PairGrid that is intended to make it easy to draw a few common
styles. You should use PairGrid directly if you need more flexibility.
Parameters:
data : pandas.DataFrame
Tidy (long-form) dataframe where each column is a variable and each row is an observation.
hue_order : list of strings
palette : dict or seaborn color palette
Set of colors for mapping the hue variable. If a dict, keys should be values in the hue variable.
vars : list of variable names
Variables within data to use, otherwise use every column with a numeric datatype.
x_vars, y_vars : lists of variable names
Variables within data to use separately for the rows and columns of the figure; i.e. to make a
non-square plot.
diag_kind : {‘auto’, ‘hist’, ‘kde’, None}
Kind of plot for the diagonal subplots. If ‘auto’, choose based on whether or not hue is used.
markers : single matplotlib marker code or list
Either the marker to use for all scatterplot points or a list of markers with a length the same as the
number of levels in the hue variable so that differently colored points will also have different
scatterplot markers.
height : scalar
aspect : scalar
corner : bool
If True, don’t add axes to the upper (off-diagonal) triangle of the grid, making this a “corner” plot.
dropna : boolean
Returns:
grid : PairGrid
See also
PairGrid
JointGrid
Examples
The simplest invocation uses scatterplot() for each pairing of the variables and histplot() for the
marginal plots along the diagonal:
penguins = sns.load_dataset("penguins")
sns.pairplot(penguins)
Assigning a hue variable adds a semantic mapping and changes the default marginal plot to a layered
kernel density estimate (KDE):
sns.pairplot(penguins, hue="species")
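It’s possible to force marginal histograms:
sns.pairplot(penguins, hue="species", diag_kind="hist")
The kind parameter determines both the diagonal and off-diagonal plotting style. Several options are
available, including using kdeplot() to draw KDEs:
sns.pairplot(penguins, kind="kde")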
Or histplot() to draw both bivariate and univariate histograms:
sns.pairplot(penguins, kind="hist")
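The markers parameter applies a style mapping on the off-diagonal axes. Currently, it will be
redundant with the hue variable:
sns.pairplot(penguins, hue="species", markers=["o", "s", "D"])
As with other figure-level functions, the size of the figure is controlled by setting the height of each
individual subplot:
sns.pairplot(penguins, height=1.5)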
Use vars or x_vars and y_vars to select the variables to plot:
sns.pairplot(
    penguins,
    x_vars=["bill_length_mm", "bill_depth_mm", "flipper_length_mm"],
    y_vars=["bill_length_mm", "bill_depth_mm"],
)
Set corner=True to plot only the lower triangle:
sns.pairplot(penguins, corner=True)
The plot_kws and diag_kws parameters accept dicts of keyword arguments to customize the off-
diagonal and diagonal plots, respectively:
sns.pairplot(
    penguins,
    plot_kws=dict(marker="+", linewidth=1),
    diag_kws=dict(fill=False),
)
The return object is the underlying PairGrid, which can be used to further customize the plot:
g = sns.pairplot(penguins, diag_kind="kde")
seaborn.PairGrid
This object maps each variable in a dataset onto a column and row in a grid of multiple axes.
Different axes-level plotting functions can be used to draw bivariate plots in the upper and lower
triangles, and the marginal distribution of each variable can be shown on the diagonal.
Several different common plots can be generated in a single line using pairplot(). Use PairGrid when
you need more flexibility.
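The mapping from variables to grid positions can be sketched in plain Python. This only illustrates the layout described above and is not seaborn's implementation; the classify helper and the layout dictionary are hypothetical names.

```python
variables = ["bill_length_mm", "bill_depth_mm", "body_mass_g"]


def classify(row, col):
    """Name the part of the grid that a (row, col) position belongs to."""
    if row == col:
        return "diagonal"
    return "lower" if row > col else "upper"


# Row i shows variable i on its y axis; column j shows variable j on its x axis.
layout = {
    (row, col): {"x": x_var, "y": y_var, "region": classify(row, col)}
    for row, y_var in enumerate(variables)
    for col, x_var in enumerate(variables)
}

for position, cell in layout.items():
    print(position, cell["region"], cell["y"], "vs", cell["x"])
```

Each off-diagonal pair appears twice, once in each triangle with the axes swapped, which is why the two triangles are otherwise redundant.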
Parameters:
data : DataFrame
Tidy (long-form) dataframe where each column is a variable and each row is an observation.
hue : string (variable name)
Variable in data to map plot aspects to different colors. This variable will be excluded from the
default x and y variables.
vars : list of variable names
Variables within data to use, otherwise use every column with a numeric datatype.
x_vars, y_vars : lists of variable names
Variables within data to use separately for the rows and columns of the figure; i.e. to make a
non-square plot.
hue_order : list of strings
palette : dict or seaborn color palette
Set of colors for mapping the hue variable. If a dict, keys should be values in the hue variable.
hue_kws : dictionary of param -> list of values mapping
Other keyword arguments to insert into the plotting call to let other plot attributes vary across levels
of the hue variable (e.g. the markers in a scatterplot).
corner : bool
If True, don’t add axes to the upper (off-diagonal) triangle of the grid, making this a “corner” plot.
height : scalar
aspect : scalar
layout_pad : scalar
despine : boolean
dropna : boolean
Drop missing values from the data before plotting.
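A hedged sketch of hue_kws together with the sizing parameters, using made-up data (the column names a, b, and group are assumptions). A matplotlib function is mapped here so that each hue level is drawn in its own call and can receive its own marker.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; an assumption for running without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

rng = np.random.default_rng(3)
df = pd.DataFrame(rng.normal(size=(40, 2)), columns=["a", "b"])
df["group"] = np.tile(["g1", "g2"], 20)

g = sns.PairGrid(
    df, hue="group", vars=["a", "b"],
    hue_kws={"marker": ["o", "s"]},  # one marker per level of the hue variable
    height=2, aspect=1.2,
    dropna=True,
)
g.map(plt.scatter)
g.add_legend()

grid_shape = g.axes.shape
print(grid_shape)
```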
See also
pairplot
FacetGrid
Examples
Calling the constructor sets up a blank grid of subplots with each row and each column corresponding
to a numeric variable in the dataset:
penguins = sns.load_dataset("penguins")
g = sns.PairGrid(penguins)
Passing a bivariate function to PairGrid.map() will draw a bivariate plot on every axes:
g = sns.PairGrid(penguins)
g.map(sns.scatterplot)
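Passing separate functions to PairGrid.map_diag() and PairGrid.map_offdiag() will show each
variable’s marginal distribution on the diagonal:
g = sns.PairGrid(penguins)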
g.map_diag(sns.histplot)
g.map_offdiag(sns.scatterplot)
It’s also possible to use different functions on the upper and lower triangles of the plot (which are
otherwise redundant):
g = sns.PairGrid(penguins, diag_sharey=False)
g.map_upper(sns.scatterplot)
g.map_lower(sns.kdeplot)
g.map_diag(sns.kdeplot)
Or to avoid the redundancy altogether:
g = sns.PairGrid(penguins, diag_sharey=False, corner=True)
g.map_lower(sns.scatterplot)
g.map_diag(sns.kdeplot)
The PairGrid constructor accepts a hue variable. This variable is passed directly to functions that
understand it:
g = sns.PairGrid(penguins, hue="species")
g.map_diag(sns.histplot)
g.map_offdiag(sns.scatterplot)
g.add_legend()
But you can also pass matplotlib functions, in which case a groupby is performed internally and a
separate plot is drawn for each level:
g = sns.PairGrid(penguins, hue="species")
g.map_diag(plt.hist)
g.map_offdiag(plt.scatter)
g.add_legend()
Additional semantic variables can be assigned by passing data vectors directly while mapping the
function:
g = sns.PairGrid(penguins, hue="species")
g.map_diag(sns.histplot)
g.map_offdiag(sns.scatterplot, size=penguins["sex"])
g.add_legend(title="", adjust_subtitles=True)
When using seaborn functions that can implement a numeric hue mapping, you will want to disable
mapping of the variable on the diagonal axes. Note that the hue variable is excluded from the list of
variables shown by default:
g = sns.PairGrid(penguins, hue="body_mass_g")
g.map_diag(sns.histplot, hue=None, color=".3")
g.map_offdiag(sns.scatterplot)
g.add_legend()
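The vars parameter can be used to control exactly which variables are used:
g = sns.PairGrid(penguins, hue="body_mass_g", vars=["body_mass_g", "bill_length_mm", "bill_depth_mm"])
g.map_diag(sns.histplot, hue=None, color=".3")
g.map_offdiag(sns.scatterplot)
g.add_legend()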
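The plot need not be square: separate variables can be used to define the rows and columns:
x_vars = ["body_mass_g", "bill_length_mm", "bill_depth_mm"]
y_vars = ["body_mass_g"]
g = sns.PairGrid(penguins, hue="species", x_vars=x_vars, y_vars=y_vars)
g.map_diag(sns.histplot, color=".3")
g.map_offdiag(sns.scatterplot)
g.add_legend()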
It can be useful to explore different approaches to resolving multiple distributions on the diagonal
axes:
g = sns.PairGrid(penguins, hue="species")
g.map_diag(sns.histplot, multiple="stack", element="step")
g.map_offdiag(sns.scatterplot)
g.add_legend()
seaborn.PairGrid
class seaborn.PairGrid(data, *, hue=None, vars=None, x_vars=None, y_vars=None, hue_order=None, palette=None, hue_kws=None, corner=False, diag_sharey=True, height=2.5, aspect=1, layout_pad=0.5, despine=True, dropna=False)
Subplot grid for plotting pairwise relationships in a dataset.
This object maps each variable in a dataset onto a column and row in a
grid of multiple axes. Different axes-level plotting functions can be used
to draw bivariate plots in the upper and lower triangles, and the marginal
distribution of each variable can be shown on the diagonal.
See also
pairplot
Easily drawing common uses of PairGrid.
FacetGrid
Subplot grid for plotting conditional relationships.
Examples
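Calling the constructor sets up a blank grid of subplots with each row and each column
corresponding to a numeric variable in the dataset:
penguins = sns.load_dataset("penguins")
g = sns.PairGrid(penguins)
Passing a bivariate function to PairGrid.map() will draw a bivariate plot on every axes:
g = sns.PairGrid(penguins)
g.map(sns.scatterplot)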
Passing separate functions to PairGrid.map_diag() and PairGrid.map_offdiag()
will show each variable’s marginal distribution on the diagonal:
g = sns.PairGrid(penguins)
g.map_diag(sns.histplot)
g.map_offdiag(sns.scatterplot)
It’s also possible to use different functions on the upper and lower
triangles of the plot (which are otherwise redundant):
g = sns.PairGrid(penguins, diag_sharey=False)
g.map_upper(sns.scatterplot)
g.map_lower(sns.kdeplot)
g.map_diag(sns.kdeplot)
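Or to avoid the redundancy altogether:
g = sns.PairGrid(penguins, diag_sharey=False, corner=True)
g.map_lower(sns.scatterplot)
g.map_diag(sns.kdeplot)
Additional semantic variables can be assigned by passing data vectors
directly while mapping the function: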
g = sns.PairGrid(penguins, hue="species")
g.map_diag(sns.histplot)
g.map_offdiag(sns.scatterplot, size=penguins["sex"])
g.add_legend(title="", adjust_subtitles=True)
When using seaborn functions that can implement a numeric hue
mapping, you will want to disable mapping of the variable on the
diagonal axes. Note that the hue variable is excluded from the list
of variables shown by default:
g = sns.PairGrid(penguins, hue="body_mass_g")
g.map_diag(sns.histplot, hue=None, color=".3")
g.map_offdiag(sns.scatterplot)
g.add_legend()
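It can be useful to explore different approaches to resolving multiple
distributions on the diagonal axes:
g = sns.PairGrid(penguins, hue="species")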
g.map_diag(sns.histplot, multiple="stack", element="step")
g.map_offdiag(sns.scatterplot)
g.add_legend()
Methods
__init__(data, *[, hue, vars, x_vars, ...])   Initialize the plot figure and PairGrid object.
add_legend([legend_data, title, ...])         Draw a legend, maybe placing it outside axes and resizing the figure.
apply(func, *args, **kwargs)                  Pass the grid to a user-supplied function and return self.
map(func, **kwargs)                           Plot with the same function in every subplot.
map_diag(func, **kwargs)                      Plot with a univariate function on each diagonal subplot.
map_lower(func, **kwargs)                     Plot with a bivariate function on the lower diagonal subplots.
map_offdiag(func, **kwargs)                   Plot with a bivariate function on the off-diagonal subplots.
map_upper(func, **kwargs)                     Plot with a bivariate function on the upper diagonal subplots.
pipe(func, *args, **kwargs)                   Pass the grid to a user-supplied function and return its value.
savefig(*args, **kwargs)                      Save an image of the plot.
set(**kwargs)                                 Set attributes on each subplot Axes.
tick_params([axis])                           Modify the ticks, tick labels, and gridlines.
tight_layout(*args, **kwargs)                 Call fig.tight_layout within a rect that excludes the legend.
Attributes