Visualizing Many Distributions at Once
Earlier methods like histograms or single distribution plots don’t scale well to many
groups.
Example: weather data → showing temperature distributions for each of the 12
months in a year.
For this we use specialized visualization methods such as:
Boxplots
Violin plots
Ridgeline plots
Strip charts
Sina plots
These methods compare multiple distributions simultaneously.
Key Terms
Response variable: The variable whose distribution is being studied (e.g.,
temperature).
Grouping variable: Defines subsets of data for comparison (e.g., months).
General approach: One axis shows the response variable, the other the grouping
variable.
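As a minimal sketch of the response/grouping idea, the snippet below splits records into one list of response values per group. The records and the helper name group_response are made up for illustration; they are not data from these slides.

```python
from collections import defaultdict

# Hypothetical (month, temperature) records in "long" format:
# one row per observation, holding a grouping value and a response value.
records = [
    ("Jan", -2.0), ("Jan", 1.5), ("Jan", -4.0),
    ("Feb", 0.5), ("Feb", 3.0),
    ("Jul", 27.0), ("Jul", 31.5), ("Jul", 29.0),
]

def group_response(rows):
    """Collect response values into one list per grouping value."""
    groups = defaultdict(list)
    for group, value in rows:
        groups[group].append(value)
    return dict(groups)

# Each key becomes one position on the grouping axis; each list is the
# distribution drawn along the response axis at that position.
temps_by_month = group_response(records)
```

Every plot type in this section starts from a structure like this: one set of response values per group.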
Visualizing Distributions Along the Vertical Axis
Boxplots
Components
Median: central line inside the box.
Interquartile Range (IQR): the box, spanning
the 25th to 75th percentile.
Whiskers: extend to the most extreme data
point within 1.5×IQR of the box edges.
Outliers: individual dots beyond the whiskers.
Advantages:
Compact → works well when comparing
many groups.
Shows skewness and spread.
Standardized → universally understood.
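The boxplot components can be computed directly, which shows what each mark on the plot encodes. This is a sketch using only the Python standard library; the function name boxplot_stats and the sample values are illustrative. Plotting software may use a slightly different quartile convention, so exact quartile values can differ.

```python
import statistics

def boxplot_stats(values, whisker_factor=1.5):
    """Compute the numbers a boxplot draws (needs at least two values)."""
    data = sorted(values)
    # Quartiles by linear interpolation; the "inclusive" method is one of
    # several common conventions and is an assumption of this sketch.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - whisker_factor * iqr
    upper_fence = q3 + whisker_factor * iqr
    # Whiskers end at the most extreme data points inside the fences.
    inside = [v for v in data if lower_fence <= v <= upper_fence]
    outliers = [v for v in data if v < lower_fence or v > upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": iqr,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": outliers,
    }

# Hypothetical sample with one extreme value.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 30]
stats = boxplot_stats(sample)
```

For this sample the value 30 lies above the upper fence, so it would be drawn as an outlier dot and the upper whisker would stop at 9.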
Violin Plots
Violin plots extend boxplots by
showing distribution shape
through kernel density estimation
(KDE).
When to use:
Large sample sizes.
Detecting subtle distribution
shapes.
Limitations:
Requires enough data → small
samples make density misleading.
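To make the KDE step concrete, here is a sketch of a Gaussian kernel density estimate written from scratch. The half-width of a violin at a given response value is proportional to this density. The fixed bandwidth of 1.0 and the sample values are assumptions for illustration; plotting libraries normally choose the bandwidth automatically.

```python
import math

def gaussian_kde(values, bandwidth):
    """Return a function that estimates the density of `values` at x."""
    n = len(values)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        # Sum one Gaussian bump centred on every observation.
        return norm * sum(
            math.exp(-0.5 * ((x - v) / bandwidth) ** 2) for v in values
        )

    return density

# Hypothetical temperatures for one group.
temps = [18.0, 19.5, 20.0, 20.5, 21.0, 22.0, 29.0]
density = gaussian_kde(temps, bandwidth=1.0)

# Evaluate on a grid of response values from 16.0 to 31.0;
# a violin mirrors these values on both sides of the group position.
grid = [16.0 + 0.5 * i for i in range(31)]
half_widths = [density(x) for x in grid]
```

With only seven observations, the single value at 29.0 produces its own small bump, which illustrates the limitation above: with small samples the smooth outline can suggest structure that the data do not support.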
Strip Charts
Definition: Plot all data points
individually (raw visualization).
Problem: Overplotting (points
overlap, hiding density).
Jittering
Add random horizontal noise
so points don’t overlap.
Reveals frequency without
density smoothing.
Best for small to medium
samples.
Not suitable for large datasets
(becomes unreadable).
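Jittering can be sketched as follows: each point keeps its response value, and only its position along the grouping axis is shifted by a small random amount. The function name jitter_positions, the width of 0.3 and the sample values are illustrative assumptions.

```python
import random

def jitter_positions(group_index, n_points, width=0.3, seed=0):
    """Return positions scattered uniformly around a group's axis slot."""
    rng = random.Random(seed)  # seeded so the layout is reproducible
    half = width / 2.0
    return [group_index + rng.uniform(-half, half) for _ in range(n_points)]

# Hypothetical temperatures with repeated values that would overlap
# exactly in a plain strip chart.
temps = [20.0, 20.0, 20.0, 21.0, 21.0, 25.0]
xs = jitter_positions(group_index=3, n_points=len(temps))

# (position on grouping axis, response value) pairs ready for plotting.
points = list(zip(xs, temps))
```

The width should stay well below the spacing between groups so that points from neighboring groups do not mix.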
Sina Plots
Figure: Sina plot of Lincoln temperatures.
Hybrid of violin plot + jittered points.
Points spread horizontally
proportional to density at that value.
Shows both individual
observations and overall
distribution.
Best for medium datasets.
Combines advantages of violins
(shape) and strip charts (raw points).
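The sina idea combines the two previous techniques: the jitter range for each point is scaled by the estimated density at that point's value. The sketch below is a simplified version under stated assumptions (Gaussian kernel, fixed bandwidth, uniform jitter); the function name sina_positions is hypothetical, and published implementations differ in details.

```python
import math
import random

def sina_positions(values, group_index, bandwidth=1.0, max_width=0.4, seed=0):
    """Jitter each point within a range proportional to the local density."""
    rng = random.Random(seed)
    n = len(values)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        # Gaussian kernel density estimate at x.
        return norm * sum(
            math.exp(-0.5 * ((x - v) / bandwidth) ** 2) for v in values
        )

    densities = [density(v) for v in values]
    peak = max(densities)
    positions = []
    for d in densities:
        # Points in dense regions may spread up to max_width / 2 on each
        # side; points in sparse regions stay close to the group centre.
        half = 0.5 * max_width * d / peak
        positions.append(group_index + rng.uniform(-half, half))
    return positions

# Hypothetical temperatures for one group; 29.0 is an isolated value.
values = [18.0, 19.5, 20.0, 20.5, 21.0, 22.0, 29.0]
xs = sina_positions(values, group_index=1)
```

The isolated value at 29.0 ends up near the group's center line, while the cluster around 20 spreads out, so the cloud of points traces the same outline a violin plot would draw.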
Visualizing Distributions Along the Horizontal Axis
Ridgeline Plots