Comprehensive Guide to Scatter Plot using ggplot2 in R
Last Updated :
20 Dec, 2023
In this article, we will see how to create scatter plots using ggplot2 in the R Programming Language.
ggplot2 is a free, open-source visualization package that is widely used in R. It was written by Hadley Wickham and implements the Grammar of Graphics. The package can be installed using the R function install.packages().
install.packages("ggplot2")
A scatter plot uses dots to represent the values of two numeric variables and is used to observe the relationship between them. To draw a scatter plot we will use the geom_point() function. The following is brief information about geom_point().
Syntax : geom_point(size, color, fill, shape, stroke)
Parameter :
- size : Size of the points
- color : Color of the points (the border color for shapes 21 to 25)
- fill : Fill color of the points (only applies to shapes 21 to 25)
- shape : Shape of the points, an integer from 0 to 25
- stroke : Thickness of the point border
- Return : It adds a layer of points to the plot.
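As a minimal sketch of the parameters listed above, the following sets all five as fixed values outside aes(). The specific values and the name p_params are illustrative assumptions, not part of the original example; shape 21 is chosen because it has both a border and an interior.

```r
library(ggplot2)

# Fixed (non-mapped) settings go outside aes().
# Shape 21 is a filled circle: `color` draws its border, `fill` its interior,
# and `stroke` sets the border thickness.
p_params <- ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width)) +
  geom_point(size = 3, color = "black", fill = "steelblue",
             shape = 21, stroke = 1)

p_params
```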
Basic Scatterplot with ggplot2 in R
R
library(ggplot2)

ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width)) +
  geom_point()
Output:

Basic Scatterplot with ggplot2 in R
Basic Scatterplot with ggplot2 in R with groups
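Here we distinguish the points by group (i.e. by the levels of a factor). Mapping color inside aes() colors the points by group, and the variable should be a factor. Wrapping a numeric variable in factor(), as in the example, treats each distinct value as its own group.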
Syntax:
aes(color = factor(variable))
R
ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width)) +
  geom_point(aes(color = factor(Sepal.Width)))
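Grouping with factor() tends to be most readable when the variable has only a few distinct values. The sketch below is an illustration using the built-in mtcars dataset, which is an assumption of this example and not used elsewhere in the article; cyl takes only the values 4, 6 and 8.

```r
library(ggplot2)

# cyl is stored as a number; factor() turns it into three discrete groups,
# so ggplot2 uses a discrete color scale instead of a continuous gradient.
p_groups <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point(aes(color = factor(cyl)))

p_groups
```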
Changing color in Basic Scatterplot with ggplot2 in R
Here we map the color aesthetic inside aes() to a variable, so that each species gets its own color.
R
ggplot(iris) +
  geom_point(aes(x = Sepal.Length,
                 y = Sepal.Width,
                 color = Species))
Changing Shape in Basic Scatterplot with ggplot2 in R
To change the shape of the data points we map the shape aesthetic inside aes().
R
ggplot(iris) +
  geom_point(aes(x = Sepal.Length, y = Sepal.Width,
                 shape = Species, color = Species))
Changing the size aesthetic in Basic Scatterplot with ggplot2 in R
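To change the size of the data points we use the size argument. A fixed size is set outside aes(); placing a constant such as size = .5 inside aes() maps it as data, which adds a meaningless legend and does not give points of size 0.5.
R
ggplot(iris) +
  geom_point(aes(x = Sepal.Length,
                 y = Sepal.Width),
             size = 0.5)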
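Size can also be mapped to a variable inside aes(), so that it varies from point to point. The following is a sketch under the assumption that Petal.Length is a sensible third variable to show; the range passed to scale_size() and the alpha value are illustrative choices.

```r
library(ggplot2)

# Mapping size inside aes() ties point size to the data and adds a legend.
# scale_size(range = ...) limits the smallest and largest drawn sizes.
p_size <- ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width)) +
  geom_point(aes(size = Petal.Length), alpha = 0.6) +
  scale_size(range = c(1, 4))

p_size
```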
Label points in Basic Scatterplot with ggplot2 in R
To label the data points, we map the label aesthetic inside geom_text().
R
library(ggplot2)

color_palette <- c("blue", "green", "red")

ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width, color = Species)) +
  geom_point(size = 3) +
  geom_text(aes(label = Species),
            position = position_nudge(x = 0.05, y = 0.05),
            size = 3,
            show.legend = FALSE) +
  scale_color_manual(values = color_palette) +
  theme_minimal() +
  ggtitle("Sepal Length vs. Sepal Width") +
  xlab("Sepal Length") +
  ylab("Sepal Width") +
  theme(legend.position = "right")
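Labelling every one of the 150 points can produce a crowded plot. One possible way to reduce the clutter, sketched here as an option and not as the article's method, is the check_overlap argument of geom_text(), which skips labels that would overlap labels already drawn in the same layer. The name p_labels is illustrative.

```r
library(ggplot2)

# check_overlap = TRUE drops labels that would overlap earlier ones.
# The filtering happens when the plot is drawn; all points remain visible.
p_labels <- ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width, color = Species)) +
  geom_point(size = 2) +
  geom_text(aes(label = Species),
            position = position_nudge(y = 0.08),
            size = 3,
            check_overlap = TRUE,
            show.legend = FALSE)

p_labels
```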
Regression lines in Basic Scatterplot with ggplot2 in R
Regression models a target value based on independent variables and is mostly used for finding the relationship between variables and for forecasting. In R we can use the stat_smooth() function to add a fitted or smoothed line to the plot.
Syntax: stat_smooth(method = "method_name", formula = formula_to_be_used, geom = "geom_name")
Parameters:
- method: the smoothing method (function) to use, for example "lm" or "loess"
- formula: the formula to use in the smoothing function, for example y ~ x
- geom: the geometric object used to display the data
R
ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width)) +
  geom_point() +
  stat_smooth(method = "lm")
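The formula parameter described above is not used in the example, so here is a brief sketch of it. The quadratic formula is an illustrative assumption, chosen only to show that the fitted curve follows whatever formula is supplied.

```r
library(ggplot2)

# formula is written in terms of the aesthetics x and y, not the column names.
# y ~ poly(x, 2) fits a second-degree polynomial instead of a straight line.
p_quad <- ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width)) +
  geom_point() +
  stat_smooth(method = "lm", formula = y ~ poly(x, 2), se = FALSE)

p_quad
```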
Using stat_smooth with the loess method in Basic Scatterplot with ggplot2 in R
R
ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width)) +
  geom_point() +
  stat_smooth()
The geom_smooth() function can also be used to draw a regression line or a smoothed curve.
Syntax: geom_smooth(method = "method_name", formula = formula_to_be_used)
Parameters:
- method: the smoothing method (function) to use, for example "lm" or "loess"
- formula: the formula to use in the smoothing function, for example y ~ x
R
ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width)) +
  geom_point() +
  geom_smooth()
To show a straight regression line with geom_smooth(), we pass the method as "lm"; setting se = FALSE hides the confidence band. The formula defaults to y ~ x.
geom_smooth with the lm method in Basic Scatterplot with ggplot2 in R
R
ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE)
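The intercept and slope can be calculated with the lm() function, which fits a linear regression, followed by coefficients(). A line with a given intercept and slope is drawn with geom_abline(); geom_smooth() does not accept intercept or slope arguments.
Intercept and slope in Basic Scatterplot with ggplot2 in R
R
ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width)) +
  geom_point() +
  # linewidth replaces size for lines in ggplot2 3.4.0 and later
  geom_abline(intercept = 37, slope = -5, color = "red",
              linetype = "dashed", linewidth = 1.5)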
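The calculation of the intercept and slope with lm() can be sketched as follows. The names fit, coefs and p_fit are illustrative; coef() is the short form of coefficients(), and geom_abline() is used here to draw the resulting line.

```r
library(ggplot2)

# Fit Sepal.Width as a linear function of Sepal.Length.
fit <- lm(Sepal.Width ~ Sepal.Length, data = iris)

# coef() returns a named vector: "(Intercept)" and "Sepal.Length" (the slope).
coefs <- coef(fit)

# Draw the fitted line from the extracted coefficients.
p_fit <- ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width)) +
  geom_point() +
  geom_abline(intercept = coefs[["(Intercept)"]],
              slope = coefs[["Sepal.Length"]],
              color = "red", linetype = "dashed")

p_fit
```

This draws the same line as geom_smooth(method = "lm", se = FALSE), except that geom_abline() extends it across the whole panel.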
Change the point color/shape/size manually
scale_color_manual(), scale_fill_manual(), scale_size_manual(), scale_shape_manual() and scale_linetype_manual() are built-in functions that assign values of your choice to the levels of a categorical variable. Here we mainly use scale_color_manual(), which maps each level to a color that we specify.
Syntax :
- scale_shape_manual(values) for point shapes
- scale_color_manual(values) for point colors
- scale_size_manual(values) for point sizes
Parameter :
- values : A set of aesthetic values to map the data to. Here we take the desired set of colors.
Return : A scale that maps the levels of the data to the given values
Changing aesthetics in Basic Scatterplot with ggplot2 in R
R
library(ggplot2)

# shape must be mapped in aes() for scale_shape_manual() to have any effect
ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width,
                 color = Species, shape = Species)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE, fullrange = TRUE) +
  scale_shape_manual(values = c(3, 16, 17)) +
  scale_color_manual(values = c("pink", "yellow", "green")) +
  theme(legend.position = "top")
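The values argument also accepts a named vector, which ties each value to a specific level instead of relying on the order of the levels. The sketch below assumes the three Species levels of iris; the particular colors, shapes and the names species_colors, species_shapes and p_manual are illustrative.

```r
library(ggplot2)

# Names must match the factor levels exactly.
species_colors <- c(setosa = "darkorange",
                    versicolor = "purple",
                    virginica = "steelblue")
species_shapes <- c(setosa = 16, versicolor = 17, virginica = 15)

p_manual <- ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width,
                             color = Species, shape = Species)) +
  geom_point(size = 2) +
  scale_color_manual(values = species_colors) +
  scale_shape_manual(values = species_shapes)

p_manual
```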
Marginal rugs to Basic Scatterplot with ggplot2 in R
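To add marginal rugs to the scatter plot we will use the geom_rug() function. geom_rug() needs the x and y aesthetics, so they are mapped in ggplot(), where every layer inherits them.
R
ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width,
                 shape = Species, color = Species)) +
  geom_point() +
  geom_rug()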
Here we add marginal rugs to a scatter plot without any grouping.
Marginal rugs in Basic Scatterplot with ggplot2 in R
R
ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width)) +
  geom_point() +
  geom_rug()
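geom_rug() also takes a sides argument that controls which margins receive a rug. The following sketch uses sides = "bl" (bottom and left); the alpha value and the name p_rug are illustrative choices.

```r
library(ggplot2)

# sides is a string made of "t", "r", "b", "l" (top, right, bottom, left).
# A low alpha keeps overlapping tick marks readable.
p_rug <- ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width, color = Species)) +
  geom_point() +
  geom_rug(sides = "bl", alpha = 0.4)

p_rug
```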
Scatter plots with the 2-D density estimation
To add a 2-D density estimate to a scatter plot we will use geom_density_2d() and geom_density_2d_filled() from ggplot2.
Syntax: ggplot(data, aes(x, y)) + geom_density_2d(color, alpha)
Parameters:
- color: the color of the contour lines
- alpha: transparency of the layer
- fill: the fill of the contour bands; this applies to geom_density_2d_filled(), not to the contour lines
Scatterplots with 2-D density estimation
R
ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width)) +
  geom_point() +
  geom_density_2d()
geom_density_2d_filled() draws filled contour bands for the density estimate. The filled layer is added first so that the contour lines and the points are drawn on top of it.
Adding aesthetics
R
ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width)) +
  geom_density_2d_filled() +
  geom_density_2d(alpha = 0.5) +
  geom_point()
stat_density_2d() can also be used to add the 2-D density estimate.
Deploy density estimation
R
ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width)) +
  geom_point() +
  stat_density_2d()
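Because stat_density_2d() is a stat, it can be paired with a different geom. As a sketch, the following draws the contours as shaded polygons by mapping fill to the computed level; the alpha value and the name p_density are illustrative, and after_stat() assumes ggplot2 3.3.0 or later.

```r
library(ggplot2)

# after_stat(level) refers to the density level computed by the stat.
# geom = "polygon" fills each contour instead of drawing only its outline.
p_density <- ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width)) +
  stat_density_2d(aes(fill = after_stat(level)), geom = "polygon", alpha = 0.4) +
  geom_point()

p_density
```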
Scatter plots with ellipses
To draw an ellipse around a cluster of data points, we use the stat_ellipse() function. By default it draws a 95% confidence ellipse, assuming a multivariate t-distribution, for each group of points; here the groups come from mapping color to Species.
R
ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width, color = Species)) +
  geom_point() +
  stat_ellipse()
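The shape of the ellipse can be adjusted through the type and level arguments of stat_ellipse(). The sketch below assumes a multivariate normal distribution and a 90% level, both chosen only for illustration, and groups the points by Species so that one ellipse is drawn per group.

```r
library(ggplot2)

# type = "norm" assumes a multivariate normal distribution
# (the default, "t", assumes a multivariate t-distribution).
# level sets the confidence level of the ellipse.
p_ellipse <- ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width, color = Species)) +
  geom_point() +
  stat_ellipse(type = "norm", level = 0.9)

p_ellipse
```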