A scatter diagram is a graphical method used to study the relationship between two variables by plotting data points on a graph. It helps in visually identifying the direction and strength of correlation between variables without performing complex calculations. Each point on the graph represents a pair of values, making it easy to observe patterns such as positive, negative or no correlation.
- Does not require complex calculations, making it simple and easy to use.
- Helps in detecting outliers or unusual data points in the dataset.
- Acts as a foundation for further analysis using methods like correlation and regression.
How to Draw a Scatter Diagram
A scatter diagram is drawn to visually represent the relationship between two variables using data points. The following steps are involved:
- Identify the Variables: Determine the independent variable (X) and the dependent variable (Y) from the given data.
- Draw Axes: Draw two perpendicular axes on a graph, placing the independent variable on the X-axis and the dependent variable on the Y-axis.
- Choose a Suitable Scale: Select an appropriate scale for both axes to clearly represent all data values.
- Plot the Data Points: Plot each pair of values (X, Y) on the graph as a dot. Each dot represents one observation.
- Observe the Pattern: After plotting all points, examine the overall pattern to identify the type and strength of correlation between the variables.
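The steps above can be sketched in code. The following is a minimal, standard-library-only illustration that renders the points on a text grid; in practice a plotting library would normally handle this step. The function name `ascii_scatter` and the default grid size are assumptions made for this sketch, not part of any established tool.

```python
def ascii_scatter(xs, ys, width=40, height=12):
    """Render (x, y) pairs on a text grid; '*' marks an observation."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and of equal length")

    # Choose a suitable scale: map each data range onto the grid.
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    x_span = (x_max - x_min) or 1  # guard against a constant variable
    y_span = (y_max - y_min) or 1

    # Plot the data points: one cell per (x, y) pair.
    grid = [[" "] * width for _ in range(height)]
    for x, y in zip(xs, ys):
        col = round((x - x_min) / x_span * (width - 1))
        row = round((y - y_min) / y_span * (height - 1))
        grid[height - 1 - row][col] = "*"  # grid row 0 is the top of the plot

    # Draw the axes: '|' for the Y-axis, '-' for the X-axis.
    lines = ["|" + "".join(cells) for cells in grid]
    lines.append("+" + "-" * width)
    return "\n".join(lines)


# Illustrative data: X on the horizontal axis, Y on the vertical axis.
print(ascii_scatter([10, 20, 30, 40, 50, 60], [80, 160, 240, 320, 400, 480]))
```

Reading the printed grid from left to right gives the pattern described in the final step: points rising toward the top right suggest a positive relationship.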
Types of Correlation in Scatter Diagram
A scatter diagram shows different types of correlation based on how the data points are distributed on the graph. By observing the pattern of points, we can identify the direction and strength of the relationship between variables.
1. Perfect Positive Correlation
All data points lie exactly on a straight line with an upward slope, showing a perfect relationship between variables.
- As one variable increases, the other also increases in a perfectly consistent manner.
- The correlation coefficient is +1, indicating maximum positive correlation.

2. Perfect Negative Correlation
Here, all points lie exactly on a straight line with a downward slope, indicating a perfect inverse relationship.
- As one variable increases, the other decreases in a perfectly consistent manner.
- The correlation coefficient is -1, indicating maximum negative correlation.

3. Positive Correlation
The points cluster around an upward-sloping line, showing a direct relationship between variables.
- As one variable increases, the other generally increases.
- The points are close to the line but not perfectly aligned, indicating a strong or moderate relationship.

4. Negative Correlation
The points cluster around a downward-sloping line, indicating an inverse relationship between variables.
- As one variable increases, the other generally decreases.
- The tighter the points cluster around the line, the stronger the negative relationship; a wider spread indicates a weaker one.

5. No Correlation
The points are scattered randomly without forming any specific pattern or direction.
- There is no apparent linear relationship between the variables.
- The correlation coefficient is close to zero.

The correlation coefficient is a numerical measure that indicates the strength and direction of the linear relationship between two variables. Its value ranges from -1 to +1: values close to +1 indicate strong positive correlation, values close to -1 indicate strong negative correlation and values near 0 indicate little or no linear correlation.
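To make the coefficient concrete, here is a minimal sketch of how it could be computed, assuming the coefficient meant here is Karl Pearson's product-moment coefficient. The function name `pearson_r` and the sample data are illustrative assumptions.

```python
import math


def pearson_r(xs, ys):
    """Pearson's correlation coefficient for paired observations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two paired observations")

    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Sums of products and squares of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)

    if sxx == 0 or syy == 0:
        raise ValueError("coefficient is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)


# Illustrative data only: Y rises with X in the first call, falls in the second.
print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))
print(pearson_r([1, 2, 3, 4], [8, 6, 4, 2]))
```

The sign of the result gives the direction of the relationship and its magnitude gives the strength, matching the five patterns described above.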
How to Interpret a Scatter Diagram
Interpreting a scatter diagram involves analyzing the pattern of plotted points to understand the direction, strength and nature of the relationship between two variables. By carefully observing the distribution of points, meaningful conclusions about correlation can be drawn.
- Closeness of Points (Strength of Correlation): If the points are closely clustered around a line, it indicates a strong correlation; widely scattered points suggest a weak or no correlation.
- Presence of Trend: If the points show a clear upward or downward pattern, the variables are correlated; absence of any visible pattern indicates no correlation.
- Direction of Trend: An upward trend from left to right indicates positive correlation (both variables move together), while a downward trend indicates negative correlation (variables move in opposite directions).
- Perfect Correlation: If all points lie exactly on a straight line, the relationship is perfectly correlated, either positively (upward slope) or negatively (downward slope).
- Outlier Detection: Unusual points that lie far from the general pattern (outliers) may distort the apparent relationship and should be analyzed separately.
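Once a correlation coefficient is available, the visual rules above can be mirrored numerically. The sketch below is one possible mapping from a coefficient `r` to a verbal description; the function name `describe_correlation` and the cut-offs 0.7 and 0.3 are hypothetical choices for illustration, not standard thresholds.

```python
def describe_correlation(r, strong=0.7, weak=0.3):
    """Turn a correlation coefficient into a short verbal description.

    The `strong` and `weak` cut-offs are illustrative assumptions.
    """
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")

    size = abs(r)
    if size < weak:
        return "little or no linear correlation"

    # Strength comes from the magnitude, direction from the sign.
    if size == 1.0:
        strength = "perfect"
    elif size >= strong:
        strength = "strong"
    else:
        strength = "moderate"
    direction = "positive" if r > 0 else "negative"
    return f"{strength} {direction} correlation"


for value in (1.0, -0.85, 0.5, 0.1):
    print(value, "->", describe_correlation(value))
```

Because the cut-offs are a matter of convention, a description produced this way should be read alongside the diagram itself rather than in place of it.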
Example: Drawing and Interpreting a Scatter Diagram
Consider the following data for two variables X and Y. We are required to draw a scatter diagram and determine the type of correlation between them.
| X | Y |
|---|---|
| 10 | 80 |
| 20 | 160 |
| 30 | 240 |
| 40 | 320 |
| 50 | 400 |
| 60 | 480 |
The values of X are plotted on the X-axis and the corresponding values of Y are plotted on the Y-axis. Each pair of values, such as (10, 80), (20, 160), …, (60, 480), is represented as a point on the graph.

After plotting the points, the pattern formed helps in understanding the relationship between the variables.
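- Every value of Y is exactly 8 times the corresponding value of X, so all the points lie exactly on a straight line with an upward slope from left to right.
- This indicates a perfect positive correlation (correlation coefficient of +1): both variables increase together in a perfectly consistent manner.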
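The example data can also be checked numerically. The sketch below fits a least-squares straight line to the tabulated values and measures how far each point lies from it; the helper name `fit_line` is an assumption made for this illustration.

```python
xs = [10, 20, 30, 40, 50, 60]
ys = [80, 160, 240, 320, 400, 480]


def fit_line(xs, ys):
    """Least-squares slope and intercept of the line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(xs, ys)

# Vertical distance of each point from the fitted line.
residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]

print("slope:", slope, "intercept:", intercept)
print("largest distance from the line:", max(abs(e) for e in residuals))
```

For this data the fitted line is Y = 8X and no point deviates from it. A positive slope with zero deviation is the numerical counterpart of all points lying exactly on an upward-sloping line, which is the perfect positive case.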
Advantages
A scatter diagram is a simple and effective tool for analyzing the relationship between two variables. Its key advantages are:
- Simplicity: It is a simple and non-mathematical method to study correlation, requiring only basic plotting of points.
- Initial Analysis Tool: It serves as the first step in understanding the relationship between variables before applying advanced statistical methods.
- Easy Interpretation: The pattern of points can be easily understood at a glance, helping to quickly identify the presence or absence of correlation.
- Visual Representation: It provides a clear graphical representation of data, making trends and relationships easy to observe.
- Detection of Outliers: It helps in identifying unusual or extreme values that may affect analysis.
- No Complex Calculations: It does not require formulas or computations, making it suitable for quick analysis.
- Flexible Use: It can be applied to different types of data and in various fields like business, science and research.
Limitations
- Rough Measure: It provides only a general idea about the relationship between variables and does not give precise results.
- Non-Quantitative Method: It does not provide an exact numerical value of correlation, making it less accurate compared to statistical methods.
- Unsuitable for Large Data: When the number of observations is very large, the diagram becomes crowded and difficult to interpret clearly.
- Limited to Two Variables: It can effectively represent only two variables at a time and cannot handle multiple variables simultaneously.
- Subjective Interpretation: The conclusion depends on visual judgment, which may vary from person to person.