
Basic Statistics in Psychology

The document outlines the syllabus and structure for a course titled 'Basic Statistics in Psychology' offered by the University of Delhi. It includes units on descriptive statistics, standard scores, and analysis of relationships, detailing various statistical measures and their applications in psychological research. The material is designed for B.A. (Hons) Psychology students and emphasizes the importance of statistics in understanding psychological phenomena.

Uploaded by

Gaurav Sharma

BASIC STATISTICS IN PSYCHOLOGY

B.A. (HONS) PSYCHOLOGY
SEMESTER-II
DSC-06

DEPARTMENT OF DISTANCE AND CONTINUING EDUCATION
UNIVERSITY OF DELHI

Editorial Board
Prof. N.K. Chadha
Dr. Swati Jain

Content Writers
Dr. Deepesh Rathore, Dr. Poonam Phogat,
Dr. Shweta Chaudhary, Dr. Poonam Vats,
Dr. Halley, Dr. Swati Jain
Academic Coordinator
Mr. Deekshant Awasthi

© Department of Distance and Continuing Education


ISBN: 978-81-19417-63-6
1st edition: 2024
E-mail: [email protected]
[email protected]

Published by:
Department of Distance and Continuing Education
Campus of Open Learning/School of Open Learning,
University of Delhi, Delhi-110 007

Printed by:
School of Open Learning, University of Delhi

© Department of Distance & Continuing Education, Campus of Open Learning,
School of Open Learning, University of Delhi

Peer Review Committee
Dr. Nupur Gosain
Ms. Vidyut Singh

This Study Material is duly recommended in the meeting of Standing Committee
held on 08/05/2023 and approved in Academic Council meeting held on 26/05/2023
vide item no. 1014 and subsequently Executive Council Meeting held on
09/06/2023 vide item no. 14 {14-1(14-1-11)}. In continuation of the Notification
No. CNC-II/093/1(23)/2022-23/17 dated 06/04/2023, the following changes are
being made to the SLM of Basic Statistics in Psychology Semester-2 (earlier
known as Statistical Methods in Psychological Research) of BA (Hons.)
Psychology under the Department of Psychology for the students admitted in the
year 2023 onwards.

Corrections/Modifications/Suggestions proposed by Statutory Body,
DU/Stakeholder/s in the Self Learning Material (SLM) will be incorporated in
the next edition. However, these corrections/modifications/suggestions will be
uploaded on the website https://sol.du.ac.in. Any feedback or suggestions can be
sent to the email- [email protected]

Printed at: Vikas Publishing House Pvt. Ltd. Plot 20/4, Site-IV, Industrial Area Sahibabad, Ghaziabad - 201 010 (600 Copies)


SYLLABUS
Basic Statistics in Psychology

Unit – I
Syllabus: Introduction to Descriptive Statistics: Level of Measurement, Measures of
Central Tendency: Mean, Median and Mode (Characteristics and Computation);
Measures of Variability: Range, Semi-Interquartile Range, Standard Deviation,
Variance (Characteristics and Computation).
Mapping: Lesson 1: Relevance of Statistics; Lesson 2: Central Tendency

Unit – II
Syllabus: Score Transformations: Standard Scores and Percentile Ranks
(Characteristics and Computation); Normal Probability Curve: Characteristics
and Application of Normal Probability Curve.
Mapping: Lesson 3: Standard Scores

Unit – III
Syllabus: Analysis of Relationships: Meaning, Direction and Degree of Correlation;
Factors Affecting Pearson’s Correlation; Computation of Correlation:
Pearson’s Coefficient Correlation and Spearman’s Rank Order Correlation;
Prediction and Simple Regression (Concept and Calculation).
Mapping: Lesson 4: Analysis of Relationships


CONTENTS

UNIT I: RELEVANCE OF STATISTICS

Lesson 1 Relevance of Statistics


1.1 Learning Objectives
1.2 Introduction
1.2.1 Psychological Research
1.2.2 Why do Psychologists Carry Out Research?
1.2.3 What are the Different Types of Research?
1.3 Relevance of Statistics in Psychological Research
1.4 Descriptive and Inferential Statistics
1.5 Levels of Measurement
1.6 Grouped Frequency Distribution
1.6.1 Frequency Distribution
1.6.2 Grouped Frequency Distribution
1.6.3 Steps Involved in Creating a Grouped Frequency Distribution
1.6.4 Real Limits vs. Apparent Limits
1.6.5 Relative Frequency Distribution
1.6.6 The Cumulative Frequency
1.7 Graphical Representation of Data (Histogram, Frequency Polygon, Cumulative Percentage Curve)
1.7.1 Histogram
1.7.2 Frequency Polygon
1.7.3 Cumulative Percentage Curve
1.8 Solved Illustrations


1.9 Summary
1.10 Answers to In-Text Questions
1.11 Glossary
1.12 Self-Assessment Questions
1.13 References
1.14 Suggested Readings
Lesson 2 Central Tendency
2.1 Learning Objectives
2.2 Introduction
2.3 Measures of Central Tendency: Definition, Properties and Comparison
2.3.1 Mean, Median, and Mode
2.3.2 Comparison of Mean, Median and Mode
2.4 Calculation of Mode, Median and Mean from Raw Scores
2.5 Effects of Linear Score Transformations on Measures of Central Tendency
2.6 Measures of Variability: Range; Semi-Interquartile Range; Variance; Standard Deviation (Properties and Comparison)
2.6.1 Range and Semi-Interquartile Range
2.6.2 Variance
2.6.3 Standard Deviation
2.6.4 Quartile Deviation
2.7 Calculation of Variance and Standard Deviation
2.8 Effects of Linear Score Transformations on Measures of Variability
2.9 Summary
2.10 Answers to In-Text Questions
2.11 Glossary

2.12 Self-Assessment Questions
2.13 References
2.14 Suggested Readings

UNIT II: STANDARD SCORES

Lesson 3 Standard Scores


3.1 Learning Objectives
3.2 Introduction to Standard (z) Scores
3.3 Properties of z-scores
3.4 Transforming Raw Scores into z-scores
3.5 Determining Raw Scores from z-scores
3.6 Some Common Standard Scores
3.6.1 T-score
3.6.2 Stanine-score
3.6.3 STEN-score
3.7 Computations of Percentiles and Percentile Ranks from Grouped Data
3.7.1 Calculating Percentiles from Grouped Data
3.7.2 Calculating Percentile Ranks from Grouped Data
3.8 Comparison of z-scores and Percentile Ranks
3.9 The Normal Probability Distribution: Nature, Properties and Applications
3.9.1 Nature of the Normal Distribution
3.9.2 Properties of the Normal Distribution
3.9.3 Applications of the Normal Distribution
3.10 Normal Curve and Standard Scores
3.10.1 Finding Areas when the Score is Known and Finding Scores when the Area is Known

3.11 Summary
3.12 Answers to In-Text Questions
3.13 Glossary
3.14 Self-Assessment Questions
3.15 References
3.16 Suggested Readings

UNIT III: ANALYSIS OF RELATIONSHIPS

Lesson 4 Analysis of Relationships


4.1 Learning Objectives
4.2 Introduction
4.3 Understanding Correlation
4.3.1 Scatter Diagram
4.3.2 Components of Correlation: Direction and Magnitude
4.3.3 Meaning of Correlation
4.4 Calculating Pearson’s Correlation
4.5 Correlation and causation
4.6 Effects of Linear Score Transformations
4.7 Factors Influencing Correlation
4.8 Spearman Rank Correlation Method
4.9 Linear Regression Analysis/Simple Regression
4.10 Summary
4.11 Answers to In-Text Questions
4.12 Glossary
4.13 Self-Assessment Questions
4.14 References
4.15 Suggested Readings

UNIT I: RELEVANCE OF STATISTICS

LESSON 1 RELEVANCE OF STATISTICS

LESSON 2 CENTRAL TENDENCY


LESSON 1

RELEVANCE OF STATISTICS

Dr. Deepesh Rathore
Assistant Professor
Department of Psychology
Lakshmibai College, University of Delhi
Email-Id: [email protected]

Structure
1.1 Learning Objectives
1.2 Introduction
1.2.1 Psychological Research
1.2.2 Why do Psychologists Carry Out Research?
1.2.3 What are the Different Types of Research?
1.3 Relevance of Statistics in Psychological Research
1.4 Descriptive and Inferential Statistics
1.5 Levels of Measurement
1.6 Grouped Frequency Distribution
1.6.1 Frequency Distribution
1.6.2 Grouped Frequency Distribution
1.6.3 Steps Involved in Creating a Grouped Frequency Distribution
1.6.4 Real Limits vs. Apparent Limits
1.6.5 Relative Frequency Distribution
1.6.6 The Cumulative Frequency
1.7 Graphical Representation of Data (Histogram, Frequency Polygon, Cumulative Percentage Curve)
1.7.1 Histogram
1.7.2 Frequency Polygon
1.7.3 Cumulative Percentage Curve
1.8 Solved Illustrations
1.9 Summary
1.10 Answers to In-Text Questions

1.11 Glossary
1.12 Self-Assessment Questions
1.13 References
1.14 Suggested Readings

1.1 LEARNING OBJECTIVES

• To understand the role of research in psychology
• To identify different types of research
• To examine the different scales of measurement
• To understand how data can be represented

1.2 INTRODUCTION

1.2.1 Psychological Research

Psychological research is research conducted by psychologists to systematically
identify the ways in which various factors influence the behaviour of individuals or
groups. The conclusions drawn from such research help us gain a much deeper
understanding of the world around us, from managing our health, making decisions,
and managing people in organisations to improving classroom learning, and much more.
Psychological research provides valuable insights, but this does not mean that
its results are right all the time. Special attention must therefore be paid to the way
research is carried out, and proper methods and methodology should be considered
before undertaking any research. Although these two terms are often used
synonymously, they are closely linked yet different: methodology is the much broader
term and covers the entire research process, from the assumptions, sociocultural
context, and ethical principles to the political impact of the knowledge generated at
the end, whereas methods are the specific techniques used to collect and analyse data
and to report the results. In addition, researchers need certain qualities for the
research process to work well, such as:
• The ability to persist in the face of constant setbacks and challenges related to
limited resources, both in terms of time and money.
• The ability to show tolerance in the face of ambiguity.
• The ability and desire to carry out research in an ethical manner.
• The ability to be logical and think rationally.
• The ability to change one’s mind in the face of counter-evidence.
• The ability to plan and organise research.
• The ability to communicate the results of the study in a way that is easy for
students as well as professionals to comprehend.

1.2.2 Why do Psychologists Carry Out Research?

There are five main reasons why a researcher carries out research:
• Exploration: It is carried out when we know very little or nothing about a
phenomenon or observation. It tries to address the “what” question in research:
what is that phenomenon? This is the first stage in any research. It usually
does not lead to specific answers, but it helps in developing a much more
extensive study later on, once sufficient inquiry and evidence gathering has
been done on the research question.
• Description: It describes the details of the phenomenon in its current as well as
socio-cultural and historical context, and its relationships to the various variables
present in the environment. It tries to address the “who” and “how” questions:
who is involved, and how does this event happen? The research question is
very clear at this stage and the outcome is a detailed description.
• Explanation: It refers to finding the reasons why a phenomenon occurs. This
type of research follows exploratory and descriptive research and
focuses on the “why” question. Past research can also be consulted to find
the causes and reasons.
• Prediction: It describes when a phenomenon or behaviour is likely to occur
again. This is possible once you have a sufficient understanding of the underlying
reasons and the cause and effect relationship is clearly established. This type of
research focuses on the “when” question.
• Control: It refers to changing or influencing behaviour to improve the quality
of life of an individual or group by making constructive changes and modifying
the unhealthy thought patterns or lifestyle choices that people engage in.

In-Text Questions
1. _______ tries to address the “what” question in research.
2. _________ describes when a phenomenon is likely to occur again.
3. ______ provides valuable insights, but this does not mean that its results are
correct all the time.
4. Basic research is also known as ________.
5. A manager using tests of intelligence and personality to select an appropriate
candidate for the organization is an example of ________ research.

1.2.3 What are the Different Types of Research?

Research can be classified along four dimensions:
a) Basic vs. Applied research
b) Laboratory vs. Field research
c) Quantitative vs. Qualitative research
d) Cross-sectional vs. Longitudinal research
• Basic vs. Applied research
Basic research is also known as pure research. This type of research is carried out to
advance our understanding of a psychological phenomenon by either discovering
something new or refuting an existing theoretical or philosophical framework.
Basic research has a long-term impact on the field of study, where it can act as a
benchmark for future studies. For example, Francis Galton’s research on intelligence.
Applied research, on the other hand, is carried out to find a solution to a practical
problem. It is short-term in its approach and rarely leads to building new theories.
Instead, this type of research utilizes already well-established theories, for example, a
manager using tests of intelligence and personality to select the right candidate for their
organisation.
• Laboratory vs. Field research
Laboratory research is carried out in a controlled environment, where the researcher
has control over all the variables involved in the research process. The researcher
carefully selects the participants and follows all the steps systematically. For example,
Albert Bandura’s Bobo doll experiment. Field research, on the other hand, is carried
out in natural settings where the researcher does not have much control over the
environment. Such research therefore relies on observations and surveys
rather than experiments. For example, a researcher may study pro-social/helping
behaviour in the field. Since these studies are done in natural settings, their
results are more generalizable than those of
laboratory research.
• Quantitative vs. Qualitative research
Quantitative research involves the use of numbers to gather, analyse, and present data.
Here the data is collected primarily through questionnaires and is reported
in the form of numbers such as means, percentiles, standard deviations, correlation
coefficients, etc. The use of numbers makes it easier to summarize data collected from
large samples. Since the data is gathered from a large sample, the result is more
generalizable. Qualitative research, on the other hand, involves the use of words and
images to gather, analyse, and present data. Here the data is collected using case
study, interview, or observation techniques and analysed using content analysis, thematic
analysis, etc. The data is gathered from a small sample, as it is difficult to carry out
detailed interviews and observations because of time and cost constraints, and thus
the result is less generalizable.
• Cross-sectional vs. Longitudinal research
Cross-sectional research is carried out at one point in time, to capture the level of
a variable at that time. For example, the vocabulary level of 5th grade students:
the data is collected only once, so the study is economical in terms of the
time and money required. Longitudinal research, on the other hand, is carried out
over an extended period of time, to capture the process of change. For example,
the vocabulary development of 5th grade students over a period of 6 months: the
data is collected multiple times to track the level and pace of development of
vocabulary. Therefore, this type of research is much more costly and time
consuming.

In-Text Questions
6. _________ research is carried out in a controlled environment.
7. ______ research applies data analysis techniques like content analysis,
thematic analysis, etc.
8. _______ research reports results using the mean, percentiles, standard deviation, etc.

1.3 RELEVANCE OF STATISTICS IN PSYCHOLOGICAL RESEARCH

Statistics is a branch of mathematics that involves the collection, presentation, and
analysis of data. It helps in finding trends and patterns in data, which can be used to
estimate the probability that an event or behaviour will take place.
Proper use of statistical methods and techniques can help in diagnosing patients
and in making better decisions to improve their health and wellbeing. In the case of
organisations, it can help in correctly identifying the suitable candidate for a job and
can ultimately lead to better efficiency and productivity of the organisation.
There are primarily two types of statistical methods used in psychology: descriptive
statistics and inferential statistics.

1.4 DESCRIPTIVE AND INFERENTIAL STATISTICS

• Descriptive statistics
It describes and summarizes the data, which helps researchers clearly identify the
nature of the information available to them after data collection. For example, if a researcher
wants to understand the performance of first year undergraduate psychology students,
then he/she can calculate the mean and standard deviation and plot the students’
performance using a histogram. There are a variety of statistics that are used to describe
data, such as mean, median, mode, standard deviation, range, percentiles, percentile
ranks, correlation coefficients, etc.
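As a small illustration of the descriptive statistics listed above, the following sketch uses Python's standard `statistics` module. The marks are hypothetical values chosen only for illustration; they do not come from this study material.

```python
import statistics

# Hypothetical marks of ten first-year students (illustrative data only).
marks = [62, 70, 75, 75, 78, 80, 81, 85, 88, 96]

mean = statistics.mean(marks)
median = statistics.median(marks)
mode = statistics.mode(marks)
sd = statistics.pstdev(marks)            # standard deviation, treating the ten marks as the whole group
score_range = max(marks) - min(marks)    # highest score minus lowest score

print(f"mean={mean}, median={median}, mode={mode}")
print(f"range={score_range}, sd={sd:.2f}")
```

Each of these values summarizes the same ten marks from a different angle: the mean, median, and mode describe the centre of the scores, while the range and standard deviation describe their spread.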
 Inferential statistics
It is used to draw conclusions about a population by collecting data from a sample
statistic (such as mean). For example, we want to understand whether the performance
of first year psychology students of University of Delhi (let’s assume a population of
size 1,000) is the same as the average performance of first year psychology students
of Ambedkar University (mean = 80). In order to achieve our objective, we can
collect sample data (n= 100) from all the first-year psychology students studying in
different colleges (5 from each university) of University of Delhi and then test our
hypothesis that the average performance of students from both the universities is the
same or not. In our attempt to test our hypothesis, we will use statistical techniques
such as t-test (which will be covered later on) to find out whether there is any difference
in performance or not. There are a variety of techniques that come under inferential
statistics such as z-test, t-test, ANOVA (Analysis of Variance), chi-square test, etc.
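The comparison described above can be sketched as a one-sample t statistic, in which a sample mean is compared with a known reference mean of 80. This is only a minimal sketch: the text does not say which form of the t-test is intended, so the one-sample form is an assumption, and the ten scores and the helper name `one_sample_t` are hypothetical.

```python
import math
import statistics

def one_sample_t(sample, mu0):
    """Return (t, df) for comparing the sample mean with a reference mean mu0."""
    n = len(sample)
    mean = statistics.mean(sample)
    s = statistics.stdev(sample)           # sample SD (n - 1 in the denominator)
    standard_error = s / math.sqrt(n)
    return (mean - mu0) / standard_error, n - 1

# Hypothetical scores from a small sample (illustrative data only).
sample = [78, 82, 75, 90, 85, 79, 88, 76, 84, 83]
t, df = one_sample_t(sample, mu0=80)
print(f"t = {t:.2f}, df = {df}")
```

The obtained t value would then be compared with a critical value for the given degrees of freedom to decide whether the difference from 80 is larger than chance alone would be expected to produce.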

1.5 LEVELS OF MEASUREMENT

A researcher includes many different variables in research, such as gender, height,
weight, IQ, motivation, personality, self-esteem, job satisfaction, stress level, etc., and
for statistical analysis assigns numbers to them, for example, an IQ
score of 130, or codes for different categories such as 1 for male and 2 for female.
Psychologist S. S. Stevens (1946) identified four different ways of assigning numbers
to observations, known as measurement scales:
1. Nominal scale
2. Ordinal scale
3. Interval scale
4. Ratio scale
• Nominal scale
Nominal (means “name”) scale is used for variables that are qualitative in nature rather
than quantitative, such as gender (male, female, and others). The requirement for
using a nominal scale is that the categories must be mutually exclusive (an observation
can belong to only one category) and exhaustive (there must
be enough categories to accommodate all the observations). For example, if the results
of students fall into two categories, pass and fail, a student who passes the exam
cannot be placed in the fail category, and vice versa.
• Ordinal scale
Ordinal (means “order”) scale is also used in cases where the categories are mutually
exclusive and exhaustive. Ordinal scale is a higher level of measurement than the
nominal scale because, in addition to categorization, we also assign ranks to the categories
so that it is easy to identify which category comes first and which
comes last. For example, when countries are given ratings such as 1, 2, or 3,
it is clear that 1 is better than 2, which is better than 3.
• Interval scale
Interval scale represents the next level of complexity after the nominal and ordinal
scales. This scale has all the properties of an ordinal scale, with the addition that the
difference (distance) between points is the same across the scale. For example,
when temperature is measured on the Celsius scale, the difference between
30°C and 40°C is the same as the difference between 5°C and 15°C, that is, 10 degrees
Celsius. But on this scale zero is just an arbitrary point; in our example, 0°C does not
mean that there is no heat.
• Ratio scale
Ratio scale includes all the characteristics of the interval scale and, in addition, a true
zero point. For example, the Kelvin scale measures temperature
just like the Celsius scale, in that equal distances between points
mean the same thing, but in addition it has a true zero point, absolute zero,
which implies an absence of heat.
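The practical difference between an interval and a ratio scale can be illustrated with a short sketch. The two readings (20 and 40 degrees Celsius) and the helper name `celsius_to_kelvin` are chosen here for illustration; the conversion adds the standard offset of 273.15.

```python
def celsius_to_kelvin(celsius):
    """Convert a Celsius reading to the Kelvin (ratio) scale."""
    return celsius + 273.15

a_c, b_c = 20.0, 40.0
a_k, b_k = celsius_to_kelvin(a_c), celsius_to_kelvin(b_c)

# Differences are the same on both scales (interval property).
diff_c = b_c - a_c
diff_k = b_k - a_k

# Ratios are meaningful only on the scale with a true zero (Kelvin).
ratio_c = b_c / a_c   # arithmetically 2.0, but "twice as hot" is not a valid conclusion
ratio_k = b_k / a_k   # the meaningful ratio, only slightly above 1

print(diff_c, round(diff_k, 2), ratio_c, round(ratio_k, 3))
```

Because 0°C is an arbitrary point, the Celsius ratio changes if the zero is moved, whereas the Kelvin ratio does not. This is why statements such as "twice as much" are reserved for ratio scales.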

In-Text Questions
9. _____ scale is used for variables that are qualitative in nature rather
than quantitative.
10. In the ________ scale, in addition to categorization, we also assign ranks to the categories.
11. When temperature is measured in degrees Celsius, a(n) _______ scale is used.

1.6 GROUPED FREQUENCY DISTRIBUTION

(Excluded from current syllabus, for your personal understanding only)

1.6.1 Frequency Distribution

Collected data comes in a variety of forms, and we need to organise it to
make accurate interpretations. This can be accomplished with the help of a frequency
distribution, which shows the number of observations in each category.
For example, if we want to present the different disciplines selected by students
as their undergraduate majors, they can be presented as shown in Table 1.

Table 1: Frequency of Majors Selected by University of Delhi Students

Majors          Frequency
Psychology         800
Mathematics        500
Economics         1200
English            950
Total             3450

1.6.2 Grouped Frequency Distribution

• Grouped scores
When you have a wide range of scores, it is better to combine the scores to create
groups of scores. For example, suppose psychology students obtain the marks in a statistics
paper shown in Table 2. We can group these scores into class intervals
(ranges of values that are grouped together) such as 64-65, 66-67, etc.

Table 2: Marks Obtained by Psychology Students in Statistics Paper

65 70 73 76 75 74
66 71 73 76 77 73
66 71 75 79 78 76
70 72 75 80 74 86
70 72 76 73 73 83

Table 3 shows what the grouped frequency distribution looks like. As can
be seen, this type of distribution makes it easier to visualise and understand data.
We can see in table 3 that students scored in the range of 65 (lowest) to 86 (highest),
and that most of the students scored between 71 and 77. But as you can see, there are
different ways of creating class intervals: columns B and C have the same width
of 3 scores, but depending upon where you start your interval, the frequencies
vary. In column B the lowest interval starts at 64 and in column C it starts at 65, so the
question is, which of the two is the correct way to form intervals? The
answer lies in some clearly defined guidelines for creating intervals. Please
keep in mind that these are simply guidelines, not hard and fast rules.
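The effect of the starting point can be checked directly on the marks in table 2. The sketch below is a minimal illustration; the function name `grouped_frequencies` and its parameters are hypothetical names introduced here, not part of the study material.

```python
# Marks from table 2.
MARKS = [65, 70, 73, 76, 75, 74,
         66, 71, 73, 76, 77, 73,
         66, 71, 75, 79, 78, 76,
         70, 72, 75, 80, 74, 86,
         70, 72, 76, 73, 73, 83]

def grouped_frequencies(scores, start, width):
    """Count scores in consecutive intervals [lower, lower + width - 1]."""
    table = []
    lower = start
    while lower <= max(scores):
        upper = lower + width - 1
        count = sum(lower <= s <= upper for s in scores)
        table.append(((lower, upper), count))
        lower += width
    return table[::-1]   # interval containing the highest score first

column_b = grouped_frequencies(MARKS, start=64, width=3)
column_c = grouped_frequencies(MARKS, start=65, width=3)
for (b_int, b_f), (c_int, c_f) in zip(column_b, column_c):
    print(f"{b_int[0]}-{b_int[1]}: {b_f:2d}    {c_int[0]}-{c_int[1]}: {c_f:2d}")
```

Both calls use the same 30 scores and the same width of 3; only the starting value differs, yet the frequencies in the individual intervals change.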
The guidelines are as follows:
a) Class intervals must be mutually exclusive: The intervals should
not overlap with each other, i.e., a score should not fall in more than one
interval.
b) Intervals must be continuous: Even if some intervals do not have
scores in them, one must include those intervals nevertheless. For instance, in
column A of table 3, the frequency for the interval 68-69 is 0.
c) Interval containing the highest score should be at the top: Placing the interval
containing the highest value at the top and the interval containing the lowest value
at the bottom makes it easier to understand the frequency distribution.
d) Intervals should have the same width: All the intervals that you create should be
of equal width. For example, within each column of table 3 you will find the same
interval width.
e) Interval width should be convenient: Interval width should be equal as well
as convenient, such as 2, 3, 4, 5, 10, 20, 50, etc. This makes the data easy to
represent and understand.

Table 3: Scores from Table 2 Converted into Class Intervals


        A                     B                     C
Class interval        Class interval        Class interval
  width = 2             width = 3             width = 3
Scores  Frequency     Scores  Frequency     Scores  Frequency
86-87       1         85-87       1         86-88       1
84-85       0         82-84       1         83-85       1
82-83       1         79-81       2         80-82       1
80-81       1         76-78       6         77-79       3
78-79       2         73-75      10         74-76       9
76-77       5         70-72       7         71-73       9
74-75       5         67-69       0         68-70       3
72-73       7         64-66       3         65-67       3
70-71       5                N = 30                N = 30
68-69       0
66-67       2
64-65       1
       N = 30

f) Number of class intervals: The more class intervals there are, the more accurate
the interpretation. Fewer class intervals mean wider
intervals and thus a greater loss of accuracy. For instance, compare column A
with columns B and C in table 3.
g) Lower score as a multiple of interval width: It is advisable to make the
lower score of each interval a multiple of the interval width, as this makes
interpretation easier. For instance, in column A of table 3, 64 is the lowest
score of the class intervals as well as a multiple of 2. This is not the case with
columns B and C, whose lowest intervals begin at 64 and 65, neither of which is a
multiple of 3.

1.6.3 Steps Involved in Creating a Grouped Frequency Distribution

The following steps are taken to create a grouped frequency distribution:
a) Find the highest and the lowest scores.
b) Find the range of the scores by subtracting the lowest score from the highest
score.
c) Divide the range by 10 and by 20 to find the largest and smallest interval widths,
then select a convenient width between these values.
d) Find the score at which the lowest interval should begin; it
should be a multiple of the interval width.
e) List the class intervals with the highest value at the top, making continuous
intervals of equal width.
f) Use a tally system to count the number of scores within each interval, then convert
the tallies into frequencies.
Let us understand this with the help of the data given in table 2. The lowest
score is 65 and the highest score is 86. The range is therefore the difference
between the two, which comes out to be 21. Next, we divide the range by 10 and by
20, which gives us 2.1 and 1.05. Now we need to select a convenient interval width
between these two values. Let us select 2 as the width of the class interval (symbolized
by i), so i = 2.

Next, we need to find the starting point of the bottom class interval. The
lowest score is 65, so the starting value could be 64 or 65, which would make 64-65 or
65-66 the interval containing the value 65. Only one of these starts with a value
that is a multiple of the width (i = 2), namely 64. Hence, our bottommost interval will
be 64-65, as shown in table 3. Now we create intervals of equal width. Keep in
mind that the class intervals would have been the same had we started from the topmost
interval: of the two candidates containing the highest score, 85-86 and 86-87, only
86-87 starts with a multiple of 2.
Create a table of intervals, tally marks, and frequencies, as shown in table 4. The
calculated frequencies can then be inserted as shown in table 3, column A. The same
can be done for columns B and C.
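The steps above can be sketched in a few lines of Python applied to the marks in table 2. This is an illustrative sketch rather than a prescribed procedure; the variable names are introduced here, and the width of 2 is entered by hand because choosing a convenient width is a judgement call.

```python
from collections import Counter

# Marks from table 2.
MARKS = [65, 70, 73, 76, 75, 74,
         66, 71, 73, 76, 77, 73,
         66, 71, 75, 79, 78, 76,
         70, 72, 75, 80, 74, 86,
         70, 72, 76, 73, 73, 83]

low, high = min(MARKS), max(MARKS)                        # step a
score_range = high - low                                  # step b
widest, narrowest = score_range / 10, score_range / 20    # step c
width = 2                             # convenient value chosen between the two
start = (low // width) * width        # step d: multiple of width at or below the lowest score

# Steps e and f: assign each score to its interval, then count.
counts = Counter((s - start) // width for s in MARKS)
n_intervals = (high - start) // width + 1
for k in reversed(range(n_intervals)):                    # highest interval first
    lower = start + k * width
    print(f"{lower}-{lower + width - 1}  {'|' * counts[k]:<8} {counts[k]}")
print("N =", sum(counts.values()))
```

The printed tally uses plain bars rather than groups of five, but the frequencies are the same as those obtained by hand for column A.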

Table 4: Frequency Calculation for the Intervals

Scores   Tally       Frequency
86-87    |               1
84-85                    0
82-83    |               1
80-81    |               1
78-79    ||              2
76-77    ||||            5
74-75    ||||            5
72-73    |||| ||         7
70-71    ||||            5
68-69                    0
66-67    ||              2
64-65    |               1
                    N = 30

(In the tally column, each group |||| stands for four bars crossed by a fifth stroke, i.e., a count of five.)

1.6.4 Real Limits vs. Apparent Limits

In real-life situations it is possible that a student might score in decimals, for instance
70.5. In such cases it becomes difficult to place the score within an interval defined by
discrete values.
Real limits: They extend from half a unit below the smallest score in the interval to half
a unit above the largest. For instance, the apparent limits 70-71 can be converted into
the real limits 69.5 (the real lower limit) and 71.5 (the real upper limit). It is important
to keep in mind that scores cannot fall exactly on a real limit, because it is calculated by
taking half of the smallest unit of measurement.
Apparent limits: They extend from the smallest score in the interval to the largest,
expressed in the unit of measurement.
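
As a small illustration, the conversion from apparent to real limits can be written as a one-line rule. The function name `real_limits` and the default unit of 1 are assumptions made for this sketch.

```python
def real_limits(lower, upper, unit=1):
    """Convert apparent limits to real limits.

    `unit` is the smallest unit of measurement (assumed to be 1 here).
    """
    half = unit / 2
    return lower - half, upper + half

print(real_limits(70, 71))  # real limits of the apparent interval 70-71
print(real_limits(64, 65))  # real limits of the bottom interval
```

For the interval 70-71 this gives 69.5 and 71.5, the values quoted in the text.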

Table 5: Real Upper and Lower Limits for the Distribution of Scores

Apparent limits    Real limits    Frequency (f)
86-87              85.5-87.5      1
84-85              83.5-85.5      0
82-83              81.5-83.5      1
80-81              79.5-81.5      1
78-79              77.5-79.5      2
76-77              75.5-77.5      5
74-75              73.5-75.5      5
72-73              71.5-73.5      7
70-71              69.5-71.5      5
68-69              67.5-69.5      0
66-67              65.5-67.5      2
64-65              63.5-65.5      1
                                  N = 30

1.6.5 Relative Frequency Distribution

A relative frequency distribution shows the proportion or percentage of scores within
each interval. Relative frequency proportions are obtained by dividing each frequency by
the total number of observations; they are converted into percentages by multiplying
them by 100, as shown in table 6. Relative frequencies are useful when you have to
compare the performance of two groups of unequal size, since comparing raw
frequencies alone gives a false impression because of the difference in group size.

Table 6: Relative Frequency Proportion and Percentage Calculation Using Frequency Data

                      Group A                                Group B
Apparent   Frequency  Relative     Relative      Frequency  Relative     Relative
limits     (f)        frequency    frequency     (f)        frequency    frequency
                      proportion   percentage               proportion   percentage
86-87      1          .033         3.3           1          .0167        1.67
84-85      0          0            0             3          .05          5
82-83      1          .033         3.3           4          .067         6.7
80-81      1          .033         3.3           1          .0167        1.67
78-79      2          .067         6.7           12         .2           20
76-77      5          .167         16.7          8          .133         13.3
74-75      5          .167         16.7          8          .133         13.3
72-73      7          .233         23.3          10         .167         16.7
70-71      5          .167         16.7          3          .05          5
68-69      0          .00          0             5          .083         8.3
66-67      2          .067         6.7           0          0            0
64-65      1          .033         3.3           5          .083         8.3
Total      N = 30     1            100           N = 60     1            100
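
The proportions and percentages in table 6 can be checked with a short Python sketch. The frequencies are copied from the table (top interval first); the function name `relative_frequencies` is a hypothetical helper for illustration.

```python
# Frequencies from table 6, listed from the interval 86-87 down to 64-65.
group_a = [1, 0, 1, 1, 2, 5, 5, 7, 5, 0, 2, 1]
group_b = [1, 3, 4, 1, 12, 8, 8, 10, 3, 5, 0, 5]

def relative_frequencies(freqs):
    """Return a list of (proportion, percentage) pairs, one per interval."""
    n = sum(freqs)  # total number of observations
    return [(f / n, 100 * f / n) for f in freqs]

rel_a = relative_frequencies(group_a)
rel_b = relative_frequencies(group_b)

for (prop_a, pct_a), (prop_b, pct_b) in zip(rel_a, rel_b):
    print(f"A: {prop_a:.3f} ({pct_a:.1f}%)   B: {prop_b:.3f} ({pct_b:.1f}%)")
```

Because each group is divided by its own total, the two columns of percentages can be compared directly even though Group B is twice the size of Group A.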

1.6.6 The Cumulative Frequency

A cumulative frequency distribution helps us in answering questions such as how
many students scored lower than the upper real limit of each class interval. In our
example, if we want to find out how many students scored below or higher than 70,
then we can easily answer these questions using the cumulative frequency distribution
shown in table 7.

Table 7: Cumulative Frequency Distribution

Apparent   Real limits   Frequency   Cumulative   Cumulative     Cumulative
limits                   (f)         frequency    frequency      frequency
                                                  (proportion)   (percentage)
86-87      85.5-87.5     1           30           1              100
84-85      83.5-85.5     0           29           .967           96.7
82-83      81.5-83.5     1           29           .967           96.7
80-81      79.5-81.5     1           28           .933           93.3
78-79      77.5-79.5     2           27           .90            90
76-77      75.5-77.5     5           25           .833           83.3
74-75      73.5-75.5     5           20           .67            67
72-73      71.5-73.5     7           15           .5             50
70-71      69.5-71.5     5           8            .267           26.7
68-69      67.5-69.5     0           3            .1             10
66-67      65.5-67.5     2           3            .1             10
64-65      63.5-65.5     1           1            .033           3.3
                         N = 30

In order to calculate cumulative frequency, we start from the bottom and
write down the number of cases that fall below the upper real limit of the lowermost
interval. Next, we add the frequency of the interval above it to that number. For
example, in table 7 we started with the number of cases that fall below 65.5
(the 63.5-65.5 interval), which is 1. For the interval 65.5-67.5 we add the
frequency of this interval to 1 (1 + 2) to get the cumulative frequency for this
interval (3), and so on for all the intervals. To calculate the cumulative frequency
proportion for each interval, we simply divide the cumulative frequency by the total
number of cases. For example, the cumulative frequency proportion for the interval
85.5-87.5 is 1 (30/30). To calculate the cumulative frequency percentage, we multiply
the proportion by 100.
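
The bottom-up accumulation described above can be reproduced with a brief Python sketch. The frequencies are those of table 7, listed from the lowest interval upwards; the variable names are illustrative only.

```python
from itertools import accumulate

# Frequencies from table 7, from the interval 64-65 up to 86-87.
freqs_bottom_up = [1, 2, 0, 5, 7, 5, 5, 2, 1, 1, 0, 1]

# Running total: each entry counts the cases below that interval's
# upper real limit.
cum_freq = list(accumulate(freqs_bottom_up))
n = cum_freq[-1]                       # total number of cases

cum_prop = [c / n for c in cum_freq]   # cumulative proportion
cum_pct = [100 * p for p in cum_prop]  # cumulative percentage

for c, p, pct in zip(cum_freq, cum_prop, cum_pct):
    print(f"{c:>3}  {p:.3f}  {pct:.1f}")
```

Reading the output from the last line to the first gives the cumulative columns of table 7 from top to bottom.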

In-Text Questions
12. ______ extends from half a unit below the smallest score in an interval to half a unit
above the largest.
13. ______ shows the proportion of scores within an interval.
14. ______ helps us in answering questions such as how many students scored
lower than the upper real limit of each class interval.

1.7 GRAPHICAL REPRESENTATION OF THE DATA

So far, we have discussed how to tabulate data in terms of frequency and how to
calculate percentile points and ranks. Now we will look at how to present the
same data in graphical form. Graphs represent the same data as the tables, but
they are still very useful because they make interpretation of the data easier and
help us identify patterns. In this section we are going to discuss three types of
graphs: the histogram, the frequency polygon, and the cumulative percentage curve.
There are a few basic facts about graphs that you should be familiar with. A graph
usually has two axes: the horizontal axis (x-axis or abscissa) and the vertical axis
(y-axis or ordinate). Generally, the scores or categories are represented on the
x-axis while frequency is represented on the y-axis, each with its label. The point
of intersection of the two axes is zero. As far as scaling is concerned, it is
customary to make the y-axis about three-fourths the length of the x-axis.

1.7.1 Histogram

A histogram consists of a group of rectangles whose vertical sides are on the real
lower and upper limits of an interval, so that the width of each rectangle is the same as
the width of the interval it represents. The height of the rectangle, on the other hand, is
the frequency of the interval it represents. The frequency can be either a raw frequency
or a relative frequency.

Table 8: Raw Scores of Students in a Statistical Reasoning Test

65   70   73   85   78   74
66   71   73   86   79   83
66   71   75   80   82   82
70   72   75   80   80   86
70   72   76   73   82   83

Table 9: Frequency Distribution with Class Interval Width of 3

Apparent limits   Real limits   Frequency (f)   Cumulative frequency
84-86             83.5-86.5     3               30
81-83             80.5-83.5     5               27
78-80             77.5-80.5     5               22
75-77             74.5-77.5     3               17
72-74             71.5-74.5     6               14
69-71             68.5-71.5     5               8
66-68             65.5-68.5     2               3
63-65             62.5-65.5     1               1
                                N = 30

The following steps should be followed to create a histogram:
Step 1: From the raw data, construct a grouped frequency distribution.
Step 2: Take a sheet of graph paper and draw the x-axis and y-axis, where the number of
boxes along the y-axis should be about 0.75 times the number along the x-axis.

Figure 1: Histogram of Scores Obtained in Statistical Reasoning Test
(x-axis: scores in statistical reasoning test; y-axis: frequency)

Step 3: Next, draw adjoining rectangles whose width equals the class interval width and
whose height equals the frequency. The scores marked on the x-axis can be the
mid-point of each interval, or the real lower and upper limits at the edges of each
rectangle.
Step 4: Finally, assign labels to the x-axis and y-axis.
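
As a rough illustration of Step 1 and Step 3, the Python sketch below groups the raw scores of table 8 into intervals of width 3 and prints a simple text histogram, one row per interval. The text-based drawing and the helper name `interval_lower` are assumptions made for this sketch; on graph paper the bars would be drawn vertically.

```python
from collections import Counter

# Raw scores from table 8.
scores = [65, 70, 73, 85, 78, 74,
          66, 71, 73, 86, 79, 83,
          66, 71, 75, 80, 82, 82,
          70, 72, 75, 80, 80, 86,
          70, 72, 76, 73, 82, 83]

width = 3
# Bottom interval starts at a multiple of the width.
start = (min(scores) // width) * width

def interval_lower(score):
    """Apparent lower limit of the interval that contains `score`."""
    return start + ((score - start) // width) * width

counts = Counter(interval_lower(s) for s in scores)

# Build (apparent lower, apparent upper, frequency) rows, lowest first.
table = []
lower = start
while lower <= max(scores):
    table.append((lower, lower + width - 1, counts.get(lower, 0)))
    lower += width

# Print highest interval first, labelled by its real limits.
for lower, upper, f in reversed(table):
    print(f"{lower - 0.5:>5}-{upper + 0.5:<5} {'#' * f} ({f})")
```

The frequencies obtained this way agree with table 9.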

1.7.2 Frequency Polygon

A frequency polygon is a graph made by connecting dots placed at the mid-points of
the class intervals. The x-axis represents the scores, and the y-axis represents the
frequency of scores in each interval. In order to construct a frequency polygon, the
following steps can be taken:
Step 1: Create a frequency distribution from the raw data and identify the mid-points of
the intervals.
Step 2: Put the mid-points on the x-axis and decide on the scale of the x-axis and y-axis.

Figure 2: Frequency Polygon Depicting the Scores Obtained on the Statistical Reasoning Test
(x-axis: scores in statistical reasoning test; y-axis: frequency)

Step 3: Place the dots on the graph at the intersection of the frequency and mid-point
of the interval and join all the dots with straight lines.
Step 4: Label the axes.
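
The points to be plotted for the frequency polygon can be listed with a short Python sketch. The intervals and frequencies are taken from table 9; the variable names are illustrative.

```python
# (apparent lower limit, apparent upper limit, frequency) from table 9,
# lowest interval first.
intervals = [(63, 65, 1), (66, 68, 2), (69, 71, 5), (72, 74, 6),
             (75, 77, 3), (78, 80, 5), (81, 83, 5), (84, 86, 3)]

# Each dot sits above the mid-point of its interval, at the height of
# the interval's frequency.
points = [((lower + upper) / 2, f) for lower, upper, f in intervals]

for midpoint, f in points:
    print(f"mid-point {midpoint:g}: frequency {f}")
```

Joining these points in order with straight lines gives the polygon described in Step 3.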

1.7.3 Cumulative Percentage Curve

A cumulative percentage curve is based on the cumulative percentage distribution,
which indicates the percentage of scores that fall below the upper real limit of each
interval. Note that, unlike the frequency polygon, where we plot the mid-points of the
class intervals, here the points are plotted at the upper real limits. The height of the
curve represents the cumulative percentage for the corresponding upper real limit. With
this in mind, we can say that the slope of the curve can never be negative, because as we
move along the x-axis the percentage of cases falling below a particular upper limit can
only increase or stay the same. Another aspect of the shape of the curve is that it usually
appears as an S-shaped figure, which is also referred to as an ogive.
Figure 3: Cumulative Percentage Curve Representing the Scores Obtained on Statistical Reasoning Test
(x-axis: upper real limits of the intervals; y-axis: cumulative percentage; P50 is marked at a score of 75.5)

The advantage of having this type of curve is that it can help us determine
percentile points or percentile ranks. This can be done by drawing a horizontal line from
the desired percentage on the y-axis to the curve and then drawing a vertical line from
the point of intersection down to the x-axis. This method gives accurate results provided
the scaling is done properly. For example, P50 = 75.5 (see Section 1.8).

In-Text Questions
15. ______ consists of a group of rectangles whose vertical sides are on the real lower
and upper limits of the intervals.
16. ______ is made by connecting the mid-points of the class intervals.
17. ______ indicates the percentage of scores falling below the upper real limit.

1.8 SOLVED ILLUSTRATIONS

Calculation of P50 (using the distribution in table 9) can be done as follows:
Step 1: Find the class interval within which P50 falls.
I. P50 is the score below which 50% of the scores fall.
II. Calculate 50% of the total number of observations: 50% of 30 = 15.
III. The 15th score from the bottom falls in the interval 74.5-77.5, since 14 scores
lie below 74.5.
Step 2: Determine the number of cases needed above the lower real limit of the interval
to reach P50; in this case it is 15 - 14 = 1.
Step 3: Assume that the 3 scores in the interval are evenly spread across its width of 3,
and find how far above the lower real limit the 1st score falls: (1/3) × 3 = 1.
Step 4: Finally, add these points to the lower real limit to get the percentile point.
P50 = 74.5 + (1/3) × 3 = 75.5
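
The four steps can be collected into one small Python function. This is a sketch of the linear interpolation used in the illustration, not a general-purpose routine; the function name `percentile_point` is hypothetical and the intervals are the real limits and frequencies of table 9.

```python
def percentile_point(intervals, percent):
    """Percentile point by linear interpolation within a class interval.

    `intervals` is a list of (real lower limit, real upper limit,
    frequency) tuples ordered from the lowest interval upwards.
    """
    n = sum(f for _, _, f in intervals)
    target = percent / 100 * n  # number of cases that must fall below
    below = 0                   # cases below the current interval
    for lower, upper, f in intervals:
        if f > 0 and below + f >= target:
            # Scores are assumed to be spread evenly across the interval.
            return lower + (target - below) / f * (upper - lower)
        below += f
    raise ValueError("percent must be between 0 and 100")

table_9 = [(62.5, 65.5, 1), (65.5, 68.5, 2), (68.5, 71.5, 5),
           (71.5, 74.5, 6), (74.5, 77.5, 3), (77.5, 80.5, 5),
           (80.5, 83.5, 5), (83.5, 86.5, 3)]

p50 = percentile_point(table_9, 50)
print(p50)
```

The same function can be called with other percentages, for example 75, to obtain other percentile points from the same distribution.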

1.9 SUMMARY

• Psychological research is research conducted by psychologists to
systematically identify the ways in which various factors influence the behaviour
of individuals or groups.
• There are five main reasons why a researcher carries out research: exploration,
description, explanation, prediction, and control.
• Research can be classified along four dimensions: basic vs. applied research,
laboratory vs. field research, quantitative vs. qualitative research, and
cross-sectional vs. longitudinal research.
• There are primarily two types of statistical methods used in psychology:
descriptive statistics and inferential statistics.
• Psychologist S. S. Stevens (1946) identified four different ways of assigning
numbers to observations, known as measurement scales: nominal scale, ordinal
scale, interval scale, and ratio scale.
• A percentile point, commonly referred to as a percentile, is a point on the
measurement scale below which a specified percentage of cases fall.
• A percentile rank, on the other hand, is the percentage of cases that fall
below a given point on the measurement scale.

1.10 ANSWERS TO INTEXT QUESTIONS

1. Exploration
2. Prediction
3. Psychological research
4. Pure research
5. Applied
6. Laboratory
7. Quantitative
8. Qualitative
9. Nominal
10. Ordinal
11. Interval
12. Real limit
13. Relative frequency distribution
14. Cumulative frequency distribution
15. Histogram
16. Frequency polygon
17. Cumulative percentage curve

1.11 GLOSSARY

• Psychological research: It is research conducted by psychologists to
systematically identify the ways in which various factors influence the behaviour
of individuals or groups.
• Basic research: It is also known as pure research and is carried out to
advance our understanding of psychological phenomena.
• Applied research: It is research carried out to find a solution to a practical
problem.
• Laboratory research: It is research carried out in a controlled environment
where the researcher has control over all the variables involved in the research
process.
• Field research: It is research carried out in natural settings where the
researcher does not have much control over the environment.
• Quantitative research: It is research that involves the use of numbers to
gather, analyse and present the data.
• Qualitative research: It is research that involves the use of words and
images to gather, analyse and present the data.
• Cross-sectional research: It is research carried out at one point in time to
capture the level of a variable at that time.
• Longitudinal research: It is research conducted over an extended period.
• Descriptive statistics: It is used to organize or summarise a set of data.
• Inferential statistics: It uses data from the sample that a researcher studies
to draw conclusions about the population.
• Nominal scale: It is used for variables that are qualitative rather than
quantitative in nature.
• Ordinal scale: It is used to measure things by ranking and doesn’t necessarily
imply equal distances between the rankings.
• Ratio scale: It includes all the characteristics of the interval scale but also has a
true zero point.
• Frequency distribution: It is a representation in graphical or tabular format
that displays the number of observations within a given interval.
• Real limits: A real limit separates two adjacent scores and is located exactly
halfway between them.
• Apparent limits: They extend from the smallest score in the interval to the
largest, expressed in the unit of measurement.
• Relative frequency distribution: It shows the proportion or percentage of
scores within an interval.
• Cumulative frequency distribution: It is the number of times an outcome
occurs that is above or below a certain value.
• Histogram: It is a diagram consisting of rectangles whose area is proportional
to the frequency of a variable and whose width is equal to the class interval.
• Frequency polygon: It is a graph made by connecting dots plotted at the
mid-points of the class intervals.

1.12 SELF-ASSESSMENT QUESTIONS

Q 1. Differentiate between inferential and descriptive statistics.
Q 2. Describe the qualities of a good researcher.
Q 3. Highlight the key differences between the various types of research.
Q 4. Sixty students from the commerce department of ABC University have obtained
the following scores in a test.

46 55 65 60 77 87
55 59 81 60 76 64
65 61 84 63 72 63
43 68 56 64 78 65
76 89 96 60 88 80
78 75 77 70 81 78
55 77 78 71 84 74
63 73 89 54 69 77
80 71 90 53 70 58

Based on the raw scores, do the following:
a) Create a frequency and cumulative frequency distribution
b) Draw a frequency polygon
c) Calculate P75

1.13 REFERENCES

Garrett, H.E. (2005). Statistics in Psychology and Education. Delhi: Cosmo Publications.
King, B.M. & Minium, E.W. (2007). Statistical Reasoning in the Behavioral Sciences
(5th Ed.). Noida: Wiley.
Mangal, S.K. (2012). Statistics in Psychology and Education (2nd Ed.). Delhi:
Prentice Hall of India.
Chadha, N.K. (2009). Applied Psychometry. New Delhi: Sage Pub.
Chadha, N.K. (1991). Statistics for Behavioral and Social Sciences. New Delhi:
Reliance Pub. House.
Chadha, N.K. & Sehgal, R.L. (1984). Statistical Methods in Psychology. New Delhi:
ESS Publications.

1.14 SUGGESTED READINGS

Aron, A., Aron, E.N., & Coups, E.J. (2007). Statistics for Psychology (4th Ed.).
Delhi: Prentice Hall of India.
Howitt, D. & Cramer, D. (2011). Introduction to Statistics in Psychology. London,
UK: Pearson Education Ltd.
Stevens, S. S. (1946). On the theory of scales of measurement. Science, 103, 677–
680. https://doi.org/10.1126/science.103.2684.677

LESSON 2

CENTRAL TENDENCY
Dr. Poonam Phogat
Associate Professor, Gargi College
University of Delhi
Email-Id: [email protected]
Dr. Shweta Chaudhary
Assistant Professor, Gargi College
University of Delhi
Email-Id: [email protected]

Structure
2.1 Learning Objectives
2.2 Introduction
2.3 Measures of Central Tendency: Definition, Properties and Comparison
2.3.1 Mean, Median, and Mode
2.3.2 Comparison of Mean, Median and Mode
2.4 Calculation of Mode, Median and Mean from Raw Scores
2.5 Effects of Linear Score Transformations on Measures of Central Tendency
2.6 Measures of Variability: Range; Semi-Interquartile Range; Variance;
Standard Deviation (Properties and Comparison)
2.6.1 Range and Semi-Interquartile Range
2.6.2 Variance
2.6.3 Standard Deviation
2.6.4 Quartile Deviation
2.7 Calculation of Variance and Standard Deviation
2.8 Effects of Linear Score Transformations on Measures of Variability
2.9 Summary
2.10 Answers to In-Text Questions
2.11 Glossary
2.12 Self-Assessment Questions
2.13 References
2.14 Suggested Readings

2.1 LEARNING OBJECTIVES

• To familiarize students with the use of statistical methods in psychological
research.
• To foster an understanding of the techniques of descriptive statistics for
quantitative research.
• To teach the application of the same in the field of psychology.
• To have the knowledge of properties and computation of the various measures
of central tendency and variability.

2.2 INTRODUCTION

“Statistics”, as defined by the American Statistical Association (ASA), is “the science
of learning from data, and of measuring, controlling and communicating uncertainty”
(2023).
Statistics plays a crucial role in psychology as it can provide important insights
into data. It involves analysing, interpreting, and drawing conclusions from the data
collected for psychological research. Statistical methods are used to test research
hypotheses, to establish and interpret relationships between variables, and to draw
valid conclusions and predictions from the research.
• Descriptive Statistics
Descriptive statistics summarize and describe the features of a dataset. They include
the measures of central tendency (mean, median and mode) and the measures of
variability (range, semi-interquartile range, variance and standard deviation).
• Inferential Statistics
Inferential statistics use sample data to make inferences about a population. They are
also used to test hypotheses and make predictions about the population. Research in
psychology involves testing hypotheses, which is done by analysing the collected data
and drawing accurate inferences. For example, you conduct a survey outside a Fab India
store in a mall, asking 100 people whether they like shopping at the store, and use their
answers to draw a conclusion about shoppers in general.
Psychologists and researchers have many statistical techniques available for
analysing data; however, they should choose the technique carefully. Each technique has
its own assumptions and limitations, and it should be critically evaluated before it is
selected. The appropriate use of statistics requires psychologists and researchers to
study statistical theory in depth and to analyse critically the research question being
addressed. The aim of the research question, the hypothesis and the available resources
will help them choose the best statistical technique. We discuss only some of the
statistical techniques here, to keep the lesson specific to the topic of discussion.

2.3 MEASURES OF CENTRAL TENDENCY:
DEFINITION, PROPERTIES AND COMPARISON

Measures of central tendency, also known as measures of central location, are statistical
values that indicate the centre of a set of data in a distribution. It is important to
know that different datasets may call for different measures of central tendency, and the
choice of measure should be based on the purpose or aim of the research.

2.3.1 Mean, Median, and Mode

The three common measures of central tendency are:

• Mean
The mean, also known as the average, is the sum of all the values in a dataset divided by
the number of values (Vetter T. R., 2017). It is the most common measure of central
tendency and is represented by the symbol “μ” (mu). The mean can be calculated for
both continuous and discrete variables, though it is used more often for continuous data.
Properties of Mean
1. It is sensitive to outliers, meaning that a single very large or very small value can
significantly affect the mean. The mean can be pulled higher or lower by outliers
(Hurley & Tenny, 2022), which reduces its ability to indicate the central
tendency accurately.
2. It is a good measure of central tendency for a dataset with a normal distribution.
3. It is a useful measure when the data have a roughly equal balance of large and small
values, as it takes all the values in the set into account.
4. It is rigidly defined and easy to calculate.
• Median
It is the middle value in the dataset after the values are arranged in ascending order
(smallest to largest). If the number of values is odd, then the median is the middle
value. If the number of values is even, then the median is the mean of the two middle
values (Manikandan S., 2011). The median is also known as the ‘positional average’
(Sundaram et al., 2010 as cited in Manikandan S., 2011).
Properties of median
1. Unlike the mean, the median is not sensitive to outliers. Simply said, a single
very large or very small value in the data will not skew the result.
2. It gives a clear indication of the central tendency when the dataset has a
large number of values clustered around the middle.
3. It is easy to calculate and comprehend.
• Mode
The mode is the most frequently occurring value in a dataset. A set can have one mode,
multiple modes, or no mode at all. A dataset with multiple modes is said to be
multimodal, and a dataset with two modes is said to be bimodal (Central Tendency &
Variability, 2023). In a dataset where each value is unique, there will be no mode, as no
value occurs more than once.
Properties of mode
1. Just like the median, the mode is not affected by outliers. A very large or very
small value will not change the mode.
2. It is useful for a dataset that has a large number of repetitions, as it gives a
clear indication of the most common value.
3. The mode may not represent the central tendency in some distributions.

2.3.2 Comparison of Mean, Median and Mode

1. Mean is a better measure of central tendency for a normally distributed dataset, as
it considers all the values in the set. The main threat to the usefulness of the mean is
the presence of outliers in a dataset.
2. Median is a better measure of central tendency for a dataset with a skewed
distribution, because outliers have only a minimal effect on its value.
3. Mode is a better measure of central tendency for a dataset with a large number of
repetitions. Mode can be found for both numerical and categorical (non-numerical)
data.

Fig. 1: (Kare, 2023)

In conclusion, the choice of measure of central tendency depends on the
characteristics of the dataset: the kind of data it contains and the information to be
extracted from it. Each measure offers a different perspective and should be chosen
by the researcher in line with the purpose of the research.

2.4 CALCULATION OF MODE, MEDIAN AND MEAN
FROM RAW SCORES

In this section of the chapter, we will look at the formulas for calculating the mean,
median and mode from raw scores.
• Calculation of Mean
The mean, or average, is the sum of all the values in the dataset divided by the total
number of values. The formula for the calculation of the mean is:
μ = (Σx) / N
where Σx is the sum of all values in the set, and N is the number of values in the set.
Here is an example for a small dataset:
Raw scores: 8, 3, 6, 5, 11, 5, 2
Mean of the dataset: (8 + 3 + 6 + 5 + 11 + 5 + 2)/7 = 40/7 = 5.71
• Calculation of Median
To calculate the median, first arrange the set in ascending order (smallest to largest).
If there is an odd number of values, the median is the middle number. If there is an even
number of values, it is the average of the two middle values.
Raw data scores: 8, 3, 6, 5, 11, 5, 2
Ordered dataset: 2, 3, 5, 5, 6, 8, 11
The median of this dataset is 5, as it is the middle value: there are 3 values before it
and 3 values after it.
Let’s look at the dataset again with one value fewer:
Raw data scores: 8, 3, 6, 11, 5, 2
Ordered dataset: 2, 3, 5, 6, 8, 11
As there is an even number of values, the median is the average of 5 and 6 (the
two middle numbers), which is (5 + 6)/2 = 5.5

• Calculation of Mode
The mode is the most frequently occurring value in a dataset. Let us look at the dataset
again:
Raw scores: 8, 3, 6, 5, 11, 5, 2
Ordered dataset: 2, 3, 5, 5, 6, 8, 11
The mode for this set is 5, as it occurs twice.
Raw scores: 8, 3, 6, 5, 11, 2
Ordered dataset: 2, 3, 5, 6, 8, 11
There is no mode for this set, as no value is repeated.
Here’s another example of calculating the mean, median, and mode:
Dataset: 4, 5, 6, 2, 1, 8, 10, 12, 9, 11, 7, 3, 15, 13, 14
Ordered set: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15
Mean = (4 + 5 + 6 + 2 + 1 + 8 + 10 + 12 + 9 + 11 + 7 + 3 + 15 + 13 + 14)
/ 15 = 120 / 15 = 8
Median = 8
Mode = None (no value appears more than once).
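
The raw-score calculations above can be checked with Python's standard `statistics` module. The helper `modes` is a hypothetical function written for this sketch; it follows the convention used in this lesson that a dataset in which no value repeats has no mode.

```python
from collections import Counter
from statistics import mean, median

def modes(values):
    """Return the most frequent value(s); an empty list if nothing repeats."""
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        return []  # every value is unique, so there is no mode
    return sorted(v for v, c in counts.items() if c == highest)

scores = [8, 3, 6, 5, 11, 5, 2]
print(round(mean(scores), 2), median(scores), modes(scores))

fewer = [8, 3, 6, 11, 5, 2]      # the same data with one value fewer
print(median(fewer), modes(fewer))

one_to_fifteen = [4, 5, 6, 2, 1, 8, 10, 12, 9, 11, 7, 3, 15, 13, 14]
print(mean(one_to_fifteen), median(one_to_fifteen), modes(one_to_fifteen))
```

Returning a list from `modes` also covers bimodal and multimodal datasets, where more than one value shares the highest frequency.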
Let us now calculate the mean, median and mode when the data are given as a frequency
table.
• Mean: To calculate the mean, multiply each value by its frequency, sum these
products, and divide the total by the sum of the frequencies.
Mean = Σxf / N
(where x is a value, f is the frequency of that value and N is the total of the
frequencies)
• Median: To calculate the median, arrange the values in ascending order, find the
total of the frequencies and divide it by 2 to get the position of the median. The
median is the value whose cumulative frequency first reaches or exceeds this position.
Median position = Σf / 2
(where Σf is the total of the frequencies)
• Mode: The mode is the value that has the highest frequency.

Example 1:
Let us try to calculate the mean, median and mode for a frequency table:

Age in years    Number of boys
13              5
11              3
9               7
14              6
7               4

To calculate the mean:

Age in years (x)    Number of boys (f)    xf
13                  5                     65
11                  3                     33
9                   7                     63
14                  6                     84
7                   4                     28
Total               N = 25                Σxf = 273

Mean = Σxf / N
= 273/25
= 10.92
To calculate the median, first arrange the ages in ascending order and then accumulate
the frequencies:

Age in years (x)    Number of boys (f)    Cumulative frequency
7                   4                     4
9                   7                     4 + 7 = 11
11                  3                     11 + 3 = 14
13                  5                     14 + 5 = 19
14                  6                     19 + 6 = 25
Total               N = 25

Cumulative frequency is the number of observations that lie at or below a particular
value in a dataset (it can also be accumulated from the top, to count the observations at
or above a value). Hence in the above table each subsequent frequency is added to the
running total to obtain the cumulative frequency of the next value.
Here, total frequency N = Σf = 25
Median position = N/2 = 25/2 = 12.5
When this position does not appear exactly in the cumulative frequency column, the next
higher cumulative frequency is taken. In this case the cumulative frequency just greater
than 12.5 is 14, which belongs to age 11. Therefore the median is 11. The values must
be arranged in ascending order before the frequencies are accumulated, because the
median is a positional measure.
The mode for grouped data shown in a frequency distribution table is the mid-point of
the class interval with the highest frequency; for ungrouped values it is simply the value
with the highest frequency. The mode is 9 for this dataset, as it occurs most frequently
(7 times).
Example 2:
The table below shows the ratings, on a scale of 0-3, given to a TV show by 50 viewers
of a popular OTT platform in response to a survey.

Rating    No. of viewers' responses
0         8
1         20
2         18
3         4

Rating (x)    No. of viewers' responses (f)    xf
0             8                                0
1             20                               20
2             18                               36
3             4                                12
              N = 50                           Σxf = 68

Mean = Σxf / N
= 68 / 50
= 1.36
This means that the average rating for the show was 1.36.
Median:

Rating (x)    No. of viewers' responses (f)    Cumulative frequency
0             8                                8
1             20                               8 + 20 = 28
2             18                               28 + 18 = 46
3             4                                46 + 4 = 50
              N = 50

Median position = N/2
= 50/2
= 25
The cumulative frequency greater than and closest to 25 is 28, which belongs to the
rating 1. So the median is the 25th value, which is 1.
The mode for this table is also 1, as it is the most frequent value in the
dataset.
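
The frequency-table calculations of Example 2 can be sketched in Python as follows. The function names are hypothetical, and `freq_median` implements the positional rule used in this lesson (values sorted in ascending order, first cumulative frequency that reaches N/2), which is a simplification rather than a general definition of the median.

```python
# Rating -> number of viewers, from Example 2.
ratings = {0: 8, 1: 20, 2: 18, 3: 4}

def freq_mean(table):
    """Mean = sum of (value x frequency) divided by total frequency."""
    n = sum(table.values())
    return sum(x * f for x, f in table.items()) / n

def freq_median(table):
    """Value whose cumulative frequency first reaches N/2."""
    position = sum(table.values()) / 2
    cumulative = 0
    for x in sorted(table):      # values must be in ascending order
        cumulative += table[x]
        if cumulative >= position:
            return x

def freq_mode(table):
    """Value with the highest frequency."""
    return max(table, key=table.get)

print(freq_mean(ratings), freq_median(ratings), freq_mode(ratings))
```

Sorting inside `freq_median` means the table can be supplied in any order, which matters when the values are not already listed from smallest to largest.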

In-Text Questions
1. Statistics play a key role in psychology:
a) as it helps interpret data b) as it helps analyze data
c) as it helps us summarize data d) All of the above
2. A dataset of 4 test scores of a student is collected. The average score for the
student was calculated. This is an example of:
a) Inferential statistics b) Correlation statistics
c) Descriptive statistics d) None of the Above
3. Measures of central tendency and measure of variability are used in:
a) Inferential statistics
b) Descriptive statistics
c) Both inferential and descriptive statistics
d) None of the Above
4. All datasets will always have a mode.
a) True
b) False
5. A dataset can be bimodal, trimodal or multimodal:
a) True
b) False
6. Median is the preferred measure of central tendency:
a) When there is a large set of data values
b) When data values are normally distributed
c) When there is skewed distribution of data values
d) All of the above
7. Outliers have an impact on which measure of central tendency?
a) Median b) Mode
c) Mean d) None of the above
8. Mean is the preferred measure of central tendency:
a) When there is a large set of data values
b) When there is normal distribution of data values
c) When there is skewed distribution of data values
d) All of the above
9. Calculate the mean, median and mode for the following dataset:
8, 2, 9, 10, 12, 4, 6, 7,8, 10 8, 9, 10

2.5 EFFECTS OF LINEAR SCORE TRANSFORMATIONS ON MEASURES OF CENTRAL TENDENCY

A linear score transformation is a mathematical operation performed on every score in a
dataset: each score is multiplied by a constant and/or has a constant added to it. This
rescales and shifts the distribution and, with it, the measures of central tendency. The
transformation is carried out using the formula:
Y = aX + b
Where,
X is the original score,
Y is the transformed score,
a is the scaling factor, and
b is the shift factor
The effects of linear score transformations on various measures of central tendency
are as follows:
Mean: The mean of the dataset changes under a linear transformation. The mean of the
transformed scores is equal to the original mean multiplied by the scaling factor a plus
the shift factor b.
Median: The median is affected in the same way: the median of the transformed scores is
the original median multiplied by a, plus b.
Mode: The mode is also transformed in the same way, because the most frequent raw score X
becomes the most frequent transformed score aX + b. Its numerical value therefore changes
unless a × mode + b happens to equal the original mode.
X    Y = X + 4
1    5
3    7
4    8

Here the linear transformation adds 4 to every score (a = 1, b = 4).
Mean before = 2.667; mean after transformation = 6.667 (change of +4)
Median before = 3; median after transformation = 7 (change of +4)
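
The rule can be illustrated with a minimal Python sketch. The helper name linear_transform
and the second dataset (x2, with a = 2 and b = 3) are assumptions made here for illustration;
only the first dataset comes from the table above.

```python
from statistics import mean, median, multimode

def linear_transform(scores, a, b):
    """Apply Y = aX + b to every score."""
    return [a * x + b for x in scores]

x = [1, 3, 4]                        # scores from the table above
y = linear_transform(x, a=1, b=4)    # add 4 to every score

print(y)                             # [5, 7, 8]
print(mean(x), mean(y))              # the two means differ by exactly 4
print(median(x), median(y))          # 3 7

# A second, hypothetical dataset with a clear mode, using a = 2 and b = 3.
x2 = [2, 5, 5, 9]
y2 = linear_transform(x2, a=2, b=3)
print(multimode(x2), multimode(y2))  # [5] [13]
```

In both cases each measure of central tendency of Y equals a times the corresponding
measure of X, plus b.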

2.6 MEASURES OF VARIABILITY: RANGE; SEMI-INTERQUARTILE RANGE; VARIANCE; STANDARD
DEVIATION (PROPERTIES AND COMPARISON)

Variability refers to “the spread, or dispersion, of a group of scores. Measures of
variability (sometimes called measures of dispersion) provide descriptive information
about the dispersion of scores within data” (Cherney, 2017). Range, semi-interquartile
range (IQR), variance and standard deviation are all measures of variability in statistics.

2.6.1 Range and Semi-Interquartile Range

Range is the difference between the largest and smallest values in the set. It gives a
rough idea of how spread out the values are in a dataset.
Range = Highest Score (HS) - Lowest-score (LS)
For example, for a dataset of [1, 5, 6, 10] the range is 10 - 1 which is 9.
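Semi-interquartile range (abbreviated IQR in this lesson, and also called the quartile
deviation) is defined as half of the difference between the 75th percentile (upper
quartile, Q3) and the 25th percentile (lower quartile, Q1) of the data. Because it ignores
the lowest and highest quarters of the scores, it is a robust measure of variability that
is less sensitive to outliers than the range.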

2.6.2 Variance

Variance is a measure of the spread of the data around the mean. It is calculated as the
average of the squared deviations of each data point from the mean; for this reason it is
also called the mean square (MS), and it is denoted by s². Variance is expressed in squared
units and can be difficult to interpret, but it is a key component in the calculation of
standard deviation. The term variance was introduced by R.A. Fisher in 1918 to describe
the square of the standard deviation. Variance is expressed as V = SD².
The merits of variance are:
1. It is calculated on all observations and hence uses all the information in the data.
2. Further algebraic calculations can be done on variance.
3. It is less affected by sampling fluctuations than other measures of dispersion.
4. It does not fluctuate easily from sample to sample.
The demerits of variance are:
1. Since it uses the raw scores directly it may be lengthy and tedious to calculate.
2. It gives greater weight to extreme values.

2.6.3 Standard Deviation

The standard deviation is the square root of the variance. It is a measure of the
dispersion or spread of the data around the mean. The term standard deviation was first
used in writing by Karl Pearson in 1894. Standard deviation is denoted by the symbol ‘σ’
(Greek letter sigma). It is the most widely used measure of variability. The standard
deviation indicates the typical distance of the scores from the mean. A mean with a
smaller standard deviation is more reliable, that is, more representative of the scores,
than a mean with a large standard deviation. A smaller SD shows greater homogeneity of the
data.
The merits of SD are:
1. Since it is the best measure of variation, it is widely used.
2. It is calculated using all the observations of the data.
3. It gives an accurate estimate of population parameter when compared with
other measures of variation.
4. SD is stable and is little affected by sampling fluctuations.
5. It is also possible to calculate combined SD that is not possible with other
measures.
6. Further statistics can be applied on the basis of SD like, correlation, regression,
tests of significance, etc.
7. Coefficient of variation is based on mean and SD. It is the most appropriate
method to compare variability of two or more distributions.
The limitations of SD are:
1. SD gives more weight to outliers and extreme score.
2. It is difficult to compute as compared to other measures of dispersion.
Uses of Standard Deviation:
1. When a more reliable and accurate measure of variability is needed, SD is used.
2. It is used when further statistics like, correlation, regression, tests of significance,
etc. have to be computed.


2.6.4 Quartile Deviation

Quartile deviation is another name for the semi-interquartile range. When the measures of
variability are compared, the range provides only a rough idea of the spread of the data,
because it depends on the two extreme scores. The IQR is less sensitive to outliers, while
the standard deviation is the most precise measure because it uses every score. The
variance is an intermediate step in the calculation of the standard deviation and is used
in many statistical tests and procedures.

Merits of Quartile Deviation

1. Quartile deviation is a better measure of dispersion than range because it takes
into account 50 per cent of the data.
2. It is not affected by extreme scores since it does not consider the upper and
lower 25 per cent of the data.
3. It is the only measure of dispersion which can be computed from the frequency
distribution with open-end class.
Limitations:
1. Since its calculation is based only on the middle 50 per cent of values, it is not
regarded as a stable measure of variability.
2. Sampling fluctuations have an effect on quartile deviation.
3. The value of quartile deviation is not affected by the distribution of the individual
values within the intervals of middle 50 percent observed values.

Uses of Quartile Deviation

1. It is appropriate when the distribution contains few and very extreme scores.
2. It is used when the median is the measure of central tendency.
3. Also, it is used when our primary interest is to determine the concentration
around the median.
Here’s an example of how to calculate the range and semi-interquartile range for a
data set:
Suppose the data set is: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]. Note
that the set is already ordered in ascending order.
• Range
The range is the highest score minus the lowest score: 15 - 1 = 14.
• Semi-interquartile range


To calculate the semi-interquartile range, we first need to find the quartiles of the data
set. The quartiles divide the data set into four equal parts. If the data set is arranged
from the lowest score to the highest score and cut into four equal parts, each part
contains 25% of the scores. The first quartile (Q1) is the point below which the first 25%
of the scores fall, the second quartile (Q2) is the point below which 50% of the scores
fall (this is also the median of the data), and the third quartile (Q3) is the point below
which 75% of the scores fall.
The second quartile (Q2) is the median of the entire data set. The median of the 15 values
is the 8th value, which is 8.
The first quartile (Q1) is the median of the lower half of the data set, that is, of the
values below the median. In the present example, the data values 1, 2, 3, 4, 5, 6, 7 are in
the lower half, so Q1 = 4.
The third quartile (Q3) is the median of the upper half of the data set. The data values
9, 10, 11, 12, 13, 14, 15 are in the upper half, so Q3 = 12.
The formula for the semi-interquartile range is:
IQR = (Q3 - Q1)/2
= (12 - 4)/2
= 4
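
The median-of-halves procedure described above can be sketched in Python as follows. The
function name quartiles is an assumption made for this illustration; other textbooks and
software packages use slightly different quartile conventions and may give different values
for the same data.

```python
from statistics import median

def quartiles(scores):
    """Return (Q1, Q2, Q3) using the median-of-halves method."""
    data = sorted(scores)
    n = len(data)
    half = n // 2
    lower = data[:half]            # values below the median position
    upper = data[half + n % 2:]    # values above it (middle value skipped when n is odd)
    return median(lower), median(data), median(upper)

data = list(range(1, 16))          # 1, 2, ..., 15
q1, q2, q3 = quartiles(data)

data_range = max(data) - min(data)
semi_iqr = (q3 - q1) / 2

print(data_range, q1, q2, q3, semi_iqr)   # 14 4 8 12 4.0
```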

2.7 CALCULATION OF VARIANCE AND STANDARD DEVIATION

Here’s an example of how to calculate the variance and standard deviation for a data
set:
Suppose the data set is: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]. Note
that the set is already ordered in ascending order.
Variance: The variance is a measure of the spread of the data set. The formula
for the variance is:
Variance = $\sum (x - \bar{x})^2 / n$
where $x$ is each value in the data set, $\bar{x}$ is the mean (i.e. the average) of the
data set, and $n$ is the number of values in the data set.
Mean of this dataset is:
(1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9 + 10 + 11 + 12 + 13 + 14 + 15)/15 = 120/15 = 8
Variance = $((1-8)^2 + (2-8)^2 + (3-8)^2 + (4-8)^2 + (5-8)^2 + \dots + (15-8)^2) / 15$
= 280 / 15
= 18.67
Standard Deviation: The standard deviation is the square root of the variance. The
formula for the standard deviation is:
Standard Deviation = $\sqrt{\text{Variance}}$
In this case, the standard deviation is $\sqrt{18.67}$ = 4.32.
Let’s look at one more example: Dataset: [1, 4, 6, 6, 8, 9, 11, 12, 14, 15, 19,
34, 35]
Note that the set is already in ascending order. If that is not the case for the data
set given, always order the data values in ascending order.
Mean = 174 / 13 = 13.38
Variance = $\sum (x - \bar{x})^2 / n$
= $((1-13.38)^2 + (4-13.38)^2 + (6-13.38)^2 + (6-13.38)^2 + \dots + (35-13.38)^2) / 13$
= 1333.08 / 13 = 102.54
Standard Deviation = $\sqrt{\text{Variance}}$
= $\sqrt{102.54}$ = 10.13
Here is another example of how to calculate the range, semi-interquartile range, variance,
and standard deviation for a data set:
Suppose the data set is [3, 4, 4, 5, 6, 7, 8, 8, 9, 10, 11, 12, 12, 13, 15].
Range = 15 - 3 = 12
Median (Q2) = 8; Q1 = 5 (median of 3, 4, 4, 5, 6, 7, 8); Q3 = 12 (median of 9, 10, 11,
12, 12, 13, 15)
Semi-interquartile range = (12 - 5)/2 = 3.5
Mean = (3 + 4 + 4 + 5 + 6 + 7 + 8 + 8 + 9 + 10 + 11 + 12 + 12 + 13
+ 15) / 15 = 127/15 = 8.47
Variance = $((3-8.47)^2 + (4-8.47)^2 + \dots + (15-8.47)^2) / 15$ = 187.73/15 =
12.52
Standard Deviation = $\sqrt{\text{Variance}}$ = $\sqrt{12.52}$ = 3.54
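
The calculations in this section can be reproduced with a small Python sketch. The function
names variance and standard_deviation are illustrative; they use the divisor n, as in the
formula above, which corresponds to pvariance and pstdev in Python's statistics module (the
module's variance and stdev functions divide by n - 1 instead).

```python
from math import sqrt

def variance(scores):
    """Mean of the squared deviations from the mean (divisor n)."""
    n = len(scores)
    mean = sum(scores) / n
    return sum((x - mean) ** 2 for x in scores) / n

def standard_deviation(scores):
    """Square root of the variance."""
    return sqrt(variance(scores))

datasets = {
    "example 1": list(range(1, 16)),
    "example 2": [1, 4, 6, 6, 8, 9, 11, 12, 14, 15, 19, 34, 35],
    "example 3": [3, 4, 4, 5, 6, 7, 8, 8, 9, 10, 11, 12, 12, 13, 15],
}

for name, scores in datasets.items():
    print(name, round(variance(scores), 2), round(standard_deviation(scores), 2))
```

Because the code works with the unrounded mean, its results can differ in the last decimal
place from hand calculations that round the mean first.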

2.8 EFFECTS OF LINEAR SCORE TRANSFORMATIONS ON MEASURES OF VARIABILITY

As discussed earlier, linear score transformations are mathematical operations that
multiply each score in a dataset by a constant factor and/or add a constant to it, in order
to change the scale or location of the distribution. The most common linear score
transformation is standardization, which involves subtracting the mean of the scores from
the raw scores and dividing the result by the standard deviation. This transformation has
the effect of converting the scores into standard units, with a mean of 0 and a standard
deviation of 1.
The effects of linear score transformations on measures of variability, such as
the variance and standard deviation, depend on the specific transformation applied.
For example, standardizing scores sets the variance of the scores to 1, as all the scores
are transformed into standard units with a standard deviation (and therefore a variance)
of 1.
However, other linear transformations, such as adding a constant value or
multiplying the scores by a constant, can have different effects on the variance and
standard deviation. For example, adding a constant value to all scores changes the
mean by that constant, but does not change the variance or standard deviation. Multiplying
all scores by a constant a multiplies the mean by a, the standard deviation by the absolute
value of a, and the variance by a²; the measures of variability stay the same only if a is
1 or -1.
Here is the summary of how the transformation Y = aX + b affects each measure. The shift
factor b has no effect on any measure of variability:
• Range: The range of the transformed scores is equal to the original range
multiplied by the absolute value of the scaling factor a.
• Semi-Interquartile Range: The semi-interquartile range of the transformed scores
is equal to the original semi-interquartile range multiplied by the absolute value of the
scaling factor a.
• Variance: The variance of the transformed scores is equal to the original variance
multiplied by the square of the scaling factor a.
• Standard Deviation: The standard deviation of the transformed scores is equal
to the original standard deviation multiplied by the absolute value of the scaling factor a.
In terms of comparison, the effect of linear score transformations on the measures of
central tendency and variability can be summarized as follows:
• The mean, median and mode are all changed by both the scaling factor a and the shift
factor b, while the range, semi-interquartile range, variance and standard deviation
are changed only by the scaling factor a.
• The mean and standard deviation are more sensitive to outliers compared to the
median and semi-interquartile range.
• The semi-interquartile range is less affected by outliers than the variance and
standard deviation, whereas the range, which depends only on the two extreme scores,
is the measure most affected by them.
• In summary, the effect of a linear score transformation on measures of variability
depends on the specific transformation applied, and it is important to understand
how the transformed scores will affect the results of any statistical analysis
performed on the data.
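
The summary above can be checked numerically with a short Python sketch. The scaling
factor a = 3 and shift factor b = 10 are hypothetical values chosen for illustration, and
the helper names are not part of the course material; pvariance and pstdev from the
standard statistics module use the divisor n.

```python
from statistics import mean, pstdev, pvariance

def linear_transform(scores, a, b):
    """Apply Y = aX + b to every score."""
    return [a * x + b for x in scores]

def score_range(scores):
    return max(scores) - min(scores)

x = [1, 5, 6, 10]           # small dataset used earlier for the range
a, b = 3, 10                # hypothetical scaling and shift factors
y = linear_transform(x, a, b)

print(score_range(x), score_range(y))   # 9 27: range is multiplied by |a|
print(pvariance(x), pvariance(y))       # variance is multiplied by a squared
print(pstdev(x), pstdev(y))             # SD is multiplied by |a|

# Standardization is the special case z = (x - mean) / SD.
m, sd = mean(x), pstdev(x)
z = [(v - m) / sd for v in x]
print(round(mean(z), 10), round(pstdev(z), 10))
```

Changing b alone leaves every printed measure of variability unchanged, which matches the
statement that the shift factor affects only the measures of central tendency.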

In-Text Questions
10. The dataset provides the weight of the bags filled by a worker in every 2
hours:
39,43,36,38,46,51,33,44,44,43.
Find the mode of this data set. Are there more than 1 mode? If so, why?
11. Calculate the range, IQR, variance and standard deviation for this dataset:
2,4,5,6,12,14,15,21,23,25
12. What is the mode of the data:
44, 42, 35, 37, 45, 50, 32, 43, 43, 40, 36, 44, 43, 44, 47
Does the data have more than one mode?
13. If Q1 = 10, Q2 = 12 and Q3 = 21, what is the IQR?
14. Range helps give a rough idea of how spread the data is:
a) True b) False
15. _____________ is the intermediate step in the calculation of standard deviation
16. Range and semi-interquartile range are measures of variability that are less
affected by outliers:
a) True b) False
17. Range, IQR, Variance and Standard Deviation are:
a) Measure of central tendency b) Measure of variability
c) Both d) None of the above

2.9 SUMMARY

• Descriptive statistics is the branch of statistics that involves summarizing and
describing the features of a dataset.
• Inferential statistics is the branch of statistics that involves the use of sample
data to make inferences about a population.
• Measures of central tendency, also known as measures of central location, are
statistical values that indicate the central location of a set of data in a distribution.
• Mean, also known as the average, is the sum of all the values in a dataset divided
by the number of values.
• Median is the middle value in the dataset after the values are arranged in ascending
order, in the case of an odd number of scores. In the case of an even number of scores,
it is the average of the middle two scores when arranged in ascending order.
• Mode is the most frequently occurring value in a dataset. A set can have one
mode, multiple modes, or no mode at all.
• A linear score transformation rescales and/or shifts the entire set of scores.
• The mean of the transformed scores is equal to the original mean multiplied by
the scaling factor a plus the shift factor b.
• The median and the mode of the transformed scores change in the same way: each is
the original value multiplied by a, plus b.

2.10 ANSWERS TO IN-TEXT QUESTIONS

1. d
2. c
3. b
4. b
5. a
6. c
7. c
8. b
9. mean= 7.86, median=6, mode=8,10
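10. 43 and 44. Yes, there is more than one mode, as 43 and 44 each occur twice and every
other value occurs once
11. Range = 23; Q1 = 5 and Q3 = 21, so IQR = (21 - 5)/2 = 8; variance = 62.81 and
SD = 7.93 with the n divisor used in Section 2.7 (69.79 and 8.35 if n - 1 is used)
12. 43, 44. Yes, there is more than one mode, as 43 and 44 each occur three times, more
often than any other value
13. Q3 - Q1 = 21 - 10 = 11; as defined in this lesson (semi-interquartile range),
IQR = 11/2 = 5.5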
14. a
15. Variance
16. b (the range depends only on the two extreme scores, so it is strongly affected by
outliers)
17. b

2.11 GLOSSARY

• Central Tendency: The location of the central point of a set of data.
• Mean: The arithmetic average of the data set.
• Median: The point that divides the ordered data into two equal halves.
• Mode: The score that occurs most often.
• Range: The difference between the highest score and the lowest score.
• Semi-Interquartile range (IQR): Half of the difference between the third quartile
and the first quartile of the data set.
• Standard Deviation: A measure of the amount of variation or dispersion of a set of
values, equal to the square root of the variance.
• Variance: The square of the standard deviation.

2.12 SELF-ASSESSMENT QUESTIONS

Q1. Find the mean, median, mode, and range for the following list of values:
13, 18, 13, 14, 13, 16, 14, 21, 13
Q2. Why do some datasets have multiple modes while others have no mode?
Q3. Which measure of central tendency is affected by outliers and which is not
affected?
Q4. Define Median. How is median calculated for even set of values?
Q5. Explain semi-interquartile range. How is IQR calculated?
Q6. What do you understand by linear transformation? Which measures of central tendency
and variability are not affected by adding a constant to every score?
Q7. What do you understand by outliers?
Q8. Calculate the range, IQR, variance and standard deviation for the following
values:
10, 15, 25, 26, 26, 29, 30, 35, 40, 45

2.13 REFERENCES

Cherney, M. (Ed.) (2017). . (Vols. 1-4). SAGE Publications, Inc.
https://dx.doi.org/10.4135/9781483381411
Correlational studies. Study Smarter US. (n.d.). Retrieved February 19, 2023, from
https://www.studysmarter.us/explanations/psychology/research-methods-in-psychology/correlational-studies/
Hurley, M., & Tenny, S. (2022, July 18). Mean - StatPearls - NCBI Bookshelf.
National Library of Medicine. Retrieved February 19, 2023, from
https://www.ncbi.nlm.nih.gov/books/NBK546702/
Kare, D. (2023, January 25). Difference between mean median and mode in tabular
form. Testbook Learn. Retrieved February 19, 2023, from
https://testbook.com/learn/maths-difference-between-mean-median-and-mode/
Manikandan S. (2011). Measures of central tendency: Median and mode. Journal of
pharmacology & pharmacotherapeutics, 2(3), 214–215.
https://doi.org/10.4103/0976-500X.83300
Sundaram, K. R., Dwivedi, S. N., & Sreenivas, V. (2010). Medical statistics:
Principles & methods. Anshan.
The University of Utah. (2023). Central Tendency & Variability. Sociology 3112,
Department of Sociology, The University of Utah. Retrieved February 19, 2023, from
https://soc.utah.edu/sociology3112/central-tendency-variability.php
Vetter T. R. (2017). Descriptive Statistics: Reporting the Answers to the 5 Basic
Questions of Who, What, Why, When, Where, and a Sixth, So What?. Anesthesia and
Analgesia, 125(5), 1797–1802. https://doi.org/10.1213/ANE.0000000000002471
What is statistics? New World: Artificial Intelligence. (2023, January 12). Retrieved
February 19, 2023, from https://www.newworldai.com/what-is-statistics/

2.14 SUGGESTED READINGS

Aron, A., Coups, E.J. & Aron, E.N. (2013). Statistics for Psychology (6th Ed.).
Pearson Education.
Asthana, H.S. & Bhushan, Braj (2007). Statistics for social sciences (with SPSS
applications). New Delhi: Prentice Hall of India.
Field, A. (2009). Discovering Statistics using SPSS (3rd Ed). New Delhi: Sage.
Garrett, H.E. (2005). Statistics in Psychology and Education. Paragon International
Publishers
Howitt, D. & Cramer, D. (2011). Introduction to Statistics in Psychology (5th Ed.).
Pearson Education.
King, B.M., Rosopa, P.J., & Minium, E.W. (2011). Statistical Reasoning in the
Behavioral Sciences (6th Ed.).
Mangal, S.K. (2010). Statistics in Psychology and Education (2nd Ed.). PHI Learning.
Mohanty, B. & Misra, S. (2015). Statistics for behavioral and social sciences. New
Delhi: Sage Publications.
N.K. Chadha (2009) Applied Psychometry. Sage Pub: New Delhi
N.K. Chadha (1991) Statistics for Behavioral and Social Sciences. Reliance Pub.
House: New Delhi
N.K. Chadha and R.L. Sehgal (1984) Statistical Methods in Psychology, ESS
Publications: New Delhi


UNIT II: STANDARD SCORES

LESSON 3

STANDARD SCORES
Dr. Poonam Phogat
Associate Professor, Gargi College
University of Delhi
Email-Id: [email protected]
Dr. Shweta Chaudhary
Assistant Professor, Gargi College
University of Delhi
Email-Id: [email protected]

Structure
3.1 Learning Objectives
3.2 Introduction to Standard (z) Scores
3.3 Properties of z-scores
3.4 Transforming Raw Scores into z-scores
3.5 Determining Raw Scores from z-scores
3.6 Some Common Standard Scores
3.6.1 T-score
3.6.2 Stanine-score
3.6.3 STEN-score
3.7 Computations of Percentiles and Percentile Ranks from Grouped Data
3.7.1 Calculating Percentiles from Grouped Data
3.7.2 Calculating Percentile Ranks from Grouped Data
3.8 Comparison of z-scores and Percentile Ranks
3.9 The Normal Probability Distribution: Nature, Properties and Applications
3.9.1 Nature of the Normal Distribution
3.9.2 Properties of the Normal Distribution
3.9.3 Applications of the Normal Distribution
3.10 Normal Curve and Standard Scores
3.10.1 Finding Areas when the Score is known and Finding Scores when the
Area is known
3.11 Summary
3.12 Answers to In-Text Questions
3.13 Glossary
3.14 Self-Assessment Questions
3.15 References
3.16 Suggested Readings

3.1 LEARNING OBJECTIVES

• To develop an understanding of the nature of standard scores.
• To understand the various types of standard scores.
• To learn about the applications of standard scores.
• To learn about the nature and applications of the normal probability
distribution.

3.2 INTRODUCTION TO STANDARD (Z) SCORES

The standard score is the deviation of a score from the mean of its distribution,
expressed in standard deviation units. It is also known as the z-score. The standard score
or z-score tells us how far a score lies from the mean of its distribution, and whether it
is greater than or less than the mean. A positive standard score means that the value of
the score is larger than the mean, whereas a negative standard score means the value of the
score is smaller than the mean of the distribution. In different types of psychological
tests, standard scores allow us to compare scores between two or more data sets. Standard
scores help us to get an accurate and consistent comparison in relation to the mean of the
scores.
For example, suppose a teacher wants to assess a student’s performance in two different
subjects, Psychology and Education, in their final exams. The teacher cannot simply compare
the marks in the two subjects, because the marks of each class may have a different mean
and spread. Let’s say the particular student got 80 in Psychology and 65 in Education. The
teacher cannot conclude from this alone that the student is performing better in Psychology
than in Education. If most of the students in the Psychology class got marks around 80,
then the student’s performance is at par with the average performance of the class; but if
most of the students in the Education class got marks lower than 65, then the student is
performing well in this subject and is one of its top scorers. It is for such scenarios
that we need to look at the standard score of the student in comparison to the entire class
to gain a realistic understanding of the student’s performance.
Since standard scores are deviations from the mean of a particular distribution measured
in standard deviation units, the mean of a set of standard scores is always zero, and
their standard deviation is always equal to one. The standard score or z-score is always
interpreted in terms of its positive or negative distance from the mean. These values give
an accurate picture of the position of a score in a particular dataset. Since we cannot
meaningfully compare two raw scores from different distributions, we need to convert
these raw scores to standard scores.

3.3 PROPERTIES OF Z-SCORES

Some of the key characteristics of z-scores are:


1. Standard score mean value is always 0.
2. Standard score standard deviation is always 1.
3. A positive standard score (z-score is above 0) means that the value of the score
is larger than the given mean.
4. A negative standard score (z-score is below 0) means the value of the score is
smaller than the given mean.
5. The shape of the graph of the z-score distribution is the same as the shape of the
graph of the original distribution.
6. The sum of the squared z-scores is always equal to the total number of
z-score values.
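
Properties 1, 2 and 6 can be verified numerically with a minimal Python sketch. The raw
scores below are an arbitrary set chosen for illustration, and the sketch assumes the
standard deviation is computed with the divisor n (pstdev in the standard statistics
module); with the divisor n - 1 the sum of squared z-scores would equal n - 1 instead.

```python
from statistics import mean, pstdev

# Arbitrary raw scores chosen for illustration.
raw = [3, 4, 4, 5, 6, 7, 8, 8, 9, 10, 11, 12, 12, 13, 15]

m = mean(raw)
sd = pstdev(raw)                  # standard deviation with divisor n
z = [(x - m) / sd for x in raw]   # convert every raw score to a z-score

print(round(mean(z), 6))                 # property 1: mean of z-scores
print(round(pstdev(z), 6))               # property 2: SD of z-scores
print(round(sum(v * v for v in z), 6))   # property 6: sum of squared z-scores
print(len(z))                            # number of z-score values
```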

Advantages of standard scores:

1. Converting raw scores into standard or z-scores in a data distribution does not
impact the characteristics of the distribution.
2. Standard scores (z-scores) help us analyse and compare scores from two
different distributions.

Disadvantages of standard scores

1. In order to depict the positive and negative positions of the standard scores
from the mean, plus and minus signs are used. This can be confusing and
misleading.
2. Decimals used in standard score or z-score can create confusion.

3.4 TRANSFORMING RAW SCORES INTO Z-SCORES

In this section we will look at how to convert raw scores into z-scores using a
step-by-step process. In order to convert a set of raw scores in a given dataset to
standard scores or z-scores, we can follow the steps mentioned below:
1. First, we need to calculate the mean and standard deviation of the
distribution.
2. Then, we need to substitute the raw score, the mean and the standard deviation
in the given formula to get the standard score or z-score.
$z = (X - M) / \sigma$
($X$ = raw score, $M$ = mean of the scores, and $\sigma$ = standard deviation of the
scores)
Let’s look at an example to calculate the z-score:
There are two sections, namely A and B, in the B.A. (H) Psychology second year
class. To test student achievement in the social psychology paper, a different exam was
conducted for each section. Rita from Section A scored 70 marks while Surbhi from
Section B scored 80 marks. We would now try to determine which student, Rita or Surbhi
(who belong to different class sections), performed better in the social psychology exam.
Just by looking at both students’ marks, we cannot say that Surbhi performed
better than Rita because she got 80 in her social psychology paper compared to Rita
who got 70. The paper for section B might have been easier than the paper for section A,
or the question papers might have been fundamentally different, with one based on
objective questions and the other on descriptive questions. For this reason, it would be
unfair to compare the raw scores obtained by Rita and Surbhi, as they are not on the
same scale.
In order to make a correct comparison, we need to convert their raw scores, which
are their exam marks, into z-scores.
The mean and standard deviation of section A and section B are given below:
Section A - Mean = 50, standard deviation = 10
Section B - Mean = 70, standard deviation = 20
By using the z-score formula,
$z = (R - M) / \sigma$
$R$ = raw score obtained by the student.
$M$ = mean of the exam scores.
$\sigma$ = standard deviation of the distribution of scores in the given test.
z-score for Rita = (70 - 50)/10
= 2.0
Similarly, z-score for Surbhi = (80 - 70)/20
= 0.5
We can now compare the z-scores and conclude that Rita, with a z-score of 2.0, performed
better in social psychology relative to her section than Surbhi, who has a z-score
of 0.5.

Self-Instructional
Material 59

© Department of Distance & Continuing Education, Campus of Open Learning,


School of Open Learning, University of Delhi
Basic Statistics in Psychology

Let’s look at another example. You score 180 on a test. The mean (µ) for the
test was 140 and the standard deviation (σ) was 20. Assuming the
scores are normally distributed, your z-score would be:
z = (x - µ) / σ
= (180 - 140)/20
= 2
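
The z-score calculations above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `z_score` and the variable names are our own and do not come from any particular statistics package.

```python
def z_score(raw, mean, sd):
    """Return how many standard deviations `raw` lies above (+) or below (-) `mean`."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (raw - mean) / sd


# Values taken from the worked examples in the text.
rita = z_score(70, mean=50, sd=10)     # Section A
surbhi = z_score(80, mean=70, sd=20)   # Section B
you = z_score(180, mean=140, sd=20)

print(rita, surbhi, you)
```

Although Surbhi has the higher raw mark, the comparison of `rita` and `surbhi` shows that Rita stands further above her own section’s mean.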

3.5 DETERMINING RAW SCORES FROM Z-SCORES

In the previous section we looked at how to calculate the z-score from the raw score.
In this section we are going to look at how to calculate the raw score if we have been
given the z-score. The formula for calculating the raw score from z-score is:
x = µ + (z x σ)
(Where µ equals the mean, z equals the z-score, and σ equals the standard
deviation)
Now let’s do it in an example:
Let’s say that in Delhi the mean annual household income is 500000 rupees,
with a standard deviation of 6000 rupees. If a household has a z-score
of 2, what is the annual income of this household?
To solve this, we will use the raw score formula:
x = µ + zσ
(Where µ equals the mean, Z equals the z-score, and σ equals the standard
deviation)
X = 500000 + (2*6000)
= 500000 + 12000
= 512000
Therefore, the annual income of this household is 512000 rupees.
Another Example:

In a psychology exam, the mean score was 75 with a standard deviation of 6. If
Student A has a z-score of -3 and Student B has a z-score of 1, what are the
actual scores of Student A and Student B?

Student A:

x= µ + zσ
= 75+ (-3)*6
= 75-18
= 57

Student B:

x= µ + zσ
= 75 + (1*6)
= 75 + 6
= 81
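
The reverse conversion can be sketched in the same way. The helper name `raw_score` is hypothetical and is used here only to mirror the formula x = µ + zσ with the numbers from the two examples above.

```python
def raw_score(z, mean, sd):
    """Recover a raw score from its z-score: x = mean + z * sd."""
    return mean + z * sd


# Household income example: mean 500000, SD 6000, z = 2.
income = raw_score(2, mean=500000, sd=6000)

# Psychology exam example: mean 75, SD 6.
student_a = raw_score(-3, mean=75, sd=6)
student_b = raw_score(1, mean=75, sd=6)

print(income, student_a, student_b)
```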

3.6 SOME COMMON STANDARD SCORES

Some of the common standard scores are the T-score, the Stanine-score and the STEN-score,
which are used to overcome some of the limitations of the z-score. We will discuss
these standard scores in detail in this section.

3.6.1 T-score

As mentioned earlier, the use of decimal points in the z-score can create confusion and
make a distribution difficult to interpret. To overcome these limitations of the
z-score, a more useful score named the T-score may be used. This
score was first used by William A. McCall. The name T was given in
honour of the renowned psychologists Terman and Thorndike.
A T-score is a type of standard score on a distribution that has a mean of 50
and a standard deviation of 10. A T-score is very similar to a z-score, but many people
prefer T-scores because they are practically never negative (a T-score falls below 0
only when z is below -5), which makes interpretation
easier. Further, T-scores are usually reported as whole numbers, without decimal points,
which makes a distribution less confusing to read.
To calculate T-score from raw scores, we use the formula given below:
T = 10z +50
(Where z is the z-score, remember to calculate the z-score we would need the
mean and standard deviation of the data set)
Therefore, the first step in the calculation of T- score is the analysis of the mean
and standard deviation of a given distribution. This would help us calculate the z -
score and use the above formula to calculate the T-score.
For example:
In the final year examinations of the psychology course, two students,
Maya and Jia, scored the marks given in the table below. Which of the
two students has the better overall score?
Table 1: Marks Scored by Maya and Jia in Social and Developmental
Psychology Final Examination with their Mean Value and Standard Deviation

Subject name               Maya’s score   Jia’s score   Mean   Standard deviation
Social psychology          70             80            60     10
Developmental psychology   80             70            50     20

At first glance, it may seem that Maya and Jia performed equally well:
both students scored 150 marks in total across the two subjects.
But we cannot conclude that both did equally well, because the marks in the two
subjects are distributed differently: the mean and standard deviation differ
between the subjects.
To determine who scored better, we need to convert the raw
marks into T-scores.
Step - 1
Conversion of the raw marks scored in the two subjects by the two students into T-scores
(a) Social Psychology
Maya’s T-score = 10z + 50
As we know, z = (X - M)/σ
= 10 (X - M)/σ + 50
= 10*(70 - 60)/10 + 50
= 10*10/10 + 50
= 10 + 50
= 60
Jia’s T-score = 10z + 50
= 10 (X - M)/σ + 50
= 10*(80 - 60)/10 + 50
= 10*20/10 + 50
= 20 + 50
= 70
(b) Developmental Psychology
Maya’s T-score = 10z + 50
= 10 (X - M)/σ + 50
= 10*(80 - 50)/20 + 50
= 10*30/20 + 50
= 15 + 50
= 65
Jia’s T-score = 10z + 50
= 10 (X - M)/σ + 50
= 10*(70 - 50)/20 + 50
= 10*20/20 + 50
= 10 + 50
= 60

Step - 2
In the second step, we add each student’s T-scores in social and
developmental psychology.
Maya’s total T-score in two subjects: 60 + 65 = 125
Jia’s total T-score in two subjects: 70 + 60 = 130
Comparing Maya’s and Jia’s combined T-scores shows that Jia
performed better in her final exams than Maya.
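
The two steps above can be reproduced with a short Python sketch. The function `t_score` and the dictionaries holding the marks are illustrative names chosen for this example; the numbers are those of Table 1.

```python
def t_score(raw, mean, sd):
    """Convert a raw score to a T-score: T = 10z + 50, where z = (raw - mean) / sd."""
    z = (raw - mean) / sd
    return 10 * z + 50


# (mean, standard deviation) for each subject, from Table 1.
subjects = {"social": (60, 10), "developmental": (50, 20)}

# Raw marks of each student in each subject, from Table 1.
marks = {
    "Maya": {"social": 70, "developmental": 80},
    "Jia": {"social": 80, "developmental": 70},
}

# Step 1 and Step 2: convert every mark to a T-score, then add them per student.
totals = {
    name: sum(t_score(scores[s], *subjects[s]) for s in subjects)
    for name, scores in marks.items()
}

print(totals)
```

Both students have the same raw total of 150, but their T-score totals differ because each subject has its own mean and standard deviation.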
Like every score, the T-score has its own advantages and disadvantages,
which are discussed below.

Advantages of T-scores:

1. In the T-score, the mean and standard deviation of the distribution are fixed at 50 and 10,
so scores from different tests can be compared directly and are usually reported without decimal points.
2. T-scores are practically never negative, and their computation usually yields
whole numbers. This leads to a better understanding of the score.

Disadvantages of T-scores:

1. In order to recover the original raw score, one needs to know
the mean and standard deviation of the original raw scores.
2. Sometimes the fixed mean and standard deviation values can create difficulty in
analysis.

3.6.2 Stanine-score

The word stanine is short for “standard nine”. The Stanine-score scale consists of 9
categories, with a mean of 5 and a standard deviation of 2. It can be
used to convert any score into a single-digit score on a nine-point scale.
Like the z-score and T-score that we have discussed previously, this scale is
used to assign a single-digit number to a test-score relative to all other test-scores in
that particular group. As Stanine-scores are always whole numbers from 1 to 9,
they cannot be negative. Likewise, a Stanine-score cannot
contain decimal points.
Stanine-scores are based on the normal distribution: we can think of the
scale as a bell curve cut into 9 divisions. These 9 divisions are numbered from 1 to 9,
starting with 1 on the left-hand side and ending with 9 on the right.
For example, suppose a psychology teacher wants to convert her students’ performance
in their final year exam into Stanine-scores. She first has to place the marks of
the students on the Stanine-score scale. For example, 10/100 is a Stanine-score of 1,
whereas 90/100 is a Stanine-score of 9. Here, students who got 90 marks or above
were in the top 4% of the class, whereas students who got only 10 marks out
of 100 were in the bottom 4% of the class in terms of performance in the psychology
subject.

How to convert a raw score into a Stanine-score?

Given below is the procedure to convert a raw test-score into a Stanine-score, using
the example above. Let’s say a psychology teacher wants to convert
100 students’ performance in the final exam into Stanine-scores.
First, the teacher needs to rank the scores from the lowest value to the highest. In
the second step, the teacher assigns a Stanine-score to every student’s exam marks
using the Stanine scale.
Table 2: Stanine-score Ranking

Score Ranking   Bottom 4%   7%   12%   17%   20%   17%   12%   7%   Top 4%
Stanine-score   1           2    3     4     5     6     7     8    9

(Source: Singh, A.K.)

These Stanine-scores then translate to the following classifications (Singh, A.K.):
• Stanine-scores of 1-3: Below Average
• Stanine-scores of 4-6: Average
• Stanine-scores of 7-9: Above Average

Table 3: Raw Exam Scores and their Stanine-score

Exam scores of students (out of 100)   Stanine-score   Percentage of scores
10                                     1               Bottom 4%
20                                     2               Bottom 7%
30                                     3               Bottom 12%
40                                     4               Bottom 17%
50                                     5               Middle 20%
60                                     6               Top 17%
70                                     7               Top 12%
80                                     8               Top 7%
90 and above                           9               Top 4%
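
The ranking procedure can also be expressed in terms of percentile ranks. Adding up the percentages in Table 2 from the bottom gives the cumulative cut-offs 4, 11, 23, 40, 60, 77, 89 and 96. The sketch below assumes that a student’s percentile rank (the percentage of students scoring below them) is already known; the function name and the treatment of a score lying exactly on a cut-off are our own assumptions.

```python
from bisect import bisect_right

# Cumulative percentages of Table 2: 4, 4+7, 4+7+12, ... (upper bounds of stanines 1 to 8).
CUTOFFS = [4, 11, 23, 40, 60, 77, 89, 96]


def stanine_from_percentile_rank(pr):
    """Map a percentile rank (0-100) to a stanine (1-9).

    Assumption: a rank lying exactly on a cut-off is placed in the higher stanine.
    """
    if not 0 <= pr <= 100:
        raise ValueError("percentile rank must lie between 0 and 100")
    return bisect_right(CUTOFFS, pr) + 1


for pr in (3, 30, 50, 80, 97):
    print(pr, stanine_from_percentile_rank(pr))
```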

Advantages of using Stanine-score:

1. A Stanine-score helps us understand where a specific raw score on a test lies
relative to all other test-scores.
2. Stanine-scores have a fixed mean of 5 and a standard deviation of 2.

Disadvantages of using Stanine-score:

One major disadvantage of using Stanine-scores is that the stanines
are not equally sized: each covers a different percentage of test-scores. Therefore, a test-score in a particular
stanine could be closer to scores in the next or previous stanine than to other scores in its own.

3.6.3 STEN-score

STEN is short for “standard ten”. If we take a scale and divide it into
10 parts or units, we can call it the STEN-score scale. However, the 10 units do
not each contain an equal 10% of the scores. Instead, the units follow the shape of the bell
curve, where the majority of the scores from the test lie in the middle
portion, which is the average range, and only about 2% of the scores lie in each of the
lowest and the highest units of the STEN-score scale.
The STEN-score scale has a mean of 5.5 and a standard deviation of
around 2.
Table 4: STEN-scores and Position in the Test Scale

STEN-score   Position in the Test scale
1 or 2       Far below average
3 or 4       Below average
5-7          Average
8            Above average
9 or 10      Far above average

How to convert a z-score to a STEN-score?

To convert a z-score to a STEN-score, multiply the z-score by the standard deviation of the
STEN scale and add the mean of the STEN scale.
STEN = z(SD) + M
(Where z is the z-score, SD is the standard deviation of 2 and M is the Mean of
5.5)
For example: For a z-score of 1, the STEN-score would be:
STEN = z(SD) + M
STEN = 1(2) + 5.5
= 7.5
For a z-score of 2, the STEN-score would be:
STEN = z(SD) + M
STEN = 2(2) + 5.5
= 4 + 5.5
= 9.5

Here, values of STEN-scores are always rounded up to the next whole number, so in the above example
the STEN-score of 7.5 would be rounded up to 8 and the STEN-score of 9.5
would be rounded up to 10.
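
A minimal Python sketch of this conversion is given below. It rests on two assumptions that are ours rather than the text’s: “rounded up” is read as rounding halves upward (7.5 becomes 8), and results are clamped to the 1-10 range of the scale. Python’s built-in `round` is avoided because it rounds halves to the nearest even number.

```python
import math


def sten_from_z(z, mean=5.5, sd=2.0):
    """Convert a z-score to a STEN-score: STEN = z * SD + M, rounded and clamped."""
    unrounded = z * sd + mean
    rounded = math.floor(unrounded + 0.5)   # round halves upward (assumption)
    return max(1, min(10, rounded))         # keep the result on the 1-10 scale (assumption)


for z in (-3, 0, 1, 2, 3):
    print(z, sten_from_z(z))
```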

Advantages of using STEN-score:

STEN-scores are easy to understand and interpret, as they range from 1 to 10.
Scores from different tests can easily be placed on this common scale and compared. As there are
no negative values, the scores are less complicated to read.

Disadvantages of using STEN-score:

The units of the STEN-score scale do not each cover an equal share of the scores. The size
of the units may vary according to the test. There are also situations where a scale of 10
points is too fine, for example psychological tests where fewer
divisions of the scale are more suitable.

3.7 COMPUTATIONS OF PERCENTILES AND PERCENTILE RANKS FROM GROUPED DATA

A percentile (from “per cent”, meaning per hundred) is used to describe the position of a participant with respect
to a group and is based on the cumulative percentage frequency distribution. There are
two important related concepts: the percentile point and the percentile rank.
The percentile point, commonly referred to simply as the percentile, is a point on the score scale below
which a specified percentage of cases fall. For example, in Table 5, we can see that in the verbal
reasoning section of the CAT, 92% of the candidates scored below 35.
Therefore, the 92nd percentile is 35. The percentile rank, on the other hand, is the
percentage of cases that fall below a given point on the measurement scale. In our example
above, the percentile rank of the score 35 is 92.

Table 5: Data of Common Admission Test (CAT)

Percentage of candidates scoring below a particular score

Score obtained (out of 50)   Verbal reasoning   Quantitative aptitude   Data interpretation
45                           98                 99                      92
35                           92                 94                      83
25                           85                 86                      74
15                           67                 70                      51
5                            25                 30                      15
Number of candidates         5,00,000           5,00,000                5,00,000

Although the definitions of percentile and percentile rank are similar, the difference
lies in the fact that percentile ranks can take values between 0 and 100 only, whereas
a percentile point can take any value that the scores can take. In our example, the
maximum value of the percentile point is 50. Symbolically, a percentile is represented
by P: the 20th percentile as P20, the 30th percentile as P30, and so on. Let us assume that one
of the candidates scored 25 in data interpretation; this can be written as P74 = 25
(74% of the test takers scored below 25 in data interpretation). In terms of percentile
ranks, the subscript here indicates the rank, which is 74.

3.7.1 Calculating Percentiles from Grouped Data

Finding the 20th percentile, P20, means finding the score below which
20% of the cases fall. Since there are a total of 30 cases in the distribution, and 20% of 30 is 6,
P20 is the point below which 6 cases fall.
To calculate this, we start from the bottom of the distribution and count upwards.
We can see that this point falls in the interval 69.5-71.5. We cannot yet say
exactly what the score is; all we can be sure of is that it falls
within this interval. This interval contains 5 scores (so that a total
of 8 scores fall below its upper real limit of 71.5), and we assume that the scores are equally distributed within the interval.

[Figure 1: Placement of Percentile Score. The class interval 69.5-71.5 has a width of 2 and contains 5 cases; 3 cases fall below its lower real limit of 69.5, and P20 lies 3 cases above that limit, with the remaining 2 cases above P20.]

Since there are three cases below the lower real limit of 69.5, we need to move up 3
more cases to reach P20. That means covering 3 of the 5 equal
parts of the interval, which has a width of 2: (3/5) × 2 = 1.2 points. We then
add this value to the lower real limit to get P20.
P20 = (3/5) × 2 + 69.5 = 70.7
The entire process can be summarized in following steps:

Step 1: Find the class interval within which P20 falls. This is done by:
I. Recalling that P20 is the score below which 20% of the scores fall
II. Calculating 20% of the total number of observations: 20% of 30 = 6
III. Counting up from the bottom to find that the 6th score falls in the interval 69.5-71.5
Step 2: Determine the number of cases between the lower real limit of the interval and
P20, in this case 3.
Step 3: Assume that the scores are equally distributed within the class interval and find how many
points above the lower real limit the 6th score falls, using (3/5) × 2 = 1.2
Step 4: Finally, add these points to the lower real limit to get the percentile.

P20 = (3/5) × 2 + 69.5 = 70.7
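
The four steps can be sketched as a small Python function. The function and parameter names are hypothetical; the sketch assumes that the interval containing the percentile has already been identified (Step 1) and that scores are spread evenly within it.

```python
def grouped_percentile(p, n, lower_limit, width, freq, cum_below):
    """Percentile point from grouped data by linear interpolation.

    p           -- percentile wanted (e.g. 20 for P20)
    n           -- total number of cases
    lower_limit -- lower real limit of the interval containing the percentile
    width       -- width of that interval
    freq        -- number of cases inside that interval
    cum_below   -- number of cases below the lower real limit
    """
    target = p * n / 100                 # cases that must fall below the percentile
    cases_needed = target - cum_below    # Step 2: cases to cover inside the interval
    return lower_limit + cases_needed / freq * width   # Steps 3 and 4


p20 = grouped_percentile(20, n=30, lower_limit=69.5, width=2, freq=5, cum_below=3)
print(round(p20, 1))
```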

3.7.2 Calculating Percentile Ranks from Grouped Data

Suppose we want to calculate the percentile rank for the score of 79. We
first determine the interval within which this score falls, which is 77.5-79.5. To
get to 79, we need to add 1.5 to the lower real limit of the interval:
77.5 + 1.5 = 79. There are 2 scores in this interval and the interval width is also 2. Assuming
that the two scores are equally distributed, we must come up (1.5/2)
× 2 = 1.5 cases from the bottom of the interval. Since there are 25 scores below
the lower real limit, the point is 25 + 1.5 = 26.5 cases up from the bottom of the
distribution. Finally, 26.5/30 = .883 or 88.3%. Therefore, the score of 79 is at a point
below which 88.3% of the cases fall.
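
The same reasoning can be written as a short Python sketch. As before, the function and parameter names are illustrative only, and the interval containing the score is assumed to be known.

```python
def grouped_percentile_rank(score, n, lower_limit, width, freq, cum_below):
    """Percentile rank of `score` from grouped data by linear interpolation.

    Assumes the `freq` cases in the interval are spread evenly across its width.
    """
    cases_within = (score - lower_limit) / width * freq   # cases below `score` inside the interval
    return 100 * (cum_below + cases_within) / n


pr_79 = grouped_percentile_rank(79, n=30, lower_limit=77.5, width=2, freq=2, cum_below=25)
print(round(pr_79, 1))
```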
The entire process can be summarized mathematically as:

Percentile rank of the score 79 = 100 × [25 + (1.5/2) × 2] / 30 = 88.3

To better understand the process, let’s calculate the percentile rank for the score of
74 using the above mathematical notation.

Percentile rank of the score 74 = 100 × [15 + (0.5/2) × 5] / 30 = 54.17

In-Text Questions
1. From the data given in table 7, calculate the following:
a) P40
b) Percentile rank for 82

3.8 COMPARISON OF Z-SCORES AND PERCENTILE RANKS

A percentile is a statistical measure used to locate the score of an individual, or a particular
item, with reference to the other scores or items in a data set. In
simple words, a percentile helps us to locate a single test-score in relation to
other scores in the test. A percentile also shows what share of the test-scores a particular
score surpassed. The percentile rank can be defined as the percentage of scores in a given
test or distribution that lie below the
given score.
With percentiles, the scores or items are divided into 100 equal parts,
and each of these 100 parts is known as a percentile. Therefore,
the percentile of a given score or item will always range from the
1st percentile to the 100th percentile.
For example:
Suppose Ramesh got 196 marks in the UGC NET exam and was placed at the 96th percentile. This
means that Ramesh’s score is better than that of 96% of the candidates.
How to calculate a percentile?
Given below is the formula to compute a percentile.
P = (n/N)*100
(Where P stands for percentile, n is the ordinal rank of the value in the dataset
and N is the total number of values.)
For example:
Let’s assume four MA psychology students from a university scored 35, 50, 60 and
75 in a cognitive psychology test. What will be the percentile rank of the student
who scored 60?
Step - 1
First, we need to arrange the scores in ascending order and rank them
according to their position, giving rank 1 to the lowest-scoring and rank 4 to
the highest-scoring student.

Scores obtained   35   50   60   75
Rank              1    2    3    4

Step - 2
By using the percentile formula given above, we will get the percentile rank of the
student who scored 60.
P = (n/N) *100
= (3/4) *100
= 75
Hence, the student who scored 60 is placed in the 75th percentile rank.
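
The rank-based formula P = (n/N)*100 can be sketched in Python as follows. The function name is our own, and tied scores are handled in the simplest possible way (the first matching position is used), which is an assumption rather than something the text specifies.

```python
def percentile_rank_by_position(score, scores):
    """Apply P = (n / N) * 100, where n is the ordinal rank of `score` in ascending order."""
    ordered = sorted(scores)           # Step 1: arrange in ascending order
    n = ordered.index(score) + 1       # ordinal rank, counting from 1
    return n / len(ordered) * 100      # Step 2: apply the formula


p = percentile_rank_by_position(60, [35, 50, 60, 75])
print(p)
```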
A few other related statistical measures are quartiles, deciles and the median.
We can understand the concept of percentile with the help of the median. The median
is the point that divides the scores into two
equal parts: 50% of the data lie below the median and the other 50% lie above
it. The median is therefore the 50th percentile.
Quartiles divide a score
distribution into four equal parts, described as the 1st, 2nd, 3rd
and 4th quartiles. Similarly, deciles divide the distribution into 10 equal parts, described as the 1st to
10th deciles.

Uses of percentile and percentile rank

• Percentile and percentile rank can be used in the social sciences and
humanities to indicate the position of an individual’s score or an item
with reference to the other scores and items of the test or distribution.
• To identify and rank students’ performances in various exams and
co-curricular activities.
• To rank and identify the performance of companies, organisations and institutions in
various fields.
In-Text Questions
2. z-scores tell us how far below or above the mean a value is, in:
a) Mean units b) Standard deviation units
c) Range units d) Raw score units
3. z-scores can be taken as the common standard to compare different kinds of
datasets
a) True b) False
4. The standard deviation of the z-score is:
a) 0 b) 10
c) 1 d) 100
5. The z-score above 0 means
a) All sample values are equal b) Sample values are below the mean
c) Sample values are above the mean d) None of the above
6. What does a negative z-score imply?
a) Value of the score is equal to the mean
b) Value of the score is greater than the mean
c) Value of the score is less than the mean
d) All of above

3.9 THE NORMAL PROBABILITY DISTRIBUTION: NATURE, PROPERTIES AND APPLICATIONS

The normal probability distribution, also known as the Gaussian distribution, is a
continuous probability distribution that is widely used in statistics, engineering, and
the natural sciences. The distribution is bell-shaped and symmetrical around its mean,
with the majority of the data concentrated in the middle and tapering off towards the
tails.
3.9.1 Nature of the Normal Distribution

The normal distribution is a theoretical distribution that represents the probability of a
random variable taking on different values. It is used to model many real-world
phenomena, such as the heights and weights of a population, IQ scores, and
measurements of physical and chemical processes. The normal distribution is also
important in statistical inference, as many statistical tests and models rely on the
assumption of normality.

3.9.2 Properties of the Normal Distribution

The normal distribution is defined by two parameters: the mean (µ) and the standard
deviation (σ). The mean determines the centre of the distribution, while the standard
deviation measures the spread of the data. The distribution is fully specified by these
two parameters, and once they are known, we can calculate probabilities for any
range of values.
The normal distribution has several important properties, including:
1. The distribution is symmetric around the mean.
2. The mean, median, and mode are all equal.
3. The total area under the curve is equal to 1.
4. The tails of the distribution extend infinitely in both directions.
5. The shape of the distribution is determined by the mean and standard deviation.
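
Some of these properties can be checked numerically with `NormalDist` from Python’s standard `statistics` module. The mean of 100 and standard deviation of 15 used below are illustrative values chosen for this sketch (an IQ-style scale), not figures taken from the text.

```python
from statistics import NormalDist

# Illustrative parameters (assumption): mean 100, standard deviation 15.
dist = NormalDist(mu=100, sigma=15)

# Property 1: symmetry. The area below mean - 1 SD equals the area above mean + 1 SD.
left_tail = dist.cdf(85)
right_tail = 1 - dist.cdf(115)

# Property 2: the median (the point with half the area below it) equals the mean.
median = dist.inv_cdf(0.5)

# Area within one standard deviation of the mean, roughly 68% of the total area of 1.
within_one_sd = dist.cdf(115) - dist.cdf(85)

print(round(left_tail, 4), round(right_tail, 4), median, round(within_one_sd, 4))
```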

3.9.3 Applications of the Normal Distribution

The normal distribution is widely used in statistics and scientific research, and is applied
in many fields including:
1. Quality control: The normal distribution is used to model the distribution of
values in a process, and to set control limits that can be used to detect when the
process is out of control.
2. Inferential statistics: Many statistical tests and models rely on the assumption
of normality, and the normal distribution is used to calculate probabilities and
confidence intervals.
3. Financial modelling: The normal distribution is used to model stock prices
and returns, and to calculate probabilities of different investment outcomes.
4. Epidemiology: The normal distribution is used to model the distribution of a
disease or illness in a population, and to estimate the probability of different
outcomes.
5. Psychology: The normal distribution is used to model many psychological
variables, such as IQ scores and personality traits.
In summary, the normal probability distribution is a fundamental tool in statistics
and scientific research, providing a way to model many real-world phenomena and to
make predictions about future outcomes.

3.10 NORMAL CURVE AND STANDARD SCORES

3.10.1 Finding Areas when the Score is known and Finding Scores when
the Area is known

z-scores located on the bell curve can be used to find the area covered under
the curve. Conversely, an area under the bell curve can be used to find the corresponding z-score.

Figure 2: Area under the Normal Curve


If you are using a standard normal distribution table, find the row that corresponds
to the z-score up to its first decimal place and the column that corresponds to its second decimal
place. Then, look at the cell where they meet, which gives you the
area to the left of the z-score.
Table 6: (Stu Z Table - University of Arizona)

For example:
If the z-score is 1.25, you would look in the row that corresponds to 1.2 (the first part
of 1.25) and the column that corresponds to 0.05 (the last part of 1.25). The
corresponding cell in the table would give you the area to the left of 1.25, which is
0.8944.
To calculate the area to the right of a given z-score, subtract the area to the left
of the z-score from 1.0. Using the example above, the area to the right of 1.25 would
be:
1.0 - 0.8944 = 0.1056.
So, the area to the right of 1.25 is 0.1056 as a proportion (out of 1), or 10.56
percent of the population.
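
When no printed table is at hand, the same areas can be obtained from the cumulative distribution function of the standard normal distribution. The sketch below uses `NormalDist` from Python’s standard `statistics` module; the variable names are our own.

```python
from statistics import NormalDist

std_normal = NormalDist()          # standard normal: mean 0, standard deviation 1

area_left = std_normal.cdf(1.25)   # area to the left of z = 1.25
area_right = 1 - area_left         # area to the right of z = 1.25

print(round(area_left, 4), round(area_right, 4))
```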
Calculating z-score from the area:
Using the same standard normal distribution table, find the area in the body of the table and read off
the corresponding z-score. The table typically gives the area to the left of
the z-score, so if you need the z-score for a given area to the right of it, you’ll
need to subtract that area from 1 before looking up the corresponding z-score.
For example, suppose you want to find the z-score that has an area
of 0.95 to its left. In the body of a standard normal distribution table, you would
look for the area closest to 0.95. The table shows 0.9495 for z = 1.64 (row 1.6, column 0.04)
and 0.9505 for z = 1.65 (row 1.6, column 0.05). Since 0.95 lies midway between these,
the z-score that corresponds to an area of 0.95 is approximately
1.645.
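
The reverse lookup corresponds to the inverse of the cumulative distribution function. A minimal sketch, again using `NormalDist` from Python’s standard `statistics` module with variable names of our own choosing:

```python
from statistics import NormalDist

std_normal = NormalDist()                  # standard normal: mean 0, standard deviation 1

# z-score with an area of 0.95 to its left.
z_left_95 = std_normal.inv_cdf(0.95)

# z-score with an area of 0.05 to its right: subtract the area from 1 first.
z_right_05 = std_normal.inv_cdf(1 - 0.05)

print(round(z_left_95, 3), round(z_right_05, 3))
```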
Another application is that the normal probability curve can be used to determine the percentage of individuals
whose scores fall between two given scores.
Example:
In a sample of 1000 cases, the mean is 14.5 and the SD is 2.5. Assuming normality,
how many individuals scored between 12 and 16?
First convert both the raw scores into z-scores.
z1 = (12 - 14.5)/2.5 = -1
z2 = (16 - 14.5)/2.5 = 0.6
Now, from the table, the area between z = 0 and z = 0.6 contains 22.57
percent of the cases. We already know that 34.13% of the cases lie between z = 0 and z = -1
(by symmetry, the same as between 0 and +1). Hence,
between the points 12 and 16 lie 22.57% + 34.13% = 56.7% of the
cases. The total number of cases is 1000, so 56.7% of 1000, that is 567 individuals, lie between
these two points.
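
The same result can be checked without a table by taking the difference of two cumulative areas. This sketch uses `NormalDist` from Python’s standard `statistics` module; the variable names are illustrative.

```python
from statistics import NormalDist

scores = NormalDist(mu=14.5, sigma=2.5)   # distribution described in the example
n_cases = 1000

# Area between 12 and 16 = (area to the left of 16) - (area to the left of 12).
proportion = scores.cdf(16) - scores.cdf(12)
expected_count = round(n_cases * proportion)

print(round(proportion, 4), expected_count)
```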

In-Text Questions
7. What is one advantage of T-score?
a) It cannot be used to interpret data
b) There are no negative or decimal numbers
c) There are no advantages of T-score
8. Calculate T-score for a value with a z-score of 0.5.
9. Calculate raw score for a value with a mean of 50, z-score of 0.5 and standard
deviation of 2.
10. What is one disadvantage of STEN and Stanine-scores?
11. Percentile rank range is from:
a) 1-10
b) 1-59
c) 0-99
d) 1-99

3.11 SUMMARY

The chapter aimed to introduce the different kinds of standard scores available to
psychologists to analyse, interpret and draw conclusions from data. The z-score, T-score,
STEN-score, Stanine-score and percentile are some of the standard scores that are available;
however, which score is best suited to a dataset should be carefully considered.
Each score has its own advantages and disadvantages. The overall aim of the research,
the potential audience or beneficiaries of the research, and the ability to use these scores to interpret
data and draw valid conclusions should be some of the considerations when
choosing among these scores.

3.12 ANSWERS TO IN-TEXT QUESTIONS

1. a) P40 = 72.74
b) Percentile rank of the score 82 = 100 × [28 + (0.5/1) × 2] / 30 = 96.67
2. b
3. a
4. c
5. c
6. c
7. b
8. 55
9. 51
10. The scales are not equally sized.
11. d

3.13 GLOSSARY

• Gaussian distribution or bell-shaped curve: A symmetrical, bell-shaped distribution
described by a probability density function; in its standard form it has a mean of 0 and an SD
of 1. About 68% of the population falls within the range of -1 to +1 SD.
• Percentile: A percentile is a score that indicates an individual’s relative position
with respect to his or her cohort on a scale of 100 points.
• Standard deviation: It is the square root of the sum of the squared deviations of
each score from the mean divided by the total number of scores.
• Stanine-score: It is a standard score in which the test-scores are scaled on a
nine-point normally distributed scale.
• STEN-score: It is a standard score in which the test-scores are scaled on a
ten-point normally distributed scale.
• T-score: It is a normalized standard score with a mean of 50 and SD of 10
points.
• z-score: It is a normalized standard score with a mean of 0 and SD of 1 point.

3.14 SELF-ASSESSMENT QUESTIONS

Q1. What are some of the advantages of using the z-score?


Q2. Calculate the area to the left for a z-score of 0.46.
Q3. What is one advantage and disadvantage of T-score?
Q4. Calculate the z-score for a test result of 110 marks with a mean of 90 and
standard deviation of 15.
Q5. What does scoring 75th percentile in an exam mean?
Q6. Calculate the raw score for a z-score of 1.5, mean of 65 and standard deviation
of 10.
Q7. T-scores can be negative and difficult to interpret. True or False?
Q8. Stanine-score has a scale of 1-9. True or False?
Q9. Calculate the STEN-score for a z-score of 2, standard deviation 2 and mean of
5.5.
Q10. Find the percentile for the score of 60 in the dataset [65 of 70, 34, 65, 45, 23,
55, 60, 75].

3.15 REFERENCES

Drummond, R. J. (2000). Appraisal procedures for counselors and helping
professionals (5th ed.). Upper Saddle River, NJ: Prentice-Hall.
Glen, S. (2021, January 14). STEN-score. Statistics How To.
Glen, S. (2022, August 10). Stanine-score: Definition, examples, how to convert.
Statistics
http://www.learningaboutelectronics.com/Articles/Area-under-the-curve-calculator-z-score.php
Area Under the Curve Calculator. Area under the curve calculator.
(2018).
https://www.statisticshowto.com/stanine/
https://www.statisticshowto.com/sten-score/
https://www.varsitytutors.com/hotmath/hotmath_help/topics/percentile
https://www.alleydog.com/glossary/definition-cit.php?term=Z-Score+%28Standard+Score%29
Score (Standard Score). (n.d.). In
Alleydog.com’s online glossary.
https://study.com/learn/lesson/stanine-score-uses-examples.html
https://www.math.arizona.edu/~rsims/ma464/standardnormaltable.pdf , Stu Z Table
- University of Arizona. (n.d.).
Zach. (2021, January 1). What is a Stanine-score? (definition & examples).
Statology. https://www.statology.org/stanine-score/

3.16 SUGGESTED READINGS

Aron, A., Coups, E.J. & Aron, E.N. (2013). Statistics for Psychology (6th Ed.). Pearson Education.
Asthana, H.S. & Bhushan, Braj (2007). Statistics for social sciences (with SPSS applications). New Delhi: Prentice Hall of India.
Field, A. (2009). Discovering Statistics using SPSS (3rd Ed.). New Delhi: Sage.
Garrett, H.E. (2005). Statistics in Psychology and Education. Paragon International Publishers.
Howitt, D. & Cramer, D. (2011). Introduction to Statistics in Psychology (5th Ed.). Pearson Education.
King, B.M., Rosopa, P.J. & Minium, E.W. (2011). Statistical Reasoning in the Behavioral Sciences (6th Ed.).
Mangal, S.K. (2010). Statistics in Psychology and Education (2nd Ed.). PHI Learning.
Mohanty, B. & Misra, S. (2015). Statistics for behavioral and social sciences. New Delhi: Sage Publications.
Chadha, N.K. (2009). Applied Psychometry. New Delhi: Sage Pub.
Chadha, N.K. (1991). Statistics for Behavioral and Social Sciences. New Delhi: Reliance Pub. House.
Chadha, N.K. & Sehgal, R.L. (1984). Statistical Methods in Psychology. New Delhi: ESS ESS Publications.

UNIT III: ANALYSIS OF RELATIONSHIPS

LESSON 4

ANALYSIS OF RELATIONSHIPS
Dr. Deepesh Rathore
Assistant Professor
Department of Psychology
Lakhsmibai College, University of Delhi
Email-Id: [email protected]

Structure
4.1 Learning Objectives
4.2 Introduction
4.3 Understanding Correlation
4.3.1 Scatter Diagram
4.3.2 Components of Correlation: Direction and Magnitude
4.3.3 Meaning of Correlation
4.4 Calculating Pearson’s Correlation
4.5 Correlation and causation
4.6 Effects of Linear Score Transformations
4.7 Factors Influencing Correlation
4.8 Spearman Rank Correlation Method
4.9 Linear Regression Analysis/Simple Regression
4.10 Summary
4.11 Answers to In-Text Questions
4.12 Glossary
4.13 Self-Assessment Questions
4.14 References
4.15 Suggested Readings

4.1 LEARNING OBJECTIVES

• Understand the meaning of correlation
• Construct a scatter diagram for the relationship between pairs of variables
• Understand why correlation does not mean causation
• Understand how to calculate Pearson's product-moment correlation

4.2 INTRODUCTION

Charles Darwin was a famous scientist who worked on the concept of the evolution of species through natural selection. In his studies he identified variations among species that help them adapt to their environment and thus ensure their survival. This finding inspired Francis Galton (Darwin's cousin) to carry out research on individual differences. He wanted to understand the role of inheritance in the stature of children. For this he collected data on the heights of parents and of their offspring, and then tabulated the data.
Table 1: Illustration of Galton’s Data (Not the Original Data)

Height of Offspring            Height of Parents (in cm)
(in cm)               Below 160   160-165   165-170   170-175   Above 175
Above 175                 1          1         2         4         20
170-175                   5          4         5        15         10
165-170                   6          2        12        12          5
160-165                   2         10         2         1          2
155-160                  14          5         1         3          1
Below 155                 1          1         1

The distribution thus created is an example of what is known as a bivariate distribution. A bivariate distribution is a distribution that shows the relationship between two variables. Here, the two variables are the height of parents and the height of offspring. In order to establish the relation between the two variables, we can draw a straight line as shown in table 1. In 1896, Karl Pearson formulated a technique to calculate this relationship mathematically, which is referred to as correlation.

Correlation is defined as a statistical technique used to understand the level of association between variables. For example, if we consider IQ and CGPA in graduation as two variables, it has been seen that students who have higher IQ scores are more likely to obtain a higher CGPA. This suggests that there is a high level of association between IQ and CGPA. Results like these can also be used to screen students during the admission process, which involves another related concept known as prediction. When two variables are closely related to each other, we can predict the value of one variable on the basis of the value of the other. This concept can also be used by a manager who wants to hire high-performing employees. Research has shown that employees with a higher level of motivation usually perform better and are less likely to leave the organisation. The association between these variables can help in making better hiring decisions by assessing an applicant's level and type of motivation. The prerequisite for effective prediction is a high level of correlation between the variables.

In-Text Questions
1. _______ is a distribution that shows the relationship between two variables.
2. _______ is considered as a statistical technique used to understand the level
of association between variables.

4.3 UNDERSTANDING CORRELATION

4.3.1 Scatter Diagram

A scatter diagram, also known as a scatter plot, is a way of representing information about the relationship between variables. Table 2 shows the scores obtained by 20 students on an IQ test (labelled X) and their CGPA, a measure of classroom performance (labelled Y). When we plot the scores on a graph and represent each pair of IQ and CGPA scores with a dot, we obtain the scatter plot shown in figure 1.

Table 2: Scores Obtained by the Students on their IQ Test and CGPA

S. No.   Intelligence Quotient (IQ)   Cumulative Grade Point Average (CGPA)
         X                            Y
1        100                          5
2        102                          6
3        111                          8
4        123                          9.5
5        114                          8.5
6        100                          5.5
7        98                           6.5
8        99                           6.5
9        97                           5.5
10       101                          7
11       103                          8
12       122                          9
13       111                          8
14       100                          8
15       98                           7.5
16       105                          7
17       109                          8.5
18       110                          8.5
19       118                          9
20       100                          8

We can also draw a line passing through the cluster of dots on the diagram. This line indicates the nature of the relationship between the two variables, which here is linear. It implies that as IQ scores increase, CGPA also increases.

Figure 1: Scatter Plot of IQ and CGPA Scores of 20 Students (y-axis: CGPA)

But this is not always the case. Many variables do not share a linear relationship, which means that we cannot draw a straight line connecting or hugging the points on the scatter plot. Instead, we can use a curved line to connect the dots, and hence we call the relationship curvilinear, as shown in figure 2.

Figure 2: Relationship between Happiness and Learning New Tasks (x-axis: Learning new task; y-axis: Happiness)

Keep in mind that the discussion about prediction from correlation applies only when the relationship between the variables is linear, not curvilinear.
In order to draw a scatter plot, we use the following steps:
Step 1: Assign the labels X and Y to the variables.
Step 2: Mark the values of the variables on the x-axis and y-axis. On the x-axis, values run from lower on the left to higher on the right; on the y-axis, values run from lower at the bottom to higher at the top.
Step 3: For a value of X, find the corresponding value of Y and mark their intersection with a dot.
Step 4: Repeat step 3 for all the pairs of values.
Step 5: Name each axis and add the title of the graph.

4.3.2 Components of Correlation: Direction and Magnitude

The correlation between variables is calculated with the help of Pearson's correlation coefficient (symbolized by r), formulated by Pearson in the year 1896. If we consider two variables X and Y, then we can represent the correlation between them as $r_{XY}$. The correlation coefficient can take values from -1 to +1, where -1 represents a perfect negative correlation, +1 a perfect positive correlation, and 0 no correlation. In figure 3, we can see that the dots move from the lower left corner to the upper right corner, indicating a linear relationship in the upward direction. This type of scatter plot is formed when, as one variable increases in value, the other variable also increases.

Figure 3: Scatter Plot Representing Perfect Positive Correlation (x-axis: Variable X; y-axis: Variable Y)
On the other hand, the opposite can also be the case: as the value of one variable increases, the value of the other variable decreases. This forms a downward trend line from the upper left corner to the lower right corner, as shown in figure 4.

Figure 4: Scatter Plot Representing Perfect Negative Correlation (x-axis: Variable X; y-axis: Variable Y)

Finally, in figure 5 we can see that there is no clear relationship between the variables. There is no clear direction, as higher values of one variable are paired with higher as well as lower values of the other variable. This creates a non-directional scatter plot.

Figure 5: Scatter Plot Representing No Correlation (x-axis: Variable X; y-axis: Variable Y)
So far, we have talked about one component of correlation, which is direction. Now we will discuss the second component, which is magnitude. Magnitude refers to the strength of the relationship between the variables. As discussed previously, the value of r ranges from -1 to +1. The sign (+ or -) indicates the direction of the relationship, whereas the absolute value represents its magnitude. The closer the value of r is to ±1, the stronger the correlation, irrespective of the direction. For example, a correlation of +0.70 is the same in strength as -0.70; the difference lies in the direction, as one is positive and the other is negative.
Figure 6: (a) Scatter Diagram Showing High Positive Correlation; (b) Moderate to Low Positive Correlation (x-axis: Variable X; y-axis: Variable Y)

4.3.3 Meaning of Correlation

In this section we will try to answer the question ‘what does it mean when we say that the correlation between variables X and Y is r = +.85?’. Before answering this question, a few aspects of correlation need to be clear. Firstly, correlation represents the degree of linear relationship between variables; it does not mean that one variable is causing changes in the other variable. Secondly, when we compare correlation coefficients, we cannot say that r = +.80 is twice as large as r = +.40, because correlation coefficients are not percentages. If this is the case, then how should the degree or magnitude of a correlation coefficient be interpreted? The answer can be obtained by looking at the cases falling above the median on the first variable and finding what percentage of them fall above or below the median on the second variable (Michael, 1966).
Table 3: Representing Meaning of Correlation w.r.t the
Percentage of Cases for the 2nd Variable

If we consider our earlier example of the IQ and CGPA scores of students, and let's say the correlation between these variables is r = +.80, then by looking at table 3 we can say that, of those who scored above the median on the IQ test, 79.3% will also score above the median on CGPA and 20.7% will score below the median. If there is perfect correlation (r = +1.00), then all the cases above the median on IQ will also be above the median on CGPA. On the other hand, if there is no correlation (r = 0.00), then only 50% of those who are above the median on IQ will also be above the median on CGPA.

In-Text Questions
3. _______is also known as scatter plot.
4. We can use linear relationship in a scatter plot. (True/False)
5. +1 represents a _______type of correlation.
6. _______implies strength of relationship between variables.

4.4 CALCULATING PEARSON’S CORRELATION

So far, we have discussed what correlation is and how the relationship between variables can be plotted using a scatter plot. In this section we are going to see how we can calculate the correlation coefficient between variables. One of the most widely accepted and used correlation coefficients is Pearson's product-moment correlation coefficient.
Pearson's correlation coefficient can be calculated using two methods:

(1) Standard score or z-score formula

In this method, we first convert the raw scores of both variables into z-scores. We then calculate the sum of the products of each pair of z-scores, also known as the sum of cross-products, and divide it by the total number of pairs of scores.

$$r = \frac{\sum Z_X Z_Y}{n} \qquad \text{(a)}$$
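The z-score method can be sketched in a few lines of Python. This is only an illustration, not part of the original material: the function names `z_scores` and `pearson_r_z` are hypothetical, and the ten pairs of scores are the ones used in Table 4 of this lesson. Note that formula (a) divides by n, so the standard deviation used for the z-scores must also divide by n (the population form).

```python
from math import sqrt

# Ten pairs of scores (the same pairs as in Table 4 of this lesson).
x = [10, 11, 12, 13, 15, 19, 11, 12, 8, 13]
y = [5, 8, 10, 11, 9, 15, 6, 8, 4, 12]


def z_scores(values):
    """Convert raw scores to z-scores using the SD that divides by n."""
    n = len(values)
    mean = sum(values) / n
    sd = sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sd for v in values]


def pearson_r_z(x, y):
    """Formula (a): sum of the z-score cross-products divided by n."""
    zx, zy = z_scores(x), z_scores(y)
    return sum(a * b for a, b in zip(zx, zy)) / len(x)


r_z = pearson_r_z(x, y)
print(round(r_z, 2))
```

For these scores the result rounds to .88, the same value that the raw score method gives for this data.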

(2) Deviation score formula

In this method, using the deviation score formula, we can calculate Pearson's correlation coefficient directly from the raw scores. Here, we first calculate the sum of the products of the deviation scores for each pair of scores, and then divide the result by the product of the number of pairs of scores and the standard deviations of the two variables.
Mathematically,

$$r = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{n S_X S_Y}$$

We can further simplify this formula by substituting the values of $S_X$ and $S_Y$,

$$S_X = \sqrt{\frac{\sum (X - \bar{X})^2}{n}}, \qquad S_Y = \sqrt{\frac{\sum (Y - \bar{Y})^2}{n}}$$

$$r = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{n \sqrt{\dfrac{\sum (X - \bar{X})^2}{n}} \sqrt{\dfrac{\sum (Y - \bar{Y})^2}{n}}}$$

$$r = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sqrt{\sum (X - \bar{X})^2 \sum (Y - \bar{Y})^2}}$$

We know that,

$$SS_X = \sum (X - \bar{X})^2 = \sum X^2 - \frac{(\sum X)^2}{n}$$

$$SS_Y = \sum (Y - \bar{Y})^2 = \sum Y^2 - \frac{(\sum Y)^2}{n}$$

$$r = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sqrt{(SS_X)(SS_Y)}} \qquad \text{(b)}$$

Raw score equivalent of $\sum (X - \bar{X})(Y - \bar{Y})$:

$$\sum (X - \bar{X})(Y - \bar{Y}) = \sum XY - \frac{(\sum X)(\sum Y)}{n}$$

Let's understand how to calculate r using the raw score formula with the help of an example.
Table 4: Raw Score Method Solution

S no.   Variable X   Variable Y   X²            Y²           XY
1       10           5            100           25           50
2       11           8            121           64           88
3       12           10           144           100          120
4       13           11           169           121          143
5       15           9            225           81           135
6       19           15           361           225          285
7       11           6            121           36           66
8       12           8            144           64           96
9       8            4            64            16           32
10      13           12           169           144          156
        ΣX = 124     ΣY = 88      ΣX² = 1618    ΣY² = 876    ΣXY = 1171

$$SS_X = \sum X^2 - \frac{(\sum X)^2}{n} = 1618 - \frac{(124)^2}{10} = 80.4$$

$$SS_Y = \sum Y^2 - \frac{(\sum Y)^2}{n} = 876 - \frac{(88)^2}{10} = 101.6$$

$$\sum (X - \bar{X})(Y - \bar{Y}) = \sum XY - \frac{(\sum X)(\sum Y)}{n} = 1171 - \frac{124 \times 88}{10} = 79.8$$

$$r = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sqrt{(SS_X)(SS_Y)}} = \frac{\sum XY - \dfrac{(\sum X)(\sum Y)}{n}}{\sqrt{(SS_X)(SS_Y)}} = \frac{79.8}{\sqrt{80.4 \times 101.6}} = +.88$$
The calculation of r using the raw score formula can be summarized in the following steps:
Step 1: After writing the raw scores, calculate the following:
$\sum X$, $\sum Y$, $\sum X^2$, $\sum Y^2$, and $\sum XY$
Step 2: Calculate the sums of squares for both variables and the sum of cross-products:
$SS_X$, $SS_Y$ and $\sum (X - \bar{X})(Y - \bar{Y})$
Step 3: Substitute the values calculated in steps 1 and 2 into formula (b).
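The three steps can also be followed in code. The sketch below is a minimal illustration in Python; the function name `pearson_r_raw` and its return layout are assumptions made for this example, and the scores are those of Table 4.

```python
from math import sqrt


def pearson_r_raw(x, y):
    """Pearson's r from raw scores, following steps 1 to 3 above."""
    n = len(x)
    # Step 1: the five sums computed from the raw scores.
    sum_x, sum_y = sum(x), sum(y)
    sum_x2 = sum(v * v for v in x)
    sum_y2 = sum(v * v for v in y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    # Step 2: sums of squares and the sum of cross-products.
    ss_x = sum_x2 - sum_x ** 2 / n
    ss_y = sum_y2 - sum_y ** 2 / n
    sp = sum_xy - sum_x * sum_y / n
    # Step 3: substitute into formula (b).
    r = sp / sqrt(ss_x * ss_y)
    return ss_x, ss_y, sp, r


x = [10, 11, 12, 13, 15, 19, 11, 12, 8, 13]
y = [5, 8, 10, 11, 9, 15, 6, 8, 4, 12]

ss_x, ss_y, sp, r = pearson_r_raw(x, y)
print(round(ss_x, 1), round(ss_y, 1), round(sp, 1), round(r, 2))
```

The intermediate values agree with the worked example (80.4, 101.6 and 79.8), and r rounds to .88.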

In-Text Questions
7. Following data represents the scores obtained by psychology students on
their level of motivation and self-esteem.
Motivation (X): 12, 16, 14, 12, 13, 20, 24
Self-esteem (Y): 5, 8, 7, 4, 3, 10, 12
Calculate Pearson's correlation coefficient using deviation score formula.
8. In the _____ method we convert raw scores into z-scores.
9. In the _____ method we can directly calculate Pearson's correlation coefficient using raw scores.

4.5 CORRELATION AND CAUSATION

One of the most important aspects of correlation is that a correlation between two or more variables means only that the variables are associated with each other. This association, or shared variation, does not mean that one variable is causing changes in another variable, i.e., correlation doesn't imply causation. There is only an association between the variables, not a cause-and-effect relationship. For example, suppose a study claims that use of a new medicine is positively related to improvement in diabetes. Based on this association, can we conclude that the medicine is effective in the treatment of diabetes? The answer is that the medicine may be effective, but we cannot be sure about this conclusion because we have not taken into account other possibilities, such as whether the patients had an active lifestyle (exercising, walking, eating healthy, eating less sugar and carbohydrates), or their age, weight, etc. All these variables may influence the relationship between the medicine and improvement in diabetes.
This does not mean that correlation is unimportant or that the variables do not influence each other. They may influence each other directly or indirectly, and a correlation acts as a starting point for further studies. However, being able to predict Y on the basis of X is not the same as showing that X causes a change in Y. Correlation establishes association between variables, and a strong association supports prediction, but it does not by itself establish a cause-and-effect relationship between the variables.

4.6 EFFECTS OF LINEAR SCORE TRANSFORMATIONS

A linear transformation changes each raw score by adding a constant, subtracting a constant, multiplying by a constant, or dividing by a constant. Such changes in the raw scores do not influence the magnitude of the correlation coefficient (multiplying or dividing by a negative constant reverses only its sign). To understand this more clearly, let us look at our data in table 2, where the two variables are IQ and CGPA. CGPA scores are in decimal points, like 8.5. If we multiply each of the CGPA scores by 10, this score becomes 85; the mean of CGPA is multiplied by 10, and so is the standard deviation. But the correlation coefficient remains the same. Similarly, if we subtract 50 from the IQ scores and divide the resulting numbers by 10, the correlation coefficient is still r = +.781.
Another form of linear score transformation is converting the raw scores into standard scores and then calculating the correlation coefficient (formula (a)). In this instance also the correlation coefficient comes out to be the same.
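This invariance can be checked directly on the IQ and CGPA scores of Table 2. The Python sketch below is an illustration only; the helper `pearson_r` is a hypothetical name, and it uses the deviation score form of the formula.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson's r from deviation scores (formula (b))."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sp = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return sp / sqrt(ss_x * ss_y)


# IQ (X) and CGPA (Y) scores of the 20 students in Table 2.
iq = [100, 102, 111, 123, 114, 100, 98, 99, 97, 101,
      103, 122, 111, 100, 98, 105, 109, 110, 118, 100]
cgpa = [5, 6, 8, 9.5, 8.5, 5.5, 6.5, 6.5, 5.5, 7,
        8, 9, 8, 8, 7.5, 7, 8.5, 8.5, 9, 8]

r_original = pearson_r(iq, cgpa)
# Multiply every CGPA score by 10.
r_scaled = pearson_r(iq, [c * 10 for c in cgpa])
# Subtract 50 from every IQ score, then divide by 10.
r_shifted = pearson_r([(v - 50) / 10 for v in iq], cgpa)
# Multiplying by a negative constant flips only the sign.
r_negated = pearson_r(iq, [-c for c in cgpa])

print(round(r_original, 3), round(r_scaled, 3), round(r_shifted, 3), round(r_negated, 3))
```

The first three coefficients all round to .781, while the last one has the same magnitude with a negative sign.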

4.7 FACTORS INFLUENCING CORRELATION

1. Sample size: When the sample size is small, then the correlation coefficient is
slightly unstable, but as the sample size increases, the correlation coefficient
becomes more reliable.
2. Nature of sample: The correlation coefficient between variables changes as we change the sample. In other words, the correlation between two variables is not fixed; it depends upon the sample that we collect, and different samples result in different correlation coefficients.
3. Linear relationship: The correlation coefficient is an appropriate measure of the relationship between variables only when that relationship is linear, as shown in figure 1, where a straight line can be drawn through most of the dots on the scatter plot.
4. Variability of scores: The correlation coefficient depends on the variability of the scores in the sample. When the variability is restricted, that is, when the scores on one or both variables are concentrated within a narrow range, the correlation coefficient is usually lower than it would be in a sample with a wider spread of scores.
5. Discontinuity in scores (missing values): When there are missing values in
one variable or both variables, then correlation coefficient overestimates the
strength of relationship between the variables. In other words, value of correlation
coefficient increases because of missing values.

In-Text Questions
10. Correlation implies causation of two variables. (True/False)
11. Linear transformation can be acquired by converting raw scores into standard
scores. (True/False)
12. When there are missing values in one variable or both variables it is called
_______.

4.8 SPEARMAN RANK CORRELATION METHOD

Spearman rank correlation is a statistical measure that evaluates the strength and direction
of association between two variables. Unlike Pearson correlation, which assesses
linear relationships, Spearman correlation is based on the ranks of the data points
rather than their actual values. It’s particularly useful when dealing with ordinal or non-
normally distributed data.
The formula for calculating the Spearman rank correlation coefficient (ρ) is as follows:

$$\rho = 1 - \frac{6 \sum d_i^2}{n(n^2 - 1)}$$

The following steps are adopted for calculating Spearman Rank Correlation:
Step 1: Ranking the Data: First, the data for both variables are ranked
independently from lowest to highest. If there are ties (i.e., identical values), the ranks
are averaged. Let X and Y be two variables at ordinal data level. Let rank X represent
the order in which the values of X occur, and likewise rank Y represent the
corresponding order in which values of Y occur. Each value of X is associated with a
value of Y – they form pairs of values.
Step 2: Calculating the Differences in Ranks: Next, the difference between the ranks of each paired observation is calculated. These differences represent the deviations from a perfect correlation.

Step 3: Squaring the Differences: The differences are squared to eliminate any negative signs and to give more weight to larger differences.

Step 4: Summing the Squared Differences: The squared differences are then summed across all pairs of observations. The following formula will be used:
Sum of squared differences: $\sum d_i^2$
Step 5: Calculating the Spearman Rank Correlation Coefficient: Finally, the Spearman correlation coefficient (often denoted by the symbol ρ) is calculated using a formula that incorporates the sum of squared differences and the sample size.

$$\rho = 1 - \frac{6 \sum d_i^2}{n(n^2 - 1)}$$

Where:
$d_i$ represents the difference between the ranks of paired observations.
n is the number of paired observations.
This coefficient ranges from –1 to 1, where:
ρ = 1 indicates a perfect positive monotonic relationship (i.e., as one variable
increases, the other variable also increases).
ρ = –1 indicates a perfect negative monotonic relationship (i.e., as one variable
increases, the other variable decreases).
ρ = 0 indicates no monotonic relationship between the variables.
Spearman rank correlation is robust to outliers and does not assume linearity or
homoscedasticity, making it applicable to a wide range of data types. It’s commonly
used in various fields such as psychology, sociology, economics, and biology to explore
relationships between variables that may not conform to the assumptions of parametric
tests. However, it’s important to note that Spearman correlation does not imply causation;
it merely measures the strength and direction of association between variables.
Question 1: A researcher wants to investigate the relationship between the number of hours spent studying and the exam scores of 8 students. The data collected is as follows:

Student   Hours Studied (X)   Exam Score (Y)
1         5                   70
2         7                   85
3         4                   60
4         6                   75
5         3                   55
6         8                   90
7         6                   80
8         4                   65

Calculate the Spearman rank correlation coefficient for this data.

Solution:
Rank the data for both variables from lowest to highest, averaging the ranks of tied values (Step 1):

Student   Hours Studied (X)   Rank (X)   Exam Score (Y)   Rank (Y)   d_i = R_Xi - R_Yi   d_i²
1         5                   4          70               4          0                   0
2         7                   7          85               7          0                   0
3         4                   2.5        60               2          0.5                 0.25
4         6                   5.5        75               5          0.5                 0.25
5         3                   1          55               1          0                   0
6         8                   8          90               8          0                   0
7         6                   5.5        80               6          -0.5                0.25
8         4                   2.5        65               3          -0.5                0.25
Sum                                                                  0                   1

$$\rho = 1 - \frac{6 \sum d_i^2}{n(n^2 - 1)} = 1 - \frac{6 \times 1}{8(8^2 - 1)} = 1 - \frac{6}{504} \approx +.99$$
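The five steps of the method can be sketched in Python for the data of Question 1. This is an illustrative sketch, not the authors' procedure: `average_ranks` and `spearman_rho` are hypothetical helper names, and tied values are given the average of the ranks they occupy, as Step 1 prescribes. With averaged ranks the sketch gives a sum of squared differences of 1 and a coefficient of about .988.

```python
def average_ranks(values):
    """Rank from lowest to highest; tied values share the average rank."""
    ordered = sorted(values)
    ranks = []
    for v in values:
        first = ordered.index(v) + 1   # lowest rank position held by v
        count = ordered.count(v)       # number of tied observations
        ranks.append(first + (count - 1) / 2)
    return ranks


def spearman_rho(x, y):
    """Return (sum of squared rank differences, rho) for paired data."""
    rank_x, rank_y = average_ranks(x), average_ranks(y)
    n = len(x)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    rho = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))
    return sum_d2, rho


hours = [5, 7, 4, 6, 3, 8, 6, 4]
scores = [70, 85, 60, 75, 55, 90, 80, 65]

sum_d2, rho = spearman_rho(hours, scores)
print(average_ranks(hours))
print(sum_d2, round(rho, 3))
```

When many values are tied, this shortcut formula is only approximate; an alternative is to apply Pearson's formula to the ranks themselves.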

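Question 2: Given the following scores on two variables, calculate the Spearman rank correlation coefficient:
Variable X: 2, 4, 6, 8, 10
Variable Y: 5, 10, 15, 20, 25
Solution:
The differences must be taken between ranks, not between the scores themselves, so each variable is first ranked from lowest to highest:

X     Y     R_Xi   R_Yi   d_i = R_Xi - R_Yi   d_i²
2     5     1      1      0                   0
4     10    2      2      0                   0
6     15    3      3      0                   0
8     20    4      4      0                   0
10    25    5      5      0                   0
Sum                                           0

$$\rho = 1 - \frac{6 \times 0}{5(5^2 - 1)} = +1$$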

Questions for Practice

Spearman Rank-Order Correlation

Q1. Eleven candidates appeared for railway exams and their scores on reasoning
tests and aptitude tests are provided below. Calculate Spearman’s rank
correlation.

Candidate Reasoning Test Aptitude test

A 20 30

B 50 60

C 28 50

D 25 40

E 70 85

F 90 90

G 76 56

H 45 82

I 30 42

J 19 31

K 26 49

Q2. The following are the scores of 12 students in Physics and Maths. To what
extent is the knowledge of students in the 2 subjects related?

Student A B C D E F G H I J K L

Maths 80 45 55 56 58 60 65 68 70 75 85 90

Physics 82 86 50 48 60 62 64 65 70 74 90 75

Q3. The following are the ranks obtained by 10 candidates in history and political science entrance tests. Is there a relation between the candidates' knowledge in the two subjects? If yes, what is the extent of the relation?

Rank in history        6   3.5   3.5   1   2   7   9    8   5   10

Rank in Pol. Science   4   1     6     7   5   8   10   9   2   3

Q4. Find Rho for the data given below.

Sl. No. X Y

1. 12 21

2. 15 25

3. 24 35

4. 20 24

5. 8 16

6. 15 18

7. 20 25

8. 20 16

9. 11 16

10. 26 38

Q5. Following are the ranks obtained by players in two online video games. Determine whether their skills in the two games are related and to what extent.

Player           A   B   C   D    E   F   G   H   I   J

Rank in Game 1   1   2   3   4    5   6   7   8   9   10

Rank in Game 2   6   7   5   10   3   9   4   1   8   2

4.9 LINEAR REGRESSION ANALYSIS/SIMPLE
REGRESSION

Linear regression is a statistical method used to model the relationship between a dependent variable (often denoted as Y) and one or more independent variables (often denoted as X). It assumes a linear relationship between the independent variables and the dependent variable. The goal of linear regression is to find the best-fitting line or plane that describes the relationship between the variables. This is represented by the equation of a straight line in two dimensions or of a hyperplane in higher dimensions.
The equation of a simple linear regression model can be expressed as:

$$Y = \beta_0 + \beta_1 X + \varepsilon$$

Where:
• Y is the dependent variable.
• X is the independent variable.
• $\beta_0$ is the intercept (the value of Y when X = 0).
• $\beta_1$ is the slope (the change in Y for a one-unit change in X).
• $\varepsilon$ is the error term, representing the difference between the observed and predicted values of Y.
The parameters $\beta_0$ and $\beta_1$ are estimated from the data using a method such as ordinary least squares (OLS). OLS minimizes the sum of the squared differences between the observed and predicted values of the dependent variable.
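For a single independent variable, the OLS estimates have a well-known closed form that is not spelled out in the text: the slope is the sum of cross-products divided by the sum of squares of X, and the intercept is the mean of Y minus the slope times the mean of X. The Python sketch below assumes that closed form; the function names are hypothetical, and the data are the ten pairs from Table 4, reused purely for illustration.

```python
def fit_simple_regression(x, y):
    """OLS estimates (intercept, slope) for one independent variable."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sp = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    slope = sp / ss_x                      # estimate of beta_1
    intercept = mean_y - slope * mean_x    # estimate of beta_0
    return intercept, slope


def r_squared(x, y, intercept, slope):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_y = sum(y) / len(y)
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    return 1 - ss_res / ss_tot


x = [10, 11, 12, 13, 15, 19, 11, 12, 8, 13]
y = [5, 8, 10, 11, 9, 15, 6, 8, 4, 12]

b0, b1 = fit_simple_regression(x, y)
r2 = r_squared(x, y, b0, b1)
print(round(b0, 3), round(b1, 3), round(r2, 3))

# Predicted Y for a new value X = 14 from the fitted line.
print(round(b0 + b1 * 14, 2))
```

In simple regression the coefficient of determination equals the square of Pearson's r, so for this data it is close to .88 squared.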
Steps involved in performing linear regression:
1. Data Collection: Collect data on the dependent and independent variables
of interest.
2. Data Exploration: Explore the data to understand the relationship between
the variables, check for outliers, and assess the assumptions of linear
regression.
3. Model Building: Choose the appropriate regression model (simple linear regression, multiple linear regression, etc.) based on the number of independent variables and the nature of the relationship.
4. Parameter Estimation: Use statistical techniques like OLS to estimate the parameters (intercept and coefficients) of the regression model.
5. Model Evaluation: Evaluate the goodness of fit of the model using measures such as the coefficient of determination ($R^2$), adjusted $R^2$, and residual analysis.
6. Prediction and Inference: Use the fitted regression model to make
predictions about the dependent variable for new or unseen data. Additionally,
conduct hypothesis tests and confidence interval estimation for the regression
coefficients to make inferences about the relationship between the variables.
Linear regression is widely used in various fields, including economics, finance,
social sciences, engineering, and machine learning. It provides a simple yet powerful
tool for analyzing and predicting the behavior of continuous variables based on their
relationships with other variables. However, it’s essential to assess the assumptions of
linear regression and interpret the results with caution, especially in the presence of
non-linear relationships or influential outliers.
Q1. A company manufactures an electronic device that can be used over a wide
range of temperatures. The company knows that increased temperature shortens
the lifespan of the device. A study was conducted in which the lifespan of the
device was measured as a function of temperature, and the following data were obtained:

Temp. (in °C)         10   20   30   40   50   60   70   80   90
Lifespan (in hours)   420  365  285  220  176  117  69   34   5

State the simple linear regression line in the form y = a + bx.


Q2. Find the simple linear regression equation for the following dataset having
dependent variable ‘y’ and independent variable ‘x’:

x   5   2   6   8   9   3   7
y   3   7   4   10  5   6   4

Q3. A science teacher recorded the length of time, y minutes, taken to travel to
school when leaving home x minutes after 8 am on each day of the week. The
results are as follows:

x   0   10  20  30  40  50  60
y   18  27  28  39  39  48  51

i) Calculate the linear regression equation for the above dataset.
ii) Plot the dataset on a graph and draw the regression line on the scatter
diagram.
Q4. The following data set shows the monthly sales of different ice-cream brands
and the money spent on online advertising by each company.

Monthly sales (in lakhs)         40  55  54   95  48
Advertising expense (in lakhs)   1   2   1.5  5   3

Find the linear regression line in the form y = a + bx.
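One way to check a hand-computed answer to exercises of this kind is to run the same least-squares arithmetic in code. The sketch below does this for the temperature and lifespan data of Q1, taking temperature as x and lifespan as y. The variable names are illustrative assumptions, and the same steps can be reused for the other exercises by swapping in their data.

```python
# Data from Q1: temperature (x, in degrees Celsius) and lifespan (y, in hours)
temperature = [10, 20, 30, 40, 50, 60, 70, 80, 90]
lifespan = [420, 365, 285, 220, 176, 117, 69, 34, 5]

n = len(temperature)
mean_x = sum(temperature) / n
mean_y = sum(lifespan) / n

# Sum of cross-products of deviations and sum of squared deviations of x
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(temperature, lifespan))
s_xx = sum((x - mean_x) ** 2 for x in temperature)

b = s_xy / s_xx          # slope
a = mean_y - b * mean_x  # intercept

print(f"y = {a:.2f} + ({b:.2f})x")
```

The slope comes out negative (about -5.31 hours per degree, with an intercept of about 453.56 hours), which agrees with the statement in Q1 that higher temperatures shorten the lifespan of the device.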

4.10 SUMMARY

• Correlation is a statistical measure of the strength and direction of the
relationship between two variables.
• A scatter diagram, also known as a scatter plot, is a graphical way of representing
the relationship between two variables.
• Prediction from correlation is applicable only when the relationship between the
variables is linear, not curvilinear.
• Pearson’s correlation coefficient (symbolized by r) was formulated by Karl Pearson
in 1896.
• The correlation coefficient can take values from -1 to +1, where -1 represents a
perfect negative correlation, +1 a perfect positive correlation, and 0 no linear
correlation.
• Correlation does not imply causation. A correlation shows only that the variables
are associated; it does not by itself establish a cause-and-effect relationship.
• Linear transformation involves changing each raw score by adding a constant,
subtracting a constant, multiplying by a constant, or dividing by a constant. These
changes do not affect the magnitude of the correlation coefficient; only multiplying
or dividing by a negative constant reverses its sign.
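The effect of a linear transformation on the correlation coefficient can be checked numerically. The sketch below is an illustration only: the scores are made up and the helper name pearson_r is an assumption. It computes Pearson's r from deviation scores, then recomputes it after rescaling and shifting one variable.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson's r computed from deviation scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    s_yy = sum((yi - mean_y) ** 2 for yi in y)
    return s_xy / sqrt(s_xx * s_yy)


# Hypothetical raw scores, used only for illustration
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r_raw = pearson_r(x, y)
# Multiply every x score by 2 and add 3: r stays the same
r_transformed = pearson_r([2 * xi + 3 for xi in x], y)
# Multiply every x score by a negative constant: same magnitude, opposite sign
r_negated = pearson_r([-2 * xi for xi in x], y)

print(round(r_raw, 3), round(r_transformed, 3), round(r_negated, 3))
```

Adding or multiplying by a positive constant leaves r unchanged, while a negative multiplier keeps the same magnitude but reverses the sign, because it reverses the ordering of the scores.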

4.11 ANSWERS TO IN-TEXT QUESTIONS

1. Bivariate distribution
2. Correlation
3. Scatter diagram
4. False
5. Perfect positive
6. Magnitude
7. r = +.94
8. Standard score
9. Deviation score formula
10. False
11. True
12. Discontinuity

4.12 GLOSSARY

• Correlation: It is a statistical technique used to understand the level of association
between variables.
• Scatter diagram: It is also known as a scatter plot and is a way of representing
information regarding the relationship between variables.
• Variable: It is a quantity or characteristic that can take different values.
• Linear transformation: It is a change applied to every raw score by adding,
subtracting, multiplying by, or dividing by a constant.

4.13 SELF-ASSESSMENT QUESTIONS

Q1. What do you understand by the term correlation?
Q2. “Correlation doesn’t mean causation”. Explain.
Q3. Ten employees were rated on their work performance by their manager and
by customers.
Employee id: 101, 102, 103, 104, 105, 106, 107, 108, 109, 110
Customers’ ratings: 7, 8, 4, 5, 6, 3, 9, 7, 5, 2
Manager’s ratings: 5, 6, 2, 8, 7, 5, 8, 6, 7, 6
a) Draw a scatter diagram for the two variables.
b) Calculate Pearson’s correlation coefficient.

4.14 REFERENCES

Michael, W. B. (1966). An interpretation of the coefficients of predictive validity and
determination in terms of the proportions of correct inclusions or exclusions in cells of
a fourfold table. Educational and Psychological Measurement, 26(2), 419–425.
https://doi.org/10.1177/001316446602600215

4.15 SUGGESTED READINGS

Aron, A., Aron, E.N., & Coups, E.J. (2007). Statistics for Psychology (4th Ed.).
Delhi: Prentice Hall of India.
Howitt, D., & Cramer, D. (2011). Introduction to Statistics in Psychology. London,
UK: Pearson Education Ltd.
Garrett, H.E. (2005). Statistics in Psychology and Education. Delhi: Cosmo
Publications.
King, B.M., & Minium, E.W. (2007). Statistical Reasoning in the Behavioral Sciences
(5th Ed.). Noida: Wiley.
Mangal, S.K. (2012). Statistics in Psychology and Education (2nd Ed.). Delhi:
Prentice Hall of India.
Chadha, N.K. (2009). Applied Psychometry. New Delhi: Sage Publications.
Chadha, N.K. (1991). Statistics for Behavioral and Social Sciences. New Delhi:
Reliance Publishing House.
Chadha, N.K., & Sehgal, R.L. (1984). Statistical Methods in Psychology. New Delhi:
ESS Publications.
