GNS 311 Adv Psychometrics

The document outlines a course on advanced psychometric studies, covering basic concepts, statistical classification, and the application of statistics in psychometric tests. It explains the importance of psychometrics in measuring psychological constructs and includes details on descriptive and inferential statistics, as well as the use of software like SPSS. Additionally, it discusses types of data, measurement scales, and the differences between parametric and non-parametric tests.

ADVANCED PSYCHOMETRIC STUDIES

COURSE OUTLINE

1.0 Introduction:

1.1 Basic concepts in Psychometrics

1.2 Statistical Classification

1.3 Psychometric tools

2.0 Application of Descriptive and Inferential Statistics in Psychometric Tests:

2.1 Measures of Central Tendency

2.2 Relate the Measures of Central Tendency to Psychometric tests

2.3 Measures of Variability and their psychometric relevance

2.4 Transformed Scores and their psychometric relevance

2.5 Inferential Statistics and its relationship with Psychometrics

3.0 Computer Application in Statistics:

3.1 Introduction to SPSS (Statistical Package for the Social Sciences)

3.2 Use of SPSS as it relates to Psychometrics

3.3 Appreciate the use of SPSS Software

INTRODUCTION
Psychometrics is a field of study within psychology concerned with the theory and technique
of measurement which includes the measurement of knowledge, abilities, attitudes,
and personality traits. The field is primarily concerned with the study of differences between
individuals. It is a scientific discipline concerned with the question of how psychological
constructs (e.g., intelligence, neuroticism, or depression) can be optimally related to observables
(e.g., outcomes of psychological tests, genetic profiles, neuroscientific information).

Simply, psychometrics generally refers to specialized fields within psychology and education
devoted to testing, measurement, assessment, and related activities. It is concerned with the
objective measurement of latent constructs that cannot be directly observed. Examples of latent
constructs include intelligence, introversion, mental disorders, and educational achievement.

BASIC CONCEPTS IN PSYCHOMETRICS


1. Variable: an attribute of an object of study; a characteristic of interest for each person or
thing in a population. Variables may be numerical or categorical. Numerical variables take
on values with equal units, such as weight in pounds or time in hours. Categorical variables
place the person or thing into a category, e.g. gender (male or female).
2. Population: the entire set of items from which you draw data for a statistical study. It can
be a group of individuals, a set of items, etc., and it makes up the data pool for a study. An
example of a population would be the entire student body at a school: all the students who
study in that school at the time of data collection. Likewise, students in Nigeria can be a
study population, and the students of FEDPOLEL are a population. Studying a whole
population is often difficult or unachievable, hence we rather study a sample of the
population.
3. Sample: a part or small section selected from the population; the process of such selection
is called sampling. A sample represents the group of interest from the population and is
used to stand in for the data as a whole. The sample should be an unbiased subset of the
population that best represents the whole data. To study the population, we select a sample.
The idea of sampling is to select a portion (or subset) of the larger population and study that
portion (the sample) to gain information about the population. The results obtained for the
groups who took part in the study can then be extrapolated to generalize to the population.

The process of collecting data from a small subsection of the population and then using it to
generalize over the entire set is called Sampling.

Samples are used when:


 The population is too large to collect data from every member.
 The data collected is not reliable.
 The population is hypothetical and is unlimited in size.
4. Parameter: a statistical measurement, such as the mean or variance, of the population. It is
a numerical quantity or attribute of a population that is estimated using data collected from
the population: a number that summarizes data for an entire population.
5. Statistics: are numbers that summarize data from a sample, i.e. some subset of the entire
population.

Example: A researcher wants to estimate the average height of women aged 20 years or older.
From a simple random sample of 45 women, the researcher obtains a sample mean height
of 63.9 inches.
Parameter: average height of all women aged 20 years or older.
Statistic: the sample mean height of 63.9 inches from the sample of 45 women.

A nutritionist wants to estimate the mean amount of sodium consumed by children under the age
of 10. From a random sample of 75 children under the age of 10, the nutritionist obtains a
sample mean of 2993 milligrams of sodium consumed.
Parameter: mean amount of sodium consumed by children under the age of ten.
Statistic: the mean of 2993 milligrams of sodium obtained from the sample of 75 children.
6. Data: any information that has been collected, observed, generated or created to validate
original research findings. Data can be generated from observation, self-report, etc.
7. Measurements: process of observing and recording the observations that are collected as
part of a research effort. Levels of measurement and reliability of measurement are the two
major concepts under measurement.
8. Hypothesis: a testable statement about a population. Usually it is required to make
decisions about populations on the basis of sample information; such decisions are called
statistical decisions. In attempting to reach decisions it is often necessary to make
assumptions about the population involved. Such assumptions, which are not necessarily
true, are called statistical hypotheses.
9. Null Hypothesis and Alternative Hypothesis: a hypothesis which is tested for possible
rejection under the assumption that it is true is called a null hypothesis and is denoted by
H0. The hypothesis which differs from the null hypothesis H0, and which is accepted when
H0 is rejected, is called the alternative hypothesis and is denoted by H1; it is the hypothesis
against which we test the null hypothesis.

STATISTICAL CLASSIFICATION

Statistics is a branch of applied mathematics that involves the collection, description, and
analysis of data, and the inference of conclusions from it. Statisticians, people who do
statistics, are particularly concerned with determining how to draw reliable conclusions about
large groups and general events from the behaviour and other observable characteristics of
small samples. These small samples represent a portion of the large group or a limited number
of instances of a general phenomenon.

TYPES OF STATISTICS
The two major areas of statistics are known as descriptive statistics, which describes the
properties of sample and population data, and inferential statistics, which uses those properties
to test hypotheses and draw conclusions.

Descriptive Statistics
Descriptive statistics mostly focus on the central tendency, variability, and distribution of
sample data. Central tendency is an estimate of the typical element of a sample or population,
and includes descriptive statistics such as the mean, median, and mode. Variability refers to a
set of statistics that show how much difference there is among the elements of a sample or
population on the characteristics measured, and includes metrics such as the range, variance,
and standard deviation.

Inferential Statistics
This type of statistics is used to interpret the meaning of descriptive statistics: once the data
has been collected, analysed, and summarised, we use inferential statistics to draw conclusions
about the meaning of the collected data.

4
Inferential statistics are tools that statisticians use to draw conclusions about the characteristics
of a population from the characteristics of a sample, and to decide how certain they can be of
the reliability of those conclusions. Based on the sample size and distribution, statisticians can
calculate the probability that statistics, which measure the central tendency, variability,
distribution, and relationships between characteristics within a data sample, provide an accurate
picture of the corresponding parameters of the whole population from which the sample is
drawn.

Inferential statistics are used to make generalizations about large groups, such as estimating
average demand for a product by surveying a sample of consumers' buying habits or to attempt
to predict future events, such as projecting the future return of a security or asset class based on
returns in a sample period.

Difference between Descriptive and Inferential Statistics


Descriptive statistics are used to describe or summarize the characteristics of a sample or data
set, such as a variable's mean, standard deviation, or frequency. Inferential statistics, in contrast,
employs any number of techniques to relate variables in a data set to one another, for example
using correlation or regression analysis. These can then be used to estimate forecasts or infer
causality.

PARAMETRIC AND NON-PARAMETRIC TESTS


In the literal meaning of the terms, a parametric statistical test is one that makes assumptions
about the parameters (defining properties) of the population distribution(s) from which one's data
are drawn, while a non-parametric test is one that makes no such assumptions.

Parametric Test: a hypothesis test which provides generalisations for making statements about
the mean of the parent population. The t-test, based on Student's t-statistic, is often used in
this regard.

The t-statistic rests on the underlying assumptions that the variable is normally distributed and
that the mean is known or assumed to be known. The population variance is estimated from the
sample. It is assumed that the variables of interest in the population are measured on an
interval scale.

Most of the statistical tests we perform are based on a set of assumptions. When these
assumptions are violated the results of the analysis can be misleading or completely erroneous.

Typical assumptions are:

 Normality: data have a normal distribution (or at least are symmetric).
 Homogeneity of variances: data from multiple groups have the same variance.
 Linearity: data have a linear relationship.
 Independence: data are independent.

Non-Parametric Test: is defined as the hypothesis test which is not based on underlying
assumptions, i.e. it does not require population’s distribution to be denoted by specific
parameters.

The test is mainly based on differences in medians. Hence, it is alternately known as the
distribution-free test. The test assumes that the variables are measured on a nominal or ordinal
level. It is used when the independent variables are non-metric.

BASIS FOR COMPARISON | PARAMETRIC TEST | NONPARAMETRIC TEST
Meaning | A statistical test in which specific assumptions are made about the population parameter | A statistical test used in the case of non-metric independent variables
Basis of test statistic | Distribution | Arbitrary
Measurement level | Interval or ratio | Nominal or ordinal
Measure of central tendency | Mean | Median
Information about population | Completely known | Unavailable
Applicability | Variables | Variables and attributes
Correlation test | Pearson | Spearman

TYPES OF DATA
Quantitative data is information about quantities, and therefore numbers; qualitative data is
descriptive, and concerns phenomena which can be observed but not measured, such as language.

Qualitative Data: is defined as non-numerical data, such as text, video, photographs or audio
recordings. This type of data can be collected using diary accounts or in-depth interviews, and
analyzed using grounded theory or thematic analysis. Qualitative research is the process of
collecting, analyzing, and interpreting non-numerical data, such as language. Qualitative research
can be used to understand how an individual subjectively perceives and gives meaning to their
social reality.

Quantitative Data: or research involves the process of objectively collecting and analyzing
numerical data to describe, predict, or control variables of interest. The goals of quantitative
research are to test causal relationships between variables, make predictions and generalize
results to wider populations.

TYPES OF MEASUREMENT

DISCRETE AND CONTINUOUS DATA


Just as there are many ways to create data, there are many data types. There are structured
and unstructured data; there is qualitative and quantitative data; and finally there is discrete
vs. continuous data, a distinction that is fundamental for anyone who works with data.

Discrete data: data that can only take on certain values. These values do not have to be whole
numbers, but they are fixed. Discrete data contains only finite values, the subdivision of which
is not possible. It includes only those values which are separate and can only be counted in
whole numbers or integers, which means that the data cannot be split into fractions or
decimals.
Discrete Data Examples: the number of students in a class, the number of chocolates in a bag,
the number of strings on a guitar, the number of fish in an aquarium, etc.
In many cases, discrete data can be prefixed with "the number of". For example:
The number of students who have attended the class;
The number of customers who have bought different products;
The number of groceries people are purchasing every day.
This data type is mainly used for simple statistical analysis because it is easy to summarize
and compute. In practice, discrete data is displayed using bar graphs, stem-and-leaf plots
and pie charts.

Continuous data: data that can take any value. Over time, some continuous data can
change. It may take any numeric value within a finite or infinite range of potential values.
Continuous data can be broken down into fractions and decimals, i.e. depending on
measurement accuracy it can be subdivided into ever smaller sections.
Continuous data is considered the complete opposite of discrete data.
Variables in continuous data sets often carry decimal points, with the number stretching out as
far as measurement allows. Typically, continuous data changes over time and can have
different values at different time intervals, which might not always be whole numbers.
Continuous Data Examples: measurement of the height and weight of a student, daily
temperature measurements of a place, wind speed measured daily, etc. Continuous data is
measured using specific tools and displayed with line graphs and histograms.
Difference Between Continuous and Discrete Data
Discrete Data | Continuous Data
Data that has clear spaces between values. | Information that falls into a continuous series.
Discrete data is countable. | Continuous data is measurable.
There are distinct or different values in discrete data. | Every value within a range is included in continuous data.
A bar graph is used to graphically represent discrete data. | A histogram is used to graphically represent continuous data.
Ungrouped frequency distribution of discrete data is performed against a single value. | Grouped frequency distribution of continuous data is performed against a value group.
Points in a graph of a discrete function remain unconnected. | Points are associated with an unbroken line.

SCALES OF MEASUREMENT
Before we can decide what statistical tests to carry out, or interpret results, we need
knowledge of the scales of measurement. Understanding the distinctions among the scales of
measurement matters because certain types of measurements have limited capabilities and
because different statistical procedures are appropriate for data collected on different scales.
Measurement is a process of classifying or assigning numbers to observations in order to
differentiate these observations with respect to some criterion or property (Mendenhall, McClave
and Ramey, 1977). Four types of scales are commonly used: nominal, ordinal, interval and
ratio scales.

The Nominal Scale


This is a limited type of measurement involving only classification, where the distinction is
based on classes or categories. Labels are given in order to distinguish one class from another.
For example, on a questionnaire you might be asked to indicate your sex (whether you are
male or female). It can be seen that a nominal scale allows us to make qualitative distinctions;
we make no attempt to measure the size of the response. Note that where scores are assigned
to the responses (e.g. male = 1 and female = 2), the scores are only category labels.

The remaining three levels of measurement are used for quantitative variables
The Ordinal Scale (assigning scores so that they represent the rank order of the individuals)
In an ordinal scale, classification is based on some defined variable. The classes are ordered on
some continuum, so it can be said that one class is higher than another. As the name of this
scale implies, the observations made by the researcher are ordered, from biggest to smallest.
For example, a teacher may rank the performance of pupils in his class. The information
provided will tell us who performed best, second best, and so on. However, we cannot know
how much better the best student performed than the second best, which limits the deductions
we can make from this type of data.

The Interval Scale


The interval level of measurement involves assigning scores so that they represent the precise
magnitude of the difference between individuals, but a score of zero does not actually represent
the complete absence of the characteristic. A classic example is the measurement of heat using
the Celsius or Fahrenheit scale. The difference between temperatures of 20°C and 25°C is
precisely 5°, but a temperature of 0°C does not mean that there is a complete absence of heat. In
psychology, the intelligence quotient (IQ) is often considered to be measured at the interval
level.
The Ratio Scale (measurement involves assigning scores in such a way that there is a true
zero point that represents the complete absence of the quantity)
A ratio scale has an absolute zero point, so one number can justifiably be stated to be a certain
multiple of another. For example, imagine that Employee A working in a factory successfully
packs 250 crates of soft drinks in a day, while Employee B packs 500 crates. We would be
justified in stating that the second worker packed twice as many crates of soft drinks as the
first. Here, it is possible to form ratios and make comparisons based on the magnitude of
measurements. With a ratio scale we can draw the same conclusions as with the ordinal and
interval scales, as well as make such ratio comparisons.

PSYCHOMETRIC TOOLS

Questionnaire

Interview

Case history

Work sample

Observations

MEASUREMENT OF CENTRAL TENDENCY

The aim of measuring central tendency is to be able to describe the scores of a group of
individuals with a single measurement: one value that is used to describe the group.
Gravetter and Wallnau (1985) define central tendency as a statistical measure that identifies
the single most representative score for an entire distribution.

The usefulness of measures of central tendency lies in their ability to differentiate between
groups of individuals. The three most common measures of central tendency are the mean, the
median and the mode. They have different characteristics, and their computation also differs.

The Mean
The arithmetic mean, commonly known as the mean is computed by adding the score of a set of
measurements divided by the number of scores.
x̄ = ∑X / N

where:

x̄ = the mean

∑ = summation sign

X = set of scores

N = number of scores

To determine the mean of a frequency distribution, simply multiply each score by its frequency
and then add up the results. Thereafter divide your result by the number of scores (N), which can
be found by summing the frequencies.

Example:

Determine the mean score of students in a class

Test score (x)   f      fx
56               3      168
60               2      120
48               1       48
51               5      255
63               4      252
Total          N = 15   ∑fx = 843

Mean = ∑fx / N = 843 / 15 = 56.2
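The same computation can be sketched in Python (standard library only), using the scores and frequencies from the table above:

```python
# Mean of a frequency distribution: multiply each score by its
# frequency, sum the products, then divide by the total frequency N.
scores = [56, 60, 48, 51, 63]
freqs  = [3, 2, 1, 5, 4]

sum_fx = sum(x * f for x, f in zip(scores, freqs))  # 843
n = sum(freqs)                                      # 15
mean = sum_fx / n
print(mean)  # 56.2
```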

The mean of a grouped frequency distribution

Sometimes the data we have at hand have already been grouped, and we want to find the mean
of this distribution. To do this we apply the formula for the mean of grouped data:

x̄ = ∑fXc / N

Scores    Midpoint (Xc)   f       fXc
65-69     66              1        66
60-64     63              2       126
55-59     58              3       174
50-54     52              4       208
45-49     47              6       282
40-44     44              6       264
35-39     37              5       185
30-34     33              8       264
25-29     27              4       108
20-24     22              3        66
Sums                    N = 42   1743

Mean = ∑fXc / N = 1743 / 42 = 41.5
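A minimal Python sketch of the grouped-mean computation, taking the midpoints and frequencies exactly as listed in the table above:

```python
# Mean of a grouped frequency distribution: multiply each class
# midpoint (Xc) by its class frequency (f), sum, and divide by N.
classes = [  # (class midpoint Xc, frequency f) pairs from the table
    (66, 1), (63, 2), (58, 3), (52, 4), (47, 6),
    (44, 6), (37, 5), (33, 8), (27, 4), (22, 3),
]
sum_fxc = sum(xc * f for xc, f in classes)  # 1743
n = sum(f for _, f in classes)              # 42
mean = sum_fxc / n
print(round(mean, 1))  # 41.5
```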
The Median
When we are interested in the exact midpoint of a distribution, the median is the appropriate
statistic. The median divides a given distribution into exactly equal halves; in other words,
half of the scores are equal to or greater than the median, and half are equal to or less than it.

To find the median in a given distribution: first rearrange the scores in order of magnitude (i.e.
ascending order). The middle score is the median. If you have an even number of scores, select
the middle pair of scores, add them together and divide by 2.

For example: 10, 7, 5, 6, 7, 8, 9, 4

When the data are rearranged, we have:

4 5 6 7 7 8 9 10

Median = (7 + 7) / 2 = 14 / 2 = 7
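The procedure can be sketched in Python; the standard-library statistics module gives the same answer:

```python
import statistics

# Median: sort the scores; with an even number of scores, average
# the middle pair.
scores = [10, 7, 5, 6, 7, 8, 9, 4]
ordered = sorted(scores)  # [4, 5, 6, 7, 7, 8, 9, 10]
n = len(ordered)
if n % 2 == 1:
    median = ordered[n // 2]
else:
    median = (ordered[n // 2 - 1] + ordered[n // 2]) / 2
print(median)                      # 7.0
print(statistics.median(scores))   # 7.0 (stdlib agrees)
```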

The Mode
The mode is the third measure of central tendency. The mode is the most frequently occurring
score in a given distribution. For example, the political affiliation of a sample of university
students is given below.

Party                               Frequency
People's Democratic Party {PDP}     18
Alliance for Democracy {AD}         11
All People's Party {APP}            9

In this distribution, the mode is the category PDP because it has the highest frequency.

It should be noted that in computing the mean, every score in the distribution is used; this is
what makes it the most useful, and generally preferred, measure of central tendency. However,
in situations where it is difficult or impossible to compute the mean, the other measures of
central tendency, i.e. the median and the mode, may be used. If a nominal scale is used to
measure a distribution, it is not possible to compute the mean or the median; the most
appropriate measure of central tendency will then be the mode. If extreme scores exist in a
distribution, they may affect the mean, because all the scores in the distribution are considered:
one or two extreme scores can have a large influence on the mean and distort the single value
that represents the entire distribution. Therefore, the median is the most appropriate measure
of central tendency for skewed distributions. That may be the reason why the median is used
to report income levels: a few individuals earn as much as N50,000.00 a month, and the mean
is distorted by these extreme incomes, making it unrepresentative of the income of most people.
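A minimal Python sketch of finding the mode of nominal data, using the party-affiliation frequencies above:

```python
from collections import Counter

# Mode: the most frequently occurring value. It works for nominal
# data such as party affiliation, where a mean or median is undefined.
affiliations = (["PDP"] * 18) + (["AD"] * 11) + (["APP"] * 9)
counts = Counter(affiliations)
mode, freq = counts.most_common(1)[0]
print(mode, freq)  # PDP 18
```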

MEASURES OF VARIABILITY
If two data sets have the same mean, median and mode, it would still be wrong to conclude
that they are the same, because they may differ in the variation about their centres. Variability
enables us to know the degree to which scores in a distribution are far apart or close together;
in other words, it tells us how spread out the scores in a distribution are. Therefore, a good
measure of variability is one that gives us accurate information about the distribution (whether
spread out or clustered together).

Data Set I:  40 38 42 40 39 39 43 40 39 40

Data Set II: 46 37 40 33 42 36 40 47 34 45

Figure 2.10 Dot Plots of Data Sets

13
The two sets of ten measurements each centre at the same value: they both have mean, median,
and mode 40. Nevertheless, a glance at the figure shows that they are markedly different. In
Data Set I the measurements vary only slightly from the centre, while in Data Set II the
measurements vary greatly. Just as we attached numbers to a data set to locate its centre, we
now wish to associate with each data set numbers that measure quantitatively how the data
either scatter away from the centre or cluster close to it. These new quantities are called
measures of variability, and we will discuss two of them.

The Range
The range is the simplest measure of variation and is defined as the difference between the
highest and lowest values in a distribution. In calculating the range, only the two extreme
scores are considered and the other scores in the distribution are ignored, which makes the
range a rather poor measure of variability: it does not describe the spread of the entire
distribution. It is nevertheless a measure of variability, because it indicates the size of the
interval over which the data points are distributed.

Two sets of scores may have the same range even though there is a clear difference in the
distribution of scores in each set: the scores might be clustered together in the first set and
spread out in the second. Consider the sets of scores below:

Set 1: 10 55 35 40 38 50 45 52 60 100

Set 2: 10 20 25 60 70 80 90 85 80 100

From the above, the first set of scores is clustered in the middle of the distribution, whereas
the second set is spread out. For this reason, the range is not a good measure of variability.

Note: A smaller range indicates less variability (less dispersion) among the data, whereas a larger
range indicates the opposite.
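A short Python illustration of the point above, using the two sets of scores: both sets share the same range even though their spreads clearly differ.

```python
# Range: highest score minus lowest score. Both sets below share the
# same range (90) yet are spread very differently, which is why the
# range alone is a weak measure of variability.
set_1 = [10, 55, 35, 40, 38, 50, 45, 52, 60, 100]
set_2 = [10, 20, 25, 60, 70, 80, 90, 85, 80, 100]
range_1 = max(set_1) - min(set_1)
range_2 = max(set_2) - min(set_2)
print(range_1, range_2)  # 90 90
```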

The Standard Deviation (SD)


SD is a measure of the distance between each score and the mean of that particular distribution.
In other words, the mean of a distribution is used as a reference point and the extent to which
each score is far from or near the mean gives the variability of the distribution. In short, the SD is
a kind of average of all the deviations from the mean.

In calculating the standard deviation:

i. Find each deviation from the mean: x = X − x̄, where X is the score and x̄ is the mean.
ii. Square each deviation, finding x².
iii. Sum the squared deviations, finding ∑x².
iv. Divide the sum by n, finding ∑x²/n.
v. Take the positive square root of the result of step iv.

S.D. = √(∑x²/n)

Where x is a deviation from the mean and n is the sample size.

Variance is the average of all squared deviations from the mean. Since the deviations were
squared, we must undo this to return to the original units; that is why the SD is the square
root of the variance.
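The five steps can be sketched in Python, applied to Data Sets I and II from earlier (standard library only); the larger SD of Data Set II reflects its greater spread:

```python
import math

# Population standard deviation, following the five steps above:
# deviations from the mean, squared, summed, divided by n, square root.
def std_dev(scores):
    n = len(scores)
    mean = sum(scores) / n
    sum_sq = sum((x - mean) ** 2 for x in scores)  # sum of squared deviations
    variance = sum_sq / n
    return math.sqrt(variance)

data_set_1 = [40, 38, 42, 40, 39, 39, 43, 40, 39, 40]
data_set_2 = [46, 37, 40, 33, 42, 36, 40, 47, 34, 45]
print(round(std_dev(data_set_1), 2))  # 1.41 — scores cluster near the mean
print(round(std_dev(data_set_2), 2))  # 4.73 — scores spread widely
```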

TRANSFORMED SCORES
Score transformation is the process of converting raw scores into transformed scores, for two
main purposes:

 It gives meaning to the scores and allows some kind of interpretation of the scores.

 It allows direct comparison of two scores. For example, a score of 33 on the first test might not
mean the same thing as a score of 33 on the second test.

Example: If a student, upon viewing a recently returned test, found that he or she had made a
score of 33, would that be a good score or a poor score? Based only on the information given, it
would be impossible to tell. The 33 could be out of 35 possible questions and be the highest
score in the class, or it could be out of 100 possible points and be the lowest score, or anywhere
in between.

The transformations belong to two general types: percentile ranks and linear transformations.
Percentile ranks are advantageous in that the average person has an easier time understanding
and interpreting their meaning. However, percentile ranks also have a rather unfortunate
statistical property that makes their use generally unacceptable among the statistically
sophisticated.

Percentile Ranks Based on the Sample

A percentile rank is the percentage of scores that fall below a given score. For
example, a raw score of 33 on a test might be transformed into a percentile rank of
98 and interpreted as "You did better than 98% of the students who took this test."
In that case the student would feel pretty good about the test. If, on the other hand, a
percentile rank of 3 was obtained, the student might wonder what he or she was
doing wrong.

The procedure for finding the percentile rank is as follows:

 Rank order the scores from lowest to highest.


 Find the proportion of scores that fall below the score and convert to a percentage by multiplying
by 100.
 Find one-half the proportion of scores that fall at the score and convert to a percentage by
multiplying by 100.
 Add the percentage of scores that fall below the score to one-half the percentage of scores that
fall at the score.

The result is the percentile rank for that score.

It's actually easier to demonstrate and perform the procedure than it sounds. For
example, suppose the obtained scores from 11 students were:

33 28 29 37 31 33 25 33 29 32 35

You want to know the percentile rank for the score of 31. The first step would be to
rank order the scores from lowest to highest.

25 28 29 29 31 32 33 33 33 35 37

Computing the percentage falling below a score of 31, for example, gives the value
4/11 = .364 or 36.4%. The four in the numerator reflects that four scores (25, 28, 29,
and 29) were less than 31. The 11 in the denominator is N, or the number of scores.
The percentage falling at a score of 31 would be 1/11 = .0909 or 9.09%. The
numerator being the number of scores with a value of 31 and the denominator again
being the number of scores. One-half of 9.09 would be 4.55. Adding the percentage
below to one-half the percentage within would yield a percentile rank of 36.4 + 4.55
or 40.95%. The computations are illustrated in the figure below.

Similarly, for a score of 33, the percentile rank would be computed by adding the
percentage below (6/11=.5454 or 54.54%) to one-half the percentage within ( 1/2 *
3/11 = .1364 or 13.64%), producing a percentile rank of 68.18%. The 6 in the
numerator of percentage below indicates that 6 scores were smaller than a score of
33, while the 3 in the percentage within indicates that 3 scores had the value 33. All
three scores of 33 would have the same percentile rank of 68.18%. The
computations are illustrated in the figure below.

The preceding procedure can be described in an algebraic expression as follows:

Application of this algebraic procedure to the score values of 31 and 33 would give
the following results:

Note that these results are within rounding error of the percentile rank computed
earlier using the procedure described in words.

When computing the percentile rank for the smallest score, the frequency below is
zero (0), because no scores are smaller than it. Using the formula to compute the
percentile rank of the score of 25:

Computing the percentile rank for the largest score, 37, gives:

The last two cases demonstrate that a score can never have a percentile rank equal to or less
than zero, or equal to or greater than 100. Percentile ranks may come closer to zero or one
hundred than those obtained here if the number of scores is increased.

The percentile ranks for all the scores in the example data may be computed as
follows:

Percentile Ranks Based on the Sample

Score           25    28    29    29    31    32    33    33    33    35    37
Percentile Rank 4.6   13.6  27.3  27.3  40.9  50    68.2  68.2  68.2  86.4  95.4
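The percentile-rank procedure can be sketched as a small Python function; applied to the example scores, it reproduces the worked values for 31 and 33:

```python
# Percentile rank: percentage of scores below the score, plus half
# the percentage of scores at the score.
def percentile_rank(scores, value):
    n = len(scores)
    below = sum(1 for s in scores if s < value)  # scores strictly less
    at = sum(1 for s in scores if s == value)    # scores equal
    return 100 * (below + 0.5 * at) / n

scores = [33, 28, 29, 37, 31, 33, 25, 33, 29, 32, 35]
print(round(percentile_rank(scores, 31), 1))  # 40.9
print(round(percentile_rank(scores, 33), 1))  # 68.2
```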

INFERENTIAL STATISTICS

Inferential statistics is a branch of statistics that makes use of various analytical tools to draw
inferences about population data from sample data. There are two main types of inferential
statistics: hypothesis testing and regression analysis. The samples chosen in inferential statistics
need to be representative of the entire population. The goal of inferential statistics is to make
generalizations about a population: a statistic is taken from the sample data (e.g., the sample
mean) and used to make inferences about the population parameter (e.g., the population mean).

Types of Inferential Statistics

Inferential statistics can be classified into hypothesis testing and regression analysis. Hypothesis
testing also includes the use of confidence intervals to test the parameters of a population. Given
below are the different types of inferential statistics.
Inferential Statistics

Hypothesis testing: Z test, F test, t test, ANOVA, Wilcoxon signed-rank test, Mann-Whitney U test.

Regression analysis: linear regression, nominal regression, logistic regression, ordinal regression.

Hypothesis Testing
Hypothesis testing is a type of inferential statistics that is used to test assumptions and draw
conclusions about the population from the available sample data. It involves setting up a null
hypothesis and an alternative hypothesis, followed by conducting a statistical test of significance.
A conclusion is drawn based on the value of the test statistic, the critical value, and
the confidence intervals. A hypothesis test can be left-tailed, right-tailed, or two-tailed. Given
below are certain important hypothesis tests that are used in inferential statistics.
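As a concrete sketch, here is a two-tailed one-sample Z test in Python (standard library only). The numbers — the hypothesised mean, the known population SD, the sample size and sample mean — are invented for illustration, not taken from the course material:

```python
from statistics import NormalDist

# Two-tailed one-sample z-test: H0 says the population mean is mu_0;
# H1 says it differs. The population SD is assumed known.
mu_0 = 100          # mean under the null hypothesis H0 (hypothetical)
sigma = 15          # population standard deviation (hypothetical, known)
n = 36              # sample size (hypothetical)
sample_mean = 106   # observed sample mean (hypothetical)

z = (sample_mean - mu_0) / (sigma / n ** 0.5)   # test statistic
p_value = 2 * (1 - NormalDist().cdf(abs(z)))    # two-tailed p-value
reject_h0 = p_value < 0.05
print(round(z, 2))   # 2.4
print(reject_h0)     # True: reject H0 at the 5% significance level
```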

Regression Analysis
Regression analysis is used to quantify how one variable changes with respect to another.
There are many types of regression, such as simple linear, multiple linear, nominal, logistic,
and ordinal regression. The most commonly used regression in inferential statistics is linear
regression. Linear regression estimates the effect of a unit change in the independent variable
on the dependent variable.
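A minimal least-squares sketch of simple linear regression in Python, with hypothetical data; the slope estimates how much y changes per unit change in x:

```python
# Simple linear regression by ordinary least squares (hypothetical data).
x = [1, 2, 3, 4, 5]
y = [2.1, 4.0, 6.2, 7.9, 10.1]

n = len(x)
mean_x, mean_y = sum(x) / n, sum(y) / n
# slope = covariance of x and y divided by variance of x
slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
         / sum((a - mean_x) ** 2 for a in x))
intercept = mean_y - slope * mean_x
print(round(slope, 2))  # ~1.99: y rises about 2 units per unit of x
```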

