CHAPTER 1: INTRODUCTION TO STATISTICS AND DATA PRESENTATION

Statistics – is a branch of applied mathematics that involves the collection, description, analysis, and inference of conclusions from quantitative data. It is concerned with determining how to draw reliable conclusions about large groups and general phenomena from the observable characteristics of small samples that represent only a small portion of the large group or a limited number of instances of a general phenomenon.

TWO MAJOR AREAS OF STATISTICS

Descriptive statistics – describes the properties of sample and population data. It refers to the methods of collection, extraction, summary, and presentation of data, including measures of central tendency and measures of variability. The purpose of descriptive statistics is to facilitate the presentation and interpretation of data.

Inferential statistics – uses those properties to test hypotheses and draw conclusions. It refers to using the properties to test hypotheses and draw conclusions based on the evidence obtained from a sample.

USES OF STATISTICS

1. Statistics helps in providing a better understanding and accurate description of nature's phenomena.
2. Statistics helps in the proper and efficient planning of a statistical inquiry in any field of study.
3. Statistics helps in collecting appropriate quantitative data.
4. Statistics helps in presenting complex data in a suitable tabular, diagrammatic and graphic form for an easy and clear comprehension of the data.
5. Statistics helps in understanding the nature and pattern of variability of a phenomenon through quantitative observations.
6. Statistics helps in drawing valid inferences, along with a measure of their reliability, about the population parameters from the sample data.

AREAS THAT USE STATISTICS

[Figure: areas that use statistics]

IMPORTANCE OF STATISTICS IN EDUCATION

1. Statistics helps in the collection and presentation of data in a calculated and systematic manner. Statistics in education helps in the orderly arrangement of both processed and unprocessed data.
2. Statistics makes the teaching and learning process more efficient. Statistics in education, with special consideration to the measurement and evaluation of concepts, is an essential part of the teaching and learning process.
3. Statistics helps in the provision and presentation of the exact type of description. It practically helps teachers to give an accurate description of data. This can be seen in cases such as the administration of a pupil or the observation of a child.
4. Statistics serves as a reliable source of history in education. This is because statistical documentation is always empirical and easy to understand.
5. Statistics helps in the summary and presentation of results. Statistics helps one to make data precise, concise, and meaningful.

POPULATION VS. SAMPLE

▪ In statistics, we commonly hear the words sample and population. In most experimental research, we usually use the term sample, while in educational research, we commonly use population.
▪ A population is the entire group of elements to be studied, while a sample is a specific group or subset of a population. Since a sample is a subset of a population, the sample size is always less than the population size. To have a better understanding, let us say that the population refers to the senior high school students of Pampanga State Agricultural University. The sample that will be used in the study is the students of the STEM strand. In research, population does not always refer to people.
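
The relationship between a population and a sample can be sketched in a few lines of Python. This is only an illustration: the scores are made-up values, and the names `population` and `sample` are hypothetical, not taken from any real study. The sketch draws a sample as a subset of the population and compares a descriptive measure (the mean) computed on each.

```python
import random
import statistics

# Hypothetical population: made-up scores for 500 students.
rng = random.Random(42)  # fixed seed so the sketch is repeatable
population = [rng.randint(60, 100) for _ in range(500)]

# A sample is a subset of the population, so its size is smaller.
sample = rng.sample(population, 50)

# The same descriptive measure, computed on the whole group and on the subset.
population_mean = statistics.mean(population)  # describes the population
sample_mean = statistics.mean(sample)          # describes the sample only

print("population size:", len(population))
print("sample size:", len(sample))
print("population mean:", population_mean)
print("sample mean:", sample_mean)
```

Describing the sample with its mean belongs to descriptive statistics. Using the sample mean to say something about the population mean is the kind of step inferential statistics deals with.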
CLASSIFICATION OF STATISTICS

1. Parametric Statistics – is an approach which assumes a random sample from a normal distribution and involves testing of hypotheses about the population parameter. The basic idea in a parametric method is that there is a set of fixed parameters that determine a probability model.

2. Nonparametric Statistics – is a statistical approach for estimation and hypothesis testing when no underlying data distribution is assumed. In this statistical technique, the set of parameters is not fixed. It is also referred to as a distribution-free method.

DATA AND VARIABLES

Data refers to observations and measurements which have been collected and analyzed in some way, often through research, and can be categorized as qualitative (attributes) or quantitative (numerical).

Variables are the characteristics or attributes that you are observing, measuring and recording data for. Some examples include height, weight, eye colour, dog breed, climate, electrical conductivity, customer service satisfaction and class attendance, just to name a few. As the word suggests, the value of a variable varies from one subject (i.e. person, place or thing) to another. Examples of variables include faculty ranks, educational attainment, salary, and sex.

LEVELS OF DATA MEASUREMENT

Data can be classified into four levels of measurement. They are (from lowest to highest level):

1. Nominal scale level
2. Ordinal scale level
3. Interval scale level
4. Ratio scale level

Categorical data is data which is grouped into categories, such as data for a 'gender' or 'smoking status' variable, while continuous data is data which is measured on a continuous numerical scale and which can take on a large number of possible values, such as data for a 'weight' or 'distance' variable.

Discrete data measures counts or numbers of events, such as data for a 'class attendance' variable. While it is numerical data, it is not measured on a continuous numerical scale, so it doesn't fit neatly into either of the classifications above. It is usually treated as continuous data, but if there are only a small number of values (such as for a 'number of children under three in family' variable) you might choose to treat them as categories instead.

One final thing to note is that any continuous data can always be turned into categorical data, by simply creating categories out of it. Continuous data for an 'age' variable could be turned into categorical data by creating categories of 11-20 years old, 21-30 years old, 31-40 years old, etc., for example, and this can be useful if you want to analyze your continuous data using statistics and statistical tests designed for categorical data. You can't go the other way around, though, and turn categorical data into continuous data, so if you have the choice then for maximum flexibility it is preferable to collect continuous data.

SOURCES OF DATA

There are two types of statistical data sources, namely the primary data source and the secondary data source.

1. Primary data source – refers to data that come from original sources and are collected first-hand. These include data from government agencies, business establishments, organizations, and individuals who carry original data or first-hand information relevant to a given problem.

2. Secondary data source – refers to data collected by others for another purpose, which may include information stored in books, the internet, brochures, journals and periodicals.
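
The remark in this part that continuous data for an 'age' variable can be turned into categories such as 11-20 and 21-30 years old can be sketched as follows. This is a minimal sketch, assuming ages are recorded in whole years; the function name `age_category` and the list of ages are hypothetical and used only for illustration.

```python
from collections import Counter


def age_category(age):
    """Map an age in whole years to a ten-year category such as '11-20 years old'."""
    if age < 1:
        raise ValueError("age must be at least 1")
    lower = ((age - 1) // 10) * 10 + 1  # 11 for ages 11..20, 21 for ages 21..30, ...
    return f"{lower}-{lower + 9} years old"


# Hypothetical ages (in whole years) collected as numerical data.
ages = [12, 19, 20, 21, 25, 30, 31, 38, 40]

# Turn each numerical value into a category, then count the categories.
categories = [age_category(age) for age in ages]
counts = Counter(categories)

for category in sorted(counts):
    print(category, counts[category])
```

Once the ages are replaced by their categories, the exact values can no longer be recovered from the categories alone, which is why the text recommends collecting the continuous values when there is a choice.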
PRESENTATION OF DATA

The data gathered in a research study should be presented in a manner that is easily understood by the audience or listeners. This presentation can be done in textual, tabular or graphical form, or in a combination of textual and tabular methods.

1. Textual presentation – data are presented in paragraph form, written and read, as a combination of text and numbers.

Example:
Of the 100 students interviewed, the following issues in library use were noted: 25 for old books, 40 for unarranged books, 15 for unsuitable lighting, and 20 for torn pages of books.

2. Tabular presentation – uses a statistical table and a systematic organization of data in columns and rows.

Parts of a statistical table:

Table heading – consists of the table number and title. The table number serves to give the table an identity.
Stubs – classifications or categories which are found at the left side of the body of the table
Box Head – the top of the column
Body – main part of the table
Footnotes – any statement or note inserted at the foot or bottom of the table
Source Notes – source of the statistics, which may be included to acknowledge the origin of the data.

3. Graphical presentation – uses graphs (bar, line, pie or circle, and pictograph) to present the data. A graph is perhaps the most attractive, effective, and convincing way of presenting data.

a. Bar Graph – used to show comparison or relationship between groups.
b. Line Graph – most useful in displaying data that changes continuously over time.
c. Pie or Circle Graph – shows percentages of data effectively.
d. Pictograph (Pictogram) – uses small identical pictures or figures of objects called isotypes in making comparisons. Each picture represents a definite quantity.

DATA COLLECTION METHOD

There are five data collection methods that a researcher can use.

1. Direct or Interview Method. The term 'interview' is derived from Latin roots meaning "to see each other". In general terms, the interview is a formal meeting between an interviewer and an interviewee where questions are asked by the former and answers are given by the latter. This method is considered the most expensive way of collecting data because it needs more time and money to conduct.

2. Indirect or Questionnaire Method. In the questionnaire method, it is not possible on the part of the researcher to conduct an intensive or in-depth study of the feelings, reactions and sentiments of the respondents. This method is relatively simple and inexpensive, for it requires only a few staff to handle it.

There are the following types of questionnaires:

▪ Computer questionnaire. Respondents are asked to answer a questionnaire which is sent electronically (for example, by e-mail). The advantages of computer questionnaires include their low cost and time-efficiency, and respondents do not feel pressured, so they can answer when they have time, giving more accurate answers. However, the main shortcoming of computer questionnaires is that sometimes respondents do not bother answering them and can just ignore the questionnaire.

▪ Telephone questionnaire. The researcher may choose to call potential respondents with the aim of getting them to answer the questionnaire. The advantage of the telephone questionnaire is that it can be completed in a short amount of time. The main disadvantage of the phone questionnaire is that it is expensive most of the time. Moreover, most people do not feel comfortable answering many questions asked over the phone, and it is difficult to get a sample group to answer a questionnaire over the phone.

▪ In-house survey. This type of questionnaire involves the researcher visiting respondents in their houses or workplaces. The advantage of an in-house survey is that respondents can give more focus to the questions. However, in-house surveys also have a range of disadvantages, which include being time consuming and more expensive, and respondents may not wish to have the researcher in their houses or workplaces for various reasons.

▪ Mail questionnaire. This sort of questionnaire involves the researcher sending the questionnaire to respondents through the post, often attaching a pre-paid envelope. Mail questionnaires have the advantage of providing more accurate answers, because respondents can answer the questionnaire in their spare time. The disadvantages associated with mail questionnaires include their being expensive and time consuming, and sometimes they end up being thrown in the bin by respondents.

3. Registration Method. This method of collecting data is commonly enforced by certain laws, ordinances or standard practices. Examples are birth certificate, marriage certificate and death certificate registration, license registration, and motor vehicle registration.

4. Observation Method. This method makes use of the different human senses in gathering information.

5. Experimentation. This method is usually conducted in laboratories where specimens are subjected to some aspects of control to find out cause and effect relationships.

SAMPLING TECHNIQUES

In research, the researcher should be able to determine the sampling technique that he or she will use. Various sampling techniques or sample designs can be used by the researcher. The choice of sampling technique depends on the nature of the problem and the kind of population that will be used. If a sample isn't randomly selected, it will probably be biased in some way and the data may not be representative of the population.

A. Probability Sampling. Probability sampling is a sampling technique in which the subjects of the population get an equal opportunity to be selected as a representative sample.

1. Simple Random Sampling - Every member and set of members has an equal chance of being included in the sample. Technology, random number generators, or some other sort of chance process is needed to get a simple random sample. Usually, this is done by getting a certain percentage of the population to be included in the study.

Example: A teacher puts students' names in a hat and chooses without looking to get a sample of students.

2. Stratified Random Sampling - The population is first split into groups. The overall sample consists of some members from every group. The members from each group are chosen randomly.

Example: A student council surveys 100 students by getting random samples of 25 freshmen, 25 sophomores, 25 juniors, and 25 seniors.

3. Cluster Random Sampling - The population is first split into groups. The overall sample consists of every member from some of the groups. The groups are selected at random.

Example: An airline company wants to survey its customers one day, so they randomly select 5 flights that day and survey every passenger on those flights.

4. Systematic Random Sampling - Members of the population are put in some order. A starting point is selected at random, and every nth member is selected to be in the sample.

Example: A principal takes an alphabetized list of student names and picks a random starting point. Every 20th student is selected to take a survey.

B. Nonprobability Sampling. Non-probability sampling is a sampling method in which not all members of the population have an equal chance of participating in the study.

1. Purposive Sampling - is a non-probability sampling method and it occurs when "elements selected for the sample are chosen by the judgment of the researcher." Also known as judgmental, selective or subjective sampling, purposive sampling relies on the judgement of the researcher when it comes to selecting the units (e.g., people, cases/organizations, events, pieces of data) that are to be studied. The main objective of a purposive sample is to produce a sample that can be logically assumed to be representative of the population.

Example: A team of researchers wanted to understand what the significance of white skin (whiteness) means to white people, so they asked white people about this. This is a homogeneous sample created on the basis of race.

2. Convenience Sampling - The researcher chooses a sample that is readily available in some non-random way. Accidental sampling is similar to convenience sampling.

Example: A researcher polls people as they walk by on the street.

3. Quota Sampling - is defined as a non-probability sampling method in which researchers create a sample involving individuals that represent a population. Researchers choose these individuals according to specific traits or qualities. They decide and create quotas so that the market research samples can be useful in collecting data.

Example: You could divide a population by the province they live in, income or education level, or sex. The population is divided into groups (also called strata) and samples are taken from each group to meet a quota.

METHODS OF COLLECTING DATA

▪ Direct Methods

a. Interview – is classified as formal and informal. In a formal interview, the interviewer uses an interview guide or set of questions during the interview process. In an informal interview, the researcher is free to ask questions without using a guide, so that there is no limit to the information obtained during the interview.

b. Observation – is classified as qualitative and quantitative. In qualitative research, it usually consists of detailed notation of behaviour, events and the context surrounding the events and behaviour. In quantitative research, it is usually employed to collect data regarding the number of occurrences in a specified period of time.

c. Documentary analysis – used when the researcher obtains the information directly from the source.

▪ Indirect Methods - A survey solicits information from people.

Examples:
Pre-election polls, marketing surveys, etc. A survey may be administered in a variety of ways:

a. Telephone interview, and
b. Self-administered questionnaire

QUESTIONNAIRE DESIGN

Key principles:

1. Keep the questionnaire as short as possible.
2. Ask short, simple and clearly worded questions.
3. Start with demographic questions to help respondents get started comfortably.
4. Use dichotomous (yes/no) and multiple choice questions.
5. Use open-ended questions cautiously.
▪ It takes time and effort to respond to the question.
▪ Literal responses can be difficult for respondents not familiar with expressing their own views and opinions on the question.
6. Avoid using leading questions.
7. Pretest the questionnaire on a small number of people.
8. Think about the way you intend to use the collected data when preparing the questionnaire.

A questionnaire is one of the common survey methods of gathering data. It can be classified as open-ended or closed-ended.

EXAMPLES OF QUESTIONNAIRE:

1. Open-ended questionnaire
▪ ____ Age
▪ ____ Sex
▪ ____ Civil Status
▪ ____ Average monthly income

2. Closed-ended questionnaire

Age
____ 20-30 yrs old
____ 31-40 yrs old
____ 41-50 yrs old

Civil Status
____ Single
____ Married
____ Widow/er
____ Single parent

Sex
____ Male
____ Female
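
Because a closed-ended questionnaire limits answers to fixed choices, the responses can be tallied directly into a frequency table for tabular presentation. The sketch below is only an illustration: the ten civil status responses are invented, and the helper name `frequency_table` is hypothetical.

```python
from collections import Counter

# Hypothetical answers to the closed-ended "Civil Status" item (invented data).
responses = [
    "Single", "Married", "Single", "Widow/er", "Married",
    "Single", "Single parent", "Married", "Single", "Married",
]


def frequency_table(answers):
    """Return (category, count, percent) rows, most frequent category first."""
    counts = Counter(answers)
    total = len(answers)
    return [
        (category, count, 100 * count / total)
        for category, count in counts.most_common()
    ]


table = frequency_table(responses)

# Print the table: categories at the left, column headings at the top.
print(f"{'Civil Status':<15}{'Count':>6}{'Percent':>9}")
for category, count, percent in table:
    print(f"{category:<15}{count:>6}{percent:>8.1f}%")
```

Open-ended answers would first have to be read and grouped into categories by hand before a tally like this is possible, which is one reason the key principles advise using them cautiously.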
3. Experiments - used when the objective is to determine the cause and effect of a certain phenomenon under controlled conditions.

Sampling – enables a researcher to pick a subgroup as a basis for making judgements about a larger group.

Steps in sampling

1. Identify the population
2. Determine the sample size
3. Select the sample

SAMPLE SIZE OF THE POPULATION

1. Slovin's formula
2. Gary's acceptable sample sizes depending on the type of research
a) Descriptive research – 10% of the population; for a smaller population it should be at least 20%
b) Correlation research – 30 subjects
c) Causal comparative research – 15 subjects per group
d) Experimental research – 15 subjects per group or 30 subjects per group

SAMPLING PLANS

A sampling plan is a method or procedure for specifying how a sample will be taken from a population.

PROBABILITY SAMPLING

It is a sampling technique in which every item in a population has an equal chance of being selected and qualifying for the sample.

a. Simple Random Sampling - a sampling technique using the concept of a lottery or fish bowl method. For example, if you computed a sample size of 50 out of a population of 200, you may write down the names or the numbers up to 200 on sheets of paper, place them in a box and mix them thoroughly. Then start drawing papers until you obtain the 50 samples.

b. Systematic Sampling - a sampling technique which selects sample units by taking every kth member of a population after arranging it alphabetically or in some other order. For instance, taking every fifth member means selecting the 5th, 10th, 15th, etc. The interval k is computed as

k = N/n

where N is the population size and n is the sample size. If N = 100 and n = 20, then k = 100/20 = 5.

c. Stratified Random Sampling - a sampling technique where the population is first divided into subsets based on homogeneity, called strata. For example, in a college of engineering student population, the stratification may be according to sex (male and female).

d. Cluster Sampling - a sampling technique which occurs when one selects the members of a sample in clusters rather than as separate individuals. It is a sampling where groups, not individuals, are randomly selected.

NON PROBABILITY SAMPLING

1. Purposive Sampling

The sample respondents are selected based on certain criteria laid down by the researcher. For example, a researcher might want to find out whether the residential houses in subdivision ABC comply with the building fire requirements. Instead of interviewing the owners of all the residential houses in subdivision ABC, he can purposely choose to interview the owners of 20 residential houses in subdivision ABC.

2. Quota Sampling

Quota sampling is a method for selecting survey participants that is a non-probabilistic version of stratified sampling.

3. Convenience Sampling

The researcher picks the sample respondents from the population that he finds convenient to interview due to their availability or accessibility. For example, a researcher might want to find out the popularity of a certain candidate in Ms. Earth 2020. He might choose his respondents within his barangay since this is more convenient for him.
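
The sample-size step and the probability sampling plans in this part can be sketched in Python. Treat this as an illustration under stated assumptions, not as the notes' own procedure. The notes name Slovin's formula without writing it out, so the commonly cited form n = N / (1 + N e^2), with e as the margin of error, is assumed here, and rounding up is a choice made for this sketch. All function names, the numbered population, the strata and the clusters are hypothetical.

```python
import math
import random


def slovin_sample_size(population_size, margin_of_error):
    """Assumed form of Slovin's formula, n = N / (1 + N * e^2), rounded up."""
    n = population_size / (1 + population_size * margin_of_error ** 2)
    return math.ceil(n)


def simple_random_sample(population, n, rng):
    """Lottery (fish bowl) draw: every member is equally likely to be picked."""
    return rng.sample(population, n)


def systematic_sample(population, n, rng):
    """Take every kth member after a random start, with k = N // n."""
    k = len(population) // n
    start = rng.randrange(k)
    return population[start::k][:n]


def stratified_sample(strata, n_per_stratum, rng):
    """Draw a random sample from every stratum (subgroup)."""
    chosen = []
    for members in strata.values():
        chosen.extend(rng.sample(members, n_per_stratum))
    return chosen


def cluster_sample(clusters, n_clusters, rng):
    """Pick whole clusters at random and keep every member of each one."""
    picked = rng.sample(sorted(clusters), n_clusters)
    return [member for name in picked for member in clusters[name]]


rng = random.Random(2020)         # fixed seed so the sketch is repeatable
population = list(range(1, 201))  # hypothetical members numbered 1..200

n = slovin_sample_size(200, 0.05)
lottery = simple_random_sample(population, 50, rng)
every_kth = systematic_sample(list(range(1, 101)), 20, rng)  # k = 100 // 20 = 5
strata = {"male": list(range(1, 121)), "female": list(range(121, 201))}
by_sex = stratified_sample(strata, 10, rng)
clusters = {f"section {i}": list(range(i * 10, i * 10 + 10)) for i in range(8)}
whole_sections = cluster_sample(clusters, 2, rng)

print("Slovin sample size for N=200, e=0.05:", n)
print("systematic sample (first five):", every_kth[:5])
```

The stratified function takes some members from every group, while the cluster function takes every member from some groups, which is the distinction the definitions above draw between the two plans.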