STATISTICS
STATISTICS- Set of mathematical procedures for
organizing, summarizing, and interpreting
information.
- Used to organize and summarize the
information so that the researchers can see what
happened in the study and can communicate the
results
- Helps the researchers to answer the questions
by determining exactly what general conclusions
are justified based on the specific results that
were obtained.
• VARIABLE – a characteristic or condition that
changes or has different values for different
individuals
• DATA- measurements or observations,
commonly called scores or raw scores
• PARAMETER- a value that describes the whole population
• STATISTIC- a value that describes a sample of the population
• DESCRIPTIVE STATISTIC- used to describe,
organize and summarize information about an
entire set of data
- frequency
- central tendency
- variability
Example: the average test score for the students
in a class
• INFERENTIAL STATISTIC- used to generalize
about a population based on a sample of data
- t-test
- Analysis of variance
- Correlation
- Regression
- Nonparametric tests
Example: 90% satisfaction of a sample of 50
customers
• SAMPLING ERROR- differences between the
sample and the population that exist only
because of the observations that happened to be
selected for the sample
QUALITATIVE RESEARCH
- Content analysis
- Historical
- Ethnographic
QUANTITATIVE RESEARCH
- Experimental
- Single subject
- Correlational
- Causal-comparative
- Survey
CORRELATIONAL RESEARCH- describes and
measures the degree of relationship between two
variables, which are measured without manipulation
INDEPENDENT VARIABLE- manipulated by the researcher
DEPENDENT VARIABLE- measured
EXPERIMENTAL GROUP- exposed to the
independent variable (receives the treatment)
CONTROL GROUP- not given the treatment
OPERATIONAL DEFINITION- defines a construct in terms of
the procedure used to measure it; can range from simple
and straightforward to complex.
Should be tied to the theoretical constructs
DISCRETE VARIABLE- separate, distinct values (typically
whole-number counts) with no values in between
CONTINUOUS VARIABLE- infinite number of possible
values between any two observed values (age, weight, temperature)
RATIO DATA- equal intervals and a true zero (height, weight, income, age)
INTERVAL DATA- equal intervals but no true zero (degrees Celsius)
ORDINAL DATA- ordered categories (rank order or rating scales)
NOMINAL DATA- named categories with no ordering or direction
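The difference between a parameter, a statistic, and sampling error, as defined above, can be illustrated with a small sketch. The population of scores here is made up (generated with a fixed seed), and the sample size of 50 is chosen only to echo the customer example; none of these numbers come from the notes.

```python
import random
import statistics

# Hypothetical population of 1,000 test scores (made-up data).
rng = random.Random(42)
population = [rng.randint(50, 100) for _ in range(1000)]

# Parameter: a value that describes the whole population.
parameter_mean = statistics.mean(population)

# Statistic: the same value computed from a sample of 50 individuals.
sample = rng.sample(population, 50)
statistic_mean = statistics.mean(sample)

# Sampling error: the discrepancy between the statistic and the parameter
# that exists only because of which individuals landed in the sample.
sampling_error = statistic_mean - parameter_mean

print(f"parameter (population mean): {parameter_mean:.2f}")
print(f"statistic (sample mean):     {statistic_mean:.2f}")
print(f"sampling error:              {sampling_error:+.2f}")
```

Drawing a different sample (a different seed) would give a different statistic and a different sampling error, while the parameter stays fixed.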
FREQUENCY DISTRIBUTION
• One method for simplifying and
organizing data is to construct a frequency
distribution.
• Frequency distribution- is an
organized tabulation showing
exactly how many individuals
are located in each category on the
scale of measurement.
➢ Can be structured either as a table or as
a graph, and presents the same two
elements:
• The set of categories that make up the
original measurement scale
• A record of the frequency, or number
of individuals in each category
➢ frequency distribution table consists of
at least two columns: one listing categories on
the scale of measurement (X) and another for
frequency (f).
➢ X column- values are listed from the
highest to lowest, without skipping any.
➢ frequency column- tallies are
determined for each value (how often
each X value occurs in the data set)
➢ sum of the frequencies should equal N.
➢ A third column can display the
proportion (p) for each category
• p = f/N
➢ A fourth column can display the
percentage of the distribution
corresponding to each X value
• The percentage is found by multiplying
p by 100.
– The sum of the percentage column is
100%
• Histogram- a bar is centered above each
score (or class interval)
• Polygon- a dot is centered above each
score.
– The height of the dot corresponds to
the frequency.
– A continuous line is drawn from dot to
dot to connect the series of dots
• Bar graph- is just like a histogram except
that gaps or spaces are left between
adjacent bars.
– For a nominal scale, the space
emphasizes that the scale consists of
separate, distinct categories.
– For ordinal scales, separate bars are
used because you cannot assume that
the categories are all the same size
• Smooth curve- emphasizes the fact that
the distribution is not showing the exact
frequency for each category.
• Normal curve- the word normal refers to
a specific shape that can be precisely
defined by an equation.
• Central tendency - measures where the
center of the distribution is located.
• Variability - measures the degree to
which the scores are spread over a wide
range or are clustered together.
• Symmetrical- if the left side of
the graph is (roughly) a mirror image of
the right side.
• Skewed- if the scores tend to
pile up toward one end of the scale and
taper off gradually at the other end
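The frequency distribution table described above (X, f, p = f/N, and percentage columns) can be sketched in a few lines. This is a minimal illustration that assumes whole-number scores; the function name `frequency_table` and the quiz scores are hypothetical.

```python
from collections import Counter


def frequency_table(scores):
    """Return rows of (X, f, p, percent), listed from highest X to lowest."""
    n = len(scores)
    counts = Counter(scores)
    rows = []
    # List every value from the highest to the lowest without skipping any,
    # so a value that never occurs still appears with f = 0.
    for x in range(max(scores), min(scores) - 1, -1):
        f = counts[x]          # Counter gives 0 for values that never occur
        p = f / n              # proportion: p = f/N
        percent = 100 * f / n  # percentage: p multiplied by 100
        rows.append((x, f, p, percent))
    return rows


# Hypothetical quiz scores for N = 20 students.
scores = [8, 9, 8, 7, 10, 9, 6, 4, 9, 8, 7, 8, 10, 9, 8, 6, 9, 7, 8, 8]
table = frequency_table(scores)

print(" X   f     p      %")
for x, f, p, percent in table:
    print(f"{x:>2}  {f:>2}  {p:.2f}  {percent:5.1f}")
```

Two quick checks follow from the notes: the f column should sum to N, and the percentage column should sum to 100.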
• Positively skewed distribution- the
scores tend to pile up on the left side of
the distribution with the tail tapering off
to the right.
• Negatively skewed distribution- the
scores tend to pile up on the right side
and the tail points to the left.
• Percentile rank- for a particular X value, it is
the percentage of individuals with scores
equal to or less than that X value.
• Percentile- an X value that is described by its
percentile rank
• Interpolation- mathematical process
based on the assumption that the scores
and the percentages change in a regular, linear
fashion as you move through an interval
from one end to the other.
• Stem-and-leaf display- provides an
efficient method for obtaining and
displaying a frequency distribution.
• Stem- consisting of
the first digit or digits
• Leaf- consisting of the
final digit.
• CENTRAL TENDENCY- is a statistical
measure to determine a single score that
defines the center of a distribution. The
goal of central tendency is to find the
single score that is most typical or most
representative of the entire group.
• MEAN- is the sum of the scores divided
by the number of scores.
- The mean can be defined as the “balance
point” of the distribution.
• The formula for the population mean is
$\mu = \frac{\sum X}{N}$
• WEIGHTED MEAN- the overall sum of
the scores for the combined group divided by
the total number of scores in the combined group:
$\frac{\sum X_1 + \sum X_2}{N_1 + N_2}$
• MEDIAN- defines the middle of the
distribution in terms of scores.
- The goal of the median is to locate the
midpoint of the distribution.
• MODE- is the
score or category that has the greatest
frequency.
• SYMMETRICAL DISTRIBUTION- the
right-hand side is a mirror image of
the left-hand side.
- The median is exactly at the center
because exactly half of the area in the
graph will be on either side of the center.
• POSITIVELY SKEWED DISTRIBUTION- the
most likely order of the three measures of
central tendency from smallest to largest
(left to right) is the mode, median, and
mean.
• NEGATIVELY SKEWED DISTRIBUTION- the
most probable order is mean, median,
and mode.
• VARIABILITY- provides a quantitative
measure of
the differences between scores in a
distribution and describes the degree to
which the scores
are spread out or clustered together.
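The central tendency measures defined above (mean, weighted mean, median, mode) can be computed directly. The sketch below uses made-up scores; the helper `weighted_mean` is a hypothetical name that follows the combined-group formula in these notes, and the skewed data set is chosen only to show the typical mode, median, mean ordering.

```python
import statistics

# Hypothetical scores, chosen only for illustration.
scores = [3, 5, 6, 6, 7, 9]

# Mean: sum of the scores divided by the number of scores.
mean = sum(scores) / len(scores)

# "Balance point": deviations above and below the mean cancel out.
deviation_sum = sum(x - mean for x in scores)

median = statistics.median(scores)  # midpoint of the ordered scores
mode = statistics.mode(scores)      # score with the greatest frequency


def weighted_mean(group1, group2):
    """Combined mean: (sum of X1 + sum of X2) / (N1 + N2)."""
    return (sum(group1) + sum(group2)) / (len(group1) + len(group2))


group1 = [5, 6, 7]           # mean of 6
group2 = [8, 9, 10, 10, 13]  # mean of 10
combined = weighted_mean(group1, group2)

# A positively skewed set: a few high scores pull the mean to the right.
skewed = [1, 1, 1, 2, 3, 4, 9]
skewed_order = (
    statistics.mode(skewed),
    statistics.median(skewed),
    sum(skewed) / len(skewed),
)

print("mean, median, mode:", mean, median, mode)
print("sum of deviations:", deviation_sum)
print("weighted mean:", combined)
print("skewed (mode, median, mean):", skewed_order)
```

Note that the weighted mean (8.5) is not the simple average of the two group means (8), because the larger group contributes more scores to the combined total.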
• RANGE- is the distance covered by the
scores in a distribution, from the smallest
score to the
largest score.
• MEAN ABSOLUTE DEVIATION- average
distance of all the elements in a data set
from the mean of the same data set.
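As a rough illustration of the range and the mean absolute deviation defined above, the following sketch uses a small made-up data set. It takes the range as the largest score minus the smallest score, which is an assumption; some texts measure the range between real limits instead.

```python
# Hypothetical scores, chosen only for illustration.
scores = [1, 3, 5, 7, 9]

# Range: distance from the smallest score to the largest score
# (assumed here to be max - min).
score_range = max(scores) - min(scores)

# Mean absolute deviation: average distance of every score from the mean.
mean = sum(scores) / len(scores)
mad = sum(abs(x - mean) for x in scores) / len(scores)

print("range:", score_range)
print("mean absolute deviation:", mad)
```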
• DEVIATION- is distance from the mean:
deviation score = X – μ
• VARIANCE- equals the mean of the
squared deviations.
• STANDARD DEVIATION is the square root
of the variance and provides a measure of
the standard, or average distance from
the mean.
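The chain from deviation scores to variance to standard deviation can be followed step by step. This is a minimal sketch that treats a small made-up set of scores as an entire population, so the sum of squared deviations is divided by N.

```python
import math

# Hypothetical population of N = 5 scores.
population = [1, 9, 5, 8, 7]
n = len(population)

# Population mean.
mu = sum(population) / n

# Deviation score for each individual: X - mu.
deviations = [x - mu for x in population]

# Variance: the mean of the squared deviations.
squared = [d ** 2 for d in deviations]
variance = sum(squared) / n

# Standard deviation: the square root of the variance.
std_dev = math.sqrt(variance)

print("mean:", mu)
print("deviations:", deviations)
print("variance:", variance)
print("standard deviation:", round(std_dev, 3))
```

The deviations always sum to zero, which is why they are squared before averaging.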
• POPULATION VARIANCE is represented
by the symbol σ² and equals the mean
squared distance from the mean.
• POPULATION STANDARD DEVIATION is
represented by the symbol σ and equals
the square root of
the population variance.
• SAMPLE VARIANCE is represented by the
symbol s² and approximates the mean squared
distance from the mean; the sum of squared
deviations is divided by n − 1 rather than n.
• SAMPLE STANDARD DEVIATION is
represented by
the symbol s and equals the square root of
the
sample variance.
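The difference between the population and sample formulas comes down to the divisor. The sketch below, using a made-up sample of n = 8 scores, computes the sample variance by hand (dividing by n − 1) and compares it with Python's standard `statistics` module, where `variance` and `stdev` use n − 1 and `pvariance` and `pstdev` use N.

```python
import math
import statistics

# Hypothetical sample of n = 8 scores.
sample = [4, 6, 5, 11, 7, 9, 7, 3]
n = len(sample)

# Sum of squared deviations from the sample mean.
m = sum(sample) / n
ss = sum((x - m) ** 2 for x in sample)

# Sample variance and standard deviation: divide by n - 1.
s2 = ss / (n - 1)
s = math.sqrt(s2)

# Treating the same scores as a whole population: divide by N.
sigma2 = ss / n

print("SS:", ss)
print("sample variance s^2:", round(s2, 3))
print("sample standard deviation s:", round(s, 3))
print("population-style variance:", sigma2)

# The standard library agrees with the hand calculation.
print(math.isclose(statistics.variance(sample), s2))
print(math.isclose(statistics.pvariance(sample), sigma2))
```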
• UNBIASED- a sample statistic is unbiased if the average
value of the statistic, taken over all possible samples,
is equal to the population parameter.
• BIASED- a sample statistic is biased if the average value
of the statistic either underestimates or
overestimates the corresponding
population parameter.
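Bias can be checked exactly on a tiny example by listing every possible sample. The sketch below assumes a made-up population with the three values 0, 3, and 9 and enumerates all nine samples of size n = 2 drawn with replacement. It then averages three statistics across those samples: the sample mean, the variance computed with a divisor of n, and the variance computed with a divisor of n − 1.

```python
from itertools import product

# Hypothetical population: mu = 4, sigma^2 = 14.
population = [0, 3, 9]
mu = sum(population) / len(population)
sigma2 = sum((x - mu) ** 2 for x in population) / len(population)


def sum_of_squares(sample):
    """Sum of squared deviations from the sample's own mean."""
    m = sum(sample) / len(sample)
    return sum((x - m) ** 2 for x in sample)


# Every possible sample of size n = 2, drawn with replacement (9 samples).
n = 2
samples = list(product(population, repeat=n))

avg_mean = sum(sum(s) / n for s in samples) / len(samples)
avg_var_n = sum(sum_of_squares(s) / n for s in samples) / len(samples)
avg_var_n_minus_1 = sum(sum_of_squares(s) / (n - 1) for s in samples) / len(samples)

print("population mean and variance:", mu, sigma2)
print("average sample mean:", avg_mean)
print("average variance using n:", avg_var_n)
print("average variance using n - 1:", avg_var_n_minus_1)
```

In this example the sample mean and the n − 1 variance average out to the population values (4 and 14), so they are unbiased, while the variance computed with n averages 7 and underestimates the population variance, so it is biased.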