Basic concepts of Statistics
Aims
1. Types of data analysis
2. Level of measurement
3. Descriptive statistics
4. Confidence interval
5. Type-I & Type-II Error
Types of Data Analysis
Quantitative research is ‘Explaining phenomena by
collecting numerical data that are analysed using various
statistical techniques’
Statistical techniques can be classified as
1. Univariate, and
2. Multivariate techniques.
Types of Data Analysis
Qualitative Research provides insights and understanding of the
problem setting
1. Direct approach (Nondisguised)
• Depth interview
• Focus group
2. Indirect approach (Disguised): Projective techniques
• Association techniques
• Completion techniques
• Construction techniques
• Expressive techniques
Level of measurement
• Nominal
• Ordinal
• Interval
• Ratio
Nominal variable
There are two or more categories, with no intrinsic order
Ex. Male or Female, Dead or Alive (two
categories)
Ex. Whether someone has a Science, Commerce, or Arts
background (three categories)
Ordinal variable
The same as a nominal variable but the categories
have a logical order
Ex. Whether customers are very satisfied, satisfied,
neutral, dissatisfied, or very dissatisfied
Interval variable
Equal intervals on the variable represent equal
differences in the property being measured
Ex. The difference between 6 and 8 is equivalent
to the difference between 13 and 15
Ratio variable
The same as an interval variable, but the ratios of
scores on the scale must also make sense
Ex. A score of 16 on an anxiety scale means that
the person is, in reality, twice as anxious as
someone scoring 8.
Descriptive statistics
1. Central tendency
• Mean
• Median
• Mode
2. Measures of dispersion
• Range
• Interquartile range
• Standard deviation
• Variance
3. Distribution
• Skewness
• Kurtosis
Central tendency: Mean
The sum of scores divided by the number of
scores.
Example: HR scores of nine participants
Mean = (70+71+74+80+73+75+82+64+69)/9
= 658/9 ≈ 73.11
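The calculation above can be checked with a short Python sketch. The variable names are illustrative only; the scores are the nine HR scores from the example.

```python
from statistics import mean

# HR scores of the nine participants from the example above.
hr_scores = [70, 71, 74, 80, 73, 75, 82, 64, 69]

# Mean by hand: the sum of scores divided by the number of scores.
manual_mean = sum(hr_scores) / len(hr_scores)

# The standard-library helper should agree with the manual calculation.
library_mean = mean(hr_scores)

print(round(manual_mean, 2))  # 73.11
```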
Central tendency: Median
The middle score when scores are ordered.
Example: HR scores of nine participants
64, 69, 70, 71, 73, 74, 75, 80, 82
Median is 73 (the fifth of the nine ordered scores)
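A minimal sketch of the same procedure in Python: sort the scores, then take the middle one. With an even number of scores the usual convention is to average the two middle scores, which is what `statistics.median` does.

```python
from statistics import median

hr_scores = [70, 71, 74, 80, 73, 75, 82, 64, 69]

# Order the scores, then pick the middle position.
ordered = sorted(hr_scores)
middle_index = len(ordered) // 2      # 9 scores -> index 4, the fifth score
manual_median = ordered[middle_index]

print(manual_median)                  # 73
print(median(hr_scores))              # 73
```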
Central tendency: Mode
• Mode
– The most frequent score
• Bimodal
– Having two modes
• Multimodal
– Having several modes
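As an illustration, Python's `statistics.multimode` returns every score that shares the highest frequency, so it distinguishes a single mode from a bimodal set. The two small data sets below are made up for this sketch and do not come from the slides.

```python
from statistics import multimode

# Hypothetical scores, invented for illustration only.
unimodal_scores = [1, 2, 2, 3]
bimodal_scores = [2, 3, 3, 4, 5, 5, 6]

# multimode lists all of the most frequent values.
print(multimode(unimodal_scores))  # [2]
print(multimode(bimodal_scores))   # [3, 5]
```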
A Bimodal Distribution
Measures of dispersion: Range
• The range measures the spread of the data. It is simply the
difference between the largest and smallest values in the sample.
Range = X_largest − X_smallest
64, 69, 70, 71, 73, 74, 75, 80, 82
Maximum: 82
Minimum: 64
Range: 82 − 64 = 18
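The same subtraction as a short Python sketch, using the nine HR scores from the example:

```python
hr_scores = [64, 69, 70, 71, 73, 74, 75, 80, 82]

# Range = largest value minus smallest value.
largest = max(hr_scores)
smallest = min(hr_scores)
score_range = largest - smallest

print(score_range)  # 18
```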
Measures of dispersion: Interquartile Range
• Quartiles
– The three values that split the sorted data into four equal parts.
– Second quartile = median.
– Lower quartile = median of the lower half of the data.
– Upper quartile = median of the upper half of the data.
• The interquartile range is the difference between the
75th percentile (upper quartile) and the 25th percentile
(lower quartile). For a set of data points arranged in order
of magnitude, the pth percentile is the value that has p% of
the data points below it and (100 − p)% above it.
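The "median of each half" definition can be sketched as follows. The helper name `quartiles` is hypothetical, and the sketch assumes that with an odd number of scores the median itself is left out of both halves; other software may use a different percentile convention and report slightly different quartiles.

```python
from statistics import median

def quartiles(scores):
    """Return (lower quartile, median, upper quartile) using the
    median-of-halves rule; the middle score is excluded from both
    halves when the number of scores is odd (an assumed convention)."""
    ordered = sorted(scores)
    n = len(ordered)
    half = n // 2
    lower_half = ordered[:half]
    upper_half = ordered[half + 1:] if n % 2 else ordered[half:]
    return median(lower_half), median(ordered), median(upper_half)

hr_scores = [64, 69, 70, 71, 73, 74, 75, 80, 82]
q1, q2, q3 = quartiles(hr_scores)
iqr = q3 - q1

print(q1, q2, q3)  # 69.5 73 77.5
print(iqr)         # 8.0
```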
Measures of dispersion: Variance
• The sum of squares (the sum of squared deviations from the mean) is a
good measure of overall variability, but it depends on the number of scores.
• We calculate the average variability by dividing the sum of squares by the
number of scores (n) for a whole population, or by n − 1 when estimating
from a sample.
• This value is called the variance (s²).
• The variance is the mean squared deviation from the mean. The variance
can never be negative.
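A minimal sketch of these steps in Python, applied to the nine HR scores used earlier. It shows both divisors: n (treating the scores as a whole population) and n − 1 (the usual sample estimate), and checks them against the standard-library functions `pvariance` and `variance`.

```python
from statistics import mean, pvariance, variance

hr_scores = [70, 71, 74, 80, 73, 75, 82, 64, 69]
n = len(hr_scores)
x_bar = mean(hr_scores)

# Sum of squares: squared deviations from the mean, added up.
sum_of_squares = sum((x - x_bar) ** 2 for x in hr_scores)

# Average variability: divide by n (population) or n - 1 (sample).
population_variance = sum_of_squares / n
sample_variance = sum_of_squares / (n - 1)

print(round(sum_of_squares, 2))
print(round(population_variance, 2), round(sample_variance, 2))
```

Because every squared deviation is zero or positive, neither version can be negative.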
Measures of dispersion: Standard Deviation
• The variance has one problem: it is measured in units squared.
• This isn’t a very meaningful metric, so we take the square root of the variance.
• This is the Standard Deviation (s).
$$s_x = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}$$
• The coefficient of variation is the ratio of the standard deviation to
the mean expressed as a percentage, and is a unitless measure of
relative variability.
$$CV = \frac{s_x}{\bar{X}} \times 100\%$$
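The standard deviation and coefficient of variation can be sketched in Python as below, again using the nine HR scores. The n − 1 divisor follows the formula on this slide; `statistics.stdev` is used only as a cross-check.

```python
from math import sqrt
from statistics import mean, stdev

hr_scores = [70, 71, 74, 80, 73, 75, 82, 64, 69]
n = len(hr_scores)
x_bar = mean(hr_scores)

# Standard deviation: square root of the (n - 1) variance.
s_x = sqrt(sum((x - x_bar) ** 2 for x in hr_scores) / (n - 1))

# Coefficient of variation: SD relative to the mean, as a percentage.
cv_percent = s_x / x_bar * 100

print(round(s_x, 2))         # in the same units as the scores
print(round(cv_percent, 2))  # unitless percentage
```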
Frequency Distributions: Histograms
• Frequency Distributions
– A graph plotting values of observations on the horizontal axis,
with a bar showing how many times each value occurred in the
data set.
• The ‘Normal’ Distribution
– Bell shaped
– Symmetrical around the centre
The Normal Distribution
Properties of Frequency Distributions
• Skew: The asymmetry (lack of symmetry) of the distribution.
– Positive skew (scores bunched at low values with the tail
pointing to high values).
– Negative skew (scores bunched at high values with the tail
pointing to low values).
• Kurtosis: The ‘heaviness’ of the tails.
– Leptokurtic = heavy tails.
– Platykurtic = light tails.
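One common way to put numbers on these two properties is through central moments. The sketch below uses the simple moment-based formulas (skewness = m3 / m2^1.5, excess kurtosis = m4 / m2² − 3); this is an assumption of the sketch, and statistical packages often report bias-adjusted versions, so their values can differ slightly. The function names and the data are invented for illustration.

```python
def central_moment(scores, k):
    """k-th central moment: mean of (x - mean) ** k."""
    m = sum(scores) / len(scores)
    return sum((x - m) ** k for x in scores) / len(scores)

def skewness(scores):
    # Positive: tail points to high values. Negative: tail points to low values.
    return central_moment(scores, 3) / central_moment(scores, 2) ** 1.5

def excess_kurtosis(scores):
    # Positive: heavier tails than normal (leptokurtic).
    # Negative: lighter tails than normal (platykurtic).
    return central_moment(scores, 4) / central_moment(scores, 2) ** 2 - 3

# Hypothetical scores bunched at low values with one high outlier.
right_tailed = [1, 1, 2, 2, 2, 3, 3, 4, 9]

print(skewness(right_tailed) > 0)          # True: positive skew
print(excess_kurtosis(right_tailed) > 0)   # True: heavy tail
```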
Skew
Kurtosis
Normal distribution curve
Properties of z-scores
• 1.96 cuts off the top 2.5% of the distribution.
• −1.96 cuts off the bottom 2.5% of the distribution.
• As such, 95% of z-scores lie between −1.96 and 1.96.
• 99% of z-scores lie between −2.58 and 2.58,
• 99.9% of them lie between −3.29 and 3.29.
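These cut-off values can be recovered from the standard normal distribution. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper name `critical_z` is hypothetical.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def critical_z(central_proportion):
    """z value such that the given central proportion of the
    standard normal distribution lies between -z and +z."""
    tail = (1 - central_proportion) / 2
    return standard_normal.inv_cdf(1 - tail)

for proportion in (0.95, 0.99, 0.999):
    print(proportion, round(critical_z(proportion), 2))
# 0.95 1.96
# 0.99 2.58
# 0.999 3.29
```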
Standard Errors
A sampling distribution is simply the frequency
distribution of sample means from the same
population. The standard deviation of sample means
is known as the standard error of the mean (SE).
[Figure: sampling distribution of the mean. Samples drawn from a population with mean 10 give sample means M = 8, 9, 9, 10, 10, 10, 11, 11, 12; a histogram plots Frequency against Sample Mean (mean of sample means = 10, SD = 1.22).]
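As a rough sketch, the standard error can be computed directly as the standard deviation of the nine sample means shown in the figure. In practice only one sample is usually available, so the standard error is commonly estimated as s / √n; that shortcut is not stated on the slide and is included here as an assumption, applied to the HR scores from the earlier examples.

```python
from math import sqrt
from statistics import mean, stdev

# Sample means read from the figure on this slide.
sample_means = [10, 9, 11, 10, 9, 8, 12, 10, 11]

mean_of_means = mean(sample_means)
standard_error = stdev(sample_means)   # SD of the sample means

print(mean_of_means)                   # 10
print(round(standard_error, 2))        # 1.22

# Assumed shortcut when there is only one sample: s / sqrt(n).
hr_scores = [70, 71, 74, 80, 73, 75, 82, 64, 69]
estimated_se = stdev(hr_scores) / sqrt(len(hr_scores))
print(round(estimated_se, 2))
```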
Confidence Intervals
• Domjan et al. (1998)
– ‘Conditioned’ sperm release in Japanese Quail.
• True mean
– 15 million sperm
• Sample mean
– 17 million sperm
• Interval estimate
– 12 to 22 million (contains true value)
– 16 to 18 million (misses true value)
– CIs are constructed such that, across repeated samples, 95% of them contain the true value.
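A minimal sketch of how such an interval might be computed: sample mean ± z × standard error, with z = 1.96 for 95%. This uses the normal approximation and the HR scores from the earlier slides, not the quail data. With only nine scores a t-based critical value would normally give a somewhat wider interval, so treat this as an illustration of the idea. The function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def confidence_interval(scores, level=0.95):
    """Normal-approximation confidence interval for the mean."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)   # 1.96 for 95%
    m = mean(scores)
    se = stdev(scores) / sqrt(len(scores))           # estimated standard error
    return m - z * se, m + z * se

hr_scores = [70, 71, 74, 80, 73, 75, 82, 64, 69]
lower, upper = confidence_interval(hr_scores)
print(round(lower, 1), round(upper, 1))

# A higher confidence level gives a wider interval.
lower99, upper99 = confidence_interval(hr_scores, level=0.99)
print(round(lower99, 1), round(upper99, 1))
```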
One- and Two-Tailed Tests
Type I and Type II Errors
• Type I error
– occurs when we believe that there is a genuine effect in our
population, when in fact there isn’t.
– The probability of this error is the α-level (usually .05)
• Type II error
– occurs when we believe that there is no effect in the population
when, in reality, there is.
– The probability of this error is the β-level (often .2)
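The meaning of the α-level can be illustrated with a small simulation: when there is truly no effect, a two-tailed test at α = .05 should still declare an effect in roughly 5% of experiments. The sketch below assumes normally distributed scores with a known standard deviation of 1 and uses a simple z-test; the function name, sample size, number of experiments, and seed are all arbitrary choices made for illustration.

```python
import random
from math import sqrt
from statistics import NormalDist, mean

def type_i_error_rate(n_experiments=4000, n=30, alpha=0.05, seed=1):
    """Proportion of simulated experiments that wrongly find an effect
    when the null hypothesis (population mean = 0) is true."""
    rng = random.Random(seed)
    z_critical = NormalDist().inv_cdf(1 - alpha / 2)   # two-tailed cut-off
    false_positives = 0
    for _ in range(n_experiments):
        # Null hypothesis is true: scores come from mean 0, SD 1.
        sample = [rng.gauss(0, 1) for _ in range(n)]
        z = mean(sample) / (1 / sqrt(n))               # known SD = 1
        if abs(z) > z_critical:
            false_positives += 1
    return false_positives / n_experiments

rate = type_i_error_rate()
print(rate)  # should be close to alpha (.05)
```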
The SPSS Environment
Starting SPSS
The Data Editor
The Variable View
Variable Types
• Numeric
– Numbers (e.g. 7, 0, 120)
• String
– Letters (e.g. ‘Andy’, ‘Idiot’)
• Currency
– Currency (e.g. £20, $34, €56)
• Date
– Dates (e.g. 21-06-1973, 06-21-73, 21-Jun-1973)
Creating a String Variable
Creating a Date Variable
Creating a Coding Variable
The Viewer Window
The Syntax Window