LESSON-8

The document discusses statistical tests used to evaluate hypotheses about population distributions, distinguishing between parametric and nonparametric tests. It details various tests including t-tests, ANOVA, and their assumptions, providing examples of when to use each type. The document emphasizes the importance of understanding normal distribution and the interpretation of p-values in hypothesis testing.

Uploaded by

Joelyn Capa
Copyright
© All Rights Reserved

LESSON 8: COMPARISON TESTS

Statistical tests are used to decide whether a hypothesis about the distribution of one or more
populations or samples should be rejected or not rejected.

Statistical Tests

- Parametric tests
- Nonparametric tests

PARAMETRIC TESTS

- A statistical test that makes assumptions about the parameters of the population distribution from
which the data are drawn.
- Used for quantitative data
- Used for continuous variables
- Used when data are measured on approximately interval or ratio scales of measurement
- Data should follow a normal distribution

NONPARAMETRIC TESTS
- Nonparametric tests don’t require that your data follow the normal distribution. They’re also known as
distribution-free tests and can provide benefits in certain situations.

Statistical tests for normal distribution


To test your data analytically for normality, several test procedures are available, the best known being the
Kolmogorov-Smirnov test, the Shapiro-Wilk test, and the Anderson-Darling test.

All of these tests evaluate the null hypothesis that the frequency distribution of your data is normal. Each test
returns a p-value, and the decision to reject or not reject the null hypothesis depends on whether this p-value is
less than or greater than 0.05.
[Figure: normality test output, labeled by sample size: n > 50 and n < 50]

If the p-value is less than 0.05, this is interpreted as a significant deviation from the normal distribution, and it can be
assumed that the data are not normally distributed. If the p-value is greater than 0.05, strictly speaking you cannot
conclude that the frequency distribution is normal; you can only say that the null hypothesis was not rejected.
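The decision rule above can be sketched as a small helper. This is only an illustration: the function name `interpret_normality_p` and the example p-values are hypothetical, and the p-value itself would come from one of the normality tests named above.

```python
def interpret_normality_p(p_value: float, alpha: float = 0.05) -> str:
    """Turn the p-value of a normality test into the conclusion described above."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("a p-value must lie between 0 and 1")
    if p_value < alpha:
        # Significant deviation from normality: reject the null hypothesis.
        return "reject H0: the data deviate significantly from a normal distribution"
    # Not significant: this is not proof of normality, only a failure to reject.
    return "fail to reject H0: no significant deviation from normality was detected"


# Hypothetical p-values, for illustration only.
print(interpret_normality_p(0.012))
print(interpret_normality_p(0.340))
```

Note that the second branch deliberately avoids saying "the data are normal", which mirrors the caution in the text.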

PARAMETRIC TESTS

I- t-test
- is a statistical test that is used to compare the means of two groups. It is often used in hypothesis
testing to determine whether a process or treatment actually has an effect on the population of interest, or
whether two groups are different from one another.

Example:

a. You want to know if the average grade of the students in math differs according to gender.
b. You want to know if a cooperative learning strategy affects the math performance of the students by comparing
their mean scores before and after the treatment.

Hypotheses:

H0: There is no significant difference in the average math grade between male and female students.

H0: There is no significant difference between the mean pre-test and mean post-test scores of the students.

When to use a t test?


A t-test can only be used when comparing the means of two groups (a.k.a. pairwise comparison).

The t test assumes your data:

1. are independent
2. are (approximately) normally distributed
3. have a similar amount of variance within each group being compared (a.k.a. homogeneity of variance)

What type of t-test should I use?
When choosing a t test, you will need to consider two things: whether the groups being compared come from a
single population or two different populations, and whether you want to test the difference in a specific direction.

A. Independent t-test (two-sample t-test)


- compares the means between two unrelated groups on the same continuous, dependent variable.
- used to test the null hypothesis µ1 = µ2

H0: There is no significant difference in the average math grade between male and female students.

B. Dependent t-test (paired-sample t-test)

- compares the means of two related groups to determine whether there is a statistically significant
difference between these means.
- used to test the null hypothesis µbefore = µafter

H0: There is no significant difference between the mean pre-test and mean post-test scores of the students.

Independent t-test
Example

The concentration of cholesterol (a type of fat) in the blood is associated with the risk of developing heart disease, such
that higher concentrations of cholesterol indicate a higher level of risk, and lower concentrations indicate a lower level of
risk. If you lower the concentration of cholesterol in the blood, your risk of developing heart disease can be reduced. Being
overweight and/or physically inactive increases the concentration of cholesterol in your blood. Both exercise and weight
loss can reduce cholesterol concentration. However, it is not known whether exercise or weight loss is best for lowering
cholesterol concentration. Therefore, a researcher decided to investigate whether exercise or weight loss intervention is
more effective in lowering cholesterol levels. To this end, the researcher recruited a random sample of inactive males that
were classified as overweight. This sample was then randomly split into two groups: Group 1 underwent a calorie-
controlled diet and Group 2 undertook the exercise-training program. In order to determine which treatment program was
more effective, the mean cholesterol concentrations were compared between the two groups at the end of the treatment
programs.

We can see that the group means are statistically significantly different because the value in the "Sig. (2-tailed)" row is less
than 0.05. Looking at the Group Statistics table, we can see that the people who undertook the exercise trial had lower
cholesterol levels at the end of the program than those who underwent a calorie-controlled diet.

Interpretation:

This study found that overweight, physically inactive male participants had statistically significantly lower cholesterol
concentrations (5.80 ± 0.38 mmol/L) at the end of an exercise-training program compared to after a calorie-controlled
diet (6.15 ± 0.52 mmol/L), t(38) = 2.428, p = 0.020.
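As a rough check, the reported t-value can be approximately recomputed from the summary statistics alone. The sketch below uses the pooled-variance (equal variances assumed) form of the independent t-test. The group sizes of 20 each are an assumption: the source only gives 38 degrees of freedom, which is consistent with 20 + 20 participants but does not prove it. The function name is hypothetical.

```python
import math


def pooled_t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Independent-samples t statistic from group means, SDs, and sizes.

    Uses the pooled-variance form, i.e. it assumes homogeneity of variance.
    """
    df = n1 + n2 - 2
    # Weighted average of the two sample variances.
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    # Standard error of the difference between the two means.
    se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se_diff, df


# Diet group vs. exercise group; n = 20 per group is an assumption.
t_stat, df = pooled_t_from_summary(6.15, 0.52, 20, 5.80, 0.38, 20)
print(f"t({df}) = {t_stat:.3f}")
```

The result is about 2.43, close to the reported 2.428. The small gap is expected because the means and standard deviations in the report are rounded.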

Dependent t-test
Example:

A group of Sports Science students (n = 20) are selected from the population to investigate whether a 12-week
plyometric-training program improves their standing long jump performance. In order to test whether this training
improves performance, the students are tested for their long jump performance before they undertake a plyometric-
training program and then again at the end of the program (i.e., the dependent variable is "standing long jump
performance", and the two related groups are the standing long jump values "before" and "after" the 12-week
plyometric-training program).

Interpretation:

You might report the statistics in the following format: t(degrees of freedom) = t-value, p = significance level. In our case
this would be: t(19) = -4.773, p < 0.0005. Due to the means of the two jumps and the direction of the t-value, we can
conclude that there was a statistically significant improvement in jump distance following the plyometric-training program
from 2.48 ± 0.16 m to 2.52 ± 0.16 m (p < 0.0005); an improvement of 0.03 ± 0.03 m.
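A paired t statistic is computed from the per-person differences rather than from the two group means. The following is a minimal sketch with made-up jump distances (five hypothetical students, not the study's data). Differences are taken as after minus before, so an improvement gives a positive t. Taking before minus after instead flips the sign, which would explain the negative t-value reported above.

```python
import math
import statistics


def paired_t(before, after):
    """Paired-samples t statistic and degrees of freedom."""
    if len(before) != len(after):
        raise ValueError("paired samples must have the same length")
    # One difference per participant: after minus before.
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    # Standard error of the mean difference (sample SD uses n - 1).
    se = statistics.stdev(diffs) / math.sqrt(n)
    return mean_diff / se, n - 1


# Hypothetical standing long jump distances in metres.
before = [2.40, 2.55, 2.31, 2.62, 2.48]
after = [2.45, 2.57, 2.38, 2.63, 2.54]

t_stat, df = paired_t(before, after)
print(f"t({df}) = {t_stat:.3f}")
```

The p-value would then be looked up from the t distribution with n - 1 degrees of freedom, which statistical software does automatically.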

II- ANOVA
- which stands for Analysis of Variance, is a statistical test used to analyze the difference between the means of
more than two groups.

- used to test the null hypothesis
H0: The means of all the groups are equal.
Ha: Not all the means are equal.

Assumptions of ANOVA
The assumptions of the ANOVA test are the same as the general assumptions for any parametric test:

1. Independence of observations: the data were collected using statistically valid sampling methods, and there are
no hidden relationships among observations.
2. Normally-distributed response variable: The values of the dependent variable follow a normal distribution.
3. Homogeneity of variance: The variation within each group being compared is similar for every group. If the
variances are different among the groups, then ANOVA probably isn’t the right fit for the data.

A. One-way ANOVA
Use a one-way ANOVA when you have collected data about one categorical independent variable and
one quantitative dependent variable. The independent variable should have at least three levels (i.e. at least three
different groups or categories).

ANOVA tells you if the dependent variable changes according to the level of the independent variable. For example:

• Your independent variable is social media use, and you assign groups to low, medium, and high levels of social
media use to find out if there is a difference in hours of sleep per night.
• Your independent variable is brand of soda, and you collect data on Coke, Pepsi, Sprite, and Fanta to find out if
there is a difference in the price per 100ml.

H0: There is no significant difference in the price per 100 ml among the brands of soda.

Post-hoc testing
ANOVA will tell you if there are differences among the levels of the independent variable, but not which
differences are significant. To find how the treatment levels differ from one another, perform a Tukey HSD (Tukey’s
Honestly Significant Difference) post-hoc test.

Example

A manager wants to raise the productivity at his company by increasing the speed at which his employees can use a
particular spreadsheet program. As he does not have the skills in-house, he employs an external agency which provides
training in this spreadsheet program. They offer 3 courses: a beginner, intermediate and advanced course. He is unsure
which course is needed for the type of work they do at his company, so he sends 10 employees on the beginner course,
10 on the intermediate and 10 on the advanced course. When they all return from the training, he gives them a problem
to solve using the spreadsheet program, and times how long it takes them to complete the problem. He then compares
the three courses (beginner, intermediate, advanced) to see if there are any differences in the average time it took to
complete the problem.

We can see that the significance value is 0.021 (i.e., p = .021), which is below 0.05. Therefore, there is a statistically
significant difference in the mean length of time to complete the spreadsheet problem between the different courses
taken. This is great to know, but we do not know which of the specific groups differed.

Interpretation:

There was a statistically significant difference between groups as determined by one-way ANOVA (F(2,27) = 4.467, p =
.021). A Tukey post hoc test revealed that the time to complete the problem was statistically significantly lower after taking
the intermediate (23.6 ± 3.3 min, p = .046) and advanced (23.4 ± 3.2 min, p = .034) courses compared to the beginner
course (27.2 ± 3.0 min). There was no statistically significant difference between the intermediate and advanced groups
(p = .989).
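The F statistic in a one-way ANOVA is the ratio of the between-group mean square to the within-group mean square, with k - 1 and N - k degrees of freedom for k groups and N observations. That is why three groups of 10 give F(2,27). The sketch below computes F by hand for small made-up completion times (four hypothetical employees per course, not the study's data). The function name is hypothetical.

```python
import statistics


def one_way_anova_f(groups):
    """F statistic and degrees of freedom for a one-way ANOVA."""
    all_values = [x for g in groups for x in g]
    grand_mean = statistics.mean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups
    )
    # Variation of the observations around their own group mean.
    ss_within = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical completion times in minutes.
beginner = [27, 30, 25, 26]
intermediate = [23, 25, 22, 26]
advanced = [22, 24, 25, 21]

f_stat, df_b, df_w = one_way_anova_f([beginner, intermediate, advanced])
print(f"F({df_b},{df_w}) = {f_stat:.3f}")
```

A large F means the group means differ by more than the spread within groups would suggest. Statistical software converts F and its two degrees of freedom into the p-value reported in the ANOVA table.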

B. Two-way ANOVA
The two-way ANOVA compares the mean differences between groups that have been split on two independent
variables (called factors). The primary purpose of a two-way ANOVA is to understand if there is an interaction between the
two independent variables on the dependent variable. For example, you could use a two-way ANOVA to understand
whether there is an interaction between gender and educational level on test anxiety amongst university students, where
gender (males/females) and education level (undergraduate/postgraduate) are your independent variables, and test
anxiety is your dependent variable.

The interaction term in a two-way ANOVA informs you whether the effect of one of your independent variables
on the dependent variable is the same for all values of your other independent variable (and vice versa). For example, is
the effect of gender (male/female) on test anxiety influenced by educational level (undergraduate/postgraduate)?
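One informal way to see what an interaction means is to compare the gender gap at each education level. If the gap is about the same at every level, there is little sign of an interaction; if it changes, an interaction may be present. The cell means below are invented for illustration only, and whether a difference in gaps is statistically significant still has to come from the interaction F test of the two-way ANOVA.

```python
# Hypothetical mean test-anxiety scores for each gender/education cell.
cell_means = {
    ("male", "undergraduate"): 52.0,
    ("female", "undergraduate"): 60.0,
    ("male", "postgraduate"): 55.0,
    ("female", "postgraduate"): 56.0,
}


def gender_gap(level):
    """Female minus male mean at one education level (a simple effect)."""
    return cell_means[("female", level)] - cell_means[("male", level)]


gap_undergrad = gender_gap("undergraduate")
gap_postgrad = gender_gap("postgraduate")
# Difference of differences: zero would mean the gender effect is the
# same at both education levels, i.e. no interaction in these means.
interaction_contrast = gap_undergrad - gap_postgrad

print("Gap (undergraduate):", gap_undergrad)
print("Gap (postgraduate):", gap_postgrad)
print("Difference of gaps:", interaction_contrast)
```

In these made-up numbers the gender gap is larger among undergraduates than among postgraduates, which is the pattern an interaction term is designed to detect.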

Example

A researcher was interested in whether an individual's interest in politics was influenced by their level of education and
gender. They recruited a random sample of participants to their study and asked them about their interest in politics, which
they scored from 0 to 100, with higher scores indicating a greater interest in politics. The researcher then divided the
participants by gender (Male/Female) and then again by level of education (HS, college and post grad). Therefore, the
dependent variable was "interest in politics", and the two independent variables were "gender" and "education".

Dependent variable: Interest in politics

Independent variables: Gender (male, female) and level of education (high school, college, post grad)


We can see from the table above that there was no statistically significant difference in mean interest in politics between
males and females (p = .448), but there were statistically significant differences between educational levels (p < .001). The
interaction (the "gender*education_level" row) has a statistically significant effect on the dependent variable, "interest
in politics".

Interpretation:

A two-way ANOVA was conducted that examined the effect of gender and education level on interest in politics. There was
a statistically significant interaction between the effects of gender and education level on interest in politics,
F(2, 52) = 7.315, p = .002.

If you had a statistically significant interaction term and carried out the procedure for simple main effects in SPSS
Statistics, you would also report these results. Briefly, you might report these as:

Simple main effects analysis showed that males were significantly more interested in politics than females when
educated to university level (p = .002), but there were no differences between genders when educated to school (p =
.465) or college level (p = .793).
