
Student's T Test For Samples With Unequal Variances

The Student's t-test is a statistical tool used to determine whether there are significant differences between the means of two samples. It compares the mean of a sample with a hypothetical or known mean. It is used when the samples have unequal variances. It requires that the data are independent, random, and follow a normal distribution. The result is a t value that indicates whether the difference between the means is statistically significant.

STUDENT'S T TEST FOR SAMPLES OF EQUAL VARIANCES


The Student's 't' test is a tool of inferential statistics used to compare the means of two different samples through hypothesis testing, specifically to identify whether there is a significant difference between their means. For example, a 't' test could be used to compare the reading achievement of men and women. In this test we have an independent variable and a dependent variable. The independent variable (gender in this case) can have only two levels (male and female); if this variable had more than two levels, a one-way analysis of variance (ANOVA) would be used.

Put technically, the one-sample Student's t-test is a technique used to determine whether the mean of a sample is statistically different from a known or hypothetical population mean. This test is used when the population standard deviation is unknown or when the sample size is small (less than 30).

The Student's t-test is based on the calculation of the t statistic, which is obtained by dividing the difference between the sample mean and the hypothetical or known mean by the standard error of the mean, that is, the sample standard deviation divided by the square root of the sample size.

If the absolute value of the calculated t statistic is greater than the critical t value obtained from a Student's t distribution table for a given significance level and degrees of freedom (n-1), the null hypothesis that the two means are equal is rejected, and it is concluded that there is sufficient evidence to assert that the sample mean is significantly different from the hypothetical or known mean.
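
The calculation and decision rule just described can be sketched in Python. This is only an illustration under stated assumptions: the function name `one_sample_t`, the ten reading scores and the hypothetical mean of 69 are invented for the example, and the critical value 2.262 is the two-tailed 5% table value for 9 degrees of freedom.

```python
import math
import statistics


def one_sample_t(sample, mu0):
    """Return (t statistic, degrees of freedom) for a one-sample t-test."""
    n = len(sample)
    mean = statistics.mean(sample)
    s = statistics.stdev(sample)            # sample standard deviation (n - 1 denominator)
    standard_error = s / math.sqrt(n)       # standard error of the mean
    return (mean - mu0) / standard_error, n - 1


# Hypothetical data: 10 reading scores compared against a hypothetical mean of 69.
scores = [72, 68, 75, 71, 69, 74, 70, 73, 77, 71]
t_stat, df = one_sample_t(scores, mu0=69)

# Two-tailed critical value for alpha = 0.05 and df = 9, read from a t table.
T_CRITICAL = 2.262
reject_h0 = abs(t_stat) > T_CRITICAL
print(f"t = {t_stat:.3f}, df = {df}, reject H0: {reject_h0}")
```

With these invented scores the statistic is about 3.40, which exceeds the table value, so the null hypothesis would be rejected.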

In summary, the one-sample Student's t-test is a useful tool for analyzing whether a sample of data is representative of a larger population and for determining whether the difference between the sample mean and the population mean is statistically significant.

Difference between the 't' test and the ANOVA test: both are statistical tools used to compare the means of two or more groups of data. However, there are some important differences between them:

• Number of groups: The t-test is used to compare the means of two groups of data, while ANOVA is used to compare the means of three or more groups of data.
• Type of variables: The t-test is used for continuous numerical variables and independent data, while ANOVA is used for continuous numerical variables and dependent or independent data.
• Type of result: The t-test yields a t value, which indicates the statistical significance of the difference of means between two groups. ANOVA, on the other hand, yields an F value, which indicates the statistical significance of the difference of means among three or more groups.
• Type of analysis: The t-test analyzes only one independent variable (with two levels) at a time. ANOVA, in its factorial forms, can analyze several independent factors at the same time.

Advantages of the "t" test: the test is used in many fields, such as medical research, psychology, economics and education. Here are some examples:

• Comparing two groups: The test is used to compare two groups of data, for example, to compare the mean test results of two groups of students.
• Evaluating the effectiveness of a treatment: The t-test can be used to evaluate whether a treatment or intervention has a significant effect on a variable of interest compared to a control group that did not receive the treatment.
• Analysis of experiments: The test is often used in scientific experiments to compare the results of a treatment group with those of a control group.
• Survey data analysis: It can be used in the same way to analyze survey data and compare the means of two groups, for example, to compare the average income of men and women.

• Sensitivity to sample size: Unlike other statistical tests, it is sensitive to the sample size, which means it can be used with small or large samples.
• Normal distribution not strictly required: The t-test is robust to deviations from normality in the population, especially when the sample size is large.
• Simplicity of calculation: It is a relatively simple statistical technique that is easy to calculate, which makes it easier to apply in various contexts.
• Wide application: The test is applied in various areas, such as medical research, educational research, market research and engineering, among others.
• Identification of statistical significance: The t-test allows us to identify whether an observed difference between the sample mean and the hypothetical population mean is statistically significant or not.

Assumptions for the "t" test: although the test is relatively robust to deviations in the data, it has certain limitations.

• The data is continuous and follows a normal distribution.
• The sample size of each group is small (typically less than thirty).
• The data sample has been taken randomly from the population.
• There is homogeneity of variance (the variability of each group is similar).
• The values of the mean (X̅), number of observations (n), and variance (s²) must be obtained.

The basic idea for calculating a Student's t-test is to find the difference between the averages of the two groups and divide it by the standard error of the difference, that is, the standard deviation of the distribution of differences.

A confidence interval for a two-tailed t-test is calculated by multiplying the critical value by the standard error, then adding and subtracting that amount from the difference of the two averages. The effect size is used to gauge the practical difference. If there are several thousand patients, it is very easy to find a statistically significant difference. Knowing whether that difference is meaningful in practice is another question.
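
The confidence interval described above can be sketched as follows. This is a minimal illustration, not the document's own calculation: the function name `confidence_interval` and the summary statistics are hypothetical, the standard error is computed without pooling the variances, and 2.101 is the two-tailed 5% table value for 18 degrees of freedom.

```python
import math


def confidence_interval(mean_a, mean_b, var_a, var_b, n_a, n_b, t_critical):
    """Two-sided confidence interval for the difference mean_a - mean_b."""
    difference = mean_a - mean_b
    # Standard error of the difference, without pooling the two variances.
    standard_error = math.sqrt(var_a / n_a + var_b / n_b)
    margin = t_critical * standard_error
    return difference - margin, difference + margin


# Hypothetical summary statistics for two groups of 10 observations each.
low, high = confidence_interval(76.0, 71.0, 18.0, 18.0, 10, 10, t_critical=2.101)
print(f"95% CI for the difference: ({low:.2f}, {high:.2f})")
```

Because the interval obtained from these invented numbers does not contain zero, the two-tailed test at the same level would reject equality of the means.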

In studies involving group differences, the effect size is the difference of the two averages divided by the standard deviation of the control group (or the average standard deviation of both groups if there is no control group). Generally, the effect size is only important if there is statistical significance. An effect size of 0.2 is considered small, 0.5 is considered medium, and 0.8 is considered large.
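
As a rough sketch of the effect size just defined, the snippet below divides the difference of means by the control group's standard deviation and labels the result with the conventional 0.2 / 0.5 / 0.8 cut-offs. The function names, the data and the "negligible" label for values below 0.2 are assumptions made for the example.

```python
import statistics


def effect_size(treatment, control):
    """Difference of means divided by the control group's standard deviation."""
    difference = statistics.mean(treatment) - statistics.mean(control)
    return difference / statistics.stdev(control)


def label(d):
    """Classify |d| with the conventional 0.2 / 0.5 / 0.8 cut-offs."""
    size = abs(d)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "negligible"  # label assumed for this example


treatment = [14, 15, 13, 16, 17]   # hypothetical scores
control = [12, 14, 13, 15, 11]     # hypothetical scores
d = effect_size(treatment, control)
print(f"d = {d:.2f} ({label(d)})")
```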

Types of 't' test: there are three t tests to compare means: the one-sample 't' test, the two-sample 't' test and the paired 't' test. As mentioned, this document focuses solely on the second, the "t" test for two samples assuming unequal variances.

T TEST FOR TWO SAMPLES

Type of variable:
• Continuous measurement
• Categorical or nominal to define pairs in a group

Objective: Determine if the population means of two distinct groups are or are not equal

Estimation of the population mean: Mean of the sample from each group

Example: Determine if the average heart rate is the same for two groups of people or not

Two-sample t-test: you can use the test when your data values are independent, are randomly sampled from two normal populations, and the two independent groups have equal variances. To carry out a valid test:

• The data values must be independent. The measurements of one observation do not affect the measurements of any other observation.
• The data from each group must be obtained through a random sample of the population.
• The data of each group have a normal distribution.
• Data values are continuous.
• The variances of the two independent groups are equal.

For the Student's t-test for independent samples, we first check that two assumptions are met:

1. The continuous variable follows a normal distribution for the two categories of the nominal variable.
2. There is homoscedasticity, which means that the variance of the values of the continuous variable is the same in both groups of the dichotomous nominal variable.

A separate point about normality: the normality assumption is more important when the two groups have small sample sizes than when they are large.

Normal distributions are symmetrical, that is, "equal" on both sides of the center. Normal distributions do not have extreme values or outliers. You can check these two characteristics of a normal distribution with graphs. Earlier, we decided that the body fat data was "close enough" to the normal distribution to proceed with the assumption of normality. The following figure shows a normal quantile plot for men and women that supports our decision.

You can also conduct a formal normality test using software. The previous figure shows the results of the normality test with the JMP software. We test each group separately. Both the test for men and the test for women show that we cannot reject the hypothesis of a normal distribution. We can proceed with the assumption that the body fat data for men and women have a normal distribution.

If homoscedasticity holds, the ratio of the two sample variances must be close to one. The further it is from one, the greater the probability that the variances are really different and that the observed difference is not due to chance. We know that the quotient of two variances follows Snedecor's F probability distribution.

To calculate this probability, we can perform Snedecor's F test, which takes into account the degrees of freedom of the numerator and denominator of the quotient of variances.
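
The quotient of variances and its degrees of freedom can be computed as in the sketch below. It is an illustration only: the function name `variance_ratio` and both samples are hypothetical, the larger variance is placed in the numerator by choice, and 4.03 is the upper 2.5% table value of the F distribution with (9, 9) degrees of freedom, which corresponds to a two-sided test at the 5% level.

```python
import statistics


def variance_ratio(sample_a, sample_b):
    """Return (F, df_numerator, df_denominator) with the larger variance on top."""
    var_a = statistics.variance(sample_a)   # sample variances (n - 1 denominator)
    var_b = statistics.variance(sample_b)
    if var_a >= var_b:
        return var_a / var_b, len(sample_a) - 1, len(sample_b) - 1
    return var_b / var_a, len(sample_b) - 1, len(sample_a) - 1


# Hypothetical grades for two courses of 10 students each.
course_a = [72, 68, 75, 71, 69, 74, 70, 73, 77, 71]
course_b = [70, 65, 74, 69, 72, 66, 71, 68, 75, 70]
f_value, df_num, df_den = variance_ratio(course_a, course_b)

F_CRITICAL = 4.03  # table value assumed for (9, 9) degrees of freedom
variances_differ = f_value > F_CRITICAL
print(f"F = {f_value:.3f} with ({df_num}, {df_den}) df, variances differ: {variances_differ}")
```

Here the ratio is close to one, so the hypothesis of equal variances would not be rejected for these invented data.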

Steps to perform a Student's t-test: below are the steps to carry out a simplified Student's t-test:

1. Define the null and alternative hypotheses: the null hypothesis states that there is no significant difference between the two means, while the alternative hypothesis states that there is a significant difference.
2. Select the appropriate type of t-test: this will depend on whether the samples are independent or related.
3. Calculate the mean, standard deviation, and sample size for each group.
4. Calculate the t statistic using the appropriate formula, which takes into account the difference between the means, the variability of the data, and the sample size.
5. Determine the critical value of t using a Student's t-distribution table and the desired significance level (generally 0.05).
6. Compare the calculated value of t with the critical value of t. If the absolute value of the calculated t is greater than the critical value, the null hypothesis is rejected and the alternative hypothesis is accepted. If it is less than the critical value, the null hypothesis cannot be rejected.
7. Interpret the results appropriately and conclude whether or not there is a significant difference between the two averages.
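
The steps above can be sketched in Python as follows. This is a minimal illustration, assuming the unequal-variance (Welch) form of the statistic; the function name `welch_t_test`, the two lists of grades and the critical value 2.110 (the two-tailed 5% table value for 17 degrees of freedom) are hypothetical choices, not taken from the document.

```python
import math
import statistics


def welch_t_test(sample_a, sample_b):
    """Return (t statistic, approximate degrees of freedom) without pooling variances."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Step 3: mean, variance and size of each group.
    mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
    var_a, var_b = statistics.variance(sample_a), statistics.variance(sample_b)
    # Step 4: difference of means over the standard error of the difference.
    v_a, v_b = var_a / n_a, var_b / n_b      # variance of each sample mean
    t_stat = (mean_a - mean_b) / math.sqrt(v_a + v_b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (v_a + v_b) ** 2 / (v_a ** 2 / (n_a - 1) + v_b ** 2 / (n_b - 1))
    return t_stat, df


# Hypothetical grades for two courses.
course_a = [72, 68, 75, 71, 69, 74, 70, 73, 77, 71]
course_b = [70, 65, 74, 69, 72, 66, 71, 68, 75, 70]
t_stat, df = welch_t_test(course_a, course_b)

# Steps 5 and 6: compare |t| with the table value, rounding df down to 17.
T_CRITICAL = 2.110
reject_h0 = abs(t_stat) > T_CRITICAL
print(f"t = {t_stat:.3f}, df = {df:.2f}, reject H0: {reject_h0}")
```

Rounding the degrees of freedom down before reading the table is a conservative choice; software normally uses the fractional value directly.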

The above, as mentioned, is general in nature, so we will now specify what each step involves and provide an example for better understanding.

The essential part of the 't' test is that the procedure starts from hypotheses (hypothesis testing):

Hypothesis testing: a rule for accepting or rejecting a statement. For this, two opposing hypotheses about a population are analyzed: the null hypothesis and the alternative hypothesis.

Null hypothesis (H0): conjecture about a parameter that is to be tested. Generally, the null hypothesis consists of the equality of a parameter in the two samples (absence of difference or effect). In the case of the t-test in Excel, it is the equality of the population means of the two samples.

$\mu_1 = \mu_2$

Alternative hypothesis (H1): conjecture about a parameter that holds if the null hypothesis is rejected. It generally consists of a difference in a parameter between the two samples. For the t-test in Excel, it states that there is a difference in the population means of the two samples.

$\mu_1 < \mu_2$, $\mu_1 \neq \mu_2$, or $\mu_1 > \mu_2$

The difference between each of these propositions will be detailed later.

Since the two hypotheses are opposites of each other, it is possible to make mistakes when interpreting the data, as follows:

Type I error: rejecting a null hypothesis that is actually true. The probability of making a type I error depends on the alpha level that was chosen. If alpha was set at 0.05, then there is a 5% chance of making a type I error. The possibility of making a type I error can be reduced by setting a smaller alpha level (0.01). The problem with doing this is that it increases the possibility of a type II error.

Type II error: failure to reject a false null hypothesis.

Once we know which test statistic we are going to use, the next step is the choice of a type I error level, or alpha. In general terms we recommend 5%; however, this value must depend on the context of each investigation. Reducing the type I error means that it is easier to accept the null hypothesis, but it also makes it more likely that we accept a null hypothesis that is actually false (type II error).

Obviously, when all the calculations have been made, we must check whether the hypothesis that has been chosen is correct or incorrect. For example, if the null hypothesis has been chosen, the conditions for rejecting it are:

• The absolute value of the calculated Student's t (t statistic) is greater than the tabulated t value (two-tailed critical value).
• The p-value is less than the significance level of the Student's t-test. The p-value is a measure of evidence against H0.

If the above is fulfilled, there is enough evidence to reject H0. If it is not met, the result is not statistically significant, and therefore H0 cannot be rejected.

The values mentioned will be examined later, as well as how they are obtained and what significance they have in the interpretation.

Commonly, the "t" test is done with the Excel program, which only needs the data (regardless of the order) and the data analysis option "t-test for two samples assuming unequal variances".

It is important to highlight that the tests for equal and unequal variances are very different, both in procedure and in results. If the calculations are performed with the wrong one, the interpretation may be wrong. That is why, before conducting a 't' test, an 'F' test is performed on the data; with this we can determine whether the variances of the two samples are equal or unequal.

'F' test for the variances of two samples: we will not elaborate too much on this; we will just focus on the result obtained after running it in Excel on a data set, which is a table with the relevant values of F, P(F<=f) one-tail, and the critical value for F (one-tail).

Let us remember that the null hypothesis in a Fisher test is rejected if either of these two conditions is met:

0.05 > P(F<=f)

F > critical value for F

If they are met, it is concluded that the null hypothesis is rejected and that the variances of the grades of the two courses are not the same.

Once this knowledge is obtained, we can perform the "t" test.

The default alpha (α) in Excel is 0.05, which can be changed if desired, since it is only a proposed value. It represents the probability of error that will be tolerated for the hypothesis that has been raised. It is usually quite small; the default, as mentioned, is 5%.

Alongside the table that Excel creates, there are formulas that allow us to reproduce the same process that Excel performs, as well as a table of critical t values for both tails, since the process of finding them by hand is tedious and somewhat complex.

From what has been developed, we can test the null hypothesis of our 't' test in Excel, which is rejected if either of these two conditions is met:

α > P(T<=t) two tails

critical value of t (two tails) < t statistic

Replacing:

0.05 > 0.38526946

2.16036866 < 0.89847907

We see that neither of the two is fulfilled; therefore, there is no significant evidence to say that the average grades of the two courses are different. The null hypothesis of our t test in Excel is not rejected.
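
The same check can be written as a short Python sketch using the values reported in the text. The function name `reject_null` and the constant names are assumptions made for the example; only the four numbers come from the document.

```python
ALPHA = 0.05
P_TWO_TAILS = 0.38526946      # P(T<=t) two tails, from the text
T_STATISTIC = 0.89847907      # t statistic, from the text
T_CRITICAL = 2.16036866       # critical value of t (two tails), from the text


def reject_null(alpha, p_value, t_statistic, t_critical):
    """Evaluate both rejection conditions and the overall decision."""
    by_p_value = alpha > p_value
    by_critical_value = abs(t_statistic) > t_critical
    return by_p_value, by_critical_value, by_p_value or by_critical_value


by_p, by_t, rejected = reject_null(ALPHA, P_TWO_TAILS, T_STATISTIC, T_CRITICAL)
print(f"alpha > p: {by_p}, |t| > critical: {by_t}, reject H0: {rejected}")
```

Both conditions evaluate to false, which matches the conclusion that the null hypothesis is not rejected.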

In this case, the acceptance of the null hypothesis will depend on the probability of the statistic. We can see that the statistic t = 0.8984 does not exactly match the one calculated manually; this is because statistical programs use many more decimals in their calculations than are used with a calculator. Moreover, the results table also gives the probability of the statistic, that is, the p-value, which in the bilateral case (two tails) takes the value 0.28977. When this probability is greater than the significance level (0.05 for two tails, that is, 0.05/2 = 0.025 in each tail), we assume that the observed difference may be due to chance, and therefore we cannot reject the null hypothesis that the difference between the means is zero. The table also gives the critical value for the right half of the distribution, both for the unilateral (one-tail) and the bilateral (two-tail) test. We can see that this value coincides with the one calculated using a function that is also available in Excel. The critical value for the left half of the distribution is the same but negative, that is, -2.1604, since the Student's t distribution is symmetric.

Graph for the 't' test: the graph is another way to check the hypothesis that has been chosen.

We will use the values of the t statistic, α, and P(T<=t) two tails.

As can be seen, the value again falls within the non-rejection region. However, the graph does not take into account the other relationship (α < P(T<=t)), which makes it less precise; it is, nevertheless, a good visualization of the problem in question.

Something else worth mentioning is that when we established the following:

$\mu_1 < \mu_2$, $\mu_1 \neq \mu_2$, or $\mu_1 > \mu_2$

Each one has an effect on the graph:

• If we choose 'less than', the rejection region is located on the left side of the graph, that is, in the tail closer to the origin.

• If we choose 'greater than', the rejection region is located on the right side of the graph, that is, in the tail far from the origin.

• And if we choose 'different from', we are holding both inequalities, so both of the tails mentioned are used.
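
The effect of each alternative on the rejection region can be sketched as a small function. The function name `in_rejection_region` and the strings used to name the alternatives are assumptions for the example; 2.16036866 is the two-tailed critical value from the text, while 1.771 is the one-tailed 5% table value for 13 degrees of freedom and -2.5 is an invented statistic.

```python
def in_rejection_region(t_statistic, alternative, t_critical):
    """Return True when t falls in the rejection region for the chosen alternative.

    alternative is "less" (mu1 < mu2), "greater" (mu1 > mu2) or "two-sided" (mu1 != mu2).
    t_critical is the positive table value for that alternative: a one-tailed
    value for "less" or "greater", a two-tailed value for "two-sided".
    """
    if alternative == "less":
        return t_statistic < -t_critical      # left tail only
    if alternative == "greater":
        return t_statistic > t_critical       # right tail only
    if alternative == "two-sided":
        return abs(t_statistic) > t_critical  # both tails
    raise ValueError(f"unknown alternative: {alternative!r}")


# Value from the text: t = 0.89847907 against the two-tailed critical value.
print(in_rejection_region(0.89847907, "two-sided", 2.16036866))

# Invented statistic to show the one-tailed cases.
print(in_rejection_region(-2.5, "less", 1.771))
print(in_rejection_region(-2.5, "greater", 1.771))
```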
