The Mann-Whitney U test, also known as the Wilcoxon rank-sum test, is a non-parametric hypothesis test used to assess whether two independent samples of ordinal or continuous data differ. It compares the distributions of the two groups to check whether one group tends to have higher or lower values than the other. It works by ranking all observations from both groups together and then evaluating whether the ranks of the two groups differ significantly.
Hypotheses in the Mann-Whitney U Test
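The null hypothesis (H₀) states that the two groups come from the same distribution, so neither group tends to have higher values than the other. The alternative hypothesis (H₁) states that one group tends to have higher or lower values than the other. Although the test does not directly test medians, it is commonly interpreted as a test for median differences when the distributions of both groups are similarly shaped.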
When to Use the Mann-Whitney U Test
The test is appropriate when:
- The data are ordinal or continuous but do not follow a normal distribution.
- There are two independent groups (e.g., treated vs. untreated patients, or two different teaching methods).
- The sample size is small (typically fewer than 30 per group), so normality is hard to verify.
- The distributions of the two groups have a similar shape, which is needed if the result is to be read as a comparison of medians.
If more than two groups need to be compared, the Kruskal-Wallis Test should be used instead.
Requirements of the Mann-Whitney U Test
For the test to provide valid results, the following requirements should be met:
- Independence: The data points in one group should not influence or be related to the data points in the other group.
- Ordinal or Continuous Data: The dependent variable should be measured at an ordinal or continuous level.
- Same Distribution Shape: Although a normal distribution is not required, the distributions of both groups should have a similar shape, as this helps in comparing measures of central tendency such as the median.
- Sufficient Sample Size: Each sample should have at least 5 observations for valid statistical conclusions. Smaller sample sizes may lead to unreliable results because there might not be enough data to detect a true difference between the groups.
Steps to Perform the Mann-Whitney U Test
1. Collect two independent samples: Gather the two samples to be compared (Sample 1 and Sample 2).
2. Rank the data: Rank all observations from smallest to largest across both groups. If two or more observations have the same value, assign each of them the average of the ranks they occupy.
3. Sum the ranks: Compute the rank sum for each sample (denoted as R₁ and R₂).
4. Calculate the U-statistic using the formula:
U_1 = n_{1}n_{2} +\frac{n_{1}\left ( n_{1}+1 \right )}{2} - R_{1}
U_2 = n_{1}n_{2} +\frac{n_{2}\left ( n_{2}+1 \right )}{2} - R_{2}
where:
- n₁, n₂ are the sample sizes of the two groups.
- R₁, R₂ are the rank sums of the two groups.
The final U-statistic is the smaller value of U₁ and U₂.
5. Compare U to the critical value: Look up the critical value from the Mann-Whitney U Table at the chosen significance level (e.g., 0.05).
6. Decision rule:
- If U ≤ U₀ (critical value), reject the null hypothesis.
- Otherwise, do not reject the null hypothesis.
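The steps above can be sketched in plain Python. This is a minimal illustration, not a reference implementation: the helper names `average_ranks` and `mann_whitney_u` and the small input lists are made up for the example, and the comparison with the critical value is left to the reader because that value comes from a table.

```python
def average_ranks(values):
    """Return 1-based ranks for `values`; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        # Sorted positions start..end hold ranks start+1..end+1; average them.
        mean_rank = (start + 1 + end + 1) / 2
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def mann_whitney_u(sample_1, sample_2):
    """Return (U, U1, U2) using the formulas given in step 4."""
    n1, n2 = len(sample_1), len(sample_2)
    ranks = average_ranks(list(sample_1) + list(sample_2))  # step 2
    r1 = sum(ranks[:n1])                                    # step 3
    r2 = sum(ranks[n1:])
    u1 = n1 * n2 + n1 * (n1 + 1) / 2 - r1                   # step 4
    u2 = n1 * n2 + n2 * (n2 + 1) / 2 - r2
    return min(u1, u2), u1, u2


# Hypothetical data with a three-way tie at the value 4.
u, u1, u2 = mann_whitney_u([1, 4, 4], [2, 4, 5, 7])
print(u, u1, u2)
```

The returned U is then compared with the tabulated critical value as described in steps 5 and 6.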
Example: Comparing Student Test Scores
A test was conducted on two batches of students and their scores are given below:
| Batch 1 | Batch 2 |
|---|---|
| 3 | 9 |
| 4 | 7 |
| 2 | 5 |
| 6 | 10 |
| 2 | 8 |
| 5 | 6 |
Step 1: Define the Hypotheses
- Null Hypothesis (H₀): There is no significant difference between the test scores of the two batches.
- Alternative Hypothesis (H₁): There is a significant difference between the test scores of the two batches.
- The level of significance (α) is set at 0.05.
Step 2: Rank the Scores
To perform the Mann-Whitney U test, we rank the combined data (both batches together) from lowest to highest. If two values are the same, we assign them the average rank.
| Batch 1 | Rank (Batch 1) | Batch 2 | Rank (Batch 2) |
|---|---|---|---|
| 2 | 1.5 | 5 | 5.5 |
| 2 | 1.5 | 6 | 7.5 |
| 3 | 3 | 7 | 9 |
| 4 | 4 | 8 | 10 |
| 5 | 5.5 | 9 | 11 |
| 6 | 7.5 | 10 | 12 |
- Rank Sum for Batch 1 (R₁) = 23
- Rank Sum for Batch 2 (R₂) = 55
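As a quick sanity check on the ranking, the two rank sums must add up to the sum of the ranks 1 through N, where N = n₁ + n₂ = 12 is the combined sample size (N is notation introduced here for the check):

```latex
R_1 + R_2 = \frac{N(N+1)}{2} = \frac{12 \times 13}{2} = 78 = 23 + 55
```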
Step 3: Compute the Mann-Whitney U Statistic
Calculate U₁ and U₂:
U_1 = (6 \times 6) + \frac{6(6+1)}{2} - 23 = 36 + 21 - 23 = 34
U_2 = (6 \times 6) + \frac{6(6+1)}{2} - 55 = 36 + 21 - 55 = 2
The smaller value of U₁ and U₂ is U = 2.
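The arithmetic can be verified with a standard identity: because the rank sums of the two groups always total (n₁ + n₂)(n₁ + n₂ + 1)/2, the two U values always sum to n₁n₂.

```latex
U_1 + U_2 = n_1 n_2 = 6 \times 6 = 36 = 34 + 2
```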
Step 4: Compare U with the Critical Value
Using a Mann-Whitney U critical values table for n₁ = 6 and n₂ = 6 at α = 0.05 (two-tailed), the critical value is U₀ = 5.
Since U = 2 is less than U₀ = 5, we reject the null hypothesis.
Step 5: Conclusion
Since we rejected the null hypothesis, we conclude that there is a significant difference in test scores between the two student batches.
Implementation of the Mann-Whitney U Test in Python
SciPy provides the test as the mannwhitneyu function in scipy.stats, which returns the U statistic and the corresponding p-value.
```python
from scipy.stats import mannwhitneyu

# Test scores from the worked example above
batch_1 = [3, 4, 2, 6, 2, 5]
batch_2 = [9, 7, 5, 10, 8, 6]

# Two-sided test: H1 is that the batches differ in either direction
stat, p_value = mannwhitneyu(batch_1, batch_2, alternative='two-sided')
print('Statistics=%.2f, p=%.2f' % (stat, p_value))

# Compare the p-value with the chosen significance level
alpha = 0.05
if p_value < alpha:
    print('Reject Null Hypothesis (Significant difference between two samples)')
else:
    print('Do not Reject Null Hypothesis (No significant difference between two samples)')
```
Output:
Statistics=2.00, p=0.01
Reject Null Hypothesis (Significant difference between two samples)
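To see where a p-value of this size can come from, the sketch below applies the normal approximation to U by hand, with a tie correction and a continuity correction of 0.5. It assumes that SciPy uses its asymptotic method for this data because the samples contain ties, as the documentation of recent versions describes; the variable names are illustrative and only the standard library is used.

```python
import math
from collections import Counter

batch_1 = [3, 4, 2, 6, 2, 5]
batch_2 = [9, 7, 5, 10, 8, 6]
n1, n2 = len(batch_1), len(batch_2)
n = n1 + n2

u = 2.0                 # smaller of U1 and U2 from the worked example
mean_u = n1 * n2 / 2    # expected value of U under the null hypothesis

# Tie correction: sum of t^3 - t over every group of t tied values.
tie_term = sum(t**3 - t for t in Counter(batch_1 + batch_2).values())
var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))

# Standardise U, shrinking the distance from the mean by 0.5 (continuity correction).
z = (abs(u - mean_u) - 0.5) / math.sqrt(var_u)

# Two-sided p-value: 2 * (1 - Phi(z)), written with the complementary error function.
p_two_sided = math.erfc(z / math.sqrt(2))
print(f"z = {z:.3f}, p = {p_two_sided:.4f}")
```

The result should come out close to 0.0126, which rounds to the 0.01 shown in the output above.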