EDA Final Topic

Calculate and Interpret the Results of Hypothesis Testing
Hypothesis testing is a statistical method used to determine
whether a hypothesis about a population parameter is true
or false. Here's a step-by-step guide:
Hypothesis Testing Process
1. Formulate the null and alternative hypotheses: Define the null
hypothesis (H0) and alternative hypothesis (H1).
2. Choose a significance level (α): Typically 0.05.
3. Select a sample and calculate the test statistic: Calculate the
sample mean, proportion, or other relevant statistic.
4. Determine the critical region: Identify the region where the
null hypothesis is rejected.
5. Calculate the p-value: Probability of observing the test statistic
under H0.
6. Make a decision: Reject H0 if p-value < α or test statistic falls
within the critical region.
7. Interpret the results: Discuss the implications of rejecting or
failing to reject H0.
Types of Hypothesis Tests
1. Z-test: For large samples, comparing means or
proportions.
2. T-test: For small samples, comparing means.
3. Chi-squared test: For categorical data, testing
independence or goodness-of-fit.
4. ANOVA: Comparing means across multiple groups.
5. Regression analysis: Testing relationships between
variables.
Common Hypothesis Testing Formulas
1. Z-score: Z = (X̄ - μ) / (σ / √n)
2. T-statistic: t = (X̄ - μ) / (s / √n)
3. Chi-squared statistic: χ² = Σ [(observed - expected)² /
expected]
4. p-value: Calculated using statistical software or tables.
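Here's a minimal R sketch of these formulas; the numbers are made up purely for illustration:

# z-score for a sample mean when the population sd (sigma) is known
xbar <- 52; mu <- 50; sigma <- 8; n <- 40
z <- (xbar - mu) / (sigma / sqrt(n))               # ≈ 1.58
# t-statistic when sigma is unknown and s estimates it
s <- 8.5
t_stat <- (xbar - mu) / (s / sqrt(n))              # ≈ 1.49
# chi-squared statistic from observed vs expected counts
observed <- c(18, 22, 20); expected <- c(20, 20, 20)
chi_sq <- sum((observed - expected)^2 / expected)  # 0.4
# two-sided p-value for the z statistic
p_z <- 2 * pnorm(-abs(z))                          # ≈ 0.11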
Interpretation of Results
1. Reject H0: Significant difference/effect exists (p-value < α).
2. Fail to reject H0: No significant difference/effect (p-value ≥ α).
3. Type I error: Rejecting H0 when true (α).
4. Type II error: Failing to reject H0 when false (β).
Statistical inference of two samples
Statistical inference for two samples involves comparing the
characteristics of two groups to determine if there's a
significant difference between them. Here's a comprehensive
overview:
Types of Two-Sample Tests
1. Independent Samples T-Test: Compares means of two
independent groups.
2. Paired Samples T-Test: Compares means of two related
groups (e.g., before-after).
3. Mann-Whitney U Test: Compares medians of two
independent groups (non-parametric).
4. Wilcoxon Signed-Rank Test: Compares medians of two related
groups (non-parametric).
5. Chi-Squared Test: Compares proportions of two independent
groups.
Assumptions
1. Independence: Samples are randomly selected and
independent.
2. Normality: Data follows a normal distribution (for parametric
tests).
3. Equal Variances: Variances are equal across groups (for
parametric tests).
Hypothesis Testing
1. Null Hypothesis (H0): No significant difference between
groups.
2. Alternative Hypothesis (H1): Significant difference between
groups.
3. Significance Level (α): Typically 0.05.
4. Test Statistic: Calculated value (e.g., t-statistic, z-score).
5. p-value: Probability of observing the test statistic under H0.
Interpretation
1. Reject H0: Significant difference between groups (p-value < α).
2. Fail to reject H0: No significant difference between groups (p-
value ≥ α).
3. Type I Error: Rejecting H0 when true (α).
4. Type II Error: Failing to reject H0 when false (β).
Common Formulas
1. Independent Samples T-Test:
t = (X̄1 - X̄2) / sqrt((s1^2/n1) + (s2^2/n2))
2. Paired Samples T-Test: t = (X̄d) / (sd / sqrt(n))
3. Mann-Whitney U Test: U = n1n2 + (n1(n1+1)/2) - R1
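Here's a minimal R sketch of the Mann-Whitney U formula above, with two tiny made-up samples:

g1 <- c(3, 5, 8); g2 <- c(2, 4, 9)   # hypothetical data
n1 <- length(g1); n2 <- length(g2)
R1 <- sum(rank(c(g1, g2))[1:n1])     # rank sum of group 1 in the pooled sample
U <- n1*n2 + n1*(n1 + 1)/2 - R1      # U = 4 for these data
# Note: base R's wilcox.test() reports W = R1 - n1(n1+1)/2,
# so the U above equals n1*n2 - W.
wilcox.test(g1, g2)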
Problem
A coffee shop owner claims that the average amount of coffee
consumed per customer is 250ml. A random sample of 36
customers shows an average consumption of 270ml with a
standard deviation of 50ml. Test the claim at a 5% significance
level.
Given Values
1. Sample size (n): 36
2. Sample mean (X̄): 270ml
3. Sample standard deviation (s): 50ml
4. Population mean (μ): 250ml (claimed)
5. Significance level (α): 0.05
Hypothesis
1. H0: μ = 250 (null hypothesis)
2. H1: μ ≠ 250 (alternative hypothesis)
Calculation
1. Calculate the test statistic (t):
t = (X̄ - μ) / (s / √n)
t = (270 - 250) / (50 / √36)
t = 20 / 8.33
t ≈ 2.4
2. Determine the degrees of freedom:
df = n - 1
df = 36 - 1
df = 35
3. Find the critical t-value:
Using a t-distribution table or software, find the critical t-value
for α = 0.05 and df = 35.
t-critical ≈ ±2.030
4. Calculate the p-value:
Using software or a t-distribution table, find the p-value
associated with
t ≈ 2.4 and df = 35.
p-value ≈ 0.022
Interpretation
Since the calculated t-value (2.4) exceeds the critical t-value
(2.030), and the p-value (0.022) is less than α (0.05), we:
1. Reject H0: The average amount of coffee consumed per
customer is significantly different from 250 ml.
2. Conclude: The coffee shop owner's claim is not supported; the
sample mean of 270 ml suggests actual consumption is higher than 250 ml.
Software Output
Base R's t.test() requires the raw observations, so with only summary statistics we reproduce the test directly (a minimal sketch):

xbar <- 270; mu0 <- 250; s <- 50; n <- 36
t_stat <- (xbar - mu0) / (s / sqrt(n))               # 2.4
df <- n - 1                                          # 35
p_value <- 2 * pt(-abs(t_stat), df)                  # ≈ 0.022
ci <- xbar + c(-1, 1) * qt(0.975, df) * s / sqrt(n)  # ≈ (253.1, 286.9)

These values match the manual calculations above.
One-Sided Hypothesis (Directional Test)
1. Alternative hypothesis (H1) is directional: Specifies the
direction of the difference or relationship.
2. Null hypothesis (H0) is opposite: States the absence of the
specified effect or difference.
3. Tested in one direction: Only one tail of the distribution is
considered.
4. Example: H0: μ ≤ 10, H1: μ > 10 (testing if the mean is greater
than 10)
Two-Sided Hypothesis (Non-Directional Test)
1. Alternative hypothesis (H1) is non-directional: Doesn't specify
the direction of the difference or relationship.
2. Null hypothesis (H0) states equality: States the absence of any
difference or relationship.
3. Tested in both directions: Both tails of the distribution are
considered.
4. Example: H0: μ = 10, H1: μ ≠ 10 (testing if the mean is
different from 10)
Key Differences
1. Directionality: One-sided tests have a directional alternative
hypothesis, while two-sided tests have a non-directional
alternative hypothesis.
2. Null hypothesis: One-sided tests have a null hypothesis that
specifies the absence of the effect in one direction, while two-
sided tests have a null hypothesis that specifies equality.
3. Critical region: One-sided tests have a critical region in one
tail, while two-sided tests have critical regions in both tails.
4. p-value calculation: One-sided tests calculate the p-value
using the area in one tail, while two-sided tests calculate the p-
value using the area in both tails.
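To make this p-value difference concrete, here's a minimal R sketch with a hypothetical t-statistic:

t_stat <- 2.0; df <- 30                       # made-up values
p_one <- pt(t_stat, df, lower.tail = FALSE)   # one tail ≈ 0.027
p_two <- 2 * p_one                            # both tails ≈ 0.054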
When to Use
1. One-sided test: Use when:
- You have a prior expectation about the direction of the effect.
- The alternative hypothesis is directional.
- You want to detect an increase or decrease.
2. Two-sided test: Use when:
- You don't have a prior expectation about the direction of the
effect.
- The alternative hypothesis is non-directional.
- You want to detect any difference (increase or decrease).
Examples
1. One-sided test:
Testing whether a new medicine increases blood pressure
(H0: μ ≤ 120, H1: μ > 120).
2. Two-sided test:
Testing whether a new exercise program affects blood pressure
(H0: μ = 120, H1: μ ≠ 120).
Common Mistakes
1. Incorrectly specifying the direction of the alternative
hypothesis.
2. Failing to consider the directionality of the test.
3. Misinterpreting the results of a one-sided test.
Problem
A company claims that the average lifespan of its batteries is at
least 500 hours. A random sample of 25 batteries has a mean
lifespan of 520 hours with a standard deviation of 30 hours. Test
the claim at a 5% significance level.
Given Values
1. Sample size (n): 25
2. Sample mean (X̄): 520
3. Sample standard deviation (s): 30
4. Population mean (μ): 500 (claimed)
5. Significance level (α): 0.05
Hypotheses
1. H0: μ ≤ 500 (null hypothesis)
2. H1: μ > 500 (alternative hypothesis, one-sided)
Test Statistic
1. t = (X̄ - μ) / (s / √n)
2. t = (520 - 500) / (30 / √25)
3. t = 20 / 6
4. t ≈ 3.33
Degrees of Freedom
1. df = n - 1
2. df = 25 - 1
3. df = 24
Critical Region
1. One-sided test, α = 0.05
2. Critical t-value ≈ 1.711 (using t-distribution table)
p-value
1. p-value ≈ 0.0014 (using t-distribution table or software)
Decision
1. Reject H0: Since t ≈ 3.33 > 1.711 and p-value ≈ 0.0014 < α = 0.05.
2. Conclude: The data support the claim that the average battery
lifespan is greater than 500 hours.
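Here's a minimal R sketch of this one-sided test, computed from the summary statistics above:

xbar <- 520; mu0 <- 500; s <- 30; n <- 25
t_stat <- (xbar - mu0) / (s / sqrt(n))                     # ≈ 3.33
p_one_sided <- pt(t_stat, df = n - 1, lower.tail = FALSE)  # ≈ 0.0014
# p-value < 0.05, so H0 is rejected in favor of mu > 500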
Here's a sample problem with a solution for a two-sided hypothesis:
Problem
A researcher claims that the average height of adults in
a population is 175 cm. A random sample of 36 adults has a
mean height of 178 cm with a standard deviation of 8 cm. Test
the claim at a 5% significance level.
Given Values
1. Sample size (n): 36
2. Sample mean (X̄): 178 cm
3. Sample standard deviation (s): 8 cm
4. Population mean (μ): 175 cm (claimed)
5. Significance level (α): 0.05
Hypotheses
1. H0: μ = 175 (null hypothesis)
2. H1: μ ≠ 175 (alternative hypothesis, two-sided)
Test Statistic
1. t = (X̄ - μ) / (s / √n)
2. t = (178 - 175) / (8 / √36)
3. t = 3 / 1.333
4. t = 2.25
Degrees of Freedom
1. df = n - 1
2. df = 36 - 1
3. df = 35
Critical Region
1. Two-sided test, α = 0.05
2. Critical t-values ≈ ±2.030 (using t-distribution table)
p-value
1. p-value ≈ 0.031 (using t-distribution table or software)
Decision
1. Reject H0:
Since t = 2.25 > 2.030 and p-value ≈ 0.031 < α = 0.05.
2. Conclude: The researcher's claim is not supported. The average
height of adults is significantly different from 175 cm.
Software Output
As before, t.test() needs raw data, so here's a minimal sketch reproducing the result from the summary statistics:

xbar <- 178; mu0 <- 175; s <- 8; n <- 36
t_stat <- (xbar - mu0) / (s / sqrt(n))                  # 2.25
p_value <- 2 * pt(-abs(t_stat), df = n - 1)             # ≈ 0.031
ci <- xbar + c(-1, 1) * qt(0.975, n - 1) * s / sqrt(n)  # ≈ (175.3, 180.7); excludes 175
Interpretation
The test results indicate that the average height of adults is
significantly different from 175 cm, supporting the alternative
hypothesis. This suggests that the researcher's claim may be incorrect.
Here are key concepts and formulas for testing the mean of a
normal distribution:
Hypothesis Testing
1. Null Hypothesis (H0): μ = μ0 (population mean equals a known
value)
2. Alternative Hypothesis (H1): μ ≠ μ0 (two-tailed), μ > μ0 (one-
tailed, right), or μ < μ0 (one-tailed, left)
3. Test Statistic: z = (x̄ - μ0) / (σ / √n) or t = (x̄ - μ0) / (s / √n)
4. p-value: Probability of observing the test statistic under H0
5. Critical Region: Range of values where H0 is rejected
Types of Tests
1. One-Sample Z-Test: Known population standard deviation (σ)
2. One-Sample T-Test: Unknown population standard deviation
(s)
3. Two-Sample Z-Test: Comparing means of two independent
samples
4. Two-Sample T-Test: Comparing means of two independent
samples
Formulas
1. One-Sample Z-Test: z = (x̄ - μ0) / (σ / √n)
2. One-Sample T-Test: t = (x̄ - μ0) / (s / √n)
3. Two-Sample Z-Test:
z = ((x̄1 - x̄2) - (μ1 - μ2)) / √((σ1²/n1) + (σ2²/n2))
4. Two-Sample T-Test:
t = ((x̄1 - x̄2) - (μ1 - μ2)) / √((s1²/n1) + (s2²/n2))
Assumptions
1. Normality: Data follows a normal distribution
2. Independence: Observations are independent
3. Equal Variances: Variances are equal across groups (for two-
sample tests)
Example
Suppose we want to test whether the average height of adults in
a population is 175 cm, given a sample of 36 adults with a mean
height of 178 cm and standard deviation of 8 cm.
1. H0: μ = 175
2. H1: μ ≠ 175
3. Test statistic:
t = (178 - 175) / (8 / √36) = 2.25
4. p-value ≈ 0.031
5. Reject H0, conclude μ ≠ 175
Here's a sample problem with a solution:
Problem
A manufacturing company claims that the average weight of its
bags of flour is 2 kg. A random sample of 25 bags has a mean
weight of 2.1 kg and a standard deviation of 0.2 kg. Test the
claim at a 5% significance level.
Given Values
1. Sample size (n): 25
2. Sample mean (x̄): 2.1 kg
3. Sample standard deviation (s): 0.2 kg
4. Population mean (μ0): 2 kg (claimed)
5. Significance level (α): 0.05
Hypotheses
1. H0: μ = 2 kg (null hypothesis)
2. H1: μ ≠ 2 kg (alternative hypothesis, two-tailed)
Test Statistic
1. t = (x̄ - μ0) / (s / √n)
2. t = (2.1 - 2) / (0.2 / √25)
3. t = 0.1 / 0.04
4. t = 2.5
Degrees of Freedom
1. df = n - 1
2. df = 25 - 1
3. df = 24
Critical Region
1. Two-tailed test, α = 0.05
2. Critical t-values ≈ ±2.064 (using t-distribution table)
p-value
1. p-value ≈ 0.020 (using t-distribution table or software)
Decision
1. Reject H0: Since t = 2.5 > 2.064 and p-value ≈ 0.020 < α = 0.05.
2. Conclude: The company's claim is incorrect. The average
weight of bags of flour is significantly different from 2 kg.
Software Output
Again, a minimal summary-statistics sketch in place of t.test(), which needs raw data:

xbar <- 2.1; mu0 <- 2; s <- 0.2; n <- 25
t_stat <- (xbar - mu0) / (s / sqrt(n))                  # 2.5
p_value <- 2 * pt(-abs(t_stat), df = n - 1)             # ≈ 0.020
ci <- xbar + c(-1, 1) * qt(0.975, n - 1) * s / sqrt(n)  # ≈ (2.017, 2.183); excludes 2
Interpretation
The test results indicate that the average weight of bags of flour
is significantly different from 2 kg, supporting the alternative
hypothesis. This suggests that the company's claim may be
incorrect.
Here's a comprehensive guide to testing the variance and
standard deviation of a normal distribution:
Hypothesis Testing
1. Null Hypothesis (H0): σ² = σ0² (population variance equals a
known value)
2. Alternative Hypothesis
(H1): σ² ≠ σ0² (two-tailed), σ² > σ0² (one-tailed, right),
or σ² < σ0² (one-tailed, left)
3. Test Statistic: χ² = (n - 1)s² / σ0²
4. p-value: Probability of observing the test statistic under H0
5. Critical Region: Range of values where H0 is rejected
Assumptions
1. Normality: Data follows a normal distribution
2. Independence: Observations are independent
3. Random Sampling: Sample is randomly selected
Test Procedure
1. Calculate sample variance (s²).
2. Choose significance level (α).
3. Determine degrees of freedom (df = n - 1).
4. Calculate test statistic (χ²).
5. Find p-value or critical χ²-value.
6. Make decision: Reject H0 if p-value < α or χ² > critical χ²-value.
Formulas
1. Sample Variance: s² = Σ(xi - x̄)² / (n - 1)
2. Test Statistic: χ² = (n - 1)s² / σ0²
3. Standard Error: SE = √(s² / (2(n - 1)))
Types of Tests
1. One-Sample Chi-Square Test: Testing variance of one sample.
2. Two-Sample F-Test: Comparing variances of two independent
samples.
Example
Suppose we want to test whether the variance of exam scores is
100, given a sample of 25 students with a sample variance of
120.
1. H0: σ² = 100
2. H1: σ² ≠ 100
3. α = 0.05
4. df = 25 - 1 = 24
5. χ² = (24)(120) / 100 = 28.8
6. p-value ≈ 0.45 (two-sided; upper-tail area ≈ 0.23)
7. Since 12.401 < 28.8 < 39.364 (the critical χ²-values for df = 24),
fail to reject H0; the data are consistent with σ² = 100.
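Here's a minimal R sketch of this variance test, assuming the numbers above:

n <- 25; s2 <- 120; sigma0_sq <- 100
chi_sq <- (n - 1) * s2 / sigma0_sq                         # 28.8
p_upper <- pchisq(chi_sq, df = n - 1, lower.tail = FALSE)  # ≈ 0.23
p_two_sided <- 2 * min(p_upper, 1 - p_upper)               # ≈ 0.45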
Here's a sample problem:
Problem
A manufacturer claims that the variance of the weights of its
bags of flour is 0.04 kg². A random sample of 25 bags has a
sample variance of 0.06 kg². Test the claim at a 5% significance
level.
Given Values
1. Sample size (n): 25
2. Sample variance (s²): 0.06 kg²
3. Population variance (σ0²): 0.04 kg² (claimed)
4. Significance level (α): 0.05
Hypotheses
1. H0: σ² = 0.04 kg² (null hypothesis)
2. H1: σ² ≠ 0.04 kg² (alternative hypothesis, two-tailed)
Test Statistic
1. χ² = (n - 1)s² / σ0²
2. χ² = (25 - 1)(0.06) / 0.04
3. χ² = 24(0.06) / 0.04
4. χ² = 36
Critical Region
1. Two-tailed test, α = 0.05
2. Critical χ²-values ≈ 12.401 and 39.364 (using χ²-distribution
table)
p-value
1. p-value ≈ 0.11 (two-sided; upper-tail area P(χ² ≥ 36) ≈ 0.055)
Decision
1. Fail to reject H0: Since 12.401 < χ² = 36 < 39.364 and p-value
≈ 0.11 > α = 0.05.
2. Conclude: There is not enough evidence that the variance of the
weights differs from 0.04 kg².
Interpretation
Although the sample variance (0.06 kg²) is larger than the claimed
0.04 kg², the test statistic does not fall in the critical region, so
the data do not contradict the manufacturer's claim at the 5%
significance level.
Here's a comprehensive guide on test on a population
proportion:
Hypothesis Testing
1. Null Hypothesis (H0): p = p0 (population proportion equals a
known value)
2. Alternative Hypothesis (H1): p ≠ p0 (two-tailed), p > p0 (one-
tailed, right), or p < p0 (one-tailed, left)
3. Test Statistic: z = (p̂ - p0) / √(p0(1-p0)/n)
4. p-value: Probability of observing the test statistic under H0
5. Critical Region: Range of values where H0 is rejected
Assumptions
1. Random Sampling: Sample is randomly selected
2. Independence: Observations are independent
3. Large Sample Size: np0 ≥ 5 and n(1 - p0) ≥ 5
Test Procedure
1. Calculate sample proportion (p̂).
2. Choose significance level (α).
3. Determine test statistic (z).
4. Find p-value or critical z-value.
5. Make decision: Reject H0 if p-value < α or z > critical z-value.
Formulas
1. Sample Proportion: p̂ = (Number of successes) / n
2. Test Statistic: z = (p̂ - p0) / √(p0(1-p0)/n)
3. Standard Error: SE = √(p0(1-p0)/n)
Types of Tests
1. One-Proportion Z-Test: Testing proportion of one population.
2. Two-Proportion Z-Test: Comparing proportions of two
independent populations.
Example
Suppose we want to test whether the proportion of smokers in a
population is 0.3, given a sample of 100 individuals with 35 smokers.
1. H0: p = 0.3
2. H1: p ≠ 0.3
3. α = 0.05
4. p̂ = 35/100 = 0.35
5. z = (0.35 - 0.3) / √(0.3(1-0.3)/100) = 0.05 / 0.0458 ≈ 1.09
6. p-value ≈ 0.28
7. Fail to reject H0; the data are consistent with p = 0.3
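Here's a minimal R sketch of this proportion test; prop.test() is base R's built-in version (it uses the equivalent chi-squared form):

p_hat <- 35/100; p0 <- 0.3; n <- 100
z <- (p_hat - p0) / sqrt(p0 * (1 - p0) / n)           # ≈ 1.09
p_value <- 2 * pnorm(-abs(z))                         # ≈ 0.28
prop.test(x = 35, n = 100, p = 0.3, correct = FALSE)  # same test, X-squared = z^2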
Here's a sample problem on testing a population proportion:
Problem
A company claims that 80% of its customers are satisfied with
their service. A random sample of 200 customers found 154
satisfied customers. Test the claim at a 5% significance level.
Given Values
1. Sample size (n): 200
2. Number of successes (x): 154 (satisfied customers)
3. Sample proportion (p̂): 154/200 = 0.77
4. Population proportion (p0): 0.8 (claimed)
5. Significance level (α): 0.05
Hypotheses
1. H0: p = 0.8 (null hypothesis)
2. H1: p ≠ 0.8 (alternative hypothesis, two-tailed)
Test Statistic
1. z = (p̂ - p0) / √(p0(1-p0)/n)
2. z = (0.77 - 0.8) / √(0.8(1-0.8)/200) = -0.03 / 0.0283
3. z ≈ -1.06
Degrees of Freedom: Not applicable for a one-proportion z-test.
Critical Region
1. Two-tailed test, α = 0.05
2. Critical z-values ≈ ±1.96
p-value
p-value ≈ 0.29 (using z-distribution table or software)
Decision
1. Fail to reject H0: Since |z| ≈ 1.06 < 1.96 and p-value ≈ 0.29 > α =
0.05.
2. Conclude: The sample does not provide significant evidence
against the company's claim that 80% of its customers are
satisfied.
Here's an overview of statistical inference for two samples:
Hypothesis Testing
1. Two-Sample T-Test: Compare means of two independent
samples.
2. Two-Sample Z-Test: Compare proportions of two independent
samples.
3. Wilcoxon Rank-Sum Test: Compare medians of two
independent samples (non-parametric).
4. Mann-Whitney U Test: Compare distributions of two
independent samples (non-parametric).
Confidence Intervals
1. Two-Sample T-Interval: Estimate difference between means.
2. Two-Sample Z-Interval: Estimate difference between
proportions.
Assumptions
1. Independence: Samples are randomly selected and
independent.
2. Normality: Data follows normal distribution (for parametric
tests).
3. Equal Variances: Variances are equal across samples (for
parametric tests).
Test Statistics
1. Two-Sample T-Test:
t = (x̄1 - x̄2) / sqrt((s1²/n1) + (s2²/n2))
2. Two-Sample Z-Test:
z = (p̂1 - p̂2) / sqrt((p̂1(1-p̂1)/n1) + (p̂2(1-p̂2)/n2))
Interpretation
1. p-value: Probability of observing test statistic under null
hypothesis.
2. Confidence Interval: Range of values for population
parameter.
3. Effect Size: Standardized difference between means (e.g.,
Cohen's d).
Common Tests
1. Paired T-Test: Compare means of paired samples.
2. Two-Sample Test of Proportions: Compare proportions of two
independent samples.
3. Kruskal-Wallis Test: Compare three or more independent
samples (non-parametric alternative to one-way ANOVA).
Example
Suppose we want to compare the average heights of males and
females.
1. H0: μ1 = μ2 (null hypothesis)
2. H1: μ1 ≠ μ2 (alternative hypothesis)
3. α = 0.05
4. Sample sizes: n1 = 50 (males), n2 = 50 (females)
5. Sample means: x̄1 = 175.2 cm, x̄2 = 162.1 cm
6. Sample standard deviations: s1 = 5.5 cm, s2 = 4.8 cm
7. t = 13.1 / √((5.5²/50) + (4.8²/50)) ≈ 12.7
8. p-value < 0.0001
9. Reject H0, conclude μ1 ≠ μ2.
Problem
A researcher wants to compare the average exam scores of students
from two different teaching methods: Method A (traditional) and
Method B (online). A random sample of 25 students from each
method yielded:
Given Values
1. Method A (Traditional):
- Sample size (n1): 25
- Sample mean (x̄1): 80
- Sample standard deviation (s1): 10
2. Method B (Online):
- Sample size (n2): 25
- Sample mean (x̄2): 85
- Sample standard deviation (s2): 12
3. Significance level (α): 0.05
Hypotheses
1. H0: μ1 = μ2 (null hypothesis)
2. H1: μ1 ≠ μ2 (alternative hypothesis, two-tailed)
Test Statistic
Two-Sample T-Test:
1. t = (x̄1 - x̄2) / sqrt((s1²/n1) + (s2²/n2))
2. t = (80 - 85) / sqrt((10²/25) + (12²/25)) = -5 / sqrt(9.76)
3. t ≈ -1.60
Degrees of Freedom
1. df = n1 + n2 - 2
2. df = 25 + 25 - 2
3. df = 48
Critical Region
1. Two-tailed test, α = 0.05
2. Critical t-values ≈ ±2.011 (df = 48, using t-distribution table)
p-value
1. p-value ≈ 0.12 (using t-distribution table or software)
Decision
1. Fail to reject H0: Since |t| ≈ 1.60 < 2.011 and p-value ≈ 0.12 > α
= 0.05.
2. Conclude: No significant difference between the average exam
scores of students under Method A and Method B.
Confidence Interval
1. Two-Sample T-Interval: (x̄1 - x̄2) ± tα/2 sqrt((s1²/n1) + (s2²/n2))
2. 95% CI: -5 ± 2.011(3.124) ≈ (-11.28, 1.28)
Software Output
t.test() needs the raw scores, so here's a minimal sketch reproducing the test from the summary statistics (with equal group sizes, the pooled and unpooled standard errors coincide):

x1 <- 80; x2 <- 85; s1 <- 10; s2 <- 12; n1 <- 25; n2 <- 25
se <- sqrt(s1^2/n1 + s2^2/n2)                    # ≈ 3.12
t_stat <- (x1 - x2) / se                         # ≈ -1.60
df <- n1 + n2 - 2                                # 48
p_value <- 2 * pt(-abs(t_stat), df)              # ≈ 0.12
ci <- (x1 - x2) + c(-1, 1) * qt(0.975, df) * se  # ≈ (-11.28, 1.28)
Inference on the difference in means of two normal distributions
with known variances involves hypothesis testing and confidence
intervals.
Hypothesis Testing
1. Null Hypothesis (H0): μ1 = μ2 (equal means)
2. Alternative Hypothesis (H1): μ1 ≠ μ2 (unequal means)
3. Test Statistic: Z = (x̄1 - x̄2) / sqrt((σ1^2 / n1) + (σ2^2 / n2))
4. Critical Region: Reject H0 if |Z| > Zα/2 (two-tailed test)
Confidence Interval
1. Confidence Interval: (x̄1 - x̄2) ± Zα/2 * sqrt((σ1^2 / n1) + (σ2^2 / n2))
2. Margin of Error: Zα/2 * sqrt((σ1^2 / n1) + (σ2^2 / n2))
Assumptions
1. Normality: Both populations are normally distributed.
2. Independence: Samples are independent.
3. Known Variances: Population variances (σ1^2, σ2^2) are
known.
Example
Suppose we want to compare the mean heights of men and women.
1. Sample Data:
- Men (n1 = 100): x̄1 = 175.2 cm, σ1^2 = 10^2
- Women (n2 = 100): x̄2 = 162.1 cm, σ2^2 = 8^2
2. Hypothesis Test:
- H0: μ1 = μ2
- H1: μ1 ≠ μ2
- Z = (175.2 - 162.1) / sqrt((10^2 / 100) + (8^2 / 100)) = 13.1 / sqrt(1.64) ≈ 10.23
- Reject H0 (p-value ≈ 0)
3. 95% Confidence Interval:
- (175.2 - 162.1) ± 1.96 * sqrt((10^2 / 100) + (8^2 / 100)) = 13.1 ± 2.51
≈ (10.6, 15.6)
Key Considerations
1. Sample Size:
Ensure adequate sample sizes (n1, n2) for reliable inference.
2. Variance Homogeneity: Verify equal variances (σ1^2 = σ2^2) for
pooled variance estimates.
3. Non-Parametric Alternatives: Consider non-parametric tests (e.g.,
Wilcoxon rank-sum test) for non-normal data.
Example
Suppose we want to compare the mean exam scores of two classes.
1. Sample 1 (Class A): x̄1 = 85, n1 = 50, σ1^2 = 100
2. Sample 2 (Class B): x̄2 = 80, n2 = 60, σ2^2 = 120
3. Hypothesis:
H0: μ1 = μ2
H1: μ1 ≠ μ2
4. α: 0.05
Solution
1. Test statistic: Z = (85 - 80) / sqrt((100 / 50) + (120 / 60)) = 5 / sqrt(4) = 2.5
2. Critical region: Reject H0 if |Z| > 1.96
3. p-value: P(|Z| > 2.5) ≈ 0.012
4. Decision: Reject H0 (p-value < 0.05); the mean scores differ significantly.
95% Confidence Interval
1. Margin of error: 1.96 * sqrt((100 / 50) + (120 / 60)) = 1.96 * 2 = 3.92
2. Confidence interval: (85 - 80) ± 3.92 = (1.08, 8.92)
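Here's a minimal R sketch of this known-variance z-test, using the class data above:

x1 <- 85; x2 <- 80; var1 <- 100; var2 <- 120; n1 <- 50; n2 <- 60
se <- sqrt(var1/n1 + var2/n2)           # 2
z <- (x1 - x2) / se                     # 2.5
p_value <- 2 * pnorm(-abs(z))           # ≈ 0.012
ci <- (x1 - x2) + c(-1, 1) * 1.96 * se  # (1.08, 8.92)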
Inference on the Difference in Means of Two Normal Distributions,
Variances Unknown
Hypothesis Testing
1. Null Hypothesis (H0): μ1 = μ2 (equal means)
2. Alternative Hypothesis (H1): μ1 ≠ μ2 (unequal means)
3. Test Statistic: t = (x̄1 - x̄2) / sqrt((s1^2 / n1) + (s2^2 / n2))
4. Degrees of Freedom: Typically, min(n1-1, n2-1)
Confidence Interval
1. Confidence Interval: (x̄1 - x̄2) ± tα/2 * sqrt((s1^2 / n1) + (s2^2 / n2))
2. Margin of Error: tα/2 * sqrt((s1^2 / n1) + (s2^2 / n2))
Assumptions
1. Normality: Both populations are normally distributed.
2. Independence: Samples are independent.
3. Equal Variances: Population variances (σ1^2, σ2^2) are equal.
Types of Tests
1. Pooled Variance Test: Assumes equal variances.
2. Welch's Test: Does not assume equal variances.
3. t-Test: Suitable for small samples.
Considerations
1. Sample Size: Ensure adequate sample sizes (n1, n2).
2. Variance Homogeneity: Verify equal variances.
3. Non-Parametric Alternatives: Consider non-parametric tests for non-
normal data.
Example
Compare the mean heights of men and women.
Given Data
1. Men: x̄1 = 175.2 cm, n1 = 100, s1^2 = 10^2
2. Women: x̄2 = 162.1 cm, n2 = 100, s2^2 = 8^2
3. α = 0.05
Hypothesis
1. H0: μ1 = μ2 (equal means)
2. H1: μ1 ≠ μ2 (unequal means)
Solution: Pooled Variance Test
1. Pooled variance: sp^2 = ((n1-1)s1^2 + (n2-1)s2^2) / (n1+n2-2)
= (99(100) + 99(64)) / 198 = 82
2. Standard error: SE = sqrt(sp^2 * (1/n1 + 1/n2))
= sqrt(82 * (1/100 + 1/100)) ≈ 1.28
3. Test statistic: t = (x̄1 - x̄2) / SE = (175.2 - 162.1) / 1.28 ≈ 10.23
4. Degrees of freedom: df = n1 + n2 - 2 = 198
5. Critical value: t0.025,198 ≈ 1.97
6. p-value: P(|t| > 10.23) ≈ 0
Conclusion
Reject H0. Mean heights differ significantly (p < 0.05).
Confidence Interval
1. Margin of error: ME = t0.025,198 * SE ≈ 1.97 * 1.28 ≈ 2.52
2. 95% CI: (175.2 - 162.1) ± 2.52 ≈ (10.58, 15.62)
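Here's a minimal R sketch of the pooled-variance test from these summary statistics:

x1 <- 175.2; x2 <- 162.1; s1_sq <- 100; s2_sq <- 64; n1 <- 100; n2 <- 100
sp_sq <- ((n1 - 1)*s1_sq + (n2 - 1)*s2_sq) / (n1 + n2 - 2)  # 82
se <- sqrt(sp_sq * (1/n1 + 1/n2))                  # ≈ 1.28
t_stat <- (x1 - x2) / se                           # ≈ 10.23
p_value <- 2 * pt(-abs(t_stat), df = n1 + n2 - 2)  # ≈ 0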
Simple Linear Regression
1. Definition: A statistical method to model the relationship between a
dependent variable (y) and an independent variable (x).
2. Equation: y = β0 + β1x + ε (ε = error term)
3. Goals: Predict y values, identify relationships, and estimate
coefficients (β0, β1).
4. Assumptions: Linearity, independence, homoscedasticity, and
normality of errors (multicollinearity arises only with multiple predictors).
Correlation
1. Definition: Measures the strength and direction of a linear
relationship between two variables.
2. Types: Pearson's r (parametric), Spearman's ρ (non-parametric), and
Kendall's τ.
3. Interpretation: Values range from -1 (perfect negative correlation) to
1 (perfect positive correlation).
4. Correlation coefficient (r): Measures strength and direction.
Differences
1. Purpose: Regression predicts y values, while correlation measures
relationship strength.
2. Direction: Regression is directional (x predicts y), whereas correlation
is symmetric; neither by itself establishes causality.
3. Equation: Regression provides a predictive model, whereas correlation
provides a coefficient.
Relationship Between Regression and Correlation
1. Correlation coefficient (r): Square root of R-squared (the coefficient of
determination) in simple linear regression, with the sign of the slope.
2. R-squared: Measures variability explained by the regression model.
3. Regression slope (β1): Related to correlation coefficient (r).
Statistical Tests
1. t-test: Evaluates regression coefficients.
2. F-test: Assesses overall model significance.
3. p-value: Indicates probability of observing results by chance.
Common Metrics
1. Mean Squared Error (MSE): Measures regression model accuracy.
2. Coefficient of Determination (R-squared): Evaluates model fit.
3. Root Mean Squared Error (RMSE): Measures model accuracy.
Problem
A bakery wants to predict the number of bread loaves
sold based on the number of hours advertised (x). The data:
| Hours Advertised (x) | Bread Loaves Sold (y) |
| --- | --- |
| 2 | 100 |
| 4 | 150 |
| 6 | 200 |
| 8 | 250 |
| 10 | 300 |
Step-by-Step Solution
1. Calculate means:
x̄ = (2+4+6+8+10)/5 = 6,
ȳ = (100+150+200+250+300)/5 = 200
2. Calculate deviations:
| x | y | x - x̄ | y - ȳ | (x-x̄)(y-ȳ) | (x-x̄)² |
| --- | --- | --- | --- | --- | --- |
| 2 | 100 | -4 | -100 | 400 | 16 |
| 4 | 150 | -2 | -50 | 100 | 4 |
| 6 | 200 | 0 | 0 | 0 | 0 |
| 8 | 250 | 2 | 50 | 100 | 4 |
| 10 | 300 | 4 | 100 | 400 | 16 |
3. Calculate the slope (β1):
β1 = Σ[(x - x̄)(y - ȳ)] / Σ(x - x̄)²
= (400+100+0+100+400) / (16+4+0+4+16)
= 1000 / 40
= 25
4. Calculate intercept (β0):
β0 = 200 - 25*6
= 200 - 150
= 50
5. Linear Regression Equation:
y = 50 + 25x
6. Plot the data and fitted line on a scatter plot (see the R sketch below).
7. Interpret the results:
- Every additional hour advertised increases bread loaves sold by 25.
- The bakery sells 50 loaves when no hours are advertised.
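Here's a minimal R sketch that fits the same line with lm() and draws the scatter plot:

x <- c(2, 4, 6, 8, 10)
y <- c(100, 150, 200, 250, 300)
fit <- lm(y ~ x)   # least-squares fit
coef(fit)          # intercept 50, slope 25
plot(x, y)         # scatter plot of the data
abline(fit)        # overlay the fitted line y = 50 + 25x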
Empirical models are mathematical models based on observed
data, experience and statistical analysis rather than purely
theoretical assumptions. They describe relationships between
variables, predict outcomes and estimate parameters.
Types of Empirical Models
1. Linear Regression: Models linear relationships between
variables.
2. Non-Linear Regression: Models non-linear relationships
using polynomial or logarithmic functions.
3. Time Series Models: Analyze and forecast data with
temporal dependencies (e.g., ARIMA, Exponential Smoothing).
4. Machine Learning Models: Algorithms like decision trees,
random forests, and neural networks.
5. Econometric Models: Study economic relationships and
forecast economic indicators (e.g., GDP, inflation).
6. Statistical Models: Hypothesis testing and confidence
intervals (e.g., t-tests, ANOVA).
Characteristics
1. Data-driven: Derived from observational data.
2. Pragmatic: Focus on predictive accuracy rather than
theoretical purity.
3. Flexible: Can accommodate non-linear relationships and
interactions.
4. Interpretable: Provide insights into variable relationships.
Applications
1. Forecasting: Predict future values of economic indicators,
sales or demand.
2. Policy Evaluation: Assess impact of policy interventions.
3. Risk Analysis: Estimate probability of adverse events.
4. Optimization: Identify optimal settings for system
performance.
5. Data Mining: Discover hidden patterns and relationships.
Advantages
1. Improved prediction: Better forecasting accuracy.
2. Practical insights: Inform decision-making.
3. Flexibility: Handle complex relationships.
4. Interpretability: Understand variable interactions.
Limitations
1. Data quality: Sensitive to data errors and biases.
2. Overfitting: Models may fit noise rather than underlying
patterns.
3. Limited generalizability: Models may not apply outside the
data range.
4. Assumptions: Require careful validation.
Common Empirical Modeling Techniques
1. Least Squares Estimation
2. Maximum Likelihood Estimation
3. Cross-Validation
4. Bootstrap Resampling
5. Feature Engineering
Real-World Examples
1. Demand forecasting for supply chain optimization.
2. Credit risk modeling for loan approval.
3. Economic forecasting for fiscal policy.
4. Customer segmentation for targeted marketing.
5. Quality control in manufacturing.
Example: Predicting Coffee Shop Sales
Problem Statement
A coffee shop owner wants to predict daily sales based on
advertising expenditure.
Data
1. Independent variable (x): Advertising expenditure ($1,000s)
2. Dependent variable (y): Daily sales ($1,000s)
3. Sample size: 10 weeks
4. Data:
| Week | Advertising (x) | Sales (y) |
| --- | --- | --- |
| 1 | 2 | 10 |
| 2 | 3 | 12 |
| 3 | 4 | 15 |
| 4 | 2 | 9 |
| 5 | 5 | 18 |
| 6 | 3 | 11 |
| 7 | 4 | 16 |
| 8 | 6 | 20 |
| 9 | 5 | 19 |
| 10 | 4 | 17 |
Empirical Model: Linear Regression
y = β0 + β1x + ε
Estimated Model
1. β0: Intercept ≈ 3.64
2. β1: Slope ≈ 2.91
Estimated Model Equation
y = 3.64 + 2.91x
Interpretation
1. For every additional $1,000 spent on advertising, sales
increase by about $2,910.
2. Predicted sales are about $3,640 with no advertising.
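Here's a quick R check of these estimates, using the ten weeks of data above:

x <- c(2, 3, 4, 2, 5, 3, 4, 6, 5, 4)
y <- c(10, 12, 15, 9, 18, 11, 16, 20, 19, 17)
coef(lm(y ~ x))   # intercept ≈ 3.64, slope ≈ 2.91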
Limitations
1. Assumes linear relationship.
2. Ignores seasonality, competition and economic factors.
