PSY1004 Session 08

PSYC1004 Introduction to quantitative methods in psychology Week 8 Lecture notes

Uploaded by

winnieleee6
PSYC1004

Introduction to quantitative methods in psychology

Session 8

Confidence interval (of a population mean)
• Calculating the confidence interval (CI) using the
formula CI = sample mean ± Z * (σ / SQRT(N)), as introduced
in the previous session, requires the population SD (σ)
to be known
• When the population SD is unknown (as in most research
situations), σ can be estimated by the sample SD (s),
and the formula becomes:
Confidence interval (CI) = sample mean ± T * (s / SQRT(N)),
where N = sample size; and
T is a value such that C% (the confidence level) of the area
under the corresponding t distribution (see next page)
lies within ± T of the distribution's mean.

t distribution (source: sphweb.bumc.bu.edu)

• A family of theoretical probability distributions, mathematically derived
• Each t distribution is symmetrical about its mean and extends to infinity
and negative infinity
• The shape of a t distribution (see the figure on this slide) depends on
its degrees of freedom (df)
• For the CI formula (on the previous slide), df = N – 1
• As the number of degrees of freedom increases, the t distribution
approaches the normal distribution; the two coincide for an infinitely
large number of degrees of freedom.
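
The convergence towards the normal distribution can be checked numerically. The following is a minimal sketch using only the Python standard library; the function name `t_pdf` is an illustrative choice, not something from the lecture. It compares the height of the t density at its mean with the height of the standard normal density for several df values.

```python
import math
from statistics import NormalDist

def t_pdf(x, df):
    """Density of Student's t distribution with `df` degrees of freedom at x."""
    # Work on the log scale so that large df values do not overflow gamma().
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))

normal_peak = NormalDist().pdf(0)  # height of the standard normal curve at 0

# The peak of the t curve rises towards the normal peak as df grows.
for df in (1, 5, 30, 1000):
    print(df, round(t_pdf(0, df), 4), round(normal_peak, 4))
```

With small df the curve is lower at the centre and heavier in the tails, which is why the T values used for a CI are larger than the corresponding Z values.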
t distribution
The T value needed for the CI formula above can be obtained from a table of
the critical values of t (source: Caldwell):
• Take the "significance level" as 1 – the level of confidence, e.g., to
construct a 95% CI, the column for the 5% significance level (for a
"two-tailed test") should be used
• Look up the T value for the df required.
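
The two lookup steps can be sketched as a small function. This is only an illustration: `T_TABLE` and `lookup_t` are hypothetical names, and the few rows included are standard published two-tailed critical values, not a transcription of the table on the slide.

```python
# Two-tailed critical values of t, keyed by significance level, then by df.
# Assumption: only a handful of standard rows are included for illustration.
T_TABLE = {
    0.05: {20: 2.086, 24: 2.064, 25: 2.060, 30: 2.042, 40: 2.021},
    0.01: {20: 2.845, 24: 2.797, 25: 2.787, 30: 2.750, 40: 2.704},
}

def lookup_t(confidence_level, df):
    """Return the two-tailed critical t for a confidence level (e.g. 0.95)."""
    # Step 1: significance level = 1 - confidence level (rounded to avoid
    # floating-point noise such as 0.050000000000000044).
    significance = round(1 - confidence_level, 2)
    # Step 2: read the value in that column for the required df.
    return T_TABLE[significance][df]

print(lookup_t(0.95, 24))  # 2.064
```

A df that is missing from the table raises a `KeyError` here, which mirrors the situation where the printed table has no row for the df needed.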
Confidence interval (example)
A study on retirees (Caldwell):
• Sample mean = 12 (emails sent per week); SD = 3.0;
sample size (N) = 25
• Confidence interval (CI) = sample mean ± T * (s / SQRT(N))
• For the t distribution with a df of 24 (i.e., N – 1): 95% of the area
under the distribution curve is within the t value of ±2.064
• 95% CI = 12 ± 2.064 * (3 / SQRT(25)), i.e., between 12 – 1.24 and 12 +
1.24
• The 95% CI of the population mean is estimated to be between
10.76 (lower confidence limit) and 13.24 (upper confidence limit)
• … we estimate that the population mean falls somewhere between 10.76
and 13.24 emails per week, and we have used a method that will produce a
correct estimate 95 times out of 100. (Caldwell)
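
The arithmetic of this example can be reproduced with a short sketch. The function name `t_confidence_interval` is an illustrative choice; the T value of 2.064 is taken from the slide rather than computed.

```python
import math

def t_confidence_interval(sample_mean, sample_sd, n, t_value):
    """CI = sample mean ± T * (s / sqrt(N)), with T supplied by the caller."""
    standard_error = sample_sd / math.sqrt(n)
    margin = t_value * standard_error
    return sample_mean - margin, sample_mean + margin

# Retiree study: mean = 12, SD = 3.0, N = 25, T = 2.064 (df = 24, 95% CI)
lower, upper = t_confidence_interval(12, 3.0, 25, 2.064)
print(round(lower, 2), round(upper, 2))  # 10.76 13.24
```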

t distribution table
• For this course's assignments/quizzes, when the df of the distribution in use is not
available from a statistical table (like the table below), use the table row of the
next lower df number, e.g., for a t distribution with 39 degrees of freedom, use
the table row for df = 30 (not 40). Note: this is a conservative approach and
may not give the most accurate value.
• For other applications, a more accurate value can be obtained from reliable
software or a website, e.g., the Excel formula "=TINV(0.05, 39)" gives 2.023 as the
(two-tailed) critical t value for df = 39.
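
For readers without Excel, a critical t value can also be approximated from first principles. The sketch below is one possible approach, not the method used by Excel or by the course: it integrates the t density with Simpson's rule and searches by bisection for the T that encloses the requested central area. All function names are illustrative.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with `df` degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))

def central_area(t, df, steps=1000):
    """Area under the t curve between -t and +t (Simpson's rule, even steps)."""
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 2 * total * h / 3  # the curve is symmetrical, so double [0, t]

def t_critical(confidence_level, df):
    """Two-tailed critical t: central_area(T, df) equals confidence_level."""
    lo, hi = 0.0, 1.0
    while central_area(hi, df) < confidence_level:  # widen until bracketed
        hi *= 2
    for _ in range(50):  # bisection
        mid = (lo + hi) / 2
        if central_area(mid, df) < confidence_level:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

print(round(t_critical(0.95, 39), 3))  # 2.023, the value for df = 39
print(round(t_critical(0.95, 30), 3))  # 2.042, the "next lower df" table row
```

The df = 30 value is larger than the df = 39 value, so using the lower table row gives a slightly wider interval, which is what makes the course rule conservative.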

Hypothesis testing (example)
Example (Caldwell):
• The population productivity (units/day) of all workers (based
on 3 years' data): mean = 193.80, SD = 31.55
• The mean productivity of a random sample of 50 workers who
were given a flextime arrangement = 202.94

Research hypothesis (H1): The flextime arrangement makes a difference in
productivity

Null hypothesis (H0): The research hypothesis is not true; the observed
mean difference (202.94 – 193.80) is attributed to sampling error.
Hypothesis testing (example)
If H0 is true, how likely is it that we would obtain a sample mean of 202.94 from a
population having a mean of 193.80 and an SD of 31.55?
• If a sample-population difference this large would be reasonably likely
to occur, H0 (the sampling-error explanation) is not rejected
• If such a difference would be very unlikely to occur (say, probability
< .05), it is reasonable to reject H0, i.e., to conclude that H1 is
supported by the data.
[Figure: population distribution (mean = 193.8, σ = 31.55) with the sample
(mean = 202.94) marked, drawn as if the null hypothesis is true. NB: this
diagram is not to scale. Source: Caldwell]
Hypothesis testing (example)
• According to the central limit theorem, the standard error (SE) of the
sampling distribution of the mean = 31.55 / SQRT(50) = 4.46
• The observed difference between the sample mean and the expected mean
= 202.94 – 193.80 = 9.14
• How to evaluate its probability (if H0 is true)?
• The z score of the sample mean (in the sampling distribution of the
mean) = 9.14 / 4.46 = 2.05, i.e., the sample mean is located 9.14
(or 2.05 SE) away from the mean
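
The three quantities on this slide can be recomputed as follows. This is a minimal sketch with illustrative variable names; the input figures are the ones given in the example.

```python
import math

population_mean, population_sd = 193.80, 31.55
sample_mean, n = 202.94, 50

# Standard error of the sampling distribution of the mean
standard_error = population_sd / math.sqrt(n)
# Observed difference between the sample mean and the expected mean
difference = sample_mean - population_mean
# z score: how many standard errors the sample mean lies from the mean
z = difference / standard_error

print(round(standard_error, 2), round(difference, 2), round(z, 2))  # 4.46 9.14 2.05
```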

Hypothesis testing (example)
• Assuming that the sampling distribution of the mean is normally
distributed, only 5 times out of 100 will a z-value of more than
+1.96 or less than –1.96 be expected

• Since the sample’s z-value of 2.05 is beyond the range of ±1.96, the
probability of such a z-score occurring is below 5%
• If this probability criterion (< 5%) was preset for rejecting the null
hypothesis, the null hypothesis can be rejected
• Interpretation: the productivity level of flextime workers is
significantly greater than the productivity level of all workers over
the past 3 years.
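
The ±1.96 cut-off and the probability attached to z = 2.05 can both be obtained from the standard normal distribution. The sketch below uses `statistics.NormalDist` from the Python standard library; the variable names and the explicit two-tailed p-value are illustrative additions, since the slide only compares z with the cut-off.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, SD 1
alpha = 0.05                    # preset probability criterion
z = 2.05                        # z score of the sample mean

# Cut-off that leaves alpha / 2 in each tail of the distribution
critical_z = standard_normal.inv_cdf(1 - alpha / 2)
# Two-tailed probability of a z score at least this far from 0
p_value = 2 * (1 - standard_normal.cdf(abs(z)))
reject_null = p_value < alpha

print(round(critical_z, 2), round(p_value, 3), reject_null)  # 1.96 0.04 True
```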
Statistical significance (example)
In the example above: population mean (SD) = 193.8 (31.55); sample N = 50

• If the mean of the flextime worker sample had turned out to be 184.51,
the observed difference between the sample mean and the
expected mean would have been
184.51 – 193.80 = -9.29
• z-score of the sample mean = -9.29 / SE = -9.29 / 4.46 = -2.08
• This z-score falls outside the range of ±1.96 (95% of the area
under the standard normal distribution curve is within this range)
→ The null hypothesis is rejected
→ Conclusion: the productivity level of flextime workers is significantly
less than the productivity level of all workers over the past 3 years.

Statistical significance (example)
In the example above: population mean (SD) = 193.8 (31.55); sample N = 50

• If the sample mean had turned out to be 199.53, the observed
difference between the sample mean and the expected mean
would have been
199.53 – 193.80 = 5.73
• z-score of the sample mean = 5.73 / SE = 5.73 / 4.46 = 1.28
• This z-score falls within the range of ±1.96 (95% of the area
under the standard normal distribution curve is within this range)
→ The null hypothesis is not rejected
→ Conclusion: there is no significant difference between the
productivity level of flextime workers and the productivity level of
all workers over the past 3 years.
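
The three scenarios in this session (sample means of 202.94, 184.51 and 199.53) follow the same procedure, which can be sketched as one function. The name `one_sample_z_test` and its signature are illustrative assumptions, not part of the lecture material.

```python
import math
from statistics import NormalDist

def one_sample_z_test(sample_mean, population_mean, population_sd, n, alpha=0.05):
    """Return (z, reject) for a two-tailed z test of a sample mean."""
    standard_error = population_sd / math.sqrt(n)
    z = (sample_mean - population_mean) / standard_error
    critical_z = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = .05
    return z, abs(z) > critical_z

for mean in (202.94, 184.51, 199.53):
    z, reject = one_sample_z_test(mean, 193.80, 31.55, 50)
    decision = "reject H0" if reject else "do not reject H0"
    print(mean, round(z, 2), decision)
```

The sign of z shows the direction of the difference, while the decision depends only on whether its absolute value exceeds the cut-off.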
