LING90003 Research in Applied Linguistics - Week 2 Lecture

Uploaded by

amatesx

LING90003 Research in Applied Linguistics

Week 2 lecture
Quantitative research 1: Variables, validity, reliability
Reminders
1. Assignment 1 released:
• Group assignment: Self-organise groups for Assignments 1, 2, and the final assignment via the People
page on Canvas by the end of week 3.
• Assignments may be completed individually: For students who plan to make use of the final assignment
(research proposal) for the purpose of a minor thesis, you should do all the assignments as individual
assignments.
• Read through the assignment instructions, start with the assignment reading and choose one design
(experimental or factorial) for your assignment. We will be discussing experimental and factorial research
designs in weeks 3 and 4.

2. Study guide:
• Check this week’s study guide for more detailed elaboration on key concepts

Research hypotheses
• Technically, quantitative-experimental studies are done in order to support or falsify a
hypothesis.
• Such hypotheses might be:
– “Study abroad improves fluency.”
– “Explicit instruction leads to more effective grammatical learning than implicit
instruction.”
– “Low-proficiency speakers use fewer supportive moves in requests than high-
proficiency speakers.”
– “Third-person singular –s in English is a late acquired feature.”

Variables
Hatch and Lazaraton (1991, p. 51) define a variable as:

… an attribute of a person, a piece of text or an object which “varies” from person to
person, text to text, object to object, or from time to time.

For example:
• language proficiency
• L1 background, motivation
• gender
• test performance
• learning style
• post-vocalic /r/
• Etc.

Independent and dependent variables
• Quantitative-experimental research seeks to understand a cause-effect relationship
• The “cause” variable is the independent variable
• The “effect” variable is the dependent variable
• We try to see if changes in the independent variable lead to changes in the dependent
variable

[Diagram: Independent variable (study abroad) → ? → Dependent variable (L2 fluency).
Two groups that differ on the IV (study abroad vs. no study abroad: “Different!”) are
compared on the DV (L2 fluency: “Different?”).]

Independent variable (IV)
The variable that might be the cause of an observable variation, e.g.,
• Study abroad: yes / no => fluency
• Type of instruction: explicit vs. implicit vs. none (control) => grammatical accuracy
• Proficiency: advanced vs. upper intermediate vs. lower intermediate vs. beginners =>
supportive moves in requests

• IVs have “levels”, e.g. the IV “proficiency” above has four levels
• IVs can be background variables (e.g. proficiency), or
• IVs can be treatments administered to participants (e.g. explicit or implicit instruction or
not at all)

The dependent variable (DV)
The variable that reflects the effect of the IV
– Study abroad: yes / no => fluency
– Type of instruction: explicit vs. implicit vs. none => grammatical accuracy
– Proficiency: advanced vs. upper intermediate vs. lower intermediate vs. beginners =>
supportive moves in requests

• DVs must be observable and measurable


• If a change in the IV leads to a change in the DV, we have established a causal
relationship
• We also need to ascertain the strength of that relationship, e.g.,
– How much of a gain in fluency does study abroad cause?
– How much of a gain in grammatical accuracy do different treatment methods yield?
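
One common way to express the strength of such a relationship is a standardised effect size. The sketch below computes Cohen's d, which is an assumption here since the slides do not name a specific measure. The fluency scores are invented purely for illustration.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Standardised mean difference between two independent groups (pooled SD)."""
    n_a, n_b = len(group_a), len(group_b)
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


# Hypothetical fluency scores, invented for illustration only.
study_abroad = [14, 16, 15, 18, 17]
no_study_abroad = [12, 13, 15, 14, 11]

d = cohens_d(study_abroad, no_study_abroad)
print(f"Cohen's d = {d:.2f}")
```

A larger d means the two levels of the IV are further apart on the DV relative to the spread of scores within each group.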
Operationalisation
IVs and DVs must be operationalised (=defined)
• That’s easy for study abroad / no study abroad, or different treatment conditions
• But how do we measure
– Proficiency
– Competence of listening comprehension
– Grammatical accuracy
– Vocabulary retention
– Complexity of requests
• Tests, rater judgments, frequency of target features are common

Moderator variables
• A moderator variable is a type of IV that affects the DV but is not the focus of the study
• For example:
IV (Type of instruction: explicit, implicit, none)

MV (L2 proficiency)

DV (grammatical accuracy)

• Maybe there is an interaction between proficiency and type of instruction, e.g., only
high proficiency learners profit from certain types of instruction
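
The interaction idea above can be illustrated with a small table of cell means. This is only a sketch: the accuracy figures are hypothetical, chosen so that implicit instruction helps high-proficiency learners but barely helps low-proficiency learners.

```python
# Hypothetical mean grammatical accuracy (%) per cell, invented for illustration only.
cell_means = {
    ("explicit", "high"): 80, ("explicit", "low"): 70,
    ("implicit", "high"): 78, ("implicit", "low"): 52,
    ("none", "high"): 60, ("none", "low"): 50,
}


def gain_over_control(instruction, proficiency):
    """Difference between a treatment cell and the control cell at the same proficiency."""
    return cell_means[(instruction, proficiency)] - cell_means[("none", proficiency)]


gains = {
    (instruction, proficiency): gain_over_control(instruction, proficiency)
    for instruction in ("explicit", "implicit")
    for proficiency in ("high", "low")
}

# If the gain is the same at both proficiency levels, there is no interaction;
# a non-zero difference suggests proficiency moderates the effect of instruction.
interaction = {
    instruction: gains[(instruction, "high")] - gains[(instruction, "low")]
    for instruction in ("explicit", "implicit")
}
print(gains)
print(interaction)
```

In this invented example, explicit instruction yields the same gain at both proficiency levels, whereas the gain from implicit instruction depends heavily on proficiency.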
Controlling variables
• When a moderator variable has not been considered, it’s called an intervening (or
confounding) variable
• Intervening variables (confounds) weaken the study
• They are what we hunt for when critically evaluating an article
• Intervening variables can be controlled by
– Making them full IVs (e.g. differentiate proficiency levels)
– Ensuring that all participants have the same level (e.g. the same proficiency)
– Using large, random samples

Sampling
• A sample is a subset of a larger population
• The population is the group of people to which findings can be
generalised, e.g. second language learners
• A sample must adequately reflect the population about which
conclusions are to be drawn
• This depends on the sampling technique used and theoretical
arguments: for example, one could argue that for some SLA studies
the L1 of participants is irrelevant, so it’s okay to have subjects
from a very limited range of L1s

Types of samples
Random sample: randomly selected

Stratified random sample: grouped according to a
certain background variable (gender, L2 proficiency
level), random samples drawn from sub-groups
– This is useful to control for the effect of moderator variables
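
A minimal sketch of stratified random sampling, assuming a hypothetical population of 60 learners spread over three proficiency levels. The function name, the field names and the population itself are illustrative assumptions, not part of the lecture.

```python
import random
from collections import defaultdict


def stratified_sample(population, stratum_of, n_per_stratum, seed=None):
    """Group the population by a background variable, then sample randomly within each group."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for person in population:
        strata[stratum_of(person)].append(person)
    sample = []
    for label in sorted(strata):
        # rng.sample raises ValueError if a stratum has fewer than n_per_stratum members.
        sample.extend(rng.sample(strata[label], n_per_stratum))
    return sample


# Hypothetical population: 60 learners, 20 at each proficiency level.
levels = ("beginner", "intermediate", "advanced")
population = [{"id": i, "proficiency": levels[i % 3]} for i in range(60)]

sample = stratified_sample(population, lambda p: p["proficiency"], n_per_stratum=5, seed=1)
print(len(sample))
```

Because every proficiency level contributes the same number of participants, proficiency cannot be unevenly distributed across the sample by chance.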

Nonrandomised sampling
• Convenience sample/found sample: a sample that is
used for a study because “it's just there”
– e.g. students in the researcher's class, friends/relatives of
the researcher, participants who sign up for participation
– very common in our field; may lack representativeness
and generalisability

What’s a good sample size?
It depends …
• The smaller the sample, the less stable the statistical results will be
• The more statistical tests and high-powered analyses you run, the larger
the sample should be
• If you expect to find large differences/correlations you don't need as big a
sample as if you want to reliably detect small differences/correlations
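
The link between expected effect size and sample size can be sketched with a standard normal-approximation formula for comparing two group means: n per group ≈ 2 × ((z for α/2 + z for power) / d)². The effect sizes 0.8, 0.5 and 0.2 follow Cohen's conventional labels for large, medium and small effects. Those labels, and the defaults of α = .05 and power = .80, are assumptions and do not come from the slides.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect_size_d, alpha=0.05, power=0.80):
    """Approximate participants needed per group for a two-sided, two-group comparison."""
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) / effect_size_d) ** 2)


for label, d in [("large", 0.8), ("medium", 0.5), ("small", 0.2)]:
    print(f"{label} effect (d = {d}): about {n_per_group(d)} per group")
```

The approximation slightly underestimates what an exact t-test power calculation would give, but it shows the pattern: the smaller the difference you want to detect, the larger the sample you need.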

• Qualitative studies use much smaller samples because of the workload involved in the
analysis (transcribing, categorising)
• A pilot study can do with a small sample because you’re only making sure your
research instrument works
• If fairly simple statistics are to be done, have a sample of about 30 per condition, e.g.,
– compare study-abroad with non study-abroad students: you need a total of 60
participants
– compare non-study abroad with 6 months abroad and 12 months abroad: you need
a total of 90 participants
• The more, the merrier…

Validity and reliability
Key considerations in designing a study and evaluating a study

Validity is the degree to which conclusions from findings are legitimate and
defensible
Reliability is concerned with the precision of measurement, e.g. consistency
of a behaviour/test scores over time or over various administrations of
equivalent test versions.

Reliability can be computed statistically, whereas validity is more of a
rhetorical argument based on evidence.
A measurement can be reliable but totally
invalid:
Example (I): asking students to take an IQ test as a
measure of ESL knowledge will give highly consistent
results (reliable) but says nothing about learners' ESL
knowledge (invalid).

Example (II): you buy a dodgy 10cm ruler. It’s dodgy
because each of the 1cm divisions is slightly longer.
Because of this, every 10cm line you draw is actually 11cm
long! This ruler is a reliable measure in that every time you
use it, it measures 11cm consistently. But it is not valid
because the conclusions you might draw from your
measurement (e.g. where to cut fabric) would be wrong.
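
The ruler example can be put into numbers. This is a sketch that assumes each “1cm” division is really 1.1cm long, which is what the 11cm figure implies. Zero spread across repeated uses stands for consistency (reliability), and the constant offset from the intended length stands for the wrong conclusion (invalidity).

```python
from statistics import mean, pstdev

# Assumption: each "1 cm" division on the dodgy ruler is really 1.1 cm long.
TRUE_CM_PER_DIVISION = 1.1


def drawn_length(nominal_cm):
    """Actual length of a line drawn to a nominal length with the dodgy ruler."""
    return nominal_cm * TRUE_CM_PER_DIVISION


# Draw a "10 cm" line five times.
lines = [drawn_length(10) for _ in range(5)]

spread = pstdev(lines)   # variation between repeated uses
bias = mean(lines) - 10  # systematic error relative to the intended 10 cm
print(f"spread = {spread}, bias = {bias:.1f} cm")
```

The spread is zero, so the ruler is perfectly consistent, yet every line is about 1cm too long.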
Validity and reliability
• However, an unreliable measurement cannot be valid as lack of reliability
means that most of the result is due to error and chance
• In evaluating research, validity is a more central concern than reliability: it
is not too difficult to make instruments measure precisely but drawing the
right conclusions from the results is much more difficult
• Nevertheless, reliability of instrumentation is expected in good quality
research

Types of validity: Internal validity
Internal validity: can we legitimately claim that variations in the IV made a
difference to the DV or is there another explanation?

• For example, was it really length of stay abroad that led to greater fluency?
Or could there be another explanation?

Types of validity: External validity
External validity is the validity of generalisations of findings across persons,
settings, treatment types, and measurement types.

• In other words, do these findings apply anywhere, with any group of students?
• Sampling is extremely important
• The more “laboratory specific” conditions are, the harder it is to argue external
validity

Threats to validity
• External validity is mostly threatened by inadequate sampling: if the sample is
not representative of the target population, conclusions may not be justified.
• Internal validity is subject to a large number of threats.
• These “threats” are also criteria by which we evaluate the quality of research
studies, i.e. did the researcher consider this threat and take steps to eliminate it?
• The most important threats to internal validity are:
– Confounds (intervening variables)
– Experimenter expectancies
– Maturation
– Inadequate instrumentation

Confounds
An (uncontrolled) intervening variable is at work that can explain the findings just as well
as the independent variable
• Example: Bouton (1999) claimed that international students in the US increased their
scores on tests of indirect language use (implicature) with increased length of exposure
to the ESL environment.
• However, Roever (2005) showed that their proficiency also increased, so their
improvement was actually due to their increasing proficiency, not exposure alone.
• How could Bouton (1999) have controlled this confound?

Experimenter expectancies
The researcher may have hypotheses about the likely outcome
– Example: a researcher who is comparing the writing output of a treatment group and a
control group might rate the essays produced in the treatment group more favourably
because they expect them to be better.
– To complicate matters, if they’re the teacher, they might also teach the treatment
group better.

• Double-blind designs avoid this


• How could a double-blind design be implemented in the situation
described above?
Maturation
Perhaps the changes observed over the course of the study are due to
subjects simply maturing, i.e. they would have happened without any
intervention

• Example: learners who live in the L2 environment (e.g. international students studying
English in Australia) are likely to improve their proficiency simply through exposure to
the L2 environment so a specific teaching approach under investigation may actually
have had very little effect

Instrumentation
Do the research instruments really engage the construct under study? Is there
construct-irrelevant variance induced through the instrument?

• Example: a test of listening comprehension where students listen to lectures and then
respond to short answer questions uses a scoring method where an answer must be
spelled correctly to gain full marks.
• So a highly proficient listener with poor spelling skills would be disadvantaged.
• If we want to make claims about that person’s listening ability based on the test score,
we will not be drawing valid inferences.

Reliability
• Reliability is a measure of the precision of the test/research instrument
• Theoretically, it reflects how closely participants’ actual (observed) scores correspond to
their true scores (free of error): in classical test theory, it is the proportion of
observed-score variance that is true-score variance
• Reliability is expressed as a correlation coefficient (a number between 0 and 1)
• A high reliability coefficient is always desirable.
• Generally, reliabilities should be above .7, but for high-stakes tests, they should be
above .8. Reliabilities of .9 and higher are considered excellent.
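
The rules of thumb above can be written as a small helper. This is a sketch only: the function name and the wording of the labels are illustrative, and the cut-offs are the ones stated on the slide.

```python
def interpret_reliability(coefficient, high_stakes=False):
    """Label a reliability coefficient using the slide's rules of thumb."""
    if not 0 <= coefficient <= 1:
        raise ValueError("a reliability coefficient is expected to lie between 0 and 1")
    if coefficient >= 0.9:
        return "excellent"
    # High-stakes tests are held to the stricter .8 threshold.
    threshold = 0.8 if high_stakes else 0.7
    if coefficient > threshold:
        return "acceptable"
    return "below the recommended threshold"


print(interpret_reliability(0.75))
print(interpret_reliability(0.75, high_stakes=True))
print(interpret_reliability(0.92, high_stakes=True))
```

The same coefficient of .75 is acceptable for an ordinary research instrument but falls short for a high-stakes test.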

Types of reliability
• Test-retest: administer the same instrument again and check the correlation with the
first administration
• Parallel forms: administer an equivalent alternative version of the same instrument
(e.g. a pretest and a posttest) and correlate these
• Internal consistency: treat the sections of an instrument as parallel forms and
compute the correlation between them; this is the most common type, known as
“internal consistency reliability”, and its index is Cronbach's alpha (α)
• Inter-rater: calculate the amount of agreement between two (or more) raters
• Intra-rater: calculate the amount of agreement between the first and second (third,
fourth) time that the same rater rates the same dataset
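
Three of the indices above can be sketched in a few lines. The code uses Pearson's r for test-retest reliability, the standard formula for Cronbach's alpha, and simple percent agreement for inter-rater reliability. Percent agreement is only one possible agreement measure and is chosen here as an assumption, because the slides do not specify one. All scores are invented for illustration.

```python
from math import sqrt
from statistics import mean, variance


def pearson_r(x, y):
    """Pearson correlation between two score lists of equal length."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))


def cronbach_alpha(item_scores):
    """item_scores: one list per item, each holding one score per participant."""
    k = len(item_scores)
    totals = [sum(scores) for scores in zip(*item_scores)]  # total score per participant
    item_variance_sum = sum(variance(item) for item in item_scores)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


def percent_agreement(rater_a, rater_b):
    """Proportion of cases on which two raters gave the same rating."""
    return sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)


# Hypothetical data, invented for illustration only.
first_administration = [10, 12, 15, 18, 20]
second_administration = [11, 12, 14, 19, 20]
items = [[2, 3, 4, 5, 4], [1, 3, 4, 4, 5], [2, 2, 4, 5, 5]]
rater_a = ["pass", "fail", "pass", "pass", "fail"]
rater_b = ["pass", "fail", "fail", "pass", "fail"]

test_retest = pearson_r(first_administration, second_administration)
alpha = cronbach_alpha(items)
agreement = percent_agreement(rater_a, rater_b)
print(f"test-retest r = {test_retest:.2f}, alpha = {alpha:.2f}, agreement = {agreement:.0%}")
```

Percent agreement does not correct for agreement that would occur by chance. Chance-corrected indices such as Cohen's kappa exist for that purpose.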
