Research Method In Software Engineering
Research Design
Mulugeta A.
What is Research Design?
“A research design is the arrangement of conditions for collection
and analysis of data in a manner that aims to combine relevance to
the research purpose with economy in procedure.”
It is a comprehensive plan for data collection and data analysis
in an empirical research project.
It is a “blueprint” for empirical research aimed at answering
specific research questions or testing specific hypotheses.
It is a plan that specifies the sources and types of information
required and the arrangement of conditions for the collection and
analysis of data.
More explicitly, the design decisions concern, among others:
1. What are the strategies of inquiry? (Form of the study)
Quantitative: (I) survey, experiment; (II) case study, comparative case
study, or a combination of case study, case comparison, and/or survey
Qualitative: ethnography, phenomenology, case study, grounded
theory, narrative
2. What period of time will the study cover? (Time period of the
analysis)
Cross-sectional analysis
Time series
Longitudinal analysis (panel)
Cross-sectional and longitudinal analysis
3. What type of data is required, and where can it be found? (Type of
data material to be used)
Primary data
Secondary data
Primary and Secondary data
4. What will be the sample design? (Sampling procedures)
Database specific
Sample procedure specific
5. What techniques of data collection will be used? (Techniques of
data collection)
Personal interview, written individual questionnaire
Participant observation, non-participant observation, group
discussion, expert interview, narrative interview
Document analysis
6. How will the data be analyzed? (Method of data analysis)
Statistical (frequency analysis, cross tables, correlation
analysis, regression analysis, path analysis, factor analysis, ...)
Qualitative (thematic analysis)
Sampling
Sampling involves the selection of a number of study units
from a defined study population.
Sampling is the statistical process of selecting a subset
(called a “sample”) of a population of interest for purposes
of making observations and statistical inferences about that
population.
The population is usually too large for us to consider collecting information
from all its members.
Instead, we select a sample of individuals, hoping that the sample is
representative of the population.
The Sampling Design Process
1. Define the target population
2. Determine the sampling frame
3. Select the sampling technique(s)
4. Determine the sample size
5. Execute the sampling process (select the sample)
Define the Target Population
The target population is the collection of elements or objects that
possess the information sought by the researcher and about which
inferences are to be made.
The target population should be defined in terms of elements,
sampling units, extent, and time.
An element is the object about which or from which the information
is desired, e.g., the respondent.
A sampling unit is an element, or a unit containing the element,
that is available for selection at some stage of the sampling
process.
Extent refers to the geographical boundaries.
Time is the time period under consideration.
Determine the Sampling Frame
It is the list of units from which the sample is to be selected.
This is an accessible section of the target population (usually a list
with contact information) from which a sample can be drawn.
If your target population is professional employees at work, you
cannot access all professional employees around the world, so a
more realistic sampling frame would be the employee lists of
one or two local companies that are willing to participate in your
study.
The existence of an adequate and up-to-date sampling frame
often defines the study population.
Sampling Techniques
Probability sampling involves:
Random selection.
A controlled procedure in which each element has a known, nonzero
chance of being selected.
Non-probability sampling involves:
Non-random selection.
Subjective judgment.
Elements whose chance of being selected is unknown.
Probability sampling
It is a procedure in which each element of the population
has a fixed probabilistic chance of being selected for the
sample.
Sampling units are selected by chance.
It requires a precise definition of the target population and general
specification of the sampling frame.
Confidence intervals, which contain the true population value with
a given level of certainty, can be calculated.
This allows the researcher to make inferences and projections about
the target population from which the sample was drawn.
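To make the confidence interval idea concrete, here is a minimal sketch for the mean of a simple random sample. It assumes a normal approximation with z = 1.96 (roughly 95% confidence) and ignores the finite population correction. The function name, the synthetic population, and the sample size are illustrative assumptions, not part of the original material.

```python
import math
import random
import statistics

def mean_confidence_interval(sample, z=1.96):
    """Approximate confidence interval for the population mean.

    Assumes a simple random sample and a normal approximation;
    z=1.96 corresponds to roughly 95% confidence.
    """
    n = len(sample)
    mean = statistics.mean(sample)
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    return mean - z * standard_error, mean + z * standard_error

# Synthetic population (illustrative only): 10,000 made-up task durations in hours.
rng = random.Random(7)
population = [rng.gauss(40, 8) for _ in range(10_000)]

# Draw a simple random sample and compute the interval around its mean.
sample = rng.sample(population, 200)
low, high = mean_confidence_interval(sample)
print(f"sample mean = {statistics.mean(sample):.2f}, 95% CI = ({low:.2f}, {high:.2f})")
```

An interval of this kind is only meaningful when the sample was drawn by a probability method; for a non-probability sample the same arithmetic can be run, but the stated confidence level has no statistical justification.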
Probability: Simple Random Sampling
Each element in the population has a known and equal probability of
selection.
Each possible sample of a given size (n) has a known and equal
probability of being the sample actually selected.
This implies that every element is selected independently of every
other element.
Advantage
Easy to implement
Disadvantage
Requires listing of population elements
Takes more time to implement
Uses larger sample size
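Simple random sampling can be sketched in a few lines, assuming the sampling frame is available as a Python list. The function name and the employee identifiers are hypothetical and used only for illustration.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n distinct elements so that every subset of size n is equally likely."""
    if n > len(frame):
        raise ValueError("sample size cannot exceed the size of the sampling frame")
    rng = random.Random(seed)       # seeded generator so the draw is reproducible
    return rng.sample(frame, n)     # sampling without replacement

# Hypothetical sampling frame: a list of 200 employee identifiers.
frame = [f"employee_{i:03d}" for i in range(1, 201)]
sample = simple_random_sample(frame, 10, seed=42)
print(sample)
```

The sketch also shows the main practical requirement noted above: the complete list of population elements must exist before any element can be drawn.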
Probability: Systematic Sampling
The sample is chosen by selecting a random starting point and
then picking every ith element in succession from the sampling
frame.
The sampling interval, i, is determined by dividing the population
size N by the sample size n and rounding to the nearest integer.
For example, there are 100,000 elements in the population and a sample
of 1,000 is desired. In this case the sampling interval, i, is 100. A random
number between 1 and 100 is selected. If, for example, this number is
23, the sample consists of elements 23, 123, 223, 323, 423, 523, and so
on.
Advantage
When the ordering of the elements is related to the characteristic of
interest, systematic sampling increases the representativeness of the
sample.
Disadvantage
If the ordering of the elements produces a cyclical pattern, systematic
sampling may decrease the representativeness of the sample.
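The worked example above (N = 100,000, n = 1,000, random start 23) can be reproduced with a short sketch. The function name and its parameters are illustrative assumptions; positions are 1-based to match the example.

```python
import random

def systematic_sample(population_size, sample_size, start=None, seed=None):
    """Return the sampling interval and the 1-based positions selected."""
    # Interval i = N / n, rounded to the nearest integer.
    # Note: Python's round() rounds exact halves to the nearest even integer.
    interval = round(population_size / sample_size)
    if start is None:
        # Random starting point between 1 and i, inclusive.
        start = random.Random(seed).randint(1, interval)
    positions = list(range(start, population_size + 1, interval))
    return interval, positions

# Reproduce the example from the text with the starting point fixed at 23.
interval, positions = systematic_sample(100_000, 1_000, start=23)
print(interval)        # 100
print(positions[:6])   # [23, 123, 223, 323, 423, 523]
print(len(positions))  # 1000
```

When N / n is not a whole number, rounding the interval means the number of selected positions can differ slightly from n.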
Probability: Stratified Sampling
It is a two-step process:
First, the population is partitioned into subpopulations, or strata.
The strata should be mutually exclusive and collectively exhaustive: every
population element should be assigned to one and only one stratum, and
no population elements should be omitted.
The elements within a stratum should be as homogeneous (uniform) as possible,
while elements in different strata should be as heterogeneous (different from
one another) as possible.
The stratification variables should also be closely related to the
characteristic of interest.
Next, elements are selected from each stratum by a random
procedure, usually simple random sampling (SRS).
In proportionate (balanced) stratified sampling, the size of the sample
drawn from each stratum is proportionate to the relative size of that stratum in
the total population.
In disproportionate (unequal) stratified sampling, the size of the sample
from each stratum is not proportionate to the relative size of that stratum
in the total population.
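The two steps of proportionate stratified sampling can be sketched as follows. This is a minimal illustration under stated assumptions: the function name, the `stratum_of` callback, and the role-based strata are hypothetical, and allocation uses simple rounding, so the total can differ slightly from n when the proportions are not whole numbers.

```python
import random
from collections import defaultdict

def proportionate_stratified_sample(frame, stratum_of, n, seed=None):
    """Partition the frame into strata, then draw an SRS from each stratum
    with a size proportional to the stratum's share of the frame."""
    rng = random.Random(seed)

    # Step 1: assign every element to exactly one stratum.
    strata = defaultdict(list)
    for element in frame:
        strata[stratum_of(element)].append(element)

    # Step 2: simple random sample within each stratum.
    total = len(frame)
    sample = {}
    for name, members in strata.items():
        k = round(n * len(members) / total)   # proportionate allocation
        sample[name] = rng.sample(members, k)
    return sample

# Hypothetical frame: 600 developers, 300 testers, 100 managers.
frame = (
    [("developer", i) for i in range(600)]
    + [("tester", i) for i in range(300)]
    + [("manager", i) for i in range(100)]
)
sample = proportionate_stratified_sample(frame, lambda e: e[0], n=50, seed=1)
print({name: len(members) for name, members in sample.items()})
```

With these sizes the allocation is 30 developers, 15 testers, and 5 managers. A disproportionate design would replace the allocation line with stratum-specific sizes chosen by the researcher.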
Advantage:
Increases statistical efficiency; observations can be used for
inferential purposes.
It is an objective method of sampling.
Disadvantage:
It can be difficult to decide on the relevant criterion for stratification.
It is a costly and time-consuming method.
Error can increase if subgroups are sampled at different rates.
Probability: Cluster Sampling
The target population is first divided into mutually exclusive and collectively
exhaustive subpopulations, or clusters.
Elements within a cluster should be as heterogeneous (mixed) as possible,
but clusters themselves should be as homogeneous (uniform) as possible.
Ideally, each cluster should be a small-scale representation of the
population.
Then, a random sample of clusters is selected, based on a probability
sampling technique such as SRS.
For each selected cluster, either all the elements are included in the sample
(one-stage) or a sample of elements is drawn probabilistically (two-stage).
In probability proportionate to size sampling, the clusters are sampled with
probability proportional to size. In the second stage, the probability of selecting a
sampling unit in a selected cluster varies inversely with the size of the cluster.
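A short derivation shows why the two stages of probability proportionate to size sampling balance out. The symbols are introduced here for illustration and are not in the original slides: N is the total number of elements, N_c the size of cluster c, m the number of clusters selected, and k the fixed number of elements drawn from each selected cluster. The sketch assumes m N_c / N is at most 1 for every cluster.

```latex
\begin{align*}
P(\text{cluster } c \text{ selected}) &= \frac{m\,N_c}{N}
  && \text{(first stage: proportional to size)} \\
P(\text{element selected} \mid c \text{ selected}) &= \frac{k}{N_c}
  && \text{(second stage: inversely proportional to size)} \\
P(\text{element selected}) &= \frac{m\,N_c}{N}\cdot\frac{k}{N_c} = \frac{m\,k}{N}
\end{align*}
```

Under these assumptions the cluster size cancels, so every element in the population has the same overall chance of entering the sample regardless of how large its cluster is.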
Advantage
Economically efficient; lowest cost per sample.
Disadvantage
Often lower statistical efficiency (more error), because elements within a
cluster tend to be homogeneous rather than heterogeneous.
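One-stage and two-stage cluster sampling can be sketched as below, assuming clusters are selected by simple random sampling. The function name, its parameters, and the team-based clusters are hypothetical and serve only as an illustration.

```python
import random

def cluster_sample(clusters, n_clusters, per_cluster=None, seed=None):
    """Select clusters by SRS, then take all elements (one-stage)
    or an SRS of elements within each selected cluster (two-stage)."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)   # random sample of cluster names
    sample = {}
    for name in chosen:
        members = clusters[name]
        if per_cluster is None:
            sample[name] = list(members)                # one-stage: keep every element
        else:
            size = min(per_cluster, len(members))
            sample[name] = rng.sample(members, size)    # two-stage: sample within cluster
    return sample

# Hypothetical population: 20 development teams (clusters) of 8 developers each.
clusters = {
    f"team_{t:02d}": [f"team_{t:02d}_dev_{d}" for d in range(8)]
    for t in range(20)
}

one_stage = cluster_sample(clusters, n_clusters=4, seed=1)
two_stage = cluster_sample(clusters, n_clusters=4, per_cluster=3, seed=1)
print({name: len(members) for name, members in one_stage.items()})
print({name: len(members) for name, members in two_stage.items()})
```

Only the selected teams need to be contacted, which is where the cost advantage comes from. The price is that developers on the same team may resemble one another, which reduces statistical efficiency.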
Non probability sampling
It relies on the personal judgment of the researcher
rather than chance to select sample elements.
The researcher can arbitrarily or consciously decide which elements
to include in the sample.
Such samples may yield good estimates of the population characteristics,
but the estimates obtained are not statistically projectable to the
population.
End of Chapter 4