KRUPANIDHI COLLEGE OF MANAGEMENT
DEPARTMENT OF MBA
Subject: Business Analytics and Research Methods
CHAPTER 4: DATA COLLECTION AND MEASUREMENT CONCEPTS
1. Different types of Scaling methods/ Scales of Measurement
In statistics, variables are defined and categorized using
different scales of measurement.
Each level of measurement has specific properties that determine
which statistical analyses are appropriate.
There are 4 levels of measurement:
Nominal: the data can only be categorized
Ordinal: the data can be categorized and ranked
Interval: the data can be categorized, ranked, and evenly spaced
Ratio: the data can be categorized, ranked, evenly spaced, and has a natural zero
Nominal level
Categorize data by labelling them in mutually exclusive groups, but there is no order
between the categories.
Examples of nominal scales: City of birth, Gender, Ethnicity, Car brands, Marital
status
Ordinal level
Categorize and rank your data in an order, but you cannot say anything about the
intervals between the rankings.
Although you can rank the top 5 Olympic medallists, this scale does not tell you how
close or far apart they are in number of wins.
Examples of ordinal scales
Top 5 Olympic medallists
Language ability (e.g., beginner, intermediate, fluent),
Likert-type questions (e.g., very dissatisfied to very satisfied)
Interval level
Categorize, rank, and infer equal intervals between neighboring data points, but there
is no true zero point.
The difference between any two adjacent temperatures is the same: one degree. But
zero degrees is defined differently depending on the scale – it doesn’t mean an
absolute absence of temperature.
The same is true for test scores and personality inventories. A zero on a test is
arbitrary; it does not mean that the test-taker has an absolute lack of the trait being
measured.
Examples of interval scales
Test scores (e.g., IQ or exams), Personality inventories, Temperature in Fahrenheit
Ratio level
Categorize, rank, and infer equal intervals between neighboring data points, and there
is a true zero point.
A true zero means there is an absence of the variable of interest. In ratio scales, zero
does mean an absolute lack of the variable.
For example, in the Kelvin temperature scale, there are no negative degrees of
temperature – zero means an absolute lack of thermal energy.
Examples of ratio scales: Height, Weight
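The difference between an interval scale and a ratio scale can be checked with a small calculation. The sketch below is only illustrative: the two temperature readings are made-up values, and the helper name celsius_to_kelvin is an assumption of this example. It shows that differences agree on both scales, while a ratio is only meaningful on the scale with a true zero.

```python
# Interval vs ratio scale: ratios are meaningful only when zero is a true zero.

def celsius_to_kelvin(celsius):
    """Convert a Celsius reading (interval scale) to Kelvin (ratio scale)."""
    return celsius + 273.15

warm_c, cool_c = 20.0, 10.0  # illustrative readings, not taken from the notes

# Differences are meaningful on both scales and agree with each other.
diff_c = warm_c - cool_c
diff_k = celsius_to_kelvin(warm_c) - celsius_to_kelvin(cool_c)

# The Celsius ratio is an artifact of where zero was placed;
# the Kelvin ratio is the ratio of absolute temperatures.
ratio_c = warm_c / cool_c
ratio_k = celsius_to_kelvin(warm_c) / celsius_to_kelvin(cool_c)

print("difference:", diff_c, round(diff_k, 2))
print("ratio:", ratio_c, round(ratio_k, 3))
```

So 20 degrees Celsius is not "twice as hot" as 10 degrees Celsius, even though the Celsius ratio is 2.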
2. Explain the types of Questionnaire. Discuss the types of questions in a
questionnaire.
A questionnaire is a research instrument that consists of a set of questions or other
types of prompts that aim to collect information from a respondent. A research
questionnaire is typically a mix of close-ended questions and open-ended questions.
Types of questionnaires
• Structured Questionnaires:
Structured questionnaires collect quantitative data. The questionnaire is planned and
designed to gather precise information.
• Unstructured Questionnaires:
Unstructured questionnaires collect qualitative data. They use a basic structure and
some branching questions but nothing that limits the responses of a respondent.
The questions are more open-ended to collect specific data from participants.
Types of questions in a questionnaire
Some of the widely used types of questions are:
• Open-Ended Questions
• Dichotomous Questions
• Multiple-Choice Questions
• Scaling Questions
• Pictorial Questions
Open-Ended Questions: Open-ended questions help collect qualitative data in a
questionnaire where the respondent can answer in a free form with little to no
restrictions.
Dichotomous Questions: The dichotomous question is generally a “yes/no” close-ended
question. It is usually used when a basic validation is needed. It is the most
natural form of question in a questionnaire.
Multiple-Choice Questions: Multiple-choice questions are a close-ended question
type in which a respondent has to select one (single-select multiple-choice
question) or many (multi-select multiple choice question) responses from a given
list of options. The multiple-choice question consists of an incomplete stem
(question), right answer or answers, incorrect answers, close alternatives, and
distractors. Of course, not all multiple-choice questions have all of the answer
types. For example, you probably won’t have the wrong or right answers if you’re
looking for customer opinion.
Scaling Questions: These questions are based on the principles of the four
measurement scales – nominal, ordinal, interval, and ratio. A few of the question
types that utilize these scales’ fundamental properties are rank order
questions, Likert scale questions, semantic differential scale questions, and Stapel
scale questions.
Pictorial Questions: This question type is easy to use and encourages respondents to
answer. It works similarly to a multiple-choice question. Respondents are asked a
question, and the answer choices are images. This helps respondents choose an
answer quickly without over-thinking their answers, giving you more accurate data.
3. Write a short note on :
a) Cronbach's Alpha b) Inferential Analysis
a) Cronbach's Alpha
Cronbach's alpha is the most common measure of internal consistency
("reliability"), that is, how closely related a set of items are as a group.
A “high” value for alpha does not imply that the measure is unidimensional.
It is most commonly used when you have multiple Likert questions in a
survey/questionnaire that form a scale and you wish to determine if the scale is
reliable.
Grades for alpha values: A: 0.9 or higher is considered excellent; B: 0.8 to 0.9 is
adequate; C: 0.7 to 0.8 is marginal; D: 0.6 to 0.7 is seriously suspect; F: less than
0.6 is totally unacceptable.
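As a rough illustration of how alpha is obtained, the sketch below uses the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), where k is the number of items. The survey data is hypothetical and the function name cronbach_alpha is an assumption of this example, not part of the notes.

```python
from statistics import variance

def cronbach_alpha(responses):
    """Cronbach's alpha for a list of rows (one row of item scores per respondent)."""
    k = len(responses[0])                 # number of items in the scale
    items = list(zip(*responses))         # transpose: one tuple of scores per item
    item_variances = sum(variance(item) for item in items)
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - item_variances / total_variance)

# Hypothetical 5-point Likert answers: 5 respondents x 3 items.
survey = [
    [4, 5, 4],
    [3, 3, 2],
    [5, 5, 5],
    [2, 1, 2],
    [4, 4, 3],
]

alpha = cronbach_alpha(survey)
print(round(alpha, 3))
```

The result can then be compared against the grading bands listed above. The sample variance (n - 1 in the denominator) is used throughout; any consistent choice gives the same alpha because the variances appear as a ratio.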
b) Inferential Analysis
Inferential statistics is a branch of statistics that uses various analytical
tools to draw inferences about a population from sample data.
There are two main types of inferential statistics - hypothesis testing and regression
analysis.
Hypothesis testing also includes the use of confidence intervals to test the parameters
of a population.
Given below are certain important hypothesis tests that are used in inferential
statistics.
Z Test: A z test is used on data that follows a normal distribution, when the population
variance is known and the sample size is greater than or equal to 30.
T Test: A t test is used when the population variance is unknown and the sample
size is less than 30; the test statistic then follows a Student's t distribution.
F Test: An f test is used to check if there is a difference between the variances of two
samples or populations.
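To make the idea of a hypothesis test concrete, here is a minimal sketch of a one-sample z test. The numbers (sample mean 52, hypothesized mean 50, population standard deviation 6, n = 36) are invented for illustration, and the function name one_sample_z_test is an assumption of this example.

```python
from math import sqrt
from statistics import NormalDist

def one_sample_z_test(sample_mean, pop_mean, pop_sd, n):
    """Return (z, two-sided p-value) for H0: population mean equals pop_mean."""
    standard_error = pop_sd / sqrt(n)
    z = (sample_mean - pop_mean) / standard_error
    # Two-sided p-value: probability of a result at least this extreme under H0.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical study: 36 observations, known population SD of 6.
z, p = one_sample_z_test(sample_mean=52.0, pop_mean=50.0, pop_sd=6.0, n=36)
print(round(z, 2), round(p, 4))
print("reject H0 at 5% level:", p < 0.05)
```

Here the sample mean lies two standard errors above the hypothesized mean, so the null hypothesis is rejected at the 5% level.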
Regression analysis is used to quantify how one variable will change with respect to
another variable. There are many types of regressions available such as simple linear,
multiple linear, nominal, logistic, and ordinal regression. The most commonly used
regression in inferential statistics is linear regression. Linear regression estimates the
effect of a unit change in the independent variable on the dependent variable.
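A simple linear regression can be sketched with the ordinary least squares formulas slope = Sxy / Sxx and intercept = mean(y) - slope * mean(x). The data below is hypothetical (advertising spend against sales) and the function name fit_line is an assumption of this example.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data constructed so that sales = 2 * ad_spend + 1 exactly.
ad_spend = [1, 2, 3, 4, 5]
sales = [3, 5, 7, 9, 11]

slope, intercept = fit_line(ad_spend, sales)
print(slope, intercept)
```

The slope is read as the change in the dependent variable for a one-unit change in the independent variable.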
4. Difference between Census method and Sample Survey methods of data collection
Basis | Census Method | Sample Survey Method
Coverage | Information collected for all items in the population. | Information collected for a subset representing the entire population.
Suitability | Suitable for small investigation areas. | Preferable for large investigation areas.
Accuracy | Generally provides higher accuracy due to studying every population item. | Provides lower accuracy as it involves studying a small sample. However, errors are easier to detect and rectify in this method.
Time | Takes more time for data collection. | Requires less time for data collection.
Cost | More expensive as it covers the entire population. | Less expensive due to smaller sample size.
Nature of Items | Suitable for diverse population characteristics. | Suitable for homogeneous population items.
Verification | Difficult to verify due to high expenses and extensive process. | Easier to verify, and doubts can be resolved through additional enumeration if needed.
5. What is the validity of a research question?
Validity refers to how accurately a method measures what it is intended to measure.
If research has high validity, that means it produces results that correspond to real
properties, characteristics, and variations in the physical or social world.
High reliability is one indicator that a measurement is valid, but reliability alone does not guarantee validity.
CHAPTER 5: SAMPLING AND DATA PREPARATION
1. Write short notes on:
Sample Size
Number of people in the selected sample
Sampling Frame
List of individuals from which the sample is selected
Sampling Technique
Technique or procedure used to select the members of the sample
2. Explain the types of Sampling in detail.
Sampling
The process of selecting a part of the population.
A population is the group of people studied in a research project, e.g. the members of a
town, city, or country.
It is difficult for a researcher to study the whole population because of limited resources,
e.g. time and cost.
The researcher therefore selects a part of the population for the study rather than studying
the whole population.
Types of Sampling: a) Probability Sampling and b) Non-Probability Sampling
a) Probability Sampling: the population is known, so every unit has a known chance of being selected
Types of Probability Sampling
Simple Random Sampling
Stratified Random Sampling
Systematic Sampling
Cluster Sampling
Multi-Stage Sampling
b) Non-Probability Sampling: the population is unknown, so selection chances are not known
Purposive Sampling
Convenience Sampling
Snow-Ball Sampling
Quota Sampling
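Three of the probability sampling techniques listed above can be sketched in a few lines. This is only an illustration under stated assumptions: the population is a made-up list of 100 numbered units, the strata names are hypothetical, and the function names are assumptions of this example. A fixed seed is used so the output is repeatable.

```python
import random

def simple_random_sample(population, n, seed=0):
    """Every unit has an equal chance; units are drawn without replacement."""
    return random.Random(seed).sample(population, n)

def systematic_sample(population, n, seed=0):
    """Pick a random starting unit, then take every k-th unit after it."""
    k = len(population) // n                 # sampling interval
    start = random.Random(seed).randrange(k)
    return population[start::k][:n]

def stratified_sample(strata, fraction, seed=0):
    """Draw the same fraction at random from each stratum (proportionate allocation)."""
    rng = random.Random(seed)
    sample = []
    for units in strata.values():
        size = max(1, round(len(units) * fraction))
        sample.extend(rng.sample(units, size))
    return sample

population = list(range(1, 101))             # hypothetical frame of 100 units
strata = {"urban": list(range(1, 61)),        # hypothetical strata: 60 urban units
          "rural": list(range(61, 101))}      # and 40 rural units

print(simple_random_sample(population, 10))
print(systematic_sample(population, 10))
print(stratified_sample(strata, 0.10))
```

With a 10% fraction, the stratified sample contains 6 urban and 4 rural units, mirroring the make-up of the population.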
3. Distinguish between Sampling error and Non-sampling error
Sampling error
Occurs when a sample does not accurately represent the population.
Sampling errors can be minimized by increasing the sample size.
For example, if a survey to determine average income only includes high-income
individuals, the results will be biased and higher than the population as a whole.
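The effect of sample size on random sampling error can be illustrated with the standard error of the mean, SE = sd / sqrt(n). The standard deviation of 10 used below is an assumed value for illustration, and the function name standard_error is an assumption of this example.

```python
from math import sqrt

def standard_error(sd, n):
    """Standard error of the sample mean for a sample of size n."""
    return sd / sqrt(n)

# Assumed population standard deviation of 10; sample sizes are illustrative.
for n in (25, 100, 400):
    print(n, standard_error(sd=10.0, n=n))
```

Quadrupling the sample size halves the standard error. A larger sample does not, however, remove the bias in the income example above: a survey that reaches only high-income individuals stays biased however many of them are included.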
Non-sampling error
Occurs from sources other than sampling, such as poor sampling techniques, biased
survey questions, or data entry errors. Other examples of non-sampling error include:
a. Measurement errors
b. High nonresponse rates
c. False information by respondents
d. Wrong interpretation by interviewers
e. Inappropriate data analysis
4. Discuss the Characteristics of a Good Sample.
A sample is that part of the universe or population which is selected for the purpose of
investigation. A sample represents the characteristics of the universe.
Essentials of a good sample: A sample must have the following qualities in order to
arrive at unbiased and right conclusions.
(i) Representative: All characteristics of the universe must be represented by sample. It
is possible only when each unit of the universe stands an equal chance of being selected
in the sample.
(ii) Independence: All units of the sample must be independent of each other, i.e., the selection
of one item should not depend on the selection of any other item of the universe.
(iii) Homogeneity: All selected units should be homogeneous with one another and not
contradictory.
(iv) Adequacy: The number of items selected as sample should be fairly adequate so that
some reliable conclusions are drawn for the universe as a whole.
5. What do you mean by Data preparation, field validation, data editing, and coding
of data?
Data preparation, field validation, data editing, and coding are all steps in the process of
transforming raw data into a usable format for analysis:
Data preparation
The process of cleaning, transforming, and enriching raw data to improve accuracy and
reduce processing costs. This can include collecting data, cleansing it, and standardizing
formats.
Field validation
Part of the data editing process, which involves checking completed questionnaires for
errors, omissions, and inconsistencies.
Data editing
The process of checking questionnaires for errors, omissions, and inconsistencies.
Coding
The process of assigning numerical or character codes to questionnaire responses to
prepare the data for computer processing.
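Coding can be pictured as a lookup from each raw answer to a number. The codebook below is hypothetical, as are the missing-value code 99 and the function name code_response; a real study would define its own codebook.

```python
# Hypothetical codebook for one Likert-type question.
SATISFACTION_CODES = {
    "very dissatisfied": 1,
    "dissatisfied": 2,
    "neutral": 3,
    "satisfied": 4,
    "very satisfied": 5,
}
MISSING_CODE = 99  # assumed convention for blank or unrecognised answers

def code_response(answer):
    """Return the numeric code for one raw questionnaire answer."""
    cleaned = answer.strip().lower()      # light editing before coding
    return SATISFACTION_CODES.get(cleaned, MISSING_CODE)

raw_answers = ["Satisfied", " very satisfied ", "", "Neutral"]
coded = [code_response(answer) for answer in raw_answers]
print(coded)  # [4, 5, 99, 3]
```

Giving blank answers an explicit code keeps omissions visible during data editing instead of silently dropping them.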
Other steps in data processing include:
Classifying data
Tabulating data
Creating data diagrams
Inputting and processing data using algorithms
Outputting and interpreting data in readable formats
Storing processed data for future use and reporting
MODULE SIX: DATA ANALYSIS AND REPORT WRITING
1. Explain the concepts Univariate analysis and Bivariate analysis
Descriptive statistics is a method of analyzing data, and univariate and bivariate
are two types of descriptive statistics that differ in the number of variables they analyze:
Univariate analysis
Analyzes a single variable. The word "uni" means "one". This is the simplest form of data
analysis.
In descriptive statistics, univariate analysis examines only one variable. It is used to identify
characteristics of a single trait and is not used to analyze any relationships or causation.
Univariate analysis techniques:
a. ANOVA
b. Skewness
c. Measures of central tendency
Bivariate analysis
Analyzes two variables at once to understand the relationship between them. Bivariate
analysis attempts to link the two variables by searching for correlation. Two
types of data are collected, and the relationship between the two pieces of information is
analyzed together.
For example, you might use a scatterplot to analyze the relationship between two
variables.
Bivariate analysis Techniques
a. Correlation
b. Regression
Here are some other types of statistical analysis:
Multivariate analysis: Analyzes more than two variables to understand how they affect
responses.
Measures of central tendency: Help identify the typical or representative values in a
dataset. The mean, median, and mode are the three main measures of central tendency.
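The contrast between univariate and bivariate analysis can be shown on a small made-up dataset: study hours and exam scores for five students. The data and the function name pearson_r are assumptions of this example. The univariate part summarizes one variable on its own; the bivariate part measures how the two variables move together.

```python
from math import sqrt
from statistics import mean, median, mode

hours = [2, 3, 3, 5, 7]            # hypothetical study hours
scores = [50, 55, 60, 70, 85]      # hypothetical exam scores

# Univariate: measures of central tendency for a single variable.
summary = {"mean": mean(hours), "median": median(hours), "mode": mode(hours)}
print(summary)

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_x, mean_y = mean(x), mean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Bivariate: strength of the linear relationship between two variables.
r = pearson_r(hours, scores)
print(round(r, 3))
```

A value of r close to +1 indicates a strong positive linear relationship, which is what a scatterplot of these points would also suggest.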
2. Explain the Parametric and Nonparametric Test in detail.
Parametric Test
In statistics, a parametric test is a kind of hypothesis test that makes assumptions about
the parameters (for example, the mean) of the population from which the sample is drawn.
The t-test, which is based on Student's t-statistic, is a commonly used example.
Z Test: A z test is used on data that follows a normal distribution, when the population
variance is known and the sample size is greater than or equal to 30.
F Test: An f test is used to check if there is a difference between the variances of two
samples or populations.
T Test: A t test is used when the population variance is unknown and the sample
size is less than 30; the test statistic then follows a Student's t distribution.
The t-test rests on the underlying assumption that the variable is normally distributed.
The population mean is known, or is assumed to be known, and the population variance is
estimated from the sample. It is also assumed that the variables of interest in the
population are measured on an interval scale.
Non-Parametric Test
A non-parametric test does not assume any particular population distribution defined by
specific parameters. It is also a kind of hypothesis test, but it does not rest on
underlying distributional assumptions. Non-parametric tests are typically based on
differences in the median, which is why this kind of test is also called a distribution-free
test. The test variables are measured at the nominal or ordinal level. If the independent
variables are non-metric, a non-parametric test is usually performed.
Parametric and Nonparametric Methods
Description | Parametric Methods | Nonparametric Methods
Descriptive statistics | Mean, Standard deviation | Median, Interquartile range
Sample with population (or hypothetical value) | One sample t-test (n < 30) and One sample Z-test (n ≥ 30) | One sample Wilcoxon signed rank test
Two unpaired groups | Independent samples t-test (Unpaired samples t-test) | Mann Whitney U test/Wilcoxon rank sum test
Two paired groups | Paired samples t-test | Related samples Wilcoxon signed-rank test
Three or more unpaired groups | One-way ANOVA | Kruskal-Wallis H test
Three or more paired groups | Repeated measures ANOVA | Friedman Test
Degree of linear relationship between two variables | Pearson’s correlation coefficient | Spearman rank correlation coefficient
Predict one outcome variable by at least one independent variable | Linear regression model | Nonlinear regression model/Log linear regression model on log normal data
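One row of the table, Pearson versus Spearman correlation, can be illustrated directly. Spearman's coefficient is the Pearson coefficient computed on ranks instead of raw values, so it depends only on the ordering of the data. The data below is made up, the function names are assumptions of this example, and the simple ranking helper assumes there are no tied values.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient (parametric)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def ranks(values):
    """Rank values from 1 (smallest) upward; assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result

def spearman_rho(x, y):
    """Spearman rank correlation (nonparametric): Pearson on the ranks."""
    return pearson_r(ranks(x), ranks(y))

x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]   # y rises with x, but not along a straight line

print(round(pearson_r(x, y), 3))
print(round(spearman_rho(x, y), 3))
```

Because y always increases with x, the rank correlation is perfect, while the Pearson coefficient is lower because the relationship is not linear.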
3. Discuss the Error in Testing of Hypothesis?
Type I and Type II errors arise from the decision made about the null hypothesis. In a type
I (type-1) error, the null hypothesis is rejected although it is true, whereas in a type II
(type-2) error, the null hypothesis is not rejected even though the alternative hypothesis is
true. A type I error is also known as a “false positive” and a type II error as a “false
negative”. Much of statistical theory revolves around reducing one or both of these errors,
but eliminating both completely is a statistical impossibility.
Type I Error
A type I error occurs when the null hypothesis (H0) of an experiment is true but is
nevertheless rejected. It asserts something that is not present: a false hit. A type I error is
often called a false positive (an event that shows that a given condition is present when it
is absent). In folk-tale terms, a person may see a bear when there is none
(raising a false alarm), where the null hypothesis (H0) is the statement: “There is no
bear”.
The type I error rate, or significance level, is the probability of rejecting the null
hypothesis given that it is true. It is denoted by the Greek letter α (alpha) and is also
known as the alpha level. Usually, the significance level, or probability of a type I error, is
set to 0.05 (5%), on the view that a 5% probability of incorrectly
rejecting the null hypothesis is acceptable.
Type II Error
A type II error occurs when the null hypothesis is false but mistakenly fails to be
rejected. It fails to assert something that is present: a miss. A type II error is also known as a
false negative (where a real hit is rejected by the test and recorded as a miss), in an
experiment checking for a condition with a final outcome of true or false.
A type II error is committed when a true alternative hypothesis is not acknowledged. In
other words, an examiner may fail to discover the bear when in fact a bear is present
(and hence fails to raise the alarm). Again, H0, the null hypothesis, is the statement
“There is no bear”; if a bear is indeed present and goes undetected, this is a type II error
on the part of the investigator. The bear either exists or does not exist in the given
circumstances; the question is whether this is correctly identified, the two possible
mistakes being failing to detect it when it is present and identifying it when it is not present.
The rate of the type II error is denoted by the Greek letter β (beta) and is linked to
the power of a test (which equals 1−β).
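The link between α, β and power can be illustrated for a one-sided z test of H0: mean = mu0 against H1: mean > mu0. The sketch below assumes a known population standard deviation; the numbers (a true shift of 2 units, standard deviation 6, n = 36) are invented, and the function name z_test_power is an assumption of this example.

```python
from math import sqrt
from statistics import NormalDist

def z_test_power(true_shift, sd, n, alpha=0.05):
    """Return (beta, power) for a one-sided z test when the true mean
    exceeds the hypothesized mean by true_shift."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)       # rejection threshold under H0
    shift_in_se = true_shift / (sd / sqrt(n))    # true shift in standard errors
    beta = std_normal.cdf(z_crit - shift_in_se)  # P(fail to reject | H1 true)
    return beta, 1 - beta

beta, power = z_test_power(true_shift=2.0, sd=6.0, n=36)
print(round(beta, 3), round(power, 3))

# A larger sample lowers beta (raises power) at the same alpha.
beta_big, power_big = z_test_power(true_shift=2.0, sd=6.0, n=100)
print(round(beta_big, 3), round(power_big, 3))
```

Lowering α moves the rejection threshold further out and so raises β for a fixed sample size, which is why the two error rates cannot both be driven to zero.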
4. Describe in detail the structure of Research Report.
Report structures do vary among disciplines, but the most common structures include the
following:
Title page
The title page needs to be informative and descriptive, concisely stating the topic of the
report.
Abstract (or Executive Summary in business reports)
The abstract is a brief summary of the context, methods, findings and conclusions of the
report. It is intended to give the reader an overview of the report before they continue
reading, so it is a good idea to write this section last.
An executive summary should outline the key problem and objectives, and then cover the
main findings and key recommendations.
Table of contents
Readers will use this table of contents to identify which sections are most relevant to
them. You must make sure your contents page correctly represents the structure of your
report.
Introduction
In your introduction you should include information about the background to your
research, and what its aims and objectives are. You can also refer to the literature in this
section; reporting what is already known about your question/topic, and if there are any
gaps. Some reports are also expected to include a section called ‘Terms of reference’,
where you identify who asked for the report, what it covers, and what its limitations are.
Methodology
If your report involved research activity, you should state what that was, for example you
may have interviewed clients, organised some focus groups, or done a literature review.
The methodology section should provide an accurate description of the material and
procedures used so that others could replicate the experiment you conducted.
Results/findings
The results/findings section should be an objective summary of your findings, which can
use tables, graphs, or figures to describe the most important results and trends. You do not
need to attempt to provide reasons for your results (this will happen in the discussion
section).
Discussion
In the discussion you are expected to critically evaluate your findings. You may need to
re-state what your report was aiming to prove and whether this has been achieved. You
should also assess the accuracy and significance of your findings, and show how they fit in
the context of previous research.
Conclusion/recommendations
Your conclusion should summarise the outcomes of your report and make suggestions for
further research or action to be taken. You may also need to include a list of specific
recommendations as a result of your study.
References
The references are a list of any sources you have used in your report. Your report should
use the standard referencing style preferred by your school or department eg Harvard,
Numeric, OSCOLA etc.
Appendices
You should use appendices to expand on points referred to in the main body of the report.
If you only have one item it is an appendix, if you have more than one they are called
appendices. You can use appendices to provide backup information, usually data or
statistics, but it is important that the information contained is directly relevant to the
content of the report.
Appendices can be given alphabetical or numerical headings, for example Appendix A, or
Appendix 1. The order they appear at the back of your report is determined by the order
that they are mentioned in the body of your report. You should refer to your appendices
within the text of your report, for example ‘see Appendix B for a breakdown of the
questionnaire results’. Don’t forget to list the appendices in your contents page.
5. Describe in detail the Format of Research Report.
Steps in Report Writing: Report Writing Format
Report writing is a formal style of writing elaborately on a topic. The tone of a report and
the report writing format are always formal. The most important thing to keep in mind is the
target audience, which differs, for example, between a report about a school event and a
report about a business case.
Report Writing Format
The most common parts of a report format are as follows.
Executive summary – highlights of the main report
Table of Contents – index page
Introduction – origin, essentials of the main subject
Body – main report
Conclusion – inferences, measures taken, projections
Reference – sources of information
Appendix