RM-EBBA-class-8-CH0-11-Quatitative-analysis

The document outlines the research methodology for quantitative data analysis, focusing on data preparation, coding, entry, and editing. It discusses various statistical analysis types, measures of central tendency and dispersion, and the importance of reliability and validity in data testing. Additionally, it provides practical guidance on using SPSS for descriptive statistics and reliability analysis.


National Economics University, Business school

EBBA-EBDB Programme

Research Methodology
Class 8
Senior lecturer: Assoc. Prof. Dr. Le Thi My Linh

1
Chapter 14

Quantitative Data Analysis

2
Areas of Editing Concern
◼ Asking the proper questions
◼ Recording answers accurately
◼ Screening questions correctly
◼ Recording open-ended answers
completely and accurately

3
Getting the Data Ready for
Analysis: coding
▪ Coding variable: Naming, Assigning Values, Giving
Labels to Variables and to their Values
▪ The variable label can be called anything you like – this is
what will appear on any table or graph you make.
▪ Each variable can have many values, each of which can
have its own label
◼ Data coding: assigning a number to the participants’
responses so they can be entered into a database.
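The coding step can be sketched in a few lines of Python. This is only an illustration: the variable names, labels, and value labels in the codebook below are assumptions made for the example, not part of the course material.

```python
# Hypothetical codebook: variable name -> variable label and value labels.
CODEBOOK = {
    "gender": {
        "label": "Gender of respondent",
        "values": {1: "Male", 2: "Female"},
    },
    "satisf": {
        "label": "Overall satisfaction",
        "values": {1: "Very dissatisfied", 2: "Dissatisfied", 3: "Neutral",
                   4: "Satisfied", 5: "Very satisfied"},
    },
}

def encode(variable, answer):
    """Return the numeric code for a text answer, or None if it is not in the codebook."""
    for code, text in CODEBOOK[variable]["values"].items():
        if text.lower() == answer.strip().lower():
            return code
    return None

# Raw questionnaire answers (made-up data).
raw_responses = [
    {"gender": "Female", "satisf": "Satisfied"},
    {"gender": "male", "satisf": "Neutral"},
]

# Each answer is replaced by its number so it can be entered into a database.
coded = [{var: encode(var, ans) for var, ans in row.items()} for row in raw_responses]
print(coded)
```

Keeping the labels in one codebook means the same numbers can later be turned back into readable labels for tables and graphs.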

4
DATA CODING: Tips for complex questions

◼ Questions where multiple choices are allowed


◼ Ordinal question
◼ Questions choosing two options out of a list

5
Getting the Data Ready for
Analysis: Data entry
◼ Data entry: after responses have been
coded, they can be entered into a
database. Raw data can be entered
through any software program (e.g.,
Excel, SPSS)

6
Editing Data after entering data
◼ Check for mistakes made during data entry
◼ An example of an illogical response is an
outlier response. An outlier is an observation
that is substantially different from the other
observations.
◼ Inconsistent responses are responses that are
not in harmony with other information.
◼ Illegal codes are values that are not specified
in the coding instructions.
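The three checks above could be automated roughly as follows. The data set, the legal codes, and the 1.5 x IQR rule used to flag outliers are all assumptions chosen for this sketch; the slides do not prescribe a specific outlier rule.

```python
import statistics

# Hypothetical coding instructions: the legal codes for each coded variable.
LEGAL_CODES = {"gender": {1, 2}, "married": {0, 1}}

# Made-up data set with three deliberate entry problems.
rows = [
    {"id": 1, "gender": 1, "married": 1, "age": 34, "years_married": 5},
    {"id": 2, "gender": 3, "married": 0, "age": 29, "years_married": 0},   # illegal code
    {"id": 3, "gender": 2, "married": 0, "age": 41, "years_married": 12},  # inconsistent
    {"id": 4, "gender": 2, "married": 1, "age": 380, "years_married": 2},  # outlier
    {"id": 5, "gender": 1, "married": 1, "age": 38, "years_married": 10},
    {"id": 6, "gender": 2, "married": 1, "age": 45, "years_married": 20},
    {"id": 7, "gender": 1, "married": 0, "age": 31, "years_married": 0},
    {"id": 8, "gender": 2, "married": 1, "age": 36, "years_married": 8},
]

def illegal_codes(rows):
    """Values that are not specified in the coding instructions."""
    return [(r["id"], var) for r in rows
            for var, legal in LEGAL_CODES.items() if r[var] not in legal]

def inconsistent(rows):
    """Answers that contradict other information: unmarried but years married > 0."""
    return [r["id"] for r in rows if r["married"] == 0 and r["years_married"] > 0]

def outliers(rows, var):
    """Values more than 1.5 interquartile ranges outside the quartiles (assumed rule)."""
    values = [r[var] for r in rows]
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    margin = 1.5 * (q3 - q1)
    return [r["id"] for r in rows if r[var] < q1 - margin or r[var] > q3 + margin]

print(illegal_codes(rows), inconsistent(rows), outliers(rows, "age"))
```

Flagged cases still need a human decision: an age of 380 is almost certainly a typing error, but a genuine extreme value should not be deleted automatically.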
7
Types of statistical analysis
◼ Descriptive: describe the variables in a
data matrix
◼ Inferential: Make inferences about the
population’s characteristics based on the
sample data
◼ Differences: compare the mean of the responses
of one group to that of another group
◼ Associative: determines the strength and
direction of relationships between two or more
variables
◼ Predictive: make forecasts of future events
8
Statistical Analysis
◼ Every set of data collected needs some
summary information developed that
describes the numbers it contains
◼ Central tendency and dispersion,
◼ Relationships of the sample data, and
◼ Hypothesis testing

9
Measures of
Central Tendency

Mean: Arithmetic Average

Mode: Response Most Often Given to a Question

Median: Middle Value of a Rank Ordered Distribution

10
Measures of
Central Tendency
◼ Each measure of central tendency
describes a distribution in its own
manner:
◼ for nominal data, the mode is the best
measure.
◼ for ordinal data, the median is generally
the best.
◼ for interval or ratio data, the mean is
generally used.
11
Measures of Dispersions
Describe how close the rest of the values fall to the mean or
another measure of central tendency

Range: distance between the smallest and largest value in a set

Standard Deviation: measure of the average dispersion of the values about the mean

12
Descriptive statistics
◼ Measures of Central Tendency or
location
◼ Measures of
Variability/dispersion/spread
◼ Other measures

13
Measures of Central Tendency
◼ Mode (nominal data): the value that occurs most often
in a set of values
◼ Median (ordinal data): the value that lies in the middle
of a set of ordered values. Not influenced by extreme
data points. Position of the median = (n+1)/2
◼ E.g. 42,36,39,38,40,34,32,44. Ordered: 32,34,36,38,39,40,42,44;
position = (8+1)/2 = 4.5, so M = (38+39)/2 = 38.5
◼ Mean (interval or ratio data): the arithmetic
average of a set of numbers. Subject to strong
influence by extreme data points, so the standard
deviation should always be quoted with it
◼ E.g. 6,6.5,7,7,7,7.5,8,8.5,9: mode: 7; median: 7; mean:
7.39
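As a quick check, the two examples on this slide can be reproduced with Python's standard `statistics` module. The variable names are chosen for this sketch only.

```python
import statistics

# First example from the slide: eight observations, so the median falls
# between the 4th and 5th ordered values.
scores = [42, 36, 39, 38, 40, 34, 32, 44]
median_position = (len(scores) + 1) / 2      # (n + 1) / 2
median_scores = statistics.median(scores)    # average of 38 and 39

# Second example from the slide: nine observations.
ratings = [6, 6.5, 7, 7, 7, 7.5, 8, 8.5, 9]
mode_ratings = statistics.mode(ratings)
median_ratings = statistics.median(ratings)
mean_ratings = statistics.mean(ratings)      # 66.5 / 9

print(median_position, median_scores)
print(mode_ratings, median_ratings, round(mean_ratings, 2))
```

The mean of the second example is 66.5 / 9, which is 7.39 when rounded to two decimals.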

14
Skewed and Symmetric
Distributions
◼ Symmetric data – data sets whose values are evenly spread
around the center; the mean and median are equal.
◼ Skewed data – data sets that are not symmetric; the mean will
be larger or smaller than the median.

Left-Skewed: Mean < Median < Mode (longer tail extends to left)
Symmetric: Mean = Median = Mode
Right-Skewed: Mode < Median < Mean (longer tail extends to right)
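A rough way to apply this rule is to compare the mean with the median. The function below is a sketch under that assumption only: equal mean and median do not prove symmetry, and the three small data sets are made up for illustration.

```python
import statistics

def skew_direction(values):
    """Suggest the direction of skew by comparing the mean with the median."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    if mean > median:
        return "right-skewed"   # long tail of large values pulls the mean up
    if mean < median:
        return "left-skewed"    # long tail of small values pulls the mean down
    return "symmetric"

print(skew_direction([1, 2, 3, 4, 5]))
print(skew_direction([1, 2, 2, 3, 20]))
print(skew_direction([1, 18, 19, 19, 20]))
```

A histogram or box plot remains the more reliable check, since a single comparison of two numbers hides the shape of the distribution.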
15
Measures of Spread
◼ Range: identifies the maximum and minimum values
in a set of numbers. Subject to strong influence by
extreme data points
◼ The interquartile range: the range of the middle
50% of scores, hence a measure of how spread out the
middle 50% of scores are. Not subject to influence by
extreme data points
◼ Variance is calculated by subtracting the mean from
each of the observations in the data set, taking the
square of this difference, and dividing the total of these
by the number of observations.
(variance is the square of the standard deviation)
◼ Standard Deviation: indicates the degree of variation
in a way that can be related to a bell-shaped curve
distribution. It is the typical amount of variation around the
mean, expressed in the same units as the data. Like the mean,
it is still influenced by extreme values (outliers)
16
Range
◼ Simplest measure of variation
◼ Difference between the largest and the smallest
observations, hence a measure of how spread out the
data are.
$\text{Range} = x_{\text{maximum}} - x_{\text{minimum}}$
Example:

[Figure: observations plotted on an axis from 0 to 14; smallest observation = 1, largest = 14]

Range = 14 - 1 = 13
(a dispersion of 13 units)
17
Interquartile Range
Example:
X minimum = 12, Q1 = 30, Median (Q2) = 45, Q3 = 57, X maximum = 70
(25% of the observations lie in each of the four intervals)

Interquartile range
= Q3 – Q1 = 57 – 30 = 27
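The same calculation can be sketched with `statistics.quantiles`. As an assumption for illustration, the five summary values from the example are treated as if they were the whole data set; with the `"inclusive"` method the quartiles then fall exactly on the observations.

```python
import statistics

# Assumption: use the five values from the example as a tiny data set.
data = [12, 30, 45, 57, 70]

# n=4 asks for the three cut points that split the data into quarters.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

print(q1, q2, q3, iqr)
```

Different software packages use slightly different quartile definitions, so results on real data may differ a little from one tool to another.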

18
Variance
◼ Company A sold 30, 40, 50 units of a
product during 3 months: Apr., May,
June
◼ Company B sold 10,40,70 units of a
product during the same period
◼ Range?
◼ Variance?

19
Variance
◼ Variance for Co A:
◼ $\frac{(30-40)^2 + (40-40)^2 + (50-40)^2}{3} = \frac{200}{3} \approx 66.7$
◼ Variance for Co B:
◼ $\frac{(10-40)^2 + (40-40)^2 + (70-40)^2}{3} = \frac{1800}{3} = 600$
◼ The variance is much larger for Co B than for Co A. It is
much more difficult for Co B to estimate how many
goods to stock than it is for Co A
20
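The company example can be checked in Python. The slide divides by the number of observations, which corresponds to `statistics.pvariance` (population variance); `statistics.variance` would divide by n - 1 instead and give larger values. Variable names are chosen for this sketch.

```python
import statistics

sales_a = [30, 40, 50]   # Company A: Apr., May, June
sales_b = [10, 40, 70]   # Company B: same period

# Range: largest minus smallest observation.
range_a = max(sales_a) - min(sales_a)
range_b = max(sales_b) - min(sales_b)

# Variance as on the slide: squared deviations from the mean, divided by n.
var_a = statistics.pvariance(sales_a)
var_b = statistics.pvariance(sales_b)

# Standard deviation is the square root of the variance.
sd_a = statistics.pstdev(sales_a)
sd_b = statistics.pstdev(sales_b)

print(range_a, round(var_a, 1), round(sd_a, 2))
print(range_b, round(var_b, 1), round(sd_b, 2))
```

Both companies have the same mean of 40 units, so only the measures of spread reveal how much harder Company B's sales are to plan for.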
Things to look for in a
histogram/boxplot
Visual inspection (histograms and/or box plots)

◼ Are all the data within the appropriate range?


◼ Are the data symmetrically distributed, or skewed
(bunched at one end and stretched out in a tail toward
the other)?
◼ If it’s bunched, does this mean that the variable
doesn’t differentiate between subjects?
◼ Are the data bimodal (two peaks in a histogram),
suggesting the possibility of two different groups?
◼ Are there outliers (extreme points)?

21
Histogram and boxplot
[Figure: histogram (Std. Dev = .76, Mean = 4.0, N = 22) and box plot (N = 22) of the variable "Attractive"]

Histogram commands: Graphs/Histogram, then click on "Display normal curve"
Box plot commands: Analyze/Descriptive Statistics/Explore/Plots

22
Modify variables
◼ Recode variable: from ratio/ordinal/nominal ->
ordinal/nominal vars
• Age -> age groups (children: age <= 15; adults: 15 < age < 60;
elders: age >= 60)
• Primary/ Secondary/ High/ A degree -> Below High/
Upper High
• Ethnicity: Kinh/ Mong/ Dao/ Pa Ko -> Kinh/ Not Kinh
◼ Compute variable: ratio vars -> ratio vars
• income = salary + bonus
• Revenue = price*quantity
• Lnhhsize = ln(hhsize)
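Outside SPSS, the same recode and compute steps might look like the sketch below. The record fields and the `add_computed` helper are hypothetical, and placing age 60 itself in the "elders" group is an assumption, since the slide's age bands leave that boundary open.

```python
import math

def age_group(age):
    """Recode a ratio variable (age in years) into an ordinal one."""
    if age <= 15:
        return "children"
    if age < 60:
        return "adult"
    return "elders"   # assumption: age 60 is grouped with elders

def kinh_or_not(ethnicity):
    """Collapse a nominal variable with many values into two categories."""
    return "Kinh" if ethnicity == "Kinh" else "Not Kinh"

def add_computed(record):
    """Return a copy of the record with recoded and computed variables added."""
    new = dict(record)
    new["age_group"] = age_group(record["age"])
    new["ethnic2"] = kinh_or_not(record["ethnicity"])
    new["income"] = record["salary"] + record["bonus"]
    new["revenue"] = record["price"] * record["quantity"]
    new["lnhhsize"] = math.log(record["hhsize"])   # natural logarithm
    return new

# Made-up respondent.
respondent = {"age": 42, "ethnicity": "Dao", "salary": 900, "bonus": 100,
              "price": 2.5, "quantity": 40, "hhsize": 4}
result = add_computed(respondent)
print(result)
```

Writing the new values into new variables, rather than overwriting the originals, keeps the raw data available for later checks.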
23
Descriptive statistics

Use SPSS (Statistical Package for the Social Sciences)


1. Performing Frequencies
- Purpose: Describe the structure of the
variables using frequencies and percentages,
with plots
- Commands:
Analyze\Descriptive Statistics\
Frequencies...
- Choose variables: select the variables and the display options
24
- Choose the variables and display the tables
- Statistics: the statistics of the variables
- Charts: optional
- Outputs: frequency, percentage, valid
percentage, cumulative percentage
- Edit output: tables, plots
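The four output columns can be illustrated with a small Python sketch. It is not SPSS's implementation; the function name and the coded education data are made up, with `None` standing for a missing answer.

```python
from collections import Counter

def frequency_table(values):
    """Rows of (value, frequency, percent, valid percent, cumulative percent)."""
    valid = [v for v in values if v is not None]   # drop missing answers
    counts = Counter(valid)
    rows = []
    cumulative = 0.0
    for value in sorted(counts):
        n = counts[value]
        percent = 100 * n / len(values)        # based on all cases
        valid_percent = 100 * n / len(valid)   # based on non-missing cases
        cumulative += valid_percent
        rows.append((value, n, round(percent, 1),
                     round(valid_percent, 1), round(cumulative, 1)))
    return rows

# Made-up coded answers: 1 = High School, 2 = College, 3 = Masters.
education = [1, 2, 2, 3, 2, 1, None, 2, 3, 2]
table = frequency_table(education)
for row in table:
    print(row)
```

Percent and valid percent differ only when some answers are missing, which is why both columns appear in the output.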

25
Descriptive statistics

2. Performing Descriptives
- Purpose: Display the principal statistics of
the variables
- Commands:

Analyze\Descriptive Statistics\
Descriptives...
Designate: variables, summary, display,
save
- Descriptives can also be used to check whether the data are suitable for testing
26
Exercise #1 On Frequency
Distributions
◼ Below is a tabulation of the
demographic data from the Frequency
distribution of a survey done by Ms.
Sandra Jones. Her sample consisted of
148 of a total of 3,700 clerical
employees in three service
organizations. Based on the tabulation
provided below, describe the sample
characteristics.
27
Table 1: Frequency Distributions of Sample (n = 148)

RACE                    EDUCATION                   GENDER
Non-whites = 48 (32%)   High School = 38 (26%)      Males = 111 (75%)
Whites = 100 (68%)      College Degree = 74 (50%)   Females = 37 (25%)
                        Masters Degree = 36 (24%)
28
AGE               # OF YEARS IN ORG.   MARITAL STATUS
< 20 = 10 (7%)    < 1 year = 5 (3%)    Single = 20 (14%)
20-30 = 20 (14%)  1-3 = 25 (17%)       Married = 108 (73%)
31-40 = 30 (20%)  4-10 = 98 (66%)      Divorced = 13 (9%)
> 40 = 88 (59%)   > 10 = 20 (14%)      Alternative Lifestyle = 7 (4%)
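When describing the sample, it helps to confirm that each column adds up to n = 148 and that the percentages follow from the counts. A minimal sketch for the AGE column, with names chosen for this example:

```python
# Counts copied from the AGE column of the table.
age_counts = {"< 20": 10, "20-30": 20, "31-40": 30, "> 40": 88}

n = sum(age_counts.values())   # should equal the sample size

# Percentage of the sample in each age group, rounded to whole numbers.
age_percent = {group: round(100 * count / n) for group, count in age_counts.items()}

print(n, age_percent)
```

The result supports a statement such as "more than half of the respondents (59%) were over 40 years old".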

29
Testing goodness of data
◼ Reliability
◼ Validity

30
Testing goodness of data
Reliability
◼ Cronbach’s alpha is a reliability coefficient that
indicates how well the items in a set are
positively correlated to one another
◼ Cronbach’s alpha is an adequate test of
internal consistency reliability
◼ The closer the Cronbach’s alpha is to 1, the
higher the internal consistency reliability
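For reference, the coefficient is commonly written as below. The notation is an addition for clarity and does not come from the slides: $k$ is the number of items in the scale, $\sigma_i^2$ is the variance of item $i$, and $\sigma_T^2$ is the variance of the total (summed) score.

```latex
\alpha = \frac{k}{k-1}\left(1 - \frac{\sum_{i=1}^{k} \sigma_i^2}{\sigma_T^2}\right)
```

When the items are strongly positively correlated, the variance of the total score is large relative to the sum of the item variances, and alpha moves toward 1.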

31
Cronbach’s alpha
◼ Cronbach’s alpha < 0.6: reliability is poor
◼ Cronbach’s alpha = 0.6: reliability is acceptable
◼ Cronbach’s alpha = 0.7: reliability is acceptable
◼ Cronbach’s alpha > 0.8: reliability is good
◼ Cronbach’s alpha > 0.9: reliability is very good

◼ If Cronbach’s alpha is too low, we can use the
Cronbach’s alpha results table to find out which of the items
would have to be removed from our measure to increase
the internal consistency
32
Reliability Analysis

33
Reliability Analysis
- Commands :
◼ Analyze/Scale/Reliability Analysis/select
the variables constituting the
scale/Model: Alpha/click Statistics/tick "Scale if
item deleted" under "Descriptives for"

34
Example: reliability analysis for the
variable customer differentiation
Item-total Statistics

          Scale Mean if   Scale Variance if   Corrected Item-Total   Alpha if
          Item Deleted    Item Deleted        Correlation            Item Deleted
CUSDIF1   10.04           5.473               .2437                  .7454
CUSDIF2   9.7432          5.0176              .5047                  .3293
CUSDIF3   9.6486          5.3754              .4849                  .3722

Reliability Coefficients
N of Cases = 111.0    N of Items = 3
Alpha = .5878
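To show where the "Alpha" and "Alpha if Item Deleted" figures come from, here is a minimal Python sketch of the usual variance-based formula. It does not reproduce the CUSDIF figures, because the raw responses are not given; the three items and six respondents below are made-up data, and the function names are chosen for this example.

```python
import statistics

def cronbach_alpha(items):
    """Cronbach's alpha; `items` holds one list of scores per item, same respondents."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]   # summed scale per respondent
    item_variance_sum = sum(statistics.variance(col) for col in items)
    return k / (k - 1) * (1 - item_variance_sum / statistics.variance(totals))

def alpha_if_item_deleted(items):
    """Alpha recomputed with each item left out in turn."""
    return [cronbach_alpha(items[:i] + items[i + 1:]) for i in range(len(items))]

# Made-up answers of six respondents; q1 is only weakly related to q2 and q3.
q1 = [3, 1, 4, 2, 5, 2]
q2 = [1, 2, 3, 4, 5, 5]
q3 = [1, 2, 3, 4, 4, 5]

alpha = cronbach_alpha([q1, q2, q3])
deleted = alpha_if_item_deleted([q1, q2, q3])
print(round(alpha, 2), [round(a, 2) for a in deleted])
```

In this made-up data, leaving out `q1` raises alpha, which is the same pattern the table shows for CUSDIF1: its "Alpha if Item Deleted" (.7454) is higher than the overall alpha (.5878).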
35
Example: how to write result of
reliability test
◼ A reliability test was run for each dimension, and
the results indicated that the questions measuring
each factor are internally consistent and can be
regarded as reliable and suitable for further
analysis. The Cronbach's alpha of each factor is shown
in the rightmost column of Table 1. These coefficients
are 0.7 or higher, indicating that the measurement
model achieved reliability (Hair et al., 2009).
In particular, marketing innovation shows good results,
with a Cronbach's alpha above 0.8.
◼ Hair, J. F., Anderson, R. E., Tatham, R. L., & Black, W. C. (2009),
Multivariate Data Analysis, Prentice-Hall International, Inc.
36
Testing goodness of data
Validity
◼ Factorial validity can be established by
submitting the data for factor analysis. The
result of factor analysis (a multivariate
technique) will confirm whether or not the
theorized dimensions emerge
◼ When well validated measures are used,
there is no need, of course, to establish their
validity again for each study
◼ The reliability of the items can be tested
37
