Delhi Technological University
UNIVERSITY SCHOOL OF MANAGEMENT AND ENTREPRENEURSHIP
RESEARCH METHODOLOGY ASSIGNMENT
Submitted by- Nakshatra Agarwal (2K19/BBA/57)
Submitted to- Dr. Jagvinder Singh (Asst. Professor, DTU)
Descriptive Statistics
QUESTION- From the following data, which company is the better investment?
Company X   Company Y
450 430
460 435
700 465
382 500
455 548
392 650
362 690
386 710
562 770
440 570
Answer:-
                     COMPANY X       COMPANY Y
Mean                 458.9           576.8
Standard Error       32.37023667     38.61082174
Median               445             559
Mode                 #N/A            #N/A
Standard Deviation   102.3636763     122.098139
Sample Variance      10478.32222     14907.95556
Kurtosis             2.943193865     -1.431118445
Skewness             1.690066135     0.26920438
Range                338             340
Minimum              362             430
Maximum              700             770
Sum                  4589            5768
Count                10              10
Largest(1)           700             770
Smallest(1)          362             430
Coefficient of variation (cv) = 100*(std. dev./mean)
cv_X = 100*(sigma of X/mean of X) = 100*(102.3636763/458.9) = 22.30631429
cv_Y = 100*(sigma of Y/mean of Y) = 100*(122.098139/576.8) = 21.16819331
FORMULA- =IF(22.30631429>21.16819331,"co. Y is better to invest","co. X is better to invest")
Company Y has the lower coefficient of variation, so its values are more consistent relative to their mean.
Company Y is the better investment.
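The comparison above can be reproduced outside Excel. The following is a minimal sketch using only the Python standard library; the function name `coefficient_of_variation` is illustrative and not part of the original workbook.

```python
import statistics

# Data for the two companies, as listed in the question.
company_x = [450, 460, 700, 382, 455, 392, 362, 386, 562, 440]
company_y = [430, 435, 465, 500, 548, 650, 690, 710, 770, 570]

def coefficient_of_variation(values):
    """Return 100 * (sample standard deviation / mean)."""
    return 100 * statistics.stdev(values) / statistics.mean(values)

cv_x = coefficient_of_variation(company_x)
cv_y = coefficient_of_variation(company_y)

# The company with the smaller coefficient of variation is the more consistent one.
better = "Company Y" if cv_x > cv_y else "Company X"
print(f"cv_X = {cv_x:.4f}, cv_Y = {cv_y:.4f}, more consistent: {better}")
```

`statistics.stdev` uses the n - 1 divisor, which matches the "Standard Deviation" row of the Excel descriptive statistics output.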
QUESTION- The following table gives the details of XYZ Company:
Customer Id   Marital Status   Income (in Rs)   Expenses (in Rs)
1 Married 1800 300
2 Single 1500 200
3 Single 2000 250
4 Married 850 100
5 Single 1600 200
6 Single 1200 200
7 Married 200 50
8 Single 1250 300
9 Single 600 300
10 Married 1200 100
(i) Calculate the descriptive statistics for Income and Expenses. Also interpret the same.
Ans
                     Income (in Rs)   Expenses (in Rs)
Mean                 1220             200
Standard Error       174.1966         28.86751346
Median               1225             200
Mode                 1200             300
Standard Deviation   550.8579         91.28709292
Sample Variance      303444.4         8333.333333
Kurtosis             -0.16933         -1.157142857
Skewness             -0.50161         -0.410791918
Range                1800             250
Minimum              200              50
Maximum              2000             300
Sum                  12200            2000
Count                10               10
Interpretation
1 Average income: 1220
2 Average expense: 200
3 Most common income: 1200
4 Most common expense: 300 as reported by Excel (200 also occurs three times, so the expenses are bimodal)
5 Maximum income: 2000
6 Minimum income: 200
7 Difference between maximum and minimum income: 1800
8 Maximum expense: 300
9 Minimum expense: 50
10 Difference between maximum and minimum expense: 250
11 Size of sample: 10
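As a cross-check on the figures interpreted above, the same descriptive statistics can be sketched with the Python standard library. The helper name `describe` is hypothetical. `statistics.multimode` is used so that every most frequent value is listed, not only the first one.

```python
import statistics

# Income and expenses of the ten customers, as listed in the question.
income = [1800, 1500, 2000, 850, 1600, 1200, 200, 1250, 600, 1200]
expenses = [300, 200, 250, 100, 200, 200, 50, 300, 300, 100]

def describe(values):
    """Collect the main descriptive statistics for a list of numbers."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "modes": statistics.multimode(values),    # all most frequent values
        "stdev": statistics.stdev(values),        # sample standard deviation
        "variance": statistics.variance(values),  # sample variance
        "range": max(values) - min(values),
        "min": min(values),
        "max": max(values),
        "sum": sum(values),
        "count": len(values),
    }

income_stats = describe(income)
expense_stats = describe(expenses)

for name, stats in (("Income", income_stats), ("Expenses", expense_stats)):
    print(name, stats)
```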
QUESTION- Calculate the mean, variance and standard deviation for the following distribution:-
age         20-30   30-40   40-50   50-60   60-70   70-80   80-90
frequency   3       61      132     153     140     51      2
Ans- (A = 55 is the assumed mean, h = 10 is the class width)
age      frequency (f)   x = mid-value   d = (x-A)/10   f*d    f*d^2
20-30    3               25              -3             -9     27
30-40    61              35              -2             -122   244
40-50    132             45              -1             -132   132
50-60    153             55              0              0      0
60-70    140             65              1              140    140
70-80    51              75              2              102    204
80-90    2               85              3              6      18
total    542                                            -15    765
Mean = A + h*(sum of fd/N) = 55 + 10*(-15/542) = 54.72324723
Variance = h^2*(sum of fd^2/N - (sum of fd/N)^2) = 10^2*(765/542 - (-15/542)^2) = 141.0673
FORMULA- =10^2*(I14/E14-(1/E14*H14)^2)
St. Deviation = square root of Variance = 11.8772
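The step-deviation method used in the table can be sketched in a few lines of Python. This is an illustrative check, assuming A = 55 as the assumed mean and h = 10 as the class width (both read from the table). Note that the sum of f*d^2 is divided by N as it stands and is not squared.

```python
import math

# Class intervals and frequencies, as given in the question.
classes = [(20, 30), (30, 40), (40, 50), (50, 60), (60, 70), (70, 80), (80, 90)]
freqs = [3, 61, 132, 153, 140, 51, 2]

A = 55   # assumed mean (mid-value of the 50-60 class)
h = 10   # class width

mids = [(lo + hi) / 2 for lo, hi in classes]
d = [(x - A) / h for x in mids]                 # step deviations

N = sum(freqs)
sum_fd = sum(f * di for f, di in zip(freqs, d))
sum_fd2 = sum(f * di ** 2 for f, di in zip(freqs, d))

mean = A + h * sum_fd / N
variance = h ** 2 * (sum_fd2 / N - (sum_fd / N) ** 2)
sd = math.sqrt(variance)

print(f"N = {N}, sum fd = {sum_fd}, sum fd^2 = {sum_fd2}")
print(f"mean = {mean:.6f}, variance = {variance:.4f}, sd = {sd:.4f}")
```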
CHI-SQUARE TEST
QUESTION- The demand for a particular spare part in a factory was found to vary from day to
day. In a sample study the following information was obtained:
Days Mon. Tue. Wed. Thu. Fri. Sat.
No. of parts demanded 1124 1125 1110 1120 1126 1115
Test the hypothesis that the number of parts demanded does not depend on the day of the week.
Days           Parts demanded (fo)   Expected frequency (fe)   (fo - fe)   (fo - fe)^2   (fo - fe)^2/fe
Mon            1124                  1120                      4           16            0.014286
Tue            1125                  1120                      5           25            0.022321
Wed            1110                  1120                      -10         100           0.089286
Thu            1120                  1120                      0           0             0
Fri            1126                  1120                      6           36            0.032143
Sat            1115                  1120                      -5          25            0.022321
Sum            6720                                                                      0.180357
Mean/average   1120
N = 6
α = 5%
v (d.f) = 5
Calculated value of the chi-square test = 0.180357
Ho: the number of spare parts demanded does not depend on the day of the week
H1: the number of spare parts demanded depends on the day of the week
Tabulated value 11.07     FORMULA- =CHISQ.INV.RT(0.05,5)
Accept Ho                 FORMULA- =IF(0.180357<11.07,"accept ho","reject ho")
Accept Ho: the number of spare parts demanded does not depend on the day of the week.
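The goodness-of-fit calculation above can be sketched as follows. This is a minimal illustration in standard-library Python; the function name `chi_square_uniform` is hypothetical, and the critical value 11.07 is simply the tabulated value quoted in the answer (5% level, 5 degrees of freedom) rather than something the code derives.

```python
# Observed demand per day, as given in the question.
observed = {"Mon": 1124, "Tue": 1125, "Wed": 1110,
            "Thu": 1120, "Fri": 1126, "Sat": 1115}

def chi_square_uniform(counts):
    """Chi-square statistic against equal expected frequencies."""
    expected = sum(counts) / len(counts)
    return sum((fo - expected) ** 2 / expected for fo in counts)

chi_sq = chi_square_uniform(list(observed.values()))
dof = len(observed) - 1

critical_value = 11.07  # tabulated chi-square, alpha = 0.05, 5 d.f. (from the text)
decision = "accept Ho" if chi_sq < critical_value else "reject Ho"

print(f"chi-square = {chi_sq:.6f}, d.f. = {dof}, decision: {decision}")
```

The same function applies to any test of uniformity over categories, since the expected frequency is just the total divided by the number of categories.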
QUESTION: - The Following table gives the number of aircraft accidents that occurred during
the seven days of the week. Find whether the accidents are uniformly distributed over the week.
Days               Mon   Tue   Wed   Thurs.   Fri   Sat   Total
No. of Accidents   14    18    12    11       15    14    84
Ans-
Ho: The accidents are uniformly distributed over the week
H1: The accidents are not uniformly distributed over the week
Days    Observed (fo)   Expected frequency (fe)   (fo - fe)   (fo - fe)^2   (fo - fe)^2/fe
Mon     14              14                        0           0             0
Tue     18              14                        4           16            1.14285714
Wed     12              14                        -2          4             0.28571429
Thu     11              14                        -3          9             0.64285714
Fri     15              14                        1           1             0.07142857
Sat     14              14                        0           0             0
Total   84                                                                  2.14285714
Mean 14
n = 6
α = 5%
v (d.f) = 5 = (6 - 1)
Calculated Value 2.142857143
Tabulated Value 11.07049769
p value 0.829045772
alpha 0.05
Calculated Value < Tabulated Value (and p value > alpha), hence Ho is accepted: the accidents are uniformly distributed over the week.
QUESTION- You are given two attributes for 200 patients as follows:
State whether the two attributes viz., type of response and Treatment are independent. Also
formulate the hypothesis for the same. (Assume 5% level of significance)
Ho: the two attributes, type of response and treatment, are independent of each other
H1: the two attributes, type of response and treatment, are dependent on each other
E(60) 52
E(70) 78
E(20) 28
E(50) 42
OBSERVED FREQUENCY (fo)   EXPECTED FREQUENCY (fe)   (fo - fe)   (fo - fe)^2   (fo - fe)^2/fe
60                        52                        8           64            1.230769
20                        28                        -8          64            2.285714
70                        78                        -8          64            0.820513
50                        42                        8           64            1.523810
Calculated value of chi-square test = 5.860805
Tabulated Value 3.841458821 (d.f. = 1)
IF (Calculated Value<Tabulated Value, "accept Ho", "reject Ho")
Here Calculated Value>Tabulated Value
Therefore, we will reject the Ho
Therefore, the two attributes, type of response and treatment, are not independent of each other
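The expected frequencies and the chi-square statistic for this test can be sketched as below. The 2 x 2 layout of the observed table is an assumption: the table itself is not reproduced in the text, so the arrangement [[60, 70], [20, 50]] is inferred from the expected values E(60) = 52, E(70) = 78, E(20) = 28 and E(50) = 42, and no row or column labels are assumed.

```python
# Assumed layout of the observed 2 x 2 table (inferred from the expected values).
observed = [[60, 70],
            [20, 50]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected frequency of a cell = (row total * column total) / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

chi_sq = sum(
    (fo - fe) ** 2 / fe
    for obs_row, exp_row in zip(observed, expected)
    for fo, fe in zip(obs_row, exp_row)
)
dof = (len(observed) - 1) * (len(observed[0]) - 1)

critical_value = 3.841458821  # tabulated value from the text (alpha = 0.05, 1 d.f.)
decision = "accept Ho" if chi_sq < critical_value else "reject Ho"

print(expected)
print(f"chi-square = {chi_sq:.6f}, d.f. = {dof}, decision: {decision}")
```

Rejecting Ho here means the data do not support independence of the two attributes.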
T-TEST
Question- A machinist is making engine parts with axle diameters of 0.700 inch. A random
sample of 10 parts shows a mean diameter of 0.742 inch with a standard deviation of
0.040 inch. Compute the statistic you would use to test whether the work is meeting the
specifications. Also state how you would proceed further.
Answer- Ho: μ = 0.700, i.e., the product is conforming to the specification
H1: μ is not equal to 0.700, i.e., the product is not conforming to the specification
μ = 0.700 inch
X bar = 0.742 inch
S = 0.040 inch
n = 10
Test Statistic: t = (X bar - μ)/(S/√n)
t = (0.742 - 0.700)/(0.040/√10) = 3.320
t9 Calculated 3.320
t9 Tabulated (two-tailed, 5% level) 2.262
IF (Calculated Value<Tabulated Value, "accept Ho", "reject Ho")
Here Calculated Value>Tabulated Value
Therefore, we will reject the Ho: the work is not meeting the specification
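The test statistic can be checked with a short sketch. It uses the figures given in the question statement (sample mean 0.742, standard deviation 0.040, n = 10) and the formula t = (x bar - μ)/(s/√n). The critical value 2.262 is the standard two-tailed 5% table value for 9 degrees of freedom; it is hard-coded here as an assumption of this sketch, and the helper name `one_sample_t` is illustrative.

```python
import math

def one_sample_t(sample_mean, mu0, s, n):
    """t statistic for a one-sample test of the mean."""
    return (sample_mean - mu0) / (s / math.sqrt(n))

# Values from the question statement.
t_calc = one_sample_t(sample_mean=0.742, mu0=0.700, s=0.040, n=10)
dof = 10 - 1

t_critical = 2.262  # two-tailed, alpha = 0.05, 9 d.f. (standard table value)
decision = "reject Ho" if abs(t_calc) > t_critical else "accept Ho"

print(f"t = {t_calc:.4f}, d.f. = {dof}, decision: {decision}")
```

The same helper applies to the heart-rate question with `one_sample_t(69, 72, 6.5, 25)`, compared against a one-tailed critical value because the alternative there is directional.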
QUESTION- Average heart rate for Americans is 72 beats/minute. A group of 25 individuals
participated in an aerobics fitness program to lower their heart rate. After six months the
group was evaluated to identify if the program had significantly slowed their heart rate. The mean
heart rate for the group was 69 beats/minute with a standard deviation of 6.5. Was the
aerobics program effective in lowering heart rate? Test at 5% level of significance.
Ho: μ = 72, i.e., the aerobic program did not lower the heart rate
H1: μ < 72, i.e., the aerobic program lowered the heart rate
Degrees of freedom = n - 1 = 24
Calculated t = (69 - 72)/(6.5/√25) = -2.307692308, so |Calculated t| = 2.307692308
Tabulated t (one-tailed, 5% level, 24 d.f.) = 1.711
IF (Calculated Value<Tabulated Value, "accept Ho", "reject Ho")
Here Calculated Value>Tabulated Value
Therefore, we will reject the Ho
It implies that the aerobic program was effective in lowering the heart rate
Z-TEST
QUESTION- Twenty people were attacked by a disease and only 18 survived. Will you reject
the hypothesis that the survival rate, if attacked by this disease, is 85% in favour of the hypothesis
that it is more, at 5% level. (Use Large Sample Test, the significant value of z at 5% level of
significance is 1.645).
ANS-
n = 20
x = 18
Sample proportion p = x/n = 0.9
Population proportion P = 85% = 0.85
Q = 1 - P = 0.15
Null Hypothesis Ho: P = 0.85
Alternate Hypothesis H1: P > 0.85
The test statistic used is z = (p - P)/SQRT(PQ/n)
z = 0.626224
Tabulated value at 5% level of significance = 1.645
FORMULA- =IF(0.626224<1.645,"Accept Ho","Reject Ho")
Accept Ho: the data do not show that the survival rate is more than 85%
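The large-sample test for a proportion used above can be sketched as follows, using the values from the question and the critical value 1.645 that the question itself supplies. Variable names mirror the symbols in the answer.

```python
import math

n = 20          # number of people attacked
x = 18          # number who survived
P = 0.85        # hypothesised survival rate
Q = 1 - P

p = x / n                               # sample proportion
z = (p - P) / math.sqrt(P * Q / n)      # test statistic

z_critical = 1.645   # one-tailed value at the 5% level, given in the question
decision = "Reject Ho" if z > z_critical else "Accept Ho"

print(f"p = {p}, z = {z:.6f}, decision: {decision}")
```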
ANOVA
QUESTION- The following table shows the lives (in hours) of four batches of electric lamps:
Batches Life of bulbs in hours
1. 1600 1610 1650 1680 1700 1720 1800 -
2. 1580 1640 1640 1700 1750 - - -
3. 1460 1550 1600 1620 1640 1660 1740 1820
4. 1510 1520 1530 1570 1600 1680 - -
Perform an analysis of variance of these data and show that a significance test does not reject
their homogeneity.
ANS-
B1     B2     B3     B4
1600   1580   1460   1510
1610   1640   1550   1520
1650   1640   1600   1530
1680   1700   1620   1570
1700   1750   1640   1600
1720   -      1660   1680
1800   -      1740   -
-      -      1820   -
ANOVA: SINGLE FACTOR
SUMMARY
GROUPS   Count   Sum     Average    Variance
B1       7       11760   1680       4766.667
B2       5       8310    1662       4220
B3       8       13090   1636.25    12169.64
B4       6       9410    1568.333   4136.667
ANOVA
SOURCE OF VARIATION   SS         df   MS         F          P-value    F crit
BETWEEN GROUPS        44360.71   3    14786.9    2.149389   0.122909   3.049125
WITHIN GROUPS         151350.8   22   6879.583
TOTAL                 195711.5   25
Ho = µ1 = µ2 = µ3 = µ4
H1 = At least two means are different
Here F crit > F, therefore Accept Ho: the test does not reject the homogeneity of the four batches
FORMULA- =IF(3.049125>2.149389,"Accept Ho","Reject Ho")
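The sums of squares and the F ratio in the ANOVA table can be reproduced by hand. The sketch below is a minimal standard-library implementation of single-factor ANOVA for groups of unequal size; the function name `one_way_anova` is illustrative, and the critical value is the F crit quoted in the table rather than computed.

```python
# Lamp lives (hours) for the four batches, as given in the question.
batches = {
    "B1": [1600, 1610, 1650, 1680, 1700, 1720, 1800],
    "B2": [1580, 1640, 1640, 1700, 1750],
    "B3": [1460, 1550, 1600, 1620, 1640, 1660, 1740, 1820],
    "B4": [1510, 1520, 1530, 1570, 1600, 1680],
}

def one_way_anova(groups):
    """Return (SS between, SS within, df between, df within, F)."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, df_between, df_within, f_stat

ss_between, ss_within, df_between, df_within, f_stat = one_way_anova(
    list(batches.values())
)

f_crit = 3.049125  # F crit from the Excel output (alpha = 0.05, df = 3 and 22)
decision = "Accept Ho" if f_crit > f_stat else "Reject Ho"

print(f"SSB = {ss_between:.2f}, SSW = {ss_within:.1f}, F = {f_stat:.6f}, {decision}")
```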
QUESTION- A test was given to five students taken at random from the fifth class of three schools
of a town.
The individual scores are:
School I School II School III
9 7 6
7 4 5
6 5 6
5 4 7
8 5 6
ANSWER-
Ho = µ1 = µ2 = µ3
H1 = at least two means are different
Anova: Single Factor
SUMMARY
Groups Count Sum Average Variance
School 1 5 35 7 2.5
School 2 5 25 5 1.5
School 3 5 30 6 0.5
ANOVA
Source of Variation SS df MS F P-value F crit
Between Groups 10 2 5 3.333333 0.070581 3.885294
Within Groups 18 12 1.5
Total 28 14
=IF(F crit>F, "Accept Ho", "Reject Ho")
Here F crit > F, therefore Accept Ho
QUESTION- Three processes A, B and C are tested to see whether their outputs are equivalent. The
following observations of output are made:
Process A   Process B   Process C
10          9           11
12          11          10
13          10          15
11          12          14
10          13          12
14          -           13
15          -           -
13          -           -
Carry out the analysis of variance and state your conclusions
Ans-
Ho = µA = µB = µC
H1 = at least two means of process are different
Anova: Single Factor
SUMMARY
Groups Count Sum Average Variance
Process A 8 98 12.25 3.357143
Process B 5 55 11 2.5
Process C 6 75 12.5 3.5
ANOVA
Source of Variation SS df MS F P-value F crit
Between Groups 7 2 3.5 1.098039 0.357386 3.633723
Within Groups 51 16 3.1875
Total 58 18
=IF(F crit>F, "Accept Ho", "Reject Ho")
Here F crit > F, therefore Accept Ho
Correlation and Regression
Regression coefficients (pulse rate Y on height X, for the question below):
            Coefficients    Standard Error   t Stat       P-value      Lower 95%      Upper 95%     Lower 95.0%    Upper 95.0%
Intercept   113.3228426     70.36766321      1.61043919   0.15133728   -53.0702403    279.7159256   -53.0702403    279.7159256
X           -0.396954315    1.015599207      -0.3908573   0.70752602   -2.79846483    2.0045562     -2.79846483    2.0045562
QUESTION: - The table below shows the height, X, in inches and the pulse rate, Y, per minute, for
9 people.
X 68 72 65 70 62 75 78 64 68
Y 90 85 88 100 105 98 70 65 72
(i) Find the correlation coefficient and interpret your results.
(ii) Find the linear regression equation for Y on X.
(iii) Use the above equation to predict the pulse rate per minute if height is 80 inches.
Ans i) Correlation coefficient
     X          Y
X    1
Y    -0.14614   1
So the correlation coefficient is -0.14614
INTERPRETATION
1. Adjusted R square has no relevance in this question
2. About 2% of the variation in Y is explained by X alone (R square is 0.02)
3. The negative correlation is weak: pulse rate tends to fall only slightly as height increases. By the regression slope, an increase of 1 inch in X is associated with a decrease of about 0.40 units in Y
ii) Linear regression for Y on X
SUMMARY OUTPUT
Regression Statistics
Multiple R          0.146144024
R Square            0.021358076
Adjusted R Square   -0.118447913
Standard Error      15.02568263
Observations        9
ANOVA
             df   SS            MS            F             Significance F
Regression   1    34.49091935   34.49091935   0.152769391   0.707526016
Residual     7    1580.39797    225.7711385
Total        8    1614.888889
Y = A + BX
Y = 113.3228426 - 0.396954315X
iii) Y = 113.3228426 - 0.396954315*80 = 81.56649746
FORMULA- =113.3228426-(0.396954315*80)
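The correlation coefficient, the regression line of Y on X and the prediction at 80 inches can be reproduced with the usual sums of squares and cross-products. This is a minimal standard-library sketch; the function name `simple_regression` is illustrative.

```python
import math

# Height (inches) and pulse rate (per minute) of the 9 people in the question.
heights = [68, 72, 65, 70, 62, 75, 78, 64, 68]
pulses = [90, 85, 88, 100, 105, 98, 70, 65, 72]

def simple_regression(x, y):
    """Return (r, intercept, slope) for the least-squares line of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    r = sxy / math.sqrt(sxx * syy)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return r, intercept, slope

r, intercept, slope = simple_regression(heights, pulses)
predicted = intercept + slope * 80   # pulse rate predicted for a height of 80 inches

print(f"r = {r:.6f}, Y = {intercept:.4f} + ({slope:.6f})X, Y(80) = {predicted:.4f}")
```

The sign of r always matches the sign of the slope. "Multiple R" in the Excel summary is the absolute value of r, which is why it is printed without a sign.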
QUESTION- Find Correlation coefficient and regression equation
Advertising Expense   Sales (in Lakh)
39 47
65 53
62 58
90 86
82 62
75 68
25 60
98 91
36 51
78 84
ANS-
                      Advertising Expense   Sales (in Lakh)
Advertising Expense   1
Sales (in Lakh)       0.780410054           1
Correlation Coefficient = 0.780410054 (+ve correlation)
SUMMARY OUTPUT
Regression Statistics
Multiple R          0.780410054
R Square            0.609039852
Adjusted R Square   0.560169834
Standard Error      10.42530197
Observations        10
ANOVA
             df   SS         MS         F          Significance F
Regression   1    1354.505   1354.505   12.46244   0.00773
Residual     8    869.4954   108.6869
Total        9    2224
                      Coefficients   Standard Error   t Stat     P-value    Lower 95%   Upper 95%   Lower 95.0%
Intercept             33.43979252    9.794777         3.414043   0.009168   10.853      56.02659    10.853
Advertising Expense   0.500926269    0.141897         3.530219   0.00773    0.173712    0.828141    0.173712
Y = 33.44 + 0.50X
QUESTION- Find Correlation coefficient and regression equation
Age Weights
1 52.5
2 58.7
3 65
4 70.2
5 75.4
6 81.1
7 87.2
8 95.5
9 102.2
10 108.4
ANS-
          Age           Weights
Age       1
Weights   0.998341258   1
Correlation Coefficient: 0.998341258 i.e., +ve
SUMMARY OUTPUT
Regression Statistics
Multiple R          0.998341258
R Square            0.996685268
Adjusted R Square   0.996270927
Standard Error      1.141244669
Observations        10
ANOVA
             df   SS            MS            F            Significance F
Regression   1    3132.976485   3132.976485   2405.46815   3.30543E-11
Residual     8    10.41951515   1.302439394
Total        9    3143.396
            Coefficients   Standard Error   t Stat        P-value      Lower 95%     Upper 95%    Lower 95.0%   Upper 95.0%
Intercept   45.7266667     0.779618529      58.6526166    7.9303E-12   43.92886312   47.5244702   43.9288631    47.52447022
Age         6.16242424     0.125646903      49.04557222   3.3054E-11   5.872681965   6.45216652   5.87268197    6.45216652
Y = 45.73 + 6.16X
QUESTION- Multiple Regression
A B C
30 15 23
40 22 24
35 41 25
45 44 43
25 26 32
19 21 15
29 32 26
23 23 26
31 34 34
32 33 34
ANSWER-
SUMMARY OUTPUT
Regression Statistics
Multiple R 0.639909454
R Square 0.409484109
Adjusted R Square 0.240765283
Standard Error 6.767475551
Observations 10
ANOVA
Significance
df SS MS F F
Regression 2 222.3089227 111.1545 2.42702 0.158237919
Residual 7 320.5910773 45.79873
Total 9 542.9
            Coefficients   Standard Error   t Stat    P-value   Lower 95%   Upper 95%   Lower 95.0%   Upper 95.0%
Intercept   12.35341       8.73291          1.41458   0.20010   -8.29664    33.00346    -8.29664      33.00346
B           0.25152        0.33177          0.75810   0.47314   -0.53301    1.03604     -0.53301      1.03604
C           0.39814        0.39707          1.00268   0.34941   -0.54079    1.33706     -0.54079      1.33706
A = 12.35341 + 0.25152B + 0.39814C
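The coefficients in the output above can be reproduced by solving the least-squares normal equations for two predictors. The sketch below assumes that A is the dependent variable and that B and C are the predictors, which is how the coefficients table is laid out. The function name `fit_two_predictors` is illustrative, and the 2 x 2 system is solved in closed form on mean-centred data.

```python
# Data from the question: A is treated as the response, B and C as predictors.
A = [30, 40, 35, 45, 25, 19, 29, 23, 31, 32]
B = [15, 22, 41, 44, 26, 21, 32, 23, 34, 33]
C = [23, 24, 25, 43, 32, 15, 26, 26, 34, 34]

def fit_two_predictors(y, x1, x2):
    """Least-squares fit of y = b0 + b1*x1 + b2*x2 via centred normal equations."""
    n = len(y)
    my, m1, m2 = sum(y) / n, sum(x1) / n, sum(x2) / n
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (v - my) for a, v in zip(x1, y))
    s2y = sum((b - m2) * (v - my) for b, v in zip(x2, y))
    det = s11 * s22 - s12 ** 2          # determinant of the 2 x 2 system
    b1 = (s1y * s22 - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    b0 = my - b1 * m1 - b2 * m2
    return b0, b1, b2

b0, b1, b2 = fit_two_predictors(A, B, C)

# R Square = 1 - (residual SS / total SS).
mean_a = sum(A) / len(A)
ss_total = sum((a - mean_a) ** 2 for a in A)
ss_resid = sum((a - (b0 + b1 * b + b2 * c)) ** 2 for a, b, c in zip(A, B, C))
r_square = 1 - ss_resid / ss_total

print(f"A = {b0:.5f} + {b1:.5f}*B + {b2:.5f}*C, R Square = {r_square:.6f}")
```

With a Significance F of 0.158 in the ANOVA table, which is above 0.05, the fitted equation is descriptive of this sample; the regression as a whole is not significant at the 5% level.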