Final AB 19-21 PIM3 Basics of Business Statistics

The document provides an overview of business statistics, covering its definition, history, and applications across various fields such as accounting, economics, and marketing. It explains key concepts including descriptive and inferential statistics, levels of measurement, and types of data, as well as methods for presenting data through tables and graphs. Additionally, it discusses the importance of statistical literacy and the potential errors that can arise from misusing statistical methods.

Uploaded by

chiron

Contents

Basics of Business Statistics

Chapter Title

1. Business Statistics

2. Presentation of Data

3. Measures of Central Tendency and Dispersion

4. Probability

Chapter 1

Business Statistics

Introduction

Statistics is a mathematical science pertaining to the collection, presentation, analysis and
interpretation or explanation of data. It also provides tools for prediction and forecasting based on
data. It is applicable to a wide variety of academic disciplines, from the natural and social
sciences to the humanities, government and business.

Statistical methods can be used to summarize or describe a collection of data; this is called
descriptive statistics. In addition, patterns in the data may be modeled in a way that accounts for
randomness and uncertainty in the observations, and are then used to draw inferences about the
process or population being studied; this is called inferential statistics. Descriptive, predictive, and
inferential statistics comprise applied statistics. Business statistics is the science of good decision
making in the face of uncertainty and is used in many disciplines such as financial analysis,
econometrics, auditing, production and operations including services improvement, and marketing
research.

History

The scope of the discipline of statistics broadened in the early 19th century to include the collection
and analysis of data in general. Today, statistics is widely employed in government, business, and the
natural and social sciences.

Because of its empirical roots and its applications, statistics is generally considered not to be a
subfield of pure mathematics, but rather a distinct branch of applied mathematics. Its
mathematical foundations were laid in the 17th century with the development of probability
theory by Pascal and Fermat. Probability theory arose from the study of games of chance. The
method of least squares is usually credited to Carl Friedrich Gauss, who dated his use of it to about 1794-1795, although Adrien-Marie Legendre published it first, in 1805. The use of
modern computers has expedited large-scale statistical computation, and has also made possible new
methods that are impractical to perform manually.

In applying statistics to a scientific, industrial, or societal problem, it is necessary to begin with a
process or population to be studied. A population is an aggregate of objects, animate or inanimate.
There might be a population of people in a country, of crystal grains in a rock, or of goods
manufactured by a particular factory during a given period. It may instead be a process observed at
various times; data collected about this kind of "population" constitute what is called a time series.

For practical reasons, rather than compiling data about an entire population, a chosen subset of the
population, called a sample, is studied. Data are collected about the sample in an observational or
experimental setting. The data are then subjected to statistical analysis, which serves two related
purposes: description and inference.

• Descriptive statistics can be used to summarize the data, either numerically or
graphically, to describe the sample. Basic examples of numerical descriptors include the
mean and standard deviation. Graphical summarizations include various kinds of charts
and graphs.

• Inferential statistics is used to model patterns in the data, accounting for randomness
and drawing inferences about the larger population. These inferences may take the form
of answers to yes/no questions (hypothesis testing), estimates of numerical characteristics
(estimation), descriptions of association (correlation), or modeling of relationships
(regression). Other modeling techniques include ANOVA, time series, and data mining.

If the sample is representative of the population, then inferences and conclusions made from the
sample can be extended to the population as a whole. A major problem lies in determining the
extent to which the chosen sample is representative. Statistics offers methods to estimate and
correct for randomness in the sample and in the data collection procedure, as well as methods for
designing robust experiments in the first place.

The fundamental mathematical concept employed in understanding such randomness is
probability. Mathematical statistics (also called statistical theory) is the branch of applied
mathematics that uses probability and analysis to examine the theoretical basis of statistics.

The use of any statistical method is valid only when the system or population under
consideration satisfies the basic mathematical assumptions of the method. Misuse of statistics can
produce subtle but serious errors in description and interpretation — subtle in the sense that even
experienced professionals sometimes make such errors, serious in the sense that they may affect,
for instance, social policy, medical practice and the reliability of structures such as bridges.
Even when statistics is correctly applied, the results can be difficult for the non-expert to interpret.
For example, the statistical significance of a trend in the data, which measures the extent to
which the trend could be caused by random variation in the sample, may not agree with one's
intuitive sense of its significance. The set of basic statistical skills (and skepticism) needed by people
to deal with information in their everyday lives is referred to as statistical literacy.

Applications of Business Statistics

Accounting
Public accounting firms use statistical sampling procedures when conducting audits for their clients.

Economics
Economists use statistical information in making forecasts about the future of the
economy or some aspect of it.

Marketing
Electronic point-of-sale scanners at retail checkout counters are used to collect data for a
variety of marketing research applications.

Production
A variety of statistical quality control charts are used to monitor the output of
a production process.

Finance
Financial advisors use price-earnings ratios and dividend yields to guide their investment
recommendations.

Levels of measurement

There are four types of measurements or levels of measurement or measurement scales used in
statistics: nominal, ordinal, interval, and ratio. They have different degrees of usefulness in
statistical research.

Nominal

When the data for a variable consists of labels or names used to identify an attribute of the
element, the scale of measurement is considered to be nominal.

Example:
Students of a university are classified by the school in which they are enrolled using a
nonnumeric label such as Business, Humanities, Education, and so on. Alternatively, a numeric code
could be used for the school variable (e.g. 1 denotes Business, 2 denotes Humanities, 3 denotes
Education, and so on).

Ordinal

The data have the properties of nominal data and the order or rank of the data is meaningful. A
nonnumeric label or numeric code may be used.

Example:
Students of a university are classified by their class standing using a nonnumeric label such as
freshman, sophomore, junior, or senior. Alternatively, a numeric code could be used for the class
standing variable (e.g. 1 denotes freshman, 2 denotes sophomore, and so on).

Interval

The data have the properties of ordinal data, and the interval between observations is expressed in
terms of a fixed unit of measure. Interval data are always numeric.

Example:
Melissa has an SAT score of 1205, while Kevin has an SAT score of 1090. Melissa scored 115
points more than Kevin.

Ratio

The data have all the properties of interval data and the ratio of two values is meaningful. Variables
such as distance, height, weight, and time use the ratio scale. This scale must contain a zero value that
indicates that nothing exists for the variable at the zero point.

Example:
Melissa’s college record shows 36 credit hours earned, while Kevin’s record shows 72 credit
hours earned. Kevin has twice as many credit hours earned as Melissa.

Qualitative and Quantitative Data

Data can be further classified as being qualitative or quantitative. Qualitative data include labels or
names used to identify an attribute of each element. Qualitative data use either the nominal or
ordinal scale of measurement and may be nonnumeric or numeric. Quantitative data are obtained
using either the interval or ratio scale of measurement. The statistical analysis appropriate for a
particular variable depends upon whether the variable is qualitative or quantitative.

If the variable is qualitative, the statistical analysis is rather limited. We can summarize the
qualitative data by counting the number of observations in each category or by computing the
proportion of the observations in each qualitative category. However, if the characteristic is
quantitative, arithmetic operations often provide meaningful results.

Discrete variable

A variable taking isolated values is called a discrete variable. The graphical representation of a
discrete variable is a step function. Examples of discrete variables are the number of people in a
group, the number of accidents occurring on a particular day, etc.

Continuous variable

A variable which can take any value within a given interval is referred to as a continuous variable.

Example:
Weight of a person, temperature on a given day, rainfall on a given day, etc. The graphical
presentation of a continuous variable is a curve.

Chapter 2

Presentation of Data

Types of Data

Primary data

Primary data are data collected for the first time by the investigator. They can be collected
using various methods, such as surveys (a census, for instance), telephone interviews, e-mail, etc. Such data are
generally referred to as raw data because they are unprocessed.

Secondary data

In some cases, data needed for a particular application already exist. Companies maintain a
variety of records or databases about their employees, and business operations. Data are also available
from a variety of industry associations and special interest organizations. The internet continues to
grow as an important source of information and statistical data. Almost all companies maintain Web
sites that provide general information about the company as well as data on sales, number of
employees, number of products etc. Government agencies are another important source of
secondary data. Information on vital events (birth, death etc.) is available with the governmental
agencies. Sometimes information can also be collected from published journals.

Presentation of Data

After collection of data, the next stage the statistician has to go through is presentation of data.
Usually, the volume of information collected is huge, so it becomes necessary to present it in a more
systematic and concise way in order to bring out the important features or characteristics of the data.

Basically, there are two ways to represent data.

1. Tabular
2. Graphical

Let us start the discussion with the introduction of frequency distribution and various
components of the frequency distribution.

Frequency Distribution

A frequency distribution is a tabular summary of data showing the number (frequency) of items in
each of several non-overlapping classes. When raw data are converted into a frequency
distribution, the resulting summary offers more insight than the original
data.

Three steps necessary to define classes for a frequency distribution with quantitative data are:

1. Determine the number of non-overlapping classes.
2. Determine the width of each class.
3. Determine the class limits.

Number of classes

There is no specific rule for choosing the number of classes. As a general guideline, between
5 and 20 classes are used.

Width of the class

Generally, we choose the same width for all classes. Width is denoted by h.

Class width = Upper class boundary – Lower class boundary
= Difference between two consecutive upper limits
= Difference between two consecutive lower limits

Mid-point of the class (class mark)

It is the midpoint of the class interval. It is denoted by x. It is obtained as

$$\text{Class mark} = \frac{\text{Upper class limit} + \text{Lower class limit}}{2} = \frac{\text{Upper class boundary} + \text{Lower class boundary}}{2}$$

Relative Frequency

$$\text{Relative frequency of a class} = \frac{\text{Frequency of the class}}{n}$$

where n is the total number of observations.
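The three steps for defining classes, together with the class mark and relative frequency formulas, can be sketched in Python. This is only an illustrative sketch: the function name `frequency_distribution`, its parameters, and the ten sample observations are assumptions made up for this example and do not come from the text. Each class is taken to include its lower limit and exclude its upper limit.

```python
def frequency_distribution(data, lower, width, num_classes):
    """Count observations in equal-width classes [lower, lower + width), and so on."""
    counts = [0] * num_classes
    for value in data:
        index = int((value - lower) // width)
        if not 0 <= index < num_classes:
            raise ValueError(f"{value} falls outside the chosen classes")
        counts[index] += 1

    n = len(data)
    table = []
    for i, frequency in enumerate(counts):
        low = lower + i * width
        high = low + width
        table.append({
            "class": (low, high),
            "mark": (low + high) / 2,        # class mark = midpoint of the class
            "frequency": frequency,
            "relative": frequency / n,       # relative frequency = frequency / n
        })
    return table


# Hypothetical observations, invented purely for illustration.
sample = [12, 25, 7, 31, 18, 44, 22, 9, 37, 28]
table = frequency_distribution(sample, lower=0, width=10, num_classes=5)
for row in table:
    print(row["class"], row["mark"], row["frequency"], row["relative"])
```

The relative frequencies always add up to 1, which is a quick check that every observation was assigned to exactly one class.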

Tabulation

While presenting the data one can make use of tabulation. It is the most concise way of presentation of
data. There can be one-way, two-way or multifold tables depending on number of columns and rows
we choose.

Diagrammatic representation

One of the graphical ways of representing data is diagrammatic representation. In this, one can use
bar graphs or pie-charts. A bar graph can be of the simple, multiple, sub-divided or percentage type.
This is a pictorial presentation of data in which points are not plotted according to scale. Such diagrams are
more attractive and colorful than the various graphs available in statistics.

[Figure: Simple bar diagram]

[Figure: Pie-chart]

Graphical Representation

A common graphical presentation of quantitative data is a histogram. It is a series of adjacent
rectangles erected on the X-axis. It is constructed by placing the variable of interest on the horizontal axis
and the frequency, relative frequency, or percent frequency on the vertical axis.

[Figure: Histogram]

Before we learn about the ogive curve, let us look at the cumulative frequency distribution.

Cumulative Frequency Distribution

Cumulative frequency is obtained by adding the frequency of a class interval to the frequencies of
all the preceding intervals up to that class interval. This is explained by the example below, in which
the frequency distribution table gives the marks obtained by 40 students.

Class (marks)    Frequency    Cumulative frequency

0-10             4            4
10-20            5            (4) + 5 = 9
20-30            12           (9) + 12 = 21
30-40            11           (21) + 11 = 32
40-50            8            (32) + 8 = 40

In the above table it can be observed that frequencies are added from top to bottom, and also that 4
students got marks 'less than 10', 9 students got marks 'less than 20', and so on.

Therefore, the above distribution is called a 'less than' cumulative frequency distribution. The above
table can be re-written as follows:

Marks           Cumulative frequency

Less than 10    4
Less than 20    9
Less than 30    21
Less than 40    32
Less than 50    40

In the same way, a 'more than' cumulative frequency distribution can be obtained by adding the
frequencies in the reverse order, from bottom to top. It is explained in the following table.

Class (marks)    Frequency    Cumulative frequency

0-10             4            (36) + 4 = 40
10-20            5            (31) + 5 = 36
20-30            12           (19) + 12 = 31
30-40            11           (8) + 11 = 19
40-50            8            8

The above table can be re-written as follows:

Marks           Cumulative frequency (c.f.)

More than 0     40
More than 10    36
More than 20    31
More than 30    19
More than 40    8
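Both cumulative tables can be reproduced with running sums. The sketch below uses the frequencies of the 40 students from the tables above; the variable names are illustrative choices, not notation from the text.

```python
from itertools import accumulate

# Frequencies for the classes 0-10, 10-20, 20-30, 30-40, 40-50.
frequencies = [4, 5, 12, 11, 8]

# 'Less than' cumulative frequencies: add from top to bottom.
less_than = list(accumulate(frequencies))

# 'More than' cumulative frequencies: add from bottom to top.
more_than = list(accumulate(reversed(frequencies)))[::-1]

print(less_than)   # [4, 9, 21, 32, 40]
print(more_than)   # [40, 36, 31, 19, 8]
```

The last 'less than' value and the first 'more than' value both equal the total number of observations, here 40.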

Ogive curve

It is a cumulative frequency curve. There are two types of ogive curve: the 'less than' ogive and
the 'more than' ogive. An ogive is drawn by taking data values on the horizontal axis and
cumulative frequencies on the vertical axis.

Example:
Draw a 'less than' ogive curve for the following data

Class (marks)    Frequency    Cumulative frequency

0-10             2            2
10-20            8            10
20-30            12           22
30-40            18           40
40-50            28           68
50-60            22           90
60-70            6            96
70-80            4            100

To plot an ogive:

(i) Plot the points whose abscissae are the actual upper class limits and whose ordinates are the
cumulative frequencies. Here the points are (10, 2), (20, 10), (30, 22), (40, 40), (50, 68), (60, 90),
(70, 96) and (80, 100).

(ii) Join the plotted points by a smooth curve.

(iii) Connect the ogive to the point on the X-axis representing the actual lower limit of the first
class, here (0, 0).

Scale: X-axis 1 cm = 10 marks, Y-axis 1 cm = 10 c.f.

Example
Using the data given below, construct a 'more than' cumulative frequency table and draw the ogive.

Marks        1-10   11-20   21-30   31-40   41-50   51-60   61-70   71-80
Frequency    3      8       12      14      10      6       5       2

Marks           Frequency    Cumulative frequency (more than)

More than 1     3            60
More than 11    8            57
More than 21    12           49
More than 31    14           37
More than 41    10           23
More than 51    6            13
More than 61    5            7
More than 71    2            2

To plot an ogive:

(i) Plot the points whose abscissae are the actual lower class limits and whose ordinates are the
cumulative frequencies. Here the points are
(70.5, 2), (60.5, 7), (50.5, 13), (40.5, 23), (30.5, 37), (20.5, 49), (10.5, 57) and (0.5, 60).

(ii) Connect the ogive to the point on the X-axis representing the actual upper limit of the last
class, in this case the point (80.5, 0).

(iii) Join the points by a smooth curve.

Scale: X-axis 1 cm = 10 marks, Y-axis 2 cm = 10 c.f.
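The coordinates used for the 'more than' ogive can be derived from the class limits and frequencies. The following is a minimal sketch: the function name `more_than_ogive_points` is hypothetical, and it assumes that the actual (boundary) limits lie half the gap between consecutive classes away from the stated limits, which is how 0.5, 10.5, ... arise from 1, 11, ...

```python
from itertools import accumulate

# Stated class limits and frequencies from the example above.
classes = [(1, 10), (11, 20), (21, 30), (31, 40),
           (41, 50), (51, 60), (61, 70), (71, 80)]
frequencies = [3, 8, 12, 14, 10, 6, 5, 2]


def more_than_ogive_points(classes, frequencies):
    """Return (actual lower limit, 'more than' cumulative frequency) pairs."""
    gap = classes[1][0] - classes[0][1]      # gap between consecutive classes (1 here)
    half = gap / 2
    more_than = list(accumulate(reversed(frequencies)))[::-1]
    points = [(lower - half, cf) for (lower, _), cf in zip(classes, more_than)]
    # Close the curve on the X-axis at the actual upper limit of the last class.
    points.append((classes[-1][1] + half, 0))
    return points


points = more_than_ogive_points(classes, frequencies)
print(points)
```

Plotting these points and joining them gives the 'more than' ogive; the plotting step itself is left out so that the sketch needs only the standard library.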

Frequency Polygon

In a frequency distribution, the mid-value of each class is obtained. Then, on graph paper, the
frequency is plotted against the corresponding mid-value. These points are joined by straight
lines. These straight lines may be extended in both directions to meet the X-axis to form a
polygon.

Example
The weights of 50 students are recorded below. Draw a frequency polygon for this data.

Class      Mid-mark    Frequency

40 – 44    42          3
45 – 49    47          10
50 – 54    52          12
55 – 59    57          15
60 – 64    62          7
65 – 69    67          5

Answer

[Figure: frequency polygon for the data above]

If the points of the above graph are joined by a smooth curve instead of straight lines, the result is known as a frequency curve.

Exercise
The raw data displayed below are the electric and gas utility charges during the month of July
1990 for a random sample of 50 one-bedroom apartments in Mumbai:
96 171 202 178 147 102 153 197 127 82
157 185 90 116 172 111 148 213 130 165
141 149 206 175 123 128 144 168 109 167
95 163 150 154 130 143 187 166 139 149
108 119 183 151 114 135 191 137 129 158

a. Form a frequency distribution having 7 class intervals with the following class
boundaries: Rs.80 but less than Rs.100, Rs.100 but less than Rs.120, and so on.
b. Form the percentage distribution from the frequency distribution developed in a.
c. From the percentage distribution developed in b:
i. Plot the percentage histogram.
ii. Plot the percentage polygon.
d. From the frequency distribution developed in a:
i. Approximate the mean, mode, range, midrange, standard deviation and coefficient
of variation.
ii. Based on Chebyshev's rule, between what two values would we estimate that
at least 75% of the data are contained?
iii. What percentage of the data is actually contained within ±2 S.D. of the mean?
iv. Compare the above results with those in part ii.
e. From the frequency distribution developed in a:
i. Form the cumulative frequency distribution.
ii. Form the cumulative percentage distribution.
iii. Plot the ogive.
iv. Approximate the median, Q1, Q3, the mid-hinge and the interquartile range.

Chapter 3

Measures of Central Tendency and Dispersion


Types of Data Presentation

Generally, data can be arranged in one of the following three ways.

Series of individual observations

$x_1, x_2, \ldots, x_n$

Ungrouped Frequency Distribution

$(x_i, f_i)$; $i = 1, 2, \ldots, n$
$x_i$: ith observation in the series
$f_i$: frequency of the ith observation in the series

Grouped Frequency Distribution

$(x_i, f_i)$; $i = 1, 2, \ldots, k$
$x_i$: midpoint of the ith class
$f_i$: frequency of the ith class

Describing and Summarizing Data


Three major properties which describe a batch of numerical data are
• Central Tendency
• Dispersion
• Shape

Summary measures computed from a sample of data are called ‘Statistics’.

Descriptive summary measures computed from an entire population are called ‘Parameters’.

Measure of central tendency/location

Most batches of data show a distinct tendency to group or cluster about a certain ‘central’ value.
Hence, generally it becomes possible to select some typical value called average, to describe the
entire batch. Such a typical value is a measure of central tendency or location.

Different measures of central tendency are:

• Arithmetic Mean
• Median
• Mode
• Midrange
• Mid-hinge

Arithmetic Mean

It is obtained by adding the raw scores and dividing the sum by the number of items.

Properties
• Based on each and every observation in the series.
• Capable of further mathematical treatment.
• Gives distorted representation of data under study if data consists of outliers, i.e. it is
greatly affected by extreme observations.

To find the mean of raw data

Suppose the raw scores are $X_1, X_2, X_3, \ldots, X_N$. Then the mean is

$$M = \frac{X_1 + X_2 + X_3 + \cdots + X_N}{N} = \frac{\sum X}{N}$$

where,
M = mean
X = each score or item
N = number of items
∑ = sigma, which means 'summation of'

Example: Find the mean of 6, 10, 4, 12, 8.

$$M = \frac{6 + 10 + 4 + 12 + 8}{5} = \frac{40}{5} = 8$$
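The same calculation can be checked in a few lines of Python. This is a minimal sketch; the helper name `arithmetic_mean` is chosen for illustration, and the standard library function `statistics.mean` is used only as a cross-check.

```python
from statistics import mean


def arithmetic_mean(scores):
    """Sum of the raw scores divided by the number of items."""
    return sum(scores) / len(scores)


scores = [6, 10, 4, 12, 8]
m = arithmetic_mean(scores)
print(m)              # 8.0
print(mean(scores))   # same value from the standard library
```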

To find mean for grouped data

$$M = \frac{\sum fx}{\sum f}$$

where x is the mid-value of the class interval,
M is the mean, and
f is the frequency.

Example: Find the mean for the following table by the 'Direct Method'

x    10    15    20    25    30    Total
f    3     6     5     4     2     20

x        f     fx
10       3     30
15       6     90
20       5     100
25       4     100
30       2     60
Total    20    380

$$M = \frac{\sum fx}{\sum f} = \frac{380}{20} = 19.0$$

Example: Calculate the mean marks in the distribution given below:

Marks    0 – 9    10 – 19    20 – 29    30 – 39    40 – 49    50 – 59    Total
f        4        6          12         6          7          5          40

Marks      f     x       fx
0 – 9      4     4.5     18.0
10 – 19    6     14.5    87.0
20 – 29    12    24.5    294.0
30 – 39    6     34.5    207.0
40 – 49    7     44.5    311.5
50 – 59    5     54.5    272.5
Total      40            1190.0

$$M = \frac{\sum fx}{\sum f} = \frac{1190}{40} = 29.75$$
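Both worked examples follow the same direct method, which can be sketched as one small function. The name `grouped_mean` is an illustrative choice; the data are those of the two examples above, with class midpoints computed from the stated class limits.

```python
def grouped_mean(values, frequencies):
    """Direct method: M = sum(f * x) / sum(f)."""
    total_fx = sum(f * x for x, f in zip(values, frequencies))
    return total_fx / sum(frequencies)


# First example: ungrouped frequency distribution.
mean_direct = grouped_mean([10, 15, 20, 25, 30], [3, 6, 5, 4, 2])

# Second example: grouped data, x is the midpoint of each class.
classes = [(0, 9), (10, 19), (20, 29), (30, 39), (40, 49), (50, 59)]
midpoints = [(low + high) / 2 for low, high in classes]
mean_marks = grouped_mean(midpoints, [4, 6, 12, 6, 7, 5])

print(mean_direct)   # 19.0
print(mean_marks)    # 29.75
```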

Median

Median is defined as the middle value in an ordered sequence of data.

It is not affected by the magnitude of extreme observations, but it does depend on the number of observations.

Example: Find the median of 83, 37, 70, 29, 45, 63, 41, 70, 30, 54

Data in ascending order: 29, 30, 37, 41, 45, 54, 63, 70, 70, 83
n = 10 (even), so the median is the average of the two middle-most scores, the 5th and the 6th.

$$\text{Median} = \frac{45 + 54}{2} = \frac{99}{2} = 49.5$$

Example: Find the median of 15, 8, 14, 20, 13, 12, and 16.

Series in order is 8, 12, 13, 14, 15, 16, 20.
n = 7 (odd)

$$\text{Median} = \left(\frac{n+1}{2}\right)\text{th score} = 4\text{th score} = 14$$
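The odd and even cases from the two examples can be handled by one short function. This is a sketch with an illustrative function name, not a prescribed procedure from the text.

```python
def median(values):
    """Middle value of the ordered data; average of the two middle values if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                        # the ((n + 1) / 2)th score
    return (ordered[mid - 1] + ordered[mid]) / 2   # average of the two middle scores


median_even = median([83, 37, 70, 29, 45, 63, 41, 70, 30, 54])
median_odd = median([15, 8, 14, 20, 13, 12, 16])
print(median_even)   # 49.5
print(median_odd)    # 14
```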

Mode
Mode is defined as the value in a batch of data which occurs most frequently. It is not affected
by extreme observations.

It is used only for descriptive purposes because it is more variable from sample to
sample than other measures of central tendency.

Example: Find the mode of 43, 42, 44, 40, 48, 45, 40, 40.
The given series in order is 40, 40, 40, 42, 43, 44, 45, 48.
Since 40 is the most repeated score,
Mode = 40

Midrange

It is defined as the average of the two extremes of the data.


Let $x_{\max}$ and $x_{\min}$ be the two extremes of the data. Then the midrange is defined as

$$\text{Midrange} = \frac{x_{\max} + x_{\min}}{2}$$

The main drawback of this is that it becomes distorted as a summary measure of central tendency if an
outlier is present.
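The contrast between the mode and the midrange in the presence of an outlier can be sketched as follows. The scores are those of the mode example above; the extra value 400 is a hypothetical outlier added only for illustration, and the function names are illustrative as well.

```python
from collections import Counter


def mode(values):
    """Most frequently occurring value (first one found if there is a tie)."""
    return Counter(values).most_common(1)[0][0]


def midrange(values):
    """Average of the largest and smallest observations."""
    return (max(values) + min(values)) / 2


scores = [43, 42, 44, 40, 48, 45, 40, 40]
print(mode(scores), midrange(scores))               # 40 44.0

# Hypothetical outlier, not part of the original data.
with_outlier = scores + [400]
print(mode(with_outlier), midrange(with_outlier))   # 40 220.0
```

The mode is unchanged by the outlier, while the midrange moves from 44 to 220, which is the distortion described above.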

Measures of Dispersion

A measure of location alone cannot reveal all the characteristics of the data under study. For
example, two series may have the same measure of central tendency but different patterns of
variation, and comparing them by their averages alone would be misleading.
A measure which captures this variation is called a measure of
dispersion.

Following are measures of dispersion which are most frequently used.


• Range
• Variance
• Standard Deviation
• Coefficient of Variation

Range

It is a crude measure of dispersion. It measures the total spread in the batch of data.
It is given by

$$\text{Range} = x_{\max} - x_{\min}$$

It fails to take into account how the data are distributed between the smallest and the largest
values.

Variance

It is based on each and every observation in the series. It is defined as the mean of the squared deviations of
the observations about the mean.

Standard Deviation

It is the most commonly used measure of dispersion. It is defined as positive square root of the
variance.

Variance and standard deviation reflect how data are varying. They measure the average scatter
around the mean; that is, these measures evaluate how the values fluctuate about the mean.

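Standard deviation is calculated using the following formulae. For an individual series,

(a) $$\text{S.D.} = \sqrt{\frac{\sum (x - \bar{x})^2}{n}}$$

(b) $$\text{S.D.} = \sqrt{\frac{\sum u^2}{n} - \left(\frac{\sum u}{n}\right)^2} \times h, \quad \text{where } u = \frac{x - A}{h}$$

For a frequency distribution,

(a) $$\text{S.D.} = \sqrt{\frac{\sum f(x - \bar{x})^2}{\sum f}}$$

(b) $$\text{S.D.} = \sqrt{\frac{\sum fu^2}{\sum f} - \left(\frac{\sum fu}{\sum f}\right)^2} \times h, \quad \text{where } u = \frac{x - A}{h}$$

In the formulae marked (b), A is an arbitrarily chosen origin (assumed mean) and h is a common scaling factor, usually the class width.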

The square of the Standard deviation is known as Variance.
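Formulae (a) and (b) for a frequency distribution should give the same standard deviation whatever origin A and scale h are chosen. The sketch below checks this numerically on the x and f values from the 'Direct Method' mean example earlier in this chapter; the function names and the choices A = 20 and h = 5 are assumptions made for illustration.

```python
import math


def sd_direct(x, f):
    """Formula (a): square root of sum(f * (x - mean)^2) / sum(f)."""
    n = sum(f)
    mean = sum(fi * xi for xi, fi in zip(x, f)) / n
    return math.sqrt(sum(fi * (xi - mean) ** 2 for xi, fi in zip(x, f)) / n)


def sd_step_deviation(x, f, A, h):
    """Formula (b): work with u = (x - A) / h, then scale back by h."""
    n = sum(f)
    u = [(xi - A) / h for xi in x]
    mean_u = sum(fi * ui for ui, fi in zip(u, f)) / n
    mean_u2 = sum(fi * ui ** 2 for ui, fi in zip(u, f)) / n
    return math.sqrt(mean_u2 - mean_u ** 2) * h


x = [10, 15, 20, 25, 30]
f = [3, 6, 5, 4, 2]

sd_a = sd_direct(x, f)
sd_b = sd_step_deviation(x, f, A=20, h=5)   # A and h are arbitrary choices
print(round(sd_a, 4), round(sd_b, 4))
print(round(sd_a ** 2, 4))                   # the variance is the square of the S.D.
```

Formula (b) is mainly a convenience for hand calculation, because the u values are small integers when A is a class midpoint and h is the class width.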

Coefficient of Variation: It is a relative measure of dispersion. It is particularly used when comparing
the variability of two or more batches of data that are expressed in different units of measurement. C.V.
is also used in a situation where we want to compare two or more sets of data which are measured in
the same units but differ to such an extent that a direct comparison of the respective standard
deviations is not very useful.

$$\text{Coefficient of variation (C.V.)} = \frac{\text{S.D.}}{\text{A.M.}} \times 100\%$$

where S.D. is the standard deviation and A.M. is the arithmetic mean.

Example:

Calculate the standard deviation and the variance for the following data 7, 8, 11, 6, 13, 8, 10.

Answer

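$$\text{Mean} = \frac{7 + 8 + 11 + 6 + 13 + 8 + 10}{7} = \frac{63}{7} = 9$$

Values (x)    Deviation from mean 9 (x - M)    (x - M)^2

6             -3                               9
7             -2                               4
8             -1                               1
8             -1                               1
10            1                                1
11            2                                4
13            4                                16
N = 7                                          ∑(x - M)^2 = 36

$$\text{Variance } \sigma^2 = \frac{\sum (x - M)^2}{N} = \frac{36}{7} = 5.14$$

$$\text{S.D. } \sigma = \sqrt{\frac{36}{7}} = 2.27$$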
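The worked example can be reproduced step by step in Python. This is a minimal sketch using the population formulas (division by N) as in the example; the coefficient of variation at the end is an extra step not worked in the text, computed as S.D. divided by the mean, times 100.

```python
import math

data = [7, 8, 11, 6, 13, 8, 10]

n = len(data)
mean = sum(data) / n                                  # 63 / 7 = 9
squared_deviations = [(x - mean) ** 2 for x in data]
variance = sum(squared_deviations) / n                # 36 / 7
sd = math.sqrt(variance)
cv = sd / mean * 100                                  # coefficient of variation in percent

print(mean)                 # 9.0
print(round(variance, 2))   # 5.14
print(round(sd, 2))         # 2.27
print(round(cv, 1))         # 25.2
```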

Shape

For a symmetric distribution,

mean = median = mode

For a right-skewed (positively skewed) distribution, the mean is affected by extremely large
observations. In this case,

mode < median < mean < midrange

For a left-skewed (negatively skewed) distribution,

midrange < mean < median < mode

Quartiles
These are the partition values. Quartile is a useful measure of non-central location. It is often
employed when one wants to summarize or describe the properties of large batches of quantitative
data.

There are three quartiles, Q1, Q2 and Q3.

Midhinge
The midhinge is the mean of the first and third quartiles in a batch of data. It is used to overcome
potential problems introduced by extreme values in the data. It is a measure of central tendency.

Interquartile Range
It is a measure of dispersion which measures the spread of the middle 50% of the observations.
Hence, it is not affected by extreme observations.

For a symmetric distribution,

median = midhinge = midrange = mean = mode

For a positively skewed distribution,

mode < median < midhinge < mean < midrange

For a negatively skewed distribution,

midrange < mean < midhinge < median < mode

The Five Number Summary


Median, midhinge and interquartile range are called 'resistant statistics' because they are
relatively insensitive to extreme values.

In order to get a better idea about the shape of the distribution, we use the five-number summary.
These five numbers are:
Xmin, Q1, Q2, Q3 and Xmax
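A five-number summary, together with the midhinge and interquartile range, can be sketched with the standard library. The text does not say which quartile convention it uses, so the `method="inclusive"` setting of `statistics.quantiles` (Python 3.8 or later) is an assumption; other conventions can give slightly different Q1 and Q3. The data are those of the first median example.

```python
from statistics import quantiles

data = sorted([83, 37, 70, 29, 45, 63, 41, 70, 30, 54])

# Quartiles under the 'inclusive' convention (an assumption, see above).
q1, q2, q3 = quantiles(data, n=4, method="inclusive")

five_number_summary = (data[0], q1, q2, q3, data[-1])
midhinge = (q1 + q3) / 2          # mean of the first and third quartiles
interquartile_range = q3 - q1     # spread of the middle 50% of the observations

print(five_number_summary)
print(midhinge, interquartile_range)
```

Q2 is the median, so it should agree with the value 49.5 found earlier for these data.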

Exercise

1. In a class of 50 students, 10 have failed and their average of marks is 2.5. The total marks secured
by the entire class were 281. Find the average marks of students who have passed.
2. What will be the mean and the median of 7 consecutive integers, the least of which is x?
3. Mean and median of 51 items are 100 and 95 respectively. At the time of calculations two
items 180 and 90 were wrongly taken as 100 and 10. What are the correct values of mean and
median?
4. The mean of a group of 10 observations is 15. Fifteen more observations are added to this group
and the mean of these 25 observations is found to be 12. Find the mean of the additional 15
observations.
5. The mean of a group of 20 items is 30. Find the mean if each value is doubled and
increased by 5.
6. Calculate population variance from the following information: n = 15,
∑x = 480, ∑x² = 15735
7. Means and variances of two series are given below:

            Mean    Variance
Series A    54      9
Series B    100     4

Which series is more stable?
8. Two samples of size 40 and 45 respectively have the same mean 53, but different standard
deviations 19 and 8. Find the standard deviation of the combined group.
9. Find population variance of observations 1, 2, 3, 4, 5, 6, 7, 8, 9, 10. Compare its variance with
population variance of 11, 12, 13, 14, 15, 16, 17, 18, 19 and 20.
10. The mean and the standard deviation of population of 100 items were found to be 50 and 5
respectively. If at the time of calculations, two items were wrongly taken as 40 and 50 instead of
60 and 30, find the correct standard deviation.

Chapter 4

Probability

Counting Principles

Addition
If two different operations can be performed in ‘m’ and ‘n’ different ways respectively, and the two
cannot be performed together, then the number of ways in which either operation 1 or operation 2
can be performed is (m + n).

Multiplication
If two different operations can be performed in ‘m’ and ‘n’ different ways respectively, then the number of
ways in which both operation 1 and operation 2 can be performed is (m × n).
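The two counting principles can be illustrated by listing the possibilities explicitly. The sketch below uses hypothetical operations with m = 3 and n = 2 ways; the labels are made up for illustration.

```python
from itertools import product

# Hypothetical ways of performing each operation (m = 3, n = 2).
ways_operation_1 = ["a", "b", "c"]
ways_operation_2 = ["x", "y"]

# Addition principle: perform either operation 1 or operation 2.
either = ways_operation_1 + ways_operation_2

# Multiplication principle: perform operation 1 and then operation 2.
both = list(product(ways_operation_1, ways_operation_2))

print(len(either))   # 5 = m + n
print(len(both))     # 6 = m * n
```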

Permutations
A permutation is an arrangement of ‘n’ things in which the order of arrangement is
important. Broadly speaking, any problem on permutations can be classified into one of
two cases.

Case I

Arrangement of ‘n’ distinct things taken ‘r’ at a time is given by nPr.

Examples:
1) 2 and 3 are two digits and with these digits, the numbers 32 and 23 are formed. Although,
numbers viz., 32 and 23 consist of the digits 2 and 3, the order of digits is different. Each of the
above arrangements is called a 'permutation'. Thus, the number of arrangements or permutations
of two distinct digits 2 and 3 is 2.

2) The permutations of the three letters a, b, c taken two at a time are ab, ba, ac, ca, bc and cb.

The number of permutations of n dissimilar things taken r at a time without repetition is
denoted by nPr and is given by

$${}^{n}P_{r} = \frac{n!}{(n-r)!}$$

The number of permutations of ‘n’ different things taken ‘r’ at a time is the same as the number of
ways of filling n letters in r positions, arranged in a straight line.

Each position is accommodating only one letter. We may fill the first position with any one of the
n letters. Having filled the first position in any one of these n ways, we have (n-1) letters with which
to fill the next position.

Having filled the first two positions, we have (n-2) letters with which to fill the third position.
Proceeding in this way one can see that filling r positions is like performing r different operations
with n, (n-1), (n-2) ….. different ways respectively. And since, we have to fill all r positions; we
need to multiply the respective number of ways. Therefore, the total number of ways in which r
positions can be filled with n letters without repetition is n (n-1) (n-2) (n-3)… (n-r+1).

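Thus, the number of r-permutations of n different things, denoted by nPr = P(n, r), is given by

$${}^{n}P_{r} = n(n-1)(n-2)(n-3)\cdots(n-r+1) = \frac{n!}{(n-r)!}$$

If we put r = n in the above formula, then

$${}^{n}P_{n} = n(n-1)(n-2)\cdots 1 = n! = \frac{n!}{0!}$$

We may therefore understand that 0! = 1.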
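The product form of nPr can be checked against a brute-force listing. In this sketch the function name `n_p_r` is an illustrative choice; `itertools.permutations`, `math.perm` (Python 3.8 or later) and `math.factorial` are standard library functions used as cross-checks.

```python
import math
from itertools import permutations


def n_p_r(n, r):
    """n (n-1) (n-2) ... (n-r+1): fill r positions from n distinct things."""
    result = 1
    for k in range(n, n - r, -1):
        result *= k
    return result


# All arrangements of the letters a, b, c taken two at a time.
arrangements = ["".join(p) for p in permutations("abc", 2)]
print(arrangements)        # ['ab', 'ac', 'ba', 'bc', 'ca', 'cb']

print(n_p_r(3, 2))         # 6, matches len(arrangements)
print(n_p_r(4, 4))         # 24 = 4!, the case r = n
print(math.factorial(0))   # 1, consistent with 0! = 1
```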

Properties

Case II

Circular Permutations
When things are arranged in places along a line with a first and a last place, they form a linear
permutation. So far we have dealt only with linear permutations. When things are arranged in
places along a closed curve or a circle, in which any place may be regarded as the first or last place,
they form a circular permutation.

Thus, the number of permutations of 4 objects in a row = 4!, whereas the number of circular
permutations of 4 objects is (4-1)! = 3!

The permutation in a row or along a line has a beginning and an end, but there is nothing like
beginning or end or first and last in a circular permutation. In circular permutations, we consider one
of the objects as fixed and the remaining objects are arranged as in linear permutation.

Consider 4 objects O1, O2, O3, O4. When arranged in a row, O1O2O3O4, O2O3O4O1, O3O4O1O2 and
O4O1O2O3 are different permutations. When arranged in a circle, these 4 permutations are
considered as one and the same arrangement.

Theorem:
The number of circular permutations of n different objects is (n-1)!

Proof:
Each circular permutation corresponds to n linear permutations, depending on where we start.

Since there are exactly n! linear permutations, there are exactly n! / n circular permutations.
Hence, the number of circular permutations is (n-1)!

Example

Suppose there are n guests to be seated around a circular table. Then we fix the position of one
of the guests (which can be done in only one way) and arrange the remaining (n-1) guests in (n-1)
positions just as in the linear case. Thus, the total number of ways in which n guests can be arranged
in a circular manner is (n-1)!
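
The theorem can be checked by brute force for small n. The sketch below is illustrative only and its function name circular_arrangements is hypothetical. It treats two linear orders as the same circular arrangement when one is a rotation of the other, which is the convention used in this section (mirror images are still counted as different).

```python
from itertools import permutations
from math import factorial


def circular_arrangements(objects):
    """Return the distinct circular arrangements of the given distinct objects.

    Every linear order is rotated so that the first object leads; orders that
    are rotations of one another then collapse to the same tuple.
    """
    first = objects[0]
    seen = set()
    for order in permutations(objects):
        start = order.index(first)
        canonical = order[start:] + order[:start]  # rotate so `first` leads
        seen.add(canonical)
    return seen


objects = ("O1", "O2", "O3", "O4")
circular = circular_arrangements(objects)
print(len(circular), factorial(len(objects) - 1))  # 6 6
```

Rotating each order so that O1 comes first is the same idea as fixing the position of one guest and arranging the remaining (n-1).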

Combinations
A selection of some or all of a number of things, made without regard to order, is called a
combination. The number of ways of selecting r things out of n dissimilar things is denoted by
C(n,r) or nCr.

Example:

From a class of 32 students, 4 are to be chosen for a competition. In how many ways can this be done?

We are to select 4 students from 32.

This selection can be done in

32C4 = (32 × 31 × 30 × 29) / (4 × 3 × 2 × 1) = 35960 ways.

Note that there is a relationship between permutations and combinations. For a given set of ‘n’
dissimilar things, the number of permutations is never less than the corresponding number of
combinations, because each selection of r things can be arranged in r! ways.

C(n,r) r! = P(n,r), so C(n,r) = P(n,r) / r! = n! / [r! (n-r)!]

Properties
C(n, n-r) = n! / {(n-r)! [n-(n-r)]!} = n! / [(n-r)! r!] = C(n, r)

C(n,0) = C(n,n) = 1

C(n,0) = n! / (0! [n-0]!) = n! / n! = 1 and C(n,n) = n! / (n! [n-n]!) = n! / n! = 1
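
The worked example and the properties of C(n,r) can be verified with a few lines of Python. This is a sketch using the standard library functions math.comb and math.perm (Python 3.8 or later). The variable names are illustrative, and the three-letter listing at the end is an added small case, not taken from the text.

```python
from itertools import combinations
from math import comb, factorial, perm

# Selecting 4 students out of 32: order does not matter.
ways = comb(32, 4)
print(ways)  # 35960

# C(n, r) * r! = P(n, r): every selection can be arranged in r! orders.
n, r = 32, 4
assert comb(n, r) * factorial(r) == perm(n, r)

# Symmetry and boundary cases.
assert comb(n, n - r) == comb(n, r)
assert comb(n, 0) == comb(n, n) == 1

# Listing the selections explicitly for a small case: 2 letters out of a, b, c.
pairs = ["".join(c) for c in combinations("abc", 2)]
print(pairs)  # ['ab', 'ac', 'bc']
```

The three letters a, b, c give 3 selections of two letters but 6 arrangements, since each selection can be ordered in 2! ways.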

Difference between a Permutation and a Combination

• In a combination, only a selection is made. In a permutation, a selection is made and the
selected things are also arranged in a definite order.
• Order does not matter in a combination. In a permutation, order is essential.
• Usually (i.e., except in special cases or trivial cases), the number of permutations exceeds the
number of combinations.

Exercise

1. A gentleman has 6 friends to invite. In how many ways can he send invitation cards to them if
he has 3 servants to carry the cards?

2. How many numbers, each lying between 100 and 1000 can be formed with digits 2, 3, 4, 0, 8,
and 9 (if repetitions of digits are not allowed)?

3. How many three digit numbers divisible by 5 can be formed using any numerals from 0 to 9
without repetition?

4. There are 10 points in a plane, of which 3 are collinear. Find the number of triangles formed
by joining these points.

5. From 7 engineers and 4 doctors a committee of 5 members is to be formed. In how many ways
can this be done

i. To include exactly one doctor?


ii. To include at least one doctor?

6. There are 2 books each of 3 volumes and 2 books each of 2 volumes. In how many ways can these
be arranged on a shelf so that the volumes of the same book remain together?

7. A company has 11 computer engineers and 7 mechanical engineers. In how many ways can
they be seated in a row so that no 2 of the mechanical engineers may sit together?

8. A company has 11 computer engineers and 7 mechanical engineers. In how many ways can
they be seated in a row so that all the mechanical engineers do not sit together?

9. How many words can be formed using letters of the word ‘MATHEMATICS’ if
i. there is no restriction
ii. all the vowels are together
iii. vowels are together and consonants are together

10. A person has 12 friends and wants to invite 8 of them to a birthday party. Find in how many
ways this can be done if
i. 3 particular friends always attend the party
ii. 3 particular friends never attend the party

Probability
In our day-to-day life, we come across many uncertain events. We wake up in the morning and check
the weather report. The statement could be 'there is a 60% chance of rain today'. This statement
implies that rain is more likely than dry weather. We decide upon our breakfast from a
statement that "corn flakes might reduce cholesterol". What is the chance of getting a flat tyre on the
way to an important appointment? And so on.

How probable is an event? We generally infer this by repeated observation of such events and their
long-term patterns. Probability is the branch of mathematics devoted to the study of such events.
People have always been interested in games of chance and gambling. Games such as dice are known
to have existed since 3000 BC, but such games were not treated mathematically till the fifteenth
century. During this period, the calculation and theory of probability originated in Italy. Later, in the
seventeenth century, the French mathematicians Pascal and Fermat contributed to this field of study.
The foundation of modern probability theory is credited to the Russian mathematician Kolmogorov.
He proposed the axioms on which the present subject of probability is based.

Random Experiment and Sample Space


An experiment repeated under essentially homogeneous and similar conditions results in an outcome
which is either unique or one of several possible outcomes. When the result is unique, the
experiment is called a 'deterministic' experiment.

Example:
While measuring the inner radius of an open tube, using slide calipers, we get the same result by
performing repeatedly the same experiment. Many scientific and engineering experiments are
deterministic.

If the outcome is one of the several possible outcomes, then such an experiment is called a "random
experiment" or 'nondeterministic' experiment.

In other words, any experiment whose outcome cannot be predicted in advance, but is one of the set
of possible outcomes, is called a random experiment.

If we think of an experiment as being performed repeatedly, then each repetition is called a trial. We
observe an outcome for each trial.

Example:
An experiment consists of 'tossing a die and observing the number on the upper-most face'

In such cases, we talk of chance or probability, which numerically measures the degree of chance of
the occurrence of events.

Sample Space (S)


The set of all possible outcomes of a random experiment is called the sample space associated with
the random experiment.

Note:
Each element of S denotes a possible outcome. Each element of S is known as sample point.
Any trial results in an outcome and corresponds to one and only one element of the set S.
e.g.
1. In the experiment of tossing a coin,
S = {H, T}
2. In the experiment of tossing two coins simultaneously,
S = {HH, HT, TH, TT}
3. In the experiment of throwing a pair of dice,
S = {(1,1), (1,2), (1,3), (1,4), (1,5), (1,6), (2,1), (2,2),………. (6,1), (6,2), (6,3), (6,4), (6,5),
(6, 6)}
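
The three sample spaces above can be generated mechanically. The following Python sketch is illustrative: its variable names are assumptions, and it uses itertools.product to form every ordered combination of the individual outcomes.

```python
from itertools import product

coin = ["H", "T"]
die = [1, 2, 3, 4, 5, 6]

# 1. Tossing one coin.
one_coin = set(coin)

# 2. Tossing two coins: every ordered pair of coin faces, written as "HH", "HT", ...
two_coins = {"".join(outcome) for outcome in product(coin, repeat=2)}

# 3. Throwing a pair of dice: every ordered pair (first die, second die).
two_dice = set(product(die, repeat=2))

print(sorted(one_coin))   # ['H', 'T']
print(sorted(two_coins))  # ['HH', 'HT', 'TH', 'TT']
print(len(two_dice))      # 36
```

Each trial of the experiment corresponds to exactly one element of the generated set.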

Events
An event is the outcome or a combination of outcomes of an experiment. In other words, an event is a
subset of the sample space.

Consider a random experiment of rolling of a six faced die.


The sample space of this experiment is S= {1, 2, 3, 4, 5, 6}

Let A be the event that the number on the uppermost face is odd, then the corresponding set of
favorable outcomes is {1, 3, 5} i.e. A= {1, 3, 5}

Let B be the event that the number on the uppermost face is even. Then, B = {2, 4, 6}.

Let C be the event that the number on the uppermost face is above 7. Now, this set is certainly a
null set or an empty set because there is no favorable outcome. Thus, C= ф

Let D be the event that the number on the uppermost face is an integer between 1 and 6, both
inclusive. Then D = {1, 2, 3, 4, 5, 6} = S.

Let E be the event that the outcome is less than 2. Then, E = {1}

Types of Events
As we have different types of sets, we have different types of events.
We illustrate different types of events using above example.

Simple Event
If an event has one element of the sample space then it is called a simple or elementary event. In
the above example, E = {1} is a simple event

Compound Event
If an event has more than one sample point, the event is called a compound event. In
the above example, A = {1, 3, 5} is a compound event.

Null Event (ф)


As null set is a subset of S, it is also an event called the null event or impossible event. In the above
example, C is a null event.

Sure event
In the above experiment, the sample space S= {1, 2, 3, 4, 5, 6}. The event represented by D occurs
whenever the experiment is performed. Therefore, the event D is called a sure event or certain
event.

Complement of an Event
The complement of an event A with respect to S is the set of all the elements of S which are not in
A. The complement of A is denoted by A′ or A^c.

Note:
In an experiment if A has not occurred then A' has occurred.

Algebra of Events
In a random experiment, considering S (the sample space) as the universal set, let A, B and C be
events of S. We can define the union, intersection and complement of events on S, and their
properties are similar to those in set theory.

i) A ∪ B, A ∩ B and A′ are events of the random experiment.


ii) A-B is an event, which is the same as "A but not B".
iii) A ∪ B = B ∪ A, A ∩ B = B ∩ A
iv) (A ∪ B)′ = A′ ∩ B′, (A ∩ B)′ = A′ ∪ B′
v) (A ∪ B) ∪ C = A ∪ (B ∪ C), (A ∩ B) ∩ C = A ∩ (B ∩ C)
vi) A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C)
vii) A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C)
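
Because events are subsets of S, these identities can be checked directly with Python sets. The sketch below reuses the die events A and B from this chapter. The third event L and the helper complement are hypothetical additions for illustration.

```python
S = {1, 2, 3, 4, 5, 6}   # sample space for one roll of a die
A = {1, 3, 5}            # the number is odd
B = {2, 4, 6}            # the number is even
L = {1, 2, 3}            # hypothetical extra event: the number is at most 3


def complement(event):
    """Return the complement of an event with respect to S."""
    return S - event


# iii) Commutative laws
assert A | L == L | A and A & L == L & A
# iv) De Morgan's laws
assert complement(A | L) == complement(A) & complement(L)
assert complement(A & L) == complement(A) | complement(L)
# v) Associative laws
assert (A | B) | L == A | (B | L) and (A & B) & L == A & (B & L)
# vi) and vii) Distributive laws
assert A | (B & L) == (A | B) & (A | L)
assert A & (B | L) == (A & B) | (A & L)

print(A - L)  # {5}   "A but not L"
print(A & B)  # set() A and B cannot occur together
```

Here | is union, & is intersection and - is set difference, so each line is a direct transcription of one identity.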

Union of two events


If A and B are two events defined on the sample space S, then "A or B", written (A ∪ B), denotes
the event that at least one of the events A and B occurs.

Intersection of two events


Intersection of two events A and B is the joint occurrence of these two events. It is denoted by (A ∩ B).

Mutually Exclusive Events


Two events associated with a random experiment are said to be mutually exclusive, if both
cannot occur together in the same trial or in other words, occurrence of one prevents the
occurrence of the other.

In the above experiment, the events A = {1, 3, 5} and B = {2, 4, 6} are mutually exclusive.
Symbolically,
(A ∩ B) = ф
where (A ∩ B) is the event that both A and B occur.

Events E1, E2, …, En associated with a random experiment are said to be pair-wise mutually exclusive
if Ei ∩ Ej = ф for all i ≠ j.

Exhaustive Event
For a random experiment, let E1, E2, E3,….. En be the subsets of the sample space S.
E1, E2, E3, …, En form a set of Exhaustive events if
E1 ∪ E2 ∪ E3 ∪ ..... ∪ En = S

A set of events E1, E2, E3, …, En of S are said to be mutually exclusive and exhaustive events if
E1 ∪ E2 ∪ E3 ∪ … ∪ En = S and Ei ∩ Ej = ф for i ≠ j

Independent Events
Events are said to be independent if the occurrence of one event does not affect the occurrence of
others.

Let A and B be two events defined on sample space S. Events A and B are said to be independent if

P(A ∩ B) = P(A) P(B)

Note:
If A and B are independent, then
i) A^c and B^c are independent
ii) A^c and B are independent
iii) A and B^c are independent
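
Independence is usually tested with the product rule P(A ∩ B) = P(A) P(B). The sketch below checks that rule, and the note above, on an example that is not from the text: a throw of two fair dice, with all 36 outcomes assumed equally likely so that a probability is the number of favorable outcomes divided by the total. The names prob and is_independent are hypothetical.

```python
from fractions import Fraction
from itertools import product

S = set(product(range(1, 7), repeat=2))  # two dice, 36 equally likely outcomes


def prob(event):
    """Probability of an event as favorable outcomes / total outcomes."""
    return Fraction(len(event), len(S))


def is_independent(a, b):
    """Product rule: a and b are independent when P(a and b) = P(a) * P(b)."""
    return prob(a & b) == prob(a) * prob(b)


A = {s for s in S if s[0] % 2 == 0}     # first die shows an even number
B = {s for s in S if s[1] > 4}          # second die shows 5 or 6
C = {s for s in S if sum(s) == 12}      # both dice show 6
Ac, Bc = S - A, S - B                   # complements of A and B

print(prob(A), prob(B), prob(A & B))    # 1/2 1/3 1/6
print(is_independent(A, B))             # True
print(is_independent(Ac, Bc), is_independent(Ac, B), is_independent(A, Bc))  # True True True
print(is_independent(A, C))             # False
```

A and C are not independent because a total of 12 forces the first die to show 6, so the occurrence of C does affect A.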

Partition of the sample space


A set of events E1, E2, E3, …. En on S are said to form a partition of the sample space S, if they are
collectively exhaustive and mutually exclusive.
i.e. if
E1 ∪ E2 ∪ E3 ∪ … ∪ En = S
and
Ei ∩ Ej = ∅ for i ≠ j
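
The two conditions for a partition can be tested mechanically. The following Python sketch is illustrative only: the function name is_partition and the example events are assumptions, not part of the original material.

```python
from itertools import combinations


def is_partition(events, sample_space):
    """Check that the events are collectively exhaustive and mutually exclusive."""
    exhaustive = set().union(*events) == sample_space          # E1 ∪ ... ∪ En = S
    exclusive = all(a.isdisjoint(b) for a, b in combinations(events, 2))  # Ei ∩ Ej = ∅
    return exhaustive and exclusive


S = {1, 2, 3, 4, 5, 6}
print(is_partition([{1, 3, 5}, {2, 4, 6}], S))        # True: odd and even numbers
print(is_partition([{1, 2, 3}, {3, 4, 5, 6}], S))     # False: 3 lies in both events
print(is_partition([{1}, {2, 4, 6}], S))              # False: 3 and 5 are not covered
```

The second example is exhaustive but not mutually exclusive, and the third is mutually exclusive but not exhaustive, so neither forms a partition.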

Equally Likely Outcomes


The outcomes of a random experiment are said to be equally likely, if each one of them has equal
chance of occurrence.

Example: The outcomes of an unbiased coin are equally likely.

Probability of an Event
So far, we have introduced the sample space of an experiment and used it to describe events. In this
section, we introduce the probabilities associated with the events.

Let S be the sample space associated with the random experiment. Further, let S be finite and its
outcomes equally likely, i.e. let there be n (finite) number of sample points in S and let each one of
them be equally likely. Let A be an event defined on S. Then the probability of occurrence of event A
is denoted by P(A) and is given by

P(A) = m / n ......... (i)

where m is the number of outcomes favorable for the occurrence of the event A.

Note 1: 0≤ P(A) ≤ 1 as 0 ≤ m ≤ n

Note 2: If P(A) = 0 then A is called a null event, or impossible event.


Note 3: If P(A) = 1 then A is called a sure event.

Note 4: If m is the number of cases favorable to A, then n − m is the number of cases favorable to
the "non-occurrence of A".
Therefore, P(A^c) = (n − m) / n = 1 − m / n = 1 − P(A)
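
Formula (i) and the four notes can be illustrated with the die events defined earlier in the chapter. This is a minimal Python sketch under the same assumption of a finite sample space with equally likely outcomes. The function name prob is hypothetical, and Fraction keeps the results exact.

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}   # sample space for one roll of a fair die


def prob(event):
    """P(A) = m / n, where m counts the favorable outcomes and n = |S|."""
    return Fraction(len(event & S), len(S))


A = {1, 3, 5}   # the number is odd
C = set()       # the number is above 7: a null event
D = set(S)      # the number is between 1 and 6: a sure event

print(prob(A))       # 1/2
print(prob(C))       # 0
print(prob(D))       # 1
print(prob(S - A))   # 1/2, which equals 1 - P(A)
```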

Axiomatic Approach to Probability


Axiomatic approach to probability closely relates the theory of probability to set theory.

Let S be the sample space of an experiment. Probability is a function which associates a non-
negative real number, denoted by P(A), to every event A of the sample space, satisfying the following
axioms:

i) For every event A in S, P(A) ≥ 0.

ii) P(S) = 1

iii) If A1, A2, A3, …, An are mutually exclusive events in S, then

P(A1 ∪ A2 ∪ A3 ∪ … ∪ An) = P(A1) + P(A2) + … + P(An)

It follows from these axioms that

P(A^c) = 1 - P(A)

P(ф) = 0

Addition Rule of Probability

If A and B are any two events, then

P(A ∪ B) = P(A) + P(B) − P(A ∩ B)

If A and B are mutually exclusive events, i.e. A ∩ B = ∅, then P(A ∪ B) = P(A) + P(B)


If A, B, C are any three events, then

P(A ∪ B ∪ C) = P(A) + P(B) + P(C) − P(A ∩ B) − P(B ∩ C) − P(C ∩ A) + P(A ∩ B ∩ C)
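
The addition rule can be verified on a small example. The sketch below assumes one roll of a fair die with equally likely outcomes. The events B and C are hypothetical choices made only so that the events overlap, and prob is an illustrative helper.

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}   # sample space for one roll of a fair die


def prob(event):
    """Probability as favorable outcomes / total outcomes."""
    return Fraction(len(event), len(S))


A = {1, 3, 5}   # the number is odd
B = {1, 2, 3}   # hypothetical event: the number is at most 3
C = {3, 4, 5}   # hypothetical event: the number is 3, 4 or 5
E = {2, 4, 6}   # the number is even (mutually exclusive with A)

# Two events: subtract the overlap that was counted twice.
two = prob(A) + prob(B) - prob(A & B)
print(prob(A | B), two)        # 2/3 2/3

# Three events: subtract pairwise overlaps, then add back the triple overlap.
three = (prob(A) + prob(B) + prob(C)
         - prob(A & B) - prob(B & C) - prob(C & A)
         + prob(A & B & C))
print(prob(A | B | C), three)  # 5/6 5/6

# Mutually exclusive events: the overlap is empty, so probabilities simply add.
print(prob(A | E), prob(A) + prob(E))  # 1 1
```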

Exercise
1. A sample of 500 respondents was selected in a large metropolitan area in order to gather
information concerning consumer behavior. Among the questions asked was “Do you
enjoy shopping for clothing?” Of 240 males, 136 answered yes. Of 260 females, 224 answered
yes. What is the probability that a respondent chosen at random
i. Is a male?
ii. Enjoys shopping for clothing?
iii. Is a female?
iv. Does not enjoy shopping for clothing?

2. A five digit number is to be formed by digits 1,2,3,4 and 5 without repetition. What is the
probability that the number is divisible by 4?
3. What is the probability that a leap year will have 52 Tuesdays?
4. Two friends A and B apply for two vacancies for the same post. The chances of their
selection are 0.25 and 0.20 respectively. What is the chance that
i. One of them will be selected?
ii. Both will be selected?
iii. Neither of them will be selected?
5. Probability that a man will be alive 25 years hence is 0.3 and the probability that his wife will be
alive 25 years hence is 0.4. Find the probability that 25 years hence
i. Both will be alive?
ii. Only the man will be alive?
iii. Only the woman will be alive?
iv. At least one of them will be alive?
6. One bag contains 5 red and 7 black balls and the other 3 red and 12 black balls. A ball is drawn
at random from either of the bags. What is the chance that the selected ball is black?
7. According to a survey, the probability that a family owns two cars if their annual income is
greater than Rs. 8 lakh is 0.75. Of the households surveyed, 60 per cent had income over Rs.
8 lakh and 52 per cent had two cars. What is the probability that a family has two cars and an
income over Rs. 8 lakh a year?

8. The chance that a person stopping at a petrol pump will get his vehicle’s tyres checked is 0.12,
the chance that he will get the oil checked is 0.29 and the chance that he will get both checked
is 0.07.
i. What is the chance that a person will have neither his tyres nor oil checked?
ii. What is the probability that a person who has his oil checked will also have tyres
checked?
9. It is known that 15 per cent of the males and 10 per cent of the females are unemployed in a
town that has equal numbers of males and females. A person is selected at random from the town.
What is the probability that
i. A person is employed?
ii. A person is male given that he is employed?

10. A certain company encourages its employees to participate in cricket and hockey. A survey
indicates that 40% play cricket, 50% play hockey and 25% play both cricket and hockey. Find
the probability that
i. An employee plays only hockey?
ii. An employee plays only cricket?
iii. An employee takes part in at least one of the games, cricket and hockey?
Note:
Four chapters together with four exercises have been given in the material for the purpose of self-
study. Make sure that you go through the entire material. Evaluation will be conducted on this part
immediately after you join the course.

Wish you all the best!
