Final AB 19-21 PIM3 Basics of Business Statistics
Contents
1. Business Statistics
2. Presentation of Data
4. Probability
Chapter 1
Business Statistics
Introduction
Statistical methods can be used to summarize or describe a collection of data; this is called
descriptive statistics. In addition, patterns in the data may be modeled in a way that accounts for
randomness and uncertainty in the observations, and are then used to draw inferences about the
process or population being studied; this is called inferential statistics. Descriptive, predictive, and
inferential statistics comprise applied statistics. Business statistics is the science of good decision
making in the face of uncertainty and is used in many disciplines such as financial analysis,
econometrics, auditing, production and operations including services improvement, and marketing
research.
History
The scope of the discipline of statistics broadened in the early 19th century to include the collection
and analysis of data in general. Today, statistics is widely employed in government, business, and the
natural and social sciences.
Because of its empirical roots and its applications, statistics is generally considered not to be a
subfield of pure mathematics, but rather a distinct branch of applied mathematics. Its
mathematical foundations were laid in the 17th century with the development of probability
theory by Pascal and Fermat. Probability theory arose from the study of games of chance. The
method of least squares was first published by Adrien-Marie Legendre in 1805, although Carl
Friedrich Gauss claimed to have used it since about 1795. The use of
modern computers has expedited large-scale statistical computation, and has also made possible new
methods that are impractical to perform manually.
For practical reasons, rather than compiling data about an entire population, a chosen subset of the
population, called a sample, is studied. Data are collected about the sample in an observational or
experimental setting. The data are then subjected to statistical analysis, which serves two related
purposes: description and inference.
• Descriptive statistics is used to summarize the data, either numerically or graphically, in order
to describe the sample.
• Inferential statistics is used to model patterns in the data, accounting for randomness
and drawing inferences about the larger population. These inferences may take the form
of answers to yes/no questions (hypothesis testing), estimates of numerical characteristics
(estimation), descriptions of association (correlation), or modeling of relationships
(regression). Other modeling techniques include ANOVA, time series, and data mining.
If the sample is representative of the population, then inferences and conclusions made from the
sample can be extended to the population as a whole. A major problem lies in determining the
extent to which the chosen sample is representative. Statistics offers methods to estimate and
correct for randomness in the sample and in the data collection procedure, as well as methods for
designing robust experiments in the first place.
The use of any statistical method is valid only when the system or population under
consideration satisfies the basic mathematical assumptions of the method. Misuse of statistics can
produce subtle but serious errors in description and interpretation — subtle in the sense that even
experienced professionals sometimes make such errors, serious in the sense that they may affect,
for instance, social policy, medical practice and the reliability of structures such as bridges.
Even when statistics is correctly applied, the results can be difficult for the non-expert to interpret.
For example, the statistical significance of a trend in the data, which measures the extent to
which the trend could be caused by random variation in the sample, may not agree with one's
intuitive sense of its significance. The set of basic statistical skills (and skepticism) needed by people
to deal with information in their everyday lives is referred to as statistical literacy.
Applications of Business Statistics
Accounting
Public accounting firms use statistical sampling procedures when conducting audits for their clients.
Economics
Economists use statistical information in making forecasts about the future of the economy or some
aspect of it.
Marketing
Electronic point-of-sale scanners at retail checkout counters are used to collect data for a variety of
marketing research applications.
Production
A variety of statistical quality control charts are used to monitor the output of a production process.
Finance
Financial advisors use price-earnings ratios and dividend yields to guide their investment
recommendations.
Levels of measurement
There are four types of measurements or levels of measurement or measurement scales used in
statistics: nominal, ordinal, interval, and ratio. They have different degrees of usefulness in
statistical research.
Nominal
When the data for a variable consists of labels or names used to identify an attribute of the
element, the scale of measurement is considered to be nominal.
Example:
Students of a university are classified by the school in which they are enrolled using a non-numeric
label such as Business, Humanities, Education, and so on. Alternatively, a numeric code could be used
for the school variable (e.g. 1 denotes Business, 2 denotes Humanities, 3 denotes Education, and so
on).
Ordinal
The data have the properties of nominal data and the order or rank of the data is meaningful. A
nonnumeric label or numeric code may be used.
Example:
Students of a university are classified by their class standing using a nonnumeric label such as
freshman, sophomore, junior, or senior. Alternatively, a numeric code could be used for the class
standing variable (e.g. 1 denotes freshman, 2 denotes sophomore, and so on).
Interval
The data have the properties of ordinal data, and the interval between observations is expressed in
terms of a fixed unit of measure. Interval data are always numeric.
Example:
Melissa has an SAT score of 1205, while Kevin has an SAT score of 1090. Melissa scored 115
points more than Kevin.
Ratio
The data have all the properties of interval data and the ratio of two values is meaningful. Variables
such as distance, height, weight, and time use the ratio scale. This scale must contain a zero value that
indicates that nothing exists for the variable at the zero point.
Example:
Melissa’s college record shows 36 credit hours earned, while Kevin’s record shows 72 credit
hours earned. Kevin has twice as many credit hours earned as Melissa.
Qualitative and Quantitative Data
Data can be further classified as being qualitative or quantitative. Qualitative data include labels or
names used to identify an attribute of each element. Qualitative data use either the nominal or
ordinal scale of measurement and may be nonnumeric or numeric. Quantitative data are obtained
using either the interval or ratio scale of measurement. The statistical analysis appropriate for a
particular variable depends upon whether the variable is qualitative or quantitative.
If the variable is qualitative, the statistical analysis is rather limited. We can summarize the
qualitative data by counting the number of observations in each category or by computing the
proportion of the observations in each qualitative category. However, if the variable is
quantitative, arithmetic operations often provide meaningful results.
Discrete variable
A variable taking only isolated values is called a discrete variable. The graphical representation of a
discrete variable is a step function.
Example:
Number of people in a group, number of accidents occurring on a particular day, etc.
Continuous variable
A variable which can take any value within a given interval is referred to as a continuous variable.
The graphical representation of a continuous variable is a curve.
Example:
Weight of a person, temperature on a given day, rainfall on a given day, etc.
Chapter 2
Presentation of Data
Types of Data
Primary data
Primary data are data collected for the first time by the investigator. They can be collected using
various methods, such as surveys (including a census), telephone interviews, e-mail, etc. Such data
are generally referred to as raw data because they are unprocessed.
Secondary data
In some cases, data needed for a particular application already exist. Companies maintain a
variety of records or databases about their employees, and business operations. Data are also available
from a variety of industry associations and special interest organizations. The internet continues to
grow as an important source of information and statistical data. Almost all companies maintain Web
sites that provide general information about the company as well as data on sales, number of
employees, number of products etc. Government agencies are another important source of
secondary data. Information on vital events (birth, death etc.) is available with the governmental
agencies. Sometimes information can also be collected from published journals.
Presentation of Data
After the data have been collected, the next stage is their presentation. Usually the amount of
information collected is huge, so it becomes necessary to present it in a more systematic and concise
way in order to bring out the important features or characteristics of the data. Data can be presented
in two ways:
1. Tabular
2. Graphical
Let us start the discussion with the introduction of frequency distribution and various
components of the frequency distribution.
Frequency Distribution
A frequency distribution is a tabular summary of data showing the number (frequency) of items in
each of several non-overlapping classes. When raw data are converted into a frequency
distribution, the resulting summary offers more insight than the original data.
Three steps necessary to define classes for a frequency distribution with quantitative data are:
Number of classes
There is no specific rule for choosing the number of classes. As a general guideline, between 5 and
20 classes are used.
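As a minimal sketch of how a frequency distribution can be tabulated, the following Python snippet counts observations in equal-width, non-overlapping classes and also reports each class's relative frequency (class frequency divided by the total number of observations). The marks, the class width of 10 and the function name `frequency_distribution` are illustrative assumptions, not taken from the text.

```python
def frequency_distribution(data, lower, width, num_classes):
    """Count observations in the classes [lower, lower + width), [lower + width, ...), ..."""
    counts = [0] * num_classes
    for value in data:
        index = int((value - lower) // width)
        if 0 <= index < num_classes:  # ignore values outside the chosen classes
            counts[index] += 1
    classes = [(lower + i * width, lower + (i + 1) * width) for i in range(num_classes)]
    return list(zip(classes, counts))

# Hypothetical marks of 12 students (illustrative data only).
marks = [5, 12, 17, 23, 25, 28, 31, 36, 38, 42, 44, 49]
table = frequency_distribution(marks, lower=0, width=10, num_classes=5)
total = sum(f for _, f in table)

for (lo, hi), f in table:
    # Relative frequency = class frequency / total number of observations.
    print(f"{lo} but less than {hi}: frequency {f}, relative frequency {f / total:.2f}")
```

Because every class has the form "lower limit but less than upper limit", each observation falls in exactly one class, so the frequencies add up to the number of observations.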
Relative Frequency
Tabulation
While presenting the data one can make use of tabulation. It is the most concise way of presentation of
data. There can be one-way, two-way or multifold tables depending on number of columns and rows
we choose.
Diagrammatic representation
One of the graphical ways of representing data is diagrammatic representation. For this, one can use
bar graphs or pie charts. A bar graph can be of the simple, multiple, sub-divided or percentage type.
This is a pictorial presentation of data, in which points are not plotted according to scale. Diagrams
are more attractive and colorful than the various graphs available in statistics.
Pie-chart
Graphical Representation
Histogram
Cumulative Frequency Distribution
Cumulative frequency is obtained by adding the frequency of a class interval to the frequencies of
all the preceding intervals up to that class interval. This is explained by the example below.
The following frequency distribution table gives the marks obtained by 40 students:
In the above table it can be observed that frequencies are added from top to bottom and also 4
students got marks 'less than 10', 9 students got marks 'less than 20' and so on.
Therefore, the above distribution is called a 'less than' cumulative frequency distribution. The above
table can be re-written as follows:
In the same way 'more than' cumulative frequency distribution can be obtained by adding to the
other frequencies in the reverse order. It is explained in the following table.
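The two kinds of cumulative frequency can be sketched in Python as running totals. The class frequencies below are hypothetical: they are chosen only so that 4 students score 'less than 10' and 9 score 'less than 20', as stated above, and so that the total is 40. The remaining values are assumptions for illustration.

```python
from itertools import accumulate

# Upper class limits and hypothetical class frequencies (illustrative assumption).
upper_limits = [10, 20, 30, 40, 50, 60]
frequencies = [4, 5, 12, 8, 6, 5]

# 'Less than' cumulative frequencies: running total from top to bottom.
less_than = list(accumulate(frequencies))

# 'More than' cumulative frequencies: running total from bottom to top,
# i.e. the number of observations in this class and all later classes.
more_than = list(accumulate(reversed(frequencies)))[::-1]

for limit, f, lt, mt in zip(upper_limits, frequencies, less_than, more_than):
    print(f"class ending at {limit}: f = {f}, 'less than' c.f. = {lt}, 'more than' c.f. = {mt}")
```

The last 'less than' value and the first 'more than' value both equal the total number of observations, which is a quick check on the arithmetic.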
Ogive curve
It is a cumulative frequency curve. There are two types of ogive curve: the 'less than' ogive and the
'more than' ogive. An ogive is drawn by taking data values on the horizontal axis and
cumulative frequencies on the vertical axis.
Example:
Draw a 'less than' ogive curve for the following data
To plot an Ogive:
(i) We plot the points whose abscissae are the actual upper class limits and whose ordinates are the
cumulative frequencies: (10, 2), (20, 10), (30, 22), (40, 40), (50, 68), (60, 90), (70, 96) and
(80, 100).
(iii) The Ogive is connected to a point on the X-axis representing the actual lower limit of the first
class.
Example
Using the data given below, construct a 'more than' cumulative frequency table and draw the Ogive.
To plot an Ogive:
(i) We plot the points whose abscissae are the actual lower class limits and whose ordinates are the
cumulative frequencies: (70.5, 2), (60.5, 7), (50.5, 13), (40.5, 23), (30.5, 37), (20.5, 49),
(10.5, 57) and (0.5, 60).
(ii) The Ogive is connected to a point on the X-axis representing the actual upper limit of the last
class, in this case the point (80.5, 0).
Scale: X-axis 1 cm = 10 marks; Y-axis 2 cm = 10 c.f.
Frequency Polygon
In a frequency distribution, the mid-value of each class is obtained. Then, on the graph paper, the
frequency is plotted against the corresponding mid-value. These points are joined by straight
lines. These straight lines may be extended in both directions to meet the X-axis to form a
polygon.
Example
The weights of 50 students are recorded below. Draw a frequency polygon for this data.
Answer
If the points of the above graph are joined by a smooth curve, then it is known as a frequency curve.
Exercise
The raw data displayed below are the electric and gas utility charges during the month of July
1990 for a random sample of 50 one-bedroom apartments in Mumbai:
96 171 202 178 147 102 153 197 127 82
157 185 90 116 172 111 148 213 130 165
141 149 206 175 123 128 144 168 109 167
95 163 150 154 130 143 187 166 139 149
108 119 183 151 114 135 191 137 129 158
a. Form a frequency distribution having 7 class intervals with the following class
boundaries: Rs.80 but less than Rs.100, Rs.100 but less than Rs.120, and so on.
b. Form the percentage distribution from the frequency distribution developed in a.
c. From the percentage distribution developed in b:
i. Plot the percentage histogram.
ii. Plot the percentage polygon.
Chapter 3
Descriptive summary measures computed from an entire population are called ‘Parameters’.
Most batches of data show a distinct tendency to group or cluster about a certain ‘central’ value.
Hence, it generally becomes possible to select some typical value, called an average, to describe the
entire batch. Such a typical value is a measure of central tendency or location.
Arithmetic Mean
It is obtained by adding the raw scores and dividing the sum by the number of items.
Properties
• Based on each and every observation in the series.
• Capable of further mathematical treatment.
• Gives a distorted representation of the data under study if the data contain outliers, i.e. it is
greatly affected by extreme observations.
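Then the mean is
$$M = \frac{\sum X}{N}$$
where,
M = mean
X = each score or item
N = number of items
∑ = sigma, which means 'summation of'
For example, for the scores 6, 10, 4, 12 and 8:
$$M = \frac{6 + 10 + 4 + 12 + 8}{5} = \frac{40}{5} = 8$$
For a frequency distribution, in which the value x occurs with frequency f, the mean is
$$M = \frac{\sum fx}{\sum f}$$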
Example: Find the mean for the following table by the 'Direct Method'
x 10 15 20 25 30 Total
f 3 6 5 4 2 20
x f fx
10 3 30
15 6 90
20 5 100
25 4 100
30 2 60
Total 20 380
$$M = \frac{\sum fx}{\sum f} = \frac{380}{20} = 19.0$$
Marks f x (mid-value) fx
0 – 9 4 4.5 18.0
10 – 19 6 14.5 87.0
20 – 29 12 24.5 294.0
30 – 39 6 34.5 207.0
40 – 49 7 44.5 311.5
50 – 59 5 54.5 272.5
Total 40 1190.0
$$M = \frac{\sum fx}{\sum f} = \frac{1190}{40} = 29.75$$
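The two calculations above can be checked with a short Python sketch. It applies M = ∑fx / ∑f to the values of the 'Direct Method' example and to the grouped marks table, using the class mid-value as x. The function name `weighted_mean` is an illustrative choice, not a name used in the text.

```python
def weighted_mean(values, frequencies):
    """Return sum(f * x) / sum(f) for paired values x and frequencies f."""
    total_fx = sum(f * x for x, f in zip(values, frequencies))
    return total_fx / sum(frequencies)

# Direct method: each value x with its frequency f.
x_values = [10, 15, 20, 25, 30]
x_freqs = [3, 6, 5, 4, 2]
direct_mean = weighted_mean(x_values, x_freqs)

# Grouped data: x is taken as the mid-value of each class of marks.
class_limits = [(0, 9), (10, 19), (20, 29), (30, 39), (40, 49), (50, 59)]
class_freqs = [4, 6, 12, 6, 7, 5]
mid_values = [(low + high) / 2 for low, high in class_limits]
grouped_mean = weighted_mean(mid_values, class_freqs)

print(direct_mean)   # 19.0
print(grouped_mean)  # 29.75
```

For grouped data the result is an approximation, since every observation in a class is treated as if it were equal to the class mid-value.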
Median
Example: Find the median of 83, 37, 70, 29, 45, 63, 41, 70, 30, 54
Data in the sequence is 29, 30, 37, 41, 45, 54, 63, 70, 70, 83
n = 10 (even), so the median is the mean of the two middle-most scores (the 5th and 6th scores):
$$\text{Median} = \frac{45 + 54}{2} = \frac{99}{2} = 49.5$$
Example: Find the median of 15, 8, 14, 20, 13, 12, and 16.
Series in order is 8, 12, 13, 14, 15, 16, 20.
n = 7 (odd)
$$\text{Median} = \left(\frac{n+1}{2}\right)\text{th score} = 4\text{th score} = 14$$
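The procedure used in the two examples (sort the scores, then take the middle score, or the mean of the two middle scores when n is even) can be sketched in Python as follows. The function name `median_of` is an illustrative assumption.

```python
def median_of(scores):
    """Median of a list of scores: middle value after sorting."""
    ordered = sorted(scores)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        # Odd n: the ((n + 1) / 2)-th score, which is index n // 2 when counting from 0.
        return ordered[middle]
    # Even n: mean of the two middle-most scores.
    return (ordered[middle - 1] + ordered[middle]) / 2

print(median_of([83, 37, 70, 29, 45, 63, 41, 70, 30, 54]))  # 49.5
print(median_of([15, 8, 14, 20, 13, 12, 16]))               # 14
```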
Mode
Mode is defined as the value in a batch of data which occurs most frequently. It does not get affected
by extreme observations.
It is used only for descriptive purposes because it is more variable from sample to
sample than other measures of central tendency.
Example: Find the mode of 43, 42, 44, 40, 48, 45, 40, 40
The given series is 40, 40, 40, 42, 43, 44, 45, 48
Since 40 is the most frequently repeated score,
Mode = 40
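A minimal Python sketch of the same idea counts how often each score occurs and keeps the most frequent one. It returns a list because a batch of data may have more than one mode; that design choice, and the function name `modes_of`, are assumptions made for illustration.

```python
from collections import Counter

def modes_of(scores):
    """Return every score that occurs with the highest frequency."""
    counts = Counter(scores)
    highest = max(counts.values())
    return sorted(score for score, count in counts.items() if count == highest)

print(modes_of([43, 42, 44, 40, 48, 45, 40, 40]))  # [40]
```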
Midrange
The midrange is the average of the smallest and the largest observations in a batch of data. Its main
drawback is that it becomes distorted as a summary measure of central tendency if an outlier is
present.
Measures of Dispersion
A measure of location alone cannot reveal all the characteristics of the data under study. For
example, two series with the same measure of central tendency may have different patterns of
variation, so comparing them using the average alone would be misleading. A measure that captures
this variation is called a measure of dispersion.
Range
It is a crude measure of dispersion. It measures the total spread in the batch of data.
It is given by
$$\text{Range} = x_{\max} - x_{\min}$$
It fails to take into account how the data are distributed between the smallest and the largest
values.
Variance
It is based on each and every observation in the series. It is defined as the mean of the squared
deviations of the observations about the mean.
Standard Deviation
It is the most commonly used measure of dispersion. It is defined as positive square root of the
variance.
Variance and standard deviation reflect how data are varying. They measure the average scatter
around the mean- that is, these measures evaluate how the values fluctuate about the mean.
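Standard deviation is calculated using the following formulae.
(a) For an individual series,
$$\text{S.D.} = \sqrt{\frac{\sum (x - \bar{x})^2}{n}}$$
(b) For a frequency distribution,
$$\text{S.D.} = \sqrt{\frac{\sum f (x - \bar{x})^2}{\sum f}}$$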
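Coefficient of Variation
The coefficient of variation (C.V.) is a relative measure of dispersion. It is used to compare
the variability of two or more batches of data that are expressed in different units of measurement. C.V.
is also used in a situation where we want to compare two or more sets of data which are measured in
the same units but differ to such an extent that the direct comparison of the respective standard
deviations is not very helpful.
$$\text{C.V.} = \frac{\text{S.D.}}{\text{A.M.}} \times 100\%$$
where S.D. is the standard deviation and A.M. is the arithmetic mean.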
Example:
Calculate the standard deviation and the variance for the following data 7, 8, 11, 6, 13, 8, 10.
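Answer
$$\text{Mean} = \frac{7 + 8 + 11 + 6 + 13 + 8 + 10}{7} = \frac{63}{7} = 9$$
$$\text{Variance } \sigma^2 = \frac{\sum (x - M)^2}{N} = \frac{36}{7} = 5.14$$
$$\text{S.D. } \sigma = \sqrt{\frac{36}{7}} = 2.27$$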
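The worked example can be reproduced step by step in Python. This is a sketch that uses the population formulas (division by N), as in the example. The coefficient of variation at the end is an added illustration that the example itself does not compute, and all variable names are assumptions.

```python
import math

data = [7, 8, 11, 6, 13, 8, 10]
n = len(data)

mean = sum(data) / n                                  # 63 / 7 = 9.0
squared_deviations = [(x - mean) ** 2 for x in data]  # (x - M)^2 for each observation
variance = sum(squared_deviations) / n                # 36 / 7
std_dev = math.sqrt(variance)                         # positive square root of the variance
coeff_of_variation = std_dev / mean * 100             # S.D. / A.M. x 100%

print(f"mean = {mean}")
print(f"variance = {variance:.2f}")
print(f"S.D. = {std_dev:.2f}")
print(f"C.V. = {coeff_of_variation:.1f}%")
```

The standard library functions `statistics.pvariance` and `statistics.pstdev` use the same division by N and can be used to cross-check the result.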
Shape
For Right Skewed (Positively Skewed) Distribution, mean is affected by extremely large
observation. In this case,
mode < median < mean < midrange
Quartiles
These are the partition values. Quartile is a useful measure of non-central location. It is often
employed when one wants to summarize or describe the properties of large batches of quantitative
data.
Midhinge
The midhinge is the mean of the first and third quartiles in a batch of data. It is used to overcome
potential problems introduced by extreme values in the data. It is a measure of central tendency.
Interquartile Range
It is a measure of dispersion which measures the spread of the middle 50% of the observations.
Hence, it is not affected by extreme observations.
In order to get a better idea about the shape of the distribution, we use the five-number summary.
These five numbers are:
Xmin, Q1, Q2, Q3 and Xmax
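The following Python sketch computes the five-number summary, the midhinge and the interquartile range for the seven observations used in the standard deviation example. The text does not say how the quartiles are located, so the sketch assumes the 'exclusive' convention of the standard library function `statistics.quantiles`, which places the p-th quartile at position p(n + 1)/4 in the ordered data. Other textbooks and software use slightly different conventions and may give slightly different quartiles.

```python
import statistics

data = [7, 8, 11, 6, 13, 8, 10]

# Quartiles under the assumed 'exclusive' convention (positions p * (n + 1) / 4).
q1, q2, q3 = statistics.quantiles(data, n=4, method="exclusive")

five_number_summary = (min(data), q1, q2, q3, max(data))
midhinge = (q1 + q3) / 2        # mean of the first and third quartiles
interquartile_range = q3 - q1   # spread of the middle 50% of the observations

print(five_number_summary)
print(midhinge, interquartile_range)
```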
Exercise
1. In a class of 50 students, 10 have failed and their average of marks is 2.5. The total marks secured
by the entire class were 281. Find the average marks of students who have passed.
2. What will be the mean and the median of 7 consecutive integers, the least of which is x?
3. Mean and median of 51 items are 100 and 95 respectively. At the time of calculations two
items 180 and 90 were wrongly taken as 100 and 10. What are the correct values of mean and
median?
4. The mean of a group of 10 observations is 15. Fifteen more observations are added to this group
and the mean of these 25 observations is found to be 12. Find the mean of the additional 15
observations.
5. The mean of a group of 20 items is 30. Find the mean if each value is doubled and
increased by 5.
6. Calculate population variance from the following information: n = 15,
∑x = 480, ∑x² = 15735
7. Means and variances of two series are given below:
Mean Variance
Series A 54 9
Series B 100 4
Chapter 4
Probability
Counting Principles
Addition
If two different operations can be performed in ‘m’ and ‘n’ different ways, then the number of
ways in which either operation 1 or operation 2 can be performed is given by (m+n) ways.
Multiplication
If two different operations can be performed in ‘m’ and ‘n’ different ways, then the number of
ways in which both operation 1 and operation 2 can be performed is given by (m × n) ways.
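The two principles can be illustrated by listing the possibilities explicitly. In this Python sketch the bus and train routes are hypothetical, and the addition principle is applied under the usual assumption that the two operations cannot be performed at the same time.

```python
from itertools import product

# Hypothetical example: 3 bus routes and 2 train routes between two cities.
bus_routes = ["B1", "B2", "B3"]   # operation 1 can be performed in m = 3 ways
train_routes = ["T1", "T2"]       # operation 2 can be performed in n = 2 ways

# Addition: make one journey either by bus or by train.
either = bus_routes + train_routes

# Multiplication: go by bus and return by train.
both = list(product(bus_routes, train_routes))

print(len(either))  # m + n
print(len(both))    # m * n
```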
Permutations
A permutation is an arrangement of ‘n’ things in which the order of the things is
important. Broadly speaking, any problem on permutations can be classified into one of 2 different
cases.
Case I
Examples:
1) 2 and 3 are two digits and with these digits, the numbers 32 and 23 are formed. Although,
numbers viz., 32 and 23 consist of the digits 2 and 3, the order of digits is different. Each of the
above arrangements is called a 'permutation'. Thus, the number of arrangements or permutations
of two distinct digits 2 and 3 is 2.
This number is given by 2! = 2 × 1 = 2.
The number of permutations of ‘n’ different things taken ‘r’ at a time is the same as the number of
ways of filling n letters in r positions, arranged in a straight line.
Each position is accommodating only one letter. We may fill the first position with any one of the
n letters. Having filled the first position in any one of these n ways, we have (n-1) letters with which
to fill the next position.
Having filled the first two positions, we have (n-2) letters with which to fill the third position.
Proceeding in this way one can see that filling r positions is like performing r different operations
with n, (n-1), (n-2) ….. different ways respectively. And since, we have to fill all r positions; we
need to multiply the respective number of ways. Therefore, the total number of ways in which r
positions can be filled with n letters without repetition is n (n-1) (n-2) (n-3)… (n-r+1).
Thus, the number of r-permutations of n different things, denoted by nPr or P(n,r), is given by
$$^nP_r = n(n-1)(n-2)(n-3)\cdots(n-r+1) = \frac{n!}{(n-r)!}$$
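The position-by-position argument can be sketched in Python: the first position can be filled in n ways, the next in (n-1) ways, and so on for r positions. The function name `count_permutations` is an illustrative assumption. The result is cross-checked against the standard library's `math.perm` and against an explicit listing with `itertools.permutations`.

```python
import math
from itertools import permutations

def count_permutations(n, r):
    """n(n-1)(n-2)...(n-r+1): fill r positions one after another without repetition."""
    ways = 1
    for filled in range(r):
        ways *= n - filled   # choices left for the next position
    return ways

# The two digits 2 and 3 give the arrangements 23 and 32.
arrangements = list(permutations([2, 3], 2))
print(arrangements)              # [(2, 3), (3, 2)]
print(count_permutations(2, 2))  # 2

# A larger case: 5 different things taken 3 at a time.
print(count_permutations(5, 3), math.perm(5, 3))  # 60 60
```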
Properties
Case II
Circular Permutations
When things are arranged in places along a line with first and last place, they form a linear
permutation. So far we have dealt only with linear permutations. When things are arranged in
places along closed curve or a circle, in which any place may be regarded as the first or last place,
they form a circular permutation.
Thus, the number of permutations of 4 objects in a row = 4!, whereas the number of circular
permutations of 4 objects is (4-1)! = 3!
The permutation in a row or along a line has a beginning and an end, but there is nothing like
beginning or end or first and last in a circular permutation. In circular permutations, we consider one
of the objects as fixed and the remaining objects are arranged as in linear permutation.
The following arrangements of 4 objects O1, O2, O3, O4 in a circle will be considered as one or
same arrangement
Observe carefully that when arranged in a row, O1 O2 O3 O4, O2O3O4 O1, O3O4O1O2, O4O1O2O3 are
different permutations. When arranged in a circle, these 4 permutations are considered as one
permutation.
Theorem:
The number of circular permutations of n different objects is (n-1)!
Proof:
Each circular permutation corresponds to n linear permutations depending on where we start.
Since there are exactly n! linear permutations, there are exactly $\frac{n!}{n}$ circular
permutations. Hence, the number of circular permutations is (n-1)!
Example
Suppose there are n guests to be seated around a circular table. Then we fix the position of one
of the guests (which can be done in only one way) and arrange the remaining (n-1) guests in (n-1)
positions just as in the linear case. Thus, the total number of ways in which n guests can be arranged
in a circular manner is (n-1)!
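Both arguments for the (n-1)! result can be checked by brute force for the 4 objects O1, O2, O3, O4. The sketch below first fixes one object and arranges the rest, and then, as a cross-check, groups all n! linear permutations into sets of rotations. The function names are illustrative assumptions.

```python
import math
from itertools import permutations

def circular_arrangements(objects):
    """Fix the first object and arrange the remaining ones as in the linear case."""
    first, rest = objects[0], objects[1:]
    return [(first,) + tail for tail in permutations(rest)]

def rotate_to_start(arrangement, anchor):
    """Rotate a linear arrangement so that `anchor` comes first."""
    i = arrangement.index(anchor)
    return arrangement[i:] + arrangement[:i]

objects = ("O1", "O2", "O3", "O4")
circular = circular_arrangements(objects)

# Cross-check: linear permutations that are rotations of each other count as one.
rotation_classes = {rotate_to_start(p, "O1") for p in permutations(objects)}

print(len(circular), len(rotation_classes), math.factorial(len(objects) - 1))  # 6 6 6
```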
Combinations
The number of ways of selecting r things out of n dissimilar things is denoted by C(n,r) or nCr. The
selections of number of things taking some or all of them at a time are called combinations.
Example:
From a class of 32 students, 4 are to be chosen for a competition. In how many ways can this be done?
$$^{32}C_4 = \frac{32 \cdot 31 \cdot 30 \cdot 29}{4 \cdot 3 \cdot 2 \cdot 1} = 35960$$
Note that there is a relationship between permutations and combinations. For a given set of ‘n’
dissimilar things, the number of permutations is never less than the corresponding number of
combinations, because each selection of r things can be arranged in r! ways:
$$C(n,r)\, r! = P(n,r) \quad\Rightarrow\quad C(n,r) = \frac{P(n,r)}{r!} = \frac{n!}{r!\,(n-r)!}$$
Properties
$$C(n, n-r) = \frac{n!}{(n-r)!\,[n-(n-r)]!} = \frac{n!}{(n-r)!\,r!} = C(n, r)$$
C(n,0) = C(n,n) = 1
$$C(n,0) = \frac{n!}{0!\,(n-0)!} = \frac{n!}{n!} = 1 = C(n, n)$$
• In a combination, only selection is made. In a permutation, not only a selection is made, but
also there is an arrangement of a definite order.
• There is no order of selection in combinations. In permutation, order is a must.
• Usually (i.e., except in special cases or trivial cases), the number of permutations exceeds the
number of combinations.
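The example and the properties above can be verified numerically. This Python sketch uses the standard library functions `math.comb` and `math.perm`. The small case of choosing 2 out of 5 things is an added illustration, not taken from the text.

```python
import math
from itertools import combinations

n, r = 32, 4

# Choosing 4 students out of 32.
ways = math.comb(n, r)
print(ways)  # 35960

# Relationship with permutations: C(n, r) * r! = P(n, r).
print(ways * math.factorial(r) == math.perm(n, r))

# Properties: C(n, n - r) = C(n, r) and C(n, 0) = C(n, n) = 1.
print(math.comb(n, n - r) == math.comb(n, r))
print(math.comb(n, 0), math.comb(n, n))

# Small illustrative case: order does not matter, so (0, 1) and (1, 0) are one selection.
pairs = list(combinations(range(5), 2))
print(len(pairs), math.perm(5, 2))  # 10 selections against 20 arrangements
```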
Exercise
1. A gentleman has 6 friends to invite. In how many ways can he send invitation cards to them if
he has 3 servants to carry the cards?
2. How many numbers, each lying between 100 and 1000 can be formed with digits 2, 3, 4, 0, 8,
and 9 (if repetitions of digits are not allowed)?
3. How many three digit numbers divisible by 5 can be formed using any numerals from 0 to 9
without repetition?
4. There are 10 points in a plane, of which 3 are collinear. Find the number of triangles formed
by joining these points.
5. From 7 engineers and 4 doctors a committee of 5 members is to be formed. In how many ways
can this be done?
6. There are 2 books each of 3 volumes and 2 books each of 2 volumes. In how many ways can these
be arranged on a shelf so that the volumes of the same book remain together?
7. A company has 11 computer engineers and 7 mechanical engineers. In how many ways can
they be seated in a row so that no 2 of the mechanical engineers may sit together?
8. A company has 11 computer engineers and 7 mechanical engineers. In how many ways can
they be seated in a row so that all the mechanical engineers do not sit together?
9. How many words can be formed using letters of the word ‘MATHEMATICS’ if
i. there is no restriction
ii. all the vowels are together
iii. vowels are together and consonants are together
10. A person has 12 friends and wants to invite 8 of them to a birthday party. Find
i. in how many ways this can be done if 3 particular friends always attend the party
ii. in how many ways this can be done if 3 particular friends never attend the party
Probability
In our day to day life, we come across many uncertain events. We wake up in the morning and check
the weather report. The statement could be 'there is 60% chance of rain today'. This statement implies
that rain is more likely than dry weather. We decide upon our breakfast from a
statement that "corn flakes might reduce cholesterol". What is the chance of getting a flat tyre on the
way to an important appointment? And so on.
How probable is an event? We generally infer this by repeated observation of such events and their
long-term patterns. Probability is the branch of mathematics devoted to the study of such events.
People have always been interested in games of chance and gambling. Games such as dice are known to
have existed since 3000 BC, but such games were not treated mathematically till the fifteenth century.
During this period, the calculation and theory of probability originated in Italy. Later, in the
seventeenth century, the French mathematicians Pascal and Fermat contributed to this field of study.
The foundation of modern probability theory is credited to the Russian mathematician Kolmogorov. He
proposed the axioms on which the present subject of probability is based.
Example:
While measuring the inner radius of an open tube, using slide calipers, we get the same result by
performing repeatedly the same experiment. Many scientific and engineering experiments are
deterministic.
If the outcome is one of the several possible outcomes, then such an experiment is called a "random
experiment" or 'nondeterministic' experiment.
In other words, any experiment whose outcome cannot be predicted in advance, but is one of the set
of possible outcomes, is called a random experiment.
If we think of an experiment as being performed repeatedly, then each repetition is called a trial. We
observe an outcome for each trial.
Example:
An experiment consists of 'tossing a die and observing the number on the upper-most face'.
In such cases, we talk of chance or probability, which numerically measures the degree of chance of
the occurrence of events.
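The set of all possible outcomes of a random experiment is called the sample space and is denoted by S.
Note:
Each element of S denotes a possible outcome. Each element of S is known as a sample point.
Any trial results in an outcome and corresponds to one and only one element of the set S.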
e.g.
1. In the experiment of tossing a coin,
S = {H, T}
2. In the experiment of tossing two coins simultaneously,
S = {HH, HT, TH, TT}
3. In the experiment of throwing a pair of dice,
S = {(1,1), (1,2), (1,3), (1,4), (1,5), (1,6), (2,1), (2,2),………. (6,1), (6,2), (6,3), (6,4), (6,5),
(6, 6)}
Events
An event is the outcome or a combination of outcomes of an experiment. In other words, an event is a
subset of the sample space.
Consider the experiment of tossing a die, for which S = {1, 2, 3, 4, 5, 6}.
Let A be the event that the number on the uppermost face is odd. Then the corresponding set of
favorable outcomes is {1, 3, 5}, i.e. A = {1, 3, 5}.
Let B be the event that the number on the uppermost face is even. Then B = {2, 4, 6}.
Let C be the event that the number on the uppermost face is above 7. This set is certainly a
null set or an empty set because there is no favorable outcome. Thus, C = ∅.
Let D be the event that the number on the uppermost face is an integer between 1 and 6, both
inclusive. Then D = {1, 2, 3, 4, 5, 6} = S.
Let E be the event that the outcome is less than 2. Then E = {1}.
Types of Events
As we have different types of sets, we have different types of events.
We illustrate different types of events using above example.
Simple Event
If an event has only one element of the sample space, then it is called a simple or elementary event. In
the above example, E = {1} is a simple event.
Compound Event
If an event has more than one sample point, the event is called a compound event. In
the above example, A = {1, 3, 5} is a compound event.
Sure event
In the above experiment, the sample space S= {1, 2, 3, 4, 5, 6}. The event represented by D occurs
whenever the experiment is performed. Therefore, the event D is called a sure event or certain
event.
Complement of an Event
The complement of an event A with respect to S is the set of all the elements of S which are not in
A. The complement of A is denoted by A' or A^c.
Note:
In an experiment if A has not occurred then A' has occurred.
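Since an event is a subset of the sample space, the events of the die experiment can be represented directly as Python sets. The sketch below builds the events A to E from their descriptions, classifies each one, and forms the complement of A. The helper name `classify` and its labels are illustrative assumptions. An event equal to S is labelled as the sure event even though it also has more than one sample point.

```python
S = {1, 2, 3, 4, 5, 6}                   # sample space for tossing a die

A = {x for x in S if x % 2 == 1}         # odd number
B = {x for x in S if x % 2 == 0}         # even number
C = {x for x in S if x > 7}              # number above 7
D = {x for x in S if 1 <= x <= 6}        # integer between 1 and 6 inclusive
E = {x for x in S if x < 2}              # outcome less than 2

def classify(event):
    """Label an event as null, sure, simple or compound."""
    if not event:
        return "null event"
    if event == S:
        return "sure event"
    if len(event) == 1:
        return "simple event"
    return "compound event"

for name, event in [("A", A), ("B", B), ("C", C), ("D", D), ("E", E)]:
    print(name, sorted(event), classify(event))

A_complement = S - A                     # all elements of S which are not in A
print(sorted(A_complement))              # [2, 4, 6]
```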
Algebra of Events
In a random experiment, considering S (the sample space) as the universal set, let A, B and C be
events of S. We can define the union, intersection and complement of events and their properties on S,
which are similar to those in set theory.
Two events are said to be mutually exclusive if they cannot occur together.
In the above experiment, the events A = {1, 3, 5} and B = {2, 4, 6} are mutually exclusive.
Symbolically,
(A ∩ B) = ∅
where (A ∩ B) is the event that both A and B occur.
Events E1, E2, …, En associated with a random experiment are said to be pair-wise mutually exclusive
if Ei ∩ Ej = ∅ for all i ≠ j.
Exhaustive Event
For a random experiment, let E1, E2, E3,….. En be the subsets of the sample space S.
E1, E2, E3, …, En form a set of Exhaustive events if
E1 ∪ E2 ∪ E3 ∪ ..... ∪ En = S
A set of events E1, E2, E3, …, En of S are said to be mutually exclusive and exhaustive events if
Ei ∩ Ej = ∅ for all i ≠ j and E1 ∪ E2 ∪ E3 ∪ ... ∪ En = S
Independent Events
Events are said to be independent if the occurrence of one event does not affect the occurrence of
others.
Let A and B be two events defined on sample space S. Events A and B are said to be independent if
P(A ∩ B) = P(A) P(B)
Note:
If A and B are independent, then
i) A^c and B^c are independent
ii) A^c and B are independent
iii) A and B^c are independent
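The note can be checked on the experiment of tossing two coins. The sketch below assumes the standard product rule for independence, P(A ∩ B) = P(A) P(B), and assumes that the four sample points are equally likely, so that the probability of an event is the number of its sample points divided by 4. The events A ("first coin shows a head") and B ("second coin shows a head") and all function names are illustrative choices.

```python
from fractions import Fraction
from itertools import product

# Sample space for tossing two coins: HH, HT, TH, TT (assumed equally likely).
S = {"".join(pair) for pair in product("HT", repeat=2)}

def prob(event):
    """Probability of an event under the equally-likely assumption."""
    return Fraction(len(event), len(S))

def independent(x, y):
    """Product rule: X and Y are independent if P(X and Y) = P(X) * P(Y)."""
    return prob(x & y) == prob(x) * prob(y)

A = {s for s in S if s[0] == "H"}   # first coin shows a head
B = {s for s in S if s[1] == "H"}   # second coin shows a head
A_c, B_c = S - A, S - B             # complements of A and B

checks = {
    "A and B": independent(A, B),
    "A^c and B^c": independent(A_c, B_c),
    "A^c and B": independent(A_c, B),
    "A and B^c": independent(A, B_c),
}
print(checks)
```

Exact fractions are used instead of floating-point numbers so that the equality test in the product rule is not affected by rounding.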
Probability of an Event
So far, we have introduced the sample space of an experiment and used it to describe events. In this
section, we introduce probabilities associated with the events.
Let S be the sample space associated with the random experiment. Further, let S be finite and equally-
likely, i.e. let there be n (finite) number of sample points in S and let each one of them be equally
likely. Let A be the event defined on S then, probability of occurrence of event A is denoted by P(A)
and is given by
$$P(A) = \frac{m}{n} \qquad \ldots (i)$$
Where, m is the number of outcomes favorable for the occurrence of the event A.
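Formula (i) can be sketched in Python for the die experiment, where the sample space is finite and the six outcomes are taken to be equally likely. The function name `probability` is an illustrative assumption, and exact fractions are used so that m/n is not rounded.

```python
from fractions import Fraction

def probability(event, sample_space):
    """P(A) = m / n for a finite sample space with equally likely outcomes."""
    m = len(event & sample_space)   # number of outcomes favorable to the event
    n = len(sample_space)           # total number of sample points
    return Fraction(m, n)

S = {1, 2, 3, 4, 5, 6}              # tossing a die
A = {1, 3, 5}                       # odd number on the uppermost face

print(probability(A, S))            # 1/2
print(probability(set(), S))        # 0, the null event
print(probability(S, S))            # 1, the sure event
print(probability(S - A, S) == 1 - probability(A, S))  # probability of the complement
```

Because 0 ≤ m ≤ n, the value returned always lies between 0 and 1.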
Note 1: 0≤ P(A) ≤ 1 as 0 ≤ m ≤ n
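Let S be the sample space of an experiment. Probability is a function which associates a non-negative
real number P(A) to every event A of the sample space, satisfying the following
axioms:
1. P(A) ≥ 0 for every event A
2. P(S) = 1
3. If A and B are mutually exclusive events, then P(A ∪ B) = P(A) + P(B)
From these axioms it follows that
P(A^c) = 1 - P(A)
P(∅) = 0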
Addition Rule of Probability
Exercise
1. A sample of 500 respondents was selected in a large metropolitan area in order to determine
various information concerning consumer behavior. Among the questions asked was "Do you
enjoy shopping for clothing?" Of 240 males, 136 answered yes. Of 260 females, 224 answered
yes. What is the probability that the respondent chosen at random
i. Is a male?
ii. Enjoys shopping for clothing?
iii. Is a female?
iv. Does not enjoy shopping for clothing?
2. A five digit number is to be formed by digits 1,2,3,4 and 5 without repetition. What is the
probability that the number is divisible by 4?
3. What is the probability that a leap year will have 52 Tuesdays?
4. Two friends A and B apply for two vacancies at the same post. The chances of their
selection are 0.25 and 0.20 respectively. What is the chance that
i. One of them will be selected?
ii. Both will be selected?
iii. None of them will be selected?
5. Probability that a man will be alive 25 years hence is 0.3 and the probability that his wife will be
alive 25 years hence is 0.4. Find the probability that 25 years hence
i. Both will be alive?
ii. Only the man will be alive?
iii. Only the woman will be alive?
iv. At least one of them will be alive?
6. One bag contains 5 red and 7 black balls and the other 3 red and 12 black balls. A ball is drawn
at random from either of the bags. What is the chance that the selected ball is black?
7. According to a survey, the probability that a family owns two cars if their annual income is
greater than Rs. 8 lakh is 0.75. Of the households surveyed, 60 per cent had income over Rs.
8 lakh and 52 per cent had two cars. What is the probability that a family has two cars and an
income over Rs. 8 lakh a year?
8. The chance that a person stopping at a petrol pump will get his vehicle’s tyres checked is 0.12,
the chance that he will get the oil checked is 0.29 and the chance that he will get both checked
is 0.07.
i. What is the chance that a person will have neither his tyres nor oil checked?
ii. What is the probability that a person who has his oil checked will also have tyres
checked?
9. It is known that 15 per cent of the males and 10 per cent of the females in a town having equal
number of them are unemployed. A person is selected at random from the town.
What is the probability that
i. A person is employed?
ii. A person is male given that he is employed?
10. A certain company encourages its employees to participate in cricket and hockey. A survey
indicates that 40% play cricket, 50% play hockey and 25% play both cricket and hockey. Find
the probability that
i. An employee plays only hockey?
ii. An employee plays only cricket?
iii. An employee takes part in at least one of the games, cricket and hockey?
Note:
Four chapters together with four exercises have been given in the material for the purpose of self-
study. Make sure that you go through the entire material. Evaluation will be conducted on this part
immediately after you join the course.