
Statistics For Beginners 2024

S4 chemical engineering statistics

Uploaded by

Snenhlanhla

STATISTICS FOR BEGINNERS

Compiled by: Sfundo Gumede

MANGOSUTHU UNIVERSITY OF TECHNOLOGY


Dept. of Mathematical Sciences
1. Terminology and Sampling Methods

1.1.Statistical Definitions

• Data/Data set is a set of “values” collected or obtained when gathering information
on some characteristic of interest.

• Statistics refers to the methodology used in drawing meaningful information from a
data set.

• Variable is a characteristic or attribute that can assume different values.

• Descriptive Statistics refers to methods of collecting, organizing, summarizing and
presenting data.

• Statistical Inference refers to methods of generalizing from samples to populations.

• Population is a collection of all subjects possessing a common characteristic that is
being studied.

• Census is a study where every member (element) of the population is included.

• Sample is a subgroup or subset of the population.

• Statistic is a measure that describes the characteristic of the sample.

• Parameter is a measure that describes the characteristic of the population.

• The number of values in the sample (sample size) is denoted by n. The number of
values in the population (population size) is denoted by N.

1.2. Data type and Measurement scales

• Qualitative variables are variables that assume non-numerical values.

Examples

➢ The faculty of study at MUT (Engineering, Natural Sciences, Management)


➢ The grade (Fail, Supp, Pass, Distinction) obtained in an examination.

• Nominal scale is a level of measurement which classifies data into categories in
which no order or ranking can be imposed on the data.

• A variable can be treated as nominal when its values represent categories with no
intrinsic ranking.

Examples
➢ Gender
➢ Region of residence
➢ Religious affiliation.

• Ordinal scale – Level of measurement which classifies data into categories that can
be ordered or ranked.
• A variable can be treated as ordinal when its values represent categories with some
intrinsic order or ranking.
Examples

➢ Levels of service satisfaction from very dissatisfied to very satisfied.


➢ Attitude scores representing degree of satisfaction or confidence and preference
rating scores (low, medium or high).
➢ Likert scale responses to statements (strongly agree, agree, neutral, disagree, strongly
disagree).

• Quantitative variables are variables which assume numerical values.

• Quantitative variables can be classified as discrete or continuous.

• Discrete variables are variables that can assume a finite or countable number of
possible values. Such variables are usually obtained by counting.

Examples

➢ The number of cars parked in a parking lot.

➢ The number of students attending a statistics lecture.

➢ A person’s response (agree, not agree) to a statement. A one (1) is recorded when the
person agrees with the statement, a zero (0) is recorded when a person does not agree.

• Continuous variables are variables that can assume any value within an interval, so
the number of possible values is uncountably infinite. Such variables are usually
obtained by measurement.

Examples

➢ The body temperature of a person.

➢ The weight of a person.

➢ The height of a tree.

• Interval scale is a level of measurement which classifies data that can be ordered and
ranked and where differences are meaningful. However, there is no meaningful zero
and ratios are meaningless.
• Interval data is generated mainly from rating scales, which are used in survey
questionnaires to measure respondents’ attitudes, motivations, preferences and
perceptions.

• Ratio scale is the level of measurement where differences and ratios are meaningful
and there is a natural zero.

• Variables like height, weight, mark (in test) and speed are ratio variables.

• Experiment is the process of observing some phenomenon that occurs.

• An experiment can be observational or designed.

• A designed experiment can be controlled to a certain extent by the experimenter.

• An observational study is not controlled by the experimenter. The characteristic of
interest is simply observed and the results recorded.

• Parameter is a characteristic or measure of description obtained from a population.

Examples

➢ Mean (average) age of all employees working at a certain company.

➢ The proportion of all registered female voters in a certain country.

• Statistic is a characteristic or measure of description obtained from a sample.

Examples

➢ The mean (average) monthly salary of 50 selected employees in a certain government
department.
➢ The proportion of smokers in a sample of 60 university students.

1.3. Sampling methods

• When selecting a sample, the main objective is to ensure that it is as representative as
possible of the population it is drawn from. When a sample fails to achieve this
objective, it is said to be biased.


Sampling frame (synonyms: "sample frame", "survey frame") – This is the actual set
of units from which a sample is drawn.
Example
Consider a survey aimed at establishing the number of potential customers for a new
service in a certain city. The research team has drawn 1000 numbers at random from a
telephone directory for the city, made 200 calls each day from Monday to Friday from
8am to 5pm and asked some questions.

➢ In this example, the population of interest is all the inhabitants in the city. The
sampling frame includes only those city dwellers that satisfy all the following
conditions:

1) They have a telephone.
2) The telephone number is included in the directory.
3) They are likely to be at home from 8am to 5pm from Monday to Friday.
4) They do not refuse to answer telephone surveys.

• The sampling frame in this case definitely differs from the population. For example, it
underrepresents people who have no telephone (e.g. the poorest), who have an unlisted
number, who were not at home at the time of the calls (e.g. employed people), or who do
not like to participate in telephone interviews (e.g. busier and more active people).
Such differences between the sampling frame and the population of interest are a main
cause of bias when drawing conclusions based on the sample.

• Probability samples – Samples drawn according to the laws of chance.

• These include simple random sampling, systematic sampling and stratified
random sampling.

➢ Simple random sampling – Sampling in which each sample of a given size that can
be drawn will have the same chance of being drawn.

• Most of the theory in statistical inference is based on random sampling being
used.

Examples

1) The 6 winning numbers (drawn from 49 numbers) in a Lotto draw. Each potential sample
of 6 winning numbers has the same chance of being drawn.

2) Each name in a telephone directory could be numbered sequentially. If the sample size was
to include 2 000 people, then 2 000 numbers could be randomly generated by computer or
numbers could be picked out of a hat. These numbers could then be matched to names in the
telephone directory, thereby providing a list of 2 000 people.

A random sample can be selected by using a table of random numbers.

Solutions

Suppose the first 6 random numbers in the table of random numbers are:
10480, 22368, 24130, 42167, 37570, 77921.

Use these numbers to select the 6 winning numbers in a Lotto draw.

The 49 numbers from which the draw is made all involve 2 digits i.e. 01, 02, . . . , 49.
Putting the above numbers from the table of random numbers next to each other in a string of
digits gives: 10 48 02 23 68 24 13 04 21 67 37 57 07 79 21 .

The winning numbers can be selected by taking all pairs of digits between 01 and 49
(discarding any numbers outside this range and any repeats), working either from left to
right or from right to left in the above string.

By working from left to right the winning numbers are: 10, 48, 2, 23, 24 and 13. By working
from right to left (and discarding the repeated 21) the winning numbers are: 21, 7, 37, 4,
13 and 24.
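
The selection procedure described above can be sketched in Python. This is only an illustration under the rule stated in the text (keep pairs from 01 to 49, discard out-of-range pairs and repeats); the function name `pick_numbers` and its parameters are hypothetical and not part of the original notes.

```python
def pick_numbers(random_numbers, how_many=6, low=1, high=49, reverse=False):
    """Select distinct two-digit numbers in [low, high] from random-table values."""
    # Join the table entries into one string of digits and split it into pairs.
    digits = "".join(random_numbers)
    pairs = [int(digits[i:i + 2]) for i in range(0, len(digits) - 1, 2)]
    if reverse:
        pairs.reverse()  # work through the pairs from right to left
    chosen = []
    for value in pairs:
        # Keep a pair only if it is in range and has not been chosen already.
        if low <= value <= high and value not in chosen:
            chosen.append(value)
        if len(chosen) == how_many:
            break
    return chosen


table_values = ["10480", "22368", "24130", "42167", "37570", "77921"]
left_to_right = pick_numbers(table_values)
right_to_left = pick_numbers(table_values, reverse=True)
print(left_to_right)
print(right_to_left)
```

If the string of digits runs out before six numbers are found, the function simply returns fewer numbers; in practice one would continue with the next entries of the random number table.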

• The advantage of simple random sampling is that it is simple and easy to apply
when small populations are involved. However, because every person or item in a
population has to be listed before the corresponding random numbers can be read,
this method is very cumbersome to use for large populations and cannot be used
if no list of the population items is available. It can also be very time consuming
to try and locate every person included in the sample. There is also a possibility
that some of the persons in the sample cannot be contacted at all.

➢ Systematic sampling – Sampling in which data is obtained by selecting every kth
object, where k is approximately $N/n$.

Examples

1) A manufacturer might decide to select every 20th item on a production line to test for
defects and quality. This technique requires the first item to be selected at random as a
starting point for testing and, thereafter, every 20th item is chosen.

2) A market researcher might select every 10th person who enters a particular store, after
selecting a person at random as a starting point; or interview occupants of every 5th house in
a street, after selecting a house at random as a starting point.

3) A systematic sample of 500 students is to be selected from a university with an enrolled
population of 10 000. In this case the population size N = 10 000 and the sample size
n = 500. Then every 10 000/500 = 20th student will be included in the sample.
The first student in the sample can be randomly selected from the first 20 names on an
alphabetical list of students and thereafter every 20th student can be selected until 500
names have been obtained.
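
A minimal sketch of systematic sampling in Python, assuming the population is available as a list and that the random start is chosen among the first k items. The function name `systematic_sample` and the `seed` parameter are illustrative assumptions, not part of the original notes.

```python
import random


def systematic_sample(population, n, seed=None):
    """Select every k-th item after a random start, with k = N // n."""
    N = len(population)
    k = N // n                      # sampling interval, approximately N/n
    rng = random.Random(seed)
    start = rng.randrange(k)        # random start among the first k items
    return [population[start + j * k] for j in range(n)]


# Hypothetical student numbers 1..10000, as in the university example.
students = list(range(1, 10001))
sample = systematic_sample(students, 500, seed=1)
print(sample[:5])
```

With N = 10 000 and n = 500 the interval is k = 20, so consecutive selected student numbers always differ by 20.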

➢ Stratified random sampling – Sampling in which the population is divided into
groups (called strata) according to some characteristic. Each of these strata is then
sampled using random sampling.

• A general problem with random sampling is that you could, by chance, miss out a
particular group in the sample. However, if you subdivide the population into
groups, and sample from each group, you can make sure the sample is
representative. Some examples of strata commonly used are those according to
province, age and gender. Other strata may be according to religion, academic
ability or marital status.

Example

In a study investigating the expenditure pattern of consumers, they were divided into low,
medium and high income groups.

Income group    Percentage of population
low             40
medium          45
high            15

A stratified sample of 500 consumers is to be selected for this study.

When sampling is proportional to size (an income group comprises the same percentage of
the sample as of the population) the sample sizes for the strata should be calculated as
follows.

low : 0.4 × 500 = 200
medium : 0.45 × 500 = 225
high : 0.15 × 500 = 75
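
The proportional allocation above can be reproduced with a short Python sketch. The function name `proportional_allocation` is a hypothetical helper; rounding to whole numbers is an assumption that happens to be exact for this example.

```python
def proportional_allocation(percentages, sample_size):
    """Allocate a sample to strata in proportion to their population percentages."""
    # Each stratum receives (percentage / 100) * sample_size units, rounded to a whole number.
    return {group: round(pct * sample_size / 100) for group, pct in percentages.items()}


income_groups = {"low": 40, "medium": 45, "high": 15}
allocation = proportional_allocation(income_groups, 500)
print(allocation)
```

For percentages that do not divide the sample size evenly, the rounded stratum sizes may not add up exactly to the total, and a small manual adjustment would then be needed.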

➢ Cluster sampling – this is a sampling method in which you divide a population into
groups (clusters) such as districts or schools, and then randomly select some of these
clusters as your sample.

NB: In stratified random sampling, a random sample is drawn from every stratum, whereas
in cluster sampling only the randomly selected clusters are sampled.

• Examples of Non-probability sampling methods:

➢ Convenience sampling.
➢ Quota sampling.
➢ Snowball sampling.
➢ Judgement sampling.

2. Displaying and Summarizing Data

• Frequency distribution – here we illustrate how to construct a frequency distribution
(table) to classify or group numerical data.
• The frequency distribution can be constructed by following six basic steps:

➢ Determine the smallest unit of measurement (um). This is the smallest unit
in which data can be measured.
➢ Determine the range (R). This is the difference between the maximum value
and the minimum value in the data set.
➢ Determine the number of classes (k). We use Sturges’ rule:
$k = 1 + 3.3\log_{10} n$. Always round up to the next whole number.
➢ Determine the class length (l). We divide the range by the number of classes
and round up to the nearest unit of measurement.
➢ Determine the class boundaries (cb) to determine the minimum and
maximum values that the observations in each class will be. For the first class:
▪ Lower class boundary $= x_{min} - \frac{1}{2}(um)$
▪ Add the class length to obtain the upper boundary for the first class,
which will be the lower boundary for the second class, and so on…
➢ Determine the class frequencies. The class frequencies are the number of
observations that belong to each class.
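
The steps above can be sketched in Python as follows. The data set is hypothetical (whole numbers, so the unit of measurement is 1), and the function name `frequency_distribution` is an illustrative assumption. The extra loop that adds a class when the last boundary does not exceed the maximum value is also an assumption, added so that no observation is left out.

```python
import math


def frequency_distribution(data, unit):
    """Build class boundaries and frequencies following the steps in the text."""
    n = len(data)
    data_range = max(data) - min(data)               # range R
    k = math.ceil(1 + 3.3 * math.log10(n))           # Sturges' rule, rounded up
    # Class length: range / k, rounded up to the nearest unit of measurement.
    length = math.ceil(round(data_range / k / unit, 9)) * unit
    lower = min(data) - unit / 2                     # first lower class boundary
    boundaries = [lower + i * length for i in range(k + 1)]
    # Assumption: add one more class if the maximum is not yet covered.
    while boundaries[-1] <= max(data):
        boundaries.append(boundaries[-1] + length)
    frequencies = [
        sum(1 for x in data if boundaries[i] <= x < boundaries[i + 1])
        for i in range(len(boundaries) - 1)
    ]
    return boundaries, frequencies


# Hypothetical data measured to the nearest whole number (unit = 1).
marks = [12, 15, 20, 22, 25, 27, 30, 31, 35, 40]
boundaries, frequencies = frequency_distribution(marks, unit=1)
for lo, hi, f in zip(boundaries, boundaries[1:], frequencies):
    print(f"{lo} -< {hi}: {f}")
```

Because the first boundary lies half a unit below the minimum, no observation falls exactly on a class boundary, so each value belongs to exactly one class.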

Exercise

Data below represent the waiting times (in minutes) of a sample of 15
customers at the Mega City Nedbank branch before they receive service:

4.21 5.55 3.02 5.13 4.77 2.34 3.54 3.20 4.5 6.1
0.38 5.12 6.46 6.19 3.79

Arrange this data into a frequency distribution table (clearly show the class
boundaries and the frequencies)

2.1. Graphical display of Frequency Distribution

• Histogram
➢ Class boundaries on the horizontal axis.
➢ Class frequencies on the vertical axis.
➢ Bars are constructed on the horizontal axis from one class boundary to the
next with a height equal to the frequency of the respective class.

• Frequency polygon

➢ Find the class midpoints for each class.


➢ Mark the midpoints on the horizontal axis.
➢ Frequencies on the vertical axis.
➢ Subtract the class width from the first class midpoint and mark a point on the
horizontal axis.
➢ Add the class width to the last class midpoint and mark a point on the horizontal axis.
➢ Connect the points by a line.

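• Ogive

➢ Class boundaries on the horizontal axis.
➢ Cumulative frequencies on the vertical axis.
➢ Plot each cumulative frequency above the upper boundary of its class, starting from
zero at the lower boundary of the first class, and join the points with a line.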
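
As an illustration of the construction rules above, the following Python sketch computes the points that would be plotted for a frequency polygon and an ogive. The frequency table is hypothetical and the variable names are assumptions; no plotting library is used, only the coordinates are produced.

```python
from itertools import accumulate

# Hypothetical frequency table: class boundaries and class frequencies.
boundaries = [11.5, 17.5, 23.5, 29.5, 35.5, 41.5]
frequencies = [2, 2, 2, 3, 1]

width = boundaries[1] - boundaries[0]
midpoints = [(lo + hi) / 2 for lo, hi in zip(boundaries, boundaries[1:])]

# Frequency polygon: midpoints against frequencies, closed on the horizontal
# axis one class width before the first and after the last midpoint.
polygon_x = [midpoints[0] - width] + midpoints + [midpoints[-1] + width]
polygon_y = [0] + frequencies + [0]

# Ogive: cumulative frequencies against class boundaries, starting at zero.
ogive_x = boundaries
ogive_y = [0] + list(accumulate(frequencies))

print(list(zip(polygon_x, polygon_y)))
print(list(zip(ogive_x, ogive_y)))
```

The last cumulative frequency equals the number of observations, which is a quick check that the ogive has been completed correctly.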

2.2. Descriptive summary measures that measure Location

• Three characteristics are commonly used to describe the data profile of a variable.
These are:
o measures of location (both central and non-central)
o measures of spread (or dispersion)
o a measure of shape (skewness).

• Location refers to where the data values are concentrated.
• Central location is a representative ‘middle’ value of concentration of the data, while
non-central location measures identify relevant ‘off-centre’ reference points in the
data set (such as quartiles).
• Dispersion refers to the extent to which the data values are spread about the central
location value.
• Finally, skewness identifies the shape (or degree of symmetry) of the data values
about the central location measure.

➢ To illustrate, an electronic goods company has recorded the daily sales (in
rand) over a 12-month trading period.
o The average daily sales are a measure of central location, while the extent
to which daily sales vary around the average daily sales would be a
measure of dispersion. Finally, a measure of skewness would identify
whether any very large or very small daily sales values relative to the
average daily sales have occurred over this period.

2.2.1. Central Location Measures

• A central location statistic is a single number that gives a sense of the ‘centrality’ of
data values in a sample.

• Researchers often make statements containing phrases such as:

➢ the average salary per job grade
➢ the most popular healthcare plan
➢ the average quantity of milk purchased by a household of four persons
➢ half of our employees spend less than R178 per month commuting to work
➢ the mean age of our employees is 45 years.

• These statements all refer to a typical or central data value used to represent where the
majority of data values lie. They are called central location statistics.
• Three commonly used central location statistics are: (a) the arithmetic mean (also called
the average), (b) the median (also called the second quartile, the middle quartile or the
50th percentile) and (c) the mode (or modal value).

• All three measures (mean, median and mode) can be used for numeric data, while
only the mode is valid for nominal categorical data (the median can also be used when
the categories are ordinal).

• Calculating the arithmetic mean:

➢ Raw data: $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$, where $x_i$ are the observations from the data set and
$n$ is the number of observations in the data set.

➢ Grouped data: $\bar{x} = \frac{1}{n}\sum_{i=1}^{k} f_i X_i$, where $f_i$ is the frequency of the $i$th class and
$X_1, X_2, \ldots, X_k$ are the class midpoints of the frequency distribution.

• Calculating the mode ($m_o$)

➢ Raw data:
o The mode is the value (observation) which appears the most in the data.
➢ Grouped data:
o We take the class with the highest frequency as the modal class and apply
the following formula: $m_o = O_{mo} + \frac{c(f_m - f_{m-1})}{2f_m - f_{m-1} - f_{m+1}}$
where $O_{mo}$ is the lower limit of the modal interval, $c$ is the class width, $f_m$ is
the frequency of the modal class, $f_{m-1}$ and $f_{m+1}$ are the frequencies of the classes
before and after the modal class, respectively.

• Calculating the median ($m_e$)

➢ Raw data:
o If $n$ is even: $m_e$ position $= \frac{1}{2}(n + 1)$ and $m_e = \frac{y_{n/2} + y_{(n/2)+1}}{2}$
o If $n$ is odd: $m_e$ position $= \frac{1}{2}(n + 1)$ and $m_e = y_{(n+1)/2}$
o Note that the data has to be arranged in ascending order before finding the
median position. Here $y_j$ denotes the $j$th value in the ordered data.

➢ Grouped data:
o First add the cumulative frequency column and find the median position $\frac{n}{2}$.
Then use the formula: $m_e = O_{me} + \frac{c\left(\frac{n}{2} - f(<)\right)}{f_{me}}$

where $O_{me}$ is the lower limit of the median interval, $f_{me}$ is the frequency of the
median interval, $c$ is the class width, $n$ is the number of observations and $f(<)$ is the
cumulative frequency for the class before the median interval.
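
The grouped-data formulas for the mean, mode and median can be checked with a short Python sketch. The frequency table below is hypothetical, the function names are illustrative assumptions, and a missing neighbouring class (when the modal class is the first or last class) is treated as having frequency zero, which is also an assumption.

```python
def grouped_mean(lower_limits, width, freqs):
    """Mean from class midpoints: (1/n) * sum(f_i * X_i)."""
    n = sum(freqs)
    midpoints = [lo + width / 2 for lo in lower_limits]
    return sum(f * x for f, x in zip(freqs, midpoints)) / n


def grouped_mode(lower_limits, width, freqs):
    """Mode: O_mo + c(f_m - f_{m-1}) / (2 f_m - f_{m-1} - f_{m+1})."""
    m = freqs.index(max(freqs))                       # modal class
    before = freqs[m - 1] if m > 0 else 0             # assumption: 0 if no class before
    after = freqs[m + 1] if m < len(freqs) - 1 else 0  # assumption: 0 if no class after
    return lower_limits[m] + width * (freqs[m] - before) / (2 * freqs[m] - before - after)


def grouped_median(lower_limits, width, freqs):
    """Median: O_me + c(n/2 - f(<)) / f_me."""
    n = sum(freqs)
    cumulative = 0                                    # f(<): cumulative frequency so far
    for lo, f in zip(lower_limits, freqs):
        if cumulative + f >= n / 2:                   # first class reaching position n/2
            return lo + width * (n / 2 - cumulative) / f
        cumulative += f


# Hypothetical table: classes 10-<20, 20-<30, 30-<40, 40-<50.
lower_limits = [10, 20, 30, 40]
freqs = [5, 12, 8, 5]
mean = grouped_mean(lower_limits, 10, freqs)
mode = grouped_mode(lower_limits, 10, freqs)
median = grouped_median(lower_limits, 10, freqs)
print(round(mean, 2), round(mode, 2), round(median, 2))
```

In this table the modal class and the median class are both 20-<30, so the mode and the median both fall between 20 and 30, while the mean is pulled slightly higher by the upper classes.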

2.2.2 Percentiles for raw and grouped data

• The $i$th percentile ($P_i$) is a numerical value that separates the bottom $i$ percent of the
values in the data set from the top $(100 - i)$ percent. For example, the first quartile is
referred to as the 25th percentile as it separates the bottom 25% of the values in the data
set from the top 75% of the values.
• Calculating percentiles ($P_i$)
➢ Raw data:
We first arrange the data in ascending order and find the position of $P_i$ using the
operation $\frac{i}{100}(n + 1)$. Suppose that the position is a decimal number $a.b$, with
whole-number part $a$ and decimal part $0.b$.
o The percentile value is then calculated as follows:

$P_i = a\text{th value} + 0.b \times (\text{value after the } a\text{th value} - a\text{th value})$

➢ Grouped data:
We find the cumulative frequencies and the $P_i$ position, given by $\frac{i}{100}(n)$, to
identify the $P_i$ class, then use the formula:

$P_i = O_{pi} + \frac{c\left(\frac{in}{100} - f(<)\right)}{f_{pi}}$

where $O_{pi}$ is the lower limit of the percentile class, $f_{pi}$ is the frequency of the
percentile interval, $c$ is the class width and $f(<)$ is the cumulative frequency of the
classes before the percentile class.
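
Both percentile methods can be sketched in Python. The data and the frequency table are hypothetical, and the function names are illustrative. Returning the smallest or largest value when the position falls outside the data is an assumption, since the text does not cover that case.

```python
def percentile_raw(data, i):
    """i-th percentile of raw data using position (i/100)(n + 1)."""
    ordered = sorted(data)
    n = len(ordered)
    position = i * (n + 1) / 100
    a = int(position)               # whole-number part of the position
    fraction = position - a         # decimal part of the position
    if a < 1:                       # assumption: position before the first value
        return ordered[0]
    if a >= n:                      # assumption: position at or after the last value
        return ordered[-1]
    # a-th value plus the fraction of the gap to the next value.
    return ordered[a - 1] + fraction * (ordered[a] - ordered[a - 1])


def percentile_grouped(lower_limits, width, freqs, i):
    """i-th percentile of grouped data: O_pi + c(i*n/100 - f(<)) / f_pi."""
    n = sum(freqs)
    position = i * n / 100
    cumulative = 0                  # f(<): cumulative frequency so far
    for lo, f in zip(lower_limits, freqs):
        if cumulative + f >= position:
            return lo + width * (position - cumulative) / f
        cumulative += f


# Hypothetical raw data and frequency table.
values = [2, 4, 6, 8, 10, 12, 14, 16, 18]
p25 = percentile_raw(values, 25)
p40 = percentile_raw(values, 40)
p25_grouped = percentile_grouped([10, 20, 30, 40], 10, [5, 12, 8, 5], 25)
print(p25, p40, round(p25_grouped, 2))
```

For the raw data, the 25th percentile position is 0.25 × 10 = 2.5, so the value lies halfway between the 2nd and 3rd ordered observations.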

2.3. Descriptive summary statistics that measure Dispersion

• The degree to which numerical data spread about the average value is called
variation or spread. There are various measures that are used to measure dispersion,
the most common ones are:
➢ Range (R)
➢ Interquartile range (IQR)
➢ Standard deviation ($s$)
➢ Variance ($s^2$)
➢ Coefficient of variation (CV)

• The range is the difference between the highest and the lowest value in the data set as
discussed earlier. The range is usually only calculated for the ungrouped data.
• The interquartile range is the difference between the upper quartile (𝑄3 /𝑃75 ) and the
lower quartile (𝑄1 /𝑃25 ), that is : 𝐼𝑄𝑅 = 𝑄3 − 𝑄1 . The interquartile range is also
usually calculated for the ungrouped data.
• The standard deviation is the most commonly used measure of spread about the mean. It
has the advantage that it uses all the data values in its calculation. It is calculated as
follows:

➢ Raw data:

$s = \sqrt{\frac{1}{n-1}\left(\sum_{i=1}^{n} x_i^2 - n\bar{x}^2\right)}$

where $x_i$ are the observations in the data set and $\bar{x}$ is the arithmetic mean.

o We can also calculate the standard deviation using a scientific calculator.

➢ Grouped data:

$s = \sqrt{\frac{1}{n-1}\left(\sum_{i=1}^{k} f_i X_i^2 - n\bar{x}^2\right)}$

where $f_i$ are the class frequencies and $X_i$ are the class midpoints.

• For the variance, we square the standard deviation.
• The standard deviations cannot be compared directly if the units of measurement are
not the same. To obtain a measure of variation that is unit-free, we express the
standard deviation as a percentage of the mean. This measure is called the coefficient
of variation, calculated as: $CV = \frac{s}{\bar{x}} \times 100\%$
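
A minimal Python sketch of the raw-data formulas for the standard deviation, variance and coefficient of variation. The data values are hypothetical and the function names are illustrative assumptions.

```python
import math


def sample_std(data):
    """Sample standard deviation: sqrt((sum(x_i^2) - n * mean^2) / (n - 1))."""
    n = len(data)
    mean = sum(data) / n
    sum_sq = sum(x * x for x in data)
    return math.sqrt((sum_sq - n * mean ** 2) / (n - 1))


def coefficient_of_variation(data):
    """Standard deviation expressed as a percentage of the mean."""
    mean = sum(data) / len(data)
    return sample_std(data) / mean * 100


# Hypothetical data: n = 8, mean = 5, sum of squares = 232.
values = [2, 4, 4, 4, 5, 5, 7, 9]
s = sample_std(values)
variance = s ** 2
cv = coefficient_of_variation(values)
print(round(s, 3), round(variance, 3), round(cv, 1))
```

Here the sum of squares minus n times the squared mean is 232 − 8 × 25 = 32, so the variance is 32/7 and the standard deviation is its square root.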

2.4 Measures of Skewness

• Skewness describes the shape of a unimodal histogram for numeric data.


• Three common shapes of a unimodal histogram can generally be observed:
➢ symmetrical shapes
➢ positively skewed shapes (skewed to the right)
➢ negatively skewed shapes (skewed to the left).
• It is important to know the shape of the histogram because it affects the choice of
central location and dispersion measures to describe the data, and may distort
statistical findings generated from inferential techniques.

Symmetrical Distribution

• A histogram is symmetrical if it has a single central peak and mirror image slopes on
either side of the centre position. The bell-shaped curve of a normal distribution is the
best-known example.
• If a distribution is symmetrical, all three central location measures (mean, median and
mode) will be equal and therefore any one of them could be chosen to represent the
central location measure for the sample data.

Positively Skewed Distribution

• A histogram is positively skewed (or skewed to the right) when there are a few
extremely large data values (outliers) relative to the other data values in the sample.
• A positively skewed distribution will have a ‘long’ tail to the right.
• The mean is most influenced (‘inflated and distorted’) by the few extremely large data
values and hence will lie furthest to the right of the mode and the median. The
median is therefore preferred as the representative measure of central location in
right-skewed distributions.

Negatively Skewed Distribution

• A histogram is negatively skewed (or skewed to the left) when there are a few
extremely small data values (outliers) relative to the other data values in the sample.
• A negatively skewed distribution will have a ‘long’ tail to the left.
• The mean, again, will be most influenced (‘deflated and distorted’) by the few
extremely small data values and hence will lie furthest to the left of the mode and the
median. The median is therefore preferred as the representative measure of central
location in left-skewed distributions.

The Box Plot

• A complete profile of a numeric random variable can be summarised in terms of five
descriptive statistical measures known as the five-number summary table and
displayed graphically as a box plot.
• The five-number summary table for a numerical variable consists of the:
➢ Minimum data value ($x_{min}$)
➢ Lower quartile ($Q_1$)
➢ Median ($M$)
➢ Upper quartile ($Q_3$)
➢ Maximum data value ($x_{max}$)

• From the box plot, it is easy to read the range of the data (between the minimum
and the maximum data values) and the spread of data values between the quartiles
and median. It also highlights the degree of skewness in the data.

• Follow these steps to construct a box plot:

➢ On a horizontal number line, construct a box between the lower and upper quartile
numeric positions.
➢ Mark the median inside the box at its numeric value position on the number line.
➢ Draw a horizontal line from the minimum value position to the Q1 position. This is
called the lower whisker.
➢ Draw another horizontal line from the Q3 position to the maximum value position.
This is called the upper whisker.

• An outlier (or extreme value) is any data value that lies either
➢ below a lower limit of $Q_1 - 1.5(Q_3 - Q_1)$
➢ or above an upper limit of $Q_3 + 1.5(Q_3 - Q_1)$

Example
Construct a box plot based on the following five-number summary table values (in kWh):
Minimum = 33, first quartile = 43, Median = 47, third quartile = 54 and Maximum = 61.

• Follow these steps to observe skewness in a box plot:

➢ If a box plot is symmetrical about the median (i.e. the quartiles, minimum and
maximum values are equidistant from the median in both directions), no
skewness exists.
➢ If the upper whisker and the box are ‘stretched’ at the upper end of the box
plot, then the histogram is positively skewed with a few extremely large
values causing the skewness.
➢ If the lower whisker and the box are ‘stretched’ at the lower end of the box
plot, then the histogram is negatively skewed with a few extremely small
values causing the skewness.
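
The outlier limits can be computed for the five-number summary in the example (Q1 = 43, Q3 = 54, minimum 33, maximum 61). This is a small Python sketch; the function name `outlier_limits` is an illustrative assumption.

```python
def outlier_limits(q1, q3):
    """Lower and upper outlier limits: Q1 - 1.5*IQR and Q3 + 1.5*IQR."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


lower_limit, upper_limit = outlier_limits(43, 54)
# The minimum (33) and maximum (61) are the most extreme values, so checking
# them is enough to decide whether any outliers exist.
has_outliers = 33 < lower_limit or 61 > upper_limit
print(lower_limit, upper_limit, has_outliers)
```

With an interquartile range of 11, the limits are 26.5 and 70.5, so neither the minimum nor the maximum in this example is an outlier.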

EXERCISES

1. A chemical engineer measures the pH levels of 20 samples of a chemical solution.
The pH levels are:
6.8 7.2 6.9 7.0 6.7 7.1 6.8 7.3 6.6 7.2 6.9 7.0 6.8 7.1 6.7 7.4
6.5 7.3 6.8 7.2
Use this data to calculate:
1.1 the mean pH level.
1.2 the standard deviation of the pH levels.
1.3 the variance.
1.4 the coefficient of variation.
1.5 the first, second and third quartiles (interpret your answers).
1.6 the interquartile range.
1.7 the number of outliers.
1.8 the 40th percentile of the pH levels.
1.9 the 70th percentile (interpret your answer).
2. Represent the data above on a box-and-whisker diagram and hence comment on the
skewness of the data.

3 Consider the data set below:

7.42 6.29 5.83 6.50 8.34 9.51 7.10 6.80 5.90


4.89 6.50 5.52 7.90 8.30 9.60

Summarize the data above using a frequency distribution table (clearly
show the class boundaries and the frequencies). Show all your workings.

4. The frequency distribution table below shows annual salaries of 90 randomly
selected chemical engineering interns in KwaZulu-Natal.

Salaries (Rands)      Frequency ($f_i$)
36000 −< 38000        18
38000 −< 40000        15
40000 −< 42000        12
42000 −< 44000        10
44000 −< 46000        16
46000 −< 48000        19

4.1 Use the frequency distribution to calculate the:

4.1.1 mean.

4.1.2. mode.

4.1.3. lower quartile.

4.1.4. the upper quartile.

4.1.5. the 70th percentile

4.1.6. the standard deviation.

4.1.7. Use the frequency distribution to construct a frequency polygon.

4.1.8. Hence, or otherwise comment on skewness of this data.

4.1.9. Use the frequency distribution to construct a histogram.

4.1.10. Use the frequency distribution to construct an ogive.

4.1.11. Use the ogive to estimate the upper quartile.

5 Repeat all the exercises in Question 4 above using the table below:

The frequency distribution table below shows the heights (in centimetres)
of selected tomato trees in a vegetable farm in Vryheid.

Heights           Frequencies
30.5 −< 33.5      15
33.5 −< 36.5      7
36.5 −< 39.5      23
39.5 −< 42.5      30
42.5 −< 45.5      15
45.5 −< 48.5      13

3. BASIC PROBABILITY CONCEPTS
3.1 Introduction
In most cases in real life, decisions are made under conditions of uncertainty. Probability
theory provides the foundation for quantifying and measuring uncertainty. It is used to
estimate the reliability in making inferences from samples to populations, as well as to
quantify the uncertainty of future events. It is therefore necessary to understand the basic
concepts and laws of probability to be able to manage uncertainty.
• A probability is the chance, or likelihood, that a particular event will occur.

These are examples of events representing typical probability-type questions:


• What is the likelihood that a task will be completed within 45 minutes?
• How likely is it that a product will fail within its guarantee period?
• What is the chance of a telesales consultant making a sale on a call?

3.2. Random experiments, sample spaces and trials


• A random experiment is a procedure whose outcome in a particular performance or
trial cannot be predetermined.

Although we cannot foretell what the outcome of any single repetition of the experiment
will be, we must be able to list the set of all possible outcomes of the experiment. In
general, random experiments must be capable, in theory at least, of indefinite repetition.
It must also be possible to observe the outcome of each repetition of the experiment.
• The set of all possible outcomes of a random experiment is called the sample space of
the random experiment.

We usually use the letter S to denote the sample space. Each repetition of the procedure
for the random experiment is called a trial, and gives rise to one and only one of the
possible outcomes.
The following are examples of random experiments and their sample spaces:
• We toss a coin. We can list the set of possible outcomes: S = {heads,tails}. We can
repeat the experiment endlessly, and we can observe the result of every trial.

• A phone number is chosen at random. The number is dialled, and the person who
answers is asked whether he/she is currently watching television. If the telephone is
unanswered after 45 seconds, the outcome, “no reply”, is recorded. The set of
possible outcomes, the sample space, is

S = {yes, no, won’t say, number engaged, no reply}.


• A light bulb is allowed to burn until it burns out. The lifetime of the bulb is recorded.
The possible outcomes are the set of non-negative real numbers (i.e. the set of
positive numbers plus zero, since the bulb might not burn at all). The sample space is
thus
S = {t | t ≥ 0}.

• A die is thrown out onto the table. The dots on the upturned face are counted. The
sample space is
S = {1, 2, 3, 4, 5, 6}.

• In a survey of traffic passing a particular point on the N2 North bound, a time period
of one minute is chosen at random, and the number of vehicles that pass the point in
the minute is counted. The possible outcomes are the non-negative integers (the whole
numbers, including zero), therefore
S = {0, 1, 2, 3, ...}.
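
For experiments made up of several simple steps, the sample space can be listed systematically. The Python sketch below uses a hypothetical compound experiment that is not one of the examples above: a coin is tossed and then a die is thrown. The variable names are illustrative.

```python
from itertools import product

# Outcomes of the two simple experiments.
coin = ["heads", "tails"]
die = [1, 2, 3, 4, 5, 6]

# Each outcome of the compound experiment is a (coin, die) pair.
sample_space = list(product(coin, die))
print(len(sample_space))
print(sample_space[:3])
```

The number of outcomes is the product of the sizes of the individual sample spaces, here 2 × 6 = 12.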

Sample Spaces: Problems


1 Write down the sample space for each of the following random experiments
a. Flip a coin once
b. Flip a coin twice
c. Roll a die twice
d. People joining aerobics classes will be checked one after the other to see if
they have been diagnosed with diabetes ( D ) or not ( N ). Suppose that the
checking is to be continued until one non-diabetic person is found or
four people have been checked.
___________________________________________________________________________

3.3. Calculating Probabilities


Probabilities are broadly of two types: subjective or objective.
• Where the probability of an event occurring is based on an educated guess, expert
opinion or just plain intuition, it is referred to as a subjective probability. Subjective
probabilities cannot be statistically verified and are not used extensively in statistical
analysis.
• Alternatively, when the probability of an event occurring can be verified statistically
through surveys or empirical observations, it is referred to as an objective probability.
This type of probability is used extensively in statistical analysis.

Mathematically, a probability is defined as the ratio of two numbers:

$P(A) = \frac{r}{n}$

where:
• A = event of a specific type (or with specific properties)
• r = number of outcomes of event A
• n = total number of all possible outcomes (i.e. the number of outcomes in the sample
space)
• P(A) = probability of event A occurring

3.3.1 Properties of a Probability


There are five basic properties that apply to every probability:
• A probability value lies only between 0 and 1 inclusive (i.e. 0 ≤ P(A) ≤ 1).
• If an event A cannot occur (i.e. an impossible event), then P(A) = 0.
• If an event A is certain to occur (i.e. a certain event), then P(A) = 1.
• The sum of the probabilities of all possible events (i.e. the collectively exhaustive set
of mutually exclusive events) equals one, i.e. $P(A_1) + P(A_2) + \cdots + P(A_k) = 1$, for
k possible events.

For example, if cash, cheque, debit card or credit card (i.e. k = 4) are the only
possible payment methods (events) for groceries, then for a randomly selected
grocery purchase, the probability that a customer pays by either cash, cheque, debit
card or credit card is: P(𝐴1 = cash) + P(𝐴2 = cheque) + P(𝐴3 = debit card) + P(𝐴4 =
credit card) = 1.
• Complementary probability: If P(A) is the probability of event A occurring, then the
probability of event A not occurring is defined as P(𝐴̅) = 1 − P(A). For example, if
there is a 7% chance that a part is defective, then P(a defective part) = 0.07 and P(not
a defective part) = 1 − 0.07 = 0.93.
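
The properties above can be checked numerically. The sketch below uses made-up probabilities for the four payment methods (the notes do not give actual values) together with the 7% defective-part figure from the text.

```python
import math

# Hypothetical probabilities for the four payment methods (not from the notes).
payment_probs = {"cash": 0.35, "cheque": 0.05, "debit card": 0.40, "credit card": 0.20}

# Every probability must lie between 0 and 1 inclusive.
all_in_range = all(0 <= p <= 1 for p in payment_probs.values())

# The probabilities of all possible events must sum to 1
# (isclose guards against floating-point rounding).
sums_to_one = math.isclose(sum(payment_probs.values()), 1.0)

# Complement rule, using the 7% defective-part figure from the text.
p_defective = 0.07
p_not_defective = 1 - p_defective

print(all_in_range, sums_to_one)    # True True
print(round(p_not_defective, 2))    # 0.93
```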

3.3.2. Joint and Conditional probabilities


• A joint probability is the probability that both event A and event B will occur
simultaneously on a single trial of a random experiment, denoted 𝑃(𝐴 ∩ 𝐵).
• A conditional probability is the probability of event A occurring, given that event B
has already occurred. It is written as 𝑃(𝐴|𝐵).
• In formula terms, a conditional probability is defined as follows:

$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}$$

• The essential feature of the conditional probability is that the sample space is reduced
to the set of outcomes associated with the given prior event B only. The prior
information (i.e. event B) can change the likelihood of event A occurring.
• If two events are mutually exclusive, they cannot occur together in a single trial of a
random experiment. If events A and B are mutually exclusive, then 𝑃(𝐴 ∩ 𝐵) = 0.

• The addition rule relates to the union of events. It is used to find the probability of
either event A or event B, or both events occurring simultaneously, in a single trial of
a random experiment. It takes one of two forms:

➢ one for non-mutually exclusive events, and
➢ one for mutually exclusive events.
• If two events are not mutually exclusive, they can occur together in a single trial of a
random experiment. Then the probability of either event A or event B or both
occurring in a single trial of a random experiment is defined as:

𝑃(𝐴 ∪ 𝐵) = 𝑃(𝐴) + 𝑃(𝐵) − 𝑃(𝐴 ∩ 𝐵)


• If two events are mutually exclusive, then the probability of either event A or event B
(but not both) occurring in a single trial of a random experiment is defined as:

𝑃(𝐴 ∪ 𝐵) = 𝑃(𝐴) + 𝑃(𝐵)

• Multiplication rule for Statistically Independent events: If two events A and B are
statistically independent (i.e. there is no association between the two events) then the
multiplication rule reduces to the product of the two marginal probabilities only.

𝑃(𝐴 ∩ 𝐵) = 𝑃(𝐴) × 𝑃(𝐵)
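
The rules above can be illustrated with a small sketch. The experiment (one roll of a fair die) and the two events are hypothetical choices made for illustration; exact fractions are used so that the independence check is not affected by rounding.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}   # hypothetical experiment: one roll of a fair die
A = {2, 4, 6}                       # event A: an even number
B = {4, 5, 6}                       # event B: a number greater than 3

def prob(event):
    """P(event) for equally likely outcomes."""
    return Fraction(len(event), len(sample_space))

p_joint = prob(A & B)                        # joint probability P(A and B)
p_union = prob(A) + prob(B) - p_joint        # addition rule (non-mutually exclusive events)
p_a_given_b = p_joint / prob(B)              # conditional probability P(A | B)
independent = p_joint == prob(A) * prob(B)   # multiplication-rule check for independence

print(p_joint, p_union, p_a_given_b, independent)   # 1/3 2/3 2/3 False
```

Here P(A|B) = 2/3 differs from P(A) = 1/2, so knowing that B occurred changes the likelihood of A, which is why the independence check fails.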


Exercise: Suppose that the Engineering Council of South Africa classifies its registered
members according to Gender (Male or Female) and whether they
are registered as Technicians or Engineers. From 1650 registered members,
1155 are males and the rest are females. They also found that only 40% of
the registered female members are Engineers; the rest are Technicians.
Suppose that only 20% of all the registered members are Engineers.

1 Calculate the probability that a randomly selected member is:

1.1 a male.

1.2 not an engineer.

1.3 a male and an engineer.

1.4 a male or an engineer.

1.5 neither a male nor an engineer.

1.6 a male, given that he is an engineer.

1.7 a technician, given that the member is a female.

1.8 a male but not an engineer.

1.9 a male or a female.

2 Determine whether the events that a randomly selected member is a female and that the
member is an engineer are statistically independent.

More Exercises

1 Let 𝐴 and 𝐵 be events in a sample space 𝑆 such that 𝑃(𝐴) = 0.5
and 𝑃(𝐵) = 0.2. Use relevant calculations to show that it is not
possible that 𝑃(𝐴 ∪ 𝐵) = 0.8.

2. Let 𝐴 and 𝐵 be events in a sample space 𝑆. Show that it is not possible
that 𝑃(𝐴 ∩ 𝐵) = 0.6 and 𝑃(𝐴|𝐵) = 0.4.

3. An apple cooperative in Elgin, Western Cape receives and groups apples into
A, B, C and D grades for packaging and export. In a batch of 1 500 apples,
795 were found to be grade A, 410 were grade B, 106 were grade C and the
rest grade D. If an apple is selected at random from the batch, what is the
likelihood that it is neither of grade B nor D?

4 Let 𝐴 and 𝐵 be events in a sample space 𝑆 such that 𝑃(𝐴̅ ) = 0.7
and 𝑃(𝐵) = 0.4.

4.1 If 𝐴 and 𝐵 are mutually exclusive, find 𝑃(𝐴 ∪ 𝐵).

4.2 If 𝐴 and 𝐵 are statistically independent, find 𝑃(𝐴 ∪ 𝐵).

4.3 If 𝐴 and 𝐵 are statistically independent, find 𝑃(𝐴|𝐵̅ ).

5 Let 𝐴 and 𝐵 be events in a sample space 𝑆 such that 𝑃(𝐴̅ ) = 0.6, 𝑃(𝐵) = 𝑥
and 𝑃(𝐴 ∪ 𝐵) = 0.5.

5.1 Calculate the numerical value of 𝑥 if 𝐴 and 𝐵 are statistically independent.

5.2 Calculate the numerical value of 𝑥 if 𝐴 and 𝐵 are mutually exclusive.

5.3 Calculate $P(\overline{A \cup B})$.

5.4 Calculate 𝑃(𝐵|𝐴̅ ) if 𝐴 and 𝐵 are mutually exclusive.

6 Let 𝐴 and 𝐵 be events in a sample space, 𝑆. Let 𝑃(𝐴) = 0.35, 𝑃(𝐵) = 0.6
and 𝑃(𝐴|𝐵) =0.52.

6.1. Find 𝑃(𝐴 ∪ 𝐵).

6.2. Find 𝑃(𝐴|𝐵̅).

6.3. Are 𝐴 and 𝐵 mutually exclusive? Substantiate your answer.

6.4. Are 𝐴 and 𝐵 statistically independent? Substantiate your answer.

7. In a study conducted by the Department of Agriculture at MUT, graduates who
graduated in the past decade are classified according to their major (Animal
Production or Crop Production). They are also classified according to their
gender (Female or Male). Suppose that there are 200 graduates whose major
was Animal Production and 300 whose major was Crop Production. Of all
graduates, 240 were females and 100 of these majored in Animal Production.
Calculate the percentage of graduates who are

7.1. neither male nor Crop Production majors.

7.2. male, given that they majored in Animal Production.

7.3. neither female nor Animal Production majors.

8. The two-way table below shows the IQ rating as well as the creativity rating of 250
individuals in a psychological study.

                     Low IQ    High IQ    Total
Low Creativity         75         x        105
High Creativity        20         y         z
Total                   w        155       250

8.1. Calculate the numerical values of w, x, y and z.

8.2. Find the probability that a randomly selected individual from this study
will be classified as:

8.2.1. having a low IQ.

8.2.2. having a low IQ, given that she has high creativity.

8.2.3. having a high IQ and high creativity.

8.2.4. having a low IQ or high creativity or both.

8.3. Let A be an event that a randomly selected individual has a low IQ and let
B be an event that a randomly selected individual has high creativity.

8.3.1. Are A and B mutually exclusive? Substantiate.

8.3.2. Are A and B statistically independent? Show all workings.


__________________________________________________________________________

3.4. Bayes’ Theorem
A very useful tool for finding conditional probabilities is Bayes’ theorem, which connects
P(B|A) with P(A|B). It is named in honour of Rev. Thomas Bayes, who did pioneering work in
probability theory in the 1700s.

Bayes’ Theorem: If A and B are two events, then

$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B \mid A)\,P(A) + P(B \mid \bar{A})\,P(\bar{A})}$$

Example: The miners are out on strike, with a list of demands. Negotiators reckon that if
management meets one of the demands, the probability that the strike will end is 0.85. But if
this demand is not met, the probability that the strike will end is 0.08. You assess the
probability that management will agree to meet the demand as 0.3. Later you hear that the
strike has ended. What is the probability that demand was met?

• Let A = the demand was met.


• Let B = strike has ended.
• 𝑃(𝐵|𝐴) = 0.85, 𝑃(𝐴) = 0.3 and 𝑃(𝐵|𝐴̅ ) = 0.08

We can then substitute this into the formula to obtain 𝑃(𝐴|𝐵).
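
The substitution can be sketched in Python as follows. The function name `bayes_two_events` is illustrative only; the three input probabilities are those given in the example, and P(not A) is obtained from the complement rule.

```python
def bayes_two_events(p_b_given_a, p_a, p_b_given_not_a):
    """Return P(A | B) for an event A and its complement, using Bayes' theorem."""
    p_not_a = 1 - p_a                                    # complement rule
    numerator = p_b_given_a * p_a                        # P(B | A) * P(A)
    denominator = numerator + p_b_given_not_a * p_not_a  # total probability of B
    return numerator / denominator

# Strike example: A = the demand was met, B = the strike has ended.
p_demand_met_given_ended = bayes_two_events(p_b_given_a=0.85, p_a=0.3, p_b_given_not_a=0.08)
print(round(p_demand_met_given_ended, 4))   # 0.8199
```

That is, P(A|B) = 0.255 / (0.255 + 0.056) = 0.255 / 0.311, or roughly 0.82. Hearing that the strike has ended raises the assessed probability that the demand was met from 0.3 to about 0.82.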

Bayes’ Theorem can be extended as follows:

Suppose that 𝐴1, 𝐴2, …, 𝐴𝑛 are mutually exclusive events whose union is the sample space 𝑆,
with 𝑃(𝐴𝑖) > 0 for every 𝑖. Then, for any event 𝐵 with 𝑃(𝐵) > 0 and any 𝑘 in {1, 2, …, 𝑛}, we have

$$P(A_k \mid B) = \frac{P(B \mid A_k)\,P(A_k)}{\sum_{i=1}^{n} P(B \mid A_i)\,P(A_i)}$$

Example

A family has two dogs (Rex and Rover) and a cat called Garfield. None of them is fond of the
postman. If they are outside, the probabilities that Rex, Rover and Garfield will attack the
postman are 30%, 40% and 15%, respectively. Only one is outside at a time, with
probabilities 10%, 20% and 70%, respectively. If the postman is attacked, what is the
probability that Garfield was the culprit?
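
One way to work through this example is sketched below. The helper `bayes_posteriors` is a hypothetical name; it applies the extended form of the theorem to the probabilities stated in the example (which animal is outside, and how likely each is to attack).

```python
def bayes_posteriors(priors, likelihoods):
    """Return P(A_k | B) for every k, given P(A_k) and P(B | A_k)."""
    joint = {k: priors[k] * likelihoods[k] for k in priors}   # P(B | A_k) * P(A_k)
    total = sum(joint.values())                               # P(B), by total probability
    return {k: joint[k] / total for k in joint}

# P(animal is outside) and P(attack | animal is outside), as given in the example.
priors = {"Rex": 0.10, "Rover": 0.20, "Garfield": 0.70}
likelihoods = {"Rex": 0.30, "Rover": 0.40, "Garfield": 0.15}

posteriors = bayes_posteriors(priors, likelihoods)
print(round(posteriors["Garfield"], 4))   # 0.4884
```

The denominator is 0.03 + 0.08 + 0.105 = 0.215, so P(Garfield | attack) = 0.105 / 0.215, or roughly 0.49.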

Exercises

1 The probability that a student passes Statistics is 0.8 if he studies for the exam and 0.3
if he does not study. If 60% of the class studied for the exam, and a student chosen at
random from the class passes, what is the probability that he did not study?

2 The probability that a cancer test will detect the disease in a person who has cancer is
0.98. The probability that a person who does not have cancer will give a positive
reading on the test is 0.1 (i.e. the test says he has the disease even though he has not).
If 1 per cent of the population has cancer, what is the probability that a person
selected at random will in fact have cancer, given that he shows a positive reading on
the cancer test?

3 The probability that twins are identical is 0.7. Identical twins are always of the same
sex, while non-identical twins are of the same sex with probability 0.5. What is the
probability that twin boys are identical twins?

4 An assembler of electric fans uses motors from two sources. Company A supplies 90%
of the motors and Company B supplies the other 10% of the motors. Suppose that it is
known that 5% of the motors supplied by Company A are defective and 3% of the
motors supplied by Company B are defective. An assembled fan is found to have a
defective motor. What is the probability that this motor was supplied by Company B?

___________________________________________________________________________

3.5. Counting Rules in Probability

• It is usually impractical to list and to count all the elementary events contained in the
sample space or in the event of interest.
• The theory of combinations and permutations frequently comes to the rescue, and
enables the number of elementary events contained in sample spaces and events to be
determined quite easily. This theory is summarized in a series of “counting rules”
given later.

3.5.1 Permutations of n objects


Recall that a set is just a group of objects, and that the order in which the objects are listed is
irrelevant. We now consider the number of different ways all the objects in a set may be
arranged in order.
• A set containing n distinguishable objects has

n × (n−1) × ··· × 3 × 2 × 1 = n! (“n factorial”)

different orderings of the objects belonging to the set.
• We can see this by thinking in terms of having n slots to fill with the n objects in the
set. Each slot can hold one object.
• We can choose any object for the first slot in n ways; there are then n−1 objects
available for the second slot, so we can select an object for the second slot in n−1
ways, leaving n−2 objects available for the third slot, and so on, until the last remaining
object has to be placed in the final slot.
• We say that there are n! distinct arrangements (technically, we call each arrangement
or ordering a permutation) of the n objects in the set.

Example: If the set A = {1,2,3}, list all the possible permutations. There are 3! = 3×2×1 = 6
distinct arrangements of the objects in A.
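
The six arrangements can be listed with Python's standard `itertools.permutations`, as a quick sketch of this example:

```python
from itertools import permutations
from math import factorial

A = [1, 2, 3]
arrangements = list(permutations(A))   # every ordering of the three objects

for arrangement in arrangements:
    print(arrangement)
# (1, 2, 3)
# (1, 3, 2)
# (2, 1, 3)
# (2, 3, 1)
# (3, 1, 2)
# (3, 2, 1)

print(len(arrangements) == factorial(len(A)))   # True
```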

3.5.2 Permutations of n objects taken r at a time


Suppose now that we have a set containing n objects, and that we have r slots to fill, where
0 < r ≤ n. In how many ways can we do this, assuming that each object is “used up” once it
is allocated to a slot?
• We number the slots from 1 to r and fill each in turn. We can choose any of the n
objects to fill the first slot. Having filled the first slot there are n−1 objects available,
any of which may be chosen for the second slot. Therefore, the first two slots can be
filled in n(n−1) ways. The first three slots can be filled in n(n−1)(n−2) ways, and so
on…
• There are

$${}^{n}P_{r} = \frac{n!}{(n-r)!}$$

ways of ordering r elements taken from a set containing n elements using each
element at most once.

• Note that 𝑛𝑃𝑟 can be found directly from the calculator (without using factorial
notation) by pressing n, then SHIFT, then the multiplication sign followed by r and
then equal to.
• Note that we are
(a) choosing r objects and
(b) arranging them.
• We are here involved in two processes, choosing and arranging. The number of ways
of choosing and arranging r objects out of n distinguishable objects is called the
number of permutations of n objects taken r at a time and is denoted by 𝑛𝑃𝑟 (“n
permutation r”). This formula is also valid for r = n if we adopt the convention that
0! = 1.
Example: How many different pictures of three people (a rearrangement of the same people is
considered a different picture) are possible if 10 people are present?
Solution: This is the same as asking for the number of permutations of 10 objects taken 3 at a
time, given by 10P3 = 10 × 9 × 8 = 720.

Example: Suppose that 19 political parties contested an election. How many different ways
can the top four political parties be lined up?
Solution: This is equivalent to asking: “How many permutations of 19 objects taken 4 at a
time are there?” The answer is 19P4 = 93024.
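
As an alternative to the calculator, the two examples above can be checked with a short Python sketch. The helper `n_p_r` is an illustrative name that applies the factorial formula directly; `math.perm` is the equivalent standard-library function (available from Python 3.8).

```python
from math import factorial, perm

def n_p_r(n, r):
    """Number of permutations of n objects taken r at a time: n! / (n - r)!."""
    return factorial(n) // factorial(n - r)

print(n_p_r(10, 3))   # 720
print(n_p_r(19, 4))   # 93024
print(perm(19, 4))    # 93024, the built-in gives the same result
```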

3.5.3 Combinations of n objects taken r at a time


• Now suppose we want merely to count the number of ways of choosing r elements
out of the n elements in our set without regard to the arrangement of the chosen
elements.
• We call this the number of combinations of n objects taken r at a time, and denote it
by the symbol 𝑛𝐶𝑟 (“n combination r”).
• To find the value of 𝑛𝐶𝑟 using a calculator, we press n, then SHIFT, then the
division sign, followed by r and then the equals key.

Example: In how many ways can a 9-man work team be formed from 15 men?
Solution: The problem asks only for the number of ways of choosing 9 men out of 15,
which is 15C9 = 5005.
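
The notes do not state a formula for 𝑛𝐶𝑟. The sketch below assumes the standard result that each combination of r objects can be arranged in r! ways, so that 𝑛𝐶𝑟 = 𝑛𝑃𝑟 / r! = n! / (r!(n − r)!). The helper name `n_c_r` is illustrative; `math.comb` is the standard-library equivalent (Python 3.8 onwards).

```python
from math import comb, factorial, perm

def n_c_r(n, r):
    """Number of combinations of n objects taken r at a time: n! / (r! (n - r)!)."""
    return factorial(n) // (factorial(r) * factorial(n - r))

print(n_c_r(15, 9))                 # 5005
print(comb(15, 9))                  # 5005, the built-in gives the same result
print(perm(15, 9) // factorial(9))  # 5005, permutations divided by the r! orderings
```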
Exercise: From 8 accountants and 5 computer programmers, in how many ways can one
select a committee of
(a) 5 people.
(b) 3 accountants and 2 computer programmers?
(c) 5 people, subject to the condition that the committee contains at least two computer
programmers and at least two accountants?

3.5.4 Permutations, with repetitions


• We now suppose that we have n types of objects and r slots, and that we have at least
r objects of each type available. We can thus fill the first slot with any of the n types
of objects, there are still n types of objects available for the second slot, ... Because
there are at least r objects of each type, there are still objects of each of the n types
available for the final, rth slot. Thus the number of permutations of n types of objects
taken r at a time, allowing repetitions, is $n \times n \times \cdots \times n = n^{r}$.

Example: How many four digit numbers can be made from the 10 digits from 0 to 9, if
repetitions are permitted?
Solution: We have four slots to fill. But because all of the 10 digits remain available to fill
every slot, this can be done in 10 × 10 × 10 × 10 = 10000 ways. This makes sense, because
there are 10 000 numbers from 0 (actually 0 000) to 9 999.
Example: How many four letter words can be made with a 26-letter alphabet — including all
nonsense words?
Solution: 26 × 26 × 26 × 26 = 456976

Example: It is proposed to adopt a system of motor car number plates which uses three
letters of the alphabet (excluding I and O) followed by three digits. How many number plates
are possible?
Solution: The number of possible number plates is 24 × 24 × 24 × 10 × 10 × 10 = 13 824 000.
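
The three examples above can be verified with a few lines of Python. The function name `permutations_with_repetition` is illustrative only.

```python
def permutations_with_repetition(n, r):
    """Number of ways to fill r slots from n types of objects, repetition allowed."""
    return n ** r

print(permutations_with_repetition(10, 4))   # 10000 four-digit numbers
print(permutations_with_repetition(26, 4))   # 456976 four-letter words

# Number plates: three letters (24 choices each) followed by three digits (10 choices each).
plates = permutations_with_repetition(24, 3) * permutations_with_repetition(10, 3)
print(plates)                                # 13824000
```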

Exercises
1. Determine whether each of the following situations would require calculating
a permutation or combination:

1.1. Selecting a treasurer, president and a vice president from a council of six
members.
1.2. Assigning different visitors’ cars to different parking bays.
1.3. Selecting five students to attend a State of the Nation Address.

2. There are 6 doors to a lecture hall. How many ways can a lecturer enter a
lecture hall through one door and leave a hall through a different door?

3. If seven friends were asked to line up for a groupie with the owner of the cell
phone in the centre, how many distinguishable photos are possible?

4. Suppose that a lotto player plays a single ticket. What is the probability of getting
all the numbers correct?

5. In how many different ways can the letters of the word ‘SURVEYING’
be arranged if:

5.1. letters may be arranged in any order?
5.2. words must start with V and end with Y?
5.3. vowels must be together?

7. A six-digit number plate has to be made using the digits 0; 1; 2; …; 9. If repetition of
digits is allowed, what is the probability that a number plate made:

7.1. is an even number?
7.2. does not start with a zero?

8. All telephone numbers at MUT start with 031907 followed by a four digit number.
Using the digits 0;1;2;…..9, determine the total number of distinct telephone numbers
if:

8.1. repetition is not allowed in the last four digits.


8.2. repetition of digits is allowed and the last three digits are 888.
8.3. repetition of digits is allowed and the last three digits are the same.

9. In how many ways can the letters of the word “SIPHESIHLE” be arranged?

10. Suppose that a student, whose name is Cyril would like to create a 10-character
password for his email account. He decided to use five English letters (A-Z)
followed by five digits (0-9). How many possible passwords can he create if:

10.1 the first five characters are his name and, for the following five characters,
repetition of digits is allowed but the last digit of the password cannot be zero.

10.2 the first three characters will be letters from the name of his boyfriend, Xolani,
in any order (without repetition of letters), the next two characters will be vowels
(allowing repetition) and the last five characters will be odd numbers (not
allowing repetition).

11. A class consists of 12 girls and 10 boys. If a committee of five students is selected
at random, what is the probability that the committee consists of:

11.1. two girls?


11.2. at least four girls?
11.3. a maximum of two boys?
11.4. girls only?

12. Out of 20 tyres 3 are defective (you do not know which ones are defective). You
select four tyres. What is the probability that out of the four selected tyres:

12.1. none are defective?
12.2. at least two are defective?
___________________________________________________________________________

4. Introduction to Probability Distributions

4.1. Random variables

• We previously defined a sample space as the set consisting of all the elementary
events that are possible outcomes of a random experiment. Sometimes, we expressed.
• In order to manipulate the events defined on a sample space mathematically, it is
necessary to attach a numerical value to each elementary event.
• The motivation for assigning numbers to elementary events is that it clears the way for us
to develop a general mathematical theory for handling the probabilities of events in a
sample space.
• Once all the elementary events in a sample space have numerical values assigned to
them, we follow the classic algebraic tradition and let X “stand for” the numerical
values of the elementary events.
• We then call X a random variable. X is a variable because it can “take on” (or
assume) different values. X is a random variable because the particular value it takes
on depends on the outcome of a random experiment.
• By convention, statisticians use the capital letters near the end of the alphabet to
denote random variables. Their favourite choice is the letter X.

4.1.1. Discrete and continuous random variables

• Random variables fall into two categories — discrete and continuous. The
mathematical treatment of these two types of random variables is very different — as
you will learn later.
• Discrete random variables take on isolated values along the real line, usually (but by
no means always) integer values. Examples of integer-valued discrete random
variables are:
➢ the number of customers entering a store between 09h00 and 10h00
➢ the number of occupied tables at a restaurant
➢ the number of clients visited by a salesperson during a day
➢ the number of applicants who respond to a job advertisement.

• In contrast to discrete random variables, a continuous random variable can
(conceptually, at least) be measured to any degree of accuracy.
• The set of all possible values of a continuous random variable is usually an interval of
the real line. Examples of continuous random variables are:

➢ the distance a car travels on one litre of petrol
➢ the volume of milk consumed by a child on a particular morning
➢ the time that a customer waits in the queue at a fast food outlet

4.1.2. Discrete Probability distributions: Introduction

• A probability distribution is a list of all the possible outcomes of a random
variable and their associated probabilities of occurrence.

Exercises: Write down the probability distribution for each of the following
random experiments and the associated random variables
(a) Flip a coin twice and observe the number of heads
(b) Flip a coin three times and observe the number of tails.
(c) Roll a die once and observe the number of dots appearing.

• The distinction between discrete and continuous random variables is critical
because we develop different mathematical approaches for the two types of
random variable.
• We describe discrete random variables mathematically using probability mass
functions. Continuous random variables are described by probability density
functions.
• We adopt the convention of using 𝑝(𝑥) to denote a probability mass function and
𝑓(𝑥) for a probability density function.

4.1.3. The mean and variance of the discrete random variable.


• The mean and the variance of a discrete random variable are given by

$\mu = \sum x \cdot p(x)$ and $\sigma^{2} = \sum x^{2} \cdot p(x) - \mu^{2}$, respectively.
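
The two formulas can be applied in a short Python sketch. The probability mass function used here is hypothetical (it is not one of the exercises): X takes the values 0, 1 and 2 with probabilities 0.2, 0.5 and 0.3.

```python
from math import sqrt

# Hypothetical probability mass function: value -> probability.
pmf = {0: 0.2, 1: 0.5, 2: 0.3}

mean = sum(x * p for x, p in pmf.items())                    # sum of x * p(x)
variance = sum(x**2 * p for x, p in pmf.items()) - mean**2   # sum of x^2 * p(x), minus mean^2
std_dev = sqrt(variance)                                     # standard deviation

print(round(mean, 4), round(variance, 4), round(std_dev, 4))   # 1.1 0.49 0.7
```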

Exercise: Calculate the mean and the standard deviation for each of the random variables in
the previous exercise (number of heads, number of tails and the number of dots appearing).

Exercises

1 Check which of the following functions can serve as probability mass functions:

a. $p(x) = \frac{x}{6}$ for $x = 1, 2, 3$, and $p(x) = 0$ otherwise.

b. $p(x) = x$ for $x = \frac{1}{16}, \frac{3}{16}, \frac{1}{4}, \frac{1}{2}$.

c. $p(x) = \frac{x}{15}$ for $x = 1, 2, 3, 4, 5$.

2 Find the value of 𝑐 such that the function below is a probability mass function

𝑝(𝑥) = 𝑐𝑥 for 𝑥 = 0, 1, 2, 3, 4.

Hence, find
a. 𝑃(𝑋 ≥ 1)
b. 𝑃(0 < 𝑋 ≤ 3)
c. 𝑃(𝑋 ≤ 4)

3 Calculate the mean and the standard deviation of 𝑋 for each probability mass function
in question number 1.
___________________________________________________________________________

4.2. Binomial Probability Distribution

• A discrete random variable follows the binomial distribution if it satisfies the
following four conditions:

➢ The random variable is observed n times (this is equivalent to
drawing a sample of n objects and observing the random variable in each one).

➢ There are only two, mutually exclusive and collectively exhaustive, outcomes
associated with the random variable on each object in the sample. These two
outcomes are labelled success and failure (e.g. a product is defective or not
defective; an employee is absent or not absent from work; a consumer prefers
brand A or not brand A).

➢ Each outcome has an associated probability. The probability for the success
outcome is denoted by p. The probability for the failure outcome is denoted by
1 − p.

➢ The objects are assumed to be independent of each other (i.e. the outcome on any
object is not influenced by the outcome on any other object). This means that p is the
same (constant) for each of the n objects.
• If these four conditions are satisfied, then the following binomial question can be
addressed:

The Binomial Question

‘What is the probability that x successes will occur in a randomly drawn sample of n
objects?’
This probability can be calculated using the binomial probability distribution formula:

$$P(x) = {}^{n}C_{x} \cdot p^{x} \cdot (1 - p)^{n - x} \quad \text{for } x = 0, 1, 2, \ldots, n.$$

Where:
➢ n = the sample size, i.e. the number of independent trials (observations)
➢ x = the number of success outcomes in the n independently drawn objects
➢ p = probability of a success outcome on a single independent object
➢ (1 − p) = probability of a failure outcome on a single independent object
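
A minimal sketch of the binomial formula in Python is given below. The values n = 4 and p = 0.5 are hypothetical and chosen only for illustration (they are not taken from the exercises); `binomial_pmf` is an illustrative name, and `math.comb` supplies the combination count. The last line uses the standard binomial results mean = np and variance = np(1 − p).

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(x successes in n independent trials), each with success probability p."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

# Hypothetical illustration: n = 4 trials with success probability p = 0.5.
n, p = 4, 0.5
distribution = {x: binomial_pmf(x, n, p) for x in range(n + 1)}

print(distribution[2])              # 0.375
print(sum(distribution.values()))   # 1.0, the probabilities of all outcomes sum to one
print(n * p, n * p * (1 - p))       # 2.0 1.0 (mean and variance)
```

Cumulative questions such as "at most two successes" can be answered by adding the relevant terms, for example `sum(binomial_pmf(x, n, p) for x in range(3))`.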

Exercise: The Avis car hire company has a fleet of rental cars that includes the make Opel.
Experience has shown that one in four clients requests to hire an Opel. If five reservations are
randomly selected from today’s bookings,

(a) what is the probability that two clients will have requested an Opel?
(b) what is the probability that at most two clients will have requested an Opel?
(c) what is the probability that at least one client will have requested an Opel?
(d) what is the probability that three clients will not have requested an Opel?

• The mean and variance of the binomial random variable are given by $\mu = np$ and
$\sigma^{2} = np(1 - p)$, respectively.

Exercises

1. The South African Department of Health has reported that 30% of all goats
born in South Africa have been diagnosed with abscesses. If 10 goats born in
South Africa are randomly selected,

1.1 approximate the expected number of goats that will not be diagnosed with
this disease.

1.2 calculate the probability that six of these will not be diagnosed with
abscesses.

1.3 calculate the probability that at most two of these goats will be diagnosed
with this disease.

2. A merchandiser makes a weekly visit to each of the six stores for which she is
responsible. Experience has shown that there is a one-in-five chance that a given store
will run out of stock before the merchandiser’s weekly visit.

2.1. What is the probability that, on a given weekly round, the merchandiser
will find exactly one store out of stock?

2.2 What is the probability that, at most, two stores will be out of stock?

2.3 What is the probability that a minimum of two stores will be out of stock?

2.4 What is the mean number of stores out of stock each week?

3. A marketing manager makes the statement that the long-run probability that a
customer would prefer the deluxe model to the standard model of a product is 30%.

3.1. What is the probability that exactly three in a random sample of 10 customers will
prefer the deluxe model?

3.2. What is the probability that more than two in a random sample of 10 customers will
prefer the standard model?

3.3. In a random sample of 10 customers, calculate the standard deviation of the number
of customers who prefer the standard model.

4.3. Poisson Probability Distribution

• A Poisson process is also a discrete process.


• A Poisson process measures the number of occurrences of a particular outcome of a
discrete random variable in a predetermined time, space or volume interval for which
an average number of occurrences of the outcome is known or can be determined.
• These are examples of a Poisson process:
➢ the number of breakdowns of a machine in an eight-hour shift
➢ the number of cars arriving at a parking garage in a one-hour time interval
➢ the number of sales made by a telesales person in a week
➢ the number of problems identified at the end of a construction project
➢ the number of particles of chlorine in one litre of pool water
➢ the number of typing errors on a page of a newspaper.

In each case, the number of occurrences of a given outcome of the random variable, 𝑥,
can take on any integer value from 0, 1, 2, 3, … up to infinity (in theory).

The Poisson Question

‘What is the probability of x occurrences of a given outcome being observed in a
predetermined time, space or volume interval?’
• The Poisson question can be answered by applying the Poisson probability
distribution formula:

$$P(x) = \frac{e^{-a} \cdot a^{x}}{x!} \quad \text{for } x = 0, 1, 2, \ldots$$
Where:
➢ a = the mean number of occurrences of a given outcome of the random variable
for a predetermined time, space or volume interval
➢ e = a mathematical constant (approximately 2.71828).
➢ x = number of occurrences of a given outcome for which a probability is required.

• The mean and the variance are $\mu = \sigma^{2} = a$.
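
The Poisson formula can be sketched in Python as follows. The mean a = 2 is a hypothetical value chosen for illustration (it is not taken from the exercises), and `poisson_pmf` is an illustrative name.

```python
from math import exp, factorial

def poisson_pmf(x, a):
    """P(x occurrences) when the mean number of occurrences per interval is a."""
    return exp(-a) * a**x / factorial(x)

a = 2   # hypothetical mean number of occurrences per interval

print(round(poisson_pmf(0, a), 4))   # 0.1353
print(round(poisson_pmf(3, a), 4))   # 0.1804

# "At least one occurrence" is most easily found from the complement rule.
p_at_least_one = 1 - poisson_pmf(0, a)
print(round(p_at_least_one, 4))      # 0.8647
```

Because x has no upper limit, "at least" questions are usually answered with the complement rule, as in the last step, rather than by adding infinitely many terms.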

Exercises

1 A farm that produces pineapples in Hluhluwe receives an average of eight
orders in a 16-day interval. Suppose that the number of orders that the farm
receives follows a Poisson distribution.

1.1 Calculate the standard deviation of the number of orders that the farm receives
in an eight-day interval.

1.2 Calculate the probability that in a given four-day interval, the farm will
receive three orders.

1.3 Calculate the probability that in a given 16-day interval, the farm will
receive at least two orders.

2. An ice-cream vendor’s sales follow a Poisson distribution with an average
rate of 10 per hour.

2.1. What is the probability that he sells more than one ice-cream in his first
hour of operation?

2.2. What is the average number of ice-creams he sells in 3 days?

3. A company that supplies ready-mix concrete receives, on average, six orders per day.

3.1 What is the probability that, on a given day:

3.1.1 only one order will be received?

3.1.2 no more than three orders will be received?

3.1.3 at least three orders will be received?

3.1.4 What is the probability that, on a given half-day, only one order will be received?

3.2 What are the mean and standard deviation of the number of orders received per day?

4. The number of tubes of toothpaste purchased by a typical family is a random variable
having a Poisson distribution with an average of 1.8 tubes per month.

4.1 What is the probability that a typical family will purchase at least three
tubes of toothpaste in any given month?

4.2 What is the likelihood that a typical family will purchase less than four tubes of
toothpaste in any given month?
__________________________________________________________________________

4.4. Normal Probability Distribution

• The normal probability distribution is continuous and has the following properties:

➢ The curve is bell-shaped.


➢ It is symmetrical about a central mean value, µ.
➢ The tails of the curve never touch the x-axis, meaning that there is always a
non-zero probability associated with every value in the problem domain (i.e.
asymptotic).
➢ The distribution is always described by two parameters: a mean (µ) and a
standard deviation (σ).
➢ The total area under the curve will always equal one, since it represents the total
sample space. Because of symmetry, the area under the curve below µ is 0.5,
and above µ is also 0.5.
➢ The probability associated with a particular interval of x-values is defined by
the area under the normal distribution curve between the limits of 𝑥1 and 𝑥2 .

• To find the probability that 𝑥 lies between 𝑥1 and 𝑥2 , it is necessary to find the area
under the bell-shaped curve between these x-limits.
• This is done by converting the x-limits into limits that correspond to another normal
distribution called the standard normal distribution (or z-distribution as it is commonly
called) for which areas have already been worked out. These areas are given in a
statistical table.
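
The conversion step can be sketched in Python. The notes do not write out the conversion formula; the sketch assumes the standard transformation z = (x − µ)/σ and uses `statistics.NormalDist` from the Python standard library (Python 3.8 onwards) in place of the statistical table. The values µ = 50, σ = 10 and the limits 40 and 60 are hypothetical.

```python
from statistics import NormalDist

mu, sigma = 50, 10     # hypothetical mean and standard deviation
x1, x2 = 40, 60        # hypothetical x-limits

# Convert the x-limits to z-limits on the standard normal distribution.
z1 = (x1 - mu) / sigma
z2 = (x2 - mu) / sigma

standard_normal = NormalDist()   # mean 0, standard deviation 1
p_between = standard_normal.cdf(z2) - standard_normal.cdf(z1)   # area between z1 and z2

print(z1, z2)                # -1.0 1.0
print(round(p_between, 4))   # 0.6827

# The same area computed directly on the original scale, as a cross-check.
x_dist = NormalDist(mu, sigma)
p_direct = x_dist.cdf(x2) - x_dist.cdf(x1)
```

For questions that ask for an x-value given a probability, `NormalDist.inv_cdf` plays the role of reading the table in reverse.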

Exercises

1 The manager of a local gym has determined that the length of time patrons spend at the
gym is a normally distributed variable with a mean of 80 minutes and a standard
deviation of 20 minutes.

1.1 What proportion of patrons spend more than two hours at the gym?

1.2 What proportion of patrons spend less than one hour at the gym?

1.3 What is the least amount of time spent by 60% of patrons at the gym?

2 The lifetime of a certain type of automatic washing machine is normally distributed
with mean and standard deviation equal to 3.1 and 1.1 years respectively.

2.1 If this type of washing machine is guaranteed for one year, what percentage of original
sales will require replacement within the guarantee period?

2.2 What percentage of these washing machines is likely to be operating after
three years?

2.3 What percentage of these washing machines is likely to be operating after
4 years?

3 Telemarketers for Clientele Life in the Durban branch spend an average
of R60 per day calling their potential clients. Assume that the daily amounts
spent by telemarketers are normally distributed with a variance of R100.

3.1. Calculate the probability that a randomly selected telemarketer uses daily:

3.2. more than R80 calling potential clients.

3.3. between R44 and R59 calling potential clients.

3.4. What is the minimum amount of airtime spent by 95% of the telemarketers?

