Unit 1 - Intro (FLUID MECHANICS)
UNIT 1
Ebenezer Asabre
June, 2023.
Definition
Statistics is the science of conducting studies to collect, organize,
summarize, analyze, and draw conclusions from data.
Branches of Statistics
Depending on how data are used, statistics can be put into two areas:
1 Descriptive Statistics
2 Inferential Statistics
Descriptive Statistics
Descriptive statistics consists of the collection, organization,
summarization and presentation of data. It presents data in tabular and
graphical forms, and through numerical measures such as percentages,
mean, standard deviation, etc., in an informative way.
Inferential Statistics
Inferential or inductive statistics is concerned with drawing conclusions
about a whole population of interest on the basis of samples taken from it.
For example, the Department of Electrical/Electronic Engineering may be
contracted by the ECG to produce energy-saving light bulbs. In assessing
the quality of the bulbs, a sample of the bulbs could be sent to individuals
to use after production and, based on their responses, the whole production
can be commercialised.
Types of Data/Variables
Data can be put in two types depending on its nature:
Qualitative data/variable
Quantitative data/variable
Qualitative Variable
These are variables that can be placed into distinct categories according
to some characteristic or attribute. Examples are gender, grade of a student
in the Probability & Statistics course, religious/political affiliation, etc.
Quantitative Variable
This represents data that are numerical and can be ordered or ranked.
Examples are weight, height, temperature, etc. Quantitative variables can
be further divided into discrete and continuous types.
Exercise 2
Identify each variable as either discrete or continuous, giving
a reason for your answer.
1 The number of blue balls selected randomly from a box.
2 The average speed of a moving vehicle.
3 Weight loss after a series of aerobic exercises.
4 The number shown after a die is tossed once.
Scale of Measurement
Scale of measurement refers to the way in which variables are defined and
categorized. Each scale of measurement has certain properties, which in
turn determine which statistical analyses are appropriate.
The four scales of measurement are nominal, ordinal, interval and ratio.
Sampling Techniques
Random Sampling: This is the sampling technique where all members
of the population have an equal chance of being selected into
the sample.
Systematic Sampling
Researchers obtain systematic samples by numbering each subject of the
population and then selecting every kth subject. For example, suppose
there are 100 subjects in the population and a sample of 20 subjects is
needed. Then $k = \frac{100}{20} = 5$.
This means that after selecting the first subject randomly (from subjects
1 to 5), every 5th subject is selected. If subject 3 is the first subject
selected, then the sample consists of the subjects numbered 3, 8, 13, etc.
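The selection rule can be sketched in a few lines of Python. This is only an illustration: the function name `systematic_sample` and its parameters are hypothetical, and it assumes the population size is an exact multiple of the sample size.

```python
import random

def systematic_sample(population_size, sample_size, start=None):
    """Return the 1-based subject numbers picked by systematic sampling."""
    k = population_size // sample_size      # sampling interval, e.g. 100 // 20 = 5
    if start is None:
        start = random.randint(1, k)        # random start among the first k subjects
    # Take every kth subject beginning at `start`.
    return list(range(start, population_size + 1, k))[:sample_size]

# Reproduce the example: 100 subjects, sample of 20, first subject is number 3.
sample = systematic_sample(100, 20, start=3)
print(sample[:3], len(sample))
```

Leaving `start` unset draws the first subject at random from 1 to k, as the text describes.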
Cluster Sampling
Here the population is divided into groups called clusters by some means
such as geographic area. Then the researcher randomly selects some of
these clusters and uses all members of the selected clusters as the subjects
of the sample.
Frequency Distribution
The frequency distribution table indicates the occurrence of the
observations or values in data obtained.
Example 1
The data below are the number of light bulbs per household sampled from
a community some time ago.
0 1 4 4 3 1 2 3 1 2
2 4 3 0 2 5 0 2 2 1
3 2 1 1 3 2 3 4 5 2
1 0 5 4 2 0 3 5 1 2
4 3 0 2 2 1 1 2 2 4
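As a quick illustration, the frequency table for these 50 observations can be tallied with Python's `collections.Counter`. The variable names `bulbs` and `freq` are chosen here for the sketch and do not come from the original slides.

```python
from collections import Counter

# Number of light bulbs per household (Example 1 data, row by row).
bulbs = [
    0, 1, 4, 4, 3, 1, 2, 3, 1, 2,
    2, 4, 3, 0, 2, 5, 0, 2, 2, 1,
    3, 2, 1, 1, 3, 2, 3, 4, 5, 2,
    1, 0, 5, 4, 2, 0, 3, 5, 1, 2,
    4, 3, 0, 2, 2, 1, 1, 2, 2, 4,
]

freq = Counter(bulbs)   # maps each observed value to its frequency

print("Bulbs  Frequency")
for value in sorted(freq):
    print(f"{value:>5}  {freq[value]:>9}")
print("Total ", sum(freq.values()))
```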
When the range of the data is large, the data must be grouped into classes
that are more than one unit in width, in what is called a grouped
frequency distribution.
Definitions:
1 Class Intervals: These are continuous intervals selected in such a way
that they are mutually exclusive.
2 Class Boundaries: These are the lower and upper class limits adjusted
so that there are no gaps between consecutive classes.
3 Class Width/Class Size: The difference between the lower and the
upper class boundaries.
4 Class Mark/Class Midpoint: This is the midpoint of the class
interval.
The class boundaries run from Lb to Ub, where
Lb = lower limit − 0.5
Ub = upper limit + 0.5
=⇒ For the class limits 11-20 the boundaries are 10.5-20.5
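A small sketch of these definitions for a single class follows. The helper `class_summary` is hypothetical, and it assumes the data are recorded to the nearest whole unit, so that the 0.5 adjustment applies.

```python
def class_summary(lower_limit, upper_limit):
    """Boundaries, width and mark of one class (limits in whole units)."""
    lb = lower_limit - 0.5                    # lower class boundary
    ub = upper_limit + 0.5                    # upper class boundary
    return {
        "boundaries": (lb, ub),
        "width": ub - lb,                     # difference of the boundaries
        "mark": (lower_limit + upper_limit) / 2,  # class midpoint
    }

summary = class_summary(11, 20)
print(summary)
```

For the class 11-20 this gives boundaries 10.5 and 20.5, a width of 10 and a class mark of 15.5.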
Exercise 3
Twenty-five army inductees at the 66 Artillery Military Barracks - Ho, were
given a blood test to determine their blood type.
The data set obtained was:
A B B AB O
O O B AB B
B B O A O
A O O O B
B AB O A B
INTRODUCTION
An average is a value that is typical of, and represents, a set of
data. There are different approaches to measuring central tendency.
Examples of such approaches are the arithmetic mean, geometric mean,
harmonic mean and quadratic mean.
Arithmetic Mean
The arithmetic mean is the sum of the values, divided by the total number
of values. The symbol $\bar{X}$ represents the sample mean.
It is given by:
\[ \bar{X} = \frac{X_1 + X_2 + X_3 + \cdots + X_n}{n} = \frac{1}{n} \sum_{i=1}^{n} X_i \]
Example 3
The data below represent the mid-semester exam scores (marked out of 20)
of 10 students in a Prob. & Stats class.
15, 13, 14, 18, 18, 14, 16, 19, 11, 17.
Find the mean score.
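The calculation can be checked with a short Python sketch, where the variable names are illustrative only:

```python
# Mid-semester scores from Example 3.
scores = [15, 13, 14, 18, 18, 14, 16, 19, 11, 17]

total = sum(scores)               # sum of the values
mean_score = total / len(scores)  # divided by the number of values

print(total, mean_score)
```

The scores add up to 155, so the mean score is 155/10 = 15.5.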
Example 4
The lengths of 40 electric cables are given to the nearest centimetre as:
Length (cm)   4-8   9-13   14-18   19-23   24-28   29-33
Frequency       2      4       7      14       8       5
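The slide does not state what is to be computed here. Assuming the task is the mean of the grouped data, a common approach is to represent each class by its midpoint and take $\bar{X} = \sum f_i x_i / \sum f_i$. The sketch below follows that assumption.

```python
# (lower limit, upper limit, frequency) for each class in Example 4.
classes = [(4, 8, 2), (9, 13, 4), (14, 18, 7),
           (19, 23, 14), (24, 28, 8), (29, 33, 5)]

n = sum(f for _, _, f in classes)                        # total frequency
weighted_sum = sum(f * (lo + hi) / 2 for lo, hi, f in classes)  # sum of f * midpoint
grouped_mean = weighted_sum / n

print(n, weighted_sum, grouped_mean)
```

With midpoints 6, 11, 16, 21, 26 and 31, the sum of f × midpoint is 825, giving 825/40 = 20.625 cm under this assumption.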
Assumed Mean
This is also called the working or guessed mean. To use this method, we
choose some assumed mean A to work with. It is normally recommended when
the data set is large.
Example 5
Given an assumed mean of 60, find the mean of the data below.
60, 65, 54, 56, 70, 65.
Solution
Since $d_i = x_i - A$,
$d_1 = x_1 - A = 60 - 60 = 0$
$d_2 = x_2 - A = 65 - 60 = 5$
So $d_i = 0, 5, -6, -4, 10, 5$ ∴ $\sum d_i = 10$
=⇒ $\bar{X} = A + \frac{\sum d_i}{n} = 60 + \frac{10}{6} = 61.667$
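The same steps can be written as a short Python sketch. The variable names are illustrative, with `A` standing for the assumed mean.

```python
data = [60, 65, 54, 56, 70, 65]
A = 60                                    # assumed (working) mean

deviations = [x - A for x in data]        # d_i = x_i - A
mean = A + sum(deviations) / len(data)    # A + (sum of d_i) / n

print(deviations, sum(deviations), round(mean, 3))
```

The result agrees with the direct mean `sum(data) / len(data)`, since the assumed mean only shifts the arithmetic.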
Weighted Mean
In some cases, data points carry different weights and we have to
take these weights into account when calculating the overall mean.
Solution
\[ \text{GPA} = \bar{X}_w = \frac{\sum w_i x_i}{\sum w_i} = \frac{(3 \times 5) + (3 \times 4.5) + (2 \times 4.5) + (2 \times 4)}{3 + 3 + 2 + 2} = \frac{45.5}{10} = 4.55 \]
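A minimal Python sketch of the weighted mean follows, using the weights and values that appear in the solution above. The names `weights` and `values` are chosen for the sketch, because the slide text does not say what the weights stand for.

```python
weights = [3, 3, 2, 2]        # w_i, as they appear in the solution
values = [5, 4.5, 4.5, 4]     # x_i, as they appear in the solution

numerator = sum(w * x for w, x in zip(weights, values))   # sum of w_i * x_i
weighted_mean = numerator / sum(weights)                  # divided by sum of w_i

print(numerator, weighted_mean)
```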
Harmonic Mean
The harmonic mean of n observations $X_1, X_2, \ldots, X_n$ is given by:
For ungrouped data:
\[ \bar{X}_h = \frac{n}{\sum \frac{1}{X_i}} \]
Example 7
A student rides a motor bike from campus to Ho Sports Stadium at an
average speed of 60km/h and returns to campus along the same route at
an average speed of 40km/h. Find the average speed of the entire journey.
Solution
\[ \bar{X}_h = \frac{n}{\sum \frac{1}{X_i}} = \frac{2}{\frac{1}{60} + \frac{1}{40}} = 48 \text{ km/h} \]
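This can be checked in Python, both from the formula and with the standard library's `statistics.harmonic_mean`. The sketch relies on the two legs covering the same distance, which is what makes the plain harmonic mean the right average here.

```python
import statistics

speeds = [60, 40]   # km/h, outbound and return over the same route

# Direct use of the formula n / sum(1 / X_i).
by_formula = len(speeds) / sum(1 / s for s in speeds)

# The same quantity from the standard library.
by_library = statistics.harmonic_mean(speeds)

print(by_formula, by_library)
```

Both values come out at 48 km/h, below the arithmetic mean of 50 km/h, because more time is spent on the slower leg.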
Exercise 5
A lecturer drives the first 210 km at an average speed of 90 km/h and the
remaining 146 km at an average speed of 110 km/h from Kumasi to Ho.
What is the average speed for the entire trip?
Geometric Mean
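The kth root of the product of k observations is the geometric mean. The
geometric mean is applicable when finding means of percentages, ratios,
indexes or growth rates.
For ungrouped data:
\[ \bar{X}_g = \sqrt[k]{X_1 \times X_2 \times X_3 \times \cdots \times X_k} \]
For grouped data with frequencies $f_1, f_2, \ldots, f_k$ and $N = \sum f_i$:
\[ \bar{X}_g = \sqrt[N]{X_1^{f_1} \times X_2^{f_2} \times X_3^{f_3} \times \cdots \times X_k^{f_k}} \]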
Example 8
The population of a community increased by 20% from the year 2019 to
2020 and 36% from 2020 to 2021. Find the average rate of increase
during the entire period (i.e 2019-2021).
This is a geometric mean of the growth factors 1.20 and 1.36, with k = 2.
=⇒ $\bar{X}_g = \sqrt{1.20 \times 1.36} = \sqrt{1.632} \approx 1.277$
This means that, on average, the population increased by about 27.7% per year.
In other words, the population grew by a factor of about 1.277 each year
over the period under consideration (i.e. 2019-2021).
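A short Python check of this example is given below. It takes the kth root of the product of the growth factors, and the variable names are illustrative.

```python
import math

# Growth factors: +20% becomes 1.20, +36% becomes 1.36.
factors = [1.20, 1.36]

k = len(factors)
geometric_mean = math.prod(factors) ** (1 / k)   # kth root of the product

average_rate = (geometric_mean - 1) * 100        # average percentage increase per year
print(round(geometric_mean, 4), round(average_rate, 1))
```

Compounding the average factor over the two years reproduces the total growth: `geometric_mean ** 2` equals 1.20 × 1.36 = 1.632 up to rounding.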
Example 10
The weight of 40 metal bars in kg is presented in the table below. Find
the median.
Weight (kg)   Frequency   Cumulative frequency
118-126            3                3
127-135            5                8
136-144            9               17
145-153           12               29
154-162            5               34
163-171            4               38
172-180            2               40
Ebenezer Asabre (Ho Technical University, Department of Mathematics and Statistics). UNIT 1: Probability and Statistics for Engineers. June, 2023.
Solution
We have the following parameters:
$N = 40$, i.e. $\frac{N}{2} = \frac{40}{2} = 20$
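The rest of this solution is not present in the slide text. As a hedged sketch, the median can be estimated with the standard grouped-data interpolation formula, Median = Lb + ((N/2 − CF) / f) × h. Here Lb is the lower boundary of the median class, CF the cumulative frequency before it, f its frequency and h its width. The formula and the function name `grouped_median` are assumptions, not taken from the slides.

```python
# (lower limit, upper limit, frequency) for each class in Example 10.
classes = [(118, 126, 3), (127, 135, 5), (136, 144, 9), (145, 153, 12),
           (154, 162, 5), (163, 171, 4), (172, 180, 2)]

def grouped_median(classes):
    """Estimate the median by linear interpolation inside the median class."""
    n = sum(f for _, _, f in classes)
    half = n / 2
    cumulative = 0                        # cumulative frequency before the current class
    for lower, upper, f in classes:
        if cumulative + f >= half:        # first class whose cumulative frequency reaches N/2
            lb = lower - 0.5              # lower class boundary
            h = (upper + 0.5) - lb        # class width
            return lb + (half - cumulative) / f * h
        cumulative += f

median = grouped_median(classes)
print(median)
```

Under these assumptions the median class is 145-153 (Lb = 144.5, CF = 17, f = 12, h = 9), so the estimate is 144.5 + (3/12) × 9 = 146.75 kg.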
MODE
The mode is the observation with the highest frequency.
A data set that has only one value occurring with the greatest frequency is
said to be unimodal. If a data set has two values that occur with the
same greatest frequency, both values are considered to be modes and
the data set is said to be bimodal. If a data set has more than two values
that occur with the same greatest frequency, each value is a
mode, and the data set is said to be multimodal.
For grouped data, the mode is estimated from the modal class as
\[ \text{Mode} = L_b + \frac{d_1}{d_1 + d_2} \times h \]
where
Lb = lower class boundary of the modal class
d1 = the difference between the frequency of the modal class and the one
before it.
d2 = the difference between the frequency of the modal class and the one
after it.
h = class width of the modal class.
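To illustrate how Lb, d1, d2 and h combine, the sketch below applies the usual grouped-mode interpolation to the metal-bar weights of Example 10. The function `grouped_mode` is hypothetical, and the formula in its docstring is the standard one, assumed here rather than quoted from the slides.

```python
# (lower limit, upper limit, frequency) for the metal-bar weights of Example 10.
classes = [(118, 126, 3), (127, 135, 5), (136, 144, 9), (145, 153, 12),
           (154, 162, 5), (163, 171, 4), (172, 180, 2)]

def grouped_mode(classes):
    """Mode = Lb + d1 / (d1 + d2) * h, evaluated on the modal class."""
    freqs = [f for _, _, f in classes]
    i = freqs.index(max(freqs))                    # index of the modal class
    lower, upper, f = classes[i]
    before = freqs[i - 1] if i > 0 else 0          # frequency of the class before
    after = freqs[i + 1] if i < len(freqs) - 1 else 0  # frequency of the class after
    lb = lower - 0.5                               # lower class boundary
    h = (upper + 0.5) - lb                         # class width
    d1, d2 = f - before, f - after
    return lb + d1 / (d1 + d2) * h

mode = grouped_mode(classes)
print(round(mode, 2))
```

Here the modal class is 145-153, with d1 = 12 − 9 = 3, d2 = 12 − 5 = 7 and h = 9, giving 144.5 + (3/10) × 9 = 147.2 kg.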
Exercise 5
Find the mode of the following data:
Class interval   20-22   23-25   26-28   29-31   32-34
Frequency            3       6      12       9       2
MIDRANGE
The midrange (MR) of a data set is the average of the lowest and
highest values in the data set.
\[ \text{MR} = \frac{\text{lowest value} + \text{highest value}}{2} \]
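As a brief illustration, the midrange of the ten mid-semester scores from Example 3 can be computed as follows. That data set is reused here only for convenience.

```python
# Mid-semester scores from Example 3.
scores = [15, 13, 14, 18, 18, 14, 16, 19, 11, 17]

midrange = (min(scores) + max(scores)) / 2   # average of lowest and highest values
print(min(scores), max(scores), midrange)
```

The lowest score is 11 and the highest is 19, so the midrange is 15.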
Exercise 6
The data below shows the points per game of Michael Jordan in the
1996/1997 NBA regular season (in 10 selected games). Calculate the
midrange of the data.
37, 39, 33, 45, 41, 35, 28, 38, 43, 54.
Example 11
Find the lower and second quartiles for the following data.
Class interval   Frequency   Cumulative frequency
20-22                 3                3
23-25                 6                9
26-28                12               21
29-31                 9               30
32-34                 2               32
Range
By definition,
Range = Maximum value - Minimum value
Mean Deviation
The mean deviation of a set of n observations $X_1, X_2, \ldots, X_n$ with mean $\bar{X}$
is given by:
\[ \text{Mean deviation} = \frac{\sum_{i=1}^{n} |X_i - \bar{X}|}{n} \]
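Both the range and the mean deviation can be illustrated on the ten scores from Example 3, which are reused here for convenience. The variable names are illustrative.

```python
# Mid-semester scores from Example 3.
scores = [15, 13, 14, 18, 18, 14, 16, 19, 11, 17]

data_range = max(scores) - min(scores)       # maximum value - minimum value

mean = sum(scores) / len(scores)
# Average absolute distance of each observation from the mean.
mean_deviation = sum(abs(x - mean) for x in scores) / len(scores)

print(data_range, mean, round(mean_deviation, 2))
```

For these scores the range is 8, the mean is 15.5, and the absolute deviations add up to 21, giving a mean deviation of 2.1.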
COEFFICIENT OF VARIATION
The coefficient of variation (CV) for a sample is given by:
\[ \text{CV} = \frac{s}{\bar{x}} \times 100\% \]
For a population:
\[ \text{CV} = \frac{\sigma}{\mu} \times 100\% \]
The coefficient of variation enables us to compare the variation of values
taken from different populations.
Solution
Height: $\text{CV} = \frac{9.8}{198.6} \times 100\% = 4.93\%$
Weight: $\text{CV} = \frac{60.45}{216} \times 100\% = 27.99\%$
From the above we can clearly see that there is less variation in height
than in weight. This is plausible, because weights among men generally
vary more than heights.
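The two percentages can be reproduced with a small helper. The function name `coefficient_of_variation` is hypothetical, and the inputs are the standard deviations and means quoted in the solution.

```python
def coefficient_of_variation(std_dev, mean):
    """Return the coefficient of variation as a percentage."""
    return std_dev / mean * 100

cv_height = coefficient_of_variation(9.8, 198.6)    # values from the solution
cv_weight = coefficient_of_variation(60.45, 216)

print(round(cv_height, 2), round(cv_weight, 2))
```

Because the CV is unit-free, the height and weight figures can be compared directly even though they are measured in different units.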
Exercise 7
1 Find the variance and standard deviation of the data given below:
Class   11-20   21-30   31-40   41-50   51-60   61-70
f           3       7      17      12      19       5
2 The mean weight of 150 students is 60 kg. The mean weight of the boys
is 70 kg with a standard deviation of 10 kg.
For the girls, the mean weight is 55 kg with a standard deviation of 15 kg.
Find the number of boys.