
Chapter 1 and 2

The document provides an introduction to statistics, defining it as the science of collecting, organizing, presenting, analyzing, and interpreting numerical data. It classifies statistics into descriptive and inferential branches, outlines the stages of statistical investigation, and discusses key terms and concepts such as population, sample, and measurement scales. Additionally, it covers the applications, uses, and limitations of statistics, as well as methods of data collection and presentation.

Uploaded by

nigatu
Copyright
© © All Rights Reserved

CHAPTER 1

1. INTRODUCTION
1.1 Definition and classifications of statistics
1.1.1 Definition:
We can define statistics in two ways.
1. Plural sense (layman's definition).
It is an aggregate or collection of numerical facts.
Not every aggregate of numbers can be called statistical data. An aggregate of numbers
is statistical data when the numbers are
❖ Comparable
❖ Measurable
❖ Collected for a well-defined objective.
2. Singular sense (formal definition)
Statistics is defined as the science of collecting, organizing, presenting, analyzing and interpreting
numerical data for the purpose of assisting in making a more effective decision.
Statistics is a subject that deals with numbers and figures describing a certain situation. It primarily
deals with numerical data obtained by survey and summarizes these data in such a way that the
summary gives a good indication about the nature of the data.
1.1.2 Classifications:
Depending on how data can be used, statistics is sometimes divided into two main areas or
branches.
1. Descriptive Statistics: is concerned with summary calculations, graphs, charts and tables.
❖ It consists of collection, organization, summarization and presentation of data.
❖ It deals with describing data without attempting to infer anything that goes beyond
the given set of data.
2. Inferential Statistics: is a method used to generalize from a sample to a population. For
example, the average income of all families (the population) in Ethiopia can be estimated
from figures obtained from a few hundred families (the sample).
❖ It is important because statistical data usually arise from a sample.
❖ Statistical techniques based on probability theory are required.
For example,
1. The average age of students at Dilla University is 20.5 years.
2. The average income of all families (the population) in Ethiopia can be estimated from
figures obtained from a few hundred families (the sample).
3. There is a relationship between smoking tobacco and an increased risk of developing
cancer.
1.2 Stages in Statistical Investigation
There are five stages or steps in any statistical investigation.
1. Collection of data: the process of measuring, gathering and assembling the raw data upon
which the statistical investigation is to be based.
Data can be collected in a variety of ways; one of the most common methods is
the survey. Surveys can themselves be conducted in different ways; three of the most
common are
❖ Telephone survey
❖ Mailed questionnaire
❖ Personal interview
Exercise: discuss the advantages and disadvantages of the above three methods with
respect to each other.
2. Organization of data: Summarization of data in some meaningful way, e.g. in table
form.
3. Presentation of the data: The process of re-organization, classification, compilation,
and summarization of data to present it in a meaningful form.
4. Analysis of data: The process of extracting relevant information from the summarized
data, mainly through the use of elementary mathematical operations.
5. Inference of data: The interpretation of the statistical measures obtained in the
analysis, using methods by which conclusions are formed and inferences made.
❖ Statistical techniques based on probability theory are required.
1.3 Definitions of some terms
A. Statistical Population: It is the collection of all possible observations of a specified
characteristic of interest (possessing a certain common property) that is under study. An example
is all of the students in the DU4101 course this term.
B. Sample: It is a subset of the population, selected using some sampling technique in such a
way that it represents the population.
C. Sampling: The process or method of selecting a sample from the population.
D. Sample size: The number of elements or observations to be included in the sample.
E. Census: Complete enumeration or observation of the elements of the population; that is, the
collection of data from every element in a population.
F. Parameter: Characteristic or measure obtained from a population.
G. Statistic: Characteristic or measure obtained from a sample.
H. Variable: It is an item of interest that can take on many different numerical values.
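To make the distinction between a parameter and a statistic concrete, here is a minimal Python sketch. The population values are randomly generated, hypothetical incomes and not real data; the seed and sample size are arbitrary choices for illustration.

```python
import random
from statistics import mean

# Hypothetical population: monthly incomes of 1,000 families (made-up values).
rng = random.Random(42)
population = [rng.randint(1000, 10000) for _ in range(1000)]

# Parameter: a measure computed from every element of the population (a census).
parameter_mean = mean(population)

# Statistic: the same measure computed from a sample of 100 families.
sample = rng.sample(population, 100)
statistic_mean = mean(sample)

print(f"Population mean (parameter): {parameter_mean:.1f}")
print(f"Sample mean (statistic):     {statistic_mean:.1f}")
```

The two printed values will generally be close but not equal, which is why inferential statistics relies on probability theory to describe how far a statistic may lie from the parameter it estimates.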
1.4 Applications, Uses and Limitations of statistics
1.4.1 Applications of statistics:
❖ In almost all fields of human endeavor.
❖ Almost all human beings in their daily lives obtain numerical
facts, e.g. about prices.
❖ Applicable in some processes, e.g. the invention of certain drugs or measuring the extent of
environmental pollution.
❖ In industry, especially in the area of quality control.
1.4.2 Uses of statistics:
The main function of statistics is to enlarge our knowledge of complex phenomena. The following
are some uses of statistics:
1. It presents facts in a definite and precise form.
2. Data reduction.
3. Measuring the magnitude of variations in data.
4. Furnishes (supplies) a technique of comparison.
5. Estimating unknown population characteristics.
6. Formulating and testing hypotheses.
7. Studying the relationship between two or more variables.
8. Forecasting future events.
1.4.3 Limitations of statistics
As a science, statistics has its own limitations. The following are some of them:
❖ It deals only with quantitative information.
❖ It deals only with aggregates of facts and not with individual data items.
❖ Statistical data are only approximately, not mathematically, correct.
❖ Statistics can easily be misused and therefore should be used by experts.
1.5 Types of Variables and Measurement scales
1.5.1 Types of variables
A variable is a characteristic of an object that can have different possible values. There are two
types of variables.
1. Qualitative Variables are nonnumeric variables and can't be measured. Examples
include gender, religious affiliation, color, beauty and state of birth.
Qualitative variables are also called categorical variables.
2. Quantitative Variables are numerical variables and can be measured.
Examples: balance in checking account, number of children in family, height, income,
temperature etc.
Quantitative variables can be further classified as;
❖ Discrete variables and
❖ Continuous variables
I. Discrete variables are variables whose values are counts,
e.g. number of students, family size (number of household members), number of pages of a book.
II. Continuous variables are variables that can have any value within an interval,
e.g. weight, volume, length, etc.
1.5.2 Measurement scales
Proper knowledge about the nature and type of data to be dealt with is essential in order to specify
and apply the proper statistical method for their analysis and inferences. Measurement scale refers
to the property of value assigned to the data based on the properties of order, distance and fixed
zero.
In mathematical terms measurement is a functional mapping from the set of objects {Oi} to the
set of real numbers {M(Oi)}.

The goal of measurement systems is to structure the rule for assigning numbers to objects in such
a way that the relationship between the objects is preserved in the numbers assigned to the objects.
The different kinds of relationships preserved are called properties of the measurement system.
Order
The property of order exists when an object that has more of the attribute than another object, is
given a bigger number by the rule system. This relationship must hold for all objects in the "real
world".
The property of ORDER exists
When for all i, j if Oi > Oj, then M(Oi) > M(Oj).
Distance
The property of distance is concerned with the relationship of differences between objects. If a
measurement system possesses the property of distance, it means that the unit of measurement
means the same thing throughout the scale of numbers. That is, an inch is an inch, no matter where
it falls, immediately ahead or a mile down the road.
More precisely, an equal difference between two numbers reflects an equal difference in the "real
world" between the objects that were assigned the numbers. In order to define the property of
distance in the mathematical notation, four objects are required: Oi, Oj, Ok, and Ol . The difference
between objects is represented by the "-" sign; Oi - Oj refers to the actual "real world" difference
between object i and object j, while M(Oi) -M(Oj) refers to differences between numbers.
The property of DISTANCE exists, for all i, j, k, l
If Oi-Oj ≥ Ok- Ol then M(Oi)-M(Oj) ≥ M(Ok)-M(Ol ).
Fixed Zero
A measurement system possesses a rational zero (fixed zero) if an object that has none of the
attribute in question is assigned the number zero by the system of rules. The object does not need
to really exist in the "real world", as it is somewhat difficult to visualize a "man with no height".
The requirement for a rational zero is this: if objects with none of the attribute did exist, would they
be given the value zero? Defining O0 as the object with none of the attribute in question, the
definition of a rational zero becomes:
The property of FIXED ZERO exists if M(O0) = 0.
The property of fixed zero is necessary for ratios between numbers to be meaningful.

SCALE TYPES
Measurement is the assignment of numbers to objects or events in a systematic fashion. Four levels
of measurement scales are commonly distinguished: nominal, ordinal, interval, and ratio, each
possessing different properties of measurement systems.
Nominal Scales
Nominal scales are measurement systems that possess none of the three
properties stated above.
❖ Level of measurement which classifies data into mutually exclusive, all-inclusive
categories in which no order or ranking can be imposed on the data.
❖ No arithmetic and relational operation can be applied.
Examples:
❖ Political party preference (Republican, Democrat, or Other)
❖ Sex (Male or Female)
❖ Marital status (married, single, widowed, divorced)
❖ Country code
❖ Regional differentiation of Ethiopia.
Ordinal Scales
Ordinal Scales are measurement systems that possess the property of order, but not the property
of distance. The property of fixed zero is not important if the property of distance is not satisfied.
❖ Level of measurement which classifies data into categories that can be ranked. Differences
between the ranks are not meaningful.
❖ Arithmetic operations are not applicable but relational operations are
applicable.
❖ Ordering is the sole property of ordinal scale.
Examples:
❖ Letter grades (A, B, C, D, F).
❖ Rating scales (Excellent, very good, Good, Fair, poor).
❖ Military status
❖ Ranks in race, etc.
Interval Scales
❖ Interval scales are measurement systems that possess the properties of Order and
distance, but not the property of fixed zero.
❖ Level of measurement which classifies data that can be ranked and differences are
meaningful. However, there is no meaningful zero, so ratios are meaningless.
❖ Interval scale data convey better information than ordinal and nominal scale data.
❖ All arithmetic operations except division and multiplication are applicable.
❖ Relational operations are also possible.
Examples:
❖ IQ
❖ Temperature in °F.
Ratio Scales
❖ Ratio scales are measurement systems that possess all three properties: order, distance, and
fixed zero. The added power of a fixed zero allows ratios of numbers to be meaningfully
interpreted; e.g. the ratio of Bekele's height to Martha's height is 1.32, whereas this is not
possible with interval scales.
❖ Level of measurement which classifies data that can be ranked, differences are
meaningful, and there is a true zero. True ratios exist between the different units of
measure.
❖ All arithmetic and relational operations are applicable.
❖ This measurement scale provides better information than interval scale of measurement.
❖ Zero measurement indicates absence of the quantity being measured.
Examples:
❖ Weight
❖ Height
❖ Number of students
❖ Age
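The four scale types can be summarized by which of the three properties (order, distance, fixed zero) they possess. The following Python sketch encodes that summary; the names `SCALE_PROPERTIES`, `classify_scale` and `meaningful_operations` are hypothetical helpers introduced only for illustration.

```python
# Properties possessed by each scale type, as described in this section.
SCALE_PROPERTIES = {
    "nominal":  {"order": False, "distance": False, "fixed_zero": False},
    "ordinal":  {"order": True,  "distance": False, "fixed_zero": False},
    "interval": {"order": True,  "distance": True,  "fixed_zero": False},
    "ratio":    {"order": True,  "distance": True,  "fixed_zero": True},
}


def classify_scale(order: bool, distance: bool, fixed_zero: bool) -> str:
    """Return the scale type whose properties match the given flags."""
    wanted = {"order": order, "distance": distance, "fixed_zero": fixed_zero}
    for name, props in SCALE_PROPERTIES.items():
        if props == wanted:
            return name
    raise ValueError("no scale type in this section has that combination")


def meaningful_operations(scale: str) -> list:
    """List the operations the section describes as applicable to a scale."""
    props = SCALE_PROPERTIES[scale]
    ops = []
    if props["order"]:
        ops.append("relational comparison (<, >)")
    if props["distance"]:
        ops.append("addition and subtraction")
    if props["fixed_zero"]:
        ops.append("multiplication and division (ratios)")
    return ops


print(classify_scale(order=True, distance=True, fixed_zero=False))
print(meaningful_operations("ordinal"))
```

For instance, temperature in °F has order and distance but no fixed zero, so `classify_scale(True, True, False)` returns `"interval"`.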
Exercise:
The following presents a list of different attributes and rules for assigning numbers to objects. Try
to classify the different measurement systems into one of the four types of scales (nominal, ordinal,
interval and ratio).
1) A response to the statement "Abortion is a woman's right" where "Strongly
Disagree" = 1, "Disagree" = 2, "No Opinion" = 3, "Agree" =4, and "Strongly
Agree" = 5, as a measure of attitude toward abortion.
2) Times for swimmers to complete a 50-meter race
3) Months of the year Meskerm, Tikimit…
4) Socioeconomic status of a family when classified as low, middle and upper
classes.
5) Blood type of individuals, A, B, AB and O.
6) Pollen counts provided as numbers between 1 and 10 where 1 implies there is
almost no pollen and 10 that it is rampant, but for which the values do not
represent an actual count of grains of pollen.
7) Region numbers of Ethiopia (1, 2, 3 etc.)
8) The number of students in a college;
9) the net salary of a group of workers;
10) the height of the men in the same town;
11) Your checking account number as a name for your account.
12) Your checking account balance as a measure of the amount of money you have
in that account.
13) The order in which you were eliminated in a spelling bee as a measure of your
spelling ability.
14) Your score on the first statistics test as a measure of your knowledge of
statistics.
15) Your score on an individual intelligence test as a measure of your intelligence.
16) The distance around your forehead measured with a tape measure as a measure
of your intelligence.
CHAPTER 2
2. Methods of data collection and presentation
2.1. Methods of data collection
Raw data: are collected data which have not been organized numerically.
Examples: 25, 10, 32, 18, 6, 93, 4.
An array: is an arrangement of raw numerical data in ascending or descending order of magnitude.
➢ It enables us to know the range of the data set easily.
Any scientific investigation requires data related to the study. The required data can be obtained from
either a primary source or a secondary source.
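As a small illustration of the raw data and array definitions above, the example values can be arranged in ascending order in Python, after which the range is read off from the two ends. The variable names are illustrative only.

```python
# Raw data from the example above (not yet organized).
raw_data = [25, 10, 32, 18, 6, 93, 4]

# An array: the same values in ascending order of magnitude.
array = sorted(raw_data)

# With the data ordered, the range is the last value minus the first.
data_range = array[-1] - array[0]

print(array)       # [4, 6, 10, 18, 25, 32, 93]
print(data_range)  # 89
```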
There are two sources of data:
1. Primary Data
❖ Data measured or collected by the investigator or the user directly from
the source.
❖ Two activities involved: planning and measuring.
A) Planning:
➢ Identify source and elements of the data.
➢ Decide whether to consider sample or census.
➢ If sampling is preferred, decide on sample size, selection method, etc.
➢ Decide measurement procedure.
➢ Set up the necessary organizational structure.
B) Measuring: there are different options.
➢ Focus Group
➢ Telephone Interview
➢ Mail Questionnaires
➢ Door-to-Door Survey
➢ Mall Intercept
➢ New Product Registration
➢ Personal Interview
➢ Experiments
These are some of the sources for collecting primary data.
2. Secondary Data
❖ Data gathered or compiled from published and unpublished sources or files.
❖ When the source is secondary data, check that:
✓ The type and objective of the situation are known.
✓ The purpose for which the data were collected is compatible with the
present problem.
✓ The nature and classification of the data are appropriate to the problem.
✓ There are no biases or misreporting in the published data.
Note: Data which are primary for one may be secondary for the other.

2.2. METHODS OF DATA PRESENTATION


Having collected and edited the data, the next important step is to organize it, that is, to present it
in a readily comprehensible, condensed form that helps in drawing inferences from it. It is also
necessary that like items be separated from unlike ones.
The presentation of data is broadly classified into the following two categories:
❖ Tabular presentation
❖ Diagrammatic and Graphic presentation.
The process of arranging data into classes or categories according to similarities is technically
called classification.
Classification is a preliminary step that prepares the ground for proper presentation of data.
Definitions:
❖ Frequency: is the number of values in a specific class of the distribution.
❖ Frequency distribution: is the organization of raw data in table form using classes and
frequencies.
Examples: A frequency distribution presenting the numbers of males and females in a class.
Sex Frequency
Male 40
Female 56

There are three basic types of frequency distributions


❖ Categorical frequency distribution
❖ Ungrouped frequency distribution
❖ Grouped frequency distribution
There are specific procedures for constructing each type.
1) Categorical frequency Distribution:
Used for data that can be placed in specific categories, such as nominal or ordinal data, e.g. marital status.
Example: a social worker collected the following data on marital status for 25 persons.
(M=married, S=single, W=widowed, D=divorced)
M S D W D
S S M M M
W D S M M
W D D S S
S W W D D
Solution:
Since the data are categorical, discrete classes can be used. There are four types of marital status:
M, S, D, and W. These types will be used as the classes for the distribution. We follow the procedure
below to construct the frequency distribution.
Step 1: Make a table as shown.

Class Tally Frequency Percent


(1) (2) (3) (4)
M
S
D
W

Step 2: Tally the data and place the result in column (2).
Step 3: Count the tally and place the result in column (3).
Step 4: Find the percentage of values in each class by using $\% = \frac{f}{n} \times 100$,

where f = frequency of the class and n = total number of values.


Percentages are not normally a part of a frequency distribution, but they can be added since they
are used in certain types of diagrams such as pie charts.
Step 5: Find the total for columns (3) and (4).
Combining all the steps one can construct the following frequency distribution.
Class Tally Frequency Percent
(1) (2) (3) (4)
M ///// / 6 24
S ///// // 7 28
D ///// // 7 28
W ///// 5 20
Total 25 100
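The tallying and percentage steps can be checked with a short Python sketch using `collections.Counter` from the standard library. The data are the 25 marital-status codes from the example, entered row by row; the variable names are illustrative.

```python
from collections import Counter

# The 25 observations from the example, row by row.
data = (
    "M S D W D "
    "S S M M M "
    "W D S M M "
    "W D D S S "
    "S W W D D"
).split()

n = len(data)           # total number of values
counts = Counter(data)  # steps 2 and 3: tally and count each class

print("Class  Frequency  Percent")
for cls in ["M", "S", "D", "W"]:
    f = counts[cls]
    percent = f * 100 / n  # step 4: % = f / n * 100
    print(f"{cls:<6} {f:<10} {percent:.0f}")
print(f"Total  {n:<10} 100")  # step 5: column totals
```

Counting by program is a useful cross-check on hand tallies, since a single missed mark shifts both the frequency and the percentage of a class.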

2) Ungrouped frequency Distribution:


➢ Is a table of all the potential raw score values that could possibly occur in the data along
with the number of times each actually occurred.
➢ Is often constructed for small data sets or for data on a discrete variable.
Constructing ungrouped frequency distribution:
❖ First find the smallest and largest raw score in the collected data.
❖ Arrange the data in order of magnitude and count the frequency.
❖ To facilitate counting one may include a column of tallies.
Example:
The following data represent the marks of 20 students.
80 76 90 85 80
70 60 62 70 85
65 60 63 74 75
76 70 70 80 85
Construct a frequency distribution, which is ungrouped.
Solution:
Step 1: Find the range, Range=Max-Min=90-60=30.
Step 2: Make a table as shown
Step 3: Tally the data.
Step 4: Compute the frequency.
Mark Tally Frequency
60 // 2
62 / 1
63 / 1
65 / 1
70 //// 4
74 / 1
75 / 1
76 // 2
80 /// 3
85 /// 3
90 / 1
Each individual value is presented separately, that is why it is named ungrouped frequency
distribution.
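The same counting can be sketched in Python for the 20 marks in this example. `Counter` is from the standard library; sorting the distinct values reproduces the order of the table, and the range is computed as in step 1.

```python
from collections import Counter

marks = [80, 76, 90, 85, 80,
         70, 60, 62, 70, 85,
         65, 60, 63, 74, 75,
         76, 70, 70, 80, 85]

# Step 1: range = max - min.
mark_range = max(marks) - min(marks)

# Steps 3 and 4: tally and count each distinct mark.
frequency = Counter(marks)

print("Range:", mark_range)
print("Mark  Frequency")
for mark in sorted(frequency):
    print(f"{mark:<5} {frequency[mark]}")
```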
3) Grouped frequency Distribution:
➢ When the range of the data is large, the data must be grouped into classes that are more
than one unit in width.
Definitions:
➢ Grouped Frequency Distribution: a frequency distribution when several numbers are
grouped in one class.
➢ Class limits: Separate one class in a grouped frequency distribution from another. The
limits could actually appear in the data, and there are gaps between the upper limit of one class
and the lower limit of the next.
➢ Units of measurement (U): the distance between two possible consecutive measures. It is
usually taken as 1, 0.1, 0.01, 0.001, ….
➢ Class boundaries: Separate one class in a grouped frequency distribution from another.
The boundaries have one more decimal place than the raw data and therefore do not appear
in the data. There is no gap between the upper boundary of one class and the lower boundary
of the next class. The lower-class boundary is found by subtracting $\frac{U}{2}$ from the
corresponding lower-class limit, and the upper-class boundary is found by adding $\frac{U}{2}$ to the
corresponding upper-class limit.


➢ Class width: the difference between the upper- and lower-class boundaries of any class. It
is also the difference between the lower limits of any two consecutive classes or the
difference between any two consecutive class marks.
➢ Class mark (Mid points): it is the average of the lower- and upper-class limits or the
average of upper- and lower-class boundary.
➢ Cumulative frequency: is the number of observations less than/more than or equal to a
specific value.
➢ Cumulative frequency above: it is the total frequency of all values greater than or equal
to the lower-class boundary of a given class.
➢ Cumulative frequency below: it is the total frequency of all values less than or equal to
the upper-class boundary of a given class.
➢ Cumulative Frequency Distribution (CFD): it is the tabular arrangement of class interval
together with their corresponding cumulative frequencies. It can be more than or less than
type, depending on the type of cumulative frequency used.
➢ Relative frequency (rf): it is the frequency divided by the total frequency.
➢ Relative cumulative frequency (rcf): it is the cumulative frequency divided by the total
frequency.
Guidelines for classes
1. There should be between 5 and 20 classes.
2. The classes must be mutually exclusive. This means that no data value can fall into two different
classes
3. The classes must be all inclusive or exhaustive. This means that all data values must be included.
4. The classes must be continuous. There are no gaps in a frequency distribution.
5. The classes must be equal in width. The exception here is the first or last class. It is possible to
have a "below ..." or "... and above" class.
This is often used with ages.
Steps for constructing Grouped frequency Distribution
1. Find the largest and smallest values
2. Compute the Range(R) = Maximum – Minimum
3. Select the number of classes desired, usually between 5 and 20, or use Sturges' rule
$k = 1 + 3.32 \log_{10} n$, where k is the number of classes desired and n is the total number of observations.
4. Find the class width by dividing the range by the number of classes and rounding up, not
off: $w = \frac{R}{k}$.

5. Pick a suitable starting point less than or equal to the minimum value. The starting point is
called the lower limit of the first class. Continue to add the class width to this lower limit to get
the rest of the lower limits.

6. To find the upper limit of the first class, subtract U from the lower limit of the second class.
Then continue to add the class width to this upper limit to find the rest of the upper limits.

7. Find the boundaries by subtracting U/2 units from the lower limits and adding U/2 units to
the upper limits. The boundaries are also halfway between the upper limit of one class and the
lower limit of the next class. Depending on what you're trying to accomplish, it may not be
necessary to find the boundaries.

8. Tally the data.

9. Find the frequencies.

10. Find the cumulative frequencies. Depending on what you're trying to accomplish, it may not
be necessary to find the cumulative frequencies.

11. If necessary, find the relative frequencies and/or relative cumulative frequencies
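Steps 1 to 7 can be sketched as a small Python function. This is an illustration under stated assumptions, not a definitive implementation: `build_classes` is a hypothetical helper name, the starting point is taken to be the minimum value, the base-10 logarithm is used in Sturges' rule, and one extra class is added if the computed classes would stop short of the maximum.

```python
import math


def build_classes(data, unit=1):
    """Return (k, width, limits, boundaries) following steps 1 to 7.

    `unit` is U, the unit of measurement (1 for whole-number data).
    """
    n = len(data)
    low, high = min(data), max(data)                 # step 1
    data_range = high - low                          # step 2
    k = math.ceil(1 + 3.32 * math.log10(n))          # step 3: Sturges' rule, rounded up
    width = math.ceil(data_range / k / unit) * unit  # step 4: round up to a multiple of U
    # Assumption: if the last class would end below the maximum, add one more class.
    if low + k * width - unit < high:
        k += 1
    lower_limits = [low + i * width for i in range(k)]         # step 5
    upper_limits = [lo + width - unit for lo in lower_limits]  # step 6
    limits = list(zip(lower_limits, upper_limits))
    boundaries = [(lo - unit / 2, up + unit / 2) for lo, up in limits]  # step 7
    return k, width, limits, boundaries


data = [11, 29, 6, 33, 14, 31, 22, 27, 19, 20,
        18, 17, 22, 38, 23, 21, 26, 34, 39, 27]
k, width, limits, boundaries = build_classes(data)
print(k, width)
print(limits)
print(boundaries)
```

For these 20 whole-number values the sketch gives 6 classes of width 6, starting at 6 – 11 with boundaries 5.5 – 11.5.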

Example*:
Construct a frequency distribution for the following data.
11 29 6 33 14 31 22 27 19 20
18 17 22 38 23 21 26 34 39 27
Solutions:

Step 1: Find the highest and the lowest value H=39, L=6

Step 2: Find the range; R=H-L=39-6=33


Step 3: Select the number of classes desired using Sturges formula;

$k = 1 + 3.32 \log_{10} n = 1 + 3.32 \log_{10}(20) \approx 5.32$, which is rounded up to 6.

Step 4: Find the class width: $w = \frac{R}{k} = \frac{33}{6} = 5.5$, which is rounded up to 6.

Step 5: Select the starting point, let it be the minimum observation.

➢ 6, 12, 18, 24, 30, 36 are the lower-class limits.

Step 6: Find the upper-class limits; e.g. the upper limit of the first class = 12 - U = 12 - 1 = 11.

➢ 11, 17, 23, 29, 35, 41 are the upper-class limits.


So, combining step 5 and step 6, one can construct the following classes.
Class limits
6 – 11
12 – 17
18 – 23
24 – 29
30 – 35
36 – 41

Step 7: Find the class boundaries.

E.g. for class 1: lower class boundary $= 6 - \frac{U}{2} = 6 - 0.5 = 5.5$ and
upper class boundary $= 11 + \frac{U}{2} = 11 + 0.5 = 11.5$.

Then continue adding w to both boundaries to obtain the remaining
boundaries. By doing so one can obtain the following classes.
Class boundary
5.5 – 11.5
11.5 – 17.5
17.5 – 23.5
23.5 – 29.5
29.5 – 35.5
35.5 – 41.5

Step 8: tally the data.

Step 9: Write the numeric values for the tallies in the frequency column.

Step 10: Find cumulative frequency.

Step 11: Find relative frequency or/and relative cumulative frequency.

The complete frequency distribution follows:


Class    Class        Class   Tally      Freq.   CF (less     CF (more     Rf.    Rcf (less
limit    boundary     mark                       than type)   than type)          than type)
6-11     5.5-11.5     8.5     //         2       2            20           0.10   0.10
12-17    11.5-17.5    14.5    //         2       4            18           0.10   0.20
18-23    17.5-23.5    20.5    ///// //   7       11           16           0.35   0.55
24-29    23.5-29.5    26.5    ////       4       15           9            0.20   0.75
30-35    29.5-35.5    32.5    ///        3       18           5            0.15   0.90
36-41    35.5-41.5    38.5    //         2       20           2            0.10   1.00
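The frequency, cumulative frequency and relative frequency columns of this example can be reproduced with a short Python sketch. It uses the class limits found in steps 5 and 6 and `itertools.accumulate` from the standard library; the variable names are illustrative.

```python
from itertools import accumulate

data = [11, 29, 6, 33, 14, 31, 22, 27, 19, 20,
        18, 17, 22, 38, 23, 21, 26, 34, 39, 27]
limits = [(6, 11), (12, 17), (18, 23), (24, 29), (30, 35), (36, 41)]
n = len(data)

# Steps 8 and 9: count the values falling within each pair of class limits.
freq = [sum(lo <= x <= hi for x in data) for lo, hi in limits]

# Step 10: cumulative frequencies, less than type and more than type.
cf_less = list(accumulate(freq))
cf_more = list(accumulate(reversed(freq)))[::-1]

# Step 11: relative and relative cumulative frequencies.
rf = [f / n for f in freq]
rcf_less = [c / n for c in cf_less]

# Class marks: the average of the lower and upper class limits.
class_marks = [(lo + hi) / 2 for lo, hi in limits]

for row in zip(limits, class_marks, freq, cf_less, cf_more, rf, rcf_less):
    print(row)
```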


Diagrammatic and Graphic presentation of data.
These are techniques for presenting data in visual displays using geometric figures and pictures.
Importance:
➢ They have greater attraction.
➢ They facilitate comparison.
➢ They are easily understandable.
❖ Diagrams are appropriate for presenting discrete data.
❖ The three most commonly used diagrammatic presentations for discrete as well as
qualitative data are:
➢ Pie charts
➢ Pictogram
➢ Bar charts
Pie chart
A pie chart is a circle that is divided into sections or wedges according to the percentage of
frequencies in each category of the distribution. The angle of the sector is obtained using:
$\text{Angle of sector} = \frac{\text{value of the part}}{\text{the whole quantity}} \times 360^\circ$
Example: Draw a suitable diagram to represent the following population in a town.
Men Women Girls Boys
2500 2000 4000 1500
Solutions:
Step 1: Find the percentage.
Step 2: Find the number of degrees for each class.
Step 3: Using a protractor and compass, draw each section and label it with its name and
corresponding percentage.
Class Frequency Percent Degree
Men 2500 25 90
Women 2000 20 72
Girls 4000 40 144
Boys 1500 15 54
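Steps 1 and 2 of the pie chart example can be verified with a few lines of Python. The dictionary holds the population figures given in the example; the variable names are illustrative.

```python
# Population of the town by group, as given in the example.
population = {"Men": 2500, "Women": 2000, "Girls": 4000, "Boys": 1500}
total = sum(population.values())

# Step 1: percentage of each class. Step 2: angle = part / whole * 360.
percent = {name: value * 100 / total for name, value in population.items()}
degrees = {name: value * 360 / total for name, value in population.items()}

print("Class   Frequency  Percent  Degree")
for name, value in population.items():
    print(f"{name:<7} {value:<10} {percent[name]:<8.0f} {degrees[name]:.0f}")
```

The angles add up to 360 degrees and the percentages to 100, which is a quick check that no category has been left out.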
Pictogram
In this diagram, we represent data by means of some picture symbols. We decide a suitable picture
to represent a definite number of units in which the variable is measured.
Example: draw a pictogram to represent the following population of a town.
Year 1989 1990 1991 1992
Population 2000 3000 5000 7000

Bar Charts:
➢ A set of bars (thick lines or narrow rectangles) representing some magnitude over time or
space.
➢ They are useful for comparing aggregates over time or space.
➢ Bars can be drawn either vertically or horizontally.
➢ There are different types of bar charts. The most common being:
❖ Simple bar chart
❖ Deviation or two-way bar chart
❖ Broken bar chart
❖ Component or sub divided bar chart.
❖ Multiple bar charts.
Simple Bar Chart
➢ Are used to display data on one variable.
➢ They are thick lines (narrow rectangles) having the same breadth. The magnitude of a
quantity is represented by the height /length of the bar.
Example: The following data represent sales by product, 1957-1959, of a given company
for three products A, B, C.

Product Sales ($) 1957 Sales ($) 1958 Sales ($) 1959
A 12 14 18
B 24 21 18
C 24 35 54

Solutions:

Component Bar chart


➢ When there is a desire to show how a total (or aggregate) is divided into its component
parts, we use a component bar chart.
➢ The bars represent the total value of a variable, with each total broken into its component parts,
and different colors or designs are used for identification.
Example:
Draw a component bar chart to represent the sales by product from 1957 to 1959.
Solutions:
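As a rough sketch of the arithmetic behind such a chart, the following Python code computes, for each year, where each product's segment starts and ends within the bar, along with the bar's total height. Stacking the products in the order A, B, C is an assumption made for illustration; any plotting tool could then draw the segments.

```python
years = [1957, 1958, 1959]
sales = {"A": [12, 14, 18], "B": [24, 21, 18], "C": [24, 35, 54]}

segments = {}  # year -> list of (product, bottom, top)
totals = {}    # year -> total height of the bar
for idx, year in enumerate(years):
    bottom = 0
    segments[year] = []
    for product, values in sales.items():
        top = bottom + values[idx]  # each segment sits on top of the previous one
        segments[year].append((product, bottom, top))
        bottom = top
    totals[year] = bottom

for year in years:
    print(year, totals[year], segments[year])
```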

Multiple Bar charts


➢ These are used to display data on more than one variable.
➢ They are used for comparing different variables at the same time.
Example:
Draw a multiple bar chart to represent the sales by product from 1957 to 1959.
Solutions:

Graphical Presentation of data


➢ The histogram, frequency polygon and cumulative frequency graph (ogive) are the most
commonly applied graphical representations for continuous data.
Procedures for constructing statistical graphs:
➢ Draw and label the X and Y axes.
➢ Choose a suitable scale for the frequencies or cumulative frequencies and
label it on the Y axis.
➢ Represent the class boundaries (for the histogram or ogive) or the
midpoints (for the frequency polygon) on the X axis.
➢ Plot the points.
➢ Draw the bars or lines to connect the points.

Histogram

➢ A graph which displays the data by using vertical bars of various heights to represent
frequencies. Class boundaries are placed along the horizontal axis. Class marks and class
limits are sometimes used as the quantity on the X axis.
Example: Construct a histogram to represent the previous data (example *).
Frequency Polygon:
➢ A line graph. The frequency is placed along the vertical axis and class midpoints are
placed along the horizontal axis. It is customary to add the next higher and lower class intervals,
each with a corresponding frequency of zero, to make it a complete polygon.
Example: Draw a frequency polygon for the above data (example *).
Solutions:
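The points to be plotted for the frequency polygon of example * can be listed with a short Python sketch. It takes the class marks and frequencies from the frequency distribution of that example and adds one class mark on each side with a frequency of zero, as described above; the variable names are illustrative.

```python
class_marks = [8.5, 14.5, 20.5, 26.5, 32.5, 38.5]
freq = [2, 2, 7, 4, 3, 2]
width = 6  # class width from example *

# Add the next lower and next higher class marks with frequency zero
# so that the polygon is closed on the horizontal axis.
points = (
    [(class_marks[0] - width, 0)]
    + list(zip(class_marks, freq))
    + [(class_marks[-1] + width, 0)]
)

for x, y in points:
    print(x, y)
```

Joining these points in order with straight lines gives the polygon.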
Ogive (cumulative frequency polygon)
➢ A graph showing the cumulative frequency (less than or more than type) plotted against
upper- or lower-class boundaries respectively. That is class boundaries are plotted along
the horizontal axis and the corresponding cumulative frequencies are plotted along the
vertical axis. The points are joined by a free hand curve.
Example: Draw an ogive curve (less than type) for the above data. (Example *)
i) Less than type cumulative frequency

ii) More than type cumulative frequency
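The points for both ogives of example * can be computed with the Python sketch below. It assumes the class boundaries and frequencies from that example: the less than type pairs each boundary with the number of values below it, and the more than type pairs each boundary with the number of values above it.

```python
from itertools import accumulate

boundaries = [5.5, 11.5, 17.5, 23.5, 29.5, 35.5, 41.5]
freq = [2, 2, 7, 4, 3, 2]
n = sum(freq)

# Number of values below each boundary (zero below the first boundary).
below = [0] + list(accumulate(freq))

less_than_points = list(zip(boundaries, below))
more_than_points = list(zip(boundaries, [n - b for b in below]))

print(less_than_points)
print(more_than_points)
```

The less than curve rises from 0 to 20 and the more than curve falls from 20 to 0 over the same boundaries.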


Exercises:
1. What is another name for the ogive?
2. What are the three types of frequency distribution?
3. In a frequency distribution, the number of classes should be between _____and_______
4. Data such as blood types (A, B, AB, O) can be organized into a(n)____________ frequency
distribution.
5. The number of visitors to the Historic Museum for 25 randomly selected hours is shown.
Construct a grouped frequency distribution.
15 53 48 19 38
86 63 98 79 38
62 89 67 39 26
28 35 54 88 76
31 7 53 41 68
6. In the construction of a frequency distribution, it is a good idea to have overlapping class
limits, such as 10–20, 20–30, 30–40. If the statement is false, explain why.
CHAPTER 3
3. MEASURES OF CENTRAL TENDENCY
Introduction
When we want to make comparison between groups of numbers it is good to have a single value
that is considered to be a good representative of each group. This single value is called the average
of the group. Averages are also called measures of central tendency.
An average which is representative is called typical average and an average which is not
representative and has only a theoretical value is called a descriptive average. A typical average
should possess the following:
➢ It should be rigidly defined.
➢ It should be based on all observations under investigation.
➢ It should be affected as little as possible by extreme observations.
➢ It should be capable of further algebraic treatment.
➢ It should be affected as little as possible by fluctuations of sampling.
➢ It should be easy to calculate and simple to understand.
Objectives:
➢ To comprehend the data easily.
➢ To facilitate comparison.
➢ To make further statistical analysis.
The Summation Notation:
Let X1, X2, X3, …, XN be a set of measurements, where N is the total number of observations and
Xi is the ith observation.
❖ Very often in statistics an algebraic expression of the form X1+X2+X3+...+XN is used in
a formula to compute a statistic. It is tedious to write an expression like this very often, so
mathematicians have developed a shorthand notation to represent a sum of scores, called
the summation notation.
❖ The symbol $\sum_{i=1}^{N} X_i$ is a mathematical shorthand for X1+X2+X3+...+XN.

The expression is read, "the sum of X sub i from i equals 1 to N." It means "add up all the numbers."
Example: Suppose the following were scores made on the first homework assignment for five
students in the class: 5, 7, 7, 6, and 8. In this example set of five numbers, where N=5, the
summation could be written:
$\sum_{i=1}^{5} X_i = X_1 + X_2 + X_3 + X_4 + X_5 = 5 + 7 + 7 + 6 + 8 = 33$

The "i=1" in the bottom of the summation notation tells where to begin the sequence of summation.
If the expression were written with "i=3", the summation would start with the third number in the
set. For example:
$\sum_{i=3}^{N} X_i = X_3 + X_4 + \dots + X_N$

In the example set of numbers, this would give the following result:
$\sum_{i=3}^{5} X_i = X_3 + X_4 + X_5 = 7 + 6 + 8 = 21$

The "N" in the upper part of the summation notation tells where to end the sequence of summation.
If there were only three scores then the summation and example would be:
$\sum_{i=1}^{3} X_i = X_1 + X_2 + X_3 = 5 + 7 + 7 = 19$
Sometimes if the summation notation is used in an expression and the expression must be written
a number of times, as in a proof, then a shorthand notation for the shorthand notation is employed.
When the summation sign "∑" is used without additional notation, then "i=1" and "N" are assumed.
For example:
$\sum X = \sum_{i=1}^{N} X_i$
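The summation examples in this section can be checked with Python's built-in `sum`. Note that Python lists are indexed from 0, so the observation written X3 in the text is `scores[2]` in the code; the variable names are illustrative.

```python
scores = [5, 7, 7, 6, 8]  # X1 ... X5 from the example
N = len(scores)

total = sum(scores)             # sum of Xi for i = 1 to N
from_third = sum(scores[2:])    # sum of Xi for i = 3 to N
first_three = sum(scores[:3])   # sum of Xi for i = 1 to 3

print(total)        # 33
print(from_third)   # 21
print(first_three)  # 19
```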

PROPERTIES OF SUMMATION
