
Lecture 2 Statistics

The document discusses grouping and summarizing data through the use of frequency distribution tables. It explains that when dealing with large datasets, the data can be grouped into intervals or classes to more easily analyze and summarize the information. The key aspects covered are: 1) Grouped data involves arranging individual observations into groups according to common characteristics to create a frequency distribution table. 2) When data is grouped, class intervals or classes are used to organize the data, with the class size or width being the difference between the upper and lower limits. 3) Class limits refer to the values defining each group, while class boundaries define the true interval but may not be observed in the frequency table.

Lecture 2

Summary statistics

SHORT REVIEW:

Grouping is an operation that involves selecting and combining indicators with the
same characteristics from a population.

Groupings are performed according to the task at hand. For example, if the
standard of living of the population is analyzed, the grouping is made by
household income and expenses; if the dynamics of population growth are studied,
the grouping can be based on indicators such as birth rate, mortality rate,
family size, and number of children per family.

Groupings are formed according to quantitative and qualitative characteristics.


Groupings are called simple if they are formed by one characteristic, and
complex or combined if they are formed by several characteristics.

The size of the interval is important in groupings by quantitative
characteristics. Intervals have a size (width) and lower and upper boundaries.
Intervals can be open or closed depending on their boundaries. Open intervals
have only one boundary, either a lower or an upper limit. Intervals that have
both upper and lower boundaries are closed intervals. Closed intervals can be
the same or different in size.

Grouping of data plays a significant role when we have to deal with large data
sets. Grouped data is data formed by arranging individual observations of a
variable into groups, so that a frequency distribution table of these groups
provides a convenient way of summarizing or analyzing the data. This information
can also be displayed using a pictograph or a bar graph.

Frequency distribution table for grouped data

When the collected data set is large, we can follow the approach below to
analyse it easily using tally marks.

Example:

Consider the marks obtained by 50 students of class VII in an examination. The
maximum mark for the examination is 50.

23, 8, 13, 18, 32, 44, 19, 8, 25, 27, 10, 30, 22, 40, 39, 17, 25, 9, 15, 20, 30, 24, 29,
19, 16, 33, 38, 46, 43, 22, 37, 27, 17, 11, 34, 41, 35, 45, 31, 26, 42, 18, 28, 30, 22,
20, 33, 39, 40, 32

If we create a frequency distribution table with a row for each and every
observation, it will be a large table. So, for easy understanding, we can make a
table with groups of observations, say 0 to 10, 10 to 20, etc.

Marks      Frequency
0 – 10     3
10 – 20    11
20 – 30    14
30 – 40    14
40 – 50    8

The distribution obtained in the above table is known as the grouped frequency
distribution. It helps us draw significant inferences, such as:

(i) Most students (28 of 50) scored between 20 and 40, i.e. in the classes 20-30
and 30-40.

(ii) 8 students scored 40 marks or more, i.e. they got 80% or more in the
examination.

In the above-obtained table, the groups 0-10, 10-20, 20-30, … are known as class
intervals (or classes). Observe that 10 appears in both of the intervals 0-10
and 10-20. Similarly, 20 appears in both 10-20 and 20-30. But an observation of
10 or 20 cannot belong to two classes at once. To avoid this inconsistency, we
adopt the rule that a common observation belongs to the higher class. This means
that 10 belongs to the class interval 10-20 but not to 0-10. Similarly, 20
belongs to 20-30 but not to 10-20, etc.

Consider a class, say 10-20, where 10 is the lower class limit and 20 is the
upper class limit. The difference between the upper and lower class limits is
called the class height, class size, or class width of the class interval.
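
The rule above, that a value equal to a shared limit is counted in the higher
class, amounts to treating each class as including its lower limit and excluding
its upper limit. A minimal Python sketch of that rule, applied to the 50 marks
listed earlier, follows; the function name grouped_frequencies is illustrative
and not part of the lecture.

```python
from collections import Counter

# Marks of the 50 students listed in the example above.
marks = [
    23, 8, 13, 18, 32, 44, 19, 8, 25, 27, 10, 30, 22, 40, 39, 17, 25, 9, 15,
    20, 30, 24, 29, 19, 16, 33, 38, 46, 43, 22, 37, 27, 17, 11, 34, 41, 35,
    45, 31, 26, 42, 18, 28, 30, 22, 20, 33, 39, 40, 32,
]


def grouped_frequencies(values, width):
    """Count whole-number values per class [lower, lower + width).

    The upper limit is excluded, so a value such as 10 is counted
    in 10-20 and not in 0-10.
    """
    # Floor division maps each value to the lower limit of its class.
    counts = Counter((v // width) * width for v in values)
    lowest, highest = min(counts), max(counts)
    return [
        (lower, lower + width, counts.get(lower, 0))
        for lower in range(lowest, highest + width, width)
    ]


table = grouped_frequencies(marks, width=10)
for lower, upper, freq in table:
    print(f"{lower}-{upper}: {freq}")
```

With width=10 this prints the classes 0-10 to 40-50 with frequencies 3, 11, 14,
14 and 8, which sum to 50. Under the same rule, a mark of exactly 50 would open
a further class 50-60; no such mark occurs in this data.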

Class interval or class size. The class interval is very important when it comes
to drawing histograms and frequency diagrams. All the classes may have the same
class size, or they may have different class sizes, depending on how you group
your data. In these notes the class interval is always a whole number.

Below is an example of grouped data where the classes have the same class
interval.

Age (years)    Frequency
0 – 9          12
10 – 19        30
20 – 29        18
30 – 39        12
40 – 49        9
50 – 59        6
60 – 69        0

Below is an example of grouped data where the classes have different class
intervals.

Age (years)    Frequency    Class Interval
0 – 9          15           10
10 – 19        18           10
20 – 29        17           10
30 – 49        35           20
50 – 79        20           30

Calculating Class Interval

Given a set of raw or ungrouped data, how would you group that data into suitable
classes that are easy to work with and at the same time meaningful?

The first step is to determine how many classes you want to have. Next, you
subtract the lowest value in the data set from the highest value in the data
set, and then you divide by the number of classes that you want to have:

Class interval = (highest value − lowest value) / number of classes

Example 1:

Group the following raw data into ten classes.

Solution:

The first step is to identify the highest and lowest number

The class interval should always be a whole number, yet in this case we have a
decimal number. The solution is to round to the nearest whole number. In this
example, 2.8 is rounded up to 3, so the class width is 3: we group the data
into classes of width 3, as in the table below.

Number     Frequency
1 – 3      7
4 – 6      6
7 – 9      4
10 – 12    2
13 – 15    2
16 – 18    8
19 – 21    1
22 – 24    2
25 – 27    3
28 – 30    2
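
The calculation in Example 1 can be sketched in Python as follows. The raw data
of the example is not reproduced here, so the extremes 1 and 29 are assumptions,
chosen because they give the quotient 2.8 quoted above and match the class
limits in the table. The function names are illustrative.

```python
import math


def class_width(lowest, highest, num_classes):
    """Return the raw quotient and the width rounded to the nearest whole number."""
    raw = (highest - lowest) / num_classes
    # floor(x + 0.5) rounds halves up; Python's round() rounds halves to even.
    return raw, math.floor(raw + 0.5)


def class_limits(lowest, width, num_classes):
    """Inclusive (lower, upper) class limits for whole-number data."""
    return [
        (lowest + i * width, lowest + (i + 1) * width - 1)
        for i in range(num_classes)
    ]


# Assumed extremes (hypothetical, see the note above): lowest 1, highest 29.
raw, width = class_width(lowest=1, highest=29, num_classes=10)
limits = class_limits(lowest=1, width=width, num_classes=10)

print(raw, width)             # quotient and rounded class width
print(limits[0], limits[-1])  # first and last class limits
```

Rounding to the nearest whole number can leave the top of the range uncovered
when the quotient is rounded down, so it is worth checking that the last upper
limit reaches the highest value in the data.
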
Class Limits and Class Boundaries

Class limits are the actual values that you see in the table. In the table
above, for example, 1 and 3 are the class limits of the first class. Class
limits are divided into two categories: the lower class limit and the upper
class limit. For the first class in the table above, 1 is the lower class limit
and 3 is the upper class limit.

On the other hand, class boundaries are not always observed in the frequency table.
Class boundaries give the true class interval, and similar to class limits, are
also divided into lower and upper class boundaries.

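The relationship between the class boundaries and the class interval is given as
follows:

Class interval = upper class boundary − lower class boundary

Class boundaries are related to class limits by the following relationships,
stated here for whole-number class limits one unit apart, as in the tables
above:

Lower class boundary = lower class limit − 0.5
Upper class boundary = upper class limit + 0.5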

As a result of the above, the lower class boundary of one class is equal to the
upper class boundary of the previous class.
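
As an illustration, the following Python sketch derives class boundaries from
the class limits of the age table with unequal classes shown earlier. It assumes
the common convention that each boundary lies halfway across the gap between
adjacent classes (0.5 below the lower limit and 0.5 above the upper limit when
the limits are whole numbers one unit apart). The function name is illustrative.

```python
def class_boundaries(limits):
    """Return (lower boundary, upper boundary) for each (lower, upper) limit pair.

    Assumes the gap between adjacent classes is the same throughout,
    e.g. 10 - 9 = 1 for whole-number limits.
    """
    gap = limits[1][0] - limits[0][1]
    half = gap / 2
    return [(lower - half, upper + half) for lower, upper in limits]


# Class limits from the age table with different class intervals.
age_limits = [(0, 9), (10, 19), (20, 29), (30, 49), (50, 79)]

boundaries = class_boundaries(age_limits)
widths = [upper - lower for lower, upper in boundaries]

for (lo, hi), (lb, ub), w in zip(age_limits, boundaries, widths):
    print(f"{lo}-{hi}: boundaries {lb} to {ub}, class interval {w}")
```

The resulting class intervals are 10, 10, 10, 20 and 30, in agreement with the
Class Interval column of that table, and each lower boundary coincides with the
upper boundary of the previous class.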

Class limits and class boundaries play separate roles when it comes to representing
statistical data diagrammatically as we shall see in a moment.

For example, consider the following grouping according to the size of farm plots
(in hectares): up to 3, 4-5, 6-10, 11-20, 21-50, 51-70, 71-100, 101-200, above
200. Both open and closed intervals appear here: the first and last intervals
are open, and the rest are closed. In this grouping, the closed intervals differ
from each other in size. Let's look at another example. The intervals of a
point-based grading system are closed and equal in size, except for the first
interval: 0-50, 51-60, 61-70, 71-80, 81-90, 91-100.

Combined groups are formed according to two or more characteristics. In such
groupings, groups based on one characteristic are divided into subgroups based
on another characteristic.
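
Assigning an observation to one of the farm-plot groups above can be sketched in
Python with the standard bisect module. The sketch assumes plot sizes are
recorded in whole hectares and reads the last two groups as 101-200 and above
200; the function name farm_group is illustrative.

```python
from bisect import bisect_left

# Upper limits of every group except the last, which is open-ended.
upper_limits = [3, 5, 10, 20, 50, 70, 100, 200]
labels = [
    "up to 3", "4-5", "6-10", "11-20", "21-50",
    "51-70", "71-100", "101-200", "above 200",
]


def farm_group(hectares):
    """Return the label of the group containing a whole-hectare plot size."""
    # bisect_left finds the first upper limit that is >= hectares;
    # sizes beyond the last limit fall into the open group "above 200".
    return labels[bisect_left(upper_limits, hectares)]


for size in (2, 4, 200, 350):
    print(size, "->", farm_group(size))
```

Only the first and last labels describe open intervals; every other group is
closed, with both limits included.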
