
MMW Module 4 Lesson 1

Uploaded by

cantilloaxcel456

Module 4: Data Management

Introduction to Data Management


Measures of Central Tendency
Measures of Dispersion
Measures of Relative Position
Probabilities and Normal Distribution
Simple Linear Regression and Correlation

Learning Outcomes

At the end of the module, the students will be able to:

1. Understand the language used in statistics;
2. Interpret statistical evidence from gathered data correctly and
objectively, and make inferences from it;
3. Convert normally distributed data into standardized form;
4. Apply the concept of the normal distribution in their fields of specialization;
5. Appreciate the value of statistical analysis, recognize its impact, and apply it
in daily life;
6. Practice diligence, patience, honesty, accuracy and precision in
solving statistical problems.

Lesson 1. Introduction to Data Management

When we talk about data management, we deal with statistics. Statistics is
the art and science of the collection, organization, presentation, analysis and
interpretation of data. In fields such as medicine, agriculture,
education, business, economics, politics and technology, information
translated into data gives medical practitioners, educators,
managers and decision makers a better understanding of the
environments they work in and enables them to make sounder,
better-informed decisions.

Statistics plays a vital role in our society today, especially in this time
of pandemic (COVID-19). Everyone should be included, counted and accounted for.
No one should be left behind. Because statistics is useful in almost all
fields of endeavor, some cautions should also be considered. Impressive figures
can be blown out of proportion to their real or imagined importance.
Unscrupulous minds with vested interests make improper or unethical use of
statistical methods. Questionable and even conflicting claims backed
up with “statistics” can be accepted as true, which leads one to believe that
anything can be proven statistically. Moreover, faulty research may be slanted
to produce a particular outcome; that is, statistical analyses are chosen to
produce that outcome.

For these reasons, it is most important that users of statistics and
researchers clearly understand the statistical tools and techniques
used in their research. Thus, in this module, careful attention will be given to
the role of statistics as a tool in research.

Science is based on the empirical method, that is, on systematically
obtaining information through observation. Observations are the empirical
“stuff” of science. Statistics, as defined above, is the art and science of the
collection, organization, presentation, analysis and interpretation of data.

Statistics is a set of concepts, rules, and procedures that help us to
collect, organize and present numerical information in the form of tables,
graphs, and charts; understand and analyze statistical techniques underlying
decisions that affect our lives and well-being; and interpret or make informed
decisions.

Statistics is divided into two (2) branches, called
descriptive and inferential statistics. We can differentiate the two using the
definition of statistics.

COLLECTING, ORGANIZING and PRESENTING data → DESCRIPTIVE STATISTICS

ANALYZING and INTERPRETING data → INFERENTIAL STATISTICS

Because statistical inference draws conclusions that go beyond the data at
hand, we should be careful with every piece of information we take and use.
Many situations require information about a large group of people, but time,
cost and other constraints often make it impractical to study everyone.
Instead, data can be collected from a small portion of the group.
A population is the group of elements or set of individuals of interest in a
particular study. The smaller group, the sample, is a set of individuals selected
from a population, usually intended to represent the population in a study.

PARAMETER
A parameter is a value, usually numerical, that describes a
population. It may be obtained from a single measurement, or it may be derived
from a set of measurements from the population. (µ: population mean; σ:
population standard deviation)

STATISTIC
A statistic is a value, usually numerical, that describes a sample.
It may be obtained from a single measurement, or it may be derived from a set
of measurements from the sample. (x̄: sample mean; s: sample standard
deviation)

VARIABLE
A variable is any information that differs from one member to another in
a population or sample. It is a characteristic of interest for the elements. In
Table 1.1, the weight (kg) is the variable.

Table 1.1 Weights of Randomly Selected Grade IV Pupils in AES, 1st Quarter of 2020
Section   Weights (kg)
IV-1      50   41   36   34   54   60   51   37
IV-2      22   39   42   42   45   38   38   40
IV-3      38   28   32   44   42   47   37   28
IV-4      27   27   40   41   39   32   36   24
IV-5      40   39   33   33   27   30   31   45

Each pupil included in the data set is called an element, that is, an entity on
which data are collected. The measurements collected on each variable for every
element in a study provide the data. The set of measurements obtained for a
particular element is called an observation.
In Table 1.1, the measurements recorded for the first section (IV-1) are
50, 41, 36, 34, 54, 60, 51, 37; those for the second section (IV-2) are
22, 39, 42, 42, 45, 38, 38, 40, and so on. Since only one variable (weight) is
measured, each weight is one observation, so a data set with 40 elements
contains 40 observations.
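
As a rough illustration of how a statistic is computed from a sample, the sketch below treats the 40 weights in Table 1.1 as the sample and computes the sample mean (x̄) and sample standard deviation (s) with Python's standard statistics module. The variable names (weights_by_section, all_weights) are illustrative assumptions and are not part of the module.

```python
import statistics

# Weights (kg) copied from Table 1.1, grouped by section.
weights_by_section = {
    "IV-1": [50, 41, 36, 34, 54, 60, 51, 37],
    "IV-2": [22, 39, 42, 42, 45, 38, 38, 40],
    "IV-3": [38, 28, 32, 44, 42, 47, 37, 28],
    "IV-4": [27, 27, 40, 41, 39, 32, 36, 24],
    "IV-5": [40, 39, 33, 33, 27, 30, 31, 45],
}

# Flatten into one sample: each pupil (element) contributes one observation.
all_weights = [w for section in weights_by_section.values() for w in section]

n = len(all_weights)                          # number of observations
sample_mean = statistics.mean(all_weights)    # x-bar
sample_sd = statistics.stdev(all_weights)     # s, computed with the n - 1 divisor

print(f"n = {n}")
print(f"sample mean = {sample_mean:.3f} kg")
print(f"sample standard deviation = {sample_sd:.3f} kg")
```

If the 40 pupils were instead the entire population of interest, the same mean would be a parameter (µ), and statistics.pstdev, which divides by n, would give the matching population standard deviation (σ).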

CONSTANT
A constant is a piece of information about the population or sample that is
the same for all members. The value of pi, the conversion between Celsius and
Fahrenheit, the number of days in a week, and equivalences between units of
measurement (e.g. 12 inches = 1 foot) are some examples of constants.

Data is a collection of facts, such as numbers, words, measurements,
observations or just descriptions of things.
Data are classified into two categories: Qualitative and Quantitative data.

1. Qualitative Data
Qualitative data describe qualities or characteristics. They are mostly
non-numerical and descriptive in nature. They often, but not always, capture
emotions, feelings and subjective perceptions of something.
The qualitative method of research is characterized by the following:

● It contains open-ended questions that aim to address the ‘how’
and ‘why’ of an event, and it uses unstructured methods of data
collection to fully explore the topic.
● It relies more heavily on interviews, and there is more interaction
between the researcher and the respondents.
● The findings cannot be generalized to any specific population, but
they can produce evidence that can be used to look for general
patterns across different studies on different issues.
It can be collected through:
● In-depth interview
● Observation methods
● Document review

Here are some examples:

1. color of hair, eyes and skin
2. home address and phone number
3. experiences of a person taken from diaries

2. Quantitative Data
Quantitative data deal with things that are measurable and can be
expressed in numbers and figures. They are usually expressed in numerical form
and can be mathematically computed. Quantitative data can be collected
using:

● Experiments/clinical trials
● Observing and recording well-defined objects such as number of
cars which participated in a motorcade.
● Administering surveys with closed-ended questions.
● Paper-pencil questionnaires

Example:
1. Number of siblings
2. Height and weight
3. Temperature in degree Celsius

Quantitative data can either be:

a. Discrete data – data which cannot be broken down into smaller
parts. This type of data consists of whole-number counts. The number of
siblings (0, 1, 2, 3, …) is an example.

b. Continuous data – data that can be infinitely broken down into
smaller parts, or data which can take a decimal value. Examples are
height and weight (1.37 meters and 72.6 kilograms).

For example, if you would describe a house, your description can either be
qualitative or quantitative. Here are some descriptions:

Qualitative                              Quantitative
The house is located in Baguio City.     The house is 8.5 meters high.
The house is mostly made of cement.      The house has 3 bedrooms.
The color of the house is green.         The house’s floor area is 125 square meters.
The door is made of oak.

Data Levels of Measurement

1. Nominal data
This level of data is categorical in nature; no category is greater than or
less than another, and the categories are not in any particular order. Also,
the categories are exclusive and exhaustive, meaning the response can be
neither ‘both’ nor ‘neither’.

Example: Sex (male or female), civil status (married, divorced, separated,
widowed)

2. Ordinal data
Ordinal data must also be exclusive and exhaustive, but the difference is
that the responses are ranked, that is, they have an order. Here, you can say
that one response is higher or better than another.

Example: Academic rank (Instructor, Professor), socioeconomic status (rich,
middle class, poor)

3. Interval
Here, intervals of equal length signify equal differences in the data.
Differences make sense but ratios do not. For example, 30 °C
is not twice as hot as 15 °C. Also, there is no ‘true zero’ starting point:
zero does not signify the absence of the quantity being measured. Zero
degrees Celsius does not mean that there is no temperature.

Example: Temperature
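
The claim that differences are meaningful on an interval scale but ratios are not can be checked with a little arithmetic. The sketch below is only an illustration: it re-expresses 30 °C and 15 °C in Fahrenheit (another interval scale) and in kelvins (a scale with a true zero) using the standard conversion formulas. The function names are hypothetical helpers introduced for this example.

```python
def celsius_to_fahrenheit(c):
    """Convert degrees Celsius to degrees Fahrenheit (another interval scale)."""
    return c * 9 / 5 + 32


def celsius_to_kelvin(c):
    """Convert degrees Celsius to kelvins (a scale with a true zero)."""
    return c + 273.15


hot_c, cool_c = 30.0, 15.0

# The difference keeps its meaning: 15 Celsius degrees is always 27 Fahrenheit degrees.
difference_c = hot_c - cool_c
difference_f = celsius_to_fahrenheit(hot_c) - celsius_to_fahrenheit(cool_c)

# The ratio changes with the choice of scale, so "twice as hot" has no fixed meaning.
ratio_c = hot_c / cool_c
ratio_f = celsius_to_fahrenheit(hot_c) / celsius_to_fahrenheit(cool_c)
ratio_k = celsius_to_kelvin(hot_c) / celsius_to_kelvin(cool_c)

print(f"difference: {difference_c} Celsius degrees = {difference_f} Fahrenheit degrees")
print(f"ratio in Celsius:    {ratio_c:.3f}")
print(f"ratio in Fahrenheit: {ratio_f:.3f}")
print(f"ratio in kelvins:    {ratio_k:.3f}")
```

The three ratios disagree, while the temperature difference stays the same physical amount under every conversion. Only on the kelvin scale, where zero marks a true absence, does the ratio carry physical meaning.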

4. Ratio
At this level, both differences and ratios are meaningful. For example, 4
liters of water is twice as much as 2 liters of water. There is also a
‘true zero’ starting point, in which zero means nothing or the absence of the
quantity being measured. Zero liters of water means there is no water.

Example: Weight, Height, Number of children

Data
   Qualitative:  Nominal, Ordinal
   Quantitative: Interval, Ratio

Data can also be classified according to who collected them. They can be
primary data or secondary data.

Primary data – These are data which were collected first hand. They are
generally considered more authentic, reliable and objective than secondary
data. Primary data can be obtained through experiments, surveys,
questionnaires, interviews and observations.

Secondary data – These are data collected from sources already published in
any form. The review of literature in research is based on secondary sources.
Secondary data are valuable because you do not need to go through the hassle
of collecting data that are already available and published. This saves time,
effort and money on the part of the researcher. Secondary data can be collected
from books, records, magazines, research articles, newspapers, biographies,
databases, etc.

Data Presentation

1. Textual Presentation
In textual or descriptive presentation, the data are presented in sentences
or paragraphs. This is usually used when the amount of data is not too large.
For example:
The population of Region I as of May 1, 2020 is 5,301,139 based on the
2020 Census of Population and Housing (2020 CPH). This accounts for about
4.86 percent of the Philippine population in 2020. The 2020 population of the
region is higher by 275,011 from the population of 5.03 million in 2015, and
552,767 more than the population of 4.75 million in 2010. Moreover, it is higher
by 1,100,661 compared with the population of 4.20 million in 2000.
(psa.gov.ph)
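
The differences quoted in the passage above can be reproduced with a few subtractions. The sketch below is a minimal check, not part of the source. It uses the exact census totals that appear in the tables later in this lesson, and it assumes the 2010 total is 4,748,372, the sum of the provincial counts for that year.

```python
# Region I census totals. The 2010 figure is an assumption: it is taken as
# the sum of the provincial counts reported for that year.
population = {
    2000: 4_200_478,
    2010: 4_748_372,
    2015: 5_026_128,
    2020: 5_301_139,
}

latest = population[2020]

# Absolute increase from each earlier census to the 2020 census.
increase_to_2020 = {
    year: latest - count for year, count in population.items() if year != 2020
}

for year, increase in sorted(increase_to_2020.items(), reverse=True):
    print(f"{year} to 2020: +{increase:,}")
```

The three printed increases match the figures given in the text, which is a quick way to confirm that a textual presentation agrees with its underlying data.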

2. Tabular Presentation
In tabular presentation, data are presented in tables, which can hold even
a large amount of data while keeping it engaging and easy to read. The data are
arranged in rows (horizontal) and columns (vertical). Tabular presentation
avoids unnecessary details and repetition of data. It can reveal patterns which
cannot be seen when the data are presented in textual form.

In presenting data using a table, take note of the following:

● A table must have a table number and a title.
● Subtitles are properly mentioned in the column and row headers.
● Contents of the table are defined clearly.
● Units of measurement are clearly stated whenever necessary.
● Legends for symbols/short forms and sources are indicated in the
footnote.
● The data are logically arranged in the table.

Here is an example of a table presenting the population of Region I for the
years 2000 to 2020.

Table 1. Total Population in Region I

Census Year    Census Reference Date    Total Population
2000           May 1, 2000              4,200,478
2010           May 1, 2010              4,748,372
2015           August 1, 2015           5,026,128
2020           May 1, 2020              5,301,139
Source: Philippine Statistics Authority

Table 2. Population of Region I by Province

Province        2000         2010         2015         2020
Ilocos Norte      514,241      568,017      593,081      609,588
Ilocos Sur        594,206      658,587      689,668      706,009
La Union          657,945      741,906      786,653      822,352
Pangasinan      2,434,086    2,779,862    2,956,726    3,163,190
Total           4,200,478    4,748,372    5,026,128    5,301,139
Source: Philippine Statistics Authority
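
One benefit of a table is that its contents can be checked mechanically. As a hedged sketch, the code below re-adds each census-year column of Table 2 and compares the result with the Total row. The names table_2, reported_totals and computed_totals are illustrative only.

```python
years = [2000, 2010, 2015, 2020]

# Provincial counts copied from Table 2, one list per province in year order.
table_2 = {
    "Ilocos Norte": [514_241, 568_017, 593_081, 609_588],
    "Ilocos Sur": [594_206, 658_587, 689_668, 706_009],
    "La Union": [657_945, 741_906, 786_653, 822_352],
    "Pangasinan": [2_434_086, 2_779_862, 2_956_726, 3_163_190],
}

# The Total row as printed in Table 2.
reported_totals = [4_200_478, 4_748_372, 5_026_128, 5_301_139]

# zip(*rows) turns the province rows into census-year columns.
computed_totals = [sum(column) for column in zip(*table_2.values())]

for year, computed, reported in zip(years, computed_totals, reported_totals):
    status = "matches" if computed == reported else "DOES NOT MATCH"
    print(f"{year}: computed {computed:,} {status} reported {reported:,}")
```

A check like this catches transcription errors before the table is used for graphs or further analysis.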

3. Diagrammatical or Graphical Presentation
This type of presentation uses graphs or diagrams such as bar graph, pie
graph, line graph and scatter diagram. Diagrams give a bird’s eye view of the
data and can be easily understood just by looking at the graph.

Some of the commonly used charts and graphs are the following:

1. Pie chart
The following pie graph illustrates the population of Region I by province
for the year 2020, using the data in Table 2.

Figure 1. Population of Region I by province for the year 2020

It can be seen in the graph that Pangasinan constitutes about 60% of the total
population of Region I.
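
Each slice of a pie chart is a province's share of the total. The sketch below shows one way those percentages could be computed from the 2020 column of Table 2. It is an illustration under that assumption, not the procedure used to draw Figure 1.

```python
# 2020 population by province, copied from Table 2.
population_2020 = {
    "Ilocos Norte": 609_588,
    "Ilocos Sur": 706_009,
    "La Union": 822_352,
    "Pangasinan": 3_163_190,
}

total = sum(population_2020.values())

# Share of the regional total, in percent, for each province.
shares = {
    province: 100 * count / total for province, count in population_2020.items()
}

for province, share in shares.items():
    print(f"{province}: {share:.1f}%")
```

Pangasinan's share comes out just under 60%, which is why "about 60%" is the appropriate reading of the chart.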

2. Bar graph
The bar graph in Figure 2 compares the populations of
the provinces in Region I for 2000, 2010, 2015 and 2020.

Figure 2. Population by province in Region I (Bar Graph)

The bar graph shows that Pangasinan dominates the population of
Region I from year 2000 to 2020. The province with the least population
is Ilocos Norte.

3. Column chart

This example of a column graph is similar to the bar graph above.

Figure 3. Population by province in Region I (Column Graph)

4. Line graph

The following example compares the populations of the provinces in
Region I, as shown in Table 2. Region I is composed
of four provinces, namely Ilocos Norte, Ilocos Sur, La Union and Pangasinan.

Figure 4. Population by Province in Region I

The line graph illustrates that the population of the provinces from
Region I continuously increased from year 2000 to 2020.
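
The upward trend read from the line graph can also be confirmed from the numbers in Table 2. The sketch below checks that every province's population rises from one census to the next. The helper strictly_increasing is a hypothetical name introduced for this example.

```python
years = [2000, 2010, 2015, 2020]

# Provincial counts copied from Table 2, in census-year order.
population_by_province = {
    "Ilocos Norte": [514_241, 568_017, 593_081, 609_588],
    "Ilocos Sur": [594_206, 658_587, 689_668, 706_009],
    "La Union": [657_945, 741_906, 786_653, 822_352],
    "Pangasinan": [2_434_086, 2_779_862, 2_956_726, 3_163_190],
}


def strictly_increasing(values):
    """Return True if each value is larger than the one before it."""
    return all(earlier < later for earlier, later in zip(values, values[1:]))


trend = {
    province: strictly_increasing(counts)
    for province, counts in population_by_province.items()
}

for province, rising in trend.items():
    print(f"{province}: {'increased at every census' if rising else 'did not always increase'}")
```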

5. Scatterplot

The following scatter plot depicts the population of the
Philippines from 1990 to 2021.

Figure 5. Philippine Population from 1990 to 2021

The scatter plot shows an almost perfect linear relationship between the
year and the population of the Philippines.

Looking at the given examples, diagrams are mostly used as visual aids. They
should not be considered alternatives to the numerical data. Diagrams and graphs
are not as accurate as tabular data, so only tabular data should be used for
further analysis.

MODULE IV – Data Management
Learning Activity 1 – Introduction to Data Management

Name: ______________________________________________________
Course, Year and Section: _____________________________________

Classify each of the following data as qualitative or quantitative, and as
nominal, ordinal, ratio or interval.

Data                                      Type of Data       Level of Measurement
                                          (Qualitative or    (Nominal, Ordinal,
                                          Quantitative)      Ratio, Interval)
1. Test questions classified as easy, average or difficult
2. Years of important historical events (e.g. 1941, 1980, 2000)
3. Flavor of ice cream
4. Age of students enrolled in GECC 103
5. Amount of money in your savings account
6. Religion
7. Contact Number
8. Home Address
9. Number of minutes allocated for reviewing before you sleep
10. IQ
