Introduction

Stat

Uploaded by

mukeloskhondze

STA 131

1.0 Introduction

1.1 What is Statistics?

The most commonly used meaning of statistics is collections of data, such as agricultural statistics,
trade statistics, financial statistics, labour statistics, income and expenditure statistics, etc. Used in this
sense, statistics refers to collections of figures obtained by measuring or counting, normally called
“data”.

Statistics is a common method of presenting information that relates to numerical data, and the term
can also refer to the science of dealing with numerical data itself. The main objective of statistics is to
provide useful information by means of figures or numbers.

Data and Information

Data are plain, raw, unorganized facts and, as such, are of little use. Information, on the other hand, is
processed data (data that have been recorded, classified, organized, etc.) and is therefore useful.

1.2 The Nature and Forms of Statistical Methods

Population and Sample

A population refers to all members of a group having a common and well-defined characteristic. The
study of the entire population is called a census. Examples: Population Census, Economic Census,
Agricultural Census, etc.

A sample is a selected part of the population that is investigated. Samples are drawn to derive
estimates of unknown population parameters and to test whether claims about an unknown
population parameter are likely to be true.

Statistic and Parameter

A statistic is a summary measure obtained from sample data, for example the sample mean (x̄) and
the sample standard deviation (s).

A parameter is a summary measure obtained from the entire population data, for example the
population mean (µ) and the population standard deviation (σ).
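
The distinction can be sketched with Python's standard statistics module. The marks below are
made-up figures used only for illustration: the first list is treated as an entire population and the
second as a sample taken from it.

```python
import statistics

# Hypothetical data: marks of all 8 students in a class (the population).
population = [52, 61, 47, 70, 58, 66, 49, 77]
# A sample of 4 of those students (picked by hand for illustration).
sample = [52, 70, 66, 49]

# Parameters: computed from the whole population.
mu = statistics.mean(population)       # population mean
sigma = statistics.pstdev(population)  # population standard deviation (divides by N)

# Statistics: computed from the sample only.
x_bar = statistics.mean(sample)        # sample mean
s = statistics.stdev(sample)           # sample standard deviation (divides by n - 1)

print(f"mu = {mu}, sigma = {sigma:.2f}")
print(f"x_bar = {x_bar}, s = {s:.2f}")
```

The sample values differ from the population values, which is why a statistic is treated as an
estimate of the corresponding parameter rather than as the parameter itself.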

As already stated, the objective of statistics is to convert data into meaningful information, which
involves the classification, summarization, interpretation and presentation of the data. These methods
are called Descriptive Statistics.

Methods that use data to make predictions, estimates and judgments about larger groups of data
fall under Inferential Statistics. In other words, inferential statistics draws conclusions about a
population from data obtained from a sample drawn from that population.

Statistical inference goes beyond a mere description of sample data and becomes a methodology to
aid decision makers by reducing the level of uncertainty in decision making that would have existed in
the absence of sample data.
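
A minimal sketch of this idea, assuming a made-up population of household incomes and a simple
random sample of 50 households (the figures, the seed and the sample size are all illustrative
assumptions, not part of the course material):

```python
import random
import statistics

# Hypothetical population: monthly incomes of 1,000 households,
# generated from a fixed seed so the sketch is repeatable.
rng = random.Random(131)
population = [rng.randint(500, 5000) for _ in range(1000)]

# In practice the population mean is unknown; it is computed here
# only so that the estimate can be compared with it.
mu = statistics.mean(population)

# Draw a random sample of 50 households without replacement.
sample = rng.sample(population, 50)
x_bar = statistics.mean(sample)

print(f"sample estimate = {x_bar:.1f}, population mean = {mu:.1f}")
```

The sample mean will generally be close to, but not equal to, the population mean; inferential
methods are concerned with how much trust to place in such an estimate.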

1.3 The Methodology of Statistics

1. Problem Definition: the statistician must clearly define the problem to be solved, or the
investigation to be undertaken, and identify the constraints which will be imposed.
Constraints might include personnel, money or time.

2. Collecting the Necessary Data: data may come from internal sources (i.e., from within the
organization itself) or from external sources such as governments and banks. Some of the
data will already have been collected, but if data are not available, then the statistician must
collect them or arrange for them to be collected. Data collection might involve direct
observation, but if opinions are sought then face-to-face interviews, telephone interviews or
mailed questionnaires will be necessary.

3. Classifying and Summarizing Information: here the data are organized or grouped in a
meaningful manner, for example in summary tables, charts or summary statistics.

4. Presenting the Data: the statistician will use the summarized data to present relationships
and trends that will help communicate the salient points to other interested parties.

5. Analyzing and Interpreting the Data: here, the statistician will use the descriptive measures
obtained as a basis for making statistical inferences, and will look for alternative courses of
action which will help in the decision-making process.
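
Steps 3 and 4 can be illustrated with a short Python sketch. The complaint counts below are
invented for the example, and the choice of a frequency table, mean and median is only one of
several reasonable ways to summarize such data.

```python
from collections import Counter
import statistics

# Hypothetical raw data: number of complaints received on each of 12 days.
complaints = [2, 0, 1, 3, 2, 2, 1, 0, 4, 2, 1, 3]

# Step 3 (classifying): group the observations into a frequency table.
frequency = Counter(complaints)

# Step 3 (summarizing): reduce the data to summary statistics.
mean_complaints = statistics.mean(complaints)
median_complaints = statistics.median(complaints)

# Step 4 (presenting): print the table in order of value.
for value in sorted(frequency):
    print(f"{value} complaints: {frequency[value]} day(s)")
print(f"mean = {mean_complaints:.2f}, median = {median_complaints}")
```

The raw list is hard to read at a glance, whereas the frequency table and the two summary
figures show immediately that two complaints a day is the most common outcome.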

1.4 The Classification of Data

1.4.1 Variables

Data collected for the purpose of analysis will represent varying values of the phenomenon under
investigation. The phenomenon is termed a variable, and is defined as a characteristic which shows
variation. Observations of this characteristic will yield values of the variable.

1.4.2 Types of Variables

Discrete Variables: Discrete variables are numeric variables that have a countable number of values
between any two values. A discrete variable is always numeric and typically takes integer values.
Examples are the number of customer complaints and the number of flaws or defects.

Continuous Variables: Continuous variables are numeric variables that have an infinite number of
values between any two values. The values of a continuous variable will comprise a set of decimal
numbers, the number of digits after the decimal point being determined by the accuracy of the
measuring equipment and the recorder’s ability to use it. Examples are age, height, weight and length.

Categorical Variables: Categorical variables contain a finite number of categories or distinct groups
(non-numeric). Categorical data might not have a logical order. Examples of categorical variables
include gender, material type, and payment method.

Ordinal Variables: Ordinal variables are those whose values can be arranged in a meaningful order, but
represent rank value only. For example, when Mandiya arrives at a party and is asked what he would
like to drink, he always asks for gin. If all the gin has been drunk, he will ask for beer. If all the beer
has gone, he will (reluctantly) ask for a soft drink. Mandiya’s drink preference may be ranked as follows:

Drink        Rank
Gin          3
Beer         2
Soft drink   1

where 3 = most preferred and 1 = least preferred.

The ranks here represent the order of preference only; nothing can be deduced from their numerical
values about relative preferences. For example, we cannot assume that Mandiya’s preference for gin
over beer (3 - 2) is the same as his preference for beer over soft drink (2 - 1). Likewise, it would be
quite wrong to assume that Mandiya would be indifferent between 3 glasses of soft drink and one
glass of gin; it is better just to ask Mandiya. In the case of ordinal level data, it is only the order of
values that is important, and not the difference between them.
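
The same point can be sketched in Python using the ranks from the example. The list of drinks
still available is a hypothetical scenario added for illustration.

```python
# Ranks taken from the example: 3 = most preferred, 1 = least preferred.
rank = {"Gin": 3, "Beer": 2, "Soft drink": 1}

# Ordering by rank is meaningful for ordinal data.
by_preference = sorted(rank, key=rank.get, reverse=True)
print(by_preference)

# Hypothetical scenario: the gin has run out, so only these drinks are left.
available = ["Soft drink", "Beer"]
choice = max(available, key=rank.get)
print(choice)

# Arithmetic on the ranks runs without error but has no interpretation:
# the result 3 does not mean that three soft drinks are worth one gin.
meaningless = 3 * rank["Soft drink"]
```

Only the comparisons (sorting, picking the highest-ranked available drink) use the information
that ordinal data actually carry.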

Interval level and Ratio level Variables

Interval level Variables: Interval level variables are those in which the difference between values has
significance, but whose values do not have a fixed and unambiguously defined zero origin. The often
quoted example of such a variable is the Fahrenheit temperature scale, on which the lower reference
point is the freezing point of water, set at 32°F. Care must be taken in interpreting this scale: it is
legitimate to say that the temperature difference between 40°F and 60°F is the same as the difference
between 60°F and 80°F, and is half the difference between 40°F and 80°F. What is not legitimate is to
say that 80°F is twice as hot as 40°F, since the zero of the scale is arbitrary and does not mark the
absence of heat.
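
One way to see why ratios are not meaningful on an interval scale is to express the same
temperatures (40, 60 and 80 degrees Fahrenheit) in degrees Celsius. The sketch below uses the
standard conversion formula; the function and variable names are illustrative.

```python
def fahrenheit_to_celsius(f):
    """Convert a temperature from degrees Fahrenheit to degrees Celsius."""
    return (f - 32) * 5 / 9

f_low, f_mid, f_high = 40, 60, 80
c_low, c_mid, c_high = (fahrenheit_to_celsius(f) for f in (f_low, f_mid, f_high))

# Equal differences stay equal after the change of scale.
print(f_mid - f_low == f_high - f_mid)
print(abs((c_mid - c_low) - (c_high - c_mid)) < 1e-9)

# Ratios of values do not survive the change of scale.
print(f_high / f_low)
print(c_high / c_low)
```

The ratio of the highest to the lowest temperature is 2 in Fahrenheit but about 6 in Celsius, so
"twice as hot" depends on the scale chosen, while the equality of the two differences does not.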

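Ratio level Variables: Ratio level variables are those which have the properties of interval level
variables and, additionally, have a true zero origin.
The Kelvin temperature scale is an example of such a variable. When using this, we can say that
400 K is twice as hot as 200 K, because the scale has a well-defined origin set at absolute zero. The
Celsius scale, by contrast, is an interval scale like Fahrenheit: 0°C is simply the freezing point of
water, so it is not legitimate to say that 40°C is twice as hot as 20°C.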
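
A practical check for a true zero origin is that ratios are unchanged when the units are changed.
The sketch below uses weight, measured in kilograms and pounds, as an assumed example of a
variable with a true zero; the conversion factor is approximate.

```python
KG_TO_LB = 2.20462  # approximate number of pounds in one kilogram

def kg_to_lb(kg):
    """Convert a weight from kilograms to pounds."""
    return kg * KG_TO_LB

light_kg, heavy_kg = 20, 40

# The ratio of the two weights is the same in either unit, because
# the conversion only multiplies by a constant and leaves zero at zero.
ratio_kg = heavy_kg / light_kg
ratio_lb = kg_to_lb(heavy_kg) / kg_to_lb(light_kg)
print(ratio_kg, ratio_lb)
```

A conversion that also shifts the zero point, such as Fahrenheit to Celsius, does not preserve
ratios in this way.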
