Module of Introduction To Statistics
                       INTRODUCTION TO STATISTICS
                           Course Code: (ABVM2101)
                               (3/5 Cr.Hrs/ECTS)
                                Module Writer:
                        Getahun G. Woldemariam (MSc)
                                                             May, 2020
                                                       Woliso, Ethiopia
The learning task was designed to equip students with the ability to
    Identify the importance and application areas of statistics in their
     field of study;
    Interpret statistical information, reports, charts and figures;
    Choose appropriate sampling methods and procedures;
    Explain the basic concepts of probability distributions and their
     application;
    Use estimation and testing methods for prediction and
     generalization purposes.
    In addition, the learning task attempts to enable students to describe
     data collection tools and procedures.
1. Qualitative variables are nonnumeric variables that cannot be measured numerically. Examples include
gender, religious affiliation, and state of birth.
Applications of statistics:
      Statistics is applied in almost all fields of human endeavor.
      Almost all human beings encounter numerical facts in their daily life,
       e.g. about prices.
      It is applicable in specific processes, e.g. the invention of certain drugs and assessing the
       extent of environmental pollution.
      It is used in industry, especially in the area of quality control.
Uses of statistics:
The main function of statistics is to enlarge our knowledge of complex phenomena. The
following are some uses of statistics:
   1. It presents facts in a definite and precise form.
   2. Data reduction.
   3. Measuring the magnitude of variations in data.
   4. Furnishing a technique of comparison.
   5. Estimating unknown population characteristics.
   6. Formulating and testing hypotheses.
   7. Studying the relationship between two or more variables.
   8. Forecasting future events.
Limitations of statistics
As a science, statistics has its own limitations. The following are some of them:
      It deals only with quantitative information.
      It deals only with aggregates of facts and not with individual data items.
      Statistical data are only approximately, not mathematically, correct.
      Statistics can easily be misused and therefore should be used by experts.
In mathematical terms measurement is a functional mapping from the set of objects {Oi} to the
set of real numbers {M(Oi)}.
The goal of measurement systems is to structure the rule for assigning numbers to objects in
such a way that the relationship between the objects is preserved in the numbers assigned to the
objects. The different kinds of relationships preserved are called properties of the measurement
system.
Order
The property of order exists when an object that has more of the attribute than another object
is given a bigger number by the rule system. This relationship must hold for all objects in the
"real world".
The property of ORDER exists when, for all i, j, if Oi > Oj, then M(Oi) > M(Oj).
Distance
The property of distance is concerned with the relationship of differences between objects. If a
measurement system possesses the property of distance it means that the unit of measurement
Ordinal Scales
Ordinal Scales are measurement systems that possess the property of order, but not the
property of distance. The property of fixed zero is not important if the property of distance is
not satisfied.
An ordinal scale is a level of measurement which classifies data into categories that can be ranked, but
differences between the ranks are not meaningful. Arithmetic operations are not applicable, but relational
operations are applicable. Ordering is the sole property of an ordinal scale.
Examples:
    Letter grades (A, B, C, D, F)
    Rating scales (Excellent, Very good, Good, Fair, poor)
    Military status
Interval Scales
Interval scales are measurement systems that possess the properties of order and distance, but
not the property of fixed zero. This level of measurement classifies data that can be ranked,
and differences between values are meaningful. However, there is no meaningful zero, so ratios are
meaningless. All arithmetic operations except division are applicable. Relational operations are
also possible.
Examples:
    IQ
    Temperature in °F
Ratio Scales
Ratio scales are measurement systems that possess all three properties: order, distance, and
fixed zero. The added power of a fixed zero allows ratios of numbers to be meaningfully
interpreted; e.g. one can say that the ratio of Bekele's height to Martha's height is 1.32, whereas
this is not possible with interval scales.
This level of measurement classifies data that can be ranked, for which differences are meaningful and
there is a true zero. True ratios exist between the different units of measure. All arithmetic and
relational operations are applicable.
Examples:
    Weight
    Height
    Number of students
    Age
The following is a list of different attributes and rules for assigning numbers to objects.
Try to classify each measurement system into one of the four types of scales.
(Exercise)
      Your checking account balance as a measure of the amount of money you have in that
       account.
      Your score on the first statistics test as a measure of your knowledge of statistics.
      Your score on an individual intelligence test as a measure of your intelligence.
      The distance around your forehead measured with a tape measure as a measure of your
       intelligence.
      A response to the statement "Abortion is a woman's right" where "Strongly Disagree" =
       1, "Disagree" = 2, "No Opinion" = 3, "Agree" = 4, and "Strongly Agree" = 5, as a
       measure of attitude toward abortion.
      Times for swimmers to complete a 50-meter race.
      Months of the year: Meskerm, Tikimit…
      Socioeconomic status of a family when classified as low, middle and upper classes.
      Blood type of individuals: A, B, AB and O.
      Region numbers of Ethiopia (1, 2, 3, etc.).
      The number of students in a college.
      The net wages of a group of workers.
      The heights of the men in the same town.
    Having collected and edited the data, the next important step is to organize it, that is, to present
    it in a readily comprehensible, condensed form that helps in drawing inferences from it. It is
    also necessary that like items be separated from unlike ones.
    The presentation of data is broadly classified into the following two categories:
             Tabular presentation
             Diagrammatic and graphic presentation.
    The process of arranging data into classes or categories according to similarities is technically
    called classification.
    Classification is a preliminary step that prepares the ground for proper presentation of data.
    Definitions:
             Raw data: recorded information in its original collected form, whether it is counts or
              measurements, is referred to as raw data.
             Frequency: is the number of values in a specific class of the distribution.
             Frequency distribution: is the organization of raw data in table form using classes and
              frequencies.
    There are three basic types of frequency distributions
                    Categorical frequency distribution
                    Ungrouped frequency distribution
                    Grouped frequency distribution
  There are specific procedures for constructing each type.
    2.1.Categorical frequency Distribution:
Used for data that can be placed in specific categories, such as nominal or ordinal data, e.g. marital
status.
Example: A social worker collected the following data on marital status for 25
persons (M = married, S = single, W = widowed, D = divorced).
   M                         S                D                  W                   D
   S                         S                M                  M                   M
   W                         D                S                  M                   M
   W                         D                D                  S                   S
   S                         W                W                  D                   D
Solution:
Step 2: Tally the data and place the result in column (2).
Step 3: Count the tally and place the result in column (3).
Step 4: Find the percentage of values in each class by using
$$\% = \frac{f}{n} \times 100$$
where f = frequency of the class and n = total number of values.
Percentages are not normally part of a frequency distribution, but they can be added since they are
used in certain types of diagrams such as pie charts.
Step 5: Find the totals for columns (3) and (4).
Combining all the steps, one can construct the following frequency distribution.
   Class (1)        Tally (2)          Frequency (3)             Percent (4)
   M                //// /             6                         24
   S                //// //            7                         28
   D                //// //            7                         28
   W                ////               5                         20
   Total                               25                        100
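The tally, count and percentage steps can also be checked programmatically. The following is a minimal sketch, assuming the 25 codes are read row by row from the data above; the function name `categorical_distribution` is illustrative and not part of the module.

```python
from collections import Counter

# Marital-status codes for the 25 persons, read row by row from the example.
data = "M S D W D S S M M M W D S M M W D D S S S W W D D".split()

def categorical_distribution(values):
    """Return {category: (frequency, percent)} for a list of category labels."""
    counts = Counter(values)               # Steps 2-3: tally and count
    n = len(values)
    return {cat: (f, 100 * f / n) for cat, f in counts.items()}  # Step 4

table = categorical_distribution(data)
for cat in "MSDW":
    f, pct = table[cat]
    print(f"{cat}: frequency={f}, percent={pct:.0f}")
```

The frequencies add up to 25 and the percentages to 100, which is the check described in Step 5.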
            80     76        90   85       80
            70     60        62   70       85
            65     60        63   74       75
            76     70        70   80       85
Solution:
   Each individual value is presented separately, that is why it is named ungrouped frequency
   distribution.
Pictogram
- In this diagram, we represent data by means of picture symbols. We decide on a
suitable picture to represent a definite number of units in which the variable is measured.
Example: draw a pictogram to represent the following population of a town.
Year          1989           1990       1991                1992
Bar Charts:
-        A set of bars (thick lines or narrow rectangles) representing some magnitude over time
or space.
-        They are useful for comparing aggregates over time or space.
-        Bars can be drawn either vertically or horizontally.
-        There are different types of bar charts, the most common being:
[Figure: simple bar chart; vertical axis: Sales in $ (0 to 30); horizontal axis: product (A, B, C)]
- When there is a desire to show how a total (or aggregate) is divided into its component parts,
we use a component bar chart.
- The bars represent the total value of a variable, with each total broken into its component parts;
different colours or designs are used for identification.
Example:
Draw a component bar chart to represent the sales by product from 1957 to 1959.
Solutions:
[Figure: component bar chart; vertical axis: Sales in $ (0 to 100); horizontal axis: Year of production (1957, 1958, 1959); components: Product A, Product B, Product C]
Example:
Draw a component bar chart to represent the sales by product from 1957 to 1959.
[Figure: bar chart; vertical axis: Sales in $ (0 to 60); horizontal axis: Year of production (1957, 1958, 1959); series: Product A, Product B, Product C]
The histogram, frequency polygon and cumulative frequency graph (ogive) are the most
commonly applied graphical representations for continuous data.
Histogram
    A graph which displays the data by using vertical bars of various heights to represent
frequencies. Class boundaries are placed along the horizontal axis. Class marks and class
limits are sometimes used as the quantity on the X axis.
Frequency Polygon:
-         A line graph. The frequency is placed along the vertical axis and class midpoints are
placed along the horizontal axis. It is customary to add the next higher and next lower class intervals
with a corresponding frequency of zero; this makes it a complete polygon.
Example: Draw a frequency polygon for the above data (example *).
Solutions:
[Figure: frequency polygon; vertical axis: Frequency; horizontal axis: Value, with class marks 2.5, 8.5, 14.5, 20.5, 26.5, 32.5, 38.5, 44.5]
Ogive:
-      A graph showing the cumulative frequency (less than or more than type) plotted against
upper or lower class boundaries respectively. That is, class boundaries are plotted along the
horizontal axis and the corresponding cumulative frequencies are plotted along the vertical
axis. The points are joined by a freehand curve.
Example: Draw an ogive curve (less than type) for the above data (Example *).
Objectives:
              To comprehend the data easily.
              To facilitate comparison.
              To make further statistical analysis.
The expression is read, "the sum of X sub i from i equals 1 to N." It means "add up all the
numbers."
Example: Suppose the following were scores made on the first homework assignment for five
students in the class: 5, 7, 7, 6, and 8. In this example set of five numbers, where N=5, the
summation could be written:
The "i=1" in the bottom of the summation notation tells where to begin the sequence of
summation. If the expression were written with "i=3", the summation would start with the third
number in the set. For example:
In the example set of numbers, this would give the following result:
The "N" in the upper part of the summation notation tells where to end the sequence of
summation. If there were only three scores then the summation and example would be:
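As an illustrative sketch, the effect of the lower and upper limits can be checked on the five homework scores. The helper `summation` and its 1-based `start` and `end` arguments are assumptions introduced here, and using the first three scores for the three-score case is also an assumption.

```python
# Homework scores from the example; Python lists are 0-based, so X_i is scores[i - 1].
scores = [5, 7, 7, 6, 8]

def summation(x, start=1, end=None):
    """Sum x_i for i = start, ..., end, using the 1-based indexing of the notation."""
    if end is None:
        end = len(x)              # default upper limit N
    return sum(x[start - 1:end])

total_all = summation(scores)            # i = 1 to 5
total_from_3 = summation(scores, 3)      # i = 3 to 5
total_first_3 = summation(scores, 1, 3)  # i = 1 to 3
print(total_all, total_from_3, total_first_3)
```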
Sometimes if the summation notation is used in an expression and the expression must be
written a number of times, as in a proof, then a shorthand notation for the shorthand notation is
employed. When the summation sign "∑" is used without additional notation, then "i=1" and
"N" are assumed.
For example:
        2. $\sum_{i=1}^{n} kX_i = k\sum_{i=1}^{n} X_i$, where k is any constant
        3. $\sum_{i=1}^{n} (a + bX_i) = na + b\sum_{i=1}^{n} X_i$, where a and b are any constants
        4. $\sum_{i=1}^{n} (X_i + Y_i) = \sum_{i=1}^{n} X_i + \sum_{i=1}^{n} Y_i$
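                      Xi     Yi
                      5      6
                      7      7
                      7      8
                      6      7
                      8      8
   a) $\sum_{i=1}^{5} X_i$
   b) $\sum_{i=1}^{5} Y_i$
   c) $\sum_{i=1}^{5} 10$
   d) $\sum_{i=1}^{5} (X_i + Y_i)$
   e) $\sum_{i=1}^{5} (X_i - Y_i)$
   f) $\sum_{i=1}^{5} X_i Y_i$
   g) $\sum_{i=1}^{5} X_i^2$
   h) $\left(\sum_{i=1}^{5} X_i\right)\left(\sum_{i=1}^{5} Y_i\right)$
Solutions:
             a) $\sum_{i=1}^{5} X_i = 5 + 7 + 7 + 6 + 8 = 33$
             b) $\sum_{i=1}^{5} Y_i = 6 + 7 + 8 + 7 + 8 = 36$
             c) $\sum_{i=1}^{5} 10 = 5 \times 10 = 50$
             d) $\sum_{i=1}^{5} (X_i + Y_i) = (5 + 6) + (7 + 7) + (7 + 8) + (6 + 7) + (8 + 8) = 69 = 33 + 36$
             e) $\sum_{i=1}^{5} (X_i - Y_i) = (5 - 6) + (7 - 7) + (7 - 8) + (6 - 7) + (8 - 8) = -3 = 33 - 36$
             f) $\sum_{i=1}^{5} X_i Y_i = 5 \times 6 + 7 \times 7 + 7 \times 8 + 6 \times 7 + 8 \times 8 = 241$
             g) $\sum_{i=1}^{5} X_i^2 = 5^2 + 7^2 + 7^2 + 6^2 + 8^2 = 223$
             h) $\left(\sum_{i=1}^{5} X_i\right)\left(\sum_{i=1}^{5} Y_i\right) = 33 \times 36 = 1188$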
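The eight sums can be verified with a few lines of code. This is only a sketch; the X and Y values are the ones appearing in the worked solutions, and the dictionary keys simply mirror the part labels a) to h).

```python
X = [5, 7, 7, 6, 8]
Y = [6, 7, 8, 7, 8]
n = len(X)

results = {
    "a": sum(X),                                # sum of X_i
    "b": sum(Y),                                # sum of Y_i
    "c": sum(10 for _ in range(n)),             # a constant summed n times
    "d": sum(x + y for x, y in zip(X, Y)),      # sum of (X_i + Y_i)
    "e": sum(x - y for x, y in zip(X, Y)),      # sum of (X_i - Y_i)
    "f": sum(x * y for x, y in zip(X, Y)),      # sum of products
    "g": sum(x ** 2 for x in X),                # sum of squares
    "h": sum(X) * sum(Y),                       # product of sums
}
for part, value in results.items():
    print(part, value)
```

Comparing f) with h) shows that the sum of products is not the same as the product of the sums.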
         There are several different measures of central tendency; each has its advantages and
         disadvantages.
                 The Mean (Arithmetic, Geometric and Harmonic)
                 The Mode
                 The Median
                 Quantiles (Quartiles, Deciles and Percentiles)
         The choice among these averages depends on which best fits the property under discussion.
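 The arithmetic mean is defined as the sum of the magnitudes of the items divided by the number of items:
$$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$$
 If X1 occurs f1 times, X2 occurs f2 times, …, and Xk occurs fk times, then the mean will be
$$\bar{X} = \frac{\sum_{i=1}^{k} f_i X_i}{\sum_{i=1}^{k} f_i}$$
 where k is the number of classes and $\sum_{i=1}^{k} f_i = n$.

$$\bar{X} = \frac{\sum_{i=1}^{4} f_i X_i}{\sum_{i=1}^{4} f_i} = \frac{36}{7} \approx 5.14$$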
If data are given in the shape of a continuous frequency distribution, then the mean is obtained
as follows:
$$\bar{X} = \frac{\sum_{i=1}^{k} f_i X_i}{\sum_{i=1}^{k} f_i}$$
where Xi = the class mark of the i-th class and fi = the frequency of the i-th class.
Solutions:
    First find the class marks
    Find the product of frequency and class marks
    Find mean using the formula.
                      Class                     fi    Xi   Xifi
                      6- 10                     35    8    280
                      11- 15                    23    13   299
                      16- 20                    15    18   270
                      21- 25                    12    23   276
                      26- 30                    9     28   252
                      31- 35                    6     33   198
                      Total                     100        1575
$$\bar{X} = \frac{\sum_{i=1}^{6} f_i X_i}{\sum_{i=1}^{6} f_i} = \frac{1575}{100} = 15.75$$
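The three solution steps (class marks, products, formula) can be sketched in code as follows. The tuple layout and the function name `grouped_mean` are illustrative assumptions; the class limits and frequencies are those in the table above.

```python
# (lower limit, upper limit, frequency) for each class in the table above.
classes = [(6, 10, 35), (11, 15, 23), (16, 20, 15),
           (21, 25, 12), (26, 30, 9), (31, 35, 6)]

def grouped_mean(rows):
    """Mean of a grouped frequency distribution using class marks."""
    total_f = sum(f for _, _, f in rows)
    # The class mark is the midpoint of the class limits.
    total_fx = sum(f * (lo + hi) / 2 for lo, hi, f in rows)
    return total_fx / total_f

mean = grouped_mean(classes)
print(mean)
```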
Exercises:
                     65-69          6
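                     70-74          3
 1. The sum of the deviations of a set of items from their mean is zero, i.e.
    $\sum_{i=1}^{n} (X_i - \bar{X}) = 0$.
 2. The sum of the squared deviations of a set of items from their mean is the minimum, i.e.
    $\sum_{i=1}^{n} (X_i - \bar{X})^2 < \sum_{i=1}^{n} (X_i - A)^2$ for any $A \neq \bar{X}$.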
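Both properties can be checked numerically. The sketch below uses five illustrative scores (an assumption, not data tied to these properties in the module) and compares the sum of squared deviations about the mean with that about a few other values A.

```python
scores = [5, 7, 7, 6, 8]               # illustrative data
mean = sum(scores) / len(scores)

def sum_sq_dev(x, a):
    """Sum of squared deviations of the items in x from the value a."""
    return sum((xi - a) ** 2 for xi in x)

# Property 1: deviations from the mean cancel out (up to rounding error).
sum_dev = sum(xi - mean for xi in scores)

# Property 2: squared deviations are smallest when taken about the mean.
at_mean = sum_sq_dev(scores, mean)
others = [sum_sq_dev(scores, a) for a in (5, 6, 7, 8)]
print(sum_dev, at_mean, others)
```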
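     the mean of n_k observations, then the mean of all the observations in all groups, often called
     the combined mean, is given by:
$$\bar{X}_c = \frac{\bar{X}_1 n_1 + \bar{X}_2 n_2 + \dots + \bar{X}_k n_k}{n_1 + n_2 + \dots + n_k} = \frac{\sum_{i=1}^{k} \bar{X}_i n_i}{\sum_{i=1}^{k} n_i}$$
$$\bar{X}_c = \frac{\bar{X}_1 n_1 + \bar{X}_2 n_2}{n_1 + n_2} = \frac{\sum_{i=1}^{2} \bar{X}_i n_i}{\sum_{i=1}^{2} n_i}$$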
 4. If a wrong figure has been used when calculating the mean, the correct mean can be
     obtained without repeating the whole process using:
$$\text{Correct Mean} = \text{Wrong Mean} + \frac{\text{Correct Value} - \text{Wrong Value}}{n}$$
     where n is the total number of observations.
     Solutions:
$$\text{Correct Mean} = \text{Wrong Mean} + \frac{\text{Correct Value} - \text{Wrong Value}}{n} = 65 + \frac{80 - 40}{10} = 65 + 4 = 69 \text{ kg}$$
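The correction formula can be cross-checked against a direct recomputation. In the sketch below the function name and the ten individual weights are hypothetical; the weights are chosen only so that the wrong mean is 65 with 40 recorded in place of 80.

```python
def corrected_mean(wrong_mean, correct_value, wrong_value, n):
    """Adjust a mean computed with one wrong figure, without re-summing the data."""
    return wrong_mean + (correct_value - wrong_value) / n

result = corrected_mean(65, 80, 40, 10)

# Hypothetical data set whose (wrong) mean is 65 because 40 was recorded instead of 80.
wrong_data = [40, 70, 70, 70, 70, 70, 70, 70, 60, 60]
right_data = [80] + wrong_data[1:]
direct = sum(right_data) / len(right_data)
print(result, direct)
```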
k*old mean
Example:
Weighted Mean
$$\bar{X}_w = \frac{\sum_{i=1}^{n} X_i W_i}{\sum_{i=1}^{n} W_i}$$
Example:
       A student obtained the following percentages in an examination:
       English 60, Biology 75, Mathematics 63, Physics 59, and Chemistry 55. Find the
       student's weighted arithmetic mean if weights 1, 2, 1, 3, 3 respectively are allotted to the
       subjects.
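Solutions:
$$\bar{X}_w = \frac{\sum_{i=1}^{5} X_i W_i}{\sum_{i=1}^{5} W_i} = \frac{60 \times 1 + 75 \times 2 + 63 \times 1 + 59 \times 3 + 55 \times 3}{1 + 2 + 1 + 3 + 3} = \frac{615}{10} = 61.5$$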
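A short sketch of the same calculation, with the marks and weights taken from the example; the function name `weighted_mean` is an illustrative choice.

```python
marks = [60, 75, 63, 59, 55]    # English, Biology, Mathematics, Physics, Chemistry
weights = [1, 2, 1, 3, 3]

def weighted_mean(values, weights):
    """Sum of value * weight divided by the sum of the weights."""
    return sum(x * w for x, w in zip(values, weights)) / sum(weights)

result = weighted_mean(marks, weights)
print(result)
```

With equal weights the function reduces to the ordinary arithmetic mean.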
     Merits:
             It is based on all observations.
             It is suitable for further mathematical treatment.
             It is a stable average, i.e. it is relatively unaffected by fluctuations of sampling.
             It is easy to calculate and simple to understand.
     Demerits:
            It is affected by extreme observations.
            It cannot be used in the case of open-end classes.
            It cannot be determined by the method of inspection.
            It cannot be used when dealing with qualitative characteristics, such as intelligence,
             honesty, beauty.
             The geometric mean of a set of n observations is the nth root of their product.
             The geometric mean of X1, X2, X3, …, Xn is denoted by G.M and given by:
$$G.M = \sqrt[n]{X_1 \times X_2 \times \dots \times X_n}$$
             Taking the logarithms of both sides:
$$\log(G.M) = \log\left(\sqrt[n]{X_1 \times X_2 \times \dots \times X_n}\right) = \log\left((X_1 \times X_2 \times \dots \times X_n)^{1/n}\right)$$
$$\log(G.M) = \frac{1}{n}\log(X_1 \times X_2 \times \dots \times X_n) = \frac{1}{n}(\log X_1 + \log X_2 + \dots + \log X_n)$$
$$\log(G.M) = \frac{1}{n}\sum_{i=1}^{n} \log X_i$$
Example:
     Solutions:
$$G.M = \sqrt[n]{X_1 \times X_2 \times \dots \times X_n} = \sqrt[3]{2 \times 4 \times 8} = \sqrt[3]{64} = 4$$
    Remark: The geometric mean is useful and appropriate for finding averages of ratios.
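The logarithmic form is convenient for computation because it avoids forming a very large product. The sketch below applies it to the values 2, 4 and 8 from the solution; the function name is illustrative, and all values are assumed to be positive.

```python
import math

def geometric_mean(values):
    """Geometric mean via logarithms: exp of the mean of log X_i (values must be > 0)."""
    return math.exp(sum(math.log(x) for x in values) / len(values))

gm = geometric_mean([2, 4, 8])
print(round(gm, 6))
```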
       The harmonic mean of X1, X2, X3, …, Xn is denoted by H.M and given by:
$$H.M = \frac{n}{\sum_{i=1}^{n} \frac{1}{X_i}}$$
       This is called the simple harmonic mean.
$$H.M = \frac{n}{\sum_{i=1}^{k} \frac{f_i}{X_i}}, \quad n = \sum_{i=1}^{k} f_i$$
       If observations X1, X2, …, Xn have weights W1, W2, …, Wn respectively, then their harmonic
       mean is given by
$$H.M = \frac{\sum_{i=1}^{n} W_i}{\sum_{i=1}^{n} \frac{W_i}{X_i}}$$
       This is called the weighted harmonic mean.
Example: A cyclist pedals from his house to his college at speed of 10 km/hr and back from the
college to his house at 15 km/hr. Find the average speed.
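Because the distance is the same in both directions, the average speed over the round trip is the simple harmonic mean of the two speeds. The following sketch works this out; the function name and its optional `weights` argument are illustrative assumptions.

```python
def harmonic_mean(values, weights=None):
    """Weighted harmonic mean; with no weights it is the simple harmonic mean."""
    if weights is None:
        weights = [1] * len(values)
    return sum(weights) / sum(w / x for w, x in zip(weights, values))

# Equal distances at 10 km/hr and 15 km/hr.
avg_speed = harmonic_mean([10, 15])
print(round(avg_speed, 2))   # km/hr
```

The result, 12 km/hr, is lower than the arithmetic mean of 12.5 km/hr because more time is spent travelling at the slower speed.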
The Mode
 Examples:
           1. Find the mode of 5, 3, 5, 8, 9
                 Mode =5
           2. Find the mode of 8, 9, 9, 7, 8, 2, and 5.
                It is a bimodal Data: 8 and 9
           3. Find the mode of 4, 12, 3, 6, and 7.
                No mode for this data.
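The three cases (one mode, two modes, no mode) can be reproduced with a small helper. This is a sketch; the name `modes` and the convention of returning an empty list when every value occurs equally often are assumptions chosen to match the examples above.

```python
from collections import Counter

def modes(values):
    """Return the sorted list of modes; empty if all values occur equally often."""
    counts = Counter(values)
    top = max(counts.values())
    if len(counts) > 1 and all(c == top for c in counts.values()):
        return []                 # no value stands out, so there is no mode
    return sorted(v for v, c in counts.items() if c == top)

print(modes([5, 3, 5, 8, 9]))
print(modes([8, 9, 9, 7, 8, 2, 5]))
print(modes([4, 12, 3, 6, 7]))
```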
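If data are given in the shape of a continuous frequency distribution, the mode is defined as:
$$\hat{X} = L + w\left(\frac{\Delta_1}{\Delta_1 + \Delta_2}\right)$$
        Where:
                    $\hat{X}$ = the mode of the distribution
                    L = the lower class boundary of the modal class
                    w = the size of the modal class
                    $\Delta_1 = f_{mo} - f_1$
                    $\Delta_2 = f_{mo} - f_2$
                    $f_{mo}$ = frequency of the modal class
                    $f_1$ = frequency of the class preceding the modal class
                    $f_2$ = frequency of the class following the modal class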
        Example: Following is the distribution of the size of certain farms selected at random from a
        district. Calculate the mode of the distribution.
Solutions:
$$\hat{X} = 45 + 10\left(\frac{2}{2 + 26}\right) = 45.71$$
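The substitution can be sketched as a small function. It takes the two frequency differences directly, since only those values (2 and 26), the lower boundary 45 and the class size 10 appear in the solution; the function and parameter names are illustrative.

```python
def grouped_mode(lower_boundary, width, delta1, delta2):
    """Mode of a grouped distribution.

    delta1: modal-class frequency minus the frequency of the preceding class
    delta2: modal-class frequency minus the frequency of the following class
    """
    return lower_boundary + width * delta1 / (delta1 + delta2)

mode = grouped_mode(45, 10, 2, 26)
print(round(mode, 2))
```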
   Merits:
                      It is not affected by extreme observations.
                      Easy to calculate and simple to understand.
                      It can be calculated for distribution with open end class
   Demerits:
                      It is not rigidly defined.
Solutions:
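a) First order the data: 2, 4, 5, 6, 8, 9
       Here n = 6 (even), so
$$\tilde{X} = \frac{1}{2}\left(X_{[n/2]} + X_{[n/2 + 1]}\right) = \frac{1}{2}\left(X_{[3]} + X_{[4]}\right) = \frac{1}{2}(5 + 6) = 5.5$$
b) Order the data: 1, 2, 3, 5, 8
      Here n = 5 (odd), so the median is the middle item: $\tilde{X} = X_{[(n+1)/2]} = X_{[3]} = 3$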
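Both cases (even and odd n) can be handled by one routine. A minimal sketch follows; the function name is illustrative, and the data are sorted inside the function so unordered input also works.

```python
def median(values):
    """Median of ungrouped data: middle item, or mean of the two middle items."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                        # odd n: item (n + 1) / 2
    return (ordered[mid - 1] + ordered[mid]) / 2   # even n: items n/2 and n/2 + 1

median_a = median([2, 4, 5, 6, 8, 9])
median_b = median([1, 2, 3, 5, 8])
print(median_a, median_b)
```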
     If data are given in the shape of a continuous frequency distribution, the median is defined as:
$$\tilde{X} = L_{med} + \frac{w}{f_{med}}\left(\frac{n}{2} - c\right)$$
      Where:
           $L_{med}$ = lower class boundary of the median class
           w = the size of the median class
           n = total number of observations
           c = the cumulative frequency (less than type) preceding the median class
           $f_{med}$ = the frequency of the median class
Remark:
The median class is the class with the smallest cumulative frequency (less than type) greater than or
equal to $\frac{n}{2}$.
Example: Find the median of the following distribution.
                            Class   Frequency
                            40-44   7
                            45-49   10
                            50-54   22
                            55-59   15
                            60-64   12
                            65-69   6
                            70-74   3
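$$\frac{n}{2} = \frac{75}{2} = 37.5$$
           39 is the first cumulative frequency to be greater than or equal to 37.5, so
           50 - 54 is the median class.
           $L_{med} = 49.5$, $w = 5$, $n = 75$, $c = 17$, $f_{med} = 22$
$$\tilde{X} = L_{med} + \frac{w}{f_{med}}\left(\frac{n}{2} - c\right) = 49.5 + \frac{5}{22}(37.5 - 17) = 54.16$$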
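The search for the median class and the final substitution can be sketched as follows. The function name and the `gap` parameter (the distance between one class's upper limit and the next class's lower limit, assumed to be 1 here) are illustrative assumptions; the class limits and frequencies come from the example.

```python
# (lower class limit, upper class limit, frequency) from the example table.
table = [(40, 44, 7), (45, 49, 10), (50, 54, 22), (55, 59, 15),
         (60, 64, 12), (65, 69, 6), (70, 74, 3)]

def grouped_median(rows, gap=1):
    """Median of a grouped distribution given class limits and frequencies."""
    n = sum(f for _, _, f in rows)
    cum = 0                                   # cumulative frequency before the class
    for lo, hi, f in rows:
        if cum + f >= n / 2:                  # first class reaching n/2: median class
            lower_boundary = lo - gap / 2
            width = (hi - lo) + gap
            return lower_boundary + width / f * (n / 2 - cum)
        cum += f

med = grouped_median(table)
print(round(med, 2))
```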
Merits:
            Median is a positional average and hence not influenced by extreme observations.
            Can be calculated in the case of open end intervals.
            Median can be located even if the data are incomplete.
Demerits:
            It is not a good representative of data if the number of items is small.
            It is not amenable to further algebraic treatment.
            It is susceptible to sampling fluctuations.
Quantiles
  When a distribution is arranged in order of magnitude of items, the median is the value of the middle
  term. Other measures that depend on their positions in the distribution, namely quartiles, deciles, and
  percentiles, are collectively called quantiles.
   Quartiles:
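           - Quartiles are measures that divide the frequency distribution into four equal parts.
           - The values of the variable corresponding to these divisions are denoted Q1, Q2, and Q3,
             often called the first, the second and the third quartile respectively.
           - Q1 is a value which has 25% of the items less than or equal to it. Similarly, Q2 has
             50% of the items with values less than or equal to it and Q3 has 75% of the items with values less
             than or equal to it.
           - To find Qi (i = 1, 2, 3) we count $\frac{iN}{4}$ of the classes beginning from the lowest class.
           - For grouped data we have the following formula:
$$Q_i = L_{Q_i} + \frac{w}{f_{Q_i}}\left(\frac{iN}{4} - c\right), \quad i = 1, 2, 3$$
             where $L_{Q_i}$ is the lower class boundary of the quartile class, w is the size of the quartile
             class, $f_{Q_i}$ is its frequency, N is the total number of observations, and c is the cumulative
             frequency (less than type) preceding the quartile class.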
Remark:
  The quartile class (class containing Qi) is the class with the smallest cumulative frequency (less
  than type) greater than or equal to $\frac{iN}{4}$.
Deciles:
      - Deciles are measures that divide the frequency distribution into ten equal parts.
      - The values of the variable corresponding to these divisions are denoted D1, D2, …, D9, often
           called the first, the second, …, the ninth decile respectively.
      - To find Di (i = 1, 2, …, 9) we count $\frac{iN}{10}$ of the classes beginning from the lowest class.
$$D_i = L_{D_i} + \frac{w}{f_{D_i}}\left(\frac{iN}{10} - c\right), \quad i = 1, 2, \dots, 9$$
   Where:
       $L_{D_i}$ = lower class boundary of the decile class
       w = the size of the decile class
       N = total number of observations
       c = the cumulative frequency (less than type) preceding the decile class
       $f_{D_i}$ = the frequency of the decile class
Remark:
   The decile class (class containing Di) is the class with the smallest cumulative frequency (less
   than type) greater than or equal to $\frac{iN}{10}$.
Percentiles:
        - Percentiles are measures that divide the frequency distribution into one hundred equal parts.
        - The values of the variable corresponding to these divisions are denoted P1, P2, …, P99, often
          called the first, the second, …, the ninety-ninth percentile respectively.
        - To find Pi (i = 1, 2, …, 99) we count $\frac{iN}{100}$ of the classes beginning from the lowest class.
Remark:
The percentile class (class containing Pi) is the class with the smallest cumulative frequency
       (less than type) greater than or equal to $\frac{iN}{100}$.
       Example: Consider the following distribution and calculate a) the quartiles, b) the 7th decile (D7)
       and c) the 90th percentile (P90).
     Values                Frequency
     140- 150              17
     150- 160              29
     160- 170              42
     170- 180              72
     180- 190              84
     190- 200              107
     200- 210              49
     210- 220              34
     220- 230              31
     230- 240              16
     240- 250              12
Solutions:
        First find the less than cumulative frequency.
        Use the formula to calculate the required quantile.
     Values                Frequency    Cum.Freq (less than type)
     140- 150              17           17
     150- 160              29           46
     160- 170              42           88
     170- 180              72           160
     180- 190              84           244
     190- 200              107          351
     200- 210              49           400
     210- 220              34           434
     220- 230              31           465
     230- 240              16           481
     240- 250              12           493
a) Quartiles:
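      i.   Q1
    - Determine the class containing the first quartile.
      $\frac{N}{4} = 123.25$
      $\Rightarrow$ 170 - 180 is the class containing the first quartile.
      $L_{Q_1} = 170, \quad w = 10, \quad N = 493, \quad c = 88, \quad f_{Q_1} = 72$
      $Q_1 = L_{Q_1} + \frac{w}{f_{Q_1}}\left(\frac{N}{4} - c\right) = 170 + \frac{10}{72}(123.25 - 88) = 174.90$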
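     ii.   Q2
    - Determine the class containing the second quartile.
      $\frac{2N}{4} = 246.5$
      $\Rightarrow$ 190 - 200 is the class containing the second quartile.
      $L_{Q_2} = 190, \quad w = 10, \quad N = 493, \quad c = 244, \quad f_{Q_2} = 107$
      $Q_2 = L_{Q_2} + \frac{w}{f_{Q_2}}\left(\frac{2N}{4} - c\right) = 190 + \frac{10}{107}(246.5 - 244) = 190.23$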
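   iii.        Q3
   - Determine the class containing the third quartile.
     $\frac{3N}{4} = 369.75$
     $\Rightarrow$ 200 - 210 is the class containing the third quartile.
     $L_{Q_3} = 200, \quad w = 10, \quad N = 493, \quad c = 351, \quad f_{Q_3} = 49$
     $Q_3 = L_{Q_3} + \frac{w}{f_{Q_3}}\left(\frac{3N}{4} - c\right) = 200 + \frac{10}{49}(369.75 - 351) = 203.83$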
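b) D7
     - Determine the class containing the 7th decile.
     $\frac{7N}{10} = 345.1$
     $\Rightarrow$ 190 - 200 is the class containing the seventh decile.
     $L_{D_7} = 190, \quad w = 10, \quad N = 493, \quad c = 244, \quad f_{D_7} = 107$
     $D_7 = L_{D_7} + \frac{w}{f_{D_7}}\left(\frac{7N}{10} - c\right) = 190 + \frac{10}{107}(345.1 - 244) = 199.45$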
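c) P90
       - Determine the class containing the 90th percentile.
       $\frac{90N}{100} = 443.7$
       $\Rightarrow$ 220 - 230 is the class containing the 90th percentile.
       $L_{P_{90}} = 220, \quad w = 10, \quad N = 493, \quad c = 434, \quad f_{P_{90}} = 31$
       $P_{90} = L_{P_{90}} + \frac{w}{f_{P_{90}}}\left(\frac{90N}{100} - c\right) = 220 + \frac{10}{31}(443.7 - 434) = 223.13$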
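The arithmetic in this worked example can be checked with a short Python sketch. It assumes equal class widths and applies the grouped-data formula L + (w/f)(iN/parts - c); the function name `grouped_quantile` and the list names are illustrative choices, not part of the module.

```python
# Lower class boundaries and frequencies copied from the example table.
lower_bounds = [140, 150, 160, 170, 180, 190, 200, 210, 220, 230, 240]
freqs = [17, 29, 42, 72, 84, 107, 49, 34, 31, 16, 12]
width = 10  # common class size w


def grouped_quantile(i, parts, lower_bounds, freqs, width):
    """i-th quantile of grouped data split into `parts` equal parts.

    The quantile class is the first class whose less-than cumulative
    frequency is greater than or equal to i*N/parts.
    """
    n = sum(freqs)
    target = i * n / parts
    cum = 0  # cumulative frequency preceding the current class (c)
    for lower, f in zip(lower_bounds, freqs):
        if cum + f >= target:
            return lower + width / f * (target - cum)
        cum += f
    raise ValueError("i*N/parts exceeds the total frequency")


q1 = grouped_quantile(1, 4, lower_bounds, freqs, width)
q2 = grouped_quantile(2, 4, lower_bounds, freqs, width)
q3 = grouped_quantile(3, 4, lower_bounds, freqs, width)
d7 = grouped_quantile(7, 10, lower_bounds, freqs, width)
p90 = grouped_quantile(90, 100, lower_bounds, freqs, width)
print(f"Q1={q1:.2f} Q2={q2:.2f} Q3={q3:.2f} D7={d7:.2f} P90={p90:.2f}")
```

The same function covers quartiles, deciles and percentiles because only the number of equal parts changes.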
   The measures of dispersion which are expressed in terms of the original unit of a series are
   termed as absolute measures. Such measures are not suitable for comparing the variability of
   two distributions which are expressed in different units of measurement and different average
   size. Relative measures of dispersions are a ratio or percentage of a measure of absolute
   dispersion to an appropriate measure of central tendency and are thus pure numbers
   independent of the units of measurement. For comparing the variability of two distributions
   (even if they are measured in the same unit), we compute the relative measure of dispersion
   instead of absolute measures of dispersion.
   Various measures of dispersions are in use. The most commonly used measures of dispersions
   are:
1) Range and relative range
2) Quartile deviation and coefficient of Quartile deviation
3) Mean deviation and coefficient of Mean deviation
4) Standard deviation and coefficient of variation.
          4.1.The Range (R)
   The range is the largest score minus the smallest score. It is a quick and dirty measure of
   variability, although when a test is given back to students they very often wish to know the
   range of scores. Because the range is greatly affected by extreme scores, it may give a distorted
   picture of the scores. The following two distributions have the same range, 13, yet appear to
   differ greatly in the amount of variability.
   Distribution 1:         32 35     36 36     37     38   40   42   42   43   43   45
   Distribution 2:         32 32     33 33     33     34   34   34   34   34   35   45
   For this reason, among others, the range is not the most important measure of variability.
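As a small illustration of how strongly one extreme score drives the range, the sketch below recomputes the range of the two distributions with and without their largest value. The helper name `value_range` is an assumption made for this example.

```python
dist1 = [32, 35, 36, 36, 37, 38, 40, 42, 42, 43, 43, 45]
dist2 = [32, 32, 33, 33, 33, 34, 34, 34, 34, 34, 35, 45]


def value_range(data):
    """Range = largest score minus smallest score."""
    return max(data) - min(data)


for name, data in (("Distribution 1", dist1), ("Distribution 2", dist2)):
    trimmed = sorted(data)[:-1]  # same data without the single largest score
    print(name, "range:", value_range(data),
          "| range without largest score:", value_range(trimmed))
```

Both ranges are 13, yet removing the single score of 45 shrinks the range of Distribution 2 far more than that of Distribution 1.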
Merits:
    It is rigidly defined.
    It is easy to calculate and simple to understand.
Demerits:
    It is not based on all observations.
    It is highly affected by extreme observations.
    It is affected by fluctuation in sampling.
    It is not amenable to further algebraic treatment.
    It cannot be computed in the case of open-end distributions.
    It is very sensitive to the size of the sample.
The interquartile range is the difference between the third and the first quartiles of a set of
items, and the semi-interquartile range (quartile deviation, Q.D) is half of the interquartile range.
       $Q.D = \frac{Q_3 - Q_1}{2}$
 Coefficient of Quartile Deviation (C.Q.D)
       $C.Q.D = \frac{(Q_3 - Q_1)/2}{(Q_3 + Q_1)/2} = \frac{2 \cdot Q.D}{Q_3 + Q_1} = \frac{Q_3 - Q_1}{Q_3 + Q_1}$
 The quartile deviation gives the average amount by which the two quartiles differ from the median.
Example: Compute Q.D and its coefficient for the following distribution.
 Values           Freq.
 140- 150         17
 150- 160         29
 160- 170         42
 170- 180         72
 180- 190         84
Remark: Q.D or C.Q.D includes only the middle 50% of the observation.
   The mean deviation of a set of items is defined as the arithmetic mean of the values of the
   absolute deviations from a given average. Depending upon the type of average used we have
   different mean deviations.
   Mean deviation about the mean:
       $M.D(\bar{X}) = \frac{\sum_{i=1}^{n} |X_i - \bar{X}|}{n}$
      For the case of a frequency distribution it is given as:
       $M.D(\bar{X}) = \frac{\sum_{i=1}^{k} f_i |X_i - \bar{X}|}{n}$
   Mean deviation about the median:
       $M.D(\tilde{X}) = \frac{\sum_{i=1}^{n} |X_i - \tilde{X}|}{n}$
      For the case of a frequency distribution it is given as:
       $M.D(\tilde{X}) = \frac{\sum_{i=1}^{k} f_i |X_i - \tilde{X}|}{n}$
   Steps to calculate $M.D(\tilde{X})$:
1. Find the median, $\tilde{X}$.
2. Find the deviations of each reading from $\tilde{X}$.
3. Find the arithmetic mean of the deviations, ignoring sign.
   Mean deviation about the mode:
       $M.D(\hat{X}) = \frac{\sum_{i=1}^{n} |X_i - \hat{X}|}{n}$
   Examples:
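1. The following are the numbers of visits made by ten mothers to the local doctor’s surgery: 8, 6,
   5, 5, 7, 4, 5, 9, 7, 4
   Find the mean deviation about the mean, median and mode.
   Solutions:
     First calculate the three averages:
           $\bar{X} = 6, \quad \tilde{X} = 5.5, \quad \hat{X} = 5$
     Then find the absolute deviations (values arranged in ascending order):
     Xi            4    4    5    5    5    6    7    7    8    9    Total
     |Xi - 6|      2    2    1    1    1    0    1    1    2    3    14
     |Xi - 5.5|    1.5  1.5  0.5  0.5  0.5  0.5  1.5  1.5  2.5  3.5  14
     |Xi - 5|      1    1    0    0    0    1    2    2    3    4    14
     $M.D(\bar{X}) = \frac{\sum_{i=1}^{10} |X_i - 6|}{10} = \frac{14}{10} = 1.4$
     $M.D(\tilde{X}) = \frac{\sum_{i=1}^{10} |X_i - 5.5|}{10} = \frac{14}{10} = 1.4$
     $M.D(\hat{X}) = \frac{\sum_{i=1}^{10} |X_i - 5|}{10} = \frac{14}{10} = 1.4$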
2. Find mean deviation about mean, median and mode for the following distributions.(exercise)
                 Class          Frequency
                 40-44          7
                 45-49          10
                 50-54          22
                 55-59          15
                 60-64          12
                 65-69          6
                 70-74          3
   $C.M.D = \frac{M.D}{\text{Average about which deviations are taken}}$
       $C.M.D(\bar{X}) = \frac{M.D(\bar{X})}{\bar{X}}$
Example: calculate the C.M.D about the mean, median and mode for the data in example 1
above.
Solutions:
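$C.M.D = \frac{M.D}{\text{Average about which deviations are taken}}$
 $C.M.D(\bar{X}) = \frac{M.D(\bar{X})}{\bar{X}} = \frac{1.4}{6} = 0.233$
 $C.M.D(\tilde{X}) = \frac{M.D(\tilde{X})}{\tilde{X}} = \frac{1.4}{5.5} = 0.255$
 $C.M.D(\hat{X}) = \frac{M.D(\hat{X})}{\hat{X}} = \frac{1.4}{5} = 0.28$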
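The mean deviations and their coefficients for the ten visits can be reproduced with Python's standard `statistics` module. This is a minimal sketch; the helper name `mean_deviation` is an assumption made for illustration.

```python
from statistics import mean, median, mode

visits = [8, 6, 5, 5, 7, 4, 5, 9, 7, 4]


def mean_deviation(data, center):
    """Arithmetic mean of the absolute deviations from `center`."""
    return sum(abs(x - center) for x in data) / len(data)


centers = {"mean": mean(visits), "median": median(visits), "mode": mode(visits)}
for name, center in centers.items():
    md = mean_deviation(visits, center)
    cmd = md / center  # coefficient of mean deviation
    print(f"{name}: average={center}, M.D={md:.2f}, C.M.D={cmd:.3f}")
```

The three mean deviations coincide for this data set, but the coefficients differ because each is divided by a different average.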
The Variance
Population Variance
If we divide the variation by the number of values in the population, we get something called
the population variance. This variance is the "average squared deviation from the mean".
Population Variance: $\sigma^2 = \frac{1}{N}\sum_{i=1}^{N}(X_i - \mu)^2, \quad i = 1, 2, \ldots, N$
Sample Variance
One would expect the sample variance to simply be the population variance with the
population mean replaced by the sample mean. However, one of the major uses of statistics is
    $S^2 = \frac{\sum_{i=1}^{n}(X_i - \bar{X})^2}{n - 1}$, for raw data.
    $S^2 = \frac{\sum_{i=1}^{k} f_i X_i^2 - n\bar{X}^2}{n - 1}$, for frequency distribution.
    4.2.Standard Deviation
There is a problem with variances. Recall that the deviations were squared. That means that the
units were also squared. To get the units back the same as the original data values, the square
root must be taken.
        Population standard deviation: $\sigma = \sqrt{\sigma^2}$
        Sample standard deviation: $s = \sqrt{S^2}$
Solutions:
1. $\bar{X} = 11$
   Xi                  5    10   12   17   Total
   $(X_i - \bar{X})^2$     36   1    1    36   74
       $S^2 = \frac{\sum_{i=1}^{n}(X_i - \bar{X})^2}{n - 1} = \frac{74}{3} = 24.67$
       $S = \sqrt{S^2} = \sqrt{24.67} = 4.97$
2. $\bar{X} = 55$
Xi(C.M) 42 47 52 57 62 67 72 Total
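The first solution above (values 5, 10, 12, 17) can be checked numerically. The sketch below computes the sample variance from the definition and compares it with `statistics.variance`, which also uses the n - 1 divisor; the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev, variance

data = [5, 10, 12, 17]

xbar = mean(data)                                # sample mean
sum_sq_dev = sum((x - xbar) ** 2 for x in data)  # sum of squared deviations
s2 = sum_sq_dev / (len(data) - 1)                # sample variance, divisor n - 1
s = sqrt(s2)                                     # sample standard deviation

print(f"mean={xbar}, sum of squares={sum_sq_dev}, S^2={s2:.2f}, S={s:.2f}")
print(f"library check: variance={variance(data):.2f}, stdev={stdev(data):.2f}")
```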
1. $\frac{\sum (X_i - \bar{X})^2}{n - 1} < \frac{\sum (X_i - A)^2}{n - 1}, \quad A \neq \bar{X}$
2. For a normal (symmetric) distribution the following holds.
        Approximately 68.27% of the data values fall within one standard deviation of the mean,
         i.e. within $(\bar{X} - S, \bar{X} + S)$.
        Approximately 95.45% of the data values fall within two standard deviations of the mean,
         i.e. within $(\bar{X} - 2S, \bar{X} + 2S)$.
        Approximately 99.73% of the data values fall within three standard deviations of the mean,
         i.e. within $(\bar{X} - 3S, \bar{X} + 3S)$.
3. Chebyshev's Theorem
For any data set, no matter what the pattern of variation, the proportion of the values that fall
     within k standard deviations of the mean, i.e. within $(\bar{X} - kS, \bar{X} + kS)$, will be at least $1 - \frac{1}{k^2}$,
     where k is a number greater than 1; i.e. the proportion of items falling beyond k standard
     deviations of the mean is at most $\frac{1}{k^2}$.
     Example: Suppose a distribution has mean 50 and standard deviation 6. What percent of the
     numbers are?
a) Between 38 and 62
    a) 38 and 62 are at equal distance from the mean, 50, and this distance is 12.
       $kS = 12 \Rightarrow k = \frac{12}{S} = \frac{12}{6} = 2$
       Applying the above theorem, at least $\left(1 - \frac{1}{k^2}\right) \times 100\% = 75\%$ of the numbers lie
       between 38 and 62.
b) Similarly done.
    c) It is just the complement of a), i.e. at most $\frac{1}{k^2} \times 100\% = 25\%$ of the numbers are less
          than 38 or more than 62.
    d) Similarly done.
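The Chebyshev bound used in part a) can be sketched in a few lines of Python. The function name `chebyshev_lower_bound` is an illustrative assumption; it simply evaluates 1 - 1/k^2 for k greater than 1.

```python
def chebyshev_lower_bound(k):
    """Minimum proportion of values within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("Chebyshev's theorem requires k > 1")
    return 1 - 1 / k ** 2


mean_value, sd = 50, 6
lower, upper = 38, 62

k = (upper - mean_value) / sd        # number of standard deviations
inside = chebyshev_lower_bound(k)    # at least this proportion lies in (38, 62)
outside = 1 - inside                 # at most this proportion lies outside

print(f"k={k}, at least {inside:.0%} inside, at most {outside:.0%} outside")
```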
   Exercise: The scores on a special test of knowledge of wood refinishing have a mean of
   53 and a standard deviation of 6. Find the range of values in which at least 75% of the scores will
   lie.
Examples:
       known to be 12 gm and 3 gm respectively. New set of capsules of another drug are obtained
       by the linear transformation Yi = 2Xi – 0.5 ( i = 1, 2, …, n ) then what will be the standard
       deviation of the new set of capsules.
   2. The mean and the standard deviation of a set of numbers are respectively 500 and 10.
       a) If 10 are added to each of the numbers in the set, then what will be the variance and
           standard deviation of the new set?
       b) If each of the numbers in the set are multiplied by -5, then what will be the variance and
           standard deviation of the new set?
Solutions:
      The coefficient of variation (C.V) is defined as the ratio of the standard deviation to the mean,
      usually expressed as a percentage.
          $C.V = \frac{S}{\bar{X}} \times 100$
      The distribution having the smaller C.V is said to be less variable or more consistent.
   Example: An analysis of the monthly wages paid (in Birr) to workers in two firms A and B
   belonging to the same industry gives the following results
    Solutions:
    Calculate coefficient of variation for both firms.
         $C.V_A = \frac{S_A}{\bar{X}_A} \times 100 = \frac{10}{52.5} \times 100 = 19.05\%$
         $C.V_B = \frac{S_B}{\bar{X}_B} \times 100 = \frac{11}{47.5} \times 100 = 23.16\%$
    Since $C.V_A < C.V_B$, in firm B there is greater variability in individual wages.
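The comparison of the two firms can be reproduced with a short sketch. The means (52.5, 47.5) and standard deviations (10, 11) are the values used in the solution above; the function name is an illustrative assumption.

```python
def coefficient_of_variation(sd, mean):
    """Standard deviation as a percentage of the mean."""
    return sd / mean * 100


cv_a = coefficient_of_variation(10, 52.5)  # firm A
cv_b = coefficient_of_variation(11, 47.5)  # firm B

more_variable = "B" if cv_b > cv_a else "A"
print(f"C.V_A={cv_a:.2f}%, C.V_B={cv_b:.2f}%, more variable firm: {more_variable}")
```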
                 City 1     25   24    23      26   17
                 City2      22   21    24      22   20
                 City3      32   27    35      24   28
Which city has the most consistent temperature, based on these data?
       $Z = \frac{X - \mu}{\sigma}$, for population.
       $Z = \frac{X - \bar{X}}{S}$, for sample.
            Z gives the deviations from the mean in units of standard deviation.
            Z gives the number of standard deviations a particular observation lies above or below
             the mean.
            It is used to compare two observations coming from different groups.
Examples:
1. Two sections were given introduction to statistics examinations. The following information
   was given.
Student A from section 1 scored 90 and student B from section 2 scored 95.Relatively speaking
who performed better?
Solutions:
 Calculate the standard score of both students.
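 $Z_A = \frac{X_A - \bar{X}_1}{S_1} = \frac{90 - 78}{6} = 2$
 $Z_B = \frac{X_B - \bar{X}_2}{S_2} = \frac{95 - 90}{5} = 1$
 Since $Z_A > Z_B$, student A performed better relative to his/her section.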
2. Two groups of people were trained to perform a certain task and tested to find out which
   group is faster to learn the task. For the two groups the following information was given:
                        Value           Group one        Group two
Relatively speaking:
                       a) Which group is more consistent in its performance
                       b) Suppose a person A from group one take 9.2 minutes while person B
                          from Group two take 9.3 minutes, who was faster in performing the
                          task? Why?
Solutions:
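       $Z_A = \frac{X_A - \bar{X}_1}{S_1} = \frac{9.2 - 10.4}{1.2} = -1$
       $Z_B = \frac{X_B - \bar{X}_2}{S_2} = \frac{9.3 - 11.9}{1.3} = -2$
       Since $Z_B < Z_A$ and a smaller time means a faster performance, person B was faster relative to
       his/her group.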
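Both standard-score examples above can be recomputed with one helper. The means and standard deviations are the ones appearing in the solutions; the function name `z_score` is an illustrative assumption.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd


# Example 1: exam scores (a higher score is better).
z_student_a = z_score(90, 78, 6)
z_student_b = z_score(95, 90, 5)

# Example 2: task times in minutes (a lower time is better).
z_person_a = z_score(9.2, 10.4, 1.2)
z_person_b = z_score(9.3, 11.9, 1.3)

print(f"students: Z_A={z_student_a:.1f}, Z_B={z_student_b:.1f}")
print(f"task times: Z_A={z_person_a:.1f}, Z_B={z_person_b:.1f}")
```

Note that the direction of "better" depends on the variable: for exam scores the larger Z wins, while for completion times the smaller Z wins.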
          A = {1, 3, 5}
          B = {2, 4, 6}
          C = {} or $\emptyset$ = empty space or impossible event
   Remark: If S (sample space) has n members then there are exactly $2^n$ subsets or events.
   denoted by $A'$, or $A^c$, or $\bar{A}$, contains those points of the sample space which don’t belong
   to A.
8. Elementary Event: an event having only a single element or sample point.
9. Mutually Exclusive Events: Two events which cannot happen at the same time.
10. Independent Events: Two events are independent if the occurrence of one does not affect
   the probability of the other occurring.
11. Dependent Events: Two events are dependent if the first event affects the outcome or
   occurrence of the second event in a way the probability is changed.
         Solution
            a) S={1,2,3,4,5,6}
            b) S={(HH),(HT),(TH),(TT)}
            c) S={t /t≥0}
                     Sample space can be
                                Countable ( finite or infinite)
                                Uncountable.
         Counting Rules
 In order to calculate probabilities, we have to know
          The number of elements of an event
          The number of elements of the sample space.
 That is in order to judge what is probable, we have to know what is possible.
 In order to determine the number of outcomes, one can use several rules of counting.
Permutation
is
                  $_nP_r = \frac{n!}{(n - r)!}$
     3. The number of permutations of n objects in which k1 are alike, k2 are alike, etc. is
                  $\frac{n!}{k_1! \, k_2! \cdots k_n!}$
 Example:
     1. Suppose we have the letters A, B, C, D
              a) How many permutations are there taking all the four?
              b) How many permutations are there if two letters are used at a time?
    AB       BA    CA       DA                           AB        BC
    AC       BC    CB       DB                           AC        BD
    AD       BD    CD       DC                           AD        DC
Note that in permutation AB is different from BA. But in combination AB is the same as BA.
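The distinction between permutations and combinations of the letters A, B, C, D can be enumerated directly. This sketch uses `itertools` and `math.perm` (available from Python 3.8); the variable names are illustrative.

```python
from itertools import combinations, permutations
from math import perm

letters = "ABCD"

all_four = list(permutations(letters))                            # order matters, all 4 letters
two_at_a_time = ["".join(p) for p in permutations(letters, 2)]    # order matters, 2 letters
pairs = ["".join(c) for c in combinations(letters, 2)]            # order ignored, 2 letters

print(len(all_four), perm(4, 4))       # permutations of all four letters
print(len(two_at_a_time), perm(4, 2))  # permutations of two letters at a time
print(two_at_a_time)
print(len(pairs), pairs)               # AB and BA count once here
```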
Combination Rule
                  $\binom{n}{r} = \frac{n!}{(n - r)! \, r!}$
Examples:
   1. In how many ways can a committee of 5 people be chosen out of 9 people?
   Solutions:
     n = 9, r = 5
     $\binom{n}{r} = \frac{n!}{(n - r)! \, r!} = \frac{9!}{4! \, 5!} = 126$ ways
   2. Among 15 clocks there are two defectives. In how many ways can an inspector choose three
       of the clocks for inspection so that:
            a) There is no restriction.
            b) None of the defective clocks is included.
            c) Only one of the defective clocks is included.
            d) Two of the defective clocks are included.
   Solutions: n=15 of which 2 are defective and 13 are non-defective; and r=3
            a) If there is no restriction select three clocks from 15 clocks and this can be done in :
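                    n = 15, r = 3
                    $\binom{n}{r} = \frac{n!}{(n - r)! \, r!} = \frac{15!}{12! \, 3!} = 455$ ways
       c) Only one of the defective clocks is included.
          This is equivalent to one defective and two non-defective, which can be done in:
             $\binom{2}{1} \times \binom{13}{2} = 156$ ways.
       d) Two of the defective clocks are included.
          This is equivalent to two defective and one non-defective, which can be done in:
             $\binom{2}{2} \times \binom{13}{1} = 13$ ways.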
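The committee and clock counts above can be verified with `math.comb` (Python 3.8 or later). The dictionary keys below are illustrative labels for the four cases of the clock example.

```python
from math import comb

committee = comb(9, 5)  # committee of 5 chosen from 9 people

defective, good, chosen = 2, 13, 3
clock_cases = {
    "a) no restriction": comb(defective + good, chosen),
    "b) no defective": comb(defective, 0) * comb(good, 3),
    "c) exactly one defective": comb(defective, 1) * comb(good, 2),
    "d) two defective": comb(defective, 2) * comb(good, 1),
}

print("committee:", committee)
for label, ways in clock_cases.items():
    print(label, ways)
```

Cases b), c) and d) are mutually exclusive and cover every possible selection, so their counts add up to the unrestricted count in a).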
Approaches to measuring Probability
There are 3 different conceptual approaches to the study of probability theory. These are:
      The classical approach.
      The frequentist (relative frequency) approach.
      The subjective approach.
      S = {1, 2, 3, 4, 5, 6}
      N = n(S) = 6
   a) Let A be the event of number 4
        A = {4}
        $N_A = n(A) = 1$
        $P(A) = \frac{n(A)}{n(S)} = \frac{1}{6}$
   b) Let A be the event of odd numbers
        A = {1, 3, 5}
        $N_A = n(A) = 3$
        $P(A) = \frac{n(A)}{n(S)} = \frac{3}{6} = 0.5$
   c) Let A be the event of even numbers
        A = {2, 4, 6}
        $N_A = n(A) = 3$
        $P(A) = \frac{n(A)}{n(S)} = \frac{3}{6} = 0.5$
   d) Let A be the event of number 8
        A = {}
        $N_A = n(A) = 0$
        $P(A) = \frac{n(A)}{n(S)} = \frac{0}{6} = 0$
   2. A box of 80 candles consists of 30 defective and 50 non-defective candles. If 10 of these
   candles are selected at random, what is the probability that
       a) All will be defective.
       b) 6 will be non-defective.
       c) All will be non-defective.
    Solutions: Total selection = $\binom{80}{10} = N = n(S)$
       a) Let A be the event that all will be defective.
         2. If records show that 60 out of 100,000 bulbs produced are defective, what is the
            probability of a newly produced bulb being defective?
Solution: Let A be the event that the newly produced bulb is defective.
        $P(A) = \lim_{N \to \infty} \frac{N_A}{N} \approx \frac{60}{100,000} = 0.0006$
    Subjective approach: this is a type of probability based on the beliefs of the person making the
    probability assessment. Subjective probability assessments are often found when events occur
    only once or at most a few times.
    The disadvantage of subjective probability is that two or more persons facing the same
    evidence/problem may arrive at different probabilities. That is, for the same problem there may be
    different decisions.
    Conditional probability and Independence
    Conditional Events: If the occurrence of one event has an effect on the next occurrence of the
    other event then the two events are conditional or dependent events.
    Example: Suppose we have two red and three white balls in a bag
         1. Draw a ball with replacement
            Since the first drawn ball is replaced for a second draw, it doesn’t affect the second
            draw. For this reason A and B are independent. Then if we let
            A = the event that the first draw is red, $p(A) = \frac{2}{5}$
         2. Draw a ball without replacement
       This is conditional because the first drawn ball is not replaced for a second draw,
       so it does affect the second draw. If we let
          A = the event that the first draw is red, $p(A) = \frac{2}{5}$
          B = the event that the second draw is red, $p(B) = ?$
Let B = the event that the second draw is red given that the first draw is red; then $p(B \mid A) = \frac{1}{4}$.
The conditional probability of an event A given that B has already occurred, denoted by
$p(A \mid B)$, is
$p(A \mid B) = \frac{p(A \cap B)}{p(B)}, \quad p(B) \neq 0$
Remark: (1) $p(A' \mid B) = 1 - p(A \mid B)$    (2) $p(B' \mid A) = 1 - p(B \mid A)$
Example 1. For a student enrolling as a freshman at a certain university, the probability is 0.25
that he/she will get a scholarship and 0.75 that he/she will graduate. The probability is 0.2 that
he/she will get a scholarship and will also graduate. What is the probability that a student who
gets a scholarship graduates?
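One way to work the scholarship example is to apply the conditional probability formula directly. The sketch below uses exact fractions; the variable names are illustrative and not from the module.

```python
from fractions import Fraction

p_scholarship = Fraction(25, 100)  # p(S): student gets a scholarship
p_graduate = Fraction(75, 100)     # p(G): student graduates
p_both = Fraction(20, 100)         # p(S and G)

# p(G | S) = p(S and G) / p(S)
p_graduate_given_scholarship = p_both / p_scholarship

print(p_graduate_given_scholarship, float(p_graduate_given_scholarship))
```

Under these figures the conditional probability (0.8) is larger than the unconditional probability of graduating (0.75).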
Exercise: A lot consists of 20 defective and 80 non-defective items from which two items are
chosen without replacement. Events A & B are defined as A = the first item chosen is
defective, B = the second item chosen is defective
   a) What is the probability that both items are defective?
   b) What is the probability that the second item is defective?
Note: for any two events A and B the following relation holds.
      $p(B) = p(B \mid A) \, p(A) + p(B \mid A') \, p(A')$
Probability of Independent Events
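       Required: $p(A \cap B)$
       a.     $p(A \cap B) = p(B \mid A) \, p(A) = \frac{3}{9} \times \frac{4}{10} = \frac{2}{15}$
       b.     $p(A \cap B) = p(A) \, p(B) = \frac{4}{10} \times \frac{4}{10} = \frac{4}{25}$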
   Inference is the process of making interpretations or conclusions from sample data for the
    totality of the population.
   It is only the sample data that is ready for inference.
   In statistics there are two ways through which inference can be made.
             Statistical estimation
             Statistical hypothesis testing.
   [Figure: diagram relating Population, Sample, Numerical data, Analyzed data and Inference]
Statistical Estimation
   This is one way of making inference about the population parameter where the investigator
   does not have any prior notion about values or characteristics of the population parameter.
   There are two ways of estimation.
   1) Point Estimation
       It is a procedure that results in a single value as an estimate for a parameter.
   2) Interval estimation
       It is the procedure that results in an interval of values as an estimate for a parameter, that
       is, an interval that contains the likely values of the parameter. It deals with identifying the upper
       and lower limits of a parameter. The limits themselves are random variables.
Definitions
       Confidence Interval: An interval estimate with a specific level of confidence
       Confidence Level: The percent of the time that the true value will lie in the interval
       estimate given.
       Consistent Estimator: An estimator which gets closer to the value of the parameter as the
       sample size increases.
       Degrees of Freedom: The number of data values which are allowed to vary once a statistic
       has been determined.
       Estimator: A sample statistic which is used to estimate a population parameter. It must be
       unbiased, consistent, and relatively efficient.
       Estimate: One of the different possible values which an estimator can assume.
       Interval Estimate: A range of values used to estimate a parameter.
       Point Estimate: A single value used to estimate a parameter.
       Relatively Efficient Estimator: The estimator for a parameter with the smallest variance.
       Unbiased Estimator: An estimator whose expected value is the value of the parameter
       being estimated.
$\sum x_i$ over n is the point estimator used to compute the estimate of the population mean, $\mu$. That
is, $\bar{X} = \frac{\sum x_i}{n}$ is a point estimator of the population mean.
      Confidence interval estimation of the population mean
Although $\bar{X}$ possesses nearly all the qualities of a good estimator, because of sampling error,
we know that it's not likely that our sample statistic will be equal to the population parameter,
but instead will fall into an interval of values. We will have to be satisfied knowing that the
statistic is "close to" the parameter. That leads to the obvious question, what is "close"?
We can phrase the latter question differently: How confident can we be that the value of the
statistic falls within a certain "distance" of the parameter? Or, what is the probability that the
parameter's value is within a certain range of the statistic's value? This range is the confidence
interval.
The confidence level is the probability that the value of the parameter falls within the range
specified by the confidence interval surrounding the statistic.
       There are different cases to be considered to construct confidence intervals.
Case 1: If sample size is large or if the population is normal with known variance
Recall the Central Limit Theorem, which applies to the sampling distribution of the mean of a
sample. Consider samples of size n drawn from a population whose mean is $\mu$ and standard
deviation is $\sigma$, with replacement and order important. The population can have any frequency
distribution. The sampling distribution of $\bar{X}$ will have a mean $\mu_{\bar{x}} = \mu$ and a standard
deviation $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$, and approaches a normal distribution as n gets large. This allows us to
use the normal distribution curve for computing confidence intervals.
      But usually $\sigma^2$ is not known; in that case we estimate it by its point estimator $S^2$.
Here are the Z values corresponding to the most commonly used confidence levels.
         100(1 - α)%      α       α/2      Z_{α/2}
         90               0.10    0.05     1.645
The unit of measurement of the confidence interval is the standard error. This is just the
standard deviation of the sampling distribution of the statistic.
Examples:
1. From a normal sample of size 25 a mean of 32 was found .Given that the population
     standard deviation is 4.2. Find
            a) A 95% confidence interval for the population mean.
            b) A 99% confidence interval for the population mean.
      Solution:
          a)
b)
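One way to work this example is sketched below. It assumes the usual known-variance interval, sample mean plus or minus Z_{α/2} times σ/√n, and obtains Z_{α/2} from `statistics.NormalDist` (Python 3.8 or later) rather than a printed table; the function name `z_interval` is an illustrative assumption.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(xbar, sigma, n, confidence):
    """Confidence interval for the mean when the population sd is known."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # Z_{alpha/2}, e.g. about 1.645 for 90%
    margin = z * sigma / sqrt(n)             # Z_{alpha/2} times the standard error
    return xbar - margin, xbar + margin


n, xbar, sigma = 25, 32, 4.2

low95, high95 = z_interval(xbar, sigma, n, 0.95)
low99, high99 = z_interval(xbar, sigma, n, 0.99)

print(f"95% CI: ({low95:.2f}, {high95:.2f})")
print(f"99% CI: ({low99:.2f}, {high99:.2f})")
```

The 99% interval is wider than the 95% interval: a higher confidence level requires a larger margin around the same sample mean.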
Suppose the assumed or hypothesized value of $\mu$ is denoted by $\mu_0$, then one can formulate two
     1.    $H_0: \mu = \mu_0$           vs      $H_1: \mu \neq \mu_0$
     2.    $H_0: \mu = \mu_0$           vs      $H_1: \mu > \mu_0$
     3.    $H_0: \mu = \mu_0$           vs      $H_1: \mu < \mu_0$
 Case 2: When sampling is from a normal distribution with $\sigma^2$ unknown and small sample size
      - The relevant test statistic is
             $t_{cal} = \frac{\bar{X} - \mu_0}{S / \sqrt{n}} \sim t$ with n - 1 degrees of freedom.
      - After specifying $\alpha$ we have the following regions on the Student t-distribution
        corresponding to the above three hypotheses.
      H0               Reject H0 if            Accept H0 if               Inconclusive if
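             $$Z_{cal} = \begin{cases} \dfrac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}, & \text{if } \sigma^2 \text{ is known,} \\[2ex] \dfrac{\bar{X} - \mu_0}{S/\sqrt{n}}, & \text{if } \sigma^2 \text{ is unknown.} \end{cases}$$
      - The decision rule is the same as case I.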
Examples:
Solution:
      Step 1: Identify the hypotheses.
                $H_0: \mu = 10$     vs     $H_1: \mu \neq 10$
      Step 2: Select the level of significance, $\alpha = 0.01$ (given).
      Step 3: Select an appropriate test statistic.
                 The t-statistic is appropriate because the population variance is not known and the sample size is
                 also small.
      Step 4: Identify the critical region.
                Here we have two critical regions since we have a two-tailed hypothesis.
      Step 5: Computations
                $\bar{X} = 10.06, \quad S = 0.25$
                $$t_{cal} = \frac{\bar{X} - \mu_0}{S/\sqrt{n}} = \frac{10.06 - 10}{0.25/\sqrt{10}} = 0.76$$
      Step 6: Decision
                 Accept H0 , since tcal is in the acceptance region.
      Step 7: Conclusion
                At the 1% level of significance, we have no evidence to say that the average content of
                containers of the given lubricant is different from 10 liters, based on the given sample data.
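The computation and decision in this example can be reproduced with a few lines of Python. This is only a sketch: the sample size n = 10 is read off the $\sqrt{10}$ in the computation, and the two-tailed critical value 3.250 for 9 degrees of freedom at $\alpha = 0.01$ is an assumption copied from a standard t table, since the module's own critical value is not shown here.

```python
from math import sqrt

def t_statistic(sample_mean, mu0, s, n):
    """t_cal = (sample mean - mu0) / (S / sqrt(n)), with n - 1 degrees of freedom."""
    return (sample_mean - mu0) / (s / sqrt(n))

# Values from the worked example (n = 10 inferred from the computation).
t_cal = t_statistic(10.06, 10, 0.25, 10)

# Assumption: t_{0.005}(9) = 3.250, taken from a standard t table.
t_crit = 3.250

# Two-tailed rule: reject H0 when |t_cal| exceeds the critical value.
decision = "Reject H0" if abs(t_cal) > t_crit else "Accept H0"
print(round(t_cal, 2), decision)
```

Because $|t_{cal}|$ is far below the critical value, the sketch reaches the same decision as Step 6.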
2.         The mean life time of a sample of 16 fluorescent light bulbs produced by a company is computed
      to be 1570 hours. The population standard deviation is 120 hours. Suppose the hypothesized value
            $H_0: \mu = 1600$     vs     $H_1: \mu < 1600$
  Step 2: Select the level of significance, $\alpha = 0.05$ (given).
  Step 3: Select an appropriate test statistic.
        The Z-statistic is appropriate because the population variance is known.
 Step 4: Identify the critical region.
 Step 5: Computations
         $$Z_{cal} = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} = \frac{1570 - 1600}{120/\sqrt{16}} = -1.0$$
 Step 6: Decision
        Accept H0, since Zcal is in the acceptance region.
 Step 7: Conclusion
        At the 5% level of significance, we have no evidence to say that the lifetime of the light bulbs is
        decreasing, based on the given sample data.
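A short sketch of the same test follows. It assumes a left-tailed alternative ($\mu < 1600$), which matches the wording of the conclusion, and obtains the critical value from the standard normal distribution rather than from a printed table; the function name `z_statistic` is illustrative.

```python
from math import sqrt
from statistics import NormalDist

def z_statistic(sample_mean, mu0, sigma, n):
    """Z_cal = (sample mean - mu0) / (sigma / sqrt(n)) for a known population sigma."""
    return (sample_mean - mu0) / (sigma / sqrt(n))

# Values from the example: n = 16, sample mean = 1570, sigma = 120, mu0 = 1600
z_cal = z_statistic(1570, 1600, 120, 16)

alpha = 0.05
z_crit = NormalDist().inv_cdf(alpha)  # lower-tail critical value, about -1.645

# Left-tailed rule (assumed): reject H0 only when Z_cal falls below the critical value.
decision = "Reject H0" if z_cal < z_crit else "Accept H0"
print(z_cal, round(z_crit, 3), decision)
```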
Exercise: It is known in a pharmacological experiment that rats fed a particular diet over a
certain period gain an average of 40 g in weight. A new diet was tried on a sample of 20 rats,
yielding a mean weight gain of 43 g with a variance of 7 g². Test the hypothesis that the new diet
is an improvement, assuming normality.
Test of Association
                                          B
           A            B1     B2     ...    Bj     ...    Bc       Total
           A1           O11    O12    ...    O1j    ...    O1c      R1
           A2           O21    O22    ...    O2j    ...    O2c      R2
           ...
           Ai           Oi1    Oi2    ...    Oij    ...    Oic      Ri
           ...
           Ar           Or1    Or2    ...    Orj    ...    Orc      Rr
           Total        C1     C2     ...    Cj     ...    Cc       n
 - The chi-square test is used to test the hypothesis of independence of two attributes.
    For instance, we may be interested in:
                     Whether the presence or absence of hypertension is independent of
                         smoking habit or not.
                     Whether the size of the family is independent of the level of education
                         attained by the mothers.
                     Whether there is association between father and son regarding boldness.
                                                        Son
                                           Father      Bold          Not
                                           Bold        85            59
                                           Not         65            91
     Using $\alpha = 5\%$, test whether there is association between father and son regarding boldness.
     Solution:
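The worked solution for the father and son table is not shown, so the following is a hedged sketch of how it could be checked. It assumes the usual statistic $\chi^2 = \sum (O_{ij} - e_{ij})^2 / e_{ij}$ with $e_{ij} = R_i C_j / n$ and no continuity correction, and it takes the 5% critical value for 1 degree of freedom (3.841) from a standard chi-square table. The function name is illustrative.

```python
def chi_square_independence(observed):
    """Return (chi-square statistic, expected frequencies, degrees of freedom)."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    # Expected frequency under independence: e_ij = R_i * C_j / n
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, expected, df

# Rows: father (Bold, Not); columns: son (Bold, Not)
observed = [[85, 59], [65, 91]]
chi2_cal, expected, df = chi_square_independence(observed)

chi2_crit = 3.841  # assumption: chi-square table value for alpha = 0.05, df = 1
decision = "Reject H0" if chi2_cal > chi2_crit else "Accept H0"
print(round(chi2_cal, 2), df, decision)
```

Under these assumptions the calculated value is about 9.03, which exceeds the critical value, so the hypothesis of no association would be rejected at the 5% level.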
     Solution:
      $H_0$: There is no association between the size of the family and the level of
         education attained by fathers.
      $H_1$: not $H_0$.
      - First calculate the row and column totals:
              $R_1 = 83, \; R_2 = 117, \; C_1 = 45, \; C_2 = 96, \; C_3 = 59$
      - Then calculate the expected frequencies ($e_{ij}$'s):
              $$e_{ij} = \frac{R_i \times C_j}{n}$$
              $e_{11} = 18.675, \; e_{12} = 39.84, \; e_{13} = 24.485$
              $e_{21} = 26.325, \; e_{22} = 56.16, \; e_{23} = 34.515$
      - Obtain the calculated value of the chi-square.
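The expected frequencies listed above can be reproduced from the row and column totals alone. The sketch below assumes n = 200, which is the sum of the stated row totals (and of the column totals); the observed frequencies are not available here, so the chi-square value itself is not computed.

```python
row_totals = [83, 117]      # R1, R2
col_totals = [45, 96, 59]   # C1, C2, C3

n = sum(row_totals)
# Sanity check: row and column totals must describe the same sample.
assert n == sum(col_totals)

# e_ij = R_i * C_j / n
expected = [[r * c / n for c in col_totals] for r in row_totals]

for row in expected:
    print([round(e, 3) for e in row])
```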
Correlation Analysis: deals with the measurement of the closeness of the relationship which
is described in the regression equation.
We say there is correlation if the two series of items vary together, directly or inversely.
   7.2.Correlation Analysis
The presence of correlation between two variables may be due to three reasons:
           1.One variable being the cause of the other. The cause is called “subject” or
               “independent” variable, while the effect is called “dependent” variable.
           2.Both variables being the result of a common cause. That is, the correlation that
               exists between two variables is due to their being related to some third force.
           Example:
             Let X1= ESLCE result
                   Y1= rate of surviving in the University
                   Y2= the rate of getting a scholarship.
             Both X1 & Y1 and X1 & Y2 have high positive correlation. Likewise, Y1 & Y2 have
             positive correlation, but they are not directly related; they are related to each other
             via X1.
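           3.Chance. The correlation arises purely by chance, with no real relationship between
               the variables (spurious correlation).
             Examples: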
                      Price of teff in Addis Ababa and grade of students in USA.
                      Weight of individuals in Ethiopia and income of individuals in Kenya.
Therefore, while interpreting a correlation coefficient, it is necessary to see if there is any
likelihood of a relationship existing between the variables under study.
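The correlation coefficient between X and Y, denoted by $r$, is given by
      $$r = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum (X_i - \bar{X})^2 \, \sum (Y_i - \bar{Y})^2}}$$
and the short cut formula is
      $$r = \frac{n\sum XY - (\sum X)(\sum Y)}{\sqrt{[n\sum X^2 - (\sum X)^2]\,[n\sum Y^2 - (\sum Y)^2]}}$$
      $$r = \frac{\sum XY - n\bar{X}\bar{Y}}{\sqrt{[\sum X^2 - n\bar{X}^2]\,[\sum Y^2 - n\bar{Y}^2]}}$$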
Remark: $r$ always lies between -1 and 1 inclusive, and it is symmetric in X and Y.
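To see that the definitional and short cut formulas agree, here is a minimal sketch. The data set is hypothetical and chosen only for easy arithmetic; it is not taken from the module's examples, and the function names are illustrative.

```python
from math import sqrt

def pearson_r(x, y):
    """Definitional formula: sums of deviations from the means."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def pearson_r_shortcut(x, y):
    """Short cut formula: raw sums only."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    return (n * sum_xy - sum_x * sum_y) / sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )

# Hypothetical data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r_def = pearson_r(x, y)
r_short = pearson_r_shortcut(x, y)
print(round(r_def, 4), round(r_short, 4))  # both about 0.7746
```

Swapping the arguments gives the same value, which illustrates the symmetry noted in the remark.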
Interpretation of $r$
             1.Perfect positive linear relationship (if $r = 1$)
             2.Some positive linear relationship (if $r$ is between 0 and 1)
Examples:
   1. Calculate the simple correlation between mid semester and final exam scores of 10 students
      (both out of 50)
Exercise: The following data were collected from a certain household on the monthly income
(X) and consumption (Y) for the past 10 months. Compute the simple correlation coefficient.
          X:      650    654   720     456   536     853     735     650   536   666
          Y:      450    523   235     398   500     632     500     635   450   360
 The above formula and procedure are applicable only to quantitative data. When we have
   qualitative data like efficiency, honesty, intelligence, etc., we calculate what is called
   Spearman’s rank correlation coefficient as follows:
   7.3.Steps
       i. Rank the different items in X and Y.
      ii. Find the difference of the ranks in each pair; denote them by $D_i$.
      iii. Use the following formula:
               $$r_s = 1 - \frac{6\sum D_i^2}{n(n^2 - 1)}$$
           where $r_s$ = the coefficient of rank correlation,
                 $D$ = the difference between paired ranks,
                 $n$ = the number of pairs.
Example:
          Lipstick types    A B         C   D   E    F    G
          Aster             2   1       4   3   5    7    6
          Almaz             1   3       2   4   5    6    7
Solution:
X (R1)          Y (R2)          D = R1 - R2          D^2
2               1                1                   1
1               3               -2                   4
4               2                2                   4
3               4               -1                   1
5               5                0                   0
7               6                1                   1
6               7               -1                   1
Total                                                12

          $$r_s = 1 - \frac{6\sum D_i^2}{n(n^2 - 1)} = 1 - \frac{6(12)}{7(48)} = 0.786$$
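The rank correlation for the lipstick example can be verified with a short sketch. It assumes, as in the table, that the data are already ranks with no ties; the function name `spearman_rs` is illustrative.

```python
def spearman_rs(ranks_x, ranks_y):
    """r_s = 1 - 6 * sum(D^2) / (n * (n^2 - 1)) for paired ranks without ties."""
    n = len(ranks_x)
    sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(ranks_x, ranks_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

# Ranks given to lipstick types A to G
aster = [2, 1, 4, 3, 5, 7, 6]
almaz = [1, 3, 2, 4, 5, 6, 7]

rs = spearman_rs(aster, almaz)
print(round(rs, 3))  # 0.786
```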
 Where $a$ is a constant which gives the value of Y when X = 0. It is called the Y-intercept. $b$
 is a constant indicating the slope of the regression line, and it gives a measure of the change
 in Y for a unit change in X. It is also called the regression coefficient of Y on X.
 $$a = \bar{Y} - b\bar{X}$$
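The following sketch shows how $a$ and $b$ might be computed in practice. The module's formula for $b$ is not visible at this point, so the code assumes the standard least-squares slope $b = (\sum XY - n\bar{X}\bar{Y}) / (\sum X^2 - n\bar{X}^2)$, the mirror image of the $b_1$ formula used for the regression of X on Y. The data are hypothetical and the function name `fit_line` is illustrative.

```python
def fit_line(x, y):
    """Return (a, b) for the fitted line Y = a + b*X (regression of Y on X)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi * xi for xi in x)
    # Assumed least-squares slope (standard formula, not quoted from the module).
    b = (sum_xy - n * mean_x * mean_y) / (sum_x2 - n * mean_x ** 2)
    # Intercept, as given in the text: a = Ybar - b * Xbar
    a = mean_y - b * mean_x
    return a, b

# Hypothetical data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

a, b = fit_line(x, y)
print(round(a, 2), round(b, 2))  # 2.2 0.6
```

For these data, Y increases by about 0.6 for each unit increase in X, and the fitted line passes through the point of means (3, 4).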
Example 1: The following data show the scores of 12 students in Accounting and Statistics
                examinations.
            Accounting (X)    Statistics (Y)    $X^2$    $Y^2$    $XY$
a)
The Coefficient of Correlation (r) has a value of 0.92. This indicates that the two variables are
positively correlated (Y increases as X increases).
     b)
where:
          $$\hat{Y} = 7.0194 + 0.9560X$$
          $$\hat{Y} = 7.0194 + 0.9560(85) = 88.28$$
- To know how far the regression equation has been able to explain the variation in Y we use a
  measure called the coefficient of determination ($r^2$),
       i.e. $$r^2 = \frac{\sum (\hat{Y} - \bar{Y})^2}{\sum (Y - \bar{Y})^2}$$
       where $r$ = the simple correlation coefficient.
- $r^2$ gives the proportion of the variation in Y explained by the regression of Y on X.
- $1 - r^2$ gives the unexplained proportion and is called the coefficient of indetermination.
Example: For the above problem (example 1): $r = 0.9194$
 $\Rightarrow r^2 = 0.8453 \Rightarrow$ 84.53% of the variation in Y is explained and only 15.47% remains
unexplained; it is accounted for by the random term.
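The ratio of explained to total variation can be computed directly, as in the sketch below. The five data points are hypothetical (the 12 student scores are not reproduced here), the least-squares slope is the standard one, and the last lines simply square the $r$ reported for example 1.

```python
def r_squared(x, y):
    """Explained variation divided by total variation for the regression of Y on X."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx                 # least-squares slope
    a = mean_y - b * mean_x       # intercept
    fitted = [a + b * xi for xi in x]
    explained = sum((f - mean_y) ** 2 for f in fitted)
    total = sum((yi - mean_y) ** 2 for yi in y)
    return explained / total

# Hypothetical data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r2 = r_squared(x, y)
print(round(r2, 4), round(1 - r2, 4))  # explained and unexplained proportions

# Example 1 of the text: r = 0.9194
r = 0.9194
print(round(r ** 2, 4), round(1 - r ** 2, 4))  # 0.8453 0.1547
```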
   o Covariance of X and Y measures the co-variability of X and Y together. It is denoted by
$S_{XY}$ and given by
            $$S_{XY} = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{n - 1} = \frac{\sum XY - n\bar{X}\bar{Y}}{n - 1}$$
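$$\hat{X} = a_1 + b_1 Y$$
                 $$b_1 = \frac{\sum XY - n\bar{X}\bar{Y}}{\sum Y^2 - n\bar{Y}^2}$$
                 $$a_1 = \bar{X} - b_1\bar{Y}, \qquad r = \frac{b_1 S_Y}{S_X}$$
 Here X is dependent and Y is independent.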
     7.5.Choice of Dependent and Independent variable
 Then $$r = \frac{b_{YX} S_X}{S_Y} = \frac{b_{XY} S_Y}{S_X} \;\Rightarrow\; r^2 = b_{YX} \cdot b_{XY}$$
 - Moreover, $b_{YX}$ and $b_{XY}$ are completely different numerically as well as conceptually.
 1. If the correlation is perfect positive, i.e. $r = 1$, then the b values are reciprocals of each
      other.
common point $(\bar{X}, \bar{Y})$
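 Example: The regression lines between height (X) in inches and weight (Y) in lbs of male
 students are:
                      $4Y - 15X + 530 = 0$ and
                      $20X - 3Y - 975 = 0$
 Determine which is the regression of Y on X and which is the regression of X on Y.
 Solution
 We will assume one of the equations is the regression of X on Y and the other of Y on X, and
 calculate $r$.
         $4Y - 15X + 530 = 0 \Rightarrow X = \frac{530}{15} + \frac{4}{15}Y \Rightarrow b_{XY} = \frac{4}{15}$
         $20X - 3Y - 975 = 0 \Rightarrow Y = -\frac{975}{3} + \frac{20}{3}X \Rightarrow b_{YX} = \frac{20}{3}$
 Under this assumption $r^2 = b_{YX} \cdot b_{XY} = \frac{20}{3} \cdot \frac{4}{15} = \frac{16}{9} > 1$, which is
 impossible, so the roles must be reversed:
   $4Y - 15X + 530 = 0 \Rightarrow Y = -\frac{530}{4} + \frac{15}{4}X \Rightarrow b_{YX} = \frac{15}{4}$
   $20X - 3Y - 975 = 0 \Rightarrow X = \frac{975}{20} + \frac{3}{20}Y \Rightarrow b_{XY} = \frac{3}{20}$
      $$r^2 = b_{YX} \cdot b_{XY} = \frac{15}{4} \cdot \frac{3}{20} = \frac{9}{16} \in (0, 1)$$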
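The trial-and-error check on $r^2$ can be expressed with exact fractions. In the sketch below the signs of the two equations ($4Y - 15X + 530 = 0$ and $20X - 3Y - 975 = 0$) are a reconstruction inferred from the positive slopes 4/15, 20/3, 15/4 and 3/20 in the solution; the helper `slopes` is illustrative.

```python
from fractions import Fraction

def slopes(coef_x, coef_y):
    """For a line coef_x*X + coef_y*Y + c = 0, return (slope of Y on X, slope of X on Y)."""
    b_yx = Fraction(-coef_x, coef_y)  # solving the equation for Y
    b_xy = Fraction(-coef_y, coef_x)  # solving the equation for X
    return b_yx, b_xy

line1 = (-15, 4)   # 4Y - 15X + 530 = 0  (assumed signs)
line2 = (20, -3)   # 20X - 3Y - 975 = 0  (assumed signs)

# Attempt A: line1 is the regression of X on Y, line2 of Y on X.
r2_a = slopes(*line2)[0] * slopes(*line1)[1]
# Attempt B: line1 is the regression of Y on X, line2 of X on Y.
r2_b = slopes(*line1)[0] * slopes(*line2)[1]

for label, r2 in (("A", r2_a), ("B", r2_b)):
    verdict = "valid" if 0 <= r2 <= 1 else "impossible"
    print(label, r2, verdict)
```

Only the assignment whose $r^2$ lies between 0 and 1 is admissible, which is the criterion used in the solution above.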