Statistics I

The course aims to equip students with the knowledge and skills to effectively use statistical information in financial reports and research, focusing on both descriptive and inferential statistics. Students will learn to utilize SPSS for data analysis and will be evaluated through attendance, quizzes, assignments, and a final exam. The course covers various statistical concepts, types of variables, measurement scales, and methods for organizing and presenting data.

Uploaded by

kasimfauziya0
Copyright
© All Rights Reserved
COURSE OBJECTIVES AND OUTCOME

The main objectives of this course are to prepare students with the conceptual knowledge to be an educated user of statistical information in financial reports and scholarly research, and with the practical skills to apply this knowledge in pursuing their own research (after taking both ACC223 and ACC224). After taking the complete course (ACC223 and ACC224), the student should have a basic understanding of the statistical procedures and tests used in descriptive and inferential statistics, especially in accounting research. The student should also be able to use SPSS to conduct some basic data analysis as a starting point in data analytics.

COURSE REQUIREMENTS

Text and Material
You need a ruler, a pencil, and a standard scientific calculator; always bring them to all classes. You can use any standard basic or introductory statistics textbook (for social sciences or business) for your readings. Note that there will be unannounced tests and quizzes.

Course Structure and Grading
1. Class attendance and participation: 10%
2. Tests and quizzes: 10%
3. Assignment: 10%
4. Final exam: 70%

Classroom Behavior
Please turn off your cell phones as a courtesy to the lecturer and your classmates. No indecent dressing, chatting (social media), web browsing, or any other use of phones or electronic devices in class. No communication devices, music players, etc.

PROPOSED CLASS SCHEDULE

# | Date | Topic
1 | 1st meeting | General Introduction and Words of Advice
2 | 2nd meeting | Introduction: definition, types of statistics (descriptive, inferential)
3 | 3rd meeting | Data and Variable; types of variables: qualitative, quantitative, discrete, continuous
4 | 4th meeting | Variable measurement scales (sources and methods of data collection as assignment)
5 | 5th meeting | Functional relationship; measures of central tendency: mean, mode, ...
6 | 6th meeting | Measures of dispersion: variance, deviation, std. deviation
7 | 7th meeting |
8 | 8th meeting | Moments, skewness, kurtosis
9 | 9th meeting |
10 | 10th meeting | Presentation of statistical data
11 | | Practical: Basic descriptive analyses using Excel and SPSS
12 | |
13 | |

Learning Objective
* Know the ways statistics is used.
* Know the differences between descriptive and inferential statistics.
* Know the differences between sample and population.
* Know the differences between qualitative and quantitative variables.
* Know the differences between discrete and continuous variables.
* Know the scales of variable measurement.

1.1 Introduction

Just as the ability to read is needed to become an effective modern personality, statistics has become an integral part of modern life. It helps keep us informed about what is happening around us in the current world of information. For instance, as a university student, you may be interested in knowing how the mean starting salary of a university graduate depends on GPA. As another example, many students watch the weather forecast before going to school. Do you know how you get weather information? The information is based on computer models built on statistical concepts, which compare prior weather with the current weather to predict future weather. In financial markets, statistics plays a great role in analyzing how traders and businessmen invest and make money. When you surf the internet, read newspapers, watch the news on TV, or follow sports, you come across the word statistics frequently. What, then, is statistics?

1.2 Definition

In common usage, the word statistics means numerical information. Examples are the number of missed classes per semester due to illness, the average starting salary of a university graduate, and the monthly average internally generated revenue of Kaduna State for 2017. All of these are statistics, as they stand for collections of numerical information.
Moreover, in many instances, statistical information is presented in diagrammatic form. Diagrams include charts, graphs, tables, etc., which are usually used to present a lot of information and capture the reader's attention quickly. For instance, Figure 1.1 gives a quick glance at the proportions of students in each department of the Faculty of Social and Management Sciences of KASU.

Figure 1.1: No. of students in the various departments of Faculty of Social and Management Sciences, Kaduna State University. (Pie chart: Accounting 19%, Business Admin 22%, Economics 20%, Sociology 13%, Mass Communication 14%, Political Science 10%.)

To define it formally, statistics has a broader meaning than just a collection of numerical information. According to Professor David J. Hand¹, "Statistics is the fun of finding patterns in data; the pleasure of making discoveries; the import of deep philosophical questions; the power to shed light on important decisions, and the ability to guide decisions in business, science, government, medicine, industry..." In more precise words, statistics is the science of collecting, organizing, presenting, analyzing, and interpreting data to support effective decision making.

Based on these definitions, statistics starts with collecting relevant data. The collected data is then organized and presented. After that, the data can be analyzed and interpreted to aid decision making. The definitions also indicate that the subject of statistics involves exploration, understanding, and decision making. Hence, its usage has expanded enormously to almost every area of life. Currently, there is a lot of information and misinformation available to almost everyone in society. Understanding the information and making informed decisions based on the ...

¹ Professor David J. Hand is a prominent professor of statistics with interest in ...

... can recommend whether to reduce the time or not. Therefore, statistics deals with sample data to predict and estimate, and is finally used to support managerial decisions.

1.3 Types of Statistics

The subject of statistics is categorized into descriptive statistics and inferential statistics.

1.3.1 Descriptive Statistics

Descriptive statistics involves methods of organizing, summarizing, and presenting data in an informative way. These methods entail the construction of tables, charts, and graphs, and the calculation of descriptive measures such as measures of central tendency (mean, median, and mode), measures of variation (variance and standard deviation), and percentiles. For instance, KASU reported the number of candidates that were offered admission and registered to study various undergraduate courses (from 2011 to 2017) as 500 in 2010; 700 in 2011; 1,020 in 2012; 1,323 in 2013; and 1,500 in 2014... The report also revealed the percentage increase from one year to another. This is an example of descriptive statistics. Tables and charts can be used to describe such data. Measures of location and other statistical measures (to be discussed later) can also be used to offer more description of the data.

1.3.2 Inferential Statistics

The other type of statistics is inferential statistics. It refers to methods of estimating properties of a population based on a sample. It is also considered a best guess about the population based on sample data. Inferential statistics carefully selects a sample from a population and uses the sample information to draw conclusions about the population. The example cited (in page...) about the recommendation to reduce the number of hours students spend on social media, based on sampled students, is a typical example of inferential statistics.

Both types of statistics are interconnected.
In most cases, the data obtained from the sample are organized and summarized using some descriptive statistics process before performing inferential analysis. Figure 1.1 presents a summary of the classification of statistics into the two classes, descriptive and inferential.

Figure 1.1: Summary of Types of Statistics (Descriptive Statistics; Inferential Statistics).

It is relevant at this juncture to briefly define population and sample. The topic of population and sample will be discussed in detail later. Population refers to the collection of all individuals or objects under consideration in a statistical study. A sample is a portion of the population from which information is obtained. A sample is used because it is usually not feasible to obtain information from the whole population. Consider taking voters' opinions. Taking a sample may be necessary because of the exorbitant cost of collecting information from an entire population of millions of voters. Similarly, for a study on sharks in the ocean, it may be impossible for a biologist to study all the sharks in the ocean. But with a careful selection of sampled sharks, conclusions can be generalized to the population. Figure 1.1 depicts the ...

The bulk of numerical data represents information about certain features of the population or sample under consideration. Hence, when studying a population or sample in statistics, we usually reduce what we study to some specific concepts or constructs relating to the subjects of study. For instance, such constructs include cost, price, profit, economic growth, inflation rate, and company size (in accounting, business, finance, and economics), and behaviors, attitudes, and cultures (in psychology and sociology). Moreover, there is another class of more general constructs considered in statistics, such as gender, age, and qualification. Most of these constructs stand for concepts that are not constant but change from one value to another, i.e. they either increase or decrease. Such constructs are referred to as variables.

1.4 Variable and Data

A variable is a characteristic of an individual or thing that varies from one individual or thing to another. Examples of variables include age, height, gender, and favorite color. Variables are categorized into qualitative and quantitative. A qualitative variable denotes a non-numerical observation representing a category of the variable. Examples of qualitative variables are gender, color of car, and favorite dish. On the other hand, a quantitative variable refers to a numerical observation representing an amount or a quantity. Examples of quantitative variables include the number of students in ACC 233 and the ages of ... All these examples can be represented numerically. A quantitative variable is further classified into discrete and continuous. A discrete variable is a numerical observation ... without intermediate points. For example, number ...

Figure 1.2: Summary of Classifications of Variable (Qualitative Variable, e.g. color, gender; Discrete Variable, e.g. no. of students in class; Continuous Variable, e.g. weight, height).

Data is the plural of datum. A datum is a single piece of information about something. Thus, data is a collection of information. Data is commonly presented in a rectangular long-form data set, as shown in Table 1.1. Each cell of the table keeps a datum, i.e. a piece of information about the entities being observed. The structural ...

1.4.1 Relationships

Relationships between variables (dependent and independent) can be ... deterministic ... Moreover, in terms of direction, a relationship can be direct/positive or indirect/negative (inverse). ... Before discussing the positive and negative relationship, it is pertinent at this juncture to differentiate between dependent and independent variables. Dependent variable is ...

The direction of the relationship between the dependent and the independent variable may be positive or negative, as mentioned earlier. The relationship is said to be positive when a change in the independent variable leads to a change in the dependent variable in the same direction. Hence, an increase in the independent variable leads to an increase in the dependent variable, and a decrease in the independent variable leads to a decrease in the dependent variable. Negative relationship occurs when ...

1.5 Variable Measurement Scale

In order to summarize and present data as meaningful information, variables must be measured on an appropriate scale of measurement. For instance, qualitative variables may only require a measurement scale that classifies and tallies/counts. Quantitative variables may require more than that. Hence, variable measurement scales are classified into Nominal Scale, Ordinal Scale, Interval Scale, and Ratio Scale.

1.5.1 Nominal Scale

... and the number of males can be counted out of the females and vice versa. Similarly, the LGA of origin variable (Table 1.2) requires only differentiating the individual units of observation: S1, who is from Jama'a, is different from S2, who is from Igabi. S3 is different from ... The counts are usually converted to percentages. We can show the percentage of males and females, and the percentage of students associated with each LGA of origin, out of the entire units of observation. The Nominal Scale does not signify rank or order. The scale is usually used with qualitative variables, as they mostly require classification.

To facilitate data analysis in computer software, variables on the Nominal Scale are usually coded numerically. For instance, a code of "1" can be assigned to male ... cannot give any meaningful answer. This is basically done to facilitate computer processing.
Further statistical operations cannot be performed on such a scale.

1.5.2 Ordinal Scale

When respondents of a survey are asked to rank the manager-subordinate relation of a manager as excellent, good, average, or poor, the responses are measured on an ordinal scale. Although this scale signifies ranking, it does not show the magnitude of the difference from one rank to another. Excellent is better than good, but we cannot tell the extent or size of the difference from excellent to good, and so on. Therefore, the Ordinal Scale is the next higher measurement scale above the Nominal Scale. Thus, it has the characteristic of the Nominal Scale (categorization) plus order/ranking.

1.5.3 Interval Scale

... difference between the values of the measurement, but it does not contain an absolute zero value. The Celsius temperature scale is a good example of an interval scale. In the Celsius scale, values of temperature can be ranked and the difference between the values can be determined. For example, the temperatures of Kaduna metropolis were taken for the first five consecutive days of November, and the values in degrees Celsius are 28, 30, 32, 28, and 26 respectively. These temperatures can be ranked, e.g. the third day has the highest temperature, followed by the second day, and so on. Also, the difference between the temperatures can be determined because a degree Celsius represents a constant unit of measurement in the scale. However, zero degrees Celsius does not represent the absence of the condition; it rather represents the freezing point.

1.5.4 Ratio Scale

The Ratio Scale measures observations of a variable based on all the characteristics of the interval scale; in addition, the zero point and the ratios between two points/numbers are meaningful. Simple examples of the ratio scale include wages, weight, profit, units of production, etc. Unlike the interval scale, where zero does not represent natural absence, in the ratio scale zero profit means no profit was made. Similarly, zero weight represents complete absence of weight.
As for the meaningful ratio, if company X made a profit of ₦1,000,000 and company Y made a profit of ₦500,000, then X made 2 times Y, or Y made 50% of X, as represented by the numbers assigned to the observations. Thus, the Ratio Scale possesses all the characteristics of the first three scales. Also, zero represents absence of the characteristic, and the ratio between two points is meaningful.

1.6 Observational Studies and Experimental Designs

1.7 CHAPTER SUMMARY

1.8 EXERCISES

SECTION TWO: DESCRIBING DATA

Learning Objective
* Know the methods of organizing qualitative data.
* Know the methods of organizing quantitative data.
* Know the differences between sample and population.
* Know the differences between qualitative and quantitative variables.
* Know the differences between discrete and continuous variables.
* Know the levels of measurement in data.

2.1 Introduction

In the previous section, two major branches of statistics were introduced: descriptive and inferential statistics. The current section focuses on descriptive statistics. Hence, this section deals with methods of organizing and summarizing data to show the pattern of the data, to identify where values concentrate, and to expose extreme values. Using any of the methods of data collection (such as observation, experiment, survey, etc.), sometimes a lot of data is generated. This data can be complicated if not organized. This section deals with the methods of organizing qualitative and quantitative data.

2.2 Organizing Qualitative Data

Based on the previous discussions, it should be recalled that qualitative data stands as a value for a qualitative variable. Also, it is measured based on count or frequency. Hence, frequency is one of the methods of organizing qualitative data. Frequency refers to the number of times a distinct value occurs.

2.2.1 Frequency Distribution Table

Frequency distribution is a technique used for organizing qualitative data. A frequency distribution of qualitative data is usually a table of the values of observation and how often they occur.
To construct a frequency distribution of qualitative data, the following three steps are followed. First, list the distinct values of observation in the first column of the frequency distribution table. Second, record a tally mark in the second column of the frequency distribution table for the respective values of observation. Finally, record the count of the respective tallies for each value of observation in the third column. In the second step, it is helpful to cross out each observation after tallying, to avoid duplicating or omitting an observation.

The following example of the favorite weather season (rainy, cold, and hot) of students in the Introduction to Statistics class further clarifies the frequency distribution procedure. Table 2.1 provides the data of a survey of the favorite weather seasons of 50 students in the Introduction to Statistics class.

Table 2.1: Survey data (individual responses: rainy, cold, or hot; not legible in the source).

The data is used to construct a frequency distribution table based on the steps above.

Table 2.2: Frequency distribution of favorite weather seasons of students in the Introduction to Statistics class

Favorite Weather | Frequency
Rainy | 27
Cold | 17
Hot | 6
Total | 50

Interpretation: Out of the 50 students, 27 chose rainy weather, 17 chose cold weather, and only 6 chose hot weather. Hence, by simply glancing at the table we can easily understand some information, for example that most of the students prefer the rainy season.

Relative frequency is obtained as:

$$\text{Relative Frequency} = \frac{\text{frequency}}{\text{number of observations}}$$

For the rainy season, for example, 27/50 × 100 = 54%.

2.2.2 Bar Chart

... axis) of the graph. The relative frequencies are represented by vertical bars whose heights are proportionate to the frequencies.

The following is a step-by-step procedure to construct a bar chart. First, obtain the relative frequency distribution based on the procedure discussed for the frequency distribution table (2.2.1 above). Second, draw the graph with the Y and X axes. The Y-axis displays the relative frequencies and the X-axis carries the bars bearing the variables. Third, for each distinct value, draw a vertical bar whose height is proportionate to the respective relative frequency of that value.
Finally, the vertical axis is labeled with the relative frequency and the horizontal axis with the name of the variable. The bars are labeled with the distinct values/observations of the variable.

Using the previous data (of favorite weather) in the example above, the following example puts the procedure for constructing a bar chart into practice.

(Bar chart: Favorite Weather Seasons. Vertical axis: relative frequency; horizontal axis: weather; bars: cold, rainy, hot.)

Interpretation: The chart above shows that out of the 50 students, 27 chose the rainy season, 17 indicated cold weather, and only 6 took hot weather as their favorite weather. Hence, by simply glancing at the chart we can easily understand some information, for example that most of the students prefer the rainy season and few of them like hot weather.

2.2.3 Pie Chart

The pie chart is another means of organizing and summarizing qualitative data. A pie chart is a circle divided into segments proportionate to the relative frequencies of the qualitative data. The following is a step-by-step procedure for constructing a pie chart.

First, obtain the relative frequencies of the data as discussed in the previous section. Second, draw a circle and divide it into segments (equal in number to the distinct values of the variable) proportionate to the relative frequencies. Finally, label the segments with the distinct values and their relative frequencies.

Based on the previous data of students' favorite weather season, the steps are exemplified below.

(Pie chart: Favorite Weather Season. Segments: cold, rainy, hot.)

Interpretation: The chart indicates that out of the 50 students, 27 have the rainy season as their favorite weather, 17 chose cold weather, and only 6 chose hot weather. Hence, by simply looking at the chart, one can easily understand some information, for example that most of the students prefer the rainy season and few of them prefer hot weather.

2.3 Organizing Quantitative Data

The previous section dealt with the procedures of organizing and summarizing qualitative data. The qualitative data of the students' favorite weather was displayed in a frequency distribution table and in bar and pie charts.
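The frequency and relative-frequency calculations used for the favorite-weather example in Section 2.2 can be sketched in Python as follows. This is only an illustration: the list `responses` is rebuilt from the reported counts (27 rainy, 17 cold, 6 hot) because the individual survey answers are not legible, and the function name `frequency_table` is a hypothetical helper, not part of the course material.

```python
from collections import Counter

# Responses rebuilt from the reported counts (an assumption: the original
# order of the 50 answers is unknown and does not affect the result).
responses = ["rainy"] * 27 + ["cold"] * 17 + ["hot"] * 6


def frequency_table(values):
    """Return {value: (frequency, relative_frequency)} for categorical data."""
    counts = Counter(values)   # list the distinct values and count each one
    n = len(values)            # total number of observations
    return {value: (freq, freq / n) for value, freq in counts.items()}


table = frequency_table(responses)
for value, (freq, rel) in table.items():
    # Relative frequency is printed as a percentage, e.g. 27/50 -> 54%
    print(f"{value:<6} {freq:>3} {rel:6.0%}")
```

The same relative frequencies are the bar heights in a bar chart and the segment proportions in a pie chart.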
This goes in line with ... This section focuses on organizing quantitative data. The quantitative data is classified and depicted in a histogram or dot plots.

The procedure of organizing quantitative data starts with grouping the data into classes. There are three common methods of grouping quantitative data: single value grouping, limit grouping, and cut-point grouping.

Single value grouping: Like the grouping in the case of qualitative data, single value grouping takes each value of observation as a class. Single value grouping is appropriate for discrete data with a small number of distinct values. For example, the following data is a statistic of the number of chairs in the fifty classes in the department of accounting of KASU. Use single value grouping to organize the data into a frequency distribution table.

Table 2.3: Number of Chairs in the Classes of the Department of Accounting, KASU (raw data values not legible in the source).

Frequency Distribution Table of Number of Chairs in the Classes of the Department of Accounting, KASU

No of Chairs | Frequency | Relative Frequency | RF in Percent
40 | 8 | 0.18 | 18
45 | 5 | 0.11 | 11
50 | 8 | 0.18 | 18
55 | 9 | 0.20 | 20
60 | 15 | 0.33 | 33
Total | 45 | 1.00 | 100

The first column of the above table classifies the observations into single-value classes, which is appropriate for grouping this kind of data.

Limit grouping: This is another method of grouping quantitative data, which organizes observations of a quantitative variable into ranges of values. Each range starts with a lower limit, which is the starting point and lowest value in the range, and ends with an upper limit, which is the last and highest value in the range. Limit grouping is suitable for discrete data with too many distinct values.

... distribution table using limit grouping (entries marked [?] are not legible in the source):

4, 8, 3, 14, 6, [?], 2, 9, 10, 15, 10, 0, 8, 13, 18, 2, 12, 7, 0, 4, 1, 14, 3, 6, 4, [?], 3, 5, 8, 5, 6, 1, 9, 12, 7, 6, 1, 8, 5, 9, 8, 6, [?], 6, 8

Class | Frequency | Relative Frequency | RF in Percent
0-5 | 12 | |
6-10 | 24 | |
11-15 | 7 | |
16-20 | 2 | |
Total | 50 | |

Cut-point grouping: Under this method of grouping quantitative data, the data is organized into ranges. Each range starts with a lower cut-point, which is the lowest value of the range, and ends with the upper cut-point, which is the highest value of the range. Note that the upper cut-point of one class is the lower cut-point of the next class. Cut-point grouping is suitable for continuous data (i.e. data expressed with decimals).

Table 2. ... (raw data values not legible in the source).

Class | Frequency | Relative Frequency
40 to under 50 | |
50 to under 60 | |
60 to under 70 | |
70 to under 80 | |
80 to under 90 | |
Total | 50 |

2.3.1 Classifying Quantitative Data for Histogram

2.3.2 Histogram

2.3.3 Frequency Polygon

2.3.4 Dot Plots

2.9 Measure of Central Tendency

Meaning: A measure of central tendency, also called a measure of location, is the statistical information that gives the middle, centre, or average of a set of data. All such measures are regarded as forms of averages.

There are five measures of central tendency. These are:
(i) Arithmetic mean
(ii) Median
(iii) Mode
(iv) Geometric mean
(v) Harmonic mean

Arithmetic mean, median, and mode will be discussed now, while geometric mean and harmonic mean will be discussed later.

2.10 The Arithmetic Mean

Definition: The arithmetic mean, also popularly referred to as the 'mean', is the average of a series of figures or values. It is obtained by dividing the sum of these figures by the total number of figures or values. It is also the average of a collection of observations. The arithmetic mean is the most popularly used measure of central tendency.
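The definition above (sum of the values divided by the number of values) can be sketched in Python as follows. This is a minimal illustration rather than part of the original notes: the function name `arithmetic_mean` is hypothetical, and the eight scores are the ones used in Example 1 of this section.

```python
from statistics import mean

# The eight examination scores used in Example 1 of this section.
scores = [14, 18, 24, 16, 30, 12, 20, 10]


def arithmetic_mean(values):
    """Sum of the values divided by how many values there are."""
    return sum(values) / len(values)


result = arithmetic_mean(scores)
print(result)        # mean computed directly from the definition
print(mean(scores))  # same calculation using Python's standard library
```

Both calls apply the same rule; the standard-library `statistics.mean` is shown only as a cross-check on the hand-written version.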
Formula for calculating arithmetic mean

$$\text{Arithmetic mean, } \bar{x} = \frac{\sum x}{n}$$

Where
$\bar{x}$ = arithmetic mean
$\sum$ = a Greek letter denoting "sum of"
$x$ = series of figures in a given data
$\sum x$ = the total of the values of the series of figures in a given data
$n$ = number of figures or elements

Note: This formula is used especially when the figures are small and ungrouped.

Example 1
Calculate the arithmetic mean for the scores of eight students in the NECO economics examination in the year 1999. The scores are: 14, 18, 24, 16, 30, 12, 20 and 10.

Solution
Step I: Add up the numbers or scores: $\sum x$ = 14 + 18 + 24 + 16 + 30 + 12 + 20 + 10 = 144
Step II: n = number of figures or scores, which is 8
Step III: $\bar{x} = \frac{\sum x}{n} = \frac{144}{8} = 18$

Note: When the figures given in a data set are large and in most cases repeat themselves, frequencies are used. Frequency is the number of times a particular event or piece of information occurs. A frequency distribution is usually used when the data presented are large and most of the numbers appear more than once. In this case the formula for calculating the arithmetic mean changes slightly to:

$$\text{Arithmetic mean, } \bar{x} = \frac{\sum fx}{n}$$

where f = number of times a particular number occurs (frequency). Other symbols remain the same.

Example 2
Calculate the mean of the following set of numbers: 8, 16, 24, 8, 12, 12, 18, 24, 10, 16, 20, 24, 24, 12, 24, 12, 16, 24, 18, 18.

Solution
Step I: Identify the numbers that occur in the set, i.e. 8, 10, 12, 16, 18, 20 and 24.
Step II: Arrange the numbers starting from the smallest number, which is 8, to the highest number, which is 24, as shown in table 2.11.
Step III: Arrange the figures or numbers in a frequency distribution table as shown in table 2.11.
Table 2.11: Frequency distribution

Step IV: Apply the formula

$$\bar{x} = \frac{\sum fx}{n} = \frac{(8 \times 2) + (10 \times 1) + (12 \times 4) + (16 \times 3) + (18 \times 3) + (20 \times 1) + (24 \times 6)}{20} = \frac{340}{20} = 17$$

Mean of grouped data

The arithmetic mean can also be calculated for grouped data. In this case the class mark (mid-point) of each class interval is used for the x-column. The formula used is:

$$\text{Arithmetic mean} = \frac{\sum fx}{\sum f}$$

Example 3
Calculate the mean of the following marks scored by students in an economics examination: 8, 31, 45, 38, 22, 28, 16, 51, 65, 48, 6, 24, 18, 12, 16, 48, 38, 50, 44, 6, 18, 16, 24, 32, 36, 26, 14, 20, 12, 18.

Solution
(i) Use the steps as in example 2.
(ii) Use class intervals of 0 - 9, 10 - 19, 20 - 29, etc. as shown in table 2.12.
(iii) Prepare a frequency table as in table 2.12.

Table 2.12: Frequency table for marks scored by students in economics examination.

$$\text{Arithmetic mean, } \bar{x} = \frac{\sum fx}{\sum f} = \frac{815}{30} \approx 27.2$$

Advantages of arithmetic mean
(i) Arithmetic mean is very easy to calculate.
(ii) It gives an exact value.
(iii) It is the best known average.
(iv) It is very easy to understand.
(v) It provides a good method of comparing values.
(vi) It makes use of all available information in a data set.

Disadvantages of arithmetic mean
(i) Arithmetic mean cannot be obtained graphically.
(ii) Certain facts may not be revealed by the arithmetic mean.
(iii) It may be difficult to obtain without calculations.
(iv) It can lead to distorted results.
(v) It may be badly affected by extreme values in a distribution.

2.11 The Median

Definition: The median is defined as an average which is the middle value when figures are arranged in order of magnitude. In an even distribution, the median is the average of the two middle numbers. In other words, the median of a distribution is the middle value when the observations are arranged in order of magnitude, starting with either the smallest or the largest number. The median is therefore the value of the middle item.

How to calculate the median

(A) When the number of values involved is odd, the median will be the middle number.

Example 1
Find the median of the following set of values: 2, 8, 11, 13, 15, 6, 9, 20, 7

Solution
Formula:

$$\text{Median} = \left(\frac{n + 1}{2}\right)\text{th number}$$

where n is the number of items (observations). This formula is applicable only when n is odd.

Step I: First arrange the numbers in ascending or descending magnitude.

(a) In ascending magnitude, we have 2, 6, 7, 8, 9, 11, 13, 15, 20. Count the number of values involved. There are nine (9) numbers involved (an odd number), so the middle number is the 5th number, and the 5th number is 9. Therefore, the median = 9.

Alternatively, the formula can be used:

$$\text{Median} = \left(\frac{n + 1}{2}\right)\text{th} = \left(\frac{9 + 1}{2}\right)\text{th} = \frac{10}{2} = 5\text{th number}$$

The 5th number in ascending magnitude is 9. Therefore the median = 9.

(b) In descending magnitude, we have 20, 15, 13, 11, 9, 8, 7, 6, 2. Since there are nine (9) numbers involved, the 5th number, which is the middle one (9), is the median. Therefore, the median = 9.

(B) When the number of values in the data is even, the two middle values are taken, added together, and divided by two. The median will be the arithmetic mean of the two middle numbers.

Example 2
Find the median of the following numbers: 20, 8, 12, 8, 10, 14, 18, 5

Solution
Step I: First arrange the numbers in ascending or descending magnitude. In ascending magnitude, we have: 5, 8, 8, 10, 12, 14, 18, 20.
Step II: Count the number of values involved. There are eight (8) values (an even number).
Step III: Since there are eight (8) numbers involved, the middle will be the 4th and the 5th numbers, which are 10 and 12.
Step IV: To get the median, add 10 and 12 together and divide by 2, i.e. (10 + 12)/2 = 22/2 = 11.

The median = 11

Alternatively, the formula can be used:

$$\text{Median} = \left(\frac{n + 1}{2}\right)\text{th} = \left(\frac{8 + 1}{2}\right)\text{th} = \frac{9}{2} = 4.5\text{th number}$$

where n = 8. Therefore the median is found between the 4th and 5th numbers, i.e. between 10 and 12.
The middle value between 10 and 12 is 11. Therefore, median = 11.

(C) When grouped data are involved, cumulative frequency is used. This method is used when the items or values are many and arranging them in ascending order may not be practical. The formula will now be:

Median = $\left(\frac{N+1}{2}\right)$th member for an odd number of items, i.e. where N is odd
Median = mean of the $\left(\frac{N}{2}\right)$th and $\left(\frac{N}{2}+1\right)$th members for an even number of items, i.e. where N is even

where N is the summation of all the frequencies.

Example 3
Calculate the median age of some SS III students from their age distribution.

Solution
A cumulative frequency table (table 2.14) is prepared for the distribution.

table 2.14: Cumulative frequency for the age distribution of SS III students.
(Table body not recoverable; the last two cumulative frequencies are 43 and 51.)

From table 2.14, there are 51 members, as indicated by the terminal (last) cumulative frequency. Since the number of members is odd (51), the median age will be the $\left(\frac{N+1}{2}\right)$th member.
Median age = $\left(\frac{51+1}{2}\right)$th = $\left(\frac{52}{2}\right)$th = 26th member
The 26th member falls within the cumulative frequency under the age of 11 years in the table above. Therefore, the median age = 11 years.

Example 4
The data in table 2.15 represent the marks scored by government students in NECO examinations. Calculate the median score.

table 2.15: Marks scored by government students in NECO examinations.
Marks (%)  | 12 | 18 | 24 | 30 | 36 | 40 | 48
Frequency  |  6 |  1 | 10 |  8 | 12 |  3 |  4

Solution
A cumulative frequency table (table 2.16) is prepared for the distribution.

table 2.16
Marks (%)            | 12 | 18 | 24 | 30 | 36 | 40 | 48
Frequency            |  6 |  1 | 10 |  8 | 12 |  3 |  4
Cumulative frequency |  6 |  7 | 17 | 25 | 37 | 40 | 44

From table 2.16, there are 44 members, as indicated by the terminal (last) cumulative frequency. Since 44 is even, the median score will be the mean of the $\left(\frac{N}{2}\right)$th and $\left(\frac{N}{2}+1\right)$th members, i.e. the $\left(\frac{44}{2}\right)$th = 22nd and the 23rd members.
The 22nd member is 30 marks.
The 23rd member is 30 marks.
Median score = $\frac{30 + 30}{2} = \frac{60}{2}$ = 30 marks
Median score = 30

Advantages of the median
1. Computation of the median is very easy.
2. The median is not affected by extreme values.
3. It is very easy to understand.
4. It can be obtained graphically.
5. The median is easy to determine by mere observation.
6. It does not involve serious calculations.

Disadvantages of the median
1. Difficulties come up when large values are involved.
2. The re-arrangement of numbers can be a difficult task.
3. It may not be needed for further statistical calculations.
4. It tends to ignore extreme values.

2.12 The Mode
Definition: The mode can be defined as the most frequently occurring number in a set of numbers or data. It tells us the observation which is most popular, i.e. the most frequently occurring value in a distribution.

Suitability of mode for use
The mode is suitable for use when we have a large array of numbers or want to find the number that appears most often in a series of numbers. The mode may not exist if no item or value repeats itself. Again, the mode may not be unique if more than one item repeats itself and such items have the same highest frequency.
The best and easiest way of calculating the mode of any distribution is to form a frequency table for it.

Example 1
The marks scored by economics students in WAEC examinations are as follows: 30, 25, 60, 80, 60, 25, 80, 60, 40, 60, 80, 30. Calculate the mode.

Solution
Step I: Determine the lowest and the highest marks (i.e. 25 and 80).
Step II: Arrange the distinct marks in ascending magnitude (i.e. 25, 30, 40, 60, 80).
Step III: Prepare a frequency table (table 2.17).

table 2.17: Frequency table of marks scored by economics students in WAEC examinations.

From table 2.17, the highest frequency is 4 and this corresponds to a mark of 60. The mode is 60.

Note: A set of values with two modes is called bi-modal; when there are more than two modes, the set is called multi-modal, while a set with only one mode is called uni-modal.

Advantages of the mode
1. It is easy to determine.
2. It is easy to understand.
3. It is not affected by extreme values.
4. When data are not complete, the mode is still not difficult to estimate.
5. It is very easy to compute.
Disadvantages of the mode
1. It is not a very good measure of accuracy.
2. It is irrelevant in further statistical calculation.
3. It represents a very poor average.
4. It is difficult to calculate, especially when more than one mode or large numbers are involved.
5. There may be uncertainty in its exact location.
6. Arrangement of data is always tedious.

Other types of means
Apart from the arithmetic mean, there are other types of means occasionally used for calculation. These are the geometric mean, the harmonic mean and the quadratic mean.

2.13 Geometric Mean
Meaning: The geometric mean of a group of numbers is the Nth root of the product of the numbers. In other words, it is derived from a set of N observations by taking the Nth root of the product of the numbers. It is denoted by the letter G.

Formula for calculating geometric mean
The formula used for calculating the geometric mean is:
$G = \sqrt[N]{\text{product of the various values in an observation}} = \sqrt[N]{x_1 \times x_2 \times \dots \times x_N}$
where x = individual value.

Example
Calculate the geometric mean of the following set of data: 6, 8, 12

Solution
N = 3
Product of the various values = 6 × 8 × 12 = 576
$G = \sqrt[3]{6 \times 8 \times 12} = \sqrt[3]{576}$
G = 8.32

Advantages of geometric mean
1. All available data are used in calculating the geometric mean.
2. It is useful in calculating statistical data.
3. It provides balanced information on both sides of the distribution.
4. It is very important in carrying out research work.

Disadvantages of geometric mean
1. A large volume of data makes the geometric mean difficult to understand.
2. The geometric mean is at times difficult to compute.

2.14 Harmonic Mean
Meaning: The harmonic mean refers to the reciprocal of the arithmetic mean of the reciprocals of some given numbers. The numbers could be $x_1, x_2, x_3, \dots, x_N$. The harmonic mean is denoted by the letter H.

Formula for calculating harmonic mean
The formula used for calculating the harmonic mean is given below:
Harmonic mean, $H = \frac{N}{\sum \frac{1}{x}}$

Example
Calculate the harmonic mean of the following set of data: 4, 6 and 8.
Solution
The reciprocals of 4, 6 and 8 are $\frac{1}{4}$, $\frac{1}{6}$ and $\frac{1}{8}$.
The arithmetic mean of the reciprocals is:
$\frac{1}{H} = \frac{1}{3}\left(\frac{1}{4} + \frac{1}{6} + \frac{1}{8}\right) = \frac{1}{3}\left(\frac{6 + 4 + 3}{24}\right) = \frac{1}{3}\left(\frac{13}{24}\right) = \frac{13}{72}$
The harmonic mean is the reciprocal of the arithmetic mean of the reciprocals of the given numbers:
$H = \frac{72}{13} = 5.54$

Advantages of harmonic mean
1. The harmonic mean can easily be determined.
2. It is not much affected by extreme values in the data.
3. All values in the observation are taken into consideration.

Disadvantages of harmonic mean
1. The principles of the harmonic mean are difficult to understand.
2. Its scope is limited.
3. It is difficult to calculate.

2.15 Quadratic Mean
Meaning: The quadratic mean, also known as the root mean square (RMS), of a set of numbers refers to the square root of the arithmetic mean of their squares. The quadratic mean is represented by R.M.S.

Formula for calculating R.M.S.
The formula used for calculating the quadratic mean is:
$\text{R.M.S.} = \sqrt{\frac{\sum x^2}{N}}$

Example
Calculate the quadratic mean of the following set of numbers: 2, 4, 6, 8

Solution
$\text{R.M.S.} = \sqrt{\frac{2^2 + 4^2 + 6^2 + 8^2}{4}} = \sqrt{\frac{4 + 16 + 36 + 64}{4}} = \sqrt{\frac{120}{4}} = \sqrt{30} = 5.48$

Advantages of quadratic mean
1. All values in a given data set are taken into consideration.
2. It is easy to determine.

Disadvantages of quadratic mean
1. Calculation becomes very difficult when the given values are large.
2. Its principles are difficult to understand.

2.16 Measure of Dispersion or Variability
Definition: The measure of dispersion, also known as the measure of variability, refers to the degree of spread of the numerical values in a distribution. It measures the variation that occurs in a given set of data. Examples of measures of dispersion include the range, the quartile, the mean deviation, the variance and the standard deviation.

Definition: The range is defined as the difference between the maximum (highest) and the minimum (lowest) values in a set of data. The range is the simplest and most straightforward measure of dispersion.
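The worked examples for the geometric, harmonic and quadratic means above can be checked numerically. The following is a minimal Python sketch, not part of the original text: the function names are illustrative assumptions, and the short list passed to value_range is hypothetical data chosen only to illustrate the range as just defined.

```python
import math


def geometric_mean(values):
    """Nth root of the product of the N values."""
    return math.prod(values) ** (1 / len(values))


def harmonic_mean(values):
    """N divided by the sum of the reciprocals of the values."""
    return len(values) / sum(1 / v for v in values)


def quadratic_mean(values):
    """Square root of the arithmetic mean of the squared values (RMS)."""
    return math.sqrt(sum(v * v for v in values) / len(values))


def value_range(values):
    """Difference between the highest and the lowest value."""
    return max(values) - min(values)


# Data taken from the three worked examples in the text.
print(round(geometric_mean([6, 8, 12]), 2))    # 8.32
print(round(harmonic_mean([4, 6, 8]), 2))      # 5.54
print(round(quadratic_mean([2, 4, 6, 8]), 2))  # 5.48

# Hypothetical data, used only to show the range calculation.
print(value_range([3, 9, 4]))                  # 6
```

Rounding to two decimal places mirrors the precision used in the worked solutions; the unrounded values are 8.3203..., 5.5384... and 5.4772...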
Example 1
Find the range in the following data: 12, 6, 19, 8, 24, 16, 36, 9, 40, 6, 50, 48, 12, 10

Solution
The maximum (highest) value = 50
The minimum (lowest) value = 6
The range = 50 - 6 = 44

Example 2
Find the range in table 2.18, which represents the marks scored in biology by SS I students.

table 2.18: Marks scored in biology by SS I students.

Solution
The maximum (highest) score = 60
The minimum (lowest) score = 11
The range = 60 - 11 = 49

Advantages of the range
1. It is easy to understand.
2. It is easy to calculate or compute.
3. It is useful for further statistical calculations.

Disadvantages of the range
1. It does not take all the values of a distribution into consideration; only the values at the extremes are used.
2. It is not a reliable measure of variability.

2.18 The Quartile
Meaning: Quartiles are the values which divide a given distribution into four equal parts. The quartile is similar to the median except that the median divides a distribution into two equal parts. The four equal parts of a quartile include:
(i) First quartile Q1
(ii) Second quartile Q2
(iii) Third quartile Q3
(iv) Fourth quartile Q4

2.19 Mean Deviation
Meaning: The mean deviation is the arithmetic mean of all absolute deviations from the mean. It represents the differences of all the values from the arithmetic mean divided by the number of values in a given data set. The mean deviation is obtained by finding the sum of the deviations of each value from the mean (not minding the sign) and then dividing by the number (n) of values. The mean deviation is denoted by MD.

$MD = \frac{\sum |x - \bar{x}|}{n}$

Example 1
Calculate the mean deviation of the following ages of some pupils in Okeke Primary School: 4, 5, 6, 8, 10, 3

Solution
Step 1: Find the arithmetic mean, $\bar{x}$
$\bar{x} = \frac{4 + 5 + 6 + 8 + 10 + 3}{6} = \frac{36}{6} = 6$
Step 2: Calculate the mean deviation using the formula above.
$MD = \frac{|4-6| + |5-6| + |6-6| + |8-6| + |10-6| + |3-6|}{6} = \frac{2 + 1 + 0 + 2 + 4 + 3}{6} = \frac{12}{6} = 2$

Mean deviation involving frequency
Assume certain figures are given as $x_1, x_2, x_3, \dots, x_n$ with their corresponding frequencies as $f_1, f_2, f_3, \dots, f_n$,
the mean deviation for the data is stated below.

$MD = \frac{f_1|x_1 - \bar{x}| + f_2|x_2 - \bar{x}| + f_3|x_3 - \bar{x}| + \dots + f_n|x_n - \bar{x}|}{N}$

Example 2
Calculate the mean deviation for the set of data in table 2.19.

table 2.19: Age of SS I students that won scholarship.

Solution
Step I: Prepare a new table (table 2.20) that will reflect all that will be required to make the calculation easy.

table 2.20: Age of SS I students that won scholarship.

Step II: Calculate the arithmetic mean ($\bar{x}$) to enable us compute the fourth column of table 2.20.
Arithmetic mean, $\bar{x} = \frac{\sum fx}{\sum f} = \frac{276}{20} = 13.8$
Mean deviation = $\frac{\sum f|x - \bar{x}|}{N} = \frac{69.2}{20} = 3.46$

Advantages of mean deviation
1. All the values in a distribution are used for computation.
2. It is used to indicate the average variation of the values in a given data set.

Disadvantages of mean deviation
1. It is difficult to calculate.
2. It ignores the plus and minus signs of the deviations.
3. It cannot be used for further mathematical calculation.

2.20 Variance and Standard Deviation
Meaning: The variance refers to the arithmetic mean of the squares of the deviations of the observations from the true mean. It is also referred to as the "mean square deviation". The standard deviation, on the other hand, is the square root of the variance. The standard deviation is also referred to as the "root mean square deviation".

Formulae for calculating variance and standard deviation
The formulae that may be used for calculating variance and standard deviation are stated below.
If, for example, there exist observations such as $x_1, x_2, x_3, \dots, x_n$ and their true arithmetic mean is $\bar{x}$, then:
(a) Variance = $\frac{\sum (x - \bar{x})^2}{N}$
(b) Standard deviation = $\sqrt{\frac{\sum (x - \bar{x})^2}{N}}$
Assume that the observations $x_1, x_2, x_3, \dots, x_n$ have frequencies $f_1, f_2, f_3, \dots, f_n$; then the formulae become:
(a) Variance = $\frac{\sum f(x - \bar{x})^2}{\sum f}$
(b) Standard deviation = $\sqrt{\frac{\sum f(x - \bar{x})^2}{\sum f}}$

Example 1
Calculate the variance and standard deviation of the following set of numbers: 3, 5, 8, 5, 6, 9

Solution
Step I: Calculate the arithmetic mean, $\bar{x}$.
$\bar{x} = \frac{3 + 5 + 8 + 5 + 6 + 9}{6} = \frac{36}{6} = 6$
Step II: Calculate the deviations $(x - \bar{x})$
= (3 - 6), (5 - 6), (8 - 6), (5 - 6), (6 - 6), (9 - 6)
= -3, -1, 2, -1, 0, 3
Step III: Calculate the squares of these deviations, $(x - \bar{x})^2$
= $(-3)^2, (-1)^2, 2^2, (-1)^2, 0^2, 3^2$
= 9, 1, 4, 1, 0, 9
Step IV: Add up all these squares
9 + 1 + 4 + 1 + 0 + 9 = 24
Step V: Calculate the arithmetic mean of the sum of the squares (i.e. the variance)
(a) Variance = $\frac{24}{6} = 4$
(b) Standard deviation = $\sqrt{\text{variance}} = \sqrt{4} = 2$

Example 2
The marks scored by chemistry students in their NECO examination are presented in table 2.21(a). Calculate the variance and standard deviation.

table 2.21(a): Marks scored by chemistry students in their NECO examination.

Solution
Step I: Prepare a new table (table 2.21(b)) to enable us complete all the necessary data.

table 2.21(b): Marks scored by chemistry students in their NECO examination.

Step II: Calculate the arithmetic mean ($\bar{x}$) to enable us compute column four of the table above.
Arithmetic mean, $\bar{x} = \frac{\sum fx}{\sum f} = \frac{1820}{54} = 33.7$
From the data available in table 2.21(b):
(a) Variance = $\frac{\sum f(x - \bar{x})^2}{\sum f} = \frac{10{,}859.26}{54} = 201.1$
(b) Standard deviation = $\sqrt{\frac{\sum f(x - \bar{x})^2}{\sum f}} = \sqrt{201.1} = 14.18$

Advantages of standard deviation
1. It represents the best measure of dispersion.
2. It makes use of every value in the distribution.
3. It takes mathematical signs into consideration.
4. It is very useful in further mathematical analysis.
5. It is also useful in sampling theory.

Disadvantages of standard deviation
1. It lays more emphasis on extreme values.
2. Its calculation is very difficult and tedious.

Revision questions
1. (a) What is a table? (b) List five characteristics of a table. (c) State four uses of a table.
2. The distribution of workers at Tanko Ventures Limited is as follows:
Cleaners 60
Messengers 15
Drivers 25
Typists 20
Clerks 30
Represent the above information using (a) a pie chart (b) a histogram (c) a bar chart.
3. Calculate the arithmetic mean of the numbers: 42, 56, 38, 41, 86, 56
4.
Calculate the mean deviation of the following weights of rats in a biology laboratory: 5, 6, 7, 7, 9
5. Considering the following distribution, calculate the variance and the standard deviation: 2, 5, 6, 7, 7, 9.
6. The table below shows the age distribution of a hypothetical population.
(Table not recoverable; one column is headed "No. of people (million)".)
Present this information in the form of a pie chart. Show your workings clearly. (SSCE June 1989)
7. What is the mode and when is it a suitable average to use? State its disadvantages. (SSCE June 1989)
8. The raw scores of 20 students of Utopia High School who took part in an examination in economics are given below:
38, 39, 12, 20, 18, 28, 20, 46, 34, 20, 70, 64, 52, 48, 64, 43, 66, 53, 69, 34
(a) What is the mean score of the students' marks?
(b) How many students passed the examination?
(c) What percentage of the students failed the examination?
(d) What is the range of the scores?
(e) How many students scored below the mean score? (SSCE Nov. 1990)
9. What is the median? State its merits and demerits. (SSCE August 1991)
10. The values of different types of accounts held in Nigerian banks for the period 1984 to 1988 are shown in the table below:
Year          | 1984 | 1985 | 1986 | 1987 | 1988
Savings       |  100 |  120 |  120 |  180 |  200
Current       |   65 |   75 |   70 |  100 |  130
Fixed deposit |   40 |   45 |   60 |  145 |   50
Present the data above in the form of a component bar chart. (SSCE June 1992)
11. (a) Explain each of the following measures of central tendency: (i) mean (ii) median (iii) mode
(b) Calculate the mean, median and mode of the following set of numbers: 21, 22, 23, 24, 25, 26, 23, 28, 29, 30, 24, 31, 34, 23 (SSCE June 1993)
12. The daily sales of a department store for one week are as follows:
(Table not recoverable; its headings are Days and Sales (₦).)
(a) Present the above data in a bar graph (the use of graph sheet is essential).
(b) Calculate the average daily sales for the week. (SSCE June 1996)

(iii) Using the values in degrees, i.e. 120°, 25°, 100°, 70° and 45°, a pie chart (Fig. 2.2) is now drawn using a protractor.
[Pie chart; legible sector labels: Groundnut 19.5%, Petroleum 33.3%]
Fig.
2.2: Pie chart showing the value of the most important exports of Nigeria in 1980.

Example 2
The table below shows the sectoral allocation of a country's budget. Illustrate the data accurately with a pie chart. Show your workings clearly. (SSCE June 1998)

Table 2.4: Sectoral allocation of a country's budget.

Solution
(i) Add up the total value of all the sectors, i.e. 30 + 25 + 15 + 10 + 20 = 100
(ii) Arrange your workings or calculations in the following manner, in degrees only.

Amount (₦ million) | Workings in degrees (angle of sector)
30                 | $\frac{30}{100} \times \frac{360°}{1} = 108°$
25                 | $\frac{25}{100} \times \frac{360°}{1} = 90°$
15                 | $\frac{15}{100} \times \frac{360°}{1} = 54°$
10                 | $\frac{10}{100} \times \frac{360°}{1} = 36°$
20                 | $\frac{20}{100} \times \frac{360°}{1} = 72°$
100                | 360°

(iii) Using the values in degrees, i.e. 108°, 90°, 54°, 36° and 72°, a pie chart is now drawn (fig. 2.3) using a protractor.

[Pie chart; legible sector labels: Manufacturing, Education]
fig. 2.3: Pie chart showing the sectoral allocation of a country's budget.

table 2.5(a): Cocoa production in Nigeria between 1960 and 1967.

Solution
(i) From the figures given, using the scale 1 cm : 1000 tonnes, the figures can be reduced to simple ones when they are divided by 1000.
(ii) The new values will now be:

table 2.5(a): Cocoa production in Nigeria between 1960 and 1967.

2.5 Bar Charts [or Graphs]
Meaning: A bar chart or graph is a graph made up of bars or rectangles which are of equal width and whose lengths are proportional to the quantities they represent. The major characteristic of the bar chart is that the bars must not touch each other. There must be a space or gap between one bar and another. Bar charts may be arranged vertically or horizontally.

Types of bar charts
There are three major types of bar charts. These are (i) simple bar chart (ii) component bar chart and (iii) multiple bar chart.

(a) Simple bar charts
A simple bar chart (fig. 2.4) is used when the data given are made up of only one item or component.
The bar chart can be presented from tabulated data with evenly spaced bars, separated by gaps, with the length of each bar proportional to the magnitude of the value given.

Example
Table 2.5 represents the quantity of cocoa production in Nigeria between 1960 and 1967. Represent the information or data by a bar chart.

[Bar chart; horizontal axis: years 1960 to 1967]
fig. 2.4: Simple bar chart showing cocoa production in Nigeria (1960-1967).

(i) The year with the highest cocoa production can be determined. The year is 1964.
(ii) The year with the least cocoa production can be determined. The year is 1961.
(iii) The total cocoa production from 1960 to 1967 can be calculated, i.e. 7,000 + 1,500 + ... + 4,000 + 3,500 = 38,500
(iv) The average or mean production can also be calculated by taking the total production (as in (iii) above) and dividing it by the number of years (8 years):
$\frac{38{,}500}{8}$ = 4,812.5 tons/year

(b) Component bar charts
A component bar chart (fig. 2.5) is used when the data involved are of two variables.

Example
Represent the population of males and females (in millions) (table 2.6) in some towns in Edo State in 1999 by a bar chart.

table 2.6: Population in millions of males and females of some towns in Edo State.

Solution
Using a graph sheet, choose a suitable scale for the graph, e.g. 1 cm represents 10 units. The graph (component bar chart in fig. 2.5) is then drawn.

[Component bar chart; vertical axis: Population ('000); horizontal axis: Towns (Uromi, Ubiaja, Ekpoma, Irrua)]
fig. 2.5: Component bar chart showing the population of some towns in Edo State.

(c) Multiple bar charts
The multiple bar chart (fig. 2.6) is used when there are about three or more variables in a given data set. It has multiple bars, each of which stands for a component variable.

Example
Represent Nigeria's export of cocoa (table 2.7) from Ibadan, Abeokuta and Akure in 1960, 1970 and 1980 by a bar chart.

table 2.7: Nigeria's export of cocoa from Ibadan, Abeokuta and Akure in 1960, 1970 and 1980.

Solution
Using a graph sheet, choose a suitable scale for the graph, e.g. 1 cm to represent 10 units. The graph (a multiple bar chart in fig. 2.6) is then drawn.
[Multiple bar chart; vertical axis: cocoa export; bars for Ibadan, Abeokuta and Akure]
fig. 2.6: Multiple bar chart showing cocoa exports from Ibadan, Abeokuta and Akure.

2.6 Pictograms
Meaning: Pictograms or pictographs are charts in which pictures or drawings of objects are used to represent items in a given data set. The pictures so used are meant to represent the magnitude of the variables or to convey other information. In this case, pictures or diagrams are more appreciated than tables or other charts.

Example
With the aid of table 2.8, draw a pictogram to show the total number of chickens consumed in Ghana between 1998 and 2002.

table 2.8: Chicken consumed in Ghana between 1998 and 2002.

Solution
(i) The picture or diagram must look like a chicken or bird.
(ii) One picture represents 10,000 chickens.
(iii) The pictogram is then drawn as shown in fig. 2.7.

[Pictogram; one row of chicken symbols per year, 1998 to 2002]
fig. 2.7: Pictogram showing the number of chickens consumed in Ghana between 1998 and 2002.

2.7 Histogram
Meaning: A histogram is a graphical representation of a frequency distribution. It is made up of a set of rectangles that have their bases on the horizontal axis, i.e. the X-axis, and their frequencies on the vertical axis, the Y-axis. The rectangles are centred on the class mark (i.e. mid-point) of each interval. The height of each rectangle represents the magnitude of the data lying within each class interval. The areas of the rectangles are proportional to the class frequencies. In drawing a histogram, there is no gap or space between two bars, unlike in a bar chart.

Example
A farmer harvested 60 tubers of yam over eight days. The number of yams harvested per day is shown in table 2.9. Draw a histogram to represent the information.

table 2.9: Number of tubers of yam harvested in a farm.

Solution
Plot the graph using the data in table 2.9.

[Histogram; vertical axis: tubers of yam harvested; horizontal axis: number of days, 1 to 8]
fig.
2.8: A histogram showing the number of tubers of yam harvested in a farm.

2.8 Frequency Distribution
Meaning: Frequency distribution refers to the arrangement of data or information in tabular form to reflect their frequencies. Frequency refers to the number of times a particular event or item of information occurs. A frequency distribution is usually used when the data are large and most of the numbers appear more than once.

Example
Represent the marks scored by 30 biology students in SS I by frequency distribution using the following data.
20. 8 1204 18 18 18 20 12 6 18 20 4 12 8 1812.08) 88) 180 2 20 18 18 20 8 2 4 18

Solution
Arrange the data in the following manner.

table 2.10: Marks scored by 30 biology students in SS I
Score (x) | Tally or Counts | Frequency
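The tallying procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the scores list is hypothetical (it is not the data of table 2.10), and frequency_table is a helper name introduced here, not something from the original text.

```python
from collections import Counter

# Hypothetical scores, assumed for illustration only (not the data of table 2.10).
scores = [20, 8, 12, 4, 18, 18, 20, 12, 8, 18]


def frequency_table(values):
    """Return (score, frequency) pairs sorted by score."""
    counts = Counter(values)  # counts how many times each score occurs
    return sorted(counts.items())


table = frequency_table(scores)

# Print one row per distinct score: score, tally marks, frequency.
print("Score (x) | Tally  | Frequency")
for score, freq in table:
    print(f"{score:>9} | {'/' * freq:<6} | {freq}")

# The frequencies must add up to the number of observations.
print("Total frequency:", sum(freq for _, freq in table))
```

Sorting the rows by score gives the same layout as a hand-drawn tally table, and checking that the frequencies sum to the number of observations is the usual guard against a miscount.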
