
Complete Probability & Statistics 2 for Cambridge International AS & A Level (Oxford University Press)

We ensure every Cambridge learner can...

Aspire: We help every student reach their full potential with complete syllabus support from experienced teachers, subject experts and examiners.

Succeed: We bring our esteemed academic standards to your classroom and pack our resources with effective exam preparation. You can trust Oxford resources to secure the best results.

Progress: We embed critical thinking skills into our resources, encouraging students to think independently from an early age and building foundations for future success.

Find out more: www.oxfordsecondary.com/cambridge

OXFORD UNIVERSITY PRESS

Great Clarendon Street, Oxford, OX2 6DP, United Kingdom

Oxford University Press is a department of the University of Oxford. It furthers the University's objective of excellence in research, scholarship, and education by publishing worldwide. Oxford is a registered trade mark of Oxford University Press in the UK and in certain other countries.

© Oxford University Press 2018

The moral rights of the author have been asserted.

First published in 2015
Second edition 2018

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, without the prior permission in writing of Oxford University Press, or as expressly permitted by law, or under terms agreed with the appropriate reprographics rights organization. Enquiries concerning reproduction outside the scope of the above should be sent to the Rights Department, Oxford University Press, at the address above.

You must not circulate this book in any other binding or cover and you must impose this same condition on any acquirer.

British Library Cataloguing in Publication Data
Data available

ora-19-8025175

10 9 8 7 6 5 4 3 2 1

Paper used in the production of this book is a natural, recyclable product made from wood grown in sustainable forests. The manufacturing process conforms to the environmental regulations of the country of origin.

Printed in Great Britain by Bell and Bain Ltd, Glasgow

The questions, all example answers and comments that appear in this book were written by the authors.

Acknowledgements

The publisher would like to thank the following for permission to reproduce photographs: p1: Norma Jean Gargesz/age fotostock; p2 (TL): Reckermann/iStockphoto; p2 (TR): Maica/iStockphoto; p2 (ML): dpa picture alliance/Alamy Stock Photo; p2 (MR): Tim Graham/Alamy Stock Photo; p2 (BL): Echo/Getty Images; p2 (BR): Richard Wear/age fotostock; p13 (T): ERproductions Ltd/age fotostock; p13 (BL): Chungking/Shutterstock; p13 (BR): Phil Robinson/age fotostock; p14 (T): LoloStock/Shutterstock; p14 (BL): Dzimin/Fotolia; p14 (BR): Pixel Shepherd/age fotostock; p20: Leigh Prather/Shutterstock; p24: Gwright/Alamy Stock Photo; p28: Martin Plob/age fotostock; p47 (T): Bert de Rute/Alamy Stock Photo; p47 (M): Vacim Petrakoy/Shutterstock; p47 (B): Danita Delmont/Shutterstock; p48: Caro Seebery/Photothor; p55 (TL): OJO Images/iStockphoto; p55 (TR): Denis Kuvaev/Shutterstock; p60: Barna Tanko/Shutterstock; p62: Lucky photographer/Shutterstock; p63: Dariazu/Shutterstock; p80: FloridaStock/Shutterstock; p86 (BL): Janes Steid/Shutterstock; p86 (BR): Phocology1971/Shutterstock; p112 (T): E1 iphoto/Shutterstock; p112 (B): Ei Katsumate/age fotostock; p12: Prank Vetero/Alamy Stock Photo; p14: Ian Murray/age fotostock; p36: Javier Larrea/age fotostock; p160: Sarymsakov Andrey/Shutterstock; p16 (TL): MR1805/iStockphoto; p76 (TR): ssuaphoto/iStockphoto; p177: AshDesign/Shutterstock.

Cover illustration by Ian Norss, Oxford University Press

Contents

1 The Poisson distribution
1.1 Introducing the Poisson distribution
1.2 The role of the parameter of the Poisson distribution
1.3 The recurrence relation for the Poisson distribution
1.4 Mean and variance of the Poisson distribution
1.5 Modelling with the Poisson distribution

2 Approximations involving the Poisson distribution
2.1 Poisson as an approximation to the binomial
2.2 The normal approximation to the Poisson distribution

3 Linear combination of random variables
3.1 Expectation and variance of a linear function of a random variable
3.2 Linear combination of two (or more) independent random variables
3.3 Expectation and variance of a sum of repeated independent observations of a random variable, and the mean of those observations
3.4 Comparing the sum of repeated independent observations with the multiple of a single observation

Maths in real-life: The mathematics of the past

4 Linear combination of Poisson and normal variables
4.1 The distribution of the sum of two independent Poisson random variables
4.2 Linear functions and combinations of normal random variables

5 Continuous random variables
5.1 Introduction to continuous random variables
5.2 Probability density functions
5.3 Mean and variance of a continuous random variable
5.4 Mode of a continuous random variable

6 Sampling
6.1 Populations, census and sampling
6.2 Advantages and disadvantages of sampling
6.3 Variability between samples and use of random numbers
6.4 The sampling distribution of a statistic
6.5 Sampling distribution of the mean of repeated observations of a random variable
6.6 Sampling distribution of the mean of a sample from a normal distribution
6.7 The Central Limit Theorem
6.8 Descriptions of some sampling methods

Maths in real-life: Modelling statistics

7 Estimation
7.1 Interval estimation
7.2 Unbiased estimate of the population mean
7.3 Unbiased estimate of the population variance
7.4 Confidence intervals for the mean of a normal distribution
7.5 Confidence intervals for the mean of a large sample from any distribution
7.6 Confidence intervals for a proportion

8 Hypothesis testing for discrete distributions
8.1 The logical basis for hypothesis testing
8.2 Critical region
8.3 Type I and Type II errors
8.4 Hypothesis test for the proportion p of a binomial distribution
8.5 Hypothesis test for the mean of a Poisson distribution

9 Hypothesis testing using the normal distribution
9.1 Hypothesis test for the mean of a normal distribution
9.2 Hypothesis test for the mean using a large sample
9.3 Using a confidence interval to carry out a hypothesis test

Maths in real-life: A risky business

Exam-style paper A
Exam-style paper B
Tables of the normal distribution
Answers
Glossary
Index

Introduction

About this book

This book has been written to cover the Cambridge International AS & A Level Mathematics (9709) course, and is fully aligned to the syllabus. In addition to the main curriculum content, you will find:

* 'Maths in real-life', showing how principles learned in this course are used in the real world.
* Chapter openers, which outline how each topic in the Cambridge 9709 syllabus is used in real life.

The book contains the following features:

* Did you know?
* Advice on calculator use
* Exam-style questions

Throughout the book, you will encounter worked examples and a host of rigorous exercises. The examples show you the important techniques required to tackle questions. The exercises are carefully graded, starting from a basic level and going up to exam standard, allowing you plenty of opportunities to practise your skills. Together, the examples and exercises put maths in a real-world context, with a truly international focus.

At the start of each chapter, you will see a list of objectives covered in the chapter. These are drawn from the Cambridge AS and A Level syllabus. Each chapter begins with a Before you start section and ends with a Summary exercise and Chapter summary, ensuring that you fully understand each topic.

Each chapter contains key mathematical terms to improve understanding, highlighted in colour, with full definitions provided in the Glossary of terms at the end of the book.

The answers given at the back of the book are concise. However, you should show as many steps in your working as possible. All exam-style questions have been written by the author.

About the author

James Nicholson is an experienced teacher of mathematics at secondary level, who taught for 12 years at Harrow School as well as spending 13 years as Head of Mathematics in a large Belfast grammar school. He is the author of two A Level statistics texts, and editor of the Concise Oxford Dictionary of Mathematics. He has also contributed to a number of other sets of curriculum and assessment materials, is an experienced examiner and has acted as a consultant for UK government agencies on accreditation of new specifications.
James ran schools workshops for the Royal Statistical Society for many years, and has been a member of the Schools and Further Education Committee of the Institute of Mathematics and its Applications since 2000, including six years as chair, and is currently a member of the Community of Interest group for the Advisory Committee on Mathematics Education. He has served as a vice-president of the International Association for Statistics Education for four years, and is currently Chair of the Advisory Board to the International Statistical Literacy Project.

A note from the author

The aim of this book is to help students prepare for the Statistics 2 unit of the Cambridge International AS and A Level Mathematics syllabus, though it may also be found to be useful in providing support material for other AS and A Level courses. The book contains a large number of practice questions, many of which are exam-style.

In writing the book I have drawn on my experiences of teaching Mathematics, Statistics and Further Mathematics to A Level over many years as well as on my experience as an examiner, and discussion with statistics educators from many countries at international conferences.

Student book & Cambridge syllabus

Student Book: Complete Probability & Statistics 2 for Cambridge International AS & A Level
Syllabus: Cambridge International AS & A Level Mathematics: Probability and Statistics 2 (9709)

Syllabus overview (with the corresponding Student Book pages)
Unit S2: Probability & Statistics 2 (Paper 6)

1. The Poisson distribution
* Calculate probabilities for the distribution Po(λ). Pages 3-9
* Use the fact that if X ~ Po(λ) then the mean and variance of X are each equal to λ. Pages 10-11
* Understand the relevance of the Poisson distribution to the distribution of random events, and use the Poisson distribution as a model. Pages 12-15
* Use the Poisson distribution as an approximation to the binomial distribution where appropriate (n > 50 and np < 5, approximately). Pages 21-22
* Use the normal distribution, with continuity correction, as an approximation to the Poisson distribution where appropriate (λ > 15, approximately). Pages 23-25

2. Linear combinations of random variables
* Use, in the course of solving problems, the results that:
  - E(aX + b) = aE(X) + b and Var(aX + b) = a²Var(X). Pages 29-33
  - E(aX + bY) = aE(X) + bE(Y). Pages 34-38
  - Var(aX + bY) = a²Var(X) + b²Var(Y) for independent X and Y. Pages 34-38
  - if X has a normal distribution then so does aX + b. Pages 50-54
  - if X and Y have independent normal distributions then aX + bY has a normal distribution. Pages 50-54
  - if X and Y have independent Poisson distributions then X + Y has a Poisson distribution. Pages 47-49

3. Continuous random variables
* Understand the concept of a continuous random variable, and recall and use properties of a probability density function (restricted to functions defined over a single interval). Pages 59-61
* Use a probability density function to solve problems involving probabilities, and to calculate the mean and variance of a distribution (explicit knowledge of the cumulative distribution function is not included, but location of the median, for example, in simple cases by direct consideration of an area may be required). Pages 61-73

4. Sampling and estimation
* Understand the distinction between a sample and a population, and appreciate the necessity for randomness in choosing samples. Pages 77-79
* Explain in simple terms why a given sampling method may be unsatisfactory (knowledge of particular sampling methods, such as quota or stratified sampling, is not required, but candidates should have an elementary understanding of the use of random numbers in producing random samples). Pages 79-86
* Recognise that a sample mean can be regarded as a random variable, and use the facts that $E(\bar{X}) = \mu$ and that $\mathrm{Var}(\bar{X}) = \sigma^2/n$. Pages 86-94
* Use the fact that $\bar{X}$ has a normal distribution if X has a normal distribution. Pages 94-96
* Use the Central Limit Theorem where appropriate. Pages 96-100
* Calculate unbiased estimates of the population mean and variance from a sample, using either raw or summarised data (only a simple understanding of the term 'unbiased' is required). Pages 109-116
* Determine and interpret a confidence interval for a population mean in cases where the population is normally distributed with known variance or where a large sample is used. Pages 116-121
* Determine, from a large sample, an approximate confidence interval for a population proportion. Pages 121-124

5. Hypothesis tests
* Understand the nature of a hypothesis test, the difference between one-tail and two-tail tests, and the terms null hypothesis, alternative hypothesis, significance level, rejection region (or critical region), acceptance region and test statistic. Pages 128-137
* Formulate hypotheses and carry out a hypothesis test in the context of a single observation from a population which has a binomial or Poisson distribution, using
  - direct evaluation of probabilities,
  - a normal approximation to the binomial or the Poisson distribution, where appropriate. Pages 142-149
* Formulate hypotheses and carry out a hypothesis test concerning the population mean in cases where the population is normally distributed with known variance or where a large sample is used. Pages 153-161
* Understand the terms Type I error and Type II error in relation to hypothesis tests. Pages 137-139
* Calculate the probabilities of making Type I and Type II errors in specific situations involving tests based on a normal distribution or direct evaluation of binomial or Poisson probabilities. Pages 140-141

1 The Poisson distribution

The Poisson distribution can be used to model, at least approximately, a large number of natural and social phenomena. You might not expect the number of photons arriving at a cosmic ray observatory, the number of claims made to an insurance company, the number of earthquakes of a given intensity and the number of atoms decaying in a radioactive material to have much in common, but they are all examples of this distribution.

The photo is of VERITAS (Very Energetic Radiation Telescope Array) in Arizona, which is helping to shape our understanding of how subatomic particles like photons are accelerated to extremely high energy levels.

Objectives

After studying this chapter you should be able to:
* Calculate probabilities for the distribution Po(λ).
* Use the fact that if X ~ Po(λ) then the mean and variance of X are each equal to λ.
* Understand the relevance of the Poisson distribution to the distribution of random events, and use the Poisson distribution as a model.

Before you start

You should know how to:

1. Use your calculator to work out values of exponential functions, e.g.
Find the value of $e^{-2.5}$.
$e^{-2.5} = 0.0821$ (3 s.f.)

2. Substitute values into more complex formulae, e.g.
Find the value of $p = \dfrac{e^{-2.5} \times 2.5^{4}}{4!}$.
$p = \dfrac{0.0821 \times 39.06}{24} = 0.134$ (3 d.p.)

Skills check:

1. Find the value of:
a) $e^{\dots}$
b) $e^{\dots}$

2. Find the value of p = …

1.1 Introducing the Poisson distribution

Think about the following random variables:
* The number of dandelions in a square metre of a piece of open ground.
* The number of errors in a page of a typed manuscript.
* The number of cars passing a point on a motorway in a minute.
* The number of telephone calls received by a company switchboard in half an hour.
* The number of lightning strikes in an area over a year.

Do they have any features in common? Does any one of them stand out as being rather different?

The behaviour in five of these photos follows the Poisson distribution. Formally, the conditions are that
i) events occur at random
ii) events occur independently of one another
iii) the average rate of occurrences remains constant
iv) there is zero probability of simultaneous occurrences.

The Poisson distribution is defined as

$$P(X = r) = \frac{e^{-\lambda}\lambda^{r}}{r!} \quad \text{for } r = 0, 1, 2, 3, \dots$$

You need to have a value for λ in order for this to make sense, so there is a family of Poisson distributions but there is only one parameter, λ, which is the mean number of occurrences in the time period (or length, area or volume) being considered. You can write the Poisson distribution as X ~ Po(λ).

Example 1

If X ~ Po(3) find P(X = 2).

$P(X = 2) = \dfrac{e^{-3} \times 3^{2}}{2!} = 0.224$ (3 s.f.)

Example 2

The number of cars passing a point on a road during a 5-minute period may be modelled by the Poisson distribution with parameter 4.
Find the probability that in a 5-minute period
i) 2 cars go past
ii) fewer than 3 cars go past.

X ~ Po(4)

i) $P(X = 2) = \dfrac{e^{-4} \times 4^{2}}{2!} = 0.14652\ldots = 0.147$ (3 s.f.)

ii) $P(X = 0) = \dfrac{e^{-4} \times 4^{0}}{0!} = 0.01831\ldots = 0.0183$ (3 s.f.)
(Remember that $0! = 1$ and $a^{0} = 1$.)

$P(X = 1) = \dfrac{e^{-4} \times 4^{1}}{1!} = 0.07326\ldots = 0.0733$ (3 s.f.)

$P(X < 3) = 0.01831\ldots + 0.07326\ldots + 0.146525\ldots = 0.238$ (3 s.f.)

Mathematical note: It is not immediately obvious from the mathematics you cover in this course that the form of the Poisson distribution constitutes a probability distribution. Remember from S1 Chapter 5 that this requires all probabilities to be non-negative (which they obviously all are here because exp(−λ) > 0 for any value of λ) but also that the sum of the probabilities is 1.

Because $e^{\lambda} = 1 + \lambda + \dfrac{\lambda^{2}}{2!} + \dfrac{\lambda^{3}}{3!} + \dfrac{\lambda^{4}}{4!} + \dots$, the probabilities $P(X = r) = \dfrac{e^{-\lambda}\lambda^{r}}{r!}$ for r = 0, 1, 2, 3, ... sum to $e^{-\lambda} \times e^{\lambda} = 1$, so this is a probability distribution. This is an example of an advanced topic in Pure Maths where functions like exponentials, logarithms and the trigonometric functions have (infinite) power series forms. Truncated forms of these infinite series are how electronic calculators obtain values of these functions.

Exercise 1.1

1. If X ~ Po(2) find i) P(X = 1) ii) P(X = 2) iii) P(X = 3).
2. If X ~ Po(1.8) find i) P(X = 0) ii) P(X = 1) iii) P(X = 2).
3. If X ~ Po(5.3) find i) P(X = 3) ii) P(X = 5) iii) P(X = 7).
4. If X ~ Po(0.4) find i) P(X = 0) ii) P(X = 1) iii) P(X = 2).
5. If X ~ Po(2.15) find i) P(X = 2) ii) P(X = 4) iii) P(X = 6).
6. If X ~ Po(3.2) find i) P(X = 2) ii) P(X < 2) iii) P(X > 2).
7. The number of telephone calls arriving at an office switchboard in a 5-minute period may be modelled by a Poisson distribution with parameter 3.2. Find the probability that in a 5-minute period
a) exactly 2 calls are received
b) more than 2 calls are received.
8. The number of accidents which occur on a particular stretch of road in a day may be modelled by a Poisson distribution with parameter 1.3. Find the probability that on a particular day
a) exactly 2 accidents occur on that stretch of road
b) fewer than 2 accidents occur.
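The probabilities in Examples 1 and 2 can also be checked numerically. The following is a minimal sketch in Python, offered only as an illustration (the course itself expects calculator work); the function names `poisson_pmf` and `poisson_cdf` are illustrative choices and do not come from the book.

```python
from math import exp, factorial

def poisson_pmf(r, lam):
    """P(X = r) for X ~ Po(lam), using e^(-lam) * lam^r / r!."""
    return exp(-lam) * lam**r / factorial(r)

def poisson_cdf(r, lam):
    """P(X <= r): add up the individual probabilities for 0, 1, ..., r."""
    return sum(poisson_pmf(k, lam) for k in range(r + 1))

# Example 1: X ~ Po(3), P(X = 2)
print(round(poisson_pmf(2, 3), 3))   # 0.224

# Example 2 i): X ~ Po(4), P(X = 2)
print(round(poisson_pmf(2, 4), 3))   # 0.147

# Example 2 ii): "fewer than 3" means X <= 2
print(round(poisson_cdf(2, 4), 3))   # 0.238
```

Note that "fewer than 3" is translated into P(X ≤ 2) before calling the cumulative function; the same care with inequalities is needed when working by hand.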
1.2 The role of the parameter of the Poisson distribution

The mean number of events in an interval of time or space is proportional to the size of the interval.

Example 2 in Section 1.1 looked at the number of cars passing a point on a road during a 5-minute period. This may be modelled by the Poisson distribution with parameter 4. In this case, the number of cars passing that point in a 20-minute period may be modelled by the Poisson distribution with parameter 16, and in a 1-minute period may be modelled by the Poisson distribution with parameter 0.8.

If the conditions for a Poisson distribution are satisfied in a given period, they are also satisfied for periods of different length.

Example 3

The number of accidents in a week on a stretch of road is known to follow a Poisson distribution with mean 2.1.
Find the probability that
a) in a given week there is 1 accident
b) in a two week period there are 2 accidents
c) there is 1 accident in each of two successive weeks.

a) In one week, the number of accidents follows a Po(2.1) distribution, so the probability of 1 accident $= \dfrac{e^{-2.1} \times 2.1^{1}}{1!} = 0.257$ (3 s.f.).

b) In two weeks, the number of accidents follows a Po(4.2) distribution, so the probability of 2 accidents $= \dfrac{e^{-4.2} \times 4.2^{2}}{2!} = 0.132$ (3 s.f.).

c) This cannot be done directly as a Poisson distribution since it says what has to happen in each of two time periods, but these are the outcomes considered in part a). So the probability this happens in two successive weeks is $\left(\dfrac{e^{-2.1} \times 2.1^{1}}{1!}\right)^{2} = 0.0661$ (3 s.f.).

Example 4

The number of flaws in a metre length of dress material is known to follow a Poisson distribution with parameter 0.4.
Find the probabilities that
a) there are no flaws in a 1 metre length
b) there is 1 flaw in a 3 metre length
c) there is 1 flaw in a piece of material which is half a metre long.

a) X ~ Po(0.4) ⇒ $P(X = 0) = \dfrac{e^{-0.4} \times 0.4^{0}}{0!} = 0.670$ (3 s.f.).

b) Y ~ Po(1.2) ⇒ $P(Y = 1) = \dfrac{e^{-1.2} \times 1.2^{1}}{1!} = 0.361$ (3 s.f.).

c) Z ~ Po(0.2) ⇒ $P(Z = 1) = \dfrac{e^{-0.2} \times 0.2^{1}}{1!} = 0.164$ (3 s.f.).

Exercise 1.2

1. The number of telephone calls arriving at an office switchboard in a 5-minute period may be modelled by a Poisson distribution with parameter 1.4. Find the probability that in a 10-minute period
a) exactly 2 calls are received
b) more than 2 calls are received.

2. The number of accidents which occur on a particular stretch of road in a day may be modelled by a Poisson distribution with parameter 0.4. Find the probability that during a week (7 days)
a) exactly 2 accidents occur on that stretch of road
b) fewer than 2 accidents occur.

3. The number of letters delivered to a house on a day may be modelled by a Poisson distribution with parameter 0.8.
a) Find the probability that there are 2 letters delivered on a particular day.
b) The home owner is away for 3 days. Find the probability that there will be more than 2 letters waiting for him when he gets back.

4. The number of errors on a page of a booklet can be modelled by a Poisson distribution with parameter 0.2.
a) Find the probability that there is exactly 1 error on a given page.
b) A section of the booklet has 7 pages. Find the probability that there are no more than 2 errors in the section.
c) The booklet has 25 pages altogether. Find the probability that the booklet contains exactly 6 errors altogether.

5. The number of people calling a car breakdown service can be modelled by a Poisson distribution, and the service has an average of 6 calls per hour. Find the probability that in a half-hour period
a) exactly 2 calls are received
b) more than 2 calls are received.

1.3 The recurrence relation for the Poisson distribution

You can calculate probabilities for a Poisson distribution in sequence using a recurrence relation.

Example 5

If X ~ Po(λ)
a) write down the probability that
i) …

The general relationship is $P(X = k + 1) = \dfrac{\lambda}{k + 1} \times P(X = k)$.
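The recurrence relation lends itself to a short computational illustration. The sketch below, in Python, assumes the standard form of the relationship, P(X = k + 1) = λ/(k + 1) × P(X = k), starting from P(X = 0) = e^(−λ); the function name `poisson_probs_by_recurrence` is a hypothetical helper chosen for this illustration, not something defined in the book.

```python
from math import exp

def poisson_probs_by_recurrence(lam, n_terms):
    """Return [P(X=0), P(X=1), ..., P(X=n_terms-1)] for X ~ Po(lam)."""
    probs = [exp(-lam)]                      # P(X = 0) = e^(-lam)
    for k in range(n_terms - 1):
        # P(X = k + 1) = lam / (k + 1) * P(X = k)
        probs.append(probs[-1] * lam / (k + 1))
    return probs

# X ~ Po(4), as in Example 2 of Section 1.1: P(X = 0), P(X = 1), P(X = 2)
probs = poisson_probs_by_recurrence(4, 3)
print([round(p, 4) for p in probs])          # [0.0183, 0.0733, 0.1465]
```

Each probability is obtained from the previous one with a single multiplication, so no factorials or powers need to be recalculated. The multiplier λ/(k + 1) is greater than 1 while k + 1 < λ and less than 1 once k + 1 > λ, which is why the probabilities rise and then fall.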
The graphs on the next page show the probability distributions for different values of λ and what effect changing the value of λ has on the shape of a particular Poisson distribution.

[Graph: Poisson, λ = 1.2; P(X = x) against x]

All Poisson variables have a sample space which is all of the non-negative integers. However, when λ is relatively low, the probabilities tail off very quickly.

Since $\dfrac{1.2}{1} = 1.2$, $\dfrac{1.2}{2} = 0.6$, $\dfrac{1.2}{3} = 0.4$, $\dfrac{1.2}{4} = 0.3$, ..., the initial probability that X = 0 is multiplied by 1.2, then 0.6, then 0.4, 0.3, ... and so the mode of X is 1.

[Graph: Poisson, λ = 2.8; P(X = x) against x]

Here λ is larger than in the previous graph and the peak has moved across to the right. For values of x which are less than λ the probability increases, but once x is greater than λ the probabilities start to decrease. More values of x have a noticeable probability, so the highest individual probability is not as large as it was in the previous graph and the distribution is more spread out.

[Graph: Poisson, λ = 4; P(X = x) against x]

What happens when λ is an integer? Here $P(X = 4) = P(X = 3) \times \dfrac{4}{4} = P(X = 3)$ and the distribution has two modes, at 3 and 4.

Generally, the mode of the Poisson(λ) distribution is at the integer below λ when λ is not an integer, and there are two modes (at λ and λ − 1) when it is an integer.

λ < 1 is a special case.

[Graph: Poisson, λ = 0.8; P(X = x) against x]

Here even the first time the recurrence relation is used you are multiplying by λ < 1, so the mode will be 0 and the probability distribution is strictly decreasing for all values of x.

The general forms for the probabilities of 0 and 1 for a Poisson distribution are $P(X = 0) = e^{-\lambda}$ and $P(X = 1) = \lambda e^{-\lambda}$.

Example 7

X ~ Po(5.8). State the mode of X.

Since 5.8 is not an integer, the mode is the integer below it, i.e. the mode is 5.

Exercise 1.3

1. X ~ Po(2.5)
a) Write down an expression for P(X = 4) in terms of P(X = 3).
b) If P(X = 3) = 0.214, calculate the value of your expression in part a).
c) Calculate P(X = 4) directly and check it is the same as your answer to b).
d) What is the mode of X?

2. X ~ Po(5)
a) Write down an expression for P(X = 5) in terms of P(X = 4).
b) Explain why X has two modes at 4 and 5.

3. X ~ Po(λ) and P(X = 4) = 1.2 × P(X = 3).
a) Find the value of λ.
b) What is the mode of X?

1.4 Mean and variance of the Poisson distribution

If X ~ Po(λ), then E(X) = λ and Var(X) = λ, so the standard deviation is $\sigma = \sqrt{\lambda}$.

A special property of the Poisson distribution is that the mean and variance are always equal.

Example 8

The number of calls arriving at a company's switchboard in a 10-minute period can be modelled by a Poisson distribution with parameter 3.5.
Give the mean and variance of the number of calls which arrive in
i) 10 minutes
ii) an hour
iii) 5 minutes.

i) Here λ = 3.5, so the mean and variance will both be 3.5.
ii) Here λ = 21 (= 3.5 × 6), so the mean and variance will both be 21.
iii) Here λ = 1.75 (= 3.5 ÷ 2), so the mean and variance will both be 1.75.

Example 9

A dual carriageway has one lane blocked off because of roadworks. The number of cars passing a point in a road in a number of 1-minute intervals is summarised in the table.

Number of cars | 0 | 1 | 2 | 3  | 4  | 5 | 6
Frequency      | 3 | 4 | 4 | 25 | 30 | 3 | 1

a) Calculate the mean and variance of the number of cars passing in 1-minute intervals.
b) Is the Poisson likely to provide an adequate model for the distribution of the number of cars passing in 1-minute intervals?

a) $\sum f = 70$, $\sum xf = 228$, $\sum x^{2}f = 836$, so
$\bar{x} = \dfrac{228}{70} = 3.26$ (3 s.f.) and $\mathrm{Var}(x) = \dfrac{836}{70} - \left(\dfrac{228}{70}\right)^{2} = 1.33$ (3 s.f.).

b) The mean and variance are not numerically close, so it is unlikely the Poisson will be an adequate model. With only one lane open for traffic, overtaking cannot happen on this stretch of the road and the numbers of cars will be much more consistent than would happen in normal circumstances; hence the variance is much lower than would be expected if the Poisson model did apply.

Derivation of mean and variance of the Poisson distribution

You must be able to use these results but are not required to be able to prove them. They are included here for completeness, and as a nice manipulation using the power series expression for the exponential function.

$$X \sim \mathrm{Po}(\lambda) \iff P(X = k) = \frac{e^{-\lambda}\lambda^{k}}{k!}, \quad k = 0, 1, 2, \dots$$

$$E(X) = \sum_{k=0}^{\infty} k \times \frac{e^{-\lambda}\lambda^{k}}{k!} = \sum_{k=1}^{\infty} \frac{e^{-\lambda}\lambda^{k}}{(k-1)!} \quad \text{(cancelling } k \text{, after discarding the zero case)}$$

$$= \lambda e^{-\lambda} \sum_{k=1}^{\infty} \frac{\lambda^{k-1}}{(k-1)!} = \lambda e^{-\lambda} \times e^{\lambda} = \lambda$$

$$E(X^{2}) = \sum_{k=0}^{\infty} k^{2} \times \frac{e^{-\lambda}\lambda^{k}}{k!} = \sum_{k=0}^{\infty} \bigl(k(k-1) + k\bigr) \times \frac{e^{-\lambda}\lambda^{k}}{k!} = \lambda^{2} e^{-\lambda} \sum_{k=2}^{\infty} \frac{\lambda^{k-2}}{(k-2)!} + \lambda = \lambda^{2} + \lambda$$

Then $\mathrm{Var}(X) = E(X^{2}) - \bigl(E(X)\bigr)^{2} = \lambda^{2} + \lambda - \lambda^{2} = \lambda$.

Exercise 1.4

1. If X ~ Po(3.2) find i) E(X) ii) Var(X).

2. If X ~ Po(49) find the mean and standard deviation of X.

3. X ~ Po(3.6)
a) Find the mean and standard deviation of X.
b) Find P(X > μ), where μ = E(X).
c) Find P(X > μ + 2σ), where σ is the standard deviation of X.
d) Find P(X < μ − 2σ).

4. X is the number of telephone calls arriving at an office switchboard in a 10-minute period. X may be modelled by a Poisson distribution with parameter 6.
a) Find the mean and standard deviation of X.
b) Find P(X > μ), where μ = E(X).
c) Find P(X > μ + 2σ), where σ is the standard deviation of X.
d) Find P(X

3)=1-P(Xs3 0.8571 = 0.1429. b) Ina 20-minute period G ofan hour), the mean number of cyclists will be 2 x 5 = 5 P(exactly one) = 342 (3s.f.). ©) The situation is that of a binomial distribution - there are 6 ‘trials, the number of cyclists in each hour is independent of the other periods, and the probability of more than 3 in an hour remains the same for all the 6-hour periods, ie. if Y = number of times that more than 3 cyclists pass by in an hour exactly once ina 6-hour period Y ~ B(6, 0.1429) (using the probability calculated in part a) ii). P(Y = 1) = 6 x 0.1429! x (1 - 0.1429)* = 0.397 (3s.£). Example 12 Ata certain harbour the number of boats arriving in a 15-minute period can be modelled by a Poisson distribution with parameter 1.5. a) Find the probability that exactly six boats will arrive in a period of an hour. b) Given that exactly six boats arrive in a period of an hour, find the conditional probability that twice as many arrive in the second half hour as arrive in the first half hour. a) Inan hour the average number of boats arriving is 6, so P(6 boats arrive in an hour) = £-© = 0.161. b) Iftwice as many arrive in the second half hour, then there needs to be 2 in a half-hour period and then 4 in the next half hour, so P(2 boats arrive in half hour, then 4 boats in next half hour) ost Gir = 0.224 x 0.168 = 0.0376. ‘Then the conditional probability is P(2 then 4 in half hour | 6 boats arrive in an hour) = TTR = 0.234, The Poisson distribution Exercise 1.5 1. For the following random variables state whether they can be modelled by a Poisson distribution. If they can, give the value of the parameter 4; if they cannot then explain why. a) ‘The average number of cars per minute passing a point on a road is 12. The traffic is flowing freely. X= number of cars which pass in a 15 second period. b) ‘The average number of cars per minute passing a point on a road is 14. ‘There are roadworks blocking one lane of the road. 
X= number of cars which pass in a 30 second period. ©) Amelie normally gets letters at an average rate of 1.5 per day. X= number of letters Amelie gets on December 22nd. @) A petrol station which stays open all the time gets an average of 832 customers in a 24 hour time period. X= number of customers in a quarter of an hour at the petrol station. e) An A&E department in a hospital treats 32 patients an hour on average. X= number of patients treated between 5pm and 7 pm on a Friday evening, 2. For the following situations state what assumptions are needed if a Poisson distribution is to be used to model them, and give the value of A. that would be used. You are not expected to do any calculations! a) On average defects in a roll of cloth occur at a rate of 0.2 per metre. How many defects are there in a roll which is 8m long? b) On average defects in a roll of cloth occur once in 2 metres. How many defects are there in a roll which is 8m long? ©) Asmall shop averages 8 customers per hour. How many customers does it have in 20 minutes? 3. An explorer thinks that the number of mosquito bites he gets when he is in the jungle will follow a Poisson distribution. ‘The explorer records the number of mosquito bites he gets in the jungle during a number of hour-long periods, and the results are summarised in the table. Number of bites | 0 | 1 | 2 | 3 | 4 | 5 | 6 [57 Frequency 3{7{solteoelo6{[slilo IE Modelling with the Poisson distribution a) Calculate the mean and variance of the number of bites the explorer gets, in an hour in the jungle. b) Do you think the Poisson is a good model for the number of bites the explorer gets in an hour in the jungle? |. The number of emails Serena gets can be modelled by a Poisson distribution with a mean rate of 1.5 per hour. a) i) Whatis the probability that Serena gets no emails between 4 pm and 5 pm? ii) What is the probability that Serena gets more than 2 emails between 4 pm and 5 pm? 
iii) What is the probability that Serena gets one email between 6 pm and 6.20 pm?
b) What is the probability that Serena gets more than 2 emails in an hour exactly twice in a 5-hour period?
c) Would it be sensible to use the Poisson distribution to find the probability that Serena gets no emails between 4 am and 5 am?
5. The number of lightning strikes in the neighbourhood of a campsite in a week can be modelled by a Poisson distribution with parameter 1.5.
a) Find the probability that there is exactly one lightning strike in the neighbourhood in a given week.
b) Alejandra spends three weeks at the campsite. Find the probability that there are exactly three lightning strikes in the neighbourhood during her holiday.
c) Given that the neighbourhood has exactly three lightning strikes during her holiday, find the conditional probability that each week has exactly one strike.

Summary exercise 1
1. If X ~ Po(1.45) find
a) P(X = 2)
…
Find P(X … μ), where μ = E(X).
c) Find P(|X - μ| < σ), where σ is the standard deviation of X.
7. An urban safety officer thinks that the number of traffic accidents in an area will follow a Poisson distribution. The officer records the number of accidents in the area each week over a period of several months, and the results are summarised in the table.
Number of accidents |°| |? 4/5/16
Frequency |5/1[1|3[5|2/ 1] 0
a) Calculate the mean and variance of the number of accidents in the area in a week.
b) Do you think the Poisson is a good model for the number of accidents in the area in a week?
8. The number of errors on a page of a book can be modelled by a Poisson distribution with parameter 0.15.
a) Find the probability that there is exactly 1 error on a given page.
b) A chapter of the book has 20 pages. Find the probability that there are no more than 2 errors in the chapter.
c) What is the most likely number of errors in the chapter?

EXAM-STYLE QUESTIONS
9.
The number of errors on a page of the first proofs of a book can be modelled by a Poisson distribution with parameter 0.6.
a) Find the probability that a page has exactly one error on it.
b) Find the probability that a double page spread has exactly two errors on it.
c) Given that a double page spread has exactly two errors on it, find the conditional probability that each page has exactly one error on it.
10. A shop sells spades. The demand for spades follows a Poisson distribution with mean 2.7 per week.
a) Find the probability that the demand is exactly 2 spades in any one week.
b) The shop has 4 spades in stock at the beginning of a week. Find the probability that this will be enough to satisfy the demand for spades in that week.
c) Given instead that there are n spades in stock, find, by trial and error, the least value of n for which the probability of not being able to satisfy the demand for spades in that week is less than 0.1.
11. The random variable X has the distribution Po(2.5). The random variable Y is defined by Y = 2X.
a) Find the mean and variance of Y.
b) Give a reason why the variable Y does not have a Poisson distribution.
12. Cars travelling south on a rural road pass a particular point randomly and independently at an average rate of 2 cars every three minutes.
a) Find the probability that exactly 3 cars travel south past that point in a 5-minute period.
Cars travelling north on that road pass the same point randomly and independently at an average rate of 1 car each minute.
b) Find the probability that a total of fewer than 4 cars pass that point in a 3-minute period.
13. The number of lightning strikes at a particular place in a 28-day period has a Poisson distribution with mean 1.2.
a) Find the probability that at most 2 lightning strikes will be recorded at that place in a 42-day period.
b) Find, in days, correct to 1 decimal place, the longest time period for which the probability that no lightning strikes will be recorded at that place is at least 0.9.

Chapter summary
The Poisson distribution is defined as $P(X = r) = \frac{e^{-\lambda}\lambda^{r}}{r!}$ for r = 0, 1, 2, 3, ...
The Poisson distribution has a single parameter, λ.
The Poisson distribution is often written as X ~ Po(λ).
If X ~ Po(λ), then E(X) = λ; Var(X) = σ² = λ, so st. dev. (σ) = $\sqrt{\lambda}$.
The conditions for the Poisson are
i) events occur at random
ii) events occur independently of one another
iii) the average rate of occurrences remains constant
iv) there is zero probability of simultaneous occurrences.

Approximations involving the Poisson distribution
The Poisson provides a good approximation to binomial distributions where n is large under certain conditions. For example, the number of genetic mutations in a stretch of DNA can be modelled well by the Poisson distribution; there is a lot of work currently being done to understand the processes involved in genetic mutations in both the plant and animal domains, with the possibility of significant medical advances in the treatment of diseases like cancer and Parkinson's.

Objectives
After studying this chapter you should be able to:
Use the Poisson distribution as an approximation to the binomial distribution where appropriate (n > 50 and np < 5, approximately).
Use the normal distribution, with continuity correction, as an approximation to the Poisson distribution where appropriate (λ > 15, approximately).

Before you start
You should know how to:
1. Calculate probabilities using the binomial distribution, e.g. X ~ B(10, 0.3). Find P(X = 2).
$P(X = 2) = \binom{10}{2} \times 0.3^{2} \times 0.7^{8} = 0.233$ (3 s.f.)
Skills check: X ~ B(40, 0.03). Find P(X < 2).
2. Calculate probabilities using the normal distribution.
Skills check: X ~ N(20, 20). Find P(X < 17.1).
Example: X ~ N(40, 15). Find P(X < 44.2).
$P(X < 44.2) = P\left(Z < \frac{44.2 - 40}{\sqrt{15}}\right) = \Phi(1.084) = 0.861$ (3 s.f.)

2.1 Poisson as an approximation to the binomial
In the last chapter of S1 you met the use of the normal distribution as an approximation to the binomial distribution, provided certain conditions were satisfied by the parameters n and p. Here we meet a second approximation to the binomial.
If X ~ B(n, p) with n large (n > 50) and p close to 0 (np < 5) then X ~ approximately Po(λ) with λ = np.
Here are some examples where the binomial and Poisson distributions have the same mean:
[Figure: four bar charts of probability against x (x = 0 to 12), each comparing Poisson (mean = 4) with a binomial distribution.]
Poisson (mean = 4) and binomial (n = 10, p = 0.4): The mean of the binomial is 4 and the variance is 2.4. The two sets of probabilities are not particularly similar.
Poisson (mean = 4) and binomial (n = 40, p = 0.1): The variance of the binomial is now 3.6 (remember that the variance of the Poisson is 4). The agreement between the two sets of probabilities is now pretty strong.
Poisson (mean = 4) and binomial (n = 400, p = 0.01), and Poisson (mean = 4) and binomial (n = 4000, p = 0.002): These two graphs both seem to show the binomial and Poisson to be exactly the same, but they are not: while you cannot see any difference graphically on this scale, there are differences between the binomial and the Poisson in both cases, and the differences in the last case are much smaller than the differences when n = 400 and p = 0.01.
There is a fundamental difference in that the Poisson outcome space has no upper limit whereas the binomial is bounded by the value of n.
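The comparison in the four charts can be reproduced numerically. The sketch below is illustrative only: the function names are hypothetical, and the fourth pair (n = 4000, p = 0.001) is an assumption chosen so that np = 4 like the other three cases. It reports the largest gap between the binomial and Poisson probabilities for x = 0 to 12.

```python
from math import comb, exp, factorial

def binomial_pmf(n: int, p: float, x: int) -> float:
    """P(X = x) for X ~ B(n, p); zero when x is outside 0..n."""
    if x < 0 or x > n:
        return 0.0
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def poisson_pmf(lam: float, x: int) -> float:
    """P(X = x) for X ~ Po(lam)."""
    return exp(-lam) * lam**x / factorial(x)

def max_abs_difference(n: int, p: float, upper: int = 12) -> float:
    """Largest |binomial - Poisson| probability over x = 0..upper, with lam = n*p."""
    lam = n * p
    return max(
        abs(binomial_pmf(n, p, x) - poisson_pmf(lam, x))
        for x in range(upper + 1)
    )

# All four cases have mean np = 4; the last pair is an assumed example.
cases = [(10, 0.4), (40, 0.1), (400, 0.01), (4000, 0.001)]
for n, p in cases:
    print(f"n = {n}, p = {p}: max difference = {max_abs_difference(n, p):.5f}")
```

The printed gaps shrink as n increases and p decreases with np held fixed, which is the pattern the charts illustrate.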
However, when n is large and p is small, the probabilities of high values of x are very small, so the lack of an upper limit is not a problem (in the same way that the normal can never provide an exact model for any physical measurements like heights or weights because the distribution cannot take negative values).
The use of the Poisson as an approximation to the binomial improves as n increases and as p gets smaller.

Example 1
The probability that a component coming off a production line is faulty is 0.01.
a) If a sample of size 5 is taken, find the probability that exactly one of the components is faulty.
b) What is the probability that a batch of 250 of these components has more than 3 faulty components in it?

a) If X = number of faulty components in the sample then X ~ B(5, 0.01) and $P(X = 1) = 5 \times 0.01 \times 0.99^{4} = 0.0480$ (3 s.f.)
b) If Y = number of faulty components in the batch then Y ~ B(250, 0.01) ≈ Po(2.5) and P(Y > 3) = 1 - P(Y ≤ 3) = 1 - 0.758 = 0.242 (3 s.f.)
If you are working in a situation where p is close to 1, you can choose to count failures instead of successes and still construct an appropriate Poisson approximation.

Exercise 2.1
1. The proportion of defective pipes coming off a production line is 0. A sample of 60 pipes is examined.
a) Using the exact binomial distribution calculate the probabilities that there are i) 0 ii) 1 iii) 2 iv) more than 2 defectives in the sample.
b) Using an appropriate approximate distribution calculate the probabilities that there are i) 0 ii) 1 iii) 2 iv) more than 2 defectives in the sample.
2. a) State the conditions under which a Poisson distribution may be used to approximate a binomial distribution.
b) 5% of the times a faulty ATM asks for a personal identification number (PIN) it does not register the number entered correctly. If I enter my PIN correctly each time, what is the probability that the ATM will not register it correctly in 3 attempts?
c) Over a period of time, 90 attempts are made to enter a PIN. If all of the customers enter their PIN correctly, what is the probability that fewer than 3 of the attempts are not registered correctly?
3. In a small town, the football team claim that 95% of the people in town support them. If the claim is correct and a survey of 80 randomly chosen people asks whether they support the football team, find the probability that more than 75 people say they do.
4. A rare but harmless medical condition affects 1 in 200 people.
a) At a cinema showing which 130 people attend, what is the probability that exactly one person has the condition?
b) At a concert where the audience is 600, use an appropriate approximate distribution to find the probability that there are fewer than 5 people with the condition.
5. The Nutty Fruitcase party claim that 1 in 250 people support their policy to distribute free fruit and nut chocolate bars to children taking examinations.
a) In an opinion poll which asks 1000 voters about a range of policies put forward by different parties, find the probability that
i) no-one will support the Nutty Fruitcase party policy
ii) at least 5 people will support the policy.
b) If the opinion poll had 7 people supporting the policy, does this mean that the Nutty Fruitcase party have underestimated the support there is for this policy?
6. A rare medical condition affects 1 in 150 sheep.
a) In a small farm holding with a flock of 180 sheep, what is the probability that exactly one sheep has the condition?
b) A large farm has a flock of 500 sheep. Use an appropriate approximate distribution to find the probability that there are fewer than 5 sheep with the condition.

2.2 The normal approximation to the Poisson distribution
For large λ (λ > 15, approximately) you would often use a normal approximation, particularly when the probability of an interval is required, e.g.
P(X ≥ 15) or P(6 < X < 14), since this is a single calculation for a continuous random variable but requires multiple calculations for a discrete random variable.
Remember that the normal uses the standard deviation to calculate the z-score, i.e. $z = \frac{x - \mu}{\sigma}$.
You must also include the continuity correction (which you met in S1 when using the normal to approximate another discrete distribution, the binomial).
The parameters used are the mean and variance of the Poisson, i.e. μ = σ² = λ.

Example 2
If X ~ Po(16) calculate P(11 ≤ X ≤ 15)
a) using the exact Poisson probabilities
b) by using a normal approximation.

a) P(11 ≤ X ≤ 15) = P(X = 11, 12, 13, 14 or 15) = $e^{-16}\left(\frac{16^{11}}{11!} + \frac{16^{12}}{12!} + \frac{16^{13}}{13!} + \frac{16^{14}}{14!} + \frac{16^{15}}{15!}\right)$ = 0.389
b) λ > 15, so use the N(16, 16) distribution to approximate the Po(16) distribution.
The continuity correction says P(11 ≤ X ≤ 15) ≈ P(10.5 < Y < 15.5) where Y is the approximating normal.
$P(10.5 < Y < 15.5) = P\left(\frac{10.5 - 16}{\sqrt{16}} < Z < \frac{15.5 - 16}{\sqrt{16}}\right) = P(-1.375 < Z < -0.125) = \Phi(1.375) - \Phi(0.125) = 0.9155 - 0.5498 = 0.366$.

Example 3
The demand for a particular spare part in a car accessory shop may be modelled by a Poisson distribution. On average the demand per week for that part is 2.5.
a) The shop has 4 in stock at the start of one week. What is the probability that they will not be able to supply everyone who asks for that part during the week?
b) The manager is going to be away for 6 weeks, and wants to leave sufficient stock that there is no more than a 5% probability of running out of any parts while he is away. How many of this particular spare part should he have in stock?

a) For the demand in a week, use the Po(2.5) distribution. Then if the demand is 4 or less the shop can supply all the customers.
P(X ≤ 4) = 0.0821 + 0.2052 + 0.2565 + 0.2138 + 0.1336 = 0.8912.
The probability of not being able to supply all the demand is 1 - 0.891 = 0.109 (3 s.f.).
b) For the demand in 6 weeks, use the Po(15) distribution, which can be approximated by the N(15, 15) distribution.
You need to find k so that P(demand ≤ k) > 0.95.
$\Phi^{-1}(0.95) = 1.6449$, so you need to find the smallest integer k which satisfies $\frac{k + 0.5 - 15}{\sqrt{15}} > 1.6449$, which is 21 (the solution is k > 20.9).

Exercise 2.2
1. Let X ~ Po(λ) and Y ~ N(λ, λ) where λ satisfies the conditions needed for Y to be used as an approximation for X. Write down the probability you need to calculate for Y (including the continuity correction) as the approximation for each of the following probabilities for X.
a) P(X < 16)
b) P(X > 22)
c) P(X ≤ 17)
d) P(45 ≤ X < 62)
2. Which of the following could reasonably be approximated by a normal distribution? (For those which can, state the normal distribution that would be used.)
a) X ~ Po(16)
b) X ~ Po(12.32)
c) X ~ Po(8.5)
3. Use normal approximations to calculate
a) P(X < 42) if X ~ Po(49)
b) P(X ≥ 9) if X ~ Po(17.5)
c) P(X ≥ 13) if X ~ Po(18.4)
d) P(25 …

Summary exercise 2
1. …
b) Using an appropriate approximate distribution calculate the probabilities that there are i) 0 ii) 1 iii) 2 iv) > 2 defective pipes in the sample.
2. A rare disease affects 1 in 2000 people on average.
a) Use a suitable approximation to find the probability that, of a random sample of 7500 people in a city, more than 3 people have the disease.
b) In a random sample of n people, the probability that no one has the disease is less than 0.01. Find the least possible value of n.
3. Customers arrive at the exchange and refunds desk in a store at a constant average rate of 1 every 2 minutes.
a) State one condition for the number of customers arriving in a given period to be modelled by a Poisson distribution.
Assume now that a Poisson distribution is a suitable model.
b) Find the probability that exactly 4 customers will arrive during a randomly chosen 10-minute period.
c) Find the probability that less than 3 customers will arrive during a randomly chosen 5-minute period.
4. …
a) i) … 2).
ii) Find the probability that X = 2 given that X ≥ 2.
b) Random samples of 150 values of X are taken.
i) Describe fully the distribution of the sample mean.
ii) Find the probability that the mean of a random sample of size 150 is less than 2.4.
5. On average 3 people in every 10000 in Canada have a particular gene. A random sample of 4000 people in Canada is chosen. The random variable X denotes the number of people in the sample who have the gene. Use an approximating distribution to calculate the probability that there will be more than 2 people in the sample who have the gene.
6. 2% of bottles on a production line do not have their tops securely fastened. This fault occurs randomly. 200 bottles are checked to see whether the tops are securely fastened. Use a suitable approximation to find the probability that fewer than 4 do not have the top securely fastened.
7. A dissertation contains 5480 words. For each word, the probability it contains an error is 0.001, and these errors can be assumed to occur independently. The number of words with errors in the dissertation is represented by the random variable X.
a) State the exact distribution of X, including the value of any parameters.
b) State an approximate distribution for X, including any parameters, and justify the use of this approximation.
c) Use this approximate distribution to find the probability that there are more than 4 words printed wrongly in the dissertation.
8. A manufacturer packs computer components in boxes of 500. On average, 1 in 2000 components is faulty. Use a suitable approximation to estimate the probability that a randomly chosen box contains at least one faulty component.
9. On average 1 in 3000 adults has a certain medical condition.
a) Use a suitable approximation to find the probability that, in a random sample of 4500 people, fewer than 4 have this condition.
b) In a random sample of n people, where n is large, the probability that none has the condition is less than 10%. Find the smallest possible value of n.

Chapter summary
If X ~ B(n, p) with n large (n > 50) and p close to 0 (np < 5) then X ~ approximately Po(λ) with λ = np.
If X ~ Po(λ) with λ > 15 (approximately) then X ~ approximately N(λ, λ).
When the Poisson is approximated by a normal distribution, a continuity correction must be used.

Linear combination of random variables
The real world is not simple; many things are made up of more than one component. It is often easier to model each component of a process separately than it is to try to produce a complex model of the whole process. Simulations then allow you to get a good idea of what the behaviour of the overall process would be. For example, simulating the number of passengers on a flight, and then the baggage and person weights, would be easier to do separately.

Objectives
After studying this chapter you should be able to:
Use, in the course of solving problems, the results that
E(aX + b) = aE(X) + b and Var(aX + b) = a²Var(X)
E(aX + bY) = aE(X) + bE(Y)
Var(aX + bY) = a²Var(X) + b²Var(Y) for independent X and Y.

Before you start
You should know how to:
1. Calculate the mean and variance of a random variable, e.g.
x | 3 | 4 | 5
P(X = x) | 0.1 | 0.6 | 0.3
Calculate E(X) and Var(X).
E(X) = (3 × 0.1) + (4 × 0.6) + (5 × 0.3) = 4.2
E(X²) = (9 × 0.1) + (16 × 0.6) + (25 × 0.3) = 18
Var(X) = 18 - 4.2² = 0.36
Skills check:
1. Calculate E(X) and Var(X).
x | …
P(X = x) | 0.2 | 0.4 | 0.4

3.1 Expectation and variance of a linear function of a random variable
In S1 Sections 5.3 and 5.4 you met the expectation and variance of a discrete random variable:
The mean or expected value of a probability distribution is defined as $\mu = E(X) = \sum px$.
The variance of a probability distribution is defined as $\text{Var}(X) = E[\{X - E(X)\}^{2}]$. The alternative version (which is easier to use in practice) is $\text{Var}(X) = E(X^{2}) - \{E(X)\}^{2}$.
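The two definitions above translate directly into a short calculation. The sketch below is a minimal illustration (the function name is hypothetical) that applies the alternative version of the variance formula to the worked example with x = 3, 4, 5 and probabilities 0.1, 0.6, 0.3.

```python
def mean_and_variance(values, probs):
    """Return (E(X), Var(X)) for a discrete distribution.

    Uses E(X) = sum of p*x and Var(X) = E(X^2) - {E(X)}^2.
    """
    mean = sum(x * p for x, p in zip(values, probs))
    mean_of_squares = sum(x * x * p for x, p in zip(values, probs))
    return mean, mean_of_squares - mean**2

# Worked example from the text: x = 3, 4, 5 with P(X = x) = 0.1, 0.6, 0.3
mean, variance = mean_and_variance([3, 4, 5], [0.1, 0.6, 0.3])
print(round(mean, 2), round(variance, 2))  # 4.2 0.36
```

Rounding is applied only when printing, because the probabilities are stored as floating-point numbers and the raw results can differ from 4.2 and 0.36 in the last few decimal places.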
In S1 Section 2.5 you saw:
If a set of data values X is related to a set of values Y so that Y = aX + b, then
mean of Y = a × mean of X + b
standard deviation of Y = a × standard deviation of X
variance of Y = a² × variance of X.
The same relationship applies if X and Y are random variables defined in the same way (Y = aX + b).
The proof of these results is easiest to do by considering the multiplication by a constant and the addition of a constant separately; the full result is then obtained just by applying them one after the other. We will show it in full here for discrete random variables, but the same result holds for continuous random variables, which you will meet in Chapter 5 (where the summation is replaced by integration).

If Y = aX:
$\mu_X = \sum xp \Rightarrow \mu_Y = \sum yp = \sum (ax)p = a\sum xp = a\mu_X$
$E(X^{2}) = \sum x^{2}p \Rightarrow E(Y^{2}) = \sum y^{2}p = \sum (ax)^{2}p = a^{2}\sum x^{2}p = a^{2}E(X^{2})$
$\text{Var}(X) = E(X^{2}) - (\mu_X)^{2} \Rightarrow \text{Var}(Y) = E(Y^{2}) - (\mu_Y)^{2} = a^{2}E(X^{2}) - (a\mu_X)^{2} = a^{2}\{E(X^{2}) - (\mu_X)^{2}\} = a^{2}\,\text{Var}(X)$.

If Y = X + b:
$\mu_X = \sum xp \Rightarrow \mu_Y = \sum yp = \sum (x + b)p = \sum xp + b\sum p = \mu_X + b$, since $\sum p = 1$.
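The results E(aX + b) = aE(X) + b and Var(aX + b) = a²Var(X) can be checked on a small example. The sketch below is an illustration under stated assumptions: it reuses the distribution x = 3, 4, 5 with probabilities 0.1, 0.6, 0.3, the constants a = 2 and b = 1 are arbitrary choices, and the helper names are hypothetical. Exact fractions are used so that the comparisons are not affected by rounding.

```python
from fractions import Fraction

def mean_and_variance(dist):
    """dist maps each value x to its probability p; returns (E(X), Var(X))."""
    mean = sum(x * p for x, p in dist.items())
    mean_of_squares = sum(x * x * p for x, p in dist.items())
    return mean, mean_of_squares - mean**2

def linear_transform(dist, a, b):
    """Distribution of Y = aX + b (assumes a != 0, so distinct x give distinct y)."""
    return {a * x + b: p for x, p in dist.items()}

# Distribution of X, with exact probabilities 0.1, 0.6, 0.3
X = {3: Fraction(1, 10), 4: Fraction(6, 10), 5: Fraction(3, 10)}
a, b = 2, 1  # arbitrary illustrative constants

mean_x, var_x = mean_and_variance(X)
mean_y, var_y = mean_and_variance(linear_transform(X, a, b))

print(mean_y == a * mean_x + b)   # E(aX + b) = aE(X) + b
print(var_y == a**2 * var_x)      # Var(aX + b) = a^2 Var(X)
```

Adding b shifts every value by the same amount, so it changes the mean but leaves the spread, and hence the variance, unchanged.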
