Sac400-Lesson 5

This lesson focuses on the analysis of mortality data, specifically modeling the number of deaths using binomial and Poisson distributions. It covers assumptions for modeling, examples for calculating mortality rates, and the method of maximum likelihood for estimating mortality rates. Additionally, it discusses potential issues with the binomial model and introduces alternative distribution assumptions for mortality analysis.

Uploaded by

emmambuvi9

LESSON FIVE.

Analysis of Experience data.

5.1 INTRODUCTION.

A mortality investigation is carried out to study the real-life mortality experience of a group of
lives. This lesson deals with the analysis of the data collected during such an
investigation.

5.2 Learning Outcomes.


By the end of this lesson, you should be able to:
5.2.1 Model the number of deaths observed in the mortality investigation using the
binomial distribution.
5.2.2 Model the number of deaths observed in the mortality investigation using the
Poisson distribution.
5.2.3 Use examples to calculate the rate of mortality and the force of mortality.

5.2.1 MODELING OF THE NUMBER OF DEATHS OBSERVED IN A MORTALITY
INVESTIGATION USING THE BINOMIAL DISTRIBUTION.

To model the number of deaths observed in a mortality investigation, the following
assumptions are made:
i) All lives are observed over the same interval of age (x, x+1) within the year.
ii) Death is the only cause of decrement considered.
iii) All lives are identical and independent.
iv) The deaths are uniformly distributed across all the sub-intervals within the
year, and their total is ‘d’.

Let N be the number of identical, independent lives aged exactly x, each observed for one year.
Let ‘d’ denote the number of deaths in (x, x+1), where ‘d’ is the sample value of the random
variable ‘D’, and let $q_x$ (written q below) be the probability that a life dies between ages x
and x+1, called the initial rate of mortality.
The number of deaths ‘D’ has a binomial distribution if its probability function is given by:

$$P(D = d) = \binom{N}{d}\, q^{d} (1 - q)^{N-d}, \qquad d = 0, 1, \dots, N$$
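
The binomial probability function can be evaluated directly. The following is a minimal sketch in Python; the function name `binomial_death_pmf` and the figures N = 100 and q = 0.05 are hypothetical values chosen for illustration and do not come from the lesson.

```python
from math import comb


def binomial_death_pmf(d: int, n_lives: int, q: float) -> float:
    """P(D = d) when D ~ Binomial(n_lives, q)."""
    return comb(n_lives, d) * q**d * (1 - q) ** (n_lives - d)


# Hypothetical illustration: 100 lives, each with q = 0.05.
n_lives, q = 100, 0.05
probs = [binomial_death_pmf(d, n_lives, q) for d in range(n_lives + 1)]

total = sum(probs)                              # should be 1
mean = sum(d * p for d, p in enumerate(probs))  # should equal N * q
print(f"sum of probabilities = {total:.6f}, expected deaths = {mean:.4f}")
```

The probabilities sum to one and the expected number of deaths equals Nq, which is a quick sanity check on the formula.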
EXAMPLE.
A cat has nine lives, so that the cat will not die until it has lost all nine of its lives. The
probability of the cat losing a given life is 20% per week. Assuming that the mortality of each
life follows the binomial distribution, calculate the probability that a cat which has currently
lost none of its lives will die during the next ten weeks.

SOLUTION
The cat will be dead within 10 weeks if it has lost all 9 lives by then.
Let X denote the time (in weeks) at which a given life is lost.
Probability of losing the life within ten weeks = 1 - probability that the life survives ten weeks:
P(X ≤ 10) = 1 - P(X > 10)
Probability that a life survives one week = 0.8.
Probability that a life survives 10 weeks = P(X > 10) = 0.8^10 = 0.10737.
Therefore P(X ≤ 10) = 1 - 0.10737 = 0.89263, which is the probability of losing one given life.
For the cat to die it must lose all nine of its lives. Treating the nine lives as independent,
the required probability is
0.89263^9 = 0.36.
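
The arithmetic of this example can be checked with a short Python sketch. It follows the worked solution's reading of the problem, namely that each of the nine lives is independently at risk every week; the function name `prob_cat_dead_by` is a hypothetical helper.

```python
def prob_cat_dead_by(weeks: int, lives: int = 9, weekly_loss_prob: float = 0.2) -> float:
    """Probability that all `lives` are lost within `weeks` weeks.

    Assumes (as in the worked solution) that the lives are independent
    and each is lost with the same probability in any week.
    """
    p_life_survives = (1 - weekly_loss_prob) ** weeks
    p_life_lost = 1 - p_life_survives
    return p_life_lost**lives


p_one_life_lost = 1 - 0.8**10
print(round(p_one_life_lost, 5))       # 0.89263
print(round(prob_cat_dead_by(10), 2))  # 0.36
```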

EXERCISE.
In the above example find the probability that the cat will die during the fifth week.

ESTIMATING THE RATE OF MORTALITY USING THE BINOMIAL DISTRIBUTION.

We will use the method of maximum likelihood to estimate $q_x$, the mortality rate
over (x, x+1). This involves selecting the value of q that maximizes the likelihood of obtaining
the observed number of deaths.
Since all lives have the same probability of death over the interval (x, x+1), the likelihood is

$$L(q) = P(D = d) = \binom{N}{d}\, q^{d} (1 - q)^{N-d}$$

Taking the logarithm of L(q) we get:

$$\log L(q) = \log\binom{N}{d} + d \log q + (N - d)\log(1 - q)$$

To maximize, we differentiate with respect to q and set the derivative equal to zero:

$$\frac{\mathrm{d} \log L(q)}{\mathrm{d}q} = \frac{d}{q} - \frac{N - d}{1 - q} = 0$$

This equation is satisfied at the value $\hat{q}$ for which

$$d(1 - \hat{q}) = (N - d)\,\hat{q} \;\Rightarrow\; \hat{q} = \frac{d}{N}$$

$\hat{q}$ is a maximum likelihood estimate since

$$\frac{\mathrm{d}^2 \log L(q)}{\mathrm{d}q^2} = -\frac{d}{q^2} - \frac{N - d}{(1 - q)^2} < 0$$

and the corresponding estimator D/N is also an unbiased estimator of q, i.e. $E(\hat{q}) = q$.

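We now derive the variance of $\hat{q}$, using the asymptotic result

$$\operatorname{Var}(\hat{q}) \approx \left[ -\frac{\mathrm{d}^2 \log L(q)}{\mathrm{d}q^2} \right]^{-1}_{q = \hat{q}}$$

Now

$$\frac{\mathrm{d}^2 \log L(q)}{\mathrm{d}q^2} = -\frac{d}{q^2} - \frac{N - d}{(1 - q)^2}
= -\frac{d(1 - q)^2 + (N - d)q^2}{q^2 (1 - q)^2}
= \frac{-d + 2qd - dq^2 - Nq^2 + dq^2}{q^2 (1 - q)^2}$$

Dividing both the numerator and the denominator by N we get:

$$\frac{\mathrm{d}^2 \log L(q)}{\mathrm{d}q^2} = \frac{-d/N + 2qd/N - q^2}{q^2 (1 - q)^2 / N}$$

At $q = \hat{q} = d/N$ the numerator becomes $-\hat{q} + 2\hat{q}^2 - \hat{q}^2 = -\hat{q}(1 - \hat{q})$, therefore

$$\left. \frac{\mathrm{d}^2 \log L(q)}{\mathrm{d}q^2} \right|_{q = \hat{q}}
= \frac{-\hat{q}(1 - \hat{q})}{\hat{q}^2 (1 - \hat{q})^2 / N}
= \frac{-N}{\hat{q}(1 - \hat{q})}$$

Hence:

$$\operatorname{Var}(\hat{q}) \approx \left[ -\frac{\mathrm{d}^2 \log L(q)}{\mathrm{d}q^2} \right]^{-1}_{q = \hat{q}} = \frac{\hat{q}(1 - \hat{q})}{N}$$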
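
As a numerical check of the result $\hat{q} = d/N$ with variance $\hat{q}(1-\hat{q})/N$, the sketch below compares the closed-form estimate with a brute-force search over a grid of q values. The data (12 deaths among 1,000 lives) and the function names are hypothetical, chosen only for illustration.

```python
from math import log, sqrt


def binomial_log_likelihood(q: float, d: int, n_lives: int) -> float:
    """log L(q) up to the constant log C(N, d), which does not depend on q."""
    return d * log(q) + (n_lives - d) * log(1 - q)


def binomial_mle(d: int, n_lives: int) -> tuple[float, float]:
    """Closed-form MLE q_hat = d/N and its estimated variance q_hat(1 - q_hat)/N."""
    q_hat = d / n_lives
    return q_hat, q_hat * (1 - q_hat) / n_lives


# Hypothetical data: 12 deaths observed among 1,000 lives.
d, n_lives = 12, 1000
q_hat, var_hat = binomial_mle(d, n_lives)

# Brute-force maximization over a grid as an independent check.
grid = [k / 10000 for k in range(1, 10000)]
q_grid = max(grid, key=lambda q: binomial_log_likelihood(q, d, n_lives))

print(f"closed form: {q_hat:.4f}, grid search: {q_grid:.4f}, "
      f"standard error: {sqrt(var_hat):.5f}")
```

The grid search lands on the same value as d/N, which is what the derivative condition predicts.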

PROBLEMS WITH THE BINOMIAL MODEL.


The binomial model runs into problems when the observation scheme is made more realistic.

i) We might not observe all lives over the same interval of age.
The first assumption may not be realistic: even if we limit our investigation
to lives aged between x and x+1, we might not observe all the lives over the whole
year. Some lives will die earlier, hence the deaths are not uniformly distributed. We
may observe the ith life only between $x + a_i$ and $x + b_i$, with $d_i$ deaths in that
sub-interval and $d_i \le d$ for all intervals.

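[Figure: timeline of the year of age from x to x+1, marking the observation sub-intervals $(x + a_1, x + b_1)$, $(x + a_2, x + b_2)$, ..., $(x + a_i, x + b_i)$ with $d_1$, $d_2$, ..., $d_i$ deaths respectively.]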

ii) There may be decrements other than death, and sometimes increments as
well.
Example:
Consider an investigation into the mortality of unemployed men aged 30. There are
several scenarios:
• Some lives will leave the investigation because they return to employment; in this
case the data will be censored.
• Other men will rejoin the investigation if they become unemployed again.

To accommodate the fact that $a_i$ and $b_i$ are in general not the same for every life, consider the ith life.
We define the random variable $D_i$, whose realized value (sample value) is $d_i$, as

$$D_i = \begin{cases} 0 & \text{if the ith life survives} \\ 1 & \text{if the ith life dies} \end{cases}$$

$$P(D_i = 0) = P(\text{ith life survives from } x + a_i \text{ to } x + b_i) = 1 - {}_{b_i - a_i}q_{x + a_i}$$

$$P(D_i = 1) = P(\text{ith life dies between } x + a_i \text{ and } x + b_i) = {}_{b_i - a_i}q_{x + a_i}$$

Since $d_i \in \{0, 1\}$, the probability that there are $d_i$ deaths between $x + a_i$ and $x + b_i$ is given by:

$$P(D_i = d_i) = \left( {}_{b_i - a_i}q_{x + a_i} \right)^{d_i} \left( 1 - {}_{b_i - a_i}q_{x + a_i} \right)^{1 - d_i}$$

which is a Bernoulli distribution. Each life has a Bernoulli distribution.


For each life we will collect information relating to the values $d_i$, $a_i$ and $b_i$.
Define

$$\underset{\sim}{q} = \left( {}_{b_1 - a_1}q_{x + a_1},\; {}_{b_2 - a_2}q_{x + a_2},\; \dots,\; {}_{b_N - a_N}q_{x + a_N} \right)$$

a 1×N vector of unknown parameters, and

$$\underset{\sim}{d} = (d_1, d_2, \dots, d_N)$$

Since each life is assumed to be independent of the others, the overall likelihood is the product of
all the individual Bernoulli distributions in each interval. We can write the overall likelihood
function as

$$L(\underset{\sim}{d}, \underset{\sim}{q}) = \prod_{i=1}^{N} \left( {}_{b_i - a_i}q_{x + a_i} \right)^{d_i} \left( 1 - {}_{b_i - a_i}q_{x + a_i} \right)^{1 - d_i}$$

which is a multiple-parameter function.

To maximize this likelihood $L(\underset{\sim}{d}, \underset{\sim}{q})$ we need to find a vector of N maximum likelihood
estimators

$$\hat{\underset{\sim}{q}} = (\hat{q}_1, \dots, \hat{q}_N)$$

This is very difficult to solve.

The usual approach is to reduce the above likelihood $L(\underset{\sim}{d}, \underset{\sim}{q})$ to a one-parameter problem by
making an assumption about the distribution of the deaths ‘d’ in the different intervals, which allows
us to express any ${}_{b_i - a_i}q_{x + a_i}$ in terms of $q_x$.

There are three possible distribution assumptions applied.

i) Uniform distribution assumption: ${}_{t}q_x = t\,q_x$
ii) Balducci assumption: ${}_{1-t}q_{x+t} = (1 - t)\,q_x$
iii) Constant force of mortality assumption: ${}_{t}q_x = 1 - e^{-\mu t}$
i) Uniform distribution assumption.
$${}_{t}q_x = t\,q_x$$
Proof:
Here we assume that the deaths are uniformly distributed over the year of age (x, x+1), and take $0 \le t \le 1$.
The number of deaths in the interval (x, x+t) is given by:
$${}_{t}d_x = l_x - l_{x+t} \qquad (5.1)$$
Since deaths occur uniformly at the rate of $d_x$ per year and the interval (x, x+t) has length ‘t’
years, the total number of deaths in (x, x+t) is $t\,d_x$, that is
$$l_x - l_{x+t} = t\,d_x \qquad (5.2)$$
Comparing equations (5.1) and (5.2) we get:
$${}_{t}d_x = t\,d_x$$
Dividing both sides by $l_x$ we get ${}_{t}q_x = t\,q_x$.
ii) Balducci assumption.
$${}_{1-t}q_{x+t} = (1 - t)\,q_x$$
This assumption implies a decreasing force of decrement over the age interval.
Proof.
The Balducci assumption, which implies a decreasing force of mortality across the interval (x, x+1),
corresponds to assuming that $1/l_{x+t}$ is linear in t over that interval.
Under the uniform distribution assumption we have:
$$l_x - l_{x+t} = t\,d_x = t\,(l_x - l_{x+1})$$
$$l_{x+t} = l_x - t\,(l_x - l_{x+1})$$
so that $l_{x+t}$ is linear in t. When $1/l_{x+t}$ is assumed linear instead, we get:
$$\frac{1}{l_{x+t}} = \frac{1}{l_x} + t\left( \frac{1}{l_{x+1}} - \frac{1}{l_x} \right)$$
Multiplying this by $l_{x+1}$:
$$\frac{l_{x+1}}{l_{x+t}} = \frac{l_{x+1}}{l_x} + t\left( 1 - \frac{l_{x+1}}{l_x} \right)$$
$${}_{1-t}p_{x+t} = p_x + t\,(1 - p_x)$$
$$1 - {}_{1-t}q_{x+t} = 1 - q_x + t\,q_x$$

$${}_{1-t}q_{x+t} = (1 - t)\,q_x$$ hence the result.
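
To see how the three assumptions differ in practice, the sketch below computes ${}_{t}q_x$ under each of them for a given annual rate $q_x$. Two steps are not in the lesson and are labeled as assumptions here: the Balducci form ${}_{t}q_x = t q_x / (1 - (1 - t) q_x)$, which follows from ${}_{1-t}q_{x+t} = (1-t)q_x$ together with $p_x = {}_{t}p_x \cdot {}_{1-t}p_{x+t}$, and the choice $\mu = -\log(1 - q_x)$ for the constant force. The function names and the value q = 0.1 are hypothetical.

```python
from math import exp, log


def tqx_uniform(t: float, q: float) -> float:
    """Uniform distribution of deaths: t_q_x = t * q_x."""
    return t * q


def tqx_balducci(t: float, q: float) -> float:
    """Balducci: derived from (1-t)_q_(x+t) = (1 - t) q_x and p_x = t_p_x * (1-t)_p_(x+t)."""
    return t * q / (1 - (1 - t) * q)


def tqx_constant_force(t: float, q: float) -> float:
    """Constant force: t_q_x = 1 - exp(-mu t), with mu chosen so that 1_q_x = q."""
    mu = -log(1 - q)
    return 1 - exp(-mu * t)


q = 0.1  # hypothetical annual mortality rate
for t in (0.25, 0.5, 0.75, 1.0):
    print(f"t={t:.2f}  uniform={tqx_uniform(t, q):.5f}  "
          f"constant force={tqx_constant_force(t, q):.5f}  "
          f"Balducci={tqx_balducci(t, q):.5f}")
```

All three agree at t = 1 by construction; for 0 < t < 1 the uniform assumption gives the smallest value and the Balducci assumption the largest, consistent with the increasing and decreasing forces of mortality they imply.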

DERIVING THE ACTUARIAL ESTIMATE OF THE MORTALITY RATE.

We would like to find a simple relationship between D, the random variable representing the
number of deaths, and $q_x$, the underlying mortality rate.

Let
$$D_i = \begin{cases} 0 & \text{if the ith life survives} \\ 1 & \text{if the ith life dies} \end{cases}$$
so that $D = \sum_{i=1}^{N} D_i$ and

$$E(D) = \sum_{i=1}^{N} E(D_i) \qquad (5.2)$$

$$E(D) = \sum_{i=1}^{N} \left[ 0 \times P(\text{ith life survives}) + 1 \times P(\text{ith life dies}) \right]$$

$$= \sum_{i=1}^{N} P(\text{ith life dies between } x + a_i \text{ and } x + b_i)$$

$$= \sum_{i=1}^{N} {}_{b_i - a_i}q_{x + a_i} \qquad (5.3)$$

[Figure: timeline from x to x+1 marking $x + a_i$ and $x + b_i$, with ${}_{b_i - a_i}q_{x + a_i}$ applying to the sub-interval between them.]

P(ith life dies between $x + a_i$ and $x + b_i$)

= P(ith life dies between $x + a_i$ and $x + 1$) - P(ith life survives from $x + a_i$ to $x + b_i$) × P(ith
life dies between $x + b_i$ and $x + 1$)

$${}_{b_i - a_i}q_{x + a_i} = {}_{1 - a_i}q_{x + a_i} - {}_{b_i - a_i}p_{x + a_i} \times {}_{1 - b_i}q_{x + b_i} \qquad (5.4)$$

Substituting eqn (5.4) in eqn (5.3) we get

$$E(D) = \sum_{i=1}^{N} \left[ {}_{1 - a_i}q_{x + a_i} - {}_{1 - b_i}q_{x + b_i} \times {}_{b_i - a_i}p_{x + a_i} \right] \qquad (5.5)$$

We can now express ${}_{1 - a_i}q_{x + a_i}$ and ${}_{1 - b_i}q_{x + b_i}$ in terms of $q_x$ using the Balducci assumption in
equation (5.5).

So under the Balducci assumption, ${}_{1 - a_i}q_{x + a_i} = (1 - a_i)\,q_x$ and ${}_{1 - b_i}q_{x + b_i} = (1 - b_i)\,q_x$, and we have

$$E(D) = \sum_{i=1}^{N} (1 - a_i)\,q_x - \sum_{i=1}^{N} (1 - b_i)\,q_x \times {}_{b_i - a_i}p_{x + a_i}$$

Now ${}_{b_i - a_i}p_{x + a_i} = 1 - {}_{b_i - a_i}q_{x + a_i} = 1 - E(D_i)$, from eqn (5.2).

Substituting this we get

$$E(D) = \sum_{i=1}^{N} (1 - a_i)\,q_x - \sum_{i=1}^{N} (1 - b_i)\,q_x \times (1 - E(D_i))$$

Replacing each $E(D_i)$ by its observed value $d_i$:

$$E(D) \approx \sum_{i=1}^{N} (1 - a_i)\,q_x - \sum_{i=1}^{N} (1 - b_i)(1 - d_i)\,q_x$$

Solving for $q_x$ gives the estimate

$$\hat{q}_x = \frac{E(D)}{\sum_{i=1}^{N} (1 - a_i) - \sum_{i=1}^{N} (1 - b_i)(1 - d_i)}$$

where

$$d_i = \begin{cases} 0 & \text{if the ith life survives} \\ 1 & \text{if the ith life dies} \end{cases}$$

so that only the survivors contribute to the second sum.

Each life could potentially be observed for the period between $x + a_i$ and $x + 1$, of length
$(x + 1) - (x + a_i) = 1 - a_i$, thus $\sum_{i=1}^{N} (1 - a_i)$ is the total potential exposure of all individuals to the risk of
dying. Each survivor is not observed for the period between $x + b_i$ and $x + 1$, of length
$(x + 1) - (x + b_i) = 1 - b_i$, thus $\sum_{i=1}^{N} (1 - b_i)(1 - d_i)$ is the total time not observed for those who survived to $x + b_i$.

Thus
$$E_x = \sum_{i=1}^{N} (1 - a_i) - \sum_{i=1}^{N} (1 - b_i)(1 - d_i)$$
denotes the total exposure to risk between $x + a_i$ and $x + 1$, called the initial exposed to risk,
counting the deaths as exposed to risk until the end of the year.

Therefore, replacing E(D) by the observed number of deaths ‘d’,

$$\hat{q}_x = \frac{d}{E_x}$$

Under the assumption of a uniform distribution of deaths, a life that dies does so on average
halfway through the year, so each of the ‘d’ deaths is credited on average with a further half
year of exposure beyond its date of death. Then

$$E_x \approx E_x^c + d/2$$

where $E_x^c$ is called the central exposed to risk, the time actually observed between $x + a_i$
and $x + 1$.

Thus
$$\hat{q}_x = \frac{d}{E_x^c + d/2}$$

Definitions:

$E_x$ - Initial exposed to risk.

This is the full potential time under observation for all lives: those surviving throughout the
interval and those who die within the interval. It is more complicated and is difficult to
interpret in terms of the underlying process being modelled. We therefore approximate it with
$E_x \approx E_x^c + d/2$.

$E_x^c$ - Central exposed to risk.

This is the observed waiting time at age x: each life contributes only the time during which it is
alive and under observation. Once a life dies it no longer contributes to $E_x^c$. This is a very
natural quantity because you just record the time spent under observation.
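
The two exposure measures can be computed from individual records. The sketch below assumes each life is recorded as a triple $(a_i, b_i, d_i)$, where $x + b_i$ is the age at which observation of the life ends (the age at death when $d_i = 1$); this record layout, the function name `exposed_to_risk` and the four sample lives are hypothetical.

```python
def exposed_to_risk(records: list[tuple[float, float, int]]) -> tuple[int, float, float]:
    """Return (deaths, central exposed to risk, initial exposed to risk).

    Each record is (a_i, b_i, d_i): the life is observed from age x + a_i
    to age x + b_i, and d_i is 1 if observation ended in death, else 0.
    """
    deaths = sum(d for _, _, d in records)
    # Central: time actually spent alive and under observation.
    central = sum(b - a for a, b, _ in records)
    # Initial: deaths are counted as exposed until the end of the year,
    # so only survivors lose the unobserved period (1 - b_i).
    initial = sum(1 - a for a, _, _ in records) - sum(
        (1 - b) * (1 - d) for _, b, d in records
    )
    return deaths, central, initial


# Hypothetical records for four lives.
records = [(0.0, 1.0, 0), (0.25, 1.0, 0), (0.0, 0.5, 1), (0.5, 0.75, 0)]
deaths, central, initial = exposed_to_risk(records)

q_hat_initial = deaths / initial
q_hat_approx = deaths / (central + deaths / 2)
print(deaths, central, initial, q_hat_initial, q_hat_approx)
```

In this small data set the single death occurs exactly halfway through the year, so the approximation $E_x \approx E_x^c + d/2$ happens to be exact; with real data the two estimates would generally differ slightly.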
5.2.2 MODELING OF THE NUMBER OF DEATHS OBSERVED IN A MORTALITY
INVESTIGATION USING THE POISSON DISTRIBUTION.

The Poisson distribution is used to model the number of rare events occurring during some
period of time, for example the number of particles emitted by a radioactive source in a minute.
It can also be used to model the number of deaths among a group of lives, given the time spent
exposed to risk.

A random variable X is said to have a Poisson distribution with mean $\lambda$ ($\lambda > 0$) if the probability
function of X is given by:

$$P(X = x) = \frac{e^{-\lambda} \lambda^{x}}{x!}, \qquad x = 0, 1, 2, \dots$$

Remember that the mean and the variance of a Poisson distribution are both equal to $\lambda$.

THE POISSON MODEL

Let $E_x^c$ denote the total observed waiting time. Assume that we observe N individuals and that the
force of mortality (mortality rate) is a constant $\mu$. The Poisson model is then given by the
assumption that D has a Poisson distribution with parameter $\lambda = \mu E_x^c$, that is:

$$P(D = d) = \frac{e^{-\mu E_x^c} (\mu E_x^c)^{d}}{d!}, \qquad d = 0, 1, 2, \dots$$

EXAMPLE.

A small country involved in a war conscripted a cohort of healthy young men to serve in the
country’s army for a 3-year period starting on 1st January 1999. During this period a number of
men were killed. Given that the total period of service for the group as a whole was 10 million
man-days and that the annual force of mortality for death in active service is 0.02, calculate the
probability that at least 500 men were killed while in active service.

SOLUTION

Let D denote the number of deaths. D has a Poisson distribution given by:

$$P(D = d) = \frac{e^{-\mu E_x^c} (\mu E_x^c)^{d}}{d!}, \qquad d = 0, 1, 2, \dots$$

We are given that $\mu = 0.02$ and $E_x^c$ = 10,000,000/365 = 27,397 years.

So $\lambda = \mu E_x^c = 0.02 \times 27397 = 547.95$.

Therefore the probability that the number of deaths is at least 500 is given by:

$$P(D \ge 500) = 1 - P(D < 500) = 1 - \sum_{d=0}^{499} \frac{e^{-547.95} (547.95)^{d}}{d!}$$

Evaluating this sum numerically gives a probability of about 0.98, so it is very likely that at
least 500 men were killed.

This probability can also be approximated using the Normal distribution, since the mean and the
variance of the Poisson distribution are both $\lambda = \mu E_x^c = 0.02 \times 27397 = 547.95$.
Applying a continuity correction:

$$P(D \ge 500) \approx P\{ N(547.95,\ 547.95) > 499.5 \}$$

$$= P\left( Z > \frac{499.5 - 547.95}{\sqrt{547.95}} \right)$$

$$= 1 - \Phi(-2.07)$$

$$= 0.981$$

There is a 98.1% probability that at least 500 men will be killed.
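
The exact Poisson tail and the Normal approximation can be compared numerically. The sketch below uses only the Python standard library; the Poisson probabilities are computed on the log scale (via `math.lgamma`) to avoid overflow at this large mean. The function names are hypothetical helpers, not part of the lesson.

```python
from math import erf, exp, lgamma, log, sqrt


def poisson_pmf(d: int, lam: float) -> float:
    """P(D = d) for D ~ Poisson(lam), evaluated on the log scale for stability."""
    return exp(d * log(lam) - lam - lgamma(d + 1))


def poisson_prob_at_least(k: int, lam: float) -> float:
    """P(D >= k) = 1 - sum of P(D = d) for d = 0, ..., k - 1."""
    return 1.0 - sum(poisson_pmf(d, lam) for d in range(k))


def normal_cdf(z: float) -> float:
    """Standard Normal distribution function."""
    return 0.5 * (1 + erf(z / sqrt(2)))


exposure_years = 10_000_000 / 365   # 10 million man-days expressed in years
lam = 0.02 * exposure_years         # expected number of deaths

exact = poisson_prob_at_least(500, lam)
approx = 1 - normal_cdf((499.5 - lam) / sqrt(lam))  # with continuity correction
print(f"lambda = {lam:.2f}, exact = {exact:.4f}, Normal approximation = {approx:.4f}")
```

The two figures agree to about two decimal places, which is why the Normal approximation is adequate for a mean of this size.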

ESTIMATING THE UNDERLYING FORCE OF MORTALITY.

We now use our knowledge about the number of deaths observed and the total exposed to risk
(waiting time) to estimate the unknown true force of mortality.

The likelihood of observing ‘d’ deaths if the true hazard rate is $\mu$ is given by:

$$L(\mu) = \frac{(\mu E_x^c)^{d}\, e^{-\mu E_x^c}}{d!}$$

Taking logarithms we get

$$\log L(\mu) = d \log \mu + d \log E_x^c - \mu E_x^c - \log d!$$

Differentiating with respect to $\mu$ we get

$$\frac{\mathrm{d}}{\mathrm{d}\mu} \log L(\mu) = \frac{d}{\mu} - E_x^c$$

This is equal to zero if $\hat{\mu} = \dfrac{d}{E_x^c}$. This is a maximum likelihood estimate since

$$\frac{\mathrm{d}^2}{\mathrm{d}\mu^2} \log L(\mu) = \frac{-d}{\mu^2} < 0$$

$\hat{\mu}$ is the realized value of the random variable $\tilde{\mu} = \dfrac{D}{E_x^c}$.

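Properties of the estimator $\tilde{\mu}$.

1. $\tilde{\mu}$ is an unbiased estimator of $\mu$.
Proof.
Now $\tilde{\mu} = \dfrac{D}{E_x^c}$; taking expectations we get:
$$E(\tilde{\mu}) = \frac{E(D)}{E_x^c} = \frac{\mu E_x^c}{E_x^c} = \mu$$

2. $\operatorname{Var}(\tilde{\mu}) = \dfrac{\mu}{E_x^c}$
Proof.
$$\operatorname{Var}(\tilde{\mu}) \approx \left[ -\frac{\mathrm{d}^2 \log L(\mu)}{\mathrm{d}\mu^2} \right]^{-1}_{\mu = \hat{\mu}}
= \left[ \frac{d}{\hat{\mu}^2} \right]^{-1} = \frac{\hat{\mu}^2}{d}$$

Dividing both the numerator and the denominator by $E_x^c$ we get:

$$\operatorname{Var}(\tilde{\mu}) \approx \frac{\hat{\mu}^2 / E_x^c}{d / E_x^c} = \frac{\hat{\mu}^2 / E_x^c}{\hat{\mu}} = \frac{\hat{\mu}}{E_x^c}$$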

EXAMPLE
In a mortality investigation covering a five-year period, where the force of mortality can be
assumed to be constant, there are 46 deaths and the population remained approximately
constant at 7,500. Estimate the force of mortality and its standard error.
SOLUTION.
Here $E_x^c = 7500 \times 5 = 37{,}500$ years.
The force of mortality is given by:
$$\hat{\mu} = \frac{d}{E_x^c} = \frac{46}{7500 \times 5} = 0.00123$$

The variance is given by:
$$\operatorname{Var}(\tilde{\mu}) = \frac{\hat{\mu}}{E_x^c} = \frac{0.00123}{7500 \times 5} = 0.0000000328$$
so the standard error is $\sqrt{0.0000000328} \approx 0.00018$.
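
The estimate and its standard error can be reproduced in a few lines. This is a sketch using the 46 deaths that appear in the worked arithmetic and a central exposed to risk of 7,500 × 5 years; the function name `estimate_force_of_mortality` is hypothetical.

```python
from math import sqrt


def estimate_force_of_mortality(deaths: int, central_exposure: float) -> tuple[float, float, float]:
    """Poisson-model MLE mu_hat = d / E, with variance mu_hat / E and its square root."""
    mu_hat = deaths / central_exposure
    variance = mu_hat / central_exposure
    return mu_hat, variance, sqrt(variance)


mu_hat, variance, std_error = estimate_force_of_mortality(46, 7500 * 5)
print(f"mu_hat = {mu_hat:.5f}, variance = {variance:.3e}, standard error = {std_error:.5f}")
```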

E-tivity 5.2.2. Modelling mortality rate using the Binomial and Poisson distributions.

Numbering, pacing and sequencing: 5.2.2.1
Title: Modelling mortality rate using the Binomial and Poisson distributions
Purpose: To estimate the mortality rate using the Binomial and Poisson distributions.
Brief summary of overall task: Read the following material:
https://www.acted.co.uk/docs/2015/CMP%20Upgrade/CT4-PU-15.pdf
Spark: POISSON DISTRIBUTION.
Individual task: A large computer company always maintains a workforce of
exactly 5,000 young workers, immediately replacing any
worker who leaves. Use the Poisson model to calculate the
probability that there will be fewer than 3 deaths during any 6
month period, assuming that all workers experience a constant
force of mortality of 0.0008 per annum.
Interaction begins:
• Post your answers on the discussion forum 5.2.2.1.
• Read what your colleagues have posted.
• In a sentence or two, comment on what two
of your colleagues have posted, keeping
netiquette in mind.
E-moderator interventions:
• Focussing group discussion
• Encouraging lurkers (quiet ones) to
contribute
• Providing feedback/teaching points
• Summarising key points
• Closing the discussion
Schedule and time: This activity should take two hours.
Next: Exposed to risk.

5.3 Assessment Questions.

1. The population of a small town is expected to remain constant at 50,000 over the next
few years. Assuming that all the inhabitants experience a force of mortality of 0.001 per
annum, use a Normal approximation to estimate the probability that there will be more
than 225 deaths in the town over the next 4 years. Ans 0.036.
2. Find the 95% confidence interval for the force of mortality in Q1.
3. A large computer company always maintains a workforce of exactly 5,000 young workers,
immediately replacing any worker who leaves. Calculate the probability that there will be
fewer than 3 deaths during any 6 month period, assuming that all workers experience a
constant force of mortality of 0.0008 per annum. Ans 0.6767.
4. 10,000 school children have been selected to take part in a one year medical study. If the initial
annual rate of mortality is 0.00025 for each child and deaths are expected to occur
independently, calculate the probability that 2 or more of the participants will die before the
end of the study.
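
The two answers quoted above (0.036 for the town and 0.6767 for the computer company) can be checked with a short sketch. It assumes the Poisson model with $\lambda = \mu E_x^c$ in both cases and a continuity correction in the Normal approximation; the variable names are hypothetical.

```python
from math import erf, exp, factorial, sqrt


def normal_cdf(z: float) -> float:
    """Standard Normal distribution function."""
    return 0.5 * (1 + erf(z / sqrt(2)))


# Town: 50,000 lives for 4 years at mu = 0.001, so lambda = 200.
lam_town = 50_000 * 4 * 0.001
p_more_than_225 = 1 - normal_cdf((225.5 - lam_town) / sqrt(lam_town))

# Company: 5,000 lives for half a year at mu = 0.0008, so lambda = 2.
lam_company = 5_000 * 0.5 * 0.0008
p_fewer_than_3 = sum(
    exp(-lam_company) * lam_company**d / factorial(d) for d in range(3)
)

print(f"P(more than 225 deaths) = {p_more_than_225:.3f}")
print(f"P(fewer than 3 deaths)  = {p_fewer_than_3:.4f}")
```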

