Hierarchical Loss Reserving Guszcza Penultimate
Case Study
Nonlinear Empirical Bayes Hierarchical Model
Models vs Methods
Need for Variability Estimates
Loss Reserving and its Discontents
• Related point: traditional methods produce point estimates only.
• Reserve variability estimates in practice are often ad hoc.
• Repeated measures
AY           premium      6     18     30     42     54     66     78     90    102    114
1988           2,609    404    986  1,342  1,582  1,736  1,833  1,907  1,967  2,006  2,036
1989           2,694    387    964  1,336  1,580  1,726  1,823  1,903  1,949  1,987
1990           2,594    421  1,037  1,401  1,604  1,729  1,821  1,878  1,919
1991           2,609    338    753  1,029  1,195  1,326  1,395  1,446
1992           2,077    257    569    754    892    958  1,007
• Unbalanced data
• We are doing forecasting: the time series are necessarily of different lengths.
• Non-linear
• Each year’s loss development pattern is inherently non-linear
• Ultimate loss (ratio) is an asymptote
• Incomplete information
• Few loss triangles contain all of the information needed to make forecasts
• Most reserving exercises must incorporate judgment and/or background information
Loss reserving is inherently Bayesian
(This presentation will cover 1-4… Wayne will cover 5.)
• Notation:
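• Data points (Xi, Yi), i = 1…N
• j[i]: data point i belongs to group j.
$$
Y_i \sim N\left(\alpha_{j[i]} + \beta_{j[i]} X_i,\ \sigma^2\right)
\quad\text{where}\quad
\begin{pmatrix} \alpha_j \\ \beta_j \end{pmatrix}
\sim N\left(\begin{pmatrix} \mu_\alpha \\ \mu_\beta \end{pmatrix},\ \Sigma\right),
\quad
\Sigma = \begin{pmatrix} \sigma_\alpha^2 & \sigma_{\alpha\beta} \\ \sigma_{\alpha\beta} & \sigma_\beta^2 \end{pmatrix}
$$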
Copyright © 2008 Deloitte Development LLC. All rights reserved.
Example: PIF Growth by Region
• Simple example: change in PIF by region from 2007-10
• 32 data points
• 4 years
• 8 regions
• … have 80 or 800 regions
• We view the dataset as a bundle of very short time series
• … the carpet
• Yi = α + βXi + εi
– Or: Yi ~ N(α + βXi, σ²)
• … parameters: {α1, α2, …, α8, β}
• And it contains 4 …
• This is a major improvement
[Figure: PIF Growth by Region, one panel per region (region1 to region4 shown)]
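• Option 3: random slope and intercept model
• Yi = αj[i] + βj[i]Xi + εi
$$
Y_i \sim N\left(\alpha_{j[i]} + \beta_{j[i]} X_i,\ \sigma^2\right)
\quad\text{where}\quad
\begin{pmatrix} \alpha_j \\ \beta_j \end{pmatrix}
\sim N\left(\begin{pmatrix} \mu_\alpha \\ \mu_\beta \end{pmatrix},\ \Sigma\right),
\quad
\Sigma = \begin{pmatrix} \sigma_\alpha^2 & \sigma_{\alpha\beta} \\ \sigma_{\alpha\beta} & \sigma_\beta^2 \end{pmatrix}
$$
• And it contains 6 hyperparameters: {μα, μβ, σ, σα, σβ, σαβ}
[Figure: PIF Growth by Region, one panel per region (region1 to region4 shown)]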
PIF = α + βt + ε
{PIF = αk + βk·t + εk}, k = 1, 2, …, 8
Compromise: Hierarchical Model
• Estimates parameters using a compromise between complete pooling and no pooling.
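$$
Y_i \sim N\left(\alpha_{j[i]} + \beta_{j[i]} X_i,\ \sigma^2\right)
\quad\text{where}\quad
\begin{pmatrix} \alpha_j \\ \beta_j \end{pmatrix}
\sim N\left(\begin{pmatrix} \mu_\alpha \\ \mu_\beta \end{pmatrix},\ \Sigma\right),
\quad
\Sigma = \begin{pmatrix} \sigma_\alpha^2 & \sigma_{\alpha\beta} \\ \sigma_{\alpha\beta} & \sigma_\beta^2 \end{pmatrix}
$$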
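The two extremes that the hierarchical model compromises between, complete pooling and no pooling, can be sketched in a few lines of Python. The PIF counts below are made up purely for illustration (the slide's data is not reproduced in the text), only three regions are used to keep it short, and `ols_fit` is a hypothetical helper rather than a library function.

```python
from statistics import mean

def ols_fit(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    return y_bar - b * x_bar, b

# Hypothetical PIF counts for 3 regions over 4 years (t = 0..3); not the slide's data.
years = [0, 1, 2, 3]
pif = {
    "region1": [2000, 2100, 2200, 2300],
    "region2": [2400, 2420, 2440, 2460],
    "region3": [2200, 2150, 2100, 2050],
}

# Complete pooling: a single (alpha, beta) fitted to all regions at once.
all_t = [t for _ in pif for t in years]
all_y = [y for ys in pif.values() for y in ys]
pooled = ols_fit(all_t, all_y)

# No pooling: a separate (alpha_k, beta_k) fitted to each region k.
separate = {k: ols_fit(years, ys) for k, ys in pif.items()}

print("pooled:", pooled)
for region, (a, b) in separate.items():
    print(region, a, b)
```

With only four points per region the separate fits are noisy, while the pooled fit ignores real differences between regions; the hierarchical model sits between the two.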
A Credible Approach
$$
Y_i \sim N\left(\alpha_{j[i]} + \beta X_i,\ \sigma^2\right), \qquad \alpha_j \sim N\left(\mu_\alpha,\ \sigma_\alpha^2\right)
$$
• This model can contain a large number of parameters: {α1, α2, …, αJ, β}.
• And it contains 4 hyperparameters: {μα, β, σ, σα}.
$$
\hat{\alpha}_j = Z_j \cdot (\bar{y}_j - \beta \bar{x}_j) + (1 - Z_j) \cdot \hat{\mu}_\alpha
\quad\text{where}\quad
Z_j = \frac{n_j}{n_j + \sigma^2 / \sigma_\alpha^2}
$$
• Does this formula look familiar?
• Credibility theory is a special case of hierarchical models!
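As a minimal sketch of the credibility-weighted intercept formula on this slide, the function below computes Z_j and the resulting intercept estimate. The function name and all input values are hypothetical and chosen only so the arithmetic is easy to follow; in practice β, μα, σ and σα would come from the fitted model.

```python
def credibility_intercept(n_j, y_bar_j, x_bar_j, beta, mu_alpha, sigma2, sigma_alpha2):
    """Partial-pooling estimate of the intercept for group j.

    Z_j         = n_j / (n_j + sigma^2 / sigma_alpha^2)
    alpha_hat_j = Z_j * (y_bar_j - beta * x_bar_j) + (1 - Z_j) * mu_alpha
    """
    z_j = n_j / (n_j + sigma2 / sigma_alpha2)
    alpha_hat_j = z_j * (y_bar_j - beta * x_bar_j) + (1.0 - z_j) * mu_alpha
    return z_j, alpha_hat_j

# Hypothetical inputs: 4 observations in the group, within-group variance 400,
# between-group variance 100, so sigma^2 / sigma_alpha^2 = 4 and Z_j = 0.5.
z_small, alpha_small = credibility_intercept(
    n_j=4, y_bar_j=2300.0, x_bar_j=1.5, beta=20.0,
    mu_alpha=2200.0, sigma2=400.0, sigma_alpha2=100.0,
)

# Same group with 100 times as much data: Z_j moves toward 1 and the
# estimate moves toward the group's own value, y_bar_j - beta * x_bar_j.
z_large, alpha_large = credibility_intercept(
    n_j=400, y_bar_j=2300.0, x_bar_j=1.5, beta=20.0,
    mu_alpha=2200.0, sigma2=400.0, sigma_alpha2=100.0,
)

print(z_small, alpha_small)
print(z_large, alpha_large)
```

The ratio σ²/σα² plays the role of the credibility constant: the noisier the observations are relative to the spread between groups, the more data a group needs before its own experience dominates.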
The Middle Way
• This makes precise the sense in which the random intercept model is a
compromise between the pooled-data model (option 1) and the separate
models for each region (option 2).
$$
\hat{\alpha}_j = Z_j \cdot (\bar{y}_j - \beta \bar{t}_j) + (1 - Z_j) \cdot \hat{\mu}_\alpha
\quad\text{where}\quad
Z_j = \frac{n_j}{n_j + \sigma^2 / \sigma_\alpha^2}
$$
• In principle it’s always appropriate to use hierarchical models
• Rather than a judgment call, the data tells us the degree to which the groups should
be fit using separate models or a single common model
• It is no longer an all-or-nothing decision
• Let’s model this as a longitudinal dataset.
• Grouping dimension: Accident Year (AY)
• We can build a parsimonious non-linear model that uses random
effects to allow the model parameters to vary by accident year.
Growth Curves
• … stochastic loss reserving literature.
• But… are GLMs natural models for loss triangles?
• 2-parameter curves
• θ = scale
• ω = shape
$$
G(x \mid \omega, \theta) = 1 - \exp\left(-(x/\theta)^{\omega}\right)
$$
• See Clark [2003]
• Heuristic idea: C…
[Figure: growth curves plotted against Development Months]
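A short sketch of the two-parameter growth curves may help here. The Weibull form is the one given on the slide; the loglogistic form is not in the extracted text and is taken from Clark [2003], so treat it as an assumption about what the slide showed. The parameter values are hypothetical, not fitted.

```python
import math

def weibull_growth(x, omega, theta):
    """G(x | omega, theta) = 1 - exp(-(x / theta) ** omega)."""
    return 1.0 - math.exp(-((x / theta) ** omega))

def loglogistic_growth(x, omega, theta):
    """Loglogistic curve as in Clark [2003]: x^omega / (x^omega + theta^omega)."""
    return x ** omega / (x ** omega + theta ** omega)

# Development ages in months, matching the triangle used in this deck.
ages = [6, 18, 30, 42, 54, 66, 78, 90, 102, 114]

# Hypothetical shape and scale values, for illustration only.
omega, theta = 1.3, 25.0

for age in ages:
    print(age,
          round(weibull_growth(age, omega, theta), 3),
          round(loglogistic_growth(age, omega, theta), 3))
```

Both curves start at 0 and rise toward 1, which is what makes them natural models for the cumulative percent of ultimate loss. At x = θ the Weibull curve equals 1 − 1/e and the loglogistic curve equals 0.5, which gives θ its interpretation as a scale parameter.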
Baseline Model: Heuristics
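$$
CL_{AY,t} = (\text{Ult loss}_{AY}) \cdot (1 / LDF_t)
$$
$$
CL_{AY,t} = (\text{Ult loss}_{AY}) \cdot G_{\omega,\theta}(t) + \text{error}
$$
[Figure: Cumulative Percent of Ultimate vs. Age in Months; curves: Weibull, Loglogistic]
$$
\begin{aligned}
CumLoss_{AY,dev} &= LR_{AY} \cdot prem_{AY} \cdot \left[1 - \exp\left(-(dev/\theta)^{\omega}\right)\right] + \varepsilon_{AY,dev} \\
LR_{AY} &\sim N\left(\mu_{LR},\ \sigma_{LR}^2\right) \\
\varepsilon_{AY,dev} &= \phi\,\varepsilon_{AY,dev-1} + a_{AY,dev}
\end{aligned}
$$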
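To make the structure of this model concrete, here is a minimal simulation sketch for a single accident year. It is not the fitting procedure used in the presentation: all parameter values are hypothetical, the innovations a are assumed to be normal with standard deviation `sigma_a`, and the AR(1) error is assumed to start from zero before the first development age.

```python
import math
import random

def weibull_growth(dev, omega, theta):
    """Expected fraction of ultimate loss emerged by development age dev."""
    return 1.0 - math.exp(-((dev / theta) ** omega))

def expected_cum_loss(loss_ratio, premium, dev, omega, theta):
    """LR_AY * prem_AY * [1 - exp(-(dev / theta) ** omega)]."""
    return loss_ratio * premium * weibull_growth(dev, omega, theta)

def simulate_accident_year(premium, dev_ages, mu_lr, sigma_lr,
                           omega, theta, phi, sigma_a, rng):
    """Draw one accident year's cumulative loss path from the model."""
    # Ultimate loss ratio varies randomly by accident year.
    loss_ratio = rng.gauss(mu_lr, sigma_lr)
    eps = 0.0  # assumption: the AR(1) error starts at zero
    path = []
    for dev in dev_ages:
        # AR(1) error: eps_dev = phi * eps_(dev-1) + a_dev
        eps = phi * eps + rng.gauss(0.0, sigma_a)
        path.append(expected_cum_loss(loss_ratio, premium, dev, omega, theta) + eps)
    return loss_ratio, path

dev_ages = [6, 18, 30, 42, 54, 66, 78, 90, 102, 114]
rng = random.Random(2008)  # fixed seed so the run is repeatable

# Premium 2,609 is accident year 1988 in the triangle; the rest is hypothetical.
lr, path = simulate_accident_year(
    premium=2609, dev_ages=dev_ages, mu_lr=0.8, sigma_lr=0.05,
    omega=1.3, theta=25.0, phi=0.5, sigma_a=10.0, rng=rng,
)
print(round(lr, 3), [round(v) for v in path])
```

Setting `sigma_lr` and `sigma_a` to zero reduces the simulated path to the deterministic growth curve, which is a convenient check that the pieces fit together.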
[Figure: Weibull Growth Curve Model -- AR(1) Errors; Randomly Varying Ultimate Loss Ratio by AY. Cumulative Loss by development age (6 to 114 months), one panel per accident year (1988 to 1992 shown).]
[Figure: Weibull Growth Curve Model -- AR(1) Errors; Randomly Varying Ultimate Loss Ratio by AY. Diagnostic panels: Residual Histogram, Normal QQ Plot, Actual vs Predicted.]
Model Results
Chain Ladder Analysis
AY           premium      6     18     30     42     54     66     78     90    102    114   CL Ult   CL res
1988           2,609    404    986  1,342  1,582  1,736  1,833  1,907  1,967  2,006  2,036    2,036        0
1989           2,694    387    964  1,336  1,580  1,726  1,823  1,903  1,949  1,987           2,017       29
1990           2,594    421  1,037  1,401  1,604  1,729  1,821  1,878  1,919                  1,986       67
1991           2,609    338    753  1,029  1,195  1,326  1,395  1,446                         1,535       89
1992           2,077    257    569    754    892    958  1,007                                1,110      103
1993           1,703    193    423    589    661    713                                         828      115
1994           1,438    142    361    463    533                                                675      142
1995           1,093    160    312    408                                                       601      193
1996           1,012    131    352                                                              702      350
1997             976    122                                                                     576      454
Total                                                                                        12,067    1,543
chain link            2.365  1.354  1.164  1.090  1.054  1.038  1.026  1.020  1.015  1.000
chain ldf             4.720  1.996  1.473  1.266  1.162  1.102  1.062  1.035  1.015  1.000
growth curve          21.2%  50.1%  67.9%  79.0%  86.1%  90.7%  94.2%  96.6%  98.5% 100.0%
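The chain ladder figures in this table can be approximately reproduced with the sketch below. It assumes the chain links are volume-weighted averages (the slide does not say how they were computed) and that there is no development beyond 114 months. Because the triangle is shown in rounded units, the results agree with the table only to within rounding.

```python
# Cumulative losses by accident year at ages 6, 18, ..., 114 months (from the table).
triangle = {
    1988: [404, 986, 1342, 1582, 1736, 1833, 1907, 1967, 2006, 2036],
    1989: [387, 964, 1336, 1580, 1726, 1823, 1903, 1949, 1987],
    1990: [421, 1037, 1401, 1604, 1729, 1821, 1878, 1919],
    1991: [338, 753, 1029, 1195, 1326, 1395, 1446],
    1992: [257, 569, 754, 892, 958, 1007],
    1993: [193, 423, 589, 661, 713],
    1994: [142, 361, 463, 533],
    1995: [160, 312, 408],
    1996: [131, 352],
    1997: [122],
}
n_ages = 10

# Volume-weighted link ratio from age k to age k+1, using only the
# accident years observed at both ages (an assumption, see lead-in).
links = []
for k in range(n_ages - 1):
    rows = [r for r in triangle.values() if len(r) > k + 1]
    links.append(sum(r[k + 1] for r in rows) / sum(r[k] for r in rows))

# Age-to-ultimate factors: cumulative product of the links, no tail factor.
ldfs = [1.0] * n_ages
for k in range(n_ages - 2, -1, -1):
    ldfs[k] = ldfs[k + 1] * links[k]

# Ultimate = latest cumulative loss times the LDF at its age; reserve = ultimate - latest.
ultimates = {ay: r[-1] * ldfs[len(r) - 1] for ay, r in triangle.items()}
reserves = {ay: ultimates[ay] - r[-1] for ay, r in triangle.items()}

# The implied growth curve is the reciprocal of the LDF at each age.
growth_curve = [1.0 / f for f in ldfs]

print([round(x, 3) for x in links])
print([round(x, 3) for x in ldfs])
print(round(sum(ultimates.values())), round(sum(reserves.values())))
```

The last line of the sketch, `growth_curve`, is the link to the growth curve models: the chain ladder implicitly estimates one free growth-curve point per development age, whereas the Weibull or loglogistic curve describes the same pattern with two parameters.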
These are the 12 hierarchical model parameters.
Making Predictions is Difficult (Especially About the Future)
“Given any value (estimate of future payments) and our current state of
knowledge, what is the probability that the final payments will be no larger than
the given value?”
-- Casualty Actuarial Society’s Working Party on Quantifying Variability in
Reserve Estimates, 2004
• Before 1990, Bayesian statistics was a lot more talk than action.
• Unless you use conjugate priors, calculating posterior probability distributions is
cumbersome at best, intractable at worst.
Clark, David R. (2003) “LDF Curve Fitting and Stochastic Loss Reserving: A Maximum
Likelihood Approach,” CAS Forum.
Frees, Edward (2006). Longitudinal and Panel Data: Analysis and Applications in the
Social Sciences. New York: Cambridge University Press.
Gelman, Andrew and Hill, Jennifer (2007). Data Analysis Using Regression and
Multilevel / Hierarchical Models. New York: Cambridge University Press.
Guszcza, James. (2008). “Hierarchical Growth Curve Models for Loss Reserving,”
CAS Forum.
Pinheiro, Jose and Douglas Bates (2000). Mixed-Effects Models in S and S-Plus. New
York: Springer-Verlag.
Guszcza, James and Thomas Herzog (2010), “Enhanced Credibility: Actuarial Science
and the Renaissance of Bayesian Thinking,” Contingencies
Zhang, Yanwei, Vanja Dukic, James Guszcza (2010). “A Bayesian Nonlinear Model for
Forecasting Insurance Payments,” working paper.